[Buildbot-devel] database-backed status/scheduler-state project

Thu Sep 3 01:50:47 UTC 2009

Nicolas Sylvain wrote:
> 
> With your new multi process architecture, does this mean that we can
> have WebStatus in one process, while the slave<->master communication
> in another process? Slaves will stop to die when people do long
> waterfall requests? (!!!)

Maybe. It's still up in the air whether:

 1: the existing WebStatus (and everything else) gets rewritten in
    terms of database lookups, builds write directly to the database,
    and we get rid of the pickle files, or
 2: we add a status plugin which copies all status info into an extra
    database, leaving the pickle files in place, and leaving WebStatus
    (and everything else) untouched

The former would provide the sort of feature you describe: once we get
the third milestone done, you'd be able to run as many processes as you
like (on arbitrary machines), and enable or disable the WebStatus (or
scheduler or build-process-manager, etc) in each one independently. The
latter would be less work (and result in fewer bugs), but would only
benefit new status-displaying code that was written specifically to
reference the database, leaving the WebStatus to fight for CPU time with
everything else.
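
A minimal sketch of what the second option could look like, assuming a
status plugin that is told when a build finishes and mirrors the result
into a database. The class name, the build_finished hook, and the table
layout are all hypothetical (none of them are Buildbot APIs), and sqlite3
stands in for whatever database would actually be used:

```python
import sqlite3


class DatabaseStatusRecorder:
    """Hypothetical status plugin: copies build results into a database."""

    def __init__(self, conn):
        self.conn = conn
        # Assumed schema: one row per (builder, build number).
        conn.execute(
            "CREATE TABLE IF NOT EXISTS builds ("
            "builder TEXT, number INTEGER, result TEXT, "
            "PRIMARY KEY (builder, number))"
        )

    def build_finished(self, builder, number, result):
        # Hypothetical hook, called once per finished build.
        with self.conn:  # commits on success, rolls back on error
            self.conn.execute(
                "INSERT OR REPLACE INTO builds VALUES (?, ?, ?)",
                (builder, number, result),
            )


def latest_results(conn):
    """Query a database-aware status display might run in its own process."""
    rows = conn.execute(
        "SELECT b.builder, b.number, b.result FROM builds b "
        "WHERE b.number = (SELECT MAX(number) FROM builds "
        "                  WHERE builder = b.builder) "
        "ORDER BY b.builder"
    )
    return rows.fetchall()
```

Only code written against the database (like latest_results here) would
benefit; the existing WebStatus would keep reading the pickle files.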

> Any plan to make the web server be multi threaded?

Not really, threads scare me. But once the status is in a database,
it'll be easier for people who are less threadophobic than me to
implement stuff like that, in a separate process. :-)

> Where would the saved stdio live? All in the database?

Nope. Databases are not known for handling gigantic strings very
efficiently, and I want to maintain the streaming "tail -f"-style
logfiles because I use that all the time.

So my plan is to create a "Log Server" process, with a key-value store
(the values are logs) that provides HTTP access to the logs, including
streaming access to logs that are still being generated. The database
would hold a unique key for each logfile, and status renderers would be
configured with the HTTP base URL of the logserver. The new Waterfall
page would talk to the database to find out the logids and generate
hyperlinks that point to the logserver's HTTP listening port. The
buildmaster would speak to the logserver over PB to create new logfiles
and fill them with stdio. You'd have one common logserver for your farm
of buildbot machines just like you'd have one common database for them
to all connect to.
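
The storage side of that design can be sketched roughly as follows,
assuming one regular file per log and a random hex string as the logid.
LogStore, log_url, and the "/logs/" URL layout are hypothetical names
chosen for illustration; the PB and HTTP front ends are left out:

```python
import os
import uuid


class LogStore:
    """Hypothetical key-value store: each logid maps to one regular file."""

    def __init__(self, root):
        self.root = root
        os.makedirs(root, exist_ok=True)

    def _path(self, logid):
        return os.path.join(self.root, logid)

    def create(self):
        """Create an empty logfile and return its logid (the database key)."""
        logid = uuid.uuid4().hex
        open(self._path(logid), "wb").close()
        return logid

    def append(self, logid, data):
        """Add a chunk of stdio (bytes) to a log that is still being generated."""
        with open(self._path(logid), "ab") as f:
            f.write(data)

    def read_from(self, logid, offset=0):
        """Return (data, new_offset) for everything written past offset.

        A "tail -f"-style reader calls this repeatedly, passing back the
        offset it received the previous time.
        """
        with open(self._path(logid), "rb") as f:
            f.seek(offset)
            data = f.read()
        return data, offset + len(data)


def log_url(base_url, logid):
    """Build the hyperlink a status renderer would emit for a logid."""
    return base_url.rstrip("/") + "/logs/" + logid
```

A renderer would look up the logid in the database and call something
like log_url("http://logserver.example.com:8080/", logid), where the base
URL comes from its own configuration.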

> Also we collect more than a few GB of logs every day, and we need to
> prune them often. It would be awesome if there was a way to tell
> buildbot when to auto-prune the old data.

Yeah, I'm imagining a config knob for the logserver which defines a
max-size or max-age or something. Or, since the logserver will be saving
each log to a regular file, you could write an external cronjob to run
'find' and delete the old ones.
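
A max-age prune along those lines could be sketched like this, assuming
the logs live as regular files under a single directory and that file
modification time is a good enough measure of age. The function name and
its parameters are hypothetical, not an existing config knob:

```python
import os
import time


def prune_logs(root, max_age_seconds, now=None):
    """Delete files under root last modified more than max_age_seconds ago.

    Returns the list of removed paths.
    """
    now = time.time() if now is None else now
    removed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if now - os.path.getmtime(path) > max_age_seconds:
                os.remove(path)
                removed.append(path)
    return removed
```

Run from cron with, say, max_age_seconds=30 * 86400, this would do the
same job as the 'find' approach; a max-size policy would need to sort the
files by age and delete the oldest until the total fits.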

> Otherwise, that looks really good. I'll make sure to use these new
> features as they come on the Chromium buildbot.

Great! If you see anything in this project that helps/hurts your use of
the buildbot, let me know.

cheers,
 -Brian