[Buildbot-devel] database-backed status/scheduler-state project

Thu Sep 3 00:08:13 UTC 2009

That would be very cool indeed.  Thanks, Brian!

Brian Warner wrote:
> Hi everybody, it's me again.
>
> I've taken on a short-term contract with Mozilla to make some
> scaling/usability improvements on Buildbot that will be suitable for
> merging upstream. The basic pieces are:
>
>  * persistent (database-backed) scheduling state
>  * DB-backed status information
>  * ability to split buildmaster into multiple load-balanced pieces
>
> I'll be working on this over the next few months, pushing features into
> trunk as we get them working (via my github repo). The result should be
> a buildbot which:
>
>  * lets you bounce the buildmaster without losing queued builds or the
>    state of, e.g., Dependent schedulers
>  * re-queues the interrupted build when a master or slave is bounced
>    during a build
>  * third-party tools can read or manipulate the scheduler state, to
>    insert builds, cancel requests, or accelerate requests, all by
>    fussing with the database
>  * third-party tools can render status information (think PHP scripts
>    reading stuff out of the DB and generating a specialized waterfall)
>  * multiple "build-process-master" processes (needs a better name) can
>    be run on separate CPUs, each handling some set of slaves. Each one
>    claims a buildrequest from the DB when it has a slave available, runs
>    the build, then marks the build as done. If one dies, others will
>    take over.
>
> I'm hoping that the persistent scheduler-state code will be done by the
> end of the month, ready to put into a buildbot-0.8.0 release shortly
> thereafter.
>
> DATABASES:
>
> I'm planning to make the default config store the scheduler state in a
> SQLite file inside the buildmaster's base directory. To enable the
> scaling benefits, you'd need a real networked database, so I also plan
> to have connectors for MySQL and potentially others.
>
> The plan is to have the schedulers make synchronous DB calls, rather
> than completely rewriting the scheduler/builder code to look more like a
> state machine with async calls (twisted.enterprise). This should let us
> finish the project sooner and with fewer instabilities, but also means
> that DB performance is an issue, since a slow DB will block everything
> else the buildmaster is doing. The Mozilla folks are ok with this, so
> we'll just build it and see how it goes.
>
> It's very important to me that Buildbot is easy to get installed for all
> users, and installing a big database is not easy, so the default will be
> the no-effort-required entirely-local SQLite. Users will only have to
> set up a real database if they want the "distributed across multiple
> computers" scaling features.
>
> The statusdb (as opposed to the schedulerdb) may be implemented as a
> buildbot status plugin, leaving the existing pickle files alone, but
> exporting a copy of everything to an external database as the builds
> progress. This would reduce the work to be done (there's already some
> code to do much of this) and minimize the impact on the core code (we'd
> just be adding an extra file that could be enabled or not as people saw
> fit), but might not result in something that's as well integrated into
> the buildbot as it could be (and it might be nice to have a
> Waterfall/etc which read from the database, as things like
> filter-by-branchname would finally become efficient enough to use).
>
> DEPENDENCIES:
>
> Buildbot-0.8.0 will need sqlite bindings. These come batteries-included
> (in the standard library) with python 2.5 and 2.6. Users running
> python2.4 will have to install the python-pysqlite2 package to run
> buildbot-0.8.0. I think this is a pretty minimal addition.
>
> I'm examining SQLAlchemy to see if the features it offers would be worth
> the extra dependency load. I don't want to use a heavy ORM (because a
> big goal is to have a schema that's easy to query/manipulate from other
> languages), but it looks like it's got connection-pool-management and
> cross-DB support code that might be useful.
>
> What do people think about the 0.8.0 buildmaster potentially requiring
> sqlalchemy? Would that annoy you? Annoy new users? Make it hard to
> upgrade your environment?
>
> HELP!:
>
> I'm looking to hear about other folks' experiences with this sort of
> project. We've been talking about this for years, and some prototypes
> have been built, so I'd like to hear about them (I've been briefed on
> many of the mozilla efforts already).
>
> I'll attach the proposal below, along with a file of notes that I made
> while walking through the code to see how this needs to work.
>
>
> cheers,
>  -Brian
>
>
> ===== PROJECT PROPOSAL =====
>
> Buildbot Database project:
>
> The goal is to improve the usability and scalability of Buildbot to meet
> Mozilla's current needs, implemented in an appropriate fashion to get
> merged upstream. The primary "pain points" to be addressed are:
>
>  * most buildmaster state is held in RAM, preventing process restarts
>    for fear of losing queued builds and builds-in-progress. There is no
>    "graceful shutdown" command, but even if there were, it could take
>    hours or days to wait for everything in the queue to finish, losing
>    valuable developer time.
>
>  * buildmaster does many things in one process (build scheduling, build
>    processing, status distribution), and CPU exhaustion has been
>    observed
>
>  * Waterfall display is very CPU-intensive. Current deployment does not
>    share waterfall with outside world for fear of overload. Development
>    of alternate status displays (which could run in separate processes)
>    is hampered by the local-file pickle-based status storage format.
>
> The changes planned for this project are:
>
>  * move build scheduling state out of RAM and into a persistent
>    database, allowing buildmaster to be bounced without losing queued
>    builds. Builders will claim builds from the database, perform the
>    builds, then update the DB to mark the build state as done, allowing
>    multiple buildmaster processes (on separate machines) to share the
>    load, communicating mostly through the DB. New tools (written in
>    arbitrary languages) can be used to manipulate the schedulerdb, to
>    implement features like "accelerate build request", "cancel request",
>    etc.
>
>  * move build status out of pickle files into a database, to enable
>    multiple processes (on separate hosts) to access the status. Database
>    replication can then be used to allow a publicly-visible Waterfall
>    without threatening to overload the buildmaster. Status displaying
>    tools (dashboards, etc) can be written in arbitrary languages and
>    simply read the information they need from the statusdb.
>
>  * add configuration options to switch on/off the four main buildmaster
>    functions (ChangeMaster, Schedulers, Builder/Build processing, Status
>    distribution), allowing these functions to be spread across multiple
>    processes, using the state/status databases for coordination. The
>    goal is to have one ChangeMaster/Schedulers process, multiple
>    Builder/Build processing tasks (one "build-master" per "pod", with a
>    set of slaves attached to each one), and multiple status distribution
>    processes. This should help the scalability problem, by allowing the
>    load to be spread across multiple computers.
>
>  * the default database will be a local SQLite file, but master.cfg
>    statements will allow flexible configuration of the database
>    connection method. Postgres (or whatever mozilla's favorite DB is)
>    will be tested. Others (at least MySQL) should be possible.
>    Provisions will be made to tolerate the inevitable SQL dialect
>    variations.
>
>  * (probably) add "graceful shutdown" switch to the buildmaster. Once
>    the buildmaster is in this mode, new jobs will not be started, and
>    the buildmaster will shut down once the last running job completes.
>    The switch may have an option to make the buildmaster restart itself
>    automatically upon shutdown. UI is uncertain.
>
>  * (maybe) add "graceful shutdown" switch to the buildslave, used in the
>    same way as the buildmaster's switch. UI is uncertain.
>
>  * (probably) add "RESUBMIT" state to the overall Build object (along
>    with the existing SUCCESS, WARNING, FAILURE, EXCEPTION states). The
>    scheduling code will react to this by requeueing the BuildRequest.
>    Builds which stop because of a lost slave or restarted buildmaster
>    will be marked with this state, so they will be re-run when the
>    necessary resources come back.
>
>  * retain cancel-build capabilities (may require Builder to poll a DB to
>    see if the build has been cancelled)
>
> Design restrictions imposed by Brian as Buildbot upstream developer:
>
>  * dependency load must not increase significantly. I'm ok with
>    requiring SQLite because it's built in to python2.5/2.6, and easy to
>    get for python2.4. I'm not willing to require other database
>    bindings, nor to require all Buildbot users to install/configure,
>    e.g., a MySQL database before they can run a buildmaster.
>
>  * existing 0.7.11 deployments must remain compatible with the new code.
>    The default configuration must use SQLite in a local directory. Any
>    state-migration steps that must be done will be handled by adding new
>    code to the existing "buildbot upgrade-master" command.
>
>  * all code must have clear User's Manual documentation (with examples)
>    and adequate unit tests. All changes must be licensed compatibly with
>    the upstream source (GPL2).
>
> The specific milestones we're planning are:
>
>  * phase 1: Create the database connectors (initially only SQLite), move
>    just the scheduler state into the database. This includes the output
>    of the ChangeMaster, the internal state of all Schedulers, and the
>    list of ready-to-go BuildRequests. All existing Scheduler classes and
>    the Builder class will be changed to scan the database for work
>    instead of looking at lists in RAM. The RESUBMIT state will be
>    implemented and Builders updated to requeue such builds.
>
>    This will allow the buildmaster to be bounced without loss of state
>    (although any running builds will be abandoned and requeued). It will
>    not yet enable the use of multiple processes. It will not touch the
>    build status information (currently stored in pickle files).
>
>   * phase 1.1: Implement the Postgres database connector, and the
>     master.cfg options necessary to control which db type/location to
>     use for scheduler state. Test a buildmaster running with a remote
>     schedulerdb.
>
>   * phase 1.2: Implement graceful-shutdown controls.
>
>  * phase 2: Change the build-status code to store its state in a
>    database, instead of in the current pickle files. Implement a "Log
>    Server" to store/publish/stream logfile contents. Write a "buildbot
>    upgrade-master" tool to non-destructively migrate old pickle data
>    into the new database and logserver. Change the existing Status
>    plugins (Waterfall, MailNotifier, IRCBot, etc) to read status from
>    database. Add master.cfg options to control which db is used for
>    status data.
>
>    This will enable non-buildbot status-displaying frontends.
>
>  * phase 3: Add master.cfg options to control which components are
>    enabled in any given process. Provide mechanisms and examples to run
>    e.g. multiple build-process-masters which coordinate through the
>    database. Implement TCP/HTTP/polling -based "ping notifiers" to allow
>    low-latency triggering between components in separate processes (e.g.
>    Scheduler writes ready-to-build requests into DB, but the
>    build-process-master on a separate host must be told to re-scan the
>    DB for new work). Provide master.cfg options to control type/location
>    of DB, ping-notifiers, and Log Server. build-process-master instances
>    will have some configuration in common, other configuration unique to
>    each instance.
>
>    This will finally enable scaling through multiple buildbot processes,
>    and multiple Waterfall renderers.
>
> I'm roughly targeting phase 1 to be incorporated into an upstream
> buildbot-0.8.0 release, and phase 2 in an 0.9.0 release shortly
> afterwards. Phase 3 may get into 0.9.0, or may go into a subsequent
> upstream release.
>
> Aggressive target is to get phase 1 done by end of September, then
> evaluate schedule and progress made before beginning next phase. Overall
> goal is to complete project in 2-3 months.
>
> Sub-tasks which can be split out easily include:
>
>  * database connector module: python "dbapi2" interface,
>    reconnection-on-error (and log attempts w/backoff), cross-database
>    compatibility code, blocking methods for scheduler state db,
>    fire-and-forget (but retry for a little while) for status writes
>
>  * "ping notifier" module: define HTTP POST / line-oriented TCP /
>    polling protocol, implement client / server modules.
>
>  * Log Server: writer-side PB interface, reader-side HTTP interface
>
> === DESIGN NOTES ===
>   -*- org-mode -*-
>
> * databases: three databases, plus logserver
> ** Changes go in one database
> ** scheduling stuff (Scheduler state, builds ready-to-go/claimed/finished)
>    this includes BuildRequests and their properties
> ** status (steps, logids, results, properties)
>    the goal is for the buildmaster to never read from the status db, only
>    the status-rendering code (which will eventually live elsewhere)
>
> * database connector
> ** all statusdb calls may raise DBUnavailableError
>    renderer should deliver error to client
> ** all schedulerdb calls should block, reconnect, retry */1s, log w/backoff
>    db is critical to this part
> ** config option to set DB type, connection arguments
> ** schema restrictions to get cross-db compatibility:
>    - declare types (SQLite tolerates, but most don't)
>    - revision ids will be strings, SVN will deal
>    - no binary strings. Unicode is ok(?).
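A rough sketch of the "block, reconnect, retry every 1s, log with backoff" behaviour for schedulerdb calls. Everything here is an illustration rather than the planned connector: `run_blocking`, its parameters, and the doubling log schedule are hypothetical names and choices, and `sqlite3.OperationalError` stands in for whatever error the real backend raises when the database is unreachable.

```python
import logging
import sqlite3
import time

log = logging.getLogger("schedulerdb")


def run_blocking(connect, operation, retry_interval=1.0, sleep=time.sleep):
    """Run operation(conn) and block until it succeeds.

    connect() must return a new DB-API connection. On failure we
    reconnect and retry every retry_interval seconds, but only log on
    attempts 1, 2, 4, 8, ... so a long outage does not flood the log.
    """
    attempt = 0
    next_log_at = 1
    while True:
        conn = None
        try:
            conn = connect()
            result = operation(conn)
            conn.commit()
            return result
        except sqlite3.OperationalError as e:
            attempt += 1
            if attempt >= next_log_at:
                log.warning("schedulerdb unavailable (attempt %d): %s",
                            attempt, e)
                next_log_at *= 2
            sleep(retry_interval)
        finally:
            # Always drop the connection so the next attempt reconnects.
            if conn is not None:
                conn.close()
```

Because the call never gives up, a dead database stalls the caller, which matches the note that the db is critical to the scheduling side. Status writes would want a different wrapper that raises instead of looping.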
>
> * notification mechanism
>   - first milestone (non-distributed) will be all in-process
>   - distributed milestone will require pings
>     HTTP POST (forwards), TCP line-oriented (either), or just polling
>
> * persistent scheduler project
> ** Changemaster:
>    - (changeid, branchname, revisionid, author, timestamp, comment,
>      category?)
>      changeids must be comparable and monotonically increasing
>    - (changeid, filename)
>      i.e. changes[changeid].filenames = []
>    - (changeid, propertyname, propertyvaluestring)
>      i.e. changes[changeid].properties = {name: value}
> *** add row to database, ping Schedulers (eventual-send)
> *** ping all schedulers at buildmaster startup
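One way the three Changemaster tuples above could map onto SQLite tables. This is a sketch only: the table names, column names, and column sizes are made up for illustration, and `add_change` is a hypothetical helper. Types are declared on every column and revisions are stored as strings, following the cross-db restrictions in the notes.

```python
import sqlite3

SCHEMA = """
CREATE TABLE changes (
    -- AUTOINCREMENT keeps changeids monotonically increasing
    changeid   INTEGER PRIMARY KEY AUTOINCREMENT,
    branch     VARCHAR(1024),
    revision   VARCHAR(256),
    author     VARCHAR(1024) NOT NULL,
    timestamp  INTEGER NOT NULL,
    comments   VARCHAR(1024) NOT NULL,
    category   VARCHAR(256)
);
CREATE TABLE change_files (
    changeid INTEGER NOT NULL REFERENCES changes(changeid),
    filename VARCHAR(1024) NOT NULL
);
CREATE TABLE change_properties (
    changeid       INTEGER NOT NULL REFERENCES changes(changeid),
    property_name  VARCHAR(256) NOT NULL,
    property_value VARCHAR(1024) NOT NULL
);
"""


def add_change(conn, branch, revision, author, timestamp, comments,
               files=(), properties=None, category=None):
    """Insert one change plus its files and properties; return changeid."""
    cur = conn.execute(
        "INSERT INTO changes"
        " (branch, revision, author, timestamp, comments, category)"
        " VALUES (?, ?, ?, ?, ?, ?)",
        (branch, str(revision), author, timestamp, comments, category))
    changeid = cur.lastrowid
    conn.executemany(
        "INSERT INTO change_files VALUES (?, ?)",
        [(changeid, f) for f in files])
    conn.executemany(
        "INSERT INTO change_properties VALUES (?, ?, ?)",
        [(changeid, k, str(v)) for k, v in (properties or {}).items()])
    conn.commit()
    return changeid
```

After `add_change` returns, the caller would ping the Schedulers; that step is left out here because the notification mechanism is still undecided.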
> ** Schedulers:
>    - all state must be put in DB
>      - each records last-change-number, only examines changes since then
>      - each records list of changes, with important/unimportant flag
>    - trickiest part will be relationships between Dependent schedulers
> *** when pinged, or timer wakeup:
>     - loop over all Schedulers
>     - scan for unchecked changeids
>       - default Scheduler ignores changes on the wrong branch
>       - check importance of each
>       - add to changes table
>       - arrange for tree-stable-timer wakeup
>     - if all changes are old enough, and important, then submit build
>       - AnyBranchScheduler processes changes one branch at a time
> *** Dependent (downstream):
>     - configured with an upstream scheduler, by name
>     - wants to be told when upstream BuildSet completes successfully,
>       receive SourceStamp object
>     - then submits a new BuildSet, using the same SourceStamp, with
>       different buildernames and properties
> **** so, this scheduler ignores the changes table and watches active-builds
>      - defer figuring it out until I build the active-build table
> *** Periodic
>     - (schedulername, last-build-started-time, last-changeid-built)
>     - if last-build-started-time + delay < now:
>       make SS with recent changes, submit buildset, update
>       last-build-started-time and last-changeid-built
>       - consider checking active-builds, avoid overlaps
>     - else: arrange for wakeup in
>       (last-build-started-time + delay - now + epsilon)
> **** after a long downtime, this should start a build
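The Periodic decision can be written as a small pure function. The function name and return convention are hypothetical. The sleep interval is computed as the time remaining until last-build-started-time + delay, plus epsilon, which is my reading of the intent rather than something the notes spell out.

```python
def periodic_decision(last_build_started, delay, now, epsilon=1.0):
    """Decide what a Periodic scheduler should do when it wakes up.

    Returns ("build", 0.0) if a build is due, otherwise
    ("sleep", seconds) with the time to wait before checking again.
    """
    due_at = last_build_started + delay
    if due_at < now:
        return ("build", 0.0)
    # Not due yet: sleep until just after the due time.
    return ("sleep", due_at - now + epsilon)
```

With the last start time persisted in the database, a master that was down for a long time sees `due_at` far in the past and builds right away, which is the behaviour wanted for Periodic.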
> *** Nightly/Cron
>     - like Periodic, but compute next build time differently
> **** after a long downtime, this should *not* start a build
>      - maybe make that configurable, catchup=bool
> *** Try: ignores changetable, just submits buildsets
> *** schema:
>     - changes table: (schedulerid, changenum, important_p)
>     - timer table: (wakeup-time)
>       if min(wakeup-time) < now: empty table, ping all schedulers
> **** default Scheduler
>      - (schedulerid, schedulername, last-changeid-checked)
> **** Periodic
>      - (schedulername, last-build-started-time, last-changeid-built)
> **** Triggerable
>      - really just maps scheduler name +properties to buildernames
>      - certain buildsteps can push the trigger, wait for completion
>      - ignores changetable, ignores buildtable
>      - does not use schedulerdb
> **** SourceStamps
>      how to gc?
>      - (sourcestampid, branch, revision/None, patchlevel, patch)
>      - (sourcestampid, changeid)
> *** scheduler has properties, copied into BuildSet
>     - doesn't need to be in the scheduler table, but might need to be in
>       BuildSet table
> *** scheduler's output is a BuildSet, which has .waitUntilFinished()
>     - buildernames, sourcestamp, properties
> ** BuildSet
>    - have .waitUntilFinished(), used by downstream Dependent schedulers and
>      Triggerable steps
>    - (buildsetid, sourcestampid, reason, idstring, current-state)
>      idstring comes from Try job, to associate with external tools
>      - current-state in (hopeful, unhopeful, complete)
>        (no failures seen yet, some failures seen, all builds finished)
>        (idea is to notify early on first sign of failure)
>    - (buildsetid, buildername, buildreqid)
>      i.e. buildset.buildernames = []
>    - (buildsetid, propertyname, valuestring)
>      i.e. buildset.properties = {}
> *** when all buildrequests complete, aggregate the results
>     - when each buildrequest completes, ping the buildsets
>       - this may change the buildset state
>       - buildset state changes should ping schedulers
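A small sketch of how the BuildSet current-state could be recomputed each time a buildrequest completes. The result strings and the choice to treat EXCEPTION like FAILURE are assumptions for illustration; the notes only name the three states.

```python
# Hypothetical result markers; None means "this request has not finished".
SUCCESS, WARNING, FAILURE, EXCEPTION = "success", "warning", "failure", "exception"


def buildset_state(results):
    """Map per-buildrequest results to hopeful / unhopeful / complete.

    results holds one entry per buildrequest in the buildset.
    """
    finished = [r for r in results if r is not None]
    if len(finished) == len(results):
        return "complete"      # all builds finished
    if any(r in (FAILURE, EXCEPTION) for r in finished):
        return "unhopeful"     # some failures seen, others still running
    return "hopeful"           # no failures seen yet
```

A caller would compare the new value with the stored current-state and ping the schedulers only when it changed, so an early failure is reported once.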
> ** BuildRequest
>    - created with reason, sourcestamp, buildername, properties
>    - can be merged with other requests, if sourcestamps agree to it
>    - given to Builder to add to the builder queue
>    - can be started multiple times: updates status, informs watchers
>    - can be finished once, informs watchers
>    - IBuildRequestControl: subscribe/un, cancel, .submit_time
>      not sure if anybody calls it.. words.py? a few tests?
>    - "reqtable": (buildrequestid, reason, sourcestampid, buildername,
>                   claimed-at, claimed-by?)
>    - (buildrequestid, propertyname, propertyvalue)
> ** Builder
>    - .buildable, .building
>    - submitBuildRequest adds to .buildable, pings maybeStartAllBuilds
>    - what is __getstate__/__setstate__ doing there?
> *** so we need the Builder to scan the reqtable
>     - this is the part that will get distributed
>     - Builder A can claim any buildrequest that's for it and not yet claimed
>       or was claimed but got orphaned by a dead buildmaster, maybe have
>       a timestamp or two
>     - "claimed-at" holds timestamp, starts at 0, updated when a buildmaster
>       grabs it, refreshed every once in a while. req can be claimed by
>       someone else when (now - claimed-at) > timeout.
>     - when the build is done, the buildrequest is removed from the reqtable
>       and the buildset is examined
>     - to cancel a request: remove it from the table
>     - add submit-time or submit-sequence, to provide first-come-first-built
>       to accelerate a request, change that value
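A sketch of the claim logic against a SQLite reqtable. The column names, the `claim_next` and `remove_request` helpers, and the 600-second timeout are all hypothetical. The conditional UPDATE is there so that two masters racing for the same row cannot both win. A real deployment on a networked database would need to check that its isolation level gives the same guarantee.

```python
import sqlite3
import time

CLAIM_TIMEOUT = 600  # seconds; illustrative value only

SCHEMA = """
CREATE TABLE reqtable (
    buildrequestid INTEGER PRIMARY KEY AUTOINCREMENT,
    buildername    VARCHAR(256) NOT NULL,
    submit_time    INTEGER NOT NULL,
    claimed_at     INTEGER NOT NULL DEFAULT 0,
    claimed_by     VARCHAR(256)
);
"""


def claim_next(conn, buildername, master_name, now=None,
               timeout=CLAIM_TIMEOUT):
    """Claim the oldest unclaimed or orphaned request for buildername.

    A request is claimable when (now - claimed_at) > timeout; unclaimed
    rows have claimed_at = 0. Returns the buildrequestid or None.
    """
    now = int(time.time()) if now is None else now
    cutoff = now - timeout
    rows = conn.execute(
        "SELECT buildrequestid FROM reqtable"
        " WHERE buildername = ? AND claimed_at < ?"
        " ORDER BY submit_time, buildrequestid",
        (buildername, cutoff)).fetchall()
    for (reqid,) in rows:
        # Re-check claimed_at in the UPDATE in case another master got
        # there between our SELECT and now.
        cur = conn.execute(
            "UPDATE reqtable SET claimed_at = ?, claimed_by = ?"
            " WHERE buildrequestid = ? AND claimed_at < ?",
            (now, master_name, reqid, cutoff))
        conn.commit()
        if cur.rowcount == 1:
            return reqid
    return None


def remove_request(conn, reqid):
    """Used both when a build finishes and when a request is cancelled."""
    conn.execute("DELETE FROM reqtable WHERE buildrequestid = ?", (reqid,))
    conn.commit()
```

Accelerating a request is then a plain `UPDATE reqtable SET submit_time = ...` from any external tool, since `claim_next` orders by that column.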
>
> * LogServer
> ** writer-side PB interface:
>    - open(title) -> logid string
>    - write(logid, channel, data)
>    - close(logid)
>      logfile is renamed (from LOGID.open to LOGID.closed) upon close
>    - get_base_url()
> ** buildmaster sends async writes, queues limited amount of requests
>    - fire-and-forget-after-30s, discard if queue grows too big
>    - goal is to tolerate LogServer bounces but not consume lots of memory
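A plain-Python sketch of the bounded fire-and-forget queue (no Twisted or PB here, so the real thing would look different). `LogWriteQueue`, its limits, and the choice to drop the oldest entry when the queue is full are assumptions; the notes only say writes are forgotten after 30s and discarded if the queue grows too big.

```python
import collections
import time


class LogWriteQueue:
    """Buffer log writes while the LogServer is unreachable."""

    def __init__(self, max_items=1000, max_age=30.0, clock=time.time):
        self.max_items = max_items
        self.max_age = max_age
        self.clock = clock
        self.pending = collections.deque()
        self.dropped = 0

    def write(self, logid, channel, data):
        # Queue is full: discard the oldest write to bound memory use.
        if len(self.pending) >= self.max_items:
            self.pending.popleft()
            self.dropped += 1
        self.pending.append((self.clock(), logid, channel, data))

    def flush(self, send):
        """Deliver queued writes in order via send(logid, channel, data).

        Writes older than max_age are forgotten. Returns False if send
        raised ConnectionError, leaving the remaining writes queued.
        """
        cutoff = self.clock() - self.max_age
        while self.pending:
            queued_at, logid, channel, data = self.pending[0]
            if queued_at < cutoff:
                self.pending.popleft()
                self.dropped += 1
                continue
            try:
                send(logid, channel, data)
            except ConnectionError:
                return False
            self.pending.popleft()
        return True
```

A short LogServer bounce loses nothing, while a long outage costs at most `max_items` buffered writes of memory and some missing log text.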
> ** reader-side HTTP interface:
> *** logid URL shows title, filesize, options links, open/closed status
>     - with/without headers
>     - just stderr
>     - last N lines (when closed), last N lines plus headers
>     - reads when open do tail-f
> *** all option links are normal statically-computable URLs
>
> * DB-based status writer
> ** write logserver baseurl into DB each time LogServer PB connection is made
> ** indirect this, to plan for multiple LogServers (logserverid=1 for now)
>    - (stepid, logserverid, logid)
>    - (logserverid, logserver_baseurl)
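A sketch of the logserver indirection as two SQLite tables, so a renderer can turn a stepid into log URLs without talking to the buildmaster. Table names, helper names, and the "baseurl + / + logid" URL shape are assumptions. `INSERT OR REPLACE` is SQLite-specific and would need a portable equivalent on other databases.

```python
import sqlite3

SCHEMA = """
CREATE TABLE logservers (
    logserverid INTEGER PRIMARY KEY,
    baseurl     VARCHAR(1024) NOT NULL
);
CREATE TABLE step_logs (
    stepid      INTEGER NOT NULL,
    logserverid INTEGER NOT NULL REFERENCES logservers(logserverid),
    logid       VARCHAR(256) NOT NULL
);
"""


def set_baseurl(conn, baseurl, logserverid=1):
    """Record the base URL each time the LogServer connection is made."""
    conn.execute("INSERT OR REPLACE INTO logservers VALUES (?, ?)",
                 (logserverid, baseurl))
    conn.commit()


def log_urls(conn, stepid):
    """Return the reader-side URLs for every log of one step."""
    rows = conn.execute(
        "SELECT s.baseurl, l.logid FROM step_logs AS l"
        " JOIN logservers AS s ON s.logserverid = l.logserverid"
        " WHERE l.stepid = ? ORDER BY l.logid",
        (stepid,))
    return [baseurl.rstrip("/") + "/" + logid for baseurl, logid in rows]
```

Since the base URL lives in one row, moving the LogServer only needs one update, and old step records keep resolving to the new location.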
>
> * DB-based status renderer
> **
>
> * random ideas to keep in mind
> ** scheduler db is small
>    - so rather than coming up with clever queries, just grab everything,
>      sort it in memory
>    - also useful to avoid doing multiple queries

Render me gone,                       |||
Bob                                 ^(===)^
---------------------------------oOO--(_)--OOo---------------------------------
      "Magic" and "Miracles" are registered trademarks of Ignorance, Inc.
                              All Rights Reserved.