[Buildbot-devel] database-backed status/scheduler-state project

Tue Sep 15 09:54:05 UTC 2009

hi everyone;

(Sorry for the delayed response - I'm traveling on vacation.)

Great to see the lively discussion on this thread - there were a few 
important points here I wanted to comment on.

1) It's super-awesome having Brian tackle this long-desired 
functionality, in an upstream-able non-Mozilla-specific way.  I'm 
delighted he is able to spend some time on making this happen.

2) In terms of sequence of project work, I agree with Brian that the 
persistent queue db seems best to tackle first. Some reasons include:

a) multiple masters to handle load:
Handling our load of checkins across all the repos and active code lines 
turned out to be too much for our one buildbot master, with its one 
pool-o-slaves. As a workaround, we currently have 3 masters 
(production-master, production-master02, talos-master), each with their 
own dedicated smaller pool-o-slaves. However this workaround is 
inefficient because each master is looking at its own separate queue of 
possible jobs. As developers shift focus from one branch to another, we 
can hit situations like last month where one pool-o-slaves is 
overloaded, while the other pool of identical slaves is idle. Having a 
shared common persistent queue addresses all this nicely.

b) reduce need for downtimes:
This persistent queue db should allow us to restart a buildbot master 
without needing a downtime. Today, we have to declare multihour 
downtimes even for buildbot master restarts that take a few seconds, 
because in-memory queued builds will be lost. If pending jobs remained 
in queue, we could restart masters invisibly, whenever we need to enable 
new code in the masters. We've already fixed a bunch of machine 
reliability problems with our infrastructure, so now these 
enabling-new-master-code downtimes are a significant percentage of the 
downtimes we do still need to do. As Mozilla becomes more and more 24x7, 
and as other problems are solved, it's more and more important to fix this.

c) smaller well defined chunk of work:
The list of tasks in Brian's email is a large, significant amount of 
work. I think it's a good learning experience to do this smaller, 
self-contained piece of work before taking on the larger status db 
work. I think we'll learn valuable lessons here, both with the tools 
and the db, which will be invaluable for the next phase of work.

d) groundwork for a few important features:
- to allow us to requeue a job which was lost because of slave failure.
- to have pods of geographically distributed masters-and-slaves, so we 
can survive losing an individual colo.
- to allow "graceful master shutdown".

Obviously, the status db work is important too; I'm just as eager as 
everyone else to move Mozilla away from the one-size-fits-all Tinderbox 
waterfall used by everyone and toward generating different dashboards 
for different types of users. Hence it's next on the list. However, 
starting with the persistent queue db fixes many important problems, and 
seems the best way to see concrete improvements in the short term, while 
also learning before we take on bigger steps.

Just my $0.02.

take care
John.
=====
Brian Warner wrote:
> Hi everybody, it's me again.
> 
> I've taken on a short-term contract with Mozilla to make some
> scaling/usability improvements on Buildbot that will be suitable for
> merging upstream. The basic pieces are:
> 
>  * persistent (database-backed) scheduling state
>  * DB-backed status information
>  * ability to split buildmaster into multiple load-balanced pieces
> 
> I'll be working on this over the next few months, pushing features into
> trunk as we get them working (via my github repo). The result should be
> a buildbot which:
> 
>  * lets you bounce the buildmaster without losing queued builds or the
>    state of e.g. Dependent schedulers
>  * bouncing a master or slave during a build should re-queue the
>    interrupted build
>  * third-party tools can read or manipulate the scheduler state, to
>    insert builds, cancel requests, or accelerate requests, all by
>    fussing with the database
>  * third-party tools can render status information (think PHP scripts
>    reading stuff out of the DB and generating a specialized waterfall)
>  * multiple "build-process-master" processes (needs a better name) can
>    be run on separate CPUs, each handling some set of slaves. Each one
>    claims a buildrequest from the DB when it has a slave available, runs
>    the build, then marks the build as done. If one dies, others will
>    take over.
> 
> I'm hoping that the persistent scheduler-state code will be done by the
> end of the month, ready to put into a buildbot-0.8.0 release shortly
> thereafter.
> 
> DATABASES:
> 
> I'm planning to make the default config store the scheduler state in a
> SQLite file inside the buildmaster's base directory. To enable the
> scaling benefits, you'd need a real networked database, so I also plan
> to have connectors for MySQL and potentially others.
> 
> The plan is to have the schedulers make synchronous DB calls, rather
> than completely rewriting the scheduler/builder code to look more like a
> state machine with async calls (twisted.enterprise). This should let us
> finish the project sooner and with fewer instabilities, but also means
> that DB performance is an issue, since a slow DB will block everything
> else the buildmaster is doing. The Mozilla folks are ok with this, so
> we'll just build it and see how it goes.
> 
> It's very important to me that Buildbot is easy to get installed for all
> users, and installing a big database is not easy, so the default will be
> the no-effort-required entirely-local SQLite. Users will only have to
> set up a real database if they want the "distributed across multiple
> computers" scaling features.
> 
> The statusdb (as opposed to the schedulerdb) may be implemented as a
> buildbot status plugin, leaving the existing pickle files alone, but
> exporting a copy of everything to an external database as the builds
> progress. This would reduce the work to be done (there's already some
> code to do much of this) and minimize the impact on the core code (we'd
> just be adding an extra file that could be enabled or not as people saw
> fit), but might not result in something that's as well integrated into
> the buildbot as it could be (and it might be nice to have a
> Waterfall/etc which read from the database, as things like
> filter-by-branchname would finally become efficient enough to use).
> 
> DEPENDENCIES:
> 
> Buildbot-0.8.0 will need sqlite bindings. These come batteries-included
> (in the standard library) with python 2.5 and 2.6. Users running
> python2.4 will have to install the python-pysqlite2 package to run
> buildbot-0.8.0. I think this is a pretty minimal addition.
> 
> I'm examining SQLAlchemy to see if the features it offers would be worth
> the extra dependency load. I don't want to use a heavy ORM (because a
> big goal is to have a schema that's easy to query/manipulate from other
> languages), but it looks like it's got connection-pool-management and
> cross-DB support code that might be useful.
> 
> What do people think about the 0.8.0 buildmaster potentially requiring
> sqlalchemy? Would that annoy you? Annoy new users? Make it hard to
> upgrade your environment?
> 
> HELP!:
> 
> I'm looking to hear about other folks' experiences with this sort of
> project. We've been talking about this for years, and some prototypes
> have been built, so I'd like to hear about them (I've been briefed on
> many of the mozilla efforts already).
> 
> I'll attach the proposal below, along with a file of notes that I made
> while walking through the code to see how this needs to work.
> 
> 
> cheers,
>  -Brian
> 
> 
> ===== PROJECT PROPOSAL =====
> 
> Buildbot Database project:
> 
> The goal is to improve the usability and scalability of Buildbot to meet
> Mozilla's current needs, implemented in an appropriate fashion to get
> merged upstream. The primary "pain points" to be addressed are:
> 
>  * most buildmaster state is held in RAM, preventing process restarts
>    for fear of losing queued builds and builds-in-progress. There is no
>    "graceful shutdown" command, but even if there were, it could take
>    hours or days to wait for everything in the queue to finish, losing
>    valuable developer time.
> 
>  * buildmaster does many things in one process (build scheduling, build
>    processing, status distribution), and CPU exhaustion has been
>    observed
> 
>  * Waterfall display is very CPU-intensive. Current deployment does not
>    share waterfall with outside world for fear of overload. Development
>    of alternate status displays (which could run in separate processes)
>    is hampered by the local-file pickle-based status storage format.
> 
> The changes planned for this project are:
> 
>  * move build scheduling state out of RAM and into a persistent
>    database, allowing buildmaster to be bounced without losing queued
>    builds. Builders will claim builds from the database, perform the
>    builds, then update the DB to mark the build state as done, allowing
>    multiple buildmaster processes (on separate machines) to share the
>    load, communicating mostly through the DB. New tools (written in
>    arbitrary languages) can be used to manipulate the schedulerdb, to
>    implement features like "accelerate build request", "cancel request",
>    etc.
> 
>  * move build status out of pickle files into a database, to enable
>    multiple processes (on separate hosts) to access the status. Database
>    replication can then be used to allow a publicly-visible Waterfall
>    without threatening to overload the buildmaster. Status displaying
>    tools (dashboards, etc) can be written in arbitrary languages and
>    simply read the information they need from the statusdb.
> 
>  * add configuration options to switch on/off the four main buildmaster
>    functions (ChangeMaster, Schedulers, Builder/Build processing, Status
>    distribution), allowing these functions to be spread across multiple
>    processes, using the state/status databases for coordination. The
>    goal is to have one ChangeMaster/Schedulers process, multiple
>    Builder/Build processing tasks (one "build-master" per "pod", with a
>    set of slaves attached to each one), and multiple status distribution
>    processes. This should help the scalability problem, by allowing the
>    load to be spread across multiple computers.
> 
>  * the default database will be a local SQLite file, but master.cfg
>    statements will allow flexible configuration of the database
>    connection method. Postgres (or whatever mozilla's favorite DB is)
>    will be tested. Others (at least MySQL) should be possible.
>    Provisions will be made to tolerate the inevitable SQL dialect
>    variations.
> 
>  * (probably) add "graceful shutdown" switch to the buildmaster. Once
>    the buildmaster is in this mode, new jobs will not be started, and
>    the buildmaster will shut down once the last running job completes.
>    The switch may have an option to make the buildmaster restart itself
>    automatically upon shutdown. UI is uncertain.
> 
>  * (maybe) add "graceful shutdown" switch to the buildslave, used in the
>    same way as the buildmaster's switch. UI is uncertain.
> 
>  * (probably) add "RESUBMIT" state to the overall Build object (along
>    with the existing SUCCESS, WARNING, FAILURE, EXCEPTION states). The
>    scheduling code will react to this by requeueing the BuildRequest.
>    Builds which stop because of a lost slave or restarted buildmaster
>    will be marked with this state, so they will be re-run when the
>    necessary resources come back.
> 
>  * retain cancel-build capabilities (may require Builder to poll a DB to
>    see if the build has been cancelled)
> 
> Design restrictions imposed by Brian as Buildbot upstream developer:
> 
>  * dependency load must not increase significantly. I'm ok with
>    requiring SQLite because it's built-in to python2.5/2.6, and easy to
>    get for python2.4. I'm not willing to require other database
>    bindings, nor to require all Buildbot users to install/configure
>    e.g. a MySQL database before they can run a buildmaster.
> 
>  * existing 0.7.11 deployments must remain compatible with the new code.
>    The default configuration must use SQLite in a local directory. Any
>    state-migration steps that must be done will be handled by adding new
>    code to the existing "buildbot upgrade-master" command.
> 
>  * all code must have clear User's Manual documentation (with examples)
>    and adequate unit tests. All changes must be licensed compatibly with
>    the upstream source (GPL2).
> 
> The specific milestones we're planning are:
> 
>  * phase 1: Create the database connectors (initially only SQLite), move
>    just the scheduler state into the database. This includes the output
>    of the ChangeMaster, the internal state of all Schedulers, and the
>    list of ready-to-go BuildRequests. All existing Scheduler classes and
>    the Builder class will be changed to scan the database for work
>    instead of looking at lists in RAM. The RESUBMIT state will be
>    implemented and Builders updated to requeue such builds.
> 
>    This will allow the buildmaster to be bounced without loss of state
>    (although any running builds will be abandoned and requeued). It will
>    not yet enable the use of multiple processes. It will not touch the
>    build status information (currently stored in pickle files).
> 
>   * phase 1.1: Implement the Postgres database connector, and the
>     master.cfg options necessary to control which db type/location to
>     use for scheduler state. Test a buildmaster running with a remote
>     schedulerdb.
> 
>   * phase 1.2: Implement graceful-shutdown controls.
> 
>  * phase 2: Change the build-status code to store its state in a
>    database, instead of in the current pickle files. Implement a "Log
>    Server" to store/publish/stream logfile contents. Write a "buildbot
>    upgrade-master" tool to non-destructively migrate old pickle data
>    into the new database and logserver. Change the existing Status
>    plugins (Waterfall, MailNotifier, IRCBot, etc) to read status from
>    database. Add master.cfg options to control which db is used for
>    status data.
> 
>    This will enable non-buildbot status-displaying frontends.
> 
>  * phase 3: Add master.cfg options to control which components are
>    enabled in any given process. Provide mechanisms and examples to run
>    e.g. multiple build-process-masters which coordinate through the
>    database. Implement TCP/HTTP/polling -based "ping notifiers" to allow
>    low-latency triggering between components in separate processes (i.e.
>    Scheduler writes ready-to-build requests into DB, but the
>    build-process-master on a separate host must be told to re-scan the
>    DB for new work). Provide master.cfg options to control type/location
>    of DB, ping-notifiers, and Log Server. build-process-master instances
>    will have some configuration in common, other configuration unique to
>    each instance.
> 
>    This will finally enable scaling through multiple buildbot processes,
>    and multiple Waterfall renderers.
> 
> I'm roughly targeting phase 1 to be incorporated into an upstream
> buildbot-0.8.0 release, and phase 2 in an 0.9.0 release shortly
> afterwards. Phase 3 may get into 0.9.0, or may go into a subsequent
> upstream release.
> 
> Aggressive target is to get phase 1 done by end of September, then
> evaluate schedule and progress made before beginning next phase. Overall
> goal is to complete project in 2-3 months.
> 
> Sub-tasks which can be split out easily include:
> 
>  * database connector module: python "dbapi2" interface,
>    reconnection-on-error (and log attempts w/backoff), cross-database
>    compatibility code, blocking methods for scheduler state db,
>    fire-and-forget (but retry for a little while) for status writes
> 
>  * "ping notifier" module: define HTTP POST / line-oriented TCP /
>    polling protocol, implement client / server modules.
> 
>  * Log Server: writer-side PB interface, reader-side HTTP interface
> 
> === DESIGN NOTES ===
>   -*- org-mode -*-
> 
> * databases: three databases, plus logserver
> ** Changes go in one database
> ** scheduling stuff (Scheduler state, builds ready-to-go/claimed/finished)
>    this includes BuildRequests and their properties
> ** status (steps, logids, results, properties)
>    the goal is for the buildmaster to never read from the status db, only
>    the status-rendering code (which will eventually live elsewhere)
> 
> * database connector
> ** all statusdb calls may raise DBUnavailableError
>    renderer should deliver error to client
> ** all schedulerdb calls should block, reconnect, retry */1s, log w/backoff
>    db is critical to this part
> ** config option to set DB type, connection arguments
> ** schema restrictions to get cross-db compatibility:
>    - declare types (SQLite tolerates, but most don't)
>    - revision ids will be strings, SVN will deal
>    - no binary strings. Unicode is ok(?).
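A minimal sketch of the "block, reconnect, retry every 1s, log with backoff" rule for schedulerdb calls noted above. The name run_blocking and its parameters are hypothetical, not part of Buildbot; it assumes the caller's operation opens (or reopens) its own connection, and that sqlite3.OperationalError is the error signalling an unavailable database.

```python
import logging
import sqlite3
import time

log = logging.getLogger("schedulerdb")


def run_blocking(operation, retry_interval=1.0, sleep=time.sleep):
    """Call operation() until it succeeds, blocking the caller meanwhile.

    Hypothetical helper: retries at a fixed interval, but only logs on
    attempts 1, 2, 4, 8, ... so a long outage does not flood the log.
    """
    attempts = 0
    next_log_at = 1
    while True:
        try:
            return operation()
        except sqlite3.OperationalError as e:
            attempts += 1
            if attempts >= next_log_at:
                log.warning("schedulerdb unavailable (attempt %d): %s",
                            attempts, e)
                next_log_at *= 2
            sleep(retry_interval)
```

The sleep argument is injectable only so the retry loop can be exercised without real delays.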
> 
> * notification mechanism
>   - first milestone (non-distributed) will be all in-process
>   - distributed milestone will require pings
>     HTTP POST (forwards), TCP line-oriented (either), or just polling
> 
> * persistent scheduler project
> ** Changemaster:
>    - (changeid, branchname, revisionid, author, timestamp, comment, category?)
>      changeids must be comparable and monotonically increasing
>    - (changeid, filename)
>      i.e. changes[changeid].filenames = []
>    - (changeid, propertyname, propertyvaluestring)
>      i.e. changes[changeid].properties = {name: value}
> *** add row to database, ping Schedulers (eventual-send)
> *** ping all schedulers at buildmaster startup
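One possible rendering of the three Changemaster tuples above as SQLite tables, using Python's sqlite3 module. This is a sketch under assumptions: the table names, column types, and the add_change helper are hypothetical; AUTOINCREMENT is SQLite-specific and is used here only to get comparable, monotonically increasing changeids; revision ids are stored as strings, as the schema restrictions suggest.

```python
import sqlite3

# Hypothetical schema: one table per tuple listed in the notes.
SCHEMA = """
CREATE TABLE changes (
    changeid INTEGER PRIMARY KEY AUTOINCREMENT,
    branchname VARCHAR(256),
    revisionid VARCHAR(256),
    author VARCHAR(256) NOT NULL,
    timestamp INTEGER NOT NULL,
    comment TEXT NOT NULL,
    category VARCHAR(256)
);
CREATE TABLE change_files (
    changeid INTEGER NOT NULL REFERENCES changes(changeid),
    filename VARCHAR(1024) NOT NULL
);
CREATE TABLE change_properties (
    changeid INTEGER NOT NULL REFERENCES changes(changeid),
    propertyname VARCHAR(256) NOT NULL,
    propertyvaluestring TEXT NOT NULL
);
"""


def add_change(conn, branchname, revisionid, author, timestamp, comment,
               filenames=(), properties=None, category=None):
    """Insert one change with its files and properties; return its changeid."""
    with conn:  # single transaction: commit on success, roll back on error
        cur = conn.execute(
            "INSERT INTO changes (branchname, revisionid, author, timestamp,"
            " comment, category) VALUES (?, ?, ?, ?, ?, ?)",
            (branchname, revisionid, author, timestamp, comment, category))
        changeid = cur.lastrowid
        conn.executemany(
            "INSERT INTO change_files (changeid, filename) VALUES (?, ?)",
            [(changeid, f) for f in filenames])
        conn.executemany(
            "INSERT INTO change_properties (changeid, propertyname,"
            " propertyvaluestring) VALUES (?, ?, ?)",
            [(changeid, k, str(v)) for k, v in (properties or {}).items()])
    return changeid
```

Pinging the Schedulers after the row is added is left out; in this sketch the caller would do that once add_change returns.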
> ** Schedulers:
>    - all state must be put in DB
>      - each records last-change-number, only examines changes since then
>      - each records list of changes, with important/unimportant flag
>    - trickiest part will be relationships between Dependent schedulers
> *** when pinged, or timer wakeup:
>     - loop over all Schedulers
>     - scan for unchecked changeids
>       - default Scheduler ignores changes on the wrong branch
>       - check importance of each
>       - add to changes table
>       - arrange for tree-stable-timer wakeup
>     - if all changes are old enough, and important, then submit build
>       - AnyBranchScheduler processes changes one branch at a time
> *** Dependent (downstream):
>     - configured with an upstream scheduler, by name
>     - wants to be told when upstream BuildSet completes successfully,
>       receive SourceStamp object
>     - then submits a new BuildSet, using the same SourceStamp, with different
>       buildernames and properties
> **** so, this scheduler ignores the changes table and watches active-builds
>      - defer figuring it out until I build the active-build table
> *** Periodic
>     - (schedulername, last-build-started-time, last-changeid-built)
>     - if last-build-started-time + delay < now:
>       make SS with recent changes, submit buildset, update
>       last-build-started-time and last-changeid-built
>       - consider checking active-builds, avoid overlaps
>     - else: arrange for wakeup in (last-build-started-time + delay - now + epsilon)
> **** after a long downtime, this should start a build
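The Periodic rule above can be sketched as a pure function. The name periodic_decision and its return shape are hypothetical, and the sketch assumes the intended wakeup delay is the time remaining until last-build-started-time + delay (plus epsilon), so that the next check lands just after the build becomes due.

```python
def periodic_decision(last_build_started, delay, now, epsilon=1.0):
    """Decide what a Periodic scheduler should do when it is pinged.

    Returns ("build", 0.0) if a build is due, otherwise
    ("sleep", seconds) with the time to wait before checking again.
    """
    if last_build_started + delay < now:
        # Due (or overdue, e.g. after a long downtime): start a build now.
        return ("build", 0.0)
    # Not due yet: wake up just after the build becomes due.
    return ("sleep", last_build_started + delay - now + epsilon)
```

Because only the stored last-build-started-time is consulted, a master that comes back after a long downtime falls straight into the "build" branch.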
> *** Nightly/Cron
>     - like Periodic, but compute next build time differently
> **** after a long downtime, this should *not* start a build
>      - maybe make that configurable, catchup=bool
> *** Try: ignores changetable, just submits buildsets
> *** schema:
>     - changes table: (schedulerid, changenum, important_p)
>     - timer table: (wakeup-time)
>       if min(wakeup-time) < now: empty table, ping all schedulers
> **** default Scheduler
>      - (schedulerid, schedulername, last-changeid-checked)
> **** Periodic
>     - (schedulername, last-build-started-time, last-changeid-built)
> **** Triggerable
>      - really just maps scheduler name +properties to buildernames
>      - certain buildsteps can push the trigger, wait for completion
>      - ignores changetable, ignores buildtable
>      - does not use schedulerdb
> **** SourceStamps
>      how to gc?
>      - (sourcestampid, branch, revision/None, patchlevel, patch)
>      - (sourcestampid, changeid)
> *** scheduler has properties, copied into BuildSet
>     - doesn't need to be in the scheduler table, but might need to be in
>       BuildSet table
> *** scheduler's output is a BuildSet, which has .waitUntilFinished()
>     - buildernames, sourcestamp, properties
> ** BuildSet
>    - have .waitUntilFinished(), used by downstream Dependent schedulers and
>      Triggerable steps
>    - (buildsetid, sourcestampid, reason, idstring, current-state)
>      idstring comes from Try job, to associate with external tools
>      - current-state in (hopeful, unhopeful, complete)
>        (no failures seen yet, some failures seen, all builds finished)
>        (idea is to notify early on first sign of failure)
>    - (buildsetid, buildername, buildreqid)
>      i.e. buildset.buildernames = []
>    - (buildsetid, propertyname, valuestring)
>      i.e. buildset.properties = {}
> *** when all buildrequests complete, aggregate the results
>     - when each buildrequest completes, ping the buildsets
>       - this may change the buildset state
>       - buildset state changes should ping schedulers
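A small sketch of the hopeful/unhopeful/complete aggregation described above. The function name and the use of plain strings for results are assumptions made for illustration; None stands for a buildrequest that has not finished yet, and both FAILURE and EXCEPTION are assumed to count as "failures seen".

```python
def buildset_state(results):
    """Compute a BuildSet's current-state from its buildrequests' results.

    results holds one entry per buildrequest: a result string such as
    "SUCCESS" or "FAILURE", or None if that request is still pending.
    """
    if all(r is not None for r in results):
        return "complete"      # all builds finished
    if any(r in ("FAILURE", "EXCEPTION") for r in results):
        return "unhopeful"     # some failures seen, others still running
    return "hopeful"           # no failures seen yet
```

Recomputing this each time a buildrequest completes and comparing with the stored value would tell the caller whether to ping the schedulers.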
> ** BuildRequest
>    - created with reason, sourcestamp, buildername, properties
>    - can be merged with other requests, if sourcestamps agree to it
>    - given to Builder to add to the builder queue
>    - can be started multiple times: updates status, informs watchers
>    - can be finished once, informs watchers
>    - IBuildRequestControl: subscribe/un, cancel, .submit_time
>      not sure if anybody calls it.. words.py? a few tests?
>    - "reqtable": (buildrequestid, reason, sourcestampid, buildername,
>                   claimed-at, claimed-by?)
>    - (buildrequestid, propertyname, propertyvalue)
> ** Builder
>    - .buildable, .building
>    - submitBuildRequest adds to .buildable, pings maybeStartAllBuilds
>    - what is __getstate__/__setstate__ doing there?
> *** so we need the Builder to scan the reqtable
>     - this is the part that will get distributed
>     - Builder A can claim any buildrequest that's for it and not yet claimed
>       or was claimed but got orphaned by a dead buildmaster, maybe have
>       a timestamp or two
>     - "claimed-at" holds timestamp, starts at 0, updated when a buildmaster
>       grabs it, refreshed every once in a while. req can be claimed by
>       someone else when (now - claimed-at) > timeout.
>     - when the build is done, the buildrequest is removed from the reqtable
>       and the buildset is examined
>     - to cancel a request: remove it from the table
>     - add submit-time or submit-sequence, to provide first-come-first-built
>       to accelerate a request, change that value
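A sketch of how a Builder might claim work from the reqtable as described above, using sqlite3. The column names, the claim_next_request helper, and the one-hour default timeout are hypothetical. It assumes claimed-at is an epoch timestamp that starts at 0, that a request is claimable when (now - claimed-at) > timeout, and that the lowest submit-time is built first.

```python
import sqlite3

# Hypothetical subset of the reqtable columns needed for claiming.
SCHEMA = """
CREATE TABLE reqtable (
    buildrequestid INTEGER PRIMARY KEY,
    buildername VARCHAR(256) NOT NULL,
    submit_time INTEGER NOT NULL,
    claimed_at INTEGER NOT NULL DEFAULT 0,
    claimed_by VARCHAR(256)
);
"""


def claim_next_request(conn, buildername, master_name, now, timeout=3600):
    """Claim the oldest unclaimed (or orphaned) request for buildername.

    Returns the claimed buildrequestid, or None if nothing is claimable
    or another master won the race.
    """
    with conn:
        row = conn.execute(
            "SELECT buildrequestid, claimed_at FROM reqtable"
            " WHERE buildername = ? AND claimed_at < ?"
            " ORDER BY submit_time, buildrequestid LIMIT 1",
            (buildername, now - timeout)).fetchone()
        if row is None:
            return None
        reqid, seen_claimed_at = row
        # Only take the row if nobody else re-claimed it since we read it.
        cur = conn.execute(
            "UPDATE reqtable SET claimed_at = ?, claimed_by = ?"
            " WHERE buildrequestid = ? AND claimed_at = ?",
            (now, master_name, reqid, seen_claimed_at))
        return reqid if cur.rowcount == 1 else None
```

In this sketch, cancelling a request is a DELETE on its row, and accelerating one is an UPDATE that lowers its submit_time; refreshing claimed_at periodically during a build keeps other masters from treating it as orphaned.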
> 
> * LogServer
> ** writer-side PB interface:
>    - open(title) -> logid string
>    - write(logid, channel, data)
>    - close(logid)
>      logfile is renamed (from LOGID.open to LOGID.closed) upon close
>    - get_base_url()
> ** buildmaster sends async writes, queues limited amount of requests
>    - fire-and-forget-after-30s, discard if queue grows too big
>    - goal is to tolerate LogServer bounces but not consume lots of memory
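The buildmaster-side write queue above could look roughly like this. LogWriteQueue is a hypothetical class, not an existing Buildbot API; it assumes "discard if queue grows too big" means dropping the oldest pending writes, and "fire-and-forget-after-30s" means writes older than 30 seconds are abandoned rather than sent.

```python
import collections
import time


class LogWriteQueue:
    """Bounded, age-limited buffer of pending LogServer writes (hypothetical)."""

    def __init__(self, max_pending=1000, max_age=30.0, clock=time.time):
        self._pending = collections.deque()
        self._max_pending = max_pending
        self._max_age = max_age
        self._clock = clock

    def add(self, logid, channel, data):
        """Queue one write; drop the oldest entries if the queue is too big."""
        self._pending.append((self._clock(), logid, channel, data))
        while len(self._pending) > self._max_pending:
            self._pending.popleft()

    def take_sendable(self):
        """Return writes younger than max_age, oldest first; empty the queue."""
        cutoff = self._clock() - self._max_age
        fresh = [(logid, channel, data)
                 for (queued_at, logid, channel, data) in self._pending
                 if queued_at >= cutoff]
        self._pending.clear()
        return fresh
```

The caller would invoke take_sendable() whenever the LogServer connection is (re)established and pass the result to the writer-side write() call; memory use stays bounded by max_pending even if the LogServer is down for a long time.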
> ** reader-side HTTP interface:
> *** logid URL shows title, filesize, options links, open/closed status
>     - with/without headers
>     - just stderr
>     - last N lines (when closed), last N lines plus headers
>     - reads when open do tail-f
> *** all option links are normal statically-computable URLs
> 
> * DB-based status writer
> ** write logserver baseurl into DB each time LogServer PB connection is made
> ** indirect this, to plan for multiple LogServers (logserverid=1 for now)
>    - (stepid, logserverid, logid)
>    - (logserverid, logserver_baseurl)
> 
> * DB-based status renderer
> **
> 
> * random ideas to keep in mind
> ** scheduler db is small
>    - so rather than coming up with clever queries, just grab everything,
>      sort it in memory
>    - also useful to avoid doing multiple queries
> 