[Buildbot-devel] database-backed status/scheduler-state project
Brian Warner
warner at lothar.com
Wed Sep 2 22:56:05 UTC 2009
Hi everybody, it's me again.
I've taken on a short-term contract with Mozilla to make some
scaling/usability improvements to Buildbot that will be suitable for
merging upstream. The basic pieces are:
* persistent (database-backed) scheduling state
* DB-backed status information
* ability to split buildmaster into multiple load-balanced pieces
I'll be working on this over the next few months, pushing features into
trunk as we get them working (via my github repo). The result should be
a buildbot which:
* lets you bounce the buildmaster without losing queued builds or the
state of e.g. Dependent schedulers
* re-queues any build that is interrupted by bouncing a master or
slave mid-build
* third-party tools can read or manipulate the scheduler state, to
insert builds, cancel requests, or accelerate requests, all by
fussing with the database
* third-party tools can render status information (think PHP scripts
reading stuff out of the DB and generating a specialized waterfall)
* multiple "build-process-master" processes (needs a better name) can
be run on separate CPUs, each handling some set of slaves. Each one
claims a buildrequest from the DB when it has a slave available, runs
the build, then marks the build as done. If one dies, others will
take over.
I'm hoping that the persistent scheduler-state code will be done by the
end of the month, ready to put into a buildbot-0.8.0 release shortly
thereafter.
DATABASES:
I'm planning to make the default config store the scheduler state in a
SQLite file inside the buildmaster's base directory. To enable the
scaling benefits, you'd need a real networked database, so I also plan
to have connectors for MySQL and potentially others.
The plan is to have the schedulers make synchronous DB calls, rather
than completely rewriting the scheduler/builder code to look more like a
state machine with async calls (twisted.enterprise). This should let us
finish the project sooner and with fewer instabilities, but also means
that DB performance is an issue, since a slow DB will block everything
else the buildmaster is doing. The Mozilla folks are ok with this, so
we'll just build it and see how it goes.
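Just to illustrate the flavor, "synchronous DB calls" means the
scheduler code will do things like the following, right in the reactor
thread (the table and column names here are invented for this sketch,
not the final schema):

    import sqlite3

    # one connection, used synchronously from the buildmaster's main
    # (reactor) thread: a slow query here stalls everything else,
    # which is exactly the tradeoff described above
    conn = sqlite3.connect("state.sqlite")

    def get_unclaimed_requests(buildername):
        # hypothetical table/columns, just to show the blocking style
        c = conn.cursor()
        c.execute("SELECT id, reason FROM buildrequests"
                  " WHERE buildername=? AND claimed_at=0",
                  (buildername,))
        return c.fetchall()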
It's very important to me that Buildbot is easy to get installed for all
users, and installing a big database is not easy, so the default will be
the no-effort-required entirely-local SQLite. Users will only have to
set up a real database if they want the "distributed across multiple
computers" scaling features.
The statusdb (as opposed to the schedulerdb) may be implemented as a
buildbot status plugin, leaving the existing pickle files alone but
exporting a copy of everything to an external database as the builds
progress. This would reduce the work to be done (there's already some
code to do much of this) and minimize the impact on the core code: we'd
just be adding an extra file that could be enabled or not as people saw
fit. On the other hand, it might not result in something that's as well
integrated into the buildbot as it could be, and it might be nice to
have a Waterfall/etc. which read from the database, since things like
filter-by-branchname would finally become efficient enough to use.
DEPENDENCIES:
Buildbot-0.8.0 will need sqlite bindings. These come batteries-included
(in the standard library) with python 2.5 and 2.6. Users running
python2.4 will have to install the python-pysqlite2 package to run
buildbot-0.8.0. I think this is a pretty minimal addition.
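Concretely, the bindings can be located with the usual fallback import
(pysqlite2 provides the same dbapi2 module under a different name):

    try:
        import sqlite3                 # stdlib in python2.5 and 2.6
    except ImportError:
        from pysqlite2 import dbapi2 as sqlite3   # python2.4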
I'm examining SQLAlchemy to see if the features it offers would be worth
the extra dependency load. I don't want to use a heavy ORM (because a
big goal is to have a schema that's easy to query/manipulate from other
languages), but it looks like it's got connection-pool-management and
cross-DB support code that might be useful.
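For concreteness, the non-ORM slice of SQLAlchemy I mean is roughly the
engine layer; a sketch (the sqlite URL and the query are placeholders):

    from sqlalchemy import create_engine, text

    # engine-level usage only: connection pooling plus SQL-dialect
    # handling, no ORM mapping. Switching to a networked database
    # should just mean changing the URL, e.g. "mysql://...".
    engine = create_engine("sqlite:///state.sqlite", pool_recycle=3600)

    conn = engine.connect()
    rows = conn.execute(text("SELECT * FROM changes")).fetchall()
    conn.close()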
What do people think about the 0.8.0 buildmaster potentially requiring
sqlalchemy? Would that annoy you? Annoy new users? Make it hard to
upgrade your environment?
HELP!:
I'm looking to hear about other folks' experiences with this sort of
project. We've been talking about this for years, and some prototypes
have been built, so I'd like to hear about them (I've been briefed on
many of the Mozilla efforts already).
I'll attach the proposal below, along with a file of notes that I made
while walking through the code to see how this needs to work.
cheers,
-Brian
===== PROJECT PROPOSAL =====
Buildbot Database project:
The goal is to improve the usability and scalability of Buildbot to meet
Mozilla's current needs, implemented in an appropriate fashion to get
merged upstream. The primary "pain points" to be addressed are:
* most buildmaster state is held in RAM, preventing process restarts
for fear of losing queued builds and builds-in-progress. There is no
"graceful shutdown" command, but even if there were, it could take
hours or days to wait for everything in the queue to finish, losing
valuable developer time.
* buildmaster does many things in one process (build scheduling, build
processing, status distribution), and CPU exhaustion has been
observed
* Waterfall display is very CPU-intensive. Current deployment does not
share waterfall with outside world for fear of overload. Development
of alternate status displays (which could run in separate processes)
is hampered by the local-file pickle-based status storage format.
The changes planned for this project are:
* move build scheduling state out of RAM and into a persistent
database, allowing buildmaster to be bounced without losing queued
builds. Builders will claim builds from the database, perform the
builds, then update the DB to mark the build state as done, allowing
multiple buildmaster processes (on separate machines) to share the
load, communicating mostly through the DB. New tools (written in
arbitrary languages) can be used to manipulate the schedulerdb, to
implement features like "accelerate build request", "cancel request",
etc.
* move build status out of pickle files into a database, to enable
multiple processes (on separate hosts) to access the status. Database
replication can then be used to allow a publicly-visible Waterfall
without threatening to overload the buildmaster. Status displaying
tools (dashboards, etc) can be written in arbitrary languages and
simply read the information they need from the statusdb.
* add configuration options to switch on/off the four main buildmaster
functions (ChangeMaster, Schedulers, Builder/Build processing, Status
distribution), allowing these functions to be spread across multiple
processes, using the state/status databases for coordination. The
goal is to have one ChangeMaster/Schedulers process, multiple
Builder/Build processing tasks (one "build-master" per "pod", with a
set of slaves attached to each one), and multiple status distribution
processes. This should help the scalability problem, by allowing the
load to be spread across multiple computers.
* the default database will be a local SQLite file, but master.cfg
statements will allow flexible configuration of the database
connection method. Postgres (or whatever Mozilla's favorite DB is)
will be tested. Others (at least MySQL) should be possible.
Provisions will be made to tolerate the inevitable SQL dialect
variations.
* (probably) add "graceful shutdown" switch to the buildmaster. Once
the buildmaster is in this mode, new jobs will not be started, and
the buildmaster will shutdown once the last running job completes.
The switch may have an option to make the buildmaster restart itself
automatically upon shutdown. UI is uncertain.
* (maybe) add "graceful shutdown" switch to the buildslave, used in the
same way as the buildmaster's switch. UI is uncertain.
* (probably) add "RESUBMIT" state to the overall Build object (along
with the existing SUCCESS, WARNINGS, FAILURE, EXCEPTION states). The
scheduling code will react to this by requeueing the BuildRequest.
Builds which stop because of a lost slave or restarted buildmaster
will be marked with this state, so they will be re-run when the
necessary resources come back.
* retain cancel-build capabilities (may require Builder to poll a DB to
see if the build has been cancelled)
Design restrictions imposed by Brian as Buildbot upstream developer:
* dependency load must not increase significantly. I'm ok with
requiring SQLite because it's built-in to python2.5/2.6, and easy to
get for python2.4. I'm not willing to require other database
bindings, nor to require all Buildbot users to install/configure
e.g. a MySQL database before they can run a buildmaster.
* existing 0.7.11 deployments must remain compatible with the new code.
The default configuration must use SQLite in a local directory. Any
state-migration steps that must be done will be handled by adding new
code to the existing "buildbot upgrade-master" command.
* all code must have clear User's Manual documentation (with examples)
and adequate unit tests. All changes must be licensed compatibly with
the upstream source (GPL2).
The specific milestones we're planning are:
* phase 1: Create the database connectors (initially only SQLite), move
just the scheduler state into the database. This includes the output
of the ChangeMaster, the internal state of all Schedulers, and the
list of ready-to-go BuildRequests. All existing Scheduler classes and
the Builder class will be changed to scan the database for work
instead of looking at lists in RAM. The RESUBMIT state will be
implemented and Builders updated to requeue such builds.
This will allow the buildmaster to be bounced without loss of state
(although any running builds will be abandoned and requeued). It will
not yet enable the use of multiple processes. It will not touch the
build status information (currently stored in pickle files).
* phase 1.1: Implement the Postgres database connector, and the
master.cfg options necessary to control which db type/location to
use for scheduler state. Test a buildmaster running with a remote
schedulerdb.
* phase 1.2: Implement graceful-shutdown controls.
* phase 2: Change the build-status code to store its state in a
database, instead of in the current pickle files. Implement a "Log
Server" to store/publish/stream logfile contents. Write a "buildbot
upgrade-master" tool to non-destructively migrate old pickle data
into the new database and logserver. Change the existing Status
plugins (Waterfall, MailNotifier, IRCBot, etc) to read status from
database. Add master.cfg options to control which db is used for
status data.
This will enable non-buildbot status-displaying frontends.
* phase 3: Add master.cfg options to control which components are
enabled in any given process. Provide mechanisms and examples to run
e.g. multiple build-process-masters which coordinate through the
database. Implement TCP/HTTP/polling-based "ping notifiers" to allow
low-latency triggering between components in separate processes (i.e.
Scheduler writes ready-to-build requests into DB, but the
build-process-master on a separate host must be told to re-scan the
DB for new work). Provide master.cfg options to control type/location
of DB, ping-notifiers, and Log Server. build-process-master instances
will have some configuration in common, other configuration unique to
each instance.
This will finally enable scaling through multiple buildbot processes,
and multiple Waterfall renderers.
I'm roughly targeting phase 1 to be incorporated into an upstream
buildbot-0.8.0 release, and phase 2 in an 0.9.0 release shortly
afterwards. Phase 3 may get into 0.9.0, or may go into a subsequent
upstream release.
The aggressive target is to get phase 1 done by the end of September,
then evaluate schedule and progress before beginning the next phase.
The overall goal is to complete the project in 2-3 months.
Sub-tasks which can be split out easily include:
* database connector module: python "dbapi2" interface,
reconnection-on-error (and log attempts w/backoff), cross-database
compatibility code, blocking methods for scheduler state db,
fire-and-forget (but retry for a little while) for status writes
* "ping notifier" module: define HTTP POST / line-oriented TCP /
polling protocol, implement client / server modules.
* Log Server: writer-side PB interface, reader-side HTTP interface
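As a sketch of the first sub-task above (names and retry policy are
placeholders; only DBUnavailableError comes from the design notes
below):

    import time, logging

    class DBUnavailableError(Exception):
        pass

    def run_query_with_retry(connect, sql, args=(), max_tries=5):
        # blocking query with reconnect-on-error and logged backoff;
        # e.g. run_query_with_retry(lambda: sqlite3.connect("state.sqlite"),
        #                           "SELECT * FROM changes")
        delay = 1
        for attempt in range(max_tries):
            try:
                conn = connect()
                try:
                    cur = conn.cursor()
                    cur.execute(sql, args)
                    return cur.fetchall()
                finally:
                    conn.close()
            except Exception:
                # dbapi2 error classes vary per driver, so catch broadly
                logging.exception("db error on attempt %d" % (attempt + 1))
                time.sleep(delay)
                delay *= 2       # log attempts w/backoff, per the notes
        raise DBUnavailableError(sql)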
=== DESIGN NOTES ===
-*- org-mode -*-
* databases: three databases, plus logserver
** Changes go in one database
** scheduling stuff (Scheduler state, builds ready-to-go/claimed/finished)
this includes BuildRequests and their properties
** status (steps, logids, results, properties)
the goal is for the buildmaster to never read from the status db, only
the status-rendering code (which will eventually live elsewhere)
* database connector
** all statusdb calls may raise DBUnavailableError
renderer should deliver error to client
** all schedulerdb calls should block, reconnect, retry */1s, log w/backoff
db is critical to this part
** config option to set DB type, connection arguments
** schema restrictions to get cross-db compatibility:
- declare column types (SQLite tolerates omitting them, but most
databases don't)
- revision ids will be strings, SVN will deal
- no binary strings. Unicode is ok(?).
* notification mechanism
- first milestone (non-distributed) will be all in-process
- distributed milestone will require pings
HTTP POST (forwards), TCP line-oriented (either), or just polling
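A sketch of the line-oriented TCP variant (the port number and the
re-scan hook are invented; the key property is that a ping carries no
payload, so a lost ping only costs latency, never correctness):

    from twisted.internet import reactor
    from twisted.internet.protocol import ServerFactory
    from twisted.protocols.basic import LineReceiver

    def rescan_database(table):
        # placeholder: the real hook would re-scan the named table
        print("re-scan of %s requested" % table)

    class PingReceiver(LineReceiver):
        def lineReceived(self, line):
            # each line just names a table that changed; the data
            # itself stays in the DB, so we merely re-scan
            rescan_database(line)

    factory = ServerFactory()
    factory.protocol = PingReceiver
    reactor.listenTCP(9988, factory)
    reactor.run()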
* persistent scheduler project
** Changemaster:
- (changeid, branchname, revisionid, author, timestamp, comment,
category?)
changeids must be comparable and monotonically increasing
- (changeid, filename)
i.e. changes[changeid].filenames = []
- (changeid, propertyname, propertyvaluestring)
i.e. changes[changeid].properties = {name: value}
*** add row to database, ping Schedulers (eventual-send)
*** ping all schedulers at buildmaster startup
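A first-cut SQLite rendering of the three change tables above might
look like this (table names and column types are guesses, following
the declare-types restriction):

    import sqlite3
    conn = sqlite3.connect("state.sqlite")
    conn.executescript("""
      CREATE TABLE changes (
        changeid   INTEGER PRIMARY KEY,  -- comparable, monotonic
        branchname VARCHAR(256),
        revisionid VARCHAR(256),         -- a string, so SVN revnums fit
        author     VARCHAR(256),
        timestamp  INTEGER,
        comment    TEXT,
        category   VARCHAR(256)
      );
      CREATE TABLE change_files (
        changeid INTEGER,
        filename VARCHAR(1024)
      );
      CREATE TABLE change_properties (
        changeid            INTEGER,
        propertyname        VARCHAR(256),
        propertyvaluestring TEXT
      );
    """)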
** Schedulers:
- all state must be put in DB
- each records last-change-number, only examines changes since then
- each records list of changes, with important/unimportant flag
- trickiest part will be relationships between Dependent schedulers
*** when pinged, or timer wakeup:
- loop over all Schedulers
- scan for unchecked changeids
- default Scheduler ignores changes on the wrong branch
- check importance of each
- add to changes table
- arrange for tree-stable-timer wakeup
- if all changes are old enough, and important, then submit build
- AnyBranchScheduler processes changes one branch at a time
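In rough Python, the per-scheduler scan might look like this sketch
(every db.* helper named here is invented; only fileIsImportant and
treeStableTimer are existing Scheduler concepts):

    def scan(self, now):
        # examine only changes that arrived since our last scan
        for ch in self.db.get_changes_since(self.last_changeid_checked):
            self.last_changeid_checked = ch.changeid
            if ch.branch != self.branch:
                continue     # default Scheduler ignores other branches
            self.db.add_scheduler_change(self.schedulerid, ch.changeid,
                                         self.fileIsImportant(ch))
        pending = self.db.get_pending_changes(self.schedulerid)
        if not pending:
            return
        youngest = max([c.when for c in pending])
        if now - youngest < self.treeStableTimer:
            # tree not stable yet: arrange a wakeup and try again
            self.db.add_timer(youngest + self.treeStableTimer)
        else:
            if [c for c in pending if c.important]:
                self.submit_buildset(pending)
            self.db.clear_pending_changes(self.schedulerid)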
*** Dependent (downstream):
- configured with an upstream scheduler, by name
- wants to be told when upstream BuildSet completes successfully,
receive SourceStamp object
- then submits a new BuildSet, using the same SourceStamp, with
different buildernames and properties
**** so, this scheduler ignores the changes table and watches active-builds
- defer figuring it out until I build the active-build table
*** Periodic
- (schedulername, last-build-started-time, last-changeid-built)
- if last-build-started-time + delay < now:
make SS with recent changes, submit buildset, update
last-build-started-time and last-changeid-built
- consider checking active-builds, avoid overlaps
- else: arrange for wakeup in (last-build-started-time + delay - now + epsilon)
**** after a long downtime, this should start a build
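A sketch of that check (delay, epsilon, and the db helpers are
invented names):

    def periodic_check(self, now):
        due = self.last_build_started_time + self.delay
        if due < now:
            # overdue (including after a long downtime): build now
            self.submit_buildset(self.make_recent_sourcestamp())
            self.db.set_last_build_started(self.schedulerid, now)
        else:
            self.db.add_timer(due + self.epsilon)   # wake when due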
*** Nightly/Cron
- like Periodic, but compute next build time differently
**** after a long downtime, this should *not* start a build
- maybe make that configurable, catchup=bool
*** Try: ignores changetable, just submits buildsets
*** schema:
- changes table: (schedulerid, changenum, important_p)
- timer table: (wakeup-time)
if min(wakeup-time) < now: empty table, ping all schedulers
**** default Scheduler
- (schedulerid, schedulername, last-changeid-checked)
**** Periodic
- (schedulername, last-build-started-time, last-changeid-built)
**** Triggerable
- really just maps scheduler name + properties to buildernames
- certain buildsteps can push the trigger, wait for completion
- ignores changetable, ignores buildtable
- does not use schedulerdb
**** SourceStamps
how to gc?
- (sourcestampid, branch, revision/None, patchlevel, patch)
- (sourcestampid, changeid)
*** scheduler has properties, copied into BuildSet
- doesn't need to be in the scheduler table, but might need to be in
BuildSet table
*** scheduler's output is a BuildSet, which has .waitUntilFinished()
- buildernames, sourcestamp, properties
** BuildSet
- have .waitUntilFinished(), used by downstream Dependent schedulers and
Triggerable steps
- (buildsetid, sourcestampid, reason, idstring, current-state)
idstring comes from Try job, to associate with external tools
- current-state in (hopeful, unhopeful, complete)
(no failures seen yet, some failures seen, all builds finished)
(idea is to notify early on first sign of failure)
- (buildsetid, buildername, buildreqid)
i.e. buildset.buildernames = []
- (buildsetid, propertyname, valuestring)
i.e. buildset.properties = {}
*** when all buildrequests complete, aggregate the results
- when each buildrequest completes, ping the buildsets
- this may change the buildset state
- buildset state changes should ping schedulers
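A sketch of that aggregation (all helper names are invented; the
result constants mirror buildbot.status.builder):

    SUCCESS, WARNINGS, FAILURE, SKIPPED, EXCEPTION = range(5)

    def update_buildset_state(db, buildsetid):
        # results has one entry per buildrequest, None if unfinished
        results = db.get_request_results(buildsetid)
        if None not in results:
            new_state = "complete"     # all builds finished
        elif FAILURE in results:
            new_state = "unhopeful"    # notify early on first failure
        else:
            new_state = "hopeful"      # no failures seen yet
        if db.set_buildset_state(buildsetid, new_state):
            ping_schedulers()          # invented hook: wake Dependents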
** BuildRequest
- created with reason, sourcestamp, buildername, properties
- can be merged with other requests, if sourcestamps agree to it
- given to Builder to add to the builder queue
- can be started multiple times: updates status, informs watchers
- can be finished once, informs watchers
- IBuildRequestControl: subscribe/un, cancel, .submit_time
not sure if anybody calls it.. words.py? a few tests?
- "reqtable": (buildrequestid, reason, sourcestampid, buildername,
claimed-at, claimed-by?)
- (buildrequestid, propertyname, propertyvalue)
** Builder
- .buildable, .building
- submitBuildRequest adds to .buildable, pings maybeStartAllBuilds
- what is __getstate__/__setstate__ doing there?
*** so we need the Builder to scan the reqtable
- this is the part that will get distributed
- Builder A can claim any buildrequest that's for it and not yet
claimed, or one that was claimed but got orphaned by a dead
buildmaster; maybe have a timestamp or two
- "claimed-at" holds timestamp, starts at 0, updated when a buildmaster
grabs it, refreshed every once in a while. req can be claimed by
someone else when (now - claimed-at) > timeout.
- when the build is done, the buildrequest is removed from the reqtable
and the buildset is examined
- to cancel a request: remove it from the table
- add submit-time or submit-sequence, to provide first-come-first-built
to accelerate a request, change that value
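The claim itself can be a single guarded UPDATE, so two
build-process-masters cannot grab the same request; a sketch
(column names follow the reqtable above with hyphens rendered as
underscores; the timeout value is arbitrary):

    CLAIM_TIMEOUT = 3600   # seconds; arbitrary for this sketch

    def claim_request(conn, brid, master_name, now):
        # wins only if unclaimed (claimed_at=0) or the previous claim
        # is older than the timeout, i.e. orphaned by a dead master
        c = conn.cursor()
        c.execute("UPDATE buildrequests"
                  " SET claimed_at=?, claimed_by=?"
                  " WHERE buildrequestid=? AND claimed_at<?",
                  (now, master_name, brid, now - CLAIM_TIMEOUT))
        conn.commit()
        return c.rowcount == 1   # true means we own the build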
* LogServer
** writer-side PB interface:
- open(title) -> logid string
- write(logid, channel, data)
- close(logid)
logfile is renamed (from LOGID.open to LOGID.closed) upon close
- get_base_url()
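A server-side sketch of that interface, using twisted.spread.pb (the
file layout, port, and URL are placeholders):

    import os
    from twisted.spread import pb
    from twisted.internet import reactor

    class LogServer(pb.Root):
        def __init__(self):
            self.logs = {}       # logid -> open file
            self.counter = 0

        def remote_open(self, title):
            self.counter += 1
            logid = str(self.counter)
            self.logs[logid] = open("%s.open" % logid, "wb")
            return logid

        def remote_write(self, logid, channel, data):
            # channel (stdout/stderr/header) is ignored in this sketch
            self.logs[logid].write(data)

        def remote_close(self, logid):
            self.logs.pop(logid).close()
            os.rename("%s.open" % logid, "%s.closed" % logid)

        def remote_get_base_url(self):
            return "http://example.org/logs/"   # placeholder

    reactor.listenTCP(9999, pb.PBServerFactory(LogServer()))
    reactor.run()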
** buildmaster sends async writes, queues limited amount of requests
- fire-and-forget-after-30s, discard if queue grows too big
- goal is to tolerate LogServer bounces but not consume lots of memory
** reader-side HTTP interface:
*** logid URL shows title, filesize, option links, open/closed status
- with/without headers
- just stderr
- last N lines (when closed), last N lines plus headers
- reads when open do tail-f
*** all option links are normal statically-computable URLs
* DB-based status writer
** write logserver baseurl into DB each time LogServer PB connection is made
** indirect this, to plan for multiple LogServers (logserverid=1 for now)
- (stepid, logserverid, logid)
- (logserverid, logserver_baseurl)
* DB-based status renderer
* random ideas to keep in mind
** scheduler db is small
- so rather than coming up with clever queries, just grab everything,
sort it in memory
- also useful to avoid doing multiple queries
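i.e. rather than a clever WHERE/ORDER BY, something as blunt as:

    import sqlite3
    conn = sqlite3.connect("state.sqlite")
    c = conn.cursor()
    c.execute("SELECT * FROM changes")   # grab the whole (small) table
    rows = c.fetchall()
    rows.sort()                          # sort by changeid in memory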