[Buildbot-devel] RFC: Crowd-sourcing CI using volunteer computing

Tue Mar 12 13:41:54 UTC 2013

Why the focus on offline builds?  I think that's actually the hardest
part of this project.

I don't think "pulling jobs" requires a pre-baked build script.  In
some sense, it might work already: when a slave is available, it
connects to the master, and if there's work to do the master will
start a job.  There are some difficulties here in that a slave doesn't
have a way to know that it's finished a job, since it only sees
commands.  So the least-complicated fix is to alter the protocol so
that the slave can reason about builds, too: add a way to ask a
connected slave "hey, are you ready to start a build?", plus
notifications that a build is starting and ending.  Then the slave can
schedule its jobs appropriately, even if it's connected to multiple
masters.
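
To make that concrete, here is a rough sketch of the slave-side
bookkeeping such a protocol change would allow.  None of these names
exist in Buildbot today; BuildAwareSlave, ready_for_build,
build_started and build_finished are all made up for illustration,
and the network layer is left out entirely.

```python
class BuildAwareSlave:
    """Hypothetical slave that tracks builds rather than only commands."""

    def __init__(self, max_builds=1):
        self.max_builds = max_builds
        # (master, build_id) pairs for builds currently running here
        self.active = set()

    def ready_for_build(self, master):
        # The master asks: "are you ready to start a build?"
        return len(self.active) < self.max_builds

    def build_started(self, master, build_id):
        # Notification from a master that a build is starting.
        if not self.ready_for_build(master):
            raise RuntimeError("slave is busy")
        self.active.add((master, build_id))

    def build_finished(self, master, build_id):
        # Notification from a master that a build has ended.
        self.active.discard((master, build_id))
```

Because the slave sees build boundaries, it can turn away a second
master while a build from the first is still running, which it cannot
do if all it ever sees is a stream of commands.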

Another possibility is to implement status-receiver-only builds.  That
is, we'd add an API for slaves to provide all of the relevant data
about a build that they performed independently.  That would include
API calls to create a new build (including sourcestamps, properties,
etc.), create a new step in that build, and create logs for each step.
Then slaves can run their own scheduling algorithm -- the simplest of
which, used by Tinderbox, is 'while true; do ..; done'.
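
As a sketch of what those API calls might look like, assuming an
in-memory store on the master side: BuildReportAPI, new_build,
new_step, add_log and report_one_build are hypothetical names, not
anything Buildbot provides, and a real version would need
authentication and persistence.

```python
import itertools


class BuildReportAPI:
    """Hypothetical master-side API; keeps reported builds in memory."""

    def __init__(self):
        self.builds = {}
        self._ids = itertools.count(1)

    def new_build(self, builder, sourcestamp, properties=None):
        # Create a build record and hand back its id.
        build_id = next(self._ids)
        self.builds[build_id] = {
            "builder": builder,
            "sourcestamp": sourcestamp,
            "properties": dict(properties or {}),
            "steps": [],
        }
        return build_id

    def new_step(self, build_id, name):
        # Append a step to the build and return its index.
        steps = self.builds[build_id]["steps"]
        steps.append({"name": name, "logs": {}})
        return len(steps) - 1

    def add_log(self, build_id, step_index, log_name, content):
        # Attach a named log to one step of the build.
        step = self.builds[build_id]["steps"][step_index]
        step["logs"][log_name] = content


def report_one_build(api, revision):
    # What a slave would do after finishing a build on its own.
    build_id = api.new_build(
        "volunteer", {"revision": revision}, {"os": "linux"}
    )
    step = api.new_step(build_id, "compile")
    api.add_log(build_id, step, "stdio", "make: ok\n")
    return build_id
```

A Tinderbox-style slave would then just call something like
report_one_build in its own loop, with no scheduling on the master.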

Historically, we've gotten ourselves in a lot of trouble by making
fundamental changes to Buildbot that are incomplete or have arbitrary
limits.  For example, latent buildslaves have a number of nasty (and
sometimes expensive) gotchas.  I think we got codebases (mostly)
right, but the development and review for that took Harry several
months of nearly full-time work.  So I think we should try to avoid
fundamental changes as much as possible, and where they are made,
implement them completely.

Dustin