[Buildbot-devel] RFC: Crowd-sourcing CI using volunteer computing
Dustin J. Mitchell
dustin at v.igoro.us
Tue Mar 12 13:41:54 UTC 2013
Why the focus on offline builds? I think that's actually the hardest
part of this project.
I don't think "pulling jobs" requires a pre-baked build script. In
some sense, it might work already: when a slave is available, it
connects to the master, and if there's work to do the master will
start a job. There are some difficulties here in that a slave doesn't
have a way to know that it's finished a job, since it only sees
commands. So the next-simplest fix is to alter the protocol
so that the slave can reason about builds, too -- a way to ask a
connected slave "hey, are you ready to start a build?" plus
notifications that a build is starting and ending. Then the slave can
schedule its jobs appropriately, even if it's connected to multiple
masters.
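
A rough sketch of the slave-side bookkeeping this would allow, in Python. None of this is existing Buildbot protocol: the class name, the method names (is_ready, build_started, build_finished) and the max_builds limit are all made up for illustration, and the wire format is left out entirely.

```python
class BuildAwareSlave:
    """Hypothetical slave-side state: tracks builds, not just commands."""

    def __init__(self, max_builds=1):
        # Assumed policy: the slave caps how many builds it runs at once.
        self.max_builds = max_builds
        self.active = set()  # (master, build_id) pairs currently running

    def is_ready(self, master):
        # Answer to a master asking "are you ready to start a build?"
        return len(self.active) < self.max_builds

    def build_started(self, master, build_id):
        # Notification that a build is starting on this slave.
        if not self.is_ready(master):
            raise RuntimeError("slave is busy")
        self.active.add((master, build_id))

    def build_finished(self, master, build_id):
        # Notification that a build has ended; frees the slot.
        self.active.discard((master, build_id))
```

Because the state is keyed by (master, build_id), a slave connected to two masters can tell the second one it is busy while the first one's build is still running.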
Another possibility is to implement status-receiver-only builds. That
is, an API through which slaves report all of the relevant data about
a build that they performed independently. That would include API
calls to create a new build (including sourcestamps, properties,
etc.), create a new step in that build, and create logs for each step.
Then slaves can run their own scheduling algorithm -- the simplest of
which, used by Tinderbox, is 'while true; do ..; done'.
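
To make the shape of such an API concrete, here is a minimal in-memory sketch in Python. It is only an assumption about what the calls might look like: InMemoryStatusAPI, create_build, create_step, create_log, run_and_report and build_forever are hypothetical names, not anything Buildbot provides, and a real version would talk to the master over the network.

```python
import itertools


class InMemoryStatusAPI:
    """Stand-in for a hypothetical master-side reporting API."""

    def __init__(self):
        self.builds = []

    def create_build(self, sourcestamp, properties):
        build = {"sourcestamp": sourcestamp,
                 "properties": dict(properties),
                 "steps": []}
        self.builds.append(build)
        return len(self.builds) - 1  # build id

    def create_step(self, build_id, name):
        steps = self.builds[build_id]["steps"]
        steps.append({"name": name, "logs": {}})
        return len(steps) - 1  # step id

    def create_log(self, build_id, step_id, name, content):
        self.builds[build_id]["steps"][step_id]["logs"][name] = content


def run_and_report(api, sourcestamp, steps, properties=None):
    # The slave runs each step itself and reports the result afterwards.
    build_id = api.create_build(sourcestamp, properties or {})
    for name, func in steps:
        step_id = api.create_step(build_id, name)
        api.create_log(build_id, step_id, "stdio", func())
    return build_id


def build_forever(api, next_sourcestamp, steps, limit=None):
    # The slave's own scheduler: a Python analogue of a shell 'while true' loop.
    # `limit` exists only so the sketch can be exercised without running forever.
    counter = itertools.count() if limit is None else range(limit)
    for _ in counter:
        run_and_report(api, next_sourcestamp(), steps)
```

Here each step is just a callable returning its log text; the point is that the master only ever receives finished data and never issues commands.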
Historically, we've gotten ourselves in a lot of trouble by making
fundamental changes to Buildbot that are incomplete or have arbitrary
limits. For example, latent buildslaves have a number of nasty (and
sometimes expensive) gotchas. I think we got codebases (mostly)
right, but the development and review for that took Harry several
months of nearly full-time work. So I think we should try to avoid
fundamental changes as much as possible, and where they are made,
implement them completely.
Dustin