[Buildbot-devel] upcoming "build on branches" feature

Tue May 24 22:53:37 UTC 2005

Here are my notes on the feature I'm starting on this week. The details still
need some working out, but I wanted to share what I'm thinking so everybody
can chime in. I'll be doing this work on a branch, so that "0.7.0" will be
the first release to contain the new code (even if there are 0.6.7-0.6.9
bugfix releases first). My goal is to include the long-discussed "try"
feature in that release as well, since it is a natural extension of the
"build this branch" framework.

Be aware that I'm just copying in notes from my buildbot diary here, so I'm
working out the ideas and terminology as I go along. Later stuff tends to
supersede earlier stuff, and nothing's nailed down yet.

 -- begin notes --

Now I want to take on SF#1200394, building branches.

Each VC system has a different approach. The general idea is that each build
has three input properties which might change from build to build (as opposed
to the setup of their steps, which is fixed by the config file):

 branch
 revision
 patch

'patch' is for the 'try' feature. 'revision' comes from Changes, some
versionstamp which will build everything in all the Changes (and nothing
else). 'branch' is the new one.

Each Build should get a .branch attribute. Each VC step should look in its
Build to see what branch to use. None means HEAD or Trunk or whatever default
the VC system normally provides (which may require some configuration.. CVS
has a default HEAD, but SVN needs it provided as part of the svnurl). The
branch name will get sent to the slave VC commands in different forms.
Conveniently this doesn't require changes to the slave code.

 Arch/Baz: this is just the args['version'] key
 CVS: args['branch']
 SVN: needs to be appended to args['svnurl']
 Darcs: "branches" are equivalent to repositories, so really the branch name
        would replace args['repourl'], but that is a security issue
 Git: good question, maybe args['repourl'] too
 P4: not applicable, would involve modifying the viewspec
 Monotone: will be args['branch']. checkouts require either a revision ID
           or a branch name. Each revision ID belongs to a single branch.
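
A minimal sketch of that mapping, for illustration only. The function name
apply_branch and the lowercase VC labels are hypothetical; the args keys are
the ones listed above. Darcs, Git and P4 are left out because the notes don't
settle on a mapping for them, and the SVN case assumes svnurl is a base URL
to which the branch path can simply be appended.

```python
def apply_branch(vc, args, branch):
    """Return a copy of args with the branch folded in for the given VC system."""
    args = dict(args)
    if branch is None:
        # None means whatever default the VC system normally provides
        return args
    if vc in ("arch", "baz"):
        args["version"] = branch
    elif vc in ("cvs", "monotone"):
        args["branch"] = branch
    elif vc == "svn":
        # assumption: svnurl is a base URL and the branch is a path below it
        args["svnurl"] = args["svnurl"].rstrip("/") + "/" + branch.lstrip("/")
    else:
        # darcs, git, p4: no mapping decided in the notes
        raise ValueError("no branch mapping sketched for %r" % (vc,))
    return args
```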

** Update versus Clobber

You can't update a tree into a different branch (at least it doesn't seem
like a good idea). It probably makes sense to put a .buildbot.branch stamp in
the checked out tree, to remember which branch was last used. If the branch
stamp is different than what we want to wind up with, clobber the tree first.
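
Roughly, the stamp check could look like the sketch below. The helper names
(read_stamp, prepare_tree) are made up for illustration, and recording the
default branch as the string "HEAD" is an assumption. A tree with no stamp at
all is treated as unknown and clobbered too.

```python
import os
import shutil

STAMP = ".buildbot.branch"  # stamp filename as suggested in the notes


def read_stamp(workdir):
    """Return the branch recorded in workdir, or None if there is no stamp."""
    try:
        with open(os.path.join(workdir, STAMP)) as f:
            return f.read().strip()
    except (IOError, OSError):
        return None


def prepare_tree(workdir, branch):
    """Clobber workdir if it was last used for a different branch, then stamp it.

    Returns True if an existing tree was removed.
    """
    wanted = branch or "HEAD"  # assumption: None is recorded as "HEAD"
    clobbered = False
    if os.path.isdir(workdir) and read_stamp(workdir) != wanted:
        shutil.rmtree(workdir)
        clobbered = True
    if not os.path.isdir(workdir):
        os.makedirs(workdir)
    with open(os.path.join(workdir, STAMP), "w") as f:
        f.write(wanted + "\n")
    return clobbered
```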

Or, we could reserve a Builder (and therefore a builddir) for HEAD, and use a
separate one for branches. The HEAD Builder would use mode=update as usual,
while the branches would always use mode=clobber.

Another point on the disk-versus-network axis is to use mode=copy, but use a
different copydir when working with any branch. Really that would mean
mode=copy for HEAD, but for other branches do an rmtree(branchdir) then
mode=copy with copydir=branchdir.

** Security

The buildslaves are pwned by anyone who can commit code to a repository that
they pull from. Branches add a wrinkle to this. Now the slaves are owned
by anyone who can both commit changes to a given branch and then get the
buildmaster to build from that branch.

Most VC systems don't offer particularly fine-grained ACLs: write access to
the repository is an all-or-nothing affair. SVN probably does better. If the
repository is using per-branch permissions, then either the admin needs to
provide a list of acceptable branches, or they must accept that the slaves
will build code for anything in the repository regardless of ACLs that might
prevent certain people from committing to certain branches. (i.e. the
buildbot will be more permissive than the repository).

** How named-branch Builds are triggered

The most obvious way is that someone specially requests one, with
forceBuild(). It should acquire a branch= argument (and the Builder should
have a default branch in case one is not provided).

Another way is to have Changes trigger non-trunk builds. This would work for
an environment in which branches are shared and need to be maintained just
like HEAD. Ideally, the normal buildbot behavior should just be a degenerate
case where HEAD is the only version tracked.

It might be easiest to do this by having a configured list of branches which
are allowed to trigger builds. The default case just has ["HEAD"] (or
[None]). This would allow well-established/shared branches to be tracked,
while other personal branches could be built only on demand. A separate list
of valid branch names would provide for ACL security (for SVN repositories
which bother with it), but should default to allowing everything.
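
As a sketch of how the two lists might interact (the function name and the
three return strings are hypothetical, not an existing buildbot API):
trigger_branches defaults to just the trunk, and valid_branches=None stands
for "allow everything".

```python
def classify_change(branch, trigger_branches=(None,), valid_branches=None):
    """Decide what to do with a Change on the given branch.

    Returns "trigger", "on-demand", or "rejected".
    """
    # valid_branches is the ACL-style list; None means everything is allowed
    if valid_branches is not None and branch not in valid_branches:
        return "rejected"
    # shared, well-established branches trigger builds automatically
    if branch in trigger_branches:
        return "trigger"
    # anything else is only built when somebody asks for it
    return "on-demand"
```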

To do this, each Change must have a .branch attribute. The ChangeSource is
responsible for providing it. This is easier to determine for some VC systems
than others:

 Arch/Baz: easy, each commit is part of a specific 'version'
 CVS: Thomas' cvstag change pulls this from the mail parser
 SVN: this is tricky, since the URL is just base://base/branch../dir/file .
      Given just the URL of the file that was changed, you can't tell where
      the branch ends and the tree begins. The ChangeSource will need logic
      to do the split correctly, possibly with a predefined set of branch
      names.
 Darcs: equivalent to the repository name. To follow multiple "branches", the
        notifier must watch multiple repositories.
 Git: dunno
 P4: dunno, it uses a path scheme like SVN
 Monotone: each revision has a specific branch. The change source needs to
           extract it and include it in the change notification.
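
For the SVN case, the split with a predefined set of branch names could be
sketched like this. The function name split_svn_path is made up, and it
assumes the ChangeSource sees repository-relative paths and that the admin
supplies the branch prefixes.

```python
def split_svn_path(path, branches):
    """Split a repository-relative path into (branch, file_within_tree).

    branches is a predefined list of branch prefixes, for example
    ["trunk", "branches/rel-1"]. Returns (None, path) if no prefix matches.
    """
    # try longer prefixes first so that nested prefixes resolve to the deepest one
    for prefix in sorted(branches, key=len, reverse=True):
        prefix = prefix.strip("/")
        if path == prefix:
            return prefix, ""
        # require a "/" after the prefix so "rel-1" does not match "rel-10"
        if path.startswith(prefix + "/"):
            return prefix, path[len(prefix) + 1:]
    return None, path
```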

** Status display

It should be clear when a non-HEAD build is running (so it is obvious why
other builds aren't). It should also not be confused with HEAD builds. Status
targets that track current "tree is good / tree is broken" status should
handle each branch separately, and remember that non-HEAD builds will not
necessarily be re-built once they've been fixed. The tag= attribute that
thomasvs added should be reworked into a branch= attribute. Each BuildStatus
should have a .getBranch() or something (maybe part of .getSourceStamp).

** all-builder passing

Building from branches (like the 'try' feature) is useful to answer the
question "Is it safe to commit my changes to the trunk?". Sometimes
developers want to force a build on a specific Builder, but more often they
want to run their changes on all Builders, just like committing it would. It
would be convenient to have some logic that watches all Builders and
correlates their Builds with the inbound Changes (or Forces), so it could tag
each Change or Force with a pass/fail status based upon a set of Builders.

One component of this is changing forceBuild() to be more like submitBuild(),
which is a less-granular form of submitChange(). The former distributes
BuildRequests to all Builders, while the latter distributes Changes to them.
The Builders are free to deal with Changes as they see fit, ignoring them or
scheduling Builds to incorporate those Changes. The BuildRequests bypass this
mechanism, and just get scheduled to run the next time the slave is free.
(this also hints that Builders should have a queue of pending builds, and
some prioritization logic, and that the Change distribution should be a bit
more clever).

side note: we might need a better name for the thing submitted by this
submitBuild(). "Build" is a single build of a single sourcestamp by a single
Builder. The thing we're talking about is a suite of builds (all of the same
sourcestamp) across multiple Builders.

The other component is status reporting. The developer who submits their
personal branch for testing can watch the waterfall display until the
component Builds finish, and eyeball-merge the results together, but really
there should be something to track all the Builds and notify them upon the
first failure or the last success.

So imagine a BuildSet object. It takes a source stamp and a list of
Builders. It has one "soon" Deferred which fires on the first failure (if no
Builds fail, it fires when the last Build finishes). It also has a "done"
Deferred which doesn't fire at all until all Builds have completed (and when
it fires it indicates whether something went wrong).
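
A rough sketch of the soon/done behaviour. To keep it runnable with only the
standard library it uses concurrent.futures.Future as a stand-in for a
Twisted Deferred; the method name build_finished and the result constants are
assumptions, and Builders are represented by their names only.

```python
from concurrent.futures import Future

SUCCESS, FAILURE = "success", "failure"


class BuildSet:
    """Tracks one source stamp across several Builders."""

    def __init__(self, source_stamp, builder_names):
        self.source_stamp = source_stamp
        self.pending = set(builder_names)
        self.failed = False
        self.soon = Future()  # first failure, or last finish if nothing fails
        self.done = Future()  # only once every Build has finished

    def build_finished(self, builder_name, result):
        """Record the result of one component Build."""
        self.pending.discard(builder_name)
        if result == FAILURE:
            self.failed = True
            if not self.soon.done():
                self.soon.set_result(FAILURE)
        if not self.pending and not self.done.done():
            overall = FAILURE if self.failed else SUCCESS
            if not self.soon.done():
                self.soon.set_result(overall)
            self.done.set_result(overall)
```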

This builder list should have some symbolic names, like "all", which the
buildmaster admin can configure to a list of all the Builders that developers
are responsible for keeping happy. This should use thomas' tag= attributes.

I really want to generalize this. I'm thinking that, instead of individual
Builders/Builds making when-to-build decisions, there should be a layer which
accepts Changes and (later) emits BuildSets. The current distinction
between, say, a "quick" Builder and a set of "full" builders should be
replaced with two instances of this BuildSet-producing layer, with
different treeStableTimer values and different isFileImportant
implementations. Ideally they should share Builders (so each Builder is just
an architecture); that requires the quick/full distinction being expressed in
the sourcestamp, which I'm not entirely comfortable with.

The objects that this layer emits should be communicated to the status
targets too, with messages like:

 BuildSetStarted: put in the queue on the various builders
  BuildStarted: one of the component builds has begun
  BuildFinished: one of the component builds has finished
 BuildSetFailed: the first failure was observed
 BuildSetFinished: all builds have finished

Ok, one down side to this approach is that Builds are capable of testing
multiple changes at once (more or less, with the obvious loss of
granularity). The twisted 'reactors' builder takes an hour or two to run, and
if it was asked to match Build for Build with the faster Builders, it would
never catch up. The current "build anything that's stable" approach is a lot
more effective.

This doesn't necessarily invalidate the BuildSet thing (manually requested
BuildSets would behave as before, it's just Change-triggered BuildSets
that are the issue). The BuildSet could be an output thing instead of an
input thing, watching the Changes that come in and tracking all the Builds
that contained them. (this would make it a status thing, but not a scheduling
thing).

Hrm.

Create a BuildRequest object that contains the source stamp. These are passed
to Builders, which put them on the queue. submitBuild() creates regular ones,
which are built independently. When the Change-driven scheduler layer decides
a set of Changes is stable enough to build, it creates a BuildSet which
creates mergeable BuildRequest objects. When the Builder gets a second
mergeable BuildRequest, it is allowed to merge the two together.

Ooh, yes. The mergeability depends upon the branch (that is, the BuildRequest
has both a "canBeMerged" flag and a branch name, and merges can only happen
between BuildRequests that are both mergeable and have the same branch name).
This would allow incoming Changes from multiple branches to be built as soon
as possible. Merging will never delay a build, and the mergeable flag means
that explicitly requested builds will run with exactly the set of sources
requested.
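
The merge rule could be sketched as follows. Everything here is illustrative:
the class and method names are assumptions, and a plain list of revisions
stands in for a real source stamp. A new request is folded into an
already-queued one only when both are mergeable and on the same branch, so a
merge never pushes a build later than it would otherwise have run.

```python
class BuildRequest:
    def __init__(self, revisions, branch=None, mergeable=False):
        self.revisions = list(revisions)  # stand-in for a real source stamp
        self.branch = branch
        self.mergeable = mergeable

    def can_merge_with(self, other):
        # both sides must opt in, and the branches must match
        return self.mergeable and other.mergeable and self.branch == other.branch


class BuilderQueue:
    """Pending requests for one Builder."""

    def __init__(self):
        self.pending = []

    def add_request(self, req):
        # fold into a compatible queued request instead of queueing a new build
        for queued in self.pending:
            if queued.can_merge_with(req):
                queued.revisions.extend(req.revisions)
                return queued
        self.pending.append(req)
        return req
```

Explicitly requested builds would be created with mergeable=False, so they
always stay separate entries in the queue.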

The scheduling thing.. let's call it a Scheduler. It decides when to submit
BuildRequests to the Builders. (the Builders get to decide when to actually
run the resulting Builds, however, so the Scheduler doesn't control all
aspects of scheduling).

In particular, Locks are still between Builds, not BuildRequests. However,
Dependencies should (somehow) go through Schedulers. I think it will be
enough to have each Scheduler list the set of other Schedulers that they
depend upon (and maybe have them list the actual Schedulers, not just their
names, to make it impossible to create a loop). Internally, the downstream
one registers with the upstream one to receive information about Changes. It
needs to know when the upstream one has decided that a given Change has
passed or failed. It can also find out when the Change has been ignored
(which probably counts as success here). All Schedulers are guaranteed to get
all Changes, so the downstream one can do 'd=upstream.getResults(changenum)',
which will fire with SUCCESS or FAILURE (or maybe SKIPPED). The downstream
Scheduler has a potential BuildSet, which waits until the upstream one has
passed all of its component Changes.

Schedulers should be able to take input from more than just ChangeSources.
Consider the case where you want to trigger a build every time some remote
buildmaster has completed a successful build of some library that this
project uses. You'd have a BuildTrigger object of some kind which subscribes
to a PBListener status port on the remote buildmaster. You might want these
to be called something else, or somehow mark them as not taking Changes. Or,
maybe just let them take Changes too, just ignore them.

** control flow

change-driven builds:
 ChangeSources send Changes to all Schedulers
 Schedulers send BuildRequests to (some) Builders
 Builders accumulate/prioritize BuildRequests, create and start Builds
 Schedulers watch BuildRequests, trigger dependent Schedulers

higher level builds:
 BuildSources(sp?) give BuildSets to master
  use Schedulers instead?
 master sends BuildRequests to (specified) Builders

** Pieces to implement

 branch= arg/attribute on:
  VC slave commands (with .buildbot.branch stamp, clobber if changed)
  VC steps
  Build
  default value in Builder
  forceBuild (one Builder at a time)

 Change.branch

 BuildSet
  BuildSet(branch=None?, source_stamp, patch=None, builders)
  .waitUntilFirstFailure
  .waitUntilFinished
  control.addBuildSet(buildset)

 BuildSet status notification
  forms the basis for Problem tracking

 BuildRequest(source_stamp, branch, mergeable)
  .waitUntilFinished
 Builder.addRequest(buildrequest)

 Scheduler(name, treeStableTimer, branches, fileIsImportant, builders)
 c['schedulers'] = [s1,s2]
  default scheduler delivers to all builders

 changemaster.addChange: submits to all Schedulers

eventually:
 turn BuildRequest/Build into Build/BuildProcess

** more thoughts

The status events created by BuildSets finishing will form the basis for
Problem tracking.

Schedulers can also subscribe to hear about build status. The "retry
transient failed builds" scheduler would do this to watch for builds that had
failed in one builder (but not the others of the set) and re-try them. I
don't know how to relate this to the BuildSet, though, unless something could
say that the BuildSet hadn't really failed yet because some retries might
still be pending.

 -- end notes --

Let me know what y'all think.

cheers,
 -Brian