[Buildbot-devel] Project dependencies, building branches, etc

Wed Jun 22 07:09:13 UTC 2005

> Plus, it didn't seem like anyone had given you a solid use case (at least
> not on list) for multiple projects, so I thought I could maybe be of use.

Definitely. As the code shapes up, keep taking a look at the design and let
me know how it does or does not handle your needs.

> After reading this I'm not sure what you're trying to accomplish with the
> Scheduler.

Well, to be honest, it's in flux. I realize that I don't yet know enough to
be sure that I've got a good design.

You can think of ChangeSources, BuildMasters, Schedulers, Builders, and
Slaves all as nodes in a big interconnected graph. Earlier versions of the
buildbot had fairly rigid limitations on how they could be interconnected
(specifically, the Scheduler functionality was inseparable from the Builder,
and all ChangeSources fed into all Builders). Over time, as we recognize
those somewhat arbitrary limitations, we remove them, and the graph becomes
more flexible. ChangeSources can map arbitrarily to Schedulers, Schedulers
can map arbitrarily to Builders. With PBStatusListeners and custom Scheduler
classes, it doesn't matter quite as much whether two nodes happen to be in
the same host process or not.
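
As a rough illustration of that graph, here is a minimal Python sketch. It is
not buildbot's configuration format; the node names and the two dictionaries
are made up, and only the shape of the mapping matters.

```python
# Hypothetical node names; only the shape of the graph matters here.
SCHEDULERS_FOR_SOURCE = {
    "svn-core": ["core-quick", "core-nightly"],
    "svn-bindings": ["bindings-quick"],
}
BUILDERS_FOR_SCHEDULER = {
    "core-quick": ["core-linux"],
    "core-nightly": ["core-linux", "core-osx"],
    "bindings-quick": ["bindings-linux"],
}


def builders_reached_by(change_source):
    """Return every Builder that a Change from this source can reach."""
    reached = set()
    for scheduler in SCHEDULERS_FOR_SOURCE.get(change_source, []):
        reached.update(BUILDERS_FOR_SCHEDULER.get(scheduler, []))
    return sorted(reached)
```

Because each mapping is just a table, rewiring a ChangeSource to a different
Scheduler (or a Scheduler to a different set of Builders) changes one entry
and nothing else.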

The BuildMaster is coded in such a way that it doesn't specifically refuse to
share a process with another BuildMaster instance. A little bit of work
(mostly involving how to share TCP ports between the two) would make this
possible. I'm not saying that it's a good idea, but there's no reason to make
the job of (some day) moving to such a scheme any harder than it needs to be.

I think that, as a next step (for 0.7.0), the
ChangeSource->Scheduler->Builders approach should provide for more
functionality than what we've got in 0.6.x, and from there we can figure out
what the next direction needs to be. Experience with projects that involve
semi-independent sub-projects, like yours and GStreamer, will feed these
design decisions.

The big question in my mind right now, as I write the code, is how to
represent the status of everything. It may be the case that buildbot
installations which do tricky clever things involving multiple projects will
also have to do tricky clever things involving StatusTargets that provide
useful displays of build status. The existing Waterfall display is a
chronological view: as we get into BuildSets and BuildRequests and multiple
slaves per builder, perhaps we need a more Change- or BuildSet- oriented
view.

> I guess what you're suggesting is that the B build do all the steps for A
> as well, but be triggered by A building successfully. But this introduces
> extra complexity, as you pointed out. Why do you prefer this way?

I'm not sure that I do prefer it. I think that we'll have the flexibility to
implement this sort of thing in multiple ways, depending upon how much
separation you want between your Builders. Remember that each Builder is
nominally independent, so it may be hard to get compiled code from one to
another.

Let's try for a concrete example. If Thomas will forgive my complete
mischaracterization of his project, let's say that GStreamer has two big
pieces: the (C-based) GStreamer library called libgstreamer, and the python
bindings called pygst. The python bindings depend upon the core library.
Let's pretend that they have separate SVN repositories, and that changes are
showing up in both all the time. We want to make sure that -rHEAD of
everything remains in good working order.

The pygst build process normally compiles the python bindings against the
pre-compiled libgstreamer header files and libraries that are installed on
the system (in /usr/include and /usr/lib). To do something other than that
requires some build-time flags (say, ./configure
--with-gstreamer=/other/path).

A build of pygst can be described by a two-part SourceStamp: the first part
describes the version of libgstreamer that was used, the second part
describes the pygst code that was used. You can imagine SourceStamps
describing a variety of combinations: ("0.8.1", "0.8.1") for two released
versions, ("0.8.1", "r1234") for an SVN version of pygst compiled against a
released version of libgstreamer, etc. Different goals might prompt you to
want to validate different combinations. You could imagine a step.Source
subclass which worked with named+released versions of projects instead of
their SVN repositories, so "0.8.1" could translate into an instruction to
download libgstreamer-0.8.1.tar.gz from some HTTP server and unpack it. This
would let you get into Schedulers that watched mailing lists or freshmeat.net
or whatever, instead of VC repositories. The functionality this could add
would mostly be adding corners to the "four corners" testing matrix: latest
SVN of A against latest release of B, etc.

But, for our purposes, we'll generally be building ("r1234", "rHEAD"), where
r1234 is the latest known-working revision of libgstreamer.
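
To make the two-part SourceStamp concrete, here is a small Python sketch. The
TwoPartSourceStamp class and the is_release helper are hypothetical (nothing
like them exists in buildbot today), and the "SVN revisions start with r"
convention is only an assumption borrowed from the examples above.

```python
from typing import NamedTuple


class TwoPartSourceStamp(NamedTuple):
    """Hypothetical: one version label per sub-project."""
    libgstreamer: str  # upstream library: release name or SVN revision
    pygst: str         # python bindings: release name or SVN revision


def is_release(label):
    # Assumed convention: SVN revisions look like "r1234" or "rHEAD",
    # anything else is a named release such as "0.8.1".
    return not label.startswith("r")


both_released = TwoPartSourceStamp("0.8.1", "0.8.1")
svn_bindings = TwoPartSourceStamp("0.8.1", "r1234")
usual_build = TwoPartSourceStamp("r1234", "rHEAD")
```

A step.Source subclass that understood releases could use something like
is_release to decide between fetching a tarball and doing an SVN checkout.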

Let's describe two scenarios:

 Scenario 1: separate builders

  libgstreamer is maintained by a different group than pygst, and they've each
  got their own buildbot setup. The libgstreamer folks only pay attention to
  the C code, and ignore the python bindings. The libgstreamer buildbot
  watches SVN for libgstreamer changes, compiles them, and runs tests. It
  publishes build status on a TCP port via PB for anyone who chooses to
  subscribe.

  The pygst folks have their own buildbot. It watches the libgstreamer
  buildbot to find out when a new (working) -rHEAD of libgstreamer is
  available. There are two situations that prompt it to rebuild pygst:
  libgstreamer has been updated, or pygst has changed. If the only
  information it gets from the libgstreamer buildbot is the revision number
  of the tree that built successfully (that is, we're not downloading
  binaries or anything), then it will somehow need to fetch and compile its
  own copy. They can do this by just copying a Builder config from the
  "upstream" libgstreamer buildbot, so that Builder A does an SVN checkout of
  libgstreamer and compiles it normally.

  I'm envisioning three Schedulers in this setup. The first subscribes to the
  upstream BuildMaster and just triggers a local libgstreamer build each time
  the upstream build succeeds. The second watches the pygst SVN repository
  for Changes and triggers a local pygst build each time something has
  changed. The third watches the local libgstreamer build and triggers a
  local pygst build each time it succeeds.
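
  The three Schedulers can be sketched with a toy model. This is not the
  Scheduler API (which is still being written); ToyBuilder,
  trigger_on_success, and on_pygst_change are hypothetical names used only
  to show who triggers what.

```python
class ToyBuilder:
    """Hypothetical stand-in for a Builder that records build requests."""

    def __init__(self, name):
        self.name = name
        self.requests = []   # reason string for each requested build
        self.listeners = []  # callables invoked with (builder, success)

    def request_build(self, reason):
        self.requests.append(reason)

    def finish_build(self, success):
        for listener in self.listeners:
            listener(self, success)


def trigger_on_success(watched, target, reason):
    """Request a build on `target` each time `watched` succeeds."""
    def on_finished(builder, success):
        if success:
            target.request_build(reason)
    watched.listeners.append(on_finished)


upstream = ToyBuilder("upstream-libgstreamer")
local_lib = ToyBuilder("local-libgstreamer")
local_pygst = ToyBuilder("local-pygst")

# Scheduler 1: upstream build succeeds -> local libgstreamer build
trigger_on_success(upstream, local_lib, "upstream build succeeded")
# Scheduler 3: local libgstreamer build succeeds -> local pygst build
trigger_on_success(local_lib, local_pygst, "local libgstreamer rebuilt")


# Scheduler 2: a Change in the pygst repository -> local pygst build
def on_pygst_change(change):
    local_pygst.request_build("pygst change: %s" % change)
```

  A failed upstream build triggers nothing, so the local pygst build only
  ever sees a libgstreamer revision that built successfully upstream.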

  Now, when compiling pygst itself, you have to be able to point it at some
  libgstreamer.a and libgstreamer.h files. This is the part where you have to
  violate Builder isolation. Builder A needs to install the compiled
  libgstreamer somewhere well-known. Builder B needs to use a
  --with-gstreamer argument that reads from this well-known location. The
  slaves attached to these builders need to share a filesystem; they should
  probably both run in the same slave. Performing cross-platform testing will
  involve pairs of Builders.

  The advantage of this approach is that the upstream library is compiled
  exactly once per upstream code change. The disadvantage is that you have to
  arrange for a well-known directory to be shared between the two Builders. I
  want to find a clean way of expressing this shared directory, because it
  imposes more restrictions on the buildslaves than we currently have. (At
  present, the slave admins only have to provide one base directory, and the
  buildslave takes care of everything inside that; furthermore, each Builder
  is independent.) Another disadvantage is that you need a Lock of some sort
  to keep the pygst builder from running while the libgstreamer builder is
  running, otherwise you'll be linking against half-compiled (or
  half-installed) code.

  Note that this scenario works the same way if there's only one buildbot.
  The point is that the binaries being compiled by one Builder are used
  directly by a separate Builder.

 Scenario 2: both components in one Builder

  The pygst buildbot has only the one Builder, which is responsible for
  compiling both the "upstream" libgstreamer library and the "downstream"
  pygst bindings. There are two Schedulers: one to watch the upstream
  buildbot, and a second to watch for pygst SVN changes. The build process
  looks something like the following:

    checkout libgstreamer code of the given revision, into ./upstream

    cd upstream && ./configure --prefix=$PWD/../installed && make && make install

    checkout pygst code of the given revision into ./downstream

    cd downstream && ./configure --with-gstreamer=../installed && make

    cd downstream && make check

  This has the advantage that the Builder is independent and can run on any
  qualified slave; no filesystems need to be shared. It also has no need for
  Locks of any sort between separate Builders. The disadvantage is that the
  libgstreamer code is compiled more than once per libgstreamer revision,
  because every pygst change recompiles it as well.

So, to support the second scenario, we'd need some improvements in the way
that SourceStamps are expressed (specifically, the ability to express more
than one sub-project's revision at the same time). To support the first
scenario cleanly, I'd want a way to allocate and share a directory between
Builders, as well as a way to express the restriction that they always run in
the same buildslave, plus some sort of Lock to avoid compiling against
half-installed libraries.

There's work to be done before we can do either. You could probably implement
the first scenario today, if you put in some absolute pathnames for
'workdir'. You could also probably implement the second scenario today, if
you only ever wanted to build -rHEAD.

Hmm, ok, that got a bit verbose. Hopefully it explains what I was thinking
better.

> It also raises the question: if a listener is offline when an event
> happens, does it ever get notified? I think it would be simpler in general
> for it to be part of the core process, but it's sounding like it would be
> easy to extend Scheduler to do whatever I wanted after a build finished.

The Status interface lets you subscribe to hear about new Builds finishing,
and also lets you ask about earlier Builds. So the correct sequence would be:

 subscribe to new Builds
 retrieve build[-1]
  examine SourceStamp of that build, compare against the last known build
  if different, trigger a new build with the build[-1] sources
 when new builds arrive, if they were SUCCESS, trigger a new build
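
Here is a rough Python sketch of that sequence. ToyStatus is a hypothetical
in-memory stand-in for the remote Status interface, and CatchUpWatcher is a
made-up name; the real subscription calls may look quite different. Like the
list above, it does not check whether build[-1] itself was a SUCCESS.

```python
SUCCESS = "success"
FAILURE = "failure"


class ToyStatus:
    """Hypothetical stand-in for a Status interface."""

    def __init__(self, builds=()):
        self.builds = list(builds)  # (source_stamp, result) pairs, oldest first
        self.subscribers = []

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def get_build(self, index):
        return self.builds[index] if self.builds else None

    def build_finished(self, source_stamp, result):
        self.builds.append((source_stamp, result))
        for callback in self.subscribers:
            callback(source_stamp, result)


class CatchUpWatcher:
    def __init__(self, status, last_known_stamp, trigger):
        self.last_known_stamp = last_known_stamp
        self.trigger = trigger
        # Subscribe first, so nothing can slip in between the two steps.
        status.subscribe(self.on_build)
        # Then look at build[-1] to catch anything missed while offline.
        latest = status.get_build(-1)
        if latest is not None and latest[0] != self.last_known_stamp:
            self._fire(latest[0])

    def on_build(self, source_stamp, result):
        if result == SUCCESS:
            self._fire(source_stamp)

    def _fire(self, source_stamp):
        self.last_known_stamp = source_stamp
        self.trigger(source_stamp)
```

Subscribing before asking about build[-1] is the important ordering: doing it
the other way around leaves a window in which a finished build is never seen.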

> [using Locks for resource allocation]
> Obviously, if the different projects aren't talking to each other, that
> doesn't work. I can see uses for this for making sure performance tests run
> cleanly, making sure that tests that use a database can share the same one
> without stomping on each other, etc, etc.

Of course. We could build inter-buildmaster Locks, but I think that would be
a bad idea.

I'm still waffling on what the Lock semantics ought to be (as opposed to
the Dependency stuff, which is easier because it only affects the Scheduler).
Some of the possibilities:

 each Step could declare a set of named Locks that it wants to acquire before
 starting. These names are either scoped to the buildslave or to the
 buildmaster as a whole (perhaps with names like "slave.using_database" or
 "master.running_benchmarks"). Steps could request multiple names, but would
 yield all Locks at the end of each step (to avoid deadlock).

 each Build could declare a set of Locks that it must acquire before
 starting.

 there could be special GetLock and ReleaseLock steps that you'd insert
 between your regular steps. This has the possibility of deadlock, but would
 also let you achieve the flexibility of both Step-wise and Build-wise Locks.

I can't currently imagine a scenario where you'd want to be able to lock a
whole build, but I wouldn't be surprised if there were one out there. The
"lock pygst while we're installing libgstreamer into the shared directory" use
case from Scenario 1 above could be safely accomplished with just a single lock
that was acquired by both the libgstreamer's install step and the pygst's
configure/compile step. If the configure and compile were in separate steps,
then you'd need either multi-step Locks or whole-Build Locks to be safe.

I'd love to have a syntax for this that didn't make it possible to deadlock.
Trying to obtain a Lock while you already have one held is the primitive that
must be prohibited to avoid this.
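
One way to enforce that rule is sketched below in Python. ToyLockManager and
LockError are hypothetical names, not a proposed buildbot API. A holder asks
for its whole set of named Locks in one call and gets all of them or none,
and asking again while already holding something is rejected outright.

```python
class LockError(Exception):
    pass


class ToyLockManager:
    """Hypothetical all-or-nothing manager for named Locks."""

    def __init__(self):
        self.owners = {}  # lock name -> holder currently owning it

    def acquire_all(self, holder, names):
        # Nested acquisition is the primitive that allows deadlock: refuse it.
        if holder in self.owners.values():
            raise LockError("%s already holds Locks" % holder)
        # All-or-nothing: if any name is busy, take none and let the
        # caller try again later.
        if any(name in self.owners for name in names):
            return False
        for name in names:
            self.owners[name] = holder
        return True

    def release_all(self, holder):
        held = [name for name, owner in self.owners.items() if owner == holder]
        for name in held:
            del self.owners[name]
```

A Step would call acquire_all before it starts and release_all when it ends,
so it never waits for one Lock while sitting on another.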

> >Let me know if this all sounds like it will fit your needs. 
> >The whole Scheduler thing is still a work in progress, and I 
> >want to make sure it solves these sorts of problems.
> 
> It sounds like it will definitely cover our needs, and then some.
> 
> $64K question: Any idea of a timeline on this?

My target for the Scheduler stuff is to get it done in the next month. The
basic logic is done, but I haven't written the unit tests, or updated the
status targets to handle the brave new world (multiple slaves per builder,
dealing with BuildSets and BuildRequests, somehow showing the status of a
Scheduler sanely [sheesh :-]). I think that target timeline will include
Locks and Dependencies, as well as Scheduler variants that can watch remote
buildmasters. If that slips, it will be because I really want to get the
'try' feature into this revision as well, but with all the other pieces in
place I don't think that will be too hard to implement.

hope that helps,
 -Brian