[Buildbot-devel] Project dependencies, building branches, etc

Wed Jun 15 14:50:17 UTC 2005

>At the moment I recommend one master per VC repository. You 
>can use the isFileImportant method to filter out changes for 
>subprojects. The lack of branch support in 0.6.6 may mean you 
>need one master per branch.

Yeah, that's fine. I should mention the context for this: I've been told to
do an analysis of BuildBot vs. other alternatives, and as part of that I
need to be able to talk about the future. So not all these features need to
be implemented right now or even in the next couple of months, but I'd like to
have an idea of what your goals are. Plus, it didn't seem like anyone had
given you a solid use case (at least not on the list) for multiple projects, so
I thought I could maybe be of use.

>To be precise, there will be a way to associate Schedulers 
>with a particular branch. Each Scheduler can then trigger 
>builds on multiple Builders (all using the same source code). 

It sounds to me like a Scheduler is essentially a refactoring of the
master's functionality into something you can have more than one of in a
single process. More later.

>True. The Scheduler change should reduce the need for multiple 
>masters (at least for multiple branches). I am also thinking 
>that we can share slaves between multiple buildmasters, and I 
>will probably implement this eventually, but I'd prefer 
>solutions that don't gratuitously require multiple masters.
>One master per project feels like the right target to aim for. 
>The vague long-term goal I have is for one slave per platform, 
>one Builder per process, one BuildMaster per project. (however 
>the current Scheduler work is leading us to one Builder per 
>process*platform, and one BuildFactory per process).

(I'm assuming that by process you meant project, or build process)

>Another possible direction is to put multiple buildmasters in 
>a single (unix) process (with multiple config files), and then 
>allow the status targets to subscribe to whatever subset of 
>those masters they wish. Hmm.

After reading this I'm not sure what you're trying to accomplish with the
Scheduler. Originally it sounded to me like you were going to basically
allow multiple logical masters in a single master. That made good sense to
me. But having done that, why would one master per project be the right
usage pattern? I would think that it would be easier for me to implement my
features if I had all my masters in one process. Your second idea is more
palatable to me, but I don't understand the need, given that Schedulers
should give essentially the same functionality.

>Correct. The goal is to use Schedulers for this purpose. The 
>default Schedulers pay attention to Changes, but you could 
>write one that would instead subscribe to another buildmaster 
>and watch for builds of the dependencies to be completed. To 
>make this useful, we need to add a few things:
>
[snip]

Interesting. Either I'm misunderstanding, or you have a very different idea
of how this would work than I did. My assumption would have been you have
project A associated with builder A, and project B with builder B. When A
finishes building, B is notified, picks up the results of the A build, and
does its own build. This is relatively easy to implement with a
StatusListener, as you suggested. I guess what you're suggesting is that the
B build do all the steps for A as well, but be triggered by A building
successfully. But this introduces extra complexity, as you pointed out. Why
do you prefer this way?
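
For concreteness, the chaining I had assumed could be sketched roughly like
this. It is a toy model in plain Python, not BuildBot code: ToyBuilder,
subscribe, and trigger_downstream are made-up names, and a real version would
presumably hang off whatever status interface the master exposes.

```python
# Toy model only: these classes are hypothetical and are not BuildBot's API.

class ToyBuilder:
    """Runs a 'build' and tells subscribers when it finishes."""

    def __init__(self, name):
        self.name = name
        self.listeners = []   # callables invoked as listener(builder_name, success)
        self.log = []         # reasons for every build this builder ran

    def subscribe(self, listener):
        self.listeners.append(listener)

    def build(self, reason, success=True):
        self.log.append(reason)
        for listener in self.listeners:
            listener(self.name, success)


def trigger_downstream(downstream):
    """Return a listener that builds `downstream` only after a successful upstream build."""
    def on_finished(upstream_name, success):
        if success:
            downstream.build("upstream %s finished" % upstream_name)
    return on_finished


builder_a = ToyBuilder("A")
builder_b = ToyBuilder("B")
builder_a.subscribe(trigger_downstream(builder_b))

builder_a.build("change committed")                # B builds as a consequence
builder_a.build("another change", success=False)   # B stays idle
```

Here B never repeats A's steps; it only reacts to A finishing successfully.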

>> There is no existing way to aggregate masters' results together. This
>> could be written relatively easily using the status client APIs to
>> create a web app that connected to masters to query their status.
>
>That's the approach I'd prefer, at least for aggregating the 
>status of disparate projects. I'm trying to make sure the 
>PBListener interface gives you remote access to everything 
>that the normal IStatus interface provides locally.
>

If I can't put all my builds into one master then this looks like it would
be easy to implement, based on a glance at the waterfall source.

>The "use one Builder and interleave Builds of all branches" 
>behavior is obtained by having multiple Schedulers (one per 
>branch) all feeding into the same Builder. The "use one 
>Builder per branch" behavior you desire would be obtained by 
>having a 1-to-1 relationship between Schedulers and Builders. 
>In that approach, each Builder would only ever compile code 
>from a single branch. (note that those multiple Builders could 
>all share the same buildslave: the slaves keep each Builder in a 
>separate directory).

Okay, sounds good. Our branches are independent development efforts, and
need to be treated as such, but I can see where this would be useful for
less constrained SCMs than CVS.
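
As I understand the two layouts, they differ only in how Schedulers are wired
to Builders. A rough sketch, again in plain Python with invented names
(ToyScheduler, ToyBuilder, and dispatch are not the real classes, and the
branch names are just examples):

```python
# Toy model only: hypothetical names, not BuildBot's Scheduler or Builder classes.

class ToyBuilder:
    def __init__(self, name):
        self.name = name
        self.builds = []   # branches built by this builder, in order

    def start_build(self, branch):
        self.builds.append(branch)


class ToyScheduler:
    """Watches one branch and feeds a fixed list of builders."""

    def __init__(self, branch, builders):
        self.branch = branch
        self.builders = builders

    def change_committed(self, branch):
        if branch != self.branch:
            return   # not our branch, ignore the change
        for builder in self.builders:
            builder.start_build(branch)


def dispatch(schedulers, branch):
    """Offer a change on `branch` to every scheduler."""
    for scheduler in schedulers:
        scheduler.change_committed(branch)


# Layout 1: one Scheduler per branch, all feeding the same Builder.
shared = ToyBuilder("shared")
interleaved = [ToyScheduler("trunk", [shared]),
               ToyScheduler("release", [shared])]

# Layout 2: a 1-to-1 relationship between Schedulers and Builders.
trunk_builder = ToyBuilder("trunk-builder")
release_builder = ToyBuilder("release-builder")
one_to_one = [ToyScheduler("trunk", [trunk_builder]),
              ToyScheduler("release", [release_builder])]

for branch in ["trunk", "release", "trunk"]:
    dispatch(interleaved, branch)
    dispatch(one_to_one, branch)
```

In the first layout the shared builder sees builds from both branches
interleaved; in the second, each builder only ever sees its own branch.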

>  The Waterfall class currently has an (undocumented) feature to restrict the
>  display to a subset of Builders. I think you append
>  "?builders=one,two,three" to the URL to see only those three Builders..
>  check the source code to be sure.

Looks like it's ?show=, but it only exists in the CVS version.

>The "right way" to do this is to add a PBListener status 
>target to the upstream buildmaster, and then write a Scheduler 
>which subscribes to that target for use on the downstream 
>buildmaster. There will be a base Scheduler class to do this 
>sort of thing, some kind of example at least.

I think that a full-scale listener is overkill here. It also raises the
question: if a listener is offline when an event happens, does it ever get
notified? I think it would be simpler in general for it to be part of the
core process, but it's sounding like it would be easy to extend Scheduler to
do whatever I wanted after a build finished.
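
To make the offline-listener worry concrete, here is a toy sketch of the
failure and of one possible remedy, where the master keeps a record of
finished builds that a reconnecting listener can replay. All names
(ToyMaster, ToyListener, events_since, catch_up) are invented for
illustration; I am not claiming BuildBot works this way.

```python
# Toy model only: hypothetical names, not BuildBot's status or PBListener API.

class ToyMaster:
    def __init__(self):
        self.events = []      # every finished build, in order
        self.listeners = []   # only the listeners connected right now

    def connect(self, listener):
        self.listeners.append(listener)

    def disconnect(self, listener):
        self.listeners.remove(listener)

    def build_finished(self, build):
        self.events.append(build)
        for listener in self.listeners:
            listener.notify(build)

    def events_since(self, count):
        """Return the builds a listener has not seen, given how many it has seen."""
        return self.events[count:]


class ToyListener:
    def __init__(self):
        self.seen = []

    def notify(self, build):
        self.seen.append(build)

    def catch_up(self, master):
        """Replay whatever finished while this listener was disconnected."""
        for build in master.events_since(len(self.seen)):
            self.notify(build)


master = ToyMaster()
listener = ToyListener()

master.connect(listener)
master.build_finished("build-1")
master.disconnect(listener)
master.build_finished("build-2")       # listener is offline and misses this

seen_before_catch_up = list(listener.seen)

master.connect(listener)
listener.catch_up(master)              # replay the missed build
```

Without something like catch_up, the downstream side simply never hears about
build-2, which is the case I am worried about.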

>Also note:
>
> the old Interlock object is going away in the next release. In its place
> are two new objects:
>
>  "Lock" (or maybe "Semaphore" or something), which handles temporal
>  exclusivity: a Step or a whole Build can exclude other Steps/Builds from
>  using certain resources at the same time. This would generally be used to
>  keep control of the CPU load on a given machine, or to avoid running two
>  copies of the same test suite at the same time (although any test suite
>  which needs this sort of semaphore should probably be fixed).
>
>  Dependency, which hooks together multiple Schedulers, and makes sure that a
>  given set of Changes work in one place before being used in a second place.
>  I'm still figuring out the details, but I'm thinking the syntax will look
>  like this:
>
>    s1 = Scheduler(builders=["quick"])
>    s2 = Scheduler(builders=["full-linux", "full-solaris"],
>                   dependencies=[s1])
>    s3 = Scheduler(builders=["make-tarball"], dependencies=[s2])
>    c['schedulers'] = [s1, s2]

This raises a point I didn't want to get into in the last email. One reason
I wanted to keep everything in a single master is so that we could use these
synchronizations to manage our resources. Obviously, if the different
projects aren't talking to each other, that doesn't work. I can see uses for
this for making sure performance tests run cleanly, making sure that tests
that use a database can share the same one without stomping on each other,
etc, etc.
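
As an illustration of the database case, here is a minimal sketch in which an
ordinary threading.Lock stands in for the proposed "Lock" object. The project
names and the run_database_tests function are made up, and the real object may
well behave differently.

```python
import threading

# Toy model only: a plain threading.Lock stands in for the proposed "Lock" object.
database_lock = threading.Lock()

active = 0        # test runs currently using the shared database
max_active = 0    # highest value `active` ever reached
finished = []     # projects whose tests have completed


def run_database_tests(project):
    """Pretend to run one project's database tests while holding the lock."""
    global active, max_active
    with database_lock:
        active += 1
        max_active = max(max_active, active)
        # ... the test suite would use the shared database here ...
        active -= 1
        finished.append(project)


threads = [threading.Thread(target=run_database_tests, args=(name,))
           for name in ("project-a", "project-b", "project-c")]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
```

Because every run takes the same lock, at most one project touches the
database at a time. That only works if all the projects can see the same lock
object, which is my argument for keeping them in one master.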

>Let me know if this all sounds like it will fit your needs. 
>The whole Scheduler thing is still a work in progress, and I 
>want to make sure it solves these sorts of problems.

It sounds like it will definitely cover our needs, and then some.

$64K question: Any idea of a timeline on this?

-Michael