[Buildbot-devel] database-backed status/scheduler-state project

Tue Sep 8 04:27:27 UTC 2009

exarkun at twistedmatrix.com wrote:
> 
> I'm curious about the motivation to use the database directly as the
> RPC mechanism here.
> ...
> 
> It seems to me that providing a more constrained interface to the 
> database would be a better solution all around.
> ...
> I hope you don't select SQLAlchemy.  It has a poor track record and 
> there are a lot of other options out there.  I'll reiterate my point 
> about not liking direct database access as the public API, too.

Tell me more.. what sort of interface would you imagine? The goal is to
have something that's easy to access from, say, PHP, and from multiple
machines. Buildbot's existing XMLRPC interfaces were meant to fill this
role, and apparently they're insufficient (perhaps they're incomplete,
perhaps they're too much of a hassle to make complete, perhaps they're
slow, or hard to use, but I assume there must be a reason that I was
asked to build a schedulerdb/statusdb instead of enhancing the XMLRPC
interfaces). The PB status interfaces are worse: it's infeasible to use
them from anything but python+twisted, severely narrowing the set of
folks who can write those tools.

Also, please tell me more about the "other options" to SQLAlchemy.. I
don't think we need all that much, and can write the layers that we do
need, but if there's a lightweight library to provide a database
connection object that will reconnect when the DB is bounced during
runtime, I'd like to consider using it instead of rolling my own.
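
For illustration only, here is a minimal sketch of what "rolling my own"
reconnecting connection object could look like. It is not an existing
library or Buildbot API. The class name ReconnectingConnection, the
"changes" table in the usage note, and the choice of sqlite3 as a
stand-in backend are all assumptions. It also assumes that an
OperationalError or ProgrammingError means the connection was lost, which
a real implementation would need to check more carefully per backend.

```python
import sqlite3


class ReconnectingConnection:
    """Hypothetical wrapper: reopen the connection and retry once on failure."""

    def __init__(self, connect):
        # `connect` is a zero-argument factory returning a DB-API connection.
        self._connect = connect
        self._conn = connect()
        self.reconnects = 0

    def execute(self, sql, params=()):
        try:
            return self._run(sql, params)
        except (sqlite3.OperationalError, sqlite3.ProgrammingError):
            # Assumption: these errors indicate a dropped or closed connection.
            # Open a fresh connection and retry exactly once; a second failure
            # propagates to the caller.
            self._conn = self._connect()
            self.reconnects += 1
            return self._run(sql, params)

    def _run(self, sql, params):
        cur = self._conn.cursor()
        cur.execute(sql, params)
        rows = cur.fetchall()
        self._conn.commit()
        return rows
```

Usage would be along the lines of
db = ReconnectingConnection(lambda: sqlite3.connect(path)) followed by
db.execute("SELECT branch FROM changes"). Note that blindly retrying a
write is only safe if the statement is idempotent or the first attempt is
known not to have committed.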

Dustin Mitchell wrote:
>
> There are pretty clear "right" ways to implement all of this (I'm
> thinking of plans to use synchronous DB queries here as "wrong").

I think I can imagine how to stick with async DB queries, basically with
the CSP approach: each of the three main blocks (changesources,
schedulers, build-process-masters) runs in a separate loop, such that
e.g. schedulers[1].process() returns a Deferred, and we guarantee that
neither schedulers[1].process nor schedulers[2].process will be called
until that Deferred fires. The notification-service/"doorbell" will gate
the loop: if the notifier fires while the loop is running, a flag is
set, and the loop will get run again after the current pass finishes. It
will be a bit more complicated than the sync+blocking approach, but I
certainly agree that it's more "right".
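
A rough sketch of the gated loop described above, written with the
standard library's asyncio in place of Twisted Deferreds so that it runs
on its own. The names GatedLoop, ring(), Scheduler and demo() are
hypothetical and not part of Buildbot. Each process() call must finish
before the next one starts, and a doorbell ring that arrives during a
pass sets a flag that triggers exactly one more pass.

```python
import asyncio


class GatedLoop:
    """Run processors one at a time; a ring during a pass queues one more pass."""

    def __init__(self, processors):
        self.processors = processors  # objects with an async process() method
        self._running = False
        self._ring_pending = False
        self._task = None

    def ring(self):
        """Doorbell: start a pass, or flag a rerun if a pass is in progress."""
        if self._running:
            self._ring_pending = True
            return self._task
        self._running = True
        self._task = asyncio.ensure_future(self._run())
        return self._task

    async def _run(self):
        try:
            while True:
                self._ring_pending = False
                for p in self.processors:
                    # The next processor is not called until this one finishes.
                    await p.process()
                if not self._ring_pending:
                    break
        finally:
            self._running = False


class Scheduler:
    """Stand-in scheduler that records when its process() starts and ends."""

    def __init__(self, name, log):
        self.name, self.log = name, log

    async def process(self):
        self.log.append(("start", self.name))
        await asyncio.sleep(0)  # stands in for an asynchronous DB query
        self.log.append(("end", self.name))


async def demo():
    log = []
    gate = GatedLoop([Scheduler("s1", log), Scheduler("s2", log)])
    task = gate.ring()      # starts the first pass
    await asyncio.sleep(0)  # let the pass begin
    gate.ring()             # rings mid-pass: one extra pass is queued
    gate.ring()             # coalesced with the previous ring
    await task
    return log
```

Running asyncio.run(demo()) yields two complete passes over s1 and s2,
with no overlap between process() calls; the two mid-pass rings collapse
into a single rerun.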

Apart from the sync/async issue, can you expand on the "right" ways to
approach the DB-as-RPC problem? I respect your experience with similar
systems, and don't want to complicate Buildbot's control mechanisms
with patterns that are Considered Harmful. But I'm having problems
envisioning other ways to build this thing that still will meet the
desired goals (of basically enabling the easy cross-language development
of 3rd-party control/status tools).

> On the other hand, from the "One-Oh" side of things -- Will the new
> version ever get finished (or started)? Will there be enough
> similarity that users of the old version will be willing to migrate to
> the new? Etc.

Yeah. There's an opportunity here (because Mozilla needs this stuff
badly enough to pay me to build it) to finally get this project done. So
I'm looking for a reasonable compromise between the usual long-term
elegance/design/maintainability goals and the short-term make-it-work
goals. This work has been deferred for several years because we couldn't
reach enough consensus (on how it should be structured) to get past the
activation-energy threshold of finding someone to do the actual work. My
threshold has been lowered, for twenty hours per week :-), so as long as
the design is Good Enough and isn't going to cause long-term problems,
I'm in a position to just go for it.

One of my requirements is that this new schedulerdb/statusdb code be
invisible to buildbot users who don't care about it. Of course, if
you've written your own Scheduler or ChangeSource, then you'll be forced
to care about it. But customized BuildSteps should not be affected. I'm
hoping that this will reduce the transition cost/fear, especially
compared to the sorts of larger structural/conceptual changes that the
OneOh effort entails.

thanks everybody,
 -Brian