[Buildbot-devel] database-backed status/scheduler-state project
exarkun at twistedmatrix.com
Tue Sep 8 15:42:18 UTC 2009
On 04:27 am, warner at lothar.com wrote:
>exarkun at twistedmatrix.com wrote:
>>I'm curious about the motivation to use the database directly as the
>>RPC mechanism here.
>>It seems to me that providing a more constrained interface to the
>>database would be a better solution all around.
>>I hope you don't select SQLAlchemy. It has a poor track record and
>>there are a lot of other options out there. I'll reiterate my point
>>about not liking direct database access as the public API, too.
>Tell me more.. what sort of interface would you imagine? The goal is to
>have something that's easy to access from, say, PHP, and from multiple
>machines. Buildbot's existing XMLRPC interfaces were meant to fill this
>role, and apparently they're insufficient (perhaps they're incomplete,
>perhaps they're too much of a hassle to make complete, perhaps they're
>slow, or hard to use, but I assume there must be a reason that I was
>asked to build a schedulerdb/statusdb instead of enhancing the XMLRPC
>interfaces). The PB status interfaces are worse: it's infeasible to use
>them from anything but python+twisted, severely narrowing the set of
>folks who can write those tools.
One thing I can't reliably speak to is why someone else asked you for
some feature, of course. :)
I should say I'm entirely for a database containing scheduler and status
information. These are good things. I like the persistent-scheduled-
builds-across-restarts feature. I like the build-results-in-a-schema-
not-a-pile-of-pickles feature. To *some* extent, I don't even disagree
with a database as the API exposing some information.
What's bothersome about using the database as the third-party API is
that it is very difficult to decide later that you want to change the
implementation of this API. With XML-RPC or PB or whatever other RPC
mechanism along those lines, you can always replace the existing
implementation with a new one without changing the externally visible
behavior, so long as the necessary data is still available *somewhere*
in the system. So you are free to change how data is stored without
breaking these RPC mechanisms. With the database RPC approach, this is
not so. The details of how data is stored *are* the API, and cannot be
changed.
At least, not unless you're willing to break whatever third-party code
is relying on the API. I would definitely want to avoid this. The idea
of actively seeking it out is almost painful. :)
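To make the point concrete, here is a minimal sketch in Python. Every name
in it (StatusAPI, DictStore, SQLStore, get_last_build_result, the builds
table) is made up for illustration and is not Buildbot code. The caller only
ever sees StatusAPI, so the storage behind it can be swapped without the
caller noticing:

```python
import sqlite3


class DictStore:
    """Hypothetical storage backend: results kept in a plain dict."""

    def __init__(self):
        self._results = {}

    def record(self, builder, result):
        self._results[builder] = result

    def last_result(self, builder):
        return self._results.get(builder)


class SQLStore:
    """Hypothetical storage backend: the same data in an SQL table."""

    def __init__(self):
        self._db = sqlite3.connect(":memory:")
        self._db.execute(
            "CREATE TABLE builds "
            "(id INTEGER PRIMARY KEY, builder TEXT, result TEXT)"
        )

    def record(self, builder, result):
        self._db.execute(
            "INSERT INTO builds (builder, result) VALUES (?, ?)",
            (builder, result),
        )

    def last_result(self, builder):
        row = self._db.execute(
            "SELECT result FROM builds WHERE builder = ? "
            "ORDER BY id DESC LIMIT 1",
            (builder,),
        ).fetchone()
        return row[0] if row else None


class StatusAPI:
    """The externally visible interface; callers never see the storage."""

    def __init__(self, store):
        self._store = store

    def get_last_build_result(self, builder):
        return self._store.last_result(builder)
```

A third-party tool that queried the builds table directly would break the
moment the table is renamed or replaced, whereas a tool written against
get_last_build_result would keep working with either backend.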
When you said that the XMLRPC/PB interfaces are not complete, the only
thing I can think of that this could mean is that they don't expose every
single implementation detail of the persistent data Buildbot has now. Any
problem other than that is easily remedied with the addition of a new
RPC method (which can have documentation, making it easier for future
developers to learn about some data Buildbot makes available; and which
can have unit tests, offering a guarantee that the data remains
available in future versions of Buildbot).
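As a rough illustration of what "a new RPC method with documentation and
unit tests" could look like, here is a sketch using only the Python standard
library. The method name getLastBuildResult and the _RESULTS data are
invented for the example and are not part of Buildbot's XMLRPC interface.
The call helper pushes a request through the dispatcher's underscore-prefixed
_marshaled_dispatch method so that no network is needed:

```python
import unittest
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCDispatcher

# Hypothetical stand-in for whatever data Buildbot really stores.
_RESULTS = {"linux": ["success", "failure"]}


def getLastBuildResult(builder):
    """Return the most recent result recorded for the named builder."""
    return _RESULTS[builder][-1]


def make_dispatcher():
    dispatcher = SimpleXMLRPCDispatcher()
    dispatcher.register_function(getLastBuildResult)
    # Exposes system.listMethods and system.methodHelp, so the docstring
    # above is available to remote callers as documentation.
    dispatcher.register_introspection_functions()
    return dispatcher


def call(dispatcher, method, *params):
    """Run one XML-RPC request through the dispatcher without a network."""
    request = xmlrpc.client.dumps(params, methodname=method)
    response = dispatcher._marshaled_dispatch(request)
    return xmlrpc.client.loads(response)[0][0]


class GetLastBuildResultTests(unittest.TestCase):
    def test_returns_latest_result(self):
        result = call(make_dispatcher(), "getLastBuildResult", "linux")
        self.assertEqual(result, "failure")

    def test_is_documented(self):
        help_text = call(
            make_dispatcher(), "system.methodHelp", "getLastBuildResult"
        )
        self.assertIn("most recent result", help_text)
```

The tests pin down the externally visible behavior only, so they would keep
passing if _RESULTS were replaced by a database query.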
The only downside from the Mozilla perspective of an API instead of
direct database access is that they'll have to think about what data
they want to access /before/ you finish working on this instead of
after, or they'll have to hire you again or find someone else to
implement any new RPC methods for data they didn't anticipate needing.
This isn't really a bad thing, as long as Buildbot development is
active, since adding any particular new RPC method isn't much of a
challenge, and the only potentially painful thing is waiting for it to
be part of a release. Aside from that, it's really a good thing, since
it's a lot better to know what you're writing software to do before you
write it instead of after.
>Also, please tell me more about the "other options" to SQLAlchemy.. I
>don't think we need all that much, and can write the layers that we do
>need, but if there's a lightweight library to provide a database
>connection object that will reconnect when the DB is bounced during
>runtime, I'd like to consider using it instead of rolling my own.
Hm. Those aren't the features I was considering at all. I'm not
familiar with any RDBMS that provides the necessary features for a
reasonable reconnection story.
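For reference, the "rolling my own" version of such a connection object
might look roughly like the sketch below. ReconnectingConnection and its
arguments are hypothetical names, and sqlite3 is used only so the example
is runnable. It reopens the connection and retries the statement once. It
does nothing about work lost in a transaction that was open when the
connection dropped, which is the hard part of the problem:

```python
import sqlite3


class ReconnectingConnection:
    """Hypothetical wrapper: reopen the connection and retry once."""

    def __init__(self, connect, lost_errors):
        # connect: zero-argument callable returning a DB-API connection.
        # lost_errors: tuple of exception types meaning "connection lost";
        # which types those are depends on the database driver in use.
        self._connect = connect
        self._lost_errors = lost_errors
        self._conn = connect()
        self.reconnects = 0

    def run(self, sql, params=()):
        try:
            return self._execute(sql, params)
        except self._lost_errors:
            # Assume the server was bounced: reconnect and retry once.
            # A second failure propagates to the caller.
            self._conn = self._connect()
            self.reconnects += 1
            return self._execute(sql, params)

    def _execute(self, sql, params):
        cursor = self._conn.cursor()
        cursor.execute(sql, params)
        return cursor.fetchall()

    def close(self):
        self._conn.close()
```

With sqlite3, closing the underlying connection and then calling run again
exercises the retry path, since operating on a closed connection raises
sqlite3.ProgrammingError.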
I was going to suggest that you take a look at Storm. I've written
little software using it, but from what I can tell it's mostly a
reasonable project. It does nothing special to handle lost connections
to the DB server. There are existing apps based on it that deal with
this problem, though. I spoke with Thomas Herve and he seems to think
it'll be possible to work out something acceptable as well. He also
mentioned he'll be happy to see the pickles go, and said that he might
be able to lend a hand with the Storm work.