[Buildbot-devel] Buildbot Summit - Out of process

Dustin J. Mitchell dustin at v.igoro.us
Thu Nov 25 20:39:03 UTC 2010


On Thu, Nov 25, 2010 at 10:01 AM, Axel Hecht <l10n.moz at googlemail.com> wrote:
> Performance is something one needs to really pay attention to. My own status
> app on top of my own status db takes a dive at a few hundred builds.
> Mozilla's tbpl easily shows a thousand builds. A real overview of our
> nightly builds shows a good 1000 builds, too.

I don't think TBPL is a good comparison to Buildbot, but certainly
there is a lot of improvement that can be made to buildbot's
performance, even without further distributing it.  For example, with
Jean-Paul's help, I just committed a fix that will speed up the
upgrading of all those status pickles, which can slow things down
remarkably.

> Flexible development is important, too, in particular for the webparts. I
> for one am horrible at both web coding and SQL, so I use the tool that
> enables me to do both. I don't think that we should impact the design
> decisions of the front end by synchronous vs async etc. Unless we're jumping
> on the NodeJS train ;-) .

There may be some confusion over terms here.  When I say
"asynchronous", I mean Twisted Python.  Node.js is not really on the
table - we already have a server-side framework.  I assume "flexible
development" is roughly equivalent to hackability - someone interested
in the web frontend should be able to make some changes and send them
back to Buildbot for inclusion without having to understand all of
Buildbot's internals.  This is true in other areas, too - writing a
BuildStep, reading from the status DB, etc.

> My personal conclusion is to get the status as descriptive as possible out
> of buildbot/twisted and into a db, and then have competing and maybe even
> customizable front ends based on that. Pulse etc is nice, but it's not
> status, so folks hacking on presenting status will only care about the
> status store.

There are two parts to status.  I think as we move forward that I will
actually name them "history" and "events" to avoid confusion with the
existing status interfaces.

Consider: some apps are interested in an analysis of what has happened
before.  Those apps can access that data via the JSON interface, or
(if they're willing to tolerate schema changes) via the buildmaster's
db itself.  Other apps are interested in knowing what's going on right
now - USB lava lamps, notifiers for other teams of completed builds,
etc.  And some apps are interested in both - for example, the console
web view wants to show history, but also keep it up to date with
current events.

Anyway, the idea of having competing, customizable frontends makes a
lot of sense, and I would like those frontends to be built against
well-defined, reasonably future-proofed APIs.  I don't consider the
database schema to be such an API.

> FTR, one thing I found confusing in the thread is the question of sync vs
> async, and its impact on schedulers and such.

Yes, I think that goes back to the earlier terminology difference.
I'll try to sum it up with a simple example.  We have the following
code buried in Buildbot right now:

sourcestamps = {}  # { sourcestamp key : (sourcestamp, earliest start time) }
for bn in status.getBuilderNames():
    builder = status.getBuilder(bn)
    if categories and builder.category not in categories:
        continue
    build = builder.getBuild(-1)
    while build:
        ss = build.getSourceStamp(absolute=True)
        start = build.getTimes()[0]
        # advance now, so the 'continue's below still move to the previous build
        build = build.getPreviousBuild()
        # skip un-started builds
        if not start:
            continue
        # skip non-matching branches
        if branch != ANYBRANCH and ss.branch != branch:
            continue
        key = self.getSourceStampKey(ss)
        # keep the earliest start time seen for each sourcestamp
        if key not in sourcestamps or sourcestamps[key][1] > start:
            sourcestamps[key] = (ss, start)

All of those foo.getBar(..) invocations are returning status-related
information, and doing so immediately (synchronously).  Within
Twisted, if that data is coming from a db then we may have to wait for
it (asynchronously), which means using a Deferred.  That radically
changes the structure of the code, because the function must set up
some callbacks and return after each foo.getBar(..) invocation.  There
are ways to minimize the restructuring - deferredGenerator and
inlineCallbacks - but it's still a significant change[1].  The idea of
pushing this functionality out to the client is that the client can
decide for itself how to schedule the JSON requests - synchronous or
asynchronous XHR for a browser-based client, or perhaps using an
Erlang JSON client if that's your bag.  Whatever the client does, the
Buildmaster need only parse the JSON request, issue a DB query, then
attach a callback to the resulting deferred that translates the DB
result set back into JSON for return to the client.
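
To make the restructuring concrete, here is a minimal sketch of the same
kind of loop once every getter has to wait on a database.  It uses
Python's standard asyncio rather than Twisted, purely so that it runs on
its own; "await" plays roughly the role that yielding a Deferred plays
under Twisted's inlineCallbacks decorator.  FAKE_DB, get_builder_names,
get_builds and earliest_starts are made-up names for illustration, not
Buildbot APIs.

```python
import asyncio

# Hypothetical in-memory stand-in for the status database:
# builder name -> list of (sourcestamp key, start time) pairs.
FAKE_DB = {
    "linux": [("r1", 100), ("r2", 200)],
    "win32": [("r1", 90), ("r2", None)],  # None: the build never started
}


async def get_builder_names():
    await asyncio.sleep(0)  # stands in for waiting on a DB query
    return sorted(FAKE_DB)


async def get_builds(name):
    await asyncio.sleep(0)  # stands in for waiting on a DB query
    return list(FAKE_DB[name])


async def earliest_starts():
    """Return {sourcestamp key: earliest start time} across all builders."""
    earliest = {}
    # Each 'await' is a point where the function gives up control until the
    # data arrives, so other work can run in the meantime.
    for name in await get_builder_names():
        for key, start in await get_builds(name):
            if not start:
                continue  # skip un-started builds
            if key not in earliest or earliest[key] > start:
                earliest[key] = start
    return earliest


if __name__ == "__main__":
    print(asyncio.run(earliest_starts()))
```

The loop body is nearly unchanged, but every caller of earliest_starts()
now has to wait for its result too, which is why the change spreads
through so much code.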

This actually feeds back into the performance issue you mentioned
first - much of the performance difficulty with Buildbot has to do
with latency, not bandwidth, because lots of status-related operations
take hundreds or thousands of milliseconds and block all other
operations.  This causes socket buffers to fill up and .. well, the
result is bad performance.  Converting those long synchronous
operations into a sequence of smaller, asynchronous operations will
allow the Buildmaster to scale its resource usage more fairly, and for
databases that allow concurrent connections (most), it will get a nice
performance boost from parallelization, too.
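
As a rough illustration of the parallelization point, the sketch below
issues several slow queries at once instead of one after another, so the
total wait is close to the slowest query rather than the sum of all of
them.  Again this is standard-library asyncio standing in for Twisted,
and query_builder, fetch_all and timed_fetch are hypothetical names; the
sleep stands in for time spent waiting on the database.

```python
import asyncio
import time


async def query_builder(name, delay):
    # Hypothetical DB query: the sleep stands in for waiting on the database.
    await asyncio.sleep(delay)
    return name, delay


async def fetch_all(builders):
    # Start every query before waiting on any of them; the event loop stays
    # free to service other work while the queries are pending.
    pairs = await asyncio.gather(
        *(query_builder(name, delay) for name, delay in builders)
    )
    return dict(pairs)


async def timed_fetch(builders):
    """Return (results, elapsed seconds) for one concurrent batch of queries."""
    began = time.monotonic()
    results = await fetch_all(builders)
    return results, time.monotonic() - began


if __name__ == "__main__":
    batch = [("linux", 0.1), ("win32", 0.1), ("osx", 0.1)]
    results, elapsed = asyncio.run(timed_fetch(batch))
    print(results, "in %.2fs" % elapsed)
```

Run one after another, the three queries would take about 0.3 seconds;
issued together they should finish in a little over 0.1.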

Dustin

P.S. You mentioned that SQL programming is not your bag.  Nor mine!
I'm working now on integrating SQLAlchemy into Buildbot, which should
alleviate some of those issues.  Most of Buildbot will be completely
isolated from database access by helper functions, but when necessary
the pythonic SQLAlchemy invocations are available, too.

[1] You should see the hack I've put in place to get the
eventGenerator to use DB results generated in another thread.  It will
work, but it ain't purty.
  https://github.com/djmitche/buildbot/blob/sqlalchemy/master/buildbot/db/changes.py



