[Buildbot-devel] new BuildBot "success story"
jm at jmason.org
Thu Dec 9 03:11:30 UTC 2004
Brian Warner writes:
>Excellent! I'll add it to the buildbot web page.
>> - I had to hack up svn commit-emails.pl support -- patch in the sf.net
>> patches queue
>Cool, I'll take a look. Is this a notification script that ships with svn or
>is it a third-party thing?
It's distributed with svn, as far as I know; that one works with the
ASF's svn repository, which is what we're using ;)
However, that patch is slightly incomplete -- while it *does* trigger the
builds at the right time, parses the rev correctly, and gets blame right,
it doesn't seem to be parsing the commit message. I haven't got
around to figuring out why though...
>> - Michael Parker's dynamic-IP-addressed slaves issue
>I've got the first round of lost-slave-handling patches in CVS now, and I'm
>testing it out on the Twisted buildbot (there's one slave which lives behind
>a slow link that will time out if you even look at it funny, so I'm trying to
>make sure the buildmaster handles that part well before actually fixing the
>slave side to not disconnect so readily).
>> - is there a way to pick up idle slaves, when the master is restarted?
>> it appears that they must also be restarted to show up as online.
>> (it'd be nice if they could poll the server, and reconnect gracefully
>> if the server conn dies.)
>As Stephen pointed out, it's an exponential backoff that could really be
>clamped to a shorter maximum. The parameter is named 'maxDelay', and defaults
>to one hour. If you'd like to clamp it lower (say, 10 minutes), then edit
>buildbot/slave/bot.py (about line 285) to set BotFactory.maxDelay:
>  class BotFactory(ReconnectingPBClientFactory):
>      maxDelay = 10*60
>      keepaliveTimeout = 30
>      unsafeTracebacks = 1
>If you're seeing backoff delays of more than an hour, let me know. (I think
>the slave will log each delay in twistd.log, but I could be mistaken). There
>might be a bug somewhere. The slave logs are pretty valuable in this case.
Maybe Michael could take a look for us ;)
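For reference, here is a rough sketch of what clamping an exponential
backoff does to the retry schedule. It is plain Python, not the actual
Twisted or buildbot code: the function name, the 1-second starting delay,
and the doubling factor are assumptions made for illustration. Only the
idea of a 'maxDelay' ceiling (one hour by default, 10 minutes in the
suggested edit) comes from the message above.

```python
def backoff_delays(attempts, initial_delay=1.0, factor=2.0, max_delay=3600.0):
    """Return the delay (in seconds) before each of `attempts` retries.

    Hypothetical helper: each delay is the previous one multiplied by
    `factor`, but never allowed to exceed `max_delay`.
    """
    delays = []
    delay = initial_delay
    for _ in range(attempts):
        # Grow the delay, then clamp it to the configured ceiling.
        delay = min(delay * factor, max_delay)
        delays.append(delay)
    return delays


if __name__ == "__main__":
    # Compare a one-hour ceiling with a 10-minute ceiling.
    print(backoff_delays(14, max_delay=3600.0))
    print(backoff_delays(14, max_delay=10 * 60.0))
```

With the lower ceiling the delays stop growing once they reach 600
seconds, so a slave that has been retrying for a long time still checks
back at least every 10 minutes.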
>I do believe there was a bug in some versions of Twisted such that certain
>disconnects would get classified as a "UserError" which did not schedule a
>reconnection attempt. This may or may not have any bearing on possible long
>> - and finally, it's shown up some bizarro FreeBSD locking bug in our
>> code; but I can't really blame BuildBot for that, it's just doing
>> its job ;)
>Yes! That's exactly what it's meant to do: ferret out the cross-platform and
well, it's doing it here for sure ;)
>> No, they are running. I just restarted one of them and it came right
>> back up. Maybe I should just put in a cron job that restarts them
>> every so often. It would be nice to put in some sort of HUP or other
>> signal that I could send the slave to cause it to ping the master, or
>> something like that.
>Hmm, I can see having SIGHUP trigger a reconnect perhaps being useful, but if
>you have to get that involved with the slave then you might as well restart
>it. Ideally the timed reconnect should be enough.
Yes. I think if it all worked as expected, slave-restarting et al would
not be needed. Anyway, if I restart the master on machine A, in this case
I have no access to go around SIGHUPping the various slaves running at
machines B, C, D, etc...
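For anyone who does have shell access to a slave, the SIGHUP idea could
look roughly like the sketch below. This is not buildbot code and nothing
like it exists in the slave as far as this thread says: the class and
function names are hypothetical, and the handler only sets a flag that
some main loop would have to check and act on.

```python
import signal


class ReconnectRequest:
    """Hypothetical flag recording that a reconnect was requested."""

    def __init__(self):
        self.pending = False

    def handle(self, signum, frame):
        # Keep the signal handler tiny: just note the request.
        self.pending = True

    def consume(self):
        """Return True once per request, clearing the flag."""
        was_pending, self.pending = self.pending, False
        return was_pending


def install(request):
    """Register the handler for SIGHUP where the platform has it."""
    if hasattr(signal, "SIGHUP"):
        signal.signal(signal.SIGHUP, request.handle)
        return True
    return False
```

A main loop would call install() once at startup and then poll
consume(), reconnecting straight away whenever it returns True instead of
waiting out the current backoff delay.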
>That said, there are a handful of slave-side control buttons that I haven't
>figured out how to expose properly. Ping and Force Build are things that the
>slave admin should be able to easily do at any time. Maybe a small web page
>served by the buildslave, maybe a SIGUSR1 or 2, maybe a local TCP port that
>you just telnet into. Not sure.
mind you -- they *can* do that by visiting the master's page, and hitting
the button from there... strikes me as good enough. ;)