[users at bb.net] More 0.9.0rc2 multi-master anecdotes.
tardyp at gmail.com
Tue Mar 7 09:49:06 UTC 2017
I am not sure exactly how I can help on this as you are describing lots of
What comes to mind right now is a problem with the message queue. In the
multimaster tests I am doing, I found that a disconnection of the
message queue is currently not recovered from, which could explain why builds do
not start (the masters will not check for new requests unless they receive a
However, when the mq fails, I can see evidence of it in the logs, but you
don't mention any issue in the logs.
The database integrity errors also look bad. What kind of errors are they? We
have already had some reports of those that were due to a failing disk. Could
that be the case?
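
To look for the kind of log evidence mentioned above, something like the following could be run against each master's log. This is only a minimal sketch: the search terms are hypothetical placeholders rather than messages buildbot is known to emit, and the helper names `find_suspect_lines` and `scan_log` are made up for illustration. Replace the terms with whatever your logs actually contain.

```python
from typing import Iterable, List, Tuple

# Hypothetical search terms (assumptions, not known buildbot log messages).
DEFAULT_KEYWORDS = ("connection lost", "disconnect", "integrityerror")


def find_suspect_lines(
    lines: Iterable[str],
    keywords: Iterable[str] = DEFAULT_KEYWORDS,
) -> List[Tuple[int, str]]:
    """Return (line_number, text) for each line containing any keyword.

    Matching is a case-insensitive substring test; line numbers start at 1.
    """
    lowered = [k.lower() for k in keywords]
    hits = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\n")
        if any(k in text.lower() for k in lowered):
            hits.append((number, text))
    return hits


def scan_log(path: str, keywords: Iterable[str] = DEFAULT_KEYWORDS):
    """Scan one log file on disk and return the matching lines."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return find_suspect_lines(handle, keywords)
```

For example, `scan_log("twistd.log")` run in each master's base directory would list the matching lines with their line numbers, which makes it easier to compare timestamps across masters.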
On Mon, Mar 6, 2017 at 10:36 PM Neil Gilmore <ngilmore at grammatech.com> wrote:
> Hi everyone,
> Well, things ran OK for a couple weeks. But we had some problems
> starting last weekend. At least some failure emails don't seem to be
> getting sent out. And a problem we'd been having a bit of got a lot worse.
> For whatever reason, queued builds don't seem to want to start.
> Sometimes for hours. Even forced builds. This doesn't seem to be a
> locking problem, though I'll be having a look at that side in a bit. But
> we'll have builds sitting for hours before they start. If they start.
> Some of our people get antsy and cancel the current queue then force a
> build. But sometimes those wait, too.
> And we're having trouble getting the masters to deal with new revisions
> from svn. Everything else looks OK (postcommit hooks, etc.). I'm just not
> sure what's going on.
> Reconfig hasn't helped, nor has restarting one of the masters.
> We are getting integrity errors in our database, too.
> Except for the database problem, the rest looks like network connection
> stuff, perhaps, though we haven't had any problems there for a while.
> Neil Gilmore