[users at bb.net] More multi-master anecdotes and collapsing questions.
ngilmore at grammatech.com
Wed Jul 5 16:20:04 UTC 2017
Well, now that I can (reliably) release locks through Twisted's manhole,
things are a bit brighter. We have a somewhat rare problem in which all
of a worker's builders are 'acquiring locks'. Doesn't happen often, but
it keeps things from running. Remember, we can't use the worker
configuration to limit builds.
But we're having another problem that seems to be getting a bit worse. I
seem to recall Pierre saying that in a multi-master configuration, if
there was a scheduler that existed on multiple masters, that scheduler
would only be active on a single master. Other masters might activate
that scheduler if the first master went away. So there should only be
one master's scheduler scheduling particular builds.
Well, that isn't happening for us. It's not a problem most of the time,
because the builds do collapse, most of the time. Except when they don't.
For example, last weekend we had 3 builds schedule and build for the
same sourcestamp (according to the debug information in the UI). The
builds were scheduled within 3 seconds of each other. However, they were
claimed many hours apart. It appears that the first build completed
before the second was claimed, etc. Is this how it ought to go? I
haven't quite cracked the submitted/claimed/started timing.
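To make the timing question concrete, here is a rough sketch of the check I have in mind: group build requests by sourcestamp and flag any group that was submitted close together but claimed far apart. The record layout (`id`, `sourcestamp`, `submitted_at`, `claimed_at`), the function name and the sample values are all made up for illustration; this is not Buildbot's schema or API, and the real data would have to be pulled from the database or the UI.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def find_uncollapsed(requests,
                     submit_window=timedelta(seconds=10),
                     claim_gap=timedelta(hours=1)):
    """Return {sourcestamp: [request ids]} for groups of requests that were
    submitted within submit_window of each other but claimed at least
    claim_gap apart. All field names are hypothetical."""
    by_stamp = defaultdict(list)
    for req in requests:
        by_stamp[req["sourcestamp"]].append(req)

    suspects = {}
    for stamp, reqs in by_stamp.items():
        if len(reqs) < 2:
            continue  # a single request has nothing to collapse with
        submitted = [r["submitted_at"] for r in reqs]
        claimed = [r["claimed_at"] for r in reqs if r["claimed_at"] is not None]
        close_submit = max(submitted) - min(submitted) <= submit_window
        far_claim = len(claimed) >= 2 and max(claimed) - min(claimed) >= claim_gap
        if close_submit and far_claim:
            suspects[stamp] = sorted(r["id"] for r in reqs)
    return suspects


# Illustrative data only: three requests for one sourcestamp, submitted
# within 3 seconds of each other and claimed hours apart, plus one unrelated
# request that was claimed promptly.
t0 = datetime(2017, 7, 1, 12, 0, 0)
sample = [
    {"id": 1, "sourcestamp": "ss-1", "submitted_at": t0,
     "claimed_at": t0 + timedelta(minutes=1)},
    {"id": 2, "sourcestamp": "ss-1", "submitted_at": t0 + timedelta(seconds=1),
     "claimed_at": t0 + timedelta(hours=5)},
    {"id": 3, "sourcestamp": "ss-1", "submitted_at": t0 + timedelta(seconds=3),
     "claimed_at": t0 + timedelta(hours=11)},
    {"id": 4, "sourcestamp": "ss-2", "submitted_at": t0,
     "claimed_at": t0 + timedelta(minutes=2)},
]

suspects = find_uncollapsed(sample)
print(suspects)
```

The thresholds are arbitrary; the point is only to separate "submitted together, claimed together" from "submitted together, claimed hours apart", which is the pattern described above.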
We had a similar claiming problem last week where a build went unclaimed
for 44 days. So when it popped up, it appeared that we had gone back in
time (as the revision was quite old at that time).
Do I just need to figure out how to not put schedulers on more than 1