[users at bb.net] 0.9.0rc2, multi-master, and anecdotes on reconfig.

Fri Nov 11 17:22:35 UTC 2016

Hi everyone,

As always, thank you for your expertise.

Today, just a couple of questions:

Q1.

On 11/4/2016 11:02 AM, Pierre Tardy wrote:
> The workaround is to change the scheduler name so that it is forced 
> re-created.
> Please feel free to dig in the code and submit a PR if this becomes 
> too annoying.
>

OK, it's becoming annoying as we move from the newfound wonder of having 
a system that's not constantly bogging down to niggling issues like our 
builders not getting scheduled.

One problem that's (probably) specific to us is that we construct our 
set of schedulers from our builder specifications, rather than having 
some set of schedulers and plugging builders into them. So it would be 
quite difficult to figure out which scheduler a build will use, then 
find all the other builders on that scheduler, disable those in 
master.cfg and reconfig so that the scheduler will be removed, then undo 
it all so the correct set of builders will be in the scheduler when it's 
remade.
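
Since the schedulers here are generated from builder specifications, one way to apply the rename workaround automatically might be to derive each scheduler's name from its builder set, so the name changes (and the scheduler is re-created) whenever membership changes. This is only a sketch under that assumption: `scheduler_name` and the base name are hypothetical helpers for master.cfg, not part of Buildbot, and any state Buildbot keeps under the old scheduler name may not carry over to the new one.

```python
import hashlib


def scheduler_name(base, builder_names):
    """Return a scheduler name that changes whenever the builder set changes.

    Hypothetical helper, not a Buildbot API. The result would be passed as
    the name of whatever scheduler master.cfg builds for these builders.
    """
    # Sort and de-duplicate so the name depends only on membership,
    # not on the order in which builders happen to be listed.
    joined = "\n".join(sorted(set(builder_names)))
    # A short digest keeps the name readable while still changing
    # when any builder is added or removed.
    digest = hashlib.sha1(joined.encode("utf-8")).hexdigest()[:8]
    return "{}-{}".format(base, digest)
```

With this, adding or removing a builder yields a different name on the next reconfig, while merely reordering the builder list leaves the name alone.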

My actual question is: where can I look for information on submitting a 
PR? A quick search didn't turn up what I was looking for. I figure 
someone on the list will be able to point me in the right direction.

Q1.5 Assuming I have to add the reconfig API code myself, where in the 
current code is a class with a decent implementation that I can look at 
to get myself up to speed? And do I need to add anything other than the 
API to the base scheduler class (like pointing a parent at it)?

Q2.

This is one about a preferred way to handle a situation that's currently 
happening.

Some of our test builds, when successful, take days to run. But if one 
fails, we want it rerun fairly soon. We currently schedule them hourly, 
but as these don't rely on revision, the queues can get pretty long.

What happens is that we'll have a successful run, which takes days, and 
results in a long queue. Then something will change, and the builds will 
fail within the first couple of minutes, and keep failing. So everyone 
who gets build emails will get a failure email every couple of minutes 
for quite a while. In one example, there were something like 45 builds 
queued, and each one fired off a failure email a couple of minutes 
after the one before it.

Sure, we can cancel the queues, and then we get a failure every hour. 
But is there a better way to prevent this?

One idea is to have a scheduler that won't schedule if there's another 
build running, though I think it would be easier to not schedule if 
there's another build queued (just because I've already done that to 
make our schedulers always schedule for the latest revision, removing 
builds for old revisions).
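
The decision described above can be sketched independently of Buildbot. This is a minimal illustration, not a scheduler implementation: `builders_to_schedule` and its arguments are hypothetical, and how the pending and running counts are obtained from the master is deliberately left out.

```python
def builders_to_schedule(builder_names, pending, running=None,
                         skip_if_running=False):
    """Return the builders that should get a new build request.

    Hypothetical helper, not a Buildbot API.
    pending / running map builder name -> number of queued / active builds.
    """
    running = running or {}
    selected = []
    for name in builder_names:
        # Never stack a second request behind one that is already queued.
        if pending.get(name, 0) > 0:
            continue
        # Optionally also hold off while a build is still in progress.
        if skip_if_running and running.get(name, 0) > 0:
            continue
        selected.append(name)
    return selected
```

With the default settings at most one request would sit in the queue per builder, so a run of quick failures would produce roughly one email per scheduling interval instead of one per queued build. Setting `skip_if_running=True` gives the stricter "don't schedule while a build is running" variant.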

Any ideas?

Thanks!

Neil Gilmore
grammatech.com