[Buildbot-devel] database-backed status/scheduler-state project

Thu Sep 3 17:59:37 UTC 2009

On 3-Sep-09, at 1:41 PM, Axel Hecht wrote:

> 2009/9/3 Ben Hearsum <bhearsum at mozilla.com>
>
>>
>> On 3-Sep-09, at 6:07 AM, Axel Hecht wrote:
>>
>>>
>>> A big load issue right now are the changesources, but I got those
>>> covered. If there are more master load issues, why not spin more?
>>> Riiiight, let's detail on that.
>>>
>>> We're not having any load issues on the slaves. We may have not
>>> enough slaves, and disks, but in general, each slave is fine. That's
>>> because we're using generic slaves per platform that can do basically
>>> any build on that platform, independent of branch. We're setting the
>>> builds limit to 1 per slave, and our slaves build what needs to be
>>> built.
>>>
>>> Adding masters breaks that.
>>>
>>
>> I don't think it does. In the plan Brian describes every master would
>> be identically configured - able to handle any job from any branch. It
>> allows us to end up with a bunch of masters in different places with
>> smaller slave pools. (Does this belong on a Mozilla list rather than
>> Buildbot?)
>>
>
> My point was that adding masters as they ship today breaks things.
>
> I do think that the proposed scheme can solve that, I just think that
> you get the same net result by making slave pools work across masters,
> which sounds like an easier patch to me.
>
> To rephrase how I understand the two ways:
>
> In Brian's proposal, you'd have a slave-master per N slaves, which
> picks up build requests from a single scheduling master.
> In my proposal, you have, say a master per branch in our case, and
> they'd share slave pools.

I think I understand what you're saying now. I'm not sure whether
sharing slaves across multiple masters is easier or simpler to
implement. Regardless, I don't think it's as good an end goal. I
think that having multiple masters that do not directly interact is a
cleaner approach.

> I fear the edges in fighting race conditions in the scheme where you
> have decoupled daemons trying to fight over rows in a database, tbh.

I don't think this is *that* complicated to deal with. Most databases
support locking, don't they? MySQL (InnoDB) does, PostgreSQL does,
SQLite sort of does (maybe we don't care, since SQLite probably only
works in single-master mode anyway).