[Buildbot-devel] Ignoring offline builders/slaves

Charles Lepple clepple at gmail.com
Thu Mar 12 03:27:18 UTC 2015


Vitali,

If I am not mistaken, canStartBuild is for selecting one slave out of many assigned to a given builder.

The configuration that Jason posted seems to have a 1:1 mapping of builders to slaves.
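
To make the distinction concrete, here is a minimal sketch of the kind of veto Vitali describes, assuming the hook is called with a builder, one candidate slave, and the build request. FakeSlave, can_start_build and pick_slave are hypothetical stand-ins written for this illustration, not Buildbot classes or functions; the objects Buildbot actually passes in, and their method names, depend on the version, so check the documentation for yours.

```python
class FakeSlave(object):
    """Hypothetical stand-in for a build slave (not a Buildbot class)."""

    def __init__(self, name, connected):
        self.name = name
        self.connected = connected

    def isConnected(self):
        # Assumption: the real slave object exposes a similar connectivity check.
        return self.connected


def can_start_build(builder, slave, build_request):
    """Veto hook: accept the candidate slave only if it is connected."""
    return slave.isConnected()


def pick_slave(builder, slaves, build_request):
    """Return the first slave the veto accepts, or None if all are vetoed."""
    for slave in slaves:
        if can_start_build(builder, slave, build_request):
            return slave
    return None
```

With several slaves per builder, pick_slave skips the offline one and the build proceeds elsewhere. With a 1:1 mapping and the only slave offline, it returns None, so the request would simply stay pending rather than being dropped from the Gerrit summary.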

Jason,

I am not familiar with how the Gerrit summary works. If you cancel a pending build when a slave is down, does it continue to wait for another build for that change?

- Charles

On Mar 11, 2015, at 8:52 PM, Vitali Lovich <vlovich at gmail.com> wrote:

> Look at canStartBuild.  It gives you slaves, build request etc.  You can get the botmaster from the builder & use it to see if the buildslave is online, idle, busy, etc.
> 
> -Vitali
> 
>> On Mar 11, 2015, at 5:06 PM, Jason Edgecombe <jason at rampaginggeek.com> wrote:
>> 
>> On 03/11/2015 02:07 PM, Mikhail Sobolev wrote:
>>> Hi Jason,
>>> 
>>> On Tue, Mar 10, 2015 at 07:54:21PM -0400, Jason Edgecombe wrote:
>>>> On 03/10/2015 02:33 PM, Mikhail Sobolev wrote:
>>>>> On Tue, Mar 10, 2015 at 09:32:00AM -0400, Jason Edgecombe wrote:
>>>>>> We maintain a buildbot farm of volunteer slaves. A subset of the slaves
>>>>>> use Gerrit as a changesource. Occasionally, one of the gerrit slaves is
>>>>>> down, and can be down for hours until the slave admin can fix it. During
>>>>>> the outage, gerrit changes are blocked from being built and receiving
>>>>>> status updates on the builds. Is there a way to dynamically exclude the
>>>>>> offline builders from the gerrit pool?
>>>>> Could you please elaborate a bit?  My understanding is that if a build slave is
>>>>> down, it won't get any jobs, so how do Gerrit changes get blocked?
>>>>> 
>>>>>> For reference, my buildbot config file is at
>>>>>> https://github.com/edgester/afsbotcfg/blob/master/master.cfg
>>>>> Thanks for the link.  I'm looking at it now to see if I understand the problem
>>>>> better.
>>>> We use a summary callback in the Buildbot config, so that only one
>>>> comment is posted to gerrit when all of the slaves are done. This is
>>>> fine, but new gerrit changes will typically trigger a build on all of
>>>> the gerrit builders, even the ones that are down at the time of submission.
>>> Let me see if I understand the problem correctly.
>>> 
>>> There are a number of platforms that you'd like to check things against.  So
>>> a build is created for each of those platforms.  However, when all Gerrit-capable
>>> build slaves for one of the platforms are down, things get stuck.
>>> 
>>> If this is the case, then I do not think it's possible to work around this: a
>>> build for each platform has to be performed.
>>> 
>> When a build is submitted or started, buildbot already knows which 
>> builders are online. Is there a way to instruct the buildmaster to 
>> submit the change to all builders that are online at that time? I 
>> understand that changes that are currently building, and possibly those 
>> in-queue are hosed, but I would like for newly-submitted (and possibly 
>> queued) changes to act as if the downed builder is not in the build list.
>> 
>> The problem is that one of the admins must intervene to reconfigure the
>> buildmaster to ignore the downed slave in order to restore partial
>> service. Since the admins, like myself, are volunteers and possibly in
>> different time zones, other developers may be waiting for half a day
>> until an admin gets a chance to intervene.
>> 
>> Thanks,
>> Jason
>> 
>> ------------------------------------------------------------------------------
>> Dive into the World of Parallel Programming The Go Parallel Website, sponsored
>> by Intel and developed in partnership with Slashdot Media, is your hub for all
>> things parallel software development, from weekly thought leadership blogs to
>> news, videos, case studies, tutorials and more. Take a look and join the 
>> conversation now. http://goparallel.sourceforge.net/
>> _______________________________________________
>> Buildbot-devel mailing list
>> Buildbot-devel at lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/buildbot-devel

-- 
Charles Lepple
clepple at gmail


