[Buildbot-devel] Ignoring offline builders/slaves

Vitali Lovich vlovich at gmail.com
Thu Mar 12 00:52:18 UTC 2015


Look at canStartBuild.  It gives you the slave, the build request, etc.  You can get the botmaster from the builder and use it to see whether a buildslave is online, idle, busy, etc.
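
A rough sketch of what that could look like, not a tested recipe. It assumes the 0.8.x-style hook signature canStartBuild(builder, slavebuilder, buildrequest), that builder.botmaster.slaves maps slave names to buildslave objects, and that each of those has isConnected(). The helper name slave_is_connected is made up for illustration; check the attribute names against the Buildbot version you actually run.

```python
def slave_is_connected(botmaster, slavename):
    """Return True if the botmaster knows this slave and it is connected.

    Hypothetical helper. Assumes botmaster.slaves is a dict of
    name -> buildslave object and that each object has isConnected().
    """
    slave = botmaster.slaves.get(slavename)
    return slave is not None and bool(slave.isConnected())


def canStartBuild(builder, slavebuilder, buildrequest):
    """Accept the offered slave only while it is still connected.

    Assumed signature (0.8.x style); buildrequest is unused here but
    could be inspected for properties or sources if needed.
    """
    name = slavebuilder.slave.slavename
    return slave_is_connected(builder.botmaster, name)


# In master.cfg this would be passed to the builder configuration,
# assuming your version accepts the canStartBuild keyword:
#   BuilderConfig(name=..., slavenames=[...], factory=...,
#                 canStartBuild=canStartBuild)
```

The same botmaster lookup can be used for any other slave name, not just the one being offered, if you want the decision to depend on the state of the rest of the pool.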

-Vitali

> On Mar 11, 2015, at 5:06 PM, Jason Edgecombe <jason at rampaginggeek.com> wrote:
> 
> On 03/11/2015 02:07 PM, Mikhail Sobolev wrote:
>> Hi Jason,
>> 
>> On Tue, Mar 10, 2015 at 07:54:21PM -0400, Jason Edgecombe wrote:
>>> On 03/10/2015 02:33 PM, Mikhail Sobolev wrote:
>>>> On Tue, Mar 10, 2015 at 09:32:00AM -0400, Jason Edgecombe wrote:
>>>>> We maintain a buildbot farm of volunteer slaves. A subset of the slaves
>>>>> use Gerrit as a changesource. Occasionally, one of the gerrit slaves is
>>>>> down, and can be down for hours until the slave admin can fix it. During
>>>>> the outage, gerrit changes are blocked from being built and receiving
>>>>> status updates on the builds. Is there a way to dynamically exclude the
>>>>> offline builders from the gerrit pool?
>>>> Could you please elaborate a bit?  My understanding is that if a build slave is
>>>> down, it won't get any jobs, so how do gerrit changes get blocked?
>>>> 
>>>>> For reference, my buildbot config file is at
>>>>> https://github.com/edgester/afsbotcfg/blob/master/master.cfg
>>>> Thanks for the link.  I'm looking at it now to see if I understand the problem
>>>> better.
>>> We use a summary callback in the Buildbot config, so that only one
>>> comment is posted to gerrit when all of the slaves are done. This is
>>> fine, but new gerrit changes will typically trigger a build on all of
>>> the gerrit builders, even the ones that are down at the time of submission.
>> Let me see if I understand the problem correctly.
>> 
>> There are a number of platforms that you'd like to check changes against, so
>> a build is created for each of those platforms.  However, when all
>> Gerrit-capable build slaves for one of the platforms are down, things get stuck.
>> 
>> If this is the case, then I do not think it's possible to work around this: a
>> build for each platform has to be performed.
>> 
> When a build is submitted or started, buildbot already knows which 
> builders are online. Is there a way to instruct the buildmaster to 
> submit the change to all builders that are online at that time? I 
> understand that changes that are currently building, and possibly those 
> in-queue are hosed, but I would like for newly-submitted (and possibly 
> queued) changes to act as if the downed builder is not in the build list.
> 
> The problem is that one of the admins must intervene to reconfigure the 
> buildmaster to ignore the downed slave in order to restore partial 
> service. Since the admins, like myself, are volunteers and possibly in 
> different timezones, other developers may be waiting for half a day 
> until an admin gets a chance to intervene.
> 
> Thanks,
> Jason
> 
