[Buildbot-devel] Buildslave disconnects

Mark MacVicar mark.macvicar at gmail.com
Wed Nov 5 21:56:02 UTC 2008


I think I may have tracked down the problem. Disk space!

Our root Linux partition was filling up. After moving some files off
and cleaning up /tmp, buildbot seems more stable.
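
For anyone who wants to catch this earlier, here is a minimal sketch of a disk-space check that could be run from cron on the master or slave host. It assumes Python 3.3 or later (for shutil.disk_usage). The function name low_disk_space and the 10% threshold are made up for illustration and are not part of buildbot.

```python
import shutil


def low_disk_space(path="/", min_free_fraction=0.10):
    """Return True if the filesystem holding `path` has less free
    space than `min_free_fraction` of its total size.

    Hypothetical helper for illustration; not a buildbot API.
    """
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return usage.free / usage.total < min_free_fraction


if __name__ == "__main__":
    # Check the partitions mentioned above: the root partition and /tmp.
    for path in ("/", "/tmp"):
        if low_disk_space(path):
            print("WARNING: low disk space on %s" % path)
```

The threshold is a guess; pick whatever margin your builds actually need.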

Thanks for everyone's advice.

On Mon, Nov 3, 2008 at 2:22 PM, Mark MacVicar <mark.macvicar at gmail.com> wrote:
> Our buildmaster and one slave are running on an instance of VMware; the
> other slave is a dedicated PC.
>
> I'll see if I can track it down to the TCP issue you mentioned or try
> to get some better hardware.
>
> I tried increasing the timeout for each slave to 1200 seconds but they
> are still reporting "BotFactory.checkActivity: nothing from master for
> 1307 secs" then they reconnect almost immediately.
>
> I think there might be an update to Twisted. I'll try updating that too.
>
> Thanks for your insight,
>
> Mark MacVicar
>
> On Thu, Oct 30, 2008 at 10:26 AM, Bill Baker <bbb at uiuc.edu> wrote:
>> I've had periodic disconnects, but I didn't notice that log message.  It
>> seemed hardware-dependent, actually.  My build slaves are all virtual
>> machines running in VMware Server 1 and 2, and on one physical server I got
>> disconnections every hour or so, causing long-running builds to fail.  I
>> tried moving them to newer hardware, and all the disconnections stopped.
>>
>> So my first guess is that the cause is different.  With mine, I think it was
>> due to an interaction between the ethernet hardware and VMware, which is
>> constantly juggling MAC addresses, and seemed to be breaking TCP connections
>> on one platform but not on the other.
>>
>> On Wed, Oct 29, 2008 at 4:24 PM, Mark MacVicar <mark.macvicar at gmail.com>
>> wrote:
>>>
>>> Hi,
>>>
>>> I'm having a problem where my buildslaves are disconnecting every few
>>> days (sometimes multiple times per day) with the following log:
>>>
>>> buildslave_linux32/twistd.log:1321:2008/10/28 09:47 -0700 [-]
>>> BotFactory.checkActivity: nothing from master for 629 secs
>>>
>>> This is even happening for the slave I have running on the same server
>>> as the master.
>>>
>>> I'm running my buildmaster and one of my slaves on openSUSE 10.3. I'm
>>> currently running a modified buildbot 0.7.8.
>>>
>>> I wasn't having this problem for a while, so I'm suspicious of my
>>> modifications, but I haven't modified the slave scripts.
>>>
>>> Has anyone encountered periodic, seemingly random buildslave
>>> disconnects and found a solution?
>>>
>>> Thank you,
>>>
>>> Mark MacVicar
>>>
>>
>>
>



