Windows slaves freezing on downloadFile step

Dustin J. Mitchell dustin at v.igoro.us
Tue Oct 6 01:09:19 UTC 2015


It sounds like it's not so much freezing as crashing, or else something's
going wrong with the network between the hosts.

Dustin

On Mon, Oct 5, 2015 at 8:05 PM, Elliot Saba <staticfloat at gmail.com> wrote:

> All I see is the following:
>
> 2015-10-06 00:02:26+0000 [Broker,28,www.xxx.yyy.zzz] Unhandled Error
>         Traceback (most recent call last):
>         Failure: twisted.spread.pb.PBConnectionLost: [Failure instance:
> Traceback (failure with no frames): <class
> 'twisted.internet.error.ConnectionLost'>: Connection to the other side was
> lost in a non-clean fashion.
>         ]
>
> On Sun, Oct 4, 2015 at 6:54 AM, Dustin J. Mitchell <dustin at v.igoro.us>
> wrote:
>
>> Are you seeing any tracebacks in the master's twistd.log?
>>
>> Dustin
>>
>> On Wed, Sep 30, 2015 at 1:35 PM, Elliot Saba <staticfloat at gmail.com>
>> wrote:
>>
>>> Hello all, I have some Windows Server buildslaves that are freezing on a
>>> master -> slave file transfer step.  The file itself is only a few KB, so I
>>> know it's not a network speed issue, and other steps (including slave ->
>>> master uploads) have been successful.
>>>
>>> You can see the buildmaster webpage here
>>> <http://buildbot.e.ip.saba.us:8010/waterfall?tag=Juno>, but that
>>> doesn't show too much information, except that (1) the slaves continually
>>> time out and restart themselves, and (2) the webpage "interrupt" button
>>> doesn't seem to work, so the buildslaves just continually restart
>>> themselves with no apparent way to force them to stop.  twistd.log on one
>>> of the slaves shows this log
>>> <https://gist.github.com/staticfloat/bed35422a12772439c2c>, which I
>>> think shows a timeout on the slave side, thinking that the buildmaster is
>>> frozen.  My questions are:
>>>
>>> (1) What's the proper way to clear the queue when the "interrupt" button
>>> doesn't work?  So far, I'm getting around this by stopping the
>>> buildmaster, then starting it back up, and canceling the jobs from the "pending"
>>> queue before the buildslaves can reconnect.
>>>
>>> (2) Why would this download step be freezing?  I've tried turning off
>>> the Windows firewall and that doesn't seem to be the problem.
>>>
>>> If there's more information I need to give, please let me know!
>>> -E
>>>
>>
>>
>
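
For anyone checking a master's twistd.log for the kind of error quoted above, here is a minimal sketch that groups log lines into entries and keeps the ones mentioning ConnectionLost. It assumes each entry begins with a timestamp in the format shown in the quoted log line, and that lines without a timestamp continue the previous entry. The names ENTRY_START, group_entries, and find_connection_losses are hypothetical helpers for illustration, not part of Buildbot or Twisted.

```python
import re

# Assumption: each log entry starts with a timestamp and a bracketed source,
# as in "2015-10-06 00:02:26+0000 [Broker,28,host] Unhandled Error".
ENTRY_START = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}[+-]\d{4} \[")


def group_entries(lines):
    """Group raw log lines into entries (lists of lines).

    A line matching ENTRY_START opens a new entry; any other line
    (for example an indented traceback line) continues the current one.
    """
    entries = []
    for line in lines:
        line = line.rstrip("\n")
        if ENTRY_START.match(line) or not entries:
            entries.append([line])
        else:
            entries[-1].append(line)
    return entries


def find_connection_losses(lines, needle="ConnectionLost"):
    """Return the entries in which any line contains `needle`."""
    return [
        entry
        for entry in group_entries(lines)
        if any(needle in text for text in entry)
    ]
```

Since the function accepts any iterable of lines, an open file object for the log can be passed to find_connection_losses directly. The default substring also matches PBConnectionLost, so the "Unhandled Error" entry quoted earlier in the thread would be reported together with its traceback lines.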