[Buildbot-devel] "Connection lost in a non-clean fashion" / duplicate slave
Harry Percival
harry at pythonanywhere.com
Mon Nov 17 16:02:32 UTC 2014
PS - having inspected the logs on the slave, it looks like one of the
keepalives sent from the slave may be failing:
2014-11-17 09:02:14+0000 [-] sending app-level keepalive
2014-11-17 09:12:14+0000 [-] sending app-level keepalive
2014-11-17 09:18:50+0000 [Broker,client] SlaveBuilder._ackFailed:
SlaveBuilder.sendUpdate
2014-11-17 09:18:50+0000 [Broker,client] Unhandled Error
Traceback (most recent call last):
Failure: twisted.spread.pb.PBConnectionLost: [Failure instance:
Traceback (failure with no frames): <class
'twisted.internet.error.ConnectionLost'>: Connection to the other
side was lost in a non-clean fashion.
]
2014-11-17 09:18:50+0000 [Broker,client] SlaveBuilder._ackFailed:
SlaveBuilder.sendUpdate
2014-11-17 09:18:50+0000 [Broker,client] Unhandled Error
Traceback (most recent call last):
Failure: twisted.spread.pb.PBConnectionLost: [Failure instance:
Traceback (failure with no frames): <class
'twisted.internet.error.ConnectionLost'>: Connection to the other
side was lost in a non-clean fashion.
]
2014-11-17 09:18:50+0000 [Broker,client] SlaveBuilder._ackFailed:
SlaveBuilder.sendUpdate
2014-11-17 09:18:50+0000 [Broker,client] Unhandled Error
Traceback (most recent call last):
Failure: twisted.spread.pb.PBConnectionLost: [Failure instance:
Traceback (failure with no frames): <class
'twisted.internet.error.ConnectionLost'>: Connection to the other
side was lost in a non-clean fashion.
]
2014-11-17 09:18:50+0000 [Broker,client] lost remote
2014-11-17 09:18:50+0000 [Broker,client] lost remote step
2014-11-17 09:18:50+0000 [Broker,client] stopCommand: halting
current command <buildslave.commands.shell.SlaveShellCommand
instance at 0x0000000002DE6C08>
2014-11-17 09:18:50+0000 [Broker,client] command interrupted,
attempting to kill
2014-11-17 09:18:50+0000 [Broker,client] using TASKKILL PID /F /T to
kill pid 3356
2014-11-17 09:18:51+0000 [Broker,client] taskkill'd pid 3356
2014-11-17 09:18:51+0000 [Broker,client] Lost connection to
integration.company.com:9886
2014-11-17 09:18:51+0000 [Broker,client]
<twisted.internet.tcp.Connector instance at 0x0000000002BD6108> will
retry in 2 seconds
2014-11-17 09:18:51+0000 [Broker,client] Stopping factory
<buildslave.bot.BotFactory instance at 0x0000000002BC6E48>
2014-11-17 09:18:51+0000 [-] command finished with signal None, exit
code 1, elapsedTime: 2062.839000
2014-11-17 09:18:51+0000 [-] would sendStatus but not .running
2014-11-17 09:18:51+0000 [-] SlaveBuilder.commandComplete None
2014-11-17 09:18:54+0000 [-] Starting factory
<buildslave.bot.BotFactory instance at 0x0000000002BC6E48>
2014-11-17 09:18:54+0000 [-] Connecting to integration.company.com:9886
2014-11-17 09:18:54+0000 [Broker,client] message from master: master
already has a connection named 'redacted' - checking its liveness
2014-11-17 09:19:04+0000 [Broker,client] message from master: attached
2014-11-17 09:19:04+0000 [Broker,client]
SlaveBuilder.remote_print(google chrome stress): message from
master: attached
2014-11-17 09:19:04+0000 [Broker,client] Connected to
integration.company.com:9886; slave is ready
2014-11-17 09:19:04+0000 [Broker,client] sending application-level
keepalives every 600 seconds
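
To check whether the drop actually coincides with a keepalive, here is a rough sketch that measures the time between the last keepalive and the first failed update in a slave log. The log lines are copied from the excerpt above; the function names and the choice of "_ackFailed" as the failure marker are just my assumptions for illustration, not anything buildbot provides.

```python
import re
from datetime import datetime

# A few lines copied from the slave's twistd.log excerpt above.
LOG = """\
2014-11-17 09:02:14+0000 [-] sending app-level keepalive
2014-11-17 09:12:14+0000 [-] sending app-level keepalive
2014-11-17 09:18:50+0000 [Broker,client] SlaveBuilder._ackFailed:
SlaveBuilder.sendUpdate
2014-11-17 09:18:50+0000 [Broker,client] Unhandled Error
2014-11-17 09:18:51+0000 [Broker,client] Lost connection to
integration.company.com:9886
"""

# Matches lines that start with a timestamp; wrapped continuation
# lines have no timestamp and are skipped.
STAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\+0000 (.*)$")


def parse(log_text):
    """Return (datetime, message) pairs for every timestamped line."""
    entries = []
    for line in log_text.splitlines():
        m = STAMP.match(line)
        if m:
            when = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
            entries.append((when, m.group(2)))
    return entries


def seconds_since_last_keepalive(log_text):
    """Seconds between the last keepalive and the first _ackFailed line.

    Returns None if the log has no failure preceded by a keepalive.
    """
    last_keepalive = None
    for when, message in parse(log_text):
        if "sending app-level keepalive" in message:
            last_keepalive = when
        elif "_ackFailed" in message and last_keepalive is not None:
            return (when - last_keepalive).total_seconds()
    return None


gap = seconds_since_last_keepalive(LOG)
print(gap)
```

On this excerpt the gap comes out shorter than the 600-second keepalive interval, so the failure is logged between two keepalives rather than at the moment one is sent. That alone doesn't say which side dropped the connection.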
--
Harry Percival
Developer
harry at pythonanywhere.com
PythonAnywhere - a fully browser-based Python development and hosting environment
<http://www.pythonanywhere.com/>
PythonAnywhere LLP
17a Clerkenwell Road, London EC1M 5RD, UK
VAT No.: GB 893 5643 79
Registered in England and Wales as company number OC378414.
Registered address: 28 Ely Place, 3rd Floor, London EC1N 6TD, UK
On 17/11/14 09:28, Harry Percival wrote:
> Hi there,
>
> We've been running a build farm with a linux master and windows slaves
> for many years. Have been experimenting with moving the slaves to
> Azure, and I'm seeing a lot of errors saying:
>
> remoteFailed: [Failure instance: Traceback (failure with no
> frames): <class 'twisted.internet.error.ConnectionLost'>: Connection to
> the other side was lost in a non-clean fashion.
>
> Which abort the build. In the twistd.log, I'm seeing these messages at
> around the same time:
>
> <timestamp> [Broker,<n>,<ip>] duplicate slave <slavename>; delaying
> new slave (IPv4Address(TCP, '<ip>', <port>)) and pinging old
> (IPv4Address(TCP, '<ip>', <other-port>))
> <timestamp+10s> [Broker,<n-1>,<ip>] BuildSlave.detached(<slavename>)
>
> What could be happening here?
>
> The slaves are running Windows Server 2012R2 Datacenter.
>
> I've had a look at this:
> https://mariadb.com/kb/en/mariadb/development/tools/buildbot/buildbot-setup/buildbot-setup-buildbot-setup-for-windows/
> and tried changing the keepalive variable in buildbot.tac, with no apparent change.
>
> thanks for any help!
>
> Harry
>