[Buildbot-devel] Intermittent Connection.Lost error with Windows slaves

Dmitry Mikhin dmitry.mikhin at gmail.com
Thu Sep 6 06:48:02 UTC 2012


Hello everyone,

I'm having some trouble with a (virtual) Windows slave.

Setup:
- Master and slave on the same Fedora 14 box
- Windows XP under VirtualBox (LibVirtSlave)
- Buildslave v. 0.8.6p1 on slave, Buildbot 0.8.5, Twisted 11.1.0 on master
- PuTTY for establishing ssh tunnel to the master
- MyEnTunnel for automatically starting PuTTY (plink)

It generally works, but roughly every second build fails with the exception:

[Failure instance: Traceback (failure with no frames): <class
'twisted.internet.error.ConnectionLost'>: Connection to the other side
was lost in a non-clean fashion.
]

There is not much information in the twistd.log files on either the
slave or the master. The master gets the exception, reports the slave
as lost, and then for some time cannot substantiate the virtual slave.
The slave log just shows a new build starting, as if nothing had
happened in between.

Everything is stable with the Linux slaves, which all run on the same
physical host.

I tried the suggestions from
http://kb.askmonty.org/en/buildbot-setup-buildbot-setup-for-windows,
in particular:

- In the buildbot.tac file, specified a higher keepalive value, 60000.

- Modified the Windows KeepAliveTime registry setting to a value of
60000 (had to create this DWORD registry entry, it did not exist
before).

- Turned off Windows firewall.

- I cannot ensure that the master is never 100% busy; it's a multicore
box running multiple slaves. But failures happen at roughly the same
frequency even when I manually start this single Windows build and
not much else is going on.
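
One thing worth double-checking about the two keepalive values above: as
far as I understand, they are not in the same unit. The sketch below
assumes that the keepalive setting in the slave's buildbot.tac is in
seconds and that the Windows KeepAliveTime registry value (under
HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters) is in
milliseconds. The constant and function names are made up for
illustration; it only does the arithmetic and does not read any real
configuration.

```python
# Sketch: express both keepalive settings as a plain duration.
# Assumptions (not confirmed in this thread):
#   - buildbot.tac `keepalive` is an interval in seconds
#   - Windows KeepAliveTime (registry DWORD) is in milliseconds

BUILDBOT_TAC_KEEPALIVE_S = 60000   # value put into buildbot.tac
WINDOWS_KEEPALIVETIME_MS = 60000   # value put into the registry


def describe(seconds):
    """Format a duration given in seconds as 'Hh MMm SSs'."""
    hours, rem = divmod(int(seconds), 3600)
    minutes, secs = divmod(rem, 60)
    return "%dh %02dm %02ds" % (hours, minutes, secs)


# Bring both values to seconds so they can be compared directly.
tac_interval_s = float(BUILDBOT_TAC_KEEPALIVE_S)
tcp_interval_s = WINDOWS_KEEPALIVETIME_MS / 1000.0

print("buildbot.tac keepalive :", describe(tac_interval_s))
print("Windows KeepAliveTime  :", describe(tcp_interval_s))
```

If those unit assumptions hold, the same number 60000 means very
different intervals in the two places, so the buildbot.tac value may not
be doing what was intended.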

Any suggestions? Any hints on how to debug the issue?

Thanks,
Dmitry



