[Buildbot-devel] Slave disconnected under heavy load?

Sidnei da Silva sidnei at enfoldsystems.com
Fri Jun 2 23:36:29 UTC 2006


We've been happily using the buildbot over here at Enfold Systems to manage our
products and also the nightly builds for the Plone project.

The bot is hosted at:
http://buildbot.enfoldsystems.com/

However, we've been experiencing frequent disconnections from one slave when
there's either high IO or high network load.

The slave is running as a py2exe'd service on a Windows XP box hosted on VMware.

Whenever a disconnection happens the slave seems to be either:

 - Checking out one or more large projects (10MB+) from an svn repository on
the same subnet
 - Copying a large quantity of files around

Neither of those produces much output, so the first thing I suspected was a
timeout, but that doesn't seem to be the case. I've also added locks so that
only one build does a checkout at a time, and that doesn't seem to have helped
either.
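
To illustrate what the locking is meant to achieve, here is a minimal sketch in
plain Python. It is not our actual Buildbot configuration and uses no Buildbot
API; the names checkout_lock, checkout and run_builds are hypothetical, and the
sleep is only a placeholder for the real svn checkout. It shows only the idea:
several builds start together, but the shared lock lets one checkout run at a
time.

```python
import threading
import time

# Hypothetical stand-in for the lock shared by all builds on one slave.
checkout_lock = threading.Lock()

# Bookkeeping used to observe how many checkouts overlap.
counter_guard = threading.Lock()
active = 0       # checkouts running right now
max_active = 0   # highest concurrency observed so far


def checkout(project):
    """Simulate a checkout; the shared lock serializes callers."""
    global active, max_active
    with checkout_lock:
        with counter_guard:
            active += 1
            max_active = max(max_active, active)
        time.sleep(0.01)  # placeholder for the real svn checkout
        with counter_guard:
            active -= 1


def run_builds(projects):
    """Start one thread per project and return the peak concurrency seen."""
    threads = [threading.Thread(target=checkout, args=(p,)) for p in projects]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return max_active
```

With the lock in place the peak concurrency stays at one, which is the
behaviour I expected to reduce the load on the slave.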

Since I saw an email from Brian briefly mentioning issues under heavy load, I
thought it would be a good idea to report this and see if anyone has experienced
similar issues or if there's anything that can be done to debug this.

We've recently upgraded to the 0.7.3 release with Twisted 2.0.x.

-- 
Sidnei da Silva
Enfold Systems, Inc.




