[Buildbot-devel] lost remote - slave cannot keep connection with master

Aaron Maxwell amax at snaplogic.org
Tue Feb 26 01:02:19 UTC 2008


On Monday 25 February 2008 10:51:12 Aaron Maxwell wrote:
> Hi all,
> I'm suddenly having a problem today with my buildbot installation.  We have

Some more info: the log excerpt in my previous message is from the slave's
twistd.log.  Here is the corresponding set of log messages from the master's
twistd.log:

{{{
2008/02/25 18:40 -0700 [Broker,2,10.200.10.34] duplicate slave linbot1 replacing old one
2008/02/25 18:40 -0700 [Broker,2,10.200.10.34] old slave was connected from IPv4Address(TCP, '10.200.10.34', 58881)
2008/02/25 18:40 -0700 [Broker,2,10.200.10.34] new slave is from IPv4Address(TCP, '10.200.10.34', 53300)
2008/02/25 18:40 -0700 [Broker,2,10.200.10.34] disconnecting old slave linbot1 now
2008/02/25 18:40 -0700 [Broker,2,10.200.10.34] waiting for slave to finish disconnecting
2008/02/25 18:40 -0700 [Broker,0,10.200.10.34] <Builder 'slave_builder' at -1222704148>.detached linbot1
2008/02/25 18:40 -0700 [Broker,0,10.200.10.34] Buildslave linbot1 detached from slave_builder
2008/02/25 18:40 -0700 [Broker,0,10.200.10.34] <BuildSlave 'linbot1', current builders: slave_builder> removing <SlaveBuilder builder=slave_builder slave=li
2008/02/25 18:40 -0700 [Broker,0,10.200.10.34] BuildSlave.detached(linbot1)
2008/02/25 18:40 -0700 [Broker,2,10.200.10.34] Got slaveinfo from 'linbot1'
2008/02/25 18:40 -0700 [Broker,2,10.200.10.34] bot attached
2008/02/25 18:40 -0700 [Broker,2,10.200.10.34] <BuildSlave 'linbot1', current builders: slave_builder> adding <SlaveBuilder builder=slave_builder slave=linb
2008/02/25 18:40 -0700 [Broker,2,10.200.10.34] Buildslave linbot1 attached to slave_builder
2008/02/25 18:40 -0700 [-] maybeStartBuild <Builder 'slave_builder' at -1222704148>: [] [<SlaveBuilder builder=slave_builder slave=linbot1>]
2008/02/25 18:40 -0700 [Broker,3,10.200.10.34] duplicate slave linbot1 replacing old one
}}}
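
To see how often the master replaces a connection, one option is to count
the "duplicate slave" messages in its twistd.log.  The following is only a
rough sketch: the function name count_duplicate_slaves is made up for
illustration, and the pattern assumes the message wording shown in the
excerpt above.  The \s+ in the pattern tolerates a line wrap between the
slave name and "replacing".

```python
import re
from collections import Counter

# Assumed message format, taken from the excerpt above:
#   "... duplicate slave linbot1 replacing old one"
DUPLICATE_RE = re.compile(r"duplicate slave (\S+)\s+replacing old one")


def count_duplicate_slaves(log_text):
    """Return a Counter mapping slave name -> number of duplicate-slave events."""
    # findall() returns the captured slave names, one per matching message.
    return Counter(DUPLICATE_RE.findall(log_text))


if __name__ == "__main__":
    import sys

    # Usage (hypothetical path): python count_dups.py /path/to/master/twistd.log
    with open(sys.argv[1]) as f:
        for name, count in count_duplicate_slaves(f.read()).items():
            print("%s: %d duplicate connection(s)" % (name, count))
```

A count that keeps growing for one slave would suggest a reconnect loop
rather than a one-off network hiccup.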

I'm curious about the line "duplicate slave linbot1 replacing old one".  It
comes from BuildSlave.attach() in buildslave.py, and here is what the code
comment there has to say:
            # uh-oh, we've got a duplicate slave. The most likely
            # explanation is that the slave is behind a slow link, thinks we
            # went away, and has attempted to reconnect, so we've got two
            # "connections" from the same slave, but the previous one is
            # stale. Give the new one precedence.
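
The policy that comment describes (a second connection under the same slave
name wins, and the stale one is dropped) can be illustrated roughly as
follows.  This is NOT Buildbot's actual code: SlaveRegistry and
FakeConnection are hypothetical names invented for this sketch, and the real
attach() does considerably more.

```python
class FakeConnection:
    """Hypothetical stand-in for a slave connection (not a Buildbot class)."""

    def __init__(self, port):
        self.port = port
        self.connected = True

    def disconnect(self):
        self.connected = False


class SlaveRegistry:
    """Hypothetical registry that gives the newest connection precedence."""

    def __init__(self):
        self.connections = {}  # slave name -> current connection
        self.messages = []     # log-style messages, kept for inspection

    def attach(self, name, conn):
        old = self.connections.get(name)
        if old is not None and old is not conn:
            # Same slave name seen twice: assume the old link is stale.
            self.messages.append(
                "duplicate slave %s replacing old one" % name)
            old.disconnect()
        self.connections[name] = conn


registry = SlaveRegistry()
old_conn = FakeConnection(58881)
new_conn = FakeConnection(53300)
registry.attach("linbot1", old_conn)
registry.attach("linbot1", new_conn)  # old_conn is disconnected here
```

Under this policy, a slave that keeps reconnecting will keep displacing its
own previous connection, which would match the repeated "duplicate slave"
messages in the log.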

I'm out of time today.  I'll start here tomorrow: maybe linbot1 perceives the
master as going away.  If not, I'll have to break out Wireshark.

-- 
Aaron Maxwell .:. amax at snaplogic.org .:. http://snaplogic.org
SnapLogic, Inc. - Data Integration for the Last Mile



