[Buildbot-devel] lost remote - slave cannot keep connection with master
Aaron Maxwell
amax at snaplogic.org
Tue Feb 26 01:02:19 UTC 2008
On Monday 25 February 2008 10:51:12 Aaron Maxwell wrote:
> Hi all,
> I'm suddenly having a problem today with my buildbot installation. We have
Some more info: the log excerpt above is from the slave's twistd.log. Here are
the corresponding log messages from the master's twistd.log:
{{{
2008/02/25 18:40 -0700 [Broker,2,10.200.10.34] duplicate slave linbot1
replacing old one
2008/02/25 18:40 -0700 [Broker,2,10.200.10.34] old slave was connected from
IPv4Address(TCP, '10.200.10.34', 58881)
2008/02/25 18:40 -0700 [Broker,2,10.200.10.34] new slave is from
IPv4Address(TCP, '10.200.10.34', 53300)
2008/02/25 18:40 -0700 [Broker,2,10.200.10.34] disconnecting old slave linbot1
now
2008/02/25 18:40 -0700 [Broker,2,10.200.10.34] waiting for slave to finish
disconnecting
2008/02/25 18:40 -0700 [Broker,0,10.200.10.34] <Builder 'slave_builder'
at -1222704148>.detached linbot1
2008/02/25 18:40 -0700 [Broker,0,10.200.10.34] Buildslave linbot1 detached
from slave_builder
2008/02/25 18:40 -0700 [Broker,0,10.200.10.34] <BuildSlave 'linbot1', current
builders: slave_builder> removing <SlaveBuilder builder=slave_builder
slave=li
2008/02/25 18:40 -0700 [Broker,0,10.200.10.34] BuildSlave.detached(linbot1)
2008/02/25 18:40 -0700 [Broker,2,10.200.10.34] Got slaveinfo from 'linbot1'
2008/02/25 18:40 -0700 [Broker,2,10.200.10.34] bot attached
2008/02/25 18:40 -0700 [Broker,2,10.200.10.34] <BuildSlave 'linbot1', current
builders: slave_builder> adding <SlaveBuilder builder=slave_builder
slave=linb
2008/02/25 18:40 -0700 [Broker,2,10.200.10.34] Buildslave linbot1 attached to
slave_builder
2008/02/25 18:40 -0700 [-] maybeStartBuild <Builder 'slave_builder'
at -1222704148>: [] [<SlaveBuilder builder=slave_builder slave=linbot1>]
2008/02/25 18:40 -0700 [Broker,3,10.200.10.34] duplicate slave linbot1
replacing old one
}}}
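To see how often this happens across a whole twistd.log, something like the following rough sketch could count the "duplicate slave" events and pull out the source ports. It is not part of Buildbot; the names STAMP, unwrap, summarize and SAMPLE are made up for illustration, and the timestamp pattern only assumes the format seen in the excerpt above. The unwrap step is there because the mail client wrapped long entries onto a second line; a log read straight from disk may not need it.

```python
import re

# Timestamp prefix as it appears in the excerpt, e.g. "2008/02/25 18:40 -0700 ".
# Assumption: every real log entry starts with this prefix.
STAMP = re.compile(r"^\d{4}/\d{2}/\d{2} \d{2}:\d{2} [+-]\d{4} ")

# Matches the peer address printed by the master and captures the port.
PEER = re.compile(r"IPv4Address\(TCP, '[\d.]+', (\d+)\)")


def unwrap(text):
    """Rejoin entries that were wrapped onto a continuation line."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if STAMP.match(line) or not entries:
            entries.append(line)          # start of a new entry
        else:
            entries[-1] += " " + line     # continuation of the previous entry
    return entries


def summarize(text):
    """Return (number of duplicate-slave events, list of peer ports seen)."""
    entries = unwrap(text)
    duplicates = [e for e in entries if "duplicate slave" in e]
    ports = [int(p) for e in entries for p in PEER.findall(e)]
    return len(duplicates), ports


# A few lines copied from the master log excerpt, wrapping included.
SAMPLE = """\
2008/02/25 18:40 -0700 [Broker,2,10.200.10.34] duplicate slave linbot1
replacing old one
2008/02/25 18:40 -0700 [Broker,2,10.200.10.34] old slave was connected from
IPv4Address(TCP, '10.200.10.34', 58881)
2008/02/25 18:40 -0700 [Broker,2,10.200.10.34] new slave is from
IPv4Address(TCP, '10.200.10.34', 53300)
2008/02/25 18:40 -0700 [Broker,3,10.200.10.34] duplicate slave linbot1
replacing old one
"""

if __name__ == "__main__":
    count, ports = summarize(SAMPLE)
    print(count, ports)
```

Run against the full master log, a steadily growing count with a new source port each time would suggest the slave keeps opening fresh connections rather than the master dropping a healthy one.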
I'm curious about that line "duplicate slave linbot1 replacing old one". It's
from BuildSlave.attach() in buildslave.py, and here is what the code comments
have to say:
# uh-oh, we've got a duplicate slave. The most likely
# explanation is that the slave is behind a slow link, thinks we
# went away, and has attempted to reconnect, so we've got two
# "connections" from the same slave, but the previous one is
# stale. Give the new one precedence.
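As I read that comment, the policy is "newest connection wins". Here is a toy model of that policy, only to make the sequence in the master log easier to follow. It is not the code from buildslave.py: the SlaveRegistry class, its attach() method and the (host, port) tuples are all hypothetical.

```python
class SlaveRegistry:
    """Toy model: one live connection per slave name, newest one wins."""

    def __init__(self):
        self.connections = {}   # slave name -> (host, port) of the live connection
        self.events = []        # human-readable trace of what happened

    def attach(self, name, peer):
        """Register a connection; return the connection it replaced, if any."""
        old = self.connections.get(name)
        if old is not None:
            # Same slave name seen twice: assume the old connection is stale
            # and give the new one precedence.
            self.events.append("duplicate slave %s replacing old one" % name)
            self.events.append("old slave was connected from %s:%d" % old)
            self.events.append("new slave is from %s:%d" % peer)
        self.connections[name] = peer
        return old


if __name__ == "__main__":
    registry = SlaveRegistry()
    registry.attach("linbot1", ("10.200.10.34", 58881))
    replaced = registry.attach("linbot1", ("10.200.10.34", 53300))
    print(replaced)
    for event in registry.events:
        print(event)
```

With this policy the master never rejects the reconnect, so a slave that wrongly believes the master went away will loop through attach and replace indefinitely, which matches the repeated "duplicate slave" lines in the log.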
I'm out of time today. I'll start here tomorrow; maybe linbot1 perceives the
master as going away. If not, I'll have to break out Wireshark.
--
Aaron Maxwell .:. amax at snaplogic.org .:. http://snaplogic.org
SnapLogic, Inc. - Data Integration for the Last Mile