[Buildbot-devel] Losing Contact with Slave

Roy S. Rapoport buildbot-devel at ols.inorganic.org
Wed Nov 8 07:33:26 UTC 2006


I restarted my master to take in some code changes; there were problems
with the code, so I got some exceptions.  A few minutes later, I finally
got everything working.  

Interestingly, most of my slaves are now in the 'unconnected' state, even
though it's been a little more than 10 minutes since the master came back
up.  One slave's log looks like this:

---
2006/11/07 22:31 PST [-] sending app-level keepalive
2006/11/07 22:41 PST [-] sending app-level keepalive
2006/11/07 22:51 PST [-] sending app-level keepalive
2006/11/07 23:00 PST [Broker,client] lost remote
2006/11/07 23:00 PST [Broker,client] lost remote
2006/11/07 23:00 PST [Broker,client] lost remote
2006/11/07 23:00 PST [Broker,client] lost remote
2006/11/07 23:00 PST [Broker,client] lost remote
2006/11/07 23:00 PST [Broker,client] lost remote
2006/11/07 23:00 PST [Broker,client] lost remote
2006/11/07 23:00 PST [Broker,client] lost remote
2006/11/07 23:00 PST [Broker,client] lost remote
2006/11/07 23:00 PST [Broker,client] lost remote
2006/11/07 23:00 PST [Broker,client] lost remote
2006/11/07 23:00 PST [Broker,client] lost remote
2006/11/07 23:00 PST [Broker,client] <twisted.internet.tcp.Connector instance at 0xb7182eac> will retry in 2 seconds
2006/11/07 23:00 PST [Broker,client] Stopping factory <buildbot.slave.bot.BotFactory instance at 0xb718254c>
2006/11/07 23:00 PST [-] Starting factory <buildbot.slave.bot.BotFactory instance at 0xb718254c>
2006/11/07 23:00 PST [Uninitialized] <twisted.internet.tcp.Connector instance at 0xb7182eac> will retry in 7 seconds
2006/11/07 23:00 PST [Uninitialized] Stopping factory <buildbot.slave.bot.BotFactory instance at 0xb718254c>
2006/11/07 23:00 PST [-] Starting factory <buildbot.slave.bot.BotFactory instance at 0xb718254c>
2006/11/07 23:00 PST [Uninitialized] <twisted.internet.tcp.Connector instance at 0xb7182eac> will retry in 18 seconds
2006/11/07 23:00 PST [Uninitialized] Stopping factory <buildbot.slave.bot.BotFactory instance at 0xb718254c>
2006/11/07 23:01 PST [-] Starting factory <buildbot.slave.bot.BotFactory instance at 0xb718254c>
2006/11/07 23:01 PST [Uninitialized] <twisted.internet.tcp.Connector instance at 0xb7182eac> will retry in 48 seconds
2006/11/07 23:01 PST [Uninitialized] Stopping factory <buildbot.slave.bot.BotFactory instance at 0xb718254c>
2006/11/07 23:01 PST [-] Starting factory <buildbot.slave.bot.BotFactory instance at 0xb718254c>
2006/11/07 23:01 PST [Uninitialized] <twisted.internet.tcp.Connector instance at 0xb7182eac> will retry in 102 seconds
2006/11/07 23:01 PST [Uninitialized] Stopping factory <buildbot.slave.bot.BotFactory instance at 0xb718254c>
2006/11/07 23:03 PST [-] Starting factory <buildbot.slave.bot.BotFactory instance at 0xb718254c>
---

The master came up at about 23:20; I'm writing this at about 23:31, and I've
seen no further activity on the slave since 23:03.  Has the slave given up
on trying to contact the master at this point, so that I have to log into
each of my slaves to restart them?
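
For reference, the retry intervals in the log above (2, 7, 18, 48, 102
seconds) grow by a factor of roughly e each time, which looks like an
exponential backoff with some random jitter.  The sketch below computes such
a schedule without the jitter.  The parameter values (initial delay of 1
second, factor e, cap of 3600 seconds) are assumptions modelled on the
defaults of Twisted's ReconnectingClientFactory; the log does not show what
the slave is actually configured with, and backoff_schedule is a
hypothetical helper, not part of Buildbot or Twisted.

```python
import math


def backoff_schedule(initial_delay=1.0, factor=math.e,
                     max_delay=3600.0, attempts=8):
    """Return nominal retry delays in seconds, ignoring jitter.

    All defaults are assumptions for illustration; they are not read
    from the slave's configuration.
    """
    delays = []
    delay = initial_delay
    for _ in range(attempts):
        # Each failed attempt multiplies the delay, up to the cap.
        delay = min(delay * factor, max_delay)
        delays.append(delay)
    return delays


# Print each nominal delay and the cumulative time since the first failure.
elapsed = 0.0
for attempt, delay in enumerate(backoff_schedule(), start=1):
    elapsed += delay
    print("attempt %d: wait %.0f s (%.1f min since connection lost)"
          % (attempt, delay, elapsed / 60.0))
```

Under these assumptions the gaps between attempts reach several minutes
after five or six failures, so a quiet stretch in the log would not by
itself show that the slave has stopped retrying.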

-roy



