[Buildbot-devel] Buildbot slave quietly hung after "BuildSlave._commands_unavailable" on master?

Dan Kegel dank at kegel.com
Wed Apr 3 18:04:00 UTC 2013


I'm running buildbot-0.8.7 from git with the fix from
http://trac.buildbot.net/ticket/2427
patched in.

One of the six buildslaves stopped taking jobs, and the last line in its
log file is from a day or so ago:

2013-04-02 17:56:49+0000 [Broker,client] message from master: master
already has a connection named 'g-speak-ubu12-bgarden-01-ubu1204' -
checking its liveness

Restarting the buildslave results in the same message.

On the master side, I see an interesting error at about the same time; see
the master log below.
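
Note that the two logs use different UTC offsets (the slave logs in +0000, the master in -0700), so the timestamps are easier to compare once normalized. Here is a minimal Python sketch of that comparison, assuming only the timestamp format seen in these logs. parse_log_time is a made-up helper name for illustration, not part of buildbot or Twisted.

```python
from datetime import datetime, timezone

# Timestamp layout as it appears in the log lines quoted in this message.
LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S%z"


def parse_log_time(stamp):
    """Parse a log timestamp that carries a numeric UTC offset (hypothetical helper)."""
    return datetime.strptime(stamp, LOG_TIME_FORMAT)


# Last line in the slave log, and the "Unhandled Error" line in the master log.
slave_time = parse_log_time("2013-04-02 17:56:49+0000")
master_time = parse_log_time("2013-04-02 10:56:48-0700")

# Normalize the master timestamp to UTC and measure the gap in seconds.
master_utc = master_time.astimezone(timezone.utc)
delta_seconds = (slave_time - master_time).total_seconds()

print(master_utc.isoformat())
print(delta_seconds)
```

With the two timestamps above, the gap comes out to one second, which is what "about the same time" refers to.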

Restarting the master made things happy again.
So I'm not stuck, just reporting a curious problem.
- Dan

2013-04-02 10:56:48-0700 [Broker,28,10.10.150.5] slave
'g-speak-ubu12-bgarden-01-ubu1204' attaching from IPv4Address(TCP,
'10.10.150.5', 41267)
2013-04-02 10:56:48-0700 [Broker,28,10.10.150.5] Starting buildslave
keepalive timer for 'g-speak-ubu12-bgarden-01-ubu1204'
2013-04-02 10:56:48-0700 [Broker,28,10.10.150.5] Got slaveinfo from
'g-speak-ubu12-bgarden-01-ubu1204'
2013-04-02 10:56:48-0700 [Broker,28,10.10.150.5]
BuildSlave._commands_unavailable
2013-04-02 10:56:48-0700 [Broker,28,10.10.150.5] Unhandled Error
Traceback (most recent call last):
Failure: twisted.spread.pb.PBConnectionLost: [Failure instance: Traceback
(failure with no frames): <class 'twisted.internet.error.ConnectionDone'>:
Connection was closed cleanly.
]
2013-04-02 10:56:48-0700 [Broker,28,10.10.150.5] bot attached
2013-04-02 10:56:48-0700 [Broker,28,10.10.150.5] Peer will receive
following PB traceback:
2013-04-02 10:56:48-0700 [Broker,28,10.10.150.5] Unhandled Error
Traceback (most recent call last):
  File
"/home/buildbot/master-state/sandbox/local/lib/python2.7/site-packages/twisted/internet/tcp.py",
line 299, in connectionLost
    protocol.connectionLost(reason)
  File
"/home/buildbot/master-state/sandbox/local/lib/python2.7/site-packages/twisted/spread/pb.py",
line 630, in connectionLost
    d.errback(failure.Failure(PBConnectionLost(reason)))
  File
"/home/buildbot/master-state/sandbox/local/lib/python2.7/site-packages/twisted/internet/defer.py",
line 422, in errback
    self._startRunCallbacks(fail)
  File
"/home/buildbot/master-state/sandbox/local/lib/python2.7/site-packages/twisted/internet/defer.py",
line 489, in _startRunCallbacks
    self._runCallbacks()
--- <exception caught here> ---
  File
"/home/buildbot/master-state/sandbox/local/lib/python2.7/site-packages/twisted/internet/defer.py",
line 576, in _runCallbacks
    current.result = callback(current.result, *args, **kw)
  File
"/home/buildbot/master-state/sandbox/buildbot-git/master/buildbot/buildslave.py",
line 453, in _accept_slave
    return self.updateSlave()
  File
"/home/buildbot/master-state/sandbox/buildbot-git/master/buildbot/buildslave.py",
line 339, in updateSlave
    return self.sendBuilderList()
  File
"/home/buildbot/master-state/sandbox/buildbot-git/master/buildbot/buildslave.py",
line 714, in sendBuilderList
    d = AbstractBuildSlave.sendBuilderList(self)
  File
"/home/buildbot/master-state/sandbox/buildbot-git/master/buildbot/buildslave.py",
line 560, in sendBuilderList
    d = self.slave.callRemote("setBuilderList", blist)
  File
"/home/buildbot/master-state/sandbox/local/lib/python2.7/site-packages/twisted/spread/pb.py",
line 345, in callRemote
    _name, args, kw)
  File
"/home/buildbot/master-state/sandbox/local/lib/python2.7/site-packages/twisted/spread/pb.py",
line 858, in _sendMessage
    raise DeadReferenceError("Calling Stale Broker")
twisted.spread.pb.DeadReferenceError: Calling Stale Broker
2013-04-02 10:56:48-0700 [-] gitpoller: processing 0 changes: [] from
"git.oblong.com:/ob/git/repo/gst-oblong.git"
2013-04-02 10:56:48-0700 [-] gitpoller: processing 0 changes: [] from "https://github.com/Oblong/libdogleg.git"
2013-04-02 10:56:49-0700 [Broker,29,10.10.150.5] duplicate slave
g-speak-ubu12-bgarden-01-ubu1204; delaying new slave (IPv4Address(TCP,
'10.10.150.5', 41269)) and pinging old (IPv4Address(TCP, '10.10.150.5',
41267))
2013-04-02 10:56:49-0700 [Broker,29,10.10.150.5] connection lost while
pinging old slave 'g-speak-ubu12-bgarden-01-ubu1204' - keeping new slave

...

2013-04-03 10:49:29-0700 [Broker,57,10.10.150.5] duplicate slave
g-speak-ubu12-bgarden-01-ubu1204; delaying new slave (IPv4Address(TCP,
'10.10.150.5', 54863)) and pinging old (IPv4Address(TCP, '10.10.150.5',
41267))
2013-04-03 10:49:29-0700 [Broker,57,10.10.150.5] connection lost while
pinging old slave 'g-speak-ubu12-bgarden-01-ubu1204' - keeping new slave
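
One thing that stands out in the master log is that the "old" connection in the duplicate-slave messages is on port 41267 both times, a day apart. Here is a rough Python sketch for pulling that out of a master log, assuming the wrapped log lines have first been joined back into single lines. DUPLICATE_RE and old_ports are names invented for this example, and the pattern is only matched against the message text quoted above.

```python
import re

# Pattern for the "duplicate slave" message as quoted in this report.
DUPLICATE_RE = re.compile(
    r"duplicate slave (?P<name>[^;\s]+); delaying new slave "
    r"\(IPv4Address\(TCP, '(?P<new_host>[^']+)', (?P<new_port>\d+)\)\) "
    r"and pinging old "
    r"\(IPv4Address\(TCP, '(?P<old_host>[^']+)', (?P<old_port>\d+)\)\)"
)


def old_ports(log_lines):
    """Collect the ports of the 'old' connections named in duplicate-slave lines."""
    ports = set()
    for line in log_lines:
        match = DUPLICATE_RE.search(line)
        if match:
            ports.add(int(match.group("old_port")))
    return ports


# The two duplicate-slave entries from the log above, unwrapped.
sample = [
    "2013-04-02 10:56:49-0700 [Broker,29,10.10.150.5] duplicate slave "
    "g-speak-ubu12-bgarden-01-ubu1204; delaying new slave (IPv4Address(TCP, "
    "'10.10.150.5', 41269)) and pinging old (IPv4Address(TCP, '10.10.150.5', "
    "41267))",
    "2013-04-03 10:49:29-0700 [Broker,57,10.10.150.5] duplicate slave "
    "g-speak-ubu12-bgarden-01-ubu1204; delaying new slave (IPv4Address(TCP, "
    "'10.10.150.5', 54863)) and pinging old (IPv4Address(TCP, '10.10.150.5', "
    "41267))",
]

stale_ports = old_ports(sample)
print(stale_ports)
```

A single port in the result would be consistent with the master still holding on to its record of the original connection, though that is only a guess from the log and not something confirmed here.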