[Buildbot-devel] removing running builders

Elmir Jagudin elmir.jagudin at axis.com
Thu Feb 13 14:43:53 UTC 2014


What is supposed to happen when you remove or rename a builder that 
currently has a build running on a slave?

Currently, it seems that the slave tries to abort the build, but fails. 
The build then appears to be listed as running on that slave 
indefinitely. If the slave is configured with max_builds=1, no new 
builds will be dispatched to the slave.
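
The blocking effect described above can be sketched with a toy model of 
per-slave slot accounting. This is only an illustration under stated 
assumptions: the SlaveSlots class and its method names are hypothetical 
and are not Buildbot's actual implementation; only max_builds and the 
builder name 'hellox' come from the setup described here.

```python
class SlaveSlots(object):
    """Hypothetical stand-in for per-slave build accounting (not Buildbot code)."""

    def __init__(self, max_builds):
        self.max_builds = max_builds
        self.running = set()

    def can_start_build(self):
        # A new build is dispatched only while a slot is free.
        return len(self.running) < self.max_builds

    def build_started(self, name):
        self.running.add(name)

    def build_finished(self, name):
        self.running.discard(name)


slots = SlaveSlots(max_builds=1)
slots.build_started("hellox")

# Assumption: after the rename, the completion of the 'hellox' build is
# never recorded, so build_finished("hellox") is never called and the
# only slot stays occupied.
blocked = not slots.can_start_build()
```

In this model the slave stays blocked until something clears the stale 
entry, which would be consistent with a slave restart freeing it again.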

Below are the contents of the slave's twisted.log when a builder 
'hellox' is renamed to 'helloy' in master.cfg and 'buildbot reconfig' 
is run:

2014-02-13 15:34:10+0100 [Broker,client] stopCommand: halting current 
command <buildslave.commands.shell.SlaveShellCommand instance at 0xa4111cc>
2014-02-13 15:34:10+0100 [Broker,client] command interrupted, attempting 
to kill
2014-02-13 15:34:10+0100 [Broker,client] trying to kill process group 20112
2014-02-13 15:34:10+0100 [Broker,client]  signal 9 sent successfully
2014-02-13 15:34:10+0100 [Broker,client] I have a leftover directory 
'hellox' that is not being used by the buildmaster: you can delete it now
2014-02-13 15:34:10+0100 [-] command finished with signal 9, exit code 
None, elapsedTime: 9.045442
2014-02-13 15:34:10+0100 [-] would sendStatus but not .running
2014-02-13 15:34:10+0100 [-] SlaveBuilder.commandComplete None
2014-02-13 15:34:10+0100 [-]  but we weren't running, quitting silently
2014-02-13 15:34:10+0100 [Broker,client] 
SlaveBuilder.remote_print(helloy): message from master: attached

It is possible to restore the slave with 'buildslave restart'. However, 
the following is printed in the master's twisted.log when the slave 
reconnects:

2014-02-13 15:37:55+0100 [Broker,0,] 
2014-02-13 15:37:55+0100 [Broker,0,] releaseLocks(<BuildSlave 
'example-slave'>): []
2014-02-13 15:37:55+0100 [Broker,0,] Buildslave example-slave 
detached from helloy
2014-02-13 15:37:55+0100 [Broker,0,] <Build hellox>.lostRemote
2014-02-13 15:37:55+0100 [Broker,0,]  stopping currentStep 
<buildbot.steps.shell.ShellCommand object at 0xb2e2aec>
2014-02-13 15:37:55+0100 [Broker,0,] addCompleteLog(interrupt)
2014-02-13 15:37:55+0100 [Broker,0,] RemoteCommand.interrupt 
<RemoteShellCommand '['sleep', '180']'> [Failure instance: Traceback 
(failure with no frames): <class 
'twisted.internet.error.ConnectionLost'>: Connection to the other side 
was lost in a non-clean fashion.
2014-02-13 15:37:55+0100 [Broker,0,] RemoteCommand.disconnect: 
lost slave
2014-02-13 15:37:55+0100 [Broker,0,] 
releaseLocks(<buildbot.steps.shell.ShellCommand object at 0xb2e2aec>): []
2014-02-13 15:37:55+0100 [Broker,0,]  step 'shell' complete: retry
2014-02-13 15:37:55+0100 [Broker,0,]  <Build hellox>: build 
2014-02-13 15:37:55+0100 [Broker,0,] from a running build; this 
is a serious error - please file a bug at http://buildbot.net
     Traceback (most recent call last):
       File "/home/elmir/remblds/src/master/buildbot/process/build.py", 
line 519, in allStepsDone
         return self.buildFinished(text, self.result)
       File "/home/elmir/remblds/src/master/buildbot/process/build.py", 
line 558, in buildFinished
line 382, in callback
line 490, in _startRunCallbacks
     --- <exception caught here> ---
line 577, in _runCallbacks
         current.result = callback(current.result, *args, **kw)
"/home/elmir/remblds/src/master/buildbot/process/builder.py", line 455, 
in buildFinished
         d = self.master.db.builds.finishBuilds(bids)
     exceptions.AttributeError: 'NoneType' object has no attribute 'db'

2014-02-13 15:37:55+0100 [Broker,1,] slave 'example-slave' 
attaching from IPv4Address(TCP, '', 54965)
2014-02-13 15:37:55+0100 [Broker,1,] Got slaveinfo from 
2014-02-13 15:37:55+0100 [Broker,1,] Starting buildslave 
keepalive timer for 'example-slave'
2014-02-13 15:37:55+0100 [Broker,1,] bot attached
2014-02-13 15:37:57+0100 [Broker,1,] Buildslave example-slave 
attached to helloy
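
The AttributeError in the traceback indicates that self.master was None 
when buildFinished ran on the old 'hellox' builder. A minimal sketch of 
that failure mode follows. It assumes (this is not confirmed from the 
Buildbot source) that removing a builder during reconfig clears its 
reference to the master while a build is still in flight. FakeDb, 
Master, Builder, detach and guarded_build_finished are hypothetical 
stand-ins, not Buildbot classes or a proposed patch.

```python
class FakeDb(object):
    """Hypothetical stand-in for the master's database connector."""

    def __init__(self):
        self.finished = []

    def finish_builds(self, bids):
        self.finished.extend(bids)


class Master(object):
    def __init__(self):
        self.db = FakeDb()


class Builder(object):
    def __init__(self, name, master):
        self.name = name
        self.master = master

    def detach(self):
        # Assumption: a reconfig that drops this builder clears the
        # back-reference to the master, even if a build is still running.
        self.master = None

    def build_finished(self, bids):
        # Mirrors the shape of the failing line in the traceback:
        # attribute access on self.master, which may be None by now.
        return self.master.db.finish_builds(bids)


def guarded_build_finished(builder, bids):
    """Report whether the result could be recorded instead of raising."""
    if builder.master is None:
        return False
    builder.build_finished(bids)
    return True


master = Master()
old_builder = Builder("hellox", master)
old_builder.detach()  # 'hellox' renamed to 'helloy' and reconfigured

try:
    old_builder.build_finished([1])
    error = None
except AttributeError as exc:
    error = exc

recorded = guarded_build_finished(old_builder, [1])
```

The guard only shows where a check could sit. Whether a late build 
result should be dropped or still written to the database is a separate 
design question.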

