[Buildbot-commits] [Buildbot] #2701: problems removing running builders
Buildbot trac
trac at buildbot.net
Tue Feb 18 15:48:07 UTC 2014
#2701: problems removing running builders
-------------------------+-----------------------
Reporter: elmirjagudin | Owner:
Type: undecided | Status: new
Priority: major | Milestone: undecided
Version: 0.8.8 | Keywords:
-------------------------+-----------------------
Buildbot does not seem to handle removal and renaming of running builders
properly.
When a builder is removed from master.cfg while it is running on some
slave, buildbot appears to try to abort it, but the abort does not complete
cleanly: the builder stays listed as running on that slave indefinitely.
The same issue occurs when a builder is renamed, because buildbot removes
the builder with the old name and adds a builder with the new name.
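For anyone trying to reproduce this, a master.cfg fragment along the
following lines should be enough. It is only a sketch, not the reporter's
actual configuration: the builder name, slave name, and 'sleep 180' command
are taken from the logs in this ticket, everything else (slaves, schedulers,
ports) is omitted and assumed to be set up as usual for 0.8.8.

```python
# master.cfg fragment (sketch; not the reporter's actual configuration)
from buildbot.config import BuilderConfig
from buildbot.process.factory import BuildFactory
from buildbot.steps.shell import ShellCommand

c = BuildmasterConfig = {}

# A single long-running step, so the build is still in progress
# when 'buildbot reconfig' is issued.
factory = BuildFactory()
factory.addStep(ShellCommand(command=["sleep", "180"]))

c['builders'] = [
    # Start a build, then rename "hellox" to "helloy" (or delete the
    # entry) and run 'buildbot reconfig' while the step is running.
    BuilderConfig(name="hellox", slavenames=["example-slave"],
                  factory=factory),
]
```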
Below are the contents of the slave's twisted.log when a builder 'hellox'
is renamed to 'helloy' in master.cfg and 'buildbot reconfig' is run:
{{{
2014-02-13 15:34:10+0100 [Broker,client] stopCommand: halting current
command <buildslave.commands.shell.SlaveShellCommand instance at
0xa4111cc>
2014-02-13 15:34:10+0100 [Broker,client] command interrupted, attempting
to kill
2014-02-13 15:34:10+0100 [Broker,client] trying to kill process group
20112
2014-02-13 15:34:10+0100 [Broker,client] signal 9 sent successfully
2014-02-13 15:34:10+0100 [Broker,client] I have a leftover directory
'hellox' that is not being used by the buildmaster: you can delete it now
2014-02-13 15:34:10+0100 [-] command finished with signal 9, exit code
None, elapsedTime: 9.045442
2014-02-13 15:34:10+0100 [-] would sendStatus but not .running
2014-02-13 15:34:10+0100 [-] SlaveBuilder.commandComplete None
2014-02-13 15:34:10+0100 [-] but we weren't running, quitting silently
2014-02-13 15:34:10+0100 [Broker,client]
SlaveBuilder.remote_print(helloy): message from master: attached
}}}
It is possible to restore the slave with 'buildslave restart'; however, the
following is printed in the master's twisted.log when the slave reconnects:
{{{
2014-02-13 15:37:55+0100 [Broker,0,127.0.0.1]
BuildSlave.detached(example-slave)
2014-02-13 15:37:55+0100 [Broker,0,127.0.0.1] releaseLocks(<BuildSlave
'example-slave'>): []
2014-02-13 15:37:55+0100 [Broker,0,127.0.0.1] Buildslave example-slave
detached from helloy
2014-02-13 15:37:55+0100 [Broker,0,127.0.0.1] <Build hellox>.lostRemote
2014-02-13 15:37:55+0100 [Broker,0,127.0.0.1] stopping currentStep
<buildbot.steps.shell.ShellCommand object at 0xb2e2aec>
2014-02-13 15:37:55+0100 [Broker,0,127.0.0.1] addCompleteLog(interrupt)
2014-02-13 15:37:55+0100 [Broker,0,127.0.0.1] RemoteCommand.interrupt
<RemoteShellCommand '['sleep', '180']'> [Failure instance: Traceback
(failure with no frames): <class
'twisted.internet.error.ConnectionLost'>: Connection to the other side
was lost in a non-clean fashion.
]
2014-02-13 15:37:55+0100 [Broker,0,127.0.0.1] RemoteCommand.disconnect:
lost slave
2014-02-13 15:37:55+0100 [Broker,0,127.0.0.1]
releaseLocks(<buildbot.steps.shell.ShellCommand object at 0xb2e2aec>): []
2014-02-13 15:37:55+0100 [Broker,0,127.0.0.1] step 'shell' complete:
retry
2014-02-13 15:37:55+0100 [Broker,0,127.0.0.1] <Build hellox>: build
finished
2014-02-13 15:37:55+0100 [Broker,0,127.0.0.1] from a running build; this
is a serious error - please file a bug at http://buildbot.net
Traceback (most recent call last):
  File "/home/elmir/remblds/src/master/buildbot/process/build.py", line 519, in allStepsDone
    return self.buildFinished(text, self.result)
  File "/home/elmir/remblds/src/master/buildbot/process/build.py", line 558, in buildFinished
    self.deferred.callback(self)
  File "/home/elmir/remblds/sandbox/local/lib/python2.7/site-packages/twisted/internet/defer.py", line 382, in callback
    self._startRunCallbacks(result)
  File "/home/elmir/remblds/sandbox/local/lib/python2.7/site-packages/twisted/internet/defer.py", line 490, in _startRunCallbacks
    self._runCallbacks()
--- <exception caught here> ---
  File "/home/elmir/remblds/sandbox/local/lib/python2.7/site-packages/twisted/internet/defer.py", line 577, in _runCallbacks
    current.result = callback(current.result, *args, **kw)
  File "/home/elmir/remblds/src/master/buildbot/process/builder.py", line 455, in buildFinished
    d = self.master.db.builds.finishBuilds(bids)
exceptions.AttributeError: 'NoneType' object has no attribute 'db'
2014-02-13 15:37:55+0100 [Broker,1,127.0.0.1] slave 'example-slave'
attaching from IPv4Address(TCP, '127.0.0.1', 54965)
2014-02-13 15:37:55+0100 [Broker,1,127.0.0.1] Got slaveinfo from
'example-slave'
2014-02-13 15:37:55+0100 [Broker,1,127.0.0.1] Starting buildslave
keepalive timer for 'example-slave'
2014-02-13 15:37:55+0100 [Broker,1,127.0.0.1] bot attached
2014-02-13 15:37:57+0100 [Broker,1,127.0.0.1] Buildslave example-slave
attached to helloy
}}}
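The AttributeError at the end of the traceback means that self.master was
None when the old 'hellox' builder's buildFinished ran. That would be
consistent with the builder having been detached from the master during
reconfig while its build was still in flight, though this ticket does not
confirm the cause. The following is a minimal, self-contained sketch of that
failure mode, not Buildbot's actual code: FakeMaster, FakeBuilder, and the
guard in build_finished_guarded are hypothetical stand-ins for illustration.

```python
class FakeBuildsDB(object):
    """Stand-in for master.db.builds; records finished build ids."""
    def __init__(self):
        self.finished = []

    def finishBuilds(self, bids):
        self.finished.extend(bids)


class FakeDB(object):
    def __init__(self):
        self.builds = FakeBuildsDB()


class FakeMaster(object):
    def __init__(self):
        self.db = FakeDB()


class FakeBuilder(object):
    """Hypothetical builder that holds a reference to its master."""
    def __init__(self, name, master):
        self.name = name
        self.master = master

    def remove_from_master(self):
        # Assumption: removing or renaming a builder on reconfig
        # leaves the old object with master set to None.
        self.master = None

    def build_finished(self, bids):
        # Same access pattern as the failing line in the traceback.
        self.master.db.builds.finishBuilds(bids)

    def build_finished_guarded(self, bids):
        # One possible guard (illustrative only): skip the database
        # update when the builder no longer has a master.
        if self.master is None:
            return False
        self.master.db.builds.finishBuilds(bids)
        return True


def demo():
    master = FakeMaster()
    builder = FakeBuilder("hellox", master)
    builder.remove_from_master()  # reconfig happens mid-build
    try:
        builder.build_finished([1])
        error = None
    except AttributeError as e:
        error = str(e)
    guarded = builder.build_finished_guarded([1])
    return error, guarded, master.db.builds.finished
```

Calling demo() shows the unguarded path raising the same kind of
AttributeError as in the log, while the guarded path returns without
touching the database. Whether silently skipping the update is the right
behaviour for a real fix is a separate question; the build would still need
to be marked finished somewhere.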
--
Ticket URL: <http://trac.buildbot.net/ticket/2701>
Buildbot <http://buildbot.net/>
Buildbot: build/test automation