[Buildbot-commits] [Buildbot] #2701: problems removing running builders

Buildbot trac trac at buildbot.net
Tue Feb 18 15:48:07 UTC 2014


#2701: problems removing running builders
-------------------------+-----------------------
Reporter:  elmirjagudin  |      Owner:
    Type:  undecided     |     Status:  new
Priority:  major         |  Milestone:  undecided
 Version:  0.8.8         |   Keywords:
-------------------------+-----------------------
 Buildbot does not seem to handle removal and renaming of running builders
 properly.

 When a builder is removed from master.cfg while it is running a build on
 some slave, buildbot appears to try to abort the build, but the abort does
 not seem to complete. The builder stays listed as running on that slave
 indefinitely. The same issue occurs when a builder is renamed, because
 buildbot removes the builder with the old name and adds a builder with the
 new name.

 Below are the contents of the slave's twisted.log when the builder 'hellox'
 is renamed to 'helloy' in master.cfg and 'buildbot reconfig' is run:


 {{{
 2014-02-13 15:34:10+0100 [Broker,client] stopCommand: halting current
 command <buildslave.commands.shell.SlaveShellCommand instance at
 0xa4111cc>
 2014-02-13 15:34:10+0100 [Broker,client] command interrupted, attempting
 to kill
 2014-02-13 15:34:10+0100 [Broker,client] trying to kill process group
 20112
 2014-02-13 15:34:10+0100 [Broker,client]  signal 9 sent successfully
 2014-02-13 15:34:10+0100 [Broker,client] I have a leftover directory
 'hellox' that is not being used by the buildmaster: you can delete it now
 2014-02-13 15:34:10+0100 [-] command finished with signal 9, exit code
 None, elapsedTime: 9.045442
 2014-02-13 15:34:10+0100 [-] would sendStatus but not .running
 2014-02-13 15:34:10+0100 [-] SlaveBuilder.commandComplete None
 2014-02-13 15:34:10+0100 [-]  but we weren't running, quitting silently
 2014-02-13 15:34:10+0100 [Broker,client]
 SlaveBuilder.remote_print(helloy): message from master: attached
 }}}


 The slave can be restored with 'buildslave restart'; however, the
 following is printed in the master's twisted.log when the slave reconnects:


 {{{
 2014-02-13 15:37:55+0100 [Broker,0,127.0.0.1]
 BuildSlave.detached(example-slave)
 2014-02-13 15:37:55+0100 [Broker,0,127.0.0.1] releaseLocks(<BuildSlave
 'example-slave'>): []
 2014-02-13 15:37:55+0100 [Broker,0,127.0.0.1] Buildslave example-slave
 detached from helloy
 2014-02-13 15:37:55+0100 [Broker,0,127.0.0.1] <Build hellox>.lostRemote
 2014-02-13 15:37:55+0100 [Broker,0,127.0.0.1]  stopping currentStep
 <buildbot.steps.shell.ShellCommand object at 0xb2e2aec>
 2014-02-13 15:37:55+0100 [Broker,0,127.0.0.1] addCompleteLog(interrupt)
 2014-02-13 15:37:55+0100 [Broker,0,127.0.0.1] RemoteCommand.interrupt
 <RemoteShellCommand '['sleep', '180']'> [Failure instance: Traceback
 (failure with no frames): <class
 'twisted.internet.error.ConnectionLost'>: Connection to the other side
 was lost in a non-clean fashion.
      ]
 2014-02-13 15:37:55+0100 [Broker,0,127.0.0.1] RemoteCommand.disconnect:
 lost slave
 2014-02-13 15:37:55+0100 [Broker,0,127.0.0.1]
 releaseLocks(<buildbot.steps.shell.ShellCommand object at 0xb2e2aec>): []
 2014-02-13 15:37:55+0100 [Broker,0,127.0.0.1]  step 'shell' complete:
 retry
 2014-02-13 15:37:55+0100 [Broker,0,127.0.0.1]  <Build hellox>: build
 finished
 2014-02-13 15:37:55+0100 [Broker,0,127.0.0.1] from a running build; this
 is a serious error - please file a bug at http://buildbot.net
      Traceback (most recent call last):
        File "/home/elmir/remblds/src/master/buildbot/process/build.py",
 line 519, in allStepsDone
          return self.buildFinished(text, self.result)
        File "/home/elmir/remblds/src/master/buildbot/process/build.py",
 line 558, in buildFinished
          self.deferred.callback(self)
        File
 "/home/elmir/remblds/sandbox/local/lib/python2.7/site-
 packages/twisted/internet/defer.py",
 line 382, in callback
          self._startRunCallbacks(result)
        File
 "/home/elmir/remblds/sandbox/local/lib/python2.7/site-
 packages/twisted/internet/defer.py",
 line 490, in _startRunCallbacks
          self._runCallbacks()
      --- <exception caught here> ---
        File
 "/home/elmir/remblds/sandbox/local/lib/python2.7/site-
 packages/twisted/internet/defer.py",
 line 577, in _runCallbacks
          current.result = callback(current.result, *args, **kw)
        File
 "/home/elmir/remblds/src/master/buildbot/process/builder.py", line 455,
 in buildFinished
          d = self.master.db.builds.finishBuilds(bids)
      exceptions.AttributeError: 'NoneType' object has no attribute 'db'

 2014-02-13 15:37:55+0100 [Broker,1,127.0.0.1] slave 'example-slave'
 attaching from IPv4Address(TCP, '127.0.0.1', 54965)
 2014-02-13 15:37:55+0100 [Broker,1,127.0.0.1] Got slaveinfo from
 'example-slave'
 2014-02-13 15:37:55+0100 [Broker,1,127.0.0.1] Starting buildslave
 keepalive timer for 'example-slave'
 2014-02-13 15:37:55+0100 [Broker,1,127.0.0.1] bot attached
 2014-02-13 15:37:57+0100 [Broker,1,127.0.0.1] Buildslave example-slave
 attached to helloy
 }}}
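
 The AttributeError at the end of the traceback can be illustrated with a
 small standalone model. This is only a sketch, not Buildbot's code: the
 classes FakeMaster, FakeDB, FakeBuildsDB and ToyBuilder are hypothetical
 stand-ins, and it assumes (the logs do not confirm this) that the removed
 builder 'hellox' had its master reference cleared during the reconfig
 while its build was still in flight. The guarded variant shows one
 possible defensive check, not a fix known to be used upstream.

```python
class FakeBuildsDB(object):
    """Hypothetical stand-in for master.db.builds."""

    def __init__(self):
        self.finished = []

    def finishBuilds(self, bids):
        # Record which build ids were marked as finished.
        self.finished.extend(bids)


class FakeDB(object):
    def __init__(self):
        self.builds = FakeBuildsDB()


class FakeMaster(object):
    def __init__(self):
        self.db = FakeDB()


class ToyBuilder(object):
    """Toy model of a builder; NOT buildbot.process.builder.Builder."""

    def __init__(self, name, master):
        self.name = name
        self.master = master

    def detach_from_master(self):
        # Assumption: removing the builder on reconfig clears this reference.
        self.master = None

    def build_finished(self, bids):
        # Same attribute chain as the failing line in the traceback.
        return self.master.db.builds.finishBuilds(bids)

    def build_finished_guarded(self, bids):
        # Defensive variant: skip the DB update if the builder is orphaned.
        if self.master is None:
            return False
        self.master.db.builds.finishBuilds(bids)
        return True


def reproduce():
    """Return the error message raised when an orphaned builder finishes."""
    builder = ToyBuilder("hellox", FakeMaster())
    builder.detach_from_master()  # builder removed while the build runs
    try:
        builder.build_finished([1])  # late callback when the slave detaches
    except AttributeError as exc:
        return str(exc)
    return None
```

 Calling reproduce() returns the message of the AttributeError raised by the
 unguarded call, while build_finished_guarded() returns False for the
 orphaned builder instead of raising.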

-- 
Ticket URL: <http://trac.buildbot.net/ticket/2701>
Buildbot <http://buildbot.net/>
Buildbot: build/test automation

