[Buildbot-devel] cleanly taking slave offline?
warner-buildbot at lothar.com
Thu Aug 4 00:59:40 UTC 2005
> I'm sure I just missed it but is there a way to cleanly take a slave
> offline after it completes its current build?
Nope, not yet. I'll add it to the TODO list (SF#1251484, feel free to add
notes). I'm not sure what the best UI should be. I could add a 'buildbot
shutdown-slave SLAVENAME' command (with a '--now' option to do it right away
instead of waiting for the end of the current build). But do you actually
need to stop the buildslave, or just pause it (keep it from starting new
builds)? We could add 'buildbot pause-slave SLAVENAME' and 'buildbot
unpause-slave SLAVENAME', which would keep the slave running and connected,
but set a flag that says it isn't eligible for new builds until unpaused.
Would that help, or do you expect to reboot the slave while it is paused?
A reboot would probably restart the buildslave (depending on how you have it
set up), and then the flag would get cleared.
This will be a bit easier to implement with the new Scheduler arrangement,
because I'm no longer making quite the same assumptions about buildslave
availability. A flag that says "don't use" shouldn't be too hard to add.
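A minimal sketch of what such a "don't use" flag could look like on the
master side. Everything here is hypothetical (SlaveRecord, pick_slave, and
the paused/busy attributes are illustration names, not existing Buildbot
code); it only shows the idea that a paused slave stays connected but is
skipped when a new build is assigned.

```python
class SlaveRecord:
    """Hypothetical master-side record for one connected buildslave."""

    def __init__(self, name):
        self.name = name
        self.paused = False  # the "don't use" flag; cleared on a fresh record
        self.busy = False    # True while a build is running on this slave

    def is_eligible(self):
        # A paused slave stays connected but gets no new builds.
        return not self.paused and not self.busy


def pick_slave(slaves):
    """Return the first slave that may start a new build, or None."""
    for slave in slaves:
        if slave.is_eligible():
            return slave
    return None
```

Because the flag lives only in this in-memory record, a slave that restarts
and reconnects would get a new record with paused set back to False, which
matches the reboot caveat above.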
> I noticed that each build page now has a "stop build" button but that
> doesn't really seem to do anything b/c my slave keeps running after I
> click it. What is its purpose?
It is meant to kill off the current build. I think it's flaky though (at
least I remember fixing some bugs in recent versions). The problem is that
it's optimistic: it sends a SIGKILL to the current command and then just
hopes that everything will come crashing down some time soon. One bug was
that stopping a build during a step which was allowed to terminate with
errors (say, a test case that exited with a non-zero return code if there
were any test failures) might not actually stop the build: it could just
shake off the error and keep on going. Another problem was that sending the
SIGKILL to the process didn't quite work. And Windows has always been trouble
in this part of the code.
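The two failure modes above can be illustrated with a short sketch. This is
not Buildbot's implementation; kill_and_confirm, Build, and the step tuple
layout are hypothetical names chosen for the example. The first function
waits for the killed process to actually exit instead of just hoping, and
the build loop checks a separate "stopped" flag so that a step which is
allowed to fail cannot shake off a stop request.

```python
import subprocess


def kill_and_confirm(proc, timeout=5.0):
    """Kill a subprocess.Popen and report whether it really exited."""
    proc.kill()  # SIGKILL on POSIX, TerminateProcess on Windows
    try:
        proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        return False  # still running; the caller should not assume it died
    return True


class Build:
    """Hypothetical build: steps are (name, func, halt_on_failure) tuples."""

    def __init__(self, steps):
        self.steps = steps
        self.stopped = False
        self.ran = []

    def stop(self):
        self.stopped = True

    def run(self):
        for name, func, halt_on_failure in self.steps:
            if self.stopped:
                return "stopped"
            self.ran.append(name)
            ok = func(self)
            # Check the stop flag before looking at the step result, so a
            # step with halt_on_failure=False cannot mask the interruption.
            if self.stopped:
                return "stopped"
            if not ok and halt_on_failure:
                return "failed"
        return "finished"
```

Keeping "stopped" separate from the step's own pass/fail result is the
design point: an interrupted step looks like a failed step, so the loop needs
a second signal to tell the two apart.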
I think those bugs have been fixed. 'stop build' might work now, I don't
really know. I don't have any good tests for it. But regardless, it's not
quite the piece you're looking for, because if there are multiple builds
queued up, stopping one build will just cause the next one to start right