[Buildbot-devel] Fwd: canceling a build?

Jean-Paul Calderone exarkun at divmod.com
Fri Feb 1 20:17:58 UTC 2008


On Fri, 1 Feb 2008 12:42:48 -0500, Greg Ward <gerg.ward+buildbot at gmail.com> wrote:
>On Feb 1, 2008 11:48 AM, Stefan Seefeld <seefeld at sympatico.ca> wrote:
>> Really ? The toplevel make should 'own' all its child-processes, and
>> when it dies, its children should die with it. You'd have to take extra
>> steps to have a child process start a new process group. I don't think
>> make does that.
>
>That's what I thought at first too.  But my experiments on Linux have
>demonstrated that "kill 5322" does *not* necessarily kill all of
>process 5322's children.  And when Buildbot stops a build, *or* when
>you stop the slave Buildbot, any processes that are children of
>Buildbot's children keep running.  Like I said, killing the process
>group works for me.

Indeed it does not.  The probable reason you thought it did is that as
soon as a child of the killed process tries to write to its stdout (a
pipe whose reader is now gone), it gets a SIGPIPE, and the default (and
rarely overridden) response to that is to terminate.

In this case, it may be that buildbot is keeping stdout open, so the
children can continue to write without dying.  I'm not sure about this,
though; someone would need to look at the state of the file descriptors
involved to verify.
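
For anyone who wants to see the SIGPIPE behaviour in isolation, here is a
minimal sketch, assuming Python 3 on a POSIX system.  The name
writer_exit_status is made up for illustration, and the child is a
throwaway script, not anything from Buildbot.  The child restores the
default SIGPIPE disposition explicitly because the Python interpreter
ignores SIGPIPE at startup, whereas a typical C program leaves the
default in place.

```python
import signal
import subprocess
import sys

# Hypothetical child: writes to stdout forever, with SIGPIPE set back to
# its default action (terminate), as it would be in most C programs.
CHILD_SOURCE = """
import signal, sys
signal.signal(signal.SIGPIPE, signal.SIG_DFL)
while True:
    sys.stdout.write("x" * 4096)
    sys.stdout.flush()
"""


def writer_exit_status():
    """Start the writer, close the read end of its stdout, return its exit status."""
    proc = subprocess.Popen([sys.executable, "-c", CHILD_SOURCE],
                            stdout=subprocess.PIPE)
    proc.stdout.read(1)    # wait until the child has started writing
    proc.stdout.close()    # close the only read end of the pipe
    # The child's next write raises SIGPIPE, which terminates it.
    return proc.wait(timeout=10)


if __name__ == "__main__":
    # A negative return code -N means "killed by signal N".
    print(writer_exit_status() == -signal.SIGPIPE)
```

If the parent kept the read end open instead, the child would never see
SIGPIPE and would keep running, which is the situation suspected here.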

>
>I suspect that Buildbot needs two changes for this to work:
>  * each build step should start a new process group (call
>    os.setsid()) after fork()ing but before exec()ing
>  * when stopping a build step (e.g. on explicit user request, or when
>    the daemon is shut down), Buildbot should kill the whole process
>    group (negative process ID)
>

This would help, although nothing prevents one of the processes in the
build step from calling setsid() itself, thus exempting it from the
process group which buildbot will eventually try to kill.

It's difficult to deal with this case, though, and most of the time it's
good enough to pretend it won't happen.
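
The two proposed changes could look roughly like this.  It is only a
sketch, assuming Python 3 on a POSIX system; start_step and stop_step are
hypothetical names, not Buildbot functions.  It relies on
subprocess.Popen's start_new_session=True, which makes the child call
setsid() between fork() and exec(), and on os.killpg() to signal the
whole group.

```python
import os
import signal
import subprocess


def start_step(argv):
    """Run argv as the leader of a new session and process group."""
    # start_new_session=True: the child calls setsid() after fork()
    # and before exec(), so its pid becomes its process group id.
    return subprocess.Popen(argv, start_new_session=True)


def stop_step(proc, sig=signal.SIGTERM):
    """Signal every process still in the step's group, then reap the leader."""
    try:
        # Same effect as os.kill(-proc.pid, sig).
        os.killpg(proc.pid, sig)
    except ProcessLookupError:
        pass  # the whole group has already exited
    return proc.wait()


if __name__ == "__main__":
    # The shell starts a background grandchild and then waits for it.
    step = start_step(["sh", "-c", "sleep 30 & wait"])
    print(os.getpgid(step.pid) == step.pid)
    print(stop_step(step))
```

A process in the step that calls setsid() itself is not reached by
stop_step, as described above.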

>Again, I think this is System V signal handling.  Not sure how modern BSDs work.

The behavior is specified in POSIX and SUSv3.  FreeBSD implements this,
and I expect the others do as well, although I think they'd sleep better
at night if you used killpg(2).

Jean-Paul




