[Buildbot-commits] [Buildbot] #751: Sending SIGTERM before SIGKILL to a remote shell command that has timed out

Buildbot trac trac at buildbot.net
Tue Aug 20 02:55:50 UTC 2013


#751: Sending SIGTERM before SIGKILL to a remote shell command that has timed out
-------------------------+-----------------------
Reporter:  Fabrice       |       Owner:
    Type:  enhancement   |      Status:  reopened
Priority:  minor         |   Milestone:  0.8.+
 Version:  0.7.12        |  Resolution:
Keywords:  kill, sprint  |
-------------------------+-----------------------

Old description:

> I have a test step that produces no output for one hour (the test or
> one of its subtests hangs for some reason). My buildbot is configured to
> time out this step/command after 3600 seconds of inactivity on stdout or
> stderr. As expected, buildbot then sends SIGKILL(9) to it and writes in
> the log:
>
> {{{
> command timed out: 3600 seconds without output, killing pid <PID>
> process killed by signal 9
> }}}
>
> My problem is that SIGKILL(9) cannot be caught or trapped by the test
> step process running on the slave, so I lose the test logs. Would it be
> possible to make buildbot send a couple of SIGTERM(15) signals before
> sending SIGKILL(9)?

New description:

 I have a test step that produces no output for one hour (the test or
 one of its subtests hangs for some reason). My buildbot is configured to
 time out this step/command after 3600 seconds of inactivity on stdout or
 stderr. As expected, buildbot then sends SIGKILL(9) to it and writes in
 the log:

 {{{
 command timed out: 3600 seconds without output, killing pid <PID>
 process killed by signal 9
 }}}

 My problem is that SIGKILL(9) cannot be caught or trapped by the test
 step process running on the slave, so I lose the test logs. Would it be
 possible to make buildbot send a couple of SIGTERM(15) signals before
 sending SIGKILL(9)?
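
 To illustrate why SIGTERM would help here, the following is a minimal
 sketch of a test process that traps SIGTERM and saves its logs before
 exiting. It is not part of buildbot; install_term_handler and flush_logs
 are hypothetical names, and the exit status 128 + signal number is only
 the usual shell convention. The handler must be installed from the main
 thread.

```python
import signal
import sys


def install_term_handler(flush_logs):
    """Install a SIGTERM handler that saves logs, then exits.

    flush_logs is a hypothetical callback supplied by the test harness.
    Returns the previously installed handler so it can be restored.
    """
    def handler(signum, frame):
        flush_logs()              # write out whatever the test collected
        sys.exit(128 + signum)    # conventional "killed by signal" status

    return signal.signal(signal.SIGTERM, handler)
```

 No equivalent exists for SIGKILL: the operating system does not let a
 process install a handler for it, which is why the logs are lost today.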

--

Comment (by dustin):

 I ended up backing that out.  From the message there:

 I think that the correct approach is what a lot of initscripts do: send
 SIGTERM, poll for process exit for some mid-length time, and if it doesn't
 exit, send SIGKILL. In other words, "please quit", wait, "die". We'll
 probably also need this to be configurable from the master side - both the
 inter-signal timeout, and whether to try SIGTERM at all.
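
 The "please quit", wait, "die" sequence described above could be sketched
 as follows. This is an illustration on top of the standard subprocess
 module, assuming a POSIX slave, and not buildbot's actual implementation;
 stop_process and its parameters grace_period, poll_interval and
 try_sigterm are hypothetical names standing in for the settings that
 would be configurable from the master side.

```python
import signal
import subprocess
import time


def stop_process(proc, grace_period=5.0, poll_interval=0.1, try_sigterm=True):
    """Stop a subprocess.Popen child: SIGTERM, poll for exit, then SIGKILL.

    Returns the child's return code (negative signal number on POSIX
    when the child was terminated by a signal).
    """
    if try_sigterm and proc.poll() is None:
        proc.send_signal(signal.SIGTERM)          # "please quit"
        deadline = time.monotonic() + grace_period
        while time.monotonic() < deadline:        # poll for process exit
            if proc.poll() is not None:
                return proc.returncode
            time.sleep(poll_interval)
    if proc.poll() is None:
        proc.kill()                               # SIGKILL on POSIX: "die"
    return proc.wait()
```

 Passing try_sigterm=False keeps the current behaviour of going straight
 to SIGKILL, which matches the request that the SIGTERM attempt itself be
 optional.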

-- 
Ticket URL: <http://trac.buildbot.net/ticket/751#comment:7>
Buildbot <http://buildbot.net/>
Buildbot: build/test automation

