[Buildbot-commits] [Buildbot] #2454: SIGHUP doesn't always work

Buildbot trac trac at buildbot.net
Sat May 4 22:16:13 UTC 2013


#2454: SIGHUP doesn't always work
--------------------+------------------------
Reporter:  virgilg  |       Owner:
    Type:  defect   |      Status:  new
Priority:  major    |   Milestone:  undecided
 Version:  0.8.7p1  |  Resolution:
Keywords:           |
--------------------+------------------------

New description:

 Every now and then on 0.8.5 and more often on 0.8.7p1 we see at
 reconfigure time:
 sending SIGHUP to process 41208
 Never saw reconfiguration finish.

 The usual fix is to restart the master, but that stops everybody else's
 builds (we lose slaves because of the time it takes to reconfigure: far
 less than the 10-minute timeout, but we lose them anyway).

 Running $ kill -SIGHUP 41208 produces nothing in twistd.log, so the
 master does indeed appear to be stuck.

 How can we make this rock-solid?

--

Comment (by dustin):

 Did you try adding the debugging code in comment 3?  That would help to
 narrow this down.
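
 For illustration only (this is not the debugging code from comment 3,
 and it says nothing about how Buildbot handles SIGHUP internally), a
 minimal Python sketch of checking whether a process's SIGHUP handler
 runs at all. The names received, on_sighup and sighup_is_delivered are
 hypothetical, and the sketch assumes a POSIX system and the main thread.

```python
import os
import signal
import time

# Hypothetical helper state: one timestamp per SIGHUP the handler has seen.
received = []


def on_sighup(signum, frame):
    # Record that the signal actually reached Python-level code.
    received.append(time.time())


def sighup_is_delivered(timeout=2.0):
    """Send SIGHUP to this process and report whether the handler ran.

    Must be called from the main thread, because signal.signal() only
    works there.
    """
    previous = signal.signal(signal.SIGHUP, on_sighup)
    try:
        before = len(received)
        os.kill(os.getpid(), signal.SIGHUP)
        deadline = time.time() + timeout
        # Python runs signal handlers between bytecode instructions in the
        # main thread, so poll briefly instead of assuming instant delivery.
        while len(received) == before and time.time() < deadline:
            time.sleep(0.01)
        return len(received) > before
    finally:
        # Restore the earlier handler; None means it was not set from Python.
        if previous is not None:
            signal.signal(signal.SIGHUP, previous)
```

 If a check like this returns False in a process whose main thread is
 blocked, that would be consistent with the signal arriving but the
 handler never getting a chance to run, which is one possible reading of
 the empty twistd.log.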

-- 
Ticket URL: <http://trac.buildbot.net/ticket/2454#comment:6>
Buildbot <http://buildbot.net/>
Buildbot: build/test automation

