[Buildbot-commits] [Buildbot] #2454: SiGHUP doesn't always work
Buildbot trac
trac at buildbot.net
Sun Aug 11 03:23:33 UTC 2013
#2454: SiGHUP doesn't always work
----------------------------+----------------------
Reporter: virgilg | Owner:
Type: support-request | Status: new
Priority: major | Milestone: ongoing
Version: 0.8.7p1 | Resolution:
Keywords: |
----------------------------+----------------------
Comment (by dustin):
Replying to [comment:12 virgilg]:
> I can repro this as follows:
>
> 1) save the attached script as e.g. test.py[[BR]]
> 2) comment and uncomment line 12: time.sleep(1)
>
> With time.sleep(1) uncommented, one cannot deliver a SIG<MUMBLE> to the
> process, since the MainThread is too busy inside the while True: loop and
> python is not really multithreaded.
>
> With time.sleep(1) commented out, handler() will get a chance to run.

I can't reproduce this. In either case, sleep or not, I see "Interrupt
received" when I send it a SIGINT, both on Mountain Lion (using the
system Python) and on Linux.

Given the way signal handling works, that's not terribly surprising. UNIX
will deliver a signal to the main thread of a process immediately, either
by pre-empting the process (if it's currently running), by scheduling it
(if it's ready to run), or by returning early from a syscall (if it's
currently in a syscall). In any of those cases, Python sets an internal
flag for the signal, sets the global is_tripped, adds a pending call to
PyErr_CheckSignals, and if necessary uses a self-pipe to wake up the main
thread. Python checks for pending calls every few Python opcodes. A
tight Python loop (`while True: pass`) is still executing opcodes, so the
pending call gets checked, and the exception gets handled. You indicated
that the loop *with* `time.sleep(1)` was not interruptible. In that case,
the signal is most likely delivered during the sleep, which is either a
syscall that will be awakened, or a select() that includes the selfpipe,
so the signal should *still* be delivered immediately. Which is what I'm
seeing.
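
To make the comparison concrete, here is a minimal sketch of such a test. It is not the attached test.py, which is not reproduced in this message; the function name `loop_until_signal`, the timer that stands in for running `kill -INT <pid>` from another terminal, and the timings are all assumptions. It assumes a POSIX system and that it runs in the main thread.

```python
import os
import signal
import threading
import time


def loop_until_signal(use_sleep, send_after=0.2, give_up_after=5.0):
    """Return True if the Python-level SIGINT handler ran before the deadline."""
    received = []

    def handler(signum, frame):
        # Called by the interpreter in the main thread, between bytecodes.
        received.append(signum)

    previous = signal.signal(signal.SIGINT, handler)
    # Hypothetical stand-in for sending the signal by hand from a shell.
    timer = threading.Timer(send_after, os.kill,
                            args=(os.getpid(), signal.SIGINT))
    timer.start()
    deadline = time.time() + give_up_after
    try:
        # Busy loop; with use_sleep=True most of the time is spent sleeping.
        while not received and time.time() < deadline:
            if use_sleep:
                time.sleep(0.05)
    finally:
        timer.cancel()
        timer.join()
        signal.signal(signal.SIGINT, previous)  # restore the old handler
    return bool(received)


if __name__ == "__main__":
    for use_sleep in (False, True):
        print("use_sleep=%s handler ran: %s"
              % (use_sleep, loop_until_signal(use_sleep)))
```

If the handler fails to run in one of the two modes on a given machine, that is the difference worth chasing.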

Since we're seeing different behavior from a very simple Python script, I
think we should look toward the version and build of Python that we're
using. If we get those to match and we're *still* seeing different
behavior, then we'll have to look more closely at the reproduction recipe,
and try to find a suitable machine we both have access to.
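
One way to collect the version and build details for comparison is sketched below. The helper name `describe_python` is an assumption; the `sys` and `platform` calls are standard-library functions.

```python
import platform
import sys


def describe_python():
    """Collect interpreter details worth comparing between two machines."""
    return {
        "version": sys.version.replace("\n", " "),
        "implementation": platform.python_implementation(),
        "build": platform.python_build(),        # (build number, build date)
        "compiler": platform.python_compiler(),
        "platform": platform.platform(),
    }


if __name__ == "__main__":
    for key, value in sorted(describe_python().items()):
        print("%-15s %s" % (key, value))
```

Running this on both machines and pasting the output into the ticket would show whether the interpreters actually differ.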

> I see it process a ton of "events" that never finish:
> events_company.com/11305023
> Where do these events come from? What generates them?
> events_company.com/state is not the droid I'm looking for, is it?

This is a separate issue, which I think you filed a different bug on.
Your events shouldn't be backing up into the disk storage like that. But
let's focus on this bug for now.

If I accept for a moment your hypothesis that "tight" Python loops
preclude UNIX signals, then I see how this would be related. But I don't
know what "tight" means - if I loop over a list with two elements, then
for a moment the CPU is just as busy as it would be if I were looping over
1,000,000 elements. If a signal's delivered during that time, is it
ignored? What if it's delivered while a multiplication operation is
taking place? What defines "tight"?

> The other 6 threads are all stuck in threading.py:

Yep, that's the `Condition` class. Those are all worker threads helpfully
waiting for work.
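
As an illustration of why an idle worker shows up inside threading.py, the sketch below starts one worker that waits on a `Condition` and then looks at where its stack is parked. This is not Buildbot's or Twisted's thread pool code. The helper name `idle_worker_location` is hypothetical, and using `sys._current_frames()` (a CPython-specific function) to inspect the thread is an assumption about how such a dump might be taken.

```python
import os
import sys
import threading
import time


def idle_worker_location(timeout=5.0):
    """Start one idle worker and report (file, function) where it is parked."""
    cond = threading.Condition()
    jobs = []

    def worker():
        with cond:
            while not jobs:
                cond.wait()  # blocks here until work is announced

    thread = threading.Thread(target=worker)
    thread.start()
    location = None
    deadline = time.time() + timeout
    while time.time() < deadline:
        frame = sys._current_frames().get(thread.ident)
        if frame is not None:
            name = os.path.basename(frame.f_code.co_filename)
            # Wait until the innermost Python frame is Condition.wait itself.
            if name == "threading.py" and frame.f_code.co_name == "wait":
                location = (name, frame.f_code.co_name)
                break
        time.sleep(0.01)
    with cond:
        jobs.append("stop")  # hand the worker something so it can exit
        cond.notify()
    thread.join()
    return location


if __name__ == "__main__":
    print(idle_worker_location())
```

A thread parked there is not stuck; it is blocked until someone calls `notify()`.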

I think that we should talk through some medium other than Trac. I'll try
to send you an email at the address in Trac, but if that fails, please get
in touch.
--
Ticket URL: <http://trac.buildbot.net/ticket/2454#comment:14>
Buildbot <http://buildbot.net/>
Buildbot: build/test automation