[Buildbot-devel] Slave not recognizing process end
    "Martin v. Löwis" 
    martin at v.loewis.de
       
    Mon Jan  4 08:08:44 UTC 2010
    
    
  
> I'm also not extremely familiar with the code, but let's say that
> there's a timeout (which I think there is) which tries to kill the
> process.  To avoid the exception, the callback for handling process
> termination (with or without error) should cancel the timeout.
And indeed, it does.
> Thus, no
> attempt is made by the timeout-related code to kill the process.  If
> there are other bits of code which would try to kill the process,
> perhaps they can have their event sources aborted as well.
And they do.
However, there is *still* a race condition.
> The main reason this is better than just handling the exception is that
> it fits right in with the process tracking that BuildBot really should
> already be doing.  If it is going off and killing a process that exited
> already, then the obvious question to ask is why it didn't notice the
> process already exited, and what consequences this has for any related
> status view and other internal state tracking.
That is indeed the question, and there is apparently also a bug lurking
there: in my case, the process terminated well before Buildbot tried to
kill it. Normally, Buildbot would also recognize the child's
termination, but in this specific case (under yet-to-be-determined
circumstances), it did not.
However, *independent* of this possible bug, there is a race condition
where killing the child may fail even if you follow the guidelines
you outline above.
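
To illustrate the kind of failure meant here, a minimal sketch follows. It assumes a POSIX system and plain `os.kill`; it is not Buildbot's or Twisted's actual code, and the names `kill_tolerating_exit` and `demo` are hypothetical. The point is only that the child can exit between the decision to kill it and the kill call itself, so the caller has to tolerate a failed kill regardless of how carefully timeouts are cancelled.

```python
import os
import signal
import subprocess
import sys


def kill_tolerating_exit(pid, sig=signal.SIGTERM):
    """Send sig to pid.

    Returns True if the signal was delivered, False if the process was
    already gone by the time the call was made.
    """
    try:
        os.kill(pid, sig)
    except ProcessLookupError:
        # The child exited (and was reaped) after we decided to kill it
        # but before the signal could be sent.
        return False
    return True


def demo():
    # Start a child that exits immediately, then wait for it so that it
    # has definitely terminated and been reaped before we try to kill it.
    child = subprocess.Popen([sys.executable, "-c", "pass"])
    child.wait()
    return kill_tolerating_exit(child.pid)


if __name__ == "__main__":
    print("signal delivered:", demo())
```

In a real event loop the window is much narrower than in `demo()`, which forces the ordering by waiting first, but the same exception path applies.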
Regards,
Martin
    
    