[Buildbot-devel] Slave not recognizing process end
"Martin v. Löwis"
martin at v.loewis.de
Sat Jan 2 18:51:02 UTC 2010
We have a slave where the test command is configured with
a timeout of 1800s. The command completed before the timeout
expired, yet the slave still believed it had to kill it.
The log shows the following pieces:
2010-01-01 15:31:43-0500 [-] command timed out: 1800 seconds without output
2010-01-01 15:31:43-0500 [-] self.process has no pid
2010-01-01 15:31:43-0500 [-] trying process.signalProcess('KILL')
2010-01-01 15:31:43-0500 [-] Unhandled Error
[...]
"/usr/lib/python2.6/site-packages/twisted/internet/process.py", line 312, in signalProcess
if os.WIFEXITED(status):
twisted.internet.error.ProcessExitedAlready:
As a consequence, the slave still believes that the process
is running, and every attempt to cancel the process from the master
leads to the same exception (ProcessExitedAlready).
Restarting the slave works around the problem, but since this
happens often, I would like to fix it for good.
Where should I look?
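To illustrate the behavior I would expect, here is a rough sketch, not the actual slave code. It uses a stand-in exception class instead of importing twisted.internet.error.ProcessExitedAlready, and FakeExitedProcess and kill_or_mark_finished are hypothetical names made up for this example. The idea is only that a kill attempt on a process that has already exited should be treated as "the command is finished" rather than surfacing as an unhandled error.

```python
class ProcessExitedAlready(Exception):
    """Stand-in for twisted.internet.error.ProcessExitedAlready (assumption)."""


class FakeExitedProcess:
    """Hypothetical process transport whose child has already exited."""

    def signalProcess(self, signal_name):
        # Mimics what the log above shows: signalling a process that
        # has already exited raises ProcessExitedAlready.
        raise ProcessExitedAlready()


def kill_or_mark_finished(process, signal_name="KILL"):
    """Return True if the signal was delivered, False if the process was gone."""
    try:
        process.signalProcess(signal_name)
    except ProcessExitedAlready:
        # The child is no longer running, so there is nothing to kill.
        # The caller can mark the command as finished at this point
        # instead of leaving it in the "still running" state.
        return False
    return True


delivered = kill_or_mark_finished(FakeExitedProcess())
print(delivered)
```

Whether the slave can safely mark the command as finished at that point, and where the exit notification got lost in the first place, is exactly what I do not know.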
Regards,
Martin