[Buildbot-commits] [Buildbot] #2176: buildslave hangs trying to kill process after "1200 seconds without output"
Buildbot
nobody at buildbot.net
Mon Jan 16 13:02:26 UTC 2012
#2176: buildslave hangs trying to kill process after "1200 seconds without output"
----------------------+-----------------------
Reporter: hjwp | Owner:
Type: undecided | Status: new
Priority: major | Milestone: undecided
Version: 0.8.5 | Keywords:
----------------------+-----------------------
The buildbot logs show the usual message:
`command timed out: 1200 seconds without output, attempting to kill`
Looking at the console window of the machine that's running
`buildslave.bat`, we see a message:
`ERROR: The process "None" not found.`
Do you know where this message is coming from? Could it be that buildbot
is trying to kill a process that's already died?
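For what it's worth, the wording looks like what the Windows `taskkill` utility prints when it's given a PID it can't find, so one guess is that a pid of `None` is being formatted into the kill command. Below is a minimal sketch of guarding against that. It assumes Python 3 (for `subprocess.run` with a timeout), and the helper names `build_kill_command` and `kill_process_tree` are hypothetical, not Buildbot's actual code:

```python
import subprocess


def build_kill_command(pid):
    """Return a taskkill argument list, or None when there is no pid to kill."""
    if pid is None:
        # Nothing was spawned, or the process is already gone: skip the kill
        # instead of asking taskkill for the process "None".
        return None
    # /F forces termination, /T also terminates child processes.
    return ["taskkill", "/F", "/PID", str(pid), "/T"]


def kill_process_tree(pid, timeout=30):
    """Try to kill pid and its children; return True if taskkill succeeded.

    Windows-only sketch. The timeout keeps the caller from blocking
    forever if taskkill itself stalls.
    """
    command = build_kill_command(pid)
    if command is None:
        return False
    try:
        result = subprocess.run(command, timeout=timeout)
    except (subprocess.TimeoutExpired, OSError):
        # taskkill hung or could not be started; report failure to the caller.
        return False
    return result.returncode == 0
```

With a guard like this, a missing pid is reported as a failed kill that the caller can log and move on from, rather than being passed to an external command.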
It seems that the "attempting to kill" message is the last one that makes
it to the logs. Looking through the code in `runprocess.py`, that doesn't
make any sense: it seems to me that there's no way of getting through
that function without hitting at least one other `log.msg` call...
Weird.
Anyway, this hangs the build, and we're forced to go in and reboot the
buildslave machine. That then produces one final line in the logs:
`remoteFailed: [Failure instance: Traceback (failure with no frames):
<class 'twisted.internet.error.ConnectionLost'>: Connection to the other
side was lost in a non-clean fashion.]`
No doubt we should try to write better test code that doesn't trigger the
1200-second timeout, but still, it would be good if buildbot didn't
hang...
Additional info:
* buildbot-master is running on Debian
* buildslave is running Windows Vista
* seems to be an intermittent problem, maybe one in 5 runs
* we're using buildslave to run Selenium WebDriver tests, driven from
Python 2.7
--
Ticket URL: <http://trac.buildbot.net/ticket/2176>
Buildbot <http://buildbot.net/>
Buildbot: build/test automation