[Buildbot-devel] buildbot slave hangs on SunOS?

Ed Hartnett ed at unidata.ucar.edu
Wed Feb 4 20:50:29 UTC 2004


I've been working on getting the buildbot slave to work on a Sun
machine. No luck yet!

Brian Warner <warner-buildbot at lothar.com> writes:

> > 2004/01/30 08:58 MST [Broker,client] startBuild
> > 2004/01/30 08:58 MST [Broker,client]  startCommand:cvs [id 6 325182]
> > 2004/01/30 08:58 MST [Broker,client]   command 'rm -rf /home/ed/BuildBot/laraine/laraine_tip/build' in dir /home/ed/BuildBot/laraine/laraine_tip/. [None]
> > 2004/01/30 08:58 MST [-] command finished with signal None, exit code 0
> 
> That looks normal, the slave reports that it finished the 'rm -rf' correctly.
> 
> > 2004/01/30 08:58 MST [-]   command 'cvs -d /upc/share/CVS -z3 checkout -r HEAD -d build netcdf-3' in dir /home/ed/BuildBot/laraine/laraine_tip/. [None]
> 
> You're saying that it just stops here? For how long? The CVS command should
> be executing during this time, and you won't see any additional messages in
> the buildmaster log until it finishes (or hits the 20-minute idle
> timeout).

It stays this way until I kill the slave.

The cvs output appears as usual, and the checkout seems to work, but
the slave never seems to detect the end of the cvs command. (That is,
the command never returns.)

> 
> Look at the buildmaster's web page: there should be a box for "CVS checkout"
> that should be yellow while the step is running. The "log" link there will
> show you all the stdout/stderr that the process has emitted so far. If the
> CVS process was able to get started (that is, if execvpe() was successful,
> which is probably equivalent to the 'cvs' executable being found on $PATH),
> then any later error messages should show up there.

cvs doesn't report an error; it works fine.

> 
> You can also look for clues in the buildslave's log (the twistd.log file in
> the buildslave's base directory). If you edit slavecommand.py around line 56
> (Command.__init__) and do 'self.debug = 1', then the buildslave will log a
> message for every status update it sends to the buildmaster, including every
> byte of stdout/stderr that the child process emits. This is a lot of data, and
> you'll probably want to turn it off once you've gotten everything working.
> 
> Also do a 'ps' on the buildslave host to see if cvs is actually running or
> not, and whether it is consuming any CPU time.

Yes, cvs runs.

There are some odd things about Python on a Sun system. There is a
pty-related test that fails.

Similarly, some of the Twisted tests hang on the Sun:

<snip>
  PosixProcessTestCasePTY
    testAbnormalTermination ...                                          [FAIL]
    testNormalTermination ... 

And it hangs here. Perhaps this is related to the problem?
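
One way to check this outside of Twisted and buildbot would be to run a
short-lived child on a pty and see whether its exit is ever noticed.
The following is only a sketch under assumptions: it uses the
subprocess module from a current Python 3 (not the Python 2.3.3 shown
in the log below), and the names run_on_pty and idle_timeout are
made up for illustration. It is not how Twisted or buildbot do it.

```python
import os
import pty
import select
import subprocess
import sys


def run_on_pty(argv, idle_timeout=5.0):
    """Run argv with stdout/stderr attached to a pty.

    Returns (output_bytes, exit_code, timed_out). The name and the
    idle_timeout parameter are hypothetical, chosen for this sketch.
    """
    master_fd, slave_fd = pty.openpty()
    proc = subprocess.Popen(
        argv,
        stdin=subprocess.DEVNULL,
        stdout=slave_fd,
        stderr=slave_fd,
        close_fds=True,
    )
    # The parent keeps only the master side; the child owns the slave side.
    os.close(slave_fd)

    chunks = []
    timed_out = False
    while True:
        ready, _, _ = select.select([master_fd], [], [], idle_timeout)
        if not ready:
            # No output and no EOF within idle_timeout: this is the "hang".
            timed_out = True
            proc.kill()
            break
        try:
            data = os.read(master_fd, 4096)
        except OSError:
            # Some platforms (e.g. Linux) raise an error instead of
            # returning EOF once the child side of the pty is closed.
            break
        if not data:
            break
        chunks.append(data)

    os.close(master_fd)
    exit_code = proc.wait()
    return b"".join(chunks), exit_code, timed_out


if __name__ == "__main__":
    out, code, hung = run_on_pty([sys.executable, "-c", "print('hello')"])
    print(repr(out), code, hung)
```

If timed_out comes back true for a command that clearly finished, the
end of the child is not being detected on the pty, which would match
what I am seeing with cvs.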

I tried getting Twisted 1.1.2 alpha, but that didn't fix it.
Anyway, here's the slave output:

zero.unidata.ucar.edu% twistd -n -f buildbot.tap
2004/02/04 13:27 MST [-] Log opened.
2004/02/04 13:27 MST [-] twistd 1.1.2alpha2 (/opt/bin/python 2.3.3) starting up
2004/02/04 13:27 MST [-] reactor class: twisted.internet.default.SelectReactor
2004/02/04 13:27 MST [-] Removing stale pidfile twistd.pid
2004/02/04 13:27 MST [-] Loading buildbot.tap...
2004/02/04 13:27 MST [-] Loaded.
2004/02/04 13:27 MST [-] set uid/gid 4178/2000
2004/02/04 13:27 MST [-] Starting factory <buildbot.bot.BotFactory instance at 0x2ea8c8>
2004/02/04 13:27 MST [Broker,client] message from master: attached
2004/02/04 13:27 MST [Broker,client] setBuilderList [('zero_3_5_1', 'zero_3_5_1'), ('zero_tip', 'zero_tip')]
2004/02/04 13:27 MST [Broker,client] builder 'zero_tip' message from master: attached
2004/02/04 13:27 MST [Broker,client] builder 'zero_3_5_1' message from master: attached
2004/02/04 13:28 MST [Broker,client] startBuild
2004/02/04 13:28 MST [Broker,client]  startCommand:cvs [id 3 347638]
2004/02/04 13:28 MST [Broker,client]   command 'rm -rf /home/ed/BuildBot/zero/zero_tip/build' in dir /home/ed/BuildBot/zero/zero_tip/. [None]
2004/02/04 13:28 MST [-] command finished with signal None, exit code 0
2004/02/04 13:28 MST [-]   command 'cvs -d /upc/share/CVS -z3 checkout -r HEAD -d build netcdf-3' in dir /home/ed/BuildBot/zero/zero_tip/. [None]
2004/02/04 13:48 MST [-] command timed out: 1200 seconds without output, killing pid 26839
2004/02/04 13:48 MST [-] trying os.kill(-pid, signal.SIGKILL)
2004/02/04 13:48 MST [-]  successful
2004/02/04 13:48 MST [-] trying process.signalProcess('KILL')
2004/02/04 13:48 MST [-]  successful
2004/02/04 13:48 MST [-] Failure: buildbot.slavecommand.TimeoutError: SIGKILL failed to kill process
	
2004/02/04 13:48 MST [-] Failure: buildbot.slavecommand.TimeoutError: SIGKILL failed to kill process
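
The log shows the slave trying os.kill(-pid, signal.SIGKILL) and then
reporting that the process was not killed. To separate "the kill did
not work" from "the exit was never noticed", something like the
following could be run by hand. It is a sketch only: it assumes a
current Python 3 with the subprocess module, and spawn_in_own_group,
kill_group_and_wait, and grace are names made up for illustration, not
buildbot code.

```python
import os
import signal
import subprocess
import sys


def spawn_in_own_group(argv):
    """Start argv as the leader of a new session/process group, so that
    signalling -pid reaches only this child and its descendants."""
    return subprocess.Popen(argv, start_new_session=True)


def kill_group_and_wait(proc, grace=5.0):
    """SIGKILL the child's process group, then wait up to `grace` seconds.

    Returns the exit status (negative signal number when killed by a
    signal), or None if the child was never seen to exit.
    """
    try:
        # A negative pid addresses the whole process group.
        os.kill(-proc.pid, signal.SIGKILL)
    except ProcessLookupError:
        # The group is already gone; fall through and collect the status.
        pass
    try:
        return proc.wait(timeout=grace)
    except subprocess.TimeoutExpired:
        return None


if __name__ == "__main__":
    child = spawn_in_own_group(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    print(kill_group_and_wait(child))
```

A return value of None here would mean the kill was sent but the exit
was never observed, which is the situation the TimeoutError above
describes.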
	





