[Buildbot-devel] buildbot slave hangs on SunOS?
Ed Hartnett
ed at unidata.ucar.edu
Fri Jan 30 16:00:24 UTC 2004
Brian Warner <warner-buildbot at lothar.com> writes:
> > File "/opt/lib/python2.2/site-packages/twisted/internet/default.py", line 162, in spawnProcess
> > return process.PTYProcess(self, executable, args, env, path, processProtocol, uid, gid, usePTY)
> > File "/opt/lib/python2.2/site-packages/twisted/internet/process.py", line 509, in __init__
> > stderr.flush()
> > exceptions.IOError: [Errno 9] Bad file number
> >
> > 2004/01/27 09:22 MST [-] Malformed file descriptor found. Preening lists.
> > 2004/01/27 09:22 MST [-] bad descriptor <twisted.internet.tcp.Client to ('rodney', 8007) at 82a2f74>
> > 2004/01/27 09:22 MST [-] bad descriptor <twisted.internet.tcp.Client to ('rodney', 8007) at 82a2f74>
>
> Eww. That's weird.
>
> It looks like stderr.flush() is only called when the child process (after
> fork) experienced some other error, like execvpe failing because it couldn't
> find the command or something. There should have been a message written to
> stdout or stderr with the exception. It wouldn't be written to the log, but
> rather to the stdout of the twistd process, which probably goes away when
> twistd daemonizes. The error in .flush() might be related to twistd closing
> fd 1, but then I don't know why the original os.fdopen didn't fail, and why
> you're able to see the log message at all.
>
> Try running that buildslave in the foreground (twistd -n) and watch the
> stdout/stderr to see if it emits more information.
>
> If that doesn't help, maybe it's something more esoteric. Does the Twisted
> test suite pass on that system? (from the top of the Twisted source tree, do
> './bin/trial -v twisted.test'. It takes maybe 5 minutes).
>
> If so, try editing slavecommand.py and changing 'usePTY = 1' (around line
> 170) to 'usePTY = 0'. That will use regular pipes instead of a PTY for the
> child process. PTYs are one of those funky system-dependent things, and maybe
> Solaris does it just differently enough that it exposes a bug.
>
> Also, if possible, try running the slave under python2.3 instead of 2.2.
> Maybe there's a difference in the behavior of the built-in python libraries
> that are responsible for creating PTYs.
>
> weird,
> -Brian

OK, I upgraded to Python 2.3.3 and also reinstalled Twisted and
Buildbot.

Now I get further in the process, but it still hangs:

laraine.unidata.ucar.edu% twistd -n -f buildbot.tap
2004/01/30 08:57 MST [-] Log opened.
2004/01/30 08:57 MST [-] twistd 1.1.1 (/opt/bin/python 2.3.3) starting up
2004/01/30 08:57 MST [-] reactor class: twisted.internet.default.SelectReactor
2004/01/30 08:57 MST [-] Removing stale pidfile twistd.pid
2004/01/30 08:57 MST [-] Loading buildbot.tap...
2004/01/30 08:57 MST [-] Loaded.
2004/01/30 08:57 MST [-] set uid/gid 4178/2000
2004/01/30 08:57 MST [-] Starting factory <buildbot.bot.BotFactory instance at 0x2e07d8>
2004/01/30 08:57 MST [Broker,client] message from master: attached
2004/01/30 08:57 MST [Broker,client] setBuilderList [('laraine_tip', 'laraine_tip'), ('laraine_3_5_1', 'laraine_3_5_1')]
2004/01/30 08:57 MST [Broker,client] builder 'laraine_tip' message from master: attached
2004/01/30 08:57 MST [Broker,client] builder 'laraine_3_5_1' message from master: attached
2004/01/30 08:58 MST [Broker,client] startBuild
2004/01/30 08:58 MST [Broker,client] startCommand:cvs [id 6 325182]
2004/01/30 08:58 MST [Broker,client] command 'rm -rf /home/ed/BuildBot/laraine/laraine_tip/build' in dir /home/ed/BuildBot/laraine/laraine_tip/. [None]
2004/01/30 08:58 MST [-] command finished with signal None, exit code 0
2004/01/30 08:58 MST [-] command 'cvs -d /upc/share/CVS -z3 checkout -r HEAD -d build netcdf-3' in dir /home/ed/BuildBot/laraine/laraine_tip/. [None]
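
For reference, the usePTY suggestion above comes down to whether the child process is attached to a pseudo-terminal or to plain pipes. The following is a minimal sketch of that difference using only the Python standard library on a current Python 3; it is not Twisted's or Buildbot's code, and the helper names run_with_pipes and run_with_pty are made up for illustration. Running the same command both ways on a given machine can help show whether PTY handling is the part that misbehaves.

```python
import os
import pty
import subprocess
import sys


def run_with_pipes(argv):
    """Run argv with ordinary pipes; return (exit_code, output_bytes)."""
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT)
    out, _ = proc.communicate()
    return proc.returncode, out


def run_with_pty(argv):
    """Run argv attached to a pseudo-terminal; return (exit_code, output_bytes)."""
    pid, master_fd = pty.fork()
    if pid == 0:
        # Child: stdin/stdout/stderr are now the slave side of the PTY.
        try:
            os.execvp(argv[0], argv)
        finally:
            # Only reached if exec failed; exit without running parent cleanup.
            os._exit(127)

    # Parent: read everything the child writes to the PTY.
    chunks = []
    while True:
        try:
            data = os.read(master_fd, 1024)
        except OSError:
            # Some platforms (e.g. Linux) raise EIO once the child side closes.
            break
        if not data:
            break
        chunks.append(data)
    os.close(master_fd)

    _, status = os.waitpid(pid, 0)
    code = os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1
    return code, b"".join(chunks)


if __name__ == "__main__":
    # The child reports whether its stdout looks like a terminal.
    cmd = [sys.executable, "-c", "import sys; print(sys.stdout.isatty())"]
    print("pipes:", run_with_pipes(cmd))
    print("pty:  ", run_with_pty(cmd))
```

With pipes the child sees a non-terminal stdout; with the PTY it sees a terminal, and the line ending comes back as "\r\n" because the terminal layer translates newlines. Programs that change behavior depending on isatty() will therefore act differently under the two modes.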