[Buildbot-devel] Severe issues with buildbot 0.7.4
Kenneth Lareau
Ken.Lareau at nominum.com
Thu Apr 19 19:30:34 UTC 2007
This just hit critical mass, so hopefully someone can respond to
this quickly; we are unable to run our current builds without a
fix to this issue.
Software:
buildbot 0.7.4
Twisted 2.5.0
zope.interface 3.3.0
We have a system set up for builds that seems to be causing the
build master process to stop logging all information. The problem
is happening when a step is run that's supposed to copy a file to
a directory. The step itself is:
  %(workdir)s/tare/head/nightly/copy_package %(workdir)s dhcptest-package %(platform)s
When this command attempts to run, the following traceback is seen
on the client:
2007/04/19 12:08 -0700 [Broker,client] error in ShellCommand._startCommand
2007/04/19 12:08 -0700 [Broker,client] Unhandled Error
Traceback (most recent call last):
  File "/usr/local/lib/python2.4/site-packages/buildbot/slave/bot.py", line 169, in remote_startCommand
    d = self.command.doStart()
  File "/usr/local/lib/python2.4/site-packages/buildbot/slave/commands.py", line 614, in doStart
    d = defer.maybeDeferred(self.start)
  File "/usr/local/lib/python2.4/site-packages/twisted/internet/defer.py", line 107, in maybeDeferred
    result = f(*args, **kw)
  File "/usr/local/lib/python2.4/site-packages/buildbot/slave/commands.py", line 725, in start
    d = self.command.start()
--- <exception caught here> ---
  File "/usr/local/lib/python2.4/site-packages/buildbot/slave/commands.py", line 292, in start
    self._startCommand()
  File "/usr/local/lib/python2.4/site-packages/buildbot/slave/commands.py", line 322, in _startCommand
    argv = (" ".join(self.command) % os.environ).split()
  File "/usr/local/lib/python2.4/UserDict.py", line 17, in __getitem__
    def __getitem__(self, key): return self.data[key]
exceptions.KeyError: 'workdir'
2007/04/19 12:08 -0700 [Broker,client] SlaveBuilder.commandFailed <buildbot.slave.commands.SlaveShellCommand instance at 0x855b52c>
2007/04/19 12:08 -0700 [Broker,client] Unhandled Error
Traceback (most recent call last):
Failure: buildbot.slave.commands.AbandonChain: -1
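For anyone trying to reproduce this outside buildbot: the failing line in the traceback joins the command list and applies Python's % operator with os.environ as the mapping, so any %(name)s placeholder without a matching environment variable raises KeyError. Below is a minimal sketch of that behavior. The command list is copied from the master log; expand() is a hypothetical helper (not a buildbot function), and the environment dictionaries and their values are made up for illustration. It is written for a current Python 3 interpreter rather than the 2.4 install shown above.

```python
# Sketch of the expansion seen at commands.py line 322 in the traceback.
# The command list matches the master log; the environment dictionaries
# below are made-up stand-ins for os.environ on the slave.
command = [
    "%(workdir)s/tare/head/nightly/copy_package",
    "%(workdir)s",
    "dhcptest-package",
    "%(platform)s",
]


def expand(command, environ):
    """Join the command, %-interpolate it from environ, split into argv."""
    return (" ".join(command) % environ).split()


# Environment without 'workdir' or 'platform', as on the failing slave.
bare_env = {"PATH": "/usr/bin:/bin"}
try:
    expand(command, bare_env)
    missing = None
except KeyError as exc:
    # The first placeholder that cannot be resolved is reported.
    missing = exc.args[0]
print("missing key:", missing)

# Hypothetical values: with both keys present the expansion succeeds.
full_env = dict(bare_env, workdir="/home/build", platform="linux")
argv = expand(command, full_env)
print(argv)
```

If this sketch matches what the slave is doing, the step can only work when both workdir and platform are present in the slave process's environment at the time the command starts.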
At the same time, the following is seen on the master buildbot
process:
2007/04/19 12:08 -0700 [-] ShellCommand.start using log <buildbot.status.builder.LogFile instance at 0xb3e7658c>
2007/04/19 12:08 -0700 [-] for cmd <RemoteShellCommand '['%(workdir)s/tare/head/nightly/copy_package', '%(workdir)s', 'dhcptest-package', '%(platform)s']'>
2007/04/19 12:08 -0700 [-] <RemoteShellCommand '['%(workdir)s/tare/head/nightly/copy_package', '%(workdir)s', 'dhcptest-package', '%(platform)s']'>: RemoteCommand.run [6]
2007/04/19 12:08 -0700 [-] command '['%(workdir)s/tare/head/nightly/copy_package', '%(workdir)s', 'dhcptest-package', '%(platform)s']' in dir 'build'
2007/04/19 12:08 -0700 [-] LoggedRemoteCommand.start
2007/04/19 12:08 -0700 [-] BuildStep.failed, traceback follows
At this point buildbot stops logging completely on the master,
though it still seems to be responsive to other builders being
run. This is highly undesirable. What makes it even more
frustrating is that the step only began to fail last night, and
the only change in that timeframe was a minor addition of several
new builders across several of the platforms. These are
auto-generated, so there are no typos; in this case, I added a
single (valid) branch name, and the new builders succeeded just
fine when I tested them yesterday.
Can anyone help at this point? I will need to take the failing
system offline until I find a solution, and hope that other
systems won't cause the same issue or all of our builds will be
put on hold, which for us is a Very Bad Thing (TM).
Thanks for any assistance you can give.
Ken Lareau