[Buildbot-devel] Perplexing issue on Windows
Daniel
e_list1 at earthlink.net
Thu Jul 31 20:22:39 UTC 2008
On Jul 30, 2008, at 15:19, Daniel wrote:
> Greetings, all.
>
> I have a new build slave that I have never been able to get to work.
> I am moving a buildbot slave from one Windows XP 64-bit system to
> another, and the new slave locks up every time it tries to build
> something. Specifically what happens is: one build might work (or it
> might not), but subsequent builds always (yes, always) fail. There
> are 2 builds on this slave, and both fail with the same two problems.
>
> The first time they fail it is always due to a 1200-second timeout,
> with an error that buildbot could not kill processes. Every
> subsequent failure (that is, every failure prior to my restarting the
> buildbot slave on that system) is an OS error that a specific file
> could not be removed. If I check with procexp, I can see that the
> named file is still in use by a running shell process. However, even if
> I kill that process, the build does not complete. The only solution I
> have found is to quit Cygwin, which sometimes takes a reboot, since at
> that point Cygwin is usually hung and often refuses to die.
>
> The builds still (and always) run fine on the original slave system.
> All of my other build slaves (Linux and OS X) are without issue.
>
> I was running 0.7.7 but have upgraded to 0.7.8 and I see the same
> results on that version. I am running buildbot in Cygwin on XP
> 64-bit.
>
> I would appreciate any guidance you can provide on how to figure out
> what's going on here, and how to get past it.
>
> Thanks!
>
> Daniel
Anyone have any ideas on this one? Since I posted last, I have
completely wiped and reinstalled Buildbot, Python (2.4.4), the Python
Win32 extensions, and Twisted. Results have not changed.
The first error is:
> command timed out: 1200 seconds without output, killing pid 2720
> SIGKILL failed to kill process
> using fake rc=-1
> program finished with exit code -1
>
> remoteFailed: [Failure instance: Traceback from remote host --
> Traceback (most recent call last):
> Failure: buildbot.slave.commands.TimeoutError: SIGKILL failed to
> kill process
> ]
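One thing that could be tried by hand after this timeout is killing the whole process tree rather than the single pid. The following is only a sketch under assumptions: it is not what buildbot does internally, the helper names `taskkill_command` and `kill_process_tree` are made up for illustration, and it assumes the pid in the log (2720 here) is a Windows pid rather than a Cygwin one. It relies on the standard Windows `taskkill` tool with `/F` (force), `/T` (include child processes), and `/PID`.

```python
import subprocess


def taskkill_command(pid, kill_tree=True):
    """Build a Windows taskkill command line for the given pid.

    Hypothetical helper: /F forces termination, /T also terminates
    any child processes started by that pid.
    """
    cmd = ["taskkill", "/F"]
    if kill_tree:
        cmd.append("/T")
    cmd += ["/PID", str(pid)]
    return cmd


def kill_process_tree(pid):
    """Run taskkill and return True if it exited with code 0.

    Only meaningful on Windows, where taskkill is available.
    """
    return subprocess.call(taskkill_command(pid)) == 0
```

If the shell children are what keep the build directory locked, killing the tree might release it; if Cygwin itself is hung, this may well make no difference.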
Subsequent errors are:
> remoteFailed: [Failure instance: Traceback from remote host --
> Traceback (most recent call last):
> File "C:\PYTHON24\Lib\site-packages\buildbot\slave\commands.py",
> line 1468, in _didLogin
> return SourceBase.start(self)
> File "C:\PYTHON24\Lib\site-packages\buildbot\slave\commands.py",
> line 1216, in start
> d.addCallback(self.doClobber, self.workdir)
> File "C:\Python24\Lib\site-packages\twisted\internet\defer.py",
> line 195, in addCallback
> callbackKeywords=kw)
> File "C:\Python24\Lib\site-packages\twisted\internet\defer.py",
> line 186, in addCallbacks
> self._runCallbacks()
> --- <exception caught here> ---
> File "C:\Python24\Lib\site-packages\twisted\internet\defer.py",
> line 328, in _runCallbacks
> self.result = callback(self.result, *args, **kw)
> File "C:\PYTHON24\Lib\site-packages\buildbot\slave\commands.py",
> line 1374, in doClobber
> rmdirRecursive(d)
> File "C:\PYTHON24\Lib\site-packages\buildbot\slave\commands.py",
> line 90, in rmdirRecursive
> os.rmdir(dir)
> exceptions.OSError: [Errno 13] Permission denied: 'c:\\buildbot\\sanity-90-xp64\\config3'
> ]
As I mentioned, I can see that the file config3 is held open by a
process, but even if I kill that process the builds will not complete.
And, actually, now the builds never succeed - they always fail with
these errors.
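To narrow down which entries are actually blocking the clobber, a standalone script along these lines could be run against the build directory. It is a rough sketch, not buildbot's `rmdirRecursive`: the function name `find_unremovable` is made up, it is written for a current Python rather than 2.4, and it really does delete whatever it can, so it should only be pointed at a directory that is safe to wipe.

```python
import os


def find_unremovable(root):
    """Delete everything under root, bottom-up, and return a list of
    (path, error) pairs for each entry that could not be removed.

    Hypothetical diagnostic helper; destructive by design.
    """
    blocked = []
    # topdown=False visits the deepest directories first, so each
    # directory is empty by the time os.rmdir is attempted on it.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                os.remove(path)
            except OSError as exc:
                blocked.append((path, exc))
        for name in dirnames:
            path = os.path.join(dirpath, name)
            try:
                os.rmdir(path)
            except OSError as exc:
                blocked.append((path, exc))
    try:
        os.rmdir(root)
    except OSError as exc:
        blocked.append((root, exc))
    return blocked
```

Printing the returned list after a failed build would show whether config3 is the only locked entry or just the first one the clobber happens to hit.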
I sincerely appreciate any assistance you can provide.
Thanks.
Daniel