[Buildbot-devel] Perplexing issue on Windows

Daniel e_list1 at earthlink.net
Thu Jul 31 20:40:28 UTC 2008


On Jul 31, 2008, at 16:38, Mark Roddy wrote:
> On 7/31/08, Daniel <e_list1 at earthlink.net> wrote:
>> On Jul 30, 2008, at 15:19, Daniel wrote:
>>> Greetings, all.
>>>
>>> I have a new build slave that I have never been able to get to work.
>>> I am moving a buildbot slave from one Windows XP 64-bit system to
>>> another, and the new slave locks up every time it tries to build
>>> something.  Specifically what happens is: one build might work (or it
>>> might not), but subsequent builds always (yes, always) fail.  There
>>> are 2 builds on this slave, and both fail with the same two problems.
>>>
>>> The first time they fail it is always due to a 1200-second timeout,
>>> with an error that buildbot could not kill processes.  Every
>>> subsequent failure (that is, every failure prior to my restarting the
>>> buildbot slave on that system) is an OS error that a specific file
>>> could not be removed.  If I check with procexp, I can see that the
>>> named file is still in use by a running shell process.  However, even
>>> if I kill that process, the build does not complete.  The only
>>> solution I have found is to quit Cygwin, which sometimes takes a
>>> reboot, since at that point Cygwin is usually hung and often refuses
>>> to die.
>>>
>>> The builds still (and always) run fine on the original slave system.
>>> All of my other build slaves (Linux and OS X) are without issue.
>>>
>>> I was running 0.7.7 but have upgraded to 0.7.8 and I see the same
>>> results on that version.  I am running buildbot in Cygwin on XP
>>> 64-bit.
>>>
>>> I would appreciate any guidance you can provide on how to figure out
>>> what's going on here, and how to get past it.
>>>
>>> Thanks!
>>>
>>> Daniel
>>
>>
>> Anyone have any ideas on this one?  Since I posted last, I have
>> completely wiped and reinstalled Buildbot, Python (2.4.4), the Python
>> Win32 extensions, and Twisted.  Results have not changed.
>>
>> The first error is:
>>> command timed out: 1200 seconds without output, killing pid 2720
>>> SIGKILL failed to kill process
>>> using fake rc=-1
>>> program finished with exit code -1
>>>
>>> remoteFailed: [Failure instance: Traceback from remote host --
>>> Traceback (most recent call last):
>>> Failure: buildbot.slave.commands.TimeoutError: SIGKILL failed to
>>> kill process
>>> ]
>>
>> Subsequent errors are:
>>> remoteFailed: [Failure instance: Traceback from remote host --
>>> Traceback (most recent call last):
>>>  File "C:\PYTHON24\Lib\site-packages\buildbot\slave\commands.py",
>>> line 1468, in _didLogin
>>>  return SourceBase.start(self)
>>>  File "C:\PYTHON24\Lib\site-packages\buildbot\slave\commands.py",
>>> line 1216, in start
>>>  d.addCallback(self.doClobber, self.workdir)
>>>  File "C:\Python24\Lib\site-packages\twisted\internet\defer.py",
>>> line 195, in addCallback
>>>  callbackKeywords=kw)
>>>  File "C:\Python24\Lib\site-packages\twisted\internet\defer.py",
>>> line 186, in addCallbacks
>>>  self._runCallbacks()
>>> --- <exception caught here> ---
>>>  File "C:\Python24\Lib\site-packages\twisted\internet\defer.py",
>>> line 328, in _runCallbacks
>>>  self.result = callback(self.result, *args, **kw)
>>>  File "C:\PYTHON24\Lib\site-packages\buildbot\slave\commands.py",
>>> line 1374, in doClobber
>>>  rmdirRecursive(d)
>>>  File "C:\PYTHON24\Lib\site-packages\buildbot\slave\commands.py",
>>> line 90, in rmdirRecursive
>>>  os.rmdir(dir)
>>> exceptions.OSError: [Errno 13] Permission denied:
>>> 'c:\\buildbot\\sanity-90-xp64\\config3'
>>> ]
>>
>> As I mentioned, I can see that the file config3 is held open by a
>> process, but even if I kill that process the builds will not complete.
>>
>> And, actually, now the builds never succeed - they always fail with
>> these errors.
>>
>> I sincerely appreciate any assistance you can provide.
>>
>>
>> Thanks.
>>
>> Daniel
>>
>
> I seem to remember something similar the first time I set up a build
> slave on XP.  What happened was that I had the service set to run as
> some user, but started the slave myself to make sure it would work.
> Then when I ran it as a service it died, as all the files created were
> owned by my user name and not the user the service was running as.
> Not sure if this is your issue or not, but I thought I'd share just
> in case.
>
> -Mark

Mark,

Thanks - I should have mentioned that I am running this from the
command line - same way that I was running it on the old system.  I'm
not using the buildbot service on either system.
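
One way to check whether the directory named in the traceback above can
be removed at all outside of buildbot is a small standalone script.
This is only a rough sketch, not buildbot's own code: the function name
try_clobber and its attempts/delay parameters are hypothetical, and it
assumes the "Permission denied" comes from something still holding the
directory open, so a short wait between attempts might help.

```python
import os
import shutil
import time


def try_clobber(path, attempts=3, delay=1.0):
    """Try to delete a directory tree, retrying a few times.

    Hypothetical helper for diagnosis only. Returns (ok, last_error):
    ok is True once the path no longer exists, and last_error is the
    last OSError seen if every attempt failed.
    """
    last_error = None
    for _ in range(attempts):
        try:
            shutil.rmtree(path)
            return True, None
        except OSError as exc:
            # A path that is already gone counts as success.
            if not os.path.exists(path):
                return True, None
            last_error = exc
            # Assumption: whatever holds the file may let go shortly.
            time.sleep(delay)
    return False, last_error
```

Calling try_clobber(r"c:\buildbot\sanity-90-xp64\config3") from the same
shell that starts the slave, and printing the returned error, should
show whether the failure is specific to buildbot or happens for any
Python process on that machine.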

Daniel



