[Buildbot-devel] Assert failure when using SlaveLocks

Kinneberg, Steve stevek at qualcomm.com
Thu Oct 2 23:40:15 UTC 2008

I keep hitting the following assertion failure when I use SlaveLocks to limit the number of builds a slave performs at a time.  I've attached a simple master.cfg that reproduces the problem.  Our real config is complex enough that we can't set maxBuilds = 1 for the slaves, so we are using SlaveLocks for certain builders instead.

It happens when several builds are triggered close enough together that they run concurrently: the order in which the builds actually start gets confused, which triggers the following assertion failure:

2008/10/02 16:09 -0700 [-] Unhandled error in Deferred:
2008/10/02 16:09 -0700 [-] Unhandled Error
        Traceback (most recent call last):
          File "/usr/lib/python2.5/site-packages/twisted/internet/defer.py", line 182, in addCallbacks
          File "/usr/lib/python2.5/site-packages/twisted/internet/defer.py", line 317, in _runCallbacks
            self.result = callback(self.result, *args, **kw)
          File "/usr/lib/python2.5/site-packages/twisted/internet/defer.py", line 281, in _continue
          File "/usr/lib/python2.5/site-packages/twisted/internet/defer.py", line 277, in unpause
        --- <exception caught here> ---
          File "/usr/lib/python2.5/site-packages/twisted/internet/defer.py", line 317, in _runCallbacks
            self.result = callback(self.result, *args, **kw)
          File "/home/stevek/lib/python/buildbot/process/base.py", line 366, in _startBuild_2
          File "/home/stevek/lib/python/buildbot/status/builder.py", line 1149, in buildStarted
          File "/home/stevek/lib/python/buildbot/status/builder.py", line 1674, in buildStarted
            assert s.number == self.nextBuildNumber - 1

With the attached master.cfg, the assertion failure can be triggered easily by issuing 3 sendchanges in a row, 2 seconds apart.

Would it be safe to remove that assert, or to change it to verify only that s.number is less than self.nextBuildNumber, since the builds are not starting in order because of the SlaveLock?
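
The suspected sequence can be sketched in isolation. This is a toy model, not Buildbot's actual code: the class names BuilderStatusSketch and BuildSketch and the two check methods are hypothetical, and it assumes that a build number is allocated when the build is created, before the build waits on the SlaveLock, while the check from the traceback only runs once the build really starts.

```python
class BuildSketch:
    """Hypothetical stand-in for a build status object; it only carries a number."""

    def __init__(self, number):
        self.number = number


class BuilderStatusSketch:
    """Toy model of the builder status bookkeeping (assumed, not Buildbot's class)."""

    def __init__(self):
        self.nextBuildNumber = 0

    def newBuild(self):
        # Assumption: the number is handed out here, before any lock is acquired.
        build = BuildSketch(self.nextBuildNumber)
        self.nextBuildNumber += 1
        return build

    def buildStarted_strict(self, s):
        # Same condition as the assert shown in the traceback.
        assert s.number == self.nextBuildNumber - 1

    def buildStarted_relaxed(self, s):
        # The relaxed condition proposed in this message.
        assert s.number < self.nextBuildNumber


def passes(check, build):
    """Return True if the given check accepts the build, False if it asserts."""
    try:
        check(build)
        return True
    except AssertionError:
        return False


def demo():
    status = BuilderStatusSketch()
    # Three changes arrive close together: all three builds get numbers
    # (0, 1, 2) before the lock lets the first one start.
    builds = [status.newBuild() for _ in range(3)]
    strict = [passes(status.buildStarted_strict, b) for b in builds]
    relaxed = [passes(status.buildStarted_relaxed, b) for b in builds]
    return strict, relaxed


if __name__ == "__main__":
    strict, relaxed = demo()
    print("strict: ", strict)   # [False, False, True]
    print("relaxed:", relaxed)  # [True, True, True]
```

In this model the strict check accepts only the most recently numbered build, so any build that was queued behind the lock trips it, while the relaxed check accepts all three. Whether the relaxed check is safe in Buildbot itself depends on what else relies on builds starting in number order, which this sketch does not model.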

I've seen this problem with 0.7.8 and 0.7.9, and I suspect that it has been around for a longer period of time.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: master.cfg
Type: application/octet-stream
Size: 1639 bytes
Desc: master.cfg
URL: <http://buildbot.net/pipermail/devel/attachments/20081002/1aca17c9/attachment.obj>
