[Buildbot-commits] [SPAM] Re: [Buildbot] #664: sporadic connection lost problems
Buildbot
buildbot-devel at lists.sourceforge.net
Sun Dec 27 23:48:40 UTC 2009
#664: sporadic connection lost problems
--------------------+-------------------------------------------------------
Reporter: ddunbar | Owner: ddunbar
Type: defect | Status: assigned
Priority: major | Milestone: 0.8.+
Version: | Resolution:
Keywords: |
--------------------+-------------------------------------------------------
Changes (by ddunbar):
* owner: => ddunbar
* status: new => assigned
Comment:
Based on some TCP logging and IRC discussion, I believe this is an
approximate picture of what happens. It explains why the problem only
occurs with multiple builders, and why it occurs more often with my
long-running builder:
1. The slave is busy with the long-running build.
2. The master is busy with a number of builds; my particular master is
pretty heavily loaded and is frequently maxing the CPU.
3. A source change arrives and triggers a number of builds, including
one on the slave with the long-running builder.
4. Because the slave has multiple builders, a new build is scheduled and
the master initiates a ping to the slave to make sure it is available. If
it only had one builder the build would just be pending and the bug
wouldn't show up.
5. The master is busy doing other work, and the slave is busy feeding it
more data. At some point the master's TCP buffer fills up. If the slave's
response to the ping is delayed behind the other data, it won't get to the
master in time.
6. The master gets a ping timeout and closes the slave connection, killing
an otherwise active build.
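
The sequence above can be illustrated with a rough model. This is only a
sketch under stated assumptions, not Buildbot code: the message labels,
sizes, read rate, and timeout are all hypothetical values chosen for
illustration. It treats the slave-to-master connection as a single ordered
stream, so the ping reply cannot be read until everything queued ahead of
it has been drained by the master.

```python
from collections import deque

PING_REPLY = "ping-reply"  # hypothetical label for the slave's ping response


def time_until_ping_reply(queued_messages, master_read_rate):
    """Seconds until the master has read the ping reply.

    queued_messages: (label, size_in_bytes) pairs in the order the slave
        sent them over one ordered stream.
    master_read_rate: bytes per second the (busy) master manages to read.
    """
    stream = deque(queued_messages)
    bytes_read = 0
    while stream:
        label, size = stream.popleft()
        bytes_read += size  # the master must consume this message first
        if label == PING_REPLY:
            return bytes_read / master_read_rate
    raise ValueError("no ping reply in the stream")


def connection_dropped(queued_messages, master_read_rate, ping_timeout):
    """True if the reply arrives after the (assumed) ping timeout."""
    delay = time_until_ping_reply(queued_messages, master_read_rate)
    return delay > ping_timeout


# Hypothetical numbers: 20 log chunks of 64 KiB queued ahead of the reply,
# a loaded master reading 32 KiB/s, and a 10 second ping timeout.
busy = [("log-chunk", 65536)] * 20 + [(PING_REPLY, 64)]
idle = [(PING_REPLY, 64)]

print(connection_dropped(busy, 32768, 10))  # reply stuck behind build output
print(connection_dropped(idle, 32768, 10))  # nothing queued ahead of reply
```

In this model the only difference between the two cases is the amount of
build output queued ahead of the reply, which matches the observation that
the slave itself is healthy when the master drops it.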
--
Ticket URL: <http://buildbot.net/trac/ticket/664#comment:2>
Buildbot <http://buildbot.net/>
Buildbot: build/test automation