[Buildbot-commits] [Buildbot] #1703: Use a shorter timeout for old slave disconnection (perhaps based on configuration)

Buildbot buildbot-devel at lists.sourceforge.net
Fri Dec 3 19:02:21 UTC 2010


#1703: Use a shorter timeout for old slave disconnection (perhaps based on
configuration)
------------------------+---------------------------------------------------
Reporter:  exarkun      |       Owner:           
    Type:  enhancement  |      Status:  new      
Priority:  major        |   Milestone:  undecided
 Version:  0.8.2        |    Keywords:           
------------------------+---------------------------------------------------
 If there's a network hiccup and a slave loses its connection, but the
 master doesn't notice (i.e., never gets the FIN), then when the slave tries
 to reconnect, it will be a dozen minutes (give or take) before the master
 will accept it.  This is because the ping done in
 `BotMaster.getPerspective` to see if the old connection is still alive
 relies on the TCP-level timeouts to cause the connection to really end.

 For buildbot's purposes, a timeout of 1 or 2 minutes is probably equally
 valid in this circumstance.  It would be nice if this were either the
 default, or if there were a way to specify what the timeout used here
 should be.
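
 As a rough illustration of a configurable timeout, here is a minimal
 sketch.  It is not Buildbot code: it uses the standard library's asyncio
 rather than Twisted so that it is self-contained, and the names
 `old_connection_alive`, `ping` and `timeout` are hypothetical.  Note that
 this naive form times the ping response itself, not general activity on
 the connection.

```python
import asyncio


async def old_connection_alive(ping, timeout=120.0):
    """Return True if the awaitable returned by ping() finishes in time.

    `ping` is a hypothetical zero-argument callable standing in for the
    remote call made on the old connection; `timeout` is the configurable
    limit in seconds (2 minutes here, an assumed default).
    """
    try:
        await asyncio.wait_for(ping(), timeout)
    except asyncio.TimeoutError:
        # No answer within the configured limit: treat the old
        # connection as dead instead of waiting for TCP to give up.
        return False
    return True
```

 A caller would drop the old connection and accept the new one when this
 returns False.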

 One thing to be careful of, though, is that ''any'' activity from the old
 connection should be treated as sufficient to keep it alive.  That is,
 even if the ping response (the "print" remote call response, really) is
 delayed behind a large payload (e.g., an upload of a build artifact) or even
 behind a long line of smaller payloads, such that it doesn't arrive until
 after the configured timeout, the old connection should still remain
 alive.  The timeout should just be for ''any'' data from the old
 connection.
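
 The "any data resets the timeout" rule could be sketched as an idle
 tracker like the one below.  This is only an assumption about how it
 might look, not existing Buildbot or Twisted API: the class name
 `IdleTimeout` and its methods are hypothetical, and the clock is
 injectable purely so the behaviour can be checked without waiting.

```python
import time


class IdleTimeout:
    """Track how long a connection has been silent (hypothetical helper)."""

    def __init__(self, timeout, clock=time.monotonic):
        self.timeout = timeout          # configured limit, in seconds
        self._clock = clock
        self._last_activity = clock()   # start counting from creation

    def data_received(self, data=b""):
        # Any bytes at all count as activity, not just the ping response,
        # so a reply queued behind a large upload cannot cause a timeout.
        self._last_activity = self._clock()

    def expired(self):
        # True only once the connection has been silent for the full timeout.
        return self._clock() - self._last_activity >= self.timeout
```

 The old connection would call `data_received` for every chunk it reads,
 and the master would only drop it once `expired()` is true.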

-- 
Ticket URL: <http://buildbot.net/trac/ticket/1703>
Buildbot <http://buildbot.net/>
Buildbot: build/test automation

