[Buildbot-devel] Expected behavior of bots when SIGHUP received?

Wed Sep 21 13:46:39 UTC 2011

On Sep 21, 2011, at 9:30 AM, Todd Cooper wrote:

> I think it is because a slave does not need to do a 'reconfig'. I
> don't agree with the different behavior, but I see how it got there.
> Does it make sense to reconfig a slave?

Sort of - see Dan's original comment below.

>> From: Dan Kegel <dank at kegel.com>
>> To: BuildBot Devel <buildbot-devel at lists.sourceforge.net>
>> Sent: Wednesday, September 21, 2011 8:54 AM
>> Subject: [Buildbot-devel] Expected behavior of bots when SIGHUP received?
>>
>> master/buildbot/master.py says
>>     def _handleSIGHUP(self, *args):
>>         reactor.callLater(0, self.loadTheConfigFile)
>>
>> slave/buildslave/bot.py says
>>     def _handleSIGHUP(self, *args):
>>         log.msg("Initiating shutdown because we got SIGHUP")
>>         return self.gracefulShutdown()
>>
>> Why the asymmetry?
>>
>> In general, I think processes should do an orderly shutdown on SIGTERM,
>> and reconfigure themselves on SIGHUP.  Buildslaves get most of
>> their config from the master, so perhaps the current buildslave
>> behavior should be moved to a SIGTERM handler, and the SIGHUP
>> handler force a retry of the connection to the server.
>>
>> I ask because when I restart the master after a code change,
>> I often want to get the slaves to reconnect right away,
>> and a SIGHUP would be the easiest way to do that.

^ This is about as much of a reconfig as a slave would do.

I tend to agree that SIGHUP should trigger a reconfig, but I'm not
sure SIGTERM is the right signal for *gracefully* shutting down a
slave, either. Maybe SIGINT? The reason is that SIGTERM usually means
"this is your last warning before the system hands out a bunch of
SIGKILLs", and the delay between the two signals is probably shorter
than most builds would take to complete.
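
To make that split concrete, here is a minimal sketch using only
Python's standard signal module. It is not the actual buildslave code:
the class name SlaveSignalPolicy and its methods are hypothetical, and
a real slave would hook these into the Twisted reactor and its
connection retry logic instead of recording strings. SIGHUP asks for
an immediate reconnect, SIGINT asks for a graceful shutdown, and
SIGTERM is deliberately left alone.

```python
import signal


class SlaveSignalPolicy:
    """Hypothetical stand-in for the slave's bot; illustration only."""

    def __init__(self):
        # Record of what each received signal asked for, in order.
        self.actions = []

    def reconnect(self, signum, frame):
        # SIGHUP: the closest thing a slave has to a "reconfig" is
        # retrying the master connection now instead of waiting.
        self.actions.append("reconnect")

    def graceful_shutdown(self, signum, frame):
        # SIGINT: let running builds finish, then exit.
        self.actions.append("graceful_shutdown")

    def handlers(self):
        # SIGTERM is intentionally absent: it keeps whatever handling
        # the process already has, so a system shutdown is not held
        # up by a long build.
        return {
            signal.SIGHUP: self.reconnect,
            signal.SIGINT: self.graceful_shutdown,
        }

    def install(self):
        # signal.signal() must be called from the main thread.
        for signum, handler in self.handlers().items():
            signal.signal(signum, handler)
```

With something like this in place, calling SlaveSignalPolicy().install()
at startup and then running "kill -HUP <pid>" from an ssh session would
request a reconnect without touching builds in progress.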

Dan: regarding your master restart situation, I'm wondering if the  
slave connection retry code isn't doing the right thing for you. I  
have one BuildBot setup which has an absurdly complex master config,  
and if I do a "buildbot restart" (versus a manual stop then start),  
the slaves have all connected by the time I switch to the web browser  
to view the slave status page. Or do you need to gracefully stop the  
master before restarting with the new configuration?

>> (Yeah, I could make them poll more often, but maybe that isn't always
>> an option.)
>> (SIGHUP is easier than starting and stopping the normal way
>> because my build slaves need access to the local desktop,
>> which I don't have when ssh'ing in remotely.)

-- 
Charles Lepple

A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?