[Buildbot-devel] Dependent schedulers not being triggered

Kenneth Lareau Ken.Lareau at nominum.com
Fri Mar 16 21:13:35 UTC 2007


Sorry for the delayed response, but right around the time you sent your
initial email it began working again... then stopped working again...
then started working again... etc.  Right now it seems not to be working,
so I will comment inline below:

Brian Warner wrote:
> Kenneth Lareau <Ken.Lareau at nominum.com> writes:
> 
>> Follow-up to my previous message: I have no idea if I'm going in the
>> right direction here, but I decided to run a few commands from within
>> a manhole connection:
> 
> (I'm running out the door for a week out of town before PyCon, so I apologize
> for the brevity of my response)
> 
> I can't think of an immediate cause for this problem, but I'll throw out a
> couple of things that might help steer you in the right direction.
> 
> 'expectations' is a red herring, it has to do with ETA calculation, and is
> present if the Builder instance had a previous Build to extract timing
> information from. It will probably be None if this is the first Build that's
> been done since the buildmaster was started (since expectations aren't
> persistent across a restart).

Thanks for letting me know this was the wrong path to pursue; I wasn't
certain, but I'm glad it's been cleared up for future reference.

> 2007/01/02 05:20 PST [-] maybeStartBuild:[<buildbot.process.base.BuildRequest instance at 0xb1ce1b8c>] [<buildbot.process.builder.SlaveBuilder instance at 0xb32e9f8c>]
> 
> this maybeStartBuild log line emits two lists. The first is the .buildable
> attribute: all the BuildRequests that are ready to go. The second is .slaves:
> a list of all the SlaveBuilder instances (one per buildslave) which it knows
> about. Some of these SlaveBuilders might be busy doing other builds.. that
> test is done a few lines after this log message is emitted (the first line in
> buildbot/process/builder.py, Builder.maybeStartBuild(), about line 500).
> 
> Dependent schedulers work by having the "downstream" scheduler subscribe to
> the "upstream" scheduler, to watch for all builds which succeed. Each time
> one does, the downstream scheduler extracts the SourceStamp from the
> successful build, then submits its own set of BuildRequests with the same
> SourceStamp. This happens in buildbot/scheduler.py:Dependent.upstreamBuilt(),
> about line 307.
> 
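For anyone following along, the subscription mechanism described above can
be sketched roughly as follows. This is only an illustration, not Buildbot's
actual code: the two classes, the subscribe() and buildSucceeded() methods,
and the 'submitted' list are hypothetical stand-ins. Only the names
successWatchers, upstream, builderNames and upstreamBuilt come from this
thread.

```python
# Hypothetical stand-ins for illustration; these are NOT Buildbot's classes.

class UpstreamScheduler:
    def __init__(self, name):
        self.name = name
        # Callables to notify whenever one of our builds succeeds.
        self.successWatchers = []

    def subscribe(self, watcher):
        # Hypothetical method name: register a downstream callback.
        self.successWatchers.append(watcher)

    def buildSucceeded(self, source_stamp):
        # Hypothetical hook: hand the SourceStamp to every subscriber.
        for watcher in list(self.successWatchers):
            watcher(source_stamp)


class DependentScheduler:
    def __init__(self, name, upstream, builderNames):
        self.name = name
        self.upstream = upstream
        self.builderNames = builderNames
        # Stand-in for real BuildRequests: (builder name, source stamp) pairs.
        self.submitted = []
        upstream.subscribe(self.upstreamBuilt)

    def upstreamBuilt(self, source_stamp):
        # Request one build per downstream builder, reusing the same stamp.
        for builder in self.builderNames:
            self.submitted.append((builder, source_stamp))


nightly = UpstreamScheduler("nightly")
benchmark = DependentScheduler("benchmark", nightly, ["bench-a", "bench-b"])
nightly.buildSucceeded("stamp-1")
print(benchmark.submitted)
```

The point of the sketch is that the downstream scheduler only hears about
builds from the exact upstream object it subscribed to. A second object with
the same name would never call its upstreamBuilt().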
> I'd look more carefully at the Dependent scheduler: make sure it is pointing
> at the correct upstream Scheduler, make sure that the upstream scheduler is
> actually the one scheduling these builds (if a different Scheduler is
> responsible for them, then your downstream Dependents won't ever notice
> them). I haven't looked carefully at what happens when the config file is
> reloaded w.r.t. Scheduler identity: I suspect there could be a problem in
> which a reconfig event that modified one of the Schedulers but not the other
> could leave you in a situation where your downstream Dependent is pointing at
> an unused copy of the upstream, in which case it would never get triggered.
> Restarting the buildmaster would clear this, at least until the next reconfig
> that modified one and not the other.
> 
>  (if this turns out to be the case, please file a bug on it.. I've run into
>  this problem in other parts of the config file and it is a confusing and
>  annoying issue, which is very satisfying to get fixed :).
> 
> To investigate this, look at the list returned by master.allSchedulers(),
> find the Dependent one, look at its .upstream instance, and make sure the
> upstream is actually on the master's list. If it points to something that
> isn't on that list, that's a sign of this bug. (note that it would probably
> point to something with all the same attributes as something on that list,
> but the id() value would be different). If a Scheduler isn't on the list
> returned by master.allSchedulers(), then it isn't going to hear about Changes
> and probably won't ever trigger a build.

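The identity check described above can be sketched as a small helper. This
is a minimal sketch under assumptions: find_stale_upstreams() and
FakeScheduler are hypothetical names made up for illustration, and the only
thing assumed about real schedulers is that a Dependent one has an
.upstream attribute, as mentioned above. It compares by id() rather than by
name, since a stale copy would carry the same name and attributes.

```python
def find_stale_upstreams(schedulers):
    """Return schedulers whose .upstream is not, by identity, in the list."""
    live_ids = set(id(s) for s in schedulers)
    stale = []
    for s in schedulers:
        upstream = getattr(s, "upstream", None)
        # Schedulers without an upstream (non-Dependent ones) are skipped.
        if upstream is not None and id(upstream) not in live_ids:
            stale.append(s)
    return stale


# Hypothetical stand-in used only to exercise the helper.
class FakeScheduler:
    def __init__(self, name, upstream=None):
        self.name = name
        self.upstream = upstream


old_nightly = FakeScheduler("nightly")
new_nightly = FakeScheduler("nightly")   # same attributes, different object
benchmark = FakeScheduler("benchmark", upstream=old_nightly)

# Simulates the suspected reconfig problem: the active list holds the new
# copy, while the dependent scheduler still points at the old one.
stale = find_stale_upstreams([new_nightly, benchmark])
print([s.name for s in stale])
```

From a manhole session, the equivalent call would presumably be
find_stale_upstreams(master.allSchedulers()); an empty result would suggest
the stale-reference problem is not what is happening.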
Okay, I did as you suggested, and got the following information today
(with the dependent builds still not working):

 >>> pprint.pprint([vars(a) for a in master.allSchedulers() if repr(a).find('benchmark') != -1])
[{'builderNames': ['benchmark ans-dist benchmarking-0',
                    'benchmark cns-dist benchmarking-0',
                    'benchmark vantio-dist benchmarking-0',
                    'benchmark dcs-dist benchmarking-0',
                    'benchmark dhcperf-dist benchmarking-0',
                    'benchmark eac-dist benchmarking-0',
                    'benchmark gecko-dist benchmarking-0',
                    'benchmark navitas-dist benchmarking-0',
                    'benchmark dhcptest-package benchmarking-0',
                    'benchmark dhcp-testtools-package benchmarking-0',
                    'benchmark benchmark benchmarking-0'],
   'name': 'benchmark-rhel-4-x86-64-0',
   'namedServices': {},
   'parent': <buildbot.master.BuildMaster instance at 0xb72160ec>,
   'running': 1,
   'services': [],
   'successWatchers': [],
   'upstream': <Scheduler 'nightly-rhel-4-x86-64-0' at -1369496404>}]

So there's the upstream mentioned... checking:

 >>> pprint.pprint([vars(a) for a in master.allSchedulers() if repr(a).find('nightly-rhel-4-x86-64-0') != -1])
[{'branch': None,
   'builderNames': ['ans-dist head rhel-4-x86-64-0',
                    'cns-dist head rhel-4-x86-64-0',
                    'dcs-dist head rhel-4-x86-64-0',
                    'dcs-dist 2.0.4.branch rhel-4-x86-64-0',
                    'dcs-dist 2.0.5.branch rhel-4-x86-64-0',
                    'dcs-dist 2.0.6.branch rhel-4-x86-64-0',
                    'dhcperf-dist head rhel-4-x86-64-0',
                    'eac-dist head rhel-4-x86-64-0',
                    'gecko-dist head rhel-4-x86-64-0',
                    'navitas-dist head rhel-4-x86-64-0',
                    'dhcptest-package head rhel-4-x86-64-0',
                    'dhcp-testtools-package head rhel-4-x86-64-0',
                    'all head rhel-4-x86-64-0'],
   'dayOfMonth': '*',
   'dayOfWeek': '*',
   'delayedRun': <twisted.internet.base.DelayedCall instance at 0xac48ee6c>,
   'hour': [0],
   'minute': 41,
   'month': '*',
   'name': 'nightly-rhel-4-x86-64-0',
   'namedServices': {},
   'nextRunTime': 1174120860.0,
   'parent': <buildbot.master.BuildMaster instance at 0xb72160ec>,
   'running': 1,
   'services': [],
   'successWatchers': [<bound method Dependent.upstreamBuilt of <Scheduler 'benchmark-rhel-4-x86-64-0' at -1369496244>>]}]


So it's there, but... the dependent builds still didn't run last night,
which has me scratching my head.  Do you have any further suggestions?
I'm really not sure what to make of this.

> 
> ok, gotta run.. hope this helped,
>  -Brian
> 

Ken Lareau



