[Buildbot-devel] Scheduler issues after upgrade from 0.8.5 to 0.8.6p1

Iustin Pop iustin at google.com
Mon Jun 11 08:05:44 UTC 2012


On Fri, Jun 08, 2012 at 11:25:15AM -0600, Tom Prince wrote:
> Iustin Pop <iustin at google.com> writes:
> 
> > Hi all,
> >
> > I see two strange things after upgrade to 0.8.6p1 but not sure if they
> > are bugs or just issues with our config. Both seem to be related to
> > schedulers, btw.
> >
> > First, changes generated by our custom git poller don't seem to be
> > picked up anymore. One scheduler example:
> >
> >   AnyBranchScheduler(name="all",
> >                      branches=UNITTEST_BRANCHES,
> >                      treeStableTimer=30,
> >                      builderNames=builder_tests),
> >
> > And the change appears in the master log:
> >
> > 2012-06-08 11:29:16+0000 [-] added change Change(revision=u'656db618b5690b4f655856875933fac9692fb7c7', who=u'Iustin Pop <iustin at google.com>', branch=u'stable-2.6', comments=u'Remove one obsolete hlint override', when=1339154956, category=None, project=u'', repository=u'') to database
> >
> > But even though 'stable-2.6' is in the UNITTEST_BRANCHES list, it's
> > never picked up.
> 
> This may be a bug, but almost every time somebody has reported a
> scheduler not firing, it has been a misconfiguration somewhere.

Could be, indeed, but note that this worked fine with 0.8.5. See below.

> > Second issue is that it seems builds generated via 'try' or even
> > timed.Nightly are sometimes 'stuck' in pending and do not go away for
> > hours, until master restart.
> 
> I'm not sure what to suggest about this, except to ask for more
> information. All the slaves for the associated builders are online?

Yes. What I've realised since sending the email is that
db_poll_interval is not just (as the docs say) mostly useful for
multi-master setups: if defined, it actually changes the
behaviour of buildbot. For whatever reason, we had this parameter
defined.

After removing it, everything works fine. I guess what was happening was
that some unhandled exception during the db poll was breaking the reset
of the timer. So there's some bug, I believe, that affects multi-master
setup or any other env where db_poll_interval is set; but for our case,
this went away after removing the param and not relying on the poll
loop.
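
To illustrate the failure mode guessed at above, here is a minimal
sketch of how an unhandled exception in a poll callback can prevent a
timer from being re-armed. It is not buildbot code: FakePoller,
tick_fragile, tick_robust and failing_poll are hypothetical names
invented for the example, and the real cause was not confirmed in this
thread.

```python
class FakePoller:
    """Hypothetical stand-in for a periodic poll loop; not buildbot code."""

    def __init__(self, poll_fn):
        self.poll_fn = poll_fn
        self.timer_armed = True  # a timer is pending before the first tick

    def tick_fragile(self):
        # The timer fires (and is consumed), then the poll runs.
        self.timer_armed = False
        self.poll_fn()           # if this raises, the next line is skipped
        self.timer_armed = True  # re-arm the timer for the next poll

    def tick_robust(self):
        self.timer_armed = False
        try:
            self.poll_fn()
        finally:
            # Re-arm even when the poll fails, so polling continues.
            self.timer_armed = True


def failing_poll():
    # Simulated failure; the actual exception (if any) is unknown.
    raise RuntimeError("simulated db poll failure")


def run_tick(tick):
    """Run one tick, swallowing the error the way a top-level logger might."""
    try:
        tick()
    except RuntimeError:
        pass


fragile = FakePoller(failing_poll)
run_tick(fragile.tick_fragile)

robust = FakePoller(failing_poll)
run_tick(robust.tick_robust)

print(fragile.timer_armed, robust.timer_armed)  # prints: False True
```

In the fragile variant a single failed poll leaves no timer pending, so
nothing further is processed until a restart, which would match the
'stuck in pending' symptom.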

> > 2012-06-08 10:07:14+0000 [-] WARN attached_slaves: 0 7
> >
> > because:
> > 2012-06-08 10:17:14+0000 [-] Counter BotMaster.attached_slaves: 0
> > 2012-06-08 10:17:14+0000 [-] Counter AbstractBuildSlave.attached_slaves: 7
> 
> I'm not sure what causes this, but I wouldn't be surprised if it is
> harmless. The metrics code hasn't seen much work, and it looks like the
> counter names don't match up everywhere.

OK, I'll ignore that then.

> > 2012-06-08 10:00:14+0000 [HTTPChannel,0,127.0.0.1] unable to save build qa-kvm-tiny-#3
> > 2012-06-08 10:00:14+0000 [HTTPChannel,0,127.0.0.1] Unhandled Error
> > &
> >   File "/usr/local/lib/python2.6/dist-packages/buildbot-0.8.6p1-py2.6.egg/buildbot/status/build.py", line 408, in saveYourself
> >    dump(self, open(tmpfilename, "wb"), -1)
> >   cPickle.PicklingError: Can't pickle <type 'function'>: attribute lookup __builtin__.function failed
> 
> I think this is fixed by
> https://github.com/tomprince/buildbot/commit/pickle-error

Thanks, I've applied that directly, we'll see how it goes.
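
For context, the PicklingError in the quoted traceback is the kind of
error pickle raises when the object being saved holds a reference to a
function that cannot be looked up by name. A minimal sketch using only
the standard library follows; the dictionary and its 'callback' key are
hypothetical and do not reflect the actual build status attributes or
the contents of the fix linked above.

```python
import pickle


def try_pickle(obj):
    """Return True if obj can be pickled with the highest protocol."""
    try:
        pickle.dumps(obj, -1)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        # The exact exception type varies between Python versions.
        return False


# Hypothetical state holding a function reference (here a lambda).
build_state = {
    "builder": "qa-kvm-tiny",
    "number": 3,
    "callback": lambda: None,
}

# One possible workaround: drop callables before saving.
picklable_state = {k: v for k, v in build_state.items() if not callable(v)}

print(try_pickle(build_state), try_pickle(picklable_state))
```

Dropping the function reference before saving is only one way to avoid
the error; whether the linked commit does this is not stated here.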

Thanks for the reply!
iustin



