[users at bb.net] Tips on debugging Buildbot scheduling failures?

Steven Johnson srj at google.com
Thu Oct 4 17:54:00 UTC 2018


(New to the list, apologies if this is old/dupe/known-issue)

A project I work on (Halide <https://github.com/halide/Halide>) uses
Buildbot for our build/test setup; we've been using it for several years,
but recently we're having a lot of issues with builds not being triggered
in response to GitHub PRs being opened. There isn't any obvious pattern to
this that we've found so far; some PRs simply don't trigger properly. Often
(but not always) stopping and restarting our build master will "heal"
things and cause the changes to actually get scheduled. (We were running an
older version, but upgrading everything to 1.4 doesn't seem to have
affected this at all.)

I've been poring through the twistd.log files on our master but haven't
seen anything obvious -- the log indicates that the changes in question get
seen and added to the DB, but (apparently) never scheduled (even when 100%
of our workers are sitting idle). Our config doesn't seem particularly
complex (SingleBranchScheduler and ForceScheduler are the only schedulers
we use; SingleBranchScheduler is the one that is failing randomly).

Anyway, my real question here is whether anyone can offer suggestions for
debugging this -- is there an extra-verbose scheduler flag I can set (which
I haven't yet found)? Is there any cookbook / FAQ / etc on this sort of
thing? Is there a way (in the logs and/or UI) to monitor
pending-but-not-scheduled jobs?
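
For reference, one way to look for pending work is to ask the master's REST
API for build requests that have not been claimed. The following is only a
minimal sketch, not a known fix: it assumes the master exposes the
/api/v2/buildrequests endpoint with a "claimed" filter, and the helper names,
base URL, and sample payload are hypothetical. An empty result while changes
keep arriving would suggest the scheduler never created a request at all,
rather than a request sitting unclaimed.

```python
import json
import urllib.request


def fetch_unclaimed(base_url):
    """Fetch build requests that no master has claimed yet.

    base_url is the root URL of the Buildbot web UI (an assumption here).
    """
    url = base_url.rstrip("/") + "/api/v2/buildrequests?claimed=false"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def summarize_pending(payload, now):
    """Return (buildrequestid, builderid, seconds_waiting), oldest first."""
    rows = [
        (r["buildrequestid"], r["builderid"], now - r["submitted_at"])
        for r in payload.get("buildrequests", [])
        if not r.get("claimed") and not r.get("complete")
    ]
    return sorted(rows, key=lambda row: row[2], reverse=True)


# Hypothetical payload shaped like a buildrequests response; with a live
# master this would come from fetch_unclaimed("http://localhost:8010").
sample = {
    "buildrequests": [
        {"buildrequestid": 11, "builderid": 2, "submitted_at": 1538675520,
         "claimed": False, "complete": False},
        {"buildrequestid": 12, "builderid": 3, "submitted_at": 1538675040,
         "claimed": False, "complete": False},
        {"buildrequestid": 13, "builderid": 2, "submitted_at": 1538675000,
         "claimed": True, "complete": False},
    ]
}

pending = summarize_pending(sample, now=1538675640)
for request_id, builder_id, waited in pending:
    print(f"request {request_id} on builder {builder_id}: waiting {waited}s")
```

Running this periodically and comparing the output against the changes
recorded in twistd.log should show whether a given change ever produced a
build request.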

Thanks in advance for any suggestions (including pointers to old threads on
this list) that anyone can give.