[Buildbot-devel] Dead job permanently yellow?
dianemaple76 at yahoo.com
Fri Mar 20 17:44:23 UTC 2015
I had issues like this on the buildbot systems I run. I upgraded to 0.8.8 at the time, and am now on 0.8.9 on most systems.

0.8.8 added the "--clean" option to restart. When given, it tells the master not to start any new builds and to restart once the current builds have finished. It also works with stop. This cleared up, worked around, or prevented most of the issues.

The other thing is that I have a few hundred builds on a master and about a hundred on two others. I had to add the sqlite flag to serialize access to the DB file; otherwise the master checking slave states while a web refresh was in progress would get the master into a weird state. It looked like a deadlock. Basically, since multiple sqlite queries are needed per action, they would interleave, eventually to the point where the web browser would time out before the master had all of the data to display, and the slaves would sometimes just stop working. I'm guessing this was due to some db timeout, or some other timeout caused by the db issues, which in the end basically made the slave non-functional.
On Friday, March 20, 2015 9:56 AM, Dan Kegel <dank at kegel.com> wrote:
One more symptom: two slaves are unhappy about a stray lock file:
fatal: Unable to create
Could restarting the server in the middle of a git step do that?
The slave in question is running buildbot-slave-0.8.7.
On Fri, Mar 20, 2015 at 9:45 AM, Dan Kegel <dank at kegel.com> wrote:
> So... our buildbot got in that nasty state again where:
> - the waterfall shows a builder as yellow
> - clicking on the build shows it as yellow (with revision ??, since it
> failed during git step)
> - no buildslave is yellow, all are idle
> Forcing a new build that then fails does leave the builder red, but
> the old yellow job (on a different slave) is still there.
> Restarting the master doesn't help.
> Rebooting the slave doesn't help.
> Everything is normal otherwise. Even if I disable all other slaves, the
> slave with the false yellow job still accepts new jobs properly.
> I guess I can just ignore the problem, but it feels a little funny.
> I think I triggered this by restarting the master while the slave was
> in the middle of a git step.