[users at bb.net] Scaling buildbot. 7 seconds to fetch waterfall. Is 1400 builders too much for sqlite?

Pierre Tardy tardyp at gmail.com
Thu Feb 18 18:01:34 UTC 2016


1400 builders is a lot indeed!

Even with buildbot nine, we will probably need to optimize the UI requests
a little more in order to display the builders page.
You would need to download about 100k just to fetch the list of builders.

For me, the waterfall or builders page is really not useful when you have
that many builders (on eight or nine).
I think you would probably need a specific dashboard in order to split
your builder matrix.

Adding more cores is not something I would recommend, as buildbot is
fundamentally single-threaded.
On buildbot eight, the database is really only about the buildrequests and
changes.
I don't think that switching to mysql will help at all with loading the
waterfall.
The best option for you is to run https://pypi.python.org/pypi/statprof/ over
manhole (http://docs.buildbot.net/latest/manual/cfg-global.html#manhole),
which will definitely tell you where buildbot's code is hanging.
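
A rough sketch of what such a profiling session could look like once you are
connected to the manhole. It assumes the statprof package is installed in the
master's Python environment; the two helper names are made up for this
example, and only statprof's start(), stop() and display() calls are used.

```python
# Sketch of helpers one might paste into a manhole session on the master.
# Assumption: the statprof package is importable by the master process.

def start_profiling():
    """Begin sampling; call this just before loading the slow page."""
    import statprof  # imported lazily, so defining the helper never fails
    statprof.start()


def stop_profiling():
    """Stop sampling and print the report of where time was spent."""
    import statprof
    statprof.stop()
    statprof.display()
```

The idea would be to call start_profiling(), fetch the waterfall in a browser,
then call stop_profiling() and look at the functions at the top of the report.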

You can increase the buildCacheSize; that may help by trading memory for
CPU time.
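
As an illustration, the setting goes in master.cfg. This is only a sketch: the
value 50 is an arbitrary example, not a recommendation, and the right number
for your site has to be found by experiment.

```python
# master.cfg fragment (sketch). In a real master.cfg the dictionary `c`
# already exists; it is created here only so the fragment runs on its own.
c = BuildmasterConfig = {}

# Hypothetical value: number of builds per builder kept in memory.
# A larger cache uses more RAM but should save work when pages are rendered.
c['buildCacheSize'] = 50
```

With well over a thousand builders, the memory cost grows with the number of
builders times the cache size, so it is probably safer to raise the value in
steps and watch the master's memory use.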

As for nine, we are approaching a release; cancel/stop has been working
for 6+ months.
We have to see how the UI will work with that many builders. For sure it will
never hang the master process for 7 seconds, but we might have to work
together in order to optimize some parts.

On Thu, Feb 18, 2016 at 18:37, Dan Kegel <dank at kegel.com> wrote:

> To recap, my site is having response time problems with buildbot 0.8.8
> (yeah, it's old).
> We have 1356 builders, and used to have 190 gitpollers.
> Gitpoller overhead was killing us, and we've slowly been migrating our
> git repos to gitlab so we could use webhooks, so we're now down to 72
> gitpollers, and buildmaster cpu load is usually low.
> Developers are happier than ever with the quick response to checkins,
> and with the gitlab merge request -> buildbot try build gateway I threw
> together (which finally made buildbot try builds usable by mere mortals!).
>
> But the waterfall takes 7 seconds to fetch.  Even the /builders page
> takes 5 to 6 seconds to fetch, which these days is an eternity.
> Doing both at once takes 15 seconds (even on my 4-core Xeon VM).
> Developers avoid the waterfall at all costs, and go straight to
> individual builder pages.
> But they don't have the right builder in their browser completion list all
> the time, so they asked me to create a static builders.html page without
> status.
> I now generate that on each reconfigure, and it loads in 5 milliseconds.
> This made them a little happier.
> I think they'd be happier still if I added static links to the builders
> from the gitlab page for each project, so they could avoid the buildbot
> UI even more, and stay in happy, beautiful gitlab land as long as possible.
>
> Hmm, maybe I should assign another few cores to the buildmaster and
> see what happens.
>
> I wonder if my use of sqlite is part of the problem.  Has anyone
> with > 1000 builders noticed a radical decrease in time to fetch the
> waterfall upon switching from sqlite to mysql?
>
> And how's nine coming along?  Last I heard, it still lacked a 'cancel
> build' button, which would be an issue here.
> - Dan