[Buildbot-devel] Practical limit to number of builders?

Dan Kegel dank at kegel.com
Mon Oct 27 18:00:11 UTC 2014


On Fri, Oct 3, 2014 at 6:33 PM, Mikhail Sobolev <mss at mawhrin.net> wrote:
>> > What kind of database do you use?
>>
>> sqlite.  Think that could be the problem?
> This is definitely one thing to check: sqlite is pretty OK for basic
> needs, and your needs do not seem to be that basic.

I looked at this a bit more.  When rendering the waterfall:

If 'top' shows the system is idle before I click 'reload' on the waterfall
page, the render finishes "quickly" (about ten seconds), and twistd uses
330% CPU (so it is multithreading nicely?).
(This is so even if a du is keeping the disk busy.)

If 'top' shows buildbot is doing git polling (i.e. about 100% CPU use
in twistd and 3-10 'git' instances and/or zombies), the render finishes
"slowly" (about 35 seconds).
Fewer git instances -> render finishes faster.

So git polling appears to be slowing down the waterfall significantly.
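
A rough way to repeat the comparison above without watching 'top' by hand
is to record how many git processes exist and then time one fetch of the
waterfall page. This is only a sketch: the function names are made up for
illustration, the URL in the final comment is a hypothetical example, and
it assumes a 'ps' that accepts '-A -o comm='.

```python
import subprocess
import time
import urllib.request


def count_git_processes(ps_output):
    """Count entries in `ps -A -o comm=` output whose command name is 'git'.

    Zombies show up as 'git <defunct>' on Linux, so only the first word
    of each line is compared.
    """
    count = 0
    for line in ps_output.splitlines():
        words = line.split()
        if words and words[0] == "git":
            count += 1
    return count


def time_fetch(url):
    """Return the seconds taken to download `url` completely."""
    start = time.monotonic()
    with urllib.request.urlopen(url) as response:
        response.read()
    return time.monotonic() - start


def sample(url):
    """Return (number of git processes seen, seconds to fetch url)."""
    ps = subprocess.run(["ps", "-A", "-o", "comm="],
                        capture_output=True, text=True, check=True)
    return count_git_processes(ps.stdout), time_fetch(url)


# Example (hypothetical URL; adjust host and port to the master's web status):
# print(sample("http://localhost:8010/waterfall"))
```

Calling sample() a few times while polling is and is not in progress would
give pairs of (git process count, render seconds) to compare.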

Is there a more efficient way to do large numbers of git polls?

I also profiled the system a bit to check whether sqlite was slow, using
$ perf record -e cpu-clock -v -a -g sleep 20
$ perf report
while restarting the master:

Events: 79K cpu-clock
+  24.62%         twistd  [kernel.kallsyms]         [k] __ticket_spin_lock
+  22.23%        swapper  [kernel.kallsyms]         [k] native_safe_halt
+  10.01%         twistd  [kernel.kallsyms]         [k] _raw_spin_unlock_irqrestore
+   9.36%         twistd  libsqlite3.so.0.8.6       [.] 0x3f55d
+   3.90%         twistd  python                    [.] 0x16ffcc
+   3.60%         twistd  [kernel.kallsyms]         [k] finish_task_switch

Using -g seems to show that both __ticket_spin_lock and
_raw_spin_unlock_irqrestore are futex-related
(which makes sense if twisted is using epoll, I guess, but still seems
kind of high).

Anyway, it doesn't seem offhand that sqlite is my problem...
- Dan



