[Buildbot-devel] Practical limit to number of builders?
Dustin J. Mitchell
dustin at v.igoro.us
Fri Oct 31 21:50:03 UTC 2014
If it does help, that'd be fairly straightforward to add a limit like
that to Buildbot itself.
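
For the experiment itself, a wrapper along these lines might be enough. This
is only a rough sketch, not anything that exists in Buildbot: the paths, the
slot count, and the names acquire_slot and run_limited are all made up for
illustration. It uses flock() on a small set of lock files as a counting
semaphore, so at most MAX_PARALLEL real git processes run at a time.

```python
#!/usr/bin/env python3
"""Hypothetical 'git' wrapper that caps how many git processes run at once."""
import fcntl
import os
import subprocess
import sys
import time

REAL_GIT = "/usr/bin/git"          # assumption: location of the real git binary
LOCK_DIR = "/tmp/git-limit-locks"  # assumption: any directory the master can write to
MAX_PARALLEL = 2                   # "one or two", as proposed for the experiment


def acquire_slot(lock_dir=LOCK_DIR, max_parallel=MAX_PARALLEL, poll_interval=0.1):
    """Block until one of max_parallel lock files can be locked; return the open file."""
    os.makedirs(lock_dir, exist_ok=True)
    while True:
        for i in range(max_parallel):
            f = open(os.path.join(lock_dir, "slot%d.lock" % i), "w")
            try:
                # Non-blocking exclusive lock: fails at once if the slot is taken.
                fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
                return f  # keeping this file open keeps the slot
            except BlockingIOError:
                f.close()  # slot busy, try the next one
        time.sleep(poll_interval)  # every slot busy, wait and retry


def run_limited(args, real_git=REAL_GIT, **slot_kwargs):
    """Run real_git with args while holding a slot; return its exit status."""
    slot = acquire_slot(**slot_kwargs)
    try:
        return subprocess.call([real_git] + list(args))
    finally:
        slot.close()  # closing the file releases the flock


# With no arguments there is nothing to wrap, so do nothing.
if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(run_limited(sys.argv[1:]))
```

To try it, save it as an executable named git somewhere ahead of the real one
in the master's PATH (or point the poller at it, if your GitPoller version
accepts a gitbin argument), then compare waterfall render times during polling.
The polling loop in acquire_slot is crude but keeps the sketch short.
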
Dustin
On Fri, Oct 31, 2014 at 4:35 PM, Dan Kegel <dank at kegel.com> wrote:
> I wonder if it's the sheer number of parallel git processes. I should
> try adding a wrapper around git to limit the number running at once to
> one or two and see if that helps.
>
> On Mon, Oct 27, 2014 at 6:37 PM, Dustin J. Mitchell <dustin at v.igoro.us> wrote:
>> The git polling is almost entirely accomplished by forking 'git' processes
>> -- there's *very* little processing done by Buildbot itself. And the
>> waterfall shouldn't multi-thread at all: it reads almost entirely from
>> pickles, not the database, and thus shouldn't even get any parallelism from
>> concurrent database queries.
>>
>> Note that build pickles are cached pretty heavily. Is it possible that the
>> difference you're observing has to do with whether the cache is hot or cold?
>>
>> Dustin
>>
>> On Mon Oct 27 2014 at 2:00:25 PM Dan Kegel <dank at kegel.com> wrote:
>>>
>>> On Fri, Oct 3, 2014 at 6:33 PM, Mikhail Sobolev <mss at mawhrin.net> wrote:
>>> >> > What kind of database do you use?
>>> >>
>>> >> sqlite. Think that could be the problem?
>>> > This is definitely one thing to check: sqlite is pretty OK for basic
>>> > needs, and your needs do not seem to be that basic.
>>>
>>> I looked at this a bit more. When rendering the waterfall:
>>>
>>> If 'top' shows the system is idle before clicking 'reload' on the waterfall
>>> page, the render finishes "quickly" (about ten seconds), and twisted uses
>>> 330% CPU (so it's multithreading nicely?).
>>> (This is so even if a du is keeping the disk busy.)
>>>
>>> If 'top' shows buildbot is doing git polling (i.e. about 100% CPU use in
>>> twistd and 3-10 'git' instances and/or zombies), the render finishes
>>> "slowly" (about 35 seconds).
>>> Fewer git instances -> render finishes faster.
>>>
>>> So git polling appears to be slowing down the waterfall significantly.
>>>
>>> Is there a more efficient way to do large numbers of git polls?
>>>
>>> I also profiled the system a bit to check whether sqlite was slow, using
>>> $ perf record -e cpu-clock -v -a -g sleep 20
>>> $ perf report
>>> while restarting the master:
>>>
>>> Events: 79K cpu-clock
>>> + 24.62% twistd [kernel.kallsyms] [k] __ticket_spin_lock
>>> + 22.23% swapper [kernel.kallsyms] [k] native_safe_halt
>>> + 10.01% twistd [kernel.kallsyms] [k] _raw_spin_unlock_irqrestore
>>> + 9.36% twistd libsqlite3.so.0.8.6 [.] 0x3f55d
>>> + 3.90% twistd python [.] 0x16ffcc
>>> + 3.60% twistd [kernel.kallsyms] [k] finish_task_switch
>>>
>>> Using -g seems to show that both the __ticket_spin_lock and
>>> _raw_spin_unlock_irqrestore samples are futex-related
>>> (which makes sense if twisted is using epoll, I guess, but still seems
>>> kind of high).
>>>
>>> Anyway, it doesn't seem offhand that sqlite is my problem...
>>> - Dan