[users at bb.net] buildbot CPU usage

Tue Aug 2 16:36:33 UTC 2016

With gitpoller, it was easy to see: whenever the number of git
sessions from the poller went over 0 or so, web GUI performance was
poor. And if it went over 10, well, you could kiss the GUI goodbye for
several minutes.

One countermeasure was to randomize the polling intervals, a la

            interval = 6  # minutes
            self['change_source'].append(
                # Fuzz the interval to avoid slamming the git server and
                # hitting the MaxStartups or MaxSessions limits.
                # If you hit them, twistd.log will have lots of
                # "ssh_exchange_identification: Connection closed by remote host" errors.
                # See http://trac.buildbot.net/ticket/2480
                changes.GitPoller(repourl, branches=branchnames,
                                  workdir='gitpoller-workdir-' + name,
                                  pollinterval=interval*60 + random.uniform(-10, 10)))
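
For anyone who wants to try the fuzzing idea outside of a full master.cfg, here is a minimal standalone sketch. The helper name fuzzed_pollinterval and the project names are made up for illustration; only the arithmetic (interval*60 plus a uniform offset of up to 10 seconds either way) comes from the snippet above.

```python
import random


def fuzzed_pollinterval(minutes, jitter_seconds=10, rng=random):
    """Return a poll interval in seconds, offset by up to +/- jitter_seconds.

    Hypothetical helper; the result would be passed as pollinterval=...
    """
    return minutes * 60 + rng.uniform(-jitter_seconds, jitter_seconds)


# Hypothetical project names, standing in for the real list of repos.
projects = ['alpha', 'beta', 'gamma']

# One interval per project, drawn once when the config is loaded.
intervals = {name: fuzzed_pollinterval(6) for name in projects}

for name, seconds in sorted(intervals.items()):
    print('%s: poll every %.1f s' % (name, seconds))
```

Since each offset is drawn once at config load, every poller ends up with a slightly different period, so they drift apart over time instead of all firing in lockstep.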

That made life just barely bearable, at least while the number of
projects polled was under 50 or so.
What really helped was dropping the pollers and switching to
GitLab's webhooks.
We're at 190 projects now, of which 57 still use gitpoller, and it's
almost ok.  (I really have to move the last 57 onto GitLab.  Or, well,
since they're not critical, increase the polling interval...)
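
In case it helps, enabling the GitLab hook on the master side looks roughly like the sketch below. This assumes the 0.9-style c['www'] configuration; the port number is just the usual default and not taken from our setup, and newer releases may expect a dict of options (such as a secret) instead of True, so check the docs for your version.

```python
# Sketch of the relevant master.cfg fragment (assumes Buildbot 0.9-style config).
c = BuildmasterConfig = {}

c['www'] = dict(
    port=8010,  # assumed default web port, adjust to your setup
    # Accept change notifications pushed by GitLab instead of polling for them.
    change_hook_dialects={'gitlab': True},
)
```

Each GitLab project then gets a webhook pointing at the master's /change_hook/gitlab URL, and the matching GitPoller entry can be dropped from change_source.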

On Tue, Aug 2, 2016 at 9:13 AM, Pierre Tardy <tardyp at gmail.com> wrote:
> Hi,
>
> Pollers indeed usually do not scale well since they, hmm, poll.
> What you are describing hints that the Twisted reactor thread is
> always busy, which should not happen if you only start 10 builds.
> You might have some custom steps which are doing something heavily CPU-bound
> in the main thread.
> What I usually do is use statprof:
> https://pypi.python.org/pypi/statprof/
>
> in order to know what the CPU is doing.
> You could create a builder which you can trigger whenever you need, and which
> would start the profiling, wait a few minutes, and then save the profile to a
> file.
>
>
>
> On Tue, Aug 2, 2016 at 17:53, Francesco Di Mizio <francescodimizio at gmail.com>
> wrote:
>>
>> Hey Dan,
>>
>> I am using a p4 poller. Maybe it's suffering from the same problems?
>>
>> On Tue, Aug 2, 2016 at 5:45 PM, Francesco Di Mizio
>> <francescodimizio at gmail.com> wrote:
>>>
>>> I'd like to provide a bit more context. Right after restarting the master
>>> and kicking off 10 builds, CPU was at 110-120%. This made the UI unusable and
>>> basically all the services were stuck, including the REST API.
>>> After 3-4 minutes like this, and WITH all the 10 builds still running, the
>>> CPU usage went down to 5%, stayed there for 5 minutes and all was smooth and
>>> quick again. From then on it kept oscillating; I've seen spikes of 240% :(
>>>
>>> On Tue, Aug 2, 2016 at 4:12 PM, Francesco Di Mizio
>>> <francescodimizio at gmail.com> wrote:
>>>>
>>>> Sometimes it goes up to 140%. I was not able to relate this to a
>>>> particular build condition - it seems like it can happen any time and is not
>>>> related to how many builds are going on.
>>>>
>>>> I usually realize the server got into this state because the web UI gets
>>>> stuck. As soon as the CPU% goes back to normal values (2-3% most times) the
>>>> web UI finishes loading instantly.
>>>>
>>>> Any pointers as to what might be causing this? The only reason I can think
>>>> of is too many people trying to access the web UI simultaneously - could
>>>> that be right?
>>>>
>>>
>>
>> _______________________________________________
>> users mailing list
>> users at buildbot.net
>> https://lists.buildbot.net/mailman/listinfo/users