[users at bb.net] buildbot CPU usage

Francesco Di Mizio francescodimizio at gmail.com
Fri Aug 12 14:26:21 UTC 2016


If I knew how to reproduce the slowness I wouldnt have to rpofile ;)

Jokes aside, I tried to nuke the log file and indeed it gets re-created
every 10 secs but the content is just always the same even though the
master is actually running and I see it hitting the CPU spikes.I get the
same things on my test env: 50 smaples and then that's all. I am 100% that
the log I attached before did NOT include any spikes. That's just the
result of a few seconds of buildbot running. Then you know... somehow it
doesnt update.

If you think it can make a difference I can try the new snippet (is it
incomplete? It's missing dump_stats function)




On Fri, Aug 12, 2016 at 3:48 PM, Pierre Tardy <tardyp at gmail.com> wrote:

> Hy Francesco,
>
> This is weird, as on my environment I can see the sample count growing. It
> only grows if the process is not idle, though
> So I have to use the master ui to actually make the profile change on my
> test environment.
> Your profile looks like fairly idle. I can see pprint and write to logfile
> which might be not scalable if the data start growing do you have some
> heavy debug log in your prod?
>
> Here is an updated more simple snippet, but this should not change many
> thing.
> You can improve the TimerService '10' value to something bigger in order
> to dump less often.
> Are you sure you did reproduce the slowness while profiling?
>
>
> import statprof
> from  twisted.application.internet import TimerService
> from buildbot.util.service import BuildbotService
> from twisted.internet import reactor
>
> class BuildbotTimerService(TimerService, BuildbotService):
>     name = "timer"
>     def __init__(self):
>         BuildbotService.__init__(self)
>         TimerService.__init__(self, 10, self.dump_stats)
>         reactor.callFromThread(statprof.start)
>
>     def dump_stats(self):
>         statprof.stop()
>         with open("profile.txt", "w") as f:
>             statprof.display(f)
>             statprof.display()
>         statprof.start()
>
> c['services'] = [ BuildbotTimerService() ]
>
>
> Le ven. 12 août 2016 à 14:21, Francesco Di Mizio <
> francescodimizio at gmail.com> a écrit :
>
>> Sure thing. it's attached.
>> Please note the sampe count has been stuck at 51 for over 1 hour even if
>> the master is running.
>>
>> On Fri, Aug 12, 2016 at 12:38 PM, Pierre Tardy <tardyp at gmail.com> wrote:
>>
>>> The first lines are the important ones. Can you send the full file?
>>>
>>> Le ven. 12 août 2016 12:23, Francesco Di Mizio <
>>> francescodimizio at gmail.com> a écrit :
>>>
>>>> It does work but statprof seems to stop getting samples. Last few lines
>>>> from it:
>>>>
>>>> 2016-08-12 10:19:43+0000 [-]   0.00      0.06      0.00
>>>>  banana.py:167:gotItem
>>>> 2016-08-12 10:19:43+0000 [-]   0.00      0.01      0.00
>>>>  policies.py:718:setTimeout
>>>> 2016-08-12 10:19:43+0000 [-]   0.00      0.01      0.00
>>>>  encoder.py:434:_iterencode
>>>> 2016-08-12 10:19:43+0000 [-] ---
>>>> 2016-08-12 10:19:43+0000 [-] Sample count: 51
>>>> 2016-08-12 10:19:43+0000 [-] Total time: 0.510000 seconds
>>>>
>>>> The server is running.
>>>>
>>>> On Thu, Aug 11, 2016 at 10:46 AM, Francesco Di Mizio <
>>>> francescodimizio at gmail.com> wrote:
>>>>
>>>>> Pierre many thanks for that. I do not remember why the force scheduler
>>>>> api didnt suite my needs, that was my first approach.
>>>>> I will try the new snippet when I get to work in a few hours.
>>>>>
>>>>> On Wed, Aug 10, 2016 at 11:18 PM, Pierre Tardy <tardyp at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> After carefully looking at your code, it does not look to have
>>>>>> obvious performance flaw.
>>>>>> I think it would have been cheaper for you to just use the
>>>>>> forcescheduler REST api, but if that code works for you this is totally
>>>>>> fine.
>>>>>>
>>>>>> I saw that you did not manage to get sample in your prod as well, so
>>>>>> I did look at my code again, and now I am able to see the samples.
>>>>>>
>>>>>> statprof does not like to stop before start
>>>>>>
>>>>>> Please try again with this code:
>>>>>>
>>>>>>
>>>>>> import statprof
>>>>>> from  twisted.application.internet import TimerService
>>>>>> from buildbot.util.service import BuildbotService
>>>>>> class BuildbotTimerService(TimerService, BuildbotService):
>>>>>>     name = "timer"
>>>>>>     def __init__(self):
>>>>>>         BuildbotService.__init__(self)
>>>>>>         TimerService.__init__(self, 10, self.dump_stats)
>>>>>>         self.started = False
>>>>>>     def dump_stats(self):
>>>>>>         if not self.started:
>>>>>>             statprof.start()
>>>>>>             self.started = True
>>>>>>             return
>>>>>>         statprof.stop()
>>>>>>         with open("profile.txt", "w") as f:
>>>>>>             statprof.display(f)
>>>>>>             statprof.display()
>>>>>>         statprof.start()
>>>>>>
>>>>>> c['services'] = [ BuildbotTimerService() ]
>>>>>>
>>>>>>
>>>>>>
>>>>>> Le mer. 10 août 2016 à 14:14, Francesco Di Mizio <
>>>>>> francescodimizio at gmail.com> a écrit :
>>>>>>
>>>>>>> Many thanks. Find it attached.
>>>>>>>
>>>>>>> It's a tiny http server. People use a web page to send an HTTP
>>>>>>> request to it to kick off builds.
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Aug 10, 2016 at 1:55 PM, Pierre Tardy <tardyp at gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> I'll take a look.
>>>>>>>>
>>>>>>>> Le mer. 10 août 2016 12:45, Francesco Di Mizio <
>>>>>>>> francescodimizio at gmail.com> a écrit :
>>>>>>>>
>>>>>>>>> I am starting to think it may be due to my own custom scheduler.
>>>>>>>>> If I uploaded the code somewhere could anybody take a look? it's a 150
>>>>>>>>> lines py script.
>>>>>>>>>
>>>>>>>>> On Wed, Aug 10, 2016 at 12:10 PM, Francesco Di Mizio <
>>>>>>>>> francescodimizio at gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Thanks for that Pierre. I have it on on production and the
>>>>>>>>>> performances do not degrade indeed.
>>>>>>>>>>
>>>>>>>>>> buildbot at e2123a50d1fb:~$ cat profile.txt
>>>>>>>>>> No samples recorded.
>>>>>>>>>>
>>>>>>>>>> I need to study statprof a bit now ;)
>>>>>>>>>>
>>>>>>>>>> On Wed, Aug 10, 2016 at 11:20 AM, Vasily <vassnlit at gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> One can also use Intel® VTune™, you know... :-)
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Vasily
>>>>>>>>>>> 09 авг. 2016 г. 23:56 пользователь "Pierre Tardy" <
>>>>>>>>>>> tardyp at gmail.com> написал:
>>>>>>>>>>>
>>>>>>>>>>>> You can add following snippet to your master.cfg. It looks
>>>>>>>>>>>> statprof does not work on osx, so I cannot test it to the end, but I know
>>>>>>>>>>>> similar did work on my prod
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> import statprof
>>>>>>>>>>>> from  twisted.application.internet import TimerService
>>>>>>>>>>>> from buildbot.util.service import BuildbotService
>>>>>>>>>>>> class BuildbotTimerService(TimerService, BuildbotService):
>>>>>>>>>>>>     name = "timer"
>>>>>>>>>>>>     def __init__(self):
>>>>>>>>>>>>         BuildbotService.__init__(self)
>>>>>>>>>>>>         TimerService.__init__(self, 10, self.dump_stats)
>>>>>>>>>>>>     def dump_stats(self):
>>>>>>>>>>>>         statprof.stop()
>>>>>>>>>>>>         with open("profile.txt", "w") as f:
>>>>>>>>>>>>             statprof.display(f)
>>>>>>>>>>>>         statprof.start()
>>>>>>>>>>>>
>>>>>>>>>>>> c['services'] = [ BuildbotTimerService() ]
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Le mar. 9 août 2016 à 21:44, Francesco Di Mizio <
>>>>>>>>>>>> francescodimizio at gmail.com> a écrit :
>>>>>>>>>>>>
>>>>>>>>>>>>> Pierre,
>>>>>>>>>>>>>
>>>>>>>>>>>>> if you can enlighten on how you intend to use statprof, I will
>>>>>>>>>>>>> be happy to give it a try.
>>>>>>>>>>>>> This problem here is on top of my list right now as it's
>>>>>>>>>>>>> making my prod env unusable.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Tue, Aug 9, 2016 at 8:25 PM, Pierre Tardy <tardyp at gmail.com
>>>>>>>>>>>>> > wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Please note that cprofile is very intrusive and will lead to
>>>>>>>>>>>>>> about 3x slower code.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> So I would not run that in prod. This is why I prefer
>>>>>>>>>>>>>> statprof which is using a non intrusive method to get the stats
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Le mar. 9 août 2016 19:15, Francesco Di Mizio <
>>>>>>>>>>>>>> francescodimizio at gmail.com> a écrit :
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On a side note I have managed to get twistd to dump a blob I
>>>>>>>>>>>>>>> can then load and analyze. Got to see if I can run this in production.
>>>>>>>>>>>>>>> If anybody is curious the following worked for me:
>>>>>>>>>>>>>>> twistd --savestats -n --profiler=cprofile
>>>>>>>>>>>>>>>  --profile=/crcdata/profile.stats -y ./buildbot.tac
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Tue, Aug 9, 2016 at 5:21 PM, Francesco Di Mizio <
>>>>>>>>>>>>>>> francescodimizio at gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Well that's expected to an extent - p4 poller is running
>>>>>>>>>>>>>>>> 'p4 changes' every pollinterval seconds.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Anyway just as a test I've tried to disable the poller and
>>>>>>>>>>>>>>>> still at times I am seeing the CPU spiking up to above 100%. I believe I
>>>>>>>>>>>>>>>> really need to profile this somehow.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Tue, Aug 9, 2016 at 5:08 PM, Dan Kegel <dank at kegel.com>
>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Also watch 'top' and see if poller processes are hogging
>>>>>>>>>>>>>>>>> resources.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>> users mailing list
>>>>>>>>>>>> users at buildbot.net
>>>>>>>>>>>> https://lists.buildbot.net/mailman/listinfo/users
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>
>>>>
>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.buildbot.net/pipermail/users/attachments/20160812/541e41d8/attachment.html>


More information about the users mailing list