[users at bb.net] buildbot CPU usage

Pierre Tardy tardyp at gmail.com
Fri Aug 12 13:48:38 UTC 2016


Hy Francesco,

This is weird, as on my environment I can see the sample count growing. It
only grows if the process is not idle, though
So I have to use the master ui to actually make the profile change on my
test environment.
Your profile looks like fairly idle. I can see pprint and write to logfile
which might be not scalable if the data start growing do you have some
heavy debug log in your prod?

Here is an updated more simple snippet, but this should not change many
thing.
You can improve the TimerService '10' value to something bigger in order to
dump less often.
Are you sure you did reproduce the slowness while profiling?


import statprof
from  twisted.application.internet import TimerService
from buildbot.util.service import BuildbotService
from twisted.internet import reactor

class BuildbotTimerService(TimerService, BuildbotService):
    name = "timer"
    def __init__(self):
        BuildbotService.__init__(self)
        TimerService.__init__(self, 10, self.dump_stats)
        reactor.callFromThread(statprof.start)

    def dump_stats(self):
        statprof.stop()
        with open("profile.txt", "w") as f:
            statprof.display(f)
            statprof.display()
        statprof.start()

c['services'] = [ BuildbotTimerService() ]


Le ven. 12 août 2016 à 14:21, Francesco Di Mizio <francescodimizio at gmail.com>
a écrit :

> Sure thing. it's attached.
> Please note the sampe count has been stuck at 51 for over 1 hour even if
> the master is running.
>
> On Fri, Aug 12, 2016 at 12:38 PM, Pierre Tardy <tardyp at gmail.com> wrote:
>
>> The first lines are the important ones. Can you send the full file?
>>
>> Le ven. 12 août 2016 12:23, Francesco Di Mizio <
>> francescodimizio at gmail.com> a écrit :
>>
>>> It does work but statprof seems to stop getting samples. Last few lines
>>> from it:
>>>
>>> 2016-08-12 10:19:43+0000 [-]   0.00      0.06      0.00
>>>  banana.py:167:gotItem
>>> 2016-08-12 10:19:43+0000 [-]   0.00      0.01      0.00
>>>  policies.py:718:setTimeout
>>> 2016-08-12 10:19:43+0000 [-]   0.00      0.01      0.00
>>>  encoder.py:434:_iterencode
>>> 2016-08-12 10:19:43+0000 [-] ---
>>> 2016-08-12 10:19:43+0000 [-] Sample count: 51
>>> 2016-08-12 10:19:43+0000 [-] Total time: 0.510000 seconds
>>>
>>> The server is running.
>>>
>>> On Thu, Aug 11, 2016 at 10:46 AM, Francesco Di Mizio <
>>> francescodimizio at gmail.com> wrote:
>>>
>>>> Pierre many thanks for that. I do not remember why the force scheduler
>>>> api didnt suite my needs, that was my first approach.
>>>> I will try the new snippet when I get to work in a few hours.
>>>>
>>>> On Wed, Aug 10, 2016 at 11:18 PM, Pierre Tardy <tardyp at gmail.com>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> After carefully looking at your code, it does not look to have obvious
>>>>> performance flaw.
>>>>> I think it would have been cheaper for you to just use the
>>>>> forcescheduler REST api, but if that code works for you this is totally
>>>>> fine.
>>>>>
>>>>> I saw that you did not manage to get sample in your prod as well, so I
>>>>> did look at my code again, and now I am able to see the samples.
>>>>>
>>>>> statprof does not like to stop before start
>>>>>
>>>>> Please try again with this code:
>>>>>
>>>>>
>>>>> import statprof
>>>>> from  twisted.application.internet import TimerService
>>>>> from buildbot.util.service import BuildbotService
>>>>> class BuildbotTimerService(TimerService, BuildbotService):
>>>>>     name = "timer"
>>>>>     def __init__(self):
>>>>>         BuildbotService.__init__(self)
>>>>>         TimerService.__init__(self, 10, self.dump_stats)
>>>>>         self.started = False
>>>>>     def dump_stats(self):
>>>>>         if not self.started:
>>>>>             statprof.start()
>>>>>             self.started = True
>>>>>             return
>>>>>         statprof.stop()
>>>>>         with open("profile.txt", "w") as f:
>>>>>             statprof.display(f)
>>>>>             statprof.display()
>>>>>         statprof.start()
>>>>>
>>>>> c['services'] = [ BuildbotTimerService() ]
>>>>>
>>>>>
>>>>>
>>>>> Le mer. 10 août 2016 à 14:14, Francesco Di Mizio <
>>>>> francescodimizio at gmail.com> a écrit :
>>>>>
>>>>>> Many thanks. Find it attached.
>>>>>>
>>>>>> It's a tiny http server. People use a web page to send an HTTP
>>>>>> request to it to kick off builds.
>>>>>>
>>>>>>
>>>>>> On Wed, Aug 10, 2016 at 1:55 PM, Pierre Tardy <tardyp at gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> I'll take a look.
>>>>>>>
>>>>>>> Le mer. 10 août 2016 12:45, Francesco Di Mizio <
>>>>>>> francescodimizio at gmail.com> a écrit :
>>>>>>>
>>>>>>>> I am starting to think it may be due to my own custom scheduler. If
>>>>>>>> I uploaded the code somewhere could anybody take a look? it's a 150 lines
>>>>>>>> py script.
>>>>>>>>
>>>>>>>> On Wed, Aug 10, 2016 at 12:10 PM, Francesco Di Mizio <
>>>>>>>> francescodimizio at gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Thanks for that Pierre. I have it on on production and the
>>>>>>>>> performances do not degrade indeed.
>>>>>>>>>
>>>>>>>>> buildbot at e2123a50d1fb:~$ cat profile.txt
>>>>>>>>> No samples recorded.
>>>>>>>>>
>>>>>>>>> I need to study statprof a bit now ;)
>>>>>>>>>
>>>>>>>>> On Wed, Aug 10, 2016 at 11:20 AM, Vasily <vassnlit at gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> One can also use Intel® VTune™, you know... :-)
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Vasily
>>>>>>>>>> 09 авг. 2016 г. 23:56 пользователь "Pierre Tardy" <
>>>>>>>>>> tardyp at gmail.com> написал:
>>>>>>>>>>
>>>>>>>>>>> You can add following snippet to your master.cfg. It looks
>>>>>>>>>>> statprof does not work on osx, so I cannot test it to the end, but I know
>>>>>>>>>>> similar did work on my prod
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> import statprof
>>>>>>>>>>> from  twisted.application.internet import TimerService
>>>>>>>>>>> from buildbot.util.service import BuildbotService
>>>>>>>>>>> class BuildbotTimerService(TimerService, BuildbotService):
>>>>>>>>>>>     name = "timer"
>>>>>>>>>>>     def __init__(self):
>>>>>>>>>>>         BuildbotService.__init__(self)
>>>>>>>>>>>         TimerService.__init__(self, 10, self.dump_stats)
>>>>>>>>>>>     def dump_stats(self):
>>>>>>>>>>>         statprof.stop()
>>>>>>>>>>>         with open("profile.txt", "w") as f:
>>>>>>>>>>>             statprof.display(f)
>>>>>>>>>>>         statprof.start()
>>>>>>>>>>>
>>>>>>>>>>> c['services'] = [ BuildbotTimerService() ]
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Le mar. 9 août 2016 à 21:44, Francesco Di Mizio <
>>>>>>>>>>> francescodimizio at gmail.com> a écrit :
>>>>>>>>>>>
>>>>>>>>>>>> Pierre,
>>>>>>>>>>>>
>>>>>>>>>>>> if you can enlighten on how you intend to use statprof, I will
>>>>>>>>>>>> be happy to give it a try.
>>>>>>>>>>>> This problem here is on top of my list right now as it's making
>>>>>>>>>>>> my prod env unusable.
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Aug 9, 2016 at 8:25 PM, Pierre Tardy <tardyp at gmail.com>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Please note that cprofile is very intrusive and will lead to
>>>>>>>>>>>>> about 3x slower code.
>>>>>>>>>>>>>
>>>>>>>>>>>>> So I would not run that in prod. This is why I prefer statprof
>>>>>>>>>>>>> which is using a non intrusive method to get the stats
>>>>>>>>>>>>>
>>>>>>>>>>>>> Le mar. 9 août 2016 19:15, Francesco Di Mizio <
>>>>>>>>>>>>> francescodimizio at gmail.com> a écrit :
>>>>>>>>>>>>>
>>>>>>>>>>>>>> On a side note I have managed to get twistd to dump a blob I
>>>>>>>>>>>>>> can then load and analyze. Got to see if I can run this in production.
>>>>>>>>>>>>>> If anybody is curious the following worked for me:
>>>>>>>>>>>>>> twistd --savestats -n --profiler=cprofile  --profile=/crcdata/
>>>>>>>>>>>>>> profile.stats -y ./buildbot.tac
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Tue, Aug 9, 2016 at 5:21 PM, Francesco Di Mizio <
>>>>>>>>>>>>>> francescodimizio at gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Well that's expected to an extent - p4 poller is running 'p4
>>>>>>>>>>>>>>> changes' every pollinterval seconds.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Anyway just as a test I've tried to disable the poller and
>>>>>>>>>>>>>>> still at times I am seeing the CPU spiking up to above 100%. I believe I
>>>>>>>>>>>>>>> really need to profile this somehow.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Tue, Aug 9, 2016 at 5:08 PM, Dan Kegel <dank at kegel.com>
>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Also watch 'top' and see if poller processes are hogging
>>>>>>>>>>>>>>>> resources.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>> _______________________________________________
>>>>>>>>>>> users mailing list
>>>>>>>>>>> users at buildbot.net
>>>>>>>>>>> https://lists.buildbot.net/mailman/listinfo/users
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>
>>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.buildbot.net/pipermail/users/attachments/20160812/6a72c9a0/attachment.html>


More information about the users mailing list