[users at bb.net] buildbot CPU usage
Francesco Di Mizio
francescodimizio at gmail.com
Wed Aug 17 10:53:39 UTC 2016
Thanks Pierre, for the db I use either mysql or postgres, not sqllite.
Logs are not suspiciously long - longets might be around 500 lines. Only
one step is doing something fancy with the logs by using a LogLineObserver
and regexp matching to create other log links.
That profile snapshot was taked after 230 builds were built.
Right now I have 10 workers so that's the max number of builds that can be
running simultaniously.
Those builds get scheduled either by my custon http scheduler (we dwelled
on it above) or by the p4 source scheduler polling every 3 minutes.
As stated previously, developers submit build requests from a webpage to
the http scheduler. Upon opening, the web page queries Buildbot's rest API
to see if there are already ongoing and/or completed builds for the given
p4 changelist.
The amount of http requests sent to buildbot by one single page instance
can vary a lot even in complexity. As an example, every time the web page
is asking buildbot what its builders are and that's all, other times it has
to ask buildbot for a specifc property's value for a few builds.
Also the number of coders using this page can go from zero up to any number
(reasonably no more than 10) and each instance of the page is unique per
coder meaning it'll do its own REST requests to buildbot.
Furthermore when builds get kicked, the web page opens a websocket
connection to buildbot to recieve progress updates on the builds. As builds
complete, they get closed. Some may indeed stay dangling upon refreshing.
On Wed, Aug 17, 2016 at 12:05 PM, Pierre Tardy <tardyp at gmail.com> wrote:
> Hi,
>
> Appended is the decoded stats..
> I can only see normal buildbot operations there. just a lot of them. how
> much loaded is your bot? How many build in parrallel does it have?
> How many workers? What are the size of the logs?
>
> Most importantly are you using sqlite?
> sqlite only works with one thread in buildbot nine, which could lead
> indeed to slow UI when there are lots of activity, as UI queries are not
> prioritized over core queries.
> Other sql backend do not have such issues.
>
> Wed Aug 17 11:47:56 2016 profile.stats
>
> 235488234 function calls (219896405 primitive calls) in 57689.889
> seconds
>
> Ordered by: cumulative time
> List reduced from 3158 to 200 due to restriction <200>
>
> ncalls tottime percall cumtime percall filename:lineno(function)
> 1 0.000 0.000 57689.894 57689.894 base.py:1192(run)
> 1 8.549 8.549 57689.894 57689.894 base.py:1197(mainLoop)
> 654066 25.880 0.000 55191.310 0.084 epollreactor.py:367(doPoll)
> 654066 53208.486 0.081 53208.487 0.081 {method 'poll' of
> 'select.epoll' objects}
> 1704464/415799 114.955 0.000 2846.125 0.007 defer.py:1109(_
> inlineCallbacks)
> 3434730/688115 212.408 0.000 2740.953 0.004
> defer.py:513(_runCallbacks)
> 2301115/1429188 3.898 0.000 2521.992 0.002 {method 'send' of
> 'generator' objects}
> 654066 19.060 0.000 2475.261 0.004 base.py:787(runUntilCurrent)
> 2148660/797905 18.536 0.000 2319.208 0.003 defer.py:371(callback)
> 2148861/797914 24.725 0.000 2311.151 0.003 defer.py:484(_
> startRunCallbacks)
> 1354560/496825 18.306 0.000 2191.695 0.004 defer.py:1179(gotResult)
> 382334 11.614 0.000 1956.812 0.005 log.py:88(callWithLogger)
> 936750/366288 20.765 0.000 1955.853 0.005
> defer.py:1262(unwindGenerator)
> 382334 11.120 0.000 1943.570 0.005 log.py:81(callWithContext)
> 382334 3.573 0.000 1918.521 0.005 context.py:117(
> callWithContext)
> 382334 10.873 0.000 1913.555 0.005
> context.py:61(callWithContext)
> 382322 23.054 0.000 1901.996 0.005 posixbase.py:555(_
> doReadOrWrite)
> 112747 5.568 0.000 1249.128 0.011 tcp.py:193(doRead)
> 112747 2.434 0.000 1000.258 0.009 tcp.py:212(_dataReceived)
> 104450 160.171 0.002 915.091 0.009 banana.py:171(dataReceived)
> 207384 2.815 0.000 878.622 0.004 logs.py:110(generateEvent)
> 449643/249989 7.493 0.000 854.141 0.003
> defer.py:129(maybeDeferred)
> 192309 3.906 0.000 842.046 0.004 logs.py:130(appendLog)
> 274303 5.751 0.000 790.022 0.003 pool.py:205(do)
> 274364 10.202 0.000 784.280 0.003 threads.py:22(
> deferToThreadPool)
> 274364 14.368 0.000 773.181 0.003 threadpool.py:210(
> callInThreadWithCallback)
> 299763 5.259 0.000 759.525 0.003 connector.py:117(get)
> 274364 6.264 0.000 757.978 0.003 _team.py:159(do)
> 274367 41.737 0.000 751.404 0.003 _threadworker.py:91(do)
> 199395/179107 4.397 0.000 737.929 0.004 defer.py:1312(execute)
> 2102892 27.496 0.000 706.477 0.000 banana.py:162(gotItem)
> 114728 1.872 0.000 678.239 0.006 banana.py:128(
> callExpressionReceived)
> 114718 2.676 0.000 676.366 0.006 pb.py:555(expressionReceived)
> 104696 0.721 0.000 669.270 0.006 pb.py:874(proto_message)
> 104696 2.231 0.000 668.548 0.006 pb.py:879(_recvMessage)
> 2333378/1667165 68.647 0.000 638.402 0.000
> defer.py:288(addCallbacks)
> 228307 3.802 0.000 610.345 0.003 util.py:696(untilConcludes)
> 103736 1.691 0.000 586.090 0.006 flavors.py:102(
> remoteMessageReceived)
> 98998 2.784 0.000 544.312 0.005 remotecommand.py:178(remote_
> update)
> 99279 1.197 0.000 539.977 0.005 metrics.py:92(wrapper)
> 998915 515.279 0.001 515.279 0.001 {method 'acquire' of
> 'thread.lock' objects}
> 274364 16.972 0.000 498.645 0.002 _team.py:166(<lambda>)
> 274364 55.232 0.000 481.672 0.002 _team.py:169(_
> coordinateThisTask)
> 368054/269548 2.834 0.000 480.960 0.002 defer.py:310(addCallback)
> 199395/102320 4.001 0.000 460.815 0.005 defer.py:1290(run)
> 193043 2.646 0.000 454.043 0.002 remotecommand.py:269(
> remoteUpdate)
> 98828 1.352 0.000 444.287 0.004 misc.py:24(wrapper)
> 274364 12.877 0.000 425.411 0.002 _threadworker.py:50(do)
> 274379 56.362 0.000 410.585 0.001 Queue.py:107(put)
> 118557 8.707 0.000 391.516 0.003 base.py:64(produceEvent)
> 73113 11.027 0.000 383.481 0.005 abstract.py:234(doWrite)
> 175167/143422 6.702 0.000 372.379 0.003 threadable.py:49(sync)
> 155226 1.157 0.000 360.081 0.002 connector.py:107(getEndpoint)
> 155226 31.409 0.000 358.924 0.002 pathmatch.py:45(__getitem__)
> 96155 2.231 0.000 348.588 0.004 log.py:74(<lambda>)
> 96524/95460 1.883 0.000 326.433 0.003
> lineboundaries.py:32(append)
> 199393/199333 1.528 0.000 324.017 0.002 defer.py:1285(_
> releaseAndReturn)
> 201639/201210 2.532 0.000 322.497 0.002 defer.py:1368(release)
> 155226 312.259 0.002 322.348 0.002 pathmatch.py:79(_compile)
> 96155 2.827 0.000 315.731 0.003 logs.py:198(appendLog)
> 73113 1.922 0.000 305.072 0.004 tcp.py:228(writeSomeData)
> 311176 2.758 0.000 304.071 0.001 logs.py:50(get)
> 73113 302.341 0.004 302.341 0.004 {method 'send' of
> '_socket.socket' objects}
> 241229 99.388 0.000 298.465 0.001 simple.py:39(produce)
> 137584 10.229 0.000 297.448 0.002 log.py:247(msg)
> 103693 3.131 0.000 292.473 0.003 logs.py:79(getLog)
> 147250 1.436 0.000 282.022 0.002 remotecommand.py:226(
> addStdout)
> 137584 4.478 0.000 279.129 0.002 _legacy.py:97(
> publishToNewObserver)
> 137826 4.193 0.000 274.752 0.002 _observer.py:108(__call__)
> 275652 2.541 0.000 270.559 0.001 _legacy.py:44(__call__)
> 275652 3.502 0.000 267.804 0.001 log.py:539(emit)
> 103718 1.485 0.000 265.556 0.003 logs.py:66(_getLog)
> 94968/93904 12.745 0.000 262.480 0.003 log.py:159(wholeLines)
> 26274 2.272 0.000 262.479 0.010 buildstep.py:451(startStep)
> 74361/73660 0.854 0.000 258.791 0.004 log.py:171(addStdout)
> 112747 243.302 0.002 243.302 0.002 {method 'recv' of
> '_socket.socket' objects}
> 274384 49.407 0.000 230.840 0.001 threading.py:373(notify)
> 69570 220.145 0.003 220.145 0.003 {method 'write' of 'file'
> objects}
> 1659280/1644079 16.320 0.000 206.367 0.000 defer.py:331(addBoth)
> 10667920 182.099 0.000 184.380 0.000 tuplematch.py:19(matchTuple)
> 192299 173.403 0.001 173.403 0.001 {method 'sub' of
> '_sre.SRE_Pattern' objects}
> 274399 11.041 0.000 169.881 0.001 threading.py:300(_is_owned)
> 12967 0.551 0.000 156.150 0.012 buildstep.py:605(run)
> 218466 2.708 0.000 144.798 0.001 posixbase.py:141(doRead)
> 218741 5.788 0.000 142.305 0.001 fdesc.py:62(readFromFD)
> 218741 135.535 0.001 135.535 0.001 {posix.read}
> 23350 0.257 0.000 131.984 0.006 steps.py:119(generateEvent)
> 2124 0.043 0.000 130.321 0.061 shell.py:253(start)
> 2124 0.523 0.000 129.577 0.061
> buildstep.py:921(startCommand)
> 37583 1.340 0.000 113.859 0.003 logfile.py:197(write)
> 37583 0.808 0.000 112.488 0.003 logfile.py:105(write)
> 146080 106.269 0.001 106.269 0.001 {method 'modify' of
> 'select.epoll' objects}
> 4915 0.151 0.000 104.365 0.021 eventual.py:37(_turn)
> 4719 0.201 0.000 104.214 0.022 remotecommand.py:134(_
> finished)
> 9437 0.250 0.000 96.007 0.010 buildstep.py:832(runCommand)
> 4719 0.234 0.000 95.603 0.020 remotecommand.py:80(run)
> 192309 2.155 0.000 95.360 0.000 log.py:68(addRawLines)
> 110000 1.078 0.000 89.085 0.001 pb.py:581(sendCall)
> 110010 2.046 0.000 88.008 0.001 banana.py:302(sendEncoded)
> 120190 0.877 0.000 84.101 0.001 base.py:59(sanitizeMessage)
> 2025413/120551 40.610 0.000 83.825 0.001 copy.py:145(deepcopy)
> 18330 0.640 0.000 80.581 0.004 buildsets.py:34(db2data)
> 124182/120551 11.702 0.000 80.247 0.001
> copy.py:253(_deepcopy_dict)
> 142700 7.776 0.000 80.184 0.001 epollreactor.py:238(_add)
> 104696 2.288 0.000 79.409 0.001 pb.py:921(_sendAnswer)
> 10688 0.344 0.000 78.735 0.007 rest.py:326(renderRest)
> 41284 0.288 0.000 76.702 0.002 buildsets.py:44(getSs)
> 135155 1.329 0.000 75.917 0.001 abstract.py:436(startWriting)
> 135155 4.440 0.000 74.587 0.001
> epollreactor.py:282(addWriter)
> 27246 0.552 0.000 74.348 0.003 base.py:515(wakeUp)
> 27246 3.021 0.000 73.795 0.003 posixbase.py:169(wakeUp)
> 165 0.033 0.000 73.278 0.444 buildsets.py:86(db2data)
> 123407 1.556 0.000 71.174 0.001 _newtls.py:182(write)
> 6478 1.397 0.000 71.077 0.011 basic.py:542(dataReceived)
> 254926/246533 5.769 0.000 70.023 0.000 defer.py:846(_cbDeferred)
> 123584 2.385 0.000 69.097 0.001 abstract.py:339(write)
> 27246 67.073 0.002 67.073 0.002 {posix.write}
> 106913 2.842 0.000 66.625 0.001 epollreactor.py:299(_remove)
> 46168 0.983 0.000 65.580 0.001 http.py:1675(lineReceived)
> 80486 1.303 0.000 63.860 0.001 abstract.py:423(stopWriting)
> 5829 0.057 0.000 63.464 0.011 http.py:1780(
> allContentReceived)
> 5829 0.190 0.000 63.216 0.011 http.py:769(requestReceived)
> 86356 3.425 0.000 62.633 0.001 epollreactor.py:336(
> removeWriter)
> 11238 0.331 0.000 60.409 0.005 steps.py:138(
> setStepStateString)
> 105839 1.873 0.000 57.857 0.001 metrics.py:59(log)
> 5770 0.303 0.000 56.276 0.010 server.py:161(process)
> 14168 0.479 0.000 54.273 0.004 remotecommand.py:235(
> addStderr)
> 175167 2.737 0.000 53.558 0.000 threadable.py:33(_synchPre)
> 5770 0.047 0.000 51.881 0.009 server.py:227(render)
> 4242 0.221 0.000 51.620 0.012 remotecommand.py:394(_start)
> 175180 10.267 0.000 50.821 0.000 threading.py:147(acquire)
> 5148 0.632 0.000 50.624 0.010 tcp.py:1016(doRead)
> 7539 0.421 0.000 49.435 0.007 logs.py:116(addLog)
> 7412/7084 0.052 0.000 49.199 0.007 log.py:174(addStderr)
> 2166573 46.210 0.000 46.695 0.000 banana.py:35(b1282int)
> 7299 0.416 0.000 46.610 0.006 posixbase.py:237(_
> disconnectSelectable)
> 35125 0.407 0.000 44.185 0.001 steps.py:52(get)
> 232 0.008 0.000 43.019 0.185 waf_step.py:75(start)
> 11700 0.408 0.000 42.572 0.004 steps.py:28(getStep)
> 5401 0.575 0.000 40.441 0.007 resource.py:52(
> asyncRenderHelper)
> 56875 0.787 0.000 39.562 0.001 properties.py:194(render)
> 430148 4.588 0.000 38.949 0.000 socket.py:227(meth)
> 7538 0.289 0.000 38.684 0.005 logs.py:137(finishLog)
> 5658 0.084 0.000 38.201 0.007 http.py:929(finish)
> 5344 0.043 0.000 38.059 0.007 rest.py:423(render)
> 5599 0.031 0.000 37.888 0.007 server.py:216(finish)
> 59388 5.052 0.000 37.616 0.001 defer.py:789(__init__)
> 3896/3744 0.135 0.000 37.356 0.010 lru.py:206(handle_result)
> 5401 0.149 0.000 36.074 0.007 resource.py:65(finish)
> 2626 0.036 0.000 35.977 0.014 build.py:472(_stepDone)
> 10688 0.211 0.000 35.616 0.003 rest.py:433(asyncRender)
> 132060/56875 1.371 0.000 34.532 0.001 properties.py:153(render)
> 4720 0.337 0.000 32.234 0.007 remotecommand.py:302(
> remoteComplete)
> 214676 1.441 0.000 31.946 0.000 pb.py:819(unserialize)
> 5898 0.169 0.000 30.568 0.005 tcp.py:277(connectionLost)
> 214676 1.420 0.000 30.505 0.000 jelly.py:1127(unjelly)
> 3794 0.325 0.000 29.815 0.008 debounce.py:63(invoke)
> 5655 0.099 0.000 29.609 0.005 http.py:2116(log)
> 2626 0.159 0.000 29.405 0.011
> buildstep.py:710(releaseLocks)
> 94046 2.530 0.000 28.971 0.000 remotecommand.py:220(_unwrap)
> 5901 0.208 0.000 28.758 0.005 tcp.py:113(_closeSocket)
> 20354/15469 0.158 0.000 28.597 0.002 properties.py:773(
> getRenderingFor)
> 5901 28.267 0.005 28.267 0.005 {method 'shutdown' of
> '_socket.socket' objects}
> 97233 0.980 0.000 28.000 0.000 buildstep.py:172(_delay)
> 8556 0.231 0.000 27.730 0.003 socket.py:205(accept)
> 11238 0.144 0.000 27.620 0.002 buildstep.py:416(
> updateSummary)
> 197524/103421 5.643 0.000 27.552 0.000
> buildstep.py:156(_catchup)
> 2626 0.227 0.000 27.243 0.010 build.py:481(stepDone)
> 214676 1.022 0.000 27.214 0.000 jelly.py:620(unjellyFull)
> 12452 0.852 0.000 26.962 0.002 properties.py:62(
> setBuildProperties)
> 8556 26.674 0.003 26.674 0.003 {method 'accept' of
> '_socket.socket' objects}
> 239750/119875 1.432 0.000 26.644 0.000 operators.py:296(__eq__)
> 94046 1.570 0.000 26.408 0.000 buildstep.py:203(unwrap)
> 819050/214676 12.506 0.000 26.192 0.000 jelly.py:627(unjelly)
> 122012 0.374 0.000 26.079 0.000 elements.py:682(operate)
> 547659 3.241 0.000 25.979 0.000 defer.py:70(succeed)
> 119875 0.484 0.000 25.507 0.000 {operator.eq}
> 222329 1.154 0.000 25.419 0.000 buildrequests.py:106(get)
> 1988554 24.414 0.000 25.402 0.000 copy.py:267(_keep_alive)
> 7536 0.317 0.000 25.275 0.003 log.py:89(finish)
> 7881 0.067 0.000 25.165 0.003 steps.py:152(finishStep)
> 42090 0.289 0.000 24.794 0.001 defer.py:898(gatherResults)
> 12915/12450 0.183 0.000 24.618 0.002 properties.py:810(
> getRenderingFor)
> 122012 0.662 0.000 24.525 0.000 <string>:1(<lambda>)
> 3746 0.039 0.000 24.210 0.006 steps.py:118(
> setStepStateString)
> 122012 1.937 0.000 23.863 0.000 type_api.py:60(operate)
> 93701 0.454 0.000 23.201 0.000 log.py:44(<lambda>)
> 137050 0.815 0.000 22.948 0.000 {method 'decode' of 'str'
> objects}
> 63974 22.720 0.000 22.720 0.000 {method 'flush' of 'file'
> objects}
> 99036 0.681 0.000 22.175 0.000 utf_8.py:15(decode)
> 736278/110010 13.325 0.000 22.082 0.000 banana.py:318(_encode)
> 120561 1.807 0.000 21.838 0.000 default_comparator.py:22(_
> boolean_compare)
> 99036 21.494 0.000 21.494 0.000 {_codecs.utf_8_decode}
> 7881 0.047 0.000 20.958 0.003 steps.py:132(startStep)
> 2350 0.058 0.000 20.841 0.009 git.py:333(_dovccmd)
> 7881 0.057 0.000 20.604 0.003 steps.py:124(addStep)
> 519918 6.520 0.000 20.222 0.000 context.py:120(getContext)
> 2345 0.112 0.000 19.871 0.008 task.py:199(__call__)
> 4719 0.185 0.000 19.753 0.004 remotecommand.py:123(_start)
> 4719 0.084 0.000 19.520 0.004 pb.py:248(remoteStartCommand)
>
> Le mer. 17 août 2016 à 11:11, Francesco Di Mizio <
> francescodimizio at gmail.com> a écrit :
>
>> Sending again this email with a link to download the profile as prob it
>> was too big as attachment
>> http://www.filedropper.com/profile_6
>>
>> I was able to run twistd with cprofile in production for one day. It was
>> taken with twists 15.5.0 and python 2.7.10.
>> If you could take a look it'd be great ;)
>>
>> On Wed, Aug 17, 2016 at 11:03 AM, Francesco Di Mizio <
>> francescodimizio at gmail.com> wrote:
>>
>>> Hey Pierre,
>>>
>>> I was able to run twistd with cprofile in production for one day. It was
>>> taken with twists 15.5.0 and python 2.7.10.
>>> If you could take a look it'd be great ;)
>>>
>>> Francesco
>>>
>>> On Fri, Aug 12, 2016 at 4:26 PM, Francesco Di Mizio <
>>> francescodimizio at gmail.com> wrote:
>>>
>>>> If I knew how to reproduce the slowness I wouldnt have to rpofile ;)
>>>>
>>>> Jokes aside, I tried to nuke the log file and indeed it gets re-created
>>>> every 10 secs but the content is just always the same even though the
>>>> master is actually running and I see it hitting the CPU spikes.I get the
>>>> same things on my test env: 50 smaples and then that's all. I am 100% that
>>>> the log I attached before did NOT include any spikes. That's just the
>>>> result of a few seconds of buildbot running. Then you know... somehow it
>>>> doesnt update.
>>>>
>>>> If you think it can make a difference I can try the new snippet (is it
>>>> incomplete? It's missing dump_stats function)
>>>>
>>>>
>>>>
>>>>
>>>> On Fri, Aug 12, 2016 at 3:48 PM, Pierre Tardy <tardyp at gmail.com> wrote:
>>>>
>>>>> Hy Francesco,
>>>>>
>>>>> This is weird, as on my environment I can see the sample count
>>>>> growing. It only grows if the process is not idle, though
>>>>> So I have to use the master ui to actually make the profile change on
>>>>> my test environment.
>>>>> Your profile looks like fairly idle. I can see pprint and write to
>>>>> logfile which might be not scalable if the data start growing do you have
>>>>> some heavy debug log in your prod?
>>>>>
>>>>> Here is an updated more simple snippet, but this should not change
>>>>> many thing.
>>>>> You can improve the TimerService '10' value to something bigger in
>>>>> order to dump less often.
>>>>> Are you sure you did reproduce the slowness while profiling?
>>>>>
>>>>>
>>>>> import statprof
>>>>> from twisted.application.internet import TimerService
>>>>> from buildbot.util.service import BuildbotService
>>>>> from twisted.internet import reactor
>>>>>
>>>>> class BuildbotTimerService(TimerService, BuildbotService):
>>>>> name = "timer"
>>>>> def __init__(self):
>>>>> BuildbotService.__init__(self)
>>>>> TimerService.__init__(self, 10, self.dump_stats)
>>>>> reactor.callFromThread(statprof.start)
>>>>>
>>>>> def dump_stats(self):
>>>>> statprof.stop()
>>>>> with open("profile.txt", "w") as f:
>>>>> statprof.display(f)
>>>>> statprof.display()
>>>>> statprof.start()
>>>>>
>>>>> c['services'] = [ BuildbotTimerService() ]
>>>>>
>>>>>
>>>>> Le ven. 12 août 2016 à 14:21, Francesco Di Mizio <
>>>>> francescodimizio at gmail.com> a écrit :
>>>>>
>>>>>> Sure thing. it's attached.
>>>>>> Please note the sampe count has been stuck at 51 for over 1 hour even
>>>>>> if the master is running.
>>>>>>
>>>>>> On Fri, Aug 12, 2016 at 12:38 PM, Pierre Tardy <tardyp at gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> The first lines are the important ones. Can you send the full file?
>>>>>>>
>>>>>>> Le ven. 12 août 2016 12:23, Francesco Di Mizio <
>>>>>>> francescodimizio at gmail.com> a écrit :
>>>>>>>
>>>>>>>> It does work but statprof seems to stop getting samples. Last few
>>>>>>>> lines from it:
>>>>>>>>
>>>>>>>> 2016-08-12 10:19:43+0000 [-] 0.00 0.06 0.00
>>>>>>>> banana.py:167:gotItem
>>>>>>>> 2016-08-12 10:19:43+0000 [-] 0.00 0.01 0.00
>>>>>>>> policies.py:718:setTimeout
>>>>>>>> 2016-08-12 10:19:43+0000 [-] 0.00 0.01 0.00
>>>>>>>> encoder.py:434:_iterencode
>>>>>>>> 2016-08-12 10:19:43+0000 [-] ---
>>>>>>>> 2016-08-12 10:19:43+0000 [-] Sample count: 51
>>>>>>>> 2016-08-12 10:19:43+0000 [-] Total time: 0.510000 seconds
>>>>>>>>
>>>>>>>> The server is running.
>>>>>>>>
>>>>>>>> On Thu, Aug 11, 2016 at 10:46 AM, Francesco Di Mizio <
>>>>>>>> francescodimizio at gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Pierre many thanks for that. I do not remember why the force
>>>>>>>>> scheduler api didnt suite my needs, that was my first approach.
>>>>>>>>> I will try the new snippet when I get to work in a few hours.
>>>>>>>>>
>>>>>>>>> On Wed, Aug 10, 2016 at 11:18 PM, Pierre Tardy <tardyp at gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> After carefully looking at your code, it does not look to have
>>>>>>>>>> obvious performance flaw.
>>>>>>>>>> I think it would have been cheaper for you to just use the
>>>>>>>>>> forcescheduler REST api, but if that code works for you this is totally
>>>>>>>>>> fine.
>>>>>>>>>>
>>>>>>>>>> I saw that you did not manage to get sample in your prod as well,
>>>>>>>>>> so I did look at my code again, and now I am able to see the samples.
>>>>>>>>>>
>>>>>>>>>> statprof does not like to stop before start
>>>>>>>>>>
>>>>>>>>>> Please try again with this code:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> import statprof
>>>>>>>>>> from twisted.application.internet import TimerService
>>>>>>>>>> from buildbot.util.service import BuildbotService
>>>>>>>>>> class BuildbotTimerService(TimerService, BuildbotService):
>>>>>>>>>> name = "timer"
>>>>>>>>>> def __init__(self):
>>>>>>>>>> BuildbotService.__init__(self)
>>>>>>>>>> TimerService.__init__(self, 10, self.dump_stats)
>>>>>>>>>> self.started = False
>>>>>>>>>> def dump_stats(self):
>>>>>>>>>> if not self.started:
>>>>>>>>>> statprof.start()
>>>>>>>>>> self.started = True
>>>>>>>>>> return
>>>>>>>>>> statprof.stop()
>>>>>>>>>> with open("profile.txt", "w") as f:
>>>>>>>>>> statprof.display(f)
>>>>>>>>>> statprof.display()
>>>>>>>>>> statprof.start()
>>>>>>>>>>
>>>>>>>>>> c['services'] = [ BuildbotTimerService() ]
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Le mer. 10 août 2016 à 14:14, Francesco Di Mizio <
>>>>>>>>>> francescodimizio at gmail.com> a écrit :
>>>>>>>>>>
>>>>>>>>>>> Many thanks. Find it attached.
>>>>>>>>>>>
>>>>>>>>>>> It's a tiny http server. People use a web page to send an HTTP
>>>>>>>>>>> request to it to kick off builds.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Aug 10, 2016 at 1:55 PM, Pierre Tardy <tardyp at gmail.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> I'll take a look.
>>>>>>>>>>>>
>>>>>>>>>>>> Le mer. 10 août 2016 12:45, Francesco Di Mizio <
>>>>>>>>>>>> francescodimizio at gmail.com> a écrit :
>>>>>>>>>>>>
>>>>>>>>>>>>> I am starting to think it may be due to my own custom
>>>>>>>>>>>>> scheduler. If I uploaded the code somewhere could anybody take a look? it's
>>>>>>>>>>>>> a 150 lines py script.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Aug 10, 2016 at 12:10 PM, Francesco Di Mizio <
>>>>>>>>>>>>> francescodimizio at gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks for that Pierre. I have it on on production and the
>>>>>>>>>>>>>> performances do not degrade indeed.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> buildbot at e2123a50d1fb:~$ cat profile.txt
>>>>>>>>>>>>>> No samples recorded.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I need to study statprof a bit now ;)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, Aug 10, 2016 at 11:20 AM, Vasily <vassnlit at gmail.com>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> One can also use Intel® VTune™, you know... :-)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> Vasily
>>>>>>>>>>>>>>> 09 авг. 2016 г. 23:56 пользователь "Pierre Tardy" <
>>>>>>>>>>>>>>> tardyp at gmail.com> написал:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> You can add following snippet to your master.cfg. It looks
>>>>>>>>>>>>>>>> statprof does not work on osx, so I cannot test it to the end, but I know
>>>>>>>>>>>>>>>> similar did work on my prod
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> import statprof
>>>>>>>>>>>>>>>> from twisted.application.internet import TimerService
>>>>>>>>>>>>>>>> from buildbot.util.service import BuildbotService
>>>>>>>>>>>>>>>> class BuildbotTimerService(TimerService, BuildbotService):
>>>>>>>>>>>>>>>> name = "timer"
>>>>>>>>>>>>>>>> def __init__(self):
>>>>>>>>>>>>>>>> BuildbotService.__init__(self)
>>>>>>>>>>>>>>>> TimerService.__init__(self, 10, self.dump_stats)
>>>>>>>>>>>>>>>> def dump_stats(self):
>>>>>>>>>>>>>>>> statprof.stop()
>>>>>>>>>>>>>>>> with open("profile.txt", "w") as f:
>>>>>>>>>>>>>>>> statprof.display(f)
>>>>>>>>>>>>>>>> statprof.start()
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> c['services'] = [ BuildbotTimerService() ]
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Le mar. 9 août 2016 à 21:44, Francesco Di Mizio <
>>>>>>>>>>>>>>>> francescodimizio at gmail.com> a écrit :
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Pierre,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> if you can enlighten on how you intend to use statprof, I
>>>>>>>>>>>>>>>>> will be happy to give it a try.
>>>>>>>>>>>>>>>>> This problem here is on top of my list right now as it's
>>>>>>>>>>>>>>>>> making my prod env unusable.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Tue, Aug 9, 2016 at 8:25 PM, Pierre Tardy <
>>>>>>>>>>>>>>>>> tardyp at gmail.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Please note that cprofile is very intrusive and will lead
>>>>>>>>>>>>>>>>>> to about 3x slower code.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> So I would not run that in prod. This is why I prefer
>>>>>>>>>>>>>>>>>> statprof which is using a non intrusive method to get the stats
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Le mar. 9 août 2016 19:15, Francesco Di Mizio <
>>>>>>>>>>>>>>>>>> francescodimizio at gmail.com> a écrit :
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On a side note I have managed to get twistd to dump a
>>>>>>>>>>>>>>>>>>> blob I can then load and analyze. Got to see if I can run this in
>>>>>>>>>>>>>>>>>>> production.
>>>>>>>>>>>>>>>>>>> If anybody is curious the following worked for me:
>>>>>>>>>>>>>>>>>>> twistd --savestats -n --profiler=cprofile
>>>>>>>>>>>>>>>>>>> --profile=/crcdata/profile.stats -y ./buildbot.tac
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Tue, Aug 9, 2016 at 5:21 PM, Francesco Di Mizio <
>>>>>>>>>>>>>>>>>>> francescodimizio at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Well that's expected to an extent - p4 poller is
>>>>>>>>>>>>>>>>>>>> running 'p4 changes' every pollinterval seconds.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Anyway just as a test I've tried to disable the poller
>>>>>>>>>>>>>>>>>>>> and still at times I am seeing the CPU spiking up to above 100%. I believe
>>>>>>>>>>>>>>>>>>>> I really need to profile this somehow.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Tue, Aug 9, 2016 at 5:08 PM, Dan Kegel <
>>>>>>>>>>>>>>>>>>>> dank at kegel.com> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Also watch 'top' and see if poller processes are
>>>>>>>>>>>>>>>>>>>>> hogging resources.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>>>> users mailing list
>>>>>>>>>>>>>>>> users at buildbot.net
>>>>>>>>>>>>>>>> https://lists.buildbot.net/mailman/listinfo/users
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>
>>>
>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.buildbot.net/pipermail/users/attachments/20160817/3039f10a/attachment.html>
More information about the users
mailing list