[devel at bb.net] Buildbot 0.9 performance issue
Aakash Jain
aj355 at cornell.edu
Thu Jun 29 17:04:38 UTC 2017
The large number of pending buildrequests was indeed the issue. I cancelled
most of them, and the buildbot startup time dropped from ~8m to ~10s.
I had the testing instance up and running for ~1 month without any workers,
continuously accumulating buildrequests. In production we should have
workers most of the time, except during unexpected maintenance, in which
case worker downtime shouldn't be more than a few days.
However, I wanted to confirm whether 25,000 is indeed a very large number of
pending buildrequests for buildbot. Has anyone ever run into such a situation?
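
To see how large the backlog is and which builders it sits on, the pending
requests can be counted directly in the database. The sketch below is only
an illustration. It assumes the 0.9 schema has a `buildrequests` table with
`id`, `builderid` and `complete` columns and a `buildrequest_claims` table
keyed by `brid` (please check this against your own database). It runs
against a throwaway in-memory SQLite database with made-up rows, not the
real postgresql instance.

```python
import sqlite3

# Assumed schema (verify against your own database): a request counts as
# pending when it is not complete and has no row in buildrequest_claims.
PENDING_PER_BUILDER_SQL = """
SELECT br.builderid, COUNT(*) AS pending
FROM buildrequests AS br
LEFT JOIN buildrequest_claims AS c ON c.brid = br.id
WHERE br.complete = 0 AND c.brid IS NULL
GROUP BY br.builderid
ORDER BY pending DESC, br.builderid
"""


def pending_per_builder(conn):
    """Return (builderid, pending_count) rows, largest backlog first."""
    return conn.execute(PENDING_PER_BUILDER_SQL).fetchall()


def build_requests_demo_db():
    """Create an in-memory stand-in with hypothetical demo rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE buildrequests (
            id INTEGER PRIMARY KEY,
            builderid INTEGER,
            complete INTEGER DEFAULT 0
        );
        CREATE TABLE buildrequest_claims (brid INTEGER, masterid INTEGER);
        INSERT INTO buildrequests (id, builderid, complete) VALUES
            (1, 218, 0), (2, 218, 0), (3, 219, 0), (4, 219, 1), (5, 220, 0);
        -- request 5 is claimed by a master, so it is not pending
        INSERT INTO buildrequest_claims VALUES (5, 1);
    """)
    return conn


if __name__ == "__main__":
    for builderid, pending in pending_per_builder(build_requests_demo_db()):
        print(builderid, pending)
```

The same SELECT should work unchanged on postgresql if the assumed table
and column names match.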
Thanks
Aakash
On Wed, Jun 28, 2017 at 7:57 PM, Aakash Jain <aj355 at cornell.edu> wrote:
> Thanks Neil. It's good to know that buildbot supports this many workers
> easily. Also, rjarry mentioned over IRC that he is using buildbot for 500+
> builders in a single-master config with a startup time of less than 10s.
> So the number of builders (~200) doesn't seem like an issue in my case.
>
> Also, tardyp pointed out on IRC that *ROLLBACK is not the problem*, it's
> probably normal behavior for selects with sqlalchemy (link
> <https://stackoverflow.com/questions/7559570/make-sqlalchemy-commit-instead-of-rollback-after-a-select-query>).
> I confirmed this by creating a fresh new postgresql database, and I still
> saw ROLLBACKs.
>
> I am still trying to debug the performance issues. I tried removing
> schedulers, but that didn't improve performance.
>
> Also, I have ~25000 pending buildrequests (since I have had a test instance
> running for a while). Can this be the cause of the performance issues (and
> the long startup time I am seeing)?
>
> Thanks
> Aakash
>
>
>
> On Wed, Jun 28, 2017 at 7:33 AM, Neil Gilmore <ngilmore at grammatech.com>
> wrote:
>
>> While we are using a multi-master system, one of our masters has at least
>> that many builders. I think we have on the order of 178 workers on that
>> master, and at least 2 builders on each worker (most have 3 or more). We
>> also use a postgresql database and I've never seen that as a source of any
>> of our problems.
>>
>> Starting that master doesn't take more than 30 seconds, and probably less
>> (I haven't timed it). Reconfigs take a lot longer.
>>
>> In fact we had a reconfig take more than 17K seconds over the weekend. It
>> never finished, because I killed and restarted it. Users were getting
>> annoyed that none of their builds were in the UI.
>>
>> Neil Gilmore
>> grammatech.com
>>
>>
>> On 6/27/2017 8:30 PM, Aakash Jain wrote:
>>
>> Hi Everyone,
>>
>> I am working on migrating our buildbot instance from 0.8.12 to 0.9.8. I
>> did not have performance issues with 0.8.12. However, with buildbot 0.9
>> (with the same configuration), I am seeing a lot of performance issues,
>> mostly related to the database. I was expecting a performance improvement
>> since buildbot 0.9 moved from pickles to a database (and was designed for
>> scalability <https://medium.com/@tardyp31/d0d41bba07e1>), however I am
>> noticing the reverse.
>>
>> My buildbot configuration is:
>> - buildbot 0.9.8
>> - single master
>> - postgresql db
>> - 200+ builders (queues)
>> - only 2 workers connected currently for testing
>> - decent hardware on master (8 GB RAM, 8 CPUs)
>>
>>
>> "buildbot start" is taking ~8 minutes (the start time increased gradually
>> over the last few weeks). Loading various buildbot webpages is also slow
>> (often the page loads, but the build data appears only after ~30s). During
>> this time there isn't much CPU/memory/IO load on the master.
>>
>> I enabled postgresql logs and I see an extremely large number of SELECT
>> statements in the logs. I also see a large number of "ROLLBACK" statements
>> in the postgresql logs. I have a few questions:
>>
>> 1) Why am I seeing so many ROLLBACKs in the postgresql logs? Any
>> suggestions to fix/debug it?
>>
>> 2) From the postgresql logs, I notice that buildbot is running database
>> queries in a loop, fetching data for each individual builder/build
>> (resulting in a large number of queries). Wouldn't it be more efficient to
>> use a single database query to fetch data for all the builders and then
>> filter it in python code?
>>
>> 3) Has anyone tested buildbot 0.9 with 200+ builders (on a single master)?
>>
>>
>> postgresql logs: https://goo.gl/ZCd19o
>>
>>
>> Relevant logs for ROLLBACK (#1):
>>
>> 2017-06-23 21:19:45.491 GMT 24705 0 localhost(56256) LOG: statement:
>> *ROLLBACK*
>> 2017-06-23 21:19:45.494 GMT 24697 0 localhost(56242) LOG: statement:
>> SELECT changes.changeid, changes.author, changes.comments, changes.branch,
>> changes.revision, changes.revlink, changes.when_timestamp,
>> changes.category, changes.repository, changes.codebase, changes.project,
>> changes.sourcestampid, changes.parent_changeids
>> FROM changes
>> WHERE changes.sourcestampid = 6
>> 2017-06-23 21:19:45.496 GMT 24717 0 localhost(56270) LOG: statement:
>> SELECT changes.changeid, changes.author, changes.comments, changes.branch,
>> changes.revision, changes.revlink, changes.when_timestamp,
>> changes.category, changes.repository, changes.codebase, changes.project,
>> changes.sourcestampid, changes.parent_changeids
>> FROM changes
>> WHERE changes.sourcestampid = 6
>> 2017-06-23 21:19:45.496 GMT 24715 0 localhost(56266) LOG: statement:
>> *ROLLBACK*
>> 2017-06-23 21:19:45.498 GMT 24697 0 localhost(56242) LOG: statement:
>> *ROLLBACK*
>> 2017-06-23 21:19:45.499 GMT 24698 0 localhost(56244) LOG: statement:
>> SELECT changes.changeid, changes.author, changes.comments, changes.branch,
>> changes.revision, changes.revlink, changes.when_timestamp,
>> changes.category, changes.repository, changes.codebase, changes.project,
>> changes.sourcestampid, changes.parent_changeids
>> FROM changes
>> WHERE changes.sourcestampid = 6
>>
>> Relevant logs for #2:
>>
>> 2017-06-23 21:19:52.221 GMT 24738 localhost(56312) LOG: statement:
>> SELECT tags.name
>> FROM tags JOIN builders_tags ON tags.id = builders_tags.tagid
>> WHERE builders_tags.builderid = 218
>> 2017-06-23 21:19:52.223 GMT 24738 localhost(56312) LOG: statement:
>> SELECT tags.name
>> FROM tags JOIN builders_tags ON tags.id = builders_tags.tagid
>> WHERE builders_tags.builderid = 219
>> 2017-06-23 21:19:52.226 GMT 24738 localhost(56312) LOG: statement:
>> SELECT tags.name
>> FROM tags JOIN builders_tags ON tags.id = builders_tags.tagid
>> WHERE builders_tags.builderid = 220
>> 2017-06-23 21:19:52.235 GMT 24738 localhost(56312) LOG: statement:
>> SELECT tags.name
>> FROM tags JOIN builders_tags ON tags.id = builders_tags.tagid
>> WHERE builders_tags.builderid = 221
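
The difference between the per-builder loop seen in these logs and the
single-query alternative raised in question 2 can be sketched as follows.
This is a minimal illustration, not buildbot's actual code: it uses an
in-memory SQLite database in place of postgresql, the `tags` and
`builders_tags` tables are modelled only on the columns visible in the log
lines above, and the tag names and function names are made up for the
example.

```python
import sqlite3


def build_tags_demo_db():
    """Create an in-memory stand-in for the two tables seen in the logs."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE builders_tags (builderid INTEGER, tagid INTEGER);
        -- hypothetical demo rows
        INSERT INTO tags VALUES (1, 'linux'), (2, 'nightly');
        INSERT INTO builders_tags VALUES (218, 1), (219, 1), (219, 2), (221, 2);
    """)
    return conn


def tags_per_builder_loop(conn, builder_ids):
    """One SELECT per builder, the pattern visible in the log excerpt."""
    result = {}
    for builder_id in builder_ids:
        rows = conn.execute(
            "SELECT tags.name "
            "FROM tags JOIN builders_tags ON tags.id = builders_tags.tagid "
            "WHERE builders_tags.builderid = ? "
            "ORDER BY tags.name",
            (builder_id,),
        ).fetchall()
        result[builder_id] = [name for (name,) in rows]
    return result


def tags_per_builder_batched(conn, builder_ids):
    """One SELECT for all builders, grouped per builder in Python."""
    builder_ids = list(builder_ids)
    if not builder_ids:
        return {}
    result = {builder_id: [] for builder_id in builder_ids}
    placeholders = ", ".join("?" for _ in builder_ids)
    rows = conn.execute(
        "SELECT builders_tags.builderid, tags.name "
        "FROM tags JOIN builders_tags ON tags.id = builders_tags.tagid "
        "WHERE builders_tags.builderid IN (" + placeholders + ") "
        "ORDER BY builders_tags.builderid, tags.name",
        builder_ids,
    ).fetchall()
    for builder_id, name in rows:
        result[builder_id].append(name)
    return result


if __name__ == "__main__":
    conn = build_tags_demo_db()
    ids = [218, 219, 220, 221]
    # Both approaches return the same mapping; only the query count differs.
    print(tags_per_builder_loop(conn, ids) == tags_per_builder_batched(conn, ids))
```

With N builders the loop issues N round trips to the database, while the
batched version issues one. Whether that accounts for the slowness reported
here is a separate question.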
>>
>>
>> Thanks in Advance
>>
>> -Aakash
>>
>>
>> _______________________________________________
>> devel mailing list
>> devel at buildbot.net
>> https://lists.buildbot.net/mailman/listinfo/devel
>>
>
>