[users at bb.net] Limit amount of DockerLatentWorkers running on a particular physical machine

Vlad Bogolin vlad at mariadb.org
Wed Jul 6 10:51:06 UTC 2022


Hi,

I have a follow-up question: is it possible to use the locking mechanism in
a multi-master setup? From my attempts, it seems that each master process
has its own locks. Do you have any suggestions on what we might use to
limit the number of DockerLatentWorkers that can run when they are shared
among multiple master processes?

Thank you!

On Mon, Jul 4, 2022 at 10:20 PM Vlad Bogolin <vlad at mariadb.org> wrote:

> Great, thanks! Please keep me posted!
>
> On Mon, Jul 4, 2022 at 10:18 PM Povilas Kanapickas <povilas at radix.lt>
> wrote:
>
>> Your setup is exactly what I described and should work (at least in
>> theory). I will look into this when I have time.
>>
>> On 2022-07-04 22:05, Vlad Bogolin wrote:
>> > Thanks for the prompt reply!
>> >
>> > Our locks are defined as you can see here:
>> > https://github.com/MariaDB/buildbot/blob/main/locks.py
>> > and then each build receives the lock function as an argument:
>> > https://github.com/MariaDB/buildbot/blob/cd5378a7b6549e3bf5930306c2eed29239aa3a38/master.cfg#L961
>> > However, for example, this build
>> > (https://buildbot.mariadb.org/#/builders/348/builds/1638) has now been
>> > waiting for 6h to acquire the locks. Overall there are currently 11
>> > builds that have started and are waiting for locks. I have considered
>> > this to be normal, but if you think there is an issue please let me know.
>> >
>> > Thank you!
>> >
>> > On Mon, Jul 4, 2022 at 9:52 PM Povilas Kanapickas <povilas at radix.lt>
>> > wrote:
>> >
>> >     Hi,
>> >
>> >     If you use renderable *builder* locks, builds will not start and
>> >     then wait for the locks. Builder lock resolution happens before
>> >     the build actually starts, even before the canStartBuild function
>> >     is called.
>> >
>> >     Starting a build that later can't acquire the builder locks we
>> >     just checked should be a rare occurrence. If that's not the case,
>> >     it's a bug I would be interested in investigating.
>> >
>> >     So what I would do is set the builder locks argument to a
>> >     renderable function. Within that function I would check the
>> >     relevant build properties, such as which worker the build is about
>> >     to start on, and then return the set of locks that must be
>> >     acquired. This way you would have almost complete flexibility. For
>> >     each resource you can have a separate master lock with maxCount
>> >     representing maximum resource utilization (e.g. 512GB RAM or
>> >     whatever), and then the builders would take e.g.
>> >     lock.access('counting', count=8) to acquire an 8GB RAM slice.
>> >
>> >     Regards,
>> >     Povilas
>> >
>> >     On 2022-07-04 21:39, Vlad Bogolin wrote:
>> >     > Hi,
>> >     >
>> >     > Thank you for your reply! Is there any way to customize what
>> >     > oversubscribed means? We already use a locking mechanism, but
>> >     > this still translates into having multiple running builds that
>> >     > just wait for the locks for several hours. Ideally, I would like
>> >     > to avoid this.
>> >     >
>> >     > Also, by any chance, can you read a lock value from the
>> >     > canStartBuild function?
>> >     >
>> >     > Thank you!
>> >     > Vlad Bogolin
>> >     >
>> >     > On Fri, Jul 1, 2022 at 12:40 PM Povilas Kanapickas
>> >     > <povilas at radix.lt> wrote:
>> >     >
>> >     >     Hi Vlad,
>> >     >
>> >     >     You could set up a number of master locks that are each
>> >     >     assigned to a particular physical machine. Then you can set
>> >     >     up renderable locks for builds: a build would look at which
>> >     >     physical machine it's about to launch on and select the
>> >     >     correct lock. If the physical machine is oversubscribed,
>> >     >     Buildbot will notice that the lock cannot be acquired and
>> >     >     look for another worker for the build.
>> >     >
>> >     >     Regards,
>> >     >     Povilas
>> >     >
>> >     >     On 2022-06-28 12:44, Vlad Bogolin wrote:
>> >     >     > Hello,
>> >     >     >
>> >     >     > We are using buildbot with primarily DockerLatentWorkers
>> >     >     > for our CI. So, given a physical machine, we have several
>> >     >     > DockerLatentWorkers that may run on it. While this works
>> >     >     > well, in some cases buildbot starts too many latent workers
>> >     >     > on the same machine. Is there a way to limit starting builds
>> >     >     > for a particular DockerLatentWorker if others are already
>> >     >     > running on the same machine?
>> >     >     >
>> >     >     > I feel like this should be achievable using canStartBuild,
>> >     >     > but I am unsure how. Is it possible to access the full list
>> >     >     > of defined latent workers and see whether one is running
>> >     >     > from within the canStartBuild function?
>> >     >     >
>> >     >     > Thank you!
>> >     >     > Vlad Bogolin
>> >     >     >
>> >     >
>> >     >
>> >     >
>> >     > --
>> >     > Vlad
>> >
>> >
>> >
>> > --
>> > Vlad
>>
>
>
> --
> Vlad
>


-- 
Vlad
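
For reference, the renderable builder-lock approach described in the quoted
replies might look roughly like the sketch below. It is only an illustration
under stated assumptions: the worker names, host names, capacities and the
helper functions are hypothetical, and it assumes that the "workername"
property is available when the builder locks are rendered and that the
installed Buildbot version accepts the count argument to lock.access()
mentioned above. Since each master process appears to keep its own locks,
this sketch does not solve the multi-master case asked about at the top.

```python
# Hypothetical mapping from worker name to the physical host it runs on.
WORKER_HOSTS = {
    "hostA-docker-1": "hostA",
    "hostA-docker-2": "hostA",
    "hostB-docker-1": "hostB",
}

# Hypothetical capacity of each physical host, in GB of RAM.
HOST_CAPACITY_GB = {"hostA": 64, "hostB": 32}

# Hypothetical RAM slice that a single build reserves, in GB.
BUILD_RAM_GB = 8


def host_for_worker(workername):
    """Return the physical host for a worker, or None if it is unknown."""
    return WORKER_HOSTS.get(workername)


def max_concurrent_builds(host):
    """Number of builds that fit on a host if each reserves BUILD_RAM_GB."""
    return HOST_CAPACITY_GB[host] // BUILD_RAM_GB


def make_builder_locks():
    """Create a renderable intended for BuilderConfig(locks=...)."""
    # Imported lazily so the helpers above can be tried without Buildbot.
    from buildbot.plugins import util

    # One master lock per physical host; maxCount is the host's RAM in GB.
    host_locks = {
        host: util.MasterLock(f"ram-{host}", maxCount=capacity)
        for host, capacity in HOST_CAPACITY_GB.items()
    }

    @util.renderer
    def builder_locks(props):
        # Assumption: "workername" is already set when locks are rendered.
        host = host_for_worker(props.getProperty("workername"))
        if host is None:
            return []  # unknown worker: do not require any lock
        # Reserve one RAM slice on the host this build would run on.
        return [host_locks[host].access("counting", count=BUILD_RAM_GB)]

    return builder_locks
```

In master.cfg the result of make_builder_locks() would then be passed as the
locks argument of each builder whose workers share those hosts.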