[Buildbot-devel] RFC: Assigning builds when load is high

Vitali Lovich vlovich at gmail.com
Wed May 6 20:59:11 UTC 2015

The problem with #2 is that you won’t make full use of your compute cluster, since builds will be waiting for a particular buildslave even though other buildslaves may be idle.

The approach I’ve found works better is a prioritization scheme that knows which jobs are likely to be quick & which aren’t, so that quick jobs are picked for completion first.
Unfortunately this makes it a domain-specific problem, but it is tractable.  Ping me offline if you want to discuss the details of our setup.
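
As a rough illustration of "quick jobs first" (not the setup mentioned above, whose details are not given here), a minimal sketch could keep pending requests in a heap keyed by expected duration. The class name QuickFirstQueue and the estimate callable are hypothetical; how the estimate is produced is the domain-specific part.

```python
import heapq
from itertools import count


class QuickFirstQueue:
    """Hypothetical queue that hands out the request with the
    smallest expected duration first."""

    def __init__(self, estimate):
        # estimate: callable mapping a request to expected seconds.
        # Producing that number is the domain-specific part.
        self._estimate = estimate
        self._heap = []
        # Tie-breaker so equal estimates keep submission order and
        # the requests themselves are never compared.
        self._tie = count()

    def push(self, request):
        entry = (self._estimate(request), next(self._tie), request)
        heapq.heappush(self._heap, entry)

    def pop(self):
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)
```

With estimates of 60s, 300s and 3600s for three pending requests, pop() returns them in that order no matter when each was submitted.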

If buildbot wants to solve scheduling properly, I think there are a few moving parts, all of which depend on a revamped ETA:

1. ETA needs to be implemented properly & robustly.  That means being able to provide a buildslave-specific ETA for each build step and to take domain-specific dimensions into account.
In other words, the user has to be able to provide a set of properties that must match when selecting ETA samples, so that the ETA is still correct if a builder is shared between projects, or if a build request is for a clean build vs. an incremental one.
Similarly, some kind of fallback mechanism is likely necessary: if I’m building a branch, it likely needs to use the master ETA as a baseline when we don’t have anything more up-to-date for the branch itself.
2. ETA needs a guess function that, given a buildslave & domain-specific dimensions, returns how long the build *would* take (including any locks we might need to acquire up-front).
3. An ETA would need to be available for locks under the current load, derived from the completion ETA of the builds/buildsteps holding the lock (for a read-only lock, the max of the ETAs of the things holding it).
4. The queue would need to take the ETA of a given BR on each buildslave & then try to use the buildslave that minimizes the ETA (regardless of any locks currently being held).
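
To make points 1 to 3 concrete, here is a minimal sketch of what such an ETA store might look like. Nothing here is existing buildbot API: EtaStore, record, guess and lock_wait are hypothetical names, the property keys are examples, and the median is just one plausible choice of estimator.

```python
from statistics import median


class EtaStore:
    """Hypothetical per-buildslave ETA samples, keyed by the
    properties that must match (project, branch, clean build, ...)."""

    def __init__(self):
        # (slave, frozenset of property items) -> list of durations
        self._samples = {}

    def record(self, slave, props, seconds):
        key = (slave, frozenset(props.items()))
        self._samples.setdefault(key, []).append(seconds)

    def guess(self, slave, props, fallbacks=()):
        """Estimate how long a build *would* take on this slave.

        Tries an exact property match first, then each fallback
        (a dict of property overrides, e.g. {"branch": "master"})
        in order. Returns None when no samples match.
        """
        candidates = (props, *[{**props, **fb} for fb in fallbacks])
        for candidate in candidates:
            key = (slave, frozenset(candidate.items()))
            samples = self._samples.get(key)
            if samples:
                return median(samples)
        return None


def lock_wait(holder_etas):
    """Seconds until a lock is free, given the remaining ETA of each
    build/step holding it: the slowest holder decides."""
    return max(holder_etas, default=0.0)
```

For example, after recording three clean master builds on one slave, guess() for a clean build of a new branch with fallbacks=[{"branch": "master"}] returns the master median, and None if no fallback is given.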

This way, if you add a machine that is 10x faster than the rest, you’ll have jobs queue up on it leaving your slower machines idle until it’s faster to overflow to other machines.
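
The overflow behaviour can be checked with a toy model of the greedy rule in point 4. This is only a sketch under simplifying assumptions: every job has a known amount of work, each slave has a fixed speed, locks are ignored, and assign_greedy is a hypothetical helper rather than anything in buildbot.

```python
def assign_greedy(job_work, slave_speed):
    """Assign each job, in order, to the slave on which it would
    finish soonest, counting work already queued on that slave.

    job_work: list of work units per job.
    slave_speed: dict of slave name -> work units per second.
    Returns a list of (slave, finish_time) tuples, one per job.
    """
    busy_until = {name: 0.0 for name in slave_speed}
    result = []
    for work in job_work:
        # ETA on a slave = time until it is free + run time there.
        best = min(
            slave_speed,
            key=lambda s: busy_until[s] + work / slave_speed[s],
        )
        busy_until[best] += work / slave_speed[best]
        result.append((best, busy_until[best]))
    return result
```

With one slave at speed 10 and two at speed 1, twelve identical 100-unit jobs put the first ten on the fast slave (the tenth ties with a slow slave at 100s and goes to whichever is listed first) and only then overflow, one job each, to the slow ones.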

This isn’t optimal from a total queue scheduling perspective since it’s greedy rather than co-operative, but it will likely behave as users expect (i.e. use all the available capacity so that jobs finish as quickly as possible).


> On May 6, 2015, at 1:15 PM, Jared Grubb <jared.grubb at gmail.com> wrote:
> Many months ago, I made a change in buildbot to enhance the way that buildslaves and builds get assigned. In particular, we added a “canStartBuild” functor that lets you adjust how these mappings happen.
> There was a design decision I made that I’m starting to regret (and have disabled in my buildbot).
> Question:
> - The BRD attempts to pick buildslaves that can acquire builder locks. If no buildslave qualifies (i.e. under high load), we have two choices:
>    1. pick a random buildslave that would work otherwise
>    2. give up and wait until a buildslave can acquire the locks needed
> Currently, the BRD does #1. However, I’ve seen this cause problems when quick builds get stuck behind long builds: my set of buildslaves goes idle except for one, which has a few builds on it, all stuck behind one long build. If we did #2, then the short builds would get assigned immediately as the next buildslave goes idle.
> I am thinking that #2 should be the default behavior — or at least be opt-in configurable.
> Note this applies to both eight and nine and is a fairly trivial patch either way.
> Anyone have any thoughts or comments?
> Jared
