[Buildbot-devel] Why no ETA (sometimes)?

Brian Warner warner-buildbot at lothar.com
Sat Jan 26 18:52:16 UTC 2008


> I didn't realize how useful Buildbot's ETA feature was until I started
> using BB for production builds that our QA team is waiting for.  It
> rocks!

Heh. It'd be even more useful if I could manage to rewrite the way it's
calculated to make it a bit more linear. I have a plan for this but haven't
managed to get it implemented. I'll see if I can at least describe my plan in
a ticket, maybe someone else could take it on.

> But now the ETA seems to have gone away from the waterfall page.

The ETA is calculated using an exponentially-weighted average of previous
builds (newavg = (newvalue+oldavg)/2, so it's an IIR response), and the
history is *not* currently saved to disk. As a result, you won't see an ETA
until you've had at least one build since the Builder was created (i.e.
modified) or the buildmaster was restarted. I seem to recall having code that
would ignore the completion time of failing builds, or perhaps of builds
which had steps skipped (e.g., if your compile step is marked
haltOnFailure=True and it failed). The idea is that a build which fails
and does only half as much work as usual shouldn't change our expectations
about build time.
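
A minimal sketch of the averaging described above, not the actual code in
buildbot/status/progress.py. The class name `Expectation` and the helper
`record_build` are made up for illustration, and skipping failed builds is
only the behavior recalled above, not something confirmed here.

```python
class Expectation:
    """Hypothetical running estimate for one metric (e.g. build time)."""

    def __init__(self):
        # No history yet (fresh Builder or restarted master): no ETA.
        self.average = None

    def update(self, new_value):
        """Fold one observation in: newavg = (newvalue + oldavg) / 2."""
        if self.average is None:
            self.average = float(new_value)
        else:
            self.average = (new_value + self.average) / 2.0
        return self.average


def record_build(expectation, elapsed_seconds, succeeded):
    """Assumed policy: only successful builds change the expectation."""
    if succeeded:
        expectation.update(elapsed_seconds)
    return expectation.average
```

Each update halves the weight of all earlier builds, so the estimate tracks
recent builds closely and older builds fade out quickly.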

Does that sound like it might explain what you've seen?

buildbot/status/progress.py is where this stuff is gathered, in case you want
to look at the code.


> If so, any plans to fix it by persisting whatever is needed to calculate
> ETA?

That would be great.. feel free to create a ticket (or better yet a patch
:-). The current scheme would require storing a dict of (metricname ->
expectation) for each step. The builder pickle would be the most appropriate
place for this.
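
As a rough illustration of what would need to be stored, here is a sketch
that persists a per-step dict of (metricname -> expectation) with the
standard pickle module. It writes a separate file purely for demonstration.
The function names, the file layout, and the nested-dict shape are
assumptions and do not reflect how the builder pickle is actually organized.

```python
import pickle


def save_expectations(path, expectations):
    """Write {step_name: {metric_name: expected_value}} to disk.

    The nested-dict layout is an assumption made for this sketch.
    """
    with open(path, "wb") as f:
        pickle.dump(expectations, f)


def load_expectations(path):
    """Read saved expectations, or return {} if nothing was saved yet."""
    try:
        with open(path, "rb") as f:
            return pickle.load(f)
    except FileNotFoundError:
        # First run: no history, so no ETA until a build completes.
        return {}
```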

The new scheme I'm thinking of would record a list of (time, progress)
tuples, sampled maybe once every 5 seconds, for each (metric, step) pair. The idea is
to create a graph of progress-vs-time, then transpose it to get a graph of
time-vs-progress, then trim it down to a smaller number of samples and save
it. For the next build, when the metric tells us that we've gotten, say, 15kB
of stdout, we look it up on the graph and discover that we usually see 15kB
of output when the build is 60% finished, and report that percentage. This
will improve the (quite common) case where the output from a step is far from
linear.. a lot of compiles will spit out a huge amount of output at the
beginning, then very little (per unit time) towards the end. Of course,
adding more progress-measuring metrics (like an output watcher that looks for
'gcc' lines so it can count how many files have been compiled) will improve
accuracy. You could even go crazy and record which files were compiled and
when, so the metric could learn that compiling bar.c means we're 60%
complete.
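
The proposed scheme could be sketched roughly as follows. This is only one
possible reading of the plan, not an existing Buildbot implementation: the
names `build_curve` and `estimate_fraction`, the choice of 20 retained points,
and the use of linear interpolation are all assumptions.

```python
from bisect import bisect_right


def build_curve(samples, max_points=20):
    """Turn (elapsed_seconds, metric_value) samples from a finished step
    into a trimmed list of (metric_value, fraction_of_total_time) points.

    Assumes samples are in time order and max_points >= 2.
    """
    total = samples[-1][0]
    if total <= 0:
        raise ValueError("step must have a positive duration")
    curve = []
    for elapsed, value in samples:
        # Transpose: index by metric value, keeping the earliest time at
        # which each value was reached (plateaus add no information).
        if curve and value <= curve[-1][0]:
            continue
        curve.append((value, elapsed / total))
    if len(curve) > max_points:
        # Trim to evenly spaced points, always keeping first and last.
        step = (len(curve) - 1) / (max_points - 1)
        curve = [curve[round(i * step)] for i in range(max_points)]
    return curve


def estimate_fraction(curve, value):
    """Look up how far along a step usually is when the metric reads value."""
    values = [v for v, _ in curve]
    i = bisect_right(values, value)
    if i == 0:
        return curve[0][1]
    if i == len(curve):
        return curve[-1][1]
    (v0, f0), (v1, f1) = curve[i - 1], curve[i]
    # Linear interpolation between the two neighboring points.
    return f0 + (value - v0) / (v1 - v0) * (f1 - f0)


# A front-loaded step: most stdout arrives early, little towards the end.
samples = [(0, 0), (10, 8000), (20, 12000), (30, 15000),
           (40, 16000), (50, 17000)]
curve = build_curve(samples)
fraction = estimate_fraction(curve, 15000)  # 15kB of stdout seen so far
```

With these made-up samples, 15kB of output was reached 30 seconds into a
50-second step, so the lookup reports the step as about 60% finished, even
though 15kB is nearly 90% of the total output.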

cheers,
 -Brian



