[Buildbot-devel] GSoC: Initial thoughts on the Graphs and Data Charts Project

Wed Mar 18 18:03:07 UTC 2015

Hi Prasoon,

(Sorry for nitpicking, the tool is called Buildbot, not buildbot nor BuildBot.)

On Wed, Mar 18, 2015 at 02:24:25AM +0530, Prasoon Shukla wrote:
> Okay. So from the last couple of posts, Buildbot can already gather data
> for the supported frameworks. This gathering happens in the Steps. In case
> someone wants to add support for a new framework, they can simply add a
> new Step that can process output from *that* framework.
> As an example, if I wanted to add support for JSLint, I'd have to write a
> new Step like the one written for PyLint.
Good step forward.

> (Again, let me use an example. Suppose a Step has collected the number of
> skipped tests. Then, Step can either directly call a function to store this
> number in the influx databases via an API call. Or, the Step can call a
> function in metrics module - in this case, MetricCountEvent.log('Tests
> skipped', skipped_tests) - which will then call *another* function to store
> this number in the influx database).
Thanks for the example (I'll return to it later).  I do not really see a
problem with one function calling another.

>    If the core devs *really* want me to use the metrics module as middleman,
>    please provide me the reasoning for it. I might be wrong, of course, and
>    there might be a good reason behind using the metrics module but I cannot
>    see any such reason at present.
You almost got the right word: the correct one is middleware.  Pierre already
listed a number of advantages of middleware (thanks, Pierre!).  I'll slightly
re-phrase them here:

* to describe to potential middleware users what kinds of metrics are
  available and will be handled correctly (absolute values, cumulative
  numbers, times, events, etc.)
* to hide the details of the metrics storage implementation while offering a
  clear API that any part of Buildbot can use to produce any kind of metric;
  those parts need to be written only once, in a clear way, and keep working
  regardless of which storage mechanism is used
* to make it possible to "process" metrics, should that be necessary (not in
  scope of this project)
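
To make the second point concrete, here is a minimal sketch in plain Python.
It is not Buildbot's actual metrics module: CountMetric, Storage,
InMemoryStorage and Metrics are hypothetical names I made up for the
illustration.  The producer only talks to Metrics and never sees the storage
backend behind it.

```python
from dataclasses import dataclass
from typing import List, Protocol


@dataclass(frozen=True)
class CountMetric:
    """A named counter value (hypothetical type, not Buildbot's)."""
    name: str
    value: int


class Storage(Protocol):
    """Anything that can persist a metric (hypothetical interface)."""
    def store(self, metric: CountMetric) -> None: ...


class InMemoryStorage:
    """Stand-in backend; a real one might write to InfluxDB instead."""
    def __init__(self) -> None:
        self.saved: List[CountMetric] = []

    def store(self, metric: CountMetric) -> None:
        self.saved.append(metric)


class Metrics:
    """Middleware: producers call log_count() and never see the backend."""
    def __init__(self, storage: Storage) -> None:
        self._storage = storage

    def log_count(self, name: str, value: int) -> None:
        self._storage.store(CountMetric(name, value))


backend = InMemoryStorage()
metrics = Metrics(backend)
metrics.log_count("Tests skipped", 3)  # what a Step would call
```

Swapping InMemoryStorage for another backend would not require touching the
code that calls log_count().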

>    So, as was mentioned in the post before, buildbot is already capable of
>    collecting this data and so, this is a non issue. For other types of data
>    that buildbot cannot collect, I would like to go with the script based
>    approach I mentioned in an earlier email. Again, let's have an example.
>    So, say I want to measure the total number of function calls resulting
>    from a certain unit-test function. If our tests are in python, then we can
>    use the traceback module to measure this number. I will run the tests as
>    usual and this number will be measured. Then, I'll write this number to a
>    temporary file and make a simple script that echos this number to stdout.
>    This script I can put in a new Step which will read this number as usual
>    and pass it off to influx to be stored.
>    This approach takes care of getting buildbot to gather arbitrary data.
Excellent example.  After you read the data from stdout, you'd need to "turn it
into a metric".
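
A rough sketch of those two steps, under my own assumptions: the helper
names read_count and to_metric are hypothetical, the dictionary layout is
made up, and a one-line Python command stands in for the script that echoes
the number from the temporary file.

```python
import subprocess
import sys


def read_count(command):
    """Run a command and parse the single integer it prints to stdout."""
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return int(result.stdout.strip())


def to_metric(name, value):
    """Attach a name and a kind to a bare number (hypothetical structure)."""
    return {"name": name, "kind": "count", "value": value}


# Stand-in for the script that echoes the measured number to stdout.
command = [sys.executable, "-c", "print(1234)"]

# First the data is only a number; to_metric() gives it a name and a kind.
metric = to_metric("Function calls", read_count(command))
```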

>>> Just to illustrate with an example, the test suite for LLVM is written
>>> using google test framework. Then, we cannot use metrics code to
>>> measure/count stuff within the LLVM test suite. This is what I meant when
>>> I said that metrics code can not be used outside buildbot.
>>> So, for all non-trivial build data (e.g. timing a single unit test) we
>>> cannot use metrics module.
>> I do not agree with the conclusion ("we cannot use metrics module"), however
>> I must note that this is another very good example for the point above.
>
> Well, please provide a counter-example in that case. How would you use the
> buildbot.metrics code inside a C project? Would you write a Python-C
> interface to interact with the buildbot code?
I do not quite understand: in the above example you were talking about writing
certain data to a temporary file, and then reading it so the data is available
to Buildbot.  Now you talk about modifying the application under test so it
would directly use Buildbot's API.

What's the difference between this and the previous examples?

>> how do I make Buildbot aware of this data and how to turn this
>> data into a metric
>
> You keep using that phrase. Unfortunately, I don't quite understand what
> you mean by *turning* the build data into metric. Please explain this by
> way of an example. Thank you.
It's great that my repetition stood out.

Let me take your example:
> (Again, let me use an example. Suppose a Step has collected the number of
> skipped tests. Then, Step can either directly call a function to store this
> number in the influx databases via an API call. Or, the Step can call a
> function in metrics module - in this case, MetricCountEvent.log('Tests
> skipped', skipped_tests) - which will then call *another* function to store
> this number in the influx database).
So in this case the number of skipped tests somehow ends up in a variable
called skipped_tests, and at that point it is just a number in a particular
variable.  However, when you run
"MetricCountEvent.log('Tests skipped', skipped_tests)", you "turn" the value
in skipped_tests into a metric named "Tests skipped".
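
The same idea as a self-contained sketch.  CountEvent below is a
hypothetical stand-in that I wrote for this email, not the real
MetricCountEvent; it only shows how a log() call wraps a bare value into a
named event that interested handlers can receive.

```python
class CountEvent:
    """Hypothetical stand-in for a count-type metric event."""
    handlers = []  # callables that receive each logged event

    def __init__(self, name, count):
        self.name = name
        self.count = count

    @classmethod
    def log(cls, name, count):
        """Wrap the value into an event and pass it to every handler."""
        event = cls(name, count)
        for handler in cls.handlers:
            handler(event)
        return event


received = []
CountEvent.handlers.append(received.append)

skipped_tests = 7  # just a number in a variable
event = CountEvent.log("Tests skipped", skipped_tests)  # now a named metric
```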

-- 
Misha