[Buildbot-devel] GSoC: Initial thoughts on the Graphs and Data Charts Project

Mon Mar 16 13:30:25 UTC 2015

On Mon, Mar 16, 2015 at 12:31 PM, Mikhail Sobolev <mss at mawhrin.net> wrote:

> Hi Prasoon,
>
> Please keep the mailing list in the loop.
>
> On Sat, Mar 14, 2015 at 06:21:05PM +0530, Prasoon Shukla wrote:
> > On Wed, Mar 11, 2015 at 11:49 PM, Mikhail Sobolev <mss at mawhrin.net>
> wrote:
> >
> >      My inclination is that these two projects are actually one project.
> >      I'll read more carefully the ideas for both and return to this.
> >
> >    Hi Mikhail. Any updates on this?
> Yes, I have some.  I should be able to write something down during next
> days.
>
> Meanwhile you could present your ideas in more details.

Of course. So, hi again everyone.

There are now ten days left till the final deadline. As a past GSoC
student, I believe that this is barely enough time to file a good quality
application (which involves thoroughly discussing your idea, writing a good
application, getting feedback from the mentor and acting on that feedback).
So, I'd like to start writing my application. For that, I'd like to present
my ideas in a bit more detail in this post.

As I said before, to use the metrics module, I'd first need to add data
storage and retrieval capability to it. So, as of now, my project would
consist of the following three logical units:
1. Getting metrics up and running with data storage and retrieval.
2. Using the metrics module, as much as possible [Ref 1], to gather and
store test data.
3. Providing access to this data via an API and passing it to a graphing
library to create graphs/charts.

[Ref 1] Mikhail believes that metrics would have a "very close relationship
with this project". Upon closer examination, however, I found that while
these two projects do have *some* overlap, IMHO, it is not as large as
Mikhail believes. The reason for this is that metrics is made for internal
data collection within buildbot itself and not external use. It is not
something that a user can (or should) use within their test suite. This
means that all the nice data gathering tools in metrics - MetricCountEvent,
MetricTimeEvent, countMethod and timeMethod decorators etc. - cannot be
used by a user within their test suite simply because internal buildbot
code cannot (should not) be imported in the test suite of a project.

Now, let's look at each of the three points I mentioned above.

1. The issue of metrics data storage and retrieval.

*Storage:*
Right now, there are three different kinds of MetricEvents for which we'll
need to store data. For each, we can store the data in a separate table in
the form of a (key, value, timestamp) tuple.
So, here is a basic schema of the three tables (one for each of the three
event types). Here, key and timestamp have the same type for every
MetricEvent, while the type of the value differs between MetricEvents.
*key*: string
*timestamp*: DB timestamp
*value*:

- for *MetricCountEvent* - integer.

- for *MetricTimeEvent* - needs two columns: an average column that stores
floats and a list column that stores a string representation of a Python
list (of ints) (see the event handler
<https://github.com/buildbot/buildbot/blob/6dd2ddfca0ec70659ac3514e2b1106d3edcc344d/master/buildbot/process/metrics.py#L242>
for this event for more information).

- for *MetricAlarmEvent* - integer representing the alarm type (see the
event handler
<https://github.com/buildbot/buildbot/blob/6dd2ddfca0ec70659ac3514e2b1106d3edcc344d/master/buildbot/process/metrics.py#L270>
for more information).
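
To make the schema above concrete, here is a rough sketch of the three
tables. It uses the standard-library sqlite3 module instead of sqlalchemy
only to stay self-contained; the table names and the column names "average",
"samples" and "level" are my own placeholders, not anything that exists in
buildbot today.

```python
import sqlite3

# Hypothetical table and column names; one table per MetricEvent type.
SCHEMA = """
CREATE TABLE metric_count_events (
    key       TEXT NOT NULL,
    value     INTEGER NOT NULL,
    timestamp TIMESTAMP NOT NULL
);
CREATE TABLE metric_time_events (
    key       TEXT NOT NULL,
    average   REAL NOT NULL,
    samples   TEXT NOT NULL,     -- string representation of a list of ints
    timestamp TIMESTAMP NOT NULL
);
CREATE TABLE metric_alarm_events (
    key       TEXT NOT NULL,
    level     INTEGER NOT NULL,  -- integer representing the alarm type
    timestamp TIMESTAMP NOT NULL
);
"""


def create_metric_tables(conn):
    """Create the three event tables on an open DB connection."""
    conn.executescript(SCHEMA)


conn = sqlite3.connect(":memory:")
create_metric_tables(conn)
tables = sorted(
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
```

In the real implementation these would presumably be declared as sqlalchemy
tables alongside the existing buildbot schema.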

We will, of course, continue logging the metrics to the twisted log files.
Also, the actual call to the db to save data can be triggered by using a
hook on the MetricEvent.log classmethod.
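
As a rough illustration of the hook idea, the sketch below uses stand-in
classes rather than the real buildbot ones. The hook registry
(_log_hooks, add_log_hook) and the simplified MetricCountEvent constructor
are assumptions made for this example only.

```python
class MetricEvent(object):
    """Stand-in for buildbot's MetricEvent; NOT the real class."""

    _log_hooks = []  # hypothetical registry of callables taking an event

    @classmethod
    def add_log_hook(cls, hook):
        MetricEvent._log_hooks.append(hook)

    @classmethod
    def log(cls, *args, **kwargs):
        event = cls(*args, **kwargs)
        # The existing logging to the twisted log would stay here.
        for hook in MetricEvent._log_hooks:
            hook(event)
        return event


class MetricCountEvent(MetricEvent):
    """Simplified stand-in carrying only a counter name and a count."""

    def __init__(self, counter, count=1):
        self.counter = counter
        self.count = count


saved = []  # stands in for the db in this sketch


def save_to_db(event):
    # A real hook would insert a (key, value, timestamp) row instead.
    saved.append((type(event).__name__, event.counter, event.count))


MetricEvent.add_log_hook(save_to_db)
MetricCountEvent.log("builds_started", 2)
```

The point is only that callers keep using .log() as before, and the db write
happens as a side effect of that call.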

*Retrieval:*
The selection would be on the keys of various event tables. The filtering
could be on the timestamps or on a range of values (for example, selecting
all MetricTimeEvents with key 'foo' having average less than 10 seconds).
Retrieval methods can then be implemented via sqlalchemy ORM.
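
A possible shape for one such getter, again sketched with sqlite3 rather than
sqlalchemy so that it runs on its own. The function name get_time_events, its
parameters and the table layout are assumptions for illustration.

```python
import sqlite3


def get_time_events(conn, key, max_average=None, since=None):
    """Select time events by key, optionally filtered by average/timestamp."""
    query = "SELECT key, average, timestamp FROM metric_time_events WHERE key = ?"
    params = [key]
    if max_average is not None:
        query += " AND average < ?"
        params.append(max_average)
    if since is not None:
        query += " AND timestamp >= ?"
        params.append(since)
    query += " ORDER BY timestamp"
    return conn.execute(query, params).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE metric_time_events (key TEXT, average REAL, timestamp INTEGER)"
)
conn.executemany(
    "INSERT INTO metric_time_events VALUES (?, ?, ?)",
    [("foo", 4.5, 100), ("foo", 12.0, 200), ("bar", 3.0, 300), ("foo", 9.0, 400)],
)

# All 'foo' events with an average below 10 seconds.
rows = get_time_events(conn, "foo", max_average=10)
```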

2. Using the metrics module to gather test data.

* For all non-time data, we'll make a new *Step* -
*steps.shell.MeasureShellCommand* - that will inherit from
*steps.shell.ShellCommand*. This step will take an extra argument - the key
name (see metrics storage above) for the statistic. Internally, this *Step*
will delegate the command to *ShellCommand* and capture the *stdout* and
*stderr* along with the return code. If we get an exit code of 0, then a
call will be made to the metrics module to log the data into the db. For
non-zero exit codes, we will log the error message and continue/stop
execution while honouring *haltOnFailure*/*flunkOnFailure*.

* For all time data, we can simply use a *Step* - all step execution times
are measured and stored in the db by buildbot already. For non-trivial
cases where a user wants to time an event that cannot be expressed as a
*Step*, we can fall back on the *MeasureShellCommand* (see the previous
point).
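
The control flow I have in mind for *MeasureShellCommand* can be sketched
outside buildbot with plain subprocess. This is not buildbot step code: the
function measure_command, the record callback, and the assumption that the
command prints a single number on stdout are all mine, for illustration only.

```python
import subprocess
import sys


def measure_command(command, key, record):
    """Run command; on exit code 0, record its stdout as the value for key.

    Returns (True, value) on success and (False, error_message) otherwise.
    Assumes the command prints exactly one number on stdout.
    """
    proc = subprocess.Popen(
        command, stdout=subprocess.PIPE, stderr=subprocess.PIPE
    )
    out, err = proc.communicate()
    if proc.returncode != 0:
        # The real step would log this and honour haltOnFailure/flunkOnFailure.
        return False, err.decode().strip()
    value = float(out.decode().strip())
    record(key, value)
    return True, value


recorded = {}  # stands in for the metrics db in this sketch

ok_result = measure_command(
    [sys.executable, "-c", "print(42)"], "coverage", recorded.__setitem__
)
fail_result = measure_command(
    [sys.executable, "-c", "import sys; sys.stderr.write('boom'); sys.exit(3)"],
    "coverage",
    recorded.__setitem__,
)
```

How the value is actually extracted from the command output is still an open
design question.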

Once the data is stored in the db, the retrieval can be easily done with
sqlalchemy. The data API can interface with these 'getter' methods to
expose them to outside services.

3. Providing access to the data is simply a matter of JSON encoding the
results obtained from the getters. Then, the encoded data can be passed
over an HTTP response to be used by the frontend graphing library. Or, it
can be passed to a library to produce a pdf/excel/whatever file for the
user.
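
For example, the encoding step could look roughly like this. The payload
layout ("key" plus a list of "points") and the function name encode_series
are assumptions; the actual format would depend on what the graphing library
expects.

```python
import json


def encode_series(key, rows):
    """JSON-encode (timestamp, value) rows returned by a getter."""
    return json.dumps(
        {
            "key": key,
            "points": [{"timestamp": ts, "value": value} for ts, value in rows],
        },
        sort_keys=True,
    )


payload = encode_series("coverage", [(100, 81.5), (200, 83.0)])
```

The resulting string can be sent as the body of an HTTP response as is.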

So, that's the whole of it. IMO, it covers all possible use cases and the
design is rather unobtrusive to existing code. Also, I believe I have
considered all the questions posed in ticket 2461
<http://trac.buildbot.net/ticket/2461> except for configurability. I can
start considering configurability of this approach once I get the go-ahead
on the stuff that I've mentioned in this post.

As always, all opinions/suggestions/questions are very welcome.

Prasoon