[Buildbot-devel] Summer Of Code projects
Brian Warner
warner-buildbot at lothar.com
Sun May 7 01:21:18 UTC 2006
(I know I should have posted this a week ago, sorry. There's still
another two days to get student proposals in, so better late than never.)
So the Python Software Foundation as a group is mentoring a number of Google
SoC projects, and they've expressed an interest in seeing some Buildbot
improvements as a part of that. The wiki page with project ideas is here:
http://wiki.python.org/moin/SummerOfCode
I seem to have been accepted as a mentor for the SoC, so I'm encouraging all
interested students to submit Buildbot-related project proposals via the
links on that page.
The kinds of things I see as good summer-sized projects include:
* SQLifying the backend build-status database
This would replace the current collection of directories and pickled Status
instances with a proper database, specifically one which could be
interrogated by external tools. I'd start with SQLite but it would be best
if other databases (MySQL and Postgres come to mind) could be swapped in
without too much effort. It has been suggested to me that divmod.axiom
should be used for the backend. Some considerable thought needs to be put
into getting the schema right, to make it useful to tool developers.
The student who works on this should have some Twisted, buildbot, and SQL
skills, as well as familiarity with real-world buildbot deployment. The
first week of the project will probably be to survey buildmaster admins to
find out what kinds of questions they'd like to ask of their new buildbot
database.
I've attached my initial notes on this project below. I really don't know
SQL, so take them with a grain of salt.
* Problem Tracking
For a long time, I've wanted to be able to have the buildbot be aware of
specific test failures, so it could answer questions like "which tests are
failing intermittently?", and "when did test #3 start failing?". To
accomplish this, we need fine-grained parsing of test results for a variety
of test frameworks (starting with trial and expanding outwards) as well as
generic compile results (grepping for the filename/linenumber information
that gcc likes to emit and which emacs knows how to look for). Then we need
a place in the build status data structure to save it. Then we need some
tools to scan through these looking for "Problems", which are sequential
failures of a single test, possibly across multiple builders. Each Problem
is associated with some people and some Changes (the one which started it,
the one which fixed it).
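
A rough sketch of the scanning step, assuming test results have already been parsed into (build number, test name, passed) records. The function name, the record layout, and the `min_run` threshold are all hypothetical; none of this exists in buildbot today.

```python
from collections import defaultdict


def find_problems(results, min_run=2):
    """Return runs of consecutive failures, one entry per run.

    results: iterable of (build_number, test_name, passed) tuples
             (a hypothetical layout, not a buildbot data structure).
    Returns a list of (test_name, first_failing_build, fixed_in_build);
    fixed_in_build is None while the test is still failing.
    """
    by_test = defaultdict(list)
    for build_number, test_name, passed in results:
        by_test[test_name].append((build_number, passed))

    problems = []
    for test_name, history in sorted(by_test.items()):
        history.sort()
        run_start = None  # build number where the current failing run began
        run_length = 0
        for build_number, passed in history:
            if not passed:
                if run_start is None:
                    run_start = build_number
                run_length += 1
            else:
                if run_start is not None and run_length >= min_run:
                    problems.append((test_name, run_start, build_number))
                run_start, run_length = None, 0
        # A run that reaches the newest build is an open Problem.
        if run_start is not None and run_length >= min_run:
            problems.append((test_name, run_start, None))
    return problems
```

The first failing build and the fixing build are the hooks for looking up the Changes (and people) associated with each Problem. Failing runs shorter than `min_run` are ignored here; they are the candidates for the "failing intermittently" question.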
This one might be easier to implement once the SQL stuff is in place, but I
think a lot of it could be done beforehand even though it might not be
particularly efficient. The test-result parsing stuff is independent.
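
For the compile-results half of the parsing work, something like the following could be a starting point. It is a simplified sketch: the regular expression only covers plain `file:line: warning/error: message` lines (with an optional column), and the function name is made up for illustration.

```python
import re

# Matches lines such as "src/foo.c:42:7: error: message" or
# "src/foo.c:42: warning: message" (the column number is optional).
GCC_DIAGNOSTIC = re.compile(
    r"^(?P<file>[^:\s]+):(?P<line>\d+):(?:(?P<col>\d+):)?\s*"
    r"(?P<kind>warning|error):\s*(?P<message>.*)$"
)


def parse_compile_log(text):
    """Return a list of (filename, line_number, kind, message) tuples."""
    found = []
    for raw in text.splitlines():
        match = GCC_DIAGNOSTIC.match(raw)
        if match:
            found.append((
                match.group("file"),
                int(match.group("line")),
                match.group("kind"),
                match.group("message"),
            ))
    return found
```

Lines that do not look like diagnostics (the compiler command line, make's own output) are skipped. Other message kinds, such as notes, would need the pattern to be extended.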
In addition, there are a number of smaller projects which might be added
together to make a reasonable summer's worth of SoC work:
* Displaying Build Metrics
One of the stated goals of the buildbot is to help you improve things like
memory footprint, compile time, code size, test coverage, etc. However, so
far I've managed none of these. I'd like to see a place in each build for
various numerical properties to be stored (N.B. the 'build properties' that
will be present in 0.7.3 are the right place for this), and then some web
status pages that present graphs of these quantities over time.
The front-end of this will be easier once the "Web Parts" framework is in
place, which will make it possible to mix-and-match different web status
pages instead of having just the big chronological Waterfall page.
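
As an illustration of the data side of this, here is a small sketch that turns per-build numeric properties into a series a status page could graph. It assumes each build exposes its properties as a plain dict; the function name and the property names in the usage note are invented.

```python
def metric_series(builds, name):
    """Collect (build_number, value) points for one numeric property.

    builds: iterable of (build_number, properties_dict) pairs, where
            properties_dict is a hypothetical stand-in for build properties.
    Builds that did not record the property are skipped.
    """
    points = [
        (number, props[name])
        for number, props in builds
        if name in props
    ]
    return sorted(points)
```

For example, `metric_series(builds, "code_size")` would give the points for a code-size-over-time graph, leaving gaps for builds that did not measure it.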
* Detailed Test Coverage Display
A number of languages offer tools to analyze which lines of code are
exercised by the test suite and which are not. It would be very nice if the
buildbot could interpret the results of these tools and present the
information in a useful way. In addition to just the overall coverage
percentage, the build status could link to a page (perhaps an external
viewcvs-like page) where you could see each file and each line and whether
it was covered or not.
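
A minimal sketch of the summarizing step, assuming the coverage tool's output has already been reduced to sets of line numbers per file. The input layout and the function name are assumptions for illustration; real tools each have their own report formats.

```python
def summarize_coverage(executable, executed):
    """Compute overall and per-file line coverage.

    executable: {filename: set of line numbers that could run}
    executed:   {filename: set of line numbers that did run}
    Returns (overall_percent, {filename: {"covered": [...], "missed": [...]}}).
    """
    per_file = {}
    total_lines = 0
    total_hit = 0
    for filename, lines in executable.items():
        # Ignore executed lines that the tool did not list as executable.
        hit = lines & executed.get(filename, set())
        per_file[filename] = {
            "covered": sorted(hit),
            "missed": sorted(lines - hit),
        }
        total_lines += len(lines)
        total_hit += len(hit)
    percent = 100.0 * total_hit / total_lines if total_lines else 100.0
    return percent, per_file
```

The overall percentage is what would go into the build status, and the per-file lists are what a per-line display page would render.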
* Better IM status clients
At the moment we have an IRC bot which is almost purely reactive (it stays
quiet until you ask it a question). In addition to that, I'd like to have
active bots (which announce build results into a channel), and bots which do
the same thing over other IM protocols (starting with AIM and probably
including Jabber too). All these aspects should use the same code, of
course. Part of the job would be to map the buildmaster's concept of "user"
into an IM handle. The long-term goal is to tie this into Problem Tracking,
using IM or email as necessary to inform the responsible user about the
status of the Problems they are on the hook to fix.
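
One way the shared code and the user-to-handle mapping could fit together is sketched below. The class name, the protocol keys, and the idea of passing transports in as plain callables are all assumptions made for the example, not existing buildbot interfaces.

```python
class Notifier:
    """Protocol-independent notification front end (hypothetical)."""

    def __init__(self, handles, transports):
        # handles: {buildmaster_user: {protocol_name: handle}}
        # transports: {protocol_name: callable(handle, message)}
        self.handles = handles
        self.transports = transports

    def notify(self, user, message,
               preferred=("irc", "jabber", "aim", "email")):
        """Send message over the first usable protocol for this user.

        Returns the protocol name that was used, or None if the user
        has no handle on any protocol with a configured transport.
        """
        known = self.handles.get(user, {})
        for protocol in preferred:
            if protocol in known and protocol in self.transports:
                self.transports[protocol](known[protocol], message)
                return protocol
        return None
```

Each IM protocol then only has to supply a send function, and the fallback to email falls out of the preference order.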
Anyways, if you're a student and are looking to do some buildbot work this
summer, please consider applying. The web page above has a number of links to
get you started. Drop me a note if you have any questions about the projects
I've described or any other buildbot-related ones you might have an interest
in.
thanks,
-Brian
-------------- next part --------------
The idea is to use SQLite to store build status.
c['storage'] = b.storage.SQLite(prune=30)
# enable SQLite instead of old-style, delete builds after 30 days
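
What the pruning could amount to underneath, as a sketch using the standard `sqlite3` module. The `builds` table name and a `stop` column holding a Unix timestamp are assumptions; `b.storage.SQLite` itself does not exist yet.

```python
import sqlite3
import time


def prune_builds(conn, days, now=None):
    """Delete finished builds that stopped more than `days` days ago.

    Assumes a hypothetical table builds(build_id, stop) where stop is a
    Unix timestamp, or NULL for builds that are still running.
    Returns the number of rows deleted.
    """
    now = time.time() if now is None else now
    cutoff = now - days * 86400
    cursor = conn.execute(
        "DELETE FROM builds WHERE stop IS NOT NULL AND stop < ?",
        (cutoff,),
    )
    conn.commit()
    return cursor.rowcount
```

Rows in the mapping tables that point at a deleted build would need to be removed as well; that is left out here.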
Each BuildStatus object needs to live in RAM. When the BuildStatus is loaded,
a bunch of queries are performed to pull all the small things into memory at
once, so that things like IBuildStatus.getReason can run synchronously.
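
A sketch of that eager-loading idea: one query at load time, after which the small accessors are ordinary synchronous methods. The `builds` table and the class and function names are hypothetical; only the getReason and isFinished method names come from the existing status interfaces.

```python
import sqlite3


class LoadedBuildStatus:
    """In-memory copy of the small per-build fields (hypothetical class)."""

    def __init__(self, number, reason, finished):
        self.number = number
        self.reason = reason
        self.finished = finished

    def getReason(self):
        # No database access here: the value was fetched at load time.
        return self.reason

    def isFinished(self):
        return self.finished


def load_build_status(conn, build_id):
    """Fetch all the small fields for one build in a single query."""
    row = conn.execute(
        "SELECT number, reason, isFinished FROM builds WHERE build_id = ?",
        (build_id,),
    ).fetchone()
    if row is None:
        raise KeyError(build_id)
    number, reason, finished = row
    return LoadedBuildStatus(number, reason, bool(finished))
```

Large items such as logs would stay in the database and be fetched on demand.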
Certain methods are changed to return a Deferred:
IBuilderStatus.getBuild, .getEvent
IBuildStatus.getChanges
TBD
Conversion: when the sqlite backend is created for the first time, a SlowJob
runs through all old builds and adds them to the database.
Schemata:
IBuildStatus:
  build_id = UNIQUE
  number: INT
  builder_id
  isFinished = BOOL
  reason = STRING
  sourcestamp:
    branch_id
    revision = STRING
    patched = BOOL
    patch_level = INT
    patch_diff = STRING
  #changes_id: mapped via BuildTimesChanges
  #responsibleUsers: mapped via ResponsibleUsers
  #interestedUsers: mapped via InterestedUsers
  #steps: mapped via Steps
  start = TIMESTAMP
  stop = TIMESTAMP (or None)
  #ETA: not archived
  slave_id
  text: list of STRINGs?
  color: INT (or enum? or string?)
  results: INT (enum: SUCCESS, WARNINGS, FAILURE)
  #logs: mapped via BuildLogs
  #test results???
BuildTimesChanges:
  (there are lots of these, up to len(builds)*len(changes))
  build_id
  change_id
IBuilderStatus:
  builder_id = UNIQUE
  name: STRING
  ...
ResponsibleUsers:
  (there are lots of these)
  build_id
  user_id
InterestedUsers:
  (there are lots of these)
  build_id
  user_id
Steps:
  build_id
  number = INT (within the Build)
  start = TIMESTAMP (or None)
  stop = TIMESTAMP (or None)
  #ETA: not archived
  #expectations??
  #logs: mapped via StepLogs
  finished = BOOL
  text: list of STRINGs
  color: ??
  results: INT (enum)
Slaves:
  slave_id = UNIQUE
  slavename
BuildLogs:
  build_id
  log_id
  name: STRING ?
StepLogs:
  step_id = UNIQUE
  name: STRING
  log_id
Logs:
  log_id = UNIQUE
  step_id ?
  isFinished = BOOL
  filename
TestResult:
  build_id
  name = STRING (index)
  results = INT (enum)
  logs: ??
BuilderEvents:
  builder_id
  event_number = INT
  start = TIMESTAMP
  end = TIMESTAMP (or None)
  text = list of strings?
  color: ??
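
To make a few of these tables concrete, here is a sketch of how the builder, build, and build-to-change mapping tables might look in SQLite, plus one query an external tool could ask. The SQL table names, column types, and constraints are guesses, and only a subset of the fields above is included.

```python
import sqlite3

SCHEMA = """
CREATE TABLE builders (
    builder_id INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE builds (
    build_id INTEGER PRIMARY KEY,
    builder_id INTEGER NOT NULL REFERENCES builders(builder_id),
    number INTEGER NOT NULL,
    isFinished INTEGER NOT NULL DEFAULT 0,
    reason TEXT,
    start REAL,
    stop REAL,          -- NULL while the build is running
    results INTEGER,    -- enum value, NULL until finished
    UNIQUE (builder_id, number)
);
-- One row per (build, change) pair: the BuildTimesChanges mapping.
CREATE TABLE build_changes (
    build_id INTEGER NOT NULL REFERENCES builds(build_id),
    change_id INTEGER NOT NULL,
    PRIMARY KEY (build_id, change_id)
);
"""


def create_schema(conn):
    conn.executescript(SCHEMA)


def builds_for_change(conn, change_id):
    """Return (builder name, build number, results) for a given change."""
    rows = conn.execute(
        "SELECT builders.name, builds.number, builds.results "
        "FROM build_changes "
        "JOIN builds ON builds.build_id = build_changes.build_id "
        "JOIN builders ON builders.builder_id = builds.builder_id "
        "WHERE build_changes.change_id = ? "
        "ORDER BY builders.name, builds.number",
        (change_id,),
    )
    return rows.fetchall()
```

The other mapping tables (ResponsibleUsers, InterestedUsers, BuildLogs) would follow the same two-column pattern as `build_changes`.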