[Buildbot-devel] Large logs hang BuildBot's web interface

Brian Warner warner-buildbot at lothar.com
Thu Nov 11 19:44:57 UTC 2004


> I can grab the smaller logs just fine, but when I click to get a large 
> (~10MB) one, the Twisted process sits there chugging CPU cycles, 
> bringing the load above 1.0, and refusing to die except with "kill -9".

I've just committed the attached patch to CVS; see if it helps. It won't
affect existing builds, but new builds should hold their log text in a
smaller number of larger chunks. I suspect the sheer number of tiny chunks is
the source of the problem.

The output of the build step is kept in a 'LogFile' instance, as a list of
(channel, text) tuples, so that stdout/stderr/headers can be kept separate. A
function named addEntry() is called each time the slave sends over a new
chunk of text. These chunks can be fairly small, depending upon how the
kernel chooses to buffer and deliver data across a pipe. Typically they seem
to be a few lines each.

The problem was that LogFile was just accumulating these chunks blindly, one
list entry per chunk. Most builds have very long stretches of stdout with no
intervening stderr (and 'headers', added by the buildslave to announce the
command being run or the exit code, usually only occur at the start and the
end of the log). For a very large log file, this means you could have a very
long list, which takes a lot of CPU time to merge into a single big string.
For a 10MB logfile with 100-byte chunks, you could have 100k entries to merge.
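
To make that arithmetic concrete, here is a minimal sketch of the pre-patch
behaviour. NaiveLog, getText() and entries_needed() are hypothetical names
made up for illustration, not the real LogFile API, and the channel number is
an assumption; the sketch only models the entries list and one plausible way
of merging it.

```python
# Hypothetical stand-in for the pre-patch LogFile: every chunk becomes
# its own (channel, text) entry, however small it is.
STDOUT = 0  # channel number chosen for illustration only


class NaiveLog:
    def __init__(self):
        self.entries = []

    def addEntry(self, channel, text):
        # no merging: one list entry per chunk received
        self.entries.append((channel, text))

    def getText(self):
        # merging has to visit every entry, so the work grows with
        # len(self.entries)
        return "".join(text for _channel, text in self.entries)


def entries_needed(total_bytes, chunk_bytes):
    """Count the entries produced by a log fed fixed-size chunks."""
    log = NaiveLog()
    for _ in range(total_bytes // chunk_bytes):
        log.addEntry(STDOUT, "x" * chunk_bytes)
    return len(log.entries)
```

Under these assumptions, entries_needed(10000000, 100) gives 100000, which
is the 100k figure above.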

The new code merges adjacent same-channel chunks as they arrive, to minimize
the overhead of doing it later. It stops growing a chunk once it reaches
about 10kb, because after a while the overhead of appending small strings to
very large ones gets excessive. There are cleverer ways to do it; let me know
if this fix helps, and if not we'll try something more efficient
(specifically, doing one big "".join() of the last 10kb-worth of same-channel
chunks, instead of doing incremental string-joins for each and every line).
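
For reference, the "one big join" alternative might look roughly like the
sketch below. This is not the committed code: JoiningLog, flush() and
getEntries() are hypothetical names, and the 10000-character limit is simply
borrowed from the patch. Small chunks are parked in a pending list and joined
once per batch, either when the batch reaches the limit or when the channel
changes.

```python
CHUNK_LIMIT = 10000  # same threshold as the patch; an assumption here


class JoiningLog:
    """Hypothetical sketch of batching chunks with a single "".join()."""

    def __init__(self):
        self.entries = []            # finished (channel, text) tuples
        self.pending = []            # small same-channel strings, not yet joined
        self.pending_channel = None
        self.pending_size = 0

    def flush(self):
        # one "".join() per batch instead of one concatenation per chunk
        if self.pending:
            self.entries.append((self.pending_channel, "".join(self.pending)))
        self.pending = []
        self.pending_channel = None
        self.pending_size = 0

    def addEntry(self, channel, text):
        # a channel change closes the current batch
        if self.pending and channel != self.pending_channel:
            self.flush()
        self.pending_channel = channel
        self.pending.append(text)
        self.pending_size += len(text)
        # a full batch is closed as soon as it reaches the limit
        if self.pending_size >= CHUNK_LIMIT:
            self.flush()

    def getEntries(self):
        # make sure the last partial batch is included
        self.flush()
        return list(self.entries)
```

The trade-off is that readers of the log must go through something like
getEntries() so the pending batch gets flushed first.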

The other fix that might be helpful would be a change to the TextLog class
(in buildbot.status.html), which is responsible for actually providing the
logfile's contents to an HTTP request. It currently merges all the chunks
into one big monster string and then sends the whole thing at once. For a
10MB logfile, it would be much better to send the chunks one at a time (using
the FileSender utility class). This will need a little bit more work, as I
have to figure out how to wrap the LogFile in a Producer thingy and still
attach the terminating HTML fragment at the end. I'll see if I can get that
part done later today.
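
The shape of that second fix can be sketched without Twisted at all. The
generator below is only an illustration of "send the chunks one at a time,
then the terminating HTML": iter_log_page() and the header/footer strings are
hypothetical, and a plain Python generator stands in for the Producer and
FileSender machinery, which it does not attempt to reproduce.

```python
from html import escape


def iter_log_page(entries,
                  header="<html><body><pre>\n",
                  footer="</pre></body></html>\n"):
    """Yield an HTML log page piece by piece.

    entries is a list of (channel, text) tuples. The header and footer
    markup are placeholders, not what TextLog really emits.
    """
    yield header
    for _channel, text in entries:
        # each chunk is escaped and handed over on its own, so the
        # whole log is never assembled into a single string
        yield escape(text)
    # the terminating fragment goes out after the last chunk
    yield footer
```

A caller would write each yielded piece to the HTTP response as it arrives,
for example with "for piece in iter_log_page(entries): write(piece)", where
write() is whatever the web framework provides.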

thanks,
 -Brian

-------------- next part --------------
Index: buildbot/status/builder.py
===================================================================
RCS file: /cvsroot/buildbot/buildbot/buildbot/status/builder.py,v
retrieving revision 1.40
retrieving revision 1.41
diff -u -r1.40 -r1.41
--- buildbot/status/builder.py	8 Nov 2004 21:19:39 -0000	1.40
+++ buildbot/status/builder.py	11 Nov 2004 19:31:15 -0000	1.41
@@ -125,7 +125,15 @@
 
     def addEntry(self, channel, text):
         assert not self.finished
-        self.entries.append((channel, text))
+        if (self.entries
+            and channel == self.entries[-1][0]
+            and len(self.entries[-1][1]) < 10000):
+            # merge same-category chunks together, up to 10kb each, to cut
+            # down on overhead when assembling these into a single big string
+            # later.
+            self.entries[-1] = (channel, self.entries[-1][1] + text)
+        else:
+            self.entries.append((channel, text))
         for w in self.watchers:
             w.logChunk(self.step.build, self.step, self, channel, text)
         self.length += len(text)

