[Buildbot-devel] Buildbot master CPU usage
Thomas Middeldorp
thomas at ranzer.geek.nz
Thu Jul 14 23:48:43 UTC 2011
On 7/9/2011 2:43 AM, Dustin J. Mitchell wrote:
> On Thu, Jul 7, 2011 at 6:38 PM, Thomas Middeldorp<thomas at ranzer.geek.nz> wrote:
>> Could anyone please offer some advice on how to figure out what is
>> causing the high CPU usage or how to bring it to a more reasonable level?
> The best place to start is to look at the twistd.log and see what the
> master's doing too much of. You can also use tools like strace or
> dtruss to see what it's doing.
>
> Dustin
Tried investigating a bit more, it seems the cpu usage spikes up even
when there is nothing happening in twistd.log.
strace is pretty spammy and I'm not really sure what should stand out/be
suspicious... but it does seem to frequently try to stat a whole lot of
files. It tries to stat /etc/localtime a few hundred times in a row and
then tries pages and pages worth of log files (about 2/3 of which are
not found), python files, and more localtime. e.g.:
stat("/mnt/raid0/buildmaster/linux_asset_tests_incr/955-log-shell-stdio.bz2",
0x7fffde6550b0) = -1 ENOENT (No such file or directory)
stat("/mnt/raid0/buildmaster/linux_asset_tests_incr/955-log-shell-stdio.gz",
0x7fffde6550b0) = -1 ENOENT (No such file or directory)
stat("/mnt/raid0/buildmaster/linux_asset_tests_incr/955-log-shell-stdio", {st_mode=S_IFREG|0600,
st_size=1959, ...}) = 0
stat("/usr/local/lib/python2.6/dist-packages/buildbot/status/web/templates/box_macros.html",
{st_mode=S_IFREG|0644, st_size=1071, ...}) = 0
stat("/usr/local/lib/python2.6/dist-packages/buildbot/status/web/templates/box_macros.html",
{st_mode=S_IFREG|0644, st_size=1071, ...}) = 0
stat("/usr/local/lib/python2.6/dist-packages/buildbot/status/web/templates/box_macros.html",
{st_mode=S_IFREG|0644, st_size=1071, ...}) = 0
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2434, ...}) = 0
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2434, ...}) = 0
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2434, ...}) = 0
running strace with -c provided the following output, also pointing the
finger at stat:
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
76.47 0.000039 0 64391 11380 stat
23.53 0.000012 0 450 450 access
0.00 0.000000 0 226 read
0.00 0.000000 0 11 write
0.00 0.000000 0 2 open
0.00 0.000000 0 2 close
0.00 0.000000 0 678 fstat
0.00 0.000000 0 226 lseek
0.00 0.000000 0 1 brk
0.00 0.000000 0 33 select
0.00 0.000000 0 31 sendto
0.00 0.000000 0 11 recvfrom
0.00 0.000000 0 7 getsockname
0.00 0.000000 0 900 fcntl
------ ----------- ----------- --------- --------- ----------------
100.00 0.000051 66969 11830 total
I also tried searching for 'access' since it seems to error entirely and
is the only other thing that takes any time, and it all seems to be
these two files:
access("/mnt/raid0/buildmaster/state.sqlite-journal", F_OK) = -1 ENOENT
(No such file or directory)
access("/mnt/raid0/buildmaster/state.sqlite-wal", F_OK) = -1 ENOENT (No
such file or directory)
Is any of this useful, or are there any other strace parameters I should
try?
Thanks for the help,
Thomas
More information about the devel
mailing list