[users at bb.net] Running (docker) buildslaves in a cluster with SLURM

David Strubbe dstrubbe at gmail.com
Sat Nov 14 21:27:14 UTC 2015


Hi, I am also running buildbot with SLURM jobs. For an example, see
http://www.tddft.org/programs/octopus/buildbot (specifically the ones
called hbar). We only submit jobs for the test step, though; the compilation
is run on the head node. You may find this script I wrote helpful:

http://web.mit.edu/~dstrubbe/www/queue_monitor.pl

It is BSD-licensed and manages submission of jobs with PBS or SLURM. It is
being used for the Octopus testsuite above, as well as for another project,
BerkeleyGW (BSD-licensed), from which the attached script comes.
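
As a rough illustration of submitting only the test step to SLURM while the
compilation stays on the head node, a Python sketch could look like the
following. It is not taken from queue_monitor.pl; the function names, job
name, time limit and test command are all made up for the example, and it
assumes an sbatch recent enough to support --wait and --parsable.

```python
import subprocess


def build_sbatch_command(test_command, job_name="buildbot-test",
                         time_limit="01:00:00", nodes=1):
    """Build an sbatch argv list for one test step (hypothetical helper).

    --wait makes sbatch block until the job terminates and exit with the
    job's exit code; --parsable makes it print only the job id.
    """
    return [
        "sbatch",
        "--wait",
        "--parsable",
        f"--job-name={job_name}",
        f"--time={time_limit}",
        f"--nodes={nodes}",
        f"--wrap={test_command}",
    ]


def parse_job_id(sbatch_output):
    """Extract the job id from --parsable output ("id" or "id;cluster")."""
    return int(sbatch_output.strip().split(";")[0])


def run_test_job(test_command):
    """Submit the test step, wait for it, and return (job_id, exit_code)."""
    result = subprocess.run(build_sbatch_command(test_command),
                            capture_output=True, text=True)
    return parse_job_id(result.stdout), result.returncode
```

A buildbot shell step on the head node could then call something like
run_test_job("make check") and fail the step when the exit code is nonzero.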

David

On Fri, Nov 13, 2015 at 8:45 AM, Dominic Kempf <
dominic.kempf at iwr.uni-heidelberg.de> wrote:

> Dear Buildbot list,
>
> I am currently working on a buildbot setup that wants to run buildslaves
> integrated into a small cluster that is using a SLURM scheduling
> system. I have trouble mapping my requirements to buildbot concepts in
> a suitable way.
>
> Problems arise from:
> * At first, I thought I can have just one buildslave on the cluster
>   frontend, that passes all build requests to a queue. But it seems that
>   I rather need one such slave on the frontend per job in the queue
>   (sounds like a job for a latent slave). Correct?
> * I have no clue yet on how to handle separate build steps, because either
>   - the job as submitted to SLURM must contain all build steps at
>     once - which makes a separation of logs etc. a pain
>   - every build step must be submitted to SLURM separately, with the jobs
>     depending on each other correctly - which is also a pain, because I
>     cannot guarantee things running on the same node.
>
> To further complicate things, I also want to run my builds in docker
> containers that we use to model heterogeneous userlands. Note that in the
> above context, this is different than for example in a
> DockerLatentBuildSlave: With the latter, the slave runs and builds its
> commands inside a docker container. In my approach, a (potentially also
> dockerized) buildslave submits a job to a queue, which, when executed on
> some node, spins up another docker container there and runs the job inside
> that one.
>
> I am open to any sort of input and discussion!
> Thanks in advance,
>
> Dominic Kempf
> _______________________________________________
> users mailing list
> users at buildbot.net
> https://lists.buildbot.net/mailman/listinfo/users
>