[users at bb.net] Running (docker) buildslaves in a cluster with SLURM

Dominic Kempf dominic.kempf at iwr.uni-heidelberg.de
Fri Nov 13 13:45:26 UTC 2015


Dear Buildbot list,

I am currently working on a buildbot setup in which the buildslaves should run
integrated into a small cluster that uses the SLURM scheduling system. I have
trouble mapping my requirements to buildbot concepts in a suitable way.

Problems arise from:
* At first, I thought I could have just one buildslave on the cluster frontend
   that passes all build requests to a queue. But it seems that I rather need
   one such slave on the frontend per job in the queue (sounds like a job for
   a latent slave). Correct?
* I have no clue yet how to handle separate build steps, because either
   - the job as submitted to SLURM must contain all build steps at
     once - which makes a separation of logs etc. a pain
   - every build step must be submitted to SLURM separately, with the jobs
     depending on each other correctly - which is also a pain, because I cannot
     guarantee that things run on the same node.
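
To make the second variant concrete, here is a minimal sketch of how the
per-step submissions could be chained. It only builds the sbatch command
lines and does not submit anything. The helper names, script names and log
file names are hypothetical; the sbatch options used (--parsable, --output,
--dependency=afterok:<jobid>, --nodelist) are standard SLURM options. How the
node name of the first job is obtained once that job has started is left
open here.

```python
def sbatch_command(script, log_file, after_job=None, node=None):
    """Build the argv for submitting one build step as its own SLURM job.

    Hypothetical helper: one job and one log file per build step.
    """
    # --parsable makes sbatch print only the job id (optionally ";cluster").
    argv = ["sbatch", "--parsable", "--output=" + log_file]
    if after_job is not None:
        # Start this step only after the previous step exited successfully.
        argv.append("--dependency=afterok:" + str(after_job))
    if node is not None:
        # Request the node the earlier steps ran on (assumes it is known).
        argv.append("--nodelist=" + node)
    argv.append(script)
    return argv


def parse_job_id(sbatch_output):
    """Extract the numeric job id from `sbatch --parsable` output."""
    return int(sbatch_output.strip().split(";")[0])
```

The first step would be submitted without after_job and node; each later
step passes the previous job id and, if known, the node name. Requesting a
fixed node addresses the same-node problem only in part, since the job then
has to wait until that particular node is free.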

To further complicate things, I also want to run my builds in docker containers
that we use to model heterogeneous userlands. Note that in the above context, this
is different from, for example, a DockerLatentBuildSlave: With the latter, the
slave itself runs, and executes its commands, inside a docker container. In my
approach, a (potentially also dockerized) buildslave submits a job to a queue,
which, when executed on some node, spins up another docker container there and
runs the job inside that one.
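
As an illustration of the job that would end up in the queue, the following
sketch generates a batch script that starts a container on the compute node
and runs one build command inside it. The function name, the /build mount
point, and the image and path names in the usage are assumptions made for
the example; the docker options used (run, --rm, -v, -w) are standard.

```python
import shlex


def docker_job_script(image, command, workdir):
    """Return the text of a batch script that runs `command` inside `image`.

    Hypothetical helper: `workdir` is a directory on the compute node that
    is mounted into the container at /build.
    """
    docker_argv = [
        "docker", "run", "--rm",      # remove the container when it exits
        "-v", workdir + ":/build",    # mount the checkout into the container
        "-w", "/build",               # run the command from that directory
        image,
        "sh", "-c", command,
    ]
    # Quote every argument so the build command survives the shell intact.
    line = " ".join(shlex.quote(arg) for arg in docker_argv)
    return "#!/bin/sh\n" + line + "\n"
```

For example, docker_job_script("debian:8", "make -j4", "/scratch/build")
yields a two-line script that could be written to a file and handed to
sbatch. This assumes that the user running the job is allowed to talk to the
docker daemon on the compute nodes.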

I am open to any sort of input and discussion!
Thanks in advance,

Dominic Kempf

