[users at bb.net] Running (docker) buildslaves in a cluster with SLURM

Pierre Tardy tardyp at gmail.com
Sat Nov 14 15:11:30 UTC 2015


Hi Dominic,

It is quite hard for us to answer your question, because you describe very
specific problems you are encountering without really explaining the
background of your requirements.

I wasn't aware of slurm, and it looks like it has a fair overlap with
buildbot's functionality.
So I would advise you to decide whether you want to rely mostly on slurm's
functionality or on buildbot's.

Being a buildbot expert, my intuition would be to map one slurm job to one
buildbot build.
Buildbot assumes throughout that every step of a build runs in the same
work environment.

You can do that by writing your own SlurmLatentBuildSlave, which would start
a buildbot slave as a slurm job; the slave then uses that job's compute
resources to execute the whole build.
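
A minimal sketch of the slurm side of such a latent slave, in Python. It
deliberately does not import buildbot, so it only shows how a slave could be
submitted as a job and cancelled again. The class SlurmSlaveJob, its method
names and the `run` parameter are hypothetical names made up for this
illustration; the sketch also assumes that `sbatch`, `scancel` and
`buildslave` are on the PATH of the machine and nodes involved. In a real
SlurmLatentBuildSlave, start() and stop() would presumably be called from the
start_instance and stop_instance methods of a latent slave subclass.

```python
import shlex
import subprocess


def sbatch_command(slave_basedir, job_name="buildslave"):
    """Command line that runs a buildslave in the foreground as a slurm job."""
    # --parsable makes sbatch print only the job id (optionally ";cluster").
    # --wrap turns the given command string into a one-line batch script.
    slave_cmd = "buildslave start --nodaemon %s" % shlex.quote(slave_basedir)
    return ["sbatch", "--parsable", "--job-name", job_name, "--wrap", slave_cmd]


def parse_job_id(sbatch_output):
    """Extract the job id from 'jobid' or 'jobid;cluster'."""
    return sbatch_output.strip().split(";")[0]


def scancel_command(job_id):
    """Command line that cancels the slurm job, and with it the slave."""
    return ["scancel", str(job_id)]


class SlurmSlaveJob(object):
    """Hypothetical helper: one slurm job hosting one buildslave."""

    def __init__(self, slave_basedir, run=subprocess.check_output):
        self.slave_basedir = slave_basedir
        self.run = run  # injectable so the logic can be tried without slurm
        self.job_id = None

    def start(self):
        # Submit the job; the slave connects to the master once it is scheduled.
        out = self.run(sbatch_command(self.slave_basedir))
        if isinstance(out, bytes):
            out = out.decode()
        self.job_id = parse_job_id(out)
        return self.job_id

    def stop(self):
        # Cancel the job if one was submitted, then forget its id.
        if self.job_id is not None:
            self.run(scancel_command(self.job_id))
            self.job_id = None
```

With this layout the whole build, with all its steps, runs inside the single
slurm job, so buildbot still sees separate steps and separate logs.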

HTH,
Don't hesitate to tell us why you want to use slurm. Is it just to use the
compute power that your IT is providing to you?

Pierre

On Fri, Nov 13, 2015 at 14:45, Dominic Kempf <
dominic.kempf at iwr.uni-heidelberg.de> wrote:

> Dear Buildbot list,
>
> I am currently working on a buildbot setup that wants to run buildslaves
> integrated into a small cluster that is using a SLURM scheduling
> system. I have trouble mapping my requirements to buildbot concepts in
> a suitable way.
>
> Problems arise from:
> * At first, I thought I could have just one buildslave on the cluster
>   frontend that passes all build requests to a queue. But it seems that I
>   rather need one such slave on the frontend per job in the queue (sounds
>   like a job for a latent slave). Correct?
> * I have no clue yet on how to handle separate build steps, because either
>    - the job as submitted to SLURM must contain all build steps at
>      once - which makes a separation of logs etc. a pain
>    - every build step must be submitted to SLURM separately, with the jobs
>      depending on each other correctly - which is also a pain, because I
>      cannot guarantee things running on the same node.
>
> To further complicate things, I also want to run my builds in docker
> containers that we use to model heterogeneous userlands. Note that in the
> above context, this is different than for example in a
> DockerLatentBuildSlave: With the latter, the slave runs and builds its
> commands inside a docker container. In my approach, a (potentially also
> dockerized) buildslave submits a job to a queue, which, when executed on
> some node, spins up another docker container there and runs the job inside
> that one.
>
> I am open to any sort of input and discussion!
> Thanks in advance,
>
> Dominic Kempf