[Buildbot-devel] suggestions on how to deal with batch queues and mpi

Mark Richardson (Internal) mark.richardson at nag.co.uk
Wed Mar 7 09:28:33 UTC 2012


Hi Chris,
I am not sure about cool (I am not confident enough with Python to
add to buildbot).
Anyway, I wrote some Bourne shell scripts that effectively write the
batch job script from the supplied parameters, for example:

create_job.sh <project> <mode> <platform>

I had to build in all the intelligence to distinguish the project,
mode and platform (basically lots of case and test structures).
They build up strings that correspond to the lines of the job script
I would otherwise have written manually. These are then echoed into a
file that is subsequently submitted to the job queue.
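
A Python version of this idea might look like the sketch below. It
replaces the case/test structures with lookup tables and returns the
job script as a string. Every project, mode and platform name, the task
counts, the walltime values and the test directory layout are
assumptions made up for illustration; none of them come from the
original (confidential) scripts.

```python
# Hypothetical sketch: all names and values in these tables are invented
# for illustration and would need replacing with site-specific settings.

# Per-platform settings: scheduler directive lines and the MPI launcher.
PLATFORMS = {
    "cray_pbs": {
        "header": ["#!/bin/sh", "#PBS -l walltime=00:30:00"],
        "launcher": "aprun -n {ntasks}",
    },
    "cluster_lsf": {
        "header": ["#!/bin/sh", "#BSUB -W 00:30"],
        "launcher": "mpirun -np {ntasks}",
    },
}

# Per-mode settings: how many MPI tasks each kind of test run uses.
MODES = {"quick": 4, "full": 32}

# The command to run under the launcher (taken from the quoted question).
TEST_COMMAND = 'python -c "import nose; nose.run()"'


def create_job(project, mode, platform):
    """Return the text of a batch job script for the given parameters."""
    if platform not in PLATFORMS:
        raise ValueError("unknown platform: %s" % platform)
    if mode not in MODES:
        raise ValueError("unknown mode: %s" % mode)
    settings = PLATFORMS[platform]
    launcher = settings["launcher"].format(ntasks=MODES[mode])
    lines = list(settings["header"])
    # Assumed layout: each project keeps its tests in <project>/tests.
    lines.append("cd %s/tests" % project)
    lines.append("%s %s" % (launcher, TEST_COMMAND))
    return "\n".join(lines) + "\n"
```

Writing the result of create_job("demo", "quick", "cray_pbs") to a file
gives something that can be handed to the queue, in the same way as the
echoed strings described above.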

I invoke create_job.sh from a buildbot shell step
(workdir is the test directory).

Oh, do not forget to build in a method for picking up the job id and
polling the queue until the job has finished, before the build step
(create_job) script exits.

This last point is important: otherwise your build step will appear to
finish successfully and quickly (!) while your test sequence is still in
a queue and could be in any state.
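
The job id pickup and polling could be sketched in Python roughly as
follows. This assumes a PBS-style system where qsub prints the job id on
stdout and where qstat returns a non-zero exit status once the job is no
longer known to the server. Both points should be checked on the target
machine: some sites keep completed jobs visible for a while, and LSF
uses different commands. The function and parameter names are made up
for this sketch.

```python
import subprocess
import time


def run_command(command):
    """Run a command and return (exit_status, combined_output_text)."""
    proc = subprocess.Popen(command, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT,
                            universal_newlines=True)
    output = proc.communicate()[0]
    return proc.returncode, output


def submit_and_wait(job_script, poll_seconds=30,
                    runner=run_command, sleep=time.sleep):
    """Submit job_script with qsub and block until it leaves the queue."""
    status, output = runner(["qsub", job_script])
    if status != 0:
        raise RuntimeError("qsub failed: %s" % output)
    # Assumption: qsub prints only the job id on stdout.
    job_id = output.strip()
    # Assumption: qstat exits non-zero once the job id is unknown,
    # which is taken here to mean that the job has finished.
    while runner(["qstat", job_id])[0] == 0:
        sleep(poll_seconds)
    return job_id
```

The runner and sleep arguments exist only so the polling logic can be
exercised without a real queue. Note that leaving the queue says nothing
about whether the tests passed, so the build step still has to inspect
the job's output before deciding its own exit status.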

I cannot supply the confidential scripts, but I hope I have given you a
workplan to design one (maybe even a Python version?).

The other script in this toolset will launch several PBS/LSF jobs,
wait for all of them to complete and report the state of each.
Unfortunately it is not skilled in writing job scripts and relies on the
developer to include some in their test directory :(
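
A launcher of that kind, for job scripts that already exist in the test
directory, might be sketched in Python as below. It is PBS-only and
makes the same two assumptions as any simple poller: qsub prints the job
id on stdout, and qstat returns a non-zero exit status once the job is
unknown to the server. The state names it reports are invented for this
sketch and are not those of the original tool.

```python
import subprocess
import time


def run_command(command):
    """Run a command and return (exit_status, combined_output_text)."""
    proc = subprocess.Popen(command, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT,
                            universal_newlines=True)
    output = proc.communicate()[0]
    return proc.returncode, output


def run_all(job_scripts, poll_seconds=30,
            runner=run_command, sleep=time.sleep):
    """Submit every script, wait for all of them, return {script: state}."""
    states = {}
    pending = {}  # job script -> job id, for jobs still known to the queue
    for script in job_scripts:
        status, output = runner(["qsub", script])
        if status == 0:
            pending[script] = output.strip()
            states[script] = "submitted"
        else:
            states[script] = "submit failed"
    while pending:
        for script, job_id in list(pending.items()):
            # Assumption: a non-zero qstat status means the job is gone.
            if runner(["qstat", job_id])[0] != 0:
                states[script] = "left queue"
                del pending[script]
        if pending:
            sleep(poll_seconds)
    return states
```

All jobs are submitted before any polling starts, so they can wait in
the queue side by side rather than one after another.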

Also, I am surprised that you need to use the -I switch
(you could wait a long time on the Cray I use).
I guess a Cray is your platform too, judging from your aprun command.

Good luck,

Mark

On 06/03/2012 23:18, Chris Kees wrote:
> Hi,
>
> I have some buildslaves for which any tests have to be run through a
> batch system, and they must be run via an mpi launcher inside.  For
> example, if I was logged in I would do something like
>
> login-node%qsub -I
> (wait for interactive session to start on compute node of cluster)
> compute-node-0% cd my_tests; aprun python -c "import nose; nose.run()"
>
> I'm wondering if anybody has any cool ways of dealing with this.
> Unfortunately one of the systems will only take i/o from a terminal when
> in interactive mode (the "-I" switch), so I haven't been able to wrap
> something like subprocess.Popen around it.  I was thinking I would write
> a script that would inject the commands I want to run into a standard
> batch script for a given system and submit it to the queue, but I'm not
> sure how to get my script to wait until the job runs and then dump the
> stdout/stderr back to the Shell processes.
>
> Thanks,
> Chris

-- 
Mark Richardson, Ph.D. HECToR CSE, Mobile: 07525 238037
     NAG Manchester, Peter House, Oxford Street,  Manchester, M1 5AN
Head office at:
     Numerical Algorithms Group Ltd, Wilkinson House,
     Jordan Hill Business park, Oxford OX2 8DR
----------------------------------------------------------------