Autogeneration of SLURM Scripts on Comet/TSCC

sbatch scripts (SLURM job code) can be annoying to write, and due to popular demand I have decided to release some utility scripts I use to auto-generate SLURM. EXC and EXCB are convenience Python scripts that let you specify jobs and job arrays in simple call files, which are then turned into SLURM scripts and run on the cluster.

EXC: You specify a number of CPUs and the command you want to run, and an sbatch script containing the job is automatically generated from a template and submitted.


#!/usr/bin/python
import os
import sys

#input checking
if len(sys.argv) != 3:
    print("USAGE: python exc.py [NUM_CPU] [COMMAND]")
    sys.exit()
numCPU = int(sys.argv[1])
if numCPU > 24:
    print("ERROR: Only 24 CPUs on a node.")
    sys.exit()
cmd = sys.argv[2]

#params (fill these in for your setup)
account = 'xxxx'
email = 'xxxx'

#command template
command = (
    '#!/bin/bash\n'
    '##SBATCH --nodes=1 # redundant\n'
    '#SBATCH --account={ACCOUNT}\n'
    '#SBATCH --partition=shared\n'
    '#SBATCH -N 1 # Ensure that all cores are on one machine\n'
    '#SBATCH --ntasks=1\n'
    '#SBATCH --cpus-per-task={NUM_CPU}\n'
    '#SBATCH -t 0-5:00 # Runtime in D-HH:MM\n'
    '##SBATCH --mem=124000 # Memory pool for all cores (see also --mem-per-cpu) SHOULD NOT BE USED\n'
    '#SBATCH -o EXC%a # File to which STDOUT will be written\n'
    '#SBATCH -e EXCErr%a # File to which STDERR will be written\n'
    '#SBATCH --job-name=EXC\n'
    '#SBATCH --mail-type=FAIL # Type of email notification- BEGIN,END,FAIL,ALL\n'
    '#SBATCH --mail-user={EMAIL} # Email to which notifications will be sent\n'
    '#SBATCH --array=1-1%1\n'
    '### Set this to the working directory\n\n'
    '{CMD}\n')

#fill in the template placeholders
command = command.replace('{NUM_CPU}', str(numCPU))
command = command.replace('{CMD}', cmd)
command = command.replace('{ACCOUNT}', account)
command = command.replace('{EMAIL}', email)

#write the generated script to a temporary file
with open('exc.sh', 'w') as fout:
    fout.write(command)

#submit to sbatch, then remove the temp file
os.system('sbatch exc.sh')
os.remove('exc.sh')
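
For example, a hypothetical invocation that runs a command on 8 cores (quote the command so it is passed to the script as a single argument; my_analysis.py is just a placeholder):

python exc.py 8 "python my_analysis.py"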

EXCB: The same thing, but it lets you specify a list of commands in a separate text file. This “jobs” file is then executed as a job array.


#!/usr/bin/python
import os
import sys

#input checking
if len(sys.argv) != 4:
    print("USAGE: python excb.py [NUM_CPU] [JOB_FILE_LIST] [JOB_BATCH_SIZE]")
    sys.exit()
numCPU = int(sys.argv[1])
if numCPU > 24:
    print("ERROR: Only 24 CPUs on a node.")
    sys.exit()
jobFile = sys.argv[2]
batchSize = sys.argv[3]  #integer batch size, or 'A' to run all jobs at once
if batchSize != 'A':
    batchSize = int(batchSize)
    if batchSize > 75:
        print("ERROR: Batch size limit exceeded.")
        sys.exit()

#params (fill these in for your setup)
account = 'xxxx'
email = 'xxxx'

#count the number of lines (jobs) in the job file
with open(jobFile, 'r') as fin:
    numl = sum(1 for line in fin)
if batchSize == 'A':
    batchSize = numl

#command template
command = (
    '#!/bin/bash\n'
    '##SBATCH --nodes=1 # redundant\n'
    '#SBATCH --account={ACCOUNT}\n'
    '#SBATCH --partition=shared\n'
    '#SBATCH -N 1 # Ensure that all cores are on one machine\n'
    '#SBATCH --ntasks=1\n'
    '#SBATCH --cpus-per-task={NUM_CPU}\n'
    '#SBATCH -t 0-7:00 # Runtime in D-HH:MM\n'
    '##SBATCH --mem=124000 # Memory pool for all cores (see also --mem-per-cpu) SHOULD NOT BE USED\n'
    '#SBATCH -o EXCB%a # File to which STDOUT will be written\n'
    '#SBATCH -e EXCBErr%a # File to which STDERR will be written\n'
    '#SBATCH --job-name=EXCB\n'
    '#SBATCH --mail-type=FAIL # Type of email notification- BEGIN,END,FAIL,ALL\n'
    '#SBATCH --mail-user={EMAIL} # Email to which notifications will be sent\n'
    '#SBATCH --array=1-{NUML}%{BATCH_SIZE}\n'
    '### Set this to the working directory\n\n'
    'linevar=`sed $SLURM_ARRAY_TASK_ID\'q;d\' {CMD}`\n'
    'eval $linevar')

#fill in the template placeholders
command = command.replace('{NUM_CPU}', str(numCPU))
command = command.replace('{CMD}', jobFile)
command = command.replace('{NUML}', str(numl))
command = command.replace('{BATCH_SIZE}', str(batchSize))
command = command.replace('{ACCOUNT}', account)
command = command.replace('{EMAIL}', email)

#write the generated script to a temporary file
with open('excb.sh', 'w') as fout:
    fout.write(command)

#submit to sbatch, then remove the temp file
os.system('sbatch excb.sh')
os.remove('excb.sh')
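
For example, a hypothetical invocation that runs the commands listed in jobs.txt (a placeholder file with one command per line) with 4 CPUs each, 10 at a time:

python excb.py 4 jobs.txt 10

Passing 'A' as the batch size runs all of the jobs at once.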


Example SLURM Script for Comet/TSCC Systems

A lot of people have recently been asking me for help with the SLURM system at UCSD, so I decided to write a blog post to help people quickly learn how to use it and get their jobs running. I have a simple script, executed via the sbatch command, that forks jobs in a job array. This lets you run many similar jobs in parallel on the cluster.

So say you have a bunch of scripts you want to run in parallel on a SLURM system. We can put all the script commands into a file (calls.txt):

calls.txt:

script1.sh
script2.sh
script3.sh

Then we can create an sbatch script that defines a job array, steps through these commands, and executes them in parallel. Here is the job array script for sbatch (runTest.sh):

runTest.sh:

#!/bin/bash
#SBATCH --account=xxxx
#SBATCH --partition=shared
#SBATCH -N 1 # Ensure that all cores are on one machine (same as --nodes=1)
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=1
#SBATCH -t 0-00:10 # Runtime in D-HH:MM
#SBATCH -o outputfile%a # File to which STDOUT will be written
#SBATCH -e outputerr%a # File to which STDERR will be written
#SBATCH --job-name=TestJob
#SBATCH --mail-type=ALL # Type of email notification- BEGIN,END,FAIL,ALL
#SBATCH --mail-user=xxxx@ucsd.edu # Email to which notifications will be sent
#SBATCH --array=1-3%3

linevar=`sed $SLURM_ARRAY_TASK_ID'q;d' calls.txt`
eval $linevar

As you can see, the sbatch script has several parameters. Some of the useful ones to specify:

  • account: The university PI account on SLURM (different for each lab)
  • partition: shared vs. compute. “shared” is generally what people use for straightforward jobs; “compute” gives a more dedicated allocation of the cluster for jobs that need more time (but also costs more).
  • nodes: specifies how many nodes to use (each node has 24 cores)
  • ntasks-per-node: how many tasks there are (in this case, I’m treating this all as 1 task).
  • cpus-per-task: How many cores to use per task in the job array.
  • -t: the runtime limit for any one job, specified in D-HH:MM.
  • -o specifies the file for STDOUT and -e the file for STDERR (the %a suffix is replaced by the array task ID). You can keep these as-is for now and change them later if need be.
  • SLURM can send you email notifications; see the comments in the script.
  • array=1-3%3 specifies that the job array contains three jobs (i.e., calls.txt has three lines), and %3 specifies that they should be run 3 at a time (i.e., all at once). If you have 50 jobs that you want to run 10 at a time, the directive is array=1-50%10.

All the job array script does is step through calls.txt and fork the commands to the cluster in batches of the size specified by the “array” directive.
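
The line-extraction trick in the script is worth a quick look. The sed expression N'q;d' prints line N of a file and then quits; in runTest.sh, N is $SLURM_ARRAY_TASK_ID, so each array task pulls out and evals its own line of calls.txt. You can try it at the shell, e.g. for the second line:

sed 2'q;d' calls.txt   # prints script2.sh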

Then, on Comet/TSCC, all one has to do is:
sbatch runTest.sh

You can run the following command to check the status of your job array in the Comet or TSCC queue:
squeue -u yourusername

Hopefully this makes use of SLURM a lot simpler for everyone. Cheers!

Controlling Multiple Robot Arms with EMG Sensors

I was recently able to configure my Myoware EMG sensor to work with some of my robot arms.

Myoware makes these EMG sensors (https://www.pololu.com/product/2732) that are pretty cool. EMG stands for “electromyography”: detecting muscle potentials using conductive electrodes on your skin. Every time you flex your muscles, your motor system sends electrical signals that cause the muscles to contract. The EMG sensor picks up these potentials and can thus tell when you are flexing.
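
If you want to experiment with something similar, here is a minimal, hypothetical sketch of the kind of control loop involved, written in Python. It assumes a microcontroller is streaming the sensor's analog output over a serial port as one integer reading per line (read here with pyserial); the port name, threshold, and move_arm() helper are all placeholders you would replace for your own setup:

#!/usr/bin/python
#minimal sketch: threshold a streamed EMG signal to trigger a robot arm
import serial  #pyserial

PORT = '/dev/ttyUSB0'  #placeholder: your serial port
THRESHOLD = 400        #placeholder: tune to your sensor and electrode placement

def move_arm():
    #placeholder: send whatever command your robot arm expects
    print("flex detected -> moving arm")

ser = serial.Serial(PORT, 9600)
while True:
    raw = ser.readline().strip()
    try:
        reading = int(raw)
    except ValueError:
        continue  #skip malformed lines
    if reading > THRESHOLD:
        move_arm()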

Here is a demo of controlling one robot arm with an EMG sensor:

This modality is scalable to the world of Multi-Robot Cyborgs! Here is a demo of controlling multiple robot arms using one human arm:


Also, feel free to check out some of my other recent robotic builds.

A super cute robotic desktop assistant:

A spiderborg that escaped from the lab:

Guest Talk @ UCSD Biorobotics Lab

I will be giving a guest talk at UCSD’s Biorobotics Lab tomorrow, Monday, March 7, at 9:30 AM in EBU II. My talk is titled “Bayesian Aggregation (and crazy side projects).” Here are the slides.

I discuss my thesis work on Bayesian sensor fusion methods for understanding information sources from multiple noisy observations, and also show some multi-robot cyborg videos!

Come if you’re around campus!

Autonomous Robot Helper Backpack Featured!

The Robot Helper Backpack Project has been getting tons of press recently.

The project was featured on the Adafruit blog: https://blog.adafruit.com/2015/10/30/autonomous-robotic-helper-backpack-raspberry_pi-piday-raspberrpi/

The project was featured on Raspberry Pi Pod:

The project was also one of the 20 winners of Element 14’s Raspberry Pi Halloween Build-A-Thon, winning a Raspberry Pi 2, a Pi Sense HAT, and a box of other hardware goodies:

A big thanks to all the sponsoring/supporting entities and readers for your ongoing support of this blog.