sbatch scripts (SLURM job scripts) can be annoying to write, so by popular demand I am releasing some utility scripts I use to auto-generate them. EXC and EXCB are convenience Python scripts that let you specify a job on the command line, or a job array in a simple call file, and then turn it into an sbatch script and submit it to the cluster.
EXC: You specify the number of CPUs and the command you want to run; an sbatch script containing the job is generated from a template and submitted. Quote the command so that it arrives as a single argument.
#!/usr/bin/python
import os
import sys

# input checking
if len(sys.argv) != 3:
    print("USAGE: python exc.py [NUM_CPU] [COMMAND]")
    sys.exit()
numCPU = int(sys.argv[1])
if numCPU > 24:
    print("ERROR: Only 24 CPUs on a node.")
    sys.exit()
cmd = sys.argv[2]

# params
account = 'xxxx'
username = 'xxxx'
email = 'xxxx'

# command template
command = (
    '#!/bin/bash\n'
    '##SBATCH --nodes=1 # redundant\n'
    '#SBATCH --account={ACCOUNT}\n'
    '#SBATCH --partition=shared\n'
    '#SBATCH -N 1 # Ensure that all cores are on one machine\n'
    '#SBATCH --ntasks=1\n'
    '#SBATCH --cpus-per-task={NUM_CPU}\n'
    '#SBATCH -t 0-5:00 # Runtime in D-HH:MM\n'
    '##SBATCH --mem=124000 # Memory pool for all cores (see also --mem-per-cpu) SHOULD NOT BE USED\n'
    '#SBATCH -o EXC%a # File to which STDOUT will be written\n'
    '#SBATCH -e EXCErr%a # File to which STDERR will be written\n'
    '#SBATCH --job-name=EXC\n'
    '#SBATCH --mail-type=FAIL # Type of email notification- BEGIN,END,FAIL,ALL\n'
    '#SBATCH --mail-user={EMAIL} # Email to which notifications will be sent\n'
    '#SBATCH --array=1-1%1\n'
    '### Set this to the working directory\n\n'
    '{CMD}\n')

# replace placeholders in the template
command = command.replace('{NUM_CPU}', str(numCPU))
command = command.replace('{CMD}', cmd)
command = command.replace('{ACCOUNT}', account)
command = command.replace('{EMAIL}', email)
command = command.replace('{USERNAME}', username)  # not used by this template

# write to file
fout = open('exc.sh', 'w')
fout.write(command)
fout.close()

# submit to sbatch
os.system('sbatch exc.sh')

# remove temp file
os.system('rm exc.sh')
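The template-filling step can also be pulled out into a function, which makes it easy to look at the generated script before anything is submitted. The following is only a minimal sketch: the name `render_sbatch`, the shortened template, and the example command `python train.py --epochs 10` are all made up for illustration and are not part of exc.py.

```python
# Hypothetical helper: a trimmed-down version of the template step in exc.py.
# The template below is shortened for illustration; the real one has more options.
TEMPLATE = (
    "#!/bin/bash\n"
    "#SBATCH --account={ACCOUNT}\n"
    "#SBATCH --ntasks=1\n"
    "#SBATCH --cpus-per-task={NUM_CPU}\n"
    "#SBATCH --mail-user={EMAIL}\n"
    "\n"
    "{CMD}\n"
)


def render_sbatch(num_cpu, cmd, account="xxxx", email="xxxx"):
    """Return the sbatch script text with every placeholder filled in."""
    values = {
        "{NUM_CPU}": str(num_cpu),
        "{ACCOUNT}": account,
        "{EMAIL}": email,
        "{CMD}": cmd,
    }
    script = TEMPLATE
    # {CMD} is substituted last so that braces inside the user's command
    # are never mistaken for one of the other placeholders.
    for placeholder in ("{NUM_CPU}", "{ACCOUNT}", "{EMAIL}", "{CMD}"):
        script = script.replace(placeholder, values[placeholder])
    return script


# Print the script instead of submitting it (the command is a made-up example).
print(render_sbatch(4, "python train.py --epochs 10"))
```

Printing the result first is a cheap dry run: if the `#SBATCH` lines look right, the same text can be written to a file and handed to sbatch as in the script above.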
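EXCB: Same thing, but you specify a list of commands in a separate text file, one command per line. This “jobs” file is then executed as a job array. The third argument is the batch size, i.e. how many array tasks may run at once (at most 75); pass A to run all of them at once.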
#!/usr/bin/python
import os
import sys

# input checking
if len(sys.argv) != 4:
    print("USAGE: python excb.py [NUM_CPU] [JOB_FILE_LIST] [JOB_BATCH_SIZE]")
    sys.exit()
numCPU = int(sys.argv[1])
if numCPU > 24:
    print("ERROR: Only 24 CPUs on a node.")
    sys.exit()
jobFile = sys.argv[2]
batchSize = sys.argv[3]
if batchSize != 'A':
    batchSize = int(batchSize)
if batchSize != 'A' and batchSize > 75:
    print("ERROR: Batch size limit exceeded.")
    sys.exit()

# params
account = 'xxxx'
username = 'xxxx'
email = 'xxxx'

# figure out number of lines in job file
fin = open(jobFile, 'r')
numl = 0
for line in fin:
    numl = numl + 1
fin.close()
if batchSize == 'A':
    batchSize = numl

# command template
command = (
    '#!/bin/bash\n'
    '##SBATCH --nodes=1 # redundant\n'
    '#SBATCH --account={ACCOUNT}\n'
    '#SBATCH --partition=shared\n'
    '#SBATCH -N 1 # Ensure that all cores are on one machine\n'
    '#SBATCH --ntasks=1\n'
    '#SBATCH --cpus-per-task={NUM_CPU}\n'
    '#SBATCH -t 0-7:00 # Runtime in D-HH:MM\n'
    '##SBATCH --mem=124000 # Memory pool for all cores (see also --mem-per-cpu) SHOULD NOT BE USED\n'
    '#SBATCH -o EXCB%a # File to which STDOUT will be written\n'
    '#SBATCH -e EXCBErr%a # File to which STDERR will be written\n'
    '#SBATCH --job-name=EXCB\n'
    '#SBATCH --mail-type=FAIL # Type of email notification- BEGIN,END,FAIL,ALL\n'
    '#SBATCH --mail-user={EMAIL} # Email to which notifications will be sent\n'
    '#SBATCH --array=1-{NUML}%{BATCH_SIZE}\n'
    '### Set this to the working directory\n\n'
    'linevar=`sed $SLURM_ARRAY_TASK_ID\'q;d\' {CMD}`\n'
    'eval $linevar')

# replace placeholders in the template
command = command.replace('{NUM_CPU}', str(numCPU))
command = command.replace('{CMD}', jobFile)
command = command.replace('{NUML}', str(numl))
command = command.replace('{BATCH_SIZE}', str(batchSize))
command = command.replace('{ACCOUNT}', account)
command = command.replace('{EMAIL}', email)
command = command.replace('{USERNAME}', username)  # not used by this template

# write to file
fout = open('excb.sh', 'w')
fout.write(command)
fout.close()

# submit to sbatch
os.system('sbatch excb.sh')

# remove temp file
os.system('rm excb.sh')
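In the generated script, each array task runs `sed` with its own `SLURM_ARRAY_TASK_ID` followed by `q;d`, which prints only that line of the jobs file, and then evaluates it. Here is a rough Python sketch of that mapping, together with the `--array` value the script builds. The function names `command_for_task` and `array_spec` and the three `sim.py` commands are invented for illustration.

```python
def command_for_task(job_file_text, task_id):
    """Return the command on line `task_id` (1-based) of the jobs file.

    Mirrors what `sed 'Nq;d' jobs.txt` prints for N = task_id.
    """
    lines = job_file_text.splitlines()
    return lines[task_id - 1]


def array_spec(num_jobs, batch_size):
    """Build the value for `#SBATCH --array=`: tasks 1..num_jobs,
    with at most `batch_size` of them running at the same time."""
    return "1-{0}%{1}".format(num_jobs, batch_size)


# A made-up jobs file with one command per line.
jobs = (
    "python sim.py --seed 1\n"
    "python sim.py --seed 2\n"
    "python sim.py --seed 3\n"
)

for task_id in (1, 2, 3):
    print(task_id, command_for_task(jobs, task_id))

print(array_spec(3, 2))  # -> 1-3%2
```

With a jobs file like this saved as jobs.txt (a hypothetical name), running `python excb.py 4 jobs.txt 2` would request three array tasks with 4 CPUs each and let two of them run at a time.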