SLURM is a queue management system that replaces the commercial LSF scheduler as the job manager on UAHPC.
SLURM is similar to LSF; below is a quick reference from HPC Wales comparing commands between the two. For those coming from an environment with a different scheduler, or wanting more detail, see this PDF comparing commands across PBS/Torque, SLURM, LSF, SGE, and LoadLeveler: Scheduler Commands Cheatsheet
More documentation can be found at the SLURM website.
LSF to Slurm Quick Reference
Commands
| LSF | Slurm | Description |
|---|---|---|
| `bsub < script_file` | `sbatch script_file` | Submit a job from script_file |
| `bkill 123` | `scancel 123` | Cancel job 123 |
| `bjobs` | `squeue` or `slurmtop` | List user’s pending and running jobs |
| `bqueues` | `sinfo` | Cluster status with partition (queue) list |
| `bqueues` | `sinfo -s` | With `-s`, a summarised partition list, which is shorter and simpler to interpret |
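For example, a typical submit-and-monitor cycle translates as follows (a minimal sketch; `my_job.sh` and job ID `123` are placeholders):

```bash
# LSF: bsub < my_job.sh
sbatch my_job.sh        # prints "Submitted batch job <id>"

# LSF: bjobs
squeue -u $USER         # list your own pending and running jobs

# LSF: bkill 123
scancel 123             # cancel job 123

# LSF: bqueues
sinfo -s                # summarised partition list
```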
Job Specification
| LSF | Slurm | Description |
|---|---|---|
| `#BSUB` | `#SBATCH` | Scheduler directive |
| `-q queue_name` | `-p main --qos queue_name` or `-p owners --qos queue_name` | Submit to queue ‘queue_name’ |
| `-n 64` | `-n 64` | Processor count of 64 |
| `-W [hh:mm]` | `-t [minutes]` or `-t [days-hh:mm:ss]` | Max wall run time |
| `-o file_name` | `-o file_name` | STDOUT output file |
| `-e file_name` | `-e file_name` | STDERR output file |
| `-oo file_name` | `-o file_name --open-mode=append` | Append to output file |
| `-J job_name` | `--job-name=job_name` | Job name |
| `-M 128` | `--mem-per-cpu=128M` (or e.g. `--mem-per-cpu=2G`) | Memory requirement per CPU |
| `-R "span[ptile=16]"` | `--ntasks-per-node=16` | Processes per node |
| `-P proj_code` | `--account=proj_code` | Project account to charge the job to |
| `-J "job_name[array_spec]"` | `--array=array_spec` | Job array declaration |
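Put together, a batch script using these directives might look like the sketch below. The QOS (`queue_name`) and account (`proj_code`) are placeholders carried over from the table; substitute the values appropriate for your allocation on UAHPC.

```bash
#!/bin/bash
#SBATCH --job-name=my_job           # LSF: -J my_job
#SBATCH -p main --qos=queue_name    # LSF: -q queue_name
#SBATCH -n 64                       # LSF: -n 64
#SBATCH --ntasks-per-node=16        # LSF: -R "span[ptile=16]"
#SBATCH -t 1-12:00:00               # LSF: -W; here 1 day, 12 hours
#SBATCH -o my_job.%j.out            # LSF: -o; %j expands to the job ID
#SBATCH -e my_job.%j.err            # LSF: -e
#SBATCH --mem-per-cpu=2G            # LSF: -M
#SBATCH --account=proj_code         # LSF: -P proj_code

# mpirun picks the task count up from Slurm; no -np flag is needed
mpirun ./my_program
```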
Job Environment Variables
| LSF | Slurm | Description |
|---|---|---|
| `$LSB_JOBID` | `$SLURM_JOBID` | Job ID |
| `$LSB_SUBCWD` | `$SLURM_SUBMIT_DIR` | Submit directory |
| `$LSB_JOBID` | `$SLURM_ARRAY_JOB_ID` | Job array parent ID |
| `$LSB_JOBINDEX` | `$SLURM_ARRAY_TASK_ID` | Job array index |
| `$LSB_SUB_HOST` | `$SLURM_SUBMIT_HOST` | Submission host |
| `$LSB_HOSTS`, `$LSB_MCPU_HOSTS` | `$SLURM_JOB_NODELIST` | Allocated compute nodes |
| `$LSB_DJOB_NUMPROC` | `$SLURM_NTASKS` | Number of processors allocated (mpirun picks this up from Slurm automatically; it does not need to be specified) |
| (no LSF equivalent) | `$SLURM_JOB_PARTITION` | Queue (partition) |
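To see these variables in action, lines like the following can be dropped into any batch script (a sketch using only the Slurm variables from the table above):

```bash
# Report where and what this job is running; useful when debugging
echo "Job ${SLURM_JOBID} in partition ${SLURM_JOB_PARTITION}"
echo "Submitted from ${SLURM_SUBMIT_HOST}:${SLURM_SUBMIT_DIR}"
echo "Running ${SLURM_NTASKS} tasks on node(s): ${SLURM_JOB_NODELIST}"

# Inside a job array, each task also sees its parent ID and its own index
if [ -n "${SLURM_ARRAY_TASK_ID}" ]; then
    echo "Array job ${SLURM_ARRAY_JOB_ID}, task ${SLURM_ARRAY_TASK_ID}"
fi
```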