Slurm Usage
Slurm is a workload manager widely used on HPC and GPU clusters; it schedules jobs so that cluster resources are used as fully as possible. I used to use Meituan's internal workload manager, which was built on Slurm. Now I use PSC, which provides full Slurm functionality, so I am writing this post to summarize its usage.
Slurm's main commands fall into four categories:
- Job Submission and Control
- Job Management
- Status Query
- Resource Management
Job Submission and Control Commands
The first category of commands deals with submitting and controlling jobs on the cluster.
sbatch
This is your primary tool for submitting batch jobs to the cluster. The command reads a script file containing resource requests and job steps.
# Basic job submission
sbatch job_script.sh
# Submit with specific requirements
sbatch --time=2:00:00 --mem=4G job_script.sh
# Job Identification
-J, --job-name=jobname # Name of the job
--comment=string # Add comment to job
--wckey=wckey # Specify wckey for job
# Resource Allocation
-N, --nodes=N # Number of nodes
-n, --ntasks=ntasks # Number of tasks
-c, --cpus-per-task=ncpus # CPUs per task
--mem=MB # Total memory per node
--mem-per-cpu=MB # Memory per CPU
--gpus=n # Number of GPUs
--gres=resource_spec # Generic resource requirements
# Time and Priority
-t, --time=minutes # Time limit
--deadline=timestamp # Job deadline
--priority=value # Job priority (admin only)
-H, --hold # Submit job in held state
# Input/Output
-o, --output=filename # Standard output file
-e, --error=filename # Standard error file
-i, --input=filename # Standard input file
# Partition and Constraints
-p, --partition=partition # Partition request
-C, --constraint=list # Node feature constraints
--reservation=name # Resource reservation name
# Dependencies and Arrays
-d, --dependency=dependency_list # Job dependencies
--array=array_spec # Job array indices
# Environment
--export=env_vars # Export environment variables
--chdir=directory # Working directory
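Combining these options, a typical batch script embeds them as `#SBATCH` directives at the top of the file. A minimal sketch — the partition name, resource amounts, and `train.py` below are placeholders to adapt for your cluster:

```shell
#!/bin/bash
#SBATCH --job-name=train            # job name shown in squeue
#SBATCH --partition=gpu             # placeholder partition name
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --mem=16G                   # total memory per node
#SBATCH --gpus=1
#SBATCH --time=2:00:00              # HH:MM:SS
#SBATCH --output=%x_%j.out          # %x = job name, %j = job ID
#SBATCH --error=%x_%j.err

# job steps go here, typically launched through srun
srun python train.py
```

Submit it with `sbatch job_script.sh`; Slurm parses the `#SBATCH` lines before executing the script body, and command-line options override the directives in the file.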
srun
Use srun for running parallel jobs or interactive tasks. It’s particularly useful for immediate execution.
# Run a simple command across nodes
srun hostname
# Launch a parallel program
srun -n 4 ./parallel_program
# Job Configuration
-N, --nodes=N # Number of nodes
-n, --ntasks=ntasks # Number of tasks
-c, --cpus-per-task=ncpus # CPUs per task
-p, --partition=partition # Partition request
# Resource Requirements
--mem=MB # Memory per node
--mem-per-cpu=MB # Memory per CPU
--gpus=n # Number of GPUs
--gres=resource_spec # Generic resource requirements
# Time Limits
-t, --time=minutes # Time limit
--immediate # Exit if resources not available
# Input/Output
-o, --output=filename # Standard output file
-e, --error=filename # Standard error file
-i, --input=filename # Standard input file
# Task Distribution
--ntasks-per-node=n # Tasks per node
--ntasks-per-socket=n # Tasks per socket
-m, --distribution=type # Task distribution method (block, cyclic, arbitrary, ...)
# MPI Options
--mpi=type # MPI implementation type
--cpu-bind=type # Bind tasks to CPUs
salloc
When you need interactive access to compute resources, salloc is your go-to command.
# Request an interactive session
salloc --nodes=1 --time=1:00:00
# Get a GPU-enabled session
salloc --gpus=1 --time=2:00:00
|
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
|
# Resource Request
-N, --nodes=N # Number of nodes
-n, --ntasks=ntasks # Number of tasks
-c, --cpus-per-task=ncpus # CPUs per task
--mem=MB # Memory per node
--mem-per-cpu=MB # Memory per CPU
# GPU and Special Resources
--gpus=n # Number of GPUs
--gres=resource_spec # Generic resource requirements
-C, --constraint=list # Node feature constraints
# Time and Priority
-t, --time=minutes # Time limit
--immediate # Exit if resources not available
-H, --hold # Submit allocation in held state
# Partition and Reservation
-p, --partition=partition # Partition request
--reservation=name # Resource reservation name
# Job Identification
-J, --job-name=jobname # Name of job
--comment=string # Add comment to allocation
# Environment
--export=env_vars # Export environment variables
--chdir=directory # Working directory
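A common interactive workflow is to grab an allocation with salloc, run commands inside it with srun, and release it by exiting the shell. A sketch (the partition name is a placeholder):

```shell
# request one GPU for an hour on a placeholder partition
salloc -p gpu --gpus=1 --time=1:00:00

# inside the allocation, commands launched with srun run on the compute node
srun nvidia-smi
srun python debug_script.py

# exit the spawned shell to release the allocation
exit
```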
Examples
- Memory Specification
# Different ways to specify memory
--mem=4G # 4 GB per node
--mem-per-cpu=1G # 1 GB per CPU
--mem=4096MB # Can use MB notation
- Time Specification
# Different time formats
--time=2:00:00 # Hours:Minutes:Seconds
--time=120 # Minutes
--time=2-00:00:00 # Days-Hours:Minutes:Seconds
- GPU Requests
# Different ways to request GPUs
--gpus=1 # Request 1 GPU
--gres=gpu:1 # Alternative way to request 1 GPU
--gres=gpu:tesla:2 # Request 2 Tesla GPUs
- Job Dependencies
# Common dependency types
--dependency=after:123 # Start after job 123 has started
--dependency=afterok:123 # Start after job 123 completes successfully
--dependency=afterany:123 # Start after job 123 ends (any state)
--dependency=afternotok:123 # Start after job 123 fails
--dependency=singleton # Only one job with this name and user runs at a time
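Dependencies become most useful when job IDs are captured at submission time. With `--parsable`, sbatch prints only the job ID, so a pipeline can be chained in a few lines (the script names are placeholders):

```shell
# submit a preprocessing job and capture its ID
prep_id=$(sbatch --parsable preprocess.sh)

# training starts only if preprocessing succeeds
train_id=$(sbatch --parsable --dependency=afterok:$prep_id train.sh)

# evaluation runs after training ends, in any state
sbatch --dependency=afterany:$train_id evaluate.sh
```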
Tips:
- Use absolute paths in your scripts and options; relative paths are resolved against the submission directory and may not point where you expect.
- Set the time limit carefully so your jobs are not killed when they hit it.
Job Management Commands
These commands help you manage existing jobs in the system.
scancel
The command for terminating jobs:
# Cancel a specific job
scancel 12345
# Cancel all jobs for a user
scancel -u username
# Cancel jobs in a partition
scancel -p partition_name
# Job Identification
-i, --interactive # Require response before canceling
--ctld # Send cancel request to slurmctld instead of directly
-n, --name=job_name # Cancel jobs with specified name
--qos=qos_list # Cancel jobs with specified QOS
--reservation=reservation_name # Cancel jobs with specified reservation
# User and Account Control
-u, --user=user_name # Cancel jobs of specified user
-A, --account=account # Cancel jobs of specified account
--wckey=wckey # Cancel jobs with specified wckey
# Job State and Type
-t, --state=states # Cancel jobs in specified state
--hurry # Expedite cancellation (skip staging-out)
-s, --signal=signal # Send specified signal instead of terminating
# Partition and Node
-p, --partition=partition_names # Cancel jobs in specified partitions
-w, --nodelist=host_list # Cancel jobs on specified nodes
# Batch Script
-b, --batch # Signal only the batch step
-f, --full # Signal the entire job allocation
scontrol
A powerful tool for viewing and modifying job configurations:
# View job details
scontrol show job 12345
# Modify job parameters
scontrol update JobId=12345 TimeLimit=2:00:00
|
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
|
# Show Commands
show job # Show job information
show node # Show node information
show partition # Show partition information
show reservation # Show reservation information
show config # Show system configuration
# Update Commands
update job JobId=id # Update job attributes
update node NodeName=name # Update node attributes
update partition PartitionName=name # Update partition attributes
# Job Control
hold JobId # Place hold on job
release JobId # Release hold on job
requeue JobId # Requeue a job
suspend JobId # Suspend a job
resume JobId # Resume a suspended job
# Other Controls
ping # Ping slurmctld daemon
reconfigure # Reconfigure slurmctld
takeover # Takeover from backup controller
shutdown # Shutdown slurm daemons
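The hold/release controls are handy for pausing a pending job while you adjust its parameters (the job ID here is illustrative):

```shell
# keep a pending job from being scheduled
scontrol hold 12345

# edit its time limit while it is held, then let it schedule again
scontrol update JobId=12345 TimeLimit=4:00:00
scontrol release 12345
```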
Status Query Commands
These commands provide information about the current state of jobs and the system.
squeue
The primary command for viewing the job queue:
# View all jobs
squeue
# View user-specific jobs
squeue -u username
# Custom format output
squeue --format="%.18i %.9P %.8j %.8u %.2t %.10M %.6D %R"
# Output Format
-o, --format=format # Specify custom output format
--sort=fields # Sort by specified fields
-l, --long # Long output format
-i, --iterate=seconds # Repeatedly display at intervals
# Job Selection
-j, --jobs=job_id_list # Show specific jobs
-u, --user=user_list # Show user's jobs
-n, --name=name_list # Show jobs with name
-w, --nodelist=node_list # Show jobs on specific nodes
# Partition and State
-p, --partition=partition_names # Show jobs in partition
-t, --states=state_list # Show jobs in state
--qos=qos_list # Show jobs with QOS
# Time and Priority
--start # Show expected start time
--priority # Display job priority
Example:
# Common format specifiers
%.18i # Job ID (18 characters)
%.9P # Partition (9 characters)
%.8j # Job name (8 characters)
%.8u # User name (8 characters)
%.2t # Job state (2 characters)
%.10M # Time limit (10 characters)
%.6D # Number of nodes (6 characters)
%R # Reason for waiting
# Example format string
squeue --format="%.18i %.9P %.8j %.8u %.2t %.10M %.6D %R"
sinfo
Use this to check partition and node information:
# View partition information
sinfo
# Detailed node status
sinfo -N
# Specific partition details
sinfo -p partition_name
|
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
|
# Output Format
-o, --format=format # Specify custom format
-l, --long # Display in long format
-N, --Node # Display node-oriented output
--summarize # Report summary information
# Node Selection
-n, --nodes=nodes # Report on specific nodes
-p, --partition=partition # Report on specific partition
-t, --states=states # Report on nodes in specified state
# Display Options
-R, --responding # Show only responding nodes
-d, --dead # Show only non-responding nodes
--hide # Do not display hidden partitions
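Custom formats make it easy to see GPU resources at a glance. In sinfo's format string, `%P` is the partition, `%G` the generic resources (gres), `%D` the node count, and `%t` the node state:

```shell
# one line per partition/state, with gres such as gpu:tesla:4
sinfo -o "%P %G %D %t"
```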
sacct
For accessing job history:
# View today's jobs
sacct
# View jobs since a specific date
sacct --starttime=2024-01-01
|
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
|
# Time Range
-S, --starttime=time # Report start time
-E, --endtime=time # Report end time
-A, --accounts=accounts # Show jobs from specified accounts
# Output Control
-o, --format=format # Specify output format
--units=unit # Display units (K,M,G,T,P)
-X, --allocations # Only show allocation records
-p, --parsable # Output in parsable format
# Job Selection
-j, --jobs=job_id_list # Show specific jobs
-u, --user=user_list # Show specific users
-s, --state=states # Show jobs in specified states
Example:
# Common format specifiers
JobID # Job ID
JobName # Job name
State # Job state
ExitCode # Exit code
Submit # Submit time
Start # Start time
End # End time
Elapsed # Elapsed time
MaxRSS # Maximum memory used
# Example format string
sacct --format="JobID,JobName,State,ExitCode,Submit,Start,End,Elapsed,MaxRSS"
Resource Management Commands
These commands help monitor resource allocation and priorities.
sshare
View fair-share scheduling information:
# Basic share information
sshare
# Detailed information
sshare -l
# Basic Options
-A, --accounts=names # Show shares for specified accounts
-a, --all # Show all users, even those with no usage
-l, --long # Long listing format with more details
-p, --parsable # Display in parsable format
-u, --users=user_names # Show shares for specified users
# Output Format
-o, --format=format_string # Format specification
--noheader # No header on output
--json # JSON output format
# Time Range
-s, --start=time # Start time for statistics
-e, --end=time # End time for statistics
Example:
# Examples
sshare -A myaccount # Show shares for specific account
sshare -u username -l # Detailed share info for user
sshare -o "Account,User,RawShares" # Custom format output
sprio
Check job priorities:
# View job priorities
sprio
# Detailed priority information
sprio -l
# Job Selection
-j, --jobs=job_id # Show priority for specific jobs
-u, --user=user_name # Show priorities for specific user
-o, --format=format # Specify output format
# Display Options
-l, --long # Long display format
-n, --noheader # No header in output
--json # JSON output format
-w, --weights # Show priority weights
Example:
sprio -u username # Show priorities for user's jobs
sprio -j 12345 # Show priority for specific job
sprio -l # Show detailed priority information