Advanced Research Computing

Example job scripts - GPU jobs

The sections below give example job submission scripts for different types of GPU job. Not all software can make use of GPUs, so consult an application's documentation to check. The Software pages have example GPU job scripts for certain applications, such as GROMACS, Amber and MATLAB.

Small GPU job

To improve capacity, some GPUs have been divided into multiple, smaller GPUs. Jobs can request a one-eighth fraction of a GPU for testing and for work that does not require an entire GPU. This unit is the default GPU type for the cuda queue.

#!/bin/bash  
 
# Example job requesting 1/8 GPU.   
 
# Slurm resource requests:
 
#SBATCH -p cuda         # submit the job to the cuda queue
#SBATCH -t 01:00:00     # time limit for job
 
# Uncomment one of the following two #SBATCH lines (remove one leading #)
# to allocate GPU resources. Scripts with neither line uncommented will be
# allocated one unit of the default type, i.e. 1 * 1/8 GPU.
##SBATCH --gres=gpu:1   # simpler notation 
##SBATCH --gres=gpu:h200_nvl_1g.18gb:1  # full notation, recommended 
 
# CPU cores, memory and temporary disk space will be allocated automatically. 
# Do not request them in this script. 
 
# Commands to be run: 
 
module load my_module 
./my_gpu_program

If saved in a file called my_gpu_job.sh, this could be submitted to the queue with the command:

sbatch my_gpu_job.sh
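
To confirm which GPU resource a job actually received, a few lines such as the following could be added to the job script before the main program. This is a minimal sketch rather than an official recipe: it assumes that nvidia-smi is on the PATH of the compute node and that Slurm sets CUDA_VISIBLE_DEVICES for the job, neither of which is confirmed on this page. The function name report_gpus is hypothetical.

```shell
# Hypothetical helper: print the GPU devices visible to this job.
report_gpus() {
    # Assumption: Slurm sets this variable when GPUs are allocated via --gres.
    echo "CUDA_VISIBLE_DEVICES=${CUDA_VISIBLE_DEVICES:-unset}"
    if command -v nvidia-smi >/dev/null 2>&1; then
        # List each visible GPU (or GPU fraction) with its identifier.
        nvidia-smi -L
    else
        echo "nvidia-smi not found on this node"
    fi
}

report_gpus
```

The output ends up in the job's Slurm output file, which makes it easy to check afterwards whether the job ran on a GPU fraction or a whole GPU.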

Large GPU job

This script is suitable for jobs that can make effective use of one or more whole GPUs.

#!/bin/bash  
 
# Example job requesting a whole GPU
 
# Slurm resource requests:
 
#SBATCH -p cuda                 # submit the job to the cuda queue
#SBATCH -t 01:00:00             # time limit for job
#SBATCH --gres=gpu:h200_nvl:1   # request a number of whole GPUs
  
# CPU cores, memory and temporary disk space will be allocated automatically.
# Do not request them in this script.
 
# Commands to be run:
 
module load my_module 
./my_cuda_program

If saved in a file called my_gpu_job.sh, this could be submitted to the queue with the command:

sbatch my_gpu_job.sh

Multiple processes on one GPU

Jobs have exclusive access to the GPU resources they are allocated. However, some programs or problems can make only partial use of the capability of a GPU and in this case, you may wish to run multiple copies of a program within the job, where they each access the GPU.

By default, NVIDIA GPUs only allow a single process to use the GPU at any one time, so running (say) 2 copies of a program will mean that each program will only be able to use the GPU for half the time on average. As both copies cannot run at the same time, they cannot make use of the capacity unused by the other process, and performance will remain disappointing.

The NVIDIA CUDA Multi-Process Service (MPS) can be used to work around this and allow multiple programs to use the GPU at the same time. It can be enabled by adding the following lines to your job script, ahead of launching the processes that use the GPU:

# start MPS (multi-process sharing of a GPU) 
nvidia-cuda-mps-control -d

An example script would be:

#!/bin/bash  
 
# Example job requesting a whole GPU and starting MPS so that
# multiple processes can share it
 
# Slurm resource requests: 
 
#SBATCH -p cuda                 # submit the job to the cuda queue
#SBATCH -t 01:00:00             # time limit for job
#SBATCH --gres=gpu:h200_nvl:1   # request a number of whole GPUs
 
# CPU cores, memory and temporary disk space will be allocated automatically.
# Do not request them in this script. 
 
# Commands to be run: 
 
module load my_module
nvidia-cuda-mps-control -d 
./my_cuda_program

If saved in a file called my_gpu_job.sh, this could be submitted to the queue with the command:

sbatch my_gpu_job.sh
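
The example above starts MPS but launches only one copy of the program. One possible way to run several copies concurrently is sketched below. It is an illustration under stated assumptions, not a tested recipe for this cluster: the helper name run_copies and the choice of two copies are hypothetical, and ./my_cuda_program is the same placeholder used above. Sending "quit" to nvidia-cuda-mps-control is the standard way to stop the MPS daemon.

```shell
# Hypothetical helper: run N copies of a command concurrently and wait for all.
run_copies() {
    n=$1
    shift
    i=1
    while [ "$i" -le "$n" ]; do
        "$@" &          # each copy runs in the background
        i=$((i + 1))
    done
    wait                # block until every background copy has exited
}

# Start the MPS daemon if the tool is available on this node.
if command -v nvidia-cuda-mps-control >/dev/null 2>&1; then
    nvidia-cuda-mps-control -d
fi

# Launch two copies of the placeholder GPU program, if it exists.
if [ -x ./my_cuda_program ]; then
    run_copies 2 ./my_cuda_program
fi

# Ask the MPS daemon to shut down once the work is done.
if command -v nvidia-cuda-mps-control >/dev/null 2>&1; then
    echo quit | nvidia-cuda-mps-control
fi
```

Running the copies in the background and then calling wait keeps the job alive until every copy has finished; without the wait, the script would exit and Slurm would end the job while the copies were still running.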

MPI jobs and GPUs

Applications that use MPI (Message Passing Interface) for parallelisation typically use sbatch options to select the number of processes (#SBATCH -n) and number of CPU cores per process (#SBATCH -c).

The cuda queue rejects these options because it provides a fixed CPU resource allocation based on the GPU request. We are therefore trialling an updated version of the mpirun command with a new option, "--arc-par", which specifies how many processes to launch per available GPU or CPU core within the job's CPU allocation. The general form of the command to launch an MPI program my_mpi_program is:

mpirun --arc-par <par_option> ./my_mpi_program

where <par_option> typically takes the form Nppg (e.g. 1ppg, 2ppg) to launch N MPI processes (ranks) per GPU.

For example, for a job submitted to the cuda queue and allocated two whole H200 GPUs, the following line in the job script would launch 8 MPI processes (4 on each of the 2 GPUs):

mpirun.test --arc-par 4ppg ./my_mpi_program
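
The arithmetic behind the Nppg form can be sketched in a few lines of shell. This only illustrates how the total rank count follows from the GPU count and the option value; it is not how mpirun itself processes --arc-par, and the function name total_ranks is hypothetical.

```shell
# Hypothetical helper: total MPI ranks for a GPU count and an "Nppg" option.
total_ranks() {
    gpus=$1
    per_gpu=${2%ppg}    # strip the "ppg" suffix, e.g. "4ppg" -> "4"
    echo $(( $gpus * $per_gpu ))
}

total_ranks 2 4ppg      # two GPUs at 4 processes per GPU: prints 8
```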

However, if several MPI processes attempt to use the same GPU, performance will be disappointing unless the job launches NVIDIA MPS first, as in the example below. The "Multiple processes on one GPU" section has further information on MPS.

Example job script:

#!/bin/bash  

# Example job requesting 2 whole GPUs
# and running 4 MPI processes/ranks per GPU
 
# Slurm resource requests: 
 
#SBATCH -p cuda                 # submit the job to the cuda queue
#SBATCH -t 01:00:00             # time limit for job
#SBATCH --gres=gpu:h200_nvl:2   # request a number of whole GPUs
 
# CPU cores, memory and temporary disk space will be allocated automatically.
# Do not request them in this script. 
 
# Launch NVIDIA MPS so that multiple MPI processes/ranks can 
# access each GPU at once
nvidia-cuda-mps-control -d
 
# Other commands to be run: 
module load my_module
mpirun --arc-par 4ppg ./my_mpi_program

If saved in a file called my_gpu_job.sh, this could be submitted to the queue with the command:

sbatch my_gpu_job.sh
