Login nodes

Once you have logged into Hamilton, the Linux commands you type at the prompt are run on one of the service's two login nodes. Although these are relatively powerful computers, they are shared between all Hamilton users and should not be used for running demanding programs. Light interactive work, downloading and compiling software, and short test runs using a few CPU cores are all acceptable.

Care should be taken not to overload the login nodes: we reserve the right to stop programs that interfere with other people's use of the service.

Running intensive computations

The majority of the CPU cores and RAM on Hamilton are in its compute nodes.  In order to access these, you must request resources on them via the queuing system, Slurm.  Most work on Hamilton is submitted to Slurm as batch jobs that are scheduled by Slurm to run when space becomes available.  However, interactive work is also possible through Slurm.

Batch jobs

A batch job is defined by a script, written with a text editor such as nano, that contains two things:

  • instructions to Slurm describing the resources (CPU cores, memory, time, etc) needed for the job and any other Slurm settings
  • the commands the job will run, in sequence.
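
As a rough illustration of those two parts, a minimal job script might look like the sketch below. The resource values are arbitrary placeholders rather than recommendations, and the commands are stand-ins for real work.

```shell
#!/bin/bash
# Part 1: instructions to Slurm (illustrative values only)
#SBATCH -p shared          # queue to submit to
#SBATCH -t 0-00:10:00      # maximum run time, dd-hh:mm:ss
#SBATCH -c 1               # number of CPU cores
#SBATCH --mem=1G           # memory

# Part 2: the commands the job will run, in sequence
start_time=$(date)
echo "Job started on $(uname -n) at $start_time"
echo "Job finished"
```

Outside Slurm the #SBATCH lines are ordinary comments, so the script can also be run directly with bash as a quick syntax check.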

A batch job is typically set up using a login node and is submitted to Slurm from there. The pages Example job scripts - CPU jobs and Example job scripts - GPU jobs contain sample scripts for various types of jobs, and the Software pages have additional advice on configuring jobs for certain applications. All batch jobs are submitted using the command:

sbatch <job_script_name>

Once a job has been submitted to the queuing system, it will be scheduled and run as resources become free.

When a job script is submitted using the sbatch command, the system will provide you with a job number, or job id. This number is how the system identifies the job; it can be used to see if the job has completed running yet, to cancel it, etc. If you need to contact us about a problem with a job, please include this number as it is essential when diagnosing problems.

Using the example job script for a single-core CPU job, a fictional user account foobar22 could submit a job and check on its progress with:

[foobar22@login1 ~]$ sbatch my_serial_job.sh
Submitted batch job 3141717

[foobar22@login1 ~]$ squeue -u foobar22
             JOBID PARTITION     NAME       USER ST       TIME  NODES NODELIST(REASON)
           3141717    shared my_seria   foobar22 PD       0:00      1 (Resources)

  • The fifth column (ST) shows what state the job is in. R means that the job is running and PD means the job is pending, i.e. waiting for its turn in the queue. While it is pending, the NODELIST(REASON) column will show why it is not running, for example:
      • (Resources) - normal. The job is waiting for nodes to become free so that it can run.
      • (Priority) - normal. The job is waiting in the queue as there are higher-priority jobs ahead of it.
      • (PartitionNodeLimit) - job will not run. The job submission script has asked for too many resources for the queue.
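
The state and reason columns can also be read by a script. The sketch below is only an illustration: it splits the sample squeue line shown earlier on whitespace and classifies it using the three reasons listed above. It assumes the default column layout and a job name and reason that contain no spaces.

```shell
# Sample squeue line copied from the example above
line="3141717    shared my_seria   foobar22 PD       0:00      1 (Resources)"

# Split on whitespace: field 1 is JOBID, field 5 is ST, field 8 is NODELIST(REASON)
set -- $line
job_id=$1
state=$5
reason=$8

case "$state" in
    R)  status="running" ;;
    PD) case "$reason" in
            "(Resources)"|"(Priority)") status="pending (normal wait)" ;;
            "(PartitionNodeLimit)")     status="pending (will not run: too many resources requested)" ;;
            *)                          status="pending $reason" ;;
        esac ;;
    *)  status="state $state" ;;
esac

echo "Job $job_id: $status"
```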

The Hamilton portal also allows you to monitor jobs' progress through the queue.  Once a job is running, the portal also offers a graphical way to monitor jobs' resource usage, which can help with performance monitoring and tailoring of resource requests. 

When the job has started running, a file called slurm-<jobid>.out will be created. This contains any output printed by the commands in your job script. If the batch scheduler has to kill your job, for example because it tried to use more time or memory than requested, this will be noted at the bottom of this file.

Once the job has finished running, it will no longer appear in the output of squeue. Details about a finished job can be obtained from the command sacct -j <jobid>.
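
Because the job id is needed for most follow-up commands, it can be convenient to capture it when submitting. The sketch below extracts the id from the "Submitted batch job" message shown earlier. The message is hard-coded here so the snippet runs anywhere; on Hamilton it would come from sbatch itself.

```shell
# Sample sbatch output copied from the example above. In real use this would be:
#   submit_output=$(sbatch my_serial_job.sh)
submit_output="Submitted batch job 3141717"

# The job id is the last space-separated word of the message
job_id="${submit_output##* }"
echo "Job id: $job_id"

# Only call the Slurm commands where they are installed
if command -v sacct >/dev/null 2>&1; then
    squeue -j "$job_id"   # while the job is queued or running
    sacct -j "$job_id"    # once the job has finished
fi
```

Slurm's sbatch also has a --parsable option that prints the job id without the surrounding message, which avoids the string handling.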

Interactive jobs

Interactive jobs are useful when, for example, work needs to be done interactively but is too intensive for a login node (e.g. work that involves a graphical interface), or for testing software's behaviour in a Slurm environment.  The Hamilton portal provides a convenient interface for running some interactive jobs.  Alternatively, the Slurm command srun will also start an interactive job. For example, to start an interactive shell on a compute node, use:

srun --pty bash

Jobs run through srun are subject to the same controls as batch jobs. If you need extra resources, such as CPU cores, memory or time, request them in the same way as with sbatch (see Queueing system). For example:

srun --pty --mem=2G -c 2 -p test bash

Instead of starting an interactive shell on a compute node, other commands can also be run through srun, e.g.:

srun --mem=2G -c 2 -p test <mycommand>
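
After starting an interactive shell it is easy to lose track of whether a prompt belongs to a login node or to a job. One way to check, sketched below, is to look at the SLURM_JOB_ID environment variable, which Slurm sets inside jobs. The function name describe_session is a hypothetical helper, not part of Slurm or Hamilton.

```shell
# Hypothetical helper: describe the session given the value of SLURM_JOB_ID
describe_session() {
    if [ -n "$1" ]; then
        echo "inside Slurm job $1"
    else
        echo "not inside a Slurm job (probably a login node)"
    fi
}

# SLURM_JOB_ID is empty or unset outside a job
describe_session "${SLURM_JOB_ID:-}"
```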

Queueing system

Useful commands

Key commands to interact with the Slurm scheduling system are:

  • sbatch <jobscript> - submit a job to the queue
  • srun - start an interactive job
  • scancel <jobid> - remove jobs from the queue
  • squeue, squeue --me - see the status of all/your jobs that are in the queue or running
  • sacct -j <jobid> - show information about a job that has finished
  • sfree - show what resources are available
  • sinfo - summary of the system and status

Available queues and job limits

Compute nodes are organised into queues (also known as partitions). Hamilton currently has 6 queues:

Queue   Description                                            Node type    Node quantity  Job limits
shared  Default queue, intended for jobs that can share nodes  Standard     119(*)         3 days
multi   For jobs requiring one or more whole nodes             Standard     119(*)         3 days
long    For jobs requiring >3 days to run                      Standard     1(*)           7 days
bigmem  For jobs requiring >250GB memory                       High-memory  2              3 days
test    For short test jobs                                    Standard     1              15 minutes
cuda    For GPU jobs                                           GPU          1              3 days

(*) The shared, multi and long queues share a single pool of 119 standard compute nodes.  Node types are summarised on the Systems page.
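
The time and memory limits in the table can be turned into a simple rule of thumb for CPU jobs. The sketch below is a hypothetical helper, not a Hamilton tool: it takes an estimated run time in minutes and a memory requirement in GB and suggests a queue. It ignores the multi, test and cuda queues, whose use depends on the kind of job rather than its size.

```shell
# Hypothetical helper: suggest a CPU queue from run time (minutes) and memory (GB)
suggest_queue() {
    minutes=$1
    mem_gb=$2
    three_days=$((3 * 24 * 60))   # limit of the shared and bigmem queues
    seven_days=$((7 * 24 * 60))   # limit of the long queue

    if [ "$mem_gb" -gt 250 ]; then
        # High-memory nodes are only available in the bigmem queue
        if [ "$minutes" -le "$three_days" ]; then echo "bigmem"; else echo "none"; fi
    elif [ "$minutes" -le "$three_days" ]; then
        echo "shared"
    elif [ "$minutes" -le "$seven_days" ]; then
        echo "long"
    else
        echo "none"
    fi
}

suggest_queue 60 4      # prints "shared"
suggest_queue 6000 4    # prints "long"
suggest_queue 60 500    # prints "bigmem"
```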

Most work on Hamilton is done in the form of batch jobs, but it is also possible to run interactive jobs via the srun command. Both types of job can be submitted to any queue.

Job resources and options - CPU jobs

Unless you specify otherwise, jobs will be submitted to the shared queue and allocated the following resources:

  • 1 hour (15 minutes for the test queue)
  • 1 CPU core
  • 1GB memory
  • 1GB temporary disk space ($TMPDIR)

Further resources can be allocated using sbatch or srun options, which can be given either on the command line (e.g. sbatch -n 1 <job_script>) or embedded in your job script (e.g. by adding the line #SBATCH -n 1). If both are done, the command line takes precedence. Useful options include:

Slurm option           Description
-p <QUEUE>             Submit job to <QUEUE> (queues are also known as partitions)
-t <TIME>              Run job for a maximum time of <TIME>, in the format dd-hh:mm:ss
-c <CORES>             For multi-core jobs: allocate <CORES> CPU cores to the job
-n <CORES>             For MPI jobs: allocate <CORES> CPU cores to the job
-N <NODES>             Allocate <NODES> compute nodes to the job
--mem=<MEM>            Allocate <MEM> RAM per node to the job, e.g. 1G
--gres=tmp:<TMPSPACE>  Allocate <TMPSPACE> temporary disk space on the compute node(s)
--array=<START>-<END>  Run job several times, from indexes <START> to <END>
--mail-user=<EMAIL>    Send job notifications to email address <EMAIL> (for batch jobs only; not needed to send to submitter's Durham address)
--mail-type=<TYPE>     Types of job notifications to send, e.g. BEGIN, END, FAIL, ALL (recommended: END,FAIL).  For batch jobs only.
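
To show how several of these options combine, the sketch below embeds them in a job script for a small array job. All values are illustrative, and the input file naming scheme is a hypothetical example. SLURM_ARRAY_TASK_ID is the variable Slurm sets to the index of each array task; the fallback values only apply when the script is run outside Slurm.

```shell
#!/bin/bash
#SBATCH -p shared              # queue
#SBATCH -t 0-02:00:00          # 2 hours, in dd-hh:mm:ss format
#SBATCH -c 4                   # 4 CPU cores per array task
#SBATCH --mem=8G               # 8GB RAM per node
#SBATCH --gres=tmp:10G         # 10GB temporary disk space
#SBATCH --array=1-4            # run 4 times, with indexes 1 to 4
#SBATCH --mail-type=END,FAIL   # email when the job ends or fails

# Each array task sees a different index
task_id="${SLURM_ARRAY_TASK_ID:-1}"

# Hypothetical naming scheme: one input file per array index
input_file="input_${task_id}.dat"

# Temporary disk space is provided in $TMPDIR
scratch_dir="${TMPDIR:-/tmp}"

echo "Task $task_id: would process $input_file using scratch space in $scratch_dir"
```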

Types of compute node:

  • Standard - 128 CPU cores, 400GB temporary disk space, 246GB RAM
  • High-memory - 128 CPU cores, 400GB temporary disk space, 1.95TB RAM

The Example job scripts - CPU jobs page shows how to choose Slurm options for different types of job.

Job resources and options - GPU jobs

GPU jobs must be submitted to the cuda queue and should request the number and type of GPU resources needed.  Currently these can be one of the following:

Slurm option                   GPU resource allocated
--gres=gpu:1                   1/8 H200 GPU  (default if no GPU resource specified)
--gres=gpu:h200_nvl_1g.18gb:1  1/8 H200 GPU
--gres=gpu:h200_nvl            1 H200 GPU

Note: To increase capacity, some of Hamilton's H200 GPUs have been split into smaller parts.  These smaller units are recognised in Slurm as a separate type of GPU resource and are recommended for work that would not make full use of a whole GPU.

The sfree command lists the number of each type of GPU resource available.  Within jobs, the nvidia-smi command can confirm the GPU resources allocated.  

CPU, memory and temporary disk space requirements should not be specified separately for GPU jobs. Instead, they are allocated as the fraction of the machine corresponding to the GPUs allocated.  

Resource                       CPU cores  CPU memory  GPU memory  TMPDIR
Allocation per 1/8 H200 GPU    2          39GB        18GB        49GB
Allocation per whole H200 GPU  16         273GB       144GB       349GB

Apart from requests for CPU cores, memory or temporary disk space, Slurm options can be used in job requests as normal.  Common options that are relevant for the cuda queue include:

Slurm option           Description
-p cuda                Submit job to the cuda queue.  Required.
-t <TIME>              Run job for a maximum time of <TIME>, in the format dd-hh:mm:ss
--array=<START>-<END>  Run job several times, from indexes <START> to <END>
--mail-user=<EMAIL>    Send job notifications to email address <EMAIL> (for batch jobs only; not needed to send to submitter's Durham address)
--mail-type=<TYPE>     Types of job notifications to send, e.g. BEGIN, END, FAIL, ALL (recommended: END,FAIL).  For batch jobs only.
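
Putting the GPU options together, a small GPU job script might look like the sketch below. The time limit is an arbitrary example. No CPU, memory or temporary disk options are given, because those are allocated automatically with the GPU. The nvidia-smi call is guarded so that the script also runs, harmlessly, on a machine without GPU tools.

```shell
#!/bin/bash
#SBATCH -p cuda            # GPU jobs must use the cuda queue
#SBATCH --gres=gpu:1       # one 1/8 H200 GPU
#SBATCH -t 0-00:30:00      # 30 minutes, in dd-hh:mm:ss format

# Confirm which GPU resources were allocated, if the tool is present
if command -v nvidia-smi >/dev/null 2>&1; then
    gpu_check="available"
    nvidia-smi
else
    gpu_check="missing"
    echo "nvidia-smi not found: this does not look like a GPU node"
fi
```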

The Example job scripts - GPU jobs page shows how to choose Slurm options for different types of job.  Once you have jobs running, the Hamilton portal allows you to monitor how your jobs use their allocated GPU resources.

Environment variables - all jobs

Slurm sets a number of environment variables that can be helpful to, for example, match the behaviour of a job to its resource allocation. These are detailed on the sbatch and srun man pages. 

The four additional environment variables below are set to match the value given in #SBATCH -c <number>, to help automate the behaviour of multi-threaded programs. This should be reasonable in most cases, but the values can be changed in job scripts if desired.

  • $OMP_NUM_THREADS
  • $OPENBLAS_NUM_THREADS
  • $MKL_NUM_THREADS
  • $BLIS_NUM_THREADS

Some applications require that the number of threads is set on the command line or in code.  Setting this to $SLURM_CPUS_PER_TASK rather than a fixed number will ensure that it always matches the number of CPU cores allocated with "#SBATCH -c".
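
A minimal sketch of that advice is shown below. The program name my_program and its --threads option are hypothetical placeholders, so that line is left commented out. The fallback to 1 only applies when the script is run outside a job that requested cores with -c.

```shell
#!/bin/bash
#SBATCH -c 4               # illustrative core count

# Use the allocated core count rather than a fixed number
threads="${SLURM_CPUS_PER_TASK:-1}"

# Optional: override one of the thread variables explicitly
export OMP_NUM_THREADS="$threads"

echo "Running with $threads thread(s)"

# Hypothetical program that takes its thread count on the command line:
# ./my_program --threads "$threads"
```

If the core count in the #SBATCH line changes later, the thread count follows it without further edits to the script.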