Page 1:

Requesting Resources on an HPC Facility

Michael Griffiths and Norbert Gyenge

Corporate Information and Computing Services

The University of Sheffield

www.sheffield.ac.uk/cics/research

(Using the Slurm Workload Manager)

Page 2:

1. Understand what High Performance Computing is

2. Be able to access remote HPC Systems by different methods

3. Run Applications on a remote HPC system

4. Manage files using the Linux operating system

5. Know how to use the different kinds of file storage systems

6. Run applications using a Scheduling System or Workload Manager

7. Know how to get more resources and how to get resources dedicated for your research

8. Know how to enhance your research through shell scripting

9. Know how to get help and training

Review: Objectives

Page 3:

1. Using the Job Scheduler – Interactive Jobs

2. Batch Jobs

3. Task arrays

4. Running Parallel Jobs

5. Beyond Bessemer: accessing Tier 2 resources

6. Course examples available using:

git clone --single-branch --branch bessemer https://github.com/rcgsheffield/hpc_intro

Outline

Page 4:

1. USING THE JOB SCHEDULER

• Interactive Jobs

• https://docs.hpc.shef.ac.uk/en/latest/bessemer/slurm.html#request-an-interactive-shell

• Batch Jobs

• https://docs.hpc.shef.ac.uk/en/latest/bessemer/slurm.html#submitting-non-interactive-jobs

• SLURM Documentation

• https://slurm.schedmd.com/pdfs/summary.pdf

• https://slurm.schedmd.com/man_index.html

Page 5:

RUNNING JOBS: A NOTE ON INTERACTIVE JOBS

• Software that requires intensive computing should be run on the worker nodes and not the head node.

• You should run compute intensive interactive jobs on the worker nodes by using the command

• srun --pty bash -i

• The maximum (and also the default) time limit for interactive jobs is 8 hours.
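As a minimal sketch, an interactive session can also request specific resources explicitly; the values below are illustrative examples, not site defaults:

    srun --cpus-per-task=2 --mem=8G --time=01:00:00 --pty bash -i

Here --cpus-per-task, --mem and --time are standard Slurm options; omitting them gives the scheduler defaults described in these slides (2 GB of real memory and the 8-hour interactive limit).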

Page 6:

SLURM

• Bessemer login nodes are gateways to the cluster of worker nodes.

• Login nodes’ main purpose is to allow access to the worker nodes but NOT to run cpu intensive programs.

• All cpu intensive computations must be performed on the worker nodes. This is achieved by;

• srun --pty bash -i for interactive jobs

• sbatch submission.sh for batch jobs

• Once you log into Bessemer, taking advantage of the power of a worker node for interactive work is done simply by typing srun --pty bash -i and working in the shell window. The next set of slides assumes that you are already working on one of the worker nodes.

Page 7:

PRACTICE SESSION 1: RUNNING APPLICATIONS ON BESSEMER (PROBLEM 1)

• Case Studies

• Analysis of Patient Inflammation Data

• Running an R application: how to submit jobs and run R interactively

• List available and loaded modules; load the module for the R package

• Start the R Application and plot the inflammation data
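A hedged sketch of the module workflow for this exercise (the R module name below is illustrative only; use the exact name reported by module avail on Bessemer):

    module avail          # list the modules available on the cluster
    module list           # show the modules currently loaded
    module load apps/R    # illustrative name; substitute the real R module
    R                     # start an interactive R session on the worker node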

Page 8:

MANAGING YOUR JOBS: SLURM OVERVIEW

SLURM is the workload management, job scheduling and batch control system. (Others are available, such as PBS, Torque/Maui and Platform LSF.)

• Starts up interactive jobs on available workers

• Schedules all batch-oriented (i.e. non-interactive) jobs

• Fault-tolerant, highly scalable cluster management and job scheduling system

• Optimizes resource utilization

Page 9:

SCHEDULING BATCH JOBS ON THE CLUSTER

[Diagram: a SLURM master node dispatches jobs (Job N, O, U, X, Y, Z) from queues (Queue-A, Queue-B, Queue-C) into slots on the SLURM worker nodes, governed by queues, policies, priorities, shares/tickets, resources and users/projects.]

Page 10:

MANAGING JOBS: MONITORING AND CONTROLLING YOUR JOBS

• There are a number of commands for querying and modifying the status of a job that is running or waiting to run. These are:

• squeue (query job status)

• squeue --jobs <jobid>

• squeue --users <username>

• squeue --users "*"

• scancel (delete a job)

• scancel <jobid>
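As a usage sketch, the standard --format option of squeue can tailor the columns shown; the format string and job ID below are illustrative:

    squeue --users $USER --format="%.10i %.9P %.20j %.8T %.10M %R"   # job ID, partition, name, state, run time, reason/nodes
    scancel 1234567                                                  # delete the job with this (illustrative) ID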

Page 11:

DEMONSTRATION 1

Using the R package to analyse patient data

sbatch example:

sbatch myjob

The first few lines of the submit script myjob contain:

#!/bin/bash

#SBATCH --time=10:00:00

#SBATCH --output=myoutputfile

#SBATCH --error=myerroroutput

and you simply type: sbatch myjob

Running Jobs: batch job example
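Building on the fragment above, a fuller sketch of a submission script for the R demonstration might look like this; the module name and R script name are illustrative assumptions, not the course's actual files:

    #!/bin/bash
    #SBATCH --time=00:10:00
    #SBATCH --mem=2G
    #SBATCH --output=inflammation.out
    #SBATCH --error=inflammation.err

    # Load an R module (use the exact name shown by 'module avail' on Bessemer)
    module load apps/R

    # Run the analysis non-interactively (illustrative script name)
    Rscript analyse_inflammation.R

Submit it with sbatch myjob; the output and error files will appear in the submission directory.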

Page 12:

PRACTICE SESSION: SUBMITTING JOBS TO BESSEMER (PROBLEM 2 & 3)

• Patient Inflammation Study: run the R example as a batch job

• Case Study

• Fish population simulation

• Submitting jobs to SLURM

• Instructions are in the readme file in the slurm folder of the course examples

• From an interactive session

• Load the compiler module

• Compile the fish program

• Run test1, test2 and test3

Page 13:

MANAGING JOBS: REASONS FOR JOB FAILURES

• SLURM cannot find the binary file specified in the job script

• You ran out of file storage. It is possible to exceed your filestore allocation limits during a job that is producing large output files. Use the quota command to check this.

• Required input files are missing from the startup directory

• Environment variable is not set correctly (LM_LICENSE_FILE etc)

• Hardware failure
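When a job does fail, a quick first check is to inspect your filestore allocation and the job's error output, using only commands and options already mentioned in these slides (the error file name is whatever you set with --error):

    quota                  # check filestore usage against your allocation
    cat myerroroutput      # read the job's error output file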

Page 14:

FINDING OUT THE MEMORY REQUIREMENTS OF A JOB

• Real Memory Limits:

• Default real memory allocation is 2 Gbytes

• Request 64GB memory using a batch file

• #SBATCH --mem=64000

• Real memory can also be requested in gigabytes using --mem=<NN>G, e.g. --mem=64G

Determining the memory requirements of a job:

• scontrol show jobid -dd <jobid>
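For jobs that have already finished, the standard Slurm accounting command sacct reports the peak memory actually used, which helps in choosing a sensible --mem request; this sketch assumes job accounting is enabled on the cluster:

    sacct -j <jobid> --format=JobID,JobName,Elapsed,MaxRSS,ReqMem,State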

Page 15:

MANAGING JOBS: RUNNING CPU-PARALLEL JOBS

• Many-processor tasks:

• Shared memory

• Distributed memory

#!/bin/bash

#SBATCH --nodes=1

#SBATCH --ntasks-per-node=40

#SBATCH --mem=64000

#SBATCH --mail-user=[email protected]

module load apps/openmpi/4.0.1/binary

• Jobs limited to a single node with a maximum of 40 tasks

• Compilers that support MPI.

• PGI, Intel, GNU
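Putting the directives above together, a minimal sketch of a single-node MPI submission script could look like the following; the executable name is illustrative, and whether to launch with srun or mpirun depends on how the MPI library was built:

    #!/bin/bash
    #SBATCH --nodes=1
    #SBATCH --ntasks-per-node=40
    #SBATCH --mem=64000
    #SBATCH --mail-user=[email protected]

    module load apps/openmpi/4.0.1/binary

    # Launch one MPI rank per requested task (illustrative executable name)
    srun ./my_mpi_program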

Page 16:

DEMONSTRATION 3

• Test 6 provides an opportunity to practice submitting parallel jobs to the scheduler.

• To run testmpi6, compile the MPI example:

• Load the openmpi compiler module

• module load apps/openmpi/4.0.1/binary

• compile the diffuse program

• mpicc diffuse.c -o diffuse -lm

• sbatch testmpi6

• Use squeue to monitor the job and examine the output

Running a parallel job

Page 17:

MANAGING JOBS: RUNNING ARRAYS OF JOBS

• Many processors running a copy of a task independently

• Add the --array parameter to the script file (with #SBATCH at the beginning of the line)

• Example: #SBATCH --array=1-4:1

• This will create 4 tasks from one job

• Each task will have its environment variable $SLURM_ARRAY_TASK_ID set to a single unique value ranging from 1 to 4 (see the sketch after this list).

• There is no guarantee that task number m will start before task number n, where m < n

• https://slurm.schedmd.com/job_array.html
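A minimal sketch of a task-array submission script using the directive above; the program and input file naming are illustrative (the fish example from the practice sessions is used as a stand-in):

    #!/bin/bash
    #SBATCH --array=1-4:1
    #SBATCH --time=00:10:00
    #SBATCH --output=array_%A_%a.out   # %A = array job ID, %a = task ID

    # Each task sees its own index in $SLURM_ARRAY_TASK_ID (1..4 here)
    echo "Running task $SLURM_ARRAY_TASK_ID"
    ./fish input_${SLURM_ARRAY_TASK_ID}.dat   # illustrative program and input naming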

Page 18:

PRACTICE SESSION: SUBMITTING A TASK ARRAY TO BESSEMER (PROBLEM 4)

• Case Study

• Fish population simulation

• Submitting jobs to Slurm

• Instructions are in the readme file in the Slurm folder of the course examples

• From an interactive session

• Run the Slurm task array example

• Run test4, test5

Page 19:

BEYOND BESSEMER

• Bessemer and ShARC are suitable for many compute problems

• Purchasing dedicated resource

• National Tier 2 facilities for more demanding compute problems

• Archer: a larger facility for grand challenge problems (peer review process to access)

https://www.sheffield.ac.uk/cics/research/hpc/costs

Page 20:

HIGH PERFORMANCE COMPUTING TIERS

• Tier 1 computing

• Archer

• Tier 2 Computing

• Peta-5, JADE

• Tier 3 Computing

• Bessemer, ShARC

Page 21:

PURCHASING RESOURCE

• Buying nodes using framework

• Research groups purchase HPC equipment against their research grant; this hardware is integrated with the Iceberg cluster

• Buying slice of time

• Research groups can purchase servers for a length of time specified by the research group (cost is 1.0p/core per hour)

• Servers are reserved for dedicated usage by the research group using a provided project name

• When reserved nodes are idle they become available to the general short queues. They are quickly released for use by the research group when required.

• For information e-mail [email protected]

https://www.sheffield.ac.uk/cics/research/hpc/costs

Page 22:

NATIONAL HPC SERVICES

• Tier-2 Facilities

• http://www.hpc-uk.ac.uk/

• https://goo.gl/j7UvBa

• Archer

• UK National Supercomputing Service

• Hardware: CRAY XC30

• 2632 standard nodes

• Each node contains two Intel E5-2697 v2 12-core processors

• Therefore 2632 × 2 × 12 = 63,168 cores

• 64 GB of memory per node

• 376 high-memory nodes with 128 GB memory

• Nodes connected to each other via the ARIES low-latency interconnect

• Research Data File System: 7.8 PB disk

• http://www.archer.ac.uk/

• EPCC

• HPCC Facilities

• http://www.epcc.ed.ac.uk/facilities/national-facilities

• Training and expertise in parallel computing

Page 23:

LINKS FOR SOFTWARE DOWNLOADS

• MobaXterm

https://mobaxterm.mobatek.net/

• Putty

http://www.chiark.greenend.org.uk/~sgtatham/putty/

• WinSCP

http://winscp.net/eng/download.php

• TigerVNC

http://sourceforge.net/projects/tigervnc/