NGS computation services: APIs, concurrent and parallel jobs
Mike Mineter
[email protected]

http://www.ngs.ac.uk
http://www.grid-support.ac.uk
http://www.eu-egee.org/
http://www.pparc.ac.uk/
http://www.nesc.ac.uk/


Overview

3 distinct techniques will be in the practical:

• APIs to the low-level tools

• Using multiple processors
  – Concurrent processing: for many jobs
  – Parallel processing: to speed execution of one job


Overview

• APIs to the low-level tools


Job submission so far

[Diagram: user's interface to the grid. Labels: GLOBUS, etc.; command-line interfaces; portal and browser.]


Application-specific tools

[Diagram: user's interface to the grid. Labels: GLOBUS, etc.; command-line interfaces; portal and browser; application-specific and/or higher generic tools, with their own portal and browser.]


Application-specific tools

[Diagram: user's interface to the grid. Labels: GLOBUS, etc.; command-line interfaces; portal and browser; application-specific and/or higher generic tools.]

APIs:

• Java

• C

• …


Available APIs

• “Community Grid” CoG: http://www.cogkit.org/
  – Java, Python, Matlab
  – (very limited functionality on Windows: no GSI)

• C: http://www.globus.org/developer/api-reference.html
  – Example in the practical


Using multiple processors


High performance computing

• If run-times are too long, you can
  – Optimise, re-code
  – Use HPC

• Parallel processing: multiple processors cooperate on one task
  – you run one executable; it is distributed for you

• Concurrent processing: multiple processors, multiple tasks
  – multiple executables distributed and run by a script you write

• High-throughput computing: multiple processors, multiple tasks using spare CPU cycles
  – CPU years in elapsed days
  – E.g. sensitivity analyses in modelling
  – Condor, for example


High performance computing

• HPC
  – Parallel processing: on NGS, you run one globus-job-submit command, but need to code and build the program so it is parallelised
  – Concurrent processing: on NGS, multiple executables run from a script on the UI
  – High-throughput computing: many independent runs using spare CPU cycles; CPU years in days

NGS core nodes open these routes to you, but you have to do a bit of work! (Grid is not magic!)

Watch for addition of Condor pools on the NGS!


Concurrent processing

[Diagram: UI, Internet, head processors of clusters, worker processors of clusters. globus-job-submit goes from the UI to the head processors, and on to the worker processors via PBS.]

Processes run independently


Concurrent processing

• An approach:
  – Script 1: to submit multiple jobs and gather their URIs from globus-job-submit
  – Script 2: to test for completion and gather results
  – In the practical, the jobs are short, so these scripts are integrated
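
The two-script approach above can be sketched in Python as follows. This is a minimal sketch, not the scripts used in the practical. It assumes the GT2 command-line tools globus-job-submit, globus-job-status and globus-job-get-output are available on the UI, that globus-job-submit prints the job's URI, and that globus-job-status prints DONE or FAILED once a job has finished. The function names, the `run` parameter and the polling interval are hypothetical.

```python
import subprocess
import time


def run_command(args):
    """Run a command line and return its standard output, stripped."""
    completed = subprocess.run(args, capture_output=True, text=True, check=True)
    return completed.stdout.strip()


def submit_jobs(host, commands, run=run_command):
    """Script 1: submit each command and collect the URI printed for each job."""
    # Assumption: globus-job-submit prints one job URI on standard output.
    return [run(["globus-job-submit", host] + list(cmd)) for cmd in commands]


def gather_results(uris, run=run_command, poll_seconds=10, sleep=time.sleep):
    """Script 2: poll until every job has finished, then fetch its output."""
    results = {}
    pending = list(uris)
    while pending:
        still_running = []
        for uri in pending:
            # Assumption: the status is a single word such as ACTIVE or DONE.
            status = run(["globus-job-status", uri])
            if status == "DONE":
                results[uri] = run(["globus-job-get-output", uri])
            elif status == "FAILED":
                results[uri] = None  # no output to collect
            else:
                still_running.append(uri)
        pending = still_running
        if pending:
            sleep(poll_seconds)
    return results
```

A typical call would be `gather_results(submit_jobs(host, commands))`, where `host` is the contact string of an NGS node and `commands` is a list of argument lists; both are placeholders. Passing a different `run` function makes it possible to try the logic without a grid connection.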


Parallel Processing

• In the early 1990s parallel processing had some similarities to the current state of grid computing, if on a smaller scale
  – National resources began to be available
  – Standards were needed
  – Lot of hype
  – Early adopters in physics
  – Inadequate integration with the desktop

• And from the jungle of conflicting approaches emerged…

• MPI: Message Passing Interface http://www-unix.mcs.anl.gov/mpi/mpich/


Parallel processing

[Diagram, built up over three slides: UI, Internet, head processors of clusters, worker processors of clusters. globus-job-submit goes from the UI to a cluster; MPI links the worker processors.]

Processes communicate, synchronise (the simpler the better!)


MPI concepts

• Each processor is assigned an ID in [0, P-1] for P processors
  – Used to send/recv messages and to determine processing to be done

• Usually code so that processor 0 has coordinating role:
  – Reading input files, broadcasting data to all other processors
  – Receiving results

• Point-to-point messages are used to exchange data
  – Either blocking (wait until both processors are ready to send/receive)
  – Or non-blocking: start the send/recv, continue, check for send/recv completion

• Collective functions
  – Broadcast: sends data to all processors
  – Scatter: sends array elements to different processors
  – Gather: collects different array elements from different processors
  – Reduce: collects data and performs operation as data are received

• Can group processors for specific purposes (e.g. functional parallelism)
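
The collective functions listed above can be modelled in a few lines of plain Python. This is only a single-process sketch of what each operation does to the data, with a list index standing in for the processor ID; it is not MPI code, and the function names are illustrative. The basic MPI scatter sends the same number of elements to every processor, whereas this model lets chunk sizes differ by one when the data do not divide evenly.

```python
from functools import reduce as fold
import operator


def broadcast(value, num_procs):
    """Broadcast: every processor ends up holding the same value."""
    return [value] * num_procs


def scatter(data, num_procs):
    """Scatter: split data into one contiguous chunk per processor."""
    base, extra = divmod(len(data), num_procs)
    chunks, start = [], 0
    for rank in range(num_procs):
        # The first `extra` processors take one additional element.
        size = base + (1 if rank < extra else 0)
        chunks.append(data[start:start + size])
        start += size
    return chunks


def gather(chunks):
    """Gather: collect the per-processor chunks back into one list."""
    return [item for chunk in chunks for item in chunk]


def reduce_values(values, op=operator.add):
    """Reduce: combine one value per processor with a single operation."""
    return fold(op, values)


def sum_of_squares(data, num_procs):
    """Scatter the input, let each 'processor' work alone, then reduce."""
    chunks = scatter(data, num_procs)
    partial_sums = [sum(x * x for x in chunk) for chunk in chunks]
    return reduce_values(partial_sums)
```

For instance, `sum_of_squares(list(range(10)), 4)` splits ten numbers into chunks of 3, 3, 2 and 2 elements and returns 285, the same answer as a sequential sum.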


MPI notes

• How does the task split into sub-tasks?
  – By functions that can run in parallel??!
  – By sending different subsets of data to different processes? More usual! Overheads of scatter and gather

• Need to design and code carefully: be alert to
  – sequential parts of your program (if half your runtime is sequential, speedup will never be more than 2)
  – how load can be balanced (64 processes with 65 tasks will achieve no speedup over 33 processes)
  – Deadlock!

• MPI functions are usually invoked from C or Fortran programs, but also Java

• Several example patterns are given in the practical. Many MPI tutorials are on the Web!
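
The two numerical claims above can be checked with a short calculation. This sketch assumes Amdahl's law for the sequential-fraction bound and, for load balance, assumes equal-length tasks with each process running one task at a time; the function names are illustrative.

```python
import math


def amdahl_speedup(sequential_fraction, num_procs):
    """Best possible speedup when a fixed fraction of the runtime is sequential."""
    parallel_fraction = 1.0 - sequential_fraction
    return 1.0 / (sequential_fraction + parallel_fraction / num_procs)


def rounds_needed(num_tasks, num_procs):
    """Back-to-back rounds needed when every task takes the same time."""
    return math.ceil(num_tasks / num_procs)
```

With half the runtime sequential, `amdahl_speedup(0.5, 64)` is about 1.97, and the value stays below 2 for any number of processes. `rounds_needed(65, 64)` and `rounds_needed(65, 33)` are both 2, so the elapsed time is the same in both cases; only 65 processes would finish in one round.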


Summary

• Can build user applications and higher level tools on GT2 via APIs

• NGS core nodes support
  – concurrent processing (if you do a bit of scripting / coding)
  – MPI for parallelism (if you parallelise your code)

• Remember
  – the CSAR, HPCx services!
  – NGS is batch oriented at present.


Practical

• http://homepages.nesc.ac.uk/~gcw/NGS/GRAM_II.html

• (That II is letters not numbers on the end!)