Page 1:

Grid Remote Execution of Large Climate Models

(NERC Cluster Grid)

Dan Bretherton, Jon Blower and Keith Haines

Reading e-Science Centre
www.resc.rdg.ac.uk

Environmental Systems Science Centre, University of Reading, UK

Page 2:

Main themes of presentation

- Sharing HPC clusters used for running climate models
- Why share clusters
- Grid approach to cluster sharing: the NERC Cluster Grid (NERC: UK Natural Environment Research Council)
- G-Rex grid middleware
- Large climate models as grid services
- Please also see the demonstration and poster

Page 3:

Background

- Many NERC institutes now have HPC clusters
  - Beowulf clusters with commodity hardware
  - Common applications are ocean, atmosphere and climate models
- Pressure to justify spending and increase utilisation
  - Sharing clusters helps increase utilisation
  - Sharing clusters facilitates collaborations
- Running climate models on remote clusters in the traditional way is not easy

Page 4:

Using remote clusters the traditional way

[Diagram: local machine and remote cluster. Input data (of the order of 100 GB) is copied to the remote cluster with SCP and output data is copied back with SCP; the run is controlled over SSH. The local machine holds the model input and output; the remote cluster holds the full model setup, including source code, work-flow scripts, model input and output.]
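A minimal sketch of this traditional workflow, assuming illustrative host names, paths and a PBS-style batch scheduler (none of these appear in the slides):

    # Copy input data to the remote cluster (can be ~100 GB)
    scp -r forcing_data/ user@cluster.example.ac.uk:/work/nemo/input/

    # Log in and submit the model run by hand
    ssh user@cluster.example.ac.uk "cd /work/nemo && qsub run_nemo.sh"

    # Poll the queue manually until the run finishes
    ssh user@cluster.example.ac.uk "qstat -u user"

    # Copy the output back at the end of the run
    scp -r user@cluster.example.ac.uk:/work/nemo/output/ results/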

Page 5:

Computational challenges of climate models

Typical requirements:
- Parallel processing (MPI) with a large number of processors, usually 20-100 (see the job-script sketch after this list)
- Each cluster needs a high-speed interconnect (e.g. Myrinet or InfiniBand)
- Long runs lasting several days
- Large volumes of output
- Large number of separate output files
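For illustration only (assuming a PBS-style scheduler; the actual job scripts differ per cluster), a run of this size is typically submitted as a parallel batch job:

    #!/bin/bash
    # Illustrative PBS job script for a 40-processor MPI run
    #PBS -N nemo_run
    #PBS -l nodes=10:ppn=4        # 40 processors in total
    #PBS -l walltime=72:00:00     # long runs can last several days

    cd "$PBS_O_WORKDIR"
    mpirun -np 40 ./nemo.exe      # launch the model under MPI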

Page 6:

NEMO Ocean Model (e.g. European operational oceanography)

Main parameters of a typical 1° global assimilation run for one year:
- Runs with 40 processors; 2-3 hours per model year on the cluster
- Outputs 300 MB in 700 separate files as diagnostics every 5-10 minutes
- Output for one year is roughly 20 GB, in a total of 50,000 separate files
- A 50-year "reanalysis" is therefore about 1 TB
- The model is automatically re-submitted as a new job each year (sketched below)
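A minimal sketch of the year-by-year re-submission, assuming a hypothetical run_year.sh job script and a PBS-style qsub (the real work-flow scripts are not shown in the slides):

    #!/bin/bash
    # run_year.sh (hypothetical): run one model year, then re-submit the next year
    #PBS -l nodes=10:ppn=4
    YEAR=${YEAR:-1958}              # current model year, passed in via 'qsub -v'
    LAST_YEAR=2007                  # 50-year reanalysis (years are illustrative)

    cd "$PBS_O_WORKDIR"
    mpirun -np 40 ./nemo.exe        # one model year takes about 2-3 hours

    if [ "$YEAR" -lt "$LAST_YEAR" ]; then
        # hand the next year on as a fresh batch job
        qsub -v YEAR=$((YEAR + 1)) run_year.sh
    fi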

Page 7:

NERC Cluster Grid

Includes 3 clusters so far (plans for 11 clusters):
- Reading (64 procs.), Proudman (360 procs.), British Antarctic Survey (160 procs.)

Main aim:
- Make it easier to use remote clusters for running large models

Key features:
- Minimal data footprint on remote clusters
- Easy job submission and control
- Light-weight grid middleware (G-Rex)
- Load and performance monitoring (Ganglia)
- Security

Page 8:

Grid Remote EXecution (G-Rex)

- G-Rex is light-weight grid middleware
- Implemented in Java using the Spring framework
- The G-Rex server is a Web application
  - Allows applications to be exposed as services
  - Runs inside a servlet container
- The G-Rex client program, grexrun, behaves as if the remote service were actually running on the user's own computer
- Security is based on HTTP digest authentication (illustrated below)
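The slides do not show the G-Rex request interface, but HTTP digest authentication itself can be illustrated with an ordinary HTTP client; the URL path, user name and password below are made up:

    # Hypothetical digest-authenticated request to a G-Rex server on port 9092
    curl --digest -u alice:secret http://cluster.example.ac.uk:9092/G-Rex/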

Page 9:

NEMO G-Rex service: Deployment scenario 1

[Diagram: client and server. The client side holds the complete NEMO model setup, including source code, work-flow scripts, input data and output from all runs, and runs the G-Rex client. The server side holds only the NEMO launch scripts and forcing data (the same every run) and runs the G-Rex server inside Apache Tomcat, with the Tomcat port (9092) open to the client. Input and output are transferred via HTTP.]

Page 10:

NEMO G-Rex service: Deployment scenario 2

[Diagram: as in scenario 1, the client holds the complete NEMO model setup and runs the G-Rex client, while the server holds only the NEMO launch scripts and forcing data (the same every run) and runs the G-Rex server inside Apache Tomcat on port 9092, with input and output transferred via HTTP. Unlike scenario 1, the diagram does not show the Tomcat port as open to the client.]

Page 11:

Advantages of G-Rex

- Output is continuously transferred back to the user
  - Job can be monitored easily
  - No data transfer delay at end of run
- Files are deleted from the server when no longer needed
  - Prevents unnecessary accumulation of data
  - Reduces the data footprint of services
- Work-flows can be created using shell scripts (see the sketch after this list)
- Very easy to install and use
- See poster; demonstration also available
- www.resc.reading.ac.uk
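A minimal sketch of such a work-flow, assuming a hypothetical grexrun command line and made-up service URL and file names (the slides do not give the exact syntax):

    #!/bin/bash
    # Hypothetical work-flow chaining several one-year NEMO runs through G-Rex.
    SERVICE=http://cluster.example.ac.uk:9092/G-Rex/nemo

    for year in 1958 1959 1960; do
        # grexrun behaves as if the remote service ran locally: it uploads the
        # inputs and streams the output back continuously during the run
        grexrun "$SERVICE" namelist_${year} restart_${year}.nc

        # output is already on the local disk, so post-process it straight away
        ./make_diagnostics.sh output_${year}/
    done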