
Page 1: HPC@sheffield  - where do we go next?

HPC@sheffield - where do we go next?

John Harding
Dept. Materials Science and Engineering, Sheffield

Page 2: HPC@sheffield  - where do we go next?
Page 3: HPC@sheffield  - where do we go next?


Simulation Scales

www.nsf.gov/news/speeches/bordogna/rochester/sld006.htm

Page 4: HPC@sheffield  - where do we go next?

Simulations and scales


Page 5: HPC@sheffield  - where do we go next?

For details see www.prace-ri.eu

Page 6: HPC@sheffield  - where do we go next?

Grid computing – particle physics

Page 7: HPC@sheffield  - where do we go next?

Iceberg Current Configuration

• Cluster based on Intel 6-core X5650
  – 71 nodes: 12 cores and 24GB
  – 4 nodes: 12 cores and 48GB
  – 8 NVIDIA Tesla Fermi M2070 GPUs
• Storage
  – 45TB NFS-mounted mirrored data store
    • User quota system
  – 80TB Lustre-based parallel file store
    • Fastdata store: a scratch area; data is deleted after 90 days
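A quick tally of the configuration above, using only the figures quoted on the slide (a minimal Python sketch, no other specifications assumed):

```python
# Core and memory totals for the current Iceberg cluster, taken
# directly from the node counts quoted on the slide.
standard_nodes, large_nodes = 71, 4
cores = (standard_nodes + large_nodes) * 12          # 12 cores per node
memory_gb = standard_nodes * 24 + large_nodes * 48   # 24GB / 48GB per node

print(cores, "cores,", memory_gb, "GB RAM, plus 8 M2070 GPUs")
# -> 900 cores, 1896 GB RAM, plus 8 M2070 GPUs
```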

Page 8: HPC@sheffield  - where do we go next?

How do we use what we have at Sheffield?

[Charts covering 2009–2014: Number of Active Users; Total CPU Time for Iceberg (years); Mean Top 100 Users' Usage (wallclock and CPU hours per year); Mean User Usage (wallclock and CPU hours per year)]

Page 9: HPC@sheffield  - where do we go next?

Iceberg main users

• Mechanical Engineering
  – Aerodynamics
  – Thermofluids
• Applied Maths
  – Solar Physics and Space Plasma Research
• Insigneo
  – Virtual Physiological Human Initiative
• Electrical Engineering
  – Device Electronics: Nanowires and Nanotubes
  – Lensless Microscopy

Page 10: HPC@sheffield  - where do we go next?

Future of computing in Sheffield: the SP2RC approach

Solar Physics and Space Plasma Research Centre (SP2RC)

Solar Wave Theory Group (SWAT), School of Mathematics and Statistics (SoMaS), Faculty of Science, UoS

Space System Laboratory (SSL), Automatic Control and Systems Engineering (ACSE), Faculty of Engineering, UoS

SP2RC at the UoS seeks to understand the nature of key plasma processes occurring in:
• the solar interior;
• the atmosphere of the Sun, from photosphere to corona;
with particular attention devoted to the various coupling mechanisms of these apparently distinct regions.

Page 11: HPC@sheffield  - where do we go next?

Interfaces for photovoltaics

φ_p = E_F − E_VBT, where φ_p is the p-type Schottky barrier height, E_F the supercell Fermi energy and E_VBT the valence band top energy of the bulk semiconductor.

φ_n = E_g − φ_p, where φ_n is the n-type Schottky barrier height and E_g the semiconductor band gap energy.

Interface   φp/eV   φn/eV   φn/eV (expt.)
<111>       0.31    0.80    0.74
<110>       0.50    0.61    –


K.T. Butler, P.E. Vullum, A.M. Muggerud, E. Cabrera and J.H. Harding, Phys. Rev. B 83 (2011) 235307
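The two relations can be checked directly against the table; a minimal Python sketch (the function names are illustrative, not taken from the paper):

```python
# Schottky barrier relations from the slide:
#   phi_p = E_F - E_VBT   (p-type barrier height)
#   phi_n = E_g - phi_p   (n-type barrier height)

def phi_p(e_fermi, e_vbt):
    """p-type barrier: supercell Fermi energy minus bulk valence band top (eV)."""
    return e_fermi - e_vbt

def phi_n(e_gap, p_barrier):
    """n-type barrier: band gap minus the p-type barrier (eV)."""
    return e_gap - p_barrier

# Consistency check on the <111> row of the table:
# phi_p + phi_n should recover the band gap.
print(round(0.31 + 0.80, 2))        # 1.11 eV -> implied band gap
print(round(phi_n(1.11, 0.31), 2))  # 0.8 eV, matching the tabulated value
```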

Page 12: HPC@sheffield  - where do we go next?

Biomechanics (Lacroix)

Page 13: HPC@sheffield  - where do we go next?
Page 14: HPC@sheffield  - where do we go next?

A new, bigger Iceberg

• Infrastructure to be added to the existing Intel nodes
• Cluster based on Intel Ivy Bridge 8-core E5-2650 v2
  – 71 nodes, each with 16 cores and 64GB memory
  – 8 nodes, each with 16 cores and 256GB memory
  – 8 NVIDIA Tesla K40 GPUs
• Storage
  – 100TB Lustre-based parallel file store
  – Fastdata store: a scratch area; data is deleted after 90 days
• Network
  – Expand the Infiniband spine to facilitate further expansion and the addition of nodes by research groups
• Update of the NFS-mounted data store – TBD
• Older AMD nodes to be decommissioned
• Coming by June 2014

Page 15: HPC@sheffield  - where do we go next?

Hardware Access Routes

• Research groups can:
  1. Access the general facility – fair share using the system scheduler
  2. Purchase hardware and host it on Iceberg
  3. Purchase a number of compute nodes for a specified period of time

• The purchase route provides dedicated access to a resource effectively ‘owned’ by the researcher

Page 16: HPC@sheffield  - where do we go next?

Client Access to Visualisation Cluster

[Diagram: VirtualGL Client → Iceberg / Campus Compute Cloud → VirtualGL Server (NVIDIA GPU)]

Page 17: HPC@sheffield  - where do we go next?

Applications Supported Using Graphical Processing Units

• Ansys
• Matlab
• Flame
• Beast
• Abaqus
• Smoothed Particle Hydrodynamics (DualSPHysics)
• Astronomy
  – MHD applications (SMAUG, Athena)
  – NBODY6

Page 18: HPC@sheffield  - where do we go next?

N8 HPC

• 316 nodes (5,056 cores) with 4GB of RAM per core
• 16 nodes (256 cores) with 16GB of RAM per core
• Sandy Bridge architecture
• Nodes connected by InfiniBand
• 110 Tflop/s peak
• 174TB Lustre v2 parallel file system
• 109TB NFS backed-up file system

www.n8hpc.org.uk
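The quoted 110 Tflop/s peak is consistent with the core counts above if one assumes 2.6 GHz Sandy Bridge cores performing 8 double-precision flops per cycle (both the clock rate and flops per cycle are assumptions, not figures from the slide); a rough Python check:

```python
# Back-of-envelope check of the ~110 Tflop/s peak figure for N8 HPC.
# Assumed, not from the slide: 2.6 GHz clock and 8 DP flops/cycle (AVX).
cores = 5056 + 256                  # core counts quoted on the slide
peak_flops = cores * 8 * 2.6e9      # flops/cycle * cycles/second
print(round(peak_flops / 1e12), "Tflop/s")   # ~110 Tflop/s
```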

Page 19: HPC@sheffield  - where do we go next?

N8 HPC - objectives

• Seed engagement between industry and academia around research using e-infrastructure

• Develop skills in the use of e-infrastructure across the N8 partnership

• Share the asset of skills and equipment across the N8 partnership via the facilitation of networks of people

Page 20: HPC@sheffield  - where do we go next?

What people have done on N8

MySpine – Damian Lacroix, Sheffield

Studies of turbulence – Shuisheng He

Biomolecules on surfaces – Shaun Hall

Page 21: HPC@sheffield  - where do we go next?

N8 HPC - access

• You need an N8 HPC Project Account. We ask that PIs are permanent members of staff at their institution.

• Contact the local HPC representative at Sheffield – [email protected] or Iceberg admin.

• You will need to produce a project proposal for consideration by the local committee.

• You will be expected to produce a report saying what you have done with the resources.

Page 22: HPC@sheffield  - where do we go next?

ARCHER (www.archer.ac.uk)

Page 23: HPC@sheffield  - where do we go next?
Page 24: HPC@sheffield  - where do we go next?

What there is (Stage 1)

• ARCHER compute nodes contain two 2.7 GHz, 12-core E5-2697 v2 (Ivy Bridge) series processors (2632 nodes)

• Standard compute nodes on ARCHER have 64 GB of memory shared between the two processors; a smaller number of high-memory nodes (376 nodes) have 128 GB of memory shared between the two processors

• In all, 3008 nodes; 72,192 cores.
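The headline totals follow directly from the node counts above; a trivial check using only figures from the slide:

```python
# ARCHER Stage 1 totals: 2,632 standard nodes + 376 high-memory nodes,
# each with two 12-core Ivy Bridge processors.
nodes = 2632 + 376
cores = nodes * 2 * 12
print(nodes, "nodes,", cores, "cores")   # 3008 nodes, 72192 cores
```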

Page 25: HPC@sheffield  - where do we go next?

Getting access

• ARCHER access through EPSRC and NERC grants – the process is similar to HECToR. You need to submit a technical assessment before the grant application and attach the result to the Je-S form.

• ARCHER Instant Access – means what it says; limited amounts for pump-priming

• ARCHER Access Through the Resource Allocation Panel – stand-alone access request. Next closing date is May 14th (EPSRC remit)

• ARCHER access through a consortium – there may be one in your area; need to contact PI.

Page 26: HPC@sheffield  - where do we go next?
Page 27: HPC@sheffield  - where do we go next?
Page 28: HPC@sheffield  - where do we go next?

Some facilities available

Page 29: HPC@sheffield  - where do we go next?

Getting access

• PRACE publishes its Calls for Proposals for Project Access twice per year according to a fixed schedule:

• Call opens in February > Access is provided starting September of the same year.

• Call opens in September > Access is provided starting March of the next year.

• See www.prace-ri.eu for details.

Page 30: HPC@sheffield  - where do we go next?

Some issues

• High Performance Computing is essential, but it does not stand on its own. Simulations are simulations of a real-world problem and must not become divorced from it. All simulation has a multidisciplinary side.

• People can always use more computing power – but getting good, reliable software is even better (and much harder)

• Hardware is useless without technical support and training – and these tend to be much harder to fund.

• Data is becoming a major issue – its capture, curation and analysis. We are already into petabytes here at Sheffield. The world is into exabytes at least.

Page 31: HPC@sheffield  - where do we go next?

Closing remarks

• Industry tends to engage with academia on problems they need answering
  – This needs an expertise offering, even if they approach us for cycles
• Trusted relationships take time
  – Important to maintain channels that work
• HPC is part of a broader offering
• Critical to build the right team
  – Involve skills and resources in national centres
• Must listen to, and provide a forum for, industry
  – Shape and maintain a relevant offering to industry
  – Embed in existing activities: the N8 industry innovation forum
• Links to universities
  – Appropriate skills emerging from our graduates

Page 32: HPC@sheffield  - where do we go next?

Main Goals we would like to achieve in SP2RC

1. Forward modelling of the excitation, propagation and dissipation of MHD waves in the solar atmosphere, based on single or multiple highly localised magnetic flux tubes (e.g. magnetic bright points/pores, magnetic elements of active regions, vortices and inter-granular lanes), applying realistic sources of wave excitation (including those derived from high-resolution observations) and using one- and two-fluid MPI/GPU MHD codes.

2. To determine the role of Alfvén, kink and longitudinal waves in transporting wave energy from the photosphere to the upper layers of the solar atmosphere.

3. To perform multidimensional two-fluid MHD numerical simulations at various spatial and temporal scales within the chromosphere, transition region and low corona, taking into account the crucial solar plasma processes (heat conduction, ionisation, proton–neutral hydrogen collisions) in order to obtain an accurate description of the classic wave dynamics and line synthesis.

4. To apply realistic 3D MHD models of the solar atmosphere to investigate the mechanisms governing the transport of energy from the solar photosphere to the solar corona.

Page 33: HPC@sheffield  - where do we go next?

Orszag-Tang Test

200x200 Model at t=0.1, t=0.26, t=0.42 and t=0.58s
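For reference, a minimal sketch of the commonly used dimensionless Orszag-Tang vortex initial condition on a 200x200 grid; the exact SAC/SMAUG setup and normalisation may differ:

```python
# Standard Orszag-Tang vortex initial condition (dimensionless form,
# gamma = 5/3) on a periodic 200x200 grid over [0, 2*pi)^2.
import numpy as np

n = 200                                   # 200x200 grid, as on the slide
x = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
X, Y = np.meshgrid(x, x, indexing="ij")

gamma = 5.0 / 3.0
rho = np.full_like(X, gamma**2)           # uniform density
p = np.full_like(X, gamma)                # uniform pressure
vx, vy = -np.sin(Y), np.sin(X)            # velocity vortex
bx, by = -np.sin(Y), np.sin(2.0 * X)      # magnetic field

print(rho.shape, vx.mean(), bx.std())     # quick sanity check
```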

Page 34: HPC@sheffield  - where do we go next?

Timing for Orszag-Tang using SAC/SMAUG with different architectures

[Chart: time for 100 iterations (seconds) against grid dimension (up to ~5000), comparing NVIDIA M2070, NVIDIA K20, NVIDIA K40, Intel E5-2670 (8 cores), K20 (2x2) and K20 (4x4)]