First public disclosure of Isambard performance results

Prof Simon McIntosh-Smith, University of Bristol, UK

@simonmcs http://gw4.ac.uk/isambard/



Why explore Arm-based supercomputers?

• The architecture development is driven by the fast-growing mobile space
• Multiple vendors of Arm-CPUs:
  • Greater competition
  • More choice
  • Rapid innovations, e.g. in vector instruction set
• Mont-Blanc proved the approach is feasible


I. K. Brunel, 1804-1859

'Isambard', a new Tier 2 HPC service from GW4. Named in honour of Isambard Kingdom Brunel.


Tier-2 HPC

The tiered model of HPC provision:

• Tier 0: international
• Tier 1: national
• Tier 2: regional
• Tier 3

[Figure: diagram of the tiered model; one residual label reads "DoE"]


Isambard system specification (red = new info):

• Cray “Scout” system – XC50 series
  • Aries interconnect
• 10,000+ Armv8 cores
  • Cavium ThunderX2 processors
  • 2x 32 core @ >2GHz per node
• Cray software tools
• Technology comparison:
  • x86, Xeon Phi, Pascal GPUs
• Phase 1 installed March 2017
• The Arm part arrives early 2018
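The core count and node layout in the specification imply a lower bound on the number of Arm nodes. The following is a minimal sketch of that arithmetic, assuming "10,000+" is treated as a floor of 10,000 cores and that every node is a full 2x 32 core node; the variable names are illustrative, and the actual node count is not stated on the slide.

```python
import math

# Figures taken from the slide; TARGET_CORES is a lower bound ("10,000+").
TARGET_CORES = 10_000
SOCKETS_PER_NODE = 2      # "2x 32 core ... per node"
CORES_PER_SOCKET = 32

cores_per_node = SOCKETS_PER_NODE * CORES_PER_SOCKET

# Smallest whole number of nodes that reaches the target core count.
min_nodes = math.ceil(TARGET_CORES / cores_per_node)

print(f"cores per node: {cores_per_node}")
print(f"minimum nodes for {TARGET_CORES}+ cores: {min_nodes}")
print(f"cores at that node count: {min_nodes * cores_per_node}")
```

This only bounds the node count from below; the installed system may well have more nodes than this estimate.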


• The Isambard project’s focus will be on the top 10 most heavily used codes on Archer in 2017:
  • VASP, CASTEP, GROMACS, CP2K, UM, HYDRA, NAMD, Oasis, SBLI, NEMO
  • Note: 8 of these 10 codes are written in FORTRAN
• Additional important codes for project partners:
  • OpenFOAM, OpenIFS, WRF, CASINO, LAMMPS, …
• We want to collaborate wherever possible!
• Accelerate the adoption of Arm in HPC


Single-socket comparison of Broadwell, Skylake, and ThunderX2
Performance (normalized to Broadwell)

Benchmark        Broadwell (18 cores)   Skylake (22 cores)   ThunderX2 (32 cores)
STREAM triad     1.00                   1.55                 2.09
CloverLeaf 2D    1.00                   1.56                 1.72
CloverLeaf 3D    1.00                   1.51                 1.76
TeaLeaf 2D       1.00                   1.65                 2.22
TeaLeaf 3D       1.00                   1.41                 1.66
SNAP nang=10     1.00                   1.41                 1.43
SNAP nang=136    1.00                   1.40                 1.06
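One way to summarise the chart above is a geometric mean of the normalized values for each processor, optionally rescaled by core count. The sketch below assumes the bar labels are read in legend order (seven Broadwell values of 1.00, then seven Skylake values, then seven ThunderX2 values); the geometric mean and the per-core rescaling are illustrative summaries added here, not figures from the talk.

```python
import math

# Normalized performance as read from the chart (Broadwell = 1.00 throughout).
# ASSUMPTION: values are grouped per processor in legend order.
skylake = [1.55, 1.56, 1.51, 1.65, 1.41, 1.41, 1.40]
thunderx2 = [2.09, 1.72, 1.76, 2.22, 1.66, 1.43, 1.06]

# Core counts from the chart legend.
CORES = {"Broadwell": 18, "Skylake": 22, "ThunderX2": 32}


def geometric_mean(ratios):
    """Geometric mean, the usual average for normalized ratios."""
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


def per_core(ratio, cores, baseline_cores=CORES["Broadwell"]):
    """Rescale a socket-level ratio to a per-core ratio against the baseline."""
    return ratio * baseline_cores / cores


gm_skylake = geometric_mean(skylake)
gm_thunderx2 = geometric_mean(thunderx2)

print(f"Skylake   geomean: {gm_skylake:.2f}, per core: {per_core(gm_skylake, CORES['Skylake']):.2f}")
print(f"ThunderX2 geomean: {gm_thunderx2:.2f}, per core: {per_core(gm_thunderx2, CORES['ThunderX2']):.2f}")
```

The geometric mean does not depend on which benchmark each bar belongs to, only on which processor it belongs to, so it is robust to any ambiguity in the benchmark ordering.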


Single-socket comparison of Broadwell and ThunderX2
Performance (normalized to Broadwell)

Benchmark    Broadwell (18 cores)   ThunderX2 (32 cores)
GROMACS      1.00                   0.85
UM AMIP      1.00                   1.48
NEMO *       1.00                   1.25
OpenFOAM     1.00                   2.01

* = NEMO runs from a 28 core, 2.0GHz TX2

UM AMIP and NEMO benchmarked by the UK’s Met Office


Details for benchmark comparisons:

• ThunderX2 early access systems
  • 32c, 2.5GHz, 2667MHz DDR4, Ubuntu 16.04
  • 28c, 2.0GHz, 2400MHz DDR4, SLES 12 SP3
  • All alpha release (pre-production) hardware
• Broadwell system
  • 18c, 2.1GHz, 2400MHz DDR4, Xeon E5-2695 v4
• Skylake system
  • 22c, 2.1GHz, 2667MHz DDR4, Xeon Gold 6152
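The memory speeds listed above help explain the bandwidth-bound results. The sketch below estimates theoretical peak memory bandwidth per socket. The memory speeds come from the slide (read as mega-transfers per second); the channel counts per socket (4 for Broadwell, 6 for Skylake, 8 for ThunderX2) are an assumption taken from the vendors' published specifications and do not appear in the talk.

```python
BYTES_PER_TRANSFER = 8  # a 64-bit DDR4 channel moves 8 bytes per transfer

# name: (memory speed in MT/s from the slide, channels per socket - ASSUMED)
systems = {
    "Broadwell E5-2695 v4": (2400, 4),
    "Skylake Gold 6152": (2667, 6),
    "ThunderX2 32c": (2667, 8),
}


def peak_gb_per_s(mega_transfers, channels):
    """Theoretical peak bandwidth of one socket in GB/s (10^9 bytes/s)."""
    return mega_transfers * 1e6 * BYTES_PER_TRANSFER * channels / 1e9


baseline = peak_gb_per_s(*systems["Broadwell E5-2695 v4"])
for name, (speed, channels) in systems.items():
    peak = peak_gb_per_s(speed, channels)
    print(f"{name}: {peak:.1f} GB/s peak, {peak / baseline:.2f}x Broadwell")
```

Under these assumptions the peak ratios bracket the measured STREAM triad ratios; sustained bandwidth is always somewhat below the theoretical peak, so the two are not expected to match exactly.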


Software used for ThunderX2:

• UM, NEMO and TeaLeaf: Cray CCE 8.6.4
  • UM and NEMO results produced by the UK’s Met Office
• GROMACS, OpenFOAM and STREAM: GCC 7.1
• CloverLeaf: 2D armflang 18.0, 3D armflang 1.4
• SNAP: nang=10 armflang 1.4, nang=136 CCE 8.6.3

Software used for Broadwell:

• UM and NEMO: Cray CCE 8.5.8, produced by the Met Office
• GROMACS: GCC 7.1
• Intel 2017 compiler for everything else

Software used for Skylake:

• Intel 2018 compiler


Benchmark test case parameters:

• UM AMIP (v10.8): 6 day
• NEMO: GYRE_PISCES, idealised calculation, 720 time steps
• GROMACS: rnase_cubic
• OpenFOAM: motorBike
• STREAM: test size of 2^25 double precision elements per array with 100 iterations
• CloverLeaf: 2D bm_16, 3D bm1s_short
• TeaLeaf: 2D bm_5, 3D bm_30.04
• SNAP: 1024x16x(NC/2), ng=32
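To make the STREAM configuration concrete, the sketch below works out the memory footprint and the bytes counted per triad pass. It reads the slide's array size as 2^25 elements (the superscript appears to have been lost in extraction) and assumes the standard STREAM convention of counting two reads and one write per element. The function names are hypothetical, and the tiny pure-Python run only demonstrates the kernel; it measures interpreter overhead, not memory bandwidth.

```python
import time

N_FULL = 2 ** 25          # elements per array, as read from the slide
BYTES_PER_DOUBLE = 8
ARRAYS = 3                # triad uses a, b and c


def triad_bytes(n):
    """Bytes counted per triad pass: read b, read c, write a."""
    return ARRAYS * BYTES_PER_DOUBLE * n


def triad(a, b, c, scalar):
    """The triad kernel: a[i] = b[i] + scalar * c[i]."""
    for i in range(len(a)):
        a[i] = b[i] + scalar * c[i]


def bandwidth_gb_per_s(n, seconds):
    """Bandwidth implied by one triad pass over n elements."""
    return triad_bytes(n) / seconds / 1e9


footprint_mib = ARRAYS * N_FULL * BYTES_PER_DOUBLE / 2 ** 20
print(f"footprint at full size: {footprint_mib:.0f} MiB")
print(f"bytes per triad pass at full size: {triad_bytes(N_FULL)}")

# Small demonstration run (NOT a bandwidth measurement).
n = 1000
a, b, c = [0.0] * n, [1.0] * n, [2.0] * n
start = time.perf_counter()
triad(a, b, c, 3.0)
elapsed = time.perf_counter() - start
print(f"demo pass over {n} elements took {elapsed:.6f} s")
```

A working set of this size is far larger than the last-level caches of the processors compared, which is what makes the triad figure a measure of main-memory bandwidth.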


Other applications being ported to Isambard include:

VASP, CASTEP, CP2K, HYDRA, NAMD, Oasis, SBLI, OpenIFS, WRF, CASINO, LAMMPS…

Early results suggest that for compute-bound applications such as GROMACS, CP2K and VASP, performance between the different processors is closer than for memory-bandwidth-bound codes. This is because, while the codes benefit from the wider vector units of the x86 processors, ThunderX2 compensates with higher core counts and clock speeds.
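The trade-off between vector width, core count and clock speed can be illustrated with a rough peak floating-point estimate. In the sketch below, core counts and nominal clock speeds come from the benchmark details in this talk; the vector widths (256-bit AVX2, 512-bit AVX-512, 128-bit NEON) and two fused multiply-add units per core are assumptions based on the publicly documented microarchitectures. The estimate ignores turbo and any clock reduction under wide-vector load, so it is an upper bound, not a prediction of application performance.

```python
# cores and GHz from the talk; vector_bits and fma_units are ASSUMPTIONS.
cpus = {
    "Broadwell E5-2695 v4": {"cores": 18, "ghz": 2.1, "vector_bits": 256, "fma_units": 2},
    "Skylake Gold 6152": {"cores": 22, "ghz": 2.1, "vector_bits": 512, "fma_units": 2},
    "ThunderX2 32c": {"cores": 32, "ghz": 2.5, "vector_bits": 128, "fma_units": 2},
}


def peak_dp_gflops(cores, ghz, vector_bits, fma_units):
    """Nominal peak double-precision GFLOP/s for one socket."""
    lanes = vector_bits // 64                 # 64-bit doubles per vector
    flops_per_cycle = lanes * 2 * fma_units   # an FMA counts as 2 flops
    return cores * ghz * flops_per_cycle


for name, spec in cpus.items():
    print(f"{name}: {peak_dp_gflops(**spec):.1f} GFLOP/s nominal peak")
```

Under these assumptions ThunderX2's nominal peak lands close to Broadwell's despite vectors half as wide, which is consistent with the observation above that more cores and a higher clock offset the narrower vector units.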


For more information:

• http://gw4.ac.uk/isambard/

• http://investors.cray.com/phoenix.zhtml?c=98390&p=irol-newsArticle&ID=2316352

• https://github.com/UoB-HPC/GW4-Isambard

• Twitter: @simonmcs

• Email: [email protected]
