NERSC-9
Nicholas J. Wright, NERSC-9 Chief Architect, NUG meeting, March 24, LBNL
3/24/16
NERSC Timeline
[Figure: timeline with year markers 2015, 2016, 2016-18, 2020, 2021, 2024, 2028. Milestones shown: NRP complete (12.5 MW); Edison move complete; staff move in; NERSC-8 Cori Phase I; NERSC-8 Cori Phase II; CRT 25 MW upgrade; NERSC-9 (150-300 Petaflops); CRT 35+ MW upgrade; NERSC-10 (capable exascale for broad science); NERSC-11 (5-10 Exaflops)]
APEX 2020 Current Status
• 3rd joint SC/NNSA procurement
• After Trinity/NERSC-8 (2016) & CORAL (2018)
• RFP draft technical specs released Nov. 10, 2015
• 2nd draft released March 11th
http://www.lanl.gov/projects/apex/_assets/docs/APEX2020_draft_tech_specs_v2.0.pdf
[Figure: schedule for 2015-2020 with bars labeled RFP, Contract, Delivery, Non-Recurring Engineering]
[Figure: energy per flop (pJ), log scale from 1.E+01 to 1.E+07, versus date from 1/1/1992 to 1/1/2024. Series: Heavyweight, Heavyweight Scaled, Heavyweight Constant, Lightweight, Lightweight Scaled, Lightweight Constant, Heterogeneous, Heterogeneous Scaled, Historical, CMOS Projection Hi Perf, CMOS Projection Low Power, UHPC Goal]
NERSC needs to transition to energy efficient architectures
Manycore or Hybrid is the only approach that crosses the exascale finish line
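The quantity plotted on this slide, energy per flop, is sustained power divided by floating-point rate. The sketch below works one such number as an illustration. The 20 MW power envelope is an assumption chosen for the example and does not come from these slides; the function name is hypothetical.

```python
def picojoules_per_flop(power_watts: float, flops_per_second: float) -> float:
    """Energy spent per floating-point operation, in picojoules."""
    if flops_per_second <= 0:
        raise ValueError("flops_per_second must be positive")
    joules_per_flop = power_watts / flops_per_second
    return joules_per_flop * 1e12  # 1 J = 1e12 pJ


# Assumption for illustration only: a 20 MW machine sustaining 1 exaflop/s.
ASSUMED_POWER_W = 20e6
EXAFLOP_PER_S = 1e18

budget_pj = picojoules_per_flop(ASSUMED_POWER_W, EXAFLOP_PER_S)
print(f"Energy budget: about {budget_pj:.0f} pJ per flop")
```

Under that assumed envelope the budget comes out at about 20 pJ per flop, which can be compared against the log-scale axis of the figure.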
Intel Federal LLC Proprietary The information on this page is subject to the use and disclosure restrictions provided on the second page to this document.
Throughput vs Single Thread: Perf Trade-off
[Figure: relative IPC (0.00 to 1.40) versus Normalized Power (22nm) (0.00 to 1.80). Data-point labels: SKL, BDW, HSW, IVB, SNB, NHM, PNR, MRM, YNH, BNS, SLM, STW, GLM, PLM, TMT]
Belli Kuttanna
2.5-3.5x power, <2x freq
Haswell : Silvermont, IPC: ~3x, Power: ~5x
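Read as ratios, the Haswell-to-Silvermont comparison implies the smaller core does more work per unit of power. A minimal sketch of that arithmetic follows. It treats IPC as a stand-in for performance, which is an assumption (clock frequency differences are ignored), and it uses the approximate ratios from the slide; the function name is hypothetical.

```python
def relative_perf_per_watt(perf_ratio: float, power_ratio: float) -> float:
    """Performance-per-watt of core A relative to core B,
    given A/B ratios for performance and for power."""
    return perf_ratio / power_ratio


# Approximate Haswell/Silvermont ratios quoted on the slide.
IPC_RATIO = 3.0    # Haswell delivers ~3x the IPC
POWER_RATIO = 5.0  # ...for ~5x the power

haswell_vs_silvermont = relative_perf_per_watt(IPC_RATIO, POWER_RATIO)
silvermont_vs_haswell = 1.0 / haswell_vs_silvermont

print(f"Haswell IPC per watt relative to Silvermont: {haswell_vs_silvermont:.2f}x")
print(f"Silvermont IPC per watt relative to Haswell: {silvermont_vs_haswell:.2f}x")
```

With these rounded inputs the lightweight core comes out roughly 1.7x better in IPC per watt, which is the trade-off the throughput-versus-single-thread plot illustrates.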
Abstract Machine Model for Exascale
[Figure: abstract machine model diagram. Components: Fat Cores; Thin Cores / Accelerators; Core; Coherence Domain; 3D Stacked Memory (Low Capacity, High Bandwidth); DRAM / NVRAM (High Capacity, Low Bandwidth); Integrated NIC for Off-Chip Communication]
Edison - 2012
[Figure: the same abstract machine model diagram]
Cori (NERSC-8) - 2016
[Figure: the same abstract machine model diagram]
NERSC-9 (2020)? An exascale-era architecture
Layer | NERSC-7 (Edison) 2013 | NERSC-8 (Cori) 2016 | NERSC-9 2020
High Bandwidth Memory per node | None | 16 GB, >400 GB/sec | More!
DRAM per node | 64 GB, ~100 GB/sec | 96 GB, 90-100 GB/sec | Some
NV-DIMM (byte addressable) | None | None | Maybe
Non-Volatile (page addressable) | None | 1.5 PB, 1.5 TB/sec | 10s PBs, 10s TB/sec
Spinning Disk - /scratch | 8 PB, 130 GB/sec | 28 PB, 700 GB/sec | Collapsed layer: >50 PBs, ~1 TB/sec
Spinning Disk - longer term (/project) | ~30 PB, ~70 GB/sec | ~50 PB, ~100 GB/sec | (part of collapsed layer)
Tape | ~40 PB, ~10 GB/sec | ~100 PB, ~20 GB/sec | ~100s PB, ~10s GB/sec
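One way to compare the layers in the table is the time needed to fill (or drain) a layer at its peak bandwidth, which is capacity divided by bandwidth. The sketch below does this for the Cori (NERSC-8) column. Decimal units (1 PB = 1000 TB) are an assumption, the "~" values are treated as exact, and the helper name is hypothetical.

```python
GB = 1e9
TB = 1e12
PB = 1e15


def fill_time_seconds(capacity_bytes: float, bandwidth_bytes_per_s: float) -> float:
    """Time to write the full capacity of a layer at its peak bandwidth."""
    return capacity_bytes / bandwidth_bytes_per_s


# (capacity, bandwidth) for Cori, taken from the table above.
cori_layers = {
    "Non-volatile (page addressable)": (1.5 * PB, 1.5 * TB),
    "Spinning disk /scratch": (28 * PB, 700 * GB),
    "Tape": (100 * PB, 20 * GB),  # table values are approximate ("~")
}

fill_times = {
    name: fill_time_seconds(capacity, bandwidth)
    for name, (capacity, bandwidth) in cori_layers.items()
}

for name, seconds in fill_times.items():
    print(f"{name}: {seconds:,.0f} s ({seconds / 3600:,.1f} h)")
```

With these figures the non-volatile layer turns over in about 1,000 seconds while /scratch takes about 40,000 seconds, which is one way to see why NVRAM is attractive for bandwidth while disk and tape are sized for capacity.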
• NVRAM technologies are cost effective for bandwidth today
  - Burst Buffers in Trinity/Cori (2016) & CORAL (2018)
• In 2020
  - Will any spinning disk be needed for capacity? Cost is the limiting factor
  - NVRAM: How much? What kind(s)? Where to put it in the machine? What software (runtime/scheduler/OS) enhancements will be needed?
• Workflows!
  - Fusion, Climate, QCD, ALS, JGI, Materials, Sky Survey
  https://www.nersc.gov/assets/apex-workflows-v2.pdf
Market Survey: Storage Technologies are Changing
APEX will Define Workflows to Optimize Platform Storage
• A workflow is a description of the steps needed to obtain results in a scientific investigation
• The workflow lifecycle typically consists of many computational and data transformation steps
  - Running simulations and/or experiments
  - Analyzing output data
  - Managing data to aid the scientific investigation, including collecting information to benefit future studies and help future validation of results
• Whitepaper released which describes other storage use cases present in APEX workflows
  - Based upon extensive requirements gathering exercise
  - Includes estimates of data volumes and lifetimes for multiple NERSC, LANL, LLNL and SNL workflows
• Overall goal is to provide a framework to reason about platform storage design decisions
  - Allows vendor to innovate and be flexible
[Figure: Simulation Science Pipeline. Vertical axis: Data Retention Time, from Temporary to Forever. Phases S1 to S5, with activities labeled Setup/Parameterize/Create Geometry, Simulate Physics, Down-Sample, Post-Process, Viz. Data products: Initial Input Deck, Sim Input Deck, Initial State, Checkpoint Dump, Timestep Data Set, Sampled Data Set, Analysis Data Set. Other labels: Job Begin, Job End, Campaign, Γ*JMTTI. Frequency annotations: 4-8x per week, 5-15x per pipeline, 5-10x per week]
Simulation Science Pipeline
[Figure: HTC and UQ science pipelines. Vertical axis: Data Retention Time, from Temporary to Forever. Phases H1 to H3 (HTC) and U1 to U3 (UQ), with activities labeled Generate and/or Gather Input Data, HTC Analysis or UQ Simulation, Analysis. Data products: Shared Input, Private Input, Checkpoint Dump, File-based Comm., Analysis Data Sets. Other labels: Campaign. Frequency annotations: 4-8x per week, 5-15x per pipeline]
HTC Science Pipeline
UQ Science Pipeline
Target System Configuration
 | NERSC-8 | NERSC-9 Target
SSP | >5x Edison | >20x Edison
Baseline Memory Capacity | 1.1 PB | >3 PiB
Burst Buffer | 1.5 PB, 1.5 TB/s | >90 PB, >5 TB/s
Disk | 22 PB, 744 GB/s |
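The targets mix decimal (PB) and binary (PiB) units, so the step from NERSC-8 to NERSC-9 is easier to see once both are converted to bytes. The sketch below computes the implied ratios. Treating the ">" lower bounds as plain numbers is a simplification, and the variable names are hypothetical.

```python
PB = 10**15   # petabyte (decimal)
PIB = 2**50   # pebibyte (binary)
TB = 10**12

# NERSC-8 values and NERSC-9 lower bounds from the table above.
nersc8 = {"ssp_vs_edison": 5.0, "memory_bytes": 1.1 * PB,
          "bb_bytes": 1.5 * PB, "bb_bytes_per_s": 1.5 * TB}
nersc9_min = {"ssp_vs_edison": 20.0, "memory_bytes": 3 * PIB,
              "bb_bytes": 90 * PB, "bb_bytes_per_s": 5 * TB}

# Ratio of each NERSC-9 lower bound to the NERSC-8 value.
ratios = {key: nersc9_min[key] / nersc8[key] for key in nersc8}

for key, value in ratios.items():
    print(f"{key}: at least {value:.2f}x NERSC-8")
```

Because 3 PiB is about 3.38 PB, the memory target is a little over 3x Cori, while the burst buffer capacity target is 60x and the SSP target is 4x.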
Market Surveys have Formed the Basis of our Requirements Development
• The Crossroads/NERSC-9 (CN9) teams had many formal (Face-to-Face) and informal (telecon) interactions with vendors over the last 15 months
  - Interactions continue leading up to the RFP release
• Market Surveys and interactions focused on major prime and technology provider candidates:
• NERSC workload analysis performed as part of the procurement activities
  - http://portal.nersc.gov/project/mpccc/baustin/NERSC_2014_Workload_Analysis_30Oct2015.pdf
• NERSC has held one requirements workshop per office looking at 2017 requirements
  - http://www.nersc.gov/science/hpc-requirements-reviews
Technical Specifications Include Findings From Workload Analysis and Requirements Workshops
[Figure: pie chart of application codes by hours used. Labels as extracted: VASP, MILC, Espresso, CESM, GYRO, LAMMPS, NAMD, chroma, xgc, gtc, M3D, WRF, AMD, tgyro, cp2k, BerkeleyGW, qlua, S3D, osiris, gts, Gaussian, EWI3D, sextet.x, NWCHEM, nimrod, madam_toast, ART, run_wmc, phoenix, ChomboCrunch, overlap_inverter, gene, effBeam, EMGeo, python-mpi, Nyx, transFn, Gromacs, molpro, NCAR-LES, CompoaRun, xaorsa, Gadget, lsppstg, DLPOLY, elm_6f, Amber, >600 Others]
• 13 codes make up 50% of workload
• 25 codes make up 66% of workload
• 50 codes make up 80% of workload
• Remaining codes (over 600) make up 20% of workload.
Over 650 applications run on NERSC resources
Top application codes on Hopper and Edison by hours used, Jan-Dec 2014
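The cumulative figures above can be turned into per-tier shares to show how concentrated the workload is. The sketch below is a small worked calculation using only the numbers on the slide; the function and variable names are hypothetical.

```python
# (number of top codes, cumulative percent of workload), as listed on the slide.
CUMULATIVE = [(13, 50.0), (25, 66.0), (50, 80.0)]


def tiers(cumulative):
    """Split cumulative coverage into tiers of
    (codes in tier, percent of workload in tier, average percent per code)."""
    result = []
    prev_codes, prev_pct = 0, 0.0
    for codes, pct in cumulative:
        n_codes = codes - prev_codes
        share = pct - prev_pct
        result.append((n_codes, share, share / n_codes))
        prev_codes, prev_pct = codes, pct
    return result


TIERS = tiers(CUMULATIVE)
REMAINING_PCT = 100.0 - CUMULATIVE[-1][1]  # share left for the 600+ other codes

for n_codes, share, per_code in TIERS:
    print(f"{n_codes} codes: {share:.0f}% of workload, {per_code:.2f}% per code on average")
print(f"Remaining codes: {REMAINING_PCT:.0f}% of workload")
```

The top 13 codes average almost 4% of the workload each, while the next 12 average a little over 1% each, which supports representing the workload with a small set of benchmarks.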
NERSC Benchmarks Were Chosen to Represent the Workload
[Figure: top algorithms on NERSC systems by core hours used, Jan-Dec 2014. Categories: Density Functional Theory, Lattice QCD, Molecular Dynamics, Continuum Fusion, Bio-Informatics, PIC Fusion, Climate, Scalable Solvers, Quantum Chemistry, CMB, Seismic, PDSF]
• Regrouped top codes by similar algorithms.
• A small number of benchmarks can represent a large fraction of the workload.
  - miniDFT
  - MILC
  - GTC
  - Meraculous
• Includes Genepool and PDSF systems.
APEX plans to use "mini-apps", some full apps for system evaluation
Mini App | Description | Language
miniDFT (Quantum Espresso) | Plane-wave Density Functional Theory (DFT) | Fortran
MILC | Lattice Quantum Chromodynamics (QCD). Sparse matrix inversion, CG | C
GTC-P | Particle-in-cell magnetic fusion | C
UMT | Unstructured-Mesh deterministic radiation Transport | C/C++/Fortran
SNAP | Neutron particle transport application | Fortran
PENNANT | Unstructured finite element | C
Meraculous | De novo genome assembly | UPC
MiniPIC | Particle in cell for accelerators | C++
HPCG | High Performance Conjugate Gradient | C
1. Provide a significant increase in computational capabilities over the Edison system, at least 16x on a set of representative DOE benchmarks
2. Platform needs to meet the needs of extreme computing and data users by accelerating workflow performance
3. Platform should provide a vehicle for the demonstration and development of exascale-era technologies
4. Delivery in the 2020 timeframe
Goals and Objectives for the NERSC-9 Project
• NERSC-9 will build upon the successes of the different data components of Cori
• End to end workflow requirements and performance are critical for the design and optimization of the system
• Overall goal is to enable seamless data motion with dynamic allocation and scheduling of resources
  - Enable first steps towards exascale-era storage system
  - Vendor community excited about engagement and collaboration opportunities
NERSC-9 Will Provide Capabilities for DOE Data-Intensive Users in 2020
APEX 2020: NRE on the Path to Exascale
• The APEX 2020 systems NRE topics will target areas that
  - achieve higher application performance,
  - improve support for data-intensive computing, and,
  - enable greater ease of use by advancing new technologies on the path to the exascale systems in 2023
• The Crossroads and NERSC-9 platforms NRE topics are
  - Technologies for the exploration of new and novel programming models concepts
  - A platform integrated storage system that supports new models for moving and managing data seamlessly
  - Systems with scalable management capabilities to enhance the reliability, resilience, power and energy usage characteristics
Summary
• NERSC-9 will be a 2020 machine that meets the needs of all NERSC users
• NERSC will continue its NESAP program in support of NERSC-9
• NERSC will partner with vendors on Non-Recurring Engineering projects to maximize the usability and performance of the machine
Questions?
The Application Transition Program is designed to continue users on the path to exascale
• Technical specifications ask for a Center of Excellence
  - Establishment of a collaboration between the Labs, the chosen OEM, and key technology providers, e.g. processor, is essential to meet the goal of making efficient use of the platform in a timely manner
• Center of Excellence (CoE) based upon previous DOE efforts
  - NERSC Exascale Scientific Applications Program (NESAP)
  - CAAR & ESP programs at ORNL & ANL
• Center of Excellence (CoE) leverages some or all of:
  - SSI metric applications
  - NERSC Exascale Scientific Applications Program (NESAP)
  - Select applications expected to use the machine shortly after operational readiness/acceptance
The Application Transition Program will provide development resources for users
• Early access to key technologies and programming environments is essential for application transition
  - Programming environment is crucial
• Access to emulation and simulation capabilities as early as possible
  - key contribution of technology providers
• Early Access Development System
  - One or more iterations of increasing scale
    • Eventually 2-10% of final system size
• Development testbeds
  - To investigate select advanced technology areas
    • E.g. Network, power management, burst buffer
  - Same or different composition of hardware depending on topic
APEX Non-Recurring Engineering (NRE): Philosophy
• Technical Specifications ask for NRE proposals
• NRE contracts potentially 10-15% of platform budgets
• Other topics that have potential to impact path to exascale will be considered
• Focus on topics that provide added value beyond planned vendor roadmap activities
• NRE collaborations will have impact on follow-on platforms procured by the U.S. Department of Energy's NNSA and Office of Science.