The Swiss Initiative for High-Performance Computing and Networking
Neil Stringfellow, Associate Director CSCS
Centro Svizzero di Calcolo Scientifico (CSCS), Swiss National Supercomputing Center
Established in 1991 by the Swiss Government as an autonomous unit of ETH Zurich
Located in Manno, near Lugano
Highly qualified, internationally recognized staff (41 FTE)
Develops, promotes, and provides leading-edge high-performance computing services to the Swiss research community
400 users working on 50 projects (status 2009)
Hosts and operates, on behalf of MeteoSwiss, the supercomputer for operational weather forecasts (8 simulations per day; first country in Europe to run high-resolution weather forecasts)
Distribution of compute time in 2008 by application areas
[Pie chart: user statistics by research field. Shares shown: 26%, 20%, 16%, 10%, 9%, 8%, 8%, 3%. Legend: Earth and Environmental Sciences, Chemistry, Physics, Materials Science, Biosciences, Astronomy, Fluid dynamics, Nanoscience]
Distribution of compute time in 2008 by institutions
[Pie chart: user statistics by institution. Shares shown: 59%, 13%, 9%, 6%, 5%, 4%, 2%, 1%, 1%. Legend: ETHZ, PSI, UNI-ZH, MCH, EPFL, UNI-BA, UNI-GE, EMPA, UNI-BE]
The national HPCN strategy
Issues
• HPC is a key requirement for leadership science as well as for a knowledge-based society and industry
• International competition in HPC is accelerating (USA, Japan, Germany, France, UK, Spain, China and India)
• Economy of scale is a basis of HPC
Answers
• Installation of a Petaflop/s computer by 2011/2012
• Construction of a new CSCS building
• Creation of a Swiss competence network to connect existing application areas and reach out to new ones
Stimulus Package
Swiss Stimulus Package: 700 Million CHF
2% of all stimulus money went to CSCS!
• 3.5 Million for building planning
• 10 Million for new machine
• 3 Million for HPC education
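The "2%" figure can be cross-checked against the three amounts listed above. The following is a minimal sketch that uses only the numbers on the slide; the variable names are hypothetical, and it assumes the three listed items make up the whole CSCS allocation.

```python
# Amounts in million CHF, taken from the slide.
STIMULUS_TOTAL_MCHF = 700.0
cscs_items_mchf = {
    "building planning": 3.5,
    "new machine": 10.0,
    "HPC education": 3.0,
}

# Assumption: these three items are everything CSCS received.
cscs_total_mchf = sum(cscs_items_mchf.values())
cscs_share_percent = 100.0 * cscs_total_mchf / STIMULUS_TOTAL_MCHF

print(f"CSCS total: {cscs_total_mchf} million CHF")
print(f"Share of stimulus package: {cscs_share_percent:.1f}%")
```

Under that assumption the three items add up to 16.5 million CHF, a little over 2% of the package, which agrees with the rounded figure on the slide.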
Cray XT5 – Monte Rosa
14,752 processor cores
• 1844 eight-way nodes
• 2 AMD 2.4 GHz “Shanghai” Opterons per node
• Upgrade underway to 2.4 GHz “Istanbul”
Peak performance 141 Tflop/s
• Linpack 117 Tflop/s
• Peak will be 212 Tflop/s after upgrade
29 Terabytes of memory
• 16 Gigabytes per node
• 2 Gbytes per processor core
287 Terabytes of scratch file system
• capable of ~12 GB/s sustained write bandwidth
23rd on Top500 list in June 2009
• 4th most powerful system in Europe
Already at 90% utilisation
• ~30% of jobs require > 50% of machine
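The peak-performance and memory figures above follow from the node count by simple arithmetic. The sketch below reproduces them; it assumes 4 double-precision floating-point operations per core per clock cycle (not stated on the slide) and 12 cores per node after the hex-core “Istanbul” upgrade. All names are hypothetical.

```python
# Figures from the slide.
NODES = 1844
CORES_PER_NODE = 8           # two quad-core "Shanghai" Opterons per node
CORES_PER_NODE_UPGRADED = 12 # assumption: two hex-core "Istanbul" Opterons per node
CLOCK_GHZ = 2.4
MEMORY_PER_NODE_GB = 16
LINPACK_TFLOPS = 117.0

# Assumption (not on the slide): 4 double-precision flops per core per cycle.
FLOPS_PER_CYCLE = 4


def peak_tflops(cores: int, clock_ghz: float,
                flops_per_cycle: int = FLOPS_PER_CYCLE) -> float:
    """Theoretical peak in Tflop/s: cores * GHz * flops per cycle / 1000."""
    return cores * clock_ghz * flops_per_cycle / 1000.0


cores_now = NODES * CORES_PER_NODE
cores_upgraded = NODES * CORES_PER_NODE_UPGRADED
peak_now = peak_tflops(cores_now, CLOCK_GHZ)
peak_upgraded = peak_tflops(cores_upgraded, CLOCK_GHZ)
linpack_efficiency = LINPACK_TFLOPS / peak_now
total_memory_tb = NODES * MEMORY_PER_NODE_GB / 1000.0

print(f"cores: {cores_now} now, {cores_upgraded} after upgrade")
print(f"peak: {peak_now:.1f} Tflop/s now, {peak_upgraded:.1f} Tflop/s after upgrade")
print(f"Linpack efficiency: {linpack_efficiency:.0%}")
print(f"total memory: {total_memory_tb:.1f} TB")
```

With these assumptions the computed peaks round to the 141 and 212 Tflop/s quoted on the slide, and the Linpack result corresponds to roughly 83% of peak.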
Pillars of the 21st-century scientific method
Theory (since antiquity)
combined with experiment (since Galilei & Newton)
and simulation (since Metropolis, Teller, von Neumann, Fermi, ... 1940s)
Excellence in Science requires leadership in all three areas: theory, experiment, and simulations
Invest in algorithms or computer hardware?
[Figure: relative performance (log scale, 1 to 1E10) versus year (1970 to 2000); labelled curve: computer speed]
(source: David Landau, UGA)
Simulations are necessary for scientific investigations to cope effectively with complex systems
Science is about discovery and understanding - those who come first get the credit
Simulations that use high-performance computing (HPC) have the competitive edge
Leadership in science requires leadership in simulation and leadership in HPC in particular
Role of science in Switzerland: why we are well positioned to make leading contributions to HPC
Switzerland puts a high value on scientific research and education and on maintaining international leadership in science and engineering
The density of internationally recognized computational scientists in Switzerland is very high, even when compared to the USA
Stable funding and flat hierarchies in Switzerland and particularly at ETH allow for a pragmatic, solution-oriented, and nimble response to new challenges and opportunities
Top 200 in Shanghai List
Computational Science in Switzerland
The density of internationally recognized computational scientists in Switzerland is very high
ETH Zurich
EPF Lausanne
University of Zurich
University of Basel
University of Bern
University of Geneva
EMPA
Paul Scherrer Institute
CSCS User Community
INSTITUTE FOR ATMOSPHERIC AND CLIMATE SCIENCE, ETH ZURICH, Switzerland
Potential Vorticity streamers are intrusions of stratospheric air into the troposphere. They affect various atmospheric processes, like heavy precipitation over the Alps.
ECHAM-HAM high-resolution simulations reliably capture the frequency at which potential vorticity streamers occur. Low resolution simulations underestimate their occurrence.
(master thesis A. Béguin, ETH Zurich, 2009)
[Figure: absolute streamer occurrence on 330 K during winter (DJF). Panels: 2.8° x 2.8° (T42L19); 1.9° x 1.9° (T63L31); 1.1° x 1.1° (T106L31); reference data (ERA40, 1° x 1°)]
Predicting the frequency of severe weather events in a changing climate:
high-resolution simulations are crucial
CSCS, Swiss National Supercomputing Center
[Figure: land / sea distribution and terrain height (colour scale: sea, 0 m to 2000 m) at 2.8° x 2.8° (T42), 1.9° x 1.9° (T63), and 1.1° x 1.1° (T106)]
Europe in ECHAM-HAM
High resolution is required to:
1) provide boundary conditions for nested regional model
2) compare model with regional scale observational data
For example: Italy, the Alps, or Denmark are missing at low resolution, 2.8° x 2.8°
Why resolution is such an issue for Switzerland
[Figure: panels at grid spacings of 70 km, 35 km, 8.8 km, 2.2 km, and 0.55 km; scale factors shown: 1X, 100X, 10,000X, 1,000,000X]
The Alpine area is very vulnerable to changes in the water cycle such as droughts, heat waves, and floods. Current projections of future changes in summer precipitation are highly uncertain.
Advantages of cloud-resolving climate models: (1) Better representation of the land surface, (2) Explicit representation of heavy precipitation (e.g. thunderstorms).
Better representation of the daily cycle of precipitation in summer periods (Hohenegger et al. 2008, MZ).
High-resolution cloud-resolving regional climate simulations: Towards improved simulations of the water cycle in a changing climate
Cloud resolving @ 2.2km
State-of-the-art @ 25km
Terrain height in the regional climate model at different resolutions
Importance of HPC for modelling other Natural Hazards in Switzerland
Climate and Weather
Avalanches
Energy
Astrophysics
Engineering
Earthquakes
• In 1356 Basel was destroyed by an earthquake.
• We now know that large earthquakes are more frequent than previously thought
• Earthquake modelling is important for planning nuclear power plant safety
Selected application areas for simulation based science and engineering in Switzerland
Climate and Weather
Materials science
Chemistry/Pharmaceutical
Biomedical
Energy
and many others
Astrophysics
Engineering
Simulations require a high-performance computing ecosystem
Local/institutional capacity computing
Capability computing at regional/national centers
Leadership
1. Prior to 2004: VASP code developed on workstations and clusters and runs on about 100 processors
2. Scale-out 2005: Algorithm and implementation adapted for leadership systems
3. Leadership runs 2006-2007: Production runs on leadership Cray XT3/4 system (~5000 processors)
4. Large simulations since 2008: Continue large simulations on capability systems
Strategic goals
In order to sustain a leading position in science, Switzerland has to develop leadership in HPC to support simulations, one of the three pillars of modern science
Sustainable implementation of the HPC ecosystem in Switzerland, which includes the national supercomputing center, institutional computing facilities, as well as effective mapping of models and methods onto modern HPC hardware
Establish strong relationships with leadership computing facilities around the world
Develop key components of HPC in Switzerland
• Method and algorithm development
• Programming models, languages, and architectures for HPC
• Sustained operations of national and institutional HPC systems
The ecosystem in numbers (peak performance)
2009 (today)
• Leadership: 1.5 PFlop/s (Jaguar @ ORNL/LCF, 200 XT5 cabinets)
• Capability computing at regional/national centers: 300 TFlop/s (Rosa @ CSCS, only 20 XT5 cabinets => 210 TFlop/s, infrastructure limited)
• Local/institutional capacity computing: 60 TFlop/s (EPFL: ~60 TFlop/s, UZH: ~60 TFlop/s, ETHZ: ~70 TFlop/s)
2011 (planned)
• Leadership: 20 PFlop/s (Sequoia (BG/Q) @ LLNL; LCF3: Argonne or Oak Ridge)
• Capability computing at regional/national centers: 4 PFlop/s (will require new building infrastructure at CSCS)
• Local/institutional capacity computing: 800 TFlop/s (think about this now!)
Each tier is a factor of 5x below the one above it.
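The 5x spacing between tiers can be checked directly from the peak-performance figures on this slide. The sketch below is a minimal illustration; the tier names and the reading of 300 TFlop/s as the tier value that Rosa (210 TFlop/s) falls short of are assumptions, not statements from the source.

```python
# Peak performance per tier in TFlop/s, as read from the slide.
# Order: leadership, national capability, institutional capacity.
peak_by_year_tflops = {
    "2009": [1500.0, 300.0, 60.0],
    "2011 (planned)": [20000.0, 4000.0, 800.0],
}

# Rosa's peak within the current infrastructure, from the slide.
ROSA_TFLOPS = 210.0


def tier_ratios(values):
    """Ratio of each tier to the tier directly below it."""
    return [upper / lower for upper, lower in zip(values, values[1:])]


ratios = {year: tier_ratios(values) for year, values in peak_by_year_tflops.items()}

# How far Rosa sits below the 2009 leadership system.
rosa_gap = peak_by_year_tflops["2009"][0] / ROSA_TFLOPS

for year, year_ratios in ratios.items():
    print(year, year_ratios)
print(f"Jaguar / Rosa: {rosa_gap:.1f}x")
```

Every adjacent pair of tiers comes out at exactly 5x in both years, while the infrastructure-limited Rosa sits about 7x below Jaguar.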
Elements of the Swiss HPCN Initiative
Swiss Platform for HP2C (2009-12):
• Simulation capabilities that make effective use of next generation supercomputers
• Establish HPC in CSE programs at Swiss universities
Hardware Phase I (2009-11):
• Upgrade Cray XT system at CSCS to maximum possible within current infrastructure
Develop new building infrastructure by 2012:
• State-of-the-art infrastructure that supports a machine footprint about a factor of 10 larger than today
Hardware Phase II (2012-15):
• Goal for CSCS is to host systems with performance of 20-25% compared to largest leadership system in the world
Experiences with upgrade in 2009
Implemented in record time!
• March: financing, decision & placement of order
• February through April: site preparations
• May: installation
• June: early users & acceptance
• July: part of CSCS user program
CSCS at maximum of current building capacity
• Current power usage 1.9 MW (99% of capacity)
• Running at maximum cooling capacity (frequent system shut-down in summer)
• Abandon memory upgrade in fall 2009
• No room to further grow computer systems in the future
New building planned in Lugano
• Area (1500 m^2)
• Power & cooling ~ 10 MW
• Proximity to academic institution
• Facilitate seamless changes in computer hardware
• Extensible
Simulations
Models, Methods, & Implementation
Map to Hardware
System operation
System design
Learning from the Oak Ridge experience: Covering all aspects of the simulation system
Physics (chemistry, ...)
Application software
Comp. mathematics
Computer Science
Computer Center
Hardware vendor
Applied research: CSCS & USI
CSE at universities
Example based on ORNL’s early science teams that run on the first petaflop/s systems
vendors
Users
CSCS’s (HPC Centers) traditional role
Distributing the tasks in Switzerland:
Systems research: CS Dept. & vendors
The Swiss platform for High-Performance and High-Productivity Computing (HP2C)
• Develop simulation capabilities that will make effective use of supercomputing platforms in 2012-14
• Implement the “networking” part of the HPCN strategy
Core program in computational mathematics and problem-oriented computer science (jointly between CSCS & University of Lugano)
About 10-15 domain science sub-projects at Swiss universities with ~3 “embedded” HPC developers per project
Explore future hardware architectures with industry (Cray, IBM, others) and leading laboratories (ORNL, NERSC, others)
Develop HPC components of computational science and engineering curricula at Swiss universities
• Already established: CSE at ETH, U. Basel
• Currently under development: CSE @ USI, EPFL, UZH
• Reach out to other universities
Projects have to face “brutal facts of HPC”
Massive concurrency: applications will have to put up with millions (billions) of threads
Less and (relatively) slower memory per thread: memory considerations should be an integral part of complexity analysis
Only slow improvements in inter-processor and inter-thread communications - remember that the speed of light is constant!
Stagnant I/O subsystems: you don’t want to limit progress in simulation capabilities with rate of progress in long-term storage technologies
Resilience and fault tolerance: resilience towards failure of individual components; the (energy) cost of error detection and correction is non-negligible
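The "less memory per thread" point can be illustrated with the Monte Rosa figures quoted earlier in this deck. The sketch below is only an illustration: it assumes the node memory stays at 16 GB after the hex-core upgrade (the deck notes that the memory upgrade was abandoned) and that each node then has 12 cores. The function name is hypothetical.

```python
# Node memory from the Monte Rosa slide; assumed unchanged by the CPU upgrade.
MEMORY_PER_NODE_GB = 16.0


def memory_per_core_gb(cores_per_node: int,
                       memory_per_node_gb: float = MEMORY_PER_NODE_GB) -> float:
    """Memory available to each core if node memory is shared evenly."""
    return memory_per_node_gb / cores_per_node


before_upgrade = memory_per_core_gb(8)   # two quad-core sockets per node
after_upgrade = memory_per_core_gb(12)   # assumption: two hex-core sockets per node

print(f"before: {before_upgrade:.2f} GB per core")
print(f"after:  {after_upgrade:.2f} GB per core")
print(f"reduction: {1 - after_upgrade / before_upgrade:.0%}")
```

Under these assumptions, adding cores without adding memory cuts the per-core share by a third, which is the trend the slide asks projects to plan for.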
Expected research priorities of projects
Significant problems that require orders of magnitude more computer power than what is available today
Significant re-engineering of algorithms and refactoring of codes - scientific progress cannot be limited by legacy software
Consider emerging parallel programming models - multiple levels of parallelism, PGAS, DARPA HPCS languages, heterogeneous nodes (consider CPU + accelerator)
Revisit workflows, in particular to avoid I/O
Letters of intent were due August 15, 2009
Project proposals were due September 30, 2009
Review and decision making process in October/November 2009
Tier 1 projects start in Dec./Jan. 2009
Tier 2 projects start ca. spring/summer 2010
CSCS service portfolio
Business Services
• Administration
• Human resources
• Finance
• Building Infrastructure
• IT Infrastructure
National Supercomputing Service
• HPC Systems
• System programming
• Resource allocation
• User support
• User education & training
• Short- to medium-term application support
Scientific Computing
• Long-term application development support
• Data analysis & visualisation
• Experimental HPC systems
Research Computing Collocation Service
• MeteoSwiss
• CHIPP
• Other hosting mandates
Internal support services
Technology transfer
Core business: Academic HPC service and HPC research
CONFIDENTIAL
High-risk & high-impact projects of the HP2C platform (www.hp2c.ch)
[Timeline, 2005 to 2013]
New procurement: Cray XT3, 1’100 processors
Upgrade: Cray XT3, 1’664 processors
Dual-core upgrade: Cray XT3, 3’328 cores
Upgrade: Cray XT5, 14’752 cores
Hex-core upgrade: 22’128 cores
Begin construction of new building
“Final” upgrade: Cray XT5
Procurement of next-generation supercomputer (HPCN initiative)
New building