UCSD
SAN DIEGO SUPERCOMPUTER CENTER
Who needs a supercomputer?
Professor Allan Snavely
University of California, San Diego
and
San Diego Supercomputer Center
Performance Modeling and Characterization Lab
PMaC
Aren’t computers fast enough already?
This talk argues that computers are not fast enough already
Nor do supercomputers just naturally get faster as a result of Moore’s Law
We explore implications of: Moore’s Law, Amdahl’s Law, Einstein’s Law
Supercomputers are of strategic importance, enabling a “Third Way” of doing science: science by simulation
Example: TeraShake earthquake simulation
Viable national cyberinfrastructure requires centralized supercomputers
Supercomputing in Japan, Europe, India, China
Why SETI@home + Moore’s Law does not solve all our problems
The basic components of a computer
Your laptop has these:
Supercomputers (citius, altius, fortius)
Supercomputers are just “faster, higher, stronger” than your laptop: more and faster processors, etc., capable of solving large scientific calculations
An army of ants approach
In supercomputers such as Blue Gene and DataStar, thousands of CPUs cooperate to solve scientific calculations
Computers live a billion seconds to our every one!
Definitions:
Latency is distance measured in time
Bandwidth is volume per unit of time
Thus, in their own sense of time, the latencies and bandwidths across the machine room span 11 orders of magnitude (from nanoseconds to minutes). To a supercomputer, getting data from disk is like sending a rocket ship to Saturn!
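To make the time scaling concrete, here is a minimal Python sketch. It stretches each latency by a factor of a billion, so that one nanosecond of machine time reads as one second of human time. The latency figures are round numbers assumed for illustration, not measurements from any particular machine.

```python
import math

# Scale factor from the slide title: a computer "lives" a billion seconds to our one.
SCALE = 1e9

# Illustrative, assumed latencies in seconds (round numbers, not measurements).
ASSUMED_LATENCIES = {
    "CPU cycle": 1e-9,      # about a nanosecond
    "main memory": 1e-7,    # about a hundred nanoseconds
    "disk access": 1e-2,    # about ten milliseconds
    "tape archive": 1e2,    # a minute or two
}


def human_equivalent(seconds):
    """Return the latency as the machine 'feels' it, in human seconds."""
    return seconds * SCALE


def orders_of_magnitude(shortest, longest):
    """Return how many powers of ten separate two latencies."""
    return math.log10(longest / shortest)


if __name__ == "__main__":
    for name, latency in ASSUMED_LATENCIES.items():
        days = human_equivalent(latency) / 86400
        print(f"{name:12s} {latency:8.0e} s  feels like {days:16.5f} days")
    span = orders_of_magnitude(
        min(ASSUMED_LATENCIES.values()), max(ASSUMED_LATENCIES.values())
    )
    print(f"span: about {span:.0f} orders of magnitude")
```

With these assumed figures, a 10 ms disk access stretches to about 116 days of "machine-felt" time, and the gap between a CPU cycle and a tape fetch comes out at 11 orders of magnitude.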
Moore’s Law
Gordon Moore (co-founder of Intel) predicted in 1965 that the transistor density of semiconductor chips would double roughly every 18 months.
Moore’s law has had a decidedly mixed impact, creating new opportunities to tap into exponentially increasing computing power while raising fundamental challenges as to how to harness it effectively.
Things Moore never said:
“computers double in speed every 18 months”
“cost of computing is halved every 18 months”
“cpu utilization is halved every 18 months”
Moore’s Law
[Figure: transistors per processor chip (log scale, 1,000 to 100,000,000) versus year (1970 to 2005). Chips plotted: i4004, i8080, i8086, i80286, i80386, R2000, R3000, Pentium, R10000.]
Moore’s Law: the number of transistors per processor chip doubles every 18 months.
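The arithmetic behind an 18-month doubling period can be sketched in a few lines of Python. The function name and its default argument are choices made for this illustration. It models only the idealized doubling rule stated on the slide, not actual chip data.

```python
def growth_factor(years, doubling_months=18.0):
    """Return the multiplicative growth after `years` of steady doubling."""
    doublings = years * 12.0 / doubling_months
    return 2.0 ** doublings


if __name__ == "__main__":
    for years in (1.5, 3, 15, 30):
        print(f"{years:>4} years -> x{growth_factor(years):,.0f}")
```

Fifteen years at this rate is ten doublings, a factor of 1,024. Thirty years is twenty doublings, a factor of about a million.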
Snavely’s Top500 Laptop?
Among other startling implications of Moore’s Law is the fact that the peak performance of the typical laptop would have placed it as one of the 500 fastest computers in the world as recently as 1995.
Shouldn’t I just go find another job now?
No. Moore’s Law has several subtler implications, and these raise a series of challenges to using the apparently ever-increasing supply of compute power. They must be understood to see where we are today in high-performance computing (HPC).
The von Neumann bottleneck
Scientific calculations involve operations upon large amounts of data, and it is in moving data around within the computer that the trouble begins. As a very simple pedagogical example consider the expression
A + B = C
The computer has to load A and B, “+” them together, and store C
“+” is fast by Moore’s Law, load and store is slow by Einstein’s Law
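A rough way to see why load and store dominate is to count the bytes that must move for each “+”. The Python sketch below is a simple bandwidth-limit estimate, not something taken from the talk. It assumes 8-byte values, no reuse from cache, and a hypothetical machine whose peak rate and memory bandwidth are made up for illustration.

```python
BYTES_PER_VALUE = 8  # assumption: A, B and C are 8-byte floating-point numbers


def bytes_moved_per_add():
    """C = A + B loads two values and stores one for a single '+'."""
    loads, stores = 2, 1
    return (loads + stores) * BYTES_PER_VALUE


def attainable_adds_per_second(peak_adds_per_second, memory_bytes_per_second):
    """Return the add rate allowed by the slower of the CPU and the memory system."""
    memory_limit = memory_bytes_per_second / bytes_moved_per_add()
    return min(peak_adds_per_second, memory_limit)


if __name__ == "__main__":
    # Hypothetical machine: 10 billion adds/s peak, 24 GB/s memory bandwidth.
    peak = 10e9
    bandwidth = 24e9
    rate = attainable_adds_per_second(peak, bandwidth)
    print(f"attainable: {rate:.2e} adds/s ({rate / peak:.0%} of peak)")
```

On this hypothetical machine the memory system can feed only one billion adds per second, so the processor would run at 10% of its peak no matter how fast its adder is.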
Supercomputer “Red Shift”
While the absolute speeds of all computer subcomponents have been changing rapidly, they have not all been changing at the same rate.
As CPUs get faster, they spend more time sitting around waiting for data
Amdahl’s Law
The law of diminishing returns
When a task has multiple parts, after you speed up one part a lot, the other parts come to dominate the total time
An example from cycling:
On a hilly closed-loop course you cannot ever average more than 2x your uphill speed even if you go downhill at the speed of light!
For supercomputers this means even though processors get faster the overall time to solution is limited by memory and interconnect speeds (moving the data around)
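Both the cycling example and its supercomputer counterpart can be checked numerically. The Python sketch below assumes the loop has equal uphill and downhill distance, and it assumes an even split between compute time and data-movement time in the second part. Both are illustrative assumptions, and the function names are made up for this example.

```python
def average_loop_speed(uphill_speed, downhill_speed):
    """Average speed over a loop with equal uphill and downhill distance."""
    # time = distance / speed for each half; take each half as 1 unit of distance.
    total_time = 1.0 / uphill_speed + 1.0 / downhill_speed
    return 2.0 / total_time


def amdahl_speedup(accelerated_fraction, factor):
    """Overall speedup when `accelerated_fraction` of the run time gets `factor` times faster."""
    remaining = (1.0 - accelerated_fraction) + accelerated_fraction / factor
    return 1.0 / remaining


if __name__ == "__main__":
    uphill = 10.0  # km/h, an arbitrary illustrative value
    for downhill in (20.0, 100.0, 1e9):
        average = average_loop_speed(uphill, downhill)
        print(f"downhill {downhill:>12.0f} km/h -> average {average:.2f} km/h")
    # Assumed split: half the run time is computation, half is data movement.
    for factor in (2, 10, 1000):
        print(f"compute x{factor:<5} -> overall x{amdahl_speedup(0.5, factor):.3f}")
```

However fast the downhill leg gets, the average stays below twice the uphill speed. In the same way, speeding up only the compute half of a run can never make the whole run more than twice as fast.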
Red Shift and the Red Queen
It takes all the running you can do, to keep in the same place. If you want to get somewhere else, you must run at least twice as fast as that!
Corollary: Allan’s laptop is not a balanced system!
System utilization is cut in half every 18 months?
Fundamental R&D in latency hiding, high-bandwidth networks, and computer architecture
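The question of utilization being cut in half every 18 months can be explored with a toy model. The Python sketch below assumes compute time halves each generation while data-movement time stays fixed. That is a deliberately pessimistic simplification chosen for illustration, not a measurement. The starting times of 1.0 are arbitrary.

```python
def cpu_utilization(compute_time, data_time):
    """Fraction of the total time the CPU spends computing rather than waiting."""
    return compute_time / (compute_time + data_time)


def utilization_by_generation(generations, compute_time=1.0, data_time=1.0):
    """Halve the compute time each generation while the data time stays fixed."""
    history = []
    for _ in range(generations + 1):
        history.append(cpu_utilization(compute_time, data_time))
        compute_time /= 2.0  # assumed: processor speed doubles each 18-month generation
    return history


if __name__ == "__main__":
    for generation, utilization in enumerate(utilization_by_generation(6)):
        print(f"generation {generation}: CPU busy {utilization:.1%} of the time")
```

Under these assumptions the busy fraction falls every generation, and once data movement dominates, each new generation roughly halves it again.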
3 ways of science
Experiment
Theory
Simulation
Major Earthquakes on the San Andreas Fault, 1680-present
1906: M 7.8
1857: M 7.8
1680: M 7.7
The SCEC TeraShake simulation is the result of more than 10 years of immense effort by the geoscience community
Focus is on understanding big earthquakes and how they will impact sediment-filled basins.
Simulation combines massive amounts of data, high-resolution models, and large-scale supercomputer runs
TeraShake results provide new information enabling better:
Estimation of seismic risk
Emergency preparation, response and planning
Design of the next generation of earthquake-resistant structures
Such simulations provide potentially immense benefits, saving many lives and billions in economic losses
How Dangerous is the Southern San Andreas Fault?
TeraShake Animation
SDSC and Data Intensive Computing
[Figure: computing environments placed on two axes, Compute (more FLOPS) and Data (more BYTES). Regions: Home, Lab, Campus, Desktop; Traditional HPC environment; Data-oriented Science and Engineering Environment. Example applications: Brainmapping, TeraShake.]
The Japanese Earth Simulator
Took U.S. HPC Community by surprise in 2002 – “Computenik”
For 2 years had more flops capacity than top 5 U.S. systems
Approach based on specialized HPC design
Still has more data-moving capacity
Sparked a “space race” in HPC; Blue Gene surpassed it in flops in 2005
Summary
“Red Shift” means the promise implied by Moore’s Law is largely unrealized for scientific simulations that by necessity operate on large data
Consider “The Butterfly Effect”
Supercomputer architecture is a hot field
Challenges from Japan, Europe, India, China
Large, centralized, specialized compute engines are a vital national strategic resource
Grids, utility programming, SETI@home etc. do not meet all the needs of large-scale scientific simulation, for reasons that should now be obvious
Consider a galactic scale