Technical computing: Observations on an ever-changing, occasionally repetitious, environment
Los Alamos National Laboratory
17 May 2002
A brief, simplified history of HPC

1. Sequential & data parallelism using shared memory: Cray's Fortran computers, 1960-2002 (US: 1990)
2. 1978: VAXen threaten general-purpose centers…
3. NSF response: form many centers, 1988-present
4. SCI: search for parallelism to exploit micros, 1985-95
5. Scalability: "bet the farm" on clusters. Users "adapt" to clusters, aka multicomputers, with the LCD (lowest-common-denominator) programming model, MPI. >1995
6. Beowulf clusters adopt standardized hardware and Linus's software to create a standard! >1995
7. "Do-it-yourself" Beowulfs impede new structures and threaten general-purpose centers. >2000
8. 1997-2002: Let's tell NEC they aren't "in step".
9. High-speed networking enables peer-to-peer computing and the Grid. Will this really work?
Outline

- Retracing scientific computing evolution: Cray, SCI & "killer micros", ASCI, & clusters kick in
- Current taxonomy: cluster flavors
- Deja vu rise of commodity computing: Beowulfs are a replay of VAXen c1978
- Centers: 2+1/2 at NSF; BRC on CyberInfrastructure urges $650M/year
- Role of Grid and peer-to-peer
- Will commodities drive out or enable new ideas?
DARPA SCI: c1985-1995; prelude to DOE's ASCI

- Motivated by the Japanese 5th Generation project… note the creation of MCC
- Realization that "killer micros" were coming
- Custom VLSI and its potential
- Lots of ideas to build various high-performance computers
- Threat and potential sale to the military
[Photo: Steve Squires & G. Bell at our "Cray" at the start of DARPA's SCI, c1984.]
What is the system architecture? (GB c1990)

[Taxonomy chart; an X in the original marks a dead branch:]

MIMD
- Multiprocessors: single address space, shared-memory computation
  - Central-memory multiprocessors (not scalable)
    - Cross-point or multi-stage: Cray, Fujitsu, Hitachi, IBM, NEC, Tera
    - Simple ring multi … bus-multi replacement
    - Bus multis: DEC, Encore, NCR, …, Sequent, SGI, Sun
  - Distributed-memory multiprocessors (scalable)
    - Dynamic binding of addresses to processors: KSR
    - Static binding, caching: Alliant, DASH
    - Static binding, ring multi: IEEE SCI proposal
    - Static run-time binding: research machines
- Multicomputers: multiple address spaces, message-passing computation
  - Distributed multicomputers (scalable)
    - Mesh connected: Intel
    - Butterfly/fat tree/cubes: CM5, NCUBE
    - Switch connected: IBM
  - Fast LANs for high-availability and high-capacity clusters: DEC, Tandem
  - LANs for distributed processing: workstations, PCs (the GRID)
SIMD (X: dead)
Processor architectures? Vectors or not?

CS view:
- MISC >> CISC >> language-directed >> RISC >> super-scalar >> extra-long instruction word
- Caches: mostly alleviate the need for memory B/W

SC designer's view:
- RISC >> VCISC (vectors) >> massively parallel (SIMD) (multiple pipelines)
- Memory B/W = perf. (see the worked example below)
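Why "memory B/W = perf." for scientific codes, as a hedged worked example (the numbers are illustrative, not from the talk): a DAXPY iteration y[i] = a*x[i] + y[i] does 2 flops while moving 24 bytes (load x[i], load y[i], store y[i], 8 bytes each). A processor that sustains 1 GB/s from memory therefore sustains at most

\[ \frac{1\,\mathrm{GB/s}}{24\,\mathrm{bytes}} \times 2\,\mathrm{flops} \approx 83\,\mathrm{Mflops}, \]

regardless of its peak arithmetic rate. Vector machines bought performance by building the memory system first; caches only help when the working set is small.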
The Bell-Hillis bet, c1991: massive (>1000-processor) parallelism in 1995

[Chart: TMC vs. world-wide supers, compared on applications, revenue, and petaflops/month.]
Results from DARPA's SCI c1983

- Many research and construction efforts… virtually all new hardware efforts failed, except Intel's and Cray's.
- DARPA-directed purchases screwed up the market, including the many VC-funded efforts.
- No software funding! Users responded to the massive power potential with LCD software: clusters, clusters, clusters using MPI (see the sketch below).
- It's not scalar vs. vector, it's memory bandwidth!
  - 6-10 scalar processors = 1 vector unit
  - 16-64 scalars = a 2-6 processor SMP
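To make "LCD software" concrete, here is a minimal sketch (mine, not from the talk) of the message-passing model the slides refer to: an MPI program in C that sums one value per node on the root. Any MPI-1 implementation of the era (e.g., MPICH, LAM) compiles and runs it unchanged.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, nprocs;
    double local, total;

    MPI_Init(&argc, &argv);                 /* start the message-passing runtime */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this node's id, 0..nprocs-1 */
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs); /* how many nodes in the cluster */

    local = (double)rank;                   /* each node contributes its own value */

    /* combine the partial results on node 0 -- the whole programming model is
       explicit messages; no shared memory is assumed */
    MPI_Reduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("sum across %d nodes = %g\n", nprocs, total);

    MPI_Finalize();
    return 0;
}

The same source runs on a 2-node Beowulf or a 10,000-processor machine; that lowest-common-denominator portability, not elegance, is why users standardized on it.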
Dead Supercomputer Society: ACRI, Alliant, American Supercomputer, Ametek, Applied Dynamics, Astronautics, BBN, CDC, Convex, Cray Computer, Cray Research, Culler-Harris, Culler Scientific, Cydrome, Dana/Ardent/Stellar/Stardent, Denelcor, Elexsi, ETA Systems, Evans and Sutherland Computer, Floating Point Systems, Galaxy YH-1, Goodyear Aerospace MPP, Gould NPL, Guiltech, Intel Scientific Computers, International Parallel Machines, Kendall Square Research, Key Computer Laboratories, MasPar, Meiko, Multiflow, Myrias, Numerix, Prisma, Tera, Thinking Machines, Saxpy, Scientific Computer Systems (SCS), Soviet Supercomputers, Supertek, Supercomputer Systems, Suprenum, Vitesse Electronics
What a difference 25 years AND spending >10x makes!

- LLNL machine room, c1978: 150 Mflops
- ESRDC (Earth Simulator): 40 Tflops, 640 nodes (8 vector processors per node at 8 Gflops each)
Computer types

[Chart: machine classes arranged by connectivity (WAN/LAN, SAN, DSM, SM) and by processor type (micros vs. vectors). The "old world": NEC supers and Cray X…T (all multiprocessor-vector), NEC mP, SGI DSM clusters & SGI DSM, T3E, SP2 (mP), VPP uni, mainframes, multis, workstations, PCs. The new world of clusters, GRID & P2P: networked supers, Legion, Condor, Beowulf, NT clusters, NOW.]
Top500 taxonomy… everything is a cluster, aka multicomputer

- Clusters are the ONLY scalable structure.
  - Cluster: n inter-connected computer nodes operating as one system. Nodes: uni-processor or SMP. Processor types: scalar or vector.
- MPP = miscellaneous: not massive (>1000), SIMD, or something we couldn't name.
- Cluster types (message passing implied):
  - Constellations = clusters of >=16-processor SMPs
  - Commodity clusters of uni-processor or <=4-processor SMPs
  - DSM: NUMA (and COMA) SMPs and constellations
  - DMA clusters (direct memory access) vs. message passing
  - Uni-processor and SMP vector clusters: vector clusters and vector constellations
Linux - a web phenomenon

- Linus Torvalds writes a news reader for his PC
- Puts it on the Internet for others to play with
- Others add to it, contributing open-source software
- Beowulf adopts early Linux
- Beowulf adds Ethernet drivers for essentially all NICs
- Beowulf adds channel bonding to the kernel
- Red Hat distributes Linux with the Beowulf software
- Low-level Beowulf cluster-management tools added
The challenge leading to Beowulf

- NASA HPCC Program begun in 1992; comprised Computational Aero-Science and Earth and Space Science (ESS)
- Driven by the need for post-processing, data manipulation, and visualization of large data sets
- Conventional techniques imposed long user response times and shared-resource contention
- Cost low enough for a dedicated single-user platform
- Requirement: 1 Gflops peak, 10 GByte, < $50K
- Commercial systems: $1000/Mflops, i.e., $1M/Gflops (see the arithmetic below)
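Worked out from the slide's own numbers, the price/performance gap the requirement implies:

\[ \$1000/\mathrm{Mflops} \times 1000\,\mathrm{Mflops} = \$1\mathrm{M} \quad\text{vs.}\quad \frac{\$50\mathrm{K}}{1000\,\mathrm{Mflops}} = \$50/\mathrm{Mflops}, \]

i.e., the $50K target demanded roughly a 20x improvement over commercial systems, which commodity PC parts and free software supplied.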
The virtuous economic cycle that drives the PC industry… & Beowulf

[Diagram: a feedback loop. Competition → standards → utility/value → greater availability @ lower cost → creates apps, tools, training; attracts users → attracts suppliers → volume → competition. Innovation feeds the cycle; the DOJ acts on it.]
Lessons from Beowulf

- An experiment in parallel computing systems
- Established a vision: low-cost, high-end computing
- Demonstrated the effectiveness of PC clusters for some (not all) classes of applications
- Provided networking software
- Provided cluster-management tools
- Conveyed findings to the broad community (tutorials and the book)
- Provided a design standard to rally the community!
- Standards beget books, trained people, software… the virtuous cycle that allowed apps to form
- Industry begins to form beyond a research project

Courtesy, Thomas Sterling, Caltech.
Clusters: next steps

- Scalability… they can exist at all levels: personal, group, … centers
- Clusters challenge centers… given that smaller users get small clusters
Disk evolution

- Capacity: 100x in 10 years; 1 TB 3.5" drive in 2005; 20 TB? in 2012?!
- System on a chip
- High-speed SAN
- Disk replacing tape
- Disk is the supercomputer!

[Chart axis: kilo, mega, giga, tera, peta, exa, zetta, yotta.]
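A quick check of the stated growth rates (my arithmetic, not the slide's):

\[ 100^{1/10} \approx 1.58 \;(\approx 58\%/\mathrm{yr}), \qquad 1\,\mathrm{TB} \times 1.58^{7} \approx 25\,\mathrm{TB}, \]

so the "20 TB? in 2012?!" guess is simply the same ~58%-per-year capacity curve extrapolated seven years past the 2005 point.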
Intermediate step: shared logic

- Brick with 8-12 disk drives; 200 mips/arm (or more); 2 x Gbps Ethernet; general-purpose OS
- $10K/TB to $100K/TB
- Shared: sheet metal, power, support/config, security, network ports
- These bricks could run applications, e.g., SQL, mail…

Examples:

  Snap              ~1 TB     12 x 80 GB    NAS
  NetApp            ~0.5 TB   8 x 70 GB     NAS
  Maxtor            ~2 TB     12 x 160 GB   NAS
  IBM TotalStorage  ~360 GB   10 x 36 GB    NAS
SNAP architecture

RLX "cluster" in a cabinet: 366 servers per 44U cabinet
- Single processor per server
- 2-30 GB of disk per computer (24 TBytes per cabinet)
- 2 x 100 Mbps Ethernets
- ~10x the performance*, power, disk, and I/O per cabinet; ~3x the price/performance
- Network services… Linux based

*vs. a conventional cabinet: 42 two-processor servers, 84 Ethernets, 3 TBytes

Computing in small spaces @ LANL (RLX cluster in a building with NO A/C)
- 240 processors @ 2/3 Gflops each
- Fill the 4 racks: gives a teraflops (see the arithmetic below)
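The "teraflops from four racks" claim follows from the slide's own figures, assuming fully populated 366-server cabinets:

\[ 4 \times 366 \times \tfrac{2}{3}\,\mathrm{Gflops} \approx 976\,\mathrm{Gflops} \approx 1\,\mathrm{Tflops}. \]

(The installed 240-processor configuration delivers 240 x 2/3 = 160 Gflops.)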
Beowulf clusters: space

[Chart: performance/space ratio, Mflops/sq. ft., comparing a bladed Beowulf with ASCI White.]
Beowulf clusters: power

[Chart: performance/power ratio, Mflops/Watt, comparing a bladed Beowulf with a conventional Beowulf.]
“The network becomes the system.” - Bell, 2/10/82 Ethernet announcement with Noyce (Intel) and Liddle (Xerox)

“The network becomes the computer.” - Sun slogan, >1982

“The network becomes the system.” - GRID mantra, c1999
Computing SNAP built entirely from PCs

[Diagram: a space-, time- (bandwidth-), and generation-scalable environment. Wide and local area networks connect terminals, PCs, workstations & servers; a wide-area global network; legacy mainframe & minicomputer servers & terminals; centralized & departmental uni- & mP servers (UNIX & NT) built from PCs; scalable computers built from PCs; person servers (PCs); portables; mobile nets; TC = TV + PC at home… (CATV or ATM or satellite).]
The virtuous cycle of bandwidth supply and demand

[Diagram: standards plus new services (telnet & FTP, then WWW, audio, video, voice!) increase demand; increased demand drives increased capacity (circuits & bandwidth) and lower response time, which in turn create new services. Incompetence?]
Internet II concerns, given its $0.5B cost

- Very high cost: $(1 + 1)/GByte to send on the net; FedEx and 160-GByte disk shipments are cheaper (see the arithmetic below)
- DSL at home is $0.15-$0.30
- Disks cost $1/GByte to purchase!
- Low availability of fast links (the last-mile problem): labs & universities have DS3 links at most, and those are very expensive
- Traffic: instant messaging, music stealing
- Performance at the desktop is poor: 1-10 Mbps; very poor communication links
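The FedEx comparison, worked from the slide's own prices (the shipping charge is my illustrative assumption):

\[ 160\,\mathrm{GB} \times \$2/\mathrm{GB} = \$320 \text{ over the net}, \qquad \text{vs. } 160\,\mathrm{GB} \times \$1/\mathrm{GB} = \$160 \text{ for the disk, plus tens of dollars of overnight shipping.} \]

At those prices the shipped drive wins on cost, and usually on effective bandwidth as well.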
Scalable computing: the effects

- They come in all sizes; incremental growth: 10 or 100 to 10,000 (100x for most users); debug vs. run; problem growth
- Allows compatibility heretofore impossible: 1978, VAX chose Cray Fortran; 1987, the NSF centers went to UNIX
- Users chose the sensible environment: acquisition and operational costs & environments; cost to use as measured by the user's time
- The role of general-purpose centers (e.g., NSF, statex) is unclear. Necessity for support? Scientific data for a given community… community programs and data; managing GRID discipline
- Are clusters ≈ Gresham's Law? Will they drive out the alternatives?