
Page 1: Building Beowulfs for High Performance Computing

Building Beowulfs for High Performance Computing

Duncan Grove, Department of Computer Science

University of Adelaide

http://dhpc.adelaide.edu.au/projects/beowulf

Page 2: Building Beowulfs for High Performance Computing

Anatomy of a “Beowulf”

• “Cluster” of networked PCs

– Intel PentiumII or Compaq Alpha

– Switched 100Mbit/s Ethernet or Myrinet

– Linux

– Parallel and batch software support

[Diagram: compute nodes n1, n2, …, nN connected through the switching infrastructure to a front-end node, which in turn connects to the outside world]

Page 3: Building Beowulfs for High Performance Computing

Why build Beowulfs?

• Science/$

• Some problems take lots of processing

• Many supercomputers are used as batch processing engines

– Traditional supercomputers are wasteful for high throughput computing

• Beowulfs:

– “ [useful] computational cycles at the lowest possible price.”

– Suited to high throughput computing

– Effective at an increasingly large set of parallel problems

Page 4: Building Beowulfs for High Performance Computing

Three Computational Paradigms

• Data Parallel

– Regular grid based problems

• Parallelising compilers, e.g. HPF

• E.g. physicists running lattice gauge calculations

• Message Passing

– Unstructured parallel problems.

• MPI, PVM (see the sketch at the end of this slide)

• E.g. chemists running molecular dynamics simulations.

• Task Farming

– “High throughput computing” - batch jobs

• Queuing systems

• E.g. chemists running Gaussian.
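To make the message-passing paradigm concrete, the following is a minimal sketch of an MPI program in C, of the kind that would be built with MPICH or LAM/MPI on a Beowulf; the value sent and the message tag are arbitrary placeholders, not part of the original talk.

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, size, value;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* which process am I? */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* how many processes in total? */

    if (rank == 0) {
        /* Process 0 sends a value to process 1. */
        value = 42;
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        /* Process 1 receives it and reports. */
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);
        printf("process 1 of %d received %d\n", size, value);
    }

    MPI_Finalize();
    return 0;
}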

Page 5: Building Beowulfs for High Performance Computing

A Brief Cluster History

• Caltech Prehistory

• Berkeley NOW

• NASA Beowulf

• Stone SouperComputer

• USQ Topcat

• UIUC NT Supercluster

• LANL Avalon

• SNL Cplant

• AU Perseus?

Page 6: Building Beowulfs for High Performance Computing

Beowulf Wishlist

• Single System Image (SSI)

– Unified process space

– Distributed shared memory

– Distributed file system

• Performance easily extensible

– Just “add more bits”

• Is fault tolerant

• Is “simple” to administer and use

Page 7: Building Beowulfs for High Performance Computing

Current Sophistication?

• Shrinkwrapped “solutions” or do-it-yourself

– Not much more than a nicely installed network of PCs

– A few kernel hacks to improve performance

– No magical software for making the cluster transparent to the user

– Queuing software and parallel programming software can create the appearance of a more unified machine

Page 8: Building Beowulfs for High Performance Computing

Stone SouperComputer

Page 9: Building Beowulfs for High Performance Computing

Iofor

• Learning platform

• Program development

• Simple benchmarking

• Simple performance evaluation of real applications

• Teaching machine

• Money lever

Page 10: Building Beowulfs for High Performance Computing

iMacwulf

• Student lab by day, Beowulf by night?

– MacOS with Appleseed

– LinuxPPC 4.0, soon LinuxPPC 5.0

– MacOS/X

Page 11: Building Beowulfs for High Performance Computing

“Gigaflop harlotry”

Machine                     Cost              # Processors   ~ Peak Speed
Cray T3E                    10s of millions   1084           1300 Gflop/s
SGI Origin 2000             10s of millions   128            128 Gflop/s
IBM SP2                     10s of millions   512            400 Gflop/s
Sun HPC                     1s of millions    64             50 Gflop/s
TMC CM5                     5 million (1992)  128            20 Gflop/s
SGI PowerChallenge          1 million (1995)  20             20 Gflop/s
Beowulf cluster + Myrinet   1 million         256            120 Gflop/s
Beowulf cluster             300K              256            120 Gflop/s

Page 12: Building Beowulfs for High Performance Computing

The obvious, but important

• In the past:

– Commodity processors way behind supercomputer processors

– Commodity networks way, way, way behind supercomputer networks

• In the now:

– Commodity processors only just behind supercomputer processors

– Commodity networks still way, way behind supercomputer networks

– More exotic networks still way behind supercomputer networks

• In the future:

– Commodity processors will be supercomputer processors

– Will the commodity networks catch up?

Page 13: Building Beowulfs for High Performance Computing

Hardware possibilities

Processor   Advantages                                                       Disadvantages
x86, K7     Mass market commodity; good floating point; (some) SMP capable
PowerPC     Very good integer                                                More expensive than x86; poor floating point
Alpha       Very good floating point                                         Expensive; limited vendors

Page 14: Building Beowulfs for High Performance Computing

OS possibilities

OS             Advantages                                                            Disadvantages
Linux          Large user community; widely available; open source; many platforms
?              Open source; good compiler                                            x86 only
NT             Good compilers                                                        Poor user model; poor stability; poor remote access; poor networking
Digital Unix   Very good compilers                                                   Runs on expensive hardware
Solaris        Robust; good quality software; runs on x86                            Not open source
MacOS X        Attractive as multipurpose cluster                                    Not out yet!
Darwin         Open source; may be ported to x86?                                    Small user community

Page 15: Building Beowulfs for High Performance Computing

Open Source

• The good...

– Lots of users, active development

– Easy access to make your own tweaks

– Aspects of Linux are still immature, but recently

• SGI has released XFS as open source

• Sun has released its HPC software as open source

• And the bad...

– There’s a lot of bad code out there!

Page 16: Building Beowulfs for High Performance Computing

Network technologies

• So many choices!

– Interfaces, cables, switches, hubs; ATM, Ethernet, Fast Ethernet, gigabit Ethernet, firewire, HiPPI, serial HiPPI, Myrinet, SCI…

• The important issues

– latency (see the ping-pong sketch at the end of this slide)

– bandwidth

– availability

– price

– price/performance

– application type!
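To put the first two issues in practical terms, latency and bandwidth between nodes are usually measured with a simple ping-pong test. Below is a minimal sketch in C using MPI (already part of this talk's software stack); the message size and repetition count are arbitrary placeholder choices.

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

#define NBYTES (1 << 20)   /* 1 MB message; use 1 byte instead to estimate latency */
#define REPS   100

int main(int argc, char **argv)
{
    int rank, i;
    char *buf = malloc(NBYTES);
    double t0, t1;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    t0 = MPI_Wtime();
    for (i = 0; i < REPS; i++) {
        if (rank == 0) {
            /* Send to node 1 and wait for the echo. */
            MPI_Send(buf, NBYTES, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, NBYTES, MPI_CHAR, 1, 0, MPI_COMM_WORLD, &status);
        } else if (rank == 1) {
            /* Echo everything straight back to node 0. */
            MPI_Recv(buf, NBYTES, MPI_CHAR, 0, 0, MPI_COMM_WORLD, &status);
            MPI_Send(buf, NBYTES, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }
    t1 = MPI_Wtime();

    if (rank == 0) {
        double round_trip = (t1 - t0) / REPS;
        printf("average round trip %g s, bandwidth %g MB/s\n",
               round_trip, 2.0 * NBYTES / round_trip / 1.0e6);
    }

    MPI_Finalize();
    free(buf);
    return 0;
}

Run with one process on each of two nodes, so the traffic actually crosses the network being evaluated.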

Page 17: Building Beowulfs for High Performance Computing

Disk subsystems

• I/O a problem in parallel systems

– Data that is not local to a compute node incurs a performance hit

– Distributed file systems

• CacheFS

• CODA

– Parallel file systems

• PVFS

• On-line bulk data is interesting in itself

– Beowulf Bulk Data Server

• cf. slow, expensive tape silos...

Page 18: Building Beowulfs for High Performance Computing

Perseus

• Machine for chemistry simulations

– Mainly high throughput computing

– RIEF grant in excess of $300K

– 128 nodes, for < $2K per node (see the costing note after this slide)

• Dual processor PII450

• At least 256MB RAM

– Some nodes up to 1GB

• 6GB local disk each

– 5x24 (+2x4) port Intel 100Mbit/s switches
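As a rough sanity check on these numbers (an inference from the figures above, not a statement from the talk): 128 dual-processor nodes provide the 256 processors quoted later, and at under $2K per node the compute nodes alone come to roughly $256K, leaving the remainder of the $300K+ grant for the switching infrastructure and racking.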

Page 19: Building Beowulfs for High Performance Computing

Perseus: Phase 1

• Prototype

• 16 dual processor PII

• 100Mbit/s switched Ethernet

Page 20: Building Beowulfs for High Performance Computing

Perseus: installing a node

[Diagram: front-end node and compute nodes n1, n2, …, nN on the switching infrastructure, as in the earlier anatomy slide]

• Front-end node: user node, administration, compilers, queues, nfs, dns, NIS, /etc/*, bootp/dhcp, kickstart, ... (a hypothetical dhcpd entry is sketched below)

• Compute nodes boot from floppy disk or bootrom
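For the bootp/dhcp and kickstart step, a Red Hat-style network install is typically driven from the front-end node by a dhcpd entry along these lines. This is a hypothetical sketch only: the MAC address, IP addresses and kickstart path are made-up placeholders, not values from the real perseus configuration.

# /etc/dhcpd.conf on the front-end node (hypothetical entry for compute node n1)
host n1 {
    hardware ethernet 00:90:27:aa:bb:01;   # placeholder MAC address of n1's NIC
    fixed-address 10.0.0.1;                # placeholder private address for n1
    next-server 10.0.0.254;                # front-end node exporting the install tree
    filename "/kickstart/node.ks";         # placeholder kickstart file describing the install
}

The node then boots from its floppy or boot ROM, picks up this configuration, and installs itself unattended from the front-end node.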

Page 21: Building Beowulfs for High Performance Computing

Software on perseus

• Software to support the three computational paradigms

– Data Parallel

• Portland Group HPF

– Message Passing

• MPICH, LAM/MPI, PVM (see the example launch command at the end of this slide)

– High throughput computing

• Condor, GNU Queue

• Gaussian94, Gaussian98
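In day-to-day use these pieces are driven from the command line on the front-end node. As one illustrative (hypothetical) example, under MPICH a message-passing job would typically be started with something like:

mpirun -np 32 -machinefile machines ./my_simulation

where the process count, machine file and program name are placeholders; batch work would instead be submitted through Condor or GNU Queue.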

Page 22: Building Beowulfs for High Performance Computing

Expected parallel performance

• Loki, 1996
– 16 Pentium Pro processors, 10Mbit/s Ethernet
– 3.2 Gflop/s peak, achieved 1.2 real Gflop/s on the Linpack benchmark

• Perseus, 1999
– 256 PentiumII processors, 100Mbit/s Ethernet
– 115 Gflop/s peak (see the note after this slide)
– ~40 Gflop/s on the Linpack benchmark?

• Compare with the top 500!
– Would get us to about #200 currently
– Other Australian machines?
• NEC SX/4 @ BOM at #102
• Sun HPC at #181, #182, #255
• Fujitsu VPP @ ANU at #400
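The peak figures are just clock speed multiplied by processor count, assuming roughly one floating-point operation per cycle: 256 PentiumII processors at 450 MHz gives about 115 Gflop/s for Perseus, and Loki's 16 Pentium Pro processors (200 MHz parts) likewise give 3.2 Gflop/s. On that basis the ~40 Gflop/s Linpack estimate corresponds to sustaining roughly a third of peak, in line with Loki's 1.2 out of 3.2 Gflop/s.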

Page 23: Building Beowulfs for High Performance Computing
Page 24: Building Beowulfs for High Performance Computing
Page 25: Building Beowulfs for High Performance Computing
Page 26: Building Beowulfs for High Performance Computing
Page 27: Building Beowulfs for High Performance Computing

Reliability in a large system

• Build it right!

• Is the operating system and software running ok?

• Is heat dissipation going to be a problem?

– Monitoring daemon (a minimal sketch follows this slide)

• Normal features
– CPU, network, memory, disk

• More exotic features
– Power supply and CPU fan speeds
– Motherboard and CPU temperatures

• Do we have any heisen-cabling?
– Racks and lots of cable ties!
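As an illustration of the "normal features", most of this information is available from the /proc filesystem on each Linux node, so a monitoring daemon can be quite small. The sketch below, in C, only samples the load average and prints it; reporting back to the front-end node and the more exotic sensors (fan speeds and temperatures, e.g. via lm_sensors) are deliberately left out as assumptions beyond this sketch.

#include <stdio.h>
#include <unistd.h>

int main(void)
{
    /* Sample the 1, 5 and 15 minute load averages once a minute, forever. */
    for (;;) {
        FILE *fp = fopen("/proc/loadavg", "r");
        double load1, load5, load15;

        if (fp != NULL) {
            if (fscanf(fp, "%lf %lf %lf", &load1, &load5, &load15) == 3) {
                /* A real daemon would send this to the front-end node
                   rather than print it locally. */
                printf("load averages: %.2f %.2f %.2f\n", load1, load5, load15);
            }
            fclose(fp);
        }
        sleep(60);
    }
    return 0;
}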

Page 28: Building Beowulfs for High Performance Computing

The limitations...

• Scalability

• Load balancing

– Effects of machines' capabilities

– Desktop machines vs. dedicated machines

– Resource allocation

– Task Migration

• Distributed I/O

• System monitoring and control tools

• Maintenance requirements

– Installation, upgrading, versioning

• Complicated scripts

• Parallel interactive shell?

Page 29: Building Beowulfs for High Performance Computing

… and the opportunities

• A large proportion of the current limitations, compared with traditional HPC solutions, are merely systems integration problems

• Some contributions to be made in

– HOWTOs

– Monitoring and maintenance

– Performance modelling and real benchmarking