

Capacity and Capability Computing using Legion

Anand Natrajan (anand@virginia.edu)

The Legion Project, University of Virginia (http://legion.virginia.edu)

Capacity and Capability Computing

• Capacity Computing: Conduct larger computational experiments by expending more resources
  – single problem
  – multiple, independent problems

• Capability Computing: Conduct experiments with new mechanisms
  – Heterogeneity
  – Security
  – Collaboration

Grid Environment

Computers, networks, people, data, devices

• Disjoint file systems
• Disjoint namespaces
• Multiple administration domains
• Unpredictable load, availability, failures
• Security problems

Grid OS Requirements

• Wide-area
• High Performance
• Complexity Management
• Extensibility
• Security
• Site Autonomy
• Input / Output
• Heterogeneity
• Fault-tolerance
• Scalability
• Simplicity
• Single Namespace
• Resource Management
• Platform Independence
• Multi-language
• Legacy Support

Legion - A Grid OS

Tools

• MPI / PVM
• P-space studies - multi-run
• Parallel C++
• Parallel object-based Fortran
• CORBA binding
• Object migration
• Accounting
• Remote builds and compilations
• Fault-tolerant MPI libraries
• Post-mortem debugger
• Console objects
• Parallel 2D file objects
• Collections
• Licence support

Protein Folding with CHARMM

Molecular Dynamics Simulations

100-200 structures to sample (r, Rgyr) space

[Figure: plot with axis label Rgyr]
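The slide does not define Rgyr. Assuming it denotes the radius of gyration, as is conventional in molecular dynamics, the quantity could be computed from atomic coordinates roughly as follows. This is a minimal sketch, not CHARMM's implementation; the function name and argument layout are hypothetical.

```python
import math
from typing import Optional, Sequence, Tuple

Point = Tuple[float, float, float]


def radius_of_gyration(coords: Sequence[Point],
                       masses: Optional[Sequence[float]] = None) -> float:
    """Mass-weighted root-mean-square distance of atoms from their centre of mass.

    Hypothetical helper for illustration; assumes Rgyr means radius of gyration.
    """
    if not coords:
        raise ValueError("need at least one atom")
    if masses is None:
        masses = [1.0] * len(coords)  # assumption: equal masses if none are given
    total = sum(masses)
    # Centre of mass, one component at a time.
    com = tuple(
        sum(m * c[k] for m, c in zip(masses, coords)) / total for k in range(3)
    )
    # Mass-weighted mean squared distance from the centre of mass.
    msd = sum(
        m * sum((c[k] - com[k]) ** 2 for k in range(3))
        for m, c in zip(masses, coords)
    ) / total
    return math.sqrt(msd)
```

Each sampled structure would yield one Rgyr value, giving one of the two coordinates of the (r, Rgyr) space named on the slide.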

Resources Available

• IBM Blue Horizon, SDSC: 375MHz Power3, 512/1184
• HP SuperDome, CalTech: 440MHz PA-8700, 128/128
• IBM SP3, UMich: 375MHz Power3, 24/24
• IBM Azure, UTexas: 160MHz Power2, 32/64
• Sun HPC 10000, SDSC: 400MHz SMP, 32/64
• DEC Alpha, UVa: 533MHz EV56, 32/128

Transparent Remote Execution

• User initiates “run”
• User/Legion selects site
• Legion copies binaries
• Legion copies input files
• Legion starts job(s)
• Legion monitors progress
• Legion copies output files
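The sequence of steps above can be sketched as a small local simulation. This is only an illustration of the ordering: `Site`, `remote_run` and the `program` callable are hypothetical names invented here, not Legion APIs, and the "job" simply runs in the same process.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Site:
    """Hypothetical stand-in for a remote site; not a Legion type."""
    name: str
    files: Dict[str, bytes] = field(default_factory=dict)


def remote_run(binary_name: str,
               binary: bytes,
               inputs: Dict[str, bytes],
               program: Callable[[Dict[str, bytes]], Dict[str, bytes]],
               sites: List[Site],
               chosen: Optional[str] = None) -> Tuple[Dict[str, bytes], List[str]]:
    """Simulate the slide's steps in order and return (outputs, log)."""
    log: List[str] = []
    # The user may name a site; otherwise the system picks one (here: the first).
    if chosen is not None:
        site = next(s for s in sites if s.name == chosen)
    else:
        site = sites[0]
    log.append(f"select {site.name}")
    site.files[binary_name] = binary        # copy binaries to the site
    log.append("copy binaries")
    site.files.update(inputs)               # copy input files to the site
    log.append("copy inputs")
    log.append("start job")
    outputs = program(dict(inputs))         # the job itself (simulated locally)
    log.append("monitor: finished")
    log.append("copy outputs")              # results come back to the user
    return outputs, log
```

From the user's point of view only the first step is visible; the remaining steps are what makes the execution "transparent".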

Mechanics of CHARMM Runs

Legion

• Register binaries
• Create task directories & specification
• Dispatch runs
• Dispatch more runs

[Pie chart with legend Blue Horizon, CalTech, UTexas, DEC Alpha, UMich, Sun HPC; slice labels as extracted: 77%, 20%, 1%, 2%, 0%, 0%]
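The mechanics on this slide (create task directories and a specification, dispatch runs, then dispatch more runs) might look like the following when simulated on a local disk. This is a sketch under stated assumptions: the directory naming, the `spec.json` file and the per-site slot counts are all hypothetical, and no Legion command is invoked.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict, List


def create_task_dirs(root: Path, n_tasks: int) -> List[Path]:
    """Create one directory per task, each holding a small specification file."""
    dirs = []
    for i in range(n_tasks):
        d = root / f"task_{i:03d}"          # hypothetical naming scheme
        d.mkdir(parents=True)
        (d / "spec.json").write_text(json.dumps({"task": i}))
        dirs.append(d)
    return dirs


def dispatch_in_waves(tasks: List[Path],
                      slots: Dict[str, int]) -> List[Dict[str, List[Path]]]:
    """Assign tasks to sites wave by wave, at most `slots[site]` per site per wave."""
    if sum(slots.values()) <= 0:
        raise ValueError("no free slots on any site")
    pending = list(tasks)
    waves = []
    while pending:                           # "dispatch more runs" until none remain
        wave: Dict[str, List[Path]] = {}
        for site, free in slots.items():
            take, pending = pending[:free], pending[free:]
            if take:
                wave[site] = take
        waves.append(wave)
    return waves


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        tasks = create_task_dirs(Path(tmp), 5)
        for n, wave in enumerate(dispatch_in_waves(tasks, {"site_a": 2, "site_b": 1})):
            print(n, {site: [t.name for t in ts] for site, ts in wave.items()})
```

A site with more free slots receives a larger share of each wave, which is one plausible reason the shares in the chart are so uneven.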

Types Of Applications

• Legacy applications
• Legion-aware applications
  – I/O library
  – 2D file object
• Applications Using Stdgrid
• Parameter Space Studies
• Parallel Programs
  – MPI, PVM, MPL, Basic Fortran Support (BFS)

Computing in the Near Future

• Security
• Fault-tolerance
• Heterogeneity
• Collaboration
• …

• Legion supports these and other needs

Heterogeneous Runs

BT-Med Ocean Model

Cross-Organisation Collaboration

• Different companies
• Proprietary simulations and data
• Each needs the other
• Form virtual partnership

Flexible Context Space

[Figure: tree of contexts, directories and disks, labelled with legion_export_dir and legion_import_tree, and with access through Samba, NFS, HTTP and FTP]

Interfaces

• Samba, NFS, FTP, HTTP interfaces to distributed file system
• Windows interface for file sharing
• Command-line through Unix-like tools
• Web interface through browser
• Programmatic interfaces through system calls in C, C++, Fortran, Java

Platforms

• Windows NT, 2K, 98, 95
• Sun (Solaris)
• SGI (IRIX, Origin)
• Intel (Linux, FreeBSD)
• DEC (Unix, Linux)
• Cray (T90, T3E)
• IBM (AIX, SP-2)
• HP (HP-UX)

• Nimrod
• Codine
• LoadLeveler
• Maui
• PBS
• NQS
• LSF

Applications

• Biochemistry and Molecular Science
• Information Retrieval
• Materials Science
• Climate Modelling
• Neuroscience
• Aerospace
• Astronomy
• Graphics

NPACI - SDSC, UCSD, CalTech, UTexas, UMich, UCB, UVa
DoD MSRCs - NAVO & ARL
NASA Ames
