The Grid as a Parallel Computer
Francis C.M. Lau
Department of Computer Science
The University of Hong Kong
www.cs.hku.hk/~fcmlau
Greetings from Hong Kong!
Systems research @ HKU
www.srg.cs.hku.hk
hkgrid.org
HKGrid – the initial setup (2004)
www.cngrid.org
The 500th machine at 11/2005 has a peak of 2.9 Tflops
The 1st (DOE/BlueGene) has 0.37 Pflops
(11/2002)
Agenda
Parallel computing state of affairs
The many faces of parallel computing
Grid as a parallel computer
Our first attempt – G-JavaMPI
Some thoughts for the future
The State of High Performance Computing
“Oxen vs. chickens”
• “If you were plowing a field, which would you rather use? Two strong oxen or 1024 chickens?” - Seymour Cray (’25–’96)
• Your choice?
Will time tell?
• “At best, clusters are a loose collection of unmanaged, individual, microprocessor-based computers … Most cluster [experts] know now that users are fortunate to get more than 8% of the peak performance in sustained performance.” - Dr. Paul Terry, CTO, Cray Canada, 2004
Never to predict the future?
• “No one will need more than 640 kb of memory for a personal computer.” (Bill Gates, 1981, wrongly attributed?)
You need cpu, cpu, …
Subramanian, 1999
Software complexity
Is Grid New?
Many faces of “parallel” computing
• Distributed computing (DC)
– Multiple computers remote from each other, each having a role in a computation problem
– Loose parallelism
• Cluster computing (CC)
– DC on a LAN, with homogeneous processing nodes (typically PCs), to form what appears to be a single, highly-available system
• Grid computing (GC)
– A potentially very large DC operating as an anarchy
– As large as the Internet/WWW
– Parallelism at stake?
• Cluster: chicken farm
• Grid: animal zoo
– Enterprise grid: a private zoo in the backyard
• Distributed system: a “static” zoo where the animals are tame
From cluster to grid
• One of the main ideas of cluster computing is that, to the outside world, the cluster appears to be a single system, which is also the reason for clustering’s extreme successes
• A cluster can be programmed like a single computer, almost
• Can a grid? Should a grid?
Grid vs. service oriented computing
• To many, the two are almost synonymous
– Just as the Web and the Internet are almost synonymous
• SOC refers to binding to Web services at runtime
– Grid is about the provisioning of resources
– The current grid’s use of Web services was out of convenience (my opinion)
– But the service paradigm should not be the only possible form of computing with the grid
• You want a hamburger
– You can either go to McDonald’s or do it yourself
• SOC applied to the Web (as a grid) is probably best for commercial applications (McDonald’s)
• For scientific or grand challenge problems, we need to program the grid (DIY)
The Grid as a Computer?
• Cluster: more nodes than microprocessors in each node (MPI)
• Constellation: more microprocessors in each node than nodes (OpenMP)
• Tightly integrated MPP
• Grid?
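The cluster/constellation distinction above comes down to comparing two counts. A minimal sketch in Python; the function name `classify` is hypothetical, and a tie is treated as a cluster only because the slide does not say how to break it.

```python
def classify(num_nodes: int, procs_per_node: int) -> str:
    """Classify a machine by the rule on the slide.

    cluster:       more nodes than microprocessors in each node
    constellation: more microprocessors in each node than nodes
    """
    if num_nodes < 1 or procs_per_node < 1:
        raise ValueError("counts must be positive")
    # Assumption: equal counts are reported as a cluster (the slide leaves this open).
    if procs_per_node > num_nodes:
        return "constellation"
    return "cluster"


print(classify(num_nodes=128, procs_per_node=2))   # cluster
print(classify(num_nodes=4, procs_per_node=64))    # constellation
```

A grid does not fit either branch cleanly, since its node count and per-node processor count change as resources come and go.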
Grid vs. clustering
• Grid: heterogeneous resources (computation, storage, networking, OS, etc.)
• Grid: dynamic (resources come and go)
• Grid: distributed over a local or wide area
• Grid: increased scalability (no latency/proximity limits)
• Grid: multiple ownerships
• Grid and cluster are complementary
Issues
• Heterogeneity
• Availability
• Latencies
• Security and trustworthiness
• Load balancing!
• Towards single system image (SSI)
Load Balancing is Key
Parallel applications
• Multiple processes, multiple threads
• Application types
– SIMD (Single Instruction, Multiple Data)
• SPMD (Single Program, Multiple Data)
– MIMD (Multiple Instruction, Multiple Data)
MIMD
Need for process/thread migration
• SIMD: Remapping (re-partitioning) of data works
• For MIMD, “processes” might grow or shrink, or come and go
– Remapping of processes = process migration
– Processes with large footprints (i.e., many threads) might benefit from spreading their threads across machines
• Process migration
– Initially (load distribution)
– Dynamic
– State capture and resume
• Thread migration
– Threads are often tightly coupled and share much data
– Beneficial?
– A big challenge
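For the data-parallel case, remapping can be as simple as recomputing how many data items each node owns. A minimal sketch, assuming each node reports a relative speed; `repartition` and the speed figures are hypothetical and not part of any system described in these slides.

```python
def repartition(num_items: int, speeds: list) -> list:
    """Split num_items across nodes in proportion to their relative speeds.

    Uses the largest-remainder method so the shares sum to num_items exactly.
    """
    if num_items < 0 or not speeds or any(s <= 0 for s in speeds):
        raise ValueError("need a non-negative item count and positive speeds")
    total = sum(speeds)
    exact = [num_items * s / total for s in speeds]
    shares = [int(x) for x in exact]          # round every share down first
    leftover = num_items - sum(shares)
    # Hand the remaining items to the nodes with the largest fractional parts.
    by_fraction = sorted(range(len(speeds)),
                         key=lambda i: exact[i] - shares[i], reverse=True)
    for i in by_fraction[:leftover]:
        shares[i] += 1
    return shares


# Four equal nodes, then the same data after one node slows to half speed.
print(repartition(1000, [1.0, 1.0, 1.0, 1.0]))
print(repartition(1000, [1.0, 1.0, 1.0, 0.5]))
```

This only works when the work is tied to data. When the processes themselves differ, as in MIMD, there is nothing to re-partition and the process has to move instead.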
Sidetrack: Thread Migration
Thread migration works!
• Probably not suitable for grid, fine for cluster where latencies are upper-bounded
• Our experience: the JESSICA2 system
– A distributed JVM
– Dynamic Java thread migration
– JIT compilation
– Global object space
– I/O redirection
Java-Enabled Single-System-Image Computing Architecture
JESSICA2 Architecture
[Figure: a multithreaded Java program running on six JESSICA2 JVMs (one master, five workers), connected by thread migration and a global object space. Labels: JIT compiler mode, portable Java frame]
G-JavaMPI
Towards “grid as a parallel computer”
• M-JavaMPI
– “M” stands for migration
– For cluster
• G-JavaMPI
– An outgrowth of M-JavaMPI
– “G” for grid
G-JavaMPI
[Figure: task migration (“grid traveler”) between two organizations. Labels: identity mapping, policy space, warranted]
A Grid Middleware for Transparent MPI Task Migration and Runtime Scheduling
• Grid-enabled implementation of the Java language bindings of the MPI v1.1 standard
• On top of Globus Toolkit (e.g., job startup, security) and MPICH-G2 (MPI communication)
• Combines the high-level message passing interface with the Java language to support portable message-passing programming in a grid
• It allows you to run MPI applications written in Java across multiple machines with different architectures belonging to multiple organizations
• Classes of problems implemented in C-MPI (for example, MPICH) can be easily ported to G-JavaMPI, with the additional support of process migration
• A better choice for those who prefer an object-oriented programming style
Special features
• Transparent dynamic process migration
– Load balancing
– Fault tolerance
– Resource co-allocation
• Fine-grain access control through delegation
– Multi-hop delegation
– Cross-organization resource sharing
G-PASS
• Globus operates at the level of users, G-PASS at the level of processes
• A process can be migrated multiple times across multiple grid nodes
• The process (a “traveler”) obtains its privileges via a security instance (the “passport”) instead of from the hosts
• Permission to access a resource in the destination host is granted by simply checking the signature in the security instance
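The “passport” idea can be illustrated with a signed token that the destination host verifies without consulting its own account database. This is a sketch only: it uses a shared-secret HMAC from Python's standard library as a stand-in for whatever signature scheme G-PASS actually uses (the slides do not say), and the field names and the `issue`/`check` helpers are hypothetical.

```python
import hashlib
import hmac
import json

def issue(key: bytes, process_id: str, rights: list) -> dict:
    """Create a passport: the granted rights plus a signature over them."""
    body = json.dumps({"process": process_id, "rights": sorted(rights)},
                      sort_keys=True)
    sig = hmac.new(key, body.encode(), hashlib.sha256).hexdigest()
    return {"body": body, "sig": sig}

def check(key: bytes, passport: dict, wanted: str) -> bool:
    """Grant access only if the signature is valid and the right is listed."""
    expected = hmac.new(key, passport["body"].encode(),
                        hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, passport["sig"]):
        return False
    return wanted in json.loads(passport["body"])["rights"]

key = b"demo-key-shared-by-issuer-and-hosts"   # assumption: shared secret
passport = issue(key, "rank-3", ["read:/data/in", "write:/data/out"])

print(check(key, passport, "read:/data/in"))     # True
print(check(key, passport, "exec:/bin/sh"))      # False

# Tampering with the body invalidates the signature.
forged = dict(passport, body=passport["body"].replace("read", "exec"))
print(check(key, forged, "exec:/data/in"))       # False
```

Because the passport travels with the process, the same check applies unchanged at every hop of a multi-hop migration.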
Instance-oriented delegation
GSI = Grid Security Infrastructure
Main components of G-JavaMPI
Runtime analysis
• Based on JVMTI (the JVM Tool Interface): dynamically adds instrumentation code to Java bytecode
• Identifies the execution hotspots in the process
• Analyzes process synchronization relationships to determine per-process computational requirements and communication workload
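The instrumentation idea can be shown in miniature without JVMTI. The sketch below is an analogy, not the G-JavaMPI mechanism: a Python decorator stands in for injected bytecode, and the `force`/`step` functions are hypothetical placeholders for application code.

```python
import functools
import time
from collections import defaultdict

calls = defaultdict(int)       # function name -> number of calls
seconds = defaultdict(float)   # function name -> accumulated wall time

def instrument(fn):
    """Wrap fn so every call is counted and timed."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return fn(*args, **kwargs)
        finally:
            calls[fn.__name__] += 1
            seconds[fn.__name__] += time.perf_counter() - start
    return wrapper

@instrument
def force(a, b):
    return (a - b) * 0.5

@instrument
def step(bodies):
    # The inner loop calls force() once per ordered pair of bodies.
    return [sum(force(a, b) for b in bodies) for a in bodies]

step(list(range(50)))

hotspot = max(calls, key=calls.get)
print(hotspot, calls[hotspot])   # force 2500
```

Here the hotspot is picked by call count so the result is deterministic; the `seconds` table holds the timing data a scheduler would more likely use.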
Dynamic instrumentation
Communication performance
JMPI-BLAST cost breakdown
Ray tracing experiment
Message passing daemons
• Manage messages in queues
• Send/receive messages on behalf of processes
• Support multiple simultaneous applications
• Profiling of communication behaviors
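The daemon's responsibilities listed above can be sketched as a small class. This is a single-process illustration under stated assumptions: the `Daemon` class and its method names are hypothetical, and the real daemons exchange messages between grid nodes over MPICH-G2, which is not modelled here.

```python
from collections import defaultdict, deque

class Daemon:
    """One daemon per grid node; it queues messages for local processes."""

    def __init__(self):
        # One queue per (application, destination rank), so several
        # applications can share the daemon without mixing messages.
        self.queues = defaultdict(deque)
        self.sent = defaultdict(int)   # (app, src, dst) -> message count

    def send(self, app, src, dst, payload):
        """Accept a message on behalf of process src and queue it for dst."""
        self.queues[(app, dst)].append((src, payload))
        self.sent[(app, src, dst)] += 1   # profiling of communication

    def receive(self, app, dst):
        """Deliver the oldest queued message for dst, or None if empty."""
        queue = self.queues[(app, dst)]
        return queue.popleft() if queue else None

daemon = Daemon()
daemon.send("blast", src=0, dst=1, payload="query-17")
daemon.send("raytrace", src=0, dst=1, payload="tile-3")
daemon.send("blast", src=2, dst=1, payload="query-18")

print(daemon.receive("blast", 1))      # (0, 'query-17')
print(daemon.receive("raytrace", 1))   # (0, 'tile-3')
print(daemon.sent[("blast", 0, 1)])    # 1
```

Keeping the queues in the daemon rather than in the process is what lets a process leave a node without losing messages that have already arrived for it.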
[Figure: four grid nodes, each running a message passing daemon; messaging between the daemons goes through MPICH-G2]
Migration
• Capture process status through JVMDI (JVMTI in latest Sun J2SE 1.5)
• Recognize branching code in Java bytecode, find appropriate location to stop execution
• Recognize file operations
• Instrumentation of migration exception handlers
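Capture and resume can be illustrated with a computation that keeps its whole status in one object and only stops at a known safe point. This is a simplified sketch, not the JVMDI-based mechanism: the state is explicit rather than recovered from stack frames, `pickle` stands in for the status dump, and the `run` function is hypothetical.

```python
import pickle

def run(state, steps, migrate_at=None):
    """Advance a running sum; stop early at a safe point if asked to migrate.

    state is a dict holding everything needed to resume: the loop index
    and the partial result. Returns (state, finished).
    """
    while state["i"] < steps:
        # Safe point: the top of the loop, where no update is half done.
        if migrate_at is not None and state["i"] == migrate_at:
            return state, False
        state["total"] += state["i"] ** 2
        state["i"] += 1
    return state, True

# Source node: run until the migration request is honoured, then dump.
state, finished = run({"i": 0, "total": 0}, steps=10, migrate_at=4)
dump = pickle.dumps(state)          # in practice this would go to a file

# Destination node: restore the status and carry on from the same point.
restored = pickle.loads(dump)
restored, finished = run(restored, steps=10)

print(finished, restored["total"])  # True 285
```

The loop head plays the role of the “appropriate location to stop execution”: stopping anywhere else would leave the index and the partial sum out of step with each other.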
[Figure: process migration. The source process dumps its status to a file through JVMDI; the destination process restores its status from the file through JVMDI]
Frames and Runtime States Restoration
Migration in action
Migration-transparent message passing
• Mapping virtual process ranks to physical locations in location tables
• Processes are not allowed to migrate while they are passing messages
• Messages are sequenced; legacy messages left at the previous node are collected and re-sent to the new node
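The location table and message forwarding described above can be sketched in a few lines. This is an illustration under stated assumptions: the `Router` class, its methods, and the node names are hypothetical, and everything runs in one process, so the rule against migrating during message passing holds trivially.

```python
class Router:
    """Maps virtual ranks to nodes and forwards messages after a migration."""

    def __init__(self, placement):
        self.location = dict(placement)          # rank -> node name
        self.inbox = {n: [] for n in set(placement.values())}
        self.next_seq = {}                       # (src, dst) -> next number

    def send(self, src, dst, payload):
        """Stamp the message with a sequence number and deliver by rank."""
        seq = self.next_seq.get((src, dst), 0)
        self.next_seq[(src, dst)] = seq + 1
        self.inbox[self.location[dst]].append((dst, src, seq, payload))

    def migrate(self, rank, new_node):
        """Move rank to new_node and re-send its undelivered messages."""
        old_node = self.location[rank]
        self.inbox.setdefault(new_node, [])
        legacy = [m for m in self.inbox[old_node] if m[0] == rank]
        self.inbox[old_node] = [m for m in self.inbox[old_node] if m[0] != rank]
        self.inbox[new_node].extend(legacy)
        self.location[rank] = new_node           # update the location table

    def receive(self, rank):
        """Return this rank's messages, ordered per sender by sequence number."""
        node = self.location[rank]
        mine = sorted((m for m in self.inbox[node] if m[0] == rank),
                      key=lambda m: (m[1], m[2]))
        self.inbox[node] = [m for m in self.inbox[node] if m[0] != rank]
        return [(src, payload) for _, src, _, payload in mine]

router = Router({0: "hku", 1: "hku", 2: "cngrid"})
router.send(0, 1, "a")            # queued at rank 1's current node
router.migrate(1, "cngrid")       # rank 1 moves; "a" follows it
router.send(0, 1, "b")            # sender still addresses rank 1, not a node

print(router.receive(1))          # [(0, 'a'), (0, 'b')]
```

The sender never learns that rank 1 moved, which is the sense in which the migration is transparent to the application.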
• N-body simulation (body shape: loop, 10000 bodies, 16 processes)
• Periodic random process migrations, for demo purposes
• Ray tracing application on CNGrid
• The scheduler periodically checks the workload in grid nodes and moves some processes to idle nodes
• The Java applet displays the result and migration information
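One pass of such a scheduler might look like the sketch below. It is not the G-JavaMPI scheduler: workload is approximated by the number of processes on a node, and the `rebalance` function, its threshold parameter, and the node names are all hypothetical.

```python
def rebalance(placement, idle_threshold=0):
    """Move one process from the busiest node to each idle node.

    placement maps node name -> list of process ranks running there.
    Returns the list of (rank, from_node, to_node) moves performed.
    """
    moves = []
    idle = [n for n, procs in placement.items() if len(procs) <= idle_threshold]
    for target in idle:
        busiest = max(placement, key=lambda n: len(placement[n]))
        # Stop when moving a process would no longer even things out.
        if len(placement[busiest]) - len(placement[target]) < 2:
            break
        rank = placement[busiest].pop()
        placement[target].append(rank)
        moves.append((rank, busiest, target))
    return moves

placement = {"node-a": [0, 1, 2, 3], "node-b": [4], "node-c": []}
print(rebalance(placement))   # [(3, 'node-a', 'node-c')]
print(placement)
```

Calling `rebalance` on a timer gives the periodic behaviour; each returned move would then be handed to the migration layer.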
To find out more
• L. Chen, T.C. Ma, C.L. Wang, F.C.M. Lau, and S.P. Li, “G-JavaMPI: A Grid Middleware for Transparent MPI Task Migration”, in Engineering the Grid: Status and Perspective, American Scientific Publishers, 2006, to appear.
• T.C. Ma, C.L. Wang, L. Chen, and F.C.M. Lau, “G-PASS: An Instance-oriented Security Infrastructure for Grid Travelers”, Concurrency and Computation: Practice and Experience, to appear.
http://www.cs.hku.hk/~lchen2/G-JavaMPI/
The Future of Grid
What next?
• Grid computing today is like a “pot luck” supper
– Everyone brings and contributes a dish
– And … surprise!
• There really is “no free lunch”
– Everyone shares some of the costs
– Is it worth it?
POTLUCK DINNER
• To minimize the “surprises” (quality of service)
– Let the pros - the chefs - do it
– You sit back, relax, and enjoy, and pay for and only for what you consume
• Grid now is a private club
• But eventually it should be like …
– Ubiquitous
– Invisible (the machinery behind)
– It’s my “cup of coffee”
The “pervasive grid” – everyone’s club
The grid (invisible computing)
Thin clients
“To use a computer is fun, but not to manage it”
Edge computing
• Person … device … middleware (proxies) ... Internet
• The abstract cloud moves with the client – personalized “cuddleware”, nomadic computing
[Figure labels: Internet, proxies united, client, metropolis]
Problems worth pursuing
• Edge computing → “seamless”
– New protocols for the edge
• The continuum → the network is the computer
– Collaborative models and mechanisms, esp. at the edge
• The global grid → invisible, “PC” disappearing
• The device
– Adaptation
• On-demand code composition
– The SOC approach?
• Content
– HTML
• UI description languages
– New paradigms for user interaction in small devices (input and output)