Programming Parallel and Distributed Systems for Large Scale Numerical Simulation Application, Christian Perez, INRIA researcher, IRISA, Rennes, France


Programming Parallel and Distributed Systems for Large Scale Numerical Simulation Application

Christian Perez, INRIA researcher

IRISA, Rennes, France

2

IRISA

IRISA = INRIA Rennes + CNRS + Univ. of Rennes I + INSA

600 people (March 2006): 233 researchers, 184 PhD students, 120 engineers and technicians

3

Project team composition

Tenured personnel (11): F. André (Prof., IFSIC), G. Antoniu (CR, INRIA), J-P. Banâtre (Prof., IFSIC), L. Bougé (Prof., ENS), Y. Jégou (CR, INRIA), D. Margery (IR, INRIA, 50%), C. Morin (DR, INRIA), P. Morillon (IE, IFSIC), J.L. Pazat (Prof., INSA), C. Perez (CR, INRIA), T. Priol (DR, INRIA)

Post-docs (5), PhD students (11), Engineers (6)

4

PARIS objectives

Objects of study: Clusters and Clusters of Clusters (aka Grids)

Main objectives:
- Study and design operating systems, middleware and runtime systems to make the programming of such computing infrastructures easier, with high performance, high availability and scalability
- Design advanced programming models for the programming of Clusters of Clusters, combining both parallel and distributed computing paradigms
- Evaluate the proposed operating systems, runtimes and middleware through the development of advanced software (not only software prototypes)
- Transfer technology through collaboration with industrial partners

5

Research activities

Operating system and runtime systems for clusters and cluster federations
- Single System Image operating system
- Grid-aware operating system

Middleware for computational Grids
- Component-based middleware for computational Grids: communication framework, parallel components, deployment of parallel components within a grid, adaptive components

Large-scale data management for Grids
- Coupling of distributed shared memories
- Data sharing services for mutable data using a P2P approach

Advanced models for the Grids
- High-order Gamma
- Enactment of workflows based on a chemical metaphor

Experimental Grid infrastructures
- The Grid'5000 testbed

6

Grid'5000 testbed

Sites:
- Orsay: 1000 (684)
- Rennes: 518 (658)
- Bordeaux: 500 (96)
- Toulouse: 500 (116)
- Lyon: 500 (252)
- Grenoble: 500 (270)
- Sophia Antipolis: 500 (434)
- Lille: 500 (106)
- Nancy: 500 (94)

10 Gbps dark fiber, dedicated lambda, fully isolated traffic! Provided by RENATER

7

Component-based middleware for computational Grids

Apply modern software practices to scientific computing

High performance & scalability
- Specialized component models for high-performance computing (clusters & grids)
- Code-coupling applications
- Parametric applications

Increase the level of abstraction
- SPMD paradigm (MxN communications)
- Master-worker paradigm
- Data sharing paradigm

High-performance communication
- Independence with respect to the networking technologies

Adaptation
- Adaptation to the dynamic behavior of grids

Deployment
- Map components to available resources

Technology independent (CCM, CCA, Fractal, CoreGRID GCM)

[Figure: a code-coupling application built from SPMD components and a master-slaves component: Thermal Dynamics, Structural Mechanics and Optics codes coupled as components]
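As an illustration of the component abstraction discussed above, here is a toy model with "provides" and "uses" ports, so that code coupling reduces to connecting ports. This is a minimal sketch of the general idea only; the names are hypothetical and the real models referred to here are CCM, CCA, Fractal and the CoreGRID GCM.

```python
class Component:
    """Toy component: 'provides' ports expose services, 'uses' ports
    are filled in when two components are connected."""
    def __init__(self, name):
        self.name = name
        self.provides = {}   # port name -> callable service
        self.uses = {}       # port name -> connected service (or None)

def connect(user, use_port, provider, provide_port):
    """Bind a 'uses' port of one component to a 'provides' port of another."""
    user.uses[use_port] = provider.provides[provide_port]

# Couple a thermal component to a mechanics component (made-up example).
thermal = Component("thermal")
thermal.provides["temperature"] = lambda: 300.0  # made-up value

mech = Component("mechanics")
mech.uses["temperature"] = None
connect(mech, "temperature", thermal, "temperature")

print(mech.uses["temperature"]())  # 300.0
```

The point of the abstraction is that the mechanics code never names the thermal code directly; the coupling is established externally by connecting ports, which is what makes deployment and substitution of implementations possible.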

8

High Performance Components for code coupling: SPMD paradigm

SPMD component
- Parallelism is a non-functional property of a component: it is an implementation issue
- Collection of sequential components
- SPMD execution model
- Support of distributed arguments
- API for data redistribution
- API for communication scheduling w.r.t. network properties
- Support of parallel exceptions
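The distributed-argument support above requires knowing which data blocks each sender must ship to each receiver when an MxN redistribution takes place. As an illustrative sketch (not the actual GridCCM API; function names are hypothetical), here is the communication matrix for redistributing a 1D block-distributed array from M sender processes to N receiver processes:

```python
def block_ranges(n_elems, n_procs):
    """Contiguous [start, end) ranges of a 1D block distribution."""
    base, rem = divmod(n_elems, n_procs)
    ranges, start = [], 0
    for p in range(n_procs):
        size = base + (1 if p < rem else 0)
        ranges.append((start, start + size))
        start += size
    return ranges

def comm_matrix(n_elems, m_senders, n_receivers):
    """matrix[i][j] = number of elements process i sends to process j,
    i.e. the overlap of sender i's block with receiver j's block."""
    src = block_ranges(n_elems, m_senders)
    dst = block_ranges(n_elems, n_receivers)
    return [[max(0, min(se, de) - max(ss, ds))
             for ds, de in dst] for ss, se in src]

# Redistribute 12 elements from 2 senders to 3 receivers:
# senders own [0, 6) and [6, 12); receivers own [0, 4), [4, 8), [8, 12).
print(comm_matrix(12, 2, 3))  # [[4, 2, 0], [0, 2, 4]]
```

Each nonzero entry corresponds to one point-to-point message; the scheduling API then decides in which order these messages are actually sent.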

[Figure: aggregated bandwidth (MB/s, 0-300) and latency (microseconds, 0-160) for component configurations 1->1, 2->2, 4->4 and 8->8, comparing Java/Ethernet, C++/Ethernet and C++/Myrinet]

[Figure: GridCCM runtime architecture: the application sees an application-view management layer (data distribution description) and a communication management layer (communication matrix computation, communication matrix scheduling, communication execution), built on a redistribution library, a scheduling library and communication libraries (e.g. MPI), on top of CORBA stubs/skeletons and the Object Request Broker]
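The communication-management layer computes a communication matrix and then schedules it before executing the transfers. One simple way to limit link contention, shown here purely as an illustrative heuristic and not as the actual GridCCM scheduling library, is to greedily group the transfers into rounds in which each sender and each receiver handles at most one message:

```python
def schedule_rounds(matrix):
    """Group the nonzero (sender, receiver) transfers of a communication
    matrix into rounds; within a round, every process is endpoint of at
    most one message, so messages of a round do not compete for a NIC."""
    transfers = [(i, j) for i, row in enumerate(matrix)
                 for j, v in enumerate(row) if v > 0]
    rounds = []
    while transfers:
        busy_src, busy_dst, this_round, rest = set(), set(), [], []
        for i, j in transfers:
            if i not in busy_src and j not in busy_dst:
                busy_src.add(i)
                busy_dst.add(j)
                this_round.append((i, j))
            else:
                rest.append((i, j))
        rounds.append(this_round)
        transfers = rest
    return rounds

# Two senders, three receivers (as in a 2 -> 3 redistribution):
m = [[4, 2, 0], [0, 2, 4]]
print(schedule_rounds(m))  # [[(0, 0), (1, 1)], [(0, 1), (1, 2)]]
```

A production scheduler would additionally weight the rounds by message sizes and network properties (latency, bandwidth per link), which is exactly what the "w.r.t. network properties" bullet on this slide refers to.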

[Figure: achieved performance (75-100%) with and without communication scheduling between components A and B, for 6 to 16 processors per parallel component]

9

High Performance Components for parametric codes: Master-worker paradigm

Collection of components
- Simple model for application developers
- A request is delivered to an instance of the collection
- The selection of the collection instance is delegated to a request transport policy
- Enables the use of existing master-worker environments (DIET, ...)

Resource infrastructure independence
- No dealing with the number of workers
- No dealing with request delivery concerns
- Valid for ADL and non-ADL based component models

Preliminary results
- Fluid motion estimation, experiment on Grid'5000
- 4003 components, 974 processors, 7 sites
- Speedup of 213 with a round-robin pattern
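The request-transport policy mentioned above can be pictured as a proxy standing between the master and the worker collection. A minimal sketch of a round-robin delivery pattern, with hypothetical names (the real set of "request delivery" patterns is selected via the ADL, as shown in the figure below):

```python
import itertools

class RoundRobinProxy:
    """Hide a collection of workers behind a single provided port,
    delivering each request to the next worker in turn (round-robin).
    The master only sees deliver(); the number of workers is invisible."""
    def __init__(self, workers):
        self._cycle = itertools.cycle(workers)

    def deliver(self, request):
        worker = next(self._cycle)
        return worker(request)

# Hypothetical workers: each just tags the request with its identity.
workers = [lambda req, i=i: (i, req) for i in range(3)]
proxy = RoundRobinProxy(workers)
print([proxy.deliver(r) for r in "abcd"])
# [(0, 'a'), (1, 'b'), (2, 'c'), (0, 'd')]
```

Because the policy lives in the proxy, swapping round-robin for, say, a load-aware policy changes nothing in the master's code, which is the "resource infrastructure independence" claimed on this slide.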

[Figure: programmer view of a master bound to a worker collection through an exposed provided port; a round-robin proxy implements one of a set of "request delivery" patterns; the number of workers and the policy pattern are selected from the resources when the abstract ADL is turned into a concrete ADL with an XML collection definition]

10

PadicoTM: Communication framework

Provides an open integration platform to combine various communication middleware and runtimes
- Message-based runtimes (MPI, PVM, ...)
- DSM-based runtimes (TreadMarks, ...)
- RPC- and RMI-based middleware (DCE, CORBA, Java, ...)

Allows several communication middleware and runtimes to share various networking technologies
- Ethernet, Myrinet, InfiniBand, Quadrics, SCI

Last but not least: get the maximum performance out of the network!

Available as open source software under the GPL licence
- http://padico.gforge.inria.fr
- > 200 downloads since July 2002
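The core idea of letting several middleware share one network can be sketched as a multiplexing transport: each middleware "personality" registers a handler, and messages are tagged with the personality they belong to. This is only an illustration of the principle, with made-up names, not the PadicoTM implementation (which dispatches through its internal engine onto Madeleine):

```python
class Transport:
    """A single shared endpoint multiplexing traffic from several
    middleware 'personalities' (e.g. MPI and CORBA) over one network."""
    def __init__(self):
        self._handlers = {}

    def register(self, personality, handler):
        """Each middleware registers the handler for its own messages."""
        self._handlers[personality] = handler

    def send(self, personality, payload):
        """Tag the payload with its personality and dispatch it; on a real
        network the tagged message would cross the wire first."""
        self._handlers[personality](payload)

net = Transport()
received = []
net.register("mpi", lambda msg: received.append(("mpi", msg)))
net.register("corba", lambda msg: received.append(("corba", msg)))
net.send("mpi", b"halo exchange")
net.send("corba", b"remote invocation")
print(received)
```

The benefit is that MPI and CORBA no longer fight for exclusive ownership of a Myrinet or SCI interface: both go through the same arbitrated transport.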

[Figure: PadicoTM architecture: middleware personalities (MPI/MPICH, CORBA via omniORB, MICO, ORBacus or Orbix/E, JVM/Kaffe, JXTA, DSM via JuxMem or Mome) sit on a personality layer above the PadicoTM core and services; the internal engine maps them onto Marcel (I/O-aware multithreading) and Madeleine (portability across networks: Myrinet, SCI, TCP)]

11

ADAGE: Automatic Deployment of Application in a Grid Environment

Deploy the same application on any kind of resources, from clusters to grids

Supports multi-middleware applications
- MPI + CORBA + JuxMem + ...

Planner as a plugin
- round-robin & random

Some successes
- 29,000 JXTA peers on ~400 nodes
- 4003 components on 974 processors on 7 sites

Alpha support for dynamic applications

[Figure: ADAGE deployment pipeline: MPI and CCM application descriptions are translated into a generic application description, which, together with a resource description and control parameters, drives deployment planning, then deployment plan execution, then application configuration]
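The planning step of the pipeline can be sketched as a pluggable function from a generic application description and a resource description to a deployment plan. The following round-robin planner is in the spirit of ADAGE's round-robin plugin, but it is a toy sketch with hypothetical names, not the actual ADAGE interface:

```python
import itertools

def plan_round_robin(components, resources):
    """Map each component of a (generic) application description onto a
    resource, cycling through the resources: a trivial deployment plan."""
    cycle = itertools.cycle(resources)
    return {c: next(cycle) for c in components}

# One master and three workers over two nodes (made-up names).
plan = plan_round_robin(["master", "w0", "w1", "w2"], ["nodeA", "nodeB"])
print(plan)
# {'master': 'nodeA', 'w0': 'nodeB', 'w1': 'nodeA', 'w2': 'nodeB'}
```

Making the planner a plugin keeps the rest of the pipeline (plan execution, application configuration) unchanged when a smarter placement strategy, e.g. one aware of network topology, is substituted.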