HPCA 2001

Message Passing Interface (MPI) and Parallel Algorithm Design


Page 1:

Message Passing Interface (MPI) and Parallel Algorithm Design

Page 2:

What is MPI?

A message passing library specification

– message-passing model

– not a compiler specification

– not a specific product

For parallel computers, clusters, and heterogeneous networks.

Full-featured

Page 3:

Why use MPI? (1)

Message passing is now mature as a programming paradigm

– well understood

– efficient match to hardware

– many applications

Page 4:

Who Designed MPI?

Vendors

– IBM, Intel, Sun, SGI, Meiko, Cray, Convex, Ncube, …

Research laboratories

– PVM, p4, Zipcode, TCGMSG, Chameleon, Express, Linda, PM (Japan RWCP), AM (Berkeley), FM (HPVM at Illinois)

Page 5:

Vendor-Supported MPI

HP-MPI: HP, Convex SPP
MPI-F: IBM SP1/SP2
Hitachi/MPI: Hitachi
SGI/MPI: SGI PowerChallenge series
MPI/DE: NEC
INTEL/MPI: Intel Paragon (iCC lib)
T.MPI: Telmat Multinode
Fujitsu/MPI: Fujitsu AP1000
EPCC/MPI: Cray & EPCC, T3D/T3E

Page 6:

Research MPI

MPICH: Argonne National Lab. & Mississippi State U.
LAM: Ohio Supercomputer Center
MPICH/NT: Mississippi State U.
MPI-FM: Illinois (Myrinet)
MPI-AM: UC Berkeley (Myrinet)
MPI-PM: RWCP, Japan (Myrinet)
MPI-CCL: Calif. Tech.

Page 7:

Research MPI

CRI/EPCC MPI: Cray Research and Edinburgh Parallel Computing Centre (Cray T3D/E)

MPI-AP: Australian National U., CAP Research Program (AP1000)

W32MPI: Illinois, Concurrent Systems

RACE-MPI: Hughes Aircraft Co.

MPI-BIP: INRIA, France (Myrinet)

Page 8:

Language Binding

MPI 1: C, Fortran (for MPICH-based implementations)

MPI 2: C, C++, Fortran

Java:
– through the Java Native Interface (JNI): mpiJava, JavaMPI
– implementing the MPI package in pure Java: MPIJ (DOGMA project)
– JMPI (by MPI Software Technology)

Page 9:

Main Features of MPI

Page 10:

“Communicator”

Identifies the process group and context with respect to which the operation is to be performed.

In a parallel environment, processes need to know about each other (“naming”: machine name, IP address, process ID).
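As an illustration of groups and contexts, here is a minimal C sketch (not from the original slides) that splits MPI_COMM_WORLD into two sub-communicators with MPI_Comm_split; the even/odd split by rank is an arbitrary example choice.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int world_rank, sub_rank;
    MPI_Comm sub_comm;                    /* handle for the new communicator */

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

    /* Split MPI_COMM_WORLD into two communicators: even ranks and odd ranks.
       The "color" selects the group; the key (world_rank) orders ranks inside it. */
    MPI_Comm_split(MPI_COMM_WORLD, world_rank % 2, world_rank, &sub_comm);
    MPI_Comm_rank(sub_comm, &sub_rank);

    printf("World rank %d has rank %d in its sub-communicator\n",
           world_rank, sub_rank);

    MPI_Comm_free(&sub_comm);
    MPI_Finalize();
    return 0;
}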

Page 11:

Communicator (2)

[Diagram: four communicators, each grouping several processes; one communicator is nested within another communicator.]

Processes in different communicators cannot communicate.

The same process can exist in different communicators.

Page 12:

Point-to-point Communication

The basic point-to-point communication operators are send and receive.

Communication modes (a blocking vs. non-blocking sketch follows below):
– normal mode (blocking and non-blocking)
– synchronous mode
– ready mode (to allow access to fast protocols)
– buffered mode
– …
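A minimal C sketch contrasting blocking and non-blocking transfers in normal (standard) mode; the tag 0, the single-integer payload, and the two-process layout are illustrative assumptions.

#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, data = 42, buf = 0;
    MPI_Request req;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        /* Blocking standard-mode send: returns when the buffer may be reused. */
        MPI_Send(&data, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        /* Non-blocking receive: post the request, overlap other work,
           then wait for completion. */
        MPI_Irecv(&buf, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &req);
        /* ... other computation could go here ... */
        MPI_Wait(&req, &status);
    }

    MPI_Finalize();
    return 0;
}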

Page 13:

Collective Communication

Communication that involves a group of processes, e.g., broadcast, barrier, reduce, scatter, gather, all-to-all, …

Page 14:

MPI Programming

Page 15:

Writing MPI programs

MPI comprises 125 functions. Many parallel programs can be written with just 6 basic functions.

Page 16:

Six basic functions (1)

1. MPI_INIT: Initiate an MPI computation

2. MPI_FINALIZE: Terminate a computation

3. MPI_COMM_SIZE: Determine number of processes in a communicator

4. MPI_COMM_RANK: Determine the identifier of a process in a specific communicator

5. MPI_SEND: Send a message from one process to another process

6. MPI_RECV: Receive a message from another process

Page 17:

A simple program

Program main
begin
  MPI_INIT()
  MPI_COMM_SIZE(MPI_COMM_WORLD, count)
  MPI_COMM_RANK(MPI_COMM_WORLD, myid)
  print("I am ", myid, " of ", count)
  MPI_FINALIZE()
end

Steps: initiate the computation; find the number of processes; find the process ID of the current process; each process prints out its output; shut down.
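A runnable C version of the pseudocode above could look like this sketch (the variable names count and myid follow the slide):

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int count, myid;

    MPI_Init(&argc, &argv);                  /* initiate computation */
    MPI_Comm_size(MPI_COMM_WORLD, &count);   /* number of processes  */
    MPI_Comm_rank(MPI_COMM_WORLD, &myid);    /* ID of this process   */

    printf("I am %d of %d\n", myid, count);  /* each process prints  */

    MPI_Finalize();                          /* shut down            */
    return 0;
}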

Page 18:

Result

I’m 0 of 4   (Process 0)
I’m 2 of 4   (Process 2)
I’m 1 of 4   (Process 1)
I’m 3 of 4   (Process 3)

Page 19:

Another program (2 nodes)

…
MPI_COMM_RANK(MPI_COMM_WORLD, myid)
if myid = 0
  MPI_SEND("Zero", …, …, 1, …, …)
  MPI_RECV(words, …, …, 1, …, …, …)
else
  MPI_RECV(words, …, …, 0, …, …, …)
  MPI_SEND("One", …, …, 0, …, …)
end if
print("Received from %s", words)
…

Process 0 executes the "if" branch (send "Zero" to process 1, then receive); process 1 executes the "else" branch (receive from process 0, then send "One").
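A runnable C sketch of this two-process exchange; the elided arguments (counts, tags, buffer size) are filled in with illustrative values and are not taken from the slide.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int myid;
    char words[16];
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &myid);

    /* Assumes exactly 2 processes, as in the slide. */
    if (myid == 0) {
        MPI_Send("Zero", 5, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
        MPI_Recv(words, 16, MPI_CHAR, 1, 0, MPI_COMM_WORLD, &status);
    } else {
        MPI_Recv(words, 16, MPI_CHAR, 0, 0, MPI_COMM_WORLD, &status);
        MPI_Send("One", 4, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
    }
    printf("Received from %s\n", words);

    MPI_Finalize();
    return 0;
}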

Page 20:

Result

Process 0: Received from One
Process 1: Received from Zero

Page 21:

Collective Communication

Three types of collective operations:

Barrier
• for process synchronization
• MPI_BARRIER

Data movement
• moving data among processes
• no computation
• MPI_BCAST, MPI_GATHER, MPI_SCATTER

Reduction operations
• involve computation
• MPI_REDUCE, MPI_SCAN

Page 22:

Barrier

MPI_BARRIER: used to synchronize the execution of a group of processes.

[Diagram: processes 1..p compute for different amounts of time; each performs the barrier and blocks (waits) until all members reach the same point, then all continue execution.]
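A short C sketch of the pattern in the diagram; compute() is a hypothetical placeholder for per-process work.

#include <mpi.h>

void compute(void) { /* placeholder for per-process work */ }

int main(int argc, char *argv[])
{
    MPI_Init(&argc, &argv);

    compute();                        /* processes finish at different times */
    MPI_Barrier(MPI_COMM_WORLD);      /* block until every process arrives   */
    /* ... all processes continue execution from here ... */

    MPI_Finalize();
    return 0;
}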

Page 23:

Data Movement

Broadcast:
– one member sends the same message to all members

Scatter:
– one member sends a different message to each member

Gather:
– every member sends a message to a single member

All-to-all broadcast:
– every member performs a broadcast

All-to-all scatter-gather (Total Exchange):
– every member performs a scatter (and gather)

Page 24:

MPI Collective Communications

Broadcast (MPI_Bcast)
Combine-to-one / Reduce (MPI_Reduce)
Scatter (MPI_Scatter)
Gather (MPI_Gather)
Collect (MPI_Allgather)
Combine-to-all (MPI_Allreduce)
Scan (MPI_Scan)
All-to-all (MPI_Alltoall)

Page 25:

Data movement (1)

MPI_BCAST: one single process sends the same data to all other processes, itself included.

[Diagram: process 0 holds "FACE"; after the BCAST, processes 0-3 all hold "FACE".]
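A minimal C sketch of the "FACE" broadcast shown above (the 5-byte buffer, leaving room for a terminating NUL, is an assumption):

#include <mpi.h>
#include <stdio.h>
#include <string.h>

int main(int argc, char *argv[])
{
    int rank;
    char msg[5] = "";

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0)
        strcpy(msg, "FACE");          /* only the root has the data initially */

    /* The root (rank 0) sends the same data to all processes, itself included. */
    MPI_Bcast(msg, 5, MPI_CHAR, 0, MPI_COMM_WORLD);

    printf("Process %d now has \"%s\"\n", rank, msg);
    MPI_Finalize();
    return 0;
}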

Page 26:

Data movement (2)

MPI_GATHER: every process (including the root) sends its data to one process, which stores the received pieces in rank order.

[Diagram: processes 0-3 hold "F", "A", "C", "E"; after the GATHER, the root holds "FACE".]
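A matching C sketch for the gather, assuming exactly four processes so that the gathered characters spell "FACE":

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank;
    const char letters[] = "FACE";
    char mine, all[5] = "";           /* assumes 4 processes; all[4] stays NUL */

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    mine = letters[rank % 4];         /* each process contributes one letter */

    /* Every process (including the root) sends its element to rank 0,
       which stores them in rank order. */
    MPI_Gather(&mine, 1, MPI_CHAR, all, 1, MPI_CHAR, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("Root gathered \"%s\"\n", all);
    MPI_Finalize();
    return 0;
}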

Page 27:

Data movement (3)

MPI_SCATTER: a process sends out a message, which is split into several equal parts, and the ith portion is sent to the ith process.

[Diagram: process 0 holds "FACE"; after the SCATTER, processes 0-3 receive "F", "A", "C", "E" respectively.]
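A matching C sketch for the scatter, again assuming exactly four processes, one character per process:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank;
    char face[4] = {'F', 'A', 'C', 'E'};   /* meaningful only on the root */
    char mine;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* The root splits its buffer into equal parts; the ith part goes to rank i.
       Assumes exactly 4 processes. */
    MPI_Scatter(face, 1, MPI_CHAR, &mine, 1, MPI_CHAR, 0, MPI_COMM_WORLD);

    printf("Process %d received '%c'\n", rank, mine);
    MPI_Finalize();
    return 0;
}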

Page 28:

Data movement (4)

MPI_REDUCE (e.g., find the maximum value): combines a value from every process, using a specified operation, and returns the combined value to one process.

[Diagram: processes 0-3 hold 8, 9, 3, 7; after a REDUCE with max, the root receives 9.]
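A C sketch of the maximum-value reduction in the diagram; the per-process values 8, 9, 3, 7 come from the figure and assume four processes.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank, mymax, mine;
    const int values[] = {8, 9, 3, 7};   /* one value per process, from the figure */

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    mine = values[rank % 4];

    /* Combine one value from every process with MPI_MAX; rank 0 gets the result. */
    MPI_Reduce(&mine, &mymax, 1, MPI_INT, MPI_MAX, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("Maximum value is %d\n", mymax);
    MPI_Finalize();
    return 0;
}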

Page 29:

MPI_SCAN

Scan (parallel prefix): a "partial" reduction based upon relative process number; process i receives the reduction of the inputs from processes 0 through i.

[Diagram: scan with op +; each process's result is the running sum of the inputs up to and including its own.]
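A C sketch of an inclusive prefix sum with MPI_Scan; using rank + 1 as each process's input is an arbitrary illustrative choice.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank, in, prefix;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    in = rank + 1;                         /* illustrative input value */

    /* Inclusive scan with +: process i receives in_0 + in_1 + ... + in_i. */
    MPI_Scan(&in, &prefix, 1, MPI_INT, MPI_SUM, MPI_COMM_WORLD);

    printf("Process %d: prefix sum = %d\n", rank, prefix);
    MPI_Finalize();
    return 0;
}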

Page 30:

Example program (1)

Calculating the value of π by:

\pi = \int_0^1 \frac{4}{1 + x^2} \, dx

Page 31:

Example program (2)

……
MPI_BCAST(numprocs, …, …, 0, …)
for (i = myid + 1; i <= n; i += numprocs)
  compute the area for each interval
  accumulate the result in the process's program data (sum)
MPI_REDUCE(&sum, …, …, …, MPI_SUM, 0, …)
if (myid == 0)
  output the result
……
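A runnable C sketch filling in the elided pieces, close to the classic cpi example; choosing n = 10000 intervals and the midpoint rule are assumptions, not from the slide.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int myid, numprocs, i, n = 10000;
    double h, x, sum = 0.0, mypi, pi;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
    MPI_Comm_rank(MPI_COMM_WORLD, &myid);

    /* The root chooses n; every process learns it via broadcast. */
    MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);

    h = 1.0 / (double)n;                       /* width of each interval */
    for (i = myid + 1; i <= n; i += numprocs) {
        x = h * ((double)i - 0.5);             /* midpoint of interval i */
        sum += 4.0 / (1.0 + x * x);            /* height at the midpoint */
    }
    mypi = h * sum;                            /* this process's partial area */

    /* Combine the partial sums; process 0 receives the total. */
    MPI_Reduce(&mypi, &pi, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (myid == 0)
        printf("pi is approximately %.16f\n", pi);

    MPI_Finalize();
    return 0;
}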

Page 32:

[Diagram: "Start calculation!": the intervals under the curve are divided among processes 0-3 ("Calculated by process 0 … process 3"); each process finishes its share ("OK!"), and the combined result gives π = 3.141…]