Overview
• MPI (Message Passing Interface)
  • specification for a standard message-passing library
  • defined by the MPI Forum
• Designed for high performance
  • on both massively parallel machines and workstation clusters
• Widely available
  • both freely available and vendor-supplied implementations
Goals

• Develop a widely used standard for writing message-passing programs.
• Establish a practical, portable, efficient, and flexible standard for message passing.
• Design an application programming interface (not necessarily for compilers or a system implementation library).
• Allow efficient communication: avoid memory-to-memory copying, allow overlap of computation and communication, and allow offload to a communication co-processor where available.
• Allow for implementations that can be used in a heterogeneous environment.
• Allow convenient C and Fortran 77 bindings for the interface.
• Assume a reliable communication interface: the user need not cope with communication failures; such failures are dealt with by the underlying communication subsystem.
Example

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    /* Initialize MPI */
    MPI_Init(&argc, &argv);

    /* Find out my identity in the default communicator */
    int my_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
    int world_size;
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);

    int number;
    if (my_rank == 0) {
        /* Rank 0 sends one int to rank 1 with tag 0 */
        number = -1;
        MPI_Send(&number, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (my_rank == 1) {
        /* Rank 1 receives the int from rank 0 */
        MPI_Recv(&number, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        printf("Process 1 received number %d from process 0\n", number);
    }

    /* Shut down MPI */
    MPI_Finalize();
    return 0;
}
MPI-1

• Point-to-point communication
  • basic, pairwise communication (i.e., send and receive)
• Collective operations
  • process-group collective communication operations (e.g., barrier, broadcast, scatter, gather, reduce); see the sketch after this list
• Process groups & communication contexts
  • how groups of processes are formed and manipulated, how unique communication contexts are obtained, and how the two are bound together into a communicator (e.g., MPI_COMM_WORLD)
• Process topologies
  • a set of utility functions meant to assist in mapping process groups (a linearly ordered set) onto richer topological structures such as multi-dimensional grids
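As a sketch of the collective operations listed above (not from the original deck; the values are illustrative), the program below broadcasts a value from rank 0 to all processes and then reduces each process's rank back onto rank 0:

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Broadcast: rank 0's value is copied to every process */
    int value = (rank == 0) ? 42 : 0;
    MPI_Bcast(&value, 1, MPI_INT, 0, MPI_COMM_WORLD);

    /* Reduce: sum every process's rank onto rank 0 */
    int sum = 0;
    MPI_Reduce(&rank, &sum, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("broadcast value %d, sum of ranks %d\n", value, sum);

    MPI_Finalize();
    return 0;
}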
MPI-1 contd.
• Bindings for Fortran 77 and C
  • gives specific syntax in Fortran 77 and C for all MPI functions, constants, and types
• Environmental management and inquiry
  • explains how the programmer can manage and make inquiries of the current MPI environment
• Profiling interface
  • ability to put performance profiling calls into MPI without access to the MPI source code; a minimal wrapper is sketched below
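The profiling interface works because every MPI function is also reachable under a shifted PMPI_ name, so a tool can interpose its own definition without touching the MPI source. A minimal sketch (the counter is illustrative, not part of MPI):

#include <mpi.h>

/* Illustrative counter, not part of MPI */
static int send_count = 0;

/* Intercept MPI_Send: record the call, then forward to the
   real implementation via its PMPI_ entry point */
int MPI_Send(const void *buf, int count, MPI_Datatype datatype,
             int dest, int tag, MPI_Comm comm)
{
    send_count++;
    return PMPI_Send(buf, count, datatype, dest, tag, comm);
}

Linking this file ahead of the MPI library makes every MPI_Send in the application pass through the wrapper.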
MPICH
• Freely available implementation of the MPI specification
  • Argonne National Laboratory, Mississippi State University
• Portability and high performance
• "CH" => "Chameleon"
  • a symbol of adaptability
• Others – LAM, CHIMP-MPI, Unify, etc.
  • focus on the workstation environment
Portability of MPICH
• Distributed-memory parallel supercomputers
  • Intel Paragon, IBM SP2, Meiko CS-2, Thinking Machines CM-5, nCUBE-2, Cray T3D
• Shared-memory architectures
  • SGI Onyx, Challenge, and Power Challenge, IBM SMPs, the Convex Exemplar, the Sequent Symmetry
• Networks of workstations
  • Ethernet-connected Unix workstations (possibly from multiple vendors)
  • Sun, DEC, HP, SGI, IBM, Intel
MPICH Architecture
• ADI (Abstract Device Interface)
  • central mechanism for portability
  • many implementations of the ADI exist
  • MPI functions are implemented in terms of ADI macros and functions
  • not specific to the MPI library – can be used for any high-level message-passing library
ADI
• A set of function definitions
• Four sets of functions:
  • specifying a message to be sent or received
  • moving data between the API and the message-passing hardware
  • managing lists of pending messages (both sent and received)
  • providing basic information about the execution environment (e.g., how many tasks there are)
Features of MPICH

• Groups
  • an ordered list of process identifiers
  • stored as an integer array
  • a process's rank in a group is its index in the list (see the sketch after this list)
• Communicators
  • MPICH intracommunicators and intercommunicators use the same structure
  • both have a local group and a remote group – identical (intra) or disjoint (inter)
  • send and receive contexts – equal (intra) or different (inter)
  • contexts are integers
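A small sketch of the group API described above (standard MPI calls; the printed message is illustrative): extracting the group behind MPI_COMM_WORLD shows that a process's rank in the group is its index in the group's ordered list.

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int comm_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &comm_rank);

    /* Extract the group behind the communicator and query
       this process's rank (i.e., its index) within it */
    MPI_Group world_group;
    MPI_Comm_group(MPI_COMM_WORLD, &world_group);

    int group_rank;
    MPI_Group_rank(world_group, &group_rank);

    printf("communicator rank %d == group rank %d\n",
           comm_rank, group_rank);

    MPI_Group_free(&world_group);
    MPI_Finalize();
    return 0;
}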
Features of MPICH

• Collective operations
  • implemented on top of point-to-point operations
  • some vendor-specific collective operations (Meiko, Intel, and Convex)
• Job startup
  • the MPI Forum did not standardize the mechanism for starting jobs
  • mpirun

    mpirun -np 12 myprog

• Command-line arguments and standard I/O

    mpirun -np 64 myprog -myarg 13 < data.in > results.out
    mpirun -np 64 -stdin data.in myprog -myarg 13 > results.out

• Useful commands

    mpicc -c myprog.c
Features of MPICH
MPE (Multi-Processing Environment) extension library

• Parallel X graphics – routines giving all processes access to a shared X display
• Logging – time-stamped event trace files (a minimal sketch follows below)
• Sequential sections – executed by one process at a time, in rank order
• Error handling – MPI_Errhandler_set
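A minimal logging sketch, assuming the classic MPE logging API (MPE_Init_log, MPE_Log_event, MPE_Finish_log); the event names, the state color, and the log file name are illustrative:

#include <mpi.h>
#include <mpe.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    MPE_Init_log();

    /* Obtain event IDs and describe the state they bracket */
    int ev_start = MPE_Log_get_event_number();
    int ev_end   = MPE_Log_get_event_number();
    MPE_Describe_state(ev_start, ev_end, "compute", "red");

    MPE_Log_event(ev_start, 0, "begin compute");
    /* ... computation being traced ... */
    MPE_Log_event(ev_end, 0, "end compute");

    /* Writes the time-stamped event trace file */
    MPE_Finish_log("myprog");

    MPI_Finalize();
    return 0;
}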
Contributions of MPICH

• MPICH has succeeded in popularizing the MPI standard
• Encouraged vendors to provide MPI to their customers
  • by helping to create demand
  • by offering them a convenient starting point
MPI-2
• Parallel I/O
• Dynamic process management
• One-sided communication
• New language bindings – C++ & F90
Sequential I/O
• Good for small process counts (~100) and small datasets (~MB)
• Does not scale to large process counts (~100K) or large datasets (~TB); the sketch below shows the pattern
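A sketch of the sequential pattern (not from the deck; the file name and chunk size are illustrative): one process reads the whole file with ordinary C I/O and scatters it, so the file system sees a single reader no matter how many processes run.

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

#define CHUNK 1024  /* illustrative per-process element count */

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int *all = NULL;
    if (rank == 0) {
        /* Only rank 0 touches the file: the I/O bottleneck */
        all = malloc(size * CHUNK * sizeof(int));
        FILE *fp = fopen("data.in", "rb");
        fread(all, sizeof(int), size * CHUNK, fp);
        fclose(fp);
    }

    /* Distribute one chunk to every process */
    int chunk[CHUNK];
    MPI_Scatter(all, CHUNK, MPI_INT, chunk, CHUNK, MPI_INT,
                0, MPI_COMM_WORLD);

    free(all);
    MPI_Finalize();
    return 0;
}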
Parallel I/O
• Multiple processes of a parallel program access data from a common file
• Each process accesses its own chunk of data through an individual file pointer
• MPI_File_open, MPI_File_seek, MPI_File_read, MPI_File_close (see the sketch below)

[Diagram: processes P0, P1, P2, …, P(n-1) each reading a separate portion of one shared FILE]
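A minimal sketch using the calls named above; the file name and chunk size are illustrative. Each process seeks its individual file pointer to its own offset in the shared file and reads its chunk independently.

#include <mpi.h>

#define CHUNK 1024  /* illustrative per-process element count */

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* All processes open the same file collectively */
    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, "data.in", MPI_MODE_RDONLY,
                  MPI_INFO_NULL, &fh);

    /* Move this process's individual file pointer to its chunk */
    MPI_File_seek(fh, (MPI_Offset)rank * CHUNK * sizeof(int),
                  MPI_SEEK_SET);

    int buf[CHUNK];
    MPI_File_read(fh, buf, CHUNK, MPI_INT, MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
    MPI_Finalize();
    return 0;
}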
One-Sided Communication

• Remote Memory Access (RMA)
• Window – a specific region of process memory made available for RMA by other processes
• MPI_Win_create – called by all processes within a communicator
• Origin: the process that performs the call
• Target: the process whose memory is accessed
• Communication calls (see the sketch below)
  • MPI_Get: remote read
  • MPI_Put: remote write
  • MPI_Accumulate: remote update
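A minimal sketch bracketing an MPI_Put with fence synchronization (one of MPI-2's active-target modes, not named on this slide); it assumes at least two processes, as the earlier send/receive example does.

#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Every process exposes one int as a window for RMA */
    int exposed = 0;
    MPI_Win win;
    MPI_Win_create(&exposed, sizeof(int), sizeof(int),
                   MPI_INFO_NULL, MPI_COMM_WORLD, &win);

    MPI_Win_fence(0, win);
    if (rank == 0) {
        /* Origin (rank 0) writes into the target's (rank 1) window */
        int value = 99;
        MPI_Put(&value, 1, MPI_INT, 1, 0, 1, MPI_INT, win);
    }
    MPI_Win_fence(0, win);  /* completes the transfer */

    MPI_Win_free(&win);
    MPI_Finalize();
    return 0;
}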
Dynamic Process Management

• MPI-1
  • does not specify how processes are created
  • does not allow processes to enter or leave a running parallel application
• MPI-2 (see the sketch below)
  • can start new processes, send them signals, find out when they die, and establish communication between two processes
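A sketch of MPI-2 process creation; the "worker" executable name and the process count are illustrative. MPI_Comm_spawn starts new processes and returns an intercommunicator connecting parents and children.

#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    /* Spawn 4 copies of the (illustrative) "worker" executable;
       all parent processes make this call collectively */
    MPI_Comm children;
    MPI_Comm_spawn("worker", MPI_ARGV_NULL, 4, MPI_INFO_NULL,
                   0, MPI_COMM_WORLD, &children, MPI_ERRCODES_IGNORE);

    /* The intercommunicator 'children' can now be used to
       communicate with the spawned processes */

    MPI_Finalize();
    return 0;
}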
MPICH-2
• ADI-3 – provides routines to support MPI-1 and MPI-2
• Two types of RMA operations (see the sketch below)
  • Active target – the target process must call an MPI routine
    • origin calls MPI_Win_start / MPI_Win_complete
    • target calls MPI_Win_post / MPI_Win_wait
  • Passive target – the target process is not required to call any MPI routine
    • origin calls MPI_Win_lock / MPI_Win_unlock
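A passive-target sketch, reusing the window setup from the earlier MPI_Win_create example; only the origin (rank 0) makes RMA calls, and the target runs no matching MPI routine.

#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    int exposed = 0;
    MPI_Win win;
    MPI_Win_create(&exposed, sizeof(int), sizeof(int),
                   MPI_INFO_NULL, MPI_COMM_WORLD, &win);

    if (rank == 0) {
        /* Passive target: rank 1 makes no matching call */
        int value = 7;
        MPI_Win_lock(MPI_LOCK_EXCLUSIVE, 1, 0, win);
        MPI_Put(&value, 1, MPI_INT, 1, 0, 1, MPI_INT, win);
        MPI_Win_unlock(1, win);  /* transfer complete on return */
    }

    MPI_Barrier(MPI_COMM_WORLD);  /* ensure the put finished before reading */
    MPI_Win_free(&win);
    MPI_Finalize();
    return 0;
}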
MPICH-2
• Dynamic processes
  • there are no absolute, global process ids
  • no data structure maps a process rank onto a "global rank" (i.e., a rank in MPI_COMM_WORLD)
  • all communication is handled locally in terms of virtual connections to processes
  • arrays of virtual connections are indexed by rank
MPI-3
• Improved scalability
• Better support for multi-core and clustered systems and for applications
• Proposed: MPI_Count, a count type larger than int
• Extension of collective operations (see the sketch below)
  • non-blocking collectives
  • sparse collective operations, e.g., the proposed MPI_Sparse_gather
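A sketch of an MPI-3 non-blocking collective (the buffers are illustrative): MPI_Iallreduce returns immediately with a request, so computation can overlap the reduction until MPI_Wait.

#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Start a non-blocking all-reduce: the call returns at once */
    int sum = 0;
    MPI_Request req;
    MPI_Iallreduce(&rank, &sum, 1, MPI_INT, MPI_SUM,
                   MPI_COMM_WORLD, &req);

    /* ... unrelated computation can proceed here ... */

    /* Complete the collective before using 'sum' */
    MPI_Wait(&req, MPI_STATUS_IGNORE);

    MPI_Finalize();
    return 0;
}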
MPI-3
• Extension of one-sided communication (see the sketch below)
  • support RMA to arbitrary locations, with no constraints on memory (such as symmetric allocation or collective window creation)
  • imprecise RMA operations (such as accesses to overlapping storage) must be permitted, even if the behavior is undefined
  • the required level of consistency, atomicity, and completeness should be flexible
  • read-modify-write and compare-and-swap operations are needed for efficient algorithms
    • MPI_Get_accumulate, MPI_Compare_and_swap
• Backward compatibility
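A sketch of the MPI-3 compare-and-swap call named above, used here for a simple remote lock acquisition; the window setup and the 0/1 flag semantics are illustrative.

#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Each process exposes one int, used as a 0/1 lock flag */
    int flag = 0;
    MPI_Win win;
    MPI_Win_create(&flag, sizeof(int), sizeof(int),
                   MPI_INFO_NULL, MPI_COMM_WORLD, &win);

    if (rank == 0) {
        /* Atomically set rank 1's flag to 1 if it is still 0;
           'prev' receives the old value, so prev == 0 means success */
        int desired = 1, expected = 0, prev;
        MPI_Win_lock(MPI_LOCK_SHARED, 1, 0, win);
        MPI_Compare_and_swap(&desired, &expected, &prev,
                             MPI_INT, 1, 0, win);
        MPI_Win_unlock(1, win);
    }

    MPI_Barrier(MPI_COMM_WORLD);
    MPI_Win_free(&win);
    MPI_Finalize();
    return 0;
}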
References

• http://www.mcs.anl.gov/research/projects/mpi/
• http://www.mpi-forum.org
• W. Gropp et al., "A High-Performance, Portable Implementation of the MPI Message Passing Interface Standard"
• A. Geist et al., "MPI-2: Extending the Message-Passing Interface"
• "MPICH Abstract Device Interface, Version 3.3 Reference Manual"
• http://meetings.mpi-forum.org/presentations/MPI_Forum_SC10.ppt.pdf
• http://wissrech.ins.uni-bonn.de/teaching/seminare/technum/pdfs/iseringhausen_mpi2.pdf
• http://www.sdsc.edu/us/training/workshops/docs