Lecture 3
Point-to-Point Communications
Dr. Muhammad Hanif Durad
Department of Computer and Information Sciences
Pakistan Institute of Engineering and Applied Sciences
Some slides have been adapted with thanks from other lectures available on the Internet
Dr. Hanif Durad 2
Lecture Outline
- Models for Communication
- Brief introduction to MPI
- Basic concepts
- Learn 6 most commonly used functions
- Introduce “collective” operations
IntroMPI.ppt
MPI Basic Send/Receive
We need to fill in the details in:
    Process 0: Send(data)
    Process 1: Receive(data)
Things that need specifying:
- How will “data” be described?
- How will processes be identified?
- How will the receiver recognize/screen messages?
- What will it mean for these operations to complete?
IntroMPI.ppt
Some Basic Concepts
- Processes can be collected into groups
- Each message is sent in a context, and must be received in the same context
  - Provides necessary support for libraries
- A group and context together form a communicator
- A process is identified by its rank in the group associated with a communicator
- There is a default communicator whose group contains all initial processes, called MPI_COMM_WORLD
IntroMPI.ppt
MPI Datatypes
- The data in a message to send or receive is described by a triple (address, count, datatype)
- An MPI datatype is recursively defined as:
  - predefined, corresponding to a data type from the language (e.g., MPI_INT, MPI_DOUBLE)
  - a contiguous array of MPI datatypes
  - a strided block of datatypes
  - an indexed array of blocks of datatypes
  - an arbitrary structure of datatypes
- There are MPI functions to construct custom datatypes, in particular ones for subarrays
- May hurt performance if datatypes are complex
IntroMPI.ppt
MPI Tags
Messages are sent with an accompanying user-defined integer tag, to assist the receiving process in identifying the message
Messages can be screened at the receiving end by specifying a specific tag, or not screened by specifying MPI_ANY_TAG as the tag in a receive
Some non-MPI message-passing systems have called tags “message types”. MPI calls them tags to avoid confusion with datatypes
IntroMPI.ppt
Blocking Point-to-Point Communication (1/2)
- MPI_Send(): Basic blocking send operation. Routine returns only after the application buffer in the sending task is free for reuse.
- MPI_Recv(): Receive a message and block until the requested data is available in the application buffer in the receiving task.
- MPI_Ssend(): synchronous blocking send
Comm.ppt
Blocking Point-to-Point Communication (2/2)
- MPI_Bsend(): buffered blocking send
- MPI_Rsend(): blocking ready send, use with great care
- MPI_Sendrecv(): Send a message and post a receive before blocking. Will block until the sending application buffer is free for reuse and until the receiving application buffer contains the received message.
Comm.ppt
MPI Basic (Blocking) Send
MPI_SEND(start, count, datatype, dest, tag, comm)
- The message buffer is described by (start, count, datatype).
- The target process is specified by dest, which is the rank of the target process in the communicator specified by comm.
- When this function returns, the data has been delivered to the system and the buffer can be reused.
- Important: The message may not have been received by the target process.
Figure: process 0 holds A(10) and calls MPI_Send( A, 10, MPI_DOUBLE, 1, …); process 1 holds B(20) and calls MPI_Recv( B, 20, MPI_DOUBLE, 0, … )
IntroMPI.ppt
MPI Basic (Blocking) Receive
MPI_RECV(start, count, datatype, source, tag, comm, status)
- Waits until a matching (both source and tag) message is received from the system, and the buffer can be used
- source is rank in communicator specified by comm, or MPI_ANY_SOURCE
- tag is a tag to be matched on or MPI_ANY_TAG
- receiving fewer than count occurrences of datatype is OK, but receiving more is an error
- status contains further information (e.g. size of message)
Figure: process 0 holds A(10) and calls MPI_Send( A, 10, MPI_DOUBLE, 1, …); process 1 holds B(20) and calls MPI_Recv( B, 20, MPI_DOUBLE, 0, … )
IntroMPI.ppt
Blocking Operations pp2003\lecture4.ppt
A Simple MPI Program (C)
#include "mpi.h"
#include <stdio.h>
int main( int argc, char *argv[])
{
    int rank, buf;
    MPI_Status status;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank( MPI_COMM_WORLD, &rank );
    /* Process 0 sends and Process 1 receives */
    if (rank == 0) {
        buf = 123456;
        MPI_Send( &buf, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    }
    else if (rank == 1) {
        MPI_Recv( &buf, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                  &status );
        printf( "Received %d\n", buf );
    }
    MPI_Finalize();
    return 0;
}
IntroMPI.ppt
Program name blocking.c
A Simple MPI Program (Fortran)
program main
include 'mpif.h'
integer rank, buf, ierr, status(MPI_STATUS_SIZE)

call MPI_Init(ierr)
call MPI_Comm_rank( MPI_COMM_WORLD, rank, ierr )
! Process 0 sends and Process 1 receives
if (rank .eq. 0) then
    buf = 123456
    call MPI_Send( buf, 1, MPI_INTEGER, 1, 0, MPI_COMM_WORLD, ierr )
else if (rank .eq. 1) then
    call MPI_Recv( buf, 1, MPI_INTEGER, 0, 0, MPI_COMM_WORLD, status, ierr )
    print *, "Received ", buf
endif
call MPI_Finalize(ierr)
end
IntroMPI.ppt
Program name blocking.f90
A Simple MPI Program (C++)
// Note: the MPI C++ bindings (MPI:: namespace) were deprecated in MPI-2.2
// and removed in MPI-3.0.
#include "mpi.h"
#include <iostream>
int main( int argc, char *argv[])
{
    int rank, buf;
    MPI::Init(argc, argv);
    rank = MPI::COMM_WORLD.Get_rank();
    // Process 0 sends and Process 1 receives
    if (rank == 0)
    {
        buf = 123456;
        MPI::COMM_WORLD.Send( &buf, 1, MPI::INT, 1, 0 );
    }
    else if (rank == 1)
    {
        MPI::COMM_WORLD.Recv( &buf, 1, MPI::INT, 0, 0 );
        std::cout << "Received " << buf << "\n";
    }
    MPI::Finalize();
    return 0;
}
IntroMPI.ppt
Program name blocking.cpp
Retrieving Further Information (C)
Status is a data structure allocated in the user’s program.
In C:
    int recvd_tag, recvd_from, recvd_count;
    MPI_Status status;
    MPI_Recv(..., MPI_ANY_SOURCE, MPI_ANY_TAG, ..., &status );
    recvd_tag = status.MPI_TAG;
    recvd_from = status.MPI_SOURCE;
    MPI_Get_count( &status, datatype, &recvd_count );
Retrieving Further Information (Fortran)
In Fortran:
    integer recvd_tag, recvd_from, recvd_count
    integer status(MPI_STATUS_SIZE)
    call MPI_RECV(..., MPI_ANY_SOURCE, MPI_ANY_TAG, ..., status, ierr)
    recvd_tag = status(MPI_TAG)
    recvd_from = status(MPI_SOURCE)
    call MPI_GET_COUNT(status, datatype, recvd_count, ierr)
Retrieving Further Information (C++)
Status is a data structure allocated in the user’s program.
In C++:
    int recvd_tag, recvd_from, recvd_count;
    MPI::Status status;
    Comm.Recv(..., MPI::ANY_SOURCE, MPI::ANY_TAG, ..., status );
    recvd_tag = status.Get_tag();
    recvd_from = status.Get_source();
    recvd_count = status.Get_count( datatype );
Tags and Contexts
- Separation of messages used to be accomplished by use of tags, but
  - this requires libraries to be aware of tags used by other libraries.
  - this can be defeated by use of “wild card” tags.
- Contexts are different from tags
  - no wild cards allowed
  - allocated dynamically by the system when a library sets up a communicator for its own use.
- User-defined tags still provided in MPI for user convenience in organizing application
IntroMPI.ppt
Home Work 1
We have just used MPI_Send() and MPI_Recv(). Try to use the other blocking functions listed.
Flavors of message passing
- Synchronous: used for routines that return when the message transfer has been completed
  - Synchronous send waits until the complete message can be accepted by the receiving process before sending the message (send suspends until receive)
  - Synchronous receive will wait until the message it is expecting arrives (receive suspends until message sent)
  - Also called blocking
Figure: A sends a request to send to B, B replies with an acknowledgement, and A then sends the message.
lecture2.ppt
Synchronous send() and recv() using 3-way protocol (1/2)
Figure (a): When send() occurs before recv(). Process 1 calls send(), issues a request to send, and suspends. When Process 2 reaches recv() it returns an acknowledgment, the message is transferred, and both processes continue.
slides2.ppt
Synchronous send() and recv() using 3-way protocol (2/2)
Figure (b): When recv() occurs before send(). Process 2 calls recv() and suspends. When Process 1 reaches send() it issues a request to send, Process 2 returns an acknowledgment, the message is transferred, and both processes continue.
slides2.ppt
Nonblocking message passing
- Nonblocking sends return whether or not the message has been received
  - If the receiving processor is not ready, the message may be stored in a message buffer
  - The message buffer holds messages sent by A until they are accepted by the receive in B
- MPI: routines that use a message buffer and return after their local actions complete are blocking (even though message transfer may not be complete)
  - Routines that return immediately are non-blocking
Figure: A sends into a message buffer, from which B receives.
4 Communication Modes in MPI (1/3)
- Standard mode
  - Not assumed that corresponding receive routine has started.
  - Amount of buffering not defined by MPI.
  - If buffering provided, send could complete before receive reached
- Buffered (asynchronous) mode
  - Send may start and return before a matching receive.
  - Necessary to specify buffer space via routine MPI_Buffer_attach().
lecture4.ppt/slides2.ppt
Communication Modes in MPI (2/3)
- Synchronous mode
  - Send and receive can start before each other but can only complete together
- Ready mode
  - Send can only start if matching receive already reached, otherwise error. Use with care
lecture4.ppt/slides2.ppt
Communication Modes in MPI (3/3)
Each of the four modes can be applied to both blocking and nonblocking send routines.
Only the standard mode is available for the blocking and nonblocking receive routines.
Any type of send routine can be used with any type of receive routine.
slides2.ppt
A Real Blocking Program (1/3)
#include "mpi.h"
#include <iostream>
int main(int argc, char *argv[])
{
#define MSGLEN 2048
    int ITAG_A = 100, ITAG_B = 200;
    int irank, i, idest, isrc, istag, iretag;
    float rmsg1[MSGLEN];
    float rmsg2[MSGLEN];
    MPI::Status recv_status;
    MPI::Init(argc, argv);
    irank = MPI::COMM_WORLD.Get_rank();
Program name deadlock.cpp
A Real Blocking Program (2/3)
    // Valid indices are 0 .. MSGLEN-1
    for (i = 0; i < MSGLEN; i++)
    {
        rmsg1[i] = 100;
        rmsg2[i] = -100;
    }
    if ( irank == 0 )
    {
        idest = 1;
        isrc = 1;
        istag = ITAG_A;
        iretag = ITAG_B;
    }
    else if ( irank == 1 )
    {
        idest = 0;
        isrc = 0;
        istag = ITAG_B;
        iretag = ITAG_A;
    }
A Real Blocking Program (3/3)
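    std::cout << "Task " << irank << " is about to send the message" << std::endl;
    // Both tasks call the synchronous send first. Each Ssend waits for a
    // matching receive that the other task never reaches, so the program
    // deadlocks here.
    MPI::COMM_WORLD.Ssend(rmsg1, MSGLEN, MPI::FLOAT, idest, istag);
    MPI::COMM_WORLD.Recv(rmsg2, MSGLEN, MPI::FLOAT, isrc, iretag, recv_status);
    std::cout << "Task " << irank << " has received the message" << std::endl;
    MPI::Finalize();
}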
Nonblocking Point-to-Point Communication (1/2)
- MPI_Isend(), MPI_Irecv(): identify the send/receive buffer. Computation proceeds immediately. A communication request handle is returned for handling the pending message status. The program must use calls to MPI_Wait() or MPI_Test() to determine when the operation completes.
- MPI_Issend(), MPI_Ibsend(), MPI_Irsend(): non-blocking versions of the synchronous, buffered, and ready sends
Comm.ppt
Nonblocking Point-to-Point Communication (2/2)
- MPI_Test(), MPI_Testany(), MPI_Testall(), MPI_Testsome(): check the status of a specified non-blocking send or receive operation
- MPI_Wait(), MPI_Waitany(), MPI_Waitall(), MPI_Waitsome(): block until a specified non-blocking send or receive operation has completed
- MPI_Iprobe(): performs a non-blocking test for a message (MPI_Probe() is the blocking form, which waits until a matching message is pending)
Comm.ppt
Non-Blocking Operations lecture4.ppt
Fixing Deadlock (1/3)
#include "mpi.h"
#include <iostream>
int main(int argc, char *argv[])
{
#define MSGLEN 2048
    int ITAG_A = 100, ITAG_B = 200;
    int irank, i, idest, isrc, istag, iretag;
    float rmsg1[MSGLEN];
    float rmsg2[MSGLEN];
    MPI::Status irstatus, isstatus;
    MPI::Request request;
    MPI::Init(argc, argv);
    irank = MPI::COMM_WORLD.Get_rank();
Program name deadlock-fix.cpp DT\www.nccs.gov
Fixing Deadlock (2/3)
    // Valid indices are 0 .. MSGLEN-1
    for (i = 0; i < MSGLEN; i++)
    {
        rmsg1[i] = 100;
        rmsg2[i] = -100;
    }
    if ( irank == 0 )
    {
        idest = 1;
        isrc = 1;
        istag = ITAG_A;
        iretag = ITAG_B;
    }
    else if ( irank == 1 )
    {
        idest = 0;
        isrc = 0;
        istag = ITAG_B;
        iretag = ITAG_A;
    }
Fixing Deadlock (3/3)
    std::cout << "Task " << irank << " is sending the message" << std::endl;
    // Nonblocking send returns immediately, so both tasks reach Recv
    request = MPI::COMM_WORLD.Isend(rmsg1, MSGLEN, MPI::FLOAT, idest, istag);
    MPI::COMM_WORLD.Recv(rmsg2, MSGLEN, MPI::FLOAT, isrc, iretag, irstatus);
    // Complete the send through the C++ request object
    request.Wait(isstatus);
    std::cout << "Task " << irank << " has received the message" << std::endl;
    MPI::Finalize();
}
Home Work 2
We have just used MPI_Isend(). Try to use other non-blocking functions listed