
TICAM Report 96-44
October, 1996

MMPI: Asynchronous Message Management for the Message-Passing Interface

Harold Carter Edwards
Texas Institute for Computational and Applied Mathematics
The University of Texas at Austin
Austin, Texas, 78712

    Abstract

The Message-Passing Interface (MPI) Standard defines an asynchronous message passing functionality through primitive nonblocking send and receive operations. Applications which use asynchronous message passing within MPI must assume responsibility for managing the MPI nonblocking send and receive primitives. For an application with dynamic interprocessor communication requirements, asynchronous message management is a complex task. The Asynchronous Message Management for MPI (MMPI) library described in this report has been developed to simplify dynamic asynchronous message passing with MPI.

1 Introduction

A family of parallel applications developed at the Texas Institute for Computational and Applied Mathematics (TICAM) use large irregular distributed data structures. Runtime modification and redistribution of these data structures occurs at various phases during the applications' execution. The message passing patterns associated with accessing these irregular, dynamic, and distributed data structures are complex and unpredictable.

One approach taken at TICAM to address these complex and unpredictable communications is to use asynchronous message passing. Asynchronous message passing has traditionally been used to overlap communications with computations; however, it also provides a flexible mechanism for managing complex and unpredictable message passing requirements. Very briefly, asynchronous message passing is used to perform communications which might occur, where each message is processed if and when it is received.

The Message-Passing Interface (MPI) defines primitive communication operations for asynchronous message passing. Using these primitive operations for overlapping communications and computation is straightforward; however, using these primitive operations to address complex and unpredictable message passing requirements is a formidable task. The objective of the MMPI library is to simplify this task with a set of MPI-like opaque data types and subprograms which are "layered on" MPI.

A very brief review of asynchronous message passing with MPI is given in the following section. However, it is assumed that the reader is already familiar with the Message-Passing Interface Standard, especially those sections regarding asynchronous message passing.

1.1 Asynchronous Message Passing with MPI

The Message-Passing Interface defines primitive nonblocking send and receive operations (see Section 3.7 of the MPI Standard [3]). A simple MPI nonblocking communication operation is initiated with a start call and is terminated with a complete call. Other computations may occur between these MPI calls, thus overlapping communications and computations. A nonblocking operation which has been started but has not yet been completed is referred to as a "pending" or an "active" operation. The MPI specification does not include a lower or upper bound on the number of pending nonblocking operations; however, it is stated in the MPI specification that "an MPI implementation should be able to support a large number of pending nonblocking operations."

An application with unpredictable communication requirements may initiate nonblocking receive operations for all messages which might occur within a particular context. These active nonblocking operations must be periodically tested for completion. When such a pending nonblocking operation completes, the application performs some application specific processing of the received message. If several such nonblocking operations are active, the application must also be concerned with message progression rules, deadlock, and buffer overflow. A final element of complexity in this scheme is that the size and content of the message which might be received may vary. In summary, this approach of initiating nonblocking receives for messages which might occur must deal with the following issues:

- a dynamic set of pending nonblocking operations,
- periodic testing of pending nonblocking operations,
- matching completed messages with appropriate processing,
- message progression rules,
- messages of variable size and content,
- deadlocking, and
- buffer overflow.

Using MPI asynchronous message passing for complex and unpredictable communication can become unwieldy without asynchronous message management.

1.2 Management for MPI Asynchronous Messages

Each nonblocking communication operation started by an application may be posted to MMPI for subsequent completion. Each nonblocking operation is posted with an application supplied handler. A handler consists of a subprogram and an arbitrary set of data. Upon completion of the nonblocking operation MMPI calls the associated handler subprogram, with the handler data, to process the completed message.

An application may send multiple messages, of varying contents, to the same nonblocking receive operation. In such a scenario an MPI nonblocking receive operation must be restarted after each received message has been processed. Potential problems with this scenario include sending a message which is larger than the receiver's current buffer, or "flooding" the receiver with messages which are too numerous or too large to be buffered by the MPI implementation. The MMPI library addresses this communication scenario and its potential problems by providing an alternative to the MPI nonblocking receive, an MMPI Consumer.

An MMPI Consumer is an "open ended" receive operation where an arbitrary number of messages of arbitrary content and length may be received and processed by a single Consumer. Messages received by a Consumer are "consumed" by the application via an application provided handler. The application's consumer handler is responsible for interpreting the contents of each message and processing it accordingly. All other message passing responsibilities are assumed by MMPI so that the MPI implementation is never required to buffer numerous or large Consumer messages.


1.3 Related Message Management Capabilities

Other message passing libraries support handlers for asynchronous message passing. Two such libraries are:

- the NX library [1], and
- active messages [4].

This is by no means intended to be a comprehensive list of communication libraries with asynchronous message management.

The hrecv and hrecvx routines of the NX library post an asynchronous receive to the run-time environment with an attached handler. When the NX run-time environment matches a message with the asynchronous receive, the application's local thread of control is interrupted and the attached handler is invoked. The NX run-time environment allows an application to post several asynchronous receives via hrecv or hrecvx, and then "forget" about them until they are matched with a message. In contrast, an MPI based application must explicitly check each asynchronous receive operation to guarantee that a message has actually been received.

Active messages [4] also allow the application to associate routines with received messages. As with the NX run-time environment, the active message run-time environment automatically invokes the handler upon receipt of a message.

A similar interrupt-like "handler" functionality could be implemented with MPI and threads by 1) creating a thread for each asynchronous message and 2) performing a blocking receive within that thread for the message. Such an MPI+threads approach assumes the availability of threads and a thread safe MPI implementation, requires the application developer to be knowledgeable in mixing threads and message passing, and relies upon "fairness" of the threads run-time support. This MPI+threads approach is not used in the MMPI library.

The MPI-2 effort (see http://parallel.nas.nasa.gov/MPI-2/) has debated an interrupt-like "handler" functionality at some length. In the early stages of the MPI-2 effort this functionality appeared in the "mpi-1sided" group. One contribution to this debate was the TICAM report "A Consistent Extension of the Message-Passing Interface (MPI) for Nonblocking Communication Handlers" [2]. The handler discussion was subsequently moved to the "mpi-external" group of the MPI-2 effort, where the integration of MPI, handlers, and threads is addressed. Note again that the MMPI library does not use threads, so the integration of handlers and threads is not an issue here.


1.4 MMPI Overview

A brief overview of the MMPI library is given in this section. The five components of the MMPI library are layered on MPI as shown here. The application programming interface for each component is given in the sections noted here.

                             Consumer  (Section 5)
    --------------------------------------------------------------------
     Tag Mgt.         Request Mgt.      Packed Buffer       Log File Mgt.
     (Section 2)      (Section 3)       Mgt. (Section 4)    (Section 6)
    --------------------------------------------------------------------
                                    MPI

The MMPI Consumer provides the "highest level" of asynchronous message management functionality, where the application need not be concerned with allocating buffers, managing MPI requests, or "flooding" an MPI implementation. Integration of an MMPI Consumer within a larger parallel application requires some familiarity with the underlying tag management, request management, and packed buffer management components of MMPI.

    1.4.1 Tag Management

MMPI tag management reserves and releases message tags from an application specified range for a given MPI communicator, guaranteeing that a reserved tag is unique within that communicator (Section 2).

Multiple calls to the MPI pack and unpack routines require a message buffer (which may need to be allocated and reallocated) and a consistent set of calling arguments. The MMPI library defines an opaque object type and supporting routines which manage both buffer allocation/reallocation and calling argument consistencies between multiple calls to the MPI pack and unpack routines.

1.5 Consumer

An MMPI Consumer is a high level communication object which:

- asynchronously receives an unlimited number of messages,
- manages its own message tags,
- manages its own message buffers, and
- supports a consumer handler routine.

In using an MMPI Consumer the application is only responsible for packing and unpacking the consumer messages. A consumer message is unpacked on the receiving processor by the application's consumer handler routine. An application's consumer handler is similar to the request handler in that it is invoked automatically for each received consumer message. Multiple MMPI Consumer data objects may be created where each consumer may be given its own independent consumer handler.

1.6 Log File Management

MMPI routines output warning and error messages to a processor specific log file. This log file is also available for an application's text output as a

- C file descriptor (int),
- C file buffer (FILE *), or
- C++ output stream (ostream &).

The name of the log file can be selected by the application immediately after MPI initialization with MPI_Init. The application's selected name for the log file is appended with the rank of the processor to insure that each processor outputs to a different log file.


2 Tag Management

The MMPI tag management facility manages an application specified range of tags for a given MPI Communicator. Tags may be reserved from a communicator's set, used to identify a message, and then released. MMPI tag management guarantees a unique tag for a given message if 1) the tag is reserved and 2) the reserved tag is used to identify a single send operation.

2.1 Environment

Tag management is enabled for a particular MPI Communicator with a call to MMPI_Enable or MMPI_Enable_tag. A call to MMPI_Enable calls MMPI_Enable_tag with the indicated default values for the tag_min and tag_max arguments.

    int MMPI_Enable( MPI_Comm comm );            /* IN OUT */

    int MMPI_Enable_tag( MPI_Comm comm ,         /* IN OUT                      */
                         unsigned tag_min ,      /* IN  default: 24576 (0x6000) */
                         unsigned tag_max );     /* IN  default: 32767 (0x7fff) */

The call to MMPI_Enable* must be performed on all processors of the given communicator, and each processor must have identical values for tag_min and tag_max. The MMPI_Enable* routines also "set up" the MMPI_Barrier routine (see Section 3.3.2). Thus the call to MMPI_Enable* must occur before the first call to MMPI_Barrier. It is recommended that MMPI_Enable or MMPI_Enable_tag be called 1) for the MPI_COMM_WORLD communicator immediately after calling MPI_Init and 2) for any other communicator immediately after its creation.

    int MMPI_Disable( MPI_Comm comm );           /* IN OUT */

MMPI tag management is disabled for a particular communicator with a call to MMPI_Disable or when the communicator is destroyed via MPI_Comm_free. This call also "cleans up" the MMPI_Barrier routine.

During development and debugging of an application the set of global tags may become inconsistent. The application programmer may test for this error through a synchronous call to MMPI_Tag_verify.

    int MMPI_Tag_verify( MPI_Comm comm );

This routine broadcasts (MMPI_Bcast) the global tag table from processor #0 and compares the broadcast global tag table with the local tag table. If the two lists do not compare, MPI_ERR_COMM is returned on that processor. Note that MMPI_Tag_verify is called by MMPI_Enable_tag to insure consistent tag management activation.


2.2 Global Tag Management

Reservation of a tag with the same value on all processors is a synchronous operation. The MMPI_Tag_get_global routine, which reserves a globally consistent tag value, and the MMPI_Tag_rel_global routine, which releases a reserved global tag value, must be called synchronously on all processors.

    int MMPI_Tag_get_global( MPI_Comm comm ,     /* IN OUT */
                             int *    tag );     /* OUT    */

    int MMPI_Tag_rel_global( MPI_Comm comm ,     /* IN OUT */
                             int *    tag );     /* OUT    */

2.3 Local Tag Management

In some cases a tag does not need to be globally consistent. For these cases the MMPI_Tag_get_local and MMPI_Tag_rel_local routines have been provided.

    int MMPI_Tag_get_local( MPI_Comm comm ,      /* IN OUT */
                            int *    tag );      /* OUT    */

    int MMPI_Tag_rel_local( MPI_Comm comm ,      /* IN OUT */
                            int *    tag );      /* OUT    */
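As an illustration of the intended usage, the following minimal sketch reserves a globally consistent tag for one communication phase and then releases it. The routine shift_right and its arguments are illustrative only and are not part of MMPI; the sketch assumes MMPI_Enable has already been called for the communicator.

    #include <mpi.h>
    #include "mmpi.h"

    /* Sketch: reserve a globally consistent tag, use it to identify the
       messages of one communication phase, then release it.  Must be
       called synchronously on all processors of 'comm'. */
    void shift_right( MPI_Comm comm , int value )
    {
         int tag , rank , size , recv_value ;
         MPI_Status status ;

         MPI_Comm_rank( comm , &rank );
         MPI_Comm_size( comm , &size );

         MMPI_Tag_get_global( comm , &tag );   /* same tag value on every processor */

         MPI_Sendrecv( &value ,      1 , MPI_INT , ( rank + 1 ) % size        , tag ,
                       &recv_value , 1 , MPI_INT , ( rank + size - 1 ) % size , tag ,
                       comm , &status );

         MMPI_Tag_rel_global( comm , &tag );   /* synchronous release */
    }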


3 Request Management

MPI nonblocking communications (Section 3.7 of [3]) support asynchronous message passing. Each MPI asynchronous message requires two actions: 1) start the asynchronous message request (send or receive) and 2) complete the request. An asynchronous message request which has been started but has not yet completed is termed an active request. Completion of an active request is typically followed by an application specific processing of the message contents.

The MMPI library provides a facility for managing MPI active requests. This request management facility allows an application to start an asynchronous message and then give the MMPI request manager responsibility for completing the active request. Upon completion of an application's active request the MMPI request manager invokes an application specified request handler which processes the message contents.

Each request handler consists of an application supplied routine and data object. The value of the request handler data object is arbitrary. The request handler routine may access the application's data, call MPI routines, or initiate new asynchronous messages. A request handler routine may also reactivate its associated request. When a request handler routine reactivates its own request the request and its handler become "self sustaining".

    3.1 Request Handler Specification

    A request handler routine's interface must conform to the following C type definition.

    typedef int MMPI_Request_handler( void *        handler_data ,  /* IN OUT */
                                      MPI_Request * request ,       /* IN OUT */
                                      MPI_Status *  status );       /* IN     */

The request handler data object is a pointer to an arbitrary data object, which may be NULL.

The request is the application's active request which is completed by the request manager. The status is the request completion status generated by MPI. The request handler routine returns either MPI_SUCCESS or an MPI error code. Other interactions between an application's request handler routine and the request manager are discussed in Section 3.4.

    3.2 Posting Request Handlers

An active asynchronous message request is posted to the request manager with a request handler.


    int MMPI_Post_handler( MPI_Request          request ,           /* IN */
                           void *               handler_data ,      /* IN */
                           MMPI_Request_handler handler_routine );  /* IN */

The handler_routine is called by the request manager when the request completes. The handler_data is given to the handler_routine when it is called by the request manager. The request input to the MMPI_Post_handler routine is copied by the request manager; this copy of the request is subsequently given to the handler_routine.

A request may be unposted from the request manager by calling MMPI_Post_handler with a handler_routine value of MMPI_REQUEST_HANDLER_NULL.

3.3 Request Completion

Requests posted to the request manager may be completed only during calls to any of four MMPI routines:

- MMPI_Test
- MMPI_Wait
- MMPI_Barrier
- MMPI_Serve_handler

If at least one of these routines is not called by the application then posted requests cannot be completed and the application may deadlock.

    3.3.1 Test and Wait

The MMPI_Test and MMPI_Wait routines are "replacements" for the standard MPI_Test and MPI_Wait routines.

    int MMPI_Wait( MPI_Request * request ,       /* IN OUT */
                   MPI_Status *  status );       /* OUT    */

    int MMPI_Test( MPI_Request * request ,       /* IN OUT */
                   int *         flag ,          /* OUT    */
                   MPI_Status *  status );       /* OUT    */

The arguments to these routines are identical to the replaced MPI_Wait and MPI_Test routines. However, if the input request is persistent, has been posted to the request manager, and is "self sustaining", the request may complete more than once before MMPI_Wait or MMPI_Test returns. In this situation the output status reflects the first completion.


Once posted to the request manager the request should not be tested or waited for with the standard MPI_Wait and MPI_Test family of routines. Such a call leaves the request manager with an inconsistent set of requests, which is likely to yield erroneous results. If it is absolutely necessary to call one of the MPI_Wait and MPI_Test routines for an active request which has been posted with the MMPI request manager, the request must first be unposted from the request manager.
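The following minimal sketch illustrates this usage; the routine wait_for_reply and its arguments are illustrative only and are not part of MMPI. It waits for an expected message with MMPI_Wait so that any requests posted to the MMPI request manager may be completed, and their handlers invoked, while the processor waits.

    #include <mpi.h>
    #include "mmpi.h"

    /* Sketch: start a nonblocking receive for an expected reply, then wait
       with MMPI_Wait rather than MPI_Wait so that posted request handlers
       may be served while waiting. */
    void wait_for_reply( MPI_Comm comm , int src , int tag ,
                         double * values , int len )
    {
         MPI_Request request ;
         MPI_Status  status ;

         MPI_Irecv( values , len , MPI_DOUBLE , src , tag , comm , &request );

         /* ... send the message which solicits the reply ... */

         MMPI_Wait( &request , &status );   /* posted handlers may run here */
    }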

    3.3.2 Barrier

The MMPI_Barrier routine is a "replacement" for the MPI_Barrier routine.

    int MMPI_Barrier( MPI_Comm comm );

A call to MMPI_Barrier accomplishes the same synchronization as a call to MPI_Barrier; however, during synchronization the request manager may complete posted requests.

Recall that the MMPI_Barrier routine is "set up" by a call to MMPI_Enable or MMPI_Enable_tag. Thus one of the MMPI_Enable* routines must be called for a given communicator before MMPI_Barrier is called with that communicator.

The following example defines a persistent receive request and a request handler which restarts the request after processing each received message.

          struct data {
               MPI_Comm    comm ;
               MPI_Request request ;
               int         tag ;
  5            int         message[64] ;
          };

          int handler( void        * extra_state ,
                       MPI_Request * request ,
 10                    MPI_Status  * status )
          {
               struct data * D = (struct data *) extra_state ;
               int N , R ;

 15            MPI_Get_count( status , MPI_INT , &N );  /* Number of ints received */

               /* ... Process 'D->message[0..N-1]' ... */

               R = MPI_Start( request );   /* Restart the request */
 20            D->request = *request ;     /* Insure consistency  */
               return R ;
          }

          void start_handler( struct data * D )
 25       {
               /* Create persistent request */
               MPI_Recv_init( D->message , 64 , MPI_INT ,
                              MPI_ANY_SOURCE , D->tag , D->comm , &(D->request) );

 30            MPI_Start( &(D->request) );                     /* Activate the request */
               MMPI_Post_handler( D->request , D , handler );  /* Post the request     */
               return ;
          }

This is an example of a "self sustaining" receive request: the request is persistent and has a request handler which restarts the request after processing the message. Note on line #19 that the MPI_Request data object input to the request handler is passed to MPI_Start. On line #20 the resulting, possibly updated, value is copied back to the original D->request data object for consistency.

A request handler routine may also call point-to-point MPI or MMPI routines, including MMPI_Wait, MMPI_Test, or MMPI_Serve_handler. Calls to these three MMPI routines result in a recursive entry into the MMPI request manager. Recursive invocations of the request manager are safe; however, the request handler (request + routine + data) might also be recursively called if

- the request is persistent,
- the request is reactivated in the handler routine, and
- the request manager is invoked after reactivation.

In the following example, the call to MMPI_Serve_handler on line #15 allows the handler routine to be called recursively. Such a recursion will occur if a pending message matches the handler's request, as the request was activated by the call to MPI_Start on line #12.

          int handler(
               void        * extra_state ,
               MPI_Request * request ,
               MPI_Status  * status )
  5       {
               struct data * D = (struct data *) extra_state ;
               int N , R ;
               MPI_Get_count( status , MPI_INT , &N );  /* Number of ints received */

 10            /* ... Consume 'D->message[0..N-1]' ... */

               R = MPI_Start( request );        /* Restart the request */
               D->request = *request ;          /* Insure consistency  */

 15            MMPI_Serve_handler( /* ... */ ); /* Possible recursive call */

               return R ;
          }


4 Packed Buffer Management

MPI defines several routines for packing and unpacking messages into buffers (Section 3.13 of [3]). These buffers are then sent or received using the MPI_PACKED data type specification. Allocating, packing, and unpacking these buffers can become tedious when numerous messages are packed into the same buffer. The MMPI library provides routines to manage these operations.

4.1 Create, Reset, Free

MMPI_Buf is an opaque object type, similar to other MPI opaque object types. An MMPI_Buf must be created and freed just as other MPI opaque data objects.

    int MMPI_Buf_create( int len , MPI_Comm comm , MMPI_Buf * buf );
    int MMPI_Buf_reset ( int len , MPI_Comm comm , MMPI_Buf * buf );
    int MMPI_Buf_copy  ( MMPI_Buf src_buf , MMPI_Buf * buf );
    int MMPI_Buf_free  ( MMPI_Buf * buf );

MMPI_Buf_create creates a buffer of an initial size len which is used to send or receive a message via the MPI communicator comm. MMPI_Buf_reset "recreates" the buffer with the revised parameters. MMPI_Buf_copy creates a copy of the source buffer src_buf. MMPI_Buf_free destroys the buffer and reclaims its memory.

    4.2 Query

MMPI_Buf data objects may be queried for various attributes.

    int MMPI_Buf_capacity( MMPI_Buf buf , int *      len  );
    int MMPI_Buf_pointer ( MMPI_Buf buf , void **    ptr  );
    int MMPI_Buf_position( MMPI_Buf buf , int *      pos  );
    int MMPI_Buf_size    ( MMPI_Buf buf , int *      size );
    int MMPI_Buf_comm    ( MMPI_Buf buf , MPI_Comm * comm );
    int MMPI_Buf_remain  ( MMPI_Buf buf , int *      rem  );

Capacity is the current allocated length of the buffer. Pointer, position, size, and comm are arguments passed to MPI_Unpack. The remainder is the difference between the size and the position, i.e. the remainder to be unpacked.

4.3 Pack and Send

An MMPI_Buf is packed with the routine MMPI_Buf_pack. This routine has a similar interface to the MPI routine MPI_Pack. The two significant differences are:

- several calling arguments are "compressed", and
- if the buffer is too small it is automatically lengthened.


    int MMPI_Buf_pack( void *       inbuf ,      /* IN     */
                       int          incount ,    /* IN     */
                       MPI_Datatype datatype ,   /* IN     */
                       MMPI_Buf *   outbuf );    /* IN OUT */

A packed buffer may be sent with any MPI send routine, with the MPI_PACKED data type, the buffer's size, and the buffer's communicator. The following MMPI send routines provide syntactic wrappers for sending an MMPI buffer.

    int MMPI_Buf_send ( MMPI_Buf buf , int dest , int tag );
    int MMPI_Buf_rsend( MMPI_Buf buf , int dest , int tag );
    int MMPI_Buf_ssend( MMPI_Buf buf , int dest , int tag );
    int MMPI_Buf_bsend( MMPI_Buf buf , int dest , int tag );

    int MMPI_Buf_isend ( MMPI_Buf buf , int dest , int tag , MPI_Request * request );
    int MMPI_Buf_irsend( MMPI_Buf buf , int dest , int tag , MPI_Request * request );
    int MMPI_Buf_issend( MMPI_Buf buf , int dest , int tag , MPI_Request * request );
    int MMPI_Buf_ibsend( MMPI_Buf buf , int dest , int tag , MPI_Request * request );

    int MMPI_Buf_send_init ( MMPI_Buf buf , int dest , int tag , MPI_Request * request );
    int MMPI_Buf_rsend_init( MMPI_Buf buf , int dest , int tag , MPI_Request * request );
    int MMPI_Buf_ssend_init( MMPI_Buf buf , int dest , int tag , MPI_Request * request );
    int MMPI_Buf_bsend_init( MMPI_Buf buf , int dest , int tag , MPI_Request * request );
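The following minimal sketch shows the intended create/pack/send sequence; the routine send_subvector, its arguments, and the initial buffer capacity are illustrative assumptions and not part of MMPI.

    #include <mpi.h>
    #include "mmpi.h"

    /* Sketch: create an MMPI buffer, pack a small header and payload,
       and send it with one of the wrappers above. */
    void send_subvector( MPI_Comm comm , int dest , int tag ,
                         int disp , int len , double * values )
    {
         MMPI_Buf buf = MMPI_BUF_NULL ;

         MMPI_Buf_create( 256 , comm , &buf );   /* initial capacity; grows as needed */

         MMPI_Buf_pack( &disp  , 1   , MPI_INT    , &buf );
         MMPI_Buf_pack( &len   , 1   , MPI_INT    , &buf );
         MMPI_Buf_pack( values , len , MPI_DOUBLE , &buf );

         MMPI_Buf_send( buf , dest , tag );      /* MPI_PACKED send on 'comm' */

         MMPI_Buf_free( &buf );
    }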

    4.4 Receive and Unpack

A message may be received into an MMPI buffer. The MPI receive statement must include the buffer's pointer, length, and communicator, and have the MPI_PACKED data type specification. The following MMPI receive routines provide syntactic wrappers for receiving an MMPI buffer.


    int MMPI_Buf_recv     ( MMPI_Buf buf , int src , int tag , MPI_Status *  stat );
    int MMPI_Buf_irecv    ( MMPI_Buf buf , int src , int tag , MPI_Request * request );
    int MMPI_Buf_recv_init( MMPI_Buf buf , int src , int tag , MPI_Request * request );

Once an MMPI buffer has been received, the size of the received message must be extracted from the MPI_Status and registered with the buffer. The following routine performs this task.

    int MMPI_Buf_status( MMPI_Buf buf , MPI_Status * status );

A received buffer is unpacked with MMPI_Buf_unpack. This routine has a similar interface to the MPI routine MPI_Unpack. The significant difference is that several calling arguments have been "compressed".

    int MMPI_Buf_unpack( MMPI_Buf     inbuf ,     /* IN     */
                         void *       outbuf ,    /* IN OUT */
                         int          outcount ,  /* IN     */
                         MPI_Datatype datatype ); /* IN     */

The following is an example of a receive where the status is extracted and registered with the buffer.

    int        src , tag ;
    MPI_Status stat ;
    MMPI_Buf   recv_buf ;

    MMPI_Buf_recv  ( recv_buf , src , tag , &stat );
    MMPI_Buf_status( recv_buf , &stat );
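Continuing this fragment (a sketch only; recv_buf is assumed to have been created with MMPI_Buf_create for the appropriate communicator, and the sender is assumed to have packed a displacement, a length, and at most 64 double values), the registered buffer might then be unpacked:

    /* Sketch: unpack the fields the sender is assumed to have packed. */
    int    disp , len ;
    double values[ 64 ] ;

    MMPI_Buf_unpack( recv_buf , &disp  , 1   , MPI_INT    );
    MMPI_Buf_unpack( recv_buf , &len   , 1   , MPI_INT    );
    MMPI_Buf_unpack( recv_buf , values , len , MPI_DOUBLE );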


5 Consumer

An MMPI Consumer is a global opaque object which may receive multiple messages of dynamic size and content. These messages are sent to a consumer handler on a designated processor. A consumer handler is an application provided routine and pointer, similar to the request handler.

5.1 Consumer Handler

The interface requirement for the consumer handler routine is given in the following C type definition.

    typedef int (*MMPI_Con_handler)( void *   extra_state ,
                                     int      source ,
                                     MMPI_Buf recv_buf );

Just as with the request handler, the consumer handler data is an anonymous pointer value.
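A minimal consumer handler skeleton is sketched below; the assumed message format (an integer count followed by that many double values, at most 64) and the accumulation into the extra_state data are illustrative only. The complete examples in Section 7 show consumer handlers in context.

    #include <mpi.h>
    #include "mmpi.h"

    /* Sketch of a handler conforming to MMPI_Con_handler.  The 'source'
       argument is unused in this sketch. */
    int my_con_handler( void * extra_state , int source , MMPI_Buf recv_buf )
    {
         double * accum = (double *) extra_state ;   /* application supplied data */
         double   values[ 64 ] ;
         int      count , i ;

         MMPI_Buf_unpack( recv_buf , &count , 1     , MPI_INT    );
         MMPI_Buf_unpack( recv_buf , values , count , MPI_DOUBLE );

         for ( i = 0 ; i < count ; ++i ) accum[i] += values[i] ;

         return MPI_SUCCESS ;
    }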

    5.2 Create and Free

Creation of a global consumer opaque object is a synchronous operation. Global tags are reserved for the global object when it is created and released when it is freed. As such, MMPI_Con_create and MMPI_Con_free must be called synchronously on all processors of the given communicator.

    int MMPI_Con_create( MPI_Comm         comm ,          /* IN  */
                         void *           extra_state ,   /* IN  */
                         MMPI_Con_handler recv_handler ,  /* IN  */
                         MMPI_Con *       consumer );     /* OUT */

    int MMPI_Con_free( MMPI_Con * consumer );

    int MMPI_Con_reset( MMPI_Con * consumer );

A global consumer data object allocates buffers on each processor as required to receive dynamic messages. These buffers may be deallocated, without freeing the consumer, via the MMPI_Con_reset routine. This routine returns the consumer to its initial (created) state.


5.3 Pack and Send

Consumer messages are point-to-point messages. Each message sent to a consumer is an explicit operation within the application code, whereas the consumer's receive operation is automatically completed. Each received consumer message is processed by the consumer handler.

A consumer message is a packed MMPI buffer. The message buffer must be initialized with a call to MMPI_Con_init and sent with a call to MMPI_Con_send. Between these two calls the buffer may be arbitrarily packed, for subsequent unpacking by a matching consumer handler.

    int MMPI_Con_init( MMPI_Con   consumer ,      /* IN     */
                       MMPI_Buf * buf );          /* IN OUT */

    int MMPI_Con_send( MMPI_Buf buf ,             /* IN */
                       int      destination ,     /* IN */
                       MMPI_Con consumer );       /* IN */

The MMPI_Con_init routine sets up a header in the buffer which is processed by the consumer's request handler on the destination processor. The MMPI_Con_send routine insures that the receive buffer on the destination processor is ready to receive the message. If the destination processor is not ready, the MMPI_Con_send routine blocks via MMPI_Wait. Once the destination processor is ready, the message is sent with a call to MPI_Rsend.

    5.4 Test and Wait

After a consumer message is received and processed by the destination consumer handler, an acknowledgement message is returned to the source processor. The application may test or wait for this acknowledgement.

    int MMPI_Con_wait( MMPI_Con consumer ,        /* IN  */
                       int      destination );    /* IN  */

    int MMPI_Con_test( MMPI_Con consumer ,        /* IN  */
                       int      destination ,     /* IN  */
                       int *    flag );           /* OUT */
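For example, a sender can block until its consumer message has been consumed on the destination processor. The routine send_and_confirm below is an illustrative sketch only, assuming the interfaces documented above.

    #include <mpi.h>
    #include "mmpi.h"

    /* Sketch: send a consumer message and wait for the destination's
       consumer handler to acknowledge that it has been processed. */
    void send_and_confirm( MMPI_Con consumer , int dest , int value )
    {
         MMPI_Buf buf = MMPI_BUF_NULL ;

         MMPI_Con_init( consumer , &buf );
         MMPI_Buf_pack( &value , 1 , MPI_INT , &buf );
         MMPI_Con_send( buf , dest , consumer );

         MMPI_Con_wait( consumer , dest );   /* returns after the acknowledgement arrives */

         MMPI_Buf_free( &buf );
    }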

5.5 Query

An MMPI consumer may be queried for its communicator, handler function, and handler data.

    int MMPI_Con_comm( MMPI_Con   consumer ,
                       MPI_Comm * comm );

    int MMPI_Con_func( MMPI_Con           consumer ,
                       MMPI_Con_handler * handler );

    int MMPI_Con_data( MMPI_Con consumer ,
                       void **  extra_state );

5.6 Consumer Handler Semantics

An application's consumer handler routine may be called recursively. Such a recursive call occurs if 1) the consumer handler routine calls MMPI_Test, MMPI_Wait, or MMPI_Serve_handler (a call to MMPI_Barrier from within a handler is erroneous) and 2) another message is sent to the consumer handler while the previous message is being processed. Consistency of the recv_buf message buffer is insured by providing the consumer handler routine with a copy of the received message buffer. Thus recursive calls to the consumer handler routine are given a unique recv_buf message buffer; however, each call is given the same extra_state argument.

The consumer handler recursion semantics are controlled by the MMPI Consumer's request handler. A significantly simplified version of the MMPI Consumer's request handler is given here.

    int con_handler( void        * extra_state ,
                     MPI_Request * request ,
                     MPI_Status  * status )
    {
         MMPI_Con con = (MMPI_Con) extra_state ;
         int      src = status->MPI_SOURCE ;
         MMPI_Buf buf = MMPI_BUF_NULL ;
         int      ret ;

         /* Consumer message received into 'con->buf' via 'request' */

         MMPI_Buf_status( con->buf , status );
         MMPI_Buf_copy( con->buf , &buf );

         MPI_Start( request );

         /* Another consumer message may now be received into 'con->buf' */

         ret = (*con->handler)( con->extra_state , src , buf );

         MMPI_Buf_free( &buf );
         return ret ;
    }

Recursion is easily avoided: simply do not call MMPI_Test, MMPI_Wait, or MMPI_Serve_handler from within the consumer handler routine. When recursion is permissible, consistency of the shared extra_state and any other shared data is the responsibility of the application.


6 Log File Management

The name of the MMPI log file may be selected immediately following initialization of MPI via a call to:

    int MMPI_Log_init( char * base_name );

where the input base_name is appended with the rank of the processor in the MPI_COMM_WORLD communicator. The base name for the log file will default to MMPI.LogP# unless otherwise specified through an initial call to MMPI_Log_init.

The MMPI log file is accessed for output as

    int       MMPI_Log_file_d();    /* C file descriptor  */
    FILE *    MMPI_Log_file();      /* C FILE buffer      */
    ostream & MMPI_Log_stream();    // C++ output stream

The MMPI log file is also written to by the routines

    void MMPI_Log_message( char * who , char * msg );
    void MMPI_Log_abort  ( char * who , char * msg , int mpi_err );

where who is the name of the calling routine, msg is the message to log for that routine, and mpi_err is the MPI error code to be passed to MPI_Abort.
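A brief usage sketch follows; the routine open_log, the base name, and the logged text are illustrative only.

    #include <stdio.h>
    #include "mmpi.h"

    /* Sketch: to be called immediately after MPI_Init. */
    void open_log( void )
    {
         MMPI_Log_init( "MyApp.P" );   /* log file name: MyApp.P<rank> */

         fprintf( MMPI_Log_file() , "application output\n" );

         MMPI_Log_message( "open_log" , "initialization complete" );
    }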


7 Examples

Two examples of using the MMPI library are given here. The first example uses MMPI to define a remote put operation. In the second example MMPI is used to define a remote get operation. In both examples the "memory" which is shared is a designated vector declared on each processor. Remote get and put operations to this shared vector are defined by a subvector displacement and length.

In both examples each processor puts or gets a random set of subvectors to or from a randomly determined processor. The source, destination, and content of the put and get messages have been randomized to illustrate how easily such unpredictable messages can be managed with MMPI.

7.1 Remote Put Example

Three routines are defined in the remote put example:

- subvec, a random subvector generation routine,
- con_handler, an MMPI Consumer handler, and
- main, the main program.

The subvec routine generates a subvector of a random displacement, length, and destination processor. The main routine packs this subvector into a buffer and sends it to an MMPI Consumer on the destination processor. The message buffer is unpacked on the destination processor by the con_handler routine and then summed into the "shared" vector.

          #include <stdio.h>
          #include <stdlib.h>
          #include "mmpi.h"

  5       /* Random subvector generation */

          extern int subvec(             /* Return 0 when done          */
               int    * dest ,           /* OUT: Destination processor  */
               int    * disp ,           /* OUT: Subvector displacement */
 10            int    * len ,            /* OUT: Subvector length       */
               double * val );           /* OUT: Subvector values       */

          /* Consumer subprogram */

 15       extern int con_handler(
               void   * extra_state ,    /* IN OUT: Consumer subprogram data */
               int      source ,         /* IN: Source of the message        */
               MMPI_Buf recv_buf );      /* IN: Message contents             */

 20       #define GLEN 100
          #define LLEN  10

          main( int argc , char * argv[] )
          {
 25            MMPI_Buf send_buffer = MMPI_BUF_NULL ;
               MMPI_Con consumer    = MMPI_CON_NULL ;
               double   shared_data[ GLEN ] ;
               double   send_data[ LLEN ] ;
               int      i , proc , disp , len ;
 30
               for ( i = 0 ; i < GLEN ; ++i ) shared_data[i] = 0.0 ;

               MPI_Init( &argc , &argv );     /* Initialize MPI                   */
               MMPI_Log_init( "Demo1.P" );    /* Name the MMPI Error/Warning file */
 35            MMPI_Enable( MPI_COMM_WORLD );

               /* Synchronous creation of Consumer object.
                  The 'extra_state' is the shared vector 'shared_data'. */
               MMPI_Con_create( MPI_COMM_WORLD , shared_data , con_handler , &consumer );
 40
               /* Begin asynchronous loop */

               while ( subvec( &proc , &disp , &len , send_data ) ) {

 45                 /* Pack and send subvector */

                    MMPI_Con_init( consumer , &send_buffer );
                    MMPI_Buf_pack( &disp , 1 , MPI_INT , &send_buffer );
                    MMPI_Buf_pack( &len  , 1 , MPI_INT , &send_buffer );
 50                 MMPI_Buf_pack( send_data , len , MPI_DOUBLE , &send_buffer );
                    MMPI_Con_send( send_buffer , proc , consumer );
               }

               /* End asynchronous loop.  Done with the consumer. */
 55
               MMPI_Buf_free( &send_buffer );
               MMPI_Con_free( &consumer );    /* MMPI_Barrier called internally */

               /* Print the results */
 60
               for ( i = 0 ; i < GLEN ; ++i ) {
                    if ( 0 == i % 5 ) fprintf( MMPI_Log_file() , "\n" );
                    fprintf( MMPI_Log_file() , " %g" , shared_data[i] );
               }
 65            fprintf( MMPI_Log_file() , "\n" );

               MMPI_Disable( MPI_COMM_WORLD );
               MPI_Finalize();
               return 0 ;
 70       }

          /* Subvector summation handler */

          int con_handler( void * extra_state , int source , MMPI_Buf Buf )
 75       {
               double   data[ LLEN ] ;
               int      disp , len , i ;
               double * shared_data = (double *) extra_state ;

 80            /* Unpack the subvector */

               MMPI_Buf_unpack( Buf , &disp , 1 , MPI_INT );
               MMPI_Buf_unpack( Buf , &len  , 1 , MPI_INT );
               MMPI_Buf_unpack( Buf , data , len , MPI_DOUBLE );
 85
               /* Summation */

               for ( i = 0 ; i < len ; ++i ) shared_data[disp+i] += data[i] ;

 90            return MPI_SUCCESS ;
          }

          /* Generate random subvector contributions */

 95       int subvec( int * dest , int * disp , int * len , double * val )
          {
               static int count = 0 ;
               static int num   = 0 ;
               static int proc  = -1 ;
100            static int nproc = -1 ;
               int i ;

               if ( 0 == num ) {
                    MPI_Comm_size( MPI_COMM_WORLD , &nproc );
105                 MPI_Comm_rank( MPI_COMM_WORLD , &proc );

                    srand( proc );            /* Different random numbers on each processor */

                    num = 10 + rand() % 10 ;  /* Random number of iterations */
110            }

               *disp = rand() % ( GLEN - LLEN );   /* Random displacement */
               *len  = 1 + rand() % ( LLEN - 1 );  /* Random length       */
               *dest = rand() % nproc ;            /* Random destination  */
115
               for ( i = 0 ; i < *len ; ++i ) val[i] = 1 + proc ;

               return ( ++count < num );
          }

Initialization (Lines #33-39)

Immediately following the call to MPI_Init, which initializes MPI, the MMPI log file is named with a call to MMPI_Log_init. The given log file name is appended with the processor's rank within the MPI_COMM_WORLD communicator. This file will be opened for writing upon the first call to MMPI_Log_file_d, MMPI_Log_file, or MMPI_Log_stream. The MMPI log file will not be opened unless one of these routines is called.

The call to MMPI_Enable activates tag management for the MPI_COMM_WORLD communicator. Recall that this routine contains a call to MPI_Bcast.

The MMPI Consumer is created with a call to MMPI_Con_create on line #39. This must be a synchronous call performed by all processors of the input communicator, MPI_COMM_WORLD. The second and third arguments, shared_data and con_handler, are the handler data and handler routine which is called to process messages sent to the consumer. The final argument, consumer, is the MMPI Consumer object. The handler data must be a pointer convertible to void * and the handler routine must conform to the MMPI_Con_handler type specification (Section 5). The handler data (shared_data) is passed to the handler routine (line #74) via the routine's first argument, labeled extra_state in the example.

    Asynchronous Part (Lines #41-54)

The main routine's "asynchronous part" performs a random number of iterations on each processor. Within this part of the example program each processor is "loosely" Single Instruction Multiple Data (SIMD). The processors are performing similar but not identical operations, which is emphasized by each processor performing a different number of iterations.

Within the asynchronous part of the main routine consumer messages are generated with a similar format: subvector displacement, subvector length, and subvector values. The subvector is packed into a message buffer in four steps (lines #47-50):

- MMPI_Con_init is called to initialize the message buffer for the given consumer,
- two calls to MMPI_Buf_pack pack the displacement and length, and
- a final call to MMPI_Buf_pack packs the subvector values.

The packed buffer is sent to the MMPI Consumer on the destination processor with a call to MMPI_Con_send (line #51).

The consumer messages of this example have both a random destination and random size. Thus the number of messages, message source-destination, and exact message content cannot be determined a priori. It is applications with this type of non-deterministic communication pattern which motivated the development of MMPI.


Re-Synchronization and Conclusion (Lines #57-70)

The main routine ends the asynchronous part by calling MMPI_Con_free on line #57. This MMPI routine contains a call to MMPI_Barrier which synchronizes the consumer's processors. Once the consumer's processors have been synchronized, it is certain that all pending consumer messages have been processed and so it is safe to destroy the consumer.

The resulting "shared" vector shared_data is printed by each processor to the MMPI log file. A call to MMPI_Log_file returns the C FILE pointer to this file.

    Consumer Handler (Lines #74-91)

The con_handler subprogram, beginning on line #74, is called to process each message received by the consumer. Recall that the value of the extra_state argument passed to the con_handler routine is the shared_data pointer which is passed to MMPI_Con_create on line #39. The source argument identifies the source of the message while the Buf argument holds the message contents.

The con_handler routine casts the extra_state to a pointer of the appropriate type on line #78. The put subvector's displacement, length, and values are unpacked from the MMPI Buffer Buf on lines #82-84. The put subvector is then summed into the appropriate subvector of shared_data on line #88.

    Random Subvector Routine (Lines #95-119)

The random subvector generation routine subvec has been included to complete the example. Note that the processor rank within MPI_COMM_WORLD is used to seed the random number generator on line #107. Thus each processor's subvec routine should generate a different set of random subvectors and message destinations.

7.2 Remote Get Example

The remote get example also defines three routines:

- subvec_get, a random subvector retrieval routine,
- con_handler, an MMPI Consumer handler, and
- main, the main program.

The subvec_get routine retrieves a subvector of a random displacement, length, and source processor. The main routine sums the retrieved subvector into a local vector. The con_handler routine accepts subvector get requests and responds with the requested subvector values.

          #include <stdio.h>
          #include <stdlib.h>
          #include "mmpi.h"

  5       extern int con_handler( void * , int , MMPI_Buf );

          extern int subvec_get(
               MMPI_Con consumer , int * disp , int * len , double * val );

 10       #define GLEN 100
          #define LLEN  10

          main( int argc , char * argv[] )
          {
 15            MMPI_Con consumer = MMPI_CON_NULL ;
               double   local_data[ GLEN ] ;
               double   shared_data[ GLEN ] ;
               double   recv_data[ LLEN ] ;
               int      i , proc , disp , len ;
 20
               MPI_Init( &argc , &argv );     /* Initialize MPI                   */
               MMPI_Log_init( "Demo2.P" );    /* Name the MMPI Error/Warning file */
               MMPI_Enable( MPI_COMM_WORLD );
               MPI_Comm_rank( MPI_COMM_WORLD , &proc );
 25
               /* Initialize local and shared vectors */

               for ( i = 0 ; i < GLEN ; ++i ) local_data[i] = 0.0 ;

 30            for ( i = 0 ; i < GLEN ; ++i )
                    shared_data[i] = 1000 * ( proc + 1 ) + i ;

               /* Synchronous creation of Consumer object.
                  The 'extra_state' is the shared vector 'shared_data'. */
 35
               MMPI_Con_create( MPI_COMM_WORLD , shared_data , con_handler , &consumer );

               /* Begin asynchronous loop */

 40            while ( subvec_get( consumer , &disp , &len , recv_data ) ) {

                    /* Sum received subvector */

                    for ( i = 0 ; i < len ; ++i ) local_data[disp+i] += recv_data[i] ;
 45            }

               /* End asynchronous loop. */

               MMPI_Con_free( &consumer );    /* MMPI_Barrier called internally */
 50
               /* Print the results */

               for ( i = 0 ; i < GLEN ; ++i ) {
                    if ( 0 == i % 5 ) fprintf( MMPI_Log_file() , "\n" );
 55                 fprintf( MMPI_Log_file() , " %g" , local_data[i] );
               }
               fprintf( MMPI_Log_file() , "\n" );

               MMPI_Disable( MPI_COMM_WORLD );
 60            MPI_Finalize();
               return 0 ;
          }

          /* Random subvector requests */
 65
          int subvec_get( MMPI_Con consumer , int * disp , int * len , double * val )
          {
               static int  count = 0 ;
               static int  num   = 0 ;
 70            static int  proc  = -1 ;
               static int  nproc = -1 ;
               MPI_Request request ;
               MPI_Status  status ;
               MMPI_Buf    send_buffer = MMPI_BUF_NULL ;
 75            int src , tag ;

               if ( 0 == num ) {
                    MPI_Comm_size( MPI_COMM_WORLD , &nproc );
                    MPI_Comm_rank( MPI_COMM_WORLD , &proc );
 80
                    srand( proc );            /* Different random numbers on each processor */

                    num = 10 + rand() % 10 ;  /* Random number of iterations */
               }
 85
               *disp = rand() % ( GLEN - LLEN );   /* Random displacement */
               *len  = 1 + rand() % ( LLEN - 1 );  /* Random length       */
               src   = rand() % nproc ;            /* Random source       */

 90            /* Unique tag for the response */

               MMPI_Tag_get_local( MPI_COMM_WORLD , &tag );

               /* Start receive operation for the response */
 95
               MPI_Irecv( val , *len , MPI_DOUBLE , src , tag ,
                          MPI_COMM_WORLD , &request );

               /* Send the request */
100
               MMPI_Con_init( consumer , &send_buffer );
               MMPI_Buf_pack( &tag , 1 , MPI_INT , &send_buffer );
               MMPI_Buf_pack( disp , 1 , MPI_INT , &send_buffer );
               MMPI_Buf_pack( len  , 1 , MPI_INT , &send_buffer );
105            MMPI_Con_send( send_buffer , src , consumer );


               /* Wait for response.
                  This processor's handlers may be invoked while waiting. */
110
               MMPI_Wait( &request , &status );


               /* Release the tag */
115
               MMPI_Tag_rel_local( MPI_COMM_WORLD , &tag );

               return ( ++count < num );
          }

          /* Subvector 'get' handler */

          int con_handler( void * extra_state , int source , MMPI_Buf Buf )
          {
125            double * shared_data = (double *) extra_state ;
               MPI_Comm comm ;
               int      tag , disp , len ;

               MMPI_Buf_comm( Buf , &comm );  /* Query the communicator */
130
               /* Unpack the subvector request */

               MMPI_Buf_unpack( Buf , &tag  , 1 , MPI_INT );
               MMPI_Buf_unpack( Buf , &disp , 1 , MPI_INT );
135            MMPI_Buf_unpack( Buf , &len  , 1 , MPI_INT );

               /* Reply with subvector values */

               MPI_Rsend( shared_data + disp , len , MPI_DOUBLE , source , tag , comm );
140
               return MPI_SUCCESS ;
          }

    Main Routine (Lines #13-62)

In this example the main routine declares a "shared" vector shared_data which is queried by other processors, and a local vector local_data into which queried subvectors are summed. In the asynchronous part of the main routine, lines #38-47, the subvec_get routine is called a random number of times to retrieve random subvectors. Each subvector is summed into local_data.


Random Subvector Retrieval Routine (Lines #66-119)

The random subvector routine subvec_get generates random subvector specifications and retrieves the corresponding subvector values from a random processor. The remote get procedure in subvec_get begins on line #90 and ends on line #116. This procedure is as follows:

- start a nonblocking receive for the requested subvector values,
- send the subvector request to the "source" processor's consumer, and
- wait for the "source" processor to reply with the subvector values.

The remote get procedure's nonblocking receive requires a message tag which is guaranteed to be unique within the MPI communicator. Such a tag is obtained with a call to MMPI_Tag_get_local. The returned tag is guaranteed to 1) not conflict with any of MMPI's internal tags for that communicator and 2) not be returned by any other call to MMPI_Tag_get_local for the communicator, until the tag is explicitly released by the application. Thus as long as the application obtains all tags from the MMPI_Tag_* routines the tags will be unique to the communicator.

The nonblocking receive is started with a call to MPI_Irecv on line #96. This call to MPI_Irecv explicitly specifies the tag and source; MPI_ANY_TAG and MPI_ANY_SOURCE are not used.

The subvector request is packed into a message buffer and sent to the MMPI Consumer on the "source" processor (lines #101-105). Note on line #102 that the tag of the nonblocking receive is included in the subvector request message. This enables the "source" processor to form the appropriate reply message.

Once the request message has been sent, the subvec_get routine calls the MMPI_Wait routine to wait for the "source" processor to reply with the requested subvector values. The MMPI_Wait routine is called, as opposed to the MPI_Wait routine, so that the local processor may complete other asynchronous messages while waiting. Consider the situation where two processors simultaneously send subvector requests to one another. If each processor were to call MPI_Wait within the subvec_get routine the example would deadlock. However, by calling MMPI_Wait instead, the processors may respond to each other's requests while waiting for their own requests to be satisfied.

After the requested subvector has been received, as indicated by the call to MMPI_Wait returning, the previously obtained tag is no longer needed. This tag is released for subsequent re-use by the call to MMPI_Tag_rel_local on line #116.

    Consumer Handler (Lines #123-142)

The example's MMPI Consumer handler processes the subvector request by sending the requested subvector values to the requesting processor. Recall that the requesting processor started an asynchronous receive prior to sending out the subvector request. This ordering of operations on the requesting processor allows the con_handler routine to call MPI_Rsend (line #139) when replying with the requested subvector values.


8 Conclusion

The MMPI source code is available in the public domain under GNU license. The MMPI source code and other information can be obtained at

    http://www.ticam.utexas.edu/~carter/mmpi.html

This WWW site includes updates to this report, a list of host+MPI configurations under which MMPI has been tested, and relevant WWW links.

The MMPI library is being used in the development of two parallel libraries:

- the Scalable Distributed Dynamic Array (SDDA), and
- the Parallel Linear Algebra Package (PLAPACK) [5].

The Scalable Distributed Dynamic Array (SDDA) library consists of a small set of C++ components supporting distributed object management. The SDDA uses MMPI to provide a "shared memory" view of a set of distributed data objects, where every processor may randomly access data objects regardless of their distribution.

The Parallel Linear Algebra Package (PLAPACK) uses MMPI to support application filling and querying of distributed vectors and matrices. MMPI is used to provide the application with a "shared memory" view of the PLAPACK vectors and matrices, where every processor may randomly access any part of a distributed vector or matrix.

Requirements for asynchronous message management in MPI are identified and analyzed in this report. The MMPI library represents one approach to meeting these requirements. These requirements, as presented in this report, will be submitted to the Message-Passing Interface Forum (see http://parallel.nas.nasa.gov/MPI-2/). The goal is to have the MPI-2 standard meet the asynchronous message management requirements presented here.

Acknowledgements

Development of the MMPI library was funded by ARPA Contract DABT63-92-C-0042.

The MMPI functionality was originally developed within the "alpha version" Scalable Distributed Dynamic Array (SDDA). Collaborators for this "alpha version" SDDA include Dr. James C. Browne, Dr. Robert van de Geijn, and Dr. Abani K. Patra. Separation of the MMPI functionality from the SDDA, and its current MPI-like syntax and semantics, was motivated in part by the MPI-2 effort. We thank Dr. Steven Huss-Lederman for discussions regarding MPI request handler semantics.

References

[1] Intel Corporation. Paragon(TM) System C Calls Reference Manual, 1995. http://www.ssd.intel.com/pubs.html.

[2] H. Carter Edwards. A Consistent Extension of the Message-Passing Interface (MPI) for Nonblocking Communication Handlers. TICAM Report 96-11, Texas Institute for Computational and Applied Mathematics, The University of Texas, Austin, TX, 1996.

[3] Message-Passing Interface Forum. MPI: A Message-Passing Interface Standard, June 1995. http://www.mcs.anl.gov/mpi/index.html.

[4] Alan Michael Mainwaring. Active Message Applications Programming Interface and Communication Subsystem Organization. Technical report, Computer Science Division, University of California at Berkeley, 1995. http://now.cs.berkeley.edu/AM/active-messages.html.

[5] Robert van de Geijn et al. Parallel Linear Algebra Package (PLAPACK). http://www.cs.utexas.edu/users/rvdg/plapack/index.html.
