112
MESSAGE PASSING

chap2

Embed Size (px)

Citation preview

Page 1: chap2

MESSAGE PASSING

Page 2: chap2

04/08/23 2/112

Message passingMessage passing

Key concepts

Introduction

IPC

Remote Procedure Calls

Group communication

Page 3: chap2

04/08/23 3/112

IntroductionIntroduction

In distributed system, processes executing on

different computers often need to communicate

with each other to achieve some common goal

Inter process communication (I.P.C.) requires

information sharing among two or more processes

2 basic methods for information sharing are

1. Original sharing or shared-Memory approach

2. Copy sharing or message-passing approach

Page 4: chap2

04/08/23 4/112

Introduction (contd…)Introduction (contd…) Shared-Memory approach

Message-passing approach

A message passing system is a subsystem of a distributed OS that provides a set of message-based IPC protocols

It serves as a suitable infrastructure for building other higher level IPC systems, such as remote procedure call(RPC) and distributed shared memory(DSM)

Read A

P1 P2 Shared commonMemory area

Write A

P1 P2

Send A Receive A

Page 5: chap2

04/08/23 5/112

Desirable Features of a good message-Desirable Features of a good message-passing systempassing system

Simplicity

Uniform semantics Local communication

Remote communication

Efficiency If the message passing system is not efficient, IPC

become more expensive. i.e. users will not feel like using this mechanism)

Page 6: chap2

04/08/23 6/112

Features of a good message-passing Features of a good message-passing system (contd…)system (contd…)

Some optimizations normally adopted

Avoiding the cost of establishing and terminating connection between the processes for each and every message exchange.

Minimizing the costs of maintaining connections

Piggy backing of acknowledgement.

Reliability

DS are prone to node crashes or link failures. Retransmit the message ( may be based on timeouts)

Due to timeouts – Duplicate Message

A good Message passing system should have IPC protocols to handle these issues.

Page 7: chap2

04/08/23 7/112

Features of a good message-passing Features of a good message-passing system (contd…)system (contd…)

Correctness - related to group communication Atomicity – either to all or None

Ordered delivery – Order acceptable to the application

Survivability – guarantees message delivery despite of failures

Flexibility IPC primitives must also have the flexibility to permit

any kind of control flow between the co-operating processes including synchronous and asynchronous send/ Receive.

Page 8: chap2

04/08/23 8/112

Features of a good message-passing Features of a good message-passing system (contd…)system (contd…)

Security Authentication of the receiver / sender

Encrypted message

Portability 2 aspects of portability

The message passing system should itself be portable

The applications written by using the primitives of IPC protocols of the message passing system should be portable. So, Heterogeneity must be considered while desiging message passing system

Page 9: chap2

04/08/23 9/112

Issues in IPC by message passingIssues in IPC by message passing A message is a block of information formatted by a

sending process in such a manner that it is

meaningful to receiving process

It consists of a fixed length header and a variable size

collection of typed data objects

The header consists of:

Address – to identify the sending/receiving

process

Sequence number – message identifier for identifying lost / duplicate message

Structural information

1. Type – data or pointer to data

2. Length of the variable size message

Page 10: chap2

04/08/23 10/112

Issues in IPC (contd…)Issues in IPC (contd…)

A typical message structure

Actual dataor

pointerto the data

Structural information

Number of

bytes/elements

Type

Sequence number

orMessage id

Addresses

ReceivingProcessaddress

Sending processaddress

Fixed-length headerVariable size collection of typed data

Page 11: chap2

04/08/23 11/112

Issues in IPC (contd…)Issues in IPC (contd…) In the design of an IPC protocol, following important issues

need to be considered Who is the sender? Who is the receiver? Is there one receiver or many receivers? Is the message guaranteed to have been accepted by its

receiver's? Does the sender need to wait for the reply? What should be done if a node crash or link failure

occurs? What should be done if the receiver is not ready to

accept the message? If there are several outstanding messages for a receiver,

can it choose the order in which to service the outstanding messages?

Page 12: chap2

04/08/23 12/112

Issues in IPC (contd…)Issues in IPC (contd…)

Issues in IPC are addressed by:

Synchronization

Buffering

Multi Data gram Messages

Encoding and Decoding

Process Addressing

Failure Handling

Group Massaging

Page 13: chap2

04/08/23 13/112

Synchronization Synchronization Semantics used for synchronization may be broadly

classified as

Blocking – its invocation blocks the execution of its invoker

Nonblocking - Its invocation does not block the execution of its invoker

How a non blocking RECEIVING process knows message arrival?

Polling : Periodically poll the Kernal to check the Buffer status.

Interrupts: When message is filled in the buffer, a software interrupt is used to notify the receiving process.

Page 14: chap2

04/08/23 14/112

Sender’s execution Receiver’s execution

Message

Acknowledgement Execution resumed

Send (acknowledgement)

Execution resumed

Send (message)

Execution suspended

Receive (message);

Execution suspended

Blocked state

Executing state

Synchronous mode of communication with both send and receive primitives having blocking-type semantics

Page 15: chap2

04/08/23 15/112

Buffering Buffering Messages can be transmitted from one process to

another by copying the body of the message from the

address space of sending process to the address

space of the receiving process

The message buffering strategy in IPC is strongly

related to synchronization strategy

Four types of buffering strategy are:

Null buffer ( or no buffering)

Single message buffer

Buffer with unbounded capacity

Finite-bound (or multiple-message) buffer

Page 16: chap2

04/08/23 16/112

Buffering (contd…)Buffering (contd…)

Null buffer (or No buffering) There is no place to temporarily store the message

Strategies used are: The message remains in the senders process’s

address space and the execution of the send is delayed until the receiver executes the corresponding receive

The message is simply discarded and the timeout mechanism is used to resend the message after a timeout period

Page 17: chap2

04/08/23 17/112

Buffering (contd…)Buffering (contd…)

The logical path of message transfer is directly from the sender’s address space to the receiver’s address space, involving single copy operation

Message transfer in synchronous send with no buffering strategy

MSG

Sending process Receiving process

Page 18: chap2

04/08/23 18/112

Buffering (contd…)Buffering (contd…)

Single-message buffer Null buffer strategy is not suitable for synchronous

communication A message has to be transferred two or more times,

and receiver of the message has to wait for the entire time taken to transfer the message across the network

Synchronous communication mechanisms in Distributed systems use a single-message buffer strategy

A buffer having the capacity to store a single-message is used on the receiver’s node

Page 19: chap2

04/08/23 19/112

Buffering (contd…)Buffering (contd…) Idea is to keep the message ready for use at location of

the receiver

The request message is buffered on the receiver’s node if the receiver is not ready to receive the message

The message buffer may be either in kernel’s address space or in the receiver’s process’s address space

Sending process Receiving process

Single-message buffer

Node boundary

Message transfer in synchronous send with single-message buffering strategy (two copy operations needed)

Page 20: chap2

04/08/23 20/112

Buffering (contd…)Buffering (contd…)

Unbounded-capacity buffer In asynchronous mode of communication, since a

sender does not wait for the receiver to be ready, there may be several pending messages that have not yet been accepted by the receiver

An unbounded-capacity buffer is needed that can store all unreceived messages to support asynchronous communication With assurance that all the messages sent to the

receiver will be delivered

Page 21: chap2

04/08/23 21/112

Buffering (contd…)Buffering (contd…) Finite-bound (or multiple-message) buffer

Unbounded capacity of a buffer is practically impossible

When buffer has finite-bound - problem is buffer overflow

The buffer overflow can be dealt with one of 2 ways:

Unsuccessful communication

Message transfers simply fail whenever there is no more buffer space

Flow-controlled communication

The sender is blocked until the receiver accepts some messages, thus creating space in the buffer for new messages

Page 22: chap2

04/08/23 22/112

Buffering (contd…)Buffering (contd…)

The message is first copied from the sending process’s memory into the receiving process’s mailbox

Then message is copied from the mailbox to the receiver’s memory when the receiver calls for the message

MSG

Send Receive

Multiple-message buffer/mailbox port

Message transfer in asynchronous send with multiple-message buffering strategy

Page 23: chap2

04/08/23 23/112

Multidatagram messagesMultidatagram messages Maximum transfer unit (MTU)

Upper bound on the size of data that can be transmitted at a time

Message whose size is greater than MTU has to be fragmented into multiples of MTU and sent separately

Each fragment is sent in a packet (known as datagram)

Messages smaller than MTU can be sent in a single packet (known as single-datagram messages)

Messages larger than MTU have to separated and sent in multiple packets (known as Multidatagram messages)

The disassembling and reassembling of messages on sender and receiver side is the responsibility of message passing system

Page 24: chap2

04/08/23 24/112

Encoding and Decoding of Encoding and Decoding of message datamessage data

The structure of the message data should be

preserved between the sending and receiving

processes

It is very difficult to achieve this goal in both

heterogeneous and homogenous systems

2 reasons

An absolute pointer value loses its meaning when

transferred from one process address space to

another Different program objects occupy varying amount of

storage space A message must normally contain several types of

program objects, such as long integers, short int, variable length characters and so on

Page 25: chap2

04/08/23 25/112

Encoding and decoding of message Encoding and decoding of message data (contd…)data (contd…)

Two representation for encoding and decoding of message data: Tagged representation

The type of each program object along with its value is encoded in the message

Because of self-describing nature of the coded data format Receiving process does not need prior knowledge

Untagged representation Message data only contains program objects No information is included in the message data to

specify the type of each program object Receiving process must have prior knowledge of how to

decode the received data

Page 26: chap2

04/08/23 26/112

Process addressingProcess addressing Message passing system usually supports 2 types of

process addressing

Explicit addressing

The process with which communication is desired is

explicitly named as a parameter in the communication

primitive used

Send (Process-id, Message)

To the process

Receive (Process_id, Message)

From the process

Page 27: chap2

04/08/23 27/112

Process addressing (contd…)Process addressing (contd…) Implicit addressing

Process willing to communicate does not explicitly name a process for communication Send-any (service_id, Message)

Send a message to any process that provides the service of type “service id” Receive any (Process_id, Message)

Receive a message from any process & return the “process_id” of the process from which message was received.

Page 28: chap2

04/08/23 28/112

Process addressing (contd…)Process addressing (contd…) Processes can be identified by the combination of three

fields:

Machine_id, local_id, machine_id

First field identifies the node on which process is created

Second field is a local identifier generated by the node

on which processes is created

Third filed identifies the last known location (node) of the

process

The value of the first 2 fields of its identifier never

change; the third field, however, may

This method of addressing is known as link-based

addressing

Page 29: chap2

04/08/23 29/112

Process addressing (contd…)Process addressing (contd…) Link-based addressing:

When a process is migrated from its current node to a new node, a link information {process id, networks M/c id} is left on its previous node and on a new node,

a new local id is assigned to a process, and its process identifier and the new local-id is entered in a mapping table maintained by the kernel of the new node for all processes created on another node but running on their node.

If the value of the third field is equal to the first field, the message will be sent to the node on which the process was created

Page 30: chap2

04/08/23 30/112

Process addressing (contd…)Process addressing (contd…)

Drawbacks: Eventhough it supports migration facility, it suffers from 2 main drawbacks

The overhead of locating a process may be large if the process has migrated several times during its lifetime

It may not be possible to locate a process if an intermediate node on which the process once resided during its lifetime is down

Both process addressing methods are nontransparent due to the need to specify the machine identifier

What are the alternatives?

Page 31: chap2

04/08/23 31/112

Process addressing (contd…)Process addressing (contd…)1. Centralized process identifier allocator

Maintains a counter. When it receives a request for identifier, it returns the current value of the counter and increments the counter

It suffers from poor reliability and scalability

2. Two level naming scheme for processes

1. Machine independent high level name

2. Machine dependent low level name

with a centralized( or replicated/distributed) name server maintaining the map table that maps high level name to the low level name

Page 32: chap2

04/08/23 32/112

Failure handlingFailure handling Possible problems in IPC due to different types of

system failures

Loss of request message

Failure of communication link between sender and receiver or receiver’s node is down at time the request reaches there

Loss of response message

Failure of communication link or Sender’s node is down at the time the response message reaches there

Unsuccessful execution of the request

Receiver’s node crashing while request is being processed

Page 33: chap2

04/08/23 33/112

Failure handling (contd…)Failure handling (contd…)Sender Receiver

Send request

Send request

Send request

Request message

Request message

Request message

Lost

Lost

Response message

Successful request execution

Successful request execution

Crash

Restarted

Send response

a) Request message is lost

b) Response message is lost

c) Receiver’s computer crashed

Page 34: chap2

04/08/23 34/112

Failure handling (contd…)Failure handling (contd…)

Four-message reliable IPC protocol for client-server communication between two processes

Client Server

Request

Acknowledgement

Reply

Acknowledgement Blocked state

Executing state

Page 35: chap2

04/08/23 35/112

Failure handling (contd…)Failure handling (contd…) Three-message reliable IPC protocol for client-server

communication between two processes

Client Server

Request

Reply

Acknowledgement Blocked state

Executing state

Page 36: chap2

04/08/23 36/112

Failure handling (contd…)Failure handling (contd…) Two-message reliable IPC protocol for client-server

communication between two processes

Client Server

Request

Reply

Blocked state

Executing state

Page 37: chap2

04/08/23 37/112

Failure handling (contd…)Failure handling (contd…) Fault tolerant communication between a client and a server

Successful Execution

Retransmit REQUEST Message

Send ResponseSuccessful Execution

Restarted

Lost

Client Server

REQUEST Message

Retransmit REQUEST Message

Retransmit REQUEST Message

Lost

Response

Send Request

Time Out

Time Out

Time Out

Unsuccessful Execution

Crash

Send Request

Send Request

Send Request

These 2 successful executions of the message may produce different results

Page 38: chap2

04/08/23 38/112

Failure handling (contd…)Failure handling (contd…)

Idempotency and handling of duplicate request

messages

Idempotency means repeatability

An Idempotent operation produces same results

without any side effects no matter how many times it

is performed with the same arguments

Example simpleIntrest( 25000, 2, 8 ) procedure

produces same result when executed repeatedly

A Non Idempotent operation produces different

results for the same set of arguments when executed

repeatedly

Page 39: chap2

04/08/23 39/112

Failure handling (contd…)Failure handling (contd…) Example : Non Idempotent operation

int Cal_Final_Marks (int End_Sem_Marks, int attndnce)

{ Total_Marks += End_Sem_Marks ;

if ( attndnce > 95 )

Total_Marks += 5 ;

else if ( attndnce > 90 )

Total_Marks += 3 ;

else if ( attndnce > 85 )

Total_Marks += 2 ;

else if ( attndnce > 80 ) Total_Marks += 1 ; return(Total_Marks ); }

Page 40: chap2

04/08/23 40/112

Failure handling (contd…)Failure handling (contd…)

Execute Cal_Final_Marks.

Total_Marks=43+34+2 = 79

Execute Cal_Final_Marks.

Total_Marks=79++34+2 = 115

Lost

CLIENT SERVERTotal_Marks = 43

Cal_Final_Marks (34, 87)

Retransmit REQUEST Message

Send Request

Retrun(79)

Send Request

Receive Total_Marks = 115

Cal_Final_Marks(34, 87)

Retrun(115)

Timeout

A nonidempotent procedure

Page 41: chap2

04/08/23 41/112

Failure handling (contd…)Failure handling (contd…) When no response is received by the client, it is

impossible to determine whether the failure was due to server crash or loss of the request or response message.

Using timeouts client resends the request.

Repeated execution of NonIdempotent requests results in “ORPHAN” executions

How to ensure only one execution of NonIdempotent requests ?

Using Exactly once semantics

Exactly once semantics is implemented using unique identifier for each request at the client side and reply cache on the server side

Page 42: chap2

04/08/23

Failure handling (contd…)Failure handling (contd…)

REQUESTIDENTIFIER

REPLY TO BE

SENT

Request02 45

.. ..

Check reply Cache for request01.

Execute Cal_Final_Marks.

Total_Marks=43+34+2 = 79Save Reply

Lost

CLIENT SERVERTotal_Marks = 43

Cal_Final_Marks (34, 87)

Retransmit

Retrun(79)

Send Request01

Receive Total_Marks

= 79

Cal_Final_Marks(34, 87)

Retrun(79)

Timeout

EXACTLY ONCE SEMANTICS USING REQUEST IDENTIFIERS AND REPLY CACHE

Send Request01

Check reply Cache for request01. FOUNDExtract Reply

Reply Cache

Request 01 79

NOT FOUND

Page 43: chap2

04/08/23 43/112

Failure handling (contd…)Failure handling (contd…)

Keeping track of Lost &out-of-sequence packets in multi data gram Messages.

How ensure reliable delivery of all the packets of the Multidatagram message?

Simple approach is using STOP & WAIT Protocol

Acknowledge each packet seperately

Disadvantage: Communication Overhead.

Better approach is using BLAST Protocol

Single Acknowledgement packet for all the packets of a multidatagram message

Page 44: chap2

04/08/23 44/112

Failure handling (contd…)Failure handling (contd…)When BLAST protocol is used, Node or Common link

failure leads to

Loss of packets

Out of sequence delivery of Packets

To solve this:

Use of Bitmap to identify the packet of a message using 2 extra fields to the Header.

Total No of Packets, Bit map specifying the position of the packet.

Use “ SELECTIVE REPEAT “ method to transmit the Lost packets after time out period.

Page 45: chap2

04/08/23 45/112

Failure handling (contd…)Failure handling (contd…)

Create a buffer for 4 packets and place this packet in position 1

SENDER RECEIVER

First of 4 packets(4,1000)

Resend Missing packets

Packets of the Response message

Retransmit the request of missing packets

Buffer for 4 packets

Send Request message

Second of 4 packets(4,0100)

Fourth of 4 packets(4,0001)

Timeout

Missing packets(4,0101)

Third of 4 packets(4,0010)

Second of 4 packets(4,0100)

Fourth of 4 packets(4,0001)

Acknowledgement

Page 46: chap2

04/08/23 46/112

Group communicationGroup communication

Three types of group communication:

One to many (single sender and multiple receivers)

Many to one (multiple senders and single receiver)

Many to many ( multiple senders and multiple

receivers)

Page 47: chap2

04/08/23 47/112

Group communication (contd…)Group communication (contd…) One-to-many communication

Also known as multicast communication

Special case of multicast communication is broadcast communication

Message is sent to all processors connected to a network

Group management Closed Group - Only the members of the group can

send message to the group.

Open Group – Any person in the system can send the message to the group.

Centralized Group Servers (with Replication) – For dynamic management of Group members.

Page 48: chap2

04/08/23 48/112

Group communication (contd…)Group communication (contd…)

Group addressing

2 level naming scheme is normally used for group

addressing

High level group name is an ASCII string that is

independent of the location information of processes in

the group

Low level group name depends on underlying hardware

Special address to which multiple machines can listen

is called multicast address

Networks that do not have multicast address have

broadcasting facility with Broadcast address

Page 49: chap2

04/08/23 49/112

Group communication (contd…)Group communication (contd…)

Message delivery to receiver process User applications use high-level group names in

programs

The centralized group server maintains a mapping of high-level group names to their low-level names

Group server also maintains a list of the process identifiers of all the processes for each group

Page 50: chap2

04/08/23 50/112

Group communication (contd…)Group communication (contd…) Buffered and unbuffered multicast

Multicast is an asynchronous communication mechanism

Multicast send cannot be synchronous due to: It is unrealistic to expect a sending process to wait until

all the receiving processes that belong to the multicast group are ready to receive the multicast message

The sending process may not be aware of all the receiving processes that belong to the multicast group

For unbuffered multicast, the message is not buffered Lost if receiving process is not in a state to receive it

For buffered multicast, the message is buffered for receiving process Each process of group receive the message

Page 51: chap2

04/08/23 51/112

Group communication (contd…)Group communication (contd…) 2 types of semantics for one-to-many communications

Send-to-all semantics

Bulletin-board semantics

Bulletin-board semantics is more flexible than send-to-all semantics, because of the following factors ignored by send to all: The relevance of a message to a particular receiver

may depend on the receiver’s state

Messages not accepted within a certain time after transmission may no longer be useful

Page 52: chap2

04/08/23 52/112

Group communication (contd…)Group communication (contd…) Flexible reliability in multicast communication

In one to many communication, the degree of reliability is normally expressed in:

The 0-reliable

Ex. Time signal generation

The 1-reliable

Ex. Request for service

The m-out-of-n-reliable

(1 < m < n) Ex. Consistency Control Algorithm

All reliable

Ex. Updation of replicas

Page 53: chap2

04/08/23 53/112

Group communication (contd…)Group communication (contd…) Atomic multicast

Has an all-or-nothing property When message is sent to group, it is either received by all

processes that are members of the group or else it is not received by any of them

Many-to-one communication Multiple senders send messages to a single receiver

Single receiver may be selective or nonselective

Selective receiver specifies a unique sender Message exchange takes place only if that sender sends a

message

Nonselective receiver specifies a set of senders Message exchange takes place only if any sender in the

set sends a message to this receiver

Page 54: chap2

04/08/23 54/112

Group communication (contd…)Group communication (contd…) Many-to-many communication

Multiple senders send messages to multiple receivers

Important issue is ordered message delivery

Ordered message delivery ensures that all messages

are delivered to all receivers in an order acceptable to

the application

Ordered message delivery requires message

sequencing

Commonly used semantics for ordered delivery of

multicast messages are:

Absolute ordering

Consistent ordering

Casual ordering

Page 55: chap2

04/08/23 55/112

Group communication (Contd…)Group communication (Contd…) Absolute ordering

All messages are delivered to all receiver processes in the exact order in which they were sent

System is assumed to have clock at each machine, and clocks are synchronized with each other

Uses global timestamp as message identifiers

Kernal of the receiver places the message in a queue

Sliding window mechanism is used to deliver the message periodically

Messages whose time stamp falls within the current window are delivered to the receiver

Page 56: chap2

04/08/23 56/112

Group communication (Contd…)Group communication (Contd…) Absolute ordering

S1

t1

R1 R2 S2

m1

m2

m1

m2

t1 < t2

Absolute ordering of messages

Time

t2

Page 57: chap2

04/08/23 57/112

Group communication (contd…)Group communication (contd…) Consistent ordering

All messages are delivered to all receiver processes in the same order

However this order may be different from the order in which messages were sent

S1

t1

R1 R2 S2

m1

m2

m1

m2 t1 < t2

Consistent ordering of messages

Time

t2

Page 58: chap2

04/08/23 58/112

Group communication (contd…)Group communication (contd…) Implementation of consistent-ordering

I Approach : Centralised Sequencer Method

Many-to-many scheme appear as a combination of

many-to-one and one-to-many schemes

Kernels of sending machines send messages to a

single receiver (known as sequencer)

Assigns sequence number to each message and then

multicasts it

Kernel of each receiving machine saves all incoming

messages meant for a receiver in a separate queue

Messages in queue are delivered immediately to

receiver unless there is a gap in the sequence

number

Page 59: chap2

04/08/23 59/112

Group communication (Contd…)Group communication (Contd…)

Implementation of consistent-ordering (Contd…)

Sequencer based method is subject to single point

failure and has poor reliability

II Approach : ABCAST protocol (Distributed)

Assigns sequence number to a message by

Distributed agreement among the group members

and the sender

1. Sender assigns a temporary sequence number to the message and sends it to all members of the multicast group.

Page 60: chap2

04/08/23 60/112

Group communication (Contd…)Group communication (Contd…)ABCAST protocol :

This sequence number should be greater than the

previous number used by the sender. A counter is

used.

2. On receiving the message, each member of the group

returns a proposed sequence number to the sender Member(i) calculates its proposed sequence number

as max ( Fmax, Pmax) + 1 + i / N

Fmax largest final sequence number agreed upon so

far for a message received by the group

Pmax largest proposed sequence number by this

member

N total number of members in the multicast group

i member number

Page 61: chap2

04/08/23 61/112

Group communication (Contd…)Group communication (Contd…) ABCAST protocol :

3. When sender has received the proposed sequence

numbers from all the members, it selects the largest

one as the final sequence number for the message

and sends it to all members in a COMMIT message

On receiving the COMMIT message, each member

attaches the final sequence number to the message

Committed messages with final sequence numbers

are delivered to the application programs in order of

their final sequence numbers

Page 62: chap2

04/08/23 62/112

Group communication (contd…)Group communication (contd…) Casual ordering

Ensures that if the event of sending one message is

casually related to the event of sending another

message, the two messages are delivered to all

receivers in the correct order

Two message sending events are said to be casually

related if they are co-related by the happened-before

relation

Page 63: chap2

04/08/23 63/112

Group communication (contd…)Group communication (contd…) Casual ordering

S1

t1

R1 R2 S2

Time

CASUAL ORDERING OF MESSAGES

R3

m2

m2

m1

m1

m1

m3

m3

Page 64: chap2

04/08/23 64/112

Group communication (contd…)Group communication (contd…) Implementation of casual ordering

CBCAST protocol

1. Each member process of a group maintains a vector

of “n” components, where “n” is the total number of

members in the group

2. Each member is assigned a sequence number from

0 to n.

3. ith component of the vector corresponds to the

member with sequence number i and it is equal to

the number of last message received in sequence

by the ith member.

Page 65: chap2

04/08/23 65/112

Group communication (contd…)Group communication (contd…)4. To send a message, a process increments the value of its

own component in its own vector and sends the vector

as part of the message

5. When message arrives at a receiver process’s site, it is

buffered by the runtime system and the Runtime system

tests the two conditions, to decide whether message can

be delivered or it must be delayed to ensure casual-

ordering semantics

S[ i ] = R[ i ] +1 and

S[ j ] <= R[ j ] for all j != i

where S is Vector of Sender process and R is Vector of

Receiver process

Page 66: chap2

04/08/23 66/112

Group communication (contd…)Group communication (contd…)

S[i] = R[i] +1 ensures that the receiver has not missed any

message from the sender

S[j] <= R[j] for all j!=i ensures that the sender has not

received any message that the receiver has not yet

received

6. If message passes these two tests, the runtime system

delivers it to the user process

7. Otherwise the message is left in the buffer and the test is

carried out again for it when a new message arrives

Page 67: chap2

04/08/23 67/112

Group communication (contd…)Group communication (contd…) CBCAST protocol for implementing casual ordering

3 2 5 1 3 2 5 1 2 2 5 1 3 2 4 1

4 2 5 1 message data

Deliver

Delay because the condition

A[1]=C[1] + 1

is FALSEDelay because the condition

A[3]<=D[3]

is not TRUE

Process A sends a new message to other processes

Vector of process A

Vector of process B

Vector of process C

Vector of process D

Status of vectors at some instance of time

Page 68: chap2

04/08/23 68/112

Remote Procedure CallsRemote Procedure Calls It is a special case of general message-passing model

of IPC

RPC has become a widely accepted IPC mechanism in distributed systems because of the following features Simple call syntax

Familiar semantics ( similar to Local procedure calls)

Well-defined interface

Ease of use

Generality

Efficiency

Can be used as an IPC mechanism to communicate between processes on different machines as well as between different processes on the same machine

Page 69: chap2

04/08/23 69/112

RPC modelRPC model RPC model is similar to the procedure call model used

for the transfer of control and data within a program in

the following manner:

For making a procedure call, the caller places

arguments to the procedure in some well specified

location

Control is then transferred to the sequence of

instructions that constitutes the body of the procedure

The procedure body is executed in a newly created

execution environment

After the procedure’s execution, control returns to the

calling point, possibly returning a result

Page 70: chap2

04/08/23 70/112

it can be asynchronous , so that client can do other task while waiting for reply.

Typical Model of a RPCTypical Model of a RPC Caller Callee

(Client Process) (Server Process)

Request Message with Remote

procedure’s Parameters

Call Procedure &Wait for reply

Receive request& start Procedure Execution

Procedure Executes

Reply Message with Result of

Procedure Execution Send Reply & Wait for next Request

Resume Execution

Page 71: chap2

04/08/23 71/112

Transparency of RPCTransparency of RPC A transparent RPC mechanism is one in which local

procedures and remote procedures are

indistinguishable to programmers

Transparent RPC require

1. Syntactic transparency

RPC should have exactly the same syntax as a local

procedure call

2. Semantic transparency

Semantics of RPC should be identical to those of a

local procedure call

Page 72: chap2

04/08/23 72/112

Transparency of RPC (Contd…)Transparency of RPC (Contd…) Differences between RPC and LPC:

With RPC, the called procedure is executed in an

address space that is disjoint from the calling

program’s address space. So, remote procedure

cannot have access to any variables or data values in

the calling program’s environment

RPC are more vulnerable to failure than LPC’s

Since they involve 2 different processes and possibly a network and 2 different computers

RPCs consume much more time (100-1000 times more) than LPCs

Due to involvement of a communication network

Page 73: chap2

04/08/23 73/112

Implementation of RPC mechanismImplementation of RPC mechanism

Implementation of RPC mechanism involves five

elements of program:

1. The client

2. The client stub

3. The RPCRuntime

4. The server stub

5. The server

The client, the client stub, and one instance of

RPCRuntime execute on the client machine

The Server, the Server stub, and one instance of

RPCRuntime execute on the server machine

Page 74: chap2

04/08/23 74/112

Implementation of RPC mechanismImplementation of RPC mechanism

2

110

9 7

6 5

4

8

3

Client Process Server Process

Call packet

Result packet

Client Machine Server Machine

Call Return

Client Stub

Pack Unpack

RPC Runtime

WaitSend Receive ReceiveSend

RPC Runtime

Unpack Pack

Server Stub

Call Execute Return

Page 75: chap2

04/08/23 75/112

Implementation of RPC Implementation of RPC mechanismmechanism

Client

User process that initiates a RPC

Makes perfectly normal local procedure call that in turn

invokes corresponding procedure in client stub

Client stub

Two tasks:

On receipt of call request from client, it packs a

specification of the target procedure and the

arguments into a message and then asks the local

RPC Runtime to send it to the server stub

On receipt of the result of procedure execution, it

unpacks the result and passes to the client

Page 76: chap2

04/08/23 76/112

Implementation of RPC mechanismImplementation of RPC mechanism RPCRuntime

Handles transmission of messages across the network between client and server machines

It is responsible for retransmission, acknowledgements, packet routing and encryption

RPC runtime on the client machine receives the call request message from the client stub and sends it to the server machine. It also receives the result message from the server and passes it to the client stub

RPC runtime on the Server machine receives the result

message from the server stub and sends it to the client

machine. It also receives the call request message

from the client and passes it to the server stub

Page 77: chap2

04/08/23 77/112

Implementation of RPC mechanismImplementation of RPC mechanism Server stub

Two tasks:

On receipt of request from local RPCRuntime, it

unpacks it and makes a perfectly normal call to invoke

the appropriate procedure in the server

On receipt of result, it packs the result into a message

and then asks the local RPCRuntime to send it to the

client stub

Server

On receiving call request from server stub, the server

executes the appropriate procedure and returns the

result of procedure execution to the server stub

Page 78: chap2

04/08/23 78/112

Implementation of RPC Implementation of RPC mechanismmechanism

Stub generation:

2 ways

Manually : RPC implementor provides a set of

translation functions from which a user can construct

stubs

Automatically : Uses Interface Definition Language

(IDL) to define the interface between a client and a

server.

RPC messages:

2 types of messages involved in the implementation of

an RPC system are:

Call messages

Reply messages

Page 79: chap2

04/08/23 79/112

Implementation of RPC mechanismImplementation of RPC mechanism

Call messages:

2 basic components necessary in a call message are:

The identification information of the remote procedure

to be executed

The arguments necessary for the execution of the

procedure

In addition to these fields, a call message normally has

A message identification field

A message type field

A client identification field

Page 80: chap2

04/08/23 80/112

Implementation of RPC mechanismImplementation of RPC mechanism

A typical RPC call message format

Remote procedure identifier

Message identifier

(Seq.No.)

Message type Client

identifierProgram number

Version number

Procedure number

Arguments

(Call / Reply)

Page 81: chap2

04/08/23 81/112

Implementation of RPC mechanismImplementation of RPC mechanism

Message identifier

Message type

Reply status

(successful)

Result

Message identifier

Message type

Reply status

(unsuccessful)

Reason for failure

a) A successful reply message format

b) A unsuccessful reply message format

RPC reply message format

Page 82: chap2

04/08/23 82/112

Server ManagementServer Management

1) Server Implementation

2) Server Creation

Based on style of server Implementation sever can

be classified as

1. Stateful server – Maintains client state information.

So client need not send the information all the time.

2. Stateless server – Does not Maintain client state

information.

Page 83: chap2

04/08/23 83/112

Stateful serverStateful server

Stateful file server

Open ( Fid, 5, buffer )

Return ( bytes 0 to 199 )

Read ( Fid , 200, buffer )

Return ( Successful )

Close ( fid )

Return ( Fid )

File Mode R/W PointerId

Open ( Filename, Mode )

Return ( bytes 200 to 204 )

Client Process Server Process

Page 84: chap2

04/08/23 84/112

Stateless serverStateless server

Stateless file server

Return ( bytes 0 to 199 )

Read( Filename,0, 200,buffer )

Return ( bytes 400 to 419 )

Server Process

File Mode R/W PointerId

Client Process

Read( Filename,400,20,buffer )

Page 85: chap2

04/08/23 85/112

Staless vs. Stateful serversStaless vs. Stateful servers

Stateful servers provide an easier programming

paradigm, clients need not keep track of state

information

Stateful servers are more efficient than stateless

servers

Stateless servers make crash recovery easy in the

event of server crash

Choice of using stateless or stateful server is purely

application dependent

Page 86: chap2

04/08/23 86/112

Server Creation SemanticsServer Creation Semantics Sever processes may either be created and installed

before their client processes or be created on demand basis.

Based on the time duration for which RPC server survive, RPC servers are classified as

1. Instance – per-call Server.

2. Instance – per- session Server3. Persistent Server

1. Instance–per-call Server : Servers exist only for the duration of a single call.

It is created by RPC Runtime on the server machine, only when the call message arrives.

Server is deleted after the call execution.

Page 87: chap2

04/08/23 87/112

Server Creation SemanticsServer Creation SemanticsNot commonly used approach because,

It is stateless approach, needs state information to be presented either at client process (Time consuming and loss of data abstraction) or at server O.S. (Expensive)

Multiple invocation of same server becomes more expensive.

2. Instance – per- session Server : Server exists for the entire session for which client & server interact. Server can maintain internal state information. Overhead involved in creation and destruction is minimized.

3. Persistent Server : Server remains in existence indefinitely. A persistent server can be shared unlike other two.

Page 88: chap2

04/08/23 88/112

Communication protocols for RPCs Communication protocols for RPCs 1. The Request(R) protocol

Client Server

First RPC

Next RPC

Procedure execution

Procedure execution

Request message

Request message

Page 89: chap2

04/08/23 89/112

Communication protocols for Communication protocols for RPCsRPCs

The Request protocol

Used in RPC in which the called procedure has nothing

to return and client requires no confirmation that

procedure is executed

Only one message per call is transmitted

An RPC that uses the R protocol is called asynchronous

RPC

In asynchronous RPC, the RPCRuntime does not take

responsibility for retrying a request in case of

communication failure

Asynchronous RPC with unreliable transport protocol

are generally useful for implementing periodic update

services

Page 90: chap2

04/08/23 90/112

Communication protocols for Communication protocols for RPC’sRPC’s2. The Request/Reply(RR) protocol

Server

Also serves as acknowledgement for the reply of previous RPC

Procedure execution

Also serves as acknowledgement for the request message

Client

First RPC

Next RPC

Request message

Reply message

Request message

Reply message

Also serves as acknowledgement for the request message

Procedure execution

Page 91: chap2

04/08/23 91/112

Communication protocols for Communication protocols for RPC’sRPC’s

The Request/Reply (RR) protocol Suitable for simple RPC in which all the arguments &

results fit in a single packet buffer and duration of call and interval between the call is short (less than transmission time)

It is based on the idea of using implicit acknowledgement to eliminate explicit acknowledgment messages

In this protocol A server’s reply message is regarded as an

acknowledgment of client’s request message

A subsequent call packet from a client is regarded as an acknowledgement of the server’s reply message of the previous call made by that client

Page 92: chap2

04/08/23 92/112

Communication protocols for Communication protocols for RPC’sRPC’s3. The Request/Reply/Acknowledge-reply(RRA) protocol

Client

First RPC

Next RPC

Request message

Reply message

Request message

Reply message

Reply ack message

Procedure execution

Procedure execution

Reply ack message

Server

Page 93: chap2

04/08/23 93/112

Communication protocols for RPC’sCommunication protocols for RPC’s The RRA protocol

Message identifiers associated with request

messages are ordered

Client acknowledges the reply message only if it has

received the reply for all the previous requests

Server deletes information from its cache only after

receiving an acknowledgement for it from the client

Loss of acknowledgement is harmless, since an

acknowledgement message guarantees the receipt of

reply for earlier messages

Page 94: chap2

04/08/23 94/112

Client Server BindingClient Server Binding Binding: Process by which client become associated

with server so that calls can take place.

Server locating:

1. Broadcasting:

Message is broadcast to all nodes.

Node housing the desired server responds.

Easy to implement & suitable for small networks.

Expensive for large networks.

2. Binding Agent:

A name server used to bind a client to a server.

Name server maintains the Binding Table.

Page 95: chap2

04/08/23 95/112

Client Server BindingClient Server Binding

Bin

ding

Age

nt r

etur

ns

Serv

er L

ocat

ion

Client Calls the Server4

Server Registers itself

with B

inding Agent

Clie

nt r

eque

st

the

Bin

ding

Age

nt fo

r se

rver

Loc

atio

n

Name Server Binding Agent

Client Process

12

3

Server Process

Page 96: chap2

04/08/23 96/112

Client Server BindingClient Server Binding

Advantages of using Binding Agent:

Can support Multiple Servers having the same interface type so that any of the available server may be used to service the client’s request.

Binding agent can Balance the load evenly among the servers providing the same service.

User Authorization facility can be provided for binding

Disadvantages:

Overhead becomes large when many client processes are short lived.

Binding Agent may become a performance bottleneck

Page 97: chap2

04/08/23 97/112

Client Server BindingClient Server BindingBinding time: -

1. Compile time Binding → Hard coding of Server’s network addresses. Extremely Inflexible (if configuration changes)

2. Link time Binding → Request B.A. before making call Server process exports its services by registering it Client makes Import request to the binding agent for

the service before making call Binding Agent returns the server details to the client Client caches it to avoid contacting the Binding agent

for subsequent calls

3. Call time Binding Client is bound to a server at the time when it calls

the server for the first time during its execution.

Page 98: chap2

04/08/23 98/112

Client Server Binding - Client Server Binding - Call time Binding

Ret

urns

the

resu

lts

alon

g

wit

h th

e Se

rver

’s H

andl

e

Subsequent calls are Sent directly

5

Sends a RP

C call m

essage Clie

nt p

asse

s se

rver

’s in

terf

ace

nam

e an

d ar

gum

ents

of R

PC

Binding Agent

Client Process

21

4

Server Process

Server returns the result

of request processing3

Page 99: chap2

04/08/23 99/112

Complicated RPC’sComplicated RPC’s 2 types of complicated RPC’s are:

1. RPC’s involving long-duration calls or large gaps between calls 2 methods used to handle

Periodic probing of the server by the client

Periodic generation of an acknowledgement by the

server

2. RPC’s involving arguments and/or results that are too large to fit in a single-datagram packet A long RPC argument or result is fragmented and

transmitted in multiple packets

Page 100: chap2

04/08/23 100/112

Special types of RPC’sSpecial types of RPC’s1. Call Back RPC

2. Broadcast RPC

3. Batch-mode RPC

1. Call Back RPC

In Normal RPC, the caller and callee processes have a client-server relationship, where as in call back RPC uses Peer-to-Peer paradigm where a node acts as both client and Server.

Call Back RPC is for interactive applications, which require user intermediate inputs

During procedure execution the server process makes a callback RPC to client process

Page 101: chap2

04/08/23 101/112

Special types of RPC’sSpecial types of RPC’s Callback RPC

Start procedure execution

Stop procedure execution temporarily

Resume procedure execution

Procedure execution ends

Client Server

Process callback request and send reply

Call (parameter list)

Callback (parameter list)

Reply (result of callback)

Reply (result of call)

Page 102: chap2

04/08/23 104/112

Special types of RPC’s (Contd…)Special types of RPC’s (Contd…)2. Broadcast RPC

Client request is Broadcast on Network & processed by all the servers providing that service.

Two ways Using Binding Agent, which forwards the request to

all Servers registered with it.

Using Broadcast Ports of servers. Client process may wait for zero, one, m-out–of-n,

all replies Depending on reliability desired.

Page 103: chap2

04/08/23 105/112

Special types of RPC’sSpecial types of RPC’s3. Batch-mode RPC

Queue separate RPC request at client side in a

transmission buffer & send them over network in a

batch.

Reduces overhead of sending each RPC.

Applications requiring higher RPC call rates (50-100

RPC/sec) can be implemented easily.

Transmission Buffer is flushed when

Predetermined interval lapses.

Predetermined number of requests received.

Amount of batch data exceeds the buffer size.

A call is made to one of the server’s procedure for

which result is expected. ( Nonqueuing RPC)

Page 104: chap2

04/08/23 106/112

Optimizations in RPC for better Optimizations in RPC for better performanceperformance

1. Concurrent Access to Multiple Servers

a) Use of threads: - Each thread can independently

make calls to different servers.

b) Early Reply Approach: -

RPC is split into 2 RPC calls

1. One RPC for Passing Parameters

2. One RPC for requesting result

c) Call Buffering Approach

Page 105: chap2

04/08/23 107/112

Early Reply Approach: - to provide concurrent access to multiple servers.

Client Server

Call procedure (parameter)

Reply (tag)

Request result (tag)

Reply (result)

Carry out other activities

Return (tag)

Execute the procedure

Store (result)

Return (result)

Optimizations in RPC for better performance***Optimizations in RPC for better performance***

Page 106: chap2

04/08/23 108/112

Call buffering approach : to provide concurrent access to

multiple servers.

Clients and servers do not interact directly with each

other

Interact indirectly via a call buffer sever

To make an RPC call

A client sends its call request to the call buffer server

Client then performs other activities until it needs the

result

Client periodically polls the call buffer server, when it

needs the result

If result is available it recovers the result

Optimizations in RPC for better performanceOptimizations in RPC for better performance

Page 107: chap2

04/08/23 109/112

Call buffering approach On server side

When server is free, it periodically polls the call buffer server, if there is any call for it

If there is, it recovers the call request, executes it and makes a call back to the call buffer server

Returns the result of execution to the call buffer server

Optimizations in RPC for better performanceOptimizations in RPC for better performance

Page 108: chap2

04/08/23 110/112

Carryout otheractivities

Check for Result ( Tag)

Call Procedure (parameter)

Check for result ( Tag)

Reply (not done)

Client

Polling for Waiting request

Execute the Procedure

Polling for

result

Reply ( Result)

Check for a waiting request

ServerCall Buffer Server

Reply (No request)

Reply ( Tag)

Check for waiting request

Reply ( Tag, Parameter)

Reply (Tag, Result)

Acknowledgement

Page 109: chap2

04/08/23 111/112

2. Serving multiple requests simultaneously

Delays encountered in RPC systems :

Delay caused while a server waits for a resource that

is temporarily unavailable

A delay can occur when a server calls a remote

function that involves a considerable amount of

computation to complete or involves considerable

transmission delay

Use of Multi-threaded server with dynamic thread

creation facility allow the server to accept and

process other requests, instead of being idle while

waiting will provide better performance

Optimizations in RPC for better performanceOptimizations in RPC for better performance

Page 110: chap2

04/08/23 112/112

3. Reducing per-call workload of servers

One way to achieve this improvement is to use

stateless servers

4. Reply caching of idempotent remote procedures

Proper selection of timeout values

Too small timeout value will cause timers to expire too

often, resulting in unnecessary retransmissions

Too large timeout value will cause a needlessly long

delay in the event that a message is actually lost

Optimizations in RPC for better performanceOptimizations in RPC for better performance

Page 111: chap2

04/08/23 113/112

Servers are likely to take varying amounts of time to

service individual requests, depending on various

factors like server load, network routing and network

congestion

If the clients continue to retry sending requests, the

server loading and network congestion problem will

become worse

One method for proper selection of timeout values is

to use some back-off strategy or exponentially

increasing timeout values

5. Proper design of RPC protocol specification

Optimizations in RPC for better performanceOptimizations in RPC for better performance

Page 112: chap2

04/08/23 117/112

End of chapter 2End of chapter 2