Optimizing Buffer Management for Reliable Multicast


Zhen Xiao, AT&T Labs – Research

Joint work with Ken Birman and Robbert van Renesse

Why important?

• Many applications desire reliable or semi-reliable delivery.
• IP multicast is best-effort.
• Buffering is necessary for retransmission.
• Buffer space is limited!

How can the limited buffer space be used most efficiently?

Previous Work

• RMTP: Buffer all messages on repair servers.
  – Impractical for long-lived sessions.
• SRM: Regenerate messages at the application.
  – Buffer management at the application level remains a challenge.
• Stability Detection: Buffer messages until they are stable (i.e., received by all members in the group).
  – It takes a long time to achieve stability in a large multicast group.
• Bimodal Multicast: Buffer messages for a fixed amount of time.
  – Optimization: buffer messages on a sub-group of members.

Talk Overview

• RRMP: Randomized Reliable Multicast Protocol
• Error recovery algorithm in RRMP: Infocom 2001
• Buffering algorithms in RRMP: DSN 2002
  – Feedback-based short-term buffering
  – Randomized long-term buffering
• Simulation results
• Summary

RRMP: Randomized Reliable Multicast Protocol

Key idea: combine previous work on randomized error recovery with the Bimodal Multicast protocol and hierarchical error recovery similar to that employed by tree-based protocols.

• Group receivers into a hierarchy.
• Do not use any repair server.
• Parent region: the least upstream region of a receiver in the hierarchy.
• Each receiver maintains group membership information about receivers in its region and receivers in its parent region.

Two-phase Error Recovery

Assume a receiver p detects a message loss.
• Local loss: the loss affects a fraction of receivers in p's region.
• Regional loss: the loss affects all receivers in p's region.

Local recovery: a receiver tries to recover the loss from randomly selected neighbors.

Remote recovery: some receivers in the region request retransmissions from the parent region.
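The two phases can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the `remote_prob` parameter, and the use of an independent coin flip to decide which receivers contact the parent region are hypothetical, since the slide says only that "some receivers" do so.

```python
import random

def choose_recovery_targets(me, region, parent_region, remote_prob, rng):
    """Pick whom to ask for a missing message.

    Returns (local_target, remote_target). remote_target is None unless this
    receiver is one of those that also ask the parent region.
    """
    neighbors = [m for m in region if m != me]
    # Local recovery: ask one randomly selected neighbor in the same region.
    local_target = rng.choice(neighbors) if neighbors else None
    # Remote recovery: only some receivers contact the parent region.
    # Selecting them with a coin flip of bias remote_prob is an assumption
    # of this sketch, not something stated on the slide.
    remote_target = None
    if parent_region and rng.random() < remote_prob:
        remote_target = rng.choice(parent_region)
    return local_target, remote_target

rng = random.Random(1)
region = ["p", "a", "b", "c"]   # hypothetical member names
parent_region = ["q", "r"]
print(choose_recovery_targets("p", region, parent_region, 0.25, rng))
```

Keeping `remote_prob` small means a regional loss triggers only a few requests to the parent region, while every receiver still tries its own neighbors first.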

[Figure: multicast tree with a sender s, routers, and receivers, including receivers p and q, illustrating two-phase error recovery.]

Overview of Buffering Scheme

[Diagram: error recovery consists of local recovery and remote recovery; buffering consists of short-term buffering and long-term buffering.]

Short-term buffering: when a message is first introduced into the system.

Long-term buffering: when almost all receivers in a region have received the message.

Not All Messages Are Created Equal!


Feedback-based Short-term Buffering

Idle message: no request for this message has been received for a time interval T. (T is the idle threshold.)

Short-term buffering: buffer a received message until it becomes idle.

Result: messages most needed in the system stay in the buffer longer. No extra traffic overhead!

Idea: a member uses the retransmission requests it has received as feedback to estimate how many members in the region still miss the message.

n: the size of a region
p: the fraction of members in this region missing a message

The probability that a member will not receive any request (each of the np members missing the message sends its request to a randomly selected neighbor):

$\left(1 - \frac{1}{n-1}\right)^{np}$

As $n \to \infty$, this probability can be approximated by $e^{-p}$.
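A minimal Python sketch of the idle-threshold rule follows, together with a numeric check of the no-request probability for a random-neighbor request model. The class and method names are hypothetical. Two details are assumptions of this sketch, not statements from the slides: the idle timer starts when the message is first received, and an idle message is simply dropped.

```python
import math

class ShortTermBuffer:
    """Keep each message until no request for it has arrived for T ms."""

    def __init__(self, idle_threshold_ms):
        self.idle_threshold_ms = idle_threshold_ms
        self.payloads = {}       # message id -> payload
        self.last_activity = {}  # message id -> time of receipt or last request

    def on_receive(self, msg_id, payload, now_ms):
        if msg_id not in self.payloads:
            self.payloads[msg_id] = payload
            self.last_activity[msg_id] = now_ms  # assumption: timer starts here

    def on_request(self, msg_id, now_ms):
        """Serve a retransmission request; the request itself is the feedback
        that keeps the message in the buffer."""
        self.expire(now_ms)
        if msg_id in self.payloads:
            self.last_activity[msg_id] = now_ms
            return self.payloads[msg_id]
        return None

    def expire(self, now_ms):
        """Drop every message that has been idle for at least T; return their ids."""
        idle = [m for m, t in self.last_activity.items()
                if now_ms - t >= self.idle_threshold_ms]
        for m in idle:
            del self.last_activity[m]
            del self.payloads[m]
        return idle

def prob_no_request(n, p):
    """Chance that a member receives none of the n*p requests when each is
    sent to one of the other n-1 members chosen uniformly at random."""
    return (1.0 - 1.0 / (n - 1)) ** (n * p)

buf = ShortTermBuffer(idle_threshold_ms=40)
buf.on_receive("m1", b"data", now_ms=0)
buf.on_request("m1", now_ms=30)      # a request at 30 ms resets the idle timer
print(buf.expire(now_ms=60))         # still buffered: only 30 ms since the request
print(buf.expire(now_ms=70))         # idle for 40 ms, so "m1" is dropped
print(prob_no_request(100, 0.5), math.exp(-0.5))
```

Because the only input is the stream of requests a member already receives, the rule adds no control traffic of its own.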

Simulation Results

• Short-term buffering in a local region.
  – 100 members in the region, fully connected.
  – RTT between any two members: 10 ms.
  – Idle threshold: 40 ms.
• Outcome of IP multicast: select a random subset of members to hold a message initially.
  – Measure how long these members buffer the message.

[Figure: average buffering time (ms) versus #members holding a message initially (1 to 64, doubling at each step).]

[Figure: #members versus time (ms), with curves for #received and #buffered; annotated "96 %".]

Why Long-term Buffering?

[Figure: multicast tree with a sender s, routers, and receivers p and q. Several receivers are labeled "idle", with the caption "Sorry, you are out of luck!"]

Randomized Long-term Buffering

Idea: provide long-term buffering for an idle message at a small subset of receivers in each region.

Load balancing: spread the load of buffering across all receivers in a region.

Randomized algorithm: each member independently tosses a coin to decide whether to become a long-term bufferer.

C: the expected number of long-term bufferers.

Saving in buffer space: a factor of n / C

Network dynamics: message transfer
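The coin toss can be sketched as follows. The bias C / n is inferred from C being the expected number of bufferers among n members; the function names and the values n = 100 and C = 6 are illustrative choices, not figures from the talk.

```python
import random

def becomes_long_term_bufferer(n, c, rng):
    """Independent coin toss that comes up heads with probability C / n."""
    return rng.random() < c / n

def elect_bufferers(members, c, rng):
    """Every member decides on its own; no coordination messages are needed."""
    n = len(members)
    return [m for m in members if becomes_long_term_bufferer(n, c, rng)]

rng = random.Random(42)
members = list(range(100))   # a region with n = 100 members (illustrative)
trials = 10_000
counts = [len(elect_bufferers(members, c=6, rng=rng)) for _ in range(trials)]
print("mean number of bufferers:", sum(counts) / trials)
print("fraction of trials with no bufferer:", counts.count(0) / trials)
```

Because each member tosses independently for every idle message, different messages end up at different members, which is how the buffering load is spread across the region.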

The probability that k members buffer an idle message for different values of C, the expected number of long-term bufferers.

[Figure: probability of no long-term bufferer (%) versus C, for C from 1 to 6.]

The probability that no member buffers an idle message decreases exponentially with C
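The exponential decay can be derived in two lines, assuming each of the n members becomes a bufferer independently with probability C / n (the bias is inferred from C being the expected count, and the Poisson form is the standard approximation for large n):

```latex
\begin{align*}
\Pr[\text{exactly } k \text{ bufferers}]
  &= \binom{n}{k}\left(\frac{C}{n}\right)^{k}\left(1-\frac{C}{n}\right)^{n-k}
  \approx \frac{C^{k}}{k!}\,e^{-C} \qquad (n \gg C),\\
\Pr[\text{no bufferer}]
  &= \left(1-\frac{C}{n}\right)^{n} \approx e^{-C}.
\end{align*}
```

For example, e^{-1} is about 0.368 and e^{-6} is about 0.0025, so raising C from 1 to 6 lowers the chance of losing an idle message from roughly 37% to roughly 0.25%.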

How to find a long-term bufferer?

[Figure: multicast tree with a sender s, routers, and receivers p and q. A "help!" request reaches a receiver, the query "Do you have the msg?" is passed among receivers, and the sequence ends with "The search is over!"]

Search Overhead

• Evaluate the penalty in recovery time due to the search for a bufferer in a region with 100 members.
  – RTT between any two members: 10 ms.
  – Assume a remote request arrives at a random member.
  – Simulation repeated 100 times with different random seeds.
• Question I: how does the search time change with the number of bufferers?
• Question II: how does the search time change with the region size?

[Figure: search time (ms) versus #bufferers, for 1 to 10 bufferers.]

Search time as the number of bufferers increases.

[Figure: search time (ms) versus region size, for 100 to 1000 members.]

Search time as the size of the region increases

Summary

• Efficient buffer management is essential for reliable multicast in a large group.

• Two-phase buffering addresses the variance in delivery latency in a large group.

• Retransmission requests can be used as feedback to allocate buffer space adaptively.

• Spread the load of buffering among all members in a group through randomization.
