2007/1/15 http://www.logos.ic.i.u-tokyo.ac.jp/~std/lpbcast.ppt 1
Lightweight Probabilistic Broadcast
M2 Tatsuya Shirai
M1 Dai Saito
Broadcast in Large Scale Environment
• End users increasingly send messages to all other users.
– P2P BBS
– Stock markets
• These applications need software-level broadcast.
• Participating processes change more dynamically than processes on servers:
– machine crashes
– logins to and logouts from applications
Deterministic Broadcast
• Each process forwards messages along predefined routes.
• This approach provides a consistent message delivery order.
– Messages from each process arrive in the order in which it sent them.
• Reliability is expressed as “best effort”.
Deterministic Broadcast cont.
• Poor scalability
– Single point of failure
– Cost of maintaining routing information
• Low reliability on unstable networks
– Perturbation of a few processes lowers the performance of healthy processes.
[Figure: x-axis: rate of perturbed processes]
Probabilistic Broadcast
• Each process forwards messages to randomly selected processes, without using predefined routing information.
• Appropriate redundancy enhances reliability.
• Reliability is relatively high and stable in large-scale and unstable environments.
Pbcast [Birman et al. 1999]
• This approach combines deterministic and probabilistic broadcast.
– While network load is low, deterministic broadcast achieves high reliability at low cost.
– While network load is high, probabilistic broadcast ensures a certain level of reliability, especially for healthy processes.
Deterministic Broadcast
• The first protocol is deterministic broadcast.
• It uses IP multicast or, if that is not available, randomly composed spanning trees.
– But composing spanning trees requires full membership information, so this approach is limited to a few hundred processes, as mentioned in the paper.
Anti-Entropy Protocol
• The second protocol is an anti-entropy protocol based on gossip.
– In each round, each member randomly chooses some other members and sends them a digest summarizing its message history.
– A process that receives a digest checks it for messages it lacks and requests the missing messages from the sender.
[Figure: processes holding message histories and membership information exchange digests; one process finds that it lacks messages 5 and 8, another that it lacks messages 3 and 9, and each requests and receives the missing messages.]
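The digest exchange can be illustrated with a minimal sketch. It assumes that a message history is a dictionary from message ID to payload and that a digest is simply the set of IDs; the function names are hypothetical and not taken from the pbcast paper.

```python
def missing_from_digest(own_history, digest):
    """Return the message IDs named in a peer's digest that we do not hold."""
    return sorted(set(digest) - set(own_history))


def anti_entropy_exchange(history_a, history_b):
    """One pairwise exchange: each side sends a digest (its message IDs),
    the other side requests what it lacks and receives those messages."""
    # Compute both request lists before copying anything, so the result
    # does not depend on which side is processed first.
    lacks_a = missing_from_digest(history_a, history_b.keys())
    lacks_b = missing_from_digest(history_b, history_a.keys())
    for msg_id in lacks_a:
        history_a[msg_id] = history_b[msg_id]
    for msg_id in lacks_b:
        history_b[msg_id] = history_a[msg_id]
    return lacks_a, lacks_b


# Histories chosen to mirror the numbers in the figure (assumed initial contents).
a = {i: f"message {i}" for i in (1, 2, 3, 9)}
b = {i: f"message {i}" for i in (1, 2, 5, 8)}
lacks_a, lacks_b = anti_entropy_exchange(a, b)
```

After the exchange both histories hold the same six messages. A real implementation would send digests and requests over the network rather than touching the peer's dictionary directly.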
Anti-Entropy Protocol cont.
• Message size and fanout (the number of processes to which a process sends in one round) determine the network load of this protocol.
• Message size is limited by the message lifetime on each process.
– A process gossips about a message for a fixed number of rounds after initial reception.
– After that, the message is discarded.
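The lifetime rule can be sketched as follows. The sketch assumes the buffer maps each message ID to the number of rounds elapsed since initial reception; `max_rounds` is a hypothetical parameter name for the fixed lifetime.

```python
def end_of_round(buffer, max_rounds):
    """Age every buffered message by one round and discard the expired ones.

    `buffer` maps message ID -> rounds elapsed since initial reception.
    A message received with age 0 is therefore gossiped for exactly
    `max_rounds` rounds before it is dropped.
    """
    aged = {msg_id: age + 1 for msg_id, age in buffer.items()}
    return {msg_id: age for msg_id, age in aged.items() if age < max_rounds}
```

Because every message leaves the buffer after a bounded number of rounds, the digest size depends on the send rate and the lifetime, not on the total number of messages ever broadcast.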
Flow Control
• Flow control applies while the network load is high.
– The rate of pbcast messages should be limited.
• Normally every 100 ms.
– Retransmission should be delayed by some rounds if many other processes request it.
[Figure: digest exchange followed by retransmission of the requested messages (3, 9 and 5, 8)]
Evaluations
• Parameters:
– Message loss rate
– Fanout, the number of processes to which a process sends in one round
• Reliability:
– (number of infected processes - number of failed processes) > (number of all processes) / 2
• for applications based on quorum replication algorithms
• Throughput:
– The number of messages a process receives in one second.
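The reliability criterion above is a simple majority test. A minimal sketch, with hypothetical argument names:

```python
def majority_reached(infected, failed, total):
    """True when the infected processes, minus the failed ones, form a strict majority."""
    return (infected - failed) > total / 2


# With 50 processes: 30 infected and 3 failed leaves 27, which exceeds 25.
ok = majority_reached(infected=30, failed=3, total=50)
# 28 infected and 3 failed leaves exactly 25, which is not a strict majority.
not_ok = majority_reached(infected=28, failed=3, total=50)
```

The strict inequality matters for quorum-based applications: exactly half of the processes is not a quorum.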
Effects of Fanout
• Predicate I is evaluated for pbcast.
– Message loss rate is 0.05.
– Deterministic broadcast reaches 10% of the processes.
– 50 processes participate.
• The probability of failure decreases as the fanout increases up to 8.
[Figure: x-axis: fanout (0–10)]
Scalability
• Predicate I is evaluated for pbcast.
– Message loss rate is 0.05.
– Deterministic broadcast reaches 10% of the processes.
• The probability of failure decreases at larger scale.
– However, broadcasting to all processes takes more rounds.
[Figure: x-axis: processes (0–60)]
Time for broadcast to all processes
• At 1024 processes, messages are received in 12 rounds on average and in fewer than 20 rounds.
– Fanout is 1.
– Deterministic broadcast is not used.
• This result shows that the mean number of rounds is O(log N).
[Figure: x-axis: rounds (0–20); 16, 32 and 1024 processes]
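The logarithmic trend can be explored with a small simulation. This is a deliberately simplified sketch and not the pbcast protocol itself: it assumes synchronous rounds, push-only gossip, no message loss and full membership knowledge. With fanout 1 the number of informed processes can at most double per round, so at least log2(N) rounds are always needed.

```python
import math
import random


def rounds_to_reach_all(n, fanout=1, seed=0, max_rounds=10_000):
    """Count synchronous push-gossip rounds until all n processes hold the message."""
    rng = random.Random(seed)
    infected = {0}                 # process 0 is the original sender
    rounds = 0
    k = min(fanout, n - 1)         # a process never gossips to itself
    while len(infected) < n and rounds < max_rounds:
        newly_infected = set()
        for sender in infected:
            # Draw k distinct targets among the other n - 1 processes.
            for t in rng.sample(range(n - 1), k):
                newly_infected.add(t + 1 if t >= sender else t)
        infected |= newly_infected
        rounds += 1
    return rounds


if __name__ == "__main__":
    for n in (16, 32, 1024):
        print(n, rounds_to_reach_all(n), "log2(N) =", round(math.log2(n), 1))
```

The exact round counts depend on the seed and on the simplifications, so they should not be compared directly with the measured values on the slide.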
Throughput
• 150 messages are sent per second.
– When message loss happens frequently, the fanout is limited to a small value.
• The throughput of perturbed processes decreases, but healthy processes retain full throughput.
[Figure: deterministic broadcast versus pbcast; x-axis: rate of perturbed processes]
Throughput cont.
• Throughput at 200 msg/sec.
– 25% of the processes are perturbed 25% of the time.
– Deterministic broadcast is not used.
• A high frequency of packet loss lowers throughput.
• In this case, average throughput decreases to 60% at 96 processes at high bandwidth.
[Figure: x-axis: loss rate (0–0.2); 32–96 processes]
Conclusion of pbcast
• Gossip-based protocols achieve scalability and reliability in general network environments.
• However, the cost imposed on each process is not considered. The next topic is memory management for pbcast.
Membership Management
• Assumption
– Each process knows all members
• memory consumption at large scale
• communication required to keep the membership consistent
– Scalability problems in large-scale environments
Membership Management of lpbcast
• Membership management + gossip
– Each process knows only a subset of all members
– Membership information is sent along with messages
– The membership management buffer has a size limit
• Fixed memory consumption
Memory Management
• The memory requirement of a process should not change (at large scale)
– Buffer for membership management
– Buffer for outgoing messages
→ Scalability
• pbcast from the viewpoint of “memory consumption”
lpbcast algorithm
• Assumptions
– Each process has a unique ID
– Each message has a unique ID (including the process ID)
– joining/leaving (= subscribing/unsubscribing)
• Buffers
– Events : event notifications
– EventIDs : event IDs
– Subs : subscription information
– unSubs : unsubscription information
– View : targets of gossip messages
• Size limit for all buffers
– Especially for Events and Subs
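The buffers listed above could be represented as in the following sketch. The class name, field names, the single shared `limit` and the `(pid, sequence number)` form of event IDs are assumptions made for illustration; the slides only state that IDs include the process ID and that every buffer is bounded.

```python
import random
from dataclasses import dataclass, field


def truncate(buffer, limit, rng=random):
    """Randomly evict entries until `buffer` (a set) holds at most `limit` items."""
    while len(buffer) > limit:
        buffer.remove(rng.choice(sorted(buffer)))


@dataclass
class LpbcastState:
    pid: str
    limit: int = 30                               # hypothetical bound for every buffer
    events: dict = field(default_factory=dict)    # Events: event ID -> notification
    event_ids: set = field(default_factory=set)   # EventIDs: IDs already delivered
    subs: set = field(default_factory=set)        # Subs: subscriptions to forward
    unsubs: set = field(default_factory=set)      # unSubs: unsubscriptions to forward
    view: set = field(default_factory=set)        # View: gossip targets
    next_seq: int = 0

    def new_event_id(self):
        """Build a globally unique event ID from the process ID and a local counter."""
        self.next_seq += 1
        return (self.pid, self.next_seq)
```

Since every buffer is capped, the memory used by one process stays constant no matter how many processes take part.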
sending
• lpbcast(e)
– Add e to Events
• periodic gossip
– Send the buffers to a subset of View (every 50 ms)
[Figure: event e is added to Events; the buffers Events, EventIDs, View, Subs and unSubs, and the gossip message Mes carrying e]
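The two sending steps can be sketched as below. The state is modelled as a plain dictionary, and `fanout` and the `send` callback are hypothetical; the 50 ms timer that would call `gossip_round` periodically is left out.

```python
import random


def lpbcast(state, event_id, event):
    """Application-level broadcast: buffer the event for the next gossip round."""
    state["events"][event_id] = event


def gossip_round(state, fanout, send, rng=random):
    """Send a copy of the buffers to `fanout` randomly chosen members of View."""
    message = {
        "events": dict(state["events"]),
        "event_ids": set(state["event_ids"]),
        "subs": set(state["subs"]),
        "unsubs": set(state["unsubs"]),
    }
    candidates = sorted(state["view"])
    targets = rng.sample(candidates, min(fanout, len(candidates)))
    for target in targets:
        send(target, message)       # `send` stands in for the network layer
    return targets
```

Note that the broadcast call itself sends nothing: the event only travels with the next periodic gossip message.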
receiving
• When receiving gossip…
– Membership management
• Add Mes.unSubs to unSubs
• Remove Mes.unSubs from View and Subs
• Add Mes.Subs to View and Subs
• If View becomes too large, move randomly chosen items to Subs
[Figure: Mes.unSubs and Mes.Subs applied to View and Subs]
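The membership steps can be sketched as follows, with all buffers modelled as Python sets that are updated in place. `max_view` is a hypothetical name for the View bound, and skipping the receiver's own ID is an assumption not stated on the slide.

```python
import random


def handle_membership(view, subs, unsubs, msg_subs, msg_unsubs,
                      self_id, max_view, rng=random):
    """Apply the membership part of a received gossip message."""
    # 1. Unsubscriptions: forget the leaving processes and remember to forward them.
    view -= msg_unsubs
    subs -= msg_unsubs
    unsubs |= msg_unsubs
    # 2. Subscriptions: learn the new processes (never ourselves; assumption).
    new_members = msg_subs - {self_id}
    view |= new_members
    subs |= new_members
    # 3. Bound View: move randomly chosen entries from View to Subs.
    while len(view) > max_view:
        evicted = rng.choice(sorted(view))
        view.remove(evicted)
        subs.add(evicted)
```

Entries evicted from View are not lost: they stay in Subs and are advertised to other processes in later gossip messages.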
receiving
• When receiving gossip…
– Event transmission
• Events received for the first time are forwarded to other processes in View
• If Events becomes too large, remove items randomly
– Retrieving events
• When an undelivered event ID is found in Mes.EventIDs, a request to retrieve the event is issued
[Figure: handling of an unknown event e (Events) and of an unknown event ID (EventIDs)]
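The event-handling steps can be sketched as below. `max_events` and the `deliver` callback are hypothetical, and the function only returns the IDs that need to be retrieved; how and from whom they are requested is left open.

```python
import random


def handle_events(events, event_ids, msg_events, msg_event_ids,
                  max_events, deliver, rng=random):
    """Apply the event part of a received gossip message.

    Returns the IDs of events that are known to exist but were not received.
    """
    for event_id, event in msg_events.items():
        if event_id not in event_ids:       # first reception
            events[event_id] = event        # kept so the next gossip forwards it
            event_ids.add(event_id)
            deliver(event)
    # IDs advertised by the sender that we have still never seen.
    to_retrieve = sorted(i for i in msg_event_ids if i not in event_ids)
    # Bound Events by random eviction.
    while len(events) > max_events:
        del events[rng.choice(sorted(events))]
    return to_retrieve
```

Checking against EventIDs rather than Events is what prevents duplicate delivery after an event has been purged from the Events buffer.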
subscribing
• A subscribing process must know at least one existing member
• It sends gossip with itself appended to Subs
• On timeout, it retransmits
unsubscribing
• The leaving process sends gossip with itself appended to unSubs
– The process is gradually removed from individual views
– A timeout is set on unSubs entries
– Assumption: a removed process will not recover soon
[Figure: unSubs entries propagating between processes]
features of lpbcast
• Throughput is as high as that of pbcast
• Memory consumption can be estimated
• The membership algorithm and the dissemination of events are handled at the same level.
• Each view is an independent, uniformly chosen subset
– True P2P model
→ Suitable for WANs
– Need to take “locality” into account
Optimization
• Age-based
– Optimization of the Events buffer
– Now: the Events buffer is purged randomly
→ Better to remove well-disseminated messages
– Age = number of hops
[Figure: processes P1 and P2; bcast(m1), bcast(m2), gossip(m2), deliver(m2); buffer contents [m1] and [m1, m2]]
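Age-based purging can be sketched as follows, assuming that each buffered event carries its hop count and that a received copy is one hop older than the sender's copy. The function names and the tuple layout are hypothetical.

```python
def merge_received(events, event_id, sender_age, payload):
    """Buffer a received event; it is one hop older than the sender's copy."""
    known_age = events[event_id][0] if event_id in events else -1
    events[event_id] = (max(known_age, sender_age + 1), payload)


def purge_oldest(events, max_events):
    """Evict the events with the highest age first.

    `events` maps event ID -> (age, payload). An event that has travelled
    many hops is assumed to be well disseminated already.
    """
    while len(events) > max_events:
        oldest = max(events, key=lambda event_id: events[event_id][0])
        del events[oldest]
```

Compared with random purging, young events stay buffered longer and therefore get more chances to be forwarded.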
Optimization
• Frequency-based
– Optimization of the Subs buffer
– Now: the Subs buffer is purged randomly
→ Better to remove well-known processes
– well-known = frequently included in Subs buffers
[Figure: processes P1, P2 and P3; gossip carrying Subs(P1, P2) and Subs(P2); buffer contents [P2] and [P1, P2]]
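Frequency-based purging can be sketched as follows. The sketch assumes that a process estimates how well known a subscriber is by counting how often that subscriber appeared in the Subs of received gossip messages; the counting scheme and the names are hypothetical.

```python
from collections import Counter


def note_subscriptions(seen, msg_subs):
    """Count one more sighting for every process listed in a received Subs."""
    seen.update(msg_subs)


def purge_well_known(subs, seen, max_subs):
    """Evict the most frequently seen subscribers first."""
    while len(subs) > max_subs:
        best_known = max(sorted(subs), key=lambda pid: seen[pid])
        subs.remove(best_known)


# Sightings mirroring the figure: Subs(P1, P2) and then Subs(P2).
seen = Counter()
note_subscriptions(seen, {"P1", "P2"})
note_subscriptions(seen, {"P2"})
subs = {"P1", "P2", "P3"}
purge_well_known(subs, seen, max_subs=2)
```

Here P2 is evicted because it was seen twice, while P1 and P3 remain. Keeping rarely seen processes in Subs gives them a better chance of ending up in other views.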
Experiment : # of rounds
• Simulation
– Probability of message loss: 0.05
– Probability of process crash: 0.01
– Fanout = 3
• Number of rounds needed to disseminate a message to 99% of all processes
• The number of rounds grows logarithmically
Experiment : Reliability
– SUN Ultra 10 (Solaris 2.6, 256 MB memory)
– 100 Mbps Ethernet
– 40 msg/round, len(Events) = 60
• The probability that any given process delivers any given event notification
Experiment : Optimization Effect
• Age-based optimization
– Delivery ratio = (# of delivered messages) / (# of broadcast messages)
– 30 msg/round, len(Events) = 30, Fanout = 4, 60 processes
[Figure: delivery ratio with optimized versus random purging]
Conclusion
• Scalability + Reliability
• Bimodal Multicast
– A gossip-based protocol achieves scalability and reliability.
• Lightweight Probabilistic Broadcast
– Pays attention to the cost imposed on each process
– Memory management for pbcast
– Lightweight in large-scale environments