Probabilistic Broadcast
Presented by Keren Censor
1
Traditional client-server model:
Central point of failure
Performance bottleneck
Heavy load on servers
2
Peer-to-Peer (P2P):
No central point of failure
No single performance bottleneck
Load is spread across peers
3
Information Dissemination – Deterministic solutions
Flooding – send a message to every neighbor
#Messages = O(#edges), Time = diameter
Deterministic routing – send according to a spanning tree
Non-resilient to failures, Time = O(#nodes)
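Flooding, mentioned above, can be sketched in a few lines of Python; the graph and node names are illustrative, not part of the slides:

```python
from collections import deque

def flood(graph, source):
    """Flooding: every node forwards the message to each of its
    neighbors the first time it receives it.
    Returns the set of informed nodes and the total message count."""
    informed = {source}
    messages = 0
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            messages += 1              # one message per edge traversal
            if neighbor not in informed:
                informed.add(neighbor)
                queue.append(neighbor)
    return informed, messages

# Tiny triangle network: each edge is used once in each direction,
# so #messages = 2 * #edges.
g = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b"]}
reached, msgs = flood(g, "a")
```

Every node sends on all of its edges, which is where the O(#edges) message count comes from.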
4
Requirements: Reliable broadcast
Reach all nodes
Resilient to failures
Considering: dynamic network topology, crashes, disconnections, packet losses
5
Random information spreading
Trade reliability for scalability: the algorithm may be less reliable, but should scale well with system size
Basic gossip algorithm: forward information to a randomly chosen subset of your neighbors
Design parameters:
Buffer capacity B
Fan-out F
Number of times a message is forwarded T
6
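The basic gossip loop just described can be sketched as follows; this is a minimal illustration under the slide's parameters (fan-out F, forward count T), and the dictionary-based message representation is an assumption:

```python
import random

def gossip_round(buffer, neighbors, fanout, ttl):
    """One round of the basic gossip scheme: forward each buffered
    message to `fanout` randomly chosen neighbors, at most `ttl`
    (the slide's T) times in total."""
    sends = []                          # (message id, target) pairs
    for msg in buffer:
        if msg["forwarded"] >= ttl:
            continue                    # already forwarded T times
        for target in random.sample(neighbors, min(fanout, len(neighbors))):
            sends.append((msg["id"], target))
        msg["forwarded"] += 1
    return sends

# F = 2, T = 1: "e1" is forwarded to two random neighbors, "e2" is spent.
buf = [{"id": "e1", "forwarded": 0}, {"id": "e2", "forwarded": 1}]
out = gossip_round(buf, ["p2", "p3", "p4"], fanout=2, ttl=1)
```

The buffer capacity B would bound `len(buffer)`; truncation is covered later in the deck.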
Previous algorithms
First developed for consistency management of replicated databases [Demers et al. 1987]
Reliability in Bimodal Multicast [Birman et al. 1999]:
The set of nodes that a message reaches is:
Almost all of the nodes, with high probability
Almost none of the nodes, with small probability
Other subsets, with vanishingly small probability
7
Design constraints
Membership – knowledge of the participants
Network awareness – knowledge of the real network topology
Buffer management – memory usage
Message filtering – according to different interests
8
Design constraints: Membership
Knowledge of the participants in the system
Previous algorithms assume this knowledge
Problems:
Storage increases linearly with the system size n
Maintenance imposes extra load
9
Design constraints: Membership
Solution: integrate membership with gossip and maintain a partial view
Uniformity: how to gossip to members chosen uniformly at random from the entire system?
Adaptivity: some parameter must grow with the system size; how do we estimate the system size?
Bootstrapping: how is the system initialized?
10
Design constraints: Network awareness
Knowledge of the real network topology
Problem: a message sent by p to a nearby q may be routed through a remote w
11
Design constraints: Network awareness
Solution: organize processes in a hierarchy that reflects the network topology
Distributed? Fault tolerant? Scalable?
12
Related approaches
Directional gossip [Lin and Marzullo, 1999]: each neighbor is assigned a weight according to its connectivity, and neighbors with lower weight are gossiped to with higher probability
(Figure: Pr[reaching the blue node] = 1/(n−1) in each round, while Pr[reaching the green node] grows with each round)
13
Design constraints: Buffer management
Memory usage
Problem: limited buffers. When a buffer is full:
Drop new messages? Drop old messages?
In Bimodal Multicast, a message is gossiped by a node for a limited number of rounds and then erased
14
Design constraints: Buffer management
Solutions: age-based priorities, application semantics
(Elaborated on later)
15
Design constraints: Message filtering
According to different interests
Problem: redundancy if there are topics of interest
How does a process know the interests of its neighbors?
Even if this information were magically available, should p decide not to send q a message that q is not interested in?
Solution: hierarchy of processes
(Figure: p's message to q passes through w. What if w is interested?)
16
LPBCAST Lightweight Probabilistic Broadcast
[Eugster, Guerraoui, Handurukande, Kouznetsov, and Kermarrec, 2003]
Main contribution: scalable memory consumption for
Membership management
Message buffering
17
Model
Set of processes Π = {p1, p2, …}
Synchronous rounds
Complete logical network
LPBCAST has partial views
(Figure: each process runs the Application layer on top of the LPBCAST layer)
18
Buffers:
Event notifications – events
Event notification identifiers – eventIds
Unsubscriptions – unSubs
Subscriptions – subs and view
Each buffer L has a maximum size |L|max
Truncation of L: removing random elements so that |L| ≤ |L|max
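Truncation as defined here is easy to state in code; this sketch assumes buffers are plain Python lists:

```python
import random

def truncate(buf, max_size):
    """Truncation of a buffer L: remove uniformly random elements
    until |L| <= |L|max."""
    while len(buf) > max_size:
        buf.pop(random.randrange(len(buf)))
    return buf

# A buffer of 10 elements truncated to |L|max = 4.
buf = truncate(list(range(10)), 4)
```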
19
Receiving an event from the application
Upon LPBCAST(e): add e to events
(Figure: the application hands e to LPBCAST, which appends it to the events buffer)
20
Gossiping
Periodically (every gossip period of T ms), generate a message and send it to F (fanout) members chosen randomly from view
(Figure: the gossip message contains events, eventIds, unSubs, and subs ∪ {pi}, and is sent to F random elements of view; after gossiping, events := Ø)
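A sketch of this gossip step, assuming the message layout on this slide (events, eventIds, unSubs, subs ∪ {pi}); the function and field names are illustrative:

```python
import random

def make_gossip(pid, view, events, event_ids, unsubs, subs, fanout):
    """Build one gossip message and pick its F destinations.
    The sender adds its own id to the forwarded subscriptions
    (subs ∪ {pi}) and empties its events buffer afterwards."""
    message = {
        "events": list(events),
        "eventIds": list(event_ids),
        "unSubs": list(unsubs),
        "subs": list(subs) + [pid],     # subs ∪ {p_i}
    }
    targets = random.sample(view, min(fanout, len(view)))
    events.clear()                      # events := Ø after gossiping
    return message, targets

events = [{"id": "e1"}]
msg, targets = make_gossip("p1", ["p2", "p3", "p4", "p5"],
                           events, ["e1"], [], ["p9"], fanout=3)
```

Including the sender's own id in `subs` is what keeps subscriptions circulating continuously, as slide 26 discusses.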
21
Gossip reception
Unsubscriptions:
Remove the unsubscribing processes from view and subs
Add the unsubscriptions to unSubs and truncate it (removing random elements)
22
Gossip reception
Subscriptions:
Add the new subscriptions to view and subs
Truncate view into subs (random elements removed from view are moved to subs)
Truncate subs (removing random elements)
23
Gossip reception
Events:
Deliver new event notifications to the application
Add them to events and their ids to eventIds, then truncate both
If a received id is not in eventIds, add it to retrieveBuf
(eventIds keeps the ids of all delivered events)
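The event-handling rules can be sketched as below; the buffer representations and the `deliver` callback are assumptions, and truncation is omitted for brevity:

```python
def receive_events(msg, events, event_ids, retrieve_buf, deliver):
    """Handle the events part of a received gossip message:
    deliver unseen notifications, record their ids, and note ids
    whose event bodies are still missing in retrieveBuf."""
    for event in msg["events"]:
        if event["id"] not in event_ids:
            deliver(event)              # new notification: hand to application
            events.append(event)
            event_ids.append(event["id"])
    for eid in msg["eventIds"]:
        if eid not in event_ids and eid not in retrieve_buf:
            retrieve_buf.append(eid)    # body unseen: schedule retrieval

# "e1" is new and delivered; "e0" was seen before; id "e2" has no body yet.
delivered = []
events, event_ids, retrieve_buf = [], ["e0"], []
msg = {"events": [{"id": "e1"}, {"id": "e0"}], "eventIds": ["e0", "e1", "e2"]}
receive_events(msg, events, event_ids, retrieve_buf, delivered.append)
```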
24
Retrieving events
If more than k rounds have passed since an eventId was inserted into retrieveBuf and the matching event has not yet been received:
Ask for the event from the process q from whom the eventId was received
If there is no reply within r rounds: ask a random neighbor for the event
Finally, ask the source of the event
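The retrieval schedule might look like this in code; the exact round arithmetic (k rounds of waiting, r rounds for the gossiper, then the fallbacks) is an interpretation of the slide, not a verbatim procedure:

```python
def retrieval_action(entry, current_round, k, r):
    """Decide how to fetch a missing event whose id sits in
    retrieveBuf: wait k rounds, then ask the gossiper for r rounds,
    then a random neighbor, then the event's source."""
    age = current_round - entry["inserted_round"]
    if age <= k:
        return "wait"                       # give gossip k rounds first
    if age <= k + r:
        return ("ask", entry["sender"])     # ask whoever gossiped the id
    if age == k + r + 1:
        return ("ask", "random-neighbor")   # no reply after r rounds
    return ("ask", entry["source"])         # last resort: the source

entry = {"inserted_round": 0, "sender": "q", "source": "s"}
```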
25
Subscriptions and unsubscriptions
Subscribe: pi subscribes through some known pj, which gossips this subscription. Gossip messages will then start reaching pi; otherwise it subscribes again
Unsubscribe: unsubscriptions carry timestamps after which they become obsolete
Subscriptions are gossiped continuously to ensure uniformly distributed views: a failed process will be removed from all views with high probability
26
Analysis – Assumptions
n processes; Π is constant
Latency is smaller than the gossip period T
Failures are stochastically independent:
Probability of a message being lost ≤ ε
Number of crashes ≤ f, so Pr[a given process crashes] ≤ f/n
Event notification identifiers are unique
27
Analysis – Distribution of views
Assume each process has an independent, uniformly distributed random view of size l
In round r: Pr[p ∈ view] = l/(n−1)
In round r+1:
Pr[p ∈ view] = (l/(n−1)) · (1 − l/(|subs|max·F))   [p in view and not removed]
             + (1 − l/(n−1)) · l/(|subs|max·F)     [p not in view and enters view]
For l ≪ |subs|max·F, this is estimated by l/(n−1)
28
Analysis – Event propagation
pr = Pr[p receives a given gossip message]
   ≥ (l/(n−1)) · (F/l) · (1−ε) · (1−f/n)   [p in view · p is chosen · message not lost · p doesn’t crash]
   = (F/(n−1)) · (1−ε) · (1−f/n), which does not depend on l
sr,e = #processes that received event e by round r
Markov chain: pij = Pr[sr+1 = j | sr = i] =
   C(n−i, j−i) · (1 − q^i)^(j−i) · (q^i)^(n−j)   if j ≥ i
   0                                             if j < i
where q = 1 − pr
29
Analysis – Event propagation
Markov chain: pij = Pr[sr+1 = j | sr = i] =
   C(n−i, j−i) · (1 − q^i)^(j−i) · (q^i)^(n−j)   if j ≥ i
   0                                             if j < i
where q = 1 − pr
Distribution of sr:
Pr[s0 = j] = 1 if j = 1, 0 otherwise
Pr[sr+1 = j] = Σi Pr[sr = i] · pij
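The transition matrix and the recurrence above can be evaluated numerically; this sketch computes the distribution of sr for a small system (the parameter values are illustrative):

```python
from math import comb

def transition(i, j, n, q):
    """p_ij = Pr[s_{r+1} = j | s_r = i] for the propagation chain:
    each of the n-i uninformed processes independently misses all
    i gossipers with probability q^i."""
    if j < i:
        return 0.0
    stay = q ** i
    return comb(n - i, j - i) * (1 - stay) ** (j - i) * stay ** (n - j)

def round_distribution(n, q, rounds):
    """Distribution of s_r, starting from a single informed source."""
    dist = [0.0] * (n + 1)
    dist[1] = 1.0                       # Pr[s_0 = 1] = 1
    for _ in range(rounds):
        nxt = [0.0] * (n + 1)
        for i, pi in enumerate(dist):
            if pi == 0.0:
                continue
            for j in range(i, n + 1):   # s_r never decreases
                nxt[j] += pi * transition(i, j, n, q)
        dist = nxt
    return dist

dist = round_distribution(n=20, q=0.8, rounds=5)
```

Since each row of the transition matrix sums to 1, `dist` remains a probability distribution after every round, and most of the mass quickly shifts toward large sr (the bimodal behavior cited earlier).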
30
Analysis – Gossip rounds
The number of rounds decreases as the fanout F increases
Claim: the number of rounds increases logarithmically with the system size n
Compare: the diameter of a random graph is O(log n)
The view size l does not influence the number of rounds
But we needed to assume l ≪ |subs|max·F; so does the subs buffer pay the price?
31
Analysis – Partitioning
Pr[partition of size i] = P(i, n, l) = C(n, i) · (C(i−1, l) / C(n−1, l))^i · (C(n−i−1, l) / C(n−1, l))^(n−i)
In one set, each of the i views includes only the other i−1 processes; in the other set, each of the n−i views includes only the other n−i−1
For constant n: decreases as l increases
For constant l: decreases as n increases
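P(i, n, l) can be computed directly with binomial coefficients; this sketch also evaluates the (1 − Σ P)^r form of the no-partition probability used on the following slide (parameter values are illustrative):

```python
from math import comb

def partition_prob(i, n, l):
    """P(i, n, l): probability that some i-process set and its
    complement each hold views pointing only inside themselves."""
    inside = comb(i - 1, l) / comb(n - 1, l)       # one view stays in the i-set
    outside = comb(n - i - 1, l) / comb(n - 1, l)  # one view stays in the rest
    return comb(n, i) * inside ** i * outside ** (n - i)

def no_partition_prob(n, l, r):
    """Chance of no partition during r rounds."""
    total = sum(partition_prob(i, n, l) for i in range(l + 1, n // 2 + 1))
    return (1.0 - total) ** r

p = partition_prob(4, 10, 3)       # smallest possible partition for l = 3
phi = no_partition_prob(10, 3, 5)
```

Note that a set of at most l processes cannot be closed under views of size l, so the sum starts at i = l + 1; Python's `comb` returns 0 for C(i−1, l) when i−1 < l, which makes P vanish there automatically.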
32
Analysis – Partitioning
Pr[no partition up to round r] = (1 − Σ i=l+1..n/2 P(i, n, l))^r
This decreases very slowly with r
Design: l can be chosen as a function of some required probability of not partitioning
In practice, add a hierarchy – a set of processes that are always known to everyone
33
Age-based message purging
Replaces the random truncation of the events buffer
Each event e has an age parameter:
Initialized to 0 by the application
Incremented by every gossiping process
While |events| > |events|max:
First remove the smallest-id events among events from the same source
Then remove the oldest events, according to their age
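The purging loop can be sketched as follows; the priority order (duplicate sources first, then oldest age) follows the slide, while the data layout is an assumption:

```python
def purge_events(events, max_size):
    """Age-based purging: while the buffer is too large, first drop
    the smallest-id event of a source that appears more than once,
    otherwise drop the oldest event by age."""
    while len(events) > max_size:
        by_source = {}
        for event in events:
            by_source.setdefault(event["source"], []).append(event)
        duplicated = [group for group in by_source.values() if len(group) > 1]
        if duplicated:
            victim = min(duplicated[0], key=lambda e: e["id"])
        else:
            victim = max(events, key=lambda e: e["age"])
        events.remove(victim)
    return events

# Source "a" appears twice, so its smallest-id event is purged first.
events = [{"source": "a", "id": 1, "age": 5},
          {"source": "a", "id": 2, "age": 0},
          {"source": "b", "id": 3, "age": 9}]
purge_events(events, max_size=2)
```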
34
Age-based message purging
Evaluation:
Delivery ratio: ratio between the number of messages delivered and the number of messages sent per round
Redundancy: fraction of identical messages received by a given process in a given round
Throughput: as a function of stability (a message is stable once delivered by 90% of the processes)
Fault tolerance: delivery ratio in the presence of faults
35
Frequency-based membership purging
With random truncation, a new member has the same probability of being removed as a well-known member
Each element in subs has a frequency parameter, incremented each time the element is received
Truncating: avg = average frequency in the list
1. Choose a random element from the list
2. If its frequency > k·avg, remove this element
3. Otherwise, increment its frequency and go to 1
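The truncation loop in steps 1-3 can be written directly; the list layout and the value of k are illustrative:

```python
import random

def truncate_subs(subs, max_size, k):
    """Frequency-based truncation: pick a random element; evict it
    only if its frequency exceeds k times the average, otherwise
    increment its frequency and try again."""
    while len(subs) > max_size:
        avg = sum(s["freq"] for s in subs) / len(subs)   # avg frequency
        candidate = random.choice(subs)                  # step 1
        if candidate["freq"] > k * avg:                  # step 2
            subs.remove(candidate)
        else:
            candidate["freq"] += 1                       # step 3, goto 1
    return subs

random.seed(7)  # deterministic run for the example
# A well-known member (freq 100) is far more likely to be evicted
# than the four newcomers (freq 1).
subs = [{"id": i, "freq": 1} for i in range(4)] + [{"id": 99, "freq": 100}]
truncate_subs(subs, max_size=4, k=2)
```

Bumping the frequency of survivors on each failed attempt is what protects newcomers: an element must be seen (or retried) many times before it becomes eligible for eviction.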
36
Frequency-based membership purging
Evaluation:
Propagation delay: number of informed processes as a function of the round number
Membership management: number of times membership information about a process is seen by others, measured on processes removed from the subs buffer
37
References
Epidemic Algorithms for Replicated Database Maintenance [Demers et al. 1987]
Bimodal Multicast [Birman et al. 1999]
Directional Gossip: Gossip in a Wide Area Network [Lin and Marzullo 1999]
Lightweight Probabilistic Broadcast [Eugster et al. 2003]
38
Thank you :)
39