Fault-Tolerant Distributed Computing: Atomic Broadcast
Outline
• Why distributed computing?
• Atomic Broadcast
• The atom system
• Relevance for e-textiles
• What’s next?
• Q&A
Why Distributed Computing?
• Spread and balance the computational weight of applications
• Solve bigger problems
• Deal with problems locally instead of centralizing all the data
Example
• Space filtering vs. raw consensus
– Acoustic Beam Forming: master collects information from slaves and decides according to the relevance of data
– Consensus: no master, all processes decide upon one common value
Atomic Broadcast: Definition (1)
• Atomic Broadcast = the same set of messages is delivered by all the processes in the same order
• Consensus = all processes decide upon one common value among those proposed
Atomic Broadcast: Definition (2)
• Validity: If a correct process broadcasts a message m, it will eventually deliver m
• Uniform agreement: If a process delivers a message m, then every correct process will eventually deliver m
• Uniform integrity: Every message m is delivered at most once, and only if it was previously broadcast by sender(m)
• Total order: If two correct processes p and q both deliver two messages m and m’, then p delivers m before m’ iff q delivers m before m’
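The integrity and total order properties can be checked mechanically on recorded delivery logs. The sketch below is only an illustration: the `logs` dictionary (process name to list of delivered messages) and both function names are assumptions made for this example, not part of any system described in these slides.

```python
from itertools import combinations


def check_integrity(logs):
    """Every message appears at most once in each process's delivery log."""
    return all(len(log) == len(set(log)) for log in logs.values())


def check_total_order(logs):
    """Any two processes deliver their common messages in the same relative order."""
    for p, q in combinations(logs, 2):
        common = set(logs[p]) & set(logs[q])
        # Restrict each log to the messages both processes delivered.
        order_p = [m for m in logs[p] if m in common]
        order_q = [m for m in logs[q] if m in common]
        if order_p != order_q:
            return False
    return True


# Hypothetical logs: q has not yet delivered m3, but the shared prefix agrees.
good = {"p": ["m1", "m2", "m3"], "q": ["m1", "m2"]}
# Hypothetical logs that violate total order: p and q disagree on m1 vs. m2.
bad = {"p": ["m1", "m2"], "q": ["m2", "m1"]}

print(check_total_order(good), check_total_order(bad))
```

Validity and uniform agreement are liveness properties ("eventually"), so a finite log can only show that they have not been satisfied yet, never that they are violated.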
Atomic Broadcast: Bad News
• Impossible to achieve deterministically in a totally asynchronous system if even one process may crash [Fischer, Lynch, Paterson 85]
Atomic Broadcast: Good News
• Can be done using unreliable failure detectors
• Based on a Consensus algorithm described in [Chandra, Toueg 96]
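The idea of building Atomic Broadcast on top of Consensus can be sketched as a sequence of rounds: each process proposes the set of messages it has received but not yet delivered, Consensus picks one batch, and every process delivers that batch in the same deterministic order. The code below is a single-machine simulation under stated assumptions: `Process`, `consensus`, and `run_rounds` are hypothetical names, and `consensus` is a trivial stand-in, not a real failure-detector-based algorithm and not Atom's code.

```python
class Process:
    def __init__(self, pid):
        self.pid = pid
        self.r_delivered = set()  # messages received via reliable broadcast
        self.a_delivered = []     # messages atomically delivered, in order

    def undelivered(self):
        return self.r_delivered - set(self.a_delivered)


def consensus(proposals):
    """Stand-in for a consensus module (assumption): every caller gets the
    same decision, here the non-empty proposal of the smallest process id."""
    non_empty = {pid: batch for pid, batch in proposals.items() if batch}
    if not non_empty:
        return frozenset()
    return non_empty[min(non_empty)]


def run_rounds(processes):
    """Repeat consensus rounds until no process has undelivered messages."""
    while any(p.undelivered() for p in processes):
        proposals = {p.pid: frozenset(p.undelivered()) for p in processes}
        decided = consensus(proposals)
        for p in processes:
            # Same batch, same deterministic order at every process.
            for m in sorted(decided):
                if m not in p.a_delivered:
                    p.a_delivered.append(m)


# Three processes that have received different subsets of messages so far.
procs = [Process(i) for i in range(3)]
procs[0].r_delivered = {"a"}
procs[1].r_delivered = {"a", "b"}
procs[2].r_delivered = {"b"}
run_rounds(procs)
print([p.a_delivered for p in procs])  # → [['a', 'b'], ['a', 'b'], ['a', 'b']]
```

Because all processes apply the same decided batches in the same round order, their delivery sequences agree even though the messages arrived differently at each of them.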
Atom
• Open source Atomic Broadcast system
[Figure: Atom components: Transmission, Consensus, FailureDetector, Reliable Broadcast, Atomic Broadcast (A_Broadcast, A_Deliver)]
Atom
[Figure: Atom task diagram. Labels: One_run, do_decide, do_Consensus, AB task 1, AB task 2, AB task 3, RB, FD trust, FD suspect, R-broadcast, Producer, Consumer, A-deliver, A-broadcast, start, cancel]
Relevance to E-textiles
• Synchronization of data
• Coordination of decisions and actions
• Light-weight process
• Buffer sizes can be predicted
What’s Next?
• Scalability is a problem for classic fault-tolerant distributed algorithms
• Bimodal Multicast [Ken Birman, Mark Hayden, Oznur Ozkasap, Zhen Xiao, Mihai Budiu, Yaron Minsky – 1998]
– Gossip protocol
– Relaxes the “strong” reliability guarantees, replacing them with probabilistic guarantees
– Converges to “strong” reliability in the absence of failures
– Scalable with steady throughput
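To give a feel for what "probabilistic guarantees" means, the sketch below simulates plain push gossip: in each round, every node that already holds the message forwards it to randomly chosen peers. This is a generic illustration, not the Bimodal Multicast protocol itself; the function name `push_gossip` and its parameters (`n`, `rounds`, `fanout`, `seed`) are assumptions made for this example.

```python
import random


def push_gossip(n, rounds, fanout=1, seed=0):
    """Simulate push gossip among n nodes; node 0 starts with the message.

    Returns the number of informed nodes after each round (index 0 is the
    initial state). No failures or message loss are modeled.
    """
    rng = random.Random(seed)
    informed = {0}
    history = [len(informed)]
    for _ in range(rounds):
        newly_informed = set()
        for _node in informed:
            for _ in range(fanout):
                # Each informed node pushes to a uniformly random peer.
                newly_informed.add(rng.randrange(n))
        informed |= newly_informed
        history.append(len(informed))
    return history


print(push_gossip(n=100, rounds=15))
```

No single round guarantees that a particular node is reached, but the informed set can only grow, and the chance that any node is still missing the message shrinks as rounds accumulate.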
Questions …