Fault-Tolerant Distributed Computing: Atomic Broadcast

Outline

• Why distributed computing?

• Atomic Broadcast

• The atom system

• Relevance for e-textiles

• What’s next?

• Q&A

Why Distributed Computing?

• Spread and balance the computational weight of applications

• Solve bigger problems

• Deal with problems locally instead of centralizing all the data

Example

• Space filtering vs. raw consensus

– Acoustic Beam Forming: master collects information from slaves and decides according to the relevance of data

– Consensus: no master, all processes decide upon one common value

Atomic Broadcast: Definition (1)

• Atomic Broadcast = the same set of messages is delivered by all the processes in the same order

• Consensus = all processes decide upon one common value among those proposed

Atomic Broadcast: Definition (2)

• Validity: If a correct process broadcasts a message m, then it eventually delivers m

• Uniform agreement: If a process delivers a message m, then every correct process eventually delivers m

• Uniform integrity: Every message m is delivered at most once by each process, and only if it was previously broadcast by sender(m)

• Total order: If two correct processes p and q both deliver messages m and m’, then p delivers m before m’ iff q delivers m before m’
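
The four properties above can be checked mechanically on recorded delivery logs. The following is a minimal sketch and not part of the original slides: the function name `check_atomic_broadcast`, its parameters, and the log format are hypothetical choices made for illustration. Because it only sees finite logs, it reads "eventually" as "by the end of the log".

```python
from itertools import combinations

def check_atomic_broadcast(logs, broadcast, correct):
    """Return the names of the properties violated by finished delivery logs.

    logs:      dict process -> list of delivered message ids, in delivery order
    broadcast: dict message id -> sender process
    correct:   set of processes that never crashed
    (All three structures are assumptions made for this sketch.)
    """
    violations = []

    # Validity: a correct sender delivers its own messages.
    for m, sender in broadcast.items():
        if sender in correct and m not in logs[sender]:
            violations.append("validity")
            break

    # Uniform agreement: a message delivered by any process (even one that
    # crashed later) is delivered by every correct process.
    delivered_anywhere = set().union(*map(set, logs.values()))
    if any(m not in logs[p] for m in delivered_anywhere for p in correct):
        violations.append("uniform agreement")

    # Uniform integrity: no duplicates, and only messages that were broadcast.
    for log in logs.values():
        if len(log) != len(set(log)) or any(m not in broadcast for m in log):
            violations.append("uniform integrity")
            break

    # Total order: two correct processes agree on the relative order of the
    # messages they both delivered.
    for p, q in combinations(sorted(correct), 2):
        common = set(logs[p]) & set(logs[q])
        if [m for m in logs[p] if m in common] != [m for m in logs[q] if m in common]:
            violations.append("total order")
            break

    return violations
```

For example, with `broadcast = {"a": "p", "b": "q"}`, the logs `{"p": ["a", "b"], "q": ["b", "a"]}` satisfy the first three properties but violate total order, because p and q disagree on which of a and b came first.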

Atomic Broadcast: Bad News

• Impossible to achieve deterministically in a totally asynchronous system if even one process can crash [Fischer, Lynch, Paterson 85]

Atomic Broadcast: Good News

• Can be done using unreliable failure detectors

• Based on a Consensus algorithm described in [Chandra, Toueg 96]
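
An unreliable failure detector may suspect a process that is merely slow, as long as it can later change its mind. The sketch below shows one common way to build such a detector from heartbeats and adaptive timeouts. It is an illustration only: the slides do not say how the detector used here is implemented, and the class name `HeartbeatFailureDetector`, its methods, and the timeout-doubling rule are assumptions. Time is passed in explicitly so the behaviour is deterministic.

```python
class HeartbeatFailureDetector:
    """Unreliable failure detector driven by heartbeats and per-process timeouts.

    Hypothetical sketch: names and the doubling policy are assumptions.
    """

    def __init__(self, processes, initial_timeout=2.0):
        self.timeout = {p: initial_timeout for p in processes}
        self.last_heard = {p: 0.0 for p in processes}
        self.suspected = set()

    def on_heartbeat(self, p, now):
        """Record a heartbeat from p received at time `now`."""
        self.last_heard[p] = now
        if p in self.suspected:
            self.suspected.discard(p)   # the suspicion was a mistake: trust p again
            self.timeout[p] *= 2        # and be more patient with p from now on

    def check(self, now):
        """Suspect every process silent for longer than its timeout; return the suspects."""
        for p, heard in self.last_heard.items():
            if now - heard > self.timeout[p]:
                self.suspected.add(p)
        return set(self.suspected)
```

Doubling the timeout after each false suspicion means that, if message delays eventually become bounded, a live process stops being suspected after finitely many mistakes, while a crashed process stays suspected because it never sends another heartbeat.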

Atom

• Open source Atomic Broadcast system

[Figure: Atom architecture. Modules shown: Atomic Broadcast (A_Broadcast, A_Deliver), Consensus, Failure Detector, Reliable Broadcast, Transmission]
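
One well-known way to combine the modules named on this slide is the reduction of atomic broadcast to consensus from [Chandra, Toueg 96]: messages spread by reliable broadcast, and the processes repeatedly run consensus on the batch of messages not yet A-delivered. The sketch below simulates that idea inside a single Python program. It is an illustration under stated assumptions, not Atom's code: `Process`, `run_round`, and `stand_in_consensus` are hypothetical names, and the consensus step is a trivial placeholder rather than a fault-tolerant algorithm.

```python
def stand_in_consensus(proposals):
    """Placeholder for a real consensus module (assumption of this sketch).

    Every caller gets the same decision, and the decision is one of the
    proposed values: here, the proposal of the lowest process id.
    """
    return proposals[min(proposals)]

class Process:
    def __init__(self, pid):
        self.pid = pid
        self.r_delivered = set()   # messages received via reliable broadcast
        self.a_delivered = []      # messages A-delivered so far, in order

    def pending(self):
        """Messages that were R-delivered but not yet A-delivered."""
        return frozenset(self.r_delivered - set(self.a_delivered))

    def a_deliver_batch(self, batch):
        # A fixed order inside the batch makes all processes deliver identically.
        for m in sorted(batch):
            if m not in self.a_delivered:
                self.a_delivered.append(m)

def run_round(processes):
    """One consensus instance: agree on a batch, then A-deliver it everywhere."""
    proposals = {p.pid: p.pending() for p in processes}
    batch = stand_in_consensus(proposals)
    for p in processes:
        p.a_deliver_batch(batch)

# Usage: the processes have seen different subsets of messages so far.
procs = [Process(i) for i in range(3)]
procs[0].r_delivered.add("m1")
procs[1].r_delivered.update({"m1", "m2"})
procs[2].r_delivered.add("m2")
run_round(procs)                       # decides {"m1"}
for p in procs:                        # reliable broadcast catches up
    p.r_delivered.update({"m1", "m2"})
run_round(procs)                       # decides {"m2"}
print([p.a_delivered for p in procs])
```

Although the three processes initially hold different message sets, they all A-deliver the same sequence, because the order is fixed by the sequence of consensus decisions and by the deterministic sort inside each batch.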

Atom

[Figure: Atom task diagram. Labels shown: Producer, Consumer, A-broadcast, A-deliver, R-broadcast, RB, AB task 1, AB task 2, AB task 3, do_Consensus, do_decide, One_run, FD trust, FD suspect, start, cancel]

Relevance to E-textiles

• Synchronization of data

• Coordination of decisions and actions

• Light-weight process

• Buffer sizes can be predicted

What’s Next?

• Scalability is a problem for classic fault-tolerant distributed algorithms

• Bimodal Multicast [Ken Birman, Mark Hayden, Oznur Ozkasap, Zhen Xiao, Mihai Budiu, Yaron Minsky – 1998]

– Gossip protocol

– Relaxes the “strong” reliability guarantees, replacing them with probabilistic guarantees

– Converges to “strong” reliability in the absence of failures

– Scalable with steady throughput
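
The probabilistic flavour of gossip can be seen in a small simulation. The sketch below models plain push gossip of a single message. It is not the Bimodal Multicast protocol itself, and the function `gossip_rounds` with its parameters `fanout`, `loss`, `max_rounds`, and `seed` is a hypothetical construction for illustration.

```python
import random

def gossip_rounds(n, fanout=2, loss=0.0, max_rounds=100, seed=0):
    """Simulate push gossip of one message among n nodes.

    Returns a list with the number of nodes holding the message after each
    round; index 0 is the initial state. All parameters are assumptions.
    """
    rng = random.Random(seed)
    has_message = [False] * n
    has_message[0] = True                 # node 0 is the original sender
    coverage = [1]
    for _ in range(max_rounds):
        if all(has_message):
            break
        senders = [i for i in range(n) if has_message[i]]
        for i in senders:
            # Each informed node forwards the message to `fanout` random peers.
            peers = rng.sample([j for j in range(n) if j != i], fanout)
            for j in peers:
                if rng.random() >= loss:  # the gossip message may be lost
                    has_message[j] = True
        coverage.append(sum(has_message))
    return coverage

print(gossip_rounds(32))             # no message loss
print(gossip_rounds(32, loss=0.3))   # 30% of gossip messages dropped
```

Each node does a constant amount of work per round regardless of the group size, which is the source of the scalability claimed above. With `loss=0.0` the coverage reaches all nodes with overwhelming probability after a number of rounds that grows slowly with n; with losses, delivery to every node is likely rather than guaranteed within a fixed number of rounds.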

Questions …