Global States in a Distributed System By John Kor and Yvonne Cheng

Page 1: Global States in a Distributed System By John Kor and Yvonne Cheng

Global States in a Distributed System

By John Kor and Yvonne Cheng

Page 2: Global States in a Distributed System By John Kor and Yvonne Cheng

Initial Problem Example

Garbage Collector
- Frees up memory which is no longer in use
- Checks if a reference to memory still exists

What about in a distributed system?

Page 3: Global States in a Distributed System By John Kor and Yvonne Cheng

Initial Problem Example (cont’d)

A distributed system consists of multiple processes

- Each process is located on a different computer
- No sharing of processor or memory

Page 4: Global States in a Distributed System By John Kor and Yvonne Cheng

Initial Problem Example (cont’d)

Each process can only determine its own “state”

Problem:
- How do we determine when to garbage collect in a distributed system?
- How do we check whether a reference to memory still exists?

Page 5: Global States in a Distributed System By John Kor and Yvonne Cheng

System Model

- A distributed system consists of multiple processes
- Each process is located on a different computer
- Each process consists of “events”
- An event is either sending a message, receiving a message, or changing the value of some variable
- Each process has a communication channel in and out

Page 6: Global States in a Distributed System By John Kor and Yvonne Cheng

Our Garbage Collection Problem

In order to test whether a certain property of our system is true, we cannot just look at each process individually

A “snapshot” of the entire system must be taken to test whether a certain property of the system is true

This “snapshot” is called a Global State

Page 7: Global States in a Distributed System By John Kor and Yvonne Cheng

Definition

The global state of a distributed system is the set of local states of the individual processes involved in the system plus the states of the communication channels.
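As a rough illustration of this definition, the sketch below models a global state as process states plus channel states and applies it to the garbage collection question. The names `GlobalState` and `is_referenced`, and the choice to represent states as collections of object ids, are assumptions made for this example only.

```python
from dataclasses import dataclass, field


@dataclass
class GlobalState:
    # process name -> set of object ids that the process currently references
    process_states: dict = field(default_factory=dict)
    # (sender, receiver) -> list of object ids carried by messages still in transit
    channel_states: dict = field(default_factory=dict)


def is_referenced(state, obj):
    """True if some process holds obj or a message in transit carries it."""
    held = any(obj in refs for refs in state.process_states.values())
    in_transit = any(obj in msgs for msgs in state.channel_states.values())
    return held or in_transit


# Neither p nor q holds obj1 locally, but a reference to it is in transit from p to q.
snapshot = GlobalState(
    process_states={"p": set(), "q": set()},
    channel_states={("p", "q"): ["obj1"], ("q", "p"): []},
)
```

Looking only at the process states would wrongly suggest that `obj1` is garbage, which is why the channel states are part of the definition.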

Page 8: Global States in a Distributed System By John Kor and Yvonne Cheng

Determinism

Deterministic Computation
- At any point in the computation there is at most one event that can happen next.

Non-Deterministic Computation
- At any point in the computation there can be more than one event that can happen next.

Page 9: Global States in a Distributed System By John Kor and Yvonne Cheng

Deterministic Computation

Page 10: Global States in a Distributed System By John Kor and Yvonne Cheng

Non-Deterministic Computation

Page 11: Global States in a Distributed System By John Kor and Yvonne Cheng

Determinism

Deterministic computation
- A local event would reveal everything about the global state!
- The process will know the other processes’ states

Non-Deterministic computation
- Because of branching, a local event cannot reveal what the next step will be

Page 12: Global States in a Distributed System By John Kor and Yvonne Cheng

Simple Algorithm

Create a new process that collects the states of every other process

Every process will save its state at an arbitrary time and send it to this new process

Page 13: Global States in a Distributed System By John Kor and Yvonne Cheng

Advantages

Very simple

Easy to implement

Page 14: Global States in a Distributed System By John Kor and Yvonne Cheng

Problems?

Based on the assumption that all processes work on a synchronized global clock

Wrong assumption!

Page 15: Global States in a Distributed System By John Kor and Yvonne Cheng

Problems (cont’d)

State recorded by p

[Figure: processes p and q and message m]

Page 16: Global States in a Distributed System By John Kor and Yvonne Cheng

Problems (cont’d)

[Figure: processes p and q and message m]

Page 17: Global States in a Distributed System By John Kor and Yvonne Cheng

Problems (cont’d)

State recorded by q

[Figure: processes p and q and message m]

Page 18: Global States in a Distributed System By John Kor and Yvonne Cheng

Problems (cont’d)

Global state recorded

[Figure: processes p and q and message m]

Page 19: Global States in a Distributed System By John Kor and Yvonne Cheng

Another view

[Figure: processes p and q and message m]

Page 20: Global States in a Distributed System By John Kor and Yvonne Cheng

Another view

Process p has no record of sending m

Process q HAS record of receiving m

Problem?
- Global state does not show p sending m, therefore there is confusion as to where m came from
- Breaks the Consistency concept

Page 21: Global States in a Distributed System By John Kor and Yvonne Cheng

Consistency

A global state is consistent if it could have been observed by an external observer

If e → e′ (e happened before e′) and e′ is included in the global state, then e must be included in that state as well

For a successful Global State, all states must be consistent
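The consistency condition can be checked mechanically. The following is a minimal sketch, assuming a global state is described by how many events of each process it includes (a "cut") and that each message is given as a send event and a receive event; this representation and the name `is_consistent` are illustrative assumptions, not part of the slides.

```python
def is_consistent(cut, messages):
    """Check that no message is recorded as received without being recorded as sent.

    cut: dict mapping process name -> number of that process's events included.
    messages: list of (sender, send_index, receiver, recv_index), where the
              indices are 1-based positions in each process's event sequence.
    """
    for sender, send_idx, receiver, recv_idx in messages:
        received = recv_idx <= cut[receiver]
        sent = send_idx <= cut[sender]
        if received and not sent:
            return False
    return True


# p's first event sends m; q's first event receives m.
messages = [("p", 1, "q", 1)]

both_recorded = is_consistent({"p": 1, "q": 1}, messages)      # send and receive both included
in_transit = is_consistent({"p": 1, "q": 0}, messages)         # sent but not yet received
receive_without_send = is_consistent({"p": 0, "q": 1}, messages)
```

The last cut is the situation from the earlier slides: q has a record of receiving m while p has no record of sending it.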

Page 22: Global States in a Distributed System By John Kor and Yvonne Cheng

Solution

Need to develop an asynchronous algorithm

Cannot depend on a clock

Must ensure consistency in all global states

Page 23: Global States in a Distributed System By John Kor and Yvonne Cheng

Assumptions

Distributed system: finite set of processes and channels; described by a graph

Processes: set of states, initial state, set of events

Channels: FIFO, error-free, infinite buffers, arbitrary but finite delay

Page 24: Global States in a Distributed System By John Kor and Yvonne Cheng

PART 2

Presented By: Yvonne

Page 25: Global States in a Distributed System By John Kor and Yvonne Cheng

Idea of a global state recording algorithm

- each process records its own state

- the two processes at the ends of a channel cooperate in recording the channel state

Page 26: Global States in a Distributed System By John Kor and Yvonne Cheng

Challenge

- No global clock

- Need a meaningful result

- Superimposed on underlying computation

Page 27: Global States in a Distributed System By John Kor and Yvonne Cheng

Meaningful: The notion of Consistency

- it could have been observed by an external observer

- All feasible states are consistent

Page 28: Global States in a Distributed System By John Kor and Yvonne Cheng

An Example

[Figure: timelines of p and q (states Sp0 to Sp3 and Sq0 to Sq3) with messages m1, m2, m3]

Page 29: Global States in a Distributed System By John Kor and Yvonne Cheng

A Consistent State?

Candidate global state: (Sp1, Sq1)

[Figure: timelines of p and q (states Sp0 to Sp3 and Sq0 to Sq3) with messages m1, m2, m3]

Page 30: Global States in a Distributed System By John Kor and Yvonne Cheng

Yes

Global state: (Sp1, Sq1)

[Figure: timelines of p and q (states Sp0 to Sp3 and Sq0 to Sq3) with messages m1, m2, m3]

Page 31: Global States in a Distributed System By John Kor and Yvonne Cheng

A Consistent State?

Candidate global state: (Sp2, Sq3), with message m3

[Figure: timelines of p and q (states Sp0 to Sp3 and Sq0 to Sq3) with messages m1, m2, m3]

Page 32: Global States in a Distributed System By John Kor and Yvonne Cheng

Yes

Global state: (Sp2, Sq3), with message m3

[Figure: timelines of p and q (states Sp0 to Sp3 and Sq0 to Sq3) with messages m1, m2, m3]

Page 33: Global States in a Distributed System By John Kor and Yvonne Cheng

An inconsistent State

Global state: (Sp1, Sq3)

[Figure: timelines of p and q (states Sp0 to Sp3 and Sq0 to Sq3) with messages m1, m2, m3]

Page 34: Global States in a Distributed System By John Kor and Yvonne Cheng

Conducting algorithm: Using An Example

- Processes: p and q
- Channels: c and c’
- Token: t

[Figure: processes p and q connected by channels c and c’]

Page 35: Global States in a Distributed System By John Kor and Yvonne Cheng

An Example

- p records its state

[Figure: p, q, channels c and c’, and token t]

Page 36: Global States in a Distributed System By John Kor and Yvonne Cheng

An Example

- q, c, and c’ record their states

[Figure: p, q, channels c and c’, and token t]

Page 37: Global States in a Distributed System By John Kor and Yvonne Cheng

An Example

- The composite global state!

[Figure: p, q, channels c and c’; the composite state shows the token t twice]

Page 38: Global States in a Distributed System By John Kor and Yvonne Cheng

An Example

- n: number of messages sent along c before p’s state is recorded

- n’: number of messages sent along c before c’s state is recorded

[Figure: processes p and q connected by channels c and c’]

Page 39: Global States in a Distributed System By John Kor and Yvonne Cheng

An Example

- Reason for the inconsistency: n < n’ (here n = 0 and n’ = 1)

[Figure: two diagrams of p, q, channels c and c’, and token t, labelled n = 0 and n’ = 1]

Page 40: Global States in a Distributed System By John Kor and Yvonne Cheng

Similar scenario

c is recorded when the token is at process p.

p sends the token through channel c, and the states of c’, p, and q are recorded.

The recorded global state: no tokens in the system.

The reason for the inconsistency: n > n’
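Both token scenarios can be reproduced with a few lines of code. This is a small sketch, assuming the system is described by how many tokens sit in each of p, c, q, and c’ at a given moment; the helper `compose` and the variable names are hypothetical.

```python
# Where the single token sits at two successive moments of the computation.
before_send = {"p": 1, "c": 0, "q": 0, "c'": 0}   # token is at p
after_send = {"p": 0, "c": 1, "q": 0, "c'": 0}    # p has sent the token along c


def compose(recorded_from):
    """Build a recorded global state by taking each component from the moment it was recorded."""
    return {part: moment[part] for part, moment in recorded_from.items()}


# p is recorded before the send; c, q, and c' after it (n < n').
two_tokens = compose({"p": before_send, "c": after_send, "q": after_send, "c'": after_send})

# c is recorded before the send; p, q, and c' after it (n > n').
no_tokens = compose({"c": before_send, "p": after_send, "q": after_send, "c'": after_send})
```

Summing the values gives two tokens in the first recorded state and none in the second, although the real system always contains exactly one.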

Page 41: Global States in a Distributed System By John Kor and Yvonne Cheng

Conclusion from the example

A consistent global state requires n = n’

Page 42: Global States in a Distributed System By John Kor and Yvonne Cheng

Similar Conclusion

m : number of messages received along c before q’s state is recorded

m’ : number of messages received along c before c’s state is recorded

For consistency: m = m’

Page 43: Global States in a Distributed System By John Kor and Yvonne Cheng

Some other equations

m’ : number of messages received along c before c’s state is recorded

n’ : number of messages sent along c before c’s state is recorded

m : number of messages received along c before q’s state is recorded

n : number of messages sent along c before p’s state is recorded

Consistency requires n = n’ and m = m’

A channel cannot deliver more messages than were sent on it: n’ >= m’

Therefore n >= m

Page 44: Global States in a Distributed System By John Kor and Yvonne Cheng

Other Fact

The state of channel c that is recorded must be the sequence of messages sent along the channel before the sender’s state is recorded, excluding the sequence of messages received along the channel before the receiver’s state is recorded.

Two cases:
- n’ = m’: c is empty
- n’ > m’: c must be the (m’+1)st … n’th messages sent by p along c
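Because the channel is FIFO, the two cases reduce to taking a slice of the messages p sent. A minimal sketch, assuming the sent messages are available as an ordered list (the function name and arguments are illustrative):

```python
def recorded_channel_state(sent_by_p, n, m):
    """Return the recorded state of channel c.

    sent_by_p: messages p sent along c, in sending order.
    n: number of messages sent along c before p's state is recorded.
    m: number of messages received along c before q's state is recorded.
    """
    if m > n:
        raise ValueError("inconsistent: more messages received than sent")
    # Messages m+1 .. n were sent before p recorded but not yet received when q recorded.
    return sent_by_p[m:n]


one_in_transit = recorded_channel_state(["m1", "m2", "m3"], 3, 2)
empty_channel = recorded_channel_state(["m1", "m2", "m3"], 2, 2)
```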

Page 45: Global States in a Distributed System By John Kor and Yvonne Cheng

Putting It All Together: A brief sketch of the algorithm

p sends a marker message along all its outgoing channels after it records its state and before it sends any other messages.

On receipt of a marker message from channel c:
    if p has not recorded its state
        record the state
        state ( c ) = EMPTY
    else
        state ( c ) = messages received on c since p recorded its state, excluding the marker.
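The marker rules can be tried out in a small single-threaded simulation. This is only a sketch under stated assumptions: the application state of each process is a token count, channels are FIFO queues, and the names `Process`, `connect`, and `deliver` are invented for this example.

```python
from collections import deque

MARKER = "MARKER"


class Process:
    def __init__(self, name, state):
        self.name = name
        self.state = state          # application state: number of tokens held
        self.out = {}               # destination name -> outgoing FIFO channel
        self.incoming = []          # names of processes with a channel into this one
        self.recorded_state = None  # local state captured by the snapshot
        self.channel_state = {}     # source name -> messages recorded for that channel
        self.recording = set()      # incoming channels still being recorded

    def record(self):
        """Record the local state, then send a marker on every outgoing channel."""
        self.recorded_state = self.state
        self.recording = set(self.incoming)
        for src in self.incoming:
            self.channel_state[src] = []
        for chan in self.out.values():
            chan.append(MARKER)

    def receive(self, src, msg):
        if msg == MARKER:
            if self.recorded_state is None:
                self.record()            # first marker: record now
            self.recording.discard(src)  # this channel's recorded state is final
        else:
            if src in self.recording:
                self.channel_state[src].append(msg)
            self.state += msg            # application behaviour: accept the tokens


def connect(a, b):
    """Create a FIFO channel from a to b."""
    chan = deque()
    a.out[b.name] = chan
    b.incoming.append(a.name)
    return chan


def deliver(chan, src, dst):
    """Deliver the oldest message on chan from src to dst."""
    dst.receive(src.name, chan.popleft())


p, q = Process("p", 1), Process("q", 0)
c = connect(p, q)         # channel c:  p -> q
c_prime = connect(q, p)   # channel c': q -> p

p.state -= 1
c.append(1)               # p sends the token along c
q.record()                # q starts the snapshot while the token is in transit
deliver(c, p, q)          # q receives the token and records it as part of c
deliver(c_prime, q, p)    # p receives q's marker, records, and sends its own marker
deliver(c, p, q)          # q receives p's marker; recording of c stops

tokens = (p.recorded_state + q.recorded_state
          + sum(q.channel_state["p"]) + sum(p.channel_state["q"]))
```

In this run the recorded state has no token at p or q and one token in c, so the total is one token, unlike the inconsistent recordings in the earlier example.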

Page 46: Global States in a Distributed System By John Kor and Yvonne Cheng

Chandy and Lamport Algorithm

Features:
- Does not promise to give us exactly what is there
- But gives us a consistent state!

Page 47: Global States in a Distributed System By John Kor and Yvonne Cheng

Algorithm in Action

[Figure: timelines of p and q (states Sp0 to Sp3 and Sq0 to Sq3) with messages m1, m2, m3]

Page 48: Global States in a Distributed System By John Kor and Yvonne Cheng

Algorithm in Action

[Figure: timelines of p and q (states Sp0 to Sp3 and Sq0 to Sq3) with messages m1, m2, m3]

q records state as Sq1, sends marker to p

Page 49: Global States in a Distributed System By John Kor and Yvonne Cheng

Algorithm in Action

[Figure: timelines of p and q (states Sp0 to Sp3 and Sq0 to Sq3) with messages m1, m2, m3]

p records state as Sp2, channel state as empty

Page 50: Global States in a Distributed System By John Kor and Yvonne Cheng

Algorithm in Action

[Figure: timelines of p and q (states Sp0 to Sp3 and Sq0 to Sq3) with messages m1, m2, m3]

q records channel state as m3

Page 51: Global States in a Distributed System By John Kor and Yvonne Cheng

Algorithm in Action

[Figure: timelines of p and q (states Sp0 to Sp3 and Sq0 to Sq3) with messages m1, m2, m3]

Recorded Global State = ((Sp2, Sq1), (∅, m3))

Page 52: Global States in a Distributed System By John Kor and Yvonne Cheng

Algorithm in Action

[Figure: timelines of p and q (states Sp0 to Sp3 and Sq0 to Sq3) with messages m1, m2, m3]

Recorded Global State = ((Sp2, Sq1), (∅, m3))

Computation may not even have passed through the state recorded!

Page 53: Global States in a Distributed System By John Kor and Yvonne Cheng

What have we recorded

The recorded consistent state can be anything!

Page 54: Global States in a Distributed System By John Kor and Yvonne Cheng

Properties of the recorded global state

Si : global state when the algorithm starts

Sj : global state when the algorithm finishes

S*: state recorded by the algorithm

Then:
- S* is reachable from Si
- Sj is reachable from S*

Page 55: Global States in a Distributed System By John Kor and Yvonne Cheng

S* Is reachable from Si

[Figure: global states Si and Sj]

Page 56: Global States in a Distributed System By John Kor and Yvonne Cheng

Sj Is reachable from S*

[Figure: global states Si and Sj]

Page 57: Global States in a Distributed System By John Kor and Yvonne Cheng

Still what good is it?

Stable Properties

A property Y is called a stable property iff, for all states S’ reachable from S, Y(S) -> Y(S’)

Page 58: Global States in a Distributed System By John Kor and Yvonne Cheng

Detection of Stable Properties

outcome = false;
while ( outcome == false )
{
    determine Global State S;   // e.g. by running the snapshot algorithm
    outcome = Y (S);            // Y is the stable property being tested
}
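The loop above translates almost directly into runnable code. In this sketch, `take_snapshot` and `holds` are hypothetical callables standing in for a real snapshot algorithm and for the stable property Y. The loop is sound because Y is stable and Sj is reachable from S*: if Y holds in a recorded state, it still holds when the algorithm finishes.

```python
def detect_stable_property(take_snapshot, holds):
    """Take snapshots repeatedly until the stable property holds in one of them."""
    while True:
        state = take_snapshot()
        if holds(state):
            return state


# Stand-in for a snapshot algorithm: object x loses its last reference by the third snapshot.
snapshots = iter([{"refs_to_x": 2}, {"refs_to_x": 1}, {"refs_to_x": 0}])

found = detect_stable_property(
    take_snapshot=lambda: next(snapshots),
    holds=lambda s: s["refs_to_x"] == 0,   # "x is garbage" is a stable property
)
```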

Page 59: Global States in a Distributed System By John Kor and Yvonne Cheng

Checkpoint

S* serves as a checkpoint

On a failure, restart the computation from S*

Problem! Not able to restore to Sj

[Figure: global states Si, Sj, and S*]

Page 60: Global States in a Distributed System By John Kor and Yvonne Cheng

Solution: Publishing

A Broadcast medium

A central recorder process records all the messages received by each process

Processes record their states at their own time and send them to the recorder

Page 61: Global States in a Distributed System By John Kor and Yvonne Cheng

Determining Global State

Recorder can construct global state from Checkpointed States of all processes

Plus Messages recorded since last checkpoint

Page 62: Global States in a Distributed System By John Kor and Yvonne Cheng

Problems

Publishing keeps track of all messages received by each process

Expensive!

Solution
- recorder takes checkpoint of process p at time t
- deletes all messages received by p before t.
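The recorder's bookkeeping can be sketched as follows. This assumes that a process's state is fully determined by its last checkpoint plus the messages it received afterwards, so the log can be replayed; the class `Recorder`, its method names, and the `apply` callback are hypothetical.

```python
class Recorder:
    def __init__(self):
        self.checkpoints = {}   # process name -> last checkpointed state
        self.logs = {}          # process name -> messages received since that checkpoint

    def publish(self, receiver, msg):
        """Log a message that was delivered to `receiver`."""
        self.logs.setdefault(receiver, []).append(msg)

    def checkpoint(self, process, state):
        """Store a new checkpoint and drop the messages received before it."""
        self.checkpoints[process] = state
        self.logs[process] = []

    def restore(self, process, apply):
        """Rebuild the current state by replaying logged messages onto the checkpoint."""
        state = self.checkpoints[process]
        for msg in self.logs.get(process, []):
            state = apply(state, msg)
        return state


rec = Recorder()
rec.checkpoint("p", 0)
rec.publish("p", 5)
rec.publish("p", 3)
restored = rec.restore("p", lambda state, msg: state + msg)
rec.checkpoint("p", restored)   # the two logged messages can now be deleted
```

Taking a fresh checkpoint bounds the size of the log, which is the cost reduction described above.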

Page 63: Global States in a Distributed System By John Kor and Yvonne Cheng

Comparison

                 SNAPSHOT              PUBLISHING
Network          Strongly connected    Need not be
Mode             Distributed           Centralized
Scalability      Yes                   No
Restorability    No                    Yes

Page 64: Global States in a Distributed System By John Kor and Yvonne Cheng

Conclusion

Global State detection is difficult in Distributed Systems

Snapshot algorithm may not give an actual state but is very helpful in detecting Stable Properties

Publishing gives an asynchronous way of determining global states but is not scalable