LSRP: Local Stabilization in Shortest Path Routing

Anish Arora Hongwei Zhang

Motivations

Local fault containment is important in large-scale systems: stability, availability, and scalability

Self-stabilization is desirable in the presence of unanticipated faults

Even simple faults (such as node crash and message loss) can drive a network protocol into arbitrary states

Local containment and local self-stabilization in routing remain unsolved

Only distance-vector (D-V) routing is considered: RIP, BGP (path-vector), DSDV, AODV, …

Outline

Network and fault model

Definitions & problem statement

LSRP design & analysis

Related work

Summary

Network model

A network is a connected graph G=(V, E)

Each node has a unique ID

There is a clock at each node, with a single constraint: the ratio of clock rates between any two neighboring nodes is bounded from above by alpha (regardless of the absolute values)

Fault model

Fail-stop: node and link

Join: node and link

State corruption




Definitions

Perturbation size

Range of contamination

F-local stabilizing

problem-specific & algorithm-independent

algorithm-dependent

Perturbation size: definition

Problem-specific variables, e.g., “next-hop” in routing

The perturbation size at a network state q, denoted P(q), is the minimum number of up nodes at which transient faults have occurred or whose problem-specific variables must change for the network to stabilize to a legitimate state

It characterizes the minimum amount of work needed for a network to stabilize

* See the paper for formal definition
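
As an informal illustration of the definition above, the sketch below counts perturbed nodes for shortest path routing under simplifying assumptions that are not from the slides: unit link weights, no failed nodes, and a per-node state (d, p) that is legitimate exactly when d is the true hop distance and p is a neighbor one hop closer to the destination. The function names and the 4-node line network are hypothetical.

```python
from collections import deque

def bfs_distances(adj, root):
    """Hop distance from every node to root in an unweighted graph."""
    dist = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def perturbation_size(adj, root, state):
    """Count nodes whose (d, p) must change to reach a legitimate state.

    Assumption: state[i] = (d, p); the root is legitimate when it holds
    (0, root), any other node when d is its true hop distance and p is a
    neighbor that is one hop closer to the root.
    """
    dist = bfs_distances(adj, root)
    perturbed = 0
    for i, (d, p) in state.items():
        if i == root:
            ok = (d == 0 and p == root)
        else:
            ok = (d == dist[i] and p in adj[i] and dist[p] == dist[i] - 1)
        if not ok:
            perturbed += 1
    return perturbed

# Hypothetical line network 0 - 1 - 2 - 3 with destination 0.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
legit = {0: (0, 0), 1: (1, 0), 2: (2, 1), 3: (3, 2)}
corrupted = dict(legit)
corrupted[2] = (0, 2)  # node 2 wrongly claims to be the destination

print(perturbation_size(adj, 0, legit))      # legitimate state
print(perturbation_size(adj, 0, corrupted))  # single corrupted node
```

Under these assumptions only node 2 needs to change, even though its corrupted value could mislead its neighbors if left uncontained.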

Perturbation size: examples

[Figure: three example network states, with perturbation sizes 0, 1, and 3]


Range of contamination

When a network self-stabilizes to a legitimate state q’ from an arbitrary state q, the range of contamination during stabilization is the maximum distance from any node that has changed state at least once during stabilization, but whose state is the same at q and q’, to the set of nodes whose state changes from q to q’

[Figure: network G with the range of contamination labeled Rc]

F-local stabilizing

A network is F-local stabilizing if, starting at an arbitrary state q, the network self-stabilizes to a legitimate state within F(P(q)) time, where F is a function and P(q) is the perturbation size at state q.

“A network is F-local stabilizing” implies that the range of contamination during stabilization is O(F(P(q))).

Problem statement: local stabilization in shortest path routing

Design a protocol that, given a network G=(V, E) and a destination node r, constructs and maintains a spanning tree T (called the shortest path tree) of G such that

r is the root of T

for every node i ∈ V, the path from i to r in T is a shortest path between i and r in G

the network is F-local stabilizing
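
To make the first two requirements concrete, here is a minimal centralized sketch (not the distributed protocol) that builds a shortest path tree by breadth-first search, assuming unit link weights. The names `shortest_path_tree` and `path_to_root` and the 4-node example network are hypothetical.

```python
from collections import deque

def shortest_path_tree(adj, r):
    """Return {node: (distance, parent)} for a BFS tree rooted at r."""
    tree = {r: (0, r)}  # the root is its own parent
    queue = deque([r])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in tree:
                tree[v] = (tree[u][0] + 1, u)
                queue.append(v)
    return tree

def path_to_root(tree, i):
    """Follow parent pointers from i until reaching the root."""
    path = [i]
    while tree[path[-1]][1] != path[-1]:
        path.append(tree[path[-1]][1])
    return path

# Hypothetical network: r is adjacent to a and b, and c is adjacent to a and b.
adj = {"r": ["a", "b"], "a": ["r", "c"], "b": ["r", "c"], "c": ["a", "b"]}
tree = shortest_path_tree(adj, "r")
print(path_to_root(tree, "c"))
```

The third requirement, F-local stabilization, concerns how the tree is repaired after faults and is not captured by this static construction.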





Fault propagation in existing D-V protocols

[Figure: example network illustrating fault propagation in existing D-V protocols]

LSRP design

The cause of fault propagation: the “correction” action always lags behind the “fault propagation” action

Solution:

the “source of fault propagation” (such as node 8) detects the fault propagation, and initiates a “containment” action that catches up with and stops the “fault propagation” action

avoid forming cycles during stabilization, and remove existing cycles fast

Approach: layering of diffusing waves

Use three diffusing waves such that

Each diffusing wave has a different propagation speed; speed is controlled by introducing delay in action execution

A mistakenly initiated layer-i wave Wi is contained, and prevented from propagating unbounded, by a layer-(i+1) wave that is initiated at the same node that initiated Wi

The top-layer wave self-stabilizes locally upon perturbations

Specifically,

Stabilization wave: speed V0

Containment wave: speed V1 > V0

Super-containment wave: speed V2 > V1 > V0

Stabilization wave

Implements the basic distributed Bellman-Ford algorithm, with slight changes to interact with the containment wave (no interaction with the super-containment wave)

Variables: (p.i, d.i) for each node i

Actions:

<S1> :: (i is the dest. node ∨ i initiated a cont. wave) ∧ p.i ≠ i → p.i := i

[]

<S2> :: i prop. SW from j ∧ j is not in CW → d.i, p.i := d.j+1, j; ghost.i := false

Can be mistakenly initiated and cause fault propagation, and thus calls for a containment wave
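
The fault propagation mentioned above can be seen in a small simulation of plain distributed Bellman-Ford. This is a simplified sketch, not the LSRP stabilization wave itself: it assumes synchronous rounds and unit link weights, and it omits the ghost variable and the “j is not in CW” guard. The function names and the 5-node line network are hypothetical.

```python
def bellman_ford_round(adj, dest, state):
    """One synchronous round: each node adopts its best neighbor's d + 1."""
    new_state = {}
    for i in adj:
        if i == dest:
            new_state[i] = (0, i)
        else:
            # Pick the neighbor advertising the smallest distance (ties by ID).
            j = min(adj[i], key=lambda n: (state[n][0], n))
            new_state[i] = (state[j][0] + 1, j)
    return new_state

def stabilize(adj, dest, state, max_rounds=100):
    """Repeat rounds until the state stops changing."""
    for _ in range(max_rounds):
        nxt = bellman_ford_round(adj, dest, state)
        if nxt == state:
            break
        state = nxt
    return state

# Hypothetical line network 0 - 1 - 2 - 3 - 4 with destination 0.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
legit = {i: (i, max(i - 1, 0)) for i in adj}
corrupted = dict(legit)
corrupted[3] = (0, 3)  # node 3's distance is corrupted to 0

after_one = bellman_ford_round(adj, 0, corrupted)
print(after_one[2], after_one[4])  # both neighbors now route toward node 3
print(stabilize(adj, 0, corrupted) == legit)
```

After one round, node 3 has corrected itself, but nodes 2 and 4 have already adopted the corrupted distance: the correction lags one step behind the fault.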

[Figure: stabilization wave propagating at speed V0; delay interval [ds + L, ds + U]]

Containment wave

Prevents a mistakenly initiated stabilization wave from propagating faults unbounded

Additional variable: ghost.i for each node i

Actions:

<C1> :: ¬ghost.i ∧ (i is a source of fault prop. ∨ i prop. CW from p.i) → ghost.i := true; if i is a source of fault prop. → p.i := i fi

[]

<C2> :: ghost.i ∧ no other node using the corrupted state of i → ghost.i := false; set (d.i, p.i)

Catches up with and stops the corresponding stabilization wave

Can be mistakenly initiated, and thus calls for a super-containment wave

[Figure: containment wave at speed V1 following the stabilization wave at speed V0; delay interval [dc + L, dc + U]]

* ds > max{alpha * dc, dc + U - L}

* initiated by a node s which is a source of fault propagation

* propagates along the same paths as the stabilization wave Ws that propagates the fault at s

Super-containment wave

Prevents a mistakenly initiated containment wave from propagating unbounded

No additional variables needed (stateless)

Action:

<SC> :: ghost.i ∧ (i is not a source of fault prop. ∧ p.i is not in CW) → ghost.i := false

Catches up with and stops the corresponding containment wave

Self-stabilizes locally

stateless: trivial stabilization (no action needed)

no unbounded propagation: constrained by the range of the containment wave (which is a function of perturbation size)

[Figure: super-containment wave (speed V2), containment wave (speed V1), and stabilization wave (speed V0); delay interval [dsc + L, dsc + U]]

* dc > max{alpha * dsc, dsc + U - L}

* propagates along the same paths as Wc

* initiated by a node which has “mistakenly” initiated a containment wave Wc
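
The two delay constraints stated on the slides, ds > max{alpha * dc, dc + U - L} and dc > max{alpha * dsc, dsc + U - L}, can be checked with a few lines of arithmetic. The helper below is a sketch; the function name and the numeric values are illustrative assumptions, and alpha, U, and L are treated simply as given parameters.

```python
def delays_satisfy_constraints(ds, dc, dsc, alpha, U, L):
    """Check both layering constraints on the three wave delays.

    ds, dc, dsc: delays of the stabilization, containment, and
    super-containment waves; alpha, U, L: parameters as on the slides.
    """
    stab_slower_than_cont = ds > max(alpha * dc, dc + U - L)
    cont_slower_than_super = dc > max(alpha * dsc, dsc + U - L)
    return stab_slower_than_cont and cont_slower_than_super

# Hypothetical values: alpha = 1.5, U = 3, L = 1, dsc = 1.
# Then dc must exceed max(1.5, 3) = 3, and with dc = 4,
# ds must exceed max(6, 6) = 6.
print(delays_satisfy_constraints(ds=7, dc=4, dsc=1, alpha=1.5, U=3, L=1))
print(delays_satisfy_constraints(ds=6, dc=4, dsc=1, alpha=1.5, U=3, L=1))
```

The second call fails because the inequality is strict: ds = 6 does not exceed the bound of 6.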

Example revisited

[Figure: the fault-propagation example network, revisited under LSRP]

C1 enabled at node 8

S2 enabled at nodes 6 and 5

C1 executed at node 8 first, which disables S2 at nodes 6 and 5

C2 executed at node 8, and the network self-stabilizes

Protocol analysis

LSRP is F-local stabilizing, where F is a linear function:

starting at an arbitrary state q0, a network reaches a state where the shortest path tree is formed within O(P(q0)) time

the range of contamination is O(MAXP), where MAXP denotes the number of nodes in the largest perturbed region at q0 and is no greater than P(q0)

perturbed regions that are far away from one another (i.e., half-distance is ω(MAXP)) self-stabilize in parallel

Quick loop removal: existing loops are removed within a small constant (i.e., dsc+U) time

Loop freedom: no new loop is formed during stabilization
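
For readers who want to check the loop properties on a concrete state, the following sketch finds the nodes that lie on a routing loop, given each node's next hop. It is a hypothetical checking helper, not part of LSRP; it assumes the destination points to itself and that every next hop is a node in the map.

```python
def nodes_on_loops(parent):
    """Return the set of nodes that lie on a cycle of next-hop pointers.

    parent[i] is i's next hop; a node pointing to itself (the destination)
    is not counted as a loop.
    """
    on_loop = set()
    for start in parent:
        seen = []
        i = start
        while i not in seen and parent[i] != i:
            seen.append(i)
            i = parent[i]
        if i in seen:  # the walk re-entered its own path: a cycle
            on_loop.update(seen[seen.index(i):])
    return on_loop

# Hypothetical states: a loop-free tree, and one with the loop 1 -> 2 -> 3 -> 1.
tree = {0: 0, 1: 0, 2: 1}
looped = {0: 0, 1: 2, 2: 3, 3: 1, 4: 3}
print(nodes_on_loops(tree))
print(nodes_on_loops(looped))
```

Node 4 in the second state routes into the loop but is not itself on it, so it is not reported.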





Related work

Ghosh, Gupta, Herman, and Pemmaraju (PODC ’96) [4]
Algorithms for locally containing a single state corruption during stabilization of a shortest path tree
Does not deal with cases of multiple faults, or with node or link fail-stop

Ghosh and He (WSS ’99) [5]
Fault-containing self-stabilizing algorithm for a consensus problem
Only considers the case of linear topology, and the range of contamination can be exponential in the perturbation size

Zhang and Arora (PODC ’02) [16]
Local stabilizing algorithm for clustering and shortest path routing in wireless sensor networks
The approach is based on different model assumptions: dense node distribution, and knowledge of geometric information





Conclusion

Formulated concepts of perturbation size, range of contamination, and F-local stabilization

Designed LSRP for linear-local stabilization in shortest path routing

Quick loop removal and loop freedom are automatically guaranteed by local stabilization

Faults are regarded as state corruption, and dealt with by way of self-stabilization