On Fault Tolerance in Wireless Ad Hoc Networks Seth Gilbert Nancy Lynch Celebration, 2008


Nancy Lynch

Through the years…

[Labels: 1994; late 1980’s??; 1997; 2002-2008]

[Timeline: 1980, 1984, 1988, 1992, 1996, 2000, 2004, 2008]

FLP: Impossibility of distributed consensus with one faulty process

DLS: Consensus in the Presence of Partial Synchrony

LT: An Introduction to Input / Output Automata

Fault tolerance

Replication

Consistency

Formal Methods

Simulation Relations, Invariant-based Arguments

Timing

Increasingly complex, increasingly dynamic:

• Group communication / membership

• Publish / Subscribe

• Peer-to-peer systems

• Wireless ad hoc networks

The Virtual Infrastructure Project


Papers:

GeoQuorums: Implementing Atomic Memory in Mobile Ad Hoc Networks, DGLSW, DISC’03, DC’05

Virtual Mobile Nodes for Mobile Ad Hoc Networks, DGLSSW, DISC’03

Consensus and Collision Detectors in Wireless Ad Hoc Networks, CDGNN, PODC’05, DC’08

Timed Virtual Stationary Automata for Mobile Networks, DGLLN, Allerton’05, OPODIS’05

Autonomous Virtual Mobile Nodes, DGSSW, DIALM-POMC’05

A Middleware Framework for Robust Applications in Wireless Ad Hoc Networks, CDGN, Allerton’05

Reconciling the theory and practice of unreliable wireless broadcast, CDGLNN, ADSN’05

Self-Stabilizing Mobile Node Location Management and Message Routing, DLLN, SSS’05

Motion Coordination Using Virtual Nodes, LMN, CDC’05

The Virtual Node Layer: A Programming Abstraction for Wireless Sensor Networks, BGLNNS, WWWSNA’07

A Virtual Node-Based Tracking Algorithm for Mobile Networks, NL, ICDCS’07

Self-stabilization and Virtual Node Layer Emulations, NL, SSS’07

Secret Swarm Unit: Reactive k-Secret Sharing, DLY, IndoCrypt’07

Virtual Infrastructure for Collision-Prone Wireless Networks, CGL, PODC’08

Theses:

Virtual Infrastructure for Wireless Ad Hoc Networks, G, PhD 2007

Air Traffic Control Using Virtual Stationary Automata, B, MEng 2007

Simulation and Evaluation of the Reactive Virtual Node Layer, S, MEng 2008

Virtual Stationary Timed Automata for Mobile Networks, N, PhD 2008

In Progress:

Self-Stabilizing Robot Formations over Unreliable Networks, GLMN

Using Virtual Infrastructure to Adapt Wireline Protocols to MANET, W

Virtual Infrastructure Routing for Mobile Ad Hoc Networks, DN

Wireless Ad Hoc Networks

Scenarios:

• Sensor networks
— environmental monitoring
— intrusion detection
— border monitoring
— fire detection

• Social networks
— messaging
— conferences / events
— HikingNet
— TrafficNet

• Coordinated applications: emergency response & military
— firefighting
— police response
— terrorism

Wireless ad hoc networks are really hard to use.

• Unreliable communication
• Unknown availability
• Noise
• Collisions
• Dynamic
• Unknown participants
• Unknown topology
• Fault prone
• Lost messages

Fixed Infrastructure

Deploy:
— Base stations
— Cell towers
— Servers

Problems:
— Too expensive
— Not feasible

Virtual Infrastructure

[Figure labels: Unreliable / Reliable; Ad hoc net / Fixed net]

Network Layers

[Layer stack, top to bottom: Application; Service, Service; Middleware; Wireless Ad Hoc Network]

Network Layers

[Layer stack, top to bottom: Application; Routing, Tracking; Virtual Infrastructure; Wireless Ad Hoc Network]

Building Virtual Infrastructure

Basic idea: replicated state machine


1. Each participant is a replica.

2. Replicas execute a consistency protocol.

3. Leader / backup

4. Leader sends & receives messages for the virtual node
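
The four steps above can be sketched in a few lines. This is a minimal single-process illustration, not the emulator from the talk: the names `Replica` and `VirtualNode`, the toy running-total state machine, and the lowest-id leader rule are all assumptions made for the example. Real replicas would exchange messages over an unreliable broadcast channel.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """One physical device emulating the virtual node (hypothetical model)."""
    device_id: int
    state: int = 0                      # toy state machine: a running total
    log: list = field(default_factory=list)

    def apply(self, msg: int) -> None:
        # Deterministic transition: replicas that apply the same messages
        # in the same order reach the same state.
        self.log.append(msg)
        self.state += msg


class VirtualNode:
    """Emulates one virtual node from the replicas currently in a region."""

    def __init__(self, replicas):
        self.replicas = {r.device_id: r for r in replicas}

    def leader(self) -> Replica:
        # Assumption for this sketch: the lowest device id leads.
        return self.replicas[min(self.replicas)]

    def receive(self, msg: int) -> int:
        # The leader receives on behalf of the virtual node; the leader and
        # all backups then apply the message in the same order.
        leader = self.leader()
        for replica in self.replicas.values():
            replica.apply(msg)
        # Only the leader sends the virtual node's reply.
        return leader.state

    def fail(self, device_id: int) -> None:
        # A device leaves or crashes; a backup takes over if it was leader.
        del self.replicas[device_id]
```

For example, with replicas 1, 2 and 3, the virtual node keeps answering after device 1 (the leader in this sketch) fails, because the backups hold the same state.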

Today’s Questions

1. What is virtual infrastructure?

2. What can you do with it?
— Dynamic distributed coordination.
— Air traffic control

3. Does it really work?
— Two simulation studies: routing and address allocation.

Dynamic Distributed Coordination

Challenging problem:

o Highly dynamic environment

o Unreliable network

o Safety-critical applications

Ideal for Virtual Infrastructure solution:

o Static overlay

o Simpler, verifiable algorithms

o Fate-sharing

Dynamic Distributed Coordination

Note:
• Number of (non-failed) robots unknown.
• Location of other robots unknown.
• Pattern may change over time.

Dynamic Distributed Coordination

In each round:

1. All robots stop.

2. All robots send location info.

3. Coordinators exchange info.

4. Coordinators calculate.

5. Coordinators send out targets.

6. Robots move to target.
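
The six steps of a round can be sketched as a single function. This is only an illustration under stated assumptions: positions are one-dimensional, regions are fixed-width intervals (`region_of` and `region_size` are hypothetical), the "info" the coordinators exchange is just a robot count per region, and the target calculation is passed in as a `calculate` function.

```python
from collections import defaultdict


def region_of(position, region_size=10.0):
    """Map a 1-D position to a region index (hypothetical tiling)."""
    return int(position // region_size)


def run_round(positions, calculate):
    """One round: report, exchange, calculate, send targets, move.

    positions: dict robot_id -> position (robots are stopped, step 1).
    calculate: function(region, members, summary) -> dict robot_id -> target.
    """
    # Step 2: each robot sends its location to its region's coordinator.
    reports = defaultdict(dict)
    for robot_id, pos in positions.items():
        reports[region_of(pos)][robot_id] = pos

    # Step 3: coordinators exchange info (here: robot counts per region).
    summary = {region: len(members) for region, members in reports.items()}

    # Steps 4-5: each coordinator calculates and sends out targets.
    targets = {}
    for region, members in reports.items():
        targets.update(calculate(region, members, summary))

    # Step 6: robots move to their targets; robots without a target stay.
    return {rid: targets.get(rid, pos) for rid, pos in positions.items()}


def stay_put(region, members, summary):
    """Trivial placeholder policy: every robot keeps its position."""
    return dict(members)
```

Nothing is carried over between calls to `run_round`: each round works only from the locations reported in that round.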

Dynamic Distributed Coordination

Calculating new targets

Rule 1: If only 1 robot, keep it.

Rule 2: If not on the curve and no neighbors on the curve: distribute evenly all but one.

Rule 3: If not on the curve: distribute among less populated neighbors on the curve.

Rule 4: If on the curve: distribute among less dense neighbors on the curve.

Rule 5: Distribute robots evenly on the curve in each region.
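
Rule 5 can be illustrated with a small helper. This is a sketch under a simplifying assumption that is not in the slides: the piece of the curve inside one region is approximated by a straight segment from `start` to `end`. The function name `spread_evenly` is hypothetical.

```python
def spread_evenly(robot_ids, start, end):
    """Assign evenly spaced targets on the segment from start to end.

    start, end: (x, y) endpoints of the curve piece inside one region,
    approximated here by a straight segment (an assumption of this sketch).
    """
    n = len(robot_ids)
    targets = {}
    for i, robot_id in enumerate(robot_ids):
        # Fractions 1/(2n), 3/(2n), ... keep robots off the region borders.
        t = (2 * i + 1) / (2 * n)
        targets[robot_id] = (
            start[0] + t * (end[0] - start[0]),
            start[1] + t * (end[1] - start[1]),
        )
    return targets
```

With two robots on a segment from (0, 0) to (8, 0), the targets are at x = 2 and x = 6, so the gaps between neighbors are equal within the region.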


Dynamic Distributed Coordination

Correctness

Step 1: Eventually, robots cease moving from regions “off the curve” to regions “on the curve”.

Step 2: If neighbor g is the most dense neighbor of u after time t, then u is less dense than g after time t+1.

Step 3: Eventually, robots remain always in the same region.

Dynamic Distributed Coordination

Self-stabilization

What happens when something goes wrong?

Too many lost messages

Too much churn

INCONSISTENT REPLICAS

Option 1: Design for the very, very worst case.

Option 2: Design a system that can recover from faults.

Emulating Virtual Infrastructure

Self-stabilization techniques

Leader Election:

o Heartbeats, timeouts

o Resolve leader competitions

Replica Consistency:

o Leader sends “checksums” of the state.

o If out-of-synch, then re-join.
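
The two techniques above can be sketched from the point of view of a backup replica. This is an illustration only: the class name `Backup`, the 3-second timeout, the SHA-256 digest, and the lowest-id rule for resolving leader competitions are assumptions, not details from the talk.

```python
import hashlib


def checksum(state: dict) -> str:
    """Digest of a replica's state; sorting items makes it order-independent."""
    return hashlib.sha256(repr(sorted(state.items())).encode()).hexdigest()


class Backup:
    """A non-leader replica monitoring the leader (hypothetical model)."""

    def __init__(self, device_id, timeout=3.0):
        self.device_id = device_id
        self.timeout = timeout        # assumed heartbeat timeout, in seconds
        self.last_heartbeat = 0.0
        self.state = {}
        self.joined = True

    def on_heartbeat(self, now, leader_checksum):
        self.last_heartbeat = now
        # Replica consistency: compare against the leader's checksum.
        if checksum(self.state) != leader_checksum:
            self.joined = False       # out of synch: must re-join

    def rejoin(self, leader_state):
        # Re-join by copying the leader's state wholesale.
        self.state = dict(leader_state)
        self.joined = True

    def leader_suspected(self, now):
        # No heartbeat within the timeout: start competing for leadership.
        return now - self.last_heartbeat > self.timeout


def resolve_competition(candidate_ids):
    """Resolve a leader competition; lowest id wins (an assumed rule)."""
    return min(candidate_ids)
```

A backup that starts in an arbitrary state is detected by the first heartbeat whose checksum does not match, and it recovers by re-joining rather than by trying to repair its state in place.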

Building Virtual Infrastructure

Self-stabilization claims

Assume that:

o A is a self-stabilizing algorithm.

o A is designed for the virtual infrastructure abstraction.

o A is executed with the emulator.

o The system begins in an arbitrary (corrupt) state.

Then if the system is eventually well-behaved:

o From some point on, the state of A is as if it had really executed on a fixed infrastructure.

Dynamic Distributed Coordination

Summary

Coordination algorithm is self-stabilizing.

o In each round, all state is recalculated.

o Underlying virtual infrastructure emulation is self-stabilizing.

Implications:

o Converges to changing curve.

o Recovers from network instability, lost messages, etc.

Dynamic Distributed Coordination

Additional comments

Tina Nolte, Virtual Stationary Timed Automata for Mobile Networks, PhD 2008

Dynamic Distributed Coordination

Air traffic control

Goal: Free Flight

o No flight plan, no control towers!

o Each pilot chooses a route independently.

o More efficient:
—Adapt to wind currents.
—Avoid turbulence / bad weather.

In the USA, minimum separation: 3 miles lateral distance OR 1000 feet altitude
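
The separation rule can be written as a small predicate. This is a sketch only: it assumes positions are given in a flat local frame as (x in miles, y in miles, altitude in feet), which is a simplification, and the function name `separated` is hypothetical.

```python
import math

MIN_LATERAL_MILES = 3.0     # from the slide
MIN_VERTICAL_FEET = 1000.0  # from the slide


def separated(a, b):
    """True if two aircraft satisfy the minimum separation rule.

    a, b: (x_miles, y_miles, altitude_feet) in a flat local frame
    (a simplifying assumption; real positions are geodetic).
    """
    lateral = math.hypot(a[0] - b[0], a[1] - b[1])
    vertical = abs(a[2] - b[2])
    # Either condition alone is enough (the slide's "OR").
    return lateral >= MIN_LATERAL_MILES or vertical >= MIN_VERTICAL_FEET
```

Two aircraft about 1.4 miles apart laterally are still separated if their altitudes differ by 1000 feet, but not if they differ by only 500 feet.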

Dynamic Distributed Coordination

Additional comments

Matthew D. Brown, Air Traffic Control Using Virtual Stationary Automata, MEng, 2008

Today’s Questions

1. What is virtual infrastructure?

2. What can you do with it?
— Dynamic distributed coordination.

3. Does it really work?
— Two simulation studies.

Simulating Virtual Infrastructure

Study #1 — Routing / Geocast

— Custom-built simulator (Python)

— Simple communication model

Study #2 — Address allocation (i.e., DHCP)

— ns2 simulator

— 802.11 MAC layer

GeoCast

Location-based routing

Source Destination


Location Service

Store current location at home

[Figure labels: Target; geocast; geocast; hash(id, 1); hash(id, 2)]

Location Service

Where are you?

[Figure labels: Source; Target; geocast; hash(id, 1); hash(id, 2)]

Routing

Point-to-point communication

Two step process:

1. Lookup destination location.

2. Geocast message to destination’s region.
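
The location service and the two-step routing process can be sketched together. This is an in-memory illustration, not the simulator's code: the 4 x 4 region grid, the SHA-256 based `home_region` (standing in for the slides' hash(id, k)), and the `LocationService` class are all assumptions.

```python
import hashlib

GRID = 4  # assumed 4 x 4 grid of regions, for illustration only


def home_region(node_id: str, k: int) -> tuple:
    """Sketch of hash(id, k): map a node id to its k-th home region."""
    digest = hashlib.sha256(f"{node_id}:{k}".encode()).digest()
    return (digest[0] % GRID, digest[1] % GRID)


class LocationService:
    """In-memory stand-in for the virtual nodes that store locations."""

    def __init__(self, copies=2):
        self.copies = copies
        self.store = {}   # region -> {node_id: last reported region}

    def update(self, node_id, current_region):
        # The node geocasts its location to each of its home regions.
        for k in range(1, self.copies + 1):
            home = home_region(node_id, k)
            self.store.setdefault(home, {})[node_id] = current_region

    def lookup(self, node_id):
        # Step 1 of routing: ask the home regions "where are you?".
        for k in range(1, self.copies + 1):
            entry = self.store.get(home_region(node_id, k), {})
            if node_id in entry:
                return entry[node_id]
        return None


def route(service, dest_id, payload):
    """Two-step routing: look up the destination, then geocast to it."""
    region = service.lookup(dest_id)
    if region is None:
        return None                      # destination unknown
    return ("geocast", region, payload)  # step 2, handed to GeoCast
```

Because the home regions depend only on the node's id, a sender can find them without knowing where the destination currently is.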

Simulation Setup

Basic settings

[Figure labels: 400 m, 400 m, 250 m]

Number of devices:
• 25 / 50 / 100

Velocity:
• 0-20 meters / second

Mobility model:
• Random waypoint
• Pause time: 100-900s

Simulation time:
• 1000 seconds

Simulation Setup

Application settings

[Figure labels: 400 m, 400 m, 250 m]

GeoCast:
• 10 send/receive pairs
• 1 msg every 5 secs

Routing:
• 10 send/receive pairs
• 1 msg every 0.5 secs
• 15 second simulation

Mobility and Density

[Plot: Percent of Time Non-Failed (axis 20% to 100%) vs. Pause Time (200 to 800); series: 25 devices, 50 devices, 100 devices]

When density is sufficient, virtual nodes work.

Leadership Changes

[Plot: Leadership Changes per Region (axis 2 to 10) vs. Pause Time (200 to 800); 100 devices]

There is continuous turn-over in the leader.

Message Overhead

[Plot: Messages per Region per second (axis 0.01 to 0.5) vs. Pause Time (200 to 800); series: Heartbeat, Join, Leader]

Most overhead is heartbeats. (Overhead is negligible.)

Geocast Latency Overhead

VN-GeoCast is 2-3 times slower than simple GeoCast.

[Plot: Latency (in seconds) (axis 0.1 to 0.5) vs. Pause Time (200 to 800); 100 devices; series include simple Geocast]

Routing

End-to-end performance

Delivery Rate: 79%
Median Latency: 0.46 seconds
Average Latency: 0.58 seconds

Each message requires 3 GeoCast messages.

** devices=50, pausetime=400

Simulation Summary

Virtual nodes are stable if:
— sufficient density (e.g., 4/region), OR
— low-enough churn

Message overhead: negligible.

GeoCast latency overhead: factor of 2.

Routing: relatively slow.

Simulation Summary

Additional comments

Mike Spindel, Simulation and Evaluation of the Reactive Virtual Node Layer, MEng 2008


Address Allocation

Basic problem

— Mobile devices join and leave.

— Each device needs an address.

— Addresses should be assigned dynamically.

— Addresses should be unique.

Address Allocation

Challenges: Highly dynamic. No central authority. Unreliable network. Limited address pool.

Simple Scheme

Each region is allocated a cache of addresses.

Basic protocol:
— Client sends REQUEST
— Server replies OFFER
— Client sends ACQUIRE
— Server replies ACK

Renew protocol:
— Client sends RENEW
— Server replies RACK
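
The server side of the basic and renew protocols can be sketched as a small message handler. This is an illustration, not the code used in the study: the class name `AddressServer`, the `handle` and `expire` methods, and the choice to ignore unexpected messages are assumptions. Message forwarding between regions is left out, and the 400-second default matches the lease time listed in the simulation setup.

```python
class AddressServer:
    """Virtual-node side of the protocol for one region (a sketch)."""

    def __init__(self, addresses, lease_time=400):
        self.free = list(addresses)   # this region's cache of addresses
        self.offered = {}             # client -> address offered, not yet acquired
        self.leases = {}              # client -> (address, expiry time)
        self.lease_time = lease_time

    def handle(self, msg, client, now):
        if msg == "REQUEST":
            if client not in self.offered:
                if not self.free:
                    return None               # pool exhausted: no reply
                self.offered[client] = self.free.pop(0)
            return ("OFFER", self.offered[client])
        if msg == "ACQUIRE" and client in self.offered:
            address = self.offered.pop(client)
            self.leases[client] = (address, now + self.lease_time)
            return ("ACK", address)
        if msg == "RENEW" and client in self.leases:
            address, expiry = self.leases[client]
            if now <= expiry:
                self.leases[client] = (address, now + self.lease_time)
                return ("RACK", address)
        return None                           # unexpected message: ignore

    def expire(self, now):
        # Return lapsed leases to the pool so addresses can be reused.
        for client, (address, expiry) in list(self.leases.items()):
            if now > expiry:
                del self.leases[client]
                self.free.append(address)
```

An address leaves the free list as soon as it is offered, so two clients are never offered the same address at once. A repeated REQUEST from the same client gets the same OFFER again, which tolerates a lost reply.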

Message forwarding…

[Figure: messages between Client and Virtual Node: REQUEST, OFFER, ACQUIRE, ACK, RENEW, RACK]

Simulation Setup

Basic settings

[Figure labels: 700 m, 700 m, 250 m]

Number of devices:
• 160

MAC Layer:
• 802.11
• Models collisions

Mobility model:
• Random waypoint

Simulation time:
• 40000 seconds

Simulation Setup

Application settings

[Figure labels: 700 m, 700 m, 250 m]

Number of addresses: 30 per region

Lease time: 400 seconds

Forwarding limit:
• 2 hop - REQUEST
• 2 hop - RACK
• Varying - RENEW

Simulation Setup

Simulation settings

                        Very Slow  Slow   Medium Slow  Medium Fast  Fast
Min. Speed (m/s)        0.365      0.73   1.46         2.92         7.3
Max. Speed (m/s)        1.48       2.92   5.84         11.68        29.2
Average Pause Time (s)  4400       2200   1100         550          220
Average Cross Time (s)  82.20      41.10  20.55        10.27        4.11

Message Overhead

                        Messages per 400 secs  Percent
Heartbeats              360                    76
Leader Request          24                     5
Leader Reply            50                     11
Synch-Request           20                     4
Synch-Reply             20                     4
Total Message Overhead  474

Maximum observed: Less than 2-4.5kbps
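
As a quick consistency check on the overhead figures, the percentages and the average message rate can be recomputed from the counts. The counts and the 400-second window come from the slide; the variable names are only for illustration. The bandwidth figure is not recomputed because message sizes are not given.

```python
counts = {
    "Heartbeats": 360,
    "Leader Request": 24,
    "Leader Reply": 50,
    "Synch-Request": 20,
    "Synch-Reply": 20,
}
WINDOW_SECONDS = 400

total = sum(counts.values())
# Share of each message type, rounded to a whole percent.
shares = {name: round(100 * n / total) for name, n in counts.items()}
rate = total / WINDOW_SECONDS  # average emulator messages per second

print(total)   # 474
print(shares)  # {'Heartbeats': 76, 'Leader Request': 5, 'Leader Reply': 11, 'Synch-Request': 4, 'Synch-Reply': 4}
print(rate)    # 1.185
```

The recomputed shares match the Percent column, and the total works out to a little over one emulator message per second.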

Message Overhead

Different speeds

[Chart: other emulator messages per region (axis 0 to 6000) at different speeds (very slow, slow, medium slow, medium fast, fast); series: LeaderRequest msgs/region, LeaderReply msgs/region, SYN_REQUEST msgs/region, SYN_ACK msgs/region]

Message Overhead

Different densities

[Chart: other emulator messages per node (axis 0 to 10000) vs. total number of nodes (40 to 120); series: LeaderRequest msgs per region, LeaderReply msgs per region, SYN_REQUEST msgs per region, SYN_ACK msgs per region]

Protocol Performance

Different speeds

[Chart: messages per region (axis 0 to 10000) at different speeds (very slow, slow, medium slow, medium fast, fast); series: allocations per client, messages per region]

Protocol Performance

Renewal cost

[Chart: delay per renewal (axis 0 to 0.09) at different speeds (very slow, slow, medium slow, medium fast, fast); series: renewal delay]

Simulation Summary

Message overhead: still negligible.
— Even with collisions…
— Backoff…
— Bigger simulations…

Simple address allocation scheme:
— Reasonably efficient…
— Scales well…

Simulation Summary

Additional comments

Jiang Wu, Using Virtual Infrastructure to Adapt Wireline Protocols to MANET

Summary

What is virtual infrastructure?

Dynamic distributed coordination

Robotic motion coordination

Self-stabilization

(Preliminary) simulation results.

The Virtual Infrastructure Project

Distributed Algorithms

Focus on fault-tolerance

— Replication

— Consistency

— Agreement

Design principles

— Abstraction / layered design

— IOA / TIOA formalism

Classical techniques, modern networks

Seth Gilbert

George Varghese

Boaz Patt-Shamir

Jennifer Welch

Brian Coan

Kenneth Goldman

Shinya Umeno

Alex Cornejo

Mark Tuttle

Joshua Tauber

Eugene Stark

Rainer Gawlick

Alan Fekete

Victor Luchangco

Roberto Segala

Rui Fan

Tina Nolte

Sayan Mitra

Calvin Newport

Carl Lividas

Jim Burns

Roger Khazan

Roberto DePrisco

Congratulations, Nancy, and thank you!!

The End
