Consensus Routing: The Internet as a Distributed System
2009. 2. 26
John P. John, Ethan Katz-Bassett, Arvind Krishnamurthy, and Thomas Anderson
Presented by John P. John
Modified by Moonyoung Chung
Contents
– Introduction
– Motivation and Goals
– Consensus Routing
  • Stable Mode
  • Transient Mode
– Evaluation
– Conclusions
NSDI '08
Internet Routing
A goal of the Internet is global reachability
But BGP fails to achieve this goal
– Physical paths exist, but not BGP paths
– 10-15% of BGP updates cause loops and blackholes
– 90% of all packet losses on the Internet are due to loops
BGP
Opaque policy routing
– Preferred routes visible to neighbors
– Underlying policies not visible and under local control
Mechanism:
– Autonomous Systems (ASes) send preferred path to neighbors
– If AS receives new path, start using right away
– Forward path to neighbors, after some delay
– Path eventually propagates to all ASes
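The mechanism above can be sketched as a small path-vector simulation. This is a minimal illustration, not BGP itself: the function name `propagate`, the five-AS adjacency map, and the use of shortest AS path as a stand-in for each AS's private policy are all assumptions made for the example, and the advertisement delay is not modeled.

```python
from collections import deque


def propagate(neighbors, destination):
    """Spread path announcements from destination; each AS keeps its shortest loop-free path."""
    best = {destination: (destination,)}
    queue = deque([destination])
    while queue:
        sender = queue.popleft()
        for receiver in neighbors[sender]:
            if receiver in best[sender]:
                continue  # the path already contains the receiver, so it is rejected
            candidate = (receiver,) + best[sender]
            if receiver not in best or len(candidate) < len(best[receiver]):
                best[receiver] = candidate  # start using the new path right away
                queue.append(receiver)      # then forward it to the neighbors
    return best


# Hypothetical adjacency map for a five-AS topology; AS5 is the destination.
neighbors = {1: [2, 5], 2: [1, 3, 4], 3: [2, 4], 4: [2, 3, 5], 5: [4, 1]}
best = propagate(neighbors, 5)
print(best[2], best[3], best[1])
```

Every AS ends up with a path, but nothing in the loop tells an AS when its neighbors have caught up.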
BGP link failure
[Figure: topology of ASes 1 to 5 with AS5 as the destination, annotated with candidate paths to AS5 such as 4-5, 3-4-5, 2-4-5 and 1-5.]
Link 4-5 fails
AS4 withdraws the path from upstream ASes
BGP link failure
AS2 and AS3 pick their next best paths
Routing loop is formed!
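The loop in this example can be checked mechanically by following next hops. A minimal sketch, assuming the next-hop tables below: before the failure AS2 and AS3 both forward through AS4; afterwards AS2 falls back to its path 3-4-5 and AS3 to its path 2-4-5, which is how the path labels on the slide read. The function name `find_forwarding_loop` is illustrative only.

```python
def find_forwarding_loop(next_hop, source, destination):
    """Follow next hops from source. Return the list of ASes in the loop, or None."""
    visited = []
    current = source
    while current != destination:
        if current in visited:
            # Back at an AS already on the walk: that suffix of the walk is the loop.
            return visited[visited.index(current):]
        if current not in next_hop:
            return None  # no route at all (a blackhole), but not a loop
        visited.append(current)
        current = next_hop[current]
    return None  # reached the destination


# Assumed next hops toward AS5 (AS4 has no route after withdrawing its path).
before_failure = {1: 5, 2: 4, 3: 4, 4: 5}
after_failure = {1: 5, 2: 3, 3: 2}

loop_before = find_forwarding_loop(before_failure, 2, 5)
loop_after = find_forwarding_loop(after_failure, 2, 5)
print(loop_before, loop_after)
```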
BGP policy change
[Figure: topology of ASes 1 to 6 with AS5 as the destination, annotated with candidate paths to AS5 such as 4-5, 3-4-5, 2-4-5, 6-4-5 and 1-5.]
AS4 wants all traffic destined for AS5 to come through AS6
AS4 withdraws the path from AS2 and AS3
BGP policy change
AS2 and AS3 pick their next best paths
Routing loop is formed!
Lack of Consistency
The underlying cause of all these problems is inconsistent global state
– Link failures
– Traffic engineering
– Scheduled maintenance
– Link coming up
Protocol behavior is complex and unpredictable
No indicator of when the system has converged to a consistent state
Motivation and Goal
Goal:
– Networks that have high availability
Insight:
– Consistency is the key
Consensus Routing
Lesson from distributed system design:
– Decouple safety and liveness
Safety: Forwarding tables are always consistent and policy compliant, based on a consistent view of global state
Liveness: Routing system adapts to failures quickly and maintains high availability
Safety: Stable Mode
Problem: Inconsistent state
Solution:
– Apply updates only after they have reached all dependent ASes
– Apply updates synchronously across ASes
Stable Mode
Consistent view of global state
– Stable Forwarding Table (SFT) at kth epoch
1. Update log
2. Distributed snapshot
3. Frontier computation
4. SFT computation
5. View change
Update log
ASes compute and forward routes as before, but do not apply them to the forwarding table
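The update-log step can be pictured as follows. A minimal sketch, assuming a hypothetical `ASRouter` class in which a logged update is a (sequence number, destination, path) tuple; none of these names come from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class ASRouter:
    # destination -> AS path currently used for forwarding
    forwarding_table: dict = field(default_factory=dict)
    # updates received so far but not yet applied
    update_log: list = field(default_factory=list)
    next_seq: int = 0

    def receive_update(self, destination, path):
        """Record the update in the log; the forwarding table is left untouched."""
        self.update_log.append((self.next_seq, destination, path))
        self.next_seq += 1


router = ASRouter(forwarding_table={5: (4, 5)})
router.receive_update(5, (3, 4, 5))
print(router.forwarding_table, router.update_log)
```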
Distributed Snapshot
One or more nodes call for the (k+1)th distributed snapshot
Periodically, a distributed snapshot is taken
Updates in transit or still being processed are marked incomplete
Frontier Computation: Aggregation
* frontier: the most recent complete update at each AS
ASes send a snapshot report to the consolidators:
1. the saved sequence of updates
2. the set of incomplete updates
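A small sketch of how a frontier could be derived from snapshot reports. It assumes update IDs are opaque strings, that a report is a dict with the hypothetical keys `saved` and `incomplete`, and that the global incomplete set is the union of the reported sets. The union rule is a simplification for illustration only; in consensus routing the consolidators agree on this set by running a consensus algorithm.

```python
def aggregate_reports(reports):
    """Union the incomplete sets from all per-AS snapshot reports (assumed rule)."""
    incomplete = set()
    for report in reports.values():
        incomplete |= report["incomplete"]
    return incomplete


def frontier(saved_updates, incomplete):
    """Return the most recent saved update that is not incomplete, or None."""
    for update_id in reversed(saved_updates):
        if update_id not in incomplete:
            return update_id
    return None


# Hypothetical reports from AS2 and AS3; "u3" was still in transit at snapshot time.
reports = {
    2: {"saved": ["u1", "u2", "u3"], "incomplete": {"u3"}},
    3: {"saved": ["u1", "u4"], "incomplete": set()},
}
incomplete = aggregate_reports(reports)
frontiers = {asn: frontier(r["saved"], incomplete) for asn, r in reports.items()}
print(incomplete, frontiers)
```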
Frontier Computation: Consensus
Consolidators run a consensus algorithm to agree on the set of incomplete updates
Frontier Computation: Flood
Consolidators flood the incomplete set to all the ASes
SFT Computation & View Change
Details and proof of consistency in the paper
1. Run BGP, but don't apply the updates
2. Distributed Snapshot
3. Send info to consolidators
4. Consensus
5. Flood
Apply completed updates
Versioning, Garbage collection
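Applying completed updates can be sketched as below. This assumes each logged update is an (update ID, destination, path) tuple recording a route the AS has already selected, with `None` as the path for a withdrawal, and that incomplete updates are simply carried over to the next epoch. The function name `compute_sft` and these representations are assumptions for illustration, not the paper's data structures.

```python
def compute_sft(previous_sft, update_log, incomplete):
    """Apply complete logged updates, in order, to a copy of the previous SFT.

    Returns the new SFT and the updates left over for the next epoch.
    """
    sft = dict(previous_sft)
    remaining = []
    for update_id, destination, path in update_log:
        if update_id in incomplete:
            remaining.append((update_id, destination, path))  # not applied this epoch
        elif path is None:
            sft.pop(destination, None)  # withdrawal removes the route
        else:
            sft[destination] = path
    return sft, remaining


previous_sft = {5: (4, 5)}
update_log = [("u1", 5, (3, 4, 5)), ("u2", 5, (1, 5))]
new_sft, remaining = compute_sft(previous_sft, update_log, incomplete={"u2"})
print(new_sft, remaining)
```

Because the previous table is copied rather than modified, the old SFT stays usable until every AS switches to the new one.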
Mechanism
Other details in the paper:
– Transition between epochs
– Slow/unresponsive ASes
– Failed ASes
– Reintegration of failed ASes
– Provable safety and liveness properties
Transient Mode: Liveness
Problem: Upon link failure, need to wait until the path reaches everyone
Solution: Dynamically re-route around the failed link
– use existing techniques
  • Pre-computed backup paths
  • Deflection
  • Detour routing
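The fallback can be sketched as a per-packet forwarding decision. A minimal illustration, assuming the stable table maps a destination to one next hop, that backup next hops are pre-computed in preference order, and that failed links are known locally; the function name `choose_next_hop` and the mode labels are hypothetical.

```python
def choose_next_hop(self_as, destination, sft, backups, failed_links):
    """Use the stable next hop if its link is up, else the first usable backup."""
    primary = sft.get(destination)
    if primary is not None and (self_as, primary) not in failed_links:
        return primary, "stable"
    for candidate in backups.get(destination, []):
        if (self_as, candidate) not in failed_links:
            return candidate, "transient"
    return None, "unreachable"


# AS2 normally reaches AS5 through AS4, with AS3 and AS1 as assumed backups.
sft = {5: 4}
backups = {5: [3, 1]}
normal = choose_next_hop(2, 5, sft, backups, failed_links=set())
rerouted = choose_next_hop(2, 5, sft, backups, failed_links={(2, 4)})
print(normal, rerouted)
```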
Routing Deflection
[Figure: source S, ASes 1 to 3, and destination D.]
deflect packet to neighbor
traverse a different route
Detour Routing
[Figure: source S, destination D, ASes 3, 4 and 5, and a Tier 1 AS B reached through a tunnel.]
B is responsible for forwarding packets
Backup routes
Pre-computed failover paths
e.g. R-BGP, a scheme for pre-computing backup routes to each destination
BGP
[Figure: connectivity over time under BGP, on a scale from "Completely unreachable" to "Global reachability". Marked events: "Link failure (or other BGP event)" and "BGP converges to alternate path".]
Consensus Routing
[Figure: connectivity over time under consensus routing, on a scale from "Completely unreachable" to "Global reachability". Marked events: "Link failure (or other BGP event)", "Switch to transient routing" and "Snapshot".]
Evaluation
In the talk, answer the following:
– How does consensus routing affect connectivity?
– What is the traffic overhead?
Methodology
– Extensive simulations on realistic Internet-scale topologies
– An implemented XORP prototype
– Experiments on PlanetLab
Methodology
Fail each access link of each multi-homed stub AS
See what fraction of ASes are temporarily disconnected until convergence
23,390 ASes, 46,095 links
9,100 multi-homed stub ASes
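The measurement above can be sketched for a single failure case. A minimal illustration, assuming the pre-convergence forwarding state is given as one next hop per AS; the function names and the four-AS example are hypothetical and far smaller than the simulated topology.

```python
def delivers(next_hop, source, destination, failed_link):
    """True if a packet from source reaches destination without using the failed link."""
    seen = set()
    current = source
    while current != destination:
        if current in seen or current not in next_hop:
            return False  # forwarding loop or no route
        seen.add(current)
        hop = next_hop[current]
        if {current, hop} == set(failed_link):
            return False  # the packet is sent onto the failed link
        current = hop
    return True


def fraction_disconnected(next_hop, destination, failed_link):
    """Share of source ASes that cannot reach destination with the given stale state."""
    sources = [asn for asn in next_hop if asn != destination]
    lost = [s for s in sources if not delivers(next_hop, s, destination, failed_link)]
    return len(lost) / len(sources)


# Stub AS5 is multi-homed to AS4 and AS1; access link 4-5 fails before any AS reacts.
stale_next_hop = {1: 5, 2: 4, 3: 4, 4: 5}
result = fraction_disconnected(stale_next_hop, 5, failed_link=(4, 5))
print(result)
```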
Connectivity
Consensus routing maintains complete connectivity in over 99% of the cases
BGP maintains complete connectivity in < 40% of the failure cases