Reducing Transient Disconnectivity using Anomaly-Cognizant Forwarding Andrey Ermolinskiy, Scott Shenker University of California – Berkeley and ICSI


Page 1: Reducing Transient Disconnectivity using     Anomaly-Cognizant Forwarding

Reducing Transient Disconnectivity using Anomaly-Cognizant Forwarding

Andrey Ermolinskiy, Scott Shenker

University of California – Berkeley and ICSI

Page 2: Reducing Transient Disconnectivity using     Anomaly-Cognizant Forwarding

What’s the problem?

One of the central goals of the Internet: continuous end-to-end connectivity.

BGP convergence is a major cause of connectivity disruption.
    Routers operate on potentially inconsistent local views.
    Temporary inconsistencies give rise to anomalies, such as loops and black holes, that disrupt end-to-end packet delivery.

Page 3: Reducing Transient Disconnectivity using     Anomaly-Cognizant Forwarding

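Example: transient routing loop with BGP

[Figure: AS-level topology with domains A, B, C, D, E, F and G. Route preference lists toward A shown next to the domains: "1. BA 2. CBA", "1. BA 2. DBA", "1. CBA 2. DBA", "1. ECBA 2. GA". Annotation: "withdraw BA".]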

Page 4: Reducing Transient Disconnectivity using     Anomaly-Cognizant Forwarding

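Example: transient routing loop with BGP

[Figure: AS-level topology with domains A, B, C, D, E, F and G. Route preference lists toward A shown next to the domains: "1. BA 2. CBA", "1. BA 2. DBA", "1. CBA 2. DBA", "1. ECBA 2. GA". Annotation: "withdraw BA".]

The routing loop between C and D causes a temporary loss of connectivity between {B, C, D, E, F} and A.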
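The loop in this example can be reproduced with a few lines of Python. This is a minimal sketch, not the authors' simulator: it assumes that the preference list "1. BA 2. DBA" belongs to C and "1. BA 2. CBA" belongs to D (the slide text does not say which list sits next to which domain), and the helper names `withdraw`, `best_next_hop` and `trace` are hypothetical.

```python
# Ranked AS paths toward destination A (assumed assignment of lists to C and D).
ribs = {
    "C": [("B", "A"), ("D", "B", "A")],
    "D": [("B", "A"), ("C", "B", "A")],
}

def withdraw(routes, withdrawn):
    """Drop only the exact route that was withdrawn; stale alternates remain."""
    return [r for r in routes if r != withdrawn]

def best_next_hop(routes):
    """First hop of the most preferred remaining route, or None."""
    return routes[0][0] if routes else None

def trace(start, next_hops, max_hops=8):
    """Follow next hops from `start`; stop at A, at a dead end, or on a revisit."""
    path = [start]
    while path[-1] != "A" and len(path) <= max_hops:
        nxt = next_hops.get(path[-1])
        if nxt is None:
            break
        path.append(nxt)
        if path.count(nxt) > 1:  # the same domain was visited twice: a loop
            break
    return path

# After "withdraw BA", C and D each fall back to the path through the other,
# because neither has yet learned that the other's path is also invalid.
after = {asn: withdraw(routes, ("B", "A")) for asn, routes in ribs.items()}
next_hops = {asn: best_next_hop(routes) for asn, routes in after.items()}
loop_path = trace("C", next_hops)
print(loop_path)  # ['C', 'D', 'C']
```

The loop persists only until further BGP updates remove the stale alternates, which is why the disconnection is transient.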

Page 5: Reducing Transient Disconnectivity using     Anomaly-Cognizant Forwarding

Related Work

Shrinking the convergence time window through BGP protocol extensions
    Ghost flushing
    Consistency assertions

Protecting end-to-end packet delivery from adverse effects of convergence
    R-BGP
        Forward packets on pre-computed failover paths
        Propagate root cause information to prevent loops
    Consensus Routing
        Enforce a globally consistent view via distributed snapshots and strategically delay adoption of incoming BGP updates
    Anomaly-Cognizant Forwarding

Page 6: Reducing Transient Disconnectivity using     Anomaly-Cognizant Forwarding

Anomaly-Cognizant Forwarding (ACF)

Approach
    Accept routing anomalies as an unavoidable fact.
    Protect end-to-end packet delivery by detecting and recovering from anomalies on the forwarding path.

Main hypothesis
    Several simple and lightweight extensions to conventional IP forwarding enable us to sustain packet delivery during periods of BGP instability
        without the use of pre-computed backup paths
        without modifying the core routing protocol or altering its timing dynamics

Page 7: Reducing Transient Disconnectivity using     Anomaly-Cognizant Forwarding

ACF Overview

Domain S has anomalous forwarding state for destination D if S’s outgoing packets destined for D arrive back at S as a result of a routing loop.

[Figure: domains S and D, annotated "Anomalous forwarding state".]

Main idea of ACF:
    Detect occurrences of anomalous state.
    Avoid forwarding packets via domains that are known to have anomalous state.

Each packet carries a list of prior AS-level hops (pathTrace).
Each packet carries a blackList of domains with anomalous state.

[Figure: packet header with pathTrace and blackList fields.]

Page 8: Reducing Transient Disconnectivity using     Anomaly-Cognizant Forwarding

ACF Overview

Forward(packet p) {
    if (localASNum in p.pathTrace)
        Move loop elements from p.pathTrace to p.blackList
    nextHop ← lookupNextHop(p.destAddr)
    if (nextHop in p.blackList)
        Invoke the control plane: nextHop ← alternate non-blacklisted route from the RIB, or NONE if there is none
    if (nextHop != NONE) {
        Append localASNum to p.pathTrace
        SendPacket(p, nextHop)
    } else
        Initiate recovery-mode forwarding for p
}
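The forwarding step can be sketched in runnable Python as follows. This is an illustration under stated assumptions rather than the authors' implementation: the `Packet` class, the `rib` dictionary (destination mapped to candidate next hops, best first) and the choice to return the next hop instead of calling SendPacket are all hypothetical. "Loop elements" is interpreted as every pathTrace entry from the local AS's earlier appearance onward.

```python
from dataclasses import dataclass, field

@dataclass
class Packet:
    dest: str
    path_trace: list = field(default_factory=list)  # prior AS-level hops
    black_list: set = field(default_factory=set)    # domains with anomalous state

def forward(local_as, packet, rib):
    """One ACF forwarding step at `local_as`.

    `rib` maps a destination to this AS's candidate next hops, best first
    (a hypothetical stand-in for the forwarding table plus the RIB fallback).
    Returns the chosen next hop, or None when recovery mode is required.
    """
    # The packet has been here before: move the loop portion of the trace
    # (from our earlier appearance onward) into the blacklist.
    if local_as in packet.path_trace:
        start = packet.path_trace.index(local_as)
        packet.black_list.update(packet.path_trace[start:])
        del packet.path_trace[start:]

    candidates = rib.get(packet.dest, [])
    next_hop = candidates[0] if candidates else None

    # The preferred next hop is blacklisted: look for an alternate route.
    if next_hop in packet.black_list:
        next_hop = next(
            (hop for hop in candidates if hop not in packet.black_list), None
        )

    if next_hop is not None:
        packet.path_trace.append(local_as)
    return next_hop

# A packet that left C, visited D, and has come back to C, whose only
# route toward A goes through D.
p = Packet(dest="A", path_trace=["C", "D"])
hop = forward("C", p, {"A": ["D"]})
```

In this usage example `hop` is `None`, `p.black_list` contains C and D, and `p.path_trace` is empty, so the caller would start recovery-mode forwarding.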

Page 9: Reducing Transient Disconnectivity using     Anomaly-Cognizant Forwarding

ACF Recovery-mode forwarding

If a router is unable to forward a packet because it does not have a valid non-blacklisted route, it initiates recovery forwarding.
    Chooses a recovery destination R from a static and well-known set of highly-connected Tier-1 domains.
    Detours the packet through R.

Intuition: R or some router along the path to R may know a working alternate route to the original destination.

[Figure: a router with nextHop=NONE and recovery destinations R1 and R2; legend: normal-mode forwarding, recovery-mode forwarding.]
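The switch into and out of recovery mode can be sketched as below. Several details are assumptions, not statements from the slides: the header field names follow the dst/origDst labels used in the walkthrough slides, R is picked at random among non-blacklisted recovery destinations (the slides do not specify a selection policy), and pathTrace is reset when recovery starts and when the packet reaches R.

```python
import random
from dataclasses import dataclass, field
from typing import Optional

# Static, well-known set of recovery destinations (hypothetical names).
RECOVERY_DESTINATIONS = ["R1", "R2"]

@dataclass
class Header:
    dst: str
    orig_dst: Optional[str] = None                  # set only in recovery mode
    path_trace: list = field(default_factory=list)
    black_list: set = field(default_factory=set)

def enter_recovery(header, rng=random):
    """Redirect the packet toward a non-blacklisted recovery destination.

    Returns False when no recovery destination is left to try.
    """
    choices = [r for r in RECOVERY_DESTINATIONS if r not in header.black_list]
    if not choices:
        return False
    if header.orig_dst is None:      # remember the real destination once
        header.orig_dst = header.dst
    header.dst = rng.choice(choices)
    header.path_trace.clear()        # assumed: start a fresh trace for the detour
    return True

def arrive_at_recovery_destination(header):
    """At R: restore the original destination and resume normal-mode forwarding."""
    header.dst, header.orig_dst = header.orig_dst, None
    header.path_trace.clear()        # the blacklist is kept

h = Header(dst="A", black_list={"C", "D"})
started = enter_recovery(h)
detour_target = h.dst
arrive_at_recovery_destination(h)
```

Keeping the blacklist across the mode switch means the detour and the resumed normal-mode forwarding both continue to avoid domains already known to be anomalous.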

Page 10: Reducing Transient Disconnectivity using     Anomaly-Cognizant Forwarding

Anomaly-Cognizant Forwarding

[Figure: example topology with domains A through G and their route preferences toward A; packet p in transit.]

p.Header: pathTrace = [C], blackList = {}, dst = A, origDst = (empty)

Page 11: Reducing Transient Disconnectivity using     Anomaly-Cognizant Forwarding

Anomaly-Cognizant Forwarding

[Figure: example topology with domains A through G and their route preferences toward A; packet p in transit.]

p.Header: pathTrace = [C, D], blackList = {}, dst = A, origDst = (empty)

Page 12: Reducing Transient Disconnectivity using     Anomaly-Cognizant Forwarding

Anomaly-Cognizant Forwarding

[Figure: example topology with domains A through G and their route preferences toward A; packet p in transit.]

p.Header: pathTrace = [C, D], blackList = {D}, dst = A, origDst = (empty)

C initiates recovery forwarding through domain F

Page 13: Reducing Transient Disconnectivity using     Anomaly-Cognizant Forwarding

Anomaly-Cognizant Forwarding

[Figure: example topology with domains A through G and their route preferences toward A; packet p in transit.]

p.Header: pathTrace = [], blackList = {C, D}, dst = F, origDst = A

C initiates recovery forwarding through domain F

Page 14: Reducing Transient Disconnectivity using     Anomaly-Cognizant Forwarding

Anomaly-Cognizant Forwarding

[Figure: example topology with domains A through G and their route preferences toward A; packet p in transit.]

p.Header: pathTrace = [], blackList = {C, D}, dst = F, origDst = A

C initiates recovery forwarding through domain F

Page 15: Reducing Transient Disconnectivity using     Anomaly-Cognizant Forwarding

Anomaly-Cognizant Forwarding

[Figure: example topology with domains A through G and their route preferences toward A; packet p in transit.]

p.Header: pathTrace = [C], blackList = {C, D}, dst = F, origDst = A

C initiates recovery forwarding through domain F

Page 16: Reducing Transient Disconnectivity using     Anomaly-Cognizant Forwarding

Anomaly-Cognizant Forwarding

[Figure: example topology with domains A through G and their route preferences toward A; packet p in transit.]

p.Header: pathTrace = [C], blackList = {C, D}, dst = F, origDst = A

C initiates recovery forwarding through domain F

Page 17: Reducing Transient Disconnectivity using     Anomaly-Cognizant Forwarding

Anomaly-Cognizant Forwarding

[Figure: example topology with domains A through G and their route preferences toward A; packet p in transit.]

p.Header: pathTrace = [C], blackList = {C, D, E}, dst = F, origDst = A

C initiates recovery forwarding through domain F

Page 18: Reducing Transient Disconnectivity using     Anomaly-Cognizant Forwarding

Anomaly-Cognizant Forwarding

[Figure: example topology with domains A through G and their route preferences toward A; packet p in transit.]

p.Header: pathTrace = [C, E], blackList = {C, D, E}, dst = F, origDst = A

C initiates recovery forwarding through domain F

Page 19: Reducing Transient Disconnectivity using     Anomaly-Cognizant Forwarding

Anomaly-Cognizant Forwarding

[Figure: example topology with domains A through G and their route preferences toward A; packet p in transit.]

p.Header: pathTrace = [C, E], blackList = {C, D, E}, dst = F, origDst = A

C initiates recovery forwarding through domain F

Page 20: Reducing Transient Disconnectivity using     Anomaly-Cognizant Forwarding

Anomaly-Cognizant Forwarding

[Figure: example topology with domains A through G and their route preferences toward A; packet p in transit.]

p.Header: pathTrace = [], blackList = {C, D, E}, dst = F, origDst = A

C initiates recovery forwarding through domain F

F resumes normal-mode forwarding

Page 21: Reducing Transient Disconnectivity using     Anomaly-Cognizant Forwarding

Anomaly-Cognizant Forwarding

[Figure: example topology with domains A through G and their route preferences toward A; packet p in transit.]

p.Header: pathTrace = [F], blackList = {C, D, E}, dst = F, origDst = A

C initiates recovery forwarding through domain F

F resumes normal-mode forwarding

Page 22: Reducing Transient Disconnectivity using     Anomaly-Cognizant Forwarding

Anomaly-Cognizant Forwarding

[Figure: example topology with domains A through G and their route preferences toward A; packet p in transit.]

p.Header: pathTrace = [F, G], blackList = {C, D, E}, dst = F, origDst = A

C initiates recovery forwarding through domain F

F resumes normal-mode forwarding

Page 23: Reducing Transient Disconnectivity using     Anomaly-Cognizant Forwarding

Anomaly-Cognizant Forwarding

[Figure: example topology with domains A through G and their route preferences toward A; packet p in transit.]

p.Header: pathTrace = [F, G], blackList = {C, D, E}, dst = F, origDst = A

C initiates recovery forwarding through domain F

F resumes normal-mode forwarding

Page 24: Reducing Transient Disconnectivity using     Anomaly-Cognizant Forwarding

Anomaly-Cognizant Forwarding

[Figure: example topology with domains A, B, C, D, E, F and G.]

Page 25: Reducing Transient Disconnectivity using     Anomaly-Cognizant Forwarding

ACF: Observations

ACF does not use pre-computed failover paths.
    Discovers alternate routes dynamically using state in the packet header.
    The two forwarding modes make use of the same forwarding table.

Paths to recovery destinations are not assumed to be stable and anomaly-free.
    We protect recovery-mode forwarding using the same mechanism (pathTrace and blackList).

Page 26: Reducing Transient Disconnectivity using     Anomaly-Cognizant Forwarding

ACF: Preliminary Evaluation

Evaluation metrics
    Effectiveness in eliminating transient disconnectivity
    Efficiency of alternate paths
    Packet header overhead

Page 27: Reducing Transient Disconnectivity using     Anomaly-Cognizant Forwarding

ACF: Preliminary Evaluation

Simulation methodology
    CAIDA AS-level topology (27969 nodes) annotated with inferred inter-AS relationships
    12937 multihomed edge domains, 29426 adjacent provider links

Provider link failure experiment
    For each multihomed domain D and each provider link L:
        Fail L and simulate packet delivery from every other domain to D during convergence.

[Figure: destination domain D and source domains S1, S2, S3, S4.]

Recovery destinations = 10 highly-connected Tier-1 ISPs
Packet TTL = 32 hops

Page 28: Reducing Transient Disconnectivity using     Anomaly-Cognizant Forwarding

ACF: Preliminary Evaluation

Transient disconnection after a link failure

BGP with conventional forwarding
    51% of failure cases produce unwarranted disconnection
    Widespread disconnection (>50% of ASes) in 17% of cases

BGP with ACF
    No disconnection in 92% of failure cases
    <1% of ASes see disconnection in 98% of failure cases

Page 29: Reducing Transient Disconnectivity using     Anomaly-Cognizant Forwarding

ACF: Preliminary Evaluation

Transient path efficiency

Causes of path dilation in ACF
    Transient loops
    Detouring via a recovery destination

[Figure legend: F = failure cases that produce transient disconnection with conventional forwarding.]

In 65% of failure cases that produce disconnectivity, ACF recovers packets using ≤ 2 extra hops.
9% of cases require 7 hops or more.

Page 30: Reducing Transient Disconnectivity using     Anomaly-Cognizant Forwarding

ACF: Preliminary Evaluation

Packet header overhead

% of ASes disconnected    0%    0.09%    0.9%    9%    90%
pathTrace length          11    16       16      20    13
blackList length          4     11       9       11    16

Maximum number of pathTrace and blackList entries in a representative sample of failure cases.

Worst-case pathTrace: 20 entries
    40 bytes of overhead assuming 16-bit AS numbers
Worst-case blackList: 16 entries
    10 bytes of overhead for a Bloom filter with 1% error rate
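The overhead arithmetic can be checked with a short calculation. The pathTrace figure follows directly from the stated assumptions (20 entries of 16 bits each). For the blackList, the sketch below uses the textbook sizing formula for an optimally configured Bloom filter, m = -n ln(p) / (ln 2)^2; that formula is an assumption here, since the slides do not say how the 10-byte figure was derived. The function names are hypothetical.

```python
import math

def path_trace_bytes(entries, as_number_bits=16):
    """Bytes needed to list `entries` AS numbers explicitly."""
    return entries * as_number_bits // 8

def bloom_filter_bits(entries, false_positive_rate):
    """Textbook optimal Bloom filter size: m = -n * ln(p) / (ln 2)^2."""
    return math.ceil(
        -entries * math.log(false_positive_rate) / math.log(2) ** 2
    )

trace_bytes = path_trace_bytes(20)           # 40 bytes, matching the slide
bloom_bits = bloom_filter_bits(16, 0.01)     # 154 bits
bloom_bytes = math.ceil(bloom_bits / 8)      # 20 bytes
print(trace_bytes, bloom_bits, bloom_bytes)
```

Under the textbook formula, 16 entries at a 1% false-positive rate need about 20 bytes rather than 10, so the slide's figure presumably rests on sizing assumptions that are not stated here.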

Page 31: Reducing Transient Disconnectivity using     Anomaly-Cognizant Forwarding

Challenges / Concerns

Feasibility of deployment
    ACF adds fields to the packet header and modifies core IP forwarding logic.

Packet processing overhead
    The control plane is invoked only during periods of instability.
    Common case: check pathTrace and blackList.
        Both operations admit efficient implementation in hardware and parallelization.

ACF and routing policies

Page 32: Reducing Transient Disconnectivity using     Anomaly-Cognizant Forwarding

Thank you. Questions?