1 © 2006 Cisco Systems, Inc. All rights reserved. Cisco Confidential Session Number Presentation_ID Advanced BGP Convergence Techniques Pradosh Mohapatra [email protected] Apricot 2006


Advanced BGP Convergence Techniques

Pradosh Mohapatra

[email protected]

Apricot 2006


Agenda

•Terminology

•Convergence Scenarios

•Core Link Failure

•Edge Node Failure

•Edge Link Failure


Basic Terminology

• Prefix – A route learnt by routing protocols.

– 12.0.0.0/16

• Pathlist – A list of next-hop paths learnt by routing protocols.

– 12.0.0.0/16 (non-recursive)

Via POS1/0

Via GE2/0, 5.5.5.5

– 10.0.0.0/16 (recursive: depends on the resolution of the next-hop)

Via 5.5.5.5
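
The difference between a non-recursive and a recursive path can be sketched in a few lines of Python. This is only an illustrative model, not a router data structure: the `Path` class, the `TABLE` dictionary and the `5.5.5.5/32` entry (assumed reachable over GE2/0) are hypothetical and exist only to make the slide's example resolvable. Only one level of recursion is followed.

```python
from dataclasses import dataclass
from typing import Optional
import ipaddress


@dataclass(frozen=True)
class Path:
    """One entry of a pathlist (hypothetical model for illustration)."""
    next_hop: Optional[str] = None   # next-hop IP address, if any
    interface: Optional[str] = None  # outgoing interface, if known

    @property
    def recursive(self) -> bool:
        # A path without an outgoing interface must be resolved via its next hop.
        return self.interface is None


# Toy table built from the slide's examples.
TABLE = {
    "12.0.0.0/16": [Path(interface="POS1/0"), Path("5.5.5.5", "GE2/0")],
    "10.0.0.0/16": [Path("5.5.5.5")],
    "5.5.5.5/32": [Path(interface="GE2/0")],  # assumption: route to the next hop
}


def lookup(address: str) -> list:
    """Longest-prefix match of an address against TABLE."""
    addr = ipaddress.ip_address(address)
    matches = [ipaddress.ip_network(p) for p in TABLE
               if addr in ipaddress.ip_network(p)]
    if not matches:
        return []
    best = max(matches, key=lambda net: net.prefixlen)
    return TABLE[str(best)]


def resolve(prefix: str) -> list:
    """Return the outgoing interfaces a prefix ultimately forwards on."""
    interfaces = []
    for path in TABLE[prefix]:
        if path.recursive:
            # Follow the next hop to the (non-recursive) pathlist that covers it.
            interfaces += [p.interface for p in lookup(path.next_hop)
                           if p.interface]
        else:
            interfaces.append(path.interface)
    return interfaces
```

Here `resolve("10.0.0.0/16")` has to consult the entry covering 5.5.5.5 before it can name an interface, whereas `resolve("12.0.0.0/16")` reads its interfaces directly from its own pathlist.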


Forwarding Table Structure

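[Figure: a BGP pathlist (BGP PL) with path 1 and path 2. Each BGP path points to an IGP pathlist (IGP PL) with its own path 1 and path 2, and the four IGP paths point to Intf1/NH1, Intf2/NH2, Intf3/NH3 and Intf4/NH4.]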


Salient Features

• Pathlist Sharing:

All BGP prefixes that have the same set of paths point to a single pathlist.

• Hierarchical Structure:

BGP prefixes (recursive) point to IGP prefixes (non-recursive).
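
Pathlist sharing can be pictured as interning: prefixes with the same set of paths reference one object instead of each holding a copy. The sketch below is a hypothetical model (the class `PathlistTable`, the prefixes and the next-hop names `PE2`/`PE3` are made up for illustration), not the actual forwarding-table implementation.

```python
class PathlistTable:
    """Interns pathlists so prefixes with identical paths share one object."""

    def __init__(self):
        self._shared = {}       # tuple of next hops -> shared pathlist object
        self.prefix_to_pl = {}  # prefix -> shared pathlist object

    def install(self, prefix, next_hops):
        key = tuple(sorted(next_hops))
        # Reuse the existing pathlist for this set of paths, or create it once.
        pathlist = self._shared.setdefault(key, list(key))
        self.prefix_to_pl[prefix] = pathlist
        return pathlist

    def pathlist_count(self):
        return len(self._shared)


fib = PathlistTable()
for i in range(1000):
    # 1000 hypothetical BGP prefixes, all learnt via the same two next hops.
    fib.install(f"20.{i // 256}.{i % 256}.0/24", ["PE2", "PE3"])
fib.install("30.0.0.0/8", ["PE3"])
```

In this toy table 1001 prefixes are served by only two pathlist objects, which is what makes a prefix-independent update possible: changing one shared pathlist changes the forwarding of every prefix that points to it.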


Core Link Failure


Multipath BGP, Multipath IGP, IGP path goes down

[Figure: BGP PL with path 1 and path 2, each pointing to an IGP PL that has path 1 and path 2.]

• Initial organization before failure of IGP path 1.

• Link to Path 1 goes down.


Multipath BGP, Multipath IGP, IGP path goes down

[Figure: BGP PL with path 1 and path 2. The first IGP PL now contains only path 2; the second IGP PL still has path 1 and path 2.]

• IGP pathlist modified after Path 1 failure.

• BGP Convergence = IGP Convergence.
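
A minimal sketch of why BGP convergence equals IGP convergence in this case, assuming a toy two-level table: the failed IGP path is removed from one IGP pathlist, and no BGP prefix or BGP pathlist is rewritten. The dictionary layout, the labels `IGP prefix 1`/`IGP prefix 2` and the 200 sample prefixes are hypothetical.

```python
# IGP pathlists, keyed by the IGP prefix (the BGP next hop) they reach.
igp_pl = {
    "IGP prefix 1": ["Intf1/NH1", "Intf2/NH2"],
    "IGP prefix 2": ["Intf3/NH3", "Intf4/NH4"],
}
# One shared BGP pathlist; each BGP path names an IGP prefix.
bgp_pl = ["IGP prefix 1", "IGP prefix 2"]
# Many BGP prefixes point at the same BGP pathlist object.
bgp_prefixes = {f"20.0.{i}.0/24": bgp_pl for i in range(200)}


def igp_path_down(igp_prefix, path):
    """Remove one failed IGP path in place; return the number of pathlists touched."""
    igp_pl[igp_prefix].remove(path)
    return 1  # independent of len(bgp_prefixes)


def forwarding_choices(prefix):
    """Expand a BGP prefix to the interfaces it can currently use."""
    return [intf for nh in bgp_prefixes[prefix] for intf in igp_pl[nh]]


writes = igp_path_down("IGP prefix 1", "Intf1/NH1")
```

After the call, every one of the 200 prefixes forwards over the three remaining interfaces, although only a single pathlist was modified.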


Multipath BGP, Multipath IGP, IGP prefix is deleted

[Figure: BGP PL with path 1 and path 2, each pointing to an IGP PL that has path 1 and path 2.]

• Initial organization before deletion of IGP prefix 1.

• IGP Prefix 1 gets deleted.

• Fix-up BGP PL to point to the second path.


Multipath BGP, Multipath IGP, IGP prefix is deleted

[Figure: the BGP pathlist now has a single path, pointing to the remaining IGP pathlist with path 1 and path 2.]

• BGP pathlist modified after deletion of IGP prefix 1.

• BGP Convergence = IGP Convergence.
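
The fix-up after an IGP prefix deletion can be sketched the same way, again under the assumption of a toy two-level table with hypothetical names. The shared BGP pathlist is edited in place (slice assignment keeps the same list object), so the BGP prefixes that reference it are never touched.

```python
igp_pl = {
    "IGP prefix 1": ["Intf1/NH1", "Intf2/NH2"],
    "IGP prefix 2": ["Intf3/NH3", "Intf4/NH4"],
}
bgp_pl = ["IGP prefix 1", "IGP prefix 2"]           # shared BGP pathlist
bgp_prefixes = {f"20.0.{i}.0/24": bgp_pl for i in range(200)}


def igp_prefix_deleted(igp_prefix):
    """Delete an IGP prefix and fix up the shared BGP pathlist in place."""
    del igp_pl[igp_prefix]
    # Keep only BGP paths whose IGP prefix still exists. Slice assignment
    # mutates the existing list, so every BGP prefix sees the change.
    bgp_pl[:] = [nh for nh in bgp_pl if nh in igp_pl]


igp_prefix_deleted("IGP prefix 1")
```

All 200 prefixes now resolve through `IGP prefix 2` only, and the amount of work did not depend on the number of BGP prefixes.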


Multipath BGP, Multipath IGP, IGP path modified

[Figure: BGP pathlist with path 1 and path 2, each pointing to an IGP pathlist that has path 1 and path 2.]

• Initial organization before modification of IGP Path 1.

• IGP Path 1 gets modified.

• BGP Convergence = IGP Convergence


Conclusion

• In case of core link failure:

Sub-second convergence.

BGP Prefix-independent & In-place modification of the forwarding table.

Make-before-break solution


Edge Node Failure


Edge node failure

• PE1 has selected the path via PE2 as bestpath and has installed only that path in the forwarding table.

• Upon PE2’s failure, PE1 needs fast detection of unreachability.

• Declaring PE2 unreachable requires all of PE2’s IGP neighbors to have detected the failure and to have sent their LSPs to PE1.

• PE1 then needs to point to PE3.

[Figure: topology with PE1, P1, P2, PE2 and PE3.]


BGP Next-Hop Tracking

• Event-driven reaction to BGP next-hop changes

– BGP communicates its next-hops to RIB.

– If RIB gets a modify/delete/add of an entry covering these next-hops, it notifies BGP.

– BGP runs bestpath algorithm.

• Stability requirement

– Fast reaction to isolated events

– Delayed reaction to too frequent events

• Classification of Events

– Next-hop unreachable is critical: React faster.

– Metric Change is non-critical: React slower.


BGP NHT – Implementation highlights

• RIB implements a dampening algorithm.

– Next-hops flapping too often are dampened.

• RIB classifies next-hop changes as critical or non-critical.

– Critical events are sent immediately to BGP. Non-critical events are delayed up to 3 seconds.

• BGP has an initial delay before it reacts to next-hop changes.

– Default: 5s. Configurable.

– Capture as many changes as possible within the initial delay before running bestpath.

router bgp 1
 bgp nexthop-trigger-delay 1
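
The interplay of event classification and the initial delay can be sketched as a small simulation. This is a rough model under stated assumptions, not the router's implementation: non-critical events are always held for the full 3 seconds (the slide says "up to"), dampening of flapping next-hops is not modelled, and the function names and event tuples are hypothetical. Times are integers in milliseconds.

```python
CRITICAL_KINDS = {"unreachable"}   # per the slide: next-hop unreachable is critical
NONCRITICAL_DELAY_MS = 3000        # assumption: worst case of "up to 3 seconds"


def rib_notify_time(event_ms, kind):
    """When RIB tells BGP about an event: at once if critical, else delayed."""
    if kind in CRITICAL_KINDS:
        return event_ms
    return event_ms + NONCRITICAL_DELAY_MS


def bestpath_runs(events, trigger_delay_ms=5000):
    """Group next-hop events into bestpath runs.

    events: iterable of (time_ms, next_hop, kind).
    Returns a list of (run_time_ms, sorted next hops handled in that run).
    """
    notified = sorted((rib_notify_time(t, kind), nh) for t, nh, kind in events)
    runs = []
    deadline, batch = None, set()
    for when, nh in notified:
        if deadline is not None and when > deadline:
            # The delay timer expired before this notification: run bestpath.
            runs.append((deadline, sorted(batch)))
            deadline, batch = None, set()
        if deadline is None:
            # The first notification of a batch starts the initial delay timer.
            deadline = when + trigger_delay_ms
        batch.add(nh)
    if deadline is not None:
        runs.append((deadline, sorted(batch)))
    return runs


example = bestpath_runs(
    [(0, "PE2", "unreachable"), (200, "PE4", "metric"),
     (400, "PE3", "unreachable"), (10000, "PE5", "unreachable")],
    trigger_delay_ms=1000,  # corresponds to a 1-second trigger delay
)
```

With a 1-second delay, the two critical events at 0 ms and 400 ms are handled together in one bestpath run at 1000 ms, while the metric change is only seen by BGP at 3200 ms and is handled in a later run.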


BGP NHT - example

[Figure: timeline. The link goes down at T1 and IGP convergence follows; RIB sends the 1st NH notification at T2; T3 follows; NHScan + BestPath run at T4.]

• T1: Link failure triggering IGP convergence.

• T2: First next-hop notification to BGP.

• T3: BGP reads the next-hop updates and starts initial delay timer.

• T4: Initial delay period expires. BGP does NHScan and bestpath change (a function of the table size).


BGP NHT

• Principle: The first SPF must declare PE2 as unreachable

If PE2 fails, all its neighbors must have had time to detect the failure, originate their LSPs and flood them to PE1.

When PE1 starts its SPF, the LSPs of all PE2’s neighbors must already be in PE1’s database.

• Dependency

fast failure detection

fast flooding

SPF Initial-wait conservative enough


BGP NHT – Typical Timing

• 0: PE2 failure

• 50ms: PE1 receives the 1st LSP and schedules SPF at T=200ms

– The other LSPs have ample time to arrive in the meantime.

• 200ms: PE1 starts SPF

– We allow 30ms for SPF; with iSPF it would take ~1ms.

• 232ms: PE1 deletes PE2’s loopback and schedules BGP NHT at T=1232ms

– There are few prefixes to modify as this is a node failure.

• 1232ms: PE1 runs BGP NHT

– Table scan: ~6us per entry. If PE1 has 20k routes: ~120ms.

– RIB modify: ~140us per entry. If PE1 has 5k routes from PE2: ~700ms.

– Distribution download: 70ms.

• 2122ms: PE1/LC has finished modifying the BGP entries to use nh=PE3. We still need to resolve them.

– Resolution starts within [0, 1000ms].

– Resolution lasts ~100us per entry.

• 3622ms: Convergence is finished in the worst case.
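
The timing budget above is plain arithmetic and can be checked with a short script. The per-entry costs and route counts are the figures quoted on the slide; the function and parameter names are made up, and one assumption is added: resolution is charged for the 5k routes learnt from PE2, which is the count that reproduces the slide's 3622ms.

```python
def edge_node_failure_budget_ms(
    total_routes=20_000,           # BGP table size scanned by NHT
    routes_via_failed_pe=5_000,    # routes learnt from PE2
    loopback_deleted_ms=232,       # slide: PE2's loopback deleted at 232ms
    nht_delay_ms=1_000,            # delay before BGP NHT runs
    scan_us=6,                     # table scan cost per entry
    rib_modify_us=140,             # RIB modify cost per entry
    download_ms=70,                # distribution download
    resolution_start_max_ms=1_000, # resolution starts within [0, 1000ms]
    resolve_us=100,                # resolution cost per entry
):
    """Return the main milestones (in ms) of the edge-node-failure scenario."""
    nht_start = loopback_deleted_ms + nht_delay_ms
    scan = total_routes * scan_us // 1000
    rib_modify = routes_via_failed_pe * rib_modify_us // 1000
    entries_modified = nht_start + scan + rib_modify + download_ms
    # Assumption: only the routes that moved to PE3 need to be resolved.
    resolution = routes_via_failed_pe * resolve_us // 1000
    worst_case = entries_modified + resolution_start_max_ms + resolution
    return {
        "nht_start": nht_start,
        "entries_modified": entries_modified,
        "worst_case": worst_case,
    }


budget = edge_node_failure_budget_ms()
```

With the default figures the milestones come out at 1232ms, 2122ms and 3622ms, matching the slide. Changing `nht_delay_ms` or the route counts shows how strongly the trigger delay and the number of affected routes dominate the total.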


Conclusion – Edge node failure

• Sub-5s is achievable

The analyzed scenario leads to a worst case of roughly 3.5s (3622ms in the timing breakdown).

• Sub-Second is challenging

• Ongoing work to improve this further:

Backup path


Backup Path

[Figure: BGP PL with path 1 and a backup path. Each points to an IGP PL with path 1 and path 2; the IGP paths point to Intf1/NH1, Intf2/NH2, Intf3/NH3 and Intf4/NH4.]

• No Multipath. Prefix always points to Path 1.

• Reroute triggered per IGP prefix: fix-up Path 1 to point to the backup path.
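
A minimal sketch of the backup-path idea, assuming a hypothetical `BgpPathlist` class: the prefix forwards on path 1 only, and the backup is pre-installed so that losing the IGP prefix behind path 1 needs just one pointer switch rather than a bestpath run.

```python
class BgpPathlist:
    """Primary/backup pathlist: no multipath, traffic follows `active` only."""

    def __init__(self, primary, backup=None):
        self.primary = primary   # IGP prefix (next hop) of path 1
        self.backup = backup     # pre-installed backup next hop, if known
        self.active = primary

    def igp_prefix_lost(self, igp_prefix):
        """Switch to the backup if the active next hop disappeared.

        Returns True when a reroute happened.
        """
        if self.active == igp_prefix and self.backup is not None:
            self.active = self.backup
            return True
        return False


# Hypothetical example: PE2 is the bestpath next hop, PE3 the backup.
shared_pl = BgpPathlist(primary="PE2", backup="PE3")
prefixes = {f"20.0.{i}.0/24": shared_pl for i in range(100)}
rerouted = shared_pl.igp_prefix_lost("PE2")
```

Because the pathlist is shared, the single switch reroutes all 100 prefixes in this toy example. The open question, raised on the next slide, is how PE1 learns the backup path in the first place.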


Backup Path – Contd.

• Problem:

How to know the backup path? BGP advertises only one path.

Peering with RRs: RR sends only the bestpath it computes.

• Solution:

Add-path draft.


ADD-PATH

• Mechanism that allows the advertisement of multiple paths for the same prefix without the new paths implicitly replacing any previous ones.

• Add a path identifier to the encoding to distinguish between different paths for the same prefix.

+-----------------------------+
| Path Identifier (4 octets)  |
+-----------------------------+
| Length (1 octet)            |
+-----------------------------+
| Label (3 octets)            |
+-----------------------------+
...............................
+-----------------------------+
| Prefix (variable)           |
+-----------------------------+

+-----------------------------+
| Path Identifier (4 octets)  |
+-----------------------------+
| Length (1 octet)            |
+-----------------------------+
| Prefix (variable)           |
+-----------------------------+
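
The unlabeled encoding (path identifier, length, prefix) can be illustrated with Python's `struct` module. This is a sketch for IPv4 prefixes only; the function names are hypothetical, the labeled variant is not covered, and it assumes the usual BGP convention that the length is in bits and the prefix occupies the minimum number of whole octets.

```python
import ipaddress
import struct


def encode_nlri(path_id: int, prefix: str) -> bytes:
    """Encode one IPv4 prefix with a 4-octet path identifier in front."""
    net = ipaddress.ip_network(prefix)
    nbytes = (net.prefixlen + 7) // 8  # whole octets needed for the prefix bits
    header = struct.pack("!IB", path_id, net.prefixlen)
    return header + net.network_address.packed[:nbytes]


def decode_nlri(data: bytes):
    """Decode one entry; return (path_id, prefix, bytes consumed)."""
    path_id, length = struct.unpack_from("!IB", data)
    nbytes = (length + 7) // 8
    raw = data[5:5 + nbytes] + bytes(4 - nbytes)  # pad back to 4 octets
    prefix = f"{ipaddress.IPv4Address(raw)}/{length}"
    return path_id, prefix, 5 + nbytes


# Two paths for the same prefix stay distinct because their identifiers differ.
first = encode_nlri(1, "12.0.0.0/16")
second = encode_nlri(2, "12.0.0.0/16")
```

Without the identifier both advertisements would carry identical NLRI bytes, and the second would implicitly replace the first; with it, a receiver can hold both paths at once.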


ADD-PATH - Operation

• New capability: Add-path

• Advertisement of the capability indicates ability to receive multiple paths for all negotiated AFI/SAFI.

• Advertisement of specific AFI/SAFI information in the capability indicates the intent to send multiple paths.

• Only in these cases must the new encoding be used.

• Concern: does the cost of advertising multiple paths outweigh the convergence benefits?


Edge Link Failure


Example: PE-CE Link Failure

[Figure: VPN topology with CE1, CE2 and CE3 (VPN1 site and VPN1 HQ), PE1, PE2 and PE3, and route reflectors RRA1, RRA2, RRB1 and RRB2.]


Edge Link Failure scenarios

• Edge Link Failure: Next-hop on the peering link

Convergence behavior same as the last two scenarios.

• Edge Link Failure: Next-hop-self

Default behavior for L3VPN

In-place modification and/or BGP NHT do not help.

Advanced BGP signaling required.


Any Questions ?
