72
Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

  • Upload
    ksena

  • View
    43

  • Download
    0

Embed Size (px)

DESCRIPTION

Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin. Best Effort Connectivity. IP traffic. 135.207.49.8. 192.0.2.153. This is the fundamental service provided by Internet Service Providers (ISPs). All other IP services depend on connectivity: DNS, email, VPNs, Web Hosting, …. - PowerPoint PPT Presentation

Citation preview

Page 1: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

Slides Selected from SIGCOMM 2001 BGP Tutorial

by Tim Griffin

Page 2: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

Best Effort Connectivity

This is the fundamental service provided by Internet Service Providers (ISPs)

All other IP services depend on connectivity: DNS, email, VPNs, Web Hosting, …

IP traffic

135.207.49.8 192.0.2.153

Page 3: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

3

Routing vs. Forwarding

R

R

RA

B

C

D

R1

R2

R3

R4 R5

ENet Nxt Hop

R4R3R3R4DirectR4

Net Nxt Hop

A B C D Edefault

R2R2DirectR5R5R2

Net Nxt Hop

A B C D Edefault

R1DirectR3R1R3R1

Default toupstreamrouter

A B C D Edefault

Forwarding: determine next hop

Routing: establish end-to-end paths

Forwarding always works

Routing can be badly broken

Page 4: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

4

How Are Forwarding Tables Populated to implement Routing?

Statically DynamicallyRouters exchange network reachability information using ROUTING PROTOCOLS. Routers use this to compute best routes

Administrator manually configuresforwarding table entries

In practice : a mix of these.Static routing mostly at the “edge”

+ More control+ Not restricted to destination-based forwarding - Doesn’t scale- Slow to adapt to network failures

+ Can rapidly adapt to changes in network topology+ Can be made to scale well- Complex distributed algorithms- Consume CPU, Bandwidth, Memory- Debugging can be difficult- Current protocols are destination-based

Page 5: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

Routers Talking to Routers

Routing info

Routing info

• Routing computation is distributed among routers within a routing domain

• Computation of best next hop based on routing information is the most CPU/memory intensive task on a router

• Routing messages are usually not routed, but exchanged via layer 2 between physically adjacent routers (internal BGP and multi-hop external BGP are exceptions)

Page 6: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

Before We Go Any Further …

IP ROUTING PROTOCOLS DO NOT DYNAMICALLY ROUTE AROUND NETWORK CONGESTION

• IP traffic can be very bursty • Dynamic adjustments in routing typically operate more slowly than fluctuations in traffic load

• Dynamically adapting routing to account for traffic load can lead to wild, unstable oscillations of routing system

Page 7: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

Autonomous Routing DomainsA collection of physical networks glued togetherusing IP, that have a unified administrativerouting policy.

• Campus networks• Corporate networks• ISP Internal networks• …

Page 8: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

Autonomous Systems (ASes)An autonomous system is an autonomous routing domainthat has been assigned an Autonomous System Number (ASN).

RFC 1930: Guidelines for creation, selection, and registration of an Autonomous System

… the administration of an AS appears to other ASes to have a single coherent interior routing plan and presents a consistent picture of what networks are reachable through it.

Page 9: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

AS Numbers (ASNs)ASNs are 16 bit values.

64512 through 65535 are “private”

• Genuity: 1 • MIT: 3• Harvard: 11• UC San Diego: 7377• AT&T: 7018, 6341, 5074, … • UUNET: 701, 702, 284, 12199, …• Sprint: 1239, 1240, 6211, 6242, …• …

ASNs represent units of routing policy

Currently over 11,000 in use.

Page 10: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

Architecture of Dynamic Routing

AS 1

AS 2

BGP

EGP = Exterior Gateway Protocol

IGP = Interior Gateway Protocol

Metric based: OSPF, IS-IS, RIP, EIGRP (cisco)

Policy based: BGP

The Routing Domain of BGP is the entire Internet

OSPF

EIGRP

Page 11: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

• Topology information is flooded within the routing domain

• Best end-to-end paths are computed locally at each router.

• Best end-to-end paths determine next-hops.

• Based on minimizing some notion of distance

• Works only if policy is shared and uniform

• Examples: OSPF, IS-IS

• Each router knows little about network topology

• Only best next-hops are chosen by each router for each destination network.

• Best end-to-end paths result from composition of all next-hop choices

• Does not require any notion of distance

• Does not require uniform policies at all routers

• Examples: RIP, BGP

Link State Vectoring

Technology of Distributed Routing

Page 12: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

The Gang of FourLink State Vectoring

EGP

IGP

BGP

RIPIS-IS

OSPF

Page 13: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

13

Many Routing Processes Can Run on a Single Router

Forwarding Table

OSPFDomain

RIPDomain

BGP

OS kernel

OSPF Process

OSPF Routing tables

RIP Process

RIP Routing tables

BGP Process

BGP Routing tables

Forwarding Table Manager

Page 14: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

14

IPv4 Addresses are 32 Bit Values

11111111 00010001 10000111 00000000

255 013417

255.17.134.0Dotted quad notation

IPv6 addresses have 128 bits

Page 15: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

15

Classful Addresses

0nnnnnnn

10nnnnnn nnnnnnnn

nnnnnnnn nnnnnnnn110nnnnn

hhhhhhhh hhhhhhhh hhhhhhhh

hhhhhhhh hhhhhhhh

hhhhhhhh

n = network address bit h = host identifier bit

Class A

Class C

Class B

Leads to a rigid, flat, inefficient use of address space …

Page 16: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

16

RFC 1519: Classless Inter-Domain Routing (CIDR)

IP Address : 12.4.0.0 IP Mask: 255.254.0.0

00001100 00000100 00000000 00000000

11111111 11111110 00000000 00000000

Address

Mask

for hosts Network Prefix

Use two 32 bit numbers to represent a network. Network number = IP address + Mask

Usually written as 12.4.0.0/15

Page 17: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

Which IP Addresses are Covered by a Prefix?

00001100 00000100 00000000 00000000

11111111 11111110 00000000 0000000012.4.0.0/15

00001100 00000101 00001001 00010000

00001100 00000111 00001001 00010000

12.5.9.16

12.7.9.16

12.5.9.16 is covered by prefix 12.4.0.0/15

12.7.9.16 is not covered by prefix 12.4.0.0/15

Page 18: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

18

CIDR = Hierarchy in Addressing

12.0.0.0/8

12.0.0.0/16

12.254.0.0/16

12.1.0.0/16

12.2.0.0/16

12.3.0.0/16

:::

12.253.0.0/16

12.3.0.0/2412.3.1.0/24

::

12.3.254.0/24

12.253.0.0/1912.253.32.0/1912.253.64.0/19

12.253.96.0/1912.253.128.0/1912.253.160.0/1912.253.192.0/19

:::

Page 19: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

Classless Forwarding

Destination =12.5.9.16------------------------------- payload

Prefix Interface Next Hop

12.0.0.0/8 10.14.22.19 ATM 5/0/8

12.4.0.0/15

12.5.8.0/23 attached

Ethernet 0/1/3

Serial 1/0/7

10.1.3.77

IP Forwarding Table

0.0.0.0/0 10.14.11.33 ATM 5/0/9

even better

OK

better

best!

Page 20: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

IP Address Allocation and Assignment: Internet Registries

IANAwww.iana.org

RFC 2050 - Internet Registry IP Allocation Guidelines RFC 1918 - Address Allocation for Private Internets RFC 1518 - An Architecture for IP Address Allocation with CIDR

ARINwww.arin.org

APNICwww.apnic.org

RIPEwww.ripe.org

Allocate to National and local registries and ISPs Addresses assigned to customers by ISPs

Page 21: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

21

Nontransit vs. Transit ASes

ISP 1ISP 2

Nontransit ASmight be a corporateor campus network.Could be a “content provider”

NET ATraffic NEVER flows from ISP 1through NET A to ISP 2(At least not intentionally!)

IP traffic

Internet Serviceproviders (often)have transit networks

Page 22: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

22

Selective Transit

NET BNET C

NET A provides transitbetween NET B and NET Cand between NET D and NET C

NET A

NET D

NET A DOES NOTprovide transitBetween NET D and NET B

Most transit networks transit in a selective manner…

IP traffic

Page 23: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

Customers and Providers

Customer pays provider for access to the Internet

provider

customer

IP trafficprovider customer

Page 24: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

Customer-Provider Hierarchy

IP trafficprovider customer

Page 25: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

The Peering Relationship

peer peer

customerprovider

Peers provide transit between their respective customers

Peers do not provide transit between peers

Peers (often) do not exchange $$$trafficallowed

traffic NOTallowed

Page 26: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

Peering Provides Shortcuts

Peering also allows connectivity betweenthe customers of “Tier 1” providers.

peer peer

customerprovider

Page 27: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

Peering Wars

• Reduces upstream transit costs

• Can increase end-to-end performance

• May be the only way to connect your customers to some part of the Internet (“Tier 1”)

• You would rather have customers

• Peers are usually your competition

• Peering relationships may require periodic renegotiation

Peering struggles are by far the most contentious issues in the ISP world!

Peering agreements are often confidential.

Peer Don’t Peer

Page 28: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

29

BGP-4• BGP = Border Gateway Protocol

• Is a Policy-Based routing protocol

• Is the de facto EGP of today’s global Internet

• Relatively simple protocol, but configuration is complex and the

entire world can see, and be impacted by, your mistakes.

• 1989 : BGP-1 [RFC 1105]– Replacement for EGP (1984, RFC 904)

• 1990 : BGP-2 [RFC 1163]

• 1991 : BGP-3 [RFC 1267]

• 1995 : BGP-4 [RFC 1771] – Support for Classless Interdomain Routing

(CIDR)

Page 29: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

30

BGP Operations (Simplified)

Establish session on TCP port 179

Exchange all active routes

Exchange incremental updates

AS1

AS2

While connection is ALIVE exchangeroute UPDATE messages

BGP session

Page 30: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

31

Four Types of BGP Messages

• Open : Establish a peering session.

• Keep Alive : Handshake at regular intervals.

• Notification : Shuts down a peering session.

• Update : Announcing new routes or withdrawing

previously announced routes.

announcement = prefix + attributes values

Page 31: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

Attributes are Used to Select Best Routes

192.0.2.0/24pick me!

192.0.2.0/24pick me!

192.0.2.0/24pick me!

192.0.2.0/24pick me!

Given multipleroutes to the sameprefix, a BGP speakermust pick at mostone best route

(Note: it could reject them all!)

Page 32: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

34

Two Types of BGP Neighbor Relationships

• External Neighbor (eBGP) in a different Autonomous Systems

• Internal Neighbor (iBGP) in the same Autonomous System AS1

AS2

eBGP

iBGP

iBGP is routed (using IGP!)

Page 33: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

35

iBGP Peers Must be Fully Meshed

iBGP neighbors do not announce routes received via iBGP to other iBGPneighbors.

eBGP update

iBGP updates

• iBGP is needed to avoid routing loops within an AS

• Injecting external routes into IGP does not scale and causes BGP policy information to be lost

• BGP does not provide “shortest path” routing

• Is iBGP an IGP? NO!

Page 34: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

36

BGP Next Hop Attribute

Every time a route announcement crosses an AS boundary, the Next Hop attribute is changed to the IP address of the border router that announced the route.

AS 6431AT&T Research

135.207.0.0/16Next Hop = 12.125.133.90

AS 7018AT&T

AS 12654RIPE NCCRIS project

12.125.133.90

135.207.0.0/16Next Hop = 12.127.0.121

12.127.0.121

Page 35: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

Forwarding Table

Forwarding Table

Join EGP with IGP For Connectivity

AS 1 AS 2192.0.2.1

135.207.0.0/16

10.10.10.10

EGP

192.0.2.1135.207.0.0/16

destination next hop

10.10.10.10192.0.2.0/30

destination next hop

135.207.0.0/16Next Hop = 192.0.2.1

192.0.2.0/30

135.207.0.0/16

destination next hop

10.10.10.10

+

192.0.2.0/30 10.10.10.10

Page 36: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

Forwarding Table

Forwarding Table

Next Hop Often Rewritten to Loopback

AS 1 AS 2192.0.2.1

135.207.0.0/16

10.10.10.10

EGP

127.22.33.44135.207.0.0/16

destination next hop

10.10.10.10127.22.33.44

destination next hop

135.207.0.0/16

destination next hop

10.10.10.10

+

127.22.33.44

10.10.10.10

127.22.33.44

135.207.0.0/16Next Hop = 192.0.2.1

135.207.0.0/16Next Hop = 127.22.33.44

Page 37: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

Implementing Customer/Provider and Peer/Peer relationships

• Enforce transit relationships – Outbound route filtering

• Enforce order of route preference– provider < peer < customer

Two parts:

Page 38: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

Import Routes

Frompeer

Frompeer

Fromprovider

Fromprovider

From customer

From customer

provider route customer routepeer route ISP route

Page 39: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

Export Routes

Topeer

Topeer

Tocustomer

Tocustomer

Toprovider

From provider

provider route customer routepeer route ISP route

filtersblock

Page 40: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

42

How Can Routes be Colored?BGP Communities!

A community value is 32 bits

By convention, first 16 bits is ASN indicating who is giving itan interpretation

communitynumber

Very powerful BECAUSE it has no (predefined) meaning

Community Attribute = a list of community values.(So one route can belong to multiple communities)

RFC 1997 (August 1996)

Used for signallywithin and betweenASes

Two reserved communities

no_advertise 0xFFFFFF02: don’t pass to BGP neighbors

no_export = 0xFFFFFF01: don’t export out of AS

Page 41: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

Communities Example

• 1:100– Customer routes

• 1:200– Peer routes

• 1:300– Provider Routes

• To Customers– 1:100, 1:200, 1:300

• To Peers– 1:100

• To Providers– 1:100

AS 1

Import Export

Page 42: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

Mars Attacks!

• 0.0.0.0/0: default• 10.0.0.0/8: private• 172.16.0.0/12: private• 192.168.0.0/16: private• 127.0.0.0/8: loopbacks• 128.0.0.0/16: IANA reserved • 192.0.2.0/24: test networks• 224.0.0.0/3: classes D and E• …..

Martian list often includes

Page 43: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

Import Routes (Revisited)

Frompeer

Frompeer

Fromprovider

Fromprovider

From customer

From customer

provider route customer routepeer route ISP route

filtersmartians

xxxxxx

xxxxxx

xxxxxx

xxxxxxxxxxxx

xxxxxxxxxxxx

Customeraddress filters

cccccccccccccccccc

potentialbackhole

Martian

Page 44: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

47

So Many Choices

Which route shouldFrank pick to 13.13.0.0./16?

AS 1

AS 2

AS 4

AS 3

13.13.0.0/16

Frank’s Internet Barn

peer peer

customerprovider

Page 45: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

48

BGP Route Processing

Best Route Selection

Apply Import Policies

Best Route Table

Apply Export Policies

Install forwardingEntries for bestRoutes.

ReceiveBGPUpdates

BestRoutes

TransmitBGP Updates

Apply Policy =filter routes & tweak attributes

Based onAttributeValues

IP Forwarding Table

Apply Policy =filter routes & tweak attributes

Open ended programming.Constrained only by vendor configuration language

Page 46: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

Tweak Tweak Tweak

• For inbound traffic– Filter outbound routes– Tweak attributes on

outbound routes in the hope of influencing your neighbor’s best route selection

• For outbound traffic– Filter inbound routes– Tweak attributes on

inbound routes to influence best route selection

outboundroutes

inboundroutes

inboundtraffic

outboundtraffic

In general, an AS has morecontrol over outbound traffic

Page 47: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

Route Selection Summary

Highest Local Preference

Shortest ASPATH

Lowest MED

i-BGP < e-BGP

Lowest IGP cost to BGP egress

Lowest router ID

traffic engineering

Enforce relationships

Throw up hands andbreak ties

Page 48: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

51

Back to Frank …

AS 1AS 2

AS 4

AS 3

13.13.0.0/16

peer peer

customerprovider

local pref = 80

local pref = 100

local pref = 90

Higher Localpreference valuesare more preferred

Local preference only used in iBGP

Page 49: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

52

Implementing Backup Links with Local Preference (Outbound Traffic)

Forces outbound traffic to take primary link, unless link is down.

AS 1

primary link backup link

Set Local Pref = 100for all routes from AS 1 AS 65000

Set Local Pref = 50for all routes from AS 1

We’ll talk about inbound traffic soon …

Page 50: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

53

Multihomed Backups (Outbound Traffic)

Forces outbound traffic to take primary link, unless link is down.

AS 1

primary link backup link

Set Local Pref = 100for all routes from AS 1

AS 2

Set Local Pref = 50for all routes from AS 3

AS 3provider provider

Page 51: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

54

ASPATH Attribute

AS7018135.207.0.0/16AS Path = 6341

AS 1239Sprint

AS 1755Ebone

AT&T

AS 3549Global Crossing

135.207.0.0/16AS Path = 7018 6341

135.207.0.0/16AS Path = 3549 7018 6341

AS 6341

135.207.0.0/16

AT&T Research

Prefix Originated

AS 12654RIPE NCCRIS project

AS 1129Global Access

135.207.0.0/16AS Path = 7018 6341

135.207.0.0/16AS Path = 1239 7018 6341

135.207.0.0/16AS Path = 1755 1239 7018 6341

135.207.0.0/16AS Path = 1129 1755 1239 7018 6341

Page 52: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

55

Interdomain Loop Prevention

BGP at AS YYY will never accept a route with ASPATH containing YYY.

AS 7018

12.22.0.0/16ASPATH = 1 333 7018 877

Don’t Accept!

AS 1

Page 53: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

Traffic Often Follows ASPATH

AS 4AS 3AS 2AS 1135.207.0.0/16

135.207.0.0/16ASPATH = 3 2 1

IP Packet Dest =135.207.44.66

Page 54: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

… But It Might Not

AS 4AS 3AS 2AS 1135.207.0.0/16

135.207.0.0/16ASPATH = 3 2 1

IP Packet Dest =135.207.44.66

AS 5

135.207.44.0/25ASPATH = 5

135.207.44.0/25

AS 2 filters allsubnets with maskslonger than /24

135.207.0.0/16ASPATH = 1

From AS 4, it may look like thispacket will take path 3 2 1, but it actually takespath 3 2 5

Page 55: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

In fairness: could you do this “right” and still scale?

Exporting internalstate would dramatically increase global instability and amount of routingstate

Shorter Doesn’t Always Mean Shorter

AS 4

AS 3

AS 2

AS 1

Mr. BGP says that path 4 1 is better than path 3 2 1

Duh!

Page 56: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

59

Shedding Inbound Traffic with ASPATH Padding Hack

Padding will (usually) force inbound traffic from AS 1to take primary link

AS 1

192.0.2.0/24ASPATH = 2 2 2

customerAS 2

provider

192.0.2.0/24

backupprimary

192.0.2.0/24ASPATH = 2

Page 57: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

60

Padding May Not Shut Off All Traffic

AS 1

192.0.2.0/24ASPATH = 2 2 2 2 2 2 2 2 2 2 2 2 2 2

customerAS 2

provider

192.0.2.0/24

192.0.2.0/24ASPATH = 2

AS 3provider

AS 3 will sendtraffic on “backup”link because it prefers customer routes and localpreference is considered before ASPATH length!

Padding in this way is oftenused as a form of loadbalancing

backupprimary

Page 58: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

61

COMMUNITY Attribute to the Rescue!

AS 1

customerAS 2

provider

192.0.2.0/24

192.0.2.0/24ASPATH = 2

AS 3provider

backupprimary

192.0.2.0/24ASPATH = 2 COMMUNITY = 3:70

Customer import policy at AS 3:If 3:90 in COMMUNITY then set local preference to 90If 3:80 in COMMUNITY then set local preference to 80If 3:70 in COMMUNITY then set local preference to 70

AS 3: normal customer local pref is 100,peer local pref is 90

Page 59: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

62

Hot Potato Routing: Go for the Closest Egress Point

192.44.78.0/24

15 56 IGP distances

egress 1 egress 2

This Router has two BGP routes to 192.44.78.0/24.

Hot potato: get traffic off of your network as Soon as possible. Go for egress 1!

Page 60: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

63

Getting Burned by the Hot Potato

15 56

172865High bandwidth

Provider backbone

Low bandwidthcustomer backbone

Heavy Content Web Farm

Many customers want their provider to carry the bits!

tiny http request

huge http reply

SFF NYC

San Diego

Page 61: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

64

Cold Potato Routing with MEDs(Multi-Exit Discriminator Attribute)

15 56

172865

Heavy Content Web Farm

192.44.78.0/24

192.44.78.0/24MED = 15

192.44.78.0/24MED = 56

This means that MEDs must be considered BEFOREIGP distance!

Prefer lower MED values

Note1 : some providers will not listen to MEDs

Note2 : MEDs need not be tied to IGP distance

Page 62: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

Route Selection Summary

Highest Local Preference

Shortest ASPATH

Lowest MED

i-BGP < e-BGP

Lowest IGP cost to BGP egress

Lowest router ID

traffic engineering

Enforce relationships

Throw up hands andbreak ties

This is somewhat simplified. Hey, what happened to ORIGIN??

Page 63: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

Policies Can Interact Strangely(“Route Pinning” Example)

backup

Disaster strikes primary linkand the backup takes over

Primary link is restored but sometraffic remains pinned to backup

1 2

3 4

Install backup link using community

customer

Page 64: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

News At 11:00

• BGP is not guaranteed to converge on a stable routing. Policy interactions could lead to “livelock” protocol oscillations. See “Persistent Route Oscillations in Inter-domain Routing” by K. Varadhan,

R. Govindan, and D. Estrin. ISI report, 1996 • Corollary: BGP is not guaranteed to recover

from network failures.

Page 65: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

What Problem is BGP solving?

Underlying problem

Shortest Paths

Distributed means of computing a solution.

X?

RIP, OSPF, IS-IS

BGP

• aid in the design of policy analysis algorithms and heuristics• aid in the analysis and design of BGP and extensions• help explain some BGP routing anomalies • provide a fun way of thinking about the protocol

X could

A Wee Bit of Theory

Page 66: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

Separate dynamic and static semantics

SPVP = Simple Path Vector Protocol = a distributed algorithm for solving SPP

BGP

SPVP

Booo Hooo, Many, many complications...

BGP Policies

Stable Paths Problem (SPP)

staticsemantics

dynamicsemantics

See [Griffin, Shepherd, Wilfong]

Page 67: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

1

An instance of the Stable Paths Problem (SPP)

2 5 5 2 1 0

0

2 1 02 0

1 3 01 0

3 0

4 2 04 3 0

3

4

2

1

• A graph of nodes and edges, • Node 0, called the origin, • For each non-zero node, a set

or permitted paths to the origin. This set always contains the “null path”.

• A ranking of permitted paths at each node. Null path is always least preferred. (Not shown in diagram)

When modeling BGP : nodes represent BGP speaking routers, and 0 represents a node originating some address block

most preferred…least preferred (not null)

Yes, the translation gets messy!

Page 68: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

5 5 2 1 0

1

A Solution to a Stable Paths Problem

2

0

2 1 02 0

1 3 01 0

3 0

4 2 04 3 0

3

4

2

1

• node u’s assigned path is either the

null path or is a path uwP, where wP

is assigned to node w and {u,w} is

an edge in the graph,

• each node is assigned the highest

ranked path among those consistent

with the paths assigned to its

neighbors.

A Solution need not represent a shortest path tree, or a spanning tree.

A solution is an assignment of permitted paths to each node such that

Page 69: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

An SPP may have multiple solutions

First solution

1

0

2

1 2 01 0

1

0

2

1

0

2

2 1 02 0

1 2 01 0

2 1 02 0

1 2 01 0

2 1 02 0

Second solutionDISAGREE

Page 70: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

BAD GADGET : No Solution

2

0

31

2 1 02 0

1 3 01 0

3 2 03 0

4

3

Page 71: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

SURPRISE : Beware of Backup Policies

2

0

31

2 1 02 0

1 3 01 0

3 4 2 03 0

4

4 04 2 04 3 0

Becomes a BAD GADGET if link (4, 0) goes down.

BGP is not robust : it is not guaranteed to recover from network failures.

Page 72: Slides Selected from SIGCOMM 2001 BGP Tutorial by Tim Griffin

PRECARIOUS

1

0

2

1 2 01 0

2 1 02 0

3

4

5 6

5 3 1 05 6 3 1 2 05 3 1 2 0

6 3 1 06 4 3 1 2 06 3 1 2 0

4 3 1 04 5 3 1 2 04 3 1 2 0

3 1 03 1 2 0

As with DISAGREE, this part has two distinct solutions

This part has a solution only when node 1 is assigned the direct path (1 0).

Has a solution, but can get “trapped”