Characterizing the Internet Hierarchy from Multiple Vantage Points

Jennifer Rexford

Internet and Networking Systems

AT&T Labs - Research; Florham Park, NJ

http://www.research.att.com/~jrex

Work with L. Subramanian, S. Agarwal, and R. Katz

http://www.cs.berkeley.edu/~sagarwal/research/BGP-hierarchy/



Outline

Internet architecture

– ASes, IP addressing, BGP routing, and AS relationships

Type-of-relationship problem

– Motivation, formulation, and practical challenges

Analyzing partial views of the AS graph

– Assigning a rank to each AS from a single vantage point

– Comparing ranks of ASes across multiple vantage points

Analysis results

– BGP routing data and inferred AS relationships

– AS paths that are inconsistent with the inferences

– Five-level classification of the Internet hierarchy

Conclusions

Internet Architecture

Divided into Autonomous Systems

– Distinct regions of administrative control (~11,000)

– Set of routers and links managed by a single institution

– Service provider, company, university, …

Hierarchy of Autonomous Systems

– Large, tier-1 provider with a nationwide backbone

– Medium-sized regional provider with smaller backbone

– Small stub network run by a company or university

Interaction between Autonomous Systems

– Internal topology is not shared between ASes

– … but, neighboring ASes interact to coordinate routing

Autonomous Systems (ASes)

[Figure: AS-level graph with ASes numbered 1 to 7, a client, and a web server.]

Path: 6, 5, 4, 3, 2, 1

IP Addressing and Prefixes

32 bits in dotted-quad notation (12.34.158.5)

Divided into network and host portions

12.34.158.0/23 is a 23-bit prefix with 2^9 (512) addresses

Decimal: 12 . 34 . 158 . 5

Binary: 00001100 00100010 10011110 00000101

Network portion: first 23 bits; host portion: last 9 bits
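The prefix arithmetic on this slide can be checked with Python's standard `ipaddress` module. This is only an illustrative sketch; the variable names are ours and do not come from the talk.

```python
import ipaddress

# The prefix and host address used on the slide.
prefix = ipaddress.ip_network("12.34.158.0/23")
host = ipaddress.ip_address("12.34.158.5")

# A /23 leaves 32 - 23 = 9 host bits, so the prefix covers 2**9 addresses.
host_bits = prefix.max_prefixlen - prefix.prefixlen
num_addresses = prefix.num_addresses

# The 32-bit binary form of the host address, grouped into octets.
bits = format(int(host), "032b")
octets = " ".join(bits[i:i + 8] for i in range(0, 32, 8))

print(host_bits, num_addresses, host in prefix)
print(octets)
```

The membership test `host in prefix` compares only the first 23 bits, which is the same network/host split drawn on the slide.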

Interdomain Routing with BGP (Between ASes)

ASes announce info about prefixes they can reach

Local policies for path selection (which to use?)

Local policies for route propagation (who to tell?)

Policies configured by the AS’s network operator

[Figure: ASes 1, 2, and 3, with host 12.34.158.5 and two announcements.]

“I can reach 12.34.158.0/23”

“I can reach 12.34.158.0/23 via AS 1”
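The two announcements can be modeled as a route whose AS path grows by one hop each time it is passed on. The sketch below is a simplification under our own assumptions: `Route`, `originate`, and `propagate` are hypothetical names, and real BGP carries many more attributes and applies the local policies listed above.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Route:
    prefix: str
    as_path: Tuple[int, ...]  # nearest AS first, originating AS last


def originate(prefix: str, own_as: int) -> Route:
    """Route announced by the AS that can reach the prefix directly."""
    return Route(prefix, (own_as,))


def propagate(route: Route, own_as: int) -> Route:
    """Route as re-announced by own_as, which prepends its own AS number."""
    return Route(route.prefix, (own_as,) + route.as_path)


from_as1 = originate("12.34.158.0/23", 1)  # "I can reach 12.34.158.0/23"
from_as2 = propagate(from_as1, 2)          # "... via AS 1"
print(from_as2.as_path)
```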

Customer-Provider Relationship

Customer pays provider for access to the Internet

AS exports customer’s routes to all neighbors

AS exports provider’s routes only to its customers

[Figure: two panels, “Traffic to the customer” and “Traffic from the customer”, each with a provider, a customer, and a destination d; arrows mark advertisements and traffic.]

Peer-Peer Relationship

Peers exchange traffic between their customers

Free of charge (assumption of even traffic load)

AS exports a peer’s routes only to its customers

[Figure: “Traffic to/from the peer and its customers”, with two peers and a destination d; arrows mark advertisements and traffic.]
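The export rules for customer, provider, and peer routes fit in one small decision function. This is a minimal sketch of those three rules only; `should_export` and the string constants are names chosen for illustration.

```python
# Relationship of a neighbor as seen from the exporting AS.
CUSTOMER, PEER, PROVIDER = "customer", "peer", "provider"


def should_export(learned_from: str, export_to: str) -> bool:
    """Return True if a route learned from one kind of neighbor may be
    announced to another kind of neighbor."""
    if learned_from == CUSTOMER:
        # Customer routes are exported to all neighbors.
        return True
    # Routes learned from a provider or a peer go only to customers.
    return export_to == CUSTOMER


for src in (CUSTOMER, PEER, PROVIDER):
    allowed = [dst for dst in (CUSTOMER, PEER, PROVIDER) if should_export(src, dst)]
    print(src, "->", allowed)
```

One consequence is that an AS never carries traffic between two of its peers or providers, which is what later rules out "transit through a peer".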

AS Relationships Matter

Motivating problems

– Placement of servers for content distribution network

– Selection of new peers or providers for an AS

– Analyzing the convergence properties of the BGP protocol

– Installing route filters to protect against misconfiguration

– Understanding of the basic structure of the Internet

Knowing the AS graph is not enough

– Interdomain routing is not shortest-path routing

– Some paths not allowed (e.g., transit through a peer)

– Local preference of paths (e.g., prefer customer path)

– Node degree does not define the Internet hierarchy

Need to know the relationship between AS pairs

Inferring Relationships from Routing Data

Practical realities of the Internet

– AS graph is not known

– AS relationships are proprietary

– … at least some routing data is publicly available!

Exploiting routing data

– Available via traceroute experiments or BGP tables

– Provides a set of AS paths, such as “701 7018 46”

– Implies existence of edges (701, 7018) and (7018, 46)

– Implies that 7018 (AT&T) allows AS 701 (UUNet) to transit to AS 46 (Rutgers)
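Turning AS paths into graph edges is a pairwise walk over each path. A minimal sketch, assuming paths arrive as space-separated AS numbers as in the example above; `edges_from_paths` is a hypothetical helper, and skipping repeated AS numbers is our own assumption about how to treat a path that lists the same AS twice in a row.

```python
def edges_from_paths(as_paths):
    """Collect the undirected AS-level edges implied by a set of AS paths."""
    edges = set()
    for path in as_paths:
        hops = [int(token) for token in path.split()]
        for a, b in zip(hops, hops[1:]):
            if a != b:  # assumption: ignore an AS repeated back to back
                edges.add(frozenset((a, b)))
    return edges


paths = ["701 7018 46"]
edges = edges_from_paths(paths)
print(sorted(tuple(sorted(edge)) for edge in edges))
```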

Valid and Invalid Paths

AS relationships limit the kinds of valid paths

– Uphill portion: customer-provider relationships

– Plateau: zero or one peer-peer edge

– Downhill portion: provider-customer relationships

[Figure: three example paths, one labeled valid and two labeled invalid.]

Lixin Gao, “On inferring Autonomous System relationships in the Internet,” IEEE/ACM Transactions on Networking, December 2001.
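The uphill, plateau, downhill shape can be checked with a single pass over a path. The sketch below assumes every edge already carries a label; the AS numbers and labels are made up for illustration and the function name `is_valid_path` is ours.

```python
# Edge labels, read in the direction the path is traversed.
C2P = "customer-to-provider"   # uphill
P2P = "peer-to-peer"           # plateau
P2C = "provider-to-customer"   # downhill


def is_valid_path(path, label):
    """Check the uphill / plateau / downhill shape of an AS path.

    `label` maps a directed pair (a, b) to C2P, P2P, or P2C.
    """
    descending = False  # becomes True after a peer or downhill edge
    for a, b in zip(path, path[1:]):
        kind = label[(a, b)]
        if descending and kind != P2C:
            return False  # climbing or peering again after the peak
        if kind in (P2P, P2C):
            descending = True
    return True


# Hypothetical labels for illustration only; not inferred from real data.
label = {
    (1, 2): C2P, (2, 1): P2C,
    (2, 3): P2P, (3, 2): P2P,
    (3, 4): P2C, (4, 3): C2P,
    (1, 5): C2P, (5, 1): P2C,
}
print(is_valid_path([1, 2, 3, 4], label))  # uphill, peer edge, downhill
print(is_valid_path([2, 1, 5], label))     # downhill then uphill
```

In the second path, AS 1 would be carrying traffic between two of its providers, which the export rules do not allow.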

Type-of-Relationship Problem

Given the inputs

– AS graph G(V,E) with vertices V and edges E

– Set of paths P on the graph G

Find a solution that

– Labels each edge with an AS relationship

– Minimizes the number of invalid paths in P

Properties of the problem

– NP-complete (?)

– May have multiple solutions

– We propose a heuristic algorithm
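To make the objective concrete, the sketch below solves tiny instances by exhaustive search: it tries all 3^|E| labelings and keeps one with the fewest invalid paths. This is explicitly not the heuristic proposed in the talk, and it is only feasible for a handful of edges; the function names and toy paths are assumptions for illustration.

```python
from itertools import product

C2P, P2P, P2C = "c2p", "p2p", "p2c"
REVERSE = {C2P: P2C, P2C: C2P, P2P: P2P}


def count_invalid(paths, labeling):
    """Number of paths that are not uphill, then at most one peer edge, then downhill."""
    invalid = 0
    for path in paths:
        descending = False
        for a, b in zip(path, path[1:]):
            # Each edge is stored once, keyed as (smaller AS, larger AS).
            kind = labeling[(a, b)] if (a, b) in labeling else REVERSE[labeling[(b, a)]]
            if descending and kind != P2C:
                invalid += 1
                break
            descending = descending or kind in (P2P, P2C)
    return invalid


def brute_force_tor(paths):
    """Try every labeling of the edges; return (fewest invalid paths, a labeling)."""
    edges = sorted({(min(a, b), max(a, b)) for p in paths for a, b in zip(p, p[1:])})
    best = None
    for kinds in product((C2P, P2P, P2C), repeat=len(edges)):
        labeling = dict(zip(edges, kinds))
        score = count_invalid(paths, labeling)
        if best is None or score < best[0]:
            best = (score, labeling)
    return best


paths = [[1, 2, 3], [3, 2, 1], [1, 2, 4]]
score, labeling = brute_force_tor(paths)
print(score, labeling)
```

Several labelings reach the same minimum on this toy input, which illustrates why the problem may have multiple solutions.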

Practical Challenges

Peer-peer relationships are hard to infer

– Mislabeling a peer-peer edge as provider-customer does not change a valid path into an invalid path

– We use heuristics to detect the peer-peer edges

Some AS pairs have unusual relationships

– Sibling ASes that provide transit service for each other

– Backup relationship for connectivity under failure

– Misconfiguration of a conventional AS relationship

– We detect these cases by analyzing the “invalid” paths

Getting access to a large path set P is hard

– We exploit BGP routing tables from multiple vantage points

Validation Approaches

Quantify the number of invalid paths

– Small number suggests better results

– … still, this doesn’t mean that inferences are correct

Compare results with other inference algorithms

– Higher confidence if inferences are the same

– … still, both algorithms could give wrong answers

Compare results with Routing Arbiter Database

– Higher confidence if consistent with RADB routing policies

– … still, RADB information is incomplete and out-of-date

Compare results with proprietary ISP data

– Higher confidence if answers are correct for this AS

– … still, answers may be wrong for other ASes

Partial View of the AS Graph

Routing data from a single source AS

– Collection of paths starting from the source

– Directed graph from union of all edges in these paths

[Figure: the actual graph over ASes A to F, followed by two partial views of the same ASes.]

Assigning Rank to AS in a Partial View

Reverse pruning algorithm to assign rank

– Rank 1 to the leaves, then remove leaves

– Rank 2 to the new leaves, then remove leaves, and so on

– Single (largest) rank to nodes in connected component, if any

[Figure: the two partial views of ASes A to F with a rank from 1 to 5 next to each node.]
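The reverse pruning steps can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: edges are directed along each path away from the source AS, a leaf is a node with no outgoing edge to a node that is still present, and the example paths are invented because the original figure is not recoverable.

```python
def reverse_prune(paths):
    """Assign a rank to every AS seen in the paths from one vantage point."""
    successors = {}
    for path in paths:
        for node in path:
            successors.setdefault(node, set())
        for a, b in zip(path, path[1:]):
            if a != b:
                successors[a].add(b)

    rank = {}
    level = 1
    remaining = set(successors)
    while remaining:
        # Leaves have no outgoing edge to a node that is still present.
        leaves = {n for n in remaining if not (successors[n] & remaining)}
        if not leaves:
            # Everything left sits on or upstream of a cycle: one shared rank.
            for n in remaining:
                rank[n] = level
            break
        for n in leaves:
            rank[n] = level
        remaining -= leaves
        level += 1
    return rank


# Hypothetical paths from source AS "A".
paths = [["A", "B", "D"], ["A", "C", "D"], ["A", "B", "E"]]
rank = reverse_prune(paths)
print(rank)
```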

Combining Information From Multiple Views

Vector of ranks for each AS

– A single element for each of the n views

Dominance: provider-customer relationship

– Provider has higher ranks than customer in most views

– For example, B has (2,5) and A has (1,1)

Equivalence: peer-peer relationship

– Peers have equal ranks or inconsistent ranks across the views

– For example, C has (3,4) and D has (4,3)

Probabilistic inference

– Thresholds to tolerate some variations across the views

– E.g., an AS dominates in n-1 views and is dominated in 1
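The dominance and equivalence rules can be sketched as a comparison of two rank vectors. The `tolerance` parameter below is a hypothetical stand-in for the thresholds mentioned above; the actual threshold values used in the study are not given on the slide, and this sketch always returns a verdict, whereas the study left some edges unclassified.

```python
def compare_ranks(ranks_a, ranks_b, tolerance=0):
    """Classify the edge between two ASes from their rank vectors.

    Returns "a-provider", "b-provider", or "peer". `tolerance` is the number
    of views allowed to disagree with a dominance verdict (an assumed knob).
    """
    if len(ranks_a) != len(ranks_b):
        raise ValueError("rank vectors must cover the same views")
    n = len(ranks_a)
    a_higher = sum(1 for x, y in zip(ranks_a, ranks_b) if x > y)
    b_higher = sum(1 for x, y in zip(ranks_a, ranks_b) if y > x)
    if a_higher >= n - tolerance and a_higher > b_higher:
        return "a-provider"
    if b_higher >= n - tolerance and b_higher > a_higher:
        return "b-provider"
    return "peer"  # equal or inconsistent ranks


print(compare_ranks((2, 5), (1, 1)))  # B versus A from the slide
print(compare_ranks((3, 4), (4, 3)))  # C versus D from the slide
```

With `tolerance=1`, an AS that ranks higher in n-1 views and lower in one view is still treated as the provider.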

Applying Our Algorithm

Applying the algorithm to ten public BGP tables

– RouteViews table and nine Looking Glass servers

– Extracted set of unique paths P for each view

– Applied reverse pruning algorithm to each view

– Applied inference rules to the vectors of ranks

Results of the analysis on data from April 2001

– AS graph with 10,698 ASes and 23,935 edges

– Inferences were made for 99.2% of the edges

– 94.5% provider-customer and 4.7% peer-peer edges

– Most inferences do not require the probabilistic rules

Advantage of Multiple Vantage Points

A single vantage point is not enough

– 15% of the edges appear in exactly one BGP table

– Only 25% of the edges appear in all ten BGP tables
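Figures like these come from counting, for each edge, how many tables it appears in. A small sketch with toy data follows; `edge_coverage` is a hypothetical helper and the AS numbers are placeholders, not values from the study.

```python
from collections import Counter


def edge_coverage(views):
    """Map k to the fraction of edges that appear in exactly k of the views.

    `views` is a list of edge sets, one per BGP table.
    """
    appearances = Counter()
    for edges in views:
        for edge in set(edges):
            appearances[edge] += 1
    histogram = Counter(appearances.values())
    total = len(appearances)
    return {k: histogram[k] / total for k in sorted(histogram)}


# Three toy views; edges are frozensets so (a, b) and (b, a) count as one edge.
views = [
    {frozenset((1, 2)), frozenset((2, 3))},
    {frozenset((1, 2)), frozenset((2, 4))},
    {frozenset((1, 2)), frozenset((2, 3))},
]
coverage = edge_coverage(views)
print(coverage)
```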

Analyzing Invalid Paths

Checking the validity of inferences

– Assume the relationship inferences are correct

– Identify paths that are invalid under these inferences

– Compute the number of invalid paths

– Investigate common anomaly triples (A, B, C)

Results of our analysis

– Applied to paths in 2 of the original 10 BGP tables

– Applied to paths in 4 other BGP tables

– 0.5-3% of paths are invalid for five of the six tables

– 8.7% of paths are invalid for the KDDI table
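One way to find anomaly triples is to look at each pair of consecutive edges: once a path has crossed a peer or downhill edge into B, the next edge out of B must be downhill. The sketch below counts the triples that break this rule; the labels and AS numbers are invented for illustration and the function name is ours.

```python
from collections import Counter

C2P, P2P, P2C = "c2p", "p2p", "p2c"


def anomaly_triples(paths, label):
    """Count triples (A, B, C) where B carries traffic it should not.

    A triple breaks the path rules when the edge (A, B) is a peer or
    downhill edge and the next edge (B, C) is not downhill.
    `label` maps each directed pair to C2P, P2P, or P2C.
    """
    triples = Counter()
    for path in paths:
        for a, b, c in zip(path, path[1:], path[2:]):
            if label[(a, b)] in (P2P, P2C) and label[(b, c)] != P2C:
                triples[(a, b, c)] += 1
    return triples


# Hypothetical inferred labels and paths.
label = {(10, 20): P2C, (20, 30): C2P, (30, 40): P2C}
paths = [[10, 20, 30], [10, 20, 30, 40], [20, 30, 40]]
triples = anomaly_triples(paths, label)
print(triples.most_common(1))
```

Triples that recur across many paths are the ones worth investigating by hand.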

Common Anomaly Patterns

Misconfiguration

– (1, 65112, 6461): 65112 is a private AS that should not appear between Genuity and AboveNet

Sibling relationships

– (7018, 6841, 3300): Infonet Europe merged with AUCS

– (1239, 1740, 7018): Cerfnet was acquired by AT&T

– (1239, 8043, 6395): IXC Communications acquired SmartNAP and was renamed Broadwing

Heuristic for identifying sibling relationships

– AS pair that appears in a large number of “invalid” paths

– Our analysis identified 22 possible sibling relationships

Digression: Really Weird “Invalid” Paths…

AS path: 1 701 703 9304 7018, with 1 (Genuity), 701 and 703 (UUNet), 9304 (Hutchinson), 7018 (AT&T)

Properties of the path

– Two tier-1 U.S. providers (Genuity and UUNet)

– One service provider in Hong Kong (Hutchinson)

– Another tier-1 U.S. provider (AT&T) at the end of the path

Looking at internal AT&T configuration data…

– AT&T does not have a BGP session with AS 9304

– AT&T does not originate the prefixes (e.g., 152.141.116.0/24)

Explanation

– Another AS was using the AT&T AS number (for over a year!)

– We sent them an e-mail and asked them to stop, and they did

Digression: How Could This Happen, and Persist?

BGP configuration is done locally by neighbors

– Customer configures its router with AS number 7018

– Provider configures its router with neighbor of 7018

The misconfiguration didn’t necessarily cause a problem

– Hop-by-hop routing took the traffic to the right place

– Most BGP policies don’t look at the identity of the ASes

Could have caused a problem: route filtering

– Large providers might apply filtering to customer routers

– Discard routes with other large providers in the path

Could have caused a problem: loop detection

– The bogus routes did not appear in AT&T’s routing tables

– AT&T router saw 7018 in the path and discarded the route

– AT&T router did have a route for the supernet (152.141.0.0/16)
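The loop-detection behavior described here is the standard BGP check that a router rejects any route whose AS path already contains its own AS number. A minimal sketch follows; `accept_route` is a hypothetical name, and 65000 is an arbitrary private AS number used only as a contrast case.

```python
def accept_route(own_as, as_path):
    """BGP-style loop detection: reject a route whose AS path already
    contains the receiving router's own AS number."""
    return own_as not in as_path


# The path from the digression, as a router in AS 7018 would see it.
bogus_path = [1, 701, 703, 9304, 7018]
print(accept_route(7018, bogus_path))   # an AT&T router discards the route
print(accept_route(65000, bogus_path))  # a router in another AS accepts it
```

This is why the bogus routes were visible from other vantage points but never showed up in AT&T's own routing tables.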

AS Classification

Directed AS graph

– Directed edge from provider to customer

– Bidirectional edge between two peers

Lowest level: Stubs

– Leaf nodes: no peers or downstream customers

– 8898 of the 10915 ASes (81.5% of ASes)

– Ex: UC Berkeley (25), AT&T Labs (6431), and INRIA (1300)

Next lowest level: Regional ISPs

– Leaf nodes after successive pruning of leaf nodes

– 971 ASes of the 10915 ASes (8.9% of ASes)

– Ex: PacBell (5676), US West (6223), and UUNET Canada (815)

Remaining 1046 ASes: Core
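The stub and regional ISP levels can be sketched as successive pruning of leaves from the labeled graph. This sketch assumes that a leaf at every stage is an AS with no peers and no customers among the ASes still present; the slide states that definition only for stubs, so applying it to later rounds is our assumption. The AS names are placeholders.

```python
def classify(customers, peers):
    """Split ASes into stubs, regional ISPs, and core by successive pruning.

    `customers[a]` is the set of a's customers and `peers[a]` the set of
    a's peers. Every AS must appear as a key of `customers`.
    """
    def leaves(nodes):
        # A leaf has no peers and no customers among the nodes still present.
        return {n for n in nodes
                if not peers.get(n) and not (customers[n] & nodes)}

    remaining = set(customers)
    stubs = leaves(remaining)
    remaining -= stubs

    regional = set()
    while True:
        layer = leaves(remaining)
        if not layer:
            break
        regional |= layer
        remaining -= layer
    return stubs, regional, remaining  # what is left is the core


# Toy graph: T1 and T2 peer with each other; R is a customer of T1.
customers = {
    "T1": {"R"}, "T2": {"S3"}, "R": {"S1", "S2"},
    "S1": set(), "S2": set(), "S3": set(),
}
peers = {"T1": {"T2"}, "T2": {"T1"}}
stubs, regional, core = classify(customers, peers)
print(sorted(stubs), sorted(regional), sorted(core))
```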

Dense Core

Ways to classify so-called “tier-1” ASes

– Any AS with no upstream provider (98 such nodes)

– AS set that forms the largest clique of peer edges (13 nodes)

Relaxing the definition

– Tolerate some missing or misclassified edges

– Tolerate some ASes with sibling relationships

“Almost a clique”

– Subgraph of m nodes with in and out degree at least m/2

– Greedy algorithm for locating the largest near-clique

20 ASes in the near-clique

– 15 of the ASes form a subgraph just 3 edges short of a clique

– Genuity, Sprint, UUNET, AT&T, Verio, Level3, C&W,…
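The "almost a clique" condition can be tested directly for a candidate set of ASes. The sketch below only checks the degree condition; it does not reproduce the greedy search for the largest near-clique, whose details are not given on the slide. The function names and toy graphs are assumptions for illustration.

```python
def is_near_clique(nodes, edges):
    """Check the "almost a clique" condition for a candidate AS set.

    `edges` is a set of directed pairs (u, v). Every node must have
    in-degree and out-degree of at least m/2 inside the subgraph, where m
    is the number of nodes.
    """
    nodes = set(nodes)
    m = len(nodes)
    out_deg = {n: 0 for n in nodes}
    in_deg = {n: 0 for n in nodes}
    for u, v in edges:
        if u in nodes and v in nodes and u != v:
            out_deg[u] += 1
            in_deg[v] += 1
    return all(out_deg[n] >= m / 2 and in_deg[n] >= m / 2 for n in nodes)


def both_ways(pairs):
    """Expand undirected peer edges into directed pairs in both directions."""
    return {(u, v) for u, v in pairs} | {(v, u) for u, v in pairs}


ring = both_ways([(1, 2), (2, 3), (3, 4), (4, 1)])
chain = both_ways([(1, 2), (2, 3), (3, 4)])
print(is_near_clique({1, 2, 3, 4}, ring))
print(is_near_clique({1, 2, 3, 4}, chain))
```

With m = 4, every node in the ring has degree 2, which meets the m/2 bound, while the end nodes of the chain have degree 1 and fail it.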

Transit and Outer Core

Transit core

– ASes that peer with the dense core and each other

– Notion of a “weak in-way cut” to isolate these ASes

– Algorithm for identifying the ASes in transit core

– 129 ASes, including top providers in Europe and Asia

– Ex: UUNET Europe, KDDI, and Singapore Telecom

Outer core

– All of the remaining ASes in the core

– 897 ASes, including large regional and national ISPs

– Ex: Turkish Telecom and Minnesota Regional Network

Node Degree is Not Enough

Node degree ignores relationships

– A stub AS may have many upstream providers

– A core AS may have a small number of peers

– Some ASes have customers that don’t have AS numbers

Related Work

AS graph characterization

– Constructing graph from BGP tables or traceroute experiments

– Characterizing the topological properties of the graph

Inferring AS relationships (Lixin Gao)

– Identifies the key properties of paths (uphill, downhill, etc.)

– Heuristic using node degree to infer boundary point between the uphill and downhill portions of the path

– Application of the algorithm using RouteViews routing table

Characterization of the hierarchy of ASes

– Early work by Govindan/Reddy based on node degree

– Recent work by Ge et al. based on AS relationships

Conclusions

Inferring AS relationships

– Reverse pruning to assign rank to each AS

– Comparison of ranks from different vantage points

Performance evaluation

– Application of algorithm to collection of ten BGP tables

– Exploration of the anomalies that cause invalid paths

Characterization of Internet hierarchy

– Stub, regional ISP, outer core, transit core, & dense core

– Algorithms for identifying the three parts of the core

– Application to AS graph inferred from the BGP tables

Ongoing Work

Classification of siblings

– Use anomalous triples (A, B, C) to identify siblings

– Group siblings into a single node (with union of edges)

– Repeat classification of the AS hierarchy on new graph

Longitudinal study

– Repeat the study over a period of time with new data

– Study how AS relationships and hierarchy change

Validation of our inference results

– Compare to RADB, Lixin Gao’s results, AT&T data, etc.

http://www.cs.berkeley.edu/~sagarwal/research/BGP-hierarchy/