40
On the Geographic Location of Internet Resources Mark Crovella Boston University Computer Science with Anukool Lakhina, John Byers, and Ibrahim Matta or Where on Earth is the Internet?

On the Geographic Location of Internet Resources Mark Crovella Boston University Computer Science with Anukool Lakhina, John Byers, and Ibrahim Matta or

Embed Size (px)

Citation preview

Page 1: On the Geographic Location of Internet Resources Mark Crovella Boston University Computer Science with Anukool Lakhina, John Byers, and Ibrahim Matta or

On the Geographic Location of Internet Resources

Mark CrovellaBoston University Computer Science

with

Anukool Lakhina, John Byers, and Ibrahim Matta

or

Where on Earth is the Internet?

Page 2: On the Geographic Location of Internet Resources Mark Crovella Boston University Computer Science with Anukool Lakhina, John Byers, and Ibrahim Matta or

Some observations about the Internet

• Rapid, decentralized growth:– 90% of Internet systems were added in the last four

years– Connecting to the network can be a purely local

operation

• This rapid, decentralized growth has opened significant questions about the physical structure of the network; e.g.,– The number of hosts connected to the network– The properties of network links (delay, bandwidth)– The interconnection pattern of hosts and routers– The interconnection relationships of ISPs– The geographic locations of hosts, routers and links

Page 3: On the Geographic Location of Internet Resources Mark Crovella Boston University Computer Science with Anukool Lakhina, John Byers, and Ibrahim Matta or

Internet Science

• Engineering or Science?Engineering: study of things madeScience: study of things found

• Although the Internet is a engineered artifact, it now presents us with questions that are better approached from a scientific posture

Page 4: On the Geographic Location of Internet Resources Mark Crovella Boston University Computer Science with Anukool Lakhina, John Byers, and Ibrahim Matta or

For example: where is the Internet?

?? ?

? ?

What is the relationship between geography and the network?

Page 5: On the Geographic Location of Internet Resources Mark Crovella Boston University Computer Science with Anukool Lakhina, John Byers, and Ibrahim Matta or

Why does this matter?• Our motivation is in developing better

generators for “representative network topologies”– Many simulation based results in networking

depend critically on network topology– Topology generation is still fairly ad hoc

• However, there is a scientific goal as well!– We need to understand what drives Internet

growth– Basic investigations will pay off in

unexpected ways

Page 6: On the Geographic Location of Internet Resources Mark Crovella Boston University Computer Science with Anukool Lakhina, John Byers, and Ibrahim Matta or

Assumptions and Definitions

• We will treat the Internet as an undirected graph embedded on the Earth’s surface– Nodes correspond to routers or interfaces– Edges correspond to physical router-router

links– Not concerned with hosts (end systems)

• We will ignore many higher and lower level questions– Autonomous systems– Link layers

Page 7: On the Geographic Location of Internet Resources Mark Crovella Boston University Computer Science with Anukool Lakhina, John Byers, and Ibrahim Matta or

Some Basic Questions

• What is the relationship between population and network geometry?

• What effect does distance have on network topology?

Our Basic Approach

• Obtain IP-level router maps Mercator and Skitter

• Find geographic locations of those routersIxia’s IxMapping service

Page 8: On the Geographic Location of Internet Resources Mark Crovella Boston University Computer Science with Anukool Lakhina, John Byers, and Ibrahim Matta or

Mercator: Govindan et al., USC/ISI, ICSI

Skitter: Moore et al., CAIDA

• Based on active probing from a single site• Resolves aliases• Uses loose source routing to explore alternate

paths

• Traceroutes from 19 monitors to large set of destinations

• Does not resolve aliases• Destinations attempt to cover IP address space

Page 9: On the Geographic Location of Internet Resources Mark Crovella Boston University Computer Science with Anukool Lakhina, John Byers, and Ibrahim Matta or

Datasets

Mercator• Collected August 1999• 228,263 routers• 320,149 links

Skitter• Collected January 2002• 704,107 interfaces• 1,075,454 links

Page 10: On the Geographic Location of Internet Resources Mark Crovella Boston University Computer Science with Anukool Lakhina, John Byers, and Ibrahim Matta or

IxMapping: Moore et al., CAIDA

• Given an IP address, infers geographic location based on a variety of heuristics– Hostnames, DNS LOC, whoise.g., 190.ATM8-0-0.GW3.BOS1.ALTER.NET is in Boston

• Able to map – 99% of Mercator routers– 98.5% of Skitter interfaces

• Similar to GeoTrack [Padmanabhan] which exhibits reasonable accuracy – Median error of 64 mi– 90% queries within 250 mi – for well connected nodes

Page 11: On the Geographic Location of Internet Resources Mark Crovella Boston University Computer Science with Anukool Lakhina, John Byers, and Ibrahim Matta or

Where are the routers? USA

Page 12: On the Geographic Location of Internet Resources Mark Crovella Boston University Computer Science with Anukool Lakhina, John Byers, and Ibrahim Matta or

Europe

Page 13: On the Geographic Location of Internet Resources Mark Crovella Boston University Computer Science with Anukool Lakhina, John Byers, and Ibrahim Matta or

Routers and People: World(Grid size: ~150 mi x 150 mi)

Ugh!

Page 14: On the Geographic Location of Internet Resources Mark Crovella Boston University Computer Science with Anukool Lakhina, John Byers, and Ibrahim Matta or

People Per Interface (Skitter)

Pop (M)

Intfs(K)

People / Intfc

Online(M)

Online/ Intfc

Africa 837 8,379 100,011

4.1 495

S. America

341 10,131 33,752

21.9 2,161

Mexico 154 4,361 35,534

3.4 784

W. Europe

366 95,993 3,817 143.1 1,489

Japan 136 37,649 3,631 47.1 1,250

Australia 18 18,277 975 10.0 552

USA 299 282,048

1,061 166.1 588

World 5,653

563,521

10,032

513.2 910

Pop (M)

Intfs(K)

People / Intfc

Online(M)

Online/ Intfc

Africa 837 8,379 100,011

4.1 495

S. America

341 10,131 33,752

21.9 2,161

Mexico 154 4,361 35,534

3.4 784

W. Europe

366 95,993 3,817 143.1 1,489

Japan 136 37,649 3,631 47.1 1,250

Australia 18 18,277 975 10.0 552

USA 299 282,048

1,061 166.1 588

World 5,653

563,521

10,032

513.2 910

Page 15: On the Geographic Location of Internet Resources Mark Crovella Boston University Computer Science with Anukool Lakhina, John Byers, and Ibrahim Matta or

Interfaces and People: USA, Skitter

Grid size: ~90 mi x 90 mi

Page 16: On the Geographic Location of Internet Resources Mark Crovella Boston University Computer Science with Anukool Lakhina, John Byers, and Ibrahim Matta or

Routers and PeopleUpper, Mercator; Lower, Skitter

USA Europe Japan

Page 17: On the Geographic Location of Internet Resources Mark Crovella Boston University Computer Science with Anukool Lakhina, John Byers, and Ibrahim Matta or

Router Location: Summary

• Router location is strongly driven by both population density and economic development

• Superlinear relationship between router and population density:

R k Pa

k varies with economic development (users online)

a is greater than one• More routers per person in more densely

populated areas

Page 18: On the Geographic Location of Internet Resources Mark Crovella Boston University Computer Science with Anukool Lakhina, John Byers, and Ibrahim Matta or

Models for Network Topology

1988

1999

1996

Spatial Models

Structural Models

Degree-based Models

Page 19: On the Geographic Location of Internet Resources Mark Crovella Boston University Computer Science with Anukool Lakhina, John Byers, and Ibrahim Matta or

Spatial model: Waxman, 1988

• Nodes are distributed randomly (uniformly) in the plane.

• Probability that two nodes separated by distance d are connected:

P[C|d] = exp(-d/L)0 , 1; L = diameter of region: degree of distance sensitivity: edge density

• A spatially imbedded random graph

Page 20: On the Geographic Location of Internet Resources Mark Crovella Boston University Computer Science with Anukool Lakhina, John Byers, and Ibrahim Matta or

Structural Models

• Real networks have structure– Always connected!– Formed by interconnection of component

networks– Distinction between transit and stub

networks imposes a hierarchy on resulting graph

• Tiers: Doar, 1996• GT-ITM: Calvert, Doar, Zegura, 1997

Page 21: On the Geographic Location of Internet Resources Mark Crovella Boston University Computer Science with Anukool Lakhina, John Byers, and Ibrahim Matta or

Degree-based Models

• Faloutsos, Faloutsos, & Faloutsos, 1999:– Empirically measured networks show a

power law in degree distribution:

P[node has degree d] = k d-a

• Barabasi & Albert, 1999:– This property will be present in a graph

where:• Nodes and links are added incrementally• Probability of connecting to a node is proportional

to its degree (preferential connectivity)

• BRITE: Medina, Matta, Byers, 2001

Page 22: On the Geographic Location of Internet Resources Mark Crovella Boston University Computer Science with Anukool Lakhina, John Byers, and Ibrahim Matta or

Empirical Evidence

• Interested in influence of distance on link formation:

f(d) = P[C|d]i.e., Probability two nodes separated by distance d are connected

• Estimated as: number links of length d f(d) = ------------------------------------------- number of router pairs separated by d

Page 23: On the Geographic Location of Internet Resources Mark Crovella Boston University Computer Science with Anukool Lakhina, John Byers, and Ibrahim Matta or

f(d) for USA (Skitter)

Distance Sensitive

Distance Insensitive

Page 24: On the Geographic Location of Internet Resources Mark Crovella Boston University Computer Science with Anukool Lakhina, John Byers, and Ibrahim Matta or

Link Distance Preference for USA Skitter, d < 250, semi-log plot

L 140 mi.

Page 25: On the Geographic Location of Internet Resources Mark Crovella Boston University Computer Science with Anukool Lakhina, John Byers, and Ibrahim Matta or

Link distance preference: all regionsUpper, Mercator; Lower, Skitter; small d

USA L 140 mi

Europe L 80 mi

Japan L 140 mi

Page 26: On the Geographic Location of Internet Resources Mark Crovella Boston University Computer Science with Anukool Lakhina, John Byers, and Ibrahim Matta or

Large d: distance insensitivity

F(d) = f(u)d

u=1

USA data, Skitter

Page 27: On the Geographic Location of Internet Resources Mark Crovella Boston University Computer Science with Anukool Lakhina, John Byers, and Ibrahim Matta or

Distance insensitivity, all regions Upper, Mercator; Lower, Skitter; large d

USA Europe Japan

Page 28: On the Geographic Location of Internet Resources Mark Crovella Boston University Computer Science with Anukool Lakhina, John Byers, and Ibrahim Matta or

Limits to distance sensitivity

Mercator Skitter

Limit % Links < Limit

Limit % Links < Limit

USA 820 mi.

82.1% 818 mi.

77.2%

Europe

383 mi.

97.3% 366 mi.

95.4%

Japan 165 mi.

91.5% 116 mi.

92.8%

Page 29: On the Geographic Location of Internet Resources Mark Crovella Boston University Computer Science with Anukool Lakhina, John Byers, and Ibrahim Matta or

Link Formation: Summary

• Link formation seems to be a mixture of distance-dependent and –independent processes

• Waxman (exponential) model remarkably good for large fraction of all links!– But, crucial difference is that we are using a

very irregular spatial distribution of nodes

• Small fraction of non-local links are very important (structural)

Page 30: On the Geographic Location of Internet Resources Mark Crovella Boston University Computer Science with Anukool Lakhina, John Byers, and Ibrahim Matta or

Generating Topologies: a new recipe

1. Good models for population density exist– CIESIN’s Gridded Population of the World– 2.5 arc minutes (<5 mi2) … very high quality– Time trends also available

2. Router density then follows from population relationship

3. Link formation driven by hybrid process– Distance-dependent and –independent

Page 31: On the Geographic Location of Internet Resources Mark Crovella Boston University Computer Science with Anukool Lakhina, John Byers, and Ibrahim Matta or

Related Work

• Matrix.net: – Uses DNS hostname allocations– proprietary location methods

• Akamai– Extensive peering and measurement infrastructure

• Padmanabhan and Subramanian, 2000:– assessed accuracy of geographic mapping

techniques

• Yook, Jeong, and Barabasi, 2001:– Similar goals– Find linear (not exponential) distance dependence

Page 32: On the Geographic Location of Internet Resources Mark Crovella Boston University Computer Science with Anukool Lakhina, John Byers, and Ibrahim Matta or

Final Thoughts

• The Internet has fully interpenetrated human society

• Scientific understanding of the net is essential

• Applications:– Simulation – Security– Reliability– Planning

Page 33: On the Geographic Location of Internet Resources Mark Crovella Boston University Computer Science with Anukool Lakhina, John Byers, and Ibrahim Matta or

Thanks!

• CAIDA: – David Moore– k claffy– Andre Broido

• Notre Dame:– Lazslo Barabasi

• USC:– Ramesh Govindan– Hongsuda Tangmunarunkit

Page 34: On the Geographic Location of Internet Resources Mark Crovella Boston University Computer Science with Anukool Lakhina, John Byers, and Ibrahim Matta or

Routers and People: North America

Page 35: On the Geographic Location of Internet Resources Mark Crovella Boston University Computer Science with Anukool Lakhina, John Byers, and Ibrahim Matta or

Subdividing the Data

N. US

C. A.

S. US

Page 36: On the Geographic Location of Internet Resources Mark Crovella Boston University Computer Science with Anukool Lakhina, John Byers, and Ibrahim Matta or

Economic Heterogeneity

Page 37: On the Geographic Location of Internet Resources Mark Crovella Boston University Computer Science with Anukool Lakhina, John Byers, and Ibrahim Matta or

What Influences the Formation of Links?

• Waxman, 1988: spatial model• Zegura et al., 1997: structural model

– Explicitly captures hierarchical structure

• Barabasi et al., 1999: degree-based preferential connectivity– Matches observed power-law node degree– Inspired by Faloutsos et al., 1999

• Strogatz & Watts, 1998: small-world properties– Captures “six degrees of separation”

Page 38: On the Geographic Location of Internet Resources Mark Crovella Boston University Computer Science with Anukool Lakhina, John Byers, and Ibrahim Matta or

Routers and Economics

Matrix.net: hosts Mercator: routers

Page 39: On the Geographic Location of Internet Resources Mark Crovella Boston University Computer Science with Anukool Lakhina, John Byers, and Ibrahim Matta or

Penultimate Geographic Map, Oct 1980

Page 40: On the Geographic Location of Internet Resources Mark Crovella Boston University Computer Science with Anukool Lakhina, John Byers, and Ibrahim Matta or

Last Complete Geographic Map, Aug 1982