IDMaps: A Global Internet Host Distance Estimation Service
P. Francis, S. Jamin, C. Jin, Y. Jin, D. Raz, Y. Shavitt, L. Zhang
Presenter: Zhenying Liu
Contents
  Background
  Goals
  Related work
  Architecture
  Performance Evaluation
  Conclusion
Background
  Increasing need to learn network distances and bandwidth
  One method
    Measure the distance by itself (ping, traceroute)
  A useful general service: quick, efficient
    SONAR, Feb. 1996
    HOPS (Host Proximity Service)
  Need an underlying measurement infrastructure to provide distance measurements
IDMaps: Internet Distance Map Service
  Intended as the underlying service that provides the distance information used by SONAR/HOPS
Goals
  Not near-instantaneous information
  Determine roughly the best service given technology constraints
  Consider whether there are applications for which this level of service would be useful
Resulting Goals
  Separation of functions
    Separation of IDMaps and the query/reply service
  Distance metrics
    Latency (round-trip delay): useful, easy to provide
    Bandwidth: useful, difficult to provide, expensive to measure
  Accuracy of the distance information
    High accuracy: difficult to achieve
    Aim to obtain accuracy within a factor of 2
Alternative Architectures and Related Work
  SPAND, Remos: provide distance information only between hosts close to a distance server and remote hosts on the Internet
    For each server: scales proportionally to the number of destinations
    For all sites in the Internet: N^2
  Stemm: passive monitoring
    Does not perturb actual Internet traffic
    Only measures regions previously traversed
    Does not adapt to Internet topology changes
    Requires more human effort
IDMaps Architecture
  Addresses three questions
    What form does the distance information take?
    What are IDMaps’ components?
    How should the distance information be disseminated?
Various forms of distance information
  Form                Scale                                       Comments
  Global IP addr.     H^2 (H: # of hosts)                         Infeasible
  Addr. prefix (AP)   P^2 (P: # of APs; 200,000)                  Easily terabytes
  AS                  A^2 + P' (A << P; A: # of ASs,              A = 100,000 (large); accuracy is highly suspect
                      P': # of BGP-advertised IP addr. blocks)
  Cluster of APs      B^2 + P (B: # of tracers)                   If B = 500, manageable; reasonable accuracy
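To make the scale column concrete, the short calculation below plugs in the two figures quoted in the table (P = 200,000 APs and B = 500 tracers). The variable names are illustrative only, and the calculation counts table entries, not bytes.

```python
# Figures quoted in the table; names are illustrative, not from the paper.
P = 200_000   # number of address prefixes (APs)
B = 500       # number of tracers

ap_pairs = P ** 2        # one entry per AP pair (the P^2 form)
clustered = B ** 2 + P   # tracer-tracer entries plus one entry per AP (the B^2 + P form)

print(f"AP-to-AP entries:       {ap_pairs:,}")
print(f"Clustered-AP entries:   {clustered:,}")
print(f"Reduction factor:       {ap_pairs / clustered:,.0f}x")
```

With these inputs the P^2 form needs 4 x 10^10 entries, while the B^2 + P form needs 450,000.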
The form used
  There are three main components: APs, tracers, and the virtual links (the raw distances)
    AP: a consecutive range of IP addresses
    Tracers: systems that are distributed around the Internet
  Assumption
    We can estimate the distance between two points as the sum of the distances between intermediate points
[Figure: AP1, AP2, and Tracer1 joined by distances a, b, and c; legend: APs, tracers]
  |a - c| < |b| < |a + c| ? Is it feasible to estimate the distance?
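The "sum of distances between intermediate points" assumption can be sketched as follows: the estimate for two APs is the AP-to-tracer distance, plus the shortest path over tracer-tracer virtual links, plus the tracer-to-AP distance. This is a minimal sketch, not the authors' implementation; the function names and the dictionary layout (`ap_links` keyed by `(ap, tracer)`, `tracer_links` keyed by `(tracer, tracer)`) are assumptions made for illustration.

```python
import heapq

def shortest_paths(links, source):
    """Dijkstra over an undirected weighted graph given as {(u, v): cost}."""
    adj = {}
    for (u, v), cost in links.items():
        adj.setdefault(u, []).append((v, cost))
        adj.setdefault(v, []).append((u, cost))
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, cost in adj.get(u, []):
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def estimate_distance(src_ap, dst_ap, ap_links, tracer_links):
    """Estimate AP-to-AP distance as AP->tracer + tracer..tracer + tracer->AP."""
    best = float("inf")
    for (ap, t1), d1 in ap_links.items():
        if ap != src_ap:
            continue
        from_t1 = shortest_paths(tracer_links, t1)
        for (ap2, t2), d2 in ap_links.items():
            if ap2 != dst_ap:
                continue
            middle = from_t1.get(t2, float("inf"))
            best = min(best, d1 + middle + d2)
    return best

# Hypothetical measurements in milliseconds.
ap_links = {("AP1", "T1"): 5, ("AP2", "T2"): 7, ("AP3", "T1"): 4}
tracer_links = {("T1", "T2"): 20}
print(estimate_distance("AP1", "AP2", ap_links, tracer_links))
```

When both APs are traced by the same tracer, the tracer-tracer term is zero and the estimate is just the two tracer-AP distances added together.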
An assumption: Triangulation
  To support the triangulation assumption, 2 experiments were set up: D1 (1995), D2 (1997)
  The figure shows the ratios of triangulated to real distance for all shortest-path triangulations in the data sets
  Between 75% and 90% of triangulation estimates fall within a factor of 2 of the real distance
  The resulting estimates are acceptable!
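The triangulation check can be illustrated with a few lines of arithmetic. Given two measured sides a and c, the triangle inequality bounds the unknown side b, and an estimate counts as acceptable when it is within a factor of 2 of the real distance. The numbers below are hypothetical, and using the upper bound a + c as the estimate is an assumption made for this sketch.

```python
def triangulation_bounds(a, c):
    """Lower and upper bounds on the third side b implied by the triangle inequality."""
    return abs(a - c), a + c

def within_factor(estimate, actual, factor=2.0):
    """True if the estimate is within `factor` of the actual value, in either direction."""
    return actual / factor <= estimate <= actual * factor

a, c = 30.0, 50.0     # hypothetical measured distances, in ms
actual_b = 45.0       # hypothetical real distance, in ms

low, high = triangulation_bounds(a, c)
estimate = high       # assumption: use the upper bound a + c as the estimate
print(low, high, within_factor(estimate, actual_b))
```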
Tracer placement
  Two problems
    How many tracers are optimal?
    Given the number of tracers, where should they be placed to minimize the maximum distance between an AP and the nearest tracer?
  Two graph-theoretic approaches can be applied
    k-HST algorithm
    Minimum K-center algorithm
  These algorithms are used to determine the placement of fire stations, ambulances, etc., with a priori knowledge of the topology
k-HST: decide # of tracers
  1st phase: the graph is recursively partitioned
    A node is arbitrarily selected from the current (parent) partition, and all the nodes that are within a random radius from this node form a new partition
    The radius of the child partition is a factor of k smaller than the diameter of the parent partition
    Recurse until each node is in a partition of its own
  [Figure: k-HST tree]
  2nd phase: a virtual node is assigned to each of the partitions on each level
    The diameter of a partition is the furthest distance between two nodes in the partition
    It equals 2 times the length of the links from a virtual node to its children
Using the k-HST tree
  Devise a greedy algorithm to find the number of tracers when the maximum distance is bounded by D
    Push the tracers down the tree until a partition with diameter <= D is discovered
    The number of such partitions is the minimum number of tracers
    Set the virtual nodes of these partitions to be the tracers
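The recursive partitioning and the "stop at diameter <= D" rule can be sketched together. This is a simplified illustration, not the k-HST algorithm as published. It assumes `dist` is a metric, fixes the child radius at diameter / k instead of drawing it at random, takes the first remaining node as the "arbitrary" center, and uses the first node of each final partition as a stand-in for its virtual node. It does not guarantee the minimum number of tracers.

```python
def diameter(nodes, dist):
    """Largest pairwise distance inside a partition."""
    return max((dist(u, v) for u in nodes for v in nodes), default=0.0)

def partition_until(nodes, dist, k, max_diameter):
    """Recursively split `nodes` until every partition has diameter <= max_diameter."""
    if k <= 2:
        raise ValueError("this sketch needs k > 2 so that every split shrinks the partition")
    if max_diameter < 0:
        raise ValueError("max_diameter must be non-negative")
    d = diameter(nodes, dist)
    if d <= max_diameter:
        return [list(nodes)]          # small enough: one tracer covers this partition
    radius = d / k                    # simplification: fixed instead of random radius
    remaining = list(nodes)
    leaves = []
    while remaining:
        center = remaining[0]         # "arbitrarily selected" node
        ball = [v for v in remaining if dist(center, v) <= radius]
        remaining = [v for v in remaining if dist(center, v) > radius]
        leaves.extend(partition_until(ball, dist, k, max_diameter))
    return leaves

# Hypothetical nodes on a line; distance is the absolute difference.
points = [0, 1, 2, 10, 11, 12]
parts = partition_until(points, lambda u, v: abs(u - v), k=4, max_diameter=2)
tracers = [p[0] for p in parts]       # stand-in for each partition's virtual node
print(parts, tracers)
```

In this toy input the two clusters are separated after one split, so two tracers are enough for D = 2.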
Minimum K-Center Algorithm
  K-center problem
    Place a given number of centers such that the maximum distance from a node to the nearest center is minimized
    NP-complete
  Willing to tolerate inaccuracies within a factor of 2 (2-approximation)
    The maximum distance is no worse than twice the optimal maximum
  Observation: guarantees that the distance from a node to the nearest center is bounded
Minimum K-Center Algorithm: details
  G = (V, E), E = V × V; c(e) is the cost of the shortest path between the endpoints (v1, v2) of e
  All the graph edges are arranged in non-decreasing order by cost; Gi = (V, Ei), where Ei holds the first i edges in that order
  Gi^2 is the graph that contains an edge (u, v), u ≠ v, whenever there is a path between u and v in Gi of at most two hops
  An independent set of a graph G(V, E) is a subset V’ of V such that, for all u, v in V’, the edge (u, v) is not in E
  An independent set of Gi^2 is thus a set of nodes in Gi that are at least 3 hops apart in Gi
  A maximal independent set M is an independent set V’ such that all nodes in V - V’ are at most one hop away from nodes in V’
  1. Construct G1^2, G2^2, ..., Gm^2
  2. Compute Mi for each Gi^2
  3. Find the smallest i such that |Mi| <= K, say j
  4. Mj is the set of K centers
Algorithm 2 (2-approximate minimum K-center [18]): details
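The four steps listed above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' code. The function name is hypothetical, the input is a complete graph given by a distance function, the thresholds step through distinct edge costs instead of one edge at a time, and the maximal independent set is built greedily in node order.

```python
from itertools import combinations

def two_approx_k_center(nodes, dist, K):
    """Return at most K centers using squared bottleneck graphs and a greedy
    maximal independent set (a sketch of the 2-approximation)."""
    pairs = list(combinations(nodes, 2))
    costs = sorted({dist(u, v) for u, v in pairs})   # distinct costs, non-decreasing
    for c in costs:
        # G_i: keep only the edges whose cost is at most c.
        adj = {u: set() for u in nodes}
        for u, v in pairs:
            if dist(u, v) <= c:
                adj[u].add(v)
                adj[v].add(u)
        # G_i^2: join u and v when they are at most two hops apart in G_i.
        adj2 = {u: set(adj[u]) for u in nodes}
        for u in nodes:
            for w in adj[u]:
                adj2[u] |= adj[w]
            adj2[u].discard(u)
        # Greedy maximal independent set M_i of G_i^2.
        centers, blocked = [], set()
        for u in nodes:
            if u not in blocked:
                centers.append(u)
                blocked.add(u)
                blocked |= adj2[u]
        if len(centers) <= K:
            return centers
    return list(nodes)[:K]   # only reached when the graph has no edges

# Hypothetical nodes on a line; distance is the absolute difference.
points = [0, 1, 2, 10, 11, 12]
centers = two_approx_k_center(points, lambda u, v: abs(u - v), K=2)
print(centers)
```

For this input the sketch returns one center per cluster. Every node is then within distance 2 of a center, which is twice the optimum of 1 obtained by centering on 1 and 11.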
Tracer Heuristics
  Stub-AS
    Connected to only one other AS
  Transit-AS
    Connected to one or more other ASs
    Allows itself to be used as a conduit for traffic (transit traffic) between other ASs
    Most large ISPs are transit ASs
  Mixed
    Tracers are placed randomly on the network, with uniform distribution
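The three heuristics differ only in the pool of ASs from which tracers are drawn. The sketch below shows one way to express them. It assumes the stub and transit AS sets are already known, and it assumes that the Stub-AS and Transit-AS heuristics also pick uniformly at random within their pool, which the slide does not state. The function name and AS labels are hypothetical.

```python
import random

def place_tracers(stub_ases, transit_ases, count, heuristic, seed=None):
    """Pick `count` tracer locations from the pool implied by the heuristic."""
    rng = random.Random(seed)
    pools = {
        "stub": sorted(stub_ases),
        "transit": sorted(transit_ases),
        "mixed": sorted(set(stub_ases) | set(transit_ases)),
    }
    if heuristic not in pools:
        raise ValueError(f"unknown heuristic: {heuristic}")
    pool = pools[heuristic]
    # Uniform sampling without replacement; capped at the pool size.
    return rng.sample(pool, min(count, len(pool)))

stubs = {"AS1", "AS2", "AS3", "AS4"}       # hypothetical stub ASs
transits = {"AS10", "AS20", "AS30"}        # hypothetical transit ASs
chosen = place_tracers(stubs, transits, count=2, heuristic="transit", seed=1)
print(chosen)
```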
Virtual links
  Tracer-tracer virtual links
    It is not necessary to list all B^2 tracer-tracer distances
    Example: given a number of tracers in Seattle and in Boston
      It would almost certainly not be useful to know all of the distances between them
      A subset of the links allows a sufficient distance approximation between hosts in Seattle and hosts in Boston
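One standard way to prune tracer-tracer links is a t-spanner, which the Conclusion slide names as the technique applied to tracer-tracer VLs. The sketch below is the textbook greedy spanner construction, shown only as an illustration and not necessarily the authors' procedure. A link is kept only when the links already kept cannot approximate it within a factor of t. The tracer names and costs are hypothetical.

```python
import heapq

def spanner_distance(adj, source, target):
    """Dijkstra distance between two tracers over the links kept so far."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            return d
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, cost in adj.get(u, {}).items():
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return float("inf")

def greedy_t_spanner(links, t):
    """Scan links by increasing cost; keep one only if it is not already
    approximated within factor t by the kept links."""
    adj, kept = {}, []
    for (u, v), cost in sorted(links.items(), key=lambda item: item[1]):
        if spanner_distance(adj, u, v) > t * cost:
            adj.setdefault(u, {})[v] = cost
            adj.setdefault(v, {})[u] = cost
            kept.append((u, v))
    return kept

# Two tracers in Seattle, one in Boston (hypothetical costs in ms).
links = {("SEA1", "SEA2"): 1.0, ("SEA1", "BOS1"): 40.0, ("SEA2", "BOS1"): 40.5}
kept = greedy_t_spanner(links, t=2)
print(kept)
```

Here the SEA2-BOS1 link is dropped because the path through SEA1 costs 41 ms, well within twice the direct 40.5 ms.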
Virtual links
  Tracer-AP VLs
    A dedicated tracer? More than one tracer?
  Example: client C in AP1 will be directed to mirror M1 in AP3 instead of M2 in AP2
  Had tracer T2 also traced to AP1, the client would have been directed to M2
Performance Evaluation
  Topology generation
    Waxman, Tiers, Inet
  Simulating the IDMaps infrastructure
    Tracer placement: Stub-AS, Transit-AS
    Distance map computation: tracer-tracer VLs and tracer-AP VLs
  Performance metric computation
Nearest mirror selection
  Papp: the percentage of correct IDMaps answers over the total number of clients
  An IDMaps server selection is considered correct as long as the distance between a client and the nearest mirror determined by IDMaps is within a factor of λ of the distance between the client and the actual nearest mirror (we use λ = 2)
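The metric can be written down directly from that definition. In the sketch below each client is represented by a pair of true distances: the distance to the mirror IDMaps selected and the distance to the actual nearest mirror. The function names and sample values are hypothetical.

```python
def selection_is_correct(chosen_dist, nearest_dist, lam=2.0):
    """A selection is correct if the chosen mirror is within a factor lam
    of the actual nearest mirror."""
    return chosen_dist <= lam * nearest_dist

def p_app(clients, lam=2.0):
    """Percentage of clients with a correct selection.

    clients: list of (distance to mirror chosen by IDMaps,
                      distance to actual nearest mirror).
    """
    if not clients:
        raise ValueError("need at least one client")
    correct = sum(selection_is_correct(c, n, lam) for c, n in clients)
    return 100.0 * correct / len(clients)

# Hypothetical clients: (chosen-mirror distance, nearest-mirror distance) in ms.
clients = [(10.0, 10.0), (18.0, 10.0), (25.0, 10.0), (7.0, 5.0)]
print(p_app(clients))
```

Three of the four sample clients are within a factor of 2 of their nearest mirror, so the metric is 75%.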
Simulation result
  Mirror selection using IDMaps gives a noticeable improvement over random selection
  Network topology can affect IDMaps’ performance
  Tracer placement heuristics that do not rely on network topology can perform as well as or better than algorithms that require a priori knowledge of the topology
Simulation result
  Adding more tracers gives diminishing returns
  The number of tracer-tracer VLs required for good performance can be on the order of B, with a small constant
  Increasing the number of tracers tracing to each AP improves IDMaps’ performance, with diminishing returns
Mirror selection: Transit-AS
  The probability that at least 80% of all clients will be directed to the “correct” mirror is 100%
  The probability that up to 98% of all clients will be directed to the correct mirror is only 85%
Mirror selection
  Mirror selection using distance maps outperforms random selection regardless of the tracer placement algorithm
  Qualitatively, the results agree with the conclusion: mirror selection using distance maps outperforms random selection
Effect of Topology
  Performance on Tiers-generated topologies exhibits qualitatively different behavior from that on other topologies
  The Transit-AS heuristic gives better IDMaps performance than the k-HST algorithm on topologies generated from Inet and Waxman, but not on the topologies generated from Tiers
Conclusion
  A global distance measurement infrastructure called IDMaps is proposed
    It can be placed on the Internet to collect distance information
  Nearest mirror selection for clients
    Significant improvement over random selection
    Does not require full knowledge of the underlying topology
Conclusion
  IDMaps overhead can be minimized by grouping Internet addresses into APs to reduce the number of measurements
  Applying a t-spanner to tracer-tracer VLs can result in linear measurement overhead with respect to the number of tracers in the common case
  Overall, this study has provided positive results demonstrating that a useful Internet distance map service can indeed be built scalably