13
cing: Measuring Network-Internal Delays using only Existing Infrastructure Kostas G. Anagnostakis and Michael Greenwald CIS Department, University of Pennsylvania 200 S. 33rd St., Philadelphia, PA 19104, USA Email: anagnost,mbgreen @dsl.cis.upenn.edu Raphael S. Ryger CS Department, Yale University P.O. Box 208285, New Haven CT, USA Email: [email protected] Abstract— Several techniques have been proposed for mea- suring network-internal delays. However, those that rely on router responses have questionable performance, and all pro- posed alternatives require either new functionality in routers or the existence of a measurement infrastructure. In this paper we revisit the feasibility of measuring network-internal delays using only existing infrastructure, focusing on the use of ICMP Timestamp probes to routers. We present network measurements showing that ICMP Timestamp is widely supported and that TTL-responses often perform poorly, and we analyze the effect of path instability and routing irregularities on the performance and applicability of using ICMP Timestamp. We also confirm that router responses rarely introduce errors in our measurements. Finally, we present a practical algorithm for clock artifact removal that addresses problems with previous methods and has been found to perform well in our setting. I. I NTRODUCTION Network measurement techniques are crucial for gaining insights into network performance, as well as for understand- ing the behavior of network control mechanisms such as TCP congestion control [1]. Network parameters (e.g. delay, loss, and throughput) are relatively easy to measure over an end- to-end flow. In contrast, measuring the same quantities on an individual link inside the network is more difficult. Tools like pathchar [2] and similar techniques [3] directly estimate internal network performance characteristics such as per-link bandwidth and delays using per-hop round-trip mea- surements, obtained by eliciting TTL-expired responses from routers. To obtain one estimated sample of the per-link delay, the round-trip time to the head of the link is subtracted from the round-trip time to the tail of the link. Another desirable feature of such tools is that they are able to operate using only the existing infrastructure, without requiring cooperation from remote hosts or routers along some path, and can therefore be used to measure paths to almost any destination. However, there have been several questions about their effectiveness. First, there have been concerns about the representativeness of ICMP responses from routers, due to possible delays in processing such responses in routers. Second, techniques based on TTL-expired responses can only measure round-trip times and not one-way delays. Third, asymmetric paths make it difficult to correlate hop-by-hop round-trip times in order to isolate per-hop estimates. Due to such concerns, direct measurement tools that require no infrastructure support (such as pathchar) have been regarded (rightly or not) as unreliable for measuring network-internal delays. Consequently, several alternatives have been proposed in recent years. Such alternatives are either indirect, or require a measurement infrastructure to be available, or require addi- tional functionality to be added to routers, or some combina- tion of these three. Considerable effort has been put into tech- niques for inferring internal network performance from end- to-end (e.g. active) measurements. These inference techniques, sometimes referred to as network tomography techniques, derive network-internal statistics by injecting probe packets from one source to multiple destinations and correlating the observed packet behavior on the resulting tree topology [4], [5], [6]. Other approaches exist, based on passive packet sampling [7], or using the IPMP protocol [8]. While promising, those mechanisms are not likely to be deployed soon on any rea- sonable scale. Our work revisits the possibility of using direct measure- ment, using only existing infrastructure, to estimate per-link network-internal queuing delays. Specifically, we examine the use of ICMP Timestamp probes, and have built a tool, cing, which uses them. We have used cing to study multiple congestion points in flows over the Internet [9]. We have recently demonstrated [10] that direct measurement techniques can be both more accurate and more robust 1 than inference methods using end-to-end measurements — but only when the direct approach is applicable and practical. This paper answers the question of when cing is applicable. We explain the technical details needed to make cing practical and accurate, detail the problems we encountered, and present the solutions to those problems we solved. Where space permits, we present data that helps explain why our solutions seem most suitable compared to other alternatives that we are aware of. We also measured a large number of paths in order to determine whether the problems that we could not solve limit the applicability of this approach in practice. Some preliminary 1 After a large number of observations, the mean and median error for direct measurement were moderately lower (more accurate) than for indirect measurement. We say cing produced more “robust” estimates because the variance in the error was low for cing and high for indirect inferencing, and because the mean error for direct methods was low after a much smaller number of observations than for indirect inference.

cing: Measuring Network-InternalDelays using only Existing …mbgreen/papers/cingfeas-tr.pdf · 2004-02-09 · Other approaches exist, based on passive packet sampling [7], or using

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

Page 1: cing: Measuring Network-InternalDelays using only Existing …mbgreen/papers/cingfeas-tr.pdf · 2004-02-09 · Other approaches exist, based on passive packet sampling [7], or using

cing: MeasuringNetwork-InternalDelaysusingonly Existing Infrastructure

KostasG. AnagnostakisandMichael GreenwaldCIS Department,University of Pennsylvania

200 S. 33rd St., Philadelphia,PA 19104,USAEmail:

�anagnost,mbgreen� @dsl.cis.upenn.edu

RaphaelS. RygerCS Department,Yale University

P.O. Box 208285,New Haven CT, USAEmail: [email protected]

Abstract— Several techniques have been proposed for mea-suring network-inter nal delays. However, those that rely onrouter responseshave questionable performance, and all pro-posed alternatives require either new functionality in routersor the existenceof a measurement infrastructur e. In this paperwe revisit the feasibility of measuring network-inter nal delaysusing only existing infrastructur e, focusing on the use of ICMPTimestampprobesto routers.We presentnetwork measurementsshowing that ICMP Timestamp is widely supported and thatTTL-r esponsesoften perform poorly, and we analyze the effectof path instability and routing irr egularities on the performanceand applicability of using ICMP Timestamp.Wealsoconfirm thatrouter responsesrar ely intr oduce errors in our measurements.Finally, we present a practical algorithm for clock artifactremoval that addressesproblemswith previous methodsand hasbeen found to perform well in our setting.

I . INTRODUCTION

Network measurementtechniquesare crucial for gaininginsightsinto network performance,aswell as for understand-ing the behavior of network control mechanismssuchasTCPcongestioncontrol [1]. Network parameters(e.g. delay, loss,and throughput)are relatively easyto measureover an end-to-endflow. In contrast,measuringthe samequantitieson anindividual link inside the network is moredifficult.

Tools like pathchar [2] and similar techniques[3] directlyestimateinternalnetwork performancecharacteristicssuchasper-link bandwidthanddelaysusingper-hop round-tripmea-surements,obtainedby eliciting TTL-expired responsesfromrouters.To obtainoneestimatedsampleof the per-link delay,the round-trip time to the headof the link is subtractedfromthe round-trip time to the tail of the link. Another desirablefeatureof suchtools is that they areableto operateusingonlytheexisting infrastructure,without requiringcooperationfromremotehostsor routersalong somepath, and can thereforebe usedto measurepathsto almostany destination.However,there have been several questionsabout their effectiveness.First, there have beenconcernsabout the representativenessof ICMP responsesfrom routers,due to possibledelays inprocessingsuchresponsesin routers.Second,techniquesbasedon TTL-expired responsescanonly measureround-trip timesand not one-way delays. Third, asymmetricpaths make itdifficult to correlate hop-by-hop round-trip times in orderto isolate per-hop estimates.Due to such concerns,directmeasurementtools that requireno infrastructuresupport(such

aspathchar) have beenregarded(rightly or not) asunreliablefor measuringnetwork-internaldelays.

Consequently, several alternatives have been proposedinrecentyears.Such alternatives are either indirect, or requirea measurementinfrastructureto be available,or requireaddi-tional functionality to be addedto routers,or somecombina-tion of thesethree.Considerableeffort hasbeenput into tech-niquesfor inferring internal network performancefrom end-to-end(e.g. active)measurements.Theseinferencetechniques,sometimesreferred to as network tomography techniques,derive network-internal statisticsby injecting probe packetsfrom one sourceto multiple destinationsand correlatingtheobserved packet behavior on the resulting tree topology [4],[5], [6].

Other approachesexist, basedon passive packet sampling[7], or using the IPMP protocol [8]. While promising,thosemechanismsare not likely to be deployed soon on any rea-sonablescale.

Our work revisits the possibility of using direct measure-ment, using only existing infrastructure,to estimateper-linknetwork-internalqueuingdelays.Specifically, we examinetheuseof ICMP Timestampprobes,andhave built a tool, cing,which uses them. We have used cing to study multiplecongestionpoints in flows over the Internet [9]. We haverecentlydemonstrated[10] thatdirectmeasurementtechniquescan be both more accurateand more robust1 than inferencemethodsusing end-to-endmeasurements— but only whenthe direct approachis applicableand practical. This paperanswersthe questionof whencing is applicable.We explainthe technical details neededto make cing practical andaccurate,detail the problemswe encountered,andpresentthesolutionsto thoseproblemswe solved. Wherespacepermits,we presentdata that helps explain why our solutionsseemmostsuitablecomparedto otheralternativesthatwe areawareof. We also measureda large number of paths in order todeterminewhetherthe problemsthat we could not solve limittheapplicabilityof thisapproachin practice.Somepreliminary

1After a large number of observations, the mean and median error fordirect measurementweremoderatelylower (moreaccurate) thanfor indirectmeasurement.We saycing producedmore “robust” estimatesbecausethevariancein the error was low for cing and high for indirect inferencing,andbecausethe meanerror for direct methodswas low after a muchsmallernumberof observationsthanfor indirect inference.

Page 2: cing: Measuring Network-InternalDelays using only Existing …mbgreen/papers/cingfeas-tr.pdf · 2004-02-09 · Other approaches exist, based on passive packet sampling [7], or using

work alongtheselines wasreportedin [11].In summary, therequiredinfrastructure(for example,router

support for ICMP Timestamp message)is almost univer-sally deployed, and we have solved a wide enough rangeof problems,to safely estimatethat cing is practical andaccurateto isolate per-link delays for more than half thepathsin the Internet,andper-segmentdelaysfor mostothers.Unlike pathchar, we demonstratethat cing can accuratelyisolateone-way delays.Unlike indirect inferencetechniques,our techniqueis computationallyandconceptuallysimpleandrobustly providesaccurateestimates.Unlike either, the typical“f ailure” modeis not an extremelyinaccurateestimate,but ana-priori reportthatcing will beunableto provide ananswer.The basic limitation of this techniqueis a susceptibility tothe irregularity of IP routing, in particular, the fact that routesfrom a singlesourceto nodesat both endsof a pathsegmentwe wish to measure,do not necessarilyfollow the samepath.We measurehow this impactsour ability to perform directmeasurements.

The restof this paperis organizedasfollows. In SectionIIwe presentthe measurementtechnique,as proposedin [11],[10]. In Section III we look into the various problemsthatcan arise with our technique,and quantify their effect withnetwork measurementswhere necessary. In Section IV wediscussseveral directionsfor future work, and in SectionVwe summarizeour findingsandpresenttheconclusionsof ourinvestigation.

I I . MEASURING NETWORK-INTERNAL DELAYS USING

ICMP TIMESTAMP

A. Preliminaries

We model a network as an arbitrary graph � , with nodes�and links � . Eachlink, ����� , is an orderedtuple � ��������

in�����

. A link � �������� implies that nodes � and �� areneighbors,and that � can senda packet directly to �� withno intermediatehops.Let ������ denotea path taken by apacket, in this casefrom � to �� . ����������� denotesapathfrom � to � thatpassesthrough � . Whenrelevant,wedenotea directone-hoppathfrom a node � to its neighbor �by �! �"�� . We usethenotation #$ to representpathvariables.

We definethe notion of head, tail , andprefix of a path inthe obvious ways. For a node in a path #$ , let partition#$ as follows. #$&% #$ �'( � #$�) . Then #$

%head* #$ ���+

and #$,)-% tail * #$ ���+ . #$ is a prefix of #$,) if thereexists somenode in #$ ) , s.t. #$

%head* #$ ) ���+ .

Each node containsa routing map .0/ *213+54 �76 � �.

.0/ takesa destinationnode 18� � and returnsthe next hop9 � � .9 % .0/ *�1:+ impliesthat ��� 9 � is in � , andthatpackets

with destination1 passingthrough travel on thepath � 9.

For all ;��� <=�>�?�@��. / *; <=+ % < .Let AB�C1 representthepath(possiblymulti-step)inducedby

. on packetstravelling from node A to node 1 . Packetstravelon paths basedon purely local next-hop routing decisions.That is, AD�E1 % A �F.8GD*�1:+��H1 .

A map . is regular over a graph � if IJK�HAD�E1L�MIJN��tail *2AD�C1O���+ , . / *;N�+ % . / *213+ . . in the Internetis irregular.For example,let

9 % . / *213+ . . / *P.8Q�*�1:+�+ doesnot necessarilyequal

9. Further, . / is not stablein the Internet— routing

mapschangeover time in responseto routing updates.Theformer irregularity significantly impactswhich links we candirectly measure.In contrast,measurementssuggestthat, forour purposes,we neednot modelthe latter instability [12]; weverify this assumptionin SectionIII-C.

B. Technique

We can directly estimatethe delay on link � % ;R���ST�as follows. Let AU� �

denote our measurementsource,and VXW and VXY denotethe timestampdifferencereturnedfromICMP TimestampRequestpacketssentback-to-backfrom A tonodesR and S respectively2. The timestampdifferenceis thedifferencebetweenthetimestampreturnedin an ICMP packetandthe time the packet wasoriginally sent.If we assumethatAD�ZR is a prefix of AD�ZS , then we can also assumethat thetwo packetswill experienceapproximatelythe samedelayonAD�ZR . In sucha case[ % V Y 6 V W will yield an initial estimateof the total delayon � . The total delayincludesboth the fixedlink delay (transmissiontime and propagationdelay over � )and the queuingdelayon node S .

We cannotassumethat the clockson R and S aresynchro-nized. We must assumethat the clocks differ by an offset\ W^] Y . (For now, let us assumethat the skew in the rate thattheclocksadvanceis small enoughso that thechangein

\ W^] Yover the period of measurementis within error tolerance.)[ % VXY 6 VXW % V queuing _ V link _ \ W`] Y . Given [a� and [b� , twoobservationsof [ , then [a� 6 [b� %dc V queuing (becauseV link and\ W^] Y cancelout).

Let [ min% egfih *�VXY � + 6jegfih *;VXW � +k��l&�nmio`��N8p after N

observations.Assumingconstantclock-offset, for sufficientlylarge N , [ min denoteswith “high” probability theeventof zeroqueuingdelayon Rq�rS . Then [ <� % [b� 6 [ min givesthedesirednetwork-internalqueuingdelayon path R8�FS .

The computationof [D<� dependsupon the path to R beinga prefix of the path to S . If we are studying the path #$U%AD�E1 , then we say that nodes R���Ss� #$ are membersof thesametomographygroup, t-�Huv , if (i) AD�ZR is a prefix of AD�wS ,and(ii) tail *2AD�ESJ��R�+ is a prefix of tail *2AD�C1O��R�+ . Tomographygroupspartition the nodesin #$ into equivalenceclasses.Anode, x� $ , is calledan orphan if y tz�suv y % o .

We cancompute[D<� usingdirect measurementsfor any pairof nodes,R���S , in the sametomographygroup. Condition (i)ensuresthat the path to R is a prefix of the path to S , andcondition (ii) ensuresthat the segment R?��S lies entirely inthe path #$ . We computetomographygroupsin three steps.First, we use TTL-limited probesto determinethe nodesin#$ , by sendingpackets from A to 1 and increasingthe TTL

by 1 until we reach 1 , as does the traceroute tool [13].

2Becausewe requireback-to-backpackets,we cannotnaively usethesamemechanism(subtractingthe {;| and {;} from the receive time) to measurecongestionon the return path. There is no way of generatingback-to-backpackets from two differentdestinationnodes.

Page 3: cing: Measuring Network-InternalDelays using only Existing …mbgreen/papers/cingfeas-tr.pdf · 2004-02-09 · Other approaches exist, based on passive packet sampling [7], or using

0 1 2 3 4 5 6 7 8 9 10 11 12 13

orphan−0

group−1

group−2

group−3

Fig. 1. Orphansandtomographygroupson a typical path.

The ICMP TTL-expired responsesreturn the list of nodes.Second,we use the TTL techniqueto computethe path toeachnode x� #$ . Finally, we examinethe path to eachnodeto determinewhethereachpair of nodessatisfiesconditions(i) and(ii). The orphansand tomographygroupson a typicalpath,asdeterminedby this method,areshown in Figure1.

For simplicity, we assumedabove in the computationofqueuingdelay [D<� that the clock offset

\ W`] Y remainsconstant.In practice,we cannotmake this assumption.In SectionIII-Ewe discussour approachto clock artifact removal.

I I I . FEASIBIL ITY STUDY

In this sectionwe investigatethe variousissuesthat affectthe feasibility andeffectivenessof our technique.

A. Routersupportfor ICMP Timestamp

We measurehow widely ICMP Timestampis supportedinthe Internet.Our measurementsinvolve 20,206pathsfrom theUPENN network to randomtargets.For eachpath, we firstdeterminethe path using TTL-limited ICMP Echo probes,then,for eachnodeon the path,attemptto elicit ICMP Echoand Timestampresponses.We useda 2-secondtimeout and4-retrieslimit for eachprobe.

In Table I we present,in the left column, the fraction ofrouterssupportingexactly a givensetof ICMP messages,andin the right column,the fraction of pathsfor which the givenset containsexactly the messagessupportedby all routersalong the path. We observe that Timestampis supportedon92.93%of the routersprobed.The caseswhereTimestampisdisabled(while EchoandTTL-expiredresponsesareenabled)is small, at about 4.21%. 1.47% of the routers probed didnot respondto TTL-limited probes(while both the precedingandsucceedingneighborson the pathresponded).In this casewe could not determinean IP addressto probefor EchoandTimestamp.On 62.78%of the pathsthere was at least onerouter that did not supportTimestamp. Performingper-linkmeasurementsand accuratelyisolating the sourcesof delayvariationmay thereforenot always be possible.However, for37.22%of the paths,all nodesin the path fully supporttheneededmeasurementprimitives3. Note that the paths in thedata-setscontainonly destinationsthat reply to ICMP Echo,hence,the figurespresentedheredo not accountfor networks

3Not reflectedin TableI arecasesof buggy ICMP Timestampimplemen-tations.For instance,somehostsreturnTimestampsin the wrong byte order.Otherproblemsaredetectable,but arenot asreadily corrected.

routers paths~TTL, ECHO, TSTAMP � 37180092.92% 739537.22%~TTL, TSTAMP � 34 0.01% 8 0.04%~TTL, ECHO� 16853 4.21% 586029.49%~TTL � 5593 1.40% 353717.80%

-no reply- 5864 1.47% 306915.45%

TABLE I

ICMP T IMESTAMP SUPPORT

that block ICMP Echo traffic4. On the otherhand,the resultsareconservative due to packet lossandgiven that we limitedthe numberof retries and timeout for probing eachhost orrouter for eachof the ICMP-basedservices.

B. Accuracy of router probes

Thereareseveral questionswith respectto the accuracy ofrouterprobes.Thefirst is whetherICMP probesto routersarerepresentative of normalpacket behavior. This questionarisesfrom the fact that ICMP packets are handledby the slow-pathof routers,which arelikely to introduceadditionaldelaysandthereforebiasmeasurements.Althoughthis possibilityhasbeen a concernfor router-basedmeasurements,recent pre-liminary measurementsindicate that routersrarely introducenoticeabledelaysto TTL-limited probes[14]. However, thesemeasurementsfocus on determiningICMP generationdelaysthat affect bandwidth estimation techniques:the minimumoverheadof having a packet forwardedby the slow path in-steadof thefastpath.cing dependsnot only on theminimumoverheadbut also on the variability in ICMP processingandschedulingdelays. We conductedtwo experimentsto lookfurther into this matter. We first apply the method of [14]to the caseof ICMP Timestampsto confirm that Timestampsbehave similarly to TTL-limited probes.The methodinvolvessendingpairsof probes,oneto the target host,andoneto therouter underconsideration.The sourceaddressof the packetsentto the router is spoofedso that it will be deflectedto thetarget host.Becauseone packet travels directly to the target,while the other one is processedby the router before beingforwarded to the target, the differenceof the two packets’transit times providesan estimateof router ICMP processingdelays.It is important to note that this experimentdependsupon the assumptionthat packet pairs experiencethe samedelay over their commonpath. In this case,the assumptionseemsjustifiable.First, Figure 5 shows that suchan assump-tion is usuallycorrect.Second,if the assumptionis false,andthe packets becomeseparated(that is, the secondpacket isdelayedmore than the first), then this failure is conservative:it will overestimatethe apparentICMP processingtime.

As the test requiresinstrumentingboth sourceand targethosts,a large-scalestudy doesnot seemfeasiblewithout an

4A preliminaryattemptat quantifyingthe fractionof hoststhathave ICMPprobesfiltered (either locally or somewherealong the path) using hoststhatparticipatein theGnutellanetwork indicatesthat15051outof 33533(44.88%)of hosts reply to ICMP Echo messages.The population of hosts in thisexperimentmay not be representative.

Page 4: cing: Measuring Network-InternalDelays using only Existing …mbgreen/papers/cingfeas-tr.pdf · 2004-02-09 · Other approaches exist, based on passive packet sampling [7], or using

Router ID

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20

Est

imat

ed T

imes

tam

p ge

nera

tion

dela

y (m

s)

0

25

50

75

100

Fig. 2. Distribution of estimatedICMP Timestampprocessingdelays.Largedotsaremedians,(barelyvisible) shadedboxesareinter-quartileranges,outerbarsare5-95%percentiles,small dotsareoutliers.

infrastructureof reasonablescale. Therefore,the results ofboth [14] andour own studymay not be conclusive. Becauseof irregularrouting,our measurementsarerestrictedto routersthat are on the sametomographygroup as the target host.Our measurementsinvolve 7 sites, with only one that canget spoofedpackets on the Internet,and 20 routersthat aremembersof suitabletomographygroups.The distributionsofestimatedICMP Timestampgenerationdelaysare shown inFigure 2. As reportedin [14], the median estimatedICMPprocessingdelays are negligible. However, the tails of thedistributions are not always well behaved: two out of 20routerswe studiedhadhigh 95-percentilesof roughly 80 ms,while anotherfive had non-negligible 95-percentilesbetween8 and20 ms.

We attemptto estimatemore accuratelyhow many routersexhibit suchbehavior, basedon the following observation: iffor a given link ;R���ST� , router R exhibits this behavior, thenafraction of our methodestimateswill be negative (assumingthereis no strongcorrelationbetweenICMP generationdelayson R and S , in the casethat S also behaves similarly). Wecharacterizea routerassuspectto this behavior if 10% of thedelayestimatesare below a thresholdof -10 ms. We appliedthis estimatorby taking100sampleestimateson eachof 1368routersandfoundthat136(9.9%) appearto havethisproblem.

A moredetailedassessmentof exactly whenandhow oftenthisoccursfor eachrouterwouldbeuseful.Nevertheless,theseresultsindicatethat ICMP generationdelayswill occasionallyintroduceerrors,especiallywhen analyzingtails of the esti-matedqueuingdelaydistributions.On theotherhand,in manycasesthis problemis detectable,using testssuchas the oneemployed above.

To understandthe relative benefitsof using Timestampsinsteadof a TTL-basedmethod,we measureda large numberof paths from the UPENN network. Timestampshave theadvantageover TTL-expired messagesin estimatingone-wayqueuingdelaysbecauseTimestampsaredesignedto measureone-waytimes,while TTL-basedmethodsnecessarilycomputeround-trip times.On the other hand,Timestampssuffer froma needto calibrateunsynchronizedclocks on eachend of a

target link, while TTL methodsrely only on the singleclockat the measurementsource.For eachpath,we first determinetheirregularity structureandrandomlyselecta pair of suitablemeasurementnodes.After applyingthe clock correctionalgo-rithm detailedin III-E, we comparethe Timestampone-wayqueuingdelay estimateswith round-trip-basedqueuingdelayestimates.In order to make the comparisonmostdirectly, weusetheround-triptimesof theTimestamppackets,themselves,rather than send out new packets with limited TTLs. Thepaths (both forward and return) are identical to those thatwould be taken by TTL-limited packets, becausewe onlychoosenodesfrom the tomographygroup of the destination.The round-trip/TTL estimatetakesthe differencebetweentheround-trip times to the headand tail of the link (using therespective minima to accountfor propagationdelay). WhenTTL-limited probesare used to measurepropagationdelay,we must divide the measuredtime by two, to accountforthe forward and return paths.However, thereis no reasontoassumethat congestiondelaysare equal in the forward andreturn directions,so no division takesplace(in fact, queuingdelays usually only appearin one direction). We collecteddataon 10,931 isolatedlinks and 9,591 multi-hop segmentsseparately. Theresultsareshown in Figure3 for isolatedlinks,and in Figure4 for the multi-hop segments.

The densediagonalline indicatesthat often both methodsyield the sameestimate.In both figures,thereare correlatednegative estimates,which can be attributed to occasionalICMP generationdelays on the router on the head of thelink or segment.Data points off the diagonalindicateerrorsin one or both methods.Vertical displacementis causedbyTimestamperrors, and horizontal by TTL errors. We candeterminethat TTL errorsaremorecommonthanTimestamperrorsby exploiting the non-uniformdensityof the diagonal:if we take an arbitrary line with slope = 1, the densitydistribution correlateswith a horizontal displacementmorethan with a vertical displacement.TTL errors come from anumberof sources.Asymmetric return pathscan causeTTLto underestimatequeuing delay when the return path fromthe headhaslarger queuingdelaysthan the return path fromthe tail, and overestimatedelay when the tail path haslargerqueuingdelays.Even with return pathsthat sharea commonsuffix except for the target link, any congestionon the returndirection of the target link causesover-estimates.User spacemeasurementof round-trip times further distort estimates.Aschedulingdelay beforethe return from the first recvfromcausesunderestimates,and delaysbetweenthe arrival of thetwo packetscausesoverestimates.Further, the return packetscannot be back-to-back,making it less likely that the tworeturn packets encounterexactly the same delays even onthe sharedpath. While all the sourcesof error describedsofar exclusively affect TTL estimates,the occasionalfailureof the packet-pair assumptionon the forward path affectsboth TTL and Timestampestimates.The Timestampmethodaloneis affectedby clock calibrationerrors.Someclockshaveanomalousbehavior that is detectablebut not correctablebyour fixclock algorithm at our samplingrate (one possible

Page 5: cing: Measuring Network-InternalDelays using only Existing …mbgreen/papers/cingfeas-tr.pdf · 2004-02-09 · Other approaches exist, based on passive packet sampling [7], or using

Fig. 3. TTL vs Timestampestimates,single-hopmeasurements Fig. 4. TTL vs Timestampestimates,multi-hop measurements

sourceof suchanomaliesareclocksthat “jump”, on average,beforewe accumulate4 samples,thusmakingit impossibletodetectlinear regions).Suchclocksarecalled”flailing clocks”(see [15] for details). Flailing clocks at tails of measuredlinks can causeoverestimatesof the queuing delay, or un-derestimateswhen the “flailing clock” is at the head.For alltheseerror conditions,underboth TTL or Timestamp,in thecommoncasethereis little or no queuingdelay, consequentlyany error is most likely to occur when the “real” queuingdelay is very closeto zero.Thereforewe would expect,andsee,errorsto be mostvisible alongthe axes in our figures.

An unusual phenomenonin both figures is that we seeregionswith a sharpdropoff in thedensityof datapoints.(Forexample,TTL overestimatesdeclinesharplyabove 150 ms).Thisappearsanomalousto us,andwehaveseveralconjecturedexplanations,but the scaleof theexperimentdoesnot providesufficient information to reachany safeconclusions.

To summarize,thecomparisonof TTL vs.Timestamp-basedestimatesdemonstratedseveralpossiblefailuremodesof TTL-basedestimates,andrevealedsomeproblemsthatarecommonto both approaches.To a lesserextent,clock correctionerrorsaffect only the Timestampapproach.

Finally, we needto determinehow the packet pair assump-tion affectsaccuracy. cing expectsthat back-to-backTimes-tamp probesencounterthe sameenvironmentup to the headof the link examined.We determinewhetherthis assumptionis violatedfrequentlyenoughto have any noticeableeffect onour internalnetwork delayestimates.

We consideragainpathsto randomtargetsandobtaindataon 7,300routerson thosepathsby sending400pairsof back-to-back ICMP Timestampprobes to each router. We thencomputethe differencein the Timestampsfor eachpair, afterremoving possibleerrors due to clock resetsor adjustmentsusing the clock artifact removal algorithm.

In Figure 5 we present the fraction of routers againstthe maximum, 98-,95- and 90-percentilesof the computed

0

0.2

0.4

0.6

0.8

1

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20

frac

tion

of r

oute

rs

packet pair error (ms)

Legend90-percentile95-percentile98-percentile

max

Fig. 5. Measuredpacket pair error on a collectionof routers.

packet pair errorover the run of eachexperiment.We observethat packet pair error is small in our measurements,mean-ing that cing would provide highly accurateestimatesofnetwork-internaldelays(assumingall other issueshave beenaddressed).This result should,of course,not be regardedasa global propertyof Internetpaths:becauseof relatively goodconnectivity of our measurementsourcesour measurementsdo not provide insights into situationsof heavier congestionor in caseswhereslow links openup gapsin the packet pair,especially if such phenomenaoccur near the measurementsource.

C. Internal path stability

A rather uncommonproperty of our techniqueis that itsendspacketsnotonly on theend-to-endpathbut alsoon pathsto some network-internal nodes (which are frequently notprefixesof the end-to-endpath,aswe will discussin SectionIII-D). The robustnessof our techniquetherefore depends

Page 6: cing: Measuring Network-InternalDelays using only Existing …mbgreen/papers/cingfeas-tr.pdf · 2004-02-09 · Other approaches exist, based on passive packet sampling [7], or using

�����b�����2����X�B�����2���2�X����X�2�������

�T�X���2������B���X�2�����������������������B�X���B���i� ���B�X����������B�X���B�����

�X�B���2���B���X�B����B�X���B�����

Fig. 6. Graphicalrepresentationof pathstability results.

on the probability that all paths to internal nodesinvolvedin measurementare stableand not just the end-to-endpath,makingour techniquemoresensitive to pathinstability. Whileit is known [12] that around 91% of end-to-endpaths arelikely to be stableover periodsof hours,this result doesnotnecessarilyaccountfor pathstowardsnetwork-internalnodes.Although we are not aware of any obvious reasonwhy thesameshould not apply to network-internal nodes(and onecould assumethat instability of network-internalpathsis notindependent),we haveperformeda limited measurementstudyto validate this assumption.Our measurementsinvolve 40repeatedpathstructuresampleson eachof 1263randompathsmeasuredfrom the UPENN network within a measurementperiodof approximately15 minutesfor eachpath.The resultsaresummarizedgraphicallyin Figure6.

We found that 77.9%of end-to-endpathsarestableduringthe experiment,but closer inspectionrevealedthat 56.9% ofthe unstablepaths were unstableonly within the UPENNnetwork (hops 2, 3 and 4) indicating that local multi-pathrouting within the UPENN network was the primary causefor the observed difference.Discountingfor local instability,90.7% of the pathscan be characterizedas stable.Analysisof pathsto internal nodeson stableend-to-endpathsshowsfurther instability in the pathstructure:only 42.6%hadstablepathsto eachhop.From thosepaths,44.7%werefound to belocally unstablewithin theUPENNnetwork and22.8%locallyunstablein other partsof the network. Discountingfor localinstability, 81.4% of internal pathsappearto be stable.Thefraction of pathsin which both the end-to-endpathaswell asall internalpathsarestableis around76.1%.

These results indicate that it may be necessaryto peri-odically check for possiblerouting changes,if a significantnumberof internal nodesare usedfor measurement.On theotherhand,it appearsunlikely that many or all internalnodeswill be usedat the sametime. With this condition, we canconcludethat althoughrouting instability is somewhat worsefor our techniquecomparedto others,it doesnot significantlyaffect applicability.

D. Effect of irregular routing

We analyzethe effect of irregular routingon our technique,given the constraintsdiscussedin Section II. Most of the

dataset Source PathsUPENN upenn.edu 11,749YALE yale.edu 11,742SRI sri.com 7,076UCH uch.gr 8,553LIACS liacs.nl 11,754

TABLE II

DATA-SETS USED IN PATH STUDY

resultsare basedon five data-setscontainingmeasurementsof characteristics(see Table II) of paths to random targetIP addresses5. In most caseswe only presentresultsfor theUPENNdata-set;resultsfor theotherdata-setsdisplaysimilarpropertiesandarepresentedin the Appendix.

A key metric for the “coverage” of a path in terms ofpotential measurementnodes is the number of non-orphannodes.In Figure13 we illustratethepercentageof non-orphannodesat a given normalizeddistancefrom the sourcenode.We observe that thechanceof gettinga usablenodedecreasessignificantlyaswe move deeperinto thepathandcloserto thedestination.Our techniqueis thereforemorelikely to beusefulandaccurateon links andsegmentscloserto themeasurementsource.In the proximity of the destination,nodesagaintendto be non-orphanswith higher probability than in the core.While the generaltrendis the samefor all data-sets,therearenotabledifferencesin the magnitudeof the irregularities. Itis possibleto overcomea fraction of theseirregularities: inFigure 14 we presentthe coverageobtainedfor the pathsinthePENNdatasetwhenthePENNnodeis assistedby thefiveothernodesin overcomingirregularities.

Non-orphannodeendpointsarenot sufficient to allow us tomeasurea particularlink. Additionally, we requirebothnodesto belongto the sametomographygroup.Thusthe fraction ofnon-orphannodestell uswhich nodescanbeusedto terminatea measurablesegmentof any numberof hops,but doesnottell uswhetherindividual links aredirectly measurableor not.Figure 15 presentsthe fraction of individual links per paththat are directly measurable,and comparesthat fraction tothe (larger)numberof non-orphannodes.For many pathsthisfraction is lower than we would like. There are other waysof overcomingrouting irregularities,and measuringdelay onlinks that border orphannodes;cing doesnot, in general,currently attemptthese.For example,cing as describedinSectionII doesnot considercombiningmeasurementsin thecaseof overlappingsegments(possibleextensionsto dealwiththis aspectare discussedin SectionIV, and in [16], whichdiscussesan extendedversionof cing).

There are a numberof other interestingcharacteristicsofrouting irregularities. Most paths generally have between1and 4 tomographygroups (Figure 7). The distribution oftomographygroup diameters,e.g. the distancebetweenthefirst and last node (excluding thosegroupsincluding source

5Similar resultsareobtainedwhenconsideringtargetsobtainedfrom Webserver tracesratherthanrandomtargets.

Page 7: cing: Measuring Network-InternalDelays using only Existing …mbgreen/papers/cingfeas-tr.pdf · 2004-02-09 · Other approaches exist, based on passive packet sampling [7], or using

0

20

40

60

80

100

0 pathlength

% o

f non

-orp

han

node

s

distance from source (normalized)

UPENNYALE

SRIUCH

LIACS

Fig. 13. Percentageof non-orphannodesvs. distancefrom source

and destination,whosediameteris equal to path length) isbimodal. The diametertendsto be either relatively small orrelatively large, asshown in Figure8. This also indicatestheexistenceof segmentsthat can be analyzedin detail down tothe level of 2 or 3 hops,but also that overlappingof groupse.g. very small groupswithin larger groupsare likely to becommon.

We alsoanalyzetherelationshipbetweennumberof tomog-raphy groups,path length and numberof non-orphannodes.The ratio of non-orphannodesdecreaseswith path length,with a significant drop betweensmall (possibly local) pathsandmedium-lengthpaths,but with slower thanlineardecreaseas we move into longer paths (Figure 12). The fraction ofnon-orphannodes per path increaseswith the number oftomographygroupsavailableon a path,but the improvementdiminishesasthenumberof groupsperpathincreases(Figure10). The numberof groupsformed by a path is likely to behigher in longerpaths,as illustratedin Figure11.

To betterunderstandthefactorsbehindroutingirregularities,we examinehow they relateto intra- andinter-domainroutingchoices. Intra-domain routing is expected to contribute toirregularities,especiallyconsideringthe existenceof parallelpathsandadaptive routingsuchas [17]. Similar effectscouldalso arise at the inter-domain level: network-internal nodesandthetargetarenot necessarilypartof thesameAutonomousSystem(AS), andarein principlehandledseparatelyby BGP-basedrouting [18]. To gain insight into this matter, we look atthe AS-pathstructure,examining whetherinternal pathstendto include ASesthat are not on the original end-to-endpath.In Figure9 we show the numberof ASescrossedby internalpaths that do not appearon end-to-endpaths. We observethat for nearlyhalf the paths,all internalpathsremainwithinthe end-to-endAS-path,with the numberof additionalASeson internal pathsrarely exceedingone or two. However, thepercentageof paths where paths to internal network nodesincludeotherASesis significant.Fromtheseresults,it appearsthat both intra- and inter-domain routing are responsibleforirregular routing andthe resultinglimitations of cing.

0

20

40

60

80

100

0 pathlength

% o

f non

-orp

han

node

s

distance from source (normalized)

UPENN+SRI+YALE+LIACS+UCHUPENN+SRI+YALE+LIACS

UPENN+SRI+YALEUPENN+SRI

UPENN

Fig. 14. Percentageof non-orphannodesvs. distancefrom source,whenmultiple sourcesareused

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

0 0.2 0.4 0.6 0.8 1

fraction of paths in dataset

Fraction of non-orphan nodesFraction of directly measurable links

Fig. 15. Fraction of directly measurablelinks vs. fraction of non-orphannodesin UPENN dataset

E. Clock artifact removal: fixclock

Our experimentsgathertimestampdatafrom tensof thou-sandsof target clocks,exhibiting a broadvariety of behavior.Some clocks exhibit jump adjustments(or “resets”) manytimes per minute. Some changerate (”skew”) in peculiarpatterns. Some report times in milliseconds, but actuallyappearto have much coarserresolution. In order to reasonmeaningfully about the collected data, we must correct forsuchclock behavior if we can, and we must be preparedtodrop datathat we cannotcorrectconfidently.

Figures16 and17 are“beforeandafter” viewsof timestampcorrection.The clock on the probing sourceis adjustment-free during the probe, so the round-trip times, shown forreferencein both graphs, are real. However, the one-waytransit times (OTTs) involve the target clock, and so, atthe very least,must be expectedto have someoffset6 fromthe source clock, denotedby

\ W`] Y . The main problem isdemonstratedin Figure16. Thetargetclock’s timestampsmay

6As notedabove, we neednot discover the value of ��|�� } , and, in fact, itis impossibleto determinethe value from one-way timestamps.We chooseavalue for displaypurposes,that brings the OTTs into range.

Page 8: cing: Measuring Network-InternalDelays using only Existing …mbgreen/papers/cingfeas-tr.pdf · 2004-02-09 · Other approaches exist, based on passive packet sampling [7], or using

UPENN YALE SRI LIACS UCH

% o

f pat

hs

0

10

20

30

40

50

1 1 1 1 12 2 2 2 23 3 3 3 34 4 4 4 45 5 5 5 56 6 6 6 67 7 7 7 78 8 8 9

Fig. 7. Tomographygroupsper pathUPENN YALE SRI LIACS UCH

% o

f gro

ups

0

5

10

15

20

1 1 1 1 15 5 5 5 510 10 10 10 1015 15 15 1520 20 20 20 20

Fig. 8. TomographygroupdiameterUPENN YALE SRI LIACS UCH

% o

f pat

hs

01020304050607080

0 0 0 0 01 1 1 1 12 2 2 2 23 3 3 3 34 4 4 4 45 5 5 56 6 6 67 78 8

Fig. 9. ASesin internalpaths

groups/path 1 2 3 4 5 6 All

% n

on−

orph

an n

odes

20

30

40

50

60

70

80

90

100

20

30

40

50

60

70

80

90

100

11738652 3237 4180 2484 881 231

Fig. 10. Groups/pathvs. non-orphans

groups/path 1 2 3 4 5 6 All

Pat

h le

ngth

(ho

ps)

0

3

6

9

12

15

18

21

24

27

30

33

36

0

3

6

9

12

15

18

21

24

27

30

33

36

11738652 3237 4180 2484 881 231

Fig. 11. Groups/pathvs. path length

path length: 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 All

% n

on−

orph

an n

odes

0

10

20

30

40

50

60

70

80

90

100

0

10

20

30

40

50

60

70

80

90

100

1173814 45 155 228 317 489 698 925 1055 1147 1147 1256 1029 769 627 566

Fig. 12. Path lengthvs. non-orphans

0

50

100

150

200

250

300

350

0 50 100 150 200 250 300 350 400

tran

sit t

ime,

mse

c

request sent, sec

(translated) original source-to-target timeround-trip time

Fig. 16. Beforefixclock correction.

0

20

40

60

80

100

120

140

160

180

200

0 50 100 150 200 250 300 350 400

tran

sit t

ime,

mse

c

request sent, sec

(translated) corrected source-to-target timeround-trip time

Fig. 17. After fixclock correction.

reflect rate-skew and jumpscomparedto the sourceclock. Inthis example,thesource-to-targetOTTs climb about245msecgradually over about 65 seconds,then drop back suddenlyto begin the samelinear rise again.This cannot reasonablybe attributed to networking delays.It is clearly target clockbehavior. On the other hand, we seestray OTT points thatseemmost reasonablyattributed to network delays, not toclock behavior. This attribution is supportedby thecorrelationof thesestray points with increasedround-trip times, whichwe know to be independentof the target clock, and so must

indicate delay variation in at least one direction. The taskof timestampcorrectionis to divide apparent delayvariationinto actual delay variation and artifacts of the target clockbehavior reflectedin thetimestamps.Oncethedatais cleaned,asdepictedin Figure17, we canconsiderthe OTT variation,asdiscussedabove, wherethe main remainingclock problemis unknown, fixed, relative offset.

In this example it is quite clear what the remoteclock isdoing, so correctingfor it seemsstraightforward. If we wishto handleonly clock skew, or only jumps,andwe areprepared

Page 9: cing: Measuring Network-InternalDelays using only Existing …mbgreen/papers/cingfeas-tr.pdf · 2004-02-09 · Other approaches exist, based on passive packet sampling [7], or using

to assumethat congestiondelaysdo not occurat inopportunetimes to obscurethe target clock behavior, the problem isalsonot too daunting.The problemis moredifficult whenweattemptto designanalgorithmto handlethegeneralcase,of anunknown combinationof non-negligible skew, skew variation,jumps, and congestiondelays,and we needto automatethecorrectionefficiently in order to processthousandsof probedatasets.

Several approachesto the detectclock behavior andcorrecttimestampshavebeenproposed.We particularlynotePaxson’sinsights [19] regarding jump detection and regarding thestatisticalsignatureof non-negligible skew; and the convex-hull-basedstrategies of Zhang, Liu, and Xia [20]. Neitherthe Paxson nor the ZLX methodsare able to addressthevariety of phenomenathat confront us simultaneously. Wehave preferredto develop our own approachwith a view togreatergenerality—anunavoidableobjective in our researchcontext—and to greaterconceptualunity and simplicity thanwe have found in the previously proposedmethods.Ourfixclock algorithm,unlike its predecessors,doesnot beginby trying to locate the discontinuitiesin the target clock’sbehavior. Instead,it builds regions that are convincingly freeof target clock discontinuities,regions throughoutwhich thealgorithm has grounds to assumethat the target clock isbehaving linearly. Each suchregion is subjectto a commonlinear correction.The regionsmaycover all or very nearlyallof thedata,in whichcaseit is likely thatonecouldcharacterizethe transitionsbetweenthe regionsas jumpsor skew changesthat occurredat precise times. But if the algorithm leavesgapsbetweenthe regions, typically as a result of obscuringcongestiondelays,this is of no concern:we are interested,in the first place, in the correctedtimestamps,not a precisehistory of the remoteclock’s behavior.

A detaileddiscussionof thefixclock algorithm,aswellas backgroundremarkson the Paxsonand ZLX approaches,may be found in [15].

F. Networkload and securityissues

For experimental purposes,the network and router pro-cessingresourcesrequiredby cing is not expectedto be aconcern.Resource-consciouschoiceof thenumberof networknodesparticipatingin measuringpathsis neverthelessimpor-tant to avoid biasedresults.Wider use of any method thatinvolvesprobing routersis also likely to consumesignificantresourcesat intermediateroutersandthereforeneedscaution.

The use of our tool to investigatequeuingdelayson thelinks in a small numberof pathswill not imposea large loadon the network. Similarly, it will not excite securitysoftware.In contrast,large scalemeasurementsof thousandsof pathswill likely raisealarms,becauseICMP TimestampandICMPEchomessagesareoften usedfor maliciouspurposessuchasscanningand fingerprinting of remotesystemsfor mountingappropriateattacks.The intensityof our experiments,aswellasthe needfor picking randompaths,which involvesprobinga large numberof randomIP addresses,further addstowardsa suspiciouslooking traffic signature.We expected(correctly)

that we would causealarmswhen observed by firewalls orintrusiondetectionsystems.

We thereforeinitially choseto embeda URL in our probepackets,pointing to a Webpageexplainingthenatureof theseprobes.The Web pagewas visited several times, and therewere a handful of routine complaintsmadethroughemail [email protected] (fewer than 5 over the courseofour experiments)andtwo requeststo filter out specificIP pre-fixes.For ensuringuninterruptedexperimentsandto make theinnocentnatureof our probesmoreobvious,we wereadvisedto advertiseappropriatehost-namesin DNS (e.g.netmap-X-contact-telnumber-email-at.cis.upenn.edu).

IV. FUTURE WORK

We know of severalsituationswherecing cangive inaccu-rate estimates.Our measurementsseemto indicatethat suchsituationsare rare. However, althoughwe have investigatedon the order of 10,000 paths,we have used fewer than 10sources.Consequentlywe are not confident that our studyis comprehensive. Fortunately, we believe that we understandhow to compensatefor the inaccuraciesand limitations thatwe know about.We have not incorporatedthem into cingbecausewe felt that without compelling evidence that theywould be useful,we couldnot justify increasingthe complex-ity of our tool. Nor haveweyetsubjectedthesenew techniquesto testingcomparableto the basiccing techniques.Shouldbroaderexperiencewith cing indicate that thesepotentialproblemsare limitations in practice,we would pushforwardin thoseareas.

For example, accuracy may be reduced when the twopacketsin asinglepacket-pairexperiencedifferentlatenciesontheir sharedpath.By sendingboth packetsin a packet-pairtothe“head”of thelink wecangetasampleof thedistributionofpacket-pairdistortions.If we assumesuchdistortions(whetherdue to differencesin queuingdelay or variationsin time togenerateICMP packets)areindependentof thequeuingdelay,thenwe candeconvolve our measureddelaywith the packet-pairdifferencein orderto recoverfrom packet-pairerrors.Thiswill not correctfor (hypothetical)biasthat may be correlatedwith large delay, though.

We are pursuing improving the coverage of our directmethod. For instance,we can estimatethe distribution ofqueuingdelay in the sharedoverlapof overlappingsegments,by usingdeconvolution in an adaptationof the indirect infer-encetechniqueover non-treetopologies.This would allow usto measureany link, or segment, as long as neither of theendpointsof the segmentwereorphans.The valueof suchanextensionis shown in Figure 15, althoughone would expectsomedecreasein accuracy androbustnessof the resultinges-timates,whencomparedto a puredirect approach.Onecouldextendit further, to evencover orphans,by usingTTL-expiredmessagesto estimatedelayson links anchoredby orphansandcombining theseestimateswith overlappingsegmentsusingdeconvolution. As shown, theseare lessaccuratethan usingthan Timestampmeasurements,but an improvementover nomeasurementsat all.

Page 10: cing: Measuring Network-InternalDelays using only Existing …mbgreen/papers/cingfeas-tr.pdf · 2004-02-09 · Other approaches exist, based on passive packet sampling [7], or using

In general,we areconsideringinterestinghybridsof directtechniques� and indirect techniques.The direct measurementsimprove the accuracy andreducethe computationcost,whilethe indirect measurementsincreasecoverage.Our ongoingwork alongtheselines is describedin [16].

V. SUMMARY AND CONCLUDING REMARKS

We have revisited the possibility of measuringnetwork-internal delay distributions using direct methods.Our workshows that the requiredinfrastructureis deployed on the vastmajority of routers(over 97%) and on every router in abouthalf of all paths.We generallydivide a pathup into a small-numberof multi-link segments,so the ability to investigatedelay on every individual link in only half the paths is notmuch of a limitation in practice,and the infrastructureis inplaceto usecing almostanywhere.Thus,directmeasurementof network-internaldelayswithout addingany extra measure-mentinfrastructure is feasible.

The accuracy of cing dependedupon two broadassump-tions. First, we assumethat back-to-backpackets experiencethe sameperformanceand behavior on their sharedpath —we can then use the differencein time to measurethe timespenton the unsharedsegment.Second,we assumethat,withrespectto queuingdelay, our ICMP Timestampprobesbehaveindistinguishablyfrom TCP or UDP packets. If so, we canthenclaim thatmeasurementsderivedfrom our ICMP packetsapply to otherpackets,too.

Our measurementsshow that the first assumptionis truemost,but not all, of the time. Back-to-backpacketsappeartoexperienceonly negligible differencesin queuingdelays,how-ever ICMP packet generationtimescansometimesbevariablewithin a single router. Suchvariation may be misinterpretedas queuingdelay. Nevertheless,our measurementsshow thatmore than 90% of all routers introducesuch distortion lessthan10% of the time. Further, if it becomesnecessary, thereare techniquesthat canrecover from this distortionwith highprobability. We have avoided suchtechniquesso far, becausethey increasethe complexity of cing for very little gain.

Our measurementsalso show that despitethe fact that ourprobesare addresseddirectly to the last router and not tothe end node, and despitethe fact that they are processedin the slow path rather than forwarded in the fast path,the excessoverheadis roughly constant(other than a smallamount of variability introduced during the generationofICMP packets). Constantoverheadcancelsout becausewesubtractout the shortestmeasuredtime from all observations.Theonly variability thatremainsarequeuingdelaysin thenet.

Given that theseassumptionsare true, then it is clear frominspection(andconfirmedby simulation[10]) thatcing willreturnanaccurateestimateof thedistribution of queuingdelayon eachsegment that it measures.Thus, direct measurementof queuingdelayusingonly existinginfrastructure is accurate.

Earlier [10] we hadshown thatdirect techniquesto measurenetwork-internalqueuingdelaysaremore accuratethan indi-rect inferencetechniques.Here we have shown that one-waytimestampsare a more accuratedirectmeasurementtechnique

than RTT measurementsof probesdesignedto induce TTL-expired messages. The advantagearisesbecauseof the exis-tenceof both asymmetricroutesand asymmetriccongestionon symmetricroutes.

Routing irregularity in the Internetdoesoften prevent thecurrent version of cing from measuringthe queuingdelayon certain specific links. Fortunately, cing can detect(andreport), in advance,its inability to measurespecific links inisolation. Further, breakingup a path into a few multi-hopsegmentsis usually sufficient to isolatecongestionpoints. Ifthe inability to measurespecific links due to routing irregu-larity becomesa problem in practice,then, as mentionedinSectionIV, we arepursuingseveral promisingapproachestoincreasecoverage, and improve accuracy, but at the expenseof morecomplexity in cing.

ACKNOWLEDGMENTS

This work wassupportedby the DoD University ResearchInitiative (URI) programadministeredby the Office of NavalResearchunderGrantN00014-01-1-0795.

We thank John C. Parker, Dave Millar and CharlesBuchholtzat UPenn,C. Ricudisat AUTH, A. Hatzistamatiouand G. Portokalidis at UCH, H. Bos and V. Viagers atLIACS, M. Hogsett, P. Lincoln at SRI, and C. Powelland J. Feigenbaumat Yale for making it possible to takemeasurementsfrom differentsites.We alsothankS. Ioannidis,B. Knutsson,J. M. Smith andthe membersof the DistributedSystemsLaboratoryfor valuablecommentsand suggestionsthroughoutthe courseof this work.

REFERENCES

[1] V. Jacobson,“Congestionavoidanceand control,” in ProceedingsofSymposiumon CommunicationsArchitectures and Protocols (SIG-COMM ’88), 1988,pp. 314–329.

[2] ——, “pathchar - A tool to infer characteristicsof Internet paths,”available from ftp://ftp.ee.lbl.gov/pathchar, April 1997.

[3] A. B. Downey, “Using pathcharto estimateInternetlink characteristics,”in Proceedingsof ACM SIGCOMM, September1999,pp. 241–250.

[4] A. Adams,T. Bu, R. Caceres,N. Duffield, T. Friedman,J. Horowitz,F. Lo Presti,S.Moon,V. Paxson,andD. Towsley, “The useof end-to-endmulticast measurementsfor characterizinginternal network behavior,”IEEE CommunicationsMagazine, vol. 38, no. 5, pp. 152–159,May2000.

[5] N. G. Duffield and F. Lo Presti, “Multicast inferenceof packet delayvarianceat interior network links,” in Proceedingsof IEEE INFOCOM,vol. 3, April 2000,pp. 1351–1360.

[6] N. G. Duffield, J. Horowitz, F. Lo Presti,andD. Towsley, “Network de-lay tomographyfrom end-to-endunicastmeasurements,” in Proceedingsof the 2001InternationalWorkshopon Digital Communications2001-EvolutionaryTrendsof the Internet, September2001.

[7] N. Duffield andM. Grossglauser, “Trajectorysamplingfor direct trafficobservation,” IEEE/ACM Transactionson Networking, vol. 9, no. 3, pp.280–292,June2001.

[8] M. J. Luckie, A. J. McGregor, and H.-W. Braun, “Towards improvingpacket probing techniques,” in Proceedingsof the ACM SIGCOMMInternetMeasurementWorkshop2001, November2001,pp. 145–150.

[9] K. G. Anagnostakis,M. B. Greenwald, and R. S. Ryger, “On thesensitivity of network simulation to topology,” in Proceedingsof theTenthIEEE/ACM Symposiumon Modeling, Analysis,and SimulationofComputerand TelecommunicationsSystems(MASCOTS2002), October2002,pp. 117–126,Fort Worth, Texas.

Page 11: cing: Measuring Network-InternalDelays using only Existing …mbgreen/papers/cingfeas-tr.pdf · 2004-02-09 · Other approaches exist, based on passive packet sampling [7], or using

[10] K. G. Anagnostakisand M. B. Greenwald, “Direct measurementvs.indirect� inferencefor determiningnetwork internaldelays,” PerformanceEvaluation, vol. 49, no. 1-4, pp. 165–176,September2002, (Perfor-mance’02).

[11] ——, “On the feasibility of network delay tomographywithout infras-tructure support,” CIS Department,University of Pennsylvania, Tech.Rep.MS-CIS-01-35,December2001.

[12] V. Paxson,“End-to-endroutingbehavior in theInternet,” in Proceedingsof ACM SIGCOMM, August1996,pp. 25–38.

[13] V. Jacobson,“traceroute software,” Original releaseavailable fromftp://ftp.ee.lbl.gov/traceroute.tar.gz.

[14] R. Govindan and V. Paxson,“Estimating router ICMP generationde-lays,” in Proceedingsof thePassiveandActiveMeasurementsWorkshop(PAM), March 2002.

[15] R. S. Ryger, “fixclock: removing clock artifacts from communi-cation timestamps,” Departmentof ComputerScience,Yale University,Tech.Rep.YALEU/DCS/TR-1243,December2002.

[16] K. G. Anagnostakisand M. B. Greenwald, “A hybrid direct-indirectestimatorof network internal delays,” CIS Department,University ofPennsylvania,Tech.Rep.MS-CIS-02-31,December2002,submittedtothe 4th Workshopon Passiveand ActiveMeasurements(PAM03).

[17] A. Shaikh,J. Rexford, andK. G. Shin, “Load-sensitive routing of long-lived IP flows,” in Proceedingsof ACM SIGCOMM, September1999,pp. 215 – 226.

[18] Y. RekhterandT. Li, “A bordergateway protocol4 (BGP-4),” RFC1771,http://www.rfc-editor.org/, March 1995.

[19] V. Paxson,“On calibrating measurementsof packet transit times,” inProceedingsof ACM SIGMETRICS, June1998,pp. 11–21.

[20] L. Zhang, Z. Liu, and C. H. Xia, “Clock synchronizationalgorithmsfor network measurements,” in Proceedingsof IEEE INFOCOM, June2002,pp. 160–169.

Page 12: cing: Measuring Network-InternalDelays using only Existing …mbgreen/papers/cingfeas-tr.pdf · 2004-02-09 · Other approaches exist, based on passive packet sampling [7], or using

APPENDIX

ADDITIONAL MEASUREMENT RESULTS

UPENN YALE SRI LIACS UCH

% o

f pat

hs

0

10

20

30

40

50

1 1 1 1 12 2 2 2 23 3 3 3 34 4 4 4 45 5 5 5 56 6 6 6 67 7 7 7 78 8 8 9

Fig. 18. Number of tomography groups per path

UPENN YALE SRI LIACS UCH

% o

f gro

ups

0

5

10

15

20

1 1 1 1 15 5 5 5 510 10 10 10 1015 15 15 1520 20 20 20 20

Fig. 19. Tomography group diameter (distance betweenfirst and last node in a group)

UPENN YALE SRI LIACS UCH

% o

f pat

hs

01020304050607080

0 0 0 0 01 1 1 1 12 2 2 2 23 3 3 3 34 4 4 4 45 5 5 56 6 6 67 78 8

Fig. 20. Number of ASesin non-end-to-endpaths that are do not appear on the end-to-end path

UPENN

groups/path 1 2 3 4 5 6 All

% n

on−

orph

an n

odes

20

30

40

50

60

70

80

90

100

20

30

40

50

60

70

80

90

100

11738652 3237 4180 2484 881 231

YALE

groups/path 1 2 3 4 5 6 All

% n

on−

orph

an n

odes

20

30

40

50

60

70

80

90

100

20

30

40

50

60

70

80

90

100

117421758 5381 3574 848 144 22

SRI

groups/path 1 2 3 4 5 6 All

% n

on−

orph

an n

odes

20

30

40

50

60

70

80

90

100

20

30

40

50

60

70

80

90

100

117151506 4692 3875 1335 241 56

UCH

groups/path 1 2 3 4 5 6 All

% n

on−

orph

an n

odes

20

30

40

50

60

70

80

90

100

20

30

40

50

60

70

80

90

100

85532218 3660 1768 752 131 19

LIACS

groups/path 1 2 3 4 5 6 All

% n

on−

orph

an n

odes

20

30

40

50

60

70

80

90

100

20

30

40

50

60

70

80

90

100

117455278 4534 1421 282 122 46

Fig. 21. Relationship betweennumber of groups per path and number of non-orphan nodes

groups/path 1 2 3 4 5 6 All

Pat

h le

ngth

(ho

ps)

0

3

6

9

12

15

18

21

24

27

30

33

36

0

3

6

9

12

15

18

21

24

27

30

33

36

11738652 3237 4180 2484 881 231

groups/path 1 2 3 4 5 6 All

Pat

h le

ngth

(ho

ps)

0

3

6

9

12

15

18

21

24

27

30

33

36

0

3

6

9

12

15

18

21

24

27

30

33

36

117421758 5381 3574 848 144 22

groups/path 1 2 3 4 5 6 All

Pat

h le

ngth

(ho

ps)

0

3

6

9

12

15

18

21

24

27

30

33

36

0

3

6

9

12

15

18

21

24

27

30

33

36

117151506 4692 3875 1335 241 56

groups/path 1 2 3 4 5 6 All

Pat

h le

ngth

(ho

ps)

0

3

6

9

12

15

18

21

24

27

30

33

36

0

3

6

9

12

15

18

21

24

27

30

33

36

117455278 4534 1421 282 122 46

groups/path 1 2 3 4 5 6 All

Pat

h le

ngth

(ho

ps)

0

3

6

9

12

15

18

21

24

27

30

33

36

0

3

6

9

12

15

18

21

24

27

30

33

36

85532218 3660 1768 752 131 19

Fig. 22. Relationship betweennumber of groups per path and path length

Page 13: cing: Measuring Network-InternalDelays using only Existing …mbgreen/papers/cingfeas-tr.pdf · 2004-02-09 · Other approaches exist, based on passive packet sampling [7], or using

path length: 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 All

% n

on−

orph

an n

odes

0

10

20

30

40

50

60

70

80

90

100

0

10

20

30

40

50

60

70

80

90

100

1173814 45 155 228 317 489 698 925 1055 1147 1147 1256 1029 769 627 566path length: 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 All

% n

on−

orph

an n

odes

0

10

20

30

40

50

60

70

80

90

100

0

10

20

30

40

50

60

70

80

90

100

11742503 556 834 990 1305 1384 1367 1278 984 710 492 276 208 205 120 61path length: 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 All

% n

on−

orph

an n

odes

0

10

20

30

40

50

60

70

80

90

100

0

10

20

30

40

50

60

70

80

90

100

11715905 1009 1018 1323 1486 1095 1119 956 677 486 294 205 144 106 81 47path length: 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 All

% n

on−

orph

an n

odes

0

10

20

30

40

50

60

70

80

90

100

0

10

20

30

40

50

60

70

80

90

100

11745235 517 492 592 835 767 908 1116 1543 1109 846 791 678 377 302 184path length: 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 All

% n

on−

orph

an n

odes

0

10

20

30

40

50

60

70

80

90

100

0

10

20

30

40

50

60

70

80

90

100

855329 82 293 439 436 611 778 829 849 719 657 533 455 289 180 163

Fig. 23. Relationship betweenpath length and number non-orphan nodes