
1 20 February 2006

Measuring the Internet: State of the Art and Challenges

February 2006

Matti Siekkinen [[email protected]]

Institut Eurecom, Sophia Antipolis, France

2 20 February 2006

Background and motivation
- The Internet

Part 1: Different measurement approaches and techniques
- Passive vs. active
- On-line vs. off-line
- Aggregation level
- Hardware vs. software
- Storage methods
- Environment
- Case study 1: InTraBase
- Case study 2: Gigascope

Part 2: The analysis
- Traffic characterization and modeling
- Network characterization and modeling
- Anomaly detection
- Case study on TCP Root Cause Analysis

3 20 February 2006

What is the Internet?

4 20 February 2006


5 20 February 2006

6 20 February 2006

What is the Internet?

A collection of computers capable of communicating with each other using a standard set of protocols (TCP/IP)

An internetwork: made up of numerous networks
- Each network comprises numerous hosts and routers
- Hosts are endpoints, routers are internal way-stations
- Connections between hosts are links

Packet network with best effort delivery

7 20 February 2006

What is the Internet?

Internet Service Providers (ISPs) are classified into tiers based on size and capacity
- tier 1: global reach, 20 providers (British Telecom (BT), Cable & Wireless, Global Crossing, Level 3, Sprint, MCI (UUnet), Verio (NTT), …); the backbone of the Internet
- tier 2: regional, ~3000
- tier 3: local, ~17000

Each ISP's distinct network forms an AS (Autonomous System)

180,000 reachable networks

8 20 February 2006

What is the Internet?


9 20 February 2006

Why do we need to measure it?

The Internet was not designed for the current usage
- Originally designed as a private military research network that should be resilient
- No built-in security, QoS, nothing…

Operators want to correctly provision their networks
- Modeling traffic
- Modeling user behavior

Operators want to manage their networks
- Load balancing
- Identify bottlenecks

In case the network is not correctly provisioned…
- Identify misconfigured devices (e.g. routers)

10 20 February 2006

Why do we need to measure it?

Guide Internet application development
- Model the traffic
- Perform empirical studies
- Simulations are not always sufficient

Security-related issues
- Using honeypots to understand attacking processes
- Intrusion Detection Systems (IDS)

11 20 February 2006

Why is all this very challenging?

The Internet has no built-in measurement mechanisms
- The end-to-end arguments (J. H. Saltzer, D. P. Reed, and D. D. Clark. End-to-end arguments in system design. ACM Transactions on Computer Systems, 1984)
- The network is stupid, intelligence is at the edges

The Internet is a constantly moving target
- Traffic volumes are ever increasing
- Dominating applications
  - before: HTTP (Web) and FTP
  - now: P2P
  - tomorrow: ?
- Access link capacities
  - a few years ago (in Europe): 512 kbit/s
  - now: > 8 Mbit/s
- More and more mobility

There is no such thing as "typical"

12 20 February 2006

Why is all this very challenging?

Traffic volumes are very large
- Methods need to be scalable

Traffic data is sensitive
- Legal issues: privacy
- Business: ISPs are reluctant to disclose any information
- Security: attackers get the same knowledge

(however: no security through obscurity)


13 20 February 2006

Part 1: Different measurement approaches and techniques

- Passive vs. active
- On-line vs. off-line
  - DSMSs
- Aggregation level
  - Sampling
- Hardware vs. software
- Storage methods
  - flat files vs. DBMS vs. data warehouse
- Environment
  - wired vs. wireless
  - backbone vs. WAN vs. LAN
- Case study 1: InTraBase (Integrated Traffic Analysis using a DBMS)
  - Database system for passive, off-line measurements and analysis
- Case study 2: Gigascope
  - Passive, on-line packet monitoring platform (a kind of DSMS…)

14 20 February 2006

Measurements: Passive vs. active

Passive
- Observe and record the traffic as it passes by
- Useful for characterizing the Internet traffic
- ☺ Measures real traffic
- ☺ Does not perturb the network
- No control over the measurement process

Active
- Inject packets into the network, follow them and measure the service obtained
- Useful for inferring the network characteristics
- ☺ Full control over the measured traffic
- ☺ Important for available bandwidth and link capacity estimation techniques
- Often needs access to two measurement points at strategic locations
- Can perturb the network

15 20 February 2006

Measurements: On-line vs. off-line

On-line
- Perform (at least a part of) the analysis on the observed traffic in a real-time manner
- Often necessary when handling very large amounts of traffic
  - E.g. monitoring of one Abilene Internet2 backbone link (OC-192, 10 Gbit/s link) produced >8 MBytes/s of uncompressed packet headers
- ☺ Data reduction, don't need to store everything
- ☺ Results right away, can react immediately
- Efficient solutions can be complex and/or proprietary (Gigascope)
- Do not necessarily have all the raw packet data for later analysis
- See also the lecture "DSMS for Network Monitoring" by Prof. V. Goebel

Off-line
- Capture traffic into trace files and analyze later
- ☺ Possible to run complex, time-consuming analysis
- ☺ Simple and cheap solutions exist (e.g. tcpdump)
- Not applicable for time-critical scenarios
- Storage can become an issue

16 20 February 2006

Measurements: Aggregation level

Packets
- Capture whole or partial packets (e.g. only TCP/IP headers)
- ☺ Have it all
- ☺ Can construct connection-level data and/or do detailed packet-level analysis
- Storage requirements
- Analysis is resource-consuming and thus slow

Flows
- Usually grouped by timeouts and/or maximum packet counts and the five-tuple: (src IP, dst IP, src port, dst port, layer 3 protocol)
- Cisco's Netflow (latest version 9)
  - Flows computed in Netflow-enabled devices (e.g. routers)
  - 7 keys to define a flow: src & dst addresses, src & dst ports, layer 3 protocol, TOS (type of service) byte, input interface
  - Flow ends (by default) after 15 s inactive timeout, 30 min active timeout, or when the flow cache is full ⇒ not true connections
- ☺ Relieves memory requirements for on-line measurements
- Connection-level analysis needs assembly
- Lose packet-level knowledge
- (a sketch of 5-tuple flow aggregation follows below)
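A minimal sketch, assuming timestamp-sorted packet records, of grouping packets into flows keyed by the 5-tuple with an inactivity timeout; Netflow's active timeout and cache-size expiry are omitted, and the field layout is an illustrative assumption:

def build_flows(packets, inactive_timeout=15.0):
    """packets: iterable of (ts, src_ip, dst_ip, src_port, dst_port, proto, length)
    sorted by timestamp; returns a list of flow records."""
    active = {}                                    # 5-tuple -> current flow record
    flows = []
    for ts, src, dst, sport, dport, proto, length in packets:
        key = (src, dst, sport, dport, proto)
        flow = active.get(key)
        if flow is None or ts - flow["last"] > inactive_timeout:
            if flow is not None:
                flows.append(flow)                 # expire the idle flow
            flow = {"key": key, "first": ts, "last": ts, "packets": 0, "bytes": 0}
            active[key] = flow
        flow["last"] = ts
        flow["packets"] += 1
        flow["bytes"] += length
    flows.extend(active.values())                  # flush whatever is still active
    return flows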

Page 5: Background and motivation Different measurement approaches … · )E.g. monitoring of one Abilene Intenet2 backbone link (OC-192, 10 Gbit/s link) produced >8 MBytes/s of uncompressed

17 20 February 2006

Measurements: Aggregation level

Connections
- TCP connections
- ☺ True connection-level data enables full-scale end-to-end analysis
- Tough memory requirements for on-line analysis
- When is a TCP connection finished?
- Lose packet-level knowledge

Sampling
- Do not record each packet
- Use statistical methods to estimate e.g. flow sizes
- For the interested ones: see the work of N. Duffield (AT&T Labs-Research)
- ☺ Utilize less resources
- Trade off some accuracy
- (a sampling sketch follows below)
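A minimal sketch of systematic 1-in-N packet sampling with naive inversion (scale the sampled byte count back up); real estimators, e.g. in Duffield's work, are considerably more careful, and the traffic mix below is made up:

import random

def sample_and_estimate(packet_sizes, n=100, seed=0):
    rng = random.Random(seed)
    offset = rng.randrange(n)              # random starting phase
    sampled = packet_sizes[offset::n]      # keep every n-th packet
    return n * sum(sampled), len(sampled)  # scale back up by the sampling rate

sizes = [1500] * 9000 + [64] * 1000        # hypothetical packet-size mix
print(sample_and_estimate(sizes))          # estimate vs. a true total of 13,564,000 bytes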

18 20 February 2006

Measurements: Hardware vs. software

Hardware
- Endace's DAG cards
  - Only for packet capturing
- Network processors (Intel's IXP cards)
  - Programmable devices for packet processing
- ☺ Fast ⇒ can measure/process Gigabit and faster links
- ☺ Accurate
  - ☺ GPS-synchronized accurate timestamping (DAG)
  - ☺ Fewer packet drops and less corruption
- Often very expensive
- Can be non-trivial to use

Software
- tcpdump and friends
- ☺ Cheap (free)
- ☺ Easy to use
- Slow
- Precision may be too low for certain use cases

19 20 February 2006

Measurements: Storage methods

Plain files
- The most common method
- ☺ Simple
- Management can become problematic in case of large amounts of measurements and analysis results
- Non-reproducible results

Database
- Store measurements and/or metadata and/or analysis results
- ☺ Solves many problems of management
- ☺ Correct design can bring good performance (indexes etc.)
- Some overhead in disk space usage and processing
- Requires an initial investment to master

20 20 February 2006

Measurements: Storage methods

Data warehouse
- Collect, summarize, and organize data within a database in order to facilitate and speed up further processing, commonly querying
- A database system optimized for analysis and/or data mining
- Complex queries to extract information that reveals correlation between different sources of data
- Create collections, "views", of data from "raw" data by aggregating the data
- The warehouse can be periodically reconstructed, periodically updated from the sources, or updated upon changes in the source data
- ☺ Enables mining large amounts of data
- The quality of analysis depends on the quality of the defined views
- It takes time to master DW techniques


21 20 February 2006

Measurement environment: Wired or not?

Wired
- Everything else in the Internet is wired except maybe the last hops
- (Typically) FIFO scheduling with drop tail
- RIP and OSPF for intra-domain routing
- BGP for inter-domain routing

Wireless
- GPRS & UMTS
- 802.11 (Wi-Fi)
- Can be non-FIFO scheduling
- Different routing mechanisms (e.g. ad-hoc networks)

Need to focus on different issues

22 20 February 2006

Measurement environment: Global vs. Local

Wide area Internet traffic
- ISP or university edges
- Backbone

Local
- Within an ISP
- Within a university
- Within an enterprise

Traffic characteristics are different
- Application sets
- Traffic volumes

Network characteristics are different
- Link capacities
- Delays

Need to focus on different issues

23 20 February 2006

Part 1: Different measurement approaches and techniques

- Passive vs. active
- On-line vs. off-line
  - DSMSs
- Aggregation level
  - Sampling
- Hardware vs. software
- Environment
  - wired vs. wireless
  - backbone vs. WAN vs. LAN
- Storage methods
  - flat files vs. DBMS vs. data warehouse
- Case study 1: InTraBase (Integrated Traffic Analysis using a DBMS)
  - Database system for passive, off-line measurements and analysis
- Case study 2: Gigascope
  - Passive, on-line packet monitoring platform (a kind of DSMS…)

24 20 February 2006

M. Siekkinen, E.W. Biersack, V. Goebel, T. Plagemann, and G. Urvoy-Keller. InTraBase: Integrated Traffic Analysis Based on a Database Management System. E2EMON 2005.

M. Siekkinen, V. Goebel, and E.W. Biersack. Object-Relational DBMS for Packet-Level Traffic Analysis: Case Study on Performance Optimization. E2EMON 2006.

Database system for passive, off-line measurements and analysis

Case study 1: InTraBase


25 20 February 2006

Case study 1: InTraBase
Outline

Motivation

Our InTraBase approach

First Prototype of InTraBase
- Performance evaluation

Conclusions

26 20 February 2006

Case study 1: InTraBase
Motivation

Current situation in off-line traffic analysis

State of the art is handcrafted scripts and numerous specialized software tools

Traffic analysis is an iterative process

Large amount of data for analysis

27 20 February 2006

Case study 1: InTraBase
Motivation

Resulting problems:

1. Management
   - Data, metadata, and tools
   - Getting lost with files containing data and ad-hoc scripts

2. Analysis cycle
   - Data loses semantics and structure

3. Scalability
   - Cannot even analyze 10 GB data sets

[Figure: the iterative analysis cycle, with the steps Filter, Process, Combine, Store, Interpret, and Define new task]

28 20 February 2006

Case study 1: InTraBase
The approach

Perform traffic measurements and store results in files

Upload base data into the db and process it within the db

Issue SQL queries
- Use the extensibility of the DBMS to create functions for advanced processing

Base data:
- tcpdump packet traces
- Application logs
- ...


29 20 February 2006

Case study 1: InTraBase
The approach

[Figure: InTraBase architecture. Raw base data files (tcpdump packet traces, Web100 logs, application logs) captured at a network link are preprocessed and uploaded into the DBMS / data warehouse, which stores base data, descriptions, results, and functions in tables (application, TCP, IP, and sub-tables). Off-line analysis is carried out through queries.]

30 20 February 2006

Case study 1: InTraBase
Benefits from a DBMS-based Approach

Support provided by the DBMS to organize and manage data, related metadata, analysis results and tools

The database consists of reusable components
- Performing new analysis is less laborious and error-prone

31 20 February 2006

Case study 1: InTraBase
Benefits from a DBMS-based Approach

Data becomes structured and conserves semantics

Processing and updating data is easier

Searching is more efficient (indexes)

Store reusable intermediate results

It is easier to combine different data sources
- E.g. application-level events explain some of the phenomena in the traffic at the TCP layer

32 20 February 2006

Case study 1: InTraBase
Drawbacks

However,

Initial investment to master a DBMS

Elevated processing time and disk space consumption
- For simple tasks with small datasets, simple tools such as tcptrace are sufficient


33 20 February 2006

Case study 1: InTraBase
A Prototype of InTraBase

Analyze TCP traffic from tcpdump packet traces with PostgreSQL

PostgreSQL
- Object-relational DBMS
- Allows extending the functionality with new functions
- Large user community => support

34 20 February 2006

Case study 1: InTraBase
Processing a tcpdump file

A pcap or DAG trace file is first preprocessed with a modified tcpdump or dagdump (enforce structure, add connection ids) and loaded into the DBS with psql copy. Then:

1. Copy packets from the trace file into the packets table
2. Build an index for the packets table based on cnxid
3. Create connection-level statistics into the connections table
4. Insert unique 4-tuple to cnxid mapping data into the cid2tuple table

(a sketch of steps 1-3 follows below)
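A minimal sketch, not the actual InTraBase loader, of steps 1-3 using psycopg2 against PostgreSQL; the table and column names and the CSV intermediate format are illustrative assumptions:

import psycopg2

def load_trace(csv_path, dsn="dbname=intrabase"):
    conn = psycopg2.connect(dsn)
    cur = conn.cursor()
    # Step 1: bulk-load packet records produced by the modified tcpdump/dagdump
    with open(csv_path) as f:
        cur.copy_expert("COPY packets (cnxid, ts, src, dst, sport, dport, len) "
                        "FROM STDIN WITH CSV", f)
    # Step 2: index the packets by connection id to speed up per-connection queries
    cur.execute("CREATE INDEX packets_cnxid_idx ON packets (cnxid)")
    # Step 3: derive connection-level statistics from the packet table
    cur.execute("""
        INSERT INTO connections (cnxid, packets, bytes, duration)
        SELECT cnxid, count(*), sum(len), max(ts) - min(ts)
        FROM packets GROUP BY cnxid
    """)
    conn.commit()
    cur.close()
    conn.close()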

35 20 February 2006

Case study 1: InTraBase
Prototype base table layout

36 20 February 2006

Case study 1: InTraBase
Prototype functions

Contains a set of functions

pl/pgSQL
- Produce time series (packet inter-arrival times, throughput…)
- Plot time-sequence diagrams in xplot format
- …

pl/R
- Produce graphs
- Statistical calculations


37 20 February 2006

Case study 1: InTraBase
Histogram of the packet inter-arrival times of the fastest connection

SELECT plot_ts_hist('SELECT * FROM iat(t2.cnxid, t2.reverse, "packets")', 'histogram.pdf')
FROM (SELECT cnxid, reverse
      FROM cnxs, (SELECT max(throughput) FROM cnxs) AS t1
      WHERE cnxs.throughput = t1.max) AS t2;

Find the connection with the highest throughput from Connections table

Fetch the packets of this connection from Packets table

Compute the inter-arrival times from timestamps

Plot the histogram

38 20 February 2006

Case study 1: InTraBase
Histogram of the packet inter-arrival times of the fastest connection

SELECT plot_ts_hist('SELECT * FROM iat(t2.cnxid, t2.reverse, "packets")', 'histogram.pdf')
FROM (SELECT cnxid, reverse
      FROM cnxs, (SELECT max(throughput) FROM cnxs) AS t1
      WHERE cnxs.throughput = t1.max) AS t2;

[Figure: data flow of the query. The connections table (connection id, bytes, packets, tput, …) identifies the fastest connection; its packets are fetched from the packets table (connection id, timestamp, start #seq, end #seq, flags, …); iat(…) computes the inter-arrival times and plot_ts_hist() writes histogram.pdf]

39 20 February 2006

Case study 1: InTraBase
Analysis of the feasibility
- Processing time
- Disk space consumption

Test data:
- BitTorrent traffic files: few large connections
- Mixed Internet traffic files: lots of small connections

Tests run on Linux 2.6.3, 2x Intel Xeon 2.2 GHz, SCSI RAID

40 20 February 2006

Case study 1: InTraBase
Analysis of the feasibility

total processing time vs. file size


41 20 February 2006

Case study 1: InTraBase
Analysis of the feasibility

disk space consumption

42 20 February 2006

Case study 1: InTraBase
Analysis of the feasibility

Processing time is good enough
- Rarely process files larger than 10 GB (overnight)

Overhead of 50% in disk space is acceptable
- Usually not an issue nowadays
- Price to pay for having structured data

43 20 February 2006

Case study 1: InTraBase
Conclusions

Experiences with a flat-file-based approach
- Management becomes an issue
  - measurements, derived data, results, and tools
- Analysis tasks become cumbersome
⇒ Need a new approach

A DBMS-based approach
- Solves many of the management problems
- Prototype scales reasonably well
- Overheads are acceptable

Usages
- Helps a lot in everyday analysis work
- Ongoing work with France Telecom on performance monitoring/troubleshooting of their ADSL platform
  - Periodically process a trace captured at the edge of the ADSL platform
  - GUI in production…

44 20 February 2006

Part 1: Different measurement approaches and techniques

- Passive vs. active
- On-line vs. off-line
  - DSMSs
- Aggregation level
  - Sampling
- Hardware vs. software
- Environment
  - wired vs. wireless
  - backbone vs. WAN vs. LAN
- Storage methods
  - flat files vs. DBMS vs. data warehouse
- Case study 1: InTraBase (Integrated Traffic Analysis using a DBMS)
  - Database system for passive, off-line measurements and analysis
- Case study 2: Gigascope
  - Passive, on-line packet monitoring platform (a kind of DSMS…)


45 20 February 2006

Case study 2: Gigascope
How to monitor network traffic at 5 Gbit/s

Chuck Cranor, Theodore Johnson and Oliver Spatscheck. Gigascope: a stream database for network applications. SIGMOD 2003.

C. Cranor, T. Johnson, O. Spatscheck, and V. Shkapenyuk. The Gigascope Stream Database. IEEE Data Engineering Bulletin 26(1):27-32, 2003.

46 20 February 2006

Case study 2: Gigascope
Outline

Motivation

Features

GSQL

Performance

47 20 February 2006

Case study 2: Gigascope
Motivation

Objectives: to manage the network's
- Security
- Reliability
- Performance

Requirements for the network monitoring tool
- Flexibility and performance in order to achieve reactiveness
- Achieve high speeds
  - GETH (Gigabit Ethernet)
  - OC48 (2.48 Gbit/s full duplex)
  - OC192 (9.92 Gbit/s full duplex)
- Be inexpensive
- Allow retrofitting
- Be reliable
- Be remotely manageable

All 7 layers need to be monitored

48 20 February 2006

Case study 2: Gigascope
Motivation

Tcpdump and off-line data analysis (= InTraBase) is out of the question
- Does not scale to even moderate line rates for on-line monitoring
- Expensive due to disk cost and space

Standard tools such as SNMP, RMON, Netflow
- Do not cover layer 7 in a scalable fashion
- Do not allow flexible changes to the aggregation type

Proprietary vendor products (sniffer.com, NARUS, handcrafted libpcap tool, …)
- Do not allow flexible changes to the aggregation type


49 20 February 2006

Case study 2: Gigascope
Features

Fast, flexible packet monitoring platform
- On-line
- Passive
- A custom DSMS
- Flexible SQL-based interface

Claimed to be able to monitor at speeds up to OC48 (2 * 2.48 Gbit/sec) with a single probe

Supports (in 2003):
- Netflow-compatible records
- Detection of hidden P2P traffic
- End-to-end TCP performance monitoring
- Detailed custom performance statistics

50 20 February 2006

Case study 2: Gigascope
Features

Data gathering steps:
1. Raw data feed from:
   - Optical splitter
   - Electrical splitter
   - Monitoring/SPAN port
2. Extract aggregated records from the data feed in real time
3. Store data on local RAID
4. Data is copied in real time or during off-peak hours to the data warehouse for further analysis
   - SSH-secured back channel
   - Tools allow rate limiting to prevent the Gigascope from flooding the network

5. Data is analyzed and/or joined with other data feeds using Daytona or other tools in the data warehouse

6. Result is displayed, or used for alarming, or to generate customized reports

51 20 February 2006

Case study 2: Gigascope
GSQL

GSQL is the query language for Gigascope
- Similar to SQL
- Support for stream database queries
  - Stream fields can have ordering properties
  - Deduce when aggregates are closed and thus can be flushed to the output stream

Each query:
- Receives one or more tuple streams as input
- Generates one tuple stream as output

Currently (2003) limited to selection, aggregation, views, joins and merges

The query compiler maps the logical query topology to an optimized FTA processing topology

52 20 February 2006

Case study 2: Gigascope
Conclusions

Gigascope is a packet monitoring platform

Main advantages
- Fast (at least claimed to be)
- Flexible through an SQL-like query language
- Data reduction

Disadvantages
- Proprietary
- Usually needs another solution for data storage (Daytona)


53 20 February 2006

Part 2: The analysis

Traffic characterization and modeling
- TCP
- P2P

Network characterization and modeling
- Topology discovery
- Network coordinates
- Measuring link capacities and available bandwidth
- Traffic matrices

Anomaly detection
- Network troubleshooting
- Intrusion detection

Case study on TCP Root Cause Analysis

54 20 February 2006

Traffic characterization and modeling
Characterize and model traffic on various layers
- Application layer: P2P, on-line games, Skype, WWW…
- Transport layer: TCP, (UDP)
- IP layer
- MAC layer: wireless environments
- Physical layer: especially wireless links

Objectives
- First understand, then compute and predict network and application performance, and user behavior
  - Provision and troubleshoot networks
  - Guide application development (e.g. caching)
  - Provide accurate workload models for simulations
- Identify different applications
  - Needed for the first objectives too…
  - Enforce rules and regulations

55 20 February 2006

Traffic characterization and modeling: TCP

TCP carries over 90% of the bytes in the Internet

Modeling TCP
- Express the performance of a TCP transfer as a function of some parameters that have physical meaning
- Parameters: packet loss rate (p), round-trip time (RTT) of the TCP connection, the receiver advertised window, the slow start threshold, initial window size, window increase rate, etc.
- Performance metrics: throughput, latency, fairness index, etc.

E.g. the Square Root Formula (Mathis et al. 1997), where MSS is the segment size, RTT the round-trip time, and p the packet loss rate (a worked example follows below):

    Tput = (MSS / RTT) * sqrt(3 / (2p))

More advanced modeling
- Advanced models for loss processes
- Queuing theory
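A minimal worked example of the Square Root Formula with illustrative numbers (1460-byte MSS, 100 ms RTT, 1% loss):

from math import sqrt

def mathis_throughput(mss_bytes, rtt_s, loss_rate):
    """Square Root Formula: Tput = (MSS / RTT) * sqrt(3 / (2p)), in bytes/s."""
    return mss_bytes / rtt_s * sqrt(3.0 / (2.0 * loss_rate))

# About 179,000 bytes/s, i.e. roughly 1.4 Mbit/s
print(mathis_throughput(1460, 0.1, 0.01))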

56 20 February 2006

Traffic characterization and modeling


57 20 February 2006

Traffic characterization and modeling: TCP

Empirical approach
- Infer techniques from observations on real Internet traffic
- Less maths, more intuitive and simple models
- Apply a tool or an algorithm on real packet traces and analyze the results
- Examples
  - Studying the burstiness of TCP traffic
    - Why is the Internet traffic bursty in short (sub-RTT) time scales? H. Jiang, C. Dovrolis. SIGMETRICS 2005.
  - TCP Root Cause Analysis (case study in the end)
    - How to identify the cause that prevents a TCP connection from achieving a higher throughput?

More also in the lecture "Operational analysis of TCP in the wild" by Dr. Guillaume Urvoy-Keller

58 20 February 2006

Traffic characterization and modeling: P2P

Identifying P2P traffic
- Need to identify it before it can be characterized…
- Regulations and rules (RIAA)
- Not trivial since it can hide behind other TCP ports (80)
  - Circumvent filtering firewalls and legal issues

Identify by well-known TCP ports
- ☺ Fast and simple
- May capture only a fraction of the total P2P traffic

Search application-specific keywords from packet payloads
- ☺ Generally very accurate
- A set of legal, privacy, technical, logistic, and financial obstacles
- Need to reverse engineer poorly documented P2P protocols
- Payload encryption increasingly supported in P2P protocols

Transport layer identification
- Transport layer identification of P2P traffic. T. Karagiannis, A. Broido, M. Faloutsos, kc claffy. IMC 2004.
- Observe connection patterns of source and destination IPs
- ☺ Identify > 95% of P2P flows and bytes, 8-12% false positives
- Limited by knowledge of the existing connection patterns
- (a simplified sketch contrasting port-based and pattern-based identification follows below)
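A toy illustration, not the algorithm of Karagiannis et al., contrasting port-based identification with one connection-pattern heuristic (flag an IP pair that uses both TCP and UDP); the port numbers are assumed defaults:

from collections import defaultdict

KNOWN_P2P_PORTS = {4662, 6881, 6346}     # eDonkey, BitTorrent, Gnutella (assumed)

def port_based(flows):
    """flows: iterable of (src_ip, dst_ip, src_port, dst_port, proto)."""
    return [f for f in flows if f[2] in KNOWN_P2P_PORTS or f[3] in KNOWN_P2P_PORTS]

def pattern_based(flows):
    protos = defaultdict(set)            # unordered (ip, ip) pair -> transport protocols seen
    for src, dst, _, _, proto in flows:
        protos[tuple(sorted((src, dst)))].add(proto)
    suspects = {pair for pair, seen in protos.items() if {"TCP", "UDP"} <= seen}
    return [f for f in flows if tuple(sorted((f[0], f[1]))) in suspects]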

59 20 February 2006

Traffic characterization and modeling: P2P

Characterizing and modeling P2P application traffic
- Improve the performance and scalability of P2P applications
- Evaluate their impact on the network

Build mathematical models for the behavior and verify by applying to real traffic
- A mathematical model enables accurate analysis
- E.g. Modeling and performance analysis of BitTorrent-like peer-to-peer networks. D. Qiu, R. Srikant. SIGCOMM 2004.

Empirical analysis of P2P traffic
- "Tune the knobs" and draw conclusions from what you observe
- E.g. Dissecting BitTorrent: Five Months in a Torrent's Lifetime. M. Izal, G. Urvoy-Keller, E.W. Biersack, P.A. Felber, A. Al Hamra, L. Garces-Erice. PAM 2004.

See also the lecture “P2P systems for file replication” by Prof. Ernst Biersack

60 20 February 2006

Part 2: The analysis

Traffic characterization and modeling
- TCP
- P2P

Network characterization and modeling
- Topology discovery
- Network coordinates
- Measuring link capacities and available bandwidth
- Traffic matrices

Anomaly detection
- Network troubleshooting
- Intrusion detection

Case study on TCP Root Cause Analysis


61 20 February 2006

Network characterization and modeling: Topology discovery

The art of finding out how the network is laid out
- Not trivial knowledge in large-scale networks

Why do this?
- It is fun ☺
- Realistic simulation and modeling of the Internet

Some examples of methods
- SNMP
  - Works only locally
- Traceroute@home
  - Do traceroute at large scale
  - Coordinate the efforts in order to be highly effective while avoiding unnecessary load on the network
  - See http://tracerouteathome.net/
- Skitter
  - ICMP ECHO-REQUEST probes (= traceroute) from 30-40 monitors to measure delay and IP path
  - Gather actively used IP addresses from a number of sources
    - backbone packet traces, NeTraMet traces, NetGeo, CAIDA website hits…

62 20 February 2006

Network characterization and modeling: Network coordinates

Express the communication latency, the "distance", in virtual coordinates
- Enables predicting round-trip times to other hosts without having to contact them first
- Useful in selecting a mirror server or peers in P2P systems

General approach (a sketch of steps 4-5 follows below):
1. Select a subset of hosts as reference points (RP)
   - Create the origin of the coordinate system
2. Measure round-trip time (distance) between RPs
3. Calculate coordinates for each RP
4. Measure RTT between the host and the RPs
5. Calculate coordinates for the host

Different proposed techniques for steps 1, 3 and 5
- Reference points = landmarks, lighthouses, beacons
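A minimal sketch, not any specific published system, of steps 4-5: given coordinates already computed for the reference points and RTTs measured from a new host to them, place the host by gradient descent on the squared error between map distances and measured RTTs:

import math

def estimate_coords(landmarks, rtts, iters=2000, lr=0.01):
    """landmarks: list of (x, y) reference-point coordinates;
    rtts: RTTs measured to them, in the same units as map distances."""
    x, y = 0.0, 0.0                        # start the new host at the origin
    for _ in range(iters):
        gx = gy = 0.0
        for (lx, ly), rtt in zip(landmarks, rtts):
            dist = math.hypot(x - lx, y - ly) or 1e-9
            err = dist - rtt               # positive if placed too far away
            gx += err * (x - lx) / dist    # gradient of 0.5 * err**2 w.r.t. x
            gy += err * (y - ly) / dist
        x -= lr * gx
        y -= lr * gy
    return x, y

# Hypothetical landmark positions and measured RTTs in milliseconds
print(estimate_coords([(0, 0), (100, 0), (0, 100)], [70.0, 50.0, 90.0]))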

63 20 February 2006

Network characterization and modeling: Link capacities and available bandwidth

What?
- Infer the maximal throughput that a TCP transfer can achieve…
  - at a given time instant (available bandwidth)
  - when there is no other traffic (capacity)
- …on a specific link or on the entire path

Why?
- Network-aware applications
- Route selection in overlay networks
- QoS verification
- Traffic engineering

How?
- Generally use active probing: introduce packets with a specific traffic pattern and observe the pattern at the other end
- See the lecture "Measuring link capacities in the Internet" by Dr. Guillaume Urvoy-Keller

64 20 February 2006

Network characterization and modeling: Traffic matrices

A traffic matrix is a network-wide view of the traffic
- Represents, for every ingress point i into the network and every egress point j out of the network, the volume of traffic T(i,j) from i to j over a given time interval
- Input for any capacity offer engineering plan
  - Routing matrix, link capacity, traffic engineering, etc…

Problem: cannot measure directly
- Flow-level measurements at ingress points can generate terabytes of data per day

Solution: Estimate


65 20 February 2006

Network characterization and modeling: Traffic matrices

[Figure: example topology with edge nodes A, B, C, D connected through a core node E, annotated with per-link traffic volumes (5, 4, 3, 6) and the corresponding origin-destination (src x dst) traffic matrices]

Link loads AE, BE, EC, ED obtained using SNMP
- Link ED = AD + BD, Link AE = AD + AC, …

We have a linear system Y = AX
- X are the T(i,j) values to be estimated
- A is the routing matrix (determined by the IGP link weights)
- Y can be obtained using SNMP

Fundamental problem: # links << # OD pairs
- under-constrained system
- infinitely many solutions

A variety of different proposed solutions (a least-squares sketch follows below)
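A minimal sketch, one of many possible estimators (plain minimum-norm least squares with non-negativity clipping; real proposals add priors such as a gravity model), of estimating X from Y = AX; the toy routing matrix and link loads are made up:

import numpy as np

# Hypothetical toy instance: 4 links, 6 OD pairs (under-constrained on purpose)
A = np.array([
    [1, 1, 0, 0, 0, 0],   # link 1 carries OD pairs 1 and 2
    [0, 0, 1, 1, 0, 0],   # link 2 carries OD pairs 3 and 4
    [1, 0, 1, 0, 1, 0],   # link 3 carries OD pairs 1, 3 and 5
    [0, 1, 0, 1, 0, 1],   # link 4 carries OD pairs 2, 4 and 6
], dtype=float)
y = np.array([5.0, 4.0, 3.0, 6.0])        # link loads from SNMP

x, *_ = np.linalg.lstsq(A, y, rcond=None) # minimum-norm least-squares solution
x = np.clip(x, 0, None)                   # traffic volumes cannot be negative
print(x)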

66 20 February 2006

Part 2: The analysis

Traffic characterization and modeling
- TCP
- P2P

Network characterization and modeling
- Topology discovery
- Network coordinates
- Measuring link capacities and available bandwidth
- Traffic matrices

Anomaly detection
- Network troubleshooting
- Intrusion detection

Case study on TCP Root Cause Analysis

67 20 February 2006

Anomaly detection

Study abnormal traffic
- Non-productive traffic, a.k.a. Internet "background radiation"
- Traffic that is malicious (scans for vulnerabilities, worms) or mostly harmless (misconfigured devices)

Network troubleshooting
- Identify and locate misconfigured or compromised devices

Intrusion detection
- Identify malicious activity before it hits you
- Honeypot project at Eurecom (http://www.leurrecom.org)
  - Characterize attack processes

68 20 February 2006

Part 2: The analysis

Traffic characterization and modeling
- TCP
- P2P

Network characterization and modeling
- Topology discovery
- Network coordinates
- Measuring link capacities and available bandwidth
- Traffic matrices

Anomaly detection
- Network troubleshooting
- Intrusion detection

Case study on TCP Root Cause Analysis


69 20 February 2006

Case study on TCP Root Cause Analysis

Yin Zhang, Lee Breslau, Vern Paxson and Scott Shenker. On the Characteristics and Origins of Internet Flow Rates. SIGCOMM 2002.

M. Siekkinen, G. Urvoy-Keller, E. Biersack, and T. En-Najjary. Root Cause Analysis for Long-Lived TCP Connections. CoNEXT 2005.

70 20 February 2006

Case study on TCP Root Cause Analysis
Outline

Motivation and Objectives

Flow rate characteristics

Taxonomy of TCP rate limitation causes

One approach to infer limitation causes

Experimental results

Conclusions

71 20 February 2006

Motivation

Facts about the Internet over the last 5 years…

Traffic volumes and number of users have skyrocketed

Access link capacities have multiplied

Dominance shifted from Web+FTP into Peer-to-peer applications

TCP has kept its position as the dominating transport protocol

72 20 February 2006

Motivation

Questions are raised:

ISPs want to know what is going on in their network

What are the limitations that Internet applications are facing?

Why does a client with 4 Mbit/s ADSL access obtain only a total throughput of 2 Mbit/s when downloading movies with eDonkey?

Need techniques for traffic measurement and analysis


73 20 February 2006

Objectives of TCP RCA

Learn more about rates and rate limitations of data transfers in the Internet

TCP typically over 90% of all traffic

Study long-lived connections
- Reveal the root cause of limitation
- Do quantitative analysis

Passive traffic analysis techniques
- Observe traffic at a single measurement point (e.g. at the edge of an ISP's network)
- Capture and store TCP/IP headers, analyze later off-line
- ☺ See earlier slides about passive measurements
- Need to estimate many parameters

74 20 February 2006

Flow rate characteristics
Datasets and Methodology

Datasets
- Packet traces at ISP backbones and campus access links
  - 8 datasets; each lasts 0.5 - 24 hours; over 110 million packets
- Summary flow statistics collected at 19 backbone routers
  - 76 datasets; each lasts 24 hours; over 20 billion packets

Flow definition
- Flow ID: <SrcIP, DstIP, SrcPort, DstPort, Protocol>
- Timeout: 60 seconds
- Rate = Size / Duration
- Exclude flows with duration < 100 msec

Look at:
- Rate distribution
- Correlations among rate, size, and duration

75 20 February 2006

DataSets

Observe:
- The diversity of the dataset (in terms of aggregation level)
- The fact that sampling is required as speed increases
- Not all traces are bidirectional

76 20 February 2006

Flow Rate Characteristics
Rate distribution

Most flows are slow, but most bytes are in fast flows

Distribution is skewed
- Not as skewed as the size distribution
- 10% of flows have rate > 100 kbit/s

10% of flows send > 10 kbytes


77 20 February 2006

Flow Rate Characteristics
Correlations

Rate and size are strongly correlated
- Not due to TCP slow-start
  - Removed initial 1 second of each connection; correlations increase
- What users download is a function of their bandwidth

R: Rate, D: Duration, S: Size

78 20 February 2006

Limitation Causes for TCP Throughput

Application

TCP end point
- Receiver window limitation

Network
- Bottleneck link

TCP protocol
- SS and CA (slow start and congestion avoidance)

[Figure: sender and receiver protocol stacks (Application, TCP with buffers) connected through the Network]

79 20 February 2006

An example of an xplot time-sequence diagram

[Figure: xplot time-sequence diagram. Annotated elements: receiver advertised window limit, received acknowledgments, retransmitted data, sent data packets (a pushed data packet is marked with a diamond), outstanding bytes, size of the receiver advertised window]

80 20 February 2006

Application that sends small amounts of data at a constant rate
- Streaming applications
  - Skype: Internet telephony application
  - Web radios
- Throttling applications
  - P2P: eDonkey (rate control by user)


81 20 February 2006

Application that sends larger bursts separated by idle periods
- BitTorrent, HTTP/1.1 (persistent)

[Figure: time-sequence diagram showing transfer periods separated by idle periods with only keep-alive messages]

82 20 February 2006

TCP End Points: Receiver window limitation

Maximum amount of outstanding bytes = min(cwnd, rwnd)

Intentionally
- the sender's upload capacity is too high compared to the receiving TCP's download capacity

Unintentionally
- the default maximum receiver advertised window is set too low by the operating system
- window scaling is not enabled

83 20 February 2006

Limitation Causes: Network

Limitation is due to a bottleneck link

packets get dropped due to filled buffers

shared bottleneck: obtain only a fraction of its capacity

non-shared bottleneck: obtain all of its capacity

84 20 February 2006

Limitation Causes: TCP protocol

Limiting factor is TCP’s congestion avoidance or slow start algorithm

Transfer ends before the rate grows enough to hit limits set by network or receiving TCP


85 20 February 2006

One Approach to Root Cause Analysis

Input
- a bidirectional packet trace

Aggregation level
- Connection-level analysis

Divide & Conquer
1. Identify bulk transfer periods
   - The other traffic is limited by the application
2. Analyze the bulk transfer periods for
   - Receiver window limitation
   - Network limitation

Methods are based on generated time series

86 20 February 2006

Identifying Bulk Transfer Periods (BTP)

Use a time series of the fraction of pushed data packets
- fraction of pushed data packets (P) out of all data packets seen in each non-overlapping time window (a sketch follows below)

[Figure: a packet timeline divided into time windows, with P = 0.25, P = 1, and P = 0.75 computed from packets with and without the push flag]
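A minimal sketch of the pushed-packet time series: for each non-overlapping time window, compute P = (data packets with the PSH flag) / (all data packets); the input format is an illustrative assumption:

def push_fraction_series(packets, window=1.0):
    """packets: list of (timestamp, psh_flag) for the data packets of one
    direction of a connection, sorted by timestamp; window length in seconds."""
    if not packets:
        return []
    t0 = packets[0][0]
    series, idx, pushed, total = [], 0, 0, 0
    for ts, psh in packets:
        while ts >= t0 + (idx + 1) * window:          # close finished windows
            series.append(pushed / total if total else 0.0)
            idx, pushed, total = idx + 1, 0, 0
        total += 1
        pushed += 1 if psh else 0
    series.append(pushed / total if total else 0.0)   # close the last window
    return series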

87 20 February 2006

Identifying BTPs

n1 consecutive windows with P < p_th start a bulk transfer period

n2 consecutive windows with P > p_th start an application limited period

Drawback: need to select p_th, n1, n2, and the time window for each application
- We chose empirically p_th = 0.7, n1 = 5, n2 = 10, and a time window of 1 s for BitTorrent
- We already have a more generic solution

(a sketch of this threshold rule follows below)
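A minimal sketch of the threshold rule above: walk the P time series and switch between application-limited periods (ALP) and bulk transfer periods (BTP) after n1 consecutive windows with P < p_th or n2 consecutive windows with P > p_th; starting in the application-limited state is an assumption:

def classify_periods(p_series, p_th=0.7, n1=5, n2=10):
    labels, state = [], "ALP"          # assumed initial state: application limited
    below = above = 0
    for p in p_series:
        below = below + 1 if p < p_th else 0
        above = above + 1 if p > p_th else 0
        if state == "ALP" and below >= n1:
            state = "BTP"
        elif state == "BTP" and above >= n2:
            state = "ALP"
        labels.append(state)
    return labels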

88 20 February 2006

BTP Analysis

Apply the time-series based limitation test algorithms

Tests give limitation scores
- Degree of limitation
- TCP end points: receiver window limitation score
- Network: retransmission score and dispersion score


89 20 February 2006

Receiver window limitation test

Uses two time series:
- outstanding bytes (O)
- receiver advertised window (R)

Compute R - O for each pair of values
- indicates how close the sender is to the limit set by the receiver advertised window
- output 1 if R ≈ O, and 0 otherwise

Limitation score is the average value from the R - O comparison
- indicates the fraction of time being limited by the receiver advertised window (a sketch follows below)
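A minimal sketch of the receiver window limitation score; the closeness test used here (within one MSS of the advertised window) is an assumption, the actual algorithm defines "R ≈ O" in its own way:

def rwnd_limitation_score(outstanding, rwnd, mss=1460):
    """outstanding, rwnd: paired time series of outstanding bytes (O) and the
    receiver advertised window (R); returns the fraction of samples with R ≈ O."""
    if not outstanding:
        return 0.0
    hits = sum(1 for o, r in zip(outstanding, rwnd) if r - o <= mss)
    return hits / len(outstanding)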

90 20 February 2006

Network limitation test

Retransmission score
- fraction of bytes retransmitted

Dispersion score
- assesses the impact of the bottleneck on the throughput

    DS = 1 - tput / r, where r is the capacity of the path

- if DS is close to zero (tput ≈ r): non-shared bottleneck link
- else: shared bottleneck link

(a sketch of both scores follows below)
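A minimal sketch of the two network limitation scores defined above:

def retransmission_score(retransmitted_bytes, total_bytes):
    """Fraction of bytes retransmitted during the bulk transfer period."""
    return retransmitted_bytes / total_bytes if total_bytes else 0.0

def dispersion_score(throughput, path_capacity):
    """DS = 1 - tput/r: close to 0 suggests a non-shared bottleneck,
    larger values suggest a shared bottleneck."""
    return 1.0 - throughput / path_capacity

# Hypothetical example: a BTP achieving 2 Mbit/s over a 4 Mbit/s path
print(dispersion_score(2e6, 4e6))   # 0.5 -> shared bottleneck likely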

91 20 February 2006

Experimental Results

Analysed a large BitTorrent packet trace

Seeding a single very big torrent

102 million packets

60,000 connections

92 20 February 2006

Bulk Transfer Periods
- 3295 BTPs from 696 connections
- BTPs carry most of the bytes but ALPs (application limited periods) are longer


93 20 February 2006

Limitation Scores

Receiver Window Limitation
- 65% of BTPs are not limited by rwnd at all
- BTPs with small avg receiver windows have higher scores

Network Limitation
- High retransmission scores
  - 20% of BTPs retransmitted over 10% of bytes
- High dispersion scores
  - 95% of BTPs achieve less than half of the capacity of the path

94 20 February 2006

Network Limitation
- Receiver window limitation score < 0.5

High retransmission score induces high dispersion score

95 20 February 2006

[Figure: example time-sequence diagram of a complete transfer period: no loss, RTT is 20 times the initial RTT, shared bottleneck, transfer ends before CWND reaches the RWND value]

Q: What is the limiting factor?
A: The TCP protocol, through the Congestion Avoidance algorithm

96 20 February 2006

Conclusions

Characteristics of Internet flow rates
- Fast flows carry most of the bytes
  - It is important to understand their behavior
- Strong correlation between flow rate and size
  - What users download is a function of their bandwidth

TCP transmission rate limitation analysis
- Causes can be on different layers and in different locations
  - Application, TCP, or IP
  - End hosts, network
- One approach is to use time series-based techniques
  - Allows for quantitative analysis