Upload
nirmala-last
View
192
Download
0
Tags:
Embed Size (px)
Citation preview
Automatically Inferring Automatically Inferring Patterns of Resource Patterns of Resource
Consumption in Network Consumption in Network TrafficTraffic
Cristian Estan, Stefan Savage, George Varghese
University of California, San Diego
April 13, 2023 Traffic Clusters - 2003 2
Who is using my link?Who is using my link?
April 13, 2023 Traffic Clusters - 2003 3
Do something smarter!
Too much data for a human
Looking at the trafficLooking at the traffic
April 13, 2023 Traffic Clusters - 2003 4
Looking at traffic aggregatesLooking at traffic aggregates
Aggregating on individual packet header fields gives useful results but
Traffic reports are not always at the right granularity (e.g. individual IP address, subnet, etc.)
Cannot show aggregates defined over multiple fields (e.g. which network uses which application)
The traffic analysis tool should automatically find aggregates over the right fields at the right granularity
Rank Destination IP Traffic
1 jeff.dorm.bigU.edu 11.9%
2 tracy.dorm.bigU.edu 3.12%
3 risc.cs.bigU.edu 2.83%
Most traffic goes to the dorms …
Rank Destination network Traffic
1 library.bigU.edu 27.5%
2 cs.bigU.edu 18.1%
3 dorm.bigU.edu 17.8%
What apps are used?
Rank Source port Traffic
1 Web 42.1%
2 Kazaa 6.7%
3 Ssh 6.3%
Dest. IP
Dest. net
Source port
Where does the traffic come
from?……
Src. IP Src. port
Src. net
Dest. portDest. IP
Dest. net
Protocol
Which network uses
web and which one kazaa?
April 13, 2023 Traffic Clusters - 2003 5
Ideal traffic reportIdeal traffic report
Traffic aggregate Traffic
Web traffic 42.1%
Web traffic to library.bigU.edu 26.7%
Web traffic from www.schwarzenegger.com 13.4%
ICMP traffic from sloppynet.badU.edu to jeff.dorm.bigU.edu 11.9%
Web is the dominant applicationThe library is a
heavy user of webThat’s a big flash
crowd!
This is a Denial of Service attack !!
This paper is about giving the network administrator insightful traffic reports
April 13, 2023 Traffic Clusters - 2003 6
Contributions of this paperContributions of this paper
Approach
Definitions
Algorithms
System
Experience
April 13, 2023 Traffic Clusters - 2003 7
ApproachApproach
Characterize traffic mix by describing all important traffic aggregates
Multidimensional aggregates (e.g. flash crowd described by protocol, port number and IP address)
Aggregates at the the right level of granularity (e.g. computer, subnet, ISP)
Traffic analysis is automated – finds insightful data without human guidance
April 13, 2023 Traffic Clusters - 2003 8
Definition: traffic clustersDefinition: traffic clusters
Traffic clusters are the multidimensional traffic aggregates identified by our reports
A cluster is defined by a range for each field The ranges are from natural hierarchies (e.g. IP
prefix hierarchy) – meaningful aggregates Example
Traffic aggregate: incoming web traffic for CS Dept. Traffic cluster: ( SrcIP=*, DestIP in 132.239.64.0/21,
Proto=TCP, SrcPort=80, DestPort in [1024,65535] )
April 13, 2023 Traffic Clusters - 2003 9
Traffic reports give the volume of chosen traffic clusters To keep report size manageable describe only clusters
above threshold (e.g. H=total of traffic/20) To avoid redundant data compress by omitting clusters
whose traffic can be inferred (up to error H) from non-overlapping more specific clusters in the report
To highlight non-obvious aggregates prioritize by using unexpectedness label
Example» 50% of all traffic is web» Prefix B receives 20% of all traffic» The web traffic received by prefix B is 15% instead of
50%*20%=10%, unexpectedness label is 15%/10%=150%
Definition: traffic reportDefinition: traffic report
April 13, 2023 Traffic Clusters - 2003 10
Contributions of this paperContributions of this paper
Approach
Definitions
Algorithms
System
Experience
April 13, 2023 Traffic Clusters - 2003 11
Algorithms and theoryAlgorithms and theory
Algorithms and theoretical bounds in the paper Unidimensional reports are easy to compute
Multidimensional reports are exponentially harder as we add more fields
Next few slides Example of unidimensional compression
Example for the structure of the multidimensional cluster space
April 13, 2023 Traffic Clusters - 2003 12
Unidimensional report exampleUnidimensional report example
10.0.0.2 10.0.0.3 10.0.0.4 10.0.0.5 10.0.0.8 10.0.0.9 10.0.0.10 10.0.0.14
15 35 30 40 160 110 35 75
10.0.0.2/31 10.0.0.4/3150 10.0.0.8/31 10.0.0.10/31
70 270 35 75
10.0.0.0/30 10.0.0.4/30 10.0.0.8/30 7530550 70
10.0.0.0/29 10.0.0.8/29120 380
10.0.0.0/28 500500
120 380
305
270
160 110
HierarchyThreshold=100
10.0.0.14/31
10.0.0.12/30
April 13, 2023 Traffic Clusters - 2003 13
270
120
500
305
380
160 110
Unidimensional report exampleUnidimensional report example
10.0.0.8 10.0.0.9
10.0.0.0/29 10.0.0.8/29
10.0.0.8/31
10.0.0.8/30
10.0.0.0/28
120 380
160 110
Compression
305-270<100
380-270≥100
Source IP Traffic
10.0.0.0/29 120
10.0.0.8/29 380
10.0.0.8 160
10.0.0.9 110
April 13, 2023 Traffic Clusters - 2003 14
Multidimensional structure ex.Multidimensional structure ex.
All traffic All traffic
US EU
CA NY GB DE
Web Mail
Source net Application
US Web
Nodes (clusters) have multiple parents
US
Web
Nodes (clusters) overlap
CA
April 13, 2023 Traffic Clusters - 2003 15
Contributions of this paperContributions of this paper
Approach
Definitions
Algorithms
System
Experience
April 13, 2023 Traffic Clusters - 2003 16
System: AutoFocusSystem: AutoFocus
Trafficparser
Web basedGUI
Cluster miner
Grapher
Packet header trace
categories
names
April 13, 2023 Traffic Clusters - 2003 17
April 13, 2023 Traffic Clusters - 2003 18
April 13, 2023 Traffic Clusters - 2003 19
April 13, 2023 Traffic Clusters - 2003 20
Contributions of this paperContributions of this paper
Approach
Definitions
Algorithms
System
Experience
April 13, 2023 Traffic Clusters - 2003 21
Backups from CAIDA to tape server
Semi-regular time pattern
FTP from SLAC Stanford
Scripps web traffic
Web & Squid servers
Large ssh traffic
Steady ICMP probing from CAIDA
Structure of regular traffic mixStructure of regular traffic mix
SD-NAP
SD-NAP
April 13, 2023 Traffic Clusters - 2003 22
Analysis of unusual eventsAnalysis of unusual events UCSD to UCLA route change Sapphire/SQL Slammer worm
Site 2
April 13, 2023 Traffic Clusters - 2003 23
ConclusionsConclusions
1010111101010000101011111101011001010101101011010000101010100101010111101010101000101111010000010111111101011001010111010111100100101010100011011111100010101110110101100101010110101111000010101011110111010111010101010111111010110010101011010101111101010000110100001011010100101011001000000101011001010101011111000010001000010101011110101000010111001010101101011110000010101011111101011000101111010000010111110101011010111100100101010110010101010001010100101010110101010010111001010000010100001110110101010110111111000101011101011101011001010101101011110000110111101110101110101010101111110101100101010110101111011101010000110101010010101101010111010101001010000101011010101001010100000101010101010101101011101010100000010101010101101010101011110101110101011010100011000101010010111010101001101010100001000110101111010100010110
April 13, 2023 Traffic Clusters - 2003 24
ConclusionsConclusions Multidimensional traffic clusters using natural hierarchies
describe traffic aggregates
Traffic reports using thresholding identify automatically conspicuous resource consumption at the right granularity
Compression produces compact traffic reports and unexpectedness labels highlight non-obvious aggregates
Our prototype system, AutoFocus, provides insights into the structure of regular traffic and unexpected events
April 13, 2023 Traffic Clusters - 2003 25
Thank you!Thank you!
Alpha version of AutoFocus downloadable from
http://ial.ucsd.edu/AutoFocus/
Any questions?
Acknowledgements: NIST, NSF, Vern Paxson, David Moore, Liliana Estan, Jennifer Rexford, Alex Snoeren, Geoff Voelker
April 13, 2023 Traffic Clusters - 2003 26
Bounds and running timesBounds and running times
Report size Running time Memory usage
unc. 1dim. rep. ≤1+(d-1)T/H O(n+m(d-1)) O(m(d-1))
1dim. report ≤ T/H linear linear
1dim. Δ report ≤T1/H+T2/H linear
unc. +dim. rep. ≤ T/H ∏di ≈result*n O(m+result)
+dim. rep. ≤ T/H ∏di/max(di)
+dim. Δ report ≈eresult
April 13, 2023 Traffic Clusters - 2003 27
Open questionsOpen questions
Are there tighter bounds for the size of the reports?
Are there algorithms that produce smaller results?
Are there algorithms that compute traffic reports more efficiently? In streaming fashion?
April 13, 2023 Traffic Clusters - 2003 28
Delta reportsDelta reports Why repeat the same traffic report if the traffic doesn’t
change from one day to the other?
Delta reports describe the clusters that increased or decreased by more than the threshold from one interval to the other
On related traffic mixes delta reports much smaller than traffic reports
Multidimensional compression very hard for delta reports
We have only exponential algorithm for the cluster delta
April 13, 2023 Traffic Clusters - 2003 29
Greedy compression algorithmGreedy compression algorithm
April 13, 2023 Traffic Clusters - 2003 30
Multidimensional report exampleMultidimensional report example
Thresholding Compression
April 13, 2023 Traffic Clusters - 2003 31
System detailsSystem details
Part Language LoC Status
Backend C++ 5400 stable
GUI HTML,
Javascript
1000 functional
Glue perl 350 evolving