Entropy in IP Darkspace Data
Tanja Zseby
Cooperative Association for Internet Data Analysis (CAIDA) and
Fraunhofer Institute for Open Communication Systems (FOKUS)
CERT FloCon, January 2012
IP Darkspace
• Globally routable IP address space
  – announced by routing
  – but no hosts attached
  → all traffic destined to darkspace is unsolicited
• UCSD telescope
  – /8 darkspace
  – Used for different analyses (security, outages, etc.)
• Other IP darkspace monitors:
  – Internet Motion Sensor, Team Cymru Darknet Project, iSink, …
Scanning
Backscatter
Analysis of Darkspace Data
• Detection of incidents
  – Scanning activities
  – Backscatter
  – Misconfigurations
  – Network outages
→ Analysis (patterns, scope, ...)
→ Early warning
→ “Cleaning up” address space
DSA related work
• General Analysis Techniques
  – Brownlee. One-way Traffic Monitoring with iatmon. To appear at PAM 2012
  – Ahmed et al. Characterising anomalous events using change-point correlation on unsolicited network traffic. In Identity and Privacy in the Internet Age, 2009
• Security and Misconfigurations
  – Wustrow et al. Internet background radiation revisited. IMC 2010
  – Aben. Conficker. ISOI 2009
  – Moore et al. Code-Red: a case study on the spread and victims of an Internet worm. IMW 2002
• Network Outages
  – Dainotti et al. Analysis of Country-wide Internet Outages Caused by Censorship. IMC 2011
• Darkspace Construction
  – Janies, Collins. Darkspace Construction and Maintenance. FloCon 2011
• IPv6 Darkspace
  – Huston. IPv6 Background Radiation. NANOG50, 2010
  – Ford et al. Initial Results from an IPv6 Darknet. 2006
…and others.
Metrics and Techniques
[Figure: two analysis pipelines.
 (1) Packet classification with classification rules (feature combinations) → packet count per class → time series of packet counts for selected feature combinations (classes C1, C2, C3 over time t).
 (2) Packet classification with classification rules (selected features) → packet count per class → distributions for selected features (counts over classes for intervals T1, T2, T3).]
Example Metrics
• Time series of packet counts
  – Overall packet count
  – Packets to a specific port
  – Packets with specific TCP flags
• Source groups based on source behavior
  – Packet features (e.g. SYNs to specific port)
  – Inter-arrival times (IATs)
• Distributions
  – IP addresses, port numbers
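A packet-count time series of the kind listed above could be sketched as follows. This is a minimal Python illustration and not the tooling used in the talk; the record layout (a dict with an epoch-seconds `ts` field plus feature fields such as `dPort`) and the function name `packet_count_series` are assumptions made for the example.

```python
from collections import Counter


def packet_count_series(packets, predicate=lambda pkt: True, interval=3600):
    """Count packets per time bin.

    packets   -- iterable of dicts; each needs a 'ts' key (epoch seconds).
                 The dict layout is a hypothetical record format.
    predicate -- selects the packets to count (default: all packets).
    interval  -- bin width in seconds (default: one hour).

    Returns a dict mapping bin start time to packet count, sorted by time.
    """
    counts = Counter()
    for pkt in packets:
        if predicate(pkt):
            bin_start = int(pkt["ts"]) // interval * interval
            counts[bin_start] += 1
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    sample = [
        {"ts": 5, "dPort": 445},
        {"ts": 3599, "dPort": 80},
        {"ts": 3600, "dPort": 445},
    ]
    # Overall packet count per hour
    print(packet_count_series(sample))
    # Packets to a specific port per hour
    print(packet_count_series(sample, lambda p: p["dPort"] == 445))
```

The predicate argument stands in for a classification rule: one series per rule gives the "packets to a specific port" or "packets with specific TCP flags" metrics.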
Challenges
• Large amount of data
  – Many repetitions/boring events (TCP-SYNs, …)
  – Whole distributions → huge amount of data
• Selection of suitable classification rules
  – Separate known events from new/interesting packets
  – Feature selection difficult
  – Features of interest may change
  – High analysis effort
  – Detection of different events requires various metrics
Problem Statement
• Goal: detect and classify “events of interest”
  – New vulnerabilities (increased scanning)
  – New victims of attacks (increased backscatter)
  – Misconfigurations
  – Network outages
• Ideal: comprehensive metric
  – Capture all events of interest
• Conditions
  – Keep storage requirements low
Characteristics of DS Events
• Hostscans (new vulnerability)
  – Many new sources (attackers) send to a specific destination port
• Backscatter (from DoS attacks with spoofed addresses)
  – Several sources (victims) send a lot of data to many destination addresses using a specific source port
• Misconfiguration (configuration of wrong destination IP)
  – Several sources send to a specific destination IP and specific destination port
• Outages
  – Source IPs from outage region are missing → fewer source IPs
• DDoS (to a destination IP in darkspace)
  – Many new sources (bots or spoofed) send to a specific destination IP and specific destination port
• Portscan
  – One or several hosts send to a specific destination IP and many destination ports
Expected Effects on Distributions
      | Hostscan           | Backscatter        | Misconfig | Outage                  | DDoS (rare)        | Portscan (rare)
sIP   | random (attackers) | specific (victims) | specific  | specific (some missing) | random (attackers) | specific (attackers)
dIP   | random             | random             | specific  | depends                 | specific           | specific
sPort | random*            | specific           | depends   | depends                 | random*            | random*
dPort | specific           | random*            | specific  | depends                 | specific           | random
*assuming random sPort selection by attack tools
Distinction of specific/random → entropy!
Sample Entropy
“You should call it entropy, […] no one really knows what entropy really is, so in a debate you will always have the advantage.”
John von Neumann's suggestion to Claude Shannon, according to Max Jammer, “Dictionary of the History of Ideas: Entropy”
Definition from [LaCD05]:
H(X) = -\sum_{i=1}^{N} \frac{n_i}{S} \log_2 \frac{n_i}{S}
• Histogram: X = \{n_i,\ i = 1, \dots, N\}, where feature value i occurs n_i times
• Total number of observations: S = \sum_{i=1}^{N} n_i
[LaCD05] Lakhina, Crovella, Diot: Mining Anomalies Using Traffic Feature Distributions. SIGCOMM 2005
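The sample entropy of [LaCD05], H(X) = -Σ (n_i/S) log2(n_i/S) with S the total number of observations, can be computed directly from a histogram of observed feature values. The following is a minimal Python sketch under that definition; the function name `sample_entropy` and the choice of input (a plain sequence of observed values such as destination ports) are assumptions for illustration.

```python
import math
from collections import Counter


def sample_entropy(values):
    """Sample entropy (in bits) of a sequence of observed feature values.

    Builds the histogram n_i of distinct values, then evaluates
    H(X) = -sum_i (n_i / S) * log2(n_i / S), where S = sum_i n_i.
    Returns 0.0 for an empty input (a convention chosen for this sketch).
    """
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    h = 0.0
    for n in counts.values():
        p = n / total
        h -= p * math.log2(p)
    return h


if __name__ == "__main__":
    print(sample_entropy([445, 445, 445, 445]))   # one value only
    print(sample_entropy([80, 445, 1434, 3072]))  # all values distinct
```

Only the per-value counts are needed, so a full packet trace can be reduced to one number per feature and time interval.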
Related Work
Entropy-based anomaly detection:
• Lee/Xiang 2001
  – Information Theoretic Measures for Anomaly Detection
• Feinstein/Schnackenberg 2003
  – Detection of DDoS attacks based on source IP entropy
• Lakhina et al. 2005
  – Detection of scanning, DDoS, outages based on combinations of entropy from addresses and ports
Entropy Example
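• All packets equal: entropy is minimal, H(X) = 0
• Each packet different: entropy is maximal, H(X) = log2 N
[Figure: two histograms of frequency (freq) over feature i, one concentrated on a single value (H(X) = min) and one flat (H(X) = max).]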
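The two extreme cases follow from the [LaCD05] sample entropy definition, restated here so the steps are self-contained. This is a short worked derivation; n_i denotes the count of feature value i, N the number of distinct values, and S the total number of observations.

```latex
H(X) = -\sum_{i=1}^{N} \frac{n_i}{S}\,\log_2 \frac{n_i}{S},
\qquad S = \sum_{i=1}^{N} n_i

% All packets equal: one value carries every observation (n_1 = S)
H(X) = -\frac{S}{S}\,\log_2 \frac{S}{S} = -1 \cdot \log_2 1 = 0

% Each packet different: N values, each observed once (n_i = 1, S = N)
H(X) = -\sum_{i=1}^{N} \frac{1}{N}\,\log_2 \frac{1}{N}
     = -N \cdot \frac{1}{N} \cdot (-\log_2 N) = \log_2 N
```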
Expected Entropy Patterns
      | Hostscan | Backscatter | Misconfig | Outage   | DDoS (rare) | Portscan (rare)
sIP   | random   | specific    | specific  | specific | random      | specific
dIP   | random** | random**    | specific  | depends  | specific    | specific
sPort | random*  | specific    | depends   | depends  | random*     | random*
dPort | specific | random*     | specific  | depends  | specific    | random
*assuming random sPort selection by attack tools
**dIP already has high entropy in “normal” operation
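Matching an observed event against the expected patterns can be sketched as a table lookup. The Python below transcribes the table above and treats "depends" as a wildcard. How an entropy value is mapped to "specific" or "random" (for example by comparing against a baseline interval) is not specified in the slides and is left to the caller; the names `EXPECTED` and `candidate_events` are hypothetical.

```python
# Expected patterns transcribed from the table above (footnote marks dropped).
# "specific" corresponds to low entropy, "random" to high entropy,
# and "depends" places no constraint on the feature.
EXPECTED = {
    "Hostscan":    {"sIP": "random",   "dIP": "random",   "sPort": "random",   "dPort": "specific"},
    "Backscatter": {"sIP": "specific", "dIP": "random",   "sPort": "specific", "dPort": "random"},
    "Misconfig":   {"sIP": "specific", "dIP": "specific", "sPort": "depends",  "dPort": "specific"},
    "Outage":      {"sIP": "specific", "dIP": "depends",  "sPort": "depends",  "dPort": "depends"},
    "DDoS":        {"sIP": "random",   "dIP": "specific", "sPort": "random",   "dPort": "specific"},
    "Portscan":    {"sIP": "specific", "dIP": "specific", "sPort": "random",   "dPort": "random"},
}


def candidate_events(observed):
    """Return the event types compatible with an observed pattern.

    observed -- dict mapping 'sIP', 'dIP', 'sPort', 'dPort' to either
                'specific' or 'random' (the caller decides how entropy
                values are turned into these labels).
    """
    matches = []
    for event, pattern in EXPECTED.items():
        if all(pattern[f] in ("depends", observed[f]) for f in pattern):
            matches.append(event)
    return matches


if __name__ == "__main__":
    print(candidate_events(
        {"sIP": "random", "dIP": "specific", "sPort": "random", "dPort": "specific"}
    ))
```

Because of the "depends" entries, more than one event type can match a single observation, so the result is a list of candidates and not a single verdict.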
Analysis
• Time periods
  – Nov 2008
  – Jan/Feb 2011
  – Oct 2011
• Calculation of sample entropy
  – sIP, dIP, sPort, dPort
  – Time intervals: 1 hour
• Tools: SiLK, R
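The per-interval calculation can be illustrated as follows. The talk's analysis used SiLK and R; this Python sketch is a stand-in that only shows the idea of one entropy value per feature per one-hour bin. The packet record layout (dicts with `ts`, `sIP`, `dIP`, `sPort`, `dPort`) and the function names are assumptions.

```python
import math
from collections import Counter, defaultdict

FEATURES = ("sIP", "dIP", "sPort", "dPort")


def entropy_from_counts(counts):
    """Sample entropy in bits of a histogram given as a Counter."""
    total = sum(counts.values())
    h = 0.0
    for n in counts.values():
        p = n / total
        h -= p * math.log2(p)
    return h


def hourly_entropy(packets, interval=3600):
    """Compute H(sIP), H(dIP), H(sPort), H(dPort) per time bin.

    packets  -- iterable of dicts with 'ts' (epoch seconds) and the
                four feature keys (hypothetical record format).
    interval -- bin width in seconds (default: one hour).

    Returns {bin_start: {feature: entropy}}, sorted by bin start.
    """
    bins = defaultdict(lambda: {f: Counter() for f in FEATURES})
    for pkt in packets:
        start = int(pkt["ts"]) // interval * interval
        for f in FEATURES:
            bins[start][f][pkt[f]] += 1
    return {
        start: {f: entropy_from_counts(c) for f, c in hists.items()}
        for start, hists in sorted(bins.items())
    }


if __name__ == "__main__":
    sample = [
        {"ts": 10, "sIP": "a", "dIP": "x", "sPort": 1, "dPort": 445},
        {"ts": 20, "sIP": "b", "dIP": "x", "sPort": 2, "dPort": 445},
        {"ts": 3700, "sIP": "a", "dIP": "y", "sPort": 3, "dPort": 80},
    ]
    for start, values in hourly_entropy(sample).items():
        print(start, values)
```

Only four numbers per interval need to be stored, which is in line with the goal of keeping storage requirements low.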
NOV 2008
[Figure: Nov 2008. Time series of H(sIP), H(dIP), H(dPort), H(sPort) and packet count (#pkts [x10^8]).]
[Figure: Nov 2008. The same time series with intervals A and B marked.]
Classification of Event B
[Table: expected entropy patterns, as on the “Expected Entropy Patterns” slide, used to classify event B.]
Distributions: sIP, dPort
[Figure: rank plots for intervals A and B. Left: #pkts [x10^6] over sIP rank (log). Right: #pkts [x10^6] over dPort rank (log).]
• sIP: H(sIP)=5.97, # unique sIPs: 206,159; H(sIP)=9.36, # unique sIPs: 421,563
• dPort: H(dPort)=8.15, top ports: 1434, 445, 3072, ..; H(dPort)=6.59, top ports: 445, 62997, 137, ..
JAN/FEB 2011
[Figure: Jan-Feb 2011. Time series of H(sIP), H(dIP), H(dPort), H(sPort) and packet count (#pkts [x10^8]).]
[Figure: Jan 2011. Time series of H(sIP), H(dIP), H(dPort), H(sPort) and packet count (#pkts [x10^8]).]
[Figure: The same time series with intervals A and B marked.]
Classification of Event B
[Table: expected entropy patterns, as on the “Expected Entropy Patterns” slide, used to classify event B.]
Distributions: sIP, dIP
[Figure: rank plots for intervals A and B. Left: #pkts [x10^6] over sIP rank (log). Right: #pkts [x10^6] over dIP rank (log).]
• sIP: H(sIP)=10.97, # unique sIPs: 3,022,603; H(sIP)=16.40, # unique sIPs: 23,733,290 (many more sIPs)
• dIP: H(dIP)=14.4; H(dIP)=0.59 (a lot of packets to one IP)
Distributions: sPort, dPort
[Figure: rank plots for intervals A and B. Left: #pkts [x10^6] over sPort rank (log). Right: #pkts [x10^6] over dPort rank (log).]
• sPort: H(sPort)=8.42; H(sPort)=10.43 (sPorts dispersed)
• dPort: H(dPort)=3.19, top dPort: 445; H(dPort)=0.22, top dPort: 80 (a lot of packets to one port)
[Figure: Time series of H(sIP), H(dIP), H(dPort), H(sPort) and packet count (#pkts [x10^8]), with intervals A and C marked.]
Classification of Event C
[Table: expected entropy patterns, as on the “Expected Entropy Patterns” slide, used to classify event C.]
Distributions: sIP, dIP
[Figure: rank plots for intervals A and C. Left: #pkts [x10^6] over sIP rank (log). Right: #pkts [x10^6] over dIP rank (log).]
• sIP: H(sIP)=10.97, # unique sIPs: 3,022,603; H(sIP)=6.05 (a lot of packets from few sIPs)
• dIP: H(dIP)=14.4; H(dIP)=15.60
Distributions: sPort, dPort
[Figure: rank plots for intervals A and C. Left: #pkts [x10^6] over sPort rank (log). Right: #pkts [x10^6] over dPort rank (log).]
• sPort: H(sPort)=8.42; H(sPort)=4.70, top sPorts: 80, 9021 (a lot of packets from one port)
• dPort: H(dPort)=3.19, top dPort: 445; H(dPort)=8.14 (dPorts dispersed)
OCT 2011
[Figure: Oct 2011. Time series of H(sIP), H(dIP), H(dPort), H(sPort) and packet count (#pkts [x10^8]), with intervals A and B marked.]
Classification of Event B
[Table: expected entropy patterns, as on the “Expected Entropy Patterns” slide, used to classify event B.]
Distributions: sIP, dPort
[Figure: rank plots for intervals A and B. Left: #pkts [x10^6] over sIP rank (log). Right: #pkts [x10^6] over dPort rank (log).]
• sIP: H(sIP)=10.37; H(sIP)=5.55 (a lot of packets from few sIPs)
• dPort: H(dPort)=3.43; H(dPort)=8.14 (dPorts dispersed)
Oct 2011
[Figure: Oct 2011. Packet counts (#pkts [x10^6]) for SYN-ACKs and for all packets, shown together with entropy H.]
Discussion
• Entropy
  – Good indicator for new incidents in darkspace
  – Comprehensive metric to detect and classify different incidents
• Future considerations:
  – Detection of slow and small changes
    • Outages were not visible with current time interval
    • Stealth scanning
    • Check fine-grained time intervals
  – Time interval vs. calculation effort
  – Entropy calculation effort compared to other methods
  – Problems with nested events
  – Combination with other metrics (geolocation, source groups, …)
  – Combination with other DS monitors
CAIDA Workshop on Darkspace Analysis
• May 2012, San Diego
• Objectives
  – Bring community together
  – Share experiences
  – Share data, results
  – Establish global distributed DS network
• Participation by invitation
  – If interested contact me