Entropy in IP Darkspace Data Tanja Zseby Cooperative Association for Internet Data Analysis (CAIDA) and Fraunhofer Institute for Open Communication Systems (FOKUS) CERT FloCon, January 2012

Tanja Zseby Cooperative Association for Internet Data Analysis … · 2012. 1. 9. · 14 of 44. Related Work Entropy-based anomaly detection: • Lee/Xiang 2001 – Information Theoretic


Entropy in IP Darkspace Data

Tanja Zseby

Cooperative Association for Internet Data Analysis (CAIDA) and

Fraunhofer Institute for Open Communication Systems (FOKUS)

CERT FloCon, January 2012

Page 2: Tanja Zseby Cooperative Association for Internet Data Analysis … · 2012. 1. 9. · 14 of 44. Related Work Entropy-based anomaly detection: • Lee/Xiang 2001 – Information Theoretic

IP Darkspace

• Globally routable IP address space
  – announced by routing
  – but no hosts attached
  → all traffic destined to darkspace is unsolicited

• UCSD telescope
  – /8 darkspace
  – used for different analyses (security, outages, etc.)

• Other IP darkspace monitors:
  – Internet Motion Sensor, Team Cymru Darknet Project, iSink, …


Scanning


Backscatter


Analysis of Darkspace Data

• Detection of incidents
  – Scanning activities
  – Backscatter
  – Misconfigurations
  – Network outages

→ Analysis (patterns, scope, ...)
→ Early warning
→ “Cleaning up” address space


DSA related work

• General Analysis Techniques
  – Brownlee. One-way Traffic Monitoring with iatmon. To appear at PAM 2012
  – Ahmed et al. Characterising anomalous events using change-point correlation on unsolicited network traffic. In Identity and Privacy in the Internet Age, 2009

• Security and Misconfigurations
  – Wustrow et al. Internet background radiation revisited. IMC 2010
  – Aben. Conficker. ISOI 2009
  – Moore et al. Code-Red: a case study on the spread and victims of an Internet worm. IMW 2002

• Network Outages
  – Dainotti et al. Analysis of Country-wide Internet Outages Caused by Censorship. IMC 2011

• Darkspace Construction
  – Janies, Collins. Darkspace Construction and Maintenance. FloCon 2011

• IPv6 Darkspace
  – Huston. IPv6 Background Radiation. NANOG50, 2010
  – Ford et al. Initial Results from an IPv6 Darknet. 2006

…and others.


Metrics and Techniques

[Diagram: two processing chains, each consisting of packet classification followed by a packet count per class.
  – Classification rules (feature combinations) → time series of packet counts for selected feature combinations (curves C1, C2, C3 over time t)
  – Classification rules (selected features) → distributions for selected features (counts per class at times T1, T2, T3)]


Example Metrics

• Time series of packet counts
  – Overall packet count
  – Packets to a specific port
  – Packets with specific TCP flags

• Source groups based on source behavior
  – Packet features (e.g. SYNs to specific port)
  – Inter Arrival Times (IATs)

• Distributions
  – IP addresses, port numbers


Challenges

• High amount of data
  – Many repetitions/boring events (TCP-SYNs, …)
  – Whole distributions → huge amount of data

• Selection of suitable classification rules
  – Separate known events from new/interesting packets
  – Feature selection difficult
  – Features of interest may change
  – High analysis effort
  – Detection of different events requires various metrics


Problem Statement

• Goal: detect and classify “events of interest”
  – New vulnerabilities (increased scanning)
  – New victims of attacks (increased backscatter)
  – Misconfigurations
  – Network outages

• Ideal: comprehensive metric
  – capture all events of interest

• Conditions
  – Keep storage requirements low


Characteristics of DS Events

• Hostscans (new vulnerability)
  – Many new sources (attackers) send to a specific destination port

• Backscatter (from DoS attacks with spoofed addresses)
  – Several sources (victims) send a lot of data to many destination addresses using a specific source port

• Misconfiguration (configuration of wrong destination IP)
  – Several sources send to a specific destination IP and specific destination port

• Outages
  – Source IPs from the outage region are missing → fewer source IPs

• DDoS (to a destination IP in darkspace)
  – Many new sources (bots or spoofed) send to a specific destination IP and specific destination port

• Portscan
  – One or several hosts send to a specific destination IP and many destination ports


Expected Effects on Distributions

        Hostscan            Backscatter         Misconfig  Outage                   DDoS (rare)         Portscan (rare)
sIP     random (attackers)  specific (victims)  specific   specific (some missing)  random (attackers)  specific (attackers)
dIP     random              random              specific   depends                  specific            specific
sPort   random*             specific            depends    depends                  random*             random*
dPort   specific            random*             specific   depends                  specific            random

*assuming random sPort selection by attack tools

Distinction of specific/random → entropy!


Sample Entropy

“You should call it entropy, […] …no one really knows what entropy really is, so in a debate you will always have the advantage.”

John von Neumann’s suggestion to Claude Shannon, according to Max Jammer, “Dictionary of the History of Ideas: Entropy”


Sample Entropy

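Definition from [LaCD05]:

H(X) = -\sum_{i=1}^{N} \frac{n_i}{S} \log_2 \frac{n_i}{S}

Histogram: X = \{n_i, i = 1, \dots, N\}, where feature value i occurs n_i times
Total number of observations: S = \sum_{i=1}^{N} n_i

[LaCD05] Lakhina, Crovella, Diot: Mining Anomalies Using Traffic Feature Distributions. SIGCOMM 2005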
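
The sample entropy of [LaCD05] can be sketched in a few lines of Python. This is an illustrative sketch only, not the code behind the slides (the talk names SiLK and R as its tools). The function name `sample_entropy` is an assumption made for this example. The sum is written with log2(S/n_i), which is algebraically the same as the negated sum over log2(n_i/S).

```python
import math
from collections import Counter

def sample_entropy(values):
    """Sample entropy H(X) in bits for an iterable of feature values
    (for example the source IPs seen in one time interval)."""
    counts = Counter(values)       # histogram: n_i for each distinct value i
    total = sum(counts.values())   # S, the total number of observations
    if total == 0:
        raise ValueError("need at least one observation")
    # (n_i/S) * log2(S/n_i) equals -(n_i/S) * log2(n_i/S)
    return sum((n / total) * math.log2(total / n) for n in counts.values())
```

For instance, `sample_entropy([445, 445, 80, 80])` is 1 bit, because two values share the observations equally.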


Related Work

Entropy-based anomaly detection:

• Lee/Xiang 2001
  – Information Theoretic Measures for Anomaly Detection

• Feinstein/Schnackenberg 2003
  – Detection of DDoS attacks based on source IP entropy

• Lakhina et al. 2005
  – Detection of scanning, DDoS, outages based on combinations of entropy from addresses and ports


Entropy Example

All packets equal: entropy is minimal, H(X) = 0

Each packet different: entropy is maximal, H(X) = log2 N

[Figure: two histograms of frequency (freq) over feature i, one labelled H(x)=min and one labelled H(x)=max]
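
Both extremes follow directly from the sample entropy definition of [LaCD05]. As a short worked derivation, write n_i for the number of times feature value i occurs, S for the total number of observations, and N for the number of distinct values:

```latex
% General form
H(X) = -\sum_{i=1}^{N} \frac{n_i}{S} \log_2 \frac{n_i}{S}

% All packets equal: one distinct value, N = 1 and n_1 = S
H(X) = -\frac{S}{S} \log_2 \frac{S}{S} = -1 \cdot \log_2 1 = 0

% Each packet different: n_i = 1 for every i, so S = N
H(X) = -\sum_{i=1}^{N} \frac{1}{N} \log_2 \frac{1}{N}
     = -N \cdot \frac{1}{N} \cdot (-\log_2 N)
     = \log_2 N
```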


Expected Entropy Patterns

        Hostscan   Backscatter  Misconfig  Outage    DDoS (rare)  Portscan (rare)
sIP     random     specific     specific   specific  random       specific
dIP     random**   random**     specific   depends   specific     specific
sPort   random*    specific     depends    depends   random*      random*
dPort   specific   random*      specific   depends   specific     random

**dIP has already high entropy in “normal” operation

*assuming random sPort selection by attack tools
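
The pattern table can be read as a lookup: given a random/specific label for each feature, list the event types whose expected pattern is compatible. The sketch below is a hypothetical illustration, not a method from the talk. The names `PATTERNS` and `candidate_events` are assumptions, "depends" is treated as a wildcard, and the step of turning an entropy value into a random/specific label is deliberately left out, because the slides give no thresholds.

```python
# Expected patterns copied from the table; footnote markers are dropped
# and "depends" matches either label.
PATTERNS = {
    "Hostscan":    {"sIP": "random",   "dIP": "random",   "sPort": "random",   "dPort": "specific"},
    "Backscatter": {"sIP": "specific", "dIP": "random",   "sPort": "specific", "dPort": "random"},
    "Misconfig":   {"sIP": "specific", "dIP": "specific", "sPort": "depends",  "dPort": "specific"},
    "Outage":      {"sIP": "specific", "dIP": "depends",  "sPort": "depends",  "dPort": "depends"},
    "DDoS":        {"sIP": "random",   "dIP": "specific", "sPort": "random",   "dPort": "specific"},
    "Portscan":    {"sIP": "specific", "dIP": "specific", "sPort": "random",   "dPort": "random"},
}

def candidate_events(observed):
    """Return the event types compatible with the observed labels.

    observed maps each of sIP, dIP, sPort, dPort to "random" or "specific".
    """
    matches = []
    for event, pattern in PATTERNS.items():
        if all(pattern[f] in ("depends", observed[f]) for f in pattern):
            matches.append(event)
    return matches
```

The function returns candidates rather than a verdict. Because the Outage column is mostly "depends", it stays compatible with any observation that has a specific sIP, so several event types can match at once.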


Analysis

• Time periods
  – Nov 2008
  – Jan/Feb 2011
  – Oct 2011

• Calculation of Sample Entropy
  – sIP, dIP, sPort, dPort
  – Time intervals: 1 hour

• Tools: SiLK, R
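
The analysis itself was done with SiLK and R. Purely as an illustration of the procedure, and in a different language, the following Python sketch bins packets into one-hour intervals and computes the sample entropy of sIP, dIP, sPort and dPort per interval. The record layout (a dict with a `ts` field in epoch seconds plus the four features) and the names `hourly_entropy` and `entropy` are assumptions made for this example.

```python
import math
from collections import Counter, defaultdict

FEATURES = ("sIP", "dIP", "sPort", "dPort")

def entropy(histogram):
    """Sample entropy in bits of a Counter mapping feature value -> count."""
    total = sum(histogram.values())
    return sum((n / total) * math.log2(total / n) for n in histogram.values())

def hourly_entropy(packets, interval=3600):
    """Map each interval start time to the entropy of every feature.

    packets: iterable of dicts with 'ts' (epoch seconds) and the keys
    in FEATURES (hypothetical layout, not SiLK's record format).
    """
    bins = defaultdict(lambda: {f: Counter() for f in FEATURES})
    for pkt in packets:
        start = int(pkt["ts"]) // interval * interval  # start of the interval
        for f in FEATURES:
            bins[start][f][pkt[f]] += 1
    return {start: {f: entropy(hists[f]) for f in FEATURES}
            for start, hists in sorted(bins.items())}
```

Only one histogram per feature and interval is held in memory, and only four numbers per interval need to be stored afterwards, which fits the stated condition of keeping storage requirements low.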


NOV 2008

[Figure: Nov 2008 time series of H(sIP), H(dIP), H(dPort), H(sPort) and packet count (#pkts, ×10^8)]

[Figure: Nov 2008 time series of H(sIP), H(dIP), H(dPort), H(sPort) and packet count (#pkts, ×10^8), with A and B marked]


Classification of Event B

        Hostscan   Backscatter  Misconfig  Outage    DDoS (rare)  Portscan (rare)
sIP     random     specific     specific   specific  random       specific
dIP     random**   random**     specific   depends   specific     specific
sPort   random*    specific     depends    depends   random*      random*
dPort   specific   random*      specific   depends   specific     random


Distributions: sIP, dPort

[Figure: packet counts (#pkts, ×10^6) over sIP rank (log) and dPort rank (log), for A and B]

H(sIP)=5.97, # unique sIPs: 206,159
H(sIP)=9.36, # unique sIPs: 421,563
H(dPort)=8.15, top ports: 1434, 445, 3072, ..
H(dPort)=6.59, top ports: 445, 62997, 137, ..


JAN/FEB 2011

[Figure: Jan-Feb 2011 time series of H(sIP), H(dIP), H(dPort), H(sPort) and packet count (#pkts, ×10^8)]


[Figure: Jan 2011 time series of H(sIP), H(dIP), H(dPort), H(sPort) and packet count (#pkts, ×10^8)]

[Figure: time series of H(sIP), H(dIP), H(dPort), H(sPort) and packet count (#pkts, ×10^8), with A and B marked]


Classification of Event B

        Hostscan   Backscatter  Misconfig  Outage    DDoS (rare)  Portscan (rare)
sIP     random     specific     specific   specific  random       specific
dIP     random**   random**     specific   depends   specific     specific
sPort   random*    specific     depends    depends   random*      random*
dPort   specific   random*      specific   depends   specific     random


Distributions: sIP, dIP

[Figure: packet counts (#pkts, ×10^6) over sIP rank (log) and dIP rank (log), for A and B]

H(sIP)=10.97, # unique sIPs: 3,022,603
H(sIP)=16.40, # unique sIPs: 23,733,290 (many more sIPs)
H(dIP)=14.4
H(dIP)=0.59 (a lot of packets to one IP)

Distributions: sPort, dPort

[Figure: packet counts (#pkts, ×10^6) over sPort rank (log) and dPort rank (log), for A and B]

H(sPort)=8.42
H(sPort)=10.43 (sPorts dispersed)
H(dPort)=3.19, top dPort: 445
H(dPort)=0.22, top dPort: 80 (a lot of packets to one port)

[Figure: time series of H(sIP), H(dIP), H(dPort), H(sPort) and packet count (#pkts, ×10^8), with A and C marked]


Classification of Event C

        Hostscan   Backscatter  Misconfig  Outage    DDoS (rare)  Portscan (rare)
sIP     random     specific     specific   specific  random       specific
dIP     random**   random**     specific   depends   specific     specific
sPort   random*    specific     depends    depends   random*      random*
dPort   specific   random*      specific   depends   specific     random


Distributions: sIP, dIP

[Figure: packet counts (#pkts, ×10^6) over sIP rank (log) and dIP rank (log), for A and C]

H(sIP)=10.97, # unique sIPs: 3,022,603
H(sIP)=6.05 (a lot of packets from few sIPs)
H(dIP)=14.4
H(dIP)=15.60

Distributions: sPort, dPort

[Figure: packet counts (#pkts, ×10^6) over sPort rank (log) and dPort rank (log), for A and C]

H(sPort)=8.42
H(sPort)=4.70, top sPorts: 80, 9021 (a lot of packets from one port)
H(dPort)=3.19, top dPort: 445
H(dPort)=8.14 (dPorts dispersed)


OCT 2011

[Figure: Oct 2011 time series of H(sIP), H(dIP), H(dPort), H(sPort) and packet count (#pkts, ×10^8), with A and B marked]



Classification of Event B

        Hostscan   Backscatter  Misconfig  Outage    DDoS (rare)  Portscan (rare)
sIP     random     specific     specific   specific  random       specific
dIP     random**   random**     specific   depends   specific     specific
sPort   random*    specific     depends    depends   random*      random*
dPort   specific   random*      specific   depends   specific     random


Distributions: sIP, dPort

[Figure: packet counts (#pkts, ×10^6) over sIP rank (log) and dPort rank (log), for A and B]

H(sIP)=10.37
H(sIP)=5.55 (a lot of packets from few sIPs)
H(dPort)=3.43
H(dPort)=8.14 (dPorts dispersed)

[Figure: Oct 2011 time series for SYN-ACKs and for all packets, showing packet count (#pkts, ×10^6) and entropy H]


Discussion

• Entropy
  – Good indicator for new incidents in darkspace
  – Comprehensive metric to detect and classify different incidents

• Future considerations:
  – Detection of slow and small changes
    • Outages were not visible with the current time interval
    • Stealth scanning
    • Check fine-grained time intervals
  – Time interval vs. calculation effort
  – Entropy calculation effort compared to other methods
  – Problems with nested events
  – Combination with other metrics (geolocation, source groups, …)
  – Combination with other DS monitors


CAIDA Workshop on Darkspace Analysis

• May 2012, San Diego

• Objectives
  – Bring community together
  – Share experiences
  – Share data, results
  – Establish global distributed DS network

• Participation by invitation
  – If interested, contact me


Thank You!

Contact: [email protected]