A Machine Learning Approach to Detecting Attacks by Identifying Anomalies in Network Traffic
A Dissertation by Matthew V. Mahoney
Major Advisor: Philip K. Chan
Overview
• Related work in intrusion detection
• Approach
• Experimental results
– Simulated network
– Real background traffic
• Conclusions and future work
Limitations of Intrusion Detection
• Host based (audit logs, virus checkers, system calls (Forrest 1996))
– Cannot be trusted after a compromise
• Network signature detection (SNORT (Roesch 1999), Bro (Paxson 1998))
– Cannot detect novel attacks
– Alarms occur in bursts
• Address/port anomaly detection (ADAM (Barbara 2001), SPADE (Hoagland 2000), eBayes (Valdes & Skinner 2000))
– Cannot detect attacks on public servers (web, mail)
Intrusion Detection Dimensions
[Figure: existing systems placed along the dimensions Model (anomaly vs. signature), Data (network, host, user, system), and Method. Host side: BSM, audit logs, virus detection. Network signature side: SNORT, Bro, firewalls. Network anomaly side: SPADE, ADAM, eBayes. The proposed approach occupies network protocol anomaly detection.]
Problem Statement
• Detect (not prevent) attacks in network traffic
• No prior knowledge of attack characteristics
[Diagram: training data with no known attacks produces a model of normal traffic; the IDS applies this model to test data containing attacks and outputs alarms.]
Approach
1. Model protocols (extend user model)
2. Time-based model of “bursty” traffic
3. Learn conditional rules
4. Batch and continuous modeling
5. Test with simulated attacks and real background traffic
Approach 1. Protocol Modeling
• User model (conventional)
– Source address for authentication
– Destination port to detect scans
• Protocol model (new)
– Unusual features (more likely to be vulnerable)
– Client idiosyncrasies
– IDS evasion
– Victim’s symptoms after an attack
Example Protocol Anomalies

Attack | How detected | Category
Teardrop – overlapping IP fragments crashes target | IP fragments | Unusual feature
Sendmail – buffer overflow gives remote root shell | Lower case mail | Idiosyncrasy
FIN scan (portsweep) – FIN packets not logged | FIN without ACK | Evasion
ARP poison – forged replies to ARP-who-has | Interrupted TCP | Victim symptoms
Approach 2. Non-Poisson Traffic Model (Paxson & Floyd, 1995)
• Events occur in bursts on all time scales
• Long range dependency
• No average rate of events
• Event probability depends on
– The average rate in the past
– The time since it last occurred
Time-Based Model
If port = 25 then word1 = HELO or EHLO
• Anomaly: any value never seen in training
• Score = tn/r
– t = time since the last anomaly for this rule
– n = number of training instances (port = 25)
– r = number of allowed values (2)
• Only the first anomaly in a burst receives a high score
Example
Training = AAAABBBBAA, Test = AACCC
• C is an anomaly
• r/n = average rate of training anomalies = 2/10 (first A and first B)
• t = time since last anomaly = 9, 1, 1
• Score (C) = tn/r = 45, 5, 5 (see the code sketch below)
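A minimal Python sketch of this scoring rule (class and variable names are mine, not the dissertation's). It follows the slide's definitions of t, n, and r; under the indexing convention used here the first C gets t = 8 and scores 40 rather than the slide's 45, a difference that comes down to how inter-anomaly time is counted.

class TimeBasedRule:
    """Time-based anomaly score tn/r, per the slide's definitions."""
    def __init__(self):
        self.allowed = set()    # values seen so far; r = len(allowed)
        self.n = 0              # training instances the rule applied to
        self.last_anomaly = 0   # time of the most recent novel value

    def train(self, value, t):
        if value not in self.allowed:
            self.allowed.add(value)
            self.last_anomaly = t   # first A and first B count as anomalies
        self.n += 1

    def score(self, value, t):
        """Return t*n/r for a never-before-seen value, else 0."""
        if value in self.allowed:
            return 0.0
        s = (t - self.last_anomaly) * self.n / len(self.allowed)
        self.last_anomaly = t       # only the first C in a burst scores high
        return s

rule = TimeBasedRule()
for t, v in enumerate("AAAABBBBAA", start=1):
    rule.train(v, t)                # n = 10, allowed = {A, B}, so r = 2
for t, v in enumerate("AACCC", start=11):
    s = rule.score(v, t)
    if s > 0:
        print(v, s)                 # prints C 40.0 (the slide's t = 9 gives
                                    # 45), then C 5.0 and C 5.0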
Approach 3. Rule Learning
1. Sample training pairs to suggest rules with n/r = 2/1
2. Remove redundant rules, favoring high n/r
3. Validation: remove rules that generate alarms on attack-free traffic
Learning Step 1 - Sampling
Port | Word1 | Word2 | Word3
80 | GET | / | HTTP/1.0
80 | GET | /index.html | HTTP/1.0
• If port = 80 then word1 = GET
• word3 = HTTP/1.0
• If word3 = HTTP/1.0 and word1 = GET then port = 80
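A hedged Python sketch of this sampling step (the pairing and ordering details are my assumptions, not necessarily the dissertation's exact procedure): attributes on which two sampled training instances agree are shuffled, and each in turn becomes a rule's consequent, with the previously chosen attributes as conditions. One shuffle order reproduces the slide's three rules.

import random

def suggest_rules(inst_a, inst_b):
    """Propose candidate rules from a sampled pair of training
    instances (dicts of attribute -> value), using only the
    attributes on which the pair agrees."""
    matching = [a for a in inst_a if inst_a[a] == inst_b[a]]
    random.shuffle(matching)
    rules = []
    for i, target in enumerate(matching):
        conditions = {a: inst_a[a] for a in matching[:i]}
        rules.append((conditions, target, inst_a[target]))
    return rules

a = {"port": 80, "word1": "GET", "word2": "/",           "word3": "HTTP/1.0"}
b = {"port": 80, "word1": "GET", "word2": "/index.html", "word3": "HTTP/1.0"}
for cond, attr, value in suggest_rules(a, b):
    lhs = " and ".join(f"{k} = {v}" for k, v in cond.items())
    print(f"if {lhs or '(always)'} then {attr} = {value}")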
Learning Step 2 – Remove Redundant Rules (Sorted by n/r)
• R1: if port = 80 then word1 = GET (n/r = 2/1, OK)
• R2: word1 = HELO or GET (n/r = 3/2, OK)
• R3: if port = 25 then word1 = HELO (n/r = 1/1, remove)
• R4: word2 = pascal, /, or /index.html (n/r = 3/3, OK)
Port | Word1 | Word2 | Word3
25 | HELO | pascal | MAIL
80 | GET | / | HTTP/1.0
80 | GET | /index.html | HTTP/1.0
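One way to implement this pruning, sketched under an assumed criterion: visiting rules in decreasing n/r order, a rule is redundant if every (instance, attribute) value it correctly predicts is already predicted by a rule kept earlier. On the slide's data this removes exactly R3.

def prune(rules, training):
    """Keep a rule only if it covers some (instance, attribute) slot
    that no higher-n/r rule already covers (assumed criterion)."""
    kept, covered = [], set()
    for name, cond, attr, allowed, n, r in sorted(
            rules, key=lambda x: x[4] / x[5], reverse=True):
        slots = {(i, attr) for i, inst in enumerate(training)
                 if all(inst.get(k) == v for k, v in cond.items())
                 and inst.get(attr) in allowed}
        if slots - covered:          # rule adds new coverage: keep it
            kept.append(name)
            covered |= slots
    return kept

training = [
    {"port": 25, "word1": "HELO", "word2": "pascal",      "word3": "MAIL"},
    {"port": 80, "word1": "GET",  "word2": "/",           "word3": "HTTP/1.0"},
    {"port": 80, "word1": "GET",  "word2": "/index.html", "word3": "HTTP/1.0"},
]
rules = [
    ("R1", {"port": 80}, "word1", {"GET"},                        2, 1),
    ("R2", {},           "word1", {"HELO", "GET"},                3, 2),
    ("R3", {"port": 25}, "word1", {"HELO"},                       1, 1),
    ("R4", {},           "word2", {"pascal", "/", "/index.html"}, 3, 3),
]
print(prune(rules, training))   # ['R1', 'R2', 'R4']: R3 adds no new coverage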
Learning Step 3 – Rule Validation
• Training (no attacks) – Learn rules, n/r
• Validation (no attacks) – Discard rules that generate alarms (see the sketch below)
• Testing (with attacks)
[Timeline: Train → Validate → Test]
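A minimal sketch of the validation pass (violates is a hypothetical predicate standing in for the rule-checking logic): any rule that fires on traffic known to be attack-free is generating false alarms by definition, so it is discarded before testing.

def validate(rules, clean_traffic, violates):
    """Keep only rules that never raise an alarm on attack-free
    validation traffic; violates(rule, instance) -> bool is assumed."""
    return [rule for rule in rules
            if not any(violates(rule, inst) for inst in clean_traffic)]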
Approach 4. Continuous Modeling
• No separate training and test phases
• Training data may contain attacks
• Model allows for previously seen values
• Score = tn/r + t_i/f_i
– t_i = time since value i was last seen
– f_i = frequency of value i in training, f_i > 0
• No validation step
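A hedged Python sketch of the combined score (names and bookkeeping are mine): never-seen values get the tn/r term as before, previously seen values get the smaller t_i/f_i term, and the model updates on every instance, so there is no separate training phase.

class ContinuousRule:
    def __init__(self):
        self.count = {}        # f_i: occurrences of each value so far
        self.last_seen = {}    # time each value was last seen
        self.n = 0             # total instances observed
        self.last_anomaly = 0  # time of the last never-seen value

    def update_and_score(self, value, t_now):
        if value not in self.count:
            # novel value: tn/r term, with r = number of distinct values
            score = (t_now - self.last_anomaly) * self.n / max(len(self.count), 1)
            self.last_anomaly = t_now
        else:
            # previously seen value: t_i/f_i term; f_i > 0 is guaranteed
            # because the count is set to 1 at first sight below
            score = (t_now - self.last_seen[value]) / self.count[value]
        self.count[value] = self.count.get(value, 0) + 1
        self.last_seen[value] = t_now
        self.n += 1
        return score

model = ContinuousRule()
for t, v in enumerate("AAAABBBBAACCC", start=1):
    print(v, model.update_and_score(v, t))   # novel B and C score highest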
Implementation
Model | Data | Conditions | Validation | Score
PHAD | Packet headers | None | No | tn/r
ALAD | TCP streams | Server, port | No | tn/r
LERAD | TCP streams | Learned | Yes | tn/r
NETAD | Packet bytes | Protocol | Yes | tn/r + t_i/f_i
Example Rules (LERAD)
1. 39406/1 if SA3=172 then SA2 = 016
2. 39406/1 if SA2=016 then SA3 = 172
3. 28055/1 if F1=.UDP then F3 = .
4. 28055/1 if F1=.UDP then F2 = .
5. 28055/1 if F3=. then F1 = .UDP
6. 28055/1 if F3=. then DUR = 0
7. 27757/1 if DA0=100 then DA1 = 112
8. 25229/1 if W6=. then W7 = .
9. 25221/1 if W5=. then W6 = .
10. 25220/1 if W4=. then W8 = .
11. 25220/1 if W4=. then W5 = .
12. 17573/1 if DA1=118 then W1 = .^B^A^@^@
13. 17573/1 if DA1=118 then SA1 = 112
14. 17573/1 if SP=520 then DP = 520
15. 17573/1 if SP=520 then W2 = .^P^@^@^@
16. 17573/1 if DP=520 then DA1 = 118
17. 17573/1 if DA1=118 SA1=112 then LEN = 5
18. 28882/2 if F2=.AP then F1 = .S .AS
19. 12867/1 if W1=.^@GET then DP = 80
20. 68939/6 if then DA1 = 118 112 113 115 114 116
21. 68939/6 if then F1 = .UDP .S .AF .ICMP .AS .R
22. 9914/1 if W3=.HELO then W1 = .^@EHLO
23. 9914/1 if F1=.S W3=.HELO then DP = 25
24. 9914/1 if DP=25 W5=.MAIL then W3 = .HELO
1999 DARPA IDS Evaluation (Lippmann et al. 2000)
• 7 days training data with no attacks
• 2 weeks test data with 177 visible attacks
• Must identify victim and time of attack
[Diagram: a simulated Internet generates background traffic and attacks against four victim hosts (SunOS, Solaris, Linux, WinNT) monitored by the IDS.]
Attacks Detected at 10 FA/Day
[Bar chart, y-axis 0 to 160: attacks detected at 10 false alarms per day by PHAD, ALAD, LERAD, NETAD, and the continuous model.]
Unlikely Detections
• Attacks on public servers (web, mail, DNS) detected by source address
• Application server attacks detected by packet header fields
• U2R (user to root) detected by FTP upload
Unrealistic Background Traffic
• Source address, client versions (too few clients)
• TTL, TCP options, TCP window size (artifacts)
• Checksum errors, “crud”, invalid keywords and values (too clean)
[Plot: r (number of allowed values) vs. time, comparing simulated and real traffic.]
5. Injecting Real Background Traffic
• Collected on a university departmental web server
• Filtered: truncated inbound client traffic only
• IDS modified to avoid conditioning on traffic source
[Diagram: the same testbed, but the Internet traffic is both simulated and real (from the web server), with attacks against the SunOS, Solaris, Linux, and WinNT victims monitored by the IDS.]
Mixed Traffic: Fewer Detections, but More are Legitimate
[Bar chart, y-axis 0 to 140: total and legitimate detections for PHAD, ALAD, LERAD, and NETAD.]
Detections vs. False Alarms (Simulated and Combined Traffic)
[Plot: detections out of 148 (y-axis, 0 to 125) vs. false alarms (x-axis, 0 to 500) for NETAD-S, LERAD-S, NETAD-C, and LERAD-C.]
Results Summary
• Original 1999 evaluation: 40-55% detected at 10 false alarms per day
• NETAD (excluding U2R): 75%
• Mixed traffic: LERAD + NETAD: 30%
• At 50 FA/day: NETAD: 47%
Contributions
1. Protocol modeling
2. Time-based modeling for bursty traffic
3. Rule learning
4. Continuous modeling
5. Removing simulation artifacts
Limitations
• False alarms – Unusual data is not always hostile
• Rule learning requires 2 passes (not continuous)
• Tests with real traffic are not reproducible (privacy concerns)
• Unlabeled attacks in real traffic
– GET /MSADC/root.exe?/c+dir HTTP/1.0
– GET /scripts/..%255c%255c../winnt/system32/cmd.exe?/c+dir
Future Work
• Modify rule learning for continuous traffic
• Add other attributes
• User feedback (should this anomaly be added to the model?)
• Test with real attacks
Acknowledgments
• Philip K. Chan – Directing research
• Advisors – Ryan Stansifer, Kamel Rekab, James Whittaker
• Ongoing work
– Gaurav Tandon – Host based detection using LERAD (system call arguments)
– Rachna Vargiya – Parsing application payload
– Hyoung Rae Kim – Payload lexical/semantic analysis
– Muhammad Arshad – Outlier detection in network traffic
• DARPA – Providing funding and test data