Rule-Based Anomaly Detection on IP Flows
Nick Duffield, Patrick Haffner, Balachander Krishnamurthy, Haakon Ringberg
Intrusion Detection Systems (IDSes) protect the edge of a network
- Inspect IP packets
- Look for worms, DoS, scans, instant messaging, etc.
Many IDSes leverage known signatures of traffic
- e.g., Slammer packets contain “MS-SQL” (say) in the payload, or AOL IM packets use specific TCP ports and application headers
[Diagram: packet layout (IP header | TCP header | App header | Payload); an enterprise edge performing unwanted traffic detection]
Benefits
- Programmable
- Leverage existing community: many rules already exist (CERT, SANS Institute, etc.)
- Classification “for free”
A predicate is a boolean function on a packet feature, e.g., TCP port = 80
A signature (or rule) is a set of predicates
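These definitions can be sketched in code; the packet representation and feature names here are illustrative assumptions, not from any specific IDS:

```python
# A predicate is a boolean function on a packet feature.
# A signature (rule) is a set of predicates that must all hold.
# The dict-based packet and the feature names are illustrative.

def make_rule(*predicates):
    """A rule fires when every predicate holds on the packet."""
    return lambda packet: all(p(packet) for p in predicates)

# Example predicates, mirroring the slides' examples.
is_http_port = lambda pkt: pkt["dst_port"] == 80
has_foo = lambda pkt: b"foo" in pkt["payload"][:20]

rule = make_rule(is_http_port, has_foo)
print(rule({"dst_port": 80, "payload": b"xxfooyy"}))   # True
print(rule({"dst_port": 443, "payload": b"xxfooyy"}))  # False
```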
Packet and rule-based IDSs
Drawbacks
- Packet inspection at the edge requires deployment at many interfaces
- Too many packets per second
[Diagram: ISP network with many edge interfaces]
DPI predicates can be computationally expensive
Packet has:
• Port number X, Y, or Z
• Contains pattern “foo” within the first 20 bytes
• Contains pattern “ba*r” within the first 40 bytes
src IP | dst IP | src Port | dst Port | Duration | # Packets
A      | B      |          |          | 5 min    | 36
…      | …      | …        | …        | …        | …
Our idea: IDS on IP flows
How well can rule-based IDSes be mimicked on IP flows?
- Efficient: only fixed-offset rule predicates
- More compact (no payload)
- Flow collection infrastructure is ubiquitous
- IP flows capture the concept of a connection
Idea
1. IDSes associate a “label” with every packet
2. An IP flow is associated with a set of packets
3. Our system associates the labels with flows
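The three steps can be sketched as follows; the 5-tuple flow key and the toy IDS labeling function are illustrative assumptions, not the paper's implementation:

```python
from collections import defaultdict

def flow_key(pkt):
    # A flow is conventionally keyed by the 5-tuple; illustrative choice.
    return (pkt["src_ip"], pkt["dst_ip"], pkt["src_port"],
            pkt["dst_port"], pkt["proto"])

def label_flows(packets, ids_label):
    """Step 3: a flow inherits the labels the IDS assigned to its packets."""
    labels = defaultdict(set)
    for pkt in packets:
        lbl = ids_label(pkt)                 # step 1: per-packet label
        if lbl is not None:
            labels[flow_key(pkt)].add(lbl)   # step 2: packets -> flow
    return dict(labels)

pkts = [
    {"src_ip": "A", "dst_ip": "B", "src_port": 1, "dst_port": 1434, "proto": 17},
    {"src_ip": "A", "dst_ip": "B", "src_port": 1, "dst_port": 1434, "proto": 17},
]
# Toy IDS: flag anything sent to the MS-SQL resolver port.
out = label_flows(pkts, lambda p: "slammer" if p["dst_port"] == 1434 else None)
print(out)  # {('A', 'B', 1, 1434, 17): {'slammer'}}
```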
Snort rule taxonomy
Header-only       | Inspects only the IP flow header | e.g., port numbers
Meta-information  | Inexact correspondence           | e.g., TCP flags
Payload-dependent | Inspects packet payload; relies on features that cannot be exactly reproduced in the IP flow realm | e.g., “contains ab*c”
Simple translation
Simple rule translation would capture only flow predicates: low accuracy or low applicability

Slammer Worm
Snort rule: • dst port = MS SQL • contains “Slammer”
Only flow predicates: • dst port = MS SQL
Machine Learning (ML)
Leverage ML to learn a mapping from “IP flow space” to label
IP flow space = src port × # packets × flags × duration
Label = +1 if the alarm is raised, −1 otherwise
[Scatter plot of flows over src port vs. # packets]
Boosting
Boosting combines a set of weak learners to create a strong learner
[Diagram: weak learners h1, h2, h3 combined into H_final = sign(Σ hi)]
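A minimal AdaBoost-style sketch of this combination, using toy 1-D data and threshold stumps as the weak learners (illustrative only, not the paper's implementation):

```python
import math

def adaboost(X, y, stumps, rounds=3):
    """Combine weak learners h_i into H(x) = sign(sum_i alpha_i * h_i(x))."""
    n = len(X)
    w = [1.0 / n] * n                       # per-example weights
    ensemble = []                           # (alpha, h) pairs
    for _ in range(rounds):
        # Pick the stump with the lowest weighted error.
        h = min(stumps, key=lambda h: sum(
            wi for wi, x, yi in zip(w, X, y) if h(x) != yi))
        err = sum(wi for wi, x, yi in zip(w, X, y) if h(x) != yi)
        err = min(max(err, 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, h))
        # Re-weight: boost the examples this stump got wrong.
        w = [wi * math.exp(-alpha * yi * h(x)) for wi, x, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    def H(x):
        return 1 if sum(a * h(x) for a, h in ensemble) >= 0 else -1
    return H

# Toy 1-D data: positive iff x > 2.
X, y = [1, 2, 3, 4], [-1, -1, 1, 1]
stumps = [lambda x, t=t: 1 if x > t else -1 for t in range(5)]
H = adaboost(X, y, stumps)
print([H(x) for x in X])  # [-1, -1, 1, 1]
```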
Benefit of Machine Learning (ML)
Rule translation would capture flow-only predicates: low accuracy or low applicability
ML algorithms discover new predicates that capture the rule
- Latent correlations between predicates
- Capturing the same subspace using different dimensions

Slammer Worm
Snort rule: • dst port = MS SQL • contains “Slammer”
Only flow predicates: • dst port = MS SQL
ML-generated rule: • dst port = MS SQL • packet size = 404 • flow duration
Architecture
1. Operate at a small # of interfaces
2. Use ML algorithms to learn to classify on IP flows
3. Apply learned classifiers across all/other interfaces
Evaluation
- Border router on OC-3 link
- Used Snort rules in place
- Unsampled NetFlow v5 and packet traces
- Statistics: one month, 2 MB/s average, 1 billion flows, 400k Snort alarms
Accuracy metrics
Receiver Operating Characteristic (ROC): full FP vs. TP tradeoff
But we need a single number: Area Under the Curve (AUC), Average Precision (AP)
An AP of p implies (1 − p)/p FP per TP
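The AP-to-false-positive conversion can be checked numerically; treating AP as a precision p, each true positive comes with (1 − p)/p false positives:

```python
def fp_per_tp(p):
    """Precision p means a fraction p of alarms are TP, so (1-p)/p FP per TP."""
    return (1 - p) / p

# Matches the talk's figures: AP ~0.95 -> ~5 FP per 100 TP,
# AP ~0.70 -> ~43 FP per 100 TP.
print(round(100 * fp_per_tp(0.95)))  # 5
print(round(100 * fp_per_tp(0.70)))  # 43
```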
Training on week 1, testing on week n
- High degree of accuracy for header and meta rules
- Minimal drift within a month

Rule class       | Week 1-2 | Week 1-3 | Week 1-4
Header rules     | 1.00     | 0.99     | 0.99
Meta-information | 1.00     | 1.00     | 0.95
Payload          | 0.70     | 0.71     | 0.70
Difference in rule accuracy
Accuracy is a function of correlation between flow and packet-level features

Rule                    | Overall Accuracy | w/o dst port | w/o mean packet size
MS-SQL version overflow | 1.00             | 0.99         | 0.83
ICMP PING speedera      | 0.82             | 0.79         | 0.06
NON-RFC HTTP DELIM      | 0.48             | 0.02         | 0.22

[Callouts: classifier accuracy of 5 FP per 100 TP and 43 FP per 100 TP]
Choosing an operating point
[Venn diagram: X = alarms we want raised, Z = alarms that are raised, Y = their overlap]
Precision = Y / Z (exactness)
Recall = Y / X (completeness)
AP is a single number, but not the most intuitive
Precision & recall are useful for operators: “I need to detect 99% of these alarms!”

Rule                     | Precision w/ recall = 1.00 | Precision w/ recall = 0.99
MS-SQL version overflow  | 1.00                       | 1.00
ICMP PING speedera       | 0.02                       | 0.83
CHAT AIM receive message | 0.02                       | 0.11
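A minimal sketch of these precision and recall definitions over alarm sets; the alarm IDs are purely illustrative:

```python
def precision_recall(wanted, raised):
    """Precision = |Y|/|Z| (exactness), Recall = |Y|/|X| (completeness),
    where X = alarms we want raised, Z = alarms raised, Y = their overlap."""
    X, Z = set(wanted), set(raised)
    Y = X & Z
    return len(Y) / len(Z), len(Y) / len(X)

# Toy alarm IDs: 4 wanted alarms, 3 raised, 2 in common.
p, r = precision_recall(wanted={1, 2, 3, 4}, raised={3, 4, 5})
print(p, r)  # precision 2/3, recall 1/2
```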
Computational efficiency
1. Machine learning (boosting): 33 hours per rule for one week of OC48
2. Classification of flows: 57k flows/sec on a 1.5 GHz Itanium 2; line-rate classification for OC48
Conclusion
- Applying Snort alarms to flows is feasible
- ML algorithms discover latent correlations between packet and flow predicates
- High degree of accuracy for many rules
- Minimal drift within a month
- Prototype can scale up to OC48 speeds
- Qualitatively predictive rule taxonomy

Future work
- Performance on sampled NetFlow
- Cross-site training / classification
Thank you!
Questions?
Nick Duffield, Patrick Haffner, Balachander Krishnamurthy, Haakon Ringberg