Firewalls and Intrusion Detection Systems
David Brumley, [email protected], Carnegie Mellon University
2
IDS and Firewall Goals
Expressiveness: What kinds of policies can we write?
Effectiveness: How well does it detect attacks while avoiding false positives?
Efficiency: How many resources does it take, and how quickly does it decide?
Ease of use: How much training is necessary? Can a non-security expert use it?
Security: Can the system itself be attacked?
Transparency: How intrusive is it to use?
3
Firewalls
Dimensions:
1. Host vs. Network
2. Stateless vs. Stateful
3. Network layer at which it operates
4
Firewall Goals
Provide defense in depth by:
1. Blocking attacks against hosts and services
2. Controlling traffic between zones of trust
5
Logical Viewpoint
[Diagram: a firewall sits between Inside and Outside and decides what to do with each message m.]
For each message m, either:
• Allow, with or without modification
• Block, by dropping or sending a rejection notice
• Queue
6
Placement
Host-based Firewall (runs on the host itself, between the host and the outside):
• Faithful to local configuration
• Travels with you

Network-based Firewall (sits between the outside and the whole network of hosts A, B, C):
• Protects the whole network
• Can make decisions on all of the traffic (traffic-based anomaly detection)
7
Parameters
Types of Firewalls
1. Packet filtering
2. Stateful inspection
3. Application proxy
Policies
1. Default allow
2. Default deny
8
Recall: Protocol Stack
Application (e.g., SSL)
Transport (e.g., TCP, UDP)
Network (e.g., IP)
Link layer (e.g., Ethernet)
Physical

[Diagram: encapsulation. The application message is split into TCP segments (TCP header + data); each segment is wrapped in an IP header, then in a link (Ethernet) header and trailer.]
9
Stateless Firewall
Filter by packet header fields:
1. IP fields (e.g., src, dst)
2. Protocol (e.g., TCP, UDP, ...)
3. Flags (e.g., SYN, ACK)

[Diagram: the firewall inspects the network- and transport-layer headers of packets passing between outside and inside.]

Example: only allow incoming DNS packets to nameserver A.A.A.A:
Allow UDP port 53 to A.A.A.A
Deny UDP port 53 to all
(The final deny is fail-safe good practice.)
e.g., ipchains in Linux 2.2
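A minimal sketch of how such a stateless decision might look in code (hypothetical struct and field names, not real ipchains or iptables syntax): every packet is judged in isolation from its header fields, and the last rule is the fail-safe default deny.

#include <stdint.h>

enum verdict { ALLOW, DENY };

struct pkt {                  /* header fields the filter may inspect */
    uint32_t src_ip, dst_ip;
    uint8_t  protocol;        /* e.g., 17 = UDP, 6 = TCP */
    uint16_t dst_port;
};

enum verdict filter(const struct pkt *p, uint32_t nameserver_ip)
{
    /* Allow UDP port 53, but only to the nameserver A.A.A.A */
    if (p->protocol == 17 && p->dst_port == 53 && p->dst_ip == nameserver_ip)
        return ALLOW;
    /* Deny UDP port 53 to everything else */
    if (p->protocol == 17 && p->dst_port == 53)
        return DENY;
    /* ...rules for other services would go here... */
    return DENY;              /* default deny: fail-safe good practice */
}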
10
Need to keep state

Example: TCP handshake between an inside and an outside host, through the firewall:
• SYN: SN_C = rand_C, AN_C = 0
• SYN/ACK: SN_S = rand_S, AN_S = SN_C
• ACK: SN = SN_C + 1, AN = SN_S
(Receiver states: Listening, then store SN_C and SN_S, Wait, Established.)

Desired policy: every SYN/ACK must have been preceded by a SYN.
11
Stateful Inspection Firewall
Added state (plus the obligation to manage it):
– Timeouts
– Size of the state table

[Diagram: like the stateless firewall, but the firewall keeps a state table alongside its network/transport-layer checks between outside and inside.]
e.g., iptables in Linux 2.4
12
Stateful ⇒ More Expressive

Example: the TCP handshake again, now enforced by the firewall:
• SYN: SN_C = rand_C, AN_C = 0. The firewall records SN_C in its table.
• SYN/ACK: SN_S = rand_S, AN_S = SN_C. The firewall verifies AN_S against the table.
• ACK: SN = SN_C + 1, AN = SN_S.
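A rough sketch of the table bookkeeping this implies (hypothetical fixed-size table, not iptables internals). It follows the slide's simplified handshake, where the SYN/ACK's acknowledgment number equals the recorded SN_C; a real firewall would also hash its lookups and expire entries, which is exactly the state-management obligation noted above.

#include <stdbool.h>
#include <stdint.h>

struct conn {
    uint32_t src_ip, dst_ip;
    uint16_t src_port, dst_port;
    uint32_t sn_c;                 /* client's initial sequence number */
    bool     in_use;
};

#define TABLE_SIZE 1024
static struct conn table[TABLE_SIZE];

/* Outbound SYN seen: record SN_C for this connection. */
bool record_syn(const struct conn *c)
{
    for (int i = 0; i < TABLE_SIZE; i++) {
        if (!table[i].in_use) {
            table[i] = *c;
            table[i].in_use = true;
            return true;
        }
    }
    return false;   /* table full; this is what a SYN flood tries to cause */
}

/* Inbound SYN/ACK seen: allow only if a matching SYN was recorded
 * and the acknowledgment number matches the stored SN_C. */
bool syn_ack_allowed(uint32_t src_ip, uint32_t dst_ip,
                     uint16_t src_port, uint16_t dst_port, uint32_t an_s)
{
    for (int i = 0; i < TABLE_SIZE; i++) {
        const struct conn *e = &table[i];
        if (e->in_use &&
            e->src_ip == dst_ip && e->dst_ip == src_ip &&        /* reversed */
            e->src_port == dst_port && e->dst_port == src_port &&
            an_s == e->sn_c)   /* per the slide; real TCP acknowledges SN_C + 1 */
            return true;
    }
    return false;   /* no preceding SYN: drop the SYN/ACK */
}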
13
State Holding Attack
Assume a stateful TCP policy. The attacker (outside) sends SYN after SYN toward the inside:
1. SYN flood
2. Exhaust the firewall's state-table resources
3. Sneak a packet through once the table is full
14
Fragmentation

[Diagram: IPv4 header fields: Ver, IHL, TOS, Total Length, ID, DF, MF, Frag ID, ..., Data.]

The data is split into fragments, say n bytes each:
IP Hdr DF=0 MF=1 ID=0  Frag 1
IP Hdr DF=0 MF=1 ID=n  Frag 2
IP Hdr DF=1 MF=0 ID=2n Frag 3

DF: Don't Fragment (0 = OK to fragment, 1 = don't)
MF: More Fragments (0 = last fragment, 1 = more follow)
Frag ID = octet number (byte offset of the fragment within the original data)
15
Reassembly
[Diagram: Frag 1, Frag 2, and Frag 3 are reassembled into the original data using their offsets: Frag 1 at byte 0, Frag 2 at byte n, Frag 3 at byte 2n.]
16
Overlapping Fragment Attack

Assume the firewall policy allows incoming port 80 (HTTP) and blocks incoming port 22 (SSH).

[Diagram: TCP header layout: source port, destination port, sequence number, ...]

Packet 1: DF=1, MF=1, ID=0, carrying a TCP header with src port 1234 and dst port 80 (passes the policy).
Packet 2: DF=1, MF=1, ID=2, carrying data that overlaps packet 1's TCP header and rewrites the destination port to 22.

After reassembly at the end host, the connection goes to port 22, bypassing the policy.
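One common countermeasure is to refuse overlapping fragments outright. Below is a small sketch of that check, assuming byte-granularity offsets as in the slides (real IP encodes the fragment offset in 8-byte units) and hypothetical structures.

#include <stdbool.h>
#include <stdint.h>

struct frag {
    uint16_t id;      /* IP identification field */
    uint32_t offset;  /* starting byte of this fragment's data */
    uint32_t len;     /* length of this fragment's data */
};

/* Returns true if the new fragment overlaps any previously seen fragment
 * of the same packet, e.g., a second fragment rewriting the TCP ports. */
bool overlaps(const struct frag *seen, int nseen, const struct frag *f)
{
    for (int i = 0; i < nseen; i++) {
        if (seen[i].id != f->id)
            continue;
        uint32_t a0 = seen[i].offset, a1 = seen[i].offset + seen[i].len;
        uint32_t b0 = f->offset,      b1 = f->offset + f->len;
        if (a0 < b1 && b0 < a1)   /* byte ranges intersect */
            return true;
    }
    return false;
}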
17
Stateful Firewalls
Pros:
• More expressive
Cons:
• State-holding attacks
• Mismatch between the firewall's understanding of the protocol and the protected hosts' understanding
18
Application Firewall
Check protocol messages directly.
Examples:
– SMTP virus scanner
– Proxies
– Application-level callbacks

[Diagram: the firewall keeps state and inspects traffic all the way up to the application layer, between outside and inside.]
19
Firewall Placement
20
Demilitarized Zone (DMZ)
[Diagram: the firewall sits between Inside and Outside, with a separate DMZ segment hosting the externally reachable servers: WWW, NNTP, DNS, SMTP.]
21
Dual Firewall
[Diagram: Inside, interior firewall, DMZ (on a hub), exterior firewall, Outside.]
22
Design Utilities
Solsoft
Securify
23
References
Elizabeth D. Zwicky, Simon Cooper, and D. Brent Chapman, Building Internet Firewalls
William R. Cheswick, Steven M. Bellovin, and Aviel D. Rubin, Firewalls and Internet Security
24
Intrusion Detection and Prevention Systems
25
Logical Viewpoint
[Diagram: an IDS/IPS sits between Inside and Outside, examining each message m.]
For each message m, either:
• Report m (IPS: drop or log)
• Allow m
• Queue
26
Overview
• Approach: Policy vs. Anomaly
• Location: Network vs. Host
• Action: Detect vs. Prevent
27
Policy-Based IDS
Use pre-determined rules to detect attacks.
Examples: regular expressions (Snort), cryptographic hashes (Tripwire, Snort).

Example Snort rules:

Detect any fragments less than 256 bytes:
alert tcp any any -> any any (minfrag: 256; msg: "Tiny fragments detected, possible hostile activity";)

Detect IMAP buffer overflow:
alert tcp any any -> 192.168.1.0/24 143 (content: "|90C8 C0FF FFFF|/bin/sh"; msg: "IMAP buffer overflow!";)
28
Modeling System Calls [Wagner & Dean 2001]

[Automaton derived from the program: states Entry(f), Exit(f), Entry(g), Exit(g); edges labeled with the system calls open(), close(), exit(), getuid(), geteuid().]

f(int x) {
  if (x) { getuid(); } else { geteuid(); }
  x++;
}

g() {
  fd = open("foo", O_RDONLY);
  f(0);
  close(fd);
  f(1);
  exit(0);
}

An execution whose system-call sequence is inconsistent with the automaton indicates an attack.
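As a concrete flavor of this, here is a sketch of the simplest variant of such a model (the consecutive-pair, or digraph, idea): precompute which consecutive system-call pairs the program can make and flag any observed pair outside that set. The pair list below is read off one feasible execution of g() above and is purely illustrative; the real approach derives the automaton by static analysis of the whole program.

#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Consecutive call pairs along one feasible run of g():
 * open -> geteuid -> close -> getuid -> exit.
 * (A real static analysis would include both branches of f().) */
static const char *allowed_pairs[][2] = {
    {"open",    "geteuid"},
    {"geteuid", "close"},
    {"close",   "getuid"},
    {"getuid",  "exit"},
};

static bool pair_allowed(const char *prev, const char *cur)
{
    size_t n = sizeof allowed_pairs / sizeof allowed_pairs[0];
    for (size_t i = 0; i < n; i++)
        if (strcmp(allowed_pairs[i][0], prev) == 0 &&
            strcmp(allowed_pairs[i][1], cur) == 0)
            return true;
    return false;
}

/* A trace containing any disallowed pair is inconsistent with the model
 * and is flagged as a possible attack. */
bool trace_ok(const char **trace, int n)
{
    for (int i = 1; i < n; i++)
        if (!pair_allowed(trace[i - 1], trace[i]))
            return false;
    return true;
}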
29
Anomaly Detection
[Diagram: the IDS learns a distribution of "normal" events; each new event is compared against it and classified as Attack or Safe.]
30
Example: Working Sets
[Diagram: over days 1 to 300, Alice's traffic goes to a working set of hosts: reddit, xkcd, slashdot, fark. On day 300, a connection to a host outside that working set (labeled 18487 in the figure) stands out as anomalous.]
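A toy sketch of the working-set idea (hypothetical fixed-size set; a real detector would age entries out and bound the set differently): hosts inside the set are normal, a host outside it is flagged and then learned.

#include <stdbool.h>
#include <stdint.h>

#define WS_SIZE 64

struct working_set {
    uint32_t hosts[WS_SIZE];
    int      count;
};

static bool ws_contains(const struct working_set *ws, uint32_t host)
{
    for (int i = 0; i < ws->count; i++)
        if (ws->hosts[i] == host)
            return true;
    return false;
}

/* Returns true if the event is anomalous (outside the working set).
 * Hosts not yet in the set are learned when there is room. */
bool check_and_update(struct working_set *ws, uint32_t host)
{
    if (ws_contains(ws, host))
        return false;                  /* inside working set: normal */
    if (ws->count < WS_SIZE)
        ws->hosts[ws->count++] = host; /* learn the new host */
    return true;                       /* outside working set: flag it */
}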
31
Anomaly Detection
Pros:
• Does not require pre-determining a policy (can catch "unknown" threats)
Cons:
• Requires that attacks are not strongly similar to known-normal traffic
• Learning distributions is hard
Automatically Inferring the Evolution of Malicious Activity on the Internet
David Brumley, Carnegie Mellon University
Shobha Venkataraman, AT&T Research
Oliver Spatscheck, AT&T Research
Subhabrata Sen, AT&T Research
33
Evil is constantly on the move

[Diagram: a stream of labeled IPs, <ip1,+> <ip2,+> <ip3,+> <ip4,->, collected from across the Internet (spam havens, tier-1 networks, hosts A, E, K, ...). Labels come from SpamAssassin, IDS logs, etc.]

Goal: characterize regions changing from bad to good (Δ-good) or from good to bad (Δ-bad).
34
Research Questions
Given a sequence of labeled IPs:
1. Can we identify the specific regions on the Internet that have changed in malice?
2. Are there regions on the Internet that change their malicious activity more frequently than others?
35
Challenges
1. Infer the right granularity

[Diagram: the Internet topology: spam havens, tier-1 and tier-2 networks, DSL and corporate networks, hosts A, B, C, D, E, K, X.]

Per-IP granularity (e.g., Spamcop) is often not interesting.
Previous work used a fixed granularity.
36
Challenges
1. Infer the right granularity

[Diagram: the same topology, clustered at BGP granularity (e.g., network-aware clusters [KW'00]).]

Previous work: fixed granularity.
37
Challenges
1. Infer the right granularity

Idea: infer the granularity from the data: coarse granularity for some regions (e.g., a spam haven), medium granularity for others, and fine granularity for a well-managed network (e.g., a corporate network).
38
Challenges
1. Infer the right granularity
2. We need online algorithms: a fixed-memory device watching a high-speed link.
39
We Present
Given a sequence of labeled IPs:
1. Can we identify the specific regions on the Internet that have changed in malice? (Δ-Change)
2. Are there regions on the Internet that change their malicious activity more frequently than others? (Δ-Motion)
40
Background
1. IP prefix trees
2. TrackIPTree algorithm
41
[Diagram: the Internet topology again, with two prefixes highlighted: 1.2.3.4/32 and 8.1.0.0/24.]

IP prefixes: i/d denotes all IP addresses i covered by the first d bits.
Ex: 1.2.3.4/32 is a single host (all bits fixed).
Ex: 8.1.0.0/24 covers 8.1.0.0-8.1.0.255.
42
An IP prefix tree is formed by masking each bit of an IP address.

[Diagram: the root 0.0.0.0/0 (the whole net) splits into 0.0.0.0/1 and 128.0.0.0/1, then into /2 prefixes (0.0.0.0/2, 64.0.0.0/2, 128.0.0.0/2, 192.0.0.0/2), and so on down to /32 leaves such as 0.0.0.0/32 and 0.0.0.1/32 (single hosts).]
43
A k-IPTree classifier [VBSSS'09] is an IP prefix tree with at most k leaves, each leaf labeled good ("+") or bad ("-").

[Diagram: a 6-IPTree over the prefix tree above, with leaves labeled + and -. Ex: 64.1.1.1 falls under a "-" leaf, so it is classified bad; 1.1.1.1 falls under a "+" leaf, so it is classified good.]
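Classification with such a tree is just a walk down the IP's bits to the deepest matching prefix. A minimal sketch with an assumed node layout (every node carries the label of its best-matching prefix; only the tree shape and labels come from the slides). TrackIPTree, introduced next, is the piece that learns and relabels such a tree online while keeping at most k leaves.

#include <stddef.h>
#include <stdint.h>

struct ipnode {
    struct ipnode *child[2];  /* child[b]: prefixes whose next bit is b */
    int label;                /* +1 = good, -1 = bad for IPs stopping here */
};

/* Walk from the root following the IP's bits (most significant first)
 * until no deeper matching prefix exists, and return that node's label. */
int classify(const struct ipnode *root, uint32_t ip)
{
    const struct ipnode *n = root;
    for (int bit = 31; bit >= 0; bit--) {
        int b = (ip >> bit) & 1;
        if (n->child[b] == NULL)
            break;
        n = n->child[b];
    }
    return n->label;
}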
44
TrackIPTree Algorithm [VBSSS'09]
In: a stream of labeled IPs, e.g., ... <ip4,+> <ip3,+> <ip2,+> <ip1,->
Out: a k-IPTree

[Diagram: TrackIPTree consumes the stream and maintains a small labeled prefix tree (/1, /16, /17, /18, ... with +/- leaves).]
45
Δ-Change Algorithm
1. Approach
2. What doesn't work
3. Intuition
4. Our algorithm
46
Goal: identify online the specific regions on the Internet that have changed in malice.

Time is divided into epochs: epoch 1 yields IP stream s1, epoch 2 yields s2, and so on.

[Diagram: tree T1 for epoch 1 and tree T2 for epoch 2 over the same prefixes (/0, /1, /16, /17, /18), with some leaves flipping label between epochs.]

Δ-Bad: a change from good to bad.
Δ-Good: a change from bad to good.
47
Goal: identify online the specific regions on the Internet that have changed in malice.

[Diagram: the same pair of epoch trees T1 and T2.]

False positive: misreporting that a change occurred.
False negative: missing a real change.
48
Goal: identify online the specific regions on the Internet that have changed in malice.

Idea: divide time into epochs and diff.
• Use TrackIPTree on labeled IP stream s1 to learn T1
• Use TrackIPTree on labeled IP stream s2 to learn T2
• Diff T1 and T2 to find Δ-Good and Δ-Bad

This does not work: T1 and T2 may be learned at different granularities, so they cannot simply be diffed.
49
Goal: identify online the specific regions on the Internet that have changed in malice.

Δ-Change algorithm main idea: use classification errors between Ti-1 and Ti to infer Δ-Good and Δ-Bad.
50
Δ-Change Algorithm

[Diagram: TrackIPTree learns Ti-1 from stream Si-1 and Ti from stream Si. The previous tree Ti-1 is then held fixed and annotated with its classification error on Si-1 (giving Told,i-1) and on Si (giving Told,i). Because both annotations are based on the same tree, their (weighted) classification errors can be compared node by node to yield Δ-Good and Δ-Bad.]
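A rough sketch of the comparison step, with made-up thresholds and a simple localization rule (the paper's actual criteria and weighting are more careful): a node is reported only if it saw enough traffic in both epochs and its accuracy moved enough, and a change is attributed to the deepest prefixes that show it. Whether a reported change is Δ-good or Δ-bad follows from the labels of the leaves underneath.

#include <stdbool.h>
#include <stdio.h>

#define MIN_IPS   50      /* below this: "insufficient traffic"  */
#define MIN_DELTA 0.25    /* below this: "insufficient change"   */

struct node {
    const char *prefix;         /* e.g., "128.2.0.0/16"              */
    int    ips_old, ips_new;    /* IPs reaching this node per epoch  */
    double acc_old, acc_new;    /* classification accuracy per epoch */
    struct node *child[2];
};

static bool changed(const struct node *n)
{
    double d = n->acc_new - n->acc_old;
    return n->ips_old >= MIN_IPS && n->ips_new >= MIN_IPS &&
           (d > MIN_DELTA || -d > MIN_DELTA);
}

/* Report the deepest nodes whose error profile shifted; returns true if
 * anything in this subtree was reported, so ancestors stay quiet.
 * Note the single | so both subtrees are always visited. */
bool report_changes(const struct node *n)
{
    if (n == NULL)
        return false;
    bool below = report_changes(n->child[0]) | report_changes(n->child[1]);
    if (!below && changed(n)) {
        printf("delta-change at %s\n", n->prefix);
        return true;
    }
    return below;
}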
51
Comparing (Weighted) Classification Error

[Diagram, repeated across slides 51-54: the /16 node and its children in Told,i-1 (200 IPs at 40% accuracy at the /16; children with 150 IPs / 90%, 110 IPs / 95%, 40 IPs / 80%, 50 IPs / 30%) versus the same nodes in Told,i (170 IPs at 13%; children with 100 IPs / 10%, 80 IPs / 5%, 20 IPs / 20%, 70 IPs / 20%).]

• A large accuracy drop at a node with plenty of traffic signals a Δ-change somewhere in its subtree.
• Nodes whose accuracy barely moves show insufficient change and are not reported.
• Nodes that see too few IPs show insufficient traffic and are not reported.
• Recursing on the remaining nodes localizes the Δ-change to specific prefixes.
55
Evaluation
1. What are the performance characteristics?
2. Are we better than previous work?
3. Do we find cool things?
56
Performance
In our experiments, we:
– let k = 100,000 (k-IPTree size)
– processed 30-35 million IPs (one day's traffic)
– used a 2.4 GHz processor
We identified Δ-Good and Δ-Bad in under 22 minutes using under 3 MB of memory.
57
How do we compare to network-aware clusters (by prefix)?
We find 2.5x as many changes on average!
58
Spam: the Grum botnet takedown
59
Botnets: 22.1 and 28.6 thousand new DNSChanger bots appeared; 38.6 thousand new Conficker and Sality bots.
60
Caveats and Future Work

"For any distribution on which an ML algorithm works well, there is another on which it works poorly." – the "No Free Lunch" theorem

Our algorithm is efficient and works well in practice, but a very powerful adversary could fool it into having many false negatives. A formal characterization is future work.
61
Detection Theory
Base rates, fallacies, and detection systems
62
Let Ω be the set of all possible events. For example:
• Audit records produced on a host
• Network packets seen
63
Set of intrusion events I (I ⊆ Ω).
Intrusion rate: Pr[I], the probability that an event is an intrusion.

Example: an IDS received 1,000,000 packets; 20 of them corresponded to an intrusion. The intrusion rate Pr[I] is:
Pr[I] = 20/1,000,000 = .00002
64
Set of alerts A (A ⊆ Ω).
Alert rate: Pr[A], the probability that an event raises an alert.

Defn: Sound. An IDS is sound if every alert corresponds to an intrusion (A ⊆ I).
65
Defn: Complete. An IDS is complete if every intrusion raises an alert (I ⊆ A).
66
Defn: False positive: an alert raised with no intrusion (A ∩ ¬I).
Defn: False negative: an intrusion that raises no alert (I ∩ ¬A).
Defn: True positive: an intrusion that raises an alert (A ∩ I).
Defn: True negative: no intrusion and no alert (¬A ∩ ¬I).
67
Defn: Detection rate: Pr[A | I] = Pr[A ∩ I] / Pr[I].
Think of the detection rate as the set of intrusions raising an alert normalized by the set of all intrusions.
70
[Venn diagram: |I ∩ A| = 18, alerts without an intrusion |A \ I| = 4, intrusions without an alert |I \ A| = 2. Detection rate = 18 / (18 + 2) = 90%.]
72
Defn: Bayesian detection rate: Pr[I | A] = Pr[I ∩ A] / Pr[A]. This is the crux of IDS usefulness!
Think of the Bayesian detection rate as the set of intrusions raising an alert normalized by the set of all alerts (vs. the detection rate, which normalizes on intrusions).
74
[The same Venn diagram: |I ∩ A| = 18, |A \ I| = 4, |I \ A| = 2. Of the 22 alerts, 4 are false, so about 18% of all alerts are false positives!]
75
Challenge
We're often given the detection rate and know the intrusion rate, and want to calculate the Bayesian detection rate:
– a 99% accurate medical test
– a 99% accurate IDS
– a 99% accurate test for deception
– ...
76
Fact: Pr[I | A] = Pr[I ∩ A] / Pr[A] (the definition of conditional probability).
77
Calculating Bayesian Detection Rate
Fact: Pr[I | A] = Pr[I ∩ A] / Pr[A].
So to calculate the Bayesian detection rate, one way is to compute Pr[A ∩ I] and Pr[A] separately and divide.
78
Example
• 1,000 people are in the city.
• 1 is a terrorist, and we have their picture. Thus the base rate of terrorists is 1/1000.
• Suppose we have a new terrorist facial recognition system that is 99% accurate:
  – 99/100 times when someone is a terrorist, there is an alarm.
  – For every 100 good guys, the alarm only goes off once.
• An alarm went off. Is the suspect really a terrorist?
79
Example
Answer: The facial recognition system is 99% accurate. That means there is only a 1% chance the guy is not the terrorist.
Wrong!
80
Formalization
• 1 in 1,000 is a terrorist, and we have their picture. Thus the base rate of terrorists is P[T] = 0.001.
• 99/100 times when someone is a terrorist there is an alarm: P[A|T] = .99.
• For every 100 good guys, the alarm only goes off once: P[A|not T] = .01.
• We want to know P[T|A].
81
Same setup: P[T] = 0.001, P[A|T] = .99, P[A|not T] = .01, and we want P[T|A].
Intuition: given 999 good guys and a 1% false-alarm rate, we expect 999 * .01 ≈ 9-10 false alarms, against roughly one true alarm.
Guesses?
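Working it out with Bayes' rule (the same arithmetic the intuition above points to):

P[T|A] = P[A|T] P[T] / ( P[A|T] P[T] + P[A|not T] P[not T] )
       = (0.99 * 0.001) / (0.99 * 0.001 + 0.01 * 0.999)
       = 0.00099 / 0.01098
       ≈ 0.09

So even with a "99% accurate" system, an alarm means only about a 9% chance the suspect is a terrorist: roughly 1 true alarm against the 9-10 expected false alarms.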
82
To compute P[T|A] = P[T ∩ A] / P[A], both P[T ∩ A] and P[A] are still unknown.
83
Recall, to get Pr[A]:
Fact (law of total probability): Pr[A] = Pr[A|I] Pr[I] + Pr[A|¬I] Pr[¬I].
Proof sketch: A = (A ∩ I) ∪ (A ∩ ¬I), and the two pieces are disjoint.
85
...and to get Pr[A ∩ I]:
Fact: Pr[A ∩ I] = Pr[A|I] Pr[I].
Proof sketch: rearrange the definition of conditional probability, Pr[A|I] = Pr[A ∩ I] / Pr[I].
86
With both Pr[A ∩ I] and Pr[A] in hand, the Bayesian detection rate Pr[I|A] can now be computed.
87
88
For IDS
Let I be an intrusion and A an alert from the IDS, with:
– 1,000,000 messages per day processed
– 2 attacks per day
– 10 messages per attack

[Figure from Axelsson, RAID 99: Bayesian detection rate (true positives vs. false positives) as the false positive rate varies. 70% detection requires FP < 1/100,000; 80% detection generates 40% FP.]
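Plugging the numbers in makes the point concrete. The base rate is Pr[I] = (2 attacks * 10 messages) / 1,000,000 = 2 * 10^-5. Assuming, for illustration, a detector with Pr[A|I] = 0.7 and a false positive rate of Pr[A|¬I] = 10^-5 (exactly 1/100,000):

Pr[I|A] = Pr[A|I] Pr[I] / ( Pr[A|I] Pr[I] + Pr[A|¬I] Pr[¬I] )
        = (0.7 * 2*10^-5) / (0.7 * 2*10^-5 + 10^-5 * (1 - 2*10^-5))
        ≈ 1.4*10^-5 / 2.4*10^-5
        ≈ 0.58

Even at that tiny false positive rate, barely over half of the alerts correspond to real attacks, which is why usable detection requires false positive rates this low or lower.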
89
Conclusion
• Firewalls
  – 3 types: packet filtering, stateful, and application
  – Placement and DMZ
• IDS
  – Anomaly vs. policy-based detection
• Detection theory
  – Base rate fallacy
90
Questions?
END
92
Thought
93
Overview
• Approach: Policy vs. Anomaly
• Location: Network vs. Host
• Action: Detect vs. Prevent
Type Example
Host, Rule, IDS Tripwire
Host, Rule, IPS Personal Firewall
Net, Rule, IDS Snort
Net, Rule, IPS Network firewall
Host, Anomaly, IDS System call monitoring
Host, Anomaly, IPS NX Bit?
Net, Anomaly, IDS Working set of connections
Net, Anomaly, IPS