Listen and Whisper: How to verify BGP route updates?
Lakshmi
Joint work with:
Volker Roth, Ion Stoica, Scott Shenker, Randy Katz
A short BGP primer
• The Internet is composed of 14000 autonomous systems (ASes)
• ASes exchange route advertisements using BGP.
• Features of BGP
  – Path vector protocol
  – Uses local preference and hop-count as the distance metric
  – Supports policy routing
Route Verification problem?
• BGP assumes that the routes advertised by neighboring nodes are correct
• What if this assumption is violated?
  – An AS propagates spurious routes to a neighbor!
• Potential Causes
  – Accidental router misconfigurations
  – Malicious behavior
• What are the effects?
  – Drop packets and render a destination unreachable
  – Eavesdrop on the traffic to a given destination
  – Impersonate the destination
Why bother?
• Router misconfigurations are a common occurrence [Mahajan02]
  – Two major outages, in April 1997 and April 2001
• Router break-ins also occur regularly [Rob Thomas]
  – Many routers have open telnet interfaces
• “Evil” effects of a compromised node
  – Impersonation of an online banking system
  – Blackhole attack on root DNS servers
Causes and Effects
              Cause
Effect        Accidental   Malicious
Blackhole
Eavesdrop
Impersonate
Implication: Accidental problems can potentially be detected in the data plane
Goals and Assumptions
• Goal: Verify the correctness of BGP route updates
  – Minimize the harmful effects of spurious updates
  – Incrementally deployable, lightweight
  – Minimal modifications to BGP
• Assumptions
  – No PKI or any key distribution
  – Shared keys allowed across peering links
  – No dependence on a central authority (like ICANN)
Data plane vs Control Plane
• Router misconfigurations occur every day
  – Previous solutions mostly deal with the control plane
  – Few misconfigurations impact reachability [Mahajan02]
  – Some of them can cause serious outages lasting hours (April 97, April 01, Sept 02)
• Need a data plane component
  – Fast detection of reachability problems of popular prefixes
  – Stale routes: control plane is correct but data plane is not
    • UUNet not forwarding route advertisements
Listen: Passive TCP-Probing
• A router passively observes a TCP flow for SYN and DATA packets
  – If both are seen, the sender must have received the ACK => Route to destination is verifiable
• Does not work for malicious nodes
  – Malicious nodes can send ACKs for SYN, DATA packets
• Advantages
  – No modifications to BGP
  – Lightweight
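The passive check can be sketched as a small per-flow state machine. This is a minimal illustration, not the authors' implementation: the class name `ListenMonitor`, the packet fields passed to `observe`, and the documentation prefixes are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, Set


@dataclass
class FlowState:
    prefix: str
    saw_syn: bool = False
    saw_data: bool = False


class ListenMonitor:
    """Passively tracks TCP flows; hypothetical helper, not part of any BGP daemon."""

    def __init__(self) -> None:
        self.flows: Dict[str, FlowState] = {}

    def observe(self, flow_id: str, prefix: str, syn: bool, payload_len: int) -> None:
        """Record one packet sent by the source toward `prefix` (fields are assumed)."""
        state = self.flows.setdefault(flow_id, FlowState(prefix))
        if syn:
            state.saw_syn = True
        elif payload_len > 0 and state.saw_syn:
            # DATA after SYN: the sender only sends data once it has seen the ACK.
            state.saw_data = True

    def verifiable_prefixes(self) -> Set[str]:
        """Prefixes for which at least one flow showed SYN followed by DATA."""
        return {s.prefix for s in self.flows.values() if s.saw_syn and s.saw_data}


monitor = ListenMonitor()
monitor.observe("f1", "192.0.2.0/24", syn=True, payload_len=0)
monitor.observe("f1", "192.0.2.0/24", syn=False, payload_len=512)
monitor.observe("f2", "198.51.100.0/24", syn=True, payload_len=0)  # SYN only
print(monitor.verifiable_prefixes())
```

A flow that only ever shows SYN packets leaves its prefix unverified, which is the case the port-scanner discussion has to handle separately.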
What about port scanners?
• Port scanners may generate either merely SYN or SYN+DATA packets.
• Case 1: SYN+DATA
  – Active drop: Randomly drop a DATA packet and check for retransmissions.
  – Retransmit check: Check for number of retransmitted packets in a window.
  – Alternative: Delay packets at routers
• Case 2: only SYNs
  – Step 1: Try other alternative routes
  – Step 2: If no other source generates genuine TCP connections, the prefix is either unused or unreachable.
Results: Data from Tier-1 ISP
• Reachability problems for popular prefixes are detectable within 15 seconds to 1 minute
  – Only 700 prefixes are popular
• How many routes are verifiable?
  – Typical routing table has 130-140K entries, of which only 10K are active within a period of one hour
  – 3K over periods of 5 minutes
• Frequency of route changes?
  – 99% of the routes are stable for >1 hour
  – Need to verify only a few flows every hour
  – Specific prefixes are extremely unstable
Local Testbed Results
Number of Machines                     28
Probing Period                         40 days
Number of Prefixes                     11141 (9% of Table)
Verifiable Prefixes                    9711
Prefixes with incomplete connections   1460
Perennial problems                     42
Number of Failed Conn                  15321 (3433 unique)
Detected Problems (verified using Active probing)
• Specific Examples
  – Two local outages lasting more than one hour
  – 207.126.224.0/20 (Yahoo NET) observed regular problems
• Routing loops (detected using traceroute)
  – 51 different prefixes
• One prefix is perennially down: 193.148.15.0/24
• Forwarding problem: No entries in routing table
  – 64 different prefixes
• Generic routing problems
  – 87 different prefixes
False Negatives
• Outbound connections
  – 63.5% are false negatives
  – Primary sources:
    • Server not responding to HTTP connections
    • Buggy BGP daemon script
• Inbound connections
  – 91.83% are false negatives
  – Primary sources:
    • NetBIOS worm
    • Port-80 scanners
    • SQL Server vulnerability on port 1433
Listen: Summary
• Strengths
  – Problems with popular prefixes can be detected within a short period of time
  – Low overhead
  – Non-popular prefixes can be verified with a higher false positive ratio
• Limitations
  – False negatives do occur often due to worms
    • Need to be conservative in determining when routes are not verifiable
Reality…
• Data plane solutions do not work!
  – Malicious nodes can always impersonate the behavior of genuine nodes
• Triggering Alarms vs Identification
  – Without authentication, a node cannot distinguish between malicious and genuine speakers!
• Our Goals:
  – Detect route inconsistencies
  – Containment: A malicious node should not harm more than a small set of nodes.
What do we mean?
• Route Consistency Test: A router compares two routes R and S to a destination D:
  – If R and S are genuine routes, they should be consistent
  – If R is genuine and S is spurious, they should be inconsistent
  – If R and S are both spurious, they may be either consistent or inconsistent
• What does route consistency check give?
  – Trigger alarm if any node generates a spurious update.
• What does containment mean?
  – A malicious node should not have the capability to affect more than a few destinations
  – A malicious node attempting to cause widespread damage should be detected and isolated
Consistency test requirements
• Property 1: Malicious node should not be able to invent “spurious” advertisements that are also consistent.
• Property 2: A route advertisement modified by a malicious node should be inconsistent with genuine routes.
From Consistency to Containment
[Figure: topology with verifier V, nodes A, B, C, D, E, F and malicious node M; route label "A,B,C"]
If Verifier V notices multiple spurious routes from M, V can avoid routes through M.
How to check for consistency?
[Figure: nodes A, B, M and C; labels "A" and "A?"]
Example Problem
[Figure: nodes A, B, M and C; route labels "A,x" and "A,y", nonce x; check "x=y?"]
Solution: A uses nonce x.
[Figure: nodes A, B, M and C; route labels "A,x" and "A,x", nonce x]
What about this?
Using hash chains
[Figure: source S holds Secret=x; labels along the paths to destination D: h(x), h(x), h(h(x)), h(h(x)), h(h(h(x)))]
End result: A malicious node N hops away from source S can generate a spurious route of path length N. If the malicious node generates a shorter path, the hash values will not match. However, which of the mismatching routes is incorrect remains unknown.
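The hash-chain check can be sketched as follows. This is an illustration under stated assumptions rather than the Whisper implementation: SHA-256 stands in for h, a route of length k is assumed to carry h applied k times to the secret, and the function names are hypothetical.

```python
import hashlib


def h(value: bytes) -> bytes:
    """One application of the hash (SHA-256 is an assumption, not from the slides)."""
    return hashlib.sha256(value).digest()


def hash_n(value: bytes, n: int) -> bytes:
    """Apply h to `value` n times."""
    for _ in range(n):
        value = h(value)
    return value


def consistent(sig_r: bytes, len_r: int, sig_s: bytes, len_s: int) -> bool:
    """Hash the shorter route's signature forward by the length difference."""
    if len_r > len_s:
        sig_r, len_r, sig_s, len_s = sig_s, len_s, sig_r, len_r
    return hash_n(sig_r, len_s - len_r) == sig_s


secret = b"x"                      # known only to the source S
two_hop = hash_n(secret, 2)        # signature carried by a genuine 2-hop route
three_hop = hash_n(secret, 3)      # signature carried by a genuine 3-hop route

genuine_ok = consistent(two_hop, 2, three_hop, 3)
# A node 3 hops away only knows three_hop; claiming a 1-hop route cannot match.
forged_ok = consistent(three_hop, 1, two_hop, 2)
print(genuine_ok, forged_ok)
```

The check only says that the two routes disagree; it gives no indication of which one is spurious.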
Embed path in hash-chains?
[Figure: source S (Secret=x) and destination D, connected through A, B, C and through U, V; hashes h1=h(x,S), h2=h(h1,A), h3=h(h2,B), h4=h(h1,U)]
End result: One malicious node cannot lie. However, two colluding malicious nodes can fake a link.
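For two routes over different paths to be comparable, h has to let the verifier bring both signatures to a common value. The slides do not spell out h; one construction with that property, assumed here purely for illustration, is modular exponentiation, where h(s, AS) = s^AS mod N and the order of exponents does not matter. The modulus, generator, numeric AS identifiers and function names below are all hypothetical toy values.

```python
from math import prod
from typing import Dict, List

# Toy parameters (assumed): product of two small primes, far too small for real use.
N = 1019 * 2039
G = 4
SECRET = 12345
AS_ID: Dict[str, int] = {"S": 3, "A": 5, "B": 7, "C": 11, "U": 13, "V": 17}


def ids(path: List[str]) -> List[int]:
    return [AS_ID[name] for name in path]


def route_signature(path: List[str]) -> int:
    """h1 = h(x, S) at the origin, then each later AS folds in its own ID."""
    sig = pow(G, SECRET * AS_ID[path[0]], N)
    for as_id in ids(path[1:]):
        sig = pow(sig, as_id, N)
    return sig


def consistent(sig_r: int, path_r: List[int], sig_s: int, path_s: List[int]) -> bool:
    """Raise each signature to the other route's IDs; genuine routes then agree."""
    return pow(sig_r, prod(path_s), N) == pow(sig_s, prod(path_r), N)


path_r = ["S", "A", "B", "C"]
path_s = ["S", "U", "V"]
sig_r, sig_s = route_signature(path_r), route_signature(path_s)

genuine_ok = consistent(sig_r, ids(path_r), sig_s, ids(path_s))
# V hides U and advertises the path S,V, but its signature still contains U.
forged_ok = consistent(sig_r, ids(path_r), sig_s, ids(["S", "V"]))
print(genuine_ok, forged_ok)
```

In this sketch a single node cannot strip an upstream AS out of the signature it received, which mirrors the "one malicious node cannot lie" claim; two colluding nodes can still agree to insert each other's identifiers.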
Implementing in BGP
• Use Community attributes
  – Require two signature attributes
    • Seed value for the hash (512-bit, 1024-bit or 2048-bit)
    • Hash Signature (512-bit, 1024-bit or 2048-bit)
  – Each Community attribute uses 32 bits
  – Split each Signature attribute between multiple community attributes
• Our Implementation:
  – Hash library uses RSA-like signatures built on top of the OpenSSL library
  – Whisper library integrated with Zebra version 0.93b bgpd implementation
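Splitting a signature across 32-bit community values could look like the sketch below. The function names and the big-endian word order are assumptions; how the pieces are tagged and ordered on the wire is not described in the slides and is ignored here.

```python
from typing import List

WORD_BYTES = 4  # each community attribute uses 32 bits


def split_signature(sig: int, sig_bits: int) -> List[int]:
    """Cut a fixed-width signature into 32-bit words, most significant first."""
    raw = sig.to_bytes(sig_bits // 8, "big")
    return [int.from_bytes(raw[i:i + WORD_BYTES], "big")
            for i in range(0, len(raw), WORD_BYTES)]


def join_signature(words: List[int]) -> int:
    """Reassemble the signature from its 32-bit words."""
    return int.from_bytes(b"".join(w.to_bytes(WORD_BYTES, "big") for w in words), "big")


sig = (1 << 1023) | 0xDEADBEEF      # hypothetical 1024-bit value
words = split_signature(sig, 1024)
print(len(words))                   # 1024 / 32 words
```

Under this layout a 512-bit value needs 16 community values, a 1024-bit value 32, and a 2048-bit value 64, for each of the two signature attributes.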
Cost of RSA-based operations
              512-bit      1024-bit     2048-bit
VerifySign    0.18 msec    0.45 msec    1.42 msec
UpdateSign    0.25 msec    0.6 msec     1.94 msec
GenSign       0.4 sec      8.0 sec      68 sec
• For 1024-bit keys, process rate >100,000 adv/minute
• BGP maximum update rate is 9300 adv/min (avg=130)
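The rate figures can be checked with a few lines of arithmetic. The sketch assumes one operation per advertisement for the per-operation rates, and separately shows the more conservative case where every advertisement costs one VerifySign plus one UpdateSign; which of these the process-rate bullet refers to is not stated.

```python
MS_PER_MINUTE = 60_000
COST_MS_1024 = {"VerifySign": 0.45, "UpdateSign": 0.6}  # from the 1024-bit column
PEAK_ADV_PER_MIN = 9300                                  # BGP maximum update rate


def ops_per_minute(cost_ms: float) -> float:
    """Operations that fit in one minute at the given per-operation cost."""
    return MS_PER_MINUTE / cost_ms


rates = {name: ops_per_minute(ms) for name, ms in COST_MS_1024.items()}
combined = ops_per_minute(sum(COST_MS_1024.values()))  # verify + update per advertisement
headroom = combined / PEAK_ADV_PER_MIN

for name, rate in rates.items():
    print(f"{name}: {rate:,.0f} per minute")
print(f"verify+update: {combined:,.0f} per minute, {headroom:.1f}x the peak update rate")
```

Even in the combined case the processing rate stays several times above the peak update rate, so the per-advertisement cost is not the bottleneck; GenSign, at seconds per operation, is a different order of magnitude.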
Conclusions
• We identified 2 causes for spurious route advertisements
  – Misconfigurations, malicious behavior
• Harmful effects
  – Blackhole, impersonation, eavesdrop
• Remedies
  – Misconfigurations: TCP probing
  – Malicious behavior: Whisper protocols with penalty functions
Vulnerability metric
[Figure: example topology with malicious node M and destination D; other labels s, r, A, B, C; legend: Affected Node, Malicious, Unaffected]
How much harm can one malicious node do?
affect(D,M) = # affected / # nodes
Compute the distribution of affect(D,M) over all D
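One way to evaluate affect(D,M) on a topology is sketched below. The routing model is an assumption made for the example, not taken from the slides: every node picks the route with the fewest hops, M falsely originates D's prefix, and a node counts as affected when M is strictly closer to it than D. The graph and function names are hypothetical.

```python
from collections import deque
from math import inf
from typing import Dict, List

Graph = Dict[str, List[str]]


def hop_counts(graph: Graph, start: str) -> Dict[str, int]:
    """Breadth-first search: hop distance from `start` to every reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nbr in graph[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist


def affect(graph: Graph, d: str, m: str) -> float:
    """Fraction of all nodes that would prefer M's spurious route over the route to D."""
    to_d = hop_counts(graph, d)
    to_m = hop_counts(graph, m)
    affected = [n for n in graph
                if n not in (d, m) and to_m.get(n, inf) < to_d.get(n, inf)]
    return len(affected) / len(graph)


# Hypothetical chain topology: D - A - B - C - M
chain: Graph = {"D": ["A"], "A": ["D", "B"], "B": ["A", "C"], "C": ["B", "M"], "M": ["C"]}
score = affect(chain, "D", "M")
print(score)
```

Repeating the computation for every destination D, with M fixed, gives the distribution the slide refers to.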
Graph Containment Problem
Model: A graph with a core and multiple satellites
[Figure: a core surrounded by satellites; malicious node M and good nodes G]
Problem: A malicious node in a satellite should not be able to affect good nodes in other satellites.
What if hashes mismatch?
[Figure: routes R and S; nodes a, b, v, d]
If the hashes of routes R and S do not match: penalize both R and S
Penalty (route R):
  For every vertex “x” in R (inclusive of end-points)
    Increment penalty m(x) by 1
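Read literally, the simple penalty rule amounts to the sketch below. The function name and the two example routes are hypothetical; the vertex names only borrow the figure's labels.

```python
from collections import Counter
from typing import Iterable


def penalize_mismatch(m: Counter, route_r: Iterable[str], route_s: Iterable[str]) -> None:
    """On a hash mismatch, add 1 to m(x) for every vertex x on each of the two routes."""
    for route in (route_r, route_s):
        for x in route:
            m[x] += 1


m: Counter = Counter()
# Hypothetical routes between v and d that failed the consistency test.
penalize_mismatch(m, ["v", "a", "d"], ["v", "b", "d"])
print(dict(m))
```

Vertices that lie on both routes, here the end-points, are incremented once per route, so they accumulate penalty faster than vertices on a single route.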
Problems with simple penalty
[Figure legend: Malicious, Appear Malicious, First probe]
A malicious node can make many other nodes appear to be malicious
Penalize sub-path
[Figure: sub-paths R and S; nodes P, Q, A]
Identify sub-paths where loop-tests cannot be performed
Penalize sub-paths alone (e.g. R and S)
Renormalize penalties
[Figure: two panels, each with nodes A, B, C, D, E and node P]
For P, it is hard to differentiate between A, B and C as to which node is malicious. However, P can deduce that D and E may not be malicious.