Analyzing Network Traffic in the Presence of Adversaries

Vern Paxson

International Computer Science Institute /

Lawrence Berkeley National Laboratory

[email protected] / [email protected]

October 18, 2004

Roadmap

• In today’s Internet, attacks are the norm

– Adversaries can create fundamental problems for network traffic analysis

• #1 problem: evasion by ambiguity

• “Active mapping” to resolve ambiguities

• “Normalization” to eliminate ambiguities

• #2: flooding directed at network devices

• How to design robust analysis hardware

[Figures: growth plots annotated “= 80% growth/year”, “= 60% growth/year”, and “= 596% growth/year”. Data courtesy of Rick Adams; courtesy Mark Dedlow.]

In Today’s Internet, Attacks are the Norm

Great interest in watching network traffic and analyzing what it’s doing

• Watching: monitor traffic at chokepoints, capture copy or perhaps intercept

• Analyzing: reconstruct protocol layers as seen by endpoints, interpret semantics

• How hard can it be?

• Attackers are adversaries: they don’t want to be caught and they want to make it painful for us to operate

The Problem of Evasion

• Evasion raises fundamental problems

• Network traffic seen from within a network is inherently ambiguous.

• Analyzing network traffic at a high semantic level requires extensive state … which an adversary can target.

• Consider a network intrusion detection system (IDS; “Bro”) detecting occurrences of the string “root” inside a network connection

(Let’s disregard the wholly separate issue of false positives: whether this is a good “signature”)

Detecting “root”: Attempt #1

• Method: scan each packet for ‘r’, ‘o’, ‘o’, ‘t’

– Perhaps using Boyer-Moore, Aho-Corasick, Bloom filters …

[Diagram: packet 1 carries “…root…”]

But: TCP doesn’t preserve text boundaries

[Diagram: packet 1 ends with “…ro”; packet 2 begins with “ot…”]
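A minimal sketch of why Attempt #1 fails, assuming packets are modeled as plain byte strings (the function name and sample payloads are illustrative, not from the talk):

```python
def scan_per_packet(packets, signature=b"root"):
    """Return True if any single packet payload contains the signature."""
    return any(signature in payload for payload in packets)


# Hypothetical payloads: the same text sent in one packet vs. split across two.
whole = [b"....root...."]
split = [b"....ro", b"ot...."]

print(scan_per_packet(whole))  # True
print(scan_per_packet(split))  # False: the match straddles a packet boundary
```

The sender chooses where packet boundaries fall, so a per-packet matcher can be sidestepped simply by splitting the string.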

Detecting “root”: Attempt #2

• Method: remember match from end of previous packet

[Diagram: packet 1 (“…ro”) + packet 2 (“ot…”) together match]

But: TCP doesn’t guarantee in-order arrival

[Diagram: packets 1 (“…ro”) and 2 (“ot…”) arriving out of order: match?]

– Now we’re managing state
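Attempt #2 can be sketched as a matcher that carries the tail of the previous packet forward. This is one simple way to "remember" a partial match, assumed here for illustration; the class and function names are hypothetical.

```python
class CarryOverMatcher:
    """Remembers the last len(signature)-1 bytes seen, in arrival order."""

    def __init__(self, signature=b"root"):
        self.signature = signature
        self.tail = b""  # per-connection state we now have to manage

    def feed(self, payload):
        data = self.tail + payload
        found = self.signature in data
        keep = len(self.signature) - 1
        self.tail = data[-keep:] if keep else b""
        return found


def detect(packets):
    matcher = CarryOverMatcher()
    return any(matcher.feed(p) for p in packets)


print(detect([b"....ro", b"ot...."]))  # True: split match is now caught
print(detect([b"ot....", b"....ro"]))  # False: same bytes, reordered arrival
```

The second call shows the remaining gap: the matcher sees bytes in arrival order, not in stream order.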

Detecting “root”: Attempt #3

• Method: reassemble entire byte stream

– Keep track of full TCP connection state

– So much for “simple”

– What happens if we run out of memory?

• And:

– Still evadable …
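A rough sketch of Attempt #3, assuming segments are given as (relative sequence number, payload) pairs starting at 0 and ignoring sequence wraparound, connection setup, and teardown. The function name is hypothetical.

```python
def reassemble(segments):
    """Return the contiguous prefix of the byte stream.

    segments: iterable of (relative_seq, payload) in any arrival order.
    Bytes already placed are kept; later copies only contribute new bytes.
    """
    stream = bytearray()
    for seq, payload in sorted(segments, key=lambda s: s[0]):
        if seq > len(stream):
            break  # a gap: nothing beyond it is contiguous yet
        stream.extend(payload[len(stream) - seq:])
    return bytes(stream)


reordered = [(6, b"ot...."), (0, b"....ro")]
print(b"root" in reassemble(reordered))  # True
```

Every segment must be held until the stream is contiguous, which is exactly the per-connection memory an adversary can target.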

Evading Detection Via Ambiguous TCP Retransmission

It’s Not Just TTL Expiration

• Systematic study (w/ M. Handley & C. Kreibich) to analyze ambiguous protocol fields:

– 73 exploitable ambiguities in IP/TCP/UDP/ICMP

– E.g., control flags, flow control window, “don’t fragment”, old timestamps, service class, redundant length field, filtering on unused bits

– Internet protocols not designed for analysis

– Attacker toolkits already exist for exploiting these

• Answer: alert upon seeing ambiguous traffic?

The Problem of “Crud”

• Unfortunately, ambiguities occur in benign traffic, too:

– Legitimate tiny fragments, overlapping fragments

– Receivers that acknowledge data they did not receive

– Senders that retransmit different data than originally sent

• In a diverse traffic stream, you will see these:

– What is the intent?

• Loss of alert precision: “Maybe there’s an attack”

Countering Evasion-by-Ambiguity: Active Mapping

• Idea (w/ Umesh Shankar, UCB): Probe end-host in advance to resolve vantage-point ambiguities

– E.g., how many hops to it?

– E.g., how does it resolve ambiguous retransmissions?

– Gray-box testing
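To illustrate why the monitor needs to know how a host resolves ambiguous retransmissions, here is a small sketch with two simplified, assumed policies ("first copy wins" and "last copy wins"). Real stacks may behave in other ways, and the function name is hypothetical.

```python
def resolve(segments, policy):
    """Rebuild a stream from (relative_seq, payload) pairs in arrival order.

    policy: "first" keeps the earliest copy of each byte,
            "last" keeps the most recent copy.
    Assumes the segments leave no gaps.
    """
    stream = {}
    for seq, payload in segments:
        for i, byte in enumerate(payload):
            pos = seq + i
            if policy == "last" or pos not in stream:
                stream[pos] = byte
    return bytes(stream[p] for p in sorted(stream))


# Bytes 2-3 are sent twice with different contents.
ambiguous = [(0, b"ro"), (2, b"XX"), (2, b"ot")]
print(resolve(ambiguous, "first"))  # b'roXX'
print(resolve(ambiguous, "last"))   # b'root'
```

A monitor assuming the wrong policy reconstructs a different stream than the end host does. Probing each host in advance tells the monitor which interpretation to apply.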

Mapping Setup

Gray-box Inference of Reassembly Policy

A Plethora of Inferred Policies

Issues for Active Mapping

• Probing for most ambiguities requires eliciting a response

– Some hosts won’t respond when not actively engaged

– For some responses, need to trick host into echoing back what it saw

• Have to take churn into account

– At a large site, something’s always changing

– Lack of identity due to NAT, DHCP

– Our implementation takes ≈ 5 sec/host

Countering Evasion-by-Ambiguity: Normalization

• Idea (w/ Mark Handley, Christian Kreibich): Introduce network element to rewrite traffic passing through it to eliminate ambiguities

– E.g., regenerate low TTLs (dicey!)

– E.g., regularize flags, unused fields

– E.g., trim out-of-window data

– E.g., reassemble streams & remove inconsistent retransmissions
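A toy sketch of three of these normalizations applied to a single segment. The `Segment` fields, the `normalize` function, and its parameters are assumptions made for illustration and do not reflect the actual normalizer's design. Raising the TTL is included only because the list mentions it, and the list itself flags it as dicey.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Segment:
    ttl: int
    reserved_bits: int
    seq: int          # relative sequence number of the first payload byte
    payload: bytes


def normalize(seg, min_ttl, window_start, window_size):
    """Rewrite a segment so downstream observers agree on its meaning."""
    # Trim payload to the receiver's window [window_start, window_start + size).
    lo = max(seg.seq, window_start)
    hi = min(seg.seq + len(seg.payload), window_start + window_size)
    payload = seg.payload[lo - seg.seq:hi - seg.seq] if hi > lo else b""
    return replace(
        seg,
        ttl=max(seg.ttl, min_ttl),   # regenerate low TTLs
        reserved_bits=0,             # regularize unused fields
        seq=lo if payload else seg.seq,
        payload=payload,
    )


raw = Segment(ttl=2, reserved_bits=0b101, seq=90, payload=b"0123456789ABCDEFGHIJ")
print(normalize(raw, min_ttl=64, window_start=100, window_size=5))
```

After rewriting, the monitor and the end host see the same bytes regardless of how the host would have treated the original fields.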

Issues for Normalization

• Effect on end-to-end semantics?

– Some normalizations harmless (e.g., inconsistent streams)

– Some actually improve protocol (e.g., reliable RSTs)

– Some degrade performance in the presence of cold start (e.g., stripping TCP window scaling)

• Performance: element is in-line

– Prototype (1.1 GHz): 400 Mbps

– Would like to use custom hardware …

Robust Hardware for Analyzing Traffic in the Presence of Adversaries

• Ongoing work w/ Sarang Dharmapurikar (WUSTL)

• Basic building-block for boosting network analysis: in-line TCP stream reassembly

– If data arrives in-sequence, hand it to analyzer module

– If data arrives out-of-sequence, it creates a “hole”

• Buffer for later delivery

• How hard can it be?
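The building block can be sketched in software as follows, assuming relative sequence numbers starting at 0 and segments that never partially overlap. The class and method names are hypothetical, and the actual design targets hardware.

```python
class InlineReassembler:
    """Deliver in-sequence data immediately; buffer data behind a hole."""

    def __init__(self, deliver):
        self.deliver = deliver   # analyzer callback taking a bytes payload
        self.next_seq = 0        # next byte the analyzer expects
        self.held = {}           # seq -> payload waiting behind a hole

    def on_segment(self, seq, payload):
        if seq > self.next_seq:
            self.held.setdefault(seq, payload)  # out-of-sequence: a hole
            return
        if seq < self.next_seq:
            return  # already delivered; ignored in this sketch
        self.deliver(payload)
        self.next_seq += len(payload)
        # The hole may now be filled: drain anything that became in-sequence.
        while self.next_seq in self.held:
            chunk = self.held.pop(self.next_seq)
            self.deliver(chunk)
            self.next_seq += len(chunk)

    def buffered_bytes(self):
        return sum(len(p) for p in self.held.values())


seen = []
r = InlineReassembler(seen.append)
r.on_segment(4, b"ot")     # arrives early, buffered behind a hole
r.on_segment(0, b"..ro")   # fills the hole, both chunks delivered in order
print(b"".join(seen))      # b'..root'
```

`buffered_bytes` is the quantity an adversary can inflate by deliberately creating holes.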

How Much Buffer for Holes Do We Need?

• Most previous work says: “Zero”

– Skip out-of-sequence packets

• Commercial work says: “Yes”

– Claim out-of-sequence packets buffered, but with no details

• Answer for sound operation depends critically on whether we consider adversaries …

Measured Buffer Required Per-Hole

Measured Duration of Holes

Instantaneous Aggregate Hole Buffer

How Much Buffer for Holes Do We Need?, cont’d

• Trace analysis says: a few hundred KB suffices even for a large site’s access link ...

• … But: an adversary can maliciously create holes, overflowing the buffer. On overflow, we can either:

– Stop analyzing evicted connection, allowing adversary to evade

– Kill unanalyzable connection, allowing adversary to inflict collateral damage

Adversary-Resistant Stream Reassembly

• Trace analysis also says:

– Very few connections have concurrent holes

• Can limit adversary to one hole per connection

– No hosts have concurrent connections w/ holes

• Can limit adversary to one hole per zombie

• Consider randomized eviction:

– If buffer size >> requirements of legit connections, then most evictions evict the attacker’s own holes
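The intuition behind randomized eviction can be checked with a small Monte Carlo sketch. It assumes steady-state occupancy (legitimate holes always fill a fixed number of pages, with the attacker filling the rest) and a uniformly random victim page. The function name and the page counts are illustrative only.

```python
import random


def legit_eviction_fraction(total_pages, legit_pages, evictions, seed=0):
    """Fraction of uniformly random evictions that hit a legitimate page."""
    rng = random.Random(seed)
    # Pages [0, legit_pages) are legitimate; the rest hold attacker holes.
    hits = sum(rng.randrange(total_pages) < legit_pages for _ in range(evictions))
    return hits / evictions


# Hypothetical sizes: legitimate traffic needs 1% of the buffer.
frac = legit_eviction_fraction(total_pages=10_000, legit_pages=100, evictions=100_000)
print(f"{frac:.4f}")  # close to legit_pages / total_pages
```

Under these assumptions the legitimate share of evictions tracks the ratio of legitimate pages to total pages, so over-provisioning the buffer pushes most evictions onto the attacker.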

Zombie Equations

• Let:

– M, P = total memory (pages) available for holes

– M_l, P_l = memory (pages) for legitimate holes

– e = tolerable eviction rate for legit. connections

– r = rate at which a zombie can transmit (bytes/sec)

– g = page size (granularity) for hole buffer

– Z = # of zombies required to achieve eviction rate

• Then for attacker creating small/large holes:

Zombie Implications

• If we only terminate connections with > 2 packets buffered; allow each connection 10 KB of buffer; and use 512 MB DRAM …

• … then collateral damage rate X of legitimate connections terminated per second is:

By throwing memory at the problem, we can weather a large attack

Summary

• The lay of the land has changed

– Ecosystem of endemic hostility

• Adversaries can exploit ambiguity and pressures of holding state to evade detection or inflict collateral damage

• Internet protocols not designed with “wire analysis” in mind …

• … But it is possible to design to address these issues if they are properly considered

Summary, cont’d

• Network analysis amidst adversaries is a new area:

– Did not talk about: application-level evasion, polymorphism, tunneling, compromising passive monitors

– In many ways, reminiscent of Internet measurement a decade ago:

• Low-hanging fruit

• Daunting problems

• Fun!