Exploiting Temporal Persistence to Detect Covert Botnet Channels. Frederic Giroire (CNRS, France) , Jaideep Chandrashekar, Nina Taft, Eve Schooler, and Dina Papagiannaki (Intel Research) RAID’09. Outline. Introduction Temporal Persistence Design and Implementation - PowerPoint PPT Presentation
Exploiting Temporal Persistence to Detect Covert Botnet Channels
Frederic Giroire (CNRS, France), Jaideep Chandrashekar, Nina Taft, Eve Schooler, and Dina Papagiannaki (Intel Research)
RAID’09
2009/9/4 Speaker: Li-Ming Chen 2
Outline
Introduction
Temporal Persistence
Design and Implementation
Dataset and Evaluation
Conclusion and Comments
Botnet
A botnet is a collection of compromised end hosts
  Controlled by a bot-master
  Through a command and control (C&C) channel
  Used to launch various malicious activities: DDoS, spamming, stealing private data, etc.
Why are botnets so common and dangerous?
  Low maintenance cost and ease of use (e.g., through IRC)
  Non-technical criminals can buy or rent botnets
  Botnet-based underground economy
Botnet Detection
Traditional intrusion detection:
  Misuse detection
    Drawback: only works for known attacks, and is easy to evade
  Anomaly detection
    Can detect activated zombie hosts
    But only with a delay: the time between a host joining a botnet and being instructed to carry out a malicious task
Directions for mitigating the botnet problem
  1.) Prevent recruitment
  2.) Detect the covert C&C channel (the focus of this paper)
  3.) Detect attacks being carried out by the bots
Botnet Detection (related work)
Anomaly-based detection of IRC channels (based on protocol/payload analysis)
BotHunter: chains together various alarms to detect a whole (or partial) botnet lifecycle (USENIX Security '07)
BotSniffer: focuses on detecting the C&C server (NDSS '08)
  Similar behaviors toward the same destination (centralized botnet)
BotMiner: clusters attack traffic and normal (C&C) traffic, then performs cross-clustering to identify hosts that undertake both kinds of communication (USENIX Security '08)
Objective of this Paper
Aim: detect botnet C&C communications on an end host
  Define “destination atoms”
  Measure the temporal regularity (persistence) of individual destination atoms on each end host
  Identify suspicious C&C communications
Compared to other detection techniques:
  Does not attempt to identify attack traffic in the traffic stream
  Does not attempt to correlate activities across hosts
Observations
C&C traffic:
  Each bot needs to communicate regularly with a C&C server
  This behavior is common across different bots
This C&C communication may be very stealthy
  To avoid being detected
However, without “frequent” communication with a C&C server, the bot becomes invisible to the bot-master
  So the bot still needs to maintain this communication over time
C&C communication may be infrequent but persistent
Observations (cont’d)
Normal communications
  On any particular day, an end host may communicate with a large set of destination endpoints
  However, most of these destinations are transient
    Contacted a few times and never again
  A smaller, stable set of destinations is visited regularly
    Work-related sites, news/entertainment websites, sites contacted by applications
Need to distinguish C&C traffic from these
Approach (how to exploit temporal persistence to detect botnet channels)
Introduce a notion called “destination atoms” and a metric called “persistence” to capture lightweight yet regular communication
Training:
  Persistent destination atoms are added to a host’s whitelist during a training period
  The whitelist requires infrequent updating (because its entries are persistent)
Test:
  Track the persistence of new destination atoms that are not already whitelisted, to identify C&C traffic and destinations
  For stealthy attacks: track persistence at multiple timescales concurrently
Destination Atoms
A destination atom is an aggregation of destinations
  What matters is the network service being connected to, not so much the actual destination IP address
  E.g., the particular addresses that respond to google.com vary by location and time (but the user just wants to access the Google service)
Mapping:
  Given (dstIP, dstPort, proto), obtain the atom (dstService, dstPort, proto)
Example of Destination Atoms
Destination Atoms contacted by somehost.intel.com
How to extract Services? (by heuristic)
1. If the source and destination belong to different domains, the service name is simply the 2nd-level domain name of the destination
  E.g., google.com, yahoo.com
2. If the source and destination belong to the same domain, the service name is the 3rd-level domain name
  E.g., mail.intel.com, print.intel.com
How to extract Services? (by heuristic) (cont’d)
3. Use application-level information (when higher-level application semantics are available)
  E.g., destination atom for an FTP service: (ftp.service.com, 21:>1024, tcp)
4. Use the destination port to distinguish services on a single destination host (one that provides a number of distinct services)
5. When an address cannot be mapped to a name, use the IP address as the service name
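A minimal sketch of how heuristics 1, 2 and 5 might be coded. It is not the authors' implementation: the function names are hypothetical, the reverse DNS lookup is assumed to have been done by the caller (`dst_name` is `None` when no name is known), and the "2nd-level domain" is taken naively as the last two labels, which would mishandle suffixes such as co.uk. Heuristic 3 is not modelled, and heuristic 4 is covered only by keeping the port in the tuple.

```python
import ipaddress


def domain_labels(name):
    """Split a DNS name into lowercase labels, ignoring a trailing dot."""
    return name.rstrip(".").lower().split(".")


def service_name(src_name, dst_name, dst_ip):
    """Derive a service name from the source and destination host names."""
    if not dst_name:
        # Heuristic 5: no name available, fall back to the IP address.
        return str(ipaddress.ip_address(dst_ip))
    src = domain_labels(src_name)
    dst = domain_labels(dst_name)
    # Assumption: two hosts share a domain when their last two labels match.
    same_domain = len(src) >= 2 and src[-2:] == dst[-2:]
    # Heuristic 2 keeps three labels, heuristic 1 keeps two.
    keep = 3 if same_domain and len(dst) >= 3 else 2
    return ".".join(dst[-keep:])


def destination_atom(src_name, dst_name, dst_ip, dst_port, proto):
    """Map (dstIP, dstPort, proto) to the atom (dstService, dstPort, proto)."""
    return (service_name(src_name, dst_name, dst_ip), dst_port, proto)
```

For example, a connection from somehost.intel.com to www.google.com on TCP port 80 would map to the atom (google.com, 80, tcp), while a connection to mail.intel.com would keep the third label.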
Persistence
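[Figure: Host A generates outgoing traffic; a sliding observation window W is divided into measurement windows s]
$W \equiv [s_1, s_2, \ldots, s_n]$
The persistence of a destination atom $d$ in the observation window $W$ is defined as:
$p(d, W) = \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}_{d, s_i}$
  $\mathbf{1}_{d, s_i}$ is an indicator function: it returns 1 if the host made at least one connection to $d$ during $s_i$, and 0 otherwise
$d$ is persistent if $p(d, W) > p^*$
  $p^*$ is a pre-defined threshold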
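As a small illustration, the sketch below assumes persistence is the fraction of the n measurement windows in which the atom was contacted at least once, and that an atom is persistent when this fraction exceeds p*. The function names are hypothetical, and the default p* = 0.6 is the value selected later in the talk.

```python
def persistence(slot_counts):
    """Fraction of measurement windows with at least one connection to the atom."""
    if not slot_counts:
        raise ValueError("observation window must contain at least one slot")
    hits = sum(1 for count in slot_counts if count > 0)  # indicator function
    return hits / len(slot_counts)


def is_persistent(slot_counts, p_star=0.6):
    """True when the persistence strictly exceeds the threshold p*."""
    return persistence(slot_counts) > p_star
```

With connection counts [3, 0, 1, 1, 0, 2, 1, 0, 1, 1] over n = 10 windows, the atom was seen in 7 windows, so its persistence is 0.7 and it counts as persistent at p* = 0.6. Note that the number of connections inside a window does not matter, only whether there was at least one.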
Persistence in Multiple Timescales
Botnets differ from one another, and we cannot know a priori the frequency of C&C communication
  Need a method that can track persistence over several observation windows simultaneously
Timescales:
  Select k overlapping timescales $(W_1, s_1), (W_2, s_2), \ldots, (W_k, s_k)$, ordered from the smallest to the largest timescale
  For each timescale $j$, compute the persistence $p^{(j)}(d)$
  $d$ is judged persistent if $\max_j p^{(j)}(d) > p^*$
Multiple Timescales - Implementation
Size of the measurement window
  $\{s_1, s_2, \ldots, s_7\} = \{1, 4, 8, 12, 16, 20, 24\}$ (hours)
  In preliminary analysis, 87% of connections to the same destination atom are separated by at least 1 hr
Choose n = 10, so $W_j = n \cdot s_j$
  From the smallest timescale ($W_{\min} = 10$, $s_{\min} = 1$) to the largest ($W_{\max} = 240$, $s_{\max} = 24$)
Implementation
  k separate bitmaps? Not necessary
  [Figure: bitmap at the smallest timescale; each slot $s_j$ is covered by a slot in the next higher timescale, whose bit is obtained with an OR operation]
  Therefore, only a single long bitmap that covers all the timescales needs to be constructed
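The single-bitmap idea can be sketched as follows. This is an illustration under stated assumptions, not the paper's code: the bitmap is kept as a Python list of 240 hourly bits (oldest first), each coarser slot is derived by OR-ing the hourly bits it covers, and an atom is taken to be persistent when the largest persistence across timescales exceeds p*. All names are hypothetical.

```python
TIMESCALES_HOURS = (1, 4, 8, 12, 16, 20, 24)  # s_1..s_7 from the slide
N_SLOTS = 10                                   # n, so W_j = n * s_j
BITMAP_HOURS = N_SLOTS * max(TIMESCALES_HOURS)  # 240 hourly bits


def coarse_bits(hourly_bits, slot_hours):
    """OR groups of `slot_hours` hourly bits over the most recent W_j hours."""
    window = hourly_bits[-slot_hours * N_SLOTS:]
    return [any(window[i:i + slot_hours])
            for i in range(0, len(window), slot_hours)]


def persistence_per_timescale(hourly_bits):
    """Persistence of one atom at every timescale, keyed by slot size in hours."""
    if len(hourly_bits) != BITMAP_HOURS:
        raise ValueError("expected one bit per hour for the largest window")
    return {s: sum(coarse_bits(hourly_bits, s)) / N_SLOTS
            for s in TIMESCALES_HOURS}


def is_persistent(hourly_bits, p_star=0.6):
    """Persistent if any timescale exceeds the threshold p*."""
    return max(persistence_per_timescale(hourly_bits).values()) > p_star
```

An atom contacted once every 24 hours illustrates why several timescales are needed: its persistence is 0.0 at the 1-hour timescale (nothing in the last 10 hours) but 1.0 at the 24-hour timescale, so it is still flagged.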
Compute Persistence
[Algorithm listing, with the following annotations]
  Bitmaps are stored in the DCT, indexed by individual atoms (one bitmap for each atom d)
  Multiple timescales
  Bitmap length, and an index idx for each bit (ring buffer)
  At the end of each s_min, compute persistence for all destination atoms
  (A separate process handles each outgoing connection and checks whether the destination atom is whitelisted)
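A rough sketch of the bookkeeping, simplified to a single timescale. The class and method names are hypothetical, and a `deque` with `maxlen` stands in for the ring-buffer bitmap; the paper's actual data structure may differ.

```python
from collections import defaultdict, deque


class PersistenceTracker:
    """Per-atom ring buffers of slot bits, updated once per measurement window."""

    def __init__(self, n_slots=10, p_star=0.6, whitelist=None):
        self.n_slots = n_slots
        self.p_star = p_star
        self.whitelist = set(whitelist or ())
        # deque(maxlen=...) drops the oldest bit automatically, like a ring buffer.
        self.history = defaultdict(lambda: deque(maxlen=n_slots))
        self.seen_this_slot = set()

    def on_connection(self, atom):
        """Called for each outgoing connection; whitelisted atoms are not tracked."""
        if atom not in self.whitelist:
            self.seen_this_slot.add(atom)

    def end_of_slot(self):
        """Called at the end of each s_min; returns atoms whose persistence exceeds p*."""
        for atom in self.seen_this_slot:
            self.history[atom]  # touching the key creates a buffer for new atoms
        alarms = []
        for atom, bits in self.history.items():
            bits.append(1 if atom in self.seen_this_slot else 0)
            if sum(bits) / self.n_slots > self.p_star:
                alarms.append(atom)
        self.seen_this_slot.clear()
        return alarms
```

With n = 10 and p* = 0.6, an unlisted atom contacted in every slot raises its first alarm at the end of the seventh slot, since 6/10 does not strictly exceed the threshold.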
Whitelist - Training and Detection
Training and detection stages proceed (almost) identically
  The persistence of destinations is tracked, and an alarm is raised when it crosses a specified threshold
Training:
  An alarm simply results in the atom being inserted into the whitelist
Detection:
  Check the whitelist
  An alarm is exposed to the end user for further analysis (if benign, insert into the whitelist; if malicious, block connections)
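The difference between the two stages comes down to what happens when an alarm fires. The sketch below is one hypothetical way to express it; `handle_alarm` and the `ask_user` callback are illustrative names, and the returned action strings are assumptions made for the example.

```python
def handle_alarm(atom, whitelist, training, ask_user=None):
    """Decide what to do with an atom whose persistence crossed the threshold."""
    if atom in whitelist:
        return "ignored"          # already known good
    if training:
        whitelist.add(atom)       # training: every alarm becomes a whitelist entry
        return "whitelisted"
    # Detection: the end user (or an analyst) judges the destination.
    verdict = ask_user(atom) if ask_user else None
    if verdict == "benign":
        whitelist.add(atom)
        return "whitelisted"
    if verdict == "malicious":
        return "blocked"
    return "pending"              # no decision yet; keep the alarm open
```

Because training only ever adds to the whitelist, any malware already active on a host during training would be whitelisted as well, so the training traffic is assumed to be clean.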
End Host Traffic Traces
Collected at 157 hosts over a 5-week period (2007/1~2)
  Collected all packet headers
Data divided into training and testing sets
  The training set is used to determine the threshold and build the per-user whitelists
  The testing set is used to assess detection performance: FP rate and FN rate
Botnet Traffic Traces
Collect botnet binaries, execute them on a WinXP SP2 VM, and record the botnet traffic they generate
  No other IP traffic is sent out of the VM
  Hard work: binaries crash, C&C servers cannot be found; only 12 binaries worked!
In the test dataset, this botnet traffic is overlaid on top of the normal traffic traces (conn./min.)
System Properties
For the system to work well, whitelists:
  Should be stable and change infrequently
  Should be small, which speeds up lookups
[Figure: CDF of p(d) across all the atoms seen in the training data]
  A user typically has few persistent destination atoms; select p* = 0.6
[Figure: distribution of per-host whitelist sizes computed using p* = 0.6 (157 users in total)]
  Sizes are small and manageable
C&C Detection
Overlaid the bot trace data on top of each of the 157 user traces
[Table: various properties of the detected botnets, using s = (1, 4, 8, 16, 20, 24) and W = 10 * s; persistence > 0.6]
  A botnet might use multiple timescales for different destination atoms!
  Stealthy botnets are detected
  Non-centralized (P2P) botnets are also detected
C&C Detection (cont’d)
Use an ROC curve to relate the FP rate and the detection rate
  In an enterprise network, the FP rate might be low (well-behaved users); in the real world, however, the FP rate will rise!
  Whitelist applications, e.g., BitTorrent
[Figure: ROC curve; the knee marks the best threshold]
[Figure: FP across users (p* = 0.6, 157 users in total)]
  A small number of users see a large number of alarms (avg. 5.3 benign destination atoms per user)
Detecting Botnet Attack Traffic
Study how whitelists can boost the detection rates of more traditional volume-based anomaly detectors
  A whitelist holds known good destinations
  Traffic going to these destinations must be “anomaly free” (and can be filtered out)
  Use a simple connection count detector with a 99.9th-percentile threshold
After filtering, the detection rate is better (e.g., Aimbot-25, VB-666)
  The benefit of filtering is apparent when the botnet traffic volumes are low to moderate
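To make the filtering effect concrete, here is a hedged sketch of a connection count detector. It assumes connections are counted per fixed interval, uses the nearest-rank method for the percentile (the slides do not say which method was used), and represents atoms as plain strings for brevity. All names are hypothetical.

```python
import math


def percentile_threshold(counts, pct=99.9):
    """Nearest-rank percentile of per-interval connection counts."""
    if not counts:
        raise ValueError("need at least one training interval")
    ordered = sorted(counts)
    rank = math.ceil(pct / 100 * len(ordered))
    return ordered[max(rank, 1) - 1]


def filtered_count(connections, whitelist):
    """Number of connections in one interval whose atom is not whitelisted."""
    return sum(1 for atom in connections if atom not in whitelist)


def is_anomalous(connections, threshold, whitelist=frozenset()):
    """Flag an interval whose (filtered) connection count exceeds the threshold."""
    return filtered_count(connections, whitelist) > threshold


# Toy training data: heavy benign traffic to one whitelisted atom plus a little noise.
whitelist = {"google.com"}
training = [["google.com"] * 50 + ["misc.example"] * (i % 3) for i in range(20)]
raw_threshold = percentile_threshold([len(c) for c in training])
filtered_threshold = percentile_threshold(
    [filtered_count(c, whitelist) for c in training])

# A low-volume bot hides inside the benign volume unless that volume is filtered out.
interval = ["google.com"] * 45 + ["bot.example"] * 5
missed = not is_anomalous(interval, raw_threshold)
caught = is_anomalous(interval, filtered_threshold, whitelist)
```

In this toy setup the unfiltered threshold is 52 connections and the filtered one is 2, so five bot connections go unnoticed without the whitelist and stand out with it, which matches the observation that filtering helps most at low to moderate botnet traffic volumes.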
Conclusions
Introduces “persistence” as a temporal measure of the regularity of connections to “destination atoms”
  Persistence does not require any protocol semantics or payload inspection to detect the malware
Describes a method that builds whitelists of known good destination atoms
  In order to isolate persistent destinations (likely C&C channels)
Evaluation shows that the proposed method successfully identified C&C destinations in every botnet instance
The proposed method can also boost traditional detection algorithms by filtering traffic
My Comments
Uses a multi-resolution approach to explore the temporal behavior of a bot
  A bot connects to its C&C server(s) periodically
Can cooperate with other botnet detection techniques (those that are not host-based)
In detection, raising an alarm does not imply that an attack has been found
  Further analysis of the destination and the traffic is required
What are the limitations of the multi-resolution approach?