Published: Internet Measurement Conference (IMC) 2006
Presented by Wei-Cheng Xiao
112/04/20 1
Outline
- Introduction
- Overview of IRC-based botnets
- Data collection methodology
- Analysis results
- Related work
- Conclusion
Introduction
- Botnet: a network of infected hosts, called bots, that are controlled by botmasters
- The defining characteristic of botnets: the command and control (C&C) channel
- Communication mechanisms
  - IRC (the majority, easy to distribute)
  - P2P
  - HTTP
Why Choose IRC
- Supports several forms of communication
  - Point-to-point, point-to-multipoint
- Supports several forms of data dissemination
- Provides open-source implementations
Motivation and Goals
- Motivation
  - Botnet activity is increasing, but little is known about botnet behavior.
- Goals
  - Gain a better understanding of botnets, including
    - the prevalence of botnet activity
    - the diversity of botnet subspecies
    - the evolution of a botnet
Contributions
- The development of a multifaceted infrastructure to capture and concurrently track multiple botnets in the wild
- A comprehensive analysis of measurements reflecting several important structural and behavioral aspects of botnets
The Life Cycle of A Botnet Infection
Step 1: Exploit
- Exploit a software vulnerability on victim hosts, by means of worms or malicious email attachments
Step 2: Download Bot Binary
- Execute shellcode to download the bot binary from a specific location and install it
Step 3: DNS Lookup (optional)
- Resolve the domain name of the IRC server coded in the binary
- Avoid server unavailability due to IP blocking
Step 4: Join
- Join the IRC server and C&C channel listed in the binary
- 3 types of authentication
  1. Bots authenticate to join the server using passwords in the binary
  2. Bots authenticate to join the C&C channel using passwords in the binary
  3. Botmasters authenticate to the bot population to send commands
Step 5: Parse and Execute Commands
- Parse commands from the channel topic and execute them
- The topic contains default commands for all bots
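To make the topic-parsing step concrete, here is a minimal sketch of how a client could pull a command out of the channel topic. The message layout follows the standard IRC topic reply (numeric 332). The server name, nickname, channel, and command text in the sample are hypothetical, and treating the first token of the topic as the command name is an assumption, since the actual command syntax differs between botnets.

```python
def parse_irc_line(line):
    """Split a raw IRC line into (prefix, command, params)."""
    line = line.rstrip("\r\n")
    prefix = None
    if line.startswith(":"):
        # An optional prefix names the sender of the message.
        prefix, _, line = line[1:].partition(" ")
    if " :" in line:
        # Everything after " :" is a single trailing parameter.
        head, trailing = line.split(" :", 1)
        params = head.split() + [trailing]
    else:
        params = line.split()
    command = params.pop(0)
    return prefix, command, params


def topic_command(line):
    """Return the command carried in a 332 (topic) reply, or None."""
    _, command, params = parse_irc_line(line)
    if command != "332" or len(params) < 3:
        return None
    # params for 332: [our_nick, channel, topic_text]
    tokens = params[2].split()
    if not tokens:
        return None
    # Assumption: first token is the command name, the rest are arguments.
    return {"channel": params[1], "name": tokens[0], "args": tokens[1:]}


# Hypothetical sample line, not taken from a real botnet.
sample = ":irc.example.net 332 drone1 #chan :.examplecmd arg1 arg2"
parsed = topic_command(sample)
```

Here `parsed` holds the channel, the command name, and its argument list; any line that is not a topic reply yields `None`.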
The Overall Data Collection Architecture
The Three Main Phases
1. Malware collection
   - Goal: collect bot binaries
2. Binary analysis via gray-box testing
   - Goal: analyze the binaries
3. Longitudinal tracking of botnets
   - Goal: track real botnets using the analysis results
Phase 1: Malware Collection
Darknet: an allocated but unused portion of the IP address space
Malware Collection
- Environment setup
  - 14 nodes are distributed across the PlanetLab testbed.
  - These nodes have access to the darknet, whose IP space is located in 10 different class A networks.
- Nepenthes
  - Mimics the replies generated by vulnerable services to obtain shellcode
  - Passes the URLs found in the shellcode to the download station to fetch bot binaries (why?)
- Honeynet
  - Used to handle cases where Nepenthes fails
  - Runs unpatched Windows XP on VMs
  - VLAN
Gateway
- Route darknet traffic to Nepenthes and honeypots
  - Half to Nepenthes, half to honeypots
- Rotate routing among 8 class-C networks in the darknet
- Use NAT to keep the number of honeypots small
- Act as a firewall to prevent outgoing attacks and cross infections from the honeypots (VLAN)
- Detect and manage IRC connections
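The gateway's forwarding decision can be sketched roughly as below. The slides do not say how the half-and-half split or the rotation is implemented, so everything specific here is an assumption: the eight /24 prefixes are placeholders, the rotation interval is invented, traffic to a network that is not currently active is simply not forwarded, and the split uses a checksum of the source address (which balances sources rather than packets).

```python
import ipaddress
import zlib

# Hypothetical placeholders for the 8 class-C (/24) darknet networks.
DARKNET_NETS = [ipaddress.ip_network(f"10.0.{i}.0/24") for i in range(8)]
ROTATION_SECONDS = 3600  # hypothetical rotation interval


def active_network(now_seconds):
    """Return the /24 whose traffic is currently forwarded."""
    slot = int(now_seconds // ROTATION_SECONDS) % len(DARKNET_NETS)
    return DARKNET_NETS[slot]


def choose_backend(src_ip, dst_ip, now_seconds):
    """Pick where a darknet packet goes: Nepenthes, a honeypot, or nowhere."""
    if ipaddress.ip_address(dst_ip) not in active_network(now_seconds):
        return "not_forwarded"
    # crc32 is stable across runs, so one source always reaches the same backend.
    bucket = zlib.crc32(src_ip.encode("ascii")) % 2
    return "nepenthes" if bucket == 0 else "honeypot"
```

Keeping a given source on one backend is a design choice of this sketch, so that a multi-step infection attempt is not split between the two collectors.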
Phase 2: Binary Analysis (gray-box)
Binary Analysis
- Environment setup
  - A sink (IRC server) monitors all network traffic.
  - A client, a VM with a clean Windows XP installation on which the binary is executed, is connected to the sink.
- Two steps
  - Creating network fingerprints
  - Extracting IRC-related features
The Two Steps
- Creating network fingerprints (network level)
  - fnet = {DNS, IPs, Ports, Scan}
    - DNS: targets of any DNS requests
    - IPs: destination IP addresses
    - Ports: contacted ports on the server side
    - Scan: whether or not IP scanning behavior is detected
- Extracting IRC-related features (application level)
  - When an IRC session is detected, an IRC fingerprint is created: firc = {PASS, NICK, USER, MODE, JOIN}.
- fnet and firc provide enough information to join a botnet in the wild.
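One way to picture the two fingerprints is as small records filled in from the traffic seen at the sink. The sketch below is only an illustration: the class name, field types, and the rule of keeping the first value seen for each IRC command are assumptions, and the sample session lines are invented.

```python
from dataclasses import dataclass, field


@dataclass
class NetFingerprint:
    """fnet = {DNS, IPs, Ports, Scan} as observed at the network level."""
    dns: set = field(default_factory=set)    # targets of DNS requests
    ips: set = field(default_factory=set)    # destination IP addresses
    ports: set = field(default_factory=set)  # contacted server-side ports
    scan: bool = False                       # IP scanning detected?


IRC_FIELDS = ("PASS", "NICK", "USER", "MODE", "JOIN")


def extract_firc(client_lines):
    """Build firc from the lines a bot sends when opening its IRC session."""
    firc = {}
    for line in client_lines:
        command, _, rest = line.strip().partition(" ")
        command = command.upper()
        # Assumption: keep only the first value seen for each command.
        if command in IRC_FIELDS and command not in firc:
            firc[command] = rest
    return firc


# Invented session, for illustration only.
session = [
    "PASS examplepass",
    "NICK bot-1234",
    "USER a b c :d",
    "MODE bot-1234 +i",
    "JOIN #chan chankey",
    "PRIVMSG #chan :hello",
]
firc = extract_firc(session)
fnet = NetFingerprint(dns={"irc.example.net"}, ports={6667})
```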
Dialect
- Dialect: the syntax of botmasters’ commands and their responses
- Learning a botnet’s dialect is required for mimicking actual bot behavior.
- An IRC query engine plays the role of botmaster.
- Commands come from
  - those observed in honeypots
  - the source code of publicly known bots
- The output of the querying process becomes the template.
Phase 3: Longitudinal Tracking of Botnets
IRC Tracker (Drone)
- An IRC client that can join a real-world IRC channel
- A drone is given firc and the template.
- Automatically answers queries based on the template
- Pretends to be a dutiful bot
  - Must be intelligent enough
- Mimicry improvements
  - Randomly join and leave
  - Change external IP
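Answering queries from a template amounts to a lookup from an observed command to a recorded response. A minimal sketch follows; the template entries, command names, and the choice to stay silent on unknown commands are all hypothetical and not taken from the study.

```python
import re

# Hypothetical template: (pattern for a command, canned reply).
TEMPLATE = [
    (re.compile(r"^\.version\b"), "ExampleBot v0.1"),
    (re.compile(r"^\.uptime\b"), "uptime: 0d 1h 2m"),
]


def drone_reply(message, template=TEMPLATE):
    """Return the canned reply for a botmaster query, or None if unknown."""
    for pattern, reply in template:
        if pattern.search(message):
            return reply
    # Assumption: the drone stays silent when the command is not in the template.
    return None
```

For example, `drone_reply(".version")` returns the first canned reply, while an unrecognized command returns `None`.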
DNS Tracking
- Most bots find their IRC servers via DNS queries.
- Probe about 800,000 real-world DNS servers
  - Query the domain names of the IRC servers
  - A cache hit implies one or more bots
- Shortcomings
  - Not all DNS servers are probed.
  - The number of hits provides only a lower bound on the number of bots.
- Still useful when the broadcast feature in a botnet is turned off.
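The lower-bound reasoning can be written down as a small aggregation: each DNS server with a cache hit for a botnet's IRC domain implies at least one bot behind it, so counting those servers gives a floor on the bot count. The tuple layout and function name below are assumptions made for illustration, and the sample data is invented.

```python
def cache_hit_lower_bound(probe_results):
    """Count, per IRC domain, the distinct DNS servers that had a cache hit.

    probe_results: iterable of (dns_server, irc_domain, hit) tuples.
    """
    servers_with_hit = {}
    for dns_server, irc_domain, hit in probe_results:
        if hit:
            servers_with_hit.setdefault(irc_domain, set()).add(dns_server)
    # One hit means "one or more bots", so this is only a lower bound.
    return {domain: len(servers) for domain, servers in servers_with_hit.items()}


# Invented probe results, for illustration only.
probes = [
    ("ns1.example.org", "irc.botnet-a.example", True),
    ("ns2.example.org", "irc.botnet-a.example", True),
    ("ns1.example.org", "irc.botnet-a.example", True),   # repeated probe
    ("ns3.example.org", "irc.botnet-a.example", False),
    ("ns3.example.org", "irc.botnet-b.example", True),
]
bounds = cache_hit_lower_bound(probes)
```

Repeated hits on the same server are counted once, and domains with no hits do not appear in the result.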
Data Collected
- Collection started on Feb. 1, 2006, and includes
  - Traffic traces over a span of 3 months
  - IRC logs over a span of 3 months, covering data from more than 100 botnet channels
  - Results of DNS cache hits from tracking 65 IRC servers on 800,000 DNS servers for more than 45 days
Botnet Traffic Share
27% of SYNs are from known botnet spreaders.
76% of SYNs are directed to target ports.
The two curves reveal similar traffic patterns.
This is a lower-bound estimate.
Botnet Prevalence: A Global Look
About 85,000 DNS servers are involved in at least one botnet activity.
Botnet Spreading Patterns
- Two types of botnets:
  - Type-I: fixed scanning algorithm
  - Type-II: variable scanning algorithm
- Out of 192 IRC bots, 34 are Type-I.
- Summary of Type-II scanning practice
Botnet Growth Patterns
Predominant Botnet Structures
1. Single IRC server (70%)
   - Prevalent among small botnets
2. Multiple IRC servers, bridged botnet (30%)
   - 25% of which are publicly known servers
3. A botmaster controls multiple botnets
4. Some botnets migrate
Effective Botnet Sizes and Botnet Lifetime
- Effective size: the number of online bots
- The observed effective size was much smaller than the footprint.
  - Bots usually stay connected for only 25 minutes.
  - May be due to client unavailability
  - More likely, botmasters ask them to leave.
- Botnets, however, have long lifetimes.
  - 84% of IRC servers were still up at the end of the study.
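The gap between the two size measures is easy to see on a toy join/leave log. In the sketch below, the effective size is the number of bots online at a given time, as defined above; the footprint is taken to be the number of distinct bots seen so far, which is an assumption because the slides do not define it. The event format and sample data are invented.

```python
def footprint_and_effective_size(events, at_time):
    """Return (footprint, effective_size) as of at_time.

    events: (timestamp, bot_id, "join" | "leave") tuples sorted by timestamp.
    """
    seen = set()    # every bot observed so far (assumed footprint)
    online = set()  # bots currently connected (effective size)
    for timestamp, bot_id, action in events:
        if timestamp > at_time:
            break
        seen.add(bot_id)
        if action == "join":
            online.add(bot_id)
        elif action == "leave":
            online.discard(bot_id)
    return len(seen), len(online)


# Invented channel log: timestamps in minutes.
log = [
    (0, "a", "join"),
    (5, "b", "join"),
    (25, "a", "leave"),
    (30, "c", "join"),
    (40, "b", "leave"),
]
```

With short connection times, the footprint keeps growing while the effective size stays small, which matches the pattern described above.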
Botnet Software Taxonomy
Related Work
- Botnet Tracking: Exploring a Root-Cause Methodology to Prevent DoS Attacks. ESORICS, 2005
  - Introduces the idea of using honeypots and active responders to analyze botnet behavior
- Scalability, Fidelity, and Containment in the Potemkin Virtual Honeyfarm. ACM SIGOPS, 2005
  - A very useful tool for botnet detection, but not appropriate for long-term botnet tracking
Conclusion
- A multifaceted approach is proposed to understand the botnet phenomenon.
- The results show that botnets are a major contributor to unwanted network traffic.
- The scanning behavior and patterns of botnets are quite different from those of autonomous malware.
- The effective sizes of botnets are much smaller than their footprints.