Watchtowers of the Internet - Source Boston 2012

WATCH TOWERS OF THE INTERNET

Websense Security Labs

Stephan Chenette, Armin Buescher

(c) 2012 Websense Security Labs.

ANALYSIS OF OUTBOUND MALWARE COMMUNICATION

Who we are

Stephan Chenette (Northeastern Grad.)

Security Researcher, UCSD M.S.

Vulnerabilities, Reversing, Coding

Armin Buescher

Security Researcher, M.S.

AV, Reversing, Coding

R&D and Malware/Exploit Research

Essentials of this Talk

• Malware Lab

• Observations of Malware Communication

• Clustering

Current State of Affairs

Companies are concerned about targeted attacks

...and for good reason.

• A persistent attacker will eventually penetrate your network

• Malware will be installed

• Most malware will eventually communicate outbound*

* Unless the attacker's end goal is complete destruction of data, the malware will be used as the communication mechanism back to the C&C.


Current State of Affairs

Most important to you as a network administrator:

• Knowledge of what machines are infected

• Prevention of important information leaving your network

Value of this Presentation

Better understanding of Outbound Malware Communication

Deep dive into threats that target, or are already present on, your network

Malware Lab

Building a Malware Lab

Malware Lab

• Sandbox

• VPN Services

• Network Listeners

• Databases

• Multiple Scanner Engines

• Malware…lots of it! =]

Malware Lab Output

• Behavior Analysis

• Network Analysis

Our Philosophy

• Don't run around trying to find a particular bot/variant

Run Everything!

• Then figure out what it is…

• Spam Bots

• Network Worms

• File Infectors

• Etc.

Malware Samples

We typically receive 30,000-70,000 samples per day

For this presentation we took a small representative daily subset totaling ~155,000 malware files to sample from

Malware Samples

How to Classify Samples...

DO NOT USE: AV names*

• e.g. Trojan.Win32.Downloader

DO USE: Clustering

• Behavior Analysis/Network Analysis

* AV names are avoided as the primary means of classification whenever possible.

Understanding Outbound Communication of Malware Samples

Generic Trojan Downloader SHA-1: ab57031100a8c8c813a144b20b1ef5b9a643cec7

fling.com?...p0rn site

promos.fling/geo/txt/city.php

VPN Gateway - Canada

Botnet C&C 83.125.22.188

P2P Communication

P2P Botnet

P2P Botnet – Encryption

Generic Trojan Downloader?

• GEO/IP Lookup from a P0rn site

• C&C traffic uses DGA to “sign” botnet traffic via host header

• P2P communication over port 443

• Zaccess Dropper! (Sophos/Kaspersky)

• Future versions with the same network behavior can be profiled

GEO/IP lookup

• 2,744 samples in our malware set use fling.com to look up geo-location

• 177 different AV detection variants

• …would clustering have put these in the same group?
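The grouping question above can be illustrated with a minimal sketch: index samples by the hosts they contact and count how many AV labels end up behind the same host. The sample records and AV labels below are hypothetical placeholders, not data from our lab; only the fling.com host comes from the slides.

```python
from collections import defaultdict

# Hypothetical sandbox records: (sample id, AV label, hosts contacted).
SAMPLES = [
    ("s1", "Trojan.Win32.Downloader", {"fling.com", "cc.example.test"}),
    ("s2", "Worm.Win32.Generic", {"fling.com"}),
    ("s3", "Trojan.Win32.Agent", {"other.example.test"}),
]

def group_by_host(samples):
    """Map each contacted host to the samples and AV labels seen with it."""
    groups = defaultdict(lambda: {"samples": set(), "av_labels": set()})
    for sample_id, av_label, hosts in samples:
        for host in hosts:
            groups[host]["samples"].add(sample_id)
            groups[host]["av_labels"].add(av_label)
    return groups

groups = group_by_host(SAMPLES)
fling = groups["fling.com"]
print(len(fling["samples"]), "samples,", len(fling["av_labels"]), "AV labels")
```

A single shared network feature groups samples that AV naming scatters across many labels.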

Another Sample…

k = bot ID; the C&C only replies if k is present!

Returns instructions to DoS two targets

03 – DoS (attack mode)

50 – Number of threads

60 – Timeout (s) until the next C&C request

DoS targets: smcae.com:3306 and http://tonus.crimea.ua
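A rough sketch of how such a reply could be turned into structured data. The wire format is an assumption made for illustration (a first line with three pipe-separated fields, then one target per line); the slides only show the field values, not the exact layout. The names `BotCommand` and `parse_command` are hypothetical.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class BotCommand:
    mode: str          # attack mode code, e.g. "03"
    threads: int       # number of attack threads
    timeout_s: int     # seconds until the next C&C request
    targets: List[str]

def parse_command(body: str) -> BotCommand:
    """Parse an ASSUMED 'mode|threads|timeout' header line followed by targets."""
    lines = [ln.strip() for ln in body.splitlines() if ln.strip()]
    if not lines:
        raise ValueError("empty C&C response")
    fields = lines[0].split("|")
    if len(fields) != 3:
        raise ValueError("unexpected header: %r" % lines[0])
    mode, threads, timeout = fields
    return BotCommand(mode, int(threads), int(timeout), lines[1:])

cmd = parse_command("03|50|60\nsmcae.com:3306\nhttp://tonus.crimea.ua\n")
print(cmd)
```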

Results

• DirtJumper Botnet

• Request commands via HTTP (unencrypted!)

• DoS on MySQL (3306), no SQL content

• DoS on HTTP (80), GET request

Manual Analysis

• Good for a deep dive into a particular binary, e.g. the Flashback Mac OS X malware, to find its DGA

• But not good for mass analysis of a large number of samples daily

• …Clustering

Clustering Basics

Clustering

The process of grouping together samples that contain similar features
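As a minimal sketch of that idea, the snippet below groups samples whose feature sets overlap enough, using Jaccard similarity and a greedy single pass. The similarity measure, the 0.5 threshold, and the feature strings are assumptions chosen for illustration; they are not the algorithm used in our lab.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets: |intersection| / |union|."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def cluster(samples, threshold=0.5):
    """Greedy single pass: join the first cluster whose first member is similar enough."""
    clusters = []  # each cluster is a list of (name, features)
    for name, feats in samples.items():
        for members in clusters:
            if jaccard(feats, members[0][1]) >= threshold:
                members.append((name, feats))
                break
        else:
            clusters.append([(name, feats)])
    return [[name for name, _ in members] for members in clusters]

# Hypothetical feature sets extracted from sandbox runs.
samples = {
    "a": {"proto:http", "ua:empty", "post:/gate.php"},
    "b": {"proto:http", "ua:empty", "post:/gate.php", "dns:1"},
    "c": {"proto:irc", "port:6667"},
}
result = cluster(samples)
print(result)
```

Samples "a" and "b" share three of four features (similarity 0.75) and land together; "c" shares nothing and forms its own cluster.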

Network Communication

TCP Services

2012: Malware is talking over HTTP

≥70% HTTP vs. 0.46% IRC (port 6667)

Clustering on HTTP Outbound Communication

Malware downloading executable payloads

Trojan:Win32/Medfos

Worm:Win32/Renocide

Trojan:Win32/Opachki

Worm:Win32/Rebhip

Don't Rely 100% on AV Names

Rely on behavioral functionality

C&C Communication via HTTP

Malware Communication


Feature: HTTP User-Agents used by Malware


• Most malware uses browser user-agent strings

• >17% have empty user-agent strings!

• 85% use the user-agent of a browser not present on the system
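These observations translate into simple per-request flags. The sketch below is illustrative only: the function name, the "Mozilla/" and "Opera/" prefix check, and the token list describing browsers installed on the sandbox image are assumptions, not our production logic.

```python
def user_agent_flags(user_agent, installed_browser_tokens):
    """Return simple flags for one HTTP request's User-Agent header."""
    ua = (user_agent or "").strip()
    # Crude heuristic: treat typical browser prefixes as "browser-like".
    browser_like = ua.startswith("Mozilla/") or ua.startswith("Opera/")
    matches_installed = any(tok in ua for tok in installed_browser_tokens)
    return {
        "empty": ua == "",
        "browser_like": browser_like,
        "browser_not_installed": browser_like and not matches_installed,
    }

# Hypothetical sandbox image with only Internet Explorer 8 installed.
INSTALLED = ["MSIE 8.0"]
spoofed = user_agent_flags(
    "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) Firefox/3.6", INSTALLED)
native = user_agent_flags(
    "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1)", INSTALLED)
blank = user_agent_flags("", INSTALLED)
print(spoofed, native, blank)
```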

Good Apps…User-Agent

BlueStacks is an Android emulator

Completely benign…but there are characteristics that look like bot traffic…

Good Traffic

User-Agent / HTTP GET

User-Agent: Dalvik/1.4.0 (Linux; U; Android 2.3.4; BlueStacks-c4afa5ac-7f39-11e1-b41e-001676aa4685 Build/GRJ22)

GET /public/appsettings/updates.txt

…Essential to have a large sample set of both benign and malicious examples

Obviously Malicious…

URLs

• www.csa.uem.br/administrator/includes/MicrosoftUpdate.exe

• s1c0gv3v0x.h1.ru/Trojan.rar

• ospianistas.com.br/aviso/infect.php

• svpembtywvrc.eu/gate.php?cmd=ping&botnet=fr18&userid=x1lgje2mdh51kc8z&os=V2luZG93cyBYUA==
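The last URL carries a Base64-encoded parameter. A small sketch of how such values can be surfaced automatically: try to decode every query value and keep the ones that turn into printable ASCII. The function name and the length/printability heuristics are assumptions for illustration.

```python
import base64
import binascii
from urllib.parse import urlsplit, parse_qsl

def decode_base64_params(url):
    """Return {param: text} for query values that decode to printable ASCII."""
    decoded = {}
    # Scheme-less URLs need a leading "//" so the host is not parsed as a path.
    query = urlsplit(url if "://" in url else "//" + url).query
    for key, value in parse_qsl(query, keep_blank_values=True):
        if len(value) < 8 or len(value) % 4 != 0:
            continue  # too short or not padded like Base64
        try:
            text = base64.b64decode(value, validate=True).decode("ascii")
        except (binascii.Error, UnicodeDecodeError):
            continue  # not Base64, or not ASCII text once decoded
        if text.isprintable():
            decoded[key] = text
    return decoded

url = ("svpembtywvrc.eu/gate.php?cmd=ping&botnet=fr18"
       "&userid=x1lgje2mdh51kc8z&os=V2luZG93cyBYUA==")
found = decode_base64_params(url)
print(found)
```

For this URL only the os parameter survives the checks, and it decodes to "Windows XP".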

User-Agents

• Mozilla/6.0 (iPhone; U; CPU iPhone OS 3_0 like Mac OS X; en-us)

• Mozilla/1.22 (compatible; MSIE 2.0; Windows 95)

• darkness

• N0PE

• Trololo

Network behavior features

Clustering

Net. Clustering Features

• Basic Network communication features

• Protocols

• Timing

• Encryption

• Encoding (e.g. BASE64)

• DNS features

• Number of lookups

Net. Clustering Features

• HTTP features

• Number of requests

• Request method (POST/GET/…)

• MIME types (server/real)

• URL

• User-agent

• Etc.
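A minimal sketch of how the HTTP features listed above might be summarized per sample. The request record layout (keys such as "server_mime" and "real_mime") is hypothetical, and reading "MIME types (server/real)" as "type declared by the server vs. type detected from the content" is our interpretation for this example.

```python
from collections import Counter
from urllib.parse import urlsplit

def http_features(requests):
    """Summarize one sample's HTTP requests as a flat feature dict."""
    methods = Counter(r["method"] for r in requests)
    mismatches = sum(1 for r in requests if r["server_mime"] != r["real_mime"])
    return {
        "num_requests": len(requests),
        "num_get": methods.get("GET", 0),
        "num_post": methods.get("POST", 0),
        "mime_mismatches": mismatches,
        "hosts": sorted({urlsplit(r["url"]).hostname for r in requests}),
        "user_agents": sorted({r["user_agent"] for r in requests}),
    }

# Hypothetical requests recorded for one sandboxed sample.
reqs = [
    {"method": "GET", "url": "http://cc.example.test/gate.php?cmd=ping",
     "user_agent": "", "server_mime": "text/html", "real_mime": "text/html"},
    {"method": "GET", "url": "http://dl.example.test/update.jpg",
     "user_agent": "", "server_mime": "image/jpeg",
     "real_mime": "application/x-dosexec"},
    {"method": "POST", "url": "http://cc.example.test/gate.php",
     "user_agent": "N0PE", "server_mime": "text/plain",
     "real_mime": "text/plain"},
]
features = http_features(reqs)
print(features)
```

The second request shows why both MIME types matter: an executable served as an image is a strong feature on its own.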

Clustering examples

DDoS malware Dirt Jumper

• Clustering with network behavior:

• Found ~900 Dirt Jumper samples

• Identified 90 unique C&C URLs

Led to the research paper “Tracking DDoS, Insights into the business of disrupting the Web”, accepted for publication at the LEET academic conference

Distinguishing families

• Downloaders with similar behavior

• Categorizing unknown samples:

• ~85% precision

• Two families

Banking Trojan Zbot

• Zoom into the cluster with “Zbot” network behavior

• Clusters:

• Alive & kickin’

• Domain killed

• Server killed
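One plausible way to arrive at these three sub-clusters is from two observations per C&C: whether its domain still resolves and whether the server still answers. This mapping is an assumption made for illustration; the slides do not spell out the exact criteria, and `cc_status` is a hypothetical helper.

```python
def cc_status(dns_resolves: bool, server_responds: bool) -> str:
    """Classify a C&C endpoint from two observations made during a sandbox run."""
    if not dns_resolves:
        return "domain killed"   # e.g. the domain was suspended or sinkholed away
    if not server_responds:
        return "server killed"   # domain resolves, but nothing answers
    return "alive"

# Hypothetical observations: (dns_resolves, server_responds).
observations = {
    "cc1.example.test": (True, True),
    "cc2.example.test": (False, False),
    "cc3.example.test": (True, False),
}
statuses = {host: cc_status(*obs) for host, obs in observations.items()}
print(statuses)
```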

Conclusion

Telemetry = System behavior + Network behavior

• Automated deep analysis of network behavior is underrated

• Paint full picture of analyzed malware!

• AV Names don’t always represent functionality

Conclusion II

• Clustering on network behavior analysis

• Identify malware communication techniques

• Obviously malicious

• Generic

• Sophisticated

• Clustering…yes! Just remember sophisticated might just mean generic!

Q & A

questions.py:

# questions, answers and time are supplied by the audience and the clock
while len(questions) > 0:
    if time <= 0:
        break
    print(answers[questions.pop()])


That’s all folks!

Thanks!

Stephan Chenette

Twitter: @StephanChenette

Armin Buescher

Twitter: @armbues
