An Effective Defense Against Email Spam Laundering. Paper by: Mengjun Xie, Heng Yin, Haining Wang. Presented at: CCS'06. Presentation by: Devendra Salvi

An Effective Defense Against Email Spam Laundering

Paper by: Mengjun Xie, Heng Yin, Haining Wang

Presented at: CCS'06

Presentation by: Devendra Salvi

Overview

Introduction

Spam Laundering

Anti spam techniques

Proxy based spam behavior

DBSpam

Evaluation

Review

Introduction

Spam currently accounts for 80% of all email

Spam has evolved in parallel with anti-spam techniques

Spammers hide behind proxies and compromised computers

Introduction (contd.)

Goal: detect spam at its source by monitoring the bidirectional traffic of a network

DBSpam uses “packet symmetry” to break spam laundering in a network

Spam Laundering

Spam Proxy

Anti Spam Techniques

Existing anti-spam techniques are classified into:

1. “Recipient Oriented”

2. “Sender Oriented”

3. “HoneySpam”

Anti Spam Techniques (contd.)

Functions of recipient-oriented anti-spam techniques:

They block or delay email spam before it reaches the recipient's mailbox

Or they remove or mark spam in the recipient's mailbox

Anti Spam Techniques (contd.)

Recipient-oriented anti-spam techniques are further classified as:

Content based: email address filters, heuristic filters, machine learning based filters

Non content based

Anti Spam Techniques (contd.)

Recipient-oriented anti-spam techniques are further classified as:

Content based

Non content based: DNSBL, MARID, challenge-response, tempfailing, delaying, sender behavior analysis

Anti Spam Techniques (contd.)

Sender-oriented techniques:

Usage regulations, e.g. blocking port 25, SMTP authentication

Cost-based approaches: charge the sender (postage)

Anti Spam Techniques (contd.)

HoneySpam

It is a honeypot framework based on Honeyd

It deters email address harvesters, poisons spam address databases, and blocks spam that goes through the open relay / proxy decoys set up by HoneySpam

Proxy based spam behavior

Laundry path of Proxy Spamming

Proxy based spam behavior (contd.)

Connection Correlation

There is a one-to-one mapping between the upstream and downstream connections along the spam laundry path

This kind of connection correlation is common in proxy-based spamming

In normal email delivery there is only one connection, between the sender and the receiving MTA

Proxy based spam behavior (contd.)

Connection Correlation

Detecting such spam-proxy-related connection correlation is difficult because:

Spammers may encrypt the content

The detector sits at network vantage points, where correlating connections may induce unaffordable overhead

Proxy based spam behavior (contd.)

Spam laundering for single and multiple proxies

Proxy based spam behavior (contd.)

Message symmetry at the application layer leads to packet symmetry at the network layer

Exception: the one-to-one mapping between inbound and outbound streams can be violated

Reasons: packet fragmentation, packet compression and packet retransmission

Proxy based spam behavior (contd.)

Packet symmetry is the key to distinguishing the suspicious upstream / downstream connections along the spam laundry path from normal background traffic
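
The packet symmetry idea can be sketched roughly as follows. This is an illustrative sketch only, not the paper's implementation: the `Packet` record, the connection labels, and the 0.5 second matching window are all assumptions made for the example. For every inbound packet on the downstream connection (a reply from the MTA), it records 1 if an outbound packet on the upstream connection follows shortly after, and 0 otherwise.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    time: float     # capture timestamp in seconds
    conn: str       # hypothetical connection label, e.g. "down" or "up"
    direction: str  # "in" = entering the monitored network, "out" = leaving it


def symmetry_observations(packets, downstream, upstream, window=0.5):
    """Return one 0/1 observation per inbound packet on `downstream`.

    An observation is 1 when an outbound packet on `upstream` is seen
    within `window` seconds after the inbound packet (assumed value).
    """
    observations = []
    for p in sorted(packets, key=lambda pkt: pkt.time):
        if p.conn != downstream or p.direction != "in":
            continue
        matched = any(
            q.conn == upstream
            and q.direction == "out"
            and 0 < q.time - p.time <= window
            for q in packets
        )
        observations.append(1 if matched else 0)
    return observations


# Toy trace: two relayed replies, then one reply that is not relayed.
trace = [
    Packet(0.00, "down", "in"),
    Packet(0.01, "up", "out"),
    Packet(1.00, "down", "in"),
    Packet(1.02, "up", "out"),
    Packet(2.00, "down", "in"),
]
obs = symmetry_observations(trace, "down", "up")
print(obs)
```

A real monitor would also have to tolerate the exceptions listed earlier (fragmentation, compression, retransmission), which this sketch ignores.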

DBSpam

Goals

Fast detection of spam laundering with high accuracy

Breaking spam laundering via throttling or blocking after detection

Support for spammer tracking

Support for spam message fingerprinting

DBSpam

DBSpam consists of two major components:

Spam detection module: a simple connection correlation detection algorithm

Spam suppression module

DBSpam

Deployment of DBSpam

It is placed at a network vantage point which may connect a customer network to the Internet

DBSpam works well if it is deployed at the primary ISP edge router

DBSpam

The packet symmetry indicator for a spam TCP connection is 1

For a normal TCP connection it is 1 only with a very small probability

DBSpam uses a statistical method, the “sequential probability ratio test” (SPRT)

DBSpam

The “sequential probability ratio test” (SPRT) checks a probability ratio against two bounds after each observation

The algorithm maintains a variable X, which is checked for correlation

Variables A and B form the bounds

If X is between A and B, the algorithm does another iteration; otherwise it stops with a conclusion
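
The loop described above can be sketched with Wald's standard SPRT on 0/1 observations. This is a minimal sketch under stated assumptions, not DBSpam's actual code: the probabilities `theta0` and `theta1` and the error rates `alpha` and `beta` are illustrative values, and the bounds are computed with Wald's usual approximation. The running log-likelihood ratio plays the role of the tested variable, and `lower` / `upper` play the role of the bounds A and B.

```python
import math


def sprt(observations, theta0=0.2, theta1=0.99, alpha=0.005, beta=0.01):
    """Wald's SPRT on Bernoulli observations (1 = packet symmetry seen).

    H0: normal connection, P(X=1) = theta0 (assumed value)
    H1: spam laundering,   P(X=1) = theta1 (assumed value)
    Returns (decision, number of observations consumed).
    """
    lower = math.log(beta / (1 - alpha))   # bound A: accept H0 at or below
    upper = math.log((1 - beta) / alpha)   # bound B: accept H1 at or above
    llr = 0.0                              # running log-likelihood ratio
    n = 0
    for n, x in enumerate(observations, start=1):
        if x:
            llr += math.log(theta1 / theta0)
        else:
            llr += math.log((1 - theta1) / (1 - theta0))
        if llr >= upper:
            return "spam", n
        if llr <= lower:
            return "normal", n
    return "undecided", n                  # ran out of observations


print(sprt([1, 1, 1, 1]))  # symmetric on every observation
print(sprt([0, 0]))        # never symmetric
print(sprt([1, 0]))        # mixed evidence, still between the bounds
```

With these assumed parameters, a run of symmetric observations ends in a "spam" decision after a handful of packets, while mixed evidence keeps the test running.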

DBSpam

Evaluation

How fast can DBSpam detect spam laundering?

How accurate are the detection results?

How many system resources does it consume?

Evaluation

DBSpam detection time is mainly decided by the SPRT detection time:

Number of observations needed to reach a decision

Actual time spent by SPRT
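
To make the link between the two factors concrete, the sketch below estimates how many consecutive identical observations a Wald-style SPRT needs before it crosses a bound, and converts that count into time. All parameter values (`theta0`, `theta1`, `alpha`, `beta`, and the hypothetical `seconds_per_round`) are assumptions chosen for illustration and are not taken from the paper.

```python
import math


def observations_to_decide(theta0=0.2, theta1=0.99, alpha=0.005, beta=0.01):
    """Return (n_spam, n_normal): the number of consecutive 1s needed to
    decide "spam" and of consecutive 0s needed to decide "normal"."""
    lower = math.log(beta / (1 - alpha))               # lower bound
    upper = math.log((1 - beta) / alpha)               # upper bound
    step_up = math.log(theta1 / theta0)                # added per 1
    step_down = math.log((1 - theta1) / (1 - theta0))  # added per 0 (negative)
    n_spam = math.ceil(upper / step_up)
    n_normal = math.ceil(lower / step_down)
    return n_spam, n_normal


def detection_time(n_observations, seconds_per_round):
    """Rough wall-clock estimate, assuming one observation per reply round."""
    return n_observations * seconds_per_round


n_spam, n_normal = observations_to_decide()
print(n_spam, n_normal)
print(detection_time(n_spam, seconds_per_round=0.5))
```

The estimate is a best case: any observation that contradicts the trend pushes the statistic back towards the other bound and lengthens the test.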

Evaluation

SPRT can filter out 95% of non-spam traffic within four observations

Evaluation

The actual detection time is approximately 6 reply rounds of an SMTP connection

Evaluation

Accuracy

The probability of a false positive is less than 0.0002 in all traces, indicating that the false positive rate of SPRT is fairly small

False negatives are calculated as the ratio of the number of spam packets missed to the total number of spam packets

Evaluation

Resource Consumption

Trace Information

Resource consumption

Review

Strengths

Can detect spam sources by isolating and tracking proxies

Cuts off spam near its source

Can detect spam even if its content is encrypted

Low false positives

Does not degrade network performance

Review

Weaknesses

It cannot efficiently detect spam with short reply rounds

It is more effective if it can be installed on an ISP edge router

The paper does not discuss spam suppression techniques

Review

Improvements: as spam evolves, DBSpam will have to adapt its spam detection algorithm

Questions?
