Dissecting Ghost Clicks: Ad Fraud Via Misdirected Human Clicks
Sumayah A. Alrwais, Christopher W. Dunn, Minaxi Gupta (Indiana University, U.S.A.)
Alexandre Gerber, Oliver Spatscheck (AT&T Labs-Research, U.S.A.)
Eric Osterweil (Verisign Labs, U.S.A.)
28th ACSAC (December, 2012)
A Seminar at Advanced Defense Lab
Outline
• Introduction
• Ad Fraud Scheme
• Identifying When Resolvers Lie
• Aspects of Ad Replacement
• Attack Infrastructure
• Impact of the Ad Fraud Scheme
• Potential Mitigation Strategies
• Related Work
2012/12/3
Introduction
• Online advertising is a fast-growing, multi-billion-dollar industry.
• Common revenue models include:
  • cost per mille (CPM)
  • cost per click (CPC)
  • cost per action (CPA)
FBI: Operation Ghost Click [link]
• Botnet: Esthost
  • 4 million computers
  • Takedown: November 2011
• Attack scheme: ad fraud
  • Earned CPM and CPC revenue
  • 14 million USD in 4 years
• [TrendLab blog] Esthost Taken Down – Biggest Cybercriminal Takedown in History [link]
• [TrendLab blog] Big Botnet Busts [link]
• Key element
  • DNS changer malware
Contribution
• In situ experimentation
• Mapping the attack infrastructure
• Gauging attack impact
• Mitigation
Ad Fraud Scheme
• Ad replacement attack
  • Earns CPM revenue
• Click hijacking attack
  • Earns CPC revenue
• Threat model
  • Malware changes the victim's DNS resolver to a malicious one.
Ad Replacement Attack
[Figure: ad replacement diagram. Labels: ebay.com; ebay server; ad hosts ad.doubleclick.com, banners.awfulnews.com, ad.xtendmedia.com; 300X250 ad (Source = ebay); 300X250 ads (Source = attacker); malicious DNS resolver (213.109.64.5); malicious server (216.180.243.10); ad network xtendmedia.com]
Click Hijacking Attack
[Figure: click hijacking diagram. Labels: google.com; AVG server; DNS; free.avg.com (Referrer = google/?keyword=xxx)]
Click Hijacking Attack
[Figure: click hijacking diagram. Labels: AVG server; DNS; free.avg.com; <script src="google-analytics/ga.js">; 205.234.201.229 (Referrer = google/?keyword=xxx); Import search2.google.com/123.php?referrer= …; Import search3.google.com/?Google+AVG+xxx; 67.210.14.53]
Click Hijacking Attack
[Figure: click hijacking diagram. Labels: AVG server; DNS; free.avg.com; 205.234.201.229 (Referrer = google/?keyword=xxx); Import search3.google.com/?Google+AVG+xxx; 67.210.14.53; { load bulletindialy.com/?parameter }; bulletindialy.com/?parameter]
Click Hijacking Attack
[Figure: click hijacking diagram. Labels: bulletindialy.com/?parameter; DNS; 65.60.9.238 (form click IPs); <form action="65.60.9.238/?param">; <script> submit form </script>; Referrer = bulletindialy.com; HTTP 302 redirect to the fake search engine accurately-locate.com/?keyword=yyy&itemid (Referrer = bulletindialy.com); HTTP 302 redirect to the search ad network looksmart.com /?keyword=yyy&itemid (Referrer = bulletindialy.com)]
Click Hijacking Attack
Modes of Click hijacking
Identifying When Resolvers Lie
• We started our investigation with two IP addresses of malicious resolvers in the 213.109.0.0/20 prefix.
  • Provided by a Trend Micro researcher who helped the FBI with Operation Ghost Click.
• Visited the Alexa top 3,000 websites on May 11, 2011.
  • Filtered ad URLs from the captured HTTP traffic using the URL patterns used by Adblock Plus [link].
  • 7,483 unique HTML and JavaScript ad URLs
  • Delivered by 1,019 ad hosts
Filtering Mis-resolved DNS
• Heuristic 1: Resolution contains a valid IP address
  • We gathered good DNS resolutions from 4,490 public resolvers around the world, covering 74 countries.
  • If an IP address returned by a malicious resolver was also returned by a public DNS resolver for any ad host name, this heuristic considers all IP addresses in that resolution to be good.
• Result: 90.5% of the IPs were eliminated; 281 IPs (96 host names) remain.
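Heuristic 1 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' tooling: the function name, the dictionary layout, and the example host names and addresses (taken from documentation-only ranges) are all assumptions.

```python
from typing import Dict, Set


def filter_by_known_good_ips(
    suspect: Dict[str, Set[str]],
    good: Dict[str, Set[str]],
) -> Dict[str, Set[str]]:
    """Return the resolutions that Heuristic 1 cannot clear.

    suspect: ad host name -> IPs returned by the malicious resolver
    good:    ad host name -> IPs returned by public resolvers
    """
    # An IP counts as good if a public resolver returned it for ANY ad host.
    known_good = set().union(*good.values())
    remaining = {}
    for host, ips in suspect.items():
        # One good IP is enough to accept the whole resolution.
        if set(ips) & known_good:
            continue
        remaining[host] = set(ips)
    return remaining


# Hypothetical data for illustration only.
good = {
    "ads.example.com": {"192.0.2.10", "192.0.2.11"},
    "cdn.example.net": {"198.51.100.7"},
}
suspect = {
    "ads.example.com": {"192.0.2.10", "203.0.113.5"},
    "cdn.example.net": {"203.0.113.9"},
    "img.example.org": {"198.51.100.7"},
}
remaining = filter_by_known_good_ips(suspect, good)
```

Here only `cdn.example.net` stays suspicious: the other two resolutions each contain an address that some public resolver returned for an ad host.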
Filtering Mis-resolved DNS
• Heuristic 2: Suspicious IP returns a valid SSL certificate
  • Many ad networks offer secure Web-based logins that their advertisers use for various tasks.
  • For 62 host names, over 98% of the IPs in the good resolutions returned a valid certificate.
• Examining the suspicious resolutions
  • 8 malicious IPs (4 + 23 host names) => 1,277 URLs
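One way to approximate Heuristic 2 is a TLS handshake against each suspicious IP while presenting the ad host name. The sketch below assumes that "valid" means the certificate chain and host name verify against the system trust store on port 443; the slides do not spell out the exact check, and both function names are hypothetical.

```python
import socket
import ssl
from typing import Callable, Dict, Set


def serves_valid_certificate(host: str, ip: str, timeout: float = 5.0) -> bool:
    """Return True if `ip` completes a verified TLS handshake for `host`."""
    context = ssl.create_default_context()  # verifies chain and host name
    try:
        with socket.create_connection((ip, 443), timeout=timeout) as raw:
            with context.wrap_socket(raw, server_hostname=host):
                return True
    except (ssl.SSLError, OSError):
        # Covers verification failures, refused connections and timeouts.
        return False


def still_suspicious(
    resolutions: Dict[str, Set[str]],
    check: Callable[[str, str], bool] = serves_valid_certificate,
) -> Dict[str, Set[str]]:
    """Keep only the (host, IP) pairs that fail the certificate check."""
    result = {}
    for host, ips in resolutions.items():
        failing = {ip for ip in ips if not check(host, ip)}
        if failing:
            result[host] = failing
    return result
```

The `check` parameter is injectable so the filtering logic can be exercised offline with a stub instead of live connections.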
Aspects of Ad Replacement
• We set up a test machine to use a malicious resolver as its primary DNS resolver and visited each of the 1,277 ad URLs.
Operational Details
• Of the 1,277 ad URLs, ad replacement succeeded for 782.
• Why?
  • When the URL didn't match a certain form, the attackers loaded the original ad.
Attack Infrastructure
• The attack infrastructure had three components:
  • Malicious resolvers
  • Malicious websites (host names)
  • Malicious IP addresses
Malicious Resolvers
• We found several IP addresses, belonging to six IP prefixes, that were reported to be acting maliciously or to be used by DNS changer malware.
• We scanned each IP in these prefixes and queried it for an A record for ad.doubleclick.net.
• We used the Hurricane Electric BGP Toolkit [link] to find the owners of the malicious IPs.
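The prefix scan described above could look roughly like the following standard-library sketch. It hand-builds a DNS A query over UDP and collects whatever addresses each candidate IP answers with. The function names are hypothetical, the prefix is supplied by the caller, and the authors' actual scanner is not described in the slides.

```python
import ipaddress
import socket
import struct
from typing import Iterator, List, Optional, Tuple


def build_a_query(name: str, query_id: int = 0x1234) -> bytes:
    """Encode a recursive A/IN query for `name`."""
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)  # RD flag set
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN


def _skip_name(msg: bytes, offset: int) -> int:
    """Return the offset just past a (possibly compressed) domain name."""
    while True:
        length = msg[offset]
        if (length & 0xC0) == 0xC0:  # compression pointer, two bytes long
            return offset + 2
        if length == 0:              # root label ends the name
            return offset + 1
        offset += 1 + length


def parse_a_records(msg: bytes) -> List[str]:
    """Extract the IPv4 addresses from the answer section of a reply."""
    _, _, qdcount, ancount, _, _ = struct.unpack("!HHHHHH", msg[:12])
    offset = 12
    for _ in range(qdcount):
        offset = _skip_name(msg, offset) + 4  # skip QTYPE and QCLASS
    addresses = []
    for _ in range(ancount):
        offset = _skip_name(msg, offset)
        rtype, _rclass, _ttl, rdlength = struct.unpack(
            "!HHIH", msg[offset:offset + 10])
        offset += 10
        if rtype == 1 and rdlength == 4:
            addresses.append(socket.inet_ntoa(msg[offset:offset + 4]))
        offset += rdlength
    return addresses


def query_resolver(resolver_ip: str, name: str = "ad.doubleclick.net",
                   timeout: float = 2.0) -> Optional[List[str]]:
    """Ask one candidate resolver; None means no usable reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_a_query(name), (resolver_ip, 53))
            reply, _ = sock.recvfrom(4096)
            return parse_a_records(reply)
        except (OSError, struct.error, IndexError):
            return None


def scan_prefix(prefix: str) -> Iterator[Tuple[str, List[str]]]:
    """Yield (ip, answers) for every address in `prefix` that replies."""
    for ip in ipaddress.ip_network(prefix).hosts():
        answers = query_resolver(str(ip))
        if answers is not None:
            yield str(ip), answers
```

The answers still have to be compared against good resolutions (for example with Heuristics 1 and 2) before a responding IP is labeled a malicious resolver.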
Behavior seen at .com/.net
• We examined the behavior of malicious resolvers in the query traffic seen at Verisign's .com and .net DNS Top Level Domain (TLD) infrastructure, and at its instances of the global DNS root zone.
  • Data collected: October 20th, 2011
• None of the known malicious resolvers sent any queries to the TLD servers.
  • => 13 DNS forwarders
  • None queried for ad.doubleclick.net.
Malicious Websites
• We found a total of 42 front-end websites and 43 fake search engines during our experiments.
• To expose more malicious websites:
  • We took known IP addresses from good resolutions of known malicious websites and found the host names they corresponded to.
  • We then tested whether these host names were mis-resolved.
  • If a host name was mis-resolved => malicious
    • 263 front-end websites
    • 160 fake search engines
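The expansion step can be sketched as follows. The slides do not say how host names were mapped back from IPs or how "mis-resolved" was decided, so this sketch assumes an IP-to-host-names table is available and treats a host as mis-resolved when the suspect resolver's answer shares no address with the good one. All names are hypothetical.

```python
from typing import Callable, Dict, Iterable, Set


def expand_malicious_sites(
    seed_ips: Iterable[str],
    hosts_by_ip: Dict[str, Set[str]],
    resolve_good: Callable[[str], Set[str]],
    resolve_suspect: Callable[[str], Set[str]],
) -> Set[str]:
    """Find further host names that the suspect resolver mis-resolves."""
    # Step 1: collect host names that share an IP with known malicious sites.
    candidates: Set[str] = set()
    for ip in seed_ips:
        candidates |= hosts_by_ip.get(ip, set())

    # Step 2: flag candidates whose suspect answer is disjoint from the good one.
    flagged = set()
    for host in candidates:
        good = resolve_good(host)
        suspect = resolve_suspect(host)
        if suspect and not (suspect & good):
            flagged.add(host)
    return flagged
```

Passing the two resolvers as callables keeps the logic independent of how the lookups are actually performed.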
Valid Resolutions of Malicious Websites
Malicious IP Addresses
• In our investigations:
  • 15 malicious IP addresses were used to mis-resolve various ad hosts and search engine host names.
  • 2 malicious IP addresses were form click IPs used to simulate form clicks on attackers' front-end sites.
• Using the data set of HTTP transactions, we searched for host names corresponding to the 17 known malicious IP addresses.
  • => 30 malicious IP addresses
Summary of all malicious IP addresses found
Impact of the Ad Fraud Scheme
• We placed a network monitor on a Broadband Remote Access Server (BRAS).
  • An aggregation point for Digital Subscriber Lines (DSLs) for a large Tier 1 ISP's customers
  • => 17,000 active broadband subscribers (U.S.)
  • 2/15/2011
Impact of the Ad Fraud Scheme
• 257 legitimate content publishers lost revenue
• 21 different ad hosts (20 ad networks) lost revenue
2,334 calls to abc.js
Estimating the Impact of the Ad Fraud Scheme
• 86 million subscription lines in the U.S.
  • => 186,574 infected lines
• 540 million subscription lines worldwide
  • => 1,176,795 infected lines
• 1 line -> 3 computers
  • => 3.53 million infected computers
  • Similar to the FBI's figure of 4 million infected computers
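The extrapolation can be checked with a few lines of arithmetic. The per-line infection rate is not stated on the slide, so the rates below are only the ones implied by the slide's own figures; the variable names are illustrative.

```python
# Figures taken from the slide.
US_LINES = 86_000_000
US_INFECTED_LINES = 186_574
WORLD_LINES = 540_000_000
WORLD_INFECTED_LINES = 1_176_795
COMPUTERS_PER_LINE = 3

# Infection rates implied by the slide's numbers (not stated on the slide).
us_rate = US_INFECTED_LINES / US_LINES
world_rate = WORLD_INFECTED_LINES / WORLD_LINES

# Worldwide infected computers under the 3-computers-per-line assumption.
infected_computers = WORLD_INFECTED_LINES * COMPUTERS_PER_LINE

print(f"implied U.S. rate:      {us_rate:.3%}")
print(f"implied worldwide rate: {world_rate:.3%}")
print(f"infected computers:     {infected_computers:,}")
```

Both implied rates come out at roughly 0.22% of lines, and 1,176,795 × 3 = 3,530,385, which rounds to the 3.53 million on the slide. The small difference between the two rates is presumably due to the line counts being rounded on the slide.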
Potential Mitigation Strategies
• Serving bluff ads
• Finding fake publisher websites
• Using HTTP with integrity
• Monitoring and scrutinizing unexpected DNS resolvers
• Identifying accounting discrepancies
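As one illustration of the resolver-monitoring strategy, an operator could compare the DNS servers that clients actually contact against the resolvers they are expected to use. The sketch below is an assumption about how such a check might look, not something described in the slides; the function name and the example networks (documentation-only ranges) are hypothetical.

```python
import ipaddress
from typing import Iterable, List


def unexpected_resolvers(
    observed: Iterable[str],
    expected_networks: Iterable[str],
) -> List[str]:
    """Return observed DNS server IPs that fall outside the expected networks."""
    networks = [ipaddress.ip_network(net) for net in expected_networks]
    flagged = {
        ip
        for ip in observed
        if not any(ipaddress.ip_address(ip) in net for net in networks)
    }
    return sorted(flagged)


# Hypothetical example: destinations of port-53 traffic from one subscriber.
observed = ["192.0.2.53", "203.0.113.5", "192.0.2.54", "203.0.113.5"]
expected = ["192.0.2.0/24"]  # e.g. the ISP's own resolvers
flagged = unexpected_resolvers(observed, expected)
```

A flagged address is only a lead for further scrutiny, since many users deliberately configure third-party public resolvers.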
Related Work
• Clickbots
  • Reverse engineered clickbots
    • Clickbot.A -- Neil Daswani et al. (HotBots'07)
    • Fiesta and 7cy -- Brad Miller et al. (DIMVA'11)
• Human clickers
  • Qing Zhang et al. (WebQuality'11)
• Inflight modification
  • Chao Zhang et al. (LEET 2011)
• Lying DNS resolver
  • David Dagon et al. (NDSS 2008)
    • Examining open resolvers of entire IPv4
• Unusual DNS resolver
  • Bojan Zdrnja et al. (DIMVA'07)
Q & A