Examining the Landscape and Impact of Android App Plagiarism
Hao Chen, Clint Gibler, Ryan Stevens, Jonathan Crussell, Hui Zang, Heesook Choi

Smartphones Abound

Plagiarism Harms the App Ecosystem

• Developers
  – Lose revenue and incentive to make apps
• Markets
  – Polluted search results
• Users
  – Difficult to find useful, high-quality apps

Investigation Goals

• Characteristics of cloned apps
  – Market
  – App category
  – Ad provider
• Impact on developers
  – Ad revenue
  – User base

Definitions

• Cloning
  – Apps with significant code sharing
• Plagiarism
  – Cloned apps by different authors
• Owner
  – Signed and uploaded a given app
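
The three definitions can be read as simple predicates over pairs of apps. The sketch below is only an illustration: the `App` record, the use of Jaccard similarity over a set of code features, and the 0.7 threshold are all assumptions made for this example, not the clone detection method the slides cite.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class App:
    name: str
    owner: str            # hypothetical label for whoever signed and uploaded the app
    features: frozenset   # hypothetical stand-in for the app's code features


# Assumed cutoff for "significant code sharing"; the slides give no number.
SIMILARITY_THRESHOLD = 0.7


def jaccard(a, b):
    """Share of features common to both sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def is_clone(x, y):
    """Cloning: two apps with significant code sharing."""
    return jaccard(x.features, y.features) >= SIMILARITY_THRESHOLD


def is_plagiarism(x, y):
    """Plagiarism: cloned apps whose owners differ."""
    return is_clone(x, y) and x.owner != y.owner


original = App("game", "alice", frozenset({1, 2, 3, 4}))
copy = App("game-free", "bob", frozenset({1, 2, 3, 4, 5}))
lite = App("game-lite", "alice", frozenset({1, 2, 3, 4}))
```

Here `original` and `copy` are clones with different owners, so the pair counts as plagiarism; `original` and `lite` are clones by the same owner, so the pair does not.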

Dataset – Android Apps

• 265,000 apps from 17 markets
  – 9 English
  – 6 Chinese
  – 2 Russian

[Figure labels: Apps, Play]

Dataset – Clone Clusters

• 265,000 apps from 17 markets
• [Crussell ESORICS 2013]
• >5,000 clusters of similar apps
• >44,000 unique apps

Likely clones

Characteristics of Cloned Apps

Cloning between Markets

[Figure: cloning between markets; labels include "play" and "androidonline"]

How do Plagiarized Apps Impact Developers?

Determining Impact

• Naïve approach (rejected)
  – How many times has this app been cloned?
• Our approach
  – How many users run plagiarized apps instead of the original?

Determining Impact

• Measuring users running a given app
• Determining app ownership
• Identifying original app from plagiarized

(we’re not Google)

So… what apps are you running?

Advertising Background

[Diagram: the app's ad library sends an ad request carrying Client ID = “bob” to the ad server, which returns an ad URL]

Number of Users Running an App

“bob” Aha! Bob’s app is being run.
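
As a rough illustration of the idea on this slide, the sketch below tallies ad requests by the client ID they carry. The `client_id` query parameter, the `ads.example.com` host, and the sample URLs are all hypothetical; each real ad library has its own request format, and the slides do not say how the IDs were extracted.

```python
from collections import Counter
from urllib.parse import urlparse, parse_qs

# Hypothetical parameter name; real ad libraries use their own.
CLIENT_ID_PARAM = "client_id"


def count_ad_requests(urls):
    """Count observed ad requests per client ID found in the URL query string."""
    counts = Counter()
    for url in urls:
        params = parse_qs(urlparse(url).query)
        for client_id in params.get(CLIENT_ID_PARAM, []):
            counts[client_id] += 1
    return counts


# Made-up request URLs, for illustration only.
sample = [
    "http://ads.example.com/request?client_id=bob&size=320x50",
    "http://ads.example.com/request?client_id=bob",
    "http://ads.example.com/request?client_id=alice",
    "http://example.com/index.html",  # not an ad request: no client ID
]
counts = count_ad_requests(sample)
```

Each request that carries "bob" is evidence that one of Bob's apps is running, so the per-ID counts act as a proxy for how heavily each developer's apps are used.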

Dataset – Network Traffic

• Major U.S. Cellular Provider
• 2.6 billion packets in 12 days
• All user-identifying info removed

Determine Ownership of Apps

• Owners may have multiple dev accounts
  – Within one or on multiple markets
• Apps that share an owner should not be considered plagiarized

Determine Ownership of Apps

• Phase 1 – Market/Dev Account
• Phase 2 – Signature
• Phase 3 – Client IDs
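
The slides name the three phases but the surviving text does not spell out their mechanics. One plausible reading is that each phase merges apps that share an attribute (first market and developer account, then signing signature, then ad client ID) into the same owner group. The sketch below implements that reading with a disjoint-set structure; the record layout and sample values are hypothetical.

```python
from collections import defaultdict


class DisjointSet:
    """Minimal union-find with path halving."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_b] = root_a


def group_by_owner(apps):
    """Merge apps that share an account, a signature, or a client ID.

    apps maps an app name to a dict with the (hypothetical) keys
    'account', 'signature' and 'client_id'; a value of None is ignored.
    """
    ds = DisjointSet()
    for key in ("account", "signature", "client_id"):  # phases 1, 2, 3
        first_seen = {}
        for name, attrs in apps.items():
            ds.find(name)  # register the app even if it never merges
            value = attrs.get(key)
            if value is None:
                continue
            if value in first_seen:
                ds.union(first_seen[value], name)
            else:
                first_seen[value] = name
    groups = defaultdict(set)
    for name in apps:
        groups[ds.find(name)].add(name)
    return sorted(sorted(group) for group in groups.values())


# Made-up records for illustration only.
apps = {
    "a": {"account": ("play", "dev1"), "signature": "S1", "client_id": "bob"},
    "b": {"account": ("play", "dev1"), "signature": "S2", "client_id": None},
    "c": {"account": ("other", "dev9"), "signature": "S2", "client_id": "carol"},
    "d": {"account": ("other", "dev7"), "signature": "S3", "client_id": "bob"},
    "e": {"account": ("other", "dev5"), "signature": "S4", "client_id": "eve"},
}
owner_groups = group_by_owner(apps)
```

In this sample, "a" and "b" merge through the shared account, "b" and "c" through the shared signature, and "a" and "d" through the shared client ID, leaving "e" on its own. Apps within one group would then not be counted as plagiarizing each other.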

Identifying Original Apps: Naïve Approaches

• Date first uploaded to the market
• Popularity
  – Installs
  – Rating
• Code size

Determining Original vs Clones

• Goal: give lower bound

An Example Cluster

• Alice: 20 impressions (20%)
• Bob: 30 impressions (30%)
• Charlie: 50 impressions (50%)

Determining Original vs Clones

• Goal: give lower bound

[Chart: Impressions for Alice, Bob, Charlie; segments 50%, 20%, 30%]

[Chart: Estimated Loss; segments 50% and 50%]

[Chart: Real Loss; segments 70% and 30%]
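
The chart values are consistent with one simple reading, sketched below: treat the app with the most impressions as the original, so that the estimated loss is the share of impressions going to everyone else. This is an assumption about how the estimate is formed, not something the slide text states, and the function names are made up for the example. It uses the impression counts from the example cluster.

```python
def loss_share(impressions, original):
    """Fraction of a cluster's impressions that went to apps other than `original`."""
    total = sum(impressions.values())
    return (total - impressions[original]) / total


def estimated_loss(impressions):
    """Lower-bound estimate: assume the app with the most impressions is the original."""
    assumed_original = max(impressions, key=impressions.get)
    return loss_share(impressions, assumed_original)


# Impression counts from the example cluster.
cluster = {"Alice": 20, "Bob": 30, "Charlie": 50}

estimate = estimated_loss(cluster)      # Charlie assumed original
if_bob = loss_share(cluster, "Bob")     # loss if Bob were the true original
```

Under this reading the estimate is 50%, while the loss would be 70% if Bob were the true original. Picking the largest share as the original can never overstate the loss, which is what makes the estimate a lower bound.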

Percent Revenue/Users Lost

Suggestions for Reducing Cloning

• Developers
  – ProGuard, License Verification Library (LVL)
• Markets
  – Use tools to detect cloned apps
  – Adjust market registration fee
• Ad providers
  – Vet developers

Conclusion

• First large-scale study on impact of Android application plagiarism
• Combine
  – Static analysis for clone detection
  – Network analysis for revenue loss measurement
  – Use client IDs to link both analyses
• Coming soon: sherlockdroid.com
