40
Living in the (malicious) social web: Beyond friendships Yazan Boshmaf, Konstantin Beznosov, Matei Ripeanu, Dionysions Logothetis, Georgios Siganos, Jose Lorenzo

Living in the (malicious) social web: Beyond friendships

Embed Size (px)

Citation preview

Page 1: Living in the (malicious) social web: Beyond friendships

Security Analysis of Malicious Socialbots on the Web

Yazan Boshmaf

Dissertation presented in partial fulfillment of degree requirements of PhD in ECE, UBC 1

Living in the (malicious) social web: Beyond friendships

Yazan Boshmaf, Konstantin Beznosov, Matei Ripeanu,

Dionysions Logothetis, Georgios Siganos, Jose Lorenzo

Page 2: Living in the (malicious) social web: Beyond friendships

2

Social bots

Hwang et al. Socialbots: Voices from the fronts. ACM Interactions 19, 2 (March 2012), 38-45.

+ =

Automated fake accounts in online social networks (OSNs)

Designed to deceive and appear human

Page 3: Living in the (malicious) social web: Beyond friendships

3

The threat of malicious social bots

Hwang et al. Socialbots: Voices from the fronts. ACM Interactions 19, 2 (March 2012), 38-45.

+ =

Automated fake accounts in online social networks (OSNs)

Designed to deceive and appear human

What is at stake?

Page 4: Living in the (malicious) social web: Beyond friendships

4

Fake accounts are bad for business

“… If advertisers, developers, or investors do not perceive our user metrics to be accurate representations of our user base, or if we discover material inaccuracies in our user metrics, our reputation may be harmed and advertisers and developers may be less willing to allocate their budgets or resources to Facebook, which could negatively affect our business and financial results…”

Page 5: Living in the (malicious) social web: Beyond friendships

5

Fake accounts are bad for users

OSNs are attractive medium for abusive users

Social Infiltration

Connecting with many benign users (friend request spam)

Bilge et al. All your contacts are belong to us: Automated identity theft attacks on social networks. Proc. of WWW, 2009

Page 6: Living in the (malicious) social web: Beyond friendships

6

Fake accounts are bad for users

OSNs are attractive medium for abusive users

Data collectionSocial Infiltration

Online surveillance, profiling, and data commoditization

Nolan et al. Hacking human: Data-archaeology and surveillance in social networks. ACM SIGGROUP Bulletin 25.2, 2005

Page 7: Living in the (malicious) social web: Beyond friendships

7

Fake accounts are bad for users

OSNs are attractive medium for abusive users

MisinformationData collectionSocial Infiltration

Influencing users, biasing public opinion, propaganda

Ratkiewicz et al. Detecting and tracking political abuse in social media. Proc. of ICWSM. 2011

Page 8: Living in the (malicious) social web: Beyond friendships

8

Fake accounts are bad for users

OSNs are attractive medium for abusive users

MisinformationData collection Malware InfectionSocial Infiltration

Infecting computers and use it for DDoS, spamming, and fraud

Thomas et al. The Koobface botnet and the rise of social malware. Proc. of MALWARE, 2010

Page 9: Living in the (malicious) social web: Beyond friendships

9

Fake accounts are bad for users

OSNs are attractive medium for abusive content

MisinformationData collection Malware InfectionSocial Infiltration

Infecting computers and use it for DDoS, spamming, and

fraud1

1 Thomas et al. The Koobface botnet and the rise of social malware. Proc. of MALWARE, 2010.

Countermeasure designThreat characterization

Our work

Page 10: Living in the (malicious) social web: Beyond friendships

Questions

10

• Vulnerability analysis

• Characterization of user behavior

How vulnerable are OSNs to social infiltration?

•Quantification of privacy breaches

•Effectiveness of security defenses

What are the security and privacy implications of

social infiltration? •Scalability from economic context

•Profit-maximizing infiltration strategy

What is the economic rationale behind

infiltrating OSNs at scale?

•Victim prediction for robust detection

•Framework for evaluation

How can OSNs detect fakes or social bots that

infiltrate on a large scale?

11

12

13

14

Page 11: Living in the (malicious) social web: Beyond friendships

Questions

11

• Vulnerability analysis

• Characterization of user behavior

How vulnerable are OSNs to social infiltration?

• Quantifying privacy breaches

• Effectiveness of security defenses

What are the security and privacy implications of

social infiltration? •Scalability from economic context

•Profit-maximizing infiltration strategy

What is the economic rationale behind

infiltrating OSNs at scale?

•Victim prediction for robust detection

•Framework for evaluation

How can OSNs detect fakes or social bots that

infiltrate on a large scale?

11

12

13

14

Page 12: Living in the (malicious) social web: Beyond friendships

Questions

12

• Vulnerability analysis

• Characterization of user behavior

How vulnerable are OSNs to social infiltration?

• Quantifying privacy breaches

• Effectiveness of security defenses

What are the security and privacy implications of

social infiltration? • Scalability in economic context

• Profit-maximizing infiltration strategy

What is the economic rationale behind

infiltrating OSNs at scale?

•Victim prediction for robust detection

•Framework for evaluation

How can OSNs detect fakes or social bots that

infiltrate on a large scale?

11

12

13

14

Page 13: Living in the (malicious) social web: Beyond friendships

Questions

13

•Vulnerability analysis of OSN platforms

•Characterization of user behavior

How vulnerable are OSNs to social infiltration?

•Quantification of privacy breaches

•Effectiveness of security defenses

What are the security and privacy implications of

social infiltration? •Scalability from economic context

•Profit-maximizing infiltration strategy

What is the economic rationale behind

infiltrating OSNs at scale?

•Is victim prediction feasible

•Can victim prediction enable robust detection

How to detect social bots that infiltrate on a large

scale?

Threat

Characterization

Countermeasure

Design

11

12

13

14

Page 14: Living in the (malicious) social web: Beyond friendships

Attack side: Social infiltration in OSNs

14

•Vulnerability analysis of OSN platforms

•Characterization of user behavior

How vulnerable are OSNs to social infiltration?

•Quantification of privacy breaches

•Effectiveness of security defenses

What are the security and privacy implications of

social infiltration? •Scalability from economic context

•Profit-maximizing infiltration strategy

What is the economic rationale behind

infiltrating OSNs at scale?

•Victim prediction for robust detection

•Framework for evaluation

How can OSNs detect fakes or social bots that

infiltrate on a large scale?

Threat

Characterization

11

12

13

14

1 The socialbot network: When bots socialize for fame and money, Boshmaf, Beznosov, Ripeanu, ACSAC, Dec 20112 Key challenges in defending against malicious socialbots, Boshmaf, Beznosov, Ripeanu, USENIX LEET, April 20123 Design and analysis of a social botnet, Boshmaf, Beznosov, Ripeanu, J. Comp. Net., 57(2), Feb 2013

Page 15: Living in the (malicious) social web: Beyond friendships

Social botnet: Experiment

Operated 100 socialbots on Facebook, single botmaster

15

Bots sent 9.6K friend requests send in 8 weeks,

35.7% requests from bots accepted (victims)

Page 16: Living in the (malicious) social web: Beyond friendships

Main findings

16

•Vulnerability analysis of OSN platforms

•Characterization of user behavior

How vulnerable are OSNs to social infiltration?

•Effectiveness of security defenses

•Quantification of privacy breaches

What are the security and privacy implications of

social infiltration? •Scalability from economic context

•Profit-maximizing infiltration strategy

What is the economic rational behind infiltration

OSNs at scale?

•Systematic evaluation

•Robust detection technique

How can OSNs detect fakes or social bots that

infiltrate on a large scale?

Threat

Characterization

11

12

13

14

(Platform-level vulnerability)

It is feasible to automate social infiltration by exploiting platform and user vulnerabilities

Page 17: Living in the (malicious) social web: Beyond friendships

Main findings

17

•Vulnerability analysis of OSN platforms

•Characterization of user behavior

How vulnerable are OSNs to social infiltration?

•Effectiveness of security defenses

•Quantification of privacy breaches

What are the security and privacy implications of

social infiltration? •Scalability from economic context

•Profit-maximizing infiltration strategy

What is the economic rationale behind

infiltration OSNs at scale?

•Systematic evaluation

•Robust detection technique

How can OSNs detect fakes or social bots that

infiltrate on a large scale?

Threat

Characterization

11

12

13

14

(Data breaches)

Social infiltration results in serious privacy breaches, where personally identifiable information is compromised

Page 18: Living in the (malicious) social web: Beyond friendships

Victims are highly affected

18

0

10

20

30

40

50

IM account ID Postal address Phone number E-mail address Number'of'users'(thousands)'

Before'

AEer'

Direct (%) Extended(%)

ProfileInfo Before After Before After

BirthDate 3.5 72.4 4.5 53.8

Email Address 2.4 71.8 2.6 4.1

Gender 69.1 69.2 84.2 84.2

HomeCity 26.5 46.2 29.2 45.2

Current City 25.4 42.9 27.8 41.6

PhoneNumber 0.9 21.1 1.0 1.5

School Name 10.8 19.7 12.0 20.4

Postal Address 0.9 19.0 0.7 1.3

IM Account ID 0.6 10.9 0.5 0.8

MarriedTo 2.9 6.4 3.9 4.9

WorkedAt 2.8 4.0 2.8 3.2

Average 13.3 34.9 15.4 23.7

Table2.3: Percentageof userswithaccessibleprivatedata

2.4.7 InfiltrationPerformance

Thesocialbots infiltratedFacebook over 55daysstartingJanuary 28, 2011. Dur-

ingthistime, thebotsestablished3,439friendshipswithvictimusers, whereeach

friendship or attackedgeconnectsexactly onevictimtoasocialbot, asshownin

Figure2.7a. Thefigurealso illustratestheeffect of triadic closure. Inparticular,

theinfiltration rapidly increasedafter thefirst 2weeks, whereeachsocialbot had

at least onefriendincommonwiththeuser towhichit sent afriendrequest.

Another observation of this study is that attack edges aregenerally easy to

establish. AsshowninFigure2.7b, anattacker canestablishenoughattack edges

suchthat fakeaccounts,whicharecontrolledbysocialbots, arestronglyconnected

to real accounts. Thisobservation hasserious implications to graph-based fake

account detectionmechanismsusedby defensesystemssuchasEigenTrust [38],

SybilLimit [85], SybilInfer [11], Mislove’smethod[76], andGateKeeper [73]. In

particular, thesesystemsassumethat fakescanestablish only asmall number of

attack edges, at most oneper fake[84], so that thecut whichcrossesover attack

edgesissparse.17 Accordingly, thesystemsattempt tofindsuchasparsecut with

17A cut isapartitionof thenodesof agraphintotwodisjoint subsets. Visually, it isalinethat

cutsthroughor crossesover aset of edgesinthegraph.

36

(a) (b)

Figure 2.7: Users with accessible private data

Collected Data

The socialbots harvested large amounts of user data through the col l ect master

command. By the end of the 8th week, the SbN resulted in a total of 250GB in-

bound and 3GB outbound network trafficbetween our machineand Facebook. We

were able to collect news feeds, user profile information, and wall posts. In other

words, practically everything shared on Facebook by the victims, which could be

used for large-scale user surveillance [91]. Even though adversarial surveillance,

such asonlineprofiling [42], isaseriousthreat to user privacy, wedecided to focus

on user data that havemonetary value at underground markets, such aspersonally

identifiable information (PII).

After excluding remaining friendships among the bots, the size of their direct

neighborhoodswas3,439 users. Thesizeof all extended neighborhoods, however,

wasaslargeas1,085,785 users. In Figure2.7a, wecomparedatarevelation before

and after operating theSbN in termsof percentage of userswith accessible private

data, PII in particular. On average, 2.62 times more PII was exposed in direct

neighborhoods after infiltration, and 1.54 times more in extended neighborhoods.

37

2.62 times more private data collected after infiltration

Page 19: Living in the (malicious) social web: Beyond friendships

Friends of victims are affected too

19

0

10

20

30

40

50

IM account ID Postal address Phone number E-mail address Number'of'users'(thousands)'

Before'

AEer'

Direct (%) Extended(%)

ProfileInfo Before After Before After

BirthDate 3.5 72.4 4.5 53.8

Email Address 2.4 71.8 2.6 4.1

Gender 69.1 69.2 84.2 84.2

HomeCity 26.5 46.2 29.2 45.2

Current City 25.4 42.9 27.8 41.6

PhoneNumber 0.9 21.1 1.0 1.5

School Name 10.8 19.7 12.0 20.4

Postal Address 0.9 19.0 0.7 1.3

IM Account ID 0.6 10.9 0.5 0.8

MarriedTo 2.9 6.4 3.9 4.9

WorkedAt 2.8 4.0 2.8 3.2

Average 13.3 34.9 15.4 23.7

Table2.3: Percentageof userswithaccessibleprivatedata

2.4.7 InfiltrationPerformance

Thesocialbots infiltratedFacebook over 55daysstartingJanuary 28, 2011. Dur-

ingthistime, thebotsestablished3,439friendshipswithvictimusers, whereeach

friendship or attackedgeconnectsexactly onevictimtoasocialbot, asshownin

Figure2.7a. Thefigurealso illustratestheeffect of triadic closure. Inparticular,

theinfiltration rapidly increasedafter thefirst 2weeks, whereeachsocialbot had

at least onefriendincommonwiththeuser towhichit sent afriendrequest.

Another observation of this study is that attack edges aregenerally easy to

establish. AsshowninFigure2.7b, anattacker canestablishenoughattack edges

suchthat fakeaccounts,whicharecontrolledbysocialbots, arestronglyconnected

to real accounts. Thisobservation hasserious implications to graph-based fake

account detectionmechanismsusedby defensesystemssuchasEigenTrust [38],

SybilLimit [85], SybilInfer [11], Mislove’smethod[76], andGateKeeper [73]. In

particular, thesesystemsassumethat fakescanestablish only asmall number of

attack edges, at most oneper fake[84], so that thecut whichcrossesover attack

edgesissparse.17 Accordingly, thesystemsattempt tofindsuchasparsecut with

17A cut isapartitionof thenodesof agraphintotwodisjoint subsets. Visually, it isalinethat

cutsthroughor crossesover aset of edgesinthegraph.

36

(a) (b)

Figure 2.7: Users with accessible private data

Collected Data

The socialbots harvested large amounts of user data through the col l ect master

command. By the end of the 8th week, the SbN resulted in a total of 250GB in-

bound and 3GB outbound network trafficbetween our machineand Facebook. We

were able to collect news feeds, user profile information, and wall posts. In other

words, practically everything shared on Facebook by the victims, which could be

used for large-scale user surveillance [91]. Even though adversarial surveillance,

such asonlineprofiling [42], isaseriousthreat to user privacy, wedecided to focus

on user data that havemonetary value at underground markets, such aspersonally

identifiable information (PII).

After excluding remaining friendships among the bots, the size of their direct

neighborhoodswas3,439 users. Thesizeof all extended neighborhoods, however,

wasaslargeas1,085,785 users. In Figure2.7a, wecomparedatarevelation before

and after operating theSbN in termsof percentage of userswith accessible private

data, PII in particular. On average, 2.62 times more PII was exposed in direct

neighborhoods after infiltration, and 1.54 times more in extended neighborhoods.

37

1.54 times more, with more than 1 million affected users

Page 20: Living in the (malicious) social web: Beyond friendships

Friends of victims are affected too

20

0

10

20

30

40

50

IM account ID Postal address Phone number E-mail address Number'of'users'(thousands)'

Before'

AEer'

Direct (%) Extended(%)

ProfileInfo Before After Before After

BirthDate 3.5 72.4 4.5 53.8

Email Address 2.4 71.8 2.6 4.1

Gender 69.1 69.2 84.2 84.2

HomeCity 26.5 46.2 29.2 45.2

Current City 25.4 42.9 27.8 41.6

PhoneNumber 0.9 21.1 1.0 1.5

School Name 10.8 19.7 12.0 20.4

Postal Address 0.9 19.0 0.7 1.3

IM Account ID 0.6 10.9 0.5 0.8

MarriedTo 2.9 6.4 3.9 4.9

WorkedAt 2.8 4.0 2.8 3.2

Average 13.3 34.9 15.4 23.7

Table2.3: Percentageof userswithaccessibleprivatedata

2.4.7 InfiltrationPerformance

Thesocialbots infiltratedFacebook over 55daysstartingJanuary 28, 2011. Dur-

ingthistime, thebotsestablished3,439friendshipswithvictimusers, whereeach

friendship or attackedgeconnectsexactly onevictimtoasocialbot, asshownin

Figure2.7a. Thefigurealso illustratestheeffect of triadic closure. Inparticular,

theinfiltration rapidly increasedafter thefirst 2weeks, whereeachsocialbot had

at least onefriendincommonwiththeuser towhichit sent afriendrequest.

Another observation of this study is that attack edges aregenerally easy to

establish. AsshowninFigure2.7b, anattacker canestablishenoughattack edges

suchthat fakeaccounts,whicharecontrolledbysocialbots, arestronglyconnected

to real accounts. Thisobservation hasserious implications to graph-based fake

account detectionmechanismsusedby defensesystemssuchasEigenTrust [38],

SybilLimit [85], SybilInfer [11], Mislove’smethod[76], andGateKeeper [73]. In

particular, thesesystemsassumethat fakescanestablish only asmall number of

attack edges, at most oneper fake[84], so that thecut whichcrossesover attack

edgesissparse.17 Accordingly, thesystemsattempt tofindsuchasparsecut with

17A cut isapartitionof thenodesof agraphintotwodisjoint subsets. Visually, it isalinethat

cutsthroughor crossesover aset of edgesinthegraph.

36

(a) (b)

Figure 2.7: Users with accessible private data

Collected Data

The socialbots harvested large amounts of user data through the col l ect master

command. By the end of the 8th week, the SbN resulted in a total of 250GB in-

bound and 3GB outbound network trafficbetween our machineand Facebook. We

were able to collect news feeds, user profile information, and wall posts. In other

words, practically everything shared on Facebook by the victims, which could be

used for large-scale user surveillance [91]. Even though adversarial surveillance,

such asonlineprofiling [42], isaseriousthreat to user privacy, wedecided to focus

on user data that havemonetary value at underground markets, such aspersonally

identifiable information (PII).

After excluding remaining friendships among the bots, the size of their direct

neighborhoodswas3,439 users. Thesizeof all extended neighborhoods, however,

wasaslargeas1,085,785 users. In Figure2.7a, wecomparedatarevelation before

and after operating theSbN in termsof percentage of userswith accessible private

data, PII in particular. On average, 2.62 times more PII was exposed in direct

neighborhoods after infiltration, and 1.54 times more in extended neighborhoods.

37

1.54 times more, with more than 1 million affected users

From 49K birthdates to 584K

Acquisti et al. Predicting social security numbers from public data. Proc. Of Nat. Acad. of Sc. 106(27), 2009

Page 21: Living in the (malicious) social web: Beyond friendships

Vulnerabilities exploited to automate infiltration

21

Large scale network crawls Exploitable platforms and APIs

Fake accounts and profilesIneffective abuse mitigation

(User behavior characterization)

Some users are more susceptible to social infiltration,which partly depends on factors related to their social structure

Page 22: Living in the (malicious) social web: Beyond friendships

User susceptibility to become a victim correlates with social structure

More friends, more susceptible to infiltration

More mutual friends, more susceptible to infiltration

0

10

20

30

40

50

60

70

80

Accep

tance'rate'(%

)'

Number'of'friends'

Pearson’s r = 0.85 Pearson’s r = 0.85

10

20

30

40

50

60

70

80

90

0 1 2 3 4 5 6 7 8 9 10 ≥11

Accep

tance'rate'(%

)'Number'of'mutual'friends'

Without mutual friends

22

60%

80%

20%

Page 23: Living in the (malicious) social web: Beyond friendships

23

Fake accounts mimic real accounts

All manually flagged by concerned users

Only 20% of fakes were “detected”

Page 24: Living in the (malicious) social web: Beyond friendships

Friends of victims are affected too

24

0

10

20

30

40

50

IM account ID Postal address Phone number E-mail address Number'of'users'(thousands)'

Before'

AEer'

Direct (%) Extended(%)

ProfileInfo Before After Before After

BirthDate 3.5 72.4 4.5 53.8

Email Address 2.4 71.8 2.6 4.1

Gender 69.1 69.2 84.2 84.2

HomeCity 26.5 46.2 29.2 45.2

Current City 25.4 42.9 27.8 41.6

PhoneNumber 0.9 21.1 1.0 1.5

School Name 10.8 19.7 12.0 20.4

Postal Address 0.9 19.0 0.7 1.3

IM Account ID 0.6 10.9 0.5 0.8

MarriedTo 2.9 6.4 3.9 4.9

WorkedAt 2.8 4.0 2.8 3.2

Average 13.3 34.9 15.4 23.7

Table2.3: Percentageof userswithaccessibleprivatedata

2.4.7 InfiltrationPerformance

Thesocialbots infiltratedFacebook over 55daysstartingJanuary 28, 2011. Dur-

ingthistime, thebotsestablished3,439friendshipswithvictimusers, whereeach

friendship or attackedgeconnectsexactly onevictimtoasocialbot, asshownin

Figure2.7a. Thefigurealso illustratestheeffect of triadic closure. Inparticular,

theinfiltration rapidly increasedafter thefirst 2weeks, whereeachsocialbot had

at least onefriendincommonwiththeuser towhichit sent afriendrequest.

Another observation of this study is that attack edges aregenerally easy to

establish. AsshowninFigure2.7b, anattacker canestablishenoughattack edges

suchthat fakeaccounts,whicharecontrolledbysocialbots, arestronglyconnected

to real accounts. Thisobservation hasserious implications to graph-based fake

account detectionmechanismsusedby defensesystemssuchasEigenTrust [38],

SybilLimit [85], SybilInfer [11], Mislove’smethod[76], andGateKeeper [73]. In

particular, thesesystemsassumethat fakescanestablish only asmall number of

attack edges, at most oneper fake[84], so that thecut whichcrossesover attack

edgesissparse.17 Accordingly, thesystemsattempt tofindsuchasparsecut with

17A cut isapartitionof thenodesof agraphintotwodisjoint subsets. Visually, it isalinethat

cutsthroughor crossesover aset of edgesinthegraph.

36

(a) (b)

Figure 2.7: Users with accessible private data

Collected Data

The socialbots harvested large amounts of user data through the col l ect master

command. By the end of the 8th week, the SbN resulted in a total of 250GB in-

bound and 3GB outbound network trafficbetween our machineand Facebook. We

were able to collect news feeds, user profile information, and wall posts. In other

words, practically everything shared on Facebook by the victims, which could be

used for large-scale user surveillance [91]. Even though adversarial surveillance,

such asonlineprofiling [42], isaseriousthreat to user privacy, wedecided to focus

on user data that havemonetary value at underground markets, such aspersonally

identifiable information (PII).

After excluding remaining friendships among the bots, the size of their direct

neighborhoodswas3,439 users. Thesizeof all extended neighborhoods, however,

wasaslargeas1,085,785 users. In Figure2.7a, wecomparedatarevelation before

and after operating theSbN in termsof percentage of userswith accessible private

data, PII in particular. On average, 2.62 times more PII was exposed in direct

neighborhoods after infiltration, and 1.54 times more in extended neighborhoods.

37

1.54 times more, with more than 1 million affected users

From 49K birthdates to 584K

Acquisti et al. Predicting social security numbers from public data. Proc. Of Nat. Acad. of Sc. 106(27), 2009

(Feature-based detection is ineffective)

Socialbots leads to arms race and render feature-based fake account detection ineffective

Page 25: Living in the (malicious) social web: Beyond friendships

Defense side: Infiltration-resilient fake account detection

25

•Vulnerability analysis of OSN platforms

•Characterization of user behavior

How vulnerable are OSNs to social infiltration?

•Quantification of privacy breaches

•Effectiveness of security defenses

What are the security and privacy implications of

social infiltration? •Scalability from economic context

•Profit-maximizing infiltration strategy

What is the economic rationale behind

infiltrating OSNs at scale?

•Victim prediction for robust detection

•Framework for evaluation

How can OSNs detect fakes or social bots that

infiltrate on a large scale?

Countermeasure

Design

11

12

13

14

1 Graph-based Sybil detection in social and information systems. In Proc. of ASONAM, Aug 20132 Integro: Leveraging victim prediction for robust fake account detection in OSNs. NDSS, Feb 2015 3 Thwarting fake accounts by predicting their victims. Submitted to TISSEC, Feb 2015

Page 26: Living in the (malicious) social web: Beyond friendships

26

Feature-based detection is ineffective

All manually flagged by concerned users

Only 20% of fakes were “detected”

(Graph-based detection)

Social infiltration invalidates the assumption behind graph-based fake account detection

Page 27: Living in the (malicious) social web: Beyond friendships

27

Graph-based detection

Alvisi et al. The evolution of Sybil defense via social networks. IEEE Security and Privacy, 2013.

Real region Fake region

Attack edges

Finds a (provably) sparse cut between the regions by ranking

Assumes social infiltration on a large scale is infeasible

Page 28: Living in the (malicious) social web: Beyond friendships

28

Graph-based detection

Alvisi et al. The evolution of Sybil defense via social networks. IEEE Security and Privacy, 2013.

Most real accounts rank higher than fakes

Real region Fake region

Ranks computed from landing probability of a short random walk

Cut size = 3

Page 29: Living in the (malicious) social web: Beyond friendships

Graph-based detection is not resilient to social infiltration

29

Real region Fake region

Cut size = 10 (densest)

50% of bots had more than 35 attack edges

Page 30: Living in the (malicious) social web: Beyond friendships

Premise: Regions can be tightly connected

30

Real region Fake region

Cut size = 10 (densest)

Page 31: Living in the (malicious) social web: Beyond friendships

Key idea: Identify potential victims with some probability

31

Real region Fake region

Potential victim with

probability 0.9

Page 32: Living in the (malicious) social web: Beyond friendships

Key idea: Leverage victim prediction to reduce cut size

32

High = 1

Medium < 1

Low = 0.1

Real region Fake region

Assign lower weight to edges incident to potential victims

Cut size = 1.9 << 10

Page 33: Living in the (malicious) social web: Beyond friendships

Delimit the real region by ranking accounts

33

High = 1

Medium < 1

Low = 0.1

Real region Fake region

Most real accounts are ranked higher than fake accounts

Ranks computed from landing probability of a short random walk

Page 34: Living in the (malicious) social web: Beyond friendships

Delimit the real region by ranking accounts

34

High = 1

Medium < 1

Low = 0.1

Real region Fake region

Most real accounts are ranked higher than fake accounts

Ranks computed from landing probability of a short random walk Result 1: Bound on ranking quality

Number of fake accounts that rank equal to or higher than real accounts is O(vol(EA) logn) where vol(EA) ≤ |EA|

Assuming a fast mixing real region and an attacker who establishes attack edges at random

Page 35: Living in the (malicious) social web: Beyond friendships

Result 2: Victim classification is feasible (even using low-cost features)

Random Forests (RF) achieves up to 52% better than random

0

0.2

0.4

0.6

0.8

1

0 0.2 0.4 0.6 0.8 1

True(posiSve(rate(

False(posiSve(rate(

TuenS(

Facebook(

Random(

AUC = 0.76

AUC = 0.7

AUC = 0.5

35

No need to train on more than 40K feature vectors on Tuenti

40K vectors

Integro: Leveraging victim prediction for robust fake account detection in OSNs. NDSS, Feb 2015Thwarting fake accounts by predicting their victims. Submitted to TISSEC, Feb 2015.

Page 36: Living in the (malicious) social web: Beyond friendships

Result 3: Ranking is resilient to infiltration

Cao et al. Aiding the Detection of Fake Accounts in Large Scale Social Online Services, NSDI’12

Targeted-victim attack

0.5

0.6

0.7

0.8

0.9

1.0

Mean(area(under(ROC(curve(

Number(of(a9ack(edges(

IntegroYBest(IntegroYRF(IntegroYRandom(SybilRank(

Random-victim attack

Integro delivers up to 30% higher AUC, and AUC is always > 0.92

36

Infiltration

resilience

Page 37: Living in the (malicious) social web: Beyond friendships

Deployment at Tuenti confirms results

Precision at lower intervals Precision at higher intervals

Integro delivers up to an order or magnitude better precision

37

Highly-infiltrating fakesLow ranks to higher ranks

Page 38: Living in the (malicious) social web: Beyond friendships

Research Questions and Contributions

38

•Vulnerability analysis of OSN platforms

•Characterization of user behavior

How vulnerable are OSNs to social infiltration?

•Quantification of privacy breaches

•Effectiveness of security defenses

What are the security and privacy implications of

social infiltration? •Scalability from economic context

•Profit-maximizing infiltration strategy

What is the economic rationale behind

infiltrating OSNs at scale?

•Victim prediction for robust detection

•Framework for evaluation

How can OSNs detect fakes or social bots that

infiltrate on a large scale?

Threat

Characterization

Countermeasure

Design

11

12

13

14

Page 39: Living in the (malicious) social web: Beyond friendships

Research Questions and Contributions

39

•Vulnerability analysis of OSN platforms

•Characterization of user behavior

How vulnerable are OSNs to social infiltration?

•Quantification of privacy breaches

•Effectiveness of security defenses

What are the security and privacy implications of

social infiltration? •Scalability from economic context

•Profit-maximizing infiltration strategy

What is the economic rationale behind

infiltrating OSNs at scale?

•Victim prediction for robust detection

•Framework for evaluation

How can OSNs detect fakes or social bots that

infiltrate on a large scale?

Threat

Characterization

Countermeasure

Design

11

12

13

14

Impact

42#

42#

42#

42#42#

42#

Public education & further studies Production-class deployment

Open-source, public release

Page 40: Living in the (malicious) social web: Beyond friendships

Research Questions and Contributions

40

•Vulnerability analysis of OSN platforms

•Characterization of user behavior

How vulnerable are OSNs to social infiltration?

•Quantification of privacy breaches

•Effectiveness of security defenses

What are the security and privacy implications of

social infiltration? •Scalability from economic context

•Profit-maximizing infiltration strategy

What is the economic rationale behind

infiltrating OSNs at scale?

•Victim prediction for robust detection

•Framework for evaluation

How can OSNs detect fakes or social bots that

infiltrate on a large scale?

Threat

Characterization

Countermeasure

Design

11

12

13

14

Research impact

42#

42#

42#

42#42#

42#

Public education & further studies Production-class deployment

Open-source, public release

PublicationsPrimary:

1. Boshmaf et al. The socialbot network: When bots socialize for fame and money.Proc. of ACSAC, Dec 2011 (20% acceptance rate, best paper award)

1. Boshmaf et al. Key challenges in defending against malicious socialbots. In Proc. of USENIX LEET, April 2012 (18% acceptance rate)

1. Boshmaf et al. Design and analysis of a social botnet. J. Comp. Net., 57(2), Feb 2013 (1.9 impact factor)

1. Boshmaf et al. Graph-based Sybil detection in social and information systems. In Proc. of ASONAM, Aug 2013 (13% acceptance rate, best paper award)

Related:

1. Boshmaf et al. The socialbot network: are social botnets possible?ACM Interactions, March-April, 2012

1. Sun et al. A billion keys, but few locks: The crisis of web single sign-on.In Proc. of NSPW, Sept 2010

1. Rashtian et al. To befriend or not? A model for friend request acceptance on Facebook. In Proc. of SOUPS, July 2014