27
12th Italian Networking Workshop Cavalese, Italy - January 14, 2015 CrowdSurf: Empowering Informed Choices in the Web Hassan Metwalley Stefano Traverso Marco Mellia Stanislav Miskovic Mario Baldi

CrowdSurf: Empowering Informed Choices in the Web...CrowdSurf: Empowering Informed Choices in the Web" Website Third party sites Keys Repubblica pix04.revsci.net id su.addthis.com

  • Upload
    others

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

Page 1: CrowdSurf: Empowering Informed Choices in the Web...CrowdSurf: Empowering Informed Choices in the Web" Website Third party sites Keys Repubblica pix04.revsci.net id su.addthis.com

12th Italian Networking Workshop Cavalese, Italy - January 14, 2015

CrowdSurf: Empowering

Informed Choices in the Web

Hassan Metwalley Stefano Traverso Marco Mellia Stanislav Miskovic Mario Baldi

Page 2: CrowdSurf: Empowering Informed Choices in the Web...CrowdSurf: Empowering Informed Choices in the Web" Website Third party sites Keys Repubblica pix04.revsci.net id su.addthis.com

•  Online advertising •  E-commerce / recommendation •  Analytics in general

§  Each service can know everything about you

Motivations

2"12th Italian Networking Workshop, Jan 2015 CrowdSurf: Empowering Informed Choices in the Web"

Page 3: CrowdSurf: Empowering Informed Choices in the Web...CrowdSurf: Empowering Informed Choices in the Web" Website Third party sites Keys Repubblica pix04.revsci.net id su.addthis.com

Third-Party Trackers

3"

§  Third-party web tracking refers to the practice by which a service records user web activities often for profit

§  Many techniques •  Cookies •  HTML5 LocalStorage •  Finger printing (browser/OS/IP)

acmeTrack.com

acmeAds.com

12th Italian Networking Workshop, Jan 2015 CrowdSurf: Empowering Informed Choices in the Web"

Page 4: CrowdSurf: Empowering Informed Choices in the Web...CrowdSurf: Empowering Informed Choices in the Web" Website Third party sites Keys Repubblica pix04.revsci.net id su.addthis.com

Quantify tracking activity

4"12th Italian Networking Workshop, Jan 2015 CrowdSurf: Empowering Informed Choices in the Web"

Page 5: CrowdSurf: Empowering Informed Choices in the Web...CrowdSurf: Empowering Informed Choices in the Web" Website Third party sites Keys Repubblica pix04.revsci.net id su.addthis.com

§  Top third-party tracking services are contacted by more than 95% of users –  77% of PC users contact the first tracker in less than 1 second –  71% of services embed at least one tracking service

§  Yes, probably you are tracked!!

Are you tracked?

5"12th Italian Networking Workshop, Jan 2015 CrowdSurf: Empowering Informed Choices in the Web"

Page 6: CrowdSurf: Empowering Informed Choices in the Web...CrowdSurf: Empowering Informed Choices in the Web" Website Third party sites Keys Repubblica pix04.revsci.net id su.addthis.com

Privacy and tracking: the role of HTTPS

6"

§  HTTPS is becoming more used [1] –  This clearly improves people privacy –  But it makes it harder to verify and regulate tracking services from a neutral third

party observing traffic…

§  How many third party tracking services are using HTTPS?

[1] Naylor, D., Finamore, A., Leontiadis, I., Grunenberger, Y., Mellia, M., Papagiannaki, K.,Steenkiste, P.: The Cost of the “S” in HTTPS. In: ACM CoNEXT. (2014)

12th Italian Networking Workshop, Jan 2015 CrowdSurf: Empowering Informed Choices in the Web"

Page 7: CrowdSurf: Empowering Informed Choices in the Web...CrowdSurf: Empowering Informed Choices in the Web" Website Third party sites Keys Repubblica pix04.revsci.net id su.addthis.com

Privacy and tracking: the role of HTTPS

7"12th Italian Networking Workshop, Jan 2015 CrowdSurf: Empowering Informed Choices in the Web"

Page 8: CrowdSurf: Empowering Informed Choices in the Web...CrowdSurf: Empowering Informed Choices in the Web" Website Third party sites Keys Repubblica pix04.revsci.net id su.addthis.com

Countermeasures?

8"

§  Some countermeasures are available as browser extensions •  Disable cookie sending, disable javascripts, blacklisting,…

§  ….but they worsen user’s browsing experience §  ….are ineffective for mobile users and few users use them

12th Italian Networking Workshop, Jan 2015 CrowdSurf: Empowering Informed Choices in the Web"

Page 9: CrowdSurf: Empowering Informed Choices in the Web...CrowdSurf: Empowering Informed Choices in the Web" Website Third party sites Keys Repubblica pix04.revsci.net id su.addthis.com

How to deal with this scenario? How to know what information is being collected? How to protect not experts?

9"12th Italian Networking Workshop, Jan 2015 CrowdSurf: Empowering Informed Choices in the Web"

Our proposal: give back to the user the control on web browsing!"

Page 10: CrowdSurf: Empowering Informed Choices in the Web...CrowdSurf: Empowering Informed Choices in the Web" Website Third party sites Keys Repubblica pix04.revsci.net id su.addthis.com

Current Scenario

10"

§  Regulators are reacting, but they don not have enough data §  Still now tracking services can collect information without any

authorization §  It is mandatory to:

•  inform users about privacy leakage problem •  help people to understand what information is collected •  create a system in witch each user can choose what share

§  Making the web economy sustainable

12th Italian Networking Workshop, Jan 2015 CrowdSurf: Empowering Informed Choices in the Web"

Page 11: CrowdSurf: Empowering Informed Choices in the Web...CrowdSurf: Empowering Informed Choices in the Web" Website Third party sites Keys Repubblica pix04.revsci.net id su.addthis.com

CrowdSurf

11"

§  Crowd-Sourced system (called CrowdSurf) •  Users can collaborate by providing implicit (e.g., traffic samples) and explicit

(e.g., their opinion) information •  They obtain information about web services, i.e., advices

§  Cloud runs data mining algorithms to produce advices containing indications about trustfulness of web services

§  Challenges •  Unified system •  Semi-supervised approaches •  Overcome limitation of current systems (fragmented and not automated)

12th Italian Networking Workshop, Jan 2015 CrowdSurf: Empowering Informed Choices in the Web"

Page 12: CrowdSurf: Empowering Informed Choices in the Web...CrowdSurf: Empowering Informed Choices in the Web" Website Third party sites Keys Repubblica pix04.revsci.net id su.addthis.com

CrowdSurf Layer

12"

§  Mandatory Layer in the Internet stack –  To handle HTTP(S) traffic before encryption

§  It processes, filters and check all web traffic –  Using the advices provided by the community –  Under complete control of user

12th Italian Networking Workshop, Jan 2015 CrowdSurf: Empowering Informed Choices in the Web"

CROW

DSUR

F Lay

er

HTTP

Rule

Proc

esso

r

Action

SSL/TLS TCP

Redirect

Regular Expression Matching

Modify Allow Block

Advising Community

Third-party Advisor

Controller

Anonymization

Advic

es to

Ru

le-Se

ts

Log and Report

Page 13: CrowdSurf: Empowering Informed Choices in the Web...CrowdSurf: Empowering Informed Choices in the Web" Website Third party sites Keys Repubblica pix04.revsci.net id su.addthis.com

CrowdSurf System

13"12th Italian Networking Workshop, Jan 2015 CrowdSurf: Empowering Informed Choices in the Web"

""""""!!!!

Internet!

Web Services!

Advising Community !""""""!!!!

Third-party Advisor!

""""""!!!!

Corporate Controller!

Corporate Network!

Contribution!

Web Browsing!

Advices!

Corporate Device!

Collector!

Corporate Rule-sets!

Private User Device!

Data Analyzer!

Page 14: CrowdSurf: Empowering Informed Choices in the Web...CrowdSurf: Empowering Informed Choices in the Web" Website Third party sites Keys Repubblica pix04.revsci.net id su.addthis.com

Testbed implementations and benchmarking

14"12th Italian Networking Workshop, Jan 2015 CrowdSurf: Empowering Informed Choices in the Web"

Page 15: CrowdSurf: Empowering Informed Choices in the Web...CrowdSurf: Empowering Informed Choices in the Web" Website Third party sites Keys Repubblica pix04.revsci.net id su.addthis.com

CrowdSurf Prototype

15"

§  Crowdsurf layer implemented as a Firefox extension –  Watch all HTTP(S) traffic, immediately before encryption –  Support for both PCs and Mobile Devices

§  Cloud server collects traffic samples, elaborates these and distributes advices

§  Personalized policies –  Three cases

•  Corporate •  Kid •  Paranoid

12th Italian Networking Workshop, Jan 2015 CrowdSurf: Empowering Informed Choices in the Web"

Block Redirect log&report Facebook Corp Twitter Corp Dropbox Corp Google Corp (--> Bing) YouTube Corp Ebay+Amazon Corp Adult Sites Corp, Kid Trackers Par Kid Ads+NoJS Par

Page 16: CrowdSurf: Empowering Informed Choices in the Web...CrowdSurf: Empowering Informed Choices in the Web" Website Third party sites Keys Repubblica pix04.revsci.net id su.addthis.com

CrowdSurf Extension - Benchmark

16"

§  Performance impairment test §  20 websites

–  10 Top Global Sites –  8 News Sites –  6 Tracker-Free Sites

§  Performance index: –  Average rendering time

12th Italian Networking Workshop, Jan 2015 CrowdSurf: Empowering Informed Choices in the Web"

Page 17: CrowdSurf: Empowering Informed Choices in the Web...CrowdSurf: Empowering Informed Choices in the Web" Website Third party sites Keys Repubblica pix04.revsci.net id su.addthis.com

CrowdSurf Extension - Benchmark

17"12th Italian Networking Workshop, Jan 2015 CrowdSurf: Empowering Informed Choices in the Web"

Clients have enough power to easily handle the extra load generated by

possible CrowdSurf implementation!!"

Page 18: CrowdSurf: Empowering Informed Choices in the Web...CrowdSurf: Empowering Informed Choices in the Web" Website Third party sites Keys Repubblica pix04.revsci.net id su.addthis.com

Smart algorithms to automatically flag possible problems

18"12th Italian Networking Workshop, Jan 2015 CrowdSurf: Empowering Informed Choices in the Web"

Page 19: CrowdSurf: Empowering Informed Choices in the Web...CrowdSurf: Empowering Informed Choices in the Web" Website Third party sites Keys Repubblica pix04.revsci.net id su.addthis.com

§  Algorithm to automate advice generation §  For a set of events in which hostname is different from referrer field

1.  extract all keys from the URLs in the set that includes a query string http://www.acmeAds.com/query?key1=X&...........&keyN=Y

2.  for each hostname and for each key, investigate one-to-one mapping between the client and the values taken by each of the keys

§  If a key value assumes a different value for each client, and does not change over time –  We found a “client identifier” –  Mark the service as a “possible tracker”

Automatic Tracker Detection

19"12th Italian Networking Workshop, Jan 2015 CrowdSurf: Empowering Informed Choices in the Web"

acmeAds.com

Page 20: CrowdSurf: Empowering Informed Choices in the Web...CrowdSurf: Empowering Informed Choices in the Web" Website Third party sites Keys Repubblica pix04.revsci.net id su.addthis.com

Automatic Tracker Detection - Results

20"

§  HTTP traces §  Third party sites in

–  Repubblica.it –  YouTube –  Facebook

§  Keys suggest the exchange of client identifiers

§  With CrowdSurf should be possible to check also on HTTPS

12th Italian Networking Workshop, Jan 2015 CrowdSurf: Empowering Informed Choices in the Web"

Website Third party sites Keys Repubblica

YouTube

Facebook

Page 21: CrowdSurf: Empowering Informed Choices in the Web...CrowdSurf: Empowering Informed Choices in the Web" Website Third party sites Keys Repubblica pix04.revsci.net id su.addthis.com

Automatic Tracker Detection - Results

21"

§  HTTP traces §  Third party sites in

–  Repubblica.it –  YouTube –  Facebook

§  Keys suggest the exchange of client identifiers

§  With CrowdSurf should be possible to check also on HTTPS

12th Italian Networking Workshop, Jan 2015 CrowdSurf: Empowering Informed Choices in the Web"

Website Third party sites Keys Repubblica

pix04.revsci.net su.addthis.com track.adform.net

YouTube

bh.ams.contextweb.com eu-jet-01.sociomantic.com ib.adnxs.com www.wajam.com uip.semasio.net

Facebook

adadvisor.net

data.bncnt.com

go.flx1.com

ira.spysomeone.com

tags.bluekai.com

ww1.collserve.com

www.skyscanner.com

Page 22: CrowdSurf: Empowering Informed Choices in the Web...CrowdSurf: Empowering Informed Choices in the Web" Website Third party sites Keys Repubblica pix04.revsci.net id su.addthis.com

Automatic Tracker Detection - Results

22"

§  HTTP traces §  Third party sites in

–  Repubblica.it –  YouTube –  Facebook

§  Keys suggest the exchange of client identifiers

§  With CrowdSurf should be possible to check also on HTTPS

12th Italian Networking Workshop, Jan 2015 CrowdSurf: Empowering Informed Choices in the Web"

Website Third party sites Keys Repubblica

pix04.revsci.net su.addthis.com track.adform.net

YouTube

bh.ams.contextweb.com eu-jet-01.sociomantic.com ib.adnxs.com www.wajam.com uip.semasio.net

Facebook

adadvisor.net

data.bncnt.com

go.flx1.com

ira.spysomeone.com

tags.bluekai.com

ww1.collserve.com

www.skyscanner.com

Page 23: CrowdSurf: Empowering Informed Choices in the Web...CrowdSurf: Empowering Informed Choices in the Web" Website Third party sites Keys Repubblica pix04.revsci.net id su.addthis.com

Automatic Tracker Detection - Results

23"

§  HTTP traces §  Third party sites in

–  Repubblica.it –  YouTube –  Facebook

§  Keys suggest the exchange of client identifiers

§  With CrowdSurf should be possible to check also on HTTPS

12th Italian Networking Workshop, Jan 2015 CrowdSurf: Empowering Informed Choices in the Web"

Website Third party sites Keys Repubblica

pix04.revsci.net id su.addthis.com puid track.adform.net icid

YouTube

bh.ams.contextweb.com vgd eu-jet-01.sociomantic.com fpc ib.adnxs.com uuid www.wajam.com sExtCookieId uip.semasio.net install_timestamp

Facebook

adadvisor.net bk_uuid

data.bncnt.com uid

go.flx1.com anuid, euid

ira.spysomeone.com s

tags.bluekai.com google_gid

ww1.collserve.com bk_uuid

www.skyscanner.com ksh_id

Page 24: CrowdSurf: Empowering Informed Choices in the Web...CrowdSurf: Empowering Informed Choices in the Web" Website Third party sites Keys Repubblica pix04.revsci.net id su.addthis.com

Future work

24"

§  CrowdSurf System presents some practical challenges that must be faced •  Preserve privacy of users when sending contribution to the cloud

•  Anonymization is mandatory •  Research community is called to design automatic algorithms, and propose

scalable implementations •  Users must become aware of the risk of web tracking and thus embrace

crowdsurf (or similar) solution •  It shall pass through a long and difficult standardization process to get

accepted as a compelling technology §  “But it can work”…..

12th Italian Networking Workshop, Jan 2015 CrowdSurf: Empowering Informed Choices in the Web"

Page 25: CrowdSurf: Empowering Informed Choices in the Web...CrowdSurf: Empowering Informed Choices in the Web" Website Third party sites Keys Repubblica pix04.revsci.net id su.addthis.com

Thanks!

25"12th Italian Networking Workshop, Jan 2015 CrowdSurf: Empowering Informed Choices in the Web"

Page 26: CrowdSurf: Empowering Informed Choices in the Web...CrowdSurf: Empowering Informed Choices in the Web" Website Third party sites Keys Repubblica pix04.revsci.net id su.addthis.com

Automatic Tracker Detection

26"12th Italian Networking Workshop, Jan 2015 CrowdSurf: Empowering Informed Choices in the Web"

Client2"

Client1"

Client3"

Client4"

ClientN"

Id2"

Id1"

Id3"

Id4"

IdN"

Hostname1 and Key1"

.

.

.

.

.

.

Possible Third-party tracking

service

Page 27: CrowdSurf: Empowering Informed Choices in the Web...CrowdSurf: Empowering Informed Choices in the Web" Website Third party sites Keys Repubblica pix04.revsci.net id su.addthis.com

Automatic Tracker Detection

27"

Client2"

Client1"

Client3"

Client4"

ClientN"

Id1"

Id3"

Id4"

IdN"

Hostname2 and Key1"

.

.

.

.

.

.

Not Possible Third-party tracking

service

12th Italian Networking Workshop, Jan 2015 CrowdSurf: Empowering Informed Choices in the Web"