
Page 1: Website fingerprinting on TOR

Website fingerprinting on Tor: attacks and defenses

Claudia Diaz

KU Leuven

Post-Snowden Cryptography Workshop Brussels, December 10, 2015

Joint work with: Marc Juarez, Sadia Afroz, Gunes Acar, Rachel Greenstadt,

Mohsen Imani, Mike Perry, Matthew Wright

Page 2: Website fingerprinting on TOR

Metadata

It’s not just about communications content (Sigint):

Time, duration, size, identities, location, patterns
Exposed by default in communications protocols
Bulk collection: much smaller in size than content
Machine readable, cheap to analyze, highly revealing
Much lower level of legal protection

Dedicated systems to protect metadata: the Tor network (targeted by the NSA program “Egotistical Giraffe”)

Page 3: Website fingerprinting on TOR

Introduction: how does WF work?

[Diagram: User (Alice) → Tor → Web; the adversary observes the traffic between the user and the Tor network. User = Alice, webpage = ??]

Page 4: Website fingerprinting on TOR

Why is WF so important?

Tor is the most advanced anonymity network (according to the NSA)
Allows an adversary to recover users’ web browsing history
Series of successful attacks
Weak adversary model (local adversary)
Number of top-conference publications on WF (30)

[Chart: number of top-conference WF publications per year, “Others” vs. “in Tor”]

Page 5: Website fingerprinting on TOR

Introduction: assumptions

[Diagram: User → Tor → Web, with the adversary between the user and the Tor network]

Client settings; browsing behaviour: which pages the user visits, one at a time

Page 6: Website fingerprinting on TOR

Introduction: assumptions

Adversary: replicability of the system configuration, parsing of traces (detecting page start/end), clean traces

Page 7: Website fingerprinting on TOR

Introduction: assumptions

Web: no personalisation or staleness

Page 8: Website fingerprinting on TOR

Methodology

Based on Wang and Goldberg’s methodology
Batches and k-fold cross-validation
Fast-Levenshtein attack (SVM)
Comparative experiments
Key: isolate the variable under evaluation (e.g., TBB version)


Page 10: Website fingerprinting on TOR

Comparative experiments: example

●  Step 1:
●  Step 2: train on data with the default value; test on data with the default value → Accuracy (control)

Page 11: Website fingerprinting on TOR

Comparative experiments: example

●  Step 1:
●  Step 2: train on data with the default value; test on data with the value of interest → Accuracy (test)
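To make the comparative methodology concrete, here is a minimal sketch in Python (a plain SVM and generic feature vectors stand in for the Fast-Levenshtein attack; the file names and feature extraction are hypothetical):

```python
# Minimal sketch of a comparative experiment: measure a control accuracy on the
# default setting, then a test accuracy when only the variable under evaluation
# changes. A plain SVM stands in for the Fast-Levenshtein attack used in the talk.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Hypothetical data layout: feature vectors and page labels per crawl setting.
X_default, y_default = np.load("default_X.npy"), np.load("default_y.npy")
X_variant, y_variant = np.load("variant_X.npy"), np.load("variant_y.npy")

clf = SVC(kernel="rbf", C=10)  # classifier choice is illustrative

# Control: train and test on the default setting (k-fold cross-validation).
control_acc = cross_val_score(clf, X_default, y_default, cv=10).mean()

# Test: train on the default setting, test on the setting under evaluation.
test_acc = clf.fit(X_default, y_default).score(X_variant, y_variant)

print(f"control accuracy: {control_acc:.2%}, test accuracy: {test_acc:.2%}")
```

A large gap between the control and test accuracies indicates that the variable (TBB version, time gap, location, ...) violates an assumption the classifier relies on.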

Page 12: Website fingerprinting on TOR

Experiments: multitab browsing

●  Firefox users use 2 or 3 tabs on average

Page 13: Website fingerprinting on TOR

Experiments: multitab browsing

●  Experiment with 2 tabs, opened with time gaps of 0.5s, 3s and 5s
●  Firefox users use 2 or 3 tabs on average


Page 15: Website fingerprinting on TOR

Experiments: multitab browsing

[Diagram: overlapping foreground and background page loads]

●  Experiment with 2 tabs, opened with time gaps of 0.5s, 3s and 5s
●  Firefox users use 2 or 3 tabs on average
●  Background page picked at random

Page 16: Website fingerprinting on TOR

Experiments: multitab browsing

●  Experiment with 2 tabs, opened with time gaps of 0.5s, 3s and 5s
●  Firefox users use 2 or 3 tabs on average
●  Background page picked at random for a batch
●  Success: detection of either page

Page 17: Website fingerprinting on TOR

Experiments: multitab browsing

Accuracy for different time gaps:

Control: 77.08%
Test (0.5s): 9.8%
Test (3s): 7.9%
Test (5s): 8.23%

[Diagram: bandwidth over time for Tab 1 and Tab 2]

Page 18: Website fingerprinting on TOR

Experiments: TBB versions

Coexisting Tor Browser Bundle (TBB) versions
Versions: 2.4.7, 3.5 and 3.5.2.1 (changes in randomized pipelining (RP), etc.)

Page 19: Website fingerprinting on TOR

Experiments: TBB versions

Coexisting Tor Browser Bundle (TBB) versions
Versions: 2.4.7, 3.5 and 3.5.2.1 (changes in randomized pipelining (RP), etc.)

Control (3.5.2.1): 79.58%
Test (3.5): 66.75%
Test (2.4.7): 6.51%

Page 20: Website fingerprinting on TOR

Experiments: network conditions

[Diagram: crawler VMs in Leuven (KU Leuven), New York and Singapore (DigitalOcean virtual private servers)]

Page 21: Website fingerprinting on TOR

Experiments: network conditions

Control (LVN): 66.95%
Test (NY): 8.83%

Page 22: Website fingerprinting on TOR

Experiments: network conditions

Control (LVN): 66.95%
Test (SI): 9.33%

Page 23: Website fingerprinting on TOR

Experiments: network conditions

Control (SI): 76.40%
Test (NY): 68.53%

Page 24: Website fingerprinting on TOR

Experiments: data staleness

[Plot: accuracy (%) vs. time (days), staleness of our collected data over 90 days (Alexa Top 100)]

Accuracy drops below 50% after 9 days.

Page 25: Website fingerprinting on TOR

Summary


Page 26: Website fingerprinting on TOR

Closed vs. open world

Early WF works considered a closed world of pages that users may browse (and trained and tested on that world).

In practice, in the Tor case, the universe of web pages is extremely large.

How likely is the user (a priori) to visit a target web page?
-  If the adversary has a good prior, the attack becomes a “confirmation attack”
-  BUT it may be hard for the adversary to have a good prior, particularly for less popular pages
-  If the prior is not a good estimate: base rate fallacy → many false positives

“False positives matter a lot” [1]

[1] Mike Perry, “A Critique of Website Traffic Fingerprinting Attacks”, Tor Project Blog, 2013. https://blog.torproject.org/blog/critique-website-traffic-fingerprinting-attacks

Page 27: Website fingerprinting on TOR

The base rate fallacy: example

Breathalyzer test: identifies 88% of truly drunk drivers (true positives), with a 5% false positive rate.

Alice tests positive. What is the probability that she is indeed drunk (the Bayesian detection rate, BDR)?

Is it 0.95? Is it 0.88? Something in between?

Page 28: Website fingerprinting on TOR

The base rate fallacy: example

Only 0.1!

Page 29: Website fingerprinting on TOR

The base rate fallacy: example

●  The circle represents the world of drivers.
●  Each dot represents a driver.

Page 30: Website fingerprinting on TOR

The base rate fallacy: example

●  1% of drivers are driving drunk (base rate or prior).


Page 31: Website fingerprinting on TOR

The base rate fallacy: example

●  Of the drunk drivers, 88% are identified as drunk by the test

Page 32: Website fingerprinting on TOR

The base rate fallacy: example

●  Of the sober drivers, 5% are erroneously identified as drunk

Page 33: Website fingerprinting on TOR

The base rate fallacy: example

●  Alice must be within the black circle (the drivers who tested positive)
●  Ratio of red dots within the black circle:

BDR = 7/70 = 0.1
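For reference, the general form of this calculation is just Bayes’ rule (the notation below is generic, not taken from the slides):

$$\mathrm{BDR} = P(\text{drunk} \mid \text{positive}) = \frac{\mathrm{TPR} \cdot P(\text{drunk})}{\mathrm{TPR} \cdot P(\text{drunk}) + \mathrm{FPR} \cdot \bigl(1 - P(\text{drunk})\bigr)}$$

With a low base rate P(drunk), the false-positive term dominates the denominator, which is why the BDR collapses to around 0.1 here rather than anything close to the test’s 0.88 true-positive rate.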

Page 34: Website fingerprinting on TOR

The base rate fallacy in WF

The base rate must be taken into account.

In WF: the blue dots are web pages, the red dots are monitored pages. What is the base rate?

Page 35: Website fingerprinting on TOR

The base rate fallacy in WF

Probability of visiting a monitored page?

Experiment:
-  4 monitored pages
-  Train on Alexa Top 100, test on Alexa Top 35K
-  Binary classification: monitored / non-monitored

Prior probability of visiting a monitored page:
-  Uniform over the 35K pages
-  Prior estimated from the Active Linguistic Authentication Dataset (ALAD) (3.5%): real-world users (80 users, 40K unique URLs)
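A small sketch of how the Bayesian detection rate follows from a classifier’s true/false positive rates and the prior (the TPR and FPR values below are placeholders, not results from the talk):

```python
# Bayesian detection rate (BDR): probability that a trace flagged as "monitored"
# really is a visit to a monitored page, given the classifier's rates and the prior.
def bdr(tpr: float, fpr: float, prior: float) -> float:
    return (tpr * prior) / (tpr * prior + fpr * (1.0 - prior))

# Priors from the experiment setup: 4 monitored pages in a 35K-page world
# (uniform prior), vs. the ~3.5% prior estimated from the ALAD data.
uniform_prior = 4 / 35_000
alad_prior = 0.035

# TPR/FPR are illustrative placeholders; plug in the measured rates of a real classifier.
for name, prior in [("uniform", uniform_prior), ("ALAD", alad_prior)]:
    print(f"{name}: BDR = {bdr(tpr=0.9, fpr=0.05, prior=prior):.4f}")
```

Even with an optimistic classifier, a tiny prior drives the BDR toward zero, which is the point of the slides that follow.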

Page 36: Website fingerprinting on TOR

Experiment: BDR in a 35K world

[Plot: BDR as a function of the size of the world, for a uniform prior and for non-popular pages with the prior from ALAD]

Page 37: Website fingerprinting on TOR

Classify, but verify

A verification step tests the classifier’s confidence.

The number of false positives is reduced, but the BDR is still very low for non-popular pages.
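One way to realise such a verification step, sketched here with a generic probabilistic classifier and a hypothetical confidence threshold (this is not the verifier evaluated in the talk):

```python
# Sketch of "classify, but verify": accept a prediction only when the classifier's
# confidence exceeds a threshold; otherwise fall back to "non-monitored".
# The classifier and the 0.9 threshold are illustrative choices.
import numpy as np

def classify_and_verify(clf, X, threshold=0.9):
    proba = clf.predict_proba(X)                     # per-class confidence scores
    labels = clf.classes_[np.argmax(proba, axis=1)]  # most likely class per trace
    confident = np.max(proba, axis=1) >= threshold
    # Low-confidence guesses are rejected, trading recall for fewer false positives.
    return np.where(confident, labels, "non-monitored")

# Usage (with any scikit-learn classifier exposing predict_proba):
#   clf = RandomForestClassifier().fit(X_train, y_train)
#   decisions = classify_and_verify(clf, X_test, threshold=0.9)
```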


Page 41: Website fingerprinting on TOR

Cost for the adversary

The adversary’s cost will depend on:

The number of pages (versions, personalisation)
The number of target users (system configurations, locations)
The training and testing complexities of the classifier

Maintaining a successful WF system is costly.

Page 42: Website fingerprinting on TOR

Defenses against WF attacks

High level (randomized pipelining, HTTPOS): ineffective

Supersequence approaches, traffic morphing (grouping pages to create anonymity sets): infeasible

BuFLO (constant rate): expensive in bandwidth and hurts usability (latency)

Tamaraw, CS-BuFLO: still expensive in bandwidth and hurt usability (latency)

Page 43: Website fingerprinting on TOR

Requirements for defenses

Effectiveness
Do not increase latency
No need to compute or distribute auxiliary information
No server-side cooperation needed
Bandwidth: some increase is tolerable on the input connections to the network

Page 44: Website fingerprinting on TOR

Adaptive padding

Based on a proposal by Shmatikov and Wang as a defense against end-to-end traffic confirmation attacks.

Generates padding packets at random times, with inter-packet timings following the distribution of general web traffic.

Does NOT introduce latency: real packets are not delayed.

Disturbs the key traffic features exploited by classifiers (burst features, total size) in an unpredictable way that differs for each visit to the same page.
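A simplified sketch of the core mechanism (the delay histogram below is invented for illustration; the real distribution is obtained from a crawl, as noted on the next slide):

```python
# Simplified sketch of adaptive padding: after each outgoing event, sample a gap
# from a histogram of typical inter-packet delays; if no real packet shows up
# within that gap, emit a dummy packet. Real packets are never delayed.
import random

GAP_BINS = [(0.002, 0.4), (0.01, 0.3), (0.05, 0.2), (0.2, 0.1)]  # (seconds, probability), invented values

def sample_gap():
    return random.choices([g for g, _ in GAP_BINS], weights=[w for _, w in GAP_BINS])[0]

def pad_stream(real_packet_times, end_time):
    """Return (time, kind) events: real packets pass through, dummies fill quiet gaps."""
    events, now = [], 0.0
    pending = iter(sorted(real_packet_times))
    nxt = next(pending, None)
    while now < end_time:
        gap = sample_gap()
        if nxt is not None and nxt <= now + gap:
            now = nxt                      # a real packet arrives first: forward it
            events.append((now, "real"))   # and resample the padding timer
            nxt = next(pending, None)
        else:
            now += gap                     # the sampled gap elapsed in silence:
            events.append((now, "dummy"))  # inject a dummy packet
    return events

# Usage: pad_stream([0.0, 0.003, 0.005, 0.4, 0.41], end_time=1.0)
```

Because real packets are always forwarded immediately and only the dummy insertion times are randomized, the defense adds bandwidth overhead but no latency.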

Page 45: Website fingerprinting on TOR

Adaptive padding implementation

Implemented as a pluggable transport.

Implemented at both ends (OP ↔ Guard or Bridge), controlled by the client (OP).

Need to obtain the distribution of inter-packet delays: crawl.

Page 46: Website fingerprinting on TOR

Adaptive padding

Page 47: Website fingerprinting on TOR

Modifications to adaptive padding

Interactivity: two additional histograms to generate dummies in response to packets received from the other end.

Control messages: for the client to tell the server the padding parameters.

Soft-stop condition: sampling an “infinity” value (probabilistic).
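The soft-stop condition can be pictured as an extra histogram bin (bin values here are invented for illustration):

```python
# Sketch of the soft-stop condition: the delay histogram includes an "infinity"
# bin; sampling it means "schedule no further dummy", ending padding
# probabilistically. Bin values are illustrative only.
import math
import random

BINS = [0.002, 0.01, 0.05, 0.2, math.inf]
WEIGHTS = [0.35, 0.30, 0.20, 0.10, 0.05]   # small chance of drawing "infinity"

def next_dummy_gap():
    gap = random.choices(BINS, weights=WEIGHTS)[0]
    return None if math.isinf(gap) else gap  # None signals the soft stop
```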

Page 48: Website fingerprinting on TOR

Adaptive padding evaluation

Classifier: k-NN (Wang et al.)

Experimental setup:
-  Training: Alexa Top 100
-  Monitored pages: 10, 25, 80
-  Open world: 5K-30K pages

Page 49: Website fingerprinting on TOR

Evaluation results: comparison with other defenses

Closed world: 100 pages (ideal attack conditions)

Page 50: Website fingerprinting on TOR

Evaluation results (realistic)

50 monitored pages, 5K-page open world, k-NN with k=5

Page 51: Website fingerprinting on TOR

Conclusions

Significant difference between WF attacks in ideal lab conditions and more realistic conditions

Effect of false positives (base rate fallacy)

The attack is costly for the adversary

Adaptive padding is an effective defence with no latency and moderate bandwidth overhead

Page 52: Website fingerprinting on TOR

Questions?

Thank you for your attention.