142
Anonymous communications: High latency systems Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, It

Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

  • View
    214

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Anonymous communications: High latency systemsMessaging anonymity & the traffic analysis of hardened systems

FOSAD 2010 – Bertinoro, Italy

Page 2: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Network identity today

Networking Relation between identity and efficient

routing Identifiers: MAC, IP, email, screen name No network privacy = no privacy!

The identification spectrum todayFull

AnonymityStrong

IdentificationPseudonymity

“The Mess” we are in!

Page 3: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Network identity today (contd.)

NO ANONYMITY

Weak identifiers everywhere: IP, MAC Logging at all levels Login names /

authentication PK certificates in clear

Also: Location data leaked Application data leakage

NO IDENTIFICATION

Weak identifiers easy to modulate Expensive / unreliable logs. IP / MAC address changes Open wifi access points Botnets

Partial solution Authentication

Open issues: DoS and network level

attacks

Page 4: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Ethernet packet format

Anthony F. J. Levi - http://www.usc.edu/dept/engineering/eleceng/Adv_Network_Tech/Html/datacom/

MAC Address

No integrity orauthenticity

Page 5: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

IP packet format

RFC: 791

INTERNET PROTOCOL

DARPA INTERNET PROGRAM

PROTOCOL SPECIFICATION

September 1981

3.1. Internet Header Format

A summary of the contents of the internet header follows:

0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |Version| IHL |Type of Service| Total Length | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Identification |Flags| Fragment Offset | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Time to Live | Protocol | Header Checksum | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Source Address | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Destination Address | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Options | Padding | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Example Internet Datagram Header

Figure 4.

Link different packets together

No integrity / authenticitySame for TCP, SMTP, IRC, HTTP, ...

Weak identifiers

Page 6: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Outline –4 lectures

Objective: foundations of anonymity & traffic analysis

Why anonymity?

Lecture 1 – Unconditional anonymity DC-nets in detail, Broadcast & their discontents

Lecture 2 – Black box, long term attacks Statistical disclosure & its Bayesian formulation

Lecture 3 – Mix networks in detail Sphinx format, mix-networks

Lecture 4 – Bayesian traffic analysis of mix networks

Page 7: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Anonymity in communications Specialized applications

Electronic voting Auctions / bidding / stock market Incident reporting Witness protection / whistle blowing Showing anonymous credentials!

General applications Freedom of speech Profiling / price discrimination Spam avoidance Investigation / market research Censorship resistance

Page 8: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Anonymity properties (1)

Sender anonymity Alice sends a message to Bob. Bob

cannot know who Alice is. Receiver anonymity

Alice can send a message to Bob, but cannot find out who Bob is.

Bi-directional anonymity Alice and Bob can talk to each other, but

neither of them know the identity of the other.

Page 9: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Anonymity properties (2)

3rd party anonymity Alice and Bob converse and know each

other, but no third party can find this out.

Unobservability Alice and Bob take part in some

communication, but no one can tell if they are transmitting or receiving messages.

Page 10: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Pseudonymity properties

Unlinkability Two messages sent (received) by Alice

(Bob) cannot be linked to the same sender (receiver).

Pseudonymity All actions are linkable to a pseudonym,

which is unlinkable to a principal (Alice)

Page 11: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Anonymity through Broadcast

Simple receiver anonymity

E(Message)

E(Junk)

E(Junk)

E(Junk)

E(Junk)

Point 1: Do not re-invent this

Point 2: Many ways to do broadcast

- Ring- Trees

It has all been done (Buses)

Point 4: What are the problems here?

CoordinationSender anonymityLatencyBandwidth

Point 3: Is your anonymity system better than this?

Page 12: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Unconditional anonymity

DC-nets Dining Cryptographers (David Chaum 1985)

Multi-party computation resulting in a message being broadcast anonymously

2 twists How to implement DC-nets through broadcast trees How to achieve reliability?

Communication cost...

Page 13: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

The Dining Cryptographers (1) “Three cryptographers are sitting down to

dinner at their favourite three-star restaurant. Their waiter informs them that arrangements

have been made with the maitre d'hotel for the bill to be paid anonymously.

One of the cryptographers might be paying for the dinner, or it might have been NSA (U.S. National Security Agency).

The three cryptographers respect each other's right to make an anonymous payment, but they wonder if NSA is paying.”

Page 14: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

The Dining Cryptographers (2)

Wit

Adi

Ron

Did theNSA pay?

I paid

I didn’t

I didn’t

Page 15: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

The Dining Cryptographers (2)

Wit

Adi

Ron I paidmr = 1

I didn’tmw = 0

I didn’tma = 0

Toss coincar

Toss coincaw

Toss coincrw

ba = ma + car + caw

bw = mw + crw + caw

br = mr + car + crw

Combine:B = ba + br + bw =ma + mr +mw = mr (mod 2)

Page 16: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

DC-nets

Generalise Many participants Larger message size▪ Conceptually many coins in parallel (xor)▪ Or: use +/- (mod 2|m|)

Arbitrary key (coin) sharing▪ Graph G: ▪ nodes - participants, ▪ edges - keys shared

What security?

Page 17: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

The Dining Cryptographers (3)

Wit

Adi

RonI want to

sendmr =

message

mw = 0

ma = 0

Toss m coinscar

Toss m coinscaw

Toss m coinscrw

ba = ma - car + caw

(mod 2m)

bw = mw + crw - caw

(mod 2m)

br = mr + car - crw

(mod 2m)

Combine:B = ba + br + bw =ma + mr +mw = mr (mod 2m)

Page 18: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Key sharing graph

Derive coins cabi = H[Kab, i]

for round i Stream cipher

(Kab)

Alice broadcasts ba = cab + cac

+ ma

AB

Shared key Kab

C

Page 19: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Key sharing graph – security (1)

If B and C corrupt

Alice broadcasts ba = cab + cac

+ ma

Adversary’s view ba = cab + cac

+ ma

No Anonymity

AB

Shared key Kab

C

Page 20: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Key sharing graph – security (2)

Adversary nodes partition the graph into a blue and green sub-graph

Calculate: Bblue = ∑bj, j is blue

Bgreen = ∑bi, i is green

Substract known keys Bblue + Kred-blue = ∑mj

Bgreen + K’red-green = ∑mi

Discover the originating subgraph. Reduction in anonymity

AB

C

Anonymity set size = 4 (not 11 or 8!)

Page 21: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

DC-net implementation

bi broadcast graph Tree – independent of key sharing graph = Key sharing graph –

No DoS unless split in graph

Communication in 2 phases: 1) Key sharing (off-line) 2) Round sync & broadcast

Aggregator

Combine:B = bi = mr (mod 2m)

Page 22: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

DC-net implementation (p2p)

bi broadcast graph Tree – independent of key sharing graph = Key sharing graph –

No DoS unless split in graph

Communication in 2 phases: 1) Key sharing (off-line) 2) Round sync & broadcast

(peer-to-peer?) Ring? Tree?

Page 23: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

DC collisions

Wit

Adi

RonI want to

sendmr =

message1

I want to sendmw = message2

ma = 0

Toss m coinscar

Toss m coinscaw

Toss m coinscrw

ba = ma - car + caw

(mod 2m)

bw = mw + crw - caw

(mod 2m)

br = mr + car - crw

(mod 2m)

Combine:B = ba + br + bw =ma + mr +mw = collision (mod 2m)

Page 24: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

How to resolve collisions? Ethernet:

detect collisions (redundancy) Exponential back-off, with random re-transmission

Collisions do not destroy all information B = ba + br + bw = ma + mr +mw =

= collision (mod m) = message 1 + message2 (mod m)

N collisions can be decoded in N transmissions

Cool!

Page 25: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Reliability – How?Trick 1: run 2 DC-net rounds in parallel

(1, m1)

(1, m2)

(0, 0)

(1, m4)

(1, m5)

(1, m6)

(0, 0)

(5, m1+m2+m4+m5+m6)Round 1

(3, m1+m2+m4) (2, m5+m6)

Round 4

(1, m1)(2, m2+m4) (1, m5)

(1, m6)

(1, m2)(1, m4)

Round 3

Round 2

Round 5

Trick 2: retransmit if greater than average

Page 26: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Other tricks with DC nets Reliability DoS resistance!

How to protect a DC-network against disruption?

Solved problem? DC in a Disco – “trap & trace” Modern crypto: commitments, secret sharing,

and banning

Note the homomorphism: collisions = addition Can tally votes / histograms through DC-nets

Page 27: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

DC-net shortcommings

Security is great! Full key sharing graph perfect anonymity

Communication cost – BAD (N broadcasts for each message!) Naive: O(N2) cost, O(1) Latency Not so naive: O(N) messages, O(N) latency▪ Ring structure for broadcast

Expander graph: O(N) messages, O(logN) latency? Centralized: O(N) messages, O(1) latency

Not practical for large(r) N! Local wireless communications?

Perfect Anonymity?

Page 28: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Black-box, long term traffic analysis attacks

“In the long run we are all dead” – Keynes

Page 29: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Anonymity so far...

DC-nets – setting Single message from Alice to Bob, or between a set

group

Real communications Alice has a few friends that she messages often Interactive stream between Alice and Bob (TCP) Emails from Alice to Bob (SMTP)

Alice is not always on-line (network churn)

Repetition – patterns -> Attacks

Page 30: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Fundamental limits

Even perfect anonymity systems leak information when participants change Remember: DC-nets do not scale well

Setting: N senders / receivers – Alice is one of them Alice messages a small number of friends:▪ RA in {Bob, Charlie, Debbie}

▪ Through a MIX / DC-net▪ Perfect anonymity of size K

Can we infer Alice’s friends?

Page 31: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Setting

Alice sends a single message to one of her friends

Anonymity set size = KEntropy metric EA = log K

Perfect!

Alice

K-1 Sendersout of N-1

others

K-1 Receiversout of Nothers

rA in RA= {Bob, Charlie, Debbie}

Anonymity

System

(Model as random receivers)

Page 32: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Many rounds

Observe many rounds in which Alice participates

Rounds in which Alice participates will output a message to her friends!

Infer the set of friends!

Alice

Others Others

rA1Anonymit

ySystem

Alice

Others Others

rA2Anonymit

ySystem

Alice

Others Others

rA3Anonymit

ySystem

Alice

Others Others

rA4Anonymit

ySystem

...

T1

T2

T3

T4

Tt

Page 33: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Hitting set attack (1)

Guess the set of friends of Alice (RA’) Constraint |RA’| = m

Accept if an element is in the output of each round

Downside: Cost N receivers, m size – (N choose m) options Exponential – Bad

Good approximations...

Page 34: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Statistical disclosure attack Note that the friends of Alice will be in the

sets more often than random receivers

How often? Expected number of messages per receiver: μother = (1 / N) ∙ (K-1) ∙ t

μAlice = (1 / m) ∙ t + μother

Just count the number of messages per receiver when Alice is sending! μAlice > μother

Page 35: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Comparison: HS and SDA

Parameters: N=20 m=3 K=5 t=45 KA={[0, 13, 19]}

Round Receivers SDA SDA_error#Hitting sets1 [15, 13, 14, 5, 9] [13, 14, 15]2 6852 [19, 10, 17, 13, 8] [13, 17, 19] 1 3953 [0, 7, 0, 13, 5] [0, 5, 13]1 2574 [16, 18, 6, 13, 10] [5, 10, 13] 2 2035 [1, 17, 1, 13, 6] [10, 13, 17] 2 1796 [18, 15, 17, 13, 17] [13, 17, 18] 2 1757 [0, 13, 11, 8, 4] [0, 13, 17] 1 1718 [15, 18, 0, 8, 12] [0, 13, 17] 1 809 [15, 18, 15, 19, 14] [13, 15, 18] 2 4110 [0, 12, 4, 2, 8] [0, 13, 15] 1 1611 [9, 13, 14, 19, 15] [0, 13, 15] 1 1612 [13, 6, 2, 16, 0] [0, 13, 15] 1 1613 [1, 0, 3, 5, 1] [0, 13, 15] 1 414 [17, 10, 14, 11, 19] [0, 13, 15] 1 215 [12, 14, 17, 13, 0] [0, 13, 17] 1 2

16[18, 19, 19, 8, 11] [0, 13, 19] 0 117 [4, 1, 19, 0, 19] [0, 13, 19] 0 118 [0, 6, 1, 18, 3] [0, 13, 19] 0 119 [5, 1, 14, 0, 5] [0, 13, 19] 0 120 [17, 18, 2, 4, 13] [0, 13, 19] 0 121 [8, 10, 1, 18, 13] [0, 13, 19] 0 122 [14, 4, 13, 12, 4] [0, 13, 19] 0 123 [19, 13, 3, 17, 12] [0, 13, 19] 0 124 [8, 18, 0, 10, 18] [0, 13, 18] 1 1

Round 16: Both attacks give correct

result

SDA: Can give wrong results – need more

evidence

Page 36: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

HS and SDA (continued)

25 [19, 4, 13, 15, 0] [0, 13, 19] 0 126 [13, 0, 17, 13, 12] [0, 13, 19] 0 127 [11, 13, 18, 15, 14] [0, 13, 18] 1 128 [19, 14, 2, 18, 4] [0, 13, 18] 1 129 [13, 14, 12, 0, 2] [0, 13, 18] 1 130 [15, 19, 0, 12, 0] [0, 13, 19] 0 131 [17, 18, 6, 15, 13] [0, 13, 18] 1 132 [10, 9, 15, 7, 13] [0, 13, 18] 1 133 [19, 9, 7, 4, 6] [0, 13, 19] 0 134 [19, 15, 6, 15, 13] [0, 13, 19] 0 135 [8, 19, 14, 13, 18] [0, 13, 19] 0 136 [15, 4, 7, 13, 13] [0, 13, 19] 0 137 [3, 4, 16, 13, 4] [0, 13, 19] 0 138 [15, 13, 19, 15, 12] [0, 13, 19] 0 139 [2, 0, 0, 17, 0] [0, 13, 19] 0 140 [6, 17, 9, 4, 13] [0, 13, 19] 0 141 [8, 17, 13, 0, 17] [0, 13, 19] 0 142 [7, 15, 7, 19, 14] [0, 13, 19] 0 143 [13, 0, 17, 3, 16] [0, 13, 19] 0 144 [7, 3, 16, 19, 5] [0, 13, 19] 0 145 [13, 0, 16, 13, 6] [0, 13, 19] 0 1

SDA: Can give wrong results – need more

evidence

Page 37: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Disclosure attack family

Counter-intuitive The larger N the easiest the attack

Hitting-set attacks More accurate, need less information Slower to implement Sensitive to Model

▪ E.g. Alice sends dummy messages with probability p.

Statistical disclosure attacks Need more data Very efficient to implement (vectorised) – Faster partial results Can be extended to more complex models (pool mix, replies, ...)

Bayesian modelling of the problem A systematic approach to traffic analysis

Page 38: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Traffic analysis is probabilistic inference (1)

Full generative model: Pick a profile for Alice Alice and a Profile for others Other from prior distributions

(once!) For each round:

▪ Pick a multi-set of senders uniformly from the multi-sets of size K of possible senders▪ Pick a permutation M1 from kk uniformly at random

▪ For each sender pick a receiver according to the probability in their respective profiles

Alice

Othersrother Other

rA1 AliceAnonymitySystem

T1

Profile Alice

Profile Other

Matching M1 M

Observation O1 = Senders ( Uniform) + receivers (as above)

Page 39: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Traffic analysis is probabilistic inference (2)

Full generative model: Pick a profile for Alice Alice and a Profile for others Other from prior distribution (once!) For each round:

▪ Pick a multi-set of senders in Oi uniformly from the multi-sets of size K of possible senders

▪ Pick a permutation M1 from kk uniformly at random

▪ For each sender pick a receiver rAlice and rother according to the probability in their respective profiles

Alice

Othersrother Other

rA1 AliceAnonymitySystem

T1

Profile Alice

Profile Other

Matching M1 M

Observation O1 = Senders ( Uniform) + receivers (as above)

Page 40: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Traffic analysis is probabilistic inference (3)

The Disclosure attack inference problem: “Given all known and observed variables (priors , M, observations Oi) determine the

distribution over the hidden random variables (profiles Alice , Other and matching Mi )”

Using information from all rounds, assuming profiles are stable

How? (a) maths = Bayesian inference (b) computation = Gibbs sampling

Alice

Othersrother Other

rA1 AliceAnonymitySystem

T1

Profile Alice

Profile Other

Matching M1 M

Observation O1 = Senders ( Uniform) + receivers (as above)

Page 41: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Lets apply probability theory What are profiles?

Keep it simple: a multinomial distribution over all users. E.g. Alice = [Bob: 0.4, Charlie: 0.3, Debbie: 0.3]

Other = [Uniform(1/n-1)] Prior distribution on multinomials: Dirichlet Distribution, i.e.

samples from Dirichlet distribution are multinomial distributions

Other profile models are also possible: restrictions on number of contacts, cliques, mutual friends, …

Other quantities have known conditional probabilities.

What are those?

Page 42: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Traffic analysis is probabilistic inference (4)

Full generative model: Pick a profile for Alice Alice and a Profile for others Other from prior distribution (once!) For each round:

▪ Pick a multi-set of senders in Oi uniformly from the multi-sets of size K of possible senders

▪ Pick a permutation M1 from kk uniformly at random

▪ For each sender pick a receiver rAlice and rother according to the probability in their respective profiles

Alice

Othersrother Other

rA1 AliceAnonymitySystem

T1

Profile Alice

Profile Other

Matching M1 M

Observation O1 = Senders ( Uniform) + receivers (as above)

Pr[Alice | ]

Pr[Other | ]Pr[M1 | M]

Pr[rA1 | Alice ,M1, O1]

Pr[rother | other ,M1, O1]Pr[O1 | K]

Page 43: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Re-arrange probabilities

What we have:

Pr[Alice , Other, Mi , riA, riother | Oi, M, , K] =

= Pr[riA | Alice ,M1, Oi] x Pr[riother | other ,M1, Oi] x Pr[Mi | M] x Pr[Alice | ] x Pr[Other | ]

What we really want:

Pr[Alice , Other, Mi | riA, riother, Oi, M, , K]

Page 44: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Pr[A=a, B=b] = Pr[A=a | B=b] x Pr[B = b]=Pr[B=b | A=a] x Pr[A = a]

Pr[A=a | B=b] = Pr[B=b | A=a ] x Pr[A=a ] / Pr[B = b]

Pr[B=b] = (A=a) Pr[A=a, B=b] = (A=a) Pr[A=a | B=b] x Pr[B = b]

Revision: Bayes theorem

Derivation

Meaning

Normalising factor

Forward probability Prior Normalising factor

Large spaces = no direct computation

Inverse probability

Page 45: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

ABA B

A

Apply Bayes

Pr[Alice , Other, Mi | riA, riother , Oi, M, , K] =

Pr[ riA, riother | Alice , Other, Mi Oi, M, , K] x

Pr[Alice , Other, Mi | Oi, M, , K]

Pr[ riA, riother | Alice , Other, Mi Oi, M, , K] x Pr[Alice , Other, Mi | Oi, M, , K]

All riA, riother

Large spaces = no direct computation

Page 46: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

The good, the bad and the uglyGood: can compute the probability

sought up to a multiplicative constant Pr[Alice , Other, Mi | riA, riother , Oi, M, , K]

Pr[ riA, riother | Alice , Other, Mi Oi, M, , K] x Pr[Alice , Other, Mi | Oi, M, , K]

Bad: Forget about computing the constant – in general

Ugly: Use computational methods to estimate quantities of interest for traffic analysis

Page 47: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Quantities of Interest

Full hidden state: Alice , Other, Mi

Marginal probabilities & distributions Pr[Alice->Bob] – Are Alice and Bob

friends? Mx – Who is talking to whom at round x?

How can we get information about those? Without computing the exact probability!

Page 48: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Revision: Sampling

Consider a distribution Pr[A=a]

Direct computation of mean: E[A] = a Pr[A=a] x a

Computation of mean through sampling: Samples A1, A2, …, An A E[A] i Ai / n

Page 49: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Revision: Sampling (2)

Consider a distribution Pr[A=a]

Computation of mean through sampling: Samples A1, A2, …, An A E[A] i Ai / n

More: Estimates of variance Estimate of percentiles – confidence intervals

(0.5%, 2.5%, 50%, 97.5%, 99.5%)

Page 50: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Traffic analysis is probabilistic inference (5)

The Disclosure attack inference problem: “Given all known and observed variables (priors , M, observations Oi) determine the

distribution over the hidden random variables (profiles Alice , Other and matching Mi )”

Using information from all rounds, assuming profiles are stable

How? (a) maths = Bayesian inference (b) computation = Gibbs sampling

Alice

Othersrother Other

rA1 AliceAnonymitySystem

T1

Profile Alice

Profile Other

Matching M1 M

Observation O1 = Senders ( Uniform) + receivers (as above)

Sample

Page 51: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Sampling from complex distributions

Typical approaches Direct sampling: Draw elements

according to the their probabilities Rejection sampling: Draw elements

according to another distribution, keep with the probability of the element

Markov-Chain Monte Carlo (MCMC): Gibbs Sampling (simpler) Metropolis Hastings (more flexible)

Page 52: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Gibbs sampling in a nutshell Allows sampling from complex distributions

when their marginal distributions are easy to sample from.

Example: Sample Pr[A,B | O]

For sample s in (0, SAMPLES): For iteration j in (0, ITERATIONS):▪ aj A with Pr[A|B=bj-1,O]

▪ bj B with Pr[B|A=aj,O]

Samples = (aSAMPLES, bSAMPLES)

Page 53: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Gibbs sampling for traffic analysis

Our distribution:Pr[Alice , Other, Mi | riA, riother , Oi, M, , K]

Marginal distributions: Profiles:

Pr[Alice , Other | Mi ,riA, riother , Oi, M, , K](Direct sampling by sampling Dirichlet dist.)

Mappings:Pr[Mi | Alice , Other , riA, riother , Oi, M, , K]

(Direct sampling of the matching link by link)

Page 54: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Sampling vs. Optimizing

SAMPLING

Get typical values of all hidden variables

Pros: determine the distribution and confidence intervals of each quantity of interest.

Cons: not the “best solution”

OPTIMIZING

Get the values of hidden variables that maximize the probability

Pros: “best solution” Cons: information

about other solutions lost!

Pr[Alice , Other, Mi | riA, riother , Oi, M, , K]

Page 55: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Summary of key points

Near-perfect anonymity is not perfect enough! High level patterns cannot be hidden for ever Unobservability / maximal anonymity set size needed

Flavours of attacks Very exact attacks – expensive to compute▪ Model inexact anyway

Statistical variants – wire fast!

Bayesian approach – systematic▪ Defile a “forward” generative probability model▪ “Invert” the model using Bayes theorem▪ Use sampling to estimate quantities of interest▪ Find their confidence intervals

Page 56: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Where next?

References Limits of Anonymity in Open Environments

by Dogan Kesdogan, Dakshi Agrawal, and Stefan Penz. In the Proceedings of Information Hiding Workshop (IH 2002), October 2002.

Statistical Disclosure Attacks: Traffic Confirmation in Open Environments by George Danezis. In the Proceedings of Security and Privacy in the Age of Uncertainty, (SEC2003), Athens, May 2003, pages 421-426.

The Hitting Set Attack on Anonymity Protocolsby Dogan Kesdogan and Lexi Pimenidis. In the Proceedings of 6th Information Hiding Workshop (IH 2004), Toronto, May 2004.

Statistical Disclosure or Intersection Attacks on Anonymity Systemsby George Danezis and Andrei Serjantov. In the Proceedings of 6th Information Hiding Workshop (IH 2004), Toronto, May 2004.

Practical Traffic Analysis: Extending and Resisting Statistical Disclosureby Nick Mathewson and Roger Dingledine. In the Proceedings of Privacy Enhancing Technologies workshop (PET 2004), May 2004, pages 17-34.

Two-Sided Statistical Disclosure Attackby George Danezis, Claudia Diaz, and Carmela Troncoso. In the Proceedings of the Seventh Workshop on Privacy Enhancing Technologies (PET 2007), Ottawa, Canada, June 2007.

Perfect Matching Statistical Disclosure Attacks by Carmela Troncoso, Benedikt Gierlichs, Bart Preneel, and Ingrid Verbauwhede. In the Proceedings of the Eighth International Symposium on Privacy Enhancing Technologies (PETS 2008), Leuven, Belgium, July 2008, pages 2-23.

Vida: How to Use Bayesian Inference to De-anonymize Persistent Communicationsby George Danezis and Carmela Troncoso. In the Proceedings of Privacy Enhancing Technologies, 9th International Symposium, PETS 2009, Seattle, WA, USA, August 5-7, 2009, 2009, pages 56-72.

Traffic analysis exercise http://www.cl.cam.ac.uk/~sjm217/projects/anon/ta-exercise.html

Page 57: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Everything you wanted to know about

Mix-networks and never dared ask.

Anonymity, cryptography, network security and attacks

Page 58: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Advertisement – where is privacy done?

FINANCIAL CRYPTOGRAPHY & DATA SECURITY 2011

St. Lucia – Caribbean Submission deadline: October 1,

2010 Event: Feb-Mar 2011.

PRIVACY ENHANCING TECHNOLOGIES 2011

Waterloo – Canada Submissions deadline: March

2011 Event: July 2011

Page 59: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Mix – practical anonymity David Chaum (concept 1979 – publish 1981)

Ref is marker in anonymity bibliography

Makes uses of cryptographic relays Break the link between sender and receiver

Cost O(1) – O(logN) messages O(1) – O(logN) latency

Security Computational (public key primitives must be secure) Threshold of honest participants

Page 60: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

The mix – illustrated

The Mix

Alice Bob

Adversary cannot

see inside the Mix

A->M: {B, Msg}Mix M->B: Msg

Page 61: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

The mix – security issues

The Mix

Alice BobA->M: {B, Msg}Mix M->B: Msg

1) Bitwise unlinkability

?

2) Traffic analysis resistance

?

Page 62: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Mix security (contd.)

Bitwise unlinkability Ensure adversary cannot link messages

in and out of the mix from their bit pattern

Cryptographic problem

Traffic analysis resistance Ensure the messages in and out of the

mix cannot be linked using any meta-data (timing, ...)

Two tools: delay or inject traffic – both add cost!

Page 63: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Two broken mix designs (1) Broken bitwise unlinkability

The `stream cipher’ mix (Design 1)

{M}Mix = {fresh k}PKmix, M Streamk

The Mix

Alice

Bob

A->M: {B, Msg}Mix

M->B: Msg

Active attack?Tagging Attack

Adversary intercepts {B, Msg}Mix

and injects {B, Msg}Mix xor (0,Y).

The mix outputs message:M->B: Msg xor Y

And the attacker can link them.

kStream Cipher

MessagePKMix

k,

Page 64: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Lessons from broken design 1 Mix acts as a service

Everyone can send messages to it; it will apply an algorithm and output the result.

That includes the attacker – decryption oracle, routing oracle, ...

(Active) Tagging attacks Defence 1: detect modifications (CCA2) Defence 2: lose all information (Mixminion,

Minx)

Page 65: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Modern message format: Sphinx

Input

ProcessinginsideMIx

Output

George Danezis & Ian Goldberg. Sphinx: A Compact and Provably Secure Mix Format. IEEE S&P ‘09.

Page 66: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Two broken mix designs (2) Broken traffic analysis resistance

The `FIFO*’ mix (Design 2) Mix sends messages out in the order they came in!

The Mix

Alice

Bob

A->M: {B, Msg}Mix

M->B: Msg

Passive attack?

The adversary simply counts thenumber of messages, and assignsto each input the corresponding

output.

* FIFO = First in, First out

Page 67: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Lessons from broken design 2 Mix strategies – ‘mix’ messages together

Threshold mix: wait for N messages and output them in a random order. Pool mix: Pool of n messages; wait for N inputs; output N out of N+n;

keep remaining n in pool. Timed, random delay, ...

“Hell is other people” – J.P. Sartre Anonymity security relies on others Problem 1: Mix must be honest Problem 2: Other sender-receiver pairs to hide amongst

Threshold Mix Pool Mix

Pool

Page 68: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Distributing mixing

Rely on more mixes – good idea Distributing trust – some could be dishonest Distributing load – fewer messages per mix

Two extremes Mix Cascades▪ All messages are routed through a preset mix sequence▪ Good for anonymity – poor load balancing

Free routing▪ Each message is routed through a random sequence of

mixes▪ Security parameter: L then length of the sequence

Page 69: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

The free route example

M1

M3

M4

M2

M5

M6

M7

AliceBob

Free routemix networkThe Mix

(The adversary should

get no more information

than before!)

A->M2: {M4, {M1,{B, Msg}M1}M4}M2

Page 70: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Free route mix networks

Bitwise unlinkability Length invariance Replay prevention

How to find mixes? Lists need to be authoritative, comprehensive & common

Additional requirements – corrupt mixes Hide the total length of the route Hide the step number (From the mix itself!)

Length of paths? Good mixing in O(log(|Mix|)) steps = log(|Mix|) cost Cascades: O(|Mix|)

We can manage “Problem 1 – trusting a mix”

Page 71: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Problem 2 – who are the others?

The (n-1) attack – active attack Wait or flush the mix. Block all incoming messages (trickle)

and injects own messages (flood) until Alice’s message is out.

The Mix

Alice

Bob

Attacker

n

1

Page 72: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Mitigating the (n-1) attack Strong identification to ensure distinct identities

Problem: user adoption

Message expiry Messages are discarded after a deadline Prevents the adversary from flushing the mix, and injecting

messages unnoticed

Heartbeat traffic Mixes route messages in a loop back to themselves Detect whether an adversary is blocking messages Forces adversary to subvert everyone, all the time

General instance of the “Sybil Attack”

Page 73: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Robustness to DoS

Malicious mixes may be dropping messages Special problem in elections

Original idea: receipts (unworkable)

Two key strategies to prevent DoS Provable shuffles Randomized partial checking

Page 74: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Provable shuffles – overview

Bitwise unlinkability: El-Gamal re-encryption El-Gamal public key (g, gx) for private x El-Gamal encryption (gk, gkx ∙M) El-Gamal re-encryption (gk’ ∙ gk , gk’xgkx ∙M)▪ No need to know x to re-encrypt▪ Encryption and re-encryption unlinkable

Architecture – re-encryption cascade Output proof of correct shuffle at each

step

Page 75: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Provable shuffles – illustrated

Proof of correct shuffle Outputs are a permutation of the decrypted inputs (Nothing was inserted, dropped, otherwise

modified!) Upside: Publicly verifiable – Downside: expensive

El-GamalEncryption

Re-enc

Re-enc

Re-enc

ThresholdDecryption

Alice’s input Mix 1 Mix 2 Mix 3

Proof Proof Proof Proof

Page 76: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Randomized partial checking Applicable to any mix system

Two round protocol Mix commits to inputs and outputs Gets challenge Reveals half of correspondences at random Everyone checks correctness

Pair mixes to ensure messages get some anonymity

Page 77: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Partial checking – illustrated

Rogue mix can cheat with probability at most ½

Messages are anonymous with overwhelming probability in the length L Even if no pairing is used – safe for L = O(logN)

Mix i Mix i+1

Reveal half Reveal other half

Page 78: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Receiver anonymity

Cryptographic reply address

Alice sends to bob: M1,{M2, k1,{A,{K}A}M2}M1

▪ Memory-less: k1 = H(K, 1)k2 = H(K, 2)

Bob replies: ▪ B->M1: {M2, k1, {A,{K}A}M2}M1, Msg

▪ M1->M2: {A,{K}A}M2 , {Msg}k1

▪ M2->A: {K}A, {{Msg}k1}k2

Security: indistinguishable from other messages

Page 79: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Summary of key concepts Anonymity requires a crowd

Difficult to ensure it is not simulated – (n-1) attack

DC-nets – Unconditional anonymity at high communication cost Collision resolution possible

Mix networks – Practical anonymous messaging Bitwise unlinkability / traffic analysis resistance Crypto: Decryption vs. Re-encryption mixes Distribution: Cascades vs. Free route networks Robustness: Partial checking

Page 80: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Anonymity measures – old The anonymity set (size)

Dining cryptographers▪ Full key sharing graph = (N - |Adversary|)▪ Non-full graph – size of graph partition

Assumption: all equally likely

Mix network context Threshold mix with N inputs: Anonymity = N

MixAnonymity

N = 4

Page 81: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Anonymity set limitations Example: 2-stage

mix Option 1:

3 possible participants

=> N = 3

Note probabilities!

Option 2: Arbitrary min

probability Problem: ad-hoc

Mix 1

Mix 2

Alice

Bob

Charlie ?½

¼

¼

Page 82: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Entropy as anonymity

Example: 2-stage mix

Define distribution of senders (as shown)

Entropy of the distribution is anonymity E = -∑pi log2 pi

Example:E = - 2 ¼ (-2) – (½) (-1)

= + 1 + ½ = 1.5 bits

(NOT N=3 => E = -log3 = 1.58 bits)

Intuition: missing information for full identification!

Mix 1

Mix 2

Alice

Bob

Charlie ?½

¼

¼

Page 83: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Anonymity measure pitfalls Only the attacker can measure the anonymity

of a system. Need to know which inputs, output, mixes are

controlled

Anonymity of single messages How to combine to define the anonymity of a

systems? Min-anonymity of messages

How do you derive the probabilities? (Hard!) Complex systems – not just examples

Page 84: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

References

Core: The Dining Cryptographers Problem: Unconditional Sender and Recipient

Untraceability by David Chaum.In Journal of Cryptology 1, 1988, pages 65-75.

Mixminion: Design of a Type III Anonymous Remailer Protocol by George Danezis, Roger Dingledine, and Nick Mathewson.In the Proceedings of the 2003 IEEE Symposium on Security and Privacy, May 2003, pages 2-15.

Sphinx: A Compact and Provably Secure Mix Format by George Danezis and Ian Goldberg. In the Proceedings of the 30th IEEE Symposium on Security and Privacy (Samp;P 2009), 17-20 May, Oakland, California, USA, 2009, pages 269-282.

More A survey of anonymous communication channels by George Danezis,

Claudia Diaz and Paul Syversonhttp://homes.esat.kuleuven.be/~gdanezis/anonSurvey.pdf

The anonymity bibliography http://www.freehaven.net/anonbib/

Page 85: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Bayesian traffic analysis of Mix NetworksA systematic approach to measuring anonymity

Remixed slides contributed by Carmela Troncoso

Page 86: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Attacks against mix networks

Uncover who speaks to whom Observe all links (Global Passive Adversary)

Long term disclosure attacks: Exploit persistent patterns Disclosure Attack [Kes03], Statistical Disclosure Attack [Dan03], Perfect

Matching Disclosure Attacks [Tron-et-al08]

Restricted routes [Dan03] Messages cannot follow any route

Bridging and Fingerprinting [DanSyv08] Users have partial knowledge of the network

Based on heuristics and specific models, not generic

86

Page 87: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

87

Determine probability distributions input-output

Threshold mix: collect t messages, and outputs them changing their appearance and in a random order

Mix networks and traffic analysis

MIX 3

MIX 2

MIX 1A

B Q

CS

R

)4

1,

8

3,

8

3(

)4

1,

8

3,

8

3(

)2

1,

4

1,

4

1(

),,( CBA2

1

2

1BorA

2

1

2

1BorA

2

1

4

1

4

1CorBorA

Page 88: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

88

Mix networks and traffic analysis

MIX 3

MIX 2

MIX 1A

B Q

C S

R

)2

1,

4

1,

4

1(

)0,2

1,

2

1(

),,( CBA

)2

1,

4

1,

4

1(

Non trivial given observation!!

1C

Constraints, e.g. length=2

2

1

2

1BorA

2

1

2

1BorA

Page 89: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

89

“The real thing”

Senders

Mixes (Threshold = 3)

Receivers

How to compute probabilities

systematically??

Page 90: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Redefining the traffic analysis problem Find “hidden state” of the mixes

AB Q

C S

RM1

M2

M3

],|Pr[ COHS

HS

COHS

CHSCHSOCOHS

]|,Pr[

]|Pr[],|Pr[],|Pr[

Prior information

Too large to enumerate!!

KCHSO ],|Pr[

90G.Danezis – EPFL – June 2010

Page 91: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Redefining the traffic analysis problem “Hidden State” + Observation = Paths

AB Q

C S

RM1

M2

M3

A M1 M2 M3 RB M1 M3 QC M2 S

P1

P2

P3

]|Pr[],|Pr[

],|Pr[CPathsKCHSO

COHS

91G.Danezis – EPFL – June 2010

Page 92: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

92

Sampling to estimate probabilities (I)

Actually… we want marginal probabilities

But… we cannot obtain a full distribution on HS directly, and cannot enumerate them all

HS

jQA COHSHSICOQA ],|Pr[)(],|Pr[

AB Q

C S

R

)4

1,

8

3,

8

3(

)4

1,

8

3,

8

3(

)2

1,

4

1,

4

1(

),,( CBA

2

1

2

1BorA

2

1

2

1BorA

2

1

4

1

4

1CorBorA

),|Pr( COQA

Page 93: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Sampling to estimate probabilities (II) Using sampling

HS1, HS2, HS3, HS4,…, HSN

0 1 0 1 … 1(A → Q)?

How does Pr[Paths|C] look like?

N

HSICOQA j jQA

)(

,|Pr[

]|Pr[],|Pr[

CPathsCOHS

],|Pr[~ COHS

Markov Chain Monte Carlo Methods

Page 94: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Applications

Evaluation: information theoretic metrics for anonymity

Or min-Entropy, max-probability, (k-anonymity), …

Operations: estimating probability of arbitrary events Input message to output message? Alice speaking to Bob ever? Two messages having the same sender?

],|Pr[log],|Pr[ CORACORAH iR

i

i

Page 95: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Modelling mix networks

How we define

?

]|Pr[ CPaths

Page 96: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Generative model for paths

Context: Senders and the time / round they send Basic generative model:

Users decide on paths independently Path length sampled from a path length distribution Set of nodes chosen as a random choice of distinct nodes

x

x CPCPaths ]|Pr[]|Pr[

x

minmax

1]|Pr[

LLClL

!

)!(],|Pr[

mix

mixx N

lNClLM

)(],|Pr[]|Pr[]|Pr[ xsetxx MIClLMClLCP

0

1)( xset MI

If all distinct

otherwise

Example uniform path length lmin to lmax

Example random permutation of length l

Page 97: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Probabilistic model – Basic Constraints

The problem of Unknown destinations

Observer has to stop at some point!

Problem of imputed observation:Bayesian approach – fill in the data

)(],|Pr[]|Pr[]|Pr[max

xset

L

Llxx MIClLMClLCP

obs

Destination

Destination

Destination

End of observation

Page 98: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Probabilistic model – More constraints Bridging: users can only send through mixes they

know

Non-compliant clients (with probability ) Do not respect length restrictions Choose l out of the Nmix node available, allow repetitions

),( max,min, pcpc LL

lmix

pcxN

PathIClLM1

)](,,|Pr[

pcp

cp

pc

pc

pcPj

jPi

ipci CPpPICPpCPaths ]|Pr[)1()](,|Pr[]|Pr[

x

x CPCPaths ]|Pr[]|Pr[

)|()(],|Pr[]|Pr[]|Pr[ xMIMIClLMClLCP xbridgingxsetxx

As before, paths are independent

Decide whether a node is compliant or not to calculate the probability

Page 99: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Probabilistic model – More constraints

Social network information Assuming we know sending profiles

Other constraints Unknown origin Dummies Other mixing strategies ….

]RecSenPr[)(],|Pr[]|Pr[]|Pr[ xxxsetxx MIClLMClLCP

)RecSenPr( xx

Augment the traffic analysis of mix networks with long-term profiles

Take home lesson:

• Integrate more constraints / more complexity by augmenting generative model

• More constraints = less anonymity

Page 100: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Markov Chain Monte Carlo methods and traffic analysis

How we sample

?

]|Pr[ CPaths

Page 101: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Markov Chain Monte Carlo Sample from distribution that is difficult to

sample directly

Best Paths vs. Paths Samples – 3 Key points Requires only a generative model / likelihood Provides full & true distributions – good error

estimate ▪ Not false positives and negatives

Systematic & extensible

HS

COHS

CHSCHSOCOHS

]|,Pr[

]|Pr[],|Pr[],|Pr[

]|Pr[],|Pr[ CPathsKCHSO

x

x CPCPaths ]|Pr[]|Pr[

Page 102: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Revision: Metropolis-Hastings sampling We want to sample Pr[A] which is hard to sample directly

But we can calculate Pr[A] / Z up to a multiplicative constant Z

Metropolis Hastings algorithm Define a random walk Q on A, that is easy to sample

and calculate Q(A=ai | A=ai-1) and Q(A=ai-1 | A=ai)

Select a random initial sample A=a0

Given ai-1 draw a sample ai from Q(ai | ai-1) Calculate

Accept ai with probability min(1, )

)|(]Pr[

)|(]Pr[

11

1

iii

iii

aaQa

aaQaA

A = ai

A = ai-1

Q(A=ai | A=ai-1)

Q(A=ai-1 | A=ai)

Iterate like

crazy then

pick a

sample, …

Then iterate

like crazy,

Page 103: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Metropolis Hastings Algorithm for traffic analysis

Constructs a Markov Chain with stationary distribution

Current state Candidate state

1. Compute

2. If

else

if

else

Q

)|()Pr(

)|()Pr(

candidatecurrentcurrent

currentcandidatecandidate

HSHSQHS

HSHSQHS

HScandidateHScurrent

)|( currentcandidate HSHSQ

)|( candidatecurrent HSHSQ

1

candidatecurrent HSHS

candidatecurrent HSHS

)1,0(~Uu

u

currentcurrent HSHS

),|Pr( COHS

Accept with probability α

Accept

Page 104: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Our sampler: Q transition

Transition Q: swap operation

More complicated transitions for non-compliant clients

AB

Q

C S

R

M1

M2

M3

Z

CPathsCOHS

]|Pr[],|Pr[ Pahtscandida

tePathscurrent

)|( currentcandidate PathsPathsQ

)|( candidatecurrent PathsPathsQ

)|(]Pr[

)|(]Pr[

candidatecurrentcurrent

currentcandidatecandidate

PathsPathsQPaths

PathsPathsQPaths

Page 105: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Iterations

Paths

Paths

Paths

Paths

Paths

Pahtscandidat

e

Pathscurre

nt

)|( currentcandidate PathsPathsQ

)|( candidatecurrent PathsPathsQ

Consecutive paths dependent Sufficiently separated samples

guarantee independencePaths

PathsJ

Pathsi

Page 106: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

C. Troncoso - UTA - Nov 2009

Evaluation and results

106

It works!

Page 107: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Evaluation

Events should happen with the predicted probability

1. Create an instance of a network 2. Run the sampler and obtain P1,P2,…3. Choose a target sender and a receiver4. Predict probability

5. Check if actually Sen chose Rec as receiver 6. Choose new network and go to 2

N

PathsIj

j

)(

)RecSenPr(RecSen

)(RecSen networkI

Page 108: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Results – compliant clients – 50 messages

)1,1(Beta~)(Pr YXYXempirical IIYX

))(( RecSen networkIE

j

PathsIPaths

j )(RecSen

Predicted probability

Empirical probability

Page 109: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Results non compliant – 50 messages

Page 110: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Results – big networks

100 msg1000 msg

It scales well as networks get larger

As expected mix networks offer good protection

Page 111: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Performance – RAM usage

Nmix t Nmsg Samples

RAM(Mb)

3 3 10 500 16

3 3 50 500 18

10 20 50 500 18

10 20 1 000 500 24

10 20 10 000 500 125

Size of network and population Results are kept in memory during

simulation

Page 112: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Performance – Running time

Nmix t Nmsg iter Full analysis (min)

One sample(ms)

3 3 10 6011 4.24 509.12

3 3 50 6011 4.80 576.42

5 10 100 7011 5.34 641.28

10 20 1 000 7011 5.97 706.12

Operations should be O(1) Writing of the results on a file Different number of iterations

Page 113: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Where to go next?

Models with partial adversaries

Onion routing – low latency models

Location privacy

Use models to derive the anonymity systems

Page 114: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Conclusions

Traffic analysis is non trivial when there are constraints

Probabilistic model: incorporates most attacks Non-compliant clients Integrate with other inferences: long-term attacks & other

Monte Carlo Markov Chain methods to extract marginal probabilities Systematic Only generative model needed

Future work: Model more constraints / less information for attacker Added value?

Page 115: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Time for questions

More info Vida: How to use Bayesian inference to de-

anonymize persistent communications. George Danezis and Carmela Troncoso. Privacy Enhancing Technologies Symposium 2009

The Bayesian analysis of mix networks. Carmela Troncoso and George Danezis. 16th ACM Conference on Computer and Communications Security 2009

The Application of Bayesian Inference to Traffic analysis. Carmela Troncoso and George Danezis Microsoft Technical Report

Page 116: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Additional materialSampling error, Beta distribution, …

Page 117: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Bayes theorem

)|Pr(),|Pr()|,Pr( CHSCHSOCHSO

HS

COHS

CHSCHSO

CO

CHSCHSOCOHS

)|,Pr(

)|Pr(),|Pr(

)|Pr(

)|Pr(),|Pr(),|Pr(

)|Pr(),|Pr()|,Pr( COCOHSCHSO

)Pr()|Pr()Pr()|Pr(),Pr( XXYYYXYX Joint probability:

Page 118: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Error estimation: the Beta function We need to specify the prior knowledge (

) expresses our uncertainty conforms to the nature of the parameter, i.e. is continuous

but bounded between 0 and 1 A convenient choice is the Beta distribution

)|Pr( CHS)Pr(

11 )1()()(

)(),()(

ba

ba

babaBetaP

0.0 0.4 0.8

theta

Beta(0.5,0.5)

0.0 0.4 0.8

theta

Beta(1,1)

0.0 0.4 0.8

theta

Beta(5,1)

0.0 0.4 0.8theta

Beta(5,5)

0.0 0.4 0.8theta

Beta(5,20)

0.0 0.4 0.8theta

Beta(50,200)

0.0 0.4 0.8

theta

Beta(0.5,0.5)

0.0 0.4 0.8

theta

Beta(1,1)

0.0 0.4 0.8

theta

Beta(5,1)

0.0 0.4 0.8

theta

Beta(5,5)

0.0 0.4 0.8

theta

Beta(5,20)

0.0 0.4 0.8

theta

Beta(50,200)

Page 119: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Error estimation: the Beta function

Combining a beta prior with the binomial likelihood gives a posterior distribution

),(

)1(

)(),|(),|(11

bsuccessestotalasuccessesBeta

ptotalsuccessespsuccessestotalpbsuccessestotalasuccesses

Prior knowledge

Page 120: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Sampling error estimation

P1 P2 P3 …

1 0 1 …(A → Q)?j

PathsIQA i

iQA

)(

)Pr(

Paths

Paths

Paths

Paths

Paths1

Paths2

Paths3

Error estimation Bernouilli distribution

Prior Beta(1,1) ~ uniform

),...](),(),(|)Pr[Pr( 321 PIPIPIQA QAQAQA

)1)(,1)((~)Pr( Paths

iQAPaths

iQA PIPIBetaQA

)]Pr(|),...(),(),(Pr[ 321 QAPIPIPI QAQAQA

Page 121: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Example

Studying events with Network 1

Network 2 Network 3 Network 4 Network 5

4.0)RecSenPr(

1)(;0)(;0)(;1)(;0)( 5BA4BA3BA2BA1BA PIPIPIPIPI

0)(;4.05

)(

)BAPr( 1BA

BA

NetworkI

PIj

j

0)(;4.0)YXPr( 2YX NetworkI

1)(;4.0)YXPr( 3YX NetworkI

1)(;4.0)YXPr( 4YX NetworkI

0)(;4.0)YXPr( 5YX NetworkI

)13,12(Beta~)(Pr

4.0)YX(Pr

YXempirical

sampled

X% Confidence intervals

Page 122: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Anonymous communications: Low latency systemsAnonymous web browsing and peer-to-peer

Page 123: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Onion Routing

Anonymising streams of messages Example: Tor

As for mix networks Alice chooses a (short) path Relays a bi-directional stream of traffic to Bob

OnionRouter

Alice Bob

Cells of traffic

OnionRouterBi-directional

OnionRouter

Page 124: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Onion Routing vs. Mixing

Setup route once per connection Use it for many cells – save on PK

operations

No time for delaying Usable web latency 1—2 sec round trip Short routes – Tor default 3 hops No batching (no threshold , ...)

Passive attacks!

Page 125: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Stream Tracing

Adversary observes all inputs and outputs of an onion router

Objective link the ingoing and outgoing connections (to trace from Alice to Bob)

Key: timing of packets are correlated

Two techniques: Correlation Template matching

Page 126: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Tracing (1) – Correlation

Quantise input and output load in time

Compute: Corr = ∑i INi∙OUTi

Downside: lose precision by quantising

OnionRouter1 3 2 1 2 2 1 2 3 0 3 2

Number of cellper time interval

T=0

T=0

INi OUTi

Page 127: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Tracing (2) – Template matching

Use input and delay curve to make template Prediction of what the output will be

Assign to each output cell the template value (vi) for its output time

Multiply them together to get a score (∏ivi)

OnionRouter

INTemplate

Compare with template

Input Stream Output Stream

vi

Page 128: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

The security of Onion Routing Cannot withstand a global passive adversary

(Tracing attacks to expensive to foil)

Partial adversary Can see some of the network Can control some of the nodes

Secure if adversary cannot see first and last node of the connection If c is fraction of corrupt servers Compromize probability = c2

No point making routes too long

Page 129: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

More Onion Routing security Forward secrecy

In mix networks Alice uses long term keysA->M2: {M4, {M1,{B, Msg}M1}M4}M2

In Onion Routing a bi-directional channel is available

Can perform authenticated Diffie-Hellman to extend the anonymous channel

OR provides better security against compulsion

Page 130: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Extending the route in OR

Alice OR1 OR2 OR3 BobAuthenticated DH

Alice – OR1

Authenticated DH, Alice – OR2

K1

Encrypted with K1

K2Authenticated DH, Alice – OR3

Encrypted with K1, K2

TCP Connection with Bob, Encrypted with K1, K2, K3K3

Page 131: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Some remarks

Encryption of input and output streams under different keys provides bitwise unlinkability As for mix networks Is it really necessary?

Authenticated Diffie-Hellman One-sided authentication: Alice remains

anonymous Alice needs to know the signature keys of the

Onion Routers Scalability issue – 1000 routers x 2048 bit keys

Page 132: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Exercise

Show that: If Alice knows only a small subset of all Onion

Routers, the paths she creates using them are not anonymous.

Assume adversary knows Alice’s subset of nodes.

Hint: Consider collusion between a corrupt middle and last node – then corrupt last node only.

Real problem: need to ensure all clients know the full, most up-to-date list of routers.

Page 133: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Future directions in OR

Anonymous routing immune to tracing Reasonable latency?

Yes, we can! Tracing possible because of input-output

correlations Strategy 1: fixed sending of cells

(eg. 1 every 20-30ms) Strategy 2: fix any sending schedule

independently of the input streams

Page 134: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Crowds – lightweight anonymity

Mixes and OR – heavy on cryptography

Lighter threat model No network adversary Small fraction of corrupt nodes Anonymity of web access

Crowds: a groups of nodes cooperate to provide anonymous web-browsing

Page 135: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Crowds – illustrated

Bob(Website)

Alice

Probability p(Send out request)

Reply

Probability 1-p(Relay in crowd)

Crowd – (Jondo)

Example:p = 1 / 4

Page 136: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Crowds security

Final website (Bob) or corrupt node does not know who the initiator is Could be the node that passed on the

request Or one before

How long do we expect paths to be? Mean of geometric distribution L = 1 / p – (example: L = 4) Latency of request / reply

Page 137: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Crowds security (2)

Consider the case of a corrupt insider A fraction c of nodes are in fact corrupt

When they see a request they have to decide whether the predecessor is the initiator or merely a relay

Note: corrupt insiders will never pass the request to an honest node again!

Page 138: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Crowds – Corrupt insider

Bob(Website)

Alice Probability 1-p(Relay in crowd)

Crowd – (Jondo)

Corrupt node

What is the probability my predecessor is the initiator?

Page 139: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Calculate: initiator probability

Initiator

p

1 - p

Req

Relay

c

1 - c

Corrupt

Honest1 - p

pReq

Relay

c

1 - c

Corrupt

Honest

1 - p

pReq

Relay

c

1 - c

Corrupt

Honest

Predecessor is initiator & corrupt

final node

Predecessor is random & corrupt

final node

pI = (1-p) c / c ∑i=1..inf (1-p)i(1-c)i-1

pI = 1 – (1-p)(1-c)

pI grows as (1) c grows (2) p grows

Exercise: What is the information theoretic amount of anonymity of crowds in this context

Page 140: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

The predecessor attack

What about repeated requests? Alice always visits Bob E.g. Repeated SMTP connection to

microsoft.com

Adversary can observe n times the tuple 2 x (Alice, Bob) Probability Alice is initiator (at least

once)▪ P = 1 – [(1-p)(1-c)]n

Probability of compromize reaches 1 very fast!

Page 141: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

Summary of key points

Fast routing = no mixing = traffic analysis attacks

Weaker threat models Onion routing: partial observer Crowds: insiders and remote sites

Repeated patterns Onion routing: Streams vs. Time Crowds: initiators-request tuples

PKI overheads a barrier to p2p anonymity

Page 142: Messaging anonymity & the traffic analysis of hardened systems FOSAD 2010 – Bertinoro, Italy

References

Core: Tor: The Second-Generation Onion Router by Roger

Dingledine, Nick Mathewson, and Paul Syverson. In the Proceedings of the 13th USENIX Security Symposium, August 2004.

Crowds: Anonymity for Web Transactions by Michael Reiter and Aviel Rubin.In ACM Transactions on Information and System Security 1(1), June 1998.

More: An Introduction to Traffic Analysis by George Danezis and

Richard Clayton.http://homes.esat.kuleuven.be/~gdanezis/TAIntro-book.pdf

The anonymity bibliography http://www.freehaven.net/anonbib/