Outline: Introduction; Differential Privacy under Dependent Data; Conclusion and Future Work

Dependence Makes You Vulnerable: Differential Privacy Under Dependent Tuples

Changchang Liu1, Supriyo Chakraborty2, Prateek Mittal1

1 Princeton University, 2 IBM T.J. Watson Research Center

Email: 1 {cl12, pmittal}@princeton.edu, 2 supriyo@us.ibm.com

February 23, 2016


Data Privacy

• Privacy is important!

- Snowden case
- G20 summit breach
- iCloud photo breach
- · · ·


Directly Releasing Data Would Compromise Privacy!

[Figure: individuals provide raw data to a data provider; data recipients (applications, researchers) receive query results]


Obfuscate Data before Release to Protect Privacy

[Figure: individuals provide raw data to a data provider; data obfuscation is applied, so data recipients (applications, researchers) receive perturbed query results]


Existing Privacy Metrics

– Differential Privacy [ICALP ’06]

– Pufferfish Privacy [PODS ’12]

– Membership Privacy [CCS ’13]

– Blowfish Privacy [SIGMOD ’14]



ε-Differential Privacy (DP)

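[Figure: probability distributions of the query output for two neighboring databases D and D′; S is a set of query outputs]

Differential privacy requires that, for any pair of neighboring databases D and D′ and any set S of query outputs,

$\Pr[\mathcal{A}(D) \in S] \le \exp(\epsilon)\,\Pr[\mathcal{A}(D') \in S]$

The adversary’s ability to infer the individual’s information is bounded!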


Laplace Perturbation Mechanism

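[Figure: raw data D → query Q(D) → added noise → perturbed output]

$\mathcal{A}(D) = Q(D) + \xi$, where $\xi \sim \mathrm{Lap}(\beta)$ with density $p(\xi) \propto \exp\left(-\frac{|\xi|}{\beta}\right)$ and $\beta = \frac{\Delta Q}{\epsilon}$

• ε is the privacy budget
• Q is the query function
• ∆Q is the global sensitivity of Q: $\max_{D,D'} \|Q(D) - Q(D')\|_1$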
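
The Laplace perturbation mechanism can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names `laplace_noise` and `laplace_mechanism` are hypothetical, the query is assumed to return a single number, and the noise is drawn as the difference of two exponential samples, which follows a Laplace distribution with the same scale.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from Laplace(0, scale).

    The difference of two i.i.d. exponential variables with mean `scale`
    is Laplace-distributed with location 0 and scale `scale`.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def laplace_mechanism(query_value, sensitivity, epsilon, rng=None):
    """Return query_value perturbed with Laplace noise of scale sensitivity / epsilon.

    `sensitivity` plays the role of the global sensitivity of the query and
    `epsilon` is the privacy budget (both assumed to be positive).
    """
    if sensitivity <= 0 or epsilon <= 0:
        raise ValueError("sensitivity and epsilon must be positive")
    rng = rng or random.Random()
    return query_value + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    # Example: a counting query (sensitivity 1) released with budget 0.5.
    print(laplace_mechanism(42, sensitivity=1.0, epsilon=0.5))
```

A smaller ε gives a larger noise scale, so the released value is less accurate but reveals less about any single tuple.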


Limitations for Differential Privacy (DP) Mechanisms

Implicitly assume independent tuples


In reality, however, tuples are correlated

• large volume
• rich semantics
• complex structure



Data correlation exists almost everywhere

(a) social network data: friendships, interactions
(b) business data: financial transactions
(c) mobility data: communication records
(d) medical data: disease transmission


Our Objective

Incorporate correlated data in differential privacy


Differential Privacy under Dependent Data
• Inference Attack for DP based on Correlated Tuples
• Dependent Differential Privacy (DDP)
• Experimental Results


Correlation in Gowalla Location Dataset

• Gowalla location dataset: 6,969 users, 98,802 location records
• Gowalla social dataset: 6,969 users, 47,502 edges

[Figure: location clusters labeled Manhattan, Queens, and Brooklyn (New York); San Jose and Oakland (San Francisco); Pasadena, Long Beach, and Beverly Hills (Los Angeles)]


Inference Attack on DP via K-Means Query

Differentially Private K-means for the Gowalla Location Dataset

[Figure: individuals provide raw data to a data provider, which runs K-means clustering with perturbation (differentially private K-means clustering); data recipients receive the result and mount an inference attack]


Inference Attack

[Figure: inference attack diagram with labels: social relationships, check-in, community]


Inference results by using correlation

[Figure: leaked information (y-axis, 0 to 8) versus privacy budget ε (x-axis, 0 to 3); curves: with social relationships, w/o social relationships, security guarantee by DP]

• Exploiting correlation, one can infer more information!
• Exploiting correlation can break DP security guarantees!



ε-Dependent Differential Privacy (DDP)

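[Figure: probability distributions of the query output for two neighboring databases; S is a set of query outputs]

Dependent differential privacy requires that, for any pair of dependent neighboring databases D(L, R) and D′(L, R) and any set S of query outputs,

$\Pr[\mathcal{A}(D(L, R)) \in S] \le \exp(\epsilon)\,\Pr[\mathcal{A}(D'(L, R)) \in S]$

• R is the probabilistic dependence relationship among the L dependent tuples
• The adversary’s ability to infer the individual’s information is bounded even if the adversary has access to the data correlation R.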


Dependent Perturbation Mechanism

• Augment conventional LPM with additional noise relevant to ρij

• Dependent coefficient ρij

− extent of dependence of Dj on the modification of Di


Dependent Coefficient

Laplace noise in the dependent perturbation mechanism:

$\exp\left\{-\frac{\epsilon}{\mathrm{Sensitivity}_i + \rho_{ij} \times \mathrm{Sensitivity}_j}\right\}$

The dependent coefficient satisfies 0 ≤ ρij ≤ 1:

• ρij = 0: standard differential privacy (independent setting)

• ρij = 1: fully dependent setting

• ρij formulates correlation from a privacy perspective
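
As a rough illustration of the two-tuple expression on this slide, the sketch below reads it as a Laplace density whose scale is (Sensitivity_i + ρij × Sensitivity_j) / ε. That reading, the function names `dependent_scale` and `dependent_perturbation`, and the restriction to a single pair of tuples are assumptions made for the example, not the authors' implementation.

```python
import random


def dependent_scale(sens_i, sens_j, rho_ij, epsilon):
    """Noise scale for tuple i when tuple j depends on it with coefficient rho_ij.

    rho_ij = 0 recovers the standard Laplace scale sens_i / epsilon;
    rho_ij = 1 treats tuple j as fully dependent on tuple i.
    """
    if not 0.0 <= rho_ij <= 1.0:
        raise ValueError("rho_ij must lie in [0, 1]")
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return (sens_i + rho_ij * sens_j) / epsilon


def dependent_perturbation(query_value, sens_i, sens_j, rho_ij, epsilon, rng=None):
    """Add Laplace noise with the dependence-aware scale to a numeric query result."""
    rng = rng or random.Random()
    scale = dependent_scale(sens_i, sens_j, rho_ij, epsilon)
    # Difference of two exponentials with mean `scale` is Laplace(0, scale).
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return query_value + noise


if __name__ == "__main__":
    for rho in (0.0, 0.5, 1.0):
        print(rho, dependent_scale(1.0, 1.0, rho, epsilon=1.0))
```

Under this reading, stronger dependence means a larger noise scale for the same privacy budget ε.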


Limitations of Dependent Coefficient

The exact computation of ρij relies on knowledge of the data generation model.


Resilience to Inference Attack

[Figure: leaked information (y-axis, 0 to 8) versus privacy budget ε (x-axis, 0 to 3); curves: attack DP, attack DDP]

DDP is more resilient to inference attacks


Further Analysis and Experiments

• Composition Property

− Sequential/parallel composition property

• Theoretical utility analysis

• Different classes of queries

− Machine learning queries
− Graph queries


Conclusion and Future Work

• Incorporate correlation into differential privacy

− Dependent differential privacy
− More resilient to inference attacks

• Future work: alternative data generation models


Appendix 1: Dependence between tuples can seriously degrade the privacy guarantees provided by existing DP mechanisms

Setup: database $[D_i, D_j]$ → sum query → add noise $\mathrm{Lap}(1/\epsilon)$ → noisy $(D_i + D_j)$

• Independent ($D_i \perp D_j$): privacy guarantee $\exp(\epsilon)$
• Dependent ($D_j = 0.5 D_i + 0.5 X$, where $D_i$ and $X$ are i.i.d. in $[0, 1]$): privacy guarantee $\exp(1.5\epsilon)$

A smaller ε means better privacy.
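
The gap between the two guarantees can be checked numerically. The sketch below is an illustration under stated assumptions: X is held fixed while D_i moves from 0 to 1, the independent case keeps D_j at an arbitrary fixed value, and the helper names (`laplace_log_density`, `worst_case_log_ratio`, `true_sum`) and the example values are hypothetical. It compares the Laplace output densities of the noisy sum for the two values of D_i and reports the largest log-ratio.

```python
import math


def laplace_log_density(y, mean, scale):
    """Log-density of Laplace(mean, scale) evaluated at y."""
    return -math.log(2.0 * scale) - abs(y - mean) / scale


def worst_case_log_ratio(sum_a, sum_b, epsilon, outputs):
    """Largest |log p(y | sum_a) - log p(y | sum_b)| over candidate outputs y,
    when the sum query is released with Lap(1/epsilon) noise."""
    scale = 1.0 / epsilon
    return max(
        abs(laplace_log_density(y, sum_a, scale) - laplace_log_density(y, sum_b, scale))
        for y in outputs
    )


def true_sum(d_i, x, dependent, d_j_fixed=0.4):
    """Noise-free sum D_i + D_j, with D_j = 0.5*D_i + 0.5*X in the dependent case."""
    d_j = 0.5 * d_i + 0.5 * x if dependent else d_j_fixed
    return d_i + d_j


epsilon = 1.0
x = 0.3  # arbitrary fixed value of X in [0, 1]
outputs = [k / 100.0 for k in range(-500, 501)]  # grid of possible noisy outputs

independent = worst_case_log_ratio(
    true_sum(0.0, x, False), true_sum(1.0, x, False), epsilon, outputs
)
dependent = worst_case_log_ratio(
    true_sum(0.0, x, True), true_sum(1.0, x, True), epsilon, outputs
)
print(independent, dependent)
```

Changing D_i by 1 shifts the sum by 1 in the independent case but by 1.5 in the dependent case, because D_j moves by 0.5 along with it. The worst-case log-ratio therefore grows from ε to 1.5ε, matching the exp(ε) and exp(1.5ε) guarantees on the slide.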


Appendix 2: Model to Compute Dependent Coefficient

Here, we use the friend-based model to compute the probabilistic dependence relationship: a user’s location can be estimated from her friend’s location based on the distance between their locations. Specifically, the probability of a user j being located at $d_j$ when her friend i is located at $d_i$ is

$P(D_j = d_j \mid D_i = d_i) = a\,(\|d_j - d_i\|_1 + b)^{-c}$ (1)

where a > 0, b > 0, c > 0.
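
Equation (1) translates directly into code. In this minimal sketch, the function name `friend_location_probability`, the representation of locations as coordinate tuples, and the parameter values in the example are all hypothetical; in practice a, b, and c would have to be fitted to data and chosen so that the values form a valid probability distribution.

```python
def friend_location_probability(d_j, d_i, a, b, c):
    """Evaluate a * (||d_j - d_i||_1 + b) ** (-c) for two locations.

    d_j and d_i are equal-length coordinate sequences; a, b, c must be positive.
    """
    if a <= 0 or b <= 0 or c <= 0:
        raise ValueError("a, b and c must be positive")
    if len(d_j) != len(d_i):
        raise ValueError("locations must have the same number of coordinates")
    l1_distance = sum(abs(p - q) for p, q in zip(d_j, d_i))
    return a * (l1_distance + b) ** (-c)


if __name__ == "__main__":
    # Illustrative parameters only: the value decays as the friends move apart.
    for distance in (0.0, 1.0, 5.0):
        print(distance, friend_location_probability((distance, 0.0), (0.0, 0.0), 0.5, 1.0, 2.0))
```

The value decreases as the L1 distance between the two locations grows, so nearby friends are treated as more strongly dependent.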
