26
DP-Finder: Finding Differential Privacy Violations by Sampling and Optimization Benjamin Bichsel Timon Gehr Dana Drachsler-Cohen Petar Tsankov Martin Vechev

DP-Finder: Finding Differential Privacy Violations by ... · yes no yes. How precise is our estimate? Precision of Pr[ 𝑥∈Φ] and Pr[ 𝑥′∈Φ] Sampling ... • How efficient

  • Upload
    vomien

  • View
    216

  • Download
    0

Embed Size (px)

Citation preview

Page 1: DP-Finder: Finding Differential Privacy Violations by ... · yes no yes. How precise is our estimate? Precision of Pr[ 𝑥∈Φ] and Pr[ 𝑥′∈Φ] Sampling ... • How efficient

DP-Finder: Finding Differential Privacy Violations by Sampling and Optimization

Benjamin Bichsel Timon GehrDana

Drachsler-CohenPetar Tsankov Martin Vechev

Page 2: DP-Finder: Finding Differential Privacy Violations by ... · yes no yes. How precise is our estimate? Precision of Pr[ 𝑥∈Φ] and Pr[ 𝑥′∈Φ] Sampling ... • How efficient

Differential Privacy – Basic Setting

7

2

# disease

Page 3: DP-Finder: Finding Differential Privacy Violations by ... · yes no yes. How precise is our estimate? Precision of Pr[ 𝑥∈Φ] and Pr[ 𝑥′∈Φ] Sampling ... • How efficient

Differential Privacy – Basic Setting

7.3

3

# disease+ noise

What about my privacy?

Page 4: DP-Finder: Finding Differential Privacy Violations by ... · yes no yes. How precise is our estimate? Precision of Pr[ 𝑥∈Φ] and Pr[ 𝑥′∈Φ] Sampling ... • How efficient

Differential Privacy - Intuition

Change my data

7.3

7.6

4

# disease+ noise

?or# disease

+ noise

Page 5: DP-Finder: Finding Differential Privacy Violations by ... · yes no yes. How precise is our estimate? Precision of Pr[ 𝑥∈Φ] and Pr[ 𝑥′∈Φ] Sampling ... • How efficient

Differential Privacy – More Abstractly

Neighboring

𝐹(𝑥)

𝐹(𝑥′)

5

𝐹

𝐹Attacker check

𝐹(𝑥) ∈ Φ?

𝑥

𝑥′

Attacker check𝐹(𝑥′) ∈ Φ?

Page 6: DP-Finder: Finding Differential Privacy Violations by ... · yes no yes. How precise is our estimate? Precision of Pr[ 𝑥∈Φ] and Pr[ 𝑥′∈Φ] Sampling ... • How efficient

Attacker check

𝐹(𝑥) ∈ Φ?

Attacker check𝐹(𝑥′) ∈ Φ?

𝐹(𝑥)

Differential Privacy - Definition

Neighbouring

𝐹(𝑥′)

6

𝐹

𝐹

𝑥

𝑥′

𝜀-DP:

Pr[𝐹 𝑥 ∈ Φ]

Pr[𝐹(𝑥′) ∈ Φ]≤ exp 𝜀 ≈ 1 + 𝜀

Challenges induced by DP:

• Proving/checking 𝜀-DP is hard (buggy algorithms)

• Proof strategies not complete• Proofs only provide upper bounds

Page 7: DP-Finder: Finding Differential Privacy Violations by ... · yes no yes. How precise is our estimate? Precision of Pr[ 𝑥∈Φ] and Pr[ 𝑥′∈Φ] Sampling ... • How efficient

( , , )𝑥′𝑥 Φ

Pr[𝐹 𝑥 ∈ Φ]

Pr[𝐹(𝑥′) ∈ Φ]> exp 𝜀

logPr[𝐹 𝑥 ∈ Φ]

Pr[𝐹(𝑥′) ∈ Φ]> 𝜀

that violate 𝜀-DP:

𝜀-DP Counterexamples

7

Page 8: DP-Finder: Finding Differential Privacy Violations by ... · yes no yes. How precise is our estimate? Precision of Pr[ 𝑥∈Φ] and Pr[ 𝑥′∈Φ] Sampling ... • How efficient

( , , )𝑥′𝑥 Φ

Pr[𝐹 𝑥 ∈ Φ]

Pr[𝐹(𝑥′) ∈ Φ]> exp 𝜀

logPr[𝐹 𝑥 ∈ Φ]

Pr[𝐹(𝑥′) ∈ Φ]> 𝜀

that violate 𝜀-DP:

𝜀-DP Counterexamples

8

Maximize ε(𝑥, 𝑥′, Φ)

Page 9: DP-Finder: Finding Differential Privacy Violations by ... · yes no yes. How precise is our estimate? Precision of Pr[ 𝑥∈Φ] and Pr[ 𝑥′∈Φ] Sampling ... • How efficient

Bounds on "true" 𝜀

9

Evaluation: We get precise and large ε, close to known

upper bounds

Proven: 10%-DP(𝜀 = 10% = 0.1)

Counterexample:9.9%-DP

Counterexample:15%-DP

Counterexample:5%-DP

Page 10: DP-Finder: Finding Differential Privacy Violations by ... · yes no yes. How precise is our estimate? Precision of Pr[ 𝑥∈Φ] and Pr[ 𝑥′∈Φ] Sampling ... • How efficient

𝜀-DP Counterexamples

10

Goal: Maximize ε(𝑥, 𝑥′, Φ)

Challenge 1: Expensive to compute ε precisely

Challenge 2: Search space is sparse: Few 𝑥, 𝑥′, Φ lead to

large ε(𝑥, 𝑥′, Φ)

Estimate 𝜀by sampling

Make Ƹ𝜀differentiable

𝜀 Ƹ𝜀 Ƹ𝜀 𝑑

Page 11: DP-Finder: Finding Differential Privacy Violations by ... · yes no yes. How precise is our estimate? Precision of Pr[ 𝑥∈Φ] and Pr[ 𝑥′∈Φ] Sampling ... • How efficient

Step 1: Estimate 𝜀

11

Estimate 𝜀by sampling

𝜀 Ƹ𝜀

Page 12: DP-Finder: Finding Differential Privacy Violations by ... · yes no yes. How precise is our estimate? Precision of Pr[ 𝑥∈Φ] and Pr[ 𝑥′∈Φ] Sampling ... • How efficient

Estimating 𝜀

12

𝜀 x, x′, Φ ≔ logPr[𝐹 𝑥 ∈ Φ]

Pr[𝐹(𝑥′) ∈ Φ]

Page 13: DP-Finder: Finding Differential Privacy Violations by ... · yes no yes. How precise is our estimate? Precision of Pr[ 𝑥∈Φ] and Pr[ 𝑥′∈Φ] Sampling ... • How efficient

𝜀 x, x′, Φ ≔ logPr[𝐹 𝑥 ∈ Φ]

Pr[𝐹(𝑥′) ∈ Φ]

Estimating 𝜀

Pr 𝐹(𝑥) ∈ Φ =1

𝑛𝑖=1

𝑛

check𝐹,Φ𝑖 (𝑥)

67%

33%

13

𝐹 7.3

𝐹 7.6

𝐹 6.8

𝐹(𝑥)𝑥 check𝐹,Φ𝑖 (𝑥)

yes

no

yes

Page 14: DP-Finder: Finding Differential Privacy Violations by ... · yes no yes. How precise is our estimate? Precision of Pr[ 𝑥∈Φ] and Pr[ 𝑥′∈Φ] Sampling ... • How efficient

How precise is our estimate?

Precision of Pr[𝐹 𝑥 ∈ Φ]and Pr[𝐹 𝑥′ ∈ Φ]

Sampling

effort 𝑛Precision of 𝜀

14

Exponential search

Counterexample:9.9% ± 10%-DP

Counterexample:9.9% ± 2 ∙ 10−3-DP

vs

Page 15: DP-Finder: Finding Differential Privacy Violations by ... · yes no yes. How precise is our estimate? Precision of Pr[ 𝑥∈Φ] and Pr[ 𝑥′∈Φ] Sampling ... • How efficient

Estimating precisely is expensive

Estimating 𝜀 up to an error of 2 ∙ 10−3 with confidence of 90% 15

104

Probabillstic guaranteesHeuristic

Efficient Heuristic

Page 16: DP-Finder: Finding Differential Privacy Violations by ... · yes no yes. How precise is our estimate? Precision of Pr[ 𝑥∈Φ] and Pr[ 𝑥′∈Φ] Sampling ... • How efficient

𝐹 7.3

𝐹 7.6

𝐹 6.8

1

𝑛

𝑖=1

𝑛

check𝐹,Φ𝑖 𝑥

1

𝑛

𝑖=1

𝑛

check𝐹,Φ𝑖 𝑥′

Follows 2D Gaussian distribution

Applying the M-CLT (Correlation)

16

yes

no

yes

𝐹

𝐹

𝐹

7.3

7.6

8.2

yes

no

no

Page 17: DP-Finder: Finding Differential Privacy Violations by ... · yes no yes. How precise is our estimate? Precision of Pr[ 𝑥∈Φ] and Pr[ 𝑥′∈Φ] Sampling ... • How efficient

Obtaining a Confidence Interval for 𝜀

17

Distribution of GaussGauss (correlated):

D. V. Hinkley. 1969. On the Ratio of Two Correlated Normal Random Variables.Biometrika 56, 3 (1969), 635–639. http://www.jstor.org/stable/2334671

Joint likelihood ofPr[𝐹 𝑥 ∈ Φ]

Pr[𝐹 𝑥′ ∈ Φ]

Likelihood ofε(𝑥, x′, Φ)

Confidence Interval for ε(𝑥, x′, Φ)

Page 18: DP-Finder: Finding Differential Privacy Violations by ... · yes no yes. How precise is our estimate? Precision of Pr[ 𝑥∈Φ] and Pr[ 𝑥′∈Φ] Sampling ... • How efficient

How precise is our estimate?

18

Counterexample:9.9% ± 10%-DP

Counterexample:9.9% ± 2 ∙ 10−3-DP

vs

Page 19: DP-Finder: Finding Differential Privacy Violations by ... · yes no yes. How precise is our estimate? Precision of Pr[ 𝑥∈Φ] and Pr[ 𝑥′∈Φ] Sampling ... • How efficient

Step 2: Finding Counterexamples

19

Ƹ𝜀 Ƹ𝜀 𝑑

Make Ƹ𝜀differentiable

Page 20: DP-Finder: Finding Differential Privacy Violations by ... · yes no yes. How precise is our estimate? Precision of Pr[ 𝑥∈Φ] and Pr[ 𝑥′∈Φ] Sampling ... • How efficient

How can we optimize our estimate?

maximize

¬𝐵 ↝ 1 − 𝐵

𝐵1 ∧ 𝐵2 ↝ 𝐵1 ∙ 𝐵2

if 𝐵 ∶ 𝑥 = 𝐸1 else ∶ 𝑥 = 𝐸2 ↝ 𝑥 = 𝐵 ∙ 𝐸1 + (1 − 𝐵) ∙ 𝐸2

20

Ƹ𝜖 𝑥, 𝑥′, Φ = log1𝑛σ𝑖=1𝑛 check𝐹,Φ

𝑖 (𝑥)1𝑛σ𝑖=1𝑛 check𝐹,Φ

𝑖 (𝑥′)

Not differentiable

Goals• Make differentiable• Preserve semantics

Page 21: DP-Finder: Finding Differential Privacy Violations by ... · yes no yes. How precise is our estimate? Precision of Pr[ 𝑥∈Φ] and Pr[ 𝑥′∈Φ] Sampling ... • How efficient

How can we optimize our estimate?

maximize

21

Ƹ𝜖 𝑥, 𝑥′, Φ = log1𝑛σ𝑖=1𝑛 check𝐹,Φ

𝑖 (𝑥)1𝑛σ𝑖=1𝑛 check𝐹,Φ

𝑖 (𝑥′)

Not differentiable

• Maximize using SLSQP (supports hard constraints for

neighborhood)

• Random starting point (+ restart)

• What about division by zero?

• What about very small denominators?

Page 22: DP-Finder: Finding Differential Privacy Violations by ... · yes no yes. How precise is our estimate? Precision of Pr[ 𝑥∈Φ] and Pr[ 𝑥′∈Φ] Sampling ... • How efficient

Main differences to Ding et al.

22

Dimension Ding et al. This work

Problem statement ε 𝑥, 𝑥′, Φ > ε0? Maximize ε(𝑥, 𝑥′, Φ)

Approach Statistical tests Estimate + confidence interval

Search By patterns Gradient descent (incremental)

Page 23: DP-Finder: Finding Differential Privacy Violations by ... · yes no yes. How precise is our estimate? Precision of Pr[ 𝑥∈Φ] and Pr[ 𝑥′∈Φ] Sampling ... • How efficient

Evaluation

23

• How precise is the differentiable estimate?

• How efficient is DP-Finder in finding violations compared to

random search?

Exact solver (PSI) for ground truth

Page 24: DP-Finder: Finding Differential Privacy Violations by ... · yes no yes. How precise is our estimate? Precision of Pr[ 𝑥∈Φ] and Pr[ 𝑥′∈Φ] Sampling ... • How efficient

Precision of Differentiable Estimate

24

𝜀

AlgorithmsƸ𝜀 𝑑 𝜀

Page 25: DP-Finder: Finding Differential Privacy Violations by ... · yes no yes. How precise is our estimate? Precision of Pr[ 𝑥∈Φ] and Pr[ 𝑥′∈Φ] Sampling ... • How efficient

Random vs Optimized

25

Random start Optimized

Page 26: DP-Finder: Finding Differential Privacy Violations by ... · yes no yes. How precise is our estimate? Precision of Pr[ 𝑥∈Φ] and Pr[ 𝑥′∈Φ] Sampling ... • How efficient

Differential Privacy

Conclusion

26

𝜀-DP Counterexamples

( , , )Estimate 𝜀 Finding Counterexamples

𝜀 Ƹ𝜀 Ƹ𝜀 𝑑Ƹ𝜀