Privacy-preserving Release of Statistics: Differential Privacy Piotr Mardziel or Anupam Datta CMU Fall 2018 18734: Foundations of Privacy

Privacy-preserving Release of Statistics: Differential Privacy · 10-10-2018

Page 1: Privacy-preserving Release of Statistics: Differential Privacy

Piotr Mardziel or Anupam Datta, CMU

Fall 2018

18734: Foundations of Privacy

Page 2: Privacy-Preserving Statistics: Non-Interactive Setting

Goals:
• Accurate statistics (low noise)
• Preserve individual privacy (what does that mean?)

Database D (records x1 … xn), maintained by a trusted curator:
• Census data
• Health data
• Network data
• …

Sanitization: add noise, sample, generalize, suppress

The analyst receives the sanitized database D’

Page 3: Privacy-Preserving Statistics: Interactive Setting

Goals:
• Accurate statistics (low noise)
• Preserve individual privacy (what does that mean?)

Database D (records x1 … xn), maintained by a trusted curator:
• Census data
• Health data
• Network data
• …

The analyst sends a query f, for example: # individuals with salary > $30K

The curator returns f(D) + noise

Page 4: Some possible defenses

• Anonymize data
  – Re-identification, information amplification
• Queries over large data sets
  – Differencing attack
• Query auditing
  – Refusal leaks, computational tractability
• Summary statistics
  – Frequency lists
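The differencing attack named above can be sketched in a few lines. This is a toy illustration only: the names, salaries, and the `count_over` helper are hypothetical and not part of the lecture. It shows how two queries that each range over a large set can still isolate one person.

```python
# Hypothetical toy data: names and salaries are made up for illustration.
salaries = {"alice": 52_000, "bob": 28_000, "carol": 61_000, "dave": 33_000}


def count_over(db, threshold, exclude=None):
    """Count records with salary > threshold, optionally leaving one person out."""
    return sum(1 for name, s in db.items() if name != exclude and s > threshold)


# Both queries cover (almost) the whole database, yet their difference
# reveals whether bob's salary exceeds the threshold.
q_all = count_over(salaries, 30_000)
q_without_bob = count_over(salaries, 30_000, exclude="bob")
bob_over_30k = (q_all - q_without_bob) == 1
```

Restricting queries to large sets therefore does not help on its own: the attacker never asks about bob directly.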

Page 5: Classical Intuition for Privacy

• “If the release of statistics S makes it possible to determine the value [of private information] more accurately than is possible without access to S, a disclosure has taken place.” [Dalenius 1977]
  – Privacy means that anything that can be learned about a respondent from the statistical database can be learned without access to the database
• Similar to semantic security of encryption

Page 6: Impossibility Result [Dwork, Naor 2006]

• Result: For reasonable “breach”, if sanitized database contains information about database, then some adversary breaks this definition
• Example
  – Terry Gross is two inches shorter than the average Lithuanian woman
  – DB allows computing average height of a Lithuanian woman
  – This DB breaks Terry Gross’s privacy according to this definition… even if her record is not in the database!

Page 7: Very Informal Proof Sketch

• Suppose DB is uniformly random
• “Breach” is predicting a predicate g(DB)
  – Example: g(DB) = “Terry Gross’s height = 6 feet”
• Adversary’s background knowledge:
  – r, [H(r; San(DB)) ⊕ g(DB)], where H is a suitable hash function, r = H(DB)
  – Example: “Terry Gross is two inches shorter than the average Lithuanian woman”
• By itself, does not leak anything about DB
• Together with San(DB), reveals g(DB)
  – Example: San(DB) = “average height of a Lithuanian woman”
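The masking trick in the sketch can be made concrete with a toy example. Everything here is an assumption made for illustration: SHA-256 stands in for the "suitable hash function" H, the predicate is a single bit, r is drawn at random rather than derived from DB, and the strings are placeholders.

```python
import hashlib
import secrets


def H(r: bytes, san: bytes) -> int:
    """One-bit hash of (r; San(DB)). SHA-256 is a stand-in, not the lecture's H."""
    return hashlib.sha256(r + b"|" + san).digest()[0] & 1


# Hypothetical placeholder values.
san_db = b"average height of a Lithuanian woman"  # San(DB), the released statistic
g_db = 1                                          # secret predicate bit g(DB)

r = secrets.token_bytes(16)                       # assumption: random r for this toy
aux = (r, H(r, san_db) ^ g_db)                    # adversary's background knowledge

# Given San(DB), the adversary recomputes the mask and strips it off.
recovered = H(aux[0], san_db) ^ aux[1]
```

Without `san_db` the second component of `aux` is a masked bit; with it, `recovered` equals `g_db`, which mirrors the two bullets "by itself, does not leak" and "together with San(DB), reveals g(DB)".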

Page 8: Differential Privacy: Idea

Released statistic is about the same if any individual’s record is removed from the database

[Dwork, McSherry, Nissim, Smith 2006]

Page 9: An Information Flow Idea

Changing input databases in a specific way changes output statistic by a small amount

Page 10: Not Absolute Confidentiality

Does not guarantee that Terry Gross’s height won’t be learned by the adversary

Page 11: Differential Privacy: Definition

Randomized sanitization function κ has ε-differential privacy if for all data sets D1 and D2 differing by at most one element and all subsets S of the range of κ,

Pr[κ(D1) ∈ S] ≤ e^ε · Pr[κ(D2) ∈ S]

Example: the answer to the query “# individuals with salary > $30K” is in the range [100, 110] with approximately the same probability in D1 and D2
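The salary example can be checked numerically. The sketch below assumes, purely for illustration, that κ answers the counting query with zero-mean Laplace noise of scale 1/ε; the counts 105 and 104 and the value of ε are hypothetical. It computes both probabilities exactly from the Laplace CDF and tests the inequality in both directions.

```python
import math


def laplace_cdf(y: float, scale: float) -> float:
    """CDF of a zero-mean Laplace distribution with the given scale."""
    if y < 0:
        return 0.5 * math.exp(y / scale)
    return 1.0 - 0.5 * math.exp(-y / scale)


def prob_in_range(true_count: float, lo: float, hi: float, scale: float) -> float:
    """Pr[true_count + Laplace noise lands in [lo, hi]]."""
    return laplace_cdf(hi - true_count, scale) - laplace_cdf(lo - true_count, scale)


epsilon = 0.5
scale = 1.0 / epsilon                      # assumed noise scale for a counting query
p1 = prob_in_range(105, 100, 110, scale)   # hypothetical true count on D1
p2 = prob_in_range(104, 100, 110, scale)   # D2: one fewer record above $30K
bound = math.exp(epsilon)
holds = p1 <= bound * p2 and p2 <= bound * p1
```

The two probabilities differ slightly, but their ratio stays within the factor e^ε that the definition allows.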

Page 12: Achieving Differential Privacy: Interactive Setting

How much and what type of noise should be added?

User asks: “Tell me f(D)”

Database D (records x1 … xn) answers: f(D) + noise

Page 13: Example: Noise Addition

Slide: Adam Smith

Page 14: Global Sensitivity

Slide: Adam Smith
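For reference, a standard definition of global sensitivity is given below. It is stated here as an assumption, since the slide text carries only the title; "neighbors" means data sets differing in at most one element.

```latex
\[
  GS_f \;=\; \max_{x,\,x' \text{ neighbors}} \bigl\lVert f(x) - f(x') \bigr\rVert_1
\]
```

For a real-valued query the norm is just the absolute value of the difference.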

Page 15: Exercise

• Function f: # individuals with salary > $30K
• Global sensitivity of f = ?
• Answer: 1
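The answer can be sanity-checked by brute force on a tiny domain. This is only a sketch: the three salary values and the helper names are hypothetical, and neighboring databases are taken to differ by removing one record.

```python
from itertools import product


def count_over_30k(db):
    """f(D): number of individuals with salary > $30K."""
    return sum(1 for salary in db if salary > 30_000)


def empirical_sensitivity(f, values, n):
    """Max |f(D) - f(D')| over all size-n databases D and D' = D minus one record."""
    worst = 0
    for db in product(values, repeat=n):
        for i in range(n):
            neighbor = db[:i] + db[i + 1:]
            worst = max(worst, abs(f(db) - f(neighbor)))
    return worst


# Hypothetical salary domain, small enough to enumerate exhaustively.
sensitivity = empirical_sensitivity(count_over_30k, [10_000, 30_000, 50_000], n=3)
```

Removing one person changes the count by at most one, whatever the other records are, so the enumeration agrees with the slide's answer.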

Page 16: Background on Probability Theory (see Oct 11, 2013 recitation)

Page 17: Continuous Probability Distributions

• Probability density function (PDF), f_X
• Example distributions
  – Normal, exponential, Gaussian, Laplace

Page 18: Laplace Distribution

Mean = μ

Variance = 2b²

PDF = (1 / (2b)) · exp(−|x − μ| / b)

Source: Wikipedia

Page 19: Laplace Distribution

Change of notation from previous slide: x → y, μ → 0, b → λ
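In the new notation the density becomes h(y) = exp(−|y| / λ) / (2λ), with mean 0 and variance 2λ². A minimal sketch of that density and one way to sample from it follows. The function names are hypothetical, and sampling as the difference of two exponential draws is one standard construction chosen here for simplicity, not something the slides prescribe.

```python
import math
import random


def laplace_pdf(y: float, lam: float) -> float:
    """h(y) = exp(-|y| / lam) / (2 * lam): Laplace density with mean 0, scale lam."""
    return math.exp(-abs(y) / lam) / (2.0 * lam)


def sample_laplace(lam: float, rng: random.Random) -> float:
    """Draw from Laplace(0, lam) as the difference of two exponentials with mean lam."""
    return rng.expovariate(1.0 / lam) - rng.expovariate(1.0 / lam)


rng = random.Random(0)
lam = 2.0
draws = [sample_laplace(lam, rng) for _ in range(100_000)]
mean = sum(draws) / len(draws)
variance = sum((d - mean) ** 2 for d in draws) / len(draws)
```

With λ = 2 the sample mean should sit near 0 and the sample variance near 2λ² = 8.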

Page 20: Achieving Differential Privacy

Page 21: Laplace Mechanism

Slide: Adam Smith
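A minimal sketch of the Laplace mechanism in its usual form: answer the query with f(x) plus zero-mean Laplace noise of scale λ = GS_f / ε. The function name, the salary records, and the seed are hypothetical. The sampler is a simple textbook construction and is not hardened for production use.

```python
import random


def laplace_mechanism(true_answer: float, sensitivity: float, epsilon: float,
                      rng: random.Random) -> float:
    """Return true_answer + Laplace(0, sensitivity / epsilon) noise."""
    if epsilon <= 0 or sensitivity <= 0:
        raise ValueError("epsilon and sensitivity must be positive")
    lam = sensitivity / epsilon
    # Difference of two exponentials with mean lam is Laplace(0, lam).
    noise = rng.expovariate(1.0 / lam) - rng.expovariate(1.0 / lam)
    return true_answer + noise


# Hypothetical salary records; the query is "# individuals with salary > $30K".
salaries = [52_000, 28_000, 61_000, 33_000, 19_000]
true_count = sum(1 for s in salaries if s > 30_000)

rng = random.Random(18734)
noisy_count = laplace_mechanism(true_count, sensitivity=1.0, epsilon=0.5, rng=rng)
```

A smaller ε gives a larger λ and therefore noisier answers, which is the accuracy versus privacy trade-off stated in the goals.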

Page 22: Laplace Mechanism: Proof Idea

[Figure: Pr[A(x) = t] and Pr[A(x’) = t]]

Page 23: Laplace Mechanism: More details

• $\Pr[A(x) \in S] = \int_{t \in S} p(A(x) = t)\,dt$

• $p(A(x) = t) = p(L = t - f(x)) = h(t - f(x))$

• $\dfrac{h(t - f(x))}{h(t - f(x'))} \le \dfrac{\exp(-|t - f(x)| / \lambda)}{\exp(-|t - f(x')| / \lambda)} \le \exp\!\left(\dfrac{|f(x) - f(x')|}{\lambda}\right) \le \exp\!\left(\dfrac{GS_f}{\lambda}\right)$

• $\dfrac{\Pr[A(x) \in S]}{\Pr[A(x') \in S]} = \dfrac{\int_{t \in S} p(A(x) = t)\,dt}{\int_{t \in S} p(A(x') = t)\,dt} = \dfrac{\int_{t \in S} h(t - f(x))\,dt}{\int_{t \in S} h(t - f(x'))\,dt} \le \exp\!\left(\dfrac{GS_f}{\lambda}\right)$

• For $\lambda = GS_f / \epsilon$, we have $\epsilon$-differential privacy

Page 24: Example: Noise Addition

Slide: Adam Smith

Page 25: Using Global Sensitivity

• Many natural functions have low global sensitivity
  – Histogram, covariance matrix, strongly convex optimization problems
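The histogram case is easy to check directly. The sketch below uses hypothetical categorical records and treats neighbors as databases that differ by removing one record; under that assumption one bin changes by exactly 1. If neighbors instead differ by replacing a record, the L1 change can be 2.

```python
from collections import Counter


def histogram(db, bins):
    """Count how many records fall in each named bin."""
    counts = Counter(db)
    return [counts[b] for b in bins]


def l1_distance(u, v):
    """Sum of absolute per-bin differences."""
    return sum(abs(a - b) for a, b in zip(u, v))


# Hypothetical categorical records.
bins = ["census", "health", "network"]
db = ["census", "health", "health", "network"]

# Largest L1 change in the histogram over all single-record removals.
worst = max(
    l1_distance(histogram(db, bins), histogram(db[:i] + db[i + 1:], bins))
    for i in range(len(db))
)
```

The sensitivity does not grow with the number of bins, which is why a whole histogram can be released with the same noise scale as a single count.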

Page 26: Composition Theorem

• If A1 is ε1-differentially private and A2 is ε2-differentially private and they use independent random coins, then <A1, A2> is (ε1 + ε2)-differentially private
• Repeated querying degrades privacy; degradation is quantifiable
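One way to act on the composition theorem is to keep a running total of the ε spent across queries. The sketch below is an illustration under assumptions: the `PrivacyBudget` class, the `noisy_count` helper, and the numbers are hypothetical, and each query is a sensitivity-1 count answered with fresh Laplace noise.

```python
import random


class PrivacyBudget:
    """Tracks cumulative epsilon under basic sequential composition."""

    def __init__(self, total_epsilon: float):
        self.total = total_epsilon
        self.spent = 0.0

    def charge(self, epsilon: float) -> None:
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        if self.spent + epsilon > self.total + 1e-12:
            raise RuntimeError("privacy budget exhausted")
        self.spent += epsilon


def noisy_count(true_count, epsilon, budget, rng):
    """Answer a sensitivity-1 counting query with fresh, independent Laplace noise."""
    budget.charge(epsilon)
    lam = 1.0 / epsilon
    return true_count + rng.expovariate(1.0 / lam) - rng.expovariate(1.0 / lam)


rng = random.Random(0)
budget = PrivacyBudget(total_epsilon=1.0)
a1 = noisy_count(105, 0.4, budget, rng)   # epsilon_1 = 0.4
a2 = noisy_count(105, 0.6, budget, rng)   # epsilon_2 = 0.6, so the pair is 1.0-DP
```

Once the total is spent, further queries are refused, which makes the "quantifiable degradation" explicit.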

Page 27: Applications

• Netflix data set [McSherry, Mironov 2009; MSR]
  – Accuracy of differentially private recommendations (wrt one movie rating) comparable to baseline set by Netflix
• Network trace data sets [McSherry, Mahajan 2010; MSR]

Page 28: Challenge: High Sensitivity

• Approach: Add noise proportional to sensitivity to preserve ε-differential privacy
• Improvements:
  – Smooth sensitivity [Nissim, Raskhodnikova, Smith 2007; BGU-PSU]
  – Restricted sensitivity [Blocki, Blum, Datta, Sheffet 2013; CMU]

Page 29: Challenge: Identifying an Individual’s Information

• Information about an individual may not be just in their own record
  – Example: In a social network, information about node A is also in node B influenced by A, for example, because A may have caused a link between B and C

Page 30: Differential Privacy: Summary

• An approach to releasing privacy-preserving statistics
• A rigorous privacy guarantee
  – Significant activity in theoretical CS community
• Several applications to real data sets
  – Recommendation systems, network trace data, ...
• Some challenges
  – High sensitivity, identifying individual’s information, repeated querying