Upload
others
View
5
Download
0
Embed Size (px)
Citation preview
Marco GaboardiRoom: 338-B
[email protected]://www.buffalo.edu/~gaboardi
CSE711 Topics in Differential Privacy
Data
Data
Statistics over Data
Private Queries?
Differential Privacy: motivation
A. Haeberlen
Motivation: Protecting privacy
2USENIX Security (August 12, 2011)
19144 02-15-1964 flue
19146 05-22-1955 brain tumor
34505 11-01-1988 depression
25012 03-12-1972 diabets
16544 06-14-1956 anemia
...
Fair insurance
plan?
Resend
policy?
DataI know Bob is
born 05-22-1955
in Philadelphia…
A first approach: anonymization.I Using different correlated anonymized data sets one
can learn private data.
Differential Privacy: motivation
A. Haeberlen
Motivation: Protecting privacy
2USENIX Security (August 12, 2011)
19144 02-15-1964 flue
19146 05-22-1955 brain tumor
34505 11-01-1988 depression
25012 03-12-1972 diabets
16544 06-14-1956 anemia
...
Fair insurance
plan?
Resend
policy?
DataI know Bob is
born 05-22-1955
in Philadelphia…
A first approach: anonymization.I Using different correlated anonymized data sets one
can learn private data.
Private Queries?
Differential Privacy: motivation
A. Haeberlen
Motivation: Protecting privacy
2USENIX Security (August 12, 2011)
19144 02-15-1964 flue
19146 05-22-1955 brain tumor
34505 11-01-1988 depression
25012 03-12-1972 diabets
16544 06-14-1956 anemia
...
Fair insurance
plan?
Resend
policy?
DataI know Bob is
born 05-22-1955
in Philadelphia…
A first approach: anonymization.I Using different correlated anonymized data sets one
can learn private data.
Differential Privacy: motivation
A. Haeberlen
Motivation: Protecting privacy
2USENIX Security (August 12, 2011)
19144 02-15-1964 flue
19146 05-22-1955 brain tumor
34505 11-01-1988 depression
25012 03-12-1972 diabets
16544 06-14-1956 anemia
...
Fair insurance
plan?
Resend
policy?
DataI know Bob is
born 05-22-1955
in Philadelphia…
A first approach: anonymization.I Using different correlated anonymized data sets one
can learn private data.
medicalcorrelation?
queryanswer
Private Queries?
Differential Privacy: motivation
A. Haeberlen
Motivation: Protecting privacy
2USENIX Security (August 12, 2011)
19144 02-15-1964 flue
19146 05-22-1955 brain tumor
34505 11-01-1988 depression
25012 03-12-1972 diabets
16544 06-14-1956 anemia
...
Fair insurance
plan?
Resend
policy?
DataI know Bob is
born 05-22-1955
in Philadelphia…
A first approach: anonymization.I Using different correlated anonymized data sets one
can learn private data.
Differential Privacy: motivation
A. Haeberlen
Motivation: Protecting privacy
2USENIX Security (August 12, 2011)
19144 02-15-1964 flue
19146 05-22-1955 brain tumor
34505 11-01-1988 depression
25012 03-12-1972 diabets
16544 06-14-1956 anemia
...
Fair insurance
plan?
Resend
policy?
DataI know Bob is
born 05-22-1955
in Philadelphia…
A first approach: anonymization.I Using different correlated anonymized data sets one
can learn private data.
answer
query
Does Joe have
cancer?
Private Queries?
Differential Privacy: motivation
A. Haeberlen
Motivation: Protecting privacy
2USENIX Security (August 12, 2011)
19144 02-15-1964 flue
19146 05-22-1955 brain tumor
34505 11-01-1988 depression
25012 03-12-1972 diabets
16544 06-14-1956 anemia
...
Fair insurance
plan?
Resend
policy?
DataI know Bob is
born 05-22-1955
in Philadelphia…
A first approach: anonymization.I Using different correlated anonymized data sets one
can learn private data.
Differential Privacy: motivation
A. Haeberlen
Motivation: Protecting privacy
2USENIX Security (August 12, 2011)
19144 02-15-1964 flue
19146 05-22-1955 brain tumor
34505 11-01-1988 depression
25012 03-12-1972 diabets
16544 06-14-1956 anemia
...
Fair insurance
plan?
Resend
policy?
DataI know Bob is
born 05-22-1955
in Philadelphia…
A first approach: anonymization.I Using different correlated anonymized data sets one
can learn private data.
Does Joe have
cancer?
Anonymization?
Differential Privacy: motivation
A. Haeberlen
Motivation: Protecting privacy
2USENIX Security (August 12, 2011)
19144 02-15-1964 flue
19146 05-22-1955 brain tumor
34505 11-01-1988 depression
25012 03-12-1972 diabets
16544 06-14-1956 anemia
...
Fair insurance
plan?
Resend
policy?
DataI know Bob is
born 05-22-1955
in Philadelphia…
A first approach: anonymization.I Using different correlated anonymized data sets one
can learn private data.
Differential Privacy: motivation
A. Haeberlen
Motivation: Protecting privacy
2USENIX Security (August 12, 2011)
19144 02-15-1964 flue
19146 05-22-1955 brain tumor
34505 11-01-1988 depression
25012 03-12-1972 diabets
16544 06-14-1956 anemia
...
Fair insurance
plan?
Resend
policy?
DataI know Bob is
born 05-22-1955
in Philadelphia…
A first approach: anonymization.I Using different correlated anonymized data sets one
can learn private data.
Anonymization?
Differential Privacy: motivation
A. Haeberlen
Motivation: Protecting privacy
2USENIX Security (August 12, 2011)
19144 02-15-1964 flue
19146 05-22-1955 brain tumor
34505 11-01-1988 depression
25012 03-12-1972 diabets
16544 06-14-1956 anemia
...
Fair insurance
plan?
Resend
policy?
DataI know Bob is
born 05-22-1955
in Philadelphia…
A first approach: anonymization.I Using different correlated anonymized data sets one
can learn private data.
Differential Privacy: motivation
A. Haeberlen
Motivation: Protecting privacy
2USENIX Security (August 12, 2011)
19144 02-15-1964 flue
19146 05-22-1955 brain tumor
34505 11-01-1988 depression
25012 03-12-1972 diabets
16544 06-14-1956 anemia
...
Fair insurance
plan?
Resend
policy?
DataI know Bob is
born 05-22-1955
in Philadelphia…
A first approach: anonymization.I Using different correlated anonymized data sets one
can learn private data.
medicalcorrelation
queryanswer
Anonymization?
Differential Privacy: motivation
A. Haeberlen
Motivation: Protecting privacy
2USENIX Security (August 12, 2011)
19144 02-15-1964 flue
19146 05-22-1955 brain tumor
34505 11-01-1988 depression
25012 03-12-1972 diabets
16544 06-14-1956 anemia
...
Fair insurance
plan?
Resend
policy?
DataI know Bob is
born 05-22-1955
in Philadelphia…
A first approach: anonymization.I Using different correlated anonymized data sets one
can learn private data.
Differential Privacy: motivation
A. Haeberlen
Motivation: Protecting privacy
2USENIX Security (August 12, 2011)
19144 02-15-1964 flue
19146 05-22-1955 brain tumor
34505 11-01-1988 depression
25012 03-12-1972 diabets
16544 06-14-1956 anemia
...
Fair insurance
plan?
Resend
policy?
DataI know Bob is
born 05-22-1955
in Philadelphia…
A first approach: anonymization.I Using different correlated anonymized data sets one
can learn private data.
answer
query?!?
Anonymous Data
Anonymization
Anonymous Data
Anonymization
Anonymous Data
Anonymization
Additional Data
Anonymous Data
Anonymization
Additional Data
A Possible Solution:Differential Privacy
Differential Privacy
Differential Privacy: motivation
A. Haeberlen
Motivation: Protecting privacy
2USENIX Security (August 12, 2011)
19144 02-15-1964 flue
19146 05-22-1955 brain tumor
34505 11-01-1988 depression
25012 03-12-1972 diabets
16544 06-14-1956 anemia
...
Fair insurance
plan?
Resend
policy?
DataI know Bob is
born 05-22-1955
in Philadelphia…
A first approach: anonymization.I Using different correlated anonymized data sets one
can learn private data.
Differential Privacy: motivation
A. Haeberlen
Motivation: Protecting privacy
2USENIX Security (August 12, 2011)
19144 02-15-1964 flue
19146 05-22-1955 brain tumor
34505 11-01-1988 depression
25012 03-12-1972 diabets
16544 06-14-1956 anemia
...
Fair insurance
plan?
Resend
policy?
DataI know Bob is
born 05-22-1955
in Philadelphia…
A first approach: anonymization.I Using different correlated anonymized data sets one
can learn private data.
Differential Privacy: the idea
A. Haeberlen
Promising approach: Differential privacy
3USENIX Security (August 12, 2011)
Private data
N(flue, >1955)?
826±10
N(brain tumor, 05-22-1955)?
3 ±700
Noise
?!?
Differential Privacy:Ensuring that the presence/absence of an individual has anegligible statistical effect on the query’s result.
Trade-off between utility and privacy.
DMNS’06
Differential Privacy
Differential Privacy: motivation
A. Haeberlen
Motivation: Protecting privacy
2USENIX Security (August 12, 2011)
19144 02-15-1964 flue
19146 05-22-1955 brain tumor
34505 11-01-1988 depression
25012 03-12-1972 diabets
16544 06-14-1956 anemia
...
Fair insurance
plan?
Resend
policy?
DataI know Bob is
born 05-22-1955
in Philadelphia…
A first approach: anonymization.I Using different correlated anonymized data sets one
can learn private data.
Differential Privacy: motivation
A. Haeberlen
Motivation: Protecting privacy
2USENIX Security (August 12, 2011)
19144 02-15-1964 flue
19146 05-22-1955 brain tumor
34505 11-01-1988 depression
25012 03-12-1972 diabets
16544 06-14-1956 anemia
...
Fair insurance
plan?
Resend
policy?
DataI know Bob is
born 05-22-1955
in Philadelphia…
A first approach: anonymization.I Using different correlated anonymized data sets one
can learn private data.
Differential Privacy: the idea
A. Haeberlen
Promising approach: Differential privacy
3USENIX Security (August 12, 2011)
Private data
N(flue, >1955)?
826±10
N(brain tumor, 05-22-1955)?
3 ±700
Noise
?!?
Differential Privacy:Ensuring that the presence/absence of an individual has anegligible statistical effect on the query’s result.
Trade-off between utility and privacy.
query
answer + noise
medicalcorrelation?
DMNS’06
Differential Privacy
Differential Privacy: motivation
A. Haeberlen
Motivation: Protecting privacy
2USENIX Security (August 12, 2011)
19144 02-15-1964 flue
19146 05-22-1955 brain tumor
34505 11-01-1988 depression
25012 03-12-1972 diabets
16544 06-14-1956 anemia
...
Fair insurance
plan?
Resend
policy?
DataI know Bob is
born 05-22-1955
in Philadelphia…
A first approach: anonymization.I Using different correlated anonymized data sets one
can learn private data.
Differential Privacy: motivation
A. Haeberlen
Motivation: Protecting privacy
2USENIX Security (August 12, 2011)
19144 02-15-1964 flue
19146 05-22-1955 brain tumor
34505 11-01-1988 depression
25012 03-12-1972 diabets
16544 06-14-1956 anemia
...
Fair insurance
plan?
Resend
policy?
DataI know Bob is
born 05-22-1955
in Philadelphia…
A first approach: anonymization.I Using different correlated anonymized data sets one
can learn private data.
Differential Privacy: the idea
A. Haeberlen
Promising approach: Differential privacy
3USENIX Security (August 12, 2011)
Private data
N(flue, >1955)?
826±10
N(brain tumor, 05-22-1955)?
3 ±700
Noise
?!?
Differential Privacy:Ensuring that the presence/absence of an individual has anegligible statistical effect on the query’s result.
Trade-off between utility and privacy.
answer + noise
query
?!?
DMNS’06
Differential Privacy: motivation
A. Haeberlen
Motivation: Protecting privacy
2USENIX Security (August 12, 2011)
19144 02-15-1964 flue
19146 05-22-1955 brain tumor
34505 11-01-1988 depression
25012 03-12-1972 diabets
16544 06-14-1956 anemia
...
Fair insurance
plan?
Resend
policy?
DataI know Bob is
born 05-22-1955
in Philadelphia…
A first approach: anonymization.I Using different correlated anonymized data sets one
can learn private data.
Differential Privacy: motivation
A. Haeberlen
Motivation: Protecting privacy
2USENIX Security (August 12, 2011)
19144 02-15-1964 flue
19146 05-22-1955 brain tumor
34505 11-01-1988 depression
25012 03-12-1972 diabets
16544 06-14-1956 anemia
...
Fair insurance
plan?
Resend
policy?
DataI know Bob is
born 05-22-1955
in Philadelphia…
A first approach: anonymization.I Using different correlated anonymized data sets one
can learn private data.
Differential Privacy: the idea
A. Haeberlen
Promising approach: Differential privacy
3USENIX Security (August 12, 2011)
Private data
N(flue, >1955)?
826±10
N(brain tumor, 05-22-1955)?
3 ±700
Noise
?!?
Differential Privacy:Ensuring that the presence/absence of an individual has anegligible statistical effect on the query’s result.
Trade-off between utility and privacy.
?!?
Differential Privacy
DMNS’06
Foundamental Law of Information ReconstructionThe release of too many overly accurate statistics gives
privacy violations.
[DinurNissim02]
Privacy vs. Utility
Privacy vs. Utility
UtilityPrivacy
Privacy vs. Utility
Utility
Privacy
Privacy vs. Utility
Utility
Privacy
Privacy vs. Utility
UtilityPrivacy
This class:understanding the mathematical and
computational meaning of this trade-off.
18
• US Census Bureau - onTheMap, new releases
• Google - RAPPOR tool for Chrome
• Apple - typing statistics reports
• LeapYear - startup
• …
Some Official Users
Syllabus for the courseLocation: Davis 113A Time: Thursday 16:30 - 19:00? Office Hours: Friday 11:00 - 12:00 or by appointment Discussion forums: NB and Piazza class website: http://www.buffalo.edu/~gaboardi/teaching/CSE711-spring17.html
Course load: - presenting part of the material, - commenting on the material presented every week, beforehand on NB and during class, - working on a project and presenting the results (optional)
Grading
50% - material presentation 50% - engagement and participation in class and on NB and Piazza
Reference Material
Cynthia Dwork and Aaron Roth, “The Algorithmic Foundations of Differential Privacy,” 2014 Linked from the class website.
Salil Vadhan, “The Complexity of Differential Privacy” 2016. Linked from the class website.
Questions?