CSE711 Topics in Differential Privacy
University at Buffalo, Spring 2017, Lecture 1

Marco Gaboardi
Room: 338-B
[email protected]
http://www.buffalo.edu/~gaboardi


Data

Statistics over Data


Private Queries?

Differential Privacy: motivation

A. Haeberlen
Motivation: Protecting privacy
USENIX Security (August 12, 2011)

Data:

19144  02-15-1964  flu
19146  05-22-1955  brain tumor
34505  11-01-1988  depression
25012  03-12-1972  diabetes
16544  06-14-1956  anemia
...

Fair insurance plan?

Resend policy?

I know Bob was born 05-22-1955 in Philadelphia…

A first approach: anonymization.
  • Using different correlated anonymized data sets, one can learn private data.


medical correlation?

query

answer


answer

query

Does Joe have cancer?
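The question on this slide hints at why exact query answers are risky even when no rows are released. A minimal sketch, assuming a hypothetical `count` interface and made-up records (the names and values are illustrative, not taken from the lecture data): two exact counting queries that differ only in whether Joe is included reveal his diagnosis.

```python
# Hypothetical records: (name, has_cancer). Illustrative values only.
RECORDS = [
    ("Alice", False),
    ("Bob", True),
    ("Joe", True),
    ("Carol", False),
]

def count(predicate):
    """Exact counting query: how many records satisfy the predicate."""
    return sum(1 for record in RECORDS if predicate(record))

# Query 1: how many people have cancer?
everyone = count(lambda r: r[1])
# Query 2: how many people other than Joe have cancer?
everyone_but_joe = count(lambda r: r[1] and r[0] != "Joe")

# The difference of two aggregate answers isolates one individual.
joe_has_cancer = (everyone - everyone_but_joe) == 1
print(joe_has_cancer)
```

Neither query mentions a diagnosis for a single person, yet together they answer the question exactly.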


Anonymization?


medical correlation

query

answer


answer

query

?!?


Anonymous Data

Anonymization


Additional Data
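Combining anonymous data with additional data can be sketched as a linkage attack. This sketch assumes the released table keeps ZIP code and birth date (as in the records shown in the lecture) and that Philadelphia ZIP codes start with 191; the helper name `link` is hypothetical.

```python
# "Anonymized" release: names removed, but ZIP code and birth date kept.
# Rows mirror the lecture's example table.
ANONYMIZED = [
    ("19144", "02-15-1964", "flu"),
    ("19146", "05-22-1955", "brain tumor"),
    ("34505", "11-01-1988", "depression"),
    ("25012", "03-12-1972", "diabetes"),
    ("16544", "06-14-1956", "anemia"),
]

def link(rows, birth_date, zip_prefix):
    """Return the diagnoses of rows matching the attacker's side knowledge."""
    return [
        diagnosis
        for zip_code, born, diagnosis in rows
        if born == birth_date and zip_code.startswith(zip_prefix)
    ]

# Additional data: "Bob was born 05-22-1955 in Philadelphia".
# Assumption: Philadelphia ZIP codes start with "191".
matches = link(ANONYMIZED, "05-22-1955", "191")
print(matches)
```

A single matching row means the attacker learns Bob's diagnosis even though his name never appears in the release.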


A Possible Solution: Differential Privacy


Differential Privacy


Differential Privacy: the idea

A. Haeberlen
Promising approach: Differential privacy
USENIX Security (August 12, 2011)

Private data

N(flu, >1955)?  826 ± 10

N(brain tumor, 05-22-1955)?  3 ± 700

Noise

?!?

Differential Privacy: ensuring that the presence/absence of an individual has a negligible statistical effect on the query’s result.

Trade-off between utility and privacy.

DMNS’06
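The informal statement above is usually made precise as follows. This is the standard definition from DMNS’06; the symbols M, D, D', S and ε are notation introduced here, not taken from the slide. A randomized mechanism M is ε-differentially private if, for every pair of data sets D and D' that differ in one individual's record and every set S of possible outputs:

```latex
\Pr[M(D) \in S] \;\le\; e^{\varepsilon} \cdot \Pr[M(D') \in S]
```

A smaller ε forces the two output distributions closer together, which is the sense in which one individual's presence or absence has a negligible statistical effect.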


query

answer + noise

medical correlation?
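One common way to produce "answer + noise" for a counting query is the Laplace mechanism from DMNS’06. The sketch below is an illustration under stated assumptions, not the lecture's code: the function names and the value of `epsilon` are hypothetical, and the noise is sampled with the standard library as the difference of two exponential draws.

```python
import random

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def noisy_count(true_count, epsilon, rng):
    """Answer a counting query with Laplace noise of scale 1/epsilon.

    A count changes by at most 1 when one person is added or removed
    (sensitivity 1), so scale 1/epsilon gives epsilon-differential privacy.
    """
    return true_count + laplace_noise(1.0 / epsilon, rng)

rng = random.Random(0)  # fixed seed so the sketch is repeatable
epsilon = 0.1           # hypothetical privacy parameter
big = noisy_count(826, epsilon, rng)   # large count: noise is relatively small
small = noisy_count(3, epsilon, rng)   # tiny count: noise can swamp the answer
print(round(big), round(small))
```

The same noise scale that barely disturbs a count in the hundreds makes a count of 3 nearly meaningless, which is the intended behavior: queries about very few people stay uninformative.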


?!?


Fundamental Law of Information Reconstruction

The release of too many overly accurate statistics leads to privacy violations.

[DinurNissim02]
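A toy illustration of this law, in the spirit of [DinurNissim02] but not the paper's algorithm: if a curator answers every subset-sum query over n private bits with error at most E, an attacker who tries all candidate data sets and keeps one consistent with every answer is wrong in at most 4E positions. The brute-force search below is exponential in n and only meant for tiny examples; the secret bits and function names are hypothetical.

```python
import itertools
import random

def subset_sum(bits, subset):
    """Sum of the bits at the given positions."""
    return sum(bits[i] for i in subset)

def all_subsets(n):
    """Yield every subset of {0, ..., n-1} as a list of positions."""
    for mask in itertools.product([0, 1], repeat=n):
        yield [i for i in range(n) if mask[i]]

def answer_queries(secret, max_error, rng):
    """Curator: answer every subset-sum query with error at most max_error."""
    return [
        (subset, subset_sum(secret, subset) + rng.randint(-max_error, max_error))
        for subset in all_subsets(len(secret))
    ]

def reconstruct(n, answers, max_error):
    """Attacker: return the first candidate consistent with every answer."""
    for candidate in itertools.product([0, 1], repeat=n):
        if all(abs(subset_sum(candidate, s) - a) <= max_error for s, a in answers):
            return list(candidate)
    return None

secret = [1, 0, 1, 1, 0, 0, 1, 0]   # hypothetical private bits, one per person
rng = random.Random(0)
exact = reconstruct(len(secret), answer_queries(secret, 0, rng), 0)
noisy = reconstruct(len(secret), answer_queries(secret, 1, rng), 1)
errors = sum(a != b for a, b in zip(secret, noisy))
print(exact == secret, errors)
```

With exact answers the secret is recovered completely; with error bounded by 1 the reconstruction can differ from the secret in at most 4 positions.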


Privacy vs. Utility


Utility

Privacy


This class: understanding the mathematical and computational meaning of this trade-off.


Some Official Users

• US Census Bureau - OnTheMap, new releases
• Google - RAPPOR tool for Chrome
• Apple - typing statistics reports
• LeapYear - startup
• …
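Tools such as RAPPOR build on randomized response, where each user perturbs their own answer before reporting it. The sketch below shows classic coin-flip randomized response only; it is not RAPPOR's actual encoding, and the function names and the simulated population are hypothetical.

```python
import random

def randomized_response(truth, rng):
    """Report the true bit with probability 3/4, the flipped bit otherwise.

    First coin: heads -> answer truthfully.
    Tails -> answer with a second, independent fair coin.
    """
    if rng.random() < 0.5:
        return truth
    return rng.random() < 0.5

def estimate_fraction(reports):
    """Debias: E[report] = 1/4 + p/2, so p = 2 * mean - 1/2."""
    mean = sum(reports) / len(reports)
    return 2.0 * mean - 0.5

rng = random.Random(0)
true_bits = [i < 3000 for i in range(10000)]   # hypothetical: 30% hold the trait
reports = [randomized_response(b, rng) for b in true_bits]
print(round(estimate_fraction(reports), 2))
```

Each individual report is deniable, yet the population-level fraction can still be estimated from many reports.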


Syllabus for the course

Location: Davis 113A
Time: Thursday 16:30 - 19:00?
Office Hours: Friday 11:00 - 12:00 or by appointment
Discussion forums: NB and Piazza
Class website: http://www.buffalo.edu/~gaboardi/teaching/CSE711-spring17.html

Course load:
  • presenting part of the material,
  • commenting on the material presented every week, beforehand on NB and during class,
  • working on a project and presenting the results (optional)


Grading

50% - material presentation
50% - engagement and participation in class and on NB and Piazza


Reference Material

Cynthia Dwork and Aaron Roth, “The Algorithmic Foundations of Differential Privacy,” 2014. Linked from the class website.

Salil Vadhan, “The Complexity of Differential Privacy,” 2016. Linked from the class website.


Questions?