28
Privacy of profile- based ad targeting Alexander Smal and Ilya Mironov

Privacy of profile-based ad targeting Alexander Smal and Ilya Mironov

Embed Size (px)

Citation preview

Privacy of profile-based ad targeting

Alexander Smal

and

Ilya Mironov

2

User-profile targeting• Goal: increase impact of your ads by targeting a group

potentially interested in your product.• Examples:

• Social Network

Profile = user’s personal information + friends• Search Engine

Profile = search queries + webpages visited by user

Privacy of profile-based targeting

3

Facebook ad targeting

Privacy of profile-based targeting

4

Characters

Privacy of profile-based targeting

My system is private!

Advertising company Privacy researcher

5

Simple attack [Korolova’10]

Targeted ad

Public:- 32 y.o. single man- Mountain View, CA- ….- has cat

Private:- likes fishing

Show

- 32 y.o. single man- Mountain View, CA….- has cat- likes fishing

Amazing cat food for $0.99!

Nice!Likes fishing

# of impressions

Likes fishingnoise

Privacy of profile-based targeting

Jon

Eve

6Privacy of profile-based targeting

My system is private! Unless your

targeting is not private, it is not!

How can I target

privately?

Advertising company Privacy researcher

7

How to protect information?• Basic idea: add some noise

• Explicitly• Implicit in the data

• noiseless privacy [BBGLT11] • natural privacy [BD11]

• Two types of explicit noise• Output perturbation

• Dynamically add noise to answers

• Input perturbation• Modify the database

Privacy of profile-based targeting

8Privacy of profile-based targeting

I like input perturbation

better…

Advertising company Privacy researcher

9

Input perturbation• Pro:

• Pan-private (not storing initial data)• Do it once• Simpler architecture

Privacy of profile-based targeting

10Privacy of profile-based targeting

I like input perturbation

better… Signal is sparse and non-random

Advertising company Privacy researcher

11

Adding noise• Two main difficulties in adding noise:

Privacy of profile-based targeting

1 0 0 0 1 0 0 0 1 0

0 0 0 0 0 0 0 0 1 1

0 0 0 1 0 0 0 0 0 1

0 0 0 1 0 0 1 0 0 0

0 1 1 0 0 0 0 0 0 1

0 0 0 0 1 0 0 1 0 0

0 0 1 0 0 0 0 1 1 0

0 0 0 0 0 0 0 1 0 1

1 0 1 0 0 0 0 0 0 0

0 1 0 1 0 1 0 0 0 0

• Sparse profiles1 0 0 1 1 1 0 1 0 0

0 1 0 0 0 0 0 0 1 1

1 0 0 0 0 0 0 0 1 0

1 0 0 1 1 1 1 0 0 0

0 1 1 0 0 0 0 0 1 1

1 0 0 0 1 0 0 1 0 0

0 1 1 1 1 1 0 1 1 1

0 1 0 0 0 0 0 1 0 0

1 0 1 1 1 1 0 0 0 0

0 1 0 0 0 0 0 1 0 1

• Dependent bits

1 1 1 1 1 0 0 1 0 1

differential privacy

deniability

“Smart noise”

12Privacy of profile-based targeting

I like input perturbation

better… Signal is sparse and non-random

Let’s shoot for deniability, and add

“smart noise”!

Advertising company Privacy researcher

13

“Smart noise”• Consider two extreme cases

• All bits are independent

independent noise• All bits are correlated with correlation coefficient 1

correlated noise

• “Smart noise” hypothesis: “If we know the exact model we can add right noise”

Privacy of profile-based targeting

Aha!

14

Dependent bits in real data• Netflix prize competition data

• ~480k users, ~18k movies, ~100m ratings

• Estimate movie-to-movie correlation• Fact that a user rated a movie

• Visualize graph of correlations• Edge – correlation with correlation coefficient > 0.5

Privacy of profile-based targeting

15

Netflix movie correlations

Privacy of profile-based targeting

16Privacy of profile-based targeting

Let’s construct models where “smart noise”

fails

Let’s shoot for deniability, and add

“smart noise”!

Advertising company Privacy researcher

17

How can “smart noise” fail?

Privacy of profile-based targeting

large = relative distance

large

18

Models of user profiles• hidden independent bits• public bits

• Public bits are some functions of hidden bits

Privacy of profile-based targeting

1 0 1 … 0 1

1 1 0 1 … 0 1 0 1

𝑓

• Are users well separated?

hidden bits

public bits

19Privacy of profile-based targeting

Error-correcting codes• Constant relative distance• Unique decoding• Explicit, efficient

20Privacy of profile-based targeting

But this model is unrealistic!

See — unless the noise is >25%, no privacy

Let me see what I can do with monotone

functions…

Advertising company Privacy researcher

21

Monotone functions• Monotone function: for all and for all values of ,

• Monotonicity is a natural property

• Monotone functions are bad for constructing error-correcting codes

Privacy of profile-based targeting

22

Approximate error-correcting codes• -approximate error-correcting code with distance :

function such that

• If less than fraction of is corrupted then we can reconstruct within fraction of bits.

• We need -approximate error-correcting code with constant distance.

Privacy of profile-based targeting

blatant non-privacy

23

Noise sensitivity• Noise sensitivity of function :

where is chosen uniformly at random, is formed by flipping each bit of with probability .

• If is big

• If is small

Privacy of profile-based targeting

1 0 1 … 0 1

1 1 0 1 … 0 1 0 1

𝑓

hidden bits

public bits

1 1 1 … 0 0

1 0 0 0 … 1 1 1 1

1 0 1 … 0 1

1 1 0 1 … 0 1 0 1

𝑓

hidden bits

public bits

1 1 1 … 0 0

1 1 1 1 … 0 0 0 1

24

Monotone functions• There exist highly sensitive monotone functions [MO’03].• Theorem: there exists monotone -approximate error-

correcting code with constant distance on average. • Idea of proof: Let be random independent monotone

boolean functions, such that and depends only on bits of .

• Let .• With high probability for random there is no such that

and• For Talagrand -approximate error-correcting code with

constant distance on average.

Privacy of profile-based targeting

25Privacy of profile-based targeting

Hmmm. Does smart noise ever work?

If the model is monotone, blatant non-privacy is still

possible

Advertising company Privacy researcher

26

Linear threshold model• Function is a linear threshold function, if there exist real

numbers ’s such that

• Theorem [Peres’04]: Let be a linear threshold function, then

No o(1)-approximate error-correcting

code with O(1) distance

Privacy of profile-based targeting

27

Conclusion• Two separate issues with input perturbation:

• Sparseness• Dependencies

• “Smart noise” hypothesis:Even for a publicly known, relatively simple model, constant corruption of profiles may lead to blatant non-privacy.

• Connection between noise sensitivity of boolean functions and privacy

• Open questions:• Linear threshold privacy-preserving mechanism?• Existence of interactive privacy-preserving solutions?

Privacy of profile-based targeting

ArbitraryMonotoneLinear thresholdfallacy

28

Thank for your attention!

Special thanks for Cynthia Dwork, Moises Goldszmidt, Parikshit Gopalan, Frank McSherry, Moni Naor, Kunal Talwar, and Sergey Yekhanin.

Privacy of profile-based targeting