
Comprehensive Privacy Analysis of Deep Learning:
Passive and Active White-box Inference Attacks against Centralized and Federated Learning

Milad Nasr¹, Reza Shokri², Amir Houmansadr¹

¹University of Massachusetts Amherst, ²National University of Singapore

Deep Learning Tasks

[Figure: examples of sensitive domains where deep learning is applied: medical, location, financial, and personal history data]

Privacy Threats

• We provide a comprehensive privacy analysis of deep learning algorithms.
• Our objective is to measure the information leakage of deep learning models about their training data.
• In particular, we focus on membership inference attacks:
• Can an adversary infer whether or not a particular data record was part of the training set?

Membership Inference

[Figure: a data record drawn from the data distribution is fed to the trained model; the model's output vector shows a different pattern depending on whether the record is a member or a non-member of the training data]

What is the cause of this behavior?

Training a Model

[Figure: training data points plotted in a 2-D feature space]

SGD update:

$W \leftarrow W - \alpha \nabla_W L$

Model parameters change in the opposite direction of each training data point's gradient.

$W$: model parameters; $L$: loss; $\nabla_W L$: loss gradient w.r.t. the parameters; $\alpha$: learning rate.
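As a concrete illustration, here is a minimal PyTorch sketch of this update rule; the toy model, batch, and learning rate are placeholders, not the paper's setup.

```python
# Minimal sketch of the SGD update W <- W - alpha * grad_W(L) in PyTorch.
import torch
import torch.nn as nn

model = nn.Linear(2, 2)           # toy model for the 2-D example above
loss_fn = nn.CrossEntropyLoss()
alpha = 0.1                       # learning rate (illustrative)

x = torch.randn(8, 2)             # a batch of training points
y = torch.randint(0, 2, (8,))     # their labels

loss = loss_fn(model(x), y)
model.zero_grad()
loss.backward()                   # computes grad_W(L) for every parameter
with torch.no_grad():
    for w in model.parameters():
        w -= alpha * w.grad       # step opposite to the gradient
```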

Training a Model

[Figure: the model being fitted to the training data points]

Training a Model

[Figure: the fitted model together with a non-member data point]

Gradients leak information by behaving differently for non-member data vs. member data.

Gradients Leak Information

[Figure: gradient norm distribution for members vs. non-members; the two distributions are clearly separable]
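The paper trains an attack model on white-box features; the sketch below shows just this core signal, a per-record gradient norm, with a hypothetical threshold `tau` standing in for the learned classifier. `model` and `loss_fn` are the placeholders from the earlier sketch; `x` and `y` are tensors for a single record.

```python
# Sketch: per-record gradient norm as a membership signal. On a trained
# (overfitted) model, member records tend to have smaller gradient norms.
import torch

def gradient_norm(model, loss_fn, x, y):
    """L2 norm of grad_W(L(x, y)) over all model parameters."""
    model.zero_grad()
    loss = loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0))
    loss.backward()
    grads = [p.grad.flatten() for p in model.parameters()]
    return torch.cat(grads).norm().item()

# Simple threshold attack for illustration only; tau would be calibrated
# on data with known membership. The paper uses a learned attack model.
def is_member(model, loss_fn, x, y, tau=0.1):
    return gradient_norm(model, loss_fn, x, y) < tau
```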

Different Learning/Attack Settings

• Fully trained
• Black-box / white-box
• Fine-tuning
• Federated learning
• Central / local attacker
• Passive / active

Federated Model

[Figure: collaborators 1 through n each train a local model on their local data and send updates to a central model, which aggregates them and sends the result back]
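For concreteness, here is a sketch of one aggregation round, assuming each collaborator returns a state_dict of float parameters; `federated_round` and `local_train` are illustrative names, not the paper's API.

```python
# Sketch of one round of federated averaging.
import copy
import torch

def federated_round(central_model, collaborators, local_train):
    """local_train(model, data) -> locally updated state_dict."""
    states = []
    for data in collaborators:                 # one entry per collaborator
        local = copy.deepcopy(central_model)   # start from the central model
        states.append(local_train(local, data))
    # average each parameter across collaborators (assumes float tensors)
    avg = {k: torch.stack([s[k] for s in states]).mean(dim=0)
           for k in states[0]}
    central_model.load_state_dict(avg)
    return central_model
```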

Federated Learning

[Figure: snapshots of the model fitted to the training data at epoch 1, epoch 2, ..., epoch n]

Multiple observations: the attacker sees the model at every epoch, and every point leaves traces on the target function.

Active Attack on Federated Learning

[Figure: a target member and a target non-member point in the 2-D example]

An active attacker changes the parameters in the direction of the gradient (gradient ascent on the target records).

Active Attack on Federated Learning

For data points that are in the training dataset, local training will compensate for the active attacker; for non-members, no such correction occurs.

[Figure: after local training, the model recovers around the target member but stays perturbed around the target non-member]
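A minimal sketch of this active step, assuming the attacker controls the update it sends to the server; `gamma` is an illustrative step size and `targets` is a list of (x, y) tensors for the target records.

```python
# Sketch of the active attack: before sending its update, the attacker
# runs gradient ASCENT on the target records, pushing the parameters in
# the direction of their gradients. If a target is a member, the victims'
# local SGD pulls its loss back down; if not, its loss stays inflated.
import torch

def ascend_on_targets(model, loss_fn, targets, gamma=0.1):
    for x, y in targets:
        loss = loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0))
        model.zero_grad()
        loss.backward()
        with torch.no_grad():
            for w in model.parameters():
                w += gamma * w.grad    # ascent: opposite of the SGD step
    return model.state_dict()          # sent as the attacker's "update"
```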

Active Attacks in the Federated Model

[Figure: gradient norm vs. training epochs for target members, target non-members, member instances, and non-member instances; under the active attack the member and non-member curves separate clearly]

Scenario 1: Fully Trained Model

[Figure: the dataset and the training process are not observable; the attacker holds only the trained model]

For a given input (e.g., a cat/dog image) and its output vector, the attacker observes:

• Outputs of all layers
• Loss
• Gradients of all layers
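A sketch of how these white-box observables could be collected in PyTorch with forward hooks; the names are placeholders, and each leaf module is assumed to return a single tensor.

```python
# Sketch: collect layer outputs, loss, and parameter gradients for one record.
import torch

def whitebox_features(model, loss_fn, x, y):
    outputs, hooks = {}, []
    for name, module in model.named_modules():
        if len(list(module.children())) == 0:   # leaf layers only
            hooks.append(module.register_forward_hook(
                lambda mod, inp, out, name=name:
                    outputs.__setitem__(name, out.detach())))
    loss = loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0))
    model.zero_grad()
    loss.backward()
    grads = {n: p.grad.clone() for n, p in model.named_parameters()}
    for h in hooks:
        h.remove()
    return outputs, loss.item(), grads
```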

Scenario 2: Central Attacker in Federated Model

[Figure: collaborators 1 through n train local models on their local data and send updates to the central model, which the attacker controls]

Scenario 2: Central Attacker in Federated Model

[Figure: the collaborators' local data is not observable; the central attacker sees each collaborator's local updates]

In addition to the local attacker's observations, the central attacker can:

• Target individual collaborators through their local updates
• Isolate any collaborator by sending it its own isolated model

Scenario 3: Local Attacker in Federated Learning

[Figure: the local attacker participates as one of collaborators 1 through n; the other collaborators' local data and training are not observable. The attacker can also act actively through its own updates]

At each epoch t = 1, ..., N, the attacker observes the central model and extracts the outputs of all layers, the loss, and the gradients of all layers.

Score Function

The attacker thus has a different observation per epoch: for each of epochs 1 through N, the outputs of all layers, the loss, and the gradients of all layers. A score function combines them:

Input → Score → Member / Non-member
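A minimal sketch of how such a score function might aggregate the per-epoch observations, assuming one snapshot of the central model per observed epoch. `whitebox_features` is the earlier sketch; `attack_net` is a hypothetical trained attack classifier, and the two-feature summary (loss, gradient norm) is a simplification of the full per-layer features.

```python
# Sketch: aggregate per-epoch white-box observations into one score.
import torch

def membership_score(epoch_models, loss_fn, x, y, attack_net):
    feats = []
    for model in epoch_models:                  # one snapshot per epoch
        _, loss, grads = whitebox_features(model, loss_fn, x, y)
        gnorm = torch.cat([g.flatten() for g in grads.values()]).norm()
        feats.append(torch.tensor([loss, gnorm.item()]))
    # concatenate the per-epoch feature vectors and score the record;
    # a higher score means "more likely a member"
    return torch.sigmoid(attack_net(torch.cat(feats)))
```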

Experimental Setup

• Unlike previous works, we used publicly available pretrained models
• We used all common regularization techniques
• We implemented our attacks in PyTorch
• We used the following datasets:
  • CIFAR100
  • Purchase100
  • Texas100

Results

Attacks on Pretrained Models

• The last layer contains the most information
• Gradients leak significant information

Federated Attacks

• The global (central) attacker is more powerful than the local attacker
• An active attacker can force SGD to leak more information

Conclusions

• We go beyond the black-box scenario and try to understand why a deep learning model leaks information
• Gradients leak information about the training dataset
• An attacker in federated learning can take advantage of multiple observations to extract more information
• In the federated setting, an attacker can actively force SGD to leak information

Questions?

Overall Attack Model

[Figure: for each epoch 1 through n, the attack extracts, per candidate record: the output and gradient of layer 1, layer 2, ..., layer n, plus the loss and the label. Each feature type feeds a dedicated component (output components, gradient components, a loss component, and a label component); the components' embeddings are combined to decide member vs. non-member]
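A sketch of this architecture under simplifying assumptions: fully connected encoders with illustrative dimensions stand in for the paper's per-layer CNN/FC components, and one flattened output/gradient vector per epoch stands in for the full per-layer features.

```python
# Sketch of the attack architecture: one encoder per feature type,
# embeddings concatenated across epochs, then a final classifier.
import torch
import torch.nn as nn

class AttackModel(nn.Module):
    def __init__(self, out_dim, grad_dim, n_epochs, emb=64):
        super().__init__()
        self.output_enc = nn.Sequential(nn.Linear(out_dim, emb), nn.ReLU())
        self.grad_enc   = nn.Sequential(nn.Linear(grad_dim, emb), nn.ReLU())
        self.loss_enc   = nn.Sequential(nn.Linear(1, emb), nn.ReLU())
        self.label_enc  = nn.Sequential(nn.Linear(1, emb), nn.ReLU())
        self.classifier = nn.Sequential(
            nn.Linear(4 * emb * n_epochs, emb), nn.ReLU(),
            nn.Linear(emb, 1))   # single logit: member vs. non-member

    def forward(self, outputs, grads, losses, labels):
        # each argument is (n_epochs, feature_dim) for one candidate record
        per_epoch = [torch.cat([self.output_enc(o), self.grad_enc(g),
                                self.loss_enc(l), self.label_enc(t)])
                     for o, g, l, t in zip(outputs, grads, losses, labels)]
        return self.classifier(torch.cat(per_epoch))
```

Training such an attack model requires records with known membership status (data the attacker holds, plus known non-members).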

Scenario 4: Fine-Tuning Model

[Figure: a general model is trained on a dataset and then fine-tuned on a specialized dataset; the datasets and both training runs are not observable. The attacker holds both the general and the fine-tuned model and extracts from each the outputs of all layers, the loss, and the gradients of all layers, classifying each record as: general dataset, specialized dataset, or none]
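A sketch of this three-way attack, extracting features from both models; `whitebox_features` is the earlier sketch and `attack_net3` is a hypothetical three-way attack classifier, with the same simplified two-feature summary as before.

```python
# Sketch: classify a record as general-set, specialized-set, or non-member
# using white-box features from BOTH the general and fine-tuned models.
import torch

def finetune_membership(general_model, finetuned_model,
                        loss_fn, x, y, attack_net3):
    feats = []
    for model in (general_model, finetuned_model):
        _, loss, grads = whitebox_features(model, loss_fn, x, y)
        gnorm = torch.cat([g.flatten() for g in grads.values()]).norm()
        feats.append(torch.tensor([loss, gnorm.item()]))
    logits = attack_net3(torch.cat(feats))   # 3 logits, one per class
    return logits.argmax().item()            # 0: general, 1: specialized, 2: none
```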

Fine-Tuning Attacks

Both the specialized and the general datasets are vulnerable to membership attacks.

[Table (columns): Dataset | Arch | Distinguishing specialized/general datasets | Distinguishing general/non-member datasets | Distinguishing specialized/non-member datasets]

Federated Attacks

[Figure: federated attack results]

Fine-Tuning Model Leakage

[Figure: same setup as Scenario 4; the attacker observes both the general and the fine-tuned model to separate general-dataset, specialized-dataset, and non-member records]
