Comprehensive Privacy Analysis of Deep Learning: Passive and Active White-box Inference Attacks against Centralized and Federated Learning
Milad Nasr¹, Reza Shokri², Amir Houmansadr¹
¹University of Massachusetts Amherst, ²National University of Singapore


Page 1:

Comprehensive Privacy Analysis of Deep Learning:
Passive and Active White-box Inference Attacks against Centralized and Federated Learning

Milad Nasr¹, Reza Shokri², Amir Houmansadr¹
¹University of Massachusetts Amherst, ²National University of Singapore

Page 2:

Deep Learning Tasks

Medical, Location, Financial, Personal History

Page 3:

Privacy Threats

• We provide a comprehensive privacy analysis of deep learning algorithms.
• Our objective is to measure the information leakage of deep learning models about their training data.
• In particular, we focus on membership inference attacks:
• Can an adversary infer whether or not a particular data record was part of the training set?

Page 4:

Membership Inference

[Diagram: data drawn from the data distribution forms the train data; the trained model maps an input to an output vector, whose pattern separates members from non-members]

What is the cause of this behavior?

Page 5:

Training a Model

[Figure: training data points in a 2D feature space]

SGD: W = W − α ∇_W L

Model parameters change in the opposite direction of each training data point's gradient.

W: model parameters, L: loss, ∇_W L: gradient of the loss w.r.t. the parameters
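The update rule above can be sketched numerically. A minimal example, assuming a least-squares loss L = 0.5·(x·W − y)² on a single toy training point (all names here are illustrative, not from the paper):

```python
import numpy as np

# Minimal sketch of the SGD update W = W - alpha * grad_W(L), assuming a
# least-squares loss L = 0.5 * (x @ W - y)^2 on one toy training point.
def sgd_step(W, x, y, alpha=0.1):
    residual = x @ W - y          # prediction error on this training point
    grad = residual * x           # grad_W of 0.5 * (x @ W - y)^2
    return W - alpha * grad       # parameters move against the gradient

W = np.zeros(2)
x, y = np.array([1.0, 2.0]), 3.0
for _ in range(100):
    W = sgd_step(W, x, y)
loss = 0.5 * (x @ W - y) ** 2
print(loss)  # ~0: the model has fitted the training point
```

Repeating the step drives the loss on the training point toward zero, which is exactly the fingerprint membership inference exploits.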

Page 6:

Training a Model

[Figure: training data points with the fitted model]

Page 7:

Training a Model

[Figure: training data points and non-member data points against the fitted model]

Gradients leak information by behaving differently for non-member data vs. member data.

Page 8:

Gradients Leak Information

[Figure: gradient norm distributions for members vs. non-members]

Separable distributions
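The separation above can be reproduced on a toy model. A sketch of the gradient-norm membership signal, assuming an over-parameterized least-squares model that interpolates its members (all names and sizes are illustrative):

```python
import numpy as np

# After a model overfits its members, the loss gradient at member points is
# near zero, while non-member points still produce large gradients.
rng = np.random.default_rng(1)
members = rng.normal(size=(5, 20))       # 5 member points, 20 parameters
y_members = rng.normal(size=5)
non_members = rng.normal(size=(5, 20))   # fresh points from the same distribution
y_non = rng.normal(size=5)

# Over-parameterized fit: residuals on members are (numerically) zero.
W, *_ = np.linalg.lstsq(members, y_members, rcond=None)

def grad_norm(x, y, W):
    return np.linalg.norm((x @ W - y) * x)   # grad of 0.5 * (x @ W - y)^2

member_norms = [grad_norm(x, y, W) for x, y in zip(members, y_members)]
non_member_norms = [grad_norm(x, y, W) for x, y in zip(non_members, y_non)]
print(max(member_norms) < min(non_member_norms))  # the two distributions separate
```

A simple threshold on the gradient norm already distinguishes the two groups in this toy setting.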

Page 9:

Different Learning/Attack Settings

• Fully trained
  • Black-box / white-box
• Fine-tuning
• Federated learning
  • Central / local attacker
  • Passive / active

Page 10:

Federated Model

Collaborator 1 … Collaborator n: each trains a local model on its own local data; a central model aggregates the local updates.
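One round of this setup can be sketched as follows, assuming a simple FedAvg-style aggregation and a toy least-squares loss (the function names and sizes are illustrative, not the paper's protocol):

```python
import numpy as np

# Sketch of one federated round: every collaborator trains its local model,
# and the central model averages the resulting parameters.
def local_training(W, data, alpha=0.05):
    for x, y in data:                          # one pass of local SGD
        W = W - alpha * (x @ W - y) * x        # grad of 0.5 * (x @ W - y)^2
    return W

def federated_round(W_central, collaborators):
    local_models = [local_training(W_central.copy(), d) for d in collaborators]
    return np.mean(local_models, axis=0)       # central model: average of locals

rng = np.random.default_rng(2)
collaborators = [[(rng.normal(size=3), rng.normal()) for _ in range(10)]
                 for _ in range(4)]            # Collaborator 1 ... 4
W = np.zeros(3)
for _ in range(20):                            # each round exposes fresh updates
    W = federated_round(W, collaborators)
```

Every round, both the server and each collaborator observe new model parameters, which is what gives the attacker its repeated observations.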

Page 11:

Federated Learning

[Figure: the model fitted to the training data at epoch 1, epoch 2, …, epoch n]

Multiple observations: every point leaves traces on the target function.

Page 12:

Active Attack on Federated Learning

[Figure: a target member and a target non-member point against the fitted model]

The active attacker changes the parameters in the direction of the gradient.

Page 13:

Active Attack on Federated Learning

For the data points that are in the training dataset, local training will compensate for the active attacker.

[Figure: after local training, the model fits the target member point again]
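This compensation effect can be sketched on a toy model: an over-parameterized least-squares setup (20 parameters, 10 training points) in which each round the attacker applies gradient ascent on two target points before honest local training runs. All names, sizes, and step sizes are illustrative assumptions, not the paper's configuration:

```python
import numpy as np

# Active attack sketch: ascend on the targets each round; honest local
# training then compensates only for the target that is actually a member,
# so the member's gradient norm ends up small.
def grad(W, x, y):
    return (x @ W - y) * x                        # grad of 0.5 * (x @ W - y)^2

rng = np.random.default_rng(3)
train = [(rng.normal(size=20), rng.normal()) for _ in range(10)]
target_member = train[0]                          # in the training set
target_non = (rng.normal(size=20), rng.normal())  # not in the training set

W = np.zeros(20)
for _ in range(30):                               # federated rounds
    for t in (target_member, target_non):
        W = W + 0.01 * grad(W, *t)                # attacker: gradient ascent
    for _ in range(3):                            # honest local training passes
        for x, y in train:
            W = W - 0.01 * grad(W, x, y)

member_norm = np.linalg.norm(grad(W, *target_member))
non_member_norm = np.linalg.norm(grad(W, *target_non))
print(member_norm < non_member_norm)  # training compensates only for the member
```

The gap between the two norms is the signal the active attacker reads off, matching the gradient-norm curves on the next slide.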

Page 14:

Active Attacks in Federated Model

[Figure: gradient norm vs. training epoch (1–80) for target members, target non-members, member instances, and non-member instances]

Page 15:

Scenario 1: Fully Trained Model

Training: Dataset → Trained Model (the training dataset is not observable).

The attacker queries the trained model with an input (e.g. a cat/dog image) and observes:
• The output vector
• The outputs of all layers
• The loss
• The gradients of all layers

Page 16:

Scenario 2: Central Attacker in Federated Model

Collaborator 1 … Collaborator n: each trains a local model on its own local data; the central model aggregates their updates. The attacker sits at the central server.

Page 17:

Scenario 2: Central Attacker in Federated Model

The collaborators' local data is not observable. In addition to the local attacker's observations, the central attacker can:
• Target individual collaborators through their local updates
• Isolate any collaborator behind an isolated model

Page 18:

Scenario 3: Local Attacker in Federated Learning

Collaborator 1 … Collaborator n each train a local model on their local data (not observable to the attacker); the central model aggregates the updates. The attacker is one of the collaborators and can be active.

For each epoch 1 … N, the attacker observes:
• The outputs of all layers
• The loss
• The gradients of all layers

Page 19:

Score Function

The attacker gets a different observation at every epoch: for each epoch 1 … N, the outputs of all layers, the loss, and the gradients of all layers.

Input → Score → Member? / Non-member
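A score function over the per-epoch observations can be sketched very simply: here the score is just the negated mean gradient norm across the attacker's snapshots, with a threshold turning it into a decision. The paper's actual attack learns this function with a neural network; this hand-rolled version, including the threshold value, is purely illustrative:

```python
import numpy as np

# Toy score function: members have consistently small gradient norms across
# epochs, so a lower mean norm means a higher membership score.
def score(grad_norms_per_epoch):
    return -np.mean(grad_norms_per_epoch)

def is_member(grad_norms_per_epoch, threshold=-0.5):
    return bool(score(grad_norms_per_epoch) > threshold)

print(is_member([0.05, 0.03, 0.02]))  # small gradients every epoch -> True
print(is_member([1.2, 0.9, 1.1]))     # large gradients every epoch -> False
```

Aggregating across epochs is what makes the federated attacker stronger than one with a single snapshot.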

Page 20:

Experimental Setup

• Unlike previous works, we used publicly available pretrained models
• We used all common regularization techniques
• We implemented our attacks in PyTorch
• We used the following datasets:
  • CIFAR100
  • Purchase100
  • Texas100

Page 21:

Results

Page 22:

Pretrained Models Attacks

The last layer contains the most information.

Gradients leak significant information.

Page 23:

Federated Attacks

The global attacker is more powerful than the local attacker.

An active attacker can force SGD to leak more information.

Page 24:

Conclusions

• We go beyond the black-box scenario and try to understand why a deep learning model leaks information
• Gradients leak information about the training dataset
• An attacker in federated learning can take advantage of multiple observations to leak more information
• In the federated setting, an attacker can actively force SGD to leak information

Questions?

Page 25:

Overall Attack Model

For every observed epoch (epoch 1 … epoch n), the attacker extracts, from each layer 1 … n, the layer's output and the layer's gradient, together with the loss and the label. Each feature type feeds its own component:

• Layer outputs → Output Component
• Layer gradients → Gradient Component
• Loss → Loss Component
• Label → Label Component

The component embeddings are combined to produce the decision: Member? / Non-member.
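The architecture above can be sketched with untrained random weights; a NumPy stand-in for the component-based design (the paper trains this as a neural network in PyTorch, and every dimension and name below is an illustrative assumption):

```python
import numpy as np

# One small "component" transform per observed feature (layer outputs, layer
# gradients, loss, label) per epoch; embeddings are concatenated and a linear
# head scores member vs. non-member.
rng = np.random.default_rng(4)

def component(in_dim, out_dim=8):
    W = rng.normal(size=(in_dim, out_dim)) / np.sqrt(in_dim)
    return lambda x: np.maximum(x @ W, 0.0)      # linear + ReLU embedding

n_epochs, output_dim, grad_dim = 3, 10, 50
out_comp, grad_comp = component(output_dim), component(grad_dim)
loss_comp, label_comp = component(1), component(1)
fused_dim = n_epochs * (8 + 8) + 8 + 8           # per-epoch out+grad, loss+label
W_head = rng.normal(size=(fused_dim, 1))

def attack_score(outputs, grads, loss, label):
    # outputs, grads: lists with one (batch, dim) array per observed epoch
    feats = [out_comp(o) for o in outputs] + [grad_comp(g) for g in grads]
    feats += [loss_comp(loss), label_comp(label)]
    return np.concatenate(feats, axis=-1) @ W_head   # membership logit

outs = [rng.normal(size=(4, output_dim)) for _ in range(n_epochs)]
grads = [rng.normal(size=(4, grad_dim)) for _ in range(n_epochs)]
logit = attack_score(outs, grads, rng.normal(size=(4, 1)), rng.normal(size=(4, 1)))
print(logit.shape)  # (4, 1)
```

Keeping one component per feature type lets the attack reuse the same sub-network across epochs while still exploiting every observation.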

Page 26:

Scenario 4: Fine-Tuning Model

Training: Dataset → General Model, then Specialized Dataset → Fine-Tuned Model (both datasets are not observable).

The attacker observes, for both the general and the fine-tuned model: the outputs of all layers, the loss, and the gradients of all layers.

A queried record may belong to the general dataset, the specialized dataset, or neither.

Page 27:

Fine-tuning Attacks

Both the specialized and the general datasets are vulnerable to membership attacks.

[Table: for each dataset and architecture, attack accuracy at distinguishing specialized vs. general records, general vs. non-member records, and specialized vs. non-member records]

Page 28:

Federated Attacks

Page 29:

Fine-Tuning Model Leakage

Training: Dataset → General Model, then Specialized Dataset → Fine-Tuned Model (both datasets are not observable). The attacker observes, for both models, the outputs of all layers, the loss, and the gradients of all layers; a queried record may belong to the general dataset, the specialized dataset, or neither.