
Page 1: THE DARK SIDE OF NEURAL NETWORKS

Secure Architectures & Systems Laboratory, CEA Tech, Centre CMP, Gardanne

Joint Team CEA Tech - Mines Saint-Etienne

THE DARK SIDE OF NEURAL NETWORKS:

AN ADVOCACY FOR SECURITY IN MACHINE LEARNING

C&ESAR 2018, 19/11/2018

PIERRE-ALAIN MOËLLIC

Page 2: THE DARK SIDE OF NEURAL NETWORKS


• CONTEXT

• ATTACK DEEP NEURAL NETWORKS

• A FOCUS ON ADVERSARIAL EXAMPLES

• PHYSICAL ATTACKS

• PROTECTIONS & EVALUATION

• CONCLUSION


OUTLINE

Page 3: THE DARK SIDE OF NEURAL NETWORKS


CONTEXT

Page 4: THE DARK SIDE OF NEURAL NETWORKS


• Machine Learning resurgence with the advent of Deep Learning (2010-2011)

• Performance era

• BETTER

• FASTER

• LIGHTER

CEA Tech | MOELLIC PIERRE-ALAIN

New trend: EMBEDDED MACHINE LEARNING SYSTEMS

CONTEXT: “A.I. EVERYWHERE”

Page 5: THE DARK SIDE OF NEURAL NETWORKS


ATTACK DEEP NEURAL NETWORKS

Page 6: THE DARK SIDE OF NEURAL NETWORKS


ADVERSARY'S KNOWLEDGE / CAPACITY

Attack at learning or inference time?

What knowledge of the model? The white-box / black-box paradigm

Probing / querying the model

• Striking the Holy Trinity: CONFIDENTIALITY / INTEGRITY / AVAILABILITY


FOOL A MODEL

The output prediction is not the expected (correctly learned) one

Fool a model under the radar, i.e. in an (almost) imperceptible way

Critical case: autonomous vehicle: a « Stop » sign recognized as a « 130 km/h » sign.

Malware detection

THREAT MODELING

EXTRACT INFORMATION

Training data (medical, financial, biometric, classified…)

Model (IP, limited authorization)

MAKE THE SYSTEM USELESS

Attack the environment (e.g. classical DoS)

Strongly alter the performance of the model

Page 7: THE DARK SIDE OF NEURAL NETWORKS


ATTACK MACHINE LEARNING PIPELINE

• Attack the Machine Learning pipeline (supervised case)

[Diagram: learning: the learned model $f_\theta = \arg\min_\theta \mathcal{L}(f_\theta(X), Y)$ is fitted on $(X, Y)_{train}$; inference: $f_\theta(x_{test}) = y_{test}$, e.g. « Stop »]

Illustration from [Goodfellow, Defense against the dark arts: An overview of adversarial example security research and future research directions, 2018]
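To make the diagram concrete, here is a minimal sketch of the supervised pipeline, assuming a scikit-learn classifier; the dataset and model choice are illustrative, not from the talk:

```python
# Minimal supervised pipeline: fit f_theta by minimizing the loss on
# (X, Y)_train, then run inference on a test point.
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split

X, Y = load_digits(return_X_y=True)
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, random_state=0)

f_theta = LogisticRegression(max_iter=5000).fit(X_train, Y_train)  # learning
y_test_pred = f_theta.predict(X_test[:1])                          # inference
print(y_test_pred, Y_test[:1])
```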

Page 8: THE DARK SIDE OF NEURAL NETWORKS


ATTACK MACHINE LEARNING PIPELINE

• Data poisoning

[Diagram: TRAINING SET POISONING: the attacker corrupts $(X, Y)_{train}$ into $(X, Y')_{train}$, so the learned model becomes $f'_\theta = \arg\min_\theta \mathcal{L}(f_\theta(X), Y')$ and, at inference, $x_{test}$ is no longer recognized as « Stop »]

Page 9: THE DARK SIDE OF NEURAL NETWORKS


• Data poisoning: targets Integrity or Availability

• A well-known issue for Support Vector Machines [Biggio et al., 2012]

• [L. Munoz, B. Biggio, Towards Poisoning of Deep Learning Algorithms with Back-gradient Optimization, 2017]*

• For spam and malware (ransomware) detection: up to a 30% error rate when controlling 15% of the training data (perfect-knowledge scenario)

• Demonstrates the transferability issue

• [C. Yang, Generative Poisoning Attack Method Against Neural Networks, 2017]**

• Uses generative models (GAN/AAE) to improve the generation of poisoned data (compared to direct gradient-based methods)


ATTACK MACHINE LEARNING PIPELINE

* (Imp. Coll. London, Univ. Cagliari)
** (Univ. Pittsburgh, Air Force Research Lab)
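A deliberately naive illustration of the threat model, assuming scikit-learn: flip the labels of 15% of the training points and watch test accuracy drop. Real attacks such as back-gradient optimization craft the poisoned points rather than flipping labels at random:

```python
# Label-flipping poisoning sketch: corrupt 15% of the training labels and
# compare test accuracy of the clean vs. poisoned model.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

rng = np.random.default_rng(0)
poisoned = y_tr.copy()
idx = rng.choice(len(y_tr), size=int(0.15 * len(y_tr)), replace=False)
poisoned[idx] = 1 - poisoned[idx]           # flip 15% of the (binary) labels

clean = LogisticRegression(max_iter=1000).fit(X_tr, y_tr).score(X_te, y_te)
dirty = LogisticRegression(max_iter=1000).fit(X_tr, poisoned).score(X_te, y_te)
print(f"clean: {clean:.3f}  poisoned: {dirty:.3f}")
```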

Page 10: THE DARK SIDE OF NEURAL NETWORKS


ATTACK MACHINE LEARNING PIPELINE

• Data leakage

[Diagram: RECOVERY OF SENSITIVE TRAINING DATA: the training set $(X, Y)_{train}$ may be private / confidential, yet it leaks through the learned model $f_\theta = \arg\min_\theta \mathcal{L}(f_\theta(X), Y)$ at inference time]

Page 11: THE DARK SIDE OF NEURAL NETWORKS


• Data leakage: targets Confidentiality / Privacy

• Critical case: Membership Inference: is $x_{test}$ from $X_{train}$?

• [Shokri et al., Membership inference attacks against machine learning models, 2017] (Cornell Tech)

• Black-box case: use shadow models to mimic the target model

• (Training data) leakage through the prediction outputs

• High attack accuracy (>75%)

• See also [Rahman et al., Membership Inference Attack against differentially private deep learning model, 2018] (Univ. Ottawa)


ATTACK MACHINE LEARNING PIPELINE

Illustration from [Shokri]: targets include Amazon ML, the Google Prediction API, and a CNN
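A much-simplified membership test, not Shokri's shadow-model attack: because overfitted models tend to be more confident on their training points, merely thresholding the prediction confidence already leaks membership (the threshold below is an illustrative assumption):

```python
# Confidence-threshold membership guess: predict "was in X_train" wherever
# the model's top softmax score exceeds a threshold.
import numpy as np

def membership_guess(probs, threshold=0.9):
    """probs: (n, n_classes) softmax outputs f_theta(x)."""
    return probs.max(axis=1) > threshold

probs = np.array([[0.98, 0.01, 0.01],   # very confident: likely a training point
                  [0.40, 0.35, 0.25]])  # uncertain: likely unseen
print(membership_guess(probs))          # [ True False]
```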

Page 12: THE DARK SIDE OF NEURAL NETWORKS


ATTACK MACHINE LEARNING PIPELINE

• Model leakage

[Diagram: MODEL THEFT: the learned model $f_\theta = \arg\min_\theta \mathcal{L}(f_\theta(X), Y)$ is itself the asset (IP?) and can be extracted at inference time]

Page 13: THE DARK SIDE OF NEURAL NETWORKS


• Model leakage

• Model inversion: find (average) information about the inputs from the outputs of the model

• Face recognition application [Fredrikson et al., Model inversion attacks that exploit confidence information and basic countermeasures, 2015]*

• Model extraction: extract the parameters of a model by querying it


ATTACK MACHINE LEARNING PIPELINE

* (Carnegie Mellon Univ.) ** (EPFL / Stanford Univ., Cornell Univ.)

• Online services BigML and Amazon ML [Tramer et al., Stealing Machine Learning Models via prediction APIs, 2016]**

• Mainly LR-based models and decision-tree models

• Works on a simple MLP with ≈55,000 queries
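A sketch of the query-based extraction idea under strong assumptions: the adversary can query the victim's prediction API freely and only wants a functional copy, not the exact weights. victim_predict, the probe distribution, and the query budget are all hypothetical:

```python
# Model-extraction sketch: label random probe inputs with the victim's API,
# then fit a local surrogate on the (query, answer) pairs.
import numpy as np
from sklearn.neural_network import MLPClassifier

def extract_surrogate(victim_predict, n_queries=5000, dim=20, seed=0):
    rng = np.random.default_rng(seed)
    X_q = rng.normal(size=(n_queries, dim))   # random probe inputs
    y_q = victim_predict(X_q)                 # labels returned by the API
    surrogate = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500)
    return surrogate.fit(X_q, y_q)            # local functional copy
```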

Page 14: THE DARK SIDE OF NEURAL NETWORKS


ATTACK MACHINE LEARNING PIPELINE

• Adversarial examples

[Diagram: ADVERSARIAL EXAMPLES at inference time: a perturbed $x_{test}$ is predicted by the learned model $f_\theta = \arg\min_\theta \mathcal{L}(f_\theta(X), Y)$ as « 130 km/h » instead of « Stop »]

Page 15: THE DARK SIDE OF NEURAL NETWORKS


ADVERSARIAL EXAMPLES

CRITICAL CASE: ADVERSARIAL EXAMPLES

[Illustration: source: NIPS 2018 Challenge]

Page 16: THE DARK SIDE OF NEURAL NETWORKS


• Intriguing properties of Adversarial Examples


ADVERSARIAL EXAMPLES

Adversarial examples are not:

• noise…

• explained by overfitting

• only concentrated in some “pockets” related to the complex (non-linear) high-dimensional geometry

On the contrary, they are:

• well distributed in $X$, but very precisely: statistically, they are unlikely to occur

• explained by the local linearity of the decision boundary

Simple, fast gradient-based crafting methods (see the sketch below)

[Szegedy et al., Intriguing properties of neural networks, 2014] (Google, NY Univ.)
[Goodfellow et al., Explaining and harnessing adversarial examples, 2015] (Google)

[Diagram: a perturbation of size $\varepsilon$ moves an image out of the model's “cat” region into the model's “ostrich” region, while staying inside the “true” cat space]
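A minimal sketch of one such gradient-based method, FGSM from [Goodfellow et al., 2015], assuming a PyTorch classifier; the toy model and epsilon below are illustrative:

```python
# Fast Gradient Sign Method: x' = x + epsilon * sign(grad_x L(f(x), y)).
import torch
import torch.nn as nn

def fgsm(model, x, y, epsilon=0.1):
    """Craft adversarial examples by one signed-gradient step on the loss."""
    x = x.clone().detach().requires_grad_(True)
    loss = nn.functional.cross_entropy(model(x), y)
    loss.backward()
    # Step in the direction that increases the loss, keep pixels in [0, 1].
    x_adv = x + epsilon * x.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()

# Toy usage: a linear "model" on flattened 28x28 inputs (MNIST-like).
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
x = torch.rand(1, 1, 28, 28)    # one random image in [0, 1]
y = torch.tensor([3])           # its (arbitrary) true label
x_adv = fgsm(model, x, y)
print((x_adv - x).abs().max())  # perturbation bounded by epsilon
```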

Page 17: THE DARK SIDE OF NEURAL NETWORKS


• A (very) critical property: transferability

• Adversarial examples crafted for a model $f_{\theta_A}$ will likely fool another model $f_{\theta_B}$

• Transferability is also observed from one ML algorithm to another (cross-technique transferability)

• Transferability enables BLACK-BOX attacks


[N. Papernot et al., Transferability in machine learning: from phenomena to black-box attacks using adversarial samples, 2016] (Google Brain, Penn State Univ.) (MNIST results)

64.32% of adversarial examples crafted from a DNN also fool an SVM model

ADVERSARIAL EXAMPLES
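A sketch of how transferability is typically measured, reusing the fgsm() helper sketched earlier; both models are assumed to be PyTorch classifiers on the same task:

```python
# Transferability check: craft adversarial examples white-box on model_a,
# then count how often they also fool model_b.
import torch

def transfer_rate(model_a, model_b, x_batch, y_batch, epsilon=0.1):
    x_adv = fgsm(model_a, x_batch, y_batch, epsilon)   # white-box on A
    preds_b = model_b(x_adv).argmax(dim=1)             # black-box view of B
    return (preds_b != y_batch).float().mean().item()  # fraction fooled on B
```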

Page 18: THE DARK SIDE OF NEURAL NETWORKS


• Several flavors of Adversarial Examples

[Eykholt et al. Robust physical-world attacks on deep learning visual classification, 2018] (Univ. Michigan, Univ. Washington)

ADVERSARIAL EXAMPLES

[Sharif et al., Accessorize to a crime, 2016] (Carnegie Mellon)

Adversarial examples do not really need to be « imperceptible »

Page 19: THE DARK SIDE OF NEURAL NETWORKS


EXPAND THE ATTACK SURFACE: PHYSICAL ATTACKS

Page 20: THE DARK SIDE OF NEURAL NETWORKS


• Attack surface: embedded ML system

[Diagram: the attack surface of an EMBEDDED ML SYSTEM spans two worlds. Algorithmic world: adversarial examples, data poisoning, model theft, membership inferences (a large state of the art, but imbalanced: attacks >> defenses). Physical world: fault injection attacks, side-channel analysis (forthcoming threats: a handful of papers in 2018…)]

PHYSICAL ATTACKS

PhD: Rémi BERNHARD, CEA + MSE (LSAS, CMP, Gardanne), 2018-2021

Page 21: THE DARK SIDE OF NEURAL NETWORKS


• Side-Channel Analysis

[Batina et al., CSI neural network: Using side-channels to recover your artificial neural network information, 2018] (Radboud Univ.)

PHYSICAL ATTACKS

[Photo: SCA set-up: 8-bit microcontroller, EM probe]

• Simple visual profiling of 4 activation functions, and of the multiplication / activation-function processing

• [Wei et al. I Know What You See: Power Side-Channel Attack on Convolutional Neural Network Accelerators, 2018] (Chinese Univ. HK)

• CNN accelerator in a Xilinx Spartan-6 FPGA

• Attack principle close to Model Inversion

• Adversary with partial knowledge of the model (CNN)

• Simple binary-like images: MNIST database

[Illustrations from [Wei] and [Batina]: original image vs. pixel recovery with power templates]

Page 22: THE DARK SIDE OF NEURAL NETWORKS


• Fault Injection

[J. Breier et al. Deep Laser: Practical fault attack on deep neural networks, 2018] (Nanyang Tech. Univ.)

PHYSICAL ATTACKS

• 8-bit microcontroller

• Only on activation functions

• Fault model: Skip-instruction

• Simulated misclassifications on MNIST with randomly injected faults

• “This area is still in the beginning phase of research”: paves the way for further experiments

• Open question: are the well-known countermeasures (masking, hiding, redundancy, control flow…) relevant for ML systems under physical attack?

Illustration: laser beam, CMP, LSAS team, Gardanne
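A software-level sketch of the skip-instruction fault model on activations: "skipping" the activation instruction amounts to passing the pre-activation through unchanged. The tiny MLP below is illustrative, not the attacked network:

```python
# Simulate a skip-instruction fault on a layer's activation function.
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def forward(x, weights, faulty_layer=None):
    """Tiny MLP forward pass; if faulty_layer is set, that layer's
    activation instruction is skipped (identity instead of ReLU)."""
    for i, w in enumerate(weights[:-1]):
        x = x @ w
        if i != faulty_layer:   # the fault bypasses this activation
            x = relu(x)
    return x @ weights[-1]      # final layer: raw logits

rng = np.random.default_rng(0)
weights = [rng.normal(size=(16, 8)), rng.normal(size=(8, 4))]
x = rng.normal(size=(1, 16))
# The injected fault can change the predicted class.
print(forward(x, weights).argmax(), forward(x, weights, faulty_layer=0).argmax())
```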

Page 23: THE DARK SIDE OF NEURAL NETWORKS


PROTECTIONS & EVALUATION

Page 24: THE DARK SIDE OF NEURAL NETWORKS


• Imbalance between attack-based / protection-based efforts

• Need to better understand the inherent mechanisms behind attacks

• Need to get out of the (vain) arms race

• Attack_1a → Protection_1a → Attack_1b → Protection_1b → …

• Lack of efficient, robust, guaranteed defense strategies

• A growing topic; for example, J. Steinhardt's work (Stanford)

• [Steinhardt et al., Certified Defenses for Data Poisoning Attacks, 2017]

• [Steinhardt et al., Certified Defenses for Adversarial Examples, 2018]

PROTECTIONS & EVALUATION

Page 25: THE DARK SIDE OF NEURAL NETWORKS


• Protections for confidentiality & privacy

• Basic strategies against leakage (sketched below):

• output minimal information from the model, or reduce the precision of the confidence scores $f_\theta(X)$

• ensemble approach: use several models for the final prediction

• Not very effective against an advanced adversary

• Hot topic (1): Privacy-preserving machine learning

• Differential Privacy

• Perturbation approaches: make the output useless for an adversary

• Lack of robustness (for now)

• Hot topic (2): Homomorphic encryption

• Prohibitive processing-time overhead (for now)

PROTECTIONS & EVALUATION
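A sketch of the basic output-hardening strategies listed above, assuming softmax scores held as NumPy arrays; the rounding precision is an illustrative choice:

```python
# Leak less through the prediction outputs: coarsen the confidence scores,
# or return only the top label.
import numpy as np

def harden_output(probs, decimals=1, label_only=False):
    """probs: (n, n_classes) softmax outputs of f_theta(X)."""
    if label_only:
        return probs.argmax(axis=1)              # class index only, no scores
    return np.round(probs, decimals=decimals)    # coarse scores leak less
```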

Page 26: THE DARK SIDE OF NEURAL NETWORKS


• Defenses against integrity-based attacks

• Poisoning attacks

• Very few effective defenses

• Outlier removal / filtering…

• Attacks aim at moving the decision boundary and increasing the loss

• Monitoring loss anomalies

• Adversarial attacks

• Denoising approaches (signal preprocessing methods): alter the inputs with filtering or quantization (see the sketch at the end of this slide)

• Gradient masking: make the gradient information useless (or nonexistent)

• [Athalye et al., Obfuscated gradients give a false sense of security: Circumventing defenses to adversarial examples, 2018]

PROTECTIONS & EVALUATION
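A denoising-style preprocessing sketch: reducing the bit depth of the inputs before classification partly snaps small adversarial perturbations away. The bit depth is an illustrative assumption, and [Athalye et al.] show such defenses can often be circumvented:

```python
# Input quantization as a preprocessing defense.
import numpy as np

def quantize_input(x, bits=4):
    """x in [0, 1]; snap every value to one of 2**bits uniform levels."""
    levels = 2 ** bits - 1
    return np.round(x * levels) / levels
```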

Page 27: THE DARK SIDE OF NEURAL NETWORKS


• Defenses against integrity-based attacks

• Adversarial attacks

• Adversarial training (sketched at the end of this slide):

• Basic idea: data augmentation with adversarial examples

• STATE-OF-THE-ART PERFORMANCE: [Tramer et al., Ensemble adversarial training: Attacks and Defenses, 2018]

• adversarial training « with adversarial examples crafted on other static pre-trained models »

• Detection-based defenses: [Carlini et al., Adversarial Examples are not easily detected: Bypassing ten detection methods, 2017]

• Not effective against an advanced adversary in the white-box paradigm

• Raises the question of how adversarial-example defenses should be evaluated

PROTECTIONS & EVALUATION
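A sketch of the basic adversarial-training loop, reusing the fgsm() helper from the adversarial-examples slide; the ensemble variant of [Tramer et al.] would instead craft the adversarial batch on other static pre-trained models:

```python
# One adversarial-training step: augment the batch with adversarial versions
# of its own examples, crafted on the current model state.
import torch
import torch.nn.functional as F

def adversarial_training_step(model, optimizer, x, y, epsilon=0.1):
    x_adv = fgsm(model, x, y, epsilon)   # white-box crafting on current model
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x), y) + F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```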

Page 28: THE DARK SIDE OF NEURAL NETWORKS


• Evaluations: significant effort for adversarial examples

• NIPS 2017 & 2018 Adversarial Vision Challenge

• 2 Attack tracks (targeted / non-targeted)

• 1 Adversarial Defense track

• NB: 2018 edition: 25/06/2018 to 15/11/2018

• CleverHans (TensorFlow)

• “An adversarial example library for constructing attacks, building defenses, and benchmarking both”

• https://github.com/tensorflow/cleverhans (v.2.1.0 since 18/06/2018)

PROTECTIONS & EVALUATION

Page 29: THE DARK SIDE OF NEURAL NETWORKS


CONCLUSION

Page 30: THE DARK SIDE OF NEURAL NETWORKS


• ML systems: CRITICAL task / infrastructure / data

• A godsend for attackers

• Security is one of the two major obstacles for the deployment of ML models and systems

• Together with the interpretability issue

• See the PIA A.I. main topics (Conseil de l'Innovation, 18/07/2018, https://www.economie.gouv.fr/grands-defis)

• Security is not « an option » (but, obviously, a costly one)

• R&D efforts: Virtuous circle Attack / Protection / Evaluation

• A strong imperative: ML community + Security community

• Gathering theoretical knowledge and expertise

• Strong investments are needed in France/Europe in this field

CONCLUSION

[Diagram: virtuous circle: Attacks → Protections → Evaluation]

Page 31: THE DARK SIDE OF NEURAL NETWORKS

Commissariat à l’énergie atomique et aux énergies alternatives

17 rue des Martyrs | 38054 Grenoble Cedex

www.cea-tech.fr

Établissement public à caractère industriel et commercial | RCS Paris B 775 685 019

CONTACT

Scientific contact point:

Pierre-Alain MOELLIC – [email protected]

Laboratory head:

Romain WACQUEZ – [email protected]

Partnership:

Paul-Vincent BONZOM – [email protected]

SECURE ARCHITECTURES AND SOFTWARES LAB – CEA TECH / LSAS

Centre de Microélectronique de Provence Georges Charpak, Gardanne