
Verifiably Robust Machine Learning For Security

Yizheng Chen, Postdoctoral Researcher

Columbia University

* Joint work with Shiqi Wang, Dongdong She and Suman Jana

Security Classifiers

2

[Figure: a security classifier separating malicious from benign samples]

Security Classifiers

3

Very easy to evade: a 7MB PDF evades Gmail’s PDF malware classifier [1]

[1] Xu et al., NDSS Talk: Automatically Evading Classifiers (including Gmail’s).

[Figure: the evasive PDF is classified as benign]

Robust Accuracy

• Unbounded attackers can always evade the classifier

4

[Figure: among all samples, the accurately classified samples contain a smaller set that is both robust and accurate]

Robust Accuracy

• Unbounded attackers can always evade the classifier

• Instead, defend against reasonably bounded attackers

• Goal: generalize to robustness against unbounded attackers

5

[Figure: among all samples, the accurately classified samples contain a smaller set that is both robust and accurate]

Measure Robust Accuracy

• Formal verification on image classification
  • Given an image, no adversarial examples exist within a small Lp distance.

• Verified Robust Accuracy (VRA)
  • The percentage of test samples that are verified to be correctly classified within an Lp distance bound (see the sketch below).
  • Holds against any Lp norm bounded attacker.

6
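As a sketch of how VRA is typically measured (assuming some sound verification routine `verify_robust`, which is a placeholder here rather than a specific tool):

```python
def verified_robust_accuracy(model, test_set, eps, verify_robust):
    """Fraction of test samples PROVEN to be classified correctly for
    every perturbation x' with ||x' - x||_p <= eps.

    verify_robust(model, x, y, eps) is a hypothetical sound verifier:
    it returns True only when it can prove the property, so VRA is a
    lower bound on the true robust accuracy."""
    verified = sum(1 for x, y in test_set if verify_robust(model, x, y, eps))
    return verified / len(test_set)
```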

Sound Over-approximation

7

• 3 input dimensions, 2 output classes
• f(x) is class 0

Sound Over-approximation

8

• 3 input dimensions, 2 output classes
• f(x) is class 0

• For all x′ with Lp(x, x′) ≤ ε, is f0(x′) always the largest output?

Sound Over-approximation

9

• 3 input dimensions, 2 output classes
• f(x) is class 0

• For all x′ with Lp(x, x′) ≤ ε, is f0(x′) always the largest output?

• Sound transformation T: f(x′) ⊆ Tf(x′)
• Tf over-approximates f
• Robustness analysis is based on Tf(x′)
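To make this concrete, here is a minimal NumPy sketch of one simple sound transformation, interval bound propagation, on a toy 3-input, 2-output ReLU network (the weights are invented for illustration; the verifiers discussed next use tighter relaxations):

```python
import numpy as np

def interval_bounds(W_list, b_list, x, eps):
    """Propagate the L-infinity box [x - eps, x + eps] through ReLU layers.
    Returns sound lower/upper bounds on every output logit."""
    lo, hi = x - eps, x + eps
    for i, (W, b) in enumerate(zip(W_list, b_list)):
        center, radius = (lo + hi) / 2, (hi - lo) / 2
        c = W @ center + b
        r = np.abs(W) @ radius            # worst case over the box
        lo, hi = c - r, c + r
        if i < len(W_list) - 1:           # ReLU on hidden layers only
            lo, hi = np.maximum(lo, 0), np.maximum(hi, 0)
    return lo, hi

# Toy network: 3 inputs -> 4 hidden -> 2 outputs, as in the slide (weights invented).
W1, b1 = np.random.randn(4, 3), np.zeros(4)
W2, b2 = np.random.randn(2, 4), np.zeros(2)
lo, hi = interval_bounds([W1, W2], [b1, b2], np.array([0.1, 0.5, -0.2]), 0.05)
# f(x') is provably class 0 for all x' in the box if lo[0] > hi[1].
print(lo, hi, lo[0] > hi[1])
```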

New Sound Over-approximation Methods

Convex Outer Bound (Wong et al., ICML 2018, NIPS 2018)

SMT Solvers (Katz et al., 2017; Ehlers, 2017)

Mixed-Integer Programming (Bunel et al., 2017; Tjeng et al., 2017; Cheng et al., 2017)

Semidefinite Programming (Raghunathan et al., 2018; Fazlyab et al., 2019)

Convex Relaxation (Weng et al., 2018; Mirman et al., 2018; Gehr et al., 2018; Dvijotham et al., 2018; Wong & Kolter, 2018; Wong et al., 2018)

Abstract Interpretation (Gehr et al., Oakland 2018; Mirman et al., ICML 2018)

Symbolic Linear Relaxation (Wang et al., USENIX Security 2018, NIPS 2018)

10

New Sound Over-approximation Methods

Convex Outer Bound (Wong et al., ICML 2018, NIPS 2018): solve the dual of the linear program

Abstract Interpretation (Gehr et al., Oakland 2018; Mirman et al., ICML 2018): abstract transformers for NN layers

Symbolic Linear Relaxation (Wang et al., USENIX Security 2018, NIPS 2018): propagate relaxed linear bounds

11

Tradeoff: over-approximation error vs computation overhead

Symbolic Linear Relaxation

Symbolic Linear Relaxation (Wang et al., USENIX Security 2018, NIPS 2018)

• Propagate symbolic intervals
• ReLU relaxation (sketched below)

12

https://github.com/tcwangshiqi-columbia/symbolic_interval
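The ReLU relaxation step can be sketched as follows. For a pre-activation bounded in [l, u], the nonlinearity is replaced by linear upper and lower bounds; the slopes below follow one common choice in this line of work and are meant as an illustration, not the exact equations of any single tool:

```python
def relu_relaxation(l, u):
    """Linear relaxation of ReLU(x) for pre-activation bounds l <= x <= u.
    Returns (lower_slope, lower_intercept, upper_slope, upper_intercept)
    such that  lower_slope*x + lower_intercept <= ReLU(x)
               <= upper_slope*x + upper_intercept  on [l, u]."""
    if l >= 0:                 # always active: ReLU(x) = x exactly
        return 1.0, 0.0, 1.0, 0.0
    if u <= 0:                 # always inactive: ReLU(x) = 0 exactly
        return 0.0, 0.0, 0.0, 0.0
    k = u / (u - l)            # unstable unit: slope of the chord
    # Upper bound: chord from (l, 0) to (u, u), i.e. k * (x - l).
    # Lower bound: k * x (one common sound choice; 0 and x also work).
    return k, 0.0, k, -k * l
```

When l ≥ 0 or u ≤ 0 the ReLU is stable and introduces no over-approximation error; the error comes only from unstable units.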

Symbolic Interval Propagation

14

L∞(x, x′) ≤ ε


Symbolic Interval Propagation

17

L∞(x, x′) ≤ ε: overapproximate f0(x′) and f1(x′)

Measure Verified Robust Accuracy (VRA)

• MNIST model, regular training, 98.28% test accuracy
  • L∞ ≤ 0.01: 80% VRA
  • L∞ ≤ 0.3: 0% VRA

• Verifiably Robust Training Increases VRA

18

[Figure: VRA under regular training vs. verifiably robust training]

What is Verifiably Robust Training?

19

Regular Training: min(errors)

Robust Training

What is Verifiably Robust Training?

20

Regular Training: min(errors)

Robust Training: min( max( errors by successful evasions ) )


What is Verifiably Robust Training?

22

Regular Training: min(errors)

Robust Training: min( max( errors by successful evasions ) )

Adversarial


What is Verifiably Robust Training?

Sound Over-approximation

24

Regular Training: min(errors)

Robust Training: min( max( errors by successful evasions ) )

Adversarial vs. Verifiable


What is Verifiably Robust Training?

Sound Over-approximation

Adversarial training underestimates the robust loss; verifiable training (via sound over-approximation) overestimates it (see the sketch below).

26
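As a rough sketch (PyTorch-style; `sound_output_bounds` is a stand-in for any of the over-approximation methods above, not a real API), the two objectives differ only in which outputs the loss is computed on:

```python
import torch
import torch.nn.functional as F

def regular_loss(model, x, y):
    # min(errors): ordinary cross-entropy on clean inputs
    return F.cross_entropy(model(x), y)

def verifiable_robust_loss(model, x, y, eps, sound_output_bounds):
    # min( max( errors by successful evasions ) ), with the inner max
    # OVER-estimated by sound per-class output bounds that hold for
    # every x' with ||x' - x||_inf <= eps.
    lower, upper = sound_output_bounds(model, x, eps)   # shape [batch, classes]
    # Worst-case logits: the true class takes its lower bound,
    # every other class takes its upper bound.
    worst = upper.clone()
    worst.scatter_(1, y.unsqueeze(1), lower.gather(1, y.unsqueeze(1)))
    return F.cross_entropy(worst, y)
```

Because the bounds are sound, this loss overestimates the true robust loss, so driving it down certifies robustness; adversarial training instead plugs in attack-found points and underestimates it.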

Verifiably Robust Training

27

[Figure: training data, perturbed within L∞ ≤ ε, passes through the sound over-approximation to compute the robust loss]

Verifiably Robust Training

28


Training only for robustness decreases accuracy

Verifiably Robust Training

29

[Figure: the training objective now combines the regular loss on clean data with the robust loss from the sound over-approximation under L∞ ≤ ε]

Ongoing research:
• Obtain high accuracy and high VRA at the same time
• Scale to larger datasets and networks

Verifiably Robust Training

30

Sound over-approximation, L∞ ≤ ε

If we sample enough training data for robustness, VRA generalizes to test data.

[Figure: VRA and accuracy evaluated on test data]

MixTrain Results

Our paper “MixTrain”: https://arxiv.org/abs/1811.02625
• Standard networks used in verifiably robust training research
• Acceptable accuracy drop compared to regular training

31

Dataset   Network   Accuracy   VRA    L∞ Bound
MNIST     Small     99%        97%    0.1
MNIST     Large     96%        58%    0.3
CIFAR10   Resnet    71%        50%    2/255

CIFAR10 VRA increased from 0

Robustness Properties

Why is it useful for Security?

• Robustness Property for a Security Classifier
  • Reasonably bounded attackers cannot evade the classifier

• PDF malware classifier cannot be evaded by
  • Inserting benign pages
  • Deleting non-functional objects

• Twitter spam detector cannot be evaded by
  • URLs with no redirection chain

32

Challenges

• How to train a single model to be robust against different attackers?

• How to maintain a low false positive rate?

• Does verifiable robustness generalize to unrestricted attackers?

33

Robust Against Different Attackers

• Obtain VRA for multiple robustness properties and regular accuracy
  • The underlying optimization problem is harder

• Mixed Training
  • Combined and dynamic training objective
  • Mix the batches (see the sketch below)

34

[Figure: accuracy and VRA for robustness properties 1, 2, and 3]

Our paper “MixTrain” https://arxiv.org/abs/1811.02625
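A rough sketch of the mixed-batch idea, reusing the hypothetical `regular_loss` and `verifiable_robust_loss` from the earlier sketch (this illustrates the flavor of mixed training, not the exact MixTrain sampling or dynamic-weighting schedule; see the paper for those):

```python
import random
import torch

def mixed_training_step(model, optimizer, batch, properties, k=0.4, alpha=0.8):
    """One step that mixes the regular objective with one verifiably robust
    objective chosen from `properties`, a list of (eps, sound_output_bounds)
    pairs -- one entry per robustness property. k (sub-batch fraction) and
    alpha (loss weight) are illustrative knobs."""
    x, y = batch
    loss = (1 - alpha) * regular_loss(model, x, y)

    # Train one property per step on a random sub-batch, so every property
    # gets optimized over time without paying the full verification cost
    # on every example in every step.
    eps, bounds_fn = random.choice(properties)
    idx = torch.randperm(len(x))[: max(1, int(k * len(x)))]
    loss = loss + alpha * verifiable_robust_loss(model, x[idx], y[idx], eps, bounds_fn)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```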

New Distance Metric

• To bound attackers that reasonably mimic real attackers
  • Does not affect the false positive rate

• Adversarial malware examples
  • x → x′, s.t. f(x′) = benign and O(x′) is malicious (O is an oracle), imperceptible by machine

35

New Distance Metric

• To bound attackers that reasonably mimic real attackers
  • Does not affect the false positive rate

• Adversarial malware examples
  • x → x′, s.t. f(x′) = benign and O(x′) is malicious, imperceptible by machine
  • Mimicry operations
    • Spam: good-word injection
    • PDF malware: merge benign and malicious feature vectors
    • Phishing: name squatting
  • High FPR (67% ~ 85%)

36

[Figure: adversarial malware examples lie between the malicious and benign regions; where should the decision boundary be?]

New Distance Metric

• To bound attackers that reasonably mimic real attackers
  • Does not affect the false positive rate

• Lp norm is not suitable
  • Image: no single pixel influences the final class too much.
  • Security classifiers: some features are more malicious or benign.
  • Therefore, the attacker’s operations are not evenly distributed across features.

37

Searching for Evasive PDF Malware

• Attacks can be decomposed into building block operations
  • Feature insertion-only attacks (Grosse et al., Hu et al.)
  • Mimicry, merging with benign features (Šrndić et al.)
  • Mutation operations (insert, replace, delete) (Xu et al., Dang et al.)

• Optimization
  • Greedy (gradient descent)
  • Genetic evolution
  • Hill climbing

38


How to choose the distance metric?

• Distance captures the building block attack operations

• A larger distance means more building block attack operations

39

Subtree Distance

• A PDF malware variant needs correct syntax and correct semantics
• The PDF file is parsed into a tree structure

40

[Figure: a malicious PDF parsed into a tree rooted at /Root, with subtrees such as /Type (/Catalog), /Pages (/Kids, /Count, /Type, /Page), and /OpenAction (/JS holding the JavaScript exploit, /S /Javascript, /Filter /FlateDecode, /Length 1573)]

Subtree Distance

• A PDF malware variant needs correct syntax and correct semantics
• The PDF file is parsed into a tree structure
• Malware variants differ in subtrees

41

[Figure: two parsed PDF malware variants shown side by side; their trees are identical except for changes within one subtree under the root]

Subtree Distance One: arbitrary changes in 1 out of N subtrees under root
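A minimal sketch of the metric, assuming each PDF has already been parsed into a nested dict keyed by object names (a made-up in-memory representation for illustration, not the actual parser output used in the paper):

```python
def subtree_distance(tree_a, tree_b):
    """Number of root-level subtrees in which two parsed PDFs differ.
    Each tree is a nested dict: {'/Type': {...}, '/Pages': {...}, ...}.
    Distance one = arbitrary changes confined to 1 of the N subtrees
    under the root."""
    keys = set(tree_a) | set(tree_b)
    return sum(1 for k in keys if tree_a.get(k) != tree_b.get(k))

# Example: the variant only rewrites the /OpenAction subtree -> distance 1.
original = {'/Type': '/Catalog', '/Pages': {'/Count': 1}, '/OpenAction': {'/S': '/JavaScript'}}
variant  = {'/Type': '/Catalog', '/Pages': {'/Count': 1}, '/OpenAction': {'/S': '/JavaScript', '/JS': 'payload'}}
print(subtree_distance(original, variant))   # 1
```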

Building Block Robustness Properties

• PDF variants with correct syntax
  • Subtree insertion property (subtree distance one)
  • Subtree deletion property (subtree distance one)

• Overapproximate attackers within the bound
  • A small subtree distance maintains a low FPR
  • Binary path features (Hidost, Šrndić et al., NDSS 2013); see the sketch after the results table

42

Accuracy                 99.74%
False Positive Rate       0.56%
Subtree Insertion VRA    91.86%
Subtree Deletion VRA     99.68%
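For reference, a tiny sketch of Hidost-style binary path features over the same kind of toy tree representation used above (real Hidost consolidates structurally equivalent paths; this just records which structural paths are present):

```python
def binary_path_features(tree, prefix=''):
    """Set of structural paths present in a parsed PDF tree (sketch).
    Each path becomes one binary feature for the classifier."""
    paths = set()
    if isinstance(tree, dict):
        for name, child in tree.items():
            path = prefix + name
            paths.add(path)
            paths |= binary_path_features(child, path)
    return paths

toy = {'/Type': '/Catalog',
       '/Pages': {'/Count': 1},
       '/OpenAction': {'/S': '/JavaScript', '/JS': 'payload'}}
print(sorted(binary_path_features(toy)))
# ['/OpenAction', '/OpenAction/JS', '/OpenAction/S', '/Pages', '/Pages/Count', '/Type']
```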

Genetic Evolutionary Attack: EvadeML

• Mutation operations (insert, replace, delete). Xu et al., NDSS 2016.

• Cuckoo oracle to check that variants remain malicious
• Classifier’s score as search feedback
• Genetic evolution algorithm to optimize the search (see the sketch below)

43

[Figure: EvadeML mutates a malicious seed PDF until a variant is classified as benign]
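A skeletal version of that search loop (illustrative only; `mutate_pdf`, `cuckoo_is_malicious`, and `classifier_score` are placeholders for EvadeML’s mutation engine, Cuckoo oracle, and target classifier):

```python
import random

def genetic_evasion(seed_pdf, mutate_pdf, cuckoo_is_malicious, classifier_score,
                    pop_size=48, generations=20):
    """Evolve PDF variants that keep malicious behavior (per the oracle)
    while driving the classifier's maliciousness score toward benign."""
    population = [mutate_pdf(seed_pdf) for _ in range(pop_size)]
    for _ in range(generations):
        # Keep only variants the sandbox oracle still finds malicious.
        alive = [v for v in population if cuckoo_is_malicious(v)]
        if not alive:
            alive = [seed_pdf]
        # Lower classifier score = closer to being labeled benign.
        alive.sort(key=classifier_score)
        best = alive[0]
        if classifier_score(best) < 0.5:       # illustrative benign threshold
            return best                        # evasive variant found
        # Next generation: mutate the fittest survivors.
        parents = alive[: max(1, len(alive) // 4)]
        population = [mutate_pdf(random.choice(parents)) for _ in range(pop_size)]
    return None
```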

Success Rate of Unbounded Attackers

Generate evasive variants for 500 PDFs. Against the verifiably robust model, the attacker faces:
• A longer mutation trace
• A larger feature difference
• Worse search feedback

44

Conclusion

• Verifiably robust training has a lot of potential to build truly robust models against different attackers.

• We need to define meaningful distance metrics to train robustness properties for security classifiers.

• Training for verifiable robustness within a bounded region tends to raise the bar for unrestricted attackers to evade the classifier.

45