
Semidefinite relaxations for certifying robustness to adversarial examples

Aditi Raghunathan, Jacob Steinhardt, Percy Liang

ML: Powerful But Fragile

• ML is successful on several tasks: object recognition, game playing, face recognition
• ML systems fail catastrophically in the presence of adversaries
• Different kinds of adversarial manipulations — data poisoning, manipulation of test inputs, model theft, membership inference, etc.
• Focus on adversarial examples — manipulation of test inputs

Adversarial Examples

Glasses → Impersonation [Sharif et al. 2016]
Banana + patch → Toaster [Brown et al. 2017]
Stop sign + sticker → Yield [Evtimov et al. 2017]

Adversarial Examples

3D Turtle → Rifle [Athalye et al. 2017]
Noise → "Ok Google" [Carlini et al. 2017]
Malware → Benign [Grosse et al. 2017]

What is an adversarial example?

[Figure: Panda → Gibbon, Szegedy et al. 2014]

The definition of the attack model is usually application-specific and complex.

We consider the well-studied $\ell_\infty$ attack model:

$|x_{\mathrm{adv}} - x|_i \le \epsilon$ for $i = 1, 2, \ldots, d$, i.e., $x_{\mathrm{adv}} \in B_\epsilon(x)$

History

Hard to defend even in this well-defined model…

• [Szegedy+ 2014]: First to discover adversarial examples
• [Goodfellow+ 2015]: Adversarial training (AT) against FGSM
• [Papernot+ 2015]: Defensive distillation
• [Carlini & Wagner 2016]: Distillation is not secure
• [Papernot+ 2017]: Better distillation
• [Carlini & Wagner 2017]: Ten detection strategies fail
• [Madry+ 2017]: AT against PGD, informal argument about optimality
• [Lu+, July 12 2017]: "No Need to Worry about Adversarial Examples in Object Detection in Autonomous Vehicles"
• [Athalye & Sutskever, July 17 2017]: Break the above defense
• [Athalye, Carlini & Wagner 2018]: Break 6 out of 7 ICLR defenses

Provable robustness

Can we get robustness to all attacks?

Let $f(x)$ be the scoring function that the adversary wants to maximize.

Attacks generate points in $B_\epsilon(x)$:

$A_{\mathrm{fgsm}}(x) = x + \epsilon \, \mathrm{sign}(\nabla f(x))$

$A_{\mathrm{opt}}(x) = \mathrm{argmax}_{\tilde{x} \in B_\epsilon(x)} f(\tilde{x})$

The network is provably robust if the optimal attack fails:

$f^\star \equiv f(A_{\mathrm{opt}}(x)) < 0$
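To make the two attacks concrete, here is a minimal numpy sketch; `f` and `grad_f` are stand-ins for the network's scoring function and its gradient (obtained via autodiff in practice), and the brute-force grid search is only meant for toy low-dimensional settings:

```python
import numpy as np

def fgsm(x, grad_f, eps):
    # A_fgsm(x) = x + eps * sign(grad f(x)): one step to a corner of B_eps(x).
    return x + eps * np.sign(grad_f(x))

def opt_attack_grid(x, f, eps, steps=11):
    # Brute-force approximation of A_opt(x) = argmax over B_eps(x) of f,
    # by exhaustive grid search over the l_inf ball (toy dimensions only).
    axes = [np.linspace(xi - eps, xi + eps, steps) for xi in x]
    pts = np.stack([g.ravel() for g in np.meshgrid(*axes)], axis=1)
    return pts[np.argmax([f(p) for p in pts])]

# Example: for a linear scorer f(x) = w.x, the optimal attack matches FGSM.
w = np.array([1.0, -2.0])
x0 = np.array([0.3, 0.1])
print(fgsm(x0, lambda x: w, 0.1), opt_attack_grid(x0, lambda x: w @ x, 0.1))
```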

Provable robustness

The network is provably robust if $f^\star \equiv f(A_{\mathrm{opt}}(x)) < 0$.

Computing $f^\star$ is intractable in general.

• Combinatorial approaches to compute $f^\star$:
  • SMT-based Reluplex [Katz+ 2018]
  • MILP-based with specialized preprocessing [Tjeng+ 2018]
• Convex relaxations to compute an upper bound $f_{\mathrm{upper}}$ on $f^\star$:
  • Upper bound is negative $\Rightarrow$ optimal attack fails
  • Computationally efficient

Three possible outcomes:
• $0 \le f^\star \le f_{\mathrm{upper}}$: not robust
• $f^\star \le f_{\mathrm{upper}} \le 0$: robust and certified
• $f^\star \le 0 \le f_{\mathrm{upper}}$: robust but not certified

Two-layer networks

Key idea: Uniformly bound gradients:

$f(\tilde{x}) \le f(x) + \epsilon \max_{x'} \|\nabla f(x')\|_1$ for all $\tilde{x} \in B_\epsilon(x)$

Bound on the gradient:
• optimize over activations
• optimize over the signs of the perturbation

Final step: an SDP relaxation (similar to MAXCUT) leads to Grad-cert. (A brute-force version of the underlying quantity is sketched below.)
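For a two-layer network $f(x) = v^\top \mathrm{ReLU}(Wx)$, the gradient is $W^\top \mathrm{diag}(s)\, v$ with $s \in \{0,1\}^m$ the ReLU activation pattern, so the uniform gradient bound can be over-approximated by maximizing over all patterns. A minimal numpy sketch of that brute force (exponential in $m$, so toy sizes only; Grad-cert instead relaxes this joint maximization over activations and perturbation signs to an SDP):

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(0)
d, m, eps = 4, 6, 0.1                    # toy sizes: all 2^m patterns enumerated
W = rng.standard_normal((m, d))
v = rng.standard_normal(m)
f = lambda x: v @ np.maximum(W @ x, 0)   # two-layer ReLU scoring function

# max_x ||grad f(x)||_1 <= max over patterns s of ||W^T diag(s) v||_1
# (the l_1 norm itself maximizes over perturbation signs t in {-1,+1}^d).
grad_bound = max(np.abs(W.T @ (np.array(s) * v)).sum()
                 for s in product([0, 1], repeat=m))

x0 = rng.standard_normal(d)
f_upper = f(x0) + eps * grad_bound       # certified: robust at x0 if f_upper < 0
print(f_upper)
```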

Relaxation Training

Training a neural network against the certificate.

Objective: differentiable, but with expensive gradients.

Duality to the rescue! Regularizer:

$D \cdot \lambda^{+}_{\max}\big(M(v, W) - \mathrm{diag}(c)\big) + \mathbf{1}^\top \max(c, 0)$

Just one maximum-eigenvalue computation for gradients.
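A minimal numpy sketch of evaluating that regularizer. The assembly of $M(v, W)$ follows the ICLR 2018 paper and is replaced here by a random symmetric placeholder, and the value of $D$ (a constant from the bound in the paper) is set arbitrarily; both are assumptions for illustration:

```python
import numpy as np

def dual_regularizer(M, c, D):
    # D * lambda_max^+(M - diag(c)) + 1^T max(c, 0); eigvalsh sorts ascending,
    # so [-1] is the largest eigenvalue, and max(., 0) gives lambda_max^+.
    lam_max = np.linalg.eigvalsh(M - np.diag(c))[-1]
    return D * max(lam_max, 0.0) + np.maximum(c, 0.0).sum()

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
M = (A + A.T) / 2          # placeholder for M(v, W)
c = rng.standard_normal(5)
print(dual_regularizer(M, c, D=5.0))
```

Since the dual bound holds for every $c$, $c$ can be optimized jointly with the network, and each training step then needs only this one maximum-eigenvalue computation.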

Results on MNIST

Attack: the Projected Gradient Descent (PGD) attack of Madry et al. 2018. Adversarial training minimizes this lower bound on the training set.

[Figure: PGD-attack error vs. Grad-cert bound for an adversarially trained network]

The gradient-based bound is quite loose.

Train with Grad-cert: our method instead minimizes the gradient-based upper bound.

[Figure: PGD-attack error vs. Grad-cert bound for the network trained with Grad-cert]

The gradient-based bound is much better! Training a network to minimize the gradient upper bound finds networks where the bound is tight.

Comparison with Wong and Kolter 2018 (LP-cert):

Network | PGD-attack | LP-cert | Grad-cert
LP-NN   | 22%        | 26%     | 93%
Grad-NN | 15%        | 97%     | 35%

Bounds are tight when you train against them, and only then.

Some networks are empirically robust but not certified (e.g., the adversarial training of Madry et al. 2018). Can we certify such "foreign" networks?

Summary so far…

• Certified robustness: relaxed optimization to bound the worst-case attack
• Grad-cert: upper bound on the worst-case attack using a uniform bound on the gradient
• Training against the bound makes it tight
• LP-cert and Grad-cert are tight only on the networks trained against them
• Goal: efficiently certify foreign multi-layer networks

New SDP-cert relaxation

Network: $x \to x_1 \to x_2 \to x_3 \equiv x_L$, with weights $W_0, W_1, W_2$.

Attack model constraints: $|\tilde{x} - x|_i \le \epsilon$ for $i = 1, 2, \ldots, d$

Neural net constraints: $x_i = \mathrm{ReLU}(W_{i-1} x_{i-1})$ for $i = 1, 2, \ldots, L$

Objective: $f^\star = \max_{\tilde{x}} \; (c_{\tilde{y}} - c_y)^\top x_L$

The source of non-convexity is the ReLU constraints.

Handling ReLU constraints

Consider a single ReLU constraint $z = \max(0, x)$.

Key insight: it can be replaced by linear + quadratic constraints:

• $z \ge x$ (linear)
• $z \ge 0$ (linear)   [z is greater than both x and 0]
• $z(z - x) = 0$ (quadratic)   [z is equal to one of x, 0]

Together, these exactly capture $z = \max(0, x)$. We can relax the quadratic constraints to get a semidefinite program.

SDP relaxation

Single ReLU constraint $z = \max(0, x)$ $\equiv$ linear + quadratic constraints.

Collect the variables into $v = (1, x, z)$ and form

$M = \begin{bmatrix} 1 & x & z \\ x & x^2 & xz \\ z & xz & z^2 \end{bmatrix}$

• $z \ge x$: linear in the entries of $M$
• $z \ge 0$: linear in the entries of $M$
• $z(z - x) = 0$, i.e., $z^2 = xz$: also linear in the entries of $M$

So the ReLU constraints become linear constraints on matrix entries.

Constraint on $M$:
• $M = vv^\top$: exact, but a non-convex set
• $M = VV^\top$, i.e., $M \succeq 0$: relaxed and convex set

Generalizes to multiple layers: one large matrix $M$ containing all activations. (A concrete one-hidden-layer instance is sketched below.)
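Putting the pieces together, here is a minimal cvxpy sketch of the one-hidden-layer case, bounding $f(x) = c^\top \mathrm{ReLU}(Wx)$ over an $\ell_\infty$ ball. The toy `W`, `c`, `x0` are placeholders, the input box is encoded by the quadratic constraint $(x_i - l_i)(x_i - u_i) \le 0$ (one standard choice, assumed here), and cvxpy's default SCS solver handles the PSD constraint:

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
d, m, eps = 2, 3, 0.1                    # toy one-hidden-layer network
W = rng.standard_normal((m, d))
c = rng.standard_normal(m)
x0 = rng.standard_normal(d)
lo, hi = x0 - eps, x0 + eps              # l_inf ball around x0

n = 1 + d + m                            # v = (1, x_1..x_d, z_1..z_m)
M = cp.Variable((n, n), symmetric=True)
cons = [M >> 0, M[0, 0] == 1]            # relaxation: M = vv^T  ->  M PSD

for i in range(d):                       # input box: (x_i - lo_i)(x_i - hi_i) <= 0
    xi = 1 + i
    cons += [M[xi, xi] <= (lo[i] + hi[i]) * M[0, xi] - lo[i] * hi[i]]

for j in range(m):                       # ReLU: z >= 0, z >= Wx, z^2 = z * (Wx)
    zj = 1 + d + j
    Wx = sum(W[j, i] * M[0, 1 + i] for i in range(d))
    zWx = sum(W[j, i] * M[zj, 1 + i] for i in range(d))
    cons += [M[0, zj] >= 0, M[0, zj] >= Wx, M[zj, zj] == zWx]

f_upper = cp.Problem(
    cp.Maximize(sum(c[j] * M[0, 1 + d + j] for j in range(m))), cons
).solve()
print("SDP-cert upper bound:", f_upper)  # certified robust at x0 if < 0
```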

SDP relaxation: interaction between different hidden units

$x_1, x_2 \in [-\epsilon, \epsilon]$
$z_1 = \mathrm{ReLU}(x_1 + x_2)$
$z_2 = \mathrm{ReLU}(x_1 - x_2)$

[Figure: relaxation values on this example, compared with the unrelaxed value at $x_1 = x_2 = 0.5\epsilon$]

LP treats the units independently; the SDP reasons about them jointly.

Theorem: For a random two-layer network with $m$ hidden nodes and input dimension $d$, $\mathrm{opt}(\mathrm{LP}) = \Theta(md)$ and $\mathrm{opt}(\mathrm{SDP}) = \Theta(m\sqrt{d} + d\sqrt{m})$.
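A quick numeric illustration of that independence gap on the two-unit example, assuming the adversary maximizes $z_1 + z_2$ and assuming the standard per-unit LP envelope $z \le u(a - l)/(u - l)$ over the pre-activation interval $[l, u]$ (both choices are mine, for illustration):

```python
import numpy as np

eps = 1.0

# True worst case of z1 + z2 with z1 = ReLU(x1 + x2), z2 = ReLU(x1 - x2):
xs = np.linspace(-eps, eps, 201)
true_max = max(max(a + b, 0) + max(a - b, 0) for a in xs for b in xs)

# Per-unit LP envelope: each pre-activation lies in [l, u] = [-2eps, 2eps],
# so z <= u(a - l)/(u - l) = (a + 2eps)/2. Treating the units independently,
#   z1 + z2 <= (x1 + x2 + 2eps)/2 + (x1 - x2 + 2eps)/2 = x1 + 2eps <= 3eps.
lp_max = 3 * eps

print(true_max, lp_max)   # 2.0 vs 3.0: the LP ignores the coupling via x1, x2
```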

Results on MNIST

Three different robust networks:
• Grad-NN [Raghunathan et al. 2018]
• LP-NN [Wong and Kolter 2018]
• PGD-NN [Madry et al. 2018]

           | Grad-NN | LP-NN | PGD-NN
Grad-cert  | 35%     | 93%   | N/A
LP-cert    | 97%     | 22%   | 100%
SDP-cert   | 20%     | 20%   | 18%
PGD-attack | 15%     | 18%   | 9%

SDP provides good certificates on all three different networks.

Results on MNIST

PGD-NN [Madry et al. 2018]

[Figure: per-example certification vs. attack success]

Uncertified points are more vulnerable to attack.

Scaling up…

In general, CNNs are more robust than fully connected networks, but off-the-shelf SDP solvers do not exploit the CNN structure.

Ongoing work:
• First-order SDP solvers based on matrix-vector products (see the sketch below)
• Exploit efficient CNN implementations in Tensorflow

Concurrent work:
• MILP solving with efficient preprocessing [Tjeng+ 2018, Xiao+ 2018]
• Scaling up LP-based methods [Dvijotham+ 2018, Wong and Kolter 2018]
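To illustrate why matrix-vector products suffice for a first-order solver, here is a minimal power-iteration sketch for the leading eigenvalue, the core primitive such solvers repeat; the `matvec` closure is a stand-in for applying $M$ implicitly through network operations (e.g., convolutions) without ever materializing the matrix:

```python
import numpy as np

def lambda_max(matvec, dim, iters=200, seed=0):
    # Power iteration: estimates the eigenvalue of largest magnitude of a
    # symmetric operator using only matvec calls, never the full matrix.
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(dim)
    v /= np.linalg.norm(v)
    for _ in range(iters):
        w = matvec(v)
        v = w / np.linalg.norm(w)
    return v @ matvec(v)

A = np.array([[2.0, 1.0], [1.0, 3.0]])
print(lambda_max(lambda v: A @ v, 2))   # approx. (5 + sqrt(5))/2 = 3.618
```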

Summary

• Robustness for the $\ell_\infty$ attack model
• Certified evaluation to avoid an arms race
• Presented two different relaxations for certification
• Adversarial examples more broadly…
  • Does there exist a mathematically well-defined attack model?
  • Would the current techniques (deep learning + appropriate regularization) transfer to this attack model?
• Secure vs. better models?
  • Adversarial examples expose limitations of current systems
  • How do we get models to learn "the right thing"?

Thank you!

Jacob Steinhardt, Percy Liang

“Certified Defenses against Adversarial Examples” https://arxiv.org/abs/1801.09344 [ICLR 2018]

“Semidefinite Relaxations for Certifying Robustness to Adversarial Examples” https://arxiv.org/abs/1811.01057 [NeurIPS 2018]

Open Philanthropy Project · Google
