
On the Robustness of Semantic Segmentation Models to Adversarial Attacks

Anurag Arnab¹  Ondrej Miksik¹,²  Philip H.S. Torr¹
¹University of Oxford  ²Emotech Labs

{anurag.arnab, ondrej.miksik, philip.torr}@eng.ox.ac.uk

1. Introduction

Computer vision has progressed to the point where Deep Neural Network (DNN) models for most recognition tasks have become a widely available commodity. However, despite DNNs performing exceptionally well in absolute performance scores, they have also been shown to be vulnerable to adversarial examples [11]. This raises doubts about the use of DNNs in safety-critical applications such as driverless cars or medical diagnosis, since they can inexplicably misclassify a natural input that is almost identical to examples they have classified correctly before. Moreover, it opens the possibility of malicious agents attacking systems that use DNNs [6]. Hence, the robustness of DNNs to adversarial perturbations may be as important as their predictive accuracy on clean inputs.

This phenomenon has recently attracted a lot of attention; however, most proposed defenses to adversarial attacks have been compromised [2] and often come at the cost of performance penalties on clean inputs [8]. To the best of our knowledge, adversarial examples have not been extensively analysed beyond standard image classification models. Hence, the vulnerability of modern DNNs to adversarial attacks on more complex tasks such as semantic segmentation, in the context of real-world datasets covering different domains, remains unclear.

Semantic segmentation models typically extend standard image classification architectures with additional components such as dilated convolutions, skip-connections, Conditional Random Fields (CRFs) and/or multiscale processing, whose impact on robustness has never been thoroughly studied. In this paper [1], we present what is, to our knowledge, the first rigorous evaluation of adversarial attacks on modern semantic segmentation models. Using two large-scale datasets, we analyse the effect of different model architectures, capacity, multiscale processing and structured prediction, and show that many observations made on image classification do not always transfer to this more complex task. Furthermore, we show how mean-field inference in deep structured models and multiscale processing naturally implement recently proposed adversarial defenses.

2. Experimental Set-up

Datasets. We use the Pascal VOC and Cityscapes validation sets. Pascal VOC consists of internet images labelled with 21 classes, whilst Cityscapes is composed of road scenes captured from a car and has 19 classes.

Models. We evaluate models based on both VGG [10] and ResNet [4] backbones. We also consider the custom E-Net and ICNet architectures designed for real-time applications. Our chosen networks exhibit a variety of approaches unique to semantic segmentation models, including specialised pooling (PSPNet, DeepLab), encoder-decoder architectures (SegNet, E-Net), multiscale processing (DeepLab), CRFs (CRF-RNN), dilated convolutions (DilatedNet, DeepLab), and skip-connections (FCN). Due to space limitations, references to the original models are left to our full paper [1].

Adversarial attacks. We use FGSM, FGSM ll and their iterative variants, with min(ε + 4, ⌈1.25ε⌉) iterations and step-size α = 1 [7]. The ℓ∞ norm of the perturbations, ε, was set to each value in {0.25, 0.5, 1, 2, 4, 8, 16, 32}.
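
For concreteness, the following PyTorch sketch implements the iterative least-likely-class variant in the spirit of [7]. The model interface, tensor shapes and pixel range are assumptions for illustration, not the exact implementation used in [1].

```python
import math

import torch
import torch.nn.functional as F


def iterative_fgsm_ll(model, image, epsilon, alpha=1.0):
    """Iterative least-likely-class FGSM in the spirit of Kurakin et al. [7].

    Assumptions (not taken from [1]): `model` returns per-pixel logits of
    shape (N, C, H, W) and `image` holds pixel values in [0, 255].
    """
    num_iters = int(min(epsilon + 4, math.ceil(1.25 * epsilon)))
    with torch.no_grad():
        # The least-likely class under the clean prediction is the target.
        target = model(image).argmin(dim=1)

    adv = image.clone()
    for _ in range(num_iters):
        adv = adv.detach().requires_grad_(True)
        loss = F.cross_entropy(model(adv), target)
        loss.backward()
        # Step towards the least-likely target, i.e. against the gradient sign.
        adv = adv - alpha * adv.grad.sign()
        # Project back into the l_inf ball of radius epsilon and clip to range.
        adv = torch.max(torch.min(adv, image + epsilon), image - epsilon)
        adv = adv.clamp(0.0, 255.0)
    return adv.detach()
```

The untargeted FGSM variants differ only in using the ground-truth (or predicted) label as the target and stepping along, rather than against, the gradient sign.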

Evaluation metric. Since the accuracy on clean inputs varies across models, we adapt the relative metric of [7] and measure adversarial robustness using the Intersection over Union (IoU) Ratio: the ratio of the network's IoU on adversarial inputs to its IoU on clean images, computed over the entire dataset. More exhaustive details are included in the full paper [1].
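
A minimal NumPy sketch of this metric is given below. The integer label maps and the ignore label of 255 are assumptions for illustration; the exact evaluation code is described in [1].

```python
import numpy as np


def confusion_matrix(pred, gt, num_classes, ignore_label=255):
    """Accumulate a confusion matrix for one image (pred, gt: int label maps)."""
    valid = gt != ignore_label
    idx = num_classes * gt[valid].astype(np.int64) + pred[valid].astype(np.int64)
    return np.bincount(idx, minlength=num_classes ** 2).reshape(num_classes,
                                                                 num_classes)


def mean_iou(conf):
    """Mean Intersection-over-Union from a dataset-level confusion matrix."""
    intersection = np.diag(conf).astype(np.float64)
    union = conf.sum(axis=0) + conf.sum(axis=1) - np.diag(conf)
    return float(np.mean(intersection / np.maximum(union, 1)))


def iou_ratio(conf_adversarial, conf_clean):
    """IoU Ratio: IoU on adversarial inputs divided by IoU on clean inputs,
    both accumulated over the entire dataset."""
    return mean_iou(conf_adversarial) / mean_iou(conf_clean)
```

Normalising by the clean-input IoU makes models with different clean accuracies directly comparable under attack.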

3. Primary Findings

Architectures. An evaluation of different architectures (Fig. 1) shows that models with residual connections are inherently more robust than chain-like networks on both VOC and Cityscapes. This extends to models with very few parameters (E-Net and ICNet) designed for real-time embedded platforms, contrary to the prior observations of [7, 8]. Though we observed a correlation between robustness and accuracy, the most accurate network (PSPNet) was not always the most robust one (which was DeepLab v2).

Multiscale processing. The multiscale processing of DeepLab v2 makes it more robust. Further experiments (cf. the full paper [1]) showed that adversarial attacks are less damaging when they are generated and processed at different scales. This is because CNNs are not invariant to scale, among many other transformations. We confirmed this by evaluating the transferability of attacks generated at one scale and evaluated at another (a simple version of this test is sketched below). The fact that CNNs lack invariance to a broad class of transformations also explains why recent papers [12, 3] which transform the input to a CNN as an adversarial defense have shown promise.
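
The sketch below illustrates one way to run such a cross-scale transfer test. The bilinear resizing of the perturbation back to the original resolution is an assumption, not the exact protocol of [1]; `attack_fn` can be any attack, e.g. the iterative FGSM ll sketch above.

```python
import torch.nn.functional as F


def attack_at_scale(attack_fn, model, image, epsilon, scale):
    """Generate an attack on a rescaled copy of the input and resize the
    resulting perturbation back to the original resolution."""
    small = F.interpolate(image, scale_factor=scale, mode='bilinear',
                          align_corners=False)
    perturbation = attack_fn(model, small, epsilon) - small
    perturbation = F.interpolate(perturbation, size=image.shape[-2:],
                                 mode='bilinear', align_corners=False)
    return (image + perturbation).clamp(0.0, 255.0)

# Evaluating this attack at the original scale, and comparing against an
# attack generated directly at that scale, quantifies how well adversarial
# perturbations transfer across scales.
```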


Figure 1: Adversarial robustness of state-of-the-art models under (a) iterative FGSM ll on Pascal VOC and (b) targeted iterative FGSM ll on Cityscapes. Models based on the ResNet backbone tend to be more robust. Models are ordered by increasing IoU on clean inputs. Results on additional attacks are in [1].

Figure 2: (a) On untargeted attacks on Pascal VOC, CRF-RNN is noticeably more robust than FCN8s. (b) However, CRF-RNN is more vulnerable to black-box attacks transferred from FCN8s, due to its “gradient masking” effect, which renders white-box attacks ineffective. (c) Moreover, the CRF does not “mask” the gradient for targeted attacks, and CRF-RNN is then no more robust than FCN8s.

CRFs and mean-field inference. Intuitively, the high-frequency components that characterise adversarial perturbations may be mitigated by the pairwise terms of DenseCRF [5], which act as a low-pass filter. An evaluation of CRF-RNN, which performs end-to-end mean-field inference of DenseCRF, shows that it is indeed more robust to untargeted attacks (Fig. 2a). However, the reason for this robustness is that mean-field inference tends to produce overconfident predictions (measured by the entropy and maximum probability of the marginal distribution at each pixel), which “mask” the gradient used to construct untargeted adversarial attacks. Therefore, a technique commonly used in the segmentation literature naturally performs the “gradient masking” defence proposed by [9]. This effect can be circumvented by performing black-box (Fig. 2b) and targeted attacks (Fig. 2c), in which case CRF-RNN is as vulnerable as the FCN8s network it extends.
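
The overconfidence underlying this masking can be diagnosed directly from the per-pixel marginals, as in the sketch below. The logit shape and the model names in the usage comment are assumptions for illustration.

```python
import torch


def confidence_statistics(logits):
    """Mean per-pixel confidence of the predicted marginals, used to diagnose
    "gradient masking": near one-hot marginals have near-zero entropy, so the
    gradient of an untargeted loss vanishes. Shape (N, C, H, W) is assumed."""
    probs = torch.softmax(logits, dim=1)
    max_prob = probs.max(dim=1).values                           # (N, H, W)
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=1)  # (N, H, W)
    return max_prob.mean().item(), entropy.mean().item()

# A black-box attack that sidesteps the masked gradients: craft the
# perturbation on FCN8s and evaluate it on CRF-RNN (hypothetical model
# callables returning logits, reusing the earlier attack sketch):
#
#   adv = iterative_fgsm_ll(fcn8s_model, image, epsilon)
#   prediction = crf_rnn_model(adv).argmax(dim=1)
```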

4. Conclusion

Our main paper [1] presented, to our knowledge, the first rigorous evaluation of the robustness of semantic segmentation models to adversarial attacks, which are arguably the greatest challenge affecting DNNs. We have made numerous observations and raised questions that will aid future work in understanding adversarial examples and developing more effective defenses that do not compromise accuracy. In the shorter term, our observations suggest that networks such as DeepLab v2, which is based on ResNet and performs multiscale processing, should be preferred in safety-critical applications due to their inherent robustness. As the most accurate network on clean inputs is not necessarily the most robust, we recommend evaluating robustness to a variety of adversarial attacks, as done in this paper, to find the best combination of accuracy and robustness before deploying models in practice.

References

[1] A. Arnab, O. Miksik, and P. H. S. Torr. On the robustness of semantic segmentation models to adversarial attacks. In CVPR, 2018.
[2] A. Athalye, N. Carlini, and D. Wagner. Obfuscated gradients give a false sense of security: Circumventing defenses to adversarial examples. arXiv preprint arXiv:1802.00420, 2018.
[3] C. Guo, M. Rana, M. Cisse, and L. van der Maaten. Countering adversarial images using input transformations. In ICLR, 2018.
[4] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In CVPR, 2016.
[5] P. Krähenbühl and V. Koltun. Efficient inference in fully connected CRFs with Gaussian edge potentials. In NIPS, 2011.
[6] A. Kurakin, I. Goodfellow, and S. Bengio. Adversarial examples in the physical world. In ICLR Workshop, 2017.
[7] A. Kurakin, I. Goodfellow, and S. Bengio. Adversarial machine learning at scale. In ICLR, 2017.
[8] A. Madry, A. Makelov, L. Schmidt, D. Tsipras, and A. Vladu. Towards deep learning models resistant to adversarial attacks. In ICLR, 2018.
[9] N. Papernot, P. McDaniel, X. Wu, S. Jha, and A. Swami. Distillation as a defense to adversarial perturbations against deep neural networks. In IEEE Symposium on Security and Privacy, 2016.
[10] K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. In ICLR, 2015.
[11] C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, and R. Fergus. Intriguing properties of neural networks. In ICLR, 2014.
[12] C. Xie, J. Wang, Z. Zhang, Z. Ren, and A. Yuille. Mitigating adversarial effects through randomization. In ICLR, 2018.
