Quantifying Realistic Threats for Deep Learning Models
Zhenyu Zhong, Zhisheng Hu, Xiaowei Chen, Baidu Research Institute
{edwardzhong, zhishenghu, xiaoweichen01}@baidu.com
Motivation
• Intentional adversarial example attacks are unlikely in practice because attackers lack a practical scheme to monetize them.
• Real-world threats against DNNs do not cease to exist in safety-critical scenarios even when there is no attacker.
• The AI industry is in great need of real-world threat quantification for DNN model robustness.
Goal
• Define safety properties observed from the real world such that any violation leads to a misprediction
• Design a standardized pipeline to evaluate threat severity and quantify DNN model robustness
• Shed light on the robustness of pretrained models from different learning tasks
Threat Quantification Framework

Category                  Safety Properties
Luminance                 Brightness, Contrast Reduction
Geometric Transformation  Horizontal (Vertical) Translation, Rotation, Spatial Transformation
Blur                      Motion Blur, Gaussian Blur
Corruption                Uniform Noise, Gaussian Noise, Blended Noise, Salt and Pepper Noise
Weather                   Fog, Frost, Snow

Table 1. Safety Property Pool
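To make these properties concrete, the sketch below shows how four of them could be reproduced with standard image operations. This is a minimal illustration using NumPy and Pillow; the severity parameters (factor, degrees, radius, amount) are our own assumptions, not necessarily the paper's parameterization.

```python
import numpy as np
from PIL import Image, ImageEnhance, ImageFilter

# Hypothetical helpers approximating four of the Table 1 safety properties.

def brightness(img: Image.Image, factor: float) -> Image.Image:
    # Luminance: factor < 1.0 darkens the image, factor > 1.0 brightens it.
    return ImageEnhance.Brightness(img).enhance(factor)

def rotation(img: Image.Image, degrees: float) -> Image.Image:
    # Geometric transformation: rotate counter-clockwise by `degrees`.
    return img.rotate(degrees)

def gaussian_blur(img: Image.Image, radius: float) -> Image.Image:
    # Blur: a larger radius produces a stronger blur.
    return img.filter(ImageFilter.GaussianBlur(radius))

def salt_and_pepper(img: Image.Image, amount: float) -> Image.Image:
    # Corruption: flip a fraction `amount` of pixels to pure black or white.
    # Assumes an RGB image.
    arr = np.asarray(img).copy()
    mask = np.random.rand(*arr.shape[:2]) < amount
    arr[mask] = np.where(np.random.rand(mask.sum(), 1) < 0.5, 0, 255)
    return Image.fromarray(arr)
```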
Criteria                     Description
Misclassification            $f(x+\delta) \neq y(x)$
ConfidenceMisclassification  $P_{\exists i \,|\, i \in \{C\}}(x+\delta) > \mathrm{threshold}_i,\ i \neq y(x)$
TopKMisclassification        $y(x) \notin f_{TopK}(x+\delta)$
OriginalClassProbability     $P_l(x+\delta) < \mathrm{threshold}_l$
TargetClassProbability       $P_{l_t}(x+\delta) > \mathrm{threshold}_{l_t}$

Table 2. Threat Criteria Pool

$x$ is the original input, $x+\delta$ is the perturbed input. $f$ is the function that returns the class label, $y$ is the ground truth of the input. $P$ is the probability of the input's prediction, $l = y(x)$, $C$ is the multiclass label collection, and $l_t$ is the target class.
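Under this notation, each criterion is a simple predicate over the model's output distribution. A minimal sketch follows; the function names and default thresholds are ours, not the framework's API.

```python
import numpy as np

# Hypothetical implementations of the Table 2 threat criteria. `probs` is
# the model's softmax output P on the perturbed input x + delta, and
# `label` is the ground truth y(x). Default thresholds are illustrative.

def misclassification(probs: np.ndarray, label: int) -> bool:
    # f(x + delta) != y(x)
    return int(np.argmax(probs)) != label

def confidence_misclassification(probs, label, threshold=0.5) -> bool:
    # exists i in C with i != y(x) and P_i(x + delta) > threshold_i
    wrong = np.delete(probs, label)
    return bool(np.any(wrong > threshold))

def topk_misclassification(probs, label, k=5) -> bool:
    # y(x) not in f_TopK(x + delta)
    return label not in np.argsort(probs)[-k:]

def original_class_probability(probs, label, threshold=0.1) -> bool:
    # P_l(x + delta) < threshold_l
    return bool(probs[label] < threshold)

def target_class_probability(probs, target, threshold=0.9) -> bool:
    # P_lt(x + delta) > threshold_lt
    return bool(probs[target] > threshold)
```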
Preliminary Results
REFERENCES
1. Dan Hendrycks and Thomas Dietterich. 2019. Benchmarking Neural Network Robustness to Common Corruptions and Perturbations. In International Conference on Learning Representations.
2. Nicholas Carlini and David Wagner. 2017. Towards Evaluating the Robustness of Neural Networks. In Proceedings of the 38th IEEE Symposium on Security and Privacy.
3. Jonas Rauber, Wieland Brendel, and Matthias Bethge. 2017. Foolbox: A Python toolbox to benchmark the robustness of machine learning models. arXiv preprint arXiv:1707.04131. http://arxiv.org/abs/1707.04131
4. Dong Su, Huan Zhang, Hongge Chen, Jinfeng Yi, Pin-Yu Chen, and Yupeng Gao. 2018. Is Robustness the Cost of Accuracy? A Comprehensive Study on the Robustness of 18 Deep Image Classification Models. In Computer Vision – ECCV 2018, 15th European Conference, Munich, Germany, Proceedings. 644–661.
5. Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian Goodfellow, and Rob Fergus. 2014. Intriguing properties of neural networks. In International Conference on Learning Representations.
6. Perceptron Robustness Benchmark. https://github.com/advboxes/perceptron-benchmark
Fig 1. Pretrained model robustness comparison across 13 DNN architectures on 1k randomly sampled ImageNet images; panels show Brightness, Rotation, Gaussian Blur, Salt & Pepper, and Snow. The metric is the minimal $\|\mathrm{perturbation}\|_2$ that introduces Misclassification.
Methods
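The pipeline searches, per image and per safety property, for the smallest severity at which a threat criterion fires. A minimal sketch of that search, assuming violations are monotone in the perturbation magnitude; the callables and their names are ours, not the benchmark's API.

```python
def minimal_severity(predict, perturb, criterion, x, label,
                     lo=0.0, hi=1.0, iters=20):
    # Bisect for the smallest severity s in [lo, hi] at which
    # criterion(predict(perturb(x, s)), label) holds. Assumes that once a
    # severity triggers a violation, every larger severity does too.
    if not criterion(predict(perturb(x, hi)), label):
        return None  # even the maximum severity does not violate
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if criterion(predict(perturb(x, mid)), label):
            hi = mid  # violation occurs at mid: search the lower half
        else:
            lo = mid
    return hi
```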
Fig 2. Fooling Success Rate: the median minimal $L_2$ distance across all the models is the threshold $T$ for each property. A success is defined as an input image that needs less than $T$ perturbation to achieve model misbehavior.
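From the per-image minimal distances produced by the search above, the Fig 2 metric could then be computed as below; the dictionary layout is our own assumption.

```python
import numpy as np

# Sketch of the Fig 2 metric for one safety property. `distances[model]`
# is an array holding each image's minimal violating L2 distance
# (np.inf if no violation was found at any severity).

def fooling_success_rates(distances):
    # Threshold T = median minimal distance pooled across all models.
    pooled = np.concatenate(list(distances.values()))
    threshold = np.median(pooled[np.isfinite(pooled)])
    # Success rate = fraction of images violated below the threshold.
    return {model: float(np.mean(d < threshold))
            for model, d in distances.items()}
```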
Safety Violations against ResNet101 (ground truth: jay). Each perturbed image is mispredicted as follows.
A. Luminance: Brightness → magpie; Contrast → african grey.
B. Geometric Transformation: Rotation → cabbage butterfly; Vertical Translation → hummingbird; Horizontal Translation → lycaenid.
C. Blur: Motion Blur → madagascar cat; Gaussian Blur → indri.
D. Corruption: Blended Noise → bulbul; Salt & Pepper → fountain; Gaussian Noise → ptarmigan.
E. Weather: Fog → african grey; Frost → fountain; Snow → cabbage butterfly.
Fig 3. Violations against YOLOv3. Upper left (UL): original image. Lower left (LL): Rotation applied. Lower middle (LM): Gaussian Blur applied. Lower right (LR): Fog effect applied. No detections (LL, LM); misclassification (LR).