Introduction Visualization Training Details CNN Visualization Experiments Discussion
Visualizing and Understanding Convolutional Networks
Matthew D. Zeiler and Rob Fergus, 12 Nov 2013
Baek Gyuseung
Seoul National University
December 2, 2016
1 Introduction
2 Visualization
3 Training Details
4 CNN Visualization
5 Experiments
6 Discussion
Introduction
• Since their introduction by LeCun in the early 1990s, Convolutional Neural Networks (CNNs) have demonstrated excellent performance at image classification
• CNNs have developed continuously - Krizhevsky et al. won the ImageNet 2012 classification benchmark with their own CNN (AlexNet)
• There are several reasons that CNNs perform well - complex structure, parsimonious coefficients
• However, from a scientific standpoint, this is deeply unsatisfactory
• Little insight into the internal operation and behavior - we have no idea how they achieve such good performance
• Without a clear understanding of how and why they work, the development of better models is reduced to trial-and-error
• Visualization - reveals the input stimuli that excite individual feature maps at any layer in the model
• The result of visualization is not just crops of input images, but rather a top-down projection
Deconvolution Network
• Proposed by Zeiler et al.
• An approximate inverse mapping of a CNN
• No learning is involved - it simply uses an already trained CNN
• Unpooling: max pooling is non-invertible - approximately invert it by recording the locations of the maxima (switch variables)
• Rectification
• Filtering: use transposed versions of the same filters (like an auto-encoder)
• Contrast normalization is not needed
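The unpooling step above can be sketched in NumPy - a minimal 2D illustration, assuming non-overlapping 2x2 windows; the function names are my own, not from the paper:

```python
import numpy as np

def max_pool_with_switches(x, k=2):
    """k x k max pooling that also records the argmax locations ("switches")."""
    h, w = x.shape
    pooled = np.zeros((h // k, w // k))
    switches = np.zeros((h // k, w // k), dtype=int)  # flat index within each window
    for i in range(h // k):
        for j in range(w // k):
            win = x[i*k:(i+1)*k, j*k:(j+1)*k]
            switches[i, j] = win.argmax()
            pooled[i, j] = win.max()
    return pooled, switches

def unpool(pooled, switches, k=2):
    """Place each pooled value back at its recorded maximum location; zeros elsewhere."""
    h, w = pooled.shape
    out = np.zeros((h * k, w * k))
    for i in range(h):
        for j in range(w):
            r, c = divmod(switches[i, j], k)
            out[i*k + r, j*k + c] = pooled[i, j]
    return out
```

Only the positions of the maxima survive the round trip - everything else is reconstructed as zero, which is exactly the approximation the deconvnet accepts.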
Figure: Example of unpooling using switch variables
Training details
• Compared with AlexNet
• Data set - ImageNet 2012
• Model fitting by stochastic gradient descent
• mini-batch size: 128
• learning rate: 10^-2, momentum term: 0.9
• Dropout with a rate of 0.5
• initial weights: 10^-2, biases: 0
• produce multiple crops
• Renormalize each filter in the conv. layers whose RMS value exceeds a fixed radius of 10^-1 to this fixed radius
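The filter renormalization above can be sketched as follows - a minimal NumPy version, assuming filters are stacked along the first axis; the function name and shapes are illustrative, not from the paper:

```python
import numpy as np

def renormalize_filters(weights, radius=1e-1):
    """Rescale each filter whose RMS value exceeds `radius` back down to `radius`.

    `weights` has shape (num_filters, ...); each filter is treated independently,
    and filters already inside the radius are left untouched.
    """
    flat = weights.reshape(weights.shape[0], -1)
    rms = np.sqrt((flat ** 2).mean(axis=1))
    scale = np.where(rms > radius, radius / np.maximum(rms, 1e-12), 1.0)
    return weights * scale.reshape(-1, *([1] * (weights.ndim - 1)))
```

This acts as a crude constraint on filter magnitude, preventing any single filter from dominating during training.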
Architecture
A few differences from AlexNet:
• Sparse connections in AlexNet are replaced with dense connections
• The filter size and stride applied to the input image are adjusted
Feature Visualization
Figure: Top 9 activations in a random subset of feature maps across the validation data
Feature Evolution during Training
Figure: Evolution of features through training. The visualization shows the strongest activation for a given feature map at epochs [1, 2, 5, 10, 20, 30, 40, 64]
Feature invariance
Figure: Translation, scale, and rotation invariance of the CNN
Architecture selection
• (b) and (d) are obtained by visualizing AlexNet
• (b): too many features are dead / (d): aliasing artifacts
• Fixed by adjusting the filter size and stride of the 1st layer
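The effect of changing the first layer's filter size and stride can be checked with the standard convolution output-size formula - a sketch, where the 224-pixel input and the 11x11/stride-4 vs. 7x7/stride-2 settings are illustrative values, not quoted from the slides:

```python
def conv_output_size(input_size, kernel, stride, pad=0):
    """Spatial size of a conv layer's output feature map (valid integer strides)."""
    return (input_size + 2 * pad - kernel) // stride + 1

# Larger filters with a large stride produce a coarse first-layer map;
# smaller filters with a smaller stride retain much more spatial detail.
coarse = conv_output_size(224, kernel=11, stride=4)  # AlexNet-style first layer
fine = conv_output_size(224, kernel=7, stride=2)     # smaller filter, smaller stride
```

The finer map keeps more mid-frequency information in layer 1, which is one way to read the reduction in dead features and aliasing.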
Occlusion Sensitivity
Figure: (a) Original image (b) Strongest feature map, layer 5 (c) Visualization of (b) (d) Probability of correct class (e) Most probable class
The CNN identifies the location of the object in the image.
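The occlusion experiment can be sketched as a sliding grey patch - a minimal NumPy version, assuming a single-channel image and a `classifier` callable that returns the correct-class probability; both names are placeholders, not from the paper:

```python
import numpy as np

def occlusion_map(image, classifier, patch=8, stride=8, fill=0.5):
    """Slide a grey occluder over the image and record the correct-class
    probability at each occluder position. Low values mark regions the
    classifier depends on."""
    h, w = image.shape[:2]
    rows = (h - patch) // stride + 1
    cols = (w - patch) // stride + 1
    heat = np.zeros((rows, cols))
    for i in range(rows):
        for j in range(cols):
            occluded = image.copy()
            occluded[i*stride:i*stride+patch, j*stride:j*stride+patch] = fill
            heat[i, j] = classifier(occluded)
    return heat
```

Plotting `heat` reproduces panels like (d): the probability drops sharply only when the occluder covers the object itself.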
Correspondence Analysis
• Occlude the same configuration (eyes and nose) in each image and calculate the Hamming distance
• ∆ = Σ_{i≠j} H(sign(ε_i), sign(ε_j)), where ε_i = x_i − x̃_i
• A lower value indicates greater consistency in the change resulting from the masking operation
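The ∆ measure above can be sketched directly - a minimal NumPy version, assuming `features` and `occluded_features` each hold one feature vector per image (original x_i and occluded x̃_i); the function name is my own:

```python
import numpy as np

def correspondence_delta(features, occluded_features):
    """Sum of Hamming distances between sign(eps_i) and sign(eps_j) over all
    ordered image pairs i != j, where eps_i = x_i - x~_i is the feature change
    caused by occluding image i."""
    eps = np.sign(features - occluded_features)  # one signed-change row per image
    n = eps.shape[0]
    delta = 0
    for i in range(n):
        for j in range(n):
            if i != j:
                delta += int((eps[i] != eps[j]).sum())  # Hamming distance
    return delta
```

If occluding the same part (e.g. the left eye) changes every image's features in the same direction, all sign vectors agree and ∆ is small.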
ImageNet 2012
Varying ImageNet Model Sizes
Feature Generalization - Caltech-101
• The foregoing results show that the features of a CNN represent unique topological properties
• Apply this image classifier, trained on ImageNet 2012, to other image datasets!
Caltech-256
PASCAL 2012
Feature Analysis
• Compare models having different numbers of layers
• As the feature hierarchies become deeper, they learn increasingly powerful features!
Discussion
• Introduced a novel way to visualize the activity within the model
• By visualizing the CNN, AlexNet was improved
• The CNN is highly sensitive to local structure
• Deep models show good performance
• An ImageNet-trained model can generalize well to other datasets
• Shortcoming of this visualization: it only visualizes a single activation, not the joint activity
Bibliography
• Krizhevsky, A., Sutskever, I., Hinton, G.: ImageNet classification with deep convolutional neural networks. In: NIPS (2012)
• LeCun, Y., Boser, B., Denker, J.S., Henderson, D., Howard, R.E., Hubbard, W., Jackel, L.D.: Backpropagation applied to handwritten zip code recognition. Neural Comput. 1(4), 541–551 (1989)
• Zeiler, M., Taylor, G., Fergus, R.: Adaptive deconvolutional networks for mid and high level feature learning. In: ICCV (2011)