Understanding Deep Image Representations by Inverting Them
Paper by Aravindh Mahendran and Andrea Vedaldi
Presentation by Anthony Chen
Background
● Feature extraction methods such as SIFT, HOG, and CNNs are widely used, but they are difficult to understand from an information-preservation standpoint: what does the representation keep about the input, and what does it discard?
Contributions
● A novel method to invert representations.
– That is, given a function and its output, recover the original input.
● Analysis of the information preservation of different types of representation (CNN, HOG, SIFT).
Related Work
● DeConvNets
Your thoughts on similarities/differences?
Related Work (2)
● DeConvNets – My thoughts
– DeConvNet reconstructions are encouraged to look like the original image, while this paper enforces no such constraint.
– Therefore, while both can be thought of as inverses, DeConvNet studies how results are obtained, whereas this paper studies information representation and preservation.
Inverting Images
● Let Φ : ℝ^(H×W×C) → ℝ^d be the function representing the CNN.
● Let x0 be the original image, with code Φ0 = Φ(x0).
● Goal: find an image x such that Φ(x) is close to Φ0.
Inverting Images (2)
● We want to find an image x, which we will call x*, such that:
  x* = argmin_x ℓ(Φ(x), Φ0) + λ·R(x),  x ∈ ℝ^(H×W×C)
● Here, we add a regularizer R(x) to ensure that the optimization searches only over “natural images” (a sketch of this search follows below).
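A minimal sketch of this optimization in PyTorch, assuming a differentiable feature extractor `phi` (the network truncated at the layer of interest) and hypothetical hyperparameters; the paper's actual schedule and regularizer weights differ:

```python
import torch

def invert(phi, x0, lam=1e-2, steps=200, lr=0.1):
    """Search for an image x whose code phi(x) matches phi(x0)."""
    phi0 = phi(x0).detach()                                  # target code Φ0
    x = (0.01 * torch.randn_like(x0)).requires_grad_(True)   # start from noise
    opt = torch.optim.SGD([x], lr=lr, momentum=0.9)
    for _ in range(steps):
        opt.zero_grad()
        loss = (phi(x) - phi0).pow(2).sum() / phi0.pow(2).sum()  # normalized loss
        reg = lam * x.abs().pow(6).sum()   # α-norm prior (α = 6); the paper also
        (loss + reg).backward()            # adds a total-variation term, omitted here
        opt.step()
    return x.detach()
```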
Inverting Images (3)
● Given a candidate reconstruction x, the reconstruction error is the Euclidean distance between codes:
  ℓ(Φ(x), Φ0) = ||Φ(x) − Φ0||²
● Additional modification: to ensure that the loss near the solution is bounded in a [0, 1) range, it is normalized by ||Φ0||² and the image is rescaled:
  ℓ = ||Φ(σ·x) − Φ0||² / ||Φ0||²
  where σ is the mean (Euclidean) norm of the images in our test set; rescaling by σ keeps the loss and regularizer terms comparable.
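A toy numeric check of why this normalization bounds the loss near the solution (made-up 2-D “codes”, not the paper's features):

```python
import numpy as np

phi0 = np.array([3.0, 4.0])              # target code, ||Φ0||² = 25
for phi_x in (np.array([3.0, 4.0]),      # perfect reconstruction
              np.array([2.0, 4.0]),      # nearby reconstruction
              np.zeros(2)):              # degenerate all-zero image
    err = np.sum((phi_x - phi0) ** 2) / np.sum(phi0 ** 2)
    print(err)                           # 0.0, 0.04, 1.0
```

At the optimum the loss is exactly 0, and any reconstruction whose code is closer to Φ0 than the zero code scores below 1, so values near the solution stay in [0, 1).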
Regularizers
● Let x be a mean-subtracted image vector.
● The α-norm Rα(x) = ||x||α^α enforces the range of pixel values.
● Total variation RVβ(x):
– Penalizes images with large total gradients.
– Discrete version (both regularizers are sketched in code below):
  RVβ(x) = Σ over i,j of ((x(i,j+1) − x(i,j))² + (x(i+1,j) − x(i,j))²)^(β/2)
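A minimal NumPy sketch of both regularizers, assuming x is an H×W grayscale image (the paper applies them to full color images):

```python
import numpy as np

def alpha_norm(x, alpha=6):
    """Rα(x) = ||x||α^α on the mean-subtracted image: penalizes large pixels."""
    x = x - x.mean()
    return np.sum(np.abs(x) ** alpha)

def total_variation(x, beta=2):
    """Discrete RVβ(x): penalizes images with large total gradients."""
    dx = x[:, 1:] - x[:, :-1]            # horizontal finite differences
    dy = x[1:, :] - x[:-1, :]            # vertical finite differences
    # sum over the interior region where both differences exist
    return np.sum((dx[:-1, :] ** 2 + dy[:, :-1] ** 2) ** (beta / 2))
```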
Regularizers (2)
● Rα allows us to set the range of the pixel values: for large α the penalty acts like a soft constraint on the maximum pixel magnitude, so scaling λα appropriately confines pixels to roughly [−B, B].
● RVβ allows us to say how much spatial variability the reconstruction should have.
Final objective function

E(x) = ||Φ(σ·x) − Φ0||² / ||Φ0||² + λα·Rα(x) + λVβ·RVβ(x)

(the normalized loss combined with the two regularizers defined above)
Optimization
● Momentum-based gradient descent is used to minimize the objective function.
● The momentum has a decay factor of 0.9:
  μ(t+1) = 0.9·μ(t) − ηt·∇E(x(t)),  x(t+1) = x(t) + μ(t+1)
● Because the CNN's function is differentiable, the objective is easy to optimize; HOG and SIFT are not natively differentiable, so they are reimplemented as CNNs (next slides).
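A sketch of the momentum update above, with a hypothetical `grad_E` returning the gradient of the objective; the paper's learning-rate schedule is omitted:

```python
import numpy as np

def momentum_descent(grad_E, x, lr=0.01, m=0.9, steps=1000):
    """Gradient descent with momentum: mu accumulates past gradients,
    decaying by the factor m = 0.9 at each step."""
    mu = np.zeros_like(x)
    for _ in range(steps):
        mu = m * mu - lr * grad_E(x)  # decay old momentum, add new gradient step
        x = x + mu                    # move along the smoothed direction
    return x
```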
Representations: CNN
Representations: SIFT and HOG
● DSIFT and HOG are implemented with a CNN architecture, which makes their gradients easy to compute.
● Orientation binning is approximated using a ReLU layer (see the sketch after this list).
● Pooling into cell histograms is done by a linear filter.
● Cell blocks are then normalized by a normalization layer.
● Maximum values are then capped using a ReLU unit.
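A rough sketch of the ReLU-based orientation binning step above; the directional filters, bin count, and exact approximation used in the paper may differ:

```python
import numpy as np

def orientation_binning(gx, gy, num_bins=8):
    """Soft-assign image gradients (gx, gy) to orientation bins.

    Projecting the gradient onto each bin's direction and rectifying with
    a ReLU approximates hard binning: only gradients roughly aligned with
    a bin contribute to it."""
    responses = []
    for k in range(num_bins):
        theta = 2.0 * np.pi * k / num_bins
        proj = gx * np.cos(theta) + gy * np.sin(theta)  # directional response
        responses.append(np.maximum(proj, 0.0))         # ReLU gating
    return np.stack(responses)  # shape: (num_bins, H, W)
```

Pooling these per-bin maps into cell histograms is then just a linear (box) filter, and block normalization is a pointwise normalization layer, as the remaining bullets describe.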
Results
● Normalized reconstruction error: ||Φ(x*) − Φ0|| / NΦ
● NΦ is the normalization constant: the average pairwise Euclidean distance computed across 100 test images.
● Hyperparameters: λα = 2.16×10⁸, λVβ = 5, β = 2.
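A sketch of this error measure, assuming NΦ is computed over the representation vectors of the test images (one vector per image):

```python
import numpy as np

def normalization_constant(reps):
    """NΦ: average pairwise Euclidean distance over the test set."""
    n = len(reps)
    dists = [np.linalg.norm(reps[i] - reps[j])
             for i in range(n) for j in range(i + 1, n)]
    return np.mean(dists)

def normalized_error(phi_xstar, phi0, n_phi):
    """Reconstruction error, scaled to be comparable across representations."""
    return np.linalg.norm(phi_xstar - phi0) / n_phi
```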
Results: SIFT and HOG
● Using bilinear orientation binning (the “b” in HOGb) greatly improves the reconstruction.
Results: SIFT and HOG (2)
Results: CNN
● Experiments are run allowing different amounts of total variation:
– λ1 = 0.5, λ2 = 5, λ3 = 50 (increasing weight on the TV regularizer)
Test Images
Results: CNN (2)
Results: CNN (3)
● Reconstructing the image from only a subset of the network's layers illustrates what that subset encodes.
Results (4): Variance in Reconstruction
Effects of parameter tuning
Decreasing the regularization constant leads to higher-variance reconstructions. These images are indistinguishable to the representation (they yield nearly identical codes) and still achieve good reconstruction errors.
Future Work
● Use this inversion technique to guide improvements to CNN architectures.
● Apply the technique to other kinds of neural networks (e.g., LSTMs)?
Conclusion
● This paper provides a novel method to study and visualize information preservation in a CNN.
● It formalizes the relationship between CNNs and shallow feature representations such as HOG and SIFT.