shuai-zhang
1. Introduction
2. Convolution
3. ReLU
4. Pooling
School of Computer Science and Engineering
5. Example by TensorFlow
1.1 Definition
• A CNN is a specialized kind of neural network for processing data that has a known, grid-like topology, such as time series (a 1D grid) or image data (a 2D grid).
• CNNs are supervised deep learning models, used in fields such as speech recognition, image retrieval, and face recognition.
1.1 Definition
• ImageNet Classification with Deep Convolutional Neural Networks (cited by 9538; NIPS 2012; Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton) built a CNN with 60 million parameters and 650,000 neurons, consisting of five convolutional layers.
• A typical CNN is a five-layer architecture consisting of convolution layers, pooling layers, and a classification layer.
• Convolution layer: extracts distinctive features from the input image.
• Pooling layer: reduces dimensionality.
• CNNs are generally trained with the back-propagation algorithm.
1.2 Motivation
• MLPs (multi-layer perceptrons) do not scale well to large images
• MLPs ignore the correlation between neighbouring pixels
• MLPs are not robust to image transformations such as translation
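The scaling problem can be made concrete with a quick parameter count. The input size (224x224x3) and layer widths below are illustrative assumptions, not figures from the slides:

```python
# Parameter count: one fully connected (MLP) layer vs. one conv layer.
# Sizes here are illustrative assumptions, not from the slides.
input_pixels = 224 * 224 * 3               # a modest colour image, flattened
hidden_units = 1000
fc_weights = input_pixels * hidden_units   # dense layer: one weight per pixel per unit
conv_weights = 5 * 5 * 3 * 32              # 32 filters of 5x5x3, weights shared everywhere
print(fc_weights)    # 150528000 weights for a single dense layer
print(conv_weights)  # 2400 weights for the conv layer
```

Weight sharing is what keeps the convolutional parameter count independent of the image size.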
2.1 Why Convolution?
• Convolution preserves the spatial relationship between pixels by learning image features over small square patches of the input data.
• Small kernels can detect small, meaningful features such as edges.
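A minimal sketch of the sliding-window operation described above, using a tiny hypothetical image and kernel. Note that, following the usual deep-learning convention, the kernel is not flipped, so this is technically cross-correlation:

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Naive 2D 'valid' convolution (no padding, stride 1).
    The kernel slides over every position where it fully fits."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            # elementwise product of the patch and the kernel, then sum
            out[y, x] = np.sum(image[y:y + kh, x:x + kw] * kernel)
    return out

image = np.array([[1, 2, 3],
                  [4, 5, 6],
                  [7, 8, 9]], dtype=float)
kernel = np.array([[1, 0],
                   [0, 1]], dtype=float)  # adds each pixel to its diagonal neighbour
print(conv2d_valid(image, kernel))  # [[ 6.  8.] [12. 14.]]
```

A 3x3 input with a 2x2 kernel yields a 2x2 output, matching the valid-convolution size formula (n - k + 1).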
A 2D convolution example from the Deep Learning book
2.2 Convolution Example
School of Computer Sicience and Engineering
Different filters can detect different features.
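As one illustration of filters picking out features, a simple difference kernel (an assumption for this sketch, not a filter from the slides) responds only where a vertical edge is present:

```python
import numpy as np

# A step image: dark on the left, bright on the right.
img = np.array([[0, 0, 10, 10],
                [0, 0, 10, 10],
                [0, 0, 10, 10]], dtype=float)

# A simple 1x2 vertical-edge kernel: right pixel minus left pixel.
edge_kernel = np.array([[-1, 1]], dtype=float)

h, w = img.shape
kh, kw = edge_kernel.shape
response = np.zeros((h - kh + 1, w - kw + 1))
for y in range(response.shape[0]):
    for x in range(response.shape[1]):
        response[y, x] = np.sum(img[y:y + kh, x:x + kw] * edge_kernel)

print(response)  # nonzero only in the column where the brightness jumps
```

Swapping in a horizontal-difference kernel would instead light up at horizontal edges, which is the sense in which different filters detect different features.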
3 ReLU
• Introduces non-linearity into the network
Other non-linear functions such as tanh or sigmoid can also be used instead of ReLU, but ReLU has been found to perform better in most situations.
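ReLU itself is just an element-wise max with zero applied to the feature map:

```python
import numpy as np

def relu(x):
    """Element-wise ReLU: negative activations are clipped to zero."""
    return np.maximum(0, x)

feature_map = np.array([[-3.0, 2.0],
                        [ 0.5, -1.0]])
print(relu(feature_map))  # [[0.  2. ] [0.5 0. ]]
```

Positive values pass through unchanged; negatives become zero, which is where the non-linearity comes from.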
4.1 Motivation of Pooling
• Reduces dimensionality.
• In all cases, pooling helps make the representation approximately invariant to small translations of the input.
• Invariance to local translation can be a very useful property if we care more about whether some feature is present than exactly where it is.
• Types of pooling: max (works best in practice), average, sum.
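A minimal sketch of 2x2 max pooling with stride 2, the variant the slides favour (assuming, for simplicity, an even-sized feature map):

```python
import numpy as np

def max_pool_2x2(fmap):
    """2x2 max pooling with stride 2 (assumes even height and width).
    Reshape splits the map into 2x2 blocks; max is taken within each block."""
    h, w = fmap.shape
    return fmap.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

fmap = np.array([[1, 3, 2, 1],
                 [4, 6, 5, 0],
                 [7, 2, 9, 8],
                 [1, 0, 3, 4]], dtype=float)
print(max_pool_2x2(fmap))  # [[6. 5.] [7. 9.]]
```

Each output value is the largest activation in its 2x2 window, so shifting a feature by one pixel within a window leaves the output unchanged: that is the translation invariance described above. Average or sum pooling would replace `.max` with `.mean` or `.sum`.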
5 Example by TensorFlow
• zero-padding the 28x28x1 image to 32x32x1
• applying 5x5x32 convolution to get 28x28x32
• max-pooling down to 14x14x32
• zero-padding the 14x14x32 to 18x18x32
• applying 5x5x32x64 convolution to get 14x14x64
• max-pooling down to 7x7x64.
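The shape arithmetic behind those steps can be checked with a few one-line helpers; this computes only the spatial sizes, not the layers themselves:

```python
# Layer-by-layer spatial-size arithmetic for the network above
# (MNIST-style 28x28 input); channel counts are omitted.
def pad(size, p):  return size + 2 * p   # zero-padding by p on each side
def conv(size, k): return size - k + 1   # 'valid' convolution with a k x k kernel
def pool(size):    return size // 2      # 2x2 max pooling, stride 2

s = 28
s = pad(s, 2);  assert s == 32   # zero-pad 28 -> 32
s = conv(s, 5); assert s == 28   # 5x5 conv  32 -> 28
s = pool(s);    assert s == 14   # max-pool  28 -> 14
s = pad(s, 2);  assert s == 18   # zero-pad  14 -> 18
s = conv(s, 5); assert s == 14   # 5x5 conv  18 -> 14
s = pool(s);    assert s == 7    # max-pool  14 -> 7
print(s)  # 7, matching the final 7x7x64 feature map
```

Padding by 2 before each 5x5 valid convolution is exactly what keeps the spatial size unchanged, which is why TensorFlow calls this combination 'SAME' padding.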
References
• http://ufldl.stanford.edu/tutorial/supervised/ConvolutionalNeuralNetwork
• http://cs231n.github.io/convolutional-networks/
• https://ujjwalkarn.me/2016/08/11/intuitive-explanation-convnets/
• Deep Learning Book
• http://www.slideshare.net/ssuser06e0c5/explanation-on-tensorflow-example-deep-mnist-for-expert
• http://shuaizhang.tech/2016/12/08/Tensorflow%E6%95%99%E7%A8%8B2-Deep-MNIST-Using-CNN/