Efficient Convolutional Neural Network Architecture
for Image Classification
Presented By:
Yogendra Tamang (MSCS-070-670)
Supervisor: Prof. Dr. Sashidhar Ram Joshi
Outline
• Background
• Convolutional Neural Network
• Objectives
• Methodology
• Work Accomplished
• Work Remaining
• References
Background
• Learning
  • Supervised
  • Unsupervised
• AI Tasks
  • Classification and Regression
  • Clustering
[Diagram: machine learning problems, split into supervised (classification, regression) and unsupervised (clustering)]
Background
• Classification
  • Classifies data into one of several discrete classes
  • E.g. classifying digits
  • Typical cost functions for classification are the logistic regression cost or the negative log-likelihood
• Regression
  • Predicts a continuous, real-valued output
  • E.g. stock price prediction
  • A typical cost function for regression problems is MSE (mean squared error), as sketched below
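As an illustration (not from the slides), both costs fit in a few lines of NumPy; here y denotes the targets, y_hat the predicted values, and p the predicted probabilities:

import numpy as np

def mse(y, y_hat):
    # Mean squared error: average squared difference between
    # targets and predictions (regression cost).
    return np.mean((y - y_hat) ** 2)

def logistic_loss(y, p):
    # Binary log-likelihood cost (logistic regression): y in {0, 1},
    # p is the predicted probability of class 1.
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))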
Multi-Layer Perceptrons (MLPs)

[Diagram: input layer -> hidden layer -> output layer]
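A minimal sketch (illustrative, not from the slides) of one forward pass through such a network, with a single hidden layer, sigmoid activations, and weight names W1, b1, W2, b2 chosen for the example:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mlp_forward(x, W1, b1, W2, b2):
    # Input layer -> hidden layer -> output layer, matching the diagram.
    h = sigmoid(W1 @ x + b1)   # hidden layer activations
    y = sigmoid(W2 @ h + b2)   # output layer activations
    return y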
Convolutional Neural Networks
• One or more convolutional layers
• Followed by one or more fully connected layers
• Resulting in networks that are easy to train and have many fewer parameters, as the worked example below shows
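To make the parameter saving concrete (numbers chosen purely for illustration): a fully connected layer mapping a 32x32x3 image to 100 hidden units needs 32·32·3·100 = 307,200 weights, while a convolutional layer with ten 5x5x3 filters needs only 750 weights (plus biases), because each filter's weights are shared across every position in the image.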
Objectives
• To classify images using CNNs
• To design an effective CNN architecture for the image classification task
Convolutional Neural Networks
• Receptive fields (RFs): apply a filter across local patches of the image (sketched below)
• Pooling and subsampling layers
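A minimal single-channel convolution sketch (illustrative; the slides contain no code) showing how a filter is applied over local receptive fields:

import numpy as np

def conv2d(image, kernel):
    # Slide the kernel over every valid position of the image;
    # each output value is the filter response of one receptive field.
    # (As is common in CNNs, this is cross-correlation: no kernel flip.)
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out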
Convolutional Neural Network
Methodology
• Training set
• Validation set
• Testing set
Methodology
• Convolutional layer design
Methodology
• Pooling layer design (sketched below)
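A max-pooling sketch under the same illustrative assumptions (2x2 non-overlapping windows):

import numpy as np

def max_pool(feature_map, size=2):
    # Keep the maximum of each size x size block, halving the
    # spatial resolution (subsampling).
    h, w = feature_map.shape
    out = feature_map[:h - h % size, :w - w % size]
    out = out.reshape(h // size, size, w // size, size)
    return out.max(axis=(1, 3))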
Methodology
• Example CNN architecture
Learning a Classifier
• Gradient descent algorithm:
  • Calculate the cost (loss) function J(θ)
  • Calculate the gradient
  • Update the weights
• Stochastic gradient descent (SGD): updates the weights after each training example
• Minibatch SGD: updates the weights after each minibatch, as sketched below
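One epoch of minibatch SGD, sketched in NumPy; the learning rate, batch size, and the gradient function grad_J are illustrative placeholders, not from the slides:

import numpy as np

def minibatch_sgd(theta, data, grad_J, lr=0.01, batch_size=32):
    # One epoch: shuffle the data, then update the weights after
    # each minibatch: theta := theta - lr * dJ/dtheta.
    np.random.shuffle(data)
    for start in range(0, len(data), batch_size):
        batch = data[start:start + batch_size]
        theta = theta - lr * grad_J(theta, batch)
    return theta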
Learning a Classifier: Negative Log-Likelihood

NLL(θ, 𝒟) = −∑_{i=0}^{|𝒟|} log P(Y = y^(i) | x^(i), θ)

where 𝒟 is the dataset, θ is the weight (parameter) vector, x^(i) is the i-th training example, and y^(i) is its target class.
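The formula translates directly into NumPy (illustrative; probs holds the model's predicted class distributions):

import numpy as np

def nll(probs, y):
    # probs: (N, num_classes) predicted probabilities P(Y | x, theta)
    # y:     (N,) integer target classes y^(i)
    return -np.sum(np.log(probs[np.arange(len(y)), y]))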
Work Accomplished
1. GPU configuration to support CUDA
2. CNN architecture for the CIFAR-10 dataset
3. CNN architecture for the MNIST dataset (see the sketch below):
   INPUT -> CONV -> MAXPOOL -> CONV -> MAXPOOL -> FULL -> OUTPUT
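The slides do not say which framework was used (the cited tutorials suggest Theano); purely as an illustration, the same layer sequence can be written as a Keras sketch, with filter counts and layer sizes as assumptions rather than the author's settings:

from tensorflow import keras
from tensorflow.keras import layers

# INPUT -> CONV -> MAXPOOL -> CONV -> MAXPOOL -> FULL -> OUTPUT
model = keras.Sequential([
    layers.Input(shape=(28, 28, 1)),          # MNIST digit images
    layers.Conv2D(32, (5, 5), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (5, 5), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(256, activation="relu"),     # fully connected layer
    layers.Dense(10, activation="softmax"),   # one output per digit class
])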
MNIST Dataset Training and Output
Training Loss, Validation Loss, Validation Accuracy on MNIST Dataset
[Plot: CNN running over the MNIST dataset; training loss, validation loss, and validation accuracy (y-axis, 0 to 1.2) versus training epochs 1 to 10 (x-axis)]
Work Remaining
• Dropout implementation (sketched below)
• Parameter tuning
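Dropout, listed above as remaining work, can be sketched as randomly zeroing activations at training time (inverted-dropout form; the rate and interface are assumptions, not the author's implementation):

import numpy as np

def dropout(activations, rate=0.5, training=True):
    # During training, zero each unit with probability `rate` and
    # rescale the survivors so the expected activation is unchanged.
    if not training:
        return activations
    mask = (np.random.rand(*activations.shape) >= rate)
    return activations * mask / (1.0 - rate)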
Time Schedule