Learning Ensembles of Convolutional Neural Networks
Liran Chen
Faculty mentor: Greg Shakhnarovich

Source: theorycenter.cs.uchicago.edu/REU/2014/presentations/chen.pdf


Page 1

Learning Ensembles of Convolutional Neural Networks

Liran Chen
Faculty mentor: Greg Shakhnarovich

Page 2

Motivation and Data

Build a model, or a set of models, to classify the images efficiently.

The Modified National Institute of Standards and Technology database (MNIST) is a large database of handwritten digits that is commonly used for training various image processing systems.

The database contains 60,000 training images and 10,000 testing images.

Page 3

Convolutional Neural Network

• Inspired by biological processes

• A type of feed-forward artificial neural network

• Individual neurons are tiled in such a way that they respond to overlapping regions in the visual field

• Widely used models for image recognition/classification

• Composed of convolutional layers, fully connected layers, a softmax layer, etc.
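The core operation of a convolutional layer can be sketched in plain Python. This is an illustrative minimal version (single channel, stride 1, no padding), not the implementation used in the experiments:

```python
def conv2d_valid(image, kernel):
    """Slide a k x k kernel over a 2D image ("valid" mode: no padding, stride 1)."""
    n, k = len(image), len(kernel)
    out_size = n - k + 1
    out = [[0.0] * out_size for _ in range(out_size)]
    for i in range(out_size):
        for j in range(out_size):
            # Each output cell is the dot product of the kernel with one
            # overlapping region of the visual field.
            out[i][j] = sum(
                image[i + a][j + b] * kernel[a][b]
                for a in range(k) for b in range(k)
            )
    return out

# A 3x3 "image" convolved with a 2x2 averaging kernel gives a 2x2 feature map.
feature_map = conv2d_valid(
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    [[0.25, 0.25], [0.25, 0.25]],
)
```

In a real CNN many such kernels are learned per layer, and their outputs pass through a nonlinearity before the next layer.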

Page 4

Page 5

Training Procedure

Gradient Descent ("hill climbing" in reverse: descending the error surface)

• Batch

• Mini-batch

• Stochastic

• Online
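The variants above differ only in how many examples feed each parameter update: the full set (batch), a small subset (mini-batch), or a single example (stochastic/online). A minimal sketch on a toy least-squares problem; the data and function names here are hypothetical, not from the slides:

```python
import random

def gd_step(w, batch, lr=0.1):
    """One gradient step for least squares y ~ w*x on a batch of (x, y) pairs."""
    grad = sum(2 * (w * x - y) * x for x, y in batch) / len(batch)
    return w - lr * grad

data = [(x, 3.0 * x) for x in [0.2, 0.5, 1.0, 1.5]]  # true slope is 3.0

random.seed(0)
w = 0.0
for _ in range(200):
    # Batch GD would pass `data` itself; stochastic GD a single example.
    w = gd_step(w, random.sample(data, 2))  # mini-batch of size 2
```

With noise-free data every batch agrees on the minimizer, so all variants converge to the same slope; on real data smaller batches trade gradient accuracy for cheaper, more frequent updates.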

Page 6

Training Procedure (continued)

• Epoch: one full pass through the training data

• The training result depends on the stochastic procedure

• The model is non-convex

• Randomness comes from:

  the order of the data in each epoch

  the initialization of the parameters

  the learning rates
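These sources of randomness are exactly what make independently trained networks differ, which ensembling later exploits. A small pure-Python sketch (hypothetical shapes) of how a seed controls initialization and data order:

```python
import random

def init_and_order(seed, n_params=3, n_examples=5):
    """Two of the main sources of run-to-run randomness: init and data order."""
    rng = random.Random(seed)
    # Random initialization of the parameters.
    weights = [rng.uniform(-0.1, 0.1) for _ in range(n_params)]
    # Random order of the data in each epoch.
    order = list(range(n_examples))
    rng.shuffle(order)
    return weights, order

# Same seed -> identical run; different seeds -> distinct ensemble members.
run_a = init_and_order(seed=0)
run_b = init_and_order(seed=0)
run_c = init_and_order(seed=1)
```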

Page 7

Ensemble Learning

A method for independently generating multiple versions of a predictor network and using them to obtain an aggregated prediction.
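For classification, the simplest aggregation is to average the class probabilities produced by each network and predict the highest-scoring class. A minimal sketch with made-up probabilities:

```python
def average_predictions(prob_lists):
    """Aggregate per-class probabilities from several networks by averaging."""
    n = len(prob_lists)
    return [sum(p[i] for p in prob_lists) / n
            for i in range(len(prob_lists[0]))]

# Three hypothetical networks' class probabilities for one image:
nets = [[0.7, 0.2, 0.1],
        [0.5, 0.4, 0.1],
        [0.6, 0.1, 0.3]]
avg = average_predictions(nets)
label = max(range(len(avg)), key=avg.__getitem__)  # predicted class index
```

Because each network's errors are partly independent (different seeds, data order, initialization), the averaged prediction tends to be more accurate than any single member.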

Page 8

(Krizhevsky et al., 2012)

The network has five convolutional layers and three fully connected layers.

Averaging the predictions of five similar CNNs reduces the error rate from 18.2% to 16.4%.

Adding an extra sixth convolutional layer over the last pooling layer and then "fine-tuning" on ILSVRC-2012 gives an error rate of 16.6%; averaging the predictions of seven such CNNs reduces the error to 15.4%.

Page 9

(Zeiler & Fergus, 2013)

Page 10

(Naftaly et al., 1996)

Autoregression model. Q is the number of models used in the ensemble; increasing Q reduces the error.

Page 11

Ensemble Learning

30 CNNs, each trained for 1 to 20 epochs.

Increasing the number of epochs does not reduce the error.

Page 12

Ensemble Learning

Increasing the number of averaged networks reduces the error.

Beyond a threshold, no further accuracy is gained.

Page 13

Ensemble Learning

The tradeoff between the number of models and their complexity.

Fix nnet*epoch = 30.

Given enough machines, training time is reduced while accuracy is gained.

Page 14

Ensemble Learning

New Softmax Layer

• Stack the class probabilities produced on the training set and use them to train a new softmax layer
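Instead of simple averaging, the member networks' probabilities can be concatenated ("stacked") into one feature vector feeding a new softmax layer. A sketch with hypothetical, untrained weights; in practice the weight matrix would be learned on the training set:

```python
import math

def softmax(zs):
    """Numerically stable softmax over a list of logits."""
    m = max(zs)
    exps = [math.exp(z - m) for z in zs]
    s = sum(exps)
    return [e / s for e in exps]

def stacked_features(per_net_probs):
    """Concatenate each network's class probabilities into one feature vector."""
    return [p for net in per_net_probs for p in net]

# Two hypothetical 3-class networks -> a 6-dimensional stacked input.
feats = stacked_features([[0.7, 0.2, 0.1], [0.5, 0.4, 0.1]])

# A hypothetical weight matrix for the new softmax layer (3 classes x 6 inputs).
weights = [[0.5, 0.1, 0.0, 0.4, 0.2, 0.0],
           [0.0, 0.6, 0.1, 0.1, 0.5, 0.0],
           [0.0, 0.0, 0.7, 0.0, 0.0, 0.8]]
logits = [sum(w * f for w, f in zip(row, feats)) for row in weights]
probs = softmax(logits)
```

Unlike plain averaging, the trained layer can learn to weight reliable networks (or reliable network/class pairs) more heavily.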

Page 15

Bagging

• The multiple versions are formed by making bootstrap replicates of the learning set and using these as new learning sets

• Aggregate the predictors by voting

(Breiman, 1994)
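The two ingredients of bagging (bootstrap replicates, then voting) are easy to sketch; the labels here are hypothetical placeholders:

```python
import random
from collections import Counter

def bootstrap(data, rng):
    """Sample len(data) examples with replacement: one bootstrap replicate."""
    return [rng.choice(data) for _ in data]

def vote(predictions):
    """Aggregate one label per predictor by majority vote."""
    return Counter(predictions).most_common(1)[0][0]

rng = random.Random(0)
learning_set = list(range(10))
# Three "new learning sets"; each would train its own predictor.
replicates = [bootstrap(learning_set, rng) for _ in range(3)]

final_label = vote(["cat", "dog", "cat"])  # the ensemble's prediction
```

Each replicate omits some examples and duplicates others, so the resulting predictors differ even with a deterministic training procedure.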

Page 16

Advantage of Ensemble Learning

• Gain accuracy

• Save time

Page 17

Theoretical Work

Page 18

Theoretical Work

Page 19

Appendix I

Page 20

Page 21

Page 22

Page 23

Appendix II

Reducing the number of units in the fully connected layer from 30 to 1.

With this smaller network, increasing the number of epochs does contribute to reducing the testing error.

Page 24

Ensemble the networks

Fix the total number of epochs trained, i.e., nnet*epoch = constant.

Page 25

Raw results table (30 rows of 10 values each; row boundaries restored where values had been run together):

0.5014 0.575 0.4377 0.4502 0.4655 0.3711 0.3704 0.3366 0.336 0.3392
0.2167 0.2777 0.2074 0.2183 0.2402 0.1545 0.1801 0.1489 0.1328 0.15
0.2363 0.2245 0.2892 0.2169 0.183 0.1232 0.1318 0.1229 0.1425 0.1543
0.1141 0.1131 0.1813 0.1133 0.1145 0.0577 0.0601 0.0538 0.0561 0.0594
0.085 0.0952 0.1504 0.0769 0.0869 0.0468 0.0469 0.043 0.0432 0.0441
0.0765 0.0775 0.1281 0.0689 0.0867 0.043 0.0415 0.0381 0.039 0.0405
0.0649 0.0713 0.1259 0.0795 0.0864 0.0405 0.0383 0.036 0.036 0.0355
0.0689 0.0697 0.1237 0.0735 0.0989 0.0397 0.0375 0.035 0.0362 0.0356
0.0652 0.0645 0.0989 0.0596 0.0989 0.0381 0.0354 0.0328 0.0334 0.0339
0.0664 0.0576 0.0846 0.056 0.086 0.0366 0.0347 0.0314 0.034 0.0329
0.0667 0.0556 0.0824 0.0535 0.1025 0.0355 0.0326 0.0304 0.0306 0.0311
0.0642 0.0504 0.0781 0.0517 0.0895 0.0323 0.0326 0.0297 0.0297 0.0288
0.06 0.0443 0.0697 0.0425 0.0704 0.0305 0.0291 0.0266 0.0275 0.0266
0.0582 0.0448 0.0639 0.0425 0.0583 0.031 0.0293 0.0265 0.0271 0.0268
0.0522 0.0416 0.0612 0.0426 0.0501 0.0301 0.0296 0.0268 0.0275 0.0268
0.0526 0.0433 0.0559 0.0428 0.0464 0.031 0.0295 0.0266 0.028 0.0273
0.0526 0.0431 0.0556 0.0412 0.0453 0.0326 0.0305 0.0285 0.0295 0.0286
0.0522 0.0407 0.049 0.039 0.0428 0.03 0.0288 0.0266 0.0276 0.0277
0.0524 0.041 0.0485 0.0366 0.0439 0.0317 0.03 0.0276 0.0287 0.0297
0.0463 0.0392 0.0461 0.0337 0.0411 0.0296 0.0287 0.0259 0.0273 0.0273
0.0476 0.0414 0.0433 0.0337 0.0408 0.0312 0.0295 0.0273 0.0275 0.0283
0.0454 0.0388 0.0406 0.0342 0.0383 0.0298 0.0284 0.026 0.0272 0.0277
0.048 0.0393 0.0392 0.033 0.0386 0.0294 0.0292 0.0261 0.028 0.0276
0.0457 0.0399 0.037 0.0317 0.0377 0.0293 0.0279 0.0254 0.0268 0.0261
0.0459 0.0368 0.0366 0.0318 0.0361 0.028 0.0264 0.0257 0.0266 0.025
0.0459 0.0378 0.0375 0.0313 0.0367 0.0287 0.0267 0.026 0.0268 0.0261
0.0443 0.0373 0.038 0.0314 0.0357 0.0281 0.0274 0.0259 0.026 0.0256
0.0452 0.0371 0.036 0.0314 0.0357 0.0288 0.028 0.0264 0.0268 0.0259
0.0442 0.0376 0.0374 0.0321 0.0378 0.0287 0.0281 0.0257 0.0267 0.0266
0.0455 0.0372 0.0377 0.0323 0.0369 0.0287 0.0282 0.0256 0.026 0.0262