From multiplication to convolutional networks: how to do ML with Theano

Introduction to Deep Learning with Python


Today’s Talk
● A motivating problem
● Understanding a model based framework
● Theano
  ○ Linear Regression
  ○ Logistic Regression
  ○ Net
  ○ Modern Net
  ○ Convolutional Net

Follow along
● Tutorial code at: https://github.com/Newmu/Theano-Tutorials
● Data at: http://yann.lecun.com/exdb/mnist/
● Slides at: http://goo.gl/vuBQfe

A motivating problem

How do we program a computer to recognize a picture of a handwritten digit as a 0-9?

What could we do?

A dataset - MNIST

What if we have 60,000 of these images and their labels?

X = images

Y = labels

X = (60000 x 784) #matrix (list of lists)

Y = (60000) #vector (list)

Given X as input, predict Y
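
A minimal NumPy sketch of these shapes, assuming each image is 28 x 28 pixels (which is where 784 comes from). It uses 1,000 random stand-in images rather than the real 60,000, and the names `images`, `X`, and `Y` are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for MNIST: 1,000 random "images" instead of the real 60,000.
n_examples = 1000
images = rng.integers(0, 256, size=(n_examples, 28, 28), dtype=np.uint8)
Y = rng.integers(0, 10, size=n_examples)   # one label (0-9) per image

# Flatten every 28 x 28 image into one row of 784 pixel values.
X = images.reshape(n_examples, 28 * 28)

print(X.shape)  # (1000, 784)
print(Y.shape)  # (1000,)
```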


An idea

For each image, find the “most similar” image and guess that as the label.

KNearestNeighbors ~95% accuracy
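
A rough sketch of the "most similar image" idea, assuming k = 1 and squared Euclidean distance as the similarity measure (the slide does not say which settings gave ~95%). The toy data and the function name `nearest_neighbor_predict` are hypothetical.

```python
import numpy as np

def nearest_neighbor_predict(train_X, train_Y, test_X):
    """Label each test row with the label of its closest training row."""
    # Squared Euclidean distance between every test row and every training row.
    diffs = test_X[:, None, :] - train_X[None, :, :]
    dists = (diffs ** 2).sum(axis=2)
    nearest = dists.argmin(axis=1)   # index of the most similar training row
    return train_Y[nearest]

# Toy data: two well-separated clusters standing in for two digit classes.
train_X = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.2, 4.9]])
train_Y = np.array([0, 0, 1, 1])
test_X = np.array([[0.2, 0.1], [4.8, 5.1]])

print(nearest_neighbor_predict(train_X, train_Y, test_X))  # [0 1]
```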

Trying things

Make some functions computing relevant information for solving the problem

What we can code

Make some functions computing relevant information for solving the problem

feature engineering

What we can code

Hard-coded rules are brittle, and for many problems good rules aren’t obvious.

A Machine Learning Framework

[Diagram: Inputs -> Computation -> Outputs, with the computation labeled “Model” and 8 shown as the example]


A … model? - GoogLeNet

from arXiv:1409.4842v1 [cs.CV] 17 Sep 2014

A very simple model

Input: 3 -> Computation: mult by x -> Output: 12


Theano intro

[Code slide; the annotations on the code read:]
● imports
● theano symbolic variable initialization
● our model
● compiling to a python function
● usage
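
The code on this slide is not part of the text, so here is a plain-Python sketch of the same define-then-compile workflow. It does not use Theano: `Var`, `Mul`, and `function` are hypothetical stand-ins written for this illustration, and no imports are needed.

```python
class Var:
    """Hypothetical stand-in for a symbolic scalar; holds a name, not a value."""
    def __init__(self, name):
        self.name = name

    def __mul__(self, other):
        return Mul(self, other)

    def evaluate(self, env):
        return env[self.name]


class Mul:
    """Node in the expression graph representing left * right."""
    def __init__(self, left, right):
        self.left, self.right = left, right

    def evaluate(self, env):
        return self.left.evaluate(env) * self.right.evaluate(env)


def function(inputs, outputs):
    """'Compile' an expression into an ordinary Python callable."""
    def compiled(*values):
        env = {var.name: value for var, value in zip(inputs, values)}
        return outputs.evaluate(env)
    return compiled


# symbolic variable initialization
a = Var("a")
b = Var("b")

# our model: nothing is computed yet, y is only a graph
y = a * b

# compiling to a python function
multiply = function(inputs=[a, b], outputs=y)

# usage
print(multiply(3, 4))  # 12
```

The point of the split is that `y` describes the computation once, and the compiled function can then be called on any number of concrete inputs.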


Theano

[Code slide; the annotations on the code read:]
● imports
● training data generation
● symbolic variable initialization
● our model
● model parameter initialization
● metric to be optimized by model
● learning signal for parameter(s)
● how to change parameter based on learning signal
● compiling to a python function
● iterate through data 100 times and train model on each example of input, output pairs
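
The annotated steps can be sketched in NumPy with a hand-derived gradient in place of Theano's symbolic one. The data (y roughly 2 * x plus noise), the squared-error cost, and the 0.01 step size are assumptions; only the 100 passes over the data come from the slide.

```python
import numpy as np

rng = np.random.default_rng(0)

# training data generation (assumed: y is roughly 2 * x plus noise)
trX = np.linspace(-1, 1, 101)
trY = 2 * trX + rng.normal(size=trX.shape) * 0.33

def model(x, w):
    # our model: a single multiplication
    return x * w

w = 0.0                # model parameter initialization
learning_rate = 0.01   # assumed step size

# iterate through data 100 times, training on each (input, output) pair
for epoch in range(100):
    for x, y in zip(trX, trY):
        error = model(x, w) - y           # cost (metric to optimize) is error ** 2
        gradient = 2 * error * x          # learning signal: d(cost)/dw
        w = w - learning_rate * gradient  # how to change the parameter

print(w)  # should end up near 2, the slope used to generate the data
```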


Theano doing its thing

Logistic Regression

[Diagram: T.dot(X, w) -> softmax -> one probability for each class, Zero through Nine. In the example output one class gets 0.7, three classes get 0.1 each, and the rest get 0.]
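
A small NumPy sketch of the computation in the diagram: a dot product followed by a softmax that turns ten scores into ten probabilities. The random inputs and the 0.01 weight scale are placeholders, not values from the talk.

```python
import numpy as np

def softmax(scores):
    """Turn each row of scores into probabilities that sum to 1."""
    exps = np.exp(scores)
    return exps / exps.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
X = rng.random((2, 784))                 # two stand-in flattened images
w = rng.normal(size=(784, 10)) * 0.01    # one column of weights per class

probs = softmax(np.dot(X, w))            # shape (2, 10)
predictions = probs.argmax(axis=1)       # most probable class for each image

print(probs.shape)        # (2, 10)
print(probs.sum(axis=1))  # each row sums to 1 (up to rounding)
```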


Back to Theano

[Code slide; the annotations on the code read:]
● convert to correct dtype
● initialize model parameters
● our model in matrix format
● loading data matrices
● now matrix types
● probability outputs and maxima predictions
● classification metric to optimize
● compile prediction function
● train on mini-batches of 128 examples
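
A NumPy sketch of these steps on synthetic data (MNIST is not loaded here). Cross-entropy as the classification metric, float32 as the dtype, the 0.05 step size, and the 20 passes are all assumptions; the mini-batch size of 128 is from the slide.

```python
import numpy as np

rng = np.random.default_rng(0)

def floatX(a):
    # convert to correct dtype (float32 is an assumption here)
    return np.asarray(a, dtype=np.float32)

def softmax(scores):
    exps = np.exp(scores - scores.max(axis=1, keepdims=True))
    return exps / exps.sum(axis=1, keepdims=True)

def one_hot(labels, n_classes):
    return floatX(np.eye(n_classes)[labels])

# Synthetic stand-in for the data matrices: 3 classes, 20 features.
n_classes, n_features = 3, 20
centers = rng.normal(size=(n_classes, n_features)) * 3
labels = rng.integers(0, n_classes, size=1024)
trX = floatX(centers[labels] + rng.normal(size=(1024, n_features)))
trY = one_hot(labels, n_classes)

# initialize model parameters
w = floatX(rng.normal(size=(n_features, n_classes)) * 0.01)

def predict(X):
    # maxima predictions: the most probable class per row
    return softmax(X @ w).argmax(axis=1)

for epoch in range(20):
    # train on mini-batches of 128 examples
    for start in range(0, len(trX), 128):
        xb, yb = trX[start:start + 128], trY[start:start + 128]
        p = softmax(xb @ w)                 # probability outputs
        grad = xb.T @ (p - yb) / len(xb)    # gradient of mean cross-entropy
        w -= 0.05 * grad

accuracy = np.mean(predict(trX) == labels)
print(accuracy)
```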


What it learns

0 1 2 3 4 5 6 7 8 9

Test Accuracy: 92.5%

An “old” net (circa 2000)

[Diagram: two layers of computation, with example output probabilities over the classes Zero through Nine: 0.9 for one class, 0.1 for another, 0 for the rest.]

h = T.nnet.sigmoid(T.dot(X, wh))

y = softmax(T.dot(h, wo))
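
The two expressions above, written out in NumPy as a forward pass. The hidden layer size of 625 and the random weights are assumptions for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(scores):
    exps = np.exp(scores - scores.max(axis=1, keepdims=True))
    return exps / exps.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
n_hidden = 625                                   # assumed hidden layer size
X = rng.random((4, 784))                         # four stand-in images
wh = rng.normal(size=(784, n_hidden)) * 0.01     # input -> hidden weights
wo = rng.normal(size=(n_hidden, 10)) * 0.01      # hidden -> output weights

h = sigmoid(np.dot(X, wh))    # hidden activations, each in (0, 1)
y = softmax(np.dot(h, wo))    # class probabilities, one row per image

print(h.shape, y.shape)  # (4, 625) (4, 10)
```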


Understanding SGD

2D moons dataset, courtesy of scikit-learn


Understanding Sigmoid Units


An “old” net in Theano

[Code slide; the annotations on the code read:]
● generalize to compute gradient descent on all model parameters
● 2 layers of computation
  ○ input -> hidden (sigmoid)
  ○ hidden -> output (softmax)
● initialize both weight matrices
● updated version of updates
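
A sketch of gradient descent generalized over a list of parameters. In Theano the gradients would come from symbolic differentiation; here a toy cost with a known gradient stands in, and the `sgd` signature and the 0.05 step size are assumptions.

```python
import numpy as np

def sgd(params, grads, lr=0.05):
    """Return one gradient-descent step for every parameter in the list."""
    return [p - lr * g for p, g in zip(params, grads)]

# Toy cost with a known gradient: cost = sum(wh ** 2) + sum(wo ** 2),
# so the gradient for each parameter is simply 2 * parameter.
wh = np.array([[1.0, -2.0], [0.5, 3.0]])
wo = np.array([[4.0], [-1.0]])
params = [wh, wo]

for step in range(100):
    grads = [2 * p for p in params]
    params = sgd(params, grads)

wh, wo = params
print(np.abs(wh).max(), np.abs(wo).max())  # both shrink toward 0
```

Because the update is written over a list, adding a third weight matrix means adding it to `params` and nothing else.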


What an “old” net learns

Test Accuracy: 98.4%

A “modern” net - 2012+

[Diagram: three layers of computation with three noise injections, one of them labeled “Noise (or augmentation)”. Example output probabilities over the classes Zero through Nine: 0.9 for one class, 0.1 for another, 0 for the rest.]

h = rectify(T.dot(X, wh))

h2 = rectify(T.dot(h, wh2))

y = softmax(T.dot(h2, wo))


Understanding rectifier units
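
A minimal sketch of a rectifier next to a sigmoid for comparison, assuming the usual definition max(x, 0). The sample inputs are arbitrary.

```python
import numpy as np

def rectify(x):
    """Rectifier: pass positive values through, set negatives to 0."""
    return np.maximum(x, 0.0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(rectify(x))   # negatives become 0, positives are unchanged
print(sigmoid(x))   # squashed into (0, 1), flattening out at both ends
```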


Understanding RMSprop

2D moons dataset, courtesy of scikit-learn
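
A sketch of an RMSprop-style update: keep a running average of the squared gradient and divide the gradient by its root. The values of `lr`, `rho`, and `epsilon` and the toy cost are assumptions, not settings from the talk.

```python
import numpy as np

def rmsprop_step(param, grad, acc, lr=0.001, rho=0.9, epsilon=1e-6):
    """One RMSprop update; returns the new parameter and running average."""
    # running average of the squared gradient (its magnitude)
    acc_new = rho * acc + (1 - rho) * grad ** 2
    # scale the gradient by the root of that running average
    scaled_grad = grad / np.sqrt(acc_new + epsilon)
    return param - lr * scaled_grad, acc_new

# Toy problem: minimize cost = 100 * w0**2 + 0.01 * w1**2, whose two
# directions have very different gradient magnitudes.
w = np.array([1.0, 1.0])
acc = np.zeros_like(w)
for step in range(2000):
    grad = np.array([200.0 * w[0], 0.02 * w[1]])
    w, acc = rmsprop_step(w, grad, acc, lr=0.001)

print(w)  # both coordinates end up close to 0 despite the different scales
```

With plain gradient descent a single step size would have to suit both directions; dividing by the running magnitude gives each coordinate a comparable step.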


A “modern” net in Theano

[Code slide; the annotations on the code read:]
● rectifier
● numerically stable softmax
● a running average of the magnitude of the gradient
● scale the gradient based on running average
● randomly drop values and scale rest
● Noise injected into model
● rectifiers now used
● 2 hidden layers
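
Two of the annotated pieces sketched in NumPy: a softmax that subtracts the row maximum before exponentiating, and dropout that rescales the surviving values. The drop probability of 0.5 and the test inputs are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(scores):
    """Numerically stable softmax: subtract each row's max before exp."""
    shifted = scores - scores.max(axis=1, keepdims=True)
    exps = np.exp(shifted)
    return exps / exps.sum(axis=1, keepdims=True)

def dropout(x, p=0.5):
    """Randomly drop values with probability p and scale the rest."""
    if p <= 0:
        return x
    keep = 1.0 - p
    mask = rng.binomial(1, keep, size=x.shape)
    return x * mask / keep   # scaling keeps the expected value unchanged

# Large scores would overflow a naive exp(); the shifted version is fine.
print(softmax(np.array([[1000.0, 1001.0, 1002.0]])))

h = np.ones((1000, 100))
dropped = dropout(h, p=0.5)
print((dropped == 0).mean())  # roughly half of the values are dropped
print(dropped.mean())         # close to 1.0, the mean before dropout
```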


What a “modern” net learns

Test Accuracy: 99.0%


Quantifying the difference


What a “modern” net is doing


Convolutional Networks

from deeplearning.net


A convolutional network in Theano

[Code slide; the annotations on the code read:]
● a “block” of computation: conv -> activate -> pool -> noise
● convert from 4tensor to normal matrix
● reshape into conv 4tensor (b, c, 0, 1) format
● now 4tensor for conv instead of matrix
● conv weights (n_kernels, n_channels, kernel_w, kernel_h)
● highest conv layer has 128 filters and a 3x3 grid of responses
● noise during training
● no noise for prediction
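
A NumPy sketch of one conv -> activate -> pool block and the reshapes around it, with the noise step left out. It is a simplification under stated assumptions: the loop-based `conv2d_valid` does not flip its kernels (cross-correlation), and the 32 kernels, 3 x 3 kernel size, and 2 x 2 pooling are illustrative choices.

```python
import numpy as np

def rectify(x):
    return np.maximum(x, 0.0)

def conv2d_valid(x, w):
    """Slide each kernel over x with no padding.

    x: (batch, channels, rows, cols), the (b, c, 0, 1) layout
    w: (n_kernels, n_channels, kernel_rows, kernel_cols)
    This is cross-correlation (kernels are not flipped), a simplification.
    """
    b, c, rows, cols = x.shape
    k, _, kr, kc = w.shape
    out = np.zeros((b, k, rows - kr + 1, cols - kc + 1))
    for i in range(out.shape[2]):
        for j in range(out.shape[3]):
            patch = x[:, :, i:i + kr, j:j + kc]   # (b, c, kr, kc)
            out[:, :, i, j] = np.tensordot(patch, w, axes=([1, 2, 3], [1, 2, 3]))
    return out

def max_pool_2x2(x):
    """Keep the max of every non-overlapping 2 x 2 window."""
    b, c, rows, cols = x.shape
    x = x[:, :, :rows // 2 * 2, :cols // 2 * 2]   # trim odd edges
    return x.reshape(b, c, rows // 2, 2, cols // 2, 2).max(axis=(3, 5))

rng = np.random.default_rng(0)
X = rng.random((8, 784)).reshape(-1, 1, 28, 28)   # reshape into (b, c, 0, 1)
w = rng.normal(size=(32, 1, 3, 3)) * 0.01         # 32 kernels of size 3 x 3

block = max_pool_2x2(rectify(conv2d_valid(X, w)))   # conv -> activate -> pool
flat = block.reshape(block.shape[0], -1)            # 4tensor -> normal matrix

print(block.shape)  # (8, 32, 13, 13)
print(flat.shape)   # (8, 5408)
```

By the same flattening, a top layer with 128 filters and a 3x3 grid of responses gives 128 * 3 * 3 = 1152 values per image.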


What a convolutional network learns

Test Accuracy: 99.5%

Takeaways
● A few tricks are needed to get good results
  ○ Noise is important for regularization
  ○ Rectifiers for faster, better learning
  ○ Don’t use plain SGD - there are lots of cheap, simple improvements (such as RMSprop)
● Models need room to compute.
● If your data has structure, your model should respect it.

Resources
● More in-depth theano tutorials
  ○ http://www.deeplearning.net/tutorial/
● Theano docs
  ○ http://www.deeplearning.net/software/theano/library/
● Community
  ○ http://www.reddit.com/r/machinelearning

A plug

Keep up to date with indico: https://indico1.typeform.com/to/DgN5SP


Questions?