
Neural Networks and Deep Learning
Tijmen Blankevoort

Scyfer

Prof. dr. Max Welling, Drs. Jorgen Sandig, Taco Cohen MSc

Deep Learning

All-purpose machine learning

Using Neural Networks:
- Using large amounts of data
- Learning very complex problems
- Automatically learning features

A new era of machine learning

Deep learning wins all competitions:
- IJCNN 2011 Traffic Sign Recognition Competition
- ISBI 2012 Segmentation of neuronal structures in EM stacks challenge
- ICDAR 2011 Chinese handwriting recognition

Applications
A lot of state-of-the-art systems use deep learning to some extent:
- IBM's Watson: Jeopardy! contest 2011
- Google's self-driving car
- Google Glass
- Facebook face recognition
- Facebook user modelling
Mostly image and sound recognition tasks (difficult)

Google Brain (2011)
- 10 million YouTube/ImageNet images
- 1 billion parameters
- 16,000 processors
- Largely unsupervised!
- 20,000 categories
- 15.8% accuracy

Bigger, better
Deep learning:
- The scope of what computers can learn has been greatly increased
- Interaction with the real world

Biological Inspiration

Neuron

Neuron computer model

Activation function

Sigmoid activation function

Neuron computer model
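To make the model concrete, here is a minimal sketch in Python of a single neuron: a weighted sum of its inputs plus a bias, passed through the sigmoid activation sigma(z) = 1 / (1 + exp(-z)). The example weights and inputs are illustrative assumptions, not from the slides.

import math

def sigmoid(z):
    # Sigmoid activation: squashes any real z into (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

def neuron(x, w, b):
    # Weighted sum of inputs plus bias, then the activation function.
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return sigmoid(z)

print(neuron([1.0, 0.5], [0.8, -0.4], 0.1))  # z = 0.7, output ~0.67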

Perceptron (Rosenblatt, 1957)

Easy functions with a neuron

Linking neurons and training

- Initialize randomly
- Sequentially give it data
- See what the difference is between the network output and the actual output
- Update the weights according to this error
- End result: give the model an input, and it produces a proper output (see the sketch below)
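A hedged sketch of that training loop for a single threshold neuron, using the classic perceptron update rule. The learning rate, epoch count, and AND-gate data are illustrative assumptions.

import random

def train_perceptron(data, epochs=100, lr=0.1):
    # data: list of (inputs, target) pairs, targets 0 or 1.
    n = len(data[0][0])
    w = [random.uniform(-1, 1) for _ in range(n)]   # initialize randomly
    b = random.uniform(-1, 1)
    for _ in range(epochs):
        for x, target in data:                      # sequentially give it data
            out = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
            error = target - out                    # network vs. actual output
            w = [wi + lr * error * xi for wi, xi in zip(w, x)]  # update weights
            b += lr * error                         #   according to this error
    return w, b

# End result: give the trained model an input, it produces a proper output.
# Here it learns logical AND:
w, b = train_perceptron([([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)])
print([1 if w[0]*x[0] + w[1]*x[1] + b > 0 else 0
       for x in [[0, 0], [0, 1], [1, 0], [1, 1]]])  # expected: [0, 0, 0, 1]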

Quest for the weights. The weights are the model!

The Perceptron (1958)

“A machine which senses, recognizes, remembers, and responds like the human mind”
“Remarkable machine… [was] capable of what amounts to thought” - The New Yorker

Criticism and downfall (1969)

- Perceptrons are painfully limited. They cannot even learn a simple XOR function! (see the check below)

- No feasible way of learning networks with multiple layers

- Interest in neural networks all but disappeared
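A small self-contained check of the XOR claim: scan a grid of weights and biases for a single linear-threshold neuron and count how many of the four XOR cases each candidate gets right. The grid is an illustrative assumption; the impossibility holds for all real weights.

import itertools

xor = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
grid = [i / 2 for i in range(-4, 5)]  # w1, w2, b in {-2.0, -1.5, ..., 2.0}
best = 0
for w1, w2, b in itertools.product(grid, repeat=3):
    correct = sum(1 for (x1, x2), t in xor
                  if (1 if w1 * x1 + w2 * x2 + b > 0 else 0) == t)
    best = max(best, correct)
print(best)  # 3: no single linear unit classifies all four XOR cases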

Renewed interest (’90s)

- Learning multiple layers
- “Back propagation”
- Can theoretically learn any function! (a small example follows below)
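As a minimal sketch of back propagation (the architecture and hyperparameters are illustrative assumptions), a 2-2-1 sigmoid network can learn exactly the XOR function that defeated the single perceptron:

import math, random

def sig(z):
    return 1.0 / (1.0 + math.exp(-z))

# 2 inputs -> 2 hidden sigmoid units -> 1 sigmoid output.
W1 = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(2)]
b1 = [0.0, 0.0]
W2 = [random.uniform(-1, 1) for _ in range(2)]
b2 = 0.0

data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
lr = 0.5
for _ in range(10000):
    for (x1, x2), t in data:
        # Forward pass through both layers.
        h = [sig(W1[j][0] * x1 + W1[j][1] * x2 + b1[j]) for j in range(2)]
        y = sig(W2[0] * h[0] + W2[1] * h[1] + b2)
        # Backward pass: push the output error back through the layers.
        dy = (y - t) * y * (1 - y)
        dh = [dy * W2[j] * h[j] * (1 - h[j]) for j in range(2)]
        for j in range(2):
            W2[j] -= lr * dy * h[j]
            W1[j][0] -= lr * dh[j] * x1
            W1[j][1] -= lr * dh[j] * x2
            b1[j] -= lr * dh[j]
        b2 -= lr * dy

# Expected outputs: 0, 1, 1, 0. With only two hidden units, training can
# occasionally stall in a local minimum; re-running then succeeds.
for (x1, x2), t in data:
    h = [sig(W1[j][0] * x1 + W1[j][1] * x2 + b1[j]) for j in range(2)]
    print((x1, x2), round(sig(W2[0] * h[0] + W2[1] * h[1] + b2)))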

But… very slow and inefficient

- Machine learning attention shifted towards SVMs, random forests, etc.

Deep learning (2006)

- Quest: mimic human brain representations
- Large networks
- Lots of data

Problem: simple back propagation fails on large networks.

Deep learning (2006)

- Exactly the same networks as before, just BIGGER

- Combination of three factors:
  - (Big data)
  - Better algorithms
  - Parallel computing (GPU)

Better algorithms

Restricted Boltzmann machine
Pre-training: learn the representation by parts!
Very strong unsupervised learning

After pre-training, use back propagation
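A hedged sketch of this pre-training step: a restricted Boltzmann machine trained with one step of contrastive divergence (CD-1). The layer sizes, learning rate, and toy data are illustrative assumptions, not from the slides.

import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

n_visible, n_hidden, lr = 6, 3, 0.1
W = rng.normal(0.0, 0.1, (n_visible, n_hidden))
a = np.zeros(n_visible)   # visible biases
b = np.zeros(n_hidden)    # hidden biases

# Toy binary "data": two repeating patterns the RBM should capture.
data = np.array([[1, 1, 1, 0, 0, 0],
                 [0, 0, 0, 1, 1, 1]], dtype=float)

for _ in range(1000):
    for v0 in data:
        # Positive phase: hidden activations driven by the data.
        ph0 = sigmoid(v0 @ W + b)
        h0 = (rng.random(n_hidden) < ph0).astype(float)
        # Negative phase: reconstruct the visibles, re-infer the hiddens.
        pv1 = sigmoid(h0 @ W.T + a)
        v1 = (rng.random(n_visible) < pv1).astype(float)
        ph1 = sigmoid(v1 @ W + b)
        # Move weights toward the data statistics, away from the model's.
        W += lr * (np.outer(v0, ph0) - np.outer(v1, ph1))
        a += lr * (v0 - v1)
        b += lr * (ph0 - ph1)

# After unsupervised training, reconstructions resemble the input patterns.
print(sigmoid(sigmoid(data @ W + b) @ W.T + a).round(2))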

Parallel (GPU) power
- Every set of weights can be stored as a matrix (w_ij)
- GPUs are made to do common parallel problems fast!
- All similar calculations are done at the same time: a huge performance boost over CPU parallelizing
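A small illustration of why this matters (all shapes here are made-up assumptions): with the weights stored as one matrix, a whole batch of inputs passes through a layer as a single matrix multiplication, exactly the kind of operation GPUs parallelize well.

import numpy as np

batch, n_in, n_out = 128, 784, 256
X = np.random.rand(batch, n_in)          # a batch of inputs
W = np.random.randn(n_in, n_out) * 0.01  # all weights w_ij as one matrix
b = np.zeros(n_out)

H = 1.0 / (1.0 + np.exp(-(X @ W + b)))   # all 128 x 256 neuron outputs at once
print(H.shape)                           # (128, 256)

# On a GPU the same X @ W runs across thousands of cores in parallel;
# for example, CuPy mirrors this NumPy API on the GPU.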

Future of Deep Learning
- Currently an explosion of developments

- Hessian-Free networks (2010)
- Long Short-Term Memory (2011)
- Large convolutional nets, max-pooling (2011)
- Nesterov's Gradient Descent (2013)

- Currently state of the art, but...
- No way of doing logical inference (extrapolation)
- No easy integration of abstract knowledge
- Hypothesis space bias might not conform with reality

When to apply Deep Learning
- Generally, vision and sound recognition, but...

- Works great for many other problems too!
- A lot of data / features
- Don't want to make your own features
- State-of-the-art results

How to apply Deep Learning
Deep learning is very difficult!
- No easy plug-and-play software
- Far too many different networks/options/additions
- Mathematics and programming are very challenging
- Research is fast-paced
- Learning a network is both an art and a science

My advice:
Cooperation: university <=> business

How to apply Deep Learning
- For most current business problems, no need for expensive hardware (e.g. we use a laptop)

