
Neural Networks and Deep Learning
Tijmen Blankevoort

Scyfer

Prof. dr. Max Welling, Drs. Jorgen Sandig, Taco Cohen MSc

Deep Learning

All-purpose machine learning

Using Neural Networks:
- Using large amounts of data
- Learning very complex problems
- Automatically learning features

A new era of machine learning

Deep learning wins all competitions:
- IJCNN 2011 Traffic Sign Recognition Competition
- ISBI 2012 Segmentation of neuronal structures in EM stacks challenge
- ICDAR 2011 Chinese handwriting recognition

Applications
A lot of state-of-the-art systems use deep learning to some extent:
- IBM's Watson: Jeopardy! contest 2011
- Google's self-driving car
- Google Glass
- Facebook face recognition
- Facebook user modelling
Mostly image and sound recognition tasks (difficult)

Google Brain (2011)
- 10 million YouTube/ImageNet images
- 1 billion parameters
- 16,000 processors
- Largely unsupervised!
- 20,000 categories
- 15.8% accuracy

Bigger, better
Deep learning:
- The scope of what computers can learn has been greatly increased
- Interaction with the real world

Biological Inspiration

Neuron

Neuron computer model

Activation function

Sigmoid activation function

Neuron computer model
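To make the model concrete, here is a minimal sketch in Python of a single neuron: a weighted sum of its inputs plus a bias, passed through the sigmoid activation sigma(z) = 1 / (1 + exp(-z)). The example weights and inputs are illustrative assumptions, not from the slides.

import math

def sigmoid(z):
    # Sigmoid activation: squashes any real z into (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

def neuron(x, w, b):
    # Weighted sum of inputs plus bias, then the activation function.
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return sigmoid(z)

print(neuron([1.0, 0.5], [0.8, -0.4], 0.1))  # z = 0.7, output ~0.67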

Perceptron (Rosenblatt, 1957)

Easy functions with a neuron

Linking neurons and training

- Initialize randomly
- Sequentially give it data
- See what the difference is between the network output and the actual output
- Update the weights according to this error
- End result: give the model an input, and it produces a proper output (see the sketch below)
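A hedged sketch of that training loop for a single threshold neuron, using the classic perceptron update rule. The learning rate, epoch count, and AND-gate data are illustrative assumptions.

import random

def train_perceptron(data, epochs=100, lr=0.1):
    # data: list of (inputs, target) pairs, targets 0 or 1.
    n = len(data[0][0])
    w = [random.uniform(-1, 1) for _ in range(n)]   # initialize randomly
    b = random.uniform(-1, 1)
    for _ in range(epochs):
        for x, target in data:                      # sequentially give it data
            out = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
            error = target - out                    # network vs. actual output
            w = [wi + lr * error * xi for wi, xi in zip(w, x)]  # update weights
            b += lr * error                         #   according to this error
    return w, b

# End result: give the trained model an input, it produces a proper output.
# Here it learns logical AND:
w, b = train_perceptron([([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)])
print([1 if w[0]*x[0] + w[1]*x[1] + b > 0 else 0
       for x in [[0, 0], [0, 1], [1, 0], [1, 1]]])  # expected: [0, 0, 0, 1]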

Quest for the weights. The weights are the model!

The Perceptron (1958)

“A machine which senses, recognizes, remembers, and responds like the human mind”
“Remarkable machine… [was] capable of what amounts to thought” - The New Yorker

Criticism and downfall (1969)

- Perceptrons are painfully limited. They cannot even learn a simple XOR function! (see the check below)

- No feasible way of learning networks with multiple layers

- Interest in neural networks all but disappeared
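A small self-contained check of the XOR claim: scan a grid of weights and biases for a single linear-threshold neuron and count how many of the four XOR cases each candidate gets right. The grid is an illustrative assumption; the impossibility holds for all real weights.

import itertools

xor = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
grid = [i / 2 for i in range(-4, 5)]  # w1, w2, b in {-2.0, -1.5, ..., 2.0}
best = 0
for w1, w2, b in itertools.product(grid, repeat=3):
    correct = sum(1 for (x1, x2), t in xor
                  if (1 if w1 * x1 + w2 * x2 + b > 0 else 0) == t)
    best = max(best, correct)
print(best)  # 3: no single linear unit classifies all four XOR cases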

Renewed interest (’90s)

- Learning multiple layers
- “Back propagation”
- Can theoretically learn any function! (a small example follows below)
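As a minimal sketch of back propagation (the architecture and hyperparameters are illustrative assumptions), a 2-2-1 sigmoid network can learn exactly the XOR function that defeated the single perceptron:

import math, random

def sig(z):
    return 1.0 / (1.0 + math.exp(-z))

# 2 inputs -> 2 hidden sigmoid units -> 1 sigmoid output.
W1 = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(2)]
b1 = [0.0, 0.0]
W2 = [random.uniform(-1, 1) for _ in range(2)]
b2 = 0.0

data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
lr = 0.5
for _ in range(10000):
    for (x1, x2), t in data:
        # Forward pass through both layers.
        h = [sig(W1[j][0] * x1 + W1[j][1] * x2 + b1[j]) for j in range(2)]
        y = sig(W2[0] * h[0] + W2[1] * h[1] + b2)
        # Backward pass: push the output error back through the layers.
        dy = (y - t) * y * (1 - y)
        dh = [dy * W2[j] * h[j] * (1 - h[j]) for j in range(2)]
        for j in range(2):
            W2[j] -= lr * dy * h[j]
            W1[j][0] -= lr * dh[j] * x1
            W1[j][1] -= lr * dh[j] * x2
            b1[j] -= lr * dh[j]
        b2 -= lr * dy

# Expected outputs: 0, 1, 1, 0. With only two hidden units, training can
# occasionally stall in a local minimum; re-running then succeeds.
for (x1, x2), t in data:
    h = [sig(W1[j][0] * x1 + W1[j][1] * x2 + b1[j]) for j in range(2)]
    print((x1, x2), round(sig(W2[0] * h[0] + W2[1] * h[1] + b2)))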

But… very slow and inefficient

- Machine learning attention shifted towards SVMs, random forests, etc.

Deep learning (2006)

- Quest: mimic human brain representations
- Large networks
- Lots of data

Problem: simple back propagation fails on large networks.

Deep learning (2006)

- Exactly the same networks as before, just BIGGER

- Combination of three factors:
  - (Big data)
  - Better algorithms
  - Parallel computing (GPU)

Better algorithms

Restricted Boltzmann machine
Pre-training: learn the representation by parts!
Very strong unsupervised learning

After pre-training, use back propagation
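A hedged sketch of this pre-training step: a restricted Boltzmann machine trained with one step of contrastive divergence (CD-1). The layer sizes, learning rate, and toy data are illustrative assumptions, not from the slides.

import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

n_visible, n_hidden, lr = 6, 3, 0.1
W = rng.normal(0.0, 0.1, (n_visible, n_hidden))
a = np.zeros(n_visible)   # visible biases
b = np.zeros(n_hidden)    # hidden biases

# Toy binary "data": two repeating patterns the RBM should capture.
data = np.array([[1, 1, 1, 0, 0, 0],
                 [0, 0, 0, 1, 1, 1]], dtype=float)

for _ in range(1000):
    for v0 in data:
        # Positive phase: hidden activations driven by the data.
        ph0 = sigmoid(v0 @ W + b)
        h0 = (rng.random(n_hidden) < ph0).astype(float)
        # Negative phase: reconstruct the visibles, re-infer the hiddens.
        pv1 = sigmoid(h0 @ W.T + a)
        v1 = (rng.random(n_visible) < pv1).astype(float)
        ph1 = sigmoid(v1 @ W + b)
        # Move weights toward the data statistics, away from the model's.
        W += lr * (np.outer(v0, ph0) - np.outer(v1, ph1))
        a += lr * (v0 - v1)
        b += lr * (ph0 - ph1)

# After unsupervised training, reconstructions resemble the input patterns.
print(sigmoid(sigmoid(data @ W + b) @ W.T + a).round(2))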

Parallel (GPU) power
- Every set of weights can be stored as a matrix (w_ij)
- GPUs are made to do common parallel problems fast!
- All similar calculations are done at the same time: a huge performance boost over CPU parallelizing
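A small illustration of why this matters (all shapes here are made-up assumptions): with the weights stored as one matrix, a whole batch of inputs passes through a layer as a single matrix multiplication, exactly the kind of operation GPUs parallelize well.

import numpy as np

batch, n_in, n_out = 128, 784, 256
X = np.random.rand(batch, n_in)          # a batch of inputs
W = np.random.randn(n_in, n_out) * 0.01  # all weights w_ij as one matrix
b = np.zeros(n_out)

H = 1.0 / (1.0 + np.exp(-(X @ W + b)))   # all 128 x 256 neuron outputs at once
print(H.shape)                           # (128, 256)

# On a GPU the same X @ W runs across thousands of cores in parallel;
# for example, CuPy mirrors this NumPy API on the GPU.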

Future of Deep Learning
- Currently an explosion of developments

- Hessian-Free networks (2010)
- Long Short-Term Memory (2011)
- Large convolutional nets, max-pooling (2011)
- Nesterov's Gradient Descent (2013)

- Currently state of the art, but...
- No way of doing logical inference (extrapolation)
- No easy integration of abstract knowledge
- Hypothesis space bias might not conform with reality

When to apply Deep Learning
- Generally, vision and sound recognition, but...

- Works great for many other problems too!
- A lot of data / features
- Don't want to make your own features
- State-of-the-art results

How to apply Deep Learning
Deep learning is very difficult!
- No easy plug-and-play software
- Far too many different networks/options/additions
- Mathematics and programming are very challenging
- Research is fast-paced
- Learning a network is both an art and a science

My advice:
Cooperation: university <=> business

How to apply Deep Learning
- For most current business problems, no need for expensive hardware (e.g. we use a laptop)

