Introduction to Computational Neuroscience Artificial Neural Networks Tambet Matiisen 15.10.2018


Page 1

Introduction to Computational Neuroscience

Artificial Neural Networks

Tambet Matiisen

15.10.2018

Page 2

Artificial neural network

NB! Inspired by biology, not based on biology!

Page 3

Applications

• Automatic speech recognition

• Automatic image tagging

• Machine translation

Page 4

Learning objectives

How do artificial neural networks work?

What types of artificial neural networks are used for what tasks?

What are the state-of-the-art results achieved with artificial neural networks?

Page 5

HOW DO NEURAL NETWORKS WORK?

Part 1

Page 6

Frank Rosenblatt (1957)

Added a learning rule to the McCulloch-Pitts neuron.

Page 7

Perceptron

Prediction:

$$z = \begin{cases} 1, & \text{if } \sum_i x_i w_i + b > 0 \\ 0, & \text{otherwise} \end{cases}$$

Learning:

$$w_i \leftarrow w_i + (y - z)\,x_i$$

$$b \leftarrow b + (y - z)$$

[Figure: a perceptron with inputs x1 and x2, weights w1 and w2, bias b, a summation node Σ, and output z]

Page 8

Let’s try it out!

x1   x2   y = x1 or x2
0    0    0
0    1    1
1    0    1
1    1    1

Algorithm:

repeat
    $z = \begin{cases} 1, & \text{if } w_1 x_1 + w_2 x_2 + b > 0 \\ 0, & \text{otherwise} \end{cases}$
    $w_1 \leftarrow w_1 + (y - z)\,x_1$
    $w_2 \leftarrow w_2 + (y - z)\,x_2$
    $b \leftarrow b + (y - z)$
until y = z holds for the entire dataset
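To make the algorithm concrete, here is a minimal Python sketch of the perceptron learning rule applied to the OR dataset above; the variable names and the zero initialization are illustrative choices, not from the slides.

```python
# A minimal sketch of the perceptron algorithm, trained on the OR dataset.
data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
w1, w2, b = 0.0, 0.0, 0.0              # illustrative zero initialization

converged = False
while not converged:
    converged = True
    for (x1, x2), y in data:
        z = 1 if x1 * w1 + x2 * w2 + b > 0 else 0   # prediction
        if z != y:
            w1 += (y - z) * x1                      # learning rule
            w2 += (y - z) * x2
            b  += (y - z)
            converged = False                       # keep looping until y = z everywhere

print(w1, w2, b)  # one possible solution: w1 = w2 = 1, b = 0
```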

Page 9

Perceptron limitations

Perceptron learning algorithm converges only for linearly separable problems.

Minsky, Papert, “Perceptrons” (1969)

Page 10

Multi-layer perceptrons

Add non-linear activation functions

Add hidden layer(s)

Universal approximation theorem: any continuous function can be approximated to a given precision by a feed-forward neural network with a single hidden layer containing a finite number of neurons.

Page 11

Forward propagation

[Figure: network with inputs x1 and x2 (plus bias units +1), two hidden units, and one output unit]

$$\sigma(x) = \frac{1}{1 + e^{-x}}$$

$$a_1 = x_1 w_{11} + x_2 w_{21} + b_1, \qquad h_1 = \sigma(a_1)$$

$$a_2 = x_1 w_{12} + x_2 w_{22} + b_2, \qquad h_2 = \sigma(a_2)$$

$$z = h_1 v_1 + h_2 v_2 + c$$
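As an illustration, here is a minimal NumPy sketch of this forward pass; the weight values are arbitrary placeholders.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

x = np.array([0.5, -1.0])            # inputs x1, x2
W = np.array([[0.1, 0.2],            # w11, w12
              [0.3, 0.4]])           # w21, w22  (placeholder values)
b = np.array([0.0, 0.0])             # hidden biases b1, b2
v = np.array([0.5, -0.5])            # output weights v1, v2
c = 0.0                              # output bias

a = x @ W + b                        # a_i = x1*w_1i + x2*w_2i + b_i
h = sigmoid(a)                       # h_i = sigma(a_i)
z = h @ v + c                        # z = h1*v1 + h2*v2 + c
```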

Page 12

Loss function

• Function approximation:

$$L = \frac{1}{2}(z - y)^2$$

[Figure: plot of the loss as a function of z, e.g. $(z - 10)^2$ for target $y = 10$]

Now we just need to find weight values that minimize the loss function for all inputs. How do we do that?

Page 13

Backpropagation

[Figure: the same network as in forward propagation, with gradients flowing backwards from the loss]

$$\sigma'(x) = \sigma(x)\,(1 - \sigma(x))$$

Recall: $a_i = x_1 w_{1i} + x_2 w_{2i} + b_i$, $\quad h_i = \sigma(a_i)$, $\quad z = h_1 v_1 + h_2 v_2 + c$, $\quad L = \frac{1}{2}(z - y)^2$

$$\frac{\partial L}{\partial z} = z - y$$

$$\frac{\partial L}{\partial c} = \frac{\partial L}{\partial z}\,\frac{\partial z}{\partial c} = z - y$$

$$\frac{\partial L}{\partial v_1} = \frac{\partial L}{\partial z}\,\frac{\partial z}{\partial v_1} = (z - y)\,h_1 \qquad \frac{\partial L}{\partial v_2} = (z - y)\,h_2$$

$$\frac{\partial L}{\partial a_1} = \frac{\partial L}{\partial z}\,\frac{\partial z}{\partial h_1}\,\frac{\partial h_1}{\partial a_1} = (z - y)\,v_1\,h_1(1 - h_1) \qquad \frac{\partial L}{\partial a_2} = (z - y)\,v_2\,h_2(1 - h_2)$$

$$\frac{\partial L}{\partial b_1} = \frac{\partial L}{\partial a_1} \qquad \frac{\partial L}{\partial b_2} = \frac{\partial L}{\partial a_2}$$

$$\frac{\partial L}{\partial w_{11}} = \frac{\partial L}{\partial a_1}\,x_1 \qquad \frac{\partial L}{\partial w_{12}} = \frac{\partial L}{\partial a_2}\,x_1 \qquad \frac{\partial L}{\partial w_{21}} = \frac{\partial L}{\partial a_1}\,x_2 \qquad \frac{\partial L}{\partial w_{22}} = \frac{\partial L}{\partial a_2}\,x_2$$
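To connect the derivatives to code, here is a minimal NumPy sketch that computes all of the gradients above for the same 2-2-1 network; the gradients() helper is a hypothetical name, not from the slides.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gradients(x, y, W, b, v, c):
    # forward pass
    a = x @ W + b                   # a_j = x1*w_1j + x2*w_2j + b_j
    h = sigmoid(a)                  # h_j = sigma(a_j)
    z = h @ v + c                   # z = h1*v1 + h2*v2 + c
    # backward pass (chain rule)
    dz = z - y                      # dL/dz for L = 0.5*(z - y)^2
    dc = dz                         # dL/dc
    dv = dz * h                     # dL/dv_j = (z - y) * h_j
    da = dz * v * h * (1 - h)       # dL/da_j = (z - y) * v_j * h_j * (1 - h_j)
    db = da                         # dL/db_j (since da_j/db_j = 1)
    dW = np.outer(x, da)            # dL/dw_ij = dL/da_j * x_i
    return dW, db, dv, dc
```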

Page 14

Gradient Descent

• Gradient descent finds weight values that result in small loss.

• Gradient descent is guaranteed to find only a local minimum.

• But there are plenty of them, and they are often good enough!

$$\theta \leftarrow \theta - \alpha \frac{\partial L}{\partial \theta}, \qquad \theta \in \{w_{ij}, v_j, b_j, c\}, \qquad \alpha = \text{learning rate}$$
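A minimal sketch of this update rule as a training loop, reusing the hypothetical gradients() helper from the backpropagation sketch above; the dataset, step count, and learning rate are placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)
W, b = rng.normal(size=(2, 2)), np.zeros(2)    # hidden weights and biases
v, c = rng.normal(size=2), 0.0                 # output weights and bias
dataset = [(np.array([0.0, 0.0]), 0.0),        # placeholder training pairs
           (np.array([0.0, 1.0]), 1.0),
           (np.array([1.0, 0.0]), 1.0),
           (np.array([1.0, 1.0]), 0.0)]

alpha = 0.5                                    # learning rate
for step in range(5000):
    for x, y in dataset:
        dW, db, dv, dc = gradients(x, y, W, b, v, c)  # from the sketch above
        W -= alpha * dW                        # theta <- theta - alpha * dL/dtheta
        b -= alpha * db
        v -= alpha * dv
        c -= alpha * dc
```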

Page 15

Other loss functions

• Binary classification:

$$p = \sigma(z), \qquad \sigma(x) = \frac{1}{1 + e^{-x}}$$

$$L = -y \log(p) - (1 - y)\log(1 - p)$$

• Multi-class classification:

$$p_i = \mathrm{softmax}(z)_i = \frac{e^{z_i}}{\sum_j e^{z_j}}$$

$$L = -\sum_i y_i \log p_i = -\log p_k, \qquad k = \text{index of the correct class}$$

[Figure: plots of $-\log(p)$ and $-\log(1 - p)$]
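For concreteness, here is a minimal NumPy sketch of both losses; the function names are illustrative.

```python
import numpy as np

def binary_cross_entropy(z, y):
    p = 1.0 / (1.0 + np.exp(-z))           # p = sigma(z)
    return -y * np.log(p) - (1 - y) * np.log(1 - p)

def multiclass_cross_entropy(z, k):
    p = np.exp(z) / np.sum(np.exp(z))      # p_i = softmax(z)_i
    return -np.log(p[k])                   # k = index of the correct class
```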

Page 16

Things to remember...

The perceptron, invented in the late 1950s, was the first artificial neuron model with a learning rule.

Perceptron can learn only linearly separable classification problems.

Feed-forward networks with non-linear activation functions and hidden layers can overcome the limitations of perceptrons.

Multi-layer artificial neural networks are trained using backpropagation and gradient descent.

Page 17

NEURAL NETWORKS TAXONOMY

Part 2

Page 18

Simple feed-forward networks

• Architecture:

– Each node is connected to all nodes of the previous layer.

– Information moves in one direction only.

• Used for:

– Function approximation

– Simple classification problems

– Not too many inputs (~100)

[Figure: fully connected network with an input layer, one hidden layer, and an output layer]

Page 19

Convolutional neural networks

• Architecture:

– Convolutional layer: local connections + weight sharing.

– Pooling layer: translation invariance.

• Used for:

– images and spatial data,

– any other data with a locality property, e.g. adjacent characters make up a word.

[Figure: 1-D convolution with weights (1, 0, −1) applied to the input layer, followed by a max-pooling layer]
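Here is a minimal NumPy sketch of 1-D convolution and max pooling as described above; the input values are placeholders, while the weights (1, 0, −1) follow the slide's example filter.

```python
import numpy as np

def conv1d(x, w):
    # slide the same filter w over the input and compute a score at each position
    n = len(x) - len(w) + 1
    return np.array([np.dot(x[i:i + len(w)], w) for i in range(n)])

def max_pool(x, size=2):
    # take the maximum of each block of adjacent convolution scores
    return np.array([x[i:i + size].max() for i in range(0, len(x), size)])

scores = conv1d(np.array([0, 1, 2, -1, 1, 2]), np.array([1, 0, -1]))  # placeholder input
pooled = max_pool(scores)
```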

Page 20

Hubel & Wiesel (1959)

• Performed experiments with anesthetized cats.

• Discovered topographical mapping, sensitivity to orientation and hierarchical processing.

Page 21

Convolution

Convolution matches the same pattern over the entire image and calculates a score for each match.

Page 23

Pooling

Pooling achieves translation invariance by taking the maximum of adjacent convolution scores.

Page 24

Example: handwritten digit recognition

Y. LeCun et al., “Handwritten digit recognition: Applications of neural net chips and automatic learning”, 1989.


Page 25

Recurrent neural networks

• Architecture:

– Hidden-layer nodes are connected to themselves.

– This allows the network to retain internal state and memory.

• Used for:

– speech recognition,

– machine translation,

– language modeling,

– any time series.

[Figure: recurrent network whose hidden layer feeds back into itself]

Page 26

Backpropagation through time

[Figure: the recurrent network unrolled through time: inputs x1…x4 feed hidden states h1…h4 (starting from h0), producing outputs z1…z4 that are compared to targets y1…y4]

$$L_t = \frac{1}{2}(z_t - y_t)^2$$
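A minimal NumPy sketch of a recurrent forward pass unrolled through time, as in the diagram above; the tanh activation, weight-matrix names, and sizes are assumptions for illustration.

```python
import numpy as np

def rnn_forward(xs, h0, W_xh, W_hh, W_hz):
    # process the sequence one time step at a time
    h, zs = h0, []
    for x in xs:
        h = np.tanh(x @ W_xh + h @ W_hh)   # new state depends on input and old state
        zs.append(h @ W_hz)                # output z_t at every step
    return zs

# toy usage with placeholder shapes: 1-D inputs, 3-D hidden state, 1-D outputs
rng = np.random.default_rng(0)
W_xh, W_hh, W_hz = rng.normal(size=(1, 3)), rng.normal(size=(3, 3)), rng.normal(size=(3, 1))
outputs = rnn_forward([np.array([0.5]), np.array([-1.0])], np.zeros(3), W_xh, W_hh, W_hz)
```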

Page 27

Different configurations

Page 28

Autoencoders

• Architecture:

– Input and output layers are the same.

– Hidden layer functions as a “bottleneck”.

– Network is trained to reconstruct input from hidden layer activations.

• Used for:

– image semantic hashing,

– dimensionality reduction.

[Figure: autoencoder whose output layer equals its input layer, with a narrow hidden layer in between]
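A minimal NumPy sketch of an autoencoder forward pass; the encoder/decoder weight names and sizes are hypothetical, and training would minimize a reconstruction loss between x_hat and x.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def autoencoder_forward(x, W_enc, b_enc, W_dec, b_dec):
    h = sigmoid(x @ W_enc + b_enc)       # bottleneck: fewer units than inputs
    x_hat = sigmoid(h @ W_dec + b_dec)   # reconstruction of the input
    return h, x_hat                      # training compares x_hat against x

# toy usage: 4 inputs squeezed through a 2-unit bottleneck
rng = np.random.default_rng(0)
h, x_hat = autoencoder_forward(rng.random(4),
                               rng.normal(size=(4, 2)), np.zeros(2),
                               rng.normal(size=(2, 4)), np.zeros(4))
```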

Page 29

We didn’t talk about...

• Long Short-Term Memory (LSTM) networks

• Restricted Boltzmann Machines (RBMs)

• Echo State Networks / Liquid State Machines

• Hopfield networks

• Self-organizing maps (SOMs)

• Radial basis function networks (RBFs)

• But we covered the most important ones!

Page 30

Things to remember...

Simple feed-forward networks are usually used for function approximation and classification with few input features.

Convolutional neural networks are mostly used for images and spatial data.

Recurrent neural networks are used for language modeling and time series.

Autoencoders are used for image semantic hashing and dimensionality reduction.

Page 31

SOME STATE-OF-THE-ART RESULTS

Part 3

Page 32

Deep Learning

• Artificial neural networks and backpropagation have been around since the 1980s. What’s all this fuss about “deep learning”?

• What has changed:

– we have much bigger datasets,

– we have much faster computers (think GPUs),

– we have learned a few tricks for training neural networks with very many layers.

Page 33

Revolution of Depth

[Figure: ImageNet classification error of winning networks over the years (human error ~5.1%)]

Page 34

Neural Image Processing

Page 35

Instance Segmentation

https://github.com/matterport/Mask_RCNN

https://www.youtube.com/watch?v=OOT3UIXZztE

Page 36

Image Captioning

Page 37

Image Captioning Errors

Page 39

Skype Translator

https://www.youtube.com/watch?v=NhxCg2PA3ZI

Page 40

Adversarial Examples

https://www.youtube.com/watch?v=XaQu7kkQBPc

Page 41

Things to remember...

Artificial neural networks are state-of-the-art in image recognition, speech recognition, machine translation and many other fields.

Anything that you can do in one second, we can probably train a neural network to do as well, i.e. neural nets can do perception.

But in the end they are just reactive function approximators and can be easily fooled. In particular, they do not think like humans (yet).

Page 42

Thank you!

[email protected]