nn perceptron 1 - cneuro.rmki.kfki.hu/sites/default/files/nn_perceptron_1_0.pdf

Page 1

Idegrendszeri modellezés (Modeling of the Nervous System)

Orbán Gergő

http://golab.wigner.mta.hu

Page 2

1

Neurons and neural networks I.

Page 3

2

High-level vs. low level modeling

Previously in this course:

* Mechanistic description of neurons
* Descriptive models of neuronal operations

Goals of the next classes:

Are these models compatible with the neural architecture?
How can neural computations be interpreted in this framework?

Page 4

3

Outline

• single neuron
• neural network
• supervised learning
• unsupervised learning

Page 5

4

Neural networks, basic concepts

The goal of a neural network can be phrased as:
• learning about its input
• forming memories

Address-based memory
• not associative
• not fault tolerant
• not distributed

Biological memory
• associative
• error tolerant
  • to cue errors
  • to hardware faults
• parallel and distributed

Page 6

5

Terminology

Architecture: what variables are involved and how they interact (weights, activities)

Activity rule: short time-scale behavior; how the activity of neurons depends on their inputs

Learning rule: long time-scale behavior; how the weights change as a function of the activities of other neurons, time, or other factors

Page 7

6

Terminology

Supervised learning: labelled data is provided, i.e. an input-output relationship is given; the ‘teacher’ tells whether the response of the neuron to a given input was correct.

Unsupervised learning: only unlabelled data is available; the network learns the statistics of the input, either by generating a table of occurrences or by learning some more sophisticated representation.

Page 8

7

Single neuron as a classifier: Definitions

Architecture: I inputs x_i, weights w_i

Activity rule:
1. the activation is calculated
2. the output is calculated
   • linear
   • logistic
   • tanh
   • threshold
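A minimal sketch of this activity rule in Python; the notation a = Σ_i w_i x_i for the activation and the four output functions follow the slide, while the sample weight and input values are invented for illustration:

```python
import numpy as np

def activation(w, x):
    # Step 1 of the activity rule: the activation is the weighted sum of inputs.
    return np.dot(w, x)

# Step 2: the output y = f(a), for the four choices listed on the slide.
def linear(a):    return a
def logistic(a):  return 1.0 / (1.0 + np.exp(-a))   # output in (0, 1)
def tanh_out(a):  return np.tanh(a)                  # output in (-1, 1)
def threshold(a): return np.where(a > 0, 1.0, 0.0)   # hard binary decision

w = np.array([0.5, -1.0, 2.0])
x = np.array([1.0, 0.0, 0.5])    # x[0] = 1 can serve as a constant bias input
a = activation(w, x)             # 0.5*1 - 1*0 + 2*0.5 = 1.5
```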


Page 10

8

Single neuron as a classifier: Concepts

(Figures: the input space and the weight space of the neuron)

Page 13

9

Single neuron as a classifier: Concepts

The supervised learning paradigm: given example inputs x and target outputs t, learn the mapping between them.

The trained network is supposed to give the ‘correct’ response for any given input stimulus.

training is equivalent to learning the appropriate weights

in order to achieve this, an objective function (or error function) is defined, which is minimized during training

Page 14

10

Binary classification in a neuron

Error function; two readings of the same quantity:
1. the information content of the outcomes
2. the relative entropy between the distributions (t, 1-t) and (y, 1-y)

Learning: the back-propagation algorithm ("here is the error")
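The error function on the original slide is an equation lost in conversion; given the two readings listed, it is presumably the cross-entropy G(w) = -Σ_n [t_n ln y_n + (1 - t_n) ln(1 - y_n)], i.e. the information content of the outcomes, which up to a t-dependent constant equals the relative entropy between (t, 1-t) and (y, 1-y). A sketch under that assumption:

```python
import numpy as np

def cross_entropy(t, y, eps=1e-12):
    # G(w) = -sum_n [ t_n ln y_n + (1 - t_n) ln(1 - y_n) ]
    y = np.clip(y, eps, 1.0 - eps)   # guard the logarithms at y = 0 or 1
    return -np.sum(t * np.log(y) + (1.0 - t) * np.log(1.0 - y))

t = np.array([1.0, 0.0, 1.0])   # binary targets
y = np.array([0.9, 0.2, 0.8])   # logistic outputs of the neuron
G = cross_entropy(t, y)
```

For a logistic output this choice also gives the simple gradient ∂G/∂w_i = Σ_n (y_n - t_n) x_i,n: the quantity (y - t) is the "error" that the learning rule propagates back to the weights.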

Page 17

11

Binary classification: The algorithm

1. Get the data
2. Calculate the activations
3. Calculate the output of the neuron
4. Calculate the error
5. Update the weight(s)

Alternatively, batch learning can be done:
• gradient descent
• stochastic gradient descent
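The five steps can be sketched as a stochastic gradient descent loop. This assumes a logistic output neuron and the cross-entropy error, whose per-example gradient is (y - t) x; the learning rate and the toy data set are invented for illustration:

```python
import numpy as np

def train_sgd(X, t, eta=0.1, epochs=200, rng=np.random.default_rng(0)):
    """Online (stochastic) gradient descent for a single logistic neuron."""
    N, I = X.shape
    w = np.zeros(I)
    for _ in range(epochs):
        for n in rng.permutation(N):          # 1. get a data point
            a = np.dot(w, X[n])               # 2. activation
            y = 1.0 / (1.0 + np.exp(-a))      # 3. logistic output
            e = y - t[n]                      # 4. error signal dG/da
            w -= eta * e * X[n]               # 5. weight update
    return w

# Toy linearly separable data; the first column is a constant bias input.
X = np.array([[1, 0.0], [1, 1.0], [1, 3.0], [1, 4.0]])
t = np.array([0.0, 0.0, 1.0, 1.0])
w = train_sgd(X, t)
y = 1.0 / (1.0 + np.exp(-X @ w))
```

For batch gradient descent, sum the per-example gradients over the whole data set before each single weight update; the stochastic variant above updates after every example instead.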

Page 22

12

Binary classification: Demonstration

(Figures: the evolution of the weights and of the error function during training)

Page 28

13

Binary classification: Are we happy?

The w's seem to grow unbounded. Is this bad for us?

Generic painkiller: regularization (weight decay)
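Weight decay in this setting means adding a penalty αE_W(w), with E_W(w) = ½ Σ_i w_i², to the error function, so the gradient acquires an extra αw term and every update shrinks the weights a little. A sketch assuming the logistic neuron used above; α is a free hyperparameter:

```python
import numpy as np

def regularized_update(w, x, t, eta=0.1, alpha=0.01):
    """One SGD step on M(w) = G(w) + alpha * (1/2) * sum(w**2)."""
    y = 1.0 / (1.0 + np.exp(-np.dot(w, x)))
    grad = (y - t) * x + alpha * w      # data gradient + weight-decay term
    return w - eta * grad

w = np.array([5.0, -5.0])               # large weights
x = np.zeros(2)                          # with no input evidence...
w_new = regularized_update(w, x, t=0.0)
# ...the decay term alone pulls the weights toward zero.
```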

Page 32

14

Binary classification: Effect of regularization

Page 34

15

Capacity of the neuron

How much information can the neuron transmit?
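One standard way to make this question concrete (an illustration, not spelled out on the slide) is Cover's function-counting theorem: a linear threshold neuron with I weights can realize T(N, I) = 2 Σ_{k=0}^{I-1} C(N-1, k) of the 2^N possible labellings of N input points in general position, and the realizable fraction falls to 1/2 at N = 2I, which is the sense in which the capacity of the neuron is roughly two patterns per weight:

```python
from math import comb

def realizable_fraction(N, I):
    """Fraction of the 2**N dichotomies of N points (in general position)
    in I dimensions that a linear threshold neuron can implement
    (Cover's function-counting theorem)."""
    T = 2 * sum(comb(N - 1, k) for k in range(I))
    return T / 2**N

# Below capacity every labelling is realizable; at N = 2*I exactly half are.
```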

Page 35

16

Interpreting learning as inference

So far: optimization with respect to an objective function (the regularized error)

What's this quirky regularizer, anyway?

Page 37

17

Interpreting learning as inference

Let's interpret y(x, w) as a probability; in a compact form, the likelihood of the data can then be expressed with the original error function.

The regularizer has the form of a prior!

What we get in the objective function M(w) is the posterior distribution of w.

Page 42

18

Interpreting learning as inference: Relationship between M(w) and the posterior

Interpretation: minimizing M(w) amounts to finding the maximum a posteriori estimate w_MP.

The log-probability interpretation of the objective function retains the additivity of errors while keeping the multiplicativity of probabilities.
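The relationship can be written out explicitly (a reconstruction in standard notation, assuming the cross-entropy error G and the quadratic regularizer E_W used earlier):

```latex
M(\mathbf{w}) = G(\mathbf{w}) + \alpha E_W(\mathbf{w}),
\qquad
E_W(\mathbf{w}) = \tfrac{1}{2} \sum_i w_i^2 .

% Exponentiating turns the additive objective into a product of probabilities:
e^{-M(\mathbf{w})}
  = e^{-G(\mathbf{w})} \, e^{-\alpha E_W(\mathbf{w})}
  \;\propto\; P(\{t_n\} \mid \{\mathbf{x}_n\}, \mathbf{w}) \, P(\mathbf{w})
  \;\propto\; P(\mathbf{w} \mid \mathrm{data}) .
```

So the likelihood comes from the error term, the Gaussian prior from the regularizer, and minimizing M(w) is the same as maximizing the posterior, i.e. finding w_MP.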

Page 43

19

Interpreting learning as inference: Properties of the Bayesian estimate

• The probabilistic interpretation makes our assumptions explicit: with the regularizer we imposed a soft constraint on the learned parameters, which expresses our prior expectations.

• An additional plus: beyond obtaining w_MP, we get a measure of the uncertainty of the learned parameters.