Lecture 3 Perceptrons


The Linear Classifier, also known as the “Perceptron”

COMP24111 lecture 3


LAST WEEK: our first “machine learning” algorithm

Testing point x
For each training datapoint x':
    measure distance(x, x')
End
Sort distances
Select K nearest
Assign most common class

The K-Nearest Neighbour Classifier

Make your own notes on its advantages / disadvantages.

I’ll ask for volunteers next time we meet…


Model (memorize the training data)

Testing Data (no labels)

Training data

Predicted Labels

Learning algorithm (do nothing)

Supervised Learning Pipeline for Nearest Neighbour


The most important concept in Machine Learning


Looks good so far…

The most important concept in Machine Learning


Looks good so far…

Oh no! Mistakes! 

What happened?

The most important concept in Machine Learning


Looks good so far…

Oh no! Mistakes! 

What happened?

We didn’t have all the data.

We can never assume that we do.

This is called “OVER-FITTING” to the small dataset.

The most important concept in Machine Learning


The Linear Classifier

COMP24111 lecture 3


Model (equation)

Testing Data (no labels)

Training data

Predicted Labels

Learning algorithm (search for good parameters)

Supervised Learning Pipeline for Linear Classifiers


A simpler, more compact model?

(scatter plot: height vs. weight)

if (weight > t) then "player" else "dancer"


What’s an algorithm to find a good threshold?

(scatter plot: height vs. weight)

if (weight > t) then "player" else "dancer"

t = 40
while ( numMistakes != 0 ) {
    t = t + 1
    numMistakes = testRule(t)
}


We have our second Machine Learning procedure.

dancer""else player""then)(if    t weight  >

t=40while ( numMistakes != 0 )

{

t = t + 1

numMistakes = testRule(t)

}

The threshold classifier (also known as a“Decision Stump”)
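To make the procedure concrete, here is a runnable Java sketch of the decision stump; the training arrays and the body of the testRule helper are illustrative assumptions, not something given in the lecture.

public class DecisionStump {
    // weight of each person; label: 1 = "player", 0 = "dancer" (assumed data)
    static double[] weights = {95, 110, 88, 52, 60, 47};
    static int[]    labels  = {1,  1,   1,  0,  0,  0};

    // count how many training examples the rule "weight > t" gets wrong
    static int testRule(double t) {
        int mistakes = 0;
        for (int i = 0; i < weights.length; i++) {
            int predicted = (weights[i] > t) ? 1 : 0;
            if (predicted != labels[i]) mistakes++;
        }
        return mistakes;
    }

    public static void main(String[] args) {
        double t = 40;                      // start with a low threshold
        int numMistakes = testRule(t);
        while (numMistakes != 0) {          // nudge t upwards until no mistakes
            t = t + 1;                      // (assumes the data is separable by t)
            numMistakes = testRule(t);
        }
        System.out.println("threshold found: t = " + t);
    }
}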


Three “ingredients” of a Machine Learning procedure

“Model”: The final product, the thing you have to package up and send to a customer. A piece of code with some parameters that need to be set.

“Error function”: The performance criterion: the function you use to judge how well the parameters of the model are set.

“Learning algorithm”: The algorithm that optimises the model parameters, using the error function to judge how well it is doing.


Three “ingredients” of a Threshold Classifier

Model

if (x > t) then "player" else "dancer"

Learning algorithm

t = 40
while ( numMistakes != 0 ) {
    t = t + 1
    numMistakes = testRule(t)
}

Error function

General case – we’re not just talking about the weight of a rugby player – a threshold can be put on any feature ‘x’.


Model (memorize the training data)

Testing Data (no labels)

Training data

Predicted Labels

Learning algorithm (do nothing)

Supervised Learning Pipeline for Nearest Neighbour


What’s the “model” for the Nearest Neighbour classifier?

(scatter plot: height vs. weight)

For the k-NN, the model is the training data itself!
- very good accuracy
- very computationally intensive!

Testing point x
For each training datapoint x':
    measure distance(x, x')
End
Sort distances
Select K nearest
Assign most common class
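For reference, a minimal Java sketch of the pseudocode above; the training set, the Euclidean distance, and k = 3 are illustrative assumptions.

import java.util.Arrays;

public class KNN {
    // training points {height, weight}; labels: 1 = player, 0 = dancer (assumed)
    static double[][] train  = {{180, 95}, {185, 110}, {175, 88},
                                {165, 52}, {170, 60},  {160, 47}};
    static int[]      labels = {1, 1, 1, 0, 0, 0};

    static int classify(double[] x, int k) {
        // measure distance(x, x') for each training point
        Integer[] idx = new Integer[train.length];
        double[] dist = new double[train.length];
        for (int i = 0; i < train.length; i++) {
            idx[i] = i;
            double dh = x[0] - train[i][0], dw = x[1] - train[i][1];
            dist[i] = Math.sqrt(dh * dh + dw * dw);   // Euclidean distance
        }
        // sort by distance, select the K nearest
        Arrays.sort(idx, (a, b) -> Double.compare(dist[a], dist[b]));
        // assign the most common class among them
        int votes = 0;
        for (int j = 0; j < k; j++) votes += labels[idx[j]];
        return (votes * 2 > k) ? 1 : 0;
    }

    public static void main(String[] args) {
        System.out.println(classify(new double[]{178, 90}, 3)); // prints 1 (player)
    }
}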


(scatter plot: height vs. weight)

New data: what’s an algorithm to find a good threshold?

Our model does not match the problem!

1 mistake…

if (weight > t) then "player" else "dancer"


(scatter plot: height vs. weight)

New data: what’s an algorithm to find a good threshold?

But our current model cannot represent this…

if (weight > t) then "player" else "dancer"


We need a more sophisticated model…

dancer""else player""then)(if    t  x >


Input signals are sent from other neurons.

If enough signals accumulate, the neuron fires a signal.

Connection strengths determine how the signals are accumulated.


(diagram: inputs x1, x2, x3 multiplied by weights w1, w2, w3 and added)

a = Σ_{i=1}^{M} x_i w_i

output = 1 if (a > t), else output = 0

• input signals ‘x’ and coefficients ‘w’ are multiplied
• weights correspond to connection strengths
• signals are added up – if they are enough, FIRE!

incoming signal → connection strength → activation level → output signal


Sum notation (just like a loop from 1 to M):

a = Σ_{i=1}^{M} x_i w_i

double[] x = ...
double[] w = ...

Multiply corresponding elements and add them up.

if (activation > threshold) FIRE!

Calculation…
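Written out as the loop the slide describes, in Java; the example signal and weight values are assumptions, not from the slides.

public class Neuron {
    public static void main(String[] args) {
        double[] x = {0.5, -1.0, 2.0};  // incoming signals (assumed values)
        double[] w = {0.1,  0.4, 0.3};  // connection strengths / weights (assumed)
        double t = 0.2;                 // firing threshold (assumed)

        double a = 0.0;                 // activation: a = sum over i of x[i]*w[i]
        for (int i = 0; i < x.length; i++)
            a += x[i] * w[i];           // multiply corresponding elements, add up

        int output = (a > t) ? 1 : 0;   // if (activation > threshold) FIRE!
        System.out.println("a = " + a + ", output = " + output); // a = 0.25, fires
    }
}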


if Σ_{i=1}^{M} x_i w_i > t then output = 1, else output = 0

The Perceptron Decision Rule


out"ut # $out"ut # %

t >if  0else '1then ==   output output   

  

 ∑=

i

 M 

i

iw x1

Ru&'( "la(er # $)allet dancer # %


Is this a good decision boundary?

if Σ_{i=1}^{M} x_i w_i > t then output = 1, else output = 0


w1 = 1.0
w2 = 0.2
t = 0.05

if Σ_{i=1}^{M} x_i w_i > t then output = 1, else output = 0


w1 = 2.1
w2 = 0.2
t = 0.05

if Σ_{i=1}^{M} x_i w_i > t then output = 1, else output = 0


w1 = 1.4
w2 = 0.02
t = 0.05

if Σ_{i=1}^{M} x_i w_i > t then output = 1, else output = 0


han&in& the wei&htsthreshold ma/es the decision 'oundar( move.

0ointless im"ossi'le to do it '( hand 1 onl( o/ 2or sim"le *-3 case.

We need an al&orithm4.

w$ # -%.5

 

w* # %.%6

t # %.%+


w = (0.2, 0.2, 0.5)
x = (1.0, 0.5, 2.0)
t = 1.0

(diagram: inputs x1, x2, x3 feed the neuron via weights w1, w2, w3)

a = Σ_{i=1}^{M} x_i w_i

Q1. What is the activation, a, of the neuron?
Q2. Does the neuron fire?
Q3. What if we set the threshold at 0.5 and weight w3 to zero?

Take a 20 minute break and think about this.


20 minute break


w = (0.2, 0.2, 0.5)
x = (1.0, 0.5, 2.0)
t = 1.0

(diagram: inputs x1, x2, x3 feed the neuron via weights w1, w2, w3)

a = Σ_{i=1}^{M} x_i w_i = (1.0 × 0.2) + (0.5 × 0.2) + (2.0 × 0.5) = 1.3

1. What is the activation, a, of the neuron?
2. Does the neuron fire?

if (activation > threshold) output = 1, else output = 0

… So yes, it fires!


w = (0.2, 0.2, 0.5)
x = (1.0, 0.5, 2.0)
t = 1.0

(diagram: inputs x1, x2, x3 feed the neuron via weights w1, w2, w3)

a = Σ_{i=1}^{M} x_i w_i = (1.0 × 0.2) + (0.5 × 0.2) + (2.0 × 0.0) = 0.3

3. What if we set the threshold at 0.5 and weight w3 to zero?

if (activation > threshold) output = 1, else output = 0

… So no, it does not fire!


We need a more sophisticated model…

(scatter plot: height vs. weight)

if ( weight > t ) then "player" else "dancer"

x1 = height (cm)
x2 = weight (kg)

if ( f(x) > t ) then "player" else "dancer"

f(x) = (w1 × x1) + (w2 × x2) = Σ_{i=1}^{M} w_i x_i

The Perceptron


The Perceptron

f(x) = (w1 × x1) + (w2 × x2) = Σ_{i=1}^{M} w_i x_i

(scatter plots: height vs. weight, showing the decision boundary)

if ( f(x) > t ) then "player" else "dancer"

w1, w2, and t change the position of the DECISION BOUNDARY.
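To see why: the boundary is exactly the set of points where f(x) = t, i.e. the line w1·x1 + w2·x2 = t. Rearranged, x2 = (t − w1·x1) / w2, a straight line whose slope is set by w1 and w2 and whose offset is shifted by t. That is why tuning those three numbers moves the boundary around.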


The Perceptron

Error function: Number of mistakes (a.k.a. classification error)

(scatter plot: height vs. weight)

Model:

if Σ_{i=1}^{M} w_i x_i > t then ŷ = 1 ("player"), else ŷ = 0 ("dancer")

Learning algo: …we need to optimise the w and t values…


Perceptron Learning Rule

new weight = old weight + 0.1 × ( trueLabel − output ) × input

if ( target = 1, output = 1 ) … then update = ?
if ( target = 1, output = 0 ) … then update = ?
if ( target = 0, output = 1 ) … then update = ?
if ( target = 0, output = 0 ) … then update = ?

What weight updates do these cases produce?
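Working the four cases through in Java (a sketch; the 0.1 learning rate comes from the slide, the example input value is an assumption):

public class LearningRuleCases {
    public static void main(String[] args) {
        double input = 1.0;                               // an example input value
        int[][] cases = {{1, 1}, {1, 0}, {0, 1}, {0, 0}}; // {target, output}
        for (int[] c : cases) {
            double update = 0.1 * (c[0] - c[1]) * input;  // the learning rule
            System.out.println("target=" + c[0] + " output=" + c[1]
                               + " -> update = " + update);
        }
    }
}

The two correct cases give update = 0; a missed fire (target 1, output 0) nudges the weight up by 0.1 × input, and a false fire nudges it down.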


initialise weights to random numbers in range -1 to +1
for n = 1 to NUM_ITERATIONS
    for each training example (x, y)
        calculate activation
        for each weight
            update weight by learning rule
        end
    end
end

Perceptron convergence theorem:

If the data is linearly separable, then application of the Perceptron learning rule will find a separating decision boundary, within a finite number of iterations.

Learning algorithm for the Perceptron
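A runnable Java version of the pseudocode above, as a sketch only: the AND-gate training set, 100 iterations, the 0.1 learning rate, and the threshold update (treating t like an extra weight) are assumptions beyond what the slide states.

import java.util.Random;

public class PerceptronTrain {
    public static void main(String[] args) {
        double[][] X = {{0, 0}, {0, 1}, {1, 0}, {1, 1}};  // linearly separable
        int[] y = {0, 0, 0, 1};                           // AND labels (assumed task)
        Random rng = new Random(0);

        // initialise weights (and threshold) to random numbers in range -1 to +1
        double[] w = {rng.nextDouble() * 2 - 1, rng.nextDouble() * 2 - 1};
        double t = rng.nextDouble() * 2 - 1;

        for (int n = 0; n < 100; n++) {                   // NUM_ITERATIONS
            for (int e = 0; e < X.length; e++) {          // each training example
                double a = 0.0;                           // calculate activation
                for (int i = 0; i < w.length; i++) a += w[i] * X[e][i];
                int output = (a > t) ? 1 : 0;
                for (int i = 0; i < w.length; i++)        // each weight
                    w[i] += 0.1 * (y[e] - output) * X[e][i];  // learning rule
                t -= 0.1 * (y[e] - output);               // threshold, updated as a bias
            }
        }
        System.out.println("w = " + w[0] + ", " + w[1] + "   t = " + t);
    }
}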


Model (if… then…)

Testing Data (no labels)

Training data

Predicted Labels

Learning algorithm (search for good parameters)

Supervised Learning Pipeline for Perceptron


New data… non-linearly separable

(scatter plot: height vs. weight)

Our model does not match the problem! (AGAIN!)

Many mistakes!

if Σ_{i=1}^{M} w_i x_i > t then "player" else "dancer"


Multilayer Perceptron

(diagram: a network with inputs x1, x2, x3, x4, x5)


MLP decision boundary – nonlinear problems, solved!

(scatter plot: height vs. weight, with a nonlinear decision boundary)


Neural Networks - summary

Perceptrons are a (simple) emulation of a neuron.

Layering perceptrons gives you… a multilayer perceptron. An MLP is one type of neural network – there are others.

An MLP with sigmoid activation functions can solve highly nonlinear problems.
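To see what layering buys us, here is a hedged Java sketch of a forward pass through a tiny MLP with sigmoid activations. The XOR task and the hand-picked weights are illustrative assumptions; in practice the weights come from training, not by hand.

public class TinyMLP {
    static double sigmoid(double a) { return 1.0 / (1.0 + Math.exp(-a)); }

    public static void main(String[] args) {
        double[][] X = {{0, 0}, {0, 1}, {1, 0}, {1, 1}};
        // hidden layer: two sigmoid units; output layer: one sigmoid unit
        double[][] wH = {{20, 20}, {-20, -20}};
        double[] bH = {-10, 30};
        double[] wO = {20, 20};
        double bO = -30;

        for (double[] x : X) {
            double h0 = sigmoid(wH[0][0]*x[0] + wH[0][1]*x[1] + bH[0]); // acts like OR
            double h1 = sigmoid(wH[1][0]*x[0] + wH[1][1]*x[1] + bH[1]); // acts like NAND
            double out = sigmoid(wO[0]*h0 + wO[1]*h1 + bO);             // AND of the two
            System.out.println(x[0] + " XOR " + x[1] + " ≈ " + Math.round(out));
        }
    }
}

A single perceptron cannot draw a boundary that separates XOR, but two hidden units plus an output unit can, which is the nonlinear-boundary point made above.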

Downside – we cannot use the simple perceptron learning algorithm.

Instead we have the backpropagation algorithm.

This is outside the scope of this introductory course.
