
Derivation of a Learning Rule for Perceptrons


Page 1

Dr.-Ing. Erwin Sitompul, President University

Lecture 3

Introduction to Neural Networks and Fuzzy Logic


Page 2

Derivation of a Learning Rule for Perceptrons
Neural Networks: Single Layer Perceptrons

Adaline (Adaptive Linear Element), Widrow [1962]

[Figure: Adaline neuron. The inputs $x_1, x_2, \ldots, x_m$ are weighted by $w_{k1}, w_{k2}, \ldots, w_{km}$ and summed to give $net_k = \mathbf{w}_k^T\mathbf{x}$; the output is $y_k = \mathbf{w}_k^T\mathbf{x}$.]

Goal:
$$y_k = \mathbf{w}_k^T\mathbf{x} = d_k$$

Page 3

Least Mean Squares (LMS)
Neural Networks: Single Layer Perceptrons

The following cost function (error function) should be minimized:

$$E(\mathbf{w}_k) = \frac{1}{2}\sum_{i=1}^{p}\big(d_k(i) - y_k(i)\big)^2 = \frac{1}{2}\sum_{i=1}^{p}\big(d_k(i) - \mathbf{w}_k^T\mathbf{x}(i)\big)^2 = \frac{1}{2}\sum_{i=1}^{p}\Big(d_k(i) - \sum_{j=1}^{m} w_{kj}\,x_j(i)\Big)^2$$

i : index of the data set, the i-th data pair
j : index of the input, the j-th input
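As a quick numerical illustration (a minimal sketch; the variable names are mine, and the data is borrowed from Homework 2 later in this lecture), the cost for a given weight vector can be evaluated in MATLAB as:

  % LMS cost of one linear neuron (Adaline) over p data pairs
  X = [2 1; 3 1];            % m x p input matrix: column i holds x(i)
  d = [5 2];                 % 1 x p desired outputs d_k(i)
  w = [1; 1.5];              % m x 1 weight vector w_k
  y = w' * X;                % actual outputs y_k(i) = w_k' * x(i)
  E = 0.5 * sum((d - y).^2)  % cost E(w_k)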

Page 4

Adaline Learning Rule
Neural Networks: Single Layer Perceptrons

With

$$E(\mathbf{w}_k) = \frac{1}{2}\sum_{i=1}^{p}\Big(d_k(i) - \sum_{j=1}^{m} w_{kj}\,x_j(i)\Big)^2,$$

then

$$\nabla E(\mathbf{w}_k) = \frac{\partial E(\mathbf{w}_k)}{\partial \mathbf{w}_k} = \left[\frac{\partial E(\mathbf{w}_k)}{\partial w_{k1}}, \frac{\partial E(\mathbf{w}_k)}{\partial w_{k2}}, \ldots, \frac{\partial E(\mathbf{w}_k)}{\partial w_{km}}\right]^T$$

As already obtained before, the weight modification rule is

$$\Delta\mathbf{w}_k = -\eta\,\nabla E(\mathbf{w}_k)$$

The components of the gradient are

$$\frac{\partial E(\mathbf{w}_k)}{\partial w_{kj}} = -\sum_{i=1}^{p}\big(d_k(i) - y_k(i)\big)\,x_j(i)$$

Defining $\varepsilon_k(i) = d_k(i) - y_k(i)$, we can write

$$\frac{\partial E(\mathbf{w}_k)}{\partial w_{kj}} = -\sum_{i=1}^{p}\varepsilon_k(i)\,x_j(i)$$

Page 5

Adaline Learning Modes
Neural Networks: Single Layer Perceptrons

Batch Learning Mode: the weights are updated once after all p data pairs have been presented,

$$\Delta w_{kj} = \eta\sum_{i=1}^{p}\varepsilon_k(i)\,x_j(i)$$

Incremental Learning Mode: the weights are updated after each data pair,

$$\Delta w_{kj} = \eta\,\varepsilon_k\,x_j$$
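Both modes can be written in a few lines of MATLAB (an illustrative sketch; the data and variable names are mine):

  % Adaline weight update, batch vs. incremental mode
  X = [2 1; 3 1];  d = [5 2];        % inputs (columns) and targets
  w = [1; 1.5];    eta = 0.01;       % initial weights, learning rate

  % Batch mode: one update from the sum over all p data pairs
  eps_k = d - w' * X;                % errors eps_k(i), i = 1..p
  w = w + eta * (X * eps_k');        % dw_kj = eta * sum_i eps_k(i)*x_j(i)

  % Incremental mode: one update per data pair
  for i = 1:size(X, 2)
      eps_i = d(i) - w' * X(:, i);
      w = w + eta * eps_i * X(:, i); % dw_kj = eta * eps_k * x_j
  end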

Page 6

Tangent Sigmoid Activation Function
Neural Networks: Single Layer Perceptrons

$$f(net_k) = \frac{2}{1 + e^{-a\,net_k}} - 1$$

[Figure: neuron with inputs $x_1, x_2, \ldots, x_m$, weights $w_{k1}, w_{k2}, \ldots, w_{km}$, summation $net_k = \mathbf{w}_k^T\mathbf{x}$, and output $y_k = f(\mathbf{w}_k^T\mathbf{x})$.]

Goal:
$$y_k = f(\mathbf{w}_k^T\mathbf{x}) = d_k$$

Page 7

Logarithmic Sigmoid Activation Function
Neural Networks: Single Layer Perceptrons

$$f(net_k) = \frac{1}{1 + e^{-a\,net_k}}$$

[Figure: neuron with inputs $x_1, x_2, \ldots, x_m$, weights $w_{k1}, w_{k2}, \ldots, w_{km}$, summation $net_k = \mathbf{w}_k^T\mathbf{x}$, and output $y_k = f(\mathbf{w}_k^T\mathbf{x})$.]

Goal:
$$y_k = f(\mathbf{w}_k^T\mathbf{x}) = d_k$$
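Both activations are easy to define in MATLAB (a sketch; a is the slope parameter from the slides):

  a = 0.5;                                      % slope parameter (example value)
  tansig_f = @(net) 2 ./ (1 + exp(-a*net)) - 1; % tangent sigmoid, range (-1, 1)
  logsig_f = @(net) 1 ./ (1 + exp(-a*net));     % logarithmic sigmoid, range (0, 1)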

Page 8

Derivation of Learning Rules
Neural Networks: Single Layer Perceptrons

For an arbitrary activation function, the cost function to be minimized is

$$E(\mathbf{w}_k) = \frac{1}{2}\sum_{i=1}^{p}\big(d_k(i) - y_k(i)\big)^2 = \frac{1}{2}\sum_{i=1}^{p}\big(d_k(i) - f(\mathbf{w}_k^T\mathbf{x}(i))\big)^2 = \frac{1}{2}\sum_{i=1}^{p}\Big(d_k(i) - f\Big(\sum_{j=1}^{m} w_{kj}\,x_j(i)\Big)\Big)^2$$

The weight modification rule is again

$$\Delta\mathbf{w}_k = -\eta\,\nabla E(\mathbf{w}_k), \qquad \Delta w_{kj} = -\eta\,\frac{\partial E(\mathbf{w}_k)}{\partial w_{kj}}$$

Page 9

Derivation of Learning Rules
Neural Networks: Single Layer Perceptrons

$$E(\mathbf{w}_k) = \frac{1}{2}\sum_{i=1}^{p}\big(d_k(i) - y_k(i)\big)^2, \qquad \Delta w_{kj} = -\eta\,\frac{\partial E(\mathbf{w}_k)}{\partial w_{kj}}$$

Applying the chain rule,

$$\frac{\partial E(\mathbf{w}_k)}{\partial w_{kj}} = \frac{\partial E(\mathbf{w}_k)}{\partial y_k(i)}\,\frac{\partial y_k(i)}{\partial w_{kj}} = -\sum_{i=1}^{p}\big(d_k(i) - y_k(i)\big)\,\frac{\partial y_k(i)}{\partial w_{kj}}$$

where

$$y_k(i) = f(\mathbf{w}_k^T\mathbf{x}(i)) = f(net_k(i)), \qquad net_k(i) = \sum_{j=1}^{m} w_{kj}\,x_j(i)$$

so that

$$\frac{\partial y_k(i)}{\partial w_{kj}} = \frac{\partial f(net_k(i))}{\partial net_k(i)}\,\frac{\partial net_k(i)}{\partial w_{kj}} = \frac{\partial f(net_k(i))}{\partial net_k(i)}\,x_j(i)$$

Therefore

$$\frac{\partial E(\mathbf{w}_k)}{\partial w_{kj}} = -\sum_{i=1}^{p}\big(d_k(i) - y_k(i)\big)\,\frac{\partial f(net_k(i))}{\partial net_k(i)}\,x_j(i)$$

The factor $\partial f(net_k)/\partial net_k$ depends on the activation function used.

Page 10

Derivation of Learning Rules
Neural Networks: Single Layer Perceptrons

$$\frac{\partial E(\mathbf{w}_k)}{\partial w_{kj}} = -\sum_{i=1}^{p}\big(d_k(i) - y_k(i)\big)\,\frac{\partial f(net_k(i))}{\partial net_k(i)}\,x_j(i), \qquad y_k(i) = f(net_k(i))$$

Linear:
$$f(net_k) = a\,net_k \quad\Rightarrow\quad \frac{\partial f(net_k)}{\partial net_k} = a$$

Logarithmic sigmoid:
$$f(net_k) = \frac{1}{1 + e^{-a\,net_k}} \quad\Rightarrow\quad \frac{\partial f(net_k)}{\partial net_k} = a\,y_k(1 - y_k)$$

Tangent sigmoid:
$$f(net_k) = \frac{2}{1 + e^{-a\,net_k}} - 1 \quad\Rightarrow\quad \frac{\partial f(net_k)}{\partial net_k} = \frac{a}{2}\,(1 - y_k^2)$$
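As a check of the logarithmic-sigmoid case (a standard computation, not shown on the slide):

$$\frac{\partial f(net_k)}{\partial net_k} = \frac{a\,e^{-a\,net_k}}{\big(1 + e^{-a\,net_k}\big)^2} = a\cdot\frac{1}{1 + e^{-a\,net_k}}\cdot\frac{e^{-a\,net_k}}{1 + e^{-a\,net_k}} = a\,y_k(1 - y_k),$$

since $y_k = 1/(1 + e^{-a\,net_k})$ and $1 - y_k = e^{-a\,net_k}/(1 + e^{-a\,net_k})$.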

Page 11

Derivation of Learning Rules
Neural Networks: Single Layer Perceptrons

$$\Delta w_{kj} = -\eta\,\frac{\partial E(\mathbf{w}_k)}{\partial w_{kj}} = \eta\sum_{i=1}^{p}\varepsilon_k(i)\,\frac{\partial f(net_k(i))}{\partial net_k(i)}\,x_j(i), \qquad \varepsilon_k(i) = d_k(i) - y_k(i)$$

Linear:
$$\Delta w_{kj} = \eta\sum_{i=1}^{p}\varepsilon_k(i)\,x_j(i)\,a$$

Logarithmic sigmoid:
$$\Delta w_{kj} = \eta\sum_{i=1}^{p}\varepsilon_k(i)\,x_j(i)\,a\,y_k(i)\big(1 - y_k(i)\big)$$

Tangent sigmoid:
$$\Delta w_{kj} = \eta\sum_{i=1}^{p}\varepsilon_k(i)\,x_j(i)\,\frac{a}{2}\big(1 - y_k^2(i)\big)$$
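For example, the batch update for a logarithmic-sigmoid neuron in MATLAB (an illustrative sketch; the data and target values are hypothetical):

  % One batch weight update for a log-sigmoid neuron
  a = 0.5;  eta = 0.01;
  X = [2 1; 3 1];  d = [0.9 0.2];         % targets must lie in (0, 1)
  w = [1; 1.5];
  y     = 1 ./ (1 + exp(-a * (w' * X)));  % y_k(i) = f(net_k(i))
  eps_k = d - y;                          % errors eps_k(i)
  dfdn  = a * y .* (1 - y);               % df/dnet for each data pair
  w     = w + eta * (X * (eps_k .* dfdn)'); % dw_kj = eta*sum_i eps*f'*x_j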

Page 12

Homework 2
Neural Networks: Single Layer Perceptrons

[Figure: one neuron with inputs $x_1, x_2$, weights $w_{11}, w_{12}$, and output $y_1 = \mathbf{w}_1^T\mathbf{x}$.]

Given a neuron with linear activation function (a = 0.5), write an m-file that will calculate the weights w11 and w12 so that for the input [x1; x2] the neuron output matches y1 as closely as possible.

Case 1: [x1; x2] = [2; 3], [y1] = [5]
Case 2: [x1; x2] = [[2 1]; [3 1]], [y1] = [5 2]

Use the initial values w11 = 1 and w12 = 1.5, and η = 0.01. Determine the required number of iterations.
Note: Submit the m-file in hardcopy and softcopy.

• Arief, Lukas, Rinald
• Dian, Edwind, Kartika, Richardo
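A possible starting point for the m-file (a sketch only, using the Case 2 data; the stopping tolerance is my own choice, not specified on the slide):

  % Homework 2 starting-point sketch: LMS training of one linear neuron
  a = 0.5;  eta = 0.01;  tol = 1e-6;   % tol: assumed stopping criterion
  X = [2 1; 3 1];  d = [5 2];          % Case 2 (Case 1: X = [2; 3], d = 5)
  w = [1; 1.5];                        % initial weights [w11; w12]
  for n = 1:100000
      y     = a * (w' * X);            % linear activation f(net) = a*net
      eps_k = d - y;
      w     = w + eta * a * (X * eps_k');  % batch update, f'(net) = a
      if 0.5 * sum(eps_k.^2) < tol
          break                        % converged after n iterations
      end
  end
  fprintf('Iterations: %d, w11 = %.4f, w12 = %.4f\n', n, w(1), w(2))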

Page 13

MLP Architecture
Neural Networks: Multi Layer Perceptrons

[Figure: MLP with inputs $x_1, x_2, x_3$ entering the input layer, weights $w_{ji}$ (input to hidden), $w_{kj}$ (hidden to hidden), and $w_{lk}$ (hidden to output), and outputs $y_1, y_2$ leaving the output layer.]

• Possesses sigmoid activation functions in the neurons to enable the modeling of nonlinearity.
• Contains one or more "hidden layers".
• Trained using the "Backpropagation" algorithm.

Page 14

Advantages of MLP
Neural Networks: Multi Layer Perceptrons


• An MLP with one hidden layer is a universal approximator: it can approximate any function within any preset accuracy, on the condition that the weights and the biases are appropriately assigned through the use of an adequate learning algorithm.
• MLP can be applied directly in the identification and control of dynamic systems with a nonlinear relationship between input and output.
• MLP delivers the best compromise between the number of parameters, structure complexity, and calculation cost.

Page 15

Learning Algorithm of MLP
Neural Networks: Multi Layer Perceptrons

[Figure: signal flow through an MLP. Function signals propagate forward through the neurons f(·) (forward propagation); error signals propagate backward (backward propagation).]

Computations at each neuron j:
• Neuron output, $y_j$
• Vector of error gradient, $\partial E/\partial w_{ji}$

This is the "Backpropagation Learning Algorithm".

Page 16

Backpropagation Learning Algorithm

If node j is an output node,

[Figure: the input signal $y_i(n)$ is weighted by $w_{ji}(n)$ and summed into $net_j(n)$; the activation f(·) gives the output $y_j(n)$, which is subtracted from the desired output $d_j(n)$ to give the error $e_j(n)$.]
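The update equations were rendered as images on the original slide; in the standard backpropagation notation used here they read (a reconstruction, not a verbatim copy of the slide):

$$e_j(n) = d_j(n) - y_j(n), \qquad \delta_j(n) = e_j(n)\,\frac{\partial f(net_j(n))}{\partial net_j(n)}, \qquad \Delta w_{ji}(n) = \eta\,\delta_j(n)\,y_i(n)$$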

Page 17

Backpropagation Learning Algorithm

If node j is a hidden node,

[Figure: the signal $y_i(n)$ is weighted by $w_{ji}(n)$ and summed into $net_j(n)$, giving the hidden output $y_j(n)$; this is weighted by $w_{kj}(n)$ and summed into $net_k(n)$, giving the output $y_k(n)$, which is subtracted from $d_k(n)$ to give the error $e_k(n)$.]
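Again reconstructing the standard equations that appeared as images: the local gradient of a hidden node is obtained from the deltas of the layer to its right,

$$\delta_j(n) = \frac{\partial f(net_j(n))}{\partial net_j(n)}\sum_{k}\delta_k(n)\,w_{kj}(n), \qquad \Delta w_{ji}(n) = \eta\,\delta_j(n)\,y_i(n)$$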

Page 18

MLP Training

Forward Pass
• Fix $w_{ji}(n)$
• Compute $y_j(n)$

[Figure: function signals flow from the left (node i) to the right (node k).]

Backward Pass
• Calculate $\delta_j(n)$
• Update weights $w_{ji}(n+1)$

[Figure: error signals flow from the right (node k) back to the left (node i).]
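Putting the two passes together, one training iteration of a one-hidden-layer MLP might look as follows in MATLAB (an illustrative sketch; the network size, data, and unit slope a = 1 are my assumptions):

  % One forward/backward pass for a 1-hidden-layer MLP
  eta = 0.1;
  x = [0.5; -0.3; 0.8];  d = [1; 0];        % one input pattern and target
  W1 = 0.2 * rand(4, 3) - 0.1;              % w_ji: input -> hidden
  W2 = 0.2 * rand(2, 4) - 0.1;              % w_kj: hidden -> output
  f  = @(net) 1 ./ (1 + exp(-net));         % log-sigmoid, a = 1

  % Forward pass: fix the weights, compute the neuron outputs
  yj = f(W1 * x);                           % hidden outputs y_j(n)
  yk = f(W2 * yj);                          % network outputs y_k(n)

  % Backward pass: compute the deltas, update the weights
  ek      = d - yk;                         % output errors e_k(n)
  delta_k = ek .* yk .* (1 - yk);           % delta_k = e_k * f'(net_k)
  delta_j = (W2' * delta_k) .* yj .* (1 - yj);  % hidden deltas
  W2 = W2 + eta * delta_k * yj';            % dw_kj = eta * delta_k * y_j
  W1 = W1 + eta * delta_j * x';             % dw_ji = eta * delta_j * y_i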