NN – cont.
Alexandra I. Cristea, USI intensive course “Adaptive Systems”, April-May 2003
• We have seen how the neuron computes; let's now see:
– What can it compute?
– How can it learn?
What does the neuron compute?
Perceptron, discrete neuron
• First, a simple case:
– no hidden layers
– only one neuron
– get rid of the threshold: b becomes w0
– Y is a Boolean function: reaching the threshold fires (1), staying below it doesn't fire (0)
Threshold function f

[Figure: the step function f, which jumps from 0 to 1 at the threshold t = 1; with the bias weight w0 = -t = -1.]
Y = X1 or X2

W1 = 1, W2 = 1, t = 1

X1  X2 | Y
 0   0 | 0
 0   1 | 1
 1   0 | 1
 1   1 | 1

[Figure: neuron f with inputs X1, X2 and threshold t = 1.]
Y = X1 and X2

W1 = 0.5, W2 = 0.5, t = 1

X1  X2 | Y
 0   0 | 0
 0   1 | 0
 1   0 | 0
 1   1 | 1

[Figure: neuron f with inputs X1, X2 and threshold t = 1.]
Y = or(x1, …, xn)

w1 = w2 = … = wn = 1, t = 1

[Figure: neuron f with n inputs.]
Y = and(x1, …, xn)

w1 = w2 = … = wn = 1/n, t = 1

[Figure: neuron f with n inputs.]
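Putting the last four slides in code: a minimal Python sketch of the discrete neuron, using the firing condition "weighted sum >= t" so that the AND weights above work exactly at the boundary (the function name is illustrative):

def fires(weights, inputs, t=1.0):
    # Discrete perceptron: fire (1) iff the weighted input sum reaches the threshold t.
    return 1 if sum(w * x for w, x in zip(weights, inputs)) >= t else 0

# OR:  w1 = w2 = 1, t = 1
print([fires([1, 1], x) for x in [(0, 0), (0, 1), (1, 0), (1, 1)]])      # [0, 1, 1, 1]
# AND: w1 = w2 = 1/n = 0.5, t = 1
print([fires([0.5, 0.5], x) for x in [(0, 0), (0, 1), (1, 0), (1, 1)]])  # [0, 0, 0, 1]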
What are we actually doing?
w0 + w1*X1 + w2*X2

[Slide: the OR and AND truth tables from the previous examples, shown next to the linear function that realizes them. Example weight settings:
W0 = -1; W1 = 7;   W2 = 9
W0 = -1; W1 = 0.7; W2 = 0.9
W0 = 1;  W1 = 7;   W2 = 9]
Linearly Separable Set

[Figures: point sets in the (x1, x2) plane, each split into two classes by a line w0 + w1*x1 + w2*x2 = 0. Example weight settings:
w0 = -1, w1 = -0.67, w2 = 1
w0 = -1, w1 = 0.25,  w2 = -0.1
w0 = -1, w1 = 0.25,  w2 = 0.04
w0 = -1, w1 = 0.167, w2 = 0.1]
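Classification here is just checking which side of the line a point falls on; a small sketch using the first weight setting above (the test points are made up):

def side(w0, w1, w2, x1, x2):
    # 1 if the point lies on the positive side of w0 + w1*x1 + w2*x2 = 0, else 0
    return 1 if w0 + w1 * x1 + w2 * x2 > 0 else 0

w0, w1, w2 = -1, -0.67, 1
for point in [(0.5, 2.0), (2.0, 0.5)]:       # illustrative points on either side of the line
    print(point, side(w0, w1, w2, *point))   # (0.5, 2.0) -> 1, (2.0, 0.5) -> 0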
Non-Linearly Separable Set

[Figures: point sets in the (x1, x2) plane for which no line w0 + w1*x1 + w2*x2 = 0 separates the two classes; the weight fields w0 = , w1 = , w2 = are left blank because no such weights exist.]
Perceptron Classification Theorem
A finite set X can be classified correctly by a one-layer perceptron if and only if it is linearly separable.
Typical non-linearly separable set: Y = XOR(x1, x2)

[Figure: the four points (0,0), (1,0), (0,1), (1,1) in the (x1, x2) plane, with Y = 1 at (0,1) and (1,0) and Y = 0 at (0,0) and (1,1); no line w0 + w1*x1 + w2*x2 = 0 separates the two classes.]
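The theorem can be illustrated by brute force: scanning a grid of weight triples finds a separating line for OR almost immediately, but none for XOR (the grid resolution is an arbitrary choice):

import itertools

def separates(w0, w1, w2, table):
    # True iff the line classifies every (x1, x2) -> y pair in the table correctly
    return all((w0 + w1 * x1 + w2 * x2 > 0) == (y == 1) for (x1, x2), y in table)

OR_SET  = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
XOR_SET = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
grid = [i / 4 for i in range(-8, 9)]     # candidate weights from -2 to 2 in steps of 0.25
for name, table in [("OR", OR_SET), ("XOR", XOR_SET)]:
    found = any(separates(*w, table) for w in itertools.product(grid, repeat=3))
    print(name, "separable on this grid:", found)   # OR: True, XOR: False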
How does the neuron learn?
Learning: weight computation

For Y = X1 and X2 (only the input pair (1,1) may fire), the weights must satisfy:
• W1*(X1=1) + W2*(X2=1) >= (t=1)
• W1*(X1=0) + W2*(X2=1) <  (t=1)
• W1*(X1=1) + W2*(X2=0) <  (t=1)
• W1*(X1=0) + W2*(X2=0) <  (t=1)

[Figure: the line W1*X1 + W2*X2 = t in the (X1, X2) plane.]
Perceptron Learning Rule (incremental version)

FOR i := 0 TO n DO wi := random initial value ENDFOR;
REPEAT
  select a pair (x, t) in X;
  (* each pair must have a positive probability of being selected *)
  IF wT * x' > 0 THEN y := 1 ELSE y := 0 ENDIF;
  IF y ≠ t THEN
    FOR i := 0 TO n DO wi := wi + (t - y) * xi' ENDFOR
  ENDIF;
UNTIL X is correctly classified
ROSENBLATT (1962)
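A runnable sketch of Rosenblatt's rule in Python, assuming x' denotes the input extended with a constant 1 so that w0 acts as the bias; the dataset and epoch limit are illustrative:

import random

def perceptron_learn(X, max_epochs=1000):
    # X: list of (inputs, target) pairs, inputs a tuple of 0/1 values
    n = len(X[0][0])
    w = [random.uniform(-1, 1) for _ in range(n + 1)]   # w[0] is the bias weight
    for _ in range(max_epochs):
        errors = 0
        for x, t in X:
            xp = (1,) + tuple(x)                        # x' = input extended with constant 1
            y = 1 if sum(wi * xi for wi, xi in zip(w, xp)) > 0 else 0
            if y != t:                                  # misclassified: adjust w
                w = [wi + (t - y) * xi for wi, xi in zip(w, xp)]
                errors += 1
        if errors == 0:                                 # X is correctly classified
            return w
    return None                                         # never converges for, e.g., XOR

AND_SET = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
print(perceptron_learn(AND_SET))                        # some separating weight vector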
Idea of the Perceptron Learning Rule

wi := wi + (t - y) * xi'

• If t = 1, y = 0 (wT x' <= 0): wnew = w + x'
  (w changes in the direction of the input)
• If t = 0, y = 1 (wT x' > 0): wnew = w - x'
  (w changes away from the direction of the input)

[Figure: the weight vector w tilted toward (+) or away from (-) the input vector x'.]
For multi-layered perceptrons with continuous neurons, a simple and successful learning algorithm exists: backpropagation (BKP).
BKP: Error

[Figure: a network with input layer, hidden layer, and output layer; the outputs y1, …, y4 are compared with the desired values d1, …, d4.]

e1 = d1 - y1
e2 = d2 - y2
e3 = d3 - y3
e4 = d4 - y4

Hidden layer error: ??
Synapse

W: weight

[Figure: neuron1 → neuron2, connected by a synapse with weight w.]

neuron1 carries the value y1; neuron2 receives y2 = w * y1
(the values y1, y2 are internal activations)

Forward propagation: the weight serves as an amplifier!
Inverse Synapse

W: weight

[Figure: neuron1 ← neuron2; the error travels backward over the same synapse.]

neuron2 carries the error e2; neuron1 receives e1 = ????
(the values e1, e2 are errors)

Backward propagation: the weight serves as an amplifier!
Inverse Synapse

W: weight

[Figure: neuron1 ← neuron2; the error travels backward over the same synapse.]

neuron2 carries the error e2; neuron1 receives e1 = w * e2
(the values e1, e2 are errors)

Backward propagation: the weight serves as an amplifier!
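The symmetry between the two slides fits in three lines of Python (the values are made up):

w = 0.8                   # synapse weight
y1 = 0.5; y2 = w * y1     # forward: the activation flows neuron1 -> neuron2
e2 = 0.3; e1 = w * e2     # backward: the error flows neuron2 -> neuron1, scaled by the same weight
print(y2, e1)             # ≈ 0.4 0.24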
BKP: Error (revisited)

The same network again: the output errors e1 = d1 - y1, …, e4 = d4 - y4 are known, and the hidden layer error can now be computed by propagating them backward through the synapses.
Backpropagation to the hidden layer

[Figure: input I1 → hidden layer (O2, I2) → output O1; a hidden neuron j feeds the output neurons through weights w1, w2, w3 and collects their errors e1, e2, e3.]

Backpropagation: ee[j] = Σ_i e[i] * w[j,i]
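In matrix form this sum is a single product; a small NumPy sketch with made-up numbers:

import numpy as np

W = np.array([[0.2, -0.5, 0.7],     # w[j, i]: row j = hidden neuron, column i = output neuron
              [0.4,  0.1, -0.3]])
e = np.array([0.1, -0.2, 0.05])     # output errors e[i] = d[i] - y[i]
ee = W @ e                          # ee[j] = Σ_i e[i] * w[j, i]
print(ee)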
Update rule for the 2 weight types

• ① weights from I2 (hidden layer) to O1 (system output)
• ② weights from I1 (system input) to O2 (hidden layer)

① Δw[j,i] = α (d[i] - y[i]) f'(S[i]) h[j] = α e[i] h[j]
   (with the simplification f' = 1, e.g. for a repeater),
   where S[i] = Σ_j w[j,i](t) h[j]

② Δw[k,j] = α (Σ_i e[i] w[j,i]) f'(S[j]) x[k] = α ee[j] x[k],
   where S[j] = Σ_k w[k,j](t) x[k]
Backpropagation algorithm

FOR s := 1 TO r DO W_s := initial matrix (often random) ENDFOR;
REPEAT
  select a pair (x, t) in X;
  y_0 := x;
  # forward phase: compute the actual output y_s of each layer for input x
  FOR s := 1 TO r DO y_s := F(W_s y_{s-1}) ENDFOR;   # y_r is the output vector of the network
  # backpropagation phase: propagate the errors back through the network
  # and adapt the weights of all layers
  d_r := F_r' (t - y_r);
  FOR s := r DOWNTO 2 DO
    d_{s-1} := F_{s-1}' W_s^T d_s;
    W_s := W_s + d_s y_{s-1}^T
  ENDFOR;
  W_1 := W_1 + d_1 y_0^T
UNTIL stop criterion
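A compact runnable NumPy sketch of this algorithm, with r = 2 layers of sigmoid neurons (so F'(S) = F(S)(1 - F(S))), a learning rate α written explicitly (the slide folds it into d_s), and XOR as training data; network size, seed, and iteration count are arbitrary choices, and convergence may vary with the seed:

import numpy as np

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)   # XOR targets
X1 = np.hstack([X, np.ones((4, 1))])              # append constant 1 so the bias lives in W_1

W1 = rng.uniform(-1, 1, size=(3, 3))              # input -> hidden (3 hidden neurons)
W2 = rng.uniform(-1, 1, size=(1, 3))              # hidden -> output
alpha = 0.5

for step in range(20000):
    i = rng.integers(len(X1))                     # select a pair (x, t) in X
    y0 = X1[i]                                    # forward phase
    y1 = sigmoid(W1 @ y0)
    y2 = sigmoid(W2 @ y1)
    d2 = (y2 * (1 - y2)) * (T[i] - y2)            # d_r := F_r'(t - y_r)
    d1 = (y1 * (1 - y1)) * (W2.T @ d2)            # d_{s-1} := F_{s-1}' W_s^T d_s
    W2 += alpha * np.outer(d2, y1)                # W_s := W_s + α d_s y_{s-1}^T
    W1 += alpha * np.outer(d1, y0)

print(sigmoid(W2 @ sigmoid(W1 @ X1.T)))           # ≈ [[0, 1, 1, 0]]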
Conclusion
• We have seen Boolean function representation with a single-layer perceptron
• We have seen a learning algorithm for the SLP (the perceptron learning rule)
• We have seen a learning algorithm for the MLP (backpropagation, BP)
• So, neurons can represent knowledge AND learn!