Page 1:

Other NN Models

• Reinforcement learning (RL)
• Probabilistic neural networks
• Support vector machine (SVM)

Page 2:

Reinforcement learning (RL)

• Basic ideas:
  – Supervised learning (delta rule, BP):
    • samples (x, f(x)) are used to learn the function f(·)
    • a precise error can be determined and is used to drive the learning
  – Unsupervised learning (competitive, SOM, BM):
    • no target/desired output is provided to help learning
    • learning is self-organized (clustering)
  – Reinforcement learning: in between the two
    • no target output for the input vectors in the training samples
    • a judge/critic evaluates the output:
      good: reward signal (+1); bad: penalty signal (-1)

Page 3:

• RL exists in many places
  – It originated in psychology (the conditioned reflex).
  – In many applications it is much easier to determine good/bad, right/wrong, acceptable/unacceptable than to provide a precise correct answer/error.
  – It is up to the learning process to improve the system's performance based on the critic's signal.
  – The machine learning community has developed different theories and algorithms.
    A major difficulty is credit/blame assignment, e.g.:
      chess playing: only win/lose feedback over many steps
      soccer playing: only win/lose feedback shared by many players

Page 4:

• Principle of RL
  – Let r = +1 be the reward signal (good output) and
    r = -1 be the penalty signal (bad output).
  – If r = +1, the system is encouraged to continue what it is doing;
    if r = -1, the system is encouraged not to do what it is doing.
  – The system then needs to search for a better output
    • because r = -1 does not indicate what the good output should be;
    • a common method is "random search" (see the sketch below).
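A minimal sketch of the random-search idea, not taken from the slides: a candidate output is perturbed at random and the perturbation is kept only when the critic returns a reward. The critic function, noise scale, and toy task here are all hypothetical.

```python
import numpy as np

def random_search_step(y, critic, noise_scale=0.1, rng=np.random.default_rng()):
    """Propose a randomly perturbed output; keep it only if the critic rewards it."""
    candidate = y + noise_scale * rng.standard_normal(y.shape)
    r = critic(candidate)            # critic returns +1 (reward) or -1 (penalty)
    return candidate if r == +1 else y

# Hypothetical task: the critic rewards outputs whose norm is below 1.
critic = lambda out: +1 if np.linalg.norm(out) < 1.0 else -1
y = np.array([2.0, -1.5])
for _ in range(100):
    y = random_search_step(y, critic)
```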

Page 5:

• A_RP: the associative reward-and-penalty algorithm
  – An algorithm for NN reinforcement learning (Barto and Anandan, 1985)
  – Architecture (figure: a network whose output is evaluated by a critic)

    input: x(k)
    output: y(k)
    stochastic units: z(k), for random search

Page 6:

– Random search by stochastic units z_i:

    $P(z_i = +1) = \frac{1}{1 + e^{-2\,net_i/T}}, \qquad P(z_i = -1) = \frac{1}{1 + e^{+2\,net_i/T}}$

  where net_i is the net input to stochastic unit z_i;

  or let z_i obey a continuous probability distribution function;
  or let z_i = net_i + η, where η is a random noise obeying a certain distribution.

  Key: z is not a deterministic function of x; this gives z a chance to be a good output.

– Prepare a (temporary) desired output:

    $d(k) = \begin{cases} y(k) & \text{if } r(k) = +1 \\ -y(k) & \text{if } r(k) = -1 \end{cases}$
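A minimal sketch of the stochastic unit and the temporary target, following the formulas above. The temperature T, the random generator, and the function names are illustrative, not from the slides.

```python
import numpy as np

def sample_stochastic_unit(net_i, T=1.0, rng=np.random.default_rng()):
    """Sample z_i in {+1, -1} with P(z_i = +1) = 1 / (1 + exp(-2*net_i/T))."""
    p_plus = 1.0 / (1.0 + np.exp(-2.0 * net_i / T))
    return +1 if rng.random() < p_plus else -1

def temporary_target(y, r):
    """d(k) = y(k) if r(k) = +1, and -y(k) if r(k) = -1."""
    return y if r == +1 else -y
```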

Page 7:

– Compute the errors at the z layer:

    $e(k) = d(k) - E(z(k))$

  where E(z(k)) is the expected value of z(k), because z is a random variable.

  How to compute E(z(k)):
  • take the average of z over a period of time;
  • compute it from the distribution, if possible;
  • if the logistic sigmoid function g is used,

    $E(z_i) = g(net_i)\cdot(+1) + (1 - g(net_i))\cdot(-1) = \tanh(net_i/T)$

– Training (a small update sketch follows):
  • Delta rule to learn the weights of the output nodes:

    $\Delta w_{ij} = \begin{cases} \rho\, e_i y_j & \text{if } r = +1 \\ \lambda\rho\, e_i y_j & \text{if } r = -1 \end{cases}$   with $\lambda \ll 1$

  • BP or other methods to modify the weights at lower layers.
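A sketch of one A_RP-style update for the output weights, using the error and delta rule reconstructed above. The learning rates rho and lam are illustrative, and the smaller penalty rate lam*rho (lam << 1) follows the classic A_RP formulation; it is an assumption here, not a quote from the slides.

```python
import numpy as np

def arp_output_update(W, y_prev, net, d, r, rho=0.1, lam=0.05, T=1.0):
    """One delta-rule step for the output weights of an A_RP network (sketch).

    W:      weight matrix of the stochastic output layer (units x inputs)
    y_prev: activations of the layer feeding the stochastic units
    net:    net inputs of the stochastic units
    d:      temporary desired output (+y or -y depending on r)
    r:      critic signal, +1 (reward) or -1 (penalty)
    """
    e = d - np.tanh(net / T)                 # e(k) = d(k) - E(z(k))
    rate = rho if r == +1 else lam * rho     # learn more slowly from penalties
    return W + rate * np.outer(e, y_prev)    # Delta w_ij = rate * e_i * y_j
```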

Page 8:

Probabilistic Neural Networks

1. Purpose: classify a given input pattern x into one of the predefined classes by the Bayesian decision rule.
   – Suppose there are k predefined classes S = {s_1, …, s_k}.
     P(s_i): prior probability of class s_i
     P(x|s_i): conditional probability of x, given s_i
     P(x): probability of x
     P(s_i|x): posterior probability of s_i, given x
   – Example:
     S = {s_1, …, s_k}: the set of all patients
     s_i: the set of all patients having disease s_i
     x: a description (manifestations) of a patient

Page 9:

P(x|s_i): the probability that a patient with disease s_i will have description x

P(s_i|x): the probability that a patient with description x has disease s_i

Classify x into the class s_i for which $P(s_i|x) = \max_j P(s_j|x)$.

By Bayes' theorem:

    $P(s_i|x) = \frac{P(x|s_i)\,P(s_i)}{P(x)}$

Because P(x) is constant over the classes,

    $P(s_i|x) = \max_j P(s_j|x) \iff P(x|s_i)\,P(s_i) = \max_j P(x|s_j)\,P(s_j)$

In PNN, the conditionals P(x|s_i) are learned from exemplars.
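A minimal sketch of this decision rule: P(x) is dropped because it is the same for every class. The prior mapping and likelihood callable are placeholders; in a PNN the likelihood comes from the Parzen estimate on the next page.

```python
import numpy as np

def bayes_classify(x, classes, prior, likelihood):
    """Return the class s_i maximizing P(x|s_i) * P(s_i)."""
    scores = [likelihood(x, s) * prior[s] for s in classes]
    return classes[int(np.argmax(scores))]
```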

Page 10:

2. Estimate the probabilities
   – Training exemplars: $x_j^{(i)}$, the jth exemplar belonging to class s_i.
   – Priors can be obtained either from experts' estimates or calculated from the exemplars:

       $P(s_i) = |s_i| \,\Big/\, \sum_{j=1}^{k} |s_j|$

   – Conditionals are estimated by the Parzen estimator:

       $P(x|s_i) = \frac{1}{(2\pi)^{m/2}\,\sigma^m\, n_i} \sum_{j=1}^{n_i} \exp\!\Big(-\frac{\|x - x_j^{(i)}\|^2}{2\sigma^2}\Big)$

     where m: dimension of the pattern, n_i: number of exemplars in s_i, x: input pattern.

   – This is closely related to the Gaussian radial basis function:

       $f(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\!\Big(-\frac{(x - u)^2}{2\sigma^2}\Big)$
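A direct transcription of the Parzen estimator above, assuming a single shared smoothing parameter sigma; the variable names are illustrative.

```python
import numpy as np

def parzen_conditional(x, exemplars, sigma=1.0):
    """Parzen estimate of P(x | s_i) from that class's exemplars x_j^(i).

    exemplars: array of shape (n_i, m), one row per exemplar of class s_i
    x:         input pattern of dimension m
    """
    n_i, m = exemplars.shape
    sq_dists = np.sum((exemplars - x) ** 2, axis=1)          # ||x - x_j^(i)||^2
    norm = (2.0 * np.pi) ** (m / 2) * sigma ** m * n_i       # normalizing constant
    return np.sum(np.exp(-sq_dists / (2.0 * sigma ** 2))) / norm
```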

Page 11:

3. PNN architecture: a feedforward network of 4 layers (figure):

   input layer → exemplar layer → class layer → decision layer

   • Exemplar layer: RBF nodes, one per exemplar, centered on $x_j^{(i)}$:

       $y_j^{(i)} = \exp\!\Big(-\frac{\|x - x_j^{(i)}\|^2}{2\sigma^2}\Big)$

     $y_j^{(i)}$ is determined by the distance between x and $x_j^{(i)}$; it is large if x is close to $x_j^{(i)}$.

   • Class layer: the node for class s_i connects to all exemplar nodes belonging to s_i:

       $z_i = \sum_j y_j^{(i)}$

     $z_i$ approximates the Parzen estimate of $P(x|s_i)$; it is large if x is close to more exemplars of s_i.

   • Decision layer: picks the winner based on $z_i\, P(s_i)$.

   • If necessary, training can adjust the weights of the upper layers.
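A sketch of the four-layer forward pass described above, combining the exemplar-layer RBF responses, the class-layer sums, and the z_i·P(s_i) decision. The data structures (dicts keyed by class) are illustrative choices.

```python
import numpy as np

def pnn_classify(x, exemplars_by_class, priors, sigma=1.0):
    """Forward pass through the 4 PNN layers (sketch).

    exemplars_by_class: dict mapping class -> array (n_i, m) of exemplars
    priors:             dict mapping class -> P(s_i)
    """
    best_class, best_score = None, -np.inf
    for s, ex in exemplars_by_class.items():
        # exemplar layer: RBF response of each exemplar node of class s
        y = np.exp(-np.sum((ex - x) ** 2, axis=1) / (2.0 * sigma ** 2))
        # class layer: sum the responses of this class's exemplar nodes
        z = np.sum(y)
        # decision layer: compare z_i * P(s_i) across classes
        score = z * priors[s]
        if score > best_score:
            best_class, best_score = s, score
    return best_class
```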

Page 12:

4. Comments:
   – Classification by Bayes' rule

– Fast classification

– Fast learning

– Guaranteed to approach the Bayes’ optimal decision surface provided that the class probability density functions are smooth and continuous.

– Trades nodes for time (not good with large training sets)

– The probabilistic density function to be represented must be smooth and continuous.