MODULE-IV STATISTICAL METHODS IN ANN

Artificial Neural Networks



Statistical training of neural networks: Boltzmann training and non-linear optimization techniques.


Page 1: Artificial Neural Networks

MODULE-IV

STATISTICAL METHODS IN ANN

Page 2: Artificial Neural Networks

Module 4

Statistical Methods: Boltzmann training, Cauchy training, artificial specific heat methods, and applications to general non-linear optimization problems.

Page 3: Artificial Neural Networks

Statistical methods are used for:

Training an ANN

Producing output from a trained network

Training Methods:

Deterministic Methods

Statistical Training Methods

Page 4: Artificial Neural Networks

Deterministic Training Method

Follows a step-by-step procedure.

Weights are changed based on their current values.

The change also depends on the desired output and the actual output.

E.g.: the Perceptron training algorithm, the Backpropagation algorithm, etc.
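
As an illustration of such a deterministic rule, here is a minimal Python sketch of the perceptron weight update (the function name, the learning rate lr, and the 0/1 threshold output are illustrative assumptions, not from the slides):

    # Sketch: one deterministic perceptron weight update.
    def perceptron_update(weights, x, target, lr=0.1):
        # The step depends only on the current weights, the input x,
        # and the difference between desired and actual output.
        activation = sum(w * xi for w, xi in zip(weights, x))
        actual = 1 if activation >= 0 else 0
        error = target - actual
        return [w + lr * error * xi for w, xi in zip(weights, x)]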

Page 5: Artificial Neural Networks

Statistical Training Methods

Make pseudo-random changes in the weights.

Retain only those changes which result in improvements.

Page 6: Artificial Neural Networks

GENERAL PROCEDURE (FOR STATISTICAL TRAINING METHODS)

Apply a set of inputs and compute the resulting output.

Compare the result with the target and find the error.

The objective of the training is to minimize the error.

Select a weight at random and adjust it by a small random amount.

Page 7: Artificial Neural Networks

If the adjustment improves the objective, retain the change.

Otherwise, return the weight to its previous value.

Repeat the procedure until the network is trained to the desired level.
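
The general procedure above can be sketched in Python as follows (the objective function, the step size, and the function name are illustrative assumptions):

    import random

    # Sketch: one step of statistical training by random perturbation.
    def statistical_training_step(weights, objective, step=0.01):
        before = objective(weights)
        i = random.randrange(len(weights))         # select a weight at random
        old = weights[i]
        weights[i] += random.uniform(-step, step)  # small random adjustment
        if objective(weights) >= before:           # no improvement: undo
            weights[i] = old
        return weights

Repeating this step until the error is acceptably low implements the loop described above.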

Page 8: Artificial Neural Networks

The local minima problem

Minimization of the objective function can get trapped in a poor solution (a local minimum).

[Figure: objective function plotted against weight, showing a local minimum at point A and the global minimum at point B.]

Page 9: Artificial Neural Networks

If the objective function is at A and the random weight changes are small, then the weight adjustments will be rejected.

The superior weight setting at point B will never be found, and the system will be trapped in the local minimum at A instead of the global minimum at B.

If the random weight changes are large, both points A and B are visited frequently, but so is every other point.

The weights will change so drastically that they will never settle at the desired point.

Page 10: Artificial Neural Networks

Solution & Explanation

Statistical methods overcome the local minima problem through a weight-adjustment strategy.

Example:

Let the figure represent a ball on a surface inside a box.

If the box is shaken violently, the ball will move rapidly from one side to the other.

The probability of the ball occupying any point on the surface is then equal for all points.

Page 11: Artificial Neural Networks

If the violence of the shaking is gradually reduced, the ball will tend to stay near points A and B.

If the shaking is reduced further, it will settle at point B.

Page 12: Artificial Neural Networks

The ANN is trained in the same way, through random weight adjustments.

At first, large random adjustments are made.

A weight change that improves the objective function is retained.

The average step size is then gradually reduced, so that the global minimum is reached.

Page 13: Artificial Neural Networks

Annealing [Boltzmann Law]

Annealing: if a metal is heated to a temperature above its melting point, the atoms are in violent random motion. The atoms always tend to reach a minimum-energy state. As the metal is gradually cooled, the atoms settle into the minimum possible energy state corresponding to each temperature.

P(e) = exp(-e / kT), where P(e) is the probability that the system is in a state with energy e, k is Boltzmann's constant, and T is the temperature.
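
As a quick numerical check of the Boltzmann law (the energies and temperatures below are chosen only for illustration), note that at high T the two states are almost equally probable, while at low T the low-energy state dominates:

    import math

    # Relative Boltzmann probabilities exp(-e/kT) for two energy states,
    # with k folded into T for simplicity.
    for T in (10.0, 0.5):
        p_low, p_high = math.exp(-1.0 / T), math.exp(-2.0 / T)
        print(f"T={T}: P(e=1)/P(e=2) = {p_low / p_high:.2f}")
    # T=10.0: ratio ~1.11 (states nearly equally likely)
    # T=0.5:  ratio ~7.39 (the low-energy state dominates)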

Page 14: Artificial Neural Networks

Simulated Annealing [Boltzmann Training]

Define a variable T that represents an artificial temperature (start with T at a large value).

Apply a set of inputs to the network, and calculate the outputs and the objective function.

Make a random weight change and recalculate the network output.

Calculate the new objective function.

Page 15: Artificial Neural Networks

If the objective function is reduced, retain the weight change.

If the weight change results in an increase in the objective function, calculate the probability of accepting the weight change:

P(c) = exp(-c / kT), where P(c) is the probability of accepting a change of c in the objective function, k is Boltzmann's constant, and T is the temperature.

Page 16: Artificial Neural Networks

Select a random number r from a uniform distribution between zero and one.

If P(c) is greater than r, retain the change; otherwise return the weight to its previous value.

This allows the system to take a step in a direction that worsens the objective function, and hence to escape from a local minimum.

Repeat the weight-change process over each of the weights in the network, gradually reducing the temperature T until an acceptably low value of the objective function is obtained.
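
Putting the steps from pages 14-16 together, a minimal Python sketch of Boltzmann (simulated annealing) training might look like the following (the Gaussian step size, the cooling schedule, and all constants are illustrative assumptions):

    import math
    import random

    # Sketch: Boltzmann training of a weight vector.
    def anneal_weights(weights, objective, T0=10.0, steps=1000, k=1.0):
        cost = objective(weights)
        for t in range(1, steps + 1):
            T = T0 / math.log(1 + t)            # gradually reduce temperature
            i = random.randrange(len(weights))  # random weight change
            old = weights[i]
            weights[i] += random.gauss(0.0, T)
            new_cost = objective(weights)
            c = new_cost - cost                 # change in objective function
            # Accept improvements always; accept uphill moves with
            # probability P(c) = exp(-c / kT), i.e. when P(c) > r.
            if c <= 0 or random.random() < math.exp(-c / (k * T)):
                cost = new_cost
            else:
                weights[i] = old                # reject: restore weight
        return weights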

Page 17: Artificial Neural Networks

How to select weight changes and the artificial temperature for training

The size of the random weight change can be selected by various methods, e.g.:

P(w) = exp(-w^2 / T^2), where P(w) is the probability of a weight change of size w and T is the artificial temperature.

To reach the global minimum as quickly as possible, the cooling rate is usually expressed as:

T(t) = T0 / log(1 + t), where T0 is the initial temperature and t is the artificial time (training step).
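
To see why this schedule leads to long training times (the point made on the next page), here is a small sketch of how slowly T(t) = T0 / log(1 + t) decays, with T0 = 10 chosen only for illustration:

    import math

    T0 = 10.0

    def boltzmann_T(t):
        # Inverse-logarithmic cooling: T(t) = T0 / log(1 + t).
        return T0 / math.log(1 + t)

    for t in (10, 100, 1000, 10000):
        print(t, round(boltzmann_T(t), 2))
    # 10 -> 4.17, 100 -> 2.17, 1000 -> 1.45, 10000 -> 1.09
    # Four orders of magnitude in t reduce T by less than a factor of four.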

Page 18: Artificial Neural Networks

The main disadvantage of Boltzmann training is the very slow cooling rate, and hence long computations.

A Boltzmann machine usually takes an impractically long time to train.

Page 19: Artificial Neural Networks

Cauchy Training

Cauchy training is more rapid than Boltzmann training.

Cauchy training substitutes the Cauchy distribution for the Boltzmann distribution.

The Cauchy distribution has longer "tails", hence a higher probability of larger step sizes.

The temperature reduction rate is changed to inverse linear (for Boltzmann training it was inverse logarithmic).

Page 20: Artificial Neural Networks

The Cauchy distribution is:

P(x) = T(t) / [T(t)^2 + x^2], where P(x) is the probability of a step of size x and T(t) is the temperature at artificial time t.

The inverse linear relationship for temperature reduction reduces the training time.

T(t) = T0 / (1 + t)
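
A minimal sketch of drawing a Cauchy-distributed weight step using the inverse-CDF method, with the inverse-linear schedule above (T0 and the function name are illustrative assumptions):

    import math
    import random

    def cauchy_step(t, T0=10.0):
        # Inverse-linear cooling: T(t) = T0 / (1 + t).
        T = T0 / (1 + t)
        # Inverse CDF of the Cauchy distribution with scale T.
        u = random.random()
        return T * math.tan(math.pi * (u - 0.5))

The heavy tails occasionally produce very large jumps, which is what allows Cauchy training to escape local minima even at low temperatures, and why the faster inverse-linear cooling schedule is sufficient.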