7/30/2019 dffffffffff
newff
Create feed-forward backpropagation network
Syntax
net = newff(P,T,[S1 S2...S(N-1)],{TF1 TF2...TFNl},BTF,BLF,PF,IPF,OPF,DDF)
Description
newff(P,T,[S1 S2...S(N-1)],{TF1 TF2...TFNl},BTF,BLF,PF,IPF,OPF,DDF) takes several arguments:
P    R x Q1 matrix of Q1 sample R-element input vectors
T    SN x Q2 matrix of Q2 sample SN-element target vectors
Si   Size of ith layer, for N-1 layers, default = []. (Output layer size SN is determined from T.)
TFi  Transfer function of ith layer (default = 'tansig' for hidden layers and 'purelin' for the output layer).
tansig is a neural transfer function. Transfer functions calculate a layer's output from its net input.
tansig(N,FP) takes N and optional function parameters,
N    S x Q matrix of net input (column) vectors
FP   Struct of function parameters (ignored)
and returns A, the S x Q matrix of N's elements squashed into [-1 1].
tansig('dn',N,A,FP) returns the derivative of A with respect to N. If A or FP is not supplied or is set to
[], FP reverts to the default parameters, and A is calculated from N.
a = tansig(n) = 2/(1+exp(-2*n))-1
This is mathematically equivalent to tanh(N). It differs in that it runs faster than the MATLAB
implementation of tanh, but the results can have very small numerical differences. This function is a
good tradeoff for neural networks, where speed is important and the exact shape of the transfer
function is not.
purelin is a linear transfer function, the default for the output layer.
purelin(N,FP) takes N and optional function parameters,
N    S x Q matrix of net input (column) vectors
FP   Struct of function parameters (ignored)
and returns A, an S x Q matrix equal to N.
purelin('dn',N,A,FP) returns the S x Q derivative of A with respect to N. If A or FP is not supplied or is
set to [], FP reverts to the default parameters, and A is calculated from N.
a = purelin(n) = n
BTF  Backpropagation network training function (default = 'trainlm')
traingd
Gradient descent backpropagation
Syntax
[net,TR] = traingd(net,TR,trainV,valV,testV)
info = traingd('info')
Description
traingd is a network training function that updates weight and bias values according to gradient
descent.
traingd(net,TR,trainV,valV,testV) takes these inputs,
net     Neural network
TR      Initial training record created by train
trainV  Training data created by train
valV    Validation data created by train
testV   Test data created by train
and returns
net  Trained network
TR   Training record of various values over each epoch
Each argument trainV, valV, and testV is a structure of these fields:
X   N x TS cell array of inputs for N inputs and TS time steps. X{i,ts} is an Ri x Q matrix for the ith input and ts time step.
Xi  N x Nid cell array of input delay states for N inputs and Nid delays. Xi{i,j} is an Ri x Q matrix for the ith input and jth state.
Pd  N x S x Nid cell array of delayed input states.
T   No x TS cell array of targets for No outputs and TS time steps. T{i,ts} is an Si x Q matrix for the ith output and ts time step.
Tl  Nl x TS cell array of targets for Nl layers and TS time steps. Tl{i,ts} is an Si x Q matrix for the ith layer and ts time step.
Ai  Nl x TS cell array of layer delay states for Nl layers and TS time steps. Ai{i,j} is an Si x Q matrix of delayed outputs for layer i, delay j.
Training occurs according to traingd's training parameters, shown here with their default values:
net.trainParam.epochs           10     Maximum number of epochs to train
net.trainParam.goal             0      Performance goal
net.trainParam.showCommandLine  0      Generate command-line output
net.trainParam.showWindow       1      Show training GUI
net.trainParam.lr               0.01   Learning rate
net.trainParam.max_fail         5      Maximum validation failures
net.trainParam.min_grad         1e-10  Minimum performance gradient
net.trainParam.show             25     Epochs between displays (NaN for no displays)
net.trainParam.time             inf    Maximum time to train in seconds
7/30/2019 dffffffffff
3/6
Network Use
You can create a standard network that uses traingd with newff, newcf, or newelm. To prepare a custom network to be trained with traingd:
1. Set net.trainFcn to 'traingd'. This sets net.trainParam to traingd's default parameters.
2. Set net.trainParam properties to desired values.
In either case, calling train with the resulting network trains the network with traingd.
See newff, newcf, and newelm for examples.
Algorithm
traingd can train any network as long as its weight, net input, and transfer functions have derivative
functions.
Backpropagation is used to calculate derivatives of performance perf with respect to the weight and
bias variables X. Each variable is adjusted according to gradient descent:
dX = lr * dperf/dX
Training stops when any of these conditions occurs:
- The maximum number of epochs (repetitions) is reached.
- The maximum amount of time is exceeded.
- Performance is minimized to the goal.
- The performance gradient falls below min_grad.
- Validation performance has increased more than max_fail times since the last time it decreased (when using validation).
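The update rule and the epoch, goal, and min_grad stopping conditions can be sketched in Python on a toy one-variable problem, perf(x) = x^2 (so dperf/dx = 2x). The parameter names mirror the trainParam fields above, but this is an illustrative sketch, not MATLAB's traingd:

```python
def train_gd(x, lr=0.01, epochs=10, goal=0.0, min_grad=1e-10):
    """Gradient descent on perf(x) = x**2 with traingd-style stopping rules."""
    for _ in range(epochs):           # stop: maximum number of epochs reached
        perf = x * x
        grad = 2.0 * x
        if perf <= goal:              # stop: performance minimized to the goal
            break
        if abs(grad) < min_grad:      # stop: gradient fell below min_grad
            break
        x -= lr * grad                # dX = lr * dperf/dX, applied downhill
    return x, perf

x, perf = train_gd(5.0, lr=0.1, epochs=100)
assert perf < 25.0   # performance decreased from its starting value
```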
BLF  Backpropagation weight/bias learning function (default = 'learngdm')
learngd
Gradient descent weight and bias learning function
Syntax
[dW,LS] = learngd(W,P,Z,N,A,T,E,gW,gA,D,LP,LS)
[db,LS] = learngd(b,ones(1,Q),Z,N,A,T,E,gW,gA,D,LP,LS)
info = learngd(code)
Description
learngd is the gradient descent weight and bias learning function.
learngd(W,P,Z,N,A,T,E,gW,gA,D,LP,LS) takes several inputs,
W   S x R weight matrix (or S x 1 bias vector)
P   R x Q input vectors (or ones(1,Q))
Z   S x Q weighted input vectors
N   S x Q net input vectors
A   S x Q output vectors
T   S x Q layer target vectors
E   S x Q layer error vectors
gW  S x R gradient with respect to performance
gA  S x Q output gradient with respect to performance
D   S x S neuron distances
LP  Learning parameters, none, LP = []
LS  Learning state, initially should be = []
and returns
dW  S x R weight (or bias) change matrix
LS  New learning state
Learning occurs according to learngd's learning parameter, shown here with its default value.
LP.lr  0.01  Learning rate
learngd(code) returns useful information for each code string:
'pnames'     Names of learning parameters
'pdefaults'  Default learning parameters
'needg'      Returns 1 if this function uses gW or gA
Examples
Here you define a random gradient gW for a weight going to a layer with three neurons from an
input with two elements. Also define a learning rate of 0.5.
gW = rand(3,2);
lp.lr = 0.5;
Because learngd only needs these values to calculate a weight change (see algorithm below), use
them to do so.
dW = learngd([],[],[],[],[],[],[],gW,[],[],lp,[])
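The weight change computed here is just the learning rate times the gradient. A hedged Python sketch of that one step, mirroring the MATLAB example above (plain nested lists stand in for MATLAB matrices; this is not the toolbox function):

```python
import random

# learngd's core rule, dW = lr * gW, for a 3x2 gradient and lr = 0.5.
lr = 0.5
gW = [[random.random() for _ in range(2)] for _ in range(3)]  # like rand(3,2)
dW = [[lr * g for g in row] for row in gW]                    # dw = lr * gW

assert len(dW) == 3 and len(dW[0]) == 2
assert all(dW[i][j] == lr * gW[i][j] for i in range(3) for j in range(2))
```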
Network Use
You can create a standard network that uses learngd with newff, newcf, or newelm. To prepare the weights and the bias of layer i of a custom network to adapt with learngd:
1. Set net.adaptFcn to 'trains'. net.adaptParam automatically becomes trains's default parameters.
2. Set each net.inputWeights{i,j}.learnFcn to 'learngd'.
3. Set each net.layerWeights{i,j}.learnFcn to 'learngd'.
4. Set net.biases{i}.learnFcn to 'learngd'. Each weight and bias learning parameter property is automatically set to learngd's default parameters.
To allow the network to adapt:
1. Set net.adaptParam properties to desired values.
2. Call adapt with the network.
See newff or newcf for examples.
Algorithm
learngd calculates the weight change dW for a given neuron from the neuron's input P and error E, and the weight (or bias) learning rate LR, according to the gradient descent dw = lr*gW.

PF   Performance function (default = 'mse')
IPF  Row cell array of input processing functions (default = {'fixunknowns','removeconstantrows','mapminmax'})
OPF  Row cell array of output processing functions (default = {'removeconstantrows','mapminmax'})
DDF  Data division function (default = 'dividerand'),
and returns an N-layer feed-forward backpropagation network.
The transfer functions TFi can be any differentiable transfer function such as tansig, logsig, or
purelin.
The training function BTF can be any of the backpropagation training functions such as trainlm,
trainbfg, trainrp, traingd, etc.
Caution: trainlm is the default training function because it is very fast, but it requires a lot of memory to run. If you get an out-of-memory error when training, try one of these:
- Slow trainlm training, but reduce memory requirements, by setting net.trainParam.mem_reduc to 2 or more. (See help trainlm.)
- Use trainbfg, which is slower but more memory efficient than trainlm.
- Use trainrp, which is slower but more memory efficient than trainbfg.
The learning function BLF can be either of the backpropagation learning functions learngd or
learngdm.
The performance function can be any of the differentiable performance functions such as mse or
msereg.
Examples
Here is a problem consisting of inputs P and targets T to be solved with a network.
P = [0 1 2 3 4 5 6 7 8 9 10];
T = [0 1 2 3 4 3 2 1 2 3 4];
Here a network is created with one hidden layer of five neurons.
net = newff(P,T,5);
The network is simulated and its output plotted against the targets.
Y = sim(net,P);
plot(P,T,P,Y,'o')
The network is trained for 50 epochs. Again the network's output is plotted.
net.trainParam.epochs = 50;
net = train(net,P,T);
Y = sim(net,P);
plot(P,T,P,Y,'o')
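The whole example can be sketched in Python to show what newff, train, and sim are doing: a 1-5-1 network with a tansig hidden layer and a purelin output, trained by plain incremental gradient descent on mse. The layer size, epoch count, and data match the example above; the initialization, input scaling, and learning rate are assumptions of this sketch, not the toolbox's actual behavior:

```python
import math
import random

random.seed(0)
P = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
T = [0, 1, 2, 3, 4, 3, 2, 1, 2, 3, 4]
H = 5                                     # hidden layer size, as in newff(P,T,5)

# Scale inputs to [-1, 1], loosely mimicking newff's default mapminmax step.
Ps = [p / 5.0 - 1.0 for p in P]

w1 = [random.uniform(-1, 1) for _ in range(H)]  # input -> hidden weights
b1 = [random.uniform(-1, 1) for _ in range(H)]  # hidden biases
w2 = [random.uniform(-1, 1) for _ in range(H)]  # hidden -> output weights
b2 = random.uniform(-1, 1)                      # output bias

def tansig(n):
    return 2.0 / (1.0 + math.exp(-2.0 * n)) - 1.0

def sim_one(p):
    a1 = [tansig(w1[i] * p + b1[i]) for i in range(H)]  # tansig hidden layer
    return sum(w2[i] * a1[i] for i in range(H)) + b2    # purelin output layer

def mse():
    return sum((sim_one(p) - t) ** 2 for p, t in zip(Ps, T)) / len(T)

before = mse()
lr = 0.01
for _ in range(50):                       # net.trainParam.epochs = 50
    for p, t in zip(Ps, T):
        a1 = [tansig(w1[i] * p + b1[i]) for i in range(H)]
        y = sum(w2[i] * a1[i] for i in range(H)) + b2
        e = y - t
        for i in range(H):
            # Backpropagation: d(tansig)/dn = 1 - a^2, chained through w2.
            d1 = e * w2[i] * (1.0 - a1[i] ** 2)
            w2[i] -= lr * e * a1[i]
            b1[i] -= lr * d1
            w1[i] -= lr * d1 * p
        b2 -= lr * e

assert mse() < before   # training reduced the mean squared error
```

Like the MATLAB example, the point is only that error drops over the 50 epochs, not that the fit is perfect; a few dozen epochs of plain gradient descent leave a visible gap between Y and T.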
Algorithm
Feed-forward networks consist of Nl layers using the dotprod weight function, netsum net input
function, and the specified transfer function.
The first layer has weights coming from the input. Each subsequent layer has a weight coming from
the previous layer. All layers have biases. The last layer is the network output.
Each layer's weights and biases are initialized with initnw.
Adaption is done with trains, which updates weights with the specified learning function. Training is done with the specified training function. Performance is measured according to the specified performance function.