
newff

Create feed-forward backpropagation network

    Syntax

net = newff(P,T,[S1 S2...S(N-1)],{TF1 TF2...TFNl},BTF,BLF,PF,IPF,OPF,DDF)

    Description

newff(P,T,[S1 S2...S(N-1)],{TF1 TF2...TFNl},BTF,BLF,PF,IPF,OPF,DDF) takes several arguments:

P: R x Q1 matrix of Q1 sample R-element input vectors

T: SN x Q2 matrix of Q2 sample SN-element target vectors

Si: Size of the ith layer, for N-1 layers; default = []. (The output layer size SN is determined from T.)

TFi: Transfer function of the ith layer (default = 'tansig' for hidden layers and 'purelin' for the output layer).

    tansig is a neural transfer function. Transfer functions calculate a layer's output from its net input.

    tansig(N,FP) takes N and optional function parameters,

N: S x Q matrix of net input (column) vectors
FP: Struct of function parameters (ignored)

    and returns A, the S x Q matrix of N's elements squashed into [-1 1].

    tansig('dn',N,A,FP) returns the derivative of A with respect to N. If A or FP is not supplied or is set to

    [], FP reverts to the default parameters, and A is calculated from N.

    a = tansig(n) = 2/(1+exp(-2*n))-1

    This is mathematically equivalent to tanh(N). It differs in that it runs faster than the MATLAB

    implementation of tanh, but the results can have very small numerical differences. This function is a

    good tradeoff for neural networks, where speed is important and the exact shape of the transfer

    function is not.
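
For example, a small sketch of how tansig squashes a column of net inputs into [-1 1] (the input values here are arbitrary):

n = [-10; -1; 0; 1; 10];
a = tansig(n)     % approximately [-1; -0.762; 0; 0.762; 1]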

purelin is a linear transfer function; it is the default transfer function for the output layer.

    purelin(N,FP) takes N and optional function parameters,

N: S x Q matrix of net input (column) vectors
FP: Struct of function parameters (ignored)


    and returns A, an S x Q matrix equal to N.

    purelin('dn',N,A,FP) returns the S x Q derivative of A with respect to N. If A or FP is not supplied or is

    set to [], FP reverts to the default parameters, and A is calculated from N.

    a = purelin(n) = n
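
For example, a small sketch (arbitrary input values):

n = [-2; 0.5; 3];
a = purelin(n)    % returns n unchanged: [-2; 0.5; 3]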

BTF: Backpropagation network training function (default = 'trainlm'). One such training function, traingd, is documented next.

traingd

Gradient descent backpropagation

    Syntax

    [net,TR] = traingd(net,TR,trainV,valV,testV)

    info = traingd('info')

    Description

    traingd is a network training function that updates weight and bias values according to gradient

    descent.

    traingd(net,TR,trainV,valV,testV) takes these inputs,

net: Neural network
TR: Initial training record created by train
trainV: Training data created by train
valV: Validation data created by train
testV: Test data created by train

    and returns

net: Trained network

TR: Training record of various values over each epoch

    Each argument trainV, valV, and testV is a structure of these fields:

X: N x TS cell array of inputs for N inputs and TS time steps. X{i,ts} is an Ri x Q matrix for the ith input and time step ts.

Xi: N x Nid cell array of input delay states for N inputs and Nid delays. Xi{i,j} is an Ri x Q matrix for the ith input and jth state.

Pd: N x S x Nid cell array of delayed input states.

T: No x TS cell array of targets for No outputs and TS time steps. T{i,ts} is an Si x Q matrix for the ith output and time step ts.

Tl: Nl x TS cell array of targets for Nl layers and TS time steps. Tl{i,ts} is an Si x Q matrix for the ith layer and time step ts.

Ai: Nl x TS cell array of layer delay states for Nl layers and TS time steps. Ai{i,j} is an Si x Q matrix of delayed outputs for layer i, delay j.

    Training occurs according to traingd's training parameters, shown here with their default values:

net.trainParam.epochs = 10: Maximum number of epochs to train
net.trainParam.goal = 0: Performance goal
net.trainParam.showCommandLine = 0: Generate command-line output
net.trainParam.showWindow = 1: Show training GUI
net.trainParam.lr = 0.01: Learning rate
net.trainParam.max_fail = 5: Maximum validation failures
net.trainParam.min_grad = 1e-10: Minimum performance gradient
net.trainParam.show = 25: Epochs between displays (NaN for no displays)
net.trainParam.time = inf: Maximum time to train in seconds


    Network Use

You can create a standard network that uses traingd with newff, newcf, or newelm. To prepare a custom network to be trained with traingd, set net.trainFcn to 'traingd'. This sets net.trainParam to traingd's default parameters. Then set net.trainParam properties to desired values.

    In either case, calling train with the resulting network trains the network with traingd.

    See newff, newcf, and newelm for examples.
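
For instance, a minimal sketch using newff and the data from the Examples section below (the parameter values are illustrative, not recommendations):

P = [0 1 2 3 4 5 6 7 8 9 10];
T = [0 1 2 3 4 3 2 1 2 3 4];
net = newff(P,T,5,{'tansig','purelin'},'traingd');  % BTF set to traingd
net.trainParam.lr = 0.05;       % learning rate
net.trainParam.epochs = 300;    % maximum epochs
net.trainParam.goal = 1e-3;     % performance goal
net = train(net,P,T);           % train calls traingd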

    Algorithm

    traingd can train any network as long as its weight, net input, and transfer functions have derivative

    functions.

    Backpropagation is used to calculate derivatives of performance perf with respect to the weight and

    bias variables X. Each variable is adjusted according to gradient descent:

    dX = lr * dperf/dX

    Training stops when any of these conditions occurs: The maximum number of epochs (repetitions) is

    reached. The maximum amount of time is exceeded. Performance is minimized to the goal. The

    performance gradient falls below min_grad. Validation performance has increased more than

    max_fail times since the last time it decreased (when using validation).

BLF: Backpropagation weight/bias learning function (default = 'learngdm'). The closely related learning function learngd is documented next.

learngd

    Gradient descent weight and bias learning function

    Syntax

    [dW,LS] = learngd(W,P,Z,N,A,T,E,gW,gA,D,LP,LS)

    [db,LS] = learngd(b,ones(1,Q),Z,N,A,T,E,gW,gA,D,LP,LS)

    info = learngd(code)

    Description

    learngd is the gradient descent weight and bias learning function.


    learngd(W,P,Z,N,A,T,E,gW,gA,D,LP,LS) takes several inputs,

W: S x R weight matrix (or S x 1 bias vector)
P: R x Q input vectors (or ones(1,Q))
Z: S x Q weighted input vectors
N: S x Q net input vectors
A: S x Q output vectors
T: S x Q layer target vectors
E: S x Q layer error vectors
gW: S x R gradient with respect to performance
gA: S x Q output gradient with respect to performance
D: S x S neuron distances
LP: Learning parameters, none, LP = []
LS: Learning state, initially should be = []

    and returns

dW: S x R weight (or bias) change matrix
LS: New learning state

    Learning occurs according to learngd's learning parameter, shown here with its default value.

LP.lr = 0.01: Learning rate

    learngd(code) returns useful information for each code string:

'pnames': Names of learning parameters
'pdefaults': Default learning parameters
'needg': Returns 1 if this function uses gW or gA
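
For example, the defaults listed above can be queried directly (a sketch):

lp = learngd('pdefaults')    % default learning parameters, i.e. lr = 0.01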

    Examples

    Here you define a random gradient gW for a weight going to a layer with three neurons from an

    input with two elements. Also define a learning rate of 0.5.

    gW = rand(3,2);

    lp.lr = 0.5;

    Because learngd only needs these values to calculate a weight change (see algorithm below), use

    them to do so.

    dW = learngd([],[],[],[],[],[],[],gW,[],[],lp,[])
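
Because learngd's weight change is dw = lr*gW (see Algorithm below), the dW returned here is simply 0.5*gW.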

    Network Use

You can create a standard network that uses learngd with newff, newcf, or newelm. To prepare the weights and the bias of layer i of a custom network to adapt with learngd, set net.adaptFcn to 'trains'; net.adaptParam then automatically becomes trains's default parameters. Set each net.inputWeights{i,j}.learnFcn to 'learngd', set each net.layerWeights{i,j}.learnFcn to 'learngd', and set net.biases{i}.learnFcn to 'learngd'. Each weight and bias learning parameter property is then automatically set to learngd's default parameters.

To allow the network to adapt, set net.adaptParam properties to desired values and call adapt with the network.
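
A minimal sketch of these steps on a standard two-layer network (the layer size and weight indices here are illustrative):

P = [0 1 2 3 4 5 6 7 8 9 10];
T = [0 1 2 3 4 3 2 1 2 3 4];
net = newff(P,T,5);
net.adaptFcn = 'trains';                      % adaption done by trains
net.inputWeights{1,1}.learnFcn = 'learngd';   % input weights of layer 1
net.layerWeights{2,1}.learnFcn = 'learngd';   % weights from layer 1 to layer 2
net.biases{1}.learnFcn = 'learngd';
net.biases{2}.learnFcn = 'learngd';
[net,Y,E] = adapt(net,P,T);                   % one adaption pass through the data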

    See newff or newcf for examples.


    Algorithm

    learngd calculates the weight change dW for a given neuron from the neuron's input P and error E,

and the weight (or bias) learning rate LR, according to gradient descent: dw = lr*gW.

The remaining newff arguments are:

PF: Performance function (default = 'mse')

IPF: Row cell array of input processing functions (default = {'fixunknowns','removeconstantrows','mapminmax'})

OPF: Row cell array of output processing functions (default = {'removeconstantrows','mapminmax'})

DDF: Data division function (default = 'dividerand'),

    and returns an N-layer feed-forward backpropagation network.

    The transfer functions TFi can be any differentiable transfer function such as tansig, logsig, or

    purelin.

    The training function BTF can be any of the backpropagation training functions such as trainlm,

    trainbfg, trainrp, traingd, etc.

Caution: trainlm is the default training function because it is very fast, but it requires a lot of memory to run. If you get an out-of-memory error when training, try one of these:

Slow trainlm training, but reduce memory requirements, by setting net.trainParam.mem_reduc to 2 or more. (See help trainlm.)

Use trainbfg, which is slower but more memory efficient than trainlm.

Use trainrp, which is slower but more memory efficient than trainbfg.
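
For example, either remedy can be applied before calling train (a sketch; the value 2 is illustrative):

net.trainParam.mem_reduc = 2;   % trainlm computes the Jacobian in pieces (see help trainlm)
% or switch to a less memory-intensive training function
net.trainFcn = 'trainbfg';      % or 'trainrp'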

    The learning function BLF can be either of the backpropagation learning functions learngd or

    learngdm.

    The performance function can be any of the differentiable performance functions such as mse or

    msereg.

    Examples

    Here is a problem consisting of inputs P and targets T to be solved with a network.

    P = [0 1 2 3 4 5 6 7 8 9 10];

    T = [0 1 2 3 4 3 2 1 2 3 4];

    Here a network is created with one hidden layer of five neurons.

    net = newff(P,T,5);


    The network is simulated and its output plotted against the targets.

    Y = sim(net,P);

    plot(P,T,P,Y,'o')

    The network is trained for 50 epochs. Again the network's output is plotted.

    net.trainParam.epochs = 50;

    net = train(net,P,T);

    Y = sim(net,P);

    plot(P,T,P,Y,'o')
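
To quantify the fit, the performance can also be evaluated numerically (a sketch, assuming the default 'mse' performance function):

perf = mse(T - Y)    % mean squared error between targets and outputs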

    Algorithm

    Feed-forward networks consist of Nl layers using the dotprod weight function, netsum net input

    function, and the specified transfer function.

    The first layer has weights coming from the input. Each subsequent layer has a weight coming from

    the previous layer. All layers have biases. The last layer is the network output.

    Each layer's weights and biases are initialized with initnw.

Adaption is done with trains, which updates weights with the specified learning function. Training is done with the specified training function. Performance is measured according to the specified performance function.
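
Continuing the example above, these defaults can be inspected on the network object returned by newff (a sketch; property names as in the network object):

net = newff(P,T,5);
net.layers{1}.transferFcn   % 'tansig'  (hidden layer)
net.layers{2}.transferFcn   % 'purelin' (output layer)
net.layers{1}.initFcn       % 'initnw'
net.trainFcn                % 'trainlm' (default BTF)
net.performFcn              % 'mse'     (default PF)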