
Computing Gradient Vector and Jacobian Matrix in Arbitrarily Connected Neural Networks


Page 1: Computing Gradient Vector and  Jacobian  Matrix in Arbitrarily Connected Neural Networks

1

Computing Gradient Vector and Jacobian Matrix in Arbitrarily Connected Neural Networks

Author : Bogdan M. Wilamowski, Fellow, IEEE, Nicholas J. Cotton, Okyay Kaynak, Fellow, IEEE, and Günhan Dündar
Source : IEEE INDUSTRIAL ELECTRONICS MAGAZINE
Date : 2012/3/28
Presenter : 林哲緯

Page 2: Computing Gradient Vector and  Jacobian  Matrix in Arbitrarily Connected Neural Networks

2

Outline

• Numerical Analysis Method
• Neural Network Architectures
• NBN Algorithm

Page 3: Computing Gradient Vector and  Jacobian  Matrix in Arbitrarily Connected Neural Networks

3

Minimization problem

Newton's method
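The update formula itself was shown as an image on the slide and is not preserved here; the standard Newton step for minimizing an error function E(w), written with the gradient g and Hessian H of E, is

    w_{k+1} = w_k - H_k^{-1} g_k

where g_k = \nabla E(w_k) and H_k = \nabla^2 E(w_k).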

Page 4: Computing Gradient Vector and  Jacobian  Matrix in Arbitrarily Connected Neural Networks

4

Minimization problem

http://www.nd.com/NSBook/NEURAL%20AND%20ADAPTIVE%20SYSTEMS14_Adaptive_Linear_Systems.html

Steepest descent method
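The formula is again not preserved from the slide; the standard steepest descent update, with the learning constant α defined on the weight-updating-rule slide later in the deck, is

    w_{k+1} = w_k - \alpha \, g_k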

Page 5: Computing Gradient Vector and  Jacobian  Matrix in Arbitrarily Connected Neural Networks

5

Least square problem

Gauss–Newton algorithm

http://en.wikipedia.org/wiki/Gauss%E2%80%93Newton_algorithm
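For the least squares error E(w) = \frac{1}{2} \sum_p e_p(w)^2, the standard Gauss–Newton step (see the linked article) uses only first derivatives through the Jacobian J of the error vector e:

    w_{k+1} = w_k - (J_k^T J_k)^{-1} J_k^T e_k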

Page 6: Computing Gradient Vector and  Jacobian  Matrix in Arbitrarily Connected Neural Networks

6

Levenberg–Marquardt algorithm

• Levenberg–Marquardt algorithm
  – Combines the advantages of the Gauss–Newton algorithm and the steepest descent method
  – Behaves like the steepest descent method when far from the minimum
  – Behaves like the Newton algorithm when close to the minimum
  – Finds a local minimum, not the global minimum
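The Levenberg–Marquardt update that blends the two behaviors (μ and I are defined on the weight-updating-rule slide) is

    w_{k+1} = w_k - (J_k^T J_k + \mu I)^{-1} J_k^T e_k

A large μ pushes the step toward steepest descent; a small μ pushes it toward the Gauss–Newton step.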

Page 7: Computing Gradient Vector and  Jacobian  Matrix in Arbitrarily Connected Neural Networks

7

Levenberg–Marquardt algorithm

• Advantages
  – The Hessian is replaced by a linear combination of first-order terms
  – Only first-order derivatives are required

• Disadvantage
  – Matrix inversion of the quasi-Hessian is still required at every iteration, which is costly for large networks

Page 8: Computing Gradient Vector and  Jacobian  Matrix in Arbitrarily Connected Neural Networks

8

Outline

• Numerical Analysis Method
• Neural Network Architectures
• NBN Algorithm

Page 9: Computing Gradient Vector and  Jacobian  Matrix in Arbitrarily Connected Neural Networks

9

Weight updating rule

Second-order algorithm
First-order algorithm

α : learning constant
g : gradient vector
J : Jacobian matrix
μ : learning parameter
I : identity matrix
e : error vector

MLP, ACN, FCN
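The update equations themselves are not preserved in this extraction; a minimal numpy sketch of the two rules, using the symbols above (the function names are illustrative, not from the paper), could be:

    import numpy as np

    def first_order_update(w, g, alpha):
        """First-order (steepest descent) rule: w <- w - alpha * g."""
        return w - alpha * g

    def second_order_update(w, J, e, mu):
        """Second-order (Levenberg-Marquardt) rule: w <- w - (J^T J + mu*I)^{-1} J^T e."""
        I = np.eye(w.size)                  # identity matrix
        g = J.T @ e                         # gradient vector
        return w - np.linalg.solve(J.T @ J + mu * I, g)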

Page 10: Computing Gradient Vector and  Jacobian  Matrix in Arbitrarily Connected Neural Networks

10

Forward & Backward Computation

Forward : 12345, 21345, 12435, or 21435
Backward : 54321, 54312, 53421, or 53412
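The network figure for this slide is not preserved; a minimal sketch of how such orders arise, assuming one connectivity consistent with the listed orders (neurons 1 and 2 take only external inputs, 3 and 4 take the outputs of 1 and 2, and 5 takes the outputs of 3 and 4), could be:

    # Assumed connectivity, for illustration only; the actual figure is not preserved.
    deps = {1: [], 2: [], 3: [1, 2], 4: [1, 2], 5: [3, 4]}

    def forward_order(deps):
        """Return one valid forward (topological) processing order."""
        done, order = set(), []
        while len(order) < len(deps):
            for n in sorted(deps):
                if n not in done and all(d in done for d in deps[n]):
                    done.add(n)
                    order.append(n)
        return order

    print(forward_order(deps))         # [1, 2, 3, 4, 5]  (one of the forward orders)
    print(forward_order(deps)[::-1])   # [5, 4, 3, 2, 1]  (the matching backward order)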

Page 11: Computing Gradient Vector and  Jacobian  Matrix in Arbitrarily Connected Neural Networks

11

Jacobian matrix

Rows : patterns (inputs) × outputs
Columns : weights
p = number of input patterns
no = number of outputs

Rows = 2 × 1 = 2
Columns = 8
Jacobian size = 2 × 8
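In other words, for P training patterns, n_o outputs, and n_w weights the Jacobian has P · n_o rows and n_w columns,

    J \in \mathbb{R}^{(P \cdot n_o) \times n_w}

so the slide's example with P = 2 patterns, n_o = 1 output, and n_w = 8 weights gives a 2 × 8 Jacobian.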

Page 12: Computing Gradient Vector and  Jacobian  Matrix in Arbitrarily Connected Neural Networks

12

Jacobian matrix

Page 13: Computing Gradient Vector and  Jacobian  Matrix in Arbitrarily Connected Neural Networks

13

Outline

• Numerical Analysis Method
• Neural Network Architectures
• NBN Algorithm

Page 14: Computing Gradient Vector and  Jacobian  Matrix in Arbitrarily Connected Neural Networks

14

Direct Computation of Quasi-Hessian Matrix and Gradient Vector
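The equations on this slide are not preserved; the direct computation described in the paper builds the quasi-Hessian Q = J^T J and the gradient g = J^T e one Jacobian row at a time,

    Q = \sum_{p=1}^{P} \sum_{m=1}^{M} j_{pm}^T j_{pm}, \qquad g = \sum_{p=1}^{P} \sum_{m=1}^{M} j_{pm}^T e_{pm}

where j_{pm} is the single Jacobian row for pattern p and output m, so the full (P · M) × n_w Jacobian never has to be stored. A rough numpy sketch of that accumulation (not the paper's exact routine; computing each row is assumed to be done elsewhere):

    import numpy as np

    def accumulate_Q_g(jacobian_rows, errors, n_w):
        """Accumulate Q = J^T J and g = J^T e row by row without storing J."""
        Q = np.zeros((n_w, n_w))
        g = np.zeros(n_w)
        for j_pm, e_pm in zip(jacobian_rows, errors):  # one row per pattern/output pair
            Q += np.outer(j_pm, j_pm)
            g += j_pm * e_pm
        return Q, g

This row-by-row accumulation is what yields the (P × M)-fold memory reduction cited in the conclusion.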

Page 15: Computing Gradient Vector and  Jacobian  Matrix in Arbitrarily Connected Neural Networks

15

Conclusion

• The memory requirement for quasi-Hessian matrix and gradient vector computation is decreased by (P × M) times

• Can be applied to arbitrarily connected neural networks

• Two computation procedures
  – Backpropagation process (single output)
  – Without backpropagation process (multiple outputs)