
The back-propagation algorithm

January 8, 2012

Ryan

The neuron

- The sigmoid equation is what is typically used as a transfer function between neurons. It is similar to the step function, but is continuous and differentiable.

\[
\sigma(x) = \frac{1}{1 + e^{-x}} \tag{1}
\]

Figure: The Sigmoid Function (σ(x) plotted for x from −5 to 5; y ranges from 0 to 1)

- One useful property of this transfer function is the simplicity of computing its derivative. Let's do that now...
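As a quick illustration (not part of the original slides), here is a minimal Python sketch of the sigmoid transfer function from equation (1), evaluated over the range shown in the figure:

```python
import math

def sigmoid(x):
    """Sigmoid transfer function: sigma(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

# Evaluate over the plotted range (-5 to 5); values stay between 0 and 1.
for x in range(-5, 6):
    print(f"sigma({x:+d}) = {sigmoid(x):.4f}")
```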

The derivative of the sigmoid transfer function

\[
\begin{aligned}
\frac{d}{dx}\sigma(x)
&= \frac{d}{dx}\left(\frac{1}{1 + e^{-x}}\right) \\
&= \frac{e^{-x}}{(1 + e^{-x})^{2}} \\
&= \frac{(1 + e^{-x}) - 1}{(1 + e^{-x})^{2}} \\
&= \frac{1 + e^{-x}}{(1 + e^{-x})^{2}} - \left(\frac{1}{1 + e^{-x}}\right)^{2} \\
&= \sigma(x) - \sigma(x)^{2}
\end{aligned}
\]

\[
\sigma' = \sigma(1 - \sigma)
\]
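A quick numerical check of the identity σ′(x) = σ(x)(1 − σ(x)) derived above, comparing the closed form against a central finite difference (an illustrative sketch, not from the slides):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_prime(x):
    # Closed form from the derivation: sigma' = sigma * (1 - sigma)
    s = sigmoid(x)
    return s * (1.0 - s)

h = 1e-6  # step size for the central finite difference
for x in (-2.0, 0.0, 3.0):
    numeric = (sigmoid(x + h) - sigmoid(x - h)) / (2.0 * h)
    print(x, sigmoid_prime(x), numeric)  # the two values should agree closely
```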


Single input neuron

Figure: A Single-Input Neuron (input ξ with weight ω, bias θ, transfer function σ, and output O)

In the above figure (2) you can see a diagram representing a single neuron with only a single input. The equation defining the figure is:

\[
O = \sigma(\xi\omega + \theta)
\]

Multiple input neuron

Figure: A Multiple Input Neuron (inputs ξ_1, ξ_2, ξ_3 with weights ω_1, ω_2, ω_3, bias θ, a summing junction Σ, transfer function σ, and output O)

Figure 3 is the diagram representing the following equation:

\[
O = \sigma(\omega_1\xi_1 + \omega_2\xi_2 + \omega_3\xi_3 + \theta)
\]
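As a concrete sketch of this equation (illustrative only; the function and variable names are my own), the multiple-input neuron can be computed as:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def neuron_output(inputs, weights, bias):
    """O = sigma(sum_i w_i * xi_i + theta) for a single neuron."""
    x = sum(w * xi for w, xi in zip(weights, inputs)) + bias
    return sigmoid(x)

# Example with three inputs, matching Figure 3 (values chosen arbitrarily).
print(neuron_output([0.5, -1.0, 2.0], [0.1, 0.4, -0.3], bias=0.2))
```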

A neural network

Figure: A neural network built from these neurons, with an input layer I, a hidden layer J, and an output layer K

The back propagation algorithm

Notation

- x_j^ℓ : Input to node j of layer ℓ
- W_ij^ℓ : Weight from layer ℓ−1 node i to layer ℓ node j
- σ(x) = 1/(1 + e^{−x}) : Sigmoid Transfer Function
- θ_j^ℓ : Bias of node j of layer ℓ
- O_j^ℓ : Output of node j in layer ℓ
- t_j : Target value of node j of the output layer


The error calculation

Given a set of training target values t_k and the output layer outputs O_k, we can write the error as

\[
E = \frac{1}{2}\sum_{k \in K}(O_k - t_k)^2
\]

We let the error of the network for a single training iteration be denoted by E. We want to calculate ∂E/∂W_jk^ℓ, the rate of change of the error with respect to the given connective weight, so we can minimize it.

Now we consider two cases: the node is an output node, or it is in a hidden layer...
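For example, the error of one training iteration can be computed directly from the outputs and targets (a sketch with made-up values; the function name is my own):

```python
def network_error(outputs, targets):
    """E = 1/2 * sum_k (O_k - t_k)^2 over the output layer."""
    return 0.5 * sum((o - t) ** 2 for o, t in zip(outputs, targets))

print(network_error([0.73, 0.12], [1.0, 0.0]))  # e.g. two output nodes
```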

Output layer node

\[
\begin{aligned}
\frac{\partial E}{\partial W_{jk}}
&= \frac{\partial}{\partial W_{jk}}\,\frac{1}{2}\sum_{k \in K}(O_k - t_k)^2 \\
&= (O_k - t_k)\,\frac{\partial}{\partial W_{jk}} O_k \\
&= (O_k - t_k)\,\frac{\partial}{\partial W_{jk}} \sigma(x_k) \\
&= (O_k - t_k)\,\sigma(x_k)\bigl(1 - \sigma(x_k)\bigr)\,\frac{\partial}{\partial W_{jk}} x_k \\
&= (O_k - t_k)\,O_k(1 - O_k)\,O_j
\end{aligned}
\]

For notation purposes I will define δ_k to be the expression (O_k − t_k) O_k (1 − O_k), so we can rewrite the equation above as

\[
\frac{\partial E}{\partial W_{jk}} = O_j \delta_k
\qquad\text{where}\qquad
\delta_k = O_k(1 - O_k)(O_k - t_k)
\]
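In code, the output-layer result ∂E/∂W_jk = O_j δ_k looks like the following sketch (the names and the sample values are illustrative, not from the slides):

```python
def output_delta(o_k, t_k):
    """delta_k = O_k * (1 - O_k) * (O_k - t_k) for an output node."""
    return o_k * (1.0 - o_k) * (o_k - t_k)

def output_weight_gradient(o_j, o_k, t_k):
    """dE/dW_jk = O_j * delta_k."""
    return o_j * output_delta(o_k, t_k)

# Hidden output O_j = 0.6 feeding an output node with O_k = 0.73, target t_k = 1.0
print(output_weight_gradient(0.6, 0.73, 1.0))
```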

Hidden layer node

\[
\begin{aligned}
\frac{\partial E}{\partial W_{ij}}
&= \frac{\partial}{\partial W_{ij}}\,\frac{1}{2}\sum_{k \in K}(O_k - t_k)^2 \\
&= \sum_{k \in K}(O_k - t_k)\,\frac{\partial}{\partial W_{ij}} O_k \\
&= \sum_{k \in K}(O_k - t_k)\,\frac{\partial}{\partial W_{ij}} \sigma(x_k) \\
&= \sum_{k \in K}(O_k - t_k)\,\sigma(x_k)\bigl(1 - \sigma(x_k)\bigr)\,\frac{\partial x_k}{\partial W_{ij}} \\
&= \sum_{k \in K}(O_k - t_k)\,O_k(1 - O_k)\,\frac{\partial x_k}{\partial O_j}\cdot\frac{\partial O_j}{\partial W_{ij}} \\
&= \sum_{k \in K}(O_k - t_k)\,O_k(1 - O_k)\,W_{jk}\,\frac{\partial O_j}{\partial W_{ij}} \\
&= \frac{\partial O_j}{\partial W_{ij}}\sum_{k \in K}(O_k - t_k)\,O_k(1 - O_k)\,W_{jk} \\
&= O_j(1 - O_j)\,\frac{\partial x_j}{\partial W_{ij}}\sum_{k \in K}(O_k - t_k)\,O_k(1 - O_k)\,W_{jk} \\
&= O_j(1 - O_j)\,O_i\sum_{k \in K}(O_k - t_k)\,O_k(1 - O_k)\,W_{jk}
\end{aligned}
\]

But, recalling our definition of δ_k, we can write this as

\[
\frac{\partial E}{\partial W_{ij}} = O_i\,O_j(1 - O_j)\sum_{k \in K}\delta_k W_{jk}
\]

Similar to before, we will now define all terms besides the O_i to be δ_j, so we have

\[
\frac{\partial E}{\partial W_{ij}} = O_i \delta_j
\]
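The hidden-layer result translates into a sketch like this, where w_jk holds the weights W_jk from hidden node j to each output node k (names and sample numbers are illustrative, not from the slides):

```python
def hidden_delta(o_j, deltas_k, w_jk):
    """delta_j = O_j * (1 - O_j) * sum_k delta_k * W_jk."""
    return o_j * (1.0 - o_j) * sum(d * w for d, w in zip(deltas_k, w_jk))

def hidden_weight_gradient(o_i, o_j, deltas_k, w_jk):
    """dE/dW_ij = O_i * delta_j."""
    return o_i * hidden_delta(o_j, deltas_k, w_jk)

# Two output nodes: their deltas and the weights from hidden node j to them.
print(hidden_weight_gradient(0.9, 0.6, [-0.05, 0.02], [0.3, -0.7]))
```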

How weights affect errors

For an output layer node k ∈ K

\[
\frac{\partial E}{\partial W_{jk}} = O_j \delta_k
\qquad\text{where}\qquad
\delta_k = O_k(1 - O_k)(O_k - t_k)
\]

For a hidden layer node j ∈ J

\[
\frac{\partial E}{\partial W_{ij}} = O_i \delta_j
\qquad\text{where}\qquad
\delta_j = O_j(1 - O_j)\sum_{k \in K}\delta_k W_{jk}
\]

What about the bias?

If we incorporate the bias term θ into the equation, you will find that

\[
\frac{\partial O}{\partial \theta} = O(1 - O)\,\frac{\partial \theta}{\partial \theta}
\]

and because ∂θ/∂θ = 1, we can view the bias term as the weight on an input from a node whose output is always one.

This holds for any layer ℓ we are concerned with; a substitution into the previous equations gives us

\[
\frac{\partial E}{\partial \theta} = \delta^{\ell}
\]

(because a constant 1 takes the place of O^{ℓ−1}, the output from the "previous layer")
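One common way to realize this in code is to do exactly what the slide describes: append a constant input of 1 and fold θ in as one more weight, so its gradient is simply δ (a sketch under that assumption; names are my own):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def neuron_output_bias_as_weight(inputs, weights_and_bias):
    """Treat theta as a weight attached to a constant input of 1."""
    extended = list(inputs) + [1.0]          # the node "which is always one"
    x = sum(w * xi for w, xi in zip(weights_and_bias, extended))
    return sigmoid(x)

# [w1, w2, w3, theta]: the last entry is the bias; its gradient is just delta.
print(neuron_output_bias_as_weight([0.5, -1.0, 2.0], [0.1, 0.4, -0.3, 0.2]))
```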

The back propagation algorithm

1. Run the network forward with your input data to get the network output.

2. For each output node compute

\[
\delta_k = O_k(1 - O_k)(O_k - t_k)
\]

3. For each hidden node calculate

\[
\delta_j = O_j(1 - O_j)\sum_{k \in K}\delta_k W_{jk}
\]

4. Update the weights and biases as follows. Given

\[
\Delta W = -\eta\,\delta^{\ell} O^{\ell-1},
\qquad
\Delta\theta = -\eta\,\delta^{\ell},
\]

apply

\[
W + \Delta W \to W,
\qquad
\theta + \Delta\theta \to \theta.
\]
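Putting steps 1-4 together, here is a compact Python sketch of one training iteration for a network with a single hidden layer. It illustrates the algorithm above rather than reproducing the author's code; the layer sizes, learning rate η, initialization, and all variable names are assumptions.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def layer_forward(prev_outputs, W, theta):
    """O_j = sigma(sum_i W[i][j] * O_i + theta_j) for each node j of the layer."""
    return [sigmoid(sum(W[i][j] * prev_outputs[i] for i in range(len(prev_outputs))) + theta[j])
            for j in range(len(theta))]

def train_step(x, t, W_ij, theta_j, W_jk, theta_k, eta=0.5):
    # 1. Forward pass: input layer I -> hidden layer J -> output layer K.
    O_j = layer_forward(x, W_ij, theta_j)
    O_k = layer_forward(O_j, W_jk, theta_k)

    # 2. Output-layer deltas: delta_k = O_k (1 - O_k) (O_k - t_k).
    delta_k = [o * (1 - o) * (o - tk) for o, tk in zip(O_k, t)]

    # 3. Hidden-layer deltas: delta_j = O_j (1 - O_j) * sum_k delta_k W_jk.
    delta_j = [O_j[j] * (1 - O_j[j]) * sum(delta_k[k] * W_jk[j][k] for k in range(len(delta_k)))
               for j in range(len(O_j))]

    # 4. Updates: Delta W = -eta * delta * O_prev, Delta theta = -eta * delta.
    for j in range(len(O_j)):
        for k in range(len(O_k)):
            W_jk[j][k] -= eta * delta_k[k] * O_j[j]
    for i in range(len(x)):
        for j in range(len(O_j)):
            W_ij[i][j] -= eta * delta_j[j] * x[i]
    for k in range(len(O_k)):
        theta_k[k] -= eta * delta_k[k]
    for j in range(len(O_j)):
        theta_j[j] -= eta * delta_j[j]
    return O_k

# Tiny 2-3-1 network with random weights, trained toward the target [1.0].
random.seed(0)
W_ij = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(2)]
W_jk = [[random.uniform(-1, 1) for _ in range(1)] for _ in range(3)]
theta_j, theta_k = [0.0] * 3, [0.0] * 1
for _ in range(1000):
    out = train_step([0.5, -0.2], [1.0], W_ij, theta_j, W_jk, theta_k)
print(out)  # the output should move toward 1.0 over the iterations
```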
