8/19/2019 L_3(H1)
http://slidepdf.com/reader/full/l3h1 1/35
Advance Topics in Mathematical Methods ME7100
Artificial Neural Network
Introduction
An Artificial Neural Network (ANN) is an information processing paradigm that is inspired by the way biological nervous systems, such as the brain, process information.
http://www.doc.ic.ac.uk/~nd/surprise_96/journal/vol4/cs11/report.html
• A neuron collects signals from other neurons through structures called dendrites.
• The neuron sends out signals through a thin strand known as an axon.
• A synapse converts the activity from the axon into electrical effects.
[Figure: single-neuron model. Input values x1, x2, …, xm are multiplied by weights w1, w2, …, wm and summed together with a bias b; an activation function then produces the output y = f(∑ wi xi + b).]
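The neuron model in the figure can be sketched in a few lines of Python; the particular weights, bias and sigmoid activation used below are illustrative values, not taken from the slides.

```python
import math

def neuron(x, w, b, f):
    """Single artificial neuron: y = f(sum_i w_i * x_i + b)."""
    v = sum(wi * xi for wi, xi in zip(w, x)) + b  # summing function with bias
    return f(v)                                    # activation function

# Example with a sigmoid activation (illustrative values)
sigmoid = lambda v: 1.0 / (1.0 + math.exp(-v))
y = neuron([1.0, 2.0], [0.5, -0.25], b=0.1, f=sigmoid)
```

Any activation can be passed in as `f`, which matches the deck's view of the neuron model as "summing function + bias + activation".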
● An Artificial Neural Network encompasses:
− a neuron model: the type of activation function
− an architecture: the network structure (number of neurons, number of layers, weight at each neuron)
− a learning algorithm: training of the ANN by modifying the weights in order to mimic the known observations (input, output), such that unknown cases can be predicted
Activation Function
− sigmoid
− rational function
− hyperbolic tangent
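The three activations named above can be written directly. The slide gives only the names, so the exact formula used for the rational function (the "fast sigmoid" v / (1 + |v|)) is an assumption here; sigmoid and hyperbolic tangent are standard.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))   # output in (0, 1)

def tanh_act(v):
    return math.tanh(v)                  # output in (-1, 1)

def rational(v):
    return v / (1.0 + abs(v))            # "fast sigmoid" form, output in (-1, 1)
```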
Network architectures
● Three different classes of network architectures:
− single-layer feed-forward
− multi-layer feed-forward
− recurrent

[Figure: a single-layer feed-forward network and a multi-layer feed-forward network]
http://codebase.mql4.com/5738
− Recurrent: in a recurrent network, the weight matrix for each layer contains input weights from all other neurons in the network, not just neurons from the previous layer.
http://en.wikibooks.org/wiki/Artificial_Neural_Networks/Recurrent_Networks
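A minimal sketch of that distinction: in one recurrent update each neuron's new state depends on every neuron's previous state, not only on the previous layer. The matrix sizes and values below are illustrative assumptions.

```python
import math

def recurrent_step(x, h, W_in, W_rec, b):
    """One recurrent update: each neuron sees every neuron's previous state h."""
    n = len(h)
    h_new = []
    for j in range(n):
        v = b[j]
        v += sum(W_in[i][j] * x[i] for i in range(len(x)))   # weights from the inputs
        v += sum(W_rec[i][j] * h[i] for i in range(n))        # weights from ALL neurons
        h_new.append(math.tanh(v))
    return h_new

# One step from a zero state (illustrative values)
h = [0.0, 0.0]
h = recurrent_step([1.0], h, W_in=[[0.5, -0.5]], W_rec=[[0.1, 0.2], [0.3, 0.4]], b=[0.0, 0.0])
```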
− single-layer feed-forward
− e.g. Perceptron, Rosenblatt (1958): classification into one of two categories
● used for binary classification: geometrically, finding a hyper-plane v = c that separates the examples into two classes

http://codebase.mql4.com/5738
Case: Can we predict heart disease on the basis of age, sex (M/F), smoking frequency, cholesterol, BP and weight?

Age | Sex (M=1, F=0) | Smoking frequency | Cholesterol | BP | Weight | Heart patient (0 = non-patient, 1 = patient)
55 | 0 | 3 | 143 | 109 | 66 | 0
41 | 0 | 1 | 145 | 91 | 43 | 0
45 | 1 | 1 | 224 | 126 | 46 | 1
60 | 0 | 8 | 237 | 83 | 85 | 1
22 | 0 | 3 | 140 | 83 | 56 | 0
53 | 1 | 4 | 163 | 94 | 73 | 1
34 | 0 | 5 | 188 | 88 | 53 | 1
41 | 1 | 5 | 192 | 120 | 46 | 1
39 | 1 | 6 | 222 | 126 | 75 | 1
52 | 1 | 8 | 179 | 99 | 72 | 1
58 | 0 | 7 | 165 | 122 | 58 | 1
58 | 1 | 6 | 182 | 117 | 47 | 1
37 | 1 | 3 | 174 | 113 | 46 | 0
49 | 0 | 2 | 190 | 126 | 45 | 1
− A perceptron uses a step function:

f(v) = 1 if v ≥ 250
f(v) = 0 if v < 250

v = ∑ wi xi + b,  b = 1.55,  y = 0 or 1

http://codebase.mql4.com/5738

X | W
Age | 0.880052
Sex (M=1, F=0) | -1.13407
Smoking frequency | 1.275656
Cholesterol | 0.870191
BP | 0.124578
Weight | 0.759339
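With these weights and bias, the prediction for a row of the case table can be checked directly. The threshold of 250 follows the reconstruction of the garbled step function on this slide and should be treated as an assumption, though it does reproduce the labels on the rows tried below.

```python
# Weights in table order: age, sex, smoking frequency, cholesterol, BP, weight
W = [0.880052, -1.13407, 1.275656, 0.870191, 0.124578, 0.759339]
b = 1.55

def predict(row, threshold=250.0):
    v = sum(wi * xi for wi, xi in zip(W, row)) + b  # v = sum(w_i x_i) + b
    return 1 if v >= threshold else 0                # step activation

# First row of the table: age 55, sex 0, smoking 3, cholesterol 143, BP 109, weight 66
y = predict([55, 0, 3, 143, 109, 66])
```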
Network Training

Method of gradient descent: an algorithm for finding the nearest local minimum/maximum of a function, which presupposes that the gradient of the function can be computed.

[Figure: a function f(x) with a starting point x = x1 and an extremum at x = xmax]
Choose an arbitrary point x = x1.

For maximization, we need (f(x2) − f(x1)) > 0:
if f′(x) < 0, choose x2 < x1
if f′(x) > 0, choose x2 > x1

Thus, the following will always yield movement towards the maximum:
x2 = x1 + η f′(x1),  η > 0
where η is the learning rate.
Choose an arbitrary point x = x1.

For minimization, we need (f(x2) − f(x1)) < 0:
if f′(x) < 0, choose x2 > x1
if f′(x) > 0, choose x2 < x1

Thus, the following will always yield movement towards the minimum:
x2 = x1 − η f′(x1),  η > 0
where η is the learning rate.
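The minimization rule x2 = x1 − η f′(x1) can be sketched for a function with a known minimum; the choice of f(x) = (x − 3)² and η = 0.1 is illustrative.

```python
def gradient_descent(f_prime, x, eta=0.1, steps=100):
    """Repeatedly apply x <- x - eta * f'(x) to move towards a local minimum."""
    for _ in range(steps):
        x = x - eta * f_prime(x)
    return x

# f(x) = (x - 3)^2 has its minimum at x = 3; f'(x) = 2 (x - 3)
x_min = gradient_descent(lambda x: 2.0 * (x - 3.0), x=0.0)
```

With η too large the iteration overshoots and diverges, which is why the learning rate matters in the ANN updates that follow.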
Multi-variable, single response

Risk of getting trapped in a local maximum or minimum!

Heuristic methods
http://bayen.eecs.berkeley.edu/bayen/?q=webfm_send/246
Method of gradient descent for an ANN: the network is optimized by adjusting the weights such that the error in prediction is minimized.
Linear perceptron

[Figure: inputs x1 … xn, weights w1 … wn, bias b, v = ∑ wi xi + b, output y = f(v)]

E = ½ (d − y)²
∆wi = η (d − y) f′(v) xi
∆b = η (d − y) f′(v)

Repeat till the solution converges.
Linear perceptron, s number of training samples

E = ½ ∑p=1..s (dp − yp)²
∆wi = η ∑p=1..s (dp − yp) f′(vp) xip
∆b = η ∑p=1..s (dp − yp) f′(vp)

Repeat till the solution converges.
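A batch training loop for the linear perceptron following this scheme (with f(v) = v, so f′(v) = 1) might look like the sketch below; the data set and learning rate are illustrative assumptions.

```python
def train_linear_perceptron(X, d, eta=0.05, epochs=500):
    """Batch gradient descent on E = 1/2 * sum_p (d_p - y_p)^2 for y = sum w_i x_i + b."""
    n = len(X[0])
    w, b = [0.0] * n, 0.0
    for _ in range(epochs):
        dw, db = [0.0] * n, 0.0
        for x, t in zip(X, d):
            y = sum(wi * xi for wi, xi in zip(w, x)) + b
            err = t - y                      # (d_p - y_p)
            for i in range(n):
                dw[i] += err * x[i]          # accumulate over the s samples
            db += err
        w = [wi + eta * dwi for wi, dwi in zip(w, dw)]   # delta w_i = eta * sum
        b += eta * db                                     # delta b   = eta * sum
    return w, b

# Fit y = 2*x1 - x2 + 1 from four samples (illustrative)
X = [[0, 0], [1, 0], [0, 1], [1, 1]]
d = [1, 3, 0, 2]
w, b = train_linear_perceptron(X, d)
```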
Linear single-layer perceptron (m inputs & n outputs)

vj = ∑i=1..m wij xi + bj,  yj = f(vj)

E = ½ ∑j=1..n (dj − yj)²
∆wij = η (dj − yj) f′(vj) xi
∆bj = η (dj − yj) f′(vj)

Repeat till the solution converges.
Linear single-layer perceptron (m inputs & n outputs, s number of training samples)

vjp = ∑i=1..m wij xip + bj

E = ½ ∑p=1..s ∑j=1..n (djp − yjp)²
∆wij = η ∑p=1..s (djp − yjp) f′(vjp) xip
∆bj = η ∑p=1..s (djp − yjp) f′(vjp)
Non-linear perceptron

[Figure: inputs x1 … xn, weights w1 … wn, bias b, v = ∑ wi xi + b, output y = f(v)]

Sigmoid activation: y = f(v) = 1 / (1 + e^(−v)),  f′(v) = y (1 − y)

E = ½ (d − y)²
∆wi = η (d − y) y (1 − y) xi
∆b = η (d − y) y (1 − y)
Linear multi-layer perceptron (2 inputs, 2 hidden neurons and one output)

[Figure: x1, x2 feed hidden neurons f1, f2 through weights w11, w21, w12, w22 and biases b1, b2; the hidden outputs feed the output y through weights w1, w2 and bias b]

f1 = f(w11 x1 + w21 x2 + b1)
f2 = f(w12 x1 + w22 x2 + b2)
y = f(w1 f1 + w2 f2 + b)

Output weight scheme:
∆w1 = η (d − y) f′(w1 f1 + w2 f2 + b) f1
∆w2 = η (d − y) f′(w1 f1 + w2 f2 + b) f2   (similarly)
∆b = η (d − y) f′(w1 f1 + w2 f2 + b)
Input weight scheme:
∆w11 = η (d − y) f′(w1 f1 + w2 f2 + b) w1 f′(w11 x1 + w21 x2 + b1) x1
∆w21 = η (d − y) f′(w1 f1 + w2 f2 + b) w1 f′(w11 x1 + w21 x2 + b1) x2
∆b1 = η (d − y) f′(w1 f1 + w2 f2 + b) w1 f′(w11 x1 + w21 x2 + b1)
and similarly for w12, w22 and b2, with w2 and f′(w12 x1 + w22 x2 + b2).
This scheme of propagating the output error backwards through the hidden layer to correct the input weights is known as Back Propagation.
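One back-propagation step for the 2-input, 2-hidden-neuron, 1-output linear network can be sketched as below. The identity activation makes every f′ factor equal to 1, matching the "linear" slides; the initial weights and training sample are illustrative assumptions.

```python
def forward(x, W_h, b_h, w_o, b_o):
    """2-2-1 network with identity activations: f_j = sum_i w_ij x_i + b_j, y = sum_j w_j f_j + b."""
    f = [W_h[0][j] * x[0] + W_h[1][j] * x[1] + b_h[j] for j in range(2)]
    y = w_o[0] * f[0] + w_o[1] * f[1] + b_o
    return f, y

def backprop_step(x, d, W_h, b_h, w_o, b_o, eta=0.1):
    """One back-propagation step; with f(v) = v, every f'(.) factor equals 1."""
    f, y = forward(x, W_h, b_h, w_o, b_o)
    err = d - y
    # Output weight scheme: delta w_j = eta * (d - y) * f_j, delta b = eta * (d - y)
    new_w_o = [w_o[j] + eta * err * f[j] for j in range(2)]
    new_b_o = b_o + eta * err
    # Input weight scheme: the error reaches w_ij through w_j (propagated backwards)
    new_W_h = [[W_h[i][j] + eta * err * w_o[j] * x[i] for j in range(2)] for i in range(2)]
    new_b_h = [b_h[j] + eta * err * w_o[j] for j in range(2)]
    return new_W_h, new_b_h, new_w_o, new_b_o

# Illustrative single step on one sample
W_h, b_h, w_o, b_o = [[0.1, 0.2], [0.3, 0.4]], [0.0, 0.0], [0.5, -0.5], 0.0
x, d = [1.0, 1.0], 1.0
W_h2, b_h2, w_o2, b_o2 = backprop_step(x, d, W_h, b_h, w_o, b_o)
```

A single step should reduce the prediction error on that sample, which is easy to verify by running `forward` before and after the update.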
Linear multi-layer perceptron (m inputs, h hidden neurons and one output)

vj = ∑i=1..m wij xi + bj   (i-th input, j-th hidden neuron),  fj = f(vj)
y = f(∑j=1..h wj fj + b)

∆wj = η (d − y) f′(∑j wj fj + b) fj
∆b = η (d − y) f′(∑j wj fj + b)
∆wij = η (d − y) f′(∑j wj fj + b) wj f′(vj) xi
∆bj = η (d − y) f′(∑j wj fj + b) wj f′(vj)
Linear multi-layer perceptron (m inputs & n outputs)

m inputs, h hidden neurons, n outputs (i-th input, j-th hidden neuron, k-th output)

vj = ∑i=1..m wij xi + bj,  fj = f(vj)
yk = f(∑j=1..h wjk fj + bk)
Derive the weight change scheme for:
1. a sigmoidal multi-layer perceptron, s number of training samples
2. a sigmoidal single-layer perceptron (m inputs & n outputs)
3. a sigmoidal single-layer perceptron (m inputs & n outputs, s number of training samples)
Optimal ANN Architecture

• Map the known data
• Generalize to new data

[Figure: error vs. number of iterations]
• Generalization Techniques
− Splitting technique: divide the data into Training, Validation and Testing sets

[Figure: data set split into training, validation and testing portions]
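The splitting technique can be sketched as below. The 60/20/20 proportions are an illustrative assumption, since the exact split shown on the slide is not recoverable.

```python
import random

def split_data(data, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle and split a data set into training, validation and testing subsets."""
    rows = list(data)
    random.Random(seed).shuffle(rows)     # fixed seed for reproducibility
    n_train = int(train_frac * len(rows))
    n_val = int(val_frac * len(rows))
    return (rows[:n_train],
            rows[n_train:n_train + n_val],
            rows[n_train + n_val:])

train, val, test = split_data(range(100))
```

Training data fits the weights, the validation set decides when to stop (avoiding over-fitting), and the testing set estimates generalization error.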
Predictive Process Models: ANN

[Figure: inputs X1, X2, X3 enter the input layer; weights W1, W2, W3 and biases B connect the input, hidden and output layers; output = F([W][X] + [B])]
Optimal ANN Architecture

• Generalization Techniques
− Cross validation
− Divide the data set into n groups of size k each
− Train on (n − 1) data sets and check the error on the remaining n-th set
− Repeat the process with different initial weights and average the results (ensemble method)
− Calculate the error for all n sets taken as testing sets
− Calculate the error of cross validation:

Ecv = (1/N) ∑i=1..n ∑j=1..n/k (yij(Actual) − yij(Calculated))²
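The cross-validation loop above can be sketched as follows. `train_fn` and `predict_fn` are placeholder hooks for any model (the trivial "predict the training mean" model below is only for illustration), and the squared errors are averaged over all N held-out points as in Ecv.

```python
def cross_validation_error(data, targets, n_groups, train_fn, predict_fn):
    """E_cv = (1/N) * sum over held-out groups of (y_actual - y_predicted)^2."""
    k = len(data) // n_groups               # group size
    total, N = 0.0, 0
    for g in range(n_groups):
        test_idx = range(g * k, (g + 1) * k)
        train_idx = [i for i in range(len(data)) if i not in set(test_idx)]
        # Train on the (n - 1) remaining groups, test on the held-out group
        model = train_fn([data[i] for i in train_idx], [targets[i] for i in train_idx])
        for i in test_idx:
            total += (targets[i] - predict_fn(model, data[i])) ** 2
            N += 1
    return total / N

# Illustrative use: a "model" that just predicts the training-target mean
mean_train = lambda X, y: sum(y) / len(y)
mean_pred = lambda m, x: m
e_cv = cross_validation_error(list(range(10)), [2.0 * v for v in range(10)], 5, mean_train, mean_pred)
```

In practice `train_fn` would run the weight-update schemes from the earlier slides, repeated from different initial weights and averaged (the ensemble step).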