• We have been discussing some simple ideas from statistical learning theory.
• The risk minimization framework that we discussed gives us a better perspective on understanding the unifying theme in different learning algorithms.
• We will now go back to studying pattern classification algorithms.
• We will first briefly review algorithms for learning linear classifiers and then start looking at methods to learn nonlinear classifiers.
Linear Models

• In the two class case, the linear classifier is given by

h(X) = sign(W^T X + w_0)

• We have seen that we can also think of h(X) as

h(X) = sign(W^T Φ(X) + w_0),

where Φ(X) = [φ_1(X), ..., φ_m(X)]^T, as long as the φ_i are fixed (possibly nonlinear) functions.
• We discussed many algorithms for learning W.
• The Perceptron algorithm is a simple error-correcting method that is guaranteed to find a separating hyperplane if one exists (see the sketch below).
• The perceptron convergence theorem shows that given any training set of linearly separable patterns, the algorithm will find a separating hyperplane.
• Our discussion on statistical learning theory gives us an idea of how many iid examples we should have before we can be confident that a hyperplane separating the examples will also do well on test data.
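As a refresher, here is a minimal sketch of the perceptron in Python; the convention that labels lie in {-1, +1} and the use of augmented feature vectors are illustrative choices, not fixed by the discussion above.

```python
import numpy as np

def perceptron(X, y, max_epochs=100):
    """Perceptron on augmented feature vectors (rows of X), labels y in {-1, +1}.

    Cycles through the data and corrects W on every misclassified example;
    on linearly separable data it stops once a separating hyperplane is found.
    """
    W = np.zeros(X.shape[1])
    for _ in range(max_epochs):
        errors = 0
        for Xi, yi in zip(X, y):
            if yi * (W @ Xi) <= 0:   # misclassified (or on the boundary)
                W += yi * Xi         # error-correcting update
                errors += 1
        if errors == 0:              # no mistakes in a full pass: done
            break
    return W
```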
• We have also seen the least-squares method where we find W to minimize

J(W) = (1/n) ∑_i (W^T X_i - y_i)^2

where, for simplicity of notation, we have assumed augmented feature vectors.
• In our risk minimization framework, H is parametrized by W, we take h(X) = W^T X, and we minimize the empirical risk under the squared-error loss function.
• We have seen how to obtain the least-squares solution:

W* = (A^T A)^{-1} A^T Y

where the rows of the matrix A are the feature vectors and the components of Y are the y_i (see the sketch below).
• The least-squares method can also be used to learn linear regression models.
• The only difference is that in a regression model, the y_i are real-valued.
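A minimal sketch in NumPy; the data here are made up for illustration, and np.linalg.lstsq is used instead of explicitly forming (A^T A)^{-1}, since it computes the same minimizer more stably.

```python
import numpy as np

# Illustrative data: rows of A are augmented feature vectors (last entry 1),
# and the components of Y are the targets y_i.
rng = np.random.default_rng(0)
A = np.hstack([rng.normal(size=(20, 2)), np.ones((20, 1))])
Y = np.sign(A @ np.array([1.0, -2.0, 0.5]))

# W* = (A^T A)^{-1} A^T Y; lstsq computes the same minimizer of ||A W - Y||^2.
W_star, *_ = np.linalg.lstsq(A, Y, rcond=None)
predictions = np.sign(A @ W_star)
```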
• We have seen that we can also minimize the empirical risk J(W) using gradient descent.
• We can also run this gradient descent in an incremental fashion by considering one example at a time.
• That gives us another classical algorithm called the LMS algorithm (sketched below).
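A minimal sketch of the LMS update; the fixed step size eta is an illustrative choice (in practice it is often decreased over iterations).

```python
import numpy as np

def lms(X, y, eta=0.01, epochs=50):
    """Incremental gradient descent on J(W) = (1/n) sum_i (W^T X_i - y_i)^2,
    updating W after each example (the LMS rule)."""
    W = np.zeros(X.shape[1])
    for _ in range(epochs):
        for Xi, yi in zip(X, y):
            error = W @ Xi - yi     # prediction error on this example
            W -= eta * error * Xi   # step along the negative gradient
    return W
```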
• We have also seen that we can use the least-squares idea to learn a model g(W^T X) by redefining J as

J(W) = (1/n) ∑_i (g(W^T X_i) - y_i)^2

• An important example is logistic regression, where we take g to be the sigmoid function.
• We minimize J by an incremental version of gradient descent (sketched below).
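A minimal sketch of that incremental update for the squared-error criterion above, assuming labels y_i in {0, 1} (an illustrative convention).

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def logistic_squared_error(X, y, eta=0.1, epochs=100):
    """Incremental gradient descent on J(W) = (1/n) sum_i (g(W^T X_i) - y_i)^2
    with g the sigmoid; labels y_i are assumed to be in {0, 1}."""
    W = np.zeros(X.shape[1])
    for _ in range(epochs):
        for Xi, yi in zip(X, y):
            g = sigmoid(W @ Xi)
            # Per-example gradient: 2 (g - y_i) g (1 - g) X_i; the constant 2
            # is absorbed into the step size eta.
            W -= eta * (g - yi) * g * (1.0 - g) * Xi
    return W
```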
• Another important method for learning linear classifiers is the Fisher Linear Discriminant.
• Here, we look for a direction W such that the patterns of the two classes get ‘well-separated’ when projected onto this one-dimensional subspace.
• As we mentioned, the Fisher Linear Discriminant can be thought of as a special case of the least-squares method of learning a linear regression model with special target values (see the sketch below).
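For reference, a minimal sketch of the usual closed form for the Fisher direction, W ∝ S_W^{-1}(m_1 - m_2), where m_1, m_2 are the class means and S_W is the total within-class scatter matrix; this formula is standard but was not derived above, and the threshold on the projected values still has to be chosen separately.

```python
import numpy as np

def fisher_direction(X1, X2):
    """Fisher direction W proportional to S_W^{-1} (m1 - m2), where m1, m2 are
    the class means and S_W is the total within-class scatter matrix."""
    m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
    S1 = (X1 - m1).T @ (X1 - m1)   # scatter of class 1
    S2 = (X2 - m2).T @ (X2 - m2)   # scatter of class 2
    return np.linalg.solve(S1 + S2, m1 - m2)
```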
Beyond linear models

• Learning linear models (classifiers) is generally efficient.
• However, linear models are not always sufficient.
• The best linear function may still be a poor fit.
• We have looked at three broad approaches to learning nonlinear classifiers.
• We now discuss neural network models.
Neural network models

• We need a ‘good’ parameterized class of nonlinear functions to learn nonlinear classifiers.
• Artificial neural networks are one such class.
• Nonlinear functions are built up through composition of summations and sigmoids.
• Useful for both classification and regression.
• In this course we will study only multilayer feedforward networks.
• They are useful because they offer a good parameterized class of nonlinear functions, and there are some efficient algorithms to learn them.
• However, historically, the development of (artificial) neural network models was motivated by some ideas on the structure of the human brain.
• We briefly look at this perspective of neural networks as an approach to engineering intelligent systems.
What is an Artificial Neural Network?

"A parallel distributed information processor made up of simple processing units that has a propensity for acquiring problem solving knowledge through experience"

• Large number of interconnected units
• Each unit implements a simple, nonlinear function
• The ‘knowledge’ resides in the interconnection strengths
• Problem-solving ability is often acquired through ‘learning’

An architecture inspired by the structure of the Brain
The Human Brain

• Neuron - the basic computing unit
• The Brain is a highly organized structure of networks of interconnected neurons
• In the Brain:
  Number of neurons ∼ 10^11 (100 billion)
  Average synapses per neuron ∼ 10,000 (1,000-100,000)
  Total synapses ∼ 10^15
  Neuron time constants ∼ milliseconds
  A single neuron can send 100 spikes per second
A rough estimate of processing power:

One arithmetic operation per synapse
→ 10^4 operations per neuron per spike
→ 10^6 operations per neuron per sec
→ 10^17 operations per sec!!
(A gigaflop is 10^9 operations; a teraflop is 10^12 operations!)

Massive parallelism can deliver massive computing power, if we know how to manage it.
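The same estimate, spelled out step by step as a short computation:

```python
# All figures are the rough numbers from the estimate above.
ops_per_synapse = 1
synapses_per_neuron = 10**4
spikes_per_sec = 100
neurons = 10**11

ops_per_neuron_per_spike = ops_per_synapse * synapses_per_neuron    # 10^4
ops_per_neuron_per_sec = ops_per_neuron_per_spike * spikes_per_sec  # 10^6
total_ops_per_sec = ops_per_neuron_per_sec * neurons                # 10^17
```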
Digital computers:
• Precise design, highly constrained, not very adaptive or fault tolerant; centralized control; deterministic; basic switching times ∼ 10^-9 sec

Natural neural networks:
• Massively parallel, highly adaptive and fault tolerant; self-configuring, self-repairing; noisy, stochastic; basic switching time ∼ 10^-3 sec
• Most capabilities of the Brain are LEARNT.
Artificial Intelligence (AI)

• ‘Understanding’ intelligence in computational terms.
• Developing ‘machines’ that are ‘intelligent’.

At least two distinct approaches:

• Try to model intelligent behavior in terms of processing structured symbols. (The resulting methods, algorithms, etc. may not resemble the brain at the implementation level.)
• A second approach is based on mimicking the human brain at the architectural/implementation level.
The symbolic AI approach

• The Brain is to be understood in computational terms only
• The physical symbol system hypothesis
• A digital computer is a universal symbol manipulator and can be programmed to be intelligent
• Many useful engineering applications, e.g. expert systems

An implicit faith: the architecture of the Brain per se is irrelevant for engineering intelligent artifacts.
Artificial Neural Networks

• Can be viewed as one approach towards understanding the brain / building intelligent machines
• Computational architectures inspired by the brain:
  computational methods for ‘learning’ dependencies in a data stream,
  e.g. pattern recognition, system identification
• Characteristics: emergent properties, learning, self-adaptation
• Modeling biology? Mathematically purified neurons!!
Artificial Neural Networks

Computing machines that try to mimic brain architecture:
• A large network of interconnected units
• Each unit has a simple input-output mapping
• Each interconnection has a numerical weight attached to it
• The output of a unit depends on the outputs and connection weights of the units connected to it
• ‘Knowledge’ resides in the weights
• Problem-solving ability is often acquired through learning
Single neuron model

• The x_i are inputs into the (artificial) neuron and the w_i are the corresponding weights; y is the output of the neuron.
• Net input: η = ∑_j w_j x_j
• Output: y = f(η), where f(·) is called the activation function (the Perceptron and AdaLinE are such models; see the sketch below).
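A minimal sketch of this unit; the default tanh is just an illustrative choice of f.

```python
import numpy as np

def neuron(x, w, f=np.tanh):
    """Single unit: net input eta = sum_j w_j x_j, output y = f(eta).
    Any activation function can be passed in as f."""
    eta = np.dot(w, x)
    return f(eta)
```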
Networks of neurons

• We can connect a number of such units or neurons to form a network. Inputs to a neuron can be outputs of other neurons (and/or external inputs).
• Notation:
  y_j - output of the jth neuron;
  w_ij - weight of the connection from neuron i to neuron j.
• Each neuron computes a weighted sum of its inputs and passes it through its activation function to compute its output.
• For example, the output of neuron 5 is

y_5 = f_5(w_35 y_3 + w_45 y_4)
    = f_5(w_35 f_3(w_13 y_1 + w_23 y_2) + w_45 f_4(w_14 y_1 + w_24 y_2))

• By convention, we take y_1 = x_1 and y_2 = x_2.
• Here, x_1, x_2 are the inputs and y_5, y_6 are the outputs (the computation of y_5 is sketched below).
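A minimal sketch of this forward computation for units 1-5 of the example network; using tanh for every activation is an illustrative choice.

```python
import numpy as np

f = np.tanh  # illustrative activation, used for f3, f4 and f5 alike

def forward(x1, x2, w13, w23, w14, w24, w35, w45):
    """Forward pass for the example network: the inputs feed units 3 and 4,
    whose outputs feed unit 5."""
    y1, y2 = x1, x2                 # by convention, y1 = x1 and y2 = x2
    y3 = f(w13 * y1 + w23 * y2)
    y4 = f(w14 * y1 + w24 * y2)
    y5 = f(w35 * y3 + w45 * y4)
    return y5
```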
• A single neuron ‘represents’ a class of functions from ℜ^m to ℜ.
• Specific sets of weights realise specific functions.
• By interconnecting many units/neurons, networks can represent more complicated functions from ℜ^m to ℜ^m'.
• The architecture constrains the function class that can be represented; the weights define a specific function in the class.
• To form meaningful networks, nonlinearity of the activation function is important.
Typical activation functions

1. Hard limiter:

f(x) = 1 if x > τ
     = 0 otherwise

• We can keep τ at zero and add one more input line to the neuron. An example of a single neuron with this activation function is the Perceptron.
Activation functions (contd.)

2. Sigmoid function:

f(x) = a / (1 + exp(-bx)),  a, b > 0

3. tanh:

f(x) = a tanh(bx),  a, b > 0

All three activation functions are sketched below.
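A minimal sketch of the three activation functions; the default values of τ, a, and b are illustrative.

```python
import numpy as np

def hard_limiter(x, tau=0.0):
    """f(x) = 1 if x > tau, else 0."""
    return np.where(x > tau, 1.0, 0.0)

def sigmoid(x, a=1.0, b=1.0):
    """f(x) = a / (1 + exp(-b x)), with a, b > 0."""
    return a / (1.0 + np.exp(-b * x))

def scaled_tanh(x, a=1.0, b=1.0):
    """f(x) = a tanh(b x), with a, b > 0."""
    return a * np.tanh(b * x)
```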
Why study such models?

• A belief that the architecture of the Brain is critical to intelligent behavior.
• The models can implement highly nonlinear functions. They are adaptive and can be trained.
  Useful in many applications:
  time series prediction,
  system identification and control,
  pattern recognition and regression
• The models can help us understand Brain function: computational neuroscience.
Many different models are possible

• Evolution:
  • discrete time / continuous time
  • synchronous / asynchronous
  • deterministic / stochastic
• Interconnections:
  • feedforward / having feedback
• States or outputs of units:
  • binary / finitely many / continuous
Recurrent networks

• The network we saw earlier has no feedback.
• Here is an example of a network with feedback.
• Such a network can model a dynamical system (simulated in the sketch below):

y(k) = f(y(k-1), x_1(k), x_2(k))
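A minimal sketch of iterating such a system; the particular choice of f (a single tanh unit with feedback weight 0.5) is made up for illustration.

```python
import numpy as np

def simulate(f, y0, x1_seq, x2_seq):
    """Iterate y(k) = f(y(k-1), x1(k), x2(k)) along the input sequences."""
    y, ys = y0, []
    for x1, x2 in zip(x1_seq, x2_seq):
        y = f(y, x1, x2)
        ys.append(y)
    return ys

# Illustrative f: one tanh unit with feedback weight 0.5 on its own output.
f = lambda y, x1, x2: np.tanh(0.5 * y + x1 - x2)
outputs = simulate(f, 0.0, [1.0, 0.0, 1.0], [0.0, 1.0, 0.0])
```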
• We will consider only feedforward networks, which provide a general class of nonlinear functions.
• These can always be organized as a layered network.
• This network represents a class of functions from ℜ^2 to ℜ^2.
• Each unit can also have a ‘bias’ input.
• This is shown for a single unit below.
• One can always think of the bias as an extra input:

y = f(∑_{i=1}^d w_i x_i + w_0) = f(∑_{i=0}^d w_i x_i),  with x_0 = +1

Both forms are sketched in the code below.
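A minimal sketch of the two equivalent forms; the tanh activation is an illustrative choice.

```python
import numpy as np

def unit_with_bias(x, w, w0, f=np.tanh):
    """y = f(sum_{i=1}^d w_i x_i + w_0)."""
    return f(np.dot(w, x) + w0)

def unit_augmented(x, w, w0, f=np.tanh):
    """Equivalent form: treat the bias as weight w_0 on an extra input x_0 = +1."""
    x_aug = np.concatenate(([1.0], x))
    w_aug = np.concatenate(([w0], w))
    return f(np.dot(w_aug, x_aug))
```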
Multilayer feedforward networks
• Here is a general multilayer feedforward network.