An Introduction to Recurrent Neural Networks
Alex Atanasov
Dept. of Physics, Yale University
April 24, 2018
Overview
1 Introduction to RNNs
2 Historical Background
3 Mathematical Formulation
4 Unrolling
5 Computing Gradients
Throughout this presentation I will be using the notation from the book by Ian Goodfellow, Yoshua Bengio, and Aaron Courville.
Motivation
Two motivations for recurrent neural network models:
Sequential Processing
Modeling of Neuronal Connectivity
Most human brains don’t look like this¹:
¹Obligatory political reference here
Instead, we have neurons connecting in a dense web called a recurrent network
RNN Examples
We can also combine RNNs with other networks we’ve seen before
Recurrent neural networks are
Designed for processing sequential data
Like CNNs, motivated by biological examples
Unlike CNNs and deep neural networks in that their neural connections can contain cycles
A very general form of neural network
Turing complete
Visual Representation
From a neuroscience paper¹:
¹Song et al. 2014, Training Excitatory-Inhibitory Recurrent Neural Networks for Cognitive Tasks: A Simple and Flexible Framework
Hopfield Networks
The first formulation of a “recurrent-like” neural network was made by John Hopfield (1982)
Very simple form of neural network
Built in the context of theoretical neuroscience
First-order attempt at understanding the mechanism underlying associative memory
Still an important object of study, primarily in neuroscience
LSTM Networks
Invented by Hochreiter and Schmidhuber in 1997
Only recently (2007), when computational resources were powerful enough to train LSTM networks, did they begin to revolutionize the field
First came to fame for outperforming all traditional models in certain speech applications
In 2009, RNNs set records for handwriting recognition
In 2015, Google’s speech recognition received a 47% jump using an LSTM network
Significantly improved machine translation, language modeling, and multilingual language processing (e.g. Google Translate)
Together with CNNs, significantly improved image captioning
Let’s actually define what an RNN is
Our Variables
Our input will be a sequence of vectors:
x(1) . . . x(τ)
Let 1 ≤ t ≤ τ index these vectors. Our input is then denoted x(t).
Thinking of t as “time” gives us the dynamical evolution of a system.
The activity of the neurons inside the RNN at time t will be denoted by the vector h(t)
The output at time t will be denoted o(t)
The target output at time t will be denoted y(t)
The RNN’s goal is to minimize a loss L(o(t), y(t)) over all times.
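As a concrete sketch of how these objects might be laid out in code, here is a minimal NumPy setup; the dimensions and data below are made up for illustration and are not from the slides:

    import numpy as np

    n_in, n_hidden, n_out, tau = 5, 8, 3, 10      # hypothetical sizes

    # The input sequence x(1), ..., x(tau): one vector per time step
    x = np.random.randn(tau, n_in)
    # Hidden activities h(t) and outputs o(t), to be filled in by the recurrence
    h = np.zeros((tau, n_hidden))
    o = np.zeros((tau, n_out))
    # Target outputs y(t); here, an integer class label per time step
    y = np.random.randint(0, n_out, size=tau)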
Equations of Evolution
Given an input x(t) with
Input weights Uij connecting input j to RNN neuron i
Internal weights Wij connecting RNN neuron j to RNN neuron i and internal bias bi on RNN neuron i
Output weights Vij connecting RNN neuron j to output neuron i and internal bias ci on output neuron i
The time evolution of h(t), o(t) is given by:
h(t) = tanh[U · x(t) + W · h(t−1) + b]
o(t) = V · h(t) + c
Often, we want a probability as our output, so the predicted output is
ŷ(t) = softmax(o(t))
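A minimal NumPy sketch of a single time step of these equations; the sizes, initialization, and variable names are illustrative assumptions rather than anything fixed by the slides:

    import numpy as np

    def softmax(z):
        e = np.exp(z - z.max())                    # shift for numerical stability
        return e / e.sum()

    n_in, n_hidden, n_out = 5, 8, 3                # hypothetical sizes
    U = np.random.randn(n_hidden, n_in) * 0.1      # input weights U
    W = np.random.randn(n_hidden, n_hidden) * 0.1  # recurrent weights W
    b = np.zeros(n_hidden)                         # hidden bias b
    V = np.random.randn(n_out, n_hidden) * 0.1     # output weights V
    c = np.zeros(n_out)                            # output bias c

    x_t = np.random.randn(n_in)                    # x(t)
    h_prev = np.zeros(n_hidden)                    # h(t-1)

    h_t = np.tanh(U @ x_t + W @ h_prev + b)        # h(t) = tanh(U x(t) + W h(t-1) + b)
    o_t = V @ h_t + c                              # o(t) = V h(t) + c
    yhat_t = softmax(o_t)                          # predicted distribution over outputs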
So, schematically: x(t) is mapped by U into h(t) (together with W and b), and h(t) is mapped by V and c into o(t)
Computational Graph
Often in the field of deep learning, RNNs are pictorially described by computational graphs.
Notice this is different from the usual deep network picture
To recover a deep network picture, we perform an operation known as unrolling the graph
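Unrolling amounts to applying the same update once per time step, so the recurrent cell becomes a loop over t = 1, . . . , τ. A self-contained sketch, under the same illustrative assumptions as before (made-up sizes and data):

    import numpy as np

    def softmax(z):
        e = np.exp(z - z.max())
        return e / e.sum()

    n_in, n_hidden, n_out, tau = 5, 8, 3, 10
    rng = np.random.default_rng(0)
    U = rng.normal(0, 0.1, (n_hidden, n_in))
    W = rng.normal(0, 0.1, (n_hidden, n_hidden))
    V = rng.normal(0, 0.1, (n_out, n_hidden))
    b, c = np.zeros(n_hidden), np.zeros(n_out)

    x = rng.normal(size=(tau, n_in))               # input sequence x(1)..x(tau)
    y = rng.integers(0, n_out, size=tau)           # target class at each step

    h = np.zeros(n_hidden)                         # h(0)
    loss = 0.0
    for t in range(tau):                           # each iteration is one "copy" of the cell
        h = np.tanh(U @ x[t] + W @ h + b)
        o = V @ h + c
        yhat = softmax(o)
        loss += -np.log(yhat[y[t]])                # cross-entropy loss at time t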
Examples of Computational Graphs
Feeding the output in:
Summarizing a sequence
Gradient Descent on the Unrolled Graph
After unrolling the computational graph of an RNN, we can use the same gradient methods that we’re familiar with for deep networks.
Our trainable variables are U, W, V, b, and c
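A sketch of how those gradients can be obtained by backpropagating through the unrolled graph (backpropagation through time), assuming the tanh/softmax model above with a per-step cross-entropy loss; all sizes, data, and the learning rate are made up for illustration:

    import numpy as np

    def softmax(z):
        e = np.exp(z - z.max())
        return e / e.sum()

    n_in, n_hidden, n_out, tau = 5, 8, 3, 10
    rng = np.random.default_rng(0)
    U = rng.normal(0, 0.1, (n_hidden, n_in))
    W = rng.normal(0, 0.1, (n_hidden, n_hidden))
    V = rng.normal(0, 0.1, (n_out, n_hidden))
    b, c = np.zeros(n_hidden), np.zeros(n_out)
    x = rng.normal(size=(tau, n_in))               # inputs x(1)..x(tau)
    y = rng.integers(0, n_out, size=tau)           # target class at each step

    # Forward pass through the unrolled graph, storing h(t) and the predictions
    h = np.zeros((tau + 1, n_hidden))              # h[0] plays the role of h(0) = 0
    yhat = np.zeros((tau, n_out))
    for t in range(tau):
        h[t + 1] = np.tanh(U @ x[t] + W @ h[t] + b)
        yhat[t] = softmax(V @ h[t + 1] + c)

    # Backward pass: accumulate gradients of the total cross-entropy loss
    dU, dW, dV = np.zeros_like(U), np.zeros_like(W), np.zeros_like(V)
    db, dc = np.zeros_like(b), np.zeros_like(c)
    dh_next = np.zeros(n_hidden)                   # gradient flowing back from step t+1
    for t in reversed(range(tau)):
        do = yhat[t].copy(); do[y[t]] -= 1.0       # dL/do(t) for softmax + cross-entropy
        dV += np.outer(do, h[t + 1]); dc += do
        dh = V.T @ do + dh_next                    # contributions from output and from step t+1
        dz = (1.0 - h[t + 1] ** 2) * dh            # back through tanh
        dU += np.outer(dz, x[t]); dW += np.outer(dz, h[t]); db += dz
        dh_next = W.T @ dz

    # One gradient-descent step on U, W, V, b, c (learning rate is arbitrary here)
    for P, dP in [(U, dU), (W, dW), (V, dV), (b, db), (c, dc)]:
        P -= 0.1 * dP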
Next lecture
More on the training of RNNs: vanishing and exploding gradients
A tour of the many variations of RNNs:
Hopfield
LSTM
Neural Turing Machines