



Introduction to Deep Learning
Recurrent Neural Networks

Prof. Songhwai Oh
ECE, SNU



RECURRENT NEURAL NETWORKS (RNN)
Adapted from slides by Hyemin Ahn



Recurrent Neural Networks : For what?


• Humans remember and use the pattern of a sequence.

• Try ‘a b c d e f g…’

• But how about ‘z y x w v u t s…’ ?

• The idea behind RNNs is to make use of sequential information.

• Let’s learn a pattern of a sequence and utilize it (estimate, generate, etc.).

• But HOW?


Recurrent Neural Networks : Typical RNNs


[Figure: an RNN cell with INPUT, HIDDEN STATE, and OUTPUT, where the hidden state is fed back into the cell through a one-step delay.]

• RNNs are called RECURRENT because they perform the same task for every element of a sequence, with the output depending on the previous computations.

• RNNs have a memory which captures information about what has been calculated so far.

• The hidden state h_t captures some information about the sequence (see the sketch below).

• If we just use tanh, the vanishing or exploding gradient problem happens over long sequences.

• To overcome this, we use LSTM or GRU.
  • LSTM: long short-term memory
  • GRU: gated recurrent unit
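
To make the recurrence concrete, here is a minimal NumPy sketch of a single vanilla RNN step and its unrolling over a sequence (not from the slides; the parameter names W_xh, W_hh, W_hy and the shapes are illustrative). The same weights are reused at every time step, and the hidden state is fed back with a one-step delay.

import numpy as np

def rnn_step(x_t, h_prev, W_xh, W_hh, W_hy, b_h, b_y):
    # New hidden state: combine the current input with the delayed hidden state.
    h_t = np.tanh(W_xh @ x_t + W_hh @ h_prev + b_h)
    # Output computed from the hidden state.
    y_t = W_hy @ h_t + b_y
    return h_t, y_t

def rnn_forward(xs, h0, params):
    # Unroll the same cell over the whole sequence; h carries the memory forward.
    h, ys = h0, []
    for x_t in xs:
        h, y_t = rnn_step(x_t, h, *params)
        ys.append(y_t)
    return ys, h

Backpropagating through many such tanh steps is what leads to the vanishing or exploding gradients mentioned above.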


Recurrent Neural Networks : LSTM


• Let’s think about a machine which guesses the dinner menu from the items in a shopping bag.

"Umm... Carbonara!"


• Cell state: an internal memory unit, like a conveyor belt!


• Forget some memories!


• Insert some memories!

• The LSTM learns (1) how to forget a memory when the previous hidden state h_{t-1} and the new input x_t are given, and (2) then how to add the new memory from the given h_{t-1} and x_t.


Figures from http://colah.github.io/posts/2015-08-Understanding-LSTMs/


[Figure: the forget gate and the input gate of the LSTM cell.]


[Figure: the output gate, the cell state update, and the hidden state of the LSTM cell.]


Recurrent Neural Networks : GRU


LSTM:
f_t = σ(W_f · [h_{t-1}, x_t] + b_f)
i_t = σ(W_i · [h_{t-1}, x_t] + b_i)
o_t = σ(W_o · [h_{t-1}, x_t] + b_o)
c̃_t = tanh(W_C · [h_{t-1}, x_t] + b_C)
c_t = f_t ∗ c_{t-1} + i_t ∗ c̃_t
h_t = o_t ∗ tanh(c_t)
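
Reading these equations directly as code, a minimal NumPy sketch of one LSTM step might look as follows (a sketch only; the explicit sigmoid helper and the parameter names are illustrative, not from the slides).

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W_f, W_i, W_o, W_c, b_f, b_i, b_o, b_c):
    hx = np.concatenate([h_prev, x_t])   # [h_{t-1}, x_t]
    f_t = sigmoid(W_f @ hx + b_f)        # forget gate: what to erase from c_{t-1}
    i_t = sigmoid(W_i @ hx + b_i)        # input gate: what to write
    c_tilde = np.tanh(W_c @ hx + b_c)    # candidate memory
    c_t = f_t * c_prev + i_t * c_tilde   # forget some memories, then insert new ones
    o_t = sigmoid(W_o @ hx + b_o)        # output gate
    h_t = o_t * np.tanh(c_t)             # hidden state passed to the next step
    return h_t, c_t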

Maybe we can simplify this structure efficiently!

GRU

z_t = σ(W_z · [h_{t-1}, x_t])
r_t = σ(W_r · [h_{t-1}, x_t])
h̃_t = tanh(W · [r_t ∗ h_{t-1}, x_t])
h_t = (1 − z_t) ∗ h_{t-1} + z_t ∗ h̃_t

Figures from http://colah.github.io/posts/2015-08-Understanding-LSTMs/

Here z_t is the update gate and r_t is the reset gate.
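
The corresponding one-step GRU sketch (again with illustrative parameter names; note there is no separate cell state, only the hidden state):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(x_t, h_prev, W_z, W_r, W):
    hx = np.concatenate([h_prev, x_t])        # [h_{t-1}, x_t]
    z_t = sigmoid(W_z @ hx)                   # update gate
    r_t = sigmoid(W_r @ hx)                   # reset gate
    h_tilde = np.tanh(W @ np.concatenate([r_t * h_prev, x_t]))  # candidate state
    h_t = (1 - z_t) * h_prev + z_t * h_tilde  # blend the old state and the candidate
    return h_t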


Sequence to Sequence Model: What is it?

Prof. Songhwai Oh (ECE, SNU) Introduction to Deep Learning 17

[Figure: an LSTM/GRU encoder reads the input sequence (steps 1 to 5) and an LSTM/GRU decoder generates the output sequence; the example shows a transition from Western food to Korean food.]


Sequence to Sequence Model: Implementation


• The simplest way to implement a sequence-to-sequence model is to just pass the last hidden state of the encoder to the first GRU cell of the decoder, as sketched below!

• However, this method gets weaker when the decoder needs to generate a longer sequence.
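
A rough sketch of this simplest wiring, reusing the gru_step function sketched above (start_token, W_out, and a fixed number of decoding steps are illustrative assumptions):

def encode(xs, h0, enc_params):
    h = h0
    for x_t in xs:                         # read the whole input sequence
        h = gru_step(x_t, h, *enc_params)
    return h                               # the last hidden state summarizes the input

def decode(h_enc, start_token, W_out, dec_params, num_steps):
    s, y, ys = h_enc, start_token, []      # decoder starts from the encoder's last state
    for _ in range(num_steps):
        s = gru_step(y, s, *dec_params)    # feed the previous output back as the next input
        y = W_out @ s                      # project the hidden state to an output
        ys.append(y)
    return ys

Since the whole input is squeezed into one fixed-size vector h_enc, the decoder has less to work with as the output sequence gets longer, which is the weakness noted above.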


Sequence to Sequence Model: Attention Decoder


Bidirectional GRU Encoder

Attention GRU Decoder

• For each GRU cell constituting the decoder, let's pass the encoder's information differently!

z_i = σ(W_z · [s_{i-1}, y_{i-1}, c_i]),  r_i = σ(W_r · [s_{i-1}, y_{i-1}, c_i])
s̃_i = tanh(W · [r_i ∗ s_{i-1}, y_{i-1}, c_i])
s_i = (1 − z_i) ∗ s_{i-1} + z_i ∗ s̃_i
c_i = Σ_j α_ij h_j,  α_ij = exp(e_ij) / Σ_k exp(e_ik),  e_ij = v^T tanh(W_a s_{i-1} + U_a h_j)

Here s_i is the decoder state, y_{i-1} the previous output, h_j the bidirectional encoder states, and c_i the context vector.
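
A sketch of how such a decoder can weight the encoder states differently at every step, using additive (Bahdanau-style) scoring; W_a, U_a, v, and enc_states are illustrative assumptions:

import numpy as np

def attention_context(s_prev, enc_states, W_a, U_a, v):
    # Score each encoder state h_j against the previous decoder state s_{i-1}.
    scores = np.array([v @ np.tanh(W_a @ s_prev + U_a @ h_j) for h_j in enc_states])
    # Softmax over encoder positions gives the attention weights alpha_ij.
    weights = np.exp(scores - scores.max())
    weights = weights / weights.sum()
    # The context vector c_i is a weighted sum of encoder states, recomputed at every decoder step.
    return sum(w * h_j for w, h_j in zip(weights, enc_states))

The context vector returned here would then be combined with the previous output and decoder state inside the GRU update shown above.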


Wrap Up

• Recurrent neural networks
• LSTM
• GRU
• Seq2Seq
