deeplearning.ai Recurrent Neural Networks Why sequence models?

Recurrent Neural Networks Why sequence models? · 2019-01-07 · Andrew Ng


Why sequence models?


Examples of sequence data

- Music generation: ∅
- Speech recognition: “The quick brown fox jumped over the lazy dog.”
- Sentiment classification: “There is nothing to like in this movie.”
- DNA sequence analysis: AGCCCCTGTGAGGAACTAG → AGCCCCTGTGAGGAACTAG
- Machine translation: “Voulez-vous chanter avec moi?” → “Do you want to sing with me?”
- Video activity recognition: Running
- Named entity recognition: “Yesterday, Harry Potter met Hermione Granger.” → “Yesterday, Harry Potter met Hermione Granger.”

Notation


Motivating example

x: Harry Potter and Hermione Granger invented a new spell.


Representing words

x: Harry Potter and Hermione Granger invented a new spell.

$x^{\langle 1 \rangle} \quad x^{\langle 2 \rangle} \quad x^{\langle 3 \rangle} \quad \cdots \quad x^{\langle 9 \rangle}$

Vocabulary indices: And = 367, Invented = 4700, A = 1, New = 5976, Spell = 8376, Harry = 4075, Potter = 6830, Hermione = 4200, Gran… = 4000
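One way to turn these indices into input vectors is a one-hot encoding. The sketch below is illustrative only: the vocabulary size of 10,000 and the helper name `one_hot` are assumptions, not taken from the slide, and the slide's indices are used directly as array positions.

```python
import numpy as np

# Word indices read off the slide; VOCAB_SIZE is an assumed value.
VOCAB_SIZE = 10_000
word_to_index = {
    "a": 1, "and": 367, "harry": 4075, "hermione": 4200,
    "invented": 4700, "new": 5976, "potter": 6830, "spell": 8376,
}

def one_hot(index, size=VOCAB_SIZE):
    """Return a (size, 1) column vector that is 1 at `index` and 0 elsewhere."""
    vec = np.zeros((size, 1))
    vec[index] = 1.0
    return vec

# Input vector for the first word of the sentence, "Harry".
x_1 = one_hot(word_to_index["harry"])
```

Each word of the sentence would then become one such vector, giving a sequence of nine inputs.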

Recurrent Neural Network Model


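Why not a standard network?

[Figure: a standard feed-forward network taking inputs $x^{\langle 1 \rangle}, x^{\langle 2 \rangle}, \ldots, x^{\langle T_x \rangle}$ and producing outputs $y^{\langle 1 \rangle}, y^{\langle 2 \rangle}, \ldots, y^{\langle T_y \rangle}$.]

Problems:

- Inputs, outputs can be different lengths in different examples.
- Doesn’t share features learned across different positions of text.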


Recurrent Neural Networks

He said, “Teddy Roosevelt was a great President.”

He said, “Teddy bears are on sale!”


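Forward Propagation

[Figure: RNN unrolled over time. Starting from $a^{\langle 0 \rangle}$, step $t$ takes $x^{\langle t \rangle}$ and $a^{\langle t-1 \rangle}$ and produces $a^{\langle t \rangle}$ and $\hat{y}^{\langle t \rangle}$, shown for steps 1, 2, 3 and the final step with $a^{\langle T_x - 1 \rangle}$, $x^{\langle T_x \rangle}$ and $\hat{y}^{\langle T_y \rangle}$.]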


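Simplified RNN notation

$a^{\langle t \rangle} = g(W_{aa} a^{\langle t-1 \rangle} + W_{ax} x^{\langle t \rangle} + b_a)$

$\hat{y}^{\langle t \rangle} = g(W_{ya} a^{\langle t \rangle} + b_y)$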
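The two equations can be sketched as a forward pass in NumPy. This is a minimal illustration, not the course's implementation: it assumes tanh for the hidden activation $g$ and softmax for the output $g$, and the layer sizes and random weights are hypothetical.

```python
import numpy as np

def softmax(z):
    """Column-wise softmax, shifted by the max for numerical stability."""
    e = np.exp(z - z.max(axis=0, keepdims=True))
    return e / e.sum(axis=0, keepdims=True)

def rnn_forward(xs, a0, W_aa, W_ax, W_ya, b_a, b_y):
    """Run the RNN over a list of (n_x, 1) inputs, one per time step."""
    a_prev, activations, predictions = a0, [], []
    for x_t in xs:
        # a<t> = g(W_aa a<t-1> + W_ax x<t> + b_a), with g assumed to be tanh
        a_t = np.tanh(W_aa @ a_prev + W_ax @ x_t + b_a)
        # y_hat<t> = g(W_ya a<t> + b_y), with g assumed to be softmax
        y_hat_t = softmax(W_ya @ a_t + b_y)
        activations.append(a_t)
        predictions.append(y_hat_t)
        a_prev = a_t
    return activations, predictions

rng = np.random.default_rng(0)
n_a, n_x, n_y, T_x = 5, 3, 2, 4  # hypothetical sizes
xs = [rng.standard_normal((n_x, 1)) for _ in range(T_x)]
activations, predictions = rnn_forward(
    xs,
    a0=np.zeros((n_a, 1)),
    W_aa=0.1 * rng.standard_normal((n_a, n_a)),
    W_ax=0.1 * rng.standard_normal((n_a, n_x)),
    W_ya=0.1 * rng.standard_normal((n_y, n_a)),
    b_a=np.zeros((n_a, 1)),
    b_y=np.zeros((n_y, 1)),
)
```

The same weight matrices are reused at every time step, which is how the network shares what it learns across positions.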

Backpropagation through time


Forward propagation and backpropagation

[Figure: the unrolled RNN again, with activations $a^{\langle 0 \rangle}, a^{\langle 1 \rangle}, a^{\langle 2 \rangle}, \ldots, a^{\langle T_x - 1 \rangle}$, inputs $x^{\langle 1 \rangle}, x^{\langle 2 \rangle}, x^{\langle 3 \rangle}, \ldots, x^{\langle T_x \rangle}$ and predictions $\hat{y}^{\langle 1 \rangle}, \hat{y}^{\langle 2 \rangle}, \hat{y}^{\langle 3 \rangle}, \ldots, \hat{y}^{\langle T_y \rangle}$.]

$\mathcal{L}^{\langle t \rangle}(\hat{y}^{\langle t \rangle}, y^{\langle t \rangle}) =$

Backpropagation through time

Different types of RNNs



Examples of RNN architectures



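Summary of RNN types

- One to one: a single input $x$ and a single output $\hat{y}$.
- One to many: a single input $x$ and outputs $\hat{y}^{\langle 1 \rangle}, \hat{y}^{\langle 2 \rangle}, \ldots, \hat{y}^{\langle T_y \rangle}$.
- Many to one: inputs $x^{\langle 1 \rangle}, x^{\langle 2 \rangle}, \ldots, x^{\langle T_x \rangle}$ and a single output $\hat{y}$.
- Many to many: inputs $x^{\langle 1 \rangle}, \ldots, x^{\langle T_x \rangle}$ with one output $\hat{y}^{\langle t \rangle}$ per input step.
- Many to many: all inputs $x^{\langle 1 \rangle}, \ldots, x^{\langle T_x \rangle}$ are read first, then outputs $\hat{y}^{\langle 1 \rangle}, \ldots, \hat{y}^{\langle T_y \rangle}$ are produced, so $T_x$ and $T_y$ can differ.

[Figure: one diagram per type, each starting from $a^{\langle 0 \rangle}$.]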

Language model and sequence generation


What is language modelling?

Speech recognition

The apple and pair salad.

The apple and pear salad.

$P(\text{The apple and pair salad}) =$

$P(\text{The apple and pear salad}) =$


Language modelling with an RNN

Training set: large corpus of English text.

Cats average 15 hours of sleep a day.

The Egyptian Mau is a breed of cat. <EOS>
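A rough sketch of how a training sentence might be turned into tokens, appending the `<EOS>` marker and mapping out-of-vocabulary words to `<UNK>`. The tiny vocabulary and the `tokenize` helper are hypothetical, and real tokenizers handle punctuation far more carefully.

```python
def tokenize(sentence, vocabulary):
    """Lowercase, drop full stops, map unknown words to <UNK>, append <EOS>."""
    words = sentence.lower().replace(".", "").split()
    tokens = [w if w in vocabulary else "<UNK>" for w in words]
    return tokens + ["<EOS>"]

# Hypothetical vocabulary that happens not to contain the token "15".
vocabulary = {"cats", "average", "hours", "of", "sleep", "a", "day"}
tokens = tokenize("Cats average 15 hours of sleep a day.", vocabulary)
# tokens == ['cats', 'average', '<UNK>', 'hours', 'of', 'sleep', 'a', 'day', '<EOS>']
```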


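RNN model

Cats average 15 hours of sleep a day. <EOS>

$\mathcal{L}(\hat{y}^{\langle t \rangle}, y^{\langle t \rangle}) = -\sum_i y_i^{\langle t \rangle} \log \hat{y}_i^{\langle t \rangle}$

$\mathcal{L} = \sum_t \mathcal{L}^{\langle t \rangle}(\hat{y}^{\langle t \rangle}, y^{\langle t \rangle})$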
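As a small worked example of these two formulas, the sketch below computes the per-step cross-entropy and sums it over time. The three-word vocabulary and the probability values are made up for illustration.

```python
import numpy as np

def step_loss(y_hat_t, y_t):
    """Cross-entropy at one time step: -sum_i y_i * log(y_hat_i)."""
    return float(-np.sum(y_t * np.log(y_hat_t)))

def sequence_loss(y_hats, ys):
    """Total loss: the sum of the per-step losses over all time steps."""
    return sum(step_loss(y_hat_t, y_t) for y_hat_t, y_t in zip(y_hats, ys))

# Two time steps over a hypothetical 3-word vocabulary.
y_hats = [np.array([0.7, 0.2, 0.1]), np.array([0.1, 0.8, 0.1])]  # predictions
ys = [np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])]      # one-hot targets
total = sequence_loss(y_hats, ys)  # equals -(log 0.7 + log 0.8)
```

Because the targets are one-hot, only the predicted probability of the correct word contributes at each step.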

Sampling novel sequences


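Sampling a sequence from a trained RNN

[Figure: the trained RNN unrolled for sampling. From $a^{\langle 0 \rangle}$ and $x^{\langle 1 \rangle}$ the network produces $\hat{y}^{\langle 1 \rangle}$; each later step takes the previous output $y^{\langle 1 \rangle}, y^{\langle 2 \rangle}, \ldots, y^{\langle T_y - 1 \rangle}$ as its input and produces $\hat{y}^{\langle 2 \rangle}, \hat{y}^{\langle 3 \rangle}, \ldots, \hat{y}^{\langle T_y \rangle}$, with activations $a^{\langle 1 \rangle}, a^{\langle 2 \rangle}, a^{\langle 3 \rangle}, \ldots, a^{\langle T_y \rangle}$.]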
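The sampling procedure in the diagram might look like the following sketch: draw a word from the predicted distribution, feed it back in as the next input, and stop at `<EOS>` or after a maximum length. The parameters here are random and untrained, the sizes are hypothetical, and using a zero vector as the first input is an assumption.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=0, keepdims=True))
    return e / e.sum(axis=0, keepdims=True)

def sample_sequence(params, eos_index, max_len, rng):
    """Sample word indices one step at a time from an RNN language model."""
    W_aa, W_ax, W_ya, b_a, b_y = params
    n_a, vocab_size = W_aa.shape[0], W_ya.shape[0]
    a_prev = np.zeros((n_a, 1))
    x_t = np.zeros((vocab_size, 1))  # assumed first input: a zero vector
    indices = []
    for _ in range(max_len):
        a_prev = np.tanh(W_aa @ a_prev + W_ax @ x_t + b_a)
        probs = softmax(W_ya @ a_prev + b_y).ravel()
        idx = int(rng.choice(vocab_size, p=probs))  # sample, do not take argmax
        indices.append(idx)
        if idx == eos_index:
            break
        x_t = np.zeros((vocab_size, 1))  # feed the sampled word back in
        x_t[idx] = 1.0
    return indices

rng = np.random.default_rng(1)
vocab_size, n_a, eos_index = 6, 4, 5  # hypothetical sizes
params = (
    0.1 * rng.standard_normal((n_a, n_a)),
    0.1 * rng.standard_normal((n_a, vocab_size)),
    0.1 * rng.standard_normal((vocab_size, n_a)),
    np.zeros((n_a, 1)),
    np.zeros((vocab_size, 1)),
)
sampled = sample_sequence(params, eos_index, max_len=20, rng=rng)
```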


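Character-level language model

Vocabulary = [a, aaron, …, zulu, <UNK>]

[Figure: the same sampling diagram, with activations $a^{\langle 0 \rangle}, a^{\langle 1 \rangle}, a^{\langle 2 \rangle}, a^{\langle 3 \rangle}, \ldots, a^{\langle T_y \rangle}$, first input $x^{\langle 1 \rangle}$ and predictions $\hat{y}^{\langle 1 \rangle}, \hat{y}^{\langle 2 \rangle}, \hat{y}^{\langle 3 \rangle}, \ldots, \hat{y}^{\langle T_y \rangle}$, each prediction fed in as the next input.]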


Sequence generation

News

President enrique peña nieto, announced sench’s sulk former coming football langston paring.

“I was not at all surprised,” said hich langston.

“Concussion epidemic”, to be examined.

The gray football the told some and this has on the uefa icon, should money as.

Shakespeare

The mortal moon hath her eclipse in love.

And subject of this thou art another this fold.

When besser be my love to me see sabl’s.

For whose are ruse of mine eyes heaves.

Vanishing gradients with RNNs


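Vanishing gradients with RNNs

[Figure: an unrolled RNN with inputs $x^{\langle 1 \rangle}, x^{\langle 2 \rangle}, x^{\langle 3 \rangle}, \ldots, x^{\langle T_x \rangle}$, activations $a^{\langle 0 \rangle}, a^{\langle 1 \rangle}, a^{\langle 2 \rangle}, a^{\langle 3 \rangle}, \ldots, a^{\langle T_y \rangle}$ and predictions $\hat{y}^{\langle 1 \rangle}, \hat{y}^{\langle 2 \rangle}, \hat{y}^{\langle 3 \rangle}, \ldots, \hat{y}^{\langle T_y \rangle}$, next to a very deep feed-forward network from $x$ to $\hat{y}$.]

Exploding gradients.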
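A toy illustration of why gradients can vanish or explode over many time steps, followed by norm-based gradient clipping, a commonly used remedy for the exploding case. The slide itself only names the problem; the scalar factors, the threshold and the `clip_by_norm` helper are assumptions made for this sketch.

```python
import numpy as np

# Repeatedly multiplying by a factor slightly below or above 1 over T steps
# shrinks or blows up the signal exponentially.
T = 100
shrink = 0.9 ** T  # about 2.7e-5: the vanishing case
grow = 1.1 ** T    # about 1.4e4: the exploding case

def clip_by_norm(grad, max_norm):
    """Rescale `grad` so that its L2 norm is at most `max_norm`."""
    norm = np.linalg.norm(grad)
    if norm > max_norm:
        return grad * (max_norm / norm)
    return grad

exploded = grow * np.array([3.0, 4.0])  # a gradient with a very large norm
clipped = clip_by_norm(exploded, max_norm=5.0)
```

Clipping keeps the direction of the gradient and only limits its size. It does nothing for vanishing gradients, which is the motivation for gated units.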

Gated Recurrent Unit (GRU)


RNN unit

$a^{\langle t \rangle} = g(W_a[a^{\langle t-1 \rangle}, x^{\langle t \rangle}] + b_a)$


GRU (simplified)

The cat, which already ate …, was full.

[Cho et al., 2014. On the properties of neural machine translation: Encoder-decoder approaches]
[Chung et al., 2014. Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling]


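Full GRU

$\tilde{c}^{\langle t \rangle} = \tanh(W_c[c^{\langle t-1 \rangle}, x^{\langle t \rangle}] + b_c)$

$\Gamma_u = \sigma(W_u[c^{\langle t-1 \rangle}, x^{\langle t \rangle}] + b_u)$

$c^{\langle t \rangle} = \Gamma_u * \tilde{c}^{\langle t \rangle} + (1 - \Gamma_u) * c^{\langle t-1 \rangle}$

The cat, which ate already, was full.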

LSTM (long short term memory) unit


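GRU and LSTM

GRU

$\tilde{c}^{\langle t \rangle} = \tanh(W_c[\Gamma_r * c^{\langle t-1 \rangle}, x^{\langle t \rangle}] + b_c)$

$\Gamma_u = \sigma(W_u[c^{\langle t-1 \rangle}, x^{\langle t \rangle}] + b_u)$

$\Gamma_r = \sigma(W_r[c^{\langle t-1 \rangle}, x^{\langle t \rangle}] + b_r)$

$c^{\langle t \rangle} = \Gamma_u * \tilde{c}^{\langle t \rangle} + (1 - \Gamma_u) * c^{\langle t-1 \rangle}$

$a^{\langle t \rangle} = c^{\langle t \rangle}$

LSTM

[Hochreiter & Schmidhuber 1997. Long short-term memory]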
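The GRU equations on this slide (update gate, relevance gate, candidate and memory cell) can be sketched as a single time step in NumPy. The parameter dictionary, the sizes and the random initialisation are hypothetical; `*` in the equations is taken to be element-wise multiplication.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(c_prev, x_t, p):
    """One GRU step; returns the new memory cell c<t> (also used as a<t>)."""
    concat = np.vstack([c_prev, x_t])                 # [c<t-1>, x<t>]
    gamma_u = sigmoid(p["W_u"] @ concat + p["b_u"])   # update gate
    gamma_r = sigmoid(p["W_r"] @ concat + p["b_r"])   # relevance gate
    gated = np.vstack([gamma_r * c_prev, x_t])        # [Gamma_r * c<t-1>, x<t>]
    c_tilde = np.tanh(p["W_c"] @ gated + p["b_c"])    # candidate value
    return gamma_u * c_tilde + (1.0 - gamma_u) * c_prev

rng = np.random.default_rng(2)
n_c, n_x = 4, 3  # hypothetical sizes
params = {
    "W_u": 0.1 * rng.standard_normal((n_c, n_c + n_x)), "b_u": np.zeros((n_c, 1)),
    "W_r": 0.1 * rng.standard_normal((n_c, n_c + n_x)), "b_r": np.zeros((n_c, 1)),
    "W_c": 0.1 * rng.standard_normal((n_c, n_c + n_x)), "b_c": np.zeros((n_c, 1)),
}
c_prev = rng.standard_normal((n_c, 1))
x_t = rng.standard_normal((n_x, 1))
c_t = gru_step(c_prev, x_t, params)
```

When the update gate is close to 0, the cell simply copies its previous value forward, which is what lets information survive across many time steps.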


LSTM units

LSTM

$\tilde{c}^{\langle t \rangle} = \tanh(W_c[a^{\langle t-1 \rangle}, x^{\langle t \rangle}] + b_c)$

$\Gamma_u = \sigma(W_u[a^{\langle t-1 \rangle}, x^{\langle t \rangle}] + b_u)$

$\Gamma_f = \sigma(W_f[a^{\langle t-1 \rangle}, x^{\langle t \rangle}] + b_f)$

$\Gamma_o = \sigma(W_o[a^{\langle t-1 \rangle}, x^{\langle t \rangle}] + b_o)$

$c^{\langle t \rangle} = \Gamma_u * \tilde{c}^{\langle t \rangle} + \Gamma_f * c^{\langle t-1 \rangle}$

$a^{\langle t \rangle} = \Gamma_o * c^{\langle t \rangle}$

[Hochreiter & Schmidhuber 1997. Long short-term memory]
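A single LSTM time step following these equations might be sketched as below. One deliberate difference is flagged as an assumption: the code applies tanh to the cell state before the output gate, as most LSTM formulations do, whereas the printed equation writes the output as $\Gamma_o * c^{\langle t \rangle}$. The parameter names and sizes are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(a_prev, c_prev, x_t, p):
    """One LSTM step; returns the new activation a<t> and memory cell c<t>."""
    concat = np.vstack([a_prev, x_t])                 # [a<t-1>, x<t>]
    c_tilde = np.tanh(p["W_c"] @ concat + p["b_c"])   # candidate value
    gamma_u = sigmoid(p["W_u"] @ concat + p["b_u"])   # update gate
    gamma_f = sigmoid(p["W_f"] @ concat + p["b_f"])   # forget gate
    gamma_o = sigmoid(p["W_o"] @ concat + p["b_o"])   # output gate
    c_t = gamma_u * c_tilde + gamma_f * c_prev
    a_t = gamma_o * np.tanh(c_t)  # assumption: tanh applied to c<t> here
    return a_t, c_t

rng = np.random.default_rng(3)
n_a, n_x = 4, 3  # hypothetical sizes
params = {}
for gate in ("c", "u", "f", "o"):
    params["W_" + gate] = 0.1 * rng.standard_normal((n_a, n_a + n_x))
    params["b_" + gate] = np.zeros((n_a, 1))
a_prev = np.zeros((n_a, 1))
c_prev = rng.standard_normal((n_a, 1))
x_t = rng.standard_normal((n_x, 1))
a_t, c_t = lstm_step(a_prev, c_prev, x_t, params)
```

Unlike the GRU, the LSTM uses separate update and forget gates, so it can add new content and keep old content independently.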


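LSTM in pictures

[Figure: one LSTM cell. Inputs $c^{\langle t-1 \rangle}$, $a^{\langle t-1 \rangle}$ and $x^{\langle t \rangle}$ feed the forget gate, update gate, tanh and output gate; the cell outputs $c^{\langle t \rangle}$ and $a^{\langle t \rangle}$, with a tanh and a softmax producing the prediction for step $t$. A second panel chains three such cells for time steps 1, 2 and 3, starting from $c^{\langle 0 \rangle}$ and $a^{\langle 0 \rangle}$ and passing $c$ and $a$ from each step to the next.]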

Bidirectional RNN


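Getting information from the future

He said, “Teddy bears are on sale!”

He said, “Teddy Roosevelt was a great President!”

[Figure: a unidirectional RNN unrolled over the seven words of “He said, “Teddy bears are on sale!””, with inputs $x^{\langle 1 \rangle}, \ldots, x^{\langle 7 \rangle}$, activations $a^{\langle 0 \rangle}, \ldots, a^{\langle 7 \rangle}$ and predictions $\hat{y}^{\langle 1 \rangle}, \ldots, \hat{y}^{\langle 7 \rangle}$.]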


Bidirectional RNN (BRNN)
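A rough sketch of the bidirectional idea: run one recurrent pass left to right and another right to left, then stack the two activations at each time step so that a prediction can draw on both past and future words. Plain tanh units are used for brevity; the helper names, sizes and random weights are hypothetical, and the output layer is left out.

```python
import numpy as np

def run_direction(xs, W_aa, W_ax, b_a):
    """Simple tanh RNN over `xs`; returns one activation per time step."""
    a = np.zeros((W_aa.shape[0], 1))
    out = []
    for x_t in xs:
        a = np.tanh(W_aa @ a + W_ax @ x_t + b_a)
        out.append(a)
    return out

def brnn_activations(xs, fwd, bwd):
    """Stack forward and backward activations for every time step."""
    forward = run_direction(xs, *fwd)
    # Run over the reversed sequence, then flip back into time order.
    backward = run_direction(xs[::-1], *bwd)[::-1]
    return [np.vstack([f, b]) for f, b in zip(forward, backward)]

rng = np.random.default_rng(4)
n_a, n_x, T_x = 3, 2, 3  # hypothetical sizes

def make_params():
    return (0.5 * rng.standard_normal((n_a, n_a)),
            0.5 * rng.standard_normal((n_a, n_x)),
            np.zeros((n_a, 1)))

fwd, bwd = make_params(), make_params()
xs = [rng.standard_normal((n_x, 1)) for _ in range(T_x)]
states = brnn_activations(xs, fwd, bwd)
```

The backward half of the first state already depends on the last input, which is how a word such as "Teddy" can be disambiguated by what follows it.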

Deep RNNs


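Deep RNN example

[Figure: a three-layer RNN unrolled over four time steps. Inputs $x^{\langle 1 \rangle}, \ldots, x^{\langle 4 \rangle}$ feed layer 1; layer $l$ has activations $a^{[l]\langle t \rangle}$ for $t = 1, \ldots, 4$, starting from $a^{[l]\langle 0 \rangle}$; outputs $y^{\langle 1 \rangle}, \ldots, y^{\langle 4 \rangle}$ are read from layer 3.]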