43
Nara Institute of Science and Technology Augmented Human Communication Laboratory PRESTO, Japan Science and Technology Agency Advanced Cutting Edge Research Seminar Dialogue System with Deep Neural Networks Assistant Professor Koichiro Yoshino 2018/1/25 ©Koichiro Yoshino AHC-Lab. NAIST, PRESTO JST Advanced Cutting Edge Research Seminar 2 1

Advanced Cutting Edge Research Seminar Dialogue …Nara Institute of Science and Technology Augmented Human Communication Laboratory PRESTO, Japan Science and Technology Agency Advanced

  • Upload
    vodieu

  • View
    217

  • Download
    1

Embed Size (px)

Citation preview

Page 1: Advanced Cutting Edge Research Seminar Dialogue …Nara Institute of Science and Technology Augmented Human Communication Laboratory PRESTO, Japan Science and Technology Agency Advanced

Nara Institute of Science and TechnologyAugmented Human Communication LaboratoryPRESTO, Japan Science and Technology Agency

Advanced Cutting Edge

Research Seminar

Dialogue System with Deep Neural

Networks

Assistant Professor

Koichiro Yoshino

2018/1/25 ©Koichiro Yoshino AHC-Lab. NAIST,

PRESTO JSTAdvanced Cutting Edge Research Seminar 2 1

Page 2: Advanced Cutting Edge Research Seminar Dialogue …Nara Institute of Science and Technology Augmented Human Communication Laboratory PRESTO, Japan Science and Technology Agency Advanced

1. Basis of spoken dialogue systems

– Type and modules of spoken dialogue systems

2. Deep learning for spoken dialogue systems

– Basis of deep learning (deep neural networks)

– Recent approaches of deep learning for spoken dialogue systems

3. Dialogue management using reinforcement learning

– Basis of reinforcement learning

– Statistical dialogue management using intention dependency graph

4. Dialogue management using deep reinforcement learning

– Implementation of deep Q-network in dialogue management

2018/1/25 ©Koichiro Yoshino

AHC-Lab. NAIST, PRESTO JSTAdvanced Cutting Edge Research Seminar 2 2

Course works

Page 3: Advanced Cutting Edge Research Seminar Dialogue …Nara Institute of Science and Technology Augmented Human Communication Laboratory PRESTO, Japan Science and Technology Agency Advanced

• Perceptron

– Simple binary classifier

• Multi-layer perceptron

– Combination of binary classifiers

• Deep neural networks

• Ways to apply deep neural networks

– What kind of problem can be solved with DNN?

2018/1/25 ©Koichiro Yoshino

AHC-Lab. NAIST, PRESTO JSTAdvanced Cutting Edge Research Seminar 2 3

Basis of deep neural networks

Page 4: Advanced Cutting Edge Research Seminar Dialogue …Nara Institute of Science and Technology Augmented Human Communication Laboratory PRESTO, Japan Science and Technology Agency Advanced

• Simplest unit of neural networks

– Takes several inputs produces one output

– Perceptron does a single decision given several inputs

𝒚 = 𝐬𝐢𝐠𝐧

𝒊

𝒘𝒊 ∙ 𝝋𝒊(𝒙) + 𝒃

2018/1/25 ©Koichiro Yoshino

AHC-Lab. NAIST, PRESTO JSTAdvanced Cutting Edge Research Seminar 2 4

Simple perceptron

Binary output

+1 or -1Weight for 𝝋𝒊 Feature function on 𝒙

bias 𝒃

𝒚𝒙

Page 5: Advanced Cutting Edge Research Seminar Dialogue …Nara Institute of Science and Technology Augmented Human Communication Laboratory PRESTO, Japan Science and Technology Agency Advanced

• Problems the perceptron can solve

– Linearly solvable binary classification problem

• Positive: “This room is good”

• Positive: “This building is cool”

• Negative: “This room is bad”

• Problems the perceptron cannot solve

– Non-linear problems

• Positive: “very good”

• Positive: “not bad”

• Negative: “very bad”

• Negative: “not good”

2018/1/25 ©Koichiro Yoshino

AHC-Lab. NAIST, PRESTO JSTAdvanced Cutting Edge Research Seminar 2 5

Properties of simple perceptron

negative

positivegood

bad

cool

goodbad

not

very very good

not goodnot bad

very bad

Page 6: Advanced Cutting Edge Research Seminar Dialogue …Nara Institute of Science and Technology Augmented Human Communication Laboratory PRESTO, Japan Science and Technology Agency Advanced

• Using several classifiers

– Multi-layer perceptron (MLP)

• Feed-forward network

– Classifiers of the 1st layer take

the same input, but learn

different weights

– If we have two linear

separating planes, we can

classify the example of

non-linear problem

2018/1/25 ©Koichiro Yoshino

AHC-Lab. NAIST, PRESTO JSTAdvanced Cutting Edge Research Seminar 2 6

Solutions for non-linear problems

goodbad

not

very 𝒙𝟏: very good

𝒙𝟒: not good𝒙𝟑: not bad

𝒙𝟐: very bad

𝒚

Page 7: Advanced Cutting Edge Research Seminar Dialogue …Nara Institute of Science and Technology Augmented Human Communication Laboratory PRESTO, Japan Science and Technology Agency Advanced

• 1st layer

– Input: same as the single perceptron

– Output: features for the decision

of the 2nd layer

• Each perceptron may learn the

mapping between the input feature

space and the new feature space

• Kernel methods do the similar thing

• 2nd layer

– Input: features that come from

the 1st layer

– Output: classification result

2018/1/25 ©Koichiro Yoshino

AHC-Lab. NAIST, PRESTO JSTAdvanced Cutting Edge Research Seminar 2 7

Multi-layer perceptron (MLP)

goodbad

not

very 𝒙𝟏

𝒙𝟒𝒙𝟑

𝒙𝟐

𝒙𝟏

𝒙𝟒

𝒙𝟑

𝒙𝟐

𝝋1(𝒙)

𝝋2(𝒙)

𝝋1 𝒙 ∗ 𝝋2 𝒙 = 𝑛𝑒𝑔𝑎𝑡𝑖𝑣𝑒

𝝋1 𝒙 ∗ 𝝋2 𝒙 = 𝑝𝑜𝑠𝑖𝑡𝑖𝑣𝑒

Page 8: Advanced Cutting Edge Research Seminar Dialogue …Nara Institute of Science and Technology Augmented Human Communication Laboratory PRESTO, Japan Science and Technology Agency Advanced

• Deep neural network (DNN) is deeply layered multi-layer

perceptron

– It can train the mapping of 𝑿 and 𝒚 (𝒚 = 𝒇(𝑿)) even if it is complex

– Restricted Boltzmann machine (RBM) is a key technique to train the

model

• Pre-trains mapping of each layer (𝑿 and 𝑯𝟏, 𝑯𝟏 and 𝑯𝟐, …) and

finetunes entire the network (backpropagation)

2018/1/25 ©Koichiro Yoshino

AHC-Lab. NAIST, PRESTO JSTAdvanced Cutting Edge Research Seminar 2 8

Deep neural networks

𝒙𝟏

𝒙𝟐

𝒙𝟑

𝒙𝟒

𝒉𝟏𝟏

𝒉𝟐𝟏

𝒉𝟑𝟏

𝒉𝟒𝟏

𝒉𝟏𝟐

𝒉𝟐𝟐

𝒉𝟑𝟐

𝒉𝟒𝟐

𝒉𝟏𝟑

𝒉𝟐𝟑

𝒉𝟑𝟑

𝒉𝟒𝟑

𝒉𝟏 𝒚

Page 9: Advanced Cutting Edge Research Seminar Dialogue …Nara Institute of Science and Technology Augmented Human Communication Laboratory PRESTO, Japan Science and Technology Agency Advanced

• Recurrent neural network

is a neural network that

has a recursion

– 𝒉𝒕 = 𝐭𝐚𝐧𝐡(𝑾𝒙𝒉𝒙𝒕 +𝑾𝒉𝒉𝒉

𝒕−𝟏 + 𝒃𝒉)

– 𝒚𝒑𝒕 = 𝐬𝐨𝐟𝐭𝐦𝐚𝐱(𝑾𝒉𝒚𝒑𝒉𝒕 + 𝒃𝒚𝒑)

• 𝐭𝐚𝐧𝐡 =𝒆𝒙−𝒆−𝒙

𝒆𝒙+𝒆−𝒙, 𝐬𝐨𝐟𝐭𝐦𝐚𝐱 =

𝒆𝒚𝒊

𝒌 𝒆𝒚𝒌

• This structure works well for sequential input (𝑿𝟏, 𝑿𝟐, … , 𝑿𝒕)

– 𝒕 is a time step

– Input 𝒉𝒕−𝟏 for time-step 𝒕 will be a memory of previous input

2018/1/25 ©Koichiro Yoshino

AHC-Lab. NAIST, PRESTO JSTAdvanced Cutting Edge Research Seminar 2 9

Variation of neural networks:

Recurrent neural network (RNN)

𝒙𝟏 𝒙𝟐 𝒙𝟑 𝒙𝟒 𝒙𝟓

𝒚𝒑𝟏 𝒚𝒑

𝟐 𝒚𝒑𝟑 𝒚𝒑

𝟒 𝒚𝒑𝟓

𝒉𝟏 𝒉𝟐 𝒉𝟑 𝒉𝟒 𝒉𝟓𝒉𝟎

Page 10: Advanced Cutting Edge Research Seminar Dialogue …Nara Institute of Science and Technology Augmented Human Communication Laboratory PRESTO, Japan Science and Technology Agency Advanced

• CNN is the state of the art algorithm for classification

– 𝒄𝒊,𝒋 = 𝒔=𝟎𝒎−𝟏 𝒕

𝒏−𝟏𝒘𝒔𝒕𝒙 𝒊+𝒔 ,(𝒋+𝒕) + 𝒃𝒄

– 𝒂𝒊,𝒋 = 𝐭𝐚𝐧𝐡 𝒄𝒊,𝒋 , 𝒑𝒊,𝒋 = 𝐦𝐚𝐱 𝒂𝒊,𝒋

– 𝒐 = 𝐭𝐚𝐧𝐡 𝑾𝒑𝒐𝒑 + 𝒃𝒐 , 𝒚 = 𝐬𝐨𝐟𝐭𝐦𝐚𝐱(𝑾𝒐𝒚𝒐 + 𝒃𝒚)

2018/1/25 ©Koichiro Yoshino

AHC-Lab. NAIST, PRESTO JSTAdvanced Cutting Edge Research Seminar 2 10

Variation of neural networks:

Convolutional neural network (CNN)

Convolution

Max-poolingActivationFull-connection

and softmax

Page 11: Advanced Cutting Edge Research Seminar Dialogue …Nara Institute of Science and Technology Augmented Human Communication Laboratory PRESTO, Japan Science and Technology Agency Advanced

• Deep learning can learn the mapping between 𝒙 and 𝒚

if we have a large-scale aligned data

– Speech sounds and their phonemes

– Transcribed utterances and dialogue states

– Belief and action-value function

– State and action

– Action and utterance

• Of course, it is not so simple, but successful works of deep

learning solve any problem of mapping that was hard to

solve existing frameworks…

2018/1/25 ©Koichiro Yoshino

AHC-Lab. NAIST, PRESTO JSTAdvanced Cutting Edge Research Seminar 2 11

Ways to apply deep learning

Page 12: Advanced Cutting Edge Research Seminar Dialogue …Nara Institute of Science and Technology Augmented Human Communication Laboratory PRESTO, Japan Science and Technology Agency Advanced

2018/1/25 ©Koichiro Yoshino

AHC-Lab. NAIST, PRESTO JSTAdvanced Cutting Edge Research Seminar 2 12

Tasks of spoken dialogue systems

SLU

LG

ModelKnowledge

base

DM

$FROM=Ikoma

$LINE=Kintetsu

1 ask $TO_GO

2 inform $NEXT

…Where will you go?

$FROM=Ikoma

$TO_GO=???

$LINE=KintetsuI’d like to take

Kintetsu-line

from Ikoma stat.

• Speech recognition

• Spoken language understanding

• Dialogue state tracking

• Action decision

• Language generation

• End-to-end dialogue

Page 13: Advanced Cutting Edge Research Seminar Dialogue …Nara Institute of Science and Technology Augmented Human Communication Laboratory PRESTO, Japan Science and Technology Agency Advanced

2018/1/25 ©Koichiro Yoshino

AHC-Lab. NAIST, PRESTO JSTAdvanced Cutting Edge Research Seminar 2 13

Speech recognition with DNN in early stage

• Conventional ASR architecture

𝐚𝐫𝐠𝐦𝐚𝐱𝑾𝑷(𝑾|𝑿) = 𝐚𝐫𝐠𝐦𝐚𝐱

𝑾𝑷 𝑿 𝑾 𝑷(𝑾)

𝑊 is word sequence and 𝑋 is speech

Acoustic

model

Language

model

DNN-HMMGMM-HMM

a r a

𝒙𝟏 𝒙𝟐 𝒙𝟑

a r a

……

𝒙𝟏

……

𝒙𝟐

……

𝒙𝟑

Page 14: Advanced Cutting Edge Research Seminar Dialogue …Nara Institute of Science and Technology Augmented Human Communication Laboratory PRESTO, Japan Science and Technology Agency Advanced

2018/1/25 ©Koichiro Yoshino

AHC-Lab. NAIST, PRESTO JSTAdvanced Cutting Edge Research Seminar 2 14

Speech recognition with DNN in early stage

• Just replace the generative probability of GMM for a phoneme with

discriminative probability to classify a phoneme from speech

– The other architecture (HMM-based phoneme sequence selection and n-

gram based language modeling) was the same, but it reduced 20-30% errors

of speech recognition

DNN-HMMGMM-HMM

a r a

𝒙𝟏 𝒙𝟐 𝒙𝟑

a r a

……

𝒙𝟏

……

𝒙𝟐

……

𝒙𝟑

Page 15: Advanced Cutting Edge Research Seminar Dialogue …Nara Institute of Science and Technology Augmented Human Communication Laboratory PRESTO, Japan Science and Technology Agency Advanced

• Language model calculates likelihoods of word sequence 𝑾

– 𝑷 𝑾 = 𝑷 𝒘𝟏 𝑷 𝒘𝟐|𝒘𝟏 …𝑷 𝒘𝒏|𝒘𝟏, … ,𝒘𝒏−𝟏

• Existing language models are modeled with N-gram model

that approximate given words

– 𝑷 𝑾 ≈ 𝒊𝑷(𝒘𝒊|𝒘𝒊−𝟏:𝒊−𝑵−𝟏)

• The problem can be solved with RNN

– 𝒉𝒕 = 𝐭𝐚𝐧𝐡(𝑾𝒘𝒉𝒘𝒕−𝟏 +𝑾𝒉𝒉𝒉

𝒕−𝟏 + 𝒃𝒉)

– 𝒘𝒕 = 𝐬𝐨𝐟𝐭𝐦𝐚𝐱(𝑾𝒉𝒚𝒑𝒉𝒕 + 𝒃𝒚𝒑)

• RNN (and its successors) became a state of the art LM

2018/1/25 ©Koichiro Yoshino

AHC-Lab. NAIST, PRESTO JSTAdvanced Cutting Edge Research Seminar 2 15

Language model and recurrent neural network

Page 16: Advanced Cutting Edge Research Seminar Dialogue …Nara Institute of Science and Technology Augmented Human Communication Laboratory PRESTO, Japan Science and Technology Agency Advanced

2018/1/25 ©Koichiro Yoshino

AHC-Lab. NAIST, PRESTO JSTAdvanced Cutting Edge Research Seminar 2 16

End-to-end speech recognition system

• Early DNN-based speech recognition

system just replaced some modules

with deep neural networks, but

recent researchers tries to train the

model of 𝐚𝐫𝐠𝐦𝐚𝐱𝑾𝑷(𝑾|𝑿)

– Including pre-processing of ASR

• Ochiai et al., “Multichannel end-to-end speech

recognition.” In Proc. ICML, 2017.

図は論文より引用

Page 17: Advanced Cutting Edge Research Seminar Dialogue …Nara Institute of Science and Technology Augmented Human Communication Laboratory PRESTO, Japan Science and Technology Agency Advanced

• SLU

– Convert the user utterance into machine-readable expressions

• DM

– Decide the next system action from the SLU result and dialogue history

2018/1/25 ©Koichiro Yoshino

AHC-Lab. NAIST, PRESTO JSTAdvanced Cutting Edge Research Seminar 2 17

Problem of Spoken language understanding (SLU)

and dialogue state tracking (DST)

I want to take

Kintetsu from

Ikoma

I want to take Kintetsu from Ikoma

Line From Stat

Train_info{$FROM=

Ikoma,$LINE=Kintetsu}

Train_info{

$FROM=Ikoma

$TO_GO=Namba

$LINE=Kintetsu

}

1 inform $NEXT_TRAIN

2 ask $TO_GO

SLU result Dialogue state

history

Action decision

$FROM=???

$TO_GO=Namba

$LINE=???

Train_info{$FROM=

Ikoma,$LINE=Kintetsu}

Page 18: Advanced Cutting Edge Research Seminar Dialogue …Nara Institute of Science and Technology Augmented Human Communication Laboratory PRESTO, Japan Science and Technology Agency Advanced

2018/1/25 ©Koichiro Yoshino

AHC-Lab. NAIST, PRESTO JSTAdvanced Cutting Edge Research Seminar 2 18

Simple classification for SLU

Chinese word model

Chinese char model

(Translated) English

word model

A MULTICHANNEL

CONVOLUTIONAL NEURAL

NETWORK FOR CROSS-LANGUAGE

DIALOG STATE TRACKING

Shi et al., In Proc. IEEE-SLT 2016

Page 19: Advanced Cutting Edge Research Seminar Dialogue …Nara Institute of Science and Technology Augmented Human Communication Laboratory PRESTO, Japan Science and Technology Agency Advanced

2018/1/25 ©Koichiro Yoshino

AHC-Lab. NAIST, PRESTO JSTAdvanced Cutting Edge Research Seminar 2 19

CNN for classification

• CNN requires to use fixed

size of matrix as the input

• Using two techniques

– Embedding

converts word into fixed

length meaning vector

– 0-padding

sets the height of matrix with

the maximum sentence

length in the training data

and fill with 0

if the sentence length is

smaller than the max.… …

He doesn't have … himself

word embedding

artificial

intelligence class

𝒙𝟏𝒙𝟐𝒙𝟑

fixed length

vector of words

0 0 0 0 0 0 0 00 0 0 0 0 0 0 0

0-padding: fill with 0 if the

sentence length is smaller than

maximum sentence length

Page 20: Advanced Cutting Edge Research Seminar Dialogue …Nara Institute of Science and Technology Augmented Human Communication Laboratory PRESTO, Japan Science and Technology Agency Advanced

• SLU: find a dialogue frame 𝑭 given words in the utterance 𝑾

– 𝐚𝐫𝐠𝐦𝐚𝐱𝑭𝑷(𝑭|𝑾)

• For the tagging problem, there are many works

– Slot filling, domain/Intent classification,

dialogue act classification, …

• DST: find a dialogue state 𝑺 given sequence of 𝑭𝟏:𝒕

– 𝐚𝐫𝐠𝐦𝐚𝐱𝑺𝑷(𝑺|𝑭𝟏:𝒕)

• It can be solved as a joint problem

– 𝑷 𝑺 𝑾𝟏:𝒕 = 𝑷 𝑺 𝑭𝟏:𝒕 𝑷(𝑭𝟏:𝒕|𝑾𝟏:𝒕)

• Can be solved with the same model? (with sequential model)

– RNN, long short term memory neural network (LSTM)

2018/1/25 ©Koichiro Yoshino

AHC-Lab. NAIST, PRESTO JSTAdvanced Cutting Edge Research Seminar 2 20

Problem definition of SLU and DST

Page 21: Advanced Cutting Edge Research Seminar Dialogue …Nara Institute of Science and Technology Augmented Human Communication Laboratory PRESTO, Japan Science and Technology Agency Advanced

• Word-Based Dialog State Tracking with Recurrent Neural Networks.

Henderson et al., In Proc. SIGDIAL, pp, 292-300, 2014.

2018/1/25 ©Koichiro Yoshino

AHC-Lab. NAIST, PRESTO JSTAdvanced Cutting Edge Research Seminar 2 21

RNN-based dialogue state tracking

Page 22: Advanced Cutting Edge Research Seminar Dialogue …Nara Institute of Science and Technology Augmented Human Communication Laboratory PRESTO, Japan Science and Technology Agency Advanced

2018/1/25 ©Koichiro Yoshino

AHC-Lab. NAIST, PRESTO JSTAdvanced Cutting Edge Research Seminar 2 22

LSTM-based dialogue state tracking

Is there any activity in

Singapore?

User utterance

word sequence

Word embedding

LSTM

Task: activity{

Area: Singapore

Price range: -

…}

Other features

T

Dialogue State Tracking

using Long Short Term

Memory Neural Networks.

Yoshino et al., In Proc.

IWSDS, 2016.

Page 23: Advanced Cutting Edge Research Seminar Dialogue …Nara Institute of Science and Technology Augmented Human Communication Laboratory PRESTO, Japan Science and Technology Agency Advanced

• RNN: Output dialogue state given sequence of words

(utterance)

– 𝒉𝒕 = 𝐭𝐚𝐧𝐡(𝑾𝑿𝒉𝑿𝒕 +𝑾𝒉𝒉𝒉

𝒕−𝟏 + 𝒃𝒉)

• 𝑿𝒕 is a sequence of words in time 𝒕

• Dialogue history is propagated as hidden layer 𝒉𝒕−𝟏

– 𝒚𝒑𝒕 = 𝐬𝐨𝐟𝐭𝐦𝐚𝐱(𝑾𝒉𝒚𝒑𝒉𝒕 + 𝒃𝒚𝒑)

• Output a dialogue state 𝒚𝒑𝒕 that has the highest probability

• Belief update

– 𝒃𝒕 ≈ 𝑷(𝒐𝒕|𝒔𝒕) 𝒔𝒊𝑷 𝒔𝒋𝒕 𝒔𝒊𝒕−𝟏 𝒃𝒕−𝟏

2018/1/25 ©Koichiro Yoshino

AHC-Lab. NAIST, PRESTO JSTAdvanced Cutting Edge Research Seminar 2 23

Relation between belief update and RNN-

based dialogue state tracker

observation state transition belief

observation state transition

belief

Page 24: Advanced Cutting Edge Research Seminar Dialogue …Nara Institute of Science and Technology Augmented Human Communication Laboratory PRESTO, Japan Science and Technology Agency Advanced

• Decide action 𝒂𝒕 given belief 𝒃𝒕

• There are two ways:

– Find the best policy (policy gradient)

– Find the best Q-function (Q-network)

2018/1/25 ©Koichiro Yoshino

AHC-Lab. NAIST, PRESTO JSTAdvanced Cutting Edge Research Seminar 2 24

Problem of action decision

Train_info{

$FROM=Ikoma

$TO_GO=Bamba

$LINE=Kintetsu

}

1 inform $NEXT_TRAIN

2 ask $TO_GO

Belief of dialogue states

Action decision

I’d like to go to

Namba with

Kintetsu

?

𝒃𝒕

𝒔𝟏𝒔𝟐𝒔𝟑𝒔𝟒

Train_info{

$FROM=Ikoma

$TO_GO=Bamba

$LINE=Kintetsu

}

Train_info{

$FROM=Ikoma

$TO_GO=Bamba

$LINE=Kintetsu

}

Train_info{

$FROM=Ikoma

$TO_GO=Bamba

$LINE=Kintetsu

}

𝒂𝒕

Page 25: Advanced Cutting Edge Research Seminar Dialogue …Nara Institute of Science and Technology Augmented Human Communication Laboratory PRESTO, Japan Science and Technology Agency Advanced

• Maximize the expected future reward (value function)

– 𝑽𝝅∗𝒔𝒕 = 𝐦𝐚𝐱

𝝅𝑽𝝅(𝒔𝒕)

= 𝐦𝐚𝐱𝒂 𝒔𝒕+𝟏𝑷 𝒔

𝒕+𝟏|𝒔𝒕, 𝒂𝒕

𝒂

𝑹 𝒔𝒕, 𝝅 𝒔 , 𝒔𝒕+𝟏 + 𝜸𝑽𝝅∗𝒔𝒕+𝟏

– 𝑸𝝅∗𝒔, 𝒂 = 𝒔𝒕+𝟏𝑷 𝒔

𝒕+𝟏|𝒔𝒕, 𝒂𝒕 𝑹 𝒔𝒕, 𝒂𝒕, 𝒔𝒕+𝟏 + 𝜸𝐦𝐚𝐱𝒂𝑸𝝅∗(𝒔𝒕+𝟏, 𝒂𝒕+𝟏)

• Policy gradient directly calculates the score 𝑽𝝅∗𝒔𝒕

• Q-network calculates 𝑸𝝅∗𝒔, 𝒂 for each action according to the

sampling manner of Q-learning

2018/1/25 ©Koichiro Yoshino

AHC-Lab. NAIST, PRESTO JSTAdvanced Cutting Edge Research Seminar 2 25

What was good actions?

Page 26: Advanced Cutting Edge Research Seminar Dialogue …Nara Institute of Science and Technology Augmented Human Communication Laboratory PRESTO, Japan Science and Technology Agency Advanced

• In policy gradient, policy is not decisive: 𝝅(𝒔, 𝒂) is a

probability of selecting action 𝒂 given state 𝒔

– 𝑱 𝜽 = 𝑽𝝅𝜽 𝒔

– 𝜵𝜽𝑱 𝜽 = 𝑬𝝅𝜽[𝜵𝜽𝐥𝐨𝐠𝝅_𝜽(𝒔, 𝒂)𝑸𝝅𝜽 𝒔, 𝒂 ]

– If we can learn parameter 𝜽 of policy 𝝅𝜽 as maximizing the 𝑱 𝜽

in existing data, we can get the policy that maximizes the reward for

existing data sequence

– We can use deep learning for the parameter learning

2018/1/25 ©Koichiro Yoshino

AHC-Lab. NAIST, PRESTO JSTAdvanced Cutting Edge Research Seminar 2 26

Policy gradient

Page 27: Advanced Cutting Edge Research Seminar Dialogue …Nara Institute of Science and Technology Augmented Human Communication Laboratory PRESTO, Japan Science and Technology Agency Advanced

• Premise: if we can calculate 𝑸 𝒔, 𝒂 for every pair, we should

decide action 𝒂 according to 𝐦𝐚𝐱𝒂𝑸 𝒔,𝒂

• Problem: we don’t know 𝑷 𝒔𝒕+𝟏|𝒔𝒕, 𝒂𝒕 to calculate 𝑸 𝒔, 𝒂

• Solution: approximate the 𝑷 𝒔𝒕+𝟏|𝒔𝒕, 𝒂𝒕 with sampling

– 𝑸 𝒔𝒕, 𝒂𝒕

𝒖𝒑𝒅𝒂𝒕𝒆𝟏 − 𝜶 𝑸 𝒔𝒕, 𝒂𝒕 + 𝜶 𝑹 𝒔𝒕, 𝒂𝒕, 𝒔𝒕+𝟏 + 𝜸𝐦𝐚𝐱

𝒂𝒕+𝟏𝑸(𝒔𝒕+𝟏, 𝒂𝒕+𝟏)

– Back-propagate the reward from the end of the sample

– We will try to build a dialogue manager by using this algorithm in

the next class

2018/1/25 ©Koichiro Yoshino

AHC-Lab. NAIST, PRESTO JSTAdvanced Cutting Edge Research Seminar 2 27

Q-learning

Page 28: Advanced Cutting Edge Research Seminar Dialogue …Nara Institute of Science and Technology Augmented Human Communication Laboratory PRESTO, Japan Science and Technology Agency Advanced

• Idea: If we can regress the Q-value at each sampling, learning

will be efficient

– 𝓛 𝜽𝒊 = 𝑬𝒔,𝒂,𝒓,𝒔′[ 𝒚 − 𝑸 𝒔, 𝒂𝟐]

– 𝒚 = 𝑹 𝒔𝒕, 𝒂𝒕, 𝒔𝒕+𝟏 + 𝜸𝐦𝐚𝐱𝒂𝑸(𝒔𝒕+𝟏, 𝒂𝒕+𝟏)

• Regression?

train the mapping between 𝒚 and 𝑸 𝒔, 𝒂 ?

Deep learning! (Deep Q-network; DQN)

2018/1/25 ©Koichiro Yoshino

AHC-Lab. NAIST, PRESTO JSTAdvanced Cutting Edge Research Seminar 2 28

Q-network

tanh 𝑸(𝒃, 𝒂)

𝒔 = 𝒔𝟏: 𝟎. 𝟎𝒔 = 𝒔𝟐: 𝟎. 𝟗

𝒔 = 𝒔𝒏: 𝟎. 𝟎

𝒃

Page 29: Advanced Cutting Edge Research Seminar Dialogue …Nara Institute of Science and Technology Augmented Human Communication Laboratory PRESTO, Japan Science and Technology Agency Advanced

2018/1/25 ©Koichiro Yoshino

AHC-Lab. NAIST, PRESTO JSTAdvanced Cutting Edge Research Seminar 2 29

Joint learning of dialogue state tracking and

action decision with deep learning

• LSTM-based DST

results are used as

the input of DQN

LSTM: calculate 𝒃𝒕 from observation

DQN: optimize 𝑸(𝒃𝒕, 𝒂𝒕+𝟏; 𝜽) with regression

Fine tuning of entire the network

Towards End-to-End

Learning for Dialog State

Tracking and Management

using Deep Reinforcement

Learning. Zhao et al., In

Proc. SIGDIAL, 2016

Page 30: Advanced Cutting Edge Research Seminar Dialogue …Nara Institute of Science and Technology Augmented Human Communication Laboratory PRESTO, Japan Science and Technology Agency Advanced

• Generate a sentence given a system action

• Conventional: statistical template based approach

– It is still weak for out of vocabulary…

(and out of template)

2018/1/25 ©Koichiro Yoshino

AHC-Lab. NAIST, PRESTO JSTAdvanced Cutting Edge Research Seminar 2 30

Language generation

Language

generationAsk $TO_GO Where will you go?

両手鍋/T を 油/F で熱/Ac する

セロリ/F と 青ねぎ/F とニンニク/F を 加え/Ac る

1分ほど/D 炒め/Ac る

フローグラフからの手順書の生成. 情処論文誌, 2015.

Page 31: Advanced Cutting Edge Research Seminar Dialogue …Nara Institute of Science and Technology Augmented Human Communication Laboratory PRESTO, Japan Science and Technology Agency Advanced

• RNN can generate a sequence of words by using the

generated words as its next input (decoder model)

– 𝒉𝒕 = 𝐭𝐚𝐧𝐡(𝑾𝒘𝒉𝒘𝒕−𝟏 +𝑾𝒉𝒉𝒉

𝒕−𝟏 + 𝒃𝒉)

– 𝒘𝒕 = 𝐬𝐨𝐟𝐭𝐦𝐚𝐱(𝑾𝒉𝒚𝒑𝒉𝒕 + 𝒃𝒚𝒑)

2018/1/25 ©Koichiro Yoshino

AHC-Lab. NAIST, PRESTO JSTAdvanced Cutting Edge Research Seminar 2 31

RNN-language model for generation

𝒙𝟏=He 𝒙𝟐=doesn’t 𝒙𝟑=have 𝒙𝟕=in

𝒚𝒑𝟏=doesn’t 𝒚𝒑

𝟐=have 𝒚𝒑𝟑=very 𝒚𝒑

𝟖=himself

𝑾𝒉𝟏 𝑾𝒉𝟐 𝑾𝒉𝟑 𝑾𝒉𝒏−𝟐 𝑾𝒉𝟓𝑾𝒉𝟎…

𝑾𝒙𝟏 𝑾𝒙𝟐 𝑾𝒙𝟑 𝑾𝒙𝟕

Dimension size of

output will be a

vocabulary

Page 32: Advanced Cutting Edge Research Seminar Dialogue …Nara Institute of Science and Technology Augmented Human Communication Laboratory PRESTO, Japan Science and Technology Agency Advanced

2018/1/25 ©Koichiro Yoshino

AHC-Lab. NAIST, PRESTO JSTAdvanced Cutting Edge Research Seminar 2 32

Decoder with condition:

semantically conditioned LSTM

recurrent

hidden layer

embedding of

a word

1-hot dialog

act and slot

values

What to say

(contents)

How to say (=LM)

Semantically Conditioned LSTM-

based Natural Language Generation

for Spoken Dialogue Systems. Wen

et al., In Proc. EMNLP, 2015.

Page 33: Advanced Cutting Edge Research Seminar Dialogue …Nara Institute of Science and Technology Augmented Human Communication Laboratory PRESTO, Japan Science and Technology Agency Advanced

2018/1/25 ©Koichiro Yoshino

AHC-Lab. NAIST, PRESTO JSTAdvanced Cutting Edge Research Seminar 2 33

semantically conditioned LSTM

decoding results for dialogue systems

Page 34: Advanced Cutting Edge Research Seminar Dialogue …Nara Institute of Science and Technology Augmented Human Communication Laboratory PRESTO, Japan Science and Technology Agency Advanced

• Sequence-to-sequence

modeling of generation

– Change the response according

to the dialogue context

2018/1/25 ©Koichiro Yoshino

AHC-Lab. NAIST, PRESTO JSTAdvanced Cutting Edge Research Seminar 2 34

Context-aware NLG

Dusek et al., A context-aware natural

language generator for dialogue

systems. In Proc. SIGDIAL 2016.

Page 35: Advanced Cutting Edge Research Seminar Dialogue …Nara Institute of Science and Technology Augmented Human Communication Laboratory PRESTO, Japan Science and Technology Agency Advanced

2018/1/25 ©Koichiro Yoshino

AHC-Lab. NAIST, PRESTO JSTAdvanced Cutting Edge Research Seminar 2 35

QA style

SLU

ModelKnowledge

base

DM

$FROM=Ikoma

$LINE=Kintetsu

1 ask $TO_GO

2 inform $NEXT

$FROM=Ikoma

$TO_GO=???

$LINE=KintetsuI’d like to take

Kintetsu-line

from Ikoma stat.

Skip the

management(example-base)

LG

Where will you go?

Page 36: Advanced Cutting Edge Research Seminar Dialogue …Nara Institute of Science and Technology Augmented Human Communication Laboratory PRESTO, Japan Science and Technology Agency Advanced

2018/1/25 ©Koichiro Yoshino

AHC-Lab. NAIST, PRESTO JSTAdvanced Cutting Edge Research Seminar 2 36

Encoder-decoder

• We can combine ideas of encoder and decoder to make the neural

network that remembers the input sentence and outputs the input

sentence

– RNN may remember not only words but also order of words

where is the rest room EOS

rest

rest

room

room

is

is

next

entrance

.

.

EOS

Vinyals, Oriol, and Quoc Le. "A neural

conversational model." arXiv preprint

arXiv:1506.05869 (2015).

Page 37: Advanced Cutting Edge Research Seminar Dialogue …Nara Institute of Science and Technology Augmented Human Communication Laboratory PRESTO, Japan Science and Technology Agency Advanced

2018/1/25 ©Koichiro Yoshino

AHC-Lab. NAIST, PRESTO JSTAdvanced Cutting Edge Research Seminar 2 37

Attention model

• Gives attentional point to be decoded

next to the entrance . EOS

X

𝐬𝐢𝐠𝐧 (softmax)

X …

Decides to through the input or not!

where is the rest room EOS next to the entrance .

Page 38: Advanced Cutting Edge Research Seminar Dialogue …Nara Institute of Science and Technology Augmented Human Communication Laboratory PRESTO, Japan Science and Technology Agency Advanced

• 𝒂𝒕,𝒋 = 𝐚𝐭𝐭𝐞𝐧𝐭𝐢𝐨𝐧_𝐬𝐜𝐨𝐫𝐞(𝒉𝒆𝒋, 𝒉𝒅𝒕 )

2018/1/25 ©Koichiro Yoshino

AHC-Lab. NAIST, PRESTO JSTAdvanced Cutting Edge Research Seminar 2 38

Attention model

X X

Decides to through the input or not!

X X X

attention_score & softmax

next to the entrance . EOS

where is the rest room EOS next to the entrance .

Page 39: Advanced Cutting Edge Research Seminar Dialogue …Nara Institute of Science and Technology Augmented Human Communication Laboratory PRESTO, Japan Science and Technology Agency Advanced

• One typical end-to-end modeling of dialogue

2018/1/25 ©Koichiro Yoshino

AHC-Lab. NAIST, PRESTO JSTAdvanced Cutting Edge Research Seminar 2 39

ChitChat

Serban et al.,Building

end-to-end dialogue

systems using generative

hierarchical neural

network models. In Proc.

AAAI 2016.

Page 40: Advanced Cutting Edge Research Seminar Dialogue …Nara Institute of Science and Technology Augmented Human Communication Laboratory PRESTO, Japan Science and Technology Agency Advanced

• The task is goal

oriented (API call),

but try to work the

system on end-to-end

memory network

– The problem is

perfectly solved in

DSTC6-Track1

2018/1/25 ©Koichiro Yoshino

AHC-Lab. NAIST, PRESTO JSTAdvanced Cutting Edge Research Seminar 2 40

Memory Network for dialogue systems

Page 41: Advanced Cutting Edge Research Seminar Dialogue …Nara Institute of Science and Technology Augmented Human Communication Laboratory PRESTO, Japan Science and Technology Agency Advanced

• https://sites.google.com/site/deeplearningdialogue/

– Deep Learning for Dialogue Systems Tutorial

by Yun-Nung Chen, Asli Celikyikmaz, and Dilek Hakkani-Tur

2018/1/25 ©Koichiro Yoshino

AHC-Lab. NAIST, PRESTO JSTAdvanced Cutting Edge Research Seminar 2 41

If you are interested in more recent works…

Page 42: Advanced Cutting Edge Research Seminar Dialogue …Nara Institute of Science and Technology Augmented Human Communication Laboratory PRESTO, Japan Science and Technology Agency Advanced

• Deep learning is applied for several tasks of spoken dialogue systems

in recent years

– Speech recognition, understanding, state tracking, action decision,

generation, and end-to-end modeling

• How do we apply deep learning for dialogue (in development)?

– Clarify your problem to setup the input and output

– Find similar systems of existing works in recent (2-3 years) conferences

(SIGDIAL, NAACL, ACL, EMNLP, COLING, AAAI, IJCAI, …)

– Can you prepare the enough data pair of the input and the output?

• How do we apply deep learning for dialogue (in research)?

– Find a mapping problem that requires hi-dimension or non-linear

• Consider properties of your input and output, even if it is a part of your

problem

– Can you prepare the enough data pair of the input and the output?

2018/1/25 ©Koichiro Yoshino

AHC-Lab. NAIST, PRESTO JSTAdvanced Cutting Edge Research Seminar 2 42

Summary

Page 43: Advanced Cutting Edge Research Seminar Dialogue …Nara Institute of Science and Technology Augmented Human Communication Laboratory PRESTO, Japan Science and Technology Agency Advanced

• 1/30

• Dialogue management with Q-learning

– We will see the detailed algorithm and implementation of the

dialogue manager with Q-learning

– We will discuss about user simulator

2018/1/25 ©Koichiro Yoshino

AHC-Lab. NAIST, PRESTO JSTAdvanced Cutting Edge Research Seminar 2 43

Next contents