Lecture 16: Dialogue
Alan Ritter (many slides from Greg Durrett)
This Lecture
‣ Chatbot dialogue systems
‣ Task-oriented dialogue
‣ Other dialogue applications
Chatbots
Turing Test (1950)
‣ Imitation game: A and B are locked in rooms and answer C’s questions via typewriter. Both are trying to act like B
[Figure: Original Interpretation (A and B, questioned by C, a trained judge) next to Standard Interpretation (B and B, questioned by C, a trained judge)]
‣ The test is not “does this computer seem human-like to random people with a web browser?”
ELIZA
‣ Created 1964-1966 at MIT, heavily scripted
‣ DOCTOR script was most successful: repeats user’s input, asks inane questions
Weizenbaum (1966)
‣ Identify keyword, identify context, apply transformation rule
(.*) you (.*) me
Why do you think I $2 you?
My (.) (.*)
What else comes to mind when you think of your $1?
‣ Very little need to generate new content, but can only have one type of conversation
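The keyword-and-transformation idea can be sketched with ordinary regular expressions. This is a minimal illustration, not Weizenbaum's implementation: the `RULES` table, the `respond` function, and the fallback reply are hypothetical, and the second pattern uses `(\w+)` to capture one word, which is an assumed reading of the slide's `(.)`.

```python
import re

# Each rule pairs a keyword pattern with a response template.
# {0} and {1} refer to the first and second captured groups
# (the slide writes these as $1 and $2).
RULES = [
    (re.compile(r"(.*)\byou (.*) me\b", re.IGNORECASE),
     "Why do you think I {1} you?"),
    # Assumption: (\w+) captures a single word after "my".
    (re.compile(r"\bmy (\w+)(.*)", re.IGNORECASE),
     "What else comes to mind when you think of your {0}?"),
]

FALLBACK = "Please go on."  # hypothetical default when no rule matches


def respond(utterance: str) -> str:
    """Return the response of the first rule whose pattern matches."""
    for pattern, template in RULES:
        match = pattern.search(utterance)
        if match:
            return template.format(*match.groups())
    return FALLBACK


print(respond("I think you hate me"))
print(respond("My mother hates cats"))
```

Because every reply is a rewrite of the user's own words, the sketch never generates new content, which is the limitation the last bullet points out.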
Cleverbot
‣ Carpenter (1986), online system built in 2006
‣ “Nearest neighbors”: human says statement A, find a human response in human-human or human-computer chats to statement A, repeat that
User: Hi there cleverbot.
Cleverbot: How are you?
User: Doing okay. what plans do you have today?
Cleverbot: I do revision.
User: What are you revising for?
Cleverbot: Maths history.
User: How old are you?
Cleverbot: That's none of your business.
User: What is maths history?
Cleverbot: I don’t know, farming maybe
‣ Can often give sensible answers, but the bot doesn’t really impose high-level discourse structure
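A rough sketch of the “nearest neighbors” idea follows. The slides do not say how Cleverbot measures similarity, so the bag-of-words cosine similarity here is an assumption, and `CHAT_LOG` is a hypothetical three-entry log built from the example dialogue.

```python
import math
import re
from collections import Counter

# Hypothetical chat log of (statement, response) pairs.
CHAT_LOG = [
    ("Hi there", "How are you?"),
    ("What plans do you have today?", "I do revision."),
    ("How old are you?", "That's none of your business."),
]


def bag_of_words(text: str) -> Counter:
    """Lowercase the text and count its word tokens."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm = (math.sqrt(sum(c * c for c in a.values()))
            * math.sqrt(sum(c * c for c in b.values())))
    return dot / norm if norm else 0.0


def respond(statement: str) -> str:
    """Repeat the logged response to the most similar logged statement."""
    query = bag_of_words(statement)
    _, best_response = max(
        CHAT_LOG, key=lambda pair: cosine(query, bag_of_words(pair[0]))
    )
    return best_response


print(respond("how old are you"))
```

Each turn is answered independently of the turns before it, which is one way to see why such a bot does not impose high-level discourse structure.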
Data-Driven Approaches
‣ Can treat as a machine translation problem: “translate” from current utterance to next one
‣ Filter the data, use statistical measures to prune extracted phrases to get better performance
Ritter et al. (2011)
Seq2seq models
[Figure: encoder reads “What are you doing”; decoder, starting from <s>, generates “I am going home [STOP]”]
‣ Just like conventional MT, can train seq2seq models for this task
‣ Why might this model perform poorly? What might it be bad at?
‣ Hard to evaluate:
Lack of Diversity
Li et al. (2016)
‣ Training to maximize likelihood gives a system that prefers common responses:
‣ Solution: mutual information criterion; response R should be predictive of user utterance U as well
‣ Standard conditional likelihood: $\log P(R \mid U)$
‣ Mutual information: $\log \frac{P(R, U)}{P(R)\,P(U)} = \log P(R \mid U) - \log P(R)$
‣ $\log P(R)$ can reflect probabilities under a language model
‣ OpenSubtitles data
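To make the difference between the two criteria concrete, here is a small reranking sketch. The candidate responses and their log-probabilities are made up for illustration, and the weight `lam` is an assumed knob; with `lam=1.0` the score is exactly the mutual information expression above.

```python
from typing import Callable, Dict, Tuple

# Hypothetical scores for two candidate responses R to one utterance U,
# stored as (log P(R|U), log P(R)). The numbers are invented.
CANDIDATES: Dict[str, Tuple[float, float]] = {
    "I don't know.": (-1.0, -1.5),
    "I'm going home to cook dinner.": (-2.0, -6.0),
}


def likelihood_score(log_p_r_given_u: float, log_p_r: float) -> float:
    """Standard conditional likelihood: log P(R|U)."""
    return log_p_r_given_u


def mmi_score(log_p_r_given_u: float, log_p_r: float, lam: float = 1.0) -> float:
    """Mutual information criterion: log P(R|U) - lam * log P(R)."""
    return log_p_r_given_u - lam * log_p_r


def best(score_fn: Callable[[float, float], float]) -> str:
    """Return the candidate with the highest score under score_fn."""
    return max(CANDIDATES, key=lambda r: score_fn(*CANDIDATES[r]))


print(best(likelihood_score))
print(best(mmi_score))
```

Under likelihood alone the generic reply wins; subtracting the language-model term penalizes responses that are probable regardless of the input, so the more specific reply is selected.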
Future of chatbots
‣ How deep can a conversation be without more semantic grounding? Basic facts aren’t even consistent…
‣ Can force chatbots to give consistent answers, but still probably not very interesting
‣ XiaoIce: Microsoft chatbot in Chinese, 20M users, average user interacts 60 times/month
‣ People do seem to like talking to them…?
Li et al. (2016) Persona…
Task-Oriented Dialogue
‣ Question answering/search:
User: Google, what’s the most valuable American company?
Google: Apple
User: Who is its CEO?
Google: Tim Cook
‣ Personal assistants/API front-ends:
User: Siri, find me a good sushi restaurant in Chelsea
Siri: Sushi Seki Chelsea is a sushi restaurant in Chelsea with 4.4 stars on Google
User: How expensive is it?
Siri: Entrees are around $30 each
User: Find me something cheaper

User: Hey Alexa, why isn’t my Amazon order here?
Alexa: Let me retrieve your order. Your order was scheduled to arrive at 4pm today.
User: It never came
Alexa: Okay, I can put you through to customer service.
Air Travel Information System (ATIS)
‣ Given an utterance, predict a domain-specific semantic interpretation
DARPA (early 1990s), Figure from Tur et al. (2010)
‣ Can formulate as semantic parsing, but simple slot-filling solutions (classifiers) work well too
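One common way to set up slot filling is to have a classifier label each token with a BIO tag and then group the tagged tokens into slots. The sketch below shows only the grouping step; the tags are written by hand in place of a classifier's output, and the slot names `from_city` and `to_city` are hypothetical rather than the actual ATIS label set.

```python
from typing import Dict, List, Optional


def extract_slots(tokens: List[str], tags: List[str]) -> Dict[str, str]:
    """Group BIO-tagged tokens into a slot -> value dictionary."""
    slots: Dict[str, List[str]] = {}
    current: Optional[str] = None  # slot currently being extended
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag opens a new slot.
            current = tag[2:]
            slots[current] = [token]
        elif tag.startswith("I-") and current == tag[2:]:
            # An I- tag continues the slot that is currently open.
            slots[current].append(token)
        else:
            # "O" (or an inconsistent I- tag) closes the open slot.
            current = None
    return {name: " ".join(words) for name, words in slots.items()}


tokens = "show flights from new york to boston".split()
# Hand-written tags standing in for per-token classifier predictions.
tags = ["O", "O", "O", "B-from_city", "I-from_city", "O", "B-to_city"]
print(extract_slots(tokens, tags))
```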
Full Dialogue Task
‣ Parsing/language understanding is just one piece of a system
Young et al. (2013)
‣ Dialogue state: reflects any information about the conversation (e.g., search history)
‣ User utterance -> update dialogue state -> take action (e.g., query the restaurant database) -> say something
Find me a good sushi restaurant in Chelsea
restaurant_type <- sushi
location <- Chelsea
curr_result <- execute_search()
Sushi Seki Chelsea is a sushi restaurant in Chelsea with 4.4 stars on Google
How expensive is it?
get_value(cost, curr_result)
Entrees are around $30 each
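The walkthrough above can be sketched as a small program that keeps an explicit dialogue state. The names `restaurant_type`, `location`, `curr_result`, `execute_search`, and `get_value` come from the example; the `DialogueState` class, the one-entry `RESTAURANTS` table, and its field names are assumptions, and the language-understanding step is done by hand.

```python
from dataclasses import dataclass
from typing import Optional

# Toy stand-in for the restaurant database (assumed schema).
RESTAURANTS = [
    {"name": "Sushi Seki Chelsea", "restaurant_type": "sushi",
     "location": "Chelsea", "stars": 4.4, "cost": 30},
]


@dataclass
class DialogueState:
    restaurant_type: Optional[str] = None
    location: Optional[str] = None
    curr_result: Optional[dict] = None


def execute_search(state: DialogueState) -> Optional[dict]:
    """Return the first restaurant consistent with the filled slots."""
    for r in RESTAURANTS:
        if (state.restaurant_type in (None, r["restaurant_type"])
                and state.location in (None, r["location"])):
            return r
    return None


def get_value(field: str, result: dict):
    """Look up one attribute of the current search result."""
    return result[field]


state = DialogueState()

# Turn 1: "Find me a good sushi restaurant in Chelsea"
state.restaurant_type = "sushi"
state.location = "Chelsea"
state.curr_result = execute_search(state)
r = state.curr_result
print(f"{r['name']} is a {r['restaurant_type']} restaurant "
      f"in {r['location']} with {r['stars']} stars")

# Turn 2: "How expensive is it?" ("it" resolves to curr_result)
print(f"Entrees are around ${get_value('cost', state.curr_result)} each")
```

Keeping `curr_result` in the state is what lets the second turn answer a question about “it” without the user repeating the restaurant name.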
POMDP-based Dialogue Systems
Young et al. (2013)
‣ POMDP: user is the “environment,” an utterance is a noisy signal of state
‣ Dialogue model: can look like a parser or any kind of encoder model
‣ Generator: use templates or seq2seq model
‣ Where do rewards come from?
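Treating an utterance as a noisy signal of state means the system keeps a probability distribution (a belief) over possible user states and updates it after each utterance. The sketch below shows one Bayes-rule update; it ignores state transitions for simplicity, and the two user goals and the likelihood values are invented for illustration.

```python
from typing import Dict


def update_belief(belief: Dict[str, float],
                  likelihood: Dict[str, float]) -> Dict[str, float]:
    """Bayes rule: new belief(s) is proportional to P(utterance | s) * belief(s)."""
    unnormalized = {s: likelihood[s] * p for s, p in belief.items()}
    total = sum(unnormalized.values())
    return {s: v / total for s, v in unnormalized.items()}


# Uniform prior over two hypothetical user goals.
belief = {"wants_sushi": 0.5, "wants_pizza": 0.5}
# Assumed probability of the observed (noisy) utterance under each goal.
likelihood = {"wants_sushi": 0.8, "wants_pizza": 0.2}

belief = update_belief(belief, likelihood)
print(belief)
```

The system then chooses its next action based on this belief rather than on a single hard guess about what the user wants.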
Reward for completing task?
Find me a good sushi restaurant in Chelsea
restaurant_type <- sushi
location <- Chelsea
curr_result <- execute_search()
Sushi Seki Chelsea is a sushi restaurant in Chelsea with 4.4 stars on Google
How expensive is it?
…
Okay make me a reservation!
make_reservation(curr_result)
+1
‣ Very indirect signal of what should happen in the earlier turns
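One way to see why a single end-of-task reward is such an indirect signal is to compute the return credited to each turn. The sketch below assumes a discount factor `gamma`, which is a standard reinforcement learning device and not something stated on the slide; the four-turn reward sequence is illustrative.

```python
from typing import List


def discounted_returns(rewards: List[float], gamma: float) -> List[float]:
    """Compute G_t = r_t + gamma * G_{t+1} for every turn t."""
    returns = []
    running = 0.0
    # Walk backwards so each turn can reuse the return of the next turn.
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    return returns[::-1]


# A single +1 at the final turn (the reservation), nothing before it.
print(discounted_returns([0.0, 0.0, 0.0, 1.0], gamma=0.5))
# prints [0.125, 0.25, 0.5, 1.0]
```

Every earlier turn receives only a scaled-down copy of the same final reward, so the signal does not distinguish which earlier action actually helped.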
User gives reward?
Find me a good sushi restaurant in Chelsea
restaurant_type <- sushi
location <- Chelsea
curr_result <- execute_search()
Sushi Seki Chelsea is a sushi restaurant in Chelsea with 4.4 stars on Google
+1
How expensive is it?
get_value(cost, curr_result)
Entrees are around $30 each
+1
‣ How does the user know the right search happened?
Wizard-of-Oz
Kelley (early 1980s), Ford and Smith (1982)
‣ Learning from demonstrations: “wizard” pulls the levers and makes the dialogue system update its state and take actions
Full Dialogue Task
Find me a good sushi restaurant in Chelsea
restaurant_type <- sushi
location <- Chelsea
curr_result <- execute_search()
(wizard enters these)
Sushi Seki Chelsea is a sushi restaurant in Chelsea with 4.4 stars on Google
(wizard types this out or invokes templates)
‣ Wizard can be a trained expert and know exactly what the dialogue system is supposed to do
Learning from Static Traces
Bordes et al. (2017)
‣ Using either wizard-of-Oz or other annotations, can collect static traces and train from these
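As a rough illustration of what training from static traces could look like, the sketch below converts one recorded trace into supervised (history, next action) pairs. The trace format and the `trace_to_examples` helper are assumptions made for this example, not the data format used by Bordes et al.

```python
from typing import List, Tuple

# A hypothetical recorded trace: user utterances interleaved with the
# actions the wizard (or annotator) chose in response.
TRACE = [
    ("user", "Find me a good sushi restaurant in Chelsea"),
    ("action", "restaurant_type <- sushi"),
    ("action", "location <- Chelsea"),
    ("action", "curr_result <- execute_search()"),
    ("user", "How expensive is it?"),
    ("action", "get_value(cost, curr_result)"),
]


def trace_to_examples(
    trace: List[Tuple[str, str]]
) -> List[Tuple[Tuple[str, ...], str]]:
    """Turn one trace into (history so far, next action) training pairs."""
    examples = []
    history: List[str] = []
    for kind, text in trace:
        if kind == "action":
            # The model should predict this action from everything before it.
            examples.append((tuple(history), text))
        history.append(text)
    return examples


for history, action in trace_to_examples(TRACE):
    print(len(history), "->", action)
```

A model trained on such pairs can only imitate the recorded behavior, since the pairs are fixed once the trace has been collected.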
Full Dialogue Task
Find me a good sushi restaurant in Chelsea
restaurant_type <- sushi
location <- Chelsea
stars <- 4+
curr_result <- execute_search()
‣ User asked for a “good” restaurant. Does that mean we should filter by star rating? What does “good” mean?
‣ Hard to change system behavior if training from static traces, especially if system capabilities or desired behavior change
Goal-oriented Dialogue
‣ Big Companies: Apple Siri (VocalIQ), Google Allo, Amazon Alexa, Microsoft Cortana, Facebook M, Samsung Bixby, Tencent WeChat
‣ Startups:
‣ Lots of cool work that’s not public yet
‣ Tons of industry interest!
Other Dialogue Applications
Search/QA as Dialogue
‣ “Has Chris Pratt won an Oscar?” / “Has he won an Oscar”
QA as Dialogue
‣ Dialogue is a very natural way to find information from a search engine or a QA system
Iyyer et al. (2017)
‣ Challenges:
  ‣ QA is hard enough on its own
  ‣ Users move the goalposts
‣ UW QuAC dataset: Question Answering in Context
Choi et al. (2018)
Search as Dialogue
‣ Google can deal with misspellings, so more misspellings happen; Google has to do more!
Dialogue Mission Creep
[Figure: two development cycles, each made of Data, System, Error analysis, and Better model. One is labeled “Most NLP tasks”; the other, labeled “Dialogue/Search/QA”, adds Harder Data and “???”]
‣ Fixed distribution (e.g., natural language sentences), error rate -> 0
‣ Error rate -> ???; “mission creep” from HCI element
‣ High visibility: your product has to work really well!
Takeaways
‣ Some decent chatbots, applications: predictive text input, …
‣ Task-oriented dialogue systems are growing in scope and complexity
‣ More and more problems are being formulated as dialogue: interesting applications but challenging to get working well