CS388: Natural Language Processing
Lecture 21: Dialogue
Greg Durrett
Administrivia
‣ Proposal due today at 5pm
‣ Proposals returned ASAP, then Project 2
Recall: Extractive Summarization
Gillick and Favre (2009)
‣ Count number of documents each bigram occurs in to measure importance
score(massive earthquake) = 3
score(Iraqi capital) = 1
score(six killed) = 2
score(magnitude 7.3) = 2
‣ Find summary that maximizes the score of bigrams it covers
‣ ILP formulation: c and s are indicator variables indexed over concepts (bigrams) and sentences, respectively
“set c_i to 1 iff some sentence that contains it is included”
sum of included sentences’ lengths can’t exceed L
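The ILP described on this slide can be written out roughly as follows. This is a sketch of the standard concept-coverage formulation: the symbols w_i (bigram score), l_j (sentence length), and Occ_ij (1 if bigram i occurs in sentence j) are notation assumed here, not taken from the slide.

```latex
\begin{align*}
\max_{c,\,s}\quad & \sum_i w_i\, c_i
  && \text{total score of covered bigrams} \\
\text{s.t.}\quad & \sum_j l_j\, s_j \le L
  && \text{length budget} \\
& s_j\, \mathrm{Occ}_{ij} \le c_i \quad \forall i, j
  && \text{an included sentence turns on its bigrams} \\
& \sum_j s_j\, \mathrm{Occ}_{ij} \ge c_i \quad \forall i
  && \text{a bigram is on only if some included sentence contains it} \\
& c_i \in \{0,1\},\quad s_j \in \{0,1\}
\end{align*}
```

The two middle constraints together encode the "iff" in the quoted condition on c_i.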
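Recall: Compression
‣ Now s_j variables are nodes or sets of nodes in the parse tree
At least six people were killed and many others injured
[Figure: parse tree of the sentence above with S and CC nodes; two nodes are labeled s1 and s2]
‣ New constraint: s2 ≤ s1 (“s1 is a prerequisite for s2”)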
Recall: Pointer-Generator
See et al. (2017)
Recall: Seq2seq Summarization
See et al. (2017)
‣ Solutions: copy mechanism, coverage, just like in MT…
‣ Things might still go wrong, no way of preventing this…
This Lecture
‣ Chatbot dialogue systems
‣ Task-oriented dialogue
‣ Other dialogue applications
Chatbots
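Turing Test (1950)
‣ Imitation game: A and B are locked in rooms and answer C’s questions via typewriter. Both are trying to act like B
[Figure: two diagrams labeled “Original Interpretation” (players A and B, judge C) and “Standard Interpretation” (two players labeled B); the judge in each is marked “trained judge”]
‣ The test is not “does this computer seem human-like to random people with a web browser?”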
ELIZA
Weizenbaum (1966)
‣ Created 1964-1966 at MIT, heavily scripted
‣ DOCTOR script was most successful: repeats user’s input, asks inane questions
ELIZA
‣ Identify keyword, identify context, apply transformation rule
(.*) you (.*) me  ->  Why do you think I $2 you?
My (.) (.*)  ->  What else comes to mind when you think of your $1?
‣ Very little need to generate new content, but can only have one type of conversation
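The keyword/transformation-rule idea can be sketched in a few lines of Python. This is a minimal illustration, not Weizenbaum's implementation: the two rules mirror the slide's examples, while the word-level capture `(\w+)` in the second rule, the `respond` function, and the fallback reply are assumptions made for this sketch.

```python
import re

# Each rule pairs a pattern with a response template; \1 and \2 refer to
# captured groups (the slide writes them as $1 and $2).
# Assumption: the second rule captures a whole word after "my".
RULES = [
    (re.compile(r"(.*)\byou (.*) me\b", re.IGNORECASE),
     r"Why do you think I \2 you?"),
    (re.compile(r"\bmy (\w+)(.*)", re.IGNORECASE),
     r"What else comes to mind when you think of your \1?"),
]

# Hypothetical reply used when no rule matches.
FALLBACK = "Please go on."


def respond(utterance: str) -> str:
    """Return the response of the first rule whose pattern matches."""
    text = utterance.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            # Fill the template with the captured groups.
            return match.expand(template)
    return FALLBACK


print(respond("I think you hate me"))
print(respond("My mother hates flying."))
```

No new content is generated here: every reply is a template filled with the user's own words, which is why such a system supports only one type of conversation.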
Cleverbot
‣ Carpenter (1986), online system built in 2006
‣ “Nearest neighbors”: human says statement A, find a human response in human-human or human-computer chats to statement A, repeat that
User: Hi there cleverbot.
Cleverbot: How are you?
User: Doing okay. what plans do you have today?
Cleverbot: I do revision.
User: What are you revising for?
Cleverbot: Maths history.
User: How old are you?
Cleverbot: That's none of your business.
User: What is maths history?
Cleverbot: I don’t know, farming maybe
‣ Can often give sensible answers, but the bot doesn’t really impose high-level discourse structure
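A toy version of the "nearest neighbors" idea is sketched below. It is not Cleverbot's actual retrieval method, which the slides do not describe: the bag-of-words cosine similarity, the three-entry `CHAT_LOG`, and the function names are all assumptions chosen to keep the example small.

```python
import math
from collections import Counter

# Hypothetical log of (statement, human response) pairs from past chats.
CHAT_LOG = [
    ("how are you", "Doing okay."),
    ("what plans do you have today", "I do revision."),
    ("how old are you", "That's none of your business."),
]


def bag_of_words(text):
    """Lowercase, drop simple punctuation, and count words."""
    cleaned = text.lower().replace("?", "").replace(".", "")
    return Counter(cleaned.split())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def reply(statement):
    """Find the most similar logged statement and repeat its response."""
    query = bag_of_words(statement)
    _, response = max(
        CHAT_LOG, key=lambda pair: cosine(query, bag_of_words(pair[0]))
    )
    return response


print(reply("How old are you?"))
```

Each reply depends only on the current statement, so nothing ties consecutive turns together, which matches the lack of high-level discourse structure noted above.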
Data-Driven Approaches
Ritter et al. (2011)
‣ Can treat as a machine translation problem: “translate” from current utterance to next one
‣ Filter the data, use statistical measures to prune extracted phrases to get better performance
Seq2seq Models
[Figure: encoder reads “What are you doing”; decoder starts from <s> and generates “I am going home [STOP]”]
‣ Just like conventional MT, can train seq2seq models for this task
‣ Why might this model perform poorly? What might it be bad at?
‣ Hard to evaluate:
Lack of Diversity
Li et al. (2016)
‣ Training to maximize likelihood gives a system that prefers common responses:
Lack of Diversity
Li et al. (2016)
‣ Solution: mutual information criterion; response R should be predictive of user utterance U as well
‣ Standard conditional likelihood: $\log P(R \mid U)$
‣ Mutual information: $\log \frac{P(R, U)}{P(R)\,P(U)} = \log P(R \mid U) - \log P(R)$
‣ $\log P(R)$ can reflect probabilities under a language model
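One way to see the effect of the mutual information criterion is to rerank candidate responses by log P(R|U) - log P(R). The sketch below assumes both log-probabilities are already available (for example from a seq2seq model and a language model); the candidate list and its probabilities are made-up numbers, and the weight `lam` is an assumed knob that reduces to the slide's formula at `lam=1.0`.

```python
import math


def mmi_score(log_p_r_given_u, log_p_r, lam=1.0):
    """log P(R|U) - lam * log P(R); lam=1.0 matches the formula above."""
    return log_p_r_given_u - lam * log_p_r


def rerank(candidates, lam=1.0):
    """Sort (response, log P(R|U), log P(R)) triples by MMI score, best first."""
    return sorted(
        candidates,
        key=lambda c: mmi_score(c[1], c[2], lam),
        reverse=True,
    )


# Hypothetical candidates: the generic reply is likelier given U,
# but it is also very likely under the language model alone.
candidates = [
    ("I don't know.", math.log(0.20), math.log(0.10)),
    ("I am going home.", math.log(0.10), math.log(0.01)),
]

print(rerank(candidates)[0][0])           # MMI ranking
print(rerank(candidates, lam=0.0)[0][0])  # plain likelihood ranking
```

With these numbers the generic response wins under plain likelihood but loses under the MMI score, because subtracting log P(R) penalizes responses that are common regardless of the user's utterance.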
Lack of Diversity
Li et al. (2016)
‣ OpenSubtitles data
Future of Chatbots
‣ XiaoIce: Microsoft chatbot in Chinese, 20M users, average user interacts 60 times/month
‣ People do seem to like talking to them…?
‣ How deep can a conversation be without more semantic grounding? Basic facts aren’t even consistent…
‣ Can force chatbots to give consistent answers, but still probably not very interesting
Li et al. (2016) Persona…
Task-Oriented Dialogue
‣ Question answering/search:
User: Google, what’s the most valuable American company?
System: Apple
User: Who is its CEO?
System: Tim Cook
Task-Oriented Dialogue
‣ Personal assistants/API front-ends:
User: Siri, find me a good sushi restaurant in Chelsea
System: Sushi Seki Chelsea is a sushi restaurant in Chelsea with 4.4 stars on Google
User: How expensive is it?
System: Entrees are around $30 each
User: Find me something cheaper
Task-Oriented Dialogue
‣ Personal assistants/API front-ends:
User: Hey Alexa, why isn’t my Amazon order here?
System: Let me retrieve your order. Your order was scheduled to arrive at 4pm today.
User: It never came
System: Okay, I can put you through to customer service.
Air Travel Information Service (ATIS)
DARPA (early 1990s), Figure from Tur et al. (2010)
‣ Given an utterance, predict a domain-specific semantic interpretation
‣ Can formulate as semantic parsing, but simple slot-filling solutions (classifiers) work well too
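In a slot-filling setup, a classifier typically labels each token with a tag, and the tags are then collapsed into slot values. The sketch below shows only that last step, assuming BIO-style tags; the slot names (`fromloc`, `toloc`, `date`) and the example utterance are illustrative and are not taken from the figure or from the actual ATIS label set.

```python
def slots_from_bio(tokens, tags):
    """Collapse per-token BIO tags into a {slot_name: text} dictionary."""
    slots = {}
    name, words = None, []

    def close_span():
        # Store the span collected so far, if any.
        if name is not None:
            slots[name] = " ".join(words)

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            close_span()
            name, words = tag[2:], [token]   # start a new slot span
        elif tag.startswith("I-") and name == tag[2:]:
            words.append(token)              # continue the current span
        else:
            close_span()
            name, words = None, []           # "O" or inconsistent tag
    close_span()
    return slots


# Hypothetical tagger output for one utterance.
tokens = "show flights from boston to new york today".split()
tags = ["O", "O", "O", "B-fromloc", "O", "B-toloc", "I-toloc", "B-date"]

print(slots_from_bio(tokens, tags))
```

The tagger itself could be any per-token classifier; the point is that the semantic interpretation falls out of a flat labeling without a full semantic parser.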
Full Dialogue Task
Young et al. (2013)
‣ Parsing/language understanding is just one piece of a system
‣ Dialogue state: reflects any information about the conversation (e.g., search history)
‣ User utterance -> update dialogue state -> take action (e.g., query the restaurant database) -> say something
‣ Much more complex than chatbots!
Full Dialogue Task
User: Find me a good sushi restaurant in Chelsea
    restaurant_type <- sushi
    location <- Chelsea
    curr_result <- execute_search()
System: Sushi Seki Chelsea is a sushi restaurant in Chelsea with 4.4 stars on Google
User: How expensive is it?
    get_value(cost, curr_result)
System: Entrees are around $30 each
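The state updates in this example can be mimicked with a small dialogue-state object. This is a sketch under stated assumptions: the `DialogueState` class, the toy `RESTAURANTS` table (including the made-up second entry), and the "highest-rated match" search policy are all hypothetical; only the slot names and the `execute_search` / `get_value` operations come from the slide. Language understanding, i.e. mapping the utterance to these updates, is left out.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical stand-in for the restaurant database.
RESTAURANTS = [
    {"name": "Sushi Seki Chelsea", "restaurant_type": "sushi",
     "location": "Chelsea", "stars": 4.4, "cost": "around $30 per entree"},
    {"name": "Example Pizza", "restaurant_type": "pizza",
     "location": "Chelsea", "stars": 4.0, "cost": "around $15 per entree"},
]


@dataclass
class DialogueState:
    constraints: dict = field(default_factory=dict)  # e.g. restaurant_type
    curr_result: Optional[dict] = None               # last search result

    def set_slot(self, name, value):
        """restaurant_type <- sushi, location <- Chelsea, ..."""
        self.constraints[name] = value

    def execute_search(self):
        """Query the database with the current constraints."""
        matches = [
            r for r in RESTAURANTS
            if all(r.get(k) == v for k, v in self.constraints.items())
        ]
        # Assumed policy: return the highest-rated match, if any.
        self.curr_result = (
            max(matches, key=lambda r: r["stars"]) if matches else None
        )
        return self.curr_result

    def get_value(self, attribute):
        """Look up an attribute of the current result, e.g. cost."""
        if self.curr_result is None:
            return None
        return self.curr_result.get(attribute)


state = DialogueState()
state.set_slot("restaurant_type", "sushi")
state.set_slot("location", "Chelsea")
result = state.execute_search()
print(result["name"], "-", state.get_value("cost"))
```

Because "it" in "How expensive is it?" resolves to `curr_result`, the follow-up question only works if the state persists across turns.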
POMDP-based Dialogue Systems
Young et al. (2013)
‣ Dialogue model: can look like a parser or any kind of encoder model
‣ POMDP: user is the “environment,” an utterance is a noisy signal of state
‣ Generator: use templates or seq2seq model
‣ Where do rewards come from?
Reward for completing task?
User: Find me a good sushi restaurant in Chelsea
    restaurant_type <- sushi
    location <- Chelsea
    curr_result <- execute_search()
System: Sushi Seki Chelsea is a sushi restaurant in Chelsea with 4.4 stars on Google
User: How expensive is it?
…
User: Okay make me a reservation!
    make_reservation(curr_result)
    +1
‣ The final +1 is a very indirect signal of what should happen in the earlier steps
User gives reward?
User: Find me a good sushi restaurant in Chelsea
    restaurant_type <- sushi
    location <- Chelsea
    curr_result <- execute_search()
System: Sushi Seki Chelsea is a sushi restaurant in Chelsea with 4.4 stars on Google
    +1
User: How expensive is it?
    get_value(cost, curr_result)
System: Entrees are around $30 each
    +1
‣ How does the user know the right search happened?
Wizard-of-Oz
Kelley (early 1980s), Ford and Smith (1982)
‣ Learning from demonstrations: “wizard” pulls the levers and makes the dialogue system update its state and take actions
Full Dialogue Task example:
User: Find me a good sushi restaurant in Chelsea
    restaurant_type <- sushi
    location <- Chelsea
    curr_result <- execute_search()
    (wizard enters these)
System: Sushi Seki Chelsea is a sushi restaurant in Chelsea with 4.4 stars on Google
    (wizard types this out or invokes templates)
‣ Wizard can be a trained expert and know exactly what the dialogue system is supposed to do
Learning from Static Traces
Bordes et al. (2017)
‣ Using either wizard-of-Oz or other annotations, can collect static traces and train from these
Full Dialogue Task example:
User: Find me a good sushi restaurant in Chelsea
    restaurant_type <- sushi
    location <- Chelsea
    stars <- 4+
    curr_result <- execute_search()
‣ User asked for a “good” restaurant: does that mean we should filter by star rating? What does “good” mean?
‣ Hard to change system behavior if training from static traces, especially if system capabilities or desired behavior change
Goal-oriented Dialogue
‣ Big companies: Apple Siri (VocalIQ), Google Allo, Amazon Alexa, Microsoft Cortana, Facebook M, Samsung Bixby, Tencent WeChat
‣ Startups:
‣ Lots of cool work that’s not public yet
‣ Tons of industry interest!
Other Dialogue Applications
Search/QA as Dialogue
‣ “Has Chris Pratt won an Oscar?” / “Has he won an Oscar”
QA as Dialogue
Iyyer et al. (2017)
‣ Dialogue is a very natural way to find information from a search engine or a QA system
‣ Challenges:
  ‣ QA is hard enough on its own
  ‣ Users move the goalposts
QA as Dialogue
Choi et al. (2018)
‣ UW QuAC dataset: Question Answering in Context
Search as Dialogue
‣ Google can deal with misspellings, so more misspellings happen: Google has to do more!
Dialogue Mission Creep
[Figure, most NLP tasks: Data -> System -> Error analysis -> Better model]
‣ Fixed distribution (e.g., natural language sentences), error rate -> 0
[Figure, Dialogue/Search/QA: Data -> System -> Error analysis -> Better model -> Harder data -> ???]
‣ Error rate -> ???; “mission creep” from HCI element
Dialogue Mission Creep
‣ High visibility: your product has to work really well!
Takeaways
‣ Some decent chatbots, but unclear how to make these more sophisticated than they are right now
‣ Task-oriented dialogue systems are growing in scope and complexity: really exciting systems on the way
‣ More and more problems are being formulated as dialogue: interesting applications but challenging to get working well