CS388: Natural Language Processing, Lecture 21: Dialogue
gdurrett/courses/fa2018/lectures/lec21-…

Greg Durrett

Administrivia

‣ Proposal due today at 5pm

‣ Proposals returned ASAP, then Project 2

Recall: Extractive Summarization

‣ Count the number of documents each bigram occurs in to measure importance

Gillick and Favre (2009)

score(massive earthquake) = 3
score(Iraqi capital) = 1
score(six killed) = 2
score(magnitude 7.3) = 2

‣ ILP formulation: c and s are indicator variables indexed over concepts (bigrams) and sentences, respectively

“set c_i to 1 iff some sentence that contains it is included”

sum of included sentences’ lengths can’t exceed L

‣ Find the summary that maximizes the score of the bigrams it covers
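
The objective above can be illustrated without an ILP solver by brute force over sentence subsets. This is a minimal sketch, not Gillick and Favre's formulation: the function name `best_summary`, the example sentences, and the use of word count as sentence length are assumptions made for illustration. Only the four bigram scores come from the slide.

```python
from itertools import combinations

# Bigram scores from the slide (number of documents each bigram occurs in).
SCORES = {
    ("massive", "earthquake"): 3,
    ("iraqi", "capital"): 1,
    ("six", "killed"): 2,
    ("magnitude", "7.3"): 2,
}

# Hypothetical candidate sentences, invented for this example.
SENTENCES = [
    "A massive earthquake struck",
    "Six killed in Iraqi capital",
    "The magnitude 7.3 quake",
]


def bigrams(sentence):
    """Return the set of lowercased word bigrams in a sentence."""
    words = sentence.lower().split()
    return set(zip(words, words[1:]))


def best_summary(sentences, scores, max_len):
    """Try every sentence subset; keep the one whose covered bigrams score highest.

    Each bigram counts once, however many selected sentences contain it,
    and the total length (in words, an assumption) must not exceed max_len.
    """
    best_score, best_subset = 0, ()
    for k in range(1, len(sentences) + 1):
        for subset in combinations(range(len(sentences)), k):
            length = sum(len(sentences[i].split()) for i in subset)
            if length > max_len:
                continue
            covered = set().union(*(bigrams(sentences[i]) for i in subset))
            score = sum(scores.get(b, 0) for b in covered)
            if score > best_score:
                best_score, best_subset = score, subset
    return best_score, best_subset


print(best_summary(SENTENCES, SCORES, max_len=8))
```

Exhaustive search is exponential in the number of sentences, which is why the real formulation is handed to an ILP solver.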

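Recall: Compression

‣ Now the s_j variables are nodes or sets of nodes in the parse tree

At least six people were killed and many others injured

[Figure: parse tree of the example sentence (S and CC nodes) with constituents labeled s_1 and s_2]

‣ New constraint: s_2 ≤ s_1 (“s_1 is a prerequisite for s_2”)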

Recall: Pointer-Generator

See et al. (2017)

Recall: Seq2seq Summarization

See et al. (2017)

‣ Solutions: copy mechanism, coverage, just like in MT…

‣ Things might still go wrong, no way of preventing this…

This Lecture

‣ Chatbot dialogue systems

‣ Task-oriented dialogue

‣ Other dialogue applications

Chatbots

Turing Test (1950)

‣ Imitation game: A and B are locked in rooms and answer C’s questions via typewriter. Both are trying to act like B

[Figure: two panels. Original Interpretation: A and B, questioned by C. Standard Interpretation: B and B, questioned by a trained judge.]

‣ The test is not “does this computer seem human-like to random people with a web browser?”

ELIZA

‣ Created 1964-1966 at MIT, heavily scripted

‣ DOCTOR script was most successful: repeats user’s input, asks inane questions

Weizenbaum (1966)

ELIZA

(.*) you (.*) me

Why do you think I $2 you?

My (.) (.*)

What else comes to mind when you think of your $1?

‣ Identify keyword, identify context, apply transformation rule

‣ Very little need to generate new content, but can only have one type of conversation
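
The keyword-and-transformation idea above can be sketched with Python's `re` module. This is an illustration, not Weizenbaum's DOCTOR script. Three things are assumptions of this sketch: the second rule captures everything after "my" in a single group, the `\b` word boundaries are added, and the `FALLBACK` reply is invented.

```python
import re

# (pattern, response template) pairs, tried in order.
# Templates use \1, \2 to refer to captured groups (the slide writes $1, $2).
RULES = [
    (re.compile(r"(.*)\byou (.*) me\b", re.IGNORECASE),
     r"Why do you think I \2 you?"),
    (re.compile(r"\bmy (.*)", re.IGNORECASE),
     r"What else comes to mind when you think of your \1?"),
]

# Hypothetical default reply when no rule matches.
FALLBACK = "Please go on."


def respond(utterance):
    """Apply the first matching transformation rule to the user's utterance."""
    text = utterance.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            # Substitute the captured groups into the response template.
            return match.expand(template)
    return FALLBACK


print(respond("I think you hate me."))
print(respond("My mother"))
```

No content is generated here: every reply is a template filled with the user's own words, which is why the system can only sustain one kind of conversation.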

Cleverbot

‣ Carpenter (1986), online system built in 2006

‣ “Nearest neighbors”: human says statement A, find a human response in human-human or human-computer chats to statement A, repeat that

User: Hi there cleverbot.
Cleverbot: How are you?
User: Doing okay. what plans do you have today?
Cleverbot: I do revision.
User: What are you revising for?
Cleverbot: Maths history.
User: How old are you?
Cleverbot: That's none of your business.
User: What is maths history?
Cleverbot: I don’t know, farming maybe

‣ Can often give sensible answers, but the bot doesn’t really impose high-level discourse structure
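
A rough sketch of the “nearest neighbors” idea: store (statement, response) pairs from past chats, find the stored statement most similar to the new one, and repeat its response. Cleverbot's actual matching method is not described on the slide. The Jaccard word-overlap similarity, the function names, and the tiny `CHAT_LOG` are all assumptions chosen for illustration.

```python
import re

# Hypothetical log of past (statement, response) pairs.
CHAT_LOG = [
    ("How are you?", "Doing okay."),
    ("What are you revising for?", "Maths history."),
    ("How old are you?", "That's none of your business."),
]


def tokens(text):
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"[a-z']+", text.lower()))


def jaccard(a, b):
    """Word-overlap similarity between two token sets, in [0, 1]."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def nearest_response(statement, chat_log):
    """Return the logged response whose statement is most similar to the input."""
    query = tokens(statement)
    _, reply = max(chat_log, key=lambda pair: jaccard(query, tokens(pair[0])))
    return reply


print(nearest_response("How old are you today?", CHAT_LOG))
```

Each reply is chosen independently of the conversation so far, which is consistent with the lack of high-level discourse structure noted above.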

Data-Driven Approaches

‣ Can treat as a machine translation problem: “translate” from current utterance to next one

‣ Filter the data, use statistical measures to prune extracted phrases to get better performance

Ritter et al. (2011)

Data-Driven Approaches

Ritter et al. (2011)

Seq2seq models

[Figure: encoder reads “What are you doing”; decoder starts from <s> and generates “I am going home [STOP]”]

‣ Just like conventional MT, can train seq2seq models for this task

‣ Why might this model perform poorly? What might it be bad at?

‣ Hard to evaluate:

Lack of Diversity

Li et al. (2016)

‣ Training to maximize likelihood gives a system that prefers common responses:

Lack of Diversity

Li et al. (2016)

‣ Solution: mutual information criterion; response R should be predictive of user utterance U as well

‣ Standard conditional likelihood: $\log P(R \mid U)$

‣ Mutual information: $\log \frac{P(R, U)}{P(R)\,P(U)} = \log P(R \mid U) - \log P(R)$

‣ $\log P(R)$ can reflect probabilities under a language model
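
To see why subtracting log P(R) helps, the two criteria can be compared on a pair of candidate responses. This is a toy sketch: the probabilities are invented for illustration, and `mmi_score` and `rerank` are hypothetical names, not functions from Li et al.'s system.

```python
import math


def mmi_score(p_r_given_u, p_r):
    """Mutual information criterion: log P(R|U) - log P(R)."""
    return math.log(p_r_given_u) - math.log(p_r)


def rerank(candidates):
    """Sort (response, P(R|U), P(R)) triples by mutual information score, best first."""
    return sorted(candidates, key=lambda c: mmi_score(c[1], c[2]), reverse=True)


# Made-up probabilities: the generic reply is likely given the utterance,
# but it is also likely under the language model regardless of the utterance.
candidates = [
    ("I don't know", 0.30, 0.20),
    ("I am going home", 0.10, 0.01),
]

best_by_likelihood = max(candidates, key=lambda c: c[1])[0]
best_by_mmi = rerank(candidates)[0][0]
print(best_by_likelihood)
print(best_by_mmi)
```

Conditional likelihood picks the generic reply, while the mutual information score penalizes it for being probable no matter what the user said.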

Lack of Diversity

Li et al. (2016)

‣ OpenSubtitles data

Future of chatbots

‣ XiaoIce: Microsoft chatbot in Chinese, 20M users, average user interacts 60 times/month

‣ People do seem to like talking to them…?

‣ How deep can a conversation be without more semantic grounding? Basic facts aren’t even consistent…

‣ Can force chatbots to give consistent answers, but still probably not very interesting

Li et al. (2016) Persona…

Task-Oriented Dialogue


‣ Question answering/search:

Google, what’s the most valuable American company?

Apple

Who is its CEO?

Tim Cook


‣ Personal assistants/API front-ends:

Siri, find me a good sushi restaurant in Chelsea

Sushi Seki Chelsea is a sushi restaurant in Chelsea with 4.4 stars on Google

How expensive is it?

Entrees are around $30 each

Find me something cheaper


‣ Personal assistants/API front-ends:

Hey Alexa, why isn’t my Amazon order here?

Let me retrieve your order. Your order was scheduled to arrive at 4pm today.

It never came

Okay, I can put you through to customer service.

Air Travel Information Service (ATIS)

‣ Given an utterance, predict a domain-specific semantic interpretation

DARPA (early 1990s), Figure from Tur et al. (2010)

‣ Can formulate as semantic parsing, but simple slot-filling solutions (classifiers) work well too

Full Dialogue Task

‣ Parsing/language understanding is just one piece of a system

Young et al. (2013)

‣ Dialogue state: reflects any information about the conversation (e.g., search history)

‣ User utterance -> update dialogue state -> take action (e.g., query the restaurant database) -> say something

‣ Much more complex than chatbots!

Full Dialogue Task

Find me a good sushi restaurant in Chelsea

restaurant_type <- sushi

location <- Chelsea

curr_result <- execute_search()

Sushi Seki Chelsea is a sushi restaurant in Chelsea with 4.4 stars on Google

How expensive is it?

get_value(cost, curr_result)
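
The state updates and actions in this example can be sketched as plain Python. This is a minimal illustration under stated assumptions: the `DialogueState` class, `update_state`, and the in-memory `RESTAURANTS` table (including its second entry) are hypothetical. `execute_search` and `get_value` mirror the names on the slide, but their signatures here are guesses. Language understanding is skipped entirely, so the slot values are entered by hand.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical restaurant database; the second entry is invented.
RESTAURANTS = [
    {"name": "Sushi Seki Chelsea", "restaurant_type": "sushi",
     "location": "Chelsea", "stars": 4.4, "cost": "around $30"},
    {"name": "Hypothetical Pizza Place", "restaurant_type": "pizza",
     "location": "Chelsea", "stars": 4.0, "cost": "around $15"},
]


@dataclass
class DialogueState:
    """Everything the system remembers about the conversation so far."""
    slots: dict = field(default_factory=dict)
    curr_result: Optional[dict] = None


def update_state(state, slot, value):
    """Record a constraint extracted from the user's utterance."""
    state.slots[slot] = value


def execute_search(state):
    """Return the first restaurant matching every slot, or None."""
    for restaurant in RESTAURANTS:
        if all(restaurant.get(s) == v for s, v in state.slots.items()):
            return restaurant
    return None


def get_value(key, result):
    """Look up one attribute of the current search result."""
    return result[key]


# "Find me a good sushi restaurant in Chelsea"
state = DialogueState()
update_state(state, "restaurant_type", "sushi")
update_state(state, "location", "Chelsea")
state.curr_result = execute_search(state)
print(state.curr_result["name"])

# "How expensive is it?" -- "it" resolves to the stored search result.
print(get_value("cost", state.curr_result))
```

The follow-up question only makes sense because `curr_result` persists across turns, which is the role the dialogue state plays.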


POMDP-based Dialogue Systems

Young et al. (2013)

‣ Dialogue model: can look like a parser or any kind of encoder model

‣ POMDP: user is the “environment,” an utterance is a noisy signal of state

‣ Generator: use templates or seq2seq model

‣ Where do rewards come from?
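
Returning to the POMDP framing above: if an utterance is only a noisy signal of the user's state, the system can keep a probability distribution (a belief) over possible states and update it with Bayes' rule after each utterance. The sketch below is a simplification that assumes the user's goal does not change between turns, so there is no transition model. The state names and all probabilities are invented for illustration.

```python
def update_belief(belief, likelihood):
    """Bayes update: new_belief(s) is proportional to P(observation | s) * belief(s).

    belief:     dict mapping state -> prior probability
    likelihood: dict mapping state -> P(observed utterance | state)
    """
    unnormalized = {s: p * likelihood.get(s, 0.0) for s, p in belief.items()}
    total = sum(unnormalized.values())
    if total == 0:
        # Observation is impossible under every state; keep the prior.
        return dict(belief)
    return {s: p / total for s, p in unnormalized.items()}


# Hypothetical example: the system is unsure which cuisine the user wants.
prior = {"wants_sushi": 0.5, "wants_pizza": 0.5}

# Assumed output of a noisy understanding component for one utterance.
observation_likelihood = {"wants_sushi": 0.8, "wants_pizza": 0.2}

posterior = update_belief(prior, observation_likelihood)
print(posterior)
```

The policy then chooses actions based on the belief rather than on a single guessed state, for example asking a clarifying question when no state is clearly most probable.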

Reward for completing task?

location <- Chelsea

How expensive is it?

…

Okay make me a reservation!

make_reservation(curr_result)

+1

Very indirect signal of what should happen earlier in the dialogue

User gives reward?

location <- Chelsea

+1

How expensive is it?

get_value(cost, curr_result)

+1

How does the user know the right search happened?

Wizard-of-Oz

Kelley (early 1980s), Ford and Smith (1982)

‣ Learning from demonstrations: “wizard” pulls the levers and makes the dialogue system update its state and take actions

Full Dialogue Task

Find me a good sushi restaurant in Chelsea

location <- Chelsea

curr_result <- execute_search()

(wizard enters these)

Sushi Seki Chelsea is a sushi restaurant in Chelsea with 4.4 stars on Google

(wizard types this out or invokes templates)

‣ Wizard can be a trained expert and know exactly what the dialogue system is supposed to do

Learning from Static Traces

Bordes et al. (2017)

‣ Using either wizard-of-Oz or other annotations, can collect static traces and train from these

Full Dialogue Task

Find me a good sushi restaurant in Chelsea

location <- Chelsea

stars <- 4+

‣ User asked for a “good” restaurant: does that mean we should filter by star rating? What does “good” mean?

‣ Hard to change system behavior if training from static traces, especially if system capabilities or desired behavior change

Goal-oriented Dialogue

‣ Big Companies: Apple Siri (VocalIQ), Google Allo, Amazon Alexa, Microsoft Cortana, Facebook M, Samsung Bixby, Tencent WeChat

‣ Startups:

‣ Lots of cool work that’s not public yet

‣ Tons of industry interest!

Other Dialogue Applications

Search/QA as Dialogue

‣ “Has Chris Pratt won an Oscar?” / “Has he won an Oscar”

QA as Dialogue

‣ Dialogue is a very natural way to find information from a search engine or a QA system

Iyyer et al. (2017)

‣ Challenges:

‣ QA is hard enough on its own

‣ Users move the goalposts

QA as Dialogue

‣ UW QuAC dataset: Question Answering in Context

Choi et al. (2018)

Search as Dialogue

‣ Google can deal with misspellings, so more misspellings happen; Google has to do more!

Dialogue Mission Creep

Most NLP tasks

[Figure: development loop with Data, System, Error analysis, Better model]

‣ Fixed distribution (e.g., natural language sentences), error rate -> 0

Dialogue/Search/QA

[Figure: development loop with Data, System, Error analysis, Better model, plus Harder Data and “???”]

‣ Error rate -> ???; “mission creep” from HCI element

Dialogue Mission Creep

‣ High visibility: your product has to work really well!

Takeaways

‣ Some decent chatbots, but unclear how to make these more sophisticated than they are right now

‣ Task-oriented dialogue systems are growing in scope and complexity; really exciting systems on the way

‣ More and more problems are being formulated as dialogue: interesting applications but challenging to get working well
