State-of-the-art Parsing

University of Texas at Austin · gdurrett/courses/sp2019/lectures/lec... · 2019. 3. 7.


State-of-the-art Parsers

‣ 2005: Eisner algorithm graph-based parser was SOTA (~91 UAS)

‣ 2010: Better graph-based parsers using "parent annotation" (~93 UAS)

‣ 2012: Transition-based MaltParser achieved good results (~90 UAS)

‣ 2014: Stanford neural dependency parser (Chen and Manning) got 92 UAS with transition-based neural model

‣ 2016: Improvements to Chen and Manning

‣ Unlabeled attachment score (UAS): fraction of words with correct parent

‣ Labeled attachment score (LAS): have to label each edge correctly (but this isn't that hard: noun before verb -> nsubj in most contexts)
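The two attachment scores can be made concrete with a short sketch. It assumes each word is represented as a (head index, label) pair, with head 0 standing for the root; the function name and the toy parse are illustrative, not taken from the slides.

```python
def attachment_scores(gold, pred):
    """Return (UAS, LAS) for one sentence.

    gold, pred: lists of (head_index, label) pairs, one per word.
    UAS counts words whose head is correct; LAS also requires the label to match.
    """
    if len(gold) != len(pred) or not gold:
        raise ValueError("gold and pred must be non-empty and the same length")
    n = len(gold)
    uas_hits = sum(g[0] == p[0] for g, p in zip(gold, pred))
    las_hits = sum(g == p for g, p in zip(gold, pred))
    return uas_hits / n, las_hits / n


# Toy four-word sentence (hypothetical parse): word 3 gets the wrong head,
# word 4 gets the right head but the wrong label.
gold = [(2, "nsubj"), (0, "root"), (4, "det"), (2, "obj")]
pred = [(2, "nsubj"), (0, "root"), (2, "det"), (2, "nmod")]
uas, las = attachment_scores(gold, pred)
print(uas, las)
```

LAS can never exceed UAS, since every edge counted for LAS must already have the correct head.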

Stanford Dependency Parser

Chen and Manning (2014)

‣ Feedforward neural network on top of feature vector extracted from stack and buffer

‣ Example features: 1st in stack, 2nd in stack, 1st in buffer, POS of leftmost child of 1st in stack, …

‣ MSTParser: "graph-based" parser (like CKY) from 2005, so Chen + Manning's parser isn't much better but is much faster!
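A rough sketch of the "feature vector extracted from stack and buffer" idea, using only the four example features named on the slide. Everything else here is an assumption made for illustration: the `<NULL>` padding symbol, the tanh hidden layer, the three-action inventory, and the random weights (a trained parser would learn them, and the published model uses a larger feature set).

```python
import math
import random

NULL = "<NULL>"  # assumed padding symbol for empty stack/buffer positions


def extract_features(stack, buffer, words, pos_tags, leftmost_child):
    """Return [1st in stack, 2nd in stack, 1st in buffer, POS of leftmost child of 1st in stack]."""
    s1 = stack[-1] if len(stack) >= 1 else None
    s2 = stack[-2] if len(stack) >= 2 else None
    b1 = buffer[0] if buffer else None
    lc = leftmost_child.get(s1) if s1 is not None else None
    return [
        words[s1] if s1 is not None else NULL,
        words[s2] if s2 is not None else NULL,
        words[b1] if b1 is not None else NULL,
        pos_tags[lc] if lc is not None else NULL,
    ]


def feedforward(features, embed, w_hidden, w_out):
    """Concatenate feature embeddings, apply one hidden layer, softmax over actions."""
    x = [v for f in features for v in embed[f]]
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    scores = [sum(w * hi for w, hi in zip(row, h)) for row in w_out]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Toy configuration: ROOT and "Gaga" on the stack, "performed a concert" in the buffer.
words = ["<ROOT>", "Lady", "Gaga", "performed", "a", "concert"]
pos_tags = ["ROOT", "NNP", "NNP", "VBD", "DT", "NN"]
stack, buffer = [0, 2], [3, 4, 5]
leftmost_child = {2: 1}  # "Lady" already attached as leftmost child of "Gaga"

feats = extract_features(stack, buffer, words, pos_tags, leftmost_child)

# Randomly initialised parameters, for shape illustration only.
rng = random.Random(0)
dim, hidden, actions = 4, 8, 3  # actions: SHIFT, LEFT-ARC, RIGHT-ARC
symbols = sorted(set(words) | set(pos_tags) | {NULL})
embed = {s: [rng.uniform(-1, 1) for _ in range(dim)] for s in symbols}
w_hidden = [[rng.uniform(-1, 1) for _ in range(dim * len(feats))] for _ in range(hidden)]
w_out = [[rng.uniform(-1, 1) for _ in range(hidden)] for _ in range(actions)]
probs = feedforward(feats, embed, w_hidden, w_out)
print(feats, probs)
```

The parser would pick the highest-probability action, apply it to the stack and buffer, and repeat, which is why a cheap feedforward scorer makes the whole system fast.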

Parsey McParseFace (a.k.a. SyntaxNet)

Andor et al. (2016)

‣ Close to state-of-the-art, released by Google publicly

‣ 94.61 UAS on the Penn Treebank using a transition-based system

‣ Same feature set as Chen and Manning (2014), Google fine-tuned it

‣ Additional data harvested via "tri-training", form of self-training

https://github.com/tensorflow/models/tree/master/research/syntaxnet
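The slide does not spell out how the tri-training data is harvested. One common reading, assumed here, is an agreement filter: parse unlabeled sentences with two different parsers and keep only those where both produce the same tree, then use them as extra training data for a third model. The two toy "parsers" below are stand-ins invented for the sketch.

```python
def agreed_parses(sentences, parser_a, parser_b):
    """Keep (sentence, heads) pairs on which two parsers predict identical heads."""
    extra = []
    for sent in sentences:
        heads_a = parser_a(sent)
        heads_b = parser_b(sent)
        if heads_a == heads_b:
            extra.append((sent, heads_a))
    return extra


def attach_to_first(sent):
    """Toy parser: first word is the root (head 0), every other word attaches to word 1."""
    return [0] + [1] * (len(sent) - 1)


def attach_to_previous(sent):
    """Toy parser: first word is the root (head 0), every other word attaches to the word before it."""
    return list(range(len(sent)))


unlabeled = [["Dogs", "bark"], ["She", "saw", "stars"]]
harvested = agreed_parses(unlabeled, attach_to_first, attach_to_previous)
print(harvested)
```

The toy parsers agree only on the two-word sentence, so only that one is harvested. The appeal of the filter is that agreement between different models is a cheap proxy for correctness.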

AllenNLP

‣ Very nice and usable web demo

‣ Reimplementation of graph-based, state-of-the-art parser

‣ Some fancy tricks we haven't discussed yet

https://demo.allennlp.org/dependency-parsing

Other languages

‣ Annotate dependencies with the same representation in many languages

http://universaldependencies.org/

Example languages shown: English, Bulgarian, Czech, Swiss

Semantic Role Labeling

Lady Gaga performed a concert for students

[ARG0 Lady Gaga] [V performed] [ARG1 a concert] [ARG2 for students]

‣ Performing event

‣ Subject: Lady Gaga

‣ Object: a concert

‣ Audience: students

A concert was performed by Lady Gaga for students

[ARG1 A concert] [V was performed] [ARG0 by Lady Gaga] [ARG2 for students]

‣ Same event described but the representation looks different
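A small sketch of why role labels help: the active and passive sentences differ on the surface, but reduce to the same predicate-argument frame. The span labels follow the slide's ARG0/ARG1/ARG2 annotations as best they can be read from it; the `to_frame` helper and the hand-supplied lemma "perform" are assumptions, since no lemmatizer or SRL model is run here.

```python
def to_frame(lemma, labeled_spans):
    """Collapse labeled spans into an order-independent frame: (lemma, {role: text})."""
    roles = {
        label: text.lower()
        for label, text in labeled_spans
        if label != "V"  # the verb is represented by its lemma instead
    }
    return lemma, roles


# Spans listed in surface order for each sentence.
active = [
    ("ARG0", "Lady Gaga"),
    ("V", "performed"),
    ("ARG1", "a concert"),
    ("ARG2", "students"),
]
passive = [
    ("ARG1", "A concert"),
    ("V", "was performed"),
    ("ARG0", "Lady Gaga"),
    ("ARG2", "students"),
]

frame_active = to_frame("perform", active)
frame_passive = to_frame("perform", passive)
print(frame_active == frame_passive)
```

Downstream code can then ask "who is the ARG0 of perform?" without caring which grammatical construction the sentence used.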

VerbNet

verbs.colorado.edu

‣ Defines the semantics of verbs, arguments for every verb in English

Semantic Roles

‣ Related to theta roles in linguistics

‣ Agent (~subject): ARG0; patient/theme (~object): ARG1; goal (~indirect object): ARG2+ (semantics vary)

‣ "Postprocessing" layer on top of dependency parsing that exposes useful information, canonicalizes across grammatical constructions

Semantic Role Labeling

‣ Identify predicate, disambiguate it, identify that predicate's arguments

‣ Verb roles from Propbank (Palmer et al., 2005)

Figure from He et al. (2017); example predicate: quicken

SRL for QA

Shen and Lapata (2007)

‣ Question and several answer candidates

Q: Who discovered prions?

AC1: In 1997, Stanley B. Prusiner, a scientist in the United States, discovered prions…

AC2: Prions were researched by…

‣ Score by matching expected answer phrase (EAP) against answer candidate (AC)

Answer: Prusiner

More on SRL

‣ Emma Strubell from UMass Amherst: "Neural Network Architectures for Fast and Robust NLP"

‣ Even complex neural network models for SRL benefit from dependency information

‣ Tuesday, 11am GDC main auditorium

‣ Includes discussion of work on neural SRL system

Relation Extraction

Tim Cook is the CEO of Apple.

Apple CEO Tim Cook said that…

Apple shares have taken a beating, much to the chagrin of its CEO, Tim Cook

Cook's tenure as CEO of Apple…

Wozniak's desire to be CEO…

‣ Extract entity-relation-entity triples from a fixed inventory

ACE (2003-2005)

During the war in Iraq, American journalists were sometimes caught in the line of fire

Relation labels shown on the example: Located_In, Nationality

‣ Use NER-like system to identify entity spans, classify relations between entity pairs with a classifier

‣ Systems can be feature-based or neural, look at surface words, dependency path features, semantic roles

‣ Problem: limited data for scaling to big ontologies

Distant Supervision

Mintz et al. (2009)

‣ Lots of relations in our knowledge base already (e.g., 23,000 film-director relations); use these to bootstrap more training data

‣ If two entities in a relation appear in the same sentence, assume the sentence expresses the relation

Director: [Steven Spielberg]'s film [Saving Private Ryan] is loosely based on the brothers' story

Director: Allison co-produced the Academy Award-winning [Saving Private Ryan], directed by [Steven Spielberg]

‣ Learn decently accurate classifiers for ~100 Freebase relations

‣ Could be used to crawl the web and expand our knowledge base
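The distant supervision heuristic itself fits in a few lines. This is a minimal sketch, assuming a knowledge base stored as a dict from entity pairs to relation names and plain substring matching to detect entity mentions; a real pipeline would use an NER or entity-linking step instead, and the third sentence is made up for the example.

```python
def distant_labels(kb, sentences):
    """Label every sentence mentioning both entities of a KB pair with that pair's relation.

    kb: dict mapping (entity1, entity2) -> relation name.
    Returns a list of (sentence, entity1, entity2, relation) training examples.
    """
    examples = []
    for sent in sentences:
        for (e1, e2), relation in kb.items():
            if e1 in sent and e2 in sent:
                examples.append((sent, e1, e2, relation))
    return examples


kb = {("Steven Spielberg", "Saving Private Ryan"): "Director"}
sentences = [
    "Steven Spielberg's film Saving Private Ryan is loosely based on the brothers' story",
    "Allison co-produced the Academy Award-winning Saving Private Ryan, directed by Steven Spielberg",
    "Steven Spielberg attended the premiere",  # only one entity, so no label
]
examples = distant_labels(kb, sentences)
for sent, e1, e2, relation in examples:
    print(relation, "|", sent)
```

The labels are noisy by construction: any sentence containing both names is labeled Director, whether or not it actually expresses that relation, so the classifier trained on them has to be robust to some wrong examples.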

Open IE

Open Information Extraction

‣ "Open"ness: want to be able to extract all kinds of information from open-domain text

‣ Typically no fixed relation inventory

‣ Acquire commonsense knowledge just from "reading" about it, but need to process lots of text ("machine reading")

TextRunner

Banko et al. (2007)

‣ Extract positive examples of (e, r, e) triples via parsing and heuristics

Barack Obama, 44th president of the United States, was born on August 4, 1961 in Honolulu

=> Barack_Obama, was born in, Honolulu

‣ Train a Naive Bayes classifier to filter triples from raw text: uses features on POS tags, lexical features, stopwords, etc.

‣ 80x faster than running a parser (which was slow in 2007…)

‣ Use multiple instances of extractions to assign probability to a relation

Exploiting Redundancy

Banko et al. (2007)

‣ 9M web pages / 133M sentences

‣ 2.2 tuples extracted per sentence, filter based on probabilities

‣ Concrete: definitely true. Abstract: possibly true but underspecified

‣ Hard to evaluate: can assess precision of extracted facts, but how do we know recall?
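To illustrate how repeated extractions of the same tuple can raise its probability, here is a sketch using a noisy-or combination. This is a stand-in chosen for simplicity and is not claimed to be the model used by Banko et al.; the per-extraction confidences and the second triple are made up for the example.

```python
from collections import defaultdict


def combine_extractions(extractions):
    """Noisy-or: P(triple) = 1 - product over its extractions of (1 - confidence).

    extractions: iterable of (triple, confidence) pairs, confidence in [0, 1].
    Returns a dict mapping each distinct triple to its combined probability.
    """
    miss = defaultdict(lambda: 1.0)  # probability that every extraction so far was wrong
    for triple, confidence in extractions:
        miss[triple] *= 1.0 - confidence
    return {triple: 1.0 - m for triple, m in miss.items()}


obama = ("Barack_Obama", "was born in", "Honolulu")
other = ("Honolulu", "is in", "Hawaii")  # illustrative second triple
extractions = [(obama, 0.5), (obama, 0.5), (other, 0.5)]
combined = combine_extractions(extractions)

threshold = 0.7  # assumed cutoff for "filter based on probabilities"
kept = [t for t, p in combined.items() if p >= threshold]
print(combined, kept)
```

A tuple seen twice at confidence 0.5 ends up more probable than one seen once, which is the intuition behind exploiting redundancy in a large web corpus.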

ReVerb

Fader et al. (2011)

‣ Extract more meaningful relations, particularly with light verbs

‣ More constraints: open relations have to begin with verb, end with preposition, be contiguous (e.g., was born on)

‣ For each verb, identify the longest sequence of words following the verb that satisfy a POS regex (V .* P) and which satisfy heuristic lexical constraints on specificity

‣ Find the nearest arguments on either side of the relation

‣ Annotators labeled relations in 500 documents to assess recall
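The POS-regex step can be sketched by mapping each tag to a one-letter class and running an ordinary regular expression over the resulting string. This is a simplified illustration, not ReVerb itself: the tag-to-class mapping, the restriction of the middle of the pattern to nouns, adjectives, adverbs and determiners, and the "nearest noun" argument rule are all assumptions, and the lexical specificity constraint is omitted.

```python
import re


def coarse(tag):
    """Map a Penn Treebank POS tag to a one-letter class (assumed mapping)."""
    if tag.startswith("VB"):
        return "V"
    if tag in ("IN", "TO", "RP"):
        return "P"
    if tag.startswith("NN") or tag == "PRP":
        return "N"
    if tag.startswith("JJ"):
        return "J"
    if tag.startswith("RB"):
        return "R"
    if tag == "DT":
        return "D"
    return "O"


# Begin with verb(s), optionally continue through N/J/R/D words, end with a preposition.
# Greedy matching returns the longest such span from each starting verb.
RELATION = re.compile(r"V+[NJRD]*P")


def extract(tagged):
    """tagged: list of (word, POS tag). Returns (arg1, relation phrase, arg2) triples."""
    words = [w for w, _ in tagged]
    classes = "".join(coarse(t) for _, t in tagged)
    triples = []
    for m in RELATION.finditer(classes):
        left = classes.rfind("N", 0, m.start())  # nearest noun to the left
        right = classes.find("N", m.end())       # nearest noun to the right
        if left == -1 or right == -1:
            continue
        triples.append((words[left], " ".join(words[m.start():m.end()]), words[right]))
    return triples


tagged = [("Obama", "NNP"), ("was", "VBD"), ("born", "VBN"),
          ("on", "IN"), ("August", "NNP"), ("4", "CD")]
triples = extract(tagged)
print(triples)
```

Because one character stands for one token, match offsets in the class string are also token offsets, which keeps the mapping back to words trivial.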

QA from Open IE

Choi et al. (2015)

Takeaways

‣ Relation extraction: can collect data with distant supervision, use this to expand knowledge bases

‣ Open IE: extracts lots of things, but hard to know how good or useful they are

‣ Can combine with standard question answering

‣ Add new facts to knowledge bases

‣ SRL/AMR: handle a bunch of phenomena, but more or less like syntax++ in terms of what they represent

‣ Many, many applications and techniques

Roadmap

‣ Classification: conventional and neural, word representations (3 weeks)

‣ Linear and neural classification

‣ How to build effective word vectors

‣ Text analysis: tagging, parsing, information extraction (3.5 weeks)

‣ Structured models for sequences, trees (HMMs, PCFGs), as well as unstructured approaches (transition-based parsing)

‣ Lots of NLP tasks can be formulated as tagging

Applications of Tagging

‣ Extract product occurrences in cybercrime forums, but not everything that looks like a product is a product

Figure annotation: Not a product in this context

Portnoff et al. (2017), Durrett et al. (2017)

Roadmap

‣ Generation, applications: language modeling, machine translation, dialogue (4 weeks)

‣ Other applications: question answering, TBD (3 weeks)

‣ Missing: structured neural models. These are a bit beyond this class but we'll see one way to do this after spring break