
Incremental - coli.uni-saarland.de


Page 1

Incremental Parsing with TAG
Miriam Käshammer
Course: Incremental Processing
Department of Computational Linguistics, Universität des Saarlandes
May 28, 2011

Page 2

Tree Adjoining Grammar (TAG)
- tree-rewriting formalism
- a set of (elementary) trees with two operations
[Tree diagrams: an auxiliary tree (VP (ADV often) VP*) and the initial trees (S NP↓ (VP (V laughs))) and (NP Peter)]

Page 3

Tree Adjoining Grammar: Operations
- Substitution: replacing a leaf with an initial tree
- Adjunction: replacing an internal node with an auxiliary tree
[Tree diagrams: substituting (NP Peter) into (S NP↓ (VP (V laughs))) and adjoining (VP (ADV often) VP*) yields (S (NP Peter) (VP (ADV often) (VP (V laughs))))]
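Since these two operations are the formal core of TAG, a toy implementation may help. The nested-list tree encoding and the "X!" (substitution node) and "X*" (foot node) markers below are assumptions of this sketch, not notation from the talk:

```python
# Illustrative sketch of TAG substitution and adjunction on bracketed trees.
# Trees are nested lists [label, child, ...]; "NP!" marks a substitution
# node and "VP*" a foot node. This encoding is an assumption of the sketch.

def substitute(tree, initial):
    """Replace a substitution node labelled like initial's root with initial."""
    if isinstance(tree, str):
        return tree
    if tree[0] == initial[0] + "!":          # e.g. "NP!" matches root "NP"
        return initial
    return [tree[0]] + [substitute(c, initial) for c in tree[1:]]

def adjoin(tree, auxiliary):
    """Replace an internal node with the auxiliary tree; the subtree rooted
    at that node moves to the auxiliary tree's foot node (label + '*')."""
    if isinstance(tree, str):
        return tree
    if tree[0] == auxiliary[0]:              # adjunction site found
        return _plug_foot(auxiliary, tree)
    return [tree[0]] + [adjoin(c, auxiliary) for c in tree[1:]]

def _plug_foot(aux, subtree):
    if isinstance(aux, str):
        return aux
    if aux[0] == subtree[0] + "*":           # foot node, e.g. "VP*"
        return subtree
    return [aux[0]] + [_plug_foot(c, subtree) for c in aux[1:]]

laughs = ["S", ["NP!"], ["VP", ["V", "laughs"]]]
peter = ["NP", "Peter"]
often = ["VP", ["ADV", "often"], ["VP*"]]

step1 = substitute(laughs, peter)
step2 = adjoin(step1, often)
print(step2)
# ['S', ['NP', 'Peter'], ['VP', ['ADV', 'often'], ['VP', ['V', 'laughs']]]]
```

Running the two steps in this order reproduces the derivation of "Peter often laughs" shown on the slide.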

Page 4

Tree Adjoining Grammar: Some linguistic principles
- Lexicalization: each elementary tree has at least one non-empty lexical item, its anchor. (LTAG)
- Predicate-argument co-occurrence: elementary trees of predicates contain slots for the arguments they subcategorize for.
[Tree diagrams: (NP Peter), (S NP↓ (VP (V laughs))), (VP (ADV often) VP*)]

Page 5

Why is TAG interesting?
Mildly context-sensitive formalism:
1. It generates (at least) all context-free languages.
2. It captures a limited amount of cross-serial dependencies, e.g. the copy language {ww | w ∈ {a, b}*}.
3. It can be parsed in polynomial time (O(n^6)).
4. It has the constant growth property.
⇒ appropriate to describe natural languages
Important characteristics:
1. Extended domain of locality: elementary trees can be arbitrarily large.
2. Factoring of recursion: the adjunction operation allows putting recursive structures into separate elementary trees.
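To make point 2 concrete: membership in the copy language is trivial to test directly, even though no context-free grammar generates the language (the helper below is illustrative, not from the talk):

```python
# The copy language {ww | w in {a, b}*} from point 2 above: a standard
# example of cross-serial dependencies that lie beyond CFG power.

def in_copy_language(s):
    """True iff s = ww for some w over {a, b}."""
    if len(s) % 2 or set(s) - {"a", "b"}:
        return False
    half = len(s) // 2
    return s[:half] == s[half:]

print(in_copy_language("abab"))  # True  (w = "ab")
print(in_copy_language("abba"))  # False
```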

Page 6

Why incremental parsing?
Psycholinguistic evidence:
- Humans build up a semantic representation before reaching the end of the sentence.
- Interpretation is based on fast, left-to-right construction of syntactic relations.
Boost in speed

Page 7

Incremental TAG parsing
- [Sturt and Lombardo, 2005] argue that models of human parsing incorporate an operation similar to adjunction in TAG.
- Traditional LTAG does not allow full connectedness.
[Example: "Peter often ..." with the trees (NP Peter) and (VP (ADV often) VP*), which cannot yet be connected]

Page 8

Where is incrementality encoded?
Components of a parser:
- a (competence) grammar
- a parsing strategy
- a memory organizing strategy
- an oracle

Page 9

Where is incrementality encoded?
Components of a parser:
- a (competence) grammar
⇒ a parsing strategy
- a memory organizing strategy
- an oracle

Page 10

Where is incrementality encoded?
Components of a parser:
⇒ a (competence) grammar
- a parsing strategy
- a memory organizing strategy
- an oracle

Page 11

Approaches
- LTAG-spinal: "incremental" parser
  Incremental LTAG Parsing
  L. Shen and A. K. Joshi (2005)
- DVTAG: strictly incremental parser
  Dynamic TAG and lexical dependencies
  A. Mazzei, V. Lombardo and P. Sturt (2007)

Page 12

Outline
1. Introduction
2. "Incremental" TAG
3. Strictly incremental TAG
4. Conclusion

Page 13

Outline
1. Introduction: TAG, Motivation, Overview
2. "Incremental" TAG: Grammar Formalism, Parsing Algorithm, Training, Evaluation
3. Strictly incremental TAG: Dynamics, Formalism, Wide-coverage Grammar
4. Conclusion

Page 14

LTAG-spinal
Variant of LTAG:
- An initial tree only contains its spine.
- An auxiliary tree only contains its spine and a foot node directly linked to the spine.
Definition: Spine
The spine of an elementary tree is the path from the root node to the anchor of the tree.

Page 15

LTAG-spinal formalism: Operations
- Adjunction (A): same as in LTAG.
- Attachment (T): attachment of an initial tree α to a node n of another tree α′: add the root of α to n as a new child.
- Conjunction (C): special operation to build coordination structures.

Page 16

LTAG-spinal formalism: Relation to LTAG
- LTAG-spinal is more powerful than CFG. [Shen and Joshi, 2005a]
- LTAG-spinal with attachment constraints is weakly equivalent to traditional LTAG. [Shen et al., 2007]

Page 17

LTAG-spinal formalism: Relation to LTAG
- LTAG-spinal is more powerful than CFG. [Shen and Joshi, 2005a]
- LTAG-spinal with attachment constraints is weakly equivalent to traditional LTAG. [Shen et al., 2007]
- LTAG-spinal trees generalize over predicates with different subcategorization frames.

Page 18

The Parsing Algorithm
Four types of parser operations:
- Attachment, adjunction, conjunction
- Generation: generate a possible spine for a given word according to the context and the lexicon (supertagging)
Variant of the shift-reduce algorithm, using a stack of disconnected treelets to represent the left context:
- Shift: read a word and generate a list of possible elementary trees for this word. For each elementary tree, push it onto the stack.
- Reduce: pop the top two treelets from the stack, combine them by attachment, adjunction or conjunction, and push the combined tree onto the stack.
Beam search to prune the search space
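The shift-reduce loop above can be sketched in a few lines. The supertagger, the combine step, and the treelet representation below are hypothetical stand-ins, and the real parser decides with a beam when to reduce; here we always reduce to keep the sketch short:

```python
# Minimal sketch of the shift-reduce loop described above. generate() and
# combine() are stand-ins for supertagging and for attachment/adjunction/
# conjunction; the real parser scores alternatives with beam search.

def generate(word):
    """Supertagging stand-in: propose candidate spines for a word."""
    return [("spine", word)]

def combine(left, right):
    """Stand-in for attachment / adjunction / conjunction of two treelets."""
    return ("tree", left, right)

def parse(sentence):
    stack = []  # stack of (possibly disconnected) treelets = left context
    for word in sentence.split():
        stack.append(generate(word)[0])       # shift
        while len(stack) >= 2:                # reduce (always, in this sketch)
            right = stack.pop()
            left = stack.pop()
            stack.append(combine(left, right))
    return stack

result = parse("Peter often laughs")
print(len(result))  # 1: all treelets combined into a single tree
```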

Page 19

Example
G: generate, T: attach, A: adjoin, C: conjoin
Graph taken from http://libinshen.net/Documents/ijc04_slides.ps


Page 30

Flex Model vs. Eager Model
Pseudo-ambiguity in the shift-reduce derivation: A adjoins to B, B adjoins to C.
((A → B) → C)
(A → (B → C))

Page 31

Flex Model vs. Eager Model
Pseudo-ambiguity in the shift-reduce derivation: A adjoins to B, B adjoins to C.
((A → B) → C)
(A → (B → C))
Flex Model: both derivations are allowed.
Eager Model: only ((A → B) → C) is allowed.

Page 32

Features and Learning
Features extracted from gold-standard parses have the following format:
(operation, main_spine, child_spine, spine_node, context)
- spine_node: node on the main_spine onto which the child_spine is attached/adjoined/conjoined
- context: dependent on the type of operation; includes amongst others the (0,2) window in the sentence
Weights for the features are learned using a perceptron-like algorithm as proposed in [Collins, 2002].
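The perceptron-style update can be sketched as follows. The feature tuples here are simplified illustrations, not the parser's actual feature format:

```python
# Sketch of a structured-perceptron update in the spirit of [Collins, 2002],
# applied to parser-operation features. The feature tuples are hypothetical
# stand-ins for the (operation, main_spine, child_spine, ...) format above.

def score(weights, features):
    return sum(weights.get(f, 0.0) for f in features)

def perceptron_update(weights, gold_features, predicted_features, lr=1.0):
    """Reward features of the gold derivation, penalize the predicted one."""
    for f in gold_features:
        weights[f] = weights.get(f, 0.0) + lr
    for f in predicted_features:
        weights[f] = weights.get(f, 0.0) - lr
    return weights

w = {}
gold = [("attach", "S-spine", "NP-spine")]
pred = [("adjoin", "S-spine", "NP-spine")]
perceptron_update(w, gold, pred)
print(score(w, gold) > score(w, pred))  # True: gold now scores higher
```

After one update, the gold derivation outscores the mistaken prediction, which is the mechanism the training loop iterates over the treebank.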

Page 33

Evaluation
- LTAG-spinal treebank (see [Shen and Joshi, 2005b])
- Training, development and test data from the WSJ
- Syntactic dependency for evaluation against the PTB

model  beam  sen/sec  f-score %
Flex   10    0.37     89.3
Eager  10    0.79     88.7

With an extension (Combined Parses) and beam = 100: 94.2%

Page 34

What we have seen so far
Parser for LTAG-spinal
Incremental?
- Input is processed incrementally, but only partially
- Structure is not fully connected (usage of a stack)
- Look-ahead of 2 words
- Implemented: efficient statistical parsing
- Generative power of grammar is stronger than CFG
Motivation?

Page 35

Outline
1. Introduction: TAG, Motivation, Overview
2. "Incremental" TAG: Grammar Formalism, Parsing Algorithm, Training, Evaluation
3. Strictly incremental TAG: Dynamics, Formalism, Wide-coverage Grammar
4. Conclusion

Page 36

Dynamics
Dynamic Grammar:
- The syntactic analysis is viewed as a dynamic process
  ⇒ a sequence of transitions between adjacent syntactic states S_{i-1} and S_i.
- A syntactic state contains all the syntactic information about the fragment already processed.

Page 37

Dynamics and Incrementality
Incremental processing:
- Each input word w_i defines a transition from S_{i-1} (left-context) to S_i.
- States as partial syntactic structures
Strong connectivity:
- Impose that transitions do not produce disconnected trees
Parsing strategy is part of the grammar
⇒ Incrementality-in-competence

Page 38

Dynamics in TAG - Intuition (I)
[Figure]

Page 39

Dynamics in TAG - Intuition (II)
At step i, the elementary tree anchored in w_i is combined with the partial structure spanning the words w_1 ... w_{i-1}.
⇒ Updated left-context spanning the words w_1 ... w_i


Page 42

DVTAG
Dynamic Version of TAG
Elementary trees similar to LTAG trees, BUT:
- Lexical items (but not the left-anchor) can be underspecified. The preterminal category is paired with a finite list of values.
  ⇒ predicted nodes to fulfill full connectivity
- Distinction between left/right auxiliary trees

Page 43

DVTAG: Head feature
- Feature that indicates the lexical head of each node in the elementary tree
- Needed to compute a dependency tree

Page 44

Some DVTAG terminology
Dotted tree: a pair 〈γ, i〉 where γ is a tree and i is an integer such that i ∈ 0 ... |YIELD(γ)|. Moreover, all the symbols on the yield that are to the left of the dot are terminal symbols.
Fringe of 〈γ, i〉: the set of nodes that are accessible for operations

Page 45

DVTAG Operations
Combination of a left-context 〈Λ, i〉 with an (unanalysed) elementary tree 〈γ, 0〉:
- 2 substitution operations
- 4 adjoining operations
- a shift operation

"Normal" operations:
- Substitution: Sub→
- Adjoining from the left: ∇→L
- Adjoining from the right: ∇→R
- Shift: Shi

Inverse operations:
- Inverse substitution: Sub←
- Inverse adjoining from the left: ∇←L
- Inverse adjoining from the right: ∇←R

Page 46

DVTAG Operations: Shift
Shi(〈γ, i〉) takes as input a single dotted tree 〈γ, i〉 and returns the dotted tree 〈γ, i+1〉. It can be applied only if a terminal symbol belongs to the fringe of 〈γ, i〉.
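The Shift operation can be sketched directly from the definition. The representation below is a deliberate simplification: a tree is abbreviated to its yield, and the fringe check is reduced to "is the next yield symbol a terminal?" (lowercase = terminal in this toy encoding):

```python
# Sketch of the dotted-tree Shift operation as defined above. The tree and
# fringe representations are simplified assumptions: gamma is given only by
# its yield, and lowercase symbols stand for terminals.

def shift(dotted):
    """Shi(<gamma, i>) -> <gamma, i+1>, applicable only if a terminal
    symbol is on the fringe (here: the next yield symbol is lowercase)."""
    gamma, i = dotted  # gamma abbreviated to its yield for this sketch
    if i < len(gamma) and gamma[i].islower():
        return (gamma, i + 1)
    raise ValueError("Shi not applicable: no terminal on the fringe")

tree = ["peter", "often", "laughs"]   # yield of a fully lexicalized tree
state = (tree, 0)
state = shift(state)                  # the dot moves past "peter"
print(state[1])                       # 1
```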

Page 47

DVTAG Operations: Substitution
Sub→(〈α, 0〉, 〈γ, i〉): if there is a substitution node N in the fringe of 〈γ, i〉 such that label(N) = label(root(α)), the operation returns a new dotted tree 〈δ, i+1〉 such that δ is obtained by grafting α into N.
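A grafting step in the spirit of Sub→ can be sketched on nested-list trees. As before, the [label, child, ...] encoding and the "X!" substitution-node marker are assumptions of the sketch, and the fringe restriction is simplified to "first matching substitution node":

```python
# Sketch of Sub->(<alpha,0>, <gamma,i>) as defined above, on nested-list
# trees ([label, child, ...], "X!" = substitution node). The proper fringe
# check is simplified to "first matching substitution node".

def graft(tree, alpha):
    """Return (new_tree, done): replace the first substitution node whose
    label matches alpha's root."""
    if isinstance(tree, str):
        return tree, False
    if tree[0] == alpha[0] + "!":
        return alpha, True
    children, done = [], False
    for c in tree[1:]:
        if done:
            children.append(c)
        else:
            c2, done = graft(c, alpha)
            children.append(c2)
    return [tree[0]] + children, done

def sub_right(alpha_dotted, gamma_dotted):
    """Sub->: graft alpha into a matching substitution node of gamma and
    advance the dot."""
    (alpha, _), (gamma, i) = alpha_dotted, gamma_dotted
    delta, done = graft(gamma, alpha)
    if not done:
        raise ValueError("no matching substitution node")
    return (delta, i + 1)

gamma = (["S", ["NP!"], ["VP", ["V", "laughs"]]], 0)
alpha = (["NP", "Peter"], 0)
delta, i = sub_right(alpha, gamma)
print(delta[1])  # ['NP', 'Peter']
print(i)         # 1
```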

Page 48

DVTAG Operations: Substitution and inverse substitution
[Figure]

Page 49

DVTAG Operations: Adjoining from the left
∇→L(〈β, 0〉, 〈γ, i〉, add), where β is a left auxiliary tree: if there is a non-terminal node N at position add in the fringe of 〈γ, i〉 such that label(N) = label(root(β)), the operation returns a new dotted tree 〈δ, i+1〉 such that δ is obtained by grafting β into N.

Page 50

Example derivation
[Figure]


Page 52

A wide-coverage DVTAG
Ways to build a grammar:
1. Manually write it (XTAG, FTAG)
2. Automatically extract it from treebanks
Anticipated problem: size of the grammar because of predicted nodes


Page 55

Grammar sizes in comparison

                     # of tree templates
XTAG                               1,200
DVTAG from XTAG                6,000,000
DVTAG from PTB                    12,000
LTAG-spinal                        1,200

Numbers are taken from [Mazzei, 2005] and [Shen et al., 2007], and rounded.

Page 56

Outline
1. Introduction: TAG, Motivation, Overview
2. "Incremental" TAG: Grammar Formalism, Parsing Algorithm, Training, Evaluation
3. Strictly incremental TAG: Dynamics, Formalism, Wide-coverage Grammar
4. Conclusion

Page 57

Conclusion
LTAG-spinal:
- Parsing strategy specifies the "incremental" nature.
- In fact, not very incremental (stack, look-ahead)
- Efficient, implemented parser available
DVTAG:
- The (competence) grammar determines the parsing strategy
- Natively fulfills a strict version of incrementality
- Resembles left-corner strategy (⇒ center embeddings)
- Grammars grow very large in size

Page 58

References I

Collins, M. (2002). Discriminative training methods for hidden Markov models: Theory and experiments with perceptron algorithms. EMNLP.

Mazzei, A. (2005). Formal and empirical issues of applying dynamics to Tree Adjoining Grammars. PhD thesis, Dipartimento di Informatica, Università degli Studi di Torino.

Mazzei, A., Lombardo, V., and Sturt, P. (2007). Dynamic TAG and lexical dependencies. Research on Language and Computation.

Shen, L. and Joshi, A. K. (2005a). Building an LTAG Treebank. Technical Report MS-CIS-05-15, CIS, University of Pennsylvania, Philadelphia, PA.

Page 59

References II

Shen, L. and Joshi, A. K. (2005b). Incremental LTAG Parsing. Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing.

Shen, L., Joshi, A. K., and Champollion, L. (2007). LTAG-spinal and the Treebank. Language Resources and Evaluation.

Sturt, P. and Lombardo, V. (2005). Processing coordinated structures: Incrementality and connectedness. Cognitive Science, 29.

Page 60

Discussion
- Why would we want incrementality in competence?
- For NLP applications the "incremental" parser might be enough.
- Can psycholinguistic findings/memory profiles be explained with these grammars?