Parsing with Lexical-Functional Grammars

Nikos Engonopoulos

Universität des Saarlandes

January 19, 2012

Outline

1 Introduction

2 LFG parsing using Discriminative Estimation
    Robust LFG parsing
    Discriminative Estimation

3 PCFG-based LFG approximations
    Automatic f-structure annotation
    Inducing LFG approximations
    Long-distance dependencies

4 Experimental Evaluation

5 Discussion

LFG basics

Lexical Functional Grammar:

lexical (not transformational): richly structured lexicon which encodes grammatical structure (e.g. verbal frames)

functional (not configurational): grammatical functions (e.g. subject, object) are explicit primitives

LFG distinguishes (minimally) two representations for a sentence:

c(onstituent)-structure: Phonological realization
    Represented as a tree structure
    Similar to CFG structures

f(unctional)-structure: Abstract functional organization: predicate-argument structures, functional relations
    Represented as an attribute-value matrix (AVM)
    Similar to dependency structures

LFG structure example
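
As an illustrative sketch (not the example originally shown on this slide), the two levels of representation can be encoded with plain Python data: a nested tuple for the c-structure tree and a nested dict for the f-structure AVM. The sentence, the feature names and the helper functions `leaves` and `embedded_functions` are assumptions chosen for illustration.

```python
# Illustrative encoding only; sentence and feature names are assumptions.

# c-structure: a phrase-structure tree as nested (label, children...) tuples.
c_structure = (
    "S",
    ("NP", ("N", "John")),
    ("VP", ("V", "saw"), ("NP", ("N", "Mary"))),
)

# f-structure: an attribute-value matrix (AVM) as a nested dict.
f_structure = {
    "PRED": "see<SUBJ, OBJ>",
    "TENSE": "past",
    "SUBJ": {"PRED": "John", "NUM": "sg", "PERS": 3},
    "OBJ": {"PRED": "Mary", "NUM": "sg", "PERS": 3},
}


def leaves(tree):
    """Return the words at the leaves of a c-structure tree, left to right."""
    if isinstance(tree, str):
        return [tree]
    words = []
    for child in tree[1:]:
        words.extend(leaves(child))
    return words


def embedded_functions(fs):
    """List the attributes whose values are themselves f-structures."""
    return sorted(attr for attr, value in fs.items() if isinstance(value, dict))


print(leaves(c_structure))
print(embedded_functions(f_structure))
```

The tree keeps word order and constituency, while the AVM abstracts away from both and records only grammatical functions and features, which is why f-structures are comparable to dependency structures.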

Parsing with LFG

Deep parsing with LFG (or HPSG) presents a number of challenges:

Annotation with LFG structures is very difficult and time-consuming

Very small LFG-annotated training corpora

Difficult to automatically induce stochastic LFG grammars directly using the same data-driven approaches as for PCFGs

Hand-crafted grammars usually have low coverage, and are extremely time-consuming and expensive to build

Two different approaches will be presented:

The PARC approach: Carefully building a hand-crafted LFG, then using stochastic disambiguation guided by treebank PSG annotations (linguistically motivated)

The DCU approach: Automatically extracting an approximation of an LFG using treebank PCFG and functional annotations, plus advanced heuristics for building missing information (data-driven)

LFG Parsing using Discriminative Estimation

Riezler et al. (2002) address the LFG parsing problem by using:

A hand-crafted LFG grammar (ParGram, 1999)

A widely used PSG-annotated corpus (WSJ), also containing some partial functional annotations (e.g. NP-SBJ, ADJP-PRD)

A deterministic LFG parser which produces packed representations of all possible parses for each input

A discriminative statistical estimation method to stochastically disambiguate between the possible parses

Robust, polynomial-time deterministic parsing with LFG

An LFG grammar with 314 rules is used, compiled into a collection of FSMs with ~8,700 states and ~20,000 transitions

Grammar modified to parse POS tags and labeled bracketings, instead of raw text

The XLE parser (Maxwell and Kaplan, 1993) is used to parse the partially labeled data from the WSJ, creating packed representations of all possible parses for each input

Robustness: When no parse is found, a partial one is assigned (FRAGMENT parse)

Efficiency: If a parsing time threshold is exceeded, a SKIMMED parse is produced (full or partial)

Conditional exponential model for stochastic disambiguation

The conditional probability of a parse $x$ given a sentence $y$ is $p_\lambda(x \mid y) = Z_\lambda(y)^{-1} e^{\lambda \cdot f(x)}$, where:

$f = (f_1, \ldots, f_n)$ is a vector of feature functions which map parses to numeric values

$\lambda = (\lambda_1, \ldots, \lambda_n)$ is a vector of log-parameters for these feature functions

$Z_\lambda(y) = \sum_{x \in X(y)} e^{\lambda \cdot f(x)}$ is a normalization constant

Feature functions can represent various complex properties of the parses
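
A minimal sketch of how the conditional exponential model assigns probabilities to the candidate parses of one sentence. The function name `parse_probabilities` and the toy feature vectors and weights are assumptions for illustration; a real system computes these quantities over packed parse representations rather than by enumerating parses.

```python
import math


def parse_probabilities(feature_vectors, weights):
    """Return p(x | y) for every candidate parse x of one sentence y.

    feature_vectors: one feature vector f(x) per candidate parse in X(y).
    weights: the parameter vector lambda.
    """
    scores = [sum(w * f for w, f in zip(weights, fx)) for fx in feature_vectors]
    # Subtract the maximum score before exponentiating for numerical
    # stability; the normalized probabilities are unchanged.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    z = sum(exps)  # normalization constant Z(y), up to the factor exp(top)
    return [e / z for e in exps]


# Hypothetical toy data: three candidate parses with two features each.
candidates = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
lam = [0.5, -0.5]
probs = parse_probabilities(candidates, lam)
print(probs)
```

Disambiguation then amounts to picking the parse with the highest probability, which here is the first candidate because it has the largest score under the weights.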

Feature functions

Around 1000 feature functions are used, for example:

c-structure nodes and subtrees

number of recursively embedded phrases (to account for high-vs-low attachment)

values of specific f-structure attributes

presence of particular attribute-value pairs

branching behaviour of c-structures

properties based on soft clustering of grammatical relations encoded as lexicalized head-argument-relation tuples

...

Discriminative Estimation method

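Goal: maximize the conditional likelihood of the correct parse given the training examples, i.e. minimize the regularized negative log-likelihood

$$P(\lambda) = -L(\lambda) - G(\lambda) = -\log \prod_{j=1}^{m} p_\lambda(z_j \mid y_j) + \sum_{i=1}^{n} \frac{\lambda_i^2}{2\sigma_i^2} \qquad (1)$$

In this case: no gold-standard LFG parse

Instead, consider for each training sentence $y$ the set of parses $X(y, z)$ consistent with the partial treebank annotation $z$:

$$P(\lambda) = -\sum_{j=1}^{m} \log \frac{\sum_{x \in X(y_j, z_j)} e^{\lambda \cdot f(x)}}{\sum_{x \in X(y_j)} e^{\lambda \cdot f(x)}} + \sum_{i=1}^{n} \frac{\lambda_i^2}{2\sigma_i^2} \qquad (2)$$

$$= -\sum_{j=1}^{m} \log \sum_{x \in X(y_j, z_j)} e^{\lambda \cdot f(x)} + \sum_{j=1}^{m} \log \sum_{x \in X(y_j)} e^{\lambda \cdot f(x)} + \sum_{i=1}^{n} \frac{\lambda_i^2}{2\sigma_i^2} \qquad (3)$$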
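
The objective in equation (3) can be sketched directly in code. The version below enumerates parses explicitly, which is only feasible for toy inputs; the function names, the data layout (pairs of all parses and annotation-consistent parses) and the toy values are assumptions for illustration.

```python
import math


def log_sum_exp(scores):
    """Numerically stable log(sum(exp(s))) over a non-empty list of scores."""
    top = max(scores)
    return top + math.log(sum(math.exp(s - top) for s in scores))


def score(weights, fx):
    """Dot product lambda . f(x)."""
    return sum(w * f for w, f in zip(weights, fx))


def objective(weights, data, sigma2):
    """Penalized negative conditional log-likelihood, following equation (3).

    data: list of (all_parses, consistent_parses) pairs, one per sentence;
          each parse is a feature vector, and consistent_parses is the
          subset X(y, z) compatible with the partial annotation z.
    sigma2: per-feature variances of the Gaussian prior.
    """
    total = 0.0
    for all_parses, consistent in data:
        total -= log_sum_exp([score(weights, fx) for fx in consistent])
        total += log_sum_exp([score(weights, fx) for fx in all_parses])
    total += sum(w * w / (2.0 * s2) for w, s2 in zip(weights, sigma2))
    return total


# Hypothetical toy data: one sentence, three parses, two of them consistent.
all_parses = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
consistent = [[1.0, 0.0], [1.0, 1.0]]
data = [(all_parses, consistent)]

at_zero = objective([0.0, 0.0], data, [100.0, 100.0])
better = objective([1.0, 0.0], data, [100.0, 100.0])
print(at_zero, better)
```

With all weights at zero every parse is equally likely, so the objective is log(3/2). Raising the weight of the feature shared by the consistent parses lowers the objective, which is the direction a gradient-based optimizer would move in.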

Data-driven, PCFG-based LFG approximations

Problem: Building a broad-coverage, large-scale LFG (or HPSG) grammar is time-consuming and expensive

However, large PCFG-annotated corpora exist (e.g. the Penn-II treebank), with some functional, trace and head annotations

The DCU approach:

Automatically generate PCFG-based approximations of LFG grammars, using a shallow f-structure annotation algorithm

Automatically extract semantic subcategorization frames and finite approximations of Functional Uncertainty equations from the training data

Resolve long-distance dependencies using the induced f-structures, frames and FU equations

Automatic f-structure annotation algorithm

An algorithm exploiting Penn-II annotations to produce f-structures.

1 The Penn-II trees are automatically head-lexicalized

2 All local subtrees are partitioned into left and right context (relative to the head)

3 A pipeline of 4 modules is applied to annotate the trees with f-equations

4 A constraint solver produces full f-structures

The algorithm has 99.82% coverage and 96.57% F-1 on Penn-II treebank data.
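
Step 2 and the first part of step 3 can be sketched as a table lookup keyed on the mother category, the side relative to the head, and the daughter category. The table entries below are hypothetical, not the published annotation matrices, and the equations use an ASCII notation in which `^` stands for the up arrow and `!` for the down arrow.

```python
# Hypothetical annotation table: (mother, side, daughter) -> f-equation.
# "left"/"right" are relative to the head daughter. These entries are
# illustrative assumptions only.
ANNOTATIONS = {
    ("S", "left", "NP"): "^SUBJ=!",
    ("VP", "right", "NP"): "^OBJ=!",
    ("VP", "right", "PP"): "! in ^ADJUNCT",
    ("NP", "left", "DT"): "^SPEC=!",
}
HEAD_EQUATION = "^=!"


def annotate_local_tree(mother, daughters, head_index):
    """Annotate the daughters of one local subtree with f-equations.

    The daughters are partitioned into left context, head and right context.
    Daughters without a matching table entry get None, standing in for
    cases that later modules of a pipeline would have to handle.
    """
    annotated = []
    for i, cat in enumerate(daughters):
        if i == head_index:
            eq = HEAD_EQUATION
        else:
            side = "left" if i < head_index else "right"
            eq = ANNOTATIONS.get((mother, side, cat))
        annotated.append((cat, eq))
    return annotated


print(annotate_local_tree("S", ["NP", "VP"], head_index=1))
print(annotate_local_tree("VP", ["VBD", "NP", "PP"], head_index=0))
```

Keying the table on position relative to the head is what lets the same category receive different functions, for example an NP to the left of the head under S versus an NP to the right of the head under VP.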

Automatic extraction of LFG approximations using PCFG annotations

Two alternative architectures used:

1 The pipeline model
    extract a PCFG from a treebank
    parse unseen data
    generate f-structures using the f-structure annotation algorithm

2 The integrated model
    annotate treebank with f-structure equations
    extract a PCFG from the augmented treebank annotations
    parse unseen data using the extended PCFG
    resolve f-structure equations into f-structures using a constraint solver
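
The PCFG extraction step shared by both architectures can be sketched as relative-frequency estimation over treebank productions. In the toy treebank below each category label carries its f-equation and is treated as a single atomic symbol, which is one way to read the integrated model; the tree encoding, the label format and the function names are assumptions for illustration.

```python
from collections import Counter, defaultdict


def rules(tree):
    """Yield (lhs, rhs) productions from a nested (label, children...) tuple.

    Assumes children are either all words (strings) or all subtrees.
    """
    label, *children = tree
    if all(isinstance(c, str) for c in children):
        yield label, tuple(children)  # lexical rule
        return
    yield label, tuple(c[0] for c in children)
    for child in children:
        yield from rules(child)


def extract_pcfg(treebank):
    """Relative-frequency (maximum likelihood) estimate of rule probabilities."""
    rule_counts = Counter(r for tree in treebank for r in rules(tree))
    lhs_counts = defaultdict(int)
    for (lhs, _), n in rule_counts.items():
        lhs_counts[lhs] += n
    return {r: n / lhs_counts[r[0]] for r, n in rule_counts.items()}


# Hypothetical two-tree treebank with f-equations folded into the labels
# ("^" = up arrow, "!" = down arrow).
treebank = [
    ("S[^=!]", ("NP[^SUBJ=!]", "John"), ("VP[^=!]", ("V[^=!]", "sleeps"))),
    ("S[^=!]", ("NP[^SUBJ=!]", "Mary"),
     ("VP[^=!]", ("V[^=!]", "saw"), ("NP[^OBJ=!]", "John"))),
]
pcfg = extract_pcfg(treebank)
for rule, prob in sorted(pcfg.items()):
    print(rule, prob)
```

Because `NP[^SUBJ=!]` and `NP[^OBJ=!]` are distinct symbols, the extracted grammar keeps separate statistics for subject and object noun phrases, at the cost of a larger and sparser rule set than a PCFG over bare categories.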

Long-distance dependencies

Problem: Resulting f-structures are still “shallow”: they don’t account for long-distance dependencies

Solution: Resolve LDDs by using Functional Uncertainty (FU) equations and subcategorization frames

LDD resolution with extracted frames and FU equations

1 Subcategorization frames

2 Functional uncertainty (FU) equations

3 Instead of being hand-coded, they are learned from the training data and assigned probabilities

4 The LDD resolution algorithm then recursively traverses the f-structure looking for paths that match FU equations, while satisfying subcategorization constraints
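
A minimal sketch of the resolution step, assuming the FU equation has been approximated by a finite set of paths with probabilities and each predicate has a set of frames with probabilities. Candidate sites are scored by the product of the two probabilities; this scoring, the data layout, the `GOVERNABLE` set and all toy values are assumptions for illustration rather than the exact published algorithm.

```python
GOVERNABLE = {"SUBJ", "OBJ", "COMP"}  # assumed set of subcategorizable functions


def resolve_ldd(fs, paths, frames, source="TOPIC"):
    """Return (score, path) for the best resolution site of fs[source], or None.

    paths:  {path tuple: probability}, a finite approximation of an FU equation.
    frames: {pred: {frozenset of functions: probability}}.
    """
    best = None
    for path, p_path in paths.items():
        *prefix, target = path
        node = fs
        for attr in prefix:  # walk down the path, e.g. through COMP
            node = node.get(attr)
            if not isinstance(node, dict):
                break
        else:
            if target in node:
                continue  # the function is already filled locally
            present = {a for a in node if a in GOVERNABLE}
            for frame, p_frame in frames.get(node.get("PRED"), {}).items():
                # The frame must be completed exactly by adding the target.
                if frame == present | {target}:
                    cand = (p_path * p_frame, path)
                    if best is None or cand[0] > best[0]:
                        best = cand
    return best


def apply_resolution(fs, path, source="TOPIC"):
    """Make the resolved function share its value with fs[source]."""
    *prefix, target = path
    node = fs
    for attr in prefix:
        node = node[attr]
    node[target] = fs[source]  # reentrancy: same object, not a copy


# Hypothetical f-structure with an unresolved TOPIC.
fs = {
    "PRED": "say",
    "TOPIC": {"PRED": "Mary"},
    "SUBJ": {"PRED": "John"},
    "COMP": {"PRED": "see", "SUBJ": {"PRED": "Bill"}},
}
paths = {("OBJ",): 0.6, ("COMP", "OBJ"): 0.4}
frames = {
    "say": {frozenset({"SUBJ", "COMP"}): 0.9, frozenset({"SUBJ", "OBJ"}): 0.1},
    "see": {frozenset({"SUBJ", "OBJ"}): 0.8, frozenset({"SUBJ"}): 0.2},
}

best = resolve_ldd(fs, paths, frames)
if best is not None:
    apply_resolution(fs, best[1])
print(best)
```

In this toy case the shorter path is more probable on its own, but no frame of the matrix predicate licenses an additional OBJ, so the subcategorization constraint pushes the resolution into the embedded clause.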

Experimental Evaluation: PARC approach

Two evaluation methods for produced f-structures:

1 LFG f-relations

2 Dependency relations (to enable comparison with other formalisms)
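
Both evaluation methods compare sets of relations, so precision, recall and F-1 can be computed over relation triples. The sketch below assumes a (relation, head, dependent) triple format and uses made-up triples; neither is taken from the PARC 700 data.

```python
def prf(gold, predicted):
    """Precision, recall and F-1 over sets of relation triples."""
    gold, predicted = set(gold), set(predicted)
    correct = len(gold & predicted)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if correct else 0.0
    return precision, recall, f1


# Hypothetical triples: the parser gets the object wrong.
gold = {("subj", "see", "John"), ("obj", "see", "Mary"), ("tense", "see", "past")}
pred = {("subj", "see", "John"), ("obj", "see", "Bill"), ("tense", "see", "past")}
precision, recall, f1 = prf(gold, pred)
print(precision, recall, f1)
```

Scoring at the level of individual relations gives partial credit to a parse that is mostly right, which a whole-structure exact-match criterion would not.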

Results on the PARC 700 dataset (700 randomly selected sentences from section 23 of WSJ):

Later results with a better disambiguation technique (Kaplan et al., 2004):

Experimental Evaluation: DCU approach

On the same dataset (PARC 700), the DCU approach achieves better F-1 than the PARC approach.

Discussion

LFG parsing is challenging due to costly grammar engineering and annotation

Two approaches were presented:

1 Linguistically motivated (PARC): hand-crafted LFG using stochastic disambiguation guided by existing treebank annotations
    more robust and consistent
    no annotation effort
    huge engineering effort, over-reliance on a single person’s expertise

2 Data-driven (DCU): automatically induced LFG approximation from existing treebank annotations
    no grammar engineering or annotation effort
    more fragmented grammar, no linguistic motivation

Both approaches achieve similar evaluation results

References

Aoife Cahill, Michael Burke, Ruth O’Donovan, Josef van Genabith, and Andy Way, Long-distance dependency resolution in automatically acquired wide-coverage PCFG-based LFG approximations, Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL-04), 2004, July 21–26, Barcelona, Spain, pp. 320–327.

R.M. Kaplan, S. Riezler, T.H. King, J.T. Maxwell, A. Vasserman, and R. Crouch, Speed and accuracy in shallow and deep stochastic parsing, Proceedings of HLT-NAACL, vol. 4, 2004, pp. 97–104.

Stefan Riezler, Tracy H. King, Ronald M. Kaplan, Richard Crouch, John T. Maxwell, and Mark Johnson, Parsing the Wall Street Journal using a Lexical-Functional Grammar and discriminative estimation techniques, Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL’02) (Philadelphia), 2002.