43
Parsing with unification Frederik Fouvry Department of Computational Linguistics and Phonetics Saarland University Introduction to Computational Linguistics Frederik Fouvry Parsing with unification

Parsing with unification

  • Upload
    others

  • View
    4

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Parsing with unification

Parsing with unification

Frederik Fouvry

Department of Computational Linguistics and PhoneticsSaarland University

Introduction to Computational Linguistics

Frederik Fouvry Parsing with unification

Page 2: Parsing with unification

Outline

1 Motivation2 Unification3 Other issues4 References

Frederik Fouvry Parsing with unification

Page 3: Parsing with unification

Motivation

Insufficiency of CFGs

Atomic categories:No relation between the categories in a CFG:e.g. NP, N, N′, VP, VP_3sg, NsgHard to express generalisations in the grammar:for every rule that operates on a number of differentcategories, the rule specification has to be repeated

Frederik Fouvry Parsing with unification

Page 4: Parsing with unification

Motivation

An example

NP → Det NNPsg → Detsg NsgNPpl → Detpl NplCan we throw away the first instance of the rule?No: sheep is underspecified, just like the, . . .We need to add the cross-product:NPsg → Detsg NNPpl → Detpl NNPsg → Det NsgNPpl → Det Npl

Frederik Fouvry Parsing with unification

Page 5: Parsing with unification

Motivation

An example

Alternatively, words like sheep and the could be associatedwith several lexical entries.→ only reduces the number of rules somewhat→ increases the lexical ambiguity considerably

Frederik Fouvry Parsing with unification

Page 6: Parsing with unification

Motivation

More problems

The grammar cannot rule out yet: Those sheep runs→ subject-verb agreement is not encoded yetSubcategorisation frames in their different stages ofsaturation are to be done as well.However: the expansion could be done automatically fromfeature structure descriptions: e.g.

CATEGORY nounSUBCAT 〈〉NUMBER singPERSON 3

→ NP_3sg

Frederik Fouvry Parsing with unification

Page 7: Parsing with unification

Motivation

More problems

The grammar cannot rule out yet: Those sheep runs→ subject-verb agreement is not encoded yetSubcategorisation frames in their different stages ofsaturation are to be done as well.However: the expansion could be done automatically fromfeature structure descriptions: e.g.

CATEGORY nounSUBCAT 〈〉NUMBER singPERSON 3

→ NP_3sg

Frederik Fouvry Parsing with unification

Page 8: Parsing with unification

Motivation

More problems

The formalism does not leave any room for generalisationslike the following:

“All verbs have to agree in number and person with theirsubject.”S → NP_(*) VP_(*) \1 = \2“In a headed phrase, the head daughter has the samecategory as the mother.”XP → Y X

Feature structures can do that.When a feature structure stands for an infinite set ofcategories, the grammar cannot be compiled out into aCFG.

Frederik Fouvry Parsing with unification

Page 9: Parsing with unification

DefinitionsParsing

Efficiency techniques

Part II

Definitions

Frederik Fouvry Parsing with unification

Page 10: Parsing with unification

DefinitionsParsing

Efficiency techniques

Outline

2 DefinitionsWhat is a feature structure?What is unification?

3 Parsing

4 Efficiency techniques

Frederik Fouvry Parsing with unification

Page 11: Parsing with unification

DefinitionsParsing

Efficiency techniques

What is a feature structure?What is unification?

Outline

2 DefinitionsWhat is a feature structure?What is unification?

3 Parsing

4 Efficiency techniques

Frederik Fouvry Parsing with unification

Page 12: Parsing with unification

DefinitionsParsing

Efficiency techniques

What is a feature structure?What is unification?

Definition

A feature structure is a directed graph, consisting of nodes andlabelled edges. One node is special: the root node, from whichevery node can be reached by following edges.A feature structure is a tuple 〈Q, q, δ〉:

Q is a finite set of nodes, rooted at qq ∈ Q is the root nodeδ : Feat×Q → Q: a partial feature value function

Frederik Fouvry Parsing with unification

Page 13: Parsing with unification

DefinitionsParsing

Efficiency techniques

What is a feature structure?What is unification?

Notation

As a graph

As an AVM

F | H 1

G

[I 1

J 3

]

Frederik Fouvry Parsing with unification

Page 14: Parsing with unification

DefinitionsParsing

Efficiency techniques

What is a feature structure?What is unification?

Outline

2 DefinitionsWhat is a feature structure?What is unification?

3 Parsing

4 Efficiency techniques

Frederik Fouvry Parsing with unification

Page 15: Parsing with unification

DefinitionsParsing

Efficiency techniques

What is a feature structure?What is unification?

Subsumption

An order relation between elements of a set:v: P × P 〈P,v〉It is an information ordering:a subsumes b iff a contains less information than b,alternatively iff a is more general than b.Special cases

There may be elements a, b such that a 6v b and b 6v a(incomparable)Each element subsumes itselfa v b ∧ b v a ⇔ a = bIn an anti-chain, no two elements are comparable

Frederik Fouvry Parsing with unification

Page 16: Parsing with unification

DefinitionsParsing

Efficiency techniques

What is a feature structure?What is unification?

Unification is the operation of merging information-bearingstructures, without loss of information if the unificands areconsistent (monotonicity).

Frederik Fouvry Parsing with unification

Page 17: Parsing with unification

DefinitionsParsing

Efficiency techniques

What is a feature structure?What is unification?

Feature structure unification

Here, v is a relation in the set of feature structuresFeature structure unification (t) is the operation ofcombining two feature structures so that the result is themost general feature structure that is subsumed by the twounificands (the least upper bound). If there is no suchstructure, then the unification fails.Two feature structures that can be unified are compatible(or consistent). Comparability entails compatibility, but notthe other way round.There is untyped feature structure unification and typedfeature structure unification.

Frederik Fouvry Parsing with unification

Page 18: Parsing with unification

DefinitionsParsing

Efficiency techniques

What is a feature structure?What is unification?

Untyped feature structure unification

Token-identity: two feature structures are token-identical iffthey are the same object.Consistent/compatible: two feature structures areconsistent if they

have the same value,the values of their common features are consistent.

Frederik Fouvry Parsing with unification

Page 19: Parsing with unification

DefinitionsParsing

Efficiency techniques

What is a feature structure?What is unification?

Untyped unification: examples

See also Shieber (1986)[CATEGORY noun

]t

[NUMBER singular

]=

[CATEGORY nounNUMBER singular

][

CAT []]t

[CAT | CASE accusative

]=

[CAT | CASE accusative

][

F 1

H 1

]t

[F []

H | G []

]=

F 1[

G []]

H 1

[

CATEGORY noun]t

[CATEGORY verb

]= fail

Frederik Fouvry Parsing with unification

Page 20: Parsing with unification

DefinitionsParsing

Efficiency techniques

What is a feature structure?What is unification?

Untyped unification: examples

AGR 1[

NUM sg]

SUBJ[

AGR: 1]

t[

SUBJ

[AGR

[PERS third

]]]=

AGR 1

[NUM sgPERS third

]

SUBJ[

AGR 1]

Frederik Fouvry Parsing with unification

Page 21: Parsing with unification

DefinitionsParsing

Efficiency techniques

What is a feature structure?What is unification?

Destructive and non-destructive unification

In implementations, there are two ways to perform unification:

Destructive unification: in the process of unifying twostructures, one is modified and will contain the resultNon-destructive unification: the unificands are notchanged, and the result is a totally new structure.

The former is faster, but gives undesirable effects in somecases. For instance, when you apply a grammar rule, you donot want the rule to be different after the application.Non-destructive unification is easier to keep track of, butrequires copying. Because it does not change the featurestructures, the latter is used in implementations.

Frederik Fouvry Parsing with unification

Page 22: Parsing with unification

DefinitionsParsing

Efficiency techniques

What is a feature structure?What is unification?

Typed unification

Type-identity: two object are type-identical iff they are ofthe same type.Consistent: two feature structures are consistent if

their type values are consistenttheir features have consistent values.

Frederik Fouvry Parsing with unification

Page 23: Parsing with unification

DefinitionsParsing

Efficiency techniques

What is a feature structure?What is unification?

Type hierarchies

A type hierarchy is a partially ordered set 〈Type,v〉Often type hierarchies have to obey the bounded completepartial order requirement:“For every set of elements with an upper bound, there is aleast upper bound.”It ensures that every unification is uniqueEvery feature structure node q has a typed value: θ(q)

In a type hierarchy, the more specific types inherit allproperties from their supertypes. It is not possible toremove a property.

Frederik Fouvry Parsing with unification

Page 24: Parsing with unification

DefinitionsParsing

Efficiency techniques

What is a feature structure?What is unification?

Typed feature structures

A typed feature structure is a tuple 〈Q, q, δ, θ〉:Q is a finite set of nodes, rooted at qq ∈ Q is the root nodeδ : Feat×Q → Q: a partial feature value functionθ : Q → Type: a total type assignment function

Typed feature structures stand in a subsumption hierarchy,the shape of which is determined by the type hierarchy andfeature reentrancies. Even though the type hierarchy isfinite, the feature structure hierarchy is not necessarilyfinite.It may not be immediately clear a reentrancy containsmore information than a structure without. After all: thelatter structure has more nodes. A reentrancy adds theknowledge that two things do not only look the same, theyare the same.

Frederik Fouvry Parsing with unification

Page 25: Parsing with unification

DefinitionsParsing

Efficiency techniques

What is a feature structure?What is unification?

Typed feature structure unification

Let F , F ′ ∈ F and F = 〈Q, q, θ, δ〉, F ′ = 〈Q′, q′, θ′, δ′〉. It isrequired that Q ∩Q′ = ∅. A least equivalence relation ./ isdefined on Q ∪Q′ such that

q ./ q′

δ(f , q) ./ δ(f , q′) if both are defined and q ./ q′

Then F t F ′ = 〈(Q ∪Q′)/./, [q]./, θ./, δ./〉

withθ./([q]./) =

⊔{(θ ∪ θ′)(q′)|q ./ q′}

δ./(f , [q]./) ={[(δ ∪ δ′)(f , q)]./ if (δ ∪ δ′)(f , q) is definedundefined otherwise

if all joins in θ./ exist. It is undefined otherwise.(Carpenter, 1992)

Frederik Fouvry Parsing with unification

Page 26: Parsing with unification

DefinitionsParsing

Efficiency techniques

What is a feature structure?What is unification?

[F 1

G 1

]t

[F aG b

]=

[F 1 a/bG 1

]

Frederik Fouvry Parsing with unification

Page 27: Parsing with unification

DefinitionsParsing

Efficiency techniques

What is a feature structure?What is unification?

Features

In an untyped framework, feature may be added anytimeanywhere: there are no restrictions.In typed feature structures, the occurrence of features islimited by the type hierarchy:

Each feature is introduced on a unique, most general typeOnly that type and its subtypes can carry that featureEach feature is introduced with a value, and all valid valueshave to be subsumed by this value.

These requirements ensure monotonicity in featurestructure unification

Frederik Fouvry Parsing with unification

Page 28: Parsing with unification

DefinitionsParsing

Efficiency techniques

Parsing with unification-based grammars

In most implementations, the rules have a context-freebackbone, but feature structures in the categories.Information can be shared between the categories in therule.[

CATEGORY nounSUBCAT 〈〉

]→ 1

[CATEGORY det

]CATEGORY noun

SUBCAT⟨

1⟩

Sometimes the rules are written in a CFG-like format,sometimes feature structures whereby a feature identifiesthe daughters.

Frederik Fouvry Parsing with unification

Page 29: Parsing with unification

DefinitionsParsing

Efficiency techniques

Parsing

Is there any difference in parsing?No. All known techniques can be used, and you will obtaina working parser, provided that you use non-destructiveunification.But it will be (much) slower: the categories are muchbigger, and the unification is non-destructive. A lot ofcopying is done.

Frederik Fouvry Parsing with unification

Page 30: Parsing with unification

DefinitionsParsing

Efficiency techniques

Techniques to improve efficiency

Packing (subsumption packing)Rule filter: not all rules can feed into all other rulesQuick check: some paths are more likely to fail than othersSharing and deleting of daughters: do not keep informationthat can easily be (re)computed or retrievedDelayed copying (Tomabechi): only copy when you aresure that it will be used

Frederik Fouvry Parsing with unification

Page 31: Parsing with unification

DefinitionsParsing

Efficiency techniques

Subsumption packing

With CFGs and chart parsing, every category is only storedonce for a given pair of indices to avoid recomputation.The criterion is a simple identity/equality check.Suppose we have (among others) the following featurestructure in the chart:CAT noun

AGR[

PER 3]

Frederik Fouvry Parsing with unification

Page 32: Parsing with unification

DefinitionsParsing

Efficiency techniques

Subsumption packing

After a rule application, we want to add one of the followingfeature structures:

1

CAT noun

AGR[

PER 1]

2

CAT noun

AGR

[PER 3NUM sg

]

3

CAT noun

AGR[

NUM sg]

4

[CAT nounAGR []

]

Frederik Fouvry Parsing with unification

Page 33: Parsing with unification

DefinitionsParsing

Efficiency techniques

Subsumption packing

Which one the two should we take?all: too many solutions (spurious ambiguity)the first, most recent, . . . : may give over/undergeneration

e.g. with (4) a solution with

CAT noun

AGR[

PER 1]is also possible,

although that does not correspond with the original situationin general: when the newer category is more specific, usingit may invalidate older analyses (which were based on amore general feature structure; see (2)), and vice versa

Frederik Fouvry Parsing with unification

Page 34: Parsing with unification

DefinitionsParsing

Efficiency techniques

Subsumption packing

In CFGs with atomic catgories, we use an equality checkWith feature structures, we want to be able to useunification (it is the operation we use in rule applications),but unification should not be used to perform the check.A subsumption check will tell us what is the most generalfeature structure, and that one should be stored in thechart:

if new v old, then the set of solutions from new will be asuperset of the set of solutions from old, so replace old bynew.if old v new, then new should be discarded (it is alreadyimplied by old)otherwise, add new.

In this way, no solutions are invalidated.

Frederik Fouvry Parsing with unification

Page 35: Parsing with unification

Statistical processingDefault unification

Part III

Other issues

Frederik Fouvry Parsing with unification

Page 36: Parsing with unification

Statistical processingDefault unification

Statistical processing with feature structures

Applying statistical techniques to feature structures is veryhard, mainly because of the presence of reentrancies(Abney, 1997, See e.g.).Very often the following technique is applied: simplify thefeature structure, even to the type of the root node only.That way, the categories can be made sufficiently simple.Examples: Bouma et al. (2001); Toutanova et al. (2002)

Frederik Fouvry Parsing with unification

Page 37: Parsing with unification

Statistical processingDefault unification

Default unification

Credulous default unification: the default FS adds as muchinformation as possible that is not conflicting with the strictFS. It is non-deterministic.Sceptical default unification: the default FS adds theinformation that is common between each variant ofcredulous default unification.(Carpenter, 1993)Sensitive to order of processingPersistent associative default unification (Lascarides et al.,1996)Mainly used for lexical specification

Frederik Fouvry Parsing with unification

Page 38: Parsing with unification

Statistical processingDefault unification

Credulous default unification

F<tc G = {F tG′|G′ v

G is maximal such that F tG′ is defined}

[F a

] <tc

F 1 bG 1

H c

= {

F aG bH c

,F 1 a

G 1

H c

}

Frederik Fouvry Parsing with unification

Page 39: Parsing with unification

Statistical processingDefault unification

Sceptical default unification

F<ts G = u(F

<tc G)

[F a

] <ts

F 1 bG 1

H c

= u{

F aG bH c

,F 1 a

G 1

H c

} =

F aG ⊥H c

Frederik Fouvry Parsing with unification

Page 40: Parsing with unification

Statistical processingDefault unification

Desirable properties of default unification

Always well-definedAll strict information is preservedIf F and G are consistent, it should give the same result asstrict unificationIt is finite

Frederik Fouvry Parsing with unification

Page 41: Parsing with unification

ReferencesReferences

Part IV

References

Frederik Fouvry Parsing with unification

Page 42: Parsing with unification

ReferencesReferences

References

(Abney, 1997) Steven P. Abney. Stochastic attribute-value grammars.Computational Linguistics, 23(4):597–618, December 1997.

(Bouma et al., 2001) Gosse Bouma, Gertjan van Noord, and Robert Malouf.Alpino: Wide coverage computational analysis of Dutch. In WalterDaelemans, Khalil Sima’an, Jorn Veenstra, and Jakub Zavrel, editors,Computational Linguistics in the Netherlands 2000. Selected Papers fromthe Eleventh CLIN Meeting, number 37 in Language and Computers:Studies in Practical Linguistics, pages 45–59, Amsterdam/New York, NY,2001. Selection of the papers presented at CLIN ’00 in Tilburg.

(Carpenter, 1992) Bob Carpenter. The logic of typed feature structures:With applications to unification grammars, logic programs and constraintresolution. Number 32 in Cambridge Tracts in Computer Science.Cambridge–New York–Melbourne, 1992.

(Carpenter, 1993) Bob Carpenter. Skeptical and credulous defaultunification with application to templates and inheritance. In Ted Briscoe,Ann Copestake, and Valeria de Paiva, editors, Inheritance, defaults andthe lexicon, Studies in Natural Language Processing, pages 13–37.Cambridge, 1993.

Frederik Fouvry Parsing with unification

Page 43: Parsing with unification

ReferencesReferences

References

(Davey and Priestley, 2002) B. A. Davey and H. A. Priestley. Introduction tolattices and order. Cambridge, second edition, 2002.

(Lascarides et al., 1996) Alex Lascarides, Ted Briscoe, Nicholas Asher, andAnn Copestake. Order independent and persistent typed defaultunification. Linguistics and Philosophy, 19(1):1–90, February 1996. http://www.cl.cam.ac.uk/Research/NL/acquilex/papers.html(23 January 1998). Revised version of ACQUILEX II WP 34 (August1994/March 1995).

(Shieber, 1986) Stuart M. Shieber. An introduction to unification-basedapproaches to grammar. Number 5 in CSLI Lecture Notes. Stanford,California, January 1986.

(Toutanova et al., 2002) Kristina Toutanova, Christopher D. Manning,Stuart M. Shieber, Dan Flickinger, and Stephan Oepen. Parsedisambiguation for a rich HPSG grammar. In Proceedings of the FirstWorkshop on Treebanks and Linguistic Theories, pages 253–263,Sozopol, Bulgaria, 20–21 September 2002.http://www.bultreebank.org/Proceedings.html (14 October2003).

Frederik Fouvry Parsing with unification