INF2820 Computational Linguistics, 2013 Jan Tore Lønning 11 March


Today
With recommended (order of) reading

• Grammatical features (last week)
  • NLTK book sec. 9.1
• Feature structures
  • J&M, sec. 15.1
• Unification and subsumption
  • J&M, sec. 15.2
• Feature structures in NLTK
  • NLTK book sec. 9.2
• Feature-based grammars/Unification grammars
  • Partly: J&M, sec. 15.3, NLTK book sec. 9.3


Towards a formalization

• Formally:
  • Can a category have more than one feature?
  • What are the possible values of features?
  • What are the grammar rules?
  • How should the grammar rules be interpreted?
• Applicability:
  • How should a grammar with features for natural language look?
  • What more can features be used for?
    • Semantic representations
• Computationally:
  • How can feature structure grammars be parsed?


More than one feature, ex: German

S -> NP[CASE=nom, NUM=?x, PERS=?y] VP[NUM=?x, PERS=?y]

NP[CASE=?z, NUM=?x, PERS=3rd] -> Det[CASE=?z, NUM=?x, GEN=?u] N[CASE=?z, NUM=?x, GEN=?u]

VP[NUM=?x] -> V[SUBC=dtv, NUM=?x] NP[CASE=dat] NP[CASE=acc]

Det[NUM=sg, CASE=nom, GEN=mask] -> 'der'
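The role of the shared variables ?x and ?y in the S rule can be illustrated with a minimal sketch in plain Python. This is not how NLTK implements it: the helper names (is_var, match, licensed) are hypothetical, feature structures are simplified to flat dicts, and a feature missing from a daughter is simply treated as unspecified.

```python
def is_var(value):
    """Variables are written like '?x', following the rule notation."""
    return isinstance(value, str) and value.startswith("?")


def match(pattern, features, bindings):
    """Match one flat rule pattern against one flat feature dict.

    Returns the extended variable bindings, or None on a clash.
    Simplification: a feature absent from `features` counts as unspecified.
    """
    new = dict(bindings)
    for name, wanted in pattern.items():
        if name not in features:
            continue
        actual = features[name]
        if is_var(wanted):
            # The first occurrence binds the variable; later ones must agree.
            if new.setdefault(wanted, actual) != actual:
                return None
        elif wanted != actual:
            return None
    return new


def licensed(rule_rhs, daughters):
    """True if the daughters fit the rule's right-hand side with consistent bindings."""
    if len(rule_rhs) != len(daughters):
        return False
    bindings = {}
    for pattern, features in zip(rule_rhs, daughters):
        bindings = match(pattern, features, bindings)
        if bindings is None:
            return False
    return True


# Right-hand side of: S -> NP[CASE=nom, NUM=?x, PERS=?y] VP[NUM=?x, PERS=?y]
S_RULE = [
    {"CASE": "nom", "NUM": "?x", "PERS": "?y"},
    {"NUM": "?x", "PERS": "?y"},
]

np_nom_sg = {"CASE": "nom", "NUM": "sg", "PERS": "3rd"}
np_acc_sg = {"CASE": "acc", "NUM": "sg", "PERS": "3rd"}
vp_sg = {"NUM": "sg", "PERS": "3rd"}
vp_pl = {"NUM": "pl", "PERS": "3rd"}

agrees = licensed(S_RULE, [np_nom_sg, vp_sg])      # subject and verb agree
num_clash = licensed(S_RULE, [np_nom_sg, vp_pl])   # ?x cannot be both sg and pl
case_clash = licensed(S_RULE, [np_acc_sg, vp_sg])  # subject must be nominative
```

Because ?x and ?y occur in both daughters, number and person are checked for agreement without listing every combination as a separate rule.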


Feature structures

• Long tradition in linguistics
  • E.g. phonology
• A set of features and values:
  • Each value is appropriate for that feature
• Take it one step further:
  • Allow feature structures as values


Feature structures as graphs

• Two alternative notations:
  • Directed Acyclic Graphs (DAGs)
  • Attribute Value Matrices (AVMs)


Reentrancies


Reentrancies and programming

• Reentrancies in feature structures resemble the difference in programming between
  • two variables pointing to the same object (identity)
  • and two variables having equal values (equality)

>>> a = [3,4,5]
>>> b = [6,7,a,9]
>>> c = a[:]
>>> a.pop()
5
>>> a
?
>>> c
?
>>> b
?
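The same distinction can be sketched with nested dicts standing in for feature structures. This is only an analogy in plain Python; the variable names are made up for illustration.

```python
agr = {"NUM": "sg", "PERS": "3rd"}

# Reentrant: both paths lead to the very same object.
reentrant = {"AGR": agr, "SUBJ": {"AGR": agr}}

# Not reentrant: the two paths lead to equal but separate copies.
copied = {"AGR": dict(agr), "SUBJ": {"AGR": dict(agr)}}

same_object = reentrant["AGR"] is reentrant["SUBJ"]["AGR"]
equal_values = copied["AGR"] == copied["SUBJ"]["AGR"]
separate_objects = copied["AGR"] is not copied["SUBJ"]["AGR"]

# Adding information along one path shows the difference.
reentrant["AGR"]["TENSE"] = "past"
copied["AGR"]["TENSE"] = "past"

seen_on_other_path = "TENSE" in reentrant["SUBJ"]["AGR"]
missing_on_other_path = "TENSE" not in copied["SUBJ"]["AGR"]
```

With a shared value, information added along one path is automatically visible along the other; with copies it is not.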


Unification of feature structures


Subsumption and unification

Subsumption

• F subsumes G (F ⊑ G)
• "F is at least as general as G"
• If and only if:
  • F is atomic and F = G, or
  • F is complex and
    • for each feature x in F: F(x) subsumes G(x)
    • for any paths p, q in F: if F(p) = F(q) then G(p) = G(q)

Unification

• H is the unification of F and G
• H = F ⊔ G
• If and only if
  • F subsumes H
  • G subsumes H
  • and H is the most general f.s. with these properties
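A minimal sketch of the two definitions over nested Python dicts. It assumes tree-shaped feature structures only, so the path-identity condition for reentrancies is deliberately left out. The function names subsumes and unify are chosen here for illustration and are not NLTK's implementation.

```python
def subsumes(f, g):
    """True if f is at least as general as g (reentrancy is ignored)."""
    if not isinstance(f, dict):
        return f == g                 # atomic: the values must be identical
    if not isinstance(g, dict):
        return False                  # a complex f cannot subsume an atomic g
    # Every feature of f must be present in g with a value that f's value subsumes.
    return all(x in g and subsumes(f[x], g[x]) for x in f)


def unify(f, g):
    """Most general structure subsumed by both f and g, or None on a clash."""
    if not isinstance(f, dict) or not isinstance(g, dict):
        return f if f == g else None
    result = dict(f)                  # shallow copy: f itself is not modified
    for x, value in g.items():
        if x in result:
            merged = unify(result[x], value)
            if merged is None:
                return None
            result[x] = merged
        else:
            result[x] = value
    return result


f = {"AGR": {"NUM": "sg"}}
g = {"AGR": {"PERS": "3rd"}, "CAT": "vp"}
h = unify(f, g)
clash = unify({"NUM": "sg"}, {"NUM": "pl"})
```

Here h combines the information in f and g, both f and g subsume h, and the clashing NUM values make the second unification fail.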


NLTK - implementation

>>> import nltk
>>> fs1 = nltk.FeatStruct(TENSE='past', NUM='sg')
>>> fs1
[NUM='sg', TENSE='past']
>>> print(fs1)
[ NUM   = 'sg'   ]
[ TENSE = 'past' ]
>>> from nltk import FeatStruct
>>> fs2 = FeatStruct(CAT='vp', AGR=fs1)
>>> print(fs2)
[ AGR = [ NUM   = 'sg'   ] ]
[       [ TENSE = 'past' ] ]
[                          ]
[ CAT = 'vp'               ]


NLTK - implementation

>>> fs3 = fs2.unify(FeatStruct("[AGR = ?x, SUBJ = [AGR = ?x]]"))
>>> print(fs3)
[ AGR  = (1) [ NUM   = 'sg'   ] ]
[            [ TENSE = 'past' ] ]
[                               ]
[ CAT  = 'vp'                   ]
[                               ]
[ SUBJ = [ AGR -> (1) ]         ]
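A few further calls on FeatStruct, as a hedged sketch: it assumes NLTK is installed, and the feature names and values are illustrative only. It shows that unify returns None when the values clash, and that the (1) ... -> (1) notation builds a genuinely shared value rather than two equal copies.

```python
from nltk import FeatStruct

sg = FeatStruct(NUM='sg')
third = FeatStruct(PERS='3rd')
pl = FeatStruct(NUM='pl')

compatible = sg.unify(third)   # information from both structures is combined
clash = sg.unify(pl)           # NUM cannot be both 'sg' and 'pl'

# (1) tags a value, -> (1) points back to it: one object reached by two paths.
shared = FeatStruct("[AGR=(1)[NUM='sg'], SUBJ=[AGR->(1)]]")

# Without the tag, the two AGR values are equal but separate objects.
copied = FeatStruct("[AGR=[NUM='sg'], SUBJ=[AGR=[NUM='sg']]]")

is_shared = shared['AGR'] is shared['SUBJ']['AGR']
is_copied = copied['AGR'] is not copied['SUBJ']['AGR']
```

A failed unification is signalled by the return value None, not by an exception, so callers need to test for it explicitly.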


Two formats for grammar rules

NLTK

• S -> NP[AGR=?x] VP[AGR=?x]

• NP[AGR=?x] -> Det[AGR=?x] Nom[AGR=?x]

J&M


Two formats for grammar rules 2

NLTK

• V[AGR=[NUM=PL]] -> 'serve'

• V[AGR=[NUM=SG, PERS=3rd]] -> 'serves'

• VP[AGR=?x] -> V[AGR=?x] NP

J&M


Comparing the formats

NLTK

• Extend non-terminals with partial feature structures

• The feature structures may contain variables for coindexing

• Used in e.g. (early) Head-driven Phrase Structure Grammars

Jurafsky & Martin

• Add equations to CFG rules
• An equation equates
  • two paths, or
  • a path and an atomic value
• Inspired by
  • PATR
  • Lexical-Functional Grammar

The two formats amount to the same (before extensions)


Interpretation of feature-based grammars

• We have defined:
  • feature structures and unification
  • grammar rules with feature structures (x2)
• We should also make clear exactly what a feature structure grammar defines
  • (missing from both J&M and the NLTK book)
• We will give a semi-formal definition


Remember: CFG & Trees

• A local tree:
  • A node which is not a leaf
  • All its daughters
  • The order between the daughters
• A rule
  • B -> s1 s2 … sn
  • licenses a local tree if and only if the tree is of the form:

          B
    s1  s2  …  sn


Trees

• A CFG G generates a tree t iff
  • The top of t is labelled S
  • The leaves are labelled with terminals
  • Each local tree is licensed by a rule
• T(G) = the set of trees generated by G
• The yield of the tree t is the sequence of symbols on the leaves, in order
• A string w may be derived from G iff w is the yield of a tree in T(G).

Abbreviation: "iff" for "if and only if"
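These conditions can be checked mechanically. The following is a minimal sketch in plain Python, under assumptions made only for illustration: trees are written as (label, [children]) pairs with plain strings as leaves, and the small rule set and helper names are hypothetical.

```python
# Rules as (mother, daughters) pairs; the rule set is a made-up toy grammar.
RULES = {
    ("S", ("NP", "VP")),
    ("NP", ("DET", "N")),
    ("VP", ("V", "NP")),
    ("DET", ("the",)), ("DET", ("many",)),
    ("N", ("restaurant",)), ("N", ("fish",)),
    ("V", ("serves",)),
}
NONTERMINALS = {mother for mother, _ in RULES}


def label(node):
    return node if isinstance(node, str) else node[0]


def local_trees(tree):
    """Yield (mother, daughters) for every non-leaf node."""
    if isinstance(tree, str):
        return
    mother, children = tree
    yield mother, tuple(label(child) for child in children)
    for child in children:
        yield from local_trees(child)


def tree_yield(tree):
    """The symbols on the leaves, in order."""
    if isinstance(tree, str):
        return [tree]
    return [word for child in tree[1] for word in tree_yield(child)]


def generated_by(tree, rules, start="S"):
    """Top is the start symbol, leaves are terminals, every local tree is licensed."""
    return (
        not isinstance(tree, str)
        and label(tree) == start
        and all(leaf not in NONTERMINALS for leaf in tree_yield(tree))
        and all(local in rules for local in local_trees(tree))
    )


NP1 = ("NP", [("DET", ["the"]), ("N", ["restaurant"])])
NP2 = ("NP", [("DET", ["many"]), ("N", ["fish"])])
VP = ("VP", [("V", ["serves"]), NP2])
GOOD = ("S", [NP1, VP])
BAD = ("S", [VP, NP1])          # no rule S -> VP NP
UNFINISHED = ("S", ["NP", VP])  # a non-terminal left as a leaf
```

A string w is then derivable exactly when some tree accepted by generated_by has w as its tree_yield.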


Trees with feature structures

Tree for "the restaurant serves many fish":

[S [NP [DET the] [N restaurant]] [VP [V serves] [NP [DET many] [N fish]]]]

Each non-terminal node contains a feature structure


Conditions on grammaticality


Each local tree must be licensed by a grammar rule
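For the example sentence "the restaurant serves many fish", the licensing condition can be tried out with NLTK's feature-based chart parser. This is a sketch under assumptions: NLTK must be installed, the phrase-structure rules follow the NLTK-format rules shown earlier (with DET and N as category names), and the lexical entries for 'many', 'restaurant' and 'fish' are invented here for illustration.

```python
from nltk.grammar import FeatureGrammar
from nltk.parse import FeatureChartParser

# Toy grammar; the entries for 'many', 'restaurant' and 'fish' are assumptions.
GRAMMAR = FeatureGrammar.fromstring("""
% start S
S -> NP[AGR=?x] VP[AGR=?x]
NP[AGR=?x] -> DET[AGR=?x] N[AGR=?x]
VP[AGR=?x] -> V[AGR=?x] NP
DET[AGR=[PERS='3rd']] -> 'the'
DET[AGR=[NUM='pl', PERS='3rd']] -> 'many'
N[AGR=[NUM='sg', PERS='3rd']] -> 'restaurant'
N[AGR=[PERS='3rd']] -> 'fish'
V[AGR=[NUM='sg', PERS='3rd']] -> 'serves'
V[AGR=[NUM='pl']] -> 'serve'
""")

parser = FeatureChartParser(GRAMMAR)


def parses(sentence):
    """All trees the parser finds for a whitespace-tokenized sentence."""
    return list(parser.parse(sentence.split()))


good = parses("the restaurant serves many fish")
bad = parses("the restaurant serve many fish")   # subject is sg, verb is pl

for tree in good:
    print(tree)
```

The second sentence gets no tree because no assignment of feature structures lets every local tree be licensed: the S rule requires the AGR values of NP and VP to unify, and sg clashes with pl.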


Local tree licensed by rule, ex. 1

• J&M format:
  • The local tree respects all the equations
• NLTK format: S -> NP[AGR=?x] VP[AGR=?x]
  • The rule corresponds to a partial local tree
  • The actual local tree extends this

Local tree: S with daughters NP and VP, each with a feature structure.


Local tree licensed by rule, ex. 2

• J&M format:
  • The local tree respects all the equations
  • DET -> the
    <DET AGR PERS> = 3rd
• NLTK format:
  • DET[AGR=[PERS='3rd']] -> 'the'

Local tree: DET (with a feature structure) dominating "the".


Conditions on grammaticality

A tree T with feature structures is licensed by a feature-structure grammar G if and only if:

• If t1, t2, …, tn are all the local trees in T,
• then there are corresponding rules in G, say g1, g2, …, gn, such that:
  • Tree ti is licensed by rule gi for i = 1, 2, …, n
  • T is a minimal structure that satisfies these gi
• T is minimal:
  • If fs_i is the feature structure at the mother of local tree ti for i = 1, 2, …, n,
  • then we cannot find a structurally similar tree for the same sentence with feature structures fs'_i such that
    • fs'_i subsumes fs_i for i = 1, 2, …, n
    • fs_i does not subsume fs'_i for at least one i
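The minimality condition can be illustrated on the smallest possible case, a tree consisting only of DET dominating 'the'. This is a sketch assuming NLTK is installed; the helper strictly_more_general and the variable names are hypothetical. Both candidate feature structures extend the rule DET[AGR=[PERS='3rd']] -> 'the', but only one is minimal.

```python
from nltk import FeatStruct

# Feature structure contributed by the rule DET[AGR=[PERS='3rd']] -> 'the'.
rule_fs = FeatStruct("[AGR=[PERS='3rd']]")

# Two candidate feature structures for the DET node of the one-word tree.
minimal = FeatStruct("[AGR=[PERS='3rd']]")
too_specific = FeatStruct("[AGR=[PERS='3rd', NUM='sg']]")


def strictly_more_general(f, g):
    """f subsumes g, but g does not subsume f."""
    return f.subsumes(g) and not g.subsumes(f)


# Both candidates extend the rule, so both local trees are licensed by it.
both_licensed = rule_fs.subsumes(minimal) and rule_fs.subsumes(too_specific)

# The second candidate adds NUM='sg', which nothing in the rule requires.
not_minimal = strictly_more_general(minimal, too_specific)
```

Since a strictly more general structure also satisfies the rule, the tree carrying too_specific is ruled out by the minimality condition: feature values may only appear in a tree if some rule puts them there.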
