21
October 2008 CSA3180: Sentence Parsing 1 CSA3180: NLP Algorithms Sentence Parsing Algorithms 2 Problems with DFTD Parser

CSA3180: NLP Algorithms

  • Upload
    dane

  • View
    24

  • Download
    2

Embed Size (px)

DESCRIPTION

CSA3180: NLP Algorithms. Sentence Parsing Algorithms 2 Problems with DFTD Parser. Left Recursion Handling Ambiguity Inefficiency. Problems with DFTD Parser. Left Recursion. A grammar is left recursive if it contains at least one non-terminal A for which A  * A and   *  - PowerPoint PPT Presentation

Citation preview

Page 1: CSA3180: NLP Algorithms

October 2008 CSA3180: Sentence Parsing 1

CSA3180: NLP Algorithms

Sentence Parsing Algorithms 2

Problems with DFTD Parser

Page 2: CSA3180: NLP Algorithms

October 2008 CSA3180: Sentence Parsing 2

Problems withDFTD Parser

• Left Recursion• Handling Ambiguity• Inefficiency

Page 3: CSA3180: NLP Algorithms

October 2008 CSA3180: Sentence Parsing 3

Left Recursion

• A grammar is left recursive if it contains at least one non-terminal A for whichA * A and *

(n.b. * is the transitive closure of )• Intuitive idea: derivation of that category

includes itself along its leftmost branch.

NP NP PP NP NP and NP

NP DetP Nominal DetP NP ' s

Page 4: CSA3180: NLP Algorithms

October 2008 CSA3180: Sentence Parsing 4

Left RecursionLeft recursion can lead to an infinite loop

[nltk demo

Page 5: CSA3180: NLP Algorithms

October 2008 CSA3180: Sentence Parsing 5

Dealing with Left Recursion

• Use different parsing strategy• Reformulate the grammar to

eliminate LRA A |

is rewritten asA A' A' A' |

Page 6: CSA3180: NLP Algorithms

October 2008 CSA3180: Sentence Parsing 6

Rewriting the Grammar

NP → NP ‘and’ NP

NP → D N | D N PP

Page 7: CSA3180: NLP Algorithms

October 2008 CSA3180: Sentence Parsing 7

Rewriting the Grammar

NP → NP ‘and’ NP

β

NP → D N | D N PP

α

Page 8: CSA3180: NLP Algorithms

October 2008 CSA3180: Sentence Parsing 8

Rewriting the Grammar

NP → NP ‘and’ NP

β

NP → D N | D N PP

α

New Grammar

NP → α NP1

NP1 → β NP1 | ε

Page 9: CSA3180: NLP Algorithms

October 2008 CSA3180: Sentence Parsing 9

Rewriting the Grammar

NP → NP ‘and’ NP

β

NP → D N | D N PP

α

New Grammar

NP → α NP1

NP1 → β NP1 | ε

α → D N | D N PP

β → ‘and’ NP

Page 10: CSA3180: NLP Algorithms

October 2008 CSA3180: Sentence Parsing 10

New Parse Tree

D N

the cat

α

NP

NP1

ε

Page 11: CSA3180: NLP Algorithms

October 2008 CSA3180: Sentence Parsing 11

Rewriting the Grammar

• Different parse tree• Unnatural parse tree?

Page 12: CSA3180: NLP Algorithms

October 2008 CSA3180: Sentence Parsing 12

Problems withDFTD Parser

• Left Recursion• Handling Ambiguity• Inefficiency

Page 13: CSA3180: NLP Algorithms

October 2008 CSA3180: Sentence Parsing 13

Handling Ambiguity

• Coordination Ambiguity: different scope of conjunction:Hot curry and ice taste nice with riceHot curry and rice taste nice with ice

• Attachment Ambiguity: a constituent can be added to the parse tree in different places:I shot an elephant in my trousers

• VP → VP PPNP → NP PP

Page 14: CSA3180: NLP Algorithms

October 2008 CSA3180: Sentence Parsing 14

Real sentences are full of ambiguities

President Kennedy today pushed aside other White House business to devote all his time and attention to working on the Berlin crisis address he will deliver tomorrow night to the American people over nationwide television and radio

Page 15: CSA3180: NLP Algorithms

October 2008 CSA3180: Sentence Parsing 15

Prepositional Phrase Ambiguity

No of PPs # parses

2 2

3 5

4 14

5 132

6 469

7 1430

8 4867

he will deliver- to the American people- over nationwide TV- in New York- during September- for very good reasons

Page 16: CSA3180: NLP Algorithms

October 2008 CSA3180: Sentence Parsing 16

Growth of Number of Ambiguities

The nth Catalan number counts the ways of dissecting a polygon with n+2 sides into triangles by drawing nonintersecting diagonals. No of PPs # parses

2 2

3 5

4 14

5 132

6 469

7 1430

8 4867

Page 17: CSA3180: NLP Algorithms

October 2008 CSA3180: Sentence Parsing 17

Handling Ambiguities

• Statistical disambiguation– which is the most probable interpretation?

• Semantic knowledge– which is the most sensible interpretation?– Subatomic particles such as positively charged

protons and electrons

Page 18: CSA3180: NLP Algorithms

October 2008 CSA3180: Sentence Parsing 18

Problems withDFTD Parser

• Left Recursion• Handling Ambiguity• Inefficiency

Page 19: CSA3180: NLP Algorithms

October 2008 CSA3180: Sentence Parsing 19

Repeated Parsing of Subtrees

• Local versus global ambiguity.– NP → Det Noun– NP → NP PP

• Because of the top down depth first, left to right policy, the parser builds trees that fail because they do not cover all of the input.

• Successive parses cover larger segments of the input, but these include structures that have already been built before.

Page 20: CSA3180: NLP Algorithms

October 2008 CSA3180: Sentence Parsing 20

Repeated Parsing ofSubtrees

a flight 4

from Indianapolis 3

to Houston 2

on TWA 1

A flight from Indianapolis 3

A flight from Indianapolis to Houston

2

A flight from Indianapolis to Houston on TWA

1

NP Nom

Det Noun

a flight

NP

NP PP Nom

Det Noun P Noun

a flight from Indianapolis

Page 21: CSA3180: NLP Algorithms

October 2008 CSA3180: Sentence Parsing 21

Repeated Parsing ofSubtrees

a flight 4

from Indianapolis 3

to Houston 2

on TWA 1

A flight from Indianapolis 3

A flight from Indianapolis to Houston

2

A flight from Indianapolis to Houston on TWA

1