Pushpak Bhattacharyya, CSE Dept., IIT Bombay. 8th, 10th March, 2011


CS460/626 : Natural Language Processing/Speech, NLP and the Web

(Lecture 23, 24– Parsing Algorithms; Parsing in case of Ambiguity; Probabilistic Parsing)

Pushpak Bhattacharyya, CSE Dept., IIT Bombay

8th, 10th March, 2011

(Lectures 21 and 22 were on Sentiment Analysis by Aditya Joshi)

A note on Language Modeling

Example sentence: "^ The tortoise beat the hare in the race."

Guided by frequency: N-gram (n=3)
  ^ the tortoise       5*10^-3
  the tortoise beat    3*10^-2
  tortoise beat the    7*10^-5
  beat the hare        5*10^-6

Guided by language knowledge: CFG
  S  -> NP VP
  NP -> DT N
  VP -> V NP PP
  PP -> P NP

Guided by both: Probabilistic CFG
  S  -> NP VP     1.0
  NP -> DT N      0.5
  VP -> V NP PP   0.4
  PP -> P NP      1.0

Guided by world knowledge: Dependency Grammar
  Semantic roles: agt, obj, sen, etc.
  Semantic relations always hold between "heads".

Prob. DG
  Semantic roles with probabilities.
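The trigram column above can be turned into a toy language-model computation via the chain rule. A minimal sketch in Python, reading each slide number as an estimate of P(w3 | w1, w2); the values are illustrative, not from a real corpus:

```python
from math import prod

# Trigram probabilities from the slide, read as P(w3 | w1, w2).
trigram = {
    ('^', 'the', 'tortoise'):    5e-3,
    ('the', 'tortoise', 'beat'): 3e-2,
    ('tortoise', 'beat', 'the'): 7e-5,
    ('beat', 'the', 'hare'):     5e-6,
}

def sentence_score(words):
    """Chain-rule product over the trigrams we have estimates for."""
    return prod(trigram[t] for t in zip(words, words[1:], words[2:])
                if t in trigram)

score = sentence_score(['^', 'the', 'tortoise', 'beat', 'the', 'hare'])
print(score)  # about 5.25e-14
```

The score is the product of the four trigram probabilities, illustrating how an n-gram model assigns a probability to a whole sentence.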

Parse Tree

(S (NP (DT The) (N tortoise))
   (VP (V beat)
       (NP (DT the) (N hare))
       (PP (P in) (NP (DT the) (N race)))))

UNL Expression

beat@past
  agt -> tortoise@def
  obj -> hare@def
  scn (scene) -> race@def

Purpose of LM
- Prediction of next word (Speech Processing)
- Language Identification (for same script)
- Belongingness check (parsing)

P(NP -> DT N) means: what is the probability that the 'YIELD' of the non-terminal NP is DT N?

Need for Deep Parsing
- Sentences are linear structures.
- But there is a hierarchy, a tree, hidden behind the linear structure.
- There are constituents and branches.

PPs are at the same level: flat with respect to the head word "book"

[The big book of poems with the blue cover] is on the table.

(NP (DT The) (AP big) (N book) (PP of poems) (PP with the blue cover))

All modifiers attach directly under NP: no distinction in terms of dominance or c-command.

"Constituency test of Replacement" runs into problems

- One-replacement: I bought the big [book of poems with the blue cover], not the small [one].
  Here one-replacement targets "book of poems with the blue cover".
- Another one-replacement: I bought the big [book of poems] with the blue cover, not the small [one] with the red cover.
  Here one-replacement targets "book of poems".

More deeply embedded structure

(NP (DT The)
    (N'3 (AP big)
         (N'2 (N'1 (N book)
                   (PP of poems))
              (PP with the blue cover))))

Grammar and Parsing Algorithms

A simplified grammar:
  S  -> NP VP
  NP -> DT N | N
  VP -> V ADV | V

Example sentence:
  1 People 2 laugh 3   (the numbers mark positions between words)

Lexicon: People - N, V; laugh - N, V
(Both noun and verb are possible for the word "People".)

Top-Down Parsing

Step  State             Backup State   Action
-----------------------------------------------------------
1.    ((S) 1)           -              -
2.    ((NP VP) 1)       -              -
3a.   ((DT N VP) 1)     ((N VP) 1)     -
3b.   ((N VP) 1)        -              -
4.    ((VP) 2)          -              Consume "People"
5a.   ((V ADV) 2)       ((V) 2)        -
6.    ((ADV) 3)         ((V) 2)        Consume "laugh"
5b.   ((V) 2)           -              -
6.    ((.) 3)           -              Consume "laugh"

Termination condition: all input consumed, no symbols remaining.
Note: input symbols can be pushed back (the number in each state is the position of the input pointer).

Discussion of Top-Down Parsing
- This kind of search is goal-driven.
- Gives importance to textual precedence (rule precedence).
- No regard for data, a priori (useless expansions are made).

Bottom-Up Parsing

Some conventions (the subscripts represent positions):
  N12 means an N spanning positions 1 to 2.
  S1? -> NP12 ° VP2?
    "?" means the end position is not yet known.
    Work to the left of ° is done; work to the right of ° remains.

Bottom-Up Parsing (pictorial representation)

  1 People 2 laugh 3

  N12    N23
  V12    V23
  NP12 -> N12 °      NP23 -> N23 °
  VP12 -> V12 °      VP23 -> V23 °
  S1?  -> NP12 ° VP2?
  S    -> NP12 VP23 °

Problem with Top-Down Parsing: Left Recursion

Suppose the grammar has a rule A -> A B. Then the expansion proceeds as
  ((A) K) -> ((A B) K) -> ((A B B) K) -> ...
and never terminates, since no input is ever consumed.
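The runaway expansion can be made concrete. A small sketch in Python, using the hypothetical rule A -> A B and bounding the loop that a naive top-down parser would never exit:

```python
# Hypothetical left-recursive rule A -> A B: each expansion step rewrites the
# leftmost A without consuming any input, so naive top-down search loops forever.
def expand(sentential_form, steps):
    for _ in range(steps):            # bound the loop; the parser itself has no bound
        if sentential_form[0] == 'A':
            sentential_form = ['A', 'B'] + sentential_form[1:]
    return sentential_form

print(expand(['A'], 3))  # ['A', 'B', 'B', 'B'] -- the leading A never goes away
```

After any number of steps the frontier still begins with A, which is exactly why top-down (and plain chart) parsing cannot handle left recursion.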

Combining top-down and bottom-up strategies: Chart Parsing
- Combines advantages of top-down and bottom-up parsing.
- Does not work in case of left recursion.

Example: "People laugh" (People: noun, verb; laugh: noun, verb)

Grammar:
  S  -> NP VP
  NP -> DT N | N
  VP -> V ADV | V

Transitive closure:

  1 People 2 laugh 3

  At 1: S -> ° NP VP;  NP -> ° DT N;  NP -> ° N
  At 2: NP -> N °;  S -> NP ° VP;  VP -> ° V;  VP -> ° V ADV
  At 3: VP -> V °;  S -> NP VP °  (success)

Arcs in Parsing
- Each arc represents a chart entry which records:
  - completed work (left of °)
  - expected work (right of °)

Example: "People laugh loudly"

  1 People 2 laugh 3 loudly 4

  At 1: S -> ° NP VP;  NP -> ° DT N;  NP -> ° N
  At 2: NP -> N °;  S -> NP ° VP;  VP -> ° V;  VP -> ° V ADV
  At 3: VP -> V °;  VP -> V ° ADV;  S -> NP VP °
  At 4: VP -> V ADV °;  S -> NP VP °
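Chart parsing of this kind can be sketched as an Earley-style recognizer: predictor, scanner, and completer steps maintain exactly these dotted arcs. A minimal sketch in Python, assuming the toy grammar of the slides plus "loudly" as ADV (an assumption for this example):

```python
from collections import namedtuple

# Toy grammar of the slides; 'loudly' added as ADV (assumption for this example).
GRAMMAR = {'S': [['NP', 'VP']], 'NP': [['DT', 'N'], ['N']], 'VP': [['V', 'ADV'], ['V']]}
LEXICON = {'people': {'N', 'V'}, 'laugh': {'N', 'V'}, 'loudly': {'ADV'}}

State = namedtuple('State', 'lhs rhs dot start')   # a dotted arc, e.g. S -> NP . VP

def earley(words):
    chart = [set() for _ in range(len(words) + 1)]
    chart[0].add(State('GAMMA', ('S',), 0, 0))     # dummy start arc
    for i in range(len(words) + 1):
        agenda = list(chart[i])
        while agenda:
            st = agenda.pop()
            if st.dot < len(st.rhs):
                nxt = st.rhs[st.dot]
                if nxt in GRAMMAR:                 # predictor: expand top-down
                    for rhs in GRAMMAR[nxt]:
                        new = State(nxt, tuple(rhs), 0, i)
                        if new not in chart[i]:
                            chart[i].add(new); agenda.append(new)
                elif i < len(words) and nxt in LEXICON.get(words[i], set()):
                    # scanner: consume one word, advance the dot
                    chart[i + 1].add(State(st.lhs, st.rhs, st.dot + 1, st.start))
            else:                                  # completer: extend waiting arcs
                for prev in list(chart[st.start]):
                    if prev.dot < len(prev.rhs) and prev.rhs[prev.dot] == st.lhs:
                        new = State(prev.lhs, prev.rhs, prev.dot + 1, prev.start)
                        if new not in chart[i]:
                            chart[i].add(new); agenda.append(new)
    return any(s.lhs == 'GAMMA' and s.dot == 1 for s in chart[len(words)])

print(earley("people laugh loudly".split()))  # True
```

Each chart column corresponds to one of the numbered positions in the slide; the arcs accumulated in it are the slide's "completed work left of °, expected work right of °" entries.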

Dealing With Structural Ambiguity

Multiple parses for a sentence:
- The man saw the boy with a telescope.
- The man saw the mountain with a telescope.
- The man saw the boy with the ponytail.

At the level of syntax, all these sentences are ambiguous. But semantics can disambiguate the 2nd and 3rd sentences.

Prepositional Phrase (PP) Attachment Problem

  V - NP1 - P - NP2   (here P means preposition)

Does NP2 attach to NP1, or does NP2 attach to V?

Parse Trees for a Structurally Ambiguous Sentence

Let the grammar be:
  S  -> NP VP
  NP -> DT N | DT N PP
  PP -> P NP
  VP -> V NP PP | V NP

For the sentence, "I saw a boy with a telescope":

Parse Tree 1 (PP inside the NP):

(S (NP (N I))
   (VP (V saw)
       (NP (Det a) (N boy)
           (PP (P with) (NP (Det a) (N telescope))))))

Parse Tree 2 (PP attached to the VP):

(S (NP (N I))
   (VP (V saw)
       (NP (Det a) (N boy))
       (PP (P with) (NP (Det a) (N telescope)))))

Parsing Structural Ambiguity

Parsing for Structurally Ambiguous Sentences

Sentence: "I saw a boy with a telescope"

Grammar:
  S    -> NP VP
  NP   -> ART N | ART N PP | PRON
  VP   -> V NP PP | V NP
  PP   -> P NP
  ART  -> a | an | the
  N    -> boy | telescope
  PRON -> I
  V    -> saw

Ambiguous Parses

Two possible parses:

PP attached with verb (i.e., I used a telescope to see):
  ( S ( NP ( PRON "I" ) )
      ( VP ( V "saw" )
           ( NP ( ART "a" ) ( N "boy" ) )
           ( PP ( P "with" ) ( NP ( ART "a" ) ( N "telescope" ) ) ) ) )

PP attached with noun (i.e., the boy had a telescope):
  ( S ( NP ( PRON "I" ) )
      ( VP ( V "saw" )
           ( NP ( ART "a" ) ( N "boy" )
                ( PP ( P "with" ) ( NP ( ART "a" ) ( N "telescope" ) ) ) ) ) )

Top Down Parse

Step  State               Backup State                                      Action                Comments
--------------------------------------------------------------------------------------------------------------------------------
1     ( ( S ) 1 )         -                                                 -                     Use S -> NP VP
2     ( ( NP VP ) 1 )     -                                                 -                     Use NP -> ART N | ART N PP | PRON
3     ( ( ART N VP ) 1 )  (a) ( ( ART N PP VP ) 1 )  (b) ( ( PRON VP ) 1 ) -                     ART does not match "I"; backup state (b) used
3B    ( ( PRON VP ) 1 )   -                                                 -
4     ( ( VP ) 2 )        -                                                 Consumed "I"
5     ( ( V NP PP ) 2 )   ( ( V NP ) 2 )                                    -                     Verb attachment rule used
6     ( ( NP PP ) 3 )     -                                                 Consumed "saw"
7     ( ( ART N PP ) 3 )  (a) ( ( ART N PP PP ) 3 )  (b) ( ( PRON PP ) 3 ) -
8     ( ( N PP ) 4 )      -                                                 Consumed "a"
9     ( ( PP ) 5 )        -                                                 Consumed "boy"
10    ( ( P NP ) 5 )      -                                                 -
11    ( ( NP ) 6 )        -                                                 Consumed "with"
12    ( ( ART N ) 6 )     (a) ( ( ART N PP ) 6 )  (b) ( ( PRON ) 6 )       -
13    ( ( N ) 7 )         -                                                 Consumed "a"
14    ( ( - ) 8 )         -                                                 Consumed "telescope"

Finish parsing.

Top-Down Parsing: Observations
- Top-down parsing gave us the verb attachment parse tree (i.e., I used a telescope).
- To obtain the alternate parse tree, the backup state in step 5 will have to be invoked.
- Is there an efficient way to obtain all parses?

  1 I 2 saw 3 a 4 boy 5 with 6 a 7 telescope 8   (positions)

Colour scheme (for the parse diagrams): blue for normal parse, green for verb attachment parse, purple for noun attachment parse, red for invalid parse.
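One systematic way to obtain all parses is to make the backtracking exhaustive: enumerate every expansion instead of stopping at the first success. A minimal sketch in Python for the grammar above (PP -> P NP included, as used in step 10 of the trace):

```python
GRAMMAR = {
    'S':  [['NP', 'VP']],
    'NP': [['ART', 'N'], ['ART', 'N', 'PP'], ['PRON']],
    'VP': [['V', 'NP', 'PP'], ['V', 'NP']],
    'PP': [['P', 'NP']],       # implied by the earlier grammar slide
}
LEXICON = {'a': ['ART'], 'an': ['ART'], 'the': ['ART'],
           'boy': ['N'], 'telescope': ['N'], 'I': ['PRON'],
           'saw': ['V'], 'with': ['P']}

def parse(sym, words, i):
    """Yield (tree, next_position) for every way `sym` derives words[i:...]."""
    if sym in GRAMMAR:
        for rhs in GRAMMAR[sym]:
            # expand each alternative, threading the input position through
            for trees, j in seq(rhs, words, i):
                yield (sym, trees), j
    elif i < len(words) and sym in LEXICON.get(words[i], []):
        yield (sym, words[i]), i + 1

def seq(symbols, words, i):
    if not symbols:
        yield [], i
        return
    for tree, j in parse(symbols[0], words, i):
        for rest, k in seq(symbols[1:], words, j):
            yield [tree] + rest, k

sent = "I saw a boy with a telescope".split()
parses = [t for t, j in parse('S', sent, 0) if j == len(sent)]
print(len(parses))  # 2
```

The two surviving trees are exactly the verb-attachment parse (VP -> V NP PP) and the noun-attachment parse (NP -> ART N PP) shown earlier.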

Bottom Up Parse

  1 I 2 saw 3 a 4 boy 5 with 6 a 7 telescope 8

Chart edges, built up by systematically applying the rules (° separates completed work from expected work; "?" marks an as-yet-unknown end position):

  NP12 -> PRON12 °
  S1?  -> NP12 ° VP2?
  VP2? -> V23 ° NP3? PP??
  VP2? -> V23 ° NP3?
  NP35 -> ART34 N45 °
  NP3? -> ART34 N45 ° PP5?
  VP25 -> V23 NP35 °
  S15  -> NP12 VP25 °
  VP2? -> V23 NP35 ° PP5?
  PP5? -> P56 ° NP6?
  NP68 -> ART67 N78 °
  NP6? -> ART67 N78 ° PP8?
  PP58 -> P56 NP68 °
  NP38 -> ART34 N45 PP58 °
  VP28 -> V23 NP38 °
  VP28 -> V23 NP35 PP58 °
  S18  -> NP12 VP28 °

S15 -> NP12 VP25 ° covers only words 1-5, so it is not a valid parse of the whole sentence. The two complete parses are the two derivations of S18 -> NP12 VP28 °: noun attachment (via NP38 -> ART34 N45 PP58) and verb attachment (via VP28 -> V23 NP35 PP58).

Bottom-Up Parsing: Observations
- Both the noun attachment and verb attachment parses are obtained by simply systematically applying the rules.
- The numbers in subscript help in verifying the parse and getting chunks from the parse.

Exercise

For the sentence "The man saw the boy with a telescope" and the grammar given previously, compare the performance of top-down, bottom-up, and top-down chart parsing.

Start of Probabilistic Parsing

Example of Sentence labeling: Parsing

[S1 [S [S [VP [VB Come] [NP [NNP July]]]] [, ,] [CC and] [S [NP [DT the] [JJ IIT] [NN campus]] [VP [AUX is] [ADJP [JJ abuzz] [PP [IN with] [NP [ADJP [JJ new] [CC and] [VBG returning]] [NNS students]]]]]]] [. .]]

Noisy Channel Modeling

  Source: sentence S  ->  Noisy Channel  ->  Target: parse T

  T* = argmax_T P(T | S)
     = argmax_T P(T) P(S | T)
     = argmax_T P(T)

since, given the parse, the sentence is completely determined, so P(S | T) = 1.

Corpus
- A collection of text, called a corpus, is used for collecting various language data.
- With annotation: more information, but manual-labor intensive.
- Practice: label automatically, then correct manually.
- The famous Brown Corpus contains 1 million tagged words.
- Switchboard: a very famous corpus; 2400 conversations, 543 speakers, many US dialects, annotated with orthography and phonetics.

Discriminative vs. Generative Model

  W* = argmax_W P(W | SS)

  Discriminative model: compute P(W | SS) directly.
  Generative model: compute it from P(W) * P(SS | W).

Language Models
- N-grams: sequences of n consecutive words/characters.
- Probabilistic / Stochastic Context Free Grammars:
  - Simple probabilistic models capable of handling recursion.
  - A CFG with probabilities attached to rules.
  - Rule probabilities: how likely is it that a particular rewrite rule is used?

PCFGs: Why PCFGs?
- Intuitive probabilistic models for tree-structured languages.
- Algorithms are extensions of HMM algorithms.
- Better than the n-gram model for language modeling.

Formal Definition of PCFG

A PCFG consists of:
- A set of terminals {w_k}, k = 1, ..., V
  e.g., {w_k} = {child, teddy, bear, played, ...}
- A set of non-terminals {N^i}, i = 1, ..., n
  e.g., {N^i} = {NP, VP, DT, ...}
- A designated start symbol N^1
- A set of rules {N^i -> z^j}, where z^j is a sequence of terminals and non-terminals
  e.g., NP -> DT NN
- A corresponding set of rule probabilities

Rule Probabilities

Rule probabilities are such that, for each non-terminal N^i:

  sum_j P(N^i -> z^j) = 1

E.g., P(NP -> DT NN) = 0.2, P(NP -> NN) = 0.5, P(NP -> NP PP) = 0.3.

P(NP -> DT NN) = 0.2 means that 20% of the parses in the training data use the rule NP -> DT NN.

Probabilistic Context Free Grammars

  S  -> NP VP     1.0      DT  -> the       1.0
  NP -> DT NN     0.5      NN  -> gunman    0.5
  NP -> NNS       0.3      NN  -> building  0.5
  NP -> NP PP     0.2      VBD -> sprayed   1.0
  PP -> P NP      1.0      NNS -> bullets   1.0
  VP -> VP PP     0.6      P   -> with      1.0
  VP -> VBD NP    0.4

Example Parse t1

The gunman sprayed the building with bullets.

(S1.0 (NP0.5 (DT1.0 The) (NN0.5 gunman))
      (VP0.6 (VP0.4 (VBD1.0 sprayed)
                    (NP0.5 (DT1.0 the) (NN0.5 building)))
             (PP1.0 (P1.0 with)
                    (NP0.3 (NNS1.0 bullets)))))

P(t1) = 1.0 * 0.5 * 1.0 * 0.5 * 0.6 * 0.4 * 1.0 * 0.5 * 1.0 * 0.5 * 1.0 * 1.0 * 0.3 * 1.0 = 0.0045

Another Parse t2

The gunman sprayed the building with bullets.

(S1.0 (NP0.5 (DT1.0 The) (NN0.5 gunman))
      (VP0.4 (VBD1.0 sprayed)
             (NP0.2 (NP0.5 (DT1.0 the) (NN0.5 building))
                    (PP1.0 (P1.0 with)
                           (NP0.3 (NNS1.0 bullets))))))

P(t2) = 1.0 * 0.5 * 1.0 * 0.5 * 0.4 * 1.0 * 0.2 * 0.5 * 1.0 * 0.5 * 1.0 * 1.0 * 0.3 * 1.0 = 0.0015
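The two products can be checked mechanically by multiplying the rule probabilities down each tree. A minimal sketch in Python; the string keys are an ad-hoc encoding of the rules, not a standard format:

```python
from math import prod

# PCFG rule probabilities from the grammar slide (ad-hoc string keys).
RULE_P = {'S->NP VP': 1.0, 'NP->DT NN': 0.5, 'NP->NNS': 0.3, 'NP->NP PP': 0.2,
          'PP->P NP': 1.0, 'VP->VP PP': 0.6, 'VP->VBD NP': 0.4,
          'DT->the': 1.0, 'NN->gunman': 0.5, 'NN->building': 0.5,
          'VBD->sprayed': 1.0, 'NNS->bullets': 1.0, 'P->with': 1.0}

t1 = ['S->NP VP', 'NP->DT NN', 'DT->the', 'NN->gunman',    # the gunman
      'VP->VP PP', 'VP->VBD NP', 'VBD->sprayed',
      'NP->DT NN', 'DT->the', 'NN->building',              # the building
      'PP->P NP', 'P->with', 'NP->NNS', 'NNS->bullets']    # with bullets
t2 = ['S->NP VP', 'NP->DT NN', 'DT->the', 'NN->gunman',
      'VP->VBD NP', 'VBD->sprayed', 'NP->NP PP',
      'NP->DT NN', 'DT->the', 'NN->building',
      'PP->P NP', 'P->with', 'NP->NNS', 'NNS->bullets']

p1 = prod(RULE_P[r] for r in t1)
p2 = prod(RULE_P[r] for r in t2)
print(p1, p2)
```

Since P(t1) = 0.0045 > P(t2) = 0.0015, the PCFG prefers the verb-attachment parse t1.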

Probability of a sentence

Notation:
  w_ab : the subsequence w_a ... w_b
  N^j dominates w_a ... w_b, or yield(N^j) = w_a ... w_b
  (e.g., an NP dominating "the sweet teddy bear")

Probability of a sentence = P(w_1m):

  P(w_1m) = sum_t P(w_1m, t)
          = sum_t P(t) P(w_1m | t)
          = sum over {t : yield(t) = w_1m} of P(t)

where t is a parse tree of the sentence: if t is a parse tree for the sentence w_1m, then P(w_1m | t) = 1.

Assumptions of the PCFG model

Consider the tree (with two NP positions, marked 1 and 2):

  (S (NP-1 The child)
     (VP ... (NP-2 The toy) ...))

- Place invariance: P(NP -> DT NN) is the same in locations 1 and 2.
- Context-freeness: P(NP -> DT NN | anything outside "The child") = P(NP -> DT NN).
- Ancestor-freeness: at 2, P(NP -> DT NN | its ancestor is VP) = P(NP -> DT NN).

Probability of a parse tree
- Domination: we say N^j dominates from k to l, symbolized as N^j_k,l, if w_k,l is derived from N^j.
- P(tree | sentence) = P(tree | S_1,l), where S_1,l means that the start symbol S dominates the word sequence w_1,l.
- P(t | s) approximately equals the joint probability of the constituent non-terminals dominating the sentence fragments (next slide).

Probability of a parse tree (cont.)

Consider the tree over words w1 ... wl:

  (S1,l (NP1,2 (DT1 w1) (N2 w2))
        (VP3,l (V3,3 w3)
               (PP4,l (P4,4 w4)
                      (NP5,l w5 ... wl))))

P(t | s) = P(t | S1,l)
         = P( NP1,2, DT1,1, w1, N2,2, w2, VP3,l, V3,3, w3, PP4,l, P4,4, w4, NP5,l, w5...l | S1,l )
         = P( NP1,2, VP3,l | S1,l ) * P( DT1,1, N2,2 | NP1,2 ) * P( w1 | DT1,1 ) * P( w2 | N2,2 )
           * P( V3,3, PP4,l | VP3,l ) * P( w3 | V3,3 )
           * P( P4,4, NP5,l | PP4,l ) * P( w4 | P4,4 ) * P( w5...l | NP5,l )

(using the chain rule, context-freeness, and ancestor-freeness)

HMM ↔ PCFG
- O observed sequence ↔ w_1m sentence
- X state sequence ↔ t parse tree
- mu model ↔ G grammar

Three fundamental questions: HMM ↔ PCFG

- How likely is a certain observation given the model? ↔ How likely is a sentence given the grammar?

    P(O | mu)  ↔  P(w_1m | G)

- How to choose a state sequence which best explains the observations? ↔ How to choose a parse which best supports the sentence?

    argmax_X P(X | O, mu)  ↔  argmax_t P(t | w_1m, G)

HMM ↔ PCFG (cont.)

- How to choose the model parameters that best explain the observed data? ↔ How to choose rule probabilities which maximize the probabilities of the observed sentences?

    argmax_mu P(O | mu)  ↔  argmax_G P(w_1m | G)

Interesting Probabilities

The gunman sprayed the building with bullets
1          2        3       4    5         6     7

For the NP spanning words 4-5 ("the building"):

- Inside probability beta_NP(4,5): what is the probability of having an NP at this position such that it will derive "the building"?
- Outside probability alpha_NP(4,5): what is the probability of starting from N^1 and deriving "The gunman sprayed", an NP, and "with bullets"?

Interesting Probabilities (cont.)

Random variables to be considered:
- The non-terminal being expanded, e.g., NP.
- The word-span covered by the non-terminal, e.g., (4,5) refers to the words "the building".

While calculating probabilities, consider:
- The rule to be used for expansion, e.g., NP -> DT NN.
- The probabilities associated with the RHS non-terminals, e.g., the DT subtree's and the NN subtree's inside/outside probabilities.

Outside Probability

alpha_j(p,q): the probability of beginning with N^1 and generating the non-terminal N^j_pq and all words outside w_p ... w_q:

  alpha_j(p,q) = P( w_1(p-1), N^j_pq, w_(q+1)m | G )

  w1 ... w_(p-1)  [ N^j : w_p ... w_q ]  w_(q+1) ... w_m   (under start symbol N^1)

Inside Probability

beta_j(p,q): the probability of generating the words w_p ... w_q starting with the non-terminal N^j_pq:

  beta_j(p,q) = P( w_pq | N^j_pq, G )

Outside & Inside Probabilities: example

The gunman sprayed the building with bullets
1          2        3       4    5         6     7

For the NP covering words 4-5 ("the building"):

  alpha_NP(4,5) = P( The gunman sprayed, NP_4,5, with bullets | G )
  beta_NP(4,5)  = P( the building | NP_4,5, G )
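The inside probabilities for this sentence can be computed bottom-up, CYK-style, from the PCFG given earlier. A minimal sketch in Python (0-based word indices, with NP -> NNS handled as a unary rule):

```python
from collections import defaultdict

# PCFG from the slides; lexical rules map (tag, word) to a probability.
LEXICAL = {('DT', 'the'): 1.0, ('NN', 'gunman'): 0.5, ('NN', 'building'): 0.5,
           ('VBD', 'sprayed'): 1.0, ('NNS', 'bullets'): 1.0, ('P', 'with'): 1.0}
UNARY = {('NP', 'NNS'): 0.3}                      # NP -> NNS
BINARY = {('S', ('NP', 'VP')): 1.0, ('NP', ('DT', 'NN')): 0.5,
          ('NP', ('NP', 'PP')): 0.2, ('PP', ('P', 'NP')): 1.0,
          ('VP', ('VP', 'PP')): 0.6, ('VP', ('VBD', 'NP')): 0.4}

def inside(words):
    """beta[(A, i, k)] = P(words[i:k] | A), computed CYK-style."""
    n = len(words)
    beta = defaultdict(float)
    for i, w in enumerate(words):                 # width-1 spans: lexical rules
        for (tag, word), p in LEXICAL.items():
            if word == w:
                beta[(tag, i, i + 1)] += p
        for (a, b), p in UNARY.items():           # e.g. NP -> NNS over "bullets"
            beta[(a, i, i + 1)] += p * beta[(b, i, i + 1)]
    for width in range(2, n + 1):                 # wider spans: binary rules
        for i in range(n - width + 1):
            k = i + width
            for j in range(i + 1, k):
                for (a, (b, c)), p in BINARY.items():
                    beta[(a, i, k)] += p * beta[(b, i, j)] * beta[(c, j, k)]
            for (a, b), p in UNARY.items():
                beta[(a, i, k)] += p * beta[(b, i, k)]
    return beta

beta = inside("the gunman sprayed the building with bullets".split())
print(beta[('S', 0, 7)])   # about 0.006 = P(t1) + P(t2)
```

The inside probability of S over the whole sentence sums over both parses, giving the total sentence probability P(w_1m | G) discussed in the HMM ↔ PCFG analogy.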
