Parsing with Context Free Grammars
CSC 9010 Natural Language Processing
Paula Matuszek and Mary-Angela Papalaskari
This slide set was adapted from:
• Jim Martin (after Dan Jurafsky), University of Colorado
• Rada Mihalcea, University of North Texas, http://www.cs.unt.edu/~rada/CSCE5290/
• Robert Berwick, MIT
• Bonnie Dorr, University of Maryland
Parsing
Mapping from strings to structured representations.

• Parsing with CFGs refers to the task of assigning correct trees to input strings.
• "Correct" here means a tree that covers all and only the elements of the input and has an S at the top.
• It doesn't actually mean that the system can select the correct tree from among the possible trees.
• As with everything of interest, parsing involves a search, which involves the making of choices.
• We'll start with some basic methods before moving on to more complex ones.
Programming languages
max = min = grade;
// Read and process the rest of the grades
while (grade >= 0) {
    count++;
    sum += grade;
    if (grade > max)
        max = grade;
    else if (grade < min)
        min = grade;
    System.out.print("Enter the next grade (-1 to quit): ");
    grade = Keyboard.readInt();
}
• Easy to parse.
• Designed that way.
Natural Languages
max = min = grade; Read and process the rest of the grades while (grade >= 0) { count++; sum += grade; if (grade > max) max = grade; else if (grade < min) min = grade; System.out.print("Enter the next grade (-1 to quit): "); grade = Keyboard.readInt(); }

• No ( ) [ ] to indicate scope and precedence.
• Lots of overloading (arity varies).
• The grammar isn't known in advance.
• Context-free grammar is not the best formalism.
Some assumptions
• You have all the words already, in some buffer.
• The input isn't POS-tagged.
• We won't worry about morphological analysis.
• All the words are known.
Top-Down Parsing
• Since we're trying to find trees rooted with an S (sentences), start with the rules that give us an S.
• Then work your way down from there to the words.
Top Down Space

[Figure: the top-down search space]
Bottom-Up Parsing
• Of course, we also want trees that cover the input words. So start with trees that link up with the words in the right way.
• Then work your way up from there.
Bottom-Up Space

[Figure: the bottom-up search space]
Top-Down vs. Bottom-Up

• Top-down
– Only searches for trees that can be answers
– But suggests trees that are not consistent with the words
– Guarantees that the tree starts with S as root
– Does not guarantee that the tree will match the input words
• Bottom-up
– Only forms trees consistent with the words
– But suggests trees that make no sense globally
– Guarantees that the tree matches the input words
– Does not guarantee that the parse tree will lead to S as root
• Combine the advantages of the two by doing a search constrained from both sides (top and bottom).
Top-Down, Depth-First, Left-to-Right Search

[Figure slides: the search tree for an example sentence is expanded step by step across several "Example (cont'd)" slides]

Bottom-Up Filtering

[Figure slide]
Possible Problem: Left-Recursion

What happens in the following situation?

S -> NP VP
S -> Aux NP VP
NP -> NP PP
NP -> Det Nominal
…

with a sentence starting with:

Did the flight…
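To see the loop concretely, here is a minimal sketch (Python; the grammar fragment is from the slide, the depth cap and function names are illustrative) of a naive depth-first expander that never consumes a word:

# Why naive top-down parsing loops on left recursion: expanding NP tries
# NP -> NP PP first, which begins by expanding NP again, so no word is
# ever consumed. A depth cap makes the infinite descent visible.
GRAMMAR = {
    "S":  [["NP", "VP"], ["Aux", "NP", "VP"]],
    "NP": [["NP", "PP"], ["Det", "Nominal"]],
}

def expand(symbol, depth=0, cap=8):
    indent = "  " * depth
    print(f"{indent}expanding {symbol}")
    if depth >= cap:
        print(f"{indent}...gave up: still expanding {symbol}, no input consumed")
        return
    rules = GRAMMAR.get(symbol)
    if rules:                        # depth-first: always tries the first rule first
        expand(rules[0][0], depth + 1, cap)
    # else: a terminal/POS symbol would be matched against the input here

expand("S")   # S -> NP VP, then NP -> NP PP, then NP -> NP PP, ...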
Solution: Rule Ordering

S -> Aux NP VP
S -> NP VP
NP -> Det Nominal
NP -> NP PP

The key for the NP rules is that you want the recursive option after any base case.
Avoiding Repeated Work
Parsing is hard and slow. It's wasteful to redo stuff over and over and over.

Consider an attempt to top-down parse the following as an NP:
A flight from Indianapolis to Houston on TWA
[Figure slides: the same partial trees, e.g. for "flight", are rebuilt again and again]
Dynamic Programming
• We need a method that fills a table with partial results, one that:
– does not do (avoidable) repeated work;
– does not fall prey to left-recursion;
– solves an exponential problem in (approximately) polynomial time.
Earley Parsing
• Fills a table in a single sweep over the input words.
• The table is length N+1, where N is the number of words.
• Table entries represent:
– completed constituents and their locations;
– in-progress constituents;
– predicted constituents.
States
The table entries are called states, and are represented with dotted rules:

S -> · VP              a VP is predicted
NP -> Det · Nominal    an NP is in progress
VP -> V NP ·           a VP has been found
States/Locations

It would be nice to know where these things are in the input, so…

S -> · VP [0,0]              A VP is predicted at the start of the sentence.
NP -> Det · Nominal [1,2]    An NP is in progress; the Det goes from 1 to 2.
VP -> V NP · [0,3]           A VP has been found starting at 0 and ending at 3.
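One natural encoding of these dotted rules with spans, as a sketch (Python; the class and field names are illustrative, not from the slides):

from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class State:
    """A dotted rule with its span: lhs -> rhs[:dot] . rhs[dot:], [start, end]."""
    lhs: str
    rhs: Tuple[str, ...]
    dot: int          # position of the dot within rhs
    start: int        # input position where this constituent begins
    end: int          # input position the parse has reached so far

    def complete(self) -> bool:
        return self.dot == len(self.rhs)

    def next_symbol(self) -> str:
        return self.rhs[self.dot]

    def __str__(self) -> str:
        syms = list(self.rhs)
        syms.insert(self.dot, ".")
        return f"{self.lhs} -> {' '.join(syms)} [{self.start},{self.end}]"

# The three example states from the slide:
print(State("S",  ("VP",),            0, 0, 0))  # S -> . VP [0,0]: predicted
print(State("NP", ("Det", "Nominal"), 1, 1, 2))  # NP -> Det . Nominal [1,2]: in progress
print(State("VP", ("V", "NP"),        2, 0, 3))  # VP -> V NP . [0,3]: found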
Graphically

[Figure: the example states drawn as edges over the input]
Earley
• As with most dynamic programming approaches, the answer is found by looking in the table in the right place.
• In this case, there should be an S state in the final column that spans from 0 to N and is complete.
• If that's the case, you're done:
– S -> α · [0,N]
• So sweep through the table from 0 to N…
– New predicted states are created by states in the current chart.
– New incomplete states are created by advancing existing states as new constituents are discovered.
– New complete states are created in the same way.
Earley
• More specifically…
1. Predict all the states you can up front.
2. Read a word.
3. Extend states based on matches.
4. Add new predictions.
5. Go to step 2.
6. At the last chart entry (index N), see if you have a winner.
Earley and Left Recursion
• So, Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the search.
– Never place a state into the chart that's already there.
– Copy states before advancing them.

S -> NP VP
NP -> NP PP

• The first rule predicts S -> · NP VP [0,0], which adds NP -> · NP PP [0,0] and stops there, since adding the same prediction again would be fruitless.
• When a state gets advanced, make a copy and leave the original alone.
– Say we have NP -> · NP PP [0,0].
– We find an NP from 0 to 2, so we create NP -> NP · PP [0,2].
– But we leave the original state as is.
Predictor
Given a state
– with a non-terminal to the right of the dot
– that is not a part-of-speech category:
• create a new state for each expansion of the non-terminal;
• place these new states into the same chart entry as the generating state, beginning and ending where the generating state ends.

So the Predictor, looking at

S -> · VP [0,0]

results in

VP -> · Verb [0,0]
VP -> · Verb NP [0,0]
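A minimal sketch of the Predictor (Python; the (lhs, rhs, dot, start, end) tuple encoding and all names are illustrative):

# The Predictor expands a non-terminal to the right of the dot; the new
# states begin and end where the generating state ends.
GRAMMAR = {"VP": [("Verb",), ("Verb", "NP")]}
POS_CATEGORIES = {"Verb", "Noun", "Det", "Aux"}   # handled by the Scanner, not here

def predictor(state, chart):
    lhs, rhs, dot, start, end = state
    next_sym = rhs[dot]                   # non-terminal to the right of the dot
    if next_sym in POS_CATEGORIES:
        return                            # part-of-speech: the Scanner's job
    for expansion in GRAMMAR[next_sym]:
        chart[end].append((next_sym, expansion, 0, end, end))

chart = [[], []]
predictor(("S", ("VP",), 0, 0, 0), chart)   # S -> . VP [0,0]
for s in chart[0]:
    print(s)   # ('VP', ('Verb',), 0, 0, 0) and ('VP', ('Verb', 'NP'), 0, 0, 0)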
Scanner
Given a state
– with a non-terminal to the right of the dot
– that is a part-of-speech category:
• if the next word in the input matches this part of speech, create a new state with the dot moved over the non-terminal;
• insert it in the next chart entry.

So the Scanner, looking at

VP -> · Verb NP [0,0]

if the next word, "book", can be a verb, adds the new state

VP -> Verb · NP [0,1]

to the chart entry following the current one.

Note: the Earley algorithm uses top-down input to disambiguate POS. Only a POS predicted by some state can get added to the chart.
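A matching sketch of the Scanner, following this slide's formulation, which moves the dot directly over the POS symbol (the worked chart later instead adds a separate lexical state such as N -> I ·; both formulations are standard):

# The Scanner consumes one word: the dot moves over the POS symbol and
# the new state lands in the *next* chart entry.
LEXICON = {"book": {"Verb", "Noun"}, "that": {"Det"}, "flight": {"Noun"}}

def scanner(state, words, chart):
    lhs, rhs, dot, start, end = state
    pos = rhs[dot]                        # part-of-speech category after the dot
    if end < len(words) and pos in LEXICON.get(words[end], ()):
        chart[end + 1].append((lhs, rhs, dot + 1, start, end + 1))

words = ["book", "that", "flight"]
chart = [[] for _ in range(len(words) + 1)]
scanner(("VP", ("Verb", "NP"), 0, 0, 0), words, chart)   # VP -> . Verb NP [0,0]
print(chart[1])   # [('VP', ('Verb', 'NP'), 1, 0, 1)]  i.e. VP -> Verb . NP [0,1]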
Completer
Applied to a state when its dot has reached the right end of the rule: the parser has discovered a category over some span of the input. Find and advance all previous states that were looking for this category:
• copy the state,
• move the dot,
• insert it in the current chart entry.

Given

NP -> Det Nominal · [1,3]
VP -> Verb · NP [0,1]

add

VP -> Verb NP · [0,3]
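And a matching sketch of the Completer (same illustrative tuple encoding):

# The Completer takes a finished constituent and advances every earlier
# state that was waiting for it: copy, move the dot, insert here.
def completer(state, chart):
    lhs, rhs, dot, start, end = state          # complete: dot == len(rhs)
    for old in list(chart[start]):             # states ending where this one starts
        o_lhs, o_rhs, o_dot, o_start, _ = old
        if o_dot < len(o_rhs) and o_rhs[o_dot] == lhs:
            chart[end].append((o_lhs, o_rhs, o_dot + 1, o_start, end))

chart = [[] for _ in range(4)]
chart[1].append(("VP", ("Verb", "NP"), 1, 0, 1))        # VP -> Verb . NP [0,1]
completer(("NP", ("Det", "Nominal"), 2, 1, 3), chart)   # NP -> Det Nominal . [1,3]
print(chart[3])   # [('VP', ('Verb', 'NP'), 2, 0, 3)]  i.e. VP -> Verb NP . [0,3]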
Earley: how do we know we are done?

Find an S state in the final column that spans from 0 to N and is complete:

S -> α · [0,N]
Earley
So sweep through the table from 0 to N…

• New predicted states are created by starting top-down from S.
• New incomplete states are created by advancing existing states as new constituents are discovered.
• New complete states are created in the same way.
Earley
More specifically…
1. Predict all the states you can up front.
2. Read a word.
3. Extend states based on matches.
4. Add new predictions.
5. Go to step 2.
6. At the last chart entry (index N), see if you have a winner.
Example
Book that flight

We should find… an S from 0 to 3 that is a completed state…
Example (cont'd)

[Figure slides: the chart for "Book that flight" is filled in step by step]
A simple example
Chart[0]
γ → · S [0,0] (dummy start state)
S → · NP VP [0,0] (predictor)
NP → · N [0,0] (predictor)

Chart[1]
N → I · [0,1] (scanner)
NP → N · [0,1] (completer)
S → NP · VP [0,1] (completer)
VP → · V NP [1,1] (predictor)

Chart[2]
V → saw · [1,2] (scanner)
VP → V · NP [1,2] (completer)
NP → · N [2,2] (predictor)

Chart[3]
N → Mary · [2,3] (scanner)
NP → N · [2,3] (completer)
VP → V NP · [1,3] (completer)
S → NP VP · [0,3] (completer)

Grammar: S → NP VP; NP → N; VP → V NP
Lexicon: N → I | saw | Mary; V → saw
Input: I saw Mary

Sentence accepted.
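Putting the three operators together reproduces this chart. The following is an illustrative sketch (not the lecture's reference code); it follows the worked example's style, where the Scanner adds a separate lexical state such as N → I ·:

# Minimal Earley recognizer for the toy grammar above (a sketch).
# States are (lhs, rhs, dot, start, end) tuples; chart[i] holds states ending at i.
GRAMMAR = {"S": [("NP", "VP")], "NP": [("N",)], "VP": [("V", "NP")]}
LEXICON = {"I": {"N"}, "saw": {"N", "V"}, "Mary": {"N"}}
POS = {"N", "V"}

def recognize(words):
    chart = [[] for _ in range(len(words) + 1)]
    def add(col, state):
        if state not in chart[col]:        # never add a state that's already there
            chart[col].append(state)
    add(0, ("GAMMA", ("S",), 0, 0, 0))     # dummy start state (γ in the slide)
    for col in range(len(words) + 1):
        for state in chart[col]:           # the column grows while we walk it
            lhs, rhs, dot, start, end = state
            if dot == len(rhs):                                      # COMPLETER
                for o_lhs, o_rhs, o_dot, o_start, _ in list(chart[start]):
                    if o_dot < len(o_rhs) and o_rhs[o_dot] == lhs:
                        add(col, (o_lhs, o_rhs, o_dot + 1, o_start, col))
            elif rhs[dot] in POS:                                    # SCANNER
                if col < len(words) and rhs[dot] in LEXICON.get(words[col], ()):
                    add(col + 1, (rhs[dot], (words[col],), 1, col, col + 1))
            else:                                                    # PREDICTOR
                for expansion in GRAMMAR[rhs[dot]]:
                    add(col, (rhs[dot], expansion, 0, col, col))
    return ("GAMMA", ("S",), 1, 0, len(words)) in chart[len(words)]

print(recognize("I saw Mary".split()))     # True: sentence accepted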
What is it?

What kind of parser did we just describe? (A trick question.)
– An Earley parser… yes.
– But not a parser: a recognizer!

The presence of an S state with the right attributes in the right place indicates a successful recognition.

But no parse tree… no parser. That's how we solve (not) an exponential problem in polynomial time.
Converting Earley from Recognizer to Parser

With the addition of a few pointers, we have a parser.
Augment the "Completer" to point to where we came from.
Augmenting the chart with structural information

[Figure: chart states S8–S13, with the Completer's backpointers drawn in]
Retrieving Parse Trees from Chart
All the possible parses for an input are in the table; we just need to read off all the backpointers from every complete S in the last column of the table:
• Find all the S -> α · [0,N] states.
• Follow the structural traces from the Completer.
Of course, this won't be polynomial time, since there could be an exponential number of trees.
But we can at least represent ambiguity efficiently.
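The slides don't fix a data structure for these pointers. One common sketch: give each state a kids list holding the completed states that advanced its dot, then read trees off recursively (Python; the sixth tuple field and all names are illustrative):

# States grow a sixth field, `kids`: the completed states that advanced
# this state's dot (the Completer's backpointers).
def completer_with_pointers(complete_state, chart):
    lhs, rhs, dot, start, end, kids = complete_state
    for old in list(chart[start]):
        o_lhs, o_rhs, o_dot, o_start, _, o_kids = old
        if o_dot < len(o_rhs) and o_rhs[o_dot] == lhs:
            # Copy, advance the dot, and record *which* constituent did it.
            chart[end].append(
                (o_lhs, o_rhs, o_dot + 1, o_start, end, o_kids + [complete_state]))

def tree(state):
    lhs, rhs, dot, start, end, kids = state
    # Leaves (scanned words) have no kids; their rhs is the word itself.
    return (lhs, [tree(k) for k in kids] if kids else list(rhs))

n  = ("N",  ("Mary",), 1, 2, 3, [])     # N -> Mary . [2,3]
np = ("NP", ("N",),    1, 2, 3, [n])    # NP -> N . [2,3], pointing back at n
print(tree(np))   # ('NP', [('N', ['Mary'])])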
Earley and Left Recursion
Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the search:
• Never place a state into the chart that's already there.
• Copy states before advancing them.
Earley and Left Recursion 1
S -> NP VP
NP -> NP PP

The Predictor, given the first rule:

S -> · NP VP [0,0]

predicts

NP -> · NP PP [0,0]

and stops there, since predicting the same state again would be redundant.
Earley and Left Recursion 2
When a state gets advanced, make a copy and leave the original alone…
• Say we have NP -> · NP PP [0,0].
• We find an NP from 0 to 2, so we create NP -> NP · PP [0,2].
• But we leave the original state as is.
Dynamic Programming Approaches
Earley
• Top-down, no filtering, no restriction on grammar form.
CYK
• Bottom-up, no filtering, grammars restricted to Chomsky Normal Form (CNF).
The details are not important…
• Bottom-up vs. top-down
• With or without filters
• With restrictions on grammar form or not
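The slide leaves CYK's details aside; purely for contrast, here is a minimal CYK sketch over a CNF version of the toy grammar (Python; the grammar tables are illustrative):

# CYK over a CNF grammar: every rule is A -> B C or A -> word.
from itertools import product

BINARY  = {("NP", "VP"): {"S"}, ("V", "NP"): {"VP"}}
LEXICAL = {"I": {"NP"}, "saw": {"V"}, "Mary": {"NP"}}

def cyk(words):
    n = len(words)
    # table[i][j] = set of non-terminals deriving words[i:j]
    table = [[set() for _ in range(n + 1)] for _ in range(n)]
    for i, w in enumerate(words):
        table[i][i + 1] = set(LEXICAL.get(w, ()))
    for width in range(2, n + 1):                  # bottom-up, by span width
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):              # every split point
                for b, c in product(table[i][k], table[k][j]):
                    table[i][j] |= BINARY.get((b, c), set())
    return "S" in table[0][n]

print(cyk("I saw Mary".split()))   # True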
Parsing
Mapping from strings to structured representation
bull Parsing with CFGs refers to the task of assigning correct trees to input strings
bull Correct here means a tree that covers all and only the elements of the input and has an S at the top
bull It doesnrsquot actually mean that the system can select the correct tree from among the possible trees
bull As with everything of interest parsing involves a search which involves the making of choices
bull Wersquoll start with some basic methods before moving on to more complex ones
Slide 1
Programming languages
max = min = grade
Read and process the rest of the grades
while (grade gt= 0)
count++
sum += grade
if (grade gt max)
max = grade
else
if (grade lt min)
min = grade
Systemoutprint (Enter the next grade (-1 to quit) )
grade = KeyboardreadInt ()
bull Easy to parsebull Designed that way
Slide 1
Natural Languages
max = min = grade Read and process the rest of the grades while (grade gt= 0)
count++ sum += grade if (grade gt max) max = grade else if (grade lt min) min =
grade Systemoutprint (Enter the next grade (-1 to quit) ) grade =
KeyboardreadInt ()
bull No ( ) [ ] to indicate scope and precedence
bull Lots of overloading (arity varies)
bull Grammar isnrsquot known in advance
bullContext-free grammar is not the best formalism
Slide 1
Some assumptions
bull You have all the words already in some buffer
bull The input isnrsquot pos tagged
bull We wonrsquot worry about morphological analysis
bull All the words are known
Slide 1
Top-Down Parsing
bull Since wersquore trying to find trees rooted with an S (Sentences) start with the rules that give us an S
bull Then work your way down from there to the words
Slide 1
Top Down Space
Slide 1
Bottom-Up Parsing
bull Of course we also want trees that cover the input words So start with trees that link up with the words in the right way
bull Then work your way up from there
Slide 1
Bottom-Up Space
Slide 1
Top-Down VS Bottom-Up
bull Top-downndash Only searches for trees that can be answersndash But suggests trees that are not consistent with the wordsndash Guarantees that tree starts with S as rootndash Does not guarantee that tree will match input words
bull Bottom-upndash Only forms trees consistent with the wordsndash Suggest trees that make no sense globallyndash Guarantees that tree matches input wordsndash Does not guarantee that parse tree will lead to S as a root
bull Combine the advantages of the two by doing a search constrained from both sides (top and bottom)
Slide 1
Top-Down Depth-First Left-to-Right Search
Slide 1
Example (contrsquod)
Slide 1
Example (contrsquod)
flight flight
Slide 1
Example (contrsquod)
flightflight
Slide 1
Bottom-Up Filtering
Slide 1
Possible Problem Left-Recursion
What happens in the following situationS -gt NP VPS -gt Aux NP VPNP -gt NP PPNP -gt Det NominalhellipWith the sentence starting with
Did the flighthellip
Slide 1
Solution Rule Ordering
S -gt Aux NP VPS -gt NP VPNP -gt Det NominalNP -gt NP PP
The key for the NP is that you want the recursive option after any base case
Slide 1
Avoiding Repeated Work
Parsing is hard and slow Itrsquos wasteful to redo stuff over and over and over
Consider an attempt to top-down parse the following as an NP
A flight from Indianapolis to Houston on TWA
Slide 1
flight
Slide 1
flight
flight
Slide 1
Slide 1
Slide 1
Dynamic Programming
bull We need a method that fills a table with partial results thatndash Does not do (avoidable) repeated workndash Does not fall prey to left-recursionndash Solves an exponential problem in (approximately) polynomial
time
Slide 1
Earley Parsing
Fills a table in a single sweep over the input wordsTable is length N+1 N is number of wordsTable entries represent
Completed constituents and their locationsIn-progress constituentsPredicted constituents
Slide 1
States
The table-entries are called states and are represented with dotted-rulesS -gt VP A VP is predictedNP -gt Det Nominal An NP is in progressVP -gt V NP A VP has been found
Slide 1
StatesLocations
It would be nice to know where these things are in the input sohellipS -gt VP [00] A VP is predicted at the
start of the sentenceNP -gt Det Nominal [12] An NP is in progress the
Det goes from 1 to 2
VP -gt V NP [03] A VP has been found starting at 0 and ending
at 3
Slide 1
Graphically
Slide 1
Earley
bull As with most dynamic programming approaches the answer is found by looking in the table in the right place
bull In this case there should be an S state in the final column that spans from 0 to n+1 and is complete
bull If thatrsquos the case yoursquore donendash S ndash α [0n+1]
bull So sweep through the table from 0 to n+1hellipndash New predicted states are created by states in current chartndash New incomplete states are created by advancing existing states
as new constituents are discoveredndash New complete states are created in the same way
Slide 1
Earley
bull More specificallyhellipndash Predict all the states you can upfrontndash Read a word
ndash Extend states based on matchesndash Add new predictionsndash Go to 2
ndash Look at N+1 to see if you have a winner
Slide 1
Earley and Left Recursion
bull So Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the search
ndash Never place a state into the chart thatrsquos already therendash Copy states before advancing them
S -gt NP VPNP -gt NP PP
bull The first rule predictsS -gt NP VP [00] that addsNP -gt NP PP [00]stops there since adding any subsequent prediction would be fruitless
bull When a state gets advanced make a copy and leave the original alonendash Say we have NP -gt NP PP [00]ndash We find an NP from 0 to 2 so we create NP -gt NP PP [02]ndash But we leave the original state as is
Slide 1
Predictor
Given a stateWith a non-terminal to right of dotThat is not a part-of-speech categoryCreate a new state for each expansion of the non-terminalPlace these new states into same chart entry as generated state
beginning and ending where generating state ends
So predictor looking at
S -gt VP [00]
results in
VP -gt Verb [00]VP -gt Verb NP [00]
Slide 1
Scanner
Given a stateWith a non-terminal to right of dotThat is a part-of-speech categoryIf the next word in the input matches this part-of-speechndash Create a new state with dot moved over the non-terminalndash insert in next chart entry
So scanner looking at
VP -gt Verb NP [00]
If the next word ldquobookrdquo can be a verb add new state
VP -gt Verb NP [01]
Add this state to chart entry following current one
Note Earley algorithm uses top-down input to disambiguate POS Only POS predicted by some state can get added to chart
Slide 1
Completer
Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this
category
bull copy state bull move dotbull insert in current chart entry
GivenNP -gt Det Nominal [13]VP -gt Verb NP [01]
AddVP -gt Verb NP [03]
Slide 1
Earley how do we know we are done
Find an S state in the final column that spans from 0 to n+1 and is complete
S ndashgt α [0n+1]
Slide 1
Earley
So sweep through the table from 0 to n+1hellip
New predicted states are created by starting top-down from S
New incomplete states are created by advancing existing states as new constituents are discovered
New complete states are created in the same way
Slide 1
Earley
More specificallyhellipPredict all the states you can upfront
Read a wordExtend states based on matchesAdd new predictionsGo to 2
Look at N+1 to see if you have a winner
Slide 1
Example
Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip
Slide 1
Example (contrsquod)
Slide 1
Example (contrsquod)
Slide 1
Example (contrsquod)
Slide 1
A simple example
Chart[0]γ rarr S [00] (dummy start state)S rarr NP VP [00 ] (predictor)NP rarr N [00 ] (predictor)
Chart[1]N rarr I [01 ] (scan)NP rarr N [01 ] (completer)S rarr NP VP [01 ] (completer)VP rarr V NP [11 ] (predictor)
Chart[2]V rarr saw [12 ] (scan) VP rarr V NP [12 ] (complete) NP rarr N [22 ] (predict)
Chart[3]NP rarr N [23 ] (scan)NP rarr N [23 ] (completer)VP rarr V NP [13 ] (completer)S rarr NP VP [03 ] (completer)
Grammar S rarr NP VP NP rarr N VP rarr V NP
Lexicon Nrarr I | saw | Mary Vrarr saw
Input I saw Mary
Sentence accepted
Slide 1
What is it
What kind of parser did we just describe (trick question)Earley parserhellip yesNot a parser ndash a recognizer
The presence of an S state with the right attributes in the right place indicates a successful recognition
But no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time
Slide 1
Converting Earley from Recognizer to ParserWith the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from
Slide 1
Augmenting the chart with structural information
S8S9
S10
S11
S13S12
S8
S9
S8
Slide 1
Retrieving Parse Trees from Chart
All the possible parses for an input are in the tableWe just need to read off all the backpointers from every complete S
in the last column of the tableFind all the S -gt X [0N+1]Follow the structural traces from the CompleterOf course this wonrsquot be polynomial time since there could be an
exponential number of treesSo we can at least represent ambiguity efficiently
Slide 1
Earley and Left Recursion
Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the searchNever place a state into the chart thatrsquos already thereCopy states before advancing them
Slide 1
Earley and Left Recursion 1
S -gt NP VPNP -gt NP PP
Predictor given first ruleS -gt NP VP [00]
PredictsNP -gt NP PP [00]stops there since predicting same again would be redundant
Slide 1
Earley and Left Recursion 2
When a state gets advanced make a copy and leave the original alonehellip
Say we have NP -gt NP PP [00]We find an NP from 0 to 2 so we create
NP -gt NP PP [02]But we leave the original state as is
Slide 1
Dynamic Programming Approaches
EarleyTop-down no filtering no restriction on grammar form
CYKBottom-up no filtering grammars restricted to Chomsky-Normal Form
(CNF)Details are not important
Bottom-up vs top-downWith or without filtersWith restrictions on grammar form or not
Slide 1
Programming languages
max = min = grade
Read and process the rest of the grades
while (grade gt= 0)
count++
sum += grade
if (grade gt max)
max = grade
else
if (grade lt min)
min = grade
Systemoutprint (Enter the next grade (-1 to quit) )
grade = KeyboardreadInt ()
bull Easy to parsebull Designed that way
Slide 1
Natural Languages
max = min = grade Read and process the rest of the grades while (grade gt= 0)
count++ sum += grade if (grade gt max) max = grade else if (grade lt min) min =
grade Systemoutprint (Enter the next grade (-1 to quit) ) grade =
KeyboardreadInt ()
bull No ( ) [ ] to indicate scope and precedence
bull Lots of overloading (arity varies)
bull Grammar isnrsquot known in advance
bullContext-free grammar is not the best formalism
Slide 1
Some assumptions
bull You have all the words already in some buffer
bull The input isnrsquot pos tagged
bull We wonrsquot worry about morphological analysis
bull All the words are known
Slide 1
Top-Down Parsing
bull Since wersquore trying to find trees rooted with an S (Sentences) start with the rules that give us an S
bull Then work your way down from there to the words
Slide 1
Top Down Space
Slide 1
Bottom-Up Parsing
bull Of course we also want trees that cover the input words So start with trees that link up with the words in the right way
bull Then work your way up from there
Slide 1
Bottom-Up Space
Slide 1
Top-Down VS Bottom-Up
bull Top-downndash Only searches for trees that can be answersndash But suggests trees that are not consistent with the wordsndash Guarantees that tree starts with S as rootndash Does not guarantee that tree will match input words
bull Bottom-upndash Only forms trees consistent with the wordsndash Suggest trees that make no sense globallyndash Guarantees that tree matches input wordsndash Does not guarantee that parse tree will lead to S as a root
bull Combine the advantages of the two by doing a search constrained from both sides (top and bottom)
Slide 1
Top-Down Depth-First Left-to-Right Search
Slide 1
Example (contrsquod)
Slide 1
Example (contrsquod)
flight flight
Slide 1
Example (contrsquod)
flightflight
Slide 1
Bottom-Up Filtering
Slide 1
Possible Problem Left-Recursion
What happens in the following situationS -gt NP VPS -gt Aux NP VPNP -gt NP PPNP -gt Det NominalhellipWith the sentence starting with
Did the flighthellip
Slide 1
Solution Rule Ordering
S -gt Aux NP VPS -gt NP VPNP -gt Det NominalNP -gt NP PP
The key for the NP is that you want the recursive option after any base case
Slide 1
Avoiding Repeated Work
Parsing is hard and slow Itrsquos wasteful to redo stuff over and over and over
Consider an attempt to top-down parse the following as an NP
A flight from Indianapolis to Houston on TWA
Slide 1
flight
Slide 1
flight
flight
Slide 1
Slide 1
Slide 1
Dynamic Programming
bull We need a method that fills a table with partial results thatndash Does not do (avoidable) repeated workndash Does not fall prey to left-recursionndash Solves an exponential problem in (approximately) polynomial
time
Slide 1
Earley Parsing
Fills a table in a single sweep over the input wordsTable is length N+1 N is number of wordsTable entries represent
Completed constituents and their locationsIn-progress constituentsPredicted constituents
Slide 1
States
The table-entries are called states and are represented with dotted-rulesS -gt VP A VP is predictedNP -gt Det Nominal An NP is in progressVP -gt V NP A VP has been found
Slide 1
StatesLocations
It would be nice to know where these things are in the input sohellipS -gt VP [00] A VP is predicted at the
start of the sentenceNP -gt Det Nominal [12] An NP is in progress the
Det goes from 1 to 2
VP -gt V NP [03] A VP has been found starting at 0 and ending
at 3
Slide 1
Graphically
Slide 1
Earley
bull As with most dynamic programming approaches the answer is found by looking in the table in the right place
bull In this case there should be an S state in the final column that spans from 0 to n+1 and is complete
bull If thatrsquos the case yoursquore donendash S ndash α [0n+1]
bull So sweep through the table from 0 to n+1hellipndash New predicted states are created by states in current chartndash New incomplete states are created by advancing existing states
as new constituents are discoveredndash New complete states are created in the same way
Slide 1
Earley
bull More specificallyhellipndash Predict all the states you can upfrontndash Read a word
ndash Extend states based on matchesndash Add new predictionsndash Go to 2
ndash Look at N+1 to see if you have a winner
Slide 1
Earley and Left Recursion
bull So Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the search
ndash Never place a state into the chart thatrsquos already therendash Copy states before advancing them
S -gt NP VPNP -gt NP PP
bull The first rule predictsS -gt NP VP [00] that addsNP -gt NP PP [00]stops there since adding any subsequent prediction would be fruitless
bull When a state gets advanced make a copy and leave the original alonendash Say we have NP -gt NP PP [00]ndash We find an NP from 0 to 2 so we create NP -gt NP PP [02]ndash But we leave the original state as is
Slide 1
Predictor
Given a stateWith a non-terminal to right of dotThat is not a part-of-speech categoryCreate a new state for each expansion of the non-terminalPlace these new states into same chart entry as generated state
beginning and ending where generating state ends
So predictor looking at
S -gt VP [00]
results in
VP -gt Verb [00]VP -gt Verb NP [00]
Slide 1
Scanner
Given a stateWith a non-terminal to right of dotThat is a part-of-speech categoryIf the next word in the input matches this part-of-speechndash Create a new state with dot moved over the non-terminalndash insert in next chart entry
So scanner looking at
VP -gt Verb NP [00]
If the next word ldquobookrdquo can be a verb add new state
VP -gt Verb NP [01]
Add this state to chart entry following current one
Note Earley algorithm uses top-down input to disambiguate POS Only POS predicted by some state can get added to chart
Slide 1
Completer
Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this
category
bull copy state bull move dotbull insert in current chart entry
GivenNP -gt Det Nominal [13]VP -gt Verb NP [01]
AddVP -gt Verb NP [03]
Slide 1
Earley how do we know we are done
Find an S state in the final column that spans from 0 to n+1 and is complete
S ndashgt α [0n+1]
Slide 1
Earley
So sweep through the table from 0 to n+1hellip
New predicted states are created by starting top-down from S
New incomplete states are created by advancing existing states as new constituents are discovered
New complete states are created in the same way
Slide 1
Earley
More specificallyhellipPredict all the states you can upfront
Read a wordExtend states based on matchesAdd new predictionsGo to 2
Look at N+1 to see if you have a winner
Slide 1
Example
Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip
Slide 1
Example (contrsquod)
Slide 1
Example (contrsquod)
Slide 1
Example (contrsquod)
Slide 1
A simple example
Chart[0]γ rarr S [00] (dummy start state)S rarr NP VP [00 ] (predictor)NP rarr N [00 ] (predictor)
Chart[1]N rarr I [01 ] (scan)NP rarr N [01 ] (completer)S rarr NP VP [01 ] (completer)VP rarr V NP [11 ] (predictor)
Chart[2]V rarr saw [12 ] (scan) VP rarr V NP [12 ] (complete) NP rarr N [22 ] (predict)
Chart[3]NP rarr N [23 ] (scan)NP rarr N [23 ] (completer)VP rarr V NP [13 ] (completer)S rarr NP VP [03 ] (completer)
Grammar S rarr NP VP NP rarr N VP rarr V NP
Lexicon Nrarr I | saw | Mary Vrarr saw
Input I saw Mary
Sentence accepted
Slide 1
What is it
What kind of parser did we just describe (trick question)Earley parserhellip yesNot a parser ndash a recognizer
The presence of an S state with the right attributes in the right place indicates a successful recognition
But no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time
Slide 1
Converting Earley from Recognizer to ParserWith the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from
Slide 1
Augmenting the chart with structural information
S8S9
S10
S11
S13S12
S8
S9
S8
Slide 1
Retrieving Parse Trees from Chart
All the possible parses for an input are in the tableWe just need to read off all the backpointers from every complete S
in the last column of the tableFind all the S -gt X [0N+1]Follow the structural traces from the CompleterOf course this wonrsquot be polynomial time since there could be an
exponential number of treesSo we can at least represent ambiguity efficiently
Slide 1
Earley and Left Recursion
Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the searchNever place a state into the chart thatrsquos already thereCopy states before advancing them
Slide 1
Earley and Left Recursion 1
S -gt NP VPNP -gt NP PP
Predictor given first ruleS -gt NP VP [00]
PredictsNP -gt NP PP [00]stops there since predicting same again would be redundant
Slide 1
Earley and Left Recursion 2
When a state gets advanced make a copy and leave the original alonehellip
Say we have NP -gt NP PP [00]We find an NP from 0 to 2 so we create
NP -gt NP PP [02]But we leave the original state as is
Slide 1
Dynamic Programming Approaches
EarleyTop-down no filtering no restriction on grammar form
CYKBottom-up no filtering grammars restricted to Chomsky-Normal Form
(CNF)Details are not important
Bottom-up vs top-downWith or without filtersWith restrictions on grammar form or not
Slide 1
Natural Languages
max = min = grade Read and process the rest of the grades while (grade gt= 0)
count++ sum += grade if (grade gt max) max = grade else if (grade lt min) min =
grade Systemoutprint (Enter the next grade (-1 to quit) ) grade =
KeyboardreadInt ()
bull No ( ) [ ] to indicate scope and precedence
bull Lots of overloading (arity varies)
bull Grammar isnrsquot known in advance
bullContext-free grammar is not the best formalism
Slide 1
Some assumptions
bull You have all the words already in some buffer
bull The input isnrsquot pos tagged
bull We wonrsquot worry about morphological analysis
bull All the words are known
Slide 1
Top-Down Parsing
bull Since wersquore trying to find trees rooted with an S (Sentences) start with the rules that give us an S
bull Then work your way down from there to the words
Slide 1
Top Down Space
Slide 1
Bottom-Up Parsing
bull Of course we also want trees that cover the input words So start with trees that link up with the words in the right way
bull Then work your way up from there
Slide 1
Bottom-Up Space
Slide 1
Top-Down VS Bottom-Up
bull Top-downndash Only searches for trees that can be answersndash But suggests trees that are not consistent with the wordsndash Guarantees that tree starts with S as rootndash Does not guarantee that tree will match input words
bull Bottom-upndash Only forms trees consistent with the wordsndash Suggest trees that make no sense globallyndash Guarantees that tree matches input wordsndash Does not guarantee that parse tree will lead to S as a root
bull Combine the advantages of the two by doing a search constrained from both sides (top and bottom)
Slide 1
Top-Down Depth-First Left-to-Right Search
Slide 1
Example (contrsquod)
Slide 1
Example (contrsquod)
flight flight
Slide 1
Example (contrsquod)
flightflight
Slide 1
Bottom-Up Filtering
Slide 1
Possible Problem Left-Recursion
What happens in the following situationS -gt NP VPS -gt Aux NP VPNP -gt NP PPNP -gt Det NominalhellipWith the sentence starting with
Did the flighthellip
Slide 1
Solution Rule Ordering
S -gt Aux NP VPS -gt NP VPNP -gt Det NominalNP -gt NP PP
The key for the NP is that you want the recursive option after any base case
Slide 1
Avoiding Repeated Work
Parsing is hard and slow Itrsquos wasteful to redo stuff over and over and over
Consider an attempt to top-down parse the following as an NP
A flight from Indianapolis to Houston on TWA
Slide 1
flight
Slide 1
flight
flight
Slide 1
Slide 1
Slide 1
Dynamic Programming
bull We need a method that fills a table with partial results thatndash Does not do (avoidable) repeated workndash Does not fall prey to left-recursionndash Solves an exponential problem in (approximately) polynomial
time
Slide 1
Earley Parsing
Fills a table in a single sweep over the input wordsTable is length N+1 N is number of wordsTable entries represent
Completed constituents and their locationsIn-progress constituentsPredicted constituents
Slide 1
States
The table-entries are called states and are represented with dotted-rulesS -gt VP A VP is predictedNP -gt Det Nominal An NP is in progressVP -gt V NP A VP has been found
Slide 1
StatesLocations
It would be nice to know where these things are in the input sohellipS -gt VP [00] A VP is predicted at the
start of the sentenceNP -gt Det Nominal [12] An NP is in progress the
Det goes from 1 to 2
VP -gt V NP [03] A VP has been found starting at 0 and ending
at 3
Slide 1
Graphically
Slide 1
Earley
bull As with most dynamic programming approaches the answer is found by looking in the table in the right place
bull In this case there should be an S state in the final column that spans from 0 to n+1 and is complete
bull If thatrsquos the case yoursquore donendash S ndash α [0n+1]
bull So sweep through the table from 0 to n+1hellipndash New predicted states are created by states in current chartndash New incomplete states are created by advancing existing states
as new constituents are discoveredndash New complete states are created in the same way
Slide 1
Earley
bull More specificallyhellipndash Predict all the states you can upfrontndash Read a word
ndash Extend states based on matchesndash Add new predictionsndash Go to 2
ndash Look at N+1 to see if you have a winner
Slide 1
Earley and Left Recursion
bull So Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the search
ndash Never place a state into the chart thatrsquos already therendash Copy states before advancing them
S -gt NP VPNP -gt NP PP
bull The first rule predictsS -gt NP VP [00] that addsNP -gt NP PP [00]stops there since adding any subsequent prediction would be fruitless
bull When a state gets advanced make a copy and leave the original alonendash Say we have NP -gt NP PP [00]ndash We find an NP from 0 to 2 so we create NP -gt NP PP [02]ndash But we leave the original state as is
Slide 1
Predictor
Given a stateWith a non-terminal to right of dotThat is not a part-of-speech categoryCreate a new state for each expansion of the non-terminalPlace these new states into same chart entry as generated state
beginning and ending where generating state ends
So predictor looking at
S -gt VP [00]
results in
VP -gt Verb [00]VP -gt Verb NP [00]
Slide 1
Scanner
Given a stateWith a non-terminal to right of dotThat is a part-of-speech categoryIf the next word in the input matches this part-of-speechndash Create a new state with dot moved over the non-terminalndash insert in next chart entry
So scanner looking at
VP -gt Verb NP [00]
If the next word ldquobookrdquo can be a verb add new state
VP -gt Verb NP [01]
Add this state to chart entry following current one
Note Earley algorithm uses top-down input to disambiguate POS Only POS predicted by some state can get added to chart
Slide 1
Completer
Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this
category
bull copy state bull move dotbull insert in current chart entry
GivenNP -gt Det Nominal [13]VP -gt Verb NP [01]
AddVP -gt Verb NP [03]
Slide 1
Earley how do we know we are done
Find an S state in the final column that spans from 0 to n+1 and is complete
S ndashgt α [0n+1]
Slide 1
Earley
So sweep through the table from 0 to n+1hellip
New predicted states are created by starting top-down from S
New incomplete states are created by advancing existing states as new constituents are discovered
New complete states are created in the same way
Slide 1
Earley
More specificallyhellipPredict all the states you can upfront
Read a wordExtend states based on matchesAdd new predictionsGo to 2
Look at N+1 to see if you have a winner
Slide 1
Example
Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip
Slide 1
Example (contrsquod)
Slide 1
Example (contrsquod)
Slide 1
Example (contrsquod)
Slide 1
A simple example
Chart[0]γ rarr S [00] (dummy start state)S rarr NP VP [00 ] (predictor)NP rarr N [00 ] (predictor)
Chart[1]N rarr I [01 ] (scan)NP rarr N [01 ] (completer)S rarr NP VP [01 ] (completer)VP rarr V NP [11 ] (predictor)
Chart[2]V rarr saw [12 ] (scan) VP rarr V NP [12 ] (complete) NP rarr N [22 ] (predict)
Chart[3]NP rarr N [23 ] (scan)NP rarr N [23 ] (completer)VP rarr V NP [13 ] (completer)S rarr NP VP [03 ] (completer)
Grammar S rarr NP VP NP rarr N VP rarr V NP
Lexicon Nrarr I | saw | Mary Vrarr saw
Input I saw Mary
Sentence accepted
Slide 1
What is it
What kind of parser did we just describe (trick question)Earley parserhellip yesNot a parser ndash a recognizer
The presence of an S state with the right attributes in the right place indicates a successful recognition
But no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time
Slide 1
Converting Earley from Recognizer to ParserWith the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from
Slide 1
Augmenting the chart with structural information
S8S9
S10
S11
S13S12
S8
S9
S8
Slide 1
Retrieving Parse Trees from Chart
All the possible parses for an input are in the tableWe just need to read off all the backpointers from every complete S
in the last column of the tableFind all the S -gt X [0N+1]Follow the structural traces from the CompleterOf course this wonrsquot be polynomial time since there could be an
exponential number of treesSo we can at least represent ambiguity efficiently
Slide 1
Earley and Left Recursion
Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the searchNever place a state into the chart thatrsquos already thereCopy states before advancing them
Slide 1
Earley and Left Recursion 1
S -gt NP VPNP -gt NP PP
Predictor given first ruleS -gt NP VP [00]
PredictsNP -gt NP PP [00]stops there since predicting same again would be redundant
Slide 1
Earley and Left Recursion 2
When a state gets advanced make a copy and leave the original alonehellip
Say we have NP -gt NP PP [00]We find an NP from 0 to 2 so we create
NP -gt NP PP [02]But we leave the original state as is
Slide 1
Dynamic Programming Approaches
EarleyTop-down no filtering no restriction on grammar form
CYKBottom-up no filtering grammars restricted to Chomsky-Normal Form
(CNF)Details are not important
Bottom-up vs top-downWith or without filtersWith restrictions on grammar form or not
Slide 1
Some assumptions
bull You have all the words already in some buffer
bull The input isnrsquot pos tagged
bull We wonrsquot worry about morphological analysis
bull All the words are known
Slide 1
Top-Down Parsing
bull Since wersquore trying to find trees rooted with an S (Sentences) start with the rules that give us an S
bull Then work your way down from there to the words
Slide 1
Top Down Space
Slide 1
Bottom-Up Parsing
bull Of course we also want trees that cover the input words So start with trees that link up with the words in the right way
bull Then work your way up from there
Slide 1
Bottom-Up Space
Slide 1
Top-Down VS Bottom-Up
bull Top-downndash Only searches for trees that can be answersndash But suggests trees that are not consistent with the wordsndash Guarantees that tree starts with S as rootndash Does not guarantee that tree will match input words
bull Bottom-upndash Only forms trees consistent with the wordsndash Suggest trees that make no sense globallyndash Guarantees that tree matches input wordsndash Does not guarantee that parse tree will lead to S as a root
bull Combine the advantages of the two by doing a search constrained from both sides (top and bottom)
Slide 1
Top-Down Depth-First Left-to-Right Search
Slide 1
Example (contrsquod)
Slide 1
Example (contrsquod)
flight flight
Slide 1
Example (contrsquod)
flightflight
Slide 1
Bottom-Up Filtering
Slide 1
Possible Problem Left-Recursion
What happens in the following situationS -gt NP VPS -gt Aux NP VPNP -gt NP PPNP -gt Det NominalhellipWith the sentence starting with
Did the flighthellip
Slide 1
Solution Rule Ordering
S -gt Aux NP VPS -gt NP VPNP -gt Det NominalNP -gt NP PP
The key for the NP is that you want the recursive option after any base case
Slide 1
Avoiding Repeated Work
Parsing is hard and slow Itrsquos wasteful to redo stuff over and over and over
Consider an attempt to top-down parse the following as an NP
A flight from Indianapolis to Houston on TWA
Slide 1
flight
Slide 1
flight
flight
Slide 1
Slide 1
Slide 1
Dynamic Programming
bull We need a method that fills a table with partial results thatndash Does not do (avoidable) repeated workndash Does not fall prey to left-recursionndash Solves an exponential problem in (approximately) polynomial
time
Slide 1
Earley Parsing
Fills a table in a single sweep over the input wordsTable is length N+1 N is number of wordsTable entries represent
Completed constituents and their locationsIn-progress constituentsPredicted constituents
Slide 1
States
The table-entries are called states and are represented with dotted-rulesS -gt VP A VP is predictedNP -gt Det Nominal An NP is in progressVP -gt V NP A VP has been found
Slide 1
StatesLocations
It would be nice to know where these things are in the input sohellipS -gt VP [00] A VP is predicted at the
start of the sentenceNP -gt Det Nominal [12] An NP is in progress the
Det goes from 1 to 2
VP -gt V NP [03] A VP has been found starting at 0 and ending
at 3
Slide 1
Graphically
Slide 1
Earley
bull As with most dynamic programming approaches the answer is found by looking in the table in the right place
bull In this case there should be an S state in the final column that spans from 0 to n+1 and is complete
bull If thatrsquos the case yoursquore donendash S ndash α [0n+1]
bull So sweep through the table from 0 to n+1hellipndash New predicted states are created by states in current chartndash New incomplete states are created by advancing existing states
as new constituents are discoveredndash New complete states are created in the same way
Slide 1
Earley
bull More specificallyhellipndash Predict all the states you can upfrontndash Read a word
ndash Extend states based on matchesndash Add new predictionsndash Go to 2
ndash Look at N+1 to see if you have a winner
Slide 1
Earley and Left Recursion
bull So Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the search
ndash Never place a state into the chart thatrsquos already therendash Copy states before advancing them
S -gt NP VPNP -gt NP PP
bull The first rule predictsS -gt NP VP [00] that addsNP -gt NP PP [00]stops there since adding any subsequent prediction would be fruitless
bull When a state gets advanced make a copy and leave the original alonendash Say we have NP -gt NP PP [00]ndash We find an NP from 0 to 2 so we create NP -gt NP PP [02]ndash But we leave the original state as is
Slide 1
Predictor
Given a stateWith a non-terminal to right of dotThat is not a part-of-speech categoryCreate a new state for each expansion of the non-terminalPlace these new states into same chart entry as generated state
beginning and ending where generating state ends
So predictor looking at
S -gt VP [00]
results in
VP -gt Verb [00]VP -gt Verb NP [00]
Slide 1
Scanner
Given a stateWith a non-terminal to right of dotThat is a part-of-speech categoryIf the next word in the input matches this part-of-speechndash Create a new state with dot moved over the non-terminalndash insert in next chart entry
So scanner looking at
VP -gt Verb NP [00]
If the next word ldquobookrdquo can be a verb add new state
VP -gt Verb NP [01]
Add this state to chart entry following current one
Note Earley algorithm uses top-down input to disambiguate POS Only POS predicted by some state can get added to chart
Slide 1
Completer
Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this
category
bull copy state bull move dotbull insert in current chart entry
GivenNP -gt Det Nominal [13]VP -gt Verb NP [01]
AddVP -gt Verb NP [03]
Slide 1
Earley how do we know we are done
Find an S state in the final column that spans from 0 to n+1 and is complete
S ndashgt α [0n+1]
Slide 1
Earley
So sweep through the table from 0 to n+1hellip
New predicted states are created by starting top-down from S
New incomplete states are created by advancing existing states as new constituents are discovered
New complete states are created in the same way
Slide 1
Earley
More specificallyhellipPredict all the states you can upfront
Read a wordExtend states based on matchesAdd new predictionsGo to 2
Look at N+1 to see if you have a winner
Slide 1
Example
Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip
Slide 1
Example (contrsquod)
Slide 1
Example (contrsquod)
Slide 1
Example (contrsquod)
Slide 1
A simple example
Chart[0]γ rarr S [00] (dummy start state)S rarr NP VP [00 ] (predictor)NP rarr N [00 ] (predictor)
Chart[1]N rarr I [01 ] (scan)NP rarr N [01 ] (completer)S rarr NP VP [01 ] (completer)VP rarr V NP [11 ] (predictor)
Chart[2]V rarr saw [12 ] (scan) VP rarr V NP [12 ] (complete) NP rarr N [22 ] (predict)
Chart[3]NP rarr N [23 ] (scan)NP rarr N [23 ] (completer)VP rarr V NP [13 ] (completer)S rarr NP VP [03 ] (completer)
Grammar S rarr NP VP NP rarr N VP rarr V NP
Lexicon Nrarr I | saw | Mary Vrarr saw
Input I saw Mary
Sentence accepted
Slide 1
What is it
What kind of parser did we just describe (trick question)Earley parserhellip yesNot a parser ndash a recognizer
The presence of an S state with the right attributes in the right place indicates a successful recognition
But no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time
Slide 1
Converting Earley from Recognizer to ParserWith the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from
Slide 1
Augmenting the chart with structural information
S8S9
S10
S11
S13S12
S8
S9
S8
Slide 1
Retrieving Parse Trees from Chart
All the possible parses for an input are in the tableWe just need to read off all the backpointers from every complete S
in the last column of the tableFind all the S -gt X [0N+1]Follow the structural traces from the CompleterOf course this wonrsquot be polynomial time since there could be an
exponential number of treesSo we can at least represent ambiguity efficiently
Slide 1
Earley and Left Recursion
Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the searchNever place a state into the chart thatrsquos already thereCopy states before advancing them
Slide 1
Earley and Left Recursion 1
S -gt NP VPNP -gt NP PP
Predictor given first ruleS -gt NP VP [00]
PredictsNP -gt NP PP [00]stops there since predicting same again would be redundant
Slide 1
Earley and Left Recursion 2
When a state gets advanced make a copy and leave the original alonehellip
Say we have NP -gt NP PP [00]We find an NP from 0 to 2 so we create
NP -gt NP PP [02]But we leave the original state as is
Slide 1
Dynamic Programming Approaches
EarleyTop-down no filtering no restriction on grammar form
CYKBottom-up no filtering grammars restricted to Chomsky-Normal Form
(CNF)Details are not important
Bottom-up vs top-downWith or without filtersWith restrictions on grammar form or not
Slide 1
Top-Down Parsing
bull Since wersquore trying to find trees rooted with an S (Sentences) start with the rules that give us an S
bull Then work your way down from there to the words
Slide 1
Top Down Space
Slide 1
Bottom-Up Parsing
bull Of course we also want trees that cover the input words So start with trees that link up with the words in the right way
bull Then work your way up from there
Slide 1
Bottom-Up Space
Slide 1
Top-Down VS Bottom-Up
bull Top-downndash Only searches for trees that can be answersndash But suggests trees that are not consistent with the wordsndash Guarantees that tree starts with S as rootndash Does not guarantee that tree will match input words
bull Bottom-upndash Only forms trees consistent with the wordsndash Suggest trees that make no sense globallyndash Guarantees that tree matches input wordsndash Does not guarantee that parse tree will lead to S as a root
bull Combine the advantages of the two by doing a search constrained from both sides (top and bottom)
Slide 1
Top-Down Depth-First Left-to-Right Search
Slide 1
Example (contrsquod)
Slide 1
Example (contrsquod)
flight flight
Slide 1
Example (contrsquod)
flightflight
Slide 1
Bottom-Up Filtering
Slide 1
Possible Problem Left-Recursion
What happens in the following situationS -gt NP VPS -gt Aux NP VPNP -gt NP PPNP -gt Det NominalhellipWith the sentence starting with
Did the flighthellip
Slide 1
Solution Rule Ordering
S -gt Aux NP VPS -gt NP VPNP -gt Det NominalNP -gt NP PP
The key for the NP is that you want the recursive option after any base case
Slide 1
Avoiding Repeated Work
Parsing is hard and slow Itrsquos wasteful to redo stuff over and over and over
Consider an attempt to top-down parse the following as an NP
A flight from Indianapolis to Houston on TWA
Slide 1
flight
Slide 1
flight
flight
Slide 1
Slide 1
Slide 1
Dynamic Programming
bull We need a method that fills a table with partial results thatndash Does not do (avoidable) repeated workndash Does not fall prey to left-recursionndash Solves an exponential problem in (approximately) polynomial
time
Slide 1
Earley Parsing
Fills a table in a single sweep over the input wordsTable is length N+1 N is number of wordsTable entries represent
Completed constituents and their locationsIn-progress constituentsPredicted constituents
Slide 1
States
The table-entries are called states and are represented with dotted-rulesS -gt VP A VP is predictedNP -gt Det Nominal An NP is in progressVP -gt V NP A VP has been found
Slide 1
StatesLocations
It would be nice to know where these things are in the input sohellipS -gt VP [00] A VP is predicted at the
start of the sentenceNP -gt Det Nominal [12] An NP is in progress the
Det goes from 1 to 2
VP -gt V NP [03] A VP has been found starting at 0 and ending
at 3
Slide 1
Graphically
Slide 1
Earley
bull As with most dynamic programming approaches the answer is found by looking in the table in the right place
bull In this case there should be an S state in the final column that spans from 0 to n+1 and is complete
bull If thatrsquos the case yoursquore donendash S ndash α [0n+1]
bull So sweep through the table from 0 to n+1hellipndash New predicted states are created by states in current chartndash New incomplete states are created by advancing existing states
as new constituents are discoveredndash New complete states are created in the same way
Slide 1
Earley
bull More specificallyhellipndash Predict all the states you can upfrontndash Read a word
ndash Extend states based on matchesndash Add new predictionsndash Go to 2
ndash Look at N+1 to see if you have a winner
Slide 1
Earley and Left Recursion
bull So Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the search
ndash Never place a state into the chart thatrsquos already therendash Copy states before advancing them
S -gt NP VPNP -gt NP PP
bull The first rule predictsS -gt NP VP [00] that addsNP -gt NP PP [00]stops there since adding any subsequent prediction would be fruitless
bull When a state gets advanced make a copy and leave the original alonendash Say we have NP -gt NP PP [00]ndash We find an NP from 0 to 2 so we create NP -gt NP PP [02]ndash But we leave the original state as is
Slide 1
Predictor
Given a stateWith a non-terminal to right of dotThat is not a part-of-speech categoryCreate a new state for each expansion of the non-terminalPlace these new states into same chart entry as generated state
beginning and ending where generating state ends
So predictor looking at
S -gt VP [00]
results in
VP -gt Verb [00]VP -gt Verb NP [00]
Slide 1
Scanner
Given a stateWith a non-terminal to right of dotThat is a part-of-speech categoryIf the next word in the input matches this part-of-speechndash Create a new state with dot moved over the non-terminalndash insert in next chart entry
So scanner looking at
VP -gt Verb NP [00]
If the next word ldquobookrdquo can be a verb add new state
VP -gt Verb NP [01]
Add this state to chart entry following current one
Note Earley algorithm uses top-down input to disambiguate POS Only POS predicted by some state can get added to chart
Slide 1
Completer
Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this
category
bull copy state bull move dotbull insert in current chart entry
GivenNP -gt Det Nominal [13]VP -gt Verb NP [01]
AddVP -gt Verb NP [03]
Slide 1
Earley how do we know we are done
Find an S state in the final column that spans from 0 to n+1 and is complete
S ndashgt α [0n+1]
Slide 1
Earley
So sweep through the table from 0 to n+1hellip
New predicted states are created by starting top-down from S
New incomplete states are created by advancing existing states as new constituents are discovered
New complete states are created in the same way
Slide 1
Earley
More specificallyhellipPredict all the states you can upfront
Read a wordExtend states based on matchesAdd new predictionsGo to 2
Look at N+1 to see if you have a winner
Slide 1
Example
Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip
Slide 1
Example (contrsquod)
Slide 1
Example (contrsquod)
Slide 1
Example (contrsquod)
Slide 1
A simple example
Chart[0]γ rarr S [00] (dummy start state)S rarr NP VP [00 ] (predictor)NP rarr N [00 ] (predictor)
Chart[1]N rarr I [01 ] (scan)NP rarr N [01 ] (completer)S rarr NP VP [01 ] (completer)VP rarr V NP [11 ] (predictor)
Chart[2]V rarr saw [12 ] (scan) VP rarr V NP [12 ] (complete) NP rarr N [22 ] (predict)
Chart[3]NP rarr N [23 ] (scan)NP rarr N [23 ] (completer)VP rarr V NP [13 ] (completer)S rarr NP VP [03 ] (completer)
Grammar S rarr NP VP NP rarr N VP rarr V NP
Lexicon Nrarr I | saw | Mary Vrarr saw
Input I saw Mary
Sentence accepted
Slide 1
What is it
What kind of parser did we just describe (trick question)Earley parserhellip yesNot a parser ndash a recognizer
The presence of an S state with the right attributes in the right place indicates a successful recognition
But no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time
Slide 1
Converting Earley from Recognizer to ParserWith the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from
Slide 1
Augmenting the chart with structural information
S8S9
S10
S11
S13S12
S8
S9
S8
Slide 1
Retrieving Parse Trees from Chart
All the possible parses for an input are in the table; we just need to read off all the backpointers from every complete S in the last column of the table:
– Find all the S -> α · [0,N]
– Follow the structural traces from the Completer
Of course this won't be polynomial time, since there could be an exponential number of trees.
But we can at least represent ambiguity efficiently.
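One possible shape for that augmentation, continuing the sketch, and simplified to keep only the first derivation per state (representing all the ambiguity would mean keeping every pointer list): the completer records, for each state it advances, the parent's children so far plus the completed child, and trees are read off recursively.

    def completer_bp(state, chart, children):
        # As completer(), but record backpointers: the children of the
        # advanced state are the parent's children plus the completed child.
        for waiting in list(chart[state.start]):
            if waiting.next_symbol() == state.lhs:
                new = State(waiting.lhs, waiting.rhs, waiting.dot + 1,
                            waiting.start, state.end)
                children.setdefault(new, children.get(waiting, ()) + (state,))
                if new not in chart[state.end]:
                    chart[state.end].append(new)

    def tree(state, children):
        # Expand a complete state into a bracketed tree via its pointers.
        kids = children.get(state, ())
        if not kids:                      # POS state: its rhs[0] is the word
            return (state.lhs, state.rhs[0])
        return (state.lhs, [tree(k, children) for k in kids])

Swapping completer_bp in for completer in the sweep, threading one shared children dict through, is the only other change needed; tree() applied to the complete S state then yields a bracketing like ('S', [('NP', …), ('VP', …)]).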
Earley and Left Recursion
Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the search:
– Never place a state into the chart that's already there
– Copy states before advancing them
Earley and Left Recursion 1
S -> NP VP
NP -> NP PP

The Predictor, given the first rule
S -> · NP VP [0,0]
predicts
NP -> · NP PP [0,0]
and stops there, since predicting the same state again would be redundant.
Earley and Left Recursion 2
When a state gets advanced, make a copy and leave the original alone…
Say we have NP -> · NP PP [0,0]
We find an NP from 0 to 2, so we create
NP -> NP · PP [0,2]
but we leave the original state as is.
Dynamic Programming Approaches
Earley
– Top-down, no filtering, no restriction on grammar form
CYK
– Bottom-up, no filtering, grammars restricted to Chomsky Normal Form (CNF)
The details are not important; the dimensions are:
– Bottom-up vs. top-down
– With or without filters
– With restrictions on grammar form or not
Top Down Space
Slide 1
Bottom-Up Parsing
bull Of course we also want trees that cover the input words So start with trees that link up with the words in the right way
bull Then work your way up from there
Slide 1
Bottom-Up Space
Slide 1
Top-Down VS Bottom-Up
bull Top-downndash Only searches for trees that can be answersndash But suggests trees that are not consistent with the wordsndash Guarantees that tree starts with S as rootndash Does not guarantee that tree will match input words
bull Bottom-upndash Only forms trees consistent with the wordsndash Suggest trees that make no sense globallyndash Guarantees that tree matches input wordsndash Does not guarantee that parse tree will lead to S as a root
bull Combine the advantages of the two by doing a search constrained from both sides (top and bottom)
Slide 1
Top-Down Depth-First Left-to-Right Search
Slide 1
Example (contrsquod)
Slide 1
Example (contrsquod)
flight flight
Slide 1
Example (contrsquod)
flightflight
Slide 1
Bottom-Up Filtering
Slide 1
Possible Problem Left-Recursion
What happens in the following situationS -gt NP VPS -gt Aux NP VPNP -gt NP PPNP -gt Det NominalhellipWith the sentence starting with
Did the flighthellip
Slide 1
Solution Rule Ordering
S -gt Aux NP VPS -gt NP VPNP -gt Det NominalNP -gt NP PP
The key for the NP is that you want the recursive option after any base case
Slide 1
Avoiding Repeated Work
Parsing is hard and slow Itrsquos wasteful to redo stuff over and over and over
Consider an attempt to top-down parse the following as an NP
A flight from Indianapolis to Houston on TWA
Slide 1
flight
Slide 1
flight
flight
Slide 1
Slide 1
Slide 1
Dynamic Programming
bull We need a method that fills a table with partial results thatndash Does not do (avoidable) repeated workndash Does not fall prey to left-recursionndash Solves an exponential problem in (approximately) polynomial
time
Slide 1
Earley Parsing
Fills a table in a single sweep over the input wordsTable is length N+1 N is number of wordsTable entries represent
Completed constituents and their locationsIn-progress constituentsPredicted constituents
Slide 1
States
The table-entries are called states and are represented with dotted-rulesS -gt VP A VP is predictedNP -gt Det Nominal An NP is in progressVP -gt V NP A VP has been found
Slide 1
StatesLocations
It would be nice to know where these things are in the input sohellipS -gt VP [00] A VP is predicted at the
start of the sentenceNP -gt Det Nominal [12] An NP is in progress the
Det goes from 1 to 2
VP -gt V NP [03] A VP has been found starting at 0 and ending
at 3
Slide 1
Graphically
Slide 1
Earley
bull As with most dynamic programming approaches the answer is found by looking in the table in the right place
bull In this case there should be an S state in the final column that spans from 0 to n+1 and is complete
bull If thatrsquos the case yoursquore donendash S ndash α [0n+1]
bull So sweep through the table from 0 to n+1hellipndash New predicted states are created by states in current chartndash New incomplete states are created by advancing existing states
as new constituents are discoveredndash New complete states are created in the same way
Slide 1
Earley
bull More specificallyhellipndash Predict all the states you can upfrontndash Read a word
ndash Extend states based on matchesndash Add new predictionsndash Go to 2
ndash Look at N+1 to see if you have a winner
Slide 1
Earley and Left Recursion
bull So Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the search
ndash Never place a state into the chart thatrsquos already therendash Copy states before advancing them
S -gt NP VPNP -gt NP PP
bull The first rule predictsS -gt NP VP [00] that addsNP -gt NP PP [00]stops there since adding any subsequent prediction would be fruitless
bull When a state gets advanced make a copy and leave the original alonendash Say we have NP -gt NP PP [00]ndash We find an NP from 0 to 2 so we create NP -gt NP PP [02]ndash But we leave the original state as is
Slide 1
Predictor
Given a stateWith a non-terminal to right of dotThat is not a part-of-speech categoryCreate a new state for each expansion of the non-terminalPlace these new states into same chart entry as generated state
beginning and ending where generating state ends
So predictor looking at
S -gt VP [00]
results in
VP -gt Verb [00]VP -gt Verb NP [00]
Slide 1
Scanner
Given a stateWith a non-terminal to right of dotThat is a part-of-speech categoryIf the next word in the input matches this part-of-speechndash Create a new state with dot moved over the non-terminalndash insert in next chart entry
So scanner looking at
VP -gt Verb NP [00]
If the next word ldquobookrdquo can be a verb add new state
VP -gt Verb NP [01]
Add this state to chart entry following current one
Note Earley algorithm uses top-down input to disambiguate POS Only POS predicted by some state can get added to chart
Slide 1
Completer
Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this
category
bull copy state bull move dotbull insert in current chart entry
GivenNP -gt Det Nominal [13]VP -gt Verb NP [01]
AddVP -gt Verb NP [03]
Slide 1
Earley how do we know we are done
Find an S state in the final column that spans from 0 to n+1 and is complete
S ndashgt α [0n+1]
Slide 1
Earley
So sweep through the table from 0 to n+1hellip
New predicted states are created by starting top-down from S
New incomplete states are created by advancing existing states as new constituents are discovered
New complete states are created in the same way
Slide 1
Earley
More specificallyhellipPredict all the states you can upfront
Read a wordExtend states based on matchesAdd new predictionsGo to 2
Look at N+1 to see if you have a winner
Slide 1
Example
Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip
Slide 1
Example (contrsquod)
Slide 1
Example (contrsquod)
Slide 1
Example (contrsquod)
Slide 1
A simple example
Chart[0]γ rarr S [00] (dummy start state)S rarr NP VP [00 ] (predictor)NP rarr N [00 ] (predictor)
Chart[1]N rarr I [01 ] (scan)NP rarr N [01 ] (completer)S rarr NP VP [01 ] (completer)VP rarr V NP [11 ] (predictor)
Chart[2]V rarr saw [12 ] (scan) VP rarr V NP [12 ] (complete) NP rarr N [22 ] (predict)
Chart[3]NP rarr N [23 ] (scan)NP rarr N [23 ] (completer)VP rarr V NP [13 ] (completer)S rarr NP VP [03 ] (completer)
Grammar S rarr NP VP NP rarr N VP rarr V NP
Lexicon Nrarr I | saw | Mary Vrarr saw
Input I saw Mary
Sentence accepted
Slide 1
What is it
What kind of parser did we just describe (trick question)Earley parserhellip yesNot a parser ndash a recognizer
The presence of an S state with the right attributes in the right place indicates a successful recognition
But no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time
Slide 1
Converting Earley from Recognizer to ParserWith the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from
Slide 1
Augmenting the chart with structural information
S8S9
S10
S11
S13S12
S8
S9
S8
Slide 1
Retrieving Parse Trees from Chart
All the possible parses for an input are in the tableWe just need to read off all the backpointers from every complete S
in the last column of the tableFind all the S -gt X [0N+1]Follow the structural traces from the CompleterOf course this wonrsquot be polynomial time since there could be an
exponential number of treesSo we can at least represent ambiguity efficiently
Slide 1
Earley and Left Recursion
Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the searchNever place a state into the chart thatrsquos already thereCopy states before advancing them
Slide 1
Earley and Left Recursion 1
S -gt NP VPNP -gt NP PP
Predictor given first ruleS -gt NP VP [00]
PredictsNP -gt NP PP [00]stops there since predicting same again would be redundant
Slide 1
Earley and Left Recursion 2
When a state gets advanced make a copy and leave the original alonehellip
Say we have NP -gt NP PP [00]We find an NP from 0 to 2 so we create
NP -gt NP PP [02]But we leave the original state as is
Slide 1
Dynamic Programming Approaches
EarleyTop-down no filtering no restriction on grammar form
CYKBottom-up no filtering grammars restricted to Chomsky-Normal Form
(CNF)Details are not important
Bottom-up vs top-downWith or without filtersWith restrictions on grammar form or not
Slide 1
Bottom-Up Parsing
bull Of course we also want trees that cover the input words So start with trees that link up with the words in the right way
bull Then work your way up from there
Slide 1
Bottom-Up Space
Slide 1
Top-Down VS Bottom-Up
bull Top-downndash Only searches for trees that can be answersndash But suggests trees that are not consistent with the wordsndash Guarantees that tree starts with S as rootndash Does not guarantee that tree will match input words
bull Bottom-upndash Only forms trees consistent with the wordsndash Suggest trees that make no sense globallyndash Guarantees that tree matches input wordsndash Does not guarantee that parse tree will lead to S as a root
bull Combine the advantages of the two by doing a search constrained from both sides (top and bottom)
Slide 1
Top-Down Depth-First Left-to-Right Search
Slide 1
Example (contrsquod)
Slide 1
Example (contrsquod)
flight flight
Slide 1
Example (contrsquod)
flightflight
Slide 1
Bottom-Up Filtering
Slide 1
Possible Problem Left-Recursion
What happens in the following situationS -gt NP VPS -gt Aux NP VPNP -gt NP PPNP -gt Det NominalhellipWith the sentence starting with
Did the flighthellip
Slide 1
Solution Rule Ordering
S -gt Aux NP VPS -gt NP VPNP -gt Det NominalNP -gt NP PP
The key for the NP is that you want the recursive option after any base case
Slide 1
Avoiding Repeated Work
Parsing is hard and slow Itrsquos wasteful to redo stuff over and over and over
Consider an attempt to top-down parse the following as an NP
A flight from Indianapolis to Houston on TWA
Slide 1
flight
Slide 1
flight
flight
Slide 1
Slide 1
Slide 1
Dynamic Programming
bull We need a method that fills a table with partial results thatndash Does not do (avoidable) repeated workndash Does not fall prey to left-recursionndash Solves an exponential problem in (approximately) polynomial
time
Slide 1
Earley Parsing
Fills a table in a single sweep over the input wordsTable is length N+1 N is number of wordsTable entries represent
Completed constituents and their locationsIn-progress constituentsPredicted constituents
Slide 1
States
The table-entries are called states and are represented with dotted-rulesS -gt VP A VP is predictedNP -gt Det Nominal An NP is in progressVP -gt V NP A VP has been found
Slide 1
StatesLocations
It would be nice to know where these things are in the input sohellipS -gt VP [00] A VP is predicted at the
start of the sentenceNP -gt Det Nominal [12] An NP is in progress the
Det goes from 1 to 2
VP -gt V NP [03] A VP has been found starting at 0 and ending
at 3
Slide 1
Graphically
Slide 1
Earley
bull As with most dynamic programming approaches the answer is found by looking in the table in the right place
bull In this case there should be an S state in the final column that spans from 0 to n+1 and is complete
bull If thatrsquos the case yoursquore donendash S ndash α [0n+1]
bull So sweep through the table from 0 to n+1hellipndash New predicted states are created by states in current chartndash New incomplete states are created by advancing existing states
as new constituents are discoveredndash New complete states are created in the same way
Slide 1
Earley
bull More specificallyhellipndash Predict all the states you can upfrontndash Read a word
ndash Extend states based on matchesndash Add new predictionsndash Go to 2
ndash Look at N+1 to see if you have a winner
Slide 1
Earley and Left Recursion
bull So Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the search
ndash Never place a state into the chart thatrsquos already therendash Copy states before advancing them
S -gt NP VPNP -gt NP PP
bull The first rule predictsS -gt NP VP [00] that addsNP -gt NP PP [00]stops there since adding any subsequent prediction would be fruitless
bull When a state gets advanced make a copy and leave the original alonendash Say we have NP -gt NP PP [00]ndash We find an NP from 0 to 2 so we create NP -gt NP PP [02]ndash But we leave the original state as is
Slide 1
Predictor
Given a stateWith a non-terminal to right of dotThat is not a part-of-speech categoryCreate a new state for each expansion of the non-terminalPlace these new states into same chart entry as generated state
beginning and ending where generating state ends
So predictor looking at
S -gt VP [00]
results in
VP -gt Verb [00]VP -gt Verb NP [00]
Slide 1
Scanner
Given a stateWith a non-terminal to right of dotThat is a part-of-speech categoryIf the next word in the input matches this part-of-speechndash Create a new state with dot moved over the non-terminalndash insert in next chart entry
So scanner looking at
VP -gt Verb NP [00]
If the next word ldquobookrdquo can be a verb add new state
VP -gt Verb NP [01]
Add this state to chart entry following current one
Note Earley algorithm uses top-down input to disambiguate POS Only POS predicted by some state can get added to chart
Slide 1
Completer
Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this
category
bull copy state bull move dotbull insert in current chart entry
GivenNP -gt Det Nominal [13]VP -gt Verb NP [01]
AddVP -gt Verb NP [03]
Slide 1
Earley how do we know we are done
Find an S state in the final column that spans from 0 to n+1 and is complete
S ndashgt α [0n+1]
Slide 1
Earley
So sweep through the table from 0 to n+1hellip
New predicted states are created by starting top-down from S
New incomplete states are created by advancing existing states as new constituents are discovered
New complete states are created in the same way
Slide 1
Earley
More specificallyhellipPredict all the states you can upfront
Read a wordExtend states based on matchesAdd new predictionsGo to 2
Look at N+1 to see if you have a winner
Slide 1
Example
Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip
Slide 1
Example (contrsquod)
Slide 1
Example (contrsquod)
Slide 1
Example (contrsquod)
Slide 1
A simple example
Chart[0]γ rarr S [00] (dummy start state)S rarr NP VP [00 ] (predictor)NP rarr N [00 ] (predictor)
Chart[1]N rarr I [01 ] (scan)NP rarr N [01 ] (completer)S rarr NP VP [01 ] (completer)VP rarr V NP [11 ] (predictor)
Chart[2]V rarr saw [12 ] (scan) VP rarr V NP [12 ] (complete) NP rarr N [22 ] (predict)
Chart[3]NP rarr N [23 ] (scan)NP rarr N [23 ] (completer)VP rarr V NP [13 ] (completer)S rarr NP VP [03 ] (completer)
Grammar S rarr NP VP NP rarr N VP rarr V NP
Lexicon Nrarr I | saw | Mary Vrarr saw
Input I saw Mary
Sentence accepted
Slide 1
What is it
What kind of parser did we just describe (trick question)Earley parserhellip yesNot a parser ndash a recognizer
The presence of an S state with the right attributes in the right place indicates a successful recognition
But no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time
Slide 1
Converting Earley from Recognizer to ParserWith the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from
Slide 1
Augmenting the chart with structural information
S8S9
S10
S11
S13S12
S8
S9
S8
Slide 1
Retrieving Parse Trees from Chart
All the possible parses for an input are in the tableWe just need to read off all the backpointers from every complete S
in the last column of the tableFind all the S -gt X [0N+1]Follow the structural traces from the CompleterOf course this wonrsquot be polynomial time since there could be an
exponential number of treesSo we can at least represent ambiguity efficiently
Slide 1
Earley and Left Recursion
Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the searchNever place a state into the chart thatrsquos already thereCopy states before advancing them
Slide 1
Earley and Left Recursion 1
S -gt NP VPNP -gt NP PP
Predictor given first ruleS -gt NP VP [00]
PredictsNP -gt NP PP [00]stops there since predicting same again would be redundant
Slide 1
Earley and Left Recursion 2
When a state gets advanced make a copy and leave the original alonehellip
Say we have NP -gt NP PP [00]We find an NP from 0 to 2 so we create
NP -gt NP PP [02]But we leave the original state as is
Slide 1
Dynamic Programming Approaches
EarleyTop-down no filtering no restriction on grammar form
CYKBottom-up no filtering grammars restricted to Chomsky-Normal Form
(CNF)Details are not important
Bottom-up vs top-downWith or without filtersWith restrictions on grammar form or not
Slide 1
Bottom-Up Space
Slide 1
Top-Down VS Bottom-Up
bull Top-downndash Only searches for trees that can be answersndash But suggests trees that are not consistent with the wordsndash Guarantees that tree starts with S as rootndash Does not guarantee that tree will match input words
bull Bottom-upndash Only forms trees consistent with the wordsndash Suggest trees that make no sense globallyndash Guarantees that tree matches input wordsndash Does not guarantee that parse tree will lead to S as a root
bull Combine the advantages of the two by doing a search constrained from both sides (top and bottom)
Slide 1
Top-Down Depth-First Left-to-Right Search
Slide 1
Example (contrsquod)
Slide 1
Example (contrsquod)
flight flight
Slide 1
Example (contrsquod)
flightflight
Slide 1
Bottom-Up Filtering
Slide 1
Possible Problem Left-Recursion
What happens in the following situationS -gt NP VPS -gt Aux NP VPNP -gt NP PPNP -gt Det NominalhellipWith the sentence starting with
Did the flighthellip
Slide 1
Solution Rule Ordering
S -gt Aux NP VPS -gt NP VPNP -gt Det NominalNP -gt NP PP
The key for the NP is that you want the recursive option after any base case
Slide 1
Avoiding Repeated Work
Parsing is hard and slow Itrsquos wasteful to redo stuff over and over and over
Consider an attempt to top-down parse the following as an NP
A flight from Indianapolis to Houston on TWA
Slide 1
flight
Slide 1
flight
flight
Slide 1
Slide 1
Slide 1
Dynamic Programming
bull We need a method that fills a table with partial results thatndash Does not do (avoidable) repeated workndash Does not fall prey to left-recursionndash Solves an exponential problem in (approximately) polynomial
time
Slide 1
Earley Parsing
Fills a table in a single sweep over the input wordsTable is length N+1 N is number of wordsTable entries represent
Completed constituents and their locationsIn-progress constituentsPredicted constituents
Slide 1
States
The table-entries are called states and are represented with dotted-rulesS -gt VP A VP is predictedNP -gt Det Nominal An NP is in progressVP -gt V NP A VP has been found
Slide 1
StatesLocations
It would be nice to know where these things are in the input sohellipS -gt VP [00] A VP is predicted at the
start of the sentenceNP -gt Det Nominal [12] An NP is in progress the
Det goes from 1 to 2
VP -gt V NP [03] A VP has been found starting at 0 and ending
at 3
Slide 1
Graphically
Slide 1
Earley
bull As with most dynamic programming approaches the answer is found by looking in the table in the right place
bull In this case there should be an S state in the final column that spans from 0 to n+1 and is complete
bull If thatrsquos the case yoursquore donendash S ndash α [0n+1]
bull So sweep through the table from 0 to n+1hellipndash New predicted states are created by states in current chartndash New incomplete states are created by advancing existing states
as new constituents are discoveredndash New complete states are created in the same way
Slide 1
Earley
bull More specificallyhellipndash Predict all the states you can upfrontndash Read a word
ndash Extend states based on matchesndash Add new predictionsndash Go to 2
ndash Look at N+1 to see if you have a winner
Slide 1
Earley and Left Recursion
bull So Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the search
ndash Never place a state into the chart thatrsquos already therendash Copy states before advancing them
S -gt NP VPNP -gt NP PP
bull The first rule predictsS -gt NP VP [00] that addsNP -gt NP PP [00]stops there since adding any subsequent prediction would be fruitless
bull When a state gets advanced make a copy and leave the original alonendash Say we have NP -gt NP PP [00]ndash We find an NP from 0 to 2 so we create NP -gt NP PP [02]ndash But we leave the original state as is
Slide 1
Predictor
Given a stateWith a non-terminal to right of dotThat is not a part-of-speech categoryCreate a new state for each expansion of the non-terminalPlace these new states into same chart entry as generated state
beginning and ending where generating state ends
So predictor looking at
S -gt VP [00]
results in
VP -gt Verb [00]VP -gt Verb NP [00]
Slide 1
Scanner
Given a stateWith a non-terminal to right of dotThat is a part-of-speech categoryIf the next word in the input matches this part-of-speechndash Create a new state with dot moved over the non-terminalndash insert in next chart entry
So scanner looking at
VP -gt Verb NP [00]
If the next word ldquobookrdquo can be a verb add new state
VP -gt Verb NP [01]
Add this state to chart entry following current one
Note Earley algorithm uses top-down input to disambiguate POS Only POS predicted by some state can get added to chart
Slide 1
Completer
Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this
category
bull copy state bull move dotbull insert in current chart entry
GivenNP -gt Det Nominal [13]VP -gt Verb NP [01]
AddVP -gt Verb NP [03]
Slide 1
Earley how do we know we are done
Find an S state in the final column that spans from 0 to n+1 and is complete
S ndashgt α [0n+1]
Slide 1
Earley
So sweep through the table from 0 to n+1hellip
New predicted states are created by starting top-down from S
New incomplete states are created by advancing existing states as new constituents are discovered
New complete states are created in the same way
Slide 1
Earley
More specificallyhellipPredict all the states you can upfront
Read a wordExtend states based on matchesAdd new predictionsGo to 2
Look at N+1 to see if you have a winner
Slide 1
Example
Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip
Slide 1
Example (contrsquod)
Slide 1
Example (contrsquod)
Slide 1
Example (contrsquod)
Slide 1
A simple example
Chart[0]γ rarr S [00] (dummy start state)S rarr NP VP [00 ] (predictor)NP rarr N [00 ] (predictor)
Chart[1]N rarr I [01 ] (scan)NP rarr N [01 ] (completer)S rarr NP VP [01 ] (completer)VP rarr V NP [11 ] (predictor)
Chart[2]V rarr saw [12 ] (scan) VP rarr V NP [12 ] (complete) NP rarr N [22 ] (predict)
Chart[3]NP rarr N [23 ] (scan)NP rarr N [23 ] (completer)VP rarr V NP [13 ] (completer)S rarr NP VP [03 ] (completer)
Grammar S rarr NP VP NP rarr N VP rarr V NP
Lexicon Nrarr I | saw | Mary Vrarr saw
Input I saw Mary
Sentence accepted
Slide 1
What is it
What kind of parser did we just describe (trick question)Earley parserhellip yesNot a parser ndash a recognizer
The presence of an S state with the right attributes in the right place indicates a successful recognition
But no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time
Slide 1
Converting Earley from Recognizer to ParserWith the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from
Slide 1
Augmenting the chart with structural information
S8S9
S10
S11
S13S12
S8
S9
S8
Slide 1
Retrieving Parse Trees from Chart
All the possible parses for an input are in the tableWe just need to read off all the backpointers from every complete S
in the last column of the tableFind all the S -gt X [0N+1]Follow the structural traces from the CompleterOf course this wonrsquot be polynomial time since there could be an
exponential number of treesSo we can at least represent ambiguity efficiently
Slide 1
Earley and Left Recursion
Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the searchNever place a state into the chart thatrsquos already thereCopy states before advancing them
Slide 1
Earley and Left Recursion 1
S -gt NP VPNP -gt NP PP
Predictor given first ruleS -gt NP VP [00]
PredictsNP -gt NP PP [00]stops there since predicting same again would be redundant
Slide 1
Earley and Left Recursion 2
When a state gets advanced make a copy and leave the original alonehellip
Say we have NP -gt NP PP [00]We find an NP from 0 to 2 so we create
NP -gt NP PP [02]But we leave the original state as is
Slide 1
Dynamic Programming Approaches
EarleyTop-down no filtering no restriction on grammar form
CYKBottom-up no filtering grammars restricted to Chomsky-Normal Form
(CNF)Details are not important
Bottom-up vs top-downWith or without filtersWith restrictions on grammar form or not
Slide 1
Top-Down VS Bottom-Up
bull Top-downndash Only searches for trees that can be answersndash But suggests trees that are not consistent with the wordsndash Guarantees that tree starts with S as rootndash Does not guarantee that tree will match input words
bull Bottom-upndash Only forms trees consistent with the wordsndash Suggest trees that make no sense globallyndash Guarantees that tree matches input wordsndash Does not guarantee that parse tree will lead to S as a root
bull Combine the advantages of the two by doing a search constrained from both sides (top and bottom)
Slide 1
Top-Down Depth-First Left-to-Right Search
Slide 1
Example (contrsquod)
Slide 1
Example (contrsquod)
flight flight
Slide 1
Example (contrsquod)
flightflight
Slide 1
Bottom-Up Filtering
Slide 1
Possible Problem Left-Recursion
What happens in the following situationS -gt NP VPS -gt Aux NP VPNP -gt NP PPNP -gt Det NominalhellipWith the sentence starting with
Did the flighthellip
Slide 1
Solution Rule Ordering
S -gt Aux NP VPS -gt NP VPNP -gt Det NominalNP -gt NP PP
The key for the NP is that you want the recursive option after any base case
Slide 1
Avoiding Repeated Work
Parsing is hard and slow Itrsquos wasteful to redo stuff over and over and over
Consider an attempt to top-down parse the following as an NP
A flight from Indianapolis to Houston on TWA
Slide 1
flight
Slide 1
flight
flight
Slide 1
Slide 1
Slide 1
Dynamic Programming
bull We need a method that fills a table with partial results thatndash Does not do (avoidable) repeated workndash Does not fall prey to left-recursionndash Solves an exponential problem in (approximately) polynomial
time
Slide 1
Earley Parsing
Fills a table in a single sweep over the input wordsTable is length N+1 N is number of wordsTable entries represent
Completed constituents and their locationsIn-progress constituentsPredicted constituents
Slide 1
States
The table-entries are called states and are represented with dotted-rulesS -gt VP A VP is predictedNP -gt Det Nominal An NP is in progressVP -gt V NP A VP has been found
Slide 1
StatesLocations
It would be nice to know where these things are in the input sohellipS -gt VP [00] A VP is predicted at the
start of the sentenceNP -gt Det Nominal [12] An NP is in progress the
Det goes from 1 to 2
VP -gt V NP [03] A VP has been found starting at 0 and ending
at 3
Slide 1
Graphically
Slide 1
Earley
bull As with most dynamic programming approaches the answer is found by looking in the table in the right place
bull In this case there should be an S state in the final column that spans from 0 to n+1 and is complete
bull If thatrsquos the case yoursquore donendash S ndash α [0n+1]
bull So sweep through the table from 0 to n+1hellipndash New predicted states are created by states in current chartndash New incomplete states are created by advancing existing states
as new constituents are discoveredndash New complete states are created in the same way
Slide 1
Earley
bull More specificallyhellipndash Predict all the states you can upfrontndash Read a word
ndash Extend states based on matchesndash Add new predictionsndash Go to 2
ndash Look at N+1 to see if you have a winner
Slide 1
Earley and Left Recursion
bull So Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the search
ndash Never place a state into the chart thatrsquos already therendash Copy states before advancing them
S -gt NP VPNP -gt NP PP
bull The first rule predictsS -gt NP VP [00] that addsNP -gt NP PP [00]stops there since adding any subsequent prediction would be fruitless
bull When a state gets advanced make a copy and leave the original alonendash Say we have NP -gt NP PP [00]ndash We find an NP from 0 to 2 so we create NP -gt NP PP [02]ndash But we leave the original state as is
Slide 1
Predictor
Given a stateWith a non-terminal to right of dotThat is not a part-of-speech categoryCreate a new state for each expansion of the non-terminalPlace these new states into same chart entry as generated state
beginning and ending where generating state ends
So predictor looking at
S -gt VP [00]
results in
VP -gt Verb [00]VP -gt Verb NP [00]
Slide 1
Scanner
Given a stateWith a non-terminal to right of dotThat is a part-of-speech categoryIf the next word in the input matches this part-of-speechndash Create a new state with dot moved over the non-terminalndash insert in next chart entry
So scanner looking at
VP -gt Verb NP [00]
If the next word ldquobookrdquo can be a verb add new state
VP -gt Verb NP [01]
Add this state to chart entry following current one
Note Earley algorithm uses top-down input to disambiguate POS Only POS predicted by some state can get added to chart
Slide 1
Completer
Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this
category
bull copy state bull move dotbull insert in current chart entry
GivenNP -gt Det Nominal [13]VP -gt Verb NP [01]
AddVP -gt Verb NP [03]
Slide 1
Earley how do we know we are done
Find an S state in the final column that spans from 0 to n+1 and is complete
S ndashgt α [0n+1]
Slide 1
Earley
So sweep through the table from 0 to n+1hellip
New predicted states are created by starting top-down from S
New incomplete states are created by advancing existing states as new constituents are discovered
New complete states are created in the same way
Slide 1
Earley
More specificallyhellipPredict all the states you can upfront
Read a wordExtend states based on matchesAdd new predictionsGo to 2
Look at N+1 to see if you have a winner
Slide 1
Example
Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip
Slide 1
Example (contrsquod)
Slide 1
Example (contrsquod)
Slide 1
Example (contrsquod)
Slide 1
A simple example
Chart[0]γ rarr S [00] (dummy start state)S rarr NP VP [00 ] (predictor)NP rarr N [00 ] (predictor)
Chart[1]N rarr I [01 ] (scan)NP rarr N [01 ] (completer)S rarr NP VP [01 ] (completer)VP rarr V NP [11 ] (predictor)
Chart[2]V rarr saw [12 ] (scan) VP rarr V NP [12 ] (complete) NP rarr N [22 ] (predict)
Chart[3]NP rarr N [23 ] (scan)NP rarr N [23 ] (completer)VP rarr V NP [13 ] (completer)S rarr NP VP [03 ] (completer)
Grammar S rarr NP VP NP rarr N VP rarr V NP
Lexicon Nrarr I | saw | Mary Vrarr saw
Input I saw Mary
Sentence accepted
Slide 1
What is it
What kind of parser did we just describe (trick question)Earley parserhellip yesNot a parser ndash a recognizer
The presence of an S state with the right attributes in the right place indicates a successful recognition
But no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time
Slide 1
Converting Earley from Recognizer to ParserWith the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from
Slide 1
Augmenting the chart with structural information
S8S9
S10
S11
S13S12
S8
S9
S8
Slide 1
Retrieving Parse Trees from Chart
All the possible parses for an input are in the tableWe just need to read off all the backpointers from every complete S
in the last column of the tableFind all the S -gt X [0N+1]Follow the structural traces from the CompleterOf course this wonrsquot be polynomial time since there could be an
exponential number of treesSo we can at least represent ambiguity efficiently
Slide 1
Earley and Left Recursion
Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the searchNever place a state into the chart thatrsquos already thereCopy states before advancing them
Slide 1
Earley and Left Recursion 1
S -gt NP VPNP -gt NP PP
Predictor given first ruleS -gt NP VP [00]
PredictsNP -gt NP PP [00]stops there since predicting same again would be redundant
Slide 1
Earley and Left Recursion 2
When a state gets advanced make a copy and leave the original alonehellip
Say we have NP -gt NP PP [00]We find an NP from 0 to 2 so we create
NP -gt NP PP [02]But we leave the original state as is
Slide 1
Dynamic Programming Approaches
EarleyTop-down no filtering no restriction on grammar form
CYKBottom-up no filtering grammars restricted to Chomsky-Normal Form
(CNF)Details are not important
Bottom-up vs top-downWith or without filtersWith restrictions on grammar form or not
Slide 1
Top-Down Depth-First Left-to-Right Search
Slide 1
Example (contrsquod)
Slide 1
Example (contrsquod)
flight flight
Slide 1
Example (contrsquod)
flightflight
Slide 1
Bottom-Up Filtering
Slide 1
Possible Problem Left-Recursion
What happens in the following situationS -gt NP VPS -gt Aux NP VPNP -gt NP PPNP -gt Det NominalhellipWith the sentence starting with
Did the flighthellip
Slide 1
Solution Rule Ordering
S -gt Aux NP VPS -gt NP VPNP -gt Det NominalNP -gt NP PP
The key for the NP is that you want the recursive option after any base case
Slide 1
Avoiding Repeated Work
Parsing is hard and slow Itrsquos wasteful to redo stuff over and over and over
Consider an attempt to top-down parse the following as an NP
A flight from Indianapolis to Houston on TWA
Slide 1
flight
Slide 1
flight
flight
Slide 1
Slide 1
Slide 1
Dynamic Programming
bull We need a method that fills a table with partial results thatndash Does not do (avoidable) repeated workndash Does not fall prey to left-recursionndash Solves an exponential problem in (approximately) polynomial
time
Slide 1
Earley Parsing
Fills a table in a single sweep over the input wordsTable is length N+1 N is number of wordsTable entries represent
Completed constituents and their locationsIn-progress constituentsPredicted constituents
Slide 1
States
The table-entries are called states and are represented with dotted-rulesS -gt VP A VP is predictedNP -gt Det Nominal An NP is in progressVP -gt V NP A VP has been found
Slide 1
StatesLocations
It would be nice to know where these things are in the input sohellipS -gt VP [00] A VP is predicted at the
start of the sentenceNP -gt Det Nominal [12] An NP is in progress the
Det goes from 1 to 2
VP -gt V NP [03] A VP has been found starting at 0 and ending
at 3
Slide 1
Graphically
Slide 1
Earley
bull As with most dynamic programming approaches the answer is found by looking in the table in the right place
bull In this case there should be an S state in the final column that spans from 0 to n+1 and is complete
bull If thatrsquos the case yoursquore donendash S ndash α [0n+1]
bull So sweep through the table from 0 to n+1hellipndash New predicted states are created by states in current chartndash New incomplete states are created by advancing existing states
as new constituents are discoveredndash New complete states are created in the same way
Slide 1
Earley
bull More specificallyhellipndash Predict all the states you can upfrontndash Read a word
ndash Extend states based on matchesndash Add new predictionsndash Go to 2
ndash Look at N+1 to see if you have a winner
Slide 1
Earley and Left Recursion
bull So Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the search
ndash Never place a state into the chart thatrsquos already therendash Copy states before advancing them
S -gt NP VPNP -gt NP PP
bull The first rule predictsS -gt NP VP [00] that addsNP -gt NP PP [00]stops there since adding any subsequent prediction would be fruitless
bull When a state gets advanced make a copy and leave the original alonendash Say we have NP -gt NP PP [00]ndash We find an NP from 0 to 2 so we create NP -gt NP PP [02]ndash But we leave the original state as is
Slide 1
Predictor
Given a stateWith a non-terminal to right of dotThat is not a part-of-speech categoryCreate a new state for each expansion of the non-terminalPlace these new states into same chart entry as generated state
beginning and ending where generating state ends
So predictor looking at
S -gt VP [00]
results in
VP -gt Verb [00]VP -gt Verb NP [00]
Slide 1
Scanner
Given a stateWith a non-terminal to right of dotThat is a part-of-speech categoryIf the next word in the input matches this part-of-speechndash Create a new state with dot moved over the non-terminalndash insert in next chart entry
So scanner looking at
VP -gt Verb NP [00]
If the next word ldquobookrdquo can be a verb add new state
VP -gt Verb NP [01]
Add this state to chart entry following current one
Note Earley algorithm uses top-down input to disambiguate POS Only POS predicted by some state can get added to chart
Slide 1
Completer
Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this
category
bull copy state bull move dotbull insert in current chart entry
GivenNP -gt Det Nominal [13]VP -gt Verb NP [01]
AddVP -gt Verb NP [03]
Slide 1
Earley how do we know we are done
Find an S state in the final column that spans from 0 to n+1 and is complete
S ndashgt α [0n+1]
Slide 1
Earley
So sweep through the table from 0 to n+1hellip
New predicted states are created by starting top-down from S
New incomplete states are created by advancing existing states as new constituents are discovered
New complete states are created in the same way
Slide 1
Earley
More specificallyhellipPredict all the states you can upfront
Read a wordExtend states based on matchesAdd new predictionsGo to 2
Look at N+1 to see if you have a winner
Slide 1
Example
Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip
Slide 1
Example (contrsquod)
Slide 1
Example (contrsquod)
Slide 1
Example (contrsquod)
Slide 1
A simple example
Chart[0]γ rarr S [00] (dummy start state)S rarr NP VP [00 ] (predictor)NP rarr N [00 ] (predictor)
Chart[1]N rarr I [01 ] (scan)NP rarr N [01 ] (completer)S rarr NP VP [01 ] (completer)VP rarr V NP [11 ] (predictor)
Chart[2]V rarr saw [12 ] (scan) VP rarr V NP [12 ] (complete) NP rarr N [22 ] (predict)
Chart[3]NP rarr N [23 ] (scan)NP rarr N [23 ] (completer)VP rarr V NP [13 ] (completer)S rarr NP VP [03 ] (completer)
Grammar S rarr NP VP NP rarr N VP rarr V NP
Lexicon Nrarr I | saw | Mary Vrarr saw
Input I saw Mary
Sentence accepted
Slide 1
What is it
What kind of parser did we just describe (trick question)Earley parserhellip yesNot a parser ndash a recognizer
The presence of an S state with the right attributes in the right place indicates a successful recognition
But no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time
Slide 1
Converting Earley from Recognizer to ParserWith the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from
Slide 1
Augmenting the chart with structural information
S8S9
S10
S11
S13S12
S8
S9
S8
Slide 1
Retrieving Parse Trees from Chart
All the possible parses for an input are in the tableWe just need to read off all the backpointers from every complete S
in the last column of the tableFind all the S -gt X [0N+1]Follow the structural traces from the CompleterOf course this wonrsquot be polynomial time since there could be an
exponential number of treesSo we can at least represent ambiguity efficiently
Slide 1
Earley and Left Recursion
Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the searchNever place a state into the chart thatrsquos already thereCopy states before advancing them
Slide 1
Earley and Left Recursion 1
S -gt NP VPNP -gt NP PP
Predictor given first ruleS -gt NP VP [00]
PredictsNP -gt NP PP [00]stops there since predicting same again would be redundant
Slide 1
Earley and Left Recursion 2
When a state gets advanced make a copy and leave the original alonehellip
Say we have NP -gt NP PP [00]We find an NP from 0 to 2 so we create
NP -gt NP PP [02]But we leave the original state as is
Slide 1
Dynamic Programming Approaches
EarleyTop-down no filtering no restriction on grammar form
CYKBottom-up no filtering grammars restricted to Chomsky-Normal Form
(CNF)Details are not important
Bottom-up vs top-downWith or without filtersWith restrictions on grammar form or not
Slide 1
Example (contrsquod)
Slide 1
Example (contrsquod)
flight flight
Slide 1
Example (contrsquod)
flightflight
Slide 1
Bottom-Up Filtering
Slide 1
Possible Problem Left-Recursion
What happens in the following situationS -gt NP VPS -gt Aux NP VPNP -gt NP PPNP -gt Det NominalhellipWith the sentence starting with
Did the flighthellip
Slide 1
Solution Rule Ordering
S -gt Aux NP VPS -gt NP VPNP -gt Det NominalNP -gt NP PP
The key for the NP is that you want the recursive option after any base case
Slide 1
Avoiding Repeated Work
Parsing is hard and slow Itrsquos wasteful to redo stuff over and over and over
Consider an attempt to top-down parse the following as an NP
A flight from Indianapolis to Houston on TWA
Slide 1
flight
Slide 1
flight
flight
Slide 1
Slide 1
Slide 1
Dynamic Programming
bull We need a method that fills a table with partial results thatndash Does not do (avoidable) repeated workndash Does not fall prey to left-recursionndash Solves an exponential problem in (approximately) polynomial
time
Slide 1
Earley Parsing
Fills a table in a single sweep over the input wordsTable is length N+1 N is number of wordsTable entries represent
Completed constituents and their locationsIn-progress constituentsPredicted constituents
Slide 1
States
The table-entries are called states and are represented with dotted-rulesS -gt VP A VP is predictedNP -gt Det Nominal An NP is in progressVP -gt V NP A VP has been found
Slide 1
StatesLocations
It would be nice to know where these things are in the input sohellipS -gt VP [00] A VP is predicted at the
start of the sentenceNP -gt Det Nominal [12] An NP is in progress the
Det goes from 1 to 2
VP -gt V NP [03] A VP has been found starting at 0 and ending
at 3
Slide 1
Graphically
Slide 1
Earley
bull As with most dynamic programming approaches the answer is found by looking in the table in the right place
bull In this case there should be an S state in the final column that spans from 0 to n+1 and is complete
bull If thatrsquos the case yoursquore donendash S ndash α [0n+1]
bull So sweep through the table from 0 to n+1hellipndash New predicted states are created by states in current chartndash New incomplete states are created by advancing existing states
as new constituents are discoveredndash New complete states are created in the same way
Slide 1
Earley
bull More specificallyhellipndash Predict all the states you can upfrontndash Read a word
ndash Extend states based on matchesndash Add new predictionsndash Go to 2
ndash Look at N+1 to see if you have a winner
Slide 1
Earley and Left Recursion
bull So Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the search
ndash Never place a state into the chart thatrsquos already therendash Copy states before advancing them
S -gt NP VPNP -gt NP PP
bull The first rule predictsS -gt NP VP [00] that addsNP -gt NP PP [00]stops there since adding any subsequent prediction would be fruitless
bull When a state gets advanced make a copy and leave the original alonendash Say we have NP -gt NP PP [00]ndash We find an NP from 0 to 2 so we create NP -gt NP PP [02]ndash But we leave the original state as is
Slide 1
Predictor
Given a stateWith a non-terminal to right of dotThat is not a part-of-speech categoryCreate a new state for each expansion of the non-terminalPlace these new states into same chart entry as generated state
beginning and ending where generating state ends
So predictor looking at
S -gt VP [00]
results in
VP -gt Verb [00]VP -gt Verb NP [00]
Slide 1
Scanner
Given a stateWith a non-terminal to right of dotThat is a part-of-speech categoryIf the next word in the input matches this part-of-speechndash Create a new state with dot moved over the non-terminalndash insert in next chart entry
So scanner looking at
VP -gt Verb NP [00]
If the next word ldquobookrdquo can be a verb add new state
VP -gt Verb NP [01]
Add this state to chart entry following current one
Note Earley algorithm uses top-down input to disambiguate POS Only POS predicted by some state can get added to chart
Slide 1
Completer
Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this
category
bull copy state bull move dotbull insert in current chart entry
GivenNP -gt Det Nominal [13]VP -gt Verb NP [01]
AddVP -gt Verb NP [03]
Slide 1
Earley how do we know we are done
Find an S state in the final column that spans from 0 to n+1 and is complete
S ndashgt α [0n+1]
Slide 1
Earley
So sweep through the table from 0 to n+1hellip
New predicted states are created by starting top-down from S
New incomplete states are created by advancing existing states as new constituents are discovered
New complete states are created in the same way
Slide 1
Earley
More specificallyhellipPredict all the states you can upfront
Read a wordExtend states based on matchesAdd new predictionsGo to 2
Look at N+1 to see if you have a winner
Slide 1
Example
Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip
Slide 1
Example (contrsquod)
Slide 1
Example (contrsquod)
Slide 1
Example (contrsquod)
Slide 1
A simple example
Chart[0]γ rarr S [00] (dummy start state)S rarr NP VP [00 ] (predictor)NP rarr N [00 ] (predictor)
Chart[1]N rarr I [01 ] (scan)NP rarr N [01 ] (completer)S rarr NP VP [01 ] (completer)VP rarr V NP [11 ] (predictor)
Chart[2]V rarr saw [12 ] (scan) VP rarr V NP [12 ] (complete) NP rarr N [22 ] (predict)
Chart[3]NP rarr N [23 ] (scan)NP rarr N [23 ] (completer)VP rarr V NP [13 ] (completer)S rarr NP VP [03 ] (completer)
Grammar S rarr NP VP NP rarr N VP rarr V NP
Lexicon Nrarr I | saw | Mary Vrarr saw
Input I saw Mary
Sentence accepted
Slide 1
What is it
What kind of parser did we just describe (trick question)Earley parserhellip yesNot a parser ndash a recognizer
The presence of an S state with the right attributes in the right place indicates a successful recognition
But no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time
Slide 1
Converting Earley from Recognizer to ParserWith the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from
Slide 1
Augmenting the chart with structural information
S8S9
S10
S11
S13S12
S8
S9
S8
Slide 1
Retrieving Parse Trees from Chart
All the possible parses for an input are in the tableWe just need to read off all the backpointers from every complete S
in the last column of the tableFind all the S -gt X [0N+1]Follow the structural traces from the CompleterOf course this wonrsquot be polynomial time since there could be an
exponential number of treesSo we can at least represent ambiguity efficiently
Slide 1
Earley and Left Recursion
Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the searchNever place a state into the chart thatrsquos already thereCopy states before advancing them
Slide 1
Earley and Left Recursion 1
S -gt NP VPNP -gt NP PP
Predictor given first ruleS -gt NP VP [00]
PredictsNP -gt NP PP [00]stops there since predicting same again would be redundant
Slide 1
Earley and Left Recursion 2
When a state gets advanced make a copy and leave the original alonehellip
Say we have NP -gt NP PP [00]We find an NP from 0 to 2 so we create
NP -gt NP PP [02]But we leave the original state as is
Slide 1
Dynamic Programming Approaches
EarleyTop-down no filtering no restriction on grammar form
CYKBottom-up no filtering grammars restricted to Chomsky-Normal Form
(CNF)Details are not important
Bottom-up vs top-downWith or without filtersWith restrictions on grammar form or not
Slide 1
Example (contrsquod)
flight flight
Slide 1
Example (contrsquod)
flightflight
Slide 1
Bottom-Up Filtering
Slide 1
Possible Problem Left-Recursion
What happens in the following situationS -gt NP VPS -gt Aux NP VPNP -gt NP PPNP -gt Det NominalhellipWith the sentence starting with
Did the flighthellip
Slide 1
Solution Rule Ordering
S -gt Aux NP VPS -gt NP VPNP -gt Det NominalNP -gt NP PP
The key for the NP is that you want the recursive option after any base case
Slide 1
Avoiding Repeated Work
Parsing is hard and slow Itrsquos wasteful to redo stuff over and over and over
Consider an attempt to top-down parse the following as an NP
A flight from Indianapolis to Houston on TWA
Slide 1
flight
Slide 1
flight
flight
Slide 1
Slide 1
Slide 1
Dynamic Programming
bull We need a method that fills a table with partial results thatndash Does not do (avoidable) repeated workndash Does not fall prey to left-recursionndash Solves an exponential problem in (approximately) polynomial
time
Slide 1
Earley Parsing
Fills a table in a single sweep over the input wordsTable is length N+1 N is number of wordsTable entries represent
Completed constituents and their locationsIn-progress constituentsPredicted constituents
Slide 1
States
The table-entries are called states and are represented with dotted-rulesS -gt VP A VP is predictedNP -gt Det Nominal An NP is in progressVP -gt V NP A VP has been found
Slide 1
StatesLocations
It would be nice to know where these things are in the input sohellipS -gt VP [00] A VP is predicted at the
start of the sentenceNP -gt Det Nominal [12] An NP is in progress the
Det goes from 1 to 2
VP -gt V NP [03] A VP has been found starting at 0 and ending
at 3
Slide 1
Graphically
Slide 1
Earley
bull As with most dynamic programming approaches the answer is found by looking in the table in the right place
bull In this case there should be an S state in the final column that spans from 0 to n+1 and is complete
bull If thatrsquos the case yoursquore donendash S ndash α [0n+1]
bull So sweep through the table from 0 to n+1hellipndash New predicted states are created by states in current chartndash New incomplete states are created by advancing existing states
as new constituents are discoveredndash New complete states are created in the same way
Slide 1
Earley
bull More specificallyhellipndash Predict all the states you can upfrontndash Read a word
ndash Extend states based on matchesndash Add new predictionsndash Go to 2
ndash Look at N+1 to see if you have a winner
Slide 1
Earley and Left Recursion
bull So Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the search
ndash Never place a state into the chart thatrsquos already therendash Copy states before advancing them
S -gt NP VPNP -gt NP PP
bull The first rule predictsS -gt NP VP [00] that addsNP -gt NP PP [00]stops there since adding any subsequent prediction would be fruitless
bull When a state gets advanced make a copy and leave the original alonendash Say we have NP -gt NP PP [00]ndash We find an NP from 0 to 2 so we create NP -gt NP PP [02]ndash But we leave the original state as is
Slide 1
Predictor
Given a stateWith a non-terminal to right of dotThat is not a part-of-speech categoryCreate a new state for each expansion of the non-terminalPlace these new states into same chart entry as generated state
beginning and ending where generating state ends
So predictor looking at
S -gt VP [00]
results in
VP -gt Verb [00]VP -gt Verb NP [00]
Slide 1
Scanner
Given a stateWith a non-terminal to right of dotThat is a part-of-speech categoryIf the next word in the input matches this part-of-speechndash Create a new state with dot moved over the non-terminalndash insert in next chart entry
So scanner looking at
VP -gt Verb NP [00]
If the next word ldquobookrdquo can be a verb add new state
VP -gt Verb NP [01]
Add this state to chart entry following current one
Note Earley algorithm uses top-down input to disambiguate POS Only POS predicted by some state can get added to chart
Slide 1
Completer
Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this
category
bull copy state bull move dotbull insert in current chart entry
GivenNP -gt Det Nominal [13]VP -gt Verb NP [01]
AddVP -gt Verb NP [03]
Slide 1
Earley how do we know we are done
Find an S state in the final column that spans from 0 to n+1 and is complete
S ndashgt α [0n+1]
Slide 1
Earley
So sweep through the table from 0 to n+1hellip
New predicted states are created by starting top-down from S
New incomplete states are created by advancing existing states as new constituents are discovered
New complete states are created in the same way
Slide 1
Earley
More specificallyhellipPredict all the states you can upfront
Read a wordExtend states based on matchesAdd new predictionsGo to 2
Look at N+1 to see if you have a winner
Slide 1
Example
Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip
Slide 1
Example (contrsquod)
Slide 1
Example (contrsquod)
Slide 1
Example (contrsquod)
Slide 1
A simple example
Chart[0]γ rarr S [00] (dummy start state)S rarr NP VP [00 ] (predictor)NP rarr N [00 ] (predictor)
Chart[1]N rarr I [01 ] (scan)NP rarr N [01 ] (completer)S rarr NP VP [01 ] (completer)VP rarr V NP [11 ] (predictor)
Chart[2]V rarr saw [12 ] (scan) VP rarr V NP [12 ] (complete) NP rarr N [22 ] (predict)
Chart[3]NP rarr N [23 ] (scan)NP rarr N [23 ] (completer)VP rarr V NP [13 ] (completer)S rarr NP VP [03 ] (completer)
Grammar S rarr NP VP NP rarr N VP rarr V NP
Lexicon Nrarr I | saw | Mary Vrarr saw
Input I saw Mary
Sentence accepted
Slide 1
What is it
What kind of parser did we just describe (trick question)Earley parserhellip yesNot a parser ndash a recognizer
The presence of an S state with the right attributes in the right place indicates a successful recognition
But no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time
Slide 1
Converting Earley from Recognizer to ParserWith the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from
Slide 1
Augmenting the chart with structural information

[Figure: a chart whose completed states carry pointers back to the states (S8, S9, … S13) they were built from; image not preserved in this transcript.]
Retrieving Parse Trees from Chart
All the possible parses for an input are in the table; we just need to read off all the backpointers from every complete S in the last column of the table:
– Find all the S -> α • [0,n]
– Follow the structural traces from the Completer
Of course this won't be polynomial time, since there could be an exponential number of trees. But we can at least represent ambiguity efficiently.
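A minimal sketch of the backpointer idea, assuming dotted-rule states like those above: an extra children field records, for each dot advance over a non-terminal, the completed child state that licensed it, and trees are read off recursively. The helper names complete and tree are illustrative, not from the original materials.

from collections import namedtuple

# The dotted-rule state as before, plus backpointers: one completed
# child State per dot advance over a non-terminal.
State = namedtuple("State", "lhs rhs dot start end children")

def complete(waiting, finished):
    # A completer that remembers where it came from: the advanced copy
    # points back at the completed child state.
    return State(waiting.lhs, waiting.rhs, waiting.dot + 1,
                 waiting.start, finished.end,
                 waiting.children + (finished,))

def tree(state):
    # Read a parse tree off the backpointers of a complete state.
    if not state.children:                    # POS state: a leaf
        return (state.lhs, state.rhs[0])
    return (state.lhs,) + tuple(tree(c) for c in state.children)

# VP -> Verb . NP [0,1] completed by NP -> Det Nominal . [1,3]
verb = State("Verb", ("book",), 1, 0, 1, ())
np = State("NP", ("Det", "Nominal"), 2, 1, 3,
           (State("Det", ("that",), 1, 1, 2, ()),
            State("Nominal", ("flight",), 1, 2, 3, ())))
vp = complete(State("VP", ("Verb", "NP"), 1, 0, 1, (verb,)), np)
print(tree(vp))
# ('VP', ('Verb', 'book'), ('NP', ('Det', 'that'), ('Nominal', 'flight')))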
Earley and Left Recursion
Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the search:
– Never place a state into the chart that's already there
– Copy states before advancing them
Earley and Left Recursion 1
S -> NP VP
NP -> NP PP

The predictor, given the first rule, produces S -> • NP VP [0,0], which predicts NP -> • NP PP [0,0], and stops there, since predicting the same state again would be redundant.
Earley and Left Recursion 2
When a state gets advanced, make a copy and leave the original alone…

Say we have NP -> • NP PP [0,0]. We find an NP from 0 to 2, so we create NP -> NP • PP [0,2], but we leave the original state as is.
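In code, "copy before advancing" falls out naturally if states are immutable values; this illustrative fragment (same State tuple as in the earlier sketches) shows the original prediction surviving the advance.

from collections import namedtuple

State = namedtuple("State", "lhs rhs dot start end")

chart = [[State("NP", ("NP", "PP"), 0, 0, 0)], [], []]  # NP -> . NP PP [0,0]

# An NP is found from 0 to 2: build an advanced *copy*; the original
# prediction stays in chart[0], still available for further NPs.
old = chart[0][0]
new = State(old.lhs, old.rhs, old.dot + 1, old.start, 2)  # NP -> NP . PP [0,2]
chart[2].append(new)
print(chart[0][0], chart[2][0], sep="\n")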
Dynamic Programming Approaches
Earley
– Top-down, no filtering, no restriction on grammar form

CYK
– Bottom-up, no filtering, grammars restricted to Chomsky Normal Form (CNF)

Details are not important…
– Bottom-up vs. top-down
– With or without filters
– With restrictions on grammar form or not
Predictor
Given a stateWith a non-terminal to right of dotThat is not a part-of-speech categoryCreate a new state for each expansion of the non-terminalPlace these new states into same chart entry as generated state
beginning and ending where generating state ends
So predictor looking at
S -gt VP [00]
results in
VP -gt Verb [00]VP -gt Verb NP [00]
Slide 1
Scanner
Given a stateWith a non-terminal to right of dotThat is a part-of-speech categoryIf the next word in the input matches this part-of-speechndash Create a new state with dot moved over the non-terminalndash insert in next chart entry
So scanner looking at
VP -gt Verb NP [00]
If the next word ldquobookrdquo can be a verb add new state
VP -gt Verb NP [01]
Add this state to chart entry following current one
Note Earley algorithm uses top-down input to disambiguate POS Only POS predicted by some state can get added to chart
Slide 1
Completer
Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this
category
bull copy state bull move dotbull insert in current chart entry
GivenNP -gt Det Nominal [13]VP -gt Verb NP [01]
AddVP -gt Verb NP [03]
Slide 1
Earley how do we know we are done
Find an S state in the final column that spans from 0 to n+1 and is complete
S ndashgt α [0n+1]
Slide 1
Earley
So sweep through the table from 0 to n+1hellip
New predicted states are created by starting top-down from S
New incomplete states are created by advancing existing states as new constituents are discovered
New complete states are created in the same way
Slide 1
Earley
More specificallyhellipPredict all the states you can upfront
Read a wordExtend states based on matchesAdd new predictionsGo to 2
Look at N+1 to see if you have a winner
Slide 1
Example
Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip
Slide 1
Example (contrsquod)
Slide 1
Example (contrsquod)
Slide 1
Example (contrsquod)
Slide 1
A simple example
Chart[0]γ rarr S [00] (dummy start state)S rarr NP VP [00 ] (predictor)NP rarr N [00 ] (predictor)
Chart[1]N rarr I [01 ] (scan)NP rarr N [01 ] (completer)S rarr NP VP [01 ] (completer)VP rarr V NP [11 ] (predictor)
Chart[2]V rarr saw [12 ] (scan) VP rarr V NP [12 ] (complete) NP rarr N [22 ] (predict)
Chart[3]NP rarr N [23 ] (scan)NP rarr N [23 ] (completer)VP rarr V NP [13 ] (completer)S rarr NP VP [03 ] (completer)
Grammar S rarr NP VP NP rarr N VP rarr V NP
Lexicon Nrarr I | saw | Mary Vrarr saw
Input I saw Mary
Sentence accepted
Slide 1
What is it
What kind of parser did we just describe (trick question)Earley parserhellip yesNot a parser ndash a recognizer
The presence of an S state with the right attributes in the right place indicates a successful recognition
But no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time
Slide 1
Converting Earley from Recognizer to ParserWith the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from
Slide 1
Augmenting the chart with structural information
S8S9
S10
S11
S13S12
S8
S9
S8
Slide 1
Retrieving Parse Trees from Chart
All the possible parses for an input are in the tableWe just need to read off all the backpointers from every complete S
in the last column of the tableFind all the S -gt X [0N+1]Follow the structural traces from the CompleterOf course this wonrsquot be polynomial time since there could be an
exponential number of treesSo we can at least represent ambiguity efficiently
Slide 1
Earley and Left Recursion
Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the searchNever place a state into the chart thatrsquos already thereCopy states before advancing them
Slide 1
Earley and Left Recursion 1
S -gt NP VPNP -gt NP PP
Predictor given first ruleS -gt NP VP [00]
PredictsNP -gt NP PP [00]stops there since predicting same again would be redundant
Slide 1
Earley and Left Recursion 2
When a state gets advanced make a copy and leave the original alonehellip
Say we have NP -gt NP PP [00]We find an NP from 0 to 2 so we create
NP -gt NP PP [02]But we leave the original state as is
Slide 1
Dynamic Programming Approaches
EarleyTop-down no filtering no restriction on grammar form
CYKBottom-up no filtering grammars restricted to Chomsky-Normal Form
(CNF)Details are not important
Bottom-up vs top-downWith or without filtersWith restrictions on grammar form or not
Slide 1
Bottom-Up Filtering
Slide 1
Possible Problem Left-Recursion
What happens in the following situationS -gt NP VPS -gt Aux NP VPNP -gt NP PPNP -gt Det NominalhellipWith the sentence starting with
Did the flighthellip
Slide 1
Solution Rule Ordering
S -gt Aux NP VPS -gt NP VPNP -gt Det NominalNP -gt NP PP
The key for the NP is that you want the recursive option after any base case
Slide 1
Avoiding Repeated Work
Parsing is hard and slow Itrsquos wasteful to redo stuff over and over and over
Consider an attempt to top-down parse the following as an NP
A flight from Indianapolis to Houston on TWA
Slide 1
flight
Slide 1
flight
flight
Slide 1
Slide 1
Slide 1
Dynamic Programming
bull We need a method that fills a table with partial results thatndash Does not do (avoidable) repeated workndash Does not fall prey to left-recursionndash Solves an exponential problem in (approximately) polynomial
time
Slide 1
Earley Parsing
Fills a table in a single sweep over the input wordsTable is length N+1 N is number of wordsTable entries represent
Completed constituents and their locationsIn-progress constituentsPredicted constituents
Slide 1
States
The table-entries are called states and are represented with dotted-rulesS -gt VP A VP is predictedNP -gt Det Nominal An NP is in progressVP -gt V NP A VP has been found
Slide 1
StatesLocations
It would be nice to know where these things are in the input sohellipS -gt VP [00] A VP is predicted at the
start of the sentenceNP -gt Det Nominal [12] An NP is in progress the
Det goes from 1 to 2
VP -gt V NP [03] A VP has been found starting at 0 and ending
at 3
Slide 1
Graphically
Slide 1
Earley
bull As with most dynamic programming approaches the answer is found by looking in the table in the right place
bull In this case there should be an S state in the final column that spans from 0 to n+1 and is complete
bull If thatrsquos the case yoursquore donendash S ndash α [0n+1]
bull So sweep through the table from 0 to n+1hellipndash New predicted states are created by states in current chartndash New incomplete states are created by advancing existing states
as new constituents are discoveredndash New complete states are created in the same way
Slide 1
Earley
bull More specificallyhellipndash Predict all the states you can upfrontndash Read a word
ndash Extend states based on matchesndash Add new predictionsndash Go to 2
ndash Look at N+1 to see if you have a winner
Slide 1
Earley and Left Recursion
bull So Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the search
ndash Never place a state into the chart thatrsquos already therendash Copy states before advancing them
S -gt NP VPNP -gt NP PP
bull The first rule predictsS -gt NP VP [00] that addsNP -gt NP PP [00]stops there since adding any subsequent prediction would be fruitless
bull When a state gets advanced make a copy and leave the original alonendash Say we have NP -gt NP PP [00]ndash We find an NP from 0 to 2 so we create NP -gt NP PP [02]ndash But we leave the original state as is
Slide 1
Predictor
Given a stateWith a non-terminal to right of dotThat is not a part-of-speech categoryCreate a new state for each expansion of the non-terminalPlace these new states into same chart entry as generated state
beginning and ending where generating state ends
So predictor looking at
S -gt VP [00]
results in
VP -gt Verb [00]VP -gt Verb NP [00]
Slide 1
Scanner
Given a stateWith a non-terminal to right of dotThat is a part-of-speech categoryIf the next word in the input matches this part-of-speechndash Create a new state with dot moved over the non-terminalndash insert in next chart entry
So scanner looking at
VP -gt Verb NP [00]
If the next word ldquobookrdquo can be a verb add new state
VP -gt Verb NP [01]
Add this state to chart entry following current one
Note Earley algorithm uses top-down input to disambiguate POS Only POS predicted by some state can get added to chart
Slide 1
Completer
Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this
category
bull copy state bull move dotbull insert in current chart entry
GivenNP -gt Det Nominal [13]VP -gt Verb NP [01]
AddVP -gt Verb NP [03]
Slide 1
Earley how do we know we are done
Find an S state in the final column that spans from 0 to n+1 and is complete
S ndashgt α [0n+1]
Slide 1
Earley
So sweep through the table from 0 to n+1hellip
New predicted states are created by starting top-down from S
New incomplete states are created by advancing existing states as new constituents are discovered
New complete states are created in the same way
Slide 1
Earley
More specificallyhellipPredict all the states you can upfront
Read a wordExtend states based on matchesAdd new predictionsGo to 2
Look at N+1 to see if you have a winner
Slide 1
Example
Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip
Slide 1
Example (contrsquod)
Slide 1
Example (contrsquod)
Slide 1
Example (contrsquod)
Slide 1
A simple example
Chart[0]γ rarr S [00] (dummy start state)S rarr NP VP [00 ] (predictor)NP rarr N [00 ] (predictor)
Chart[1]N rarr I [01 ] (scan)NP rarr N [01 ] (completer)S rarr NP VP [01 ] (completer)VP rarr V NP [11 ] (predictor)
Chart[2]V rarr saw [12 ] (scan) VP rarr V NP [12 ] (complete) NP rarr N [22 ] (predict)
Chart[3]NP rarr N [23 ] (scan)NP rarr N [23 ] (completer)VP rarr V NP [13 ] (completer)S rarr NP VP [03 ] (completer)
Grammar S rarr NP VP NP rarr N VP rarr V NP
Lexicon Nrarr I | saw | Mary Vrarr saw
Input I saw Mary
Sentence accepted
Slide 1
What is it
What kind of parser did we just describe (trick question)Earley parserhellip yesNot a parser ndash a recognizer
The presence of an S state with the right attributes in the right place indicates a successful recognition
But no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time
Slide 1
Converting Earley from Recognizer to ParserWith the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from
Slide 1
Augmenting the chart with structural information
S8S9
S10
S11
S13S12
S8
S9
S8
Slide 1
Retrieving Parse Trees from Chart
All the possible parses for an input are in the tableWe just need to read off all the backpointers from every complete S
in the last column of the tableFind all the S -gt X [0N+1]Follow the structural traces from the CompleterOf course this wonrsquot be polynomial time since there could be an
exponential number of treesSo we can at least represent ambiguity efficiently
Slide 1
Earley and Left Recursion
Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the searchNever place a state into the chart thatrsquos already thereCopy states before advancing them
Slide 1
Earley and Left Recursion 1
S -gt NP VPNP -gt NP PP
Predictor given first ruleS -gt NP VP [00]
PredictsNP -gt NP PP [00]stops there since predicting same again would be redundant
Slide 1
Earley and Left Recursion 2
When a state gets advanced make a copy and leave the original alonehellip
Say we have NP -gt NP PP [00]We find an NP from 0 to 2 so we create
NP -gt NP PP [02]But we leave the original state as is
Slide 1
Dynamic Programming Approaches
EarleyTop-down no filtering no restriction on grammar form
CYKBottom-up no filtering grammars restricted to Chomsky-Normal Form
(CNF)Details are not important
Bottom-up vs top-downWith or without filtersWith restrictions on grammar form or not
Slide 1
Possible Problem Left-Recursion
What happens in the following situationS -gt NP VPS -gt Aux NP VPNP -gt NP PPNP -gt Det NominalhellipWith the sentence starting with
Did the flighthellip
Slide 1
Solution Rule Ordering
S -gt Aux NP VPS -gt NP VPNP -gt Det NominalNP -gt NP PP
The key for the NP is that you want the recursive option after any base case
Slide 1
Avoiding Repeated Work
Parsing is hard and slow Itrsquos wasteful to redo stuff over and over and over
Consider an attempt to top-down parse the following as an NP
A flight from Indianapolis to Houston on TWA
Slide 1
flight
Slide 1
flight
flight
Slide 1
Slide 1
Slide 1
Dynamic Programming
bull We need a method that fills a table with partial results thatndash Does not do (avoidable) repeated workndash Does not fall prey to left-recursionndash Solves an exponential problem in (approximately) polynomial
time
Slide 1
Earley Parsing
Fills a table in a single sweep over the input wordsTable is length N+1 N is number of wordsTable entries represent
Completed constituents and their locationsIn-progress constituentsPredicted constituents
Slide 1
States
The table-entries are called states and are represented with dotted-rulesS -gt VP A VP is predictedNP -gt Det Nominal An NP is in progressVP -gt V NP A VP has been found
Slide 1
StatesLocations
It would be nice to know where these things are in the input sohellipS -gt VP [00] A VP is predicted at the
start of the sentenceNP -gt Det Nominal [12] An NP is in progress the
Det goes from 1 to 2
VP -gt V NP [03] A VP has been found starting at 0 and ending
at 3
Slide 1
Graphically
Slide 1
Earley
bull As with most dynamic programming approaches the answer is found by looking in the table in the right place
bull In this case there should be an S state in the final column that spans from 0 to n+1 and is complete
bull If thatrsquos the case yoursquore donendash S ndash α [0n+1]
bull So sweep through the table from 0 to n+1hellipndash New predicted states are created by states in current chartndash New incomplete states are created by advancing existing states
as new constituents are discoveredndash New complete states are created in the same way
Slide 1
Earley
bull More specificallyhellipndash Predict all the states you can upfrontndash Read a word
ndash Extend states based on matchesndash Add new predictionsndash Go to 2
ndash Look at N+1 to see if you have a winner
Slide 1
Earley and Left Recursion
bull So Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the search
ndash Never place a state into the chart thatrsquos already therendash Copy states before advancing them
S -gt NP VPNP -gt NP PP
bull The first rule predictsS -gt NP VP [00] that addsNP -gt NP PP [00]stops there since adding any subsequent prediction would be fruitless
bull When a state gets advanced make a copy and leave the original alonendash Say we have NP -gt NP PP [00]ndash We find an NP from 0 to 2 so we create NP -gt NP PP [02]ndash But we leave the original state as is
Slide 1
Predictor
Given a stateWith a non-terminal to right of dotThat is not a part-of-speech categoryCreate a new state for each expansion of the non-terminalPlace these new states into same chart entry as generated state
beginning and ending where generating state ends
So predictor looking at
S -gt VP [00]
results in
VP -gt Verb [00]VP -gt Verb NP [00]
Slide 1
Scanner
Given a stateWith a non-terminal to right of dotThat is a part-of-speech categoryIf the next word in the input matches this part-of-speechndash Create a new state with dot moved over the non-terminalndash insert in next chart entry
So scanner looking at
VP -gt Verb NP [00]
If the next word ldquobookrdquo can be a verb add new state
VP -gt Verb NP [01]
Add this state to chart entry following current one
Note Earley algorithm uses top-down input to disambiguate POS Only POS predicted by some state can get added to chart
Slide 1
Completer
Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this
category
bull copy state bull move dotbull insert in current chart entry
GivenNP -gt Det Nominal [13]VP -gt Verb NP [01]
AddVP -gt Verb NP [03]
Slide 1
Earley how do we know we are done
Find an S state in the final column that spans from 0 to n+1 and is complete
S ndashgt α [0n+1]
Slide 1
Earley
So sweep through the table from 0 to n+1hellip
New predicted states are created by starting top-down from S
New incomplete states are created by advancing existing states as new constituents are discovered
New complete states are created in the same way
Slide 1
Earley
More specificallyhellipPredict all the states you can upfront
Read a wordExtend states based on matchesAdd new predictionsGo to 2
Look at N+1 to see if you have a winner
Slide 1
Example
Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip
Slide 1
Example (contrsquod)
Slide 1
Example (contrsquod)
Slide 1
Example (contrsquod)
Slide 1
A simple example
Chart[0]γ rarr S [00] (dummy start state)S rarr NP VP [00 ] (predictor)NP rarr N [00 ] (predictor)
Chart[1]N rarr I [01 ] (scan)NP rarr N [01 ] (completer)S rarr NP VP [01 ] (completer)VP rarr V NP [11 ] (predictor)
Chart[2]V rarr saw [12 ] (scan) VP rarr V NP [12 ] (complete) NP rarr N [22 ] (predict)
Chart[3]NP rarr N [23 ] (scan)NP rarr N [23 ] (completer)VP rarr V NP [13 ] (completer)S rarr NP VP [03 ] (completer)
Grammar S rarr NP VP NP rarr N VP rarr V NP
Lexicon Nrarr I | saw | Mary Vrarr saw
Input I saw Mary
Sentence accepted
Slide 1
What is it
What kind of parser did we just describe (trick question)Earley parserhellip yesNot a parser ndash a recognizer
The presence of an S state with the right attributes in the right place indicates a successful recognition
But no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time
Slide 1
Converting Earley from Recognizer to ParserWith the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from
Slide 1
Augmenting the chart with structural information
S8S9
S10
S11
S13S12
S8
S9
S8
Slide 1
Retrieving Parse Trees from Chart
All the possible parses for an input are in the tableWe just need to read off all the backpointers from every complete S
in the last column of the tableFind all the S -gt X [0N+1]Follow the structural traces from the CompleterOf course this wonrsquot be polynomial time since there could be an
exponential number of treesSo we can at least represent ambiguity efficiently
Slide 1
Earley and Left Recursion
Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the searchNever place a state into the chart thatrsquos already thereCopy states before advancing them
Slide 1
Earley and Left Recursion 1
S -gt NP VPNP -gt NP PP
Predictor given first ruleS -gt NP VP [00]
PredictsNP -gt NP PP [00]stops there since predicting same again would be redundant
Slide 1
Earley and Left Recursion 2
When a state gets advanced make a copy and leave the original alonehellip
Say we have NP -gt NP PP [00]We find an NP from 0 to 2 so we create
NP -gt NP PP [02]But we leave the original state as is
Slide 1
Dynamic Programming Approaches
EarleyTop-down no filtering no restriction on grammar form
CYKBottom-up no filtering grammars restricted to Chomsky-Normal Form
(CNF)Details are not important
Bottom-up vs top-downWith or without filtersWith restrictions on grammar form or not
Slide 1
Solution Rule Ordering
S -gt Aux NP VPS -gt NP VPNP -gt Det NominalNP -gt NP PP
The key for the NP is that you want the recursive option after any base case
Slide 1
Avoiding Repeated Work
Parsing is hard and slow Itrsquos wasteful to redo stuff over and over and over
Consider an attempt to top-down parse the following as an NP
A flight from Indianapolis to Houston on TWA
Slide 1
flight
Slide 1
flight
flight
Slide 1
Slide 1
Slide 1
Dynamic Programming
bull We need a method that fills a table with partial results thatndash Does not do (avoidable) repeated workndash Does not fall prey to left-recursionndash Solves an exponential problem in (approximately) polynomial
time
Slide 1
Earley Parsing
Fills a table in a single sweep over the input wordsTable is length N+1 N is number of wordsTable entries represent
Completed constituents and their locationsIn-progress constituentsPredicted constituents
Slide 1
States
The table-entries are called states and are represented with dotted-rulesS -gt VP A VP is predictedNP -gt Det Nominal An NP is in progressVP -gt V NP A VP has been found
Slide 1
StatesLocations
It would be nice to know where these things are in the input sohellipS -gt VP [00] A VP is predicted at the
start of the sentenceNP -gt Det Nominal [12] An NP is in progress the
Det goes from 1 to 2
VP -gt V NP [03] A VP has been found starting at 0 and ending
at 3
Slide 1
Graphically
Slide 1
Earley
bull As with most dynamic programming approaches the answer is found by looking in the table in the right place
bull In this case there should be an S state in the final column that spans from 0 to n+1 and is complete
bull If thatrsquos the case yoursquore donendash S ndash α [0n+1]
bull So sweep through the table from 0 to n+1hellipndash New predicted states are created by states in current chartndash New incomplete states are created by advancing existing states
as new constituents are discoveredndash New complete states are created in the same way
Slide 1
Earley
bull More specificallyhellipndash Predict all the states you can upfrontndash Read a word
ndash Extend states based on matchesndash Add new predictionsndash Go to 2
ndash Look at N+1 to see if you have a winner
Slide 1
Earley and Left Recursion
bull So Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the search
ndash Never place a state into the chart thatrsquos already therendash Copy states before advancing them
S -gt NP VPNP -gt NP PP
bull The first rule predictsS -gt NP VP [00] that addsNP -gt NP PP [00]stops there since adding any subsequent prediction would be fruitless
bull When a state gets advanced make a copy and leave the original alonendash Say we have NP -gt NP PP [00]ndash We find an NP from 0 to 2 so we create NP -gt NP PP [02]ndash But we leave the original state as is
Slide 1
Predictor
Given a stateWith a non-terminal to right of dotThat is not a part-of-speech categoryCreate a new state for each expansion of the non-terminalPlace these new states into same chart entry as generated state
beginning and ending where generating state ends
So predictor looking at
S -gt VP [00]
results in
VP -gt Verb [00]VP -gt Verb NP [00]
Slide 1
Scanner
Given a stateWith a non-terminal to right of dotThat is a part-of-speech categoryIf the next word in the input matches this part-of-speechndash Create a new state with dot moved over the non-terminalndash insert in next chart entry
So scanner looking at
VP -gt Verb NP [00]
If the next word ldquobookrdquo can be a verb add new state
VP -gt Verb NP [01]
Add this state to chart entry following current one
Note Earley algorithm uses top-down input to disambiguate POS Only POS predicted by some state can get added to chart
Slide 1
Completer
Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this
category
bull copy state bull move dotbull insert in current chart entry
GivenNP -gt Det Nominal [13]VP -gt Verb NP [01]
AddVP -gt Verb NP [03]
Slide 1
Earley how do we know we are done
Find an S state in the final column that spans from 0 to n+1 and is complete
S ndashgt α [0n+1]
Slide 1
Earley
So sweep through the table from 0 to n+1hellip
New predicted states are created by starting top-down from S
New incomplete states are created by advancing existing states as new constituents are discovered
New complete states are created in the same way
Slide 1
Earley
More specificallyhellipPredict all the states you can upfront
Read a wordExtend states based on matchesAdd new predictionsGo to 2
Look at N+1 to see if you have a winner
Slide 1
Example
Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip
Slide 1
Example (contrsquod)
Slide 1
Example (contrsquod)
Slide 1
Example (contrsquod)
Slide 1
A simple example
Chart[0]γ rarr S [00] (dummy start state)S rarr NP VP [00 ] (predictor)NP rarr N [00 ] (predictor)
Chart[1]N rarr I [01 ] (scan)NP rarr N [01 ] (completer)S rarr NP VP [01 ] (completer)VP rarr V NP [11 ] (predictor)
Chart[2]V rarr saw [12 ] (scan) VP rarr V NP [12 ] (complete) NP rarr N [22 ] (predict)
Chart[3]NP rarr N [23 ] (scan)NP rarr N [23 ] (completer)VP rarr V NP [13 ] (completer)S rarr NP VP [03 ] (completer)
Grammar S rarr NP VP NP rarr N VP rarr V NP
Lexicon Nrarr I | saw | Mary Vrarr saw
Input I saw Mary
Sentence accepted
Slide 1
What is it
What kind of parser did we just describe (trick question)Earley parserhellip yesNot a parser ndash a recognizer
The presence of an S state with the right attributes in the right place indicates a successful recognition
But no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time
Slide 1
Converting Earley from Recognizer to ParserWith the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from
Slide 1
Augmenting the chart with structural information
S8S9
S10
S11
S13S12
S8
S9
S8
Slide 1
Retrieving Parse Trees from Chart
All the possible parses for an input are in the tableWe just need to read off all the backpointers from every complete S
in the last column of the tableFind all the S -gt X [0N+1]Follow the structural traces from the CompleterOf course this wonrsquot be polynomial time since there could be an
exponential number of treesSo we can at least represent ambiguity efficiently
Slide 1
Earley and Left Recursion
Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the searchNever place a state into the chart thatrsquos already thereCopy states before advancing them
Slide 1
Earley and Left Recursion 1
S -gt NP VPNP -gt NP PP
Predictor given first ruleS -gt NP VP [00]
PredictsNP -gt NP PP [00]stops there since predicting same again would be redundant
Slide 1
Earley and Left Recursion 2
When a state gets advanced make a copy and leave the original alonehellip
Say we have NP -gt NP PP [00]We find an NP from 0 to 2 so we create
NP -gt NP PP [02]But we leave the original state as is
Slide 1
Dynamic Programming Approaches
EarleyTop-down no filtering no restriction on grammar form
CYKBottom-up no filtering grammars restricted to Chomsky-Normal Form
(CNF)Details are not important
Bottom-up vs top-downWith or without filtersWith restrictions on grammar form or not
Slide 1
Avoiding Repeated Work
Parsing is hard and slow Itrsquos wasteful to redo stuff over and over and over
Consider an attempt to top-down parse the following as an NP
A flight from Indianapolis to Houston on TWA
Slide 1
flight
Slide 1
flight
flight
Slide 1
Slide 1
Slide 1
Dynamic Programming
bull We need a method that fills a table with partial results thatndash Does not do (avoidable) repeated workndash Does not fall prey to left-recursionndash Solves an exponential problem in (approximately) polynomial
time
Slide 1
Earley Parsing
Fills a table in a single sweep over the input wordsTable is length N+1 N is number of wordsTable entries represent
Completed constituents and their locationsIn-progress constituentsPredicted constituents
Slide 1
States
The table-entries are called states and are represented with dotted-rulesS -gt VP A VP is predictedNP -gt Det Nominal An NP is in progressVP -gt V NP A VP has been found
Slide 1
StatesLocations
It would be nice to know where these things are in the input sohellipS -gt VP [00] A VP is predicted at the
start of the sentenceNP -gt Det Nominal [12] An NP is in progress the
Det goes from 1 to 2
VP -gt V NP [03] A VP has been found starting at 0 and ending
at 3
Slide 1
Graphically
Slide 1
Earley
bull As with most dynamic programming approaches the answer is found by looking in the table in the right place
bull In this case there should be an S state in the final column that spans from 0 to n+1 and is complete
bull If thatrsquos the case yoursquore donendash S ndash α [0n+1]
bull So sweep through the table from 0 to n+1hellipndash New predicted states are created by states in current chartndash New incomplete states are created by advancing existing states
as new constituents are discoveredndash New complete states are created in the same way
Slide 1
Earley
bull More specificallyhellipndash Predict all the states you can upfrontndash Read a word
ndash Extend states based on matchesndash Add new predictionsndash Go to 2
ndash Look at N+1 to see if you have a winner
Slide 1
Earley and Left Recursion
bull So Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the search
ndash Never place a state into the chart thatrsquos already therendash Copy states before advancing them
S -gt NP VPNP -gt NP PP
bull The first rule predictsS -gt NP VP [00] that addsNP -gt NP PP [00]stops there since adding any subsequent prediction would be fruitless
bull When a state gets advanced make a copy and leave the original alonendash Say we have NP -gt NP PP [00]ndash We find an NP from 0 to 2 so we create NP -gt NP PP [02]ndash But we leave the original state as is
Slide 1
Predictor
Given a stateWith a non-terminal to right of dotThat is not a part-of-speech categoryCreate a new state for each expansion of the non-terminalPlace these new states into same chart entry as generated state
beginning and ending where generating state ends
So predictor looking at
S -gt VP [00]
results in
VP -gt Verb [00]VP -gt Verb NP [00]
Slide 1
Scanner
Given a stateWith a non-terminal to right of dotThat is a part-of-speech categoryIf the next word in the input matches this part-of-speechndash Create a new state with dot moved over the non-terminalndash insert in next chart entry
So scanner looking at
VP -gt Verb NP [00]
If the next word ldquobookrdquo can be a verb add new state
VP -gt Verb NP [01]
Add this state to chart entry following current one
Note Earley algorithm uses top-down input to disambiguate POS Only POS predicted by some state can get added to chart
Slide 1
Completer
Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this
category
bull copy state bull move dotbull insert in current chart entry
GivenNP -gt Det Nominal [13]VP -gt Verb NP [01]
AddVP -gt Verb NP [03]
Slide 1
Earley how do we know we are done
Find an S state in the final column that spans from 0 to n+1 and is complete
S ndashgt α [0n+1]
Slide 1
Earley
So sweep through the table from 0 to n+1hellip
New predicted states are created by starting top-down from S
New incomplete states are created by advancing existing states as new constituents are discovered
New complete states are created in the same way
Slide 1
Earley
More specificallyhellipPredict all the states you can upfront
Read a wordExtend states based on matchesAdd new predictionsGo to 2
Look at N+1 to see if you have a winner
Slide 1
Example
Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip
Slide 1
Example (contrsquod)
Slide 1
Example (contrsquod)
Slide 1
Example (contrsquod)
Slide 1
A simple example
Chart[0]γ rarr S [00] (dummy start state)S rarr NP VP [00 ] (predictor)NP rarr N [00 ] (predictor)
Chart[1]N rarr I [01 ] (scan)NP rarr N [01 ] (completer)S rarr NP VP [01 ] (completer)VP rarr V NP [11 ] (predictor)
Chart[2]V rarr saw [12 ] (scan) VP rarr V NP [12 ] (complete) NP rarr N [22 ] (predict)
Chart[3]NP rarr N [23 ] (scan)NP rarr N [23 ] (completer)VP rarr V NP [13 ] (completer)S rarr NP VP [03 ] (completer)
Grammar S rarr NP VP NP rarr N VP rarr V NP
Lexicon Nrarr I | saw | Mary Vrarr saw
Input I saw Mary
Sentence accepted
Slide 1
What is it
What kind of parser did we just describe (trick question)Earley parserhellip yesNot a parser ndash a recognizer
The presence of an S state with the right attributes in the right place indicates a successful recognition
But no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time
Slide 1
Converting Earley from Recognizer to ParserWith the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from
Slide 1
Augmenting the chart with structural information
S8S9
S10
S11
S13S12
S8
S9
S8
Slide 1
Retrieving Parse Trees from Chart
All the possible parses for an input are in the tableWe just need to read off all the backpointers from every complete S
in the last column of the tableFind all the S -gt X [0N+1]Follow the structural traces from the CompleterOf course this wonrsquot be polynomial time since there could be an
exponential number of treesSo we can at least represent ambiguity efficiently
Slide 1
Earley and Left Recursion
Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the searchNever place a state into the chart thatrsquos already thereCopy states before advancing them
Slide 1
Earley and Left Recursion 1
S -gt NP VPNP -gt NP PP
Predictor given first ruleS -gt NP VP [00]
PredictsNP -gt NP PP [00]stops there since predicting same again would be redundant
Slide 1
Earley and Left Recursion 2
When a state gets advanced make a copy and leave the original alonehellip
Say we have NP -gt NP PP [00]We find an NP from 0 to 2 so we create
NP -gt NP PP [02]But we leave the original state as is
Slide 1
Dynamic Programming Approaches
EarleyTop-down no filtering no restriction on grammar form
CYKBottom-up no filtering grammars restricted to Chomsky-Normal Form
(CNF)Details are not important
Bottom-up vs top-downWith or without filtersWith restrictions on grammar form or not
Slide 1
flight
Slide 1
flight
flight
Slide 1
Slide 1
Slide 1
Dynamic Programming
bull We need a method that fills a table with partial results thatndash Does not do (avoidable) repeated workndash Does not fall prey to left-recursionndash Solves an exponential problem in (approximately) polynomial
time
Slide 1
Earley Parsing
Fills a table in a single sweep over the input wordsTable is length N+1 N is number of wordsTable entries represent
Completed constituents and their locationsIn-progress constituentsPredicted constituents
Slide 1
States
The table-entries are called states and are represented with dotted-rulesS -gt VP A VP is predictedNP -gt Det Nominal An NP is in progressVP -gt V NP A VP has been found
Slide 1
StatesLocations
It would be nice to know where these things are in the input sohellipS -gt VP [00] A VP is predicted at the
start of the sentenceNP -gt Det Nominal [12] An NP is in progress the
Det goes from 1 to 2
VP -gt V NP [03] A VP has been found starting at 0 and ending
at 3
Slide 1
Graphically
Slide 1
Earley
bull As with most dynamic programming approaches the answer is found by looking in the table in the right place
bull In this case there should be an S state in the final column that spans from 0 to n+1 and is complete
bull If thatrsquos the case yoursquore donendash S ndash α [0n+1]
bull So sweep through the table from 0 to n+1hellipndash New predicted states are created by states in current chartndash New incomplete states are created by advancing existing states
as new constituents are discoveredndash New complete states are created in the same way
Slide 1
Earley
bull More specificallyhellipndash Predict all the states you can upfrontndash Read a word
ndash Extend states based on matchesndash Add new predictionsndash Go to 2
ndash Look at N+1 to see if you have a winner
Slide 1
Earley and Left Recursion
bull So Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the search
ndash Never place a state into the chart thatrsquos already therendash Copy states before advancing them
S -gt NP VPNP -gt NP PP
bull The first rule predictsS -gt NP VP [00] that addsNP -gt NP PP [00]stops there since adding any subsequent prediction would be fruitless
bull When a state gets advanced make a copy and leave the original alonendash Say we have NP -gt NP PP [00]ndash We find an NP from 0 to 2 so we create NP -gt NP PP [02]ndash But we leave the original state as is
Slide 1
Predictor
Given a stateWith a non-terminal to right of dotThat is not a part-of-speech categoryCreate a new state for each expansion of the non-terminalPlace these new states into same chart entry as generated state
beginning and ending where generating state ends
So predictor looking at
S -gt VP [00]
results in
VP -gt Verb [00]VP -gt Verb NP [00]
Slide 1
Scanner
Given a stateWith a non-terminal to right of dotThat is a part-of-speech categoryIf the next word in the input matches this part-of-speechndash Create a new state with dot moved over the non-terminalndash insert in next chart entry
So scanner looking at
VP -gt Verb NP [00]
If the next word ldquobookrdquo can be a verb add new state
VP -gt Verb NP [01]
Add this state to chart entry following current one
Note Earley algorithm uses top-down input to disambiguate POS Only POS predicted by some state can get added to chart
Slide 1
Completer
Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this
category
bull copy state bull move dotbull insert in current chart entry
GivenNP -gt Det Nominal [13]VP -gt Verb NP [01]
AddVP -gt Verb NP [03]
Slide 1
Earley how do we know we are done
Find an S state in the final column that spans from 0 to n+1 and is complete
S ndashgt α [0n+1]
Slide 1
Earley
So sweep through the table from 0 to n+1hellip
New predicted states are created by starting top-down from S
New incomplete states are created by advancing existing states as new constituents are discovered
New complete states are created in the same way
Slide 1
Earley
More specificallyhellipPredict all the states you can upfront
Read a wordExtend states based on matchesAdd new predictionsGo to 2
Look at N+1 to see if you have a winner
Slide 1
Example
Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip
Slide 1
Example (contrsquod)
Slide 1
Example (contrsquod)
Slide 1
Example (contrsquod)
Slide 1
A simple example
Chart[0]γ rarr S [00] (dummy start state)S rarr NP VP [00 ] (predictor)NP rarr N [00 ] (predictor)
Chart[1]N rarr I [01 ] (scan)NP rarr N [01 ] (completer)S rarr NP VP [01 ] (completer)VP rarr V NP [11 ] (predictor)
Chart[2]V rarr saw [12 ] (scan) VP rarr V NP [12 ] (complete) NP rarr N [22 ] (predict)
Chart[3]NP rarr N [23 ] (scan)NP rarr N [23 ] (completer)VP rarr V NP [13 ] (completer)S rarr NP VP [03 ] (completer)
Grammar S rarr NP VP NP rarr N VP rarr V NP
Lexicon Nrarr I | saw | Mary Vrarr saw
Input I saw Mary
Sentence accepted
Slide 1
What is it
What kind of parser did we just describe (trick question)Earley parserhellip yesNot a parser ndash a recognizer
The presence of an S state with the right attributes in the right place indicates a successful recognition
But no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time
Slide 1
Converting Earley from Recognizer to ParserWith the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from
Slide 1
Augmenting the chart with structural information
S8S9
S10
S11
S13S12
S8
S9
S8
Slide 1
Retrieving Parse Trees from Chart
All the possible parses for an input are in the tableWe just need to read off all the backpointers from every complete S
in the last column of the tableFind all the S -gt X [0N+1]Follow the structural traces from the CompleterOf course this wonrsquot be polynomial time since there could be an
exponential number of treesSo we can at least represent ambiguity efficiently
Slide 1
Earley and Left Recursion
Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the searchNever place a state into the chart thatrsquos already thereCopy states before advancing them
Slide 1
Earley and Left Recursion 1
S -gt NP VPNP -gt NP PP
Predictor given first ruleS -gt NP VP [00]
PredictsNP -gt NP PP [00]stops there since predicting same again would be redundant
Slide 1
Earley and Left Recursion 2
When a state gets advanced make a copy and leave the original alonehellip
Say we have NP -gt NP PP [00]We find an NP from 0 to 2 so we create
NP -gt NP PP [02]But we leave the original state as is
Slide 1
Dynamic Programming Approaches
EarleyTop-down no filtering no restriction on grammar form
CYKBottom-up no filtering grammars restricted to Chomsky-Normal Form
(CNF)Details are not important
Bottom-up vs top-downWith or without filtersWith restrictions on grammar form or not
Slide 1
flight
flight
Slide 1
Slide 1
Slide 1
Dynamic Programming
bull We need a method that fills a table with partial results thatndash Does not do (avoidable) repeated workndash Does not fall prey to left-recursionndash Solves an exponential problem in (approximately) polynomial
time
Slide 1
Earley Parsing
Fills a table in a single sweep over the input wordsTable is length N+1 N is number of wordsTable entries represent
Completed constituents and their locationsIn-progress constituentsPredicted constituents
Slide 1
States
The table-entries are called states and are represented with dotted-rulesS -gt VP A VP is predictedNP -gt Det Nominal An NP is in progressVP -gt V NP A VP has been found
Slide 1
StatesLocations
It would be nice to know where these things are in the input sohellipS -gt VP [00] A VP is predicted at the
start of the sentenceNP -gt Det Nominal [12] An NP is in progress the
Det goes from 1 to 2
VP -gt V NP [03] A VP has been found starting at 0 and ending
at 3
Slide 1
Graphically
Slide 1
Earley
bull As with most dynamic programming approaches the answer is found by looking in the table in the right place
bull In this case there should be an S state in the final column that spans from 0 to n+1 and is complete
bull If thatrsquos the case yoursquore donendash S ndash α [0n+1]
bull So sweep through the table from 0 to n+1hellipndash New predicted states are created by states in current chartndash New incomplete states are created by advancing existing states
as new constituents are discoveredndash New complete states are created in the same way
Slide 1
Earley
bull More specificallyhellipndash Predict all the states you can upfrontndash Read a word
ndash Extend states based on matchesndash Add new predictionsndash Go to 2
ndash Look at N+1 to see if you have a winner
Slide 1
Earley and Left Recursion
bull So Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the search
ndash Never place a state into the chart thatrsquos already therendash Copy states before advancing them
S -gt NP VPNP -gt NP PP
bull The first rule predictsS -gt NP VP [00] that addsNP -gt NP PP [00]stops there since adding any subsequent prediction would be fruitless
bull When a state gets advanced make a copy and leave the original alonendash Say we have NP -gt NP PP [00]ndash We find an NP from 0 to 2 so we create NP -gt NP PP [02]ndash But we leave the original state as is
Slide 1
Predictor
Given a stateWith a non-terminal to right of dotThat is not a part-of-speech categoryCreate a new state for each expansion of the non-terminalPlace these new states into same chart entry as generated state
beginning and ending where generating state ends
So predictor looking at
S -gt VP [00]
results in
VP -gt Verb [00]VP -gt Verb NP [00]
Slide 1
Scanner
Given a stateWith a non-terminal to right of dotThat is a part-of-speech categoryIf the next word in the input matches this part-of-speechndash Create a new state with dot moved over the non-terminalndash insert in next chart entry
So scanner looking at
VP -gt Verb NP [00]
If the next word ldquobookrdquo can be a verb add new state
VP -gt Verb NP [01]
Add this state to chart entry following current one
Note Earley algorithm uses top-down input to disambiguate POS Only POS predicted by some state can get added to chart
Slide 1
Completer
Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this
category
bull copy state bull move dotbull insert in current chart entry
GivenNP -gt Det Nominal [13]VP -gt Verb NP [01]
AddVP -gt Verb NP [03]
Slide 1
Earley how do we know we are done
Find an S state in the final column that spans from 0 to n+1 and is complete
S ndashgt α [0n+1]
Slide 1
Earley
So sweep through the table from 0 to n+1hellip
New predicted states are created by starting top-down from S
New incomplete states are created by advancing existing states as new constituents are discovered
New complete states are created in the same way
Slide 1
Earley
More specificallyhellipPredict all the states you can upfront
Read a wordExtend states based on matchesAdd new predictionsGo to 2
Look at N+1 to see if you have a winner
Slide 1
Example
Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip
Slide 1
Example (contrsquod)
Slide 1
Example (contrsquod)
Slide 1
Example (contrsquod)
Slide 1
A simple example
Chart[0]γ rarr S [00] (dummy start state)S rarr NP VP [00 ] (predictor)NP rarr N [00 ] (predictor)
Chart[1]N rarr I [01 ] (scan)NP rarr N [01 ] (completer)S rarr NP VP [01 ] (completer)VP rarr V NP [11 ] (predictor)
Chart[2]V rarr saw [12 ] (scan) VP rarr V NP [12 ] (complete) NP rarr N [22 ] (predict)
Chart[3]NP rarr N [23 ] (scan)NP rarr N [23 ] (completer)VP rarr V NP [13 ] (completer)S rarr NP VP [03 ] (completer)
Grammar S rarr NP VP NP rarr N VP rarr V NP
Lexicon Nrarr I | saw | Mary Vrarr saw
Input I saw Mary
Sentence accepted
Slide 1
What is it
What kind of parser did we just describe (trick question)Earley parserhellip yesNot a parser ndash a recognizer
The presence of an S state with the right attributes in the right place indicates a successful recognition
But no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time
Slide 1
Converting Earley from Recognizer to ParserWith the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from
Slide 1
Augmenting the chart with structural information
S8S9
S10
S11
S13S12
S8
S9
S8
Slide 1
Retrieving Parse Trees from Chart
All the possible parses for an input are in the tableWe just need to read off all the backpointers from every complete S
in the last column of the tableFind all the S -gt X [0N+1]Follow the structural traces from the CompleterOf course this wonrsquot be polynomial time since there could be an
exponential number of treesSo we can at least represent ambiguity efficiently
Slide 1
Earley and Left Recursion
Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the searchNever place a state into the chart thatrsquos already thereCopy states before advancing them
Slide 1
Earley and Left Recursion 1
S -gt NP VPNP -gt NP PP
Predictor given first ruleS -gt NP VP [00]
PredictsNP -gt NP PP [00]stops there since predicting same again would be redundant
Slide 1
Earley and Left Recursion 2
When a state gets advanced make a copy and leave the original alonehellip
Say we have NP -gt NP PP [00]We find an NP from 0 to 2 so we create
NP -gt NP PP [02]But we leave the original state as is
Slide 1
Dynamic Programming Approaches
EarleyTop-down no filtering no restriction on grammar form
CYKBottom-up no filtering grammars restricted to Chomsky-Normal Form
(CNF)Details are not important
Bottom-up vs top-downWith or without filtersWith restrictions on grammar form or not
Slide 1
Slide 1
Slide 1
Dynamic Programming
bull We need a method that fills a table with partial results thatndash Does not do (avoidable) repeated workndash Does not fall prey to left-recursionndash Solves an exponential problem in (approximately) polynomial
time
Slide 1
Earley Parsing
Fills a table in a single sweep over the input wordsTable is length N+1 N is number of wordsTable entries represent
Completed constituents and their locationsIn-progress constituentsPredicted constituents
Slide 1
States
The table-entries are called states and are represented with dotted-rulesS -gt VP A VP is predictedNP -gt Det Nominal An NP is in progressVP -gt V NP A VP has been found
Slide 1
StatesLocations
It would be nice to know where these things are in the input sohellipS -gt VP [00] A VP is predicted at the
start of the sentenceNP -gt Det Nominal [12] An NP is in progress the
Det goes from 1 to 2
VP -gt V NP [03] A VP has been found starting at 0 and ending
at 3
Slide 1
Graphically
Slide 1
Earley
bull As with most dynamic programming approaches the answer is found by looking in the table in the right place
bull In this case there should be an S state in the final column that spans from 0 to n+1 and is complete
bull If thatrsquos the case yoursquore donendash S ndash α [0n+1]
bull So sweep through the table from 0 to n+1hellipndash New predicted states are created by states in current chartndash New incomplete states are created by advancing existing states
as new constituents are discoveredndash New complete states are created in the same way
Slide 1
Earley
bull More specificallyhellipndash Predict all the states you can upfrontndash Read a word
ndash Extend states based on matchesndash Add new predictionsndash Go to 2
ndash Look at N+1 to see if you have a winner
Slide 1
Earley and Left Recursion
bull So Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the search
ndash Never place a state into the chart thatrsquos already therendash Copy states before advancing them
S -gt NP VPNP -gt NP PP
bull The first rule predictsS -gt NP VP [00] that addsNP -gt NP PP [00]stops there since adding any subsequent prediction would be fruitless
bull When a state gets advanced make a copy and leave the original alonendash Say we have NP -gt NP PP [00]ndash We find an NP from 0 to 2 so we create NP -gt NP PP [02]ndash But we leave the original state as is
Slide 1
Predictor
Given a stateWith a non-terminal to right of dotThat is not a part-of-speech categoryCreate a new state for each expansion of the non-terminalPlace these new states into same chart entry as generated state
beginning and ending where generating state ends
So predictor looking at
S -gt VP [00]
results in
VP -gt Verb [00]VP -gt Verb NP [00]
Slide 1
Scanner
Given a stateWith a non-terminal to right of dotThat is a part-of-speech categoryIf the next word in the input matches this part-of-speechndash Create a new state with dot moved over the non-terminalndash insert in next chart entry
So scanner looking at
VP -gt Verb NP [00]
If the next word ldquobookrdquo can be a verb add new state
VP -gt Verb NP [01]
Add this state to chart entry following current one
Note Earley algorithm uses top-down input to disambiguate POS Only POS predicted by some state can get added to chart
Slide 1
Completer
Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this
category
bull copy state bull move dotbull insert in current chart entry
GivenNP -gt Det Nominal [13]VP -gt Verb NP [01]
AddVP -gt Verb NP [03]
Slide 1
Earley how do we know we are done
Find an S state in the final column that spans from 0 to n+1 and is complete
S ndashgt α [0n+1]
Slide 1
Earley
So sweep through the table from 0 to n+1hellip
New predicted states are created by starting top-down from S
New incomplete states are created by advancing existing states as new constituents are discovered
New complete states are created in the same way
Slide 1
Earley

More specifically...
  1. Predict all the states you can up front.
  2. Read a word.
  3. Extend states based on matches.
  4. Add new predictions.
  5. Go to step 2.
Finally, look at the final chart entry (chart[N]) to see if you have a winner.
Example

"Book that flight." We should find... an S from 0 to 3 that is a completed state...
Example (cont'd)

[figures: the chart for "Book that flight" being filled in, step by step]
A simple example

Chart[0]
  γ → • S [0,0]        (dummy start state)
  S → • NP VP [0,0]    (predictor)
  NP → • N [0,0]       (predictor)
Chart[1]
  N → I • [0,1]        (scanner)
  NP → N • [0,1]       (completer)
  S → NP • VP [0,1]    (completer)
  VP → • V NP [1,1]    (predictor)
Chart[2]
  V → saw • [1,2]      (scanner)
  VP → V • NP [1,2]    (completer)
  NP → • N [2,2]       (predictor)
Chart[3]
  N → Mary • [2,3]     (scanner)
  NP → N • [2,3]       (completer)
  VP → V NP • [1,3]    (completer)
  S → NP VP • [0,3]    (completer)

Grammar: S → NP VP, NP → N, VP → V NP
Lexicon: N → I | saw | Mary; V → saw
Input: I saw Mary

Sentence accepted.
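Tying the three sketched operations together, here is a hedged sketch of a driver loop run on this very example; pos_tags is the set of part-of-speech categories, and all names remain our own. Note one difference in bookkeeping: this follows the Scanner slide above (the dot moves over the POS directly), whereas the chart trace adds separate completed POS states such as N → I •; both accept the same sentences.

def earley_recognize(words, grammar, lexicon, pos_tags, start="S"):
    n = len(words)
    chart = [[] for _ in range(n + 1)]   # ordered, duplicate-free entries
    seen = [set() for _ in range(n + 1)]

    def enqueue(i, st):
        if st not in seen[i]:            # never re-add an existing state
            seen[i].add(st)
            chart[i].append(st)

    enqueue(0, State("GAMMA", (start,), 0, 0, 0))   # dummy start state
    for i in range(n + 1):
        j = 0
        while j < len(chart[i]):         # the entry may grow as we work
            st = chart[i][j]
            j += 1
            if st.complete():
                new = completer(st, chart)
            elif st.next_symbol() in pos_tags:
                new = scanner(st, words, lexicon)
            else:
                new = predictor(st, grammar)
            for k, s in new:
                enqueue(k, s)
    # Done: a complete S spanning the whole input sits in the final entry
    return any(s.lhs == start and s.complete() and s.start == 0
               for s in chart[n])

grammar = {"S": [("NP", "VP")], "NP": [("N",)], "VP": [("V", "NP")]}
lexicon = {"I": {"N"}, "saw": {"N", "V"}, "Mary": {"N"}}
print(earley_recognize("I saw Mary".split(), grammar, lexicon, {"N", "V"}))
# prints True: the sentence is accepted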
What is it?

What kind of parser did we just describe? (Trick question.)
  – An Earley parser... yes.
  – But not a parser: a recognizer.
The presence of an S state with the right attributes in the right place indicates a successful recognition, but with no parse tree there is no parser. That's how we solve (not) an exponential problem in polynomial time.
Converting Earley from Recognizer to Parser

With the addition of a few pointers, we have a parser: augment the Completer to point to where we came from.
Augmenting the chart with structural information

[figure: chart states S8 through S13, linked by the Completer's backpointers]
Retrieving Parse Trees from the Chart

All the possible parses for an input are in the table; we just need to read off all the backpointers from every complete S in the last column:
  – Find all the S -> α • [0,N].
  – Follow the structural traces from the Completer.
Of course, this won't be polynomial time, since there could be an exponential number of trees. Still, we can at least represent the ambiguity efficiently.
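One way to realize that augmentation in the sketch above, again with hypothetical names of our own: keep a side table that the completer fills in, so every dot advance remembers which completed child caused it. Storing a list of (predecessor, child) pairs per state is what lets the chart represent exponentially many parses compactly; a tree builder then walks the pairs recursively.

backpointers = {}   # advanced state -> list of (predecessor, completed child)

def completer_with_pointers(state, chart):
    # As the plain completer, but each advanced copy records where it
    # came from, so parse trees can be read off the chart afterwards
    advanced = []
    for old in chart[state.start]:
        if old.next_symbol() == state.lhs:
            new = State(old.lhs, old.rhs, old.dot + 1, old.start, state.end)
            backpointers.setdefault(new, []).append((old, state))
            advanced.append((state.end, new))
    return advanced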
Earley and Left Recursion

Earley solves the left-recursion problem without having to alter the grammar or artificially limit the search:
  – Never place a state into the chart that's already there.
  – Copy states before advancing them.
Earley and Left Recursion 1

S -> NP VP
NP -> NP PP

The predictor, given the first rule
  S -> • NP VP [0,0]
predicts
  NP -> • NP PP [0,0]
and stops there, since predicting the same state again would be redundant.
Earley and Left Recursion 2

When a state gets advanced, make a copy and leave the original alone...
  – Say we have NP -> • NP PP [0,0].
  – We find an NP from 0 to 2, so we create NP -> NP • PP [0,2].
  – But we leave the original state as is.
Dynamic Programming Approaches

Earley: top-down, no filtering, no restriction on grammar form.
CYK: bottom-up, no filtering, grammars restricted to Chomsky Normal Form (CNF).

The details are not important; the approaches vary along the same dimensions:
  – bottom-up vs. top-down
  – with or without filters
  – with or without restrictions on grammar form
Dynamic Programming
bull We need a method that fills a table with partial results thatndash Does not do (avoidable) repeated workndash Does not fall prey to left-recursionndash Solves an exponential problem in (approximately) polynomial
time
Slide 1
Earley Parsing
Fills a table in a single sweep over the input wordsTable is length N+1 N is number of wordsTable entries represent
Completed constituents and their locationsIn-progress constituentsPredicted constituents
Slide 1
States
The table-entries are called states and are represented with dotted-rulesS -gt VP A VP is predictedNP -gt Det Nominal An NP is in progressVP -gt V NP A VP has been found
Slide 1
StatesLocations
It would be nice to know where these things are in the input sohellipS -gt VP [00] A VP is predicted at the
start of the sentenceNP -gt Det Nominal [12] An NP is in progress the
Det goes from 1 to 2
VP -gt V NP [03] A VP has been found starting at 0 and ending
at 3
Slide 1
Graphically
Slide 1
Earley
bull As with most dynamic programming approaches the answer is found by looking in the table in the right place
bull In this case there should be an S state in the final column that spans from 0 to n+1 and is complete
bull If thatrsquos the case yoursquore donendash S ndash α [0n+1]
bull So sweep through the table from 0 to n+1hellipndash New predicted states are created by states in current chartndash New incomplete states are created by advancing existing states
as new constituents are discoveredndash New complete states are created in the same way
Slide 1
Earley
bull More specificallyhellipndash Predict all the states you can upfrontndash Read a word
ndash Extend states based on matchesndash Add new predictionsndash Go to 2
ndash Look at N+1 to see if you have a winner
Slide 1
Earley and Left Recursion
bull So Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the search
ndash Never place a state into the chart thatrsquos already therendash Copy states before advancing them
S -gt NP VPNP -gt NP PP
bull The first rule predictsS -gt NP VP [00] that addsNP -gt NP PP [00]stops there since adding any subsequent prediction would be fruitless
bull When a state gets advanced make a copy and leave the original alonendash Say we have NP -gt NP PP [00]ndash We find an NP from 0 to 2 so we create NP -gt NP PP [02]ndash But we leave the original state as is
Slide 1
Predictor
Given a stateWith a non-terminal to right of dotThat is not a part-of-speech categoryCreate a new state for each expansion of the non-terminalPlace these new states into same chart entry as generated state
beginning and ending where generating state ends
So predictor looking at
S -gt VP [00]
results in
VP -gt Verb [00]VP -gt Verb NP [00]
Slide 1
Scanner
Given a stateWith a non-terminal to right of dotThat is a part-of-speech categoryIf the next word in the input matches this part-of-speechndash Create a new state with dot moved over the non-terminalndash insert in next chart entry
So scanner looking at
VP -gt Verb NP [00]
If the next word ldquobookrdquo can be a verb add new state
VP -gt Verb NP [01]
Add this state to chart entry following current one
Note Earley algorithm uses top-down input to disambiguate POS Only POS predicted by some state can get added to chart
Slide 1
Completer
Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this
category
bull copy state bull move dotbull insert in current chart entry
GivenNP -gt Det Nominal [13]VP -gt Verb NP [01]
AddVP -gt Verb NP [03]
Slide 1
Earley how do we know we are done
Find an S state in the final column that spans from 0 to n+1 and is complete
S ndashgt α [0n+1]
Slide 1
Earley
So sweep through the table from 0 to n+1hellip
New predicted states are created by starting top-down from S
New incomplete states are created by advancing existing states as new constituents are discovered
New complete states are created in the same way
Slide 1
Earley
More specificallyhellipPredict all the states you can upfront
Read a wordExtend states based on matchesAdd new predictionsGo to 2
Look at N+1 to see if you have a winner
Slide 1
Example
Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip
Slide 1
Example (contrsquod)
Slide 1
Example (contrsquod)
Slide 1
Example (contrsquod)
Slide 1
A simple example
Chart[0]γ rarr S [00] (dummy start state)S rarr NP VP [00 ] (predictor)NP rarr N [00 ] (predictor)
Chart[1]N rarr I [01 ] (scan)NP rarr N [01 ] (completer)S rarr NP VP [01 ] (completer)VP rarr V NP [11 ] (predictor)
Chart[2]V rarr saw [12 ] (scan) VP rarr V NP [12 ] (complete) NP rarr N [22 ] (predict)
Chart[3]NP rarr N [23 ] (scan)NP rarr N [23 ] (completer)VP rarr V NP [13 ] (completer)S rarr NP VP [03 ] (completer)
Grammar S rarr NP VP NP rarr N VP rarr V NP
Lexicon Nrarr I | saw | Mary Vrarr saw
Input I saw Mary
Sentence accepted
Slide 1
What is it
What kind of parser did we just describe (trick question)Earley parserhellip yesNot a parser ndash a recognizer
The presence of an S state with the right attributes in the right place indicates a successful recognition
But no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time
Slide 1
Converting Earley from Recognizer to ParserWith the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from
Slide 1
Augmenting the chart with structural information
S8S9
S10
S11
S13S12
S8
S9
S8
Slide 1
Retrieving Parse Trees from Chart
All the possible parses for an input are in the tableWe just need to read off all the backpointers from every complete S
in the last column of the tableFind all the S -gt X [0N+1]Follow the structural traces from the CompleterOf course this wonrsquot be polynomial time since there could be an
exponential number of treesSo we can at least represent ambiguity efficiently
Slide 1
Earley and Left Recursion
Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the searchNever place a state into the chart thatrsquos already thereCopy states before advancing them
Slide 1
Earley and Left Recursion 1
S -gt NP VPNP -gt NP PP
Predictor given first ruleS -gt NP VP [00]
PredictsNP -gt NP PP [00]stops there since predicting same again would be redundant
Slide 1
Earley and Left Recursion 2
When a state gets advanced make a copy and leave the original alonehellip
Say we have NP -gt NP PP [00]We find an NP from 0 to 2 so we create
NP -gt NP PP [02]But we leave the original state as is
Slide 1
Dynamic Programming Approaches
EarleyTop-down no filtering no restriction on grammar form
CYKBottom-up no filtering grammars restricted to Chomsky-Normal Form
(CNF)Details are not important
Bottom-up vs top-downWith or without filtersWith restrictions on grammar form or not
Slide 1
Dynamic Programming
bull We need a method that fills a table with partial results thatndash Does not do (avoidable) repeated workndash Does not fall prey to left-recursionndash Solves an exponential problem in (approximately) polynomial
time
Slide 1
Earley Parsing
Fills a table in a single sweep over the input wordsTable is length N+1 N is number of wordsTable entries represent
Completed constituents and their locationsIn-progress constituentsPredicted constituents
Slide 1
States
The table-entries are called states and are represented with dotted-rulesS -gt VP A VP is predictedNP -gt Det Nominal An NP is in progressVP -gt V NP A VP has been found
Slide 1
StatesLocations
It would be nice to know where these things are in the input sohellipS -gt VP [00] A VP is predicted at the
start of the sentenceNP -gt Det Nominal [12] An NP is in progress the
Det goes from 1 to 2
VP -gt V NP [03] A VP has been found starting at 0 and ending
at 3
Slide 1
Graphically
Slide 1
Earley
bull As with most dynamic programming approaches the answer is found by looking in the table in the right place
bull In this case there should be an S state in the final column that spans from 0 to n+1 and is complete
bull If thatrsquos the case yoursquore donendash S ndash α [0n+1]
bull So sweep through the table from 0 to n+1hellipndash New predicted states are created by states in current chartndash New incomplete states are created by advancing existing states
as new constituents are discoveredndash New complete states are created in the same way
Slide 1
Earley
bull More specificallyhellipndash Predict all the states you can upfrontndash Read a word
ndash Extend states based on matchesndash Add new predictionsndash Go to 2
ndash Look at N+1 to see if you have a winner
Slide 1
Earley and Left Recursion
bull So Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the search
ndash Never place a state into the chart thatrsquos already therendash Copy states before advancing them
S -gt NP VPNP -gt NP PP
bull The first rule predictsS -gt NP VP [00] that addsNP -gt NP PP [00]stops there since adding any subsequent prediction would be fruitless
bull When a state gets advanced make a copy and leave the original alonendash Say we have NP -gt NP PP [00]ndash We find an NP from 0 to 2 so we create NP -gt NP PP [02]ndash But we leave the original state as is
Slide 1
Predictor
Given a stateWith a non-terminal to right of dotThat is not a part-of-speech categoryCreate a new state for each expansion of the non-terminalPlace these new states into same chart entry as generated state
beginning and ending where generating state ends
So predictor looking at
S -gt VP [00]
results in
VP -gt Verb [00]VP -gt Verb NP [00]
Slide 1
Scanner
Given a stateWith a non-terminal to right of dotThat is a part-of-speech categoryIf the next word in the input matches this part-of-speechndash Create a new state with dot moved over the non-terminalndash insert in next chart entry
So scanner looking at
VP -gt Verb NP [00]
If the next word ldquobookrdquo can be a verb add new state
VP -gt Verb NP [01]
Add this state to chart entry following current one
Note Earley algorithm uses top-down input to disambiguate POS Only POS predicted by some state can get added to chart
Slide 1
Completer
Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this
category
bull copy state bull move dotbull insert in current chart entry
GivenNP -gt Det Nominal [13]VP -gt Verb NP [01]
AddVP -gt Verb NP [03]
Slide 1
Earley how do we know we are done
Find an S state in the final column that spans from 0 to n+1 and is complete
S ndashgt α [0n+1]
Slide 1
Earley
So sweep through the table from 0 to n+1hellip
New predicted states are created by starting top-down from S
New incomplete states are created by advancing existing states as new constituents are discovered
New complete states are created in the same way
Slide 1
Earley
More specificallyhellipPredict all the states you can upfront
Read a wordExtend states based on matchesAdd new predictionsGo to 2
Look at N+1 to see if you have a winner
Slide 1
Example
Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip
Slide 1
Example (contrsquod)
Slide 1
Example (contrsquod)
Slide 1
Example (contrsquod)
Slide 1
A simple example
Chart[0]γ rarr S [00] (dummy start state)S rarr NP VP [00 ] (predictor)NP rarr N [00 ] (predictor)
Chart[1]N rarr I [01 ] (scan)NP rarr N [01 ] (completer)S rarr NP VP [01 ] (completer)VP rarr V NP [11 ] (predictor)
Chart[2]V rarr saw [12 ] (scan) VP rarr V NP [12 ] (complete) NP rarr N [22 ] (predict)
Chart[3]NP rarr N [23 ] (scan)NP rarr N [23 ] (completer)VP rarr V NP [13 ] (completer)S rarr NP VP [03 ] (completer)
Grammar S rarr NP VP NP rarr N VP rarr V NP
Lexicon Nrarr I | saw | Mary Vrarr saw
Input I saw Mary
Sentence accepted
Slide 1
What is it
What kind of parser did we just describe (trick question)Earley parserhellip yesNot a parser ndash a recognizer
The presence of an S state with the right attributes in the right place indicates a successful recognition
But no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time
Slide 1
Converting Earley from Recognizer to ParserWith the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from
Slide 1
Augmenting the chart with structural information
S8S9
S10
S11
S13S12
S8
S9
S8
Slide 1
Retrieving Parse Trees from Chart
All the possible parses for an input are in the tableWe just need to read off all the backpointers from every complete S
in the last column of the tableFind all the S -gt X [0N+1]Follow the structural traces from the CompleterOf course this wonrsquot be polynomial time since there could be an
exponential number of treesSo we can at least represent ambiguity efficiently
Slide 1
Earley and Left Recursion
Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the searchNever place a state into the chart thatrsquos already thereCopy states before advancing them
Slide 1
Earley and Left Recursion 1
S -gt NP VPNP -gt NP PP
Predictor given first ruleS -gt NP VP [00]
PredictsNP -gt NP PP [00]stops there since predicting same again would be redundant
Slide 1
Earley and Left Recursion 2
When a state gets advanced make a copy and leave the original alonehellip
Say we have NP -gt NP PP [00]We find an NP from 0 to 2 so we create
NP -gt NP PP [02]But we leave the original state as is
Slide 1
Dynamic Programming Approaches
EarleyTop-down no filtering no restriction on grammar form
CYKBottom-up no filtering grammars restricted to Chomsky-Normal Form
(CNF)Details are not important
Bottom-up vs top-downWith or without filtersWith restrictions on grammar form or not
Slide 1
Earley Parsing
Fills a table in a single sweep over the input wordsTable is length N+1 N is number of wordsTable entries represent
Completed constituents and their locationsIn-progress constituentsPredicted constituents
Slide 1
States
The table-entries are called states and are represented with dotted-rulesS -gt VP A VP is predictedNP -gt Det Nominal An NP is in progressVP -gt V NP A VP has been found
Slide 1
StatesLocations
It would be nice to know where these things are in the input sohellipS -gt VP [00] A VP is predicted at the
start of the sentenceNP -gt Det Nominal [12] An NP is in progress the
Det goes from 1 to 2
VP -gt V NP [03] A VP has been found starting at 0 and ending
at 3
Slide 1
Graphically
Slide 1
Earley
bull As with most dynamic programming approaches the answer is found by looking in the table in the right place
bull In this case there should be an S state in the final column that spans from 0 to n+1 and is complete
bull If thatrsquos the case yoursquore donendash S ndash α [0n+1]
bull So sweep through the table from 0 to n+1hellipndash New predicted states are created by states in current chartndash New incomplete states are created by advancing existing states
as new constituents are discoveredndash New complete states are created in the same way
Slide 1
Earley
bull More specificallyhellipndash Predict all the states you can upfrontndash Read a word
ndash Extend states based on matchesndash Add new predictionsndash Go to 2
ndash Look at N+1 to see if you have a winner
Slide 1
Earley and Left Recursion
bull So Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the search
ndash Never place a state into the chart thatrsquos already therendash Copy states before advancing them
S -gt NP VPNP -gt NP PP
bull The first rule predictsS -gt NP VP [00] that addsNP -gt NP PP [00]stops there since adding any subsequent prediction would be fruitless
bull When a state gets advanced make a copy and leave the original alonendash Say we have NP -gt NP PP [00]ndash We find an NP from 0 to 2 so we create NP -gt NP PP [02]ndash But we leave the original state as is
Slide 1
Predictor
Given a stateWith a non-terminal to right of dotThat is not a part-of-speech categoryCreate a new state for each expansion of the non-terminalPlace these new states into same chart entry as generated state
beginning and ending where generating state ends
So predictor looking at
S -gt VP [00]
results in
VP -gt Verb [00]VP -gt Verb NP [00]
Slide 1
Scanner
Given a stateWith a non-terminal to right of dotThat is a part-of-speech categoryIf the next word in the input matches this part-of-speechndash Create a new state with dot moved over the non-terminalndash insert in next chart entry
So scanner looking at
VP -gt Verb NP [00]
If the next word ldquobookrdquo can be a verb add new state
VP -gt Verb NP [01]
Add this state to chart entry following current one
Note Earley algorithm uses top-down input to disambiguate POS Only POS predicted by some state can get added to chart
Slide 1
Completer
Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this
category
bull copy state bull move dotbull insert in current chart entry
GivenNP -gt Det Nominal [13]VP -gt Verb NP [01]
AddVP -gt Verb NP [03]
Slide 1
Earley how do we know we are done
Find an S state in the final column that spans from 0 to n+1 and is complete
S ndashgt α [0n+1]
Slide 1
Earley
So sweep through the table from 0 to n+1hellip
New predicted states are created by starting top-down from S
New incomplete states are created by advancing existing states as new constituents are discovered
New complete states are created in the same way
Slide 1
Earley
More specificallyhellipPredict all the states you can upfront
Read a wordExtend states based on matchesAdd new predictionsGo to 2
Look at N+1 to see if you have a winner
Slide 1
Example
Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip
Slide 1
Example (contrsquod)
Slide 1
Example (contrsquod)
Slide 1
Example (contrsquod)
Slide 1
A simple example
Chart[0]γ rarr S [00] (dummy start state)S rarr NP VP [00 ] (predictor)NP rarr N [00 ] (predictor)
Chart[1]N rarr I [01 ] (scan)NP rarr N [01 ] (completer)S rarr NP VP [01 ] (completer)VP rarr V NP [11 ] (predictor)
Chart[2]V rarr saw [12 ] (scan) VP rarr V NP [12 ] (complete) NP rarr N [22 ] (predict)
Chart[3]NP rarr N [23 ] (scan)NP rarr N [23 ] (completer)VP rarr V NP [13 ] (completer)S rarr NP VP [03 ] (completer)
Grammar S rarr NP VP NP rarr N VP rarr V NP
Lexicon Nrarr I | saw | Mary Vrarr saw
Input I saw Mary
Sentence accepted
Slide 1
What is it
What kind of parser did we just describe (trick question)Earley parserhellip yesNot a parser ndash a recognizer
The presence of an S state with the right attributes in the right place indicates a successful recognition
But no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time
Slide 1
Converting Earley from Recognizer to ParserWith the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from
Slide 1
Augmenting the chart with structural information
S8S9
S10
S11
S13S12
S8
S9
S8
Slide 1
Retrieving Parse Trees from Chart
All the possible parses for an input are in the tableWe just need to read off all the backpointers from every complete S
in the last column of the tableFind all the S -gt X [0N+1]Follow the structural traces from the CompleterOf course this wonrsquot be polynomial time since there could be an
exponential number of treesSo we can at least represent ambiguity efficiently
Slide 1
Earley and Left Recursion
Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the searchNever place a state into the chart thatrsquos already thereCopy states before advancing them
Slide 1
Earley and Left Recursion 1
S -gt NP VPNP -gt NP PP
Predictor given first ruleS -gt NP VP [00]
PredictsNP -gt NP PP [00]stops there since predicting same again would be redundant
Slide 1
Earley and Left Recursion 2
When a state gets advanced make a copy and leave the original alonehellip
Say we have NP -gt NP PP [00]We find an NP from 0 to 2 so we create
NP -gt NP PP [02]But we leave the original state as is
Slide 1
Dynamic Programming Approaches
EarleyTop-down no filtering no restriction on grammar form
CYKBottom-up no filtering grammars restricted to Chomsky-Normal Form
(CNF)Details are not important
Bottom-up vs top-downWith or without filtersWith restrictions on grammar form or not
Slide 1
States
The table-entries are called states and are represented with dotted-rulesS -gt VP A VP is predictedNP -gt Det Nominal An NP is in progressVP -gt V NP A VP has been found
Slide 1
StatesLocations
It would be nice to know where these things are in the input sohellipS -gt VP [00] A VP is predicted at the
start of the sentenceNP -gt Det Nominal [12] An NP is in progress the
Det goes from 1 to 2
VP -gt V NP [03] A VP has been found starting at 0 and ending
at 3
Slide 1
Graphically
Slide 1
Earley
bull As with most dynamic programming approaches the answer is found by looking in the table in the right place
bull In this case there should be an S state in the final column that spans from 0 to n+1 and is complete
bull If thatrsquos the case yoursquore donendash S ndash α [0n+1]
bull So sweep through the table from 0 to n+1hellipndash New predicted states are created by states in current chartndash New incomplete states are created by advancing existing states
as new constituents are discoveredndash New complete states are created in the same way
Slide 1
Earley
bull More specificallyhellipndash Predict all the states you can upfrontndash Read a word
ndash Extend states based on matchesndash Add new predictionsndash Go to 2
ndash Look at N+1 to see if you have a winner
Slide 1
Earley and Left Recursion
bull So Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the search
ndash Never place a state into the chart thatrsquos already therendash Copy states before advancing them
S -gt NP VPNP -gt NP PP
bull The first rule predictsS -gt NP VP [00] that addsNP -gt NP PP [00]stops there since adding any subsequent prediction would be fruitless
bull When a state gets advanced make a copy and leave the original alonendash Say we have NP -gt NP PP [00]ndash We find an NP from 0 to 2 so we create NP -gt NP PP [02]ndash But we leave the original state as is
Slide 1
Predictor
Given a stateWith a non-terminal to right of dotThat is not a part-of-speech categoryCreate a new state for each expansion of the non-terminalPlace these new states into same chart entry as generated state
beginning and ending where generating state ends
So predictor looking at
S -gt VP [00]
results in
VP -gt Verb [00]VP -gt Verb NP [00]
Slide 1
Scanner
Given a stateWith a non-terminal to right of dotThat is a part-of-speech categoryIf the next word in the input matches this part-of-speechndash Create a new state with dot moved over the non-terminalndash insert in next chart entry
So scanner looking at
VP -gt Verb NP [00]
If the next word ldquobookrdquo can be a verb add new state
VP -gt Verb NP [01]
Add this state to chart entry following current one
Note Earley algorithm uses top-down input to disambiguate POS Only POS predicted by some state can get added to chart
Slide 1
Completer
Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this
category
bull copy state bull move dotbull insert in current chart entry
GivenNP -gt Det Nominal [13]VP -gt Verb NP [01]
AddVP -gt Verb NP [03]
Slide 1
Earley how do we know we are done
Find an S state in the final column that spans from 0 to n+1 and is complete
S ndashgt α [0n+1]
Slide 1
Earley
So sweep through the table from 0 to n+1hellip
New predicted states are created by starting top-down from S
New incomplete states are created by advancing existing states as new constituents are discovered
New complete states are created in the same way
Slide 1
Earley
More specificallyhellipPredict all the states you can upfront
Read a wordExtend states based on matchesAdd new predictionsGo to 2
Look at N+1 to see if you have a winner
Slide 1
Example
Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip
Slide 1
Example (contrsquod)
Slide 1
Example (contrsquod)
Slide 1
Example (contrsquod)
Slide 1
A simple example
Chart[0]γ rarr S [00] (dummy start state)S rarr NP VP [00 ] (predictor)NP rarr N [00 ] (predictor)
Chart[1]N rarr I [01 ] (scan)NP rarr N [01 ] (completer)S rarr NP VP [01 ] (completer)VP rarr V NP [11 ] (predictor)
Chart[2]V rarr saw [12 ] (scan) VP rarr V NP [12 ] (complete) NP rarr N [22 ] (predict)
Chart[3]NP rarr N [23 ] (scan)NP rarr N [23 ] (completer)VP rarr V NP [13 ] (completer)S rarr NP VP [03 ] (completer)
Grammar S rarr NP VP NP rarr N VP rarr V NP
Lexicon Nrarr I | saw | Mary Vrarr saw
Input I saw Mary
Sentence accepted
Slide 1
What is it
What kind of parser did we just describe (trick question)Earley parserhellip yesNot a parser ndash a recognizer
The presence of an S state with the right attributes in the right place indicates a successful recognition
But no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time
Slide 1
Converting Earley from Recognizer to ParserWith the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from
Slide 1
Augmenting the chart with structural information
S8S9
S10
S11
S13S12
S8
S9
S8
Slide 1
Retrieving Parse Trees from Chart
All the possible parses for an input are in the tableWe just need to read off all the backpointers from every complete S
in the last column of the tableFind all the S -gt X [0N+1]Follow the structural traces from the CompleterOf course this wonrsquot be polynomial time since there could be an
exponential number of treesSo we can at least represent ambiguity efficiently
Slide 1
Earley and Left Recursion
Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the searchNever place a state into the chart thatrsquos already thereCopy states before advancing them
Slide 1
Earley and Left Recursion 1
S -gt NP VPNP -gt NP PP
Predictor given first ruleS -gt NP VP [00]
PredictsNP -gt NP PP [00]stops there since predicting same again would be redundant
Slide 1
Earley and Left Recursion 2
When a state gets advanced make a copy and leave the original alonehellip
Say we have NP -gt NP PP [00]We find an NP from 0 to 2 so we create
NP -gt NP PP [02]But we leave the original state as is
Slide 1
Dynamic Programming Approaches
EarleyTop-down no filtering no restriction on grammar form
CYKBottom-up no filtering grammars restricted to Chomsky-Normal Form
(CNF)Details are not important
Bottom-up vs top-downWith or without filtersWith restrictions on grammar form or not
Slide 1
StatesLocations
It would be nice to know where these things are in the input sohellipS -gt VP [00] A VP is predicted at the
start of the sentenceNP -gt Det Nominal [12] An NP is in progress the
Det goes from 1 to 2
VP -gt V NP [03] A VP has been found starting at 0 and ending
at 3
Slide 1
Graphically
Slide 1
Earley
bull As with most dynamic programming approaches the answer is found by looking in the table in the right place
bull In this case there should be an S state in the final column that spans from 0 to n+1 and is complete
bull If thatrsquos the case yoursquore donendash S ndash α [0n+1]
bull So sweep through the table from 0 to n+1hellipndash New predicted states are created by states in current chartndash New incomplete states are created by advancing existing states
as new constituents are discoveredndash New complete states are created in the same way
Slide 1
Earley
bull More specificallyhellipndash Predict all the states you can upfrontndash Read a word
ndash Extend states based on matchesndash Add new predictionsndash Go to 2
ndash Look at N+1 to see if you have a winner
Slide 1
Earley and Left Recursion
bull So Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the search
ndash Never place a state into the chart thatrsquos already therendash Copy states before advancing them
S -gt NP VPNP -gt NP PP
bull The first rule predictsS -gt NP VP [00] that addsNP -gt NP PP [00]stops there since adding any subsequent prediction would be fruitless
bull When a state gets advanced make a copy and leave the original alonendash Say we have NP -gt NP PP [00]ndash We find an NP from 0 to 2 so we create NP -gt NP PP [02]ndash But we leave the original state as is
Slide 1
Predictor
Given a stateWith a non-terminal to right of dotThat is not a part-of-speech categoryCreate a new state for each expansion of the non-terminalPlace these new states into same chart entry as generated state
beginning and ending where generating state ends
So predictor looking at
S -gt VP [00]
results in
VP -gt Verb [00]VP -gt Verb NP [00]
Slide 1
Scanner
Given a stateWith a non-terminal to right of dotThat is a part-of-speech categoryIf the next word in the input matches this part-of-speechndash Create a new state with dot moved over the non-terminalndash insert in next chart entry
So scanner looking at
VP -gt Verb NP [00]
If the next word ldquobookrdquo can be a verb add new state
VP -gt Verb NP [01]
Add this state to chart entry following current one
Note Earley algorithm uses top-down input to disambiguate POS Only POS predicted by some state can get added to chart
Slide 1
Completer
Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this
category
bull copy state bull move dotbull insert in current chart entry
GivenNP -gt Det Nominal [13]VP -gt Verb NP [01]
AddVP -gt Verb NP [03]
Slide 1
Earley how do we know we are done
Find an S state in the final column that spans from 0 to n+1 and is complete
S ndashgt α [0n+1]
Slide 1
Earley
So sweep through the table from 0 to n+1hellip
New predicted states are created by starting top-down from S
New incomplete states are created by advancing existing states as new constituents are discovered
New complete states are created in the same way
Slide 1
Earley
More specificallyhellipPredict all the states you can upfront
Read a wordExtend states based on matchesAdd new predictionsGo to 2
Look at N+1 to see if you have a winner
Slide 1
Example
Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip
Slide 1
Example (contrsquod)
Slide 1
Example (contrsquod)
Slide 1
Example (contrsquod)
Slide 1
A simple example
Chart[0]γ rarr S [00] (dummy start state)S rarr NP VP [00 ] (predictor)NP rarr N [00 ] (predictor)
Chart[1]N rarr I [01 ] (scan)NP rarr N [01 ] (completer)S rarr NP VP [01 ] (completer)VP rarr V NP [11 ] (predictor)
Chart[2]V rarr saw [12 ] (scan) VP rarr V NP [12 ] (complete) NP rarr N [22 ] (predict)
Chart[3]NP rarr N [23 ] (scan)NP rarr N [23 ] (completer)VP rarr V NP [13 ] (completer)S rarr NP VP [03 ] (completer)
Grammar S rarr NP VP NP rarr N VP rarr V NP
Lexicon Nrarr I | saw | Mary Vrarr saw
Input I saw Mary
Sentence accepted
Slide 1
What is it
What kind of parser did we just describe (trick question)Earley parserhellip yesNot a parser ndash a recognizer
The presence of an S state with the right attributes in the right place indicates a successful recognition
But no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time
Slide 1
Converting Earley from Recognizer to ParserWith the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from
Slide 1
Augmenting the chart with structural information
S8S9
S10
S11
S13S12
S8
S9
S8
Slide 1
Retrieving Parse Trees from Chart
All the possible parses for an input are in the tableWe just need to read off all the backpointers from every complete S
in the last column of the tableFind all the S -gt X [0N+1]Follow the structural traces from the CompleterOf course this wonrsquot be polynomial time since there could be an
exponential number of treesSo we can at least represent ambiguity efficiently
Slide 1
Earley and Left Recursion
Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the searchNever place a state into the chart thatrsquos already thereCopy states before advancing them
Slide 1
Earley and Left Recursion 1
S -gt NP VPNP -gt NP PP
Predictor given first ruleS -gt NP VP [00]
PredictsNP -gt NP PP [00]stops there since predicting same again would be redundant
Slide 1
Earley and Left Recursion 2
When a state gets advanced make a copy and leave the original alonehellip
Say we have NP -gt NP PP [00]We find an NP from 0 to 2 so we create
NP -gt NP PP [02]But we leave the original state as is
Slide 1
Dynamic Programming Approaches
EarleyTop-down no filtering no restriction on grammar form
CYKBottom-up no filtering grammars restricted to Chomsky-Normal Form
(CNF)Details are not important
Bottom-up vs top-downWith or without filtersWith restrictions on grammar form or not
Slide 1
Graphically
Slide 1
Earley
bull As with most dynamic programming approaches the answer is found by looking in the table in the right place
bull In this case there should be an S state in the final column that spans from 0 to n+1 and is complete
bull If thatrsquos the case yoursquore donendash S ndash α [0n+1]
bull So sweep through the table from 0 to n+1hellipndash New predicted states are created by states in current chartndash New incomplete states are created by advancing existing states
as new constituents are discoveredndash New complete states are created in the same way
Slide 1
Earley
bull More specificallyhellipndash Predict all the states you can upfrontndash Read a word
ndash Extend states based on matchesndash Add new predictionsndash Go to 2
ndash Look at N+1 to see if you have a winner
Slide 1
Earley and Left Recursion
bull So Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the search
ndash Never place a state into the chart thatrsquos already therendash Copy states before advancing them
S -gt NP VPNP -gt NP PP
bull The first rule predictsS -gt NP VP [00] that addsNP -gt NP PP [00]stops there since adding any subsequent prediction would be fruitless
bull When a state gets advanced make a copy and leave the original alonendash Say we have NP -gt NP PP [00]ndash We find an NP from 0 to 2 so we create NP -gt NP PP [02]ndash But we leave the original state as is
Slide 1
Predictor
Given a stateWith a non-terminal to right of dotThat is not a part-of-speech categoryCreate a new state for each expansion of the non-terminalPlace these new states into same chart entry as generated state
beginning and ending where generating state ends
So predictor looking at
S -gt VP [00]
results in
VP -gt Verb [00]VP -gt Verb NP [00]
Slide 1
Scanner
Given a stateWith a non-terminal to right of dotThat is a part-of-speech categoryIf the next word in the input matches this part-of-speechndash Create a new state with dot moved over the non-terminalndash insert in next chart entry
So scanner looking at
VP -gt Verb NP [00]
If the next word ldquobookrdquo can be a verb add new state
VP -gt Verb NP [01]
Add this state to chart entry following current one
Note Earley algorithm uses top-down input to disambiguate POS Only POS predicted by some state can get added to chart
Slide 1
Completer
Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this
category
bull copy state bull move dotbull insert in current chart entry
GivenNP -gt Det Nominal [13]VP -gt Verb NP [01]
AddVP -gt Verb NP [03]
Slide 1
Earley how do we know we are done
Find an S state in the final column that spans from 0 to n+1 and is complete
S ndashgt α [0n+1]
Slide 1
Earley
So sweep through the table from 0 to n+1hellip
New predicted states are created by starting top-down from S
New incomplete states are created by advancing existing states as new constituents are discovered
New complete states are created in the same way
Slide 1
Earley
More specificallyhellipPredict all the states you can upfront
Read a wordExtend states based on matchesAdd new predictionsGo to 2
Look at N+1 to see if you have a winner
Slide 1
Example
Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip
Slide 1
Example (contrsquod)
Slide 1
Example (contrsquod)
Slide 1
Example (contrsquod)
Slide 1
A simple example
Chart[0]γ rarr S [00] (dummy start state)S rarr NP VP [00 ] (predictor)NP rarr N [00 ] (predictor)
Chart[1]N rarr I [01 ] (scan)NP rarr N [01 ] (completer)S rarr NP VP [01 ] (completer)VP rarr V NP [11 ] (predictor)
Chart[2]V rarr saw [12 ] (scan) VP rarr V NP [12 ] (complete) NP rarr N [22 ] (predict)
Chart[3]NP rarr N [23 ] (scan)NP rarr N [23 ] (completer)VP rarr V NP [13 ] (completer)S rarr NP VP [03 ] (completer)
Grammar S rarr NP VP NP rarr N VP rarr V NP
Lexicon Nrarr I | saw | Mary Vrarr saw
Input I saw Mary
Sentence accepted
Slide 1
What is it
What kind of parser did we just describe (trick question)Earley parserhellip yesNot a parser ndash a recognizer
The presence of an S state with the right attributes in the right place indicates a successful recognition
But no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time
Slide 1
Converting Earley from Recognizer to ParserWith the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from
Slide 1
Augmenting the chart with structural information
S8S9
S10
S11
S13S12
S8
S9
S8
Slide 1
Retrieving Parse Trees from Chart
All the possible parses for an input are in the tableWe just need to read off all the backpointers from every complete S
in the last column of the tableFind all the S -gt X [0N+1]Follow the structural traces from the CompleterOf course this wonrsquot be polynomial time since there could be an
exponential number of treesSo we can at least represent ambiguity efficiently
Slide 1
Earley and Left Recursion
Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the searchNever place a state into the chart thatrsquos already thereCopy states before advancing them
Slide 1
Earley and Left Recursion 1
S -gt NP VPNP -gt NP PP
Predictor given first ruleS -gt NP VP [00]
PredictsNP -gt NP PP [00]stops there since predicting same again would be redundant
Slide 1
Earley and Left Recursion 2
When a state gets advanced make a copy and leave the original alonehellip
Say we have NP -gt NP PP [00]We find an NP from 0 to 2 so we create
NP -gt NP PP [02]But we leave the original state as is
Slide 1
Dynamic Programming Approaches
EarleyTop-down no filtering no restriction on grammar form
CYKBottom-up no filtering grammars restricted to Chomsky-Normal Form
(CNF)Details are not important
Bottom-up vs top-downWith or without filtersWith restrictions on grammar form or not
Slide 1
Earley
bull As with most dynamic programming approaches the answer is found by looking in the table in the right place
bull In this case there should be an S state in the final column that spans from 0 to n+1 and is complete
bull If thatrsquos the case yoursquore donendash S ndash α [0n+1]
bull So sweep through the table from 0 to n+1hellipndash New predicted states are created by states in current chartndash New incomplete states are created by advancing existing states
as new constituents are discoveredndash New complete states are created in the same way
Slide 1
Earley
bull More specificallyhellipndash Predict all the states you can upfrontndash Read a word
ndash Extend states based on matchesndash Add new predictionsndash Go to 2
ndash Look at N+1 to see if you have a winner
Slide 1
Earley and Left Recursion
bull So Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the search
ndash Never place a state into the chart thatrsquos already therendash Copy states before advancing them
S -gt NP VPNP -gt NP PP
bull The first rule predictsS -gt NP VP [00] that addsNP -gt NP PP [00]stops there since adding any subsequent prediction would be fruitless
bull When a state gets advanced make a copy and leave the original alonendash Say we have NP -gt NP PP [00]ndash We find an NP from 0 to 2 so we create NP -gt NP PP [02]ndash But we leave the original state as is
Slide 1
Predictor
Given a stateWith a non-terminal to right of dotThat is not a part-of-speech categoryCreate a new state for each expansion of the non-terminalPlace these new states into same chart entry as generated state
beginning and ending where generating state ends
So predictor looking at
S -gt VP [00]
results in
VP -gt Verb [00]VP -gt Verb NP [00]
Slide 1
Scanner
Given a stateWith a non-terminal to right of dotThat is a part-of-speech categoryIf the next word in the input matches this part-of-speechndash Create a new state with dot moved over the non-terminalndash insert in next chart entry
So scanner looking at
VP -gt Verb NP [00]
If the next word ldquobookrdquo can be a verb add new state
VP -gt Verb NP [01]
Add this state to chart entry following current one
Note Earley algorithm uses top-down input to disambiguate POS Only POS predicted by some state can get added to chart
Slide 1
Completer
Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this
category
bull copy state bull move dotbull insert in current chart entry
GivenNP -gt Det Nominal [13]VP -gt Verb NP [01]
AddVP -gt Verb NP [03]
Slide 1
Earley how do we know we are done
Find an S state in the final column that spans from 0 to n+1 and is complete
S ndashgt α [0n+1]
Slide 1
Earley
So sweep through the table from 0 to n+1hellip
New predicted states are created by starting top-down from S
New incomplete states are created by advancing existing states as new constituents are discovered
New complete states are created in the same way
Slide 1
Earley
More specificallyhellipPredict all the states you can upfront
Read a wordExtend states based on matchesAdd new predictionsGo to 2
Look at N+1 to see if you have a winner
Slide 1
Example
Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip
Slide 1
Example (contrsquod)
Slide 1
Example (contrsquod)
Slide 1
Example (contrsquod)
Slide 1
A simple example
Chart[0]γ rarr S [00] (dummy start state)S rarr NP VP [00 ] (predictor)NP rarr N [00 ] (predictor)
Chart[1]N rarr I [01 ] (scan)NP rarr N [01 ] (completer)S rarr NP VP [01 ] (completer)VP rarr V NP [11 ] (predictor)
Chart[2]V rarr saw [12 ] (scan) VP rarr V NP [12 ] (complete) NP rarr N [22 ] (predict)
Chart[3]NP rarr N [23 ] (scan)NP rarr N [23 ] (completer)VP rarr V NP [13 ] (completer)S rarr NP VP [03 ] (completer)
Grammar S rarr NP VP NP rarr N VP rarr V NP
Lexicon Nrarr I | saw | Mary Vrarr saw
Input I saw Mary
Sentence accepted
Slide 1
What is it
What kind of parser did we just describe (trick question)Earley parserhellip yesNot a parser ndash a recognizer
The presence of an S state with the right attributes in the right place indicates a successful recognition
But no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time
Slide 1
Converting Earley from Recognizer to ParserWith the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from
Slide 1
Augmenting the chart with structural information
S8S9
S10
S11
S13S12
S8
S9
S8
Slide 1
Retrieving Parse Trees from Chart
All the possible parses for an input are in the tableWe just need to read off all the backpointers from every complete S
in the last column of the tableFind all the S -gt X [0N+1]Follow the structural traces from the CompleterOf course this wonrsquot be polynomial time since there could be an
exponential number of treesSo we can at least represent ambiguity efficiently
Slide 1
Earley and Left Recursion
Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the searchNever place a state into the chart thatrsquos already thereCopy states before advancing them
Slide 1
Earley and Left Recursion 1
S -gt NP VPNP -gt NP PP
Predictor given first ruleS -gt NP VP [00]
PredictsNP -gt NP PP [00]stops there since predicting same again would be redundant
Slide 1
Earley and Left Recursion 2
When a state gets advanced make a copy and leave the original alonehellip
Say we have NP -gt NP PP [00]We find an NP from 0 to 2 so we create
NP -gt NP PP [02]But we leave the original state as is
Slide 1
Dynamic Programming Approaches
EarleyTop-down no filtering no restriction on grammar form
CYKBottom-up no filtering grammars restricted to Chomsky-Normal Form
(CNF)Details are not important
Bottom-up vs top-downWith or without filtersWith restrictions on grammar form or not
Slide 1
Earley
bull More specificallyhellipndash Predict all the states you can upfrontndash Read a word
ndash Extend states based on matchesndash Add new predictionsndash Go to 2
ndash Look at N+1 to see if you have a winner
Slide 1
Earley and Left Recursion
bull So Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the search
ndash Never place a state into the chart thatrsquos already therendash Copy states before advancing them
S -gt NP VPNP -gt NP PP
bull The first rule predictsS -gt NP VP [00] that addsNP -gt NP PP [00]stops there since adding any subsequent prediction would be fruitless
bull When a state gets advanced make a copy and leave the original alonendash Say we have NP -gt NP PP [00]ndash We find an NP from 0 to 2 so we create NP -gt NP PP [02]ndash But we leave the original state as is
Slide 1
Predictor
Given a stateWith a non-terminal to right of dotThat is not a part-of-speech categoryCreate a new state for each expansion of the non-terminalPlace these new states into same chart entry as generated state
beginning and ending where generating state ends
So predictor looking at
S -gt VP [00]
results in
VP -gt Verb [00]VP -gt Verb NP [00]
Slide 1
Scanner
Given a stateWith a non-terminal to right of dotThat is a part-of-speech categoryIf the next word in the input matches this part-of-speechndash Create a new state with dot moved over the non-terminalndash insert in next chart entry
So scanner looking at
VP -gt Verb NP [00]
If the next word ldquobookrdquo can be a verb add new state
VP -gt Verb NP [01]
Add this state to chart entry following current one
Note Earley algorithm uses top-down input to disambiguate POS Only POS predicted by some state can get added to chart
Slide 1
Completer
Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this
category
bull copy state bull move dotbull insert in current chart entry
GivenNP -gt Det Nominal [13]VP -gt Verb NP [01]
AddVP -gt Verb NP [03]
Slide 1
Earley how do we know we are done
Find an S state in the final column that spans from 0 to n+1 and is complete
S ndashgt α [0n+1]
Slide 1
Earley
So sweep through the table from 0 to n+1hellip
New predicted states are created by starting top-down from S
New incomplete states are created by advancing existing states as new constituents are discovered
New complete states are created in the same way
Slide 1
Earley
More specificallyhellipPredict all the states you can upfront
Read a wordExtend states based on matchesAdd new predictionsGo to 2
Look at N+1 to see if you have a winner
Slide 1
Example
Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip
Slide 1
Example (contrsquod)
Slide 1
Example (contrsquod)
Slide 1
Example (contrsquod)
Slide 1
A simple example
Chart[0]γ rarr S [00] (dummy start state)S rarr NP VP [00 ] (predictor)NP rarr N [00 ] (predictor)
Chart[1]N rarr I [01 ] (scan)NP rarr N [01 ] (completer)S rarr NP VP [01 ] (completer)VP rarr V NP [11 ] (predictor)
Chart[2]V rarr saw [12 ] (scan) VP rarr V NP [12 ] (complete) NP rarr N [22 ] (predict)
Chart[3]NP rarr N [23 ] (scan)NP rarr N [23 ] (completer)VP rarr V NP [13 ] (completer)S rarr NP VP [03 ] (completer)
Grammar S rarr NP VP NP rarr N VP rarr V NP
Lexicon Nrarr I | saw | Mary Vrarr saw
Input I saw Mary
Sentence accepted
Slide 1
What is it
What kind of parser did we just describe (trick question)Earley parserhellip yesNot a parser ndash a recognizer
The presence of an S state with the right attributes in the right place indicates a successful recognition
But no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time
Slide 1
Converting Earley from Recognizer to ParserWith the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from
Slide 1
Augmenting the chart with structural information
S8S9
S10
S11
S13S12
S8
S9
S8
Slide 1
Retrieving Parse Trees from Chart
All the possible parses for an input are in the tableWe just need to read off all the backpointers from every complete S
in the last column of the tableFind all the S -gt X [0N+1]Follow the structural traces from the CompleterOf course this wonrsquot be polynomial time since there could be an
exponential number of treesSo we can at least represent ambiguity efficiently
Slide 1
Earley and Left Recursion
Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the searchNever place a state into the chart thatrsquos already thereCopy states before advancing them
Slide 1
Earley and Left Recursion 1
S -gt NP VPNP -gt NP PP
Predictor given first ruleS -gt NP VP [00]
PredictsNP -gt NP PP [00]stops there since predicting same again would be redundant
Slide 1
Earley and Left Recursion 2
When a state gets advanced make a copy and leave the original alonehellip
Say we have NP -gt NP PP [00]We find an NP from 0 to 2 so we create
NP -gt NP PP [02]But we leave the original state as is
Slide 1
Dynamic Programming Approaches
EarleyTop-down no filtering no restriction on grammar form
CYKBottom-up no filtering grammars restricted to Chomsky-Normal Form
(CNF)Details are not important
Bottom-up vs top-downWith or without filtersWith restrictions on grammar form or not
Slide 1
Earley and Left Recursion
bull So Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the search
ndash Never place a state into the chart thatrsquos already therendash Copy states before advancing them
S -gt NP VPNP -gt NP PP
bull The first rule predictsS -gt NP VP [00] that addsNP -gt NP PP [00]stops there since adding any subsequent prediction would be fruitless
bull When a state gets advanced make a copy and leave the original alonendash Say we have NP -gt NP PP [00]ndash We find an NP from 0 to 2 so we create NP -gt NP PP [02]ndash But we leave the original state as is
Slide 1
Predictor
Given a stateWith a non-terminal to right of dotThat is not a part-of-speech categoryCreate a new state for each expansion of the non-terminalPlace these new states into same chart entry as generated state
beginning and ending where generating state ends
So predictor looking at
S -gt VP [00]
results in
VP -gt Verb [00]VP -gt Verb NP [00]
Slide 1
Scanner
Given a stateWith a non-terminal to right of dotThat is a part-of-speech categoryIf the next word in the input matches this part-of-speechndash Create a new state with dot moved over the non-terminalndash insert in next chart entry
So scanner looking at
VP -gt Verb NP [00]
If the next word ldquobookrdquo can be a verb add new state
VP -gt Verb NP [01]
Add this state to chart entry following current one
Note Earley algorithm uses top-down input to disambiguate POS Only POS predicted by some state can get added to chart
Slide 1
Completer
Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this
category
bull copy state bull move dotbull insert in current chart entry
GivenNP -gt Det Nominal [13]VP -gt Verb NP [01]
AddVP -gt Verb NP [03]
Slide 1
Earley how do we know we are done
Find an S state in the final column that spans from 0 to n+1 and is complete
S ndashgt α [0n+1]
Slide 1
Earley
So sweep through the table from 0 to n+1hellip
New predicted states are created by starting top-down from S
New incomplete states are created by advancing existing states as new constituents are discovered
New complete states are created in the same way
Slide 1
Earley
More specificallyhellipPredict all the states you can upfront
Read a wordExtend states based on matchesAdd new predictionsGo to 2
Look at N+1 to see if you have a winner
Slide 1
Example
Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip
Slide 1
Example (contrsquod)
Slide 1
Example (contrsquod)
Slide 1
Example (contrsquod)
Slide 1
A simple example
Chart[0]γ rarr S [00] (dummy start state)S rarr NP VP [00 ] (predictor)NP rarr N [00 ] (predictor)
Chart[1]N rarr I [01 ] (scan)NP rarr N [01 ] (completer)S rarr NP VP [01 ] (completer)VP rarr V NP [11 ] (predictor)
Chart[2]V rarr saw [12 ] (scan) VP rarr V NP [12 ] (complete) NP rarr N [22 ] (predict)
Chart[3]NP rarr N [23 ] (scan)NP rarr N [23 ] (completer)VP rarr V NP [13 ] (completer)S rarr NP VP [03 ] (completer)
Grammar S rarr NP VP NP rarr N VP rarr V NP
Lexicon Nrarr I | saw | Mary Vrarr saw
Input I saw Mary
Sentence accepted
Slide 1
What is it?
What kind of parser did we just describe? (Trick question.)
An Earley parser… yes, but strictly speaking not a parser: a recognizer.
• The presence of an S state with the right attributes in the right place indicates a successful recognition.
• But no parse tree… so no parser.
• That's how we "solve" (not!) an exponential problem in polynomial time: recognition is polynomial, but enumerating the parses need not be.
Converting Earley from Recognizer to Parser
With the addition of a few pointers, we have a parser: augment the Completer to record where we came from.
Augmenting the chart with structural information
[Figure: chart states S8–S13 annotated with the backpointers added by the Completer (not preserved in this text export).]
Retrieving Parse Trees from Chart
All the possible parses for an input are in the table; we just need to read off the backpointers from every complete S in the last column:
• find all the states S → α · [0,N]
• follow the structural traces left by the Completer
Of course this won't be polynomial time, since there could be an exponential number of trees; but at least the chart represents the ambiguity efficiently.
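A sketch of that augmentation: a `back` field stores, per state, the completed children recorded by the Completer, and a tree can be read off recursively. The field and function names are ours; a full parser would keep multiple backpointer sets per state to represent ambiguity:

from collections import namedtuple

# Same states as in the sketches above, plus `back`: the sequence of
# completed child states recorded by the Completer.
State = namedtuple("State", "lhs rhs dot start end back")

def completer_with_pointers(done, chart):
    """As before, but the advanced copy also remembers `done`,
    the child that let its dot move."""
    for old in chart[done.start]:
        if old.dot < len(old.rhs) and old.rhs[old.dot] == done.lhs:
            new = State(old.lhs, old.rhs, old.dot + 1,
                        old.start, done.end, old.back + (done,))
            if new not in chart[done.end]:
                chart[done.end].append(new)

def tree(state):
    """Read one parse tree off the backpointers of a complete state."""
    if not state.back:                        # lexical state, e.g. N -> Mary .
        return (state.lhs, state.rhs[0])
    return (state.lhs,) + tuple(tree(ch) for ch in state.back)

# Hand-built fragment of the "I saw Mary" chart:
chart = [[] for _ in range(4)]
chart[2].append(State("NP", ("N",), 0, 2, 2, ()))        # NP -> . N [2,2]
n = State("N", ("Mary",), 1, 2, 3, ())                   # N -> Mary . [2,3]
chart[3].append(n)
completer_with_pointers(n, chart)
print(tree(chart[3][-1]))   # ('NP', ('N', 'Mary'))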