Parsing
Context-Free Grammars, Parsing, Syntax Trees
Parsing
Produce the parse tree for a given program (token stream):
i := 1 ; sum := 0 ; while not i = 10 do { i = …
Parsing
[Figure: parse tree of the example program. The ";" nodes sequence the assignments to i and sum and the while loop, whose body updates i and sum.]
Recursive-Descent Parsing
Using a stack, match grammar symbols (terminal and nonterminal) with the tokens in the stream:
Input: i = 1 ; sum = 0 ; while not …
Stack: C _
Recursive-Descent Parsing
Beginning with the grammar’s “start symbol”, pop the next stack symbol and try to match it with the next token in the stream.
Input: i = 1 ; sum = 0 ; while not …
Stack: C _
Recursive-Descent Parsing
If they don’t match, expand the symbol by the appropriate grammar production and push those symbols onto the stack.
Input: i = 1 ; sum = 0 ; while not …
C ::= S ; C
Stack: S ; C _
Recursive-Descent Parsing
Parse tree nodes corresponding to the new symbols will spawn as children of the popped symbol’s node …
Input: i = 1 ; sum = 0 ; while not …
C ::= S ; C
Stack: S ; C _
Recursive-Descent Parsing
[Figure: the program’s parse tree, with the root labeled C and its children labeled S and C.]
Recursive-Descent Parsing
Pop the next stack symbol and try to match it with the next token in the stream.
Input: i = 1 ; sum = 0 ; while not …
Stack: S ; C _
Recursive-Descent Parsing
If they don’t match, expand the symbol by the appropriate grammar production and push those symbols onto the stack.
Input: i = 1 ; sum = 0 ; while not …
S ::= id = num
Stack: id = num ; C _
Recursive-Descent Parsing
Parse tree nodes corresponding to the new symbols will spawn as children of the popped symbol’s node …
Input: i = 1 ; sum = 0 ; while not …
S ::= id = num
Stack: id = num ; C _
Recursive-Descent Parsing
[Figure: the program’s parse tree, with the leaves i and 1 under S labeled id and num.]
Recursive-Descent Parsing
Pop the next stack symbol and try to match it with the next token in the stream. If they match, eat the token…
Input: i = 1 ; sum = 0 ; while not …
Stack: id = num ; C _
Recursive-Descent Parsing
[Figure: the program’s parse tree after matching id (the token i).]
Recursive-Descent Parsing
Pop the next stack symbol and try to match it with the next token in the stream. If they match, eat the token.
Input: = 1 ; sum = 0 ; while not …
Stack: = num ; C _
Recursive-Descent Parsing
[Figure: the program’s parse tree after matching =.]
Recursive-Descent Parsing
Pop the next stack symbol and try to match it with the next token in the stream. If they match, eat the token.
Input: 1 ; sum = 0 ; while not i == …
Stack: num ; C _
Recursive-Descent Parsing
[Figure: the program’s parse tree after matching num (the token 1).]
Recursive-Descent Parsing
Pop the next stack symbol and try to match it with the next token in the stream. If they match, eat the token.
Input: ; sum = 0 ; while not i == …
Stack: ; C _
Recursive-Descent Parsing
[Figure: the program’s parse tree after matching the first ";".]
Recursive-Descent Parsing
Pop the next stack symbol and try to match it with the next token in the stream.
Input: sum = 0 ; while not i == 10 …
Stack: C _
Recursive-Descent Parsing
If they don’t match, expand the symbol by the appropriate grammar production and push those symbols onto the stack.
Input: sum = 0 ; while not i == 10 …
C ::= S ; C
Stack: S ; C _
Recursive-Descent Parsing
Parse tree nodes corresponding to the new symbols will spawn as children of the popped symbol’s node …
Input: sum = 0 ; while not i == 10 …
C ::= S ; C
Stack: S ; C _
Recursive-Descent Parsing
[Figure: the program’s parse tree, with the second C expanded: S labels the assignment to sum and C labels the rest of the program.]
Recursive-Descent Parsing
Pop the next stack symbol and try to match it with the next token in the stream. If they match, eat the token.
Input: + i }
Stack: + id } _
Recursive-Descent Parsing
[Figure: the program’s parse tree after matching +.]
Recursive-Descent Parsing
Pop the next stack symbol and try to match it with the next token in the stream. If they match, eat the token.
Input: i }
Stack: id } _
Recursive-Descent Parsing
[Figure: the program’s parse tree after matching the final id (the token i).]
Recursive-Descent Parsing
If the symbol stack and the token stream become empty at the same time, the program has parsed successfully.
Input: }
Stack: } _
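The walkthrough above can be sketched as a short loop. This is a minimal sketch, not the slides' implementation. Only C ::= S ; C and S ::= id = num come from the slides; the `PREDICT` table, the "$" end-of-input marker, and the extra production C ::= ε (added so the statement list can end) are assumptions made for illustration.

```python
# Sketch of the stack-driven matching loop from the walkthrough.
# Assumption: C ::= ε is added so the statement list can end; the slides
# only show C ::= S ; C and S ::= id = num.
NONTERMINALS = {"C", "S"}

# Hypothetical prediction table: (nonterminal, next token kind) -> right-hand side.
PREDICT = {
    ("C", "id"): ["S", ";", "C"],
    ("C", "$"): [],                  # assumed empty production at end of input
    ("S", "id"): ["id", "=", "num"],
}

def parse(tokens):
    """Return the parse tree as nested [symbol, children] lists, or raise SyntaxError."""
    tokens = tokens + ["$"]          # "$" marks end of input (an assumption)
    root = ["C", []]
    stack = [root]                   # top of the stack is the end of the list
    pos = 0
    while stack:
        node = stack.pop()
        symbol, look = node[0], tokens[pos]
        if symbol in NONTERMINALS:
            rhs = PREDICT.get((symbol, look))
            if rhs is None:
                raise SyntaxError(f"no production for {symbol} on {look!r}")
            # New symbols become children of the popped symbol's node.
            node[1] = [[s, []] for s in rhs]
            # Push in reverse so the leftmost symbol is popped first.
            stack.extend(reversed(node[1]))
        elif symbol == look:
            pos += 1                 # terminal matches: eat the token
        else:
            raise SyntaxError(f"expected {symbol!r}, found {look!r}")
    if tokens[pos] != "$":
        raise SyntaxError("input left over after the stack emptied")
    return root

# Token kinds for: i = 1 ; sum = 0 ;
tree = parse(["id", "=", "num", ";", "id", "=", "num", ";"])
```

The stack holds tree nodes rather than bare symbols, so expanding a nonterminal attaches its children in the same step that pushes them.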
Which production?
When we expand a nonterminal, how do we decide which production to use?
Input: i + 1
E ::= E + num | num
Stack: E _
Which production?
We must predict the structure of the program based on the next input token (predictive parsing).
Input: i + 1
E ::= E + num | num
Stack: E _
Which production?
If more than one production is still possible given the next input token, the grammar belongs to a class that can’t be parsed by the predictive algorithm.
Input: i + 1
E ::= E + num | num
Stack: E _
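One way to see the problem with E ::= E + num | num is to fix a choice in advance and watch the stack. The sketch below is illustrative only: the helper `run` and its `choose` callback are hypothetical, and the token kinds num + num stand in for the slide's input.

```python
def run(tokens, choose, max_steps):
    """Drive the stack for at most max_steps; index 0 of the list is the stack top."""
    stack, pos = ["E"], 0
    for _ in range(max_steps):
        if not stack:
            break
        top = stack.pop(0)
        if top == "E":
            stack = choose() + stack        # expand E by the chosen production
        elif pos < len(tokens) and tokens[pos] == top:
            pos += 1                        # terminal matches: eat the token
        else:
            raise SyntaxError(f"expected {top!r} at position {pos}")
    return stack, pos

tokens = ["num", "+", "num"]

# Always choosing E ::= E + num: the stack grows and no token is ever eaten.
print(run(tokens, lambda: ["E", "+", "num"], max_steps=3))
# (['E', '+', 'num', '+', 'num', '+', 'num'], 0)

# Always choosing E ::= num: the stack empties after one token, input remains.
print(run(tokens, lambda: ["num"], max_steps=10))
# ([], 1)
```

Neither fixed choice works, and the next token is num in both situations, so one token of lookahead cannot tell the two productions apart.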
Unambiguous Arithmetic Grammar
E ::= E + T | T
T ::= T * F | F
F ::= num | ( E )
We don’t know whether an E will have one term or many. Should we expand E to E + T or to T?
Unambiguous Arithmetic Grammar
E ::= E + T | T
T ::= T * F | F
F ::= num | ( E )
E ::= T { + T }*     (that is, T + T + T + …)
We do know that an E always starts with a T, followed by zero or more "+ T"s.
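The repetition form E ::= T { + T }* maps directly onto a loop. The following is a minimal sketch under simplifying assumptions that are not in the slides: the input is a Python list, numbers arrive as ints, and a term T is just a single number.

```python
def parse_expr(tokens):
    """Parse E ::= T { + T }* and return a left-nested tuple tree."""
    pos = 0

    def parse_term():
        nonlocal pos
        # Simplifying assumption: a term is a single integer token.
        if pos >= len(tokens) or not isinstance(tokens[pos], int):
            raise SyntaxError(f"expected a number at position {pos}")
        value = tokens[pos]
        pos += 1
        return value

    tree = parse_term()                   # E always starts with one T
    while pos < len(tokens) and tokens[pos] == "+":
        pos += 1                          # eat "+"
        tree = ("+", tree, parse_term())  # each "+ T" wraps the tree built so far
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return tree

print(parse_expr([1, "+", 2, "+", 3]))    # ('+', ('+', 1, 2), 3)
```

The loop never has to decide between E + T and T up front: it parses one T and then keeps going for as long as the next token is "+".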
Unambiguous Arithmetic Grammar
E ::= T E'     (replaces E + T | T)
E' ::= + T E' | ε
T ::= T * F | F
F ::= num | ( E )
Solution: factor out the leading T, which also removes the left recursion. E' is nullable and stands for an appended (+ T).
Now, what about T and F?
Unambiguous Arithmetic Grammar
E ::= T E'
E' ::= + T E' | ε
T ::= F T'     (replaces T * F | F)
T' ::= * F T' | ε
F ::= num | ( E )
The T/F relationship is isomorphic to the E/T one…
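The rewrite applied to E and T is the standard removal of immediate left recursion: A ::= A α | β becomes A ::= β A' and A' ::= α A' | ε. A small sketch, in which the function name and the list-of-symbols representation are illustrative choices rather than anything from the slides:

```python
def eliminate_left_recursion(name, alternatives):
    """Rewrite A ::= A a | b as A ::= b A' and A' ::= a A' | ε.

    Each alternative is a list of symbols; [] stands for ε.
    """
    tail = name + "'"
    # Alternatives that start with A itself, with that leading A removed.
    recursive = [alt[1:] for alt in alternatives if alt and alt[0] == name]
    others = [alt for alt in alternatives if not alt or alt[0] != name]
    if not recursive:
        return {name: alternatives}       # nothing to rewrite
    return {
        name: [alt + [tail] for alt in others],
        tail: [alt + [tail] for alt in recursive] + [[]],
    }

print(eliminate_left_recursion("E", [["E", "+", "T"], ["T"]]))
# {'E': [['T', "E'"]], "E'": [['+', 'T', "E'"], []]}
```

Applying the same call to T ::= T * F | F yields the T and T' rules, and F is returned unchanged because it has no left-recursive alternative.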
The “First” Set
E ::= T E'
E' ::= + T E' | ε
T ::= F T'
T' ::= * F T' | ε
F ::= num | ( E )
First(T E') = { num , ( }
First(+ T E') = { + }
First(F T') = ??
…
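First sets can be computed mechanically by iterating to a fixed point. The sketch below makes one assumption that differs from the slides' wording: it records ε as a member of First whenever a string can derive the empty string, whereas the slides treat the null production as having an empty First set. The function and variable names are illustrative.

```python
EPS = "ε"
GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],      # [] is the null production
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F":  [["num"], ["(", "E", ")"]],
}

def first_of_sequence(symbols, first):
    """First set of a symbol string, given the First sets of the nonterminals."""
    result = set()
    for sym in symbols:
        sym_first = first[sym] if sym in GRAMMAR else {sym}
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result              # this symbol cannot vanish, so stop here
    result.add(EPS)                    # every symbol was nullable, or the string is empty
    return result

def compute_first():
    first = {nt: set() for nt in GRAMMAR}
    changed = True
    while changed:                     # repeat until no set grows any more
        changed = False
        for nt, alternatives in GRAMMAR.items():
            for rhs in alternatives:
                new = first_of_sequence(rhs, first)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first

FIRST = compute_first()
print(sorted(first_of_sequence(["T", "E'"], FIRST)))       # ['(', 'num']
print(sorted(first_of_sequence(["+", "T", "E'"], FIRST)))  # ['+']
```

The same `first_of_sequence` call answers the question left open on the slide when it is given the symbols F and T'.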
Predictive Parsing Table
Rows: next symbol (nonterminal) on the stack. Columns: next token (terminal) in the input.
      +        *        num      (        )
E                       T E'     T E'
E'    + T E'
T                       F T'     F T'
T'             * F T'
F                       num      ( E )
Predictive Parsing Table
Each production goes under each terminal in its First set.
One entry per (stack symbol × input token) cell, … otherwise??
Empty cells raise a parsing error.
But, not so fast…
Since the null production will have an empty First set, it doesn’t appear in the table?
Input: ( 1 + 3 ) * 5 …
E ::= T E'
Stack: E
But, not so fast…
Since the null production will have an empty First set, it doesn’t appear in the table?
Input: 3 ) * 5 …
Stack: num T' E' ) T' E'
But, not so fast…
Since the null production will have an empty First set, it doesn’t appear in the table?
Input: ) * 5 …
Parse error?
Stack: T' E' ) T' E'
The “Follow” Set
E ::= T E'
E' ::= + T E' | ε
T ::= F T'
T' ::= * F T' | ε
F ::= num | ( E )
Follow(E') = { ) }
Follow(T') = { ) , + }
Include a null-production entry for each nullable nonterminal under every terminal in its Follow set …
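The two table-filling rules (productions under their First sets, null productions under the Follow set) can be sketched as follows. The First sets are read off the slides' table, the Follow sets are the ones given above, and the dictionary layout and function name are illustrative assumptions.

```python
# First set of each production's right-hand side; () is the null production.
PRODUCTIONS = {
    "E":  {("T", "E'"): {"num", "("}},
    "E'": {("+", "T", "E'"): {"+"}, (): set()},
    "T":  {("F", "T'"): {"num", "("}},
    "T'": {("*", "F", "T'"): {"*"}, (): set()},
    "F":  {("num",): {"num"}, ("(", "E", ")"): {"("}},
}
# Follow sets of the nullable nonterminals, as given on the slide.
FOLLOW = {"E'": {")"}, "T'": {")", "+"}}

def build_table():
    table = {}
    for nt, alternatives in PRODUCTIONS.items():
        for rhs, first in alternatives.items():
            # Null productions go under Follow(nt); all others under First(rhs).
            lookaheads = FOLLOW[nt] if rhs == () else first
            for terminal in lookaheads:
                if (nt, terminal) in table:
                    raise ValueError(f"conflict at ({nt}, {terminal})")
                table[(nt, terminal)] = rhs
    return table

TABLE = build_table()
for (nt, terminal), rhs in sorted(TABLE.items()):
    print(f"{nt:2} {terminal:3} -> {' '.join(rhs) or 'ε'}")
```

A cell that would receive two productions raises `ValueError` here, which corresponds to a grammar the predictive algorithm cannot handle.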
Complete Predictive Parsing Table
Rows: next symbol (nonterminal) on the stack. Columns: next token (terminal) in the input.
      +        *        num      (        )
E                       T E'     T E'
E'    + T E'                              ε
T                       F T'     F T'
T'    ε        * F T'                     ε
F                       num      ( E )
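With the complete table, the matching loop needs no guessing. The sketch below hard-codes the table above and checks token kinds only. One addition is an assumption and not on the slides: a "$" end-of-input marker that also selects the null productions, so that E' and T' can vanish at the end of the input.

```python
NONTERMINALS = {"E", "E'", "T", "T'", "F"}
# Complete table from the slide; () is the null production.
TABLE = {
    ("E", "num"): ("T", "E'"),       ("E", "("): ("T", "E'"),
    ("E'", "+"): ("+", "T", "E'"),   ("E'", ")"): (),
    ("T", "num"): ("F", "T'"),       ("T", "("): ("F", "T'"),
    ("T'", "+"): (),                 ("T'", "*"): ("*", "F", "T'"),
    ("T'", ")"): (),
    ("F", "num"): ("num",),          ("F", "("): ("(", "E", ")"),
    # Assumption (not on the slide): "$" marks end of input and also
    # selects the null productions.
    ("E'", "$"): (),                 ("T'", "$"): (),
}

def accepts(tokens, table=TABLE):
    """Return True if the token kinds form a valid E; False on an empty cell or mismatch."""
    tokens = list(tokens) + ["$"]
    stack = ["E"]                    # top of the stack is the end of the list
    pos = 0
    while stack:
        top = stack.pop()
        look = tokens[pos]
        if top in NONTERMINALS:
            rhs = table.get((top, look))
            if rhs is None:
                return False         # empty cell: parse error
            stack.extend(reversed(rhs))
        elif top == look:
            pos += 1                 # terminal matches: eat the token
        else:
            return False
    return tokens[pos] == "$"

expr = ["(", "num", "+", "num", ")", "*", "num"]   # token kinds for ( 1 + 3 ) * 5
print(accepts(expr))                                # True

# Without the null-production entries, the parse hits an empty cell as soon
# as a T' or E' has to vanish.
no_null = {key: rhs for key, rhs in TABLE.items() if rhs}
print(accepts(expr, no_null))                       # False
```

Dropping the ε entries reproduces the parse error from the earlier slides, which is why the Follow sets are needed to finish the table.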