Ambiguity, LL1 Grammars and Table-driven Parsing
Problems with Grammars
• Not all grammars are usable!
– Ambiguous
– Unproductive non-terminals
– Unreachable rules
Ambiguous Grammar
Φ = { E → D | ( E ) | E + E | E – E | E * E | E / E ,
D → 0 | 1 | … | 9 }
G is ambiguous if there exists a sentence S in L(G) such that there are two different parse trees for S.
Example: 1 + 2 * 3
[Figure: two parse trees for 1 + 2 * 3. One has + at the root and groups the input as 1 + (2 * 3); the other has * at the root and groups it as (1 + 2) * 3.]
Multiple meanings:
– Precedence: (1+2)*3 ≠ 1+(2*3)
– Associativity: (1-2)-3 ≠ 1-(2-3)
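The ambiguity can be checked mechanically by counting parse trees. The following is a minimal sketch, assuming single-character tokens and the grammar E → D | ( E ) | E op E from the slide; the function name `count_parse_trees` is made up for this illustration.

```python
from functools import lru_cache

OPERATORS = set("+-*/")


def count_parse_trees(tokens):
    """Count the parse trees of `tokens` under E -> D | ( E ) | E op E."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(lo, hi):
        # Number of distinct ways tokens[lo:hi] can be derived from E.
        total = 0
        if hi - lo == 1 and tokens[lo].isdigit():
            total += 1  # E -> D
        if hi - lo >= 3 and tokens[lo] == "(" and tokens[hi - 1] == ")":
            total += count(lo + 1, hi - 1)  # E -> ( E )
        for k in range(lo + 1, hi - 1):
            if tokens[k] in OPERATORS:
                # E -> E op E, with the operator at position k as the root.
                total += count(lo, k) * count(k + 1, hi)
        return total

    return count(0, len(tokens))


print(count_parse_trees("1+2*3"))    # 2: the sentence is ambiguous
print(count_parse_trees("(1+2)*3"))  # 1: parentheses force one grouping
```

Any sentence with a count above 1 is a witness that the grammar is ambiguous.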
Fixing Precedence Ambiguity
Φ = { E → D | ( E ) | E + E | E – E | E * E | E / E ,
D → 0 | 1 | … | 9 }
Observe:
– Operators lower in the parse tree are executed first
– Operators executed first have higher precedence
Fix:
– Introduce a new non-terminal symbol for each precedence level
E → T | E + T | E – T
T → F | T * F | T / F
F → D | ( E )
D → 0 | 1 | … | 9
[Figure: parse tree for 1 + 2 * 3 under the new grammar. The * node sits below the + node, so 2 * 3 is evaluated first.]
Adding the Power Operator
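With the power operator (written ^ here), a new precedence level P is added:
E → T | E+T | E–T
T → P | T*P | T/P
P → F | F^P
F → D | (E)
D → 0 | 1 | … | 9
Original grammar, for comparison:
E → T | E + T | E – T
T → F | T * F | T / F
F → D | ( E )
D → 0 | 1 | … | 9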
Fixing Associative Ambiguity
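Left recursion / Left associativity
E → D | E – D
[Figure: parse tree for 3 – 2 – 1. The left-recursive rule nests to the left, so the sentence groups as (3 – 2) – 1.]
Right recursion / Right associativity
E → D | D ^ E (^ is the power operator)
[Figure: parse tree for 2 ^ 3 ^ 2. The right-recursive rule nests to the right, so the sentence groups as 2 ^ (3 ^ 2).]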
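The effect of the layered grammar on precedence and associativity can be seen in a small evaluator. This is only a sketch under stated assumptions: operands are single digits, the power operator is written `^`, and the left-recursive rules are implemented as loops (a common hand-written equivalent, not something the slides prescribe). The name `evaluate` is hypothetical.

```python
def evaluate(text):
    """Evaluate single-digit arithmetic following the layered grammar
    E -> T | E+T | E-T,  T -> P | T*P | T/P,  P -> F | F^P,  F -> D | (E)."""
    tokens = list(text.replace(" ", ""))
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1

    def parse_e():
        # Left recursion as a loop: operands are combined left to right.
        value = parse_t()
        while peek() in ("+", "-"):
            op = peek()
            advance()
            right = parse_t()
            value = value + right if op == "+" else value - right
        return value

    def parse_t():
        value = parse_p()
        while peek() in ("*", "/"):
            op = peek()
            advance()
            right = parse_p()
            value = value * right if op == "*" else value / right
        return value

    def parse_p():
        # Right recursion: the exponent is parsed completely before applying ^.
        base = parse_f()
        if peek() == "^":
            advance()
            return base ** parse_p()
        return base

    def parse_f():
        tok = peek()
        if tok == "(":
            advance()
            value = parse_e()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return value
        if tok is not None and tok.isdigit():
            advance()
            return int(tok)
        raise SyntaxError(f"unexpected token {tok!r}")

    result = parse_e()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return result


print(evaluate("1+2*3"))  # 7:   * binds tighter than +
print(evaluate("3-2-1"))  # 0:   (3-2)-1, left associative
print(evaluate("2^3^2"))  # 512: 2^(3^2), right associative
```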
Unreachable Rules
Φ = { S → aABb ,
A → a | aA ,
B → b | bBD ,
C → cD ,
D → e }
1. Initialize the set of reachable non-terminals R with the start symbol.
2. For each round, if R includes the lhs of a production rule, add the non-terminals in the rhs to R.
3. Loop on #2 until there are no changes to R.
4. Rules whose left-hand sides are non-terminals in V_N minus the non-terminals in R are the set of unreachable rules.
Initialize: R = {S}
Round 1: R = {S, A, B}
Round 2: R = {S, A, B, D}
Round 3: R = {S, A, B, D}
Done: no change: V_N – {S, A, B, D} = {C}
Least-fixed point algorithm
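The four steps can be sketched as a short fixed-point loop. The grammar encoding (a dict from non-terminal to a list of right-hand-side strings, one character per symbol) and the name `reachable_nonterminals` are assumptions made for this illustration.

```python
def reachable_nonterminals(grammar, start):
    """Return the non-terminals reachable from `start` (least fixed point)."""
    reachable = {start}
    changed = True
    while changed:  # step 3: repeat until a full pass adds nothing
        changed = False
        for lhs, alternatives in grammar.items():
            if lhs not in reachable:
                continue
            for rhs in alternatives:
                for symbol in rhs:
                    # Only keys of `grammar` are non-terminals.
                    if symbol in grammar and symbol not in reachable:
                        reachable.add(symbol)
                        changed = True
    return reachable


GRAMMAR = {
    "S": ["aABb"],
    "A": ["a", "aA"],
    "B": ["b", "bBD"],
    "C": ["cD"],
    "D": ["e"],
}

reachable = reachable_nonterminals(GRAMMAR, "S")
print(sorted(reachable))                 # ['A', 'B', 'D', 'S']
print(sorted(set(GRAMMAR) - reachable))  # ['C']
```

Because this loop uses newly added symbols within the same pass, it may need fewer passes than the rounds listed on the slide, but it reaches the same fixed point.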
Unproductive Non-terminals
Φ = { S → aABb ,
A → bC ,
B → d | dB ,
C → eC }
1. Start with the set of terminals T.
2. For each round, if T covers a rhs of a production rule, add the lhs to T.
3. Loop on #2 until there are no changes to T.
4. The alphabet of terminals and non-terminals, V, minus T is the set of unproductive non-terminals.
Least-fixed point algorithm
Initialize: T = {a, b, d, e}
Round 1: T = {a, b, d, e, B}
Round 2: T = {a, b, d, e, B}
Done: no change: {a, b, d, e, A, B, C, S} – T = {A, C, S}
C never produces a string of only terminals: C ⇒ eC ⇒ eeC ⇒ … ⇒ e^n C
A also, because it always produces C: A ⇒ bC ⇒ beC ⇒ … ⇒ b e^n C
S also, because it always produces A: S ⇒ aABb ⇒ aAdb ⇒ abCdb ⇒ …
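The productive-symbol computation follows the same fixed-point pattern. A minimal sketch, assuming the same dict-of-strings grammar encoding as an illustration device; `productive_symbols` is a hypothetical name.

```python
def productive_symbols(grammar, terminals):
    """Return the terminals plus every non-terminal that can derive a
    string of terminals (least fixed point)."""
    covered = set(terminals)  # step 1: start with the terminals
    changed = True
    while changed:  # step 3: repeat until nothing is added
        changed = False
        for lhs, alternatives in grammar.items():
            if lhs in covered:
                continue
            # Step 2: lhs is productive if some rhs is fully covered.
            if any(all(s in covered for s in rhs) for rhs in alternatives):
                covered.add(lhs)
                changed = True
    return covered


GRAMMAR = {
    "S": ["aABb"],
    "A": ["bC"],
    "B": ["d", "dB"],
    "C": ["eC"],
}
TERMINALS = set("abde")

covered = productive_symbols(GRAMMAR, TERMINALS)
print(sorted(set(GRAMMAR) - covered))  # ['A', 'C', 'S']
```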
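Top-down Parsing with Backtracking
E → N | OEE
O → + | – | * | /
N → 0 | 1 | 2 | 3 | 4
Input: *+342
[Figure: search tree of a backtracking parser for *+342. At each E it tries the alternatives N and OEE in turn, and at each N and O it tries each terminal until one matches the input.]
Prefix expressions associate an operator with the next two operands.
E.g., *+324 = (3+2)*4, *2+34 = 2*(3+4)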
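A backtracking top-down parser for this prefix grammar can be sketched with generators: each non-terminal tries its alternatives in order and yields every input position at which a match could end. This is an illustrative recognizer, not the slides' own code; it writes minus as `-` and relies on the grammar having no left recursion.

```python
GRAMMAR = {
    "E": [["N"], ["O", "E", "E"]],
    "O": [["+"], ["-"], ["*"], ["/"]],
    "N": [["0"], ["1"], ["2"], ["3"], ["4"]],
}


def match_symbol(symbol, text, pos):
    """Yield each position where `symbol`, started at `pos`, can end."""
    if symbol not in GRAMMAR:  # terminal: must equal the next character
        if pos < len(text) and text[pos] == symbol:
            yield pos + 1
        return
    for rhs in GRAMMAR[symbol]:  # try every alternative (backtracking)
        yield from match_sequence(rhs, text, pos)


def match_sequence(symbols, text, pos):
    """Yield each position where the whole symbol sequence can end."""
    if not symbols:
        yield pos
        return
    for middle in match_symbol(symbols[0], text, pos):
        yield from match_sequence(symbols[1:], text, middle)


def accepts(text):
    return any(end == len(text) for end in match_symbol("E", text, 0))


print(accepts("*+342"))  # True
print(accepts("2*3"))    # False: infix is not in the language
```

The parser explores alternatives blindly, which is the inefficiency that a look-ahead token removes.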
LL(1) Parsers
• Problem:
– A backtracking parser never knows which production to try (and is very inefficient)
• Solution:
– LL parser: parses input from Left to right, and constructs a Leftmost derivation of the sentence
– LL(k) parser uses k tokens of look-ahead
• LL(1) parsers:
– Somewhat restrictive, BUT
– Only need current non-terminal and next token to make parsing decision
• LL(1) parsers require LL(1) grammars
Simple LL(1) Grammars
All rules have the form:
A → a_1 α_1 | a_2 α_2 | … | a_n α_n
where
– a_i (1 ≤ i ≤ n) is a terminal
– a_i ≠ a_j for i ≠ j
– α_i (1 ≤ i ≤ n) is a sequence of terminals and non-terminals, or is empty
Creating Simple LL(1) Grammars
• Why is this not a simple LL(1) grammar?
E → N | OEE
O → + | – | * | /
N → 0 | 1 | 2 | 3 | 4
• How can we change it to simple LL(1)?
• By making all production rules of the form:
A → a_1 α_1 | a_2 α_2 | … | a_n α_n
• Thus,
E → 0 | 1 | 2 | 3 | 4 | +EE | –EE | *EE | /EE
LL(1) Parsing
E → (1)0 | (2)1 | (3)2 | (4)3 | (5)4 | (6)+EE | (7)–EE | (8)*EE | (9)/EE
[Figure: two parse attempts, each choosing the rule for E from the next input token. Input * + 2 3 4: Success! (rules 8, 6, 3, 4, 5). Input 2 * 3: Fail!]
Simple LL(1) Parse Table
A parse table is defined as follows:
ACTION: (V ∪ {#}) × (V_T ∪ {#}) → {(α, i), pop, accept, error}
where
– α is the right side of production number i
– # marks the end of the input string (# ∉ V)
If A ∈ (V ∪ {#}) is the symbol on top of the stack and a ∈ (V_T ∪ {#}) is the current input symbol, then ACTION(A, a) =
– pop, if A = a for a ∈ V_T
– accept, if A = # and a = #
– (aα, i), which means "pop, then push aα and output i", if A → aα is the ith production
– error, otherwise
Simple LL(1) Parse Table Example
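E → (1)0 | (2)1 | (3)2 | (4)3 | (5)+EE | (6)*EE
Rows are the stack symbols in V ∪ {#}; columns are the input symbols in V_T ∪ {#}.
            | 0     | 1     | 2     | 3     | +       | *       | #
E           | (0,1) | (1,2) | (2,3) | (3,4) | (+EE,5) | (*EE,6) |
0,1,2,3,+,* | pop   | pop   | pop   | pop   | pop     | pop     |
#           |       |       |       |       |         |         | accept
The row 0,1,2,3,+,* stands for one row per terminal; each of those rows has pop only in the column of the same terminal.
All blank entries are error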
Parse Table Execution: *+123
The Input column shows the input that remains to be read.
Action | Stack | Input | Output
Initialize | E# | *+123# |
ACTION(E,*) = Replace [E,*EE], Out 6 | *EE# | *+123# | 6
ACTION(*,*) = pop(*,*) | EE# | +123# | 6
ACTION(E,+) = Replace [E,+EE], Out 5 | +EEE# | +123# | 65
ACTION(+,+) = pop(+,+) | EEE# | 123# | 65
ACTION(E,1) = Replace [E,1], Out 2 | 1EE# | 123# | 652
ACTION(1,1) = pop(1,1) | EE# | 23# | 652
ACTION(E,2) = Replace [E,2], Out 3 | 2E# | 23# | 6523
ACTION(2,2) = pop(2,2) | E# | 3# | 6523
ACTION(E,3) = Replace [E,3], Out 4 | 3# | 3# | 65234
ACTION(3,3) = pop(3,3) | # | # | 65234
ACTION(#,#) = accept | Done!
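The table and its execution can be sketched as a small stack machine. This is a minimal illustration that assumes one character per token; the names `build_simple_table` and `parse` are hypothetical. Because the grammar is simple LL(1), the leading terminal of each right side selects the rule.

```python
# Productions of the example grammar, keyed by rule number.
PRODUCTIONS = {
    1: ("E", "0"), 2: ("E", "1"), 3: ("E", "2"), 4: ("E", "3"),
    5: ("E", "+EE"), 6: ("E", "*EE"),
}
TERMINALS = set("0123+*")


def build_simple_table(productions):
    """Map (non-terminal, leading terminal) to (right side, rule number)."""
    table = {}
    for number, (lhs, rhs) in productions.items():
        key = (lhs, rhs[0])
        if key in table:
            raise ValueError(f"not simple LL(1): conflict at {key}")
        table[key] = (rhs, number)
    return table


def parse(text, table, start="E"):
    """Return the list of rule numbers output while parsing `text`."""
    stack = ["#", start]  # top of the stack is the end of the list
    tokens = list(text) + ["#"]
    pos = 0
    output = []
    while True:
        top, look = stack[-1], tokens[pos]
        if top == "#" and look == "#":
            return output  # accept
        if top in TERMINALS and top == look:
            stack.pop()  # pop: matched a terminal
            pos += 1
        elif (top, look) in table:
            rhs, number = table[(top, look)]
            stack.pop()
            stack.extend(reversed(rhs))  # leftmost symbol ends up on top
            output.append(number)
        else:
            raise SyntaxError(f"error at position {pos}: {look!r}")


TABLE = build_simple_table(PRODUCTIONS)
print(parse("*+123", TABLE))  # [6, 5, 2, 3, 4]
```

Any table entry that is missing plays the role of the blank "error" cells.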
Relaxing Simple LL(1) Restrictions
• Consider the following grammar
E → (1)N | (2)OEE
O → (3)+ | (4)*
N → (5)0 | (6)1 | (7)2 | (8)3
• Not simple LL(1): rules (1) & (2)
• However:
– N leads only to {0, 1, 2, 3}
– O leads only to {+, *}
– {0, 1, 2, 3} ∩ {+, *} = ∅
• We can distinguish between rules (1) and (2):
– If we see 0, 1, 2, or 3, we choose (1)
– If we see + or *, we choose (2)
LL(1) Grammars
• For any α, define
FIRST(α) = { a | α ⇒* aβ and a ∈ V_T }
• A grammar is LL(1) if for all rules of the form
A → α_1 | α_2 | … | α_n
then,
FIRST(α_i) ∩ FIRST(α_j) = ∅ for i ≠ j
(i.e., the sets FIRST(α_1), FIRST(α_2), …, and FIRST(α_n) are pairwise disjoint)
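The FIRST sets and the pairwise-disjointness test can be sketched as follows. The sketch assumes, like the definition above, that no right side is empty, so FIRST of a right side depends only on its first symbol. The grammar encoding and the function names are choices made for this illustration.

```python
GRAMMAR = {
    "E": [["N"], ["O", "E", "E"]],
    "O": [["+"], ["*"]],
    "N": [["0"], ["1"], ["2"], ["3"]],
}


def first_sets(grammar):
    """Compute FIRST for every non-terminal as a least fixed point."""
    first = {nonterminal: set() for nonterminal in grammar}
    changed = True
    while changed:
        changed = False
        for lhs, alternatives in grammar.items():
            for rhs in alternatives:
                head = rhs[0]
                new = first[head] if head in grammar else {head}
                if not new <= first[lhs]:
                    first[lhs] |= new
                    changed = True
    return first


def first_of(rhs, grammar, first):
    """FIRST of a right side (assumes no empty right sides)."""
    head = rhs[0]
    return set(first[head]) if head in grammar else {head}


def is_ll1(grammar):
    """Check that the alternatives of every rule have disjoint FIRST sets."""
    first = first_sets(grammar)
    for alternatives in grammar.values():
        seen = set()
        for rhs in alternatives:
            current = first_of(rhs, grammar, first)
            if seen & current:
                return False
            seen |= current
    return True


print(sorted(first_sets(GRAMMAR)["E"]))  # ['*', '+', '0', '1', '2', '3']
print(is_ll1(GRAMMAR))                   # True
```

A grammar with empty right sides would also need FOLLOW sets, which this sketch deliberately leaves out.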
LL(1) Parse Table
E → (1)N | (2)OEE
O → (3)+ | (4)*
N → (5)0 | (6)1 | (7)2 | (8)3
Rows are the stack symbols in V ∪ {#}; columns are the input symbols in V_T ∪ {#}.
            | +       | *       | 0     | 1     | 2     | 3     | #
E           | (OEE,2) | (OEE,2) | (N,1) | (N,1) | (N,1) | (N,1) |
O           | (+,3)   | (*,4)   |       |       |       |       |
N           |         |         | (0,5) | (1,6) | (2,7) | (3,8) |
+,*,0,1,2,3 | pop     | pop     | pop   | pop   | pop   | pop   |
#           |         |         |       |       |       |       | accept
The row +,*,0,1,2,3 stands for one row per terminal; each of those rows has pop only in the column of the same terminal.
For (A, a), we select (α, i) if a ∈ FIRST(α) and α is the right hand side of rule i.
Parse Table Execution Revisited: *+123
The Input column shows the input that remains to be read.
Action | Stack | Input | Output
Initialize | E# | *+123# |
ACTION(E,*) = Replace [E,OEE], Out 2 | OEE# | *+123# | 2
ACTION(O,*) = Replace [O,*], Out 4 | *EE# | *+123# | 24
ACTION(*,*) = pop(*,*) | EE# | +123# | 24
ACTION(E,+) = Replace [E,OEE], Out 2 | OEEE# | +123# | 242
ACTION(O,+) = Replace [O,+], Out 3 | +EEE# | +123# | 2423
ACTION(+,+) = pop(+,+) | EEE# | 123# | 2423
ACTION(E,1) = Replace [E,N], Out 1 | NEE# | 123# | 24231
ACTION(N,1) = Replace [N,1], Out 6 | 1EE# | 123# | 242316
ACTION(1,1) = pop(1,1) | EE# | 23# | 242316
ACTION(E,2) = Replace [E,N], Out 1 | NE# | 23# | 2423161
ACTION(N,2) = Replace [N,2], Out 7 | 2E# | 23# | 24231617
ACTION(2,2) = pop(2,2) | E# | 3# | 24231617
ACTION(E,3) = Replace [E,N], Out 1 | N# | 3# | 242316171
ACTION(N,3) = Replace [N,3], Out 8 | 3# | 3# | 2423161718
ACTION(3,3) = pop(3,3) | # | # | 2423161718
ACTION(#,#) = accept | Done!
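The same run can be reproduced by filling the table from FIRST sets instead of from leading terminals. A hedged sketch: `first`, `build_table`, and `parse` are hypothetical names, tokens are single characters, and the recursive `first` only terminates because this grammar has no left recursion and no empty right sides.

```python
RULES = {
    1: ("E", ["N"]), 2: ("E", ["O", "E", "E"]),
    3: ("O", ["+"]), 4: ("O", ["*"]),
    5: ("N", ["0"]), 6: ("N", ["1"]), 7: ("N", ["2"]), 8: ("N", ["3"]),
}
NONTERMINALS = {"E", "O", "N"}


def first(symbols):
    """FIRST of a non-empty symbol sequence (no left recursion assumed)."""
    head = symbols[0]
    if head not in NONTERMINALS:
        return {head}
    result = set()
    for lhs, rhs in RULES.values():
        if lhs == head:
            result |= first(rhs)
    return result


def build_table():
    """For (A, a), select (alpha, i) if a is in FIRST(alpha)."""
    table = {}
    for number, (lhs, rhs) in RULES.items():
        for terminal in first(rhs):
            if (lhs, terminal) in table:
                raise ValueError(f"not LL(1): conflict at {(lhs, terminal)}")
            table[(lhs, terminal)] = (rhs, number)
    return table


def parse(text, table, start="E"):
    """Return the rule numbers output while parsing `text`."""
    stack = ["#", start]  # top of the stack is the end of the list
    tokens = list(text) + ["#"]
    pos, output = 0, []
    while True:
        top, look = stack[-1], tokens[pos]
        if top == "#" and look == "#":
            return output  # accept
        if top not in NONTERMINALS and top != "#" and top == look:
            stack.pop()  # pop: matched a terminal
            pos += 1
        elif (top, look) in table:
            rhs, number = table[(top, look)]
            stack.pop()
            stack.extend(reversed(rhs))
            output.append(number)
        else:
            raise SyntaxError(f"error at position {pos}: {look!r}")


print(parse("*+123", build_table()))  # [2, 4, 2, 3, 1, 6, 1, 7, 1, 8]
```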
What does 2 4 2 3 1 6 1 7 1 8 mean?
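E → (1)N | (2)OEE
O → (3)+ | (4)*
N → (5)0 | (6)1 | (7)2 | (8)3
Parse tree (each node shows the rule applied to it; children are indented):
E: (2) OEE
    O: (4) *
    E: (2) OEE
        O: (3) +
        E: (1) N
            N: (6) 1
        E: (1) N
            N: (7) 2
    E: (1) N
        N: (8) 3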
2 4 2 3 1 6 1 7 1 8 defines a parse tree via a preorder traversal
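The preorder reading can be checked by rebuilding the tree from the rule numbers alone. In this sketch the tuple representation `(symbol, rule number, children)` and the names `build_tree` and `leaves` are assumptions made for illustration.

```python
RULES = {
    1: ("E", ["N"]), 2: ("E", ["O", "E", "E"]),
    3: ("O", ["+"]), 4: ("O", ["*"]),
    5: ("N", ["0"]), 6: ("N", ["1"]), 7: ("N", ["2"]), 8: ("N", ["3"]),
}
NONTERMINALS = {"E", "O", "N"}


def build_tree(rule_numbers, start="E"):
    """Rebuild the parse tree from a preorder list of rule numbers."""
    numbers = iter(rule_numbers)

    def expand(symbol):
        if symbol not in NONTERMINALS:
            return symbol  # terminals are leaves
        number = next(numbers, None)
        if number is None:
            raise ValueError("rule sequence ended too early")
        lhs, rhs = RULES[number]
        if lhs != symbol:
            raise ValueError(f"rule {number} does not expand {symbol}")
        # Children are expanded left to right, which is preorder.
        return (symbol, number, [expand(child) for child in rhs])

    tree = expand(start)
    if next(numbers, None) is not None:
        raise ValueError("unused rule numbers remain")
    return tree


def leaves(tree):
    """Concatenate the leaves of the tree from left to right."""
    if isinstance(tree, str):
        return tree
    return "".join(leaves(child) for child in tree[2])


tree = build_tree([2, 4, 2, 3, 1, 6, 1, 7, 1, 8])
print(leaves(tree))  # *+123
```

Reading the leaves back gives the original input, which confirms that the output sequence determines the tree.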