32
Bottom-Up Parsing Part II Canonical Collection of LR(0) items CC_LR(0)_I items(G’:augmented_grammar){ C = {CLOSURE({S’ •S})} ; repeat{ foreach(I C) foreach(grammar symbol X) if(GOTO(I,X)&& GOTO(I,X) C) C = C {GOTO(I,X)}; }until(no new sets of items are added to C) return C; } 1 2

Bottom-Up Parsing · Bottom-Up Parsing Part II Canonical Collection of LR(0) items ... A bB A a B cC B cCe C dAf Do Selected Do Step Do All Next Parse ba b cc dd ee ff $$ AA BB CC

Embed Size (px)

Citation preview

Page 1: Bottom-Up Parsing · Bottom-Up Parsing Part II Canonical Collection of LR(0) items ... A bB A a B cC B cCe C dAf Do Selected Do Step Do All Next Parse ba b cc dd ee ff $$ AA BB CC

Bottom-Up ParsingPart II

Canonical Collection of LR(0) items

CC_LR(0)_I items(G’:augmented_grammar){ C = {CLOSURE({S’ → •S})} ; repeat{ foreach(I ∈ C) foreach(grammar symbol X) if(GOTO(I,X)≠∅ && GOTO(I,X) ∉ C) C = C ∪ {GOTO(I,X)}; }until(no new sets of items are added to C) return C;}

1

2

Page 2: Bottom-Up Parsing · Bottom-Up Parsing Part II Canonical Collection of LR(0) items ... A bB A a B cC B cCe C dAf Do Selected Do Step Do All Next Parse ba b cc dd ee ff $$ AA BB CC

LR(0) automaton

G’ : augmented grammar

LR(0) automaton for G’

〈Q, q0, GOTO: Q × (TG’ ∪ NG’) → Q, F〉where:Q = F = items(G’),q0 = CLOSURE({S’ → •S})

E’ → EE → E + T | TT → T* F | FF → (E) | id

Construction of the LR(0) automaton for the expression grammar:

3

4

Page 3: Bottom-Up Parsing · Bottom-Up Parsing Part II Canonical Collection of LR(0) items ... A bB A a B cC B cCe C dAf Do Selected Do Step Do All Next Parse ba b cc dd ee ff $$ AA BB CC

E’ → EE → E + T | TT → T* F | FF → (E) | id

Transition diagram (1/2)

I0E’ !> .EE !> . E+TE !> .TT !> .T*FT !> .FF !> .(E)F !> .id

E !> E+ . TT !> . T*FT !> .FF !> .(E)F !> .id

I6

E !> E+T.T !> T.*F

I9

T !> T*F .I10

T !> T*.FF !> .(E)F !> . id

I7E !> T.T !> T.*F

I2

T !> F .I3

F !> ( E ) .I11

F !> ( E . )E !> E . + T

I8F !> ( . E )E !> . E + TE !> .TT !> . T * FT !> . FF !> . ( E )F !> . id

I4

F !> id .I5

E + T*

F

(

idT

*F

(

id

F(id

id

(

E

T

F

)

+

I7

I4

I5

I6I2

I3

I3

I5

I4

E’ !> E.E !> E . + T

1I

Compiler notes #3, 20070503, Tsan-sheng Hsu 66

E’ → EE → E + T | TT → T* F | FF → (E) | id

5

6

Page 4: Bottom-Up Parsing · Bottom-Up Parsing Part II Canonical Collection of LR(0) items ... A bB A a B cC B cCe C dAf Do Selected Do Step Do All Next Parse ba b cc dd ee ff $$ AA BB CC

I0 I1

I2

I3

I4

I5

I6

I7

I8

I10

I9

I11

E + T

T

id(

F

(

T

$

accept

Eid

id(

F

*F

*

+ )

id(

F

E’ → EE → E + T | TT → T* F | FF → (E) | id

Push(I0);repeat{

//Begin to scan the input from left to rightIi = top() and next input symbol a;if (Ij = GOTO(Ii,a)) then shift a and Ij;

// Push(a) // push(Ij).

else if (A→β• ∈ Ii) then{ perform “reduce by A→β”; go to the state Ij = GOTO(I,A) and push A and Ij into the stack //where I is the state on the top_of__stack //after removing β and the corresponding states; } _exit(Reject if none of the above can be done); _exit(Report “conflicts” if more than one can be done);

} until EOF is seen

Shift-reduce parsing using LR(0) automaton

7

8

Page 5: Bottom-Up Parsing · Bottom-Up Parsing Part II Canonical Collection of LR(0) items ... A bB A a B cC B cCe C dAf Do Selected Do Step Do All Next Parse ba b cc dd ee ff $$ AA BB CC

LRParsingProgram

$

::

sm-1

sm

ACTION GOTO

a1 ai ... $an...

stack

input

output

MODEL OF AN LR PARSER

〈Q, 0, GOTO: Q × (TGʼ ∪ NGʼ) → Q, F〉

I0,...,In enumeration of items(G') I0= CLOSURE({Sʼ → •S})

GOTO(s,X)=s' ⇔ GOTO(Is,X)= Is'

Q = {0,...,n} s Is

∀ s ∈ Q we associate a unique symbol Xs ∈ TG’ ∪ NG’ to s≠0if GOTO(s',X)=s then Xs = X.Moreover, X0 represents the symbol $ (bottom of the stack)

Proposition: if GOTO(s',X)=s and GOTO(s",Y)=s then X=Y

Fix for G' a finite bijective enumeration p: {1,...,n} → G'-productions, s.t. p(1)= S→S'

9

10

Page 6: Bottom-Up Parsing · Bottom-Up Parsing Part II Canonical Collection of LR(0) items ... A bB A a B cC B cCe C dAf Do Selected Do Step Do All Next Parse ba b cc dd ee ff $$ AA BB CC

S′→SS→AA→bB A→a B→cC B→cCe C → dAf

Construct the LR(0) automaton for the grammar

Solution for exercise 1

Faculty of Sciences (ULB) INFO-F403 – Exercises 2010-2011 6 / 16

accept

S′→SS→AA→bB A→a B→cC B→cCe C → dAf

11

12

Page 7: Bottom-Up Parsing · Bottom-Up Parsing Part II Canonical Collection of LR(0) items ... A bB A a B cC B cCe C dAf Do Selected Do Step Do All Next Parse ba b cc dd ee ff $$ AA BB CC

S′→SS→AA→bB A→a B→cC B→cCe C → dAf

Solution for exercise 1

Follow1(S) = {$}Follow1(A) = {f,$}Follow1(B) = {f,$}Follow1(C) = {e,f,$}

Faculty of Sciences (ULB) INFO-F403 – Exercises 2010-2011 7 / 16

43

!"#$%&'()*+,--,./01

• !"#$%&'&()*+&,-..-/012

• 3%4%#*&5)*("&)+*6()7&$6#)7%89– (:&;&!&"<#&*6%)&#==&,>31?0#2@A$B&*+&,-..-/0<2

– (:&;&!&"<&*6%)&#==&,-..-/0;2&*+&,-..-/0<2

– (:&;&!&"<#&#)=&$&(8&()&,>31?0#2&*6%)&#==&,-..-/0;2&

*+&,-..-/0<2

S′→SS→AA→bB A→a B→cC B→cCe C → dAf

Solution for exercise 1

Follow1(S) = {$}Follow1(A) = {f,$}Follow1(B) = {f,$}Follow1(C) = {e,f,$}

Faculty of Sciences (ULB) INFO-F403 – Exercises 2010-2011 7 / 16

43

!"#$%&'()*+,--,./01

• !"#$%&'&()*+&,-..-/012

• 3%4%#*&5)*("&)+*6()7&$6#)7%89– (:&;&!&"<#&*6%)&#==&,>31?0#2@A$B&*+&,-..-/0<2

– (:&;&!&"<&*6%)&#==&,-..-/0;2&*+&,-..-/0<2

– (:&;&!&"<#&#)=&$&(8&()&,>31?0#2&*6%)&#==&,-..-/0;2&

*+&,-..-/0<2

13

14

Page 8: Bottom-Up Parsing · Bottom-Up Parsing Part II Canonical Collection of LR(0) items ... A bB A a B cC B cCe C dAf Do Selected Do Step Do All Next Parse ba b cc dd ee ff $$ AA BB CC

StateACTIONACTIONACTION GOTO GOTO GOTO

State

A

G

A= ACTION[i,a]G = GOTO[j,A]

ACTION[i,a]= s j (shift symbol j (namely Xj))ACTION[i,a]= r k (reduce production p(k))ACTION[i,a]= accept ACTION[i,a]= error

PARSING TABLE

i

j

a A

terminals and $ non terminals

15

16

Page 9: Bottom-Up Parsing · Bottom-Up Parsing Part II Canonical Collection of LR(0) items ... A bB A a B cC B cCe C dAf Do Selected Do Step Do All Next Parse ba b cc dd ee ff $$ AA BB CC

remaining input

( 0 s1 ... sm , ai ai+1 ... an $ )

LR-PARSER CONFIGURATIONS

X0 Xs1 ... Xsm ai ai+1 ... an

configuration

right-sentential form

stack

top

BEHAVIOR OF THE LR PARSER

symbol Xs

p(k)=A → β|β| = rs=GOTO[sm-r, A]

ACTION[sm,ai] = reduce k

( s0 s1 ... sm , ai ai+1 ... an $ )

( s0 s1 ... sm-r s, ai ai+1 ... an $ )

ACTION[sm,ai] = shift s

( s0 s1 ... sm , ai ai+1 ... an $ )

( s0 s1 ... sm s, ai+1 ... an $ )

symbol Xs

17

18

Page 10: Bottom-Up Parsing · Bottom-Up Parsing Part II Canonical Collection of LR(0) items ... A bB A a B cC B cCe C dAf Do Selected Do Step Do All Next Parse ba b cc dd ee ff $$ AA BB CC

BEHAVIOR OF THE LR PARSER

ACTION[sm,ai] = error

( s0 s1 ... sm , ai ai+1 ... an $ )

error recovery routine

ACTION[sm,ai] = accept

( s0 s1 ... sm , ai ai+1 ... an $ )

end of parsing

LR-parsing algorithm

19

20

Page 11: Bottom-Up Parsing · Bottom-Up Parsing Part II Canonical Collection of LR(0) items ... A bB A a B cC B cCe C dAf Do Selected Do Step Do All Next Parse ba b cc dd ee ff $$ AA BB CC

0 1

2

3

4

5

6

7

8

10

9

11

E + T

T

id(

F

(

T

$

accept

Eid

id(

F

*F

*

+ )

id(

F

X1 = EX2 = TX3 = FX4 = (X5 = idX6 = +X7 = *X8 = EX9 = TX10 = FX11 = )

X1 = EX2 = TX3 = FX4 = (X5 = idX6 = +X7 = *X8 = EX9 = TX10 = FX11 = )

id ✱ id + id

21

22

Page 12: Bottom-Up Parsing · Bottom-Up Parsing Part II Canonical Collection of LR(0) items ... A bB A a B cC B cCe C dAf Do Selected Do Step Do All Next Parse ba b cc dd ee ff $$ AA BB CC

X1 = EX2 = TX3 = FX4 = (X5 = idX6 = +X7 = *X8 = EX9 = TX10 = FX11 = )

1.G’ : augmented grammar, with enumeration p of productions;2.construct the LR(0) automaton (canonical item set C={I0,...,In}, I0=CLOSURE[S’→ •S] GOTO function);3.foreach state i:(a) if (A→ α•aβ ∈ Ii && GOTO[Ii,a]=Ij)

add shift j to ACTION[i,a];(b) if (A→ α• ∈ Ii && A ≠ S’)

foreach (a ∈ FOLLOW(A)) add reduce p-1(A → α) to ACTION[i,a];(c) if (S’→ S• ∈ Ii)

add accept to ACTION[i,$] 4. foreach (state i,k && symbol A) if (GOTO[Ij,A]=Ik) add k to GOTO[s,A]5. if (ACTION[i,a] is undefined) add error to ACTION[i,a]6. if (GOTO[i,A] is undefined) add error to GOTO[i,A]

23

24

Page 13: Bottom-Up Parsing · Bottom-Up Parsing Part II Canonical Collection of LR(0) items ... A bB A a B cC B cCe C dAf Do Selected Do Step Do All Next Parse ba b cc dd ee ff $$ AA BB CC

LR-parsing algorithm

25

26

Page 14: Bottom-Up Parsing · Bottom-Up Parsing Part II Canonical Collection of LR(0) items ... A bB A a B cC B cCe C dAf Do Selected Do Step Do All Next Parse ba b cc dd ee ff $$ AA BB CC

ParseNextDo AllDo StepDo Selected

aa bb cc dd ee ff $$ AA BB CC SS

0 s3 s4 1 2

1 r1

2 acc

3 r3 r3

4 s6 5

5 r2 r2

6 s8 7

7 s9 r4 r4

8 s3 s4 10

9 r5 r5

10 s11

11 r6 r6 r6

A

S

f

b

a

c

b e

A

d

a

B

C

q0

S'!"S

S!"A

A!"a

A!"bB

q1

S!A"

q2

S'!S"

q3

A!a"

q4

B!"cCe

A!b"B

B!"cC

q5

A!bB"

q6

B!c"Ce

C!"dAf

B!c"C

q7

B!cC"

B!cC"e

q8

C!d"Af

A!"a

A!"bB

q9

B!cCe"

q10

C!dA"f

q11

C!dAf"

FIRSTFIRST FOLLOWFOLLOW

A { b, a } { f, $ }

B { c } { f, $ }

C { d } { f, e, $ }

S { b, a } { $ }

Parse table complete. Press "parse" to use it.S' S

S A

A bB

A a

B cC

B cCe

C dAf

Exercise: Construct the PARSING TABLE

ParseNextDo AllDo StepDo Selected

aa bb cc dd ee ff $$ AA BB CC SS

0 s3 s4 1 2

1 r1

2 acc

3 r3 r3

4 s6 5

5 r2 r2

6 s8 7

7 s9 r4 r4

8 s3 s4 10

9 r5 r5

10 s11

11 r6 r6 r6

A

S

f

b

a

c

b e

A

d

a

B

C

q0

S'!"S

S!"A

A!"a

A!"bB

q1

S!A"

q2

S'!S"

q3

A!a"

q4

B!"cCe

A!b"B

B!"cC

q5

A!bB"

q6

B!c"Ce

C!"dAf

B!c"C

q7

B!cC"

B!cC"e

q8

C!d"Af

A!"a

A!"bB

q9

B!cCe"

q10

C!dA"f

q11

C!dAf"

FIRSTFIRST FOLLOWFOLLOW

A { b, a } { f, $ }

B { c } { f, $ }

C { d } { f, e, $ }

S { b, a } { $ }

Parse table complete. Press "parse" to use it.S' S

S A

A bB

A a

B cC

B cCe

C dAf

ParseNextDo AllDo StepDo Selected

aa bb cc dd ee ff $$ AA BB CC SS

0 s3 s4 1 2

1 r1

2 acc

3 r3 r3

4 s6 5

5 r2 r2

6 s8 7

7 s9 r4 r4

8 s3 s4 10

9 r5 r5

10 s11

11 r6 r6 r6

A

S

f

b

a

c

b e

A

d

a

B

C

q0

S'!"S

S!"A

A!"a

A!"bB

q1

S!A"

q2

S'!S"

q3

A!a"

q4

B!"cCe

A!b"B

B!"cC

q5

A!bB"

q6

B!c"Ce

C!"dAf

B!c"C

q7

B!cC"

B!cC"e

q8

C!d"Af

A!"a

A!"bB

q9

B!cCe"

q10

C!dA"f

q11

C!dAf"

FIRSTFIRST FOLLOWFOLLOW

A { b, a } { f, $ }

B { c } { f, $ }

C { d } { f, e, $ }

S { b, a } { $ }

Parse table complete. Press "parse" to use it.S' S

S A

A bB

A a

B cC

B cCe

C dAf

27

28

Page 15: Bottom-Up Parsing · Bottom-Up Parsing Part II Canonical Collection of LR(0) items ... A bB A a B cC B cCe C dAf Do Selected Do Step Do All Next Parse ba b cc dd ee ff $$ AA BB CC

ParseNextDo AllDo StepDo Selected

aa bb cc dd ee ff $$ AA BB CC SS

0 s3 s4 1 2

1 r1

2 acc

3 r3 r3

4 s6 5

5 r2 r2

6 s8 7

7 s9 r4 r4

8 s3 s4 10

9 r5 r5

10 s11

11 r6 r6 r6

A

S

f

b

a

c

b e

A

d

a

B

C

q0

S'!"S

S!"A

A!"a

A!"bB

q1

S!A"

q2

S'!S"

q3

A!a"

q4

B!"cCe

A!b"B

B!"cC

q5

A!bB"

q6

B!c"Ce

C!"dAf

B!c"C

q7

B!cC"

B!cC"e

q8

C!d"Af

A!"a

A!"bB

q9

B!cCe"

q10

C!dA"f

q11

C!dAf"

FIRSTFIRST FOLLOWFOLLOW

A { b, a } { f, $ }

B { c } { f, $ }

C { d } { f, e, $ }

S { b, a } { $ }

Parse table complete. Press "parse" to use it.S' S

S A

A bB

A a

B cC

B cCe

C dAf

ParseNextDo AllDo StepDo Selected

aa bb cc dd ee ff $$ AA BB CC SS

0 s3 s4 1 2

1 r1

2 acc

3 r3 r3

4 s6 5

5 r2 r2

6 s8 7

7 s9 r4 r4

8 s3 s4 10

9 r5 r5

10 s11

11 r6 r6 r6

A

S

f

b

a

c

b e

A

d

a

B

C

q0

S'!"S

S!"A

A!"a

A!"bB

q1

S!A"

q2

S'!S"

q3

A!a"

q4

B!"cCe

A!b"B

B!"cC

q5

A!bB"

q6

B!c"Ce

C!"dAf

B!c"C

q7

B!cC"

B!cC"e

q8

C!d"Af

A!"a

A!"bB

q9

B!cCe"

q10

C!dA"f

q11

C!dAf"

FIRSTFIRST FOLLOWFOLLOW

A { b, a } { f, $ }

B { c } { f, $ }

C { d } { f, e, $ }

S { b, a } { $ }

Parse table complete. Press "parse" to use it.S' S

S A

A bB

A a

B cC

B cCe

C dAf

ParseNextDo AllDo StepDo Selected

aa bb cc dd ee ff $$ AA BB CC SS

0 s3 s4 1 2

1 r1

2 acc

3 r3 r3

4 s6 5

5 r2 r2

6 s8 7

7 s9 r4 r4

8 s3 s4 10

9 r5 r5

10 s11

11 r6 r6 r6

A

S

f

b

a

c

b e

A

d

a

B

C

q0

S'!"S

S!"A

A!"a

A!"bB

q1

S!A"

q2

S'!S"

q3

A!a"

q4

B!"cCe

A!b"B

B!"cC

q5

A!bB"

q6

B!c"Ce

C!"dAf

B!c"C

q7

B!cC"

B!cC"e

q8

C!d"Af

A!"a

A!"bB

q9

B!cCe"

q10

C!dA"f

q11

C!dAf"

FIRSTFIRST FOLLOWFOLLOW

A { b, a } { f, $ }

B { c } { f, $ }

C { d } { f, e, $ }

S { b, a } { $ }

Parse table complete. Press "parse" to use it.S' S

S A

A bB

A a

B cC

B cCe

C dAf

1.G’ : augmented grammar, with enumeration p of productions;2.construct the LR(0) automaton (canonical item set C={I0,...,In}, I0=CLOSURE[S’→ •S] GOTO function);3.foreach state i:(a) if (A→ α•aβ ∈ Ii && GOTO[Ii,a]=Ij)

add shift j to ACTION[i,a];(b) if (A→ α• ∈ Ii && A ≠ S’)

foreach (a ∈ FOLLOW(A)) add reduce p-1(A → α) to ACTION[i,a];(c) if (S’→ S• ∈ Ii)

add accept to ACTION[i,$] 4. foreach (state i,k && symbol A) if (GOTO[Ij,A]=Ik) add k to GOTO[s,A]5. if (ACTION[i,a] is undefined) add error to ACTION[i,a]6. if (GOTO[i,A] is undefined) add error to GOTO[i,A]

Constructing an SLR-parsing Table

29

30

Page 16: Bottom-Up Parsing · Bottom-Up Parsing Part II Canonical Collection of LR(0) items ... A bB A a B cC B cCe C dAf Do Selected Do Step Do All Next Parse ba b cc dd ee ff $$ AA BB CC

ParseNextDo AllDo StepDo Selected

aa bb cc $$ AA BB SS

0 s2 1

1 acc

2 s4 3

3 s6 s7 5

4 s4 s9 8

5 s10

6 s6 s7 11

7 r5

8 s12

9 r3 r3

10 r1

11 r4

12 r2 r2

Bc

a

b

b

b

S

A

a a

A

c

c

c

Bq0

S!"aABb

S'!"S

q1

S'!S"

q2

A!"ac

A!"aAc

S!a"ABb

q3

S!aA"Bb

B!"c

B!"bB

q4

A!"ac

A!"aAc

A!a"Ac

A!a"c

q5

S!aAB"b

q6

B!b"B

B!"c

B!"bB

q7

B!c"

q8

A!aA"c

q9

A!ac"

q10

S!aABb"

q11

B!bB"

q12

A!aAc"

FIRSTFIRST FOLLOWFOLLOW

A { a } { b, c }

B { b, c } { b }

S { a } { $ }

Parse table complete. Press "parse" to use it.S' S

S aABb

A aAc

A ac

B bB

B c

ParseNextDo AllDo StepDo Selected

aa bb cc $$ AA BB SS

0 s2 1

1 acc

2 s4 3

3 s6 s7 5

4 s4 s9 8

5 s10

6 s6 s7 11

7 r5

8 s12

9 r3 r3

10 r1

11 r4

12 r2 r2

Bc

a

b

b

b

S

A

a a

A

c

c

c

Bq0

S!"aABb

S'!"S

q1

S'!S"

q2

A!"ac

A!"aAc

S!a"ABb

q3

S!aA"Bb

B!"c

B!"bB

q4

A!"ac

A!"aAc

A!a"Ac

A!a"c

q5

S!aAB"b

q6

B!b"B

B!"c

B!"bB

q7

B!c"

q8

A!aA"c

q9

A!ac"

q10

S!aABb"

q11

B!bB"

q12

A!aAc"

FIRSTFIRST FOLLOWFOLLOW

A { a } { b, c }

B { b, c } { b }

S { a } { $ }

Parse table complete. Press "parse" to use it.S' S

S aABb

A aAc

A ac

B bB

B c

ParseNextDo AllDo StepDo Selected

aa bb cc $$ AA BB SS

0 s2 1

1 acc

2 s4 3

3 s6 s7 5

4 s4 s9 8

5 s10

6 s6 s7 11

7 r5

8 s12

9 r3 r3

10 r1

11 r4

12 r2 r2

Bc

a

b

b

b

S

A

a a

A

c

c

c

Bq0

S!"aABb

S'!"S

q1

S'!S"

q2

A!"ac

A!"aAc

S!a"ABb

q3

S!aA"Bb

B!"c

B!"bB

q4

A!"ac

A!"aAc

A!a"Ac

A!a"c

q5

S!aAB"b

q6

B!b"B

B!"c

B!"bB

q7

B!c"

q8

A!aA"c

q9

A!ac"

q10

S!aABb"

q11

B!bB"

q12

A!aAc"

FIRSTFIRST FOLLOWFOLLOW

A { a } { b, c }

B { b, c } { b }

S { a } { $ }

Parse table complete. Press "parse" to use it.S' S

S aABb

A aAc

A ac

B bB

B c

ParseNextDo AllDo StepDo Selected

aa bb cc $$ AA BB SS

0 s2 1

1 acc

2 s4 3

3 s6 s7 5

4 s4 s9 8

5 s10

6 s6 s7 11

7 r5

8 s12

9 r3 r3

10 r1

11 r4

12 r2 r2

Bc

a

b

b

b

S

A

a a

A

c

c

c

Bq0

S!"aABb

S'!"S

q1

S'!S"

q2

A!"ac

A!"aAc

S!a"ABb

q3

S!aA"Bb

B!"c

B!"bB

q4

A!"ac

A!"aAc

A!a"Ac

A!a"c

q5

S!aAB"b

q6

B!b"B

B!"c

B!"bB

q7

B!c"

q8

A!aA"c

q9

A!ac"

q10

S!aABb"

q11

B!bB"

q12

A!aAc"

FIRSTFIRST FOLLOWFOLLOW

A { a } { b, c }

B { b, c } { b }

S { a } { $ }

Parse table complete. Press "parse" to use it.S' S

S aABb

A aAc

A ac

B bB

B c

ParseNextDo AllDo StepDo Selected

aa bb cc $$ AA BB SS

0 s2 1

1 acc

2 s4 3

3 s6 s7 5

4 s4 s9 8

5 s10

6 s6 s7 11

7 r5

8 s12

9 r3 r3

10 r1

11 r4

12 r2 r2

Bc

a

b

b

b

S

A

a a

A

c

c

c

Bq0

S!"aABb

S'!"S

q1

S'!S"

q2

A!"ac

A!"aAc

S!a"ABb

q3

S!aA"Bb

B!"c

B!"bB

q4

A!"ac

A!"aAc

A!a"Ac

A!a"c

q5

S!aAB"b

q6

B!b"B

B!"c

B!"bB

q7

B!c"

q8

A!aA"c

q9

A!ac"

q10

S!aABb"

q11

B!bB"

q12

A!aAc"

FIRSTFIRST FOLLOWFOLLOW

A { a } { b, c }

B { b, c } { b }

S { a } { $ }

Parse table complete. Press "parse" to use it.S' S

S aABb

A aAc

A ac

B bB

B c

ParseNextDo AllDo StepDo Selected

aa bb cc $$ AA BB SS

0 s2 1

1 acc

2 s4 3

3 s6 s7 5

4 s4 s9 8

5 s10

6 s6 s7 11

7 r5

8 s12

9 r3 r3

10 r1

11 r4

12 r2 r2

Bc

a

b

b

b

S

A

a a

A

c

c

c

Bq0

S!"aABb

S'!"S

q1

S'!S"

q2

A!"ac

A!"aAc

S!a"ABb

q3

S!aA"Bb

B!"c

B!"bB

q4

A!"ac

A!"aAc

A!a"Ac

A!a"c

q5

S!aAB"b

q6

B!b"B

B!"c

B!"bB

q7

B!c"

q8

A!aA"c

q9

A!ac"

q10

S!aABb"

q11

B!bB"

q12

A!aAc"

FIRSTFIRST FOLLOWFOLLOW

A { a } { b, c }

B { b, c } { b }

S { a } { $ }

Parse table complete. Press "parse" to use it.S' S

S aABb

A aAc

A ac

B bB

B c

ParseNextDo AllDo StepDo Selected

aa bb cc $$ AA BB SS

0 s2 1

1 acc

2 s4 3

3 s6 s7 5

4 s4 s9 8

5 s10

6 s6 s7 11

7 r5

8 s12

9 r3 r3

10 r1

11 r4

12 r2 r2

Bc

a

b

b

b

S

A

a a

A

c

c

c

Bq0

S!"aABb

S'!"S

q1

S'!S"

q2

A!"ac

A!"aAc

S!a"ABb

q3

S!aA"Bb

B!"c

B!"bB

q4

A!"ac

A!"aAc

A!a"Ac

A!a"c

q5

S!aAB"b

q6

B!b"B

B!"c

B!"bB

q7

B!c"

q8

A!aA"c

q9

A!ac"

q10

S!aABb"

q11

B!bB"

q12

A!aAc"

FIRSTFIRST FOLLOWFOLLOW

A { a } { b, c }

B { b, c } { b }

S { a } { $ }

Parse table complete. Press "parse" to use it.S' S

S aABb

A aAc

A ac

B bB

B c

31

32

Page 17: Bottom-Up Parsing · Bottom-Up Parsing Part II Canonical Collection of LR(0) items ... A bB A a B cC B cCe C dAf Do Selected Do Step Do All Next Parse ba b cc dd ee ff $$ AA BB CC

Item set ?

it is not ambiguous

Item set :

Parsing table?

33

34

Page 18: Bottom-Up Parsing · Bottom-Up Parsing Part II Canonical Collection of LR(0) items ... A bB A a B cC B cCe C dAf Do Selected Do Step Do All Next Parse ba b cc dd ee ff $$ AA BB CC

Item set :

parsing table shift 6 ∈ ACTION[2,=]symbol = ∈ FOLLOW(R) and therefore: reduce R → L ∈ ACTION[2,=]

conflict!

if (A→ α• ∈ Ii && A ≠ S’) foreach (a ∈ FOLLOW(A)) add reduce p-1(A → α) to ACTION[i,a];

=

Viable prefix A prefix of right-sentential forms that can appear on the

top of the stack

Not all prefixes of right-sentential forms can appear on the stack, however, since the parser must not shift past the

handle

Example:

STACK: (, (E, and (E), but not (E)*(E) is a handle (the parser must reduce it to F before shifting *)

35

36

Page 19: Bottom-Up Parsing · Bottom-Up Parsing Part II Canonical Collection of LR(0) items ... A bB A a B cC B cCe C dAf Do Selected Do Step Do All Next Parse ba b cc dd ee ff $$ AA BB CC

Viable prefix:

a prefix of right-sentential forms that can appear on the top of the stack and that does not continue past the right end of the rightmost handle of that sentential form. a prefix γ of αβ, where

S' ⇒ αAw ⇒ αβwrm rm*

S' ⇒ αAw ⇒ αβ1β2wrm rm*

A→β1•β2 is valid for a viable prefix αβ1 if

S' ⇒ αAw ⇒ αβ1β2wrm rm*

A→β1•β2 is valid for a viable prefix αβ1 if

if β2 ≠ε then shift if β2 =ε then reduce A→β1

THEOREM: the set of valid items for a viable prefix γ is exactly the set of items reached from the initial state along the path labeled γ in the LR(0) automaton for the grammar

Two valid items may tell us to do different things for the same viable prefix. Some of these conflicts can be resolved by looking at the next input symbol

possible conflicts

37

38

Page 20: Bottom-Up Parsing · Bottom-Up Parsing Part II Canonical Collection of LR(0) items ... A bB A a B cC B cCe C dAf Do Selected Do Step Do All Next Parse ba b cc dd ee ff $$ AA BB CC

Item set :

Parsing table shift 6 ∈ ACTION[2,=]symbol = ∈ FOLLOW(R)and therefore: reduce R → L ∈ ACTION[2,=]

Conflict!

=

LR(1) PARSING

Discussion (1/3)

Every SLR(1) grammar is unambiguous, but there are manyunambiguous grammars that are not SLR(1).

Grammar:• S ! L = R | R• L ! "R | id• R ! L

States:I0:

! S! " ·S! S " ·L = R! S " ·R! L " · # R! L " ·id! R " ·L

I1: S! ! S·I2:

! S " L· = R! R " L·

I3: S ! R·I4:

! L " # · R! R " ·L! L " · # R! L " ·id

I5: L ! id·

I6:! S " L = ·R! R " ·L! L " · # R! L " ·id

I7: L ! "R·I8: R ! L·I9: S ! L = R·

Compiler notes #3, 20070503, Tsan-sheng Hsu 75

0 2shift 6 ∈ ACTION[2,=]symbol = ∈ FOLLOW(R) and therefore:

reduce R → L ∈ ACTION[2,=] 0 3

$R

$L

there is no right-sentential form of the grammar that begins R = ...

Thus state 2, which is the state corresponding to viable prefix L only, should not really call for reduction of that L to R

39

40

Page 21: Bottom-Up Parsing · Bottom-Up Parsing Part II Canonical Collection of LR(0) items ... A bB A a B cC B cCe C dAf Do Selected Do Step Do All Next Parse ba b cc dd ee ff $$ AA BB CC

However, it is possible that when state i is on the top of the stack, we have the viable prefix βα on the top of the stack, and βA cannot be followed by a.

In this case, we cannot perform the reduction A → α.

In SLR(1) parsing, if A → α• ∈ Si, and a ∈ FOLLOW(A),then we perform the reduction A → α

LR(1) itemsAn LR(1) item is in the form of

[A → α•β,a] 1.the first field A → α•β is an LR(0) item 2.the second field a is a terminal belonging to a subset

(possibly proper) of FOLLOW[A].

Intuition: perform a reduction based on an LR(1) item [A → α•, a] only when the next symbol is a.

Instead of maintaining FOLLOW sets of viable prefixes, we maintain FIRST sets of possible future extensions of the current viable prefix.

where • γ = δα• w = aw' or w=ε and a=$.

[A → α•β, a] is valid for a viable prefix γ if there

exists a derivation S' ⇒ δAw ⇒ δαβwrm rm*

γ

41

42

Page 22: Bottom-Up Parsing · Bottom-Up Parsing Part II Canonical Collection of LR(0) items ... A bB A a B cC B cCe C dAf Do Selected Do Step Do All Next Parse ba b cc dd ee ff $$ AA BB CC

Examples of LR(1) items

Grammar:• S ! BB• B ! aB | b

S!="

rmaaBab ="

rmaaaBab

viable prefix aaa can reach [B ! a · B, a]

S!="

rmBaB ="

rmBaaB

viable prefix Baa can reach [B ! a · B, $]

Compiler notes #3, 20070503, Tsan-sheng Hsu 80

where • γ = δα• w = aw' or w=ε and a=$.

[A → α•β, a] is valid for a viable prefix γ if there exists a derivation S' ⇒ δAw ⇒ δαβwrm rm

*

Finding all LR(1) items

Ideas: redefine the closure function.• Suppose [A! ! · B", a] is valid for a viable prefix # " $!.• In other words,

S!=#

rm$ A a% =#

rm$ !B" a%.

! " is # or a sequence of terminals.

• Then for each production B ! &, assume "a% derives the sequence ofterminals bea%.

S!=#

rm$!B "a%

!=#rm

$!B bea%!=#

rm$!& bea%

Thus [B ! ·&, b] is also valid for # for each b $ FIRST("a).Note a is a terminal. So FIRST("a) = FIRST("a%).

Lookahead propagation .

Compiler notes #3, 20070503, Tsan-sheng Hsu 81

where • γ = δα• w = aw' or w=ε and a=$.

[A → α•β, a] is valid for a viable prefix γ if there exists a derivation S' ⇒ δAw ⇒ δαβwrm rm

*

γ γγ

γ

γ

43

44

Page 23: Bottom-Up Parsing · Bottom-Up Parsing Part II Canonical Collection of LR(0) items ... A bB A a B cC B cCe C dAf Do Selected Do Step Do All Next Parse ba b cc dd ee ff $$ AA BB CC

set_of_items CLOSURE(I: set_of_items){

J=I;

repeat{

foreach ( [A → α•Bβ,a] ∈ J)

foreach ( B → γ ∈ G)

foreach (b ∈ FIRST[βa]) J = J ∪ {[B → •γ,b]};

}until (no more new items are added to J)

return J;

}

set_of_items GOTO(I set_of_items, X:symbol){ J=∅;

foreach([A → α•Xβ,a] ∈ I)

J = J ∪ {[A → αX•β,a]};

return CLOSURE(J);

}

45

46

Page 24: Bottom-Up Parsing · Bottom-Up Parsing Part II Canonical Collection of LR(0) items ... A bB A a B cC B cCe C dAf Do Selected Do Step Do All Next Parse ba b cc dd ee ff $$ AA BB CC

CC_LR(1)_I items(G’:augmented_grammar){ C = {CLOSURE({[S’ → •S,$]})} ; repeat{ foreach(I ∈ C) foreach(grammar symbol X) if(GOTO(I,X)≠∅ && GOTO(I,X) ∉ C) C = C ∪ {GOTO(I,X)}; }until(no new sets of items are added to C) return C;}

1.G’ : augmented grammar, with enumeration p of productions;2.construct the LR(1) automaton (canonical item set C={I0,...,In}, I0=CLOSURE[[S’→ •S,$]] GOTO function);3.foreach state i:(a) if ([A→ α•aβ,b] ∈ Ii && GOTO[Ii,a]=Ij)

add shift j to ACTION[i,a]; // a is terminal(b) if ([A→ α•,a] ∈ Ii && A ≠ S’)

add reduce p-1(A → α) to ACTION[i,a];(c) if ([S’→ S•,$] ∈ Ii)

add accept to ACTION[i,$] 4. foreach (state i,k && symbol A) if (GOTO[Ij,A]=Ik) add k to GOTO[s,A]5. if (ACTION[i,a] is undefined) add error to ACTION[i,a]6. if (GOTO[i,A] is undefined) add error to GOTO[i,A]

47

48

Page 25: Bottom-Up Parsing · Bottom-Up Parsing Part II Canonical Collection of LR(0) items ... A bB A a B cC B cCe C dAf Do Selected Do Step Do All Next Parse ba b cc dd ee ff $$ AA BB CC

Example for constructing LR(1) closures

Grammar:• S! ! S• S ! CC• C ! cC | d

closure1({[S!! ·S, $]}) =• {[S! ! ·S, $],• [S ! ·CC, $],• [C ! ·cC, c/d],• [C ! ·d, c/d]}

Note:• FIRST(!$) = {$}• FIRST(C$) = {c, d}• [C ! ·cC, c/d] means

! [C " ·cC, c] and! [C " ·cC, d].

Compiler notes #3, 20070503, Tsan-sheng Hsu 83

Example of an LR(1) parsing tableaction1 GOTO1

state c d $ S C0 s3 s4 1 21 accept2 s6 s7 53 s3 s4 84 r3 r35 r16 s6 s7 97 r38 r2 r29 r2

Canonical LR(1) parser:• Most powerful!• Has too many states and thus occupies too much space.

Compiler notes #3, 20070503, Tsan-sheng Hsu 87

set_of_items CLOSURE(I: set_of_items){

J=I;

repeat{

foreach ( [A → α•Bβ,a] ∈ J)

foreach ( B → γ ∈ G)

foreach (b ∈ FIRST[βa]) J = J ∪ {[B → •γ,b]};

}until (no more new items are added to J)

return J;

}

Example for constructing LR(1) closures

Grammar:• S! ! S• S ! CC• C ! cC | d

closure1({[S!! ·S, $]}) =• {[S! ! ·S, $],• [S ! ·CC, $],• [C ! ·cC, c/d],• [C ! ·d, c/d]}

Note:• FIRST(!$) = {$}• FIRST(C$) = {c, d}• [C ! ·cC, c/d] means

! [C " ·cC, c] and! [C " ·cC, d].

Compiler notes #3, 20070503, Tsan-sheng Hsu 83

set_of_items CLOSURE(I: set_of_items){ J=I; repeat{ foreach ( [A → α•Bβ,a] ∈ J) foreach ( B → γ ∈ G) foreach (b ∈ FIRST[βa]) J = J ∪ {[B → •γ,b]}; }until (no more new items are added to J) return J;}

49

50

Page 26: Bottom-Up Parsing · Bottom-Up Parsing Part II Canonical Collection of LR(0) items ... A bB A a B cC B cCe C dAf Do Selected Do Step Do All Next Parse ba b cc dd ee ff $$ AA BB CC

Example for constructing LR(1) closures

Grammar:• S! ! S• S ! CC• C ! cC | d

closure1({[S!! ·S, $]}) =• {[S! ! ·S, $],• [S ! ·CC, $],• [C ! ·cC, c/d],• [C ! ·d, c/d]}

Note:• FIRST(!$) = {$}• FIRST(C$) = {c, d}• [C ! ·cC, c/d] means

! [C " ·cC, c] and! [C " ·cC, d].

Compiler notes #3, 20070503, Tsan-sheng Hsu 83

set_of_items CLOSURE(I: set_of_items){ J=I; repeat{ foreach ( [A → α•Bβ,a] ∈ J) foreach ( B → γ ∈ G) foreach (b ∈ FIRST[βa]) J = J ∪ {[B → •γ,b]}; }until (no more new items are added to J) return J;}

set_of_items CLOSURE(I: set_of_items){ J=I; repeat{ foreach ( [A → α•Bβ,a] ∈ J) foreach ( B → γ ∈ G) foreach (b ∈ FIRST[βa]) J = J ∪ {[B → •γ,b]}; }until (no more new items are added to J) return J;}

51

52

Page 27: Bottom-Up Parsing · Bottom-Up Parsing Part II Canonical Collection of LR(0) items ... A bB A a B cC B cCe C dAf Do Selected Do Step Do All Next Parse ba b cc dd ee ff $$ AA BB CC

Example for constructing LR(1) closures

Grammar:• S! ! S• S ! CC• C ! cC | d

closure1({[S!! ·S, $]}) =• {[S! ! ·S, $],• [S ! ·CC, $],• [C ! ·cC, c/d],• [C ! ·d, c/d]}

Note:• FIRST(!$) = {$}• FIRST(C$) = {c, d}• [C ! ·cC, c/d] means

! [C " ·cC, c] and! [C " ·cC, d].

Compiler notes #3, 20070503, Tsan-sheng Hsu 83

Example of an LR(1) parsing tableaction1 GOTO1

state c d $ S C0 s3 s4 1 21 accept2 s6 s7 53 s3 s4 84 r3 r35 r16 s6 s7 97 r38 r2 r29 r2

Canonical LR(1) parser:• Most powerful!• Has too many states and thus occupies too much space.

Compiler notes #3, 20070503, Tsan-sheng Hsu 87

LALR(1) parser — Lookahead LR

The method that is often used in practice.Most common syntactic constructs of programming languagescan be expressed conveniently by an LALR(1) grammar[DeRemer 1969].SLR(1) and LALR(1) always have the same number of states.Number of states is about 1/10 of that of LR(1).Simple observation:

• an LR(1) item is of the form [A! ! · ", c]

We call A! ! · " the first component .

Definition: in an LR(1) state, set of first components is calledits core .

Compiler notes #3, 20070503, Tsan-sheng Hsu 88

53

54

Page 28: Bottom-Up Parsing · Bottom-Up Parsing Part II Canonical Collection of LR(0) items ... A bB A a B cC B cCe C dAf Do Selected Do Step Do All Next Parse ba b cc dd ee ff $$ AA BB CC

1.G’ : augmented grammar, with enumeration p of productions;2.construct the LR(1) automaton (canonical item set C={I0,...,In}, I0=CLOSURE[[S’→ •S,$]] GOTO function);3.foreach state i:(a) if ([A→ α•aβ,b] ∈ Ii && GOTO[Ii,a]=Ij)

add shift j to ACTION[i,a]; // a is terminal(b) if ([A→ α•,a] ∈ Ii && A ≠ S’)

add reduce p-1(A → α) to ACTION[i,a];(c) if ([S’→ S•,$] ∈ Ii)

add accept to ACTION[i,$] 4. foreach (state i,k && symbol A) if (GOTO[Ij,A]=Ik) add k to GOTO[s,A]5. if (ACTION[i,a] is undefined) add error to ACTION[i,a]6. if (GOTO[i,A] is undefined) add error to GOTO[i,A]

Intuition for LALR(1) grammars

In an LR(1) parser, it is a common thing that several statesonly di!er in lookahead symbols, but have the same core.To reduce the number of states, we might want to merge stateswith the same core.

• If I4 and I7 are merged, then the new state is called I4,7.• After merging the states, revise the GOTO1 table accordingly.

Merging of states can never produce a shift-reduce conflict thatwas not present in one of the original states.

• I1 = {[A! !·, a], . . .}! For I1, one of the actions is to perform a reduce when the lookahead

symbol is “a”.

• I2 = {[B ! " · a#, b], . . .}! For I2, one of the actions is to perform a shift on input “a”.

• Merging I1 and I2, the new state I1,2 has shift-reduce conflicts.• However, we merge I1 and I2 because they have the same core.

! That is, [A ! "·, c] " I2 and [B ! # · a$, d] " I1.! The shift-reduce conflict already occurs in I1 and I2.

Merging of states can produce a new reduce-reduce conflict.

Compiler notes #3, 20070503, Tsan-sheng Hsu 89

55

56

Page 29: Bottom-Up Parsing · Bottom-Up Parsing Part II Canonical Collection of LR(0) items ... A bB A a B cC B cCe C dAf Do Selected Do Step Do All Next Parse ba b cc dd ee ff $$ AA BB CC

LALR(1) transition diagram

I0

S’ !> . S, $S !> . CC, $C !> . cC, c/dC !>.d, c/d

S’ !> S., $

I1

S !> C.C, $C !> .cC, $C !> .d, $

I2

S !> CC., $

I5

C !> c.C, $C !> .cC, $C !> .d, $

I6

C !> cC., $

I9

C !> d., $

I7

I3

C !> cC., c/d

I8

C !> d., c/d

I4

S

C

c

d

d

C

C

c

d

d

c

CC !> c.C, c/dC !> .cC, c/dC !> .d, c/d

c

Compiler notes #3, 20070503, Tsan-sheng Hsu 90

Possible new conflicts from LALR(1)

May produce a new reduce-reduce conflict.For example (textbook page 267, Example 4.58), grammar:

• S! ! S• S ! aAd | bBf | aBe | bAe• A! c• B ! c

The language recognized by this grammar is {acd, ace, bcd, bce}.You may check that this grammar is LR(1) by constructing thesets of items.You will find the set of items {[A! c·, d], [B ! c·, e]} is valid forthe viable prefix ac, and {[A! c·, e], [B ! c·, d]} is valid for theviable prefix bc.Neither of these sets generates a conflict, and their cores arethe same. However, their union, which is

• {[A! c·, d/e],• [B ! c·, d/e]},

generates a reduce-reduce conflict, since reductions by bothA! c and B ! c are called for on inputs d and e.

Compiler notes #3, 20070503, Tsan-sheng Hsu 91

57

58

Page 30: Bottom-Up Parsing · Bottom-Up Parsing Part II Canonical Collection of LR(0) items ... A bB A a B cC B cCe C dAf Do Selected Do Step Do All Next Parse ba b cc dd ee ff $$ AA BB CC

How to construct LALR(1) parsing table

Naive approach:• Construct LR(1) parsing table, which takes lots of intermediate spaces.• Merging states.

Space and/or time e!cient methods to construct an LALR(1)parsing table are known.

• Constructing and merging on the fly.• · · ·

Compiler notes #3, 20070503, Tsan-sheng Hsu 92

Summary

LR(1)

LL(1)

LALR(1)

SLR(1)LR(1)

LALR(1)

SLR(1)

LR(0)

LR(1) and LALR(1) can almost express all important program-ming languages issues, but LALR(1) is easier to write and usesmuch less space.LL(1) is easier to understand and uses much less space, butcannot express some important common-language features.

• May try to use it first for your own applications.• If it does not succeed, then use more powerful ones.

Compiler notes #3, 20070503, Tsan-sheng Hsu 93

59

60

Page 31: Bottom-Up Parsing · Bottom-Up Parsing Part II Canonical Collection of LR(0) items ... A bB A a B cC B cCe C dAf Do Selected Do Step Do All Next Parse ba b cc dd ee ff $$ AA BB CC

Ambiguous grammars are not too bad...Using ambiguous grammars

ambiguous grammars

unambiguous grammars

LR(1)

Ambiguous grammars often provide a shorter, more naturalspecification than their equivalent unambiguous grammars.Sometimes need ambiguous grammars to specify importantlanguage constructs.

• Example: declare a variable before its usage.var xyz : integerbegin

...xyz := 3;...

Use symbol tables to create “side e!ects.”

Compiler notes #4, 20070517, Tsan-sheng Hsu 14

Ambiguous grammars often provide a shorter, more natural specification than their equivalent unambiguous grammars.

Ambiguity from precedence and associativity

Precedence and associativity are important language constructs.Example:

• G1:! E ! E + E | E " E | (E) | id! Ambiguous, but easy to understand and maintain!

• G2:! E ! E + T | T! T ! T " F | F! F ! (E) | id! Unambiguous, but di!cult to understand and maintain!

Input: 1+2*3

E

E +

*

3

T

T T F

FF

1 2

Parse tree: G 2

E

E + E

1 E * E

2 3

Parse tree#1: G 1 1Parse tree#2: G

E

E E

1

E E

2

3

*

+

Compiler notes #4, 20070517, Tsan-sheng Hsu 15

61

62

Page 32: Bottom-Up Parsing · Bottom-Up Parsing Part II Canonical Collection of LR(0) items ... A bB A a B cC B cCe C dAf Do Selected Do Step Do All Next Parse ba b cc dd ee ff $$ AA BB CC

LR(0) states

input: id+id∗idthe parser enters state 7 after processing id+id

Ambiguity from dangling-else

Grammar:• Statement! Other Statement

| if Condition then Statement| if Condition then Statement else Statement

When seeingif C then S else S

• there is a shift/reduce conflict,• we always favor a shift.• Intuition: favor a longer match.

Need a mechanism to let user specify the default conflict-handling rule when there is a shift/reduce conflict.

Compiler notes #4, 20070517, Tsan-sheng Hsu 17

63

64