LL(1) Parsing

LL(1) Parsing

Prepared byManuel E. Bermúdez, Ph.D.

Associate ProfessorUniversity of Florida

Programming Language PrinciplesLecture 4

Example• Build the LL(1) parse table for the

following grammar.

S → begin SL end {begin} → id := E; {id}SL → SL S {begin,id} → S {begin,id}E → E+T {(, id}→ T {(, id}T → P*T {(, id}

→ P {(, id}P → (E) {(}→ id {id}

*

*

*

* - not LL(1)

Example (cont’d)

• Lemma: Left recursion always produces a non-LL(1) grammar (e.g., SL, E above)

• Proof: Consider A → A First () or

Follow (A) → First () Follow (A)

Problems with our Grammar

1. SL is left recursive.

2. E is left recursive.

3. T → P * T both begin with the same → P sequence of symbols (P).

Solution to Problem 3

• Change: T → P * T { (, id } → P { (, id }

• to: T → P X { (, id }X → * T { * } → { +, ; , ) }

Follow(X) Follow(T) due to T → P X Follow(E) due to E → E+T , E → T= { +, ;, ) } due to E → E+T, S → id := E

; and P → (E)

Disjoint!

Solution to Problem 3 (cont’d)

• In general, change A → 1

→ 2 . . . → n

to A → X X → 1 . . . → n

Hopefully all the ’s begin with different symbols

Solution to Problems 1 and 2

• We want (…((( T + T) + T) + T)…)• Instead, (T) (+T) (+T) … (+T)

Change: E → E + T { (, id } → T { (, id }

To: E → T Y { (, id }Y → + T Y { + } → { ; , ) }

• Follow(Y) Follow(E)• = { ; , ) }

No longer contains ‘+’, because we eliminated the production E → E + T

Solution to Problems 1 and 2 (cont’d)• In general,

Change: A → A1 A → 1 . . . . . . → An → m

to: A → 1 X X → 1 X . . . . . . → m X → n X →

Solution to Problems 1 and 2 (cont’d)• In our example,

Change: SL → SL S { begin, id } → S { begin, id }

To: SL → S Z { begin, id } Z → S Z { begin, id }

→ { end }

Modified GrammarS → begin SL end {begin} → id := E ; {id}SL → S Z {begin,id} Z → S Z {begin,id}

→ {end}E → T Y (,id}Y → + T Y {+}

→ {;,)}T → P X {(,id}X → * T {*}

→ {;,+,)}P → (E) {(}

→ id {id}

Disjoint.Grammar is LL(1)

Recursive Descent Parsing

• Top-down parsing strategy, suitable for LL(1) grammars.

• One procedure per nonterminal.• Contents of stack embedded in recursive

call sequence.• Each procedure “commits” to one

production, based on the next input symbol, and the select sets.

• Good technique for hand-written parsers.

For our Modified, LL(1) Grammarproc S; {S → begin SL end

→ id := E; }case Next_Token of

T_begin : Read(T_begin);SL;Read (T_end);

T_id : Read(T_id);Read (T_:=);E;Read (T_;);

otherwise Error;end

end;

“Read (T_X)” verifies that the upcoming token is X, and consumes it.

“Next_Token” is the upcoming token.

For our Modified, LL(1) Grammar (cont’d)proc SL; {SL → SZ}

S;Z;

end;

proc E; {E → TY}T;Y;

end;

Technically, should have insisted that Next Token be either T_begin or T_id, but S will do that anyway. Checking early would aid error recovery.

// Ditto for T_( and T_id.

For our Modified, LL(1) Grammar (cont’d)

proc Z;{Z → SZ→ }

case Next Token ofT_begin, T_id: S;Z;

T_end: ;otherwise Error;

endend;

For our Modified, LL(1) Grammar (cont’d)proc Y; {Y → +TY

→ }if Next Token = T_+ then

Read (T_+)T;Y;

end;

proc T; {T → PX}P;X

end;

Could have used a case statement

Could have checked for T_( and T_id.

For our Modified, LL(1) Grammar (cont’d)

proc X;{X → *T→ }

if Next Token = T_* thenRead (T_*);T;

end;

For our Modified, LL(1) Grammar (cont’d)proc P; {P →(E)

→ id }case Next Token of

T_(: Read (T_();E;Read (T_));

T_id: Read (T_id);otherwise Error;

endend;

LL(1) Parsing

Prepared byManuel E. Bermúdez, Ph.D.

Associate ProfessorUniversity of Florida

Programming Language PrinciplesLecture 4

Documents

LL(1) Parsing