76
Bottom-Up Syntax Analysis Mooly Sagiv http://www.cs.tau.ac.il/~msagiv/courses/ wcc13.html Textbook:Modern Compiler Design Chapter 2.2.5 (modified) 1

Bottom-Up Syntax Analysis

  • Upload
    maree

  • View
    29

  • Download
    2

Embed Size (px)

DESCRIPTION

Bottom-Up Syntax Analysis. Mooly Sagiv http://www.cs.tau.ac.il/~msagiv/courses/wcc13.html Textbook:Modern Compiler Design Chapter 2.2.5 (modified). Pushdown automata Deterministic Report an error as soon as the input is not a prefix of a valid program - PowerPoint PPT Presentation

Citation preview

Page 1: Bottom-Up  Syntax Analysis

Bottom-Up Syntax Analysis

Mooly Sagiv

http://www.cs.tau.ac.il/~msagiv/courses/wcc13.htmlTextbook:Modern Compiler Design

Chapter 2.2.5 (modified)

1

Page 2: Bottom-Up  Syntax Analysis

Efficient Parsers • Pushdown automata

• Deterministic

• Report an error as soon as the input is not a prefix of a valid program

• Not usable for all context free grammars

cup

context free grammar

tokens parser

“Ambiguity errors”

parse tree2

Page 3: Bottom-Up  Syntax Analysis

Kinds of Parsers

• Top-Down (Predictive Parsing) LL– Construct parse tree in a top-down matter– Find the leftmost derivation– For every non-terminal and token predict the next

production

• Bottom-Up LR– Construct parse tree in a bottom-up manner– Find the rightmost derivation in a reverse order– For every potential right hand side and token decide

when a production is found

3

Page 4: Bottom-Up  Syntax Analysis

Bottom-Up Syntax Analysis

• Input– A context free grammar– A stream of tokens

• Output– A syntax tree or error

• Method– Construct parse tree in a bottom-up manner– Find the rightmost derivation in (reversed order)– For every potential right hand side and token decide

when a production is found– Report an error as soon as the input is not a prefix of

valid program

4

Page 5: Bottom-Up  Syntax Analysis

Plan

• Pushdown automata

• Bottom-up parsing (informal)

• Non-deterministic bottom-up parsing

• Deterministic bottom-up parsing

• Interesting non LR grammars

5

Page 6: Bottom-Up  Syntax Analysis

Pushdown Automaton

control

parser-table

input

stack

$

$u t w

V

6

Page 7: Bottom-Up  Syntax Analysis

Informal Example(1)

S E$ E T | E + T T i | ( E )

input

+ i $i$

stack tree

i

i + i $

input

$

stack tree

shift

7

Page 8: Bottom-Up  Syntax Analysis

Informal Example(2)

S E$ E T | E + T T i | ( E )

input

+ i $i$

stack tree

i

reduce T i

input

+ i $T$

stack tree

i

T

8

Page 9: Bottom-Up  Syntax Analysis

Informal Example(3)

S E$ E T | E + T T i | ( E )

input

+ i $T$

stack tree

reduce E Ti

T

input

+ i $E$

stack tree

i

TE

9

Page 10: Bottom-Up  Syntax Analysis

Informal Example(4)

S E$ E T | E + T T i | ( E )

shift

input+ i $E

$

stack tree

i

TE

inputi $+

E$

stack tree

i

TE

+10

Page 11: Bottom-Up  Syntax Analysis

Informal Example(5)

shift

input

i $+E$

stack tree

i

TE

+

input

$i+E$

stack tree

i

TE

+ i

S E$ E T | E + T T i | ( E )

11

Page 12: Bottom-Up  Syntax Analysis

Informal Example(6)

reduce T i

input

$i+E$

stack tree

i

TE

+ i

input

$T+E$

stack tree

i

TE

+ i

T

S E$ E T | E + T T i | ( E )

12

Page 13: Bottom-Up  Syntax Analysis

Informal Example(7)

reduce E E + T

input

$T+E$

stack tree

i

TE

+ i

T

input

$E$

stack tree

i

TE

+ T

E

i

S E$ E T | E + T T i | ( E )

13

Page 14: Bottom-Up  Syntax Analysis

Informal Example(8)

input

$E$

stack tree

i

TE

+ T

E

shift

iinput

$ E

$

stack tree

i

TE

+ T

E

i

S E$ E T | E + T T i | ( E )

14

Page 15: Bottom-Up  Syntax Analysis

Informal Example(9)

input

$$ E

$

stack tree

i

TE

+ T

E

reduce S E $

i

S E$ E T | E + T T i | ( E )

15

Page 16: Bottom-Up  Syntax Analysis

Informal Example

reduce E E + T

reduce T i

reduce E T

reduce T i

reduce S E $

16

Page 17: Bottom-Up  Syntax Analysis

The Problem

• Deciding between shift and reduce

17

Page 18: Bottom-Up  Syntax Analysis

Informal Example(7)S E$ E T | E + T T i | ( E )

reduce E E + T

input

$T+E$

stack tree

i

TE

+ i

T

input

$E$

stack tree

i

TE

+ T

E

i18

Page 19: Bottom-Up  Syntax Analysis

Informal Example(7’)S E$ E T | E + T T i | ( E )

reduce E T

input

$T+E$

stack tree

i

TE

+ i

T

input

$E+E$

stack tree

i

TE

+ E

E

19

Page 20: Bottom-Up  Syntax Analysis

Bottom-UP LR(0) Itemsalready matched

regular expression

still need to be matched

input

input T·

already matched expecting to match 20

Page 21: Bottom-Up  Syntax Analysis

LR(0) items()i+$TE

1: S E$24, 6

2: S E $s3

3: S E $ r

4: E T510, 12

5: E T r

6: E E + T74, 6

7: E E + Ts8

8: E E + T910, 12

9: E E + T r

10: T is11

11: T i r

12: T (E)s13

13: T ( E)144, 6

14: T (E )s15

15: T (E) r

S E$

E T

E E + T

T i

T ( E )

21

Page 22: Bottom-Up  Syntax Analysis

Formal Example(1)

S E$ E T | E + T T i | ( E )

i + i $

input

1: S E$

stack

-move 6

i + i $

input

6: E E+T

1: S E$

stack

22

Page 23: Bottom-Up  Syntax Analysis

Formal Example(2)

S E$ E T | E + T T i | ( E )

-move 4

i + i $

input

4: E T

6: E E+T

1: S E$

stack

i + i $

input

6: E E+T

1: S E$

stack

23

Page 24: Bottom-Up  Syntax Analysis

Formal Example(3)

S E$ E T | E + T T i | ( E )

-move 10

i + i $

input

10: T i

4: E T

6: E E+T

1: S E$

stack

i + i $

input

4: E T

6: E E+T

1: S E$

stack

24

Page 25: Bottom-Up  Syntax Analysis

Formal Example(4)S E$ E T | E + T T i | ( E )

shift 11input

+ i $11: T i 10: T i

4: E T

6: E E+T

1: S E$

stack

i + i $

input

10: T i

4: E T

6: E E+T

1: S E$

stack

25

Page 26: Bottom-Up  Syntax Analysis

Formal Example(5)S E$ E T | E + T T i | ( E )

reduce T i

+ i $

input

5: E T 4: E T6: E E+T1: S E$

stack

+ i $

inputstack

11: T i 10: T i

4: E T

6: E E+T

1: S E$

26

Page 27: Bottom-Up  Syntax Analysis

Formal Example(6)S E$ E T | E + T T i | ( E )

reduce E T

+ i $

input

7: E E +T

6: E E+T

1: S E$

stack

+ i $

input

5: E T 4: E T

6: E E+T

1: S E$

stack

27

Page 28: Bottom-Up  Syntax Analysis

Formal Example(7)S E$ E T | E + T T i | ( E )

shift 8

i $

input

8: E E + T7: E E +T6: E E+T

1: S E$

stack

+ i $

input

7: E E +T6: E E+T

1: S E$

stack

28

Page 29: Bottom-Up  Syntax Analysis

Formal Example(8)S E$ E T | E + T T i | ( E )

-move 10

i $

input

10: T i

8: E E + T7: E E +T6: E E+T

1: S E$

stack

i $

input

8: E E + T7: E E +T6: E E+T

1: S E$

stack

29

Page 30: Bottom-Up  Syntax Analysis

Formal Example(9)S E$ E T | E + T T i | ( E )

shift 11

$

input

11: T i 10: T i8: E E + T7: E E +T6: E E+T

1: S E$

stack

i $

input

10: T i8: E E + T7: E E +T6: E E+T

1: S E$

stack

30

Page 31: Bottom-Up  Syntax Analysis

Formal Example(10)S E$ E T | E + T T i | ( E )

reduce T i

$

inputstack11: T i 10: T i8: E E + T7: E E +T6: E E+T

1: S E$

$

inputstack9: E E + T 8: E E + T7: E E +T6: E E+T

1: S E$

31

Page 32: Bottom-Up  Syntax Analysis

Formal Example(11)S E$ E T | E + T T i | ( E )

reduce E E + T

$

inputstack9: E E + T 8: E E + T7: E E +T6: E E+T

1: S E$

$

inputstack

2: S E $1: S E$

32

Page 33: Bottom-Up  Syntax Analysis

Formal Example(12)S E$ E T | E + T T i | ( E )

shift 3

$

inputstack

2: S E $1: S E$

inputstack

3: S E $ 2: S E $1: S E$

33

Page 34: Bottom-Up  Syntax Analysis

Formal Example(13)S E$ E T | E + T T i | ( E )

reduce S E$

inputstack

3: S E $ 2: S E $1: S E$

34

Page 35: Bottom-Up  Syntax Analysis

But how can this be done efficiently?

Deterministic Pushdown Automaton

35

Page 36: Bottom-Up  Syntax Analysis

Handles

• Identify the leftmost node (nonterminal) that has not been constructed but all whose children have been constructed

t1 t2 t4 t5 t6 t7 t8

input

1 2

3

36

Page 37: Bottom-Up  Syntax Analysis

Identifying Handles

• Create a deteteministic finite state automaton over grammar symbols– Sets of LR(0) items– Accepting states identify handles

• Use automaton to build parser tables– reduce For items A on every token– shift For items A t on token t

• When conflicts occur the grammar is not LR(0)• When no conflicts occur use a DPDA which

pushes states on the stack

37

Page 38: Bottom-Up  Syntax Analysis

A Trivial Example

• S A B $

• A a

• B a

38

Page 39: Bottom-Up  Syntax Analysis

()i+$TE

1: S E$24, 6

2: S E $s3

3: S E $ r

4: E T510, 12

5: E T r

6: E E + T74, 6

7: E E + Ts8

8: E E + T910, 12

9: E E + T r

10: T is11

11: T i r

12: T (E)s13

13: T ( E)144, 6

14: T (E )s15

15: T (E) r

S E$

E T

E E + T

T i

T ( E )

39

Page 40: Bottom-Up  Syntax Analysis

1: S E$

4: E T6: E E + T

10: T i12: T (E)

5: E T T

11: T i i

2: S E $7: E E + TE

13: T ( E)4: E T6: E E + T10: T i12: T (E)

)

)

15: T (E) (

14: T (E )7: E E + T

E

8: E E + T10: T i12: T (E)

+

+

9: E E + T

T

3: S E $ $

06

5

7

8

9

4

3

2

1

i

i

)

40

Page 41: Bottom-Up  Syntax Analysis

Example Control Table

i+)($ET

0s5errs7errerr16

1errs3errerrs2

2acc

3s5errs7errerr4

4reduce EE+T

5reduce T i

6reduce E T

7s5errs7errerr86

8errs3errs9err

9reduce T(E)41

Page 42: Bottom-Up  Syntax Analysis

i + i $

input

0($)shift 5

stack

i+)($ET

0s5errs7errerr16

1errs3errerrs2

2acc

3s5errs7errerr4

4reduce EE+T

5reduce T i

6reduce E T

7s5errs7errerr86

8errs3errs9err

9reduce T(E)

42

Page 43: Bottom-Up  Syntax Analysis

+ i $

input5 (i)

0 ($) reduce T i

stack

i+)($ET

0s5errs7errerr16

1errs3errerrs2

2acc

3s5errs7errerr4

4reduce EE+T

5reduce T i

6reduce E T

7s5errs7errerr86

8errs3errs9err

9reduce T(E)

43

Page 44: Bottom-Up  Syntax Analysis

+ i $

input6 (T)

0 ($) reduce E T

stack

i+)($ET

0s5errs7errerr16

1errs3errerrs2

2acc

3s5errs7errerr4

4reduce EE+T

5reduce T i

6reduce E T

7s5errs7errerr86

8errs3errs9err

9reduce T(E)

44

Page 45: Bottom-Up  Syntax Analysis

+ i $

input

1(E)

0 ($)

shift 3

stack

i+)($ET

0s5errs7errerr16

1errs3errerrs2

2acc

3s5errs7errerr4

4reduce EE+T

5reduce T i

6reduce E T

7s5errs7errerr86

8errs3errs9err

9reduce T(E)

45

Page 46: Bottom-Up  Syntax Analysis

i $

input3 (+)

1(E)

0 ($)

shift 5

stack

i+)($ET

0s5errs7errerr16

1errs3errerrs2

2acc

3s5errs7errerr4

4reduce EE+T

5reduce T i

6reduce E T

7s5errs7errerr86

8errs3errs9err

9reduce T(E)

46

Page 47: Bottom-Up  Syntax Analysis

$

input5 (i)

3 (+)

1(E)

0($)

reduce T i

i+)($ET

0s5errs7errerr16

1errs3errerrs2

2acc

3s5errs7errerr4

4reduce EE+T

5reduce T i

6reduce E T

7s5errs7errerr86

8errs3errs9err

9reduce T(E)stack

47

Page 48: Bottom-Up  Syntax Analysis

$

input4 (T)

3 (+)

1(E)

0($)

reduce E E + Tstack

i+)($ET

0s5errs7errerr16

1errs3errerrs2

2acc

3s5errs7errerr4

4reduce EE+T

5reduce T i

6reduce E T

7s5errs7errerr86

8errs3errs9err

9reduce T(E)

48

Page 49: Bottom-Up  Syntax Analysis

$

input

1 (E)

0 ($)shift 2

stack

i+)($ET

0s5errs7errerr16

1errs3errerrs2

2acc

3s5errs7errerr4

4reduce EE+T

5reduce T i

6reduce E T

7s5errs7errerr86

8errs3errs9err

9reduce T(E)

49

Page 50: Bottom-Up  Syntax Analysis

input2 ($)

1 (E)

0 ($)

accept

stack

i+)($ET

0s5errs7errerr16

1errs3errerrs2

2acc

3s5errs7errerr4

4reduce EE+T

5reduce T i

6reduce E T

7s5errs7errerr86

8errs3errs9err

9reduce T(E)

50

Page 51: Bottom-Up  Syntax Analysis

((i) $

input

0($)shift 7

stack

i+)($ET

0s5errs7errerr16

1errs3errerrs2

2acc

3s5errs7errerr4

4reduce EE+T

5reduce T i

6reduce E T

7s5errs7errerr86

8errs3errs9err

9reduce T(E)

51

Page 52: Bottom-Up  Syntax Analysis

(i) $

input

7(()

0($)

shift 7

stack

i+)($ET

0s5errs7errerr16

1errs3errerrs2

2acc

3s5errs7errerr4

4reduce EE+T

5reduce T i

6reduce E T

7s5errs7errerr86

8errs3errs9err

9reduce T(E)

52

Page 53: Bottom-Up  Syntax Analysis

i) $

input7 (()

7(()

0($)

shift 5

stack

i+)($ET

0s5errs7errerr16

1errs3errerrs2

2acc

3s5errs7errerr4

4reduce EE+T

5reduce T i

6reduce E T

7s5errs7errerr86

8errs3errs9err

9reduce T(E)

53

Page 54: Bottom-Up  Syntax Analysis

) $

input5 (i)

7 (()

7(()

0($)

reduce T i

stack

i+)($ET

0s5errs7errerr16

1errs3errerrs2

2acc

3s5errs7errerr4

4reduce EE+T

5reduce T i

6reduce E T

7s5errs7errerr86

8errs3errs9err

9reduce T(E)

54

Page 55: Bottom-Up  Syntax Analysis

) $

input6 (T)

7 (()

7(()

0($)

reduce E T

stack

i+)($ET

0s5errs7errerr16

1errs3errerrs2

2acc

3s5errs7errerr4

4reduce EE+T

5reduce T i

6reduce E T

7s5errs7errerr86

8errs3errs9err

9reduce T(E)

55

Page 56: Bottom-Up  Syntax Analysis

) $

input8 (E)

7 (()

7(()

0($)

shift 9

stack

i+)($ET

0s5errs7errerr16

1errs3errerrs2

2acc

3s5errs7errerr4

4reduce EE+T

5reduce T i

6reduce E T

7s5errs7errerr86

8errs3errs9err

9reduce T(E)

56

Page 57: Bottom-Up  Syntax Analysis

$

input

9 ())

8 (E)

7 (()

7(()

0($)

reduce T ( E )

stack

i+)($ET

0s5errs7errerr16

1errs3errerrs2

2acc

3s5errs7errerr4

4reduce EE+T

5reduce T i

6reduce E T

7s5errs7errerr86

8errs3errs9err

9reduce T(E)

57

Page 58: Bottom-Up  Syntax Analysis

$

input6 (T)

7(()

0($)

reduce E T

stack

i+)($ET

0s5errs7errerr16

1errs3errerrs2

2acc

3s5errs7errerr4

4reduce EE+T

5reduce T i

6reduce E T

7s5errs7errerr86

8errs3errs9err

9reduce T(E)

58

Page 59: Bottom-Up  Syntax Analysis

$

input8 (E)

7(()

0($)

err

stack

i+)($ET

0s5errs7errerr16

1errs3errerrs2

2acc

3s5errs7errerr4

4reduce EE+T

5reduce T i

6reduce E T

7s5errs7errerr86

8errs3errs9err

9reduce T(E)

59

Page 60: Bottom-Up  Syntax Analysis

1: S E$

4: E T6: E E + T

10: T i12: T (E)

5: E T T

11: T i i

2: S E $7: E E + TE

13: T ( E)4: E T6: E E + T10: T i12: T (E)

)

)

15: T (E) (

14: T (E )7: E E + T

E

7: E E + T10: T i12: T (E)

+

+

8: E E + T

T

2: S E $ $

06

5

7

8

9

4

3

2

1

i

i

60

Page 61: Bottom-Up  Syntax Analysis

Constructing LR(0) parsing table

• Add a production S’ S$• Construct a deterministic finite automaton

accepting “valid stack symbols”• States are set of items A

– The states of the automaton becomes the states of parsing-table

– Determine shift operations– Determine goto operations– Determine reduce operations

61

Page 62: Bottom-Up  Syntax Analysis

Filling Parsing Table

• A state si

• reduce A – A si

• Shift on t– A t si

• Goto(si, X) = sj

– A X si

(si, X) = sj

• When conflicts occurs the grammar is not LR(0)62

Page 63: Bottom-Up  Syntax Analysis

Example Control Table

i+)($ET

0s5errs7errerr16

1errs3errerrs2

2acc

3s5errs7errerr4

4reduce EE+T

5reduce T i

6reduce E T

7s5errs7errerr86

8errs3errs9err

9reduce T(E)63

Page 64: Bottom-Up  Syntax Analysis

LR(0) itemsi+$E

1: S E$24, 8

2: S E $s3

3: S E $ r S E$

4: E E + E54, 8

5: E E + Es6

6: E E + E7

7: E E + E r E E+E

8:E is9

9:E i r E i

S E$

E E+E

E i

Example Non LR(0) Grammar

64

Page 65: Bottom-Up  Syntax Analysis

Example Non LR(0)DFA

S E $ E E + E | i

SE$EE+EE i

0

SE$EE+E

2E

EE+ E

E E+EE i

4

+

EE+ E

EE+E

5E

+

E i

i

1 S E $

$

3

i65

Page 66: Bottom-Up  Syntax Analysis

i+$E

0s1errerr2

1red E i

2errs4s3

3accept

4s15

5red E E +E s4

red E E +E

red E E +E

SE$EE+EE i

0

SE$EE+E

2

E

EE+ EE E+E

E i

4

+

EE+ EEE+E

5E

+

E i

i

1 S E $

$

3

i

66

Page 67: Bottom-Up  Syntax Analysis

Dangling Else

67

S if cond s else s | if cond s

| assign

Page 68: Bottom-Up  Syntax Analysis

Non-Ambiguous Non LR(0) Grammar S E $E E + T | T T T * F | F F i

S E $E E + TE T T T * F T F F i

E T T T * F

T

i+*

0

1??

2

0

1

2

T T * F F i

*

68

Page 69: Bottom-Up  Syntax Analysis

Non-Ambiguous SLR(1) Grammar S E $E E + T | T T T * F | F F i

S E $E E + TE T T T * F T F F i

E T T T * F

T

i+*

0

1r E T s2

2

0

1

2

T T * F F i

*

69

Page 70: Bottom-Up  Syntax Analysis

LR(1) Parser

• LR(1) Items A , t is at the top of the stack and we are

expecting t

• LR(1) State– Sets of items

• LALR(1) State– Merge items with the same look-ahead

70

Page 71: Bottom-Up  Syntax Analysis

Grammar Hierarchy

Non-ambiguous CFG

CLR(1)

LALR(1)

SLR(1)

LL(1)

LR(0)

71

Page 72: Bottom-Up  Syntax Analysis

Interesting Non LR(1) Grammars

• Ambiguous – Arithmetic expressions– Dangling-else

• Common derived prefix

– A B1 a b | B2 a c

– B1

– B2

• Optional non-terminals– St OptLab Ass– OptLab id : | – Ass id := Exp

72

Page 73: Bottom-Up  Syntax Analysis

A motivating example• Create a desk calculator• Challenges

– Non trivial syntax– Recursive expressions (semantics)

• Operator precedence

73

Page 74: Bottom-Up  Syntax Analysis

import java_cup.runtime.*;%%%cup%eofval{ return sym.EOF;%eofval}NUMBER=[0-9]+%%”+” { return new Symbol(sym.PLUS); }”-” { return new Symbol(sym.MINUS); }”*” { return new Symbol(sym.MULT); }”/” { return new Symbol(sym.DIV); }”(” { return new Symbol(sym.LPAREN); }”)” { return new Symbol(sym.RPAREN); }{NUMBER} {

return new Symbol(sym.NUMBER, new Integer(yytext()));}\n { }. { }

Solution (lexical analysis)

• Parser gets terminals from the Lexer 74

Page 75: Bottom-Up  Syntax Analysis

terminal Integer NUMBER;terminal PLUS,MINUS,MULT,DIV;terminal LPAREN, RPAREN;terminal UMINUS;nonterminal Integer expr;precedence left PLUS, MINUS;precedence left DIV, MULT;Precedence left UMINUS;%%expr ::= expr:e1 PLUS expr:e2

{: RESULT = new Integer(e1.intValue() + e2.intValue()); :}| expr:e1 MINUS expr:e2{: RESULT = new Integer(e1.intValue() - e2.intValue()); :}| expr:e1 MULT expr:e2{: RESULT = new Integer(e1.intValue() * e2.intValue()); :}| expr:e1 DIV expr:e2{: RESULT = new Integer(e1.intValue() / e2.intValue()); :}| MINUS expr:e1 %prec UMINUS{: RESULT = new Integer(0 - e1.intValue(); :}| LPAREN expr:e1 RPAREN{: RESULT = e1; :}| NUMBER:n {: RESULT = n; :}75

Page 76: Bottom-Up  Syntax Analysis

Summary• LR is a powerful technique• Generates efficient parsers• Generation tools exit LALR(1)

– Bison, yacc, CUP• But some grammars need to be tuned

– Shift/Reduce conflicts– Reduce/Reduce conflicts– Efficiency of the generated parser

• There exist more general methods – GLR– Arbitrary grammars in n3

• Early parsers• CYK algorithms

76