54
06/13/22 1 Programming Languages and Compilers (CS 421) William Mansky http://courses.engr.illinois .edu/cs421/ Based in part on slides by Mattox Beckman, as updated by Vikram Adve, Gul Agha, Elsa Gunter, and Dennis Griffith

Programming Languages and Compilers (CS 421)

  • Upload
    lorin

  • View
    41

  • Download
    0

Embed Size (px)

DESCRIPTION

Programming Languages and Compilers (CS 421). William Mansky http://courses.engr.illinois.edu/cs421/. Based in part on slides by Mattox Beckman, as updated by Vikram Adve, Gul Agha, Elsa Gunter, and Dennis Griffith. Grammars. - PowerPoint PPT Presentation

Citation preview

Page 1: Programming Languages and Compilers (CS 421)

04/22/23 1

Programming Languages and Compilers (CS 421)

William Mansky

http://courses.engr.illinois.edu/cs421/

Based in part on slides by Mattox Beckman, as updated by Vikram Adve, Gul Agha, Elsa Gunter, and Dennis Griffith

Page 2: Programming Languages and Compilers (CS 421)

04/22/23 2

Grammars

Grammars are formal descriptions of which strings over a given character set are in a particular language

Language designers write grammar Language implementers use grammar

to know what programs to accept Language users use grammar to know

how to write legitimate programs

Page 3: Programming Languages and Compilers (CS 421)

04/22/23 3

Types of Formal Language Descriptions

Regular expressions, regular grammars, finite state automata

Context-free grammars, BNF grammars, syntax diagrams

Whole family more of grammars and automata – covered in automata theory

Page 4: Programming Languages and Compilers (CS 421)

04/22/23 4

Sample Grammar

Language: Parenthesized sums of 0’s and 1’s

<Sum> ::= 0 <Sum >::= 1 <Sum> ::= <Sum> + <Sum> <Sum> ::= (<Sum>)

Page 5: Programming Languages and Compilers (CS 421)

04/22/23 5

BNF Grammars

Start with a set of characters, a,b,c,… We call these terminals

And a set of different characters, X,Y,Z,… We call these nonterminals

One special nonterminal S called start symbol

Page 6: Programming Languages and Compilers (CS 421)

04/22/23 6

BNF Grammars

BNF rules (aka productions) have form X ::= y where X is any nonterminal and y is a

string of terminals and nonterminals BNF grammar is a set of BNF rules such

that each nonterminal used appears on the left of some rule (i.e., at least one production per nonterminal)

Page 7: Programming Languages and Compilers (CS 421)

04/22/23 7

Sample Grammar

Terminals: 0 1 + ( ) Nonterminals: <Sum> Start symbol = <Sum>

<Sum> ::= 0 <Sum >::= 1 <Sum> ::= <Sum> + <Sum> <Sum> ::= (<Sum>) Can be abbreviated as <Sum> ::= 0 | 1 | <Sum> + <Sum> | (<Sum>)

Page 8: Programming Languages and Compilers (CS 421)

04/22/23 8

BNF Deriviations

Given rules X ::= yZw and Z ::=v

we may writeX => yZw => yvw

Sequence of such replacements called derivation

Derivation called right-most if always replace the right-most non-terminal

Page 9: Programming Languages and Compilers (CS 421)

04/22/23 9

BNF Derivations

Start with the start symbol:

<Sum> =>

Page 10: Programming Languages and Compilers (CS 421)

04/22/23 10

BNF Derivations

Pick a non-terminal

<Sum> =>

Page 11: Programming Languages and Compilers (CS 421)

04/22/23 11

Pick a rule and substitute: <Sum> ::= <Sum> + <Sum>

<Sum> => <Sum> + <Sum >

BNF Derivations

Page 12: Programming Languages and Compilers (CS 421)

04/22/23 12

Pick a non-terminal:

<Sum> => <Sum> + <Sum >

BNF Derivations

Page 13: Programming Languages and Compilers (CS 421)

04/22/23 13

Pick a rule and substitute: <Sum> ::= ( <Sum> )

<Sum> => <Sum> + <Sum > => ( <Sum> ) + <Sum>

BNF Derivations

Page 14: Programming Languages and Compilers (CS 421)

04/22/23 14

Pick a non-terminal:

<Sum> => <Sum> + <Sum > => ( <Sum> ) + <Sum>

BNF Derivations

Page 15: Programming Languages and Compilers (CS 421)

04/22/23 15

Pick a rule and substitute: <Sum> ::= <Sum> + <Sum>

<Sum> => <Sum> + <Sum > => ( <Sum> ) + <Sum> => ( <Sum> + <Sum> ) +

<Sum>

BNF Derivations

Page 16: Programming Languages and Compilers (CS 421)

04/22/23 16

Pick a non-terminal:

<Sum> => <Sum> + <Sum > => ( <Sum> ) + <Sum> => ( <Sum> + <Sum> ) +

<Sum>

BNF Derivations

Page 17: Programming Languages and Compilers (CS 421)

04/22/23 17

Pick a rule and substitute: <Sum >::= 1

<Sum> => <Sum> + <Sum > => ( <Sum> ) + <Sum> => ( <Sum> + <Sum> ) +

<Sum> => ( <Sum> + 1 ) + <Sum>

BNF Derivations

Page 18: Programming Languages and Compilers (CS 421)

04/22/23 18

Pick a non-terminal:

<Sum> => <Sum> + <Sum > => ( <Sum> ) + <Sum> => ( <Sum> + <Sum> ) +

<Sum> => ( <Sum> + 1 ) + <Sum>

BNF Derivations

Page 19: Programming Languages and Compilers (CS 421)

04/22/23 19

Pick a rule and substitute: <Sum >::= 0

<Sum> => <Sum> + <Sum > => ( <Sum> ) + <Sum> => ( <Sum> + <Sum> ) +

<Sum> => ( <Sum> + 1 ) + <Sum> => ( <Sum> + 1 ) + 0

BNF Derivations

Page 20: Programming Languages and Compilers (CS 421)

04/22/23 20

Pick a non-terminal:

<Sum> => <Sum> + <Sum > => ( <Sum> ) + <Sum> => ( <Sum> + <Sum> ) +

<Sum> => ( <Sum> + 1 ) + <Sum> => ( <Sum> + 1 ) + 0

BNF Derivations

Page 21: Programming Languages and Compilers (CS 421)

04/22/23 21

Pick a rule and substitute <Sum> ::= 0

<Sum> => <Sum> + <Sum > => ( <Sum> ) + <Sum> => ( <Sum> + <Sum> ) +

<Sum> => ( <Sum> + 1 ) + <Sum> => ( <Sum> + 1 ) 0 => ( 0 + 1 ) + 0

BNF Derivations

Page 22: Programming Languages and Compilers (CS 421)

04/22/23 22

( 0 + 1 ) + 0 is generated by grammar

<Sum> => <Sum> + <Sum > => ( <Sum> ) + <Sum> => ( <Sum> + <Sum> ) +

<Sum> => ( <Sum> + 1 ) + <Sum> => ( <Sum> + 1 ) + 0 => ( 0 + 1 ) + 0

BNF Derivations

Page 23: Programming Languages and Compilers (CS 421)

04/22/23 23

<Sum> ::= 0 | 1 | <Sum> + <Sum> | (<Sum>)

<Sum> =>

Page 24: Programming Languages and Compilers (CS 421)

04/22/23 24

BNF Semantics

The language of a BNF grammar is the set of all strings of terminals that can be derived from the Start symbol

Page 25: Programming Languages and Compilers (CS 421)

04/22/23 25

Extended BNF Grammars

Alternatives: allow rules of from X ::= y | z Abbreviates X ::= y, X ::= z

Options: X ::= y[v]z Abbreviates X ::= yvz, X ::= yz

Repetition: X ::= y{v}*z Can be eliminated by adding new

nonterminal V and rules X ::= yz, X ::= yVz, V ::= v, V ::= vV

Page 26: Programming Languages and Compilers (CS 421)

04/22/23 26

Regular Grammars

Subclass of BNF Only rules of form

<nonterminal>::=<terminal><nonterminal> or <nonterminal>::=<terminal> or <nonterminal>::= ε

Defines same class of languages as regular expressions

Can be used for writing lexers

Page 27: Programming Languages and Compilers (CS 421)

04/22/23 27

Example

Regular grammar: <Balanced> ::= <Balanced> ::= 0<OneAndMore><Balanced> ::= 1<ZeroAndMore><OneAndMore> ::= 1<Balanced><ZeroAndMore> ::= 0<Balanced>

Generates even length strings where every initial substring of even length has same number of 0’s as 1’s

Page 28: Programming Languages and Compilers (CS 421)

04/22/23 28

Graphical representation of derivation Each node labeled with either non-

terminal or terminal If node is labeled with a terminal, then

it is a leaf (no sub-trees) If node is labeled with a non-terminal,

then it has one branch for each character in the right-hand side of the production used on it

Parse Trees

Page 29: Programming Languages and Compilers (CS 421)

04/22/23 29

Example

Consider grammar: <exp> ::= <factor> | <factor> + <factor> <factor> ::= <bin> | <bin> * <exp> <bin> ::= 0 | 1

Problem: Build parse tree for 1 * 1 + 0 as an <exp>

Page 30: Programming Languages and Compilers (CS 421)

04/22/23 30

Example cont.

1 * 1 + 0: <exp>

<exp> is the start symbol for this parse tree

Page 31: Programming Languages and Compilers (CS 421)

04/22/23 31

Example cont.

1 * 1 + 0: <exp>

<factor>

Use rule: <exp> ::= <factor>

Page 32: Programming Languages and Compilers (CS 421)

04/22/23 32

Example cont.

1 * 1 + 0: <exp>

<factor>

<bin> * <exp>

Use rule: <factor> ::= <bin> * <exp>

Page 33: Programming Languages and Compilers (CS 421)

04/22/23 33

Example cont.

1 * 1 + 0: <exp>

<factor>

<bin> * <exp>

1 <factor> + <factor>

Use rules: <bin> ::= 1 and <exp> ::= <factor> +

<factor>

Page 34: Programming Languages and Compilers (CS 421)

04/22/23 34

Example cont.

1 * 1 + 0: <exp>

<factor>

<bin> * <exp>

1 <factor> + <factor>

<bin> <bin>

Use rule: <factor> ::= <bin>

Page 35: Programming Languages and Compilers (CS 421)

04/22/23 35

Example cont.

1 * 1 + 0: <exp>

<factor>

<bin> * <exp>

1 <factor> + <factor>

<bin> <bin>

1 0Use rules: <bin> ::= 1 | 0

Page 36: Programming Languages and Compilers (CS 421)

04/22/23 36

Example cont.

1 * 1 + 0: <exp>

<factor>

<bin> * <exp>

1 <factor> + <factor>

<bin> <bin>

1 0Fringe of tree is string generated by

grammar

Page 37: Programming Languages and Compilers (CS 421)

04/22/23 37

Your Turn: 1 * 0 + 0 * 1

Page 38: Programming Languages and Compilers (CS 421)

04/22/23 38

Parse Tree Data Structures

Parse trees may be represented by OCaml datatypes (e.g. exp)

One datatype for each nonterminal

One constructor for each rule

Defined as mutually recursive collection of datatype declarations

Page 39: Programming Languages and Compilers (CS 421)

04/22/23 39

Example

Recall grammar:<exp> ::= <factor> | <factor> +

<factor><factor> ::= <bin> | <bin> * <exp><bin> ::= 0 | 1

type exp = Factor2Exp of factor | Plus of factor * factor and factor = Bin2Factor of bin | Mult of bin * exp and bin = Zero | One

Page 40: Programming Languages and Compilers (CS 421)

04/22/23 40

Example cont.

1 * 1 + 0: <exp>

<factor>

<bin> * <exp>

1 <factor> + <factor>

<bin> <bin>

1 0

Page 41: Programming Languages and Compilers (CS 421)

04/22/23 41

Example cont.

1 * 1 + 0: <exp>

<factor>

<bin> * <exp>

1 <factor> + <factor>

<bin> <bin>

1 0Factor2Exp(Mult(One, Plus(Bin2Factor One, Bin2Factor

Zero)))

Page 42: Programming Languages and Compilers (CS 421)

04/22/23 42

Ambiguous Grammars and Languages

A BNF grammar is ambiguous if its language contains strings for which there is more than one parse tree

If all BNFs for a language are ambiguous then the language is inherently ambiguous

Page 43: Programming Languages and Compilers (CS 421)

04/22/23 43

Example: Ambiguous Grammar

0 + 1 + 0 <Sum> <Sum>

<Sum> + <Sum> <Sum> + <Sum>

<Sum> + <Sum> 0 0 <Sum> + <Sum>

0 1 1 0

Page 44: Programming Languages and Compilers (CS 421)

04/22/23 44

Example

What is the result for:3 + 4 * 5 + 6

Page 45: Programming Languages and Compilers (CS 421)

04/22/23 45

Example

What is the result for:3 + 4 * 5 + 6

Possible answers: 41 = ((3 + 4) * 5) + 6 47 = 3 + (4 * (5 + 6)) 29 = (3 + (4 * 5)) + 6 = 3 + ((4 * 5) + 6) 77 = (3 + 4) * (5 + 6)

Page 46: Programming Languages and Compilers (CS 421)

04/22/23 46

Example

What is the value of:7 – 5 – 2

Page 47: Programming Languages and Compilers (CS 421)

04/22/23 47

Example

What is the value of:7 – 5 – 2

Possible answers: In Pascal, C++, SML assoc. left 7 – 5 – 2 = (7 – 5) – 2 = 0 In APL, associate to right 7 – 5 – 2 = 7 – (5 – 2) = 4

Page 48: Programming Languages and Compilers (CS 421)

04/22/23 48

Two Major Sources of Ambiguity

Lack of determination of operator precedence

Lack of determination of operator associativity

There may be other sources as well

Page 49: Programming Languages and Compilers (CS 421)

04/22/23 49

How to Enforce Associativity

Have at most one recursive call per production

When two or more recursive calls would be natural leave right-most one for right associativity, left-most one for left associativity

Page 50: Programming Languages and Compilers (CS 421)

04/22/23 50

Example

<Sum> ::= 0 | 1 | <Sum> + <Sum>

| (<Sum>) Becomes

<Sum> ::= <Num> | <Num> + <Sum>

<Num> ::= 0 | 1 | (<Sum>)

Page 51: Programming Languages and Compilers (CS 421)

04/22/23 51

Operator Precedence

Operators of highest precedence evaluated first (bind more tightly)

Precedence for infix binary operators as in following table

Needs to be reflected in grammar

Page 52: Programming Languages and Compilers (CS 421)

04/22/23 52

Precedence Table - Sample

Fortan

Pascal

C/C++

Ada SML

highest

** *, /, div, mod

++, -- ** div, mod, /, *

*, / +, - *, /, % *, /, mod

+, -, ^

+, - +, - +, - ::

Page 53: Programming Languages and Compilers (CS 421)

04/22/23 53

First Example Again

In any above language, 3 + 4 * 5 + 6 = 29

In APL, all infix operators have same precedence Thus we still don’t know what the

value is (handled by associativity) How do we handle precedence in

grammar?

Page 54: Programming Languages and Compilers (CS 421)

04/22/23 54

Predence in Grammar

Higher precedence translates to deeper in the derivation chain

Example:<exp> ::= <id> | <exp> + <exp> | <exp> * <exp>

Becomes<exp> ::= <mult_exp> | <exp> + <mult_exp><mult_exp> ::= <id> | <mult_exp> *

<id>