Terminology

Statement ( 敘述 )
  » declaration, assignment containing expression ( 運算式 )

Grammar ( 文法 )
  » a set of rules that specify the form of legal statements

Syntax ( 語法 ) vs. Semantics ( 語意 )
  » Example: assuming J,K:integer and X,Y:float
  » I:=J+K vs I:=X+Y
  » same syntax, different semantics: integer addition vs. floating-point addition

Compilation ( 編譯 )
  » matching statements written by the programmer to structures defined by the grammar and generating the appropriate object code
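
The syntax/semantics example can be sketched in a few lines: both statements match the same pattern (id := id + id), but the operation selected depends on the declared operand types. This is only an illustration. The function `select_add` and the operation names `ADD` and `ADDF` are hypothetical and do not come from the slides.

```python
# Declared types from the slide's example: J,K are integers, X,Y are floats.
DECLARED = {"J": "integer", "K": "integer", "X": "float", "Y": "float"}

def select_add(left, right, types=DECLARED):
    """Pick an addition operation from the operand types (hypothetical names)."""
    if types[left] == "integer" and types[right] == "integer":
        return "ADD"    # integer addition
    return "ADDF"       # floating-point addition

# Same syntax (id := id + id), different semantics:
print(select_add("J", "K"))  # ADD
print(select_add("X", "Y"))  # ADDF
```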

System Software

Assembler
Loader and Linker
Macro Processor
Compiler
Operating System
Other System Software
  » RDBS
  » Text Editors
  » Interactive Debugging System

Basic Compiler

Lexical analysis -- scanner
  » scanning the source statement, recognizing and classifying the various tokens

Syntactic analysis -- parser
  » recognizing the statement as some language construct

Code generation --

Scanner

Token stream produced by the scanner (one source fragment per line, tokens separated by spaces):

PROGRAM STATS
VAR SUM , SUMSQ , I
SUM := 0 ;
SUMSQ :=
READ ( VALUE ) ;

Parser

Grammar: a set of rules
  » Backus-Naur Form (BNF)
  » Ex: Figure 5.2

Terminology
  » Define symbol ::=
  » Nonterminal symbols <>
  » Alternative symbols |
  » Terminal symbols

Simplified Pascal Grammar

Parser

<read>    ::= READ ( <id-list> )
<id-list> ::= id | <id-list> , id
<assign>  ::= id := <exp>
<exp>     ::= <term> | <exp> + <term> | <exp> - <term>
<term>    ::= <factor> | <term> * <factor> | <term> DIV <factor>
<factor>  ::= id | int | ( <exp> )

Example statements:
  READ(VALUE)
  SUM := 0
  SUM := SUM + VALUE
  MEAN := SUM DIV 100

Syntax Tree

Syntax Tree for Program 5.1

Lexical Analysis

Function
  » scanning the program to be compiled and recognizing the tokens that make up the source statements

Tokens
  » Tokens can be keywords, operators, identifiers, integers, floating-point numbers, character strings, etc.
  » Each token is usually represented by some fixed-length code, such as an integer, rather than as a variable-length character string (see Figure 5.5)
  » Token type, Token specifier (value) (see Figure 5.6)

Scanner Output

Token specifier
  » identifier name, integer value

Token coding scheme
  » Figure 5.5

Example - Figure 5.6

Statement                     Token type   Token specifier
PROGRAM STATS                 1
                              22           ^STATS
VAR                           2
SUM,SUMSQ,I, …, : INTEGER     22           ^SUM
                              14
                              22           ^SUMSQ
                              14
                              …..
                              14
                              22           ^VARIANCE
                              13
                              6
BEGIN                         3
SUM:=0;                       22           ^SUM
                              15
                              23           #0
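
A minimal scanner sketch that produces (token type, token specifier) pairs in the style of the example above. It uses only the type codes visible in that example (1, 2, 3, 6, 13, 14, 15, 22, 23). The code 12 for `;` is an assumption, as are the function name `scan` and the use of a regular expression in place of a hand-written automaton.

```python
import re

# Token type codes. Codes 1, 2, 3, 6, 13, 14, 15, 22 and 23 appear in the
# example above; the code 12 for ';' is an assumption made for this sketch.
KEYWORDS = {"PROGRAM": 1, "VAR": 2, "BEGIN": 3, "INTEGER": 6}
OPERATORS = {";": 12, ":": 13, ",": 14, ":=": 15}
ID, INT = 22, 23

# ':=' is listed before ':' so that the longer operator wins.
TOKEN_RE = re.compile(r"\s*(?:([A-Z][A-Z0-9]*)|([0-9]+)|(:=|[;:,]))")

def scan(source):
    """Return a list of (token type, token specifier) pairs."""
    tokens, pos = [], 0
    source = source.rstrip()
    while pos < len(source):
        m = TOKEN_RE.match(source, pos)
        if m is None:
            raise ValueError(f"unexpected character at position {pos}")
        word, number, op = m.groups()
        if word is not None:
            if word in KEYWORDS:
                tokens.append((KEYWORDS[word], None))
            else:
                tokens.append((ID, "^" + word))   # '^NAME' follows the example's notation
        elif number is not None:
            tokens.append((INT, "#" + number))    # '#value' follows the example's notation
        else:
            tokens.append((OPERATORS[op], None))
        pos = m.end()
    return tokens

print(scan("PROGRAM STATS"))  # [(1, None), (22, '^STATS')]
print(scan("SUM := 0;"))
```

Keywords and operators carry no specifier, so the second element is `None` for them. Only identifiers and integers need one.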

Token Recognizer

By grammar
  » <ident> ::= <letter> | <ident> <letter> | <ident> <digit>
  » <letter> ::= A | B | C | D | … | Z
  » <digit> ::= 0 | 1 | 2 | 3 | … | 9

By scanner - modeling as finite automata
  » Figure 5.8(a)

Recognizing Identifier

Identifiers allowing underscore (_)
  » Figure 5.8(b)

State   A-Z   0-9   _
1       2
2       2     2     3
3       2     2
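
The transition table for identifiers with underscores can be run directly as a table-driven recognizer. The sketch below assumes that state 1 is the start state and that state 2 is the only final state (so an identifier cannot end in an underscore or contain two in a row). The names `TRANSITIONS`, `char_class` and `is_identifier` are made up for this illustration.

```python
# Transition table from the slide: state -> {character class -> next state}.
TRANSITIONS = {
    1: {"letter": 2},
    2: {"letter": 2, "digit": 2, "underscore": 3},
    3: {"letter": 2, "digit": 2},
}
ACCEPTING = {2}  # assumption: only state 2 is a final state

def char_class(ch):
    """Map a character to a column of the table, or None if it has no column."""
    if "A" <= ch <= "Z":
        return "letter"
    if "0" <= ch <= "9":
        return "digit"
    if ch == "_":
        return "underscore"
    return None

def is_identifier(text):
    state = 1
    for ch in text:
        state = TRANSITIONS[state].get(char_class(ch))
        if state is None:          # empty table entry: no transition defined
            return False
    return state in ACCEPTING

print(is_identifier("SUM_SQ"))   # True
print(is_identifier("SUM__SQ"))  # False
```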

Recognizing Integer

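Allowing leading zeroes
  » Figure 5.8(c)
  » state 1 --(0-9)--> state 2; state 2 --(0-9)--> state 2

Disallowing leading zeroes
  » Figure 5.8(d)
  » state 1 --(1-9)--> state 2; state 2 --(0-9)--> state 2
  » state 1 --(0)--> state 3; state 3 --(space)--> state 4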
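
A rough sketch of the automaton that disallows leading zeroes (Figure 5.8(d)), written as explicit state checks. One simplification is assumed here: end of input stands in for the space that follows a lone 0, so the function accepts in state 2 or state 3 instead of moving on to state 4. The function name is hypothetical.

```python
def is_integer_no_leading_zero(text):
    """Accept '0' or a digit string that does not start with 0."""
    state = 1
    for ch in text:
        if state == 1 and "1" <= ch <= "9":
            state = 2                  # first digit is nonzero
        elif state == 1 and ch == "0":
            state = 3                  # a single 0; no further digits allowed
        elif state == 2 and "0" <= ch <= "9":
            state = 2                  # remaining digits
        else:
            return False               # no transition defined
    return state in (2, 3)

print(is_integer_no_leading_zero("100"))  # True
print(is_integer_no_leading_zero("007"))  # False
```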

Scanner -- Implementation

Figure 5.10 (a)
  » Algorithmic code for identifier recognition

Tabular representation of finite automaton for Figure 5.9

State   A-Z   0-9   ;,+-*()   :   =   .
1       2     4     5         6
2       2     2                       3
3
4             4
5
6                                 7
7

Syntactic Analysis

Recognize source statements as language constructs or build the parse tree for the statements
  » bottom-up: operator-precedence parsing
  » top-down: recursive-descent parsing

Operator-Precedence Parsing

Operator
  » any terminal symbol (or any token)

Precedence
  » * ⋗ +  (* has higher precedence than +)
  » + ⋖ *  (+ has lower precedence than *)

Operator-precedence
  » precedence relations between operators

Precedence Matrix for the Grammar in Fig. 5.2

Operator-Precedence Parse Example

BEGIN READ ( VALUE ) ;

(i)    … id1 := id2 DIV
(ii)   … id1 := <N1> DIV int -
(iii)  … id1 := <N1> DIV <N2> -
(iv)   … id1 := <N3> - id3 *
(v)    … id1 := <N3> - <N4> * id4 ;
(vi)   … id1 := <N3> - <N4> * <N5> ;
(vii)  … id1 := <N3> - <N6> ;
(viii) … id1 := <N7> ;
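
The order of reductions in this example (DIV first, then *, then -) can be reproduced with a small shift-reduce sketch. This is a simplification, not the matrix-driven method itself. Numeric precedence levels stand in for the precedence matrix, only the expression to the right of := is parsed, and the input is assumed to be well formed. The names `PRECEDENCE` and `parse_expression` are made up for this illustration.

```python
# Numeric precedence levels stand in for the precedence matrix (an assumption):
# a higher number binds more tightly; equal levels reduce left to right.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "DIV": 2}

def parse_expression(tokens):
    """Reduce a well-formed infix token list to nested (operator, left, right) tuples."""
    operands, operators = [], []

    def reduce_top():
        # Replace 'left op right' on the stacks with a single subtree.
        op = operators.pop()
        right = operands.pop()
        left = operands.pop()
        operands.append((op, left, right))

    for tok in tokens:
        if tok in PRECEDENCE:
            # Stacked operator takes precedence over the incoming one: reduce first.
            while operators and PRECEDENCE[operators[-1]] >= PRECEDENCE[tok]:
                reduce_top()
            operators.append(tok)    # shift the operator
        else:
            operands.append(tok)     # shift an operand (id or int)
    while operators:
        reduce_top()
    return operands[0]

tree = parse_expression(["id2", "DIV", "int", "-", "id3", "*", "id4"])
print(tree)  # ('-', ('DIV', 'id2', 'int'), ('*', 'id3', 'id4'))
```

When `-` arrives, the stacked `DIV` has higher precedence and is reduced immediately. When `*` arrives, the stacked `-` has lower precedence, so `*` is shifted and reduced before `-`.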

Operator-Precedence Parsing

Bottom-up parsing
Generating precedence matrix
  » Aho et al. (1988)

Shift-reduce Parsing with Stack

Figure 5.14

Recursive-Descent Parsing

Each nonterminal symbol in the grammar is associated with a procedure
  » <read> ::= READ ( <id-list> )
  » <stmt> ::= <assign> | <read> | <write> | <for>

Left recursion (the procedure for <dec-list> would call itself before consuming any token)
  » <dec-list> ::= <dec> | <dec-list> ; <dec>

Modification
  » <dec-list> ::= <dec> { ; <dec> }
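
A minimal recursive-descent sketch for the <read> rule, with one method per nonterminal. It assumes that <id-list> is rewritten in the same way as <dec-list>, that is, as id { , id }, so the left recursion becomes a loop. Tokens are assumed to be (kind, value) pairs. The class and method names are hypothetical.

```python
class ParseError(Exception):
    pass

class Parser:
    def __init__(self, tokens):
        self.tokens = tokens   # list of (kind, value) pairs
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expect(self, kind):
        """Consume the next token if it has the given kind, else fail."""
        tok = self.peek()
        if tok is None or tok[0] != kind:
            raise ParseError(f"expected {kind} at token {self.pos}")
        self.pos += 1
        return tok

    def read(self):
        # <read> ::= READ ( <id-list> )
        self.expect("READ")
        self.expect("(")
        names = self.id_list()
        self.expect(")")
        return ("read", names)

    def id_list(self):
        # <id-list> ::= id { , id }   (left recursion replaced by iteration)
        names = [self.expect("id")[1]]
        while self.peek() is not None and self.peek()[0] == ",":
            self.expect(",")
            names.append(self.expect("id")[1])
        return names

tokens = [("READ", None), ("(", None), ("id", "VALUE"), (")", None)]
print(Parser(tokens).read())  # ('read', ['VALUE'])
```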

Recursive-Descent Parse of READ

Simplified Pascal Grammar for Recursive-Descent Parser
