CSC 415: Translators and Compilers Spring 2009


Chapter 5

Contextual Analysis

Chart 2

Chapter 5: Contextual Analysis

Identification
– Monolithic Block Structure
– Flat Block Structure
– Nested Block Structure
– Attributes
– Standard Environment

Type Checking

A Contextual Analysis Algorithm

Case Study: Contextual Analysis in the Triangle Compiler

Chart 3

Contextual Analysis

Given a parsed program, the purpose of contextual analysis is to check that the program conforms to the source language’s contextual constraints.
– Scope rules: rules governing declarations and applied occurrences of identifiers
– Type rules: rules that allow us to infer the types of expressions, and to decide whether each expression has a valid type

Analysis of the program to determine correctness with respect to the language definition (beyond structure)

Chart 4

Contextual Analysis

Contextual analysis consists of two sub-phases:
– Identification: applying the source language’s scope rules to relate each applied occurrence of an identifier to its declaration (if any).
– Type checking: applying the source language’s type rules to infer the type of each expression, and to compare that type with the expected type.

Chart 5

Structure of a Compiler

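[Figure: compiler pipeline. Phases: Lexical Analyzer, Parser, Semantic Analyzer, Intermediate Code Generation, Optimization, Assembly Code Generation. Input: source code. Output: assembly code. Data passed between phases: tokens, parse tree, intermediate representation. The Symbol Table is shared by the phases. The Semantic Analyzer is expanded into its two sub-phases: Identification and Type checking.]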

Chart 6

Identification

Relate each applied occurrence of an identifier in the source program to the corresponding declaration
– The program is ill-formed if there is no corresponding declaration: generate an error

Identification could cause compiler efficiency problems
– Inefficient to search the AST for the declaration at each applied occurrence

Chart 7

Identification Table

Also known as the symbol table
Associates identifiers with their attributes
Basic operations
– Make the identification table empty
– Add an entry associating a given identifier with a given attribute
– Retrieve the attribute (if any) associated with a given identifier

Attribute
– Consists of information relevant to contextual analysis
– Obtained from the identifier’s declaration

Chart 8

Identification Table

Each declaration in a program has a defined scope
– The portion of the program over which the declaration takes effect

Block: any program phrase that delimits the scope of declarations within it

Example: the Triangle block command
– let D in C
– The scope of each declaration in D extends over the subcommand C

Chart 9

Identification Table: Structure/Implementation

Maintain scope
– An identifier should be found in the table only when it is valid
– If an identifier is defined in multiple scopes, then a lookup in the table must provide the appropriate meaning for the use

Efficiency
– How fast is lookup?
– How fast is it to enter or exit a scope?
– What is the overall table size?

Chart 10

Identification Table: Structure/Implementation

Different implementations
– Organized for efficient retrieval
– Binary search tree
– Hash table

Chart 11

Identification Table: Functionality

A mapping of identifiers to their meanings

Information
– Name
– Type
– Location

Operations
– Create
– Insert
– Lookup
– Delete
– Update entry
– Entering a new scope
– Leaving a scope

Chart 12

Block Structures

Monolithic block structure
– Basic and Cobol

Flat block structure
– Fortran

Nested block structure
– Pascal, Ada, C, and Java

Chart 13

Monolithic Block Structure

The only block is the entire program
All declarations are global
Simple rules
– No identifier may be declared more than once
– For every applied occurrence of an identifier I, there must be a corresponding declaration of I
  (no identifier may be used unless declared)

The identification table should contain entries for all declarations in the source program
– At most one entry for each identifier
– Each entry contains an identifier I and the associated attribute A

Chart 14

Monolithic Block Structure

Example program:

  program
    (1) integer b = 10
    (2) integer n
    (3) char c
  begin
    …
    n = n * b
    …
    write c
    …
  end

Identification table:

  Identifier  Attribute
  b           (1)
  n           (2)
  c           (3)

• Create new table: create command
• At declaration of identifier I, make table entry: insert command
• At applied occurrence of identifier I, retrieve information from table: lookup command
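The three commands above can be sketched in Java, the implementation language of the Triangle compiler discussed later. This is only a minimal sketch under stated assumptions: the class `MonolithicTable`, its method names, and the use of a plain string such as "(1)" as the attribute are all hypothetical, chosen to mirror the slide.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical identification table for a monolithic block structure:
// one global scope, at most one entry per identifier.
class MonolithicTable {
    private final Map<String, String> entries = new HashMap<>();

    // "create": make the table empty.
    void create() {
        entries.clear();
    }

    // "insert": called at each declaration. Returns false if the
    // identifier is already declared (a scope-rule violation).
    boolean insert(String id, String attr) {
        if (entries.containsKey(id)) {
            return false;
        }
        entries.put(id, attr);
        return true;
    }

    // "lookup": called at each applied occurrence. Returns null if
    // there is no corresponding declaration.
    String lookup(String id) {
        return entries.get(id);
    }
}

class MonolithicDemo {
    public static void main(String[] args) {
        MonolithicTable table = new MonolithicTable();
        table.create();
        table.insert("b", "(1)");
        table.insert("n", "(2)");
        table.insert("c", "(3)");
        System.out.println(table.lookup("n"));         // attribute of declaration (2)
        System.out.println(table.insert("b", "(9)"));  // redeclaration is rejected
        System.out.println(table.lookup("x"));         // undeclared identifier
    }
}
```

A checker built on this would report an error whenever `insert` returns false or `lookup` returns null.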

Chart 15

Flat Block Structure

Program partitioned into several disjoint blocks
Two scope levels
– Some declarations are local in scope
  Identifiers restricted to a particular block
– Other declarations are global in scope
  Identifiers allowed anywhere in the program; the program as a whole is a block

Less simple rules
– No globally declared identifier may be redeclared globally
  But the same identifier may also be declared locally
– No locally declared identifier may be redeclared in the same block
  The same identifier may be declared locally in several different blocks
– For every applied occurrence of an identifier I in a block B, there must be a corresponding declaration of I
  Either a global declaration of I or a declaration of I local to B

A minor complication is distinguishing global and local declaration entries

Chart 16

Flat Block Structure


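• At start of a block: enter new scope command
• At end of a block: leave scope command, delete command
• At declaration of identifier I, make table entry: insert command
• At applied occurrence of identifier I, retrieve information from table: lookup command

Example program:

  program
    (1) procedure Q
          (2) real r
          (3) real pi = 3.14
        begin
          …
        end
    (4) procedure R
          (5) integer c
        begin
          …
        end
    (6) integer i
    (7) boolean b
    (8) char c
  begin
    …
    call R
    …
  end

Identification table while checking the body of Q:

  Identifier  Attribute  Level
  Q           (1)        global
  r           (2)        local
  pi          (3)        local

While checking the body of R:

  Identifier  Attribute  Level
  Q           (1)        global
  R           (4)        global
  c           (5)        local

After leaving R:

  Identifier  Attribute  Level
  Q           (1)        global
  R           (4)        global

While checking the main program:

  Identifier  Attribute  Level
  Q           (1)        global
  R           (4)        global
  i           (6)        local
  b           (7)        local
  c           (8)        local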
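The two-level table traced above can be sketched in Java as follows. This is an illustrative sketch, not code from any particular compiler: the class `FlatTable`, its method names, and the string attributes are assumptions. It keeps one map for global entries and one map for the current block's local entries, which is one simple way to distinguish the two kinds of entry.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical table for a flat block structure: one global map plus
// one local map that is discarded when the current block ends.
class FlatTable {
    private final Map<String, String> globals = new HashMap<>();
    private Map<String, String> locals = null; // null while outside any block

    // "enter new scope": start collecting local entries.
    void enterScope() {
        locals = new HashMap<>();
    }

    // "leave scope": deletes all local entries of the block.
    void leaveScope() {
        locals = null;
    }

    // Inserts at the current level; returns false on a redeclaration
    // at that same level.
    boolean insert(String id, String attr) {
        Map<String, String> level = (locals != null) ? locals : globals;
        return level.putIfAbsent(id, attr) == null;
    }

    // A local declaration takes precedence over a global one.
    String lookup(String id) {
        if (locals != null && locals.containsKey(id)) {
            return locals.get(id);
        }
        return globals.get(id);
    }
}

class FlatDemo {
    public static void main(String[] args) {
        FlatTable table = new FlatTable();
        table.insert("Q", "(1)");      // global
        table.enterScope();            // body of procedure Q
        table.insert("r", "(2)");
        table.insert("pi", "(3)");
        System.out.println(table.lookup("pi")); // found: local to Q
        table.leaveScope();
        table.insert("R", "(4)");      // global
        table.enterScope();            // body of procedure R
        table.insert("c", "(5)");
        System.out.println(table.lookup("r"));  // not found: r was local to Q
        table.leaveScope();
    }
}
```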

Chart 17

Nested Block Structure

Blocks may be nested one within another
Many scope levels
– Declarations in the outermost block are global in scope
  The outermost block is at scope level 1
– Declarations inside an inner block are local to that block
  Every inner block is completely enclosed by another block
  A block enclosed directly by the outermost block is at scope level 2
  A block enclosed by a level-n block is at scope level n+1

Chart 18


More complex rules
– No identifier may be declared more than once in the same block
  The same identifier may be declared in different blocks, even if they are nested
– For every applied occurrence of an identifier I in a block B, there must be a corresponding declaration of I
  It must be in B itself
  Or in the block B’ immediately enclosing B
  Or in B’’ immediately enclosing B’
  Etc.
  That is, in the smallest enclosing block that contains any declaration of I

Chart 19



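• At start of a block: enter new scope command
• At end of a block: leave scope command, delete command
• At declaration of identifier I, make table entry: insert command
  Level number determined by the number of calls to enter new scope
• At applied occurrence of identifier I, retrieve information from table using the highest level for I: lookup command

Example program:

  let
    (1) var a: Integer;
    (2) var b: Boolean
  in
    begin
      …;
      let
        (3) var b: Integer;
        (4) var c: Boolean
      in
        begin
          …;
          let
            (5) var d: Integer
          in
            …;
          …
        end;
      …;
      let
        (6) var d: Boolean;
        (7) var e: Integer
      in
        …;
      …
    end

Identification table in the outermost block, before any inner block:

  Identifier  Attribute  Level
  a           (1)        1
  b           (2)        1

Inside the block that declares d at (5):

  Identifier  Attribute  Level
  a           (1)        1
  b           (2)        1
  b           (3)        2
  c           (4)        2
  d           (5)        3

After leaving that block, still inside the block that declares b and c:

  Identifier  Attribute  Level
  a           (1)        1
  b           (2)        1
  b           (3)        2
  c           (4)        2

Inside the block that declares d and e:

  Identifier  Attribute  Level
  a           (1)        1
  b           (2)        1
  d           (6)        2
  e           (7)        2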
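The behaviour traced in these tables can be sketched as a stack of scopes. The following Java sketch is an assumption-laden illustration, not the Triangle implementation: the class `NestedTable` and its methods are hypothetical, and attributes are again plain strings. The current level equals the number of scopes opened and not yet closed, and lookup searches from the innermost scope outwards so that the entry at the highest level wins.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical table for nested block structure: a stack of scopes,
// innermost scope on top. The level is the current stack depth.
class NestedTable {
    private final Deque<Map<String, String>> scopes = new ArrayDeque<>();

    NestedTable() {
        openScope(); // level 1: the outermost block
    }

    void openScope() {
        scopes.push(new HashMap<>());
    }

    // Drops every entry made at the innermost level.
    void closeScope() {
        scopes.pop();
    }

    int level() {
        return scopes.size();
    }

    // Returns false if id is already declared in the current block.
    boolean insert(String id, String attr) {
        return scopes.peek().putIfAbsent(id, attr) == null;
    }

    // Searches from the innermost scope outwards. Returns null if
    // id is not declared in any enclosing block.
    String lookup(String id) {
        for (Map<String, String> scope : scopes) {
            String attr = scope.get(id);
            if (attr != null) {
                return attr;
            }
        }
        return null;
    }
}

class NestedDemo {
    public static void main(String[] args) {
        NestedTable table = new NestedTable();
        table.insert("a", "(1)");
        table.insert("b", "(2)");
        table.openScope();                       // level 2
        table.insert("b", "(3)");
        table.insert("c", "(4)");
        System.out.println(table.lookup("b"));   // (3): inner b hides outer b
        table.closeScope();
        System.out.println(table.lookup("b"));   // (2): outer b visible again
        System.out.println(table.lookup("c"));   // null: c was local to the inner block
    }
}
```

A stack of hash maps makes entering and leaving a scope cheap, at the cost of a lookup that may visit every enclosing level.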

Chart 20

Attributes

Examples

Kind
– constant
– variable
– procedure
– function
– type

Type
– boolean
– character
– integer
– record
– array

Chart 21

Attributes

Information to be extracted from a declaration
– Kind: constant, variable, procedure, function, or type
– A procedure or function declaration includes a list of formal parameters, each of which may be a constant, variable, procedural, or functional parameter
– The language provides whole families of record and array types

How to manage attribute information
– Extract type information from declarations and store it in the identification table
  Could be complex for a realistic programming language
  Could require tedious programming
– Use the AST
  Entries in the identification table point to the location in the AST where the identifier is declared
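The second option, using the AST, can be illustrated with a small Java sketch. Everything named here is hypothetical: `VarDecl` stands in for a variable-declaration AST node and `AstPointerTable` for the identification table. The point is only that the table stores a reference to the declaration node, so no attribute information has to be copied out of the tree.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical AST node for "var I: T"; only the fields needed here.
class VarDecl {
    final String name;
    final String type;

    VarDecl(String name, String type) {
        this.name = name;
        this.type = type;
    }
}

// The table stores a reference to the declaration node rather than
// a copy of its type information.
class AstPointerTable {
    private final Map<String, VarDecl> entries = new HashMap<>();

    void insert(VarDecl decl) {
        entries.put(decl.name, decl);
    }

    VarDecl lookup(String id) {
        return entries.get(id);
    }
}

class AstPointerDemo {
    public static void main(String[] args) {
        VarDecl a = new VarDecl("a", "int");
        AstPointerTable table = new AstPointerTable();
        table.insert(a);
        // The retrieved attribute is the very same node as in the AST.
        System.out.println(table.lookup("a") == a);
        System.out.println(table.lookup("a").type);
    }
}
```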

Chart 22

Attributes

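[Figure: AST of the nested-block example. Program → LetCommand → SequentialDeclaration (VarDeclarations for a: int and b: bool) and SequentialCommand, which contains an inner LetCommand whose SequentialDeclaration holds VarDeclarations for d: bool and e: int. The attribute of each identification table entry is a pointer to the corresponding VarDeclaration subtree. In the outer block the table holds a → (1) and b → (2) at level 1; inside the inner block it also holds d → (6) and e → (7) at level 2.]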

Chart 23

Standard Environment

Predefined constants, variables, types, procedures, and functions

These are loaded into the identification table
Scope rules for the standard environment (two alternatives)
– Scope enclosing the entire program
  Level 0
– Same scope level as global declarations
  Example is C
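The level 0 alternative can be sketched by preloading the table before the program's own declarations are processed. In the Java sketch below, the class `StdEnvTable` is hypothetical and the four preloaded identifiers are merely illustrative entries; a real standard environment would contain many more. Because the standard entries sit one level below the global declarations, a program may redeclare them; under the other alternative they would share a level with the globals and the redeclaration would be rejected.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical table whose level 0 holds the standard environment.
class StdEnvTable {
    private final Deque<Map<String, String>> scopes = new ArrayDeque<>();

    StdEnvTable() {
        Map<String, String> std = new HashMap<>();
        std.put("Integer", "type");      // illustrative predefined entries
        std.put("Boolean", "type");
        std.put("true", "constant");
        std.put("false", "constant");
        scopes.push(std);                // level 0: encloses the whole program
        scopes.push(new HashMap<>());    // level 1: global declarations
    }

    // Returns false only if id is already declared at the current level.
    boolean insert(String id, String attr) {
        return scopes.peek().putIfAbsent(id, attr) == null;
    }

    // Innermost declaration wins; level 0 is searched last.
    String lookup(String id) {
        for (Map<String, String> scope : scopes) {
            String attr = scope.get(id);
            if (attr != null) {
                return attr;
            }
        }
        return null;
    }
}

class StdEnvDemo {
    public static void main(String[] args) {
        StdEnvTable table = new StdEnvTable();
        System.out.println(table.lookup("Integer"));          // predefined
        System.out.println(table.insert("true", "variable")); // allowed at level 1
        System.out.println(table.lookup("true"));             // the redeclaration wins
    }
}
```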

Chart 24

Structure of a Compiler

[Figure: the compiler pipeline diagram from Chart 5, repeated. The Semantic Analyzer is again expanded into Identification and Type checking; the following charts cover type checking.]

Chart 25

Type Checking

The second task of the contextual analyzer is to ensure that the source program contains no type errors

Once an applied occurrence of an identifier has been identified, the contextual analyzer checks that the identifier is used in a way consistent with its declaration

Chart 26

Type Checking

In a statically typed language, the compiler can detect any type errors without actually running the program
– For every expression E in the language, the compiler can infer either that E has some type T or that E is ill-typed
  If E does have type T, then E will always yield a value of type T
  If a value of type T’ is expected, then the compiler checks that T’ is equivalent to T

Chart 27

Type Checking

Infers the type of each expression bottom-up
– Starts with literals and identifiers, and works up through larger and larger subexpressions
– Literal: the type of a literal is immediately known
– Identifier: the type of an applied occurrence of identifier I is obtained from the corresponding declaration of I
– Unary operator application: consider “O E”, where O is a unary operator of type T1 → T2
  The type checker ensures that E’s type is equivalent to T1
  It then infers that the type of “O E” is T2
  Otherwise there is a type error
– Binary operator application: consider “E1 O E2”, where O is a binary operator of type T1 × T2 → T3
  The type checker ensures that E1’s type is equivalent to T1
  and that E2’s type is equivalent to T2
  It then infers that “E1 O E2” is of type T3
  Otherwise there is a type error
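The literal, unary, and binary rules above can be sketched as a small recursive checker in Java. This is a hedged illustration only: the names `Ty`, `Expr`, `IntLit`, `Unary`, and `Binary` are invented for the example, type equivalence is reduced to enum equality, and identifiers are omitted (they would obtain their type from the identification table).

```java
// Hypothetical miniature expression checker; ERROR marks an ill-typed phrase.
enum Ty { INT, BOOL, ERROR }

interface Expr {
    Ty check(); // infers the type of this expression, bottom-up
}

// Literal: the type is immediately known.
class IntLit implements Expr {
    public Ty check() {
        return Ty.INT;
    }
}

// "O E" where O has type t1 -> t2.
class Unary implements Expr {
    private final Ty t1, t2;
    private final Expr operand;

    Unary(Ty t1, Ty t2, Expr operand) {
        this.t1 = t1;
        this.t2 = t2;
        this.operand = operand;
    }

    public Ty check() {
        return operand.check() == t1 ? t2 : Ty.ERROR;
    }
}

// "E1 O E2" where O has type t1 x t2 -> t3.
class Binary implements Expr {
    private final Ty t1, t2, t3;
    private final Expr left, right;

    Binary(Ty t1, Ty t2, Ty t3, Expr left, Expr right) {
        this.t1 = t1;
        this.t2 = t2;
        this.t3 = t3;
        this.left = left;
        this.right = right;
    }

    public Ty check() {
        Ty l = left.check();  // subexpressions are checked first
        Ty r = right.check();
        return (l == t1 && r == t2) ? t3 : Ty.ERROR;
    }
}

class InferDemo {
    public static void main(String[] args) {
        // "1 < 2" with < : int x int -> bool
        Expr less = new Binary(Ty.INT, Ty.INT, Ty.BOOL, new IntLit(), new IntLit());
        // "not (1 < 2)" with not : bool -> bool
        Expr negated = new Unary(Ty.BOOL, Ty.BOOL, less);
        // "not 1" is ill-typed
        Expr bad = new Unary(Ty.BOOL, Ty.BOOL, new IntLit());
        System.out.println(less.check());    // BOOL
        System.out.println(negated.check()); // BOOL
        System.out.println(bad.check());     // ERROR
    }
}
```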

Chart 28

Type Checking

Type of a nontrivial expression is inferred from the types of its sub-expressions, using the appropriate type rules

Must be able to test if two given types T and T’ are equivalent

Chart 29

Type Checking – Constant or Variable Identifier

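[Figure: AST fragments before and after checking. A ConstDeclaration of identifier x whose expression has type T, and an applied occurrence of x as a SimpleVname. After checking, the SimpleVname is decorated with type :T, taken from the declaration.]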

Chart 30

Type Checking – Variable Declaration

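[Figure: AST fragments before and after checking. A VarDeclaration of identifier x with type T, and an applied occurrence of x as a SimpleVname. After checking, the SimpleVname is decorated with type :T, taken from the declaration.]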

Chart 31

Type Checking – Binary Operator

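[Figure: AST fragments before and after checking. A BinaryExpression with operator < whose two operand subexpressions have each been decorated with type :int. After checking, the BinaryExpression itself is decorated with type :bool.]

< is of type int × int → bool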

Chart 32

Type Checking

Each applied occurrence of an identifier must be identified before type checking can proceed
– + is of type int × int → int
– * is of type float × float → float

Chart 33

Type Checking

Different classes of phrase are checked differently
– Checking a command C determines whether C is well-formed
– Checking an expression E determines whether E is well-formed, and infers the type of E
– Checking a declaration D determines whether D is well-formed, and makes entries in the identification table for the identifiers declared in D
– Checking an assignment command V := E
  Check V to determine its type and ensure that it is a variable
  Check E to determine its type
  Test whether the two types are compatible
– Checking a block command let D in C
  Open an inner scope
  Check D
  Check C
  Close the inner scope

The visitor methods in Triangle do the contextual analysis
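The steps listed for assignment and block commands can be sketched in Java as follows. This is not the Triangle checker: `CommandChecker`, its methods, the error messages, and the use of strings for types are all assumptions made to keep the example short, and the type of E is passed in as if it had already been inferred.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical checker fragment; types are plain strings for brevity.
class CommandChecker {
    static final class Entry {
        final String type;
        final boolean isVariable;

        Entry(String type, boolean isVariable) {
            this.type = type;
            this.isVariable = isVariable;
        }
    }

    private final Deque<Map<String, Entry>> scopes = new ArrayDeque<>();
    final List<String> errors = new ArrayList<>();

    CommandChecker() {
        scopes.push(new HashMap<>());
    }

    // Checking a declaration makes an entry in the identification table.
    void declare(String id, String type, boolean isVariable) {
        if (scopes.peek().putIfAbsent(id, new Entry(type, isVariable)) != null) {
            errors.add(id + " is already declared");
        }
    }

    private Entry lookup(String id) {
        for (Map<String, Entry> scope : scopes) {
            Entry e = scope.get(id);
            if (e != null) {
                return e;
            }
        }
        return null;
    }

    // V := E, where exprType is the type already inferred for E.
    void checkAssign(String v, String exprType) {
        Entry e = lookup(v);
        if (e == null) {
            errors.add(v + " is not declared");
        } else if (!e.isVariable) {
            errors.add(v + " is not a variable");
        } else if (!e.type.equals(exprType)) {
            errors.add("type mismatch in assignment to " + v);
        }
    }

    // let D in C: open an inner scope, check D, check C, close the scope.
    void checkLet(Runnable checkD, Runnable checkC) {
        scopes.push(new HashMap<>());
        checkD.run();
        checkC.run();
        scopes.pop();
    }
}

class CommandDemo {
    public static void main(String[] args) {
        CommandChecker c = new CommandChecker();
        c.checkLet(
            () -> { c.declare("n", "int", true); c.declare("k", "int", false); },
            () -> { c.checkAssign("n", "int"); c.checkAssign("k", "int"); c.checkAssign("n", "bool"); });
        c.checkAssign("n", "int"); // n is out of scope here
        c.errors.forEach(System.out::println);
    }
}
```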

Chart 34

Type Checking in Triangle -- while

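// Checks "while E do C": E must be of type Boolean, then C is checked.
public Object visitWhileCommand(WhileCommand ast, Object o) {
  // Visiting the expression infers and returns its type.
  TypeDenoter eType = (TypeDenoter) ast.E.visit(this, null);
  if (! eType.equals(StdEnvironment.booleanType))
    reporter.reportError("Boolean expression expected here", "", ast.E.position);
  // Check the loop body; a command yields no type.
  ast.C.visit(this, null);
  return null;
}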
