Translation

code -> translation -> converted code

In compilers, this is the task of creating an executable from source code.

How is it done?

So far, we have analysed the code, identifying the "words" and the syntactical form.

Together, these help us understand the meaning of the code, and so we will use the structure we have identified to create the target code.

Postfix Notation

Postfix notation is a method for writing expressions which is unambiguous, and corresponds to the processing order we use in bottom-up parsing.

a+b is written as ab+
a*b is written as ab*
: :

In general, we define it recursively:

postfix for E1 <op> E2 is (postfix for E1)(postfix for E2)<op>

and

postfix for (E) is (postfix for E)
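The recursive definition can be sketched directly in code. The example below is a minimal Python sketch, assuming a tree encoding that is not part of the original notes: an operand is a string and an operator application is a tuple (op, left, right). Brackets need no node of their own, since the postfix for (E) is simply the postfix for E.

```python
def postfix(expr):
    """Return the postfix string for an expression tree."""
    if isinstance(expr, str):          # a single operand such as "a"
        return expr
    op, left, right = expr             # E1 <op> E2
    return postfix(left) + postfix(right) + op


# Tree for a+(b*c)*(b*(a+b)), with * binding more tightly than +.
example = ("+", "a", ("*", ("*", "b", "c"), ("*", "b", ("+", "a", "b"))))
print(postfix(example))                # abc*bab+**+
```

The function mirrors the two cases of the definition: operands are emitted as they stand, and an operator is emitted only after both of its operands.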

Postfix example

We can write a+(b*c)*(b*(a+b)) as

abc*bab+**+

To read a postfix expression, start from the left and move right. When we reach an operator, we take the correct number of the operands we have most recently recognised, and combine them to get a new expression.

Above, b and c are the operands of the first *, a and b are the operands of the first +, etc.

Labelling the operators, we get:

a +₁ (b *₁ c) *₂ (b *₃ (a +₂ b))

which translates to

a b c *₁ b a b +₂ *₃ *₂ +₁
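The left-to-right reading procedure amounts to keeping a stack of the operands recognised so far. A rough Python sketch, assuming single-character operands and only the binary operators + and * (the name read_postfix is made up for illustration):

```python
def read_postfix(postfix, operators="+*"):
    """Rebuild a fully bracketed infix string from a postfix string."""
    stack = []
    for symbol in postfix:
        if symbol in operators:
            right = stack.pop()        # most recently recognised operand
            left = stack.pop()         # the operand recognised before it
            stack.append("(" + left + symbol + right + ")")
        else:
            stack.append(symbol)       # an operand: remember it
    if len(stack) != 1:
        raise ValueError("not a well-formed postfix expression")
    return stack[0]


print(read_postfix("abc*bab+**+"))     # (a+((b*c)*(b*(a+b))))
```

Each operator consumes exactly the operands on top of the stack, which is why no brackets or precedence rules are needed in the postfix form.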

Translating while parsing

1) S -> S + T      print("+")
2) S -> T
3) T -> T * F      print("*")
4) T -> F
5) F -> ( S )
6) F -> a          print(a)

Parsing a+a*a+a produces the following sequence:

a+a*a+a <=6 F+a*a+a <= T+a*a+a <= S+a*a+a <=6
S+F*a+a <= S+T*a+a <=6 S+T*F+a <=3 S+T+a <=1
S+a <=6 S+F <= S+T <=1 S

The order in which the productions with actions were applied is 6, 6, 6, 3, 1, 6, 1, which causes the output of

aaa*+a+
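One way to see where the output comes from is to replay the reductions and fire the action attached to each rule. This is only a sketch: the ACTIONS table and the translate function are illustrative names, and the list of reductions is written out by hand (including the action-free rules 2 and 4) rather than produced by a real parser.

```python
# Actions attached to the productions; rules 2, 4 and 5 have none.
ACTIONS = {1: "+", 3: "*", 6: "a"}


def translate(reductions):
    """Return the text printed when the reductions are applied in order."""
    output = []
    for rule in reductions:
        if rule in ACTIONS:
            output.append(ACTIONS[rule])
    return "".join(output)


# Full reduction sequence for a+a*a+a in bottom-up order.
reductions = [6, 4, 2, 6, 4, 6, 3, 1, 6, 4, 1]
print(translate(reductions))
```

The printed string is the postfix form of the input, because each operator's action fires only after both of its operands have been reduced.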

The Value Stack

All we are able to do using the previous method is execute an action whenever a rule is used. We can't store up actions for future use.

However, we can extend the idea, by associating values with each symbol on the stack. The actions we then carry out can use those values that have been stored.

Suppose that we are about to reduce by

A -> x1 x2 ... xn

This means that the rightmost symbols on the symbol stack are:

x1 x2 ... xn

Call the values associated with those symbols $1, $2, ..., $n

When we carry out the reduction, we remove those symbols, and replace them by A.

Changing the Value Stack

Remove the top n values from the value stack, and replace them by a new value for A, which we will call $$. The value we want to store for A will depend on the values we stored for the xi.

That is, $$ = f($1, $2, ..., $n), for some function f.
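As a small illustration of this step, the sketch below pops n values and pushes the result of f. It is a Python sketch under assumptions, not Yacc's actual implementation: the value stack is a plain list, and the helper name reduce is chosen here for illustration.

```python
def reduce(value_stack, n, f):
    """Pop the values $1..$n, compute $$ = f($1, ..., $n) and push it."""
    start = len(value_stack) - n
    args = value_stack[start:]         # $1 is the deepest of the n values
    del value_stack[start:]
    value_stack.append(f(*args))


# Reducing by S -> S + T with $$ = $1 + $3.
# None is a placeholder value for the "+" symbol, which carries no value.
values = [1, None, 6]
reduce(values, 3, lambda s, plus, t: s + t)
print(values)                          # [7]
```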

The only other case we need to consider is when we place a terminal on the stack. Where do we get its value?

Generally, we expect the lexical analyser to find the value for us.

This means that in your Lex script, every time you recognise an integer or a real, you must translate it into a number of the appropriate form.

Computing the value of expressions

1) S -> S + T      $$ := $1 + $3
2) S -> T          $$ := $1
3) T -> T * F      $$ := $1 * $3
4) T -> F          $$ := $1
5) F -> ( S )      $$ := $2
6) F -> 1          $$ := 1
7) F -> 2          $$ := 2
8) F -> 3          $$ := 3

Note: this is a simplification.

In practice, we would have 6) F -> a, and expect the lexical analyser to return the different integer values.

Parsing the expressions

Symbol     Values     Stack            Input      Action
(empty)    (empty)    0                1+2*3#     S5
1          1          0 5              +2*3#      R6
F          1          0 3              +2*3#      R4
T          1          0 2              +2*3#      R2
S          1          0 1              +2*3#      S6
S+         1•         0 1 6            2*3#       S5
S+2        1•2        0 1 6 5          *3#        R6
S+F        1•2        0 1 6 3          *3#        R4
S+T        1•2        0 1 6 9          *3#        S7
S+T*       1•2•       0 1 6 9 7        3#         S5
S+T*3      1•2•3      0 1 6 9 7 5      #          R6
S+T*F      1•2•3      0 1 6 9 7 10     #          R3
S+T        1•6        0 1 6 9          #          R1
S          7          0 1              #          A
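The Values column of the trace can be reproduced by replaying the Action column against a value stack. The sketch below is a Python illustration under assumptions: the action list is copied from the trace rather than computed from a parse table, R6 is taken to mean the practical rule F -> a whose value comes from the lexical analyser, and None stands for the • placeholder of symbols that carry no value.

```python
# (number of symbols popped, semantic function) for each rule.
RULES = {
    1: (3, lambda s, _op, t: s + t),    # S -> S + T
    2: (1, lambda t: t),                # S -> T
    3: (3, lambda t, _op, f: t * f),    # T -> T * F
    4: (1, lambda f: f),                # T -> F
    5: (3, lambda _ob, s, _cb: s),      # F -> ( S )
    6: (1, lambda a: a),                # F -> a
}


def run(token_values, actions):
    """Replay shift/reduce actions; return the result and value snapshots."""
    pending = list(token_values)
    values, snapshots = [], []
    for action in actions:
        if action == "A":                        # accept
            break
        if action[0] == "S":                     # shift: push the token's value
            values.append(pending.pop(0))
        else:                                    # reduce by rule number
            n, f = RULES[int(action[1:])]
            args = values[-n:]
            del values[-n:]
            values.append(f(*args))
        snapshots.append(list(values))
    return values[0], snapshots


actions = "S5 R6 R4 R2 S6 S5 R6 R4 S7 S5 R6 R3 R1 A".split()
result, snapshots = run([1, None, 2, None, 3], actions)
print(result)                                    # 7
```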

Value Stack in Lex

Lex must place the values in yylval

1. digit string - compute the value, place in yylval

2. char string - copy to a string array, place the index of its start point in yylval

3. real string - convert to a floating point, store in an array of reals, place the index of its start point in yylval

4. identifier - store as for strings
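The four cases can be mimicked outside Lex to show what ends up in yylval. The following Python sketch is an analogy, not Lex code: the token names REAL_T and ID_T are invented here (only INT_T appears in these notes), the side tables are Python lists rather than character arrays, so the stored index is a list position rather than a start offset, and quoted character strings are left out for brevity.

```python
import re

# Order matters: try a real before an integer so "3.5" is not split.
TOKEN = re.compile(r"\s*(?:(\d+\.\d+)|(\d+)|([A-Za-z_]\w*))")


def tokens(text, reals, names):
    """Yield (token type, yylval) pairs; reals and names are side tables."""
    text = text.rstrip()
    pos = 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at position {pos}")
        pos = match.end()
        real, integer, ident = match.groups()
        if integer is not None:
            yield ("INT_T", int(integer))        # the value itself
        elif real is not None:
            reals.append(float(real))
            yield ("REAL_T", len(reals) - 1)     # index into the reals table
        else:
            names.append(ident)
            yield ("ID_T", len(names) - 1)       # index into the names table


real_table, name_table = [], []
print(list(tokens("12 3.5 x", real_table, name_table)))
```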

Lex and yylval

%{
#include "y.tab.h"
#include <stdlib.h>
extern int yylval;
%}

%%
[0-9]+    { yylval = atoi(yytext);   /* convert string to integer */
            return INT_T; }

[ \t]     ;   /* ignore space */

          /* further rules omitted */
%%

Value Stack in Yacc

Yacc allows an action after each production. The action will be performed immediately before the reduction.

Values are represented using the $$ and $i notation.

When the statement is reached by Yacc, it will translate the different $i's into their appropriate types.

Using Yacc's Value Stack

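%%
Finish : Expr                  { printf("%d", $1); }
       ;

Expr   : Expr PLUS_T Term      { $$ = $1 + $3; }
       | Term                  /* default action: $$ = $1 */
       ;

Term   : Term MUL_T Factor     { $$ = $1 * $3; }
       | Factor                /* default action: $$ = $1 */
       ;

Factor : OB_T Expr CB_T        { $$ = $2; }
       | INT_T                 /* default action: $$ = $1 */
       ;
%%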

Syntax-Directed Translation

Yacc allows us to use the value stack.

However, this method only allows us to associate a single value with each symbol.

We may want to record more information:

data types

places in the symbol table

code fragments

We will extend the idea of the value stack by associating multiple values with symbols.

Attributes

With each symbol in the grammar, associate a set of attributes.

The attributes can be of any type, and represent any information we can express.

With each production in the grammar, associate a set of semantic rules, determining how the values of the attributes are to be computed.

The computation can modify the values of the attributes, or can have side-effects, modifying some external structure (e.g. the symbol table), or can output results to the screen or to a file.

Formal attribute definition

p) A -> α is a grammar rule.

p) has associated with it a set of semantic functions of the form

b := f(c1, c2, ..., cn)

where b, c1, c2, ..., cn are attributes of any symbol appearing in p).

If b is an attribute of A, then b is a synthesised attribute.

If b is an attribute of one of the symbols in α, then b is an inherited attribute.

Syntax-directed Definition: Example

1) S -> E            print(E.val)
2) E1 -> E2 + T      E1.val := E2.val + T.val
3) E -> T            E.val := T.val
4) T1 -> T2 * F      T1.val := T2.val * F.val
5) T -> F            T.val := F.val
6) F -> ( E )        F.val := E.val
7) F -> digit        F.val := digit.lexval

Synthesised Attributes

The value of a synthesised attribute either comes from the child nodes, or from the properties of the symbol itself.

As soon as a symbol is recognised in bottom-up parsing, the values of its synthesised attributes can be obtained.

Thus, if a derivation of a string uses only symbols with synthesised attributes, we can evaluate all the attributes as we carry out the parse.

A syntax-directed definition which uses only synthesised attributes is called an S-attributed definition.

Annotated Parse Tree for 6 + 2 * 3

S                       (prints 12)
  E                     val = 12
    E                   val = 6
      T                 val = 6
        F               val = 6
          6             lexval = 6
    +
    T                   val = 6
      T                 val = 2
        F               val = 2
          2             lexval = 2
      *
      F                 val = 3
        3               lexval = 3
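The annotation of the tree can be sketched as a bottom-up walk that computes val for the children before the parent. This Python sketch assumes a made-up parse-tree encoding: a nonterminal is a tuple (symbol, child, ...), a digit token is ("digit", lexval), and operators and brackets are plain strings.

```python
def val(node):
    """Compute the synthesised attribute val, children first."""
    symbol, *children = node
    if symbol == "digit":
        return children[0]                  # F.val := digit.lexval
    if len(children) == 1:                  # E -> T, T -> F, F -> digit
        return val(children[0])
    left, middle, right = children
    if middle == "+":                       # E1 -> E2 + T
        return val(left) + val(right)
    if middle == "*":                       # T1 -> T2 * F
        return val(left) * val(right)
    return val(middle)                      # F -> ( E )


# Parse tree for 6 + 2 * 3.
six = ("E", ("T", ("F", ("digit", 6))))
two = ("T", ("F", ("digit", 2)))
three = ("F", ("digit", 3))
tree = ("E", six, "+", ("T", two, "*", three))
print(val(tree))                            # 12, as printed by S -> E
```

Because every rule only reads attributes of child nodes, a single post-order walk (or the reductions of a bottom-up parse) is enough to evaluate the whole tree.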

Inherited Attributes

An inherited attribute has its value determined by the attribute values of its parent or siblings.

Inherited attributes are useful for describing the way in which the meaning of a symbol depends upon the context in which it appears.

For example, the meaning of the identifier "num" is different in the two cases below:

real num;

int num;

Thus a "type" attribute cannot be determined from the symbol alone, but must be derived from the attribute of parent or sibling symbols.

Inherited Attribute Example

1) D -> T L           L.t := T.t
2) T -> int           T.t := integer
3) T -> real          T.t := real
4) L1 -> L2 , id      L2.t := L1.t, addtype(id.entry, L1.t)
5) L -> id            addtype(id.entry, L.t)

Augmented Parse Tree for real id1 , id2 , id3

D
  T                     t = real
    real
  L                     t = real
    L                   t = real
      L                 t = real
        id1             entry, addentry(...)
      ,
      id2               entry, addentry(...)
    ,
    id3                 entry, addentry(...)
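The downward flow of the inherited attribute t can be sketched as a recursive walk that passes t to each L node as a parameter. This is a Python sketch under assumptions: an L node is encoded as a pair (inner L or None, id entry), the symbol table is a dict, and addtype simply records the type against the entry.

```python
def addtype(symtab, entry, t):
    """Record type t against a symbol-table entry (here a dict key)."""
    symtab[entry] = t


def visit_L(l_node, t, symtab):
    """Process an L node; t is its inherited attribute L.t."""
    inner, entry = l_node
    if inner is not None:                   # L1 -> L2 , id
        visit_L(inner, t, symtab)           # L2.t := L1.t
    addtype(symtab, entry, t)               # addtype(id.entry, L.t)


def visit_D(type_keyword, l_node, symtab):
    """Process D -> T L."""
    t = {"int": "integer", "real": "real"}[type_keyword]   # T.t
    visit_L(l_node, t, symtab)                             # L.t := T.t


# real id1 , id2 , id3
symtab = {}
visit_D("real", (((None, "id1"), "id2"), "id3"), symtab)
print(symtab)
```

The type is known only at the T node, so each identifier receives it from above rather than computing it from its own children.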

Information Flow

[Figure: the augmented parse tree for real id1 , id2 , id3, showing the flow of the t and entry values into the addentry(...) calls.]

Dependency Graph

The augmented parse tree on the previous slide is called a dependency graph.

We use dependency graphs to determine the order in which we must evaluate the attributes to get a completely evaluated parse tree.

A topological sort of the graph is an ordering of its attributes in which every attribute comes after the attributes it depends on, and so is a valid order in which to evaluate the attributes.

Topological Sort

[Figure: the dependency graph for real id1 , id2 , id3, with its nodes numbered 1 to 10 in a valid evaluation order.]
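A topological sort of a dependency graph can be sketched with a depth-first walk that emits a node only after everything it depends on. The Python sketch below assumes the graph is acyclic and uses node labels invented for this illustration (L1 is the topmost L node, add(...) stands for the call that records an identifier's type).

```python
def topological_order(depends_on):
    """Order nodes so that each one follows everything it depends on.

    depends_on maps a node to the nodes whose values it needs.
    Assumes the graph has no cycles.
    """
    order, visited = [], set()

    def visit(node):
        if node in visited:
            return
        visited.add(node)
        for dependency in depends_on.get(node, ()):
            visit(dependency)               # evaluate what we need first
        order.append(node)

    for node in depends_on:
        visit(node)
    return order


# Dependencies for "real id1 , id2 , id3".
graph = {
    "L1.t": ["T.t"],
    "L2.t": ["L1.t"],
    "L3.t": ["L2.t"],
    "add(id3)": ["L1.t", "id3.entry"],
    "add(id2)": ["L2.t", "id2.entry"],
    "add(id1)": ["L3.t", "id1.entry"],
}
order = topological_order(graph)
print(order)
```

Many different orders are valid for the same graph; this walk produces one of them, covering the ten attribute and action nodes of the example.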

Evaluation methods

parse-tree based: At compile time, construct a parse tree, then a dependency graph, then a topological sort. Evaluate the attributes in that order.

rule based: When the compiler is constructed, analyse the rules for dependencies between attributes, and fix the order of evaluation before compilation begins.

oblivious: Use a fixed evaluation order without analysing the dependencies. This limits the class of grammars that can be implemented.

Syntax Trees

A syntax tree is a condensed parse tree, where the operators and keywords do not appear as leaves, but are associated with the parent nodes that would have been their parents in the parse tree.

Example:

S => if B then S1 else S2

has the syntax tree:

if-then-else
  B
  S1
  S2

For 6 + 2 * 3, the parse tree

E
  E
    T
      F
        6
  +
  T
    T
      F
        2
    *
    F
      3

condenses to the syntax tree

+
  6
  *
    2
    3

Using Syntax Trees

A syntax tree allows the translation process to be separated from the parsing process.

A grammar that is best for parsing might not explicitly represent the hierarchical nature of the programs it describes.

The parsing method imposes an order in which the nodes are considered, which might not be the best order for translation.

Constructing Syntax Trees

We can use a syntax-directed definition to create syntax trees in a similar way to the way we created postfix expressions.

We will represent each node as a simple data structure.

Operator structures will have a name and a number of fields containing pointers to each operand.

Simple operand structures will have a type and a value.

E.g. 2+3 will be represented by:

+
  num 2
  num 3

Functions

We require the following three functions:

mknode(op,left,right): creates an internal node for the operator "op", with two fields for pointers to the left and right operands.

mkleaf_id(id,entry): creates a leaf node for the identifier "id", and a field for a pointer to the symbol table entry for "id".

mkleaf_num(num,val): creates a leaf node, labelled "num", with a field for the value of the number.

Each function returns a pointer to the node just created.

Example Definition

1) E1 -> E2 + T      E1.ptr := mknode("+", E2.ptr, T.ptr)
2) E -> T            E.ptr := T.ptr
3) T1 -> T2 * F      T1.ptr := mknode("*", T2.ptr, F.ptr)
4) T -> F            T.ptr := F.ptr
5) F -> ( E )        F.ptr := E.ptr
6) F -> id           F.ptr := mkleaf_id(id, id.entry)
7) F -> num          F.ptr := mkleaf_num(num, num.val)

Constructing 6+2*x

[Figure: the parse tree for 6+2*x, in which every E, T and F node carries a ptr attribute pointing to a node of the syntax tree.]

Resulting syntax tree:

+
  num 6
  *
    num 2
    id (entry for x)
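The three constructor functions and the construction of 6+2*x can be sketched as follows. This is a Python illustration under assumptions: nodes are small dataclass records rather than C structures, a Python object reference plays the role of a pointer, the string "x" stands in for the symbol table entry, and show is a helper added only to inspect the result.

```python
from dataclasses import dataclass


@dataclass
class Node:
    label: str          # operator name, "id" or "num"
    fields: tuple       # operand pointers, or the leaf's entry or value


def mknode(op, left, right):
    return Node(op, (left, right))


def mkleaf_id(label, entry):
    return Node(label, (entry,))


def mkleaf_num(label, val):
    return Node(label, (val,))


def show(node):
    """Render a tree as a nested prefix string, for inspection only."""
    parts = [show(f) if isinstance(f, Node) else str(f) for f in node.fields]
    return "(" + " ".join([node.label] + parts) + ")"


# 6+2*x, built in the order a bottom-up parse would reduce it.
p1 = mkleaf_num("num", 6)           # F -> num
p2 = mkleaf_num("num", 2)           # F -> num
p3 = mkleaf_id("id", "x")           # F -> id
p4 = mknode("*", p2, p3)            # T1 -> T2 * F
p5 = mknode("+", p1, p4)            # E1 -> E2 + T
print(show(p5))                     # (+ (num 6) (* (num 2) (id x)))
```

The copy rules (E -> T, T -> F, F -> ( E )) create no nodes; they only pass the pointer upwards, which is why the syntax tree is smaller than the parse tree.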

Compound Statements

CStat -> Stat ; CStat

CStat -> Stat

Stat -> s

Parse Tree for s ; s ; s ; s

CStat
  Stat
    s
  ;
  CStat
    Stat
      s
    ;
    CStat
      Stat
        s
      ;
      CStat
        Stat
          s

CStat1 -> Stat ; CStat2      CStat1.ptr := mknode(";", Stat.ptr, CStat2.ptr)
CStat -> Stat                CStat.ptr := Stat.ptr
Stat -> s                    Stat.ptr := mkleaf_id(id, s)

Syntax tree:

;
  s
  ;
    s
    ;
      s
      s

Seq -> CStat                 Seq.ptr := CStat.ptr;
CStat1 -> Stat ; CStat2      CStat1.ptr := CStat2.ptr; addChild(CStat1.ptr, Stat.ptr)
CStat -> Stat                CStat.ptr := mkXnode(Stat.ptr)
Stat -> s                    Stat.ptr := mkleaf_id(id, s)

Syntax tree:

seq
  s
  s
  s
  s

Each s leaf is an id node of the form: id ...(s)...

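Seq -> CStat                 Seq.ptr := CStat.ptr;
CStat1 -> Stat ; CStat2      CStat1.ptr := Stat.ptr; addSib(Stat.ptr, CStat2.ptr)
CStat -> Stat                CStat.ptr := Stat.ptr
Stat -> s                    Stat.ptr := mkleaf_id(id, s)

Syntax tree:

seq
  s -> s -> s -> s

Each s leaf is an id node with an extra sibling field: id ...(s)... sibling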
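For the variant that gathers all statements under a single seq node (the addChild rules), the sketch below applies the semantic rules in the order a bottom-up parse would. It is a Python sketch with assumed behaviour: mkXnode is taken to create a seq node holding one statement, and addChild is taken to insert at the front, since the right-recursive grammar completes the later statements first. Neither behaviour is spelled out in the notes.

```python
class Seq:
    """A sequence node holding any number of statement children."""

    def __init__(self, first):
        self.children = [first]


def mkXnode(stat_ptr):
    return Seq(stat_ptr)


def addChild(seq_ptr, stat_ptr):
    # Assumption: new children go at the front, so that the statements
    # end up in source order.
    seq_ptr.children.insert(0, stat_ptr)


def build(stats):
    """Apply the rules for CStat -> Stat ; CStat and CStat -> Stat."""
    if len(stats) == 1:                 # CStat -> Stat
        return mkXnode(stats[0])
    ptr = build(stats[1:])              # CStat1.ptr := CStat2.ptr
    addChild(ptr, stats[0])             # addChild(CStat1.ptr, Stat.ptr)
    return ptr


seq = build(["s1", "s2", "s3", "s4"])
print(seq.children)                     # ['s1', 's2', 's3', 's4']
```

Compared with the nested ";" nodes, this flat form lets a later pass iterate over the statements without recursing through a chain of binary nodes.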

Sample Program

int a b c;
int g[5];

int testFunc(int x) {
    real y;

    y := (x+a)/2;
    print(y);
    return a;
}

main() {
    a := 1;
    while (a < 3) do {
        testFunc(a);
        a := a + 1;
    }
}
