8/7/2019 Lec12-
Compilers
In this subject, we concentrate on two aims:
1- To present a general model of a compiler that may be used as a basis
for designing and studying compilers.
2- To evaluate the difficulty and cost of implementing and using
particular features of languages.
To accomplish this, the topic is divided into three parts.
PART1: presents a simple example and introduces a general
model of a compiler.
PART2: studies the model and explains its inner workings in
detail.
PART3: uses the model to demonstrate the implementation of
advanced features, e.g., data structures, storage allocation,
block structure, and pointers.
A compiler accepts a program written in a higher level language as
input and produces its machine language equivalent as output.
In Part 1, we examine a simple PL/1 program and become familiar with
the issues we must face in trying to compile it.
WCM: PROCEDURE (RATE, START, FINISH);
     DECLARE (COST, RATE, START, FINISH) FIXED BINARY (31) STATIC;
     COST = RATE * (START - FINISH) + 2 * RATE * (START - FINISH - 100);
     RETURN (COST);
END;
Figure 8.1 MINI-PL/1 program example
What must the compiler do in order to produce the machine language
equivalent of WCM?
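The compiler must solve four problems:
1- Recognizing Basic Elements: recognize certain strings as basic
elements, e.g., that COST is a variable, WCM is a label, and = is an
operator.
2- Recognizing Units and Interpreting Meaning: recognize combinations of
elements as syntactic units and interpret their meaning, e.g., that the
first statement is a procedure heading with three arguments, that the next
statement declares four variables as FIXED BINARY numbers of 31 bits, that
the assignment statement requires seven computations, and that the return
statement returns one value, COST.
3- Storage Allocation: reserve storage locations for the variables.
4- Code Generation: generate the appropriate object code.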
Problem No. 1- Recognizing Basic Elements
The action of parsing the source program into the proper syntactic
classes is known as lexical analysis. The program is scanned and separated
as shown in Figure 8.2.
Lexical analysis is essentially string processing: the source program is
split into basic elements, or tokens, such as operators.
This lexical process can be done in one continuous pass through the
data by creating an intermediate form of the program consisting of a chain or
table of tokens.
Some compilers reduce the size of the token table by only parsing
tokens as necessary, and discarding those that are no longer needed. The
lexical phase also discards comments since they have no effect on the
processing of the program.
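The single pass described above can be sketched in Python. This is only an illustration: the class names IDN, LIT, TRM and KEY, the keyword list, and the function name tokenize are assumptions made for the sketch, not the classification used in Figure 8.2.

```python
import re

# Assumed keyword list for this sketch; the lecture's figure may classify differently.
KEYWORDS = {"PROCEDURE", "DECLARE", "FIXED", "BINARY", "STATIC", "RETURN", "END"}

# One alternative per assumed token class: identifier, literal, terminal symbol.
TOKEN_RE = re.compile(
    r"\s*(?:(?P<IDN>[A-Za-z][A-Za-z0-9_]*)|(?P<LIT>\d+)|(?P<TRM>[-+*/=():;,]))"
)

def tokenize(source):
    """Scan the source once, left to right, and return a table of (class, text) pairs."""
    # Comments have no effect on processing, so they are discarded first.
    source = re.sub(r"/\*.*?\*/", " ", source, flags=re.S).rstrip()
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at position {pos}")
        kind = match.lastgroup
        text = match.group(kind)
        if kind == "IDN" and text in KEYWORDS:
            kind = "KEY"  # keywords are identifiers with a reserved spelling
        tokens.append((kind, text))
        pos = match.end()
    return tokens

tokens = tokenize("WCM: PROCEDURE (RATE); /* header */ COST = RATE * 2;")
for kind, text in tokens:
    print(kind, text)
```

The token table is a plain list here; a compiler that parses tokens only as needed could turn the loop into a generator instead.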
Notice that the uniform symbol is the same length whether the token is
1 or 31 characters long.
Problem No. 2- Recognizing Syntactic Units and Interpreting Meaning
Once the program has been broken down into tokens or uniform
symbols, the compiler must (1) recognize the phrases, where each phrase is
a semantic entity: a string of tokens that has an associated meaning, and
(2) interpret the meaning of the constructions.
The program is scanned and separated as shown in Figure 8.4. There
are many ways of operationally recognizing the basic constructs and
interpreting their meaning.
Intermediate Form
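Once a syntactic construction has been recognized, the compiler could
generate object code for it directly, but it usually creates an
intermediate form first. The form depends on the type of construction:
arithmetic, nonarithmetic, or nonexecutable statements.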
Arithmetic statements
One intermediate form of an arithmetic statement is a parse tree. Figure
8.5 describes this method. The rules for converting an arithmetic statement
into a parse tree are:
1- Any variable is a terminal node of the tree.
2- For every operator, construct a binary tree whose left branch is the tree
for operand 1 and whose right branch is the tree for operand 2.
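The two rules above can be sketched as a small recursive-descent parser in Python. The function name parse_expression, the nested-tuple tree layout (operator, left, right), and the usual precedence of * and / over + and - are assumptions made for this sketch; the lecture does not prescribe a parsing method.

```python
import re

def parse_expression(text):
    """Build a parse tree: leaves are strings, inner nodes are (operator, left, right)."""
    # Sketch only: characters that match none of the patterns are silently skipped.
    tokens = re.findall(r"[A-Za-z]\w*|\d+|[-+*/()]", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def primary():
        token = advance()
        if token == "(":
            node = additive()
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        if token is None or not token[0].isalnum():
            raise SyntaxError(f"expected an operand, found {token!r}")
        return token  # rule 1: a variable (or literal) is a terminal node

    def multiplicative():
        node = primary()
        while peek() in ("*", "/"):
            op = advance()
            node = (op, node, primary())  # rule 2: operand 1 on the left, operand 2 on the right
        return node

    def additive():
        node = multiplicative()
        while peek() in ("+", "-"):
            op = advance()
            node = (op, node, multiplicative())
        return node

    tree = additive()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree

tree = parse_expression("RATE * (START - FINISH) + 2 * RATE * (START - FINISH - 100)")
print(tree)
```

For the right-hand side of the example assignment, the resulting tree contains seven operator nodes, one per computation.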
Although the above technique makes it easy for us to visualize the
structure of the statement, it is not a practical method for a compiler.
Instead, the compiler may use a linear representation of the parse tree
called a matrix, as shown in Figure 8.6.
In the matrix, the intermediate results of the computations are labelled
M1 through M7.
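A minimal Python sketch of how a parse tree might be flattened into such a matrix follows. The function name to_matrix and the tuple layout (operator, operand 1, operand 2) are assumptions; the tree is written out by hand for the right-hand side of the example assignment, and the exact layout of Figure 8.6 may differ.

```python
def to_matrix(tree):
    """Flatten a parse tree into a list of (operator, operand1, operand2) entries.

    Entries are emitted in execution order; an operand is either a name or
    literal, or a reference "Mi" to the result of matrix entry number i.
    """
    matrix = []

    def emit(node):
        if isinstance(node, str):
            return node  # a terminal node needs no matrix entry
        op, left, right = node
        left_ref = emit(left)    # operands are computed before the operator
        right_ref = emit(right)
        matrix.append((op, left_ref, right_ref))
        return f"M{len(matrix)}"

    emit(tree)
    return matrix

# Hand-written tree for RATE * (START - FINISH) + 2 * RATE * (START - FINISH - 100)
expression = (
    "+",
    ("*", "RATE", ("-", "START", "FINISH")),
    ("*", ("*", "2", "RATE"), ("-", ("-", "START", "FINISH"), "100")),
)

matrix = to_matrix(expression)
for number, (op, operand1, operand2) in enumerate(matrix, start=1):
    print(f"M{number}: {op} {operand1} {operand2}")
```

The sketch covers only the expression; storing the final result in COST would add one more entry for the assignment operator.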
Nonarithmetic Statement
Nonarithmetic statements, such as DO, IF, GO TO, END, etc., can all be
replaced by a sequential ordering of individual matrix entries, as shown
in Fig. 8.7.
Nonexecutable Statements
These are statements such as DECLARE. In our example, the interpretation
phase would note the data type, precision, and storage class (FIXED
BINARY, 31 bits, STATIC) in the identifier table for each of the variables
COST, RATE, START, and FINISH (as shown in Fig. 8.8).
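As a rough illustration, the identifier table can be modelled in Python as a dictionary keyed by variable name. The field names, the function name note_declaration, and the regular expression (which only handles the FIXED BINARY form used in the example) are assumptions for this sketch, not the layout of Fig. 8.8.

```python
import re

# Matches only declarations shaped like the one in the example program.
DECLARE_RE = re.compile(
    r"DECLARE\s*\((?P<names>[^)]*)\)\s*(?P<type>FIXED\s+BINARY)\s*"
    r"\((?P<precision>\d+)\)\s*(?P<storage>\w+)\s*;"
)

def note_declaration(statement, identifier_table):
    """Record data type, precision, and storage class for each declared name."""
    match = DECLARE_RE.match(statement.strip())
    if match is None:
        raise ValueError("unsupported DECLARE statement in this sketch")
    for name in match.group("names").split(","):
        identifier_table[name.strip()] = {
            "data_type": " ".join(match.group("type").split()),
            "precision_bits": int(match.group("precision")),
            "storage_class": match.group("storage"),
            "address": None,  # filled in later, during storage allocation
        }
    return identifier_table

table = note_declaration(
    "DECLARE (COST, RATE, START, FINISH) FIXED BINARY (31) STATIC;", {}
)
for name, attributes in table.items():
    print(name, attributes)
```

No object code is produced for the statement; its only effect is on the table.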
Problem No. 3- Storage Allocation
At some point, the storage required by our program must be reserved. In
our example, the DECLARE statement gives us the necessary information.
The interpretation phase constructs the entries in the table of Fig. 8.8.
Each fixed binary number of 31 bits occupies 32 bits (4 bytes), one bit
being assigned for the sign, so the first variable is assigned to relative
location 0, the second to location 4, and so on.
Similarly, storage is assigned for the temporary locations that will
contain intermediate results of the matrix (e.g., M1, M2, M3, M4, M5, M6,
and M7).
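The address assignment can be sketched in a few lines of Python. The function name allocate is hypothetical, and placing the temporaries directly after the declared variables is an assumption of the sketch; the lecture only says that storage is assigned for them "similarly".

```python
def allocate(names, precision_bits=31, next_free=0):
    """Assign consecutive relative locations; return (addresses, next free location)."""
    # Value bits plus one sign bit, rounded up to whole bytes: 31 + 1 bits -> 4 bytes.
    size_bytes = (precision_bits + 1 + 7) // 8
    addresses = {}
    for name in names:
        addresses[name] = next_free
        next_free += size_bytes
    return addresses, next_free

# Declared variables first, in declaration order.
variables, next_free = allocate(["COST", "RATE", "START", "FINISH"])

# Assumption: temporaries M1..M7 follow the variables in the same area.
temporaries, next_free = allocate([f"M{i}" for i in range(1, 8)], next_free=next_free)

print(variables)
print(temporaries)
```

With these assumptions, COST is at relative location 0, RATE at 4, START at 8, and FINISH at 12, matching the assignment described above.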
Problem No. 4- Code Generation
Once the compiler has created the matrix and tables of supporting
information, it may generate the object code.
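To give a feel for this step, the Python sketch below walks a matrix and emits instructions for a hypothetical single-accumulator machine. The mnemonics LOAD, ADD, SUB, MUL, DIV and STORE, the function name generate, and the one-temporary-per-entry strategy are all assumptions made for illustration; they are not the object code of any real machine or of the lecture's target.

```python
# Hypothetical mnemonics for an imaginary single-accumulator machine.
OPCODES = {"+": "ADD", "-": "SUB", "*": "MUL", "/": "DIV"}

def generate(matrix):
    """Translate (operator, operand1, operand2) entries into pseudo-instructions."""
    code = []
    for number, (op, operand1, operand2) in enumerate(matrix, start=1):
        if op == "=":
            # Assignment: copy the value of operand 2 into operand 1.
            code.append(f"LOAD {operand2}")
            code.append(f"STORE {operand1}")
        else:
            # Arithmetic: compute into the accumulator, then save in temporary Mi.
            code.append(f"LOAD {operand1}")
            code.append(f"{OPCODES[op]} {operand2}")
            code.append(f"STORE M{number}")
    return code

example = [
    ("-", "START", "FINISH"),
    ("*", "RATE", "M1"),
    ("=", "COST", "M2"),
]

code = generate(example)
for instruction in code:
    print(instruction)
```

A real code generator would replace the names with the relative locations assigned during storage allocation and would avoid storing and reloading a value that is used immediately.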