jyoti-mishra
8/3/2019 Assign 2 Compiler
http://slidepdf.com/reader/full/assign-2-compiler 1/3
ShrinathJi Institute of Technology & Engineering
SITE, Nathdwara
M.TECH I SEMESTER 2011-12
SUBJECT-IMSE3 OPTIMIZING COMPILER
Assignment II
DATE OF COMMENCEMENT 22-01-2012 DATE OF SUBMISSION 01-02-2012
Q.1 What is procedure cloning and procedure-level optimization?
Ans.: Cloning Procedures
Cloning is a generalization of in-line expansion. Consider a procedure P that is called at a number of call sites C1, . . . , Cm in the program.
Procedure cloning occurs when the compiler notices that the set of call sites can be partitioned into subsets such that different versions of
the procedure P can be used. The same version is used at all call sites within the same set in the partition. The different versions will run
more quickly than the general procedure in each of the contexts in which the specialized version occurs. This is a research topic. The best
discussion available is by Hall (1991).
The partitioning of the call sites involves the identification of some characteristic of the parameters. An easy case is a constant parameter
that is used as the stride of a loop. More complex relationships can be identified by determining the dependency information for the flow
graph in terms of the formal parameters. When the parameter has certain values, the procedure may be vectorizable, parallelizable, or have
a form where the cache usage can be controlled.
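The easy case above (a constant parameter used as a loop stride) can be sketched as follows. This is only an illustration under stated assumptions, not the method discussed by Hall: the function partition_call_sites, the call-site tuples, and the use of None for "not a compile-time constant" are all hypothetical names chosen for the example.

```python
from collections import defaultdict

def partition_call_sites(call_sites, param_index):
    """Group call sites by the constant passed at position param_index.

    call_sites: list of (site_name, args); each arg is an int constant,
    or None when the value is not known at compile time (an assumption
    of this sketch).
    Returns a dict mapping constant (or None) -> list of site names.
    """
    groups = defaultdict(list)
    for site, args in call_sites:
        groups[args[param_index]].append(site)
    return dict(groups)

# Hypothetical call sites of P(base, stride).
sites = [
    ("C1", (None, 1)),
    ("C2", (None, 1)),
    ("C3", (None, 4)),
    ("C4", (None, None)),
]
clones = partition_call_sites(sites, param_index=1)
# C1 and C2 can share a clone specialized for stride 1, C3 gets a
# stride-4 clone, and C4 keeps calling the general procedure.
```

Each key of the result stands for one version of P; the None group is the set of sites that must continue to use the general procedure.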
There is one case of cloning that should be implemented whether the more advanced technique is implemented or not. Consider a function F
that has n different call sites, S1, . . . , Sn. If there is one call site that is in a frequently executed region of the program and that call site
executes a large proportion of all calls on F, then a copy of F should be expanded inline at that site. All other sites should execute a normal call
on the original copy of F. The frequency and proportion of the calls are parameters to be tuned, and the information to make the choice may
be gathered by program profiling.
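A minimal sketch of this choice, assuming profile data is available as a count of calls on F per call site. The function name pick_inline_site, the hot_sites set, and the default threshold of 0.5 are assumptions made for illustration; the text only says that the threshold is a parameter to be tuned.

```python
def pick_inline_site(call_counts, hot_sites, min_share=0.5):
    """Return the call site at which F should be expanded inline, or None.

    call_counts: dict site -> profiled number of calls on F from that site.
    hot_sites: set of sites lying in frequently executed regions.
    min_share: tunable fraction of all calls on F that the site must
    account for (0.5 is an arbitrary illustrative default).
    """
    total = sum(call_counts.values())
    if total == 0:
        return None
    candidates = [
        site for site, count in call_counts.items()
        if site in hot_sites and count / total >= min_share
    ]
    # Among qualifying sites, prefer the one with the most calls.
    return max(candidates, key=call_counts.get, default=None)

# Hypothetical profile: S1 executes 90% of all calls on F.
profile = {"S1": 9000, "S2": 600, "S3": 400}
chosen = pick_inline_site(profile, hot_sites={"S1", "S2"})
```

All sites other than the chosen one would keep a normal call on the original copy of F.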
Simple Procedure-Level Optimization
Whether full interprocedural analysis is performed or cloning is implemented, there are several optimizations that can be made when the
bodies of the calling and called procedures are known at compile time. Consider the two procedures in Figure 9.3. The first column represents
the original two procedures. If this section of code is executed frequently, the loop can be moved into a new procedure made from a copy of
CALLED in which the body of the procedure becomes the body of the loop. The parameters for NEWCALLED are not listed; however, enough
information must be passed to describe the bounds of the loop and the original arguments.
Figure 9.3 Moving a Loop Inside a Procedure
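Figure 9.3 itself is not reproduced in this copy. The transformation it describes might look roughly like the sketch below. The procedure bodies are invented for illustration; only the names CALLED and NEWCALLED (written here in lower case) come from the text.

```python
def called(a, i):
    """Original callee: processes one element per call."""
    a[i] = a[i] * 2 + 1

def caller(a):
    """Original caller: pays the call overhead on every iteration."""
    for i in range(len(a)):
        called(a, i)

def new_called(a, lo, hi):
    """Copy of called() whose body has become the body of the loop.

    The loop bounds lo and hi are passed in addition to the original
    argument, as the text requires.
    """
    for i in range(lo, hi):
        a[i] = a[i] * 2 + 1

def new_caller(a):
    """Transformed caller: a single call replaces the loop of calls."""
    new_called(a, 0, len(a))
```

Both versions compute the same result; the transformed version executes one call instead of one call per iteration.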
Register Coalescing
Register coalescing removes as many copy operations as possible. Many of the copy operations have already been eliminated during
peephole optimization, which eliminated all copies that were not implied by φ-nodes and did not involve temporaries associated
with φ-nodes at abnormal edges. The largest proportion of the copies is removed in this way. The rest of the copies are eliminated
using an observation of Chaitin (1981): If the source and the destination of a copy do not conflict, then the source and destination
can be combined into one register. Once the two temporaries have been combined, the algorithm can be applied again to another
copy. The observation creates a partition of the temporaries: Two temporaries are in the same partition if they have been combined
during register coalescing.
The SSA-form register-renaming algorithm can generate φ-nodes associated with abnormal edges in the flow graph. These φ-nodes
must not generate copy operations when the graph is translated back into normal form. Thus the algorithm must avoid eliminating
copies that will cause copies to occur on abnormal edges. As usual, impossible edges are fine since the code on them can never be
executed anyway.
The algorithm consists of using the SSA form to eliminate most copies. Initially the temporaries are partitioned so
that each temporary is in an element of the partition by itself. Then each φ-node and copy instruction is investigated. If an operand
and the destination temporaries do not conflict, then both temporaries are put in the same partition. The flow graph is then
translated back into normal form.
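The partitioning step can be sketched with a union-find structure. This is a simplified illustration, not the full algorithm: it assumes the conflict (interference) relation has already been computed, it treats each φ-node operand as a (destination, source) pair just like a copy, and it ignores the restriction on abnormal edges. The function name coalesce and its parameters are hypothetical.

```python
def coalesce(temps, copies, conflicts):
    """Partition temporaries by merging the ends of non-conflicting copies.

    temps: iterable of temporary names.
    copies: list of (dest, src) pairs, one per copy or phi-node operand.
    conflicts: set of frozenset({a, b}) pairs of distinct temporaries
    that conflict (are live at the same time).
    Returns a dict mapping each temporary to its partition representative.
    """
    parent = {t: t for t in temps}
    # For each representative, the temporaries its partition conflicts with.
    conf = {t: set() for t in parent}
    for pair in conflicts:
        a, b = tuple(pair)
        conf[a].add(b)
        conf[b].add(a)

    def find(t):
        while parent[t] != t:
            parent[t] = parent[parent[t]]  # path halving
            t = parent[t]
        return t

    for dest, src in copies:
        d, s = find(dest), find(src)
        if d == s:
            continue  # already combined
        # Skip the merge if any member of one partition conflicts
        # with any member of the other.
        if any(find(x) == s for x in conf[d]):
            continue
        parent[s] = d
        conf[d] |= conf[s]
    return {t: find(t) for t in parent}

# Hypothetical example: t1 and t3 conflict, so the chain of copies
# t1 -> t2 -> t3 -> t4 cannot collapse into a single register.
result = coalesce(
    ["t1", "t2", "t3", "t4"],
    [("t2", "t1"), ("t3", "t2"), ("t4", "t3")],
    {frozenset({"t1", "t3"})},
)
```

Merging the conflict sets when two partitions are combined is what lets the test be applied again to later copies, as Chaitin's observation requires.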
Note the similarity between register coalescing and register renaming. Both are implemented by creating a partition, and both
partitions are created to eliminate the copies at the φ-nodes.
Q.4 Describe interprocedural analysis and procedure inlining.
Interprocedural Analysis
Initially the compiler compiles each procedure individually, one procedure or flow graph at a time. In fact, the compiler is organized
as a production line: Each procedure is translated into a flow graph and fed through the compiler, one at a time, until the results are
added to the object file. With this structure, the compiler does not know about the effects of any procedure or function calls. It does
not know which variables might be modified by each procedure call, so it must assume the worst.
For interprocedural analysis, this organization must be changed. However, the change can be hidden inside the interprocedural
analysis phase if careful data abstractions are maintained. Interprocedural analysis requires information about multiple procedures
within the application program, so the compile-one-at-a-time approach must be modified. Instead, the compiler must accumulate
the flow graphs (and other data) for each procedure. When all of the flow graphs have been found, the whole program can be
analyzed to find the effects of each procedure call more precisely. Then the rest of the compilation can occur, one flow graph at a
time (see Figure 9.1).
Figure 9.1 Schematic of Interprocedural Phase
In other words, the interprocedural analysis phase can be thought of as the stomach of the compiler. It gathers together all of the
flow graphs of the application, processes them, and passes each one along to the rest of the compiler to be processed. As each flow
graph is passed along, the interprocedural analysis information about its calls and about the sites from which it is called is available
to the optimizers and code generators.
There are many ways in which this repository of information can be stored. One approach is to keep
a library of procedures and their flow graphs on the disk as a complex data structure that is updated each time a file in the
application is compiled. Another approach is to keep the repository in memory. In our sample case, the whole application will be
Computing Interprocedural Alias Information
There are four other kinds of information computed during interprocedural analysis:
1. The interprocedural analyzer computes alias information. Consider the point in application execution immediately after a
procedure call inside a procedure. Which of its formal parameters (dummy arguments) may reference the same memory location as
another variable mentioned in the flow graph? This is only a problem for formal parameters passed by reference so that the actual
parameter is a pointer to the data in memory. Interprocedural analysis will compute an estimate of which formal parameters might
be sharing the same memory location as other formal parameters or global variables.
2. The interprocedural analyzer computes modification information. The compiler would like to know which variables and memory
locations might be modified during a procedure call. This includes both the modification of arguments that are passed by reference
and global variables that are modified as side effects. Again, the word "might" is used since it is too difficult to determine whether the
data must be modified during a procedure call.
3. The interprocedural analyzer computes the variables that might be used in a procedure. Again, this includes both variables that are
modified because they are associated with formal parameters that are passed by reference and global variables. As before, the
information is only accurate to "might" rather than "must" standards.
4. The interprocedural analyzer computes the formal parameters that are always bound to a single constant in the application
program. I will not describe the computation of this information here, instead referring you to the papers referenced previously.
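The modification information in item 2 can be illustrated with a simple fixed-point computation over the call graph. This is a minimal sketch under strong assumptions, not the algorithm from the referenced papers: it tracks only global variables (not reference parameters or aliases), and it assumes every called procedure is known. The function name may_modify and the example procedures are hypothetical.

```python
def may_modify(direct_mods, calls):
    """Compute, per procedure, the globals it might modify.

    direct_mods: dict proc -> set of globals assigned in its own body.
    calls: dict proc -> set of procedures it calls.
    A procedure might modify whatever its callees might modify, so the
    sets are propagated until nothing changes; this also handles
    recursion in the call graph.
    """
    procs = set(direct_mods) | set(calls)
    mod = {p: set(direct_mods.get(p, ())) for p in procs}
    changed = True
    while changed:
        changed = False
        for p, callees in calls.items():
            for q in callees:
                extra = mod.get(q, set()) - mod[p]
                if extra:
                    mod[p] |= extra
                    changed = True
    return mod

# Hypothetical program: main calls update; update calls log and itself.
direct = {"main": set(), "update": {"count"}, "log": {"buffer"}}
graph = {"main": {"update"}, "update": {"log", "update"}}
mod = may_modify(direct, graph)
```

The result is a "might" answer in the sense used above: a variable in a procedure's set may be modified on some path through a call, but is not guaranteed to be.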
Inlining Procedures
The one part described in this chapter that is needed in any high-performance compiler is procedure inlining. Consider a function
such as in Figure 9.2. The cost of calling the function and returning the value is probably greater than the cost of executing
the function body. These costs can be avoided by substituting the body of the function into the calling procedure rather than
inserting a procedure call. During the substitution, the formal parameters must be replaced by the actual parameters in such a
fashion that the same computations will be performed after the substitution as would be performed by the function call, and local
variables must be renamed so that they do not conflict with the variables in the calling procedure. Of course, global variables (variables
common between the called function and other functions) must not be renamed.
When should a function be expanded inline? There is no single good answer to that question because of the expansion/contraction
problem. The expansion of a function inline within another function initially expands the size of the whole program. On the other
hand, this expansion may make possible a number of simplifications that will result in a smaller program. Consider the example of a
function that is a large case or switch statement with each alternative being a single statement. If a function call on that function
with a constant actual parameter is replaced by an in-line expansion of the function, the program initially expands in size; however,
constant propagation will eliminate all of the code except the corresponding one small alternative, thus making the program smaller
and faster.
Here is the logic that the compiler should use for deciding whether a function is to be expanded inline:
If the compiler contains a compile-time command to expand a function inline, then expand it inline. This simply means that the
programmer is telling the compiler to do it, so do it. Correspondingly, if a compile-time directive indicates not to expand a function
inline, then do not do it under any circumstances.
Figure 9.2 Example of Function to Inline
If there is only one call on a function, then it can be expanded inline. This will decrease the amount of function-call overhead
without increasing program size. This situation occurs with programs that are written in a top-down programming style. Such a
programming style encourages the writing of functions called only once. If the resulting function is estimated to be larger than some
size, such as the size of the fastest cache, then the expansion should not be performed automatically.
If the compiler estimates that the size of the function body is smaller than the size of the function call, then the function can be