PSU CS322 HM 1 Languages and Compiler Design II IR Code Optimization Material provided by Prof. Jingke Li Stolen with pride and modified by Herb Mayer PSU Spring 2010 rev.: 4/16/2010




Agenda

• IR Optimization
• Redundancy Elimination
• Sample: CSE
• Partial Redundancy Elimination (PRE)
• Copy Propagation
• Value Numbering
• Loop Invariant Code Motion
• Counter Examples
• Strength Reduction
• Induction Variable (IV) Elimination


IR Optimization

• Definition: Optimization is the translation of an original program P1 into a semantically equivalent program P2 with better properties

• “Better” depends on the project. Possibilities include code compactness, execution speed, numeric precision, and others


IR Optimization

Optimizations transform a program into a functionally-equivalent program with better performance. Transformation can be implemented at various stages and levels.

Advantages of IR-Level Optimization:
• IR Operations are explicit, so cost estimations can be accurate
• IR Optimizations are machine-independent, hence the results are portable across different target machines

Scopes of Optimization:
• Local: Transforming code by analyzing a single basic block
• Global: Transforming code by analyzing a whole subroutine
• Inter-Procedural: By analyzing the whole program

Concepts and Techniques:
• Basic blocks & flow graphs
• Control-flow analysis & data-flow analysis


Redundancy Elimination

IR code optimization removes redundant computations. The following are specific examples:

• Common Subexpression Elimination (CSE): Based on lexical representation, applicable to global scope
• Partial Redundancy Elimination (PRE): More powerful than CSE
• Copy Propagation: Companion optimization to CSE
• Value Numbering (VN): Value based, single Basic Block
• Super-local Value Numbering: Extends VN to multiple blocks
• Loop Invariant Elimination: Moves code from frequently executed to rarely executed parts of the program


Common Subexpression Elimination (CSE)

• E is a common subexpression if it occurs at L1 and L2, was computed at L1, and no components received new values along path to L2
• To achieve CSE, introduce Temp to hold subexpression when first evaluated; see Example from Quicksort():

The second occurrence of 4*i in BB --from Quicksort()-- is a common subexpression; so is the second occurrence of 4*j

    BB before CSE      BB’ after CSE      BB’’ after total CSE
    t11 := 4*i         t11 := 4*i         t11 := 4*i
    x := a[t11]        x := a[t11]        x := a[t11]
    t12 := 4*i         t12 := t11         t13 := 4*j
    t13 := 4*j         t13 := 4*j         t14 := a[t13]
    t14 := a[t13]      t14 := a[t13]      a[t11] := t14
    a[t12] := t14      a[t12] := t14      a[t13] := x
    t15 := 4*j         t15 := t13
    a[t15] := x        a[t15] := x
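
The step from BB to BB’ can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the course's implementation: the tuple encoding (dst, op, lhs, rhs), the op names "load", "store" and "copy", and the function name local_cse are all hypothetical. Only arithmetic expressions are tracked; array loads and stores pass through unchanged.

```python
ARITH = {"+", "-", "*"}  # operators treated as pure arithmetic (assumption)

def local_cse(block):
    """Replace repeated arithmetic expressions in one basic block by copies.

    Each instruction is a hypothetical tuple (dst, op, lhs, rhs).
    """
    available = {}  # (op, lhs, rhs) -> temp that currently holds that value
    out = []
    for dst, op, lhs, rhs in block:
        key = (op, lhs, rhs)
        if op in ARITH and key in available:
            # Second occurrence: reuse the temp from the first evaluation.
            out.append((dst, "copy", available[key], None))
        else:
            out.append((dst, op, lhs, rhs))
        # dst receives a new value: drop expressions that read dst or are held in dst.
        available = {k: t for k, t in available.items()
                     if dst not in (k[1], k[2]) and t != dst}
        if op in ARITH and dst not in (lhs, rhs):
            available.setdefault(key, dst)
    return out

# The Quicksort basic block BB from the slide, in the assumed encoding.
bb = [
    ("t11", "*", "4", "i"),
    ("x", "load", "a", "t11"),      # x := a[t11]
    ("t12", "*", "4", "i"),
    ("t13", "*", "4", "j"),
    ("t14", "load", "a", "t13"),    # t14 := a[t13]
    ("a", "store", "t12", "t14"),   # a[t12] := t14
    ("t15", "*", "4", "j"),
    ("a", "store", "t15", "x"),     # a[t15] := x
]
bb_prime = local_cse(bb)
```

Here bb_prime has t12 := t11 and t15 := t13 in place of the two recomputations, matching BB’. Getting from BB’ to BB’’ additionally needs copy propagation and removal of the dead copies.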


CSE Across BBs

CSE can eliminate redundant computation across Basic Blocks:

    before CSE                  after CSE

    BB1:                        BB1’:
        i := j                      i := j
        a := 4 * i                  temp := 4 * i
        if … goto BB3               a := temp
                                    if … goto BB3

    BB2:                        BB2’:
        i := j                      i := j
        b := 4 * i                  b := temp

    BB3:                        BB3’:
        i := j                      i := j
        c := 4 * i                  c := temp


Global CSE

Both 4*i in BB5 (and BB6) are CSEs
⇒ eliminate t6 and t11, t7, t12, replace with t2

4*j in BB5 and BB6 are CSEs
⇒ eliminate t10 and t15, replace with t8 and t13

Now a[t2] in BB5 and BB6 become CSEs
⇒ replace with t3

    BB1:
        i := m-1
        j := n
        t1 := 4*n
        v := a[t1]

    BB2:
        i := i+1
        t2 := 4*i
        t3 := a[t2]
        if t3 < v goto BB2

    BB3:
        j := j-1
        t4 := 4*j
        t5 := a[t4]
        if t5 > v goto BB3

    BB4:
        if i >= j goto BB6

    BB5:
        t6 := 4*i
        x := a[t6]
        t7 := 4*i
        t8 := 4*j
        t9 := a[t8]
        a[t7] := t9
        t10 := 4*j
        a[t10] := x
        goto BB2

    BB6:
        t11 := 4*i
        x := a[t11]
        t12 := 4*i
        t13 := 4*j
        t14 := a[t13]
        a[t12] := t14
        t15 := 4*j
        a[t15] := x


Global CSE

    BB1:
        i := m-1
        j := n
        t1 := 4*n
        v := a[t1]

    BB2:
        i := i+1
        t2 := 4*i
        t3 := a[t2]
        if t3 < v goto BB2

    BB3:
        j := j-1
        t4 := 4*j
        t5 := a[t4]
        if t5 > v goto BB3

    BB4:
        if i >= j goto BB6

    BB5:
        x := t3
        a[t2] := t5
        a[t4] := x
        goto BB2

    BB6:
        x := t3
        t14 := a[t1]
        a[t2] := t14
        a[t1] := x


CSE Algorithm

Available expressions: An expression x ⊕ y is available at node n if every path from the entry node to n evaluates the expression, and there are no definitions of x or y after the last evaluation

Algorithm:

1. Compute available expressions for all expressions.
2. At each node n : w := x ⊕ y, where the expression x ⊕ y is available, search backwards for the evaluations of x ⊕ y that reach n
3. Replace each evaluation v := x ⊕ y found in the search by t := x ⊕ y; v := t
4. Replace n by w := t
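
Step 1 is a forward data-flow problem. A minimal sketch of one common way to solve it is shown below: iterate IN[n] = intersection of OUT[p] over all predecessors p, and OUT[n] = gen[n] ∪ (IN[n] − kill[n]) until nothing changes. The function name, the dictionary-based CFG encoding, and the four-block example graph are assumptions made for illustration; gen[n] holds expressions evaluated in n and still valid at its end, kill[n] holds expressions whose operands n redefines.

```python
def available_expressions(cfg, gen, kill, entry):
    """Return {node: set of expressions available on entry to the node}.

    cfg maps each node to its successor list; gen and kill map each node
    to a set of expressions (here plain strings such as "a+b").
    """
    nodes = list(cfg)
    preds = {n: [] for n in nodes}
    for n, succs in cfg.items():
        for s in succs:
            preds[s].append(n)

    universe = set().union(*gen.values())
    # Optimistic start: everything is available, except at the entry node.
    avail_in = {n: set() if n == entry else set(universe) for n in nodes}
    avail_out = {n: gen[n] | (avail_in[n] - kill[n]) for n in nodes}

    changed = True
    while changed:
        changed = False
        for n in nodes:
            if n != entry:
                new_in = set(universe)
                for p in preds[n]:
                    new_in &= avail_out[p]   # must hold on every path
                avail_in[n] = new_in
            new_out = gen[n] | (avail_in[n] - kill[n])
            if new_out != avail_out[n]:
                avail_out[n] = new_out
                changed = True
    return avail_in

# Hypothetical diamond: B1 evaluates a+b, B3 redefines a, B4 is the join.
cfg = {"B1": ["B2", "B3"], "B2": ["B4"], "B3": ["B4"], "B4": []}
gen = {"B1": {"a+b"}, "B2": set(), "B3": set(), "B4": set()}
kill = {"B1": set(), "B2": set(), "B3": {"a+b"}, "B4": set()}
avail = available_expressions(cfg, gen, kill, "B1")
```

In this example a+b is available on entry to B2 and B3 but not at B4, because the path through B3 redefines a. With an empty kill set for B3 it would be available at B4 as well.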


An Improved CSE Algorithm

The previous CSE algorithm performs the expensive backward search and inserts a new temp for every use of a common subexpression. The following ideas can improve the algorithm:

– Reduce number of new temps by assigning a unique name to each unique expression
– Avoid backward search by a separate traversal of the CFG

Algorithm:

1. Compute available expressions for all expressions
2. Initialize an array Name[ e ] = ø for all expressions
3. At each node n : w := x ⊕ y, where the expression x ⊕ y (denoted e below) is available:
   If Name[ e ] = ø, allocate new name t and set Name[ e ] = t;
   Else let t = Name[ e ];
   Replace n by w := t;
4. In a subsequent traversal of CFG, at each node v := e, if Name[ e ] != ø, let t = Name[ e ]; replace the node by t := e; v := t;


Yet Another CSE Algorithm

Ideas:

Create one temp for each unique expression.

Let subsequent pass eliminate unnecessary temps.

Algorithm:

1. Compute available expressions for all expressions.

2. At each evaluation of e:
   • Hash e to a name, t, in a table
   • Insert assignment t = e.

3. At a use of e where e is available:
   • Look up e’s name t in the hash table
   • Replace e with t.


Partial Redundancy Elimination (PRE)

An expression x ⊕ y is partially redundant at node n, if some path from entry node to n evaluates x ⊕ y, and there are no definitions of x or y after the last evaluation

PRE Optimization (it subsumes CSE):
• Discover partially redundant expressions
• Convert them to fully redundant expressions
• Remove redundancy, to reduce # of overall computations at runtime

[Figure: three flow graphs showing where x ⊕ y is evaluated on the paths into node n]


Copy Propagation

Copy statement has the form f := g

A large number of copy statements may be generated after performing CSE optimizations. Copy propagation eliminates copy statements by using g for f wherever possible

    Before (BB5)          After (BB5’)
    t6 := 4*i             t6 := 4*i
    x := a[t6]            x := a[t6]
    t7 := t6              t8 := 4*j
    t8 := 4*j             t9 := a[t8]
    t9 := a[t8]           a[t6] := t9
    a[t7] := t9           a[t8] := x
    t10 := t8             goto BB2
    a[t10] := x
    goto BB2
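
The BB5 to BB5’ rewrite can be sketched as two passes over one basic block: a forward pass that substitutes g for f, and a backward pass that drops copies whose destination is no longer read. The tuple encoding (dst, op, lhs, rhs), the op names, the live_out parameter, and the function name are assumptions for illustration; the trailing goto is left out of the encoding.

```python
def copy_propagate(block, live_out=frozenset()):
    """Propagate copies f := g within one basic block, then drop dead copies.

    Instructions are hypothetical tuples (dst, op, lhs, rhs); live_out names
    the variables still needed after the block.
    """
    copies = {}       # f -> g for every copy f := g that is still valid
    propagated = []
    for dst, op, lhs, rhs in block:
        lhs = copies.get(lhs, lhs)   # use g instead of f
        rhs = copies.get(rhs, rhs)
        # dst is overwritten: forget copies that mention it on either side.
        copies = {f: g for f, g in copies.items() if dst not in (f, g)}
        if op == "copy" and dst != lhs:
            copies[dst] = lhs
        propagated.append((dst, op, lhs, rhs))

    # Backward pass: a copy whose destination is never read again is dead.
    live = set(live_out)
    result = []
    for dst, op, lhs, rhs in reversed(propagated):
        if op == "copy" and dst not in live:
            continue
        if op != "store":            # a store updates one element, not all of dst
            live.discard(dst)
        live.update(v for v in (lhs, rhs) if v is not None)
        result.append((dst, op, lhs, rhs))
    result.reverse()
    return result

# BB5 after CSE (the "Before" column), in the assumed encoding.
bb5 = [
    ("t6", "*", "4", "i"),
    ("x", "load", "a", "t6"),      # x := a[t6]
    ("t7", "copy", "t6", None),
    ("t8", "*", "4", "j"),
    ("t9", "load", "a", "t8"),     # t9 := a[t8]
    ("a", "store", "t7", "t9"),    # a[t7] := t9
    ("t10", "copy", "t8", None),
    ("a", "store", "t10", "x"),    # a[t10] := x
]
bb5_prime = copy_propagate(bb5)
```

With an empty live_out the result has six instructions and both stores index the array through t6 and t8, as in BB5’. If t7 or t10 were live after the block, their copies would be kept.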


Cascading Problem

CSE transformations may have a cascading effect; more rounds of CSE/Copy-propagation may be needed before reaching the final form:

    x := b + c        x := b + c        x := b + c        x := b + c
    y := a + x   ⇒    y := a + x   ⇒    y := a + x   ⇒    y := a + x
    u := b + c        u := x            v := a + x        v := y
    v := a + u        v := a + u


Value Numbering

• Each variable is assumed to have a unique initial value
• Each unique value is assigned a unique number
• An expression’s value is represented by a corresponding symbolic expression based on the operands’ numbers
• E.g. expression x + y’s value is 1+2, if 1 and 2 are x and y’s value numbers, respectively
• Each unique expression value is also assigned a unique number
• When a new variable or expression is encountered, check to see if it has been assigned a number; if so, use the number, otherwise assign it a new number
• Use a hash table for efficient number lookup


Sample: Value Numbering

Value numbering uses a single round to calculate the effect of cascaded optimizations

    x := b + c
    y := a + x
    u := b + c
    v := a + u

    statement      var or expr     assigned #
    x := b + c     b               1
                   c               2
                   b+c (1+2)       3
                   x               3
    y := a + x     a               4
                   a+x (4+3)       5
                   y               5
    u := b + c     u (1+2)         3
    v := a + u     v (4+3)         5
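
The table above can be reproduced with a small hash-table sketch. It is an illustration only: the tuple encoding (dst, op, lhs, rhs) and the function name are assumptions, a Python dict stands in for the hash table, and commutativity (a+x versus x+a) is deliberately not handled.

```python
def value_number(block):
    """Local value numbering over one basic block.

    Returns (numbers, redundant): numbers maps each variable name and each
    symbolic expression (op, vn_lhs, vn_rhs) to its value number; redundant
    lists destinations whose expression value had been numbered before.
    """
    numbers = {}
    next_vn = 1

    def number_of(key):
        nonlocal next_vn
        if key not in numbers:       # new variable or expression: new number
            numbers[key] = next_vn
            next_vn += 1
        return numbers[key]

    redundant = []
    for dst, op, lhs, rhs in block:
        # Symbolic expression built from the operands' value numbers.
        key = (op, number_of(lhs), number_of(rhs))
        if key in numbers:
            redundant.append(dst)    # same value was computed earlier
        numbers[dst] = number_of(key)
    return numbers, redundant

block = [
    ("x", "+", "b", "c"),
    ("y", "+", "a", "x"),
    ("u", "+", "b", "c"),
    ("v", "+", "a", "u"),
]
numbers, redundant = value_number(block)
```

After the single pass, x and u share number 3 and y and v share number 5, so u and v are reported as redundant without a second round of CSE and copy propagation.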


Loop Invariant Code Motion

If a loop contains a statement t ← a ⊕ b such that a and b have the same values each time around the loop, then t will also have the same value each time. Hoist such loop-invariant statements out of the loop!

    Before                      After

    BB1:                        BB1’:
        t1 := 0                     t1 := 0
                                    t2 := a * b

    BB2:                        BB2’:
        i := i+1                    i := i+1
        t2 := a * b                 M[i] := t2
        M[i] := t2                  if a < N goto BB3’
        if a < N goto BB3

    BB3:                        BB3’:
        x := t2                     x := t2


Loop Invariant Criteria

A statement S : t ← a1 ⊕ a2 is loop-invariant within loop L if, for each operand ai

1.) ai is a constant, or
2.) all definitions of ai that reach S are outside the loop, or
3.) only 1 definition of ai reaches S, which is loop-invariant

An iterative algorithm can be used to find all loop-invariant statements
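
One possible shape of that iterative algorithm is sketched below under a strong simplifying assumption: the loop body is a single straight-line block, so reaching definitions can be read off statement positions instead of being computed by data-flow analysis. The tuple encoding (dst, op, a1, a2), the function names, and the extra statement t3 := t2 + 4 are hypothetical. Memory operations are conservatively never marked invariant.

```python
def is_constant(operand):
    """Treat integer literals such as "4" or "-1" as constants."""
    return operand.lstrip("-").isdigit()

def loop_invariant_statements(body):
    """Return indices of loop-invariant statements in a straight-line loop body."""
    defs = {}  # variable -> indices of its definitions inside the loop
    for idx, (dst, _op, _a1, _a2) in enumerate(body):
        defs.setdefault(dst, []).append(idx)

    invariant = set()

    def operand_ok(operand, idx):
        if is_constant(operand):
            return True                          # criterion 1
        inside = defs.get(operand, [])
        if not inside:
            return True                          # criterion 2: defined outside only
        # Criterion 3: a single in-loop definition that precedes the use is the
        # only one reaching it, and that definition must itself be invariant.
        return len(inside) == 1 and inside[0] < idx and inside[0] in invariant

    changed = True
    while changed:                               # repeat until no new statement qualifies
        changed = False
        for idx, (dst, op, a1, a2) in enumerate(body):
            if idx in invariant or op in ("load", "store"):
                continue
            if operand_ok(a1, idx) and operand_ok(a2, idx):
                invariant.add(idx)
                changed = True
    return invariant

body = [
    ("i", "+", "i", "1"),        # i := i+1, not invariant
    ("t2", "*", "a", "b"),       # t2 := a * b, invariant
    ("M", "store", "i", "t2"),   # M[i] := t2, memory operation
    ("t3", "+", "t2", "4"),      # hypothetical extra statement, depends on t2
]
invariant = loop_invariant_statements(body)
```

The result contains the indices of t2 := a * b and of t3 := t2 + 4; the latter qualifies only because t2 was already marked, which is why the marking is iterated. Being invariant is a precondition for hoisting, not a proof that hoisting is safe.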


Strength Reduction (SR)

• Definition: Reduction in strength is the replacement of an operation by a cheaper one, e.g. replace * by + if feasible

• Do not make such changes in the source, e.g. do not replace j=2*k; with j=k+k; let optimizer do this

    Before                      After

    BB1:                        BB1:
        if i >= y goto BB3          if i >= y goto BB3

    BB2:                        BB2:
        Call func1                  Call func1
        j := 2 * k                  j := k + k
        i := i + 1                  i++
        goto BB1                    goto BB1

    BB3:                        BB3:
        x := ...                    x := ...
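
As a minimal sketch of the j := 2 * k case, a peephole pass can rewrite a multiplication by the constant 2 into an addition. The tuple encoding (dst, op, lhs, rhs) and the function name are assumptions; a real optimizer would cover many more patterns and would consult a cost model for the target machine.

```python
def reduce_strength(block):
    """Rewrite dst := 2 * v (or v * 2) into dst := v + v; leave the rest alone."""
    out = []
    for dst, op, lhs, rhs in block:
        if op == "*" and "2" in (lhs, rhs):
            other = rhs if lhs == "2" else lhs   # the operand that is not the 2
            out.append((dst, "+", other, other))
        else:
            out.append((dst, op, lhs, rhs))
    return out

bb2 = [
    ("j", "*", "2", "k"),   # j := 2 * k
    ("i", "+", "i", "1"),   # i := i + 1
]
bb2_reduced = reduce_strength(bb2)
```

The first instruction becomes j := k + k and the increment of i is untouched.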


Induction Variable Elimination (IVE)

• Definition: Induction Variable (IV) is a variable iterating through a linear progression of values in a program section

• The program section is frequently a proper loop
• IVs are either fundamental or dependent on other IVs
• IV elimination reduces multiple IVs into fewer, thus saving operations
  – Since these operations are inside inner loops, savings can be significant
• After IVE other optimizations can be applied too, e.g. SR


Induction Variable Elimination, Cont’d

    integer a(100)      -- low bound is 1, not 0 like in C++ or Java, subtract!
    do i = 1, 100       -- OK for i to be undefined after loop
      a(i) = 2 * i      -- rhs deliberately not 4 * i, which would be easy: = IV
    enddo

    BB0:
        t1 = 1                  // i
    BB1:
        if t1 > 100 goto BB3
    BB2:
        t2 = 2 * t1
        t3 = 4 * t1
        t4 = t3 - 4
        t5 = A(a) + t4
        *t5 = t2
        t1 = t1 + 1
        goto BB1
    BB3:
        // after loop i undefined

    BB0’:
        t0 = 0                  // IV
        t1 = 1                  // i
    BB1’:
        if t0 >= 400 goto BB3’
    BB2’:
        t2 = 2 * t1
        t5 = A(a) + t0
        *t5 = t2
        t0 = t0 + 4
        t1 = t1 + 1
        goto BB1’
    BB3’:
        // after loop i undefined

    BB0’’:
        t0 = A(a)               // IV
        t1 = 1                  // i
    BB1’’:
        if t0 >= A(a)+400 goto BB3’’
    BB2’’:
        t2 = 2 * t1
        *t0 = t2
        t0 = t0 + 4
        t1 = t1 + 1
        goto BB1’’
    BB3’’:
        // after loop i undefined
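
A quick way to convince oneself that the rewritten loop stores the same values at the same addresses is to simulate the first and the last flow graph. The sketch below does that in Python; BASE is a hypothetical stand-in for A(a), the address of a(1), and a dict plays the role of memory. Both simulated loops keep incrementing t1, since t2 = 2 * t1 still reads it on every iteration.

```python
BASE = 1000  # hypothetical value of A(a), the address of a(1)

def original_loop():
    """Index-based form: the address is recomputed from t1 on each iteration."""
    mem = {}
    t1 = 1                       # i
    while not t1 > 100:          # BB1: if t1 > 100 goto BB3
        t2 = 2 * t1
        t3 = 4 * t1
        t4 = t3 - 4              # low bound is 1, so subtract one element
        t5 = BASE + t4
        mem[t5] = t2             # *t5 = t2
        t1 = t1 + 1
    return mem

def pointer_loop():
    """Form after IV elimination: t0 walks the addresses directly."""
    mem = {}
    t0 = BASE                    # IV
    t1 = 1                       # i
    while not t0 >= BASE + 400:  # if t0 >= A(a)+400 goto BB3''
        t2 = 2 * t1
        mem[t0] = t2             # *t0 = t2
        t0 = t0 + 4
        t1 = t1 + 1
    return mem

before = original_loop()
after = pointer_loop()
```

Both dictionaries hold 100 entries, from address BASE (value 2) to BASE + 396 (value 200). Per iteration the multiplication 4 * t1, the subtraction, and the address addition are replaced by the single update t0 = t0 + 4.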