
Compilation 2007 Optimization Michael I. Schwartzbach BRICS, University of Aarhus


Optimization

The optimizer aims at:
• reducing the runtime
• reducing the code size

These goals often conflict, since a larger program may in fact be faster

The best optimizations achieve both goals

An optimizer may also have more esoteric aims:
• reducing energy consumption
• reducing chip area

Optimizations for Space

Were historically important, because memory was small and expensive

When memory became large and cheap, optimizing compilers traded space for time

Java compilers do not optimize much, but JVM bytecodes are designed to be small

When Java is targeted at mobile devices, space optimizations are again important

Optimizations for Speed

Were historically important to gain acceptance for the introduction of high-level languages

Are still important, since the software always strains the limits of the hardware

Are challenged by ever higher abstractions in programming languages and must constantly adapt to changing microprocessor architectures

Java compilers do not optimize much, since the JVM kicks in with the JIT compiler

Opportunities for Optimization

• At the source code level
• At an intermediate low level
• At the binary machine code level
• At runtime (JIT compilers)
• At the hardware level

An aggressive optimization requires many small contributions from all levels

Optimizers Must Undo Abstractions

Variables abstract away from registers, so the optimizer must find an efficient mapping

Control structures abstract away from gotos, so the optimizer must simplify a goto graph

Data structures abstract away from memory, so the optimizer must find an efficient layout

...

Method invocations abstract away from procedure calls, so the optimizer must efficiently determine the intended implementation

Difficult Compromises

A high abstraction level makes the development time cheaper, but the runtime more expensive

An optimizing compiler makes runtime more efficient, but compile time less efficient

Optimizations for speed and size may conflict

Different applications may require different choices at different times

Examples of Optimizations

• Strength reduction
• Loop unrolling
• Common subexpression elimination
• Loop invariant code motion
• Inline expansion

These may take place either at the source level or at the bytecode level

Most require information from static analyses

Strength Reduction

Replace expensive operations with cheap ones:

for (i = 0; i < a.length; i++)
  a[i] = a[i] + i/4;

becomes:

for (i = 0; i < a.length; i++)
  a[i] += (i >> 2);
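The shift is a safe replacement here because the loop index i is never negative. The following is a minimal sketch (the class and method names are made up for illustration) that compares the two forms. In Java, integer division rounds toward zero while >> rounds toward negative infinity, so an optimizer would need to know the sign of the operand before applying this rewrite:

```java
public class StrengthReductionCheck {
    // Original form: integer division by a power of two.
    static int divideByFour(int i) {
        return i / 4;
    }

    // Strength-reduced form: arithmetic right shift by two.
    static int shiftByTwo(int i) {
        return i >> 2;
    }

    public static void main(String[] args) {
        // The two forms agree for every non-negative index.
        for (int i = 0; i < 1000; i++) {
            if (divideByFour(i) != shiftByTwo(i)) {
                throw new AssertionError("mismatch at " + i);
            }
        }
        // They differ for some negative values: -7 / 4 is -1, but -7 >> 2 is -2.
        System.out.println(divideByFour(-7) + " vs " + shiftByTwo(-7));
    }
}
```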

Loop Unrolling

Unfold a loop to save condition tests:

for (i = 0; i < 100; i++)
  g(i);

becomes:

for (i = 0; i < 100; i += 2) {
  g(i);
  g(i+1);
}
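The example works without extra code because the trip count 100 is even. When the bound is not known to be a multiple of the unrolling factor, a remainder loop is one common way to handle the leftover iterations. A minimal sketch, where the class name is hypothetical and the list of recorded indices stands in for the calls to g:

```java
import java.util.ArrayList;
import java.util.List;

public class UnrollDemo {
    // Rolled loop: one condition test per call.
    static List<Integer> rolled(int n) {
        List<Integer> calls = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            calls.add(i); // stands in for g(i)
        }
        return calls;
    }

    // Unrolled by two, with a remainder loop for odd n.
    static List<Integer> unrolledByTwo(int n) {
        List<Integer> calls = new ArrayList<>();
        int i = 0;
        // Main loop: two calls per condition test.
        for (; i + 1 < n; i += 2) {
            calls.add(i);
            calls.add(i + 1);
        }
        // Remainder loop: runs at most once, when n is odd.
        for (; i < n; i++) {
            calls.add(i);
        }
        return calls;
    }

    public static void main(String[] args) {
        // Both loops perform the same calls in the same order.
        System.out.println(rolled(7).equals(unrolledByTwo(7)));
    }
}
```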

Common Subexpression Elimination

Avoid redundant computations:

double d = a * Math.sqrt(c);
double e = b * Math.sqrt(c);

becomes:

double tmp = Math.sqrt(c);
double d = a * tmp;
double e = b * tmp;

Loop Invariant Code Motion

Move constant valued expressions outside loops:

for (i = 0; i < a.length; i++)
  b[i] = a[i] + c * d;

becomes:

int tmp1 = a.length;
int tmp2 = c * d;
for (i = 0; i < tmp1; i++)
  b[i] = a[i] + tmp2;
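The hoisting is valid because neither a, c, nor d is assigned inside the loop, so a.length and c * d have the same value in every iteration. A minimal sketch that checks the two versions against each other (the class name and the wrapping into methods are assumptions made for illustration):

```java
import java.util.Arrays;

public class HoistDemo {
    // Original loop: a.length and c * d are evaluated in every iteration.
    static int[] original(int[] a, int c, int d) {
        int[] b = new int[a.length];
        for (int i = 0; i < a.length; i++) {
            b[i] = a[i] + c * d;
        }
        return b;
    }

    // After code motion: both invariant expressions are computed once.
    static int[] hoisted(int[] a, int c, int d) {
        int[] b = new int[a.length];
        int tmp1 = a.length;
        int tmp2 = c * d;
        for (int i = 0; i < tmp1; i++) {
            b[i] = a[i] + tmp2;
        }
        return b;
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 3};
        System.out.println(Arrays.equals(original(a, 4, 5), hoisted(a, 4, 5)));
    }
}
```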

Inline Expansion

Replace method invocations with copies:

int pred(int x) {
  if (x == 0) return x; else return x-1;
}

int f(int y) {
  return pred(y) + pred(0) + pred(y+1);
}

becomes:

int f(int y) {
  int tmp = 0;
  if (y == 0) tmp += 0; else tmp += y-1;
  if (0 == 0) tmp += 0; else tmp += 0-1;
  if (y+1 == 0) tmp += 0; else tmp += (y+1)-1;
  return tmp;
}

Collaborating Optimizations

Optimizations may enable other optimizations:

int f(int y) {
  int tmp = 0;
  if (y == 0) tmp += 0; else tmp += y-1;
  if (0 == 0) tmp += 0; else tmp += 0-1;
  if (y+1 == 0) tmp += 0; else tmp += (y+1)-1;
  return tmp;
}

becomes:

int f(int y) {
  if (y == 0) return 0;
  else if (y == -1) return -2;
  else return y+y-1;
}
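One way to gain confidence in such a chain of rewrites is to run all versions side by side. The sketch below copies the three versions of f from the slides into one class; the class name and the suffixed method names are hypothetical:

```java
public class InlineDemo {
    static int pred(int x) {
        if (x == 0) return x; else return x - 1;
    }

    // Original: three method invocations.
    static int fCalls(int y) {
        return pred(y) + pred(0) + pred(y + 1);
    }

    // After inline expansion.
    static int fInlined(int y) {
        int tmp = 0;
        if (y == 0) tmp += 0; else tmp += y - 1;
        if (0 == 0) tmp += 0; else tmp += 0 - 1;
        if (y + 1 == 0) tmp += 0; else tmp += (y + 1) - 1;
        return tmp;
    }

    // After constant folding and simplification of the inlined code.
    static int fSimplified(int y) {
        if (y == 0) return 0;
        else if (y == -1) return -2;
        else return y + y - 1;
    }

    public static void main(String[] args) {
        // Compare the three versions on a range that covers both special cases.
        for (int y = -1000; y <= 1000; y++) {
            if (fCalls(y) != fInlined(y) || fCalls(y) != fSimplified(y)) {
                throw new AssertionError("versions disagree at y = " + y);
            }
        }
        System.out.println("all three versions agree");
    }
}
```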

Optimization in Joos

public int foo(int a, int b, int c) {
  c = a*b+c;
  if (c<a) a = a+b*113;
  while (b>0) {
    a = a*c;
    b = b-1;
  }
  return a;
}

Unoptimized (52 bytecodes):

  iload_1
  iload_2
  imul
  iload_3
  iadd
  dup
  istore_3
  pop
  iload_3
  iload_1
  if_icmplt true1
  iconst_0
  goto end2
true1:
  iconst_1
end2:
  ifeq false0
  iload_1
  iload_2
  bipush 113
  imul
  iadd
  dup
  istore_1
  pop
false0:
  goto cond4
loop3:
  iload_1
  iload_3
  imul
  dup
  istore_1
  pop
  iload_2
  iconst_1
  isub
  dup
  istore_2
  pop
cond4:
  iload_2
  iconst_0
  if_icmpgt true5
  iconst_0
  goto end6
true5:
  iconst_1
end6:
  ifne loop3
  iload_1
  ireturn

Optimized (27 bytecodes):

  iload_1
  iload_2
  imul
  iload_3
  iadd
  istore_3
  iload_3
  iload_1
  if_icmpge cond4
  iload_1
  iload_2
  bipush 113
  imul
  iadd
  istore_1
  goto cond4
loop3:
  iload_1
  iload_3
  imul
  istore_1
  iinc 2 -1
cond4:
  iload_2
  ifgt loop3
  iload_1
  ireturn

Peephole Optimizations

Make local improvements in bytecode sequences

The optimizer considers only finite windows of the sequence

When a pattern "clicks", the optimizer rewrites a part of the code using a template:

  dup
  istore 3
  pop

becomes:

  istore 3
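A window rewrite of this kind can be sketched in a few lines. The code below is only an illustration, not the Joos implementation: bytecodes are modelled as plain strings, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DupStorePop {
    // Rewrites every window "dup; istore N; pop" to "istore N".
    static List<String> rewrite(List<String> code) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < code.size()) {
            boolean clicks = i + 2 < code.size()
                    && code.get(i).equals("dup")
                    && code.get(i + 1).startsWith("istore")
                    && code.get(i + 2).equals("pop");
            if (clicks) {
                out.add(code.get(i + 1)); // keep only the store
                i += 3;                   // skip the whole window
            } else {
                out.add(code.get(i));
                i++;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> code = Arrays.asList(
                "iload_1", "iload_2", "imul", "dup", "istore_3", "pop");
        System.out.println(rewrite(code));
    }
}
```

The rewrite is safe because dup followed by pop leaves the stack exactly as a lone store does: the value is stored once and nothing extra remains on the stack.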

Peephole Transitions

Let P be a collection of peephole patterns

It defines a transition relation on sequences of bytecodes:

$$B_1 \Rightarrow_p B_2$$

meaning that $p \in P$ clicked at some position in the sequence $B_1$ and produced the sequence $B_2$

Termination

A collection of peephole patterns must terminate

This means that for the collection P, there must not exist an infinite sequence:

$$B_0 \Rightarrow_{p_1} B_1 \Rightarrow_{p_2} B_2 \Rightarrow_{p_3} B_3 \Rightarrow_{p_4} \cdots$$

for any $B_0$ and $p_i \in P$

Soundness (1/2)

Every peephole pattern must preserve semantics

Assume the pattern p transforms a bytecode sequence B1 into the sequence B2

Consider now any bytecode context C

If C[B1] emits the verifiable code E1, then C[B2] must emit some verifiable code E2 with the same semantics

Soundness (2/2)

[Diagram] For all contexts $C$ and sequences $B_1$: the pattern $p$ takes $C[B_1]$ to $C[B_2]$; emit maps $C[B_1]$ to $E_1$ and $C[B_2]$ to $E_2$, where $E_1$ and $E_2$ have the same semantics

A Peephole Pattern Language

Joos has a domain-specific language for specifying peephole patterns

The Joos compiler contains an interpreter for this peephole language

It is invoked with the option -O patternfile

It will try all patterns in an unspecified order until no pattern clicks anywhere
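The "try until nothing clicks" strategy is a fixed-point loop. The sketch below is a hypothetical illustration, not the Joos interpreter: bytecodes are plain strings, the Pattern interface is invented, and two of the example patterns from these slides are hand-coded in Java.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PeepholeDriver {
    // Hypothetical pattern interface: rewrites code in place at pos and
    // returns true if the pattern clicked there.
    interface Pattern {
        boolean tryAt(List<String> code, int pos);
    }

    // dup; istore N; pop  becomes  istore N
    static boolean dupIstorePop(List<String> code, int pos) {
        if (pos + 2 >= code.size()) return false;
        if (!code.get(pos).equals("dup")) return false;
        if (!code.get(pos + 1).startsWith("istore")) return false;
        if (!code.get(pos + 2).equals("pop")) return false;
        code.remove(pos + 2);
        code.remove(pos);
        return true;
    }

    // ldc_int i0; iadd; ldc_int i1; iadd  becomes  ldc_int (i0+i1); iadd
    static boolean constantIaddResidue(List<String> code, int pos) {
        if (pos + 3 >= code.size()) return false;
        if (!code.get(pos).startsWith("ldc_int ")) return false;
        if (!code.get(pos + 1).equals("iadd")) return false;
        if (!code.get(pos + 2).startsWith("ldc_int ")) return false;
        if (!code.get(pos + 3).equals("iadd")) return false;
        int i0 = Integer.parseInt(code.get(pos).substring(8));
        int i1 = Integer.parseInt(code.get(pos + 2).substring(8));
        code.set(pos, "ldc_int " + (i0 + i1));
        code.remove(pos + 3);
        code.remove(pos + 2);
        return true;
    }

    static List<Pattern> examplePatterns() {
        List<Pattern> patterns = new ArrayList<>();
        patterns.add(PeepholeDriver::dupIstorePop);
        patterns.add(PeepholeDriver::constantIaddResidue);
        return patterns;
    }

    // Applies the patterns repeatedly until a full pass produces no click.
    static List<String> optimize(List<String> input, List<Pattern> patterns) {
        List<String> code = new ArrayList<>(input);
        boolean clicked = true;
        while (clicked) {
            clicked = false;
            for (Pattern p : patterns) {
                for (int pos = 0; pos < code.size(); pos++) {
                    if (p.tryAt(code, pos)) clicked = true;
                }
            }
        }
        return code;
    }

    public static void main(String[] args) {
        List<String> code = Arrays.asList("iload_1", "ldc_int 5", "iadd",
                "ldc_int 7", "iadd", "dup", "istore_2", "pop");
        System.out.println(optimize(code, examplePatterns()));
    }
}
```

The loop only stops if the pattern collection terminates, which is why every click must make the code "smaller" in some sense.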

Pattern Syntax

pattern → pattern name var : exp -> intconst templates

• The exp determines whether the pattern clicks
• The intconst tells how many bytecodes to replace
• The templates specify the new bytecodes

The evaluation of exp produces a set of bindings that may be used inside the templates and later in the expression

Expression Types

The following types are possible results:
• int
• label
• type-signature
• field-signature
• method-signature
• string
• condition
• bytecodes
• boolean

The notation inst(σ1, ..., σk) means that the given instruction has these arguments in the JVM specification

Pattern Expressions

exp → var |
      degree var |
      target var |
      formals var |
      returns var |
      negate exp |
      commute exp |
      exp intop exp |
      exp intcomp exp |
      exp comp exp |
      exp ~ peepholes |
      ! exp |
      exp && exp |
      exp || exp |
      intconst |
      condconst

Peepholes and Templates

peepholes → peephole*

peephole → instruction |
           instruction (vars) |
           * |                  (any single instruction)
           var :                (label binder)

templates → template*

template → instruction |
           instruction (exps)

condconst → eq | ne | lt | le | gt | ge | aeq | ane

intop → + | - | * | / | %

intcomp → < | <= | > | >=

comp → == | !=

Peephole Judgements

The judgement:

$$\vdash E : \sigma[\rho \to \rho']$$

means that the expression E:
• evaluates to a result of type $\sigma$
• consumes the bindings $\rho$
• produces the bindings $\rho'$

The judgement:

$$\vdash X : [\rho \to \rho']$$

similarly describes peepholes, templates, and patterns

Expression Well-Formedness (1/5)

$$\frac{\rho(x) = \sigma}{\vdash x : \sigma[\rho \to \rho]}$$

$$\frac{\rho(x) = \text{label}}{\vdash \texttt{degree}\ x : \text{int}[\rho \to \rho]}$$

$$\frac{\rho(x) = \text{label}}{\vdash \texttt{target}\ x : \text{bytecodes}[\rho \to \rho]}$$

$$\frac{\rho(x) = \text{method-signature}}{\vdash \texttt{formals}\ x : \text{int}[\rho \to \rho]}$$

Expression Well-Formedness (2/5)

$$\frac{\vdash E : \text{condition}[\rho \to \rho']}{\vdash \texttt{negate}\ E : \text{condition}[\rho \to \rho']}$$

$$\frac{\rho(x) = \text{method-signature}}{\vdash \texttt{returns}\ x : \text{int}[\rho \to \rho]}$$

$$\frac{\vdash E : \text{condition}[\rho \to \rho']}{\vdash \texttt{commute}\ E : \text{condition}[\rho \to \rho']}$$

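Expression Well-Formedness (3/5)

$$\frac{\vdash E_1 : \text{int}[\rho \to \rho'] \qquad \vdash E_2 : \text{int}[\rho' \to \rho'']}{\vdash E_1\ \mathit{intop}\ E_2 : \text{int}[\rho \to \rho'']}$$

$$\frac{\vdash E_1 : \text{int}[\rho \to \rho'] \qquad \vdash E_2 : \text{int}[\rho' \to \rho'']}{\vdash E_1\ \mathit{intcomp}\ E_2 : \text{boolean}[\rho \to \rho'']}$$

$$\frac{\vdash E_1 : \sigma[\rho \to \rho'] \qquad \vdash E_2 : \sigma[\rho' \to \rho'']}{\vdash E_1\ \mathit{comp}\ E_2 : \text{boolean}[\rho \to \rho'']}$$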

Expression Well-Formedness (4/5)

$$\frac{\vdash E : \text{bytecodes}[\rho \to \rho'] \qquad \vdash P : [\rho' \to \rho'']}{\vdash E \sim P : \text{boolean}[\rho \to \rho'']}$$

$$\frac{\vdash E : \text{boolean}[\rho \to \rho']}{\vdash\ !E : \text{boolean}[\rho \to \rho]}$$

$$\frac{\vdash E_1 : \text{boolean}[\rho \to \rho'] \qquad \vdash E_2 : \text{boolean}[\rho' \to \rho'']}{\vdash E_1\ \&\&\ E_2 : \text{boolean}[\rho \to \rho'']}$$

Expression Well-Formedness (5/5)

$$\frac{\vdash E_1 : \text{boolean}[\rho \to \rho'] \qquad \vdash E_2 : \text{boolean}[\rho \to \rho''] \qquad \forall x:\ \rho'(x) = \tau' \wedge \rho''(x) = \tau'' \implies \tau' = \tau''}{\vdash E_1 \mathbin{||} E_2 : \text{boolean}[\rho \to \rho' \cap \rho'']}$$

$$\vdash k : \text{int}[\rho \to \rho]$$

$$\vdash \mathit{cond} : \text{condition}[\rho \to \rho]$$

Peephole Well-Formedness (1/2)

$$\frac{\vdash P_i : [\rho_i \to \rho_{i+1}]}{\vdash P_1 P_2 \ldots P_k : [\rho_1 \to \rho_{k+1}]}$$

$$\vdash \mathit{inst} : [\rho \to \rho]$$

$$\frac{x_i \neq x_j \qquad x_i \notin \rho \qquad \mathit{inst}(\sigma_1, \ldots, \sigma_k)}{\vdash \mathit{inst}(x_1, \ldots, x_k) : [\rho \to \rho[x_i \to \sigma_i]]}$$

Peephole Well-Formedness (2/2)

$$\vdash {*} : [\rho \to \rho]$$

$$\vdash x{:}\ : [\rho \to \rho[x \to \text{label}]]$$

$$\vdash \texttt{label}(x) : [\rho \to \rho[x \to \text{label}]]$$

Template Well-Formedness

$$\frac{\vdash T_i : [\rho_i \to \rho_{i+1}]}{\vdash T_1 T_2 \ldots T_k : [\rho_1 \to \rho_{k+1}]}$$

$$\vdash \mathit{inst} : [\rho \to \rho]$$

$$\frac{\vdash E_i : \sigma_i[\rho_i \to \rho_{i+1}] \qquad \mathit{inst}(\sigma_1, \ldots, \sigma_k)}{\vdash \mathit{inst}(E_1, \ldots, E_k) : [\rho_1 \to \rho_{k+1}]}$$

$$\frac{\vdash E : \text{label}[\rho_1 \to \rho_2]}{\vdash E{:}\,\mathit{inst} : [\rho_1 \to \rho_2]}$$

Pattern Well-Formedness

$$\frac{\vdash E : \text{boolean}[[x \to \text{bytecodes}] \to \rho] \qquad \vdash T : [\rho \to \rho']}{\vdash \texttt{pattern}\ n\ x{:}\ E\ \texttt{->}\ k\ T : [[\,] \to \rho']}$$

Pattern Examples (1/4)

pattern dup_istore_pop x:
  x ~ dup
      istore (i0)
      pop
  -> 3 istore (i0)

This pattern is relevant for code like:

x = a*b;

Pattern Examples (2/4)

pattern goto_label x:
  x ~ goto (l1)
      label (l2)
  && l1 == l2
  -> 1

This pattern arises during optimization of nested control structures

Pattern Examples (3/4)

pattern constant_iadd_residue x:
  x ~ ldc_int (i0)
      iadd
      ldc_int (i1)
      iadd
  -> 4 ldc_int (i0+i1)
       iadd

This pattern is relevant for code like:

a+5+7

Pattern Examples (4/4)

pattern goto_goto x:
  x ~ goto (l0)
  && target l0 ~ goto (l1)
  && ! (target l1 ~ goto (l2))
  && ! (target l1 ~ label (l3))
  -> 1 goto (l1)

This pattern arises during optimization of nested control structures

Proving Termination

We want to avoid infinite sequences like:

$$B_0 \Rightarrow_{p_1} B_1 \Rightarrow_{p_2} B_2 \Rightarrow_{p_3} B_3 \Rightarrow_{p_4} \cdots$$

Define an integer-valued function $\varphi$ such that:

$$\forall B:\ \varphi(B) \geq 0$$

$$\forall p \in P:\ B_1 \Rightarrow_p B_2 \implies \varphi(B_2) < \varphi(B_1)$$

Termination Function Example

For our 4 example patterns we define:

$$\varphi(B) = \#\texttt{dup} + \#\texttt{goto} + \#\texttt{iadd} + \;???$$

What gets smaller in the goto_goto pattern?

The missing term counts goto chains: summed over every label (l1) in B, the length k of the chain

$$l_1 \xrightarrow{\text{goto}} l_2 \xrightarrow{\text{goto}} l_3 \xrightarrow{\text{goto}} \cdots \xrightarrow{\text{goto}} l_k, \qquad l_i \neq l_j$$

A Non-Terminating Pattern

pattern bad_goto_goto x:
  x ~ goto (l0)
  && target l0 ~ goto (l1)
  -> 1 goto (l1)

foo: goto bar
bar: goto foo
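The two-instruction program can be simulated to see why the pattern never stops clicking. The sketch below uses a hypothetical model in which a map entry label -> target stands for the instruction "label: goto target"; the class and method names are invented for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class BadGotoGoto {
    // Applies bad_goto_goto once, at the first goto where it clicks.
    // Returns true if the pattern clicked.
    static boolean step(Map<String, String> gotos) {
        for (Map.Entry<String, String> e : gotos.entrySet()) {
            String l0 = e.getValue();
            String l1 = gotos.get(l0); // non-null if the target of l0 is itself a goto
            if (l1 != null) {
                e.setValue(l1);        // goto (l0) becomes goto (l1)
                return true;
            }
        }
        return false;
    }

    // Counts clicks, giving up after limit steps.
    static int clicksUpTo(Map<String, String> gotos, int limit) {
        int clicks = 0;
        while (clicks < limit && step(gotos)) {
            clicks++;
        }
        return clicks;
    }

    public static void main(String[] args) {
        Map<String, String> program = new LinkedHashMap<>();
        program.put("foo", "bar"); // foo: goto bar
        program.put("bar", "foo"); // bar: goto foo
        // The pattern keeps clicking until the step limit is reached.
        System.out.println(clicksUpTo(program, 1000));
    }
}
```

After the first click the program contains foo: goto foo, whose target is itself a goto, so the pattern clicks at the same position forever without changing the code.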

Proving Soundness

A formal proof of soundness for a collection of patterns requires a full formal semantics of:
• bytecode sequences
• peephole patterns
• bytecode contexts
• code emission
• the complete JVM

The pitfall is usually the universal quantification of contexts: does this really always work?

An Unsound Pattern

pattern idiv_pop x:
  x ~ idiv
      pop
  -> 1 pop

This pattern may actually click

And the resulting bytecode will always verify

But the semantics is not preserved, since it may remove a java.lang.ArithmeticException
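The pattern rewrites idiv; pop to pop; pop, which keeps the stack height right but drops the division. The difference is visible at the source level. A minimal sketch, with hypothetical class and method names, where the first method models the original bytecode and the second models the rewritten one:

```java
public class IdivPopDemo {
    // Models the original code: the quotient is computed, then discarded.
    static String withDivision(int a, int b) {
        try {
            int unused = a / b; // idiv followed by pop
            return "completed";
        } catch (ArithmeticException e) {
            return "ArithmeticException";
        }
    }

    // Models the rewritten code: both operands are popped, nothing is divided.
    static String withoutDivision(int a, int b) {
        return "completed";
    }

    public static void main(String[] args) {
        // With a zero divisor the two versions behave differently.
        System.out.println(withDivision(1, 0));
        System.out.println(withoutDivision(1, 0));
    }
}
```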