2005 International Symposium on Code Generation and Optimization Progressive Register Allocation for Irregular Architectures David Koes [email protected] Seth Copen Goldstein [email protected] March 23, 2005


2005 International Symposium on Code Generation and Optimization

Progressive Register Allocation for Irregular Architectures

David Koes [email protected]

Seth Copen Goldstein [email protected]

March 23, 2005


Irregular Architectures

• Few registers

• Register usage restrictions
  – address registers, hardwired registers...

• Memory operands

• Examples:
  – x86, 68k, ColdFire, ARM Thumb, MIPS16, V800, various DSPs...

[Figure: x86 registers eax, ebx, ecx, edx, esi, edi, ebp, esp]


Fewer Registers → More Spills

• Used gcc to compile >10,000 functions from Mediabench, Spec95, Spec2000, and micro-benchmarks

• Recorded which functions spilled

[Bar chart: percent of functions that spill (y-axis: Percent, 0 to 50) for PPC (32), 68k (16), and x86 (8)]


Register Usage Restrictions

• Instructions may prefer or require a specific subset of registers
  – x86 multiply instruction

      imul %edx,%eax // 2 byte instruction
      imul %edx,%ecx // 3 byte instruction

  – x86 divide instruction

      idivl %ecx // eax = edx:eax/ecx


Memory Operands

• Load/store not always needed to access variables allocated to memory
  – depends upon instruction
  – still less efficient than register access

      addl 8(%ebp), %eax

  vs

      movl 8(%ebp), %edx
      addl %edx, %eax


Register Allocation Challenges

• Optimize spill code
  – with few registers, spilling unavoidable

• Model register usage restrictions

• Exploit memory operands
  – affects spilling decisions


Previous Work

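[Table: each method rated on three criteria (Models Irregular Features, Fast, Optimal); the per-method marks are not preserved in this transcript]

• Graph Coloring

• Integer Programming [Goodwin and Wilken 96], [Kong and Wilken 98], [Fu and Wilken 2002]

• Separated IP [Appel and George 01]

• PBQP [Scholz and Eckstein 02]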


Our Goals

• Expressive
  – explicitly represent architectural irregularities and costs

• Proper model
  – an optimum solution results in optimal register allocation

• Progressive solution algorithm
  – more computation → better solution
  – decent feasible solution obtained quickly
  – competitive with current allocators


Multicommodity Network Flow (MCNF)

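[Figure: example multicommodity flow network for two commodities, a and b, with source, instruction, crossbar, and sink nodes and numeric labels (2, 4) on the network]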
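
For reference, a standard textbook formulation of minimum-cost multicommodity network flow is sketched below. The notation is an assumption introduced here and does not appear on the slide: $x_{ij}^{k}$ is the flow of commodity $k$ on edge $(i,j)$, $c_{ij}^{k}$ its cost, $u_{ij}$ the shared capacity, and $b_{i}^{k}$ is $1$ at the source of $k$, $-1$ at its sink, and $0$ elsewhere. The sketch places capacities on edges; the node capacities used in the talk can be handled the same way by bounding the total flow through a node.

```latex
\begin{aligned}
\min \quad & \sum_{k} \sum_{(i,j) \in E} c_{ij}^{k}\, x_{ij}^{k} \\
\text{s.t.} \quad
& \sum_{j:(i,j) \in E} x_{ij}^{k} \;-\; \sum_{j:(j,i) \in E} x_{ji}^{k} \;=\; b_{i}^{k}
  && \forall i \in N,\ \forall k \\
& \sum_{k} x_{ij}^{k} \;\le\; u_{ij}
  && \forall (i,j) \in E \\
& x_{ij}^{k} \in \{0, 1\}
  && \forall (i,j) \in E,\ \forall k
\end{aligned}
```

The second constraint is the only one that couples the commodities; without it the problem splits into independent single-commodity flows.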


Modeling Usage Constraints

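    int foo(int a, int b, int c)
    {
        a = a*b;
        return a/c;
    }

[Figure: flow network for foo. Variables a, b, and c flow through an imul node and an idiv node, each with allocation classes eax, edx, ecx, and mem; edge labels 1, -1, and 1. Annotation: "not quite right…"]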


Modeling Spills and Moves

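    int foo(int a, int b, int c)
    {
        a = a*b;
        return a/c;
    }

[Figure: flow network for foo extended to model spills and moves. Rows of allocation classes (eax, edx, ecx, mem) appear before and after the imul and idiv nodes, with variables a, b, and c flowing between them; edge labels 1, -1, 3 3 3, and 1]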


Modeling Stores

• Simple approach flawed
  – doesn’t model memory persistency

• Solution: antivariables
  – flow only through memory
  – eviction cost = store cost
  – evict only once


Register Allocation as MCNF

• Variables → Commodities

• Variable Usage → Network Design

• Nodes → Allocation Classes (Reg/Mem)

• Register Limits → Node Capacities

• Spill Costs → Edge Costs

• Variable Definition → Source

• Variable Last Use → Sink
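
The mapping above can be illustrated with a very small Python sketch. Everything in it is an assumption made for illustration: the node names (`reg@1`, `mem@1`, ...), the cost values, and the helper names `path_cost` and `is_feasible` are hypothetical, and the network has only one register and two program points. It is not the model built by the authors' allocator.

```python
from collections import Counter

# Nodes are allocation classes at a program point; the value is the node
# capacity. One register holds one variable; memory is treated as large.
capacity = {
    "reg@1": 1, "mem@1": 99,
    "reg@2": 1, "mem@2": 99,
}

# Edge costs stand in for spill and move costs (hypothetical values).
edge_cost = {
    ("reg@1", "reg@2"): 0,  # stay in the register
    ("mem@1", "mem@2"): 0,  # stay in memory
    ("reg@1", "mem@2"): 3,  # store
    ("mem@1", "reg@2"): 3,  # load
}


def path_cost(path):
    """Sum the edge costs along one variable's path from source to sink."""
    return sum(edge_cost[(u, v)] for u, v in zip(path, path[1:]))


def is_feasible(paths):
    """Check that no node carries more variables than its capacity."""
    usage = Counter(node for path in paths.values() for node in path)
    return all(usage[node] <= capacity[node] for node in usage)


# Each variable is a commodity; its path runs from definition to last use.
paths = {"a": ["reg@1", "reg@2"], "b": ["mem@1", "mem@2"]}
print(is_feasible(paths))                          # True
print(sum(path_cost(p) for p in paths.values()))   # 0
```

Putting both variables in the register would violate the capacity of `reg@1`, which is how the single-register limit shows up in the flow model.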


Solving an MCNF

• Integer solution NP-complete

• Use standard IP solvers
  – commercial solvers (CPLEX) are impressive

• Exploit structure of problem
  – variety of MCNF-specific solvers
    • empirically faster than IP solvers
    • Lagrangian relaxation technique


Lagrangian Relaxation: Intuition

• Relaxes the hard constraints
  – only have to solve single commodity flow

• Combines easy subproblems using a Lagrangian multiplier
  – an additional price on each edge

Example: edges have unit capacity

[Figure: commodities a and b choose between two edges with costs 0 and 1; after a price of 1 is added to the cost-0 edge, the edge costs are 0+1 and 1]

With price, solution to single commodity flow can be solution to multicommodity flow
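
The unit-capacity example can be replayed in a few lines of Python. This is a sketch under assumptions: the edge names `top` and `bottom` and the helper `cheapest_edges` are hypothetical, and the network is reduced to two parallel edges so that each single-commodity subproblem is just "pick the cheapest edge".

```python
def cheapest_edges(costs, prices):
    """Return every edge that minimizes cost + price for one commodity."""
    priced = {e: costs[e] + prices.get(e, 0) for e in costs}
    best = min(priced.values())
    return sorted(e for e, c in priced.items() if c == best)


costs = {"top": 0, "bottom": 1}  # both edges have unit capacity

# Without prices, a and b both want the cost-0 edge, which overloads it.
print(cheapest_edges(costs, {}))          # ['top']

# With a price of 1 on the cost-0 edge, both edges are equally attractive.
print(cheapest_edges(costs, {"top": 1}))  # ['bottom', 'top']
```

Once the two edges tie, sending a over one edge and b over the other is optimal for each single-commodity subproblem and also respects the unit capacities.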


Solution Procedure

• Compute prices using iterative subgradient optimization
  – converge to optimal prices

• At each iteration, greedily construct a feasible solution using current prices
  – allocate most expensive vars first
  – can always find an allocation
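
A toy version of this loop is sketched below in Python. It is a simplification under stated assumptions, not the authors' implementation: the network is collapsed to a set of parallel edges (`reg`, `mem`), the step size `1/t` is one common diminishing-step rule chosen here for illustration, "most expensive" is interpreted as the priced cost of a variable's relaxed choice, and the function name `solve` is hypothetical. A feasible allocation always exists in the toy because `mem` has room for every variable.

```python
def solve(costs, capacity, iterations=50):
    """Toy progressive solver.

    costs[var][edge] is the true cost of placing var on edge;
    capacity[edge] is how many variables the edge can carry.
    Returns (best_cost, best_allocation) seen over all iterations.
    """
    prices = {e: 0.0 for e in capacity}
    best_cost, best_alloc = float("inf"), None

    def priced(v, e):
        return costs[v][e] + prices[e]

    for t in range(1, iterations + 1):
        # Relaxed subproblems: each variable independently takes its
        # cheapest edge under the current prices, ignoring capacities.
        choice = {v: min(capacity, key=lambda e: priced(v, e)) for v in costs}

        # Greedy feasible allocation: most expensive variables first,
        # each taking its cheapest edge that still has room.
        order = sorted(costs, key=lambda v: priced(v, choice[v]), reverse=True)
        free = dict(capacity)
        alloc = {}
        for v in order:
            open_edges = [e for e in free if free[e] > 0]
            e = min(open_edges, key=lambda e: priced(v, e))
            alloc[v] = e
            free[e] -= 1

        total = sum(costs[v][alloc[v]] for v in alloc)
        if total < best_cost:
            best_cost, best_alloc = total, alloc

        # Subgradient step: raise the price of overused edges and lower
        # (never below zero) the price of underused ones.
        step = 1.0 / t
        for e in capacity:
            used = sum(1 for v in choice if choice[v] == e)
            prices[e] = max(0.0, prices[e] + step * (used - capacity[e]))

    return best_cost, best_alloc


costs = {"b": {"reg": 0, "mem": 2}, "a": {"reg": 0, "mem": 5}}
capacity = {"reg": 1, "mem": 2}
print(solve(costs, capacity, iterations=1))   # (5, {'b': 'reg', 'a': 'mem'})
print(solve(costs, capacity, iterations=10))  # (2, {'a': 'reg', 'b': 'mem'})
```

In this toy the first iteration already yields a feasible allocation, and a few more price updates steer the register toward a, the variable that is more expensive to keep in memory.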


Solution Procedure

• Advantages
  + have feasible solution at each step
  + iterative nature → progressive
  + Lagrangian relaxation theory provides means for computing a lower bound
  + can compute optimality bound

• Disadvantages
  – no guarantee of optimality of solution
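
The lower bound mentioned above follows from standard Lagrangian duality, sketched here in notation that is assumed for illustration and not taken from the slides: $c_{ij}^{k}$ is the cost of commodity $k$ on edge $(i,j)$, $u_{ij}$ the edge capacity, $\lambda_{ij} \ge 0$ the price on the edge, $X$ the set of flows that satisfy each commodity's own flow conservation with capacities ignored, $z^{*}$ the optimal cost, and $\bar{z}$ the cost of any feasible allocation.

```latex
\begin{aligned}
L(\lambda) &= \min_{x \in X} \sum_{k} \sum_{(i,j) \in E}
  \bigl(c_{ij}^{k} + \lambda_{ij}\bigr)\, x_{ij}^{k}
  \;-\; \sum_{(i,j) \in E} \lambda_{ij}\, u_{ij},
  \qquad \lambda \ge 0 \\[4pt]
L(\lambda) &\le z^{*} \le \bar{z} \\[4pt]
\frac{\bar{z} - z^{*}}{z^{*}} &\le \frac{\bar{z} - L(\lambda)}{L(\lambda)}
  \qquad \text{when } L(\lambda) > 0
\end{aligned}
```

The last line is the optimality bound: it can be evaluated at any iteration from the current prices and the best feasible solution, without knowing $z^{*}$.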


Evaluation

• Replace gcc’s local allocator

• Optimize for code size
  – easy to statically evaluate

• Evaluate on MediaBench, MiBench, SpecInt95, SpecInt2000
  – consider only blocks where local allocation is interesting (enough variables to spill)


Behavior of Solver


Proven Optimality

[Stacked bar chart, 0% to 100%: bars for 1, 10, 100, and 1000 iterations in each of four groups: 5-10 conflicts (355 blocks), 10-15 conflicts (23 blocks), 15-20 conflicts (7 blocks), and >= 20 conflicts (5 blocks). Legend: Optimal, Within 5%, Within 10%, Within 15%, Within 20%, >25%]


Comprehensive Results

[Bar chart: improvement over gcc (axis from -15.00% to 20.00%) after 1, 10, 100, and 1000 iterations in each of four groups: 5-10 conflicts (355 blocks), 10-15 conflicts (23 blocks), 15-20 conflicts (7 blocks), and >= 20 conflicts (5 blocks). Annotation: artifact of interaction with gcc]


Progressive Nature

:-(


Contributions

• New MCNF model for register allocation
  + expressive, can model irregular architectures
  + can be solved using conventional ILP solvers

• Progressive solution procedure
  + decent initial solution
  + maintains feasible solution
  + improves solution over time
  – no optimality guarantees
