2005 International Symposium on Code Generation and Optimization
Progressive Register Allocation for Irregular Architectures
David Koes
Seth Copen Goldstein
March 23, 2005
Irregular Architectures
• Few registers
• Register usage restrictions
  – address registers, hardwired registers...
• Memory operands
• Examples:
  – x86, 68k, ColdFire, ARM Thumb, MIPS16, V800, various DSPs...
(Figure: the eight x86 registers: eax, ebx, ecx, edx, esi, edi, ebp, esp)
Fewer Registers → More Spills
• Used gcc to compile >10,000 functions from MediaBench, Spec95, Spec2000, and micro-benchmarks
• Recorded which functions spilled
(Figure: bar chart, "Percent of functions that spill"; y-axis: percent, 0 to 50; bars for PPC (32 registers), 68k (16 registers), and x86 (8 registers))
Register Usage Restrictions
• Instructions may prefer or require a specific subset of registers
  – x86 multiply instruction
      imul %edx,%eax // 2 byte instruction
      imul %edx,%ecx // 3 byte instruction
  – x86 divide instruction
      idivl %ecx // eax = edx:eax/ecx
Memory Operands
• Load/store not always needed to access variables allocated to memory
  – depends upon instruction
  – still less efficient than register access
      addl 8(%ebp), %eax
    vs
      movl 8(%ebp), %edx
      addl %edx, %eax
Register Allocation Challenges
• Optimize spill code
  – with few registers, spilling unavoidable
• Model register usage restrictions
• Exploit memory operands
  – affects spilling decisions
Previous Work
Methods compared on: Models Irregular Features, Fast, Optimal
• Graph Coloring
• Integer Programming [Goodwin and Wilken 96], [Kong and Wilken 98], [Fu and Wilken 2002]
• Separated IP [Appel and George 01]
• PBQP [Scholz and Eckstein 02]
Our Goals
• Expressive
  – explicitly represent architectural irregularities and costs
• Proper model
  – an optimum solution results in optimal register allocation
• Progressive solution algorithm
  – more computation → better solution
  – decent feasible solution obtained quickly
  – competitive with current allocators
Multicommodity Network Flow (MCNF)
(Figure: example multicommodity flow network for two variables, a and b, with source, instruction, crossbar, and sink nodes and numeric edge labels)
Modeling Usage Constraints
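int foo(int a, int b, int c)
{
  a = a*b;
  return a/c;
}
(Figure: network for foo with an imul node and an idiv node, each offering the allocation classes eax, edx, ecx, and mem to the variables a, b, and c; edge labels include 1 and -1)
not quite right…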
Modeling Spills and Moves
int foo(int a, int b, int c)
{
  a = a*b;
  return a/c;
}
(Figure: the network for foo with additional rows of eax, edx, ecx, and mem nodes placed around the imul and idiv nodes so that variables a, b, and c can move or spill between instructions; edge labels include 1, -1, and 3)
Modeling Stores
• Simple approach flawed
  – doesn’t model memory persistency
• Solution: antivariables
  – flow only through memory
  – eviction cost = store cost
  – evict only once
Register Allocation as MCNF
• Variables ↔ Commodities
• Variable Usage ↔ Network Design
• Nodes ↔ Allocation Classes (Reg/Mem)
• Register Limits ↔ Node Capacities
• Spill Costs ↔ Edge Costs
• Variable Definition ↔ Source
• Variable Last Use ↔ Sink
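The correspondence above can be made concrete with a minimal sketch. It is not the authors' implementation: the layered network, the cost values, and names such as `CLASSES`, `MEM_COST`, `MOVE_COST`, and `best_allocation` are illustrative assumptions. Each variable is a commodity that picks one allocation class per program point, registers have capacity one, memory is uncapacitated, and the cheapest joint routing is found by brute force, which is only practical at toy sizes.

```python
from itertools import product

# Hypothetical toy instance: two registers plus memory as allocation classes.
CLASSES = ("r0", "r1", "mem")
CAPACITY = {"r0": 1, "r1": 1, "mem": None}  # None means unlimited

MEM_COST = 3   # assumed cost of holding a variable in memory at one point
MOVE_COST = 1  # assumed cost of changing allocation class between points

# live[v] lists the program points where variable v needs a location.
# The first point plays the role of the source, the last one the sink.
live = {"a": [0, 1, 2], "b": [0, 1], "c": [1, 2]}


def path_cost(path):
    """Cost of one commodity's path: memory nodes plus class changes."""
    cost = sum(MEM_COST for cls in path if cls == "mem")
    cost += sum(MOVE_COST for p, q in zip(path, path[1:]) if p != q)
    return cost


def feasible(paths):
    """Check node capacities: one variable per register per point."""
    used = {}
    for var, path in paths.items():
        for point, cls in zip(live[var], path):
            if CAPACITY[cls] is None:
                continue
            used[(point, cls)] = used.get((point, cls), 0) + 1
            if used[(point, cls)] > CAPACITY[cls]:
                return False
    return True


def best_allocation():
    """Enumerate every joint routing and keep the cheapest feasible one."""
    names = list(live)
    choices = [list(product(CLASSES, repeat=len(live[v]))) for v in names]
    best = None
    for combo in product(*choices):
        paths = dict(zip(names, combo))
        if not feasible(paths):
            continue
        total = sum(path_cost(p) for p in combo)
        if best is None or total < best[0]:
            best = (total, paths)
    return best


cost, paths = best_allocation()
print(cost, paths)
```

Three variables are live at point 1 but only two registers exist, so any feasible routing must send one commodity through memory there. Enumeration does not scale, which is the motivation for the specialized solution techniques discussed next.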
Solving an MCNF
• Integer solution NP-complete
• Use standard IP solvers
  – commercial solvers (CPLEX) are impressive
• Exploit structure of problem
  – variety of MCNF-specific solvers
• empirically faster than IP solvers
• Lagrangian Relaxation technique
Lagrangian Relaxation: Intuition
• Relaxes the hard constraints
  – only have to solve single commodity flow
• Combines easy subproblems using a Lagrangian multiplier
  – an additional price on each edge
Example: edges have unit capacity
(Figure: commodities a and b choose between two edges with costs 0 and 1; once a price of 1 is added, the cheaper edge costs 0+1)
With price, solution to single commodity flow can be solution to multicommodity flow
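The unit-capacity example can be traced in a few lines. This is a sketch under assumptions rather than the authors' code: the two parallel edges with costs 0 and 1, the names `EDGE_COST`, `cheapest_edges`, and `has_feasible_choice`, and the rule of reporting every edge that ties for cheapest are illustrative choices.

```python
# Two commodities (a and b) must each cross one of two parallel edges.
# Every edge has unit capacity, so both cannot use the same edge.
EDGE_COST = {"top": 0, "bottom": 1}  # assumed costs for the example
COMMODITIES = ("a", "b")


def cheapest_edges(prices):
    """Single-commodity subproblem: edges minimizing cost plus price."""
    priced = {e: EDGE_COST[e] + prices[e] for e in EDGE_COST}
    low = min(priced.values())
    return sorted(e for e, c in priced.items() if c == low)


def has_feasible_choice(prices):
    """True if the commodities can take distinct edges that are each
    optimal for the priced single-commodity problem."""
    return len(cheapest_edges(prices)) >= len(COMMODITIES)


no_price = {"top": 0, "bottom": 0}
with_price = {"top": 1, "bottom": 0}

print(cheapest_edges(no_price), has_feasible_choice(no_price))
print(cheapest_edges(with_price), has_feasible_choice(with_price))
```

Without a price both commodities want the cost-0 edge, which violates its capacity. With a price of 1 on that edge the two edges tie, so one commodity can take each edge and the independent single-commodity answers also form a valid multicommodity flow.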
Solution Procedure
• Compute prices using iterative subgradient optimization
  – converge to optimal prices
• At each iteration, greedily construct a feasible solution using current prices
  – allocate most expensive vars first
  – can always find an allocation
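The two steps above can be sketched on a toy instance. Everything specific here is an assumption made for illustration, not the authors' algorithm as implemented: the cost table `COST`, the diminishing step size `1/t`, and reading "most expensive" as "largest worst-case cost". Memory is given enough capacity for every variable, which is why the greedy pass can always find an allocation.

```python
# Hypothetical instance: COST[v][c] is the cost of keeping variable v in
# allocation class c. Registers hold one variable; memory holds them all.
COST = {
    "a": {"r0": 0, "r1": 0, "mem": 6},
    "b": {"r0": 0, "r1": 0, "mem": 4},
    "c": {"r0": 0, "r1": 0, "mem": 2},
}
CAPACITY = {"r0": 1, "r1": 1, "mem": len(COST)}


def priced(var, cls, prices):
    """True cost of a class plus its current Lagrangian price."""
    return COST[var][cls] + prices[cls]


def relaxed_choice(prices):
    """Each variable independently takes its cheapest priced class."""
    return {v: min(CAPACITY, key=lambda c: priced(v, c, prices)) for v in COST}


def greedy_feasible(prices):
    """Place the most expensive variables first, respecting capacities."""
    free = dict(CAPACITY)
    order = sorted(COST, key=lambda v: max(COST[v].values()), reverse=True)
    alloc = {}
    for v in order:
        open_classes = [c for c in free if free[c] > 0]
        best = min(open_classes, key=lambda c: priced(v, c, prices))
        alloc[v] = best
        free[best] -= 1
    return alloc


def true_cost(alloc):
    """Cost of an allocation without prices."""
    return sum(COST[v][c] for v, c in alloc.items())


def solve(iterations=50):
    prices = {c: 0.0 for c in CAPACITY}
    best = None
    for t in range(1, iterations + 1):
        # Keep the best feasible allocation seen so far.
        alloc = greedy_feasible(prices)
        if best is None or true_cost(alloc) < true_cost(best):
            best = alloc
        # Subgradient step: raise the price of overused classes and
        # lower it (never below zero) for underused ones.
        choice = relaxed_choice(prices)
        for c in CAPACITY:
            load = sum(1 for v in choice if choice[v] == c)
            prices[c] = max(0.0, prices[c] + (1.0 / t) * (load - CAPACITY[c]))
    return best, prices


best, prices = solve()
print(best, true_cost(best))
```

Because a feasible allocation is recorded in every iteration, the loop can be stopped at any point and still return a usable result.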
Solution Procedure
• Advantages
  + have feasible solution at each step
  + iterative nature → progressive
  + Lagrangian relaxation theory provides means for computing a lower bound
  + can compute optimality bound
• Disadvantages
  – no guarantee of optimality of solution
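For reference, the lower bound follows from standard Lagrangian duality. The notation is an assumption and does not come from the slides: $x^k_e$ is the flow of commodity $k$ on edge $e$, $c^k_e$ its cost, $u_e$ the edge capacity, $w_e \ge 0$ the price, $X$ the set of flows that satisfy each commodity's own constraints, $z^*$ the optimal cost, and $z_{\text{feasible}}$ the cost of any feasible solution in hand.

```latex
L(w) = \min_{x \in X} \; \sum_{k} \sum_{e} c^k_e \, x^k_e
       + \sum_{e} w_e \Bigl( \sum_{k} x^k_e - u_e \Bigr)

L(w) \;\le\; z^* \;\le\; z_{\text{feasible}}

\text{gap} = \frac{z_{\text{feasible}} - L(w)}{L(w)} \qquad (L(w) > 0)
```

A gap of zero proves that the feasible solution is optimal. A positive gap only bounds how far from optimal it can be, which matches the disadvantage listed above.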
Evaluation
• Replace gcc’s local allocator
• Optimize for code size
  – easy to statically evaluate
• Evaluate on MediaBench, MiBench, SpecInt95, SpecInt2000
  – consider only blocks where local allocation is interesting (enough variables to spill)
Behavior of Solver
Proven Optimality
(Figure: stacked bar chart, y-axis 0% to 100%, with bars for 1, 10, 100, and 1000 iterations in each of four groups: 5-10 conflicts (355 blocks), 10-15 conflicts (23 blocks), 15-20 conflicts (7 blocks), and >= 20 conflicts (5 blocks). Legend: Optimal, Within 5%, Within 10%, Within 15%, Within 20%, >25%)
Comprehensive Results
(Figure: bar chart, "Improvement over gcc", y-axis -15% to 20%, with bars for 1, 10, 100, and 1000 iterations in each of four groups: 5-10 conflicts (355 blocks), 10-15 conflicts (23 blocks), 15-20 conflicts (7 blocks), and >= 20 conflicts (5 blocks). Annotation: artifact of interaction with gcc)
Progressive Nature
:-(
Contributions
• New MCNF model for register allocation
  + expressive, can model irregular architectures
  + can be solved using conventional ILP solvers
• Progressive solution procedure
  + decent initial solution
  + maintains feasible solution
  + improves solution over time
  – no optimality guarantees