Upload
others
View
20
Download
1
Embed Size (px)
Citation preview
LLVM Greedy Register Allocator – Improving Region Split DecisionsMarina Yatsina
Intel Corporation, Israel
April 16-17, 2018 European Developers Meeting
Bristol, United Kingdom
2
Motivation
3
Motivation
4
Motivation
* idiv implicitly clobbers %edx
5
Motivation
%ecx %ebx %edi %edx
%ebp %ecx %ebx %edi
%ebp %ecx %ebx %edi
%ecx %ebx %edi %edx
* idiv implicitly clobbers %edx
6
Motivation
%ecx %ebx %edi %edx
%ebp %ecx %ebx %edi
%ebp %ecx %ebx %edi
%ecx %ebx %edi %edx
* idiv implicitly clobbers %edx
7
Motivation
%ecx %ebx %edi %edx
%ebp %ecx %ebx %edi
%ebp %ecx %ebx %edi
%ecx %ebx %edi %edx
* idiv implicitly clobbers %edx
8
Motivation
%edx
%ebp
%ebp
%edx
* idiv implicitly clobbers %edx
9
Motivation
%edx
%ebp
%ebp
%edx
* idiv implicitly clobbers %edx
10
Greedy Register Allocator
• Greedy Register Allocator Overview
• Region Split
• Encountered Issues
• Performance Impact
11
Greedy Register Allocator
• Greedy Register Allocator Overview
• Region Split
• Encountered Issues
• Performance Impact
12
High Level Design of Code Generator
Instruction Selection
Scheduling and Formation
Late Machine Code
Optimizations
Prolog/Epilog Code Insertion
Register Allocation
SSA-based Machine Code Optimizations
Code Emission
Instruction Selection
Scheduling and Formation
SSA-based Machine Code Optimizations
13
Greedy Register Allocator Overview• General flow
Live Interval Analysis
Spill Weight Calculation
SplitEvictionRegister Assignment
Spill
Priority Queue Construction
14
Greedy Register Allocator Overview• General flow
Live Interval Analysis
Spill Weight Calculation
SplitEvictionRegister Assignment
Spill
Priority Queue Construction
15
Greedy Register Allocator Overview• General flow
Live Interval Analysis
Spill Weight Calculation
SplitEvictionRegister Assignment
Spill
Stand alone passPriority Queue Construction
16
Live Interval Analysis
BB#0:x = ……
BB#1:… = …x……
BB#2:…… = …x……
17
Live Interval Analysis• Earlier pass named SlotIndexes added numbering to the instructions
000 BB#0:001 x = …002 …
003 BB#1:004 … = …x…005 …
006 BB#2:007 …008 … = …x…009 …
18
Live Interval Analysis• Analyze x’s live interval
000 BB#0:001 x = …002 …
003 BB#1:004 … = …x…005 …
006 BB#2:007 …008 … = …x…009 …
x
19
Live Interval Analysis• Look for uses and defs
000 BB#0:001 x = …002 …
003 BB#1:004 … = …x…005 …
006 BB#2:007 …008 … = …x…009 …
x
20
Live Interval Analysis• Connect them into intervals
000 BB#0:001 x = …002 …
003 BB#1:004 … = …x…005 …
006 BB#2:007 …008 … = …x…009 …
x
21
Live Interval Analysis• Represented x’s liveness as a collection of segments
x [001, 004), [006, 008)000 BB#0:
001 x = …002 …
003 BB#1:004 … = …x…005 …
006 BB#2:007 …008 … = …x…009 …
22
Greedy Register Allocator Overview• General flow
Live Interval Analysis
Spill Weight Calculation
SplitEvictionRegister Assignment
Spill
Priority Queue Construction
23
Greedy Register Allocator Overview• General flow
Live Interval Analysis
Spill Weight Calculation
SplitEvictionRegister Assignment
Spill
Priority Queue Construction
RegAllocGreedy Pass
24
Greedy Register Allocator Overview• General flow
Live Interval Analysis
Spill Weight Calculation
SplitEvictionRegister Assignment
Spill
Priority Queue Construction
25
Greedy Register Allocator Overview• General flow
Live Interval Analysis
Spill Weight Calculation
SplitEvictionRegister Assignment
Priority Queue Construction
Spill
For every interval
26
Spill Weight Calculation• Intervals calculated by Live Interval Analysis
w x y z
27
Spill Weight Calculation• Estimate spill weight of each interval based on interval characteristics• w has uses in a hot loop
• Higher spill weight
• x is cheaply rematerializable• Lower spill weight
30 KG5 KG 10 KG
w x y z
20 KG
28
Greedy Register Allocator Overview• General flow
Live Interval Analysis
Spill Weight Calculation
SplitEvictionRegister Assignment
Priority Queue Construction
Spill
29
Greedy Register Allocator Overview• General flow
Live Interval Analysis
Spill Weight Calculation
SplitEvictionRegister Assignment
Spill
Priority Queue Construction
For every interval
30
Priority Queue Construction
w x y z
31
Priority Queue Construction
w x y z
32
Priority Queue Construction
w x y z
33
Priority Queue Construction• Calculate interval allocation priority and insert into the queue
Priority Queue
w x y z
34
Priority Queue Construction• Calculate interval allocation priority and insert into the queue• w is local in one basic block
• Lower allocation priority
Priority Queuew
x y z
35
Priority Queue Construction• Calculate interval allocation priority and insert into the queue• x is global and spans across a lot of instructions
• Higher allocation priority
Priority Queuew x
y z
36
Priority Queue Construction• Calculate interval allocation priority and insert into the queue
Priority Queuew xy
z
37
Priority Queue Construction• Calculate interval allocation priority and insert into the queue
Priority Queuew xy z
38
Priority Queue Construction• The Priority Queue will always dequeue the interval with the highest priority
Priority Queuew xy z
39
Greedy Register Allocator Overview• General flow
Live Interval Analysis
Spill Weight Calculation
SplitEvictionRegister Assignment
Priority Queue Construction
Spill
40
Greedy Register Allocator Overview• General flow
Live Interval Analysis
Spill Weight Calculation
SplitEvictionRegister Assignment
Priority Queue Construction
Spill
For every interval in queue
41
Greedy Register Allocator Overview• General flow
Live Interval Analysis
Spill Weight Calculation
SplitEvictionRegister Assignment
Priority Queue Construction
Spill
42
Register Assignment• R0, R1 are the physical registers in the system
R0 R1
43
Register Assignment
Priority Queue
w xy z
R0 R1
44
Register Assignment• Dequeue interval with highest priority
Priority Queue
y zw x
R0 R1
45
Register Assignment
R0 R1
Priority Queue
y zw
x
46
Register Assignment• Assign to available register if possible
R0 R1
Priority Queue
y zw
x
x
x
47
Register Assignment• Dequeue interval with highest priority
R0 R1
Priority Queue
wy
x
x
x
z
48
Register Assignment
R0 R1
Priority Queue
wy
x
x
x
z
49
Register Assignment• Assign to available register if possible
R0 R1
Priority Queue
wy
x
x
xz
z
50
Register Assignment• Dequeue interval with highest priority
R0 R1
Priority Queue
y
x
x
xz
z
w
51
Register Assignment
R0 R1
Priority Queue
y
x
x
xz
z
w
52
Register Assignment• Interference with x in R0
R0 R1
Priority Queue
y x
xz
z
w
x
53
Register Assignment• Interference with z in R1
R0 R1
Priority Queue
y
xz
z
w
x
x
54
Greedy Register Allocator Overview• General flow
Live Interval Analysis
Spill Weight Calculation
SplitEvictionRegister Assignment
Priority Queue Construction
Spill
55
Eviction
R0 R1
Priority Queue
y
x
x
xz
z
w
56
Eviction• Compare spill weights of interfering intervals
R0 R1
Priority Queue
y
x
x
xz
z
w
20 KG30 KG5 KG
57
Eviction• x cheaper than w
R0 R1
Priority Queue
y
x
x
xz
z
w
20 KG30 KG5 KG
58
Eviction• Evict x
R0 R1
Priority Queue
y
z
z
w x
59
Eviction• Assign w
R0 R1
Priority Queue
y
z
z
x
w
60
Eviction
R0 R1
Priority Queue
y
z
zw
x
61
Eviction• Enqueue x back to the queue• Usually receives the same allocation priority R0 R1
Priority Queue
y
z
zw
x
62
Register Assignment
R0 R1
z
zw
Priority Queue
xy
63
Register Assignment
• Dequeue interval with highest priority
R0 R1
z
zw
Priority Queue
y x
64
Register Assignment
R0 R1
z
zw
Priority Queue
y
x
65
Register Assignment
• Interference with w in R0
R0 R1
z
zPriority Queue
y
x
w
66
Register Assignment
• Interference with z in R1
R0 R1x
z
zw
Priority Queue
y
67
Eviction• Compare spill weights of interfering intervals• Can we Evict? R0 R1
z
zw
Priority Queue
y
x
20 KG30 KG5 KG
68
Eviction• Compare spill weights of interfering intervals• Can we Evict? No!
• x is the cheapest oneR0 R1
z
zw
Priority Queue
y
x
20 KG30 KG5 KG
69
Register Assignment
R0 R1
z
zw
Priority Queue
y
x
70
Register Assignment• Mark x to be split
R0 R1
z
zw
Priority Queue
y
x*
71
Register Assignment
R0 R1
z
zw
Priority Queue
y x*
72
Register Assignment• Enqueue x* back to the queue• Intervals marked to be split receive lower allocation priority R0 R1
z
zw
Priority Queue
y x*
73
Register Assignment
R0 R1
z
zw
Priority Queue
yx*
74
Register Assignment• Dequeue interval with highest priority
R0 R1
z
zw
Priority Queue
x* y
75
Register Assignment
R0 R1
z
zw
Priority Queue
x*
y
76
Register Assignment• Assign to available register if possible
R0 R1
z
zw
Priority Queue
x*y
77
Register Assignment• Dequeue interval with highest priority
R0 R1
z
zw
Priority Queue
x*y
78
Register Assignment• x is marked to be split
R0 R1
z
zw
Priority Queue
x*
y
79
Greedy Register Allocator Overview• General flow
Live Interval Analysis
Spill Weight Calculation
SplitEvictionRegister Assignment
Priority Queue Construction
Spill
80
Split• Is split beneficial?
R0 R1
z
zw
Priority Queue
x*
y
81
Split• Is split beneficial? • (Assume) Yes! R0 R1
z
zw
Priority Queue
x*
y
82
Split• Is split beneficial? • (Assume) Yes!
• Find the best way to splitR0 R1
z
zw
Priority Queue
x*
y
83
Split• Do the split
• Add COPY instruction between the intervals
R0 R1
z
zw
Priority Queue
x* x1 x2
y
x2 =COPY x1
x1 =COPY x2
84
Split
R0 R1
z
zw
Priority Queue
x1 x2
y
85
Split• Calculate spill weights• Split artifacts may receive higher weight
than the original intervalR0 R1
z
zw
Priority Queue
x1 x2
15 KG2 KG
y
86
Split
R0 R1
z
zw
Priority Queue
x1 x2y
87
Split• Enqueue x1, x2 into the queue• This will also calculate their allocation priority R0 R1
z
zw
Priority Queue
x1 x2y
88
Register Assignment
R0 R1
z
zw
Priority Queue
yx1x2
89
Register Assignment• Dequeue interval with highest priority
R0 R1
z
zw
Priority Queue
yx2 x1
90
Register Assignment
R0 R1
z
zw
Priority Queue
yx2
x1
91
Register Assignment• Assign to available register if possible
R0 R1
z
zw
Priority Queue
yx2
x1
x1
x1
92
Register Assignment• Dequeue interval with highest priority
R0 R1
z
zw
Priority Queue
y
x1
x1x2
x1
93
Register Assignment
R0 R1
z
zw
Priority Queue
y
x1
x1
x2
x1
94
Register Assignment• Interference with y in R0
R0 R1
z
zw
x1
x1
x2
Priority Queue
y
x1
95
Register Assignment• Interference with z in R1
R0 R1
z
x1
x1
x2
w
y
Priority Queue
x1
z
96
Register Assignment
R0 R1
z
zw
Priority Queue
y
x1
x1
x2
x1
97
Eviction• Compare spill weights of interfering intervals
R0 R1
z
zw
Priority Queue
y
x1
x1
x2
15 KG 10 KG
x1
20 KG
98
Eviction• y cheaper than x2
R0 R1
z
zw
Priority Queue
y
x1
x1
x2
15 KG 10 KG
x1
20 KG
99
Eviction• Evict y
R0 R1
z
zw
Priority Queue
x1
x1
x2 y
x1
100
Eviction• A split artifact can evict original interval
• Part of the split-eviction gradual refinement
R0 R1
z
zw
Priority Queue
x1
x1
x2 y
x1
101
Eviction• Assign x2
R0 R1
z
zw
Priority Queue
x1
x1x2
y
x1x2
102
Eviction
R0 R1
z
zw
Priority Queue
x1
x1x2y
x1x2
103
Eviction• Enqueue y back to the queue
R0 R1
z
zw
Priority Queue
x1
x1x2y
x1x2
104
Register Assignment
R0 R1
z
zw
Priority Queue
x1
x1x2y
x1x2
105
Register Assignment• Dequeue interval with highest priority
R0 R1
z
zw
Priority Queue
x1
x1x2y
x1x2
106
Register Assignment
R0 R1
z
zw
Priority Queue
x1
x1x2
y
x1x2
107
Register Assignment• Interference with x2 in R0
R0 R1
z
zw
x1
x1
y
Priority Queue
x2
x1x2
108
Register Assignment• Interference with z in R1
R0 R1
z
x1
x1
y
Priority Queue
w
x2
x1
zx2
109
Eviction• Compare spill weights of interfering intervals• Can we Evict? R0 R1
z
zw
Priority Queue
x1
x1x2
y
15 KG10 KG
x1
20 KG
x2
110
Eviction• Compare spill weights of interfering intervals• Can we Evict? No!
• y is the cheapest oneR0 R1
z
zw
Priority Queue
x1
x1x2
y
15 KG10 KG
x1
20 KG
x2
111
Eviction
R0 R1
z
zw
Priority Queue
x1
x1x2
y
x1x2
112
Register Assignment• Mark y to be split
R0 R1
z
zw
Priority Queue
x1
x1x2
y*
x1x2
113
Register Assignment
R0 R1
z
zw
Priority Queue
x1
x1x2y*
x1x2
114
Register Assignment• Enqueue y* back to the queue
R0 R1
z
zw
Priority Queue
x1
x1x2y*
x1x2
115
Register Assignment
R0 R1
z
zw
Priority Queue
x1
x1x2y*
x1x2
116
Register Assignment• Dequeue interval with highest priority
R0 R1
z
zw
Priority Queue
x1
x1x2y*
x1x2
117
Register Assignment• Register Assignment
R0 R1
z
zw
Priority Queue
x1
x1x2
y*
x1x2
118
Split• Is split beneficial?
R0 R1
z
zw
Priority Queue
x1
x1x2
y*
x1x2
119
Split• Is split beneficial? • (Assume) No! R0 R1
z
zw
Priority Queue
x1
x1x2
y*
x1x2
120
Greedy Register Allocator Overview• General flow
Live Interval Analysis
Spill Weight Calculation
SplitEvictionRegister Assignment
Priority Queue Construction
Spill
121
Spill
R0 R1
z
zw
Priority Queue
x1
x1x2
y*
x1x2
122
Spill• Spill around uses• Spill after def
• Reload before useR0 R1
z
zw
Priority Queue
x1
x1x2
y*
spillreload
x1x2
123
Spill• Create new intervals for spills and reloads
R0 R1
z
zw
Priority Queue
x1
x1x2
y* y1 y2
spillreload
spillreload
x1x2
124
Spill
R0 R1
z
zw
Priority Queue
x1
x1x2
y1 y2
x1x2
125
Spill• Calculate spill weights
R0 R1
z
zw
Priority Queue
x1
x1x2
y1 y2
3 KG 2 KG
x1x2
126
Spill• Spill
R0 R1
z
zw
Priority Queue
x1
x1x2
y1 y2
y2y1
x1x2
127
Spill• Enqueue y1, y2 into the queue• This will also calculate their allocation
priorityR0 R1
z
zw
Priority Queue
x1
x1x2
y1 y2
y2y1
x1x2
128
Register Assignment
R0 R1
z
zw
Priority Queue
x1
x1x2y2y1
x1x2
129
Register Assignment• Dequeue interval with highest priority
R0 R1
z
zw
Priority Queue
x1
x1x2y1 y2
x1x2
130
Register Assignment
R0 R1
z
zw
Priority Queue
x1
x1x2y1
y2
x1x2
131
Register Assignment• Assign to available register if possible
R0 R1
z
zw
Priority Queue
x1
x1x2y1y2
x1x2
132
Register Assignment• Dequeue interval with highest priority
R0 R1
z
zw
Priority Queue
x1
x1x2y2
y1
x1x2
133
Register Assignment
R0 R1
z
zw
Priority Queue
x1
x1x2y2
y1
x1x2
134
Register Assignment• Assign to available register if possible
R0 R1
z
zw
Priority Queue
x1
x1x2y2
y1
x1x2
135
Greedy Register Allocator
• Greedy Register Allocator Overview
• Region Split
• Encountered Issues
• Performance Impact
136
Motivation
%ecx %ebx %edi %edx
%ebp %ecx %ebx %edi
%ebp %ecx %ebx %edi
%ecx %ebx %edi %edx
137
Exploration• Why did the register allocator create these redundant mov instructions?
138
Exploration• Why did the register allocator create these redundant mov instructions?• Artifacts of split x* x1 x2
x2 =COPY x1
x1 =COPY x2
139
Exploration• Why did the register allocator create these redundant mov instructions?• Artifacts of split
• If we would have chosen to do the split differently we could have avoided the redundant mov instructions
• Why was this way to split was chosen?
140
Greedy Register Allocator Overview• General flow
Live Interval Analysis
Spill Weight Calculation
SplitEvictionRegister Assignment
Priority Queue Construction
Spill
141
Region Split• How to find the best way to split?
• How to know if split is beneficial?
R0 R1
z
zw
Priority Queue
x*
y
142
Find Best Split
R0 R1x R2 … Rn
143
Find Best Split• The registers already have assigned intervals
R0 R1x R2 … Rn
144
Find Best Split• The registers already have assigned intervals
R0 R1
z
zy
x
w
R2 … Rn
145
Find Best Split• The registers already have assigned intervals• These intervals impose allocation constraints
R0 R1x R2 … Rn
146
Find Best Split• Do the split of x for each one of the registers
R0 R1x R2 … Rn
147
Find Best Split• Do the split of x for each one of the registers
R0 R1x R2 … Rn
148
Find Best Split• Do the split of x for each one of the registers
R0 R1x R2 … Rn
149
Find Best Split• Do the split of x for each one of the registers
R0 R1x R2 … Rn
150
Find Best Split• Do the split of x for each one of the registers• Estimate split cost, e.g. the amount of spill code this split may cause
R0 R1x R2 … Rn
20$15$10$20$
151
Find Best Split• Do the split of x for each one of the registers• Choose the cheapest one
R0 R1x R2 … Rn
20$15$10$20$
152
Find Best Split• Do the split of x for each one of the registers
R0 R1x R2 … Rn
Find Best Split for Given Register• How do we do the best split for a given register R1?
153
x R1
• The region split is usually divided into 2 intervals
154
Find Best Split for Given Register
x x1
x2 =COPY x1
x2
x1 =COPY x2
R1
• The region split is usually divided into 2 intervals• ByReg
• Parts of x that pass on R1 register• Should comply with current allocation
constraints provided by intervals alreadyassigned to R1
• ByStack• Parts of x that pass on or on the stack• Usually where the already allocated
R1 intervals interfere with x
Find Best Split for Given Register
155
x x1ByReg
x2 =COPY x1
x2ByStack
x1 =COPY x2
R1
• A good split
Find Best Split for Given Register
156
x x1ByReg
x2 =COPY x1
x2ByStack
x1 =COPY x2
R1
• A good split• Reduces the transitions between ByReg and ByStack
• Each such transitions is potentially a spill/reload• In case ByStack is not allocated to
another register
• Places the transitions in blocks less frequently executed
Find Best Split for Given Register
157
x x1ByReg
x2 =COPY x1
x2ByStack
x1 =COPY x2
R1
• A good split• Reduces the transitions between ByReg and ByStack
• Each such transitions is potentially a spill/reload• In case ByStack is not allocated to
another register
• Places the transitions in blocks less frequently executed
• Use Hopfield neural network• Converges to a result that satisfies
the above characteristics
Find Best Split for Given Register
158
x x1ByReg
x2 =COPY x1
x2ByStack
x1 =COPY x2
R1
• Split reduces the amount of spills compared to spilling around uses
Determine if Split is Beneficial
159
BB#1:… = …x……
BB#2:…
BB#3:…
BB#0:x = ……
BB#4:…
BB#5:… = …x……
• Split reduces the amount of spills compared to spilling around uses• Use/def blocks must have x in a register
at some point
Determine if Split is Beneficial
160
BB#1:… = …x……
BB#2:…
BB#3:…
BB#0:x = ……
BB#4:…
BB#5:… = …x……
• Split reduces the amount of spills compared to spilling around uses• Use/def blocks must have x in a register
at some point
• If the split can create “regions” of several basic blocks where x is passed by registerthis will reduce the amount of spills• Only if constraints allow it
Determine if Split is Beneficial
161
BB#1:… = …x……
BB#2:…
BB#3:…
BB#0:x = ……
BB#4:…
BB#5:… = …x……
ByReg
• Split reduces the amount of spills compared to spilling around uses• Use/def blocks must have x in a register
at some point
• If the split can create “regions” of several basic blocks where x is passed by registerthis will reduce the amount of spills• Only if constraints allow it
Determine if Split is Beneficial
162
BB#1:… = …x……
BB#2:…
BB#3:…
BB#0:x = ……
BB#4:…
BB#5:… = …x……
Passed by register region
163
Greedy Register Allocator
• Greedy Register Allocator Overview
• Region Split
• Encountered Issues
• Performance Impact
• Inaccurate split cost calculation• Root cause of the following encountered issues
• Does not model the affect of local interference caused by the split• Makes the split cost inaccurate
• The “cheapest” split may actually be more expensive than other splits
• Can choose suboptimal split
Region Split Cost Issues
164
20$15$10$20$
• Find best split of interval x for R1• Using Hopfield neural network
Local Interference
165
BB#1:… ……… … ……
R1
• Find best split of interval x for R1• Using Hopfield neural network
• The network determines how x will be passed on the CFG edges
Local Interference
166
BB#1:… ……… … ……
R1
• Find best split of interval x for R1• Using Hopfield neural network
• The network determines how x will be passed on the CFG edges• “ByReg” interval or “By stack” interval
Local Interference
167
BB#1:…………… ……
R1
x1 ByReg
x2 ByStack
• Find best split of interval x for R1• Using Hopfield neural network
• The network determines how x will be passed on the CFG edges• “ByReg” interval or “By stack” interval• Determined which basic block will have a copy
between these two intervals
Local Interference
168
BB#1:………x2 = x1… ……
R1
x1 ByReg
x2 ByStack
• Find best split of interval x for R1• Using Hopfield neural network
• The network determines how x will be passed on the CFG edges• “ByReg” interval or “By stack” interval• Determined which basic block will have a copy
between these two intervals
• The Hopfield neural network does not model what happens to x inside the basic block
Local Interference
169
BB#1:…………… ……
R1
x1 ByReg
x2 ByStack
• The Hopfield neural network does not model what happens to x inside the basic block
Local Interference
170
BB#1:…………… ……
R1
• The Hopfield neural network does not model what happens to x inside the basic block• x split for R1 determined x’s ByReg
interval should enter and leave BB#1
Local Interference
171
BB#1:…………… ……
R1
x1 ByReg
x1 ByReg
• The Hopfield neural network does not model what happens to x inside the basic block• x split for R1 determined x’s ByReg
interval should enter and leave BB#1
• y in BB#1 is already assigned to R1
Local Interference
172
BB#1:…y = ……… = …y…… ……
R1
y
x1 ByReg
x1 ByReg
• The Hopfield neural network does not model what happens to x inside the basic block• x split for R1 determined x’s ByReg
interval should enter and leave BB#1
• y in BB#1 is already assigned to R1
• x is used in BB#1
Local Interference
173
BB#1:… = …x…y = …… = …x…… = …y……… = …x……
R1
y
x1 ByReg
x1 ByReg
• The Hopfield neural network does not model what happens to x inside the basic block• x split for R1 determined x’s ByReg
interval should enter and leave BB#1
• y in BB#1 is already assigned to R1
• x is used in BB#1
• y interferes with assigning x to R1 locally in BB#1
Local Interference
174
BB#1:… = …x…y = …… = …x…… = …y……… = …x……
R1
y
x1 ByReg
x1 ByReg
• The Hopfield neural network does not model what happens to x inside the basic block• x split for R1 determined x’s ByReg
interval should enter and leave BB#1
• y in BB#1 is already assigned to R1
• x is used in BB#1
• y interferes with assigning x to R1 locally in BB#1• The part of x that contains this local interference
will be added to x’s “ByStack” split artifact
Local Interference
175
BB#1:… = …x…y = …… = …x…… = …y……… = …x……
R1
y
x1 ByReg
x2 ByStack
x1 ByReg
• Local interferences may have very negative affectson assignment of the “ByStack” split artifact• Can cause bad eviction chains
• Encountered issues #1, #2
• Can cause a lot of reloads• Encountered issue #3
• This affect is not considered during split cost calculation
Local Interference
176
BB#1:… = …x…… = …y…… = …x…… = …y……… = …x……
R1
y
x1 ByReg
x2 ByStack
x1 ByReg
177
Encountered Issue• Bad eviction chain• Cyclic eviction/split chain – Issue #1
• Domino effect eviction – Issue #2
• Multiple reloads from the same location• Issue #3
178
Encountered Issue #1• Bad eviction chain – scenario 1• llvm/test/CodeGen/X86/
greedy_regalloc_bad_eviction_sequence.ll
179
Encountered Issue #1• Bad eviction chain – scenario 1• llvm/test/CodeGen/X86/
greedy_regalloc_bad_eviction_sequence.ll%ecx %ebx %edi %edx
%ebp %ecx %ebx %edi
%ebp %ecx %ebx %edi
%ecx %ebx %edi %edx
180
Encountered Issue #1• Bad eviction chain – scenario 1• Cyclic eviction/split chain
181
Encountered Issue #1• Bad eviction chain – scenario 1• Cyclic eviction/split chain
182
Encountered Issue #1• Bad eviction chain – scenario 1• Cyclic eviction/split chain
• y evicts x from edi
ediy
x
183
Encountered Issue #1• Bad eviction chain – scenario 1• Cyclic eviction/split chain
• y evicts x from edi
ediy
x
2 KG 1 KG
184
Encountered Issue #1• Bad eviction chain – scenario 1• Cyclic eviction/split chain
• y evicts x from edi
edixyEvicts
y x
185
Encountered Issue #1• Bad eviction chain – scenario 1• Cyclic eviction/split chain
• y evicts x from edi
edix
y
Evicts
y x
186
Encountered Issue #1• Bad eviction chain – scenario 1• Cyclic eviction/split chain
• y evicts x from edi• x is split into x1 and x2
edix
y
Evicts
y x
187
Encountered Issue #1• Bad eviction chain – scenario 1• Cyclic eviction/split chain
• y evicts x from edi• x is split into x1 and x2
edix
yy x
Evicts
188
Encountered Issue #1• Bad eviction chain – scenario 1• Cyclic eviction/split chain
• y evicts x from edi• x is split into x1 and x2
edix2
y
x1
y x
x
part ofSplit
Evicts
189
Encountered Issue #1• Bad eviction chain – scenario 1• Cyclic eviction/split chain
• y evicts x from edi• x is split into x1 and x2
• x1 represent the part of the split that has local interference with y
edix1
y x
xInterfering part
of
ySplit
Evicts
190
Encountered Issue #1• Bad eviction chain – scenario 1• Cyclic eviction/split chain
• y evicts x from edi• x is split into x1 and x2
• x1 represent the part of the split that has local interference with y
edi
y
x1
y x
xInterfering part
ofSplit
Evicts
191
Encountered Issue #1• Bad eviction chain – scenario 1• Cyclic eviction/split chain
• y evicts x from edi• x is split into x1 and x2
• x1 represent the part of the split that has local interference with y
• x1 evicts y from edi edi
y
x1
3 KG 2 KG
y x
xInterfering part
ofSplit
Evicts
192
Encountered Issue #1• Bad eviction chain – scenario 1• Cyclic eviction/split chain
• y evicts x from edi• x is split into x1 and x2
• x1 represent the part of the split that has local interference with y
• x1 evicts y from edi ediyx1
y x
xInterfering part
ofEvicts Split
Evicts
193
Encountered Issue #1• Bad eviction chain – scenario 1• Cyclic eviction/split chain
• y evicts x from edi• x is split into x1 and x2
• x1 represent the part of the split that has local interference with y
• x1 evicts y from edi ediy
x1Evicts Split
Evicts
y x
xInterfering part
of
194
Encountered Issue #1• Bad eviction chain – scenario 1• Cyclic eviction/split chain
• Every such “movl” duo was created by cyclic eviction/split chain
Evicts Split
Evicts
y x
xInterfering part
of
wInterfering part
of
195
Encountered Issue #1• Bad eviction chain – scenario 1• Cyclic eviction/split chain
• Every such “movl” duo was created by cyclic eviction/split chain
Evicts Split
Evicts
z w
bInterfering part
of
196
Encountered Issue #1• Bad eviction chain – scenario 1• Cyclic eviction/split chain
• Every such “movl” duo was created by cyclic eviction/split chain
Evicts Split
Evicts
a b
197
Encountered Issue #1• Bad eviction chain – scenario 1• The problem
• x is split in such a way that creates local interference split artifact
• That artifact causes cyclic eviction
• The solution• Tailored for this case
• Identify if a split will create a local interference artifact• Identify if that split artifact will cause a cyclic eviction• Increase split cost
• Make this split less attractive compared to other splits
• Commit: https://reviews.llvm.org/rL31629520$
Evicts Split
Evicts
y x
xInterfering part
of
198
Encountered Issue #2• Bad eviction chain – scenario 2• https://bugs.llvm.org/show_bug.cgi?id=26810
199
Encountered Issue #2• Bad eviction chain – scenario 2• https://bugs.llvm.org/show_bug.cgi?id=26810
• This time parts of the chain are spread around
%xmm5 %xmm4 %xmm3 %xmm2
%xmm7 %xmm5 %xmm4 %xmm3
%xmm7 %xmm5 %xmm4 %xmm3
%xmm5 %xmm4 %xmm3 %xmm2
200
Encountered Issue #2• Bad eviction chain – scenario 2• Domino effect eviction
201
Encountered Issue #2• Bad eviction chain – scenario 2• Domino effect eviction
• x evicts y from xmm2
xmm2x
y
202
Encountered Issue #2• Bad eviction chain – scenario 2• Domino effect eviction
• x evicts y from xmm2
x
y
4 KG 1 KG
xmm2
203
Encountered Issue #2• Bad eviction chain – scenario 2• Domino effect eviction
• x evicts y from xmm2
xmm2x yEvicts y
x
204
Encountered Issue #2• Bad eviction chain – scenario 2• Domino effect eviction
• x evicts y from xmm2
xmm2y
x
Evicts yx
205
Encountered Issue #2• Bad eviction chain – scenario 2• Domino effect eviction
• x evicts y from xmm2• y is split into y1 and y2 for xmm2
xmm2y
x
Evicts yx
206
Encountered Issue #2• Bad eviction chain – scenario 2• Domino effect eviction
• x evicts y from xmm2• y is split into y1 and y2 for xmm2
xmm2y
x
Evicts yx
207
Encountered Issue #2• Bad eviction chain – scenario 2• Domino effect eviction
• x evicts y from xmm2• y is split into y1 and y2 for xmm2
xmm2
x
y2y1Evicts y
Split part of y
x
208
Encountered Issue #2• Bad eviction chain – scenario 2• Domino effect eviction
• x evicts y from xmm2• y is split into y1 and y2 for xmm2
• y1 represent the part of the split that has local interference with x
xmm2y1
x
Evicts ySplit Interfering part
of y
x
209
Encountered Issue #2• Bad eviction chain – scenario 2• Domino effect eviction
• x evicts y from xmm2• y is split into y1 and y2 for xmm2
• y1 represent the part of the split that has local interference with x
xmm2
x
y1Evicts y
Split Interfering part of y
x
210
Encountered Issue #2• Bad eviction chain – scenario 2• Domino effect eviction
• x evicts y from xmm2• y is split into y1 and y2 for xmm2
• y1 represent the part of the split that has local interference with x
• y1 cannot evict x from xmm2 xmm2
x
y1
4 KG3 KG
Evicts ySplit Interfering part
of y
x
211
Encountered Issue #2• Bad eviction chain – scenario 2• Domino effect eviction
• x evicts y from xmm2• y is split into y1 and y2 for xmm2
• y1 represent the part of the split that has local interference with x
• y1 cannot evict x from xmm2 xmm2
x
y1Evicts y
Split Interfering part of y
x
212
Encountered Issue #2• Bad eviction chain – scenario 2• Domino effect eviction
• x evicts y from xmm2• y is split into y1 and y2 for xmm2
• y1 represent the part of the split that has local interference with x
• y1 cannot evict x from xmm2 y1Evicts y
Split Interfering part of y
x
213
Encountered Issue #2• Bad eviction chain – scenario 2• Domino effect eviction
• x evicts y from xmm2• y is split into y1 and y2 for xmm2
• y1 represent the part of the split that has local interference with x
• y1 cannot evict x from xmm2• y1 evicts z from xmm3
xmm3y1
z
Evicts ySplit Interfering part
of y
x
214
Encountered Issue #2• Bad eviction chain – scenario 2• Domino effect eviction
• x evicts y from xmm2• y is split into y1 and y2 for xmm2
• y1 represent the part of the split that has local interference with x
• y1 cannot evict x from xmm2• y1 evicts z from xmm3
xmm3
z
y1
1 KG3 KG
Evicts ySplit Interfering part
of y
x
215
Encountered Issue #2• Bad eviction chain – scenario 2• Domino effect eviction
• x evicts y from xmm2• y is split into y1 and y2 for xmm2
• y1 represent the part of the split that has local interference with x
• y1 cannot evict x from xmm2• y1 evicts z from xmm3
xmm3y1 zEvicts y
Split
Evicts zy
Interfering part of
x
216
Encountered Issue #2• Bad eviction chain – scenario 2• Domino effect eviction
• x evicts y from xmm2• y is split into y1 and y2 for xmm2
• y1 represent the part of the split that has local interference with x
• y1 cannot evict x from xmm2• y1 evicts z from xmm3
xmm3
y1
zEvicts y
Split
Evicts zy
Interfering part of
x
217
Encountered Issue #2• Bad eviction chain – scenario 2• Domino effect eviction
• y1 evicts z from xmm3
xmm3
y1
zEvicts y
Split
Evicts zy
Interfering part of
x
218
Encountered Issue #2• Bad eviction chain – scenario 2• Domino effect eviction
• y1 evicts z from xmm3• z is split into z1 and z2 for xmm3
xmm3z
y1
Evicts ySplit
Evicts zy
Interfering part of
x
219
Encountered Issue #2• Bad eviction chain – scenario 2• Domino effect eviction
• y1 evicts z from xmm3• z is split into z1 and z2 for xmm3
xmm3z
y1
Evicts ySplit
Evicts zy
Interfering part of
x
220
Encountered Issue #2• Bad eviction chain – scenario 2• Domino effect eviction
• y1 evicts z from xmm3• z is split into z1 and z2 for xmm3
xmm3z2z1
y1
zpart of
Evicts ySplit
Evicts zSplit
yInterfering part of
x
221
Encountered Issue #2• Bad eviction chain – scenario 2• Domino effect eviction
• y1 evicts z from xmm3• z is split into z1 and z2 for xmm3
• z1 represent the part of the split that has local interference with y1
xmm3z1
y1
zInterfering part of
Evicts ySplit
Evicts zSplit
yInterfering part of
x
222
Encountered Issue #2• Bad eviction chain – scenario 2• Domino effect eviction
• y1 evicts z from xmm3• z is split into z1 and z2 for xmm3
• z1 represent the part of the split that has local interference with y1
xmm3z1
y1
zInterfering part of
Evicts ySplit
Evicts zSplit
yInterfering part of
x
223
Encountered Issue #2• Bad eviction chain – scenario 2• Domino effect eviction
• y1 evicts z from xmm3• z is split into z1 and z2 for xmm3
• z1 represent the part of the split that has local interference with y1
• z1 cannot evict y1 from xmm3 xmm3z1
y1
2 KG 3 KGz
Interfering part of
Evicts ySplit
Evicts zSplit
yInterfering part of
x
224
Encountered Issue #2• Bad eviction chain – scenario 2• Domino effect eviction
• y1 evicts z from xmm3• z is split into z1 and z2 for xmm3
• z1 represent the part of the split that has local interference with y1
• z1 cannot evict y1 from xmm3 xmm3z1
y1
zInterfering part of
Evicts ySplit
Evicts zSplit
yInterfering part of
x
225
Encountered Issue #2• Bad eviction chain – scenario 2• Domino effect eviction
• y1 evicts z from xmm3• z is split into z1 and z2 for xmm3
• z1 represent the part of the split that has local interference with y1
• z1 cannot evict y1 from xmm3 z1
zInterfering part of
Evicts ySplit
Evicts zSplit
yInterfering part of
x
226
Encountered Issue #2• Bad eviction chain – scenario 2• Domino effect eviction
• y1 evicts z from xmm3• z is split into z1 and z2 for xmm3
• z1 represent the part of the split that has local interference with y1
• z1 cannot evict y1 from xmm3• z1 evicts w from xmm4
xmm4
w
z1
zInterfering part of
Evicts ySplit
Evicts zSplit
yInterfering part of
x
227
Encountered Issue #2• Bad eviction chain – scenario 2• Domino effect eviction
• y1 evicts z from xmm3• z is split into z1 and z2 for xmm3
• z1 represent the part of the split that has local interference with y1
• z1 cannot evict y1 from xmm3• z1 evicts w from xmm4
xmm4z1
2 KG 1 KG
w
zInterfering part of
Evicts ySplit
Evicts zSplit
yInterfering part of
x
228
Encountered Issue #2• Bad eviction chain – scenario 2• Domino effect eviction
• y1 evicts z from xmm3• z is split into z1 and z2 for xmm3
• z1 represent the part of the split that has local interference with y1
• z1 cannot evict y1 from xmm3• z1 evicts w from xmm4
xmm4z1 w
zInterfering part of
Evicts ySplit
Evicts zSplit
Evicts
yInterfering part of
x
w
229
Encountered Issue #2• Bad eviction chain – scenario 2• Domino effect eviction
• y1 evicts z from xmm3• z is split into z1 and z2 for xmm3
• z1 represent the part of the split that has local interference with y1
• z1 cannot evict y1 from xmm3• z1 evicts w from xmm4
xmm4
z1
w
zInterfering part of
Evicts ySplit
Evicts zSplit
Evicts
yInterfering part of
x
w
230
Encountered Issue #2• Bad eviction chain – scenario 2• Domino effect eviction
• Every such “movl” duo was created by the domino effect
zInterfering part of
Evicts ySplit
Evicts zSplit
Evicts
yInterfering part of
x
w
231
Encountered Issue #2• Bad eviction chain – scenario 2• Domino effect eviction
• Every such “movl” duo was created by the domino effect
zInterfering part of
Evicts ySplit
Evicts zSplit
Evicts
yInterfering part of
x
wSplit
Evicts a w
Interfering part of
232
Encountered Issue #2• Bad eviction chain – scenario 2• Domino effect eviction
• Every such “movl” duo was created by the domino effect
zInterfering part of
Evicts ySplit
Evicts zSplit
Evicts
yInterfering part of
x
wSplit
Evicts aSplit
Evicts b
wInterfering part of
aInterfering part of
233
Encountered Issue #2• Bad eviction chain – scenario 2• The problem
• y is split to fit the register it was evicted from• This split creates local interference split
artifact that causes domino effect eviction
• The solution• Tailored for this case
• Identify if a split for evicted register creates local interference artifact• Identify if that split artifact will cause domino effect eviction• Increase split cost
• Make this split less attractive compared to other splits
• Commit: https://reviews.llvm.org/rL31629520$
zInterfering part of
Evicts ySplit
Evicts zSplit
Evicts
yInterfering part of
x
w
234
Encountered Issue #3• Multiple reloads from the same location
235
Encountered Issue #3• Multiple reloads from the same location• All the reloads are from the same
location
236
Encountered Issue #3• Multiple reloads from the same location• All the reloads are from the same
location
• Appeared in a hot loop after a higher level change
237
Encountered Issue #3• Multiple reloads from the same location
Before Change After ChangeLoop MBB is the same until Greedy
Loop MBB is the same until Greedy
238
Encountered Issue #3• Multiple reloads from the same location
Before Change After ChangeLoop MBB is the same until Greedy
x is split for R0
Loop MBB is the same until Greedy
x is split for R1
239
Encountered Issue #3• Multiple reloads from the same location
Before Change After ChangeLoop MBB is the same until Greedy
x is split for R0
Split doesn’t have local interferences
Loop MBB is the same until Greedy
x is split for R1
Split has local interference in Loop’s MBB
240
Encountered Issue #3• Multiple reloads from the same location
Before Change After ChangeLoop MBB is the same until Greedy
x is split for R0
Split doesn’t have local interferences
Loop MBB is the same until Greedy
x is split for R1
Split has local interference in Loop’s MBB
Local interference spilled & reloaded around uses
241
Encountered Issue #3• Multiple reloads from the same location• The problem
• Local interference interval has a lot of uses
• This interval is spilled and reloaded
• Solution• Identify if the created local
interference interval will spill• Increase split cost
• Make this split less attractive compared to other splits
• Commit: https://reviews.llvm.org/rL32387020$
242
Greedy Register Allocator
• Greedy Register Allocator Overview
• Region Split
• Encountered Issues
• Performance Impact
243
• Fix affected mostly EEMBC workloads
• Regressions unrelated to this change
• No actual compile time impact on CTMark
Fix for Bad Eviction Chains - Issues #1, #2
Run
Tim
e C
hang
e (%
)
Affected workloads-2.67-2.64-2.52-2.30-2.18
2.462.783.373.403.433.473.473.503.503.65
12.0813.02
244
• Fix affected mostly EEMBC workloads
• No actual compile time impact on CTMark
Fix for Multiple Reloads - Issue #3
Run
Tim
e C
hang
e (%
)
Affected workloads
2.13.033.2644.394.424.945.16
11.2711.2812.0612.1312.212.412.4412.8613.08
245
Conclusions• Local interference caused by split may have a negative affect
• Current split cost does not take this affect into account
• Committed solutions tailored to catch 3 specific scenarios
• Need a more holistic approach for quantifying the cost of local interferences caused by split