CS293S SVN & DVN & GCSE
Yufei Ding
Review of Last Class
• Removing redundant expressions
  o DAG: version tracking
  o Linear representation: (local) value numbering
• Scope of optimization
  o Basic block, extended basic block, …
Renaming + Value Numbering
Example (continued)

Original code:
  a0 ← x0 + y0
  z0 ← y0
* b0 ← x0 + y0
  a1 ← 17
* c0 ← x0 + y0

With VNs (value numbers written after ^):
  a0^3 ← x0^1 + y0^2
  z0^2 ← y0^2
* b0^3 ← x0^1 + y0^2
  a1^4 ← 17
* c0^3 ← x0^1 + y0^2

Renaming:
• Give each value a unique name

Rewritten:
  a0 ← x0 + y0
  z0 ← y0
* b0 ← a0
  a1 ← 17
* c0 ← a0

Result:
• a0 is available
• Rewriting just works

Hash table for the rewritten code (successive states):
{<1,x0>, <2,y0>, <3,a0>}
{<1,x0>, <2,y0>, <3,a0>}
{<1,x0>, <2,y0>, <3,a0>, <4,17>}
{<1,x0>, <2,y0>, <3,a0>, <4,17>}
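The renaming example can be replayed with a small sketch of local value numbering. This is only an illustration under assumptions: the tuple encoding of operations and the names `local_value_numbering`, `table`, and `holder` are invented here, and the block is assumed to be already renamed so that no name is defined twice.

```python
def local_value_numbering(block):
    """Value-number one basic block whose names are already unique.

    Each operation is (dest, expr); expr is either a (lhs, op, rhs) tuple
    or a single operand (a name or a constant).
    """
    table = {}     # operand or (op, vn, vn) key -> value number
    holder = {}    # value number -> first name that held the computed value
    next_vn = 1
    rewritten = []

    def number_of(key):
        nonlocal next_vn
        if key not in table:
            table[key] = next_vn
            next_vn += 1
        return table[key]

    for dest, expr in block:
        if isinstance(expr, tuple):
            lhs, op, rhs = expr
            key = (op, number_of(lhs), number_of(rhs))
            if key in table:
                # Redundant: reuse the name that already holds this value.
                rewritten.append((dest, holder[table[key]]))
            else:
                holder[number_of(key)] = dest
                rewritten.append((dest, expr))
            table[dest] = table[key]
        else:
            # Copy or constant: dest shares the operand's value number.
            table[dest] = number_of(expr)
            rewritten.append((dest, expr))
    return rewritten, table


example = [
    ("a0", ("x0", "+", "y0")),
    ("z0", "y0"),
    ("b0", ("x0", "+", "y0")),
    ("a1", 17),
    ("c0", ("x0", "+", "y0")),
]
rewritten, table = local_value_numbering(example)
```

Because every name is defined once, the name recorded in `holder` is always still valid when a later operation reuses it, which is the reason the slide says rewriting "just works" after renaming.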
Missed opportunities (need stronger methods)

CFG blocks:
A: m ← a + b; n ← a + b
B: p ← c + d; r ← c + d
C: q ← a + b; r ← c + d
D: e ← b + 18; s ← a + b; u ← e + f
E: e ← a + 17; t ← c + d; u ← e + f
F: v ← a + b; w ← c + d; x ← e + f
G: y ← a + b; z ← c + d

Local Value Numbering
• 1 basic block at a time (1 entry point + 1 exit point)
• Strong local results
• No cross-block effects

Can we find a set of blocks that still guarantees the sequential execution order of a basic block?
Topics of This Class
• Scope of optimization
  o Basic block -> Local value numbering
  o Extended basic block (EBB) -> Superlocal value numbering
  o Dominator -> Dominator-based value numbering
• Global Common Subexpression Elimination (GCSE)
  o Closer to DAG-based methods
  o Works on lexical notation instead of expression values.
Extended basic block (EBB)
• An EBB is a set of blocks B1, B2, ..., Bn in which every Bi, 2 ≤ i ≤ n, has a unique predecessor, and that predecessor is in the EBB. (A block can be added to the EBB only if its sole predecessor is already included. B1 is the root of the EBB and is the only block that may have predecessors outside it.)

CFG blocks:
A: m ← a + b; n ← a + b
B: p ← c + d; r ← c + d
C: q ← a + b; r ← c + d
D: e ← b + 18; s ← a + b; u ← e + f
E: e ← a + 17; t ← c + d; u ← e + f
F: v ← a + b; w ← c + d; x ← e + f
G: y ← a + b; z ← c + d
Superlocal Value Numbering
CFG blocks:
A: m ← a + b; n ← a + b
B: p ← c + d; r ← c + d
C: q ← a + b; r ← c + d
D: e ← b + 18; s ← a + b; u ← e + f
E: e ← a + 17; t ← c + d; u ← e + f
F: v ← a + b; w ← c + d; x ← e + f
G: y ← a + b; z ← c + d

1. First find the maximal EBBs: {A,B,C,D,E}, {F}, {G}
2. Apply the local method to each path through each EBB
   • Do {A,B}, {A,C,D}, {A,C,E}, {F}, {G}
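The two steps can be sketched as a short routine that finds the EBB roots and enumerates the root-to-leaf paths. The edge list is an assumption: the extracted slide lost the arrows, so the edges below are inferred from the EBBs listed here and from the predecessors used in the AVAIL computation later in the deck. The function name `ebb_paths` is made up for this sketch.

```python
def ebb_paths(succs, entry):
    """Return every root-to-leaf path through every maximal EBB."""
    preds = {b: [] for b in succs}
    for b, targets in succs.items():
        for t in targets:
            preds[t].append(b)

    # A block roots an EBB if it is the entry or has several predecessors.
    roots = [b for b in succs if b == entry or len(preds[b]) != 1]
    paths = []

    def walk(block, path):
        path = path + [block]
        # Only successors with exactly one predecessor stay inside the EBB.
        inside = [s for s in succs[block]
                  if len(preds[s]) == 1 and s != entry]
        if not inside:
            paths.append(path)
        for s in inside:
            walk(s, path)

    for root in roots:
        walk(root, [])
    return paths


# Assumed edges of the example CFG (inferred, not printed on the slide).
succs = {
    "A": ["B", "C"], "B": ["G"], "C": ["D", "E"],
    "D": ["F"], "E": ["F"], "F": ["G"], "G": [],
}
paths = ebb_paths(succs, "A")
# paths == [['A', 'B'], ['A', 'C', 'D'], ['A', 'C', 'E'], ['F'], ['G']]
```

F and G each have two predecessors, so each starts its own single-block EBB, which is why superlocal value numbering gains nothing there.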
Implementation
• Reuse the value-numbering results of shared blocks for efficiency
• This requires undoing a block's effects
  o After {A,C,D}, it must recreate the state of {A,C} before processing E.
• Options:
  1. Record the state of the tables at each block boundary, and restore the state when needed
  2. Walk backward and undo the effects; this requires recording the "lost" information
  3. Scoped hash tables (lowest cost): keep the table produced by the current block as a separate scope
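Scoped Value Table

CFG blocks:
A: m ← a + b; n ← a + b
B: p ← c + d; r ← c + d
C: r ← c + d; q ← a + b
D: e ← b + 18; s ← a + b; u ← e + f
E: t ← c + d; u ← a + b
F: v ← a + b; w ← c + d; x ← e + f
G: y ← a + b; z ← c + d

Table for A: a->1, b->2, 1+2->3, m->3, n->3
Table for C (scoped inside A): c->4, d->5, 4+5->6, r->6, q->3
Table for E (scoped inside C): t->6, u->3
Table for B (scoped inside A): c->4, d->5, 4+5->6, p->6, r->6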
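The scoped tables on this slide can be reproduced with a sketch built on Python's `collections.ChainMap`, where `new_child()` opens a scope and dropping the child discards it. Everything named here (`number_block`, `svn`, `superlocal_value_numbering`, the tuple encoding) is hypothetical, the CFG edges are inferred rather than printed on the slide, and the sketch assumes no name is redefined along a path.

```python
from collections import ChainMap


def number_block(ops, table, next_vn):
    """Value-number one block in `table`; return (redundant names, next number)."""
    redundant = []
    for dest, lhs, op, rhs in ops:
        for operand in (lhs, rhs):
            if operand not in table:          # searches every enclosing scope
                table[operand] = next_vn      # writes only to the innermost one
                next_vn += 1
        key = (op, table[lhs], table[rhs])
        if key in table:
            redundant.append(dest)
        else:
            table[key] = next_vn
            next_vn += 1
        table[dest] = table[key]
    return redundant, next_vn


def svn(block, code, succs, preds, table, next_vn, redundant, tables, entry):
    scope = table.new_child()                 # open a scope for this block
    redundant[block], next_vn = number_block(code[block], scope, next_vn)
    tables[block] = dict(scope.maps[0])       # snapshot of this block's own entries
    for s in succs[block]:
        if len(preds[s]) == 1 and s != entry:  # s stays inside the same EBB
            svn(s, code, succs, preds, scope, next_vn, redundant, tables, entry)
    # Returning drops `scope`, which undoes this block's effects.


def superlocal_value_numbering(code, succs, entry):
    preds = {b: [] for b in succs}
    for b, targets in succs.items():
        for t in targets:
            preds[t].append(b)
    redundant, tables = {}, {}
    for b in succs:
        if b == entry or len(preds[b]) != 1:  # b is the root of an EBB
            svn(b, code, succs, preds, ChainMap(), 1, redundant, tables, entry)
    return redundant, tables


code = {
    "A": [("m", "a", "+", "b"), ("n", "a", "+", "b")],
    "B": [("p", "c", "+", "d"), ("r", "c", "+", "d")],
    "C": [("r", "c", "+", "d"), ("q", "a", "+", "b")],
    "D": [("e", "b", "+", 18), ("s", "a", "+", "b"), ("u", "e", "+", "f")],
    "E": [("t", "c", "+", "d"), ("u", "a", "+", "b")],
    "F": [("v", "a", "+", "b"), ("w", "c", "+", "d"), ("x", "e", "+", "f")],
    "G": [("y", "a", "+", "b"), ("z", "c", "+", "d")],
}
succs = {"A": ["B", "C"], "B": ["G"], "C": ["D", "E"],
         "D": ["F"], "E": ["F"], "F": ["G"], "G": []}   # assumed edges
redundant, tables = superlocal_value_numbering(code, succs, "A")
```

Passing `next_vn` by value means sibling scopes such as B and C restart from the same number, which matches the tables shown above (both assign 4, 5, 6).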
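Rewritten

Root block: a ← b + c
  Value table: b -> 1, c -> 2, 1 + 2 -> 3, a -> 3
  Rewritten table: 1 -> b, 2 -> c, 3 -> a

One successor: e ← b - c
  Value table: 1 - 2 -> 4, e -> 4
  Scoped rewritten table: 4 -> e

Other successor: d ← b - c; f ← b - c
  Value table: 1 - 2 -> 4, d -> 4, f -> 4
  Scoped rewritten table: 4 -> d
  Rewritten: d ← b - c; f ← d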
Rewritten
a ← b + c
a ← 17; e ← b + c
d ← b + c

Renaming is still needed. But does it work in all scenarios?
a1 ← b1 + c1
a2 ← 17; e1 ← b1 + c1
d1 ← b1 + c1

Extra Complexity
a1 ← b + c
a3 ← 17; a2 ← a1 + c
d ← a + c ?
SSA (Static Single Assignment) Name Space
Two principles
• Each name is defined by exactly one operation
• Each operand refers to exactly one definition
To reconcile these principles with real code
• Insert φ-functions at merge points to reconcile the name space

x ← ...        x ← ...
     ... ← x + ...
becomes
x0 ← ...       x1 ← ...
     x2 ← φ(x0, x1)
     ... ← x2 + ...
Another SSA Example
x ← ...        x ← ...
     ... ← x + ...
     x ← x + ...
becomes
x3 ← ...       x4 ← ...
     x5 ← φ(x3, x4)
     ... ← x5 + ...
     x1 ← φ(x0, x5)
     x2 ← x1 + ...
Detail: CT-2ndEd: Section 5.4.2; CT-1stEd: Section 5.5.
Superlocal Value Numbering
(This is in SSA form)

CFG blocks:
A: m0 ← a + b; n0 ← a + b
B: p0 ← c + d; r0 ← c + d
C: q0 ← a + b; r1 ← c + d
D: e0 ← b + 18; s0 ← a + b; u0 ← e + f
E: e1 ← a + 17; t0 ← c + d; u1 ← e + f
F: e3 ← φ(e0,e1); u2 ← φ(u0,u1); v0 ← a + b; w0 ← c + d; x0 ← e + f
G: r2 ← φ(r0,r1); y0 ← a + b; z0 ← c + d

1. Build SSA form
2. Find EBBs
3. Apply value numbering to each path in each EBB using scoped hash tables
With all the bells & whistles
• Find more redundancy
• Pay little additional cost
• Still does nothing for F & G
Dominator-Based Value Numbering
Regional (Dominator-based) Methods
• Dominators of b: all blocks that dominate b
  o If every path from the entry of the graph to b goes through a, then a is one of b's dominators.
  o The full set of dominators for b is denoted DOM(b).
• Strict dominators:
  o If a dominates b and a ≠ b, then a strictly dominates b.
• Immediate dominator:
  o The immediate dominator of b is the strict dominator of b that is closest to b. It is denoted IDOM(b).
Example

CFG blocks:
A: m ← a + b; n ← a + b
B: p ← c + d; r ← c + d
C: q ← a + b; r ← c + d
D: e ← b + 18; s ← a + b; u ← e + f
E: e ← a + 17; t ← c + d; u ← e + f
F: v ← a + b; w ← c + d; x ← e + f
G: y ← a + b; z ← c + d

BLOCK | A | B | C | D | E | F | G
DOM   |   |   |   |   |   |   |
IDOM  |   |   |   |   |   |   |
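One way to fill in the table is the simple iterative dominator computation sketched below. It is a sketch under assumptions: the CFG edges are inferred (the slide's arrows did not survive extraction), every block is assumed reachable from the entry, and the function names are invented for illustration.

```python
def dominators(succs, entry):
    """Iterate DOM(b) = {b} ∪ (intersection of DOM(p) over predecessors p)."""
    preds = {b: [] for b in succs}
    for b, targets in succs.items():
        for t in targets:
            preds[t].append(b)

    dom = {b: set(succs) for b in succs}   # start from "every block dominates b"
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for b in succs:
            if b == entry:
                continue
            incoming = [dom[p] for p in preds[b]]
            common = set.intersection(*incoming) if incoming else set()
            new = {b} | common
            if new != dom[b]:
                dom[b] = new
                changed = True
    return dom


def immediate_dominators(dom, entry):
    """Pick, for each block, its closest strict dominator."""
    idom = {entry: None}
    for b, ds in dom.items():
        if b != entry:
            strict = ds - {b}
            # Strict dominators form a chain; the closest one has the largest DOM set.
            idom[b] = max(strict, key=lambda d: len(dom[d]))
    return idom


succs = {"A": ["B", "C"], "B": ["G"], "C": ["D", "E"],
         "D": ["F"], "E": ["F"], "F": ["G"], "G": []}   # assumed edges
dom = dominators(succs, "A")
idom = immediate_dominators(dom, "A")
```

For this graph the sketch gives IDOM(F) = C and IDOM(G) = A, so those are the blocks whose tables F and G can inherit.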
Dominator-Based Value Numbering
• Basic strategy: use the table from IDOM(x) to start value numbering x
  o Use C for F and A for G
• Imposes a Dom-based application order

CFG blocks:
A: m0 ← a + b; n0 ← a + b
B: p0 ← c + d; r0 ← c + d
C: q0 ← a + b; r1 ← c + d
D: e0 ← b + 18; s0 ← a + b; u0 ← e + f
E: e1 ← a + 17; t0 ← c + d; u1 ← e + f
F: e3 ← φ(e0,e1); u2 ← φ(u0,u1); v0 ← a + b; w0 ← c + d; x0 ← e + f
G: r2 ← φ(r0,r1); y0 ← a + b; z0 ← c + d
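The strategy can be sketched as a preorder walk of the dominator tree in which each block opens a new scope on top of its immediate dominator's table. This is a simplified illustration, not the full algorithm: φ-functions are left out (their targets simply receive fresh value numbers on first use), the operands e are written with the SSA names e0, e1, e3 that the φ-functions imply, the dominator tree is entered by hand from "use C for F and A for G", and `dvn` and the tuple encoding are made-up names.

```python
from collections import ChainMap
from itertools import count


def dvn(block, code, children, table, fresh, redundant):
    """Value-number `block`, then its children in the dominator tree."""
    scope = table.new_child()              # starts from IDOM(block)'s table
    redundant[block] = []
    for dest, lhs, op, rhs in code[block]:
        for operand in (lhs, rhs):
            if operand not in scope:
                scope[operand] = next(fresh)
        key = (op, scope[lhs], scope[rhs])
        if key in scope:
            redundant[block].append(dest)  # value already computed by a dominator
        else:
            scope[key] = next(fresh)
        scope[dest] = scope[key]
    for child in children.get(block, ()):
        dvn(child, code, children, scope, fresh, redundant)
    return redundant


# SSA-form code from the slide, with φ-functions omitted (simplification).
code = {
    "A": [("m0", "a", "+", "b"), ("n0", "a", "+", "b")],
    "B": [("p0", "c", "+", "d"), ("r0", "c", "+", "d")],
    "C": [("q0", "a", "+", "b"), ("r1", "c", "+", "d")],
    "D": [("e0", "b", "+", 18), ("s0", "a", "+", "b"), ("u0", "e0", "+", "f")],
    "E": [("e1", "a", "+", 17), ("t0", "c", "+", "d"), ("u1", "e1", "+", "f")],
    "F": [("v0", "a", "+", "b"), ("w0", "c", "+", "d"), ("x0", "e3", "+", "f")],
    "G": [("y0", "a", "+", "b"), ("z0", "c", "+", "d")],
}
# Dominator tree: children[b] lists the blocks whose immediate dominator is b.
children = {"A": ["B", "C", "G"], "C": ["D", "E", "F"]}
redundant = dvn("A", code, children, ChainMap(), count(1), {})
```

Compared with the superlocal pass, F now inherits a + b from A and c + d from C, and G inherits a + b from A. The expressions e + f in F and c + d in G remain, since no dominator computes them.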
SSA Resolves Name Conflicts

a ← b + c
b ← 17         d ← b - c
e ← b + c

becomes

a ← b0 + c
b1 ← 17        d ← b0 - c
b2 ← φ(b0, b1)
e ← b2 + c
Summary
• Two methods with a scope beyond a basic block
  o Superlocal value numbering (SVN)
     - Value numbering across basic blocks
  o Dominator-based value numbering (DVN)
     - Uses dominance information to handle join points in the CFG
• They can be used together
  o First build SSA
  o Do SVN
  o Do DVN, reusing the value tables built in SVN

Building SSA form is the prerequisite for both!
Examples
e = c + d; f = c + d;
g = c + d;
x = a + b;
c = a - b;
Global Common Subexpression Elimination (GCSE)
• The first data-flow problem
• A global method
Some Expression Sets
For each block b:
• Let AVAIL(b) be the set of expressions available on entry to b.
• Let EXPRKILL(b) be the set of expressions killed in b, i.e., expressions with one or more operands redefined in b.
  o Must consider all expressions in the whole graph!
• Let DEEXPR(b) be the set of downward exposed expressions in b, i.e., expressions evaluated in b and not subsequently killed in b.
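Formula to Compute AVAIL
Now, AVAIL(b) can be defined as:

$$\mathrm{AVAIL}(b) = \bigcap_{x \in \mathrm{preds}(b)} \Big( \mathrm{DEEXPR}(x) \cup \big( \mathrm{AVAIL}(x) \cap \overline{\mathrm{EXPRKILL}(x)} \big) \Big)$$

The overline denotes set complement: an expression is available at the end of x if it is downward exposed in x, or if it was available on entry to x and is not killed in x.
preds(b) is the set of b's predecessors in the control-flow graph. (Again, a predecessor is an immediate parent, not any other ancestor.)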
Making Theory Concrete
Computing AVAIL for the example ("all" is the set of all expressions in the graph):
AVAIL(A) = ∅
AVAIL(B) = {a+b} ∪ (∅ ∩ all) = {a+b}
AVAIL(C) = {a+b}
AVAIL(D) = {a+b, c+d} ∪ ({a+b} ∩ all) = {a+b, c+d}
AVAIL(E) = {a+b, c+d}
AVAIL(F) = [{b+18, a+b, e+f} ∪ ({a+b, c+d} ∩ (all − {e+f}))]
         ∩ [{a+17, c+d, e+f} ∪ ({a+b, c+d} ∩ (all − {e+f}))]
         = {a+b, c+d, e+f}
AVAIL(G) = [{c+d} ∪ ({a+b} ∩ all)]
         ∩ [{a+b, c+d, e+f} ∪ ({a+b, c+d, e+f} ∩ all)]
         = {a+b, c+d}

CFG blocks:
A: m ← a + b; n ← a + b
B: p ← c + d; r ← c + d
C: q ← a + b; r ← c + d
D: e ← b + 18; s ← a + b; u ← e + f
E: e ← a + 17; t ← c + d; u ← e + f
F: v ← a + b; w ← c + d; x ← e + f
G: y ← a + b; z ← c + d
Computing Available Expressions

The Big Picture
1. Build a control-flow graph
2. Gather the initial data: DEEXPR(b) & EXPRKILL(b)
3. Propagate information around the graph, evaluating the equation

Works for loops through an iterative algorithm: finding the fixed point.
All data-flow problems are solved, essentially, this way.
Computing Available Expressions
First step is to compute DEEXPR & EXPRKILL

assume a block b with operations o1, o2, …, ok

VARKILL ← ∅
DEEXPR(b) ← ∅
for i = k to 1                        (backward through block; O(k) steps)
    assume oi is "x ← y + z"
    add x to VARKILL
    if (y ∉ VARKILL) and (z ∉ VARKILL) then
        add "y + z" to DEEXPR(b)

EXPRKILL(b) ← ∅
for each expression e                 (O(N) steps; N is # operations)
    for each variable v ∈ e
        if v ∈ VARKILL(b) then
            EXPRKILL(b) ← EXPRKILL(b) ∪ {e}

Many data-flow problems have initial information that costs less to compute.
Computing Available Expressions
The worklist iterative algorithm

Worklist ← { all blocks, bi }
while (Worklist ≠ ∅)
    remove a block b from Worklist
    recompute AVAIL(b) as
        $\mathrm{AVAIL}(b) = \bigcap_{x \in \mathrm{preds}(b)} \big( \mathrm{DEEXPR}(x) \cup ( \mathrm{AVAIL}(x) \cap \overline{\mathrm{EXPRKILL}(x)} ) \big)$   (overline = set complement)
    if ??? then
        Worklist ← ???
Computing Available Expressions
The worklist iterative algorithm

Worklist ← { all blocks, bi }
while (Worklist ≠ ∅)
    remove a block b from Worklist
    recompute AVAIL(b) as
        $\mathrm{AVAIL}(b) = \bigcap_{x \in \mathrm{preds}(b)} \big( \mathrm{DEEXPR}(x) \cup ( \mathrm{AVAIL}(x) \cap \overline{\mathrm{EXPRKILL}(x)} ) \big)$   (overline = set complement)
    if AVAIL(b) changed then
        Worklist ← Worklist ∪ successors(b)

• Finds fixed point solution to equation for AVAIL
• That solution is unique
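The local sets and the worklist loop can be put together in a short sketch and checked against the AVAIL sets worked out for the example. Several details are assumptions rather than part of the slides: the CFG edges are inferred, AVAIL is initialized to the empty set for the entry block and to "all expressions" elsewhere (the usual choice for an intersection problem), and the function names and tuple encoding are hypothetical.

```python
def local_sets(ops, all_exprs):
    """Backward pass over one block: return (DEEXPR, EXPRKILL)."""
    var_kill, de_expr = set(), set()
    for dest, lhs, op, rhs in reversed(ops):
        var_kill.add(dest)
        if lhs not in var_kill and rhs not in var_kill:
            de_expr.add((lhs, op, rhs))
    expr_kill = {e for e in all_exprs
                 if e[0] in var_kill or e[2] in var_kill}
    return de_expr, expr_kill


def available_expressions(code, succs, entry):
    preds = {b: [] for b in succs}
    for b, targets in succs.items():
        for t in targets:
            preds[t].append(b)

    all_exprs = {(l, op, r) for ops in code.values() for _, l, op, r in ops}
    de_expr, expr_kill = {}, {}
    for b, ops in code.items():
        de_expr[b], expr_kill[b] = local_sets(ops, all_exprs)

    # Assumed initialization: nothing is available on entry to the program.
    avail = {b: set(all_exprs) for b in code}
    avail[entry] = set()

    worklist = [b for b in code if b != entry]
    while worklist:
        b = worklist.pop()
        new = set(all_exprs)
        for x in preds[b]:
            # AVAIL(x) - EXPRKILL(x) is AVAIL(x) ∩ complement of EXPRKILL(x).
            new &= de_expr[x] | (avail[x] - expr_kill[x])
        if new != avail[b]:
            avail[b] = new
            worklist.extend(s for s in succs[b]
                            if s != entry and s not in worklist)
    return avail


code = {
    "A": [("m", "a", "+", "b"), ("n", "a", "+", "b")],
    "B": [("p", "c", "+", "d"), ("r", "c", "+", "d")],
    "C": [("q", "a", "+", "b"), ("r", "c", "+", "d")],
    "D": [("e", "b", "+", 18), ("s", "a", "+", "b"), ("u", "e", "+", "f")],
    "E": [("e", "a", "+", 17), ("t", "c", "+", "d"), ("u", "e", "+", "f")],
    "F": [("v", "a", "+", "b"), ("w", "c", "+", "d"), ("x", "e", "+", "f")],
    "G": [("y", "a", "+", "b"), ("z", "c", "+", "d")],
}
succs = {"A": ["B", "C"], "B": ["G"], "C": ["D", "E"],
         "D": ["F"], "E": ["F"], "F": ["G"], "G": []}   # assumed edges
avail = available_expressions(code, succs, "A")
```

With these inputs the sketch reproduces the hand computation: AVAIL(F) holds a+b, c+d, and e+f, and AVAIL(G) holds a+b and c+d.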
Data-flow Analysis
Data-flow analysis is a collection of techniques for compile-time reasoning about the run-time flow of values
• Almost always involves building a graph
  o Problems are trivial on a basic block
  o Global problems ⇒ control-flow graph (or derivative)
  o Whole-program problems ⇒ call graph (or derivative)
• Usually formulated as a set of simultaneous equations
Replacement step in GCSE
• Limit to textually identical expressions (like DAG, unlike value numbering)

Example 1:
  Predecessor: a <- b + c; d <- b
  B: e <- d + c
  AVAIL(B) = {b+c}
  Cannot find or remove the redundancy!

Example 2:
  Predecessors (B1, B2): a <- b + c | f <- b + c
  B: e <- b + c
  AVAIL(B) = {b+c}
  Should replace b+c with ?
GCSE (replacement step)
• Compute a static mapping from expression to name
  o After analysis & before transformation
  o ∀ block b, ∀ expression e ∈ AVAIL(b), assign e a global name by hashing on e
• During transformation step
  o Evaluation of e ⇒ insert copy name(e) ← e
     (e is not available and needs to be evaluated)
  o Reference to e ⇒ replace e with name(e)
     (e is available and should be replaced)
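Example

Original code:
B1: m = a + b;
B2: n = c + d; c = 17; q = c + d;
B3: p = c + d;
B4: r = c + d;

Name table:
name | expression
t1   | a+b
t2   | c+d

Transformed code:
B1: t1 = a + b; m = t1;
B2: t2 = c + d; n = t2; c = 17; t2 = c + d; q = t2;
B3: t2 = c + d; p = t2;
B4: r = t2;

AVAIL(B4) = {c+d, a+b}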
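The transformation in this example can be sketched as a single forward pass over each block that starts from AVAIL(b) and updates availability as operands are redefined. The function `rewrite_block` and the tuple encoding are invented for illustration, and AVAIL(B2) = {a+b} is an assumption based on B1 being B2's only predecessor; only AVAIL(B4) is given on the slide.

```python
def rewrite_block(ops, avail_in, names):
    """Rewrite one block for GCSE.

    ops:      list of (dest, expr); expr is (lhs, op, rhs) or a plain operand
    avail_in: set of expressions available on entry to the block
    names:    global mapping from expression to its temporary name
    """
    avail = set(avail_in)
    out = []
    for dest, expr in ops:
        if isinstance(expr, tuple) and expr in names:
            temp = names[expr]
            if expr not in avail:
                out.append((temp, expr))   # evaluation: insert name(e) <- e
                avail.add(expr)
            out.append((dest, temp))       # reference: use name(e) instead of e
        else:
            out.append((dest, expr))
        # Redefining dest kills every expression that uses it as an operand.
        avail = {e for e in avail if dest not in (e[0], e[2])}
    return out


ab, cd = ("a", "+", "b"), ("c", "+", "d")
names = {ab: "t1", cd: "t2"}

b1 = rewrite_block([("m", ab)], set(), names)
b2 = rewrite_block([("n", cd), ("c", 17), ("q", cd)], {ab}, names)  # AVAIL(B2) assumed
b4 = rewrite_block([("r", cd)], {cd, ab}, names)
```

In B2 the assignment to c kills c + d, so the second evaluation is reinserted as `t2 = c + d`, while B4 only references `t2` because c + d is available on every path into it.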
GCSE (replacement step)
• The major problem with this approach
  o Inserts extraneous copies
  o At all definitions and uses of any e ∈ AVAIL(b), ∀ b
• Not a big issue
  o Those extra copies are dead and easy to remove
  o The useful ones often coalesce away
Comparison
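CFG blocks (each label marks the method that finds that redundancy):
A: m ← a + b; n ← a + b [LVN]
B: p ← c + d; r ← c + d [LVN]
C: q ← a + b [SVN]; r ← c + d
D: e ← b + 18; s ← a + b [SVN]; u ← e + f
E: e ← a + 17; t ← c + d [SVN]; u ← e + f
F: v ← a + b [DVN]; w ← c + d [DVN]; x ← e + f [GCSE]
G: y ← a + b [DVN]; z ← c + d [GCSE]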
The VN methods are ordered
• LVN ≤ SVN ≤ DVN
• GCSE is different
  o Based on names, not values
  o But for this particular example: DVN ≤ GCSE
  o Not always!
Redundancy Elimination Wrap-up
Conclusions
• Redundancy elimination has some depth & subtlety
• Variations on names, algorithms & analysis
DVN is probably the method of choice
• Results quite close to the global methods (± 1%)
• Cost is low