Foundations of Data-Flow Analysis

Page 1: Foundations of Data-Flow Analysis

Foundations of Data-Flow Analysis

Page 2: Foundations of Data-Flow Analysis

Basic Questions

Under what circumstances is the iterative algorithm used in the data-flow analysis correct?

How precise is the solution obtained by the iterative algorithm?

Will the iterative algorithm converge?

What is the meaning of the solution to the equations?

Page 3: Foundations of Data-Flow Analysis

Data-Flow Analysis Framework

A direction of the data flow D, which is either forwards or backwards

A semilattice, which includes a domain of values V and a meet operator $\wedge$

A family F of transfer functions from V to V. This family must include functions suitable for the boundary conditions, which are constant transfer functions for the special nodes ENTRY and EXIT in any control flow graph

Page 4: Foundations of Data-Flow Analysis

Example: Reaching Definitions

The direction: forwards

The domain of values: the set of subsets of the set of all definitions in the program

The meet operator: set union

The family of transfer functions: the set of transfer functions for various statements

Page 5: Foundations of Data-Flow Analysis

Semilattices

A semilattice is a set V and a binary meet operator $\wedge$ such that for all x, y, and z in V:

$x \wedge x = x$ (meet is idempotent)

$x \wedge y = y \wedge x$ (meet is commutative)

$x \wedge (y \wedge z) = (x \wedge y) \wedge z$ (meet is associative)

A semilattice has a top element, denoted $\top$, such that for all x in V, $\top \wedge x = x$

Optionally, a semilattice may have a bottom element, denoted $\bot$, such that for all x in V, $\bot \wedge x = \bot$

Page 6: Foundations of Data-Flow Analysis

Example: Reaching Definitions

The domain of values is the set of all subsets of the universal set U, or the power set of U, denoted $2^U$

The meet operator is the set union

The set union is idempotent, commutative, and associative

The top element is the empty set

The bottom element is the universal set U
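
The semilattice laws for this example can be checked by brute force on a small universe. The following is a minimal sketch, assuming a hypothetical universe of three definitions; the names `powerset` and `meet` are illustrative and do not come from the slides.

```python
from itertools import combinations

# Hypothetical universe of definitions (an assumption for illustration).
U = frozenset({"d1", "d2", "d3"})

def powerset(s):
    """Return every subset of s as a frozenset."""
    items = sorted(s)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]

def meet(x, y):
    """Meet operator for reaching definitions: set union."""
    return x | y

V = powerset(U)
TOP = frozenset()  # the empty set
BOTTOM = U         # the universal set

# Check each law over every combination of elements of V.
idempotent = all(meet(x, x) == x for x in V)
commutative = all(meet(x, y) == meet(y, x) for x in V for y in V)
associative = all(meet(x, meet(y, z)) == meet(meet(x, y), z)
                  for x in V for y in V for z in V)
top_ok = all(meet(TOP, x) == x for x in V)
bottom_ok = all(meet(BOTTOM, x) == BOTTOM for x in V)
```

An exhaustive check like this is only feasible because the universe is tiny; for real programs the laws follow from the properties of set union.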

Page 7: Foundations of Data-Flow Analysis

Partial Orders

A relation $\le$ is a partial order on a set V if for all x, y, and z in V:

$x \le x$ (the partial order is reflexive)

If $x \le y$ and $y \le x$, then $x = y$ (the partial order is antisymmetric)

If $x \le y$ and $y \le z$, then $x \le z$ (the partial order is transitive)

The pair $(V, \le)$ is called a poset, or partially ordered set

We define $x < y$ if and only if $x \le y$ and $x \ne y$

Page 8: Foundations of Data-Flow Analysis

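The Partial Order for a Semilattice

It is useful to define a partial order $\le$ for a semilattice $(V, \wedge)$. For all x and y in V, we define $x \le y$ if and only if $x \wedge y = x$

$\le$ is reflexive: $x \wedge x = x \Rightarrow x \le x$

$\le$ is antisymmetric: $x \le y \Rightarrow x \wedge y = x$, $y \le x \Rightarrow y \wedge x = y$, so $x = (x \wedge y) = (y \wedge x) = y$

$\le$ is transitive: $x \le y \Rightarrow x \wedge y = x$, $y \le z \Rightarrow y \wedge z = y$, so $(x \wedge z) = ((x \wedge y) \wedge z) = (x \wedge (y \wedge z)) = (x \wedge y) = x \Rightarrow x \le z$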

Page 9: Foundations of Data-Flow Analysis

Example: Reaching Definitions

The relation $\le$ is the set inclusion $\supseteq$

$x \le y \Leftrightarrow x \cup y = x \Leftrightarrow x \supseteq y$

This says that sets larger in size are smaller in the partial order

The set inclusion is reflexive, antisymmetric, and transitive
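
The way the order falls out of the meet can be sketched in a few lines. This is an illustrative sketch only: `leq` is a hypothetical helper name, and the three-definition universe is an assumption.

```python
from itertools import combinations

def meet(x, y):
    """Meet for reaching definitions: set union."""
    return x | y

def leq(x, y):
    """x <= y in the semilattice order iff meet(x, y) == x."""
    return meet(x, y) == x

# Hypothetical definitions used only for this example.
a = frozenset({"d1", "d2"})
b = frozenset({"d1"})

# All subsets of a small assumed universe.
U = ("d1", "d2", "d3")
V = [frozenset(c) for r in range(len(U) + 1) for c in combinations(U, r)]

# With union as the meet, the derived order coincides with "is a superset of".
matches_superset = all(leq(x, y) == (x >= y) for x in V for y in V)
```

Here `leq(a, b)` holds because the larger set `a` sits lower in the order than `b`.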

Page 10: Foundations of Data-Flow Analysis

Greatest Lower Bounds

A greatest lower bound (or glb) of domain elements x and y is an element g such that

$g \le x$,

$g \le y$, and

If z is any element such that $z \le x$ and $z \le y$, then $z \le g$

Page 11: Foundations of Data-Flow Analysis

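Meet and Greatest Lower Bound

The meet of x and y is the greatest lower bound of x and y

Let $g = x \wedge y$

$g \le x$: $g \wedge x = (x \wedge y) \wedge x = x \wedge (y \wedge x) = x \wedge (x \wedge y) = (x \wedge x) \wedge y = x \wedge y = g$

$g \le y$: by the symmetric argument

$z \le x$ and $z \le y$ $\Rightarrow$ $z \le g$: $z \wedge g = z \wedge (x \wedge y) = (z \wedge x) \wedge y = z \wedge y = z$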
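
The glb property proved above can also be confirmed exhaustively on a small example. The sketch below assumes the reaching-definitions semilattice over a hypothetical three-definition universe; `is_glb` is an illustrative helper, not something defined in the slides.

```python
from itertools import combinations

# Assumed universe and its power set.
U = ("d1", "d2", "d3")
V = [frozenset(c) for r in range(len(U) + 1) for c in combinations(U, r)]

def meet(x, y):
    return x | y

def leq(x, y):
    return meet(x, y) == x

def is_glb(g, x, y):
    """True if g is a lower bound of x and y and every lower bound z has z <= g."""
    lower = leq(g, x) and leq(g, y)
    greatest = all(leq(z, g) for z in V if leq(z, x) and leq(z, y))
    return lower and greatest

# The meet of every pair should be that pair's glb.
meet_is_glb = all(is_glb(meet(x, y), x, y) for x in V for y in V)

# A lower bound that is not the greatest one: the whole universe.
not_greatest = is_glb(frozenset(U), frozenset({"d1"}), frozenset({"d2"}))
```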

Page 12: Foundations of Data-Flow Analysis

Lattice Diagrams

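[Figure: lattice diagram for the subsets of {d1, d2, d3} under the order $\supseteq$, drawn from the top element down]

{}

{d1} {d2} {d3}

{d1, d2} {d1, d3} {d2, d3}

{d1, d2, d3}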

Page 13: Foundations of Data-Flow Analysis

Product Lattices

The product lattice for lattices $(A, \wedge_A)$ and $(B, \wedge_B)$ is defined as follows:

The domain of the product lattice is $A \times B$

The meet for the product lattice: $(a, b) \wedge (a', b') = (a \wedge_A a', b \wedge_B b')$

The partial order for the product lattice: $(a, b) \le (a', b')$ iff $a \le_A a'$ and $b \le_B b'$

This definition can be extended to the product of any number of lattices
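
A componentwise meet is straightforward to build from the meets of the factors. The following is a sketch under the assumption that both factors are power-set semilattices with union as the meet; `product_meet` is a hypothetical helper name.

```python
def product_meet(meet_a, meet_b):
    """Build the componentwise meet for the product of two semilattices."""
    def meet(p, q):
        (a, b), (a2, b2) = p, q
        return (meet_a(a, a2), meet_b(b, b2))
    return meet

def union(x, y):
    return x | y

# Product of two union semilattices (an assumption for illustration).
meet = product_meet(union, union)

def leq(p, q):
    """Order on the product, derived from the product meet."""
    return meet(p, q) == p

p = (frozenset({"d1"}), frozenset())
q = (frozenset(), frozenset({"d2"}))
m = meet(p, q)  # componentwise union of p and q
```

Because the order is derived from the product meet, `leq` agrees with comparing each component in its own lattice.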

Page 14: Foundations of Data-Flow Analysis

Example

[Figure: lattice diagram of the product lattice, drawn from the top element down]

({},{},{})

({d1},{},{}) ({},{d2},{}) ({},{},{d3})

({d1},{d2},{}) ({d1},{},{d3}) ({},{d2},{d3})

({d1},{d2},{d3})

Page 15: Foundations of Data-Flow Analysis

Height of a Semilattice

An ascending chain in a poset $(V, \le)$ is a sequence

$x_1 < x_2 < \dots < x_n$

The height of a semilattice is the largest number of < relations in any ascending chain

An iterative data-flow analysis algorithm is convergent if the framework is monotone and the corresponding semilattice has finite height

A lattice consisting of a finite set of values will have a finite height

It is also possible for a lattice with an infinite number of values to have a finite height
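
For a finite semilattice the height can be computed directly as the longest ascending chain. The sketch below does this for the reaching-definitions semilattice over a hypothetical universe of three definitions; `longest_chain_from` is an illustrative name.

```python
from functools import lru_cache
from itertools import combinations

# Assumed universe and its power set.
U = ("d1", "d2", "d3")
V = [frozenset(c) for r in range(len(U) + 1) for c in combinations(U, r)]

def lt(x, y):
    """Strict order for the union semilattice: x < y iff x is a proper superset of y."""
    return x > y

@lru_cache(maxsize=None)
def longest_chain_from(x):
    """Largest number of < relations in an ascending chain that starts at x."""
    return max((1 + longest_chain_from(y) for y in V if lt(x, y)), default=0)

# The height is the longest chain over all starting points.
height = max(longest_chain_from(x) for x in V)
```

For a power set ordered this way, the longest chain removes one definition at a time, so the height equals the number of definitions.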

Page 16: Foundations of Data-Flow Analysis

Transfer Functions

The family of transfer functions $F: V \to V$ in a data-flow framework has the following properties:

F has an identity function I, such that I(x) = x for all x in V

F is closed under composition; that is, for any two functions f and g in F, the function h defined by h(x) = g(f(x)) is in F

Page 17: Foundations of Data-Flow Analysis

Example: Reaching Definitions

The identity function: $gen[B] = kill[B] = \emptyset$

Closure under composition:

$f_1(x) = G_1 \cup (x - K_1)$, $f_2(x) = G_2 \cup (x - K_2)$,

$f_2(f_1(x)) = G_2 \cup ((G_1 \cup (x - K_1)) - K_2) = (G_2 \cup (G_1 - K_2)) \cup (x - (K_1 \cup K_2))$.

Let $G = G_2 \cup (G_1 - K_2)$ and $K = K_1 \cup K_2$. Then $f(x) = f_2(f_1(x)) = G \cup (x - K)$.
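
The composition formula can be sanity-checked by comparing the combined gen/kill function against the two functions applied in sequence. This is a minimal sketch; the gen and kill sets are hypothetical values chosen only for illustration, and `gen_kill` and `compose` are illustrative helper names.

```python
from itertools import combinations

def gen_kill(gen, kill):
    """Transfer function f(x) = gen ∪ (x − kill)."""
    return lambda x: gen | (x - kill)

def compose(g1, k1, g2, k2):
    """Gen and kill sets of f2 ∘ f1: G = G2 ∪ (G1 − K2), K = K1 ∪ K2."""
    return g2 | (g1 - k2), k1 | k2

# Hypothetical gen/kill sets for two consecutive blocks.
G1, K1 = frozenset({"d1"}), frozenset({"d3"})
G2, K2 = frozenset({"d2"}), frozenset({"d1"})

f1, f2 = gen_kill(G1, K1), gen_kill(G2, K2)
G, K = compose(G1, K1, G2, K2)
f = gen_kill(G, K)

# Compare f against f2(f1(x)) on every subset of a small assumed universe.
U = ("d1", "d2", "d3", "d4")
subsets = [frozenset(c) for r in range(len(U) + 1) for c in combinations(U, r)]
agree = all(f(x) == f2(f1(x)) for x in subsets)
```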

Page 18: Foundations of Data-Flow Analysis

Monotone Frameworks

A framework $(D, F, V, \wedge)$ is monotone if $x \le y$ implies $f(x) \le f(y)$, for all x and y in V, and f in F

Equivalently, a framework $(D, F, V, \wedge)$ is monotone if $f(x \wedge y) \le f(x) \wedge f(y)$, for all x and y in V, and f in F

Page 19: Foundations of Data-Flow Analysis

Proof of Equivalence

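($\Rightarrow$) $x \wedge y \le x$ and $x \wedge y \le y$ $\Rightarrow$ $f(x \wedge y) \le f(x)$ and $f(x \wedge y) \le f(y)$. Since $f(x) \wedge f(y)$ is the glb of $f(x)$ and $f(y)$, it follows that $f(x \wedge y) \le f(x) \wedge f(y)$

($\Leftarrow$) $x \le y$ $\Rightarrow$ $x \wedge y = x$ $\Rightarrow$ $f(x) = f(x \wedge y) \le f(x) \wedge f(y) \le f(y)$ $\Rightarrow$ $f(x) \le f(y)$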
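
The two formulations of monotonicity can be compared by brute force on a small semilattice. The sketch below assumes the reaching-definitions semilattice over a hypothetical three-definition universe and tests one gen/kill function and one deliberately non-monotone function (set complement); all names are illustrative.

```python
from itertools import combinations

# Assumed universe and its power set.
U = frozenset({"d1", "d2", "d3"})
V = [frozenset(c)
     for r in range(len(U) + 1)
     for c in combinations(sorted(U), r)]

def meet(x, y):
    return x | y

def leq(x, y):
    return meet(x, y) == x

def monotone_by_order(f):
    """x <= y implies f(x) <= f(y)."""
    return all(leq(f(x), f(y)) for x in V for y in V if leq(x, y))

def monotone_by_meet(f):
    """f(x ∧ y) <= f(x) ∧ f(y)."""
    return all(leq(f(meet(x, y)), meet(f(x), f(y))) for x in V for y in V)

def gen_kill(x):
    """A gen/kill transfer function with hypothetical gen = {d1}, kill = {d2}."""
    return frozenset({"d1"}) | (x - frozenset({"d2"}))

def complement(x):
    """Not a data-flow transfer function; used only as a non-monotone example."""
    return U - x
```

Both tests accept `gen_kill` and both reject `complement`, which is consistent with the equivalence proved above.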

Page 20: Foundations of Data-Flow Analysis

Distributive Frameworks

A framework $(D, F, V, \wedge)$ is distributive if $f(x \wedge y) = f(x) \wedge f(y)$ for all x and y in V, and f in F

Distributivity implies monotonicity

Page 21: Foundations of Data-Flow Analysis

Example: Reaching Definitions

Let y and z be sets of definitions, and

$f(x) = G \cup (x - K)$

Then

$G \cup ((y \cup z) - K) = (G \cup (y - K)) \cup (G \cup (z - K))$
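
The identity above can be written out step by step. The derivation relies only on two facts: set difference distributes over union, and union is idempotent, commutative, and associative.

```latex
\begin{align*}
f(y \wedge z) &= G \cup ((y \cup z) - K) \\
              &= G \cup ((y - K) \cup (z - K)) \\
              &= (G \cup (y - K)) \cup (G \cup (z - K)) \\
              &= f(y) \wedge f(z)
\end{align*}
```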

Page 22: Foundations of Data-Flow Analysis

The Iterative Algorithm for General Frameworks: Input

A control flow graph, with specially labeled ENTRY and EXIT nodes,

A direction of the data flow D,

A set of values V,

A meet operator $\wedge$,

A set of functions F, where $f_B$ in F is the transfer function for basic block B, and

A constant value $v_{ENTRY}$ or $v_{EXIT}$ in V, representing the boundary condition for forward and backward frameworks, respectively

Page 23: Foundations of Data-Flow Analysis

The Iterative Algorithm for General Frameworks: Output

Values in V for IN[B] and OUT[B] for each basic block B in the control flow graph

Page 24: Foundations of Data-Flow Analysis

The Iterative Algorithm for General Frameworks: Forward

OUT[ENTRY] := v_ENTRY;
for (each basic block B other than ENTRY)
    OUT[B] := ⊤;
while (changes to any OUT occur)
    for (each basic block B other than ENTRY) {
        IN[B] := ∧_{p ∈ pred(B)} OUT[p];
        OUT[B] := f_B(IN[B]);
    }
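
The forward algorithm above translates almost line for line into runnable code. The following is a minimal sketch, not a definitive implementation: the function name `forward_analysis`, the three-block control flow graph with a loop, and the gen/kill sets are all hypothetical choices made for illustration, with reaching definitions as the framework.

```python
def forward_analysis(blocks, preds, transfer, meet, top, v_entry):
    """Iterative forward data-flow analysis.

    blocks excludes ENTRY; preds maps each block to its predecessors,
    which may include the special node "ENTRY".
    """
    out = {b: top for b in blocks}
    out["ENTRY"] = v_entry
    in_ = {}
    changed = True
    while changed:                      # repeat until no OUT changes
        changed = False
        for b in blocks:
            value = top                 # top is the identity for meet
            for p in preds[b]:
                value = meet(value, out[p])
            in_[b] = value
            new_out = transfer[b](value)
            if new_out != out[b]:
                out[b] = new_out
                changed = True
    return in_, out

def gen_kill(gen, kill):
    """Transfer function f(x) = gen ∪ (x − kill)."""
    return lambda x: frozenset(gen) | (x - frozenset(kill))

# Hypothetical CFG: ENTRY -> B1 -> B2 -> B3, with a back edge B3 -> B2.
blocks = ["B1", "B2", "B3"]
preds = {"B1": ["ENTRY"], "B2": ["B1", "B3"], "B3": ["B2"]}
transfer = {
    "B1": gen_kill({"d1"}, {"d3"}),
    "B2": gen_kill({"d2"}, set()),
    "B3": gen_kill({"d3"}, {"d1"}),
}

IN, OUT = forward_analysis(
    blocks, preds, transfer,
    meet=lambda x, y: x | y,   # set union
    top=frozenset(),           # the empty set
    v_entry=frozenset(),
)
```

In this example the definition d3 generated in B3 flows around the back edge, so it appears in IN[B2] only after the second pass of the loop.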

Page 25: Foundations of Data-Flow Analysis

The Iterative Algorithm for General Frameworks: Backward

IN[EXIT] := v_EXIT;
for (each basic block B other than EXIT)
    IN[B] := ⊤;
while (changes to any IN occur)
    for (each basic block B other than EXIT) {
        OUT[B] := ∧_{s ∈ succ(B)} IN[s];
        IN[B] := f_B(OUT[B]);
    }
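
The backward algorithm mirrors the forward one with the roles of IN and OUT swapped. The sketch below is illustrative only. The slides do not name a backward analysis, so it assumes live-variable analysis as the example framework (union as the meet, the empty set as top, and transfer functions of the form use ∪ (x − def)); the function name, the two-block graph, and the variable names are hypothetical.

```python
def backward_analysis(blocks, succs, transfer, meet, top, v_exit):
    """Iterative backward data-flow analysis.

    blocks excludes EXIT; succs maps each block to its successors,
    which may include the special node "EXIT".
    """
    in_ = {b: top for b in blocks}
    in_["EXIT"] = v_exit
    out = {}
    changed = True
    while changed:                      # repeat until no IN changes
        changed = False
        for b in blocks:
            value = top
            for s in succs[b]:
                value = meet(value, in_[s])
            out[b] = value
            new_in = transfer[b](value)
            if new_in != in_[b]:
                in_[b] = new_in
                changed = True
    return in_, out

def use_def(use, defs):
    """Assumed live-variable transfer function f(x) = use ∪ (x − def)."""
    return lambda x: frozenset(use) | (x - frozenset(defs))

# Hypothetical CFG: B1 -> B2, B2 -> B1, B2 -> EXIT.
# B1 defines a; B2 uses a and defines b.
blocks = ["B1", "B2"]
succs = {"B1": ["B2"], "B2": ["B1", "EXIT"]}
transfer = {
    "B1": use_def(set(), {"a"}),
    "B2": use_def({"a"}, {"b"}),
}

IN, OUT = backward_analysis(
    blocks, succs, transfer,
    meet=lambda x, y: x | y,
    top=frozenset(),
    v_exit=frozenset(),
)
```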

Page 26: Foundations of Data-Flow Analysis

Properties of the Iterative Algorithm

If the algorithm converges, the result is a solution to the data-flow equations

If the framework is monotone, then the solution found is the maximum fixedpoint (MFP) of the data-flow equations. The maximum fixedpoint is a solution with the property that in any other solution, the values of IN[B] and OUT[B] are $\le$ the corresponding values of the MFP

If the framework is monotone and its semilattice has finite height, then the algorithm is guaranteed to converge

Page 27: Foundations of Data-Flow Analysis

The Ideal Solution

Consider any path

$P = \text{ENTRY} \to B_1 \to \dots \to B_{k-1} \to B_k$

The transfer function for P is

$f_P = f_{B_{k-1}} \circ f_{B_{k-2}} \circ \dots \circ f_{B_1}$

The ideal solution is

$IDEAL[B] = \bigwedge_{P \,\in\, \text{possible paths from ENTRY to } B} f_P(v_{ENTRY})$

Any answer that is greater than IDEAL is incorrect

Any value smaller than or equal to IDEAL is conservative, i.e., safe

Page 28: Foundations of Data-Flow Analysis

The Meet-Over-Paths Solution

Finding all possible paths is undecidable

The meet-over-paths solution is

$MOP[B] = \bigwedge_{P \,\in\, \text{paths from ENTRY to } B} f_P(v_{ENTRY})$

The paths considered in the MOP solution are a superset of all the paths that are possibly executed

$MOP[B] \le IDEAL[B]$
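
When the control flow graph has no loops, the set of graph paths is finite and the MOP definition can be evaluated literally. The sketch below does so under that assumption; it would not terminate on a graph with cycles. The function names, the diamond-shaped graph, and the gen/kill sets are hypothetical.

```python
from functools import reduce

def paths(succs, src, dst, prefix=()):
    """Yield every path from src to dst in a CFG assumed to be acyclic."""
    prefix = prefix + (src,)
    if src == dst:
        yield prefix
        return
    for s in succs.get(src, ()):
        yield from paths(succs, s, dst, prefix)

def mop(succs, transfer, meet, top, v_entry, block):
    """MOP[block]: meet over all graph paths P of f_P(v_entry), where f_P
    applies the transfer functions of the blocks strictly between
    ENTRY and block, in path order."""
    values = []
    for path in paths(succs, "ENTRY", block):
        value = v_entry
        for b in path[1:-1]:
            value = transfer[b](value)
        values.append(value)
    return reduce(meet, values, top)    # top is the identity for meet

def gen_kill(gen, kill):
    """Transfer function f(x) = gen ∪ (x − kill)."""
    return lambda x: frozenset(gen) | (x - frozenset(kill))

# Hypothetical acyclic CFG: ENTRY -> B1, B2; B1, B2 -> B3; B3 -> B4.
succs = {"ENTRY": ["B1", "B2"], "B1": ["B3"], "B2": ["B3"], "B3": ["B4"]}
transfer = {
    "B1": gen_kill({"d1"}, set()),
    "B2": gen_kill({"d2"}, set()),
    "B3": gen_kill({"d3"}, {"d1"}),
}

def union(x, y):
    return x | y

mop_b3 = mop(succs, transfer, union, frozenset(), frozenset(), "B3")
mop_b4 = mop(succs, transfer, union, frozenset(), frozenset(), "B4")
```

Note that this enumerates all paths in the graph, not only the paths that can actually execute, which is exactly the difference between MOP and IDEAL.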

Page 29: Foundations of Data-Flow Analysis

MFP Solution versus MOP Solution

The iterative algorithm visits basic blocks, not necessarily in the order of execution

At each confluence point, the algorithm applies the meet operator to the data-flow values obtained so far. Some of these values were introduced artificially in the initialization process and do not represent the result of any execution from the beginning of the program

Page 30: Foundations of Data-Flow Analysis

Early Meet over Paths

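[Figure: control flow graph with nodes ENTRY, B1, B2, B3, and B4; the two paths to B4 are ENTRY → B1 → B3 → B4 and ENTRY → B2 → B3 → B4]

$MOP[B_4] = ((f_{B_3} \circ f_{B_1}) \wedge (f_{B_3} \circ f_{B_2}))(v_{ENTRY})$

$IN[B_4] = f_{B_3}(f_{B_1}(v_{ENTRY}) \wedge f_{B_2}(v_{ENTRY}))$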
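
The relationship between the two expressions above can be made explicit with the monotonicity condition $f(x \wedge y) \le f(x) \wedge f(y)$. The symbols a and b below are shorthand introduced only for this derivation.

```latex
\begin{align*}
a &= f_{B_1}(v_{ENTRY}), \qquad b = f_{B_2}(v_{ENTRY}) \\
IN[B_4] &= f_{B_3}(a \wedge b) \\
        &\le f_{B_3}(a) \wedge f_{B_3}(b) && \text{(monotonicity of } f_{B_3}\text{)} \\
        &= MOP[B_4]
\end{align*}
```

If $f_{B_3}$ is distributive, the inequality in the third line becomes an equality, so applying the meet early loses no precision.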

Page 31: Foundations of Data-Flow Analysis

Comparison of Solutions

Using the iterative algorithm, we have

$IN[B] \le MOP[B]$

for monotone frameworks and

IN[B] = MOP[B]

for distributive frameworks

$MFP \le MOP \le IDEAL$