Control Flow Analysis
Mooly Sagiv
http://www.math.tau.ac.il/~sagiv/courses/pa.html
Tel Aviv University
640-6706
Sunday 18-21 Schreiber 8
Monday 10-12 Schreiber 317
Textbook Chapter 3 (Simplified + OO)
Goals
Understand the problem of Control Flow Analysis
– In Functional Languages
– In Object Oriented Languages
– Function Pointers
Learn Constraint Based Program Analysis Technique
– General
– Usage for Control Flow Analysis
– Algorithms
– Systems
Similarities between Problems & Techniques
Outline
A Motivating Example (OO)
The Control Flow Analysis Problem
A Formal Specification
Set Constraints
Solving Constraints
Adding Dataflow information
Adding Context Information
Back to the Motivating Example
Conclusions
A Motivating Example
class Vehicle extends Object {
    int position = 10;
    void move(x1 : int) { position = position + x1; }
}
class Car extends Vehicle {
    int passengers;
    void await(v : Vehicle) {
        if (v.position < position)
            then v.move(position - v.position);
            else self.move(10);
    }
}
class Truck extends Vehicle {
    void move(x2 : int) {
        if (x2 < 55) position = position + x2;
    }
}
void main {
    Car c; Truck t; Vehicle v1;
    new c; new t;
    v1 := c;
    c.passengers := 2;
    c.move(60);
    v1.move(70);
    c.await(t);
}
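To make the dispatch question concrete, here is a minimal Python transcription of the example above. It is only a sketch: the `TRACE` list, the name `await_` (`await` is reserved in Python), and a `main` that returns its objects are assumptions added for illustration and do not appear in the slides. The trace records which `move` body each call reaches at run time; a control flow analysis has to predict a superset of these targets without running the program.

```python
TRACE = []  # hypothetical helper: names of the method bodies actually executed


class Vehicle:
    def __init__(self):
        self.position = 10

    def move(self, x1):
        TRACE.append("Vehicle.move")
        self.position = self.position + x1


class Car(Vehicle):
    def __init__(self):
        super().__init__()
        self.passengers = 0

    def await_(self, v):  # `await` is a reserved word in Python
        if v.position < self.position:
            v.move(self.position - v.position)  # target depends on the class of v
        else:
            self.move(10)


class Truck(Vehicle):
    def move(self, x2):
        TRACE.append("Truck.move")
        if x2 < 55:
            self.position = self.position + x2


def main():
    TRACE.clear()
    c, t = Car(), Truck()
    v1 = c
    c.passengers = 2
    c.move(60)
    v1.move(70)
    c.await_(t)
    return c, t
```

In this run the call `v.move(...)` inside `await_` reaches `Truck.move`, while the two calls in `main` reach `Vehicle.move`; a static analysis that only knows `v : Vehicle` must consider both bodies.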
The Control Flow Analysis (CFA) Problem
Given a program in a functional programming language with higher order functions (functions can serve as parameters and return values)
Find out for each function invocation which functions may be applied
Obvious in C without function pointers
Difficult in C++, Java and ML
The Dynamic Dispatch Problem
An ML Example
let f = fn x => x 1 ;
g = fn y => y + 2 ;
h = fn z => z + 3;
in (f g) + (f h)
An ML Example
let f = fn x => /* {g, h} */ x 1 ;
g = fn y => y + 2 ;
h = fn z => z + 3;
in (f g) + (f h)
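The same program can be sketched in Python to show where the annotation {g, h} comes from. The set `applied_at_x1` is a hypothetical instrumentation added for this illustration; it logs which functions are applied at the call site `x 1` during one execution. A run-time log like this can only under-approximate the answer, whereas control flow analysis computes a safe over-approximation statically.

```python
applied_at_x1 = set()  # hypothetical log of functions applied at the call site `x 1`


def f(x):
    applied_at_x1.add(x.__name__)
    return x(1)


def g(y):
    return y + 2


def h(z):
    return z + 3


result = f(g) + f(h)
```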
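The Language FUN
Notations
– $e \in \mathbf{Exp}$ // expressions (or labeled terms)
– $t \in \mathbf{Term}$ // terms (or unlabeled terms)
– $f, x \in \mathbf{Var}$ // variables
– $c \in \mathbf{Const}$ // constants
– $\mathit{op} \in \mathbf{Op}$ // binary operators
– $l \in \mathbf{Lab}$ // labels
Abstract Syntax
– $e ::= t^l$
– $t ::= c \mid x$
  $\mid \mathtt{fn}\ x \Rightarrow e$ // function definition
  $\mid \mathtt{fun}\ f\ x \Rightarrow e$ // recursive function definition
  $\mid e_1\ e_2$ // function applications
  $\mid \mathtt{if}\ e_0\ \mathtt{then}\ e_1\ \mathtt{else}\ e_2$
  $\mid \mathtt{let}\ x = e_1\ \mathtt{in}\ e_2$
  $\mid e_1\ \mathit{op}\ e_2$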
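A Simple Example
$((\mathtt{fn}\ x \Rightarrow x^1)^2\ (\mathtt{fn}\ y \Rightarrow y^3)^4)^5$
An Example which Loops
$(\mathtt{let}\ g = (\mathtt{fun}\ f\ x \Rightarrow (f^1\ (\mathtt{fn}\ y \Rightarrow y^2)^3)^4)^5\ \mathtt{in}\ (g^6\ (\mathtt{fn}\ z \Rightarrow z^7)^8)^9)^{10}$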
The 0-CFA Problem
Compute for every program a pair $(C, \rho)$ where:
– $C$ is the abstract cache associating abstract values with labeled program points
– $\rho$ is the abstract environment associating abstract values with variables
Formally
– $v \in \mathbf{Val} = \mathcal{P}(\mathbf{Term})$ // Abstract values
– $\rho \in \mathbf{Env} = \mathbf{Var} \to \mathbf{Val}$ // Abstract environment
– $C \in \mathbf{Cache} = \mathbf{Lab} \to \mathbf{Val}$ // Abstract cache
– For a function application $(t_1^{l_1}\ t_2^{l_2})^l$, $C(l_1)$ determines the functions that can be applied
These maps are finite for a given program
No context is considered for parameters
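Possible Solutions for $((\mathtt{fn}\ x \Rightarrow x^1)^2\ (\mathtt{fn}\ y \Rightarrow y^3)^4)^5$
Label / variable    Candidate 1                               Candidate 2
1                   $\{\mathtt{fn}\ y \Rightarrow y^3\}$      $\{\mathtt{fn}\ y \Rightarrow y^3\}$
2                   $\{\mathtt{fn}\ x \Rightarrow x^1\}$      $\{\mathtt{fn}\ x \Rightarrow x^1\}$
3                   $\emptyset$                               $\emptyset$
4                   $\{\mathtt{fn}\ y \Rightarrow y^3\}$      $\{\mathtt{fn}\ y \Rightarrow y^3\}$
5                   $\{\mathtt{fn}\ y \Rightarrow y^3\}$      $\{\mathtt{fn}\ y \Rightarrow y^3\}$
x                   $\{\mathtt{fn}\ y \Rightarrow y^3\}$      $\emptyset$
y                   $\emptyset$                               $\emptyset$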
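$(\mathtt{let}\ g = (\mathtt{fun}\ f\ x \Rightarrow (f^1\ (\mathtt{fn}\ y \Rightarrow y^2)^3)^4)^5\ \mathtt{in}\ (g^6\ (\mathtt{fn}\ z \Rightarrow z^7)^8)^9)^{10}$
Shorthand
$s_f = \mathtt{fun}\ f\ x \Rightarrow (f^1\ (\mathtt{fn}\ y \Rightarrow y^2)^3)^4$
$\mathit{id}_y = \mathtt{fn}\ y \Rightarrow y^2$
$\mathit{id}_z = \mathtt{fn}\ z \Rightarrow z^7$
$C(1) = \{s_f\}$    $C(2) = \emptyset$    $C(3) = \{\mathit{id}_y\}$
$C(4) = \emptyset$    $C(5) = \{s_f\}$    $C(6) = \{s_f\}$
$C(7) = \emptyset$    $C(8) = \{\mathit{id}_z\}$    $C(9) = \emptyset$
$C(10) = \emptyset$    $\rho(x) = \{\mathit{id}_y, \mathit{id}_z\}$    $\rho(y) = \emptyset$
$\rho(z) = \emptyset$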
Relationship to Dataflow Analysis
Expressions are side effect free
– no entry/exit
A single environment
Represents information at different points via maps
A single value for all occurrences of a variable
Function applications act similarly to assignments
– “Definition” - Function abstraction is created
– “Use” - Function is applied
A Formal Specification of 0-CFA
A Boolean function defines when a solution is acceptable
$(C, \rho) \models e$ means that $(C, \rho)$ is acceptable for the expression $e$
Defined by structural induction on $e$
Every function is analyzed once
Every acceptable solution is sound (conservative)
Many acceptable solutions
Generate a set of constraints
Obtain the least acceptable solution by solving the constraints
Syntax Directed 0-CFA (Simple Expressions)
[const] $(C, \rho) \models c^l$ always
[var] $(C, \rho) \models x^l$ if $\rho(x) \subseteq C(l)$
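Syntax Directed 0-CFA (Function Abstraction)
[fn] $(C, \rho) \models (\mathtt{fn}\ x \Rightarrow e)^l$ if:
$(C, \rho) \models e$
$(\mathtt{fn}\ x \Rightarrow e) \in C(l)$
[fun] $(C, \rho) \models (\mathtt{fun}\ f\ x \Rightarrow e)^l$ if:
$(C, \rho) \models e$
$(\mathtt{fun}\ f\ x \Rightarrow e) \in C(l)$
$(\mathtt{fun}\ f\ x \Rightarrow e) \in \rho(f)$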
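Syntax Directed 0-CFA (Function Application)
[app] $(C, \rho) \models (t_1^{l_1}\ t_2^{l_2})^l$ if:
$(C, \rho) \models t_1^{l_1}$
$(C, \rho) \models t_2^{l_2}$
for all $(\mathtt{fn}\ x \Rightarrow t_0^{l_0}) \in C(l_1)$: $C(l_2) \subseteq \rho(x)$ and $C(l_0) \subseteq C(l)$
for all $(\mathtt{fun}\ f\ x \Rightarrow t_0^{l_0}) \in C(l_1)$: $C(l_2) \subseteq \rho(x)$ and $C(l_0) \subseteq C(l)$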
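Syntax Directed 0-CFA (Other Constructs)
[if] $(C, \rho) \models (\mathtt{if}\ t_0^{l_0}\ \mathtt{then}\ t_1^{l_1}\ \mathtt{else}\ t_2^{l_2})^l$ if:
$(C, \rho) \models t_0^{l_0}$
$(C, \rho) \models t_1^{l_1}$
$(C, \rho) \models t_2^{l_2}$
$C(l_1) \subseteq C(l)$
$C(l_2) \subseteq C(l)$
[let] $(C, \rho) \models (\mathtt{let}\ x = t_1^{l_1}\ \mathtt{in}\ t_2^{l_2})^l$ if:
$(C, \rho) \models t_1^{l_1}$
$(C, \rho) \models t_2^{l_2}$
$C(l_1) \subseteq \rho(x)$
$C(l_2) \subseteq C(l)$
[op] $(C, \rho) \models (t_1^{l_1}\ \mathit{op}\ t_2^{l_2})^l$ if:
$(C, \rho) \models t_1^{l_1}$
$(C, \rho) \models t_2^{l_2}$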
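Possible Solutions for $((\mathtt{fn}\ x \Rightarrow x^1)^2\ (\mathtt{fn}\ y \Rightarrow y^3)^4)^5$
Label / variable    Candidate 1                               Candidate 2
1                   $\{\mathtt{fn}\ y \Rightarrow y^3\}$      $\{\mathtt{fn}\ y \Rightarrow y^3\}$
2                   $\{\mathtt{fn}\ x \Rightarrow x^1\}$      $\{\mathtt{fn}\ x \Rightarrow x^1\}$
3                   $\emptyset$                               $\emptyset$
4                   $\{\mathtt{fn}\ y \Rightarrow y^3\}$      $\{\mathtt{fn}\ y \Rightarrow y^3\}$
5                   $\{\mathtt{fn}\ y \Rightarrow y^3\}$      $\{\mathtt{fn}\ y \Rightarrow y^3\}$
x                   $\{\mathtt{fn}\ y \Rightarrow y^3\}$      $\emptyset$
y                   $\emptyset$                               $\emptyset$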
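The acceptability rules can be checked mechanically against the two candidate columns. The sketch below is an illustration, not the textbook's formulation: it encodes labeled terms as nested tuples `(term, label)`, covers only the [const], [var], [fn] and [app] rules, and all helper names (`lab`, `var`, `fn`, `app`, `acceptable`) are assumptions introduced here.

```python
# Labeled expression: (term, label). Terms are tuples so they can be set members.
def lab(term, l): return (term, l)
def var(x): return ("var", x)
def fn(x, body): return ("fn", x, body)
def app(e1, e2): return ("app", e1, e2)


FN_X = fn("x", lab(var("x"), 1))
ID_Y = fn("y", lab(var("y"), 3))
PROGRAM = lab(app(lab(FN_X, 2), lab(ID_Y, 4)), 5)


def acceptable(C, rho, e):
    """Check (C, rho) |= e by structural induction on e."""
    term, l = e
    kind = term[0]
    if kind == "const":
        return True
    if kind == "var":
        return rho[term[1]] <= C[l]
    if kind == "fn":
        return acceptable(C, rho, term[2]) and term in C[l]
    if kind == "app":
        _, e1, e2 = term
        l1, l2 = e1[1], e2[1]
        if not (acceptable(C, rho, e1) and acceptable(C, rho, e2)):
            return False
        for t in C[l1]:  # every abstract value in this sketch is a fn term
            _, x, (_, l0) = t
            if not (C[l2] <= rho[x] and C[l0] <= C[l]):
                return False
        return True
    raise ValueError(kind)


C1 = {1: {ID_Y}, 2: {FN_X}, 3: set(), 4: {ID_Y}, 5: {ID_Y}}
RHO1 = {"x": {ID_Y}, "y": set()}
RHO2 = {"x": set(), "y": set()}  # second column: same cache, empty rho(x)
```

Under these rules the first column is acceptable, while the second is rejected because the [app] rule requires $C(4) \subseteq \rho(x)$.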
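Set Constraints
A set of rules of the form:
– $\mathit{lhs} \subseteq \mathit{rhs}$
– $\{t\} \subseteq \mathit{rhs}' \Rightarrow \mathit{lhs} \subseteq \mathit{rhs}$ (conditional constraint)
– $\mathit{lhs}$, $\mathit{rhs}$, $\mathit{rhs}'$ are
» terms
» $C(l)$
» $\rho(x)$
The least solution $(C, \rho)$ can be found iteratively
– start with empty sets
– add terms when needed
Efficient cubic graph based solution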
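Syntax Directed Constraint Generation (Part I)
$\mathcal{C}_*[\![c^l]\!] = \emptyset$
$\mathcal{C}_*[\![x^l]\!] = \{\rho(x) \subseteq C(l)\}$
$\mathcal{C}_*[\![(\mathtt{fn}\ x \Rightarrow e)^l]\!] = \mathcal{C}_*[\![e]\!] \cup \{\{\mathtt{fn}\ x \Rightarrow e\} \subseteq C(l)\}$
$\mathcal{C}_*[\![(\mathtt{fun}\ f\ x \Rightarrow e)^l]\!] = \mathcal{C}_*[\![e]\!] \cup \{\{\mathtt{fun}\ f\ x \Rightarrow e\} \subseteq C(l)\} \cup \{\{\mathtt{fun}\ f\ x \Rightarrow e\} \subseteq \rho(f)\}$
$\mathcal{C}_*[\![(t_1^{l_1}\ t_2^{l_2})^l]\!] = \mathcal{C}_*[\![t_1^{l_1}]\!] \cup \mathcal{C}_*[\![t_2^{l_2}]\!]$
$\cup\ \{\{t\} \subseteq C(l_1) \Rightarrow C(l_2) \subseteq \rho(x) \mid t = (\mathtt{fn}\ x \Rightarrow t_0^{l_0}) \in \mathbf{Term}_*\}$
$\cup\ \{\{t\} \subseteq C(l_1) \Rightarrow C(l_0) \subseteq C(l) \mid t = (\mathtt{fn}\ x \Rightarrow t_0^{l_0}) \in \mathbf{Term}_*\}$
$\cup\ \{\{t\} \subseteq C(l_1) \Rightarrow C(l_2) \subseteq \rho(x) \mid t = (\mathtt{fun}\ f\ x \Rightarrow t_0^{l_0}) \in \mathbf{Term}_*\}$
$\cup\ \{\{t\} \subseteq C(l_1) \Rightarrow C(l_0) \subseteq C(l) \mid t = (\mathtt{fun}\ f\ x \Rightarrow t_0^{l_0}) \in \mathbf{Term}_*\}$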
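Syntax Directed Constraint Generation (Part II)
$\mathcal{C}_*[\![(\mathtt{if}\ t_0^{l_0}\ \mathtt{then}\ t_1^{l_1}\ \mathtt{else}\ t_2^{l_2})^l]\!] = \mathcal{C}_*[\![t_0^{l_0}]\!] \cup \mathcal{C}_*[\![t_1^{l_1}]\!] \cup \mathcal{C}_*[\![t_2^{l_2}]\!] \cup \{C(l_1) \subseteq C(l)\} \cup \{C(l_2) \subseteq C(l)\}$
$\mathcal{C}_*[\![(\mathtt{let}\ x = t_1^{l_1}\ \mathtt{in}\ t_2^{l_2})^l]\!] = \mathcal{C}_*[\![t_1^{l_1}]\!] \cup \mathcal{C}_*[\![t_2^{l_2}]\!] \cup \{C(l_1) \subseteq \rho(x)\} \cup \{C(l_2) \subseteq C(l)\}$
$\mathcal{C}_*[\![(t_1^{l_1}\ \mathit{op}\ t_2^{l_2})^l]\!] = \mathcal{C}_*[\![t_1^{l_1}]\!] \cup \mathcal{C}_*[\![t_2^{l_2}]\!]$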
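Set Constraints for $((\mathtt{fn}\ x \Rightarrow x^1)^2\ (\mathtt{fn}\ y \Rightarrow y^3)^4)^5$
Iterative Solution to the Set Constraints for $((\mathtt{fn}\ x \Rightarrow x^1)^2\ (\mathtt{fn}\ y \Rightarrow y^3)^4)^5$
step    Constraint    1    2    3    4    x    y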
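The constraint generation and the iterative solution for this expression can be sketched in a few lines of Python. This is an illustration under stated assumptions: terms are nested tuples, a constraint is a triple `(guard, lhs, rhs)` with `guard` either `None` or a pair `(t, node)` standing for $\{t\} \subseteq \mathit{node}$, and the solver is a naive round-robin iteration rather than the efficient cubic graph based algorithm. All names (`gen`, `solve`, `fn_terms`, the node tags `"C"` and `"r"`) are hypothetical.

```python
from collections import defaultdict


def lab(term, l): return (term, l)
def var(x): return ("var", x)
def fn(x, body): return ("fn", x, body)
def app(e1, e2): return ("app", e1, e2)


FN_X = fn("x", lab(var("x"), 1))
ID_Y = fn("y", lab(var("y"), 3))
PROGRAM = lab(app(lab(FN_X, 2), lab(ID_Y, 4)), 5)


def fn_terms(e):
    """All fn terms occurring in the program (the set Term*)."""
    term = e[0]
    if term[0] == "fn":
        return [term] + fn_terms(term[2])
    if term[0] == "app":
        return fn_terms(term[1]) + fn_terms(term[2])
    return []


def gen(e, fns):
    """Constraints for e; nodes are ("C", label) and ("r", variable)."""
    term, l = e
    if term[0] == "var":
        return [(None, ("r", term[1]), ("C", l))]
    if term[0] == "fn":
        return gen(term[2], fns) + [(None, ("t", term), ("C", l))]
    _, e1, e2 = term  # application
    out = gen(e1, fns) + gen(e2, fns)
    for t in fns:
        guard = (t, ("C", e1[1]))
        out.append((guard, ("C", e2[1]), ("r", t[1])))   # argument flows to parameter
        out.append((guard, ("C", t[2][1]), ("C", l)))    # body result flows to the call
    return out


def solve(constraints):
    """Start with empty sets and add terms until nothing changes."""
    sol = defaultdict(set)
    changed = True
    while changed:
        changed = False
        for guard, lhs, rhs in constraints:
            if guard is not None and guard[0] not in sol[guard[1]]:
                continue
            new = {lhs[1]} if lhs[0] == "t" else sol[lhs]
            if not new <= sol[rhs]:
                sol[rhs] |= new
                changed = True
    return sol


constraints = gen(PROGRAM, fn_terms(PROGRAM))
sol = solve(constraints)
```

For this program the sketch produces eight constraints, and the fixed point it reaches agrees with the first candidate column of the "Possible Solutions" slide.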
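Adding Data Flow Information
Dataflow values can affect control flow analysis
Example
$(\mathtt{let}\ f = (\mathtt{fn}\ x \Rightarrow (\mathtt{if}\ (x^1 > 0^2)^3\ \mathtt{then}\ (\mathtt{fn}\ y \Rightarrow y^4)^5\ \mathtt{else}\ (\mathtt{fn}\ z \Rightarrow 5^6)^7)^8)^9\ \mathtt{in}\ ((f^{10}\ 3^{11})^{12}\ 0^{13})^{14})^{15}$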
Adding Data Flow Information
Add a finite set of “abstract” values per program: $\mathbf{Data}$
Update $\mathbf{Val} = \mathcal{P}(\mathbf{Term} \cup \mathbf{Data})$
– $\rho \in \mathbf{Env} = \mathbf{Var} \to \mathbf{Val}$ // Abstract environment
– $C \in \mathbf{Cache} = \mathbf{Lab} \to \mathbf{Val}$ // Abstract cache
Generate extra constraints for data
Obtain a more precise solution
A special case of product domain (4.4)
The combination of two analyses may be more precise than both
For some programs it may even be more efficient
Adding Dataflow Information (Sign Analysis)
Sign analysis
Add a finite set of “abstract” values per program: $\mathbf{Data} = \{p, n, tt, ff\}$
Update $\mathbf{Val} = \mathcal{P}(\mathbf{Term} \cup \mathbf{Data})$
$d_c$ is the abstract value that represents a constant $c$
– $d_3 = \{p\}$
– $d_{-7} = \{n\}$
– $d_{\mathit{true}} = \{tt\}$
– $d_{\mathit{false}} = \{ff\}$
Every operator is conservatively interpreted
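As an illustration of what "conservatively interpreted" can mean, the sketch below lifts two operators to sets of abstract values. It rests on assumptions the slides do not state: `p` is read as strictly positive and `n` as strictly negative, zero has no abstract value in this four-element Data, and the tables `MUL` and `LT` as well as the helpers `abstract_const` and `lift` are hypothetical names chosen for this example.

```python
from itertools import product

P, N, TT, FF = "p", "n", "tt", "ff"

# Abstract operator tables on single data values (assumed reading: p > 0, n < 0).
MUL = {(P, P): {P}, (P, N): {N}, (N, P): {N}, (N, N): {P}}
LT = {(P, P): {TT, FF}, (P, N): {FF}, (N, P): {TT}, (N, N): {TT, FF}}


def abstract_const(c):
    """d_c: the abstract value representing the constant c."""
    if c is True:
        return {TT}
    if c is False:
        return {FF}
    if c > 0:
        return {P}
    if c < 0:
        return {N}
    raise ValueError("zero has no abstract value in this four-element Data")


def lift(table, s1, s2):
    """Apply an abstract operator to two sets of abstract values.

    Pairs with no table entry (for example function terms) contribute nothing.
    """
    out = set()
    for a, b in product(s1, s2):
        out |= table.get((a, b), set())
    return out
```

When the sign alone does not decide a comparison, as in `lift(LT, {P}, {P})`, both `tt` and `ff` are returned, which keeps the result safe at the cost of precision.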
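Syntax Directed Constraint Generation (Part I)
$\mathcal{C}_*[\![c^l]\!] = \{d_c \subseteq C(l)\}$
$\mathcal{C}_*[\![x^l]\!] = \{\rho(x) \subseteq C(l)\}$
$\mathcal{C}_*[\![(\mathtt{fn}\ x \Rightarrow e)^l]\!] = \mathcal{C}_*[\![e]\!] \cup \{\{\mathtt{fn}\ x \Rightarrow e\} \subseteq C(l)\}$
$\mathcal{C}_*[\![(\mathtt{fun}\ f\ x \Rightarrow e)^l]\!] = \mathcal{C}_*[\![e]\!] \cup \{\{\mathtt{fun}\ f\ x \Rightarrow e\} \subseteq C(l)\} \cup \{\{\mathtt{fun}\ f\ x \Rightarrow e\} \subseteq \rho(f)\}$
$\mathcal{C}_*[\![(t_1^{l_1}\ t_2^{l_2})^l]\!] = \mathcal{C}_*[\![t_1^{l_1}]\!] \cup \mathcal{C}_*[\![t_2^{l_2}]\!]$
$\cup\ \{\{t\} \subseteq C(l_1) \Rightarrow C(l_2) \subseteq \rho(x) \mid t = (\mathtt{fn}\ x \Rightarrow t_0^{l_0}) \in \mathbf{Term}_*\}$
$\cup\ \{\{t\} \subseteq C(l_1) \Rightarrow C(l_0) \subseteq C(l) \mid t = (\mathtt{fn}\ x \Rightarrow t_0^{l_0}) \in \mathbf{Term}_*\}$
$\cup\ \{\{t\} \subseteq C(l_1) \Rightarrow C(l_2) \subseteq \rho(x) \mid t = (\mathtt{fun}\ f\ x \Rightarrow t_0^{l_0}) \in \mathbf{Term}_*\}$
$\cup\ \{\{t\} \subseteq C(l_1) \Rightarrow C(l_0) \subseteq C(l) \mid t = (\mathtt{fun}\ f\ x \Rightarrow t_0^{l_0}) \in \mathbf{Term}_*\}$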
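Syntax Directed Constraint Generation (Part II)
$\mathcal{C}_*[\![(\mathtt{if}\ t_0^{l_0}\ \mathtt{then}\ t_1^{l_1}\ \mathtt{else}\ t_2^{l_2})^l]\!] = \mathcal{C}_*[\![t_0^{l_0}]\!] \cup \mathcal{C}_*[\![t_1^{l_1}]\!] \cup \mathcal{C}_*[\![t_2^{l_2}]\!]$
$\cup\ \{d_{\mathit{true}} \subseteq C(l_0) \Rightarrow C(l_1) \subseteq C(l)\} \cup \{d_{\mathit{false}} \subseteq C(l_0) \Rightarrow C(l_2) \subseteq C(l)\}$
$\mathcal{C}_*[\![(\mathtt{let}\ x = t_1^{l_1}\ \mathtt{in}\ t_2^{l_2})^l]\!] = \mathcal{C}_*[\![t_1^{l_1}]\!] \cup \mathcal{C}_*[\![t_2^{l_2}]\!] \cup \{C(l_1) \subseteq \rho(x)\} \cup \{C(l_2) \subseteq C(l)\}$
$\mathcal{C}_*[\![(t_1^{l_1}\ \mathit{op}\ t_2^{l_2})^l]\!] = \mathcal{C}_*[\![t_1^{l_1}]\!] \cup \mathcal{C}_*[\![t_2^{l_2}]\!] \cup \{C(l_1)\ \mathit{op}\ C(l_2) \subseteq C(l)\}$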
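Adding Context Information
The analysis does not distinguish between different occurrences of a variable (Monovariant analysis)
Example
$(\mathtt{let}\ f = (\mathtt{fn}\ x \Rightarrow x^1)^2\ \mathtt{in}\ ((f^3\ f^4)^5\ (\mathtt{fn}\ y \Rightarrow y^6)^7)^8)^9$
Source-to-source transformation can help (but may lead to code explosion)
Example rewritten
$\mathtt{let}\ f_1 = \mathtt{fn}\ x_1 \Rightarrow x_1\ \mathtt{in}\ \mathtt{let}\ f_2 = \mathtt{fn}\ x_2 \Rightarrow x_2\ \mathtt{in}\ (f_1\ f_2)\ (\mathtt{fn}\ y \Rightarrow y)$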
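The loss of precision in the monovariant analysis can be observed by running a small 0-CFA constraint solver on the original (unrewritten) example. The code is a sketch under assumptions: the tuple encoding of terms, the constraint triples `(guard, lhs, rhs)`, the naive fixed point loop, and every helper name are introduced here for illustration, and only variables, `fn`, application and `let` are handled.

```python
from collections import defaultdict


def lab(term, l): return (term, l)
def var(x): return ("var", x)
def fn(x, body): return ("fn", x, body)
def app(e1, e2): return ("app", e1, e2)
def let(x, e1, e2): return ("let", x, e1, e2)


FN_X = fn("x", lab(var("x"), 1))
FN_Y = fn("y", lab(var("y"), 6))
inner = lab(app(lab(var("f"), 3), lab(var("f"), 4)), 5)
body = lab(app(inner, lab(FN_Y, 7)), 8)
PROGRAM = lab(let("f", lab(FN_X, 2), body), 9)


def fn_terms(e):
    term = e[0]
    if term[0] == "fn":
        return [term] + fn_terms(term[2])
    if term[0] == "app":
        return fn_terms(term[1]) + fn_terms(term[2])
    if term[0] == "let":
        return fn_terms(term[2]) + fn_terms(term[3])
    return []


def gen(e, fns):
    term, l = e
    kind = term[0]
    if kind == "var":
        return [(None, ("r", term[1]), ("C", l))]
    if kind == "fn":
        return gen(term[2], fns) + [(None, ("t", term), ("C", l))]
    if kind == "let":
        _, x, e1, e2 = term
        return gen(e1, fns) + gen(e2, fns) + [
            (None, ("C", e1[1]), ("r", x)), (None, ("C", e2[1]), ("C", l))]
    _, e1, e2 = term  # application
    out = gen(e1, fns) + gen(e2, fns)
    for t in fns:
        guard = (t, ("C", e1[1]))
        out.append((guard, ("C", e2[1]), ("r", t[1])))
        out.append((guard, ("C", t[2][1]), ("C", l)))
    return out


def solve(constraints):
    sol = defaultdict(set)
    changed = True
    while changed:
        changed = False
        for guard, lhs, rhs in constraints:
            if guard is not None and guard[0] not in sol[guard[1]]:
                continue
            new = {lhs[1]} if lhs[0] == "t" else sol[lhs]
            if not new <= sol[rhs]:
                sol[rhs] |= new
                changed = True
    return sol


sol = solve(gen(PROGRAM, fn_terms(PROGRAM)))
```

Evaluating the program yields only `fn y => y`, yet the computed `sol[("C", 9)]` contains both functions, because the single entry for `x` merges the arguments of the two applications.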
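Simplified k-CFA
Records the last k dynamic calls (for some fixed k)
Similar to the call string approach
Remember the context in which an expression is evaluated
$\mathbf{Val}$ is now $\mathcal{P}(\mathbf{Term}) \times \mathbf{Contexts}$
– $\rho \in \mathbf{Env} = \mathbf{Var} \times \mathbf{Contexts} \to \mathbf{Val}$
– $C \in \mathbf{Cache} = \mathbf{Lab} \times \mathbf{Contexts} \to \mathbf{Val}$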
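One way to picture "the last k dynamic calls" is as a bounded tuple of application labels. The helper below is a hypothetical sketch (the name `extend` and the tuple representation are assumptions, not taken from the slides): on each call it appends the label of the application and keeps only the k most recent entries.

```python
def extend(context, call_label, k):
    """Return the context after a call at call_label, truncated to the last k labels."""
    if k == 0:
        return ()  # 0-CFA: a single, empty context
    return (context + (call_label,))[-k:]


# With k = 1 the body of a function is analysed once per calling application.
ctx_after_5 = extend((), 5, 1)
ctx_after_8 = extend(ctx_after_5, 8, 1)
```

With this representation the cache and the environment would be keyed by pairs such as `(1, (5,))` and `("x", (5,))`, so the two uses of the same function body no longer share an entry.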
1-CFA
$(\mathtt{let}\ f = (\mathtt{fn}\ x \Rightarrow x^1)^2\ \mathtt{in}\ ((f^3\ f^4)^5\ (\mathtt{fn}\ y \Rightarrow y^6)^7)^8)^9$
Contexts
– [] - The empty context
– [5] - The application at label 5
– [8] - The application at label 8
Polyvariant Control Flow
$C(1, [5]) = \rho(x, [5]) = C(2, []) = C(3, []) = \rho(f, []) = (\{\mathtt{fn}\ x \Rightarrow x^1\}, [])$
$C(1, [8]) = \rho(x, [8]) = C(7, []) = C(8, []) = C(9, []) = (\{\mathtt{fn}\ y \Rightarrow y^6\}, [])$
The Motivating Example
class Vehicle extends Object {
    int position = 10;
    void move(x1 : int) { position = position + x1; }
}
class Car extends Vehicle {
    int passengers;
    void await(v : Vehicle) {
        if (v.position < position)
            then v.move(position - v.position);
            else self.move(10);
    }
}
class Truck extends Vehicle {
    void move(x2 : int) {
        if (x2 < 55) position = position + x2;
    }
}
void main {
    Car c; Truck t; Vehicle v1;
    new c; new t;
    v1 := c;
    c.passengers := 2;
    c.move(60);
    v1.move(70);
    c.await(t);
}
Missing Material
Efficient Cubic Solution to Set Constraints
– www.cs.berkeley.edu/Research/Aiken/bane.html
Experimental results for OO
– www.cs.washington.edu/research/projects/cecil
Operational Semantics for FUN (3.2.1)
Defining acceptability without structural induction
– More precise treatment of termination (3.2.2)
– Needs Co-Induction (greatest fixed point)
Using general lattices as Dataflow values instead of powersets (3.5.2)
Lower-bounds
– Decidability of JOP
– Polynomiality
Conclusions
Set constraints are quite useful
– A Uniform syntax
– Can even deal with pointers
But semantic foundation is still based on abstract interpretation
Techniques used in functional and imperative (OO) programming are similar
Control and data flow analysis are related