Course Outline • Traditional Static Program Analysis – Classic analyses and applications • Software Testing, Refactoring • Dynamic Program Analysis


Course Outline

• Traditional Static Program Analysis

– Classic analyses and applications

• Software Testing, Refactoring

• Dynamic Program Analysis


Announcements

• I am setting up a new page for the class at

www.rpi.edu/~milana2/csci6961

• Email: [email protected], [email protected]


Outline

• Data-flow frameworks
– The “Maximal Fixed Point” (MFP) solution
– The “Meet Over all Paths” (MOP) solution

• Analysis of object references
– Class Hierarchy Analysis (CHA)
– Rapid Type Analysis (RTA)


The MOP Solution

1. x:=a*b

2. if y<=a*b

3. a:=a+1

4. x:=x*b

5. goto 2

The MOP at entry of n is

$$\mathrm{MOP}(n) = \bigvee_{p \,\in\, \text{paths from } \rho \text{ to } n} f_p(\mathrm{init}(\rho))$$

The MOP over-approximates run-time dataflow facts. The MOP is the best summary of dataflow facts.

MOP at entry of 3: f2(f1(Ø)) ∪ f2(f5(f4(f3(f2(f1(Ø)))))) ∪ f2(f5(f4(f3(f2(f5(f4(f3(f2(f1(Ø)))))))))) ∪ … = {(x,1),(x,4),(a,3)}

This MOP over-approximates the reaching definitions at entry of 3. E.g., suppose that at the beginning y=1, a=1 and b=2. The actual reaching definitions at entry of 3: {(x,1)}!

[Figure: control-flow graph of statements 1 to 5; the branch at statement 2 has outgoing edges labeled T and F.]


The MFP Solution

1. x:=a*b

2. if y<=a*b

3. a:=a+1

4. x:=x*b

5. goto 2

The MFP at entry of 3 is the in(3) obtained as a solution of the following equations through fixed-point iteration:

in(1) = Ø
out(1) = f1(in(1))

in(2) = out(1) ∪ out(5)
out(2) = f2(in(2))

in(3) = out(2)
out(3) = f3(in(3))

in(4) = out(3)
out(4) = f4(in(4))

in(5) = out(4)
out(5) = f5(in(5)) = in(5)
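The equations above can be solved mechanically. The following is a minimal sketch, not the lecture's own code, of fixed-point iteration for reaching definitions on the five-statement program. The class name `ReachingDefsMfp`, the string encoding `"x@1"` for the definition (x,1), and the `GEN`/`KILLED_VAR` tables are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper class; names are illustrative, not from the lecture.
final class ReachingDefsMfp {
    // A definition (v,n) is encoded as "v@n". Index 0 is unused.
    // GEN[n] is the definition created by statement n (null if none);
    // KILLED_VAR[n] is the variable whose older definitions it removes.
    static final String[] GEN = {null, "x@1", null, "a@3", "x@4", null};
    static final String[] KILLED_VAR = {null, "x", null, "a", "x", null};

    // Transfer function f_n: drop definitions of the assigned variable, add the new one.
    static Set<String> transfer(int n, Set<String> in) {
        Set<String> out = new TreeSet<>(in);
        if (GEN[n] != null) {
            String killed = KILLED_VAR[n];
            out.removeIf(d -> d.startsWith(killed + "@"));
            out.add(GEN[n]);
        }
        return out;
    }

    // Iterates the in/out equations until nothing changes.
    // Returns in(1)..in(5) at list indices 1..5.
    static List<Set<String>> solve() {
        List<Set<String>> in = new ArrayList<>();
        List<Set<String>> out = new ArrayList<>();
        for (int n = 0; n <= 5; n++) {
            in.add(new TreeSet<>());
            out.add(new TreeSet<>());
        }
        boolean changed = true;
        while (changed) {
            changed = false;
            for (int n = 1; n <= 5; n++) {
                Set<String> newIn = new TreeSet<>();
                if (n == 2) {                 // in(2) = out(1) U out(5)
                    newIn.addAll(out.get(1));
                    newIn.addAll(out.get(5));
                } else if (n > 1) {           // in(n) = out(n-1); in(1) stays empty
                    newIn.addAll(out.get(n - 1));
                }
                Set<String> newOut = transfer(n, newIn);
                if (!newIn.equals(in.get(n)) || !newOut.equals(out.get(n))) {
                    changed = true;
                }
                in.set(n, newIn);
                out.set(n, newOut);
            }
        }
        return in;
    }

    public static void main(String[] args) {
        List<Set<String>> in = solve();
        for (int n = 1; n <= 5; n++) {
            System.out.println("in(" + n + ") = " + in.get(n));
        }
    }
}
```

Under these assumptions the iteration stabilizes with in(3) = {(x,1),(x,4),(a,3)}, the same set as the MOP at entry of 3 for this program.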


The MOP and MFP Solutions

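Lattice of sets of definitions, level by level:

{}
{(x,1)}  {(x,4)}  {(a,3)}
{(x,1),(x,4)}  {(x,4),(a,3)}  {(x,1),(a,3)}
{(x,1),(x,4),(a,3)}

1. x:=a*b
2. if y<=a*b
3. a:=a+1
4. x:=x*b
5. goto 2

[Figure: Hasse diagram of the lattice above, with its extreme elements labeled 0 and 1 and its elements annotated with the values of in(1) through in(5).]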


MOP vs. MFP

• For distributive functions the dataflow analysis can merge paths (p1, p2) without loss of precision!
– E.g., fp1(0) need not be calculated explicitly
– MFP=MOP

• Due to Kam and Ullman, 1976, 1977: this is not true for monotone functions
– MFP≥MOP. In general, computing the MOP is undecidable

• A solution S with S≥MOP is a safe solution; a solution that does not satisfy S≥MOP is an unsafe solution
– Other terms for unsafe: incorrect, unsound
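As a brief check of the distributive case, not taken from the slides: a transfer function of the gen/kill form, such as those used for reaching definitions, distributes over union. Here gen and kill are assumed to be fixed sets that depend only on the statement.

```latex
\begin{align*}
f(X) &= (X \setminus \mathit{kill}) \cup \mathit{gen} \\
f(X \cup Y) &= \bigl((X \cup Y) \setminus \mathit{kill}\bigr) \cup \mathit{gen} \\
            &= (X \setminus \mathit{kill}) \cup (Y \setminus \mathit{kill}) \cup \mathit{gen} \\
            &= f(X) \cup f(Y)
\end{align*}
```

Merging two paths before applying f therefore gives the same result as applying f along each path and merging afterwards, which is why MFP=MOP for such analyses.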


Many Applications!

• White-box testing: compute coverage

• Regression testing

• Reverse engineering

• Restructuring: automated refactoring

• Static debugging
– Memory errors
– Concurrency bugs


Analysis of object references

• Analysis of object-oriented programs
– Java

• Class Analysis problem: Given a reference variable x, what are the classes of the objects that x refers to at runtime?

• Points-to Analysis problem: Given a reference variable x, what are the objects that x refers to at runtime?


Example: BoolExp hierarchy

class BoolExp {
  public:
    BoolExp();
    virtual bool Evaluate(Context&) = 0;
};

class Constant : public BoolExp {
  public:
    Constant(bool);
    virtual bool Evaluate(Context&);
  private:
    bool _constant;
};
Constant::Constant(bool c) { _constant = c; }
bool Constant::Evaluate(Context& aContext) { return _constant; }

class VarExp : public BoolExp {
  public:
    VarExp(char*);
    virtual bool Evaluate(Context&);
  private:
    char* _name;
};
VarExp::VarExp(char* n) { _name = n; }
bool VarExp::Evaluate(Context& aContext) { return aContext.Lookup(_name); }


Example: BoolExp hierarchy

class AndExp : public BoolExp {
  public:
    AndExp(BoolExp*, BoolExp*);
    virtual bool Evaluate(Context&);
    // NOTE: need destructors!
  private:
    BoolExp* _operand1;
    BoolExp* _operand2;
};
AndExp::AndExp(BoolExp* op1, BoolExp* op2) { _operand1 = op1; _operand2 = op2; }
bool AndExp::Evaluate(Context& aContext) {
    return _operand1->Evaluate(aContext) && _operand2->Evaluate(aContext);
}

class OrExp : public BoolExp {
  public:
    OrExp(BoolExp*, BoolExp*);
    virtual bool Evaluate(Context&);
  private:
    BoolExp* _operand1;
    BoolExp* _operand2;
};
OrExp::OrExp(BoolExp* op1, BoolExp* op2) { _operand1 = op1; _operand2 = op2; }
bool OrExp::Evaluate(Context& aContext) {
    return _operand1->Evaluate(aContext) || _operand2->Evaluate(aContext);
}


A client of the BoolExp hierarchy

int main() {
    Context theContext;
    VarExp* x = new VarExp("X");
    VarExp* y = new VarExp("Y");
    BoolExp* exp = new AndExp(
        new Constant(true),
        new OrExp(x, y));
    theContext.Assign(x, true);
    theContext.Assign(y, false);
    bool result = exp->Evaluate(theContext);
}


Java Example: BoolExp hierarchy

public abstract class BoolExp {
    // An abstract class needs the method itself marked abstract.
    public abstract boolean Evaluate(Context c);
}

public class Constant extends BoolExp {
    private boolean _constant;
    public boolean Evaluate(Context c) {
        return _constant;
    }
}

public class VarExp extends BoolExp {
    private String _name;
    public boolean Evaluate(Context c) {
        return c.Lookup(_name);
    }
}


Java Example: BoolExp hierarchy

public class AndExp extends BoolExp {
    private BoolExp _operand1;
    private BoolExp _operand2;

    public AndExp(BoolExp op1, BoolExp op2) { _operand1 = op1; _operand2 = op2; }
    public boolean Evaluate(Context c) {
        return _operand1.Evaluate(c) && _operand2.Evaluate(c);
    }
}

public class OrExp extends BoolExp {
    private BoolExp _operand1;
    private BoolExp _operand2;

    public OrExp(BoolExp op1, BoolExp op2) { _operand1 = op1; _operand2 = op2; }
    public boolean Evaluate(Context c) {
        return _operand1.Evaluate(c) || _operand2.Evaluate(c);
    }
}


A client of the BoolExp hierarchy in Java

public static void main(String[] args) {
    Context theContext = new Context();
    VarExp x = new VarExp("X");
    VarExp y = new VarExp("Y");
    BoolExp exp = new AndExp(
        new Constant(true),
        new OrExp(x, y));
    theContext.Assign(x, true);
    theContext.Assign(y, false);
    boolean result = exp.Evaluate(theContext);
}

exp: {AndExp}

That is: At runtime exp may refer to (i.e., may point to) an object of class AndExp, but may not refer to an object of class OrExp!


Java Example: BoolExp hierarchy

public class AndExp extends BoolExp {
    // _operand1: {Constant}   _operand2: {OrExp}
    private BoolExp _operand1;
    private BoolExp _operand2;

    public AndExp(BoolExp op1, BoolExp op2) { _operand1 = op1; _operand2 = op2; }
    public boolean Evaluate(Context c) {
        return _operand1.Evaluate(c) && _operand2.Evaluate(c);
    }
}

public class OrExp extends BoolExp {
    // _operand1: {VarExp}   _operand2: {VarExp}
    private BoolExp _operand1;
    private BoolExp _operand2;

    public OrExp(BoolExp op1, BoolExp op2) { _operand1 = op1; _operand2 = op2; }
    public boolean Evaluate(Context c) {
        return _operand1.Evaluate(c) || _operand2.Evaluate(c);
    }
}


Class information: applications

• Compilers: can we devirtualize a virtual function call x.m()/x->m()?

• Software engineering
– The calling relations in the program: call graph
– Testing
– Most interesting analyses require this information


Some terminology

• Intraprocedural analysis
– So far, we assumed there are no procedure calls!
– Analysis that works within a procedure and approximates (or does not need) flow into and from procedures

• Interprocedural analysis
– Takes into account procedure calls and tracks flow into and from procedures
– Many issues:
  • Parameter passing mechanisms
  • Context
  • Call graph!
  • Functions as parameters!
– We will get back to this in a few classes…


Scalability

• For most analyses (including class analysis) we need interprocedural analysis on very large programs
• Can the analysis handle large programs?
– 100K LOC, up to 45M LOC?

• Approximations of standard fixed point iteration
– Reduce Lattice
– Reduce CFG
– Make transfer functions converge faster
– Other…


Today’s class

• Some simple interprocedural class analyses

• Class analysis: Given a reference variable x, what are the classes of the objects that x refers to at runtime?

• Class Hierarchy Analysis (CHA)
• Rapid Type Analysis (RTA)


Class Hierarchy Analysis (CHA)

• The simplest method of inferring information about reference variables
– Look at the class hierarchy

• In Java, if a reference variable r has a type A, the possible classes of run-time objects are included in the subtree of A, denoted by cone(A).
– At virtual call site r.m find the methods that may be called based on the hierarchy information

J. Dean, D. Grove, and C. Chambers, Optimization of OO Programs Using Static Class Hierarchy Analysis, ECOOP’95
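A minimal sketch of computing cone(A) from a class hierarchy, under stated assumptions: the hierarchy is given as a map from each class to its direct superclass, and the six classes used here (A as root, B and C extending A, G extending B, D and E extending C) are an illustrative hierarchy. The class and method names `ConeSketch`, `hierarchy`, and `cone` are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper class; not part of any real analysis framework.
final class ConeSketch {
    // Maps each class to its direct superclass (null for the root).
    static Map<String, String> hierarchy() {
        Map<String, String> parent = new LinkedHashMap<>();
        parent.put("A", null);
        parent.put("B", "A");
        parent.put("C", "A");
        parent.put("G", "B");
        parent.put("D", "C");
        parent.put("E", "C");
        return parent;
    }

    // cone(t): t itself plus every class that has t among its ancestors.
    static Set<String> cone(Map<String, String> parent, String t) {
        Set<String> result = new TreeSet<>();
        for (String c : parent.keySet()) {
            // Walk from c up to the root and look for t on the way.
            for (String k = c; k != null; k = parent.get(k)) {
                if (k.equals(t)) {
                    result.add(c);
                    break;
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> parent = hierarchy();
        System.out.println("cone(A) = " + cone(parent, "A"));
        System.out.println("cone(C) = " + cone(parent, "C"));
    }
}
```

With this hierarchy, a variable of declared type A may refer to objects of any of the six classes, while a variable of declared type C is limited to C, D, and E.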


Example

public class A {
    public static void main() {
        A a;
        D d = new D();
        E e = new E();
        if (…) a = d; else a = e;
        a.f();
    }
    …
}
public class B extends A {
    public void foo() {
        G g = new G();
        …
    }
    …
}
// there are no other creation sites or calls in the program

[Figure: class hierarchy. A is the root; B and C extend A; G extends B; D and E extend C. f() is declared in A, B, C, and G.]


Example

public class A {
    public static void main() {
        A a;
        D d = new D();
        E e = new E();
        if (…) a = d; else a = e;
        a.f();
    }
    …
}
public class B extends A {
    public void foo() {
        G g = new G();
        …
    }
    …
}
// there are no other creation sites or calls in the program

[Figure: class hierarchy. A is the root; B and C extend A; G extends B; D and E extend C. f() is declared in A, B, C, and G. The subtree Cone(C) is marked.]

The solution for reference variables by CHA is: a may refer to objects of classes {A,B,C,D,E,G}, d may refer to objects of class {D}, e may refer to objects of class {E}, and g to {G}.


Example

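public class A {
    public static void main() {
        A a;
        D d = new D();
        E e = new E();
        if (…) a = d; else a = e;
        a.f();
    }
    …
}
public class B extends A {
    public void foo() {
        G g = new G();
        …
    }
    …
}
// there are no other creation sites or calls in the program

[Figure: class hierarchy. A is the root; B and C extend A; G extends B; D and E extend C. f() is declared in A, B, C, and G.]

[Figure: call graph for the call site a.f(): main has edges to A.f, B.f, C.f, and G.f.]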


Example: Applies-to Sets

public class A {
    public static void main() {
        A a;
        D d = new D();
        E e = new E();
        if (…) a = d; else a = e;
        a.f();
    }
    …
}
public class B extends A {
    public void foo() {
        G g = new G();
        …
    }
    …
}
// there are no other creation sites or calls in the program

[Figure: class hierarchy. A is the root; B and C extend A; G extends B; D and E extend C. f() is declared in A, B, C, and G.]

[Figure: call graph for the call site a.f(): main has edges to A.f, B.f, C.f, and G.f.]

Applies-to sets: A.f = {A}; B.f = {B}; G.f = {G}; C.f = {C,D,E}


Observations on CHA

• Do we need to resolve the class of the receiver uniquely in order to devirtualize a call?

• Applies-to set for each method
– At a call site r.f(), take the set of possible classes for the receiver r; intersect this set with each possible method’s applies-to set.
– If only one method’s set has a non-empty intersection, then invoke the method directly.
– Otherwise, the call cannot be resolved.
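The resolution rule can be sketched as follows. This is an illustration under assumptions, not the published algorithm: the applies-to sets for a method f are assumed to be precomputed (here A.f = {A}, B.f = {B}, C.f = {C,D,E}, G.f = {G}), and the names `AppliesToSketch`, `appliesTo`, and `resolve` are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Hypothetical helper class; names are illustrative.
final class AppliesToSketch {
    // Applies-to sets for f, assumed to be precomputed from the class hierarchy.
    static Map<String, Set<String>> appliesTo() {
        Map<String, Set<String>> m = new LinkedHashMap<>();
        m.put("A.f", Set.of("A"));
        m.put("B.f", Set.of("B"));
        m.put("C.f", Set.of("C", "D", "E"));
        m.put("G.f", Set.of("G"));
        return m;
    }

    // Returns the unique target when exactly one applies-to set intersects
    // the receiver's possible classes; otherwise returns an empty Optional.
    static Optional<String> resolve(Map<String, Set<String>> appliesTo,
                                    Set<String> receiverClasses) {
        String target = null;
        for (Map.Entry<String, Set<String>> e : appliesTo.entrySet()) {
            boolean intersects = e.getValue().stream().anyMatch(receiverClasses::contains);
            if (intersects) {
                if (target != null) {
                    return Optional.empty(); // a second candidate: unresolved
                }
                target = e.getKey();
            }
        }
        return Optional.ofNullable(target);
    }

    public static void main(String[] args) {
        Map<String, Set<String>> sets = appliesTo();
        // Receiver of declared type C: three possible classes, one method.
        System.out.println(resolve(sets, Set.of("C", "D", "E")));
        // Receiver of declared type A: four candidate methods.
        System.out.println(resolve(sets, Set.of("A", "B", "C", "D", "E", "G")));
    }
}
```

The first call shows that the receiver's class does not have to be unique: a receiver that may be a C, D, or E object still resolves to the single method C.f.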


Rapid Type Analysis

• Improves on Class Hierarchy Analysis
• Interleaves construction of the call graph with the analysis (known as on-the-fly call graph construction)

• Only expands calls if it has seen an instantiated object of appropriate type

• Makes assumption that the whole program is available!

David Bacon and Peter Sweeney, “Fast Static Analysis of C++ Virtual Function Calls”, OOPSLA ‘96


Example

public class A {
    public static void main() {
        A a;
        D d = new D();
        E e = new E();
        if (…) a = d; else a = e;
        a.f();
    }
    …
}
public class B extends A {
    public void foo() {
        G g = new G();
        …
    }
    …
}
// there are no other creation sites or calls in the program

RTA starts in main; sees that D and E are instantiated; expands a.f() into C.f() only. It never reaches B.foo() and never sees G instantiated.

[Figure: call graph with nodes main, A.f, B.f, C.f, and G.f.]


RTA

• Keeps two sets, I (the set of instantiated classes), and R (the set of reachable methods)
• Starts from main, I = {}, R = {main}
• Analyze calls in reachable methods: r.f()
– Finds potential targets according to CHA: X.f, Y.f, etc.
– If Applies-to(X.f) intersects with I, make X.f a real target, and add X.f to R
• Analyze instantiation sites in reachable methods: r = new A()
– Add A to I
– Find all analyzed calls r.f() with potential targets X.f triggered by A (i.e., A in Applies-to(X.f) at r.f()). Make X.f a real target, and add X.f to R.
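The two rules above can be sketched as a worklist loop. This is a simplified illustration, not the implementation from the Bacon and Sweeney paper: each method body is reduced to its instantiation sites and its virtual call sites, each call site already carries its CHA targets with their applies-to sets, and the bodies of A.f, B.f, C.f, and G.f are assumed to be empty. The names `RtaSketch`, `Body`, `program`, and `reachable` are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper class; names are illustrative.
final class RtaSketch {
    // A method body reduced to what RTA inspects.
    static final class Body {
        final List<String> news;                     // classes instantiated with "new"
        final List<Map<String, Set<String>>> calls;  // per call site: CHA target -> applies-to set
        Body(List<String> news, List<Map<String, Set<String>>> calls) {
            this.news = news;
            this.calls = calls;
        }
    }

    // The running example: main creates D and E and calls a.f(); B.foo creates G.
    static Map<String, Body> program() {
        Map<String, Set<String>> callF = new LinkedHashMap<>();
        callF.put("A.f", Set.of("A"));
        callF.put("B.f", Set.of("B"));
        callF.put("C.f", Set.of("C", "D", "E"));
        callF.put("G.f", Set.of("G"));
        Map<String, Body> p = new LinkedHashMap<>();
        p.put("main", new Body(List.of("D", "E"), List.of(callF)));
        p.put("B.foo", new Body(List.of("G"), List.of()));
        for (String m : callF.keySet()) {
            p.put(m, new Body(List.of(), List.of())); // assumed empty bodies
        }
        return p;
    }

    // Returns R, the set of methods reachable from main.
    static Set<String> reachable(Map<String, Body> program) {
        Set<String> instantiated = new TreeSet<>();  // I
        Set<String> reach = new TreeSet<>();         // R
        List<Map<String, Set<String>>> analyzedCalls = new ArrayList<>();
        Deque<String> worklist = new ArrayDeque<>();
        reach.add("main");
        worklist.add("main");
        while (!worklist.isEmpty()) {
            Body body = program.get(worklist.remove());
            instantiated.addAll(body.news);
            analyzedCalls.addAll(body.calls);
            // Re-check every analyzed call site, because I may have grown.
            for (Map<String, Set<String>> call : analyzedCalls) {
                for (Map.Entry<String, Set<String>> target : call.entrySet()) {
                    boolean live = !Collections.disjoint(target.getValue(), instantiated);
                    if (live && reach.add(target.getKey())) {
                        worklist.add(target.getKey());
                    }
                }
            }
        }
        return reach;
    }

    public static void main(String[] args) {
        System.out.println("R = " + reachable(program()));
    }
}
```

Re-scanning all analyzed call sites after each method is a deliberately simple way to cover both rules at once; an implementation concerned with speed would index call sites by the classes that can trigger them.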


Example (continued)

public class A {
    public static void main() {
        A a;
        D d = new D();
        E e = new E();
        if (…) a = d; else a = e;
        a.f();
    }
    …
}
public class B extends A {
    public void foo() {
        G g = new G();
        …
    }
    …
}
// there are no other creation sites or calls in the program

[Figure: call graph with nodes main, A.f, B.f, C.f, and G.f, annotated with the applies-to sets A.f: {A}, B.f: {B}, C.f: {C,D,E}, G.f: {G}.]


Comparisons

class A {
  public:
    virtual int foo() { return 1; };
};
class B : public A {
  public:
    virtual int foo() { return 2; };
    virtual int foo(int i) { return i+1; };
};
int main() {
    B* p = new B;
    int result1 = p->foo(1);
    int result2 = p->foo();
    A* q = p;
    int result3 = q->foo();
}

Bacon-Sweeney, OOPSLA’96

CHA resolves result2 call uniquely to B.foo(); however, it does not resolve result3.

RTA resolves result3 uniquely because only B has been instantiated.


Type Safety Limitations

• CHA and RTA assume type safety of the code they examine!

// #1
void* x = (void*) new B;
B* q = (B*) x;            // a safe downcast
int case1 = q->foo();

// #2
void* x = (void*) new A;
B* q = (B*) x;            // an unsafe downcast
int case2 = q->foo();     // probably no error

// #3
void* x = (void*) new A;
B* q = (B*) x;            // an unsafe downcast
int case3 = q->foo(66);   // run-time error

[Figure: class hierarchy. A declares foo(); B extends A and declares foo() and foo(int).]