Spring 2014 Program Analysis and Verification Lecture 8: Static Analysis II

Roman Manevich, Ben-Gurion University

Syllabus

• Semantics
– Natural Semantics
– Structural semantics
– Axiomatic Verification

• Static Analysis
– Automating Hoare Logic
– Control Flow Graphs
– Equation Systems
– Collecting Semantics

• Abstract Interpretation fundamentals
– Lattices
– Galois Connections
– Fixed-Points
– Widening/Narrowing
– Domain constructors
– Interprocedural Analysis

• Analysis Techniques
– Numerical Domains
– CEGAR
– Alias analysis
– Shape Analysis

• Crafting your own
– Soot
– From proofs to abstractions
– Systematically developing transformers

Previously

• Static Analysis by example
– Simple Available Expressions analysis
– Abstract transformer for assignments
– Three-address code
– Processing serial composition
– Processing conditions
– Processing loops

Defining an SAV abstract transformer

• Goal: define a function F_SAV[x:=a], from sets of factoids to sets of factoids, s.t. if F_SAV[x:=a](D) = D’ then sp(x := a, Conj(D)) ⇒ Conj(D’)

• Idea: define rules for individual facts and generalize to sets of facts by the conjunction rule

• In the rules below, α is either a variable v or an addition expression v+w

{ x=α } x:=a { }  [kill-lhs]

{ y=x+w } x:=a { }  [kill-rhs-1]

{ y=w+x } x:=a { }  [kill-rhs-2]

{ } x:=α { x=α }  [gen]

{ y=z+w } x:=a { y=z+w }  [preserve]
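The kill/gen/preserve rules can be sketched in Python. This is a minimal illustration, not the lecture's implementation: the tuple encoding of factoids, the names `mentions` and `f_sav`, and two choices are assumptions made here. The sketch kills every factoid that mentions x (including plain equalities such as y=x), and it generates x=α only when x does not occur in α, since x = x+1 would not hold after the assignment.

```python
# Assumed encoding: ("eq", x, y) stands for x=y, ("add", x, y, z) stands for x=y+z.
# The right-hand side of an assignment is ("var", v) or ("add", v, w).

def mentions(fact, var):
    """True if the factoid refers to var on either side."""
    return var in fact[1:]

def f_sav(x, rhs, facts):
    """Sketch of F_SAV[x := rhs] applied to a set of factoids."""
    # kill-lhs / kill-rhs: drop facts that mention x; preserve all others
    kept = {f for f in facts if not mentions(f, x)}
    # gen: add x = rhs, but only if rhs does not itself mention x (assumption)
    if x not in rhs[1:]:
        if rhs[0] == "var":
            kept.add(("eq", x, rhs[1]))
        else:
            kept.add(("add", x, rhs[1], rhs[2]))
    return kept

d = {("add", "y", "x", "a"), ("eq", "w", "d")}
after_x = f_sav("x", ("add", "x", "k"), d)      # x := x + k
after_y = f_sav("y", ("add", "x", "a"), {("eq", "w", "d")})  # y := x + a
```

Here `after_x` keeps only w=d, because y=x+a mentions the overwritten variable, while `after_y` preserves w=d and gains y=x+a.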

Defining a semantic reduction

• Idea: make as many implicit facts explicit by
– Using symmetry and transitivity of equality
– Commutativity of addition
– Meaning of equality – can substitute equal variables

• For an SAV-predicate P=Conj(D) define Explicate(D) = minimal set D* such that:

1. D ⊆ D*

2. x=y ∈ D* implies y=x ∈ D*

3. x=y ∈ D* and y=z ∈ D* implies x=z ∈ D*

4. x=y+z ∈ D* implies x=z+y ∈ D*

5. x=y ∈ D* and x=z+w ∈ D* implies y=z+w ∈ D*

6. x=y ∈ D* and z=x+w ∈ D* implies z=y+w ∈ D*

7. x=z+w ∈ D* and y=z+w ∈ D* implies x=y ∈ D*

• Notice that Explicate(D) ⇔ D
• Explicate is a special case of a semantic reduction
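The seven closure rules can be turned into a small saturation loop. The sketch below is one possible reading, with an assumed tuple encoding of factoids and a hypothetical function name `explicate`; it applies the rules literally, so trivial facts such as x=x also appear in the result.

```python
# Assumed encoding: ("eq", x, y) stands for x=y, ("add", x, y, z) stands for x=y+z.

def explicate(facts):
    """Close a set of factoids under rules 2-7; rule 1 holds because we start from facts."""
    closed = set(facts)
    while True:
        new = set()
        for f in closed:
            if f[0] == "eq":
                new.add(("eq", f[2], f[1]))           # rule 2: symmetry
            else:
                new.add(("add", f[1], f[3], f[2]))    # rule 4: commutativity
        for f in closed:
            for g in closed:
                if f[0] == "eq" and g[0] == "eq" and f[2] == g[1]:
                    new.add(("eq", f[1], g[2]))       # rule 3: transitivity
                if f[0] == "eq" and g[0] == "add":
                    _, x, y = f
                    _, lhs, a, b = g
                    if lhs == x:
                        new.add(("add", y, a, b))     # rule 5
                    if a == x:
                        new.add(("add", lhs, y, b))   # rule 6
                if f[0] == "add" and g[0] == "add" and f[2:] == g[2:]:
                    new.add(("eq", f[1], g[1]))       # rule 7
        if new <= closed:       # nothing new: the minimal closed set is reached
            return closed
        closed |= new

r = explicate({("eq", "x", "y"), ("add", "x", "z", "w")})
```

The loop terminates because only finitely many factoids can be built from the variables occurring in the input. For the input x=y, x=z+w the result contains, among others, y=x, y=z+w and y=w+z.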

Annotating assignments

• Define: F*[x:=aexpr] = Explicate ∘ F_SAV[x:=aexpr]

• Annotate(P, x:=aexpr) = {P} x:=aexpr {F*[x:=aexpr](P)}

Annotating composition

• Annotate(P, S1; S2) =
    let Annotate(P, S1) be {P} A1 {Q1}
    let Annotate(Q1, S2) be {Q1} A2 {Q2}
    return {P} A1; {Q1} A2 {Q2}

Simplifying conditions

• Extend While with
– Non-determinism (or) and
– An assume statement: ⟨assume b, s⟩ ⇒_sos s if B⟦b⟧ s = tt

• Now, the following two statements are equivalent
– if b then S1 else S2
– (assume b; S1) or (assume ¬b; S2)

assume transformer

• Define the factoid set of bexpr as: {bexpr} if bexpr is a factoid, {} otherwise

• Define F[assume bexpr](D) = D ∪ (factoid set of bexpr)

• Can sharpen: F*[assume bexpr] = Explicate ∘ F_SAV[assume bexpr]

Annotating conditions

Annotate(P, if bexpr then S1 else S2) =
    let Pt = F*[assume bexpr] P
    let Pf = F*[assume ¬bexpr] P
    let Annotate(Pt, S1) be {Pt} A1 {Q1}
    let Annotate(Pf, S2) be {Pf} A2 {Q2}
    return {P} if bexpr then {Pt} A1 {Q1} else {Pf} A2 {Q2} {Q1 ⊔ Q2}

k-loop unrolling

{ P }
Inv = { N }
while (x z) do
    x := x + 1
    y := x + a
    d := x + a

Unrolling the loop, with P = { y=x+a, y=a+x, w=d, d=w }:

{ P }
if (x z)
    x := x + 1
    y := x + a
    d := x + a
Q1 = { y=x+a, y=a+x }
if (x z)
    x := x + 1
    y := x + a
    d := x + a
Q2 = { y=x+a, y=a+x }
…

The following must hold: P ⇒ N, Q1 ⇒ N, Q2 ⇒ N, …, Qk ⇒ N, …

We can compute the following sequence:

N0 = P

N1 = N0 ⊔ Q1

N2 = N1 ⊔ Q2

…

Nk = Nk-1 ⊔ Qk

Observation 1: No need to explicitly unroll loop – we can reuse postcondition from unrolling k-1 for k

Annotating loops

Annotate(P, while bexpr do S) =
    Initialize N := Nc := P
    repeat
        let Annotate(Nc, if bexpr then S else skip) be {Nc} if bexpr then S else skip {N}
        Nc := Nc ⊔ N
    until N = Nc
    return {P} INV= {N} while bexpr do {F[assume bexpr](N)} Annotate(F[assume bexpr](N), S) {F[assume ¬bexpr](N)}

Annotating programs

Annotate(P, S) =
    case S is x:=aexpr
        return {P} x:=aexpr {F*[x:=aexpr] P}
    case S is S1; S2
        let Annotate(P, S1) be {P} A1 {Q1}
        let Annotate(Q1, S2) be {Q1} A2 {Q2}
        return {P} A1; {Q1} A2 {Q2}
    case S is if bexpr then S1 else S2
        let Pt = F[assume bexpr] P
        let Pf = F[assume ¬bexpr] P
        let Annotate(Pt, S1) be {Pt} A1 {Q1}
        let Annotate(Pf, S2) be {Pf} A2 {Q2}
        return {P} if bexpr then {Pt} A1 {Q1} else {Pf} A2 {Q2} {Q1 ⊔ Q2}
    case S is while bexpr do S
        N := Nc := P // Initialize
        repeat
            let Pt = F[assume bexpr] Nc
            let Annotate(Pt, S) be {Nc} Abody {N}
            Nc := Nc ⊔ N
        until N = Nc
        return {P} INV= {N} while bexpr do {Pt} Abody {F[assume ¬bexpr](N)}
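The loop case is the only one that iterates, so it may help to see it run. The sketch below is a simplified illustration under several assumptions: predicates are frozensets of (variable, constant) pairs, join is set intersection, the loop condition contributes no factoid (so the assume step is the identity), and the loop stops when joining no longer changes Nc. The names `join`, `body_post` and `loop_invariant` are hypothetical.

```python
def join(p, q):
    """Keep only the factoids that hold on both paths."""
    return p & q

def body_post(p):
    """Hand-written postcondition of the loop body 'y := y + x' (assumed example body)."""
    consts = dict(p)
    out = {(v, c) for (v, c) in p if v != "y"}        # kill facts about y
    if "x" in consts and "y" in consts:
        out.add(("y", consts["y"] + consts["x"]))     # gen y = y + x when both are known
    return frozenset(out)

def loop_invariant(p, post, join):
    """Iterate Nc := Nc ⊔ post(Nc) until Nc no longer changes."""
    nc = p
    while True:
        n = post(nc)
        merged = join(nc, n)
        if merged == nc:
            return nc
        nc = merged

inv = loop_invariant(frozenset({("x", 1), ("y", 0)}), body_post, join)
```

Starting from {x=1, y=0}, the first pass yields {x=1, y=1}; the join drops the disagreeing facts about y, and the second pass confirms that {x=1} is stable.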

Today

• Another static analysis example – constant propagation

• Basic concepts in static analysis
– Control flow graphs
– Equation systems
– Collecting semantics
– (Trace semantics)

Constant propagation

Second static analysis example

• Optimization: constant folding (simplifies constant expressions)
– Example: x:=7; y:=x*9
  transformed to: x:=7; y:=7*9
  and then to: x:=7; y:=63

• Analysis: constant propagation (CP)
– Infers facts of the form x=c
– Given { x=c }, constant folding rewrites y := aexpr to y := eval(aexpr[c/x])
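The substitute-then-evaluate step can be sketched as a recursive fold over an expression tree. The encoding (integers, variable names as strings, and `(op, left, right)` tuples) and the function name `fold` are assumptions made for this illustration.

```python
import operator

# Supported binary operators (an assumed subset)
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def fold(expr, consts):
    """Substitute known constants for variables, then evaluate constant subexpressions."""
    if isinstance(expr, int):
        return expr
    if isinstance(expr, str):
        return consts.get(expr, expr)     # aexpr[c/x] for every known x=c
    op, left, right = expr
    left, right = fold(left, consts), fold(right, consts)
    if isinstance(left, int) and isinstance(right, int):
        return OPS[op](left, right)       # eval of a constant expression
    return (op, left, right)              # not fully constant: keep the residual expression

folded = fold(("*", "x", 9), {"x": 7})    # y := x*9 under the fact x=7
partial = fold(("+", "x", "y"), {"x": 7})
```

With the fact x=7, the right-hand side x*9 folds to 63, matching the example; when some variable is unknown, only the known parts are substituted.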

Plan

• Define domain – set of allowed assertions
• Handle assignments
• Handle composition
• Handle conditions
• Handle loops

Constant propagation domain

CP semantic domain

• Define CP-factoids: Φ = { x = c | x ∈ Var, c ∈ Z }
– How many factoids are there?

• Define predicates as sets of factoids: 2^Φ
– How many predicates are there?
– Do all predicates make sense? (x=5) ∧ (x=7)

• Treat conjunctive formulas as sets of factoids: {x=5, y=7} ~ (x=5) ∧ (y=7)

Handling assignments

CP abstract transformer

• Goal: define a function F_CP[x:=aexpr], from predicates to predicates, such that if F_CP[x:=aexpr] P = P’ then sp(x:=aexpr, P) ⇒ P’

{ x=c } x:=aexpr { }  [kill]

{ } x:=c { x=c }  [gen-1]

{ y=c1, z=c2 } x:=y op z { x=c } where c = c1 op c2  [gen-2]

{ y=c } x:=aexpr { y=c }  [preserve]

Gen-kill formulation of transformers

• Suited for analysis propagating sets of factoids
– Available expressions,
– Constant propagation, etc.

• For each statement, define a set of killed factoids and a set of generated factoids: F[S] P = (P \ kill(S)) ∪ gen(S)
• F_CP[x:=aexpr] P = (P \ {x=c}) when aexpr is not a constant
• F_CP[x:=k] P = (P \ {x=c}) ∪ {x=k}
• Used in dataflow analysis – a special case of abstract interpretation
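A gen-kill transformer for constant propagation might look as follows. This is a sketch under assumptions: predicates are frozensets of (variable, constant) pairs with at most one constant per variable, expressions use the tuple encoding `(op, left, right)`, and the names `evaluate` and `f_cp` are hypothetical. It generalizes gen-1 and gen-2 slightly by generating x=c whenever the whole right-hand side evaluates to a constant under P.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def evaluate(aexpr, consts):
    """Return the integer value of aexpr if P determines it, otherwise None."""
    if isinstance(aexpr, int):
        return aexpr
    if isinstance(aexpr, str):
        return consts.get(aexpr)
    op, left, right = aexpr
    lv, rv = evaluate(left, consts), evaluate(right, consts)
    if lv is None or rv is None:
        return None
    return OPS[op](lv, rv)

def f_cp(x, aexpr, p):
    """Sketch of F_CP[x := aexpr] P = (P \\ kill) ∪ gen."""
    value = evaluate(aexpr, dict(p))             # evaluate under the precondition
    out = {(v, c) for (v, c) in p if v != x}     # kill: every factoid x=c
    if value is not None:
        out.add((x, value))                      # gen: x = value
    return frozenset(out)

p = frozenset({("y", 2), ("z", 3), ("x", 9)})
q = f_cp("x", ("+", "y", "z"), p)
```

Evaluating the right-hand side before killing matters for assignments such as x := x + 1, where the old value of x is needed to compute the new one.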

Handling composition

Does this still work?

Annotate(P, S1; S2) =
    let Annotate(P, S1) be {P} A1 {Q1}
    let Annotate(Q1, S2) be {Q1} A2 {Q2}
    return {P} A1; {Q1} A2 {Q2}

Handling conditions

Handling conditional expressions

• We want to soundly approximate D ∧ bexpr and D ∧ ¬bexpr in the CP domain

• Define the factoid set of bexpr as: {bexpr} if bexpr is a CP-factoid, {} otherwise

• Define F[assume bexpr](D) = D ∪ (factoid set of bexpr)

Does this still work?

Annotate(P, if bexpr then S1 else S2) =
    let Pt = F[assume bexpr] P
    let Pf = F[assume ¬bexpr] P
    let Annotate(Pt, S1) be {Pt} A1 {Q1}
    let Annotate(Pf, S2) be {Pf} A2 {Q2}
    return {P} if bexpr then {Pt} A1 {Q1} else {Pf} A2 {Q2} {Q1 ⊔ Q2}

How do we define join for CP?

Join example

• {x=5, y=7} ⊔ {x=3, y=7, z=9} =
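One sound choice of join, when a predicate is a set of factoids read as a conjunction, is to keep exactly the factoids on which both branches agree, that is, set intersection. The sketch below assumes predicates are frozensets of (variable, constant) pairs; the name `join` is illustrative.

```python
def join(p, q):
    """Join of two CP predicates: the factoids guaranteed on both incoming paths."""
    return p & q

p = frozenset({("x", 5), ("y", 7)})
q = frozenset({("x", 3), ("y", 7), ("z", 9)})
joined = join(p, q)
```

For the example above the branches disagree on x, and only one of them knows z, so the join retains just y=7.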

Handling loops

Does this still work?

• What about correctness?
– If loop terminates then is N a loop invariant?
• What about termination?

Annotate(P, while bexpr do S) =
    N := Nc := P // Initialize
    repeat
        let Pt = F[assume bexpr] Nc
        let Annotate(Pt, S) be {Nc} Abody {N}
        Nc := Nc ⊔ N
    until N = Nc
    return {P} INV= {N} while bexpr do {Pt} Abody {F[assume ¬bexpr](N)}

A termination principle

• g : X → X is a function
• How can we determine whether the sequence x0, x1 = g(x0), …, xk+1 = g(xk), … stabilizes?
• Technique:

1. Find ranking function rank : X → N (that is, show that rank(x) ≥ 0 for all x)

2. Show that if x ≠ g(x) then rank(g(x)) < rank(x)
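The principle can be checked at run time on a concrete sequence. The sketch below iterates g and asserts both conditions on every step; the function name `stabilize` and the example g are assumptions chosen for illustration. A run-time check only covers the states actually visited, so it does not replace the proof.

```python
def stabilize(g, x0, rank):
    """Iterate g from x0 until a fixed point is reached, checking the ranking conditions."""
    x, steps = x0, 0
    while True:
        nxt = g(x)
        if nxt == x:
            return x, steps
        # Condition 1 (rank is a natural number) and condition 2 (strict decrease)
        assert 0 <= rank(nxt) < rank(x), "rank must strictly decrease on every change"
        x, steps = nxt, steps + 1

# Example: g intersects a set of facts with a fixed set; rank is the number of facts.
g = lambda s: s & frozenset({1, 2})
fixed, steps = stabilize(g, frozenset({1, 2, 3}), len)
```

Since rank takes values in the natural numbers and strictly decreases whenever the state changes, the sequence can change at most rank(x0) times.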

Rank function for available expressions

• rank(P) = |P| (number of factoids)

• Prove that either Nc = Nc ⊔ N or rank(Nc ⊔ N) <? rank(Nc)

Annotate(P, while bexpr do S) =
    N := Nc := P // Initialize
    repeat
        let Pt = F[assume bexpr] Nc
        let Annotate(Pt, S) be {Nc} Abody {N}
        Nc := Nc ⊔ N
    until N = Nc
    return {P} INV= {N} while bexpr do {Pt} Abody {F[assume ¬bexpr](N)}

Rank function for constant propagation

• rank(P) = |P| (number of factoids)

• Prove that either Nc = Nc ⊔ N’ or rank(Nc) >? rank(Nc ⊔ N’)

Annotate(P, while bexpr do S) =
    N’ := Nc := P // Initialize
    repeat
        let Pt = F[assume bexpr] Nc
        let Annotate(Pt, S) be {Nc} Abody {N’}
        Nc := Nc ⊔ N’
    until N’ = Nc
    return {P} INV= {N’} while bexpr do {Pt} Abody {F[assume ¬bexpr](N’)}

Generalizing

[Figure: Available Expressions and Constant Propagation generalized to Abstract Interpretation. Image by NMZ (Photoshop) [CC0], via Wikimedia Commons]

Towards a recipe for static analysis

• Two static analyses
– Available Expressions (extended with equalities)
– Constant Propagation

• Semantic domain – a family of formulas
– Join operator approximates pairs of formulas

• Abstract transformers for basic statements
– Assignments
– assume statements

• Initial precondition

Control flow graphs

A technical issue

• Unrolling loops is quite inconvenient and inefficient (but we can avoid it as we just saw)
• How do we handle more complex control-flow constructs, e.g., goto, break, exceptions…?
– The problem: non-inductive control flow constructs

• Solution: model control-flow by labels and goto statements

• Would like a dedicated data structure to explicitly encode control flow in support of the analysis

• Solution: control-flow graphs (CFGs)

Modeling control flow with labels

while (x z) do
    x := x + 1
    y := x + a
    d := x + a
a := b

label0:
    if x z goto label1
    x := x + 1
    y := x + a
    d := x + a
    goto label0
label1:
    a := b

Control-flow graph example

1 label0:
2     if x z goto label1
3     x := x + 1
4     y := x + a
5     d := x + a
6     goto label0
7 label1:
8     a := b

[Figure: control-flow graph with one node per numbered line (1–8), each node labeled with its line number]

Control-flow graph example

[Figure: the control-flow graph of the labeled program (nodes 1–8), extended with special entry and exit nodes]

Control-flow graph

• Nodes are statements or labels
• Special nodes for entry/exit
• An edge from node v to node w means that after executing the statement of v control passes to w
– Conditions represented by split and join nodes
– Loops create cycles

• Can be generated from abstract syntax tree in linear time
– Automatically taken care of by the front-end

• Usage: store analysis results (assertions) in CFG nodes
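As a rough illustration of the data structure, the sketch below builds a successor map for a labeled program like the running example. The instruction encoding, the placeholder condition string `"cond"`, and the function name `build_cfg` are assumptions; nodes are the 1-based line numbers plus the special `"entry"` and `"exit"` nodes.

```python
def build_cfg(program):
    """Map every node to the list of its successor nodes."""
    labels = {ins[1]: i for i, ins in enumerate(program, start=1) if ins[0] == "label"}
    succ = {"entry": [1]}
    last = len(program)
    for i, ins in enumerate(program, start=1):
        fall_through = i + 1 if i < last else "exit"
        if ins[0] == "goto":
            succ[i] = [labels[ins[1]]]                  # unconditional jump
        elif ins[0] == "if_goto":
            succ[i] = [labels[ins[2]], fall_through]    # split node: jump or continue
        else:
            succ[i] = [fall_through]                    # labels and assignments
    succ["exit"] = []
    return succ

program = [
    ("label", "label0"),               # 1
    ("if_goto", "cond", "label1"),     # 2 (condition left abstract)
    ("assign", "x", "x + 1"),          # 3
    ("assign", "y", "x + a"),          # 4
    ("assign", "d", "x + a"),          # 5
    ("goto", "label0"),                # 6
    ("label", "label1"),               # 7
    ("assign", "a", "b"),              # 8
]
cfg = build_cfg(program)
```

The back edge from node 6 to node 1 is the cycle created by the loop, and node 2 is the split node for the condition.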


Eliminating labels

• We can use edges to point to the nodes following labels and remove all label nodes (other than entry/exit)


Control-flow graph example

[Figure: the control-flow graph after eliminating labels, with nodes entry, 2 (if x z), 3 (x := x + 1), 4 (y := x + a), 5 (d := x + a), 8 (a := b) and exit]

Basic blocks

• A basic block is a chain of nodes with a single entry point and a single exit point

• Entry/exit nodes are separate blocks

[Figure: the label-free control-flow graph with nodes entry, 2, 3, 4, 5, 8 and exit]

Blocked CFG

• Stores basic blocks in a single node
• Extended blocks – maximal connected loop-free subgraphs

[Figure: blocked control-flow graph with nodes entry, block 2 (if x z), block 3–5 (x := x + 1; y := x + a; d := x + a), block 8 (a := b) and exit]
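Grouping nodes into basic blocks can be sketched with the usual "leader" idea: a node starts a new block unless it is the only successor of its only predecessor. The successor map below encodes the label-free example graph; the function name `basic_blocks` and the edge set are assumptions reconstructed from the program text.

```python
def basic_blocks(succ):
    """Partition CFG nodes into maximal chains with a single entry and a single exit."""
    preds = {n: [] for n in succ}
    for n, outs in succ.items():
        for m in outs:
            preds[m].append(n)
    special = {"entry", "exit"}

    def is_leader(n):
        if n in special:
            return True                       # entry/exit are separate blocks
        ps = preds[n]
        return len(ps) != 1 or ps[0] in special or len(succ[ps[0]]) != 1

    blocks = []
    for start in succ:
        if not is_leader(start):
            continue
        block, cur = [start], start
        # Extend the chain while control can only flow to a single non-leader node
        while cur not in special and len(succ[cur]) == 1 and not is_leader(succ[cur][0]):
            cur = succ[cur][0]
            block.append(cur)
        blocks.append(block)
    return blocks

succ = {"entry": [2], 2: [3, 8], 3: [4], 4: [5], 5: [2], 8: ["exit"], "exit": []}
blocks = basic_blocks(succ)
```

On this graph the three assignments in the loop body collapse into one block, while the condition node and the final assignment each form their own block.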

Collecting semantics

Why need another semantics?

• Operational semantics explains how to compute output from a given input
– Useful for implementing an interpreter/compiler
– Less useful for reasoning about safety properties
– Not suitable for computational purposes – does not explicitly show how assertions in different program points influence each other

• Need a more explicit semantics
– Over a control flow graph

Control-flow graph example

1 label0:
2     if x <= 0 goto label1
3     x := x - 1
4     goto label0
5 label1:

[Figure: control-flow graph with nodes entry, 1 (label0:), 2 (if x > 0), 3 (x := x - 1), 4 (goto label0), 5 (label1:) and exit]

Trimmed CFG

[Figure: trimmed control-flow graph with nodes entry, 2 (if x > 0), 3 (x := x - 1) and exit]

Collecting semantics example: input 1

[Figure: the trimmed CFG annotated with the states collected for input [x↦1]; the states shown along the edges are [x↦1] and [x↦0]]

Collecting semantics example: input 2

[Figure: the trimmed CFG annotated with the states accumulated after also running input [x↦2]; the states shown along the edges are [x↦2], [x↦1] and [x↦0]]

Collecting semantics example: input 3

[Figure: the trimmed CFG annotated with the states accumulated after also running input [x↦3]; the states shown along the edges are [x↦3], [x↦2], [x↦1] and [x↦0]]

ad infinitum – fixed point

[Figure: the trimmed CFG annotated with the states accumulated for inputs [x↦1], [x↦2], [x↦3], …; continuing ad infinitum, the sets of states at each point reach a fixed point]

Predicates at fixed point

[Figure: the trimmed CFG annotated with predicates describing the states at the fixed point: {true} at entry, {true} before the test x > 0, {x>0} before x := x - 1, {x≥0} after x := x - 1, and {x≤0} at exit]

Collecting semantics

• Accumulates for each control-flow node the (possibly infinite) sets of states that can reach there by executing the program from some given set of input states

• Not computable in general
• A reference point for static analysis
• (An abstraction of the trace semantics)
• We will work our way up to defining it formally

Collecting semantics in equational form

Math reference: function lifting

• Let f : X → Y be a function
• The lifted function f’ : 2^X → 2^Y is defined as f’(XS) = { f(x) | x ∈ XS }
• We will sometimes use the same symbol for both functions when it is clear from the context which one is used
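Lifting is a one-liner over finite sets. The sketch below uses a hypothetical helper `lift` and, as its example, the state transformer of x := x-1 with a state represented simply by the value of x (an assumption made for brevity).

```python
def lift(f):
    """Lift f : X -> Y to sets: f'(XS) = { f(x) | x in XS }."""
    return lambda xs: {f(x) for x in xs}

decrement = lambda x: x - 1          # effect of x := x-1 on a single state
decrement_all = lift(decrement)      # the same statement applied to a set of states
result = decrement_all({1, 2, 3})
```

The lifted function maps the set of states {1, 2, 3} to {0, 1, 2}; note that the image can be smaller than the input set when f is not injective.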

Equational definition example

• A vector of variables R[0, 1, 2, 3, 4]
• R[0] = {x∈Z} // established input
  R[1] = R[0] ∪ R[4]
  R[2] = R[1] ∩ {s | s(x) > 0} // semantic function for assume x>0
  R[3] = R[1] ∩ {s | s(x) ≤ 0}
  R[4] = ⟦x:=x-1⟧ R[2] // semantic function for x:=x-1 lifted to sets of states
• A (recursive) system of equations

[Figure: the trimmed CFG with R[0] at entry, R[1] before the test x > 0, R[2] before x := x-1, R[4] after x := x-1, and R[3] at exit]
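For a finite set of inputs the equation system can be solved by iterating from empty sets until nothing changes, which yields its least solution. The sketch below assumes a state is just the integer value of x and replaces the input {x∈Z} by a finite set so that the iteration terminates; the function name `solve` is hypothetical.

```python
def solve(inputs):
    """Least solution of the five equations, starting every R[i] from the empty set."""
    R = [set() for _ in range(5)]
    while True:
        new = [
            set(inputs),                     # R[0]: established input
            R[0] | R[4],                     # R[1] = R[0] ∪ R[4]
            {x for x in R[1] if x > 0},      # R[2]: assume x > 0
            {x for x in R[1] if x <= 0},     # R[3]: assume x <= 0
            {x - 1 for x in R[2]},           # R[4]: x := x-1 lifted to sets
        ]
        if new == R:                         # no equation changes anything: fixed point
            return R
        R = new

R = solve({3})
```

Starting from the single input x=3, the loop head R[1] accumulates {0, 1, 2, 3} and the exit R[3] contains only 0, mirroring how the annotations grew in the input-by-input examples.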

General definition

• A vector of variables R[0, …, k] one per input/output of a node
– R[0] is for entry
• For node n with multiple predecessors add equation R[n] = ∪ {R[k] | k is a predecessor of n}
• For an atomic operation node R[m] S R[n] add equation R[n] = ⟦S⟧ R[m]

• Transform if b then S1 else S2 to (assume b; S1) or (assume ¬b; S2)

[Figure: the trimmed CFG with R[0] at entry, R[1] before the test x > 0, R[2] before x := x-1, R[4] after x := x-1, and R[3] at exit]

Next lecture: abstract interpretation fundamentals