
Automatic Verification of Pointer Programs using Grammar-based Shape Analysis

Hongseok Yang, Seoul National University (Joint Work with Oukseh Lee and Kwangkeun Yi)

Automatic Verification of Pointer Programs

Inference of program invariants: crucial for automatic verification. Difficulty: unboundedly many new heap cells.

    h := nil;
    while (*) {
      x := new(nil, nil);
      if (h = nil) { h := x; }
      else { x->next := h;
             h->prev := x;
             h := x; }
    }

[Figure: the heap reachable from h after successive loop iterations; the list gains one cell per iteration.]

Need to “summarize” heap cells.

Goal: Precise “High-level” Invariants

[Figure: the heaps reachable from h after successive iterations, summarized by a single grammar-based description.]

    dlist(p) ::= nil | c:○[prev: p, next: dlist(c)]

Existing technology: Shape analysis [SaReWi96,99].

Idea: Use a grammar to find a good abstraction of each heap object (i.e., cells and their pointers).

Demo

Binomial heap construction (all pointers to nil are omitted).

Our immediate goal was to handle the binomial heap construction algorithm.

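Structure of Our Analysis

[Diagram: the body of while(B){C1;C2;C3;C4;} is executed abstractly in D# (“Abstract Execution”); the result is normalized from D# into Nk# (“Normalize”), and Nk# is embedded back into D# (“Embed”).]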

Abstract Domain D#: Grammar

D# = Pf(Graph) × Grammar + {⊤}

A grammar R is a set of rules of the following form:

    α(x) = nil | ○[V1, V2] | …

where V1, V2 ∈ {nil, self, x, α(_), α(self), α(nil), α(x)}.

Examples:

    tree(x) = nil | ○[tree(_), tree(_)]
    dList(x) = nil | ○[x, dList(self)]
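The rule format above can be read as a recursive membership test on concrete heaps. The sketch below is one possible encoding, not the authors' implementation: the dict layout, the sentinel `ANY` for the argument "_", and the functions `derive` and `check_field` are all hypothetical. Each cell may be consumed only once, which approximates the separating conjunction.

```python
NIL = "nil"
ANY = object()   # hypothetical stand-in for the "don't care" argument "_"

# A case is NIL or a pair of field specs; a spec is "nil", "self", "x",
# or (nonterminal, argument) with argument in {"_", "self", "nil", "x"}.
GRAMMAR = {
    "tree":  [NIL, (("tree", "_"), ("tree", "_"))],
    "dList": [NIL, ("x", ("dList", "self"))],
}

def check_field(grammar, heap, spec, val, self_addr, arg, used):
    if spec == NIL:
        return used if val is None else None
    if spec == "self":
        return used if val == self_addr else None
    if spec == "x":
        # With an unknown argument the field cannot be pinned down, so reject.
        return used if arg is not ANY and val == arg else None
    nt, a = spec
    actual = {"_": ANY, "self": self_addr, "nil": None, "x": arg}[a]
    return derive(grammar, heap, nt, val, actual, used)

def derive(grammar, heap, nt, addr, arg, used=frozenset()):
    """Return the set of consumed cells if addr satisfies nt(arg), else None."""
    if addr is None:
        return used if NIL in grammar[nt] else None
    if addr not in heap or addr in used:
        return None
    for case in grammar[nt]:
        if case == NIL:
            continue
        cur = used | {addr}
        for spec, val in zip(case, heap[addr]):
            cur = check_field(grammar, heap, spec, val, addr, arg, cur)
            if cur is None:
                break
        if cur is not None:
            return cur
    return None

tree_heap = {1: (2, 3), 2: (None, None), 3: (None, None)}
dlist_heap = {1: (None, 2), 2: (1, None)}     # fields are (prev, next)
dag_heap = {1: (2, 2), 2: (None, None)}       # both fields share cell 2

print(derive(GRAMMAR, tree_heap, "tree", 1, ANY))
print(derive(GRAMMAR, dlist_heap, "dList", 1, None))
print(derive(GRAMMAR, dag_heap, "tree", 1, ANY))
```

The shared cell in `dag_heap` is rejected because it would have to be consumed twice, which is why such a grammar describes trees rather than DAGs.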

Abstract Domain D#: Shape Graph

D# = Pf(Graph) × Grammar + {⊤}

Shape graph: each “node” is concrete (“a”), annotated with nil (“d”), or annotated with a nonterminal (“c” and “b”).

An element (S,G) in D# is called an abstract state.

[Figure: Stack: x points to a, y points to b. Heap: a is a concrete cell with fields c:tree(_) and d:nil; b:dList(a).]
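A shape graph of this kind can be held in two finite maps, one for the stack and one for the heap. The sketch below encodes the example from this slide; the class `ShapeGraph`, the tuple tags "cell", "nil" and "nt", and the helper `reachable` are illustrative assumptions rather than the authors' data structures.

```python
from dataclasses import dataclass

@dataclass
class ShapeGraph:
    stack: dict   # variable name -> node name
    heap: dict    # node name -> annotation tuple

# Annotations: ("cell", field1, field2) for a concrete cell,
# ("nil",) for a nil node, ("nt", nonterminal, argument) otherwise.
example = ShapeGraph(
    stack={"x": "a", "y": "b"},
    heap={
        "a": ("cell", "c", "d"),
        "b": ("nt", "dList", "a"),
        "c": ("nt", "tree", "_"),
        "d": ("nil",),
    },
)

def reachable(graph):
    """Nodes reachable from the stack by following the fields of concrete cells."""
    seen, work = set(), list(graph.stack.values())
    while work:
        node = work.pop()
        if node in seen:
            continue
        seen.add(node)
        annotation = graph.heap[node]
        if annotation[0] == "cell":
            work.extend(annotation[1:])
    return seen

print(sorted(reachable(example)))
```

Nodes outside the reachable set are the ones a garbage-collection step could drop.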

Normalized Abstract Domain Nk#

Idealized version of normalization:
1. Group nodes according to heap objects;
2. Compute the best grammar that describes each group;
3. Ensure that each shape graph doesn’t use more than k nodes.

Nk# (⊆ D#) consists of normalized abstract states.

The actual definition of Nk# is not algorithmic.

Example:

[Figure: a shape graph in which x points to a structure built from concrete cells a and b and nil-annotated nodes c, d and e is normalized (“normalize”) to graphs in which x points to a single annotated node (a:nil, a:tree(_)), with tree(x) = nil | ○[tree(_), tree(_)].]

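Definition of Analysis

Analysis of programs without loops:

Forward analysis: $[\![C]\!] : D^\# \to D^\#$

Case pruning: $[\![B]\!] : D^\# \to D^\#$

$$[\![\mathtt{while}\ B\ C]\!]A = \mathrm{Fix}_{A \sqsubseteq}\, F = \bigsqcup_n F^n(\mathrm{normalize}(A))$$

$$F : N_k^\# \to N_k^\#, \qquad F = \lambda A'.\ \mathrm{normalize}\big(A' \sqcup [\![B]\!]([\![C]\!]A')\big)$$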
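The loop rule is a fixpoint iteration over a finite normalized domain. The sketch below shows that iteration on a deliberately tiny stand-in domain (sets of list lengths, capped at k), not on shape graphs; `analyze_loop`, the toy `body`, `guard`, `join`, `normalize`, and the bound `K` are all hypothetical. The order of guard and body follows the formula as written on the slide.

```python
def analyze_loop(a0, body, guard, join, normalize, max_steps=1000):
    """Iterate F(A') = normalize(A' join guard(body(A'))) from normalize(a0)
    until the abstract state stops changing."""
    current = normalize(a0)
    for _ in range(max_steps):
        nxt = normalize(join(current, guard(body(current))))
        if nxt == current:
            return current
        current = nxt
    raise RuntimeError("no fixpoint reached within max_steps")

# Toy domain: a set of possible list lengths, where K stands for "K or more".
K = 2
body = lambda a: frozenset(n + 1 for n in a)          # each iteration adds one cell
guard = lambda a: a                                   # the condition (*) prunes nothing
join = lambda a, b: a | b
normalize = lambda a: frozenset(min(n, K) for n in a)

invariant = analyze_loop(frozenset({0}), body, guard, join, normalize)
print(sorted(invariant))
```

Because `normalize` maps every state into a finite set of states and each step only adds information, the iteration must stabilize, which mirrors the termination argument for Nk#.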

Doubly Linked List Construction

    h := nil;
    while (*) {
      var x;
      x := new;
      if (h = nil) {
        h := x;
      } else {
        x->next := h;
        h->prev := x;
        h := x
      }
    }
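For intuition, the program can be run concretely for a fixed number of iterations. The sketch below is an illustrative Python transliteration, assuming the nondeterministic condition (*) is replaced by a loop count n; the names `build` and `well_formed` are hypothetical.

```python
def build(n):
    """Run the construction loop n times; cells are [prev, next] lists, None is nil."""
    heap, h, counter = {}, None, 0
    for _ in range(n):
        counter += 1
        x = counter
        heap[x] = [None, None]       # x := new
        if h is None:
            h = x                    # h := x
        else:
            heap[x][1] = h           # x->next := h
            heap[h][0] = x           # h->prev := x
            h = x                    # h := x
    return h, heap

def well_formed(h, heap):
    """Check that every cell's prev field points to its predecessor in the next chain."""
    expected_prev, cur = None, h
    while cur is not None:
        prev, nxt = heap[cur]
        if prev != expected_prev:
            return False
        expected_prev, cur = cur, nxt
    return True

for n in range(4):
    h, heap = build(n)
    print(n, len(heap), well_formed(h, heap))
```

The heap grows by one cell per iteration, so no fixed set of concrete nodes can describe all reachable states; that is the gap the grammar fills.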

Inferred Loop Invariant

Inferred abstract state (i.e., shape-graph set and grammar):

    h → a:α(_)

    α(x) = nil | ○[prev: nil, next: β(self)]
    β(x) = nil | ○[prev: x, next: β(self)]

3rd Iteration Step

Abstract state A2 after the 2nd iteration:

    h → a:α(_)

    α(x) = nil | ○[prev: nil, next: β(self)]
    β(x) = nil | ○[prev: x, next: nil]

Inferred invariant A:

    A = normalize(A2 ⊔ ⟦LoopBody⟧(A2))

    h → a:α(_)

    α(x) = nil | ○[prev: nil, next: β(self)]
    β(x) = nil | ○[prev: x, next: β(self)]

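Computation of A2 ⊔ ⟦LoopBody⟧(A2)

[Figure: abstract execution of the loop body on A2, where h → a:α(_) with α(x) = nil | ○[prev: nil, next: β(self)] and β(x) = nil | ○[prev: x, next: nil].
After x := new: x → e, where e has prev f:nil and next g:nil; h → a:α(_).
if (h = nil) …, True Branch: a:nil.
if (h = nil) …, False Branch: a is unrolled into a concrete cell with prev b:nil and next c:β(a); after execution, e has prev f:nil and next a, and g:nil and b:nil are no longer referenced.]

1. Unroll α.
2. Prune cases.
3. “Execute”.
4. Join results.
5. Collect “garbage”.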
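The first two steps, unrolling a nonterminal and pruning cases, can be sketched as follows. This is a simplified illustration under assumed encodings: heaps map node names to tagged tuples, the nonterminal names "alpha" and "beta" stand for the two rules of A2, and `unroll`, `instantiate`, `fresh` and `prune` are hypothetical helpers.

```python
NIL = "nil"

# Assumed grammar for A2; fields are ordered (prev, next).
GRAMMAR = {
    "alpha": [NIL, (NIL, ("beta", "self"))],   # nil | cell[prev: nil, next: beta(self)]
    "beta":  [NIL, ("x", NIL)],                # nil | cell[prev: x,   next: nil]
}

def fresh(heap):
    i = 0
    while f"n{i}" in heap:
        i += 1
    return f"n{i}"

def instantiate(heap, spec, self_node, arg):
    """Return the node a field points to after unrolling, creating it if needed."""
    if spec == "self":
        return self_node
    if spec == "x":
        if arg == "_":
            raise ValueError("unknown parameter; not handled in this sketch")
        return arg
    node = fresh(heap)
    if spec == NIL:
        heap[node] = ("nil",)
    else:
        name, a = spec
        actual = {"_": "_", "self": self_node, "nil": NIL, "x": arg}[a]
        heap[node] = ("nt", name, actual)
    return node

def unroll(heap, grammar, node):
    """Replace a nonterminal-annotated node by each case of its rule."""
    _, name, arg = heap[node]
    results = []
    for case in grammar[name]:
        new = dict(heap)
        if case == NIL:
            new[node] = ("nil",)
        else:
            new[node] = ("cell",) + tuple(instantiate(new, s, node, arg) for s in case)
        results.append(new)
    return results

def prune(heaps, node, want_nil):
    """Keep the heaps in which node is (or is not) nil, as a test h = nil requires."""
    return [h for h in heaps if (h[node] == ("nil",)) == want_nil]

cases = unroll({"a": ("nt", "alpha", "_")}, GRAMMAR, "a")
true_branch = prune(cases, "a", True)
false_branch = prune(cases, "a", False)
print(true_branch)
print(false_branch)
```

After pruning, the false branch has a concrete cell for a, so the assignments x->next := h and h->prev := x can be executed on it directly.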

Normalization 1: Identify Heap Objects

Identify data structures, and express them by nonterminals.

[Figure, shown in four steps: the joined state contains three shape graphs rooted at h: the graph h → a:α(_) of A2, a graph with one concrete cell a (prev b:nil, next c:nil), and a graph with two concrete cells a and c (a has prev b:nil and next c; c has next d:β(c)). The concrete cells are folded one at a time into nonterminal-annotated nodes, adding rules whose single cases are ○[prev: nil, next: nil], ○[prev: x, next: N(self)] and ○[prev: nil, next: N′(self)] for nonterminals N and N′, until every graph consists of h pointing to one annotated node a.]

Normalization 2: Unify Similar Shape Graphs

Roughly, two shape graphs are similar iff they coincide except for the use of nonterminals.

[Figure: the three graphs produced by the previous step differ only in the nonterminal that annotates a, so they are unified into a single graph in which h points to one annotated node a. The rule of the unified nonterminal collects the cases of the three original rules: nil | ○[prev: nil, next: N1(self)] | ○[prev: nil, next: nil] | ○[prev: nil, next: N2(self)], for nonterminals N1 and N2.]
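The similarity test and the merge can be sketched by erasing nonterminal names before comparing graphs. The encoding below (graphs as (stack, heap) pairs, the names "alpha", "beta", "gamma", and the functions `erase`, `similar`, `unify`) is an assumption for illustration, and it compares node names literally rather than up to renaming.

```python
NIL = "nil"

def erase(graph):
    """Forget which nonterminal annotates each node, keeping only its argument."""
    stack, heap = graph
    shape = {n: (("nt", ann[2]) if ann[0] == "nt" else ann) for n, ann in heap.items()}
    return tuple(sorted(stack.items())), tuple(sorted(shape.items()))

def similar(g1, g2):
    return erase(g1) == erase(g2)

def unify(g1, g2, grammar):
    """Merge two similar graphs; where they use different nonterminals,
    introduce a combined nonterminal holding the cases of both."""
    assert similar(g1, g2)
    stack, heap1 = g1
    _, heap2 = g2
    new_grammar, merged = dict(grammar), {}
    for node, ann in heap1.items():
        other = heap2[node]
        if ann[0] == "nt" and ann[1] != other[1]:
            name = f"{ann[1]}+{other[1]}"
            cases = set(grammar[ann[1]]) | set(grammar[other[1]])
            new_grammar[name] = sorted(cases, key=repr)
            merged[node] = ("nt", name, ann[2])
        else:
            merged[node] = ann
    return (stack, merged), new_grammar

grammar = {
    "alpha": [NIL, (NIL, ("beta", "self"))],
    "beta":  [NIL, ("x", NIL)],
    "gamma": [(NIL, NIL)],
}
g1 = ({"h": "a"}, {"a": ("nt", "alpha", "_")})
g2 = ({"h": "a"}, {"a": ("nt", "gamma", "_")})
g3 = ({"h": "a"}, {"a": ("cell", "b", "c"), "b": ("nil",), "c": ("nil",)})

print(similar(g1, g2), similar(g1, g3))
merged_graph, merged_grammar = unify(g1, g2, grammar)
print(merged_graph[1]["a"], merged_grammar["alpha+gamma"])
```

The combined rule keeps every case of both inputs, so the unified graph describes at least the heaps that either original graph described.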

Normalization 3: Collect Garbage

Eliminate the definitions of unused nonterminals from the grammar.

[Figure: after unification, only the nonterminal that annotates a and the nonterminals reachable from its rule are still in use; the definitions of the remaining nonterminals are marked as not used and removed.]
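Removing unused definitions amounts to a reachability computation over the grammar, starting from the nonterminals that occur in the shape graphs. The sketch below assumes the same illustrative encoding as a dict of rules and (stack, heap) graphs; `used_nonterminals` and `collect_garbage` are hypothetical names.

```python
NIL = "nil"

def used_nonterminals(graphs, grammar):
    """Nonterminals that annotate some node, plus those their rules mention, transitively."""
    work = [ann[1] for _, heap in graphs for ann in heap.values() if ann[0] == "nt"]
    live = set()
    while work:
        nt = work.pop()
        if nt in live:
            continue
        live.add(nt)
        for case in grammar.get(nt, []):
            if case == NIL:
                continue
            for spec in case:
                if isinstance(spec, tuple):   # a field spec of the form (nonterminal, argument)
                    work.append(spec[0])
    return live

def collect_garbage(graphs, grammar):
    live = used_nonterminals(graphs, grammar)
    return {nt: cases for nt, cases in grammar.items() if nt in live}

grammar = {
    "alpha": [NIL, (NIL, ("beta", "self"))],
    "beta":  [NIL, ("x", NIL)],
    "gamma": [(NIL, NIL)],                    # no graph or live rule refers to gamma
}
graphs = [({"h": "a"}, {"a": ("nt", "alpha", "_")})]

print(sorted(collect_garbage(graphs, grammar)))
```

The rule for "beta" survives even though no node is annotated with it, because the live rule for "alpha" mentions it.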

Normalization 4: Simplify the Grammar

Regard (x) and nil as the same. Combine “same” cases and “same” definitions.

[Figure, shown in four steps: “same” cases within a rule and “same” definitions are combined in turn, until the grammar of the inferred invariant remains: h → a:α(_), with α(x) = nil | ○[prev: nil, next: β(self)] and β(x) = nil | ○[prev: x, next: β(self)].]
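One part of this step, combining “same” definitions, can be sketched as merging nonterminals whose rules become identical once the two names are identified. The code below covers only that part under an assumed encoding; it does not implement the slide's rule for treating a nonterminal and nil as the same, and `rename` and `simplify` are hypothetical names. Duplicate cases disappear because each rule is kept as a set.

```python
NIL = "nil"

def rename(case, old, new):
    """Replace every reference to nonterminal old by new inside one case."""
    if case == NIL:
        return case
    return tuple((new, s[1]) if isinstance(s, tuple) and s[0] == old else s for s in case)

def simplify(grammar):
    """Repeatedly merge two nonterminals whose case sets coincide after identifying them."""
    g = {nt: frozenset(cases) for nt, cases in grammar.items()}
    changed = True
    while changed:
        changed = False
        names = sorted(g)
        for i, a in enumerate(names):
            for b in names[i + 1:]:
                cases_a = frozenset(rename(c, b, a) for c in g[a])
                cases_b = frozenset(rename(c, b, a) for c in g[b])
                if cases_a == cases_b:
                    del g[b]
                    g = {nt: frozenset(rename(c, b, a) for c in cases)
                         for nt, cases in g.items()}
                    changed = True
                    break
            if changed:
                break
    return g

grammar = {
    "beta":  [NIL, ("x", ("delta", "self"))],
    "delta": [NIL, ("x", ("delta", "self")), NIL],   # duplicate nil case on purpose
}
print(simplify(grammar))
```

In a full implementation the shape graphs would need the same renaming, since a node annotated with the removed nonterminal must switch to the surviving one.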

Summary

1. “Execute” the loop body abstractly: ⟦LoopBody⟧A2
2. Join the old and new values: A2 ⊔ ⟦LoopBody⟧A2
3. Normalize the obtained abstract state:
   1. For each shape graph, identify heap objects and express them using nonterminals.
   2. Unify similar shape graphs.
   3. Remove the definitions of unused nonterminals.
   4. Simplify the grammar.

Correctness

The meaning of each abstract state (G,R) is given by an assertion “trans(G,R)” in separation logic.

Correctness theorem: If ⟦C⟧(G,R) = (G’,R’), then {trans(G,R)} C {trans(G’,R’)} is derivable in separation logic.

Termination: Since the domain Nk# is finite, the analysis terminates.

Conclusion

Presented an analysis that infers the loop invariants of complex pointer programs.

The key idea is to use a grammar to describe the structure of a heap object (i.e., data structure).

Future work:
1. Develop a systematic, reusable framework.
2. Handle data structures with more extensive sharing: DAGs, trees with linked leaf nodes, etc.
3. Prove properties that relate the input and output states, e.g., that SW restores link fields to their original values.

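Inferred Loop Invariant

Inferred shape-graph set and grammar:

    h → a:α(_)

    α(x) = nil | ○[prev: nil, next: β(self)]
    β(x) = nil | ○[prev: x, next: β(self)]

Representation by an assertion:

$$
\begin{aligned}
\mathbf{letrec}\ & \alpha(a,x) = (\mathsf{emp} \wedge a = \mathsf{nil}) \vee \exists b.\ (a \mapsto \mathsf{nil}, b) * \beta(b,a) \\
& \beta(a,x) = (\mathsf{emp} \wedge a = \mathsf{nil}) \vee \exists b.\ (a \mapsto x, b) * \beta(b,a) \\
\mathbf{in}\ & \exists a.\ h = a \wedge \forall x.\ \alpha(a,x)
\end{aligned}
$$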

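Abstract Domain D#

D# = Pf(Graph) × Grammar + {⊤}

Shape graph: each “node” can be concrete (“a”), annotated with nil (“d”), or annotated with a nonterminal (“c” and “b”).

[Figure: Stack: x points to a, y points to b. Heap: a is a concrete cell with fields c:tree(_) and d:nil; b:dList(a).]

Semantics by separation-logic assertions:

$$\exists a\,b\,c\,d.\ (x = a \wedge y = b) \wedge \big((\forall y.\ \mathrm{tree}(c,y)) * (a \mapsto c,d) * (d = \mathsf{nil} \wedge \mathsf{emp}) * \mathrm{dList}(b,a)\big)$$

Formal definition:

$$\mathit{Graph} = (\mathit{Var} \to_{fin} \mathit{SymL}) \times (\mathit{SymL} \to_{fin} \mathit{Val})$$

$$\mathit{Val} = \{\mathsf{nil},\ \langle a,b\rangle,\ \alpha(a),\ \alpha() \mid a,b \in \mathit{SymL},\ \alpha \in \mathit{NonTerm}\}$$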

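Grammar

A grammar R is a set of rules of the following form:

    α(x) = nil | ○[V1, V2] | …

where V1, V2 ∈ {nil, self, x, α(_), α(self), α(nil)}.

Examples:

    tree(x) = nil | ○[tree(_), tree(_)]
    dList(x) = nil | ○[x, dList(self)]

Semantics by separation-logic assertions:

$$\mathrm{tree}(c,x) = (c = \mathsf{nil} \wedge \mathsf{emp}) \vee \exists l\,r.\ (c \mapsto l,r) * (\forall y.\ \mathrm{tree}(l,y)) * (\forall y.\ \mathrm{tree}(r,y))$$

$$\mathrm{dList}(c,p) = (c = \mathsf{nil} \wedge \mathsf{emp}) \vee \exists n.\ (c \mapsto p,n) * \mathrm{dList}(n,c)$$

Formal definition:

$$\mathit{Grammar} = \mathit{NonTerm} \to_{fin} \mathcal{P}_{nf}(\{\mathsf{nil}\} + \mathit{Case} \times \mathit{Case})$$

$$\mathit{Case} = \{\mathsf{nil},\ \mathsf{self},\ \mathsf{arg},\ \alpha(\_),\ \alpha(\mathsf{arg}),\ \alpha(\mathsf{self}) \mid \alpha \in \mathit{NonTerm}\}$$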

Normalized Abstract Domain N#

N# consists of abstract states (G,R) in D# s.t.
1. All “data structures” are expressed by nonterminals.
2. All “similar” shape graphs and rules are merged.

[Figure: example shape graphs rooted at x and y, before and after concrete cells are folded into nonterminal-annotated nodes and similar rules are merged.]
