Scalable Contract Checking for Systems Software using SMT solvers


Shaz Qadeer, RiSE, Microsoft Research

Joint work with Jeremy Condit and Shuvendu Lahiri

http://research.microsoft.com/en-us/projects/havoc/

Context: Scalable module verification

Target: OS components (kernel, drivers, file systems)
– ~100 KLOC with >1000 procedures

Module
– A set of public/entry procedures
– A set of private/internal procedures

Specs
– Interface specification
  • Specs for public methods
  • Specs for external modules
– Property assertion

Harness

    Initialize(..);
    while (*) {
        choice = nondet();
        if (choice == 1) {
            [assume pre_1]
            call Public_1(…);
        } else if (choice == 2) {
            [assume pre_2]
            call Public_2(…);
        } …
    }
    Cleanup(…);

Desirable goals

• Find bugs
  – Violations of property assertions
  – Low false alarms

• Use contracts
  – Modular checking for scalability
  – Readable contracts are formal documentation

• Reduce testing cost by providing high assurance in the verifier
  – Formal documentation of assumptions
  – Simple meta-theory for proofs

Existing methods on these examples

Large difference between theory and practice

Imprecise
– Modeling of lists/arrays

Unsound
– Modeling of lists/arrays
– Aliasing, pointer arithmetic
– Restricted harness

Complex “proof” calculus
– Combination of analyses

Full functional correctness is not a goal

Neither is minimizing the trusted computing base

Proof method: Floyd-Hoare Triple

• Floyd-Hoare triple {P} S {Q}
  – P, Q: predicates/properties
  – S: a program

• From a state satisfying P, if S executes,
  – No assertion in S fails, and
  – Terminating executions end in a state satisfying Q

Program verification → Formula

{ b.f = 5 } a.f = 5 { a.f + b.f = 10 } is valid

iff

Select(f1,b) = 5 ∧ f2 = Store(f1,a,5) ⇒ Select(f2,a) + Select(f2,b) = 10 is valid

– theory of equality: =
– theory of arithmetic: 5, 10, +
– theory of arrays: Select, Store

• [Nelson & Oppen ’79]
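The translation from the triple to a Select/Store formula can be sanity-checked by brute force over a tiny domain. The sketch below is only an illustration: the names `FieldMap`, `select_f`, `store_f` and `vc_holds`, and the domain sizes, are assumptions and not part of HAVOC. An SMT solver decides validity for all values, whereas this loop only samples a small finite heap.

```c
#include <stdbool.h>

#define NUM_OBJECTS 3   /* hypothetical tiny heap: objects 0..2 */
#define MAX_VALUE   6   /* field values range over 0..MAX_VALUE-1 */

/* The field f is modeled as a map from objects to values. */
typedef struct { int at[NUM_OBJECTS]; } FieldMap;

/* Select(f, obj): read the field of obj. */
static int select_f(FieldMap f, int obj) { return f.at[obj]; }

/* Store(f, obj, value): a new map, equal to f except at obj. */
static FieldMap store_f(FieldMap f, int obj, int value) {
    f.at[obj] = value;   /* f is a by-value copy, so the caller's map is untouched */
    return f;
}

/* Checks, for every f1, a, b in the small domain:
 *   Select(f1,b) = pre_value  and  f2 = Store(f1,a,5)
 *   implies  Select(f2,a) + Select(f2,b) = 10                      */
bool vc_holds(int pre_value) {
    FieldMap f1;
    for (f1.at[0] = 0; f1.at[0] < MAX_VALUE; f1.at[0]++)
    for (f1.at[1] = 0; f1.at[1] < MAX_VALUE; f1.at[1]++)
    for (f1.at[2] = 0; f1.at[2] < MAX_VALUE; f1.at[2]++)
        for (int a = 0; a < NUM_OBJECTS; a++)
            for (int b = 0; b < NUM_OBJECTS; b++) {
                if (select_f(f1, b) != pre_value) continue;  /* precondition fails */
                FieldMap f2 = store_f(f1, a, 5);
                if (select_f(f2, a) + select_f(f2, b) != 10) return false;
            }
    return true;
}
```

Calling `vc_holds(5)` corresponds to the triple on the slide, including the aliased case a = b. Weakening the precondition to `b.f = 4` (that is, `vc_holds(4)`) produces a counterexample whenever a and b are distinct objects.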

Satisfiability-Modulo-Theory (SMT)

• Boolean satisfiability solving + theory reasoning
• Ground theories
  – Equality, arithmetic, Select/Store
• NP-complete logics
• Powerful methods to combine decision procedures for theories
  – [Nelson & Oppen ’79]
• Phenomenal progress in the past few years
  – Yices, Z3, MathSAT, …

Simple type-state property

• Allocation type-state of DEV_OBJ
  – Device objects (DEV_OBJ) allocated and freed

• Property to check for a module
  – IoDeleteDevice() only called on elements in MyDevObj

[Figure: type-state diagram with states ~MyDevObj and MyDevObj and transitions labeled IoCreateDevice() and IoDeleteDevice()]

Simple property simple invariants

typedef struct _DEV_OBJ {
    DEV_EXT *DevExt;
    …
} DEV_OBJ;

typedef struct _DEV_EXT {
    DEV_OBJ *Self;
    …
} DEV_EXT;

requires (do ∈ MyDevObj)
NT_STATUS PnP(DEV_OBJ *do, IRP *pirp) {
    PDEV_EXT data = do->DevExt;
    ….
    switch (pirp->MajorFn) {
    case IRP_MN_REMOVE_DEVICE:
        IoDeleteDevice(data->Self);
        …
    }
}

[Figure: do points to a DEV_OBJ; its DevExt field points to a DEV_EXT, whose Self field points back to the DEV_OBJ]

∀x ∈ MyDevObj. x->DevExt->Self = x

Simple property simple invariants

NT_STATUS Unload(…) {
    ….
    iter = hd->First;
    while (iter != null) {
        RemoveEntryList(iter);
        iter = iter->Next;
        IoDeleteDevice(iter->Self);
    }
    ….
}

[Figure: hd->First points to a list of DEV_EXT nodes linked by Next; each DEV_EXT has a Self pointer to a DEV_OBJ, whose DevExt pointer points back to it]

∀x ∈ Btwn(Next, hd->First, NULL). x ∈ DevExt

∀x ∈ DevExt. x->Self->DevExt = x

∀x ∈ DevExt. x->Self ∈ MyDevObj

Limitations of SMT solvers

• No support for precise reasoning with reachability predicate
  – Incompleteness in Floyd-Hoare proofs for straight-line code
• Brittle support for quantifiers
  – Complexity: NP-complete (ground) → undecidable
  – Leads to unpredictable behavior of verifiers
    • Proof times, proof success rate

Limitations of SMT solvers

• Answer the query {P} S {Q} for loop-free and call-free programs

• To handle loops and procedures, contracts are needed
  – Loop invariants
  – Pre/post-conditions

• Infeasible to manually supply internal contracts for large modules

Contributions

• Efficient decision procedure for verifying list-based programs

• Verifying and exploiting C type annotations

• Annotation inference for large modules

Reachability predicate: Btwn_f

[Figure: a chain of cells with fields next, f and g, linked by next; x and y mark two cells and the segment between them is labeled Btwn(next,x,y)]

• Express properties of collections
  ∀x ∈ Btwn(next, next(hd), hd). state(x) = LOCKED    // cyclic

• Arithmetic reasoning on data (e.g. sortedness)
  ∀x ∈ Btwn(next, hd, null) \ {null}. ∀y ∈ Btwn(next, x, null) \ {null}. d(x) ≤ d(y)

Expressive logic
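To make the reachability predicate concrete, here is a minimal executable reading of Btwn on a singly linked list. It is a sketch under assumptions: the `Node` type, the function names and the step bound are hypothetical, and the endpoints are taken to be inclusive, as the `\ {null}` in the sortedness formula suggests. The decision procedure reasons about Btwn symbolically; this code only evaluates it on one concrete heap.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct Node {
    int d;                 /* data field, as in d(x) */
    struct Node *next;
} Node;

#define MAX_STEPS 1000000u /* guard against cycles that never reach y */

/* True iff z is in Btwn(next, x, y): z lies on the next-path that starts
 * at x and ends at y, endpoints included. Returns false if y is never
 * reached from x. */
bool in_btwn(const Node *x, const Node *y, const Node *z) {
    const Node *cur = x;
    bool seen_z = false;
    for (unsigned steps = 0; steps < MAX_STEPS; steps++) {
        if (cur == z) seen_z = true;
        if (cur == y) return seen_z;    /* reached the end of the segment */
        if (cur == NULL) return false;  /* fell off the list before y */
        cur = cur->next;
    }
    return false;
}

/* Sortedness: for every x in Btwn(next, hd, NULL) \ {NULL} and every y in
 * Btwn(next, x, NULL) \ {NULL}, d(x) <= d(y). Assumes an acyclic list. */
bool is_sorted(const Node *hd) {
    for (const Node *x = hd; x != NULL; x = x->next)
        for (const Node *y = x; y != NULL; y = y->next)
            if (x->d > y->d) return false;
    return true;
}
```

For a three-cell list a → b → c → NULL, `in_btwn(&a, &b, &c)` is false because c lies beyond the segment that ends at b, while `in_btwn(&a, NULL, &b)` is true.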

Efficient decision procedure

• Decides the validity of {P} S {Q}
  – Worst-case exponential time but works well in practice
• Decision problem is NP-complete
  – Cannot expect any better with propositional logic
  – Retains the complexity of current SMT logics
• Implemented in the Z3 SMT solver
  – Leverages powerful ground-theory reasoning (arithmetic, arrays, uninterpreted functions…)

Contributions

• Efficient decision procedure for verifying list-based programs

• Verifying and exploiting C type annotations

• Annotation inference for large modules

C language

C types
– Scalars (int, long, char, short)
– Pointers (int*, struct T*, ..)
– Nested structs and unions
– Array (struct T a[10];)
– Function pointers
– Void *

Difficult to establish type safety in presence of pointer arithmetic, casts
– Type safety ⇒ (spatial) memory safety
– Important default property to check

Lack of types hurts property checking
– Difficult to disambiguate heap pointers
– Difficult to write concise type invariants

Example: Type Checking

[Figure: two IRP structures linked through the Flink and Blink fields of their embedded ListEntry; p points to a ListEntry and q to an IRP]

q = CONTAINING_RECORD(p, IRP, ListEntry)
  = (IRP*)((char*)p - (size_t)&((IRP*)0)->ListEntry)

Type Checker: Does variable q have type IRP*?
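The pointer arithmetic behind CONTAINING_RECORD can be reproduced in portable C with `offsetof`. The following is a minimal sketch: the simplified `LIST_ENTRY` and `IRP` layouts and the names `CONTAINING_RECORD_SKETCH` and `irp_from_entry` are assumptions made for illustration, not the Windows definitions.

```c
#include <stddef.h>

/* Simplified stand-ins for the Windows structures (hypothetical layout). */
typedef struct LIST_ENTRY {
    struct LIST_ENTRY *Flink;
    struct LIST_ENTRY *Blink;
} LIST_ENTRY;

typedef struct IRP {
    int Data1;
    int Data2;
    LIST_ENTRY ListEntry;   /* list node embedded inside the record */
} IRP;

/* Recover a pointer to the enclosing record from a pointer to one of its
 * fields by subtracting the field's byte offset. */
#define CONTAINING_RECORD_SKETCH(address, type, field) \
    ((type *)((char *)(address) - offsetof(type, field)))

/* Given p pointing at the ListEntry field of some IRP, return that IRP. */
IRP *irp_from_entry(LIST_ENTRY *p) {
    return CONTAINING_RECORD_SKETCH(p, IRP, ListEntry);
}
```

The result is a valid IRP* only if p really points at the ListEntry field of an IRP, which is exactly the fact the type checker has to establish.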

Example: Property Checking

...
q->Data2 = 42;

Property Checker: Is r->Data1 unchanged?

[Figure: q and r each point to an IRP with fields ListEntry (Flink, Blink), Data1 and Data2]

Example: Property Checking

[Figure: the structures pointed to by q and r, with fields ListEntry (Flink, Blink), Data1 and Data2; one field is labeled "Data2 / Data1"]

For all we know, Data1 and Data2 could be aliased!

Our Approach

• Implement a type checker in HAVOC
  – Provide formal semantics for C and its types

• Use types to improve the property checker
  – Provide Java-style field disambiguation

• Fully automated using Z3 SMT solver

Formalizing Type Safety

A C program is type safe if the run-time value of every variable and heap location corresponds to its compile-time type.

Mem : addr -> value
Type : addr -> type
HasType : value x type -> bool

for all a in addr, HasType(Mem(a), Type(a))
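The three maps can be illustrated with a toy heap. This is a sketch under strong assumptions: only two types exist, addresses are small integers with 0 standing for NULL, and the names `Heap`, `Ty`, `has_type` and `type_safe` are hypothetical. HAVOC's actual encoding covers the full C type system.

```c
#include <stdbool.h>
#include <stddef.h>

#define HEAP_SIZE 8   /* addresses 1..HEAP_SIZE-1; address 0 models NULL */

typedef enum { TY_INT, TY_PTR_TO_INT } Ty;

typedef struct {
    long mem[HEAP_SIZE];   /* Mem  : addr -> value */
    Ty   type[HEAP_SIZE];  /* Type : addr -> type  */
} Heap;

/* HasType : value x type -> bool */
bool has_type(const Heap *h, long v, Ty t) {
    switch (t) {
    case TY_INT:
        return true;   /* every value is a valid int in this toy model */
    case TY_PTR_TO_INT:
        /* NULL, or an in-range address whose cell is typed int */
        return v == 0 || (v > 0 && v < HEAP_SIZE && h->type[v] == TY_INT);
    }
    return false;
}

/* for all a in addr, HasType(Mem(a), Type(a)) */
bool type_safe(const Heap *h) {
    for (size_t a = 1; a < HEAP_SIZE; a++)
        if (!has_type(h, h->mem[a], h->type[a])) return false;
    return true;
}
```

A cell typed `TY_PTR_TO_INT` that holds the address of an int cell keeps the heap type safe; overwriting it with an out-of-range address, or with the address of another pointer cell, makes `type_safe` return false.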

Example

#define ENCL(x) CONTAINING_RECORD(x, record, node)

requires (HasType(ENCL(p), record*) && ENCL(p) != NULL)
void init_record(list *p) {
    record *r = CONTAINING_RECORD(p, record, node);
    r->data2 = 42;
}

requires (forall(q, Btwn(next, p, NULL), q != NULL ==>
          HasType(ENCL(q), record*) && ENCL(q) != NULL))
void init_all_records(list *p) {
    while (p != NULL) {
        init_record(p);
        p = p->next;
    }
}

Decision Procedure

• Translation results in verification conditions that refer to Mem, Type, and HasType

• Can be encoded into an NP-complete logic
  – No worse than SAT solving
  – Provide decision procedure using an SMT solver

Experiments

• Implementation supports full C language
  – Supports polymorphism
  – Supports user-defined, dependent types

• Fancier type invariants => slower checking
  – Pay only for what you use!

• Annotated and checked four Windows drivers
  – Sample drivers provided with Windows DDK
  – About 2.3 KLOC total, with 225 annotations
  – Checking time: ~1 minute each

Contributions

• Efficient decision procedure for verifying list-based programs

• Verifying and exploiting C type annotations

• Annotation inference for large modules

Need to simplify the problem

Module
– A set of public/entry procedures
– A set of private/internal procedures

Specs
– Interface specification
– Property assertion

Require the user to provide a module invariant

Harness

    Initialize(..);
    [loop_inv moduleInv]
    while (*) {
        choice = nondet();
        if (choice == 1) {
            [assume pre_1]
            call Public_1(…);
        } else if (choice == 2) {
            [assume pre_2]
            call Public_2(…);
        } …
    }
    Cleanup(…);
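The harness with a module invariant can be mimicked by an ordinary test driver. The sketch below is illustrative only: the toy module, its invariant and every name in it are assumptions, and a pseudo-random generator stands in for `nondet()`. Unlike the verifier, which covers all choices, this driver samples a single execution per seed.

```c
#include <stdbool.h>

#define MAX_DEVICES 4

/* Toy module state (hypothetical). */
static bool allocated[MAX_DEVICES];
static int  live_count;

/* Module invariant: live_count equals the number of allocated slots. */
static bool module_inv(void) {
    int n = 0;
    for (int i = 0; i < MAX_DEVICES; i++) n += allocated[i] ? 1 : 0;
    return n == live_count;
}

static void initialize(void) {
    for (int i = 0; i < MAX_DEVICES; i++) allocated[i] = false;
    live_count = 0;
}

/* Public_1, with pre_1: slot i is free. */
static void public_create(int i) { allocated[i] = true; live_count++; }

/* Public_2, with pre_2: slot i is allocated. */
static void public_delete(int i) { allocated[i] = false; live_count--; }

static void cleanup(void) {
    for (int i = 0; i < MAX_DEVICES; i++)
        if (allocated[i]) public_delete(i);
}

/* Deterministic stand-in for nondet(): a small linear congruential generator. */
static unsigned next_choice(unsigned *state) {
    *state = *state * 1103515245u + 12345u;
    return (*state >> 16) & 0x7fffu;
}

/* Returns true iff the module invariant held at every loop head and at exit. */
bool run_harness(unsigned seed, int iterations) {
    unsigned rng = seed;
    initialize();
    for (int k = 0; k < iterations; k++) {
        if (!module_inv()) return false;                /* [loop_inv moduleInv] */
        int slot = (int)(next_choice(&rng) % MAX_DEVICES);
        if (next_choice(&rng) % 2u == 0u) {
            if (!allocated[slot]) public_create(slot);  /* [assume pre_1] */
        } else {
            if (allocated[slot]) public_delete(slot);   /* [assume pre_2] */
        }
    }
    if (!module_inv()) return false;
    cleanup();
    return module_inv();
}
```

Guarding each call with its precondition plays the role of `assume`: executions that would violate pre_1 or pre_2 are simply not explored.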

Module invariants

• Module invariants
  – Invariant about all objects of a given type
  – Invariants on global variables

• Preserved by the public functions
• Low overhead
  – On “steady state” and therefore succinct
  – Only needed to be written at module level

Intra-module inference

• Given module M, interface specs, property and module invariants

• Infer annotations on internal procedures and loops

• Use annotations to verify property and module invariant

• Challenges
  – Module invariants are temporarily broken
  – Inference has to be scalable

Module invariant broken

requires (TypeInvDO)
ensures (TypeInvDO)
void publicFoo () {
    PDEV_OBJ do = NewDEV_OBJ();
    privateBar(do);
}

requires (TypeInvDOExcept(do))
requires (TypeInvDO)
ensures (TypeInvDO)
void privateBar (PDEV_OBJ do) {
    do->DevExt->Self = do;
}

[Figure: x points to a DEV_OBJ; its DevExt field points to a DEV_EXT, whose Self field points back]

#define TypeInvDO ∀x ∈ MyDevObj. x->DevExt->Self = x
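The way the invariant is broken between allocation and the call to the private procedure can be replayed on a concrete heap. This is a minimal sketch with hypothetical names (`DevObj`, `DevExt`, `type_inv_except`, `new_dev_obj`, `private_bar`), and it assumes that a freshly allocated object has an extension whose back pointer is not yet set.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct DevExt DevExt;
typedef struct DevObj { DevExt *dev_ext; } DevObj;
struct DevExt { DevObj *self; };

#define MAX_OBJS 4
static DevObj objs[MAX_OBJS];
static DevExt exts[MAX_OBJS];
static size_t num_objs;   /* objs[0..num_objs) models the set MyDevObj */

/* TypeInvDOExcept(skip): x->DevExt->Self == x for every x in MyDevObj
 * other than skip. Passing NULL gives the plain invariant TypeInvDO. */
bool type_inv_except(const DevObj *skip) {
    for (size_t i = 0; i < num_objs; i++) {
        const DevObj *x = &objs[i];
        if (x == skip) continue;
        if (x->dev_ext == NULL || x->dev_ext->self != x) return false;
    }
    return true;
}

/* Models NewDEV_OBJ(); assumes num_objs < MAX_OBJS. */
DevObj *new_dev_obj(void) {
    DevObj *d = &objs[num_objs];
    d->dev_ext = &exts[num_objs];
    d->dev_ext->self = NULL;   /* back pointer not set yet: invariant broken for d */
    num_objs++;
    return d;
}

/* Models privateBar: restores the invariant for d. */
void private_bar(DevObj *d) { d->dev_ext->self = d; }
```

Right after `new_dev_obj()` the plain invariant fails while the version that exempts the new object holds, which is why the private procedure needs the weaker precondition.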

Houdini algorithm (Flanagan-Leino 01)

• Problem statement
  – Given a set of procedures P1, …, Pn
  – A set C of candidate annotations for each procedure
  – Returns a subset of the candidate annotations such that each procedure satisfies its annotations
  – Also known as “monomial predicate abstraction”

• Algorithm
  – Computes a greatest fixed point starting from all annotations
    • Remove annotations that are violated
  – Requires a quadratic (n * |C|) number of theorem prover calls
  – Uses a modular checker
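The fixed-point loop can be sketched in a few lines. In the sketch below the modular checker is replaced by a toy oracle: a candidate is provable only if it is locally sound and every candidate it depends on is still assumed. The tables `locally_sound` and `depends_on` and all function names are assumptions made for illustration; a real implementation would ask the theorem prover instead.

```c
#include <stdbool.h>

#define NUM_CANDIDATES 5

/* Toy stand-in for the modular checker (hypothetical data). */
static const bool locally_sound[NUM_CANDIDATES] = { true, false, true, true, true };
static const unsigned depends_on[NUM_CANDIDATES] = {
    0u,        /* c0 needs nothing   */
    0u,        /* c1 is simply false */
    1u << 1,   /* c2 needs c1        */
    1u << 0,   /* c3 needs c0        */
    1u << 2    /* c4 needs c2        */
};

/* Does candidate i hold when the candidates in the bit set `assumed`
 * may be used as assumptions? */
static bool candidate_holds(int i, unsigned assumed) {
    return locally_sound[i] && (depends_on[i] & ~assumed) == 0u;
}

/* Greatest fixed point: start from all candidates and keep removing the
 * violated ones until nothing changes. Returns the surviving set as a
 * bit mask (bit i set means candidate i is kept). */
unsigned houdini(void) {
    unsigned assumed = (1u << NUM_CANDIDATES) - 1u;
    bool changed = true;
    while (changed) {
        changed = false;
        for (int i = 0; i < NUM_CANDIDATES; i++) {
            if ((assumed & (1u << i)) != 0u && !candidate_holds(i, assumed)) {
                assumed &= ~(1u << i);   /* drop the violated candidate */
                changed = true;
            }
        }
    }
    return assumed;
}
```

Removing c1 invalidates c2, which in turn invalidates c4, so only c0 and c3 survive. Each pass either removes at least one candidate or terminates, which is what bounds the number of checker calls.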

Candidate assertions

• Candidate assertions
  – Type-states in module invariants
    • Over parameters, globals, locals and their fields
  – Module invariant exceptions (next slide)
  – Conditional annotations
    • Disjunction of above annotations

Module invariant exceptions

TypeInvDOExcept({do,de->Self},requires)
TypeInvDOExcept({do,de->Self},ensures)
void privateBar (PDEV_OBJ do, PDEV_EXT de) {
    …
    do->DevExt->Self = do;
}

#define TypeInvDO ∀x ∈ MyDevObj. x->DevExt->Self = x

#define TypeInvDOExcept({a,b},ANNOT) \
    ANNOT(∀x ∈ MyDevObj. x = a ∨ x = b ∨ x->DevExt->Self = x) \
    ANNOT(a->DevExt->Self = a) \
    ANNOT(b->DevExt->Self = b)

Exceptions come from parameters, return, globals, fields

Observations

• Able to synthesize most intermediate invariants
  – “close” to the module invariant (simple)
  – readable

• Invariants contain quantifiers, Boolean structure
  – Checking all Boolean combinations expensive (from NP-complete → PSPACE-complete [CADE’09])
• Retains scalability of the Houdini inference

Experiments

• Benchmarks
  – 4 device drivers (~7KLOC each), contains lists, arrays
    • #Internal methods: ~30
    • #loops: ~20

• Properties
  – double-free, lock-usage

• User provides module invariant
  – Tool infers intermediate invariants and modifies clauses

Results

• Verified the properties with 0 false alarms
• Module invariant overhead
  – Number of module invariants ~5-10
  – Reused across multiple drivers

• Most internal annotations inferred
  – Approx 1500 inferred annotations per driver
  – Less than 5 manual annotations per driver
    • Mostly conditional annotations (e.g. predicated on return value)
  – Inference time < 5X of the checking time

Contributions

• Efficient decision procedure for verifying list-based programs

• Verifying and exploiting C type annotations

• Annotation inference for large modules

Questions?
