Scalable Contract Checking for Systems Software using SMT solvers


Shaz Qadeer, RiSE, Microsoft Research

Joint work with Jeremy Condit and Shuvendu Lahiri

http://research.microsoft.com/en-us/projects/havoc/

Context: Scalable module verification

Target: OS components (kernel, drivers, file systems)
– ~100 KLOC with >1000 procedures

Module
– A set of public/entry procedures
– A set of private/internal procedures

Specs
– Interface specification
  • Specs for public methods
  • Specs for external modules
– Property assertion

Harness

    Initialize(..);
    while (*) {
        choice = nondet();
        if (choice == 1) {
            [assume pre_1]
            call Public_1(…);
        } else if (choice == 2) {
            [assume pre_2]
            call Public_2(…);
        } …
    }
    Cleanup(…);
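The harness shape can be sketched in plain C. This is only an illustration under stated assumptions: the lock module, `run_harness`, and the `choices` array (standing in for `nondet()`) are hypothetical, and `assume pre_i` is modeled by skipping a call whose precondition does not hold.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy module: a lock with one piece of private state. */
static bool locked;

void Initialize(void) { locked = false; }
void Cleanup(void)    { locked = false; }

/* Public_1 acquires the lock; pre_1 says the lock must be free. */
bool pre_1(void)    { return !locked; }
void Public_1(void) { assert(!locked); locked = true; }

/* Public_2 releases the lock; pre_2 says the lock must be held. */
bool pre_2(void)    { return locked; }
void Public_2(void) { assert(locked); locked = false; }

/* Drives the module with a fixed choice sequence in place of nondet().
   A choice whose precondition fails is skipped, which models "assume".
   Returns the number of public calls actually made. */
size_t run_harness(const int *choices, size_t n) {
    size_t calls = 0;
    Initialize();
    for (size_t i = 0; i < n; i++) {
        if (choices[i] == 1) {
            if (!pre_1()) continue;
            Public_1();
            calls++;
        } else if (choices[i] == 2) {
            if (!pre_2()) continue;
            Public_2();
            calls++;
        }
    }
    Cleanup();
    return calls;
}
```

A verifier explores every choice sequence symbolically; this sketch only replays one concrete sequence at a time.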

Desirable goals

• Find bugs
– Violations of property assertions
– Low false alarms

• Use contracts
– Modular checking for scalability
– Readable contracts are formal documentation

• Reduce testing cost by providing high assurance in the verifier
– Formal documentation of assumptions
– Simple meta-theory for proofs

Existing methods on these examples

Large difference between theory and practice

Imprecise
– Modeling of lists/arrays

Unsound
– Modeling of lists/arrays
– Aliasing, pointer arithmetic
– Restricted harness

Complex “proof” calculus
– Combination of analyses


Full functional correctness is not a goal

Neither is minimizing the trusted computing base

Proof method: Floyd-Hoare Triple

• Floyd-Hoare triple {P} S {Q}

P, Q : predicates/properties
S : a program

• From a state satisfying P, if S executes,
– No assertion in S fails, and
– Terminating executions end in a state satisfying Q

Program verification → Formula

{ b.f = 5 } a.f = 5 { a.f + b.f = 10 } is valid

iff

Select(f1,b) = 5 ∧ f2 = Store(f1,a,5) ⇒ Select(f2,a) + Select(f2,b) = 10 is valid

theory of equality: =
theory of arithmetic: 5, 10, +
theory of arrays: Select, Store

• [Nelson & Oppen ’79]
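To make the Select/Store encoding concrete, the following C sketch checks the formula by brute force over every small heap. It is a bounded enumeration, not an SMT proof: the bounds `NOBJ` and `NVAL`, the `Field` struct, and the function `vc_holds_on_small_heaps` are assumptions introduced for illustration.

```c
#include <assert.h>
#include <stdbool.h>

enum { NOBJ = 3, NVAL = 10 };   /* hypothetical bounds on objects and values */

/* A field map holds one value per object, as in the array encoding of f. */
typedef struct { int at[NOBJ]; } Field;

int Select(Field f, int obj) {
    assert(0 <= obj && obj < NOBJ);
    return f.at[obj];
}

/* Store returns an updated copy; the original map is left untouched. */
Field Store(Field f, int obj, int val) {
    assert(0 <= obj && obj < NOBJ);
    f.at[obj] = val;
    return f;
}

/* Checks Select(f1,b) = 5 and f2 = Store(f1,a,5) implies
   Select(f2,a) + Select(f2,b) = 10, for every f1, a, b within the bounds. */
bool vc_holds_on_small_heaps(void) {
    for (int v0 = 0; v0 < NVAL; v0++)
    for (int v1 = 0; v1 < NVAL; v1++)
    for (int v2 = 0; v2 < NVAL; v2++)
    for (int a = 0; a < NOBJ; a++)
    for (int b = 0; b < NOBJ; b++) {
        Field f1 = {{v0, v1, v2}};
        if (Select(f1, b) != 5) continue;      /* precondition fails: skip */
        Field f2 = Store(f1, a, 5);
        if (Select(f2, a) + Select(f2, b) != 10) return false;
    }
    return true;
}
```

The check covers the aliased case a = b as well, which is where a naive per-variable model of fields would go wrong. An SMT solver reaches the same conclusion for all heaps at once by reasoning in the combined theories.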

Satisfiability-Modulo-Theory (SMT)

• Boolean satisfiability solving + theory reasoning
• Ground theories
– Equality, arithmetic, Select/Store
• NP-complete logics
• Powerful methods to combine decision procedures for theories
– [Nelson & Oppen ’79]
• Phenomenal progress in the past few years
– Yices, Z3, MathSAT, …

Simple type-state property

• Allocation type-state of DEV_OBJ
– Device Objects (DEV_OBJ) allocated and freed

• Property to check for a module
– IoDeleteDevice() only called on elements in MyDevObj

[Figure: type-state diagram with states ~MyDevObj and MyDevObj; IoCreateDevice() moves an object into MyDevObj, IoDeleteDevice() moves it back to ~MyDevObj]
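The type-state property can be read as a run-time monitor over a ghost set. The sketch below is a minimal illustration, assuming a fixed-capacity set; `DevObj`, `monitor_create`, `monitor_delete`, and `in_my_dev_obj` are hypothetical names, not part of the Windows driver API or of HAVOC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { MAX_DEV = 8 };                        /* hypothetical capacity */
typedef struct DevObj { int id; } DevObj;    /* stand-in for DEV_OBJ */

/* Ghost set MyDevObj: objects created by this module and not yet deleted.
   Empty slots hold NULL. */
static const DevObj *my_dev_obj[MAX_DEV];

static int find(const DevObj *d) {
    for (int i = 0; i < MAX_DEV; i++)
        if (my_dev_obj[i] == d) return i;
    return -1;
}

bool in_my_dev_obj(const DevObj *d) { return d != NULL && find(d) >= 0; }

/* Models the IoCreateDevice() transition: ~MyDevObj to MyDevObj.
   Returns false if d is already in the set or the set is full. */
bool monitor_create(const DevObj *d) {
    assert(d != NULL);
    if (find(d) >= 0) return false;
    int slot = find(NULL);
    if (slot < 0) return false;
    my_dev_obj[slot] = d;
    return true;
}

/* Models the IoDeleteDevice() transition: MyDevObj to ~MyDevObj.
   Returns false when the property is violated (d is not in MyDevObj). */
bool monitor_delete(const DevObj *d) {
    int i = (d != NULL) ? find(d) : -1;
    if (i < 0) return false;
    my_dev_obj[i] = NULL;
    return true;
}
```

A static checker proves that `monitor_delete` can never return false on any execution, rather than observing it on one run.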

Simple property simple invariants

typedef struct _DEV_OBJ {
    DEV_EXT *DevExt;
    …
} DEV_OBJ;

typedef struct _DEV_EXT {
    DEV_OBJ *Self;
    …
} DEV_EXT;

requires (do ∈ MyDevObj)
NT_STATUS PnP(DEV_OBJ *do, IRP *pirp) {
    PDEV_EXT data = do->DevExt;
    ….
    switch (pirp->MajorFn) {
    case IRP_MN_REMOVE_DEVICE:
        IoDeleteDevice(data->Self);

∀x ∈ MyDevObj. x->DevExt->Self = x

[Figure: a DEV_OBJ linked to its DEV_EXT through the DevExt field]

Simple property simple invariants

NT_STATUS Unload(…) {
    ….
    iter = hd->First;
    while (iter != null) {
        RemoveEntryList(iter);
        iter = iter->Next;
        IoDeleteDevice(iter->Self);

∀x ∈ Btwn(Next, hd->First, NULL). x ∈ DevExt

∀x ∈ DevExt. x->Self->DevExt = x

∀x ∈ DevExt. x->Self ∈ MyDevObj

[Figure: a list of DEV_EXT records, each linked to a DEV_OBJ through the DevExt field]

Limitations of SMT solvers

• No support for precise reasoning with reachability predicate
– Incompleteness in Floyd-Hoare proofs for straight-line code
• Brittle support for quantifiers
– Complexity: NP-complete (ground) → undecidable
– Leads to unpredictable behavior of verifiers
  • Proof times, proof success rate

Limitations of SMT solvers

• Answer the query {P} S {Q} for loop-free and call-free programs

• To handle loops and procedures, contracts are needed
– Loop invariants
– Pre/post-conditions

• Infeasible to manually supply internal contracts for large modules

Contributions

• Efficient decision procedure for verifying list-based programs

• Verifying and exploiting C type annotations

• Annotation inference for large modules

Reachability predicate: Btwn_f

[Figure: Btwn(next, x, y)]

• Express properties of collections
  ∀x ∈ Btwn(next, next(hd), hd). state(x) = LOCKED   // cyclic
• Arithmetic reasoning on data (e.g. sortedness)
  ∀x ∈ Btwn(next, hd, null) \ {null}. ∀y ∈ Btwn(next, x, null) \ {null}. d(x) ≤ d(y)
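One way to get an intuition for Btwn is to evaluate it on a concrete list. The sketch below assumes a particular reading: Btwn(next, x, y) is the set of cells visited when walking `next` from x up to and including y, and it is empty if NULL is reached before y. The `Node` type, `in_btwn`, `is_sorted`, and the `max_steps` bound are hypothetical and not taken from HAVOC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct Node { int d; struct Node *next; } Node;

/* Is z a member of Btwn(next, x, y) under the reading described above?
   max_steps bounds the walk so a cycle that never reaches y terminates. */
bool in_btwn(const Node *x, const Node *y, const Node *z, size_t max_steps) {
    assert(max_steps > 0);
    bool seen_z = false;
    const Node *cur = x;
    for (size_t i = 0; i < max_steps; i++) {
        if (cur == z) seen_z = true;
        if (cur == y) return seen_z;     /* reached the end point y */
        if (cur == NULL) return false;   /* fell off the list before y */
        cur = cur->next;
    }
    return false;
}

/* Sortedness: every non-null x reachable from hd and every non-null y
   reachable from x satisfy d(x) <= d(y). Assumes a NULL-terminated list. */
bool is_sorted(const Node *hd) {
    for (const Node *x = hd; x != NULL; x = x->next)
        for (const Node *y = x; y != NULL; y = y->next)
            if (x->d > y->d) return false;
    return true;
}
```

The decision procedure reasons about such sets symbolically, for all lists at once; this code only evaluates the predicate on one given heap.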

Expressive logic

Efficient decision procedure

• Decides the validity of {P} S {Q}
– Worst-case exponential time but works well in practice
• Decision problem is NP-complete
– Cannot expect any better with propositional logic
– Retains the complexity of current SMT logics
• Implemented in the Z3 SMT solver
– Leverages powerful ground-theory reasoning (arithmetic, arrays, uninterpreted functions…)


C language

C types
– Scalars (int, long, char, short)
– Pointers (int*, struct T*, ..)
– Nested structs and unions
– Array (struct T a[10];)
– Function pointers
– Void *

Difficult to establish type safety in presence of pointer arithmetic, casts
– Type safety ⇒ (spatial) memory safety
– Important default property to check

Lack of types hurts property checking
– Difficult to disambiguate heap pointers
– Difficult to write concise type invariants

Example: Type Checking

    q = CONTAINING_RECORD(p, IRP, ListEntry)
      = (IRP*)((char*)p - (size_t)&((IRP*)0)->ListEntry)

Type Checker: Does variable q have type IRP*?

Example: Property Checking

    ...
    q->Data2 = 42;

Property Checker: Is r->Data1 unchanged?

For all we know, Data1 and Data2 could be aliased!

[Figure: two IRP records linked through their ListEntry fields]
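The pointer arithmetic behind CONTAINING_RECORD can be tried on a small struct. This sketch uses the standard `offsetof` macro rather than the null-pointer expression above; the `record` layout (with fields `data1`, `node`, `data2`) and the helper `encl` are assumptions for illustration, not the Windows definitions.

```c
#include <assert.h>
#include <stddef.h>

typedef struct list { struct list *next; } list;

/* Hypothetical record with an embedded list node between two data fields. */
typedef struct record {
    int  data1;
    list node;
    int  data2;
} record;

/* Recover the enclosing struct from a pointer to one of its fields. */
#define CONTAINING_RECORD(addr, type, field) \
    ((type *)((char *)(addr) - offsetof(type, field)))

/* Wrapper so the cast is written once; p must point at a record's node. */
record *encl(list *p) {
    assert(p != NULL);
    return CONTAINING_RECORD(p, record, node);
}

/* Writes data2 through the recovered pointer, as in the property example. */
void init_record(list *p) {
    record *r = encl(p);
    r->data2 = 42;
}
```

Nothing in the C type system guarantees that `p` really points at a `node` embedded in a `record`; that is the obligation the type checker has to discharge, and the property checker then relies on it to conclude that writing `data2` leaves `data1` unchanged.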

Our Approach

• Implement a type checker in HAVOC
– Provide formal semantics for C and its types

• Use types to improve the property checker
– Provide Java-style field disambiguation

• Fully automated using Z3 SMT solver

Formalizing Type Safety

A C program is type safe if the run-time value of every variable and heap location corresponds to its compile-time type.

Mem : addr -> value
Type : addr -> type
HasType : value x type -> bool

for all a in addr, HasType(Mem(a), Type(a))
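A toy model may help make the three maps concrete. The sketch below is a heavy simplification under stated assumptions: the heap has `HEAP` cells, there are only two types (`T_INT` and `T_PTR_INT`), address 0 is null, and `HasType` takes the heap as an extra argument because the pointer case has to consult `Type`. None of these names or choices come from HAVOC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { HEAP = 8 };                       /* hypothetical heap size */
typedef enum { T_INT, T_PTR_INT } Ty;    /* int and pointer-to-int only */
typedef size_t Value;                    /* values double as addresses; 0 is null */

/* Mem : addr -> value and Type : addr -> type, as two parallel arrays. */
typedef struct {
    Value mem[HEAP];
    Ty    type[HEAP];
} Heap;

/* HasType(v, t): every value is an int; a pointer-to-int is either null
   or an in-bounds address whose cell is typed int. */
bool HasType(const Heap *h, Value v, Ty t) {
    assert(h != NULL);
    if (t == T_INT) return true;
    return v == 0 || (v < HEAP && h->type[v] == T_INT);
}

/* The type-safety invariant: for all a, HasType(Mem(a), Type(a)). */
bool type_safe(const Heap *h) {
    for (size_t a = 0; a < HEAP; a++)
        if (!HasType(h, h->mem[a], h->type[a])) return false;
    return true;
}
```

In the verifier this invariant is a quantified formula checked after every statement; here it is simply evaluated on one concrete heap.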

Example

#define ENCL(x) CONTAINING_RECORD(x, record, node)

requires(HasType(ENCL(p), record*) && ENCL(p) != NULL)
void init_record(list *p) {
    record *r = CONTAINING_RECORD(p, record, node);
    r->data2 = 42;
}

requires(forall(q, Btwn(next, p, NULL), q != NULL ==>
         HasType(ENCL(q), record*) && ENCL(q) != NULL))
void init_all_records(list *p) {
    while (p != NULL) {
        init_record(p);
        p = p->next;
    }
}

Decision Procedure

• Translation results in verification conditions that refer to Mem, Type, and HasType

• Can be encoded into an NP-complete logic
– No worse than SAT solving
– Provide decision procedure using an SMT solver

Experiments

• Implementation supports full C language
– Supports polymorphism
– Supports user-defined, dependent types

• Fancier type invariants => slower checking
– Pay only for what you use!

• Annotated and checked four Windows drivers
– Sample drivers provided with Windows DDK
– About 2.3 KLOC total, with 225 annotations
– Checking time: ~1 minute each


Need to simplify the problem

Module
– A set of public/entry procedures
– A set of private/internal procedures

Specs
– Interface specification
– Property assertion

Require the user to provide a module invariant

Harness

    Initialize(..);
    [loop_inv moduleInv]
    while (*) {
        choice = nondet();
        if (choice == 1) {
        } …
    }
    Cleanup(…);

Module invariants

• Module invariants
– Invariant about all objects of a given type
– Invariants on global variables

• Preserved by the public functions
• Low overhead
– On “steady state” and therefore succinct
– Only needed to be written at module level

Intra-module inference

• Given module M, interface specs, property and module invariants

• Infer annotations on internal procedures and loops

• Use annotations to verify property and module invariant

• Challenges
– Module invariants are temporarily broken
– Inference has to be scalable

Module invariant broken

requires (TypeInvDO)
ensures (TypeInvDO)
void publicFoo () {
    PDEV_OBJ do = NewDEV_OBJ();
    privateBar(do);
}

requires (TypeInvDOExcept(do))
requires (TypeInvDO)
ensures (TypeInvDO)
void privateBar (PDEV_OBJ do) {
    do->DevExt->Self = do;
}

#define TypeInvDO \
    ∀x ∈ MyDevObj. x->DevExt->Self = x

[Figure: a DEV_OBJ linked to its DEV_EXT through the DevExt field]
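The idea of an invariant that holds everywhere except at one object can be illustrated with a run-time check. This is a sketch under assumptions: MyDevObj is passed as an explicit array, and `DevObj`, `DevExt`, `type_inv`, `type_inv_except`, and `private_bar` are hypothetical stand-ins for the slide's DEV_OBJ, DEV_EXT, TypeInvDO, TypeInvDOExcept, and privateBar.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct DevExt DevExt;
typedef struct DevObj { DevExt *ext; } DevObj;   /* ext plays the role of DevExt */
struct DevExt { DevObj *self; };                 /* self plays the role of Self */

/* Every object in the set, other than except, has an extension that
   points back to it. Pass except = NULL for the full invariant. */
bool type_inv_except(DevObj *const *set, size_t n, const DevObj *except) {
    assert(set != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        const DevObj *x = set[i];
        if (x == except) continue;
        if (x->ext == NULL || x->ext->self != x) return false;
    }
    return true;
}

bool type_inv(DevObj *const *set, size_t n) {
    return type_inv_except(set, n, NULL);
}

/* Re-establishes the back pointer for one object, like privateBar. */
void private_bar(DevObj *d) {
    assert(d != NULL && d->ext != NULL);
    d->ext->self = d;
}
```

Between allocation and the call to `private_bar`, only the weaker `type_inv_except` holds; that weaker contract is what inference has to discover for the private procedure.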

Houdini algorithm (Flanagan-Leino 01)

• Problem statement
– Given a set of procedures P1, …, Pn
– A set C of candidate annotations for each procedure
– Returns a subset of the candidate annotations such that each procedure satisfies its annotations
– Also known as “monomial predicate abstraction”

• Algorithm
– Performs a greatest-fixed-point computation starting from all annotations
  • Remove annotations that are violated
– Requires a quadratic (n * |C|) number of theorem prover calls
– Uses a modular checker
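The fixed-point loop can be sketched in a few lines of C. This is a toy model, not the HAVOC implementation: candidate annotations are bits in a mask, and `toy_check` with its `owner`, `holds`, and `needs` tables is a hypothetical stand-in for a theorem-prover call that reports which of a procedure's annotations fail under the currently assumed set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t AnnotSet;   /* one bit per candidate annotation */

/* Modular checker: given a procedure and the annotations currently
   assumed, return the subset of that procedure's annotations refuted. */
typedef AnnotSet (*ModularCheck)(size_t proc, AnnotSet assumed);

/* Greatest fixed point: start from all candidates and drop refuted
   annotations until a full pass over the procedures removes nothing. */
AnnotSet houdini(size_t nprocs, AnnotSet candidates, ModularCheck check) {
    assert(check != NULL);
    AnnotSet assumed = candidates;
    bool changed = true;
    while (changed) {
        changed = false;
        for (size_t p = 0; p < nprocs; p++) {
            AnnotSet refuted = check(p, assumed) & assumed;
            if (refuted != 0) {
                assumed &= ~refuted;
                changed = true;
            }
        }
    }
    return assumed;
}

/* Toy prover stand-in: candidate c belongs to procedure owner[c]; it is
   provable iff holds[c] and everything in needs[c] is still assumed. */
enum { NCAND = 4 };
static const size_t   owner[NCAND] = {0, 1, 1, 0};
static const bool     holds[NCAND] = {false, true, true, true};
static const AnnotSet needs[NCAND] = {0, 1u << 0, 1u << 3, 0};

AnnotSet toy_check(size_t proc, AnnotSet assumed) {
    AnnotSet refuted = 0;
    for (size_t c = 0; c < NCAND; c++) {
        if (owner[c] != proc || !(assumed & (1u << c))) continue;
        bool provable = holds[c] && (needs[c] & ~assumed) == 0;
        if (!provable) refuted |= 1u << c;
    }
    return refuted;
}
```

In the toy tables, candidate 0 is false outright and candidate 1 depends on it, so both are removed, while candidates 2 and 3 survive. Removing one annotation can invalidate others that relied on it, which is why the loop repeats until nothing changes.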

Candidate assertions

• Candidate assertions
– Type-states in module invariants
  • Over parameters, globals, locals and their fields
– Module invariant exceptions (next slide)
– Conditional annotations
  • Disjunction of above annotations

Module invariant exceptions

TypeInvDOExcept({do, de->Self}, requires)
TypeInvDOExcept({do, de->Self}, ensures)
void privateBar (PDEV_OBJ do, PDEV_EXT de) {
    …
    do->DevExt->Self = do;
}

#define TypeInvDO \
    ∀x ∈ MyDevObj. x->DevExt->Self = x

#define TypeInvDOExcept({a,b}, ANNOT) \
    ANNOT(∀x ∈ MyDevObj. x = a ∨ x = b ∨ x->DevExt->Self = x) \
    ANNOT(a->DevExt->Self = a) \
    ANNOT(b->DevExt->Self = b)

Exceptions come from parameters, return, globals, fields

Observations

• Able to synthesize most intermediate invariants
– “close” to the module invariant (simple)
– readable
• Invariants contain quantifiers, Boolean structure
– Checking all Boolean combinations expensive (from NP-complete to PSPACE-complete [CADE’09])
• Retains scalability of the Houdini inference

Experiments

• Benchmarks
– 4 device drivers (~7KLOC each), containing lists, arrays
  • #Internal methods: ~30
  • #Loops: ~20

• Properties
– double-free, lock-usage

• User provides module invariant
– Tool infers intermediate invariants and modifies clauses

Results

• Verified the properties with 0 false alarms
• Module invariant overhead
– Number of module invariants ~5-10
– Reused across multiple drivers
• Most internal annotations inferred
– Approx. 1500 inferred annotations per driver
– Fewer than 5 manual annotations per driver
  • Mostly conditional annotations (e.g. predicated on return value)
– Inference time < 5X of the checking time


Questions?
