Towards a language design for modular software verification
Aleks Nanevski, Microsoft Research, Cambridge
Joint with Greg Morrisett (Harvard), Lars Birkedal (ITU Copenhagen), Amal Ahmed (TTI-Chicago)
Workshop on Effects and Type Theory, Tallinn, December 13, 2007

How to design a programming language from scratch with verification in mind?

• Simple types have been very successful in preventing a class of programming errors.

• But many errors are outside of their reach:
  – index-out-of-bounds
  – division-by-zero
  – invariants on mutable state, or almost anything involving effects

• Can a language enforce these deeper properties?
  – while supporting the usual features from programming practice
  – while being conservative over simply-typed languages

Two foundational approaches to program specification and verification

• Hoare Logic starts with an existing language
  – usually imperative, untyped, first-order
  – recent extensions to simply-typed functional languages [Honda’05], [Krishnaswami’06], [Birkedal’05]

• Dependent type theory targets pure higher-order lambda calculus
  – types may capture deep semantic properties of data
    (e.g., integer is even, list has 5 elements, etc.)

• I want to argue that we essentially want a combination of both.

What limitations of simple types to address?

• Simple types cannot specify effects.

• These operations are naturally partial, but here they must be “completed”:
  – perform run-time check
  – possibly raise exception

• Simple types do not capture this partiality.

How to specify effect behavior?

• Type-and-effect systems: refine the type with the effect annotation.

Semantic disconnect in type-and-effect systems

• Following term would be labeled as throwing DivByZero, in most type-and-effect systems.

• Also, execution of div x n will repeat the check for n>0, even if it doesn’t need to.

• Also, how to specify dynamically generated exceptions?
  – this immediately requires dependent types

How to reconnect type-and-effects with semantics?

• Idea: draw effect annotations from logic.

• y > 0 is a precondition that must be proved before running div x y.
  – we will also require postconditions, like in Hoare logic
  – and proofs

• Important: Pre/post-conditions become embedded in types.

Why embed specifications into types?

• Captures partiality
  – e.g., no need to define div x y in case y ≤ 0.
  – hence, strictly more expressive than Hoare Logic

• Enables trade-offs between proving and efficiency
  – i.e., we can immediately define:

• Uniform abstraction over terms, types, specs.
  – essential for information hiding and scalability
  – essential for higher-order and local state
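
The trade-off between proving and efficiency mentioned above can be illustrated with a minimal Python sketch. Python cannot check a precondition statically, so the proof obligation appears only as a comment. The names `div`, `checked_div` and `halve`, and the use of `None` to signal failure, are assumptions made for illustration and are not taken from the slides.

```python
from typing import Optional


def div(x: int, y: int) -> int:
    """Unchecked division. Precondition (caller's obligation): y > 0.

    In HTT the precondition would be part of the type; here it is only
    documented, and no run-time check is performed.
    """
    return x // y


def checked_div(x: int, y: int) -> Optional[int]:
    """Total variant: pays for a run-time test instead of a proof."""
    if y > 0:
        # On this branch the test has established the precondition of div.
        return div(x, y)
    return None


def halve(x: int) -> int:
    # 2 > 0 holds trivially, so the unchecked version can be used directly.
    return div(x, 2)
```

A caller that can establish `y > 0` uses `div` and avoids the repeated test; a caller that cannot falls back on `checked_div`.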

Which logic to use for specifications?

• It should be able to support all kinds of programming features:
  – practical data structures (e.g., hash-tables).
  – higher-order functions, polymorphism.
  – pointers, aliasing, state ownership
  – recursion, callcc, IO, concurrency.

• Thus, the logic had better be very expressive.

• Type theory (like Coq) seems perfect.

• But need to reconcile it with effects.

Hoare Type Theory (HTT)

• Introduce a type corresponding to specs in Hoare Logic (for partial correctness).

• Hoare type stands for stateful programs with
  – precondition P
  – postcondition Q
  – result type A

• Simply-typed fragment is (almost) core Haskell.

Hoare Type Theory (cont’d)

• Fruitful combination of some fundamental PL ideas:
  – Dijkstra’s predicate transformer.
  – Curry-Howard isomorphism.
  – Monads (as in Haskell).
  – Separation Logic of Reynolds, O’Hearn, et al.

• Provably compositional: components can be specified and checked in isolation.

• Prototype under construction as extension of Coq.
  – Execution by code extraction.

Dependent types and effects

Type theories are unsound if effects are added naively

• Propositions like (10 < 0) are types.

• Effectful programs can often be given any type. The “awkward squad” from Haskell:
  – divergence via infinite recursion
  – exceptions
  – mutable state
  – IO
  – concurrency

• An effectful program can prove that (10 < 0)! Hence, the system is inconsistent.

A solution: Monads

• Like in Haskell, distinguish purity with types
  – pure fragment – the underlying type theory

• e : nat – e is an integer value

• e : ST nat – e is a delayed effectful computation.
  – when executed, it may change the state and diverge.
  – but since it is delayed, it is actually considered pure.
  – hence, it can safely appear in types, predicates, proofs.

• e : ST (10 < 0) – a computation which must diverge when executed.
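
The idea that a delayed computation is itself a pure value can be sketched in Python by modelling a computation as a function from heaps to a result and a new heap. This is only an analogy under stated assumptions: the aliases `Heap` and `ST` and the helpers `read` and `write_then_return` are hypothetical names, and heaps are simplified to dictionaries from integer locations to naturals.

```python
from typing import Callable, Dict, Tuple

Heap = Dict[int, int]                     # hypothetical: location -> stored nat
ST = Callable[[Heap], Tuple[int, Heap]]   # a delayed computation returning a nat


def read(x: int) -> ST:
    """Build (but do not run) a computation that reads location x."""
    return lambda h: (h[x], h)


def write_then_return(x: int, v: int) -> ST:
    """Build a computation that stores v at x and returns the old contents."""
    def run(h: Heap) -> Tuple[int, Heap]:
        new_heap = dict(h)    # copy, so the caller's heap is not mutated
        new_heap[x] = v
        return h[x], new_heap
    return run


# Constructing a computation has no effect: program is an ordinary value.
program = write_then_return(0, 42)
heap = {0: 7}

# The effect happens only when the computation is executed on a heap.
old, heap_after = program(heap)
```

Building `program` leaves `heap` untouched; only the application `program(heap)` produces the updated heap.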

Refine the monad with pre/post-condition to capture effectful behavior and partiality

• Hoare type is a dependent (or indexed) monad.

• Formation rule: ST{P}x:A{Q} : Type if
  – P : heap → Prop
  – A : Type
  – x:A ⊢ Q : heap → heap → Prop,
  where heap = loc → option(Σα:Type. α), and loc = nat.

• Note: postcondition is binary relation on heaps. Variant of VDM notation.
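
The shape of a Hoare type (a unary precondition on the initial heap, and a postcondition relating the result, the initial heap and the final heap) can be mimicked with run-time checks. The sketch below is an assumption-laden analogy, not HTT itself: HTT proves these conditions statically, while `run_checked` merely tests them, and the names `ptsto`, `run_checked` and `read` as defined here are illustrative.

```python
from typing import Any, Dict, Tuple

Loc = int
# Each allocated location stores a value packaged with its type, in the
# spirit of loc → option(Σα:Type. α); an absent key plays the role of None.
Heap = Dict[Loc, Tuple[type, Any]]


def ptsto(x: Loc, v: Any, h: Heap) -> bool:
    """True if x points to v, tagged with v's own type, in h."""
    return x in h and h[x] == (type(v), v)


def run_checked(pre, post, cmd, i: Heap):
    """Run cmd on heap i, testing pre on i and post on (result, i, m)."""
    if not pre(i):
        raise AssertionError("precondition does not hold")
    result, m = cmd(dict(i))  # cmd works on a copy, so i stays intact
    if not post(result, i, m):
        raise AssertionError("postcondition does not hold")
    return result, m


def read(x: Loc):
    """Command that returns the contents of x and leaves the heap unchanged."""
    return lambda h: (h[x][1], h)


x = 1
pre = lambda i: x in i and i[x][0] is int          # x stores an int
post = lambda y, i, m: m == i and ptsto(x, y, i)   # binary relation on heaps

result, final = run_checked(pre, post, read(x), {1: (int, 5)})
```

Because the postcondition receives both heaps, it can state that the final heap equals the initial one without introducing auxiliary variables.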

Example: specify function that increments location contents and returns old value

• Here the points-to predicate (ptsto x v h in Coq syntax) is true if x points to v:A in h.

• Note: before running inc x, must prove that x stores a nat.
  – because x may store a value of some other type.
  – because x may be a dangling pointer.

Implementation of inc in Haskell-style do-notation.

• HTT implementation typechecks inc as follows:
  – Compute P, Q = weakest pre/strongest post for the do-block.
  – Then emit an obligation to prove the consequence:
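
A rough Python rendering of inc may help to fix ideas. It follows the three steps the talk describes (read x, write x with the incremented value, return the old value), but it is only a sketch under assumptions: the preconditions of the primitives are tested at run time instead of being proved, and the names `SpecError`, `read_nat`, `write`, `ret` and `bind` are hypothetical.

```python
from typing import Any, Callable, Dict, Tuple

Heap = Dict[int, Any]
ST = Callable[[Heap], Tuple[Any, Heap]]


class SpecError(Exception):
    """Raised when a primitive's precondition does not hold at run time."""


def read_nat(x: int) -> ST:
    def run(h: Heap):
        # Precondition of read: x is allocated and stores a nat.
        if x not in h:
            raise SpecError(f"location {x} is dangling")
        v = h[x]
        if not (isinstance(v, int) and not isinstance(v, bool) and v >= 0):
            raise SpecError(f"location {x} does not store a nat")
        return v, h
    return run


def write(x: int, v: Any) -> ST:
    def run(h: Heap):
        # Precondition of (strong) update: x is allocated, at any type.
        if x not in h:
            raise SpecError(f"location {x} is dangling")
        m = dict(h)
        m[x] = v
        return None, m
    return run


def ret(v: Any) -> ST:
    return lambda h: (v, h)


def bind(c: ST, k: Callable[[Any], ST]) -> ST:
    def run(h: Heap):
        v, h1 = c(h)          # run the first command
        return k(v)(h1)       # feed its result to the continuation
    return run


def inc(x: int) -> ST:
    # do { xv <- read x; write x (xv + 1); return xv }
    return bind(read_nat(x), lambda xv:
           bind(write(x, xv + 1), lambda _:
           ret(xv)))
```

Running `inc(3)` on a heap where location 3 holds 10 returns 10 and leaves 11 in the heap; running it on a heap without location 3 raises `SpecError`, which is the situation the static precondition rules out.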

Typing of primitive commands designed to compute weakest pre and strongest post

• Memory read

• (Strong) Memory update

Typing of primitive commands designed to compute weakest pre and strongest post

• Memory allocation

• Memory deallocation

Fixpoints are a little bit different…

• Pre/posts must be given explicitly (for now)

• Corresponds to giving loop invariants in Hoare Logic

• But should be possible to write a rule that infers the strongest invariant! Future work.

Monadic primitives (unit)

• Roughly, corresponds to Hoare Logic rule of variable assignment.

Monadic primitives (bind)

• Rule of sequential composition (but higher-order)

• Note: quantification over pre/posts and heaps is essential for obtaining the tightest specs.

Monadic primitives (Haskell-style do)

• Rule of consequence

• Interesting fact: “do” is not an ordinary coercion
  – it is an introduction form for the Hoare type
  – bind is the corresponding elimination form

Example: counter

• Allocate a private location x

• Export function that increments x

• Executing f ← counter; x0 ← f; x1 ← f; x2 ← f will bind 0, 1, 2 to x0, x1, x2, respectively.

• What is the spec for counter?
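
The counter described above can be sketched in Python with heaps passed explicitly. This is an illustration under assumptions: locations are integers, a fresh location is chosen as one past the largest allocated one, and the names `counter` and `run_example` are hypothetical.

```python
from typing import Callable, Dict, Tuple

Heap = Dict[int, int]
Incrementer = Callable[[Heap], Tuple[int, Heap]]


def counter(h: Heap) -> Tuple[Incrementer, Heap]:
    """Allocate a private location x holding 0 and return an incrementer."""
    x = max(h, default=-1) + 1   # fresh location: not a key of h
    m = dict(h)
    m[x] = 0

    def f(h1: Heap) -> Tuple[int, Heap]:
        # Return the current count and bump the private cell.
        m1 = dict(h1)
        m1[x] = h1[x] + 1
        return h1[x], m1

    return f, m


def run_example(h: Heap):
    f, h = counter(h)    # f  <- counter
    x0, h = f(h)         # x0 <- f
    x1, h = f(h)         # x1 <- f
    x2, h = f(h)         # x2 <- f
    return (x0, x1, x2), h
```

The location `x` is visible only inside the closure `f`, which is exactly what makes the specification of counter awkward: a caller's view of the spec cannot mention `x`.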

A specification with nested Hoare types

• Problem: x is out of scope in return type.

Hide private state by existential abstraction

• Introduce invariant into code to hide how count is kept.

• Another problem: fst(f) 0 h states (x ↦ 0) h, but we lost the connection with i.

• We will need Separation Logic to handle this.

Proving program correctness in HTT

Weakest pre and strongest post precisely capture the semantics of a program.

• Problem: these may not be easy to read!

• Remember the example 3-line program:

Here is the computed tightest spec for inc, in Coq syntax.

inc : forall x : loc,
  ST (fun i : heap =>
        (fun i0 : heap => exists v : nat, ptsto x v i0) i /\
        (forall (x0 : nat) (m : heap),
           (fun (y : nat) (i0 m0 : heap) => m0 = i0 /\ ptsto x y i0) x0 i m ->
           (fun (xv : nat) (i0 : heap) =>
              (fun i1 : heap => exists B : Type, exists w : B, ptsto x w i1) i0 /\
              (forall (x1 : unit) (m0 : heap),
                 (fun (_ : unit) (i1 m1 : heap) => m1 = update x (xv + 1) i1) x1 i0 m0 ->
                 (fun (_ : unit) (_ : heap) => True) x1 m0)) x0 m))
     (fun (y : nat) (i m : heap) =>
        exists x0 : nat, exists h : heap,
          (fun (y0 : nat) (i0 m0 : heap) => m0 = i0 /\ ptsto x y0 i0) x0 i h /\
          (fun (xv y0 : nat) (i0 m0 : heap) =>
             exists x1 : unit, exists h0 : heap,
               (fun (_ : unit) (i1 m1 : heap) => m1 = update x (xv + 1) i1) x1 i0 h0 /\
               (fun (_ : unit) (r : nat) (i1 f : heap) => r = xv /\ f = i1) x1 y0 h0 m0) x0 y h m)

Luckily, the spec has a lot of structure!

• It literally represents the program as a predicate.

• We apply the proving strategy from Hoare Logic:
  – symbolically evaluate the program, one step at a time.
  – at each step, discharge the verification condition that enables the next evaluation step.

• With a twist:
  – Evaluation/VC-generation can be implemented as a set of lemmas.
  – Proving the lemmas verifies the VC-gen implementation.

Example lemma for symbolic evaluation (in Coq syntax)

• If program starts with a read from location x:
  – first prove that x is initialized (ptsto x v i)
  – then proceed to prove the spec of the continuation.

• Other lemmas similar (evals_bind_write, evals_bind_new…)

• Applicable lemma can be determined by a tactic.

Lemma evals_bind_read :
  forall (A B : Type) (x : loc) (v : A) (p2 : A -> heap -> Prop)
         (q2 : A -> B -> heap -> heap -> Prop)
         (i : heap) (q : B -> heap -> Prop),
    ptsto x v i ->
    (p2 v i /\ forall y m, q2 v y i m -> q y m) ->
    (bind_pre (read_pre A x) (read_post A x) p2 i /\
     forall y m, bind_post (read_pre A x) (read_post A x) p2 q2 y i m -> q y m).

Separation Logic

Large footprints in Hoare Logic

• Let inc:

• Q: What is known after inc runs in a heap with locations x and y?

• A: Only that x ↦ v+1, but all info about y is lost.

• Spec should explicitly say that y is not changed.
  – possible to write in ST, but quite inconvenient

Small footprints and Separation Logic

• Specs should only describe what the program changes [O’Hearn,Reynolds,Pym,…]

• If e : STsep{P}x:A{Q}, then
  – e can run in any heap containing a subheap i such that P i
  – e diverges, or returns a subheap m such that Q i m
  – the part of the initial heap outside i is not accessible.

• Easier to use than large footprints, but more difficult meta theory.
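
The small-footprint reading above can be imitated operationally: run the command on just the subheap it needs and reattach the rest unchanged. The sketch below assumes heaps are Python dictionaries; `run_in_larger_heap` and this simplified `inc` are hypothetical names, and the overlap check stands in for the freshness guarantee that a real semantics would provide.

```python
from typing import Any, Callable, Dict, Iterable, Tuple

Heap = Dict[int, Any]
ST = Callable[[Heap], Tuple[Any, Heap]]


def inc(x: int) -> ST:
    """Increment the contents of x and return the old value."""
    def run(h: Heap):
        m = dict(h)
        m[x] = h[x] + 1
        return h[x], m
    return run


def run_in_larger_heap(cmd: ST, footprint: Iterable[int], heap: Heap):
    """Run cmd on the subheap given by footprint; the rest is the frame."""
    footprint = set(footprint)
    sub = {l: heap[l] for l in footprint}
    frame = {l: v for l, v in heap.items() if l not in footprint}
    result, sub_after = cmd(sub)
    # cmd never saw the frame, so the frame is reattached unchanged.
    if set(sub_after) & set(frame):
        raise ValueError("command produced a location that overlaps the frame")
    return result, {**sub_after, **frame}
```

With footprint `[1]`, running `inc(1)` in a heap that also contains location 2 leaves location 2 as it was, with no need for the spec of inc to mention it; a command that touches a location outside its footprint fails with `KeyError`.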

Separation logic adds two new things:

• Separating conjunction (easily definable in HTT):

(P * Q) holds of heap h iff P and Q hold of disjoint parts of h

• Frame rule of inference: If then

• Can we add Frame rule to HTT? How to prove that Frame is sound?
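
The definition of separating conjunction given above is directly executable on finite heaps. The following sketch assumes heaps are Python dictionaries and predicates are Boolean functions on them; `star`, `points_to` and `emp` are illustrative names. It enumerates every split of the heap, so it is exponential in the heap size and meant only for small examples.

```python
from itertools import combinations
from typing import Any, Callable, Dict

Heap = Dict[int, Any]
Pred = Callable[[Heap], bool]


def star(p: Pred, q: Pred) -> Pred:
    """(p * q) h holds iff h splits into disjoint h1, h2 with p h1 and q h2."""
    def holds(h: Heap) -> bool:
        locs = list(h)
        for k in range(len(locs) + 1):
            for chosen in combinations(locs, k):
                h1 = {l: h[l] for l in chosen}
                h2 = {l: h[l] for l in locs if l not in chosen}
                if p(h1) and q(h2):
                    return True
        return False
    return holds


def points_to(x: int, v: Any) -> Pred:
    """Holds of exactly the singleton heap mapping x to v."""
    return lambda h: h == {x: v}


def emp(h: Heap) -> bool:
    """Holds of exactly the empty heap."""
    return h == {}
```

For instance, `star(points_to(1, "a"), points_to(1, "a"))` fails on the heap `{1: "a"}`, because the single cell cannot be given to both conjuncts.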

Employ a type-theoretic idea to expedite…

• Impose that well-typed programs must satisfy Frame!

• Define new monad STsep, over ST:

• Then re-type the stateful commands, using rule of consequence.

Programs remain the same, but specs become much simpler

• Example: allocation
  – empty subheap is consumed and replaced by r ↦ v
  – r must be fresh (as new can’t access existing state)

• Example: deallocation
  – subheap x ↦ - is consumed and replaced by empty.

• Analogy with linear logic.

STsep monad correctly handles private state

• Now (fst f) 0 replaces empty from the precondition.

• Meaning: initial heap is extended with x ↦ 0.

Meta-theoretic properties: soundness, compositionality, equations

Verification in HTT reduces to typechecking

• Theorem: If e:ST{P}r:A{Q}, then e evaluates as expected.

• Proved via Preservation and Progress lemmas.
  – but much more demanding!

• Preservation: evaluation preserves types, normal forms, and postconditions.
  – e.g., if e:ST{T}r:int{r = 55} then e does produce 55.

• Progress demands soundness of the assertion logic.
  – Requires a denotational model for HTT.

Type checking is syntax directed

• Program properties independent of context.
  – No need for whole program reasoning.
  – Proofs by induction on program structure.

• Program is a proof of its spec:
  – in the pure case, by Curry-Howard.
  – in the impure case, by weakest pre/strongest post.

• Formal statements of compositionality:
  – In the pure case, substitution principles.
  – In the impure case, Hoare’s rule of composition.

Denotational models

• Denotation for e : ST{P}x:A{Q x} is a predicate transformer:
  – takes p : heap → Prop such that ∀h. p h → P h
  – returns q : A → heap → Prop such that ∀x h. q x h → ∃i. p i ∧ Q x i h
  – is monotone

• Model suffices for soundness, but too large
  – e.g., does not support storing monads into heaps
  – also, requires showing monotonicity before taking fix.

• Better, realizability model [Petersen,Birkedal’08].
  – But not implemented in Coq, and seems very hard to!

Implementation, related work, future work, summary

Summary

• HTT reflects effect information into types via Hoare-style pre/post conditions.
  – Generalization of monadic type-and-effect systems, but effect annotations are logical predicates over heaps.

• Types determine in which context a program may be used (in a context satisfying the precondition).
  – This is a uniquely type-theoretic property, generalizing ordinary Hoare Logics.

• Combines usefully with higher-order features of a type theory like Coq, to represent modes of use of state, like:
  – freshness, aliasing, ownership (via Separation Logic)
  – higher-order and shared local state (via existential abstraction).

Related work

• Extended static checking: ESC/Java, JML, Spec#, SPlint, Cyclone, Sage
  – Hoare-like annotations verified during typechecking.
  – Restrictive strategies for dealing with undecidability

• Dependent types and effects: [Augustson’98], [Mandelbaum’03], [Zhu,Xi’05], [Shao’05], [Sheard’05], [Westbrook’06], [Taha’07], [Condit’07].
  – Programs and specs cannot share pure code (phase separation)

• Hoare Logics for higher-order functions: [Schoeder’02], [Honda’05], [Krishnaswami’06], [Birkedal’04]
  – Simply-typed underlying languages (with effects)
  – Hoare triples do not integrate into types.

HTT in comparison to related work.

[Figure: chart with axes “Spec expressiveness” and “Programming features”, positioning Typed lambda calculus; Java, C#, Haskell, O’Caml; Dependent type theory (Coq, Epigram, NuPRL…); Hoare specs (ESC, JML, Spec#, Cyclone); Light dependent types (Cayenne, DML, ATS, Omega); and HTT, relative to “Fully verified software”.]

Future work: gain more experience with implementation in Coq

• A lot of scaffolding for verification is in place
  – symbolic evaluation lemmas
  – tactics for Separation Logic reasoning (were tricky to nail down at first; several wrong starts)

• Getting ready to attack larger programs.
  – Probably start with libraries for imperative data structures.

• Largest so far: Hash-table module, Stack module, Parsing combinators.

• Experience encouraging:
  – proofs/code ratio quite large
  – but proofs were not difficult

Future work: other effects

• First attempts at formulating Haskell-style monad for transactional concurrency.
  – Separate state into private and shared
  – Reasoning like O’Hearn’s concurrent separation logic
  – Hoare type is a 4-tuple STM{I}{P}x:A{Q}
  – I – invariant of shared state

• Other notions of concurrency?
  – Auxiliary variables, history/prophecy variables?
  – Predicate transformers for concurrency?

• IO monad?
  – Specifications must be limited to statements that are invariant against outside changes to the world.

• Continuation monad? (first attempts made)

Future work: better models and axiomatizations

• Can we encode equality over effectful code as some reasonable judgment?

• Without having to implement involved categorical models.

Hopefully in future not too far, far away…

[Figure: chart with axes “Spec expressiveness” and “Programming features”, positioning Typed lambda calculus; Java, C#, Haskell, O’Caml; Dependent type theory (Coq, Epigram, NuPRL…); Hoare specs (ESC, JML, Spec#, Cyclone); Light dependent types (Cayenne, DML, ATS, Omega); and HTT, relative to “Fully verified software”.]