Type inference in type-based verification

Dimitrios Vytiniotis, Microsoft Research

[email protected]

May 2010

2

Software is hard to get right*

Which tools can help programmers write reliable code?

How to make these tools more practical and effective to use?

Making programming language types more practical and effective

* Toyota recalls 2010 models due to faulty software in the brakes. Upgrade your Prius!

this talk

3

Programming language types

Why invest in types?

[Figure: # of bugs against complexity, comparing model-driven development, development with proof assistants, verification condition generation and constraint solving, and model checking]

Other benefits:
1. Integrated verification and development
2. Early error detection
3. Static checks mean fast runtime code
4. Force you to think about documentation
5. Modular development
6. They scale

A demonstrably simple technology that can eliminate lots of bugs

this talk

4

A brief (hi)story of type expressivity

[Figure: timeline from simple types (1970) through Hindley-Milner (ML, Haskell, F#) to 2015, in three regions: the context, my work on expressive types, and the future. Labels: GADTs, first-class polymorphism, type classes, type families, dependent types, OutsideIn(X), …]

Example types: inc :: Int -> Int and map :: (a -> b) -> [a] -> [b]

Papers: ICFP 2006, ICFP 2009, JFP 2007, ICFP 2008, ML 2009, TLDI 2010, NEW: JFP submission

5


[Figure: timeline of type expressivity, repeated]

Keeping types practical

6

Types express properties

[1,2,3,4] :: { l :: List Int where forall i < length(l), l[i]<=4 }

[1,2,3,4] :: ListWithLength 4 Int

[1,2,3,4] :: List NONEMPTY Int

[1,2,3,4] :: List Int

[1,2,3,4] :: IntList

[1,2,3,4] :: Object

[Figure axis: # of bugs]

Our goal: increase expressivity … but keep the complexity low

Hindley-Milner [Hindley, Damas & Milner]: Haskell, ML, F#, also Java, C#, …
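A minimal Haskell sketch of the same spectrum of types for [1,2,3,4], assuming GHC with the DataKinds, GADTs and KindSignatures extensions. The names Nat, Vec, vhead, plainList, nonEmptyList and sized are illustrative and do not come from the slides.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}
module Main where

import Data.List.NonEmpty (NonEmpty (..))
import qualified Data.List.NonEmpty as NE

-- Type-level natural numbers (hypothetical helper type).
data Nat = Z | S Nat

-- A list whose length is part of its type.
data Vec (n :: Nat) a where
  VNil  :: Vec 'Z a
  VCons :: a -> Vec n a -> Vec ('S n) a

-- Only accepts vectors of length at least one.
vhead :: Vec ('S n) a -> a
vhead (VCons x _) = x

-- Least precise: an ordinary list.
plainList :: [Int]
plainList = [1, 2, 3, 4]

-- More precise: statically non-empty.
nonEmptyList :: NonEmpty Int
nonEmptyList = 1 :| [2, 3, 4]

-- Most precise here: the length 4 appears in the type.
sized :: Vec ('S ('S ('S ('S 'Z)))) Int
sized = VCons 1 (VCons 2 (VCons 3 (VCons 4 VNil)))

main :: IO ()
main = do
  print (sum plainList)
  print (NE.head nonEmptyList)
  print (vhead sized)
```

The more precise the type, the more misuse the compiler can rule out: vhead VNil does not type check, whereas head [] on a plain list fails only at run time.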

7

Keeping type annotation cost low

How to convince the type checker that programs are well-typed?

StringBuilder sb = new StringBuilder(256);

var sb = new StringBuilder(256);

Full type inference: no user annotations at all

Full type checking: explicit types everywhere

Hindley-Milner

inc x = x+1

Many traditional languages

Int inc(Int x) = x+1

Increased expressivity requires more checking

Full type inference extremely convenient [no type-induced pain]

map f list = case list of
  nil    -> nil
  h:tail -> cons (f h) (map f tail)

map<S,T> (f :: S -> T) (list :: [S]) = case list of
  nil    -> nil<T>
  h:tail -> cons<T> (f h) (map<S,T> f tail)
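The two extremes can be tried directly in Haskell. This is a sketch, assuming GHC with ScopedTypeVariables; the names myMap, myMapAnn and incInt are invented here to avoid clashing with the Prelude.

```haskell
{-# LANGUAGE ScopedTypeVariables #-}
module Main where

-- Full inference: no annotations, the compiler infers the types.
inc x = x + 1

myMap f list = case list of
  []     -> []
  h : tl -> f h : myMap f tl

-- Full checking style: every type written out by hand.
incInt :: Int -> Int
incInt x = x + 1

myMapAnn :: forall s t. (s -> t) -> [s] -> [t]
myMapAnn f list = case list of
  []     -> ([] :: [t])
  h : tl -> (f h :: t) : myMapAnn f tl

main :: IO ()
main = do
  print (myMap inc [1, 2, 3 :: Int])
  print (myMapAnn incInt [1, 2, 3])
```

Both versions compute the same result; the difference is only in how much the programmer has to write.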

8

Keeping types predictable

With simple, robust, declarative typing rules

test1 = … p1 + p2 …   -- ACCEPTED
test2 = … p2 + p1 …   -- REJECTED

test1 = p             -- ACCEPTED
test2 =               -- REJECTED
  let f x = x in f p

And theorems that connect typing rules to low level algorithms

e :: s -> t    u :: s
---------------------
      e u :: t

t <- infer e
s <- infer u
α <- fresh
solve(t = s -> α)
return α

Hindley-Milner scores perfect here
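The application rule and its algorithm can be sketched as a small inference function. This is an illustrative toy for a simply typed fragment without let-generalization, not the full Hindley-Milner algorithm; Type, Expr, unify, infer and typeOf are names made up for this sketch.

```haskell
module Main where

data Type = TInt | TVar Int | TFun Type Type deriving (Eq, Show)
data Expr = Var String | Lit Int | App Expr Expr | Lam String Expr

type Subst = [(Int, Type)]
type Env   = [(String, Type)]

-- Apply a substitution, chasing bound variables.
apply :: Subst -> Type -> Type
apply s (TVar v)   = maybe (TVar v) (apply s) (lookup v s)
apply s (TFun a b) = TFun (apply s a) (apply s b)
apply _ TInt       = TInt

occurs :: Int -> Type -> Bool
occurs v (TVar w)   = v == w
occurs v (TFun a b) = occurs v a || occurs v b
occurs _ TInt       = False

-- solve(t1 = t2): extend the substitution or fail.
unify :: Type -> Type -> Subst -> Maybe Subst
unify a b s = go (apply s a) (apply s b)
  where
    go TInt TInt                 = Just s
    go (TVar v) t                = bind v t
    go t (TVar v)                = bind v t
    go (TFun a1 b1) (TFun a2 b2) = unify a1 a2 s >>= unify b1 b2
    go _ _                       = Nothing
    bind v t
      | t == TVar v = Just s
      | occurs v t  = Nothing
      | otherwise   = Just ((v, t) : s)

-- The state is (next fresh variable, current substitution).
infer :: Env -> Expr -> (Int, Subst) -> Maybe (Type, (Int, Subst))
infer _   (Lit _)      st     = Just (TInt, st)
infer env (Var x)      st     = do
  t <- lookup x env
  Just (t, st)
infer env (Lam x body) (n, s) = do
  let a = TVar n
  (t, st') <- infer ((x, a) : env) body (n + 1, s)
  Just (TFun a t, st')
infer env (App e u)    st     = do
  (t, st1)      <- infer env e st      -- t <- infer e
  (sTy, (n, s)) <- infer env u st1     -- s <- infer u
  let alpha = TVar n                   -- α <- fresh
  s' <- unify t (TFun sTy alpha) s     -- solve(t = s -> α)
  Just (alpha, (n + 1, s'))            -- return α

typeOf :: Expr -> Maybe Type
typeOf e = do
  (t, (_, s)) <- infer [("inc", TFun TInt TInt)] e (0, [])
  Just (apply s t)

main :: IO ()
main = do
  print (typeOf (App (Var "inc") (Lit 1)))                 -- Just TInt
  print (typeOf (App (Lit 1) (Lit 2)))                     -- Nothing
  print (typeOf (Lam "x" (App (Var "inc") (Var "x"))))     -- Just (TFun TInt TInt)
```

The App case follows the five pseudocode steps line by line, which is the kind of direct correspondence between rule and algorithm the slide refers to.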

9


[Figure: timeline of type expressivity, repeated]

Simple, predictable; no user annotations; low expressivity

How to implement GADTs:
1. What are GADTs
2. Why they are difficult for type inference
3. Inference vs checking [ICFP 2006]
4. Simplifying and reducing annotations [ICFP 2009]

10

GADTs in Glasgow Haskell Compiler (GHC)

-- An Algebraic Datatype: Integer Lists
data IList where
  Nil  :: IList
  Cons :: Int -> IList -> IList

-- A Generalized Algebraic Datatype (GADT)
data IList f where
  Nil  :: IList EMPTY
  Cons :: Int -> IList f -> IList NONEMPTY

x = Cons 1 (Cons 2 Nil)

Type checker knows x :: IList NONEMPTY

head :: IList NONEMPTY -> Int
test0 = head x

test0 = head Nil   -- REJECTED!
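A compilable version of the GADT list, as a sketch assuming GHC with the GADTs extension. The function is called ihead here (an invented name) so that it does not clash with Prelude.head, and EMPTY and NONEMPTY are declared as empty data types.

```haskell
{-# LANGUAGE GADTs #-}
module Main where

-- Type-level tags with no values.
data EMPTY
data NONEMPTY

data IList f where
  Nil  :: IList EMPTY
  Cons :: Int -> IList f -> IList NONEMPTY

-- Total: the Nil case cannot occur at type IList NONEMPTY.
ihead :: IList NONEMPTY -> Int
ihead (Cons h _) = h

x :: IList NONEMPTY
x = Cons 1 (Cons 2 Nil)

main :: IO ()
main = print (ihead x)

-- Uncommenting the next line makes the module fail to compile,
-- because Nil :: IList EMPTY does not match IList NONEMPTY.
-- bad = ihead Nil
```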

11

Uses of GADTs

Compiler enforces invariants via type checking

tail :: ListWithLength (S n) -> ListWithLength n
compile :: Term SOURCE -> Maybe (Term TARGET)

Significant number of research papers [Cheney & Hinze, Xi, Pottier & Simonet, Pottier & Régis-Gianas, Sulzmann & Stuckey,…]

Verified compiler transformations, data structure implementations, reflection & generic programming, …

Such a cool feature that people are using GADT-inspired tricks in other languages! For example, C. Russo and A. Kennedy have a C# encoding

12

Example: evaluation of embedded DSL

A non-GADT representation

data Term where
  ILit   :: Int -> Term
  And    :: Term -> Term -> Term
  IsZero :: Term -> Term
  ...

data Val where
  IVal :: Int -> Val
  BVal :: Bool -> Val

eval :: Term -> Val
eval (ILit i)    = IVal i
eval (And t1 t2) = case eval t1 of
  IVal _  -> error
  BVal b1 -> case eval t2 of
    IVal _  -> error
    BVal b2 -> BVal (b1 && b2)
...

f = eval (And (ILit 3) (IsZero (ILit 0)))

A GADT representation

data Term a where
  ILit   :: Int -> Term Int
  And    :: Term Bool -> Term Bool -> Term Bool
  IsZero :: Term Int -> Term Bool
  ...

eval :: Term a -> a
eval (ILit i)    = i
eval (And t1 t2) = eval t1 && eval t2
...

Represents only correct terms
Tagless evaluation: efficient code

A common example, also appearing in [Peyton Jones, Vytiniotis, Weirich, Washburn, ICFP 2006]
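The GADT representation compiles as written once the elided cases are filled in. In this sketch the IsZero clause of eval is an assumption about what the slide leaves out, and only the three constructors shown are included.

```haskell
{-# LANGUAGE GADTs #-}
module Main where

data Term a where
  ILit   :: Int -> Term Int
  And    :: Term Bool -> Term Bool -> Term Bool
  IsZero :: Term Int -> Term Bool

-- No Val wrapper and no run-time tag checks: the index a
-- already says what kind of value each term evaluates to.
eval :: Term a -> a
eval (ILit i)    = i
eval (And t1 t2) = eval t1 && eval t2
eval (IsZero t)  = eval t == 0   -- assumed clause, not on the slide

main :: IO ()
main = do
  print (eval (IsZero (ILit 0)))
  print (eval (And (IsZero (ILit 0)) (IsZero (ILit 3))))

-- Ill-formed terms are rejected at compile time, for example:
-- bad = eval (And (ILit 3) (IsZero (ILit 0)))
```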

13

Type checking and GADTs

Pattern matching introduces type equalities, available after the =

data Term a where
  ILit :: Int -> Term Int

eval :: Term a -> a
eval (ILit i) = i     -- i :: Int
eval _        = …

In the first branch we learn that a ~ Int

Possible with the help of programmer annotations

Right-hand side: we must return type a

That’s fine because we know that (a ~ Int) from pattern matching

Determines the term we analyze

Determines the result

14

Type inference and GADTs

data Term a where
  ILit :: Int -> Term Int
  ...

-- Get a list of literals in this term
getILit (ILit i) = [i]
getILit _        = []

Haskell programmers omit type signatures

Here is a possible type of getILit: Term a -> [Int]

But if (a ~ Int) is used then there is also another one: Term a -> [a]

BAD!

15

A threat for modularity

Two different “specifications” for getILit

btrm :: Term Bool

f1 = (getILit btrm) ++ [0]      -- Works only with: Term a -> [Int]

f2 = (getILit btrm) ++ [True]   -- Works only with: Term a -> [a]

And this one?

test = let getILit (ILit i) = [i]
           getILit _        = []
       in ...

We want to have a unique principal type that we infer once and use throughout the scope of the function
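The two incomparable specifications can be written as two separately annotated functions. This is a sketch: getILitInt and getILitAny are invented names, and the BLit constructor is a hypothetical addition so that a value btrm :: Term Bool can be built.

```haskell
{-# LANGUAGE GADTs #-}
module Main where

data Term a where
  ILit :: Int  -> Term Int
  BLit :: Bool -> Term Bool   -- hypothetical, not on the slide

-- Specification 1: always a list of Ints.
getILitInt :: Term a -> [Int]
getILitInt (ILit i) = [i]
getILitInt _        = []

-- Specification 2: uses a ~ Int in the first branch.
getILitAny :: Term a -> [a]
getILitAny (ILit i) = [i]
getILitAny _        = []

btrm :: Term Bool
btrm = BLit True

f1 :: [Int]
f1 = getILitInt btrm ++ [0]      -- needs Term a -> [Int]

f2 :: [Bool]
f2 = getILitAny btrm ++ [True]   -- needs Term a -> [a]

main :: IO ()
main = do
  print f1
  print f2
```

Swapping the two functions in f1 and f2 makes both definitions ill-typed, which is why neither type can serve as the single inferred one.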

16

Separating checking and inference [ICFP 2006]

S. Peyton Jones, D. Vytiniotis, G. Washburn, S. Weirich

Not all programs have principal types, so use annotations to let programmers decide

No annotation: do not use GADT equalities

getILit (ILit i) = [i]   -- inferred: (Term a -> [Int])

To use the other type supply an annotation:

getILit :: Term a -> [a]
getILit (ILit i) = [i]

Annotations determine two interweaved modes of operation: checking mode and inference mode

17

Discovering a complete implementation

Predictability mandates high-level declarative typing rules

Needed to design a type system and a sound and complete algorithm

That turned out to be possible because:
1. Typing rules [and algorithm] can “switch” mode when they meet annotations
2. The GADT checking problem is easy
3. All non-GADT branches are typed as in Hindley-Milner

Theorem: There exists a provably decidable, sound and complete algorithm for the [ICFP 2006] type system

The first work on type inference and GADTs to achieve this

This is what GHC has implemented since 2006. Extremely effective and popular: http://darcs.net, commercial users, …

18

[ICFP 2006] was a breakthrough but …

To reduce required annotations it used some ad-hoc annotation propagation

opt :: Term b -> Term b

eval :: Term a -> a
eval x = case opt x of
  ILit i -> i
-- typechecks

Quite remarkable BUT what about predictability?

eval :: Term a -> a
eval x = let f x = x
         in case f (opt x) of
              ILit i -> i
-- fails, because no type annotation for f

How to improve this?

19

The Outside-In solution

Schrijvers, Sulzmann, Peyton Jones, Vytiniotis [ICFP 2009]

Perform full inference outside a GADT branch first, and then use what you learnt to go inside the branch

Very aggressive type information discovery + a simpler “Outside-In” type system

eval :: Term a -> a
eval x = let f x = x
         in case f (opt x) of
              ILit i -> i

Working on the outside of the branch first determines that f (opt x) :: Term a
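A self-contained version of this example, as a sketch that should be accepted by GHC releases implementing the Outside-In approach. The body of opt is a placeholder (the identity), the BLit constructor is a hypothetical addition, and the local variable is renamed to y to avoid shadowing x.

```haskell
{-# LANGUAGE GADTs #-}
module Main where

data Term a where
  ILit :: Int  -> Term Int
  BLit :: Bool -> Term Bool   -- hypothetical, not on the slide

-- Placeholder "optimizer": only its type matters here.
opt :: Term b -> Term b
opt t = t

eval :: Term a -> a
eval x =
  let f y = y                 -- unannotated local helper
  in case f (opt x) of        -- scrutinee type found outside the branches
       ILit i -> i            -- uses a ~ Int
       BLit b -> b            -- uses a ~ Bool

main :: IO ()
main = do
  print (eval (ILit 7))
  print (eval (BLit False))
```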

20

Simplifying and reducing annotations [ICFP 2009]

Fewer annotations needed
Predictability
Modularity

Theorem: There exists a provably decidable, sound and complete algorithm for the “Outside-In” type system in [ICFP 2009]

[Figure labels: all type-safe programs; all programs with principal types; “Outside-In” type system (Theorem)]

Forthcoming implementation in GHC, invited paper in special issue of JFP

“the system of this paper is the simplest proposal ever made to solve type inference for GADTs” [anonymous reviewer]

21

Inferring principal types in [ICFP 2009]

data Term a where
  ILit :: Int -> Term Int
  If   :: Term Bool -> Term a -> Term a -> Term a

-- Get the least number in this term
findLeast (ILit i) = i
findLeast (If cond t1 t2) =
  let x1 = findLeast t1
      x2 = findLeast t2
  in if (x1 < x2) then x1 else x2

Because of (x1 < x2), findLeast must return Int. There is a principal type [and ICFP 2009 finds it]: Term a -> Int

Not due to arbitrarily choosing Term a -> Int as previously

REJECTED in [ICFP 2009] No ad-hoc assumptions about programmer intentions
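A runnable version of findLeast, as a sketch. The principal type from the slide is written as an explicit signature, since a given GHC version may still require it for GADT matches. The IsZero constructor and its clause are assumptions added so that a condition of type Term Bool can be constructed.

```haskell
{-# LANGUAGE GADTs #-}
module Main where

data Term a where
  ILit   :: Int -> Term Int
  IsZero :: Term Int -> Term Bool                    -- assumed, for building conditions
  If     :: Term Bool -> Term a -> Term a -> Term a

-- Get the least number in the branches of this term.
findLeast :: Term a -> Int
findLeast (ILit i)     = i
findLeast (IsZero t)   = findLeast t                 -- assumed clause
findLeast (If _ t1 t2) =
  let x1 = findLeast t1
      x2 = findLeast t2
  in if x1 < x2 then x1 else x2

main :: IO ()
main = print (findLeast (If (IsZero (ILit 0)) (ILit 5) (ILit 3)))
```

As on the slide, the condition of If is ignored and only the two branches are compared.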

22

The algorithm in [ICFP 2009]

findLeast (ILit i) = i
findLeast (If cond t1 t2) =
  let x1 = findLeast t1
      x2 = findLeast t2
  in if (x1 < x2) then x1 else x2

Type checker infers partially known type: findLeast :: Term α -> β

GADT branches introduce implication constraints that we must solve: (α ~ Int) => (β ~ Int)

Implication constraints may have many solutions, β := Int or β := α, which result in different types.

Constraint abduction [Maher] or (rigid) E-unification [Degtyarev & Voronkov, Veanes, Gallier & Snyder, Gurevich]

Detecting incomparable solutions only possible in special cases. Mostly negative results about complexity or even decidability of the general problem.

NOT VERY ENCOURAGING

23

Restricting implications for Outside-In

findLeast (ILit i) = i
findLeast (If cond t1 t2) =
  let x1 = findLeast t1
      x2 = findLeast t2
  in if (x1 < x2) then x1 else x2

Step 1: Introduce special constraints that record the interface of the branch with the outside

Constraint A: [α,β] (α ~ Int) => (β ~ Int)    Interface: [α,β]
Constraint B: [α,β] (β ~ Int)                 Interface: [α,β]

Step 2: Solve non-implication constraint (B) first. Easy, no multitude of solutions to pick from: β := Int

Step 3: Substitute solution on implication constraint (A): [α] (α ~ Int) => (Int ~ Int)

Step 4: Solve remaining implications fixing interface variables
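Steps 2 to 4 can be illustrated with a toy solver over flat equalities. This is a heavily simplified sketch, not the algorithm of the paper: types are only Int, Bool or variables, and all names (Implication, solveSimple, holds, outsideIn) are invented for the illustration.

```haskell
module Main where

data Type = TInt | TBool | TVar String deriving (Eq, Show)

type Equality = (Type, Type)          -- t1 ~ t2
type Subst    = [(String, Type)]

-- givens => wanteds
data Implication = Implication [Equality] [Equality]

apply :: Subst -> Type -> Type
apply s (TVar v) = maybe (TVar v) (apply s) (lookup v s)
apply _ t        = t

-- Step 2: solve the simple (non-implication) equalities by unification.
solveSimple :: [Equality] -> Maybe Subst
solveSimple = go []
  where
    go s [] = Just s
    go s ((a, b) : rest) =
      case (apply s a, apply s b) of
        (x, y) | x == y -> go s rest
        (TVar v, t)     -> go ((v, t) : s) rest
        (t, TVar v)     -> go ((v, t) : s) rest
        _               -> Nothing

-- Step 4: check an implication with the interface variables fixed.
-- Givens may rewrite variables, but no new solution is invented.
holds :: Implication -> Bool
holds (Implication gs ws) =
  case solveSimple gs of
    Nothing -> True                   -- inconsistent givens: unreachable branch
    Just g  -> all (\(a, b) -> apply g a == apply g b) ws

outsideIn :: [Equality] -> [Implication] -> Maybe Subst
outsideIn simple imps = do
  s <- solveSimple simple                              -- Step 2
  let sub (a, b) = (apply s a, apply s b)
      imps' = [ Implication (map sub gs) (map sub ws)  -- Step 3
              | Implication gs ws <- imps ]
  if all holds imps' then Just s else Nothing          -- Step 4

alpha, beta :: Type
alpha = TVar "alpha"
beta  = TVar "beta"

constraintA :: Implication
constraintA = Implication [(alpha, TInt)] [(beta, TInt)]

constraintB :: Equality
constraintB = (beta, TInt)

main :: IO ()
main = do
  print (outsideIn [constraintB] [constraintA])   -- Just [("beta",TInt)]
  print (outsideIn [] [constraintA])              -- Nothing
```

The second call drops constraint B: with nothing outside the branch fixing β, the implication alone does not determine a solution, so the sketch rejects rather than guessing between β := Int and β := α.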

24


[Figure: timeline of type expressivity, repeated]

25

The Hindley-Milner type system 25 years later

How do all the above affect our “gold standard” of modern type systems?

We had to add user type annotations to HM to get GADTs

Yet another reason for this is first-class polymorphism [THESIS TOPIC]:
  QML: Explicit first-class polymorphism for ML [Russo, Vytiniotis, ML 2009]
  FPH: First-class polymorphism for Haskell [Vytiniotis, Peyton Jones, Weirich, ICFP 2008]
  Practical type inference for higher-rank types [Peyton Jones, Vytiniotis, Weirich, Shields, JFP 2007]. The canonical reference for Higher-Rank type systems
  Boxy Types [Vytiniotis, Peyton Jones, Weirich, ICFP 2006]

… but are we also forced to remove anything?

Reminder: Hindley-Milner does not need any annotations, at all

26

let generalization in Hindley-Milner

main = let group x y = [x,y]
       in (group 0 1, group False False)

group is polymorphic. We can give it the generalized type

group :: forall a. a -> a -> [a]

or defer the check to the call sites [Pottier, Sulzmann, HM(X)]:

group :: forall a b. (a ~ b) => a -> b -> [a]

For some extensions [e.g. GHC's celebrated type families] we must allow deferring because:

no-deferring hard-to-generalize*

… but is it practical to defer?

* trust me
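Both types for group are accepted by GHC when written as signatures. A sketch, assuming the GADTs extension for the equality constraint (TypeOperators is added for newer compilers); group1 and group2 are invented names for the two variants.

```haskell
{-# LANGUAGE GADTs, TypeOperators #-}
module Main where

-- Generalized type: the equality is solved at the definition.
group1 :: a -> a -> [a]
group1 x y = [x, y]

-- Deferred type: the equality a ~ b is checked at each call site.
group2 :: (a ~ b) => a -> b -> [a]
group2 x y = [x, y]

main :: IO ()
main = do
  print (group1 (0 :: Int) 1, group1 False False)
  print (group2 (0 :: Int) 1, group2 False False)
```

The two functions behave identically; they differ only in where the type checker discharges the equality.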

27

No generalization for let-bound definitions

f :: a -> Term a -> Int
f x y = let g b = x + 1
        in case y of
             ILit i -> g ()

Well-typed if we defer equality to the call site of g:

g :: (a ~ Int) => b -> Int

a ~ Int ... errk???

If typing rules allow deferring, then algorithm must not solve any equality [BAD!]

Completeness proof reveals nasty surprise

28

The proposal [TLDI 2010]

D. Vytiniotis, S. Peyton Jones, T. Schrijvers [TLDI 2010]

Abandon generalization of local definitions

The only complete algorithms are not practical

RADICAL: removing a basic ingredient of HM

But not restrictive in practice: 127 lines affected in 95Kloc of Haskell libraries (0.13%)!

No expressivity loss: Polymorphism can be recovered with annotations
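GHC's MonoLocalBinds extension gives a way to try this out; treating it as the counterpart of this proposal is an assumption of this sketch. The names pairUp and tag are invented. With the extension on, a local binding that mentions a variable from the enclosing scope is not generalized unless it carries a signature.

```haskell
{-# LANGUAGE MonoLocalBinds #-}
module Main where

pairUp :: Int -> ((Int, Int), (Bool, Int))
pairUp n =
  let -- The annotation recovers polymorphism. Without it, tag mentions
      -- n from the enclosing scope, stays monomorphic, and cannot be
      -- used at both Int and Bool below.
      tag :: a -> (a, Int)
      tag v = (v, n)
  in (tag n, tag True)

main :: IO ()
main = print (pairUp 3)
```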

29

OutsideIn(X)

Many recent extensions exhibit those problems:
  GADTs [previous slides]
  Type classes: sort :: forall a. Ord a => [a] -> [a]
  Type families: append :: forall n m. (IList n) -> (IList m) -> (IList (Plus n m))
  Units of measure [Kennedy 94], implicit parameters, functional dependencies, impredicative polymorphism …

OutsideIn(X) [TLDI 2010, new JFP submission]

Parameterize “Outside-In” type system and infrastructure [implication constraints] by a constraint theory X and its solver w/o losing inference

Do the Hard Work once
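The append signature can be realized with a closed type family. A sketch assuming GHC with DataKinds, GADTs, KindSignatures and TypeFamilies; the definitions of Nat, Plus and toList are illustrative choices, not taken from the slides.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures, TypeFamilies #-}
module Main where

data Nat = Z | S Nat

-- Type-level addition on lengths.
type family Plus (n :: Nat) (m :: Nat) :: Nat where
  Plus 'Z     m = m
  Plus ('S n) m = 'S (Plus n m)

-- Integer lists indexed by their length.
data IList (n :: Nat) where
  Nil  :: IList 'Z
  Cons :: Int -> IList n -> IList ('S n)

-- Each branch type checks using the equality learnt from the
-- pattern match together with the Plus equations.
append :: IList n -> IList m -> IList (Plus n m)
append Nil         ys = ys
append (Cons x xs) ys = Cons x (append xs ys)

toList :: IList n -> [Int]
toList Nil         = []
toList (Cons x xs) = x : toList xs

main :: IO ()
main = print (toList (append (Cons 1 (Cons 2 Nil)) (Cons 3 Nil)))
```

Checking append needs both local GADT equalities and type family reduction, which is the kind of combined constraint solving that a parameterized framework has to handle.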

30

OutsideIn(X) – new JFP submission

Substantial article that brings the results of a multi-year collaborative research program together

Many people involved over the years: Simon Peyton Jones, Tom Schrijvers (KU Leuven), Martin Sulzmann (Informatik Consulting Systems AG), Manuel Chakravarty (UNSW), Stephanie Weirich (Penn), Geoff Washburn (LogicBlox), …

Bonus: a new glorious constraint solver to instantiate X, which improves previous work, and for the first time shows how to deal with all of GHC's tricky features

31


[Figure: timeline of type expressivity, repeated]

32

What we did learn

We now know about:

Local assumptions [ICFP 2006, ICFP 2009, TLDI 2010]
Local definitions [TLDI 2010]
Generalizing Outside-In with OutsideIn(X) [TLDI 2010]

Where to from here?

33

2015 (And ideas for collaborations!)

… towards practical pluggable type systems + inference!

import UnitTheory.thy

data Vehicle = Vehicle { weight :: Int[kg]
                       , power  :: Int[hp]
                       , ... }
...

UnitTheory.thy
  A theory of units of measure: [Kennedy, ESOP94]
  constant kg, hp, sec, m
  axiom u*1 = u
  axiom u*v = v*u
  axiom …

A solver for UnitTheory constraints

Type checker/inference: OutsideIn(UnitTheory)

[Figure labels: DSL Designer / User; DSL User; We (the compiler); Yes/No; Programs with principal types]

Open: How to design syntactic language extensions

Open: How to trust solver [proof checking, certificates?]

Open: How to type more programs with principal types [revisiting rigid E-unification, better constraint solvers, ideas from SMT solving]

Open: How to combine multiple theories and solvers [revisiting Nelson-Oppen]

34

Understanding and writing better software

Past: What do GADTs mean? How many functions have type forall a. [a] -> a -> a, or forall a. Term a -> a -> a? [Vytiniotis & Weirich, MFPS XXIII; Vytiniotis & Weirich, JFP 2010]

Past: PL proofs are tedious and error-prone. Mechanize them in proof assistants. The POPLMark Challenge [TPHOLS 2005]. Have been using Isabelle/HOL and Coq in recent works with Claudio Russo and Andrew Kennedy

Ongoing: Typed intermediate languages that better support type equalities and full-blown dependent types [with S. Weirich, S. Zdancewic, S. Peyton Jones]

Ongoing: Adding probabilities to contracts to combine static analysis and testing or statistical methods [with V. Koutavas, TCD]

On the wish list: Macroscopically programming groups of agents of limited computational power

35

Q-A games for encoding and decoding

Imagine a binary format such that every bitstring encodes a non-empty set of type-safe CIL programs

Not easy to program from first principles!

Instead, understand and program encoders using question-answer games

Good coding scheme follows by asking good questions! Recent ICFP 2010 submission with A. Kennedy

[Figure: tree of questions (q, q1, q2, …) with yes/no branches, ending in “got it!”]

36

Programming language types

Making good software easier to write

[Figure axes: complexity, # bugs]

A demonstrably simple technology that can already eliminate lots of bugs

This talk: solving research problems to make types more effective and practical:

Catch more bugs
Require little user guidance
Remain predictable and modular