63
Type Inference David Walker COS 320

Type Inference David Walker COS 320. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs useful

Embed Size (px)

Citation preview

Page 1: Type Inference David Walker COS 320. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs useful

Type Inference

David Walker

COS 320

Page 2: Type Inference David Walker COS 320. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs useful

Criticisms of Typed Languages

Types overly constrain functions & datapolymorphism makes typed constructs useful in

more contextsuniversal polymorphism => code reusemodules & abstract types => code reusesubtyping => code reuse

Types clutter programs and slow down programmer productivity type inference

uninformative annotations may be omitted

Page 3: Type Inference David Walker COS 320. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs useful

Type Inference

Type Inference overview generation of type constraints from unannotated

simply-typed programs Fun has subtyping and still needed some annotations There is an algorithm for complete type inference for

MinML without subtyping (and of course for full ML, including parametric polymorphism)

solving type constraints

Page 4: Type Inference David Walker COS 320. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs useful

Type Schemes

A type scheme contains type variables (a, b, c) that may be filled in during type inferences ::= a | int | bool | s1 -> s2

A term scheme is a term that contains type schemes rather than proper typese ::= ... | fun f (x:s1) : s2 = e

Page 5: Type Inference David Walker COS 320. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs useful

Example

fun map (f, l) =

if null (l) then

nil

else

cons (f (hd l), map (f, tl l)))

Page 6: Type Inference David Walker COS 320. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs useful

Step 1: Add Type Schemes

fun map (f : a, l : b) : c =

if null (l) then

nil

else

cons (f (hd l), map (f, tl l)))

type schemeson functions

Page 7: Type Inference David Walker COS 320. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs useful

Step 2: Generate Constraints

fun map (f : a, l : b) : c =

if null (l) then

nil

else

cons (f (hd l), map (f, tl l)))

for eachexpression,perform type inference onsub expressions,assigning themtype schemesand generatingconstraintsthat mustbe solved

Page 8: Type Inference David Walker COS 320. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs useful

Step 2: Generate Constraints

fun map (f : a, l : b) : c =

if null (l) then

nil

else

cons (f (hd l), map (f, tl l)))

begin with theif expression &recursivelyperform type inference on itssubexpressions

Page 9: Type Inference David Walker COS 320. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs useful

Step 2: Generate Constraints

fun map (f : a, l : b) : c =

if null (l) then

nil

else

cons (f (hd l), map (f, tl l)))

b = b’ list since argument to null must be a list

Page 10: Type Inference David Walker COS 320. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs useful

Step 2: Generate Constraints

fun map (f : a, l : b) : c =

if null (l) then

nil

else

cons (f (hd l), map (f, tl l)))

: d list since nil is some kind of list

constraintsb = b’ list

Page 11: Type Inference David Walker COS 320. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs useful

Step 2: Generate Constraints

fun map (f : a, l : b) : c =

if null (l) then

nil

else

cons (f (hd l), map (f, tl l)))

: d list

constraintsb = b’ list

b = b’’ list b = b’’’ list

since hd and tl are functions that take list arguments

Page 12: Type Inference David Walker COS 320. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs useful

Step 2: Generate Constraints

fun map (f : a, l : b) : c =

if null (l) then

nil

else

cons (f (hd l : b’’), map (f, tl l : b’’’ list)))

: d list

constraintsb = b’ listb = b’’ listb = b’’’ list

b = b’’’ lista = a

Page 13: Type Inference David Walker COS 320. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs useful

Step 2: Generate Constraints

fun map (f : a, l : b) : c =

if null (l) then

nil

else

cons (f (hd l : b’’) : a’, map (f, tl l) : c))

: d list

constraintsb = b’ listb = b’’ listb = b’’’ lista = ab = b’’’ list

a = b’’ -> a’

Page 14: Type Inference David Walker COS 320. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs useful

Step 2: Generate Constraints

fun map (f : a, l : b) : c =

if null (l) then

nil

else cons (f (hd l : b’’) : a’, map (f, tl l) : c)) : c’ list

: d list

constraintsb = b’ listb = b’’ listb = b’’’ lista = ab = b’’’ lista = b’’ -> a’

c = c’ lista’ = c’

Page 15: Type Inference David Walker COS 320. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs useful

Step 2: Generate Constraints

fun map (f : a, l : b) : c =

if null (l) then

nil

else cons (f (hd l : b’’) : a’, map (f, tl l) : c)) : c’ list

: d list

constraintsb = b’ listb = b’’ listb = b’’’ lista = ab = b’’’ lista = b’’ -> a’

c = c’ lista’ = c’d list = c’ list

Page 16: Type Inference David Walker COS 320. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs useful

Step 2: Generate Constraints

fun map (f : a, l : b) : c =

if null (l) then

nil

else cons (f (hd l : b’’) : a’, map (f, tl l) : c)) : c’ list

: d list

: d list

constraintsb = b’ listb = b’’ listb = b’’’ lista = ab = b’’’ lista = b’’ -> a’

c = c’ lista’ = c’d list = c’ listd list = c

Page 17: Type Inference David Walker COS 320. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs useful

Step 2: Generate Constraints

fun map (f : a, l : b) : c =

if null (l) then

nil

else

cons (f (hd l), map (f, tl l)))

finalconstraintsb = b’ listb = b’’ listb = b’’’ lista = ab = b’’’ lista = b’’ -> a’c = c’ lista’ = c’d list = c’ listd list = c

Page 18: Type Inference David Walker COS 320. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs useful

Step 3: Solve Constraints

finalconstraintsb = b’ listb = b’’ listb = b’’’ lista = a...

Constraint solution provides all possible solutions to type scheme annotations on terms

solutiona = b’ -> c’b = b’ listc = c’ list

map (f : a -> b x : a list): b list = ...

Page 19: Type Inference David Walker COS 320. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs useful

Step 4: Generate types

Generate types from type schemesOption 1: pick an instance of the most

general type when we have completed type inference on the entire program

map : (int -> int) -> int list -> int listOption 2: generate polymorphic types for

program parts and continue (polymorphic) type inference

map : ForAll (a,b,c) (a -> b) -> a list -> b list

Page 20: Type Inference David Walker COS 320. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs useful

Type Inference Details

Type constraints q are sets of equations between type schemesq ::= {s11 = s12, ..., sn1 = sn2}

eg: {b = b’ list, a = b -> c}

Page 21: Type Inference David Walker COS 320. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs useful

Constraint Generation

Syntax-directed constraint generationour algorithm crawls over abstract syntax of

untyped expressions and generatesa term schemea set of constraints

Algorithm defined as set of inference rules programming notation: tc(G, u) = (e, t, q)math notation: G |-- u => e : t, q

inputs outputs

Page 22: Type Inference David Walker COS 320. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs useful

Constraint Generation

G |-- x ==> x :

G |-- 3 ==> 3 :

G |-- true ==> true :

G |-- false ==> false :

Page 23: Type Inference David Walker COS 320. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs useful

Constraint Generation

G |-- x ==> x : s, { } (if G(x) = s)

G |-- 3 ==> 3 :

G |-- true ==> true :

G |-- false ==> false :

Page 24: Type Inference David Walker COS 320. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs useful

Constraint Generation

G |-- x ==> x : s, { } (if G(x) = s)

G |-- 3 ==> 3 : int, { }

G |-- true ==> true :

G |-- false ==> false :

Page 25: Type Inference David Walker COS 320. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs useful

Constraint Generation

G |-- x ==> x : s, { } (if G(x) = s)

G |-- 3 ==> 3 : int, { }

G |-- true ==> true : bool, { }

G |-- false ==> false : bool, { }

Page 26: Type Inference David Walker COS 320. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs useful

Operators

------------------------------------------------------------------------G |-- u1 + u2 ==>

Page 27: Type Inference David Walker COS 320. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs useful

Operators

G |-- u1 ==> G |-- u2 ==> ------------------------------------------------------------------------G |-- u1 + u2 ==>

Page 28: Type Inference David Walker COS 320. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs useful

Operators

G |-- u1 ==> e1 G |-- u2 ==> e2 ------------------------------------------------------------------------G |-- u1 + u2 ==>

Page 29: Type Inference David Walker COS 320. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs useful

Operators

G |-- u1 ==> e1 G |-- u2 ==> e2 ------------------------------------------------------------------------G |-- u1 + u2 ==> e1 + e2

Page 30: Type Inference David Walker COS 320. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs useful

Operators

G |-- u1 ==> e1 G |-- u2 ==> e2 ------------------------------------------------------------------------G |-- u1 + u2 ==> e1 + e2 : int

Page 31: Type Inference David Walker COS 320. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs useful

Operators

G |-- u1 ==> e1 : t1, q1 G |-- u2 ==> e2 : ------------------------------------------------------------------------G |-- u1 + u2 ==> e1 + e2 : int, q1 U {t1 = int}

Page 32: Type Inference David Walker COS 320. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs useful

Operators

G |-- u1 ==> e1 : t1, q1 G |-- u2 ==> e2 : t2, q2------------------------------------------------------------------------G |-- u1 + u2 ==> e1 + e2 : int, q1 U q2 U {t1 = int, t2 = int}

Page 33: Type Inference David Walker COS 320. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs useful

Operators

------------------------------------------------------------------------G |-- u1 < u2 ==>

Page 34: Type Inference David Walker COS 320. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs useful

Operators

G |-- u1 ==> e1 : t1, q1 G |-- u2 ==> e2 : t2, q2------------------------------------------------------------------------G |-- u1 < u2 ==> e1 < e2 : bool,

Page 35: Type Inference David Walker COS 320. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs useful

Operators

G |-- u1 ==> e1 : t1, q1 G |-- u2 ==> e2 : t2, q2------------------------------------------------------------------------G |-- u1 < u2 ==> e1 < e2 : bool,

Page 36: Type Inference David Walker COS 320. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs useful

Operators

G |-- u1 ==> e1 : t1, q1 G |-- u2 ==> e2 : t2, q2------------------------------------------------------------------------G |-- u1 < u2 ==> e1 + e2 : bool, q1 U q2 U {t1 = int, t2 = int}

Page 37: Type Inference David Walker COS 320. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs useful

If statements

----------------------------------------------------------------G |-- if u1 then u2 else u3 ==>

Page 38: Type Inference David Walker COS 320. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs useful

If statements

G |-- u1 ==> e1 : t1, q1G |-- u2 ==> e2 : t2, q2G |-- u3 ==> e3 : t3, q3----------------------------------------------------------------G |-- if u1 then u2 else u3 ==> if e1 then e2 else e3

:

Page 39: Type Inference David Walker COS 320. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs useful

If statements

G |-- u1 ==> e1 : t1, q1G |-- u2 ==> e2 : t2, q2G |-- u3 ==> e3 : t3, q3----------------------------------------------------------------G |-- if u1 then u2 else u3 ==> if e1 then e2 else e3

: a, q1 U q2 U q3 U {t1 = bool, a = t2, a = t3}

Page 40: Type Inference David Walker COS 320. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs useful

Function Application

G |-- u1 ==> e1 : t1, q1G |-- u2 ==> e2 : t2, q2----------------------------------------------------------------G |-- u1 u2==> e1 e2

: a, q1 U q2 U {t1 = t2 -> a}

Page 41: Type Inference David Walker COS 320. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs useful

Function Application

G |-- u1 ==> e1 : t1, q1G |-- u2 ==> e2 : t2, q2----------------------------------------------------------------G |-- u1 u2==> e1 e2

: a,

Page 42: Type Inference David Walker COS 320. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs useful

Function Declaration

G, f : a -> b, x : a |-- u ==> e : t, q----------------------------------------------------------------G |-- fun f(x) is u end ==> fun f (x : a) : b is e end

: a -> b, q U {t = b}

Page 43: Type Inference David Walker COS 320. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs useful

Function Declaration

G, f : a -> b, x : a |-- u ==> e : t, q----------------------------------------------------------------G |-- fun f(x) = u ==> fun f (x : a) : b = e

:

Page 44: Type Inference David Walker COS 320. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs useful

Solving Constraints

A solution to a system of type constraints is a substitution Sa function from type variables to type

schemes, eg:S(a) = aS(b) = intS(c) = a -> a

Page 45: Type Inference David Walker COS 320. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs useful

Substitutions

Applying a substitution S:substitute(S,int) = int

substitute(S,s1 -> s2) = substitute(s1) -> substitute(s2)

substitute(S,a) = s (if S(a) = s)

Due to laziness, I will write S(s) instead of

substitute(S, s)

Page 46: Type Inference David Walker COS 320. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs useful

Unification

Unification: An algorithm that provides the principal solution to a set of constraints (if one exists)Unification systematically simplifies a set of

constraints, yielding a substitutionStarting state of unification process: (Id,q)Final state of unification process: (S, { })

Page 47: Type Inference David Walker COS 320. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs useful

Unification Machine

We can specify unification as a rewriting system:Given (S, q), systematically pick a

constraint from q and simplify it, possibly adding to the substitution

Use inductive definitions to specify rewriting.

Judgment form: (S,q) -> (S’,q’)

Page 48: Type Inference David Walker COS 320. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs useful

Unification Machine

Base types & simple variables:

--------------------------------(S,{int=int} U q) -> ???

Page 49: Type Inference David Walker COS 320. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs useful

Unification Machine

Base types & simple variables:

--------------------------------(S,{int=int} U q) -> (S, q)

Page 50: Type Inference David Walker COS 320. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs useful

Unification Machine

Base types & simple variables:

--------------------------------(S,{int=int} U q) -> (S, q)

------------------------------------(S,{bool=bool} U q) -> ???

-----------------------------(S,{a=a} U q) -> ???

Page 51: Type Inference David Walker COS 320. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs useful

Unification Machine

Base types & simple variables:

--------------------------------(S,{int=int} U q) -> (S, q)

------------------------------------(S,{bool=bool} U q) -> (S, q)

-----------------------------(S,{a=a} U q) -> (S, q)

Page 52: Type Inference David Walker COS 320. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs useful

Unification Machine

Variable definitions

--------------------------------------------- (a not in s)(S,{a=s} U q) -> (S[a=s], [s/a]q)

-------------------------------------------- (a not in s)(S,{s=a} U q) -> (S[a=s], [s/a]q)

Page 53: Type Inference David Walker COS 320. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs useful

Unification Machine

Functions:----------------------------------------------(S,{s11 -> s12= s21 -> s22} U q) -> ???

Page 54: Type Inference David Walker COS 320. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs useful

Unification Machine

Functions:----------------------------------------------(S,{s11 -> s12= s21 -> s22} U q) -> (S, {s11 = s21, s12 = s22} U q)

Page 55: Type Inference David Walker COS 320. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs useful

Occurs Check

What is the solution to {a = a -> a}?

Page 56: Type Inference David Walker COS 320. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs useful

Occurs Check

What is the solution to {a = a -> a}?There is none! The occurs check detects this situation

-------------------------------------------- (a not in s)(S,{s=a} U q) -> (S[a=s], [s/a]q)

occurs check

Page 57: Type Inference David Walker COS 320. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs useful

Unification Machine Gets Stuck

Recall: final states have the form (S, { })

Stuck states (S,q) are such that every equation in q has the form: int = bools1 -> s2 = s (s not function type)a = s (s contains a)or is symmetric to one of the above

Stuck states arise when constraints are unsolvable

Page 58: Type Inference David Walker COS 320. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs useful

Termination

We want unification to terminate (to give us a type reconstruction algorithm)

In other words, we want to show that there is no infinite sequence of states (S1,q1) -> (S2,q2) -> ...

Page 59: Type Inference David Walker COS 320. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs useful

Termination

We associate an ordering with constraintsq < q’ if and only if

q contains fewer variables than q’q contains the same number of variables as q’ but

fewer type constructors (ie: fewer occurrences of int, bool, or “->”)

This is a lexicographic orderingwe can prove (by contradiction) that there is no

infinite decreasing sequence of constraints

Page 60: Type Inference David Walker COS 320. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs useful

TerminationLemma: Every step reduces the size of qProof: Examine each case of the definition of

the reduction relation & show that our metric goes down.

--------------------------------(S,{int=int} U q) -> (S, q)

------------------------------------(S,{bool=bool} U q) -> (S, q)

-----------------------------(S,{a=a} U q) -> (S, q)

----------------------------------------------(S,{s11 -> s12= s21 -> s22} U q) -> (S, {s11 = s21, s12 = s22} U q)

------------------------ (a not in FV(s))(S,{a=s} U q) -> (S[a=s], [s/a]q)

Page 61: Type Inference David Walker COS 320. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs useful

Properties of Solutions

given any set of constraints q, our unification machine reduces (Id, q) to (S, { }) such that S(q) is a set of true equations (eg, int = int, bool ->

a = bool -> a, etc) S is a principal solution to the constraints.

it generates more general types than any other substitution T:

ie: a -> a is more general than int -> int ie: T replaces more type variables with more specific

types

Page 62: Type Inference David Walker COS 320. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs useful

Summary: Type Inference

Type inference algorithm.Given a context G, and untyped term u:

Find e, t, q such that G |- u ==> e : t, qFind principal solution S of q via unification

if no solution exists, there is no way to type check the term

Apply S to e, ie our solution is S(e) S(e) contains schematic type variables a,b,c, etc that may

be instantiated with any type

Since S is principal, S(e) characterizes all possible typings of e

Page 63: Type Inference David Walker COS 320. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs useful

Compare with Subtyping Algorithm

Simple type inference algorithm. Given a context G (mapping variables to type schemes), and

untyped term u: Find e, t, q such that G |- u ==> e : t, q Type equations collected on the side

Subtyping algorithm given a context G (mapping variables to full types)

Find t such that G |-- e : t Subtyping equations solved when they are generated

Lots of current research on type inference that combines subtyping and polymorphic type reconstruction for advanced type systems