50
Type Inference David Walker CS 510, Fall 2002

Type Inference David Walker CS 510, Fall 2002. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs

  • View
    217

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Type Inference David Walker CS 510, Fall 2002. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs

Type Inference

David Walker

CS 510, Fall 2002

Page 2: Type Inference David Walker CS 510, Fall 2002. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs

Criticisms of Typed Languages

Types overly constrain functions & datapolymorphism makes typed constructs useful in

more contextsuniversal polymorphism => code reuseexistential polymorphism => rep. independencesubtyping => coming soon

Types clutter programs and slow down programmer productivity type inference

uninformative annotations may be omitted

Page 3: Type Inference David Walker CS 510, Fall 2002. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs

Type Inference

Todayoverviewgeneration of type constraints from

unannotated simply-typed programssolving type constraints

Next timeML-style polymorphic type inferenceType inference for System F

Page 4: Type Inference David Walker CS 510, Fall 2002. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs

Type Schemes

A type scheme contains type variables that may be filled in during type inferences ::= a | int | bool | s1 -> s2

A term scheme is a term that contains type schemes rather than proper typese ::= ... | fun f (x:s1) : s2 is e end

Page 5: Type Inference David Walker CS 510, Fall 2002. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs

Example

fun map (f, l) is

if null (l) then

nil

else

cons (f (hd l), map (f, tl l)))

end

Page 6: Type Inference David Walker CS 510, Fall 2002. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs

Step 1: Add Type Schemes

fun map (f : a, l : b) : c is

if null (l) then

nil

else

cons (f (hd l), map (f, tl l)))

end

type schemeson functions

Page 7: Type Inference David Walker CS 510, Fall 2002. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs

Step 2: Generate Constraints

fun map (f : a, l : b) : c is

if null (l) then

nil

else

cons (f (hd l), map (f, tl l)))

end

b = b’ list

Page 8: Type Inference David Walker CS 510, Fall 2002. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs

Step 2: Generate Constraints

fun map (f : a, l : b) : c is

if null (l) then

nil

else

cons (f (hd l), map (f, tl l)))

end

: d list

constraintsb = b’ list

Page 9: Type Inference David Walker CS 510, Fall 2002. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs

Step 2: Generate Constraints

fun map (f : a, l : b) : c is

if null (l) then

nil

else

cons (f (hd l), map (f, tl l)))

end

: d list

constraintsb = b’ list

b = b’’ list b = b’’’ list

Page 10: Type Inference David Walker CS 510, Fall 2002. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs

Step 2: Generate Constraints

fun map (f : a, l : b) : c is

if null (l) then

nil

else

cons (f (hd l : b’’), map (f, tl l : b’’’ list)))

end

: d list

constraintsb = b’ listb = b’’ listb = b’’’ list

b = b’’’ lista = a

Page 11: Type Inference David Walker CS 510, Fall 2002. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs

Step 2: Generate Constraints

fun map (f : a, l : b) : c is

if null (l) then

nil

else

cons (f (hd l : b’’) : a’, map (f, tl l) : c))

end

: d list

constraintsb = b’ listb = b’’ listb = b’’’ lista = ab = b’’’ list

a = b’’ -> a’

Page 12: Type Inference David Walker CS 510, Fall 2002. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs

Step 2: Generate Constraints

fun map (f : a, l : b) : c is

if null (l) then

nil

else cons (f (hd l : b’’) : a’, map (f, tl l) : c)) : c’ list

end

: d list

constraintsb = b’ listb = b’’ listb = b’’’ lista = ab = b’’’ lista = b’’ -> a’

c = c’ lista’ = c’

Page 13: Type Inference David Walker CS 510, Fall 2002. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs

Step 2: Generate Constraints

fun map (f : a, l : b) : c is

if null (l) then

nil

else cons (f (hd l : b’’) : a’, map (f, tl l) : c)) : c’ list

end

: d list

constraintsb = b’ listb = b’’ listb = b’’’ lista = ab = b’’’ lista = b’’ -> a’

c = c’ lista’ = c’d list = c’ list

Page 14: Type Inference David Walker CS 510, Fall 2002. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs

Step 2: Generate Constraints

fun map (f : a, l : b) : c is

if null (l) then

nil

else cons (f (hd l : b’’) : a’, map (f, tl l) : c)) : c’ list

: d list

end

: d list

constraintsb = b’ listb = b’’ listb = b’’’ lista = ab = b’’’ lista = b’’ -> a’

c = c’ lista’ = c’d list = c’ listd list = c

Page 15: Type Inference David Walker CS 510, Fall 2002. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs

Step 2: Generate Constraints

fun map (f : a, l : b) : c is

if null (l) then

nil

else

cons (f (hd l), map (f, tl l)))

end

finalconstraintsb = b’ listb = b’’ listb = b’’’ lista = ab = b’’’ lista = b’’ -> a’c = c’ lista’ = c’d list = c’ listd list = c

Page 16: Type Inference David Walker CS 510, Fall 2002. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs

Step 3: Solve Constraints

finalconstraintsb = b’ listb = b’’ listb = b’’’ lista = a...

Constraint solution provides all possible solutions to type scheme annotations on terms

solutiona = b’ -> c’b = b’ listc = c’ list

map (f : a -> b x : a list): b listis ...end

Page 17: Type Inference David Walker CS 510, Fall 2002. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs

Step 4: Generate types

Generate types from type schemesOption 1: pick an instance of the most

general type when we have completed type inference on the entire program

map : (int -> int) -> int list -> int listOption 2: generate polymorphic types for

program parts and continue (polymorphic) type inference

map : All (a,b,c) (a -> b) -> a list -> b list

Page 18: Type Inference David Walker CS 510, Fall 2002. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs

Type Inference Details

Type constraints are sets of equations between type schemesq ::= {s11 = s12, ..., sn1 = sn2}

eg: {b = b’ list, a = b -> c}

Page 19: Type Inference David Walker CS 510, Fall 2002. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs

Constraint Generation

Syntax-directed constraint generationour algorithm crawls over abstract syntax

of untyped expressions and generatesa term schemea set of constraints

Algorithm defined as set of inference rules (as always)G |-- u => e : t, q

Page 20: Type Inference David Walker CS 510, Fall 2002. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs

Constraint Generation

Simple rules:G |-- x ==> x : s, { } (if G(x) = s)

G |-- 3 ==> 3 : int, { } (same for other ints)

G |-- true ==> true : bool, { }

G |-- false ==> false : bool, { }

Page 21: Type Inference David Walker CS 510, Fall 2002. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs

Operators

G |-- u1 ==> e1 : t1, q1 G |-- u2 ==> e2 : t2, q2------------------------------------------------------------------------G |-- u1 + u2 ==> e1 + e2 : int, q1 U q2 U {t1 = int, t2 = int}

G |-- u1 ==> e1 : t1, q1 G |-- u2 ==> e2 : t2, q2------------------------------------------------------------------------G |-- u1 < u2 ==> e1 + e2 : bool, q1 U q2 U {t1 = int, t2 = int}

Page 22: Type Inference David Walker CS 510, Fall 2002. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs

If statements

G |-- u1 ==> e1 : t1, q1G |-- u2 ==> e2 : t2, q2G |-- u3 ==> e3 : t3, q3----------------------------------------------------------------G |-- if u1 then u2 else u3 ==> if e1 then e2 else e3

: a, q1 U q2 U q3 U {t1 = bool, a = t2, a = t3}

Page 23: Type Inference David Walker CS 510, Fall 2002. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs

Function Application

G |-- u1 ==> e1 : t1, q1G |-- u2 ==> e2 : t2, q2----------------------------------------------------------------G |-- u1 u2==> e1 e2

: a, q1 U q2 U {t1 = t2 -> a}

Page 24: Type Inference David Walker CS 510, Fall 2002. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs

Function Declaration

G, f : a -> b, x : a |-- u ==> e : t, q----------------------------------------------------------------G |-- fun f(x) is u end ==> fun f (x : a) : b is e end

: a -> b, q U {t = b}

Page 25: Type Inference David Walker CS 510, Fall 2002. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs

Solving Constraints

A solution to a system of type constraints is a substitution Sa function from type variables to typesbecause some things get simpler, we

assume substitutions are defined on all type variables:

S(a) = a (for almost all variables a)S(a) = s (for some type scheme s)

dom(S) = set of variables s.t. S(a) a

Page 26: Type Inference David Walker CS 510, Fall 2002. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs

Substitutions

Given a substitution S, we can define a function S* from type schemes (as opposed to type variables) to type schemes:S*(int) = intS*(s1 -> s2) = S*(s1) -> S*(s2)S*(a) = S(a)

Due to laziness, I will write S(s) instead of S*(s)

Page 27: Type Inference David Walker CS 510, Fall 2002. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs

Composition of Substitutions

Composition (U o S) applies the substitution S and then applies the substitution U: (U o S)(a) = U(S(a))

We will need to compare substitutionsT <= S if T is “more specific” than ST <= S if T is “less general” than SFormally: T <= S if and only if T = U o S for

some U

Page 28: Type Inference David Walker CS 510, Fall 2002. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs

Composition of Substitutions

Examples:example 1: any substitution is less general than

the identity substitution I:S <= I because S = S o I

example 2:S(a) = int, S(b) = c -> cT(a) = int, T(b) = int -> intwe conclude: T <= S if T(a) = int, T(b) = int -> bool then T is unrelated to S

(neither more nor less general)

Page 29: Type Inference David Walker CS 510, Fall 2002. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs

Solving a Constraint

S |= q if S is a solution to the constraints q

S(s1) = S(s2) S |= q-----------------------------------S |= {s1 = s2} U q

----------S |= { }

any substitution isa solution for the emptyset of constraints

a solution to an equationis a substitution that makesleft and right sides equal

Page 30: Type Inference David Walker CS 510, Fall 2002. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs

Most General Solutions

S is the principal (most general) solution of a constraint q ifS |= q (it is a solution) if T |= q then T <= S (it is the most general one)

Lemma: If q has a solution, then it has a most general one

We care about principal solutions since they will give us the most general types for terms

Page 31: Type Inference David Walker CS 510, Fall 2002. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs

Examples

Example 1q = {a=int, b=a}principal solution S:

Page 32: Type Inference David Walker CS 510, Fall 2002. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs

Examples

Example 1q = {a=int, b=a}principal solution S:

S(a) = S(b) = intS(c) = c (for all c other than a,b)

Page 33: Type Inference David Walker CS 510, Fall 2002. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs

Examples

Example 2q = {a=int, b=a, b=bool}principal solution S:

Page 34: Type Inference David Walker CS 510, Fall 2002. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs

Examples

Example 2q = {a=int, b=a, b=bool}principal solution S:

does not exist (there is no solution to q)

Page 35: Type Inference David Walker CS 510, Fall 2002. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs

Unification

Unification: An algorithm that provides the principal solution to a set of constraints (if one exists)Unification systematically simplifies a set of

constraints, yielding a substitutionStarting state of unification process: (I,q)Final state of unification process: (S, { })

Page 36: Type Inference David Walker CS 510, Fall 2002. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs

Unification Machine

We can specify unification as a transition system: (S,q) -> (S’,q’)

Base types & simple variables:

--------------------------------(S,{int=int} U q) -> (S, q)

------------------------------------(S,{bool=bool} U q) -> (S, q)

-----------------------------(S,{a=a} U q) -> (S, q)

Page 37: Type Inference David Walker CS 510, Fall 2002. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs

Unification Machine

Functions:

Variable definitions

----------------------------------------------(S,{s11 -> s12= s21 -> s22} U q) -> (S, {s11 = s21, s12 = s22} U q)

--------------------------------------------- (a not in FV(s))(S,{a=s} U q) -> ([a=s] o S, [s/a]q)

-------------------------------------------- (a not in FV(s))(S,{s=a} U q) -> ([a=s] o S, [s/a]q)

Page 38: Type Inference David Walker CS 510, Fall 2002. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs

Occurs Check

What is the solution to {a = a -> a}?

Page 39: Type Inference David Walker CS 510, Fall 2002. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs

Occurs Check

What is the solution to {a = a -> a}?There is none! (Remember your

homework)The occurs check detects this situation

-------------------------------------------- (a not in FV(s))(S,{s=a} U q) -> ([a=s] o S, [s/a]q)

occurs check

Page 40: Type Inference David Walker CS 510, Fall 2002. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs

Irreducible States

Recall: final states have the form (S, { })

Stuck states (S,q) are such that every equation in q has the form: int = bools1 -> s2 = s (s not function type)a = s (s contains a)or is symmetric to one of the above

Stuck states arise when constraints are unsolvable

Page 41: Type Inference David Walker CS 510, Fall 2002. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs

Termination

We want unification to terminate (to give us a type reconstruction algorithm)

In other words, we want to show that there is no infinite sequence of states (S1,q1) -> (S2,q2) -> ...

Page 42: Type Inference David Walker CS 510, Fall 2002. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs

Termination

We associate an ordering with constraintsq < q’ if and only if

q contains fewer variables than q’q contains the same number of variables as q’ but

fewer type constructors (ie: fewer occurrences of int, bool, or “->”)

This is a lexacographic orderingwe can prove (by contradiction) that there is no

infinite decreasing sequence of constraints

Page 43: Type Inference David Walker CS 510, Fall 2002. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs

Termination

Lemma: Every step reduces the size of qProof: By cases (ie: induction) on the definition

of the reduction relation.

--------------------------------(S,{int=int} U q) -> (S, q)

------------------------------------(S,{bool=bool} U q) -> (S, q)

-----------------------------(S,{a=a} U q) -> (S, q)

----------------------------------------------(S,{s11 -> s12= s21 -> s22} U q) -> (S, {s11 = s21, s12 = s22} U q)

------------------------ (a not in FV(s))(S,{a=s} U q) -> ([a=s] o S, [s/a]q)

Page 44: Type Inference David Walker CS 510, Fall 2002. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs

Complete Solutions

A complete solution for (S,q) is a substitution T such thatT <= ST |= q

Page 45: Type Inference David Walker CS 510, Fall 2002. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs

Properties of Solutions

Lemma 1:Every final state (S, { }) has a complete

solution. It is S:

S <= S S |= { }

Page 46: Type Inference David Walker CS 510, Fall 2002. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs

Properties of Solutions

Lemma 2No stuck state has a complete solution (or

any solution at all) it is impossible for a substitution to make the

necessary equations equal int bool int t1 -> t2 ...

Page 47: Type Inference David Walker CS 510, Fall 2002. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs

Properties of Solutions

Lemma 3 If (S,q) -> (S’,q’) then

T is complete for (S,q) iff T is complete for (S’,q’)

proof by? in the forward direction, this is the preservation

theorem for the unification machine!

Page 48: Type Inference David Walker CS 510, Fall 2002. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs

Summary: Unification

By termination, (I,q) ->* (S,q’) where (S,q’) is irreducible. Moreover: If q’ = { } then

(S,q’) is final (by definition)S is a principal solution for q

Consider any T such that T is a solution to q. Now notice, S is complete for (S,q’) (by lemma 1) S is complete for (I,q) (by lemma 3) Since S is complete for (I,q), T <= S and therefore S

is principal.

Page 49: Type Inference David Walker CS 510, Fall 2002. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs

Summary: Unification (cont.)

... Moreover: If q’ is not { } (and (I,q) ->* (S,q’) where

(S,q’) is irreducible) then (S,q) is stuck. Consequently, (S,q) has no

complete solution. By lemma 3, even (I,q) has no complete solution and therefore q has no solution at all.

Page 50: Type Inference David Walker CS 510, Fall 2002. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs

Summary: Type Inference

Type inference algorithm.Given a context G, and untyped term u:

Find e, t, q such that G |- u ==> e : t, qFind principal solution S of q via unification

if no solution exists, there is no reconstruction

Apply S to e, ie our solution is S(e) S(e) contains schematic type variables a,b,c, etc that

may be instantiated with any type

Since S is principal, S(e) characterizes all reconstructions.