Type Inference with Run-time Logs

Type Inferencewith Run-time Logs

Ravi Chugh, Ranjit Jhala, Sorin LernerUniversity of California, San Diego

Motivation• Dynamically-typed languages– Rapid prototyping– Facilitate inter-language development

• Statically-typed languages– Prevent certain run-time errors– Enable optimized execution– Provide checked documentation

• Gradual type systems offer a spectrum– Provides hope for evolving existing programs– But …

2

... We Need Inference!

• Goal: migrate programs from dynamic langs

• Programmer annotation burden is currently 0

• A migration strategy must require ~0 work– Even modifying 1-10% LOC in a large codebase

can be prohibitive

3

The Challenge: Inference

4

Goal: practical type system

polymorphism

def id(x) { return x };1 + id(2);“hello” ^ id(“world”);

polymorphism

id : ∀X. X → X


5


subtypingpolymorphism

def succA(x) { return (1 + x.a)};succA({a=1});succA({a=1;b=“hi”});

subtyping

{a:Int;b:Str} <:

{a:Int}


bounded quantification

6



def incA(x) { x.a := 1 + x.a; return x};incA({a=1}).a;incA({a=1;b=“hi”}).b;


incA : ∀X<:{a:Int}. X →

X



7



dynamic

n := if b then 0 else “bad”;m := if b then n + 1 else 0;

dynamic

n : dynamic



8



dynamicother features...



9




System ML ✓



10




System F ✗

The Idea

• Use run-time executions to help inference

• Program may have many valid types– But particular execution might rule out some

• Dynamic language programmers test a lot– Use existing test suites to help migration

11

Route: Inference w/ Run-time Logs


12

subtyping+ polymorphism

omit higher-order functions

System E

System E≤

System E−

First Stop: System E−


13


omit higher-order functions

System E

System E≤

System E−

E− Type System

• Expression and function types

τ ::= | Int | Bool | ... | {fi:τi}

| X

σ ::= ∀Xi. τ1 → τ2

• Typing rules prevent field-not-found and primitive operation errors

14

def id (a) { a }

def id[A] (a:A) { a } : ∀A. A → A

def id[] (a:Int) { a } : Int → Int

def id[B,C] (x:B*C) { a } : ∀B,C. B*C → B*C

• Infinitely many valid types for id...15

• ... but ∀A. A → A is the principal type

16

∀A. A → A

∀B,C. B*C → B*CInt → Int

• ... but ∀A. A → A is the principal type

• Allows id to be used in most different ways17

∀B. B*B → B*B

Int*Int → Int*IntInt*Bool → Int*Bool

∀A. A → A

∀B,C. B*C → B*CInt → Int

def readF (o) { o.f }

def readF[F] (o:{f:F}) { o.f } : ∀F. {f:F} → F

• This is the best type

18

∀F. {f:F} → F

∀F,G. {f:F;g:G} → F{f:Int} → Int

∀G. {f:Int;g:G} → Int

def hasF (x) { let _ = x.f in x }

• Two valid types:

• Neither is better than the other 19

∀F. {f:F} → {f:F} ∀F,G. {f:F;g:G} → {f:F;g:G}

allowshasF({f=1;g=2}).g

allowshasF({f=1})

E− Static Type Inference

• E− lacks principal types– Cannot assign type just from definition– Need to consider calling contexts

• Our approach is iterative– Impose minimal constraints on argument– If a calling context requires an additional field to

be tracked, backtrack and redo the function

20

2121

def hasF(x) { let _ = x.f in x } def main1() { hasF({f=1}) }

Iterative Inference – Example 1

X <: {f:F}

hasF : ∀F. {f:F} → {f:F}x

Constraints on x Solution

X = {f:F}x

✓

“this record type came from the program variable x”

2222

def hasF(x) { let _ = x.f in x } def main2() { let z = {f=1;g=2} in hasF(z).g }


X <: {f:F}

hasF : ∀F. {f:F} → {f:F}x


X = {f:F}x

✗*

hasF(z) : {f:Int}x so can’t project on g, unless...

X <: {g:G}

2323



X <: {f:F}

hasF : ∀F,G. {f:F;g:G} → {f:F;g:G}x


X <: {g:G}X = {f:F;g:G}x

✓

2424

def hasF(x) { let _ = x.f in x } def main3() { let z = {f=1;g=2} in hasF(z).g;

hasF({f=1}) }


X <: {f:F}

Constraints on x

X <: {g:G}


Solution

X = {f:F;g:G}x

2525

def hasF(x) { let _ = x.f in x } def main3() { let z = {f=1;g=2} in hasF(z).g;

hasF({f=1}) }


X <: {f:F}

Constraints on x

X <: {g:G}


Solution

X = {f:F;g:G}x

✓✗

E− Type Inference with Run-time Logs• Evaluation can log caller-induced constraints

1. Wrap all values with sets of type variables

2. When a value passed to arg x, add Xx tag

3. When a value with tag Xx projected on field f, record Xx <: {f : Xx.f}

1.Modified inference finds all caller-induced constraints in run-time log

26

27


main2()⇒ let z = {f=1;g=2} in hasF(z).g⇒ let z = [{f=1;g=2},{}] in hasF(z).g⇒ hasF([{f=1;g=2},{Xx}]).g

⇒ (let _ = [{f=1;g=2},{Xx}].f in [{f=1;g=2},{Xx}]).g

⇒ (let _ = [1,{Xx.f}] in [{f=1;g=2},{Xx}]).g

⇒ [{f=1;g=2},{Xx}].g

⇒ [2,{Xx.g}]

Evaluation

Run-time log

Xx <: {g : Xx.g}

Xx <: {f : Xx.f}Caller-inducedconstraint

System E− Summary

28

• Fully static inference needs to iterate, but can optimize with run-time information

• If all expressions executed, then– log contains all caller-induced constraints– no need for iteration

Next Stop: System E


29


System ESystem E

System E≤

System E−

If-expressions• Type is a supertype of branch types

30

if b then 1 else 2 : Int

if b then {f=1; g=“”} else {f=2; h=true}

: {f:Int}

if b then {f=“”} else {f=true} : {f:Top}

if b then 1 else true

: Top

def choose (y,z) { if y.n > 0 then y else z }

• Several incomparable types:

31

∀A. {n:Int}*Top → Top

∀A. {n:Int}*{n:Top} → {n:Top}

{n:Int}*{n:Int} → {n:Int}

∀B. {n:Int;b:B}*{n:Int;b:B} →

{n:Int;b:B}

∀B. {n:Int;b:B}*{b:B} → {b:B}

3232

def choose(y,z) { if1 y.n > 0 then y else z }

def main4() { let t = choose({n=1},{n=2}) in

succ t.n }

Iterative Inference – Example

Constraints

“this type came from theif-expression 1”

N = IntY <: {n:N}

choose : {n:Int}*Top → Top1 Iteration 0

3333



succ t.n }


Y <: {n:N}

choose : {n:Int}*Top → Top1

Constraints

Iteration 0choose : {n:Int}*{n:Top} → {n:Top1.n}1

Iteration 1

N = Int Z <: {n:M}

✗* t : Top1

“this value came from the n field of the record returned by if-expression 1”

3434



succ t.n }


Y <: {n:N}


Constraints


Iteration 1

choose : {n:Int}*{n:Int} → {n:Int}1 Iteration 2

N = Int Z <: {n:M} N = M

✗* t.n : Top1.n

3535



succ t.n }


Y <: {n:N}


Constraints


Iteration 1

choose : {n:Int}*{n:Int} → {n:Int}1 Iteration 2

N = Int Z <: {n:M} N = M

✓

System E Summary

• Lacks principal types

• Has iterative inference– Subscripts for types returned by if-exps– Subscripts for Top in case it reaches a primop

• Can remove iteration with run-time info

36

Last Stop: System E≤


37


System E≤System E≤

System E

System E−

Bounded Quantification

38

def f[ X1<:τ1 , ... , Xn<:τn ] (x:τ) { e }


39

def f[ X1<:τ1 , ... , Xn<:τn ] (x:τ) { e }

Type parameters now have bounds

40

def hasF (x) { let _ = x.f in x }

• Now there is a best type for hasF

• Each call site can instantiate X differently 40

∀F. {f:F} → {f:F} ∀F,G. {f:F;g:G} → {f:F;g:G}

∀F, X<:{f:F}. X → X

def choose (y,z) { if1 y.n > 0 then y else z }

41

∀Y<:{n:Int}. Y * Y → Y

{n:Int}*{n:Int} →

{n:Int}

{n:Int}*{n:Top} →

{n:Top}

{n:Int}*Top → Top

∀B. {n:Int;b:B}*{n:Int;b:B} →

{n:Int;b:B}

∀B. {n:Int;b:B}*{b:B} →

{b:B}

def choose (y,z) { if1 y.n > 0 then y else z }

42

∀Y<:{n:Int}. Y * Y → Y

{n:Int}*{n:Top} →

{n:Top}

{n:Int}*Top → Top

∀B. {n:Int;b:B}*{b:B} →

{b:B}

• These four types are incomparable

4343


def main5() { let _ = choose({n=1;b=tru},{n=2;b=fls}) in

let w = choose({n=1;b=tru},{b=fls}) in not w.b }

Y<:{n:N}

Constraints

choose : ∀Y<:{n:Int}. Y * Y → Y0

N=Int Y=Z

4444




Y<:{n:N}

choose : ∀Y<:{n:Int}. Y * Y → Y

Constraints

0

N=Int Y=Z

✗*

✓

4545




Y<:{n:N}


Constraints

0


N=Int Y≠Z

✗*

4646




Y<:{n:N}


Constraints

0

choose : {n:Int}*Top → Top11choose : {n:Int;b:Top}*{b:Top} → {b:Top1.b}1

2

N=Int Z<:{b:C}Y≠Z Y<:{b:B}

✗* w : Top1

✓✓

4747




Y<:{n:N}


Constraints

0


2

N=Int Z<:{b:C} B=C

choose : ∀B. {n:Int;b:B}*{b:B} → {b:B}3

Y≠Z Y<:{b:B}

✗* w.b : Top1.b

✓✓

4848




Y<:{n:N}


Constraints

0


2

N=Int Z<:{b:C} B=C

✓

choose : ∀B. {n:Int;b:B}*{b:B} → {b:B}3

Y≠Z Y<:{b:B}

System E≤ Summary

• Lacks principal types

• When to share variables: Y=Z or Y≠Z?– New source of iteration for static inference– Can partially eliminate with run-time info

• When Y≠Z, same as System E

49

Summary


50


System E≤

System E

System E−

Summary

• Iterative static inference for E≤ – first-order functions, records– polymorphism, subtyping– bounded quantification

• Run-time information improves algorithms– reduces worst-case complexity– (not counting overhead for instrumented eval)

51

Future Work

• Direction #0: Prove formal properties for E≤

• Direction #1: Extend E≤

– recursion, recursive types, dynamic, classes, ...– is there still static inference?– how can run-time info help?– can existing programs be encoded in E?

• Direction #2: Inference for F– does inference become easier with run-time info?– if so, extend with more features– if not, heuristic-based approaches for migration

52

Thanks!

Released under Creative Commons Attribution 2.0 Generic License

53

Original photo by Alaskan Dude

http://creativecommons.org/licenses/by/2.0/deed.en

http://www.flickr.com/photos/72213316@N00/4566711365/

Extra Slides

54

• Function definitions

• Function calls

• Program is sequence of function definitions

Typed vs. Untyped Syntax

55

def f[X1,...,Xn](x:τ){ e }

def f(x){ e’ }

f[τ1,...,τn](e) f(e’)

Sharing Bounded Type Variables

• E≤ lacks principal types

• Can type variables for y and z be shared?

• If their separate bounds are compatible

Y<:{n:Int} and Z<:{} can be combined to {n:Int}

Y<:{n:Int} and Z<:{n:Bool} cannot be combined

56


• With arbitrary subtyping constraints,choose has a principal type

57

∀Y,Z. (Y<:{n:Int}, Y<:Z) => Y * Z → Z

Related Work• Complete inference for ML + records [Remy 89]

– limited by ML polymorphism– and fields cannot be forgotten

• Type inference for F is undecidable [Wells 94]

• Local type inference for F [Pierce et al, Odersky et al, et al]

– require top-level annotations, try to fill in rest– even partial inference for F undecidable [Pfenning 88]

• Type checking for F≤ is undecidable [Pierce 91]

58

Related Work• Static type systems for dynamic languages– type systems for JavaScript [Thiemann 05, et al]

– Typed Scheme [Tobin-Hochstadt and Felleisen 08]

– Diamondback Ruby [Furr et al 09 11]

• Gradual type systems– functions [Thatte 90, Cartwright and Fagan 91, Siek and Taha 06]

– objects [Siek and Taha 07, Wrigstad et al 10, Bierman et al 10]

• Coercion insertion [Henglein/Rehof 95, Siek/Vachharajani 08] – do not deal with records

59

Documents

Type Inference with Run-time Logs