Upload
seamus-baxter
View
218
Download
0
Embed Size (px)
Citation preview
Type Analysis andType Analysis andTyped CompilationTyped Compilation
Stephanie Weirich
Cornell University
ICFP '98
OutlineOutline
• Typed Compilation background
• Type Analysis background
• Initial framework - Type Passing
• Problems with Type Passing
• Type Erasure framework
• Closure Conversion comparison
• Related Work
ICFP '98
Traditional CompilationTraditional Compilation
UntypedAssemblyLanguage
UntypedIntermediate
Language
Typed Abstract Syntax
ImplicitlyTyped Source
Language
Compilation is a series of translations between severallanguages
ICFP '98
Typed CompilationTyped Compilation
Assembly Language
TypedIntermediate
Language
Typed Abstract Syntax
Most of those languages are typed, and the types are translated with the terms
ImplicitlyTyped Source
Language
ICFP '98
Typed CompilationTyped Compilation
• Safety– “well-typed programs can’t go wrong”– Types describe invariants maintained by the
compiler
• Performance– Types provide information which may be used by
the compiler for optimization– Tag-free garbage collection– Data layout control
ICFP '98
Type AnalysisType Analysis
• Create functions from types to values
• Code may branch on an unknown type
fun toString x:a => (typecase a of
int => Int.toString | char => Char.toString | * => fn (fst,snd) => “(” ^ (toString fst) ^ “,” ^ (toString snd) ^ “)” ) x
ICFP '98
Data Layout OptimizationData Layout Optimization
• Parametric polymorphism requires uniformity of representation for all arguments, regardless of their types.
ICFP '98
Because any array may be passed to a polymorphic function, all arrays must look the same, no matter the type of their elements.
A:int ArrayB:bool Array
sub = fn (A:a array,i:int) => wordsub(A,i)
Polymorphic SubscriptPolymorphic Subscript
ICFP '98
In languages such as C, the type of an array is always known at its use.
A[2]
B[2]
wordsub(A,2)
wordsub(B,0)&(1<<2) <> 0
int A[4]
bool B[4]
Monomorphic subscriptMonomorphic subscript
ICFP '98ICFP '98
sub = fn (A:a array,i:int) => typecase a of int => wordsub(A,i)
| bool => (wordsub(A,i div 32) & (1<<(i mod 32))) <> 0
A:int Array
B:bool Array
Type Analysis to the RescueType Analysis to the Rescue
Type analysis allows us to determine the type at run time.
ICFP '98
Initial Framework - Initial Framework - iiML ML
• Make type abstraction/application explicit (as in System F)
( a. e) [ int ]
• Add typecase operatortypecase t of
int => ...
| b * g => ...
• Code execution (operational semantics) now relies on the type system
ICFP '98
Type Passing SemanticsType Passing Semantics
(a.(A:a array,i:int). typecase a of int => wordsub(A,i) bool => (wordsub(A,i div 32) & (1<<(i mod 32))) <> 0) [int] (A,3)
sub[int](A,3)
wordsub(A,3)
typecase int of int => wordsub(A,3) bool => (wordsub(A,3 div 32) & (1<<(3 mod 32))) <> 0
ICFP '98
TypecheckingTypechecking
To typecheck a typecase term we annotate the term with its return type:
tostringa. x:a. (typecase [ d.d -> string ] a of
int => Int.toString | char => Char.toString | b * g => fn (fst,snd):b*g => “(” ^ (toString[b] fst) ^ “,” ^ (toString[g] snd) ^ “)” ) x
Then we check that each branch satisfies that type with the appropriate type substitution. The return type of the entire term is a substituted for d in the annotated type.
ICFP '98
ProblemsProblems
• Issues about expressiveness– Complicates low-level constructs – Can’t express some optimizations– Can’t express abstraction boundaries
ICFP '98
ComplexityComplexity
• Polymorphic closure conversion– Minamide et al. [1996]– Morrisett et al. [1998]
• Duplication of effort– Optimization of runtime behavior– Explicit modeling of low level computation
• Allocation Semantics• Typed Assembly Languages
ComplexityComplexity
ICFP '98
InefficiencyInefficiency
• Must pass all types even if some are never examined
• TIL -- eliminates unexamined run-time types in ad hoc manner in translation to untyped calculus
InefficiencyInefficiency
ICFP '98
• No way to hold types abstract if they can always be examined
• Clients allowed to break abstraction barriers and infer more information than desired
Host Client
bool ref capability
Loss of AbstractionLoss of Abstraction
ICFP '98
SolutionSolution
Pass terms that represent types
ICFP '98
Type Erasure SemanticsType Erasure Semanticssub [int] Rint (A,3)
ICFP '98
Type Erasure SemanticsType Erasure Semantics
(a.x:R(a). (A:a array,i:int). typecase x of Rint => wordsub(A,i) Rbool => (wordsub(A,i div 32) & (1<<(i mod 32))) <> 0) [int] Rint (A,3)
sub [int] Rint (A,3)
wordsub(A,3)
typecase Rint of Rint => wordsub(A,3) Rbool => (wordsub(A,3 div 32) & (1<<(3 mod 32))) <> 0
ICFP '98
Type Erasure SemanticsType Erasure Semantics
(.x:R(). (A: array,i:int). typecase x of Rint => wordsub(A,i) Rbool => (wordsub(A,i div 32) & (1<<(i mod 32))) <> 0) [int] Rint (A,3)
sub [int] Rint (A,3)
wordsub(A,3)
typecase Rint of Rint => wordsub(A,3) Rbool => (wordsub(A,3 div 32) & (1<<(3 mod 32))) <> 0
ICFP '98
FormalizationFormalization
• Special representation terms:
– Rint
– R (x,y)
• A term e which represents a type has the special type R().
– R(Rint,Rint): R(int int)
• Instead of a type, the argument to typecase is a term of type R()
• The type system tracks the correspondence between a type and its representation
ICFP '98
TypecheckingTypechecking
To typecheck a typecase term we still annotate the term with its return type:
tostringa. x:a. y:R(a). (typecase [ d.d -> string ] y of
Rint => Int.toString | Rchar => Char.toString | R*(r1,r2) as b*g => fn (fst,snd):b*g => “(” ^ (toString[b] fst r1) ^ “,” ^ (toString[g] snd r2) ^ “)” ) x
We require that the argument to the typecase be of type R(t) for the typecase term to typecheck. We can then substitute a for d in the annotated type as before.
ICFP '98
SolutionSolutionSolutionSolution
• Everything that happens at run-time is described by the terms
• Can go to a type erasure semantics
• Optimization• Traditional code optimizers can optimize
type representations• Sophisticated techniques still possible
• If a representation is not provided a type may not be analyzed
ICFP '98
Typed Closure ConversionTyped Closure Conversion
• At run-time, a first-class function is represented by a code pointer
• But now x is unbound -- so we change the type of l2 to take another argument
• A closure is just a code pointer paired with the values of its free variables -- its environment
f x:int. y:int. x + y
l1: x:int. l2l2: y:int. x + y
l1: x:int. (l2, x)l2: y,x:int. x + y
ICFP '98
(f 3) 4
Typed Closure ConversionTyped Closure Conversion
Application just extracts the environment and applies the function to it
let clos = (f 3) in (#1 clos) (4,#2 clos)
clos : int -> int clos : (int*int -> int) * int
f x:int. y:int. x + y
ICFP '98
Existential TypesExistential Types
• Unfortunately, the type of the closure now depends on its free variables
• Existential types hold the types of free variables abstract
l1: x:int. pack (l2,x) as env.((int*env -> int)*env) hiding int
unpack (env,clos) = (f 3)in (#1 clos)(4,#2 clos)
clos : (int*int -> int) * int
ICFP '98
Polymorphic Closure ConversionPolymorphic Closure Conversion
• In a type passing semantics a function may also have free type variables at run time
• They also must be part of the closure, via translucent types
• The type environment is then abstracted using existential kinds
a. x:a. y:int. ...
l1: a. x:a. (b=a).(l2[b], x)l2: b. y:int,x:b.
...
clos : k. b:T. a:k. g:k = a. (int * b) -> int * b
ICFP '98
Closure ConversionClosure Conversion
• In a type erasure semantics, only the type representation remains at run time.
• At the time of closure creation, the type arguments may be given to the function.
a. x:a, z:R(a). y:int. ...
l1: a. x:a,z:R(a). pack (l2[a], z) as b.(int*b -> int)*b hiding (a*R(a))
l2: b. y:int,(x:b,z:R(b)).
...
ICFP '98
Multi-stage Type AnalysisMulti-stage Type Analysis
TypedAssembly Language
TypedIntermediate
Language
Typed Abstract Syntax
Final Step of typed compilation: compile to a typed assembly language
ImplicitlyTyped Source
Language
ICFP '98
Multi-stage Type AnalysisMulti-stage Type Analysis
• How do we preserve the meaning of typecase when the types themselves change?– in TALx86 both int and float are compiled
into B4, the type of 4-byte values– int int may be compiled into a variety of
types depending on the calling convention
ICFP '98
Related WorkRelated Work
• Harper, Morrisett - POPL 95• Minamide, Morrisett, Harper - POPL 96 • Minamide - 2nd Fuji Intl Workshop on
Functional and Logic Programming 96 • Morrisett, Walker, Crary, Glew - POPL 98 • Crary, Weirich, Morrisett - ICFP 98• Crary, Weirich - ICFP 99
ICFP '98
Low-level Type AnalysisLow-level Type Analysis
• How do we analyze types with quantifiers?– In TALx86 every function (polymorphic or
not) is compiled into a polymorphic code pointer
ICFP '9834
Areas for Future WorkAreas for Future Work
ICFP '98
ICFP '98
OutlineOutline
• Introduction– Typed Compilation– Type Analysis in general
• toString example
– Type Analysis in compilation• bit array example
• Initial framework – syntax -- from examples– semantics
• type passing
– problems– complication of theory– can’t express efficient code– loss of abstraction
• Type erasure semantics• syntax• dynamic semantics• static semantics
• Closure Conversion example• LX Teaser• Related work
ICFP '98
A note about typecheckingA note about typechecking
– In this example wordsub has a strange typea array * int -> int
– It would better if it were of type
int array * int -> int – Then argument to subscript must allways be
an int array. But that forgets its actual type of bool array. So we create a special type, called packed array, with a type-level type analysis operator.
ICFP '9812 ICFP '98
Type Passing SemanticsType Passing Semanticssub[int](A,3)
ICFP '9813
Type Passing SemanticsType Passing Semantics
(.(A: array,i:int). typecase of int => wordsub(A,3) bool => (wordsub(A,3 div 32) & (1<<(3 mod 32))) <> 0) [int] (A,3)
sub[int](A,3)
ICFP '9814
Type Passing SemanticsType Passing Semantics
(.(A: array,i:int). typecase of int => wordsub(A,3) bool => (wordsub(A,3 div 32) & (1<<(3 mod 32))) <> 0) [int] (A,3)
sub[int](A,3)
typecase int of int => wordsub(A,3) bool => (wordsub(A,3 div 32) & (1<<(3 mod 32))) <> 0
ICFP '9830
FormalizationFormalization
• Special representation terms:– Rint
– R (x,y)
ICFP '9831
FormalizationFormalization
• Special representation terms:– Rint
– R (x,y)
• A term e which represents a type has the special type R(). – R(Rint,Rint): R(int int)
ICFP '9824
Type Erasure SemanticsType Erasure Semantics
(.x:R(). (A: array,i:int). typecase x of Rint => wordsub(A,3) Rbool => (wordsub(A,3 div 32) & (1<<(3 mod 32))) <> 0) [int] Rint (A,3)
sub [int] Rint (A,3)
ICFP '9825
Type Erasure SemanticsType Erasure Semantics
(.x:R(). (A: array,i:int). typecase x of Rint => wordsub(A,3) Rbool => (wordsub(A,3 div 32) & (1<<(3 mod 32))) <> 0) [int] Rint (A,3)
sub [int] Rint (A,3)
typecase Rint of Rint => wordsub(A,3) Rbool => (wordsub(A,3 div 32) & (1<<(3 mod 32))) <> 0
ICFP '98
ICFP '985 ICFP '98
Type based compilationType based compilation
Terms TypesSource
Language
Intermediate Language
Machine Language
ICFP '98
Multi-stage Type AnalysisMulti-stage Type Analysis
ICFP '98
Type Level Type AnalysisType Level Type Analysis
ICFP '98
Type Safe LanguageType Safe Language
• Give us guarentees about the run-time behavior of programs
• Types abstractly describe the run-time flow of values
ICFP '98
Traditional CompilationTraditional Compilation( fn x => x+1 ) 3
( x . x + 1 ) 3
l1: push 3 call l2
retnl2: mov eax,[esp+4] add eax,eax mov [esp+4], eax
retn
( fn x : int => x +1 ) 3
Source File
Type InferenceChecking
Untyped IL
Machine Code
ICFP '98
Type Based CompilationType Based Compilation( fn x => x+1 ) 3
( x : int . x + 1 ) 3
l1: push 3 call l2
retnl2: mov eax,[esp+4] add eax,eax mov [esp+4], eax
retn
( fn x : int => x +1 ) 3
Source File
Type InferenceChecking
Typed IL
Machine Code
ICFP '98
Why Typed CompilationWhy Typed Compilation
• Safety -- assurances about compiler correctness
• Type based optimizations
• For example ...
ICFP '98
But we don’t always know But we don’t always know the types, what then ?the types, what then ?
• For example -- Parametric polymorphism
• Introduce an operator into the language that can distinguish types
• Typecase !
ICFP '98
Need a language with this Need a language with this operator - lmlioperator - lmli
ICFP '98
Second example Second example
ICFP '9816 ICFP '98
Performance and SafetyPerformance and Safety
Terms Types
Source
IL
Machine
ICFP '9817 ICFP '98
SafetySafety
Terms Types
Source
IL
Machine TAL
ICFP '9811 ICFP '98
Type Passing SemanticsType Passing Semantics
• Used by the languageiML , the
Intermediate language of TIL/ML and FLINT compilers
• Unlike most calculi where types may be erased prior to run-time, types do have an operational significance -- they are arguments to typecase terms.
ICFP '9810 ICFP '98
Intensional Type AnalysisIntensional Type Analysis
• Valuable element of type-directed compilers
• Allows otherwise untypeable optimizations– Specialized data layout– Tag-free Garbage Collection– Polymorphic marshalling– ...
ICFP '9829
SolutionSolution
• More efficient– Only pass type representations when
necessary– Traditional code optimizers can help– Sophisticated techniques still possible
• Recovers abstraction– Can withhold representation from clients