Ralf LämmelUniversität Koblenz-Landau, Software Languages Team, Koblenz, Germany
joint work with
Oleg KiselyovFleet Numerical Meteorology and Oceanography Center, Monterey, CA
Slide deck and accompanying code distribution © 2010, Ralf Lämmel & Oleg Kiselyov
Spin-offs from the Expression Problem
1
Slide deck and accompanying code distribution © 2010, Ralf Lämmel & Oleg Kiselyov
List of spin-offs ahead
1.Can we have lists of expressions (in the Haskell model)?
2.Aren’t the types of the Haskell solution too complex?
3.Can we also deal with operations that construct data?
4.Are we supposed to write such boilerplate code for toTree?
5.Can we also deal with operations with multiple arguments?
6.What if the result type depends on the argument type(s)?
2
Slide deck and accompanying code distribution © 2010, Ralf Lämmel & Oleg Kiselyov
Can we have lists of expressions (in the Haskell model)?
> [Lit 1, Lit 2]
[Lit 1,Lit 2]
> [Lit 1, Neg (Lit 1)]
TYPE ERROR!
Consider, for example, the expression form of function application. We would need to maintain a list of argument expressions.
3
Slide deck and accompanying code distribution © 2010, Ralf Lämmel & Oleg Kiselyov
Another supernatural power:existential quantification
data AnyExp = forall x. Exp x => AnyExp x
Forall outside the constructor equals Exists inside the constructor.
> let list = [AnyExp $ Lit 1, AnyExp $ Neg (Lit 1)]
> :t list
list :: [AnyExp]
Existential quantification is neither Haskell 98 nor Haskell 2010.
4
Slide deck and accompanying code distribution © 2010, Ralf Lämmel & Oleg Kiselyov
Relative usefulness ofexistential quantification
> let list = [AnyExp $ Lit 1, AnyExp $ Neg (Lit 1)]
> list
TYPE ERROR (“Don’t know how to show”)
• The quantified type is fully opaque.
• Any operations deemed necessary must be packed as well.
data AnyExp = forall x. Exp x => AnyExp x
5
Slide deck and accompanying code distribution © 2010, Ralf Lämmel & Oleg Kiselyov
Existential quantification is in conflict with extensibility in the operation dimension.
> let list = [AnyExp $ Lit 1, AnyExp $ Neg (Lit 1)]
> list
[Lit 1,Neg (Lit 1)]
data AnyExp = forall x. Show x => AnyExp x
instance Show AnyExp where show (AnyExp x) = show x
Here, we assume that all expression forms are Show-able.
Show works fine now, but any other operation would need to be anticipated in
the constraints of AnyExp.6
Slide deck and accompanying code distribution © 2010, Ralf Lämmel & Oleg Kiselyov
Can we do better than this?
Yes!To be cont’d.
7
Slide deck and accompanying code distribution © 2010, Ralf Lämmel & Oleg Kiselyov
Aren’t the types of the Haskell solution too complex?
> :t Add (Lit 1) (Lit 2)Add (Lit 1) (Lit 2) :: Add Lit Lit
The type is almost the value!
8
Slide deck and accompanying code distribution © 2010, Ralf Lämmel & Oleg Kiselyov
The solution: use an open (extensible) operation for “showing types”.
class ShowType x where t :: x -> String
> t $ Add (Lit 1) (Lit 2)Exp
9
Slide deck and accompanying code distribution © 2010, Ralf Lämmel & Oleg Kiselyov
Instances for our data variantsmodule OperationExtension where
import ColonTimport DataBaseimport DataExtension
-- Showing types concisely
instance ShowType Lit where t _ = "Exp"
instance (Exp x, Exp y) => ShowType (Add x y) where t _ = "Exp"
instance Exp x => ShowType (Neg x) where t _ = "Exp"
The support module for ShowType (to be revealed!)
10
Slide deck and accompanying code distribution © 2010, Ralf Lämmel & Oleg Kiselyov
Add a generic instance for function types
class ShowType x where t :: x -> String
instance (ShowType x, ShowType y) => ShowType (x -> y) where t _ = "(" ++ t (undefined::x) ++ " -> " ++ t (undefined::y) ++ ")"
We use t’s argument only as a type argument.
We capture the type parameters x and y from the instance head.
11
Slide deck and accompanying code distribution © 2010, Ralf Lämmel & Oleg Kiselyov
Add a default instance for all types
class ShowType x where t :: x -> String
instance Typeable x => ShowType x where t x = show $ typeOf x
This instance applies if there is no type-specific instance.
Obtain the “default” representation.
12
Slide deck and accompanying code distribution © 2010, Ralf Lämmel & Oleg Kiselyov
-- Configurable ":t"
{-# LANGUAGE FlexibleInstances #-} {-# LANGUAGE UndecidableInstances #-} {-# LANGUAGE OverlappingInstances #-} {-# LANGUAGE ScopedTypeVariables #-}
module ColonT where
import Data.Typeable
class ShowType x where t :: x -> String
instance Typeable x => ShowType x where t x = show $ typeOf x
instance (ShowType x, ShowType y) => ShowType (x -> y) where t _ = "(" ++ t (undefined::x) ++ " -> " ++ t (undefined::y) ++ ")"
The manifestation of supernatural powers
needed for this technique.
G(l)ory details of ShowType
13
Slide deck and accompanying code distribution © 2010, Ralf Lämmel & Oleg Kiselyov
Can we also deal with operations that construct data?
Application scenarios:
• Configurable constructors.
• Parsing text into expressions
• De-treealizing trees into expression terms.
• De-serializing expressions as they come from the wire.
• ...
14
Slide deck and accompanying code distribution © 2010, Ralf Lämmel & Oleg Kiselyov
Reference encodings forde-/treealization based on closed data type
toTree :: Exp -> Tree StringtoTree (Lit i) = Node "Lit" [Node (show i) []]toTree (Add l r) = Node "Add" [toTree l, toTree r]
fromTree :: Tree String -> ExpfromTree (Node "Lit" [Node i []]) = (Lit $ read i)fromTree (Node "Add" [l,r]) = (Add (fromTree l) (fromTree r))
15
Slide deck and accompanying code distribution © 2010, Ralf Lämmel & Oleg Kiselyov
An extensible fromTree operation
class FromTree x where fromTree :: Tree String -> x
instance FromTree Lit where fromTree (Node "Lit" [i]) = Lit (fromTree i)
instance (Exp e, Exp e', FromTree e, FromTree e') => FromTree (Add e e') where fromTree (Node "Add" [x,y]) = Add (fromTree x) (fromTree y)
instance (Exp e, FromTree e) => FromTree (Neg e) where fromTree (Node "Neg" [x]) = Neg (fromTree x)
instance FromTree Int where fromTree (Node s []) = read s
The problem with this apparently open and extensible solution is
that we would need to know the precise type of the result of
de-serialization for it to work on the grounds of the many instancesshown. This is absolutely unrealistic.
16
Slide deck and accompanying code distribution © 2010, Ralf Lämmel & Oleg Kiselyov
Another attempt at ∃
17
-- The homogenized type of all expressions
data AnyExp = forall x. (Exp x, Show x) => AnyExp x
instance Show AnyExp where show (AnyExp x) = show x
-- Apply a polymorphic function on expressions
applyToExp :: (forall x. (Exp x, Show x) => x -> y) -> AnyExp -> yapplyToExp f (AnyExp x) = f x
-- The conversion from trees to expressions
tree2exp :: Tree String -> AnyExptree2exp (Node "Lit" [i]) = AnyExp (Lit (fromTree i))tree2exp (Node "Neg" [x]) = applyToExp (AnyExp . Neg) (tree2exp x)tree2exp (Node "Add" [x,y]) = ...
1st problem: the function tree2exp is not extensible, but one can think of some style of chaining together
conversion blocks.
2nd problem: the constraints for the existential quantification are
again hard to anticipate.
Slide deck and accompanying code distribution © 2010, Ralf Lämmel & Oleg Kiselyov
A new problem born:the de-serialization problem
Can we describe functionality for de-serialization in a modular fashion so that arbitrary functionality can be
applied to the data, just as if the data was never serialized in the first place?
Extended offline discussions appreciated
18
Slide deck and accompanying code distribution © 2010, Ralf Lämmel & Oleg Kiselyov
Are we supposed to write such boilerplate code for toTree?
class ToTree x where toTree :: x -> Tree String
instance ToTree Lit where toTree (Lit i) = Node "Lit" []
instance (Exp x, Exp y, ToTree x, ToTree y) => ToTree (Add x y) where toTree (Add x y) = Node "Add" [toTree x, toTree y]
instance (Exp x, ToTree x) => ToTree (Neg x) where toTree (Neg x) = Node "Neg" [toTree x]
19
This code follows a common scheme.
Slide deck and accompanying code distribution © 2010, Ralf Lämmel & Oleg Kiselyov
The solution: use generic programming.
> toTree $ Add (Lit 1) (Lit 2)
Node "A
dd" [Node "Lit" ["1"], N
ode "Lit" ["2"]]
toTree :: Data x => x -> Tree StringtoTree x = Node (showConstr (toConstr x)) (gmapQ toTree x)
Type class for generic
programming
This function is polymorphic in
data types.
Heavy lifting: map polymorphic function
over all immediate subterms.
Access to constructor
string.
20
Slide deck and accompanying code distribution © 2010, Ralf Lämmel & Oleg Kiselyov
Scrap your boilerplatewith the Data class
class Typeable a => Data a where gmapQ :: (forall a . Data a => a -> u) -> a -> [u] ...
Map polymorphic function over all
immediate subterms and collect a list of intermediate results.
Polymorphic function argument. Hence, the type of gmapQ has rank 2.
Polymorphic result type.
Needed for “type case”.To be explained later!
21
Slide deck and accompanying code distribution © 2010, Ralf Lämmel & Oleg Kiselyov
Another Data-class member revealed
class Typeable a => Data a where gmapQ :: (forall a . Data a => a -> u) -> a -> [u] gmapT :: (forall b . Data b => b -> b) -> a -> a ...
Map polymorphic function over all
immediate subterms and reconstruct term.
Polymorphic function argument. Hence, the type
of gmapT has rank 2.
22
Slide deck and accompanying code distribution © 2010, Ralf Lämmel & Oleg Kiselyov
Another Data-class member revealed
class Typeable a => Data a where gmapQ :: (forall a . Data a => a -> u) -> a -> [u] gmapT :: (forall b . Data b => b -> b) -> a -> a toConstr :: a -> Constr ...
Reflection-like access to constructor information.
23
Slide deck and accompanying code distribution © 2010, Ralf Lämmel & Oleg Kiselyov
{-# LANGUAGE DeriveDataTypeable #-}
module DataBase where
import Data.Genericsimport Data.Typeable
-- Data variants for literals and addition
data Lit = Lit Int deriving (Typeable, Data, Show)data (Exp l, Exp r) => Add l r = Add l r deriving (Typeable, Data, Show)
-- The open union of data variants
class Exp xinstance Exp Litinstance (Exp l, Exp r) => Exp (Add l r)
Generic programming-enabled data types
Another extension needed for program access to type and term
representation.
Instruct the compiler to generate suitable instances for Typeable, Data,
and Show. (The Show class/function is a special, generic function.)
24
Slide deck and accompanying code distribution © 2010, Ralf Lämmel & Oleg Kiselyov
Can we also deal with operations with multiple arguments?
Consider the following problem.
• There are different kinds of shape: rectangles, circles, ....
• The desired operation is intersection:
• There are efficient options for certain couples of shape.
• There are less efficient defaults.
This problem dates back to Curtis Clifton, Gary T. Leavens, Craig Chambers, and Todd Millstein’s paper “MultiJava: Modular Open
Classes and Symmetric Multiple Dispatch for Java”, TOPLAS 2006.
25
Slide deck and accompanying code distribution © 2010, Ralf Lämmel & Oleg Kiselyov
A closed data type for shapes
26
-- Some kinds of shape
data Shape = Square Int Int Int | Rectangle Int Int Int Int | Circle Int Int Int | Ellipse Int Int Int Int
In reality, we want an open type of shapes.
Slide deck and accompanying code distribution © 2010, Ralf Lämmel & Oleg Kiselyov
Intersection on closed data type
27
-- Some kinds of shape
data Shape = Square Int Int Int | Rectangle Int Int Int Int | Circle Int Int Int | Ellipse Int Int Int Int
In reality, we want an open type of shapes.
-- The intersection operation
intersect :: Shape -> Shape -> Bool
Slide deck and accompanying code distribution © 2010, Ralf Lämmel & Oleg Kiselyov
Case discrimination for intersection
28
-- The intersection operation
intersect :: Shape -> Shape -> Bool
-- Some efficient cases treated specifically
intersect (Rectangle x1 x2 y1 y2) (Rectangle a1 a2 b1 b2) = ...intersect (Circle x y r) (Circle x2 y2 r2) = ...intersect (Circle x y r) (Rectangle x1 x2 y1 y2) = ...
-- Case handled by commutativity of operation
intersect r@(Rectangle _ _ _ _) c@(Circle _ _ _) = intersect c r
-- A default cast
intersect s1 s2 = ...
The efficient implementations are elided here.
The patterns of the equation are more general than all
others. Hence, it must go last.
Slide deck and accompanying code distribution © 2010, Ralf Lämmel & Oleg Kiselyov
A type-class-based open data type for shapes
This is even a simpler situation than with the Expression Problem. The kinds of shape are not even recursive.
29
-- Some kinds of shape as different datatypes
data Square = Square Int Int Int data Rectangle = Rectangle Int Int Int Int data Circle = Circle Int Int Int data Ellipse = Ellipse Int Int Int Int
-- A type bound to explicitly collect all kinds of shape
class Shape xinstance Shape Squareinstance Shape Rectangleinstance Shape Circleinstance Shape Ellipse
Slide deck and accompanying code distribution © 2010, Ralf Lämmel & Oleg Kiselyov
Another supernatural power:multi-parameter type classes
-- The intersection operation
class (Shape x, Shape y) => Intersect x y where intersect :: x -> y -> Bool
A type class may have multiple parameters.
30
Multi-parameter type classes takes us from “type classes are sets of types” to “type classes are relations on types”.
Slide deck and accompanying code distribution © 2010, Ralf Lämmel & Oleg Kiselyov
The intersection operation cont’d
31
-- Some efficient cases treated specifically
instance Intersect Rectangle Rectangle where intersect (Rectangle x1 x2 y1 y2) (Rectangle a1 a2 b1 b2) = ...
instance Intersect Circle Circle where intersect (Circle x y r) (Circle x2 y2 r2) = ...
instance Intersect Circle Rectangle where intersect (Circle x y r) (Rectangle x1 x2 y1 y2) = ...
Instances can occur in any order and in different modules (without affecting semantics).
Slide deck and accompanying code distribution © 2010, Ralf Lämmel & Oleg Kiselyov
The intersection operation cont’d
-- Case handled by commutativity of intersection instance Intersect Rectangle Circle where intersect r c = intersect c r
There is no need for any instance constraints here since the required instance is visible and can be resolved right away.
32
Slide deck and accompanying code distribution © 2010, Ralf Lämmel & Oleg Kiselyov
The intersection operation cont’d-- Generic defaults
instance Shape s => Intersect Rectangle s where intersect r s = ...
instance Shape s => Intersect s Rectangle where intersect s r = ...
instance (Shape s1, Shape s2) => Intersect s1 s2 where intersect s1 s2 = ...
These instances only kick in as no other more specific
instances apply.
Beware of debated overlapping instances!
33
Slide deck and accompanying code distribution © 2010, Ralf Lämmel & Oleg Kiselyov
What if the result type depends on the argument type(s)?
Consider the following problem.
• There are different kinds of shape: rectangles, circles, ....
• The desired operation is normalize.
34
Preserve area and origin!
Slide deck and accompanying code distribution © 2010, Ralf Lämmel & Oleg Kiselyov
Type distinctions among shapes
35
-- The set of all shape types
class Shape xinstance Shape Squareinstance Shape Rectangleinstance Shape Circleinstance Shape Ellipse
-- The subset of shapes that are normal(ized)
class Shape s => NormalShape sinstance NormalShape Squareinstance NormalShape Circle
Slide deck and accompanying code distribution © 2010, Ralf Lämmel & Oleg Kiselyov
A first attempt of a type-class for normalization
36
-- The normalization operation
class (Shape s1, NormalShape s2) => Normalize s1 s2 where normalize :: s1 -> s2
Slide deck and accompanying code distribution © 2010, Ralf Lämmel & Oleg Kiselyov
The normalization operation cont’d
37
-- Exhaustive case discrimination
instance Normalize Square Square where normalize = id
instance Normalize Circle Circle where normalize = id
instance Normalize Rectangle Square where normalize = ...
instance Normalize Ellipse Circle where normalize = ...
Slide deck and accompanying code distribution © 2010, Ralf Lämmel & Oleg Kiselyov
A type error
38
> normalize (Square 1 2 3)
No instance for (Normalize Square s2) arising from a use of `normalize'. Possible fix: add an instance declaration for (Normalize Square s2) In the expression: normalize (Square 1 2 3)
Instance selection requires all type parameters to be instantiated. (The type-class system uses an “open-world assumption”.)
Read as: we have not specified result type,
and there is no instance which an
arbitrary result type.
Slide deck and accompanying code distribution © 2010, Ralf Lämmel & Oleg Kiselyov
A revised type class for normalization
39
> normalize (Square 1 2 3)Square 1 2 3
class (Shape s1, NormalShape s2)
=> Normalize s1 s2
| s1 -> s2
where
normalize :: s1 -> s2
Functional dependency between types
Slide deck and accompanying code distribution © 2010, Ralf Lämmel & Oleg Kiselyov
Fact sheet on type classes
40
• Type class = set of named, typed, type-parametric operations
• Single-parameter type class = set of types
• Multi-parameter type class = relation on types
• Functional dependencies
• 1 dependency = restrict relation on types to a function.
• We glance here over classes with multiple dependencies.
Slide deck and accompanying code distribution © 2010, Ralf Lämmel & Oleg Kiselyov
One default abbreviating several instances
41
instance Normalize Square Square where normalize = id
instance Normalize Circle Circle where normalize = id
instance NormalShape s => Normalize s s where normalize = id
Beware of debated overlapping instances!
Slide deck and accompanying code distribution © 2010, Ralf Lämmel & Oleg Kiselyov
Another type error
42
Functional dependencies conflict between instance declarations:
instance [overlap ok] (NormalShape s) => Normalize s s
instance [overlap ok] Normalize Rectangle Square
instance [overlap ok] Normalize Ellipse Circle
It looks like a Rectangle could be mapped to both
a Square and a Rectangle.
Slide deck and accompanying code distribution © 2010, Ralf Lämmel & Oleg Kiselyov
A bleeding-edge default
43
instance ( NormalShape s1 , Shape s2, s1 ~ s2 ) => Normalize s1 s2 where normalize = id
An equality constraint
This sort of instance is now really treated as a default; it is only exercised, if no other
instance can be applied.
Slide deck and accompanying code distribution © 2010, Ralf Lämmel & Oleg Kiselyov
Remember that question?Can we have lists of expressions?
> [Lit 1, Lit 2]
[Lit 1,Lit 2]
> [Lit 1, Neg (Lit 1)]
TYPE ERROR!
44
Slide deck and accompanying code distribution © 2010, Ralf Lämmel & Oleg Kiselyov
Remember that question?Can we have lists of shapes?
• Again, we could use existential quantification.
• Remember, ∃ is in conflict with extensibility in the operation dimension.
• However, our multi-dispatch scenario emphasizes the data dimension.
• We want to be able to add more kinds of shape.
• We want to be able to register more special cases.
• Hence:
• There is just one operation: intersection.
• Let’s give it a try: ∃
45
Slide deck and accompanying code distribution © 2010, Ralf Lämmel & Oleg Kiselyov
Reference encoding based on closed data type
-- Intersection for all combinations of shapes in a list
intersectMany :: [Shape] -> BoolintersectMany [] = FalseintersectMany (x:[]) = FalseintersectMany (x:y:z) = intersect x y || intersectMany (x:z) || intersectMany (y:z)
46
Slide deck and accompanying code distribution © 2010, Ralf Lämmel & Oleg Kiselyov
Finding a bound for existential quantification
data AnyShape = forall x. Shape x => AnyShape x
intersectMany :: [AnyShape] -> BoolintersectMany [] = FalseintersectMany (_:[]) = FalseintersectMany ((AnyShape x):(AnyShape y):z) = intersect x y || intersectMany (AnyShape x:z) || intersectMany (AnyShape y:z)
Why on earth does the following code typecheck (with say GHC 6.n many versions)? This is the riddle for today.
The generic instance is always chosen.
Looking into the package.
Putting things back into the
package.
47
Slide deck and accompanying code distribution © 2010, Ralf Lämmel & Oleg Kiselyov
Key insight for now:Stop using regular Haskell lists.
Use nested pairs instead.
class IntersectMany x where intersectMany :: x -> Bool
instance IntersectMany () where intersectMany _ = False
instance Shape x => IntersectMany (x,()) where intersectMany _ = False
instance ( Intersect x y , IntersectMany (x,z) , IntersectMany (y,z) ) => IntersectMany (x,(y,z)) where intersectMany (x,(y,z)) = intersect x y || intersectMany (x,z) || intersectMany (y,z)
Use a designated type class
Case of empty list
(empty tuple)
Case of singleton list
Case of a list of length ≥ 2
48
Slide deck and accompanying code distribution © 2010, Ralf Lämmel & Oleg Kiselyov
Additional Haskell constructs and idioms covered since the previous check list.
✦ Flexible instances (& contexts)
✦ Multiparameter typeclasses
✦ Functional dependencies
✦ Overlapping instances
✦ Data.Typeable
✦ deriving Show et al.
✦ Scrap your boilerplate
✦ Rank-2 types
✦ Existential quantification
✦ Nested tuples as lists
49
Purple means “extremely powerful”.Green means “heavy lifting”.
Red means “dangerous”.
Slide deck and accompanying code distribution © 2010, Ralf Lämmel & Oleg Kiselyov
Summary
• Covered forms of operations• The operation dispatches on an argument’s “type”.• The operation dispatches on several arguments’ “types”.• The operation is driven by the result type.• The operation uses untyped input but typed output.• ...
• Extensibility requires specific efforts for each form.• Haskell can handle most forms easily; all forms potentially.• Contenders: Scala, ...
50