Real World Haskell, Bryan O’Sullivan, 2008-09-27

DEFUN 2008 - Real World Haskell

Slides from my Haskell tutorial at DEFUN 2008.

Page 1: DEFUN 2008 - Real World Haskell

Real World Haskell

Bryan O’Sullivan

2008-09-27

Page 2: DEFUN 2008 - Real World Haskell

Welcome!

A few things to expect about this tutorial:

I The pace will be rapid

I Stop me and ask questions—early and often

I I assume no prior Haskell exposure

Page 3: DEFUN 2008 - Real World Haskell

A little bit about Haskell

Haskell is a multi-paradigm language. It chooses some unusual, but principled, defaults:

I Pure functions

I Non-strict evaluation

I Immutable data

I Static, strong typing

Why default to these behaviours?

I We want our code to be safe, modular, and tractable.

Page 4: DEFUN 2008 - Real World Haskell

Pure functions

Definition: The result of a pure function depends only on its visible inputs:

I Given identical inputs, it always computes the same result.

I It has no other observable effects.

What are some consequences of this?

I Modularity leads to simplified reasoning about behaviour.

I Straightforward testing: no need for elaborate frameworks.

Page 5: DEFUN 2008 - Real World Haskell

Immutable data

Definition: Data is immutable (or purely functional) if it is never modified after construction.

To “modify” a value, we create a new value. Both new and old versions can coexist afterwards, so we get persistent, versioned data for free.

I Modification is often easier than with mutable data.

I In multithreaded code, we do away with much elaborate locking.
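
A minimal sketch of old and new versions coexisting, using only the Prelude. The names original and extended are made up for illustration:

```haskell
-- The existing value; nothing below changes it.
original :: [Int]
original = [2, 3]

-- "Modifying" builds a new list. The old list is reused as the tail
-- of the new one, so both versions stay available.
extended :: [Int]
extended = 1 : original
```

After this, original is still [2,3] and extended is [1,2,3].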

Page 6: DEFUN 2008 - Real World Haskell

Static, strong typing

Definition: A program is statically typed if we know the type of every expression before the program is run.

Definition: Code is strongly typed if the absence of certain classes of error can be proven statically.

Page 7: DEFUN 2008 - Real World Haskell

Safety, modularity, and tractability

Safety:

I As few nasty surprises at runtime as possible.

I Static typing and eased testing give us confidence.

Modularity:

I We can build big pieces of code from smaller components.

I No need to focus on the details of the smaller parts.

Tractability:

I All of this fits in our brain comfortably...

I ...leaving plenty of room for the application we care about.

Page 8: DEFUN 2008 - Real World Haskell

GHC, the Glorious Glasgow Haskell Compiler

Have you got GHC yet?

I Download installer for Windows, OS X, or Linux here:

I http://www.haskell.org/ghc/download_ghc_683.html

Page 9: DEFUN 2008 - Real World Haskell

What’s special about GHC?

I Mature, portable, optimising compiler

I Great tools:

    I interactive shell and debugger

    I time and space profilers

    I code coverage analyser

I BSD-licensed, hence suitable for OSS and commercial use

Page 10: DEFUN 2008 - Real World Haskell

Counting lines

The classic Unix wc command counts the lines in some files:

$ time wc -l *.fasta
9975 1000-Rn_EST.fasta
14032 chr18.fasta
14005 chr19.fasta
13980 chr20.fasta
42017 chr_all.fasta
94009 total

real 0m0.017s

Page 11: DEFUN 2008 - Real World Haskell

Breaking the problem down

Subproblems to consider:

I Get our command line arguments

I Read a file

I Split it into lines

I Count the lines

Let’s work through these in reverse order.

Page 12: DEFUN 2008 - Real World Haskell

Type signatures

Definition: A type signature describes the type of a Haskell expression:

e :: Double

I We read :: as “left has the type right”.

I So “e has the type Double”.

Here’s the accompanying definition:

e = 2.7182818

Page 13: DEFUN 2008 - Real World Haskell

Type signatures are optional

In Haskell, most type signatures are optional.

I The compiler can automatically infer types based on our usage.

Why write type signatures at all, then?

I Mostly as useful documentation to ourselves.

Page 14: DEFUN 2008 - Real World Haskell

GHC’s interactive interpreter

GHC includes an interactive expression evaluator, ghci. Run it from a terminal window or command prompt:

$ ghci
GHCi, version 6.8.3: http://www.haskell.org/ghc/  :? for help
Loading package base ... linking ... done.
Prelude>

The Prelude> text is ghci’s prompt. Type :? at the prompt to get (terse) help.

Page 15: DEFUN 2008 - Real World Haskell

Basic interaction

Let’s enter some expressions:

Prelude> 2 + 2
4
Prelude> True && False
False

We can find out about types:

Prelude> :type True
True :: Bool

Page 16: DEFUN 2008 - Real World Haskell

Writing a list

Here’s an empty list:

Prelude> []
[]

What do we need to create a longer list?

I A value

I An existing list

I Some glue—the : operator

Prelude> 1:[]
[1]
Prelude> 1:2:[]
[1,2]

Page 17: DEFUN 2008 - Real World Haskell

Syntactic sugar for lists

What’s the difference between these?

I 1:2:[]

I [1,2]

Nothing—the latter is purely a notational convenience.

Page 18: DEFUN 2008 - Real World Haskell

Characters and strings

One character:

Prelude> :type 'a'
'a' :: Char

A string is a list of characters:

Prelude> 'a' : 'b' : []
"ab"

Notation:

I Single quotes for one Char

I Double quotes for a string (written [Char])

Page 19: DEFUN 2008 - Real World Haskell

Function application

We apply a function to its arguments by juxtaposition:

Prelude> length [2,4,6]
3
Prelude> take 2 [3,6,9,12]
[3,6]

Why refer to this as application, instead of the more familiar calling?

I Haskell is a non-strict language

I The result may not be computed immediately

Page 20: DEFUN 2008 - Real World Haskell

Lists are inductive

Haskell lists are defined inductively. A list can be one of two things:

I An empty list

I A value in front of an existing list

We call our friends [] and : value constructors:

I They construct values that have the type “list of something.”

Page 21: DEFUN 2008 - Real World Haskell

Counting lines

Haskell programmers love abstraction.

I We won’t worry about counting lines.

I Instead, we’ll count the elements in any kind of list.

Page 22: DEFUN 2008 - Real World Haskell

The type signature of a function

How do we describe a function that computes the length of a list?

len :: [a] -> Integer

I The -> notation denotes a function.

I The function accepts an [a], and returns an Integer.

What’s an [a]?

I A list, whose elements must all be of some type a.

Page 23: DEFUN 2008 - Real World Haskell

Counting by induction: the base case

An empty list has the length zero.

len [] = 0

This is our first example of pattern matching.

I Our function accepts one argument.

I If the argument is an empty list, we return zero.

We call this the base case.

Page 24: DEFUN 2008 - Real World Haskell

Counting by induction: the inductive case

Let’s see if a list value was created using the : constructor.

len (x:xs) = 1 + len xs

If the pattern match succeeds:

I The name x is bound to the head of the list.

I The name xs is bound to the tail of the list.

I The body of the definition is used as the result.

Page 25: DEFUN 2008 - Real World Haskell

The complete function

Save this in a file named Length.hs:

len :: [a] -> Integer
len []     = 0
len (x:xs) = 1 + len xs

Page 26: DEFUN 2008 - Real World Haskell

Load the file into ghci

In the same directory, run ghci:

Prelude> :load Length
[1 of 1] Compiling Main ( Length.hs, interpreted )
Ok, modules loaded: Main.
*Main>

The ghci prompt changes when we load files. Let’s try out our function:

*Main> len []
0
*Main> len (1:[])
1
*Main> len [4,5,6]
3

Page 27: DEFUN 2008 - Real World Haskell

Generating a list from a list

How might we double every other element of a list?

double (a:b:cs) = a : b * 2 : double cs
double cs       = cs

Save this in a file named Double.hs. Load the file into ghci. Try the following expressions:

I [1..10]

I double [1..10]

Page 28: DEFUN 2008 - Real World Haskell

Your turn: axpy

I The classic Linpack function axpy computes a × x_i + y_i over a scalar a and each element i of two vectors x and y.

I Define it over two lists of numbers in Haskell.

I How do we handle lists of different lengths?
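
One possible answer to the exercise, offered as a sketch rather than the intended solution. It assumes that the sensible behaviour for lists of different lengths is to stop at the end of the shorter one. The Num a => part of the signature is a class constraint, a feature the tutorial reaches later:

```haskell
-- Compute a * x + y element by element over two lists.
-- The first equation needs both lists to be non-empty; as soon as
-- either runs out, the second equation ends the result.
axpy :: Num a => a -> [a] -> [a] -> [a]
axpy a (x:xs) (y:ys) = a * x + y : axpy a xs ys
axpy _ _      _      = []
```

In ghci, axpy 2 [1,2,3] [10,20,30] gives [12,24,36], and axpy 2 [1,2,3] [10] gives [12].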

Page 29: DEFUN 2008 - Real World Haskell

Splitting text on line boundaries

Haskell provides a large library of built-in functions, the Prelude. Here’s the Prelude’s function for splitting text by lines:

lines :: String -> [String]

The type String is a synonym for [Char]. A ghci experiment:

*Main> lines "foo\nbar\n"
["foo","bar"]
*Main> len (lines "foo\nbar\n")
2

Page 30: DEFUN 2008 - Real World Haskell

Reading a file

To read a file, we use the Prelude’s readFile function:

*Main> :type readFile
readFile :: FilePath -> IO String

What’s this signature mean?

I The FilePath type is just a synonym for String.

I The type IO String means here be dragons!

I A signature that ends in IO something can have externally visible side effects.

I Here, the side effect is “read the contents of a file”.

Page 31: DEFUN 2008 - Real World Haskell

Side effects

That innocuous IO in the type is a big deal.

I We can tell by its type signature whether a value might have externally visible effects.

I If a type does not include IO, it cannot:

    I Read files

    I Make network connections

    I Launch torpedoes

The ideal is for most code to not have an IO type.

Page 32: DEFUN 2008 - Real World Haskell

Counting lines in a file

If we invoke code that has side effects, our code must by implication have side effects too.

countLines :: FilePath -> IO Integer
countLines path = do
  contents <- readFile path
  return (len (lines contents))

We had to add IO to our type here because we use readFile, which has side effects.

I Add this code to Length.hs.

Page 33: DEFUN 2008 - Real World Haskell

A few explanations

I The <- notation means “perform the action on the right, and assign the result to the name on the left.”

name <- action

I The return function takes a pure value, and (here) adds IO to its type.

Page 34: DEFUN 2008 - Real World Haskell

Command line arguments

We use getArgs to obtain command line arguments.

import System.Environment (getArgs)

main = do
  args <- getArgs
  putStrLn ("hello, args are " ++ show args)

What’s new here?

I The import directive imports the name getArgs from the System.Environment module.

I The ++ operator concatenates two lists.

Page 35: DEFUN 2008 - Real World Haskell

Pattern matching in an expression

We use case to pattern match inside an expression.

-- Does list contain two or more elements?
atLeastTwo myList =
  case myList of
    (a:b:cs) -> True
    _        -> False

The expression between case and of is matched in turn against each pattern, until one matches.

Page 36: DEFUN 2008 - Real World Haskell

Irrefutable and wild card patterns

I A pattern usually matches against a value’s constructors.

I In other words, it inspects the structure of the value.

I A simple pattern, e.g. a plain name like a, contains no constructors.

I It thus matches any value.

Definition: A pattern that always matches any value is called irrefutable.

The special wild card pattern _ is irrefutable, but does not bind a value to a name.

Page 37: DEFUN 2008 - Real World Haskell

Tuples

I A tuple is a fixed-size collection of values.

I Items in a tuple can have different types.

I Example: (True,"foo")

I This has the type (Bool,String)

Contrast tuples with lists, to see why we’d want both:

I A list is a variable-sized collection of values.

I Each value in a list must have the same type.

I Example: [True, False]

Page 38: DEFUN 2008 - Real World Haskell

The zip function

What does the zip function do? Adventures in function discovery, courtesy of ghci:

I Start by inspecting its type, using :type.

I Try it with one set of inputs.

I Then try with another.
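
For reference after experimenting, here is a small sketch of what zip does. The name pairs is made up for illustration:

```haskell
-- zip pairs up elements position by position.
-- It stops as soon as either input list runs out.
pairs :: [(Int, Char)]
pairs = zip [1, 2, 3] "ab"
```

Here pairs is [(1,'a'),(2,'b')]: the 3 has no partner, so it is dropped.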

Page 39: DEFUN 2008 - Real World Haskell

Making our program runnable

Add the following code to Length.hs:

main = do
  -- Exercise: get the command line arguments
  lengths <- mapM countLines args
  mapM_ printLength (zip args lengths)
  case args of
    (_:_:_) -> printLength ("total", sum lengths)
    _       -> return ()

Don’t forget to add an import directive at the beginning!

Page 40: DEFUN 2008 - Real World Haskell

The mapM function

I This function applies an action to a list of arguments in turn,and returns the list of results.

I The mapM_ function is similar, but returns the value (), aka unit (“nothing”).

I The mapM_ function is useful for the effects it causes, e.g. printing every element of a list.
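
A short sketch contrasting the two functions. The names lineCounts and printAll are hypothetical helpers, not part of the tutorial's program:

```haskell
-- mapM runs the action on each element and collects the results.
lineCounts :: [String] -> IO [Int]
lineCounts texts = mapM (\t -> return (length (lines t))) texts

-- mapM_ runs the action on each element only for its effect,
-- and returns ().
printAll :: [String] -> IO ()
printAll names = mapM_ putStrLn names
```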

Page 41: DEFUN 2008 - Real World Haskell

Write your own printLength function

Hint: we’ve seen a similar example already, with our getArgs example.
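
One possible answer, as a sketch. It assumes the output format of the sample run later in the slides (file name, a space, then the count). The helper formatLength is a made-up name that keeps the formatting pure and easy to check:

```haskell
-- Pure part: build the text for one (name, count) pair.
formatLength :: (FilePath, Integer) -> String
formatLength (path, count) = path ++ " " ++ show count

-- Effectful part: print that text on its own line.
printLength :: (FilePath, Integer) -> IO ()
printLength pair = putStrLn (formatLength pair)
```

Pattern matching on the tuple in formatLength is what lets main pass it the pairs produced by zip.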

Page 42: DEFUN 2008 - Real World Haskell

Compiling your program

It’s easy to compile a program with GHC:

$ ghc --make Length

What does the compiler do?

I Looks for a source file named Length.hs.

I Compiles it to native code.

I Generates an executable named Length.

Page 43: DEFUN 2008 - Real World Haskell

Running our program

Here’s an example from my laptop:

$ time ./Length *.fasta
1000-Rn_EST.fasta 9975
chr18.fasta 14032
chr19.fasta 14005
chr20.fasta 13980
chr_all.fasta 42017
total 94009

real 0m1.533s

Oh, no! Look at that performance!

I 90 times slower than wc

Page 44: DEFUN 2008 - Real World Haskell

Faster file processing

I Lists are wonderful to work with

I But they exact a huge performance toll

The current best-of-breed alternative for file data:

I ByteString

Page 45: DEFUN 2008 - Real World Haskell

What is a ByteString?

They come in two flavours:

I Strict: a single packed array of bytes

I Lazy: a list of 64KB strict chunks

Each flavour provides a list-like API.

Page 46: DEFUN 2008 - Real World Haskell

Retooling our word count program

All we do is add an import and change one function:

import qualified Data.ByteString.Lazy.Char8 as B

countLines path = do
  contents <- B.readFile path
  return (length (B.lines contents))

The “B.” prefixes make us pick up the readFile and lines functions from the bytestring package.

Page 47: DEFUN 2008 - Real World Haskell

What happens to performance?

I Haskell lists: 1.533 seconds

I Lazy ByteString: 0.022 seconds

I wc command: 0.015 seconds

Given the tiny data set size, C and Haskell are in a dead heat.

Page 48: DEFUN 2008 - Real World Haskell

When to use ByteStrings?

I Any time you deal with binary data

I For text, only if you’re sure it’s 8-bit clean

For i18n needs, fast packed Unicode is under development.

Great open source libraries that use ByteStrings:

I binary—parsing/generation of binary data

I zlib and bzlib: support for popular compression/decompression formats

I attoparsec—parse text-based files and network protocols

Page 49: DEFUN 2008 - Real World Haskell

Part 2

Page 50: DEFUN 2008 - Real World Haskell

A little bit about JSON

A popular interchange format for structured data: simpler than XML, and widely supported.

Basic types:

I Number

I String

I Boolean

I Null

Derived types:

I Object: unordered name/value map

I Array: ordered collection of values

Page 51: DEFUN 2008 - Real World Haskell

JSON at work: Twitter’s search API

From http://search.twitter.com/search.json?q=haskell:

{"text": "Why Haskell? Easiest way to be productive",
 "to_user_id": null,
 "from_user": "galoisinc",
 "id": 936114469,
 "from_user_id": 1633746,
 "iso_language_code": "en",
 "created_at": "Fri, 26 Sep 2008 19:15:35 +0000"}

Page 52: DEFUN 2008 - Real World Haskell

JSON in Haskell

data JSValue
    = JSNull
    | JSBool !Bool
    | JSRational !Rational
    | JSString JSString
    | JSArray [JSValue]
    | JSObject (JSObject JSValue)

Page 53: DEFUN 2008 - Real World Haskell

What is a JSString?

We hide the underlying use of a String:

newtype JSString = JSONString { fromJSString :: String }

toJSString :: String -> JSString
toJSString = JSONString

We do the same with JSON objects:

newtype JSObject a = JSONObject { fromJSObject :: [(String, a)] }

toJSObject :: [(String, a)] -> JSObject a
toJSObject = JSONObject

Page 54: DEFUN 2008 - Real World Haskell

JSON conversion

In Haskell, we capture type-dependent patterns using typeclasses:

I The class of types whose values can be converted to and from JSON

data Result a = Ok a | Error String

class JSON a where
  readJSON :: JSValue -> Result a
  showJSON :: a -> JSValue

Page 55: DEFUN 2008 - Real World Haskell

Why JSString, JSObject, and JSArray?

Haskell typeclasses give us an open world:

I We can declare a type to be an instance of a class at any time

I In fact, we cannot declare the number of instances to be fixed

If we left the String type “naked”, what could happen?

I Someone might declare Char to be an instance of JSON

I What if someone declared a JSON a => JSON [a] instance?

This is the overlapping instances problem.

Page 56: DEFUN 2008 - Real World Haskell

Relaxing the overlapping instances restriction

By default, GHC is conservative:

I It rejects overlapping instances outright

We can get it to loosen up a bit via a pragma:

{-# LANGUAGE OverlappingInstances #-}

If it finds one most specific instance, it will use it, otherwise bail as before.

Page 57: DEFUN 2008 - Real World Haskell

Bool as JSON

Here’s a simple way to declare the Bool type as an instance of the JSON class:

instance JSON Bool where
  showJSON = JSBool

  readJSON (JSBool b) = Ok b
  readJSON _          = Error "Bool parse failed"

This has a design problem:

I We’ve plumbed our Result type straight in

I If we want to change its implementation, it will be painful

Page 58: DEFUN 2008 - Real World Haskell

Hiding the plumbing

A simple (but good enough!) approach to abstraction:

success :: a -> Result a
success k = Ok k

failure :: String -> Result a
failure errMsg = Error errMsg

Functions like these are sometimes called “smart constructors”.

Page 59: DEFUN 2008 - Real World Haskell

Does this affect our code much?

We simply replace the explicit constructors with the functions we just defined:

instance JSON Bool where
  showJSON = JSBool

  readJSON (JSBool b) = success b
  readJSON _          = failure "Bool parse failed"
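
To experiment with this instance in ghci, the pieces can be gathered into one file. This is a sketch under a stated simplification: JSValue is cut down to three constructors so the file stands alone, and the deriving clauses are added only so that ghci can print and compare values:

```haskell
-- Reduced JSValue: strings, arrays and objects are left out here.
data JSValue
    = JSNull
    | JSBool !Bool
    | JSRational !Rational
    deriving (Eq, Show)

data Result a = Ok a | Error String
    deriving (Eq, Show)

-- Smart constructors hide the Result plumbing.
success :: a -> Result a
success k = Ok k

failure :: String -> Result a
failure errMsg = Error errMsg

class JSON a where
  readJSON :: JSValue -> Result a
  showJSON :: a -> JSValue

instance JSON Bool where
  showJSON = JSBool

  readJSON (JSBool b) = success b
  readJSON _          = failure "Bool parse failed"
```

Because readJSON is chosen by its result type, a call needs an annotation: readJSON (showJSON True) :: Result Bool evaluates to Ok True, while readJSON JSNull :: Result Bool evaluates to Error "Bool parse failed".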

Page 60: DEFUN 2008 - Real World Haskell

JSON input and output

We can now convert between normal Haskell values and our JSON representation. But...

I ...we still need to be able to transmit this stuff over the wire.

Which is more fun to mull over? Parsing!

Page 61: DEFUN 2008 - Real World Haskell

A functional view of parsing

Here’s a super-simple perspective:

I Take a piece of data (usually a sequence)

I Try to apply an interpretation to it

How might we represent this?

Page 62: DEFUN 2008 - Real World Haskell

A basic type signature for parsing

Take two type variables, i.e. placeholders for types that we’ll substitute later:

I s—the state (data) we want to parse

I a—the type of its interpretation

We get this generic type signature:

s -> a

Let’s make the task more concrete:

I Parse a String as an Int

String -> Int

What’s missing?

Page 63: DEFUN 2008 - Real World Haskell

Parsing as state transformation

After we’ve parsed one Int, we might have more data in our String that we want to parse.

How to represent this? Return the transformed state and the result in a tuple.

s -> (a, s)

We accept an input state of type s, and return a transformed state, also of type s.

Page 64: DEFUN 2008 - Real World Haskell

Parsing is composable

Let’s give integer parsing a name:

parseDigit :: String -> (Int, String)

How might we want to parse two digits?

parseTwoDigits :: String -> ((Int, Int), String)
parseTwoDigits s =
  let (i, t) = parseDigit s
      (j, u) = parseDigit t
  in ((i, j), u)
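
The slides give only a type for parseDigit. To try parseTwoDigits in ghci, here is a sketch with a hypothetical definition filled in. It is deliberately partial: on empty or non-digit input it raises an error, which is the problem a later part of the tutorial returns to:

```haskell
import Data.Char (digitToInt, isDigit)

-- Hypothetical body for parseDigit: take one leading digit and
-- return it together with the rest of the input.
parseDigit :: String -> (Int, String)
parseDigit (c:cs) | isDigit c = (digitToInt c, cs)
parseDigit s = error ("parseDigit: no digit at " ++ show s)

-- The leftover input of the first parse feeds the second parse.
parseTwoDigits :: String -> ((Int, Int), String)
parseTwoDigits s =
  let (i, t) = parseDigit s
      (j, u) = parseDigit t
  in ((i, j), u)
```

For example, parseTwoDigits "42rest" gives ((4,2),"rest").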

Page 65: DEFUN 2008 - Real World Haskell

Chaining parses more tidily

It’s not good to represent the guts of our state explicitly using pairs:

I Tying ourselves to an implementation eliminates wiggle room.

Here’s an alternative approach.

newtype State s a = State {
    runState :: s -> (a, s)
  }

I A newtype declaration hides our implementation. It has no runtime cost.

I The runState function is a deconstructor: it exposes the underlying value.

Page 66: DEFUN 2008 - Real World Haskell

Chaining parses

Given a function that produces a result and a new state, we can “chain up” another function that accepts its result.

chainStates :: State s a -> (a -> State s b) -> State s b
chainStates m k = State chainFunc
  where chainFunc s =
          let (a, t) = runState m s
          in runState (k a) t

Notice that the result type is compatible with the input:

I We can chain uses of chainStates!

Page 67: DEFUN 2008 - Real World Haskell

Injecting a pure value

We’ll often want to leave the current state untouched, but inject a normal value that we can use when chaining.

pureState :: a -> State s a
pureState a = State $ \s -> (a, s)
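
A self-contained sketch of chaining and injection working together. State, chainStates and pureState are as in the slides; takeN and twoFields are hypothetical names introduced only for this example:

```haskell
newtype State s a = State { runState :: s -> (a, s) }

chainStates :: State s a -> (a -> State s b) -> State s b
chainStates m k = State chainFunc
  where chainFunc s =
          let (a, t) = runState m s
          in runState (k a) t

pureState :: a -> State s a
pureState a = State $ \s -> (a, s)

-- Hypothetical parser: split off the first n characters of the input.
-- splitAt already returns (result, remaining input).
takeN :: Int -> State String String
takeN n = State (splitAt n)

-- Take a 2-character field, then a 3-character field, and pair them up.
-- The state is threaded through without being mentioned.
twoFields :: State String (String, String)
twoFields =
  takeN 2 `chainStates` \a ->
  takeN 3 `chainStates` \b ->
  pureState (a, b)
```

In ghci, runState twoFields "abcdefg" gives (("ab","cde"),"fg").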

Page 68: DEFUN 2008 - Real World Haskell

What about computations that might fail?

Try these in ghci:

Prelude> head [1,2,3]
1
Prelude> head []

What gets printed in the second case?

Page 69: DEFUN 2008 - Real World Haskell

One approach to potential failure

The Prelude defines this handy standard type:

data Maybe a = Just a
             | Nothing

We can use it as follows:

safeHead (x:_) = Just x
safeHead []    = Nothing

Save this in a source file, load it into ghci, and try it out.

Page 70: DEFUN 2008 - Real World Haskell

Some familiar operations

We can chain Maybe values:

chainMaybes :: Maybe a -> (a -> Maybe b)
            -> Maybe b
chainMaybes Nothing  k = Nothing
chainMaybes (Just x) k = k x

This gives us short circuiting if any computation in a chain fails:

I Maybe is the Ur-exception.

We can also inject a pure value into a Maybe-typed computation:

pureMaybe :: a -> Maybe a
pureMaybe x = Just x
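
A small sketch of the short circuit in action, reusing safeHead from the previous slide. The function firstOfFirst is a made-up example:

```haskell
safeHead :: [a] -> Maybe a
safeHead (x:_) = Just x
safeHead []    = Nothing

chainMaybes :: Maybe a -> (a -> Maybe b) -> Maybe b
chainMaybes Nothing  _ = Nothing
chainMaybes (Just x) k = k x

-- First element of the first inner list, if both exist.
-- If the outer safeHead fails, the inner one is never run.
firstOfFirst :: [[a]] -> Maybe a
firstOfFirst xss = safeHead xss `chainMaybes` safeHead
```

Here firstOfFirst [[1,2],[3]] is Just 1, while firstOfFirst [[],[3]] and firstOfFirst [] are both Nothing.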

Page 71: DEFUN 2008 - Real World Haskell

What do these types have in common?

Chaining:

chainMaybes :: Maybe a -> (a -> Maybe b)
            -> Maybe b

chainStates :: State s a -> (a -> State s b)
            -> State s b

Injection of a pure value:

pureState :: a -> State s a
pureMaybe :: a -> Maybe a

I Abstract away the type constructors, and these have identical types!

Page 72: DEFUN 2008 - Real World Haskell

Monads

More type-related pattern capture, courtesy of typeclasses:

class Monad m where
  -- chain
  (>>=) :: m a -> (a -> m b) -> m b

  -- inject a pure value
  return :: a -> m a

Page 73: DEFUN 2008 - Real World Haskell

Instances

When a type is an instance of a typeclass, it supplies particular implementations of the typeclass’s functions:

instance Monad Maybe where
  (>>=)  = chainMaybes
  return = pureMaybe

instance Monad (State s) where
  (>>=)  = chainStates
  return = pureState

Page 74: DEFUN 2008 - Real World Haskell

Chaining with monads

Using the methods of the Monad typeclass:

parseThreeDigits =
  parseDigit >>= \a ->
  parseDigit >>= \b ->
  parseDigit >>= \c ->
  return (a, b, c)

Syntactically sugared with do-notation:

parseThreeDigits = do
  a <- parseDigit
  b <- parseDigit
  c <- parseDigit
  return (a, b, c)

This now looks suspiciously like imperative code.
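
To try do-notation with a hand-written State type on a current compiler, one detail matters: since GHC 7.10, Monad has Applicative (and therefore Functor) as a superclass, so a Monad instance needs two companion instances. A minimal sketch follows; the counter fresh and the name threeFresh are hypothetical stand-ins for a real parser:

```haskell
import Control.Monad (ap, liftM)

newtype State s a = State { runState :: s -> (a, s) }

-- Companion instances, defined in terms of the Monad instance below.
instance Functor (State s) where
  fmap = liftM

instance Applicative (State s) where
  pure a = State $ \s -> (a, s)
  (<*>)  = ap

instance Monad (State s) where
  m >>= k = State $ \s ->
    let (a, t) = runState m s
    in runState (k a) t

-- Hypothetical action: return the counter, then increment it.
fresh :: State Int Int
fresh = State $ \n -> (n, n + 1)

-- Same shape as parseThreeDigits: three steps, one combined result.
threeFresh :: State Int (Int, Int, Int)
threeFresh = do
  a <- fresh
  b <- fresh
  c <- fresh
  return (a, b, c)
```

In ghci, runState threeFresh 0 gives ((0,1,2),3).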

Page 75: DEFUN 2008 - Real World Haskell

Haven’t we forgotten something?

What happens if we want to parse a digit out of a string that doesn’t contain any?

I We’d like to “break the chain” if a parse fails.

I We have this nice Maybe type for representing failure.

Alas, we can’t combine the Maybe monad with the State monad.

I Different monads do not combine.

Page 76: DEFUN 2008 - Real World Haskell

But this is awful! Don’t we need lots of boilerplate?

Are we condemned to a world of numerous slightly tweaked custom monads?

We can adapt the behaviour of an underlying monad.

newtype MaybeT m a = MaybeT {
    runMaybeT :: m (Maybe a)
  }

Page 77: DEFUN 2008 - Real World Haskell

Can we inject a pure value?

pureMaybeT :: (Monad m) => a -> MaybeT m a
pureMaybeT a = MaybeT (return (Just a))

Page 78: DEFUN 2008 - Real World Haskell

Can we write a chaining function?

chainMaybeTs :: (Monad m) => MaybeT m a -> (a -> MaybeT m b)
             -> MaybeT m b
x `chainMaybeTs` f = MaybeT $ do
  unwrapped <- runMaybeT x
  case unwrapped of
    Nothing -> return Nothing
    Just y  -> runMaybeT (f y)

Page 79: DEFUN 2008 - Real World Haskell

Making a Monad instance

Given an underlying monad, we can stack a MaybeT on top of it and get a new monad.

instance (Monad m) => Monad (MaybeT m) where
  (>>=)  = chainMaybeTs
  return = pureMaybeT
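
The following sketch stacks MaybeT on a hand-written State to get a parser that stops at the first failure. It assumes a current GHC, so it adds the Functor and Applicative instances that Monad now requires. Parser, digit, twoDigits and runParser are hypothetical names for this example only:

```haskell
import Control.Monad (ap, liftM)
import Data.Char (digitToInt, isDigit)

newtype State s a = State { runState :: s -> (a, s) }

instance Functor (State s) where fmap = liftM
instance Applicative (State s) where
  pure a = State $ \s -> (a, s)
  (<*>)  = ap
instance Monad (State s) where
  m >>= k = State $ \s ->
    let (a, t) = runState m s in runState (k a) t

newtype MaybeT m a = MaybeT { runMaybeT :: m (Maybe a) }

instance Monad m => Functor (MaybeT m) where fmap = liftM
instance Monad m => Applicative (MaybeT m) where
  pure a = MaybeT (return (Just a))
  (<*>)  = ap
instance Monad m => Monad (MaybeT m) where
  x >>= f = MaybeT $ do
    unwrapped <- runMaybeT x
    case unwrapped of
      Nothing -> return Nothing      -- break the chain
      Just y  -> runMaybeT (f y)

-- A parser is "maybe a result" layered over "state that is a String".
type Parser = MaybeT (State String)

-- Consume one digit, or fail and leave the input as it was.
digit :: Parser Int
digit = MaybeT $ State $ \s -> case s of
  (c:cs) | isDigit c -> (Just (digitToInt c), cs)
  _                  -> (Nothing, s)

twoDigits :: Parser (Int, Int)
twoDigits = do
  a <- digit
  b <- digit
  return (a, b)

runParser :: Parser a -> String -> (Maybe a, String)
runParser p = runState (runMaybeT p)
```

In ghci, runParser twoDigits "42x" gives (Just (4,2),"x"). With input "4x" the second digit fails, and the result is (Nothing,"x"): the failure is reported, but the first digit has already been consumed, since nothing here backtracks.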

Page 80: DEFUN 2008 - Real World Haskell

A custom monad in 2 lines of code

A parsing type that can short-circuit:

{-# LANGUAGE GeneralizedNewtypeDeriving #-}

newtype MyParser a = MyP (MaybeT (State String) a)
  deriving (Monad, MonadState String)

We use a GHC extension to automatically generate instances of non-H98 typeclasses:

I Monad

I MonadState String

Page 81: DEFUN 2008 - Real World Haskell

What is MonadState?

The State monad is parameterised over its underlying state, as State s:

I It knows nothing about the state, and cannot manipulate it.

Instead, it implements an interface that lets us query and modify the state ourselves:

class (Monad m) => MonadState s m where
  -- query the current state
  get :: m s

  -- replace the state with a new one
  put :: s -> m ()

Page 82: DEFUN 2008 - Real World Haskell

Parsing text

In essence:

I Get the current state, modify it, put the new state back.

What do we do on failure?

string :: String -> MyParser ()
string str = do
  s <- get
  let (hd, tl) = splitAt (length str) s
  if str == hd
    then put tl
    else fail $ "failed to match " ++ show str

Page 83: DEFUN 2008 - Real World Haskell

Shipment of fail

We’ve carefully hidden fail so far. Why?

I Many monads have a very bad definition: error.

What’s the problem with error?

I It throws an exception that we can’t catch in pure code.

I It’s only safe to use in catastrophic cases.

Page 84: DEFUN 2008 - Real World Haskell

Non-catastrophic failure

A bread-and-butter activity in parsing is lookahead:

I Inspect the input stream and see what to do next

JSON example:

I An object begins with “{”

I An array begins with “[”

We look at the next input token to figure out what to do.

I If we fail to match “{”, it’s not an error.

I We just try “[” instead.

Page 85: DEFUN 2008 - Real World Haskell

Giving ourselves alternatives

We have two conflicting goals:

I We like to keep our implementation options open.

I Whether fail crashes depends on the underlying monad.

We need a safer, abstract way to fail.

Page 86: DEFUN 2008 - Real World Haskell

MonadPlus

A typeclass with two methods:

class Monad m => MonadPlus m where
  -- non-fatal failure
  mzero :: m a

  -- if the first action fails,
  -- perform the second instead
  mplus :: m a -> m a -> m a

To upgrade our code, we replace our use of fail with mzero.
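
The Maybe type from the Prelude is already an instance of MonadPlus, which makes it convenient for trying out mzero and mplus before building a parser. A sketch of the lookahead idea from the previous slides; Opener, openBrace, openBracket and opener are hypothetical names:

```haskell
import Control.Monad (mplus)
import Data.List (stripPrefix)

-- Which kind of JSON value starts here?
data Opener = ObjectStart | ArrayStart
  deriving (Eq, Show)

-- Each recogniser returns the token and the remaining input,
-- or Nothing (the mzero of Maybe) if the input starts differently.
openBrace :: String -> Maybe (Opener, String)
openBrace s = fmap (\rest -> (ObjectStart, rest)) (stripPrefix "{" s)

openBracket :: String -> Maybe (Opener, String)
openBracket s = fmap (\rest -> (ArrayStart, rest)) (stripPrefix "[" s)

-- Try "{" first; if that fails, try "[" instead.
opener :: String -> Maybe (Opener, String)
opener s = openBrace s `mplus` openBracket s
```

In ghci, opener "[1]" gives Just (ArrayStart,"1]"), and opener "x" gives Nothing.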

Page 87: DEFUN 2008 - Real World Haskell

Writing a MonadPlus instance

We can easily make any stack of MaybeT atop another monad a MonadPlus:

instance Monad m => MonadPlus (MaybeT m) where
  mzero = MaybeT $ return Nothing

  a `mplus` b = MaybeT $ do
    result <- runMaybeT a
    case result of
      Just k  -> return (Just k)
      Nothing -> runMaybeT b

We simply add MonadPlus to the list of typeclasses we ask GHC to automatically derive for us.

Page 88: DEFUN 2008 - Real World Haskell

Using MonadPlus

Given functions that know how to parse bits of JSON:

parseObject :: MyParser [(String, JSValue)]
parseArray :: MyParser [JSValue]

We can turn them into a coherent whole:

parseJSON :: MyParser JSValue
parseJSON =
      (parseObject >>= \o -> return (JSObject o))
    `mplus`
      (parseArray >>= \a -> return (JSArray a))
    `mplus`
      ...

Page 89: DEFUN 2008 - Real World Haskell

The problem of boilerplate

Here’s a repeated pattern from our parser:

foo >>= \x -> return (bar x)

These brief uses of variables, >>=, and return are redundant and burdensome.

In fact, this pattern of applying a pure function to a monadic result is ubiquitous.

Page 90: DEFUN 2008 - Real World Haskell

Boilerplate removal via lifting

We replace this boilerplate with liftM:

liftM :: Monad m => (a -> b) -> m a -> m b

We refer to this as lifting a pure function into the monad.

parseJSON =
      (JSObject `liftM` parseObject)
    `mplus`
      (JSArray `liftM` parseArray)

This style of programming looks less imperative, and more applicative.
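
A quick way to convince yourself that liftM is the same pattern, using the Maybe monad so it runs without a parser. The names viaBind and viaLift are made up for this sketch:

```haskell
import Control.Monad (liftM)

-- The boilerplate form: bind, apply a pure function, re-inject.
viaBind :: Maybe Int -> Maybe String
viaBind m = m >>= \x -> return (show x)

-- The lifted form: the pure function is applied "inside" the monad.
viaLift :: Maybe Int -> Maybe String
viaLift m = show `liftM` m
```

Both give Just "3" for Just 3, and Nothing for Nothing.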

Page 91: DEFUN 2008 - Real World Haskell

The Parsec library

Our motivation so far:

I Show you that it’s really easy to build a monadic parsing library

But we must concede:

I Maybe you simply want to parse stuff

Instead of rolling your own, use Daan Leijen’s Parsec library.

Page 92: DEFUN 2008 - Real World Haskell

What to expect from Parsec

It has some great advantages:

I A complete, concise EDSL for building parsers

I Easy to learn

I Produces useful error messages

But it’s not perfect:

I Strict, so cannot parse huge streams incrementally

I Based on String, hence slow

I Accepts, and chokes on, left-recursive grammars

Page 93: DEFUN 2008 - Real World Haskell

Parsing a JSON string

An example of Parsec’s concision:

jsonString = between (char '\"') (char '\"')
                     (many jsonChar)

Some parsing combinators explained:

I between matches its 1st argument, then its 3rd, then its 2nd

I many runs a parser until it fails

I It returns a list of parse results

Page 94: DEFUN 2008 - Real World Haskell

Parsing a character within a string

jsonChar = char '\\' >> (p_esc <|> p_uni)
       <|> satisfy (`notElem` "\"\\")

Between quotes, jsonChar matches a string’s body:

I A backslash must be followed by an escape (“\n”) or Unicode (“\u2fbe”)

I Any other character except “\” or “"” is okay

More combinator notes:

I The >> combinator is like >>=, but provides only sequencing, not binding

I The satisfy combinator uses a pure predicate.

Page 95: DEFUN 2008 - Real World Haskell

Your turn!

Write a parser for numbers. Here are some pieces you’ll need:

import Numeric (readFloat, readSigned)
import Text.ParserCombinators.Parsec
import Control.Monad (mzero)

Other functions you’ll need:

I getInput

I setInput

The type of your parser should look like this:

parseNumber :: CharParser () Rational

Page 96: DEFUN 2008 - Real World Haskell

Experimenting with your parser

Simply load your code into ghci, and start playing:

Prelude> :load MyParser
*Main> parseTest parseNumber "3.14159"

Page 97: DEFUN 2008 - Real World Haskell

My number parser

parseNumber = do
    s <- getInput
    case readSigned readFloat s of
      [(n, s')] -> setInput s' >> return n
      _         -> mzero
  <?> "number"

Page 98: DEFUN 2008 - Real World Haskell

Using JSON in Haskell

A good JSON package is already available from Hackage:

I http://tinyurl.com/hs-json

I The module is named Text.JSON

I Doesn’t use overlapping instances

Page 99: DEFUN 2008 - Real World Haskell

Part 3

This was going to be a concurrent web application, but I ran out of time.

I It’s still going to be informative and fun!

Page 100: DEFUN 2008 - Real World Haskell

Concurrent programming

The dominant programming model:

I Shared-state threads

I Locks for synchronization

I Condition variables for notification

Page 101: DEFUN 2008 - Real World Haskell

The prehistory of threads

Invented independently at least 3 times, circa 1965:

I Dijkstra

I Berkeley Timesharing System

I PL/I’s CALL XXX (A, B) TASK;

Alas, the model has barely changed in almost half a century.

Page 102: DEFUN 2008 - Real World Haskell

What does threading involve?

Threads are a simple extension to sequential programming.
All that we lose are the following:

I Understandability,

I Predictability, and

I Correctness

Page 103: DEFUN 2008 - Real World Haskell

Concurrent Haskell

I Introduced in 1996, inspired by Id.

I Provides a forkIO action to create threads.

The MVar type is the communication primitive:

I Atomically modifiable single-slot container

I Provides get and put operations

I An empty MVar blocks on get

I A full MVar blocks on put

We can use MVars to build locks, semaphores, etc.
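As a rough sketch of both uses, the example below treats one MVar as a lock around a shared counter and a second, initially empty MVar as a completion signal. In Control.Concurrent.MVar the slide's "get" and "put" are spelled takeMVar and putMVar; countWith is a hypothetical name chosen for this illustration.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar
  (modifyMVar_, newEmptyMVar, newMVar, putMVar, readMVar, takeMVar)
import Control.Monad (forM_, replicateM_)

-- Hypothetical helper: start `workers` threads that each increment a shared
-- counter `perWorker` times, wait for all of them, and return the total.
countWith :: Int -> Int -> IO Int
countWith workers perWorker = do
  counter <- newMVar 0        -- full MVar: acts as a lock around the Int
  done    <- newEmptyMVar     -- empty MVar: used as a completion signal
  forM_ [1 .. workers] $ \_ -> forkIO $ do
    -- modifyMVar_ takes the value, applies the update, and puts it back.
    replicateM_ perWorker (modifyMVar_ counter (return . (+ 1)))
    putMVar done ()           -- blocks while `done` is still full
  replicateM_ workers (takeMVar done)  -- blocks while `done` is empty
  readMVar counter

main :: IO ()
main = countWith 4 1000 >>= print
```

No increments are lost, because only the thread that currently holds the counter's contents can update it.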

Page 104: DEFUN 2008 - Real World Haskell

What’s wrong with MVars?

MVars are no safer than the concurrency primitives of other languages.

I Deadlocks

I Data corruption

I Race conditions

Higher-order programming and phantom typing can help, but only a little.

Page 105: DEFUN 2008 - Real World Haskell

The fundamental problem

Given two correct concurrent program fragments:

I We cannot compose another correct concurrent fragment from them without great care.

Page 106: DEFUN 2008 - Real World Haskell

Message passing is no panacea

It brings its own difficulties:

I The programming model is demanding.

I Deadlock avoidance is hard.

I Debugging is really tough.

I Don’t forget coherence, scaling, atomicity, ...

Page 107: DEFUN 2008 - Real World Haskell

Lock-free data structures

A focus of much research in the 1990s.

I Modus operandi: find a new lock-free algorithm, earn a PhD.

I Tremendously difficult to get the code right.

I Neither a scalable nor a sustainable approach!

This inspired research into hardware support, followed by:

I Software transactional memory

Page 108: DEFUN 2008 - Real World Haskell

Software transactional memory

The model is loosely similar to database programming:

I Start a transaction.

I Do lots of work.

I Either all changes succeed atomically...

I ...Or they all abort, again atomically.

An aborted transaction is usually restarted.

Page 109: DEFUN 2008 - Real World Haskell

The perils of STM

STM code needs to be careful:

I Transactional code must not perform non-transactional actions.

I On abort-and-restart, there’s no way to roll back dropNukes()!

In traditional languages, this is unenforceable.

I Programmers can innocently cause serious, hard-to-find bugs.

Some hacks exist to help, e.g. tm_callable annotations.

Page 110: DEFUN 2008 - Real World Haskell

STM in Haskell

In Haskell, the type system solves this problem for us.

I Recall that I/O actions have IO in their type signatures.

I STM actions have STM in their type signatures, but not IO.

I The type system statically prevents STM code from performing non-transactional actions!

Page 111: DEFUN 2008 - Real World Haskell

Firing up a transaction

As usual, we can explore APIs in ghci.
The atomically action launches a transaction:

Prelude> :m +Control.Concurrent.STM
Prelude Control.Concurrent.STM> :type atomically
atomically :: STM a -> IO a
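A minimal sketch of atomically in use, assuming the stm package. The transfer function and the two counters are hypothetical and serve only to show a multi-step transaction being run from IO.

```haskell
import Control.Concurrent.STM

-- Hypothetical example: move `n` units between two counters.
-- Both updates belong to one transaction, so no other thread can observe
-- the state in which only the first update has happened.
transfer :: Int -> TVar Int -> TVar Int -> STM ()
transfer n from to = do
  modifyTVar' from (subtract n)
  modifyTVar' to   (+ n)
  -- A call such as `putStrLn "moved"` would be rejected by the type
  -- checker here: its type is IO (), not STM ().

main :: IO ()
main = do
  a <- newTVarIO 10
  b <- newTVarIO 0
  atomically (transfer 3 a b)   -- STM () becomes IO () at this point
  balances <- atomically ((,) <$> readTVar a <*> readTVar b)
  print balances
```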

Page 112: DEFUN 2008 - Real World Haskell

Let’s build a game—World of Haskellcraft

Our players love to have possessions.

data Item = Scroll | Wand | Banjo
            deriving (Eq, Ord, Show)

-- inventory
data Inv = Inv {
      invItems    :: [Item],
      invCapacity :: Int
    } deriving (Eq, Ord, Show)

Page 113: DEFUN 2008 - Real World Haskell

Inventory manipulation

Here’s how we set up mutable player inventory:

import Control.Concurrent.STM

type Inventory = TVar Inv

newInventory :: Int -> IO Inventory
newInventory cap =
    newTVarIO Inv { invItems = [],
                    invCapacity = cap }

The use of curly braces is called record syntax.
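Record syntax covers construction, field access, and update. The sketch below repeats the slide's types so that it compiles alone; the values start and loaded are hypothetical names used only here.

```haskell
data Item = Scroll | Wand | Banjo
            deriving (Eq, Ord, Show)

data Inv = Inv {
      invItems    :: [Item],
      invCapacity :: Int
    } deriving (Eq, Ord, Show)

-- Construction by field name; the order of the fields does not matter.
start :: Inv
start = Inv { invCapacity = 3, invItems = [] }

-- Record update builds a new value; `start` itself is left unchanged.
loaded :: Inv
loaded = start { invItems = [Wand] }

main :: IO ()
main = do
  print (invItems loaded)     -- field names double as accessor functions
  print (invCapacity loaded)
  print (start == Inv [] 3)   -- positional construction gives the same value
```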

Page 114: DEFUN 2008 - Real World Haskell

Inventory manipulation

Here’s how we can add an item to a player’s inventory:

addItem :: Item -> Inventory -> STM ()

addItem item inv = do
  i <- readTVar inv
  writeTVar inv i {
    invItems = item : invItems i
  }

But wait a second:

I What about an inventory’s capacity?

I We don’t want our players to have infinitely deep pockets!

Page 115: DEFUN 2008 - Real World Haskell

Checking capacity

GHC defines a retry action that will abort and restart a transaction if it cannot succeed:

import Control.Monad (when)

isFull :: Inv -> Bool
isFull (Inv items cap) = length items == cap

addItem item inv = do
  i <- readTVar inv
  when (isFull i)
    retry
  writeTVar inv i {
    invItems = item : invItems i
  }

Page 116: DEFUN 2008 - Real World Haskell

Let’s try it out

Save the code in a file, and fire up ghci:

*Main> i <- newInventory 3
*Main> atomically (addItem Wand i)
*Main> atomically (readTVar i)
Inv {invItems = [Wand], invCapacity = 3}

What happens if you repeat the addItem a few more times?

Page 117: DEFUN 2008 - Real World Haskell

How does retry work?

In principle, all the runtime has to do is retry the transaction immediately, and spin tightly until it succeeds.

I This might be correct, but it’s wasteful.

What happens instead?

I The RTS tracks each mutable variable touched during a transaction.

I On retry, it blocks the transaction until at least one of those variables is modified.

We haven’t told GHC what variables to wait on: it does this automatically!
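The blocking behaviour can be observed with a small sketch, assuming the stm package. The name waitNonZero and the 0.1 second delay are choices made for this illustration, not part of the slides.

```haskell
import Control.Concurrent (forkIO, threadDelay)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Concurrent.STM
import Control.Monad (when)

-- Hypothetical transaction: wait (via retry) until the variable is non-zero.
-- Nothing here names the variable to wait on; the runtime records that the
-- transaction read `tv`.
waitNonZero :: TVar Int -> STM Int
waitNonZero tv = do
  v <- readTVar tv
  when (v == 0) retry
  return v

main :: IO ()
main = do
  tv     <- newTVarIO 0
  result <- newEmptyMVar
  -- If the waiter runs while tv is still 0, it hits retry and is blocked.
  _ <- forkIO (atomically (waitNonZero tv) >>= putMVar result)
  threadDelay 100000              -- give the waiter time to block
  atomically (writeTVar tv 42)    -- modifying tv lets the waiter rerun
  takeMVar result >>= print
```

The program gives the same answer whichever thread is scheduled first; the delay only makes it likely that the waiter really does block before the write.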

Page 118: DEFUN 2008 - Real World Haskell

Your turn!

Write a function that removes an item from a player’s inventory:

removeItem :: Item -> Inventory -> STM ()

Page 119: DEFUN 2008 - Real World Haskell

My item removal action

removeItem item inv = do
  i <- readTVar inv
  case break (==item) (invItems i) of
    (_, [])    -> retry
    (h, (_:t)) -> writeTVar inv i {
                    invItems = h ++ t
                  }

Page 120: DEFUN 2008 - Real World Haskell

Your turn again!

Write an action that lets us give an item from one player to another:

giveItem :: Item -> Inventory -> Inventory
         -> STM ()

Page 121: DEFUN 2008 - Real World Haskell

My solution

giveItem item a b = do
  removeItem item a
  addItem item b

Page 122: DEFUN 2008 - Real World Haskell

What about that blocking?

If we’re writing a game, we don’t want to block forever if a player’s inventory is full or empty.

I We’d like to say “you can’t do that right now”.

Page 123: DEFUN 2008 - Real World Haskell

One approach to immediate failure

Let’s call this the C programmer’s approach:

addItem1 :: Item -> TVar Inv -> STM Bool
addItem1 item inv = do
  i <- readTVar inv
  if isFull i
    then return False
    else do
      writeTVar inv i {
        invItems = item : invItems i
      }
      return True

Page 124: DEFUN 2008 - Real World Haskell

What is the cost of this approach?

If we have to check our results everywhere:

I The need for checking will spread

I Sadness will ensue

Page 125: DEFUN 2008 - Real World Haskell

The Haskeller’s first loves

We have some fondly held principles:

I Abstraction

I Composability

I Higher-order programming

How can we apply these here?

Page 126: DEFUN 2008 - Real World Haskell

A more abstract approach

It turns out that the STM monad is a MonadPlus instance:

immediately :: STM a -> STM (Maybe a)
immediately act =
    (Just `liftM` act) `mplus` return Nothing

Page 127: DEFUN 2008 - Real World Haskell

What does mplus do in STM?

This combinator is defined as orElse :

orElse :: STM a -> STM a -> STM a

Given two transactions j and k:

I If transaction j must abort (that is, it calls retry), perform transaction k instead.
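A small sketch of orElse choosing between two alternatives, assuming the stm package. The helpers takeOne and takeEither and the two counters are hypothetical.

```haskell
import Control.Concurrent.STM

-- Hypothetical helper: decrement a counter, or retry if it is already zero.
takeOne :: TVar Int -> STM ()
takeOne tv = do
  n <- readTVar tv
  if n == 0 then retry else writeTVar tv (n - 1)

-- Try the first counter; if that branch retries, run the second instead.
-- If both branches retry, the whole transaction retries.
takeEither :: TVar Int -> TVar Int -> STM String
takeEither primary backup =
  (takeOne primary >> return "primary")
    `orElse` (takeOne backup >> return "backup")

main :: IO ()
main = do
  primary <- newTVarIO 0
  backup  <- newTVarIO 2
  atomically (takeEither primary backup) >>= putStrLn
  atomically (readTVar backup) >>= print
```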

Page 128: DEFUN 2008 - Real World Haskell

A complicated specification

We now have all the pieces we need to:

I Atomically give an item from one player to another.

I Fail immediately if the giver does not have it, or the recipient cannot accept it.

I Convert the result to a Bool.

Page 129: DEFUN 2008 - Real World Haskell

Compositionality for the win

Here’s how we glue the whole lot together:

import Data.Maybe (isJust)

giveItemNow :: Item -> Inventory -> Inventory
            -> IO Bool

giveItemNow item a b =
    liftM isJust . atomically . immediately $
      removeItem item a >> addItem item b

Even better, we can do all of this as nearly a one-liner!
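To run the whole thing, the pieces from the preceding slides can be collected into one file, as in the sketch below. The definitions are the slides' own, lightly compacted; the players alice and bob and the scenario in main are hypothetical additions. The stm package is assumed.

```haskell
import Control.Concurrent.STM
import Control.Monad (liftM, mplus, when)
import Data.Maybe (isJust)

data Item = Scroll | Wand | Banjo deriving (Eq, Ord, Show)
data Inv = Inv { invItems :: [Item], invCapacity :: Int }
           deriving (Eq, Ord, Show)
type Inventory = TVar Inv

newInventory :: Int -> IO Inventory
newInventory cap = newTVarIO Inv { invItems = [], invCapacity = cap }

isFull :: Inv -> Bool
isFull (Inv items cap) = length items == cap

addItem :: Item -> Inventory -> STM ()
addItem item inv = do
  i <- readTVar inv
  when (isFull i) retry
  writeTVar inv i { invItems = item : invItems i }

removeItem :: Item -> Inventory -> STM ()
removeItem item inv = do
  i <- readTVar inv
  case break (== item) (invItems i) of
    (_, [])    -> retry
    (h, _ : t) -> writeTVar inv i { invItems = h ++ t }

-- Turn a transaction that would block into one that returns Nothing.
immediately :: STM a -> STM (Maybe a)
immediately act = (Just `liftM` act) `mplus` return Nothing

giveItemNow :: Item -> Inventory -> Inventory -> IO Bool
giveItemNow item a b =
  liftM isJust . atomically . immediately $
    removeItem item a >> addItem item b

main :: IO ()
main = do
  alice <- newInventory 3
  bob   <- newInventory 1
  atomically (addItem Wand alice >> addItem Banjo alice)
  giveItemNow Wand   alice bob >>= print  -- bob has room for one item
  giveItemNow Banjo  alice bob >>= print  -- bob is now full
  giveItemNow Scroll alice bob >>= print  -- alice holds no Scroll
  atomically (readTVar alice) >>= print   -- the failed Banjo gift left no trace
  atomically (readTVar bob)   >>= print
```

The second call is the interesting one: removeItem succeeds inside the transaction, but addItem retries, so the removal is discarded along with the rest of that branch and alice keeps her Banjo.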

Page 130: DEFUN 2008 - Real World Haskell

Thank you!

I hope you found this tutorial useful!
Slide source available:

I http://tinyurl.com/defun08