Reasoning about laziness

Johan [email protected]

2011-02-12

Laziness

- Haskell is a lazy language

- Functions and data constructors don't evaluate their arguments until they need them

    cond :: Bool -> a -> a -> a
    cond True  t e = t
    cond False t e = e

- Same with local definitions

    abs :: Int -> Int
    abs x | x > 0     = x
          | otherwise = neg_x
      where neg_x = negate x
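As a small sketch of both points (the name safeDiv is made up for this illustration and is not from the slides), the local definition q below is only evaluated when cond actually selects it, so dividing by zero is never attempted:

```haskell
-- cond from the slide: returns t or e depending on the Bool.
cond :: Bool -> a -> a -> a
cond True  t _ = t
cond False _ e = e

-- Hypothetical example: q is a local definition that would raise
-- "divide by zero" if it were evaluated with d == 0. Because cond never
-- looks at the branch it does not return, q stays an unevaluated thunk
-- in that case.
safeDiv :: Int -> Int -> Int
safeDiv n d = cond (d == 0) 0 q
  where
    q = n `div` d
```

In GHCi, safeDiv 10 0 should return 0 rather than raising an exception, while safeDiv 10 2 returns 5.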

Why laziness is important

- Laziness supports modular programming

- Programmer-written functions instead of built-in language constructs

    (||) :: Bool -> Bool -> Bool
    True  || _ = True
    False || x = x
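To see why an ordinary function can stand in for a built-in short-circuit operator, here is a minimal sketch. The names orElse and anyOf are assumptions chosen to avoid clashing with the Prelude; they are not from the slides.

```haskell
-- A user-defined "or", equivalent to the (||) shown above.
orElse :: Bool -> Bool -> Bool
orElse True  _ = True
orElse False x = x

-- Hypothetical consumer built on top of it. The second argument of orElse
-- (the rest of the fold) is only evaluated when the first is False, so the
-- search stops at the first match, even on an infinite list.
anyOf :: (a -> Bool) -> [a] -> Bool
anyOf p = foldr (\x rest -> p x `orElse` rest) False
```

For example, anyOf (> 10) [1 ..] terminates with True, which would not be possible if orElse evaluated both arguments up front.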

Laziness and modularity

Laziness lets us separate producers and consumers and still get efficient execution:

- Generate all solutions (a huge tree structure)

- Find the solution(s) you want

    nextMove :: Board -> Move
    nextMove b = selectMove allMoves
      where
        allMoves = allMovesFrom b

The solutions are generated as they are consumed.
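The same producer/consumer split can be sketched with lists instead of game trees. Everything in this example (triples, isPythagorean, firstSolutions) is a made-up illustration, not the Board/Move code from the slide:

```haskell
-- Producer: an infinite list of candidate triples with a <= b <= c,
-- generated only as far as the consumer demands.
triples :: [(Int, Int, Int)]
triples = [ (a, b, c) | c <- [1 ..], b <- [1 .. c], a <- [1 .. b] ]

-- Consumer-side predicate: which candidates count as solutions.
isPythagorean :: (Int, Int, Int) -> Bool
isPythagorean (a, b, c) = a * a + b * b == c * c

-- Consumer: take the first n solutions. Only the prefix of `triples`
-- needed to find them is ever constructed.
firstSolutions :: Int -> [(Int, Int, Int)]
firstSolutions n = take n (filter isPythagorean triples)
```

firstSolutions 2 yields [(3,4,5),(6,8,10)]; the producer never has to know how many solutions the consumer wants.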

Example: summing some numbers

    sum :: [Int] -> Int
    sum xs = sum' 0 xs
      where
        sum' acc []     = acc
        sum' acc (x:xs) = sum' (acc + x) xs

foldl abstracts the accumulator recursion pattern:

    foldl :: (a -> b -> a) -> a -> [b] -> a
    foldl f z []     = z
    foldl f z (x:xs) = foldl f (f z x) xs

    sum = foldl (+) 0

A misbehaving function

How does evaluation of this expression proceed?

sum [1,2,3]

Like this:

sum [1,2,3]

==> foldl (+) 0 [1,2,3]

==> foldl (+) (0+1) [2,3]

==> foldl (+) ((0+1)+2) [3]

==> foldl (+) (((0+1)+2)+3) []

==> ((0+1)+2)+3

==> (1+2)+3

==> 3+3

==> 6

Thunks

A thunk represents an unevaluated expression.

- GHC needs to store all the unevaluated + expressions on the heap until their value is needed.

- Storing and evaluating thunks is costly, and unnecessary if the expression was going to be evaluated anyway.

- For a list of n elements, foldl allocates n thunks, one for each addition. For a long list this causes a stack overflow when GHC tries to evaluate the chain of thunks.

Controlling evaluation order

The seq function lets us control evaluation order.

    seq :: a -> b -> b

Informally, when evaluated, the expression seq a b evaluates a and then returns b.

Weak head normal form

Evaluation stops as soon as a data constructor (or lambda) is reached:

ghci> seq (1 `div` 0) 2

*** Exception: divide by zero

ghci> seq ((1 `div` 0), 3) 2

2

We say that seq evaluates to weak head normal form (WHNF).
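A few more cases may help make WHNF concrete. The definitions below are a sketch with made-up names; each one contains undefined somewhere inside the value passed to seq, yet each evaluates to 2 because seq stops at the outermost constructor or lambda:

```haskell
-- The pair constructor is reached first; its components are not evaluated.
pairCase :: Int
pairCase = seq (undefined :: Int, 3 :: Int) 2

-- Just is a constructor, so its argument is left alone.
justCase :: Int
justCase = seq (Just (undefined :: Int)) 2

-- A lambda is already in WHNF; its body is not evaluated until it is applied.
lambdaCase :: Int
lambdaCase = seq (\x -> (undefined :: Int) + x) 2

-- The list constructor (:) is reached first; neither head nor tail is evaluated.
consCase :: Int
consCase = seq ((undefined :: Int) : undefined) 2
```

By contrast, seq (undefined :: Int) 2 raises an exception, since there is no constructor to stop at before undefined itself is evaluated.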

Weak head normal form

Forcing the evaluation of an expression using seq only makes sense if the result of that expression is used later:

    let x = 1 + 2 in seq x (f x)

The expression

    print (seq (1 + 2) 3)

doesn’t make sense as the result of 1+2 is never used.

Exercise

Rewrite the expression

    (1 + 2, 'a')

so that the first component of the pair is evaluated before the pair is created.

Solution

Rewrite the expression as

    let x = 1 + 2 in seq x (x, 'a')

A strict left fold

We want to evaluate the expression f z x before evaluating the recursive call:

    foldl' :: (a -> b -> a) -> a -> [b] -> a
    foldl' f z []     = z
    foldl' f z (x:xs) = let z' = f z x
                        in seq z' (foldl' f z' xs)
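For experimenting, the definition can be placed in a module next to the library version. This is a sketch: myFoldl', sumStrict and sumLibrary are names made up here, and the hand-written fold is renamed only to avoid clashing with Data.List.foldl', which the base library exports.

```haskell
import Data.List (foldl')

-- Hand-written strict left fold: force the new accumulator to WHNF
-- before making the recursive call, so no chain of thunks builds up.
myFoldl' :: (a -> b -> a) -> a -> [b] -> a
myFoldl' _ z []       = z
myFoldl' f z (x : xs) =
  let z' = f z x
  in  seq z' (myFoldl' f z' xs)

-- Summing with the hand-written fold.
sumStrict :: [Int] -> Int
sumStrict = myFoldl' (+) 0

-- Summing with the library's strict fold, for comparison.
sumLibrary :: [Int] -> Int
sumLibrary = foldl' (+) 0
```

Both sumStrict [1..1000000] and sumLibrary [1..1000000] should produce the same result. In practice one would normally use the library function rather than redefine it.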

Summing numbers, attempt 2

How does evaluation of this expression proceed?

foldl' (+) 0 [1,2,3]

Like this:

foldl' (+) 0 [1,2,3]

==> foldl' (+) 1 [2,3]

==> foldl' (+) 3 [3]

==> foldl' (+) 6 []

==> 6

Sanity check:

ghci> print (foldl' (+) 0 [1..1000000])

500000500000

Computing the mean

A function that computes the mean of a list of numbers:

    mean :: [Double] -> Double
    mean xs = s / fromIntegral l
      where
        (s, l) = foldl' step (0, 0) xs
        step (s, l) a = (s+a, l+1)

We compute the length of the list and the sum of the numbers in one pass.

$ ./Mean

Stack space overflow: current size 8388608 bytes.

Use ‘+RTS -Ksize -RTS’ to increase it.

Didn’t we just fix that problem?!?

seq and data constructors

Remember:

- Data constructors don't evaluate their arguments when created

- seq only evaluates to the outermost data constructor, but doesn't evaluate its arguments

Problem: foldl' forces the evaluation of the pair constructor, but not its arguments, causing unevaluated thunks to build up inside the pair:

(0.0 + 1.0 + 2.0 + 3.0, 0 + 1 + 1 + 1)

Forcing evaluation of constructor arguments

We can force GHC to evaluate the constructor arguments before the constructor is created:

    mean xs = s / fromIntegral l
      where
        (s, l) = foldl' step (0, 0) xs
        step (s, l) a = let s' = s + a
                            l' = l + 1
                        in seq s' (seq l' (s', l'))

Bang patterns

A bang pattern is a concise way to express that an argument should be evaluated.

    {-# LANGUAGE BangPatterns #-}

    mean xs = s / fromIntegral l
      where
        (s, l) = foldl' step (0, 0) xs
        step (!s, !l) a = (s + a, l + 1)

s and l are evaluated before the right-hand side of step is evaluated.
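Bang patterns work on ordinary function arguments as well as inside tuple patterns. The following is a variation on the slide's code, not the slide's own definition: it assumes an explicit helper loop (go) with two separate accumulators, and the name meanBang is made up for this sketch.

```haskell
{-# LANGUAGE BangPatterns #-}

-- Mean via an explicit loop. The bangs on s and l force both accumulators
-- on every call to go, so neither builds up a chain of thunks.
-- Note: for an empty list this returns NaN (0 / 0).
meanBang :: [Double] -> Double
meanBang = go 0 0
  where
    go :: Double -> Int -> [Double] -> Double
    go !s !l []       = s / fromIntegral l
    go !s !l (x : xs) = go (s + x) (l + 1) xs
```

With this shape no pair is allocated at all, which is one reason accumulator loops are often written this way.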

Strictness

We say that a function is strict in an argument if evaluating the function always causes the argument to be evaluated.

    null :: [a] -> Bool
    null [] = True
    null _  = False

null is strict in its first (and only) argument, as it needs to be evaluated to pick a return value.
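One way to probe strictness experimentally is to pass undefined and see whether forcing the result raises an exception. The sketch below assumes a made-up helper, throwsWhenForced, built from try and evaluate in Control.Exception; isEmpty mirrors null above and constOne is a deliberately non-strict function.

```haskell
import Control.Exception (SomeException, evaluate, try)

-- Same shape as null: must inspect its argument to choose an equation.
isEmpty :: [a] -> Bool
isEmpty [] = True
isEmpty _  = False

-- Ignores its argument entirely, so it is not strict in it.
constOne :: a -> Int
constOne _ = 1

-- Hypothetical helper: True if evaluating x to WHNF throws an exception.
throwsWhenForced :: a -> IO Bool
throwsWhenForced x = do
  r <- try (evaluate x)
  pure (either (\e -> const True (e :: SomeException)) (const False) r)
```

Under these definitions, throwsWhenForced (isEmpty undefined) reports True and throwsWhenForced (constOne undefined) reports False. This is only a test on one input, not a proof of strictness, but it matches the definition: a strict function applied to an undefined argument is itself undefined.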

Strictness - Example

cond is strict in the first argument, but not in the second and third arguments:

    cond :: Bool -> a -> a -> a
    cond True  t e = t
    cond False t e = e

Reason: Each of the two branches evaluates only one of the last two arguments to cond.

Strict data types

Haskell lets us say that we always want the arguments of a constructor to be evaluated:

    data PairS a b = PS !a !b

When a PairS is evaluated, its arguments are evaluated.

Strict pairs as accumulators

We can use a strict pair to simplify our mean function:


    mean xs = s / fromIntegral l
      where
        PS s l = foldl' step (PS 0 0) xs
        step (PS s l) a = PS (s + a) (l + 1)

Tip: Prefer strict data types when laziness is not needed for your program to work correctly.
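Put together as a self-contained module, the strict-pair version might look like the sketch below. The explicit Int annotation on the counter and the primed names in step are choices made here for clarity; they are not in the slide.

```haskell
import Data.List (foldl')

-- A pair whose fields are evaluated whenever the pair itself is evaluated.
data PairS a b = PS !a !b

-- Mean in one pass. foldl' forces each PS to WHNF, and the strict fields
-- in turn force the running sum and count, so no thunks accumulate.
mean :: [Double] -> Double
mean xs = s / fromIntegral l
  where
    PS s l = foldl' step (PS 0 (0 :: Int)) xs
    step (PS s' l') a = PS (s' + a) (l' + 1)
```

mean [1, 2, 3, 4] evaluates to 2.5, and the same function should run over a million elements without the stack overflow seen earlier.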

Reasoning about laziness

A function application is only evaluated if its result is needed, therefore:

- One of the function's right-hand sides will be evaluated.

- Any expression whose value is required to decide which RHS to evaluate must be evaluated.

By using this “backward-to-front” analysis we can figure out which arguments a function is strict in.

Reasoning about laziness: example

    max :: Int -> Int -> Int
    max x y
      | x > y     = x
      | x < y     = y
      | otherwise = x  -- arbitrary

- To pick one of the three RHS, we must evaluate x > y.

- Therefore we must evaluate both x and y.

- Therefore max is strict in both x and y.

Poll

    data BST = Leaf | Node Int BST BST

    insert :: Int -> BST -> BST
    insert x Leaf = Node x Leaf Leaf
    insert x (Node x' l r)
      | x < x'    = Node x' (insert x l) r
      | x > x'    = Node x' l (insert x r)
      | otherwise = Node x l r

Which arguments is insert strict in?

- None

- 1st

- 2nd

- Both

Solution

Only the second, as inserting into an empty tree can be done without comparing the value being inserted. For example, this expression

    insert (1 `div` 0) Leaf

does not raise a division-by-zero exception but

    insert (1 `div` 0) (Node 2 Leaf Leaf)

does.
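The claim can be checked with a small module. This is a sketch: isNode and lazyInKey are helper names invented here, and isNode is written so that it forces its argument only to WHNF.

```haskell
data BST = Leaf | Node Int BST BST

insert :: Int -> BST -> BST
insert x Leaf = Node x Leaf Leaf
insert x (Node x' l r)
  | x < x'    = Node x' (insert x l) r
  | x > x'    = Node x' l (insert x r)
  | otherwise = Node x l r

-- Hypothetical helper: inspects only the outermost constructor.
isNode :: BST -> Bool
isNode Leaf         = False
isNode (Node _ _ _) = True

-- The Leaf equation never looks at the value being inserted, so the
-- division by zero stays an unevaluated thunk inside the new Node.
lazyInKey :: Bool
lazyInKey = isNode (insert (1 `div` 0) Leaf)
```

lazyInKey evaluates to True. Replacing Leaf with Node 2 Leaf Leaf makes the guards compare the key against 2, which forces the division and raises the exception.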

Some other things worth pointing out

- insert x l is not evaluated before the Node is created, so it's stored as a thunk.

- Most tree-based data structures use strict sub-trees:

    data Set a = Tip
               | Bin !Size a !(Set a) !(Set a)

Strict function arguments are great for performance

- Strict arguments can often be passed as unboxed values (e.g. a machine integer in a register instead of a pointer to an integer on the heap).

- The compiler can often infer which arguments are strict, but can sometimes need a little help (like in the case of insert a few slides back).
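One possible form of that "little help" is a bang pattern on the key, combined with strict sub-trees. The sketch below is an assumption about how this could be done, not code from the slides; Tree, insertStrict and toList are made-up names, and the element field is left lazy (as in the Set type above) so that the bang is what makes the function strict in the key.

```haskell
{-# LANGUAGE BangPatterns #-}

-- Sub-trees are strict fields; the element itself is a lazy field.
data Tree = Tip | Bin Int !Tree !Tree

-- The bang on x forces the key in every equation, including the Tip case,
-- so insertStrict is strict in both arguments.
insertStrict :: Int -> Tree -> Tree
insertStrict !x Tip = Bin x Tip Tip
insertStrict !x (Bin y l r)
  | x < y     = Bin y (insertStrict x l) r
  | x > y     = Bin y l (insertStrict x r)
  | otherwise = Bin x l r

-- In-order traversal, used here only to inspect the result.
toList :: Tree -> [Int]
toList Tip         = []
toList (Bin x l r) = toList l ++ [x] ++ toList r
```

For example, toList (foldr insertStrict Tip [3, 1, 2, 3]) gives [1,2,3]. Because the sub-tree fields are strict, the recursive insertStrict calls are evaluated when the enclosing Bin is, rather than being stored as thunks.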

Summary

Understanding how evaluation works in Haskell is important and requires practice.
