Efficient Immutable Data Structures (Okasaki for Dummies)

How Efficient Immutable Data Enables Functional Programming

or

Okasaki for Dummies

SEPTEMBER 2015

Who Am I?

Tom Faulhaber

➡ Planet OS CTO

➡ Background in networking, Unix OS, visualization, video

➡ Currently working mostly in “Big Data”

➡ Contributor to the Clojure programming language

Who Are YOU?

What is functional programming?

Pure Functions:

y = f(x)

Not modified

Not shared
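As a rough illustration of the "not modified" property, the sketch below contrasts an impure function with a pure one. The function names and sample data are hypothetical and chosen only for this example; they do not come from the talk.

```python
def add_item_impure(items, item):
    """Impure: mutates the caller's list in place."""
    items.append(item)
    return items


def add_item_pure(items, item):
    """Pure: returns a new tuple and leaves the input untouched."""
    return items + (item,)


original = ("a", "b")
extended = add_item_pure(original, "c")

print(original)  # the input tuple is unchanged
print(extended)  # the result is a separate value
```

Because the pure version never changes its argument, a caller can hand the same value to many functions without defensive copies.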

Higher-order Functions:

map(f, [x1, x2, ..., xn]) → [f(x1), f(x2), ..., f(xn)]

g = map(f)

Result is a new function

g = map ∘ f
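A minimal Python sketch of the two ideas above: applying map to a function and a list, and fixing the function argument to obtain a new function g. The helper `double` is a made-up example function, and `functools.partial` stands in for the partial application that the slide writes as g = map(f).

```python
from functools import partial


def double(x):
    return 2 * x


# map(f, [x1, ..., xn]) applies f to every element.
mapped = list(map(double, [1, 2, 3]))

# Fixing the function argument of map yields a new function g.
g = partial(map, double)
result = list(g([10, 20]))

print(mapped)
print(result)
```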

Other Aspects:

➡ Type inference

➡ Laziness

Functional is the opposite of Object-oriented

Object-oriented: State is managed through encapsulation

Functional: State is avoided altogether

Why functional?

➡ No shared state makes it easier to reason about programs

➡ Concurrency problems simply go away (almost!)

➡ Undo and backtracking are trivial

➡ Algorithms are often more elegant

"It is better to have 100 functions operate on one data structure than 10 functions on 10 data structures." - Alan Perlis

A host of new languages support the functional model:

- ML, Haskell, Clojure, Scala, Idris

- All with different degrees of purity

There’s a catch!

This is cheap:

f(5)

But this is expensive:

f({"type": "object", "properties": { "mesos": { "description": "Mesos specific configuration properties", "type": "object", "properties": { "master": { … } … } … } … } … })

And this is crazy:

f(<my whole database>)

Persistent Data Structures to the Rescue

Persistent Data Structures

The goal: Approximate the performance of mutable data structures: CPU and memory.

The big secret: Use structural sharing!

There are lots of little secrets, too. We won’t cover them today.
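Structural sharing is easiest to see on an immutable singly linked list, where two lists can point at the same tail. The sketch below is only an illustration under that assumption; the `Cons` type and `to_list` helper are hypothetical names, not part of any library discussed in the talk.

```python
from typing import NamedTuple, Optional


class Cons(NamedTuple):
    """One immutable list cell: a value plus a reference to the rest."""
    head: object
    tail: Optional["Cons"]


def to_list(cell):
    """Collect the values of a cons list into a Python list for display."""
    out = []
    while cell is not None:
        out.append(cell.head)
        cell = cell.tail
    return out


base = Cons("quick", Cons("brown", Cons("fox", None)))
a = Cons("The", base)  # new list, reuses base as its tail
b = Cons("A", base)    # another new list, reuses the same tail

print(to_list(a))
print(to_list(b))
print(a.tail is b.tail)  # both lists share one tail object
```

Neither "update" copies the three shared cells, which is safe only because nothing can modify them afterwards.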

Persistent Data Structures - History

[Figure: timeline with axis marks at 1990, 2000 and 2010, showing the following entries.]

➡ ML Language (1973)

➡ Finger Trees (1977)

➡ Persistent Arrays (Dietz)

➡ Catenable Queues (Buchsbaum/Tarjan)

➡ Okasaki

➡ Haskell Language

➡ Zipper (Huet)

➡ Data.Map in Haskell

➡ Priority Search Queues (Hinze)

➡ Fast And Space Efficient Trie Searches (Bagwell)

➡ Ideal Hash Trees (Bagwell)

➡ Clojure Collections

➡ RRB Trees (Bagwell/Rompf)

Example: Vector

➡ In Java/C# ArrayList; in C++ std::vector.

➡ A list with constant access and update and amortized constant append.

a[3] = "dog"

[Figure: a 6-element array, "The quick brown fox jumps over", reads "The quick brown dog jumps over" after the in-place update.]

a.push_back("the")

[Figure: the append grows the array from 6 to 7 elements: "The quick brown dog jumps over the".]

Example: Persistent Vector

➡ To build a persistent vector, we start with a tree:

$\text{depth} = \lceil \log_2 n \rceil$

Data is in the leaves

[Figure: a binary tree with a count of 6 at the root and the words "The quick brown fox jumps over" in its leaves.]

x = a[3]

Each bit of the index selects a branch on the way down from the root: 0 goes left (L) and 1 goes right (R).

Element  The    quick  brown  fox    jumps  over
Index    0      1      2      3      4      5
Binary   000    001    010    011    100    101
Path     LLL    LLR    LRL    LRR    RLL    RLR

[Figure: the lookup walks the tree from the root, with a count of 6, along the path LRR to the leaf "fox".]
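The index-to-path mapping can be sketched in a few lines of Python. This is an illustrative model only: nodes are nested (left, right) tuples, and `build` and `lookup` are hypothetical helper names rather than any library's API.

```python
def build(leaves, depth):
    """Build a complete binary tree of nested (left, right) tuples."""
    if depth == 0:
        return leaves[0] if leaves else None
    half = 1 << (depth - 1)
    return (build(leaves[:half], depth - 1), build(leaves[half:], depth - 1))


def lookup(root, depth, index):
    """Follow the bits of index from most to least significant: 0 = left, 1 = right."""
    node = root
    for level in range(depth - 1, -1, -1):
        bit = (index >> level) & 1
        node = node[bit]
    return node


words = ["The", "quick", "brown", "fox", "jumps", "over"]
tree = build(words, 3)

print(lookup(tree, 3, 3))  # index 3 is binary 011, path LRR
```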

b = a.add("the")

[Figure: the append produces a new root with a count of 7 and a new leaf "the". The original root with a count of 6 still reads "The quick brown fox jumps over", and both roots share the unchanged nodes.]
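The append can be sketched as path copying: only the nodes between the root and the new leaf are rebuilt, and every sibling subtree is reused. The Python below is a simplified model under stated assumptions, not how any particular library implements it: the depth is fixed at 3 (capacity 8), nodes are (left, right) tuples, and `PVec`, `add` and `get` are hypothetical names.

```python
from typing import NamedTuple

DEPTH = 3  # assumption: fixed depth, so this sketch holds at most 2**3 = 8 elements


class PVec(NamedTuple):
    count: int
    root: object  # nested (left, right) tuples; None where nothing is stored yet


def _assoc(node, level, index, value):
    """Return a copy of the path to index; sibling subtrees are reused as-is."""
    if level == 0:
        return value
    left, right = node if node is not None else (None, None)
    if (index >> (level - 1)) & 1:
        return (left, _assoc(right, level - 1, index, value))
    return (_assoc(left, level - 1, index, value), right)


def add(vec, value):
    """Return a new vector with value appended; vec itself is left unchanged."""
    if vec.count >= 1 << DEPTH:
        raise OverflowError("this sketch does not grow the tree")
    return PVec(vec.count + 1, _assoc(vec.root, DEPTH, vec.count, value))


def get(vec, index):
    if not 0 <= index < vec.count:
        raise IndexError(index)
    node = vec.root
    for level in range(DEPTH - 1, -1, -1):
        node = node[(index >> level) & 1]
    return node


a = PVec(0, None)
for word in ["The", "quick", "brown", "fox", "jumps", "over"]:
    a = add(a, word)
b = add(a, "the")

print(a.count, b.count)
print(get(b, 6))
print(a.root[0] is b.root[0])  # the left half of the tree is shared
```

Both `a` and `b` remain usable after the append, which is the property the two roots in the figure illustrate.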

But, wait…

$O(1) \neq O(\log n)$

This isn’t what you promised!

[Figure: tree depth (axis from 2 to 10) against number of elements (0 to 1000), comparing $d = 1$ with $d = \lceil \log_2 n \rceil$.]

The answer: Use 32-way trees

x = a[7022896]

Binary (5-bit groups)  00110  10110  01010  01001  10000
Child index            6      22     10     9      16

[Figure: following child 6, then 22, 10, 9 and 16 through five levels of the 32-way tree reaches the leaf "apple".]
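Splitting an index into 5-bit groups is a shift and a mask per level. The sketch below reproduces the child indices for the example index; `child_indices` is a hypothetical helper name, and the five levels are taken from the slide's example rather than from any fixed rule.

```python
BITS = 5                # 2**5 = 32 children per node
MASK = (1 << BITS) - 1  # 0b11111


def child_indices(index, levels):
    """Return the child slot to follow at each level, from the root down."""
    return [(index >> (BITS * level)) & MASK for level in range(levels - 1, -1, -1)]


path = child_indices(7022896, 5)
print(path)
```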

$O(1) \simeq O(\log_{32} n)$

[Figure: tree depth (axis from 2 to 10) against number of elements (0 to 1000), comparing $d = 1$, $d = \lceil \log_2 n \rceil$ and $d = \lceil \log_{32} n \rceil$.]
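To put numbers on the comparison, the sketch below computes the depth needed for a given element count with integer arithmetic only. The function name `depth` and the sample sizes are illustrative choices, not figures from the talk.

```python
def depth(n, branching):
    """Smallest d with branching**d >= n, i.e. ceil(log_branching(n)) for n >= 1."""
    d, capacity = 0, 1
    while capacity < n:
        capacity *= branching
        d += 1
    return d


for n in (1_000, 1_000_000, 1_000_000_000):
    print(n, depth(n, 2), depth(n, 32))
```

For a billion elements the binary tree needs 30 levels while the 32-way tree needs 6, which is why the deeper fan-out is treated as effectively constant.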

Example: Tree Walking

➡ The functional equivalent of the visitor pattern

Clojure code to implement the walker:

(require '[clojure.walk :refer [postwalk]])

;; Blue nodes become green; every other node is returned unchanged.
(postwalk (fn [node]
            (if (= :blue (:color node))
              (assoc node :color :green)
              node))
          tree)

Example: Zippers

➡ Allow you to navigate and update a tree across many operations by “unzipping” it.
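A zipper can be sketched as a focused subtree plus the path needed to rebuild its ancestors. The Python below is a simplified model in the spirit of Huet's zipper, assuming trees are nested tuples; `Loc` and the navigation functions are hypothetical names and do not mirror any specific library's API.

```python
from typing import NamedTuple


class Loc(NamedTuple):
    focus: object  # the subtree currently in focus
    path: tuple    # stack of (left siblings, right siblings), innermost last


def zipper(tree):
    return Loc(tree, ())


def down(loc):
    """Move to the first child; the focus must be a non-empty tuple node."""
    first, *rest = loc.focus
    return Loc(first, loc.path + (((), tuple(rest)),))


def right(loc):
    """Move to the next sibling."""
    lefts, rights = loc.path[-1]
    return Loc(rights[0], loc.path[:-1] + ((lefts + (loc.focus,), rights[1:]),))


def replace(loc, value):
    """Swap the focused subtree; nothing is rebuilt until we move up."""
    return Loc(value, loc.path)


def up(loc):
    """Rebuild the parent from the siblings and the (possibly new) focus."""
    lefts, rights = loc.path[-1]
    return Loc(lefts + (loc.focus,) + rights, loc.path[:-1])


def root(loc):
    while loc.path:
        loc = up(loc)
    return loc.focus


tree = ("a", ("b", "c"), "d")
loc = right(down(zipper(tree)))  # focus on ("b", "c")
loc = replace(down(loc), "B")    # focus on "b" and replace it
new_tree = root(loc)

print(new_tree)
print(tree)  # the original tree is untouched
```

Several edits can be made at or near the focus before calling `root`, so the cost of rebuilding ancestors is paid once rather than per update.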

Takeaways

➡ Functional data structures can approximate the performance of mutable data structures, but usually won’t be quite as fast.

➡ … but not having to do state management often wins back the difference

➡ We need to choose data structures carefully depending on how they’re going to be used.

➡ This doesn’t solve shared state, it just reduces it. (But see message passing, software transactional memory, etc.)

References

Chris Okasaki, Purely Functional Data Structures, Doctoral dissertation, Carnegie Mellon University, 1996.

Rich Hickey, “Are We There Yet?” Presentation at the JVM Languages Summit, 2009. http://www.infoq.com/presentations/Are-We-There-Yet-Rich-Hickey

Gerard Huet, “Functional Pearl: The Zipper”. Journal of Functional Programming 7 (5): 549–554. doi:10.1017/s0956796897002864

Jean Niklas L’orange, “Understanding Clojure's Persistent Vectors”. Blog post at http://hypirion.com/musings/understanding-persistent-vector-pt-1.

Discussion