Intel Concurrent Collections (for Haskell) Ryan Newton, Chih-Ping Chen, Simon Marlow Software and Services Group Jul 27, 2010

Page 1: Intel Concurrent  Collections  (for Haskell)

Intel Concurrent Collections (for Haskell)

Ryan Newton, Chih-Ping Chen, Simon Marlow

Software and Services Group

Jul 27, 2010

Page 2: Intel Concurrent  Collections  (for Haskell)

Haskell means functions

•As in Math…

•Replacing a function call with its result is safe.

f(x) = g(x) + 1
g(y) = 2*y

f(3) = g(3)+1 = 2*3+1 = 7

Benefit: determinism everywhere

(i.e. referential transparency)
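The substitution on this slide can be checked directly in Haskell. The sketch below writes `f` and `g` as ordinary functions and adds `fInlined`, a hypothetical name for `f` with the call to `g` replaced by its result; it is an illustration, not part of the original deck.

```haskell
-- The slide's two definitions, written as Haskell functions.
g :: Int -> Int
g y = 2 * y

f :: Int -> Int
f x = g x + 1

-- f with the call to g replaced by its result (hypothetical name).
fInlined :: Int -> Int
fInlined x = 2 * x + 1

main :: IO ()
main = do
  print (f 3)                                      -- prints 7
  print (all (\x -> f x == fInlined x) [-5 .. 5])  -- prints True
```

Because neither function has side effects, the two versions agree on every input, which is what makes the rewrite safe.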

Page 3: Intel Concurrent  Collections  (for Haskell)

Parallelism in Haskell

•Annotate expressions that (maybe) should run in parallel

These are HINTS to the compiler/runtime.

f(x) = g(x) `par` h(x)
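In GHC, `a `par` b` hints that `a` may be evaluated in parallel and then returns `b`, so the slide's definition sparks `g(x)` while returning `h(x)`. A minimal sketch of the usual idiom follows, assuming GHC's `par` and `pseq` as exported by `GHC.Conc` (the `parallel` package's `Control.Parallel` exports the same names); `fib` and `parSum` are hypothetical names chosen for illustration.

```haskell
import GHC.Conc (par, pseq)

-- Deliberately naive Fibonacci, used only to create work.
fib :: Int -> Integer
fib n
  | n < 2     = toInteger n
  | otherwise = fib (n - 1) + fib (n - 2)

-- Spark x as a hint, evaluate y on the current thread, then combine.
parSum :: Int -> Int -> Integer
parSum a b = x `par` (y `pseq` (x + y))
  where
    x = fib a
    y = fib b

main :: IO ()
main = print (parSum 20 21)  -- prints 17711
```

The result is the same whether or not the runtime acts on the hint, which is why such annotations preserve determinism.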

Page 4: Intel Concurrent  Collections  (for Haskell)

Parallelism in Haskell #2

•Haskell is imperative also!
−(But that part is segregated from the functional bits.)

• IO Threads and Channels:

do action..
   writeChan c msg
   action..

do action..
   readChan c
   action..

This form of message passing sacrifices determinism. Let’s try CnC instead…
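To see where the determinism goes, consider two writers sharing one channel. The sketch below uses `Control.Concurrent.Chan` from `base`; `raceDemo` and the message strings are hypothetical and only illustrate the point.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.Chan (newChan, readChan, writeChan)

-- Two threads write to the same channel. The reader receives both
-- messages, but their order depends on how the threads are scheduled.
raceDemo :: IO [String]
raceDemo = do
  c <- newChan
  _ <- forkIO (writeChan c "from thread A")
  _ <- forkIO (writeChan c "from thread B")
  m1 <- readChan c
  m2 <- readChan c
  return [m1, m2]

main :: IO ()
main = raceDemo >>= print
```

The set of messages is fixed, but nothing in the program pins down which one arrives first, so two runs may legitimately print different lists.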

Page 5: Intel Concurrent  Collections  (for Haskell)

Parallelism in Haskell #2

•Steps, Items, and Tags

(Diagram labels: Step1, Step2)

Page 6: Intel Concurrent  Collections  (for Haskell)

Haskell / CnC Synergies

CnC Step
  C++ incarnation: hopefully pure execute method
  Haskell incarnation: Pure function. Enforce step purity.

Complete CnC Graph Execution
  C++ incarnation: Graph execution concurrently on other threads. Blocking calls allow environment to collect results.
  Haskell incarnation: Pure function. Leverage the fact that CnC can play nicely in a pure framework.

Also, Haskell CnC is a fun experiment.
• We can try things like idempotent work stealing!
• Most imperative threading packages don’t support that.
• Can we learn things to bring back to other CnC’s?

Page 7: Intel Concurrent  Collections  (for Haskell)

How is CnC Purely Functional?

Every function maps a set (domain) to a set (range):

(Diagram labels: Domain, Range)

Page 8: Intel Concurrent  Collections  (for Haskell)

How is CnC Purely Functional?

(Diagram labels: Domain, Range; A complete CnC Eval; A set of tag/item collections)

Page 9: Intel Concurrent  Collections  (for Haskell)

An individual “step” is a function too

(Diagram labels: Domain, Range; MyStep; Updates (new tags, items))
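One way to read the last two slides is that a step takes the current collections and returns a list of updates, with no side effects of its own. The sketch below models that reading with `Data.Map` from the `containers` package; the `Items`, `Update`, `myStepPure`, and `applyUpdates` names are hypothetical and are not the Haskell CnC API.

```haskell
import qualified Data.Map as Map

-- Hypothetical model: one item collection keyed by String.
type Items = Map.Map String String

-- The range of a step: new items and new tags.
data Update = PutItem String String | PutTag Int
  deriving (Eq, Show)

-- A step as a pure function from (collections, tag) to updates.
myStepPure :: Items -> Int -> [Update]
myStepPure items tag =
  case (Map.lookup "left" items, Map.lookup "right" items) of
    (Just l, Just r) -> [PutItem "result" (l ++ r ++ show tag)]
    _                -> []  -- inputs not available yet: no updates

-- Fold item updates into the collection. Tag updates are ignored here;
-- in a full evaluator they would feed the scheduler.
applyUpdates :: [Update] -> Items -> Items
applyUpdates us items = foldr apply items us
  where
    apply (PutItem k v) = Map.insert k v
    apply (PutTag _)    = id

inputs :: Items
inputs = Map.fromList [("left", "Hello "), ("right", "World ")]

main :: IO ()
main = print (Map.lookup "result" (applyUpdates (myStepPure inputs 99) inputs))
-- prints Just "Hello World 99"
```

A complete evaluation can then be viewed as repeatedly applying steps until no new updates appear, which is itself a function from collections to collections.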

Page 10: Intel Concurrent  Collections  (for Haskell)

How is CnC called from Haskell?

•Haskell is LAZY (functions eval arguments only on demand)

•To control the granularity of parallel evaluation, Haskell CnC is mostly eager.

•When the result of a CnC graph is needed, the whole graph executes, in parallel, to completion.

(Diagram labels: Need result!; Fork workers; Par exec.; Recv. result)

Page 11: Intel Concurrent  Collections  (for Haskell)

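A complete CnC program in Haskell

-- Assumes the haskell-cnc package, whose API is exported by Intel.Cnc.
import Intel.Cnc

-- A step: read two items, then write their concatenation plus the tag.
myStep items tag =
  do word1 <- get items "left"
     word2 <- get items "right"
     put items "result" (word1 ++ word2 ++ show tag)

-- The graph: one tag collection prescribing myStep over one item collection.
cncGraph =
  do tags  <- newTagCol
     items <- newItemCol
     prescribe tags (myStep items)
     initialize$
       do put items "left"  "Hello "
          put items "right" "World "
          putt tags 99
     finalize$
       do get items "result"

main = putStrLn (runGraph cncGraph)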
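The program above depends on `get` waiting until an item has been `put`. The sketch below is a toy model of that behavior built only from `MVar` and `Data.Map`; it is not how Haskell CnC is implemented, and the names `newItems`, `putItem`, `getItem`, `toyStep`, and `runToy` are hypothetical.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar
import qualified Data.Map as Map

-- Toy item collection: each key owns a write-once cell.
newtype ItemCol k v = ItemCol (MVar (Map.Map k (MVar v)))

newItems :: IO (ItemCol k v)
newItems = ItemCol <$> newMVar Map.empty

-- Find the cell for a key, creating an empty one on first use.
cellFor :: Ord k => ItemCol k v -> k -> IO (MVar v)
cellFor (ItemCol ref) k = modifyMVar ref $ \m ->
  case Map.lookup k m of
    Just cell -> return (m, cell)
    Nothing   -> do
      cell <- newEmptyMVar
      return (Map.insert k cell m, cell)

-- Single assignment: a second put to the same key is an error.
putItem :: Ord k => ItemCol k v -> k -> v -> IO ()
putItem col k v = do
  cell <- cellFor col k
  ok   <- tryPutMVar cell v
  if ok then return () else error "item already written"

-- Blocks until the item exists, then reads it without removing it.
getItem :: Ord k => ItemCol k v -> k -> IO v
getItem col k = cellFor col k >>= readMVar

toyStep :: ItemCol String String -> Int -> IO ()
toyStep items tag = do
  word1 <- getItem items "left"
  word2 <- getItem items "right"
  putItem items "result" (word1 ++ word2 ++ show tag)

runToy :: IO String
runToy = do
  items <- newItems
  _ <- forkIO (toyStep items 99)  -- the step may start before its inputs exist
  putItem items "left" "Hello "
  putItem items "right" "World "
  getItem items "result"

main :: IO ()
main = runToy >>= putStrLn  -- prints Hello World 99
```

Because every item is written at most once and reads wait for the write, the result does not depend on which thread runs first.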

Page 12: Intel Concurrent  Collections  (for Haskell)

INITIAL RESULTS

Page 13: Intel Concurrent  Collections  (for Haskell)

Prelude: Haskell is (kinda) fast

“The Great Language Shootout”

Page 14: Intel Concurrent  Collections  (for Haskell)

Prelude: Haskell is (kinda) fast

Page 15: Intel Concurrent  Collections  (for Haskell)

I will show you graphs like this:

Page 16: Intel Concurrent  Collections  (for Haskell)

Current Implementation(s)

• IO based and Func. impl.
•Lots of fun with schedulers!
−Lightweight user threads (3)
−Global work queue (4,5,6,7,10)
−Continuations + Work stealing (10,11)
−Haskell “spark” work stealing (8,9)
>(Mechanism used by ‘par’)

(Diagram: workers wrkr1, wrkr2, wrkr3 sharing a global work pool)

(Diagram: threads thr1, thr2, thr3, …; thread per step instance: Step(0), Step(1), Step(2))

(Diagram: workers wrkr1, wrkr2 with per-thread work deques)
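As an illustration of the global work queue idea, the sketch below has a fixed number of workers pull tasks from one shared channel. It is a simplification under stated assumptions, not any of the numbered schedulers: tasks are fixed up front and cannot enqueue further work, and `runGlobalQueue` and `sumSquares` are hypothetical names.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.Chan (newChan, readChan, writeChan)
import Control.Concurrent.MVar
import Control.Monad (forM_, replicateM_)

-- Run all tasks on nWorkers threads that share a single queue.
runGlobalQueue :: Int -> [IO ()] -> IO ()
runGlobalQueue nWorkers tasks = do
  queue <- newChan
  done  <- newEmptyMVar
  -- Enqueue every task, then one stop marker per worker.
  forM_ tasks (writeChan queue . Just)
  replicateM_ nWorkers (writeChan queue Nothing)
  let worker = do
        next <- readChan queue
        case next of
          Just task -> task >> worker
          Nothing   -> putMVar done ()
  replicateM_ nWorkers (forkIO worker)
  -- Wait until every worker has seen its stop marker.
  replicateM_ nWorkers (takeMVar done)

-- Example workload: each task adds one square to a shared total.
sumSquares :: Int -> [Int] -> IO Int
sumSquares nWorkers xs = do
  total <- newMVar 0
  runGlobalQueue nWorkers
    [modifyMVar_ total (return . (+ x * x)) | x <- xs]
  readMVar total

main :: IO ()
main = sumSquares 3 [1 .. 10] >>= print  -- prints 385
```

A work-stealing variant would replace the single channel with one deque per worker, which is where the need for good deques comes from.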

Page 17: Intel Concurrent  Collections  (for Haskell)

Back to graphs…

Page 18: Intel Concurrent  Collections  (for Haskell)

Recent Results

Page 19: Intel Concurrent  Collections  (for Haskell)

Recent Results

Page 20: Intel Concurrent  Collections  (for Haskell)

Recent Results

Page 21: Intel Concurrent  Collections  (for Haskell)

Conclusions from early prototype

• Lightweight user threads (3): simple, predictable, but too much overhead
• Global work queue (4,5,6,7,10): better scaling than expected
• Work stealing (11): will win in the end, need good deques
• Using Haskell “spark” work stealing (8,9): a flop (for now)

No clear scheduler winners! CnC mantra: separate tuning (including scheduler selection) from application semantics.

Page 22: Intel Concurrent  Collections  (for Haskell)

BEGIN:FUTURE WORK SECTION

Page 23: Intel Concurrent  Collections  (for Haskell)

Future Work, Haskell CnC Specific

•GHC needs a scalable garbage collector (in progress)

•Much incremental scheduler improvement left.

•Need more detailed profiling and better understanding of variance
−Lazy evaluation and a complex runtime are a challenge!

Page 24: Intel Concurrent  Collections  (for Haskell)

Moving forward, radical optimization

•Today CnC is mostly “WYSIWYG”
−A step becomes a task (but we can control the scheduler)
−User is responsible for granularity
−Typical of dynamically scheduled parallelism

•CnC is moving to a point where profiling and graph analysis enable significant transformations.
−Implemented currently: Hybrid static/dynamic scheduling
>Prune space of schedules / granularity choices offline
−Future: efficient selection of data structures for collections (next slide)
>Vectors (dense) vs. Hashmaps (sparse)
>Concurrent vs. nonconcurrent data structures

Page 25: Intel Concurrent  Collections  (for Haskell)

Tag functions: data access analysis

• How nice -- a declarative specification of data access!

(mystep : i) <- [mydata : i-1], [mydata : i], [mydata : i+1]

• Much less pain than traditional loop index analysis
• Use this to:
−Check program for correctness.
−Analyze density, convert data structures
−Generate precise garbage collection strategies
−Extreme: full understanding of data deps. can replace control dependencies
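A tag function such as the one on this slide can be written as an ordinary Haskell function and then inverted or scanned. The sketch below is one possible reading, not the CnC tooling itself: `readsOf` encodes the slide's declaration, while `consumersOf` and `missingItems` are hypothetical helpers hinting at the garbage collection and correctness uses listed above.

```haskell
import Data.List (nub)

-- Items read by step instance i, following the slide's declaration:
-- (mystep : i) <- [mydata : i-1], [mydata : i], [mydata : i+1]
readsOf :: Int -> [Int]
readsOf i = [i - 1, i, i + 1]

-- Step instances, among the prescribed tags, that read item j.
-- Once that many reads have happened, item j could be reclaimed.
consumersOf :: [Int] -> Int -> [Int]
consumersOf tags j = [i | i <- tags, j `elem` readsOf i]

-- Items that some step reads but nothing produces: likely program errors.
missingItems :: [Int] -> [Int] -> [Int]
missingItems tags produced =
  nub [j | i <- tags, j <- readsOf i, j `notElem` produced]

main :: IO ()
main = do
  print (consumersOf [1 .. 3] 2)         -- prints [1,2,3]
  print (consumersOf [1 .. 3] 0)         -- prints [1]
  print (missingItems [1 .. 3] [1 .. 3]) -- prints [0,4]
```

The same `readsOf` function also shows that the accessed keys form a dense integer range, which is the kind of fact that could justify a vector over a hash map.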

Page 26: Intel Concurrent  Collections  (for Haskell)