State: You’re Doing It Wrong - Alternative Concurrency Paradigms For The JVM
Jonas Bonér, Crisp AB
blog: http://jonasboner.com
work: http://crisp.se
code: http://github.com/jboner
twitter: jboner

State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

DESCRIPTION

My talk for JavaOne 2009.

Abstract: Writing concurrent programs in the Java programming language is hard, and writing correct concurrent programs is even harder. The main problem is not concurrency itself but the use of mutable shared state. Reasoning about concurrent updates to, and guarding of, mutable shared state is extremely difficult. It brings problems such as race conditions, deadlocks, livelocks, and thread starvation. It might come as a surprise to some people, but there are alternatives to so-called shared-state concurrency (which has been adopted by C, C++, and the Java programming language and has become the default industry-standard way of dealing with concurrency problems). This session discusses the importance of immutability and explores alternative paradigms such as dataflow concurrency, message-passing concurrency, and software transactional memory. It includes a pragmatic discussion of the drawbacks and benefits of each paradigm and, through hands-on examples, shows how each one, in its own way, can raise the abstraction level and give you a model that is much easier to reason about and use. The presentation also shows how, by choosing the right abstractions and technologies, you can make hard concurrency problems close to trivial. All discussions are driven by examples using state-of-the-art implementations available for the JVM.

Page 1: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

State: You’re Doing It Wrong - Alternative Concurrency Paradigms For The JVM
Jonas Bonér, Crisp AB
blog: http://jonasboner.com
work: http://crisp.se
code: http://github.com/jboner
twitter: jboner

Page 2: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Agenda
> An Emergent Crisis
> State: Identity vs Value
> Shared-State Concurrency
> Software Transactional Memory (STM)
> Message-Passing Concurrency (Actors)
> Dataflow Concurrency
> Wrap up

Page 3: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Moore’s Law
> Coined in the 1965 paper by Gordon E. Moore
> The number of transistors on a chip doubles roughly every two years (often quoted as every 18 months)
> Processor manufacturers have solved our problems for years

Page 4: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Not anymore

Page 5: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

The free lunch is over
> The end of Moore’s Law
> We can’t squeeze more out of one CPU

Page 6: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Conclusion
> This is an emergent crisis
> Multi-processors are here to stay
> We need to learn to take advantage of that
> The world is going concurrent

Page 7: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

State

Page 8: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

The devil is in the state

Page 9: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Wrong, let me rephrase

Page 10: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

The devil is in the mutable state

Page 11: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Definitions & Philosophy

Page 12: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

What is a Value?

A Value is something that does not change

Discussion based on http://clojure.org/state by Rich Hickey

Page 13: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

What is an Identity?

A stable logical entity associated with a series of different Values over time

Page 14: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

What is State?

The Value an entity with a specific Identity has at a particular point in time

Page 15: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

How do we know if something has State?

If a function is invoked with the same arguments at two different points in time and returns different values...

...then it has state

Page 16: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

The Problem

Unification of Identity & Value

They are not the same

Page 17: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

We need to separate Identity & Value
...add a level of indirection

> Software Transactional Memory
  > Managed References
> Message-Passing Concurrency
  > Actors/Active Objects
> Dataflow Concurrency
  > Dataflow (Single-Assignment) Variables
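
To make the "level of indirection" concrete, here is a minimal Java sketch. It is not from the talk: the names Balance and AccountIdentity are hypothetical, and it assumes that an AtomicReference is an acceptable stand-in for a managed reference. The Value is immutable, the Identity is the reference, and the State is whatever Value the reference holds at the moment you look.

```java
import java.util.concurrent.atomic.AtomicReference;

// A Value: immutable, it never changes after construction.
final class Balance {
    final long cents;

    Balance(long cents) { this.cents = cents; }

    // "Changing" a Value produces a new Value.
    Balance plus(long delta) { return new Balance(cents + delta); }
}

// An Identity: a stable reference associated with different Values over time.
final class AccountIdentity {
    private final AtomicReference<Balance> current;

    AccountIdentity(long initialCents) {
        current = new AtomicReference<>(new Balance(initialCents));
    }

    // The State: the Value this Identity has right now.
    Balance state() { return current.get(); }

    // Atomically point the Identity at a new Value; old Values stay untouched.
    Balance deposit(long cents) {
        return current.updateAndGet(b -> b.plus(cents));
    }
}
```

A reader that grabbed a Balance before a deposit keeps seeing that same Value; only the Identity has moved on.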

Page 18: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Shared-State Concurrency

Page 19: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Shared-State Concurrency
> Concurrent access to shared, mutable state
> Protect mutable state with locks
> The Java, C#, C/C++, Ruby, Python, etc. way

Page 20: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Shared-State Concurrency is incredibly hard
> Inherently very hard to use reliably
> Even the experts get it wrong

Page 21: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Roadmap: Let’s look at three problem domains

1. Need for consensus and truly shared knowledge
   Example: Banking
2. Coordination of independent tasks/processes
   Example: Scheduling, Gaming
3. Workflow-related dependent processes
   Example: Business processes, MapReduce

Page 22: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

...and for each of these...

1. Look at an implementation using Shared-State Concurrency

2. Compare with implementation using an alternative paradigm

Page 23: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Roadmap: Let’s look at three problem domains

1. Need for consensus and truly shared knowledge
   Example: Banking
2. Coordination of independent tasks/processes
   Example: Scheduling, Gaming
3. Workflow-related dependent processes
   Example: Business processes, MapReduce

Page 24: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Problem 1:

Transfer funds between bank accounts

Page 25: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Shared-State Concurrency

Transfer funds between bank accounts

Page 26: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Account

    public class Account {
        private double balance;

        public void withdraw(double amount) {
            balance -= amount;
        }

        public void deposit(double amount) {
            balance += amount;
        }
    }

> Not thread-safe

Page 27: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Let’s make it thread-safe

    public class Account {
        private double balance;

        public synchronized void withdraw(double amount) {
            balance -= amount;
        }

        public synchronized void deposit(double amount) {
            balance += amount;
        }
    }

> Thread-safe, right?

Page 28: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

It’s still broken

Not atomic

Page 29: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Let’s write an atomic transfer method

    public class Account {
        ...

        public synchronized void transferTo(Account to, double amount) {
            this.withdraw(amount);
            to.deposit(amount);
        }

        ...
    }

> This will work, right?

Page 30: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Let’s transfer funds

    Account alice = ...
    Account bob = ...

    // in one thread
    alice.transferTo(bob, 10.0D);

    // in another thread
    bob.transferTo(alice, 3.0D);

Page 31: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Might lead to DEADLOCK

Darn, this is really hard!!!

Page 32: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

We need to enforce lock ordering
> How?
> Java won’t help us
> Need to use code convention (names etc.)
> Requires knowledge about the internal state and implementation of Account
> ...runs counter to the principles of encapsulation in OOP
> Opens up a Can of Worms
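
One common way to enforce a lock order by convention is to give every account a unique id and always lock the smaller id first. The sketch below is an illustration under that assumption, not code from the talk; OrderedAccount, its id field, and runOppositeTransfers are hypothetical names. Note how transfer has to know about both accounts' internals, which is exactly the encapsulation problem the slide describes.

```java
import java.util.concurrent.atomic.AtomicLong;

final class OrderedAccount {
    private static final AtomicLong NEXT_ID = new AtomicLong();

    // Unique id, used only to decide the lock order (assumed convention).
    private final long id = NEXT_ID.incrementAndGet();
    private long balanceCents;

    OrderedAccount(long initialCents) { balanceCents = initialCents; }

    synchronized long balance() { return balanceCents; }

    // Always lock the account with the smaller id first, so two opposite
    // transfers can never each hold one lock while waiting for the other.
    static void transfer(OrderedAccount from, OrderedAccount to, long cents) {
        if (from == to) return;
        OrderedAccount first = from.id < to.id ? from : to;
        OrderedAccount second = (first == from) ? to : from;
        synchronized (first) {
            synchronized (second) {
                from.balanceCents -= cents;
                to.balanceCents += cents;
            }
        }
    }

    // Runs the alice/bob scenario from the slides in two threads and
    // returns the final balances as {alice, bob}.
    static long[] runOppositeTransfers(int rounds) {
        OrderedAccount alice = new OrderedAccount(1_000_000);
        OrderedAccount bob = new OrderedAccount(1_000_000);
        Thread t1 = new Thread(() -> {
            for (int i = 0; i < rounds; i++) transfer(alice, bob, 10);
        });
        Thread t2 = new Thread(() -> {
            for (int i = 0; i < rounds; i++) transfer(bob, alice, 3);
        });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new long[] { alice.balance(), bob.balance() };
    }
}
```

The ordering rule lives in a comment and in programmer discipline; nothing in the language checks that every caller follows it.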

Page 33: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

The problem with locks
> Locks do not compose
> Taking too few locks
> Taking too many locks
> Taking the wrong locks
> Taking locks in the wrong order
> Error recovery is hard

Page 34: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Java bet on the wrong horse

But we’re not completely screwed

There are alternatives

Page 35: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

We need better and more high-level abstractions

Page 36: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Alternative Paradigms
> Software Transactional Memory (STM)
> Message-Passing Concurrency (Actors)
> Dataflow Concurrency

Page 37: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Software Transactional Memory (STM)

Page 38: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Software Transactional Memory
> See the memory (heap and stack) as a transactional dataset
> Similar to a database
  > begin
  > commit
  > abort/rollback
> Transactions are retried automatically upon collision
> Rolls back the memory on abort
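
The retry-on-collision idea can be sketched in plain Java. This is not an STM and not how any particular STM is implemented; it is a deliberately tiny illustration that assumes all transactional data fits in one immutable snapshot (the hypothetical Balances class) behind a single AtomicReference. The hypothetical atomically method runs the body against a snapshot, tries to commit with compare-and-set, and retries if another thread committed first.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.UnaryOperator;

// Immutable snapshot of everything the "transaction" may touch.
final class Balances {
    final long alice;
    final long bob;

    Balances(long alice, long bob) {
        this.alice = alice;
        this.bob = bob;
    }
}

final class TinyTx {
    private final AtomicReference<Balances> world;

    TinyTx(Balances initial) { world = new AtomicReference<>(initial); }

    Balances snapshot() { return world.get(); }

    // Run body against a snapshot and commit with compare-and-set.
    // On a collision the computed result is thrown away ("rollback")
    // and the body runs again, so the body must be free of side effects.
    Balances atomically(UnaryOperator<Balances> body) {
        while (true) {
            Balances before = world.get();
            Balances after = body.apply(before);
            if (world.compareAndSet(before, after)) {
                return after;
            }
        }
    }
}
```

For example, `tx.atomically(b -> new Balances(b.alice - 100, b.bob + 100))` moves 100 from alice to bob as one atomic step. Because the body may run more than once, anything it does besides computing a new snapshot would be repeated on every retry.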

Page 39: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Software Transactional Memory
> Transactions can nest
> Transactions compose (yippee!!)

    atomic {
      ..
      atomic {
        ..
      }
    }

Page 40: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Restrictions
> All operations in scope of a transaction:
  > Need to be idempotent
  > Can’t have side-effects

Page 41: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Case study: Clojure

Page 42: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

What is Clojure?
> Functional language
> Runs on the JVM
> Only immutable data and data structures
> Pragmatic Lisp
> Great Java interoperability
> Dynamic, but very fast

Page 43: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Clojure’s concurrency story
> STM (Refs): Synchronous, Coordinated
> Atoms: Synchronous, Uncoordinated
> Agents: Asynchronous, Uncoordinated
> Vars: Synchronous, Thread Isolated

Page 44: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

STM (Refs)
> A Ref holds a reference to an immutable value
> A Ref can only be changed in a transaction
> Updates are atomic, consistent, and isolated (ACI)
> A transaction sees its own snapshot of the world
> Transactions are retried upon collision

Page 45: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Let’s get back to our banking problem

The STM way

Transfer funds between bank accounts

Page 46: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Create two accounts

    ;; alice's account with balance 1000 USD
    (def alice (ref 1000))

    ;; bob's account with balance 1000 USD
    (def bob (ref 1000))

Page 47: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Transfer 100 bucks

    ;; amount to transfer
    (def amount 100)

    ;; not valid
    ;; throws exception since
    ;; no transaction is running
    (ref-set alice (- @alice amount))
    (ref-set bob (+ @bob amount))

Page 48: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Wrap in a transaction

    ;; update both accounts inside a transaction
    (dosync
      (ref-set alice (- @alice amount))
      (ref-set bob (+ @bob amount)))

Page 49: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Potential problems with STM
> High contention (many transaction collisions) can lead to:
  > Potentially bad performance and too-high latency
  > Progress cannot be guaranteed (e.g. livelock)
  > Fairness is not maintained
> Implementation details hidden in black box

Page 50: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

My (humble) opinion on STM
> Can never work well in a language that doesn’t have compiler-enforced immutability
  > E.g. never in Java (as of today)
> Should not be used to “patch” Shared-State Concurrency
> How to do it in imperative languages is still a research topic

Page 51: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Discussion: Problem 1
Need for consensus and truly shared knowledge

> Shared-State Concurrency: Bad fit
> Software Transactional Memory: Great fit
> Message-Passing Concurrency: Terrible fit
> Dataflow Concurrency: Terrible fit

Page 52: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Message-Passing Concurrency

Page 53: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Actor Model of Concurrency
> Implements Message-Passing Concurrency
> Originates in a 1973 paper by Carl Hewitt
> Implemented in Erlang, Occam, Oz
> Encapsulates state and behavior
> Closer to the definition of OO than classes

Page 54: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Actor Model of Concurrency
> Share NOTHING
> Isolated lightweight processes
  > Can easily create millions on a single workstation
> Communicates through messages
> Asynchronous and non-blocking

Page 55: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Actor Model of Concurrency
> No shared state
  > ...hence, nothing to synchronize
> Each actor has a mailbox (message queue)

Page 56: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Actor Model of Concurrency
> Non-blocking send
> Blocking receive
> Messages are immutable
> Highly performant and scalable
  > Similar to Staged Event-Driven Architecture style (SEDA)
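
The mailbox mechanics above can be sketched in a few lines of Java. This is a toy, not one of the actor libraries named in this deck: MiniActor and MiniActorDemo are hypothetical names, and the sketch uses one plain thread per actor, so it says nothing about lightweight processes or scalability. It only shows a non-blocking send, a blocking receive, and state that is touched by a single thread.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

final class MiniActor<M> {
    private final BlockingQueue<M> mailbox = new LinkedBlockingQueue<>();
    private final Thread worker;

    MiniActor(Consumer<M> behavior) {
        worker = new Thread(() -> {
            try {
                while (true) {
                    behavior.accept(mailbox.take()); // blocking receive
                }
            } catch (InterruptedException stop) {
                // interrupted: the actor shuts down
            }
        });
        worker.setDaemon(true);
        worker.start();
    }

    // Non-blocking send: the unbounded mailbox accepts the message immediately.
    void send(M message) { mailbox.offer(message); }

    void stop() { worker.interrupt(); }
}

final class MiniActorDemo {
    // Sends n messages to a counting actor and returns how many it processed.
    static int deliver(int n) {
        int[] seen = {0}; // mutated only by the actor's own thread
        CountDownLatch done = new CountDownLatch(n);
        MiniActor<String> counter = new MiniActor<>(msg -> {
            seen[0]++;
            done.countDown();
        });
        for (int i = 0; i < n; i++) {
            counter.send("ball");
        }
        try {
            done.await(); // wait until every message has been handled
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        counter.stop();
        return seen[0];
    }
}
```

The counter needs no lock because only the actor's thread ever increments it; callers communicate with it exclusively through messages.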

Page 57: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Actor Model of Concurrency
> Easier to reason about
> Raised abstraction level
> Easier to avoid
  > Race conditions
  > Deadlocks
  > Starvation
  > Live locks

Page 58: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Fault-tolerant systems
> Link actors
> Supervisor hierarchies
  > One-for-one
  > All-for-one
> Ericsson’s Erlang success story
  > 9 nines availability (31 ms/year downtime)

Page 59: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Roadmap: Let’s look at three problem domains

1. Need for consensus and truly shared knowledge
   Example: Banking
2. Coordination of independent tasks/processes
   Example: Scheduling, Gaming
3. Workflow-related dependent processes
   Example: Business processes, MapReduce

Page 60: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Problem 2:

A game of ping pong

Page 61: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Shared-State Concurrency

A game of ping pong

Page 62: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Ping Pong Table

    public class PingPongTable {
        public void hit(String hitter) {
            System.out.println(hitter);
        }
    }

Page 63: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Player

    public class Player implements Runnable {
        private PingPongTable myTable;
        private String myName;

        public Player(String name, PingPongTable table) {
            myName = name;
            myTable = table;
        }

        ...
    }

Page 64: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Player cont.

        ...

        public void run() {
            while (true) {
                synchronized (myTable) {
                    try {
                        myTable.hit(myName);
                        myTable.notifyAll();
                        myTable.wait();
                    } catch (InterruptedException e) {}
                }
            }
        }
    }

Page 65: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Run it

    PingPongTable table = new PingPongTable();
    Thread ping = new Thread(new Player("Ping", table));
    Thread pong = new Thread(new Player("Pong", table));
    ping.start();
    pong.start();

Page 66: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Help: java.util.concurrent
> Great library
> Raises the abstraction level
  > No more wait/notify & synchronized blocks
  > Concurrent collections
  > Executors, ParallelArray
> Simplifies concurrent code
> Use it, don’t roll your own
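
As an illustration of "no more wait/notify", here is one possible way to rewrite the ping pong game with java.util.concurrent. It is a sketch under my own assumptions, not the talk's code: the class name SemaphorePingPong, the bounded number of rounds, and the choice of two Semaphores as turn tokens are all mine.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Semaphore;

final class SemaphorePingPong {
    // Plays the given number of rounds and returns the order of the hits.
    static List<String> play(int rounds) {
        List<String> log = new CopyOnWriteArrayList<>();
        Semaphore pingTurn = new Semaphore(1); // Ping serves first
        Semaphore pongTurn = new Semaphore(0);
        Thread ping = new Thread(player("Ping", rounds, pingTurn, pongTurn, log));
        Thread pong = new Thread(player("Pong", rounds, pongTurn, pingTurn, log));
        ping.start();
        pong.start();
        try {
            ping.join();
            pong.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return log;
    }

    private static Runnable player(String name, int rounds,
                                   Semaphore mine, Semaphore theirs,
                                   List<String> log) {
        return () -> {
            for (int i = 0; i < rounds; i++) {
                mine.acquireUninterruptibly(); // wait for my turn
                log.add(name);                 // hit the ball
                theirs.release();              // hand the turn to the other player
            }
        };
    }
}
```

Calling `SemaphorePingPong.play(2)` yields the hits Ping, Pong, Ping, Pong. The turn-taking is explicit in the two semaphores instead of being implied by a shared monitor.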

Page 67: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Actors

A game of ping pong

Page 68: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Define message

    case object Ball

Page 69: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Player 1: Pong

    val pong = actor {
      loop {
        receive {            // wait on message
          case Ball =>       // match on message Ball
            println("Pong")
            reply(Ball)
        }
      }
    }

Page 70: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Player 2: Ping

    val ping = actor {
      pong ! Ball            // start the game
      loop {
        receive {
          case Ball =>
            println("Ping")
            reply(Ball)
        }
      }
    }

Page 71: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Run it

...well, they are already up and running

Page 72: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Actor implementations for the JVM
> Kilim (Java)
> Jetlang (Java)
> Actor’s Guild (Java)
> ActorFoundry (Java)
> Actorom (Java)
> FunctionalJava (Java)
> Akka Actor Kernel (Java/Scala)
> GParallelizer (Groovy)
> Fan Actors (Fan)

Page 73: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Discussion: Problem 2
Coordination of interrelated tasks/processes

> Shared-State Concurrency: Bad fit (ok if java.util.concurrent is used)
> STM: Won’t help
> Message-Passing Concurrency: Great fit
> Dataflow Concurrency: Ok

Page 74: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Dataflow Concurrency

The forgotten paradigm

Page 75: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Dataflow Concurrency
> Declarative
> No observable non-determinism
> Data-driven: threads block until data is available
> On-demand, lazy
> No difference between:
  > Concurrent and
  > Sequential code

Page 76: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Dataflow Concurrency
> No race-conditions
> Deterministic
> Simple and beautiful

Page 77: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Dataflow Concurrency
> Dataflow (Single-Assignment) Variables
> Dataflow Streams (the tail is a dataflow variable)
> Implemented in Oz and Alice

Page 78: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Just three operations
> Create a dataflow variable
> Wait for the variable to be bound
> Bind the variable
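
The three operations map fairly naturally onto java.util.concurrent.CompletableFuture, which can serve as a rough stand-in for a single-assignment variable. This mapping is my own assumption for illustration (the class name DataflowSketch is hypothetical) and is not the Scala DSL presented in this deck: creating the future creates the variable, complete binds it (a second complete on the same future returns false and changes nothing), and join waits for it to be bound.

```java
import java.util.concurrent.CompletableFuture;

final class DataflowSketch {
    static int compute() {
        // Create: two unbound single-assignment variables.
        CompletableFuture<Integer> x = new CompletableFuture<>();
        CompletableFuture<Integer> y = new CompletableFuture<>();

        // z becomes bound as soon as both x and y are bound.
        CompletableFuture<Integer> z = x.thenCombine(y, Integer::sum);

        // Bind: each variable is bound from its own thread, in any order.
        new Thread(() -> y.complete(2)).start();
        new Thread(() -> x.complete(40)).start();

        // Wait: join() blocks until z is bound.
        return z.join();
    }
}
```

Whatever order the two binding threads run in, `DataflowSketch.compute()` returns 42, which is the determinism the paradigm promises.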

Page 79: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Limitations
> Can’t have side-effects
  > Exceptions
  > IO (println, File, Socket etc.)
  > Time
  > etc.
> Not general-purpose
> Generally good for well-defined isolated modules

Page 80: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Oz-style dataflow concurrency for the JVM
> Created my own implementation (DSL)
  > On top of Scala

Page 81: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

API: Dataflow Variable

    // Create dataflow variable
    val x, y, z = new DataFlowVariable[Int]

    // Access dataflow variable (Wait to be bound)
    z()

    // Bind dataflow variable
    x << 40

    // Lightweight thread
    thread { y << 2 }

Page 82: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

API: Dataflow Stream

Deterministic streams (not IO streams)

    // Create dataflow stream
    val producer = new DataFlowStream[Int]

    // Append to stream
    producer <<< s

    // Read from stream
    producer()

Page 83: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Roadmap: Let’s look at three problem domains

1. Need for consensus and truly shared knowledge
   Example: Banking
2. Coordination of independent tasks/processes
   Example: Scheduling, Gaming
3. Workflow-related dependent processes
   Example: Business processes, MapReduce

Page 84: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Problem 3:

Producer/Consumer

Page 85: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Shared-State Concurrency

Producer/Consumer

Page 86: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Use java.util.concurrent
> Fork/Join framework (ParallelArray etc.)
> ExecutorService
> Future
> BlockingQueue
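
A producer/consumer built from those pieces might look like the sketch below. It is only an illustration: the class name QueueProducerConsumer, the queue capacity of 8, and the -1 end-of-stream sentinel are my own choices, and the sentinel assumes that real items are never negative.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class QueueProducerConsumer {
    private static final int DONE = -1; // end-of-stream sentinel (assumption)

    // Produces 1..count on one thread, sums them on another, returns the sum.
    static long sumOfProduced(int count) {
        // Bounded queue: the producer blocks when the consumer falls behind.
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(8);
        ExecutorService pool = Executors.newFixedThreadPool(2);

        Callable<Void> producer = () -> {
            for (int i = 1; i <= count; i++) {
                queue.put(i);
            }
            queue.put(DONE);
            return null;
        };
        Callable<Long> consumer = () -> {
            long sum = 0;
            for (int item = queue.take(); item != DONE; item = queue.take()) {
                sum += item;
            }
            return sum;
        };

        try {
            pool.submit(producer);
            Future<Long> total = pool.submit(consumer);
            return total.get(); // blocks until the consumer has seen DONE
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
    }
}
```

With `sumOfProduced(100)` the consumer receives the numbers 1 to 100 and the call returns 5050. All the blocking and hand-off logic lives inside BlockingQueue rather than in hand-written wait/notify code.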

Page 87: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Dataflow Concurrency

Producer/Consumer

Page 88: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Example: Dataflow Variables

    // sequential version
    val x, y, z = new DataFlowVariable[Int]
    x << 40
    y << 2
    z << x() + y()
    println("z = " + z())

Page 89: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Example: Dataflow Variables

    // concurrent version: no difference
    val x, y, z = new DataFlowVariable[Int]
    thread { x << 40 }
    thread { y << 2 }
    thread {
      z << x() + y()
      println("z = " + z())
    }

Page 90: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Dataflow Concurrency in Java
> DataRush (commercial)
> Flow-based Programming in Java (dead?)
> FlowJava (academic and dead)

Page 91: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Discussion: Problem 3
Workflow-related dependent processes

> Shared-State Concurrency: Ok (if java.util.concurrent is used)
> STM: Won’t help
> Message-Passing Concurrency: Ok
> Dataflow Concurrency: Great fit

Page 92: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Wrap up
> Parallel programming is becoming increasingly important
> We need a simpler way of writing concurrent programs
> “Java-style” concurrency is too hard
> There are alternatives worth exploring
  > Message-Passing Concurrency
  > Software Transactional Memory
  > Dataflow Concurrency
> Each with its own strengths and weaknesses

Page 93: State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM

Jonas Bonér, Crisp AB
blog: http://jonasboner.com
work: http://crisp.se
code: http://github.com/jboner
twitter: jboner