36
Programming in the Multi-Core Era Emil Sekerinski Department of Computing and Software McMaster University June 2018

Programming in the Multi-Core Era - McMaster Universityemil/pubs/Sekerinski18MultiCoreEra... · 2019. 8. 12. · 1 > x 2!swapx 1 andx 2 x 2 > x 3!swapx 2 andx 4 x 3 > x 4!swapx 3

  • Upload
    others

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Programming in the Multi-Core Era - McMaster Universityemil/pubs/Sekerinski18MultiCoreEra... · 2019. 8. 12. · 1 > x 2!swapx 1 andx 2 x 2 > x 3!swapx 2 andx 4 x 3 > x 4!swapx 3

Programming in the Multi-Core Era

Emil Sekerinski

Department of Computing and SoftwareMcMaster University

June 2018

Page 2: Programming in the Multi-Core Era - McMaster Universityemil/pubs/Sekerinski18MultiCoreEra... · 2019. 8. 12. · 1 > x 2!swapx 1 andx 2 x 2 > x 3!swapx 2 andx 4 x 3 > x 4!swapx 3

40+ Years of Microprocessor Trends

We have reached instruction level parallelism wall and power wallStill increasing number of transistors leads to more processor cores

Page 3: Programming in the Multi-Core Era - McMaster Universityemil/pubs/Sekerinski18MultiCoreEra... · 2019. 8. 12. · 1 > x 2!swapx 1 andx 2 x 2 > x 3!swapx 2 andx 4 x 3 > x 4!swapx 3

Communication and Synchronization in Multi-Programs

Message Passing:Asynchronous: Unix pipes & filters, Erlang processesSynchronous: Go goroutines

Shared Variables:Semaphores: Linux, C, PythonMonitors: Java, C#, PythonTransactional memory: C++, HaskellRemote procedure call: Java, PythonDistributed shared memory

Introduced for resource sharing, interactive programs, distribution

Suitable for multi-core processors?

Page 4: Programming in the Multi-Core Era - McMaster Universityemil/pubs/Sekerinski18MultiCoreEra... · 2019. 8. 12. · 1 > x 2!swapx 1 andx 2 x 2 > x 3!swapx 2 andx 4 x 3 > x 4!swapx 3

Threads in Programs

Threads have to be explicitly created to make use of multiple cores:

Too few threads: not all cores will be used→ turn independent tasks into threads

[Android Guides]

Too many threads: slowdown due to overhead of coresswitching between threads, e.g. multiplying n × n matrices→ use thread pools to reduce number of threads

Page 5: Programming in the Multi-Core Era - McMaster Universityemil/pubs/Sekerinski18MultiCoreEra... · 2019. 8. 12. · 1 > x 2!swapx 1 andx 2 x 2 > x 3!swapx 2 andx 4 x 3 > x 4!swapx 3

Threads in Programs

Threads have to be explicitly created to make use of multiple cores:

Too few threads: not all cores will be used→ turn independent tasks into threadsToo many threads: slowdown due to overhead of coresswitching between threads, e.g. multiplying n × n matrices→ use thread pools to reduce number of threads

[Thread Pool in the .NET Framework]

Page 6: Programming in the Multi-Core Era - McMaster Universityemil/pubs/Sekerinski18MultiCoreEra... · 2019. 8. 12. · 1 > x 2!swapx 1 andx 2 x 2 > x 3!swapx 2 andx 4 x 3 > x 4!swapx 3

Threads in Programs

Threads have to be explicitly created to make use of multiple cores:

Too few threads: not all cores will be used→ turn independent tasks into threadsToo many threads: slowdown due to overhead of coresswitching between threads, e.g. multiplying n × n matrices→ use thread pools to reduce number of threads

In addition to the static structure of modules, programs havea dynamic structure of threads.

(State is distributed over global variables and thread-localvariables)

Page 7: Programming in the Multi-Core Era - McMaster Universityemil/pubs/Sekerinski18MultiCoreEra... · 2019. 8. 12. · 1 > x 2!swapx 1 andx 2 x 2 > x 3!swapx 2 andx 4 x 3 > x 4!swapx 3

Inspiration: Simula-67 and Smalltalk-80

Simula-67objects in programs mimicobjects of the “real world”objects are cooperatively scheduled→ coroutines

Smalltalk-80 Java, C#, Python, . . .sender callerreceiver calleemessage method

Objects should be thought of as having “independent lives”,even if the implementation is sequential

How to make objects “truly concurrent”?Ada uses rendezvous, Erlang and Akka use actors

[Birtwhistle, Dahl, Myhrhaug, Nygaard.

Simula Begin, 1979.]

Page 8: Programming in the Multi-Core Era - McMaster Universityemil/pubs/Sekerinski18MultiCoreEra... · 2019. 8. 12. · 1 > x 2!swapx 1 andx 2 x 2 > x 3!swapx 2 andx 4 x 3 > x 4!swapx 3

Inspiration: Simula-67 and Smalltalk-80

Simula-67objects in programs mimicobjects of the “real world”objects are cooperatively scheduled→ coroutines

Smalltalk-80 Java, C#, Python, . . .sender callerreceiver calleemessage method

Objects should be thought of as having “independent lives”,even if the implementation is sequential

How to make objects “truly concurrent”?Ada uses rendezvous, Erlang and Akka use actors

“Part of the problemdescription becomespart of the solution”

Page 9: Programming in the Multi-Core Era - McMaster Universityemil/pubs/Sekerinski18MultiCoreEra... · 2019. 8. 12. · 1 > x 2!swapx 1 andx 2 x 2 > x 3!swapx 2 andx 4 x 3 > x 4!swapx 3

Inspiration: Simula-67 and Smalltalk-80

Simula-67objects in programs mimicobjects of the “real world”objects are cooperatively scheduled→ coroutines

Smalltalk-80 Java, C#, Python, . . .sender callerreceiver calleemessage method

Objects should be thought of as having “independent lives”,even if the implementation is sequential

How to make objects “truly concurrent”?Ada uses rendezvous, Erlang and Akka use actors

“Part of the problemdescription becomespart of the solution”

Page 10: Programming in the Multi-Core Era - McMaster Universityemil/pubs/Sekerinski18MultiCoreEra... · 2019. 8. 12. · 1 > x 2!swapx 1 andx 2 x 2 > x 3!swapx 2 andx 4 x 3 > x 4!swapx 3

Inspiration: Simula-67 and Smalltalk-80

Simula-67objects in programs mimicobjects of the “real world”objects are cooperatively scheduled→ coroutines

Smalltalk-80 Java, C#, Python, . . .sender callerreceiver calleemessage method

Objects should be thought of as having “independent lives”,even if the implementation is sequential

How to make objects “truly concurrent”?Ada uses rendezvous, Erlang and Akka use actors

“Part of the problemdescription becomespart of the solution”

Page 11: Programming in the Multi-Core Era - McMaster Universityemil/pubs/Sekerinski18MultiCoreEra... · 2019. 8. 12. · 1 > x 2!swapx 1 andx 2 x 2 > x 3!swapx 2 andx 4 x 3 > x 4!swapx 3

Guarded Commands: A Model for Concurrency

dox1 > x2 → swap x1 and x2x2 > x3 → swap x2 and x4x3 > x4 → swap x3 and x4

Used for verification tools, TLA (Amazon), Event B (railways),SPIN (NASA), Microsoft Device Driver Verifier, . . . , for hardwaredescription languages, but not programming languages:

Widespread belief in the 80’s that guarded commandscannot be implemented efficiently

if x1 > x2 then swap x1 and x2,. . .if both x1 > x2 and x3 > x4,swap concurrently

Page 12: Programming in the Multi-Core Era - McMaster Universityemil/pubs/Sekerinski18MultiCoreEra... · 2019. 8. 12. · 1 > x 2!swapx 1 andx 2 x 2 > x 3!swapx 2 andx 4 x 3 > x 4!swapx 3

Guarded Commands: A Model for Concurrency

dox1 > x2 → swap x1 and x2x2 > x3 → swap x2 and x4x3 > x4 → swap x3 and x4

Used for verification tools, TLA (Amazon), Event B (railways),SPIN (NASA), Microsoft Device Driver Verifier, . . . , for hardwaredescription languages, but not programming languages:

Widespread belief in the 80’s that guarded commandscannot be implemented efficiently

Page 13: Programming in the Multi-Core Era - McMaster Universityemil/pubs/Sekerinski18MultiCoreEra... · 2019. 8. 12. · 1 > x 2!swapx 1 andx 2 x 2 > x 3!swapx 2 andx 4 x 3 > x 4!swapx 3

Lime: a Concurrent Object-Oriented Language

class DelayedDoublervar y: int; d: booleaninit()d := true

method store(u: int)y := u; d := false

method retrieve(): intd → return y

action doublenot d → y := 2 * y; d := true

Actions execute when enabledAll objects are concurrentObjects synchronize throughmethod callsMethod blocks when guard isfalseExecution is atomic up tomethod callsGuards only over local fields

Page 14: Programming in the Multi-Core Era - McMaster Universityemil/pubs/Sekerinski18MultiCoreEra... · 2019. 8. 12. · 1 > x 2!swapx 1 andx 2 x 2 > x 3!swapx 2 andx 4 x 3 > x 4!swapx 3

Lime: a Concurrent Object-Oriented Language

class DelayedDoublervar y: int; d: booleaninit()d := true

method store(u: int)y := u; d := false

method retrieve(): intd → return y

action doublenot d → y := 2 * y; d := true

A correct implementation of:

class Doublervar x: intmethod store(u: int)x := 2 * u

method retrieve(): intreturn x

Representative for writing to a file, sending over network, . . .

Page 15: Programming in the Multi-Core Era - McMaster Universityemil/pubs/Sekerinski18MultiCoreEra... · 2019. 8. 12. · 1 > x 2!swapx 1 andx 2 x 2 > x 3!swapx 2 andx 4 x 3 > x 4!swapx 3

Priority Queue

Add an integer to the queueRemove the smallest integerRemove executes in O(1) timeInitialize; add 5, 6, 4 . . .

class PriorityQueuevar l : PriorityQueue. . .method add(e: int)when not a and not r doif l = nil thenm := e ; l := new PriorityQueue

else p := e ; a := trueaction doAddwhen a doif m < p then l.add(p)else l.add(m); m := pa := false

headmpa

headm:5p

a:truehead

m:5pa

mpa

headm:5p:6

a:true

mpa

headm:5pa

m:6p

a:truehead

m:5pa

m:6pa

mpa

headm:5p:4

a:true

m:6pa

mpa

headm:4pa

m:6p:5

a:true

mpa

headm:4pa

m:5pa

m:6p

a:truehead

m:4pa

m:5pa

m:6pa

mpa

Page 16: Programming in the Multi-Core Era - McMaster Universityemil/pubs/Sekerinski18MultiCoreEra... · 2019. 8. 12. · 1 > x 2!swapx 1 andx 2 x 2 > x 3!swapx 2 andx 4 x 3 > x 4!swapx 3

Priority Queue

Add an integer to the queueRemove the smallest integerRemove executes in O(1) timeInitialize; add 5, 6, 4 . . .

class PriorityQueuevar l : PriorityQueue. . .method add(e: int)when not a and not r doif l = nil thenm := e ; l := new PriorityQueue

else p := e ; a := trueaction doAddwhen a doif m < p then l.add(p)else l.add(m); m := pa := false

headmpa

headm:5p

a:true

headm:5pa

mpa

headm:5p:6

a:true

mpa

headm:5pa

m:6p

a:truehead

m:5pa

m:6pa

mpa

headm:5p:4

a:true

m:6pa

mpa

headm:4pa

m:6p:5

a:true

mpa

headm:4pa

m:5pa

m:6p

a:truehead

m:4pa

m:5pa

m:6pa

mpa

Page 17: Programming in the Multi-Core Era - McMaster Universityemil/pubs/Sekerinski18MultiCoreEra... · 2019. 8. 12. · 1 > x 2!swapx 1 andx 2 x 2 > x 3!swapx 2 andx 4 x 3 > x 4!swapx 3

Priority Queue

Add an integer to the queueRemove the smallest integerRemove executes in O(1) timeInitialize; add 5, 6, 4 . . .

class PriorityQueuevar l : PriorityQueue. . .method add(e: int)when not a and not r doif l = nil thenm := e ; l := new PriorityQueue

else p := e ; a := trueaction doAddwhen a doif m < p then l.add(p)else l.add(m); m := pa := false

headmpa

headm:5p

a:true

headm:5pa

mpa

headm:5p:6

a:true

mpa

headm:5pa

m:6p

a:truehead

m:5pa

m:6pa

mpa

headm:5p:4

a:true

m:6pa

mpa

headm:4pa

m:6p:5

a:true

mpa

headm:4pa

m:5pa

m:6p

a:truehead

m:4pa

m:5pa

m:6pa

mpa

Page 18: Programming in the Multi-Core Era - McMaster Universityemil/pubs/Sekerinski18MultiCoreEra... · 2019. 8. 12. · 1 > x 2!swapx 1 andx 2 x 2 > x 3!swapx 2 andx 4 x 3 > x 4!swapx 3

Priority Queue

Add an integer to the queueRemove the smallest integerRemove executes in O(1) timeInitialize; add 5, 6, 4 . . .

class PriorityQueuevar l : PriorityQueue. . .method add(e: int)when not a and not r doif l = nil thenm := e ; l := new PriorityQueue

else p := e ; a := trueaction doAddwhen a doif m < p then l.add(p)else l.add(m); m := pa := false

headmpa

headm:5p

a:truehead

m:5pa

mpa

headm:5p:6

a:true

mpa

headm:5pa

m:6p

a:truehead

m:5pa

m:6pa

mpa

headm:5p:4

a:true

m:6pa

mpa

headm:4pa

m:6p:5

a:true

mpa

headm:4pa

m:5pa

m:6p

a:truehead

m:4pa

m:5pa

m:6pa

mpa

Page 19: Programming in the Multi-Core Era - McMaster Universityemil/pubs/Sekerinski18MultiCoreEra... · 2019. 8. 12. · 1 > x 2!swapx 1 andx 2 x 2 > x 3!swapx 2 andx 4 x 3 > x 4!swapx 3

Priority Queue

Add an integer to the queueRemove the smallest integerRemove executes in O(1) timeInitialize; add 5, 6, 4 . . .

class PriorityQueuevar l : PriorityQueue. . .method add(e: int)when not a and not r doif l = nil thenm := e ; l := new PriorityQueue

else p := e ; a := trueaction doAddwhen a doif m < p then l.add(p)else l.add(m); m := pa := false

headmpa

headm:5p

a:truehead

m:5pa

mpa

headm:5p:6

a:true

mpa

headm:5pa

m:6p

a:true

headm:5pa

m:6pa

mpa

headm:5p:4

a:true

m:6pa

mpa

headm:4pa

m:6p:5

a:true

mpa

headm:4pa

m:5pa

m:6p

a:truehead

m:4pa

m:5pa

m:6pa

mpa

Page 20: Programming in the Multi-Core Era - McMaster Universityemil/pubs/Sekerinski18MultiCoreEra... · 2019. 8. 12. · 1 > x 2!swapx 1 andx 2 x 2 > x 3!swapx 2 andx 4 x 3 > x 4!swapx 3

Priority Queue

Add an integer to the queueRemove the smallest integerRemove executes in O(1) timeInitialize; add 5, 6, 4 . . .

class PriorityQueuevar l : PriorityQueue. . .method add(e: int)when not a and not r doif l = nil thenm := e ; l := new PriorityQueue

else p := e ; a := trueaction doAddwhen a doif m < p then l.add(p)else l.add(m); m := pa := false

headmpa

headm:5p

a:truehead

m:5pa

mpa

headm:5p:6

a:true

mpa

headm:5pa

m:6p

a:true

headm:5pa

m:6pa

mpa

headm:5p:4

a:true

m:6pa

mpa

headm:4pa

m:6p:5

a:true

mpa

headm:4pa

m:5pa

m:6p

a:truehead

m:4pa

m:5pa

m:6pa

mpa

Page 21: Programming in the Multi-Core Era - McMaster Universityemil/pubs/Sekerinski18MultiCoreEra... · 2019. 8. 12. · 1 > x 2!swapx 1 andx 2 x 2 > x 3!swapx 2 andx 4 x 3 > x 4!swapx 3

Priority Queue

Add an integer to the queueRemove the smallest integerRemove executes in O(1) timeInitialize; add 5, 6, 4 . . .

class PriorityQueuevar l : PriorityQueue. . .method add(e: int)when not a and not r doif l = nil thenm := e ; l := new PriorityQueue

else p := e ; a := trueaction doAddwhen a doif m < p then l.add(p)else l.add(m); m := pa := false

headmpa

headm:5p

a:truehead

m:5pa

mpa

headm:5p:6

a:true

mpa

headm:5pa

m:6p

a:truehead

m:5pa

m:6pa

mpa

headm:5p:4

a:true

m:6pa

mpa

headm:4pa

m:6p:5

a:true

mpa

headm:4pa

m:5pa

m:6p

a:truehead

m:4pa

m:5pa

m:6pa

mpa

Page 22: Programming in the Multi-Core Era - McMaster Universityemil/pubs/Sekerinski18MultiCoreEra... · 2019. 8. 12. · 1 > x 2!swapx 1 andx 2 x 2 > x 3!swapx 2 andx 4 x 3 > x 4!swapx 3

Priority Queue

Add an integer to the queueRemove the smallest integerRemove executes in O(1) timeInitialize; add 5, 6, 4 . . .

class PriorityQueuevar l : PriorityQueue. . .method add(e: int)when not a and not r doif l = nil thenm := e ; l := new PriorityQueue

else p := e ; a := trueaction doAddwhen a doif m < p then l.add(p)else l.add(m); m := pa := false

headmpa

headm:5p

a:truehead

m:5pa

mpa

headm:5p:6

a:true

mpa

headm:5pa

m:6p

a:truehead

m:5pa

m:6pa

mpa

headm:5p:4

a:true

m:6pa

mpa

headm:4pa

m:6p:5

a:true

mpa

headm:4pa

m:5pa

m:6p

a:truehead

m:4pa

m:5pa

m:6pa

mpa

Page 23: Programming in the Multi-Core Era - McMaster Universityemil/pubs/Sekerinski18MultiCoreEra... · 2019. 8. 12. · 1 > x 2!swapx 1 andx 2 x 2 > x 3!swapx 2 andx 4 x 3 > x 4!swapx 3

Priority Queue

Add an integer to the queueRemove the smallest integerRemove executes in O(1) timeInitialize; add 5, 6, 4 . . .

class PriorityQueuevar l : PriorityQueue. . .method add(e: int)when not a and not r doif l = nil thenm := e ; l := new PriorityQueue

else p := e ; a := trueaction doAddwhen a doif m < p then l.add(p)else l.add(m); m := pa := false

headmpa

headm:5p

a:truehead

m:5pa

mpa

headm:5p:6

a:true

mpa

headm:5pa

m:6p

a:truehead

m:5pa

m:6pa

mpa

headm:5p:4

a:true

m:6pa

mpa

headm:4pa

m:6p:5

a:true

mpa

headm:4pa

m:5pa

m:6p

a:true

headm:4pa

m:5pa

m:6pa

mpa

Page 24: Programming in the Multi-Core Era - McMaster Universityemil/pubs/Sekerinski18MultiCoreEra... · 2019. 8. 12. · 1 > x 2!swapx 1 andx 2 x 2 > x 3!swapx 2 andx 4 x 3 > x 4!swapx 3

Priority Queue

Add an integer to the queueRemove the smallest integerRemove executes in O(1) timeInitialize; add 5, 6, 4 . . .

class PriorityQueuevar l : PriorityQueue. . .method add(e: int)when not a and not r doif l = nil thenm := e ; l := new PriorityQueue

else p := e ; a := trueaction doAddwhen a doif m < p then l.add(p)else l.add(m); m := pa := false

headmpa

headm:5p

a:truehead

m:5pa

mpa

headm:5p:6

a:true

mpa

headm:5pa

m:6p

a:truehead

m:5pa

m:6pa

mpa

headm:5p:4

a:true

m:6pa

mpa

headm:4pa

m:6p:5

a:true

mpa

headm:4pa

m:5pa

m:6p

a:true

headm:4pa

m:5pa

m:6pa

mpa

Page 25: Programming in the Multi-Core Era - McMaster Universityemil/pubs/Sekerinski18MultiCoreEra... · 2019. 8. 12. · 1 > x 2!swapx 1 andx 2 x 2 > x 3!swapx 2 andx 4 x 3 > x 4!swapx 3

Results of Priority Queue

0

500

1000

1500

2000

2500

3000

3500

10 20 30 40 50 60 70 80

Tim

e (

ms)

Number of Objects

LimeGo

ErlangJava

PthreadHaskell

0

5000

10000

15000

20000

25000

30000

1000

2000

3000

4000

5000

6000

7000

8000

9000

Page 26: Programming in the Multi-Core Era - McMaster Universityemil/pubs/Sekerinski18MultiCoreEra... · 2019. 8. 12. · 1 > x 2!swapx 1 andx 2 x 2 > x 3!swapx 2 andx 4 x 3 > x 4!swapx 3

Leaf-oriented Tree

Internal nodes contain only guidesThe elements are stored in the leaves

Leaf

Leaf

Leaf

Leaf

Node

Node

Root input

Page 27: Programming in the Multi-Core Era - McMaster Universityemil/pubs/Sekerinski18MultiCoreEra... · 2019. 8. 12. · 1 > x 2!swapx 1 andx 2 x 2 > x 3!swapx 2 andx 4 x 3 > x 4!swapx 3

Results of Leaf-oriented Tree

0

1000

2000

3000

4000

5000

6000

7000

8000

9000

10 20 30 40 50 60 70 80

Tim

e (

ms)

Number of Objects

LimeGo

ErlangJava

PthreadHaskell

0

200

400

600

800

1000

1200

1400

1600

1800

1000

2000

3000

4000

5000

6000

7000

8000

9000

Page 28: Programming in the Multi-Core Era - McMaster Universityemil/pubs/Sekerinski18MultiCoreEra... · 2019. 8. 12. · 1 > x 2!swapx 1 andx 2 x 2 > x 3!swapx 2 andx 4 x 3 > x 4!swapx 3

MapReduce

MapReduce computes the sum of squares from 1 to nThe mapper: map(x) = x2

The reducer: reduce(x , y) = x + y .

Mapper

Mapper

Mapper

Mapper

Reducer

Reducer

Reducer

input

input

input

input

output

Page 29: Programming in the Multi-Core Era - McMaster Universityemil/pubs/Sekerinski18MultiCoreEra... · 2019. 8. 12. · 1 > x 2!swapx 1 andx 2 x 2 > x 3!swapx 2 andx 4 x 3 > x 4!swapx 3

Results of MapReduce

0

1000

2000

3000

4000

5000

6000

7000

8000

2 4 8 16 32 64

Tim

e (

ms)

Number of Objects

LimeGo

ErlangJava

PthreadHaskell

0

500

1000

1500

2000

2500

3000

3500

4000

4500

128

256

512

1024

2048

Page 30: Programming in the Multi-Core Era - McMaster Universityemil/pubs/Sekerinski18MultiCoreEra... · 2019. 8. 12. · 1 > x 2!swapx 1 andx 2 x 2 > x 3!swapx 2 andx 4 x 3 > x 4!swapx 3

MapReduce with Varying Number of Threads

0

1000

2000

3000

4000

5000

6000

7000

1 2 3 4 5 6 7 8

Tim

e (

ms)

Thread number

MapReduce Example (8192)

GoLime

Page 31: Programming in the Multi-Core Era - McMaster Universityemil/pubs/Sekerinski18MultiCoreEra... · 2019. 8. 12. · 1 > x 2!swapx 1 andx 2 x 2 > x 3!swapx 2 andx 4 x 3 > x 4!swapx 3

Santa Claus Problem

Santa Claus sleeps in his shop up at the North Pole, and can only be wakened byeither all nine reindeer being back from their year long vacation on a tropical island, orby some elves who are having some difficulties making the toys. One elf’s problem isnever serious enough to wake up Santa (otherwise, he may never get any sleep), so,the elves visit Santa in a group of three. When three elves are having their problemssolved, any other elves wishing to visit Santa must wait for those elves to return. IfSanta wakes up to find three elves waiting at his shop’s door, along with the lastreindeer having come back from the tropics, Santa has decided that the elves can waituntil after Christmas, because it is more important to get his sleigh ready as soon aspossible. (It is assumed that the reindeer don’t want to leave the tropics, andtherefore they stay there until the last possible moment.) The penalty for the lastreindeer to arrive is that it must get Santa while the others wait in a warming hutbefore being harnessed to the sleigh. [Trono, 1994]

Includes priority, multi-party synchronization, barriers, and batch processing.

A number of flawed solutions have been published.

Page 32: Programming in the Multi-Core Era - McMaster Universityemil/pubs/Sekerinski18MultiCoreEra... · 2019. 8. 12. · 1 > x 2!swapx 1 andx 2 x 2 > x 3!swapx 2 andx 4 x 3 > x 4!swapx 3

Results of Santa Claus

Repetitionsof Santa

Lime (guards) C (semaphores) Go (channels) Java (monitors)

10,000 0.03 / 0.03 / 0.00 0.87 / 0.26 / 1.18 0.08 / 0.12 / 0.01 6.38 / 2.48 / 5.30100,000 0.21 / 0.21 / 0.00 8.82 / 2.50 / 12.0 0.77 / 1.18 / 0.06 60.3 / 21.6 / 52.01,000,000 2.03 / 2.03 / 0.01 93.0 / 24.8 / 123 7.51 / 11.6 / 0.55 ≈534 / 159 / 509

Execution time in sec on AMD 16 core (32 threads) processor: real / user / system

Page 33: Programming in the Multi-Core Era - McMaster Universityemil/pubs/Sekerinski18MultiCoreEra... · 2019. 8. 12. · 1 > x 2!swapx 1 andx 2 x 2 > x 3!swapx 2 andx 4 x 3 > x 4!swapx 3

GO and Lime

GO threads (goroutines) use synchronous communication(CSP)Object-oriented structure deemphasized, thread structureemphasized

GO and Lime make use of goroutines resp. actions so cheap thatconcurrency can be used whenever natural;

The runtime will make use of all available cores.

How is that achieved?

Page 34: Programming in the Multi-Core Era - McMaster Universityemil/pubs/Sekerinski18MultiCoreEra... · 2019. 8. 12. · 1 > x 2!swapx 1 andx 2 x 2 > x 3!swapx 2 andx 4 x 3 > x 4!swapx 3

Lime Runtime System

obj obj obj obj

head tail

Global Queue

head tail

obj obj obj...

obj obj obj obj...

obj obj obj...

Thread 1

Thread 2

...

Thread n

lock

lock

lock

lock

Local Queues

enqueue

dequeue

Number of worker threads with local queue ≈ number of cores

Local queue is full / empty: enqueue / dequeue to / from global queue

Local and global queues are empty: steal from the other threads.

Lock-free implementation of queues

Each action is a coroutine: compiler inserts transfers in code, i.e. schedulescooperatively → lightweight threads

Page 35: Programming in the Multi-Core Era - McMaster Universityemil/pubs/Sekerinski18MultiCoreEra... · 2019. 8. 12. · 1 > x 2!swapx 1 andx 2 x 2 > x 3!swapx 2 andx 4 x 3 > x 4!swapx 3

Conclusions

Efficient implementation of concurrency with guarded commands ispossible, when local to objects

Based on the Ph.D. thesis of Shucai Yao (2018) andJoshua Moore-Oliva, M.Sc, (2010)Xiao-lei Cui, M.Sc. (2009)Upasana Pujari, M.Sc. (2009)Kevin Lou, M.Sc. (2004)Jie Liang, M.Sc. (2004)

Ongoing work on inheritance, ownership, exceptions

Page 36: Programming in the Multi-Core Era - McMaster Universityemil/pubs/Sekerinski18MultiCoreEra... · 2019. 8. 12. · 1 > x 2!swapx 1 andx 2 x 2 > x 3!swapx 2 andx 4 x 3 > x 4!swapx 3

http://www.software-pioneers.com (2001)