Scala Parallel Collections Aleksandar Prokopec EPFL

Page 1: Scala Parallel Collections Aleksandar Prokopec EPFL

Scala Parallel Collections

Aleksandar Prokopec, EPFL

Page 2: Scala Parallel Collections Aleksandar Prokopec EPFL
Page 3: Scala Parallel Collections Aleksandar Prokopec EPFL

Scala collections

for {
  s <- surnames
  n <- names
  if s endsWith n
} yield (n, s)

McDonald

1040 ms

Page 5: Scala Parallel Collections Aleksandar Prokopec EPFL
Page 6: Scala Parallel Collections Aleksandar Prokopec EPFL

Scala parallel collections

for {
  s <- surnames
  n <- names
  if s endsWith n
} yield (n, s)

for {
  s <- surnames.par
  n <- names.par
  if s endsWith n
} yield (n, s)

2 cores: 575 ms
4 cores: 305 ms

Page 10: Scala Parallel Collections Aleksandar Prokopec EPFL

for comprehensions
nested parallelized bulk operations

surnames.par.flatMap { s =>
  names.par
    .filter(n => s endsWith n)
    .map(n => (n, s))
}
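The equivalence between the for comprehension and the nested bulk operations can be checked on ordinary sequential lists. The sketch below is only an illustration: the object name and the two sample lists are assumptions, not data from the talk, and the compiler actually rewrites the `if` guard to `withFilter` rather than `filter`, which produces the same elements here.

```scala
// Sequential sketch; ForDesugarSketch and its sample data are hypothetical.
object ForDesugarSketch {
  val surnames = List("McDonald", "Johnson", "Smith")
  val names = List("Donald", "John", "son")

  // The for comprehension from the slides, on plain sequential lists.
  val viaFor: List[(String, String)] =
    for {
      s <- surnames
      n <- names
      if s endsWith n
    } yield (n, s)

  // The same computation written as explicit nested bulk operations.
  val viaBulkOps: List[(String, String)] =
    surnames.flatMap { s =>
      names
        .filter(n => s endsWith n)
        .map(n => (n, s))
    }
}
```

Both values hold the same pairs, which is why adding `.par` to the generators parallelizes the outer `flatMap` as well as the inner `filter` and `map`.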

Page 12: Scala Parallel Collections Aleksandar Prokopec EPFL

Nested parallelism
parallel within parallel

composition

surnames.par.flatMap { s =>
  surnameToCollection(s) // may invoke parallel ops
}

Page 14: Scala Parallel Collections Aleksandar Prokopec EPFL

Nested parallelism
going recursive

recursive algorithms

def vowel(c: Char): Boolean = ...
def gen(n: Int, acc: Seq[String]): Seq[String] =
  if (n == 0) acc
  else for (s <- gen(n - 1, acc); c <- 'a' to 'z') yield
    if (s.length == 0) s + c
    else if (vowel(s.last) && !vowel(c)) s + c
    else if (!vowel(s.last) && vowel(c)) s + c
    else s

gen(5, Array(""))

1545 ms

Page 21: Scala Parallel Collections Aleksandar Prokopec EPFL

Nested parallelism
going recursive

def vowel(c: Char): Boolean = ...
def gen(n: Int, acc: ParSeq[String]): ParSeq[String] =
  if (n == 0) acc
  else for (s <- gen(n - 1, acc); c <- 'a' to 'z') yield
    if (s.length == 0) s + c
    else if (vowel(s.last) && !vowel(c)) s + c
    else if (!vowel(s.last) && vowel(c)) s + c
    else s

gen(5, ParArray(""))

1 core:  1575 ms
2 cores: 809 ms
4 cores: 530 ms

Page 25: Scala Parallel Collections Aleksandar Prokopec EPFL

So, I just use par and I’m home free?

Page 26: Scala Parallel Collections Aleksandar Prokopec EPFL

How to think parallel

Page 27: Scala Parallel Collections Aleksandar Prokopec EPFL

Character count
use case for foldLeft

val txt: String = ...
txt.foldLeft(0) {
  case (a, ' ') => a
  case (a, c) => a + 1
}

going left to right - not parallelizable!

[Diagram: the accumulator runs from 0 to 6 across the characters A B C D E F, applying _ + 1 at each step.]

going left to right – not really necessary

[Diagram: the halves A B C and D E F are each counted from 0 to 3 with _ + 1, and the two partial counts are merged with _ + _ to give 6.]

Page 30: Scala Parallel Collections Aleksandar Prokopec EPFL

Character count
in parallel

txt.fold(0) {
  case (a, ' ') => a
  case (a, c) => a + 1
}

[Diagram: within one chunk A B C, the step _ + 1 has type (Int, Char) => Int.]

Character count
fold not applicable

[Diagram: merging the counts of two chunks (3 and 3) needs _ + _, which has type (Int, Int) => Int!]

Page 33: Scala Parallel Collections Aleksandar Prokopec EPFL

Character count
use case for aggregate

txt.aggregate(0)({
  case (a, ' ') => a
  case (a, c) => a + 1
}, _ + _)

[Diagram: the first function combines an aggregation with an element (_ + 1 for each character of a chunk such as A B C); the second combines an aggregation with an aggregation (_ + _ on two chunk counts, 3 and 3).]
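The two-function shape of `aggregate` can be imitated sequentially. This is a minimal sketch, assuming fixed-size chunks stand in for the partitions a parallel collection would choose on its own; `CharCountSketch`, `countChunked` and `chunkSize` are illustrative names, not library API.

```scala
// Sequential imitation of aggregate; all names here are hypothetical.
object CharCountSketch {
  // Aggregation with an element: (Int, Char) => Int, as passed to foldLeft.
  val seqop: (Int, Char) => Int = {
    case (a, ' ') => a
    case (a, _) => a + 1
  }

  // Aggregation with an aggregation: (Int, Int) => Int.
  val combop: (Int, Int) => Int = _ + _

  // Count each chunk independently with seqop, then merge the partial
  // counts with combop. chunkSize must be positive.
  def countChunked(txt: String, chunkSize: Int): Int =
    txt.grouped(chunkSize).map(_.foldLeft(0)(seqop)).foldLeft(0)(combop)
}
```

Because every chunk starts again from 0 and the partial counts are only added, the result does not depend on where the text is cut, which is the property the parallel version relies on.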

Page 37: Scala Parallel Collections Aleksandar Prokopec EPFL

Word count
another use case for foldLeft

txt.foldLeft((0, true)) {
  case ((wc, _), ' ') => (wc, true)
  case ((wc, true), x) => (wc + 1, false)
  case ((wc, false), x) => (wc, false)
}

Example text: "Folding me softly."

initial accumulation: (0, true) - 0 words so far, last character was a space
a space: last seen character is a space
a non space: last seen character was a space – a new word
a non space: last seen character wasn't a space – no new word

Page 42: Scala Parallel Collections Aleksandar Prokopec EPFL

Word count
in parallel

P1: "Folding me "   wc = 2; rs = 1
P2: "softly."       wc = 1; ls = 0
combined:           wc = 3

Word count
must assume arbitrary partitions

P1: "Foldin"        wc = 1; rs = 0
P2: "g me softly."  wc = 3; ls = 0
combined:           wc = 3

Page 47: Scala Parallel Collections Aleksandar Prokopec EPFL

Word count
initial aggregation

txt.par.aggregate((0, 0, 0))

(# spaces on the left, # words, # spaces on the right)

"" (the empty string)

Page 50: Scala Parallel Collections Aleksandar Prokopec EPFL

Word count
aggregation aggregation

...}, {
  case ((0, 0, 0), res) => res
  case (res, (0, 0, 0)) => res
  case ((lls, lwc, 0), (0, rwc, rrs)) => (lls, lwc + rwc - 1, rrs)
  case ((lls, lwc, _), (_, rwc, rrs)) => (lls, lwc + rwc, rrs)

Example partitions for the cases above:
"" and "Folding me", or "softly." and "" (first two cases)
"Folding m" and "e softly." (third case)
"Folding me" and " softly." (fourth case)

Page 53: Scala Parallel Collections Aleksandar Prokopec EPFL

Word count
aggregation element

txt.par.aggregate((0, 0, 0))({
  case ((ls, 0, _), ' ') => (ls + 1, 0, ls + 1)
  case ((ls, 0, _), c) => (ls, 1, 0)
  case ((ls, wc, rs), ' ') => (ls, wc, rs + 1)
  case ((ls, wc, 0), c) => (ls, wc, 0)
  case ((ls, wc, rs), c) => (ls, wc + 1, 0)

One example per case, in order (_ marks the space being added):
"_": 0 words and a space – add one more space each side
" m": 0 words and a non-space – one word, no spaces on the right side
" me_": nonzero words and a space – one more space on the right side
" me sof": nonzero words, last non-space and current non-space – no change
" me s": nonzero words, last space and current non-space – one more word

Page 58: Scala Parallel Collections Aleksandar Prokopec EPFL

Word count
in parallel

txt.par.aggregate((0, 0, 0))({
  case ((ls, 0, _), ' ') => (ls + 1, 0, ls + 1)
  case ((ls, 0, _), c) => (ls, 1, 0)
  case ((ls, wc, rs), ' ') => (ls, wc, rs + 1)
  case ((ls, wc, 0), c) => (ls, wc, 0)
  case ((ls, wc, rs), c) => (ls, wc + 1, 0)
}, {
  case ((0, 0, 0), res) => res
  case (res, (0, 0, 0)) => res
  case ((lls, lwc, 0), (0, rwc, rrs)) => (lls, lwc + rwc - 1, rrs)
  case ((lls, lwc, _), (_, rwc, rrs)) => (lls, lwc + rwc, rrs)
})
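Whether the two operators tolerate arbitrary partitions can be checked without any parallel machinery. The sketch below is a sequential stand-in, not the library's `aggregate`: it cuts the text at one position, folds each half with the element operator and merges the halves with the combining operator. `WordCountSketch`, `chunk` and `wordCountSplitAt` are hypothetical names introduced for illustration.

```scala
// Sequential check of the word-count operators; names are hypothetical.
object WordCountSketch {
  // (# spaces on the left, # words, # spaces on the right)
  type Acc = (Int, Int, Int)

  // Aggregation with an element, as on the slides.
  val seqop: (Acc, Char) => Acc = {
    case ((ls, 0, _), ' ') => (ls + 1, 0, ls + 1)
    case ((ls, 0, _), _) => (ls, 1, 0)
    case ((ls, wc, rs), ' ') => (ls, wc, rs + 1)
    case ((ls, wc, 0), _) => (ls, wc, 0)
    case ((ls, wc, _), _) => (ls, wc + 1, 0)
  }

  // Aggregation with an aggregation, as on the slides.
  val combop: (Acc, Acc) => Acc = {
    case ((0, 0, 0), res) => res
    case (res, (0, 0, 0)) => res
    case ((lls, lwc, 0), (0, rwc, rrs)) => (lls, lwc + rwc - 1, rrs)
    case ((lls, lwc, _), (_, rwc, rrs)) => (lls, lwc + rwc, rrs)
  }

  // Fold one partition from the neutral element (0, 0, 0).
  def chunk(s: String): Acc = s.foldLeft((0, 0, 0))(seqop)

  // Cut txt at position i, process both halves, merge, return the word count.
  def wordCountSplitAt(txt: String, i: Int): Int = {
    val (left, right) = txt.splitAt(i)
    combop(chunk(left), chunk(right))._2
  }
}
```

Trying every cut position of "Folding me softly." should give 3 each time, including cuts in the middle of a word, where the third case of the combining operator subtracts the word that both halves counted.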

Page 59: Scala Parallel Collections Aleksandar Prokopec EPFL

Word count
using parallel strings?

(the same txt.par.aggregate call as on the previous slide)

Page 60: Scala Parallel Collections Aleksandar Prokopec EPFL

Word count
string not really parallelizable

scala> (txt: String).par
collection.parallel.ParSeq[Char] = ParArray(…)

different internal representation!

ParArray: copy string contents into an array

Page 65: Scala Parallel Collections Aleksandar Prokopec EPFL

Conversions
going parallel

// `par` is efficient for...
mutable.{Array, ArrayBuffer, ArraySeq}
mutable.{HashMap, HashSet}
immutable.{Vector, Range}
immutable.{HashMap, HashSet}

most other collections construct a new parallel collection!

Page 67: Scala Parallel Collections Aleksandar Prokopec EPFL

Conversions
going parallel

sequential                      parallel
Array, ArrayBuffer, ArraySeq    mutable.ParArray
mutable.HashMap                 mutable.ParHashMap
mutable.HashSet                 mutable.ParHashSet
immutable.Vector                immutable.ParVector
immutable.Range                 immutable.ParRange
immutable.HashMap               immutable.ParHashMap
immutable.HashSet               immutable.ParHashSet

Page 68: Scala Parallel Collections Aleksandar Prokopec EPFL

Conversions
going parallel

// `seq` is always efficient
ParArray(1, 2, 3).seq
List(1, 2, 3, 4).seq
ParHashMap(1 -> 2, 3 -> 4).seq
"abcd".seq

// `par` may not be...
"abcd".par

Page 69: Scala Parallel Collections Aleksandar Prokopec EPFL

Custom collections

Page 70: Scala Parallel Collections Aleksandar Prokopec EPFL

Custom collection

class ParString(val str: String)
extends parallel.immutable.ParSeq[Char] {
  def apply(i: Int) = str.charAt(i)
  def length = str.length
  def seq = new WrappedString(str)
  def splitter: Splitter[Char] = new ParStringSplitter(0, str.length)

Page 76: Scala Parallel Collections Aleksandar Prokopec EPFL

Custom collection
splitter definition

class ParStringSplitter(var i: Int, len: Int)
extends Splitter[Char] {

splitters are iterators

  def hasNext = i < len
  def next = {
    val r = str.charAt(i)
    i += 1
    r
  }

splitters must be duplicated

  def dup = new ParStringSplitter(i, len)

splitters know how many elements remain

  def remaining = len - i

splitters can be split

  def psplit(sizes: Int*): Seq[ParStringSplitter] = {
    val splitted = new ArrayBuffer[ParStringSplitter]
    for (sz <- sizes) {
      val next = (i + sz) min len
      splitted += new ParStringSplitter(i, next)
      i = next
    }
    splitted
  }
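The splitter protocol above can be tried out in isolation. What follows is a simplified sketch that extends the standard `Iterator` instead of the library's splitter trait, so it illustrates the idea only; `SimpleStringSplitter` and its explicit `str` parameter are assumptions made to keep the example self-contained.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in for ParStringSplitter: iterates over str from
// position i (inclusive) up to len (exclusive).
final class SimpleStringSplitter(str: String, private var i: Int, len: Int)
    extends Iterator[Char] {

  // Splitters are iterators.
  def hasNext: Boolean = i < len
  def next(): Char = {
    val r = str.charAt(i)
    i += 1
    r
  }

  // An independent copy positioned at the same element.
  def dup: SimpleStringSplitter = new SimpleStringSplitter(str, i, len)

  // Number of elements not yet traversed.
  def remaining: Int = len - i

  // Hand out consecutive sub-ranges of the requested sizes, clamped to len.
  // After the call this splitter has advanced past everything it handed out.
  def psplit(sizes: Int*): Seq[SimpleStringSplitter] = {
    val splitted = new ArrayBuffer[SimpleStringSplitter]
    for (sz <- sizes) {
      val end = (i + sz) min len
      splitted += new SimpleStringSplitter(str, i, end)
      i = end
    }
    splitted.toSeq
  }
}
```

For instance, `new SimpleStringSplitter("abcdef", 0, 6).psplit(2, 4)` yields two splitters that traverse "ab" and "cdef" independently, so each could be handed to a different worker.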

Page 81: Scala Parallel Collections Aleksandar Prokopec EPFL

Word count
now with parallel strings

new ParString(txt).aggregate((0, 0, 0))({
  case ((ls, 0, _), ' ') => (ls + 1, 0, ls + 1)
  case ((ls, 0, _), c) => (ls, 1, 0)
  case ((ls, wc, rs), ' ') => (ls, wc, rs + 1)
  case ((ls, wc, 0), c) => (ls, wc, 0)
  case ((ls, wc, rs), c) => (ls, wc + 1, 0)
}, {
  case ((0, 0, 0), res) => res
  case (res, (0, 0, 0)) => res
  case ((lls, lwc, 0), (0, rwc, rrs)) => (lls, lwc + rwc - 1, rrs)
  case ((lls, lwc, _), (_, rwc, rrs)) => (lls, lwc + rwc, rrs)
})

Word count
performance

txt.foldLeft((0, true)) {
  case ((wc, _), ' ') => (wc, true)
  case ((wc, true), x) => (wc + 1, false)
  case ((wc, false), x) => (wc, false)
}

100 ms

new ParString(txt).aggregate(...) as above:

cores:  1       2      4
time:   137 ms  70 ms  35 ms

Page 83: Scala Parallel Collections Aleksandar Prokopec EPFL

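Hierarchy

[Diagram: GenTraversable, GenIterable and GenSeq are the general traits; Traversable, Iterable and Seq form the sequential branch; ParIterable and ParSeq form the parallel branch.]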

Page 84: Scala Parallel Collections Aleksandar Prokopec EPFL

Hierarchy

def nonEmpty(sq: Seq[String]) = {
  val res = new mutable.ArrayBuffer[String]()
  for (s <- sq) {
    if (s.nonEmpty) res += s
  }
  res
}

Page 85: Scala Parallel Collections Aleksandar Prokopec EPFL

Hierarchy

def nonEmpty(sq: ParSeq[String]) = {
  val res = new mutable.ArrayBuffer[String]()
  for (s <- sq) {
    if (s.nonEmpty) res += s
  }
  res
}

side-effects!
ArrayBuffer is not synchronized!

ParSeq

Seq

Page 88: Scala Parallel Collections Aleksandar Prokopec EPFL

Hierarchy

def nonEmpty(sq: GenSeq[String]) = {
  val res = new mutable.ArrayBuffer[String]()
  for (s <- sq) {
    if (s.nonEmpty) res.synchronized { res += s }
  }
  res
}
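The effect of the `synchronized` block can be seen with plain threads. This is a sketch under the assumption that one thread per part stands in for the workers of a parallel `foreach`; `SyncSketch` and `nonEmptyFromThreads` are hypothetical names.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical illustration: several threads append to one shared buffer.
object SyncSketch {
  def nonEmptyFromThreads(parts: Seq[Seq[String]]): ArrayBuffer[String] = {
    val res = new ArrayBuffer[String]()
    val threads = parts.map { part =>
      new Thread(new Runnable {
        def run(): Unit =
          for (s <- part) {
            // The lock makes each append atomic with respect to other threads.
            if (s.nonEmpty) res.synchronized { res += s }
          }
      })
    }
    threads.foreach(_.start())
    threads.foreach(_.join())
    res
  }
}
```

Every non-empty string ends up in the buffer, but the order depends on thread scheduling, so callers should not rely on it.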

Page 89: Scala Parallel Collections Aleksandar Prokopec EPFL

Accessors vs. transformers
some methods need more than just splitters

Accessors: foreach, reduce, find, sameElements, indexOf, corresponds, forall, exists, max, min, sum, count, …

Transformers: map, flatMap, filter, partition, ++, take, drop, span, zip, patch, padTo, …

These return collections!

Sequential collections – builders
Parallel collections – combiners

Page 93: Scala Parallel Collections Aleksandar Prokopec EPFL

Builders
building a sequential collection

[Diagram: a ListBuilder traverses the list 1 2 3 4 5 6 7 Nil; the elements 2, 4 and 6 are each added with +=, and result returns the list 2 4 6 Nil.]
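The builder steps in the diagram correspond to a few lines of standard-library code. A minimal sketch, where `BuilderSketch` and `evens` are illustrative names and keeping the even numbers is assumed to be the filter the diagram shows:

```scala
// Hypothetical illustration of how a sequential transformer uses a builder.
object BuilderSketch {
  def evens(xs: List[Int]): List[Int] = {
    val b = List.newBuilder[Int]   // starts out empty
    for (x <- xs) {
      if (x % 2 == 0) b += x       // one += per element that is kept
    }
    b.result()                     // materialize the final List
  }
}
```

A single builder works because elements arrive strictly in order from one thread; that assumption is what breaks down once several workers each produce part of the output.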

Page 94: Scala Parallel Collections Aleksandar Prokopec EPFL

How to build parallel?

Page 95: Scala Parallel Collections Aleksandar Prokopec EPFL

Combiners
building parallel collections

trait Combiner[-Elem, +To]
extends Builder[Elem, To] {
  def combine[N <: Elem, NewTo >: To]
    (other: Combiner[N, NewTo]): Combiner[N, NewTo]
}

[Diagram: two Combiners are combined into one Combiner.]

Should be efficient – O(log n) worst case

How to implement this combine?

Page 99: Scala Parallel Collections Aleksandar Prokopec EPFL

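Parallel arrays

[Diagram: the chunks 1, 2, 3, 4 / 5, 6, 7, 8 / 3, 1, 8, 0 / 2, 2, 1, 9 produce the partial results 2, 4 / 6, 8 / 8, 0 / 2, 2. The partial results are merged pairwise and then merged again; finally the result array is allocated and the elements are copied into it: 2 4 6 8 8 0 2 2.]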

Page 100: Scala Parallel Collections Aleksandar Prokopec EPFL

Parallel hash tables

[Diagram sequence: a ParHashMap holds the elements 0 1 2 4 5 7 8 9. An operation such as filter is called on it, and two ParHashCombiners collect the surviving elements: 0 1 4 in one and 5 7 9 in the other.

How to merge? Buckets! Each ParHashCombiner keeps its elements in buckets; the slide lists the binary forms 0 = 0000, 1 = 0001 and 4 = 0100 next to them.

combine joins the buckets of the two combiners into a single ParHashCombiner with no copying! The resulting ParHashMap holds 0 1 4 5 7 9.]

Page 111: Scala Parallel Collections Aleksandar Prokopec EPFL

Custom combiners
for methods returning custom collections

new ParString(txt).filter(_ != ' ')

What is the return type here?

creates a ParVector!

class ParString(val str: String)
extends parallel.immutable.ParSeq[Char] {
  def apply(i: Int) = str.charAt(i)
  ...

Page 114: Scala Parallel Collections Aleksandar Prokopec EPFL

Custom combiners
for methods returning custom collections

class ParString(val str: String)
extends immutable.ParSeq[Char]
   with ParSeqLike[Char, ParString, WrappedString] {
  def apply(i: Int) = str.charAt(i)
  ...
  protected[this] override def newCombiner: Combiner[Char, ParString] =
    new ParStringCombiner

Page 117: Scala Parallel Collections Aleksandar Prokopec EPFL

Custom combiners
for methods returning custom collections

class ParStringCombiner
extends Combiner[Char, ParString] {
  var size = 0
  val chunks = ArrayBuffer(new StringBuilder)
  var lastc = chunks.last
  def +=(elem: Char) = {
    lastc += elem
    size += 1
    this
  }

[Diagram: the combiner holds size, the chunks buffer, and lastc pointing at the last chunk; += appends the element to lastc and adds 1 to size.]

Page 126: Scala Parallel Collections Aleksandar Prokopec EPFL

Custom combiners
for methods returning custom collections

...
  def combine[U <: Char, NewTo >: ParString]
    (other: Combiner[U, NewTo]) = other match {
    case psc: ParStringCombiner =>
      size += psc.size
      chunks ++= psc.chunks
      lastc = chunks.last
      this
  }

[Diagram: two combiners, each with its own chunks and lastc, are joined by appending the chunks of one to the other.]

Page 128: Scala Parallel Collections Aleksandar Prokopec EPFL

Custom combiners
for methods returning custom collections

...
  def result = {
    val rsb = new StringBuilder
    for (sb <- chunks) rsb.append(sb)
    new ParString(rsb.toString)
  }
...

[Diagram: result copies every chunk into a single StringBuilder.]
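The three combiner operations fit together as in the sketch below. It is a simplified stand-in, not the slides' class: it does not extend the library's `Combiner` trait and it produces a plain `String` rather than a `ParString`, so that it runs with the standard library alone. The name `ChunkedStringCombiner` is hypothetical.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical simplification of ParStringCombiner that builds a String.
final class ChunkedStringCombiner {
  var size = 0
  val chunks = ArrayBuffer(new StringBuilder)
  var lastc = chunks.last

  // Append one element to the last chunk.
  def +=(elem: Char): this.type = {
    lastc += elem
    size += 1
    this
  }

  // Take over the other combiner's chunks; no characters are copied here.
  def combine(that: ChunkedStringCombiner): ChunkedStringCombiner = {
    if (that ne this) {
      size += that.size
      chunks ++= that.chunks
      lastc = chunks.last
    }
    this
  }

  // Copy all chunks into one string; this step is sequential.
  def result: String = {
    val rsb = new StringBuilder
    for (sb <- chunks) rsb.append(sb.toString)
    rsb.toString
  }
}
```

In this design `combine` only moves references to chunks, so its cost grows with the number of chunks rather than the number of characters, while `result` pays for the full copy once at the end.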

Page 130: Scala Parallel Collections Aleksandar Prokopec EPFL

Custom combiners
for methods expecting implicit builder factories

// only for big boys
... with GenericParTemplate[T, ParColl] ...

object ParColl extends ParFactory[ParColl] {
  implicit def canCombineFrom[T] = new GenericCanCombineFrom[T]
  ...

Page 131: Scala Parallel Collections Aleksandar Prokopec EPFL

Custom combiners
performance measurement

txt.filter(_ != ' ')

106 ms

new ParString(txt).filter(_ != ' ')

cores:  1       2      4
time:   125 ms  81 ms  56 ms

[Chart: t/ms against proc (1, 2, 4): 125 ms, 81 ms, 56 ms.]

def result (not parallelized)

Page 138: Scala Parallel Collections Aleksandar Prokopec EPFL

Custom combiners
tricky!

• two-step evaluation
  – parallelize the result method in combiners
• efficient merge operation
  – binomial heaps, ropes, etc.
• concurrent data structures
  – non-blocking scalable insertion operation
  – we're working on this

Page 139: Scala Parallel Collections Aleksandar Prokopec EPFL

Future work
coming up

• concurrent data structures
• more efficient vectors
• custom task pools
• user defined scheduling
• parallel bulk in-place modifications

Page 140: Scala Parallel Collections Aleksandar Prokopec EPFL

Thank you!

Examples at: git://github.com/axel22/sd.git