13. Introduction to Parallel Programming Fabrizio Perin Prof. O. Nierstrasz


13. Introduction to Parallel Programming

Fabrizio Perin, Prof. O. Nierstrasz

© Oscar Nierstrasz 2

Sources

> Section 4.4 of Concurrent Programming in Java (Doug Lea, Prentice Hall PTR, November 1999) — covers parallel decomposition in greater detail.

> Sections 6, 7 and 8 of Java Concurrency in Practice (Brian Goetz et al., Addison Wesley Professional, May 2006)

> Doug Lea's concurrency-interest website: http://gee.cs.oswego.edu/dl/concurrency-interest/index.html — download the fork-join framework as part of the jsr166y package and read the paper on its design.

Parallelism

© Oscar Nierstrasz 3

Roadmap

> Concurrent programming and parallelism
> Why should we practice parallel programming? (Amdahl's law about speedup)
> Steps to create parallel programs
> Java library for concurrent and parallel programming
> Example (find the max)
> Example (merge sort in parallel)

Parallelism

© Oscar Nierstrasz 4

Roadmap

> Concurrent programming and parallelism
> Why should we practice parallel programming? (Amdahl's law about speedup)
> Steps to create parallel programs
> Java library for concurrent and parallel programming
> Example (find the max)
> Example (merge sort in parallel)

Parallelism

Parallelism

5

Concurrent programming and parallelism

> Concurrent computing is a form of computing in which programs are designed as collections of interacting computational processes that may be executed in parallel.

> Parallel computing is a form of computation in which many calculations are carried out simultaneously.

© Oscar Nierstrasz

Wikipedia

© Oscar Nierstrasz 6

Roadmap

> Concurrent programming and parallelism
> Why should we practice parallel programming? (Amdahl's law about speedup)
> Steps to create parallel programs
> Java library for concurrent and parallel programming
> Example (find the max)
> Example (merge sort in parallel)

Parallelism

Why should we practice parallel programming?

© Oscar Nierstrasz 7

Parallelism

Because I want to keep my super cool multi-core computer busy!

© Oscar Nierstrasz 8

Parallelism

Speedup = old running time / new running time

Speedup = 140 / 65 = 2.15

(the parallel version is 2.15 times faster)

[Diagram: a program with sequential parts of 20 time units at the start and end, and four chunks of 25 units each that can run in parallel. Run sequentially it takes 20 + 4·25 + 20 = 140 units; with the four chunks in parallel it takes 20 + 25 + 20 = 65 units.]

Why should we practice parallel programming?


© Oscar Nierstrasz 11

Parallelism

[Diagram: sequential program of 140 time units vs. 65 units with the four 25-unit chunks run in parallel]

Amdahl's law:

speedup = 1 / ((1 − p) + p/n)

p = part of parallel code
n = number of CPUs

Why should we practice parallel programming?

© Oscar Nierstrasz 12

Parallelism

Amdahl’s law

[Diagram: sequential program of 140 time units vs. 65 units with the four 25-unit chunks run in parallel]

speedup = old running time / new running time

p = part of parallel code = (100 / 140) · 100% = 71%
n = number of CPUs = 4

speedup = 1 / ((1 − p) + p/n) = 1 / ((1 − 0.71) + 0.71/4) ≈ 2.15

Why should we practice parallel programming?

© Oscar Nierstrasz 13

Parallelism

Amdahl’s law

p = part of parallel code = (100 / 140) · 100% = 71%
n = number of CPUs = 4

[Diagram: sequential program of 140 time units vs. 65 units with the four 25-unit chunks run in parallel]

The maximum speedup of a program using multiple processors in parallel computing is limited by the time needed for the sequential fraction of the program.
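The arithmetic above is easy to check in code. A throwaway sketch (the `speedup` helper is not part of the slides' material):

```java
public class Amdahl {
    // Amdahl's law: theoretical speedup for parallel fraction p on n CPUs
    static double speedup(double p, int n) {
        return 1.0 / ((1.0 - p) + p / n);
    }

    public static void main(String[] args) {
        // 100 of the 140 time units are parallelizable: p = 100/140 ≈ 71%, n = 4
        System.out.printf("%.2f%n", speedup(100.0 / 140.0, 4)); // prints 2.15
    }
}
```

Note that even with infinitely many CPUs, `speedup(100.0/140.0, n)` can never exceed 1 / (1 − p) = 3.5 here, which is the point of the law.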

Why should we practice parallel programming?

© Oscar Nierstrasz 14

Parallelism

Think about the problem!!!

Parallelism

15

Why should we practice parallel programming?

> Don’t try to force a non-parallel problem to be parallel

> Identify which program chunks provide the best speedup/effort ratio

© Oscar Nierstrasz

Why should we practice parallel programming?

© Oscar Nierstrasz 16

Parallelism

> Don't try to force a non-parallel problem to be parallel

> Identify which program chunks provide the best speedup/effort ratio

e.g. a program has a part A that takes 50 time units and a part B that takes 100:
speeding A up 5x gives A 10 + B 100 = 110 units,
while speeding B up only 2x gives A 50 + B 50 = 100 units,
so the modest 2x speedup of the larger chunk B pays off more.

© Oscar Nierstrasz 17

Roadmap

> Concurrent programming and parallelism
> Why should we practice parallel programming? (Amdahl's law about speedup)
> Steps to create parallel programs
> Java library for concurrent and parallel programming
> Example (find the max)
> Example (merge sort in parallel)

Parallelism

Common steps to create parallel programs

© Rodric Rabbah, IBM 18

Parallelism

Kinds of parallelism (Problem decomposition)

© Rodric Rabbah, IBM 19

Parallelism

Kinds of parallelism (Problem decomposition)

© Rodric Rabbah, IBM 20

Parallelism

> Data parallelism: the same task runs on different data in parallel

Parts of the data can be divided between different tasks and the tasks performed in parallel

There are no dependencies among the tasks that would require their results to be ordered or merged
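Data parallelism can be sketched with the `java.util.concurrent` tools introduced later in this lecture: the array is split into chunks, each chunk is summed by its own task, and the partial sums are merged at the end. (`DataParallelSum` and its helper are made up for illustration, not from the slides.)

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class DataParallelSum {
    // Split the array into `chunks` slices and sum each slice in its own task.
    static long parallelSum(int[] data, int chunks) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(chunks);
        List<Future<Long>> parts = new ArrayList<>();
        int size = data.length / chunks;
        for (int c = 0; c < chunks; c++) {
            final int from = c * size;
            final int to = (c == chunks - 1) ? data.length : from + size;
            parts.add(pool.submit(() -> {
                long s = 0;
                for (int i = from; i < to; i++) s += data[i];
                return s;
            }));
        }
        long total = 0;
        for (Future<Long> f : parts) total += f.get(); // merge the partial sums
        pool.shutdown();
        return total;
    }

    public static void main(String[] args) throws Exception {
        int[] data = new int[1_000_000];
        java.util.Arrays.fill(data, 1);
        System.out.println(parallelSum(data, 4)); // prints 1000000
    }
}
```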

Kinds of parallelism (Problem decomposition)

© Rodric Rabbah, IBM 21

Parallelism

> Task parallelism: different tasks running on the same data

Several functions are applied to the same data (e.g. average, max, min, etc.)

The tasks are independent, so they can run in parallel
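A minimal sketch of task parallelism: three independent functions (max, min, average) submitted over the same array via `ExecutorService.invokeAll`. Class and method names are illustrative, not from the slides.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class TaskParallelStats {
    // Run three independent functions on the same data, one task each.
    static List<Double> stats(int[] data) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(3);
        List<Callable<Double>> tasks = Arrays.asList(
            () -> (double) Arrays.stream(data).max().getAsInt(),
            () -> (double) Arrays.stream(data).min().getAsInt(),
            () -> Arrays.stream(data).average().getAsDouble());
        List<Double> results = new ArrayList<>();
        for (Future<Double> f : pool.invokeAll(tasks)) results.add(f.get());
        pool.shutdown();
        return results; // [max, min, average]
    }

    public static void main(String[] args) throws Exception {
        System.out.println(stats(new int[]{3, 1, 4, 1, 5, 9, 2, 6})); // [9.0, 1.0, 3.875]
    }
}
```

Because no task reads another task's result, no ordering or merging step is needed beyond collecting the three answers.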

Kinds of parallelism (Problem decomposition)

© Rodric Rabbah, IBM 22

Parallelism

> Hybrid data/task parallelism: a parallel pipeline of tasks, each of which might be data parallel

Each task can run in parallel, e.g. Unix pipes
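The pipeline style can be sketched with a `BlockingQueue` connecting two stages, analogous to a Unix pipe (a toy example, not from the slides): stage 1 produces 1..n, stage 2 squares each value.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class PipelineDemo {
    // Two-stage pipeline: producer and consumer run concurrently,
    // connected by a bounded blocking queue.
    static List<Integer> squares(int n) throws InterruptedException {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(10);
        List<Integer> out = new ArrayList<>();
        Thread producer = new Thread(() -> {
            try {
                for (int i = 1; i <= n; i++) queue.put(i);
                queue.put(-1); // sentinel marking end of stream
            } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        });
        Thread consumer = new Thread(() -> {
            try {
                int v;
                while ((v = queue.take()) != -1) out.add(v * v);
            } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        });
        producer.start();
        consumer.start();
        producer.join();
        consumer.join();
        return out;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(squares(5)); // [1, 4, 9, 16, 25]
    }
}
```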

© Oscar Nierstrasz 23

Roadmap

> Concurrent programming and parallelism
> Why should we practice parallel programming? (Amdahl's law about speedup)
> Steps to create parallel programs
> Java library for concurrent and parallel programming
> Example (find the max)
> Example (merge sort in parallel)

Parallelism

24

Java library for concurrent and parallel programming

> The java.util.concurrent package includes classes and extensible frameworks to support concurrent and parallel programming

Parallelism

© Oscar Nierstrasz

© Oscar Nierstrasz 25

Executor

> Executor is an interface used to define custom thread-like subsystems.

> Executor contains methods to execute and manage tasks.

> Tasks may execute in a newly created thread, an existing task-execution thread, or the thread calling execute(), and may execute sequentially or concurrently.

Parallelism

public interface Executor {
    void execute(Runnable command);
}
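A minimal illustration of the interface: an Executor that simply runs each submitted task in the calling thread. (`DirectExecutor` is a made-up name; the within-thread executor is a common introductory example.)

```java
import java.util.concurrent.Executor;

// Minimal Executor implementation: run each task synchronously
// in the thread that calls execute().
class DirectExecutor implements Executor {
    public void execute(Runnable command) {
        command.run();
    }
}

public class ExecutorDemo {
    public static void main(String[] args) {
        Executor exec = new DirectExecutor();
        exec.execute(() -> System.out.println("task ran")); // prints "task ran"
    }
}
```

The point of the interface is that client code only calls `execute(task)`; whether the task runs in the caller, a fresh thread, or a pooled thread is the executor's decision.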

© Oscar Nierstrasz 26

Web server without Executor

Parallelism

class ThreadPerTaskWebServer {
    public static void main(String[] args) throws IOException {
        ServerSocket socket = new ServerSocket(80);
        while (true) {
            final Socket connection = socket.accept();
            Runnable task = new Runnable() {
                public void run() {
                    handleRequest(connection);
                }
            };
            new Thread(task).start();
        }
    }
}

© Oscar Nierstrasz 27

Web server with Executor

Parallelism

class TaskExecutionWebServer {
    private static final int NTHREADS = 100;
    private static final Executor exec
        = Executors.newFixedThreadPool(NTHREADS);

    public static void main(String[] args) throws IOException {
        ServerSocket socket = new ServerSocket(80);
        while (true) {
            final Socket connection = socket.accept();
            Runnable task = new Runnable() {
                public void run() {
                    handleRequest(connection);
                }
            };
            exec.execute(task);
        }
    }
}

© Oscar Nierstrasz 28

Thread Pool

> A thread pool manages a set of worker threads.
> The threads in the pool have a simple life cycle:
> Request the next task from the queue of tasks
> Execute it
> Wait for another task

> Advantages of using a thread pool:
> Reduces the costs of thread creation and teardown
> Increases responsiveness
> By properly tuning the pool you always have the right number of threads (you don't run out of memory and all your CPUs are busy)
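The life cycle above can be sketched with a fixed pool: four workers service ten queued tasks, each worker repeatedly taking the next task, running it, and waiting again. (`runTasks` is a hypothetical helper, not from the slides.)

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class PoolDemo {
    // Submit nTasks jobs to a fixed pool of nThreads workers and
    // return how many completed.
    static int runTasks(int nThreads, int nTasks) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(nThreads);
        AtomicInteger done = new AtomicInteger();
        for (int i = 0; i < nTasks; i++)
            pool.execute(done::incrementAndGet);
        pool.shutdown();                              // accept no new tasks
        pool.awaitTermination(10, TimeUnit.SECONDS);  // wait for queued tasks
        return done.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runTasks(4, 10) + " tasks completed"); // 10 tasks completed
    }
}
```

Note that ten tasks complete even though only four threads ever exist: the pool's internal queue holds the waiting tasks.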

Parallelism

© Oscar Nierstrasz 29

Thread Pool

> newFixedThreadPool: A fixed-size thread pool. A new thread is created for each task to execute up to the maximum pool size. Attempts to keep the pool size constant (threads that die for any reason are replaced by new threads).

> newCachedThreadPool: More flexible pool that removes idle threads when the size of the pool exceeds the demand for processing, and adds new threads when demand increases, but places no bounds on the size of the pool.

Parallelism

© Oscar Nierstrasz 30

Thread Pool

> newSingleThreadExecutor: A single-threaded executor creates a single worker thread to process tasks, replacing it if it dies unexpectedly. Tasks are guaranteed to be processed sequentially according to the order imposed by the task queue (FIFO, LIFO, priority order).

> newScheduledThreadPool: A fixed-size thread pool that supports delayed and periodic task execution, similar to Timer.

Parallelism

© Oscar Nierstrasz 31

Roadmap

> Concurrent programming and parallelism> Why we should practice parallel programming?

(Amdahl's laws about speedup)> Steps to create parallel programs> Java library for concurrent and parallel programming> Examples (find the max)> Example (merge sort parallel)

Parallelism

© Oscar Nierstrasz 32

Select Max in parallel

…
public class SelectMaxProblem {
…
    public int solveSequentially() {
        int max = Integer.MIN_VALUE;
        for (int i = start; i < end; i++) {
            int n = numbers[i];
            if (n > max)
                max = n;
        }
        return max;
    }
…

Parallelism

© Oscar Nierstrasz 33

Select Max in parallel

import jsr166y.ForkJoinTask;
import jsr166y.RecursiveAction;

public class MaxWithFJ extends RecursiveAction {

    @Override
    protected void compute() {
        if (problem.getSize() < threshold) {
            result = problem.solveSequentially();
            System.out.println(Thread.currentThread() +
                " is solving sequentially on: " + problem.getSize());
        } else {
            int midpoint = problem.getSize() / 2;
            MaxWithFJ left =
                new MaxWithFJ(problem.subproblem(0, midpoint), threshold);
            // half-open ranges, so the right half starts at midpoint
            MaxWithFJ right =
                new MaxWithFJ(problem.subproblem(midpoint, problem.getSize()), threshold);
            invokeAll(left, right);
            System.out.println("solving in parallel on: " + ForkJoinTask.getPool());
            result = Math.max(left.getResult(), right.getResult());
        }
    }

Parallelism

© Oscar Nierstrasz 34

Select Max in parallel

public class MaxWithFJTests {

    private final SelectMaxProblem problem;
    private final int threshold;
    private final int nThreads;
    private final int[] number = new int[500000];

    public MaxWithFJTests() {…}

    @Test
    public void parallelTest() {
        MaxWithFJ mfj = new MaxWithFJ(problem, threshold);
        ForkJoinPool fjPool = new ForkJoinPool(nThreads);
        fjPool.invoke(mfj);
        int result = mfj.getResult();
        …
        assertEquals(result, max);
    }
}
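Since Java 7, the fork/join framework ships in `java.util.concurrent`, so the jsr166y preview package is no longer needed. A self-contained find-max along the same lines, using `RecursiveTask` instead of `RecursiveAction` plus a result field (this is a sketch, not the slides' exact code):

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Find-max with the fork/join framework from java.util.concurrent.
public class MaxTask extends RecursiveTask<Integer> {
    private static final int THRESHOLD = 1000;
    private final int[] numbers;
    private final int start, end; // half-open range [start, end)

    MaxTask(int[] numbers, int start, int end) {
        this.numbers = numbers;
        this.start = start;
        this.end = end;
    }

    @Override
    protected Integer compute() {
        if (end - start < THRESHOLD) {          // small enough: solve sequentially
            int max = Integer.MIN_VALUE;
            for (int i = start; i < end; i++)
                if (numbers[i] > max) max = numbers[i];
            return max;
        }
        int mid = start + (end - start) / 2;
        MaxTask left = new MaxTask(numbers, start, mid);
        MaxTask right = new MaxTask(numbers, mid, end);
        left.fork();                    // run left half asynchronously
        int rightMax = right.compute(); // compute right half in this thread
        return Math.max(left.join(), rightMax);
    }

    public static void main(String[] args) {
        int[] data = new int[500_000];
        for (int i = 0; i < data.length; i++) data[i] = i % 100_000;
        System.out.println(new ForkJoinPool().invoke(new MaxTask(data, 0, data.length))); // 99999
    }
}
```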

Parallelism

© Oscar Nierstrasz 35

Roadmap

> Concurrent programming and parallelism> Why we should practice parallel programming?

(Amdahl's laws about speedup)> Steps to create parallel programs> Java library for concurrent and parallel programming> Examples (find the max)> Example (merge sort parallel)

Parallelism

Parallelism

36

Merge sort

> Divide et Impera algorithm:
> Divide: split your problem into sub-problems that are smaller parts of the original problem
> Impera: solve the sub-problems recursively. (If a sub-problem is small enough, it is solved in a straightforward manner.)
> Combine: combine the solutions to the sub-problems into the solution for the original problem.

© Oscar Nierstrasz

Parallelism

37

Merge sort

> Merge sort:
> Divide: split the n-element sequence to be sorted into two sub-sequences of n/2 elements each
> Impera: sort the sub-sequences recursively using merge sort
> Combine: merge the two sorted sub-sequences to produce the sorted answer

© Oscar Nierstrasz

© Oscar Nierstrasz 38

Merge sort in parallel

. . .
public class MergeSort extends RecursiveAction {
. . .
    private void merge(MergeSort left, MergeSort right) {
        int i = 0, leftPos = 0, rightPos = 0,
            leftSize = left.size(), rightSize = right.size();
        while (leftPos < leftSize && rightPos < rightSize)
            result[i++] = (left.result[leftPos] <= right.result[rightPos])
                ? left.result[leftPos++] : right.result[rightPos++];
        while (leftPos < leftSize) result[i++] = left.result[leftPos++];
        while (rightPos < rightSize) result[i++] = right.result[rightPos++];
    }
. . .

. . .

Parallelism

© Oscar Nierstrasz 39

Merge sort in parallel

. . .
public class MergeSort extends RecursiveAction {
. . .
    public int size() { return endPos - startPos; }

    protected void compute() {
        if (size() < SEQUENTIAL_THRESHOLD) {
            System.arraycopy(numbers, startPos, result, 0, size());
            Arrays.sort(result, 0, size());
        } else {
            int midpoint = size() / 2;
            MergeSort left =
                new MergeSort(numbers, startPos, startPos + midpoint);
            MergeSort right =
                new MergeSort(numbers, startPos + midpoint, endPos);
            invokeAll(left, right);
            merge(left, right);
        }
    }

    public int[] getResult() { return result; }
}

Parallelism

© Oscar Nierstrasz 40

Merge sort in parallel

package mergeSortParallel;

import java.util.Arrays;
import jsr166y.RecursiveAction;

public class MergeSort extends RecursiveAction {
    private static final int SEQUENTIAL_THRESHOLD = 50000;
…
    private void merge(MergeSort left, MergeSort right) {
        int i = 0, leftPos = 0, rightPos = 0,
            leftSize = left.size(), rightSize = right.size();
        while (leftPos < leftSize && rightPos < rightSize)
            result[i++] = (left.result[leftPos] <= right.result[rightPos])
                ? left.result[leftPos++] : right.result[rightPos++];
        while (leftPos < leftSize) result[i++] = left.result[leftPos++];
        while (rightPos < rightSize) result[i++] = right.result[rightPos++];
    }

    public int size() { return endPos - startPos; }

    protected void compute() {
        if (size() < SEQUENTIAL_THRESHOLD) {
            System.arraycopy(numbers, startPos, result, 0, size());
            Arrays.sort(result, 0, size());
        } else {
            int midpoint = size() / 2;
            MergeSort left = new MergeSort(numbers, startPos, startPos + midpoint);
            MergeSort right = new MergeSort(numbers, startPos + midpoint, endPos);
            invokeAll(left, right);
            merge(left, right);
        }
    }

    public int[] getResult() { return result; }
}

Parallelism

© Oscar Nierstrasz 41

WordCount example

Parallelism

> Where was the problem?

© Oscar Nierstrasz 42

WordCount example

Parallelism

[Chart: running time in ms (y-axis up to 5000) against number of threads (x-axis 1–10), one curve per sequential threshold: 50, 500, 5000, 50000, 500000]

© Oscar Nierstrasz 43

WordCount example

Parallelism

[Chart: the same data zoomed to running times up to 400 ms, threads 1–10, thresholds 50–500000]

© Oscar Nierstrasz 44

WordCount example

Parallelism

[Chart: running time in ms (up to 5000) per sequential threshold (50, 500, 5000, 50000, 500000), one curve per thread count 1–10]

© Oscar Nierstrasz 45

WordCount Example

[ wordcount ]> java WordCount bigdict.txt
Total words = 241104
Total time = 225 ms
[ wordcount ]> java WordCount bigdict.txt
Total words = 241104
Total time = 226 ms

[ wordcount ]> java WordCountParallel bigdict.txt 8 50000
Total words = 241104
Total time = 148 ms
[ wordcount ]> java WordCountParallel bigdict.txt 8 50000
Total words = 241104
Total time = 148 ms

[ wordcount ]> java WordCountParallelAtomicInt bigdict.txt 8 50000
Total words = 241104
Total time = 133 ms
[ wordcount ]> java WordCountParallelAtomicInt bigdict.txt 8 50000
Total words = 241104
Total time = 132 ms
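The atomic-counter variant is the fastest of the three runs, presumably because worker threads update a shared count without taking a lock. A hedged sketch of that pattern (class and numbers are illustrative; this is not the actual WordCount code):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class AtomicCountDemo {
    // nTasks workers each add `perTask` words to one shared lock-free counter.
    static int count(int nTasks, int perTask) throws InterruptedException {
        AtomicInteger total = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(nTasks);
        for (int t = 0; t < nTasks; t++)
            pool.execute(() -> total.addAndGet(perTask)); // atomic, no synchronized block
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return total.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(count(8, 1000)); // prints 8000
    }
}
```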

Parallelism

© Oscar Nierstrasz


46

What you should know!

> What is the difference between Concurrent Computing and Parallel Computing?

> Why would you execute code in parallel?
> Which kinds of problem decomposition can you apply?
> What are the main functionalities of the java.util.concurrent package of Java?

License

© Oscar Nierstrasz


Attribution-ShareAlike 3.0 Unported

You are free:
to Share — to copy, distribute and transmit the work
to Remix — to adapt the work

Under the following conditions:
Attribution. You must attribute the work in the manner specified by the author or licensor (but not in any way that suggests that they endorse you or your use of the work).
Share Alike. If you alter, transform, or build upon this work, you may distribute the resulting work only under the same, similar or a compatible license.
For any reuse or distribution, you must make clear to others the license terms of this work. The best way to do this is with a link to this web page.
Any of the above conditions can be waived if you get permission from the copyright holder.
Nothing in this license impairs or restricts the author's moral rights.

http://creativecommons.org/licenses/by-sa/3.0/