
Page 1

Advanced Computer Networks

Lecture 1 - Parallelization

Page 2

Scale increases complexity

[Figure: progression from a single-core machine to a multicore server, a cluster, a wide-area network, and a large-scale distributed system, with more challenges at each step: true concurrency (multicore server); a network, message passing, and more failure modes such as faulty nodes (cluster); even more failure modes plus incentives, laws, ... (wide-area network).]

Page 3

Parallelization

• The algorithm works fine on one core
• Can we make it faster on multiple cores?
  – Difficult - need to find something for the other cores to do
  – There are other sorting algorithms where this is much easier (see the sketch after the code below)
  – Not all algorithms are equally parallelizable

void bubblesort(int[] nums) {
    boolean done = false;
    while (!done) {
        done = true;
        for (int i = 1; i < nums.length; i++) {
            if (nums[i - 1] > nums[i]) {
                // swap nums[i-1] and nums[i]
                int tmp = nums[i - 1];
                nums[i - 1] = nums[i];
                nums[i] = tmp;
                done = false;
            }
        }
    }
}
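To illustrate the point that some sorting algorithms parallelize much more easily than bubble sort, here is a minimal sketch, assuming Java's standard fork/join framework (java.util.concurrent); it is an illustration added to these notes, not part of the original slides. The two halves of the array can be sorted on different cores before a sequential merge.

import java.util.Arrays;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

// Sketch: divide-and-conquer merge sort in which the two halves can be
// sorted on different cores; the merge step stays sequential.
class ParallelMergeSort extends RecursiveAction {
    private static final int THRESHOLD = 1 << 13; // below this, sort sequentially
    private final int[] a;
    private final int lo, hi;                      // sorts the range a[lo..hi)

    ParallelMergeSort(int[] a, int lo, int hi) {
        this.a = a; this.lo = lo; this.hi = hi;
    }

    @Override
    protected void compute() {
        if (hi - lo <= THRESHOLD) {
            Arrays.sort(a, lo, hi);                // small task: no point forking further
            return;
        }
        int mid = (lo + hi) >>> 1;
        // Sort the two halves in parallel (potentially on different cores).
        invokeAll(new ParallelMergeSort(a, lo, mid),
                  new ParallelMergeSort(a, mid, hi));
        merge(mid);
    }

    private void merge(int mid) {
        int[] left = Arrays.copyOfRange(a, lo, mid);
        int i = 0, j = mid, k = lo;
        while (i < left.length && j < hi) {
            a[k++] = (left[i] <= a[j]) ? left[i++] : a[j++];
        }
        while (i < left.length) a[k++] = left[i++];
        // any remaining elements in a[j..hi) are already in place
    }

    static void sort(int[] a) {
        ForkJoinPool.commonPool().invoke(new ParallelMergeSort(a, 0, a.length));
    }
}

For comparison, the JDK's built-in Arrays.parallelSort applies a similar fork/join strategy.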

Page 4

Parallelization

• If we increase the number of processors, will the speed also increase?
  – Yes, but (in almost all cases) only up to a point

[Graph: numbers sorted per second vs. cores used, comparing an 'Ideal' (linear) curve with the 'Expected' curve, which falls increasingly short of it.]

Speedup: $S_n = \dfrac{T_1}{T_n}$, where $T_1$ is the completion time with one core and $T_n$ is the completion time with $n$ cores.
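As a quick worked example (with illustrative numbers that are not from the slides): if sorting takes 10 s on one core and 3 s on four cores, then

$S_4 = \dfrac{T_1}{T_4} = \dfrac{10\,\text{s}}{3\,\text{s}} \approx 3.3,$

somewhat below the ideal speedup of 4.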

Page 5

Amdahl's law

• Usually, not all parts of the algorithm can be parallelized
• Let $f_i$ be the fraction of the running time spent in part $i$, and let $S_i$ be the speedup achieved for that part
• Then the overall speedup is

$S_{\text{overall}} = \dfrac{1}{\sum_{i=0}^{N} \dfrac{f_i}{S_i}}$
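A commonly quoted special case, added here for reference alongside the general sum above: if a single fraction $f$ of the work can be parallelized with speedup $s$ and the remaining $1 - f$ stays sequential, the formula reduces to

$S_{\text{overall}} = \dfrac{1}{(1 - f) + \dfrac{f}{s}},$

so even as $s \to \infty$ the overall speedup is bounded by $1/(1 - f)$.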

[Diagram: execution timeline in which parallel parts run simultaneously across Core #1 through Core #6, interleaved with sequential parts that run on a single core.]

Page 6

Amdahl's law

• We are given a sequential task which is split into four consecutive parts: P1, P2, P3 and P4, with the percentages of runtime being 11%, 18%, 23% and 48% respectively.
• Then we are told that P1 does not speed up, so S1 = 1, while P2 speeds up 5×, P3 speeds up 20×, and P4 speeds up 1.6×.
• The new running time, as a fraction of the original, is:

$\dfrac{0.11}{1} + \dfrac{0.18}{5} + \dfrac{0.23}{20} + \dfrac{0.48}{1.6} = 0.11 + 0.036 + 0.0115 + 0.30 = 0.4575$
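A small, self-contained check of this arithmetic (a sketch added to these notes, not part of the slides), written in Java:

// Amdahl's law for the four-part example on this slide.
public class AmdahlExample {
    public static void main(String[] args) {
        double[] fraction = {0.11, 0.18, 0.23, 0.48}; // share of original runtime
        double[] speedup  = {1.0, 5.0, 20.0, 1.6};    // per-part speedup

        double newTime = 0.0;                         // relative to the original runtime
        for (int i = 0; i < fraction.length; i++) {
            newTime += fraction[i] / speedup[i];
        }
        double overall = 1.0 / newTime;

        System.out.printf("new relative running time = %.4f%n", newTime); // 0.4575
        System.out.printf("overall speedup = %.3f%n", overall);           // ~2.186
    }
}

It prints a relative running time of 0.4575 and an overall speedup of about 2.186, matching the next slide.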

Page 7

Amdahl's law

• Or a little less than 1⁄2 the original running time

• The overall speed boost is 1 / 0.4575 = 2.186, or a little more than double the original speed.

Page 8

Is more parallelism always better?

• Increasing parallelism beyond a certain point can cause performance to decrease!
  – Example: Need to send a message to each core to tell it what to do, and messages have to travel back and forth.

[Graph: numbers sorted per second vs. cores. The 'Ideal' curve keeps rising, the 'Expected' curve levels off, and the 'Reality (often)' curve peaks at a sweet spot and then falls as more cores are added.]

Page 9

Parallelization

• What size of task should we assign to each core?

• Frequent coordination creates overhead
  – Need to send messages back and forth, wait for other cores...
  – Result: Cores spend most of their time communicating
  – Bad: Ask each core to sort three numbers
  – Good: Ask each core to sort a million numbers (see the granularity sketch below)
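To make the granularity advice concrete, here is a hedged sketch, added to these notes and not from the slides, that hands each core one large chunk of the array instead of many tiny tasks; the class and method names are illustrative.

import java.util.Arrays;

// Coarse-grained parallelism: one big chunk per core, so coordination
// cost (thread startup, hand-off, join) is paid only a handful of times.
public class ChunkedSort {
    public static void sortChunksInParallel(int[] nums) throws InterruptedException {
        int cores = Runtime.getRuntime().availableProcessors();
        int chunk = (nums.length + cores - 1) / cores;   // ceiling division
        Thread[] workers = new Thread[cores];

        for (int c = 0; c < cores; c++) {
            final int from = Math.min(c * chunk, nums.length);
            final int to = Math.min(from + chunk, nums.length);
            workers[c] = new Thread(() -> Arrays.sort(nums, from, to));
            workers[c].start();                          // each core gets ~1/cores of the data
        }
        for (Thread t : workers) {
            t.join();                                    // wait once per core, not per element
        }
        // Each chunk is now sorted; a final multi-way merge (omitted here)
        // would combine them into one fully sorted array.
    }
}

With a million numbers per chunk, the one-time cost of starting and joining a thread is negligible; with three numbers per chunk, that coordination cost would dominate, which is exactly the slide's point.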