Parallel and Distributed Algorithms Eric Vidal Reference: R. Johnsonbaugh and M. Schaefer, Algorithms (International Edition). 2004. Pearson Education.


Page 1

Parallel and Distributed Algorithms

Eric Vidal

Reference: R. Johnsonbaugh and M. Schaefer, Algorithms (International Edition). 2004. Pearson Education.

Page 2

Outline

• Introduction (case study: maximum element)
  – Work-optimality
• The Parallel Random Access Machine
  – Shared memory modes
  – Accelerated cascading
• Other Parallel Architectures (case study: sorting)
  – Circuits
  – Linear processor networks
  – (Mesh processor networks)
• Distributed Algorithms
  – Message-optimality
  – Broadcast and echo
  – (Leader election)

Page 3

Introduction

Page 4

Why use parallelism?

• p steps on 1 printer, 1 step on p printers
• p = speed-up factor (best case)
• Given a sequential algorithm, how can we parallelize it?
  – Some are inherently sequential (P-complete)

Page 5

Case Study: Maximum Element

In: a[]
Out: maximum element in a

sequential_maximum(a) {
  n = a.length
  max = a[0]
  for i = 1 to n – 1 {
    if (a[i] > max)
      max = a[i]
  }
  return max
}

a = 21 11 23 17 48 33 22 41

Running max after each comparison: 21, 23, 23, 48, 48, 48, 48

Running time: O(n)

Page 6

Parallel Maximum

• Idea: Use ⌈n / 2⌉ processors

• Note idle processors after the first step!

21 11 23 17 48 33 22 41
→ 21 23 48 41
→ 23 48
→ 48

Running time: O(lg n)
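The pairwise reduction above can be simulated round by round; a minimal Python sketch (the function name and round structure are illustrative, not from the reference):

```python
# Simulate the parallel maximum reduction: each while-iteration stands in
# for one synchronized parallel step in which ceil(len(level) / 2)
# "processors" combine pairs of elements.
def parallel_maximum_rounds(a):
    level = list(a)
    while len(level) > 1:
        # Pair neighbors; a leftover element passes through unchanged.
        level = [max(level[i:i + 2]) for i in range(0, len(level), 2)]
    return level[0]

print(parallel_maximum_rounds([21, 11, 23, 17, 48, 33, 22, 41]))  # prints 48
```

Each iteration halves the number of surviving candidates, giving the O(lg n) step count from the slide.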

Page 7

Work-Optimality

• Work = number of algorithmic steps × number of processors

• Work of the parallelized maximum algorithm = O(lg n) steps × (n / 2) processors = O(n lg n)

• Not work-optimal! The sequential algorithm’s work is only O(n)
  – Workaround: accelerated cascading…

Page 8

Formal Algorithm for Parallel Maximum

• But first!...

Page 9

The Parallel Random Access Machine

Page 10

The Parallel Random Access Machine (PRAM)

• New construct: parallel loop

for i = 1 to n in parallel { … }

• Assumption 1: use n processors to execute this loop (processors are synchronized)

• Assumption 2: memory shared across all processors

Page 11

Example: Parallel Search

In: a[], x
Out: true if x is in a, false otherwise

parallel_search(a, x) {
  n = a.length
  found = false
  for i = 0 to n – 1 in parallel {
    if (a[i] == x)
      found = true
  }
  return found
}

Is this work-optimal?

Shared memory modes:
• Exclusive Read (ER)
• Concurrent Read (CR)
• Exclusive Write (EW)
• Concurrent Write (CW)

Real-world systems are most commonly CREW

parallel_search runs on what type?
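A sequential Python sketch of parallel_search’s shared-flag idea (the ordinary loop stands in for the parallel loop; a toy model, not the PRAM itself):

```python
# Sequentially simulate parallel_search: each iteration plays the role of
# one "processor" i. All writers store the same value (True), so even a
# common-CW machine would accept the concurrent writes to `found`.
def parallel_search(a, x):
    found = False
    for i in range(len(a)):  # conceptually: for i = 0 to n - 1 in parallel
        if a[i] == x:
            found = True     # concurrent write, but always the same value
    return found

print(parallel_search([4, 8, 15, 16, 23, 42], 15))  # prints True
```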

Page 12

Formal Algorithm for Parallel Maximum

In: a[]
Out: maximum element in a

parallel_maximum(a) {
  n = a.length
  for i = 0 to ⌈lg n⌉ – 1 {
    for j = 0 to ⌈n / 2^(i+1)⌉ – 1 in parallel {
      if (j × 2^(i+1) + 2^i < n)  // boundary check
        a[j × 2^(i+1)] = max(a[j × 2^(i+1)], a[j × 2^(i+1) + 2^i])
    }
  }
  return a[0]
}

Theorem: parallel_maximum is CREW and finds the maximum element in parallel time O(lg n) and work O(n lg n)

Page 13

Accelerated Cascading

• Phase 1: Use sequential_maximum on blocks of lg n elements
  – We use n / lg n processors
  – O(lg n) sequential steps per processor
  – Total work = O(lg n) steps × (n / lg n) processors = O(n)
• Phase 2: Use parallel_maximum on the resulting n / lg n elements
  – lg (n / lg n) parallel steps = lg n – lg (lg n) = O(lg n)
  – Total work = O(lg n) steps × ((n / lg n) / 2) processors = O(n)

Page 14

Formal Algorithm for Optimal Maximum

In: a[]
Out: maximum element in a

optimal_maximum(a) {
  n = a.length
  block_size = ⌈lg n⌉
  block_count = ⌈n / block_size⌉
  create array block_results[block_count]
  for i = 0 to block_count – 1 in parallel {
    start = i × block_size
    end = min(n – 1, start + block_size – 1)
    block_results[i] = sequential_maximum(a[start .. end])
  }
  return parallel_maximum(block_results)
}
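A runnable Python sketch of the two-phase scheme (a sequential simulation; the handling of very small n is an assumption, not from the reference):

```python
from math import ceil, log2

def sequential_maximum(block):
    # Phase-1 worker: plain sequential scan, O(block length) steps.
    m = block[0]
    for v in block[1:]:
        if v > m:
            m = v
    return m

def optimal_maximum(a):
    n = len(a)
    if n <= 2:
        return sequential_maximum(a)
    block_size = ceil(log2(n))
    # Phase 1: one "processor" per block of ~lg n elements runs the
    # sequential algorithm.
    block_results = [sequential_maximum(a[s:s + block_size])
                     for s in range(0, n, block_size)]
    # Phase 2: pairwise parallel-style reduction on the n / lg n maxima.
    while len(block_results) > 1:
        block_results = [max(block_results[i:i + 2])
                         for i in range(0, len(block_results), 2)]
    return block_results[0]

print(optimal_maximum([21, 11, 23, 17, 48, 33, 22, 41]))  # prints 48
```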

Page 15

Some Notes

• All CR algorithms can be converted to ER algorithms!
  – “Broadcasting” an ER variable to all processors for concurrent access takes O(lg n) parallel time
• maximum is a “semigroup algorithm”
  – Semigroup = a set of elements + an associative binary operation (max, min, +, ×, etc.)
  – The same accelerated-cascading method can be applied to min-element, summation, the product of n numbers, etc.!

Page 16

Other Parallel Architectures

Page 17

PRAM may not be the best model

• Shared memory = expensive!
  – Some algorithms require communication between processors (= memory-locking issues)
  – Better to use channels!
• Extreme case: very simple processors with no shared memory (just communication channels)

Page 18

Circuits

• Each processor is a gate with a specialized function (e.g., comparator gate)

• Circuit = a layout of gates to perform a full task (e.g., sorting)

(Figure: a comparator gate takes inputs x and y and outputs min(x, y) and max(x, y).)

Page 19

Sorting circuit for 4 elements (depth 3)

(Depth of network = 3)

Input:         17 42 23 7
After step 1:  17 42 7 23
After step 2:  7 23 17 42
After step 3:  7 17 23 42

Page 20

Sorting circuit for n elements?

• Simpler problem: max element

• Idea: Add as many of these diagonals as needed

Page 21

Odd-Even Transposition Network

• Theorem: The odd-even transposition network sorts n numbers in n steps using O(n²) processors

Input:  18 42 31 56 12 11 19 34
Step 1: 18 42 31 56 11 12 19 34
Step 2: 18 31 42 11 56 12 19 34
Step 3: 18 31 11 42 12 56 19 34
Step 4: 18 11 31 12 42 19 56 34
Step 5: 11 18 12 31 19 42 34 56
Step 6: 11 12 18 19 31 34 42 56
Step 7: 11 12 18 19 31 34 42 56
Step 8: 11 12 18 19 31 34 42 56
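The steps above can be reproduced in a few lines of Python (a sequential sketch; in the network each comparator column is one parallel step):

```python
def odd_even_transposition_sort(a):
    """Sort by n alternating comparator columns, as in the network."""
    a = list(a)
    n = len(a)
    for step in range(n):
        # Even steps compare pairs (0,1), (2,3), ...;
        # odd steps compare pairs (1,2), (3,4), ...
        for i in range(step % 2, n - 1, 2):
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
    return a

print(odd_even_transposition_sort([18, 42, 31, 56, 12, 11, 19, 34]))
# prints [11, 12, 18, 19, 31, 34, 42, 56]
```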

Page 22

Zero-One Principle of Sorting Networks

• Lemma: If a sorting network works correctly on all inputs consisting of only 0’s and 1’s, it works for any arbitrary input
  – Assume there is a network that sorts 0-1 sequences but not some arbitrary input a0 .. an–1
  – Let b0 .. bn–1 be the output of that network
  – Then there must exist s < t such that bs > bt
  – Label every ai < bs with 0 and all others with 1
  – If we run a0 .. an–1 with their labels, then bs’s label will be 1 and bt’s label will be 0
  – Contradiction: the network was assumed to sort 0-1 sequences properly, but did not do so here!
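For small n the lemma can be checked by brute force; a Python sketch (the 5-comparator network below is the depth-3 circuit for 4 elements from the earlier slide, written with 0-based wire indices):

```python
from itertools import permutations, product

def run_network(network, values):
    """Apply comparators (i, j) in order; min goes to wire i, max to j."""
    v = list(values)
    for i, j in network:
        if v[i] > v[j]:
            v[i], v[j] = v[j], v[i]
    return v

# Depth-3 sorting circuit for 4 elements: (0,1),(2,3) | (0,2),(1,3) | (1,2)
net4 = [(0, 1), (2, 3), (0, 2), (1, 3), (1, 2)]

sorts_zero_one = all(run_network(net4, bits) == sorted(bits)
                     for bits in product([0, 1], repeat=4))
sorts_anything = all(run_network(net4, p) == sorted(p)
                     for p in permutations(range(4)))
print(sorts_zero_one, sorts_anything)  # prints True True
```

As the lemma predicts, passing all 2⁴ zero-one inputs coincides with sorting every permutation.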

Page 23

Correctness of the Odd-Even Transposition Network

• Assume a binary sequence a0 .. an–1
• Let ai = the first 0 in the sequence
• Two cases: i is odd or i is even
• To sort a0 .. ai, we need i steps (worst case)
• Induction: given that a0 .. ak (where k ≥ i) sorts in k steps, will a0 .. ak+1 get sorted in k + 1 steps?

(Figure: traces of 0-1 sequences through the network; the leading 0’s shift one position left per step until the sequence is sorted.)

Page 24

Better Sorting Networks

• Batcher’s Bitonic Sorter (1968)
  – Depth O(lg² n), size O(n lg² n)
  – Idea: sort 2 groups (recursively), then merge using a network that can sort bitonic sequences
• AKS Network (1983)
  – Ajtai, Komlós and Szemerédi
  – Depth O(lg n), size O(n lg n)
  – Not practical! Hides a very large constant c in the c·n lg n size

Page 25

More Intelligent Processors: Processor Networks

• Star (diameter = 2)

• Linear/Ring (diameter = n – 1 or n – 2)

• Completely-connected (diameter = 1)

• Mesh

Page 26

Sorting on Linear Networks

• Emulate an odd-even transposition network!

• O(n) steps, work is O(n²)
  – We can’t expect better on a linear network

18 42 31 56 12 11 19

18 42 31 56 11 12 19

18 31 42 11 56 12 19

18 31 11 42 12 56 19

18 11 31 12 42 19 56

11 18 12 31 19 42 56

11 12 18 19 31 42 56

11 12 18 19 31 42 56

Page 27

Sorting on Mesh Networks: Shearsort

• Arrange numbers in “boustrophedon” order

a = { 15, 4, 10, 6, 1, 5, 7, 11, 12, 14, 13, 8, 9, 16, 2, 3 }

Row phase

• Sort rows, sort columns, repeat

15 4 10 6
11 7 5 1
12 14 13 8
9 16 2 3

Page 28

Sorting on Mesh Networks: Shearsort

Column phase

4 6 10 15
11 7 5 1
8 12 13 14
16 9 3 2

Page 29

Sorting on Mesh Networks: Shearsort

Row phase

4 6 3 1
8 7 5 2
11 9 10 14
16 12 13 15

Page 30

Sorting on Mesh Networks: Shearsort

Column phase

1 3 4 6
8 7 5 2
9 10 11 14
16 15 13 12

Page 31

Sorting on Mesh Networks: Shearsort

Row phase

1 3 4 2
8 7 5 6
9 10 11 12
16 15 13 14

Page 32

Sorting on Mesh Networks: Shearsort

Column phase

1 2 3 4
8 7 6 5
9 10 11 12
16 15 14 13

Page 33

Sorting on Mesh Networks: Shearsort

Done!

1 2 3 4
8 7 6 5
9 10 11 12
16 15 14 13

Page 34

Sorting on Mesh Networks: Shearsort

• Theorem: Shearsort sorts n² elements in O(n lg n) steps on an n × n mesh
• We can use the Zero-One Principle!
  – Only because the algorithm is comparison-exchange
    • Can be implemented using comparators only
  – and oblivious
    • The outcome of a comparator does not influence comparisons made later on
  – (Disclaimer: the reference is actually very unclear about this)
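A sequential Python sketch of shearsort (the ⌈lg n⌉ + 1 iteration count follows the phase bound above; the trailing row phase is a safety assumption of this sketch):

```python
from math import ceil, log2

def shearsort(grid):
    """Sort an n x n grid (list of row lists) into boustrophedon order.
    Row phases sort even rows ascending and odd rows descending;
    column phases sort every column top-down ascending."""
    n = len(grid)
    for _ in range(ceil(log2(n)) + 1):
        for r in range(n):                        # row phase
            grid[r].sort(reverse=(r % 2 == 1))
        for c in range(n):                        # column phase
            col = sorted(grid[r][c] for r in range(n))
            for r in range(n):
                grid[r][c] = col[r]
    for r in range(n):                            # final row phase
        grid[r].sort(reverse=(r % 2 == 1))
    return grid

g = [[15, 4, 10, 6], [11, 7, 5, 1], [12, 14, 13, 8], [9, 16, 2, 3]]
print(shearsort(g))
# prints [[1, 2, 3, 4], [8, 7, 6, 5], [9, 10, 11, 12], [16, 15, 14, 13]]
```

On the 4 × 4 example from the slides this reproduces the boustrophedon-sorted grid shown on the "Done!" slide.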

Page 35

Correctness of Shearsort

0 1 0 0 1 0 0 1
0 1 1 1 0 1 1 1
0 0 1 0 1 0 0 1
1 0 0 1 0 0 1 0
1 1 1 0 1 0 1 0
0 0 0 1 1 1 0 1
0 0 1 1 1 1 1 1
1 1 0 0 1 1 1 1

Page 36

Correctness of Shearsort

0 0 0 0 0 1 1 1
1 1 1 1 1 1 0 0   ← pair will yield 1 full row of 1’s
0 0 0 0 0 1 1 1
1 1 1 0 0 0 0 0   ← pair will yield 1 full row of 0’s
0 0 0 1 1 1 1 1
1 1 1 1 0 0 0 0   ← pair will yield 1 full row of 1’s
0 0 1 1 1 1 1 1
1 1 1 1 1 1 0 0   ← pair will yield 1 full row of 1’s

Page 37

Correctness of Shearsort

• lg(n) × 2 phases, each phase takes n steps

0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 1 0 0
0 0 1 1 0 1 0 0
1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1

(The unsorted space is guaranteed to be halved after every 2 phases.)

Page 38

Distributed Algorithms

Page 39

Different concerns altogether…

• Problems usually easy to parallelize
• Main problems:
  – Inherently asynchronous
  – How to broadcast data and ensure every node gets it
  – How to minimize bandwidth usage
  – What to do when nodes go down (decentralization)
  – (Do we trust the results given by the nodes?)

(Examples: prime searches — 2, 3, 5, 7, 13, … up to the Mersenne primes 2^42643801 – 1 and 2^43112609 – 1; DES (56-bit) key cracking; SETI@Home)

Page 40

Message-Optimality

• New language constructs:

send <M> to p
receive <M> from p
terminate

• Message-complexity = number of messages sent by a distributed algorithm (also uses O-notation)

Page 41

Broadcast

• Initiators vs. noninitiators

• Simple case: ring network w/ one initiator

init_ring_broadcast() {
  send token to successor
  receive token from predecessor
  terminate
}

ring_broadcast() {
  receive token from predecessor
  send token to successor
  terminate
}

Theorem: init_ring_broadcast + ring_broadcast broadcasts to n machines using time and message complexity O(n)
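The O(n) message count can be checked with a toy Python simulation (node numbering and function name are illustrative, not from the reference):

```python
# Toy ring broadcast: node 0 initiates, every node forwards the token to
# its successor, and the broadcast ends when the token returns to the
# initiator -- exactly n messages for n machines.
def ring_broadcast(n):
    reached, messages = {0}, 0
    node = 0
    while True:
        successor = (node + 1) % n
        messages += 1                 # send token to successor
        if successor == 0:            # token returned to the initiator
            return reached, messages
        reached.add(successor)
        node = successor

print(ring_broadcast(6))  # prints ({0, 1, 2, 3, 4, 5}, 6)
```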

Page 42

Broadcast on a tree network

init_broadcast() {
  N = { q | q is a child neighbor of p }
  for each q ∈ N
    send token to q
  terminate
}

broadcast() {
  receive token from parent
  N = { q | q is a child neighbor of p }
  for each q ∈ N
    send token to q
  terminate
}

Note: no acknowledgment!

(Figure: example tree network with nodes 1–6.)

Page 43

Echo

• Creates a spanning tree out of any connected network

init_echo() {
  N = { q | q is a neighbor of p }
  for each q ∈ N
    send token to q
  counter = 0
  while (counter < |N|) {
    receive token
    counter = counter + 1
  }
  terminate
}

echo() {
  receive token from parent
  N = { q | q is a neighbor of p } – { parent }
  for each q ∈ N
    send token to q
  counter = 0
  while (counter < |N|) {
    receive token
    counter = counter + 1
  }
  send token to parent
  terminate
}

(Figure: example network with nodes 1–6.)

Pages 44–52: (Animation of echo on the example network: each node’s counter increments as tokens arrive; once its counter reaches |N|, a noninitiator echoes to its parent and finishes, and the initiator terminates after hearing from all of its neighbors.)
Page 53

Echo

• Creates a spanning tree out of any connected network

Theorem: init_echo + echo has time complexity O(diameter) and message complexity O(edges)

(Figure: final state — all nodes finished.)
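A toy Python simulation of echo on a connected graph (the 6-node adjacency list is an invented example, not the figure from the reference); it returns the parent map — i.e. the spanning tree — and the message count:

```python
# Synchronous flood simulation of echo: the first token to reach a node
# fixes that node's parent. Every node sends one token over each incident
# channel, so total traffic is 2 x |edges| (echoes to parents included),
# matching the O(edges) message complexity.
def echo(adj, initiator):
    parent = {initiator: None}
    messages = 0
    frontier = [initiator]
    while frontier:
        next_frontier = []
        for p in frontier:
            for q in adj[p]:
                messages += 1                # token (or echo) p -> q
                if q not in parent:
                    parent[q] = p
                    next_frontier.append(q)
        frontier = next_frontier
    return parent, messages

# Invented example network: 6 nodes, 7 edges.
adj = {1: [2, 3], 2: [1, 3, 4], 3: [1, 2, 5], 4: [2, 6], 5: [3, 6], 6: [4, 5]}
parent, messages = echo(adj, 1)
print(len(parent), messages)  # prints 6 14
```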

Page 54

Leader Election (for ring networks)

init_election() {
  send token, p.ID to successor
  min = p.ID
  receive token, token_id
  while (p.ID != token_id) {
    if (token_id < min)
      min = token_id
    send token, token_id to successor
    receive token, token_id
  }
  if (p.ID == min)
    i_am_the_leader = true
  else
    i_am_the_leader = false
  terminate
}

election() {
  i_am_the_leader = false
  do {
    receive token, token_id
    send token, token_id to successor
  } while (true)
}

Theorem: init_election + election runs in n steps with message complexity O(n²)
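The all-initiator behavior can be simulated round by round in Python (a toy model of my own: every ID circulates one hop per round; the counts illustrate the theorem's bounds):

```python
# Every node initiates: its ID travels the whole ring (n hops) while each
# node tracks the minimum ID it has seen. After n rounds, the unique node
# whose own ID equals its running minimum declares itself leader.
# Messages: n tokens x n hops = n^2.
def ring_leader_election(ids):
    n = len(ids)
    mins = list(ids)                   # each node's running minimum
    messages = 0
    for hop in range(1, n + 1):
        for origin in range(n):
            here = (origin + hop) % n  # token ids[origin] arrives here
            messages += 1
            if ids[origin] < mins[here]:
                mins[here] = ids[origin]
    leaders = [i for i in range(n) if ids[i] == mins[i]]
    return leaders, messages

print(ring_leader_election([5, 3, 8, 1, 9]))  # prints ([3], 25)
```

With distinct IDs exactly one node elects itself, and the message count is n², matching the O(n²) bound.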