
Linked List Ranking

Parallel Algorithms


Work Analysis – Pointer Jumping

Number of Steps: Tp = O(log N)
Number of Processors: N
Work = O(N log N)
T1 = O(N)
Work Optimal??
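To make the step and work counts concrete, here is a minimal Python sketch that simulates synchronous pointer jumping sequentially. The function name `pointer_jumping_rank`, the `nxt` array representation, and the convention that a rank counts the elements from a node to the tail are assumptions made for illustration, not taken from the slides. Work is modeled as one operation per processor per round.

```python
def pointer_jumping_rank(nxt):
    """Rank a list by synchronous pointer jumping (sequential simulation).

    nxt[i] is the index of the successor of element i, or None at the tail.
    Returns (rank, rounds, work): rank[i] counts the elements from i to the
    tail inclusive, and work = n * rounds (one operation per processor per round).
    """
    n = len(nxt)
    nxt = list(nxt)          # do not modify the caller's list
    rank = [1] * n
    rounds = 0
    while any(j is not None for j in nxt):
        # Every "processor" reads the old arrays; all writes land together.
        new_rank = [rank[i] + rank[nxt[i]] if nxt[i] is not None else rank[i]
                    for i in range(n)]
        new_nxt = [nxt[nxt[i]] if nxt[i] is not None else None
                   for i in range(n)]
        rank, nxt = new_rank, new_nxt
        rounds += 1
    return rank, rounds, n * rounds


# A 16-element list stored in order: element i is followed by element i + 1.
chain = [i + 1 for i in range(15)] + [None]
ranks, rounds, work = pointer_jumping_rank(chain)
print(rounds, work)   # 4 64
print(ranks[0], ranks[-1])   # 16 1
```

For N = 16 the sketch takes 4 rounds and 64 operations, against 16 for a sequential walk down the list, which is why the N-processor version is not work optimal.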


Common List Ranking Strategy

Reduce Rank Expand Strategy
Similar to Odd-Even Prefix Sums
Algorithms vary in how items are selected for deletion in the Reduce Step
Some variation in the Expand Step.


Reduce Rank Expand

Reduction Step - Delete non-adjacent elements in parallel until the list is of size O(p) (algorithms vary in this step)

Rank Step – Pointer jumping to rank the p remaining elements

Expand Step – Reinsert elements (in the reverse order), computing the rank as it is inserted


Asynchronous PRAM – APRAM

Variant of the PRAM model
Features of CRCW PRAM, plus:
  Processors may have different clock speeds and work asynchronously
  Each processor has its own random number generator


Randomized Algorithm

Some portion of the algorithm’s outcome is non-deterministic

Performance stated in terms of expected order of time complexity – EO(f(n))

EO(f(n)) – with high probability the results are within order f(n)


APRAM List Ranking by Martel & Subramonian

1. Select m = EO(n/log n) elements at random
   1. Each element generates a random number between 1 and N; those with values 1 to N / log N are selected
2. Perform log log n iterations of pointer jumping
   1. Afterward each element points to a selected element, to nil, or has a pointer of length log n
3. Pack selected elements to smaller array
4. Scan: each selected points to next selected or nil
5. Compute ranks of selected list by pointer jumping
6. Copy back to original list; scan to complete ranks

* EO(n log log n) using p = O(n/log n)


Demo: APRAM Randomized Algorithm

List   0  3  6  8  5  1  2  4 11  0  7 10 14 12 15 13
Rank   1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1

Step 1
R#     3  8 10  3  9 13 16 11  2 15  7  1  8  6  4 12
Sel    *        *              *        *        *

Step 2
List   0  8 11 10 15
Rank   3  5  3  3  2

Step 3
Rank  16 13  8  5  2

Step 4
Rank  16 15 14 13 12 11 10  9  8  7  6  5  4  3  2  1
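The demo can be retraced with a short Python sketch. It is a sequential simulation under stated assumptions, not the authors' implementation: the names `select_at_random` and `rank_with_selected` are hypothetical, the head of the list is always added to the selected set so that every element falls in some segment, positions are 0-based, and the log log n rounds of pointer jumping from step 2 of the APRAM algorithm are left out. What remains is select, pack, scan, rank by pointer jumping, copy back, and scan.

```python
import math
import random


def select_at_random(n, rng):
    """Each element draws a number in 1..n; selected if it is <= n / log n."""
    threshold = max(1, n // max(1, int(math.log2(n))))
    return {i for i in range(n) if rng.randint(1, n) <= threshold}


def rank_with_selected(nxt, head, selected):
    """Rank a list; rank = number of elements from a node to the tail.

    nxt[i] is the successor index of element i, or None at the tail.
    Loops over `chosen` stand in for processors working in parallel.
    """
    chosen = sorted(set(selected) | {head})   # assumption: head is always selected
    chosen_set = set(chosen)
    slot = {s: k for k, s in enumerate(chosen)}   # pack into a smaller array

    # Scan: each selected element walks to the next selected one or to nil.
    seg_len, small_nxt = [], []
    for s in chosen:
        count, j = 1, nxt[s]
        while j is not None and j not in chosen_set:
            count, j = count + 1, nxt[j]
        seg_len.append(count)
        small_nxt.append(slot[j] if j is not None else None)

    # Rank the selected elements by pointer jumping on the small list.
    m = len(chosen)
    small_rank = list(seg_len)
    while any(j is not None for j in small_nxt):
        small_rank, small_nxt = (
            [small_rank[k] + small_rank[small_nxt[k]]
             if small_nxt[k] is not None else small_rank[k] for k in range(m)],
            [small_nxt[small_nxt[k]] if small_nxt[k] is not None else None
             for k in range(m)],
        )

    # Copy back, then scan each segment to complete the ranks.
    rank = [0] * len(nxt)
    for k, s in enumerate(chosen):
        r, j = small_rank[k], s
        for _ in range(seg_len[k]):
            rank[j] = r
            r -= 1
            j = nxt[j]
    return rank, seg_len, small_rank


# The demo list: 16 elements stored in order, with positions 1, 4, 9, 12, 15
# (1-based) selected, i.e. indices 0, 3, 8, 11, 14.
chain = [i + 1 for i in range(15)] + [None]
ranks, seg_len, selected_ranks = rank_with_selected(chain, 0, {0, 3, 8, 11, 14})
print(seg_len)          # [3, 5, 3, 3, 2]
print(selected_ranks)   # [16, 13, 8, 5, 2]
print(ranks == list(range(16, 0, -1)))   # True
```

The segment lengths and the ranks of the selected elements match the Step 2 and Step 3 rows of the demo. `select_at_random` can replace the hard-coded set; the final ranks do not depend on which elements are chosen, only the running time does.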


Modifications for EREW Model

APRAM
  Each PU operates asynchronously
  Must synchronize at each step
  Steps 2, 4, 6 allow for Concurrent Reads

EREW
  Default execution is synchronous
  Partition the array to eliminate Concurrent Reads


Randomized EREW Algorithm (4.1)

1. Select O(p) elements at random
   1. Each element generates a random number between 1 and N; those with values 1 to N / log N are selected
2. Pack selected items into smaller array
3. Scan from selected item to next to set rank
4. Rank selected elements by pointer jumping
5. Write ranks back to original array
6. Scan from selected items to set remaining ranks


Demo: EREW Randomized Algorithm

List   0  3  6  8  5  1  2  4 11  0  7 10 14 12 15 13
Rank   1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1

Step 1
R#     3  8 10  3  9 13 16 11  2 15  7  1  8  6  4 12
Sel    *        *              *        *        *

Step 2
List   0  8 11 10 15
Rank   3  5  3  3  2

Step 3
Rank  16 13  8  5  2

Step 4
Rank  16 15 14 13 12 11 10  9  8  7  6  5  4  3  2  1


Complexity Analysis

Selection - EO(n/p)
Compaction - prefix sums, EO(n/p + log p)
Form linked list - EO(n/p)
Pointer Jumping - EO(n/p + log p)
Sequential Scan - EO(n/p)
OVERALL - EO(n/p + log p)
Optimal speedup for p <= O(n / log n)
Which gives EO(log n) time
See paper for proof.
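As a worked check of the last two bounds, substituting p = n / log n into the overall expression gives the following. Logarithms are taken base 2, and this is only the arithmetic, not the expected-time proof, which is in the paper.

```latex
\begin{align*}
T_p &= EO\!\left(\frac{n}{p} + \log p\right), \qquad p = \frac{n}{\log n} \\
\frac{n}{p} &= \log n, \qquad \log p = \log n - \log\log n \le \log n \\
T_p &= EO(\log n), \qquad p \cdot T_p = \frac{n}{\log n} \cdot EO(\log n) = EO(n)
\end{align*}
```

The processor-time product is linear in n, the same order as a sequential walk down the list, which is what optimal speedup means here.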


What next??

Can we accomplish an O(log n) deterministic algorithm?

What was randomized in other algorithms?
Can it be made deterministic?


Answer!

Yes & No. Can Reduce deterministically!
Near Optimal

2-Ruling Set (Cole & Vishkin)
A subset of the list such that no 2 elements are adjacent & there are exactly 1 or 2 non-ruling set elements between each ruling set element & its nearest ruling set successor
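The definition can be checked mechanically. The sketch below is a hypothetical checker, not the Cole & Vishkin construction: `is_two_ruling_set` takes membership flags in list order and tests only the gaps between consecutive chosen elements. How the two ends of the list are treated is an assumption left out of the sketch.

```python
def is_two_ruling_set(in_set):
    """Check the 2-ruling-set property on flags given in list order.

    in_set[k] is True when the k-th element along the list is in the
    candidate set. Only gaps between consecutive chosen elements are tested.
    """
    chosen = [k for k, flag in enumerate(in_set) if flag]
    gaps = [b - a - 1 for a, b in zip(chosen, chosen[1:])]
    # A gap of 0 means two adjacent chosen elements; more than 2 is too sparse.
    return all(gap in (1, 2) for gap in gaps)


print(is_two_ruling_set([True, False, True, False, False, True]))   # True
print(is_two_ruling_set([True, True, False, True]))                  # False
print(is_two_ruling_set([True, False, False, False, True]))          # False
```

Consecutive ruling set elements are therefore 2 or 3 links apart, so applying the construction again to the survivors multiplies their spacing by a factor between 2 and 3.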


Deterministic EREW LLR

Step 1
  Select a Ruling Set of O(p) elements, O(n/p) distance apart
  Keep track of actual distance (for ranking)
  Repeatedly apply 2-Ruling Set Algorithm to increase the distance between the elements

Remaining steps are the same


Complexity Analysis

1. Select O(p) elements O(n/p) distance apart: O(n/p * log(n/p))
2. Compact: O(n/p + log p)
3. Rank: O(n/p + log p)
4. Write back then Scan: O(n/p)

Total: O(log p + (n/p) * log(n/p))
  = O((n/p) * log(n/p)) for n >= p log p
  = O(log n log log n) for p = O(n/log n)

Work: O(n log log n)
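The simplification of the total can be spelled out step by step. This is a sketch of the arithmetic only, with logarithms base 2 and the mild extra assumption that n/p is at least 2 so that log(n/p) is at least 1.

```latex
\begin{align*}
n \ge p\log p \;&\Rightarrow\; \log p \le \frac{n}{p} \le \frac{n}{p}\log\frac{n}{p} \\
T_p &= O\!\left(\log p + \frac{n}{p}\log\frac{n}{p}\right) = O\!\left(\frac{n}{p}\log\frac{n}{p}\right) \\
p = \frac{n}{\log n} \;&\Rightarrow\; \frac{n}{p} = \log n, \quad T_p = O(\log n \,\log\log n) \\
\text{Work} &= p \cdot T_p = \frac{n}{\log n} \cdot O(\log n \,\log\log n) = O(n\log\log n)
\end{align*}
```

The work exceeds the sequential O(n) by a factor of log log n, which is the sense in which the deterministic algorithm is near optimal rather than optimal.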


Advantages of New Algorithms

Simple, use basic strategies
Small constants
Small space requirements
Optimal & Near Optimal
