Parallel Sorting – Odd-Even Sort

David Monismith, CS 599

Notes based upon multiple sources provided in the footers of each slide




http://en.wikipedia.org/wiki/Bubble_sort

Recall Bubble Sort

• Bubble sort is an O(N^2) sorting algorithm.

• It is simple to understand and implement.

• It has terrible performance, even when compared to other O(N^2) algorithms.

• So why discuss it?
  – Easy to understand
  – Easy to implement
  – “Easy” to parallelize


Optimized Bubble Sort Algorithm

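    /* Assumes arr holds n ints, i is an int, and swap() exchanges two ints. */
    do {
        int new_n = 0;
        for (i = 1; i < n; i++) {
            if (arr[i-1] > arr[i]) {
                swap(&arr[i-1], &arr[i]);
                new_n = i;   /* position of the last swap in this pass */
            }
        }
        /* Everything at or beyond new_n is already in its final place. */
        n = new_n;
    } while (n > 0);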

http://en.wikipedia.org/wiki/Bubble_sort


Odd-Even Sort

• Parallelizable version of bubble sort.

• Requires N passes through the array.

• Each pass through the array examines either:
  – every odd-indexed element and the element preceding it, or
  – every even-indexed element (from index 2 onward) and the element preceding it.

• Within each pass, pairs of elements that are not in order are swapped.

Introduction to Parallel Programming 2nd Ed.


Odd Even Sort Complexity

• As this algorithm is a variant of bubble sort, its work complexity is O(N^2).

• Notice that the step complexity of the algorithm is O(N) as each inner loop may be executed in parallel.

• The results of each iteration of the outer loop are dependent upon the previous iteration.

• Notice that there are N iterations in the outer loop, and each inner loop consists of a full pass through the array, requiring O(N) operations.

Introduction to Parallel Programming 2nd Ed.
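The two complexity claims above can be checked by counting comparisons. The block below is a small worked count, assuming the loop structure of the odd-even sort code in these notes (passes with even n compare pairs ending at odd indices, passes with odd n compare pairs ending at even indices). The symbols W(N) for work and S(N) for steps are notation introduced here for illustration, not taken from the slides.

```latex
\begin{align*}
\text{pairs compared when } n \text{ is even} &= \lfloor N/2 \rfloor \\
\text{pairs compared when } n \text{ is odd}  &= \lfloor (N-1)/2 \rfloor \\
W(N) &= \lceil N/2 \rceil \lfloor N/2 \rfloor
      + \lfloor N/2 \rfloor \lfloor (N-1)/2 \rfloor
      = \frac{N(N-1)}{2} = O(N^2) \\
S(N) &= N \cdot O(1) = O(N)
      \quad \text{(assuming at least } \lfloor N/2 \rfloor \text{ processors)}
\end{align*}
```

For example, N = 4 gives 2 * 2 + 2 * 1 = 6 comparisons, which matches 4 * 3 / 2. With fewer processors the comparisons within a pass have to be shared out, so each pass takes more than constant time.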


Example

• An example of odd-even sort will be provided on the board.

• Students will complete a worksheet problem to trace an odd-even sort.

• The goal of this exercise is to see how the algorithm can be parallelized.
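Since the board example is not reproduced in these notes, the sketch below is one possible way to produce a trace by machine. It is a sequential odd-even sort that prints the array after every pass. The names odd_even_sort_trace, swap_ints, and print_array are hypothetical helpers introduced for illustration, and returning the swap count is a choice made here, not something from the slides.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: exchange two ints through pointers. */
static void swap_ints(int *a, int *b) { int t = *a; *a = *b; *b = t; }

/* Hypothetical helper: print the array on one line. */
static void print_array(const int *arr, int N)
{
    for (int i = 0; i < N; i++)
        printf("%d ", arr[i]);
    printf("\n");
}

/* Sequential odd-even sort that prints the array after each pass.
   Returns the total number of swaps performed. */
int odd_even_sort_trace(int *arr, int N)
{
    assert(arr != NULL || N == 0);
    int swaps = 0;
    for (int n = 0; n < N; n++) {
        /* Even-numbered passes compare (0,1), (2,3), ...;
           odd-numbered passes compare (1,2), (3,4), ... */
        int start = (n & 1) ? 2 : 1;
        for (int i = start; i < N; i += 2) {
            /* The pairs in one pass never overlap, so these comparisons
               could all be carried out at the same time. */
            if (arr[i-1] > arr[i]) {
                swap_ints(&arr[i-1], &arr[i]);
                swaps++;
            }
        }
        printf("pass %d: ", n);
        print_array(arr, N);
    }
    return swaps;
}
```

Calling odd_even_sort_trace on the array 9 7 8 6 gives 7 9 6 8 after pass 0, 7 6 9 8 after pass 1, and 6 7 8 9 after passes 2 and 3, using 5 swaps in total.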


Odd-Even Sort Algorithm

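    for (n = 0; n < N; n++) {
        if (n & 1) {
            /* odd-numbered pass: even-indexed elements and their predecessors */
            for (i = 2; i < N; i += 2)
                if (arr[i-1] > arr[i])
                    swap(&arr[i-1], &arr[i]);
        } else {
            /* even-numbered pass: odd-indexed elements and their predecessors */
            for (i = 1; i < N; i += 2)
                if (arr[i-1] > arr[i])
                    swap(&arr[i-1], &arr[i]);
        }
    }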

homepages.ihug.co.nz/~aurora76/Malc/Sorting_Array.htm


Why does Odd-Even Sort work?


OpenMP Parallelization

• Parallelization of the Odd-Even Sort Algorithm is straightforward because the compare-and-swap operations within a single pass touch disjoint pairs of elements and are therefore independent.

• Thus each odd or even step may be fully parallelized through OpenMP directives.


Straightforward Parallelization With OpenMP

    for (n = 0; n < N; n++) {
        if (n & 1) {
            #pragma omp parallel for private(i) shared(arr)
            for (i = 2; i < N; i += 2)
                if (arr[i-1] > arr[i])
                    swap(&arr[i-1], &arr[i]);
        } else {
            #pragma omp parallel for private(i) shared(arr)
            for (i = 1; i < N; i += 2)
                if (arr[i-1] > arr[i])
                    swap(&arr[i-1], &arr[i]);
        }
    }
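The version above opens a new parallel region on every pass. One common variation, sketched below, creates the thread team once and uses a worksharing `omp for` inside it. This is not the slide's code: the function name odd_even_sort_omp and the helper swap_ints are hypothetical, and whether it is actually faster depends on the OpenMP runtime and the array size. The implicit barrier at the end of each `omp for` keeps the passes in order.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: exchange two ints through pointers. */
static void swap_ints(int *a, int *b) { int t = *a; *a = *b; *b = t; }

/* Odd-even sort with a single parallel region (a sketch).
   Compile with an OpenMP flag such as -fopenmp; without it the
   pragmas are ignored and the function runs sequentially. */
void odd_even_sort_omp(int *arr, int N)
{
    assert(arr != NULL || N == 0);

    #pragma omp parallel shared(arr, N)
    {
        /* Every thread runs this outer loop with its own copy of n. */
        for (int n = 0; n < N; n++) {
            int start = (n & 1) ? 2 : 1;

            /* The iterations of this loop are divided among the threads.
               The pairs (i-1, i) within one pass are disjoint, so no two
               threads touch the same element. */
            #pragma omp for
            for (int i = start; i < N; i += 2) {
                if (arr[i-1] > arr[i])
                    swap_ints(&arr[i-1], &arr[i]);
            }
            /* Implicit barrier here: no thread starts pass n+1 until
               all threads have finished pass n. */
        }
    }
}
```

A call such as odd_even_sort_omp(data, count) sorts data in place, and the number of threads can be controlled with the OMP_NUM_THREADS environment variable.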


http://en.wikipedia.org/wiki/Bubble_sort

MPI Parallelization

• Distributed parallelization of the Odd-Even Sort Algorithm is somewhat different from the threaded (shared memory) version.

• Approximately equal chunks of the array to be sorted are sent to each process.

• Local arrays are sorted within each process.

• In each phase, a process then exchanges its data with its odd-phase or even-phase neighbor, and the lower-ranked process of the pair keeps the smaller elements.


MPI Pseudocode

    Sort local array
    for(i = 0; i < num_processes; i++)
        neighbor = findNeighbor(i, rank, num_processes)
        if(neighbor >= 0 && neighbor < num_processes)
            Send local array to neighbor and receive neighbor’s local array
            if(rank < neighbor)
                keep smaller elements
            else
                keep larger elements
            endif
        endif
    endfor

http://cs.nyu.edu/courses/spring14/CSCI-UA.0480-003/lecture11.pdf


Find Neighbor Pseudocode

    int findNeighbor(i, rank, num_processes)
        if(i is even)
            if(rank is even)
                neighbor = rank + 1
            else
                neighbor = rank - 1
            endif
        else
            if(rank is even)
                neighbor = rank - 1
            else
                neighbor = rank + 1
            endif
        endif
        if(neighbor < 0 || neighbor >= num_processes)
            return -1
        return neighbor

http://cs.nyu.edu/courses/spring14/CSCI-UA.0480-003/lecture11.pdf
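The neighbor pseudocode can be written compactly in C, as in the sketch below. It relies on the observation that the four branches reduce to one test: a process looks to the right when the phase number and its rank have the same parity, and to the left otherwise. The function name find_neighbor is a hypothetical C spelling of the pseudocode's findNeighbor, and the assertions are assumptions about valid input.

```c
#include <assert.h>

/* Return the partner rank for phase i, or -1 if this rank sits idle
   in that phase. A sketch of the pseudocode, assuming
   0 <= rank < num_processes and i >= 0. */
int find_neighbor(int i, int rank, int num_processes)
{
    assert(num_processes > 0);
    assert(rank >= 0 && rank < num_processes);
    assert(i >= 0);

    int neighbor;
    if ((i & 1) == (rank & 1))
        neighbor = rank + 1;   /* same parity: pair with the right neighbor */
    else
        neighbor = rank - 1;   /* different parity: pair with the left neighbor */

    /* Ranks at either end of the line have no partner in some phases. */
    if (neighbor < 0 || neighbor >= num_processes)
        return -1;
    return neighbor;
}
```

With four processes, phase 0 pairs ranks (0,1) and (2,3), while phase 1 pairs ranks (1,2) and leaves ranks 0 and 3 idle. The pairing is symmetric: if a rank's neighbor in some phase is q, then q's neighbor in that phase is the original rank, which is what lets both sides post a matching send and receive.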


MPI Implementation

    //Sort local values
    qsort(arr, numElements, sizeof(int), compare);

    //Begin iterations
    for(n = 0; n < size; n++) {
        MPI_Barrier(MPI_COMM_WORLD);
        int neighbor = computeNeighbor(n, rank, size);

        if(neighbor >= 0 && neighbor < size) {
            //Send my values to my neighbor and receive values from my neighbor
            MPI_Sendrecv(arr, numElements, MPI_INT, neighbor, n,
                         recvArr, numElements, MPI_INT, neighbor, n,
                         MPI_COMM_WORLD, &status);

            //If my rank < my neighbor's rank, keep the smaller values
            if(rank < neighbor) {
                mergeArrays(arr, recvArr, temp, numElements, 1);
            //Else keep the larger values
            } else {
                mergeArrays(arr, recvArr, temp, numElements, 0);
            }
        }
    }
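The slides do not show the bodies of compare or mergeArrays. The sketch below is one plausible way to write them, assuming that both arrays are already sorted in ascending order, that both hold numElements values, and that the fifth argument selects the smaller half (nonzero) or the larger half (zero). The parameter name keepSmall and the use of temp as scratch space are assumptions made for illustration.

```c
#include <assert.h>
#include <string.h>

/* Comparison function for qsort on ints (ascending order). */
int compare(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);   /* avoids the overflow risk of x - y */
}

/* Merge two sorted arrays of numElements ints and keep only one half
   in arr: the numElements smallest values if keepSmall is nonzero,
   otherwise the numElements largest. temp is scratch space of the
   same size. An assumed implementation, not the original helper. */
void mergeArrays(int *arr, const int *recvArr, int *temp,
                 int numElements, int keepSmall)
{
    assert(arr != NULL && recvArr != NULL && temp != NULL);

    if (keepSmall) {
        /* Merge from the front, stopping after numElements values. */
        int a = 0, r = 0;
        for (int t = 0; t < numElements; t++) {
            if (arr[a] <= recvArr[r])
                temp[t] = arr[a++];
            else
                temp[t] = recvArr[r++];
        }
    } else {
        /* Merge from the back, filling temp from its last slot down. */
        int a = numElements - 1, r = numElements - 1;
        for (int t = numElements - 1; t >= 0; t--) {
            if (arr[a] >= recvArr[r])
                temp[t] = arr[a--];
            else
                temp[t] = recvArr[r--];
        }
    }
    memcpy(arr, temp, (size_t)numElements * sizeof(int));
}
```

Because only numElements values are taken in total, neither index can run past the end of its array, so no extra bounds checks are needed. For instance, merging 1 4 7 9 with 2 3 8 10 leaves 1 2 3 4 in arr when the smaller half is kept and 7 8 9 10 when the larger half is kept.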