Department of Mathematics Dhaka University

Final 25 aprl

Page 1: Final 25 aprl

Department of Mathematics

Dhaka University

Page 2: Final 25 aprl

Project Presentation On

Efficiency of Sorting Algorithms

Course No: 490

Session: 2001-2002

Class Roll No: Jn-262

Registration No: 3293/1997-1998

Supervised by: Dr. Amal Krishna Halder

Page 3: Final 25 aprl

What is a sorting algorithm?

• The word Sort refers to arranging records in a given order, which may be ascending or descending.

• The word Algorithm refers to the technique or process used to sort the records.

Page 4: Final 25 aprl

What we have done

• Reviewed various sorting algorithms with the help of the Internet, library books and our project guide.

• Implemented the algorithms in a well-known programming language, C/C++, and tested them on various data.

• Contributed a new algorithm structure to the sorting algorithm literature.

• Analyzed the algorithms comparatively to find their relative efficiency.

• Tried to draw a conclusion about the efficiency of sorting algorithms.

Page 5: Final 25 aprl

Why we chose sorting algorithms

• A computer can perform operations millions or billions of times faster than a human. For example, a computer takes only a few seconds to find a specific item among a billion items.

• How fast a computer can find an item depends on its processing power, and how well that power is used depends on the processing technique. It is easier to search for a specific item if the data has been sorted beforehand, so sorting plays a very important role in searching.

• There are many reasons why sorting algorithms are of interest to computer scientists and mathematicians.

• Some algorithms are easy to implement, some take advantage of particular computer architectures, and some are particularly clever. Some are efficient in every situation; some are inefficient for a particular size and type of data.

• In our project, our main aim is to find out the efficiency of various types of sorting algorithms.

Page 6: Final 25 aprl

Types of sorting

• Internal: the records being sorted are held in main memory, e.g. Bubble, Quick, Shell, etc.

• External: the records are sorted with the help of auxiliary storage, e.g. Radix, Merge, Tri-merge (newly proposed).

Page 7: Final 25 aprl

A sort can also be classified as:

• Recursive: a recursive function (a function that calls itself) is used, e.g. Merge, Quick, Tri-merge (newly proposed), etc.

• Non-recursive: no recursive function is used, e.g. Bubble, Insertion, Shell, etc.

Page 8: Final 25 aprl

Overview

• After reviewing various sorting algorithms we chose some commonly used ones, which are listed on the next slide.

• For each algorithm we try to describe its derivation and its steps using the same example, for clear understanding.

Page 9: Final 25 aprl

Some Common Sorting Algorithms:

⇒ Bubble Sort
⇒ Insertion Sort
⇒ Selection Sort
⇒ Replacement Sort
⇒ Shell Sort
⇒ Heap Sort
⇒ Radix Sort
⇒ Quick Sort
⇒ Merge Sort

Page 10: Final 25 aprl

A new approach for the sorting algorithm literature

• Reviewing various sorting algorithms enriched our knowledge. We tried to work with a ternary tree structure for sorting, whereas a binary tree structure is generally used.

• After facing many problems, we were finally able to construct a working sorting algorithm, which proved to be more efficient than the binary one.

• Since it uses a ternary tree structure and a merging technique, we named it Tri-merge Sort.

• As far as our knowledge goes, this ternary tree structure has not yet been used for a sorting algorithm.

Page 11: Final 25 aprl

Problem faced

• At first we found that the algorithm works properly if the data size can be expressed as 3^i (i = 1, 2, …, k), because the data then splits into equal parts and merges without missing any element.

• But if the data size is not an exact power of 3, the data cannot be split equally at every level and cannot be merged from three files recursively.

• We fixed the problem by ensuring that the data is split recursively until 1 or 2 data remain. This last step is handled directly, without a recursive call: a single element is left as it is and two elements are sorted between themselves.

Page 12: Final 25 aprl

Tri-merge Sort

Formulation:

Tri-merge is a newly proposed technique for sorting data. It keeps the attractive features of merge sorting, but uses a ternary tree structure together with a recursive function.

The procedure of Tri-merge sort is completed in two phases:

Phase-1: Split into 3 parts recursively.

Phase-2: Merge 3 ordered parts into 1 part recursively.

Page 13: Final 25 aprl

Phase-1: Split

• In this stage the whole data of the given list is split into 3 parts.

• Each part is in turn split into 3 parts recursively until 1 or 2 elements remain.

• When two data remain in a part, they are sorted between themselves.

• The split procedure is shown afterwards by animation.

Page 14: Final 25 aprl

Phase-2: Merge

• In this phase every 3 split parts are merged into 1 part whose elements are in sorted order.

• This process continues until all data form one part, and finally we get completely sorted data.

• The merge procedure is shown afterwards by animation.

Page 15: Final 25 aprl

Splitting Recursively

Split into 3 parts recursively until 1 or 2 data remain:

8 5 2 7 1 3 9 2 6

8 5 2 | 7 1 3 | 9 2 6

8 | 5 | 2 | 7 | 1 | 3 | 9 | 2 | 6

Page 16: Final 25 aprl

Merging Recursively

Each group of 3 parts is merged into 1 sorted part:

2 5 8 | 1 3 7 | 2 6 9

Final merge from 3 sorted files to 1 sorted file: compare the first elements of the three files, find the smallest one and move it to the final list. This process continues until all data are merged.

1 2 2 3 5 6 7 8 9

Now we have the sorted list.

Page 17: Final 25 aprl

Algorithm of Tri-merge sort in Structural PseudoCode

Tri-mergeSort( a[L,…,R], L, R )
{
    n = R-L+1                  (number of elements, calculated from the index positions)
    IF (n > 2) THEN
    {
        part = n/3             (integer division: size of the first two parts)
        m1 = L+part-1          (first midpoint: last index of part 1)
        m2 = L+2*part-1        (second midpoint: last index of part 2)
        Tri-mergeSort ( a[L,…,m1], L, m1 )
        Tri-mergeSort ( a[m1+1,…,m2], m1+1, m2 )
        Tri-mergeSort ( a[m2+1,…,R], m2+1, R )
        Tri-merge ( a[L,…,R], L, m1+1, m2+1, R )
    }
    ELSE IF (n = 2) THEN
    {
        IF ( a[L] > a[R] )
        {
            temp = a[L]
            a[L] = a[R]
            a[R] = temp
        }
    }
}

Page 18: Final 25 aprl

Tri-merge ( a[L,…,R], L, m1, m2, R )
{
    part1 = a[L,…,m1-1]; part2 = a[m1,…,m2-1]; part3 = a[m2,…,R]
    TempArray[L,…,R]; n = R-L+1
    IF ( n > 2 )
    {
        WHILE ( part1, part2 and part3 all have elements )
            Compare the first elements of the 3 parts, find the minimum and append it to TempArray.
        (One part is now exhausted, so only one of the next three loops does any work.)
        WHILE ( part1 and part2 have elements )
            Compare the first elements of the 2 parts, find the minimum and append it to TempArray.
        WHILE ( part2 and part3 have elements )
            Compare the first elements of the 2 parts, find the minimum and append it to TempArray.
        WHILE ( part1 and part3 have elements )
            Compare the first elements of the 2 parts, find the minimum and append it to TempArray.
        WHILE ( part1 has elements )
            Append its next element to TempArray.
        WHILE ( part2 has elements )
            Append its next element to TempArray.
        WHILE ( part3 has elements )
            Append its next element to TempArray.
    }
    RETURN TempArray           (the merged result replaces a[L,…,R])
}

Page 19: Final 25 aprl

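Complete program of Tri-merge Sort

/* Prototype inferred from the call below; the definition of TriMerge is not shown on this slide. */
void TriMerge(int a[], int L, int m1, int m2, int R, long int asscount[], long int comcount[]);

void TriMergeSort(int a[], int L, int R, long int asscount[], long int comcount[])
{
    int noOfEle = R - L + 1;        /* number of elements in a[L..R] */
    int m1, m2, part, t;

    comcount[0]++;                  /* count the size test */
    if (noOfEle > 2)
    {
        part = (R - L + 1) / 3;     /* size of the first two parts */
        m1 = L + part - 1;          /* last index of part 1 */
        m2 = L + 2 * part - 1;      /* last index of part 2 */
        asscount[0] += 3;
        TriMergeSort(a, L, m1, asscount, comcount);
        TriMergeSort(a, m1 + 1, m2, asscount, comcount);
        TriMergeSort(a, m2 + 1, R, asscount, comcount);
        TriMerge(a, L, m1 + 1, m2 + 1, R, asscount, comcount);
    }
    else if (noOfEle == 2)
    {
        comcount[0] += 2;
        if (a[L] > a[R])
        {
            asscount[0] += 3;
            t = a[L];
            a[L] = a[R];
            a[R] = t;               /* last step of the swap, as in the pseudocode */
        }
    }
}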

Page 20: Final 25 aprl
Page 21: Final 25 aprl

Efficiency Analysis

The efficiency of a sorting algorithm depends on the following criteria:

• How quickly the data is sorted, i.e. how much time is required. CPU time depends on how many operations (assignments and comparisons) the algorithm requires.

• How it works for small data

• How it works for big data

• How it works for random data

• How it works for preordered data, disordered data and so on.

Page 22: Final 25 aprl

To achieve our goal we tried to do the following tasks

• Since efficiency (time) depends on the number of operations, we tried to find the total number of assignment and comparison operations involved.

• Input data was prepared in random order using C/C++ programming commands.

• To see the effect of data size on the number of operations, we found the total number of operations for 50, 200, 400, 600 and 800 data.

• We were limited in working with huge data (e.g. we used an earlier version of the C/C++ compiler, and the computer had limited memory).

Page 23: Final 25 aprl

To achieve our goal we tried to do the following tasks (continued)

• For comparative analysis we arranged the total operation counts in separate tables for different numbers of data (e.g. 50, 200, 400, 600, 800), covering 100% ordered, 80% ordered, 10% ordered, random and completely disordered input.

• To compare the results clearly we show them graphically, so that one can get the picture at a glance.

• Analyzing the obtained data, we tried to find an approximate operation count function f(n), where n is the number of data. To get the function we used the technique of interpolation. This function also serves as an indication of the efficiency of the algorithm.

Page 24: Final 25 aprl

The Table of Operation Count

Page 25: Final 25 aprl

For 50 data: total operation count for differently ordered input

Sort Name        100% Ordered   80% Ordered   10% Ordered   Random    Completely Disordered
Bubble           3775           4153          5305          5335      7405
Insertion        245            749           2285          2325      5085
Selection        3975           4010          4071          4095      4445
Replacement      3775           4108          5086          5026      6154
Heap             2899           2773          2532          2479      2145
Shell            582            838           1186          1270      1622
Radix            706            706           706           706       706
Quick            4172           1615          1355          1204      4147
Merge            3163           3195          3205          3209      3145
TriMerge (New)   2078           2441          2560          2596      2443

For 800 data: total operation count for differently ordered input

Sort Name        100% Ordered   80% Ordered   10% Ordered   Random    Completely Disordered
Bubble           9226764        1063453       1259566       1447693   2756473
Insertion        509393         141399        402883        653719    619815
Selection        964759         964760        965474        966869    967801
Replacement      823641         1005562       1049428       1065151   1076530
Heap             73640          70835         68372         65851     64712
Shell            18494          41607         100366        148267    152850
Radix            9706           9706          9706          9706      9706
Quick            82110          54429         46866         37285     83002
Merge            80945          82205         83143         83455     84013
TriMerge (New)   72984          75943         79611         79993     79732

Page 26: Final 25 aprl
Page 27: Final 25 aprl

Approximate operation count function f(n) with respect to the number of data n (log is to base 10)

Name of sorting algorithm   f(n) = Number of operation count
Bubble                      2 * n^2
Insertion                   n^2
Selection                   1.5 * n^2
Replacement                 1.7 * n^2
Heap                        28 * n * log10(n)
Shell                       39 * n * log10(n)
Radix                       5 * n * log10(n)
Quick                       14 * n * log10(n)
Merge                       36 * n * log10(n)
Tri-merge (New)             32 * n * log10(n)
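As a rough check of these constants, one can divide a measured count by n^2 or by n log10(n). The worked arithmetic below uses the Random column of the 800-data table. It is a single-point estimate, whereas the constants in the table come from interpolation over several data sizes, so the two need not coincide exactly.

```latex
\begin{align*}
c_{\text{Bubble}}    &\approx \frac{f(800)}{800^{2}} = \frac{1447693}{640000} \approx 2.26, \\
c_{\text{Merge}}     &\approx \frac{f(800)}{800 \log_{10} 800}
                      = \frac{83455}{800 \times 2.9031} \approx \frac{83455}{2322.5} \approx 35.9, \\
c_{\text{Tri-merge}} &\approx \frac{79993}{2322.5} \approx 34.4 .
\end{align*}
```

These values are close to the listed 2, 36 and 32 respectively.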

Page 28: Final 25 aprl

Graphical Representation According to the Operation Count Function

Page 29: Final 25 aprl
Page 30: Final 25 aprl

Summary

Page 31: Final 25 aprl

Bubble Sort is a very slow way of sorting; its main advantage is the simplicity of the algorithm.

Insertion sort is very efficient for ordered, preordered and small data. For disordered and large data its efficiency is poor.

Selection sort performs nearly the same for all types of data (ordered, preordered, random, disordered). It is more efficient for small data than for large data.

Replacement sort is also a slow procedure, like Bubble sort; it is not very efficient.

Heap sort works better for disordered data than for preordered data. It also works well for big data.

Page 32: Final 25 aprl

Shell sort is more efficient for ordered data; it also works very well for small data but is very slow for disordered data.

In our observation, Radix sort is much more efficient than any other sorting algorithm. It works well for ordered, disordered and random data alike; it gives exactly the same number of operations for all of those types.

Quick sort is more efficient for random data. It performs badly for fully ordered and fully disordered data.

Merge sort gives nearly the same number of operations for all types of data. It is more efficient for big data.

Tri-merge sort gives a better result than Merge sort in our operation counts, i.e. it is more efficient than Merge.

Page 33: Final 25 aprl

Ranking of sorting algorithms:

It is very difficult to rank sorting algorithms by efficiency. Some work very well for small data and some for big data. Similarly, some work well for preordered data and some for random data. Nevertheless, we tried to find a general rank order for the average situation, from most to least efficient:

1. Radix sort
2. Quick sort
3. Heap sort
4. Tri-merge sort (newly proposed)
5. Merge sort
6. Shell sort
7. Insertion sort
8. Selection sort
9. Replacement sort
10. Bubble sort

Page 34: Final 25 aprl

Conclusion Table at a glance

Sort Type     Algorithm     Random Data   Ordered Data   Disordered Data   Small Data    Big Data     Obtained Complexity
Bubble        Most Easy     Very Bad      Very Bad       Very Bad          Very Bad      Very Bad     2 * n^2
Insertion     Easy          Fairly Bad    Fairly Good    Very Bad          Good          Bad          n^2
Selection     Easy          Bad           Very Bad       Bad               Bad           Bad          1.5 * n^2
Replacement   Most Easy     Fairly Bad    Bad            Bad               Very Bad      Very Bad     1.7 * n^2
Heap          Most Hard     Good          Bad            Good              Good          Good         28 * n * log10(n)
Shell         Most Hard     Fairly Good   Very Good      Fairly Good       Good          Fairly Bad   39 * n * log10(n)
Radix         Easy          Very Good     Very Good      Very Good         Very Good     Very Good    5 * n * log10(n)
Quick         Hard          Very Good     Very Bad       Very Bad          Very Good     Good         14 * n * log10(n)
Merge         Fairly Hard   Good          Fairly Good    Good              Fairly Good   Good         36 * n * log10(n)
Tri-merge     Fairly Hard   Good          Good           Good              Fairly Good   Good         32 * n * log10(n)

Page 35: Final 25 aprl

End