
Searching and Sorting Techniques in Data Structure


DATA STRUCTURES

Lectures: 4 hrs/week Theory: 50 Marks Class - SECSE Online: 50 Marks

By Mr. B. J. Gorad,

BE(CSE), M.Tech (CST), GATE2011,2016, PhD(CSE)*

Assistant Professor, Computer Science and Engineering,

Sharad Institute of Technology College of Engineering, Ichalkaranji, Maharashtra

Mr. B J Gorad, CSE, Sharad Institute of Technology COE, Ichalkaranji, Maharashtra


Course Outcomes

At the end of the course, students will:

CO1 - Be familiar with C programming and basic data structures.

CO2 - Understand various searching and sorting algorithms, and be able to choose the appropriate data structure and algorithm design method for a specified application.

CO3 - Understand the abstract properties of data structures such as stacks and queues, and be able to choose the appropriate data structure for a specified application.

CO4 - Understand the abstract properties of data structures such as lists, and be able to choose the appropriate data structure for a specified application.

CO5 - Understand and apply fundamental algorithmic problems including trees, B+ trees, and tree traversals.

CO6 - Understand and apply fundamental algorithmic problems including graphs, graph traversals, and shortest paths.


Linear Search & Binary Search


What is an algorithm, and what is algorithm design?

• An algorithm is a step-by-step procedure for solving a specific mathematical or computational problem.

• Algorithm design is the method used to construct such a step-by-step procedure for a given problem.


Sorted Array

• A sorted array is an array whose elements are arranged in numerical, alphabetical, or some other order, and stored at equally spaced addresses in computer memory.

Index:  1    2    3    4
Value:  0.2  0.3  1    1.5


Unsorted Array

• An unsorted array is an array whose elements are not arranged in numerical, alphabetical, or any other order. The elements are still stored at equally spaced addresses in computer memory.

Index:  1    2    3    4
Value:  0.2  0.3  1.5  1
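The two definitions can be checked mechanically. The following is a minimal sketch, assuming "sorted" means non-decreasing numerical order; the function name `is_sorted` is illustrative and does not come from the slides.

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns true if a[0..n-1] is in non-decreasing order. */
bool is_sorted(const double a[], size_t n)
{
    for (size_t i = 1; i < n; i++) {
        if (a[i - 1] > a[i]) {
            return false;   /* found an adjacent pair that is out of order */
        }
    }
    return true;            /* empty and one-element arrays count as sorted */
}
```

Applied to the two example arrays, the check accepts {0.2, 0.3, 1, 1.5} and rejects {0.2, 0.3, 1.5, 1}.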


What is searching?

• In computer science, searching is the process of finding an item with specified properties in a collection of items.

• The items may be stored as records in a database, simple data elements in arrays, text in files, nodes in trees, vertices and edges in graphs, or elements of some other search space.

• In everyday language, a search is the process of looking for something or someone.

• Example: a quest to find a missing person is a search.


Why do we need searching?

✓ Searching is one of the core problems in computer science.

✓ Today's computers store a lot of information.

✓ To retrieve this information quickly, we need very efficient searching algorithms.

Types of Searching

• Linear search

• Binary search


Linear Search

• The linear search is a sequential search, which uses a loop to step through an array, starting with the first element.

• It compares each element with the value being searched for, and stops when either the value is found or the end of the array is encountered.

• If the value being searched is not in the array, the algorithm will unsuccessfully search to the end of the array.


• Since the array elements are stored in linear order, searching them in that same order is simple to implement.

• The search may be successful or unsuccessful. That is, if the required element is found, then the search is successful; otherwise it is unsuccessful.


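Unordered Linear / Sequential Search

/* Returns the index of the first element equal to data, or -1 if absent. */
int unorderedlinearsearch(int A[], int n, int data)
{
    for (int i = 0; i < n; i++) {
        if (A[i] == data)
            return i;       /* found: stop at the first match */
    }
    return -1;              /* reached the end without a match */
}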


Advantages of Linear search

• Think of looking up a number in a directory. If the first number in the directory is the one you are searching for, you are done immediately.

• Since you found it on the very first page, it does not matter how many pages the directory has. In this best case the search does not depend on the number of elements, so it takes constant time.

• Linear search is simple: it is very easy to understand and implement.

• It does not require the data in the array to be stored in any particular order.


Disadvantages of Linear search

• The number you are searching for may be the last number in the directory, or it may not be in the directory at all.

• In that case you have to search the whole directory.

• Now the number of elements matters: if there are 500 pages, you have to search 500; if there are 1000, you have to search 1000.

• Your search time is proportional to the number of elements in the directory.

• Efficiency is poor on big files, because it takes many comparisons to find a particular record.

• The running time of the algorithm scales linearly with the size of the input.

• On large sorted inputs, linear search is slower than algorithms such as binary search.


Analysis of Linear Search

How long will our search take?

In the best case, the target value is in the first element of the array.

So the search takes a small, constant amount of time.

In the worst case, the target value is in the last element of the array, or is not in the array at all.

So the search takes an amount of time proportional to the length of the array.


In the average case, the target value is somewhere in the array.

Since the target value can be anywhere in the array, every position is equally likely.

So on average, the target value will be near the middle of the array.

The search then takes an amount of time proportional to half the length of the array, which is still linear in n.

The average-case and worst-case complexity are both O(n), so linear search is sometimes called an O(n) search.

The time taken to search keeps increasing as the number of elements increases.
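The "half the length" claim can be made precise with a short calculation. This is a sketch under two assumptions that the slides state only informally: the target is present, and it is equally likely to be at each of the n positions. Here C denotes the number of comparisons, a symbol introduced only for this derivation; finding the target at position i costs i comparisons.

```latex
\[
E[C] \;=\; \sum_{i=1}^{n} i \cdot \frac{1}{n}
     \;=\; \frac{1}{n} \cdot \frac{n(n+1)}{2}
     \;=\; \frac{n+1}{2}
\]
```

For n = 1000 this gives about 500 comparisons on average, against 1000 in the worst case. Both grow linearly with n.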


Binary Search

The general term for a smart search through sorted data is a binary search.

1. The initial search region is the whole array.

2. Look at the data value in the middle of the search region.

3. If you've found your target, stop.

4. If your target is less than the middle data value, the new search region is the lower half of the data.

5. If your target is greater than the middle data value, the new search region is the higher half of the data.

6. Continue from Step 2. If the search region becomes empty, the target is not in the array.


© Reem Al-Attas

Binary Search

2. Calculate middle = (low + high) / 2 = (0 + 8) / 2 = 4.

If 37 == array[middle], return middle.

Else if 37 < array[middle], high = middle - 1.

Else if 37 > array[middle], low = middle + 1.


Repeat step 2. Calculate middle = (low + high) / 2 = (0 + 3) / 2 = 1 (integer division).

If 37 == array[middle], return middle.

Else if 37 < array[middle], high = middle - 1.

Else if 37 > array[middle], low = middle + 1.


Repeat step 2. Calculate middle = (low + high) / 2 = (2 + 3) / 2 = 2 (integer division).

If 37 == array[middle], return middle.

Else if 37 < array[middle], high = middle - 1.

Else if 37 > array[middle], low = middle + 1.
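The walk-through above can be reproduced with a small C sketch that records every midpoint it probes. The array the slides search is not reproduced in the text, so the nine-element array in the usage note is a hypothetical stand-in chosen to match the midpoints 4, 1, 2 shown above. The function name `binary_search_trace` and its parameters are also illustrative.

```c
/*
 * Binary search over a sorted int array that also records each midpoint
 * it probes. Returns the index of target, or -1 if target is absent.
 * mids must have room for every probe; 32 entries is ample for any int n.
 */
int binary_search_trace(const int a[], int n, int target,
                        int mids[], int *probe_count)
{
    int low = 0;
    int high = n - 1;

    *probe_count = 0;
    while (low <= high) {
        /* Same value as (low + high) / 2, but cannot overflow. */
        int middle = low + (high - low) / 2;
        mids[(*probe_count)++] = middle;

        if (a[middle] == target) {
            return middle;          /* found */
        }
        if (target < a[middle]) {
            high = middle - 1;      /* continue in the lower half */
        } else {
            low = middle + 1;       /* continue in the higher half */
        }
    }
    return -1;                      /* search region became empty */
}
```

With the hypothetical array {5, 12, 17, 23, 38, 44, 77, 84, 90} and target 37, the function probes indices 4, 1, 2 and then 3, and returns -1 because 37 is not present.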


Binary Search Routine

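// NOT_FOUND is assumed to be a class constant; the slides do not show its definition.
static final int NOT_FOUND = -1;

public int binarySearch(int[] number, int searchValue)
{
    int low = 0, high = number.length - 1;
    int mid = low + (high - low) / 2;   // same as (low + high) / 2, but cannot overflow

    // Narrow [low, high] until the value is found or the region is empty.
    while (low <= high && number[mid] != searchValue) {
        if (number[mid] < searchValue) {
            low = mid + 1;      // target can only be in the upper half
        } else {                // number[mid] > searchValue
            high = mid - 1;     // target can only be in the lower half
        }
        mid = low + (high - low) / 2;   // integer division truncates
    }
    if (low > high) {
        mid = NOT_FOUND;        // region became empty: value is absent
    }
    return mid;
}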


Binary Search Performance

• Successful search

– Best case: 1 comparison

– Worst case: about log2 N comparisons

• Unsuccessful search

– Best case = worst case: about log2 N comparisons

• Since the portion of the array to search is cut in half after every comparison, we compute how many times the array can be divided into halves.

• After K comparisons, about N / 2^K elements remain in the search region. Solving N / 2^K = 1 for K gives K = log2 N.


Comparing N and log2N Performance

Array Size    Linear – N    Binary – log2N
10            10            4
50            50            6
100           100           7
500           500           9
1000          1000          10
2000          2000          11
3000          3000          12
4000          4000          12
5000          5000          13
6000          6000          13
7000          7000          13
8000          8000          13
9000          9000          14
10000         10000         14
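The binary-search column matches log2 N rounded up to the next whole number. A minimal sketch that recomputes it without floating-point math is shown below; the helper name `ceil_log2` is illustrative, and the loop assumes n is small enough that doubling does not overflow `unsigned long`.

```c
/* Smallest k such that 2^k >= n, that is, ceil(log2(n)) for n >= 1. */
unsigned ceil_log2(unsigned long n)
{
    unsigned k = 0;
    unsigned long reach = 1;    /* invariant: reach == 2^k */

    while (reach < n) {
        reach *= 2;
        k++;
    }
    return k;
}
```

For example, `ceil_log2(10)` returns 4 and `ceil_log2(10000)` returns 14, matching the first and last rows of the table.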


Important Differences:

• Input data needs to be sorted for binary search, but not for linear search.

• Linear search accesses the data sequentially, whereas binary search needs random access to the data.

• Linear search has time complexity O(n); binary search has time complexity O(log n).

• Linear search performs equality comparisons; binary search performs ordering comparisons.


Bubble Sort Algorithm

➢ Bubble sort is a simple sorting algorithm.

➢ It is a comparison-based algorithm in which each pair of adjacent elements is compared, and the elements are swapped if they are not in order.

➢ It is not suitable for large data sets, as its average and worst-case complexity are O(n^2), where n is the number of items.

How Bubble Sort Works

➢ We take an unsorted array for our example. Because bubble sort takes O(n^2) time, we keep the example short.

➢ Bubble sort starts with the first two elements, comparing them to check which one is greater.
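The adjacent compare-and-swap idea described above can be sketched in C as follows. The function name `bubble_sort` is illustrative, and the early exit when a pass makes no swaps is a common optimization that the slides do not mention.

```c
#include <stdbool.h>
#include <stddef.h>

/* Sorts a[0..n-1] into ascending order by repeatedly swapping
 * adjacent elements that are out of order. */
void bubble_sort(int a[], size_t n)
{
    if (n < 2) {
        return;                         /* nothing to sort */
    }
    for (size_t pass = 0; pass < n - 1; pass++) {
        bool swapped = false;

        /* After each pass the largest remaining value has "bubbled"
         * to the end, so the last `pass` slots are already final. */
        for (size_t i = 0; i + 1 < n - pass; i++) {
            if (a[i] > a[i + 1]) {
                int tmp = a[i];
                a[i] = a[i + 1];
                a[i + 1] = tmp;
                swapped = true;
            }
        }
        if (!swapped) {
            break;                      /* no swaps: already sorted */
        }
    }
}
```

The two nested loops are where the O(n^2) bound comes from: up to n - 1 passes, each comparing up to n - 1 adjacent pairs.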


Insertion Sort


Selection Sort


Merge Sort Algorithm

• Merge sort is a sorting technique based on the divide and conquer technique. With average-case and worst-case time complexity of O(n log n), it is one of the most respected algorithms.

• Merge sort first divides the array into equal halves and then combines them in a sorted manner.


How merge sort works

• To understand merge sort, we take an unsorted array: {14, 33, 27, 10, 35, 19, 42, 44}.

• Merge sort first divides the whole array repeatedly into equal halves until single (atomic) values are reached. Here an array of 8 items is divided into two arrays of size 4: {14, 33, 27, 10} and {35, 19, 42, 44}.


• This does not change the order in which the items appear in the original array. Now we divide these two arrays into halves.

• We further divide these arrays until we reach atomic values, which cannot be divided any more.


• Now we combine them in exactly the same manner in which they were broken down.

• We first compare the elements of each pair of lists and then combine them into another list in sorted order. 14 and 33 are already in sorted positions. We compare 27 and 10, and in the target list of 2 values we put 10 first, followed by 27. We swap 35 and 19 so that 19 comes first. 42 and 44 are placed sequentially.


• In the next iteration of the combining phase, we compare lists of two data values and merge them into lists of four data values, placing all in sorted order.

• After the final merge, the list is {10, 14, 19, 27, 33, 35, 42, 44}.


Algorithm

• Merge sort keeps dividing the list into equal halves until it can no longer be divided. By definition, a list with only one element is sorted. Merge sort then combines the smaller sorted lists, keeping each new list sorted too.

– Step 1: divide the list recursively into two halves until it can no longer be divided.

– Step 2: if there is only one element in the list, it is already sorted; return.

– Step 3: merge the smaller lists into a new list in sorted order.
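The three steps above can be sketched as a top-down merge sort in C. This is one possible arrangement rather than the slides' own code: the function names are illustrative, and a single scratch buffer of n elements is an assumed design choice to avoid allocating at every level of recursion.

```c
#include <stdlib.h>
#include <string.h>

/* Merge the sorted runs a[lo..mid) and a[mid..hi) using scratch buffer tmp. */
static void merge(int a[], int tmp[], size_t lo, size_t mid, size_t hi)
{
    size_t i = lo, j = mid, k = lo;

    while (i < mid && j < hi) {
        /* <= takes from the left run on ties, which keeps the sort stable. */
        tmp[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
    }
    while (i < mid) tmp[k++] = a[i++];   /* copy what is left of the left run */
    while (j < hi)  tmp[k++] = a[j++];   /* copy what is left of the right run */

    memcpy(a + lo, tmp + lo, (hi - lo) * sizeof a[0]);
}

static void sort_range(int a[], int tmp[], size_t lo, size_t hi)
{
    if (hi - lo < 2) {
        return;                          /* Step 2: 0 or 1 element is sorted */
    }
    size_t mid = lo + (hi - lo) / 2;     /* Step 1: split into two halves */
    sort_range(a, tmp, lo, mid);
    sort_range(a, tmp, mid, hi);
    merge(a, tmp, lo, mid, hi);          /* Step 3: merge the sorted halves */
}

/* Returns 0 on success, -1 if the scratch buffer could not be allocated. */
int merge_sort(int a[], size_t n)
{
    if (n < 2) {
        return 0;
    }
    int *tmp = malloc(n * sizeof *tmp);
    if (tmp == NULL) {
        return -1;
    }
    sort_range(a, tmp, 0, n);
    free(tmp);
    return 0;
}
```

The recursion halves the range about log2 n times and every level merges n elements in total, which is where the O(n log n) bound comes from.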


Data Structure - Shell Sort

• Shell sort is a highly efficient sorting algorithm based on the insertion sort algorithm. It avoids the large shifts that insertion sort needs when a small value is far to the right and has to move far to the left.

• The algorithm first uses insertion sort on widely spaced elements, and then sorts progressively less widely spaced elements. This spacing is termed the interval. The interval is calculated with Knuth's formula:


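• h = h * 3 + 1

where h is the interval, with initial value 1.

This algorithm is quite efficient for medium-sized data sets. Its running time depends on the interval sequence: the worst case is O(n^2) for simple sequences such as repeated halving, and O(n^(3/2)) for the Knuth sequence above, where n is the number of items.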


How shell sort works

• The following example shows how shell sort works.

• We take the array {35, 33, 42, 10, 14, 19, 27, 44}.

• For ease of understanding, we start with an interval of 4.

• We make a virtual sub-list of all values located 4 positions apart. Here these sub-lists are {35, 14}, {33, 19}, {42, 27} and {10, 44}.


We compare the values in each sub-list and swap them (if necessary) in the original array. After this step, the array is {14, 19, 27, 10, 35, 33, 42, 44}.


Then we take an interval of 2, and this gap generates two sub-lists: {14, 27, 35, 42} and {19, 10, 33, 44}.

We compare and swap the values, if required, in the original array. After this step, the array is {14, 10, 27, 19, 35, 33, 42, 44}.

Finally, we sort the whole array using an interval of 1, which is an ordinary insertion sort. The result is {10, 14, 19, 27, 33, 35, 42, 44}.


Algorithm

• We shall now see the algorithm for shell sort.

• Step 1: Initialize the value of h.

• Step 2: Divide the list into smaller sub-lists of elements that are h positions apart.

• Step 3: Sort these sub-lists using insertion sort.

• Step 4: Reduce h and repeat until h is 1 and the complete list is sorted.
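These steps can be sketched in C using the Knuth intervals (1, 4, 13, 40, ...) from h = h * 3 + 1. The function name `shell_sort` is illustrative. Note that this sketch picks its starting interval from the array length, so it is not a line-by-line replay of the 4, 2, 1 walk-through.

```c
#include <stddef.h>

/* Sorts a[0..n-1] into ascending order with shell sort. */
void shell_sort(int a[], size_t n)
{
    /* Step 1: grow h along Knuth's sequence 1, 4, 13, 40, ...
     * while it is still smaller than a third of the array. */
    size_t h = 1;
    while (h < n / 3) {
        h = h * 3 + 1;
    }

    /* Step 4: shrink h back down the same sequence (13 -> 4 -> 1 -> 0). */
    for (; h >= 1; h /= 3) {
        /* Steps 2-3: insertion sort on elements that are h apart.
         * The interleaved sub-lists are handled in a single sweep. */
        for (size_t i = h; i < n; i++) {
            int value = a[i];
            size_t j = i;

            while (j >= h && a[j - h] > value) {
                a[j] = a[j - h];    /* shift the larger element h slots right */
                j -= h;
            }
            a[j] = value;
        }
    }
}
```

The final round with h = 1 is a plain insertion sort, but by then earlier rounds have moved most elements close to their final positions, so few long shifts remain.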


Radix Sort

• Radix sort is a generalization of bucket sort.

• To sort decimal numbers, the radix (base) is 10, so we need 10 buckets.

• The buckets are numbered 0, 1, 2, 3, …, 9.

• Sorting is done in passes.

• The number of passes required equals the number of digits in the largest number in the list.


Example:

Range        Passes
0 to 99      2 passes
0 to 999     3 passes
0 to 9999    4 passes

• In the first pass, numbers are sorted on the least significant digit: each number is placed in the bucket that matches that digit.

• In the second pass, numbers are sorted on the second least significant digit, and the process continues.

• At the end of every pass, the numbers in the buckets are merged, in bucket order, to produce a common list.
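The passes described above can be sketched in C for non-negative integers. This is a sketch under stated assumptions: instead of ten separate bucket lists it uses a stable counting pass per digit, which produces the same order as filling buckets 0 to 9 and merging them. The function name `radix_sort` is illustrative.

```c
#include <stdlib.h>
#include <string.h>

/* Least-significant-digit radix sort, base 10, for unsigned values.
 * Returns 0 on success, -1 if the output buffer could not be allocated. */
int radix_sort(unsigned a[], size_t n)
{
    if (n < 2) {
        return 0;
    }
    unsigned *out = malloc(n * sizeof *out);
    if (out == NULL) {
        return -1;
    }

    /* The largest value decides how many passes are needed. */
    unsigned max = a[0];
    for (size_t i = 1; i < n; i++) {
        if (a[i] > max) max = a[i];
    }

    /* One pass per decimal digit: place = 1, 10, 100, ... */
    for (unsigned long long place = 1; max / place > 0; place *= 10) {
        size_t count[10] = {0};

        /* Count how many numbers fall into each bucket 0..9. */
        for (size_t i = 0; i < n; i++) {
            count[(a[i] / place) % 10]++;
        }
        /* Prefix sums: count[d] becomes the end position of bucket d. */
        for (int d = 1; d < 10; d++) {
            count[d] += count[d - 1];
        }
        /* Fill from the back so equal digits keep their earlier order. */
        for (size_t i = n; i-- > 0; ) {
            unsigned d = (unsigned)((a[i] / place) % 10);
            out[--count[d]] = a[i];
        }
        memcpy(a, out, n * sizeof *a);   /* merged list for the next pass */
    }
    free(out);
    return 0;
}
```

Stability within each pass is what makes the method work: a later pass on a more significant digit must not disturb the order that earlier passes established among numbers sharing that digit.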


• Radix sort is very simple, and a computer can do it fast. When programmed properly, radix sort is one of the fastest sorting algorithms for numbers or strings of letters.

• Average-case and worst-case complexity: O(d · n) for n keys of at most d digits, which is O(n) when d is a fixed constant.

Disadvantages

• Still, there are some tradeoffs that can make radix sort less preferable than other sorts.

• The speed of radix sort largely depends on its inner basic operations. If these operations are not efficient enough, radix sort can be slower than other algorithms such as quick sort and merge sort.

• In the example above, the numbers were all of equal length, but often this is not the case. If the numbers are not of the same length, a test is needed to check for additional digits that need sorting. This can be one of the slowest parts of radix sort, and one of the hardest to make efficient.

• Radix sort can also take up more space than other sorting algorithms, since in addition to the array being sorted, you need a sub-list for each of the possible digits or letters.
