Data Structures and Algorithms: Searching Algorithms. M. B. Fayek, CUFE 2006


  • Data Structures and Algorithms: Searching Algorithms. M. B. Fayek, CUFE 2006

  • Agenda: Introduction, Sequential Search, Binary Search, Interpolation Search, Indexed Search

  • 1. Introduction: What is a Search? Searching is the task of finding a certain data item (record) in a large collection of such items. A key field that identifies the item sought is given. (For simplicity we consider only the key field instead of the complete record.) If the item is found, either its location or the complete item is returned. If the item is not found, an indication is given, usually by returning a non-existent index such as -1.

  • 2. Sequential Search: Sequential Search is also called Exhaustive Search because, in the worst case, the complete collection is searched.

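  • 2. Sequential Search (figure): flowchart of a sequential search for key item = 20. The test "key item = list[i]?" (YES/NO) is applied at i = 0, i = 1 and i = 2, and i = 2 is returned as the found location.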

  • 2. Sequential Search: The first implementation will be:
    for i = 0 to n-1 do
        get next item Ai
        if Ai == k return i
    endfor
    return -1
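
The loop above can be sketched in Python as follows. This is a minimal illustration, assuming the collection is a Python list; the function name `sequential_search` is chosen here for the example and does not come from the slides.

```python
def sequential_search(items, key):
    """Return the index of the first item equal to key, or -1 if it is absent."""
    for i, item in enumerate(items):   # visits indices 0 .. n-1 in order
        if item == key:
            return i                   # found: report the location
    return -1                          # whole collection scanned, key not found


if __name__ == "__main__":
    data = [7, 13, 20, 4]
    print(sequential_search(data, 20))
    print(sequential_search(data, 99))
```

Returning -1 follows the convention from the introduction of signalling "not found" with a non-existent index.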

  • 2. Sequential Search: Another pseudocode is:
    i = 0
    while i < n and item Ai ≠ k
        i = i + 1
    if i < n return i else return -1

  • 2. Sequential Search: How is the algorithm implemented? The way the collection is constructed affects the way the next item Ai is retrieved.
    In a static array: Ai is the indexed item A[i].
    In a linked list: Ai is the next node, fetched by following the next pointer in the present node. In this case the address of the node found (a pointer to it) is usually returned, or a NULL pointer to indicate that it was not found.
    In a file: Ai is the next record retrieved from the file.
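
For the linked-list case, a minimal sketch might look like the following. The `Node` class and the function name `list_search` are hypothetical, introduced only for this example; Python's `None` plays the role of the NULL pointer.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """A singly linked list node (hypothetical layout for this example)."""
    key: int
    next: Optional["Node"] = None


def list_search(head, key):
    """Return the first node whose key matches, or None if there is none."""
    node = head
    while node is not None:
        if node.key == key:
            return node        # a reference to the node found
        node = node.next       # "get next item" by following the next pointer
    return None                # end of list reached: not found


if __name__ == "__main__":
    head = Node(7, Node(13, Node(20)))
    print(list_search(head, 20))
    print(list_search(head, 99))
```

The only change from the array version is how the next item is fetched; the comparison logic is the same.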

  • 2. Sequential Search: Complexity: The basic operation is the comparison. For a collection of n data items there are several cases:
    Best case: item found at the first location. Number of comparisons = 1.
    Worst case: item found at the last location, or item not found. Number of comparisons = n.
    Average case (item present, every location equally likely): number of comparisons = (1+n)/2.
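
These counts can be checked with a small experiment. The sketch below assumes distinct keys and that every stored item is equally likely to be searched for; the helper name `comparisons_to_find` is made up for this illustration.

```python
def comparisons_to_find(items, key):
    """Count the key comparisons a sequential search performs for key."""
    count = 0
    for item in items:
        count += 1             # one comparison per item visited
        if item == key:
            break
    return count


n = 9
items = list(range(n))                           # distinct keys 0 .. n-1
costs = [comparisons_to_find(items, k) for k in items]

best = min(costs)                                # item at the first location
worst = max(costs)                               # item at the last location
average = sum(costs) / n                         # equals (1 + n) / 2
not_found = comparisons_to_find(items, -1)       # unsuccessful search

print(best, worst, average, not_found)
```

The average is the mean of 1, 2, ..., n, which is where the (1+n)/2 figure comes from.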

  • 2. Sequential Search Enhancements: Sequential Search may be enhanced using several techniques: Sorting before searching (Presorting), Sentinel Search, Probabilistic Search.

  • 2. Sequential Search Enhancements: 1. Presorting. A good question to ask before searching is whether the collection is sorted. How do we use that information? If it is sorted (in ascending order), the search is terminated as soon as the value of the indexed item in the collection exceeds that of the search item. What is the effect? This does not affect the worst case of finding the element at the last position, but it decreases the average number of comparisons for unsuccessful searches, because the search stops at the position where the item would have been located instead of running to the end of the list. A more efficient search on sorted data is the binary search.

  • 2. Sequential Search Enhancements: 2. Sentinel Search. The basic loop in sequential search includes 2 comparisons at each iteration: while ((i < n) && (key != A[i])). To decrease the number of comparisons to one per iteration, a sentinel value equal to key is inserted at the end of the array (beyond its last item, i.e. at index n). Hence the first comparison (i < n) becomes redundant. The search will always stop by finding key, either within A (if it already existed) or at the sentinel position n (if it originally did not exist). A check on the location at which key was found indicates whether it existed or not.
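
A minimal sketch of the sentinel idea, assuming the collection is a Python list that may temporarily grow by one slot. The function name `sentinel_search` is chosen for this example, and removing the sentinel afterwards is an addition of this sketch so that the caller's list is left unchanged.

```python
def sentinel_search(items, key):
    """Sequential search with a single comparison per iteration."""
    n = len(items)
    items.append(key)          # sentinel placed at index n
    i = 0
    while items[i] != key:     # no bounds test: the sentinel guarantees a stop
        i += 1
    items.pop()                # remove the sentinel again
    return i if i < n else -1  # stopping at index n means key was not in the list


if __name__ == "__main__":
    data = [7, 13, 20, 4]
    print(sentinel_search(data, 20))
    print(sentinel_search(data, 99))
```

The final `i < n` test is the "check on the location of key": it runs once per search rather than once per iteration.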

  • 2. Sequential Search Enhancements: 3. Probabilistic Search. The basic idea here is that popular elements of the list, which are searched for more frequently, should require fewer comparisons to find. This is implemented by advancing an element one location toward the front of the array each time it is found, by swapping it with the element before it. Hence, each time an element is found, the number of comparisons needed to find it next time is decremented by one (unless it is already at the front).
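
The swap-on-hit rule could be sketched as below. The function name `probabilistic_search` is made up for this example, and returning the item's new position after the swap is a design choice of this sketch, not something the slides specify.

```python
def probabilistic_search(items, key):
    """Sequential search that moves a found item one step toward the front.

    Returns the position of the item after the swap, or -1 if it is absent.
    """
    for i, item in enumerate(items):
        if item == key:
            if i > 0:
                # Swap with the element before it, so the next search
                # for the same key needs one comparison fewer.
                items[i - 1], items[i] = items[i], items[i - 1]
                return i - 1
            return i           # already at the front: nothing to swap
    return -1


if __name__ == "__main__":
    data = [7, 13, 20, 4]
    probabilistic_search(data, 20)
    print(data)
```

Frequently requested keys drift toward the front over repeated searches, while the list is modified in place.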

  • 2. Sequential Search: Modifying the first sequential algorithm for the case of a sorted list gives:
    for i = 0 to n-1 do
        if Ai > k return -1   // as the list is sorted, the possible location has been passed
        if Ai == k return i
    endfor
    return -1
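
A Python rendering of the sorted-list variant, assuming the list is sorted in ascending order; the function name `sorted_sequential_search` is chosen for this example.

```python
def sorted_sequential_search(items, key):
    """Sequential search on an ascending list, stopping early when possible."""
    for i, item in enumerate(items):
        if item > key:
            return -1          # the possible location has been passed
        if item == key:
            return i
    return -1                  # key is larger than every item


if __name__ == "__main__":
    data = [4, 7, 13, 20]
    print(sorted_sequential_search(data, 13))
    print(sorted_sequential_search(data, 10))
```

In the second call the search stops at the item 13 instead of scanning to the end of the list.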

  • 2. Sequential Search: Modifying the second sequential algorithm for the case of a sorted list gives:
    i = 0
    while i < n and next item Ai < k
        i = i + 1
    if i < n and Ai == k return i else return -1


  • 3. Binary Search: How does it work? The basic idea is to divide the list at each search step into 2 sublists by checking the middle item. After the comparison, the range that may still contain the key is either the left or the right sublist, i.e. it is decreased to half. This requires the list to be sorted. Note, however, that determining the middle item of the collection is a simple task if the collection is stored in memory as a sequential array, whereas it is not if the collection is stored as a linked list. Hence we will assume that the collection is a sequential array.

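  • 3. Binary Search (figure): flowchart of a binary search for key item = 20 in a list of n = 8 items. The tests "key item = list[mid]?", "key item < list[mid]" and "key item > list[mid]" are applied at successive midpoints (figure labels: mid = 4, mid = 2, mid = 3), and the found location is returned after 3 comparisons.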

  • 3. Binary Search: For the same input and output specs as before, the algorithm is:
    low = 0; high = n-1
    while (low <= high) do {
        mid = (low+high)/2   // integer division
        if (k < A[mid]) then high = mid - 1
        else if (k > A[mid]) then low = mid + 1
        else return mid   // found
    }
    return -1   // not found
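
The same algorithm as a runnable Python sketch, assuming an ascending list. The function name `binary_search` and the sample data are chosen for this example. Note that the loop condition is `low <= high`: with a strict `<`, a range that has shrunk to a single item would never be examined.

```python
def binary_search(items, key):
    """Return an index of key in the ascending list items, or -1 if absent."""
    low, high = 0, len(items) - 1
    while low <= high:                 # the range low..high is still non-empty
        mid = (low + high) // 2        # integer midpoint
        if key < items[mid]:
            high = mid - 1             # continue in the left sublist
        elif key > items[mid]:
            low = mid + 1              # continue in the right sublist
        else:
            return mid                 # found
    return -1                          # range became empty: not found


if __name__ == "__main__":
    data = [2, 5, 8, 12, 16, 20, 23, 38]
    print(binary_search(data, 20))
    print(binary_search(data, 7))
```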

  • 3. Binary Search: Complexity: For a collection of n data items, in each step the mid item is compared to k and the range of search is divided by 2. This is repeated until the range is empty (in the worst case). That is, we should ask: how many times can we divide n by 2 until the length of the sublist is zero? About log2 n times, so the worst case is O(log n) comparisons, which is better than the n comparisons of sequential search.
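
The halving argument can be illustrated by simply counting the divisions. The helper name `halving_steps` is hypothetical; the sketch only models the shrinking range length, not an actual search.

```python
import math


def halving_steps(n):
    """Count how many times n can be halved (integer division) until it is 0."""
    steps = 0
    while n > 0:
        n //= 2
        steps += 1
    return steps


if __name__ == "__main__":
    for n in (8, 1000, 1_000_000):
        # Compare the count with floor(log2 n) + 1
        print(n, halving_steps(n), math.floor(math.log2(n)) + 1)
```

For n = 1,000,000 the range is exhausted after about 20 halvings, compared with up to 1,000,000 comparisons for a sequential search.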

  • 4. Interpolation Search: What is meant by interpolation? Here we try to guess more precisely where the search key resides. Instead of calculating the probe position as the physical middle (low+high)/2, it is calculated in a weighted manner, according to the value of k relative to the max and min values in the list.
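
One common way to compute the weighted position is linear interpolation between the values at the two ends of the current range. The slides do not give the formula, so the version below is an assumption of this sketch; it also assumes an ascending list of numeric keys, and the function name `interpolation_search` is chosen for the example.

```python
def interpolation_search(items, key):
    """Return an index of key in the ascending numeric list items, or -1."""
    low, high = 0, len(items) - 1
    # Continue only while key can still lie inside items[low..high].
    while low <= high and items[low] <= key <= items[high]:
        if items[high] == items[low]:
            return low     # all values in range are equal, so they equal key
        # Weighted probe: where key would sit if the values were evenly spaced.
        pos = low + (key - items[low]) * (high - low) // (items[high] - items[low])
        if items[pos] < key:
            low = pos + 1
        elif items[pos] > key:
            high = pos - 1
        else:
            return pos
    return -1


if __name__ == "__main__":
    data = [10, 20, 30, 40, 50, 60, 70, 80]
    print(interpolation_search(data, 70))
    print(interpolation_search(data, 35))
```

With evenly spaced values such as these, the first probe for 70 lands directly on its position.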

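  • 4. Interpolation Search: Analysis: The calculation of mid is more complex than in binary search. There is a significant improvement in search time, especially when the values of the data items in the collection are evenly distributed (about log2 log2 n comparisons on average for uniformly distributed keys, but up to n in the worst case).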

  • 5. Indexed Search: What is an index? Similar to the index of a book (e.g. a telephone book), items in the index point to significant items in the collection. This implies that an additional table, the index table, is used in this search, where each item in the index table points to a specific location in the original search list.

  • 5. Indexed Search: Algorithm:
    // Input: search array A of n items + index table of d items + key item k
    // Output: location of the item with the search key, or false if the key is not found

    Step 1: Use the index table to determine the search range (imin to imax) for the key inside the original search list.
    Step 2: Search sequentially for the key in the range (imin to imax) inside the original search list.
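
The two steps can be sketched as follows. Several details here are assumptions, not taken from the slides: the search list is sorted in ascending order, the index table holds every `step`-th item as a (key, position) pair, and "not found" is reported as -1. The names `build_index` and `indexed_search` are hypothetical.

```python
def build_index(items, step):
    """Build an index table: one (key, position) entry for every step-th item."""
    return [(items[pos], pos) for pos in range(0, len(items), step)]


def indexed_search(items, index, key):
    """Search the ascending list items for key with the help of an index table."""
    # Step 1: scan the index table sequentially to find the range imin..imax.
    imin, imax = None, len(items) - 1
    for entry_key, pos in index:
        if entry_key > key:
            imax = pos - 1     # key must lie before this index entry
            break
        imin = pos             # last index entry whose key is <= search key
    if imin is None:
        return -1              # key is smaller than the first indexed item

    # Step 2: sequential search inside the original list, range imin..imax.
    for i in range(imin, imax + 1):
        if items[i] == key:
            return i
    return -1


if __name__ == "__main__":
    data = [3, 8, 14, 21, 27, 35, 42, 53, 60, 71, 88, 95]
    index = build_index(data, 4)       # entries for positions 0, 4 and 8
    print(indexed_search(data, index, 53))
```

For key 53, Step 1 narrows the search to positions 4 to 7, so Step 2 inspects at most four items of the original list.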

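  • 5. Indexed Search: Algorithm (figure): worked example of searching for key = 53 with an index table. Step 1 locates the range through the index table and Step 2 searches that range in the original list (figure label: Pos = 5+1 = 6).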


  • 5. Indexed Search: Analysis: Assume that the original table is of size n and the index is of size d.
    Step 1: Determining the search range has average complexity O(d/2).
    Step 2: Searching for the key in the range (imin to imax) inside the original search list: assume the average range length = n/d.