CSC 211 Data Structures Lecture 12

Dr. Iftikhar Azim [email protected]

Last Lecture Summary

Dynamic representation

Allocation from dynamic storage

Returning unused storage back to dynamic storage

Linked list operations: insert, delete

Objectives Overview

Cursor-based implementation of list

Search operation: concepts and definitions

Sequential search

Implementation of sequential search

Complexity of sequential search

Comparison of Methods

If the bound to which the list can grow is not known, use the pointer implementation.

INSERT and DELETE take constant time in a linked list but take O(n) time in an array.

PREVIOUS and END take constant time in an array but O(n) time in a linked list.

Insertion or deletion that affects the element at the position denoted by some position variable, e.g. HEAD or TAIL, should be done with care.

Comparison of Methods

The array implementation wastes space, since it uses the maximum space irrespective of the number of elements in the list.

The linked list uses space proportional to the number of elements in the list, but requires extra space to save the position pointers.

Cursor-based Implementation of List

Some languages do not support pointers, but we can simulate them using cursors.

Create one array of records.

Each record consists of an element and an integer that is used as a cursor.

An integer variable LHead is used as a cursor to the header cell of the list L.
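The idea above can be sketched in C as follows. This is a minimal illustration, not the lecture's own code: the names `space`, `available`, `init_space`, `alloc_cell`, `free_cell`, `make_list` and `insert_after`, the capacity of 10 and the use of 0 as the "no cell" cursor are all assumptions made for the example.

```c
#define SPACE_SIZE 10   /* assumed capacity of the shared array */
#define NIL 0           /* cursor value that plays the role of a null pointer */

/* One record: an element plus an integer cursor to the next record. */
struct Cell {
    char element;
    int  next;
};

/* Slot 0 is left unused so that 0 can mean "no cell". */
static struct Cell space[SPACE_SIZE + 1];
static int available;   /* cursor to the first free cell */

/* Chain every cell into the free list: 1 -> 2 -> ... -> SPACE_SIZE -> NIL. */
void init_space(void) {
    for (int i = 1; i < SPACE_SIZE; i++)
        space[i].next = i + 1;
    space[SPACE_SIZE].next = NIL;
    available = 1;
}

/* Take a cell off the free list; returns NIL when no cell is left. */
int alloc_cell(void) {
    int p = available;
    if (p != NIL)
        available = space[p].next;
    return p;
}

/* Return cell p to the front of the free list. */
void free_cell(int p) {
    space[p].next = available;
    available = p;
}

/* Create an empty list; returns the cursor to its header cell (NIL if full). */
int make_list(void) {
    int head = alloc_cell();
    if (head != NIL)
        space[head].next = NIL;
    return head;
}

/* Insert x right after cell p (p may be a header cell).
   Returns 1 on success, 0 when the array is full. */
int insert_after(int p, char x) {
    int q = alloc_cell();
    if (q == NIL)
        return 0;
    space[q].element = x;
    space[q].next = space[p].next;
    space[p].next = q;
    return 1;
}
```

A caller would run `init_space()` once, obtain `LHead` from `make_list()`, and then build the list with `insert_after`. Several lists can share the same array, each with its own header cursor.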

Cursor-based Implementation of List

[Figure: one array of ten records, each with an Element field and a next field, holding the lists L = a, b, c and M = d, e together with the list of available cells]

[Figure: moving a cell from one list to another, using cursors p and q and a temporary variable temp]
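Moving a cell from one list to another only requires relinking three cursors. The sketch below is one common way to write it, assuming the array-of-records layout described earlier; the function name `move_cell` and passing the cursors by address are assumptions for illustration, not taken from the slide.

```c
#define NIL 0   /* assumed "no cell" cursor value */

struct Cell {
    char element;
    int  next;
};

static struct Cell space[11];   /* cells 1..10; slot 0 unused */

/* Unlink the cell that cursor *p refers to and link it in front of the
   cell that cursor *q refers to. p and q are addresses of cursor
   variables (a list head or some cell's next field).
   Returns 1 on success, 0 if *p refers to no cell. */
int move_cell(int *p, int *q) {
    if (*p == NIL)
        return 0;
    int temp = *q;              /* remember what q referred to */
    *q = *p;                    /* q now refers to the moved cell */
    *p = space[*q].next;        /* p skips past the moved cell */
    space[*q].next = temp;      /* moved cell links to q's old target */
    return 1;
}
```

With `L` and `M` as head cursors, `move_cell(&L, &M)` takes the first cell of L and makes it the first cell of M. The same routine moves a cell to or from the available list.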

Searching

A question you should always ask when selecting a search algorithm is "How fast does the search have to be?"

The reason is that, in general, the faster the algorithm is, the more complex it is.

Bottom line: you do not always need, and should not always use, the fastest algorithm.

Let's explore the following search algorithms, keeping speed in mind: sequential (linear) search and binary search.

Searching

A search algorithm is a method of locating a specific item of information in a larger collection of data.

Search Algorithms

The computer has organized data in its memory. Now we look at various ways of searching for a specific piece of data, or for where to place a specific piece of data.

Each data item in memory has a unique identification called the key of the item.

What is Searching

Finding the location of the record with a given key value, or finding the locations of some or all records which satisfy one or more conditions.

Search algorithms start with a target value and employ some strategy to visit the elements looking for a match.

If the target is found, the index of the matching element becomes the return value.

Linear Search

In computer science, linear search or sequential search is a method for finding a particular value in a list that consists of checking every one of its elements, one at a time and in sequence, until the desired one is found.

Linear search is the simplest search algorithm.

Its worst-case cost is proportional to the number of elements in the list, and so is its expected cost if all list elements are equally likely to be searched for.

Therefore, if the list has more than a few elements, other methods (such as binary search or hashing) will be faster, but they also impose additional requirements.

Properties of Linear Search

It is easy to implement.

It can be applied to unsorted (randomly ordered) as well as sorted arrays.

It makes more comparisons than faster methods such as binary search.

It is better suited to small inputs than to large ones.

Linear Search

A very simple algorithm.

It uses a loop to step sequentially through an array, starting with the first element.

It compares each element with the value being searched for (the key) and stops when that value is found or the end of the array is reached.

It can be applied to both sorted and unsorted lists.

Linear Search - Algorithm

    set found to false; set position to -1; set index to 0
    while (index < number of elements) and (found is false)
        if list[index] is equal to search value
            found = true
            position = index
        end if
        add 1 to index
    end while
    return position

Linear Search - Program

    /* Returns the index of the first occurrence of item, or -1 if absent. */
    int LinSearch(int list[], int item, int size)
    {
        int found = 0;
        int position = -1;
        int index = 0;

        while ((index < size) && (found == 0)) {
            if (list[index] == item) {
                found = 1;
                position = index;
            } // end if
            index++;
        } // end of while
        return position;
    } // end of function LinSearch

Linear Search - Example

Array numlist contains:

    17 23 5 11 2 29 3

Searching for the value 11, linear search examines 17, 23, 5, and 11.

Searching for the value 7, linear search examines 17, 23, 5, 11, 2, 29, and 3.
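To make the two traces above concrete, here is a small sketch that counts how many elements are examined. The function name `lin_search_count` and the `examined` output parameter are assumptions added for illustration.

```c
/* Linear search that also reports how many elements it examined.
   Returns the index of the first match, or -1 if item is absent. */
int lin_search_count(const int list[], int size, int item, int *examined) {
    *examined = 0;
    for (int index = 0; index < size; index++) {
        (*examined)++;                  /* one more element looked at */
        if (list[index] == item)
            return index;
    }
    return -1;
}
```

On the numlist array, searching for 11 returns index 3 after examining 4 elements, and searching for 7 returns -1 after examining all 7 elements.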

Linear Search - Tradeoffs

Benefits: the algorithm is easy to understand, and the array can be in any order.

Disadvantages: it is inefficient (slow). For an array of N elements, it examines N/2 elements on average for a value in the array, and N elements for a value not in the array.

Linear Search Analysis

For a list with n items, the best case is when the value is equal to the first element of the list, in which case only one comparison is needed.

The worst case is when the value is not in the list (or occurs only once at the end of the list), in which case n comparisons are needed.

Linear Search Analysis

If the value being sought occurs k times in the list, and all orderings of the list are equally likely, the expected number of comparisons is:

    n                  if k = 0
    (n + 1) / (k + 1)  if 1 <= k <= n

For example, if the value being sought occurs once in the list, and all orderings of the list are equally likely, the expected number of comparisons is (n + 1) / 2.

However, if it is known that the value occurs exactly once, then at most n - 1 comparisons are needed, and the expected number of comparisons is ((n + 2)(n - 1)) / (2n). For example, for n = 2 this is 1, corresponding to a single if-then-else construct.

Either way, asymptotically the worst-case cost and the expected cost of linear search are both O(n).
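The formula for the "known to occur once" case can be derived in a few lines. As a sketch, assume the value is equally likely to be at each of the n positions: finding it at position i (for i up to n - 1) costs i comparisons, and if the first n - 1 comparisons all fail, the value must be at position n, so no further comparison is needed.

```latex
\begin{align*}
E &= \frac{1}{n}\left(\sum_{i=1}^{n-1} i + (n-1)\right)
   = \frac{1}{n}\left(\frac{n(n-1)}{2} + (n-1)\right) \\
  &= \frac{(n-1)(n+2)}{2n}
\end{align*}
```

For n = 2 this gives (1)(4) / 4 = 1, which agrees with the single if-then-else case.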

Non-Uniform Probabilities

The performance of linear search improves if the desired value is more likely to be near the beginning of the list than near its end.

Therefore, if some values are much more likely to be searched for than others, it is desirable to place them at the beginning of the list.

In particular, when the list items are arranged in order of decreasing probability, and these probabilities are geometrically distributed, the cost of linear search is only O(1).

In that case, if the table size n is large enough, linear search will be faster than binary search, whose cost is O(log n).
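The O(1) claim can be checked numerically. The sketch below assumes one particular geometric distribution chosen for illustration: item i (counting from 1) is sought with probability (1/2)^i, and the last item absorbs the leftover probability so that the total is 1. The function name `expected_comparisons` is hypothetical.

```c
/* Expected number of comparisons for a successful linear search when
   item i (1-based) is sought with probability (1/2)^i and the last
   item takes the remaining probability (1/2)^(n-1). */
double expected_comparisons(int n) {
    double expected = 0.0;
    double prob = 0.5;                  /* probability of item 1 */
    for (int i = 1; i < n; i++) {
        expected += i * prob;           /* item i costs i comparisons */
        prob /= 2.0;
    }
    /* Here prob == (1/2)^n, so the last item's share is 2 * prob. */
    expected += n * (2.0 * prob);
    return expected;
}
```

Under this assumption the expected cost works out to 2 - (1/2)^(n-1), so it stays below 2 comparisons no matter how large n becomes.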

Linear Search - Application

Linear search is usually very simple to implement, and is practical when the list has only a few elements, or when performing a single search in an unordered list.

When many values have to be searched for in the same list, it often pays to pre-process the list in order to use a faster method. For example, one may sort the list and use binary search, or build an efficient search data structure from it.

Should the content of the list change frequently, repeated re-organization may be more trouble than it is worth.

As a result, even though in theory other search algorithms (for instance binary search) may be faster than linear search, in practice on small and medium-sized arrays (around 100 items or fewer) it may not be worthwhile to use anything else. On larger arrays, it only makes sense to use other, faster search methods if the data is large enough, because the initial time to prepare (sort) the data is comparable to the time taken by many linear searches.

Linear Search - Pseudocode

Forward iteration

This pseudocode describes a typical variant of linear search, where the result of the search is supposed to be either the location of the list item where the desired value was found, or an invalid location Λ, to indicate that the desired element does not occur in the list.

    For each item in the list:
        if that item has the desired value,
            stop the search and return the item's location.
    Return Λ.

Linear Search - Pseudocode

In this pseudocode, the last line is executed only after all list items have been examined with none matching.

If the list is stored as an array data structure, the location may be the index of the item found (usually between 1 and n, or 0 and n−1).

In that case the invalid location Λ can be any index before the first element (such as 0 or −1, respectively) or after the last one (n+1 or n, respectively).

If the list is a simply linked list, then the item's location is its reference, and Λ is usually the null pointer.
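For the linked-list case, a minimal sketch in C looks like this. The `Node` type and the function name `list_search` are assumptions for illustration; the null pointer plays the role of the invalid location Λ.

```c
#include <stddef.h>

struct Node {
    int value;
    struct Node *next;
};

/* Returns a pointer to the first node holding target,
   or NULL (the invalid location) if no node matches. */
struct Node *list_search(struct Node *head, int target) {
    for (struct Node *p = head; p != NULL; p = p->next) {
        if (p->value == target)
            return p;
    }
    return NULL;    /* reached only after every node was examined */
}
```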

Searching in Reverse Order

Linear search in an array is usually programmed by stepping up an index variable until it reaches the last index.

This normally requires two comparison instructions for each list item: one to check whether the index has reached the end of the array, and another one to check whether the item has the desired value.

In many computers, one can reduce the work of the first comparison by scanning the items in reverse order.

Searching in Reverse Order

Suppose an array A with elements indexed 1 to n is to be searched for a value x. The following pseudocode performs a forward search, returning n + 1 if the value is not found:

    Set i to 1.
    Repeat this loop:
        If i > n, then exit the loop.
        If A[i] = x, then exit the loop.
        Set i to i + 1.
    Return i.

Searching in Reverse Order

The following pseudocode searches the array in the reverse order, returning 0 when the element is not found:

    Set i to n.
    Repeat this loop:
        If i ≤ 0, then exit the loop.
        If A[i] = x, then exit the loop.
        Set i to i − 1.
    Return i.
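The reverse scan translates to C as sketched below. Because C arrays are indexed from 0, this version returns -1 rather than 0 when the value is absent; the function name `reverse_search` is an assumption for illustration.

```c
/* Scan a[n-1] down to a[0]; returns the index of the match,
   or -1 when x does not occur in the array. */
int reverse_search(const int a[], int n, int x) {
    int i = n - 1;
    while (i >= 0 && a[i] != x)     /* loop test compares i against 0 */
        i--;
    return i;
}
```

Note that when x occurs more than once, this version reports the last occurrence, whereas the forward search reports the first.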

Using a Sentinel

Another way to reduce the overhead is to eliminate all checking of the loop index.

This can be done by inserting the desired item itself as a sentinel value at the far end of the list, as in this pseudocode:

    Set A[n + 1] to x.
    Set i to 1.
    Repeat this loop:
        If A[i] = x, then exit the loop.
        Set i to i + 1.
    Return i.
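A C sketch of the sentinel technique follows, using 0-based indexing so that the spare slot is a[n] and a return value of n means "not found". The function name `sentinel_search` is an assumption, and the caller is assumed to have allocated room for n + 1 elements.

```c
/* Sentinel linear search. The array must have room for n + 1 elements;
   slot a[n] is overwritten with the sentinel.
   Returns the index of x, or n when x is not among a[0..n-1]. */
int sentinel_search(int a[], int n, int x) {
    a[n] = x;               /* guarantees that the loop terminates */
    int i = 0;
    while (a[i] != x)       /* no bounds check inside the loop */
        i++;
    return i;
}
```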

Using a Sentinel - Analysis

With this stratagem, it is not necessary to check the value of i against the list length n: even if x was not in A to begin with, the loop will terminate when i = n + 1.

However, this method is possible only if the array slot A[n + 1] exists but is not otherwise being used.

Similar arrangements could be made if the array were to be searched in reverse order, and element A[0] were available.

Although the effort avoided by these ploys is tiny, it is still a significant component of the overhead of performing each step of the search, which is small.

Only if many elements are likely to be compared will it be worthwhile considering methods that make fewer comparisons but impose other requirements.

Linear Search on an Ordered List

For ordered lists that must be accessed sequentially, such as linked lists or files with variable-length records lacking an index, the average performance can be improved by giving up at the first element which is greater than the unmatched target value, rather than examining the entire list.

If the list is stored as an ordered array, then binary search is almost always more efficient than linear search once n > 8, say, unless there is some reason to suppose that most searches will be for the small elements near the start of the sorted list.
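The early-exit idea can be sketched as follows. An ascending array stands in here for any sequentially accessed ordered list; the function name `ordered_search` is an assumption for illustration.

```c
/* Linear search on a list sorted in ascending order.
   Stops as soon as an element greater than x is seen.
   Returns the index of x, or -1 if x is absent. */
int ordered_search(const int a[], int n, int x) {
    for (int i = 0; i < n && a[i] <= x; i++) {
        if (a[i] == x)
            return i;
    }
    return -1;      /* passed the place where x would have been */
}
```

For a missing value, the loop stops at the first larger element instead of running to the end, which is where the average-case saving comes from.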

Sequential Search on an Unordered File

Basic algorithm:

    Get the search criterion (key)
    Get the first record from the file
    While ( (record != key) and (still more records) )
        Get the next record
    End_while

When do we know that there wasn't a record in the file that matched the key?

Sequential Search on an Ordered File

Basic algorithm:

    Get the search criterion (key)
    Get the first record from the file
    While ( (record < key) and (still more records) )
        Get the next record
    End_while
    If ( record = key )
        Then success
    Else
        there is no match in the file
    End_else

When do we know that there wasn't a record in the file that matched the key?

Sequential Search of Ordered vs. Unordered List

Let's do a comparison.

If the order was ascending alphabetical on customers' last names, how would the search for John Adams on the ordered list compare with the search on the unordered list?

Unordered list: if John Adams was in the list? If John Adams was not in the list?

Ordered list: if John Adams was in the list? If John Adams was not in the list?

Ordered vs. Unordered (Cont.)

How about George Washington?

Unordered: if George Washington was in the list? If George Washington was not in the list?

Ordered: if George Washington was in the list? If George Washington was not in the list?

How about James Madison?

Ordered vs. Unordered (Cont.)

Observation: the search is faster on an ordered list only when the item being searched for is not in the list.

Also, keep in mind that the list first has to be placed in order for the ordered search.

Conclusion: the efficiency of these algorithms is roughly the same.

So, if we need a faster search, we need a completely different algorithm.

How else could we search an ordered file?

Comparing Algorithms

Before we can compare different methods of searching (or sorting, or any algorithm), we need to think a bit about the time requirements for the algorithm to complete its task.

We could also compare algorithms by the amount of memory needed, both for the code and for execution (work space).

Comparing Algorithms

An algorithm can require different times to solve different problems of the same size (a measure of efficiency).

For example, the time it takes an algorithm to search for the integer 1 in an array of 100 integers depends on the nature of the array. Are the integers sorted already? If so, 1 may be at the start or at the end.

Order: A Comparison Tool

Most of the time we consider the maximum amount of time that an algorithm can require.

We call this worst-case analysis.

Worst-case analysis states that an algorithm is O(f(n)) if it will not take any more than k * f(n) time units for all but a finite number of values of n.

Read the "big-O", O(…), as "on the order of".

f(n) is a function describing how the time or memory requirements increase with increasing problem size (increasing values of n).
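The statement "for all but a finite number of values of n" is usually written with a threshold n_0. As a sketch, with T(n) denoting the running time (a symbol introduced here for illustration):

```latex
T(n) = O(f(n)) \iff \exists\, k > 0,\ \exists\, n_0 \ \text{such that}\ 
T(n) \le k \cdot f(n) \ \text{for all}\ n \ge n_0 .
```

For example, if an algorithm takes T(n) = 3n + 2 time units, then T(n) ≤ 4n for every n ≥ 2, so it is O(n) with k = 4 and n_0 = 2.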

Order

The worst-case scenario doesn't mean the algorithm will always be slow, but that it is guaranteed never to take more time than the given bound.

This is called an asymptotic bound. Remember those asymptotes from algebra (same thing).

Sometimes, the worst case happens very rarely (if at all) in practice.

Average Performance

A harder-to-calculate metric is an algorithm's average-case performance.

Average-case analysis uses probabilities of problem sizes and problems of a given size to determine how it will act on average.

We won't worry about calculating the average-case performance at this point.

Sequential Search

If the item we are looking for is the first item, the search is O(1). This is the best-case scenario.

If the target item is the last item (item n), the search takes O(n). This is the worst-case scenario.

On average, the item will tend to be near the middle (n/2). This can be written as ½ * n and, as we will see, we can ignore multiplicative coefficients. Thus, the average case is still O(n).

Sequential Search - Analysis

To determine the average number of comparisons in the successful case of the sequential search algorithm:

Consider all possible cases.

Find the number of comparisons for each case.

Add the number of comparisons and divide by the number of cases.

If the search item, called the target, is the first element in the list, one comparison is required.

If it is the second element in the list, two comparisons are required.

If it is the nth element in the list, n comparisons are required.

Sequential Search - Analysis

The following expression gives the average number of comparisons to find an item in a list of size n:

    (1 + 2 + ... + n) / n

It is known that:

    1 + 2 + ... + n = n(n + 1) / 2

Therefore, the following expression gives the average number of comparisons made by the sequential search in the successful case:

    (n(n + 1) / 2) / n = (n + 1) / 2
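The averaging procedure described above can be carried out by brute force. This sketch assumes a list holding the distinct values 1, 2, ..., n in order, searches for each value once, and averages the comparison counts; the function name `average_successful_comparisons` is hypothetical.

```c
/* Average number of comparisons over all n successful searches
   in a list holding 1, 2, ..., n (element at index i is i + 1). */
double average_successful_comparisons(int n) {
    long total = 0;
    for (int target = 1; target <= n; target++) {
        int comparisons = 0;
        for (int i = 0; i < n; i++) {
            comparisons++;              /* compare element i with target */
            if (i + 1 == target)
                break;                  /* found: stop this search */
        }
        total += comparisons;
    }
    return (double)total / n;
}
```

For n = 8 the total is 1 + 2 + ... + 8 = 36, giving an average of 4.5, which matches (n + 1) / 2.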

Sequential Search

So, the time that sequential search takes is proportional to the number of items to be searched.

Another way of saying the same thing using the Big-O notation is: O(n).

A sequential search is of order n.

Linear Search

[Figure: array arr holding 6, 4, 2, 9, 5, 3, 10, 7 at indices 1 to 8]

index = seqSearch(arr, 1, 8, 3);

target = 3: match at index 6, return index 6

index = seqSearch(arr, 1, 8, 11);

target = 11: no match, return index 0
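The slide shows only the calls, not the function itself. One way `seqSearch` could be written to behave as in the figure is sketched below; the parameter names and the convention that slot 0 of the C array is unused (so that indices run from 1 and 0 can mean "no match") are assumptions.

```c
/* Searches arr[first..last] (1-based indices) for target.
   Returns the index of the first match, or 0 when there is no match. */
int seqSearch(const int arr[], int first, int last, int target) {
    for (int index = first; index <= last; index++) {
        if (arr[index] == target)
            return index;
    }
    return 0;
}
```

With the array from the figure stored in slots 1 to 8, `seqSearch(arr, 1, 8, 3)` returns 6 and `seqSearch(arr, 1, 8, 11)` returns 0.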

Linear Search Algorithm

Input: an array A with n elements and the particular element X to be found

Output: element X exists or not

    For i := 1 to n
        IF (A[i] = X) THEN
            Print: Item Exists
            End Algorithm
    Print: Item does not exist in Array
    EXIT

Linear Search Tracing

Let's search for the number 3. We start at the beginning and check the first element in the array. Is it 3?

No, not it. Is it the next element?

Not there either. The next element?

Not there either. Next?

We found it! Now you understand the idea of linear searching: we go through each element, in order, until we find the correct value or reach the very end without finding it.

Linear Search

Consider a membership file in which each record contains, among other data, the name and telephone number of its member.

Suppose we are given the name of a member and we want to find his or her telephone number.

One way to do this is to linearly search through the file, that is, apply the Linear Search: search each record of the file, one at a time, until finding the given name and hence the corresponding telephone number.
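The membership lookup can be sketched in C with an in-memory array standing in for the file. The `Member` record layout and the function name `find_phone` are assumptions for illustration.

```c
#include <stddef.h>
#include <string.h>

/* One membership record; a real record would hold other data too. */
struct Member {
    const char *name;
    const char *phone;
};

/* Checks each record in turn until the name matches.
   Returns the member's phone number, or NULL if no record matches. */
const char *find_phone(const struct Member members[], size_t count,
                       const char *name) {
    for (size_t i = 0; i < count; i++) {
        if (strcmp(members[i].name, name) == 0)
            return members[i].phone;
    }
    return NULL;
}
```

Here the name is the key, and the number of `strcmp` calls is the number of comparisons counted in the complexity analysis.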

Linear Search Complexity

First of all, it is clear that the time required to execute the algorithm is proportional to the number of comparisons.

Also, assuming that each name in the file is equally likely to be picked, it is intuitively clear that the average number of comparisons for a file with n records is equal to n/2.

That is, the complexity of the linear search algorithm is O(n) for the average case.

Linear Search

The Linear Search algorithm would be impractical if we were searching through a list consisting of thousands of names, as in a telephone book.

However, if the names are sorted alphabetically, as in telephone books, then we can use an efficient algorithm called binary search.

In such cases we may have to use binary search.

Summary

Cursor-based implementation of list

Search operation

Linear search: concept, algorithm and code examples

Complexity of linear search
