Searching and Binary Search Trees CSCI 3333 Data Structures

Page 1: Searching and Binary Search Trees CSCI 3333 Data Structures

Searching and Binary Search Trees

CSCI 3333 Data Structures

Page 2: Searching and Binary Search Trees CSCI 3333 Data Structures

Acknowledgement

Dr. Yue, Mr. Charles Moen, Dr. Wei Ding, Ms. Krishani Abeysekera, Dr. Michael Goodrich, Dr. Richard F. Gilberg

Page 3: Searching and Binary Search Trees CSCI 3333 Data Structures

Searching

Google: the era of searching!
Many types of searching:
    Search engines: best match.
    Key searching: searching for records with specified key values.
Keys are usually assumed to come from a total order (e.g. integers or strings).
Keys are usually assumed to be unique: no two records have the same key.

Page 4: Searching and Binary Search Trees CSCI 3333 Data Structures

Naïve Linear Search

Algorithm LinearSearch(r, key)
Input: r: an array of records. Each record contains a unique key.
       key: key to be found.
Output: i: the index such that r[i] contains the key, or -1 if not found.

foreach element in r with index i
    if r[i].key = key return i;
end foreach
return -1
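The LinearSearch pseudocode above can be sketched in Python as follows. This is a minimal illustration, not part of the original slides; the `Record` class and its `value` field are hypothetical names introduced only for the example.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Record:
    key: int     # unique key, as the slide assumes
    value: str   # hypothetical payload field, for illustration only


def linear_search(records: Sequence[Record], key: int) -> int:
    """Return the index i such that records[i].key == key, or -1 if not found."""
    for i, rec in enumerate(records):
        if rec.key == key:
            return i
    return -1


records = [Record(14, "a"), Record(3, "b"), Record(9, "c")]
print(linear_search(records, 9))   # 2
print(linear_search(records, 7))   # -1
```

The array does not need to be sorted, which is why no pre-processing is required, but every lookup may have to examine all n records.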

Page 5: Searching and Binary Search Trees CSCI 3333 Data Structures

Time Complexity

Operations               Average Case          Worst Case
Find                     O(n)                  O(n)
Insert                   O(1)                  O(1)
Delete                   O(n)                  O(n)
Sorted order traversal   Need to sort first    Need to sort first

Page 6: Searching and Binary Search Trees CSCI 3333 Data Structures

Naïve Linear Search

Linear time in finding! Only suitable when n is very small. No ‘pre-processing’.

Page 7: Searching and Binary Search Trees CSCI 3333 Data Structures

Binary Search

Assume the array has already been sorted in ascending key value.

Example: find(7)

1 3 4 5 7 8 9 11 14 16 18 19

[Figure: the array shown at four successive steps of the search, each marking the positions l (low), m (middle), and h (high); the first index is 0.]

Page 8: Searching and Binary Search Trees CSCI 3333 Data Structures

Binary Search Algorithm

Algorithm BinarySearch(r, key)
Input: r: an array of records sorted by key value. Each record contains a unique key.
       key: key to be found.
Output: i: the index such that r[i] contains the key, or -1 if not found.

low = 0; high = r.size - 1
while (low <= high) do
    mid = (low + high) / 2   // integer division
    if r[mid].key > key then high = mid - 1
    if r[mid].key = key then return mid;
    if r[mid].key < key then low = mid + 1
end while
return -1;
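A possible Python rendering of the binary search, run on the find(7) example array from the earlier slide. This is a sketch under two assumptions not stated in the slides: the array holds bare integer keys rather than records, and mid is rounded down. The returned trace of (low, mid, high) triples is added only to make the steps visible.

```python
from typing import List, Sequence, Tuple


def binary_search(keys: Sequence[int], key: int) -> Tuple[int, List[Tuple[int, int, int]]]:
    """Return (index of key or -1, list of (low, mid, high) triples examined)."""
    low, high = 0, len(keys) - 1
    trace = []
    while low <= high:
        mid = (low + high) // 2          # midpoint of the current range, rounded down
        trace.append((low, mid, high))
        if keys[mid] == key:
            return mid, trace
        if keys[mid] > key:
            high = mid - 1               # discard the upper half
        else:
            low = mid + 1                # discard the lower half
    return -1, trace


keys = [1, 3, 4, 5, 7, 8, 9, 11, 14, 16, 18, 19]
index, steps = binary_search(keys, 7)
print(index)   # 4
print(steps)   # [(0, 5, 11), (0, 2, 4), (3, 3, 4), (4, 4, 4)]
```

With this rounding the search takes four probes for 12 keys, in line with the O(lg n) bound.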

Page 9: Searching and Binary Search Trees CSCI 3333 Data Structures

Time Complexity

Operations               Average Case   Worst Case
Find                     O(lg n)        O(lg n)
Insert                   O(n)           O(n)
Delete                   O(n)           O(n)
Sorted order traversal   O(n)           O(n)

Page 10: Searching and Binary Search Trees CSCI 3333 Data Structures

Binary Search

O(lg n) in finding!
Especially suitable for relatively static sets of records.
Question: can we find a searching algorithm of O(1)?
Question: can we find a searching algorithm of O(lg n), where the time complexity for insert and delete is better?
    Average case
    Worst case

Page 11: Searching and Binary Search Trees CSCI 3333 Data Structures

Binary Search Trees

A binary search tree (BST) is a binary tree which has the following properties:
    Each node has a unique key value.
    The left subtree of a node contains only key values less than the node's value.
    The right subtree of a node contains only key values greater than the node's value.
    Each subtree is itself a binary search tree.

Page 12: Searching and Binary Search Trees CSCI 3333 Data Structures

BST

An inorder traversal of a binary search tree visits the keys in increasing order.

Note that there are variations in the formulation of BST. The one defined in the textbook is somewhat different and is best mapped to implement a dictionary:
    Duplicate key values allowed.
    Records stored in leaf nodes.
    Leaf nodes store no key.

The version of BST used here is more basic, easier to understand, and more popular.
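The BST properties and the inorder claim can be checked with a short Python sketch. The `Node` class, the `is_bst` checker, and the sample tree are illustrative assumptions and do not come from the slides.

```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class Node:
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def inorder(node: Optional[Node]) -> Iterator[int]:
    """Yield keys in left subtree, node, right subtree order."""
    if node is not None:
        yield from inorder(node.left)
        yield node.key
        yield from inorder(node.right)


def is_bst(node: Optional[Node], low: Optional[int] = None, high: Optional[int] = None) -> bool:
    """Check that every key lies strictly between the bounds set by its ancestors."""
    if node is None:
        return True
    if low is not None and node.key <= low:
        return False
    if high is not None and node.key >= high:
        return False
    return is_bst(node.left, low, node.key) and is_bst(node.right, node.key, high)


root = Node(8, Node(4, Node(1), Node(5)), Node(14, Node(9), Node(18)))
print(is_bst(root))          # True
print(list(inorder(root)))   # [1, 4, 5, 8, 9, 14, 18]
```

Passing bounds down the recursion matters: comparing each node only with its direct children would miss a key that violates the ordering with respect to a more distant ancestor.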

Page 13: Searching and Binary Search Trees CSCI 3333 Data Structures

Searching a BST

Similar to binary search.
Each key comparison with a node will result in one of these:
    target < node key => search left subtree.
    target = node key => found!
    target > node key => search right subtree.
Difference with binary search: do not always eliminate about half of the candidates. The BST may not be balanced.

Page 14: Searching and Binary Search Trees CSCI 3333 Data Structures

BST Find Algorithm (Recursive)

Algorithm Find(r, target)
Input: r: root of a BST. target: key to be found.
Output: p: the node in the BST that contains the target; null if not found.

if r = null return null;
if target = r.key return r;
if target < r.key
    return Find(r.leftChild(), target)
else // target > r.key
    return Find(r.rightChild(), target)
end if
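The recursive Find can be written in Python roughly as below. The `Node` class and the sample tree are assumptions made for the example; child links are plain attributes here rather than the `leftChild()` and `rightChild()` accessors of the pseudocode.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def find(r: Optional[Node], target: int) -> Optional[Node]:
    """Return the node holding target in the BST rooted at r, or None."""
    if r is None:
        return None                    # fell off the tree: not found
    if target == r.key:
        return r
    if target < r.key:
        return find(r.left, target)    # target can only be in the left subtree
    return find(r.right, target)       # target > r.key


root = Node(8, Node(4, Node(1), Node(5)), Node(14, Node(9), Node(18)))
print(find(root, 9).key)   # 9
print(find(root, 7))       # None
```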

Page 15: Searching and Binary Search Trees CSCI 3333 Data Structures

BST Find Algorithm (Iterative)

Algorithm Find(r, target)
Input: r: root of a BST. target: key to be found.
Output: p: the node in the BST that contains the target; null if not found.

result = r
while result != null do
    if target = result.key return result;
    if target < result.key then
        result = result.leftChild()
    else // target > result.key
        result = result.rightChild()
    end if
end while
return result // null: not found
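An iterative version in Python, sketched under the same assumptions as before (an illustrative `Node` class and sample tree). As an addition that is not in the slides, it also returns the keys visited, which makes the search path explicit.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Node:
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def find_with_path(r: Optional[Node], target: int) -> Tuple[Optional[Node], List[int]]:
    """Return (node holding target or None, keys compared along the way)."""
    path: List[int] = []
    result = r
    while result is not None:
        path.append(result.key)
        if target == result.key:
            return result, path
        # Move one level down, toward the only subtree that can hold target.
        result = result.left if target < result.key else result.right
    return None, path


root = Node(8, Node(4, Node(1), Node(5)), Node(14, Node(9), Node(18)))
print(find_with_path(root, 9)[1])   # [8, 14, 9]
print(find_with_path(root, 7))      # (None, [8, 4, 5])
```

The number of comparisons is bounded by the height of the tree, since the loop descends one level per iteration.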

Page 16: Searching and Binary Search Trees CSCI 3333 Data Structures

Time Complexity

Depends on the shape of the BST.
Worst case: O(n) for unbalanced trees.
Average case: O(lg n)
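To illustrate how the shape drives the cost, the sketch below builds two BSTs from the same 15 keys in different insertion orders and compares their heights. The recursive `insert`, the `height` helper (counting nodes on the longest root-to-leaf path), and the chosen key orders are assumptions for this example only.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Node:
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(root: Optional[Node], key: int) -> Node:
    """Insert key into the BST rooted at root and return the (possibly new) root."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    return root                      # duplicate keys are ignored


def height(root: Optional[Node]) -> int:
    """Number of nodes on the longest path from root down to a leaf."""
    if root is None:
        return 0
    return 1 + max(height(root.left), height(root.right))


def build(keys: Iterable[int]) -> Optional[Node]:
    root = None
    for k in keys:
        root = insert(root, k)
    return root


sorted_order = range(1, 16)
level_order = [8, 4, 12, 2, 6, 10, 14, 1, 3, 5, 7, 9, 11, 13, 15]
print(height(build(sorted_order)))   # 15: a chain, so find is O(n)
print(height(build(level_order)))    # 4: a full tree, so find is O(lg n)
```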

Page 17: Searching and Binary Search Trees CSCI 3333 Data Structures

BST Insert

Insert a record rec with key rec.key.
Idea: search for rec.key in the BST.
    If found => error (keys are supposed to be unique).
    If not found => ready to insert (however, need to remember the parent node for insertion).

Page 18: Searching and Binary Search Trees CSCI 3333 Data Structures

BST Insert Algorithm (Iterative)

Algorithm Insert(r, rec)
Input: r: root of a BST. rec: record to be inserted.
Output: whether the insertion is successful.
Side Effect: r may be updated.

curr = r
parent = null
// find the location for insertion.
while (curr != null) do
    if curr.key = rec.key then
        return false // duplicate key: unsuccessful insertion
    else if rec.key < curr.key then
        parent = curr // remember the parent before moving down
        curr = curr.leftChild()
    else // rec.key > curr.key
        parent = curr
        curr = curr.rightChild()
    end if
end while

Page 19: Searching and Binary Search Trees CSCI 3333 Data Structures

BST Insert (Continued)

// create new node for insertion
newNode = new Node(rec)
newNode.parent = parent
// Insertion.
if parent = null then
    r = newNode;
else if rec.key < parent.key then
    parent.leftChild = newNode
else
    parent.rightChild = newNode
end if
return true
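The two insertion slides can be combined into one Python sketch. It assumes a small `BST` wrapper class that owns the root, which stands in for the "r may be updated" side effect; the class names, the use of bare keys instead of records, and the `keys_in_order` helper are illustrative.

```python
from typing import List, Optional


class Node:
    def __init__(self, key: int, parent: Optional["Node"] = None):
        self.key = key
        self.parent = parent
        self.left: Optional["Node"] = None
        self.right: Optional["Node"] = None


class BST:
    def __init__(self) -> None:
        self.root: Optional[Node] = None

    def insert(self, key: int) -> bool:
        """Insert key; return False if it is already present."""
        curr, parent = self.root, None
        while curr is not None:            # find the location for insertion
            if key == curr.key:
                return False               # duplicate key
            parent = curr                  # remember the parent before descending
            curr = curr.left if key < curr.key else curr.right
        new_node = Node(key, parent)
        if parent is None:
            self.root = new_node           # tree was empty
        elif key < parent.key:
            parent.left = new_node
        else:
            parent.right = new_node
        return True

    def keys_in_order(self) -> List[int]:
        out: List[int] = []

        def walk(n: Optional[Node]) -> None:
            if n is not None:
                walk(n.left)
                out.append(n.key)
                walk(n.right)

        walk(self.root)
        return out


tree = BST()
results = [tree.insert(k) for k in [8, 4, 14, 4, 9]]
print(results)                # [True, True, True, False, True]
print(tree.keys_in_order())   # [4, 8, 9, 14]
```

Updating `parent` before moving `curr` is the essential detail: once the loop ends, `curr` is null and `parent` is the node that receives the new child.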

Page 20: Searching and Binary Search Trees CSCI 3333 Data Structures

BST Delete

More complicated.
Cannot delete a parent without arranging for the children!
Who takes care of the children? The grandparent.
If there is no grandparent? The root is deleted and a child will become the root.
Need to keep track of the parent of the node to be deleted.

Page 21: Searching and Binary Search Trees CSCI 3333 Data Structures

BST Delete

Find the node with the key to be deleted.
If not found, deletion is not successful.
Deletion of the node p:
    No child: simple deletion.
    Single child: connect the parent of p to the single child of p.
    Both children:
        Connect the parent of p to the right child of p.
        Connect the left child of p as the left child of the immediate successor of p.

Page 22: Searching and Binary Search Trees CSCI 3333 Data Structures

BST Delete Algorithm

Algorithm Delete(r, target)
Input: r: root of a BST. target: key of the record to be deleted.
Output: whether the deletion is successful.
Side Effect: r is updated if the root node is deleted.

curr = r; parent = null
// find the node to be deleted.
while (curr != null and curr.key != target) do
    if target < curr.key then
        parent = curr // remember the parent before moving down
        curr = curr.left
    else // target > curr.key
        parent = curr
        curr = curr.right
    end if
end while

Page 23: Searching and Binary Search Trees CSCI 3333 Data Structures

BST Delete Algorithm (Continued)

if curr = null then
    return false; // not found
end if

if curr.right = null then // no right child
    if parent = null then // root deleted
        r = r.left
    else
        if target < parent.key then
            parent.left = curr.left
        else
            parent.right = curr.left
        end if
    end if
else

Page 24: Searching and Binary Search Trees CSCI 3333 Data Structures

BST Delete Algorithm (Continued)

    // has right child
    if curr.left = null then // has a right child but no left child
        if parent = null then // root deleted
            r = r.right
        else
            if target < parent.key then
                parent.left = curr.right
            else
                parent.right = curr.right
            end if
        end if
    else

Page 25: Searching and Binary Search Trees CSCI 3333 Data Structures

BST Delete Algorithm (Continued)

        // has both left and right children
        // find immediate successor.
        immedSucc = curr.right
        while (immedSucc.left != null)
            immedSucc = immedSucc.left
        end while
        immedSucc.left = curr.left

Page 26: Searching and Binary Search Trees CSCI 3333 Data Structures

BST Delete Algorithm (Continued)

        if parent = null then // root deleted
            r = r.right
        else
            if target < parent.key then
                parent.left = curr.right
            else
                parent.right = curr.right
            end if
        end if
    end if
end if
delete curr // release the node in every case
return true
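The deletion slides can be condensed into the Python sketch below. It follows the strategy described in the slides for a node with two children (hang the left subtree under the immediate successor, then promote the right child), which is only one of several possible deletion schemes. The `BST` wrapper, the simplified `insert`, and the helper names are assumptions for the example; Python's garbage collector takes the place of `delete curr`.

```python
from typing import List, Optional


class Node:
    def __init__(self, key: int):
        self.key = key
        self.left: Optional["Node"] = None
        self.right: Optional["Node"] = None


class BST:
    def __init__(self) -> None:
        self.root: Optional[Node] = None

    def insert(self, key: int) -> None:
        """Simplified insert used only to build the example tree."""
        curr, parent = self.root, None
        while curr is not None:
            if key == curr.key:
                return
            parent, curr = curr, (curr.left if key < curr.key else curr.right)
        if parent is None:
            self.root = Node(key)
        elif key < parent.key:
            parent.left = Node(key)
        else:
            parent.right = Node(key)

    def delete(self, target: int) -> bool:
        curr, parent = self.root, None
        while curr is not None and curr.key != target:   # locate the node
            parent, curr = curr, (curr.left if target < curr.key else curr.right)
        if curr is None:
            return False                                 # not found
        if curr.right is None:                           # no right child
            replacement = curr.left
        elif curr.left is None:                          # right child only
            replacement = curr.right
        else:                                            # both children
            succ = curr.right
            while succ.left is not None:                 # immediate successor
                succ = succ.left
            succ.left = curr.left
            replacement = curr.right
        if parent is None:
            self.root = replacement                      # root deleted
        elif target < parent.key:
            parent.left = replacement
        else:
            parent.right = replacement
        return True


def keys_in_order(n: Optional[Node]) -> List[int]:
    return [] if n is None else keys_in_order(n.left) + [n.key] + keys_in_order(n.right)


tree = BST()
for k in [8, 4, 14, 1, 5, 9, 18]:
    tree.insert(k)
print(tree.delete(8), keys_in_order(tree.root))    # True [1, 4, 5, 9, 14, 18]
print(tree.delete(42), keys_in_order(tree.root))   # False [1, 4, 5, 9, 14, 18]
```

After deleting the root 8, the new root is 14 and the old left subtree hangs below 9, so the tree grows from height 3 to height 4. This shows how this deletion scheme can push a tree toward imbalance.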

Page 27: Searching and Binary Search Trees CSCI 3333 Data Structures

Time Complexity

Worst case: O(n).
Deletion may result in an unbalanced tree.

Page 28: Searching and Binary Search Trees CSCI 3333 Data Structures

Time Complexity (general BST)

Operations               Average Case   Worst Case
Find                     O(lg n)        O(n)
Insert                   O(lg n)        O(n)
Delete                   O(lg n)        O(n)
Sorted order traversal   O(n)           O(n)

Page 29: Searching and Binary Search Trees CSCI 3333 Data Structures

Unbalanced BST

Two approaches:
    Reorganization when the trees become very unbalanced.
    Use balanced BST.
Balanced BST:
    Many potential definitions.
    Height = O(lg n)
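As one possible form of the reorganization approach, the sketch below rebuilds an unbalanced BST from its sorted keys by repeatedly choosing the middle key as the root. The slides do not prescribe a particular reorganization method, so this technique and all names in the code are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def inorder_keys(node: Optional[Node], out: List[int]) -> List[int]:
    """Append the keys of the subtree to out in increasing order."""
    if node is not None:
        inorder_keys(node.left, out)
        out.append(node.key)
        inorder_keys(node.right, out)
    return out


def build_balanced(keys: List[int], lo: int, hi: int) -> Optional[Node]:
    """Build a BST from sorted keys[lo..hi], using the middle key as the root."""
    if lo > hi:
        return None
    mid = (lo + hi) // 2
    return Node(keys[mid],
                build_balanced(keys, lo, mid - 1),
                build_balanced(keys, mid + 1, hi))


def rebalance(root: Optional[Node]) -> Optional[Node]:
    keys = inorder_keys(root, [])
    return build_balanced(keys, 0, len(keys) - 1)


def height(node: Optional[Node]) -> int:
    """Number of nodes on the longest root-to-leaf path."""
    return 0 if node is None else 1 + max(height(node.left), height(node.right))


# A degenerate BST: keys 1..15 where each node has only a right child.
chain = None
for k in range(15, 0, -1):
    chain = Node(k, None, chain)

balanced = rebalance(chain)
print(height(chain), height(balanced))   # 15 4
```

The rebuild takes O(n) time, so it pays off only if performed occasionally; balanced BST variants instead maintain the height bound on every insert and delete.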

Page 30: Searching and Binary Search Trees CSCI 3333 Data Structures

Time Complexity (balanced BST)

Operations               Average Case   Worst Case
Find                     O(lg n)        O(lg n)
Insert                   O(lg n)        O(lg n)
Delete                   O(lg n)        O(lg n)
Sorted order traversal   O(n)           O(n)

Page 31: Searching and Binary Search Trees CSCI 3333 Data Structures

Questions and Comments?