Unit 11: Data Structures & Complexity

Preview:

DESCRIPTION

Unit 11: Data Structures & Complexity. We discuss in this unit Graphs and trees Binary search trees Hashing functions Recursive sorting: quicksort, mergesort. syllabus. basic programming concepts. object oriented programming. topics in computer science. Graphs and Trees. - PowerPoint PPT Presentation

Citation preview

1unit 11a

Unit 11: Data Structures & ComplexityUnit 11: Data Structures & Complexity

We discuss in this unit • Graphs and trees

• Binary search trees

• Hashing functions

• Recursive sorting: quicksort, mergesort

basic programming

concepts

object oriented programming

topics in computer science

syllabus

2unit 11a

Graph: a data representation which includes nodes

and edges, where each edge connects two nodes

Example: the internet

Tree: a connected graph with no loops, and a root

Example: inheritance tree

Graphs and TreesGraphs and Trees

3unit 11a

Graphs and TreesGraphs and Trees

a) b)

c)ROOT

LEAF NODES

internal vertices

4unit 11a

Binary TreesBinary Trees

A rooted tree is called a binary tree if every internal vertex has no more than 2 childrenThe tree is called a full binary tree if every internal vertex has exactly 2 children

An ordered rooted tree is a rooted tree where the children of each internal vertex are ordered; we call the children of a vertex the left child and the right child, if they exist

A rooted binary tree of height H is called balanced if all its leaves are at levels H or H-1

5unit 11a

Binary tree: exampleBinary tree: example

Hal

Lou

Ken

Joe Ted

Sue Ed

Max

6unit 11a

Theorem: A tree with N vertices has N-1 edges

Theorem: There are at most 2 H leaves in a binary tree of height H

Corallary: If a binary tree with L leaves is full and balanced,

then its height is

H = log2 L

Theorem: There are at most (2 H+1–1) nodes in a binary tree of

height H

Tree PropertiesTree Properties

unit 11a

A special kind of binary tree in which:

1. Each vertex contains a distinct key value

2. The key values in the tree can be compared using “greater than” and “less than”

3. The key value of each vertex in the tree is

less than every key value in its left subtree, and greater than every key value in its right subtree

Binary Search Tree (BST)Binary Search Tree (BST)

8unit 11a

Example: Binary Search TreeExample: Binary Search Tree

Hal

Lou

Ken

Joe Ted

Sue Ed

Max

9unit 11a

Shape of a BSTShape of a BST

Depends on its key values and their order of insertion:

Insert the elements ‘J’ ‘E’ ‘F’ ‘T’ ‘A’ in that order. The first value to be inserted is put into the root.

‘J’

10unit 11a

Inserting ‘E’ into the BSTInserting ‘E’ into the BST

Thereafter, each value to be inserted begins by comparing itself to the value in the root, moving left it is less, or moving right if it is greater. This continues at each level until it can be inserted as a new leaf.

‘J’

‘E’

11unit 11a

Begin by comparing ‘F’ to the value in the root, moving left

it is less, or moving right if it is greater. This continues

until it can be inserted as a leaf.

Inserting ‘F’ into the BSTInserting ‘F’ into the BST

‘J’

‘E’

‘F’

12unit 11a

Begin by comparing ‘T’ to the value in the root, moving left it is less, or

moving right if it is greater. This continues until it can be inserted as

a leaf.

Inserting ‘T’ into the BSTInserting ‘T’ into the BST

‘J’

‘E’

‘F’

‘T’

13unit 11a

Begin by comparing ‘A’ to the value in the root, moving left it is less, or

moving right if it is greater. This continues until it can be inserted as

a leaf.

Inserting ‘A’ into the BSTInserting ‘A’ into the BST

‘J’

‘E’

‘F’

‘T’

‘A’

14unit 11a

what BST is obtained by inserting the elements ‘A’ ‘E’ ‘F’ ‘J’ ‘T’ in that order?

Order of insertionOrder of insertion

‘A’

‘E’

‘F’

‘J’

‘T’

15unit 11a

Another binary search treeAnother binary search tree

Add nodes containing these values in this order:

‘D’ ‘B’ ‘L’ ‘Q’ ‘S’ ‘V’ ‘Z’

‘J’

‘E’

‘A’ ‘H’

‘T’

‘M’

‘K’ ‘P’

16unit 11a

Task: is ‘F’ in the treeTask: is ‘F’ in the tree??

‘J’

‘E’

‘A’ ‘H’

‘T’

‘M’

‘K’

‘V’

‘P’ ‘Z’‘D’

‘Q’‘L’‘B’

‘S’

17unit 11a

Search(x)Search(x)

start at the root of the tree which contains y:

1. the tree is empty x is not present

2. x = y (the item at the root) the root is returned

3. x < y recursively search the left subtree

4. x > y recursively search the right subtree

18unit 11a

OperationsOperations

Search(x)

Insert(x)

Delete(x)

tree algs/demo

19unit 11a

ComplexityComplexity

Search(x) – O(H)

Insert(x) – O(H)

Delete(x) – O(H)

worst case O(n) when tree is a listbest case O(log n) when tree is full and balanced

20unit 11a

A traversal algorithm is a procedure for

systematically visiting every vertex of an ordered

binary tree

Tree traversals are defined recursively

Three traversals are named:

preorder

inorder

postorder

Traversal AlgorithmsTraversal Algorithms

21unit 11a

Let T be an ordered binary tree with root r:

If T has only r, then r is the preorder traversal

Otherwise, suppose T1, T2 are the left and right

subtrees at r; the preorder traversal is

• visit r

• traverse T1 in preorder

• traverse T2 in preorder

PREORDER Traversal AlgorithmPREORDER Traversal Algorithm

22unit 11a

Preorder TraversalPreorder Traversal

‘J’

‘E’

‘A’ ‘H’

‘T’

‘M’ ‘Y’

ROOT

Visit left subtree second Visit right subtree last

Visit first

result: J E A H T M Yresult: J E A H T M Y

23unit 11a

Let T be an ordered binary tree with root r:

If T has only r, then r is the inorder traversal

Otherwise, suppose T1, T2 are the left and right

subtrees at r; then

• traverse T1 in inorder

• visit r

• traverse T2 in inorder

INORDER Traversal AlgorithmINORDER Traversal Algorithm

24unit 11a

Inorder TraversalInorder Traversal

‘J’

‘E’

‘A’ ‘H’

‘T’

‘M’ ‘Y’

ROOT

Visit left subtree first Visit right subtree last

Visit second

result: A E H J M T Yresult: A E H J M T Y

25unit 11a

Let T be an ordered binary tree with root r:

If T has only r, then r is the postorder traversal

Otherwise, suppose T1, T2 are the left and right

subtrees at r; then

• traverse T1 in postorder

• traverse T2 in postorder

• visit r

POSTORDER Traversal AlgorithmPOSTORDER Traversal Algorithm

26unit 11a

‘J’

‘E’

‘A’ ‘H’

‘T’

‘M’ ‘Y’

ROOT

Visit left subtree first Visit right subtree second

Visit last

Postorder Traversal

result: A H E M Y T Jresult: A H E M Y T J

27unit 11a

A Binary Expression TreeA Binary Expression Tree

‘-’

‘8’ ‘5’

ROOT

INORDER TRAVERSAL: 8 - 5 has value 3

PREORDER TRAVERSAL: - 8 5

POSTORDER TRAVERSAL: 8 5 -

28unit 11a

A special kind of binary tree in which:

1. Each leaf node contains a single operand

2. Each nonleaf node contains a single binary operator

3. The left and right subtrees of an operator node represent subexpressions that must be evaluated before applying the operator at the root of the subtree

Binary Expression TreeBinary Expression Tree

29unit 11a

A Binary Expression TreeA Binary Expression Tree

‘*’

‘+’

‘4’

‘3’

‘2’

What value does it have?

( 4 + 2 ) * 3 = 18

30unit 11a

A Binary Expression TreeA Binary Expression Tree

‘*’

‘+’

‘4’

‘3’

‘2’

What infix, prefix, postfix expressions does it represent?

31unit 11a

A Binary Expression TreeA Binary Expression Tree

‘*’

‘+’

‘4’

‘3’

‘2’

Infix: ( ( 4 + 2 ) * 3 )

Prefix: * + 4 2 3 evaluate from right

Postfix: 4 2 + 3 * evaluate from left

32unit 11a

Levels Indicate PrecedenceLevels Indicate Precedence

When a binary expression tree is used to represent an expression, the levels of the nodes in the tree indicate their relative precedence of evaluation

Operations at higher levels of the tree are evaluated later than those below them; the operation at the root is always the last operation performed

33unit 11a

ExampleExample

‘*’

‘-’

‘8’ ‘5’

What infix, prefix, postfix expressions does it represent?

‘/’

‘+’

‘4’

‘3’

‘2’

34unit 11a

A binary expression treeA binary expression tree

Infix: ( ( 8 - 5 ) * ( ( 4 + 2 ) / 3 ) )

Prefix: * - 8 5 / + 4 2 3

Postfix: 8 5 - 4 2 + 3 / * has operators in order used

‘*’

‘-’

‘8’ ‘5’

‘/’

‘+’

‘4’

‘3’

‘2’

35unit 11a

Hash tablesHash tables

Goal: accesses data with complexity O(1), even when the data needs to be dynamically administered (mostly by inserting new or deleting existing data)

Solution: array Problem: access is only O(1) if the index of the element is

known – otherwise O(log n) Solution: each data item to be stored is associated with a

hash value which gives array index:• generate array of size m, where m is prime and sufficiently

large• assign unique numerical value N(k) to key of element k• h(k) = N(k) mod m

36unit 11a

Hash tables - contHash tables - cont

Problem: hash value is not unique anymore! Solution: a collision procedure must determine the

position for the new object

hasshing

37unit 11a

OperationsOperations

Search(x) Insert(x) Delete(x)

Complexity: worst case O(n) when hash value is always the

same (and table is really a list) best case O(1) when hash value is always distinct

38unit 11a

Example: keys are namesExample: keys are names

list of unique identifiers• washington 103288042987600

• lincoln 5201793578

• bush 151444 the range currently is all positive integers, which is

not possible to implement for m=10007

• washington 3249

• lincoln 4873

• bush 1339

39unit 11a

why modulus prime numberwhy modulus prime number??

underlying reason: the new code numbers should appear random

technical reason: the modulus operation should be a “field”

example: • original codes are all even numbers

• m is even

• only even buckets in the table will get filled, half the table will be empty

40unit 11a

Back to sortingBack to sorting......

Two recursive algorithms:

MergeSort

QuickSort

41unit 11a

MergeSortMergeSort

divide array to two equal parts

recursively sort left part

recursively sort right part

merge the two sorted lists

MergeSort

42unit 11a

QuickSortQuickSort

quicksort

choose pivot arbitrarily

divide to left part (< = than pivot) and right part (> than pivot)

sort each part recursively

43unit 11a

ComplexityComplexity

MergeSort

worst case O(n * log n)

QuickSort

worst case O(n2)

average case O(n * log n)

Recommended