
Unit 11: Data Structures & Complexity


Page 1: Unit 11: Data Structures & Complexity

We discuss in this unit:

• Graphs and trees

• Binary search trees

• Hashing functions

• Recursive sorting: quicksort, mergesort

Syllabus: basic programming concepts; object oriented programming; topics in computer science

Page 2: Unit 11: Data Structures & Complexity

Graphs and Trees

Graph: a data representation consisting of nodes and edges, where each edge connects two nodes

Example: the internet

Tree: a connected graph with no loops (cycles) and a designated root

Example: inheritance tree

Page 3: Unit 11: Data Structures & Complexity

Graphs and Trees

[Figure: three example graphs a), b) and c); the labels ROOT, LEAF NODES and internal vertices mark the parts of a tree]

Page 4: Unit 11: Data Structures & Complexity

Binary Trees

A rooted tree is called a binary tree if every internal vertex has no more than 2 children

The tree is called a full binary tree if every internal vertex has exactly 2 children

An ordered rooted tree is a rooted tree where the children of each internal vertex are ordered; we call the children of a vertex the left child and the right child, if they exist

A rooted binary tree of height H is called balanced if all its leaves are at levels H or H-1

Page 5: Unit 11: Data Structures & Complexity

Binary tree: example

[Figure: a binary tree whose nodes are labelled Hal, Lou, Ken, Joe, Ted, Sue, Ed, Max]

Page 6: Unit 11: Data Structures & Complexity

Tree Properties

Theorem: A tree with N vertices has N-1 edges

Theorem: There are at most 2^H leaves in a binary tree of height H

Corollary: If a binary tree with L leaves is full and balanced, then its height is H = ⌈log2 L⌉ (exactly log2 L when L is a power of 2)

Theorem: There are at most (2^(H+1) – 1) nodes in a binary tree of height H
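
The node bound follows from the leaf bound by summing over the levels of the tree. A short derivation, assuming levels are numbered from 0 at the root so that level i holds at most 2^i vertices:

```latex
\[
  \#\text{nodes} \;\le\; \sum_{i=0}^{H} 2^{i} \;=\; 2^{H+1} - 1 .
\]
```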

Page 7: Unit 11: Data Structures & Complexity

Binary Search Tree (BST)

A special kind of binary tree in which:

1. Each vertex contains a distinct key value

2. The key values in the tree can be compared using “greater than” and “less than”

3. The key value of each vertex in the tree is greater than every key value in its left subtree, and less than every key value in its right subtree

Page 8: Unit 11: Data Structures & Complexity

Example: Binary Search Tree

[Figure: a binary search tree whose keys are the names Hal, Lou, Ken, Joe, Ted, Sue, Ed, Max]

Page 9: Unit 11: Data Structures & Complexity

Shape of a BST

Depends on its key values and their order of insertion:

Insert the elements ‘J’ ‘E’ ‘F’ ‘T’ ‘A’ in that order. The first value to be inserted is put into the root.

[Tree: root ‘J’]

Page 10: Unit 11: Data Structures & Complexity

Inserting ‘E’ into the BST

Thereafter, each value to be inserted begins by comparing itself to the value in the root, moving left if it is less, or moving right if it is greater. This continues at each level until it can be inserted as a new leaf.

[Tree: root ‘J’; ‘E’ is the left child of ‘J’]

Page 11: Unit 11: Data Structures & Complexity

Inserting ‘F’ into the BST

Begin by comparing ‘F’ to the value in the root, moving left if it is less, or moving right if it is greater. This continues until it can be inserted as a leaf.

[Tree: root ‘J’; ‘E’ is the left child of ‘J’; ‘F’ is the right child of ‘E’]

Page 12: Unit 11: Data Structures & Complexity

Inserting ‘T’ into the BST

Begin by comparing ‘T’ to the value in the root, moving left if it is less, or moving right if it is greater. This continues until it can be inserted as a leaf.

[Tree: root ‘J’ with left child ‘E’ and right child ‘T’; ‘F’ is the right child of ‘E’]

Page 13: Unit 11: Data Structures & Complexity

Inserting ‘A’ into the BST

Begin by comparing ‘A’ to the value in the root, moving left if it is less, or moving right if it is greater. This continues until it can be inserted as a leaf.

[Tree: root ‘J’ with left child ‘E’ and right child ‘T’; ‘E’ has left child ‘A’ and right child ‘F’]
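
The insertion procedure walked through on the last few slides can be sketched in Python. This is a minimal illustration, not code from the course: the `Node` class and the `insert` function are hypothetical names, and keys are assumed to be distinct and comparable with `<` and `>`.

```python
class Node:
    """One vertex of a binary search tree (hypothetical helper class)."""

    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Insert key as a new leaf and return the root of the tree."""
    if root is None:
        return Node(key)                       # empty spot found: new leaf
    if key < root.key:
        root.left = insert(root.left, key)     # smaller keys go left
    elif key > root.key:
        root.right = insert(root.right, key)   # larger keys go right
    return root                                # an equal key is ignored


root = None
for key in "JEFTA":                            # same order as on the slides
    root = insert(root, key)

print(root.key, root.left.key, root.right.key)    # J E T
print(root.left.left.key, root.left.right.key)    # A F
```

The printed shape matches the tree built step by step above: ‘E’ and ‘T’ below ‘J’, and ‘A’ and ‘F’ below ‘E’.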

Page 14: Unit 11: Data Structures & Complexity

Order of insertion

What BST is obtained by inserting the elements ‘A’ ‘E’ ‘F’ ‘J’ ‘T’ in that order?

[Tree: a chain ‘A’, ‘E’, ‘F’, ‘J’, ‘T’, in which each node is the right child of the one before it]

Page 15: Unit 11: Data Structures & Complexity

Another binary search tree

Add nodes containing these values in this order:

‘D’ ‘B’ ‘L’ ‘Q’ ‘S’ ‘V’ ‘Z’

[Tree: root ‘J’; left child ‘E’ with children ‘A’ and ‘H’; right child ‘T’ with left child ‘M’, whose children are ‘K’ and ‘P’]

Page 16: Unit 11: Data Structures & Complexity

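Task: is ‘F’ in the tree?

[Tree: root ‘J’; left child ‘E’ with children ‘A’ and ‘H’; ‘D’ is the right child of ‘A’ and ‘B’ the left child of ‘D’; right child ‘T’ with children ‘M’ and ‘V’; ‘M’ has children ‘K’ and ‘P’; ‘L’ is the right child of ‘K’; ‘Q’ is the right child of ‘P’ and ‘S’ the right child of ‘Q’; ‘Z’ is the right child of ‘V’]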

Page 17: Unit 11: Data Structures & Complexity

Search(x)

Start at the root of the tree, which contains y:

1. the tree is empty → x is not present

2. x = y (the item at the root) → the root is returned

3. x < y → recursively search the left subtree

4. x > y → recursively search the right subtree
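
The four cases of Search(x) translate almost line by line into a recursive function. The sketch below is illustrative only: `Node` and `search` are hypothetical names, and the tree literal rebuilds the tree from the task "is ‘F’ in the tree?" as it results from the insertion rules.

```python
class Node:
    """BST vertex with optional children (hypothetical helper class)."""

    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def search(node, x):
    """Return the node holding x, or None if x is not present."""
    if node is None:                  # case 1: empty tree
        return None
    if x == node.key:                 # case 2: found at the root
        return node
    if x < node.key:                  # case 3: continue in the left subtree
        return search(node.left, x)
    return search(node.right, x)      # case 4: continue in the right subtree


tree = Node("J",
    Node("E",
        Node("A", None, Node("D", Node("B"))),
        Node("H")),
    Node("T",
        Node("M",
            Node("K", None, Node("L")),
            Node("P", None, Node("Q", None, Node("S")))),
        Node("V", None, Node("Z"))))

print(search(tree, "F"))        # None: the path J, E, H ends at an empty subtree
print(search(tree, "S").key)    # S
```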

Page 18: Unit 11: Data Structures & Complexity

Operations

Search(x)

Insert(x)

Delete(x)


Page 19: Unit 11: Data Structures & Complexity

Complexity

Search(x) – O(H)

Insert(x) – O(H)

Delete(x) – O(H)

where H is the height of the tree:

worst case O(n), when the tree is a list

best case O(log n), when the tree is full and balanced
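
How much the height depends on the insertion order can be checked with a small experiment. The sketch below assumes a BST stored as nested dictionaries and defines height as the number of edges on the longest root-to-leaf path; the helper names `insert`, `height` and `build` are made up for this illustration.

```python
def insert(tree, key):
    """Insert key into a BST stored as nested dicts; return the root."""
    if tree is None:
        return {"key": key, "left": None, "right": None}
    side = "left" if key < tree["key"] else "right"   # keys assumed distinct
    tree[side] = insert(tree[side], key)
    return tree


def height(tree):
    """Edges on the longest root-to-leaf path (-1 for an empty tree)."""
    if tree is None:
        return -1
    return 1 + max(height(tree["left"]), height(tree["right"]))


def build(keys):
    """Insert the keys one by one, in the given order."""
    tree = None
    for key in keys:
        tree = insert(tree, key)
    return tree


print(height(build("AEFJT")))   # 4: sorted input degenerates into a list
print(height(build("JEFTA")))   # 2: same keys, better insertion order
```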

Page 20: Unit 11: Data Structures & Complexity

Traversal Algorithms

A traversal algorithm is a procedure for systematically visiting every vertex of an ordered binary tree

Tree traversals are defined recursively

Three traversals are named:

• preorder

• inorder

• postorder

Page 21: Unit 11: Data Structures & Complexity

PREORDER Traversal Algorithm

Let T be an ordered binary tree with root r:

If T has only r, then r is the preorder traversal

Otherwise, suppose T1, T2 are the left and right subtrees at r; the preorder traversal is

• visit r

• traverse T1 in preorder

• traverse T2 in preorder

Page 22: Unit 11: Data Structures & Complexity

Preorder Traversal

[Tree: root ‘J’; left child ‘E’ with children ‘A’ and ‘H’; right child ‘T’ with children ‘M’ and ‘Y’]

Visit the root first, the left subtree second, the right subtree last

result: J E A H T M Y

Page 23: Unit 11: Data Structures & Complexity

INORDER Traversal Algorithm

Let T be an ordered binary tree with root r:

If T has only r, then r is the inorder traversal

Otherwise, suppose T1, T2 are the left and right subtrees at r; then

• traverse T1 in inorder

• visit r

• traverse T2 in inorder

Page 24: Unit 11: Data Structures & Complexity

Inorder Traversal

[Tree: root ‘J’; left child ‘E’ with children ‘A’ and ‘H’; right child ‘T’ with children ‘M’ and ‘Y’]

Visit the left subtree first, the root second, the right subtree last

result: A E H J M T Y

Page 25: Unit 11: Data Structures & Complexity

POSTORDER Traversal Algorithm

Let T be an ordered binary tree with root r:

If T has only r, then r is the postorder traversal

Otherwise, suppose T1, T2 are the left and right subtrees at r; then

• traverse T1 in postorder

• traverse T2 in postorder

• visit r

Page 26: Unit 11: Data Structures & Complexity

Postorder Traversal

[Tree: root ‘J’; left child ‘E’ with children ‘A’ and ‘H’; right child ‘T’ with children ‘M’ and ‘Y’]

Visit the left subtree first, the right subtree second, the root last

result: A H E M Y T J
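
The three recursive definitions differ only in where the root is visited. A minimal Python sketch, using a hypothetical `Node` class and the seven-node example tree from the traversal slides:

```python
class Node:
    """Binary tree vertex with optional children (hypothetical helper)."""

    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def preorder(node):
    """Root first, then left subtree, then right subtree."""
    if node is None:
        return []
    return [node.key] + preorder(node.left) + preorder(node.right)


def inorder(node):
    """Left subtree first, then root, then right subtree."""
    if node is None:
        return []
    return inorder(node.left) + [node.key] + inorder(node.right)


def postorder(node):
    """Left subtree first, then right subtree, then root."""
    if node is None:
        return []
    return postorder(node.left) + postorder(node.right) + [node.key]


tree = Node("J",
            Node("E", Node("A"), Node("H")),
            Node("T", Node("M"), Node("Y")))

print(" ".join(preorder(tree)))    # J E A H T M Y
print(" ".join(inorder(tree)))     # A E H J M T Y
print(" ".join(postorder(tree)))   # A H E M Y T J
```

For a binary search tree the inorder traversal lists the keys in sorted order, as the second line of output shows.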

Page 27: Unit 11: Data Structures & Complexity

A Binary Expression Tree

[Tree: root ‘-’ with left child ‘8’ and right child ‘5’]

INORDER TRAVERSAL: 8 - 5 has value 3

PREORDER TRAVERSAL: - 8 5

POSTORDER TRAVERSAL: 8 5 -

Page 28: Unit 11: Data Structures & Complexity

Binary Expression Tree

A special kind of binary tree in which:

1. Each leaf node contains a single operand

2. Each nonleaf node contains a single binary operator

3. The left and right subtrees of an operator node represent subexpressions that must be evaluated before applying the operator at the root of the subtree

Page 29: Unit 11: Data Structures & Complexity

A Binary Expression Tree

[Tree: root ‘*’; its left child is ‘+’ with children ‘4’ and ‘2’; its right child is ‘3’]

What value does it have?

( 4 + 2 ) * 3 = 18

Page 30: Unit 11: Data Structures & Complexity

A Binary Expression Tree

[Tree: root ‘*’; its left child is ‘+’ with children ‘4’ and ‘2’; its right child is ‘3’]

What infix, prefix, postfix expressions does it represent?

Page 31: Unit 11: Data Structures & Complexity

A Binary Expression Tree

[Tree: root ‘*’; its left child is ‘+’ with children ‘4’ and ‘2’; its right child is ‘3’]

Infix: ( ( 4 + 2 ) * 3 )

Prefix: * + 4 2 3 (evaluate from right)

Postfix: 4 2 + 3 * (evaluate from left)

Page 32: Unit 11: Data Structures & Complexity

Levels Indicate Precedence

When a binary expression tree is used to represent an expression, the levels of the nodes in the tree indicate their relative precedence of evaluation

Operations at higher levels of the tree are evaluated later than those below them; the operation at the root is always the last operation performed

Page 33: Unit 11: Data Structures & Complexity

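Example

[Tree: root ‘*’; its left child is ‘-’ with children ‘8’ and ‘5’; its right child is ‘/’, whose left child is ‘+’ with children ‘4’ and ‘2’ and whose right child is ‘3’]

What infix, prefix, postfix expressions does it represent?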

Page 34: Unit 11: Data Structures & Complexity

A binary expression tree

[Tree: root ‘*’; its left child is ‘-’ with children ‘8’ and ‘5’; its right child is ‘/’, whose left child is ‘+’ with children ‘4’ and ‘2’ and whose right child is ‘3’]

Infix: ( ( 8 - 5 ) * ( ( 4 + 2 ) / 3 ) )

Prefix: * - 8 5 / + 4 2 3

Postfix: 8 5 - 4 2 + 3 / * (has operators in order used)
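
Evaluating an expression tree is a postorder computation: both subtrees are evaluated before the operator at their root is applied. The sketch below is one possible encoding, not the course's own: a leaf is a plain number, an operator node is a tuple `(operator, left, right)`, and `evaluate`, `postfix` and `eval_postfix` are hypothetical names.

```python
import operator

# Map each operator symbol to the function that applies it.
OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}


def evaluate(node):
    """Evaluate both subtrees first, then apply the operator at the root."""
    if not isinstance(node, tuple):
        return node                            # leaf: an operand
    op, left, right = node
    return OPS[op](evaluate(left), evaluate(right))


def postfix(node):
    """Postorder traversal of the tree, as a list of tokens."""
    if not isinstance(node, tuple):
        return [str(node)]
    op, left, right = node
    return postfix(left) + postfix(right) + [op]


def eval_postfix(tokens):
    """Evaluate a postfix token list from left to right with a stack."""
    stack = []
    for token in tokens:
        if token in OPS:
            right = stack.pop()
            left = stack.pop()
            stack.append(OPS[token](left, right))
        else:
            stack.append(float(token))
    return stack.pop()


tree = ("*", ("-", 8, 5), ("/", ("+", 4, 2), 3))

print(evaluate(tree))                  # 6.0
print(" ".join(postfix(tree)))         # 8 5 - 4 2 + 3 / *
print(eval_postfix(postfix(tree)))     # 6.0
```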

Page 35: Unit 11: Data Structures & Complexity

Hash tables

Goal: access data with complexity O(1), even when the data needs to be dynamically administered (mostly by inserting new or deleting existing data)

Solution: array

Problem: access is only O(1) if the index of the element is known – otherwise O(log n) at best (binary search in a sorted array)

Solution: each data item to be stored is associated with a hash value which gives the array index:

• generate an array of size m, where m is prime and sufficiently large

• assign a unique numerical value N(k) to the key of element k

• h(k) = N(k) mod m

Page 36: Unit 11: Data Structures & Complexity

Hash tables - cont

Problem: the hash value is not unique anymore! Different keys can be mapped to the same array index (a collision)

Solution: a collision procedure must determine the position for the new object
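
The slides do not say which collision procedure is meant. One common choice is chaining, where each array slot holds a short list of all entries that hash to it. The sketch below assumes chaining and assumes keys are already given as their numerical values N(k); the class name `ChainedHashTable` and its methods are hypothetical.

```python
class ChainedHashTable:
    """Hash table that resolves collisions by chaining (assumed strategy)."""

    def __init__(self, m=10007):
        self.m = m                                # table size, ideally prime
        self.buckets = [[] for _ in range(m)]     # one list per array slot

    def _index(self, n):
        return n % self.m                         # h(k) = N(k) mod m

    def insert(self, n, value):
        bucket = self.buckets[self._index(n)]
        for i, (key, _) in enumerate(bucket):
            if key == n:                          # key already stored: replace
                bucket[i] = (n, value)
                return
        bucket.append((n, value))                 # new key: extend the chain

    def search(self, n):
        for key, value in self.buckets[self._index(n)]:
            if key == n:
                return value
        return None                               # not present

    def delete(self, n):
        index = self._index(n)
        self.buckets[index] = [(k, v) for k, v in self.buckets[index] if k != n]


table = ChainedHashTable(m=7)
table.insert(3, "a")
table.insert(10, "b")                        # 10 mod 7 == 3: collides with key 3
print(table.search(3), table.search(10))     # a b
table.delete(3)
print(table.search(3), table.search(10))     # None b
```

If every key lands in the same bucket, each operation scans one long chain, which is the O(n) worst case.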

Page 37: Unit 11: Data Structures & Complexity

Operations

Search(x)

Insert(x)

Delete(x)

Complexity:

worst case O(n), when the hash value is always the same (and the table is really a list)

best case O(1), when the hash value is always distinct

Page 38: Unit 11: Data Structures & Complexity

Example: keys are names

list of unique identifiers:

• washington 103288042987600

• lincoln 5201793578

• bush 151444

the range currently is all positive integers, which is not possible to implement

for m=10007:

• washington 3249

• lincoln 4873

• bush 1339
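
The reduced values can be reproduced directly from h(k) = N(k) mod m. In this small check the identifiers are copied from the slide; how they were derived from the names is not stated there, so they are taken as given, and the function name `h` is chosen only for the illustration.

```python
M = 10007   # table size from the slide (a prime)

# Unique identifiers N(k), copied from the slide.
identifiers = {
    "washington": 103288042987600,
    "lincoln": 5201793578,
    "bush": 151444,
}


def h(n, m=M):
    """Hash value: reduce the identifier to an array index in 0..m-1."""
    return n % m


for name, n in identifiers.items():
    print(name, h(n))
# washington 3249
# lincoln 4873
# bush 1339
```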

Page 39: Unit 11: Data Structures & Complexity

Why modulus a prime number?

underlying reason: the new code numbers should appear random

technical reason: arithmetic modulo m should form a “field”, which is the case exactly when m is prime

example:

• original codes are all even numbers

• m is even

• only even buckets in the table will get filled, half the table will be empty
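
The effect in the example is easy to observe. The sketch below uses made-up data (the even numbers 0 to 198) and compares an even table size with a prime one; `used_buckets` is a hypothetical helper.

```python
def used_buckets(keys, m):
    """Sorted list of the table indices that receive at least one key."""
    return sorted({key % m for key in keys})


even_codes = range(0, 200, 2)           # 100 codes, all even (made-up data)

print(used_buckets(even_codes, 10))     # [0, 2, 4, 6, 8]: half the table stays empty
print(len(used_buckets(even_codes, 11)))   # 11: every bucket of the prime-sized table is used
```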

Page 40: Unit 11: Data Structures & Complexity

Back to sorting...

Two recursive algorithms:

MergeSort

QuickSort

Page 41: Unit 11: Data Structures & Complexity

MergeSort

divide the array into two (nearly) equal parts

recursively sort the left part

recursively sort the right part

merge the two sorted lists
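
The four MergeSort steps can be sketched as follows. This version returns a new sorted list rather than sorting in place, which is a simplification chosen for readability; the function names `merge` and `merge_sort` are illustrative.

```python
def merge(left, right):
    """Merge two sorted lists into one sorted list."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])      # at most one of these two is non-empty
    merged.extend(right[j:])
    return merged


def merge_sort(items):
    """Return a sorted copy of items."""
    if len(items) <= 1:
        return list(items)                 # nothing left to divide
    mid = len(items) // 2                  # divide into two (nearly) equal parts
    left = merge_sort(items[:mid])         # recursively sort the left part
    right = merge_sort(items[mid:])        # recursively sort the right part
    return merge(left, right)              # merge the two sorted lists


print(merge_sort([5, 2, 9, 1, 5, 6]))      # [1, 2, 5, 5, 6, 9]
```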

Page 42: Unit 11: Data Structures & Complexity

QuickSort

choose pivot arbitrarily

divide into a left part (≤ pivot) and a right part (> pivot)

sort each part recursively
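
A compact QuickSort sketch along these lines is shown below. Two assumptions go beyond the slide: the "arbitrary" pivot is picked at random, and the function builds new lists instead of partitioning the array in place, which keeps the example short at the cost of extra memory.

```python
import random


def quick_sort(items):
    """Return a sorted copy of the list items."""
    if len(items) <= 1:
        return list(items)
    pivot_index = random.randrange(len(items))      # arbitrary pivot (random here)
    pivot = items[pivot_index]
    rest = items[:pivot_index] + items[pivot_index + 1:]
    left = [x for x in rest if x <= pivot]          # left part: <= pivot
    right = [x for x in rest if x > pivot]          # right part: > pivot
    return quick_sort(left) + [pivot] + quick_sort(right)


print(quick_sort([5, 2, 9, 1, 5, 6]))               # [1, 2, 5, 5, 6, 9]
```

Removing the pivot before partitioning guarantees that both parts are strictly smaller than the input, so the recursion always terminates.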

Page 43: Unit 11: Data Structures & Complexity

Complexity

MergeSort

worst case O(n * log n)

QuickSort

worst case O(n^2)

average case O(n * log n)