47
Advanced Data Structure Hackson Leung 2009-01-10

Advanced Data Structure Hackson Leung 2009-01-10

Embed Size (px)

Citation preview

Advanced Data StructureHackson Leung

2009-01-10

Outline

• Review• General Array• Linked List• Sorted Array

• Main dish• Binary Search Tree• Heap (aka Priority Queue)• Hash Table

• Dessert?

ReviewSomething you should have known

Review

• Data Structure – way to store data• GOOD or BAD?• No absolute answer

• Operations Supported• Find an element T inside• Find the minimum element inside• Insert a new element T in it• Remove the Tth element inside• Remove the minimum element inside

Review

• GOOD or BAD?• No absolute answer• WHY?

• Bringing small money is troublesome• Most shops accept money

• Octopus is convenient in payment• Not many shops accept it

• Money OR Octopus? Depends• Which structure is the best? • Also Depends!

Review

• GOOD or BAD?• No absolute answer IF we consider many• Definite IF only consider one operation

• For one operation, time complexity• Still remember that?• O(1), O(lgN), O(N), O(NlgN), O(N2)...

Review - Array• If you don’t know that, you won’t be here• Find an element T• Search the whole array, O(N)

• Find the minimum• Search the whole array, O(N)

• Insert an element T• A[cnt] = T; cnt = cnt + 1; O(1)

• Remove the Tth element• O(1). Why?

• Remove the minimum element• O(N). Why?

Review – Linked List• Find an element T• Search the whole list, O(N)

• Find the minimum• Search the whole list, O(N)

• Insert an element T• O(1). Why?

• Remove the element with pointer T• O(1), with better implementation.

• Remove the minimum element• O(N). Why?

Review – Sorted Array• Find an element T• Binary search, O(lgN)

• Find the minimum• Still need to search?!?! No way! O(1)

• Insert an element T• Where to add? O(lgN)• Maintain sorted, O(N) shift

• Remove the Tth element• Delete + Shift, O(N)

• Remove the minimum element• Delete + Shift, O(N) (Can it be O(1)???)

Review – SummeryArray Linked List Sorted Array

Find T O(N) O(N) O(lgN)

Find Min O(N) O(N) O(1)

Add T O(1) O(1) O(N)

Remove Tth/pointer T O(1) O(1) O(N)

Remove Minimum O(N) O(N) O(N) or O(1)

• If we perform a few number of operations, fine• If we perform a lot......• NONE ARE EFFICIENT!

Binary Search TreeI suppose you have learnt tree before

Binary Search Tree

• Make use of Binary Tree• Binary Tree? At most 2 children!• One more property• Left Subtree < Node < Right Subtree

11

8

4 9

15

20

Binary Search Tree

• Insert 11, 8, 15, 9, 20, 4

11

8

4 9

15

20

Binary Search Tree

• Insert 11, 8, 15, 9, 20, 4

11

8

4 9

15

20

Binary Search Tree

• Find 9

11

8

4 9

15

20

Binary Search Tree

• Find 14

11

8

4 9

15

20

Binary Search Tree

• Remove• Leaf Node Very easy• Single Child Still easy• Push that child upward

• 2 Children?• Find Max(Min) at Left(Right) Subtree• Replace to that node• For that removal of node...

• Leaf?• Single?• 2 Children?

Binary Search Tree

• Remove• Leaf Node Very easy• Single Child Still easy• Push that child upward

• 2 Children?• Replace by Min(Max) at Left(Right)

Subtree• Either leaf of single child case

• Declare that node to be deleted instead of actually deleting (Lazy Deletion)

Binary Search Tree

• Remove• Declare that node to be deleted

instead of actually deleting (Lazy Deletion)

• Delete 11 11

8

4 9

15

20

Binary Search Tree

• Remove• Declare that node to be deleted

instead of actually deleting (Lazy Deletion)

• Delete 11

8

4 9

15

20

Binary Search Tree

• Summary• Insert is similar to Find• Find Minimum: Left Most• Remove Min: Find Min + Remove

• Complexities• Find: O(lgN)• Find Min: O(lgN)• Remove Min: O(lgN)• Insert: O(lgN)• Remove: O(lgN)

Binary Search Tree

• World of lgN!• It is supposed to be lgN only.• Upper bounds...?• Example...?• Reason...?

• Solution• AVL Tree, Red Black Tree• By using rotations• Rarely used + difficult to implement

HeapPriority is given to...

Heap (Priority Queue)

• Binary Tree (Usually complete)• Queue only supports• Enqueue (Insert)• Dequeue (Remove Min)

• One property• Node value is greater(smaller) than

all its descendants• Max(Min) Heap

Heap (Priority Queue)

• How to implement? Array!• Suppose 1-based array• For an node at Tth position• Parent = T/2• Left = 2T• Right = 2T+1

Heap (Priority Queue)

• Enqueue• Just insert at the end• Shift up until property is satisfied

Heap (Priority Queue)

• Dequeue• Replace top element with the last• Shift down until property is satisfied

Heap (Priority Queue)

• Build a heap• Each node’s height: lgN• For each node, do shift down• O(NlgN)???• For a binary tree with N nodes• There are at most ceil(N/2h+1) nodes with

height h

• Complexity in terms of h• = O(2N)• Therefore it is O(N)

Heap (Priority Queue)

• Summary• Find is not supported

• O(N) search whole array• Is remove supported?

• Any subtree is also a heap• Remove Min on that subtree

• Complexity• Find: O(N)• Find Min: O(1)• Remove Min: O(lgN)• Insert: O(lgN)• Remove: O(lgN)

Hash TableMemory is finite, but input range is NOT

Hash Table

• Simple Problem• Mark Six result: 6 integers in [1..49]• Check if each number matches my

bet• Most efficient approach?

• Answer• Use boolean array with 49 cells• Check if that cell is set for each

number• O(1)

Hash Table

• Simple Problem...?• Mark Six result: 6 integers in [1..232]• Check if each number matches my

bet• Most efficient approach?

• Answer...?• Use boolean array with 232 cells• 4GB memory already• Even O(1) checking... Too much

memory required

Hash Table

• Suppose we are not checking numbers but strings instead...

• Suppose we are not checking numbers but images instead...

• Solution• Compress the range by Hash Function• Convert the range into an integer

Hash Table

• Idea• We need to store values between 0 to 99, but we have only 10 cells• We can compress the range [0, 99] to [0, 9] by taking the modulo 10. It is called Hash Value• Insert, Find and Deleting are O(1)

Hash Table

• Problem• What if inserting both 11 and 31?• Collision problem

• Solutions• Chaining (Open Hashing)• Open Addressing (Close Hashing)

Hash Table - Chaining

• Each cell becomes a linked list• On average, Insert/Find/Remove

takes O(1+α), where α is the Load Factor = # stored / # cells

• For random HashFunction, usuallyaverage case

Hash Table – Open Addressing

• Find until a blank cell is reached• Remove must use Lazy Deletion or

Find may fail in some cases

Hash Table – Open Addressing

• Previous method is called Linear Probing

• Quadratic Probing is also popular• Table size MUST be prime• In OI, Linear Probing is enough

• Complexities• Find: O(1 / 1-α)• Insert: O(1 / 1-α)• Remove: O(ln(1 / 1- α)/α + 1/α)

Hash Table – Summary• Find Min and Remove Min is usually

not supported unless you search the whole tree or array

• Complexities (Chaining)• Find: O(1 + α)• Insert: O(1 + α)• Remove: O(1 + α)

• Complexities (Open Addressing)• Find: O(1 / 1-α)• Insert: O(1 / 1-α)• Remove: O(ln(1 / 1- α)/α + 1/α)

• If α is small (<50%), almost O(1)

SummaryCompare what we have learnt

Summary

Summary

• Choice of data structures are depending on the problem nature

• What operations are intensive is the critical factor of choosing them

ExtrasRead only if you have digested main dishes

Extra Topics

• Double Hashing• Use of two hash functions

• Binary Indexed Tree• IOI 2001 Mobiles

• Segment Tree• IOI 1998 Picture

• Quad Tree• Rarely used

TasksPractise makes Perfect

Tasks

• HKOJ 1020 – Left Join• HKOJ 1021 – Inner Join• HKOJ 1019 – Addition II• HKOJ 1090 – Dilligent• HKOJ 3061 – Tappy World• NOI 2004 Day 1 - Cashier

Q & ACuriosity?