Data Structures and Algorithm Analysis Lecturer: Jing Liu Email: neouma@mail.xidian.edu.cn Homepage: http://see.xidian.edu.cn/faculty/liujing


Page 1:

Data Structures and Algorithm Analysis

Lecturer: Jing Liu

Email: neouma@mail.xidian.edu.cn

Homepage: http://see.xidian.edu.cn/faculty/liujing

Page 2:

Textbook

Mark Allen Weiss, Data Structures and Algorithm Analysis in C, China Machine Press.

Page 3:

Grading

Final exam: 70%
Others: 30%

Page 4:

What are Data Structures and Algorithms?

Data structures are methods of organizing large amounts of data.

An algorithm is a procedure consisting of a finite set of instructions which, given an input from some set of possible inputs, enables us, through a systematic execution of the instructions, to obtain an output if such an output exists, or nothing at all if there is no output for that particular input.

Page 5:

[Figure: diagram relating Instructions, Inputs (Problems), Computers, and Outputs (Answers)]

Page 6:

[Figure: diagram with the labels Programming Languages, Data Structure, Algorithms, and Software Systems]

Page 7:

Contents

Chapter 3 Lists, Stacks, and Queues
Chapter 4 Trees
Chapter 5 Hashing
Chapter 6 Priority Queues (Heaps)
Chapter 7 Sorting
Chapter 8 The Disjoint Set ADT
Chapter 9 Graph Algorithms
Chapter 10 Algorithm Design Techniques

Page 8:

Abstract Data Types (ADTs)

One of the basic rules concerning programming is to break the program down into modules.

Each module is a logical unit and does a specific job. Its size is kept small by calling other modules.

Modularity has several advantages. (1) It is much easier to debug small routines than large routines; (2) it is easier for several people to work on a modular program simultaneously; (3) a well-written modular program places certain dependencies in only one routine, making changes easier.

Page 9:

Abstract Data Types (ADTs)

An abstract data type (ADT) is a set of operations.

Abstract data types are mathematical abstractions; nowhere in an ADT’s definition is there any mention of how the set of operations is implemented.

Objects such as lists, sets, and graphs, along with their operations, can be viewed as abstract data types, just as integers, reals, and booleans are data types. Integers, reals, and booleans have operations associated with them, and so do ADTs.

Page 10:

Abstract Data Types (ADTs)

The basic idea is that the implementation of the operations related to ADTs is written once in the program, and any other part of the program that needs to perform an operation on the ADT can do so by calling the appropriate function.

If for some reason implementation details need to be changed, it should be easy to do so by merely changing the routines that perform the ADT operations.

There is no rule telling us which operations must be supported for each ADT; this is a design decision.

Page 11:

The List ADT

The form of a general list: A_1, A_2, A_3, …, A_N.
The size of this list is N.
An empty list is a special list of size 0.
For any list except the empty list, we say that A_{i+1} follows (or succeeds) A_i (i < N) and that A_{i-1} precedes A_i (i > 1).

The first element of the list is A_1, and the last element is A_N. We will not define the predecessor of A_1 or the successor of A_N.

The position of element A_i in a list is i.

Page 12:

The List ADT

There is a set of operations that we would like to perform on the list ADT:

PrintList
MakeEmpty
Find: return the position of the first occurrence of a key
Insert and Delete: insert and delete some key from some position in the list
FindKth: return the element in some position
Next and Previous: take a position as argument and return the position of the successor and predecessor

Page 13:

The List ADT

Example: The list is 34, 12, 52, 16, 13

Find(52) Insert(X, 3) Delete(52)

The interpretation of what is appropriate for a function is entirely up to the programmer.

Page 14:

Simple Array Implementation of Lists

All of these list operations can be implemented by using an array.

PrintList MakeEmpty Find Insert Delete Next Previous

Page 15:

Simple Array Implementation of Lists

Disadvantages:

An estimate of the maximum size of the list is required, even if the array is dynamically allocated. Usually this requires a high overestimate, which wastes considerable space.

Insertion and deletion are expensive. For example, inserting at position 0 requires first pushing the entire array down one spot to make room.

Because insertions and deletions are so slow and the list size must be known in advance, simple arrays are generally not used to implement lists.
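The cost of array insertion can be made concrete with a small sketch. It assumes a fixed capacity `MAX_SIZE`, 0-based positions, and `int` elements; the names `ArrayList` and `DeleteAt`, and the "elements moved" return value of `Insert`, are illustrative choices and do not come from the slides.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SIZE 100            /* assumed capacity, fixed in advance */

typedef struct {
    int    Elements[MAX_SIZE];
    size_t Size;                /* number of elements currently stored */
} ArrayList;

/* Return the 0-based index of the first occurrence of Key, or -1. */
int Find(const ArrayList *L, int Key)
{
    for (size_t i = 0; i < L->Size; i++)
        if (L->Elements[i] == Key)
            return (int)i;
    return -1;
}

/* Insert X at index Pos, shifting later elements one slot to the right.
   Returns the number of elements that had to be moved. */
size_t Insert(ArrayList *L, int X, size_t Pos)
{
    assert(L->Size < MAX_SIZE && Pos <= L->Size);
    size_t Moved = L->Size - Pos;
    for (size_t i = L->Size; i > Pos; i--)
        L->Elements[i] = L->Elements[i - 1];
    L->Elements[Pos] = X;
    L->Size++;
    return Moved;
}

/* Delete the element at index Pos, shifting later elements to the left. */
void DeleteAt(ArrayList *L, size_t Pos)
{
    assert(Pos < L->Size);
    for (size_t i = Pos; i + 1 < L->Size; i++)
        L->Elements[i] = L->Elements[i + 1];
    L->Size--;
}
```

Inserting at index 0 of a list holding N elements moves all N of them, which is the linear cost described above.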

Page 16:

Linked Lists

In order to avoid the linear cost of insertion and deletion, we need to ensure that the list is not stored contiguously, since otherwise entire parts of the list will need to be moved.

[Figure: a linked list of five cells A1, A2, A3, A4, A5]

The linked list consists of a series of structures, which are not necessarily adjacent in memory.

Each structure contains the element and a pointer to a structure containing its successor. We call this the Next pointer.

The last cell’s Next pointer points to NULL.

Page 17:

Linked Lists

If P is declared to be a pointer to a structure, then the value stored in P is interpreted as the location, in main memory, where a structure can be found.

A field of that structure can be accessed by P->FieldName, where FieldName is the name of the field we wish to examine.

In order to access this list, we need to know where the first cell can be found. A pointer variable can be used for this purpose.

Linked list with actual pointer values:

Address   Element   Next
1000      A1        800
800       A2        712
712       A3        992
992       A4        692
692       A5        0

Page 18:

Linked Lists

To execute PrintList(L) or Find(L, Key), we merely pass a pointer to the first element in the list and then traverse the list by following the Next pointers.

The Delete command can be executed in one pointer change.

[Figure: deletion from a linked list A1, A2, A3, A4, A5]

The Insert command requires obtaining a new cell from the system by using a malloc call and then executing two pointer maneuvers.

[Figure: insertion of a new cell X into a linked list A1, A2, A3, A4, A5]
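The pointer maneuvers described on this slide can be sketched as follows. This is a minimal illustration with `int` elements; the names `Node`, `InsertAfter`, and `DeleteAfter` are assumptions made for the example, not routines given in the slides.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Node {
    int          Element;
    struct Node *Next;        /* NULL marks the end of the list */
} Node;

/* Traverse from First; return the first cell holding Key, or NULL. */
Node *Find(Node *First, int Key)
{
    Node *P = First;
    while (P != NULL && P->Element != Key)
        P = P->Next;
    return P;
}

/* Insert X after cell P: one malloc call and two pointer assignments.
   Returns the new cell, or NULL if memory is exhausted. */
Node *InsertAfter(Node *P, int X)
{
    assert(P != NULL);
    Node *Cell = malloc(sizeof *Cell);
    if (Cell == NULL)
        return NULL;
    Cell->Element = X;
    Cell->Next = P->Next;     /* new cell points at P's old successor */
    P->Next = Cell;           /* P now points at the new cell */
    return Cell;
}

/* Delete the cell after P: one pointer change, then release the cell. */
void DeleteAfter(Node *P)
{
    assert(P != NULL);
    Node *Doomed = P->Next;
    if (Doomed == NULL)
        return;               /* nothing follows P */
    P->Next = Doomed->Next;   /* bypass the deleted cell */
    free(Doomed);
}
```

Both routines need the cell before the affected position, which is why inserting at or deleting from the front of a list without a header needs separate handling.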

Page 19:

Linked Lists

There are several places where you are likely to go wrong:

(1) There is no really obvious way to insert at the front of the list from the definitions given;
(2) Deleting from the front of the list is a special case, because it changes the start of the list; careless coding will lose the list;
(3) A third problem concerns deletion in general. Although the pointer moves above are simple, the deletion algorithm requires us to keep track of the cell before the one that we want to delete.

Page 20:

Linked Lists

One simple change solves all three problems. We will keep a sentinel node, referred to as a header or dummy node.

[Figure: linked list with a header; L points to the Header cell, which is followed by A1, A2, A3, A4, A5]

To avoid the problems associated with deletions, we need to write a routine FindPrevious, which will return the position of the predecessor of the cell we wish to delete. If we use a header, then if we wish to delete the first element in the list, FindPrevious will return the position of the header.
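A header-based list with FindPrevious might look like the sketch below. It assumes `int` elements; the exact signatures, and the convention that FindPrevious returns the last cell when the key is absent, are assumptions for this example rather than details stated on the slide.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Node {
    int          Element;
    struct Node *Next;
} Node;
typedef Node *List;       /* always points at the header cell */
typedef Node *Position;

/* Create an empty list: a header cell with no successor. */
List MakeEmpty(void)
{
    List L = malloc(sizeof *L);
    if (L != NULL)
        L->Next = NULL;
    return L;
}

/* Insert X after position P. Passing the header inserts at the front.
   Returns 1 on success, 0 if memory is exhausted. */
int Insert(int X, Position P)
{
    assert(P != NULL);
    Position Cell = malloc(sizeof *Cell);
    if (Cell == NULL)
        return 0;
    Cell->Element = X;
    Cell->Next = P->Next;
    P->Next = Cell;
    return 1;
}

/* Return the cell before the first occurrence of X.
   If X is absent, the returned cell is the last one (its Next is NULL). */
Position FindPrevious(int X, List L)
{
    Position P = L;
    while (P->Next != NULL && P->Next->Element != X)
        P = P->Next;
    return P;
}

/* Delete the first occurrence of X, if any. The header makes deleting
   the first element the same as deleting any other element. */
void Delete(int X, List L)
{
    Position P = FindPrevious(X, L);
    if (P->Next != NULL) {
        Position Doomed = P->Next;
        P->Next = Doomed->Next;
        free(Doomed);
    }
}
```

Because the header is never deleted, the variable holding the list never changes, so the three problems listed earlier do not arise.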

Page 21:

Doubly Linked Lists

Sometimes it is convenient to traverse lists backwards. The solution is simple. Merely add an extra field to the data structure, containing a pointer to the previous cell. The cost of this is an extra link, which adds to the space requirement and also doubles the cost of insertions and deletions because there are more pointers to fix.

[Figure: a doubly linked list of cells A1, A2, A3, A4, A5]

How to implement doubly linked lists?
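One possible answer is sketched below. It assumes a header cell whose Prev is NULL and `int` elements; the names `DNode`, `InsertAfter`, and `DeleteCell` are hypothetical and chosen only for this illustration.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct DNode {
    int           Element;
    struct DNode *Prev;       /* extra link to the previous cell */
    struct DNode *Next;
} DNode;

/* Insert X after cell P. Up to four pointers are set instead of two. */
DNode *InsertAfter(DNode *P, int X)
{
    assert(P != NULL);
    DNode *Cell = malloc(sizeof *Cell);
    if (Cell == NULL)
        return NULL;
    Cell->Element = X;
    Cell->Prev = P;
    Cell->Next = P->Next;
    if (P->Next != NULL)
        P->Next->Prev = Cell; /* old successor now points back at Cell */
    P->Next = Cell;
    return Cell;
}

/* Unlink and free cell P. No FindPrevious is needed, because P
   already knows its predecessor. P must not be the header. */
void DeleteCell(DNode *P)
{
    assert(P != NULL && P->Prev != NULL);
    P->Prev->Next = P->Next;
    if (P->Next != NULL)
        P->Next->Prev = P->Prev;
    free(P);
}
```

The extra pointer updates are the doubled cost mentioned above; in exchange, deletion no longer requires a search for the predecessor.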

Page 22:

Circularly Linked Lists

A popular convention is to have the last cell keep a pointer back to the first. This can be done with or without a header. If the header is present, the last cell points to it.

This can also be done with doubly linked lists: the first cell’s previous pointer points to the last cell.

[Figure: a double circularly linked list of cells A1, A2, A3, A4, A5]

Page 23:

Example – The Polynomial ADT

If most of the coefficients A_i are nonzero, we can use a simple array to store the coefficients.

Write code to calculate F(X) based on an array.

$F(X) = \sum_{i=0}^{N} A_i X^i$
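One way to approach this exercise is sketched below. It uses Horner's rule, which is a choice made for this example and is not prescribed by the slide; the function name `EvalPoly` and the use of `double` coefficients are likewise assumptions.

```c
#include <assert.h>

/* Evaluate F(X) = A[0] + A[1]*X + ... + A[N]*X^N, where A holds the
   N + 1 coefficients. Horner's rule rewrites the sum as
   (...((A[N]*X + A[N-1])*X + A[N-2])*X + ...) + A[0],
   which needs N multiplications and N additions. */
double EvalPoly(const double A[], int N, double X)
{
    assert(N >= 0);
    double Result = A[N];
    for (int i = N - 1; i >= 0; i--)
        Result = Result * X + A[i];
    return Result;
}
```

Every coefficient is visited once, including the zeros, which is why this representation suits dense polynomials.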

Page 24:

Example – The Polynomial ADT

If most of the coefficients A_i are zero, the implementation based on array is not efficient, since most of the time is spent in multiplying zeros.

$P_1(X) = 10X^{1000} + 5X^{14} + 1$

[Figure: linked list P1 with three cells holding (coefficient, exponent) pairs: (10, 1000), (5, 14), (1, 0)]

An alternative is to use a singly linked list. Each term in the polynomial is contained in one cell, and the cells are sorted in decreasing order of exponents.
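The linked representation can be sketched as follows. The names `Term`, `Power`, and `EvalSparse` are hypothetical, and computing powers by repeated squaring is an implementation choice for this example only.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Term {
    double       Coefficient;
    unsigned     Exponent;
    struct Term *Next;        /* next term, with a smaller exponent */
} Term;

/* Compute X^E by repeated squaring. */
static double Power(double X, unsigned E)
{
    double Result = 1.0;
    while (E > 0) {
        if (E & 1u)
            Result *= X;
        X *= X;
        E >>= 1;
    }
    return Result;
}

/* Evaluate the polynomial at X, visiting only the nonzero terms. */
double EvalSparse(const Term *P, double X)
{
    double Sum = 0.0;
    for (; P != NULL; P = P->Next) {
        /* cells must be sorted in decreasing order of exponents */
        assert(P->Next == NULL || P->Exponent > P->Next->Exponent);
        Sum += P->Coefficient * Power(X, P->Exponent);
    }
    return Sum;
}
```

For P1 above, the loop runs three times, whereas an array of 1001 coefficients would be scanned in full.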

Page 25:

Stack ADT

A stack is a list with the restriction that insertions and deletions can be performed in only one position, namely, the end of the list, called the top.

The fundamental operations on a stack are Push, which is equivalent to an insert, and Pop, which deletes the most recently inserted element.

The most recently inserted element can be examined prior to performing a Pop by use of the Top routine.

Page 26:

Stack ADT

A Pop or Top on an empty stack is generally considered an error in the stack ADT.

Running out of space when performing a Push is an implementation error but not an ADT error.

Stacks are sometimes known as LIFO (last in, first out) lists.

Page 27:

Stack ADT

[Figure: a stack holding the elements 6, 3, 2, 4, 9, 7, with a Top pointer]

Stack model: only the top element is accessible

Page 28:

Implementation of Stacks

Since a stack is a list, any list implementation will do.

We will give two popular implementations. One uses pointers and the other uses an array.

In either case, if we use good programming principles, the calling routines do not need to know which method is being used.

Page 29:

Linked List Implementation of Stacks

We perform a Push by inserting at the front of the list.

We perform a Pop by deleting the element at the front of the list.

A Top operation merely examines the element at the front of the list, returning its value.
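These three operations can be sketched on a singly linked list with a header cell. The use of a header, the `int` element type, and the exact signatures are assumptions made for this example.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Node {
    int          Element;
    struct Node *Next;
} Node;
typedef Node *Stack;          /* header cell; the top of the stack is S->Next */

/* Create an empty stack, or return NULL if memory is exhausted. */
Stack CreateStack(void)
{
    Stack S = malloc(sizeof *S);
    if (S != NULL)
        S->Next = NULL;
    return S;
}

int IsEmpty(Stack S)
{
    return S->Next == NULL;
}

/* Push: insert at the front of the list. Returns 0 if out of memory. */
int Push(int X, Stack S)
{
    Node *Cell = malloc(sizeof *Cell);
    if (Cell == NULL)
        return 0;
    Cell->Element = X;
    Cell->Next = S->Next;
    S->Next = Cell;
    return 1;
}

/* Top: examine the front element without removing it. */
int Top(Stack S)
{
    assert(!IsEmpty(S));      /* Top on an empty stack is an ADT error */
    return S->Next->Element;
}

/* Pop: delete the element at the front of the list. */
void Pop(Stack S)
{
    assert(!IsEmpty(S));      /* Pop on an empty stack is an ADT error */
    Node *First = S->Next;
    S->Next = First->Next;
    free(First);
}
```

Each operation touches only the front of the list, so all three run in constant time.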

Page 30:

Array Implementation of Stacks

A Stack is defined as a pointer to a structure. The structure contains the TopOfStack and Capacity fields. Once the maximum size is known, the stack array can be dynamically allocated.

Associated with each stack is TopOfStack, which is -1 for an empty stack (this is how an empty stack is initialized).

Page 31:

Array Implementation of Stacks

To push some element X onto the stack, we increment TopOfStack and then set Stack[TopOfStack]=X, where Stack is the array representing the actual stack.

To pop, we set the return value to Stack[TopOfStack] and then decrement TopOfStack.
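The array version can be sketched as below. The field names TopOfStack and Capacity follow the slides; the field name `Array`, the `int` element type, and the convention that Push returns 0 on a full stack are assumptions for this example.

```c
#include <assert.h>
#include <stdlib.h>

struct StackRecord {
    int  Capacity;
    int  TopOfStack;          /* -1 for an empty stack */
    int *Array;               /* allocated once the maximum size is known */
};
typedef struct StackRecord *Stack;

/* Allocate a stack for up to MaxElements items, or return NULL. */
Stack CreateStack(int MaxElements)
{
    assert(MaxElements > 0);
    Stack S = malloc(sizeof *S);
    if (S == NULL)
        return NULL;
    S->Array = malloc(sizeof *S->Array * (size_t)MaxElements);
    if (S->Array == NULL) {
        free(S);
        return NULL;
    }
    S->Capacity = MaxElements;
    S->TopOfStack = -1;
    return S;
}

int IsEmpty(Stack S) { return S->TopOfStack == -1; }
int IsFull(Stack S)  { return S->TopOfStack == S->Capacity - 1; }

/* Push: increment TopOfStack, then store X. Returns 0 if full. */
int Push(int X, Stack S)
{
    if (IsFull(S))
        return 0;             /* out of space: an implementation limit */
    S->Array[++S->TopOfStack] = X;
    return 1;
}

/* Pop: return the top element, then decrement TopOfStack. */
int Pop(Stack S)
{
    assert(!IsEmpty(S));      /* Pop on an empty stack is an ADT error */
    return S->Array[S->TopOfStack--];
}

void DisposeStack(Stack S)
{
    free(S->Array);
    free(S);
}
```

Callers use the same Push and Pop names as with a linked implementation, so they do not need to know which representation is in use.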

Page 32:

Example – Conversion of Numbers

There are many different number systems, such as the decimal, binary, hexadecimal, and octal systems.

Convert a decimal number to a binary number:

Decimal Number   Divisor   Quotient   Remainder
30               2         15         0
15               2         7          1
7                2         3          1
3                2         1          1
1                2         0          1

Reading the remainders from last to first gives the binary number 11110.
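The repeated division in the table produces the binary digits in reverse order, which is where a stack helps: each remainder is pushed, and popping yields the digits from most significant to least significant. The sketch below uses a small local array as the stack; the function name `ToBinary` and its output-buffer convention are assumptions for this example.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Write the binary representation of N into Out as a C string.
   Out must have room for sizeof(unsigned) * CHAR_BIT + 1 characters. */
void ToBinary(unsigned N, char *Out)
{
    char Stack[sizeof(unsigned) * CHAR_BIT];  /* one slot per possible bit */
    int  TopOfStack = -1;

    assert(Out != NULL);
    do {
        Stack[++TopOfStack] = (char)('0' + N % 2);  /* push the remainder */
        N /= 2;                                     /* keep the quotient */
    } while (N > 0);

    while (TopOfStack >= 0)
        *Out++ = Stack[TopOfStack--];               /* pop: last remainder first */
    *Out = '\0';
}
```

For N = 30 the remainders pushed are 0, 1, 1, 1, 1, matching the rows of the table.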

Function calls.

Page 33:

The Queue ADT

Like stacks, queues are lists.

With a queue, however, insertion is done at one end, whereas deletion is performed at the other end.

The basic operations on a queue are Enqueue, which inserts an element at the end of the list (called the rear), and Dequeue, which deletes (and returns) the element at the start of the list (known as the front).

Page 34:

Array Implementation of Queues

For each queue data structure, we keep an array, Queue[], and the positions Front and Rear, which represent the ends of the queue.

We also keep track of the number of elements that are actually in the queue, Size. All this information is part of one structure.

The following figure shows a queue in some intermediate state. The cells that are blanks have undefined values in them:

[Figure: queue array containing 5, 2, 7, 1, with Front and Rear marking the two ends of the queue]

Page 35:

Array Implementation of Queues

To Enqueue an element X, we increment Size and Rear, then set Queue[Rear]=X.

To Dequeue an element, we set the return value to Queue[Front], decrement Size, and then increment Front.

Whenever Front or Rear gets to the end of the array, it is wrapped around to the beginning. This is known as a circular array implementation.
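A circular array queue along these lines is sketched below. The small fixed `CAPACITY`, the helper `Succ`, and the choice to start with Rear one position behind Front are assumptions for this example; the slides do not state the initial values.

```c
#include <assert.h>

#define CAPACITY 4            /* assumed small so that wraparound is easy to see */

typedef struct {
    int Array[CAPACITY];
    int Front;                /* index of the element to dequeue next */
    int Rear;                 /* index of the most recently enqueued element */
    int Size;                 /* number of elements actually in the queue */
} Queue;

/* Advance an index by one, wrapping around to the beginning. */
static int Succ(int Index)
{
    return (Index + 1) % CAPACITY;
}

void MakeEmpty(Queue *Q)
{
    Q->Size = 0;
    Q->Front = 0;
    Q->Rear = CAPACITY - 1;   /* first Enqueue moves Rear to index 0 */
}

int IsEmpty(const Queue *Q) { return Q->Size == 0; }
int IsFull(const Queue *Q)  { return Q->Size == CAPACITY; }

/* Enqueue: increment Size and Rear, then store X at Rear. */
void Enqueue(int X, Queue *Q)
{
    assert(!IsFull(Q));
    Q->Size++;
    Q->Rear = Succ(Q->Rear);
    Q->Array[Q->Rear] = X;
}

/* Dequeue: take the value at Front, decrement Size, increment Front. */
int Dequeue(Queue *Q)
{
    assert(!IsEmpty(Q));
    int X = Q->Array[Q->Front];
    Q->Size--;
    Q->Front = Succ(Q->Front);
    return X;
}
```

Keeping Size separately is what distinguishes a full queue from an empty one, since Front and Rear have the same relative position in both cases.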

Page 36:

[Figure: a circular array queue traced through the following states, with Front and Rear marked in each: Initial State (the queue holds 2 and 4); Enqueue(1); Enqueue(3); Dequeue, which returns 2; Dequeue, which returns 4; Dequeue, which returns 1; Dequeue, which returns 3 and makes the Queue empty]

Page 37:

Linked List Implementation of Queues

[Figure: a linked list with a Header cell; Front and Rear pointers mark the two ends of the queue, and the last cell's Next pointer is NULL]

Page 38:

Linked List Implementation of Queues

[Figure: the Front and Rear pointers of a linked queue shown in four states: Empty Queue; Enqueue x; Enqueue y; Dequeue x]
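The states in the figure can be reproduced with the sketch below. It assumes a header cell that Front always points to, with Rear equal to Front when the queue is empty; the names `InitQueue` and `Node` and the `int` element type are hypothetical choices for this example.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Node {
    int          Element;
    struct Node *Next;
} Node;

typedef struct {
    Node *Front;              /* header cell; the first element is Front->Next */
    Node *Rear;               /* last cell; equals Front when the queue is empty */
} Queue;

/* Set up an empty queue. Returns 0 if memory is exhausted. */
int InitQueue(Queue *Q)
{
    Q->Front = Q->Rear = malloc(sizeof(Node));
    if (Q->Front == NULL)
        return 0;
    Q->Front->Next = NULL;
    return 1;
}

int IsEmpty(const Queue *Q)
{
    return Q->Front == Q->Rear;
}

/* Enqueue: append a new cell after Rear. Returns 0 if out of memory. */
int Enqueue(int X, Queue *Q)
{
    Node *Cell = malloc(sizeof *Cell);
    if (Cell == NULL)
        return 0;
    Cell->Element = X;
    Cell->Next = NULL;
    Q->Rear->Next = Cell;
    Q->Rear = Cell;
    return 1;
}

/* Dequeue: remove and return the element just after the header. */
int Dequeue(Queue *Q)
{
    assert(!IsEmpty(Q));
    Node *First = Q->Front->Next;
    int X = First->Element;
    Q->Front->Next = First->Next;
    if (Q->Rear == First)
        Q->Rear = Q->Front;   /* the queue has become empty */
    free(First);
    return X;
}
```

Both operations run in constant time, and unlike the array version there is no capacity to choose in advance.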

Page 39:

Example Applications

When jobs are submitted to a printer, they are arranged in order of arrival.

Every real-life line is a queue. For instance, lines at ticket counters are queues, because service is first-come first-served.

A whole branch of mathematics, known as queueing theory, deals with computing, probabilistically, how long users expect to wait on a line, how long the line gets, and other such questions.

Page 40:

Homework of Chapter 3

Exercises: 3.2 (You don’t need to analyze the running time.), 3.3, 3.4, 3.5, 3.21, 3.25