34
September 15, 20 03 1 Algorithms and Data Structures Simonas Šaltenis Aalborg University [email protected]

September 15, 20031 Algorithms and Data Structures Simonas Šaltenis Aalborg University [email protected]

Embed Size (px)

Citation preview

Page 1: September 15, 20031 Algorithms and Data Structures Simonas Šaltenis Aalborg University simas@cs.auc.dk

September 15, 2003

1

Algorithms and Data Structures

Simonas ŠaltenisAalborg [email protected]

Page 2: September 15, 20031 Algorithms and Data Structures Simonas Šaltenis Aalborg University simas@cs.auc.dk

September 15, 2003

2

Administration

People Simonas Šaltenis Xuegang Huang (Harry) Kim R. Bille hjælpelærer Martin G. Thomsen

Home page http://www.cs.auc.dk/~simas/ad03 Check the homepage frequently! Course book “ Introduction to Algorithms”,

2.ed Cormen et al. Lectures, B3-104, 10:15-12:00, Thursdays and

14:30-16:15 Mondays

Page 3: September 15, 20031 Algorithms and Data Structures Simonas Šaltenis Aalborg University simas@cs.auc.dk

September 15, 2003

3

Administration (2)

Exercise classes at 8:15 (or 12:30) before the lectures!

Exam: SE course, written Troubles

Simonas Šaltenis E1-215b [email protected]

Page 4: September 15, 20031 Algorithms and Data Structures Simonas Šaltenis Aalborg University simas@cs.auc.dk

September 15, 2003

4

Exercise classes

Exercise classes are the most important part of this course! Understanding the textbook and lectures is not

enough! One (or two) exercise will be a hand-in

exercise: Each group writes an answer on the paper,

which I check and discuss during the next exercise class

Half of the exam will be very similar to hand-in exercises

Page 5: September 15, 20031 Algorithms and Data Structures Simonas Šaltenis Aalborg University simas@cs.auc.dk

September 15, 2003

5

Grouprooms

DAT1: E1-113, E2-107, E2-115, E1-209, E4-212, E3-

116 SW3:

E3-104, E3-106, E3-108, E3-110 D5:

C4-219, C4-221, C1-201, C2-203, C1-205

Correct?

Page 6: September 15, 20031 Algorithms and Data Structures Simonas Šaltenis Aalborg University simas@cs.auc.dk

September 15, 2003

6

Prerequisites

DAT1/SW3 (courses from basis): Discrete mathematics Programmering i C

D5: Algoritmer og Datastrukturer (on Basis) Mathematics 2 (on D4)

Textbook: K.H.Rosen. “Discrete Mathematics and Its Applications”

Page 7: September 15, 20031 Algorithms and Data Structures Simonas Šaltenis Aalborg University simas@cs.auc.dk

September 15, 2003

7

What is it all about?

Solving problems Get me from home to work Balance my budget Simulate a jet engine Graduate from AAU

To solve problems we have procedures, recipes, process descriptions – in one word Algorithms

Page 8: September 15, 20031 Algorithms and Data Structures Simonas Šaltenis Aalborg University simas@cs.auc.dk

September 15, 2003

8

History

Name: Persian mathematician Mohammed al-Khowarizmi, in Latin became Algorismus

First algorithm: Euclidean Algorithm, greatest common divisor, 400-300 B.C.

19th century – Charles Babbage, Ada Lovelace.

20th century – Alan Turing, Alonzo Church, John von Neumann

Page 9: September 15, 20031 Algorithms and Data Structures Simonas Šaltenis Aalborg University simas@cs.auc.dk

September 15, 2003

9

Data Structures and Algorithms

Algorithm Outline, the essence of a computational

procedure, step-by-step instructions Program – an implementation of an

algorithm in some programming language

Data structure Organization of data needed to solve

the problem

Page 10: September 15, 20031 Algorithms and Data Structures Simonas Šaltenis Aalborg University simas@cs.auc.dk

September 15, 2003

10

Overall Picture

Data Structure and Algorithm Design Goals

Implementation Goals

Correctness EfficiencyRobustness

Adaptability

Reusability

Using a computer to help solve problems

Designing programsarchitecture algorithms

Writing programsVerifying (Testing) programs

Page 11: September 15, 20031 Algorithms and Data Structures Simonas Šaltenis Aalborg University simas@cs.auc.dk

September 15, 2003

11

Overall Picture (2)

This course is not about: Programming languages Computer architecture Software architecture Software design and implementation

principles Issues concerning small and large scale

programming We will only touch upon the theory of

complexity and computability

Page 12: September 15, 20031 Algorithms and Data Structures Simonas Šaltenis Aalborg University simas@cs.auc.dk

September 15, 2003

12

Algorithmic problem

Infinite number of input instances satisfying the specification. For example:

A sorted, non-decreasing sequence of natural numbers. The sequence is of non-zero, finite length:

1, 20, 908, 909, 100000, 1000000000. 3.

Specification of input

?Specification of output as a function of input

Page 13: September 15, 20031 Algorithms and Data Structures Simonas Šaltenis Aalborg University simas@cs.auc.dk

September 15, 2003

13

Algorithmic Solution

Algorithm describes actions on the input instance

There may be many correct algorithms for the same algorithmic problem

Input instance, adhering to the specification

Algorithm Output related to the input as required

Page 14: September 15, 20031 Algorithms and Data Structures Simonas Šaltenis Aalborg University simas@cs.auc.dk

September 15, 2003

14

Definition of an Algorithm

An algorithm is a sequence of unambiguous instructions for solving a problem, i.e., for obtaining a required output for any legitimate input in a finite amount of time.

Properties: Precision Determinism Finiteness Correctness Generality

Page 15: September 15, 20031 Algorithms and Data Structures Simonas Šaltenis Aalborg University simas@cs.auc.dk

September 15, 2003

15

Example 1: Searching

INPUT• sorted non-descending sequence of n (n >0) numbers (database)• a single number (query)

a1, a2, a3,….,an; q j

OUTPUT• an index of the found number or NIL

2 5 4 10 11; 5 2

2 5 4 10 11; 9 NIL

Page 16: September 15, 20031 Algorithms and Data Structures Simonas Šaltenis Aalborg University simas@cs.auc.dk

September 15, 2003

16

Searching (2)

INPUT: A[1..n] – an array of integers, q – an integer. OUTPUT: an index j such that A[j] = q. NIL, if j (1jn): A[j] q j1while j n and A[j] q do j++if j n then return jelse return NIL

INPUT: A[1..n] – an array of integers, q – an integer. OUTPUT: an index j such that A[j] = q. NIL, if j (1jn): A[j] q j1while j n and A[j] q do j++if j n then return jelse return NIL

The algorithm uses a brute-force algorithm design technique – scans the input sequentially.

The code is written in an unambiguous pseudocode and INPUT and OUTPUT of the algorithm are clearly specified

Page 17: September 15, 20031 Algorithms and Data Structures Simonas Šaltenis Aalborg University simas@cs.auc.dk

September 15, 2003

17

Pseudo-code

A la Pascal, C, Java or any other imperative language: Control structures (if then else, while

and for loops) Assignment () Array element access: A[i] Composite type (record or object) element

access: A.b (in CLRS, b[A]) Variable representing an array or an object

is treated as a pointer to the array or the object.

Page 18: September 15, 20031 Algorithms and Data Structures Simonas Šaltenis Aalborg University simas@cs.auc.dk

September 15, 2003

18

Preconditions, Postconditions

It is important to specify the preconditions and the post conditions of algorithms: INPUT: precise specifications of what the

algorithm gets as an input OUTPUT: precise specifications of what the

algorithm produces as an output, and how this relates to the input. The handling of special cases of the input should be described

Page 19: September 15, 20031 Algorithms and Data Structures Simonas Šaltenis Aalborg University simas@cs.auc.dk

September 15, 2003

19

Sort

Example 2: SortingINPUTsequence of n numbers

a1, a2, a3,….,anb1,b2,b3,….,bn

OUTPUTa permutation of the input sequence of numbers

2 5 4 10 7 2 4 5 7 10

Correctness (requirements for the output)For any given input the algorithm halts with the output:

• b1 < b2 < b3 < …. < bn

• b1, b2, b3, …., bn is a permutation of a1, a2, a3,….,an

Correctness (requirements for the output)For any given input the algorithm halts with the output:

• b1 < b2 < b3 < …. < bn

• b1, b2, b3, …., bn is a permutation of a1, a2, a3,….,an

Page 20: September 15, 20031 Algorithms and Data Structures Simonas Šaltenis Aalborg University simas@cs.auc.dk

September 15, 2003

20

Insertion Sort

A1 nj

3 6 84 9 7 2 5 1

i

Strategy

• Start “empty handed”• Insert a card in the right position of the already sorted hand• Continue until all cards are inserted/sorted

Strategy

• Start “empty handed”• Insert a card in the right position of the already sorted hand• Continue until all cards are inserted/sorted

INPUT: A[1..n] – an array of integersOUTPUT: a permutation of A such that A[1]A[2]…A[n] for j2 to n do keyA[j] Insert A[j] into the sorted sequence A[1..j-1] ij-1 while i>0 and A[i]>key do A[i+1]A[i] i-- A[i+1]key

INPUT: A[1..n] – an array of integersOUTPUT: a permutation of A such that A[1]A[2]…A[n] for j2 to n do keyA[j] Insert A[j] into the sorted sequence A[1..j-1] ij-1 while i>0 and A[i]>key do A[i+1]A[i] i-- A[i+1]key

Page 21: September 15, 20031 Algorithms and Data Structures Simonas Šaltenis Aalborg University simas@cs.auc.dk

September 15, 2003

21

Analysis of Algorithms

Efficiency: Running time Space used

Efficiency as a function of input size: Number of data elements (numbers,

points) A number of bits in an input number

Page 22: September 15, 20031 Algorithms and Data Structures Simonas Šaltenis Aalborg University simas@cs.auc.dk

September 15, 2003

22

The RAM model

Very important to choose the level of detail.

The RAM model: Instructions (each taking constant time), we

usually choose one type of instruction as a characteristic operation that is counted:

Arithmetic (add, subtract, multiply, etc.) Data movement (assign) Control (branch, subroutine call, return) Comparison

Data types – integers, characters, and floats

Page 23: September 15, 20031 Algorithms and Data Structures Simonas Šaltenis Aalborg University simas@cs.auc.dk

September 15, 2003

23

Analysis of Insertion Sort

for j2 to n do keyA[j] Insert A[j] into the sorted sequence A[1..j-1] ij-1 while i>0 and A[i]>key do A[i+1]A[i] i-- A[i+1]:=key

costc1

c2

0 c3

c4

c5

c6

c7

timesnn-1n-1

n-1

n-1

2

n

jjt

2( 1)

n

jjt

2( 1)

n

jjt

Time to compute the running time as a function of the input size

Page 24: September 15, 20031 Algorithms and Data Structures Simonas Šaltenis Aalborg University simas@cs.auc.dk

September 15, 2003

24

Best/Worst/Average Case

Best case: elements already sorted tj=1, running time = f(n), i.e., linear time.

Worst case: elements are sorted in inverse order tj=j, running time = f(n2), i.e., quadratic time

Average case: tj=j/2, running time = f(n2), i.e., quadratic time

Page 25: September 15, 20031 Algorithms and Data Structures Simonas Šaltenis Aalborg University simas@cs.auc.dk

September 15, 2003

25

Best/Worst/Average Case (2)

For a specific size of input n, investigate running times for different input instances:

1n

2n

3n

4n

5n

6n

Page 26: September 15, 20031 Algorithms and Data Structures Simonas Šaltenis Aalborg University simas@cs.auc.dk

September 15, 2003

26

Best/Worst/Average Case (3)

For inputs of all sizes:

1n

2n

3n

4n

5n

6n

Input instance size

Run

ning

tim

e

1 2 3 4 5 6 7 8 9 10 11 12 …..

best-case

average-case

worst-case

Page 27: September 15, 20031 Algorithms and Data Structures Simonas Šaltenis Aalborg University simas@cs.auc.dk

September 15, 2003

27

Best/Worst/Average Case (4)

Worst case is usually used: It is an upper-bound and in certain

application domains (e.g., air traffic control, surgery) knowing the worst-case time complexity is of crucial importance

For some algorithms worst case occurs fairly often

The average case is often as bad as the worst case

Finding the average case can be very difficult

Page 28: September 15, 20031 Algorithms and Data Structures Simonas Šaltenis Aalborg University simas@cs.auc.dk

September 15, 2003

28

That’s it?

Is insertion sort the best approach to sorting?

Alternative strategy based on divide and conquer technique for algorithm design

MergeSort sorting the numbers <4, 1, 3, 9> is split into sorting <4, 1> and <3, 9> and merging the results Running time f(n log n)

Page 29: September 15, 20031 Algorithms and Data Structures Simonas Šaltenis Aalborg University simas@cs.auc.dk

September 15, 2003

29

Analysis of Searching

INPUT: A[1..n] – an array of integers, q – an integer. OUTPUT: an index j such that A[j] = q. NIL, if j (1jn): A[j] q j1while j n and A[j] q do j++if j n then return jelse return NIL

INPUT: A[1..n] – an array of integers, q – an integer. OUTPUT: an index j such that A[j] = q. NIL, if j (1jn): A[j] q j1while j n and A[j] q do j++if j n then return jelse return NIL

Worst-case running time: f(n) Average-case: f(n/2)

Page 30: September 15, 20031 Algorithms and Data Structures Simonas Šaltenis Aalborg University simas@cs.auc.dk

September 15, 2003

30

Binary search

INPUT: A[1..n] – a sorted (non-decreasing) array of integers, q – an integer. OUTPUT: an index j such that A[j] = q. NIL, if j (1jn): A[j] q left1rightndo

j(left+right)/2if A[j]=q then return jelse if A[j]>q then rightj-1else left=j+1

while left<=rightreturn NIL

INPUT: A[1..n] – a sorted (non-decreasing) array of integers, q – an integer. OUTPUT: an index j such that A[j] = q. NIL, if j (1jn): A[j] q left1rightndo

j(left+right)/2if A[j]=q then return jelse if A[j]>q then rightj-1else left=j+1

while left<=rightreturn NIL

Idea: Divide and conquer, one of the key design techniques

Page 31: September 15, 20031 Algorithms and Data Structures Simonas Šaltenis Aalborg University simas@cs.auc.dk

September 15, 2003

31

Binary search – analysis

How many times the loop is executed: With each execution the difference

between left and right is cut in half Initially the difference is n The loop stops when the difference becomes

0 How many times do you have to cut n in

half to get 1? lg n – better than the brute-force

approach (n)

Page 32: September 15, 20031 Algorithms and Data Structures Simonas Šaltenis Aalborg University simas@cs.auc.dk

September 15, 2003

32

The Goals of this Course

The main things that we will try to learn in this course: To be able to think “algorithmically”, to get

the spirit of how algorithms are designed To get to know a toolbox of classical

algorithms To learn a number of algorithm design

techniques (such as divide-and-conquer) To learn reason (in a formal way) about the

efficiency and the correctness of algorithms

Page 33: September 15, 20031 Algorithms and Data Structures Simonas Šaltenis Aalborg University simas@cs.auc.dk

September 15, 2003

33

Syllabus

Introduction (1) Correctness, analysis of algorithms (2,3,4) Sorting (1,6,7) Elementary data structures, ADTs (10) Searching, advanced data structures

(11,12,13,18) Dynamic programming (15) Graph algorithms (22,23,24) NP-Completeness (34)

Page 34: September 15, 20031 Algorithms and Data Structures Simonas Šaltenis Aalborg University simas@cs.auc.dk

September 15, 2003

34

Next Week

Correctness of algorithms Asymptotic analysis, big O notation Some basic math revisited