Data Structures and Algorithms Prof. Adriano Patrick Cunha

Program

Introduction to Algorithms.

Designing and Analyzing Algorithms.

Recursive technique.

Lists.

Trees.

Priority Queues.

“I will, in fact, claim that the difference between a bad programmer and a good one is whether he considers his code or his data structures more important. Bad programmers worry about the code. Good programmers worry about data structures and their relationships.”

Linus Torvalds

Problems solved by algorithms and data structures

The Human Genome Project

The Internet enables people all around the world to quickly access and retrieve large amounts of information.

Electronic commerce enables goods and services to be negotiated and exchanged electronically, and it depends on the privacy of personal information such as credit card numbers, passwords, and bank statements.

Manufacturing and other commercial enterprises often need to allocate scarce resources in the most beneficial way.

We are given a road map on which the distance between each pair of adjacent intersections is marked, and we wish to determine the shortest route from one intersection to another.

We are given two ordered sequences of symbols, X = (x1, x2, ..., xm) and Y = (y1, y2, ..., yn), and we wish to find a longest common subsequence of X and Y.

We are given a mechanical design in terms of a library of parts, where each part may include instances of other parts, and we need to list the parts in order so that each part appears before any part that uses it.

We are given n points in the plane, and we wish to find the convex hull of these points. The convex hull is the smallest convex polygon containing the points.

Algorithms

Informally, an algorithm is any well-defined computational procedure that takes some value, or set of values, as input and produces some value, or set of values, as output. An algorithm is thus a sequence of computational steps that transform the input into the output.

We can also view an algorithm as a tool for solving a well-specified computational problem. The statement of the problem specifies in general terms the desired input/output relationship. The algorithm describes a specific computational procedure for achieving that input/output relationship.

Problem: Sorting

Input: a sequence 〈a1, a2, ..., an〉 of n numbers

Output: a permutation 〈a'1, a'2, ..., a'n〉 of the input such that a'1 <= a'2 <= ... <= a'n

E.g.

Input: 〈31, 41, 59, 26, 41, 58〉

Output: 〈26, 31, 41, 41, 58, 59〉

Solution:

Insertion-Sort(A, n) // Sort A[1..n]
    for j <- 2 to n do
        key <- A[j];
        i <- j - 1;
        while (i > 0 and A[i] > key) do
            A[i + 1] <- A[i];
            i <- i - 1;
        A[i + 1] <- key;

It works the way many people sort a hand of playing cards:

1. The cards start on the table.
2. Pick up one card at a time.
3. Insert it into the left hand, in the correct position.
   a. Find the correct position by comparing from right to left.

E.g. 〈5, 2, 4, 6, 1, 3〉
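The card-sorting procedure can be sketched in Python as follows. This is a minimal illustration, not the author's code: it uses 0-based indexing (the pseudocode is 1-based), decrements `i` on every pass of the inner loop, and the name `insertion_sort` and the returned list of snapshots are assumptions made for this example.

```python
def insertion_sort(a):
    """Sort list a in place; return a snapshot of a after each insertion."""
    snapshots = []
    for j in range(1, len(a)):
        key = a[j]              # the card just picked up
        i = j - 1
        # Scan the sorted prefix from right to left, shifting larger items.
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key          # drop the card into its position
        snapshots.append(list(a))
    return snapshots


hand = [5, 2, 4, 6, 1, 3]
for step in insertion_sort(hand):
    print(step)
```

Each printed line shows the hand after one more card has been inserted; the final line is the fully sorted sequence.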

Is it correct?

Is it the best?

What is Correct?

An algorithm is said to be correct if, for every input instance, it halts with the correct output. We say that a correct algorithm solves the given computational problem.

An incorrect algorithm might not halt at all on some input instances, or it might halt with an incorrect answer.

Contrary to what you might expect, incorrect algorithms can sometimes be useful, if we can control their error rate.

Loop invariant

A property that holds at the start of every iteration of a loop

Helps us understand why an algorithm is correct

Loop Invariant

Three properties must be shown:

Initialization: the invariant is true before the first loop iteration

Maintenance: if the invariant is true before an iteration of the loop, it will remain true before the next loop iteration

Termination: when the loop ends, the invariant gives us a useful property that helps to show that the algorithm is correct
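For insertion sort, a suitable invariant is that the prefix to the left of position j always holds the elements originally in that prefix, in sorted order. The sketch below checks this with assertions while sorting. It is an illustration under assumptions: the function name `insertion_sort_checked` is hypothetical, indexing is 0-based, and the checks are runtime tests rather than a proof.

```python
def insertion_sort_checked(a):
    """Insertion sort that asserts the loop invariant on every outer iteration."""
    original = list(a)
    for j in range(1, len(a)):
        # Initialization / maintenance: before this iteration, a[0..j-1]
        # contains the first j original elements, in sorted order.
        assert a[:j] == sorted(original[:j])
        key = a[j]
        i = j - 1
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key
    # Termination: the loop has ended, so the invariant now covers the
    # whole list, which means the list is sorted.
    assert a == sorted(original)
    return a


print(insertion_sort_checked([31, 41, 59, 26, 41, 58]))
```

If any assertion failed, the invariant (and therefore the argument for correctness) would be broken at that iteration.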

What does it mean to be the best?

Insertion sort

- Number of statements executed: c1 n^2
- Computer A: 1 billion (10^9) instructions per second
- Great programmer: 2n^2 instructions

Merge sort

- Number of statements executed: c2 n lg n
- Computer B: 10 million (10^7) instructions per second
- Average programmer: 50 n lg n instructions

How long does each take to sort 1 million (10^6) elements?
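The question can be answered by dividing each instruction count by the speed of the corresponding machine. The short calculation below uses only the figures given on the slide; the variable names are illustrative.

```python
import math

n = 10**6  # elements to sort

# Computer A: insertion sort, 2 * n^2 instructions at 10^9 instructions/second.
time_a = 2 * n**2 / 10**9

# Computer B: merge sort, 50 * n * lg(n) instructions at 10^7 instructions/second.
time_b = 50 * n * math.log2(n) / 10**7

print(f"Computer A (insertion sort): {time_a:.0f} s")
print(f"Computer B (merge sort):     {time_b:.0f} s")
```

Computer A needs 2000 seconds (about 33 minutes), while Computer B needs roughly 100 seconds, even though it is 100 times slower and runs code with a larger constant factor.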

The traveling salesman problem

A traveling salesman has to visit a certain number of cities, and each trip between two cities has a cost. What is the cheapest route that visits each city exactly once and returns to the city where it started? The optimal solution to this problem is a Hamiltonian circuit of minimum total cost.

A Hamiltonian (or Hamilton) circuit is a path that begins and ends at the same vertex and passes through every vertex exactly once (except the starting vertex, which is also the last).

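[Figure: weighted graph with five cities (vertices A, B, C, D, E); each edge is labeled with the distance in km between the two cities it joins.]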

A -> E -> C -> D -> B -> A (56 km)
B -> C -> E -> A -> D -> B (49 km)
C -> E -> A -> D -> B -> C (49 km)
D -> A -> E -> C -> B -> D (49 km)
E -> A -> D -> C -> B -> E (44 km)

n     Alternatives (~n!)       Time
5     120                      0.00012 s
10    3628800                  3.62880 s
12    479001600                8 min
15    1307674368000            15 days
20    2432902008176640000      77,147 years
50    3.04E+64                 ∞
100   9.33E+157                ∞
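The growth in the table comes from trying every possible ordering of the cities. A brute-force search can be sketched as below. The distance matrix is hypothetical (it is not the graph from the figure), the function names are illustrative, and the rate of 10^6 alternatives per second is an assumption chosen because it is consistent with the times listed for small n.

```python
from itertools import permutations
from math import factorial


def brute_force_tsp(dist):
    """Return (cost, tour) of a cheapest Hamiltonian circuit starting at city 0.

    dist is a symmetric matrix of pairwise costs (hypothetical data).
    """
    n = len(dist)
    best_cost, best_tour = float("inf"), None
    # Fix city 0 as the start and try every ordering of the other cities.
    for perm in permutations(range(1, n)):
        tour = (0,) + perm + (0,)
        cost = sum(dist[tour[k]][tour[k + 1]] for k in range(n))
        if cost < best_cost:
            best_cost, best_tour = cost, tour
    return best_cost, best_tour


def seconds_to_enumerate(n, rate=10**6):
    """Time to examine n! alternatives at an assumed rate per second."""
    return factorial(n) / rate


# Hypothetical 4-city instance, not the graph from the figure.
example = [
    [0, 1, 4, 3],
    [1, 0, 2, 5],
    [4, 2, 0, 1],
    [3, 5, 1, 0],
]
print(brute_force_tsp(example))
print(seconds_to_enumerate(12) / 60, "minutes for n = 12")
```

The search is exact but only practical for a handful of cities, which is the point the table makes.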

What does it mean to be the best?

Running Time

- Depends on the input itself (e.g. an already sorted sequence)
- Depends on the input size (6 elements vs 6x10^9 elements)
  - Parametrize the running time by the input size
- We want upper bounds
  - They are a guarantee to the user

Analysis of Algorithms

Theoretical study of computer-program performance and resource usage.

What's more important than performance?

Why study algorithms and performance?

- Performance often determines whether something is feasible or infeasible
- Performance enables other qualities, such as usability and security
- Speed is fun!

Kinds of Analysis

- Worst case (usually)
  - T(n) = maximum time on any input of size n
- Average case (sometimes)
  - T(n) = expected time over all inputs of size n
  - Needs an assumption about the statistical distribution of inputs
- Best case (bogus)
  - Cheating: a slow algorithm can still run fast on some particular input
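The three kinds of analysis can be compared empirically on insertion sort by counting how many elements it shifts. The sketch below is an illustration under assumptions: the helper `count_shifts` is hypothetical, shifts stand in for running time, and the average case assumes uniformly random permutations generated with a fixed seed.

```python
import random


def count_shifts(a):
    """Run insertion sort on a copy of a; return how many elements were shifted."""
    a = list(a)
    shifts = 0
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
            shifts += 1
        a[i + 1] = key
    return shifts


n = 100
best = count_shifts(range(n))            # already sorted input
worst = count_shifts(range(n, 0, -1))    # reverse-ordered input
rng = random.Random(0)                   # fixed seed for repeatability
trials = [count_shifts(rng.sample(range(n), n)) for _ in range(200)]
average = sum(trials) / len(trials)

print("best:", best, "worst:", worst, "average:", average)
```

The best case costs nothing, which is why it says little about the algorithm. The worst case is n(n - 1)/2 shifts, and the average over random inputs lands near half of that.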

Asymptotic analysis

Ignore machine-dependent constants

Look at the GROWTH of T(n) as n -> infinity

O Notation

- Drop low-order terms
- Ignore leading constants
- E.g. 3n^3 + 90n^2 - 5n + 6046 = O(n^3)
- For sufficiently large n, an algorithm whose running time grows as n^2 always beats one whose running time grows as n^3, whatever the constant factors.
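Both points can be checked numerically. In the sketch below, `f` is the example polynomial from the slide, and `crossover` is a hypothetical helper that finds where a quadratic cost with a large constant factor (1000 by default, an arbitrary choice) becomes cheaper than a cubic cost with constant factor 1.

```python
def f(n):
    """The example polynomial 3n^3 + 90n^2 - 5n + 6046."""
    return 3 * n**3 + 90 * n**2 - 5 * n + 6046


for n in (10, 1_000, 1_000_000):
    # The ratio f(n) / n^3 approaches the leading constant 3 as n grows,
    # so the low-order terms stop mattering.
    print(n, f(n) / n**3)


def crossover(quadratic_constant=1000):
    """Smallest n at which quadratic_constant * n^2 is cheaper than n^3."""
    n = 1
    while quadratic_constant * n**2 >= n**3:
        n += 1
    return n


print(crossover())
```

Raising the constant factor only moves the crossover point further out; it never removes it.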

Insertion Sort - Asymptotic analysis

Insertion-Sort(A, n) // Sort A[1..n]                               cost   times
    for j <- 2 to n do                                             c1     n
        key <- A[j];                                               c2     n - 1
        // Insert A[j] into the sorted sequence A[1 .. j - 1]      0      0
        i <- j - 1;                                                c3     n - 1
        while (i > 0 and A[i] > key) do                            c4     $\sum_{j=2}^{n} t_j$
            A[i + 1] <- A[i];                                      c5     $\sum_{j=2}^{n} (t_j - 1)$
            i <- i - 1;                                            c6     $\sum_{j=2}^{n} (t_j - 1)$
        A[i + 1] <- key;                                           c7     n - 1

Here $t_j$ is the number of times the while-loop test is executed for that value of $j$.

$$T(n) = c_1 n + c_2 (n-1) + c_3 (n-1) + c_4 \sum_{j=2}^{n} t_j + c_5 \sum_{j=2}^{n} (t_j - 1) + c_6 \sum_{j=2}^{n} (t_j - 1) + c_7 (n-1)$$

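Insertion Sort - Asymptotic analysis

Worst case (input in reverse sorted order)

$t_j = j$ for $j = 2, 3, \ldots, n$

$$\sum_{j=2}^{n} t_j = \sum_{j=2}^{n} j = \left(\sum_{k=1}^{n} k\right) - 1 = \frac{n(n+1)}{2} - 1$$

$$\sum_{j=2}^{n} (t_j - 1) = \sum_{j=2}^{n} (j - 1) = \sum_{k=1}^{n-1} k = \frac{n(n-1)}{2}$$

$$T(n) = c_1 n + c_2 (n-1) + c_3 (n-1) + c_4 \left[\frac{n(n+1)}{2} - 1\right] + c_5 \left[\frac{n(n-1)}{2}\right] + c_6 \left[\frac{n(n-1)}{2}\right] + c_7 (n-1)$$

Collecting terms by power of $n$:

$$T(n) = \left(\frac{c_4 + c_5 + c_6}{2}\right) n^2 + \left(c_1 + c_2 + c_3 + \frac{c_4 - c_5 - c_6}{2} + c_7\right) n - (c_2 + c_3 + c_4 + c_7)$$

Therefore, $T(n) = O(n^2)$.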
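The worst-case sums can be checked by instrumenting insertion sort to count how often the while-loop test runs for each j. The helper `while_test_counts` below is hypothetical and uses 0-based indexing, so its first entry corresponds to j = 2 in the analysis.

```python
def while_test_counts(a):
    """Return [t_2, ..., t_n]: how often the while-loop test runs for each j."""
    a = list(a)
    counts = []
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        tests = 1  # the final test, which fails and ends the loop
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
            tests += 1  # one successful test per shifted element
        a[i + 1] = key
        counts.append(tests)
    return counts


n = 8
t = while_test_counts(range(n, 0, -1))          # reverse-ordered input
print(t)                                        # t_j for j = 2..n
print(sum(t), n * (n + 1) // 2 - 1)             # both sides of the first sum
print(sum(x - 1 for x in t), n * (n - 1) // 2)  # both sides of the second sum
```

For n = 8 the counts are 2 through 8, and the two sums are 35 and 28, matching n(n + 1)/2 - 1 and n(n - 1)/2.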

Recursive Technique

To be continued ...