CSE 202 - Algorithms

9/26/2002 CSE 202 - Intro

Fall 2002

Instructor: Larry [email protected]

Office hours (4101 AP&M) Tu & Th 3:30 – 5:00

(or whenever)

TA: John-Paul Fryckman

[email protected]

What we’ll study

General Techniques

Divide and Conquer, Dynamic Programming, Hashing, Greedy Algorithms, Reduction to other problems, ...

Specific Problems

Sorting, sorting, shortest paths, max flow, sorting, ...

Various Paradigms

Probabilistic algorithms

Alternate models of computation

NP Completeness

Sounds like my undergrad course ...

• Going over same material twice is good!

• We’ll probably go deeper

– mathematical formalisms

– modified assumptions

– assorted topics every computer scientist should know.

Logistics

• Textbook: Introduction to Algorithms, 2nd edition

– Cormen, Leiserson, Rivest & Stein

– First edition probably OK (new chapter K ~ old chapter (K+1).)

• Written work

– Individual homeworks

– Group homeworks

– In-class Midterm

– Final (oral? take-home?)

• Grades

– A: Demonstrate mastery of material.

– B (or B+ or B-): Typical grade. Understand most of material, solve most routine problems and some hard ones.

– C: Really don’t “get it”.

– D,F: Gave up.

Logistics

• Classes will include “lecture” (with overheads), “discussion” (at board), and “exercises” (at your seat).

– I’d prefer you pay attention, think, ask questions, and participate in discussions rather than taking notes.

• Website: //www.cs.ucsd.edu/classes/fa02/cse202-b

– .pdf and .ps of “lectures”. (Sometimes available before class.)

– Homeworks, announcements, etc. too

– You are responsible for checking the website

• Be sure to register for the class email list

– I’ll use it to send out urgent messages, like HW corrections

– Directions on class webpage

Formal Analysis of Algorithms

• A problem is a (infinite) set of instances.

• An instance is a triple <input, output, size>

– Less formally, we treat “input” as an instance.

– “Size” is related to the number of symbols needed to represent the instance, but the exact convention is problem-specific. (E.g., an n×n matrix may be “size n”.)

• A decision problem is a problem in which every instance’s output is “yes” or “no”.

Example: Sorting is a problem.

<5, 2, 17, 6> is an instance of sorting, with output <2, 5, 6, 17> and size 4.

Note: I tend to underline terms that you should know.

Formal Analysis of Algorithms

• Algorithm: a step-by-step method of producing “output” of instances given “input”

– The steps are instructions for some model of computation.

– “Standard” model is the Random Access Machine (RAM)

[Figure: RAM, showing Memory and ALU]

One needs to be careful about what operations are allowed

- E.g. operations on very long numbers aren’t “one step”.

- For this course – anything reasonable is OK.

Formal Analysis of Algorithms

The complexity of an algorithm is a function from ℕ (the instance size) to ℕ (the running time on instances of that size).

But what happens if different instances of a given size have different running times??

• Worst-case complexity: maximum of the running times over all instances of a given size.

• Average complexity: average of the running times.

• Probabilistic complexity: we’ll study later – can be used when a single instance has different running times.

• Best-case complexity: not a very useful concept

ℕ = {0, 1, 2, ... } (the natural numbers)
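To make the worst/average/best distinction concrete, here is a small sketch. It assumes a toy problem not taken from the slides: linear search for the value 0 in a permutation of 0..n-1, with "running time" measured as the number of comparisons. The function names are made up for illustration.

```python
from itertools import permutations

def linear_search_steps(items, target):
    """Return the number of comparisons linear search makes before finding target."""
    steps = 0
    for value in items:
        steps += 1
        if value == target:
            break
    return steps

def running_times(n):
    """Comparison counts over every instance of size n.

    Assumption for this sketch: an instance is a permutation of 0..n-1,
    and we always search for the value 0.
    """
    return [linear_search_steps(p, 0) for p in permutations(range(n))]

times = running_times(4)
worst = max(times)                 # worst-case complexity at size 4
average = sum(times) / len(times)  # average complexity at size 4
best = min(times)                  # best-case complexity at size 4
print(worst, average, best)
```

The average here weights every instance of a given size equally. That is one possible choice of distribution, and a different one would give a different average.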

Mathematical notation

P.T. Barnum asserted, “You can fool all of the people some of the time”.

– Suppose on Monday, I fooled all the men, and on Tuesday, I fooled all the women (but not vice versa).

Is this an example of Barnum’s assertion?

Mathematics is (or should be) precise

Let P = set of all the people, T = {Monday,Tuesday},

Let F(p,t) mean “I fooled person p at time t”

Is “∀p∈P ∃t∈T F(p,t)” true?

Is “∃t∈T ∀p∈P F(p,t)” true?

∀ means “for all”; ∃ means “there exists”.
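The two quantifier orders can be checked mechanically on a small model of the Monday/Tuesday story. This is only a sketch: the four people and their names are invented for illustration, and `fooled` encodes the scenario described above.

```python
# Hypothetical population for illustration only.
people = {"Alice": "woman", "Bob": "man", "Carol": "woman", "Dave": "man"}
times = ["Monday", "Tuesday"]

def fooled(p, t):
    """F(p, t): men were fooled on Monday, women on Tuesday (and not vice versa)."""
    return (t == "Monday" and people[p] == "man") or \
           (t == "Tuesday" and people[p] == "woman")

# "for all p there exists t": all(...) over people, any(...) over times.
forall_exists = all(any(fooled(p, t) for t in times) for p in people)

# "there exists t for all p": any(...) over times, all(...) over people.
exists_forall = any(all(fooled(p, t) for p in people) for t in times)

print(forall_exists, exists_forall)
```

Swapping the order of `all` and `any` changes the answer, which is the reason the English sentence is ambiguous and the formula is not.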

Classifying (complexity) functions

We focus on functions from ℕ to ℕ, though definitions are general.

Given function g, O(g) (pronounced “Oh of g” or “Big Oh of g”) is the set of functions (from ℕ to ℕ) for which g is an asymptotic upper bound. This means:

O(g) = { f : ∃n0 ∃c>0 ∀n>n0, 0 ≤ f(n) ≤ c g(n) }

Example: 3n log n + 20 ∈ O(n^2)

Note: Since O(n^2) is a set, using “∈” is technically correct. But people often use “=”.
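A quick numeric sanity check of the example, as a sketch only. It assumes the logarithm is base 2 (the slide does not say) and picks the witnesses c = 4 and n0 = 2 by hand. Checking finitely many n is not a proof, but it can catch a bad choice of constants.

```python
import math

def f(n):
    # Assumption: "log" means log base 2.
    return 3 * n * math.log2(n) + 20

def g(n):
    return n ** 2

def holds_up_to(f, g, c, n0, limit):
    """Check 0 <= f(n) <= c*g(n) for every n with n0 < n <= limit."""
    return all(0 <= f(n) <= c * g(n) for n in range(n0 + 1, limit + 1))

print(holds_up_to(f, g, c=4, n0=2, limit=10_000))  # candidate witnesses
print(holds_up_to(f, g, c=4, n0=0, limit=10_000))  # n0 too small: fails at n = 1
```

For an actual proof one would argue, for instance, that log n ≤ n gives 3n log n ≤ 3n^2, and that 20 ≤ n^2 once n ≥ 5, and then check the few remaining small n directly.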

Classifying (complexity) functions

Ω(g) (pronounced “Omega” or “Big Omega”) is the set of functions for which g is an asymptotic lower bound. Formally:

Ω(g) = { f : ∃n0 ∃c>0 ∀n>n0, 0 ≤ c g(n) ≤ f(n) }

Θ(g) (“Theta”) is the set of functions for which g is an asymptotic tight bound. This means:

Θ(g) = { f : ∃n0 ∃c1,c2>0 ∀n>n0, 0 ≤ c1 g(n) ≤ f(n) ≤ c2 g(n) }

Theorem: Θ(g) = Ω(g) ∩ O(g).

Note: we won’t use little-o or little-omega notation.

Review

• Consider the decision problem, “Is x prime?”

– How should we define “size”?

• Consider the algorithm:

if (x < 2) return “no”;
for i = 2 to sqrt(x)
    if (x % i == 0) return “no”;
return “yes”;

Note: when x is even, algorithm takes constant time.

What is the asymptotic complexity?

You should be asking, “average or worst-case”?

Let’s say worst case (average complexity is harder)
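A runnable sketch of the trial-division algorithm above, instrumented to count loop iterations. The function name and the step counter are additions for illustration. Comparing the count with `x.bit_length()` assumes one possible convention for "size", namely the number of bits needed to write x.

```python
import math

def is_prime_with_steps(x):
    """Trial division; returns (answer, number of loop iterations)."""
    if x < 2:
        return False, 0
    steps = 0
    for i in range(2, math.isqrt(x) + 1):
        steps += 1
        if x % i == 0:
            return False, steps
    return True, steps

for x in (10_007, 10_008):
    answer, steps = is_prime_with_steps(x)
    print(x, x.bit_length(), answer, steps)
```

The prime input runs the loop all the way to sqrt(x), while the even input exits on the first iteration. So the worst case for a given size is driven by inputs like the first one.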

What is a proof?

• Informal Proof (“Hand waving”)

– Anything that convinces the reader

• Formal Proof (First few classes)

– A sequence of statements, each being:

• a hypothesis of the theorem

• a definition, axiom, or known theorem

• a statement that follows from previous statements via a rule of inference

• Proof (Rest of course)

– A subset or summary of a formal proof that convinces a literate but sleepy reader that the writer could write a formal proof if forced to.

Notes about proofs

• Use complete sentences.

– Sentences have a subject (“you” may be understood), a verb, a period at the end.

– If a sentence is too long, introduce notation or definitions.

• Each statement should indicate whether it’s an assumption, a definition, a known fact, ...

– Don’t just write, “x ∈ A”. Instead write:

• “Let x ∈ A.” if you’re introducing x and want it to be in A.

• “Suppose x ∈ A.” if x has already been introduced, and you’re seeing what would happen if it were in A.

• “Thus, x ∈ A.” if it follows from earlier statements.

– Make sure each variable is properly introduced

• Like declaring variables in a program.

Some “proof schemas”

• Literate readers – even sleepy ones – know that:

– To show A ⊆ B, can write “Let x ∈ A. (blah ...). Thus, x ∈ B.”

– To show A = B (for sets A and B), show A ⊆ B and B ⊆ A.

– To show P implies Q, write “Assume P. (blah ...). Thus Q.”

– To show “∃x P”, write “Let x=(whatever). (blah ...) Thus P.”

– To show “∀x P”, write “Given any x, (blah ...). Thus P.”

– To show an algorithm has complexity O(f), you will probably construct a positive constant c and an integer n0 and then show that if n>n0 and I is an instance of size n, then the algorithm requires time at most c f(n) on I.

– These schemas can be mostly implicit. Thus, after writing “Let x ∈ A. (blah blah). Thus, x ∈ B.” you often don’t need to write “This shows that A ⊆ B.”

Example formal proof

Thm: Suppose f ∈ O(g) and g ∈ O(h). Then f ∈ O(h).

Proof: Since f ∈ O(g), ∃n0 ∃c0>0 ∀n>n0, 0 ≤ f(n) ≤ c0 g(n). [Just the definition of O(g), but introduces c0 and n0.]

Similarly, ∃n1 ∃c1>0 ∀n>n1, 0 ≤ g(n) ≤ c1 h(n).

Let n2 = max(n0, n1) and c2 = c0 c1. Note that c2 is positive since both c0 and c1 are. [Setup for proving ∃n2 ∃c2>0 ...]

Suppose n>n2. Then n>n0 and so 0 ≤ f(n) ≤ c0 g(n). [Setup for proving ∀n>n2 ...]

But we also have n>n1 and so g(n) ≤ c1 h(n).

Thus, 0 ≤ f(n) ≤ c0 c1 h(n) = c2 h(n).

Q.E.D. [Means, “We’ve proved what we intended to.”]
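The construction in the proof (n2 = max(n0, n1), c2 = c0 c1) can be tried on concrete functions. The sketch below uses made-up functions f, g, h and hand-picked witnesses, which are assumptions for illustration. The finite check at the end is a sanity test, not a substitute for the proof.

```python
def compose_witnesses(w_fg, w_gh):
    """Given (n0, c0) witnessing f in O(g) and (n1, c1) witnessing g in O(h),
    return the pair (n2, c2) that the proof constructs for f in O(h)."""
    n0, c0 = w_fg
    n1, c1 = w_gh
    return max(n0, n1), c0 * c1

# Hypothetical example functions.
def f(n): return 5 * n + 3
def g(n): return n
def h(n): return n * n

w_fg = (2, 6)  # 5n + 3 <= 6n whenever n > 2
w_gh = (0, 1)  # n <= n^2 whenever n > 0

n2, c2 = compose_witnesses(w_fg, w_gh)
ok = all(0 <= f(n) <= c2 * h(n) for n in range(n2 + 1, 1000))
print(n2, c2, ok)
```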

Your turn ...

Thm: If f ∈ O(g) then g ∈ Ω(f).

Mathematical Induction

Suppose for each i ∈ ℕ, P_i is a statement.

Suppose also that we can prove P_0, and we can prove “∀i, P_i implies P_(i+1)”

Then the Principle of Mathematical Induction allows us to conclude “∀i P_i”.

[Figure: P_0, P_1, P_2, P_3, P_4, ...]

Mathematical Induction

Alternatively, suppose we can prove P_0, and we can prove “∀i, (P_0 & P_1 & ... & P_i) implies P_(i+1)”

Again, the Principle of Mathematical Induction allows us to conclude “∀i P_i”.

[Figure: P_0, P_1, P_2, P_3, P_4, P_5, P_6, P_7, ...]

Induction Example

Definitions from CLRS pg. 1088:

A binary tree is a structure on a finite number of nodes that either contains no nodes or is composed of three disjoint subsets: a root node, a binary tree called the left subtree and a binary tree called the right subtree.

The height of a non-empty tree is the maximum depth of its nodes. The depth of a node is the length of the path from the root to the node.

Thm: If T is a non-empty binary tree of height h, then T has fewer than 2^(h+1) nodes.

Induction Example

Let P_h be the statement, “If T is a binary tree of height h, then T has at most 2^(h+1) – 1 nodes.”

We will prove ∀h P_h by induction.

Base Case (h=0): T is non-empty, so it has a root node r. Let s be any node of T. Since the height of T is 0, the depth of s must be 0, so s = r. Thus, T has only one node (which is 2^(0+1) – 1).

Obviously, the only binary tree of height 0 is the tree of one node, so P_0 is true.

Induction Example

Induction step: Assume that P_h is true.

Let T be a tree of height h+1. Then the left subtree L is a binary tree.

If L is empty, it has 0 nodes.

Otherwise, each node in L has depth one less than its depth in T. Thus, L is a non-empty binary tree of height at most h. By assumption ...

OOOPS!

I’ve only assumed P_h. But L may have smaller height.

Induction Example

Induction step: Assume that P_i is true ∀i ≤ h.

Let T be a tree of height h+1. Then the left subtree L is a binary tree.

If L is empty, it has 0 nodes.

Otherwise, each node in L has depth one less than its depth in T. Thus, L is a non-empty binary tree of height at most h. By assumption (P_i for the height i ≤ h of L, and 2^(i+1) ≤ 2^(h+1)), L has at most 2^(h+1) – 1 nodes.

Similarly, the right subtree R has at most 2^(h+1) – 1 nodes.

Thus, T has at most 1 (for the root) + 2^(h+1) – 1 (for L) + 2^(h+1) – 1 (for R) = 2·2^(h+1) – 1 = 2^((h+1)+1) – 1 nodes.

This shows that P_(h+1) is true, and completes our proof.
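The bound can be spot-checked on randomly generated trees. This is a sketch under stated assumptions: a tree is represented as None (empty) or a (left, right) pair, and the empty tree is given height -1 purely as a coding convenience (the definition above only covers non-empty trees). All function names are made up for illustration.

```python
import random

def random_tree(max_height):
    """Return a random binary tree: None (empty) or a (left, right) pair."""
    if max_height < 0 or random.random() < 0.3:
        return None
    return (random_tree(max_height - 1), random_tree(max_height - 1))

def perfect_tree(h):
    """Tree of height h in which every level is full."""
    return None if h < 0 else (perfect_tree(h - 1), perfect_tree(h - 1))

def count_nodes(tree):
    return 0 if tree is None else 1 + count_nodes(tree[0]) + count_nodes(tree[1])

def height(tree):
    # Convention assumed here: the empty tree has height -1.
    return -1 if tree is None else 1 + max(height(tree[0]), height(tree[1]))

def bound_holds(tree):
    """Check the theorem's bound for a non-empty tree; vacuously true if empty."""
    return tree is None or count_nodes(tree) <= 2 ** (height(tree) + 1) - 1

print(all(bound_holds(random_tree(8)) for _ in range(200)))
print(count_nodes(perfect_tree(3)), 2 ** (3 + 1) - 1)
```

The perfect tree meets the bound with equality, so "at most 2^(h+1) – 1" cannot be improved.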
