04/30/13 Last class: summary, goggles, ices Discrete Structures (CS 173) Derek Hoiem, University of...


04/30/13

Last class: summary, goggles, ices

Discrete Structures (CS 173)

Derek Hoiem, University of Illinois

Image: http://darksideofthecatalogue.wordpress.com/2011/11/22/light-at-the-end-of-the-tunnel-is-glowing-thing-23-12/

Final exam

Tuesday, May 7, 7-10pm

DCL 1320: Students with last names Afridi to Mehta

Siebel 1404: Students with last names Melvin to Zmick

You should have already contacted us about conflicts. Note that there is no specific conflict exam; most conflicts should be resolved by the other classes, unless the conflict is due to 3 tests in one day.

http://admin.illinois.edu/policy/code/article3_part2_3-201.html


What to expect on final

• Focus on material after midterm 2, but there will be stuff from earlier parts of the semester also

• Almost certainly at least one question on each of:
– Big-O/Algorithms
– Proof by Contradiction
– State Diagrams
– Induction
– Countability (not too complicated)


Today’s class

• Summary of concepts learned

• Fast image retrieval, Google Goggles, and the relevance to discrete structures

• ICES forms


What you learned in CS 173

• How to model the world

• How to prove things

• How to model computational behavior

• How to think formally and computationally


How to model the world

• Logic, propositions, and relations
– Used in natural language processing, machine learning, programming languages

• Sets
– Used for data mining, groups (e.g., with social networks), image processing (e.g., sets of pixels), clustering

• Functions, algorithms
– Programming languages, most programming

• Graphs and trees
– Used in search, machine learning, social networks, path planning, menu design

• State diagrams
– Used for design of automated systems, AI planning, map building, robotics


How to prove things

• Direct proof

• Use of cases

• Indirect proofs such as by contrapositive
– Changing into logically equivalent form sometimes makes the proof easier

• Proof by example or counter-example
– Useful to show something exists

• Induction
– Proof for unbounded set of integers

• Contradiction
– Useful to show something can’t exist


How to model computational behavior

• Algorithm analysis

• Recursion trees
– Unroll recursive functions to analyze cost at root, internal calls, and leaves

• Big-O and Big-Theta
– Analyze running time independent of implementation details and compute power
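As a brief illustration of the recursion-tree idea, here is a worked example for a recurrence chosen for this note (it is not taken from the slides), assuming n is a power of 2 and a constant base-case cost c:

```latex
\begin{align*}
T(n) &= 2\,T(n/2) + n, \qquad T(1) = c \\
\text{Level } i \ (0 \le i < \log_2 n) &: \quad 2^i \text{ calls, each costing } n/2^i
      \;\Rightarrow\; n \text{ per level} \\
\text{Leaves} &: \quad n \text{ calls, each costing } c \;\Rightarrow\; cn \\
T(n) &= n \log_2 n + cn = \Theta(n \log n)
\end{align*}
```

The root and internal levels contribute n each, the leaves contribute cn, and the Big-Theta form drops the constants.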

How to think formally and computationally

• Formal proof methods

• Number theory
– Important for numerical methods

• P vs. NP
– Important classes of algorithms

• Comparing cardinality of infinite sets
– Fundamental implications, such as halting problem


Google goggles

http://www.google.com/mobile/goggles/#text

Demo


How to quickly find images in a large database that match a given image?

Basic representation: interest points (also called keypoints)

Describe appearance of distinctive image patches


Thousands of these per image; each is a 128-dimensional vector of numbers

Simple idea

See how many keypoints are close to keypoints in each other image

Lots of Matches

Few or No Matches

But this will be really, really slow! Like 10 images per second.
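A minimal sketch of the simple idea above, assuming keypoints are plain tuples of numbers and that "close" means Euclidean distance within a chosen threshold. The function names, the threshold, and the toy 2-D data are illustrative assumptions, not part of the original slides.

```python
import math

def count_matches(query, candidate, threshold):
    """Count query keypoints whose nearest candidate keypoint is within threshold."""
    matches = 0
    for q in query:
        # Compare this query keypoint against every keypoint in the other image.
        nearest = min((math.dist(q, c) for c in candidate), default=math.inf)
        if nearest <= threshold:
            matches += 1
    return matches

def rank_database(query, database, threshold):
    """Return (image_name, match_count) pairs, best match first."""
    scores = [(name, count_matches(query, keypoints, threshold))
              for name, keypoints in database.items()]
    return sorted(scores, key=lambda s: s[1], reverse=True)

# Toy 2-D "descriptors" (real ones would be 128-dimensional).
query = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
database = {
    "similar": [(0.1, 0.0), (1.0, 0.9), (5.2, 5.1)],
    "different": [(9.0, 9.0), (8.0, 7.0)],
}
ranking = rank_database(query, database, threshold=0.5)
```

Every query keypoint is compared with every keypoint of every database image, which is why this approach scales so poorly.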

Slide credit: Nister

110,000,000 Images in 5.8 Seconds

“Scalable Recognition with a Vocabulary Tree”, Nister and Stewenius, CVPR 2006.


Structure 1: “Visual Words”

• Group points (descriptors) into sets of similar points (called “clustering”)

• Represent image as the number of points you see in each set
– Images are similar if they have a lot of sets in common

• Concepts from class: a set of 128-dimensional real vectors is partitioned into sets, and new vectors are assigned to a set index:

K-means algorithm

Illustration: http://en.wikipedia.org/wiki/K-means_clustering

1. Randomly select K centers

2. Assign each point to nearest center

3. Compute new center (mean) for each cluster

4. Go back to step 2 and repeat until the assignments stop changing

Demo: http://home.dei.polimi.it/matteucc/Clustering/tutorial_html/AppletKM.html
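The K-means steps can be sketched roughly as follows. This is a minimal standard-library version, assuming points are tuples of numbers; the function names, the fixed random seed, and the iteration cap are illustrative assumptions. The last function shows how an image could then be represented as counts per cluster ("visual words").

```python
import math
import random

def nearest_center(point, centers):
    """Index of the center closest to point (Euclidean distance)."""
    return min(range(len(centers)), key=lambda i: math.dist(point, centers[i]))

def mean(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def kmeans(points, k, max_iters=100, seed=0):
    """Return (centers, assignment) after running K-means on points."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)              # 1. randomly select K centers
    assignment = None
    for _ in range(max_iters):
        # 2. assign each point to its nearest center
        new_assignment = [nearest_center(p, centers) for p in points]
        if new_assignment == assignment:         # nothing changed: stop
            break
        assignment = new_assignment
        # 3. compute the new center (mean) of each cluster
        for i in range(k):
            members = [p for p, a in zip(points, assignment) if a == i]
            if members:                          # keep the old center if a cluster is empty
                centers[i] = mean(members)
    return centers, assignment

def visual_word_histogram(descriptors, centers):
    """Count how many descriptors fall into each cluster."""
    counts = [0] * len(centers)
    for d in descriptors:
        counts[nearest_center(d, centers)] += 1
    return counts

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, labels = kmeans(points, k=2)
histogram = visual_word_histogram(points, centers)
```

On this toy data the two tight groups of points end up in separate clusters; which group is labelled 0 depends on the random initialization.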

Efficiency from clustering

Time to match images in database:
– Previous matching time for two images with descriptors of dimensions
– Post-cluster matching time:

Time to assign points to clusters:
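A sketch of the costs being compared here, using assumed symbols that are not from the original: N descriptors per image, D dimensions per descriptor, and K clusters. The bounds follow from counting distance computations and assume dense K-bin histograms.

```latex
\begin{align*}
\text{Brute-force match of two images (all descriptor pairs):} &\quad O(N^2 D) \\
\text{Match after clustering (compare two $K$-bin histograms):} &\quad O(K) \\
\text{Assign $N$ descriptors to the nearest of $K$ centers:} &\quad O(N K D)
\end{align*}
```

Under these assumptions the per-pair matching cost no longer depends on N or D, but assigning the points to clusters becomes the expensive step when K is large.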


Structure 2: trees for nested partitions

• For points within a set to be very similar, need many sets (1,000,000)
– Slow time to assign points to sets
– Need to compare each point to each cluster center:

• Solution: create nested sets

• Discrete structures concepts: collections of sets, trees

Following slides by David Nister (CVPR 2006)

Much faster processing of query image

Old time to assign points to sets: for clusters

New time with trees:

In practice: 10,000+ times speed up
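A minimal sketch of assigning a descriptor with nested sets: at each level, compare only against the children of the current node and follow the nearest one. The `Node` class, the hand-built toy tree, and the comparison counter are illustrative assumptions; a real vocabulary tree would be built by clustering recursively.

```python
import math
from dataclasses import dataclass, field

@dataclass
class Node:
    center: tuple
    children: list = field(default_factory=list)
    word_id: int = -1  # only meaningful on leaves

def assign_word(root, descriptor):
    """Descend the tree via the nearest child; return (word_id, comparisons made)."""
    node = root
    comparisons = 0
    while node.children:
        comparisons += len(node.children)
        node = min(node.children,
                   key=lambda child: math.dist(descriptor, child.center))
    return node.word_id, comparisons

def leaf(center, word_id):
    return Node(center=center, word_id=word_id)

# Toy tree: branching factor 2, depth 2, so 4 leaf "visual words".
root = Node(center=(5.0, 6.0), children=[
    Node(center=(0.0, 1.0), children=[leaf((0.0, 0.0), 0), leaf((0.0, 2.0), 1)]),
    Node(center=(10.0, 11.0), children=[leaf((10.0, 10.0), 2), leaf((10.0, 12.0), 3)]),
])

word, comparisons = assign_word(root, (9.5, 11.8))
```

With branching factor b and depth L, a lookup makes about b·L comparisons instead of b^L. For example, assuming b = 10 and L = 6 gives 60 comparisons rather than 1,000,000, the same order of magnitude as the speedup quoted above.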


Structure 3: Inverse document file

• Like a book index: keep a list of all the words (keypoints) and all the pages (images) that contain them.

• Rank database images based on tf-idf measure.

tf-idf: Term Frequency – Inverse Document Frequency

$$\text{tf-idf} = \frac{\#\,\text{times word appears in document}}{\#\,\text{words in document}} \times \log\frac{\#\,\text{documents}}{\#\,\text{documents that contain the word}}$$
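A minimal sketch of an inverted file with tf-idf scoring, assuming each image is already a list of visual-word ids. The function names, the toy database, and the choice to score with a dot product of tf-idf weights are illustrative assumptions, not the scoring used in the cited paper.

```python
import math
from collections import Counter, defaultdict

def build_inverted_file(database):
    """Map each visual word to {image_name: count of that word in the image}."""
    index = defaultdict(dict)
    for name, words in database.items():
        for word, count in Counter(words).items():
            index[word][name] = count
    return index

def rank_images(query_words, database, index):
    """Score only the images that share at least one word with the query."""
    n_docs = len(database)
    scores = defaultdict(float)
    for word, q_count in Counter(query_words).items():
        postings = index.get(word, {})
        if not postings:
            continue  # word never seen in the database
        idf = math.log(n_docs / len(postings))
        q_tf = q_count / len(query_words)
        for name, count in postings.items():
            tf = count / len(database[name])
            scores[name] += (q_tf * idf) * (tf * idf)
    return sorted(scores.items(), key=lambda s: s[1], reverse=True)

database = {
    "a": [1, 1, 2, 7],
    "b": [3, 4, 7],
    "c": [5, 6, 7],
}
index = build_inverted_file(database)
ranking = rank_images([1, 2, 7], database, index)
```

Word 7 appears in every image, so its idf is log(1) = 0 and it contributes nothing; the rarer words 1 and 2 are what push image "a" to the top.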

Performance

Speedups

• Matching based on set membership

• Tree for faster clustering

• Inverse document file for only checking images with same sets as query

• Overall (in practice 100,000+ times speedup)

Slide credit: Nister

110,000,000 Images in 5.8 Seconds

“Scalable Recognition with a Vocabulary Tree”, Nister and Stewenius, CVPR 2006.

Summary

• Clever data structures and efficient algorithms make the difference between 10 images per second and 20 million images per second
– Clustering (partitioning) for faster comparison
– Trees for faster clustering
– Lookup table for faster matching

• In this class, you learned how to model, analyze, and prove things about discrete structures


Next steps

• CS 225: implementing and using data structures such as linked lists, trees, graphs, etc.

• CS 241/242: experience writing code and structuring programs and dealing with OS

• CS 373: grammars, finite automata, languages, Turing machines, decidability

• Research or project experience

ICES forms

• Important for course evaluation and feedback

• Please provide comments about both positive aspects and ways to improve


Thank you!
