
CPSC 425: Computer Vision

Instructor: Fred Tung

Department of Computer Science
University of British Columbia

Lecture Notes 2015/2016 Term 2


Welcome to CPSC 425

Who has heard of Google’s self-driving car or Tesla Autopilot?

Image credit: Google; Technology Review, 2015.

For an autonomous car to navigate safely, it must sense its environment:

- detect lane markings and obstacles
- detect and predict the movement of other cars, cyclists, and pedestrians
- interpret road signs and gestures from cyclists

Image credit: CBC, 2015.

How can we design computers (or robots, self-driving cars, ...) that make sense of a complex visual world?

That is the question that computer vision tries to answer.


Menu: January 5, 2016

Topics:
- Introduction
- Course Mechanics
- Course Topics
- Some Introductory Examples

Reading:
- Next: Forsyth & Ponce (2nd ed.) 1.1.1–1.1.3

Handouts:
- Assignment 1: Introduction to Python for Computer Vision

Reminders:
- Complete Assignment 1 by Tuesday, January 12
- www: http://www.cs.ubc.ca/~ftung/cpsc425/
- piazza: https://piazza.com/ubc.ca/winterterm22015/cpsc425/


Who is Fred?

Fred Tung
- University of Waterloo graduate (2008)
- PhD candidate in Computer Science
- Supervisor: Jim Little

My areas of research are:
- Scene parsing of images
- Scene parsing of video
- Large-scale visual search


Scene Parsing

[Figure: scene parsing example. Label legend: building, car, fence, mountain, person, road, sidewalk, sky, unlabelled (ground truth only)]


Course Origins

CPSC 425 was originally developed by Bob Woodham and has evolved over the years. Much of the material this year is adapted from material prepared by Bob.

I will also share with you some exciting recent work in computer vision to solve real-world problems such as

- autonomous navigation
- object recognition, and
- large-scale image search


Framework for Class Discussion

Come to each class prepared to discuss that day’s material at four levels:

Problem:
What is the problem addressed?

Key Idea(s):
What is the key idea (or ideas) behind the approach taken? What assumptions are made? Are there alternative approaches?

Technical Detail(s):
What theory underlies the approach taken? What are important practical aspects of implementation, experimentation and application?

“Gotchas”:
Are there unexpected “features” of the approach likely to trip up the inexperienced?


Course Expectations

Students in this class have varying backgrounds, skills, and expectations. Please respect, help and encourage each other.

I will expect you to

- read assigned textbook sections in advance
- read any additional assigned reading in advance
- ask questions (both in and outside class)
- engage fully in all course activities: lectures, assignments, discussion, and office hours
- complete all assignments on time
- behave ethically


Course Mechanics

There will be 7 assignments (6 marked)

- we will use Python 2 and four packages:
  - Python Imaging Library (PIL)
  - NumPy
  - Matplotlib
  - SciPy

There is:
- one (in-class) midterm exam, tentatively February 11 (class before reading break)
- a 150-minute final exam, scheduled by the Registrar’s office

My office hour: Fridays 10:30-11:30, ICCS 187 (or email me for an appointment)
TA office hours: TBA

Course website: http://www.cs.ubc.ca/~ftung/cpsc425/

Course Piazza group: https://piazza.com/ubc.ca/winterterm22015/cpsc425/

There will be no extensions to assignment due dates.


Course Mechanics (cont’d)

Marks for the course are calculated as follows:

In class (clicker questions)   10%
Assignments                    25%
Midterm exam                   25%
Final exam                     40%


Course Outline

I. Physics of Imaging

- Image formation
- Cameras and lenses
- Colour

II. Early Vision

- Image filtering, correlation/convolution
- Image characterisation
- Edge and corner detection
- Texture analysis

III. Mid-Level Vision

- Feature detection
- Model fitting
- Stereo
- Motion and optical flow

IV. High-Level Vision

- Clustering and classification
- Image classification
- Object detection
- Deep learning in computer vision
- Examples and applications


Questions?


Example 1: Dione and Titan (Moons of Saturn)

Image credit: NASA/JPL-Caltech/Space Science Institute


Example 1 (cont’d): Dione and Titan

Image credit: NASA/JPL-Caltech/Space Science Institute

Example 2: A Full (Earth) Moon


Example 3: Eggs?


Example 3 (cont’d):


Example 4: The dress

Lighting conditions also affect the perception of colour.


Example 5: Rotating Mask Example

Video: rotating mask

Given a rotating mask, we have difficulty seeing the hollow side

Our “everyday experience” tells us that the nose is pointing outwards and not inwards

The associated text for this example read, in part,

“In solving the ill-posed problem from [sic] recovering 3D form from 2D images our brain makes a priori assumptions about the world. Assumption 1: Faces are convex”

(Original) credit: http://www.kyb.tuebingen.mpg.de/


Example 6: Handwritten Text

Read this!


Example 7: The FedEx Logo

Lindon Leader of “Creative Leader” designed the FedEx logo

See interview with him at

http://www.thesneeze.com/mt-archives/000273.php


Example 8: First-Down Line

image courtesy SporTVision http://www.sportvision.com/


Example 9: Kinect for Xbox 360

How does it work?

image from January/February 2011 issue of Technology Review


Example 9 (cont’d): Kinect for Xbox 360

The Kinect uses depth information to recognize the pose of the players.

Image from J. Shotton et al. (2011)


Example 10: Word Lens (Google Translate)

Real-time translation on your mobile device using optical character recognition + augmented reality

Image credit: Google


Example 11: Reverse image search

Search using an image instead of text

Image credit: Google; Tineye.


Example 12: Interactive image search

Pinterest recently introduced a feature that lets you look up product information by drawing boxes in images


Example 13: Amazon delivery drones

Delivering the product you ordered, by unmanned aerial vehicle

Image credit: Amazon


Example 14: Smart traffic systems

Adjusting a network of traffic lights in real time, based on current traffic conditions

Image credit: Miovision


Example 15: Sports video analytics at UBC

Here is a sample video sequence processed using the combined thesis work of three LCI graduate students:

Video: 1000 frame broadcast hockey sequence

Credit: Kenji Okuma, Wei-Lwun Lu, Ankur Gupta


Example 15 (cont’d): Puck Location and Possession

Andrew Duan’s M.Sc. thesis (August, 2011) integrates the determination of puck location and possession into our sports video analysis system

Here’s what Andrew’s system does with the same hockey video sequence we saw before:

Video: 1000 frame broadcast hockey sequence

Credit: Xin Duan (Andrew)


Example 16: Basketball

Wei-Lwun Lu’s Ph.D. thesis (October, 2011) tracks multiple players while preserving player identity.

Here are a couple of examples from Wei-Lwun’s thesis using a basketball video sequence:

Video: Homography estimation

Video: Player identification

Credit: Wei-Lwun Lu
http://www.cs.ubc.ca/~vailen/thesis/thesis.shtml


Example 17: Automating camera operation

More recently, Jianhui Chen (one of your TAs this term) has been working on automatic broadcast camera control.

Image credit: J. Chen and P. Carr, 2015.

Image credit: IEEE Spectrum


Reminders:
- Complete Assignment 1 by Tuesday, January 12
- www: http://www.cs.ubc.ca/~ftung/cpsc425/
- piazza: https://piazza.com/ubc.ca/winterterm22015/cpsc425/
