CPSC 425: Computer Vision
Instructor: Fred Tung
Department of Computer Science
University of British Columbia
Lecture Notes 2015/2016 Term 2
Welcome to CPSC 425
Who has heard of Google’s self-driving car or Tesla Autopilot?
Image credit: Google; Technology Review, 2015.
Welcome to CPSC 425
For an autonomous car to navigate safely, it must sense its environment.
- detect lane markings, obstacles
- detect and predict the movement of other cars, cyclists, and pedestrians
- interpret road signs, gestures from cyclists
Welcome to CPSC 425
Image credit: CBC, 2015.
Welcome to CPSC 425
How can we design computers (or robots, self-driving cars, ...) that make sense of a complex visual world?
That is the question that computer vision tries to answer.
Menu January 5, 2016
Topics:
- Introduction
- Course Mechanics
- Course Topics
- Some Introductory Examples
Reading:
- Next: Forsyth & Ponce (2nd ed.) 1.1.1–1.1.3
Handouts:
- Assignment 1: Introduction to Python for Computer Vision
Reminders:
- Complete Assignment 1 by Tuesday, January 12
- www: http://www.cs.ubc.ca/~ftung/cpsc425/
- piazza: https://piazza.com/ubc.ca/winterterm22015/cpsc425/
Who is Fred?
Fred Tung
- University of Waterloo graduate (2008)
- PhD candidate in Computer Science
- Supervisor: Jim Little
- My areas of research are:
  - Scene parsing of images
  - Scene parsing of video
  - Large-scale visual search
Scene Parsing
[Figure: scene parsing example. Label legend: building, car, fence, mountain, person, road, sidewalk, sky, unlabelled (ground truth only)]
Course Origins
CPSC 425 was originally developed by Bob Woodham and has evolved over the years. Much of the material this year is adapted from material prepared by Bob.
I will also share with you some exciting recent work in computer vision to solve real-world problems such as
- autonomous navigation
- object recognition, and
- large-scale image search
Framework for Class Discussion
Come to each class prepared to discuss that day’s material at four levels:
Problem: What is the problem addressed?
Key Idea(s): What is the key idea (or ideas) behind the approach taken? What assumptions are made? Are there alternative approaches?
Technical Detail(s): What theory underlies the approach taken? What are important practical aspects of implementation, experimentation and application?
“Gotchas”: Are there unexpected “features” of the approach likely to trip up the inexperienced?
Course Expectations
Students in this class have varying backgrounds, skills, and expectations. Please respect, help and encourage each other.
I will expect you to
read assigned textbook sections in advance
read any additional assigned reading in advance
ask questions (both in and outside class)
engage fully in all course activities: lectures, assignments, discussion, and office hours
complete all assignments on time
behave ethically
Course Mechanics
There will be 7 assignments (6 marked)
- we will use Python 2 and four packages:
  - Python Imaging Library (PIL)
  - NumPy
  - Matplotlib
  - SciPy
There is:
- one (in-class) midterm exam, tentatively February 11 (class before reading break)
- a 150-minute final exam, scheduled by the Registrar’s office
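As a small taste of the array-style code that the NumPy package supports, here is a minimal sketch. It is an illustration only, not part of the course materials: it assumes NumPy is installed, it builds a synthetic greyscale image instead of loading a file, and the helper names make_gradient and threshold are hypothetical.

```python
import numpy as np


def make_gradient(height, width):
    """Return a synthetic greyscale image: a left-to-right ramp from 0 to 255."""
    # One row of evenly spaced intensities, repeated for every image row.
    ramp = np.linspace(0, 255, width)
    return np.tile(ramp, (height, 1)).astype(np.uint8)


def threshold(image, level):
    """Return a binary image: 255 where image >= level, 0 elsewhere."""
    return np.where(image >= level, 255, 0).astype(np.uint8)


# Hypothetical example values: a 4 x 8 image thresholded at mid-grey.
image = make_gradient(4, 8)
binary = threshold(image, 128)

print(image.shape)   # (rows, columns)
print(binary[0])     # first row of the thresholded image
```

An image here is just a two-dimensional array indexed as (row, column), so whole-image operations such as thresholding can be written without explicit loops.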
Course Mechanics
My office hour: Fridays 10:30-11:30, ICCS 187 (or email me for an appointment)
TA office hours TBA
Course website: http://www.cs.ubc.ca/~ftung/cpsc425/
Course Piazza group: https://piazza.com/ubc.ca/winterterm22015/cpsc425/
There will be no extension to assignment due dates
Course Mechanics (cont’d)
Marks for the course are calculated as follows:
In class (clicker questions)   10%
Assignments                    25%
Midterm exam                   25%
Final exam                     40%
Course Outline
I. Physics of Imaging
- Image formation
- Cameras and lenses
- Colour
II. Early Vision
- Image filtering, correlation/convolution
- Image characterisation
- Edge and corner detection
- Texture analysis
Course Outline
III. Mid-Level Vision
- Feature detection
- Model fitting
- Stereo
- Motion and optical flow
IV. High-Level Vision
- Clustering and classification
- Image classification
- Object detection
- Deep learning in computer vision
- Examples and applications
Questions?
Example 1: Dione and Titan (Moons of Saturn)
Image credit: NASA/JPL-Caltech/Space Science Institute
Example 1 (cont’d): Dione and Titan
Image credit: NASA/JPL-Caltech/Space Science Institute
Example 2: A Full (Earth) Moon
Example 3: Eggs?
Example 3 (cont’d):
Example 3 (cont’d):
Example 4: The dress
Lighting conditions also affect the perception of colour.
Example 5: Rotating Mask
Video: rotating mask
Given a rotating mask, we have difficulty seeing the hollow side
Our “everyday experience” tells us that the nose is pointing outwards and not inwards
The associated text for this example read, in part,
“In solving the ill-posed problem from [sic] recovering 3D form from 2D images our brain makes a priori assumptions about the world. Assumption 1: Faces are convex”
(Original) credit: http://www.kyb.tuebingen.mpg.de/
Example 6: Handwritten Text
Read this!
Example 7: The FedEx Logo
Lindon Leader of “Creative Leader” designed the FedEx logo
See interview with him at
http://www.thesneeze.com/mt-archives/000273.php
Example 8: First-Down Line
image courtesy SporTVision http://www.sportvision.com/
Example 9: Kinect for Xbox 360
How does it work?
image from January/February 2011 issue of Technology Review
Example 9: Kinect for Xbox 360
The Kinect uses depth information to recognize the pose of the players.
Image from J. Shotton et al. (2011)
Example 10: Word Lens (Google Translate)
Real-time translation on your mobile device using optical character recognition + augmented reality
Image credit: Google
Example 11: Reverse image search
Search using an image instead of text
Image credit: Google; TinEye.
Example 12: Interactive image search
Pinterest recently introduced a feature that lets you look up product information by drawing boxes in images
Example 13: Amazon delivery drones
Delivering the product you ordered, by unmanned aerial vehicle
Image credit: Amazon
Example 14: Smart traffic systems
Adjusting a network of traffic lights in real time, based on current traffic conditions
Image credit: Miovision
Example 15: Sports video analytics at UBC
Here is a sample video sequence processed based on the combined thesis work of three LCI graduate students:
Video: 1000 frame broadcast hockey sequence
Credit: Kenji Okuma, Wei-Lwun Lu, Ankur Gupta
Example 15 (cont’d): Puck Location and Possession
Andrew Duan’s M.Sc. thesis (August 2011) integrates the determination of puck location and possession into our sports video analysis system
Here’s what Andrew’s system does with the same hockey video sequence we saw before:
Video: 1000 frame broadcast hockey sequence
Credit: Xin Duan (Andrew)
Example 16: Basketball
Wei-Lwun Lu’s Ph.D. thesis (October 2011) tracks multiple players while preserving player identity.
Here are a couple of examples from Wei-Lwun’s thesis using a basketball video sequence:
Video: Homography estimation
Video: Player identification
Credit: Wei-Lwun Lu
http://www.cs.ubc.ca/~vailen/thesis/thesis.shtml
Example 17: Automating camera operation
More recently, Jianhui Chen (one of your TAs this term) is working on automatic broadcast camera control.
Image credit: J. Chen and P. Carr, 2015.
Example 17: Automating camera operation
Image credit: IEEE Spectrum
Reminders:
- Complete Assignment 1 by Tuesday, January 12
- www: http://www.cs.ubc.ca/~ftung/cpsc425/
- piazza: https://piazza.com/ubc.ca/winterterm22015/cpsc425/