EC-803 Computer Vision
Lecture-1:
Course Introduction
Basic Transformations- Translation,
Scaling and Rotation, both in 2D & 3D
MATLAB or OpenCV
NUST College of E&ME, Spring 2015 1
Instructor: Mahmood Akhtar, PhD
Lecture Timing: Thu 1730-2030 hrs, CR (DCE)-16
Topics: Basic Transformations, Camera Model and Imaging
Geometry, Camera Calibration, Multiview Geometry,
Stereopsis, Structure From Motion, Linear Filters, Edges,
Texture, Segmentation by: Clustering Pixels; Split and
Merge; Mean Shift Algorithm; Graph-Theoretic
Clustering; Fitting a Model- Hough Transform; etc,
Tracking, Model-Based Vision, Finding Templates Using
Classifiers
Course Introduction
Geometric Transformations- to change sets of points representing some object (study about translation, scaling, rotation, etc)
Camera Model and Imaging Geometry- image formation process, camera coordinates and 3D world coordinates aligned / not aligned, how to deal with different situations
Camera Calibration- process of estimating the parameters of a pinhole camera model, approximating the camera that produced a given photograph or video, camera matrix
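As a rough illustration of the pinhole camera model mentioned above, the sketch below projects a 3D point given in camera coordinates onto the image plane using x = fX/Z and y = fY/Z. The function name `project_pinhole` and the numeric values are hypothetical choices for illustration only. The sketch assumes an ideal pinhole and ignores lens distortion, the principal-point offset and the world-to-camera transformation that calibration would estimate.

```python
def project_pinhole(point_3d, focal_length):
    """Project a 3D point (camera coordinates) onto the image plane.

    Assumes an ideal pinhole camera looking along +Z with the image
    plane at distance focal_length from the centre of projection.
    """
    X, Y, Z = point_3d
    if Z <= 0:
        raise ValueError("point must be in front of the camera (Z > 0)")
    # Perspective division: image coordinates shrink as depth Z grows.
    return (focal_length * X / Z, focal_length * Y / Z)


if __name__ == "__main__":
    # A point 4 units in front of the camera, focal length 2 (arbitrary units).
    print(project_pinhole((2.0, 1.0, 4.0), focal_length=2.0))  # (1.0, 0.5)
```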
Multiview Geometry- to understand how
several views of the same scene constrain its 3D
structure and camera configurations
Stereopsis- algorithms that mimic our ability to
fuse the pictures recorded by our two eyes and
exploit the difference between them to gain
a strong sense of depth
Structure From Motion- to estimate the 3D
shape of a scene from multiple pictures when
camera positions and parameters are a priori
unknown and may change over time
Linear Filters- smoothing by averaging, Gaussian,
derivatives and finite differences, filters and
templates, scale and image pyramids
Edges and Texture- noise and edge detectors-
Laplacian and gradient-based; extracting image
structure, analysis and synthesis using oriented
pyramids
Segmentation- subdivides an image or video into
its constituent regions or objects as required;
applications: summarising videos, finding machine
parts, finding people, finding buildings in satellite
images and searching a collection of images
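To make "smoothing by averaging" and "derivatives and finite differences" from the Linear Filters item concrete, here is a minimal one-dimensional sketch on a single image row. The function names `box_smooth` and `central_difference` are hypothetical, and the border handling (averaging only over the neighbours that exist) is just one possible choice.

```python
def box_smooth(signal, width=3):
    """Smooth a 1D signal by averaging each sample with its neighbours.

    Border samples are averaged over the neighbours that exist.
    """
    half = width // 2
    out = []
    for i in range(len(signal)):
        window = signal[max(0, i - half): i + half + 1]
        out.append(sum(window) / len(window))
    return out


def central_difference(signal):
    """Approximate the derivative with (s[i+1] - s[i-1]) / 2 at interior samples."""
    return [(signal[i + 1] - signal[i - 1]) / 2 for i in range(1, len(signal) - 1)]


if __name__ == "__main__":
    row = [0, 0, 0, 9, 9, 9]           # one image row containing a step edge
    print(box_smooth(row))             # the step becomes a ramp
    print(central_difference(row))     # non-zero only around the edge
```

The same two ideas extend to 2D images by sliding a small kernel over rows and columns; the derivative response is what gradient-based edge detectors build on.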
Tracking- problem of generating an inference
about the motion of an object given a sequence
of images. Major applications: motion capture,
recognition from motion, surveillance, and
targeting
Model-Based Vision- object recognition as a
correspondence problem- understanding of the
relationship between the position of image
features, and the position and orientation of an
object; application: registration of VOI in medical
imaging system
Finding Templates Using Classifiers- a classifier is
anything that takes a feature set as an input and
produces a class label. Here, we would learn
about techniques for building classifiers with
example of their use in vision applications
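The definition above (feature set in, class label out) can be sketched with one of the simplest possible classifiers, a nearest-class-mean rule. This is only an illustrative choice, not necessarily a technique covered in the course; the feature meanings and the names `train_nearest_mean` and `classify` are hypothetical.

```python
def train_nearest_mean(samples):
    """Compute one mean feature vector per class label.

    samples: list of (feature_vector, label) pairs.
    """
    sums, counts = {}, {}
    for features, label in samples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
            counts[label] = 0
        sums[label] = [s + f for s, f in zip(sums[label], features)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}


def classify(means, features):
    """Return the label whose class mean is closest (squared Euclidean distance)."""
    def sq_dist(mean):
        return sum((m - f) ** 2 for m, f in zip(mean, features))
    return min(means, key=lambda label: sq_dist(means[label]))


if __name__ == "__main__":
    # Hypothetical 2D features, e.g. (mean brightness, edge density) of a patch.
    training = [((0.1, 0.2), "background"), ((0.2, 0.1), "background"),
                ((0.8, 0.9), "object"), ((0.9, 0.8), "object")]
    means = train_nearest_mean(training)
    print(classify(means, (0.7, 0.7)))  # object
```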
Text Book & References:
David A. Forsyth and Jean Ponce,
Computer Vision A Modern Approach,
2002 Ed (available from local market)
Class slides & selected research papers
to be distributed by the instructor
Mubarak Shah, Fundamentals of Computer Vision, 1997
(soft copy available online)
Linda Shapiro and George Stockman, Computer Vision, 2000
(soft copy available online)
Rafael C. Gonzalez and Richard E. Woods, Digital Image
Processing, 3rd Edition, 2009 (available from local market)
Prerequisites:
Digital image processing
Working knowledge of C++ programming
Knowledge related to:
Euclidean and projective geometry
Linear Algebra
Vector calculus
Probability & Statistics
Yahoo Group: CV_CEME_S2015
Grading Policy*:
Surprise quizzes (Min 6) 8%
Programming assignments (Min 3) 7%
Sessional exam I 15%
Sessional exam II 15%
Project 15%
Final exam 40%
*Relative final grading policy applies
Quizzes & Assignments:
Please make sure you visit the CV_CEME_S2015 group every
day for notifications about assignments & other related
material to be uploaded from time to time
Quizzes: 6 to 8, carrying 8% weight in the total marks
(the best x out of y may be counted, to the benefit of
students)
Assignments: min 3, carrying 7% weight in the total
marks. These may be written assignments or programming
assignments. The submission deadline will be given with the
assignment. Assignments submitted after the deadline
will not be accepted and will carry ZERO MARKS. Copied
(i.e., matching) assignments will get ZERO MARKS.
Project:
Project will carry 15% weight in the total marks
Project is supposed to be conducted individually (i.e., no grouping)
Your project is most likely going to be an OpenCV implementation of a recent CV related algorithm / work
Students are encouraged to visit IEEE Xplore for the 27th IEEE Conf. on CVPR and to start looking into different research articles (published in 2014)
Project topics / problems should be selected and approval should be obtained within the first four weeks of the course.
Project presentations will commence from week 13 onwards and projects (i.e., CD containing draft of proposed novel work, implementation code, presentation, etc) will not be accepted after the submission deadline.
Projects consisting of downloaded codes or presentations will not be accepted and will carry ZERO MARKS
Vision
Process of discovering what is present in the world
and where it is by looking
What is Computer Vision?
Given one or more images, extract properties of the 3D world:
- Traffic scene
- Number of vehicles
- Type of vehicles
- Location of closest obstacle
- Assessment of congestion
- Location of the scene captured
Computer Vision
goal is to emulate human vision (which is limited to
the visual band of electromagnetic (EM) spectrum),
including learning and being able to make inferences
and take actions based on visual inputs
Why Computer Vision?
An image is worth 1000 words
Many biological systems rely on vision
The world is 3D and dynamic
Cameras and computers are cheap
Applications of Computer Vision
Autonomous cars, Planes, Missiles, Robots, ...
Space exploration
Aid to the blind, Sign language recognition
Manufacturing, Quality control
Surveillance, Security, Biometrics
Image retrieval
Medical imaging & analysis
...
Overview
[Block diagram with endpoints labelled Real World and Action]
Image Formation and Camera Geometry: Modeling and Calibration; Image rectification
Segmentation: Impose some order on groups of pixels to separate them from each other or infer shape information
Processing on Single Image: Linear Filters; Edge detection; Texture
Processing on Multiple Images: Multi-view geometry; Stereo imaging; Structure from motion
Interpretation: Interpret objects using geometric information
Recognition: Recognize objects using probabilistic techniques
Computer Vision Focuses on:
What information should be extracted?
How can it be extracted?
How should it be represented?
How can it be used to achieve the goal?
Related Disciplines
Image processing
Pattern recognition
Computer graphics
Artificial intelligence
Machine learning
Related Disciplines
[Figure: diagram relating DATA and IMAGES through Data Processing, Image Processing, Computer Vision and Computer Graphics]
Active Research Topics
Object recognition
Human behavior analysis
Internet and computer vision
Biometrics and soft biometrics
Large scale 3D reconstruction (city level)
Medical image processing
Vision for robotics
Computer Vision Publications
Journals
IEEE Trans. on Pattern Analysis and Machine
Intelligence (TPAMI)
International Journal of Computer Vision (IJCV)
IEEE Trans. on Image Processing
Computer Vision Publications
Conferences
International Conference on Computer Vision
(ICCV), once every two years
IEEE Conf. on Computer Vision and Pattern
Recognition (CVPR), once a year
European Conference on Computer Vision (ECCV),
once every two years
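Basic Transformations
Translation: \( x' = x + x_0,\; y' = y + y_0,\; z' = z + z_0 \)
(2D)
\[
\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} =
\begin{bmatrix} 1 & 0 & x_0 \\ 0 & 1 & y_0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} x \\ y \\ 1 \end{bmatrix}
\]
(3D)
\[
\begin{bmatrix} x' \\ y' \\ z' \\ 1 \end{bmatrix} =
\begin{bmatrix} 1 & 0 & 0 & x_0 \\ 0 & 1 & 0 & y_0 \\ 0 & 0 & 1 & z_0 \\ 0 & 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}
\]
Images courtesy of Dr Imtiaz A Taj (MAJU)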
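The translation matrices can be tried out with a few lines of plain Python. This is a minimal sketch: the helper names `translation_matrix` and `transform` are hypothetical, and a real project would more likely use a matrix library or OpenCV itself.

```python
def translation_matrix(*offsets):
    """Build a homogeneous translation matrix: 3x3 for two offsets, 4x4 for three."""
    n = len(offsets)
    matrix = [[1.0 if r == c else 0.0 for c in range(n + 1)] for r in range(n + 1)]
    for r, offset in enumerate(offsets):
        matrix[r][n] = float(offset)   # the offsets x0, y0 (, z0) fill the last column
    return matrix


def transform(matrix, point):
    """Multiply the matrix by the point in homogeneous form and drop the trailing 1."""
    column = list(point) + [1.0]
    result = [sum(m * v for m, v in zip(row, column)) for row in matrix]
    return result[:-1]


if __name__ == "__main__":
    print(transform(translation_matrix(3, -2), (1, 1)))        # 2D: [4.0, -1.0]
    print(transform(translation_matrix(1, 2, 3), (0, 0, 5)))   # 3D: [1.0, 2.0, 8.0]
```

Writing translation as a matrix product is what lets it be chained with scaling and rotation by ordinary matrix multiplication.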
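Cartesian Coordinate System (Euclidean Geometry)
\[
W = \begin{bmatrix} X \\ Y \\ Z \end{bmatrix}
\]
Homogeneous Coordinate System (Projective Geometry)
\[
W_h = \begin{bmatrix} kX \\ kY \\ kZ \\ k \end{bmatrix}
\]
\[
W = \begin{bmatrix} W_1 \\ W_2 \\ W_3 \end{bmatrix} =
\begin{bmatrix} W_{h1} / W_{h4} \\ W_{h2} / W_{h4} \\ W_{h3} / W_{h4} \end{bmatrix}
\]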
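The conversion between the two coordinate systems amounts to appending a scale factor k and later dividing by the last component. A minimal sketch, with hypothetical function names `to_homogeneous` and `to_cartesian`:

```python
def to_homogeneous(point, k=1.0):
    """Map (X, Y, Z) to (kX, kY, kZ, k); k must be nonzero."""
    if k == 0:
        raise ValueError("k must be nonzero")
    return [k * c for c in point] + [k]


def to_cartesian(h_point):
    """Recover Cartesian coordinates by dividing by the last component."""
    *coords, w = h_point
    if w == 0:
        raise ValueError("last component is zero: no finite Cartesian point")
    return [c / w for c in coords]


if __name__ == "__main__":
    wh = to_homogeneous((1, 2, 3), k=2.0)
    print(wh)                 # [2.0, 4.0, 6.0, 2.0]
    print(to_cartesian(wh))   # [1.0, 2.0, 3.0]
```

Any nonzero k represents the same Cartesian point, which is why the division by the fourth component is needed before reading off coordinates.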
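Basic Transformations
Scaling: \( x' = S_x x,\; y' = S_y y,\; z' = S_z z \)
(2D)
\[
\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} =
\begin{bmatrix} S_x & 0 & 0 \\ 0 & S_y & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} x \\ y \\ 1 \end{bmatrix}
\]
(3D)
\[
\begin{bmatrix} x' \\ y' \\ z' \\ 1 \end{bmatrix} =
\begin{bmatrix} S_x & 0 & 0 & 0 \\ 0 & S_y & 0 & 0 \\ 0 & 0 & S_z & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}
\]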
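As a quick numerical check of the 2D scaling matrix, take scale factors 2 along x and 3 along y and the point (4, 5). These values are chosen only for illustration.

```latex
\[
\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} =
\begin{bmatrix} 2 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 4 \\ 5 \\ 1 \end{bmatrix} =
\begin{bmatrix} 8 \\ 15 \\ 1 \end{bmatrix}
\]
```

This agrees with the component form: x' = 2 * 4 = 8 and y' = 3 * 5 = 15, while the homogeneous coordinate stays 1.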
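Basic Transformations
Rotation (2D):
- around origin
\[
\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} =
\begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} x \\ y \\ 1 \end{bmatrix}
\]
- around an arbitrary point r (not origin)
\[
p' = T_{r} \, R \, T_{-r} \, p
\]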
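Rotation about an arbitrary point r can be read as three steps: translate r to the origin, rotate, then translate back, i.e. p' = T(r) R T(-r) p. The sketch below composes the three matrices in plain Python. It assumes a counter-clockwise rotation by an angle in radians, and the helper names (`matmul`, `translation`, `rotation`, `rotate_about`) are hypothetical.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def translation(tx, ty):
    """3x3 homogeneous translation by (tx, ty)."""
    return [[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]]


def rotation(theta):
    """Counter-clockwise rotation about the origin by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def rotate_about(point, centre, theta):
    """Apply p' = T(r) R(theta) T(-r) p for centre r."""
    rx, ry = centre
    m = matmul(translation(rx, ry), matmul(rotation(theta), translation(-rx, -ry)))
    x, y = point
    column = matmul(m, [[x], [y], [1.0]])
    return column[0][0], column[1][0]


if __name__ == "__main__":
    # Rotating (2, 1) by 90 degrees about (1, 1) gives approximately (1, 2).
    print(rotate_about((2.0, 1.0), (1.0, 1.0), math.pi / 2))
```

Note the order: the rightmost matrix acts first, so T(-r) is applied to the point before the rotation.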
MATLAB, or OpenCV
Image processing: the process of manipulating image data to make it suitable for computer vision applications or for presentation to humans
Computer vision goes beyond image processing: it helps to obtain relevant information from images and to make decisions based on that information
Steps for a typical computer vision application:
Image acquisition → Image manipulation → Obtaining relevant information → Decision making
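The four steps can be sketched end to end on a tiny synthetic image. Everything here is a stand-in chosen for illustration: the hard-coded pixel values replace a real camera, thresholding is just one possible manipulation, and the function names and the `min_area` rule are hypothetical.

```python
def acquire_image():
    """Step 1, image acquisition: stand-in for a camera, returns a tiny grayscale image."""
    return [
        [10, 12, 11, 10],
        [11, 200, 210, 12],
        [10, 205, 198, 11],
        [12, 11, 10, 10],
    ]


def manipulate(image, threshold=128):
    """Step 2, image manipulation: binarise so that bright pixels become 1."""
    return [[1 if pixel > threshold else 0 for pixel in row] for row in image]


def extract_information(binary):
    """Step 3, obtaining relevant information: area of the bright region in pixels."""
    return sum(sum(row) for row in binary)


def decide(area, min_area=3):
    """Step 4, decision making: report an object if the bright region is large enough."""
    return "object present" if area >= min_area else "no object"


if __name__ == "__main__":
    area = extract_information(manipulate(acquire_image()))
    print(area, decide(area))  # 4 object present
```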
Most popular methods to develop computer vision applications: OpenCV with C/C++, MATLAB and AForge
MATLAB is the easiest but least efficient way to process images
- an interpreter, not made to go fast, but it gives you the opportunity to play with its functionalities
OpenCV is computationally the most efficient framework
- designed for real time applications
- code written in optimized C / C++
- can take advantage of multicore processors
- further automatic optimization possible using IPP libraries
AForge has qualities in between OpenCV and MATLAB
MATLAB is a kind of sandbox for "playing" and learning (and relatively slow). OpenCV is dedicated and specific (and fast)
OpenCV is the hardest to learn, mainly because it lacks
proper documentation and error handling code
But OpenCV has lots of basic inbuilt image
processing functions (over 500 functions)
It is worthwhile to learn computer vision with OpenCV
Useful webpages on this topic:
http://opencv.org/
http://opencv-srf.blogspot.com/2010/09/what-is-opencv.html
Assignment- 1
Download and install the latest release of OpenCV. Build and run your first OpenCV program.
Related Tutorials:
- Installing OpenCV 3 on Ubuntu: http://rodrigoberriel.com/2014/10/installing-opencv-3-0-0-on-ubuntu-14-04/
- Using OpenCV 3 with Eclipse: http://rodrigoberriel.com/2014/10/using-opencv-3-0-0-with-eclipse/