EC-803 Computer Vision, Lecture-1: Course Introduction; Basic Transformations (Translation, Scaling and Rotation, both in 2D & 3D); MATLAB or OpenCV. NUST College of E&ME, Spring 2015


  • EC-803 Computer Vision

    Lecture-1:

    Course Introduction

    Basic Transformations: Translation, Scaling and Rotation, both in 2D & 3D

    MATLAB or OpenCV

    NUST College of E&ME, Spring 2015

  • Course Introduction

    Instructor: Mahmood Akhtar, PhD ([email protected])

    Lecture Timing: Thu 1730-2030 hrs, CR (DCE)-16

    Topics: Basic Transformations, Camera Model and Imaging Geometry, Camera Calibration, Multiview Geometry, Stereopsis, Structure From Motion, Linear Filters, Edges, Texture, Segmentation (by Clustering Pixels; Split and Merge; Mean Shift Algorithm; Graph-Theoretic Clustering; Fitting a Model, e.g. Hough Transform; etc.), Tracking, Model-Based Vision, Finding Templates Using Classifiers

  • Geometric Transformations: operations that change the sets of points representing an object (translation, scaling, rotation, etc.)

    Camera Model and Imaging Geometry: the image formation process, and how to deal with camera coordinates that are or are not aligned with the 3D world coordinates

    Camera Calibration: the process of estimating the parameters of a pinhole camera model (the camera matrix) that approximates the camera that produced a given photograph or video

  • Multiview Geometry: understanding how several views of the same scene constrain its 3D structure and the camera configurations

    Stereopsis: algorithms that mimic our ability to fuse the pictures recorded by our two eyes and exploit the difference between them to gain a strong sense of depth

    Structure From Motion: estimating the 3D shape of a scene from multiple pictures when the camera positions and parameters are a priori unknown and may change over time

  • Linear Filters: smoothing by averaging, Gaussian, derivatives and finite differences, filters and templates, scale and image pyramids

    Edges and Texture: noise and edge detectors (Laplacian and gradient-based); extracting image structure; analysis and synthesis using oriented pyramids

    Segmentation: subdividing an image or video into its constituent regions or objects as required. Applications: summarising videos, finding machine parts, finding people, finding buildings in satellite images, and searching a collection of images

  • Tracking: the problem of generating an inference about the motion of an object given a sequence of images. Major applications: motion capture, recognition from motion, surveillance, and targeting

    Model-Based Vision: object recognition as a correspondence problem, i.e. understanding the relationship between the position of image features and the position and orientation of an object. Application: registration of VOI in medical imaging systems

  • Finding Templates Using Classifiers: a classifier is anything that takes a feature set as input and produces a class label. Here we will learn techniques for building classifiers, with examples of their use in vision applications
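    The definition of a classifier above (feature set in, class label out) can be illustrated with a minimal sketch. The nearest-centroid rule, the function names, and the toy features (mean brightness, edge density) are assumptions chosen only for illustration; they are not taken from the course material.

```python
import math


def train_centroids(samples):
    """samples: list of (feature_vector, label). Returns {label: centroid}."""
    sums, counts = {}, {}
    for features, label in samples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def classify(features, centroids):
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(features, centroids[label]))


# Hypothetical toy features per image patch: (mean brightness, edge density).
training = [
    ((0.9, 0.1), "sky"), ((0.8, 0.2), "sky"),
    ((0.3, 0.8), "building"), ((0.2, 0.9), "building"),
]
centroids = train_centroids(training)
label = classify((0.85, 0.15), centroids)  # a bright, smooth patch
```

    Any function with the same input and output shape (features in, label out) would fit the definition equally well; nearest-centroid is just one of the simplest choices.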

  • Text Book & References:

    David A. Forsyth and Jean Ponce, Computer Vision: A Modern Approach, 2002 Ed (available from local market)

    Class slides & selected research papers to be distributed by the instructor

    Mubarak Shah, Fundamentals of Computer Vision, 1997 (soft copy available online)

    Linda Shapiro and George Stockman, Computer Vision, 2000 (soft copy available online)

    Rafael C. Gonzalez and Richard E. Woods, Digital Image Processing, 3rd Edition, 2009 (available from local market)

  • Prerequisites:

    Digital image processing

    Working knowledge of C++ programming

    Knowledge related to:

    Euclidean and projective geometry

    Linear Algebra

    Vector calculus

    Probability & Statistics

    Yahoo Group: CV_CEME_S2015


  • Grading Policy*:

    Surprise quizzes (Min 6) 8%

    Programming assignments (Min 3) 7%

    Sessional exam I 15%

    Sessional exam II 15%

    Project 15%

    Final exam 40%

    *Relative final grading policy applies


  • Quizzes & Assignments:

    Please make sure you visit the CV_CEME_S2015 group every day for notifications about assignments and other related material, which will be uploaded from time to time

    Quizzes: 6 to 8, carrying 8% weight in the total marks (best x out of y may be considered, to the benefit of students)

    Assignments: min 3, carrying 7% weight in the total marks. They may be written or programming assignments. The submission deadline will be given with each assignment. Assignments submitted after the deadline will not be accepted and will carry ZERO MARKS. Cheated (i.e., matching) assignments will get ZERO MARKS.

  • Project:

    Project will carry 15% weight in the total marks

    Project is supposed to be conducted individually (i.e., no grouping)

    Your project is most likely going to be an OpenCV implementation of a recent CV-related algorithm / work

    Students are encouraged to visit IEEE Xplore for the 27th IEEE Conf. on CVPR and to start looking into different research articles (published in 2014)

    Project topics / problems should be selected and approval should be obtained within the first four weeks of the course.

    Project presentations will commence from week 13 onwards and projects (i.e., CD containing draft of proposed novel work, implementation code, presentation, etc) will not be accepted after the submission deadline.

    Projects consisting of downloaded codes or presentations will not be accepted and will carry ZERO MARKS


  • Vision

    The process of discovering what is present in the world and where it is by looking

  • What is Computer Vision?

    Given one or more images, extract properties of the 3D world:

    - Traffic scene

    - Number of vehicles

    - Type of vehicles

    - Location of closest obstacle

    - Assessment of congestion

    - Location of the scene captured

  • Computer Vision

    The goal is to emulate human vision (which is limited to the visual band of the electromagnetic (EM) spectrum), including learning and being able to make inferences and take actions based on visual inputs

  • Why Computer Vision?

    An image is worth 1000 words

    Many biological systems rely on vision

    The world is 3D and dynamic

    Cameras and computers are cheap


  • Applications of Computer Vision

    Autonomous cars, Planes, Missiles, Robots, ...

    Space exploration

    Aid to the blind, Sign language recognition

    Manufacturing, Quality control

    Surveillance, Security, Biometrics

    Image retrieval

    Medical imaging & analysis

    ...


  • Overview

    Image Formation and Camera Geometry: Modeling and Calibration; Image rectification

    Segmentation: Impose some order on groups of pixels to separate them from each other or infer shape information

    Processing on a Single Image: Linear Filters; Edge detection; Texture

    Processing on Multiple Images: Multi-view geometry; Stereo imaging; Structure from motion

    Interpretation: Interpret objects using geometric information

    Recognition: Recognize objects using probabilistic techniques

    Real World

    Action

  • Computer Vision Focuses on:

    What information should be extracted?

    How can it be extracted?

    How should it be represented?

    How can it be used to achieve the goal?


  • Related Disciplines

    Image processing

    Pattern recognition

    Computer graphics

    Artificial intelligence

    Machine learning


  • Related Disciplines

    (Diagram relating DATA and IMAGES through four fields: Data Processing, Image Processing, Computer Vision, and Computer Graphics)

  • Active Research Topics

    Object recognition

    Human behavior analysis

    Internet and computer vision

    Biometrics and soft biometrics

    Large scale 3D reconstruction (city level)

    Medical image processing

    Vision for robotics


  • Computer Vision Publications

    Journals

    IEEE Trans. on Pattern Analysis and Machine Intelligence (TPAMI)

    International Journal of Computer Vision (IJCV)

    IEEE Trans. on Image Processing

  • Computer Vision Publications

    Conferences

    International Conference on Computer Vision (ICCV), once every two years

    IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), once a year

    European Conference on Computer Vision (ECCV), once every two years

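  • Basic Transformations

    Translation: $(x' = x + x_0,\ y' = y + y_0,\ z' = z + z_0)$

    (2D)

    $$\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & x_0 \\ 0 & 1 & y_0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$

    (3D)

    $$\begin{bmatrix} x' \\ y' \\ z' \\ 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 & x_0 \\ 0 & 1 & 0 & y_0 \\ 0 & 0 & 1 & z_0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}$$

    Images courtesy of Dr Imtiaz A Taj (MAJU)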
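    As a small numerical check of the translation equations, the sketch below builds the 2D and 3D translation matrices and applies them to points written in homogeneous form (last coordinate 1). The helper names `mat_vec`, `translation_2d`, and `translation_3d` are hypothetical and use plain Python lists rather than MATLAB or OpenCV.

```python
def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a column vector (list)."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def translation_2d(x0, y0):
    """3x3 homogeneous matrix that shifts a 2D point by (x0, y0)."""
    return [[1, 0, x0],
            [0, 1, y0],
            [0, 0, 1]]


def translation_3d(x0, y0, z0):
    """4x4 homogeneous matrix that shifts a 3D point by (x0, y0, z0)."""
    return [[1, 0, 0, x0],
            [0, 1, 0, y0],
            [0, 0, 1, z0],
            [0, 0, 0, 1]]


# Point (1, 1) shifted by (3, -2); point (4, 5, 6) shifted by (1, 2, 3).
p2 = mat_vec(translation_2d(3, -2), [1, 1, 1])        # [4, -1, 1]
p3 = mat_vec(translation_3d(1, 2, 3), [4, 5, 6, 1])   # [5, 7, 9, 1]
```

    The extra coordinate fixed at 1 is what lets an addition (x' = x + x_0) be written as a matrix product.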

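  • Cartesian Coordinate System (Euclidean Geometry)

    $$W = \begin{bmatrix} X \\ Y \\ Z \end{bmatrix}$$

    Homogeneous Coordinate System (Projective Geometry)

    $$W_h = \begin{bmatrix} kX \\ kY \\ kZ \\ k \end{bmatrix}$$

    $$W = \begin{bmatrix} W_1 \\ W_2 \\ W_3 \end{bmatrix} = \begin{bmatrix} W_{h1}/W_{h4} \\ W_{h2}/W_{h4} \\ W_{h3}/W_{h4} \end{bmatrix}$$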
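    The conversion between the two coordinate systems can be sketched as follows. The function names are hypothetical, and the scale factor k is assumed to be non-zero; the example shows that different values of k describe the same Cartesian point.

```python
def to_homogeneous(point, k=1.0):
    """(X, Y, Z) -> (kX, kY, kZ, k) for a non-zero scale factor k."""
    if k == 0:
        raise ValueError("k must be non-zero")
    return [k * c for c in point] + [k]


def to_cartesian(h):
    """Divide the first components by the last one to recover (X, Y, Z)."""
    *coords, w = h
    if w == 0:
        raise ValueError("last component is zero: no Cartesian equivalent")
    return [c / w for c in coords]


a = to_homogeneous([2.0, 4.0, 6.0], k=1.0)  # [2.0, 4.0, 6.0, 1.0]
b = to_homogeneous([2.0, 4.0, 6.0], k=2.0)  # [4.0, 8.0, 12.0, 2.0]
same_point = to_cartesian(a) == to_cartesian(b)
```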

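  • Basic Transformations

    Scaling: $(x' = S_x x,\ y' = S_y y,\ z' = S_z z)$

    (2D)

    $$\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \begin{bmatrix} S_x & 0 & 0 \\ 0 & S_y & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$

    (3D)

    $$\begin{bmatrix} x' \\ y' \\ z' \\ 1 \end{bmatrix} = \begin{bmatrix} S_x & 0 & 0 & 0 \\ 0 & S_y & 0 & 0 \\ 0 & 0 & S_z & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}$$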
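    A minimal sketch of the scaling matrices, again with plain Python lists and hypothetical helper names. Each coordinate is multiplied by its own factor, so the origin stays fixed.

```python
def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a column vector (list)."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def scaling_2d(sx, sy):
    """3x3 homogeneous matrix scaling x by sx and y by sy."""
    return [[sx, 0, 0],
            [0, sy, 0],
            [0, 0, 1]]


def scaling_3d(sx, sy, sz):
    """4x4 homogeneous matrix scaling x, y, z by sx, sy, sz."""
    return [[sx, 0, 0, 0],
            [0, sy, 0, 0],
            [0, 0, sz, 0],
            [0, 0, 0, 1]]


q2 = mat_vec(scaling_2d(2, 3), [1, 1, 1])            # [2, 3, 1]
q3 = mat_vec(scaling_3d(2, 3, 0.5), [1, 1, 4, 1])    # [2, 3, 2.0, 1]
origin = mat_vec(scaling_2d(2, 3), [0, 0, 1])        # [0, 0, 1]
```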

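  • Basic Transformations

    Rotation (2D):

    - around the origin

    $$\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$

    - around an arbitrary point $r$ (not the origin)

    $$p' = T_{r}\, R\, (T_{-r}\, p)$$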
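    The composition for rotating around an arbitrary point (translate by -r, rotate, translate back by r) can be checked numerically. This sketch assumes the counterclockwise sign convention for the rotation matrix and uses hypothetical helper names; results are floating-point approximations.

```python
import math


def mat_mul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def translation(tx, ty):
    return [[1, 0, tx], [0, 1, ty], [0, 0, 1]]


def rotation(theta):
    """Counterclockwise rotation by theta radians around the origin."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]


def rotation_about(theta, rx, ry):
    """T_r * R * T_{-r}: rotation by theta around the point (rx, ry)."""
    return mat_mul(translation(rx, ry),
                   mat_mul(rotation(theta), translation(-rx, -ry)))


def apply(matrix, x, y):
    """Apply a 3x3 homogeneous matrix to the 2D point (x, y)."""
    column = mat_mul(matrix, [[x], [y], [1]])
    return column[0][0], column[1][0]


# (1, 0) rotated 90 degrees around the origin: approximately (0, 1).
about_origin = apply(rotation(math.pi / 2), 1, 0)
# (2, 1) rotated 90 degrees around r = (1, 1): approximately (1, 2).
about_point = apply(rotation_about(math.pi / 2, 1, 1), 2, 1)
```

    The point r itself is left unchanged by `rotation_about`, which is a quick sanity check for the order of the three matrices.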

  • MATLAB or OpenCV

    Image processing: the process of manipulating image data to make it suitable for computer vision applications or for presentation to humans

    Computer vision goes beyond image processing: it helps to obtain relevant information from images and to make decisions based on that information

    Steps for a typical computer vision application: Image acquisition -> Image manipulation -> Obtaining relevant information -> Decision making
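    The four steps listed above can be sketched end to end on a toy example. Everything here is an assumption made for illustration: the 4x4 "image" is hard-coded instead of being read from a camera, and the threshold values and function names are hypothetical. A real application would use MATLAB or OpenCV routines for each step.

```python
def acquire_image():
    """Image acquisition: stand-in for a camera, a 4x4 grayscale image (0-255)."""
    return [
        [10, 12, 11, 9],
        [10, 200, 210, 12],
        [11, 205, 220, 10],
        [9, 10, 12, 11],
    ]


def binarize(image, threshold=128):
    """Image manipulation: mark pixels brighter than the threshold with 1."""
    return [[1 if px > threshold else 0 for px in row] for row in image]


def bright_fraction(mask):
    """Obtaining relevant information: fraction of pixels marked as bright."""
    total = sum(len(row) for row in mask)
    return sum(map(sum, mask)) / total


def decide(fraction, min_fraction=0.2):
    """Decision making: report an object if enough pixels are bright."""
    return "object present" if fraction >= min_fraction else "no object"


image = acquire_image()
mask = binarize(image)
fraction = bright_fraction(mask)   # 4 bright pixels out of 16 -> 0.25
decision = decide(fraction)        # "object present"
```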

  • Most popular ways to develop computer vision applications: OpenCV with C/C++, MATLAB, and AForge

    MATLAB is the easiest but least efficient way to process images

    - it is an interpreter, not built for speed, but it gives you the opportunity to play with its functionality

    OpenCV is computationally the most efficient framework

    - designed for real-time applications

    - code written in optimized C/C++

    - can take advantage of multicore processors

    - further automatic optimization possible using IPP libraries

    AForge has qualities in between OpenCV and MATLAB

    MATLAB is a kind of sandbox for "playing" and learning (and relatively slow). OpenCV is dedicated and specific (and fast)

  • OpenCV is considered harder to learn mainly because it lacks proper documentation and error handling code

    However, OpenCV has a large set of built-in basic image processing functions (over 500 functions)

    It is worthwhile to learn computer vision with OpenCV

    Useful webpages on this topic:

    http://opencv.org/

    http://opencv-srf.blogspot.com/2010/09/what-is-opencv.html


  • Assignment- 1

    Download and install the latest release of OpenCV. Build and run your first OpenCV program.

    Related Tutorials:

    - Installing OpenCV 3 on Ubuntu: http://rodrigoberriel.com/2014/10/installing-opencv-3-0-0-on-ubuntu-14-04/

    - Using OpenCV 3 with Eclipse: http://rodrigoberriel.com/2014/10/using-opencv-3-0-0-with-eclipse/
