
CSE152, Spr 2011 Intro Computer Vision

Introduction

Introduction to Computer Vision CSE 152 Lecture 1

What is Computer Vision?
•  Trucco and Verri (Text): Computing properties of the 3-D world from one or more digital images
•  Stockman and Shapiro: To make useful decisions about real physical objects and scenes based on sensed images
•  Ballard and Brown: The construction of explicit, meaningful descriptions of physical objects from images.
•  Forsyth and Ponce: Extracting descriptions of the world from pictures or sequences of pictures

Why is this hard?

What is in this image?
1.  A hand holding a man?
2.  A hand holding a mirrored sphere?
3.  An Escher drawing?
4.  A 1935 self-portrait of Escher?

•  Interpretations are ambiguous
•  The forward problem (graphics) is well-posed
•  The “inverse problem” (vision) is not

We all make mistakes
“640K ought to be enough for anybody.” – Bill Gates, 1981
“…” – Marvin Minsky

What do you see?

What was happening

Should Computer Vision follow from our understanding of Human Vision?

Yes & No

Yes:
1.  Who would ever be crazy enough to even try creating machine vision?
2.  Human vision “works”, and copying is easier than creating.
3.  Secondary benefit – in trying to mimic human vision, we learn about it.

No:
1.  Why limit oneself to human vision when there is even greater diversity in biological vision?
2.  Why limit oneself to biological vision when there may be greater diversity in sensing mechanisms?
3.  Biological vision systems evolved to provide functions for “specific” tasks and “specific” environments. These may differ for machine systems.
4.  Implementation – hardware is different, and synthetic vision systems may use different techniques/methodologies that are more appropriate to computational mechanisms.

The Near Future: Ubiquitous Vision
•  Digital video has become really cheap.
•  It’s widely embedded in cell phones, cars, games, etc.
•  99.9% of digitized video isn’t seen by a person.
•  That doesn’t mean that only 0.1% is important!

Applications: touching your life
•  Digital photography
•  Google Goggles
•  Video games
•  Football
•  Movies
•  Surveillance
•  HCI – hand gestures
•  Aids to the blind
•  Face recognition & biometrics
•  Road monitoring
•  Industrial inspection
•  Robotic control
•  Autonomous driving
•  Space: planetary exploration, docking
•  Medicine – pathology, surgery, diagnosis
•  Microscopy
•  Military
•  Remote Sensing
•  Virtual Earth; street view
•  Homeland security

Related Fields
•  Image Processing
•  Computer Graphics
•  Pattern Recognition
•  Perception
•  Robotics
•  AI

What is a digital image?
•  Formed by a camera (usually).
•  Divided into smaller elements (pixels)
•  Stored as a matrix of numbers (usually positive)
•  8-bit, 12-bit, 16-bit, float
•  Color image – 3 channels
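
As a rough illustration of the points above, an image can be held as a matrix of numbers. This sketch uses plain Python lists rather than an imaging library; the pixel values and the helper name `quantize` are made up for illustration and are not part of the course material.

```python
# A tiny 2x3 grayscale image: one number per pixel, stored row by row.
gray = [
    [0, 128, 255],
    [64, 192, 32],
]
height, width = len(gray), len(gray[0])

# A color image has 3 channels per pixel, here ordered (R, G, B).
# The channel values are arbitrary and only show the layout.
color = [[(v, v, 255 - v) for v in row] for row in gray]

def quantize(value, bits):
    """Map a float in [0.0, 1.0] to an unsigned integer of the given bit depth."""
    levels = (1 << bits) - 1              # 255 for 8-bit, 4095 for 12-bit, 65535 for 16-bit
    clipped = min(max(value, 0.0), 1.0)   # out-of-range values saturate
    return round(clipped * levels)

print(height, width)      # number of rows, number of columns
print(color[0][1])        # the (R, G, B) triple at row 0, column 1
print(quantize(1.0, 8), quantize(1.0, 12), quantize(1.0, 16))
```

The same float value maps to different integers depending on bit depth, which is why the bit depth must be known before pixel values from two images can be compared.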

How do we interpret an image?
Use different cues
•  Variation in appearance in multiple views
   –  stereo
   –  motion
•  Shading & highlights
•  Shadows
•  Contours
•  Texture
•  Blur
•  Geometric constraints
•  Prior knowledge

An example of a cue: Shading and lighting

Shading as a result of differences in lighting is
1.  A source of information
2.  An annoyance

Illumination Variability

“The variations between the images of the same face due to illumination and viewing direction are almost always larger than image variations due to change in face identity.”

– Moses, Adini, Ullman, ECCV ’94

An example: Image Formation Model

Lambertian Reflectance:

At image location (x,y) the intensity of a pixel I(x,y) is

I(x,y) = a(x,y) n(x,y) · s

where
•  a(x,y) is the albedo of the surface projecting to (x,y).
•  n(x,y) is the unit surface normal.
•  s is a vector pointing toward the light source, with length equal to the source strength.
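
The Lambertian model above can be evaluated directly for a single pixel. The following is a minimal sketch, assuming 3-vectors stored as tuples; the function names and the numeric values are hypothetical. The clamp at zero is an addition not stated on the slide: it treats a surface facing away from the source as unlit rather than giving it negative intensity.

```python
import math

def normalize(v):
    """Scale a 3-vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def lambertian_intensity(albedo, normal, light):
    """I = a * (n . s), clamped at zero for surfaces facing away from the light."""
    n = normalize(normal)                               # the model assumes a unit normal
    n_dot_s = sum(nc * sc for nc, sc in zip(n, light))  # s carries direction and strength
    return albedo * max(n_dot_s, 0.0)

# Hypothetical light of strength 2 located along the +z axis.
s = (0.0, 0.0, 2.0)
facing = lambertian_intensity(0.5, (0.0, 0.0, 1.0), s)    # normal points at the light
tilted = lambertian_intensity(0.5, (1.0, 0.0, 1.0), s)    # normal tilted 45 degrees
away = lambertian_intensity(0.5, (0.0, 0.0, -1.0), s)     # normal points away
print(facing, tilted, away)
```

The tilted surface comes out darker than the facing one even though both have the same albedo, which is the sense in which shading carries information about shape.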

An implemented algorithm: Relighting

Single Light Source

An implemented algorithm: Photometric Stereo
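
One common formulation of photometric stereo inverts the Lambertian model I = a n · s: with three images of a static scene under three known, non-coplanar light vectors, each pixel gives three linear equations in the vector b = a n. The sketch below is an illustration under those assumptions (no shadows, nonzero albedo, exactly three lights) and is not necessarily the implementation shown in lecture; all function names and numbers are hypothetical.

```python
import math

def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def solve3(S, I):
    """Solve S b = I for b with Cramer's rule (S must be non-singular)."""
    d = det3(S)
    if abs(d) < 1e-12:
        raise ValueError("light vectors must not be coplanar")
    b = []
    for col in range(3):
        # Replace one column of S with the measured intensities.
        m = [[I[r] if c == col else S[r][c] for c in range(3)] for r in range(3)]
        b.append(det3(m) / d)
    return b

def photometric_stereo_pixel(lights, intensities):
    """Recover (albedo, unit normal) at one pixel from three measurements."""
    b = solve3(lights, intensities)            # b = albedo * normal
    albedo = math.sqrt(sum(c * c for c in b))  # the normal has unit length, so |b| = albedo
    normal = tuple(c / albedo for c in b)      # assumes albedo > 0
    return albedo, normal

# Synthetic check: render one pixel with a known albedo and normal, then recover them.
lights = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (0.0, 1.0, 1.0)]
true_albedo, true_normal = 0.8, (0.0, 0.6, 0.8)
intensities = [true_albedo * sum(n * s for n, s in zip(true_normal, light))
               for light in lights]
albedo, normal = photometric_stereo_pixel(lights, intensities)
print(albedo, normal)
```

With more than three lights the same system would normally be solved in the least-squares sense, which also reduces sensitivity to noise.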

The course
•  Part 1: The Physics of Imaging
•  Part 2: Early Vision (Segmentation)
•  Part 3: Reconstruction (Shape-from-X)
•  Part 4: Recognition

Part 1 of Course: The Physics of Imaging

•  How images are formed
   –  Cameras
      •  What a camera does
      •  How to tell where the camera was located
   –  Light
      •  How to measure light
      •  What light does at surfaces
      •  How the brightness values we see in cameras are determined
   –  Color
      •  The underlying mechanisms of color
      •  How to describe it and measure it

A real camera … and its model

Cameras, lenses, and sensors

From Computer Vision, Forsyth and Ponce, Prentice-Hall, 2002.

•  Pinhole cameras
•  Lenses
•  Projection models
•  Geometric camera parameters
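
As a preview of the pinhole camera and projection models listed above, here is a minimal sketch of perspective projection. It assumes the pinhole is at the origin, the camera looks along +Z, and the image plane sits at distance `focal_length` in front of the pinhole (so the image is not flipped); the function name and the sample points are hypothetical.

```python
def project_pinhole(point, focal_length):
    """Perspective projection of a 3-D point (X, Y, Z) given in camera coordinates.

    Returns image-plane coordinates (x, y) = (f * X / Z, f * Y / Z).
    """
    X, Y, Z = point
    if Z <= 0:
        raise ValueError("point must be in front of the camera (Z > 0)")
    return (focal_length * X / Z, focal_length * Y / Z)

# The same point at twice the depth projects to half the image offset.
near = project_pinhole((1.0, 2.0, 4.0), focal_length=2.0)
far = project_pinhole((1.0, 2.0, 8.0), focal_length=2.0)
print(near, far)
```

The division by depth is what discards information: every point along the ray through the pinhole lands on the same image location.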

Color

Part 2: Early Vision in One Image
•  Representing small patches of image
   –  For three reasons
      •  Sharp changes are important in practice, known as “edges”
      •  Representing texture by giving some statistics of the different kinds of small patch present in the texture.
         –  Tigers have lots of bars, few spots
         –  Leopards are the other way
      •  We wish to establish correspondence between (say) points in different images, so we need to describe the neighborhood of the points
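
One simple way to find the "sharp changes" mentioned above is to measure the image gradient at each pixel. The sketch below uses central differences on a list-of-rows image; it is an illustration only, with a hypothetical function name and a made-up test image, and it skips the smoothing that practical edge detectors apply first.

```python
import math

def gradient_magnitude(image):
    """Central-difference gradient magnitude for interior pixels.

    image is a list of rows of numbers; border pixels are left at 0.0.
    """
    h, w = len(image), len(image[0])
    mag = [[0.0] * w for _ in range(h)]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            gx = (image[r][c + 1] - image[r][c - 1]) / 2.0   # horizontal change
            gy = (image[r + 1][c] - image[r - 1][c]) / 2.0   # vertical change
            mag[r][c] = math.hypot(gx, gy)
    return mag

# Hypothetical 4x6 image: dark on the left, bright on the right.
image = [[0, 0, 0, 100, 100, 100] for _ in range(4)]
edges = gradient_magnitude(image)
print(edges[1])   # large values only in the columns next to the dark/bright boundary
```

Thresholding the magnitude gives a crude edge map; the flat regions on either side of the boundary produce zero response.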

Segmentation
•  Which image components “belong together”?
•  Belong together ≅ lie on the same object
•  Cues
   –  similar color
   –  similar texture
   –  not separated by contour
   –  form a suggestive shape when assembled

Texture Patterns [Leung, Malik]

•  Regular texture pattern, repeated texture elements
•  Segment image based on texture
•  Surface shape from texture pattern

Boundary Detection

http://www.robots.ox.ac.uk/~vdg/dynamics.html

Boundary Detection: Local cues

Gradients

(Sharon, Galun, Brandt, Basri)

Part 3: Reconstruction from Multiple Images

•  Photometric Stereo
   –  What we know about the world from lighting changes.
•  The geometry of multiple views
•  Stereopsis
   –  What we know about the world from having two eyes
•  Structure from motion
   –  What we know about the world from having many eyes
      •  or, more commonly, our eyes moving.

Mars Rover Spirit

From Viking

Xbox Kinect

Multi-view geometry applications
•  Image stitching, mosaicing
   –  http://www.yosemite-17-gigapixels.com/
   –  214,414 x 80,571 pixels from 2k images.
   –  Each pixel subtends less than 3 arc seconds.
•  Photosynth
   –  http://photosynth.net/
   –  In Bing Maps

Images with marked features

Recovered model edges reprojected through recovered camera positions into the three original images

Resulting model & Camera Positions

Façade (Debevec, Taylor and Malik, 1996) Reconstruction from multiple views, constraints, rendering

Architectural modeling:
•  photogrammetry;
•  view-dependent texture mapping;
•  model-based stereopsis.

Reprinted from “Modeling and Rendering Architecture from Photographs: A Hybrid Geometry- and Image-Based Approach,” By P. Debevec, C.J. Taylor, and J. Malik, Proc. SIGGRAPH (1996). © 1996 ACM, Inc. Included here by permission.

Façade

Video-Motion Analysis
•  Where “things” are moving in image – segmentation.
•  Determining observer motion (egomotion)
•  Determining scene structure
•  Tracking objects
•  Understanding activities & actions

Forward Translation & Focus of Expansion [Gibson, 1950]

Tracking
•  Use a model to predict next position and refine using next image
•  Model:
   –  simple dynamic models (second order dynamics)
   –  kinematic models
   –  etc.
•  Face tracking and eye tracking now work rather well
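
The predict-then-refine loop described above can be sketched with an alpha-beta filter, a simple tracker that keeps a position and a velocity per target. This is one possible stand-in for the "simple dynamic models" on the slide, not the method used in lecture; the function name, the gain values, and the sample measurements are all hypothetical.

```python
def alpha_beta_track(measurements, dt=1.0, alpha=0.85, beta=0.005):
    """Track a 1-D position with an alpha-beta filter.

    Each step predicts the next position from a constant-velocity model,
    then corrects the prediction using the new measurement.
    """
    position, velocity = measurements[0], 0.0
    estimates = [position]
    for z in measurements[1:]:
        predicted = position + velocity * dt           # predict from the motion model
        residual = z - predicted                       # disagreement with the new image
        position = predicted + alpha * residual        # refine the position estimate
        velocity = velocity + (beta / dt) * residual   # refine the velocity estimate
        estimates.append(position)
    return estimates

# Hypothetical noisy x-coordinates of a target moving about one pixel per frame.
observed = [0.0, 1.1, 1.9, 3.2, 4.0]
smoothed = alpha_beta_track(observed)
print(smoothed)
```

Larger `alpha` trusts the measurements more; smaller `alpha` trusts the motion model more, which helps when detections are noisy but lags when the target changes direction.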

Visual Tracking

Main Challenges
1.  3-D Pose Variation
2.  Occlusion of the target
3.  Illumination variation
4.  Camera jitter
5.  Expression variation
etc.

[ Ho, Lee, Kriegman ]

Tracking

(www.brickstream.com)

Part 4: Recognition

Given a database of objects and an image, determine which, if any, of the objects are present in the image.

Recognition Challenges
•  Within-class variability
   –  Different objects within the class have different shapes or different material characteristics
   –  Deformable
   –  Articulated
   –  Compositional
•  Pose variability:
   –  2-D Image transformation (translation, rotation, scale)
   –  3-D Pose Variability (perspective, orthographic projection)
•  Lighting
   –  Direction (multiple sources & type)
   –  Color
   –  Shadows
•  Occlusion – partial
•  Clutter in background -> false positives
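
To make the 2-D image transformation in the list above concrete, the sketch below applies translation, rotation, and uniform scale to a set of 2-D points. The function name, argument order, and sample values are hypothetical; a recognizer that is meant to handle this kind of pose variability has to give the same answer for the original and the transformed points.

```python
import math

def similarity_transform(points, scale, angle_deg, tx, ty):
    """Scale and rotate 2-D points about the origin, then translate them."""
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    out = []
    for x, y in points:
        xr = scale * (cos_t * x - sin_t * y) + tx   # rotated, scaled, shifted x
        yr = scale * (sin_t * x + cos_t * y) + ty   # rotated, scaled, shifted y
        out.append((xr, yr))
    return out

# Corners of a unit square, doubled in size, turned 90 degrees, shifted 5 units right.
corners = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
moved = similarity_transform(corners, scale=2.0, angle_deg=90.0, tx=5.0, ty=0.0)
print(moved)
```

Distances between points are multiplied by `scale` and angles are preserved, so shape descriptors built from angles or distance ratios are unaffected by this family of transformations.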

Face Detection: First Step
•  Digital Cameras: Face + Expression
•  Street view
•  Crowd counting

Why is Face Recognition Hard?
Many faces of Madonna

Face Recognition: 2-D and 3-D

[Diagram labels: Face Database (2-D, 3-D); Recognition Data (2-D, 3-D, Time (video)); Recognition Comparison]

Yale Face Database B

64 Lighting Conditions × 9 Poses => 576 Images per Person

http://www.ri.cmu.edu/projects/project_271.html

Model-Based Vision

•  Given 3-D models of each object
•  Detect image features (often edges, line segments, conic sections)
•  Establish correspondence between model & image features
•  Estimate pose
•  Check consistency of projected model with image.

•  Google Goggles

Announcements
•  Class Web Page:
   –  http://www.cs.ucsd.edu/classes/sp11/cse152-a/
•  Assignment 0: “Getting Started with Matlab” (to be posted soon), due 4/7/11
•  Read Chapter 1 of Trucco & Verri

The Syllabus