CS 496: Computer Vision

Thanks to Chris Bregler

CS 496: Computer Vision

• Personnel
– Instructor: Szymon Rusinkiewicz, smr@cs.princeton.edu
– TA: Wagner Corrêa, wtcorrea@cs.princeton.edu
– Email to both: cs496@princeton.edu

• Course web page: http://www.cs.princeton.edu/courses/cs496/

What is Computer Vision?

• Input: images or video

• Output: description of the world
– Many levels of description

Low-Level or “Early” Vision

• Considers local properties of an image

“There’s an edge!”

Mid-Level Vision

• Grouping and segmentation

“There’s an object and a background!”

High-Level Vision

• Recognition

“It’s a chair!”

Big Question #1: Who Cares?

• Applications of computer vision
– In AI: vision serves as the “input stage”
– In medicine: understanding human vision
– In engineering: model extraction

Vision and Other Fields

[Diagram: Computer Vision and its neighboring fields: Artificial Intelligence, Cognitive Psychology, Signal Processing, Computer Graphics, Pattern Analysis, Metrology]

Big Question #2: Does It Work?

• Situation much the same as AI:
– Some fundamental algorithms
– Large collection of hacks / heuristics

• Vision is hard!
– Especially at high level, physiology unknown
– Requires integrating many different methods
– Requires reasoning and understanding: “AI completeness”

Computer and Human Vision

• Emulating effects of human vision

• Understanding physiology of human vision

Image Formation

• Human: lens forms image on retina, sensors (rods and cones) respond to light

• Computer: lens system forms image, sensors (CCD, CMOS) respond to light

Low-Level Vision

[Figure credit: Hubel]

Low-Level Vision

• Retinal ganglion cells

• Lateral Geniculate Nucleus: function unknown (visual adaptation?)

• Primary Visual Cortex
– Simple cells: orientational sensitivity
– Complex cells: directional sensitivity

• Further processing
– Temporal cortex: what is the object?
– Parietal cortex: where is the object? How do I get it?

Low-Level Vision

• Net effect: low-level human vision can be (partially) modeled as a set of multiresolution, oriented filters

Low-Level Depth Cues

• Focus

• Vergence

• Stereo

• Not as important as popularly believed

Low-Level Computer Vision

• Filters and filter banks
– Implemented via convolution
– Detection of edges, corners, and other local features
– Can include multiple orientations
– Can include multiple scales: “filter pyramids”

• Applications
– First stage of segmentation
– Texture recognition / classification
– Texture synthesis
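As a rough illustration of "filters implemented via convolution", the sketch below slides a small kernel over a toy image to detect a vertical edge. The Sobel kernel, the helper name `convolve2d`, and the toy image are assumptions chosen for this example, not material from the slides.

```python
def convolve2d(image, kernel):
    """Valid-mode 2D convolution of a list-of-lists image with a small kernel."""
    kh, kw = len(kernel), len(kernel[0])
    ih, iw = len(image), len(image[0])
    # True convolution flips the kernel in both axes before sliding it.
    flipped = [row[::-1] for row in kernel[::-1]]
    out = []
    for y in range(ih - kh + 1):
        row = []
        for x in range(iw - kw + 1):
            acc = 0.0
            for j in range(kh):
                for i in range(kw):
                    acc += image[y + j][x + i] * flipped[j][i]
            row.append(acc)
        out.append(row)
    return out

# Sobel kernel sensitive to horizontal intensity changes (vertical edges).
SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]

# Toy 5x6 image (assumed data): dark left half, bright right half.
image = [[0, 0, 0, 10, 10, 10] for _ in range(5)]
response = convolve2d(image, SOBEL_X)

# Edge strength is the magnitude of the filter response.
edge_strength = [[abs(v) for v in row] for row in response]
print(edge_strength[0])
```

The response is zero over flat regions and large only where the window straddles the dark-to-bright boundary. Running kernels at several orientations and image scales would give a simple filter bank of the kind the slide describes.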

Texture Analysis / Synthesis

[Figure labels: Original Image, Image Pyramid, Multiresolution Oriented Filter Bank]

Texture Analysis / Synthesis

[Figure labels: Original Texture, Synthesized Texture. Credit: Heeger and Bergen]

Low-Level Computer Vision

• Optical flow
– Detecting frame-to-frame motion
– Local operator: looking for gradients

• Applications
– First stage of tracking
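One common way to turn "looking for gradients" into a motion estimate is a least-squares fit of the brightness-constancy equation Ix·u + Iy·v + It = 0 over a window (a Lucas-Kanade style estimator). The slides do not name a specific method, so treat the following as a minimal sketch under that assumption; `estimate_flow`, `make_frame`, and the synthetic frames are hypothetical.

```python
def estimate_flow(frame1, frame2):
    """Estimate a single (u, v) motion vector for the whole patch by least squares."""
    h, w = len(frame1), len(frame1[0])
    sxx = sxy = syy = sxt = syt = 0.0
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Spatial gradients by central differences, temporal gradient by frame difference.
            ix = (frame1[y][x + 1] - frame1[y][x - 1]) / 2.0
            iy = (frame1[y + 1][x] - frame1[y - 1][x]) / 2.0
            it = frame2[y][x] - frame1[y][x]
            sxx += ix * ix
            sxy += ix * iy
            syy += iy * iy
            sxt += ix * it
            syt += iy * it
    # Solve the 2x2 normal equations [[sxx, sxy], [sxy, syy]] (u, v) = -(sxt, syt).
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-9:
        raise ValueError("gradients do not constrain both flow components")
    u = (-syy * sxt + sxy * syt) / det
    v = (sxy * sxt - sxx * syt) / det
    return u, v

def make_frame(size, dx=0.0, dy=0.0):
    """Synthetic test image (assumed data): a smooth intensity bowl shifted by (dx, dy)."""
    c = (size - 1) / 2.0
    return [[(x - c - dx) ** 2 + (y - c - dy) ** 2 for x in range(size)]
            for y in range(size)]

frame1 = make_frame(9)
frame2 = make_frame(9, dx=0.5, dy=-0.25)  # scene content moved by (0.5, -0.25) pixels
u, v = estimate_flow(frame1, frame2)
print(u, v)
```

The determinant check reflects the aperture problem: if every gradient in the window points the same way, only one component of the motion can be recovered.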

Optical Flow

[Figure labels: Image #1, Image #2, Optical Flow Field]

Low-Level Computer Vision

• Shape from X
– Stereo
– Motion
– Shading
– Texture foreshortening

3D Reconstruction

[Figure credits: Tomasi and Kanade; Debevec, Taylor, and Malik; Pighin et al.; Forsyth et al.]

Mid-Level Vision

• Physiology unclear

• Observations by Gestalt psychologists (Wertheimer)
– Proximity
– Similarity
– Common fate
– Common region
– Parallelism
– Closure
– Symmetry
– Continuity
– Familiar configuration

Grouping Cues

Mid-Level Computer Vision

• Techniques
– Clustering based on similarity
– Limited work on other principles

• Applications
– Segmentation / grouping
– Tracking
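To make "clustering based on similarity" concrete, here is a minimal sketch that groups pixel intensities with k-means. The choice of k-means, the function name `kmeans_1d`, and the sample intensities are assumptions for illustration; the slides do not prescribe a particular clustering algorithm.

```python
def kmeans_1d(values, k=2, iterations=20):
    """Cluster scalar values (e.g. pixel intensities) into k groups; requires k >= 2."""
    # Deterministic initialization: spread the centers evenly between min and max.
    lo, hi = min(values), max(values)
    centers = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each value joins its most similar (nearest) center.
        labels = [min(range(k), key=lambda j: abs(v - centers[j])) for v in values]
        # Update step: each center moves to the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            if members:
                centers[j] = sum(members) / len(members)
    return labels, centers

# Assumed toy data: intensities from a dark object on a bright background.
pixels = [12, 15, 10, 200, 210, 14, 205, 198]
labels, centers = kmeans_1d(pixels)
print(labels, centers)
```

Mapping each label back to its pixel position gives a crude two-region segmentation. The same loop works on color vectors if the absolute difference is replaced by a distance in color space.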

Snakes: Active Contours

Contour Evolution for Segmenting an Artery

Birchfield

Histograms

Expectation Maximization (EM)

Color Segmentation

Bayesian Methods

• Prior probability
– Expected distribution of models

• Conditional probability P(A|B)
– Probability of observation A given model B

• Bayes’s Rule: P(B|A) = P(A|B) P(B) / P(A)
– Probability of model B given observation A

Thomas Bayes (c. 1702-1761)
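A small worked example may help fix the roles of prior, conditional probability, and posterior. The sketch below applies Bayes's rule to two candidate models; the model names "a" and "b" and all probability values are made-up numbers for illustration.

```python
def posterior(likelihoods, priors):
    """Bayes's rule: P(model | obs) = P(obs | model) * P(model) / P(obs)."""
    # P(obs) is obtained by summing P(obs | model) * P(model) over all models.
    evidence = sum(likelihoods[m] * priors[m] for m in priors)
    return {m: likelihoods[m] * priors[m] / evidence for m in priors}

# Hypothetical numbers: model "a" is more common a priori,
# but the observation X is far more likely under model "b".
priors = {"a": 0.8, "b": 0.2}        # P(model)
likelihoods = {"a": 0.1, "b": 0.6}   # P(X | model)

post = posterior(likelihoods, priors)
best = max(post, key=post.get)
print(post, best)
```

Here the evidence outweighs the prior: the posterior favors "b" even though "a" started out four times as probable.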

Bayesian Methods

[Figure: conditional distributions P(X|a) and P(X|b), each plotted against # black pixels]

High-Level Vision

• Human mechanisms: ???

High-Level Vision

• Computational mechanisms
– Bayesian networks
– Templates
– Linear subspace methods
– Kinematic models

Cootes et al.

Template-Based Methods

Linear Subspaces

[Figure labels: Data, PCA, New Basis Vectors. Credit: Kirby et al.]

Principal Components Analysis (PCA)
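PCA finds new basis vectors aligned with the directions of greatest variance in the data. The sketch below handles only the 2D case, where the covariance matrix has closed-form eigenvalues. The function name `pca_2d` and the sample points are assumptions; real image data would need a general eigen-solver or SVD.

```python
import math

def pca_2d(points):
    """Return the mean and the (variance, direction) pairs of 2D data, largest first."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    # Sample covariance matrix [[a, b], [b, c]] of the centered data.
    a = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    c = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    b = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    # Closed-form eigenvalues of a symmetric 2x2 matrix.
    half_trace = (a + c) / 2.0
    root = math.sqrt(((a - c) / 2.0) ** 2 + b * b)
    lam1, lam2 = half_trace + root, half_trace - root
    # Orientation of the eigenvector belonging to the larger eigenvalue.
    theta = 0.5 * math.atan2(2.0 * b, a - c)
    first = (math.cos(theta), math.sin(theta))
    second = (-math.sin(theta), math.cos(theta))
    return (mx, my), [(lam1, first), (lam2, second)]

# Assumed toy data lying exactly on the line y = x / 2.
points = [(2.0, 1.0), (4.0, 2.0), (6.0, 3.0), (8.0, 4.0)]
mean, components = pca_2d(points)
(lam1, first), (lam2, second) = components
print(mean, lam1, first, lam2)
```

Because the points are collinear, all variance falls on the first basis vector and the second eigenvalue is zero up to rounding. Dropping such low-variance directions is what makes linear subspace methods compact.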

Kinematic Models

• Optical Flow/Feature tracking: no constraints

• Layered Motion: rigid constraints

• Articulated: kinematic chain constraints

• Nonrigid: implicit / learned constraints

Real-world Applications

[Figure credit: Osuna et al.]

Course Outline

• Image formation and capture

• Filtering and feature detection

• Optical flow and tracking

• Projective geometry

• Shape from X

• Segmentation and clustering

• Recognition

• Applications: 3D scanning; image-based rendering

3D Scanning

Image-Based Modeling and Rendering

Debevec et al.

Manex

Course Mechanics

• 60%: 4 written / programming assignments

• 30%: Final group project

• 10%: In-class participation (includes attendance, project presentation, etc.)

Course Mechanics

• Book: Computer Vision: A Modern Approach, by David Forsyth and Jean Ponce

• Papers

• All online, available from class webpage
