CAP 6412 Advanced Computer Vision - …bgong/CAP6412/lec22.pdfA Brief Introduction to 3D Computer...


CAP 6412 Advanced Computer Vision

http://www.cs.ucf.edu/~bgong/CAP6412.html

Boqing Gong

March 31, 2016

Today

• Administrivia
• RNN (Review: model, learning, challenge & solution)
• LSTM

Project II posted, due Tuesday 04/26, 11:59pm

• http://www.cs.ucf.edu/~bgong/CAP6412/proj2.pdf

• Next Tuesday: last day to acquire permission for taking option 2

Next week

Tuesday (04/05)

Abdullah Jamal

Thursday (04/07)

Samer Iskander


Recurrent Neural Network

• Three time steps and beyond

Image credits: Richard Socher

RNN

• Three time steps and beyond
• A layered feedforward net
• Tied weights for different time steps
• Conditioning (memorizing?) on all previous input
• Cheap to save memory in RAM

Image credits: Richard Socher
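The "tied weights" and "conditioning on all previous input" points can be made concrete with a minimal sketch. It assumes a scalar hidden state and a tanh nonlinearity for brevity; the function name `rnn_forward` and the parameter names `w_xh`, `w_hh`, `b_h` are illustrative and do not come from the slides.

```python
import math

def rnn_forward(xs, w_xh, w_hh, b_h, h0=0.0):
    """Unroll a scalar-state RNN over the input sequence xs.

    The same parameters (w_xh, w_hh, b_h) are reused at every time step,
    which is what "tied weights" refers to.
    """
    h = h0
    states = []
    for x in xs:
        # h_t sees x_t directly and all earlier inputs through h_{t-1}
        h = math.tanh(w_xh * x + w_hh * h + b_h)
        states.append(h)
    return states

# Three time steps: only the first input is non-zero
states = rnn_forward([1.0, 0.0, 0.0], w_xh=0.5, w_hh=0.8, b_h=0.0)
```

Unrolled this way, the network is a three-layer feedforward net whose layers share one set of weights. The last state is still non-zero even though the last two inputs are zero, because the first input is carried forward through the hidden state.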

LSTM slides borrowed from Hinton

Comparing with Plain RNN

• Three time steps and beyond

Image credits: Richard Socher

A Brief Introduction to 3D Computer Vision

Presented by Karan Daei-Mojdehi

3D Computer Vision
    What is 3D Computer Vision?
    3D Computer Vision Applications
    A short history of Approaches to 3D Vision Problems
    A new Challenge: Single Image Depth Estimation of General Scenes

“Learning Depth from Single Monocular Images Using Deep Convolutional Neural Fields”
    About the paper
    Problem Statement
    Approach Outline
    Approach Details
    Experiment Results
    Strength and Weakness of the paper
    Future Directions

References

A Brief Introduction to 3D Computer Vision

Presented by Karan Daei-Mojdehi

Department of Computer Science
University of Central Florida

March 31, 2016


Outline

1 3D Computer Vision
    What is 3D Computer Vision?
    3D Computer Vision Applications
    A short history of Approaches to 3D Vision Problems
    A new Challenge: Single Image Depth Estimation of General Scenes

2 “Learning Depth from Single Monocular Images Using Deep Convolutional Neural Fields”
    About the paper
    Problem Statement
    Approach Outline
    Approach Details
    Experiment Results
    Strength and Weakness of the paper
    Future Directions


A Hint About 3D Vision

image source: http://www.markedbyteachers.com/as-and-a-level/psychology/perception-cognitive-psychology-a-revision-categories.html


What is 3D Computer Vision?

Definition: extraction of 3D information from digital images

i.e., we want to infer a scene’s geometry from images taken of that scene

Common representations of 3D information: depth maps, meshes, point clouds, volumetric models

Figures: example of a depth map (source: [4]); sample point cloud
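Two of these representations are closely related: a depth map can be turned into a point cloud by back-projecting each pixel. The sketch below assumes a pinhole camera with known intrinsics and depth measured along the optical axis; `fx`, `fy`, `cx`, `cy` and the function name are illustrative assumptions, not something the slides specify.

```python
def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project a depth map (list of rows) into a list of (X, Y, Z) points.

    Assumes a pinhole camera: focal lengths fx, fy and principal point
    (cx, cy) in pixels, with each depth value measured along the optical axis.
    """
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:
                # Non-positive depth is treated as "no measurement"
                continue
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points

# Tiny 2x2 depth map with one missing value
depth = [[2.0, 2.0],
         [0.0, 4.0]]
cloud = depth_to_point_cloud(depth, fx=1.0, fy=1.0, cx=0.5, cy=0.5)
```

The three valid pixels become three 3D points, and the pixel with zero depth is skipped.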

Motivation?


3D Reconstruction

3D reconstruction is the process of capturing the shape and appearance of real objects

Google Earth 3D Reconstruction of Aerial Images

UCF Campus 3D Reconstruction from Google Earth


Image Registration

Image registration is the process of aligning different, overlapping views of a scene

One familiar application is taking panorama pictures

Another interesting application is Microsoft Photosynth®, which emerged from Photo Tourism [5]


Other 3D Computer Vision Applications

Recognition From 3D Reconstruction

Robotics:

• Navigation
• Object manipulation with collision detection

Stereoscopy: creating an illusion of depth by creating two views of a scene for binocular vision


Shape From X series

Pioneering work: Shape From Shading [3] (also known as photoclinometry)

Downside: strong assumptions on imaging conditions:

• Lambertian surface (glossy coatings violate this)
• Orthographic projection (assuming projection onto the image plane is affine)
• A single, infinitely distant, known light source

Other similar Shape from X approaches: shape from texture, shape from
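To see why shape from shading needs these assumptions, it helps to write down the Lambertian image formation model it inverts: intensity is albedo times the cosine of the angle between the surface normal and the light direction. The sketch below illustrates that forward model only, not the method of [3]; the function names and the default albedo of 1 are assumptions.

```python
import math

def normalize(v):
    """Scale a 3-vector to unit length."""
    norm = math.sqrt(sum(c * c for c in v))
    return tuple(c / norm for c in v)

def lambertian_intensity(normal, light_dir, albedo=1.0):
    """Brightness of a Lambertian surface patch lit by one distant light.

    Intensity = albedo * max(0, n . l), with n and l unit vectors.
    Patches facing away from the light receive no direct illumination.
    """
    n = normalize(normal)
    l = normalize(light_dir)
    cos_theta = sum(a * b for a, b in zip(n, l))
    return albedo * max(0.0, cos_theta)

facing = lambertian_intensity((0, 0, 1), (0, 0, 1))   # normal points at the light
tilted = lambertian_intensity((0, 1, 1), (0, 0, 1))   # normal tilted by 45 degrees
away = lambertian_intensity((0, 0, -1), (0, 0, 1))    # normal points away
```

A single intensity value constrains only the angle between the normal and the light, so many normals explain the same pixel. Recovering shape therefore relies on a known light source plus further constraints across the surface.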


Structure from Motion

The shape of an object can be inferred by tracking its features across the frames of a video

An example of software that exploits this technique is Video Trac


Depth from Defocus

Take two images with different focus blur

Exploit precise blur estimation and the magnification variation that results from changing focus

Image credits: [6]


Model-Based Techniques

Derive a morphable model by scanning a huge number of examples

New objects of the same type can be modeled by forming a linear combination of prototypes

A Morphable Model for the Synthesis of 3D Faces [1]:

image credits: [1]
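The "linear combination of prototypes" idea can be sketched in a few lines. The example assumes each prototype shape is a flat list of vertex coordinates in dense correspondence (same vertex order in every prototype) and that the weights sum to 1; the function name and the toy data are hypothetical.

```python
def combine_prototypes(prototypes, weights):
    """Return sum_i weights[i] * prototypes[i], coordinate by coordinate.

    Each prototype is a flat list [x1, y1, z1, x2, y2, z2, ...] with the
    same vertex ordering, so coordinates at the same index correspond.
    """
    if len(prototypes) != len(weights):
        raise ValueError("need exactly one weight per prototype")
    length = len(prototypes[0])
    if any(len(p) != length for p in prototypes):
        raise ValueError("all prototypes must have the same number of coordinates")
    return [sum(w * p[i] for w, p in zip(weights, prototypes))
            for i in range(length)]

# Two toy "shapes" with two vertices each
proto_a = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]
proto_b = [0.0, 0.0, 2.0, 1.0, 2.0, 0.0]
new_shape = combine_prototypes([proto_a, proto_b], [0.5, 0.5])
```

With equal weights the new shape lies halfway between the two prototypes. Other weight vectors give other members of the same object class.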


Stereopsis

Based on visual disparity (i.e., parallax)

We can say that we have reached a point where stereo = laser scan

Boils down to a problem of matching robust feature descriptors in two views (in the case of an uncalibrated camera, which is the general case)
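Once features are matched, disparity converts to depth by triangulation. The sketch below assumes the simpler rectified, calibrated setting (not the uncalibrated general case mentioned above), where depth = focal length × baseline / disparity. The parameter names and numbers are illustrative.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth in metres of a point seen in a rectified stereo pair.

    disparity_px: horizontal shift of the matched feature between views (pixels)
    focal_px:     focal length in pixels
    baseline_m:   distance between the two camera centres (metres)
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_px * baseline_m / disparity_px

near = depth_from_disparity(disparity_px=64.0, focal_px=800.0, baseline_m=0.1)
far = depth_from_disparity(disparity_px=8.0, focal_px=800.0, baseline_m=0.1)
```

Large disparity means a nearby point and small disparity a distant one, which is why depth resolution degrades with distance.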


A new Challenge: Single Image Depth Estimation of General Scenes

Motivation:

• Shortcomings of previous approaches
• Cues for depth exist even in single images

Ill-posed problem → non-deterministic approaches

Probabilistic and learning-based methods are applied


Learning Depth from Single Monocular Images Using Deep Convolutional Neural Fields [4]

By F. Liu, et al.

University of Adelaide, Australia

Currently holds state-of-the-art results

A Deep Learning Approach

Main Contributions: introduces a framework for joint training of a CNN and a continuous CRF


Problem Statement: Monocular Depth Estimation

Infer the depth of each pixel given a single RGB image of a scene

An essential step for 3D reconstruction of a scene

Ill-posed problem, as there are only subtle cues for depth in a single image (parallax, occlusion, perspective, etc.)

image source: NYU v2 Depth Dataset


Approach Outline

Joint training of a deep convolutional network and a Conditional Random Field (CRF)

Continuous CRF, since depth is continuous

Unary and pairwise potentials of the CRF are learnt by separate networks


Conditional Random Field

First, superpixels are defined on small homogeneous regions of the image

y is a vector of continuous depth values, one per superpixel

A graphical model is defined on y

Z(x) is the partition function and E(y, x) is the energy function, the sum of a unary potential and a pairwise potential
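For reference, these quantities can be written out as they appear in the paper [4]. The number of superpixels n, the node set N and the edge set S (pairs of adjacent superpixels) follow the paper's notation and are not named in the slide text.

```latex
\begin{align}
\mathbf{y} &= [y_1, \dots, y_n] \in \mathbb{R}^n \\
\Pr(\mathbf{y} \mid \mathbf{x}) &= \frac{1}{Z(\mathbf{x})} \exp\bigl(-E(\mathbf{y}, \mathbf{x})\bigr) \\
Z(\mathbf{x}) &= \int_{\mathbf{y}} \exp\bigl(-E(\mathbf{y}, \mathbf{x})\bigr)\, d\mathbf{y} \\
E(\mathbf{y}, \mathbf{x}) &= \sum_{p \in \mathcal{N}} U(y_p, \mathbf{x}) + \sum_{(p,q) \in \mathcal{S}} V(y_p, y_q, \mathbf{x})
\end{align}
```

Because y is continuous, the partition function is an integral rather than a sum over discrete labellings.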


CRF Unary Potential

The unary term depends on z_p(θ), the scalar output of the last fully connected layer of a CNN similar to VGG net

Each superpixel is resized to 224x224 pixels and fed to this network

Figure: Network architecture of the deep CNN for the unary term
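
As a sketch, the unary term can be written as a squared error between the depth of superpixel p and the network output. The squared-error form is the one used in the Liu, Shen, and Lin paper listed in the references and is assumed here, since the slide text does not show the formula.

```latex
\begin{equation}
  U(y_p, \mathbf{x}; \theta) = \bigl(y_p - z_p(\theta)\bigr)^2,
  \qquad p = 1, \dots, n
\end{equation}
```

Here θ denotes the CNN parameters, so minimizing the unary term alone would simply regress each superpixel depth onto the network output.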

CRF Pairwise Potential

Adjacent, similar superpixels should have close depths

The pairwise potential enforces smoothness among neighbouring similar superpixels

R_pq is the output of the network in the pairwise part

The S_pq^(k) are different similarity matrices between adjacent superpixels
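
A minimal sketch of how the pairwise weights and the smoothness penalty could be computed. It assumes the form used in the Liu, Shen, and Lin paper, V(y_p, y_q) = 1/2 R_pq (y_p - y_q)^2 with R_pq a non-negative weighted sum of K similarity values. The function names, the (K, n, n) array layout, and the convention of summing over ordered pairs (each undirected edge counted in both directions) are assumptions made for illustration.

```python
import numpy as np


def pairwise_weights(similarities, beta):
    """Combine K similarity matrices into R, with R[p, q] = sum_k beta[k] * S_k[p, q].

    similarities: array of shape (K, n, n), zero for non-adjacent superpixels.
    beta: array of shape (K,), assumed non-negative so that R stays non-negative.
    """
    similarities = np.asarray(similarities, dtype=float)
    beta = np.asarray(beta, dtype=float)
    if np.any(beta < 0):
        raise ValueError("beta must be non-negative")
    # Contract the K axis of beta against the first axis of similarities.
    return np.tensordot(beta, similarities, axes=1)


def pairwise_energy(y, R):
    """Sum of 0.5 * R[p, q] * (y[p] - y[q])**2 over all ordered pairs (p, q)."""
    y = np.asarray(y, dtype=float)
    diff = y[:, None] - y[None, :]
    return 0.5 * np.sum(R * diff ** 2)


if __name__ == "__main__":
    S = [[[0.0, 1.0], [1.0, 0.0]],
         [[0.0, 0.5], [0.5, 0.0]]]
    R = pairwise_weights(S, beta=[2.0, 4.0])
    print(R)
    print(pairwise_energy([1.0, 3.0], R))
```

A larger R_pq penalizes a depth difference between superpixels p and q more strongly, which is how similar neighbours are pushed toward similar depths.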

Training the whole pipeline

We train with the objective of solving MAP inference of depth in the CRF

A closed-form solution is derived in the paper

Stochastic gradient descent is used to backpropagate the error and train the networks' parameters
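
The closed-form MAP solution can be sketched as follows. It assumes the energy from the Liu, Shen, and Lin paper, a squared-error unary term plus a 1/2 R_pq (y_p - y_q)^2 pairwise term summed over ordered neighbouring pairs with symmetric R. Under that assumption the energy is quadratic in y and its minimizer solves (I + D - R) y = z, where D is diagonal with D_pp = sum_q R_pq. The function names are hypothetical, and z stands in for the unary network outputs.

```python
import numpy as np


def crf_energy(y, z, R):
    """E(y) = sum_p (y_p - z_p)^2 + sum_{p,q} 0.5 * R[p, q] * (y_p - y_q)^2."""
    y = np.asarray(y, dtype=float)
    z = np.asarray(z, dtype=float)
    diff = y[:, None] - y[None, :]
    return np.sum((y - z) ** 2) + 0.5 * np.sum(R * diff ** 2)


def map_depth(z, R):
    """Closed-form minimizer of crf_energy: solve (I + D - R) y = z."""
    z = np.asarray(z, dtype=float)
    R = np.asarray(R, dtype=float)
    D = np.diag(R.sum(axis=1))       # degree matrix of the pairwise weights
    A = np.eye(len(z)) + D - R       # positive definite when R >= 0 and symmetric
    return np.linalg.solve(A, z)     # avoids forming the explicit inverse


if __name__ == "__main__":
    z = np.array([1.0, 3.0])                 # unary network outputs (toy values)
    R = np.array([[0.0, 1.0], [1.0, 0.0]])   # pairwise weights (toy values)
    y_star = map_depth(z, R)
    print(y_star, crf_energy(y_star, z, R))
```

In the toy example the two predicted depths are pulled toward each other relative to z, which is the smoothing effect of the pairwise term. Since the solution is a differentiable function of z and R, gradients can flow back into both networks.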

Outline

1 3D Computer Vision
  What is 3D Computer Vision?
  3D Computer Vision Applications
  A short history of Approaches to 3D Vision Problems
  A new Challenge: Single Image Depth Estimation of General Scenes

2 “Learning Depth from Single Monocular Images Using Deep Convolutional Neural Fields”
  About the paper
  Problem Statement
  Approach Outline
  Approach Details
  Experiment Results
  Strength and Weakness of the paper
  Future Directions

Available Datasets for Evaluating Depth Estimation

KITTI dataset
  Videos taken from a driving vehicle
  Depths captured by a LiDAR sensor
  700 train and 697 test images from 28 scenes (extracted by Eigen [2] from the videos)

Make3D dataset
  Aligned depth maps from laser range sensors
  400 train and 134 test images

NYU Depth v2 dataset
  Composed of video sequences of indoor scenes recorded by a Kinect
  1449 aligned RGB-D images (with densely labeled segments)
  407,024 raw unlabeled RGB-D frames

Evaluation Metrics

These metrics are currently common for evaluating depth estimation results:
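
A minimal sketch of metrics that are commonly reported in the single-image depth estimation literature, including the Eigen et al. and Liu et al. papers listed in the references: average relative error, root mean squared error, average log10 error, and threshold accuracy. Treating these as the metrics meant on this slide is an assumption, and the function name and return layout are hypothetical.

```python
import numpy as np


def depth_metrics(pred, gt, thresholds=(1.25, 1.25 ** 2, 1.25 ** 3)):
    """Compare predicted and ground-truth depths (both strictly positive).

    Returns a dict with:
      rel   - mean of |pred - gt| / gt
      rms   - sqrt of the mean squared error
      log10 - mean of |log10(pred) - log10(gt)|
      acc   - list with the fraction of pixels where max(pred/gt, gt/pred) < thr,
              one entry per threshold, in the order given
    """
    pred = np.asarray(pred, dtype=float).ravel()
    gt = np.asarray(gt, dtype=float).ravel()
    rel = np.mean(np.abs(pred - gt) / gt)
    rms = np.sqrt(np.mean((pred - gt) ** 2))
    log10 = np.mean(np.abs(np.log10(pred) - np.log10(gt)))
    ratio = np.maximum(pred / gt, gt / pred)
    acc = [float(np.mean(ratio < thr)) for thr in thresholds]
    return {"rel": rel, "rms": rms, "log10": log10, "acc": acc}


if __name__ == "__main__":
    print(depth_metrics(pred=[2.0, 4.0], gt=[2.0, 2.0]))
```

Lower is better for the three error metrics, and higher is better for the threshold accuracies.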

Experiment Results

Results on the NYU Depth v2 dataset:

Results on the KITTI dataset:

Experiment Results

Results on the Make3D dataset:

Sample Results from Model

Figure: Original image and predicted depth map

Strength and Weakness of the paper

The main strength of the paper is the joint training framework, which is made possible by finding a closed-form solution to CRF MAP inference of depth

The main weakness of the paper is that it does not use any geometric cues of depth (e.g. vanishing points)

Future Directions

Creating a hybrid model that can handle partly labeled data

Applying structure learning of the graphical model to this model

Enhancing the resolution of the output from the deep CNN (like the [FlowNet] paper we saw earlier)

Questions

Questions?

Thank you for your time

References

[1] Volker Blanz and Thomas Vetter. “A morphable model for the synthesis of 3D faces”. In: Proceedings of the 26th annual conference on Computer graphics and interactive techniques. ACM Press/Addison-Wesley Publishing Co. 1999, pp. 187–194.

[2] David Eigen, Christian Puhrsch, and Rob Fergus. “Depth map prediction from a single image using a multi-scale deep network”. In: Advances in neural information processing systems. 2014, pp. 2366–2374.

[3] Berthold KP Horn. “Shape from shading: A method for obtaining the shape of a smooth opaque object from one view”. 1970.

[4] Fayao Liu, Chunhua Shen, and Guosheng Lin. “Deep convolutional neural fields for depth estimation from a single image”. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015, pp. 5162–5170.

[5] Noah Snavely, Steven M Seitz, and Richard Szeliski. “Photo tourism: exploring photo collections in 3D”. In: ACM transactions on graphics (TOG). Vol. 25. 3. ACM. 2006, pp. 835–846.

[6] M. Watanabe, S.K. Nayar, and M. Noguchi. “Real-Time Computation of Depth from Defocus”. In: Proceedings of The International Society for Optical Engineering (SPIE). Vol. 2599. Jan. 1996, pp. 14–25.
