71
EE290T : 3D Reconstruction and Recognition

EE290T : 3D Reconstruction and Recognition

  • Upload
    others

  • View
    4

  • Download
    0

Embed Size (px)

Citation preview

Page 1: EE290T : 3D Reconstruction and Recognition

EE290T : 3D Reconstruction and

Recognition

Page 2: EE290T : 3D Reconstruction and Recognition

Acknowledgement

Courtesy of Prof. Silvio Savarese.

Page 3: EE290T : 3D Reconstruction and Recognition

Introduction

Page 4: EE290T : 3D Reconstruction and Recognition

“There was a table set out under

a tree in front of the house,and the March Hare and the

Hatter were having tea at it.”

“The table was a large one, but

the three were all crowded

together at one corner of it …”

From “A M a d Tea-Party”Alice's Adventures in W onderla n

by

Lew is Ca rroll

Page 5: EE290T : 3D Reconstruction and Recognition

Illustra tion by Arthur Ra ckha m

a tree in front of the house,

and the March Hare and the

Hatter were having tea at it.”

“The table was a large one, but

the three were all crowded

together at one corner of it …”

From “A M a d Tea-Party”Alice's Adventures in W onderla n

by

Lew is Ca rroll

“There was a table set out under

Page 6: EE290T : 3D Reconstruction and Recognition

-semantic

Image/ video

Computer vision

Object 1 Object N

- semantic

Page 7: EE290T : 3D Reconstruction and Recognition

-semantic

Image/ video

Object 1 Object N

- semantic

-geometry -geometry

Computer vision

Page 8: EE290T : 3D Reconstruction and Recognition

-semantic

Image/ video

Object 1 Object N

- semantic

-geometry -geometry

spatial & temporal relations

Computer vision

Page 9: EE290T : 3D Reconstruction and Recognition

-semantic

Image/ video

Object 1 Object N

- semantic

-geometry -geometry

spatial & temporal relations

Scene

-Sema ntic

- geometry

Computer vision

Page 10: EE290T : 3D Reconstruction and Recognition

Sensing device

• Information extraction

• Interpretation

Computer vision

1. Information extraction: features, 3D structure, motion flows, etc…

2. Interpretation: recognize objects, scenes, actions, events

Computational device

Page 11: EE290T : 3D Reconstruction and Recognition

Computer vision and Applications

21

1990 20102000

EosSystems

Page 12: EE290T : 3D Reconstruction and Recognition

Fingerprint biometrics

Page 13: EE290T : 3D Reconstruction and Recognition

Augmentation with 3D computer graphics

23

Page 14: EE290T : 3D Reconstruction and Recognition

3D object prototyping

24PhotomodelerEosSystems

Page 15: EE290T : 3D Reconstruction and Recognition

Computer vision and Applications

25

1990 20102000

EosSystems Autostich

Page 16: EE290T : 3D Reconstruction and Recognition

Face detection

Page 17: EE290T : 3D Reconstruction and Recognition

Face detection

Page 18: EE290T : 3D Reconstruction and Recognition

Web applications

28Photometria

Page 19: EE290T : 3D Reconstruction and Recognition

Panoramic Photography

kolor

Page 20: EE290T : 3D Reconstruction and Recognition

30

3D modeling of landmarks

Page 21: EE290T : 3D Reconstruction and Recognition

Computer vision and Applications

31

1990 20102000

EosSystems Autostich

• Efficient SLAM/SFM• Large scale image repositories• Deep learning (e.g. ImageNet)

Page 22: EE290T : 3D Reconstruction and Recognition

Computer vision and Applications

32

1990 20102000

Kooaba

A9

Kinect

EosSystems Autostich

Google Goggles

Page 23: EE290T : 3D Reconstruction and Recognition

Image search engines

Page 24: EE290T : 3D Reconstruction and Recognition

Google Goggles34

Visual search and landmarks recognition

Page 25: EE290T : 3D Reconstruction and Recognition

35

Visual search and landmarks recognition

Page 26: EE290T : 3D Reconstruction and Recognition

36

Augmented reality

• Magic leap• Daqri• Meta

• Etc…

Page 27: EE290T : 3D Reconstruction and Recognition

Motion sensing and gesture recognition

37

Page 28: EE290T : 3D Reconstruction and Recognition

Autonomous navigation and safety

Mobileye: Vision systems in high-end BMW, GM, Volvo models

But also, Toyota, Google, Apple, Tesla, Nissan, Ford, etc….

Source: A. Shashua, S. Seitz

Page 29: EE290T : 3D Reconstruction and Recognition

Personal robotics

39

Page 30: EE290T : 3D Reconstruction and Recognition

Computer vision and Applications

Vision for robotics, space exploration

Factory inspectionSurveillanceAssistive technologies

Security

Page 31: EE290T : 3D Reconstruction and Recognition

411990 20102000

Computer vision and Applications

Kooaba

A9

Kinect

EosSystems Autostich

Google Goggles

Page 32: EE290T : 3D Reconstruction and Recognition

EosSystems

Computer vision and Applications

421990 20102000

2D

3D

Google Goggles

Page 33: EE290T : 3D Reconstruction and Recognition

Computer vision and Applications

431990

EosSystems

20102000

2D

3D

Google Goggles

Page 34: EE290T : 3D Reconstruction and Recognition

Current state of computer vision

3D Reconstruction• 3D shape recovery• 3D scene reconstruction• Camera localization• Pose estimation

2D Recognition• Object detection• Texture classification• Target tracking• Activity recognition

4 4

Page 35: EE290T : 3D Reconstruction and Recognition

3D Reconstruction• 3D shape recovery• 3D scene reconstruction• Camera localization• Pose estimation

Current state of computer vision

Levoy et al., 00Hartley & Zisserman, 00Dellaert et al., 00Rusinkiewic et al., 02Nistér, 04Brown & Lowe, 04Schindler et al, 04Lourakis & Argyros, 04Colombo et al. 05

Golparvar-Fard, et al. JAEI 10Pandey et al. IFAC , 2010Pandey et al. ICRA 2011Savarese et al. IJCV 05Savarese et al. IJCV 06Microsoft’s PhotoSynth Snavely et al., 06-08Schindler et al., 08Agarwal et al., 09 45Frahm et al., 10

Lucas & Kanade, 81Chen & Medioni, 92Debevec et al., 96Levoy & Hanrahan, 96Fitzgibbon & Zisserman, 98Triggs et al., 99Pollefeys et al., 99Kutulakos & Seitz, 99

Snavely

et al., 06

-08

Page 36: EE290T : 3D Reconstruction and Recognition

3D Reconstruction• 3D shape recovery• 3D scene reconstruction• Camera localization• Pose estimation

Current state of computer vision

Levoy et al., 00Hartley & Zisserman, 00Dellaert et al., 00Rusinkiewic et al., 02Nistér, 04Brown & Lowe, 04Schindler et al, 04Lourakis & Argyros, 04Colombo et al. 05

Golparvar-Fard, et al. JAEI 10Pandey et al. IFAC , 2010Pandey et al. ICRA 2011Savarese et al. IJCV 05Savarese et al. IJCV 06Microsoft’s PhotoSynth Snavely et al., 06-08Schindler et al., 08Agarwal et al., 09Frahm et al., 10

Lucas & Kanade, 81Chen & Medioni, 92Debevec et al., 96Levoy & Hanrahan, 96Fitzgibbon & Zisserman, 98Triggs et al., 99Pollefeys et al., 99Kutulakos & Seitz, 99

Snavely

et al., 06

-08

Page 37: EE290T : 3D Reconstruction and Recognition

2D Recognition• Object detection• Texture classification• Target tracking• Activity recognition

Current state of computer vision

Turk & Pentland, 91Poggio et al., 93Belhumeur et al., 97LeCun et al. 98 Amit and Geman, 99 Shi& Malik, 00 Viola & Jones, 00Felzenszwalb & Huttenlocher 00Belongie & Malik, 02Ullman et al. 02

Argawal & Roth, 02Ramanan & Forsyth, 03Weber et al., 00Vidal-Naquet & Ullman 02Fergus et al., 03Torralba et al., 03Vogel & Schiele, 03Barnard et al., 03Fei-Fei et al., 04Kumar & Hebert ‘04

He et al. 06Gould et al. 08Maire et al. 08Felzenszwalb et al., 08Kohli et al. 09L.-J. Li et al. 09Ladicky et al. 10,11Gonfaus et al. 10Farhadi et al., 09Lampert et al., 09 47

Page 38: EE290T : 3D Reconstruction and Recognition

2D Recognition• Object detection• Texture classification• Target tracking• Activity recognition

Current state of computer vision

car person car

Car Car

Street

Tree

PersonBike

Building

Turk & Pentland, 91Poggio et al., 93Belhumeur et al., 97LeCun et al. 98 Amit and Geman, 99 Shi& Malik, 00 Viola & Jones, 00Felzenszwalb & Huttenlocher 00Belongie & Malik, 02Ullman et al. 02

Argawal & Roth, 02Ramanan & Forsyth, 03Weber et al., 00Vidal-Naquet & Ullman 02Fergus et al., 03Torralba et al., 03Vogel & Schiele, 03Barnard et al., 03Fei-Fei et al., 04Kumar & Hebert ‘04

He et al. 06Gould et al. 08Maire et al. 08Felzenszwalb et al., 08Kohli et al. 09L.-J. Li et al. 09Ladicky et al. 10,11Gonfaus et al. 10Farhadi et al., 09Lampert et al., 09

Page 39: EE290T : 3D Reconstruction and Recognition

49

Current state of computer vision

3D Reconstruction• 3D shape recovery• 3D scene reconstruction• Camera localization• Pose estimation

2D Recognition• Object detection• Texture classification• Target tracking• Activity recognition

Perceiving the World in 3D!

Page 40: EE290T : 3D Reconstruction and Recognition

Course overview

1. Geometry

2. Semantics

Geometry:- How to extract 3d information?- Which cues are useful?- What are the mathematical tools?

Page 41: EE290T : 3D Reconstruction and Recognition

Camera systemsEstablish a mapping from 3D to 2D

Page 42: EE290T : 3D Reconstruction and Recognition

How to calibrate a cameraEstimate camera parameters such as pose or focallength

?

Page 43: EE290T : 3D Reconstruction and Recognition

Single view metrologyEstimate 3D properties of the world from a single image

?

Page 44: EE290T : 3D Reconstruction and Recognition

Single view metrologyEstimate 3D properties of the world from a single image

Page 45: EE290T : 3D Reconstruction and Recognition

Multiple view geometryEstimate 3D properties of the world from multiple views

Page 46: EE290T : 3D Reconstruction and Recognition

Toma si & Kana de (1993)

Mathematical tools

Epipola r geometry

Photoconsistency

Page 47: EE290T : 3D Reconstruction and Recognition

Structure from motion

Courtesy of Oxford Visua l Geometry Group

Page 48: EE290T : 3D Reconstruction and Recognition

Structure lighting and volumetric stereo

Scanning Michelangelo’s “The David”• The Digital Michelangelo Project

- http://graphics.stanford.edu/projects/mich/

• 2 BILLION polygons, accuracy to .29mm

Page 49: EE290T : 3D Reconstruction and Recognition

Course overview

1. Geometry

2. Semantics

Semantics:- How to recognize objects?- How to classify images or understand a scene?- How to segment out critical semantics- How to estimate 3D properties (pose, size, shape…)

Page 50: EE290T : 3D Reconstruction and Recognition

Object recognition and categorization

Page 51: EE290T : 3D Reconstruction and Recognition

Classification:Is this an forest?

No!

Page 52: EE290T : 3D Reconstruction and Recognition

Classification:Does this image contain a building? [yes/no]

Yes!

Page 53: EE290T : 3D Reconstruction and Recognition

Detection:Does this image contain a car? [where?]

car

Page 54: EE290T : 3D Reconstruction and Recognition

Building

clock

personcar

Detection:Which objects do this image contain? [where?]

Page 55: EE290T : 3D Reconstruction and Recognition

clock

Detection:Accurate localization (segmentation)

Page 56: EE290T : 3D Reconstruction and Recognition

Building 45 degree

Person, backCar, side view

Detection:Estimating 3D geometrical properties

Page 57: EE290T : 3D Reconstruction and Recognition

Challenges: viewpoint variation

slide credit: Fei-Fei, Fergus & Torralba

Page 58: EE290T : 3D Reconstruction and Recognition

Challenges: illumination

image credit: J. Koenderink

Page 59: EE290T : 3D Reconstruction and Recognition

Challenges: scale

slide credit: Fei-Fei, Fergus & Torralba

Page 60: EE290T : 3D Reconstruction and Recognition

Challenges: deformation

Page 61: EE290T : 3D Reconstruction and Recognition

Challenges:

occlusion

Magritte, 1957 slide credit: Fei-Fei, Fergus & Torralba

Page 62: EE290T : 3D Reconstruction and Recognition

Challenges: background clutter

Kilmeny Niland. 1995

Page 63: EE290T : 3D Reconstruction and Recognition

Challenges: object intra-class variation

slide credit: Fei-Fei, Fergus & Torralba

Page 64: EE290T : 3D Reconstruction and Recognition

A t. .•

. l , 11t'-:.

I • <\J I . ""'

Page 65: EE290T : 3D Reconstruction and Recognition

Course overview

1. Geometry

2. Semantics

Joint recovery of geometry and semantics!

Page 66: EE290T : 3D Reconstruction and Recognition

Visua l processing in the bra in

V1

where pathway (dorsal stream)

what pathway (ventral stream)

78

The ventral stream (also known

as the "what pathway") is

involved with object and visual

identification and recognition.

The dorsal stream (or,

"where pathway") is involved

with processing the object's

spatial location relative to

the viewer and with speech

repetition.

Page 67: EE290T : 3D Reconstruction and Recognition

V1

where pathway (dorsal stream)

what pathway (ventral stream)

Pre-frontal cortex

79

Visua l processing in the bra in

Page 68: EE290T : 3D Reconstruction and Recognition

Joint reconstruction and recognition

…Input images

80 Car Person Tree Sky

Street Building Else

Page 69: EE290T : 3D Reconstruction and Recognition

…Input images

81 Car Person Tree Sky

Street Building Else

Joint reconstruction and recognition

Page 70: EE290T : 3D Reconstruction and Recognition

“There was a table set out under

a tree in front of the house,and the March Hare and the

Hatter were having tea at it.”

“The table was a large one, but

the three were all crowded

together at one corner of it …”

From “A M a d Tea-Party”

Alice's Adventures in W onderla ndby

Lew is Ca rroll

Page 71: EE290T : 3D Reconstruction and Recognition

3D

geo

met

ry

Project presentations

Reco

gnit

ion

Syllabus

Lecture Topic

1 Introduction

2 Camera models

3 Camera calibration

4 Single view metrology

5 Epipolar geometry

6 Multi-view geometry

7 Structure from motion/ SLAM

8 Volumetric stereo

9 Fitting and Matching

10 Detector and Descriptors

11 Intro to Recognition; Object classification I

12 Object classification II

13 Scene understanding & segmentation

14 Visual Representation Learning by NNs

15 3D Object recognition

16 3D Scene understanding