ECE592-064Digital Image Processing and Introduction …twu19/teaching_notes/ece592-064/Lecture-05.pdf · ECE592-064Digital Image Processing and Introduction to Computer Vision

1/26/17


ECE592-064 Digital Image Processing and Introduction to Computer Vision

Department of ECE, NC State University

Instructor: Tianfu (Matt) Wu

Spring 2017

Outline

1. Recap

2. Solving the three geometric problems.

• Review some optimization background

• Sketch the derivation

• List some off-the-shelf toolboxes

• Collect training data


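1. Recap, Pinhole Camera

• Mathematical Idealized Model

World coordinate system (axes U, V, W) to camera coordinates, via the extrinsic matrix:

$$\begin{bmatrix} X \\ Y \\ Z \end{bmatrix} = \begin{bmatrix} R & T \end{bmatrix} \begin{bmatrix} u \\ v \\ w \\ 1 \end{bmatrix}$$

Projection matrix (intrinsic matrix times extrinsic matrix):

$$\lambda \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = K_{3\times 3} \begin{bmatrix} R & \tau \end{bmatrix}_{3\times 4} \begin{bmatrix} u \\ v \\ w \\ 1 \end{bmatrix}$$

• Depth is lost

• Length and angle are not preserved

• Straight lines are still straight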
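The projection above can be sketched in a few lines of NumPy. This is a minimal illustration rather than course code: the function name `pinhole_project` and the camera values (focal length 500 px, principal point (320, 240), translation of 5 units) are assumptions chosen only for the example.

```python
import numpy as np

def pinhole_project(W, K, R, tau):
    """Project N world points W (N x 3) to pixel coordinates (N x 2)."""
    W = np.asarray(W, dtype=float)
    # Extrinsic step: world coordinates -> camera coordinates [X, Y, Z].
    cam = W @ R.T + tau
    # Intrinsic step: homogeneous image coordinates lambda * [x, y, 1].
    hom = cam @ K.T
    # Divide by lambda (the third component); this is where depth is lost.
    return hom[:, :2] / hom[:, 2:3]

# Hypothetical camera, for illustration only.
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)                     # no rotation
tau = np.array([0.0, 0.0, 5.0])   # scene sits 5 units in front of the camera

W = np.array([[0.0, 0.0, 0.0],
              [1.0, 0.0, 0.0],
              [2.0, 0.0, 5.0]])
pixels = pinhole_project(W, K, R, tau)
print(pixels)
```

The second and third points lie on the same ray through the camera centre, so they land on the same pixel, which is the "depth is lost" property listed above.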

1. Recap, an Approximate Model for Real Cameras

observed data = true data + “noise/distortion”

Image pixel coordinates ~ pinhole(world coordinates; $K$, $R$, $T$) + noise

For simplicity, we model the noise with a Gaussian distribution:

$$\text{Prob}(x_i \mid w_i, K, R, T) = \text{Norm}_{x_i}\big[\text{pinhole}(w_i, K, R, T),\ \sigma^2 I\big]$$

• Learning extrinsic parameters, 3D rotation matrix $R_{3\times 3}$ and 3D translation $\tau_{3\times 1}$

$$R^*, \tau^* = \arg\max_{R,\tau} \Big[\sum_i \log \text{Prob}(x_i \mid w_i, K, R, \tau)\Big]$$

• Learning intrinsic parameters (calibration), $K_{3\times 3}$

$$K^* = \arg\max_{K} \Big[\max_{R,\tau} \sum_i \log \text{Prob}(x_i \mid w_i, K, R, \tau)\Big]$$

• Inferring 3D points (triangulation / reconstruction) from $J$ calibrated cameras (assume the correspondences between the $x_j$'s are known)

$$w^* = \arg\max_{w} \Big[\sum_j \log \text{Prob}(x_j \mid w, K_j, R_j, \tau_j)\Big]$$


Why do these problems matter?

Check the video source: https://www.technologyreview.com/s/426265/meet-2011-tr35-winner-noah-snavely/

Building Rome in a Day: Agarwal et al. 2009


D. Hoiem, A.A. Efros, and M. Hebert, “Putting Objects in Perspective”, in CVPR 2006 (Best paper award)



Source: Simon

2. Solving the Three Geometric Problems

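• 1) Learning extrinsic parameters, 3D rotation matrix $R_{3\times 3}$ and 3D translation $\tau_{3\times 1}$

$$R^*, \tau^* = \arg\max_{R,\tau} \Big[\sum_i \log \text{Prob}(x_i \mid w_i, K, R, \tau)\Big]$$

Subject to

$R R^{-1} = R R^{T} = I$ (rotation matrix)

$\tau_z > 0$ (object is in front of camera)

Let $x_i' = \text{pinhole}(w_i, K, R, \tau)$; we have

$$\text{Prob}(x_i \mid w_i, K, R, \tau) = \frac{1}{2\pi\,|\sigma^2 I|^{1/2}} \exp\Big(-\frac{1}{2}(x_i - x_i')^{T}(\sigma^2 I)^{-1}(x_i - x_i')\Big)$$

$$R^*, \tau^* = \arg\max_{R,\tau} \sum_i -(x_i - x_i')^{T}(x_i - x_i') = \arg\min_{R,\tau} \sum_i (x_i - x_i')^{T}(x_i - x_i')$$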
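As a rough illustration of why the maximum-likelihood problem reduces to least squares, the sketch below evaluates both the sum of squared reprojection errors and the negative log-likelihood under the isotropic Gaussian noise model. The function names, the camera values, and the choice $\sigma = 2$ are assumptions made for the example only.

```python
import numpy as np

def project(W, K, R, tau):
    """Pinhole projection of N x 3 world points to N x 2 pixels."""
    hom = (W @ R.T + tau) @ K.T
    return hom[:, :2] / hom[:, 2:3]

def reprojection_cost(X, W, K, R, tau):
    """Sum over i of (x_i - x_i')^T (x_i - x_i')."""
    residual = X - project(W, K, R, tau)
    return float(np.sum(residual ** 2))

def neg_log_likelihood(X, W, K, R, tau, sigma):
    """-sum_i log Prob(x_i | w_i, K, R, tau) for 2D noise with covariance sigma^2 I."""
    n = X.shape[0]
    # |sigma^2 I|^(1/2) = sigma^2 in 2D, so the normalizer is 2*pi*sigma^2 per point.
    return reprojection_cost(X, W, K, R, tau) / (2.0 * sigma ** 2) \
        + n * np.log(2.0 * np.pi * sigma ** 2)

# Hypothetical setup, for illustration only.
K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
R = np.eye(3)
tau = np.array([0.0, 0.0, 5.0])
tau_off = tau + np.array([0.1, 0.0, 0.0])   # a slightly wrong translation
W = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 1.0], [-1.0, 0.5, 2.0]])
X = project(W, K, R, tau)                   # noise-free observations

sigma = 2.0
cost_true = reprojection_cost(X, W, K, R, tau)
cost_off = reprojection_cost(X, W, K, R, tau_off)
nll_true = neg_log_likelihood(X, W, K, R, tau, sigma)
nll_off = neg_log_likelihood(X, W, K, R, tau_off, sigma)
print(cost_true, cost_off, nll_off - nll_true)
```

The two objectives differ only by a positive scale factor and an additive constant, so they share the same minimizer in $R$ and $\tau$.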


Review some optimization background on board

Non-linear, Non-convex



• Compute a good initialization by resorting to homogeneous coordinates

Pre-multiply both sides by the inverse of the camera calibration matrix, $K^{-1}$ ($X' = K^{-1} X$ are known as normalized image coordinates).


The third equation gives us an expression for $\lambda$

Substitute back into first two lines





Linear equation – two equations per point – form system of equations
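A sketch of the algebra behind these steps, written out for one point. The notation is an assumption, not taken from the slides: $r_{mn}$ denotes the entries of $R$, $(\tau_x, \tau_y, \tau_z)$ the entries of $\tau$, and $(x', y')$ the normalized image coordinates.

```latex
% Normalized camera equation for one point (u, v, w)
\lambda \begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix}
=
\begin{bmatrix}
r_{11} & r_{12} & r_{13} & \tau_x \\
r_{21} & r_{22} & r_{23} & \tau_y \\
r_{31} & r_{32} & r_{33} & \tau_z
\end{bmatrix}
\begin{bmatrix} u \\ v \\ w \\ 1 \end{bmatrix}

\begin{align*}
% Third row: an expression for lambda
\lambda &= r_{31} u + r_{32} v + r_{33} w + \tau_z \\
% Substituted into the first two rows
(r_{31} u + r_{32} v + r_{33} w + \tau_z)\, x' &= r_{11} u + r_{12} v + r_{13} w + \tau_x \\
(r_{31} u + r_{32} v + r_{33} w + \tau_z)\, y' &= r_{21} u + r_{22} v + r_{23} w + \tau_y
\end{align*}
```

Both substituted equations are linear in the twelve unknown entries of $R$ and $\tau$, so stacking them over all points gives a homogeneous linear system.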


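Minimum direction problem: find the unknown vector that minimizes the squared norm of the homogeneous linear system, subject to the vector having unit norm.

To solve, compute the SVD of the system matrix and then set the solution to the last column of $V$.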




Now we extract the values of $R$ and $\tau$ from $V$.

Problem: the scale is arbitrary and the rows and columns of the rotation matrix may not be orthogonal.

Solution: compute the SVD $R = O P Q^{T}$ and then choose $\hat{R} = O Q^{T}$.

Use the ratio between the rotation matrix before and after to rescale,

$$\hat{\tau} = \frac{1}{9} \sum_{i=1}^{3} \sum_{j=1}^{3} \frac{\hat{R}_{ij}}{R_{ij}}\, \tau$$

Use these estimates as the starting point for the non-linear optimization.
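The initialization steps above (normalize, build the linear system, take the SVD, orthogonalize, rescale) could be sketched as follows. This is a minimal sketch under stated assumptions, not the course's implementation: the function names, the row-major ordering of the unknowns, the determinant-based sign fix, and the synthetic test camera are all choices made for the example. It needs at least six non-coplanar points.

```python
import numpy as np

def estimate_extrinsics(X, W, K):
    """Closed-form initial guess for R (3x3) and tau (3,).

    X: N x 2 pixels, W: N x 3 world points (N >= 6, not coplanar), K: known intrinsics.
    """
    n = X.shape[0]
    # Normalized image coordinates: pre-multiply by the inverse of K.
    Xn = np.linalg.solve(K, np.column_stack([X, np.ones(n)]).T).T
    xp, yp = Xn[:, 0] / Xn[:, 2], Xn[:, 1] / Xn[:, 2]

    # Two linear equations per point in the 12 entries of [R tau] (row-major).
    Wh = np.column_stack([W, np.ones(n)])
    Z = np.zeros_like(Wh)
    A = np.vstack([np.hstack([Wh, Z, -xp[:, None] * Wh]),
                   np.hstack([Z, Wh, -yp[:, None] * Wh])])

    # Minimum direction problem: last column of V, i.e. last row of Vt.
    Vt = np.linalg.svd(A)[2]
    P = Vt[-1].reshape(3, 4)
    R_raw, tau_raw = P[:, :3], P[:, 3]

    # Closest orthogonal matrix: R_raw = O P Q^T, keep O Q^T.
    O, _, Qt = np.linalg.svd(R_raw)
    R_hat = O @ Qt
    # Rescale tau by the average entry-wise ratio (fragile if an entry of R_raw is near zero).
    tau_hat = np.mean(R_hat / R_raw) * tau_raw
    # The SVD solution has an arbitrary sign; flip so that R_hat is a proper rotation.
    if np.linalg.det(R_hat) < 0:
        R_hat, tau_hat = -R_hat, -tau_hat
    return R_hat, tau_hat

def rot(axis, angle):
    """Rotation about coordinate axis 0, 1 or 2 (helper for the demo only)."""
    c, s = np.cos(angle), np.sin(angle)
    i, j = [(1, 2), (2, 0), (0, 1)][axis]
    M = np.eye(3)
    M[i, i], M[j, j], M[i, j], M[j, i] = c, c, -s, s
    return M

# Synthetic, noise-free check with a hypothetical camera.
K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
R_true = rot(2, 0.5) @ rot(1, -0.4) @ rot(0, 0.3)
tau_true = np.array([0.2, -0.1, 6.0])
W = np.random.default_rng(0).uniform(-1.0, 1.0, size=(8, 3))
hom = (W @ R_true.T + tau_true) @ K.T
X = hom[:, :2] / hom[:, 2:3]
R_est, tau_est = estimate_extrinsics(X, W, K)
```

With noisy pixel measurements the estimate is only approximate, which is why it serves as an initial guess for the non-linear least-squares refinement rather than as the final answer.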



• 2) Learning intrinsic parameters, $K_{3\times 3}$

$$K^* = \arg\max_{K} \Big[\max_{R,\tau} \sum_i \log \text{Prob}(x_i \mid w_i, K, R, \tau)\Big]$$

One approach (not very efficient) is to alternately

• Optimize extrinsic parameters for fixed intrinsic

• Optimize intrinsic parameters for fixed extrinsic

$$K^* = \arg\max_{K} \Big[\sum_i \log \text{Prob}(x_i \mid w_i, K, R, \tau)\Big]$$

(Then use non-linear optimization)



$$K^* = \arg\max_{K} \Big[\sum_i \log \text{Prob}(x_i \mid w_i, K, R, \tau)\Big] = \arg\max_{K} \sum_i -(x_i - x_i')^{T}(x_i - x_i') = \arg\min_{K} \sum_i (x_i - x_i')^{T}(x_i - x_i')$$

This is a least squares problem.

Linear w.r.t. intrinsic parameters

$$h^* = \arg\min_{h} \sum_i (x_i - A_i h)^{T}(x_i - A_i h)$$
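A minimal sketch of this linear least-squares step for fixed extrinsics. The layout of $h$ and $A_i$ is an assumption, since the slides do not spell it out here: $h$ stacks five intrinsic entries (focal lengths, skew, principal point) and each $A_i$ is built from the point's camera coordinates divided by its depth. The function name and test values are likewise illustrative.

```python
import numpy as np

def estimate_intrinsics(X, W, R, tau):
    """Least-squares K for known R, tau. X: N x 2 pixels, W: N x 3 world points."""
    cam = W @ R.T + tau               # camera coordinates of each point
    a = cam[:, 0] / cam[:, 2]
    b = cam[:, 1] / cam[:, 2]
    n = X.shape[0]
    one, zero = np.ones(n), np.zeros(n)
    # Assumed model: x = fx*a + skew*b + cx and y = fy*b + cy, so h = [fx, skew, cx, fy, cy].
    A = np.vstack([np.column_stack([a, b, one, zero, zero]),
                   np.column_stack([zero, zero, zero, b, one])])
    x = np.concatenate([X[:, 0], X[:, 1]])
    h = np.linalg.lstsq(A, x, rcond=None)[0]
    fx, skew, cx, fy, cy = h
    return np.array([[fx, skew, cx], [0.0, fy, cy], [0.0, 0.0, 1.0]])

# Synthetic, noise-free check with hypothetical values.
K_true = np.array([[520.0, 0.5, 315.0], [0.0, 480.0, 245.0], [0.0, 0.0, 1.0]])
R = np.eye(3)
tau = np.array([0.0, 0.0, 5.0])
W = np.random.default_rng(1).uniform(-1.0, 1.0, size=(10, 3))
hom = (W @ R.T + tau) @ K_true.T
X = hom[:, :2] / hom[:, 2:3]
K_est = estimate_intrinsics(X, W, R, tau)
```

Because the pixel coordinates are linear in the entries of $K$ once the camera coordinates are fixed, no initial guess is needed for this step.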


• 3) Inferring 3D points (triangulation / reconstruction) from $J$ calibrated cameras (assume the correspondences between the $x_j$'s are known)

$$w^* = \arg\max_{w} \Big[\sum_j \log \text{Prob}(x_j \mid w, K_j, R_j, \tau_j)\Big]$$

Write the $j$th pinhole camera in homogeneous coordinates:

Pre-multiply with the inverse of the intrinsic matrix


The last equation gives an expression for $\lambda_j$

Substitute back into the first two equations

Rearranging gives two linear equations for $[u, v, w]$

Solve using more than one camera and then use non-linear optimization
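The linear triangulation described above can be sketched as follows: each camera contributes two linear equations in $[u, v, w]$, and the stacked system is solved by least squares. The function name and the two hypothetical cameras are assumptions made for the example.

```python
import numpy as np

def triangulate(pixels, Ks, Rs, taus):
    """Linear estimate of one 3D point from its pixel location in J >= 2 cameras."""
    rows, rhs = [], []
    for x, K, R, tau in zip(pixels, Ks, Rs, taus):
        # Normalized image coordinates: pre-multiply by the inverse of K_j.
        xn = np.linalg.solve(K, np.array([x[0], x[1], 1.0]))
        xp, yp = xn[0] / xn[2], xn[1] / xn[2]
        # lambda = r3 . w + tau_z, substituted into the first two rows and rearranged:
        # (r1 - x' r3) . w = x' tau_z - tau_x, and likewise for y'.
        rows.append(R[0] - xp * R[2])
        rhs.append(xp * tau[2] - tau[0])
        rows.append(R[1] - yp * R[2])
        rhs.append(yp * tau[2] - tau[1])
    return np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)[0]

def project_point(w, K, R, tau):
    """Pinhole projection of a single 3D point (helper for the demo only)."""
    hom = K @ (R @ w + tau)
    return hom[:2] / hom[2]

# Two hypothetical calibrated cameras observing the same point.
K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
c, s = np.cos(0.3), np.sin(0.3)
R1, tau1 = np.eye(3), np.array([0.0, 0.0, 5.0])
R2 = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])   # rotation about the y axis
tau2 = np.array([-1.0, 0.0, 5.0])

w_true = np.array([0.5, -0.2, 1.0])
pixels = [project_point(w_true, K, R1, tau1), project_point(w_true, K, R2, tau2)]
w_est = triangulate(pixels, [K, K], [R1, R2], [tau1, tau2])
```

With noisy correspondences the rays no longer intersect exactly, so the least-squares solution is used as the starting point for the non-linear optimization.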


• Some off-the-shelf toolboxes

• Matlab

• https://www.vision.caltech.edu/bouguetj/calib_doc/

• https://www.mathworks.com/videos/camera-calibration-with-matlab-81233.html

• OpenCV + Python

• http://docs.opencv.org/3.1.0/dc/dbb/tutorial_py_calibration.html

• https://github.com/warp1337/opencv_cam_calibration



• Collect Training Data

• Problem 1 & 2: $\{x_i, w_i\}_{i=1}^{I}$, a set of pairs of image pixel locations and 3D world positions.

• Problem 3: $\{x_j\}$, a set of corresponding points in images from different viewpoints.

[Figure: calibration targets: 3D apparatus, 2D apparatus, 1D apparatus. Source: Zhengyou Zhang]

A Summary of Pinhole Camera Model

• The pinhole camera model is a non-linear function that takes points in the 3D world and finds where they map to in the image

• Parameterized by intrinsic (2D scaling, shearing and translation) and extrinsic (3D rotation and translation) matrices

• Difficult to estimate intrinsic/extrinsic/depth because the model is non-linear

• Use homogeneous coordinates, where we can get closed-form solutions, but only as initialization for the non-linear optimization


Next: Single View Metrology

• A. Criminisi, I. Reid, and A. Zisserman, IJCV, 2000.

Which is closer?

Who is taller?

Source: D. Forsyth

How high is the camera?

What is the camera rotation?

What is the focal length of the camera?
