
Pictorial Information

Homework 2

1 Question 1: Calibrating cam1

The calibration process is separated into two steps: initialization and non-linear optimization.

During the initialization step the user manually selects the four extreme corners of the grid. The coordinates of the selected points are refined by a corner detection algorithm, which searches for a corner in the vicinity of the point selected by the user. Then, given the number of squares in the grid, the remaining corner points of the grid are estimated. A closed-form solution for the calibration parameters (intrinsic and extrinsic) is computed from these image points, which yields the projection matrix P.

The optimization step minimizes the reprojection error, which is defined as follows. The metric configuration of the grid is known: the number of squares and the length of each square along the X and Y directions. Hence, placing the grid so that it lies in the Z = 0 plane, we know the 3D coordinates of the grid corners X_i (note that the location of the grid with respect to the world coordinate frame needs to be fixed during the initialization step). We can reproject these points onto the image plane using x̂_i = P X_i. The reprojection error of point X_i is d(x_i, x̂_i), where d is the Euclidean distance and x_i is the 2D image coordinate of the corner point X_i obtained in the initialization step. The optimized projection matrix is the one that minimizes the sum of the reprojection errors over all grid corners

\hat{P} = \operatorname{argmin}_P \sum_i d(x_i, \hat{x}_i)

Once P̂ is obtained, the coordinates of the grid corners in the image are updated to x_i = P̂ X_i.
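
As an illustration, here is a minimal NumPy sketch of this objective (a sketch only, not the toolbox implementation): the grid corners X are assumed to be given as an N × 4 array of homogeneous world coordinates and the measured image corners x_meas as an N × 2 array.

import numpy as np

def reproject(P, X):
    # Project homogeneous 3D points X (N x 4) with the 3x4 matrix P and
    # return inhomogeneous 2D image points (N x 2).
    x = (P @ X.T).T
    return x[:, :2] / x[:, 2:3]

def total_reprojection_error(P, X, x_meas):
    # Sum of Euclidean distances between the measured corners x_meas (N x 2)
    # and the reprojected grid corners.
    return np.sum(np.linalg.norm(x_meas - reproject(P, X), axis=1))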

1. Calibration parameters estimated after extracting grid corners with the corner finder window size set to 5. Note that the estimated camera centre is far from the expected [320 240] for a 640 × 480 image.

Focal Length: fc = [ 569.89119 561.69978 ] +- [ 66.90937 73.86583 ]

Principal point: cc = [ 356.36231 299.73559 ] +- [ 58.51278 104.49382 ]

Skew: alpha_c = [ 0.00000 ] +- [ 0.00000 ]

=> angle of pixel axes = 90.00000 +- 0.00000 degrees

Distortion: kc = [ -0.71192 0.72151 -0.05281 -0.00828 0.00000 ]

+- [ 0.51963 2.40495 0.07955 0.03679 0.00000 ]

Pixel error: err = [ 1.13242 1.04376 ]

The numbers following +- are the estimated numerical uncertainties of the corresponding parameters after the non-linear minimization of the reprojection error. The corresponding calibration matrix K is given by

K = \begin{bmatrix} \alpha_x & s & x_0 \\ 0 & \alpha_y & y_0 \\ 0 & 0 & 1 \end{bmatrix}

where α_x and α_y are the focal lengths of the camera expressed in units of horizontal and vertical pixels (the two values differ if the pixels are not perfect squares), s is the skew factor, and (x_0, y_0) is the principal point expressed in pixels. In our case the camera matrix is

K = \begin{bmatrix} 569.89119 & 0 & 356.36231 \\ 0 & 561.69978 & 299.73559 \\ 0 & 0 & 1 \end{bmatrix}
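
As a sketch (using NumPy), the calibration matrix can be assembled directly from the reported fc, cc and alpha_c values; the skew entry is alpha_c times the first focal length, which vanishes here since alpha_c = 0.

import numpy as np

fc = [569.89119, 561.69978]   # focal lengths in pixels (alpha_x, alpha_y)
cc = [356.36231, 299.73559]   # principal point (x_0, y_0)
alpha_c = 0.0                 # skew coefficient

K = np.array([[fc[0], alpha_c * fc[0], cc[0]],
              [0.0,   fc[1],           cc[1]],
              [0.0,   0.0,             1.0]])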


2. Points that are further away from the camera are imaged at a lower resolution. If, in addition, the grid is imaged as a skewed rectangle, this can lead to a situation where the corners between two adjacent squares are imaged as either separated (by one or more pixels) or connected (i.e. they join in a boundary of more than one pixel). This results in incorrect corner detection by the corner detection algorithm. In our particular example, images 9, 12 and 13 (the first image is indexed as 1) posed this problem. One possible solution is to reduce the area within which the algorithm looks for a corner; that way the points detected by the algorithm are forced to stay close to those selected by the user.

Figure 1: Problem with automatic corner detection. The bottom right corner was not detected properly.

Calibration parameters after re-extracting corners with the corner finder window size set to 1. Note the significant decrease in pixel error.

Focal Length: fc = [ 548.73706 549.66836 ] +- [ 25.60846 25.65445 ]

Principal point: cc = [ 314.79952 280.90596 ] +- [ 31.31835 34.65060 ]

Skew: alpha_c = [ 0.00000 ] +- [ 0.00000 ]

=> angle of pixel axes = 90.00000 +- 0.00000 degrees

Distortion: kc = [ -0.04097 0.09409 0.01544 -0.01001 0.00000 ]

+- [ 0.26055 2.21165 0.02139 0.01749 0.00000 ]

Pixel error: err = [ 0.47038 0.29962 ]

A further improvement can be achieved by recomputing the corners of the grid. The grid corners computed after reprojection error minimization are used as seeds for the automatic corner detection algorithm. Since the current points are expected to be close to their true values, a small search window should be used. Note that this method gives considerable improvement if the images are highly distorted, which is not the case for our images.

Calibration parameters after recomputing corners with the corner finder window size set to 1. Note that the pixel error is now distributed more evenly over x and y.

Focal Length: fc = [ 539.40530 538.89995 ] +- [ 24.40718 24.12283 ]

Principal point: cc = [ 310.75535 267.41846 ] +- [ 34.47806 30.51359 ]

Skew: alpha_c = [ 0.00000 ] +- [ 0.00000 ]

=> angle of pixel axes = 90.00000 +- 0.00000 degrees

Distortion: kc = [ 0.02023 -0.47770 0.01059 -0.01385 0.00000 ]

+- [ 0.25233 2.16381 0.01829 0.01962 0.00000 ]

Pixel error: err = [ 0.39922 0.39523 ]


Comparing the reprojection error scatter plots before and after refining the corner points shows that the reprojection error was significantly reduced; in particular, all extreme outlier errors were reduced.

Figure 2: Reprojection error scatter plot (in pixels) before and after refining the corner points.

Figure 3: Extracted (red crosses) and reprojected (black circles) grid corners for image 10.

Figure 4: Positions of the grids with respect to camera 1.


2 Question 2: Calibrating cam2

Calibration parameters estimated after extracting grid corners with corner finder window size set to 5.

Focal Length: fc = [ 788.10309 790.11003 ] +- [ 70.84724 71.15334 ]

Principal point: cc = [ 243.85905 213.72360 ] +- [ 106.09846 100.33384 ]

Skew: alpha_c = [ 0.00000 ] +- [ 0.00000 ]

=> angle of pixel axes = 90.00000 +- 0.00000 degrees

Distortion: kc = [ -0.01543 0.01696 -0.00678 -0.00382 0.00000 ]

+- [ 0.24104 0.37531 0.03679 0.03504 0.00000 ]

Pixel error: err = [ 0.57108 0.30490 ]

Calibration parameters estimated after re-extracting corner points for the problematic images (6 and 9) and recomputing the corners automatically.

Focal Length: fc = [ 718.43763 721.61863 ] +- [ 48.37756 51.96494 ]

Principal point: cc = [ 289.89476 198.15260 ] +- [ 58.53936 69.17993 ]

Skew: alpha_c = [ 0.00000 ] +- [ 0.00000 ]

=> angle of pixel axes = 90.00000 +- 0.00000 degrees

Distortion: kc = [ -0.05610 -0.02608 -0.01986 0.00688 0.00000 ]

+- [ 0.16310 0.26592 0.03114 0.01792 0.00000 ]

Pixel error: err = [ 0.38740 0.38471 ]

Figure 5: Reprojection error scatter plot (in pixels) before and after refining the corner points.

Page 5: Pictorial Information Homework 2 1 Question 1: Calibrating ...users.umiacs.umd.edu/~cteo/.../hw-data/...solution.pdf · 2. The rotation vector om is a non-normalized vector codirectional

Homework 2 5

Figure 6: Extracted (red crosses) and reprojected (black circles) grid corners for image 10.

Figure 7: Positions of the grids with respect to camera 2.

3 Question 3

The total pixel error can be computed as the sum of the pixel errors along the x and y directions:

E_pix1 = 0.7945

E_pix2 = 0.7721

The pixel error, however, is not a good metric for comparing the quality of calibration of two cameras that were located at different distances from the same calibration grid: the same pixel error represents a greater real-world distance error for the camera that is further away from the grid. A better metric would be the total distance error. Calculating the total distance error precisely is difficult, but we can estimate it by using the fact that, for a fixed distance error, the pixel error is inversely proportional to the distance Z of the calibration grid from the camera, E_pix ∝ 1/Z. Hence the distance error can be estimated as

E_dist ∝ E_pix × Z


We can take Z as the mean distance of the calibration grid centres from the camera. In our case it was estimated from the plots of the grid positions with respect to the cameras:

E_dist1 ∝ 0.7945 × 1300 ≈ 1033

E_dist2 ∝ 0.7721 × 1700 ≈ 1313

This calculation suggests that camera 1 is better calibrated than camera 2. This is expected, since camera 2 is located further from the grids, which results in worse corner point estimation.

4 Question 4: Stereo calibration

1. Calibration parameters obtained after running stereo optimization:

• Left camera

Focal Length: fc_left = [ 595.62985 595.95122 ] +- [ 8.41064 8.06296 ]

Principal point: cc_left = [ 328.15377 244.49640 ] +- [ 19.69169 18.28879 ]

Skew: alpha_c_left = [ 0.00000 ] +- [ 0.00000 ]

=> angle of pixel axes = 90.00000 +- 0.00000 degrees

Distortion: kc_left = [ -0.18283 2.21910 0.00378 -0.00954 0.00000 ]

+- [ 0.21197 2.61603 0.00844 0.01432 0.00000 ]

• Right camera

Focal Length: fc_right = [ 794.00268 796.12964 ] +- [ 12.81526 13.17540 ]

Principal point: cc_right = [ 259.92620 247.92904 ] +- [ 49.45971 20.69040 ]

Skew: alpha_c_right = [ 0.00000 ] +- [ 0.00000 ]

=> angle of pixel axes = 90.00000 +- 0.00000 degrees

Distortion: kc_right = [ -0.08870 0.26831 0.00051 -0.00628 0.00000 ]

+- [ 0.11665 0.27897 0.00546 0.01421 0.00000 ]

• Extrinsic parameters (position of right camera with respect to left camera):

Rotation vector: om = [ -0.06425 0.83695 0.02628 ] +- [ 0.03460 0.06895 0.01779 ]

Translation vector: T = [ -762.20898 70.67856 1040.76619 ]

+- [ 64.17818 26.22549 47.12061 ]

2. The rotation vector om is a non-normalized vector that is codirectional with the rotation axis and whose magnitude is equal to the rotation angle. The rotation matrix can be recovered using the Rodrigues formula:

R = \begin{bmatrix} 0.6695 & -0.0486 & 0.7412 \\ -0.0020 & 0.9977 & 0.0673 \\ -0.7428 & -0.0466 & 0.6679 \end{bmatrix}
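
A minimal NumPy sketch of the Rodrigues formula; applied to the om vector reported above it should reproduce R up to rounding.

import numpy as np

def rodrigues(om):
    # Convert a rotation vector om (unit axis times angle) into a 3x3 rotation matrix.
    theta = np.linalg.norm(om)
    if theta < 1e-12:
        return np.eye(3)
    k = om / theta                          # unit rotation axis
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])      # cross-product (skew-symmetric) matrix of k
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

R = rodrigues(np.array([-0.06425, 0.83695, 0.02628]))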


3. Stereo rig spatial configuration

Figure 8: Stereo rig spatial configuration

4. A stereo image pair is rectified if the epipolar lines in both images are parallel to the x axis and aligned so that corresponding epipolar lines lie on the same image rows in the two views. The figure below shows the rectified pair for image 13.

Figure 9: A pair of stereo rectified images. Red lines show the matching epipolar lines

Two points X_L and X_R which represent the same point in 3D space, expressed in the left and right camera coordinate frames respectively, are related by the following equations:

X_R = R X_L + T

X_L = R^T (X_R - T)

where R is the rotation matrix and T is the translation vector between the coordinate frames. Note that for orthogonal matrices Q^{-1} = Q^T.
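
A small NumPy sketch of these relations, using the R and T estimated above and a hypothetical point in front of the left camera (units as reported by the toolbox):

import numpy as np

R = np.array([[ 0.6695, -0.0486, 0.7412],
              [-0.0020,  0.9977, 0.0673],
              [-0.7428, -0.0466, 0.6679]])
T = np.array([-762.20898, 70.67856, 1040.76619])

X_L = np.array([0.0, 0.0, 1000.0])   # hypothetical point in the left camera frame
X_R = R @ X_L + T                    # the same point in the right camera frame
X_L_back = R.T @ (X_R - T)           # back to the left frame; equals X_L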


5 Question 5: Stereo triangulation

Consider a camera with projection matrix P = [I | 0]. A point X_cam expressed in the camera coordinate frame is mapped to

x_n = P X_{cam} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ 1 \end{bmatrix} = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} \mapsto \begin{bmatrix} x_1/x_3 \\ x_2/x_3 \end{bmatrix}

x_n is an image point expressed in normalized coordinates. Note that it corresponds to a ray in 3D space on which X_cam lies.
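
Ignoring lens distortion (which the toolbox normalization does handle), the normalized coordinates of a pixel can be obtained by applying K^{-1} to its homogeneous pixel coordinates. A sketch with a hypothetical pixel and the left-camera K from the stereo calibration:

import numpy as np

K_left = np.array([[595.62985, 0.0, 328.15377],
                   [0.0, 595.95122, 244.49640],
                   [0.0, 0.0, 1.0]])

u, v = 400.0, 250.0                                  # hypothetical pixel coordinates
ray = np.linalg.inv(K_left) @ np.array([u, v, 1.0])  # direction of the viewing ray
x_n = ray[:2] / ray[2]                               # normalized coordinates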

Now consider a calibrated stereo rig. Assume that x_L and x_R are the image points in the left and right cameras corresponding to a 3D point X. Suppose that we normalize these points and obtain the normalized coordinates x_Ln and x_Rn. For each camera we can use the normalized coordinates to construct the ray that originates at the camera centre and contains the point X. These two rays intersect at X. Thus we have found the 3D coordinates of a point in space from the normalized coordinates of its images in a stereo rig.

The Camera Calibration Toolbox for MATLAB provides a function normalize.m which computes the normalized coordinates of an image point given the camera calibration parameters (including the lens distortion model). In practice there is always an error in estimating the image points x_L and x_R, so the rays from the left and right cameras never intersect exactly. Hence X is approximated as the point for which the sum of distances to both rays is minimised.
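
One common realization of this approximation is the midpoint of the shortest segment between the two viewing rays (which minimizes the sum of squared distances to the rays); a sketch, assuming normalized coordinates xLn and xRn and the stereo extrinsics R, T:

import numpy as np

def triangulate_midpoint(xLn, xRn, R, T):
    # Approximate the 3D point (in the left camera frame) whose normalized image
    # coordinates are xLn (left) and xRn (right), as the midpoint of the shortest
    # segment between the two viewing rays.
    o1 = np.zeros(3)                              # left camera centre
    d1 = np.array([xLn[0], xLn[1], 1.0])          # left viewing ray
    o2 = -R.T @ T                                 # right camera centre in the left frame
    d2 = R.T @ np.array([xRn[0], xRn[1], 1.0])    # right viewing ray in the left frame

    w = o2 - o1
    a, b, c = d1 @ d1, d1 @ d2, d2 @ d2
    p, q = d1 @ w, d2 @ w
    denom = a * c - b * b                         # zero only if the rays are parallel
    t1 = (c * p - b * q) / denom
    t2 = (b * p - a * q) / denom
    return 0.5 * ((o1 + t1 * d1) + (o2 + t2 * d2))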

Figure 10: Rays from left and right cameras


Figure 11: Note that those rays do not intersect


1. The location of a point in 3D world coordinates can be found from a pair of calibrated stereo images by performing stereo triangulation on the matched images of the point. In our case two points for the eyes, two points for the mouth corners and one point for the nose tip were matched.

Figure 12: Face point correspondence for image 7

The figures below show the 3D plot containing the face locations as well as the calibration grid locations for images 0, 7 and 12. Note that for the objects representing faces, the middle point, which corresponds to the tip of the nose, is closer to the camera than the other four points.

Figure 13: Face locations, isometric view


Figure 14: Face locations, frontal view

Figure 15: Face locations, side view

2. Both the cabinets and the wall can be represented to a certain degree of accuracy as planes. The following scheme was used to perform the dense stereo reconstruction of these objects:

• Plane estimation

The matching points from both images are used to estimate the equation of the plane;

• Corner estimation

Object corners are estimated from the picture in which all of the corner points of the desired object are visible. These corner points are then projected onto the estimated plane.

Plane Estimation

We need to estimate the four coefficients of the plane equation:

ax + by + cz + d = 0

We can obtain the 3D coordinates of a point in space from the corresponding matching points in the left and right images of a calibrated stereo pair using stereo triangulation. Given 3 or more points that belong to the same plane we can estimate the equation of the plane by solving the following system:


\begin{bmatrix} x_1 & y_1 & z_1 & 1 \\ x_2 & y_2 & z_2 & 1 \\ x_3 & y_3 & z_3 & 1 \end{bmatrix} \begin{bmatrix} a \\ b \\ c \\ d \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}

Since the scale doesn’t matter we can choose d = 1.
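
A minimal least-squares sketch of this step, assuming the triangulated 3D points are the rows of a NumPy array pts; fixing d = 1 turns the system into ax + by + cz = -1 (this particular normalization fails only for planes through the origin).

import numpy as np

def fit_plane(pts):
    # Fit a plane a*x + b*y + c*z + d = 0 through three or more 3D points,
    # with the scale fixed by choosing d = 1.
    A = np.asarray(pts, dtype=float)             # N x 3 matrix of point coordinates
    rhs = -np.ones(len(A))                       # a*x + b*y + c*z = -1
    abc, *_ = np.linalg.lstsq(A, rhs, rcond=None)
    a, b, c = abc
    return a, b, c, 1.0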

Figure 16: Image points used for cabinet plane estimation

Figure 17: Estimated plane for cabinet

Corner estimation

Once the plane has been estimated we select the image in which all of the corners of the object are visible and project them onto the estimated plane: the 3D rays corresponding to the image points are intersected with the plane.
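
A sketch of the ray/plane intersection used for this step: the ray starts at the camera centre o with direction d (the back-projected normalized image point, expressed in the same frame as the plane), and the plane offset is written d0 to avoid clashing with the ray direction.

import numpy as np

def intersect_ray_plane(o, d, plane):
    # Intersect the ray o + t*d with the plane (a, b, c, d0),
    # i.e. the set of points X with a*X[0] + b*X[1] + c*X[2] + d0 = 0.
    a, b, c, d0 = plane
    n = np.array([a, b, c])                      # plane normal
    denom = n @ d
    if abs(denom) < 1e-12:
        raise ValueError("ray is parallel to the plane")
    t = -(n @ o + d0) / denom
    return o + t * d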


Figure 18: Image points from the right camera corresponding to the cabinet corners

Figure 19: Image points from the right camera corresponding to the cabinet corners


Figure 20: Dense reconstruction of cabinet and walls