3D Computer Vision and Video Computing 3D Vision Topic 1 of Part II Camera Models CSC I6716 Fall...

Preview:

Citation preview

3D Computer Vision

and Video Computing 3D Vision3D Vision

Topic 1 of Part IICamera Models

CSC I6716Fall 2010

Zhigang Zhu, City College of New York zhu@cs.ccny.cuny.edu

3D Computer Vision

and Video Computing 3D Vision3D Vision

n Closely Related Disciplines l Image Processing – images to magesl Computer Graphics – models to imagesl Computer Vision – images to modelsl Photogrammetry – obtaining accurate measurements from images

n What is 3-D ( three dimensional) Vision?l Motivation: making computers see (the 3D world as humans do)l Computer Vision: 2D images to 3D structurel Applications : robotics / VR /Image-based rendering/ 3D video

n Lectures on 3-D Vision Fundamentalsl Camera Geometric Models (3 lectures)l Camera Calibration (3 lectures)l Stereo (4 lectures)l Motion (4 lectures)

3D Computer Vision

and Video Computing Lecture OutlineLecture Outline

n Geometric Projection of a Cameral Pinhole camera modell Perspective projectionl Weak-Perspective Projection

n Camera Parametersl Intrinsic Parameters: define mapping from 3D to 2Dl Extrinsic parameters: define viewpoint and viewing direction

n Basic Vector and Matrix Operations, Rotation

n Camera Models Revisitedl Linear Version of the Projection Transformation Equation

n Perspective Camera Modeln Weak-Perspective Camera Modeln Affine Camera Modeln Camera Model for Planes

n Summary

3D Computer Vision

and Video Computing Lecture AssumptionsLecture Assumptions

n Camera Geometric Modelsl Knowledge about 2D and 3D geometric transformationsl Linear algebra (vector, matrix)l This lecture is only about geometry

n Goal

Build up relation between 2D images and 3D scenes- 3D Graphics (rendering): from 3D to 2D- 3D Vision (stereo and motion): from 2D to 3D

- Calibration: Determning the parameters for mapping

3D Computer Vision

and Video Computing Image FormationImage Formation

Light (Energy) Source

Surface

Pinhole Lens

Imaging Plane

World Optics Sensor Signal

B&W Film

Color Film

TV Camera

Silver Density

Silver densityin three colorlayers

Electrical

3D Computer Vision

and Video Computing Image FormationImage Formation

Light (Energy) Source

Surface

Pinhole Lens

Imaging Plane

World Optics Sensor Signal

Camera: Spec & Pose

3D Scene

2D Image

3D Computer Vision

and Video Computing Pinhole Camera ModelPinhole Camera Model

n Pin-hole is the basis for most graphics and visionl Derived from physical construction of early camerasl Mathematics is very straightforward

n 3D World projected to 2D Imagel Image inverted, size reducedl Image is a 2D plane: No direct depth information

n Perspective projection l f called the focal length of the lensl given image size, change f will change FOV and figure sizes

Pinhole lens

Optical Axis

f

Image Plane

3D Computer Vision

and Video Computing Focal Length, FOVFocal Length, FOV

n Consider case with object on the optical axis:

n Optical axis: the direction of imagingn Image plane: a plane perpendicular to the optical axisn Center of Projection (pinhole), focal point, viewpoint, nodal pointn Focal length: distance from focal point to the image planen FOV : Field of View – viewing angles in horizontal and vertical

directions

fz

viewpoint

Image plane

3D Computer Vision

and Video Computing Focal Length, FOVFocal Length, FOV

n Consider case with object on the optical axis:

fz

Out of view

Image plane

n Optical axis: the direction of imagingn Image plane: a plane perpendicular to the optical axisn Center of Projection (pinhole), focal point, viewpoint, , nodal pointn Focal length: distance from focal point to the image planen FOV : Field of View – viewing angles in horizontal and vertical

directions

n Increasing f will enlarge figures, but decrease FOV

3D Computer Vision

and Video Computing Equivalent GeometryEquivalent Geometry

n Consider case with object on the optical axis:

fz

n More convenient with upright image:

fz

Projection plane z = f

n Equivalent mathematically

3D Computer Vision

and Video Computing Perspective ProjectionPerspective Projection

n Compute the image coordinates of p in terms of the world (camera) coordinates of P.

n Origin of camera at center of projectionn Z axis along optical axisn Image Plane at Z = f; x // X and y//Y

x

y

Z

P(X,Y,Z)p(x, y)

Z = f

0

Y

X

Z

Yfy

Z

Xfx

3D Computer Vision

and Video Computing Reverse ProjectionReverse Projection

n Given a center of projection and image coordinates of a point, it is not possible to recover the 3D depth of the point from a single image.

In general, at least two images of the same point taken from two different locations are required to recover depth.

All points on this linehave image coordi-nates (x,y).

p(x,y)

P(X,Y,Z) can be any-where along this line

3D Computer Vision

and Video Computing Pinhole camera imagePinhole camera image

Photo by Robert Kosara, robert@kosara.net

http://www.kosara.net/gallery/pinholeamsterdam/pic01.html

Amsterdam : what do you see in this picture?

l straight line

l size

l parallelism/angle

l shape

l shape of planes

l depth

3D Computer Vision

and Video Computing Pinhole camera imagePinhole camera image

Photo by Robert Kosara, robert@kosara.net

http://www.kosara.net/gallery/pinholeamsterdam/pic01.html

Amsterdam

straight line

l size

l parallelism/angle

l shape

l shape of planes

l depth

3D Computer Vision

and Video Computing Pinhole camera imagePinhole camera image

Photo by Robert Kosara, robert@kosara.net

http://www.kosara.net/gallery/pinholeamsterdam/pic01.html

Amsterdam

straight line

´ size

l parallelism/angle

l shape

l shape of planes

l depth

3D Computer Vision

and Video Computing Pinhole camera imagePinhole camera image

Photo by Robert Kosara, robert@kosara.net

http://www.kosara.net/gallery/pinholeamsterdam/pic01.html

Amsterdam

straight line

´ size

´ parallelism/angle

l shape

l shape of planes

l depth

3D Computer Vision

and Video Computing Pinhole camera imagePinhole camera image

Photo by Robert Kosara, robert@kosara.net

http://www.kosara.net/gallery/pinholeamsterdam/pic01.html

Amsterdam

straight line

´ size

´ parallelism/angle

´ shape

l shape of planes

l depth

3D Computer Vision

and Video Computing Pinhole camera imagePinhole camera image

Photo by Robert Kosara, robert@kosara.net

http://www.kosara.net/gallery/pinholeamsterdam/pic01.html

Amsterdam

straight line

´ size

´ parallelism/angle

´ shape

l shape of planes

parallel to image

l depth

3D Computer Vision

and Video Computing Pinhole camera imagePinhole camera image

- We see spatial shapes rather than individual pixels

- Knowledge: top-down vision belongs to human

- Stereo &Motion most successful in 3D CV & application

- You can see it but you don't know how…

Amsterdam: what do you see?

straight line

´ size

´ parallelism/angle

´ shape

l shape of planes

parallel to image

l Depth ?

l stereo

l motion

l size

l structure …

3D Computer Vision

and Video ComputingYet other pinhole camera imagesYet other pinhole camera images

Markus Raetz, Metamorphose II, 1991-92, cast iron, 15 1/4 x 12 x 12 inches

Fine Art Center University Gallery, Sep 15 – Oct 26

Rabbit or Man?

3D Computer Vision

and Video ComputingYet other pinhole camera imagesYet other pinhole camera images

Markus Raetz, Metamorphose II, 1991-92, cast iron, 15 1/4 x 12 x 12 inches

Fine Art Center University Gallery, Sep 15 – Oct 26

2D projections are not the “same” as the real object as we usually see everyday!

3D Computer Vision

and Video Computing It’s real!It’s real!

3D Computer Vision

and Video Computing Weak Perspective ProjectionWeak Perspective Projection

n Average depth Z is much larger than the relative distance between any two scene points measured along the optical axis

n A sequence of two transformationsl Orthographic projection : parallel raysl Isotropic scaling : f/Z

n Linear Modell Preserve angles and shapes

Z

Yfy

Z

Xfx

x

y

Z

P(X,Y,Z)p(x, y)

Z = f

0

Y

X

3D Computer Vision

and Video Computing Camera ParametersCamera Parameters

n Coordinate Systemsl Frame coordinates (xim, yim) pixelsl Image coordinates (x,y) in mml Camera coordinates (X,Y,Z) l World coordinates (Xw,Yw,Zw)

n Camera Parametersl Intrinsic Parameters (of the camera and the frame grabber): link the

frame coordinates of an image point with its corresponding camera coordinates

l Extrinsic parameters: define the location and orientation of the camera coordinate system with respect to the world coordinate system

Zw

Xw

Yw

Y

X

Zx

yO

Pw

P

p

xim

yim

(xim,yim)

Pose / Camera

Object / World

Image frame

Frame Grabber

3D Computer Vision

and Video Computing Intrinsic Parameters (I)Intrinsic Parameters (I)

n From image to framel Image centerl Directions of axesl Pixel size

n From 3D to 2Dl Perspective projection

n Intrinsic Parametersl (ox ,oy) : image center (in pixels)l (sx ,sy) : effective size of the pixel (in mm)l f: focal length

xim

yim

Pixel(xim,yim)

Y

Z

X

x

y

O

p (x,y,f)

ox

oy

(0,0)

Size:(sx,sy)

yyim

xxim

soyy

soxx

)(

)(

Z

Yfy

Z

Xfx

3D Computer Vision

and Video Computing Intrinsic Parameters (II)Intrinsic Parameters (II)

n Lens Distortions

n Modeled as simple radial distortionsl r2 = xd2+yd2

l (xd , yd) distorted pointsl k1 , k2: distortion coefficientsl A model with k2 =0 is still accurate for a

CCD sensor of 500x500 with ~5 pixels distortion on the outer boundary

)1(

)1(

42

21

42

21

rkrkyy

rkrkxx

d

d

(xd, yd)(x, y)k1 , k2

3D Computer Vision

and Video Computing Extrinsic ParametersExtrinsic Parameters

n From World to Camera

n Extrinsic Parametersl A 3-D translation vector, T, describing the relative locations of the

origins of the two coordinate systems (what’s it?)l A 3x3 rotation matrix, R, an orthogonal matrix that brings the

corresponding axes of the two systems onto each other

TPRP w Zw

Xw

Yw

Y

X

Zx

yO

PwP

p

xim

yim

(xim,yim)

T

3D Computer Vision

and Video ComputingLinear Algebra: Vector and MatrixLinear Algebra: Vector and Matrix

n A point as a 2D/ 3D vectorl Image point: 2D vectorl Scene point: 3D vectorl Translation: 3D vector

n Vector Operationsl Addition:

n Translation of a 3D vectorl Dot product ( a scalar):

n a.b = |a||b|cosql Cross product (a vector)

n Generates a new vector that is orthogonal to both of them

Tyxy

x),(

p

TZYX ),,(P

Tzyx TTT ),,(T

Tzwywxw TZTYTX ),,( TPwP

baba Tc

bac

T: Transpose

a x b = (a2b3 - a3b2)i + (a3b1 - a1b3)j + (a1b2 - a2b1)k

3D Computer Vision

and Video ComputingLinear Algebra: Vector and MatrixLinear Algebra: Vector and Matrix

n Rotation: 3x3 matrixl Orthogonal :

n 9 elements => 3+3 constraints (orthogonal/cross ) => 2+2 constraints (unit vectors) => 3 DOF ? (degrees of freedom, orthogonal/dot)

l How to generate R from three angles? (next few slides)

n Matrix Operationsl R Pw +T= ? - Points in the World are projected on three new axes

(of the camera system) and translated to a new origin

IRRRRRR TTT ei ..,1

T

T

T

ij

rrr

rrr

rrr

r

3

2

1

333231

232221

131211

33R

R

R

R

zT

yT

xT

zwww

ywww

xwww

T

T

T

TZrYrXr

TZrYrXr

TZrYrXr

w

w

w

w

PR

PR

PR

TRPP

3

2

1

333231

232221

131211

3D Computer Vision

and Video Computing Rotation: from Angles to MatrixRotation: from Angles to Matrix

n Rotation around the Axesl Result of three consecutive

rotations around the coordinate axes

n Notes:l Only three rotationsl Every time around one axisl Bring corresponding axes to each other

n Xw = X, Yw = Y, Zw = Zl First step (e.g.) Bring Xw to X

RRRR Zw

Xw

Yw

Y

X

Z

O

g

b

a

3D Computer Vision

and Video Computing Rotation: from Angles to MatrixRotation: from Angles to Matrix

n Rotation g around the Zw Axisl Rotate in XwOYw planel Goal: Bring Xw to Xl But X is not in XwOYw

l YwX X in XwOZw (Yw XwOZw) Yw in YOZ ( X YOZ)

n Next time rotation around Yw

100

0cossin

0sincos

RZw

Xw

Yw

Y

X

Z

O

g

3D Computer Vision

and Video Computing Rotation: from Angles to MatrixRotation: from Angles to Matrix

n Rotation g around the Zw Axisl Rotate in XwOYw plane so thatl YwX X in XwOZw (YwXwOZw) Yw

in YOZ ( XYOZ)n Zw does not change

100

0cossin

0sincos

RZw

Xw

Yw

Y

X

Z

O

g

3D Computer Vision

and Video Computing Rotation: from Angles to MatrixRotation: from Angles to Matrix

n Rotation b around the Yw Axisl Rotate in XwOZw plane so thatl Xw = X Zw in YOZ (& Yw in YOZ)

n Yw does not change

cos0sin

010

sin0cos

RZw

Xw

Yw

Y

X

Z

O

b

3D Computer Vision

and Video Computing Rotation: from Angles to MatrixRotation: from Angles to Matrix

n Rotation b around the Yw Axisl Rotate in XwOZw plane so thatl Xw = X Zw in YOZ (& Yw in YOZ)

n Yw does not change

cos0sin

010

sin0cos

R

Zw

Xw

Yw

Y

X

Z

O

b

3D Computer Vision

and Video Computing Rotation: from Angles to MatrixRotation: from Angles to Matrix

n Rotation a around the Xw(X) Axisl Rotate in YwOZw plane so thatl Yw = Y, Zw = Z (& Xw = X)

n Xw does not change

cossin0

sincos0

001

R

Zw

Xw

Yw

Y

X

Z

O

a

3D Computer Vision

and Video Computing Rotation: from Angles to MatrixRotation: from Angles to Matrix

n Rotation a around the Xw(X) Axisl Rotate in YwOZw plane so thatl Yw = Y, Zw = Z (& Xw = X)

n Xw does not change

cossin0

sincos0

001

R

Zw

Xw

Yw

Y

X

Z

O

a

3D Computer Vision

and Video Computing Rotation: from Angles to MatrixRotation: from Angles to Matrix

n Rotation around the Axesl Result of three consecutive

rotations around the coordinate axes

n Notes:l Rotation directionsl The order of multiplications matters:

, ,g b al Same R, 6 different sets of , ,a b gl R Non-linear function of , ,a b gl R is orthogonall It’s easy to compute angles from R

RRRR Zw

Xw

Yw

Y

X

Z

O

g

b

a

Appendix A.9 of the textbook

coscoscossinsinsincossinsincossincos

cossincoscossinsinsinsincoscossinsin

sinsincoscoscos

R

3D Computer Vision

and Video Computing Rotation- Axis and AngleRotation- Axis and Angle

n According to Euler’s Theorem, any 3D rotation can be described by a rotating angle, q, around an axis defined by an unit vector n = [n1, n2, n3]T.

n Three degrees of freedom – why?

Appendix A.9 of the textbook

sin

0

0

0

)cos1(cos

12

13

23

232313

322212

312121

nn

nn

nn

nnnnn

nnnnn

nnnnn

IR

3D Computer Vision

and Video ComputingLinear Version of Perspective ProjectionLinear Version of Perspective Projection

n World to Cameral Camera: P = (X,Y,Z)T

l World: Pw = (Xw,Yw,Zw)T

l Transform: R, T

n Camera to Imagel Camera: P = (X,Y,Z)T

l Image: p = (x,y)T

l Not linear equations

n Image to Framel Neglecting distortionl Frame (xim, yim)T

n World to Framel (Xw,Yw,Zw)T -> (xim, yim)T

l Effective focal lengthsn fx = f/sx, fy=f/sy

n Three are not independent

zT

yT

xT

zwww

ywww

xwww

T

T

T

TZrYrXr

TZrYrXr

TZrYrXr

w

w

w

w

PR

PR

PR

TRPP

3

2

1

333231

232221

131211

yyim

xxim

soyy

soxx

)(

)(

) ,(),(Z

Yf

Z

Xfyx

zwww

ywwwyyim

zwww

xwwwxxim

TZrYrXr

TZrYrXrfoy

TZrYrXr

TZrYrXrfox

333231

232221

333231

131211

3D Computer Vision

and Video Computing Linear Matrix Equation of perspective projection

Linear Matrix Equation of perspective projection

n Projective Spacel Add fourth coordinate

n Pw = (Xw,Yw,Zw, 1)T

l Define (x1,x2,x3)T such thatn x1/x3 =xim, x2/x3 =yim

n 3x4 Matrix Mext

l Only extrinsic parametersl World to camera

n 3x3 Matrix Mint

l Only intrinsic parametersl Camera to frame

n Simple Matrix Product! Projective Matrix M= MintMext

l (Xw,Yw,Zw)T -> (xim, yim)T

l Linear Transform from projective space to projective planel M defined up to a scale factor – 11 independent entries

13

2

1

w

w

w

ZY

X

x

x

x

extintMM

zT

yT

xT

z

y

x

ext

T

T

T

T

T

T

rrr

rrr

rrr

3

2

1

333231

232221

131211

R

R

R

M

100

0

0

int yy

xx

of

of

M

32

31

/

/

xx

xx

y

x

im

im

3D Computer Vision

and Video Computing Three Camera ModelsThree Camera Models

n Perspective Camera Modell Making some assumptions

n Known center: Ox = Oy = 0n Square pixel: Sx = Sy = 1

l 11 independent entries <-> 7 parameters

n Weak-Perspective Camera Modell Average Distance Z >> Range dZl Define centroid vector Pw

l 8 independent entries

n Affine Camera Modell Mathematical Generalization of Weak-Persl Doesn’t correspond to physical cameral But simple equation and appealing geometry

n Doesn’t preserve angle BUT parallelisml 8 independent entries

z

y

x

T

fT

fT

rrr

frfrfr

frfrfr

333231

232221

131211

M

zw

y

x

wp

T

fT

fT

frfrfr

frfrfr

PR

MT3

000232221

131211

zw T PRZZ T3

3

2

1

232221

131211

000 b

b

b

aaa

aaa

afM

3D Computer Vision

and Video Computing Camera Models for a Plane Camera Models for a Plane

n Planes are very common in the Man-Made World

l One more constraint for all points: Zw is a function of Xw and Yw

n Special case: Ground Plane l Zw=0l Pw =(Xw, Yw,0, 1)T

l 3D point -> 2D point

n Projective Model of a Planel 8 independent entries

n General Form ?l 8 independent entries

1

0

333231

232221

131211

3

2

1

w

w

w

z

y

x

Z

Y

X

T

fT

fT

rrr

frfrfr

frfrfr

x

x

x

dZnYnXn wzwywx dwTPn

3D Computer Vision

and Video Computing Camera Models for a Plane Camera Models for a Plane

n A Plane in the World

l One more constraint for all points: Zw is a function of Xw and Yw

n Special case: Ground Plane l Zw=0l Pw =(Xw, Yw,0, 1)T

l 3D point -> 2D point

n Projective Model of Zw=0l 8 independent entries

n General Form ?l 8 independent entries

1

0

333231

232221

131211

3

2

1

w

w

w

z

y

x

Z

Y

X

T

fT

fT

rrr

frfrfr

frfrfr

x

x

x

dZnYnXn wzwywx dwTPn

13231

232221

1211

3

2

1

w

w

z

x

Y

X

Trr

fTfrfr

fTfrfr

x

x

x

3D Computer Vision

and Video Computing Camera Models for a Plane Camera Models for a Plane

n A Plane in the World

l One more constraint for all points: Zw is a function of Xw and Yw

n Special case: Ground Plane l Zw=0l Pw =(Xw, Yw,0, 1)T

l 3D point -> 2D point

n Projective Model of Zw=0l 8 independent entries

n General Form ?l nz = 1

l 8 independent entries

1

333231

232221

131211

3

2

1

w

w

w

z

y

x

Z

Y

X

T

fT

fT

rrr

frfrfr

frfrfr

x

x

x

dZnYnXn wzwywx dwTPn

1)()()(

)()()(

)()()(

3333323331

2323222321

1313121311

3

2

1

w

w

zyx

yyx

xyx

Y

X

Tdrrnrrnr

Tdrfrnrfrnrf

Tdrfrnrfrnrf

x

x

x

wywxw YnXndZ

n 2D (xim,yim) -> 3D (Xw, Yw, Zw) ?

3D Computer Vision

and Video Computing Applications and IssuesApplications and Issues

n Graphics /Renderingl From 3D world to 2D image

n Changing viewpoints and directionsn Changing focal length

l Fast rendering algorithms

n Vision / Reconstructionl From 2D image to 3D model

n Inverse problemn Much harder / unsolved

l Robust algorithms for matching and parameter estimationl Need to estimate camera parameters first

n Calibrationl Find intrinsic & extrinsic parametersl Given image-world point pairsl Probably a partially solved problem ?l 11 independent entries

n <-> 10 parameters: fx, fy, ox, oy, , ,a b g, Tx,Ty,Tz

13

2

1

w

w

w

ZY

X

x

x

x

extintMM

z

y

x

ext

T

T

T

rrr

rrr

rrr

333231

232221

131211

M

100

0

0

int yy

xx

of

of

M

32

31

/

/

xx

xx

y

x

im

im

3D Computer Vision

and Video Computing Camera Model SummaryCamera Model Summary

n Geometric Projection of a Cameral Pinhole camera modell Perspective projectionl Weak-Perspective Projection

n Camera Parameters (10 or 11)l Intrinsic Parameters: f, ox,oy, sx,sy,k1: 4 or 5 independent

parametersl Extrinsic parameters: R, T – 6 DOF (degrees of freedom)

n Linear Equations of Camera Models (without distortion)l General Projection Transformation Equation : 11 parametersl Perspective Camera Model: 11 parameters l Weak-Perspective Camera Model: 8 parameters l Affine Camera Model: generalization of weak-perspective: 8l Projective transformation of planes: 8 parameters

3D Computer Vision

and Video Computing NextNext

n Determining the value of the extrinsic and intrinsic parameters of a camera

Calibration(Ch. 6)

z

y

x

ext

T

T

T

rrr

rrr

rrr

333231

232221

131211

M

100

0

0

int yy

xx

of

of

M

13

2

1

w

w

w

ZY

X

x

x

x

extintMM