3D Vision

Preview:

DESCRIPTION

3D Vision. CSC I6716 Spring2011. Topic 1 of Part II Camera Models. Zhigang Zhu, City College of New York zhu@cs.ccny.cuny.edu. 3D Vision. Closely Related Disciplines Image Processing – images to mages Computer Graphics – models to images Computer Vision – images to models - PowerPoint PPT Presentation

Citation preview

3D Computer Visionand Video Computing 3D Vision

Topic 1 of Part IICamera Models

CSC I6716Spring2011

Zhigang Zhu, City College of New York zhu@cs.ccny.cuny.edu

3D Computer Visionand Video Computing 3D Vision

n Closely Related Disciplines l Image Processing – images to magesl Computer Graphics – models to imagesl Computer Vision – images to modelsl Photogrammetry – obtaining accurate measurements from images

n What is 3-D ( three dimensional) Vision?l Motivation: making computers see (the 3D world as humans do)l Computer Vision: 2D images to 3D structurel Applications : robotics / VR /Image-based rendering/ 3D video

n Lectures on 3-D Vision Fundamentalsl Camera Geometric Models (1 lecture)l Camera Calibration (2 lectures)l Stereo (2 lectures)l Motion (2 lectures)

3D Computer Visionand Video Computing Lecture Outline

n Geometric Projection of a Cameral Pinhole camera modell Perspective projectionl Weak-Perspective Projection

n Camera Parametersl Intrinsic Parameters: define mapping from 3D to 2Dl Extrinsic parameters: define viewpoint and viewing direction

n Basic Vector and Matrix Operations, Rotation

n Camera Models Revisitedl Linear Version of the Projection Transformation Equation

n Perspective Camera Modeln Weak-Perspective Camera Modeln Affine Camera Modeln Camera Model for Planes

n Summary

3D Computer Visionand Video Computing Lecture Assumptions

n Camera Geometric Modelsl Knowledge about 2D and 3D geometric transformationsl Linear algebra (vector, matrix)l This lecture is only about geometry

n Goal

Build up relation between 2D images and 3D scenes- 3D Graphics (rendering): from 3D to 2D- 3D Vision (stereo and motion): from 2D to 3D

- Calibration: Determning the parameters for mapping

3D Computer Visionand Video Computing Image Formation

Light (Energy) Source

Surface

Pinhole Lens

Imaging Plane

World Optics Sensor Signal

B&W Film

Color Film

TV Camera

Silver Density

Silver densityin three colorlayers

Electrical

3D Computer Visionand Video Computing Image Formation

Light (Energy) Source

Surface

Pinhole Lens

Imaging Plane

World Optics Sensor Signal

Camera: Spec & Pose

3D Scene

2D Image

3D Computer Visionand Video Computing Pinhole Camera Model

n Pin-hole is the basis for most graphics and visionl Derived from physical construction of early camerasl Mathematics is very straightforward

n 3D World projected to 2D Imagel Image inverted, size reducedl Image is a 2D plane: No direct depth information

n Perspective projection l f called the focal length of the lensl given image size, change f will change FOV and figure sizes

Pinhole lens

Optical Axis

f

Image Plane

3D Computer Visionand Video Computing Focal Length, FOV

n Consider case with object on the optical axis:

n Optical axis: the direction of imagingn Image plane: a plane perpendicular to the optical axisn Center of Projection (pinhole), focal point, viewpoint, nodal pointn Focal length: distance from focal point to the image planen FOV : Field of View – viewing angles in horizontal and vertical

directions

fzviewpoint

Image plane

3D Computer Visionand Video Computing Focal Length, FOV

n Consider case with object on the optical axis:

fz

Out of view

Image plane

n Optical axis: the direction of imagingn Image plane: a plane perpendicular to the optical axisn Center of Projection (pinhole), focal point, viewpoint, , nodal pointn Focal length: distance from focal point to the image planen FOV : Field of View – viewing angles in horizontal and vertical

directions

n Increasing f will enlarge figures, but decrease FOV

3D Computer Visionand Video Computing Equivalent Geometry

n Consider case with object on the optical axis:

fz

n More convenient with upright image:

fz

Projection plane z = f

n Equivalent mathematically

3D Computer Visionand Video Computing Perspective Projection

n Compute the image coordinates of p in terms of the world (camera) coordinates of P.

n Origin of camera at center of projectionn Z axis along optical axisn Image Plane at Z = f; x // X and y//Y

x

y

Z

P(X,Y,Z)p(x, y)

Z = f

0

YX

ZYfy

ZXfx

3D Computer Visionand Video Computing Reverse Projection

n Given a center of projection and image coordinates of a point, it is not possible to recover the 3D depth of the point from a single image.

In general, at least two images of the same point taken from two different locations are required to recover depth.

All points on this linehave image coordi-nates (x,y).

p(x,y)

P(X,Y,Z) can be any-where along this line

3D Computer Visionand Video Computing Pinhole camera image

Photo by Robert Kosara, robert@kosara.nethttp://www.kosara.net/gallery/pinholeamsterdam/pic01.html

Amsterdam : what do you see in this picture?l straight linel sizel parallelism/anglel shapel shape of planes

l depth

3D Computer Visionand Video Computing Pinhole camera image

Photo by Robert Kosara, robert@kosara.nethttp://www.kosara.net/gallery/pinholeamsterdam/pic01.html

Amsterdamstraight linel sizel parallelism/anglel shapel shape of planes

l depth

3D Computer Visionand Video Computing Pinhole camera image

Photo by Robert Kosara, robert@kosara.nethttp://www.kosara.net/gallery/pinholeamsterdam/pic01.html

Amsterdamstraight line´sizel parallelism/anglel shapel shape of planes

l depth

3D Computer Visionand Video Computing Pinhole camera image

Photo by Robert Kosara, robert@kosara.nethttp://www.kosara.net/gallery/pinholeamsterdam/pic01.html

Amsterdamstraight line´size´ parallelism/anglel shapel shape of planes

l depth

3D Computer Visionand Video Computing Pinhole camera image

Photo by Robert Kosara, robert@kosara.nethttp://www.kosara.net/gallery/pinholeamsterdam/pic01.html

Amsterdamstraight line´size´ parallelism/angle´ shapel shape of planes

l depth

3D Computer Visionand Video Computing Pinhole camera image

Photo by Robert Kosara, robert@kosara.nethttp://www.kosara.net/gallery/pinholeamsterdam/pic01.html

Amsterdamstraight line´size´ parallelism/angle´ shapel shape of planes parallel to imagel depth

3D Computer Visionand Video Computing Pinhole camera image

- We see spatial shapes rather than individual pixels- Knowledge: top-down vision belongs to human- Stereo &Motion most successful in 3D CV & application- You can see it but you don't know how…

Amsterdam: what do you see?straight line´size´ parallelism/angle´ shapel shape of planes parallel to imagel Depth ?

l stereo l motionl sizel structure …

3D Computer Visionand Video ComputingYet other pinhole camera images

Markus Raetz, Metamorphose II, 1991-92, cast iron, 15 1/4 x 12 x 12 inchesFine Art Center University Gallery, Sep 15 – Oct 26

Rabbit or Man?

3D Computer Visionand Video ComputingYet other pinhole camera images

Markus Raetz, Metamorphose II, 1991-92, cast iron, 15 1/4 x 12 x 12 inchesFine Art Center University Gallery, Sep 15 – Oct 26

2D projections are not the “same” as the real object as we usually see everyday!

3D Computer Visionand Video Computing It’s real!

3D Computer Visionand Video Computing Weak Perspective Projection

n Average depth Z is much larger than the relative distance between any two scene points measured along the optical axis

n A sequence of two transformationsl Orthographic projection : parallel raysl Isotropic scaling : f/Z

n Linear Modell Preserve angles and shapes

ZYfy

ZXfx

x

y

Z

P(X,Y,Z)p(x, y)

Z = f

0

YX

3D Computer Visionand Video Computing Camera Parameters

n Coordinate Systemsl Frame coordinates (xim, yim) pixelsl Image coordinates (x,y) in mml Camera coordinates (X,Y,Z) l World coordinates (Xw,Yw,Zw)

n Camera Parametersl Intrinsic Parameters (of the camera and the frame grabber): link the

frame coordinates of an image point with its corresponding camera coordinates

l Extrinsic parameters: define the location and orientation of the camera coordinate system with respect to the world coordinate system

Zw

Xw

Yw

Y

X

Zx

yO

Pw

P

p

xim

yim

(xim,yim)

Pose / Camera

Object / World

Image frame

Frame Grabber

3D Computer Visionand Video Computing Intrinsic Parameters (I)

n From image to framel Image centerl Directions of axesl Pixel size

n From 3D to 2Dl Perspective projection

n Intrinsic Parametersl (ox ,oy) : image center (in pixels)l (sx ,sy) : effective size of the pixel (in mm)l f: focal length

xim

yim

Pixel(xim,yim)

Y

Z

X

x

y

O

p (x,y,f)

ox

oy

(0,0)

Size:(sx,sy)

yyim

xximsoyysoxx)()(

ZYfy

ZXfx

3D Computer Visionand Video Computing Intrinsic Parameters (II)

n Lens Distortions

n Modeled as simple radial distortionsl r2 = xd2+yd2

l (xd , yd) distorted pointsl k1 , k2: distortion coefficientsl A model with k2 =0 is still accurate for a

CCD sensor of 500x500 with ~5 pixels distortion on the outer boundary

)1(

)1(4

22

1

42

21

rkrkyy

rkrkxx

d

d

(xd, yd)(x, y)k1 , k2

3D Computer Visionand Video Computing Extrinsic Parameters

n From World to Camera

n Extrinsic Parametersl A 3-D translation vector, T, describing the relative locations of the

origins of the two coordinate systems (what’s it?)l A 3x3 rotation matrix, R, an orthogonal matrix that brings the

corresponding axes of the two systems onto each other

TPRP w Zw

Xw

Yw

Y

X

Zx

yO

PwP

p

xim

yim

(xim,yim)

T

3D Computer Visionand Video ComputingLinear Algebra: Vector and Matrix

n A point as a 2D/ 3D vectorl Image point: 2D vectorl Scene point: 3D vectorl Translation: 3D vector

n Vector Operationsl Addition:

n Translation of a 3D vectorl Dot product ( a scalar):

n a.b = |a||b|cosql Cross product (a vector)

n Generates a new vector that is orthogonal to both of them

Tyxyx

),(

p

TZYX ),,(PT

zyx TTT ),,(T

Tzwywxw TZTYTX ),,( TPwP

baba Tc

bac ´

T: Transpose

a x b = (a2b3 - a3b2)i + (a3b1 - a1b3)j + (a1b2 - a2b1)k

3D Computer Visionand Video ComputingLinear Algebra: Vector and Matrix

n Rotation: 3x3 matrixl Orthogonal :

n 9 elements => 3+3 constraints (orthogonal/cross ) => 2+2 constraints (unit vectors) => 3 DOF ? (degrees of freedom, orthogonal/dot)

l How to generate R from three angles? (next few slides)

n Matrix Operationsl R Pw +T= ? - Points in the World are projected on three new axes

(of the camera system) and translated to a new origin

IRRRRRR TTT ei ..,1

´ T

T

T

ijrrrrrrrrr

r

3

2

1

333231

232221

131211

33RRR

R

zT

yT

xT

zwww

ywww

xwww

T

TT

TZrYrXr

TZrYrXrTZrYrXr

w

w

w

w

PR

PRPR

TRPP

3

2

1

333231

232221

131211

3D Computer Visionand Video Computing Rotation: from Angles to Matrix

n Rotation around the Axesl Result of three consecutive

rotations around the coordinate axes

n Notes:l Only three rotationsl Every time around one axisl Bring corresponding axes to each other

n Xw = X, Yw = Y, Zw = Zl First step (e.g.) Bring Xw to X

RRRR Zw

XwYw

YX

Z

O

3D Computer Visionand Video Computing Rotation: from Angles to Matrix

n Rotation around the Zw Axisl Rotate in XwOYw planel Goal: Bring Xw to Xl But X is not in XwOYw

l YwX X in XwOZw (Yw XwOZw) Yw in YOZ ( X YOZ)

n Next time rotation around Yw

1000cossin0sincos

RZw

Xw

Yw

Y

X

Z

O

3D Computer Visionand Video Computing Rotation: from Angles to Matrix

n Rotation around the Zw Axisl Rotate in XwOYw plane so thatl YwX X in XwOZw (YwXwOZw)

Yw in YOZ ( XYOZ)n Zw does not change

1000cossin0sincos

RZw

Xw

Yw

Y

X

Z

O

3D Computer Visionand Video Computing Rotation: from Angles to Matrix

n Rotation around the Yw Axisl Rotate in XwOZw plane so thatl Xw = X Zw in YOZ (& Yw in YOZ)

n Yw does not change

cos0sin010

sin0cosR

Zw

Xw

Yw

Y

X

Z

O

3D Computer Visionand Video Computing Rotation: from Angles to Matrix

n Rotation around the Yw Axisl Rotate in XwOZw plane so thatl Xw = X Zw in YOZ (& Yw in YOZ)

n Yw does not change

cos0sin

010sin0cos

R

Zw

Xw

Yw

Y

X

Z

O

3D Computer Visionand Video Computing Rotation: from Angles to Matrix

n Rotation around the Xw(X) Axisl Rotate in YwOZw plane so thatl Yw = Y, Zw = Z (& Xw = X)

n Xw does not change

cossin0sincos0001

R

Zw

Xw

Yw

Y

X

Z

O

3D Computer Visionand Video Computing Rotation: from Angles to Matrix

n Rotation around the Xw(X) Axisl Rotate in YwOZw plane so thatl Yw = Y, Zw = Z (& Xw = X)

n Xw does not change

cossin0sincos0001

R

Zw

Xw

Yw

Y

X

Z

O

3D Computer Visionand Video Computing Rotation: from Angles to Matrix

n Rotation around the Axesl Result of three consecutive

rotations around the coordinate axes

n Notes:l Rotation directionsl The order of multiplications matters:

,,l Same R, 6 different sets of ,,l R Non-linear function of ,,l R is orthogonall It’s easy to compute angles from R

RRRR Zw

XwYw

YX

Z

O

Appendix A.9 of the textbook

coscoscossinsinsincossinsincossincoscossincoscossinsinsinsincoscossinsin

sinsincoscoscosR

3D Computer Visionand Video Computing Rotation- Axis and Angle

n According to Euler’s Theorem, any 3D rotation can be described by a rotating angle, q, around an axis defined by an unit vector n = [n1, n2, n3]T.

n Three degrees of freedom – why?

Appendix A.9 of the textbook

qqq sin0

00

)cos1(cos

12

13

23

232313

322212

312121

nn

nnnn

nnnnnnnnnnnnnnn

IR

3D Computer Visionand Video ComputingLinear Version of Perspective Projection

n World to Cameral Camera: P = (X,Y,Z)T

l World: Pw = (Xw,Yw,Zw)T

l Transform: R, T n Camera to Image

l Camera: P = (X,Y,Z)T

l Image: p = (x,y)T

l Not linear equationsn Image to Frame

l Neglecting distortionl Frame (xim, yim)T

n World to Framel (Xw,Yw,Zw)T -> (xim, yim)T

l Effective focal lengthsn fx = f/sx, fy=f/sy

n Three are not independent

zT

yT

xT

zwww

ywww

xwww

T

TT

TZrYrXr

TZrYrXrTZrYrXr

w

w

w

w

PR

PRPR

TRPP

3

2

1

333231

232221

131211

yyim

xximsoyysoxx)()(

) ,(),(ZYf

ZXfyx

zwww

ywwwyyim

zwww

xwwwxxim

TZrYrXrTZrYrXr

foy

TZrYrXrTZrYrXrfox

333231

232221

333231

131211

3D Computer Visionand Video Computing Linear Matrix Equation of

perspective projectionn Projective Space

l Add fourth coordinate n Pw = (Xw,Yw,Zw, 1)T

l Define (x1,x2,x3)T such thatn x1/x3 =xim, x2/x3 =yim

n 3x4 Matrix Mextl Only extrinsic parametersl World to camera

n 3x3 Matrix Mintl Only intrinsic parametersl Camera to frame

n Simple Matrix Product! Projective Matrix M= MintMext

l (Xw,Yw,Zw)T -> (xim, yim)T

l Linear Transform from projective space to projective planel M defined up to a scale factor – 11 independent entries

13

2

1

ww

w

ZYX

xxx

extintMM

zT

yT

xT

z

y

x

ext

T

TT

TTT

rrrrrrrrr

3

2

1

333231

232221

131211

R

RR

M

1000

0

int yy

xxofof

M

32

31//xxxx

yx

im

im

3D Computer Visionand Video Computing Three Camera Models

n Perspective Camera Modell Making some assumptions

n Known center: Ox = Oy = 0n Square pixel: Sx = Sy = 1

l 11 independent entries <-> 7 parametersn Weak-Perspective Camera Model

l Average Distance Z >> Range dZl Define centroid vector Pw

l 8 independent entriesn Affine Camera Model

l Mathematical Generalization of Weak-Persl Doesn’t correspond to physical cameral But simple equation and appealing geometry

n Doesn’t preserve angle BUT parallelisml 8 independent entries

z

y

x

TfTfT

rrrfrfrfrfrfrfr

333231

232221

131211M

zw

y

x

wp

T

fTfT

frfrfrfrfrfr

PR

MT3

000

232221

131211

zw T PRZZ T3

3

2

1

232221

131211

000 bbb

aaaaaa

afM

3D Computer Visionand Video Computing Camera Models for a Plane

n Planes are very common in the Man-Made World

l One more constraint for all points: Zw is a function of Xw and Yw

n Special case: Ground Plane l Zw=0l Pw =(Xw, Yw,0, 1)T

l 3D point -> 2D point

n Projective Model of a Planel 8 independent entries

n General Form ?l 8 independent entries

10

333231

232221

131211

3

2

1

w

w

w

z

y

x

ZYX

TfTfT

rrrfrfrfrfrfrfr

xxx

dZnYnXn wzwywx dwTPn

3D Computer Visionand Video Computing Camera Models for a Plane

n A Plane in the World

l One more constraint for all points: Zw is a function of Xw and Yw

n Special case: Ground Plane l Zw=0l Pw =(Xw, Yw,0, 1)T

l 3D point -> 2D point

n Projective Model of Zw=0l 8 independent entries

n General Form ?l 8 independent entries

10

333231

232221

131211

3

2

1

w

w

w

z

y

x

ZYX

TfTfT

rrrfrfrfrfrfrfr

xxx

dZnYnXn wzwywx dwTPn

13231

232221

1211

3

2

1

w

w

z

xYX

TrrfTfrfrfTfrfr

xxx

3D Computer Visionand Video Computing Camera Models for a Plane

n A Plane in the World

l One more constraint for all points: Zw is a function of Xw and Yw

n Special case: Ground Plane l Zw=0l Pw =(Xw, Yw,0, 1)T

l 3D point -> 2D point

n Projective Model of Zw=0l 8 independent entries

n General Form ?l nz = 1

l 8 independent entries

1

333231

232221

131211

3

2

1

w

w

w

z

y

x

ZYX

TfTfT

rrrfrfrfrfrfrfr

xxx

dZnYnXn wzwywx dwTPn

1)()()()()()()()()(

3333323331

2323222321

1313121311

3

2

1

w

w

zyx

yyx

xyxYX

TdrrnrrnrTdrfrnrfrnrfTdrfrnrfrnrf

xxx

wywxw YnXndZ

n 2D (xim,yim) -> 3D (Xw, Yw, Zw) ?

3D Computer Visionand Video Computing Applications and Issues

n Graphics /Renderingl From 3D world to 2D image

n Changing viewpoints and directionsn Changing focal length

l Fast rendering algorithmsn Vision / Reconstruction

l From 2D image to 3D modeln Inverse problemn Much harder / unsolved

l Robust algorithms for matching and parameter estimationl Need to estimate camera parameters first

n Calibrationl Find intrinsic & extrinsic parametersl Given image-world point pairsl Probably a partially solved problem ?l 11 independent entries

n <-> 10 parameters: fx, fy, ox, oy, ,,, Tx,Ty,Tz

13

2

1

ww

w

ZYX

xxx

extintMM

z

y

x

extTTT

rrrrrrrrr

333231

232221

131211M

1000

0

int yy

xxofof

M

32

31//xxxx

yx

im

im

3D Computer Visionand Video Computing Camera Model Summary

n Geometric Projection of a Cameral Pinhole camera modell Perspective projectionl Weak-Perspective Projection

n Camera Parameters (10 or 11)l Intrinsic Parameters: f, ox,oy, sx,sy,k1: 4 or 5 independent

parametersl Extrinsic parameters: R, T – 6 DOF (degrees of freedom)

n Linear Equations of Camera Models (without distortion)l General Projection Transformation Equation : 11 parametersl Perspective Camera Model: 11 parameters l Weak-Perspective Camera Model: 8 parameters l Affine Camera Model: generalization of weak-perspective: 8l Projective transformation of planes: 8 parameters

3D Computer Visionand Video Computing Next

n Determining the value of the extrinsic and intrinsic parameters of a camera

Calibration(Ch. 6)

z

y

x

extTTT

rrrrrrrrr

333231

232221

131211M

1000

0

int yy

xxofof

M

13

2

1

ww

w

ZYX

xxx

extintMM

Recommended