
Image Metrology and 3D Reconstruction

Stereo

The classic way by which one obtains 3D information from images is via stereo. We have two eyes, and because of the way the world is projected differently onto our eyes, we are able to compute the relative distances of objects.

The two projections of a nearby object are more widely separated on our retinas, while the projections of a far object fall closer together.

To understand how we can take measurements from images we need to understand how images are formed...

In a camera we have a flat image plane (rather than a spherical retina). Points in the 3D world are projected onto the image plane. For convenience we typically draw the image plane in front of the projection centre. This produces a result that is identical to the case where the image plane is behind except that now the image is the right way up!

In order to use a camera for 3D measurements it must be calibrated. This involves finding the mathematical relationship between 3D points in the world and where they appear in the image.

This relationship is described by a matrix known as the Projection Matrix or the Camera Calibration Matrix.

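$$
\begin{bmatrix} su \\ sv \\ s \end{bmatrix}
=
\begin{bmatrix}
q_{11} & q_{12} & q_{13} & q_{14} \\
q_{21} & q_{22} & q_{23} & q_{24} \\
q_{31} & q_{32} & q_{33} & q_{34}
\end{bmatrix}
\begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}
$$

Here (X, Y, Z) are the 3D world coordinates and (su, sv, s) are the scaled image coordinates (divide by s to get the actual image coordinates u and v).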
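As a minimal sketch of how the projection equation is used, the following Python function multiplies a 3D world point by a 3x4 projection matrix and divides by s. The matrix values are purely hypothetical (an assumed camera at the world origin looking along +Z with a focal length of 800 pixels); they are not taken from any calibrated camera.

```python
def project(Q, X, Y, Z):
    """Project a 3D world point through a 3x4 projection matrix Q.

    Returns the actual image coordinates (u, v), obtained by dividing
    the scaled coordinates (su, sv) by s.
    """
    world = (X, Y, Z, 1.0)
    su, sv, s = (sum(q * w for q, w in zip(row, world)) for row in Q)
    if s == 0:
        raise ValueError("point projects to infinity (s = 0)")
    return su / s, sv / s


# Hypothetical projection matrix, for illustration only.
Q = [
    [800.0, 0.0, 320.0, 0.0],
    [0.0, 800.0, 240.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]

u, v = project(Q, 0.5, -0.25, 4.0)
print(u, v)
```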

Calibration typically requires a calibration target that has marked on it some carefully measured 3D positions. These known 3D positions are then related to the positions in the image where these points appear to obtain the calibration matrix.

Image Metrology: Stereo

If a 3D calibration frame is placed in a scene and photographed from two or more directions 3D measurements can be made.

However, care is needed to get accurate results.

Image Metrology: Stereo

Stereo reconstruction

3D measurements can be taken from the photographs long after the scene has been destroyed.

Stereo

Knowing the position of an object in one image means that the object must lie somewhere along the viewing ray defined by that point in the image.

If we have two images of a scene, and hence can define two viewing rays, we can solve for the 3D location of that point by finding the intersection of the two viewing rays.
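With real measurements the two viewing rays rarely intersect exactly, so one common workaround (an assumption here, not necessarily the method used in the slides) is to take the midpoint of the shortest segment joining the two rays. The sketch below assumes each ray is given by a camera centre and a direction in world coordinates; the function and variable names are illustrative.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def triangulate_midpoint(c1, d1, c2, d2):
    """Estimate the 3D point nearest to two viewing rays.

    Ray 1 is c1 + t*d1 and ray 2 is c2 + s*d2. Returns the midpoint of
    the shortest segment between the rays and the length of that segment
    (zero when the rays truly intersect).
    """
    w = [p - q for p, q in zip(c1, c2)]
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("viewing rays are parallel")
    t = (b * e - c * d) / denom
    s = (a * e - b * d) / denom
    p1 = [ci + t * di for ci, di in zip(c1, d1)]
    p2 = [ci + s * di for ci, di in zip(c2, d2)]
    midpoint = [(x + y) / 2 for x, y in zip(p1, p2)]
    gap = sum((x - y) ** 2 for x, y in zip(p1, p2)) ** 0.5
    return midpoint, gap


# Two hypothetical cameras one unit apart, both viewing the point (0.5, 0.2, 4).
point, gap = triangulate_midpoint(
    (0.0, 0.0, 0.0), (0.5, 0.2, 4.0),
    (1.0, 0.0, 0.0), (-0.5, 0.2, 4.0),
)
print(point, gap)
```

The size of the returned gap gives a rough indication of how consistent the two rays are, which is one reason care is needed to get accurate results.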

The main problem in stereo is establishing correspondences between points in the two images.

This is known as the ‘matching problem’.

An important part of matching involves the calculation of the epipolar lines. The position of a point in one image defines a viewing ray. The image of this viewing ray in the other image is its epipolar line. The matching point in this image must be on this line.
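One way to sketch this is to project two points of the viewing ray into the other image and join them with a line. The code below assumes the second camera's 3x4 projection matrix is known and that the viewing ray is given by the first camera's centre and a direction; the matrix values and names are hypothetical.

```python
def project_h(Q, P):
    """Project 3D point P with the 3x4 matrix Q, returning (su, sv, s)."""
    Ph = (P[0], P[1], P[2], 1.0)
    return tuple(sum(q * x for q, x in zip(row, Ph)) for row in Q)


def cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def epipolar_line(Q2, centre1, direction1, depths=(1.0, 2.0)):
    """Image of a viewing ray from camera 1 as a line in image 2.

    The line is returned as (a, b, c), where a*u + b*v + c = 0 for image
    points (u, v) on the line. It is the join (cross product) of the
    homogeneous projections of two points on the ray.
    """
    pts = []
    for t in depths:
        P = [c + t * d for c, d in zip(centre1, direction1)]
        pts.append(project_h(Q2, P))
    return cross(pts[0], pts[1])


# Hypothetical second camera, displaced one unit along X from the first.
Q2 = [
    [800.0, 0.0, 320.0, -800.0],
    [0.0, 800.0, 240.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]
line = epipolar_line(Q2, (0.0, 0.0, 0.0), (0.125, 0.05, 1.0))
print(line)
```

For this sideways camera displacement the epipolar line comes out horizontal, so a search for the matching point can be confined to a single image row.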

For forensic applications we can afford to use a manual, or semi-manual approach. Matched points can be picked out manually by a user and the computer can then refine this by searching near the selected points, along epipolar lines, to obtain an accurate match by correlation.
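The refinement step can be sketched with normalised cross-correlation. The example below is deliberately simplified: it assumes the epipolar line is a single image row (a scanline of grey values) and slides a small 1D template along it. Real systems correlate 2D patches, and the function names and pixel values here are illustrative only.

```python
from math import sqrt


def ncc(a, b):
    """Normalised cross-correlation of two equal-length intensity lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = sqrt(
        sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b)
    )
    return num / den if den else 0.0  # flat windows carry no information


def best_match(template, scanline):
    """Return (offset, score) of the best NCC match along the scanline."""
    w = len(template)
    scores = [
        ncc(template, scanline[i:i + w])
        for i in range(len(scanline) - w + 1)
    ]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


# The template is a darker copy of the feature at offset 3 in the scanline.
scanline = [10, 10, 12, 30, 80, 35, 11, 10, 10, 10]
template = [15, 40, 17.5]
offset, score = best_match(template, scanline)
print(offset, score)
```

Because the correlation is normalised, the match is found even though the template is half as bright as the feature in the scanline.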

3D Reconstruction From a Single View

We, as humans, can look at a photograph or a well-executed painting and deduce considerable 3D information. Computer vision techniques that emulate aspects of this process are starting to emerge.

Merton College, Oxford

From a single view such as this we can deduce considerable 3D information

This work has resulted from the fusion of two areas of research:

• The use of projective invariants for object recognition

This involves finding properties of objects that are invariant to perspective projection. This allows you to recognize an object/feature no matter what view you have of it.

This is typified by the work of Zisserman, Forsyth, Mundy and Hartley at Oxford and GE Research in the early 90s.

• Camera Autocalibration

This involves automatically determining camera calibration parameters from matched image points in stereo pairs of images, or motion sequences.

Major contributors to this area of research were the group at INRIA, France, headed by Olivier Faugeras, and the groups at Oxford and GE Research.

A Brief History of Perspective...

Early art depicted scenes in a very symbolic form.

Cave paintings showed animals in profile.

Egyptian art depicted people with their heads in profile, torso in front view, and waist and legs in profile.

Medieval art depicted people and objects very much like cardboard cutouts stuck on a screen.

It was during the Italian Renaissance around 1430 that concepts of perspective were developed. Uccello was one of the first artists to use perspective. Here is some of his work:

The Battle of San Romano ~1430

Note the alignment of the spears on the ground, the foreshortened image of the fallen soldier on the left and the road in the background. These features were sensational for the time.

The Hunt ~1460

Note the pattern of the trees vanishing towards an apparent horizon in the distance.

Drawing of a Chalice (date unknown)

This is a remarkable piece of work. It emulates what we take for granted with computer-generated wire-frame graphics - except that it was done over 500 years ago with pencil and paper!

The first book that described perspective was produced by Leon Battista Alberti in 1435. Copies of this book can be found in the Architecture and Fine Arts Library at UWA. It is fascinating reading.

Vanishing Points and Vanishing Lines

Establishing vanishing points and vanishing lines is the fundamental operation in reconstructing 3D information from a perspective image.

In a perspective image, lines that are parallel in the world meet at a point, the vanishing point.

Two sets of parallel lines in different directions in a plane will give two vanishing points that define the vanishing line of the plane.

All lines that lie in planes parallel to this plane will vanish at points on this line
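These constructions can be sketched with homogeneous coordinates, where the line through two points and the intersection of two lines are both cross products. The pixel coordinates below are made up for illustration; in practice they would be picked from edges in the image.

```python
def cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def line_through(p, q):
    """Homogeneous line (a, b, c) through image points p and q."""
    return cross((p[0], p[1], 1.0), (q[0], q[1], 1.0))


def intersection(l1, l2):
    """Image point where two homogeneous lines meet."""
    x, y, w = cross(l1, l2)
    if abs(w) < 1e-12:
        raise ValueError("lines are parallel in the image")
    return x / w, y / w


# Images of two world-parallel lines in one direction...
v1 = intersection(
    line_through((0.0, 400.0), (50.0, 225.0)),
    line_through((400.0, 450.0), (250.0, 250.0)),
)
# ...and of two world-parallel lines in a second direction in the same plane.
v2 = intersection(
    line_through((0.0, 300.0), (350.0, 175.0)),
    line_through((400.0, 450.0), (550.0, 250.0)),
)
# The vanishing line of the plane joins the two vanishing points.
vanishing_line = line_through(v1, v2)
print(v1, v2, vanishing_line)
```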

The Horizon

The horizon is where a plane through the projection centre, parallel to the reference plane, cuts through the image plane.

Anything below this plane is projected to a point below the horizon; anything above it is projected above the horizon.

Establishing Relative Sizes of Objects in a Perspective Image

The point hr is the reference height.

If we draw a line through the base points br and bu to the vanishing line we get the vanishing point of that line.

A line from hr to this vanishing point represents a line that is parallel (in the 3D world) to the line through the base points.

Point i is the same height above bu as hr is above br.

However, we cannot simply use the ratio of the image lengths bu to hu over bu to i as the ratio of the heights.

Ratios of lengths are not preserved under perspective.

The Cross Ratio

If we have 4 collinear points, the ratio of ratios of the lengths between them is invariant to projection. That is, for collinear points at positions X_1, X_2, X_3 and X_4, the expression

$$
\frac{(X_3 - X_1)(X_4 - X_2)}{(X_3 - X_2)(X_4 - X_1)}
$$

is invariant to perspective projection. In the previous diagram the four points bu, hu, i and v are collinear, as are the four points br, hr, c and v.

(Actually one cannot apply the cross ratio directly here because point v is at infinity - but the principle holds)

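Cross ratio of equi-spaced points

For four equally spaced collinear points A, B, C and D, each a unit distance from the next:

$$
\text{Cross ratio} = \frac{AC/AD}{BC/BD} = \frac{2/3}{1/2} = \frac{4}{3} = 1.333
$$

The same cross ratio computed from lengths measured in an image:

$$
\text{Cross ratio} = \frac{AC/AD}{BC/BD} = \frac{488/596}{173/282} \approx 1.334
$$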
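The invariance can be checked numerically. In the sketch below, four equally spaced points on a line are pushed through an arbitrary 1D projective map x -> (a*x + b)/(c*x + d), which is how positions along a line transform under perspective. The coefficients are arbitrary values chosen for illustration.

```python
def cross_ratio(x1, x2, x3, x4):
    """Cross ratio (AC/AD)/(BC/BD) of four collinear points at positions x1..x4."""
    return ((x3 - x1) * (x4 - x2)) / ((x3 - x2) * (x4 - x1))


def project_1d(x, a=2.0, b=1.0, c=0.1, d=1.0):
    """An arbitrary 1D projective map (hypothetical coefficients)."""
    return (a * x + b) / (c * x + d)


world = [0.0, 1.0, 2.0, 3.0]             # equally spaced points A, B, C, D
image = [project_1d(x) for x in world]   # no longer equally spaced

print(cross_ratio(*world))
print(cross_ratio(*image))

# The same quantity computed directly from the four image lengths on the slide.
measured = (488 / 596) / (173 / 282)
print(measured)
```

The spacing of the projected points is visibly uneven, yet both cross ratios agree to within floating-point rounding.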

Measurements from an image…

A cross ratio measured from an image will be identical to the same cross ratio measured in the real world.

If we can measure some of the lengths from reference objects in the world we can calculate an unknown length using the cross ratio

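$$
\frac{AC/AD}{BC/BD} = \frac{488/596}{173/282} \approx 1.334 \quad \text{(cross ratio from image)}
$$

$$
1.334 = \frac{?/3}{1/2} \quad \text{(? is the unknown length; 3, 1 and 2 are known lengths)}
$$

$$
? = 1.334 \times 3 \times \tfrac{1}{2} \approx 2
$$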
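The same arithmetic as a short sketch: the cross ratio is computed from the four lengths measured in the image, then equated to the world cross ratio in which AC is unknown and AD, BC and BD are known. The function name is illustrative.

```python
def unknown_length(cross_ratio_image, ad, bc, bd):
    """Solve (AC/AD)/(BC/BD) = cross_ratio_image for the unknown length AC."""
    return cross_ratio_image * ad * bc / bd


# Cross ratio from the image lengths (in pixels) used on the slide.
cr_image = (488 / 596) / (173 / 282)

# Known world lengths: AD = 3, BC = 1, BD = 2.
ac = unknown_length(cr_image, ad=3.0, bc=1.0, bd=2.0)
print(ac)
```

The result is close to, but not exactly, 2, because the image lengths are measured only to the nearest pixel.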

An example of height measurement taken from Andrew Zisserman's web pages

Cross ratio measured on image = 1.28

Cross ratio measured on car = 1.32

Cross ratio measured on image = 1.39

Cross ratio measured on car = 1.41

Image Metrology: Rectification

Calibration targets allow views of flat surfaces to be rectified.

Rectified views allow measurements to be taken.

Transformations of a planar surface

Figure panels: original surface; rotate and scale; affine transformation; perspective transformation.

The projective transformation of a planar surface into an image (which is simply another planar surface) can be represented by a matrix equation.

There are 8 unknown parameters in the projection matrix. Each point whose coordinates are known in both the original plane and the image gives 2 equations, so if we know the coordinates of 4 points in the original plane we can solve for these 8 parameters.

We can then invert this equation to convert our image of the plane into a plan view of the plane.

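$$
\begin{bmatrix} su \\ sv \\ s \end{bmatrix}
=
\begin{bmatrix}
p_{11} & p_{12} & p_{13} \\
p_{21} & p_{22} & p_{23} \\
p_{31} & p_{32} & 1
\end{bmatrix}
\begin{bmatrix} x \\ y \\ 1 \end{bmatrix}
$$

Here (x, y) are the xy coordinates in the plane and (su, sv, s) are the scaled image coordinates.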
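A minimal sketch of solving for the 8 parameters from 4 point correspondences follows. Each correspondence (x, y) -> (u, v) contributes two linear equations, and the resulting 8x8 system is solved here by plain Gaussian elimination. The corner coordinates are hypothetical, and fixing the last matrix element to 1 is the normalisation used in the equation above (it would fail in the rare case where that element should be zero).

```python
def solve_linear(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system (are three points collinear?)")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def homography_from_4_points(src_pts, dst_pts):
    """3x3 matrix mapping each (x, y) in src_pts to its (u, v) in dst_pts."""
    A, b = [], []
    for (x, y), (u, v) in zip(src_pts, dst_pts):
        A.append([x, y, 1.0, 0.0, 0.0, 0.0, -u * x, -u * y])
        b.append(u)
        A.append([0.0, 0.0, 0.0, x, y, 1.0, -v * x, -v * y])
        b.append(v)
    p = solve_linear(A, b)
    return [p[0:3], p[3:6], [p[6], p[7], 1.0]]


def apply_homography(P, x, y):
    su, sv, s = (row[0] * x + row[1] * y + row[2] for row in P)
    return su / s, sv / s


# Hypothetical data: corners of a unit square in the plane and their image positions.
plane_pts = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
image_pts = [(100.0, 400.0), (500.0, 380.0), (420.0, 150.0), (180.0, 160.0)]

P = homography_from_4_points(plane_pts, image_pts)   # plane -> image
R = homography_from_4_points(image_pts, plane_pts)   # image -> plan view

centre_in_image = apply_homography(P, 0.5, 0.5)
print(centre_in_image, apply_homography(R, *centre_in_image))
```

Swapping the two point lists gives the inverse mapping directly, which is what rectification needs: every image pixel can be sent back to its plan-view position.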

Image Metrology

Rectified views of the fence and ground.

Criminisi, Reid and Zisserman 1999

Image Rectification

Even if there is no calibration target in the scene, it is possible to undo the perspective distortion of a plane in the scene if we can find the vanishing line of the plane and we have two reference measurements of known lengths, or angles, in the scene.

Transformations of a planar surface

Figure panels: original surface; rotate and scale; affine transformation; perspective transformation.

• Knowledge of the vanishing line of the plane allows you to invert the perspective transformation.

• Knowledge of two lengths, or two angles, allows you to invert the affine transformation.
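The first of these steps can be sketched as follows. A standard construction (assumed here, as the slides do not spell it out) is to apply the matrix with rows (1, 0, 0), (0, 1, 0) and (l1, l2, l3), where (l1, l2, l3) is the vanishing line. This sends the vanishing line to infinity, so lines that are parallel in the world become parallel again. The result is still affinely distorted until the two lengths or angles are used. The line and point coordinates below are made up for illustration.

```python
def remove_perspective(vanishing_line, u, v):
    """Map image point (u, v) so that the vanishing line goes to infinity.

    vanishing_line is (l1, l2, l3) with l1*u + l2*v + l3 = 0 on the line.
    l3 must be non-zero for the underlying matrix to be invertible.
    The output is correct only up to an unknown affine transformation.
    """
    l1, l2, l3 = vanishing_line
    w = l1 * u + l2 * v + l3
    if abs(w) < 1e-12:
        raise ValueError("point lies on the vanishing line (maps to infinity)")
    return u / w, v / w


# Hypothetical vanishing line: the image row v = 50.
vanishing_line = (0.0, 1.0, -50.0)

# Two image lines that converge towards the vanishing point (100, 50).
line_a = [(0.0, 400.0), (50.0, 225.0)]
line_b = [(400.0, 450.0), (250.0, 250.0)]

a0, a1 = (remove_perspective(vanishing_line, *p) for p in line_a)
b0, b1 = (remove_perspective(vanishing_line, *p) for p in line_b)

dir_a = (a1[0] - a0[0], a1[1] - a0[1])
dir_b = (b1[0] - b0[0], b1[1] - b0[1])
# A zero 2D cross product means the two directions are parallel.
parallel_error = dir_a[0] * dir_b[1] - dir_a[1] * dir_b[0]
print(dir_a, dir_b, parallel_error)
```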

Parallel lines in the paving are used to define two vanishing points and hence the vanishing line of the plane. The blue lines lead to the two vanishing points.

The black line across the top of the image is the vanishing line of the plane. It indicates the height of the camera relative to the objects in the scene.

Points R1-R3 and S1-S4 are used to provide constraints on the affine transformation.

There is still some distortion due to uncorrected lens distortion - lines are not quite straight in the image. Look at the portion of the square in the paving that is cut off at the very left of the rectified image - now try to find it in the original image...

Note that the rectified image cannot include points that are too close to the vanishing line, as these correspond to points on the plane that are approaching infinity! This is why the rectified image is cut off on the left-hand side where it is.

Some results generated by the Oxford Visual Geometry Group…
