Structure and Motion from Line Segments in Multiple Images Camillo J. Taylor, David J. Kriegman Presented by David Lariviere


Primary Goal

Given a series of images with known corresponding line segments, calculate the relative locations of the cameras imaging the scene and the three-dimensional locations of the line segments.


Some Previous Work
• (1981) Longuet-Higgins. “A computer algorithm for reconstructing a scene from two projections.”

• (1990) Vieville. “Estimation of 3D-motion and structure from tracking 2D-lines in a sequence of images.”

• (1992) Tomasi, Kanade. “Shape and motion from image streams under orthography.”


Problem Characterization
• Instead of using generalized scenes and points, focus on rigid scenes with clear edges as features.
• Advantages of lines as features:
  – Occur frequently in man-made environments.
  – Easily located and tracked.
  – More accurately localized than points because there is more information available in corroboration.


Algorithm Overview
• Determine a non-linear objective function whose minimization leads to an estimate of scene structure.

• In this case, estimate 3D camera locations/orientations and locations of line segments in 3D, and then reproject the lines onto the estimated image planes.

• The difference between the predicted projected lines and the actually observed lines is the error function to minimize.


Objective Function
• p_i: the i-th 3D line
• q_j: the j-th camera position/orientation
• u_ij: the observed edge i in image j
• m: number of images
• n: number of lines
• F: reprojection of line p_i onto the image plane of camera q_j
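The equation on this slide did not survive extraction. Using only the symbols listed above, one plausible reconstruction of the objective (an assumption based on the slide's definitions, not its verbatim formula) sums a per-observation error over all images and lines:

```latex
O = \sum_{j=1}^{m} \sum_{i=1}^{n} \mathrm{Error}\bigl(F(p_i, q_j),\; u_{ij}\bigr)
```

Here Error stands for the reprojection error between a predicted edge and the corresponding observed edge; presumably a line that is not visible in a given image contributes no term.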


Notation – Line Representation
• Represent a line in 3D space by (v, d)
  – v: unit vector pointing in the direction of the line
  – d: vector from the origin to the closest point on the line
• m: normal vector of the plane defined by the camera center and the line.
• The edge in the image plane is defined by m_x x + m_y y + m_z = 0
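A minimal sketch of this representation, assuming the camera center is at the origin and a pinhole camera with unit focal length (both assumptions for illustration). Under those assumptions the plane normal can be taken as m = v × d, and every projected point of the line should satisfy the image-line equation. The helper names are hypothetical.

```python
import math

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def line_from_point_and_direction(point, direction):
    """Return (v, d): unit direction and the point of the line closest to the origin."""
    norm = math.sqrt(dot(direction, direction))
    v = tuple(x / norm for x in direction)
    s = dot(point, v)                      # component of `point` along v
    d = tuple(p - s * vi for p, vi in zip(point, v))
    return v, d

def project(point):
    """Pinhole projection with unit focal length (assumed)."""
    return (point[0] / point[2], point[1] / point[2])

v, d = line_from_point_and_direction((1.0, 2.0, 5.0), (1.0, 0.5, 0.2))
m = cross(v, d)   # normal of the plane through the camera center and the line

# Sample a few points along the line and evaluate m_x x + m_y y + m_z.
residuals = []
for s in (-1.0, 0.0, 2.0):
    P = tuple(di + s * vi for di, vi in zip(d, v))
    x, y = project(P)
    residuals.append(m[0] * x + m[1] * y + m[2])
```

The residuals vanish up to rounding because m is perpendicular to both v and d, and therefore to every point d + s v on the line.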


Notation – Reference Frames
• Relate the location/orientation of each camera to some world base frame.
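The frame-change equations on this slide were lost in extraction. The sketch below shows one consistent way to move a line (v, d) from the world frame into a camera frame, assuming that R rotates world coordinates into the camera frame and that t is the camera position expressed in the world frame. These conventions and all helper names are assumptions.

```python
import math

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def matvec(R, a):
    return tuple(dot(row, a) for row in R)

def rot_y(theta):
    """Rotation about the y axis, used only as an example orientation."""
    c, s = math.cos(theta), math.sin(theta)
    return ((c, 0.0, s), (0.0, 1.0, 0.0), (-s, 0.0, c))

def line_world_to_camera(v_w, d_w, R, t):
    """Express the line (v_w, d_w) in the camera frame given by (R, t)."""
    v_c = matvec(R, v_w)
    tv = dot(t, v_w)
    # Closest point to the camera center: shift by -t, then remove the part along v.
    d_c = matvec(R, tuple(d - ti + tv * vi for d, ti, vi in zip(d_w, t, v_w)))
    return v_c, d_c

v_w = (0.0, 1.0, 0.0)     # a vertical line
d_w = (2.0, 0.0, 6.0)     # its closest point to the world origin
R, t = rot_y(0.3), (0.5, -0.2, 1.0)

v_c, d_c = line_world_to_camera(v_w, d_w, R, t)
m_direct = cross(v_c, d_c)
# Equivalent closed form: m = R (v x (d - t))
m_formula = matvec(R, cross(v_w, tuple(d - ti for d, ti in zip(d_w, t))))
```

Both routes give the same normal m, which is the quantity the image edge constrains.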


Summary of Parameters
• Camera Location (t_j): 3 DOF
• Camera Orientation (R_j): 3 DOF

• Line Location/Orientation (v,d): 4 DOF

• Requires at least 6 edge correspondences in 3 images.
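The minimum of 6 lines in 3 images can be checked with a counting argument. The sketch below assumes that each observed image line supplies 2 independent measurements and that 7 degrees of freedom (the choice of world frame plus the global scale) cannot be recovered; this accounting is an assumption that happens to reproduce the numbers on the slide.

```python
def unknowns(num_images, num_lines):
    # 6 DOF per camera, 4 DOF per line, minus 7 for world frame and global scale.
    return 6 * num_images + 4 * num_lines - 7

def measurements(num_images, num_lines):
    # Each image line contributes 2 independent measurements (assumed).
    return 2 * num_images * num_lines

def min_lines(num_images, max_lines=1000):
    """Smallest number of lines for which measurements cover the unknowns."""
    for n in range(1, max_lines + 1):
        if measurements(num_images, n) >= unknowns(num_images, n):
            return n
    return None  # never enough, e.g. with only two images

lines_for_three_images = min_lines(3)
lines_for_two_images = min_lines(2)
```

With three images the inequality 6n ≥ 4n + 11 first holds at n = 6, while with two images 4n ≥ 4n + 5 can never hold, so lines in only two views do not constrain the motion under this count.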


Reprojection Error
• Visible endpoints of the observed edge: (x_1, y_1) and (x_2, y_2)
• For every point on the observed edge between these endpoints, take the shortest distance to the predicted line, and integrate it along the edge.
• Normalize the error by dividing by the length of the observed edge.
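A minimal sketch of such an error measure, assuming the integrand is the squared point-to-line distance (an assumption, since the slide's formula is missing). Because the distance varies linearly along the observed edge, the integral has a closed form in the two endpoint distances h1 and h2. Function names are hypothetical.

```python
import math

def point_line_distance(m, x, y):
    """Signed distance from (x, y) to the image line m_x x + m_y y + m_z = 0."""
    return (m[0] * x + m[1] * y + m[2]) / math.hypot(m[0], m[1])

def edge_error(m, p1, p2, normalize=False):
    """Integral of squared distance along the observed edge p1-p2 (closed form)."""
    h1 = point_line_distance(m, *p1)
    h2 = point_line_distance(m, *p2)
    length = math.hypot(p2[0] - p1[0], p2[1] - p1[1])
    err = length / 3.0 * (h1 * h1 + h1 * h2 + h2 * h2)
    return err / length if normalize else err

def edge_error_numeric(m, p1, p2, steps=10000):
    """Midpoint-rule check of the same integral."""
    length = math.hypot(p2[0] - p1[0], p2[1] - p1[1])
    total = 0.0
    for k in range(steps):
        s = (k + 0.5) / steps
        x = p1[0] + s * (p2[0] - p1[0])
        y = p1[1] + s * (p2[1] - p1[1])
        total += point_line_distance(m, x, y) ** 2
    return total * length / steps

predicted = (0.0, 1.0, -1.0)            # the line y = 1
observed = ((0.0, 1.5), (4.0, 0.5))     # endpoints of the observed edge
closed_form = edge_error(predicted, *observed)
numeric = edge_error_numeric(predicted, *observed)
normalized = edge_error(predicted, *observed, normalize=True)
```

Passing `normalize=True` divides by the edge length, so that long edges do not dominate the objective purely because of their length.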


Algorithm
• Primary algorithm for minimizing the non-linear function: minimize the line reprojection error through gradient descent to find a local minimum:
  – Randomly generate initial values.
  – Iteratively follow the function along the direction of steepest descent to reach a local minimum.
    • If the local minimum error is below a certain threshold, accept.
    • Else, generate new initial values and try again.
• The quality of the initial values heavily influences the number of iterations required before the function converges.
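The restart strategy above can be sketched on a toy problem. The one-dimensional function below is a stand-in chosen for illustration (it is not the reprojection error): it has a poor local minimum near x ≈ -0.81 and a global minimum of 0 at x = 2. The step size, threshold, and search range are likewise assumptions.

```python
import random

def f(x):
    """Toy objective with one poor local minimum and a global minimum at x = 2."""
    return (x - 2.0) ** 2 * ((x + 1.0) ** 2 + 0.5)

def grad_f(x):
    """Analytic derivative of f."""
    return 2.0 * (x - 2.0) * (2.0 * x * x + x - 0.5)

def gradient_descent(x0, step=0.01, iters=2000):
    """Follow the negative gradient with a fixed step size."""
    x = x0
    for _ in range(iters):
        x -= step * grad_f(x)
    return x

def minimize_with_restarts(threshold=1e-6, lo=-3.0, hi=4.0,
                           max_restarts=50, seed=0):
    """Restart from random initial values until the local minimum is acceptable."""
    rng = random.Random(seed)
    for attempt in range(1, max_restarts + 1):
        x = gradient_descent(rng.uniform(lo, hi))
        if f(x) < threshold:          # accept only a sufficiently small residual
            return x, attempt
    raise RuntimeError("no acceptable minimum found")

x_best, attempts = minimize_with_restarts()
```

Starts that fall into the basin of the poor minimum end with an error above the threshold and are discarded, which is why better initial values reduce the number of restarts.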


Initial Value Estimation
To decrease computational cost, additional steps are added to obtain acceptable starting values for gradient descent:
1. The user inputs a range for the camera orientations (R_j), and values of R_j within that range are chosen at random.
2. Holding the estimates from (1) constant, estimate v_i subject to a constraint equation.
3. Improve the estimate from (2) by minimizing the same constraint equation with both v_i and R_j as free parameters.
4. Generate initial estimates of d_i and t_j using a second constraint equation.
5. Provide the estimates from (3) and (4) as starting values for gradient descent.
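Step 2 can be illustrated in its simplest form. Assuming the constraint says that the line direction is perpendicular to each image-plane normal rotated back into the world frame (an assumption consistent with m being normal to a plane that contains the line), two noise-free views already fix the direction up to sign as a cross product. With noise or more views, a least-squares fit would take the place of the cross product. All names below are hypothetical.

```python
import math

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(a):
    n = math.sqrt(dot(a, a))
    return tuple(x / n for x in a)

def matvec(R, a):
    return tuple(dot(row, a) for row in R)

def transpose(R):
    return tuple(zip(*R))

def rot_y(theta):
    c, s = math.cos(theta), math.sin(theta)
    return ((c, 0.0, s), (0.0, 1.0, 0.0), (-s, 0.0, c))

def observe(v, point, R, t):
    """Synthetic measurement: plane normal m seen by camera (R, t)."""
    return matvec(R, cross(v, tuple(p - ti for p, ti in zip(point, t))))

def estimate_direction(R1, m1, R2, m2):
    """Direction perpendicular to both back-rotated normals (two-view case)."""
    a1 = matvec(transpose(R1), m1)
    a2 = matvec(transpose(R2), m2)
    return normalize(cross(a1, a2))

v_true = normalize((1.0, 2.0, 0.5))
point_on_line = (0.0, 0.0, 8.0)
R1, t1 = rot_y(0.1), (0.0, 0.0, 0.0)
R2, t2 = rot_y(-0.4), (3.0, 0.5, -1.0)

m1 = observe(v_true, point_on_line, R1, t1)
m2 = observe(v_true, point_on_line, R2, t2)
v_est = estimate_direction(R1, m1, R2, m2)
```

The recovered direction matches the true one up to sign, provided the line is not coplanar with the two camera centers.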


Constraint Equations
• From the defined relations:

• One can derive:

• Which provides two constraint equations:
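The equations on this slide were images and did not survive extraction. A reconstruction consistent with the notation slides is sketched below. The conventions (R_j rotates world coordinates into the frame of camera j, and t_j is the camera position in the world frame) are assumptions, so the exact form on the slide may differ.

```latex
\begin{aligned}
\text{defined relations:}\quad
  & v^{c} = R_j v_i, \qquad
    d^{c} = R_j\bigl(d_i - t_j + (t_j \cdot v_i)\,v_i\bigr), \qquad
    m_{ij} = v^{c} \times d^{c} \\
\text{derived:}\quad
  & m_{ij} = R_j\bigl(v_i \times (d_i - t_j)\bigr) \\
\text{constraints:}\quad
  & m_{ij}^{\top} R_j\, v_i = 0, \qquad
    m_{ij}^{\top} R_j\,(d_i - t_j) = 0
\end{aligned}
```

The first constraint involves only orientations (R_j and v_i), while the second brings in positions (d_i and t_j), which matches the order in which the initial-value procedure estimates them.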


Results
1. Simulation results:
   1. Measuring tolerance to noise, rate of returns due to an increased number of images/features, and rate of convergence of the global minimization.
   2. Comparing the proposed method to previous linear methods.
2. Real-world results


Simulation Results:
• Main results:
  – The algorithm is much more sensitive to errors in edge endpoints than to errors in the calibrated camera center.
  – Holding the maximum baseline constant, increasing the number of images beyond 6 or the number of lines beyond 50 does not improve accuracy.
  – A small number of large-baseline images is superior to many small-baseline images.
  – The rate of convergence of the global descent minimization algorithm is highly dependent on the initial range of theta.


Simulation Results Continued


Comparison to Linear Method
• This method is significantly less sensitive to noise than the leading linear algorithm [1].

[1] J. Weng, Y. Liu, T. S. Huang, and N. Ahuja, “Estimating motion/structure from line correspondences”


Real-world Results


Real-world Results…


Real-world Results: Hallway


Discussion
• Initial estimation optimizations improve calculation speed.
• Algorithm is very insensitive to noise.
• Future improvements:
  – Automate edge correspondence tracking by using video.
  – Impose edge-intersection and other geometric restrictions (coplanarity, parallelism, etc.).


Modeling and Rendering Architecture from Photographs: A hybrid geometry- and image-based approach

Paul E. Debevec, Camillo J. Taylor, Jitendra Malik


Overview
• Apply the previous paper’s methods to modeling architectural scenes with restricted geometry.
• Utilize model-based stereo to extract precise geometry from a sparse set of large-baseline photographs.
• Utilize 3D models and view-dependent photographs to construct photorealistic computer-generated views.


Architectural Models: Blocks
• The user starts by choosing geometric primitives (blocks) to represent the basic geometry of the building.
• Block: “hierarchical model of a parametric polyhedral primitive”
  – Parametrized by a base vertex P_o and various other properties (width, height, length, etc.).


Block Relations
• A hierarchy of blocks is used to describe the various geometric primitives that make up the basic architecture.
• The user manually maps corresponding edges in the images to the edges of the blocks.
• Blocks are related by constraints on their relative location and orientation:
  – For example, ensure that the bottom of one block sits on the top of another block.
• Values of blocks are stored symbolically, meaning that if one specifies a series of blocks to be parallel, then only one variable is used to enforce this restriction across all blocks.
• g_i(X): rigid transformation mapping one block to the adjacent block.
• P_w(x): block vertex in world coordinates
• v_w(x): line orientation in world coordinates
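The ideas of symbolic sharing and chained rigid transformations can be sketched as follows. The `Param` and `Block` classes are hypothetical illustrations, not the data structures of the original system, and each block is restricted to a rotation about the vertical axis plus a translation relative to its parent.

```python
import math
from dataclasses import dataclass
from typing import Optional, Tuple

Vec = Tuple[float, float, float]

@dataclass
class Param:
    """A scalar unknown; sharing one object between blocks ties their values."""
    value: float

@dataclass
class Block:
    width: Param
    height: Param
    depth: Param
    yaw: Param                            # rotation about y relative to the parent
    offset: Tuple[Param, Param, Param]    # translation relative to the parent
    parent: Optional["Block"] = None

    def to_parent(self, p: Vec) -> Vec:
        """Rigid transformation from this block's frame to its parent's frame."""
        c, s = math.cos(self.yaw.value), math.sin(self.yaw.value)
        x, y, z = p
        return (c * x + s * z + self.offset[0].value,
                y + self.offset[1].value,
                -s * x + c * z + self.offset[2].value)

    def to_world(self, p: Vec) -> Vec:
        """Compose transformations up the hierarchy to reach world coordinates."""
        q = self.to_parent(p)
        return q if self.parent is None else self.parent.to_world(q)

zero = Param(0.0)
base = Block(Param(20.0), Param(10.0), Param(8.0), Param(0.0), (zero, zero, zero))
# The roof reuses the base's width and depth, and its vertical offset *is* the
# base's height parameter, so "sits on top of the base" needs no extra unknown.
roof = Block(base.width, Param(3.0), base.depth, Param(0.0),
             (zero, base.height, zero), parent=base)

apex_before = roof.to_world((0.0, roof.height.value, 0.0))
base.height.value = 12.0                  # one change propagates to the roof
apex_after = roof.to_world((0.0, roof.height.value, 0.0))
```

Because constraints are expressed by sharing parameters rather than by adding equations, the optimizer sees fewer unknowns.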


Block Relations Continued…


Advantages of Blocks
• Model most architectural scenes well.
• Implicitly contain features commonly found in architecture (e.g., parallel edges, right angles).
• Manipulation by the user is easier due to the reduced number of parameters.
• Surfaces are pre-defined by the model, removing the need to calculate them from edges.
• The number of parameters is greatly reduced when performing minimization of the cost function.


Single Image Examples:


Estimation of 3D Structure
• Very similar to the previous paper: estimate the parameters of the cameras (R, t) and edges (v, d) that minimize the reprojection error.
• Differences:
  – Many edges are defined in relation to one another, meaning fewer variables.
  – Apply horizontal/vertical constraints on v_i to estimate R_j more accurately.
  – Instead of using gradient descent, the authors use the Newton-Raphson method to minimize the non-linear error function.
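For intuition about the last difference, here is a minimal one-dimensional sketch of Newton-Raphson minimization. The function is a toy stand-in for the reprojection error (an assumption for illustration); in several dimensions the second derivative becomes the Hessian and the division becomes a linear solve.

```python
def f(x):
    """Toy convex objective with its minimum at x = 2."""
    u = x - 2.0
    return u * u + 0.1 * u ** 4

def df(x):
    u = x - 2.0
    return 2.0 * u + 0.4 * u ** 3

def d2f(x):
    u = x - 2.0
    return 2.0 + 1.2 * u * u

def newton_minimize(x0, tol=1e-12, max_iter=50):
    """Iterate x <- x - f'(x) / f''(x) until the step is negligible."""
    x = x0
    for k in range(1, max_iter + 1):
        step = df(x) / d2f(x)
        x -= step
        if abs(step) < tol:
            return x, k
    return x, max_iter

x_min, iterations = newton_minimize(5.0)
```

Near the minimum the curvature information gives much faster convergence than a fixed-step gradient method, at the cost of evaluating second derivatives.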


View-Dependent Texture Mapping
• Once camera and edge locations/orientations are known, project the images onto the block models.
• If multiple images of the same area exist, apply weighted averaging to fuse them.
  – Weights are inversely proportional to the angle between the virtual view being synthesized and the view of the camera that took the particular image.
• Planes can be divided into faces so that the weights are computed only once per face and applied to the entire face.
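The weighting rule above can be sketched as follows. The small `eps` term that guards against division by zero when the virtual view coincides with a camera, and the representation of views by direction vectors, are assumptions for illustration.

```python
import math

def angle_between(a, b):
    """Angle in radians between two direction vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return math.acos(max(-1.0, min(1.0, dot / (na * nb))))

def blend_weights(virtual_dir, camera_dirs, eps=1e-6):
    """Weights inversely proportional to angular difference, normalized to sum to 1."""
    raw = [1.0 / (angle_between(virtual_dir, c) + eps) for c in camera_dirs]
    total = sum(raw)
    return [w / total for w in raw]

def blend_color(colors, weights):
    """Weighted average of RGB samples taken from the different photographs."""
    return tuple(sum(w * c[k] for w, c in zip(weights, colors)) for k in range(3))

virtual = (0.1, 0.0, 1.0)
cameras = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]   # the first is closer in angle
weights = blend_weights(virtual, cameras)
pixel = blend_color([(200.0, 100.0, 50.0), (100.0, 100.0, 150.0)], weights)
```

Computing `weights` once per face instead of once per pixel is the shortcut mentioned in the last bullet.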


Example of Texture-Mapping


Model-based Stereopsis
• Use known scene geometry and camera locations to rectify large-baseline images before performing stereo.
• This avoids foreshortening problems, which can be very large when images are taken far apart.
• Maintain the epipolar constraint by projecting the offset image onto the model and then reprojecting it onto the key image’s image plane to create a rectified image for use in stereopsis.
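A heavily simplified sketch of the warping step, assuming a single planar model (z = constant), cameras whose axes are aligned with the world, and unit focal length. None of these simplifications come from the slides; the real system warps through the full block model.

```python
def project(point, cam_center):
    """Pinhole projection for an axis-aligned camera with unit focal length."""
    x, y, z = (p - c for p, c in zip(point, cam_center))
    return (x / z, y / z)

def backproject_to_plane(pixel, cam_center, plane_z):
    """Intersect the viewing ray through `pixel` with the model plane z = plane_z."""
    depth = plane_z - cam_center[2]
    return (cam_center[0] + pixel[0] * depth,
            cam_center[1] + pixel[1] * depth,
            plane_z)

def warped_position(offset_pixel, key_center, offset_center, plane_z):
    """Key-image position where a feature seen at `offset_pixel` lands after warping."""
    on_model = backproject_to_plane(offset_pixel, offset_center, plane_z)
    return project(on_model, key_center)

key_c, off_c, facade_z = (0.0, 0.0, 0.0), (4.0, 0.0, 0.0), 10.0
results = {}
for name, P in {"on_model": (1.0, 2.0, 10.0), "off_model": (1.0, 2.0, 9.0)}.items():
    k = project(P, key_c)                      # where the key image sees the point
    o = project(P, off_c)                      # where the offset image sees it
    w = warped_position(o, key_c, off_c, facade_z)
    results[name] = {"raw": abs(k[0] - o[0]),        # disparity before warping
                     "residual": abs(k[0] - w[0]),   # disparity after warping
                     "dy": abs(k[1] - w[1])}         # vertical mismatch after warping
```

A point that lies on the model has zero residual disparity, and a point slightly off the model has a residual far smaller than its raw disparity, which is what makes stereo matching tractable for widely separated views. In this configuration the vertical mismatch stays at zero, so the match can still be searched along a single scanline.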


Model-based Stereopsis Example


Discussion
• For architectural scenes that generally fit the allowed geometric primitives, the approach works quite well.
• Future possible improvements:
  – Additional models: surfaces of revolution
  – Estimate BRDF
  – Devise a method of selecting the best images to use for rendering novel views.


Questions?