Computational Photography and Capture:
Rendering a Scene from a Single Photograph
Gabriel Brostow & Tim Weyrich
TAs: Clément Godard & Fabrizio Pece
Breaking Out of 2D
But must we go to full 3D? 4D?
First, what is IBR? (Image-Based Rendering)
“View Interpolation for Image Synthesis”
• By Shenchang E. Chen & Lance Williams, SIGGRAPH 1993
• Complexity independent
• Fast
• 2 Types of IBR:
• With geometric models
• Without models (if pixels don’t count)
Available Radiance Samples
• (R, G, B) at each (x, y, z)
• Synthetic: Obtained from World-space of 3D models
• Real: Obtained by computer vision “innovations”
Rotating to the right…
Using additional source images
From 1 source image
From 2 source images (closely spaced)
From 2 source images (interpolated)
Model Implementation
• Make morph maps between all nodes
(Diagram: morph maps MAB, MAC, etc. connecting images I(A), I(B), and I(C).)
Interpolation
Forward Mapping:
• Neighboring pixels (old way)
Backward Mapping:
• 3D spatial-offset vectors:
– pure parametric, OR
– range-specific
“offset vector indicates the amount each of the pixels moves in its screen space as a result of the camera’s movement”
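The backward-mapping idea above can be sketched as follows; the function and variable names are illustrative, not from Chen and Williams' implementation. For each destination pixel we scale its precomputed offset vector by the interpolation parameter and sample the source image there.

```python
# Minimal sketch of backward mapping with per-pixel offset vectors
# (illustrative names, not the paper's code). offsets holds the full
# screen-space motion of each pixel; t in [0, 1] interpolates views.

def backward_map(src, offsets, t):
    """src: 2D list of pixel values; offsets: per-pixel (dx, dy) at t = 1."""
    h, w = len(src), len(src[0])
    dst = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx, dy = offsets[y][x]
            # Scale the full offset by the interpolation parameter t,
            # then clamp the sample position to the image bounds.
            sx = min(max(int(round(x + t * dx)), 0), w - 1)
            sy = min(max(int(round(y + t * dy)), 0), h - 1)
            dst[y][x] = src[sy][sx]
    return dst

src = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
offsets = [[(1, 0)] * 3 for _ in range(3)]  # everything shifts one pixel
half = backward_map(src, offsets, 1.0)
```

Intermediate views come from intermediate values of t, which is why the method's cost is independent of scene complexity.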
Faster!
• Quadtrees:
– When depth changes a lot, increase resolution
– When depth is smooth, pixels move together
• Speedups ARE complexity-dependent
• Incremental rendering
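The quadtree speedup above can be sketched as a simple recursion (a toy version, with assumed helper names): subdivide only where depth varies, and treat smooth-depth blocks as units that move together.

```python
# Toy quadtree subdivision over a depth map: blocks with near-constant
# depth are kept whole; high-variation blocks are split into quadrants.
# Assumes a square, power-of-two depth map (an illustrative sketch).

def quadtree_blocks(depth, x, y, size, tol, out):
    vals = [depth[y + j][x + i] for j in range(size) for i in range(size)]
    if size == 1 or max(vals) - min(vals) <= tol:
        out.append((x, y, size))          # whole block moves together
    else:
        h = size // 2                     # split into four quadrants
        for ox, oy in ((0, 0), (h, 0), (0, h), (h, h)):
            quadtree_blocks(depth, x + ox, y + oy, h, tol, out)
    return out

depth = [[1, 1, 1, 1],
         [1, 1, 1, 1],
         [1, 1, 9, 9],
         [1, 1, 9, 9]]
blocks = quadtree_blocks(depth, 0, 0, 4, 0, [])  # four 2x2 blocks
```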
Cool Byproducts: Motion Blur
Real View-Interpolation Result
Cool Byproducts
• Shadows
• Specular reflections (requires separate maps)
• Current:
– Image-based primitives
Plenoptic Function: Light Fields
• Light Field Rendering. Levoy and Hanrahan, Siggraph 1996
• The Lumigraph: Gortler et al. Siggraph 1996
data link
Stanford Light-Field Capture (all photos from http://lightfield.stanford.edu/acq.html)
Tour Into the Picture
Horry, Anjyo, Arai, SIGGRAPH 1997
Slides borrowed from Alexei Efros,
who built on Steve Seitz’s and David Brogan’s
We want more of the plenoptic function
We want real 3D scene walk-throughs:
• Camera rotation
• Camera translation
Can we do it from a single photograph?
3D in the Real World
Camera rotations with homographies
St. Petersburg
photo by A. Tikhonov
Virtual camera rotations
Original image
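A pure rotation about the camera centre maps image points by the homography H = K R K⁻¹, where K is the intrinsics matrix and R the rotation. A minimal sketch with hand-rolled 3×3 matrix math (the simple diagonal-plus-principal-point K is an assumption for illustration):

```python
import math

# Virtual camera rotation as a homography: H = K R K^-1.
# K = [[f,0,cx],[0,f,cy],[0,0,1]] has a closed-form inverse.

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rotation_homography(f, cx, cy, theta):
    """Homography for a pan (yaw) by theta about the camera centre."""
    K = [[f, 0, cx], [0, f, cy], [0, 0, 1]]
    Kinv = [[1 / f, 0, -cx / f], [0, 1 / f, -cy / f], [0, 0, 1]]
    c, s = math.cos(theta), math.sin(theta)
    R = [[c, 0, s], [0, 1, 0], [-s, 0, c]]   # rotation about the y axis
    return matmul(matmul(K, R), Kinv)

def apply_h(H, x, y):
    """Apply H to a pixel (homogeneous divide at the end)."""
    p = [H[i][0] * x + H[i][1] * y + H[i][2] for i in range(3)]
    return p[0] / p[2], p[1] / p[2]

H = rotation_homography(500.0, 320.0, 240.0, 0.0)  # zero pan = identity
```

Warping the original image by H for a range of theta values produces the virtual rotations shown; translation cannot be synthesized this way, which motivates the planar scene models below.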
Camera translation
• Does it work? Synthetic PP
PP1
PP2
So, what can we do here?
• Model the scene as a set of planes!
• Now, just need to find the orientations of these planes.
Remember Preliminaries: Projective Geometry
Ames Room
Silly Euclid!
Parallel lines converge?
Vanishing lines
• Multiple vanishing points
– Any set of parallel lines on the plane defines a vanishing point
– The union of all of these vanishing points is the horizon line (also called the vanishing line)
– Note that different planes define different vanishing lines
Computing vanishing lines
• Properties:
– l is the intersection of the horizontal plane through C with the image plane
– Compute l from two sets of parallel lines on the ground plane
– All points at the same height as C project to l
• Points higher than C project above l
– Provides a way of comparing heights of objects in the scene
(Figure: camera centre C, vanishing line l, and the ground plane.)
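These constructions are one-liners in homogeneous coordinates: the line through two points is their cross product, and so is the intersection of two lines. Two pairs of parallel image lines give two vanishing points, and the horizon joins them (a minimal sketch; point coordinates are made up for illustration):

```python
# Vanishing points and the horizon via homogeneous cross products.

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def line_through(p, q):
    """Homogeneous line through two image points."""
    return cross((p[0], p[1], 1.0), (q[0], q[1], 1.0))

def intersect(l1, l2):
    """Intersection of two homogeneous lines, dehomogenized."""
    x, y, w = cross(l1, l2)
    return (x / w, y / w)

# One set of parallel "rails" converging to (0, 0), another to (10, 0):
v1 = intersect(line_through((-5, 5), (0, 0)), line_through((5, 5), (0, 0)))
v2 = intersect(line_through((0, 5), (10, 0)), line_through((20, 5), (10, 0)))
horizon = line_through(v1, v2)   # the vanishing line of the ground plane
```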
2 Point Perspective Street Scene http://cutlerart73.com/assign4.html
“ 3 Point Perspective At Play”
http://artintegrity.wordpress.com/2008/05/26/24-trouble-with-perspective-in-drawing-this-may-help/
“Ascending and Descending”, http://www.mcescher.com/Biography/lw435f2.jpg
See Also: Single View Metrology
A. Criminisi, I. Reid and A. Zisserman, ICCV1999
1. Pick a reference plane
2. Pick a reference direction
Figure 6: Measuring heights using parallel lines: given i) the vertical vanishing point, ii) the vanishing line for the ground plane, and iii) a reference height, the distance of the top of the window on the right wall from the ground plane is measured from the distance between the two horizontal lines shown, one defined by the top edge of the window and the other on the ground plane.
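Using the horizon property above (points at the camera's height project to the vanishing line), an object's height follows from image measurements once the camera height is known. A sketch for a level camera, with illustrative names:

```python
# Height from the horizon: the horizon sits at camera height, so the
# fraction of the base-to-horizon image distance that the object's top
# covers equals its fraction of the camera height (level-camera sketch).

def object_height(cam_height, y_horizon, y_base, y_top):
    """All y measured in image rows upward from the object's base row."""
    return cam_height * (y_top - y_base) / (y_horizon - y_base)

# Camera 1.6 m up; object's top reaches halfway from its base to horizon:
h = object_height(1.6, y_horizon=200.0, y_base=0.0, y_top=100.0)
```

An object whose top touches the horizon is exactly camera height, matching the "all points at the same height as C project to l" property.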
“Tour into the Picture,” Horry et al. 1997
• Create a 3D “theatre stage” of five billboards
• Specify foreground objects through bounding polygons
• Use camera transformations to navigate through the scene
The idea
• Many scenes (especially paintings), can be represented as an axis-aligned box volume (i.e. a stage)
• Key assumptions:
– All walls of volume are orthogonal
– Camera view plane is parallel to back of volume
– Camera up is normal to volume bottom
• How many vanishing points does the box have?
– Three, but two at infinity
– Single-point perspective
• Can use the vanishing point
to fit the box to the particular
scene!
Fitting the box volume
• User controls the inner box and the vanishing point placement (# of DOF???)
• Q: What’s the significance of the vanishing point location?
• A: It’s at eye level: ray from COP to VP is perpendicular to image plane.
DEMO
• Now, we know the 3D geometry of the box
• We can texture-map the box walls with texture from the image
link to web page with example code
High Camera
Example of user input: vanishing point and back face of view volume are defined
Low Camera
Example of user input: vanishing point and back face of view volume are defined
High Camera Low Camera
Comparison of how the image is subdivided for two different camera positions. Moving the vanishing point corresponds to moving the eyepoint in the 3D world.
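The subdivision itself (the "spidery mesh") comes from extending rays from the vanishing point through the corners of the inner back-face rectangle out to the image border, splitting the picture into floor, ceiling, and side walls. A small sketch, with assumed names:

```python
# Extend the ray from the vanishing point through a back-face corner
# until it hits the image rectangle; the four such rays plus the inner
# rectangle split the image into the five billboard regions.

def extend_to_border(vp, corner, w, h):
    """Point where the ray vp -> corner exits the w-by-h image."""
    dx, dy = corner[0] - vp[0], corner[1] - vp[1]
    ts = []
    if dx > 0: ts.append((w - vp[0]) / dx)
    if dx < 0: ts.append(-vp[0] / dx)
    if dy > 0: ts.append((h - vp[1]) / dy)
    if dy < 0: ts.append(-vp[1] / dy)
    t = min(ts)                       # first border crossed
    return (vp[0] + t * dx, vp[1] + t * dy)

# 100x100 image, vanishing point at centre, back face (40,40)-(60,60):
vp = (50.0, 50.0)
p = extend_to_border(vp, (60.0, 60.0), 100.0, 100.0)  # bottom-right ray
```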
Left Camera
Another example of user input: vanishing point and back face of view volume are defined
Right Camera
Another example of user input: vanishing point and back face of view volume are defined
Left Camera Right Camera
Comparison of two camera placements, left and right. Corresponding subdivisions match the view you would see looking down a hallway.
2D to 3D conversion
• First, we can get ratios
(Figure: back plane bounded by left/right and top/bottom edges, with the vanishing point inside.)
• Size of user-defined back plane must equal size of camera plane (orthogonal sides)
• Use top-versus-side ratio to determine relative height and width dimensions of box
• Left/right and top/bottom ratios determine part of 3D camera placement
(Figure: camera position relative to the left/right and top/bottom ratios.)
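Reading the ratios off the user's input: the vanishing point splits the image left/right and top/bottom, fixing where the eye sits across the box cross-section, while the back-face scale reflects box depth. Names and the simple linear model are illustrative, not the paper's notation:

```python
# Eye placement from the user's spidery-mesh input (toy sketch).

def camera_placement(vp, img_w, img_h, back_w):
    """Fractions of box width/height for the eye, plus back-face scale."""
    fx = vp[0] / img_w        # left/right ratio -> horizontal placement
    fy = vp[1] / img_h        # top/bottom ratio -> vertical placement
    scale = back_w / img_w    # back face shrinks with box depth
    return fx, fy, scale

# Centred vanishing point in a 640x480 image, 160-px-wide back face:
fx, fy, scale = camera_placement((320.0, 240.0), 640.0, 480.0, 160.0)
```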
Foreground Objects
• Use a separate billboard for each
• For this to work, three separate images are used:
– Original image
– Mask to isolate desired foreground objects
– Background with objects removed
Foreground Objects
• Add vertical rectangles for each foreground object
• Can compute 3D coordinates P0, P1 since they lie on a known plane
• P2, P3 can be computed as before (similar triangles)
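The similar-triangles step can be sketched as follows: once the ground-contact point fixes the billboard's depth z, the pinhole model gives the world height of any image row at that depth. The focal length f and all names are assumptions for illustration:

```python
# World height of a billboard top via similar triangles in a pinhole
# camera: dy pixels above the horizon at depth z corresponds to a
# world offset of z * dy / f above the camera height.

def world_height(cam_height, f, z, dy_from_horizon):
    """Height of a point at depth z imaged dy pixels above the horizon."""
    return cam_height + z * dy_from_horizon / f

# Camera 1.6 m up, f = 500 px; top imaged 40 px above horizon at z = 5 m:
h = world_height(cam_height=1.6, f=500.0, z=5.0, dy_from_horizon=40.0)
```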
See also…
• Tour into the picture with water surface reflection
• Tour into the Video, by Kang and Shin
Automatic Photo Pop-up
Hoiem, Efros, Hebert, SIGGRAPH 2005 <project page>
(using Derek Hoiem’s slides)
Related Work
• Multiple Images
– Manual Reconstruction: Façade [Debevec et al. 1996], REALVIZ ImageModeler, Photobuilder [Cipolla et al. 1999], [Ziegler et al. 2003], etc.
– Automatic Reconstruction: Structure from Motion (e.g. [Nister 2001], [Pollefeys et al. 2004])
• Single Image
– Manual Reconstruction
• Metric: [Liebowitz et al. 1999], [Criminisi et al. 1999]
• Approximate: Tour into the Picture [Horry et al. 1997]; [Kang et al. 2001], [Oh et al. 2001], [Zhang et al. 2001], etc.
– Automatic Reconstruction…?
The Problem
from [Sinha and Adelson 1993]
• Recovering 3D geometry from single 2D projection
• Infinite number of possible solutions!
Goals
• Simple, piecewise planar models
• Outdoor scenes
• Doesn’t need to work all the time (~35%)
Pop-up Book
Approach: Learn Correlation
• Learn structure of the world and appearance-based models of geometry
…
Overview
Input
Ground
Vertical
Sky
Geometric Labels Cut’n’Fold 3D Model
Image
Learned Models
Cues
Color
Location
Texture
Perspective
Need Spatial Support
50x50 Patch
Color, Texture, Perspective
Robust Spatial Support
RGB Pixels → Superpixels [Felzenszwalb and Huttenlocher 2004]
Somewhat unpredictable!
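The graph-based grouping behind superpixels can be shown in miniature: union-find merging of neighbouring pixels whose colour difference is under a threshold. This is a toy stand-in for the Felzenszwalb–Huttenlocher segmenter, which uses an adaptive, component-dependent threshold rather than a fixed one:

```python
# Toy graph-based segmentation: merge 4-connected neighbours whose
# values differ by at most tol, then count the resulting regions.

def segment(img, tol):
    h, w = len(img), len(img[0])
    parent = list(range(h * w))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i

    for y in range(h):
        for x in range(w):
            for nx, ny in ((x + 1, y), (x, y + 1)):
                if nx < w and ny < h and abs(img[y][x] - img[ny][nx]) <= tol:
                    parent[find(y * w + x)] = find(ny * w + nx)
    return len({find(i) for i in range(h * w)})

img = [[10, 10, 90],
       [10, 10, 90]]
n = segment(img, tol=5)   # two regions: the 10s and the 90s
```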
Grouping Superpixels
• Simple color, texture, location cues
• Which superpixels come from the same surface?
• Learn from training images
– Boosted kernel density estimation
Many Weak Geometric Cues
• Material
• Image Location
• 3D Geometry
Multiple Segmentations
(Figure: superpixels grouped into multiple candidate segmentations.)
• Group superpixels likely to be from the same surface into segments
• Single segmentation unreliable
• Create multiple segmentations
Learning Appearance-based Geometry
• All geometric cues available
• Does this segment correspond to a single surface? (homogeneity likelihood)
• If so, what is the geometry of that surface? (label likelihood)
Learn from training images
• Prepare training images
– Create multiple segmentations of training images
– Get segment labels from ground truth: ground, vertical, sky, or “mixed”
• Density estimation by boosted decision trees
– 8 nodes per tree
– Logistic regression version of AdaBoost
• See [Collins, Schapire, and Singer 2002]
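The boosting loop can be illustrated with one-feature decision stumps. This is a loose sketch of the additive-model idea, not Hoiem et al.'s 8-node trees, and it uses plain exponential-loss weights rather than the logistic variant:

```python
import math

# Toy AdaBoost with threshold stumps on scalar features.
# xs: feature values; ys: labels in {-1, +1}.

def boost(xs, ys, rounds=3):
    stumps = []
    F = [0.0] * len(xs)                      # current additive scores
    for _ in range(rounds):
        w = [math.exp(-y * f) for y, f in zip(ys, F)]  # reweight errors
        best = None
        for thr in xs:                       # try every threshold/sign
            for sign in (1, -1):
                err = sum(wi for xi, yi, wi in zip(xs, ys, w)
                          if sign * (1 if xi > thr else -1) != yi)
                if best is None or err < best[0]:
                    best = (err, thr, sign)
        err, thr, sign = best
        err = max(err, 1e-9)                 # avoid log(0) on perfect fits
        alpha = 0.5 * math.log((sum(w) - err) / err)
        stumps.append((thr, sign, alpha))
        F = [f + alpha * sign * (1 if x > thr else -1)
             for f, x in zip(F, xs)]
    return lambda x: sum(a * s * (1 if x > t else -1) for t, s, a in stumps)

score = boost([0.0, 1.0, 2.0, 3.0], [-1, -1, 1, 1])
```

The sign of the returned score is the predicted class; its magnitude plays the role of the likelihood the system thresholds.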
Label Likelihood / Homogeneity Likelihood
Labeling Segments
…
…
For each segment:
– Get the homogeneity likelihood and the label likelihood
Image Labeling
…
Labeled Segmentations
Labeled Pixels
Learned from training images
Cutting and Folding
• Fit ground-vertical boundary
– Iterative Hough transform
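The Hough step can be sketched as follows: each boundary point votes for every (theta, rho) line through it, and the strongest cell is the fitted segment (the paper then iterates on the remaining points). The coarse theta grid and names are assumptions for illustration:

```python
import math

# Minimal Hough transform: accumulate votes in (theta, rho) space and
# return the best-supported line through the boundary points.

def hough_best_line(points, thetas, rho_step=1.0):
    votes = {}
    for x, y in points:
        for th in thetas:
            # Normal-form line: x cos(th) + y sin(th) = rho
            rho = round((x * math.cos(th) + y * math.sin(th)) / rho_step)
            votes[(th, rho)] = votes.get((th, rho), 0) + 1
    return max(votes, key=votes.get)

# Points on the horizontal boundary y = 5, plus one outlier:
pts = [(x, 5.0) for x in range(10)] + [(3.0, 9.0)]
thetas = [0.0, math.pi / 4, math.pi / 2]
th, rho = hough_best_line(pts, thetas)
```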
Cutting and Folding
• Form polylines from boundary segments
– Join segments that intersect at slight angles
– Remove small overlapping polylines
• Estimate horizon position from perspective cues
Cutting and Folding
• “Fold” along polylines and at corners
• “Cut” at ends of polylines and along the vertical-sky boundary
Results
Input Image
Cut and Fold
Automatic Photo Pop-up
Results
Automatic Photo Pop-up
Input Image
Cut and Fold
Results
Input Image
Automatic Photo Pop-up
Results
Input Images
Automatic Photo Pop-up
Results
Input Image
Automatic Photo Pop-up
Comparison with Manual Method
Input Image
Automatic Photo Pop-up (30 sec)!
[Liebowitz et al. 1999]
Failures
Labeling Errors
Just the first step…
• Richer set of geometric classes (ICCV 05 paper)
• Segment foreground objects
• Cleaner segmentation
• More accurate labeling
Quantitative Results (ICCV 05)
Average accuracy: 86%
Photo Pop-up Conclusions
• First system to automatically recover a 3D scene from a single image!
• Learn statistics of our world from training images
See also their journal version: “Recovering Surface Layout from an Image,” IJCV 2007
See Also (and compare)
• Make3D
– http://make3d.stanford.edu/
– Includes source code!
– “Make3D: Learning 3-D Scene Structure from a Single Still Image”, Saxena, Sun, Ng, PAMI 2008
See Also: Image-Based Modeling and Photo Editing
Oh et al., SIGGRAPH 2001
Web / Video (loud)