Automatic scene inference for 3D object compositing
Kevin Karsch (UIUC), Kalyan Sunkavalli, Sunil Hadap, Nathan Carr, Hailin Jin, Rafael Fonte, Michael Sittig, David Forsyth
SIGGRAPH 2014
What is this system
• Image editing system
• Drag-and-drop object insertion
• Place objects in 3D and relight
• Fully automatic recovery of a comprehensive 3D scene model: geometry, illumination, diffuse albedo, and camera parameters
• Works from a single low dynamic range (LDR) image
Existing problems
• It is the artist's job to create photorealistic effects by recognizing the physical space
• Lighting, shadow, perspective
• Needed: camera parameters, scene geometry, surface materials, and sources of illumination
State-of-the-art
• http://www.popularmechanics.com/technology/digital/visual-effects/4218826
• http://en.wikipedia.org/wiki/The_Adventures_of_Seinfeld_%26_Superman
What this system cannot handle
• Works best when scene lighting is diffuse; therefore generally works better indoors than outdoors
• Errors in geometry, illumination, or materials may be prominent
• Does not handle object insertion behind existing scene elements
Contribution
• Illumination inference: recovers a full lighting model, including light sources not directly visible in the photograph
• Depth estimation: combines data-driven depth transfer with geometric reasoning about the scene layout
How to do this
• Needed: geometry, illumination, surface reflectance
• Even though the estimates are coarse, the composites still look realistic, because even large changes in lighting are often not perceivable
Workflow
Indoor/outdoor scene classification
• K-nearest-neighbor matching of GIST features (see the sketch below)
• Indoor dataset: NYUv2
• Outdoor dataset: Make3D
• Different training images and classifiers are chosen depending on the indoor/outdoor label
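A minimal sketch of the nearest-neighbor step in Python. The tiny-image descriptor below is a crude, hypothetical stand-in for a true GIST descriptor (which uses a Gabor filter bank), and the neighbor count k is an assumption; the paper does not specify these details here.

```python
import numpy as np
from PIL import Image
from sklearn.neighbors import NearestNeighbors

def tiny_image_descriptor(image, size=16):
    """Crude stand-in for a GIST descriptor: a downsampled grayscale
    'tiny image', flattened and contrast-normalized. `image` is an
    H x W x 3 uint8 array. Enough to make the KNN pipeline runnable."""
    g = Image.fromarray(image).convert("L").resize((size, size))
    v = np.asarray(g, dtype=float).ravel()
    return (v - v.mean()) / (v.std() + 1e-8)

def classify_scene(image, desc_train, labels, k=7):
    """desc_train: (N, D) descriptors of labeled training images;
    labels: (N,) array, 0 = indoor (NYUv2), 1 = outdoor (Make3D).
    Majority vote over the k nearest training images."""
    nn = NearestNeighbors(n_neighbors=k).fit(desc_train)
    _, idx = nn.kneighbors(tiny_image_descriptor(image)[None, :])
    return "outdoor" if labels[idx[0]].mean() > 0.5 else "indoor"
```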
Single image reconstruction
• Camera parameters and geometry
– Focal length f, camera center (cx, cy), and extrinsic parameters are computed from three orthogonal vanishing points detected in the scene (see the sketch below)
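The focal length can be recovered from two finite vanishing points of orthogonal scene directions via the standard single-view calibration constraint (v1 − c)·(v2 − c) + f² = 0, assuming zero skew and unit aspect ratio. A sketch of that step, not necessarily the paper's exact procedure:

```python
import numpy as np

def focal_from_vps(v1, v2, c):
    """Estimate focal length from two finite vanishing points v1, v2
    (pixel coordinates) of orthogonal scene directions, with principal
    point c = (cx, cy). Orthogonality of the back-projected rays gives
    (v1 - c) . (v2 - c) + f^2 = 0, hence f = sqrt(-(v1 - c) . (v2 - c))."""
    d = -np.dot(np.asarray(v1, float) - c, np.asarray(v2, float) - c)
    if d <= 0:
        raise ValueError("vanishing-point pair inconsistent with orthogonality")
    return np.sqrt(d)
```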
Surface materials
• Per-pixel diffuse albedo and shading are estimated with the Color Retinex method
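A minimal grayscale sketch of the Retinex idea (the paper uses a color variant): gradients of log-luminance above a threshold are treated as albedo edges; the remaining small gradients are reintegrated into a log-shading image by a least-squares (Poisson) solve. The threshold value and the scale anchor are assumptions:

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.linalg import lsqr

def retinex_decompose(log_lum, thresh=0.1):
    """Return (log_shading, log_albedo) for a log-luminance image.
    Dense Python loops: intended for small images only."""
    H, W = log_lum.shape
    idx = lambda y, x: y * W + x
    rows, cols, vals, b, eq = [], [], [], [], 0
    gx = np.diff(log_lum, axis=1)  # horizontal gradients
    gy = np.diff(log_lum, axis=0)  # vertical gradients
    for y in range(H):             # shading should match small gradients
        for x in range(W - 1):
            g = gx[y, x] if abs(gx[y, x]) < thresh else 0.0
            rows += [eq, eq]; cols += [idx(y, x + 1), idx(y, x)]
            vals += [1.0, -1.0]; b.append(g); eq += 1
    for y in range(H - 1):
        for x in range(W):
            g = gy[y, x] if abs(gy[y, x]) < thresh else 0.0
            rows += [eq, eq]; cols += [idx(y + 1, x), idx(y, x)]
            vals += [1.0, -1.0]; b.append(g); eq += 1
    # anchor the unknown global scale: log-shading of pixel (0, 0) = 0
    rows.append(eq); cols.append(0); vals.append(1.0); b.append(0.0); eq += 1
    A = csr_matrix((vals, (rows, cols)), shape=(eq, H * W))
    log_shading = lsqr(A, np.asarray(b))[0].reshape(H, W)
    return log_shading, log_lum - log_shading
```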
Data-driven depth estimation
• Database: RGB-D images
• Appearance cues for correspondence: multi-scale SIFT features
• Incorporates geometric information
Data-driven depth estimation (objective terms; a hedged reconstruction of the combined objective follows)
• E_t: depth transfer
• E_m: Manhattan world
• E_o: orientation
• E_3s: spatial smoothness in 3D
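These terms are presumably combined into one weighted objective over the per-pixel depth map D; a hedged reconstruction (the λ weights are assumed, following common practice, not copied from the paper):

```latex
D^{*} = \operatorname*{argmin}_{D}\; E_t(D) + \lambda_m E_m(D) + \lambda_o E_o(D) + \lambda_{3s} E_{3s}(D)
```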
Scene illumination
Visible sources
• Segment the image into superpixels
• Compute features for each superpixel:
– Location in the image
– The 340 features used in Make3D
• Train a binary classifier on annotated data to predict whether or not a superpixel is emitting/reflecting a significant amount of light (see the sketch below)
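A sketch of the superpixel light classifier; the paper does not name the classifier here, so the random forest and the exact feature layout are assumptions standing in for any off-the-shelf binary classifier:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Assumed shapes: X holds one feature vector per superpixel
# (image location + the 340 Make3D-style features, ~342 dims);
# y holds binary annotations (1 = superpixel is a light source).
def train_light_classifier(X, y):
    clf = RandomForestClassifier(n_estimators=200, random_state=0)
    clf.fit(X, y)
    return clf

def light_probability(clf, superpixel_features):
    # probability that each superpixel emits/reflects significant light
    return clf.predict_proba(superpixel_features)[:, 1]
```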
Out-of-view sources
• Data-driven: annotated SUN360 panorama dataset
• Assumption: if two photographs are similar, then the illumination environments beyond the photographed regions will be similar as well
Out-of-view sources (matching and ranking)
• Features: geometric context, orientation maps, spatial pyramids, HSV histograms, output of the light classifier
• Measures: histogram intersection score (see the sketch below), per-pixel inner product
• Similarity metric for IBLs: how similar the rendered canonical objects are
• Ranking function: trained with 1-slack, linear SVM-ranking optimization
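The histogram intersection measure is standard; a minimal implementation:

```python
import numpy as np

def histogram_intersection(h1, h2):
    """Histogram intersection score between two normalized histograms:
    the sum of element-wise minima. 1.0 = identical, 0.0 = disjoint."""
    return np.minimum(h1, h2).sum()

# usage, e.g. on HSV hue histograms of two images:
# score = histogram_intersection(hist_a / hist_a.sum(),
#                                hist_b / hist_b.sum())
```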
Relative intensities of the light sources
• Intensity estimation through rendering: adjust source intensities until a rendered version of the scene matches the original image (see the sketch below)
• Humans cannot distinguish among a range of illumination configurations, suggesting that a family of lighting conditions produces the same perceptual response
• Therefore, simply choose the lighting configuration that renders fastest
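Because rendering is linear in each source's intensity, one workable scheme (a sketch of the render-and-match idea, not necessarily the paper's exact objective) is to render the scene once per source at unit intensity and solve a nonnegative least-squares problem for the per-source weights:

```python
import numpy as np
from scipy.optimize import nnls

def estimate_light_intensities(unit_renders, photo):
    """unit_renders: list of H x W x 3 float images, one render per
    light source at unit intensity; photo: the original H x W x 3
    image. Solves for nonnegative weights w so that
    sum_i w[i] * unit_renders[i] best matches the photo."""
    B = np.stack([r.ravel() for r in unit_renders], axis=1)  # (P, n_lights)
    w, _residual = nnls(B, photo.ravel())
    return w
```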
Physically grounded image editing
• Drag-and-drop insertion
• Lighting adjustment
• Synthetic depth-of-field (a toy sketch follows)
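A toy sketch of synthetic depth-of-field using the recovered depth map: slice the scene into depth layers and blur each with a Gaussian whose width grows away from the focal depth. The layering and the linear blur model are simplifying assumptions, not the paper's method:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def synthetic_dof(image, depth, focal_depth, strength=2.0, n_layers=8):
    """image: H x W x 3 float; depth: H x W recovered depth map;
    focal_depth: the depth kept in focus. `strength` scales blur
    radius with |depth - focal_depth| (stand-in for a true
    circle-of-confusion model)."""
    out = np.zeros_like(image, dtype=float)
    weight = np.zeros(depth.shape, dtype=float)
    edges = np.linspace(depth.min(), depth.max(), n_layers + 1)
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = ((depth >= lo) & (depth <= hi)).astype(float)
        sigma = strength * abs(0.5 * (lo + hi) - focal_depth)
        blurred = gaussian_filter(image, sigma=(sigma, sigma, 0))
        m = gaussian_filter(mask, sigma=sigma)  # soften layer seams
        out += blurred * m[..., None]
        weight += m
    return out / np.maximum(weight, 1e-6)[..., None]
```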
User study
• Real object in a real scene vs. inserted object in a real scene
• Synthetic object in a synthetic scene vs. inserted object in a synthetic scene
• The system produces perceptually convincing results