Geometric and Semantic 3D Reconstruction: Part 4A: Volumetric Semantic 3D Reconstruction CVPR 2017 Tutorial Christian Häne UC Berkeley

Dense Multi-View Reconstruction

• Goal: 3D Model from Images (Depth Maps)

A Standard Pipeline

Input Images -> (Structure-from-Motion) -> Sparse Reconstruction -> (Dense Matching) -> Depth Maps

Depth map color coding: red is close, blue is far

Depth Maps -> (Depth Map Fusion) -> Dense 3D Model

Challenges

• Noise in Depth Maps

• Inconsistencies Between Views

• Incomplete Data

• Mistakes in Depth Maps

Domain / Representation

• Volume

– Outside / Inside

• Mesh

– Triangles Representing the Surface

• Surface Element (Surfel)

– Dense Set of Small Disks

• ...

Truncated Signed Distance Field (TSDF) [Curless & Levoy, 1996, Levoy et al. 2000]

• Weighted Average over Multiple Viewpoints
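The weighted average can be sketched as a running update per voxel. The snippet below is a minimal illustration under simplifying assumptions that are not from the slides: a single 1D column of voxels along one viewing ray, a hypothetical truncation band `trunc`, and voxels far behind the measured surface treated as unobserved.

```python
def truncate(signed_distance, trunc):
    """Clamp a signed distance to the interval [-trunc, trunc]."""
    return max(-trunc, min(trunc, signed_distance))


def fuse_tsdf(voxel_depths, observations, trunc=0.1):
    """Fuse depth observations into a TSDF by a weighted running average.

    voxel_depths: depth of each voxel centre along one ray (assumed 1D setup).
    observations: list of (measured_surface_depth, weight), one per viewpoint.
    Returns (tsdf, weights) with one entry per voxel.
    """
    tsdf = [0.0] * len(voxel_depths)
    weights = [0.0] * len(voxel_depths)
    for surface_depth, w in observations:
        for k, z in enumerate(voxel_depths):
            sdf = surface_depth - z  # positive in front of the surface
            if sdf < -trunc:
                continue  # far behind the surface: this view says nothing
            d = truncate(sdf, trunc)
            # Weighted average of the old value and the new measurement.
            tsdf[k] = (weights[k] * tsdf[k] + w * d) / (weights[k] + w)
            weights[k] += w
    return tsdf, weights


if __name__ == "__main__":
    tsdf, weights = fuse_tsdf([0.9, 1.0, 1.1], [(0.98, 1), (1.02, 1)])
    print(tsdf, weights)
```

Two views that place the surface slightly in front of and slightly behind the middle voxel average out to a value near zero there, which is where the fused surface ends up.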

Marching Cubes [Lorensen & Cline 1987]

• Conversion from Volume to Mesh

• Extract Mesh as Iso-Surface (Zero-Crossing)
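Marching Cubes places mesh vertices on the grid edges where the volume changes sign, using linear interpolation between the two edge endpoints. The sketch below shows only that edge step in a 1D analogue (a row of signed samples); it is an illustration, not the full cube lookup-table algorithm, and the function name is hypothetical.

```python
def zero_crossings(samples, spacing=1.0):
    """Return positions where a sampled signed function crosses zero.

    Between two neighbouring samples of opposite sign, the crossing is
    located by linear interpolation, as done per cube edge in Marching Cubes.
    """
    crossings = []
    for k in range(len(samples) - 1):
        a, b = samples[k], samples[k + 1]
        if a == 0.0:
            crossings.append(k * spacing)  # sample lies exactly on the surface
        elif a * b < 0.0:
            t = a / (a - b)  # fraction of the edge at which the value is zero
            crossings.append((k + t) * spacing)
    if samples and samples[-1] == 0.0:
        crossings.append((len(samples) - 1) * spacing)
    return crossings


if __name__ == "__main__":
    print(zero_crossings([1.0, 0.5, -0.5, -1.0]))
```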

[Levoy et al. 2000]

Real-Time on GPU [Newcombe et al. 2011]

• Input: Depth Cameras (Kinect)

Regularization

• Noise and Outliers

• Energy minimization

– Data Term

– Smoothness Term

Discrete Domain / Graph Cut [Lempitsky & Boykov, 2007]

• Label Voxels as Inside/Outside (1 or 0)

• Energy minimization via Graph Cut

– Cut Edges -> Smoothness Cost

Example labeling (figure):

0 0 0 0
0 1 1 0
0 1 1 0
0 0 1 0

[Lempitsky & Boykov, 2007]
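To make the energy concrete, the sketch below evaluates it for a given binary labeling on a 4-connected 2D grid: a data cost plus a smoothness cost proportional to the number of cut edges. It only evaluates the energy and does not run a graph cut. The data cost (disagreement with a hypothetical observed labeling) and the weight `smoothness_weight` are assumptions made for illustration.

```python
def labeling_energy(labels, observed, smoothness_weight):
    """Return (energy, cut_edges) of a binary labeling on a 4-connected grid.

    Data term: number of voxels whose label differs from `observed`
    (an assumed, very simple data cost).
    Smoothness term: smoothness_weight times the number of neighbouring
    voxel pairs with different labels, i.e. the edges a cut would sever.
    """
    rows, cols = len(labels), len(labels[0])
    data = sum(
        labels[r][c] != observed[r][c] for r in range(rows) for c in range(cols)
    )
    cut = 0
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols and labels[r][c] != labels[r][c + 1]:
                cut += 1
            if r + 1 < rows and labels[r][c] != labels[r + 1][c]:
                cut += 1
    return data + smoothness_weight * cut, cut


if __name__ == "__main__":
    observed = [[0, 0, 0, 0], [0, 1, 1, 0], [0, 1, 1, 0], [0, 0, 1, 0]]
    filled = [[0, 0, 0, 0], [0, 1, 1, 0], [0, 1, 1, 0], [0, 1, 1, 0]]
    print(labeling_energy(observed, observed, 2))
    print(labeling_energy(filled, observed, 2))
```

With a smoothness weight of 2, the labeling that fills the notch pays one unit of data cost but cuts one edge fewer, so its total energy is lower than that of the observed labeling.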

Metrication Artifacts

Continuous Domain / Variational

• Segment Continuous Domain Inside/Outside

• Variational Optimization

– Total Variation as Smoothness

– Penalizes Surface Area
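For a binary indicator function, the total variation measures the length of the boundary (the surface area in 3D). The sketch below computes a discrete total variation of a 2D grid with forward differences; the discretization and the function name are assumptions for illustration. The anisotropic (L1) variant counts axis-aligned boundary edges, which is the quantity a 4-connected graph cut penalizes, while the isotropic variant uses the Euclidean gradient norm.

```python
import math


def total_variation(u, isotropic=True):
    """Discrete total variation of a 2D grid using forward differences."""
    rows, cols = len(u), len(u[0])
    tv = 0.0
    for r in range(rows):
        for c in range(cols):
            dx = u[r][c + 1] - u[r][c] if c + 1 < cols else 0.0
            dy = u[r + 1][c] - u[r][c] if r + 1 < rows else 0.0
            tv += math.hypot(dx, dy) if isotropic else abs(dx) + abs(dy)
    return tv


if __name__ == "__main__":
    # A 2x2 square of "inside" voxels; its boundary has length 8.
    square = [[0, 0, 0, 0], [0, 1, 1, 0], [0, 1, 1, 0], [0, 0, 0, 0]]
    print(total_variation(square, isotropic=False))
    print(total_variation(square, isotropic=True))
```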

Figure labels: u(x) = 0 and u(x) = 1 mark the two regions [Zach, 2007]

Direct Reconstruction

Input Images -> (Structure-from-Motion) -> Sparse Reconstruction -> (Photoconsistency, Energy Minimization) -> Dense 3D Model

Per Voxel Photoconsistency

• No explicit Computation of Depth Maps

• Photo Consistency Evaluated per Voxel

• Silhouettes / Visual Hull (Object only)
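One simple way to score photo consistency per voxel is to project the voxel centre into every view and measure how much the sampled intensities disagree. The sketch below is an assumption-laden illustration, not the measure used in the cited work: it uses an axis-aligned pinhole camera without rotation, grayscale images as nested lists, nearest-neighbour sampling, intensity variance as the score, and it ignores occlusion.

```python
def project(point, cam):
    """Project a 3D point with an axis-aligned pinhole camera (assumed model).

    cam is a dict with hypothetical keys: "center", "focal", "cx", "cy".
    Returns integer pixel coordinates (u, v), or None if behind the camera.
    """
    x, y, z = (p - c for p, c in zip(point, cam["center"]))
    if z <= 0:
        return None
    u = cam["focal"] * x / z + cam["cx"]
    v = cam["focal"] * y / z + cam["cy"]
    return int(round(u)), int(round(v))


def photo_consistency(point, views):
    """Variance of intensities seen at `point`; lower means more consistent.

    views: list of (cam, image) pairs, image[v][u] being a grayscale value.
    Returns None if fewer than two views see the point.
    """
    samples = []
    for cam, image in views:
        pixel = project(point, cam)
        if pixel is None:
            continue
        u, v = pixel
        if 0 <= v < len(image) and 0 <= u < len(image[0]):
            samples.append(image[v][u])
    if len(samples) < 2:
        return None
    mean = sum(samples) / len(samples)
    return sum((s - mean) ** 2 for s in samples) / len(samples)


if __name__ == "__main__":
    cam_a = {"center": (0.0, 0.0, 0.0), "focal": 1.0, "cx": 1.0, "cy": 1.0}
    cam_b = {"center": (1.0, 0.0, 0.0), "focal": 1.0, "cx": 1.0, "cy": 1.0}
    img_a = [[0.0, 0.0, 0.0], [0.0, 0.5, 0.0], [0.0, 0.0, 0.0]]
    img_b = [[0.0, 0.0, 0.0], [0.5, 0.0, 0.0], [0.0, 0.0, 0.0]]
    print(photo_consistency((0.0, 0.0, 1.0), [(cam_a, img_a), (cam_b, img_b)]))
```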

[Kolev et al. 2009]

Adding Surface Normals [Kolev et al. 2009]

• Surface Normals for High Frequency Details

Estimated Normal Field Guides Reconstruction

Formulation over Rays [Liu & Cooper, 2010, 2014]

• First Transition to Occupied Space Along Ray

• Color Consistent over all Rays

• Discrete Graph Based Formulation

– Alternating Minimization, Belief Propagation

Challenging Cases

• Some Examples Using Depth Map Fusion

Adding Semantics

Input Images -> (Sparse Reconstruction, Dense Matching) -> Depth Maps

Input Images -> (Semantic Classifier) -> Class Likelihoods

Dense Semantic 3D Reconstruction

Depth Maps + Class Likelihoods -> (Joint Fusion, Convex Optimization) -> Dense Semantic 3D Model

Adding Semantics to Geometry [Sengupta, Greveson, Shahrokni, Torr, 2013]

• Geometry not Improved

Dense Semantic 3D Reconstruction [Häne et al. 2013, 2016]

Comparison: Dense Semantic 3D Model vs. Dense 3D Model

The dense semantic 3D model takes class-specific surface orientation into account.

For example, a ground surface is more likely to be horizontal than vertical.

Multi-Label Formulation

Discrete Domain

• Smoothness: Transitions Along Edges

• Linear Program

• Belief Propagation

• Graph Cuts

Continuous Domain

• Smoothness: (Anisotropic) Boundary Length

• Convex Program

• Discretized (for Iterative Optimization)

Anisotropy / Wulff Shape

• Wulff Shape [Wulff 1901, Esedoglu and Osher 2004]

– Convex shape

– defined by
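For reference, the Wulff shape is commonly defined through a convex, positively 1-homogeneous smoothness function. The symbols below ($\phi$, $W_\phi$, $n$) are notation assumed here for illustration and may differ from the notation on the original slide.

```latex
W_\phi \;=\; \bigl\{\, x \in \mathbb{R}^d \;:\; \langle x, n \rangle \le \phi(n) \quad \forall\, n \in \mathbb{R}^d \,\bigr\},
\qquad
\phi(n) \;=\; \max_{x \in W_\phi} \langle x, n \rangle .
```

Under this definition the isotropic penalty $\phi(n) = \lVert n \rVert_2$ has the unit ball as its Wulff shape, and extending $W_\phi$ in one direction raises the cost of surfaces whose normal points that way.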

Semantic Reconstruction Formulation [Häne et al. 2013, 2016]

Data Term: Described as per-voxel unary potentials

Regularization Term: Class-specific, direction dependent, surface area penalization

Learned from training data
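An energy with these two terms can be written schematically as below. This is a sketch in assumed notation, not a transcription of the formula in the cited papers: $x_s^i$ is the relaxed indicator that voxel $s$ takes label $i$, $\rho_s^i$ is the unary cost, $y_s^{ij}$ encodes the oriented transition between labels $i$ and $j$ at $s$, and $\phi^{ij}$ is a convex, positively 1-homogeneous function for that label pair. The linear constraints that couple $y$ to $x$ are omitted.

```latex
E(x) \;=\; \sum_{s \in \Omega} \Bigl( \sum_{i} \rho_s^{i}\, x_s^{i}
      \;+\; \sum_{i<j} \phi^{ij}\bigl(y_s^{ij}\bigr) \Bigr)
\qquad \text{s.t.} \quad \sum_{i} x_s^{i} = 1, \quad x_s^{i} \ge 0 .
```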

Data Term

• Sky -> Free Space Weight Along Whole Ray

• Sum of Weight in Each Voxel
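The two bullets can be sketched as an accumulation of per-label costs along each viewing ray. The snippet is a simplified illustration under assumptions not stated in the slides: label 0 is free space, lower cost means more preferred, each ray is given as a list of voxel indices ordered from the camera outwards, and `band` is a hypothetical number of voxels at and behind the measured depth that receive the class weight.

```python
def enter_ray_weights(costs, ray_voxels, hit_index, class_label,
                      free_label=0, weight=1.0, band=1):
    """Add the unary weights of one ray into the per-voxel cost table.

    costs: costs[voxel][label], modified in place (sum over all rays).
    hit_index: position in ray_voxels of the measured depth, or None when
               the pixel is classified as sky (no surface along the ray).
    """
    end_free = len(ray_voxels) if hit_index is None else hit_index
    # Free-space evidence in front of the surface (whole ray for sky).
    for voxel in ray_voxels[:end_free]:
        costs[voxel][free_label] -= weight
    # Evidence for the observed class at and just behind the surface.
    if hit_index is not None:
        for voxel in ray_voxels[hit_index:hit_index + band]:
            costs[voxel][class_label] -= weight


if __name__ == "__main__":
    costs = [[0.0, 0.0] for _ in range(4)]  # 4 voxels, labels: free, building
    enter_ray_weights(costs, [0, 1, 2, 3], hit_index=2, class_label=1)
    enter_ray_weights(costs, [0, 1, 2, 3], hit_index=None, class_label=1)
    print(costs)
```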

Entering Unary Weights

• Weights for all voxels and all views entered

(Figure: + and - weights entered into the voxels along each viewing ray)

• Model Recovered based on weights

Smoothness Term

• Isotropic + Anisotropic (Wulff Shape)

• Maximum Likelihood Estimation

• Grid Search

Two Wulff Shape Parameterizations

Training Data

Energy Evolution [Häne et al. 2013, 2016]

• Optimization using the First Order Primal-Dual Algorithm [Chambolle & Pock, 2010]
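The structure of the first order primal-dual iteration is easiest to see on a small problem. The sketch below applies it to 1D total-variation denoising, minimizing lam * sum|x[k+1] - x[k]| + 0.5 * sum (x[k] - f[k])^2. This toy problem stands in for the much larger multi-label energy and is an assumption made for illustration; the step sizes are chosen so that tau * sigma * 4 < 1, where 4 bounds the squared norm of the forward-difference operator.

```python
def tv_denoise_1d(f, lam, iterations=300, tau=0.45, sigma=0.45):
    """1D TV denoising with a first order primal-dual iteration (toy sketch)."""
    n = len(f)
    x = list(f)        # primal variable
    x_bar = list(f)    # extrapolated primal variable
    y = [0.0] * (n - 1)  # dual variable, one per neighbouring pair
    for _ in range(iterations):
        # Dual step: gradient ascent, then projection onto [-lam, lam].
        for k in range(n - 1):
            v = y[k] + sigma * (x_bar[k + 1] - x_bar[k])
            y[k] = max(-lam, min(lam, v))
        # Primal step: descent along K^T y, then proximal map of the data term.
        x_new = []
        for k in range(n):
            kty = (y[k - 1] if k > 0 else 0.0) - (y[k] if k < n - 1 else 0.0)
            x_new.append((x[k] - tau * kty + tau * f[k]) / (1.0 + tau))
        # Over-relaxation of the primal variable.
        x_bar = [2.0 * a - b for a, b in zip(x_new, x)]
        x = x_new
    return x


def tv_energy(x, f, lam):
    """Objective value: weighted total variation plus quadratic data term."""
    tv = sum(abs(x[k + 1] - x[k]) for k in range(len(x) - 1))
    return lam * tv + 0.5 * sum((a - b) ** 2 for a, b in zip(x, f))


if __name__ == "__main__":
    noisy = [0.0, 0.1, -0.1, 0.05, 1.0, 0.9, 1.1, 1.0]
    smooth = tv_denoise_1d(noisy, lam=0.1)
    print(tv_energy(noisy, noisy, 0.1), tv_energy(smooth, noisy, 0.1))
```

The dual projection and the primal proximal step are both cheap per-element operations, which is why this family of methods suits large voxel grids.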

Weakly Observed Structures [Häne et al. 2013, 2016]

• Buildings Standing on the Ground

• Building and Vegetation Separated

Unobserved Surfaces

• Labels can be Separated

Extension To Octrees [Blaha, Vogel, et al. 2016]

• Exploiting Sparsity of Surface

Using More Classes [Cherabier, Häne, Oswald, Pollefeys, 2016]

• Exploiting Sparsity of Labels / Transitions

Only Relevant Labels Active In Block

Using Only Sparse Points [Kundu et al. 2014]

• Discrete Formulation

– Message Passing

Traditional, Unary Potential

• Weights for all voxels and all views entered

(Figure: + and - weights entered into the voxels along each viewing ray)

• Model Recovered based on weights

Issues with Unary Potentials [Savinov, Ladicky, Häne, Pollefeys, 2015, 2016]

• Data given as Information along rays

• Approximation with unary weights -> artifacts

Artifacts: Closed Archway, Inflated Roofline

Ray Potentials

• Costs along rays for placing surface

– First transition to occupied semantic class

• Optimization contains per-ray variables
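The defining property of a ray potential is that its cost depends only on the first transition from free space to an occupied label along the ray. The sketch below evaluates such a cost for a given labeling; the cost table, the sky cost, and the convention that label 0 is free space are hypothetical inputs chosen for illustration.

```python
def first_occupied(labels_along_ray, free_label=0):
    """Return (position, label) of the first non-free voxel, or None."""
    for k, label in enumerate(labels_along_ray):
        if label != free_label:
            return k, label
    return None


def ray_potential(labels_along_ray, cost_table, sky_cost, free_label=0):
    """Cost of one ray given the voxel labels it traverses.

    cost_table[k][label]: assumed cost of seeing a surface of class `label`
    at position k along the ray (e.g. from depth and class likelihoods).
    sky_cost: assumed cost when the ray never hits an occupied voxel.
    """
    hit = first_occupied(labels_along_ray, free_label)
    if hit is None:
        return sky_cost
    position, label = hit
    return cost_table[position][label]


if __name__ == "__main__":
    table = [[0.0, 5.0, 5.0], [0.0, 4.0, 4.0], [0.0, 3.0, 1.0], [0.0, 4.0, 4.0]]
    print(ray_potential([0, 0, 2, 1], table, sky_cost=6.0))
    print(ray_potential([0, 0, 2, 0], table, sky_cost=6.0))
```

Labels behind the first occupied voxel do not change the cost, which is the information a sum of independent per-voxel weights cannot express.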

Ray Potentials [Savinov et al., 2015, 2016]

• Data given as Information along rays

• Keeping information along rays in formulation

Results: Correct Archway, Correct Roofline

Formulation [Savinov, Häne, Ladicky, Pollefeys, 2016]

Data Term: Described as ray potential

Regularization Term: Class-specific, direction dependent, surface area penalization

Local View: Describes Visible Surface in Camera

Global View: Describes Voxel Labeling

Constraints: Make Local and Global View Consistent

Formulation [Savinov, Häne, Ladicky, Pollefeys, 2016]

• Convex Relaxation Weak

• Solution: One Non-Convex Constraint

– Change of Visibility Along Ray <-> Cost Assumed

• Majorize-Minimize Optimization

Results [Savinov, Häne, Ladicky, Pollefeys, 2016]

• Thin Structures

• Semantic 3D Reconstruction

Panels: Image, Unary Potential, Ray Potential

Object Shape Priors

• Real-World Objects

– Reflective

– Transparent

– Specular

• Exploit Object-Class Specific Similarity

Example Input Image

Reconstruction without Prior

Shape Model

[Bao, Chandraker, Lin, Savarese, 2013]

Results [Bao, Chandraker, Lin, Savarese, 2013]

Panels: Image, PMVS, PSR, with Prior, Ground Truth

• Only Object Reconstructed

Normal-Based 3D Object Shape Priors [Häne, Savinov, Pollefeys, 2014]

• Reconstruction Volume aligned with object

• Surface normals locally similar between instances

Training the Prior [Häne, Savinov, Pollefeys, 2014]

• Per-voxel surface normal direction distribution

• Regularization with discretized Wulff shapes

Reconstruction Formulation [Häne, Savinov, Pollefeys, 2014]

Data Term: Described as per-voxel unary potentials

Reconstruction volume aligned

Training Data: Mesh models

Extract per-voxel Wulff shapes

Results Cars [Häne, Savinov, Pollefeys, 2014]

Alignment Transform [Maninchedda, Häne, et al. 2016]

Semantic 3D Reconstruction of Heads [Maninchedda, Häne et al. 2016]

Conclusion

• Semantics Helps to Improve Geometry

• Consistent Semantic Segmentation

• Richer Representation

• Multi-Label Shape Priors

• Future Directions:

– Real-Time Applications

– End to End Learning