54
CSC2457 3D & Geometric Deep Learning Scene Representation Networks: Continuous 3D-Structure-Aware Neural Scene Representations Vincent Sitzmann, Michael Zollhöfer and Gordon Wetzstein Feb 23 rd Presenter: Shayan Shekarforoush Instructor: Animesh Garg

CSC2457 3D & Geometric Deep Presenter: Shayan

  • Upload
    others

  • View
    5

  • Download
    0

Embed Size (px)

Citation preview

Page 1: CSC2457 3D & Geometric Deep Presenter: Shayan

CSC2457 3D & Geometric Deep Learning

Scene Representation Networks: Continuous 3D-Structure-Aware Neural Scene Representations Vincent Sitzmann, Michael Zollhofer and Gordon Wetzstein

Feb 23rd Presenter: Shayan Shekarforoush

Instructor: Animesh Garg

Page 2: CSC2457 3D & Geometric Deep Presenter: Shayan

Learning Scene Representation

• With 3D Bias:

• Or not:

Page 3: CSC2457 3D & Geometric Deep Presenter: Shayan

Applications

•Downstream tasks

Page 4: CSC2457 3D & Geometric Deep Presenter: Shayan

Challenges

3D supervision

Page 5: CSC2457 3D & Geometric Deep Presenter: Shayan

Challenges

3D supervision

Page 6: CSC2457 3D & Geometric Deep Presenter: Shayan

Challenges

3D supervision 2D image + Camera pose

Page 7: CSC2457 3D & Geometric Deep Presenter: Shayan

Challenges

Geometry Geometry + Appearance

Page 8: CSC2457 3D & Geometric Deep Presenter: Shayan

Challenges

Multi-view consistency

Page 9: CSC2457 3D & Geometric Deep Presenter: Shayan

Challenges

Voxel resolutions

Page 10: CSC2457 3D & Geometric Deep Presenter: Shayan

Challenges

Voxel resolutions

Point cloud sparsity

Page 11: CSC2457 3D & Geometric Deep Presenter: Shayan

Contributions

• A continuous, 3D structure aware, neural scene representation encoding geometry and appearance a multi-view consistent manner.• Along with a Differentiable ray marching algorithm for rendering.

• End-to-end training without explicit 3D supervision.

• Generalizable to other geometry or appearance.

• Evaluation in:• Novel view synthesis.• Few-shot reconstruction.• ...

Page 12: CSC2457 3D & Geometric Deep Presenter: Shayan

Problem Setting

Input data:

2D image:

Page 13: CSC2457 3D & Geometric Deep Presenter: Shayan

Problem Setting

Input data:

2D image:Extrinsic matrix:Intrinsic matrix:

Page 14: CSC2457 3D & Geometric Deep Presenter: Shayan

Implicit Scene Function

Page 15: CSC2457 3D & Geometric Deep Presenter: Shayan

Implicit Scene Function

0.2 -0.5 1.7 . . . 2.4 -0.8

Page 16: CSC2457 3D & Geometric Deep Presenter: Shayan

Implicit Scene Function

0.2 -0.5 1.7 . . . 2.4 -0.8

Visual

Geometry

Page 17: CSC2457 3D & Geometric Deep Presenter: Shayan

Implicit Scene Function

Page 18: CSC2457 3D & Geometric Deep Presenter: Shayan

Implicit Scene Function

Higher Resolution

Page 19: CSC2457 3D & Geometric Deep Presenter: Shayan

Neural Rendering

Neural Renderer

E, K

2D Image

Page 20: CSC2457 3D & Geometric Deep Presenter: Shayan

Neural Rendering

•Ray Marching

•Pixel Generator

Page 21: CSC2457 3D & Geometric Deep Presenter: Shayan

Ray Marching

Parametrize ray marching out of pixel (u, v):

Page 22: CSC2457 3D & Geometric Deep Presenter: Shayan

Ray Marching

Parametrize ray marching out of pixel (u, v):Intersection as optimization:

Intersection

Surface

Page 23: CSC2457 3D & Geometric Deep Presenter: Shayan

Ray Marching

Parametrize ray marching out of pixel (u, v):

Page 24: CSC2457 3D & Geometric Deep Presenter: Shayan

Ray Marching

Parametrize ray marching out of pixel (u, v):

0.2 -0.5 1.7 . . . 2.4 -0.8

Page 25: CSC2457 3D & Geometric Deep Presenter: Shayan

Ray Marching

Parametrize ray marching out of pixel (u, v):

0.2 -0.5 1.7 . . . 2.4 -0.8

LSTM

Page 26: CSC2457 3D & Geometric Deep Presenter: Shayan

Ray Marching

Parametrize ray marching out of pixel (u, v):

0.2 -0.5 1.7 . . . 2.4 -0.8

LSTM

Page 27: CSC2457 3D & Geometric Deep Presenter: Shayan

Ray Marching

Parametrize ray marching out of pixel (u, v):

Page 28: CSC2457 3D & Geometric Deep Presenter: Shayan

Pixel Generator

Per Pixel:

0.2 -0.5 1.7 . . . 2.4 -0.8

Page 29: CSC2457 3D & Geometric Deep Presenter: Shayan

Pixel Generator

Per Pixel:

0.2 -0.5 1.7 . . . 2.4 -0.8

MLP

RGB

Page 30: CSC2457 3D & Geometric Deep Presenter: Shayan

General Framework

Page 31: CSC2457 3D & Geometric Deep Presenter: Shayan

General Framework

Page 32: CSC2457 3D & Geometric Deep Presenter: Shayan

Generalization over Scenes

Page 33: CSC2457 3D & Geometric Deep Presenter: Shayan

Generalization over Scenes

Page 34: CSC2457 3D & Geometric Deep Presenter: Shayan

Generalization over Scenes

MLP Weights

Page 35: CSC2457 3D & Geometric Deep Presenter: Shayan

Generalization over Scenes

MLP Weights

Latent code

Page 36: CSC2457 3D & Geometric Deep Presenter: Shayan

Generalization over Scenes

MLP Weights

Latent code

Hypernetwork (MLP)

Page 37: CSC2457 3D & Geometric Deep Presenter: Shayan

Optimization

Joint optimization:

𝜽: Neural Renderer

Page 38: CSC2457 3D & Geometric Deep Presenter: Shayan

Optimization

Joint optimization using SGD:

Instances Viewpoints

Page 39: CSC2457 3D & Geometric Deep Presenter: Shayan

Optimization

Joint optimization using SGD:

2

L2 reconstruction loss

Page 40: CSC2457 3D & Geometric Deep Presenter: Shayan

Optimization

Joint optimization using SGD:

2

L2 reconstruction loss Positive depth

Page 41: CSC2457 3D & Geometric Deep Presenter: Shayan

Optimization

Joint optimization using SGD:

2

Gaussian PriorL2 reconstruction loss Positive depth

Page 42: CSC2457 3D & Geometric Deep Presenter: Shayan

Optimization

1st shot 2nd shot

Trained beforehand

Page 43: CSC2457 3D & Geometric Deep Presenter: Shayan

Shepard Metzler

•7 element objects•Novel view synthesis on:

• Training set• Few-shot on 100 test objects

Page 44: CSC2457 3D & Geometric Deep Presenter: Shayan

ShapeNet

•Cars and Chairs•Novel view synthesis on:

• Training set.• Few-shot on official test objects.

Page 45: CSC2457 3D & Geometric Deep Presenter: Shayan

ShapeNet

Page 46: CSC2457 3D & Geometric Deep Presenter: Shayan

ShapeNet

Page 47: CSC2457 3D & Geometric Deep Presenter: Shayan

Latent space interpolation

Page 48: CSC2457 3D & Geometric Deep Presenter: Shayan

Camera pose extrapolation

Camera zoom Camera rotation

Page 49: CSC2457 3D & Geometric Deep Presenter: Shayan

Basel face model

•Available disentangled latent:• Identity• Expression

Page 50: CSC2457 3D & Geometric Deep Presenter: Shayan

Minecraft room

Room scale scene

Page 51: CSC2457 3D & Geometric Deep Presenter: Shayan

Critique / Limitations / Open Issues

•Availability of camera pose?

•Effects of view or lighting?

• Failure cases.

Page 52: CSC2457 3D & Geometric Deep Presenter: Shayan

Critique / Limitations / Open Issues

Page 53: CSC2457 3D & Geometric Deep Presenter: Shayan

Contributions (Recap)

• A continuous, 3D structure aware, neural scene representation encoding geometry and appearance a multi-view consistent manner.• Along with a Differentiable ray marching algorithm for rendering.

• End-to-end training without explicit 3D supervision.

• Generalizable to other geometry or appearance.

• Evaluation in:• Novel view synthesis.• Few-shot reconstruction.• ...

Page 54: CSC2457 3D & Geometric Deep Presenter: Shayan

Thank you!