Data-Driven Geometry Processing: 3D Deep Learning I

Qixing Huang

March 23rd, 2017


AlexNet

Image Generation

3D Surface Representations

Triangular mesh

Point cloud

Implicit surface

Part-based models

Light Field Representation

3D Voxel Grids

Mesh → binary voxels

3D Shape as Volumetric Representation

[Wu et al. 15]
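
As a rough illustration of the volumetric representation, the sketch below turns a set of surface samples into a binary occupancy grid. It is a minimal sketch, not the procedure of [Wu et al. 15]: the function name `voxelize`, the resolution of 30, and the sphere test shape are all assumptions made for the example, and only voxels that contain a sample are marked (the interior is not filled).

```python
import numpy as np

def voxelize(points, resolution=30):
    """Map an (N, 3) array of surface samples to a binary occupancy grid.

    `resolution` is a hypothetical parameter chosen for illustration.
    """
    points = np.asarray(points, dtype=float)
    lo = points.min(axis=0)
    extent = (points.max(axis=0) - lo).max()
    if extent == 0:
        extent = 1.0  # degenerate input: all samples coincide
    # Scale uniformly so the longest side spans the grid (aspect ratio kept).
    idx = np.floor((points - lo) / extent * (resolution - 1)).astype(int)
    idx = np.clip(idx, 0, resolution - 1)
    grid = np.zeros((resolution,) * 3, dtype=bool)
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = True
    return grid

# Hypothetical input: points sampled on a unit sphere.
rng = np.random.default_rng(0)
pts = rng.normal(size=(5000, 3))
pts /= np.linalg.norm(pts, axis=1, keepdims=True)
grid = voxelize(pts, resolution=30)
print(grid.shape, int(grid.sum()), "occupied voxels")
```

Each voxel then acts as one binary variable, which is the form of input a model over voxel grids would consume.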

• Convolution to enable compositionality

• No pooling to reduce reconstruction error

Layers 1-3: convolutional RBMs

Layer 4: fully connected RBM

Layer 5: multinomial label + Bernoulli feature variables, which form an associative memory

configurations

Convolutional Deep Belief Network

A Deep Belief Network is a generative graphical model. Here it describes the joint distribution of the input x and the class label y.

[Wu et al. 15]

Convolutional Deep Belief Network

3D ShapeNets ≠ CNNs

generative process

discriminative process

[Wu et al. 15]

Layer-wise pre-training:

The lower four layers are trained with contrastive divergence (CD)

The last layer is trained with fast persistent contrastive divergence (FPCD) [1]

Fine-tuning:

Wake-sleep [2], but with the weights kept tied

Maximum Likelihood Learning

Convolutional Deep Belief Network

[Wu et al. 15]
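
To make the contrastive divergence step concrete, here is a minimal sketch of one CD-1 update for a small, fully connected binary RBM. It is an illustration under stated assumptions, not the training code of [Wu et al. 15]: the layers above are convolutional and the top layer uses FPCD, neither of which is shown, and the names `cd1_step`, `lr`, and the toy sizes are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cd1_step(W, b, c, v0, rng, lr=0.1):
    """One CD-1 update for a binary RBM.

    W: (visible, hidden) weights, b: visible bias, c: hidden bias,
    v0: (batch, visible) binary data. Returns updated (W, b, c).
    """
    # Positive phase: hidden probabilities and samples given the data.
    ph0 = sigmoid(v0 @ W + c)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    # Negative phase: one Gibbs step back to the visibles and up again.
    pv1 = sigmoid(h0 @ W.T + b)
    v1 = (rng.random(pv1.shape) < pv1).astype(float)
    ph1 = sigmoid(v1 @ W + c)
    # Approximate gradient: data statistics minus reconstruction statistics.
    n = v0.shape[0]
    W = W + lr * (v0.T @ ph0 - v1.T @ ph1) / n
    b = b + lr * (v0 - v1).mean(axis=0)
    c = c + lr * (ph0 - ph1).mean(axis=0)
    return W, b, c

# Toy usage with hypothetical sizes and random binary data.
rng = np.random.default_rng(0)
n_visible, n_hidden = 16, 8
W = 0.01 * rng.normal(size=(n_visible, n_hidden))
b = np.zeros(n_visible)
c = np.zeros(n_hidden)
data = (rng.random((32, n_visible)) < 0.3).astype(float)
for _ in range(100):
    W, b, c = cd1_step(W, b, c, data, rng)
```

Persistent variants such as FPCD differ mainly in keeping the negative-phase chain alive across updates instead of restarting it from the data each time.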

Generation process: Gibbs sampling

Convolutional Deep Belief Network

[Wu et al. 15]

Generation process: Gibbs sampling

object label

[Wu et al. 15]
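
The idea of generating by Gibbs sampling with the object label held fixed can be sketched on a single RBM whose visible units are a one-hot label followed by binary features. This is a simplified stand-in, assuming one layer and random untrained weights; the function name `sample_features_given_label` and all sizes are hypothetical, and the full model would also propagate the sample down through the lower layers.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sample_features_given_label(W, b, c, label, n_labels, n_steps, rng):
    """Block Gibbs sampling with the label units clamped to `label`.

    Visible layout (an assumption): [one-hot label | binary features].
    """
    n_visible, n_hidden = W.shape
    onehot = np.zeros(n_labels)
    onehot[label] = 1.0
    v = (rng.random(n_visible) < 0.5).astype(float)
    for _ in range(n_steps):
        v[:n_labels] = onehot  # clamp the label before each upward pass
        h = (rng.random(n_hidden) < sigmoid(v @ W + c)).astype(float)
        v = (rng.random(n_visible) < sigmoid(h @ W.T + b)).astype(float)
    return v[n_labels:]  # the sampled feature units

# Toy usage with random (untrained) weights.
rng = np.random.default_rng(0)
n_labels, n_features, n_hidden = 10, 20, 15
W = 0.1 * rng.normal(size=(n_labels + n_features, n_hidden))
b = np.zeros(n_labels + n_features)
c = np.zeros(n_hidden)
features = sample_features_given_label(
    W, b, c, label=3, n_labels=n_labels, n_steps=50, rng=rng
)
```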

Visualization at Different Layers

Convolutional Deep Belief Network

[Wu et al. 15]

Categories shown: chair, bed, desk, table, nightstand, sofa, bathtub, toilet

Convolutional Deep Belief Network

[Wu et al. 15]

3D Generative Adversarial Network [Wu et al. 16]

Sparse 3D Convolutional Networks [Ben Graham 2016]

40x40x40 grid

Sparsity for lower layers

Low resolution for upper layers
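
The trade-off between sparsity at fine resolution and density at coarse resolution can be checked numerically. The sketch below is an illustration only, not the method of [Ben Graham 2016]: it stores a thin spherical shell in a 40x40x40 grid as a list of active coordinates and measures how the active fraction changes as the grid is coarsened. The shell shape and the helper `coarsen` are assumptions made for the example.

```python
import numpy as np

res = 40
axis = np.arange(res) - (res - 1) / 2.0
x, y, z = np.meshgrid(axis, axis, axis, indexing="ij")
radius = np.sqrt(x**2 + y**2 + z**2)

# Hypothetical shape: a one-voxel-thick spherical shell of radius 15.
dense = np.abs(radius - 15.0) < 0.5
coords = np.argwhere(dense)  # sparse storage: (K, 3) active coordinates

def coarsen(coords, factor=2):
    """A coarse cell is active if any of its child voxels is active."""
    return np.unique(coords // factor, axis=0)

fractions = []
level_res, level_coords = res, coords
for _ in range(3):
    fractions.append(len(level_coords) / level_res**3)
    level_coords = coarsen(level_coords)
    level_res //= 2

for r, f in zip((40, 20, 10), fractions):
    print(f"resolution {r}: {f:.1%} of sites active")
```

At full resolution only a small fraction of sites is active, so coordinate storage pays off; after a few coarsening steps the grid is small and comparatively dense, where ordinary dense layers are adequate.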

Discussion

+ Easy to implement

+ Hardware friendly

- Low resolution

- No structural information

- Cannot utilize 2D training data

Light Field Representation

Discussion

+ Can utilize 2D training data

+ Efficient since using 2D convolutions

+ Top-performing algorithms

-- Redundancy

-- Loss of information per view

-- How to pick views?

? Convolutions on Spheres
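
One simple answer to the view-selection question is to place cameras on a ring around the object. The sketch below computes such camera positions; it is only one possible choice, and the function name `ring_cameras`, the 12 views, the 30 degree elevation, and the radius are all assumptions for illustration rather than values taken from the lecture.

```python
import numpy as np

def ring_cameras(n_views=12, elevation_deg=30.0, radius=2.0):
    """Camera centers evenly spaced in azimuth at a fixed elevation.

    The object is assumed to be centered at the origin with z pointing up.
    """
    azimuth = np.linspace(0.0, 2.0 * np.pi, n_views, endpoint=False)
    elevation = np.deg2rad(elevation_deg)
    return np.stack(
        [
            radius * np.cos(elevation) * np.cos(azimuth),
            radius * np.cos(elevation) * np.sin(azimuth),
            np.full(n_views, radius * np.sin(elevation)),
        ],
        axis=1,
    )

cams = ring_cameras()
print(cams.shape)
```

A fixed ring like this assumes a known upright direction; without it, views would need to cover the whole sphere, which adds to the redundancy noted above.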

Point Cloud Representation [Su et al. 17a, Su et al. 17b]

Object Classification on Partial Scans

Input: Partial scan (XYZ)

Output: Category classification

Example labels: Car, Car, Airplane, Table, Mug

Object Part Segmentation

Input: Point cloud (XYZ)

Output: Per-point label

Semantic Segmentation for Indoor Scenes

Input: Point cloud (XYZRGB) of a room

Output (current performance): Semantic segmentation of the room

Example labels: wall, floor, table

Uniform Framework: PointNet

Point Set

PointNet

Object Classification

Part Segmentation

Point Feature Learning

Semantic Segmentation in scenes

...
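
How one network can serve both a per-object task and a per-point task can be sketched as follows: a layer shared across points produces local features, a max over points produces an order-independent global feature, and the two are combined for per-point outputs. This is a minimal sketch with random, untrained weights and hypothetical layer sizes, assuming the shared-layer plus max-pooling design; it is not the full PointNet architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 3 input coords, 64 features, 40 classes, 4 part labels.
W_point = rng.normal(size=(3, 64))
W_cls = rng.normal(size=(64, 40))
W_seg = rng.normal(size=(128, 4))

def encode(points):
    """Shared per-point layer, then a max over points (a symmetric function)."""
    local = np.maximum(points @ W_point, 0.0)  # (N, 64), same weights per point
    global_feat = local.max(axis=0)            # (64,), independent of point order
    return local, global_feat

def classify(points):
    """Object-level scores from the global feature only."""
    _, global_feat = encode(points)
    return global_feat @ W_cls                 # (40,)

def segment(points):
    """Per-point scores from local features joined with the global feature."""
    local, global_feat = encode(points)
    tiled = np.broadcast_to(global_feat, local.shape)
    return np.concatenate([local, tiled], axis=1) @ W_seg  # (N, 4)

pts = rng.normal(size=(100, 3))
perm = rng.permutation(100)
print(np.allclose(classify(pts), classify(pts[perm])))       # order does not matter
print(np.allclose(segment(pts)[perm], segment(pts[perm])))   # labels follow points
```

Because the max is taken over the point dimension, shuffling the input leaves the classification scores unchanged, while the segmentation scores are permuted along with the points.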

ModelNet shape 40-class classification

Model                                                                                   Accuracy
MLP                                                                                     40%
LSTM                                                                                    75%
Conv-Max-FC (1 max)                                                                     84%
Conv-Max-FC (2 max)                                                                     86%
Conv-Max-FC (2 max) + Input Transform                                                   87.8%
Conv-Max-FC (2 max) + Feature Transform                                                 86.8%
Conv-Max-FC (2 max) + Feature Transform + orthogonal regularization                     87.4%
Conv-Max-FC (2 max) + Input Transform + Feature Transform + orthogonal regularization   88.9%

Best Volumetric CNN: 89.1%. However, PointNet is around 5x to 10x faster than the volumetric CNN.
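
The orthogonal regularization listed in the table is commonly written as a penalty on how far a predicted feature-transform matrix A is from orthogonal, ||I - A Aᵀ||² in the Frobenius norm. The sketch below computes that penalty; the function name `orthogonal_penalty` is hypothetical, and how the term is weighted in the training loss is not specified here.

```python
import numpy as np

def orthogonal_penalty(A):
    """Squared Frobenius norm of (I - A A^T); zero when A is orthogonal."""
    k = A.shape[0]
    return float(np.sum((np.eye(k) - A @ A.T) ** 2))

# A rotation is orthogonal, so its penalty is (numerically) zero.
theta = 0.3
rotation = np.array(
    [
        [np.cos(theta), -np.sin(theta), 0.0],
        [np.sin(theta), np.cos(theta), 0.0],
        [0.0, 0.0, 1.0],
    ]
)
print(orthogonal_penalty(rotation))

# A uniform scaling by 2 is not orthogonal and is penalized.
print(orthogonal_penalty(2.0 * np.eye(3)))
```

Keeping the transform close to orthogonal means it acts like a rotation of the feature space and does not discard information.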