Acquiring 3D Indoor Environments with Variability and Repetition

1

Acquiring 3D Indoor Environments with Variability and Repetition

Young Min Kim, Stanford University

Niloy J. Mitra, UCL / KAUST

Dong-Ming Yan, KAUST

Leonidas Guibas, Stanford University

2

Data Acquisition via Microsoft Kinect

Raw data:
• Noisy point clouds
• Unsegmented
• Occlusion issues

Our tool: Microsoft Kinect
• Real-time
• Provides depth and color
• Small and inexpensive

http://en.wikipedia.org/wiki/File:Xbox-360-Kinect-Standalone.png

3

Dealing with Pointcloud Data

• Object-level reconstruction

• Scene-level reconstruction [Chang and Zwicker 2011]

[Xiao et al. 2012]

4

Mapping Indoor Environments

• Mapping outdoor environments
  – Roads to drive vehicles
  – Flat surfaces

• General indoor environments contain both objects and flat surfaces
  – Diversity of objects of interest
  – Objects are often cluttered
  – Objects deform and move

Solution: Utilize semantic information

5

Nature of Indoor Environments

• Man-made objects can often be well-approximated by simple building blocks
  – Geometric primitives
  – Low DOF joints

• Many repeating elements
  – Chairs, desks, tables, etc.

• Relations between objects give good recognition cues

6

Indoor Scene Understanding with Pointcloud Data

• Patch-based approach

• Object-level understanding [Silberman et al. 2012]

[Koppula et al. 2011]

[Shao et al. 2012] [Nan et al. 2012]

7

Comparisons

[1] An Interactive Approach to Semantic Modeling of Indoor Scenes with an RGBD Camera
[2] A Search-Classify Approach for Cluttered Indoor Scene Understanding

              [1]               [2]                 Ours
Prior model   3D database       3D database         Learned
Deformation   Scaling           Part-based scaling  Learned
Matching      Classifier        Classifier          Geometric
Segmentation  User-assisted     Iteration           Iteration
Data          Microsoft Kinect  Mantis Vision       Microsoft Kinect

8

Contributions

• Novel approach based on a learning stage
  – The learning stage builds a model that is specific to the environment
• Build an abstract model composed of simple parts and the relationships between parts
  – Uniquely explains possible low DOF deformations
• Recognition stage can quickly acquire large-scale environments
  – About 200ms per object

9

Approach

• Learning: Build a high-level model of the repeating elements

• Recognition: Use the model and relationship to recognize the objects

[Figure labels: translational, rotational]

10

Approach

• Learning
  – Build a high-level model of the repeating elements

11

Output Model: Simple, Lightweight Abstraction

• Primitives
  – Observable faces
• Connectivity
  – Rigid
  – Rotational
  – Translational
  – Attachment
• Relationship
  – Placement information

[Figure: example models with primitives m_1, m_2, m_3, ground g, contact, and translational and rotational joints]

$M = \{m_1, m_2, m_3, l_{1,3}\}$
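A minimal sketch of how an abstract model of this kind might be held in memory. The class and field names (`Primitive`, `Joint`, `Model`, `degrees_of_freedom`) and the numeric values are illustrative assumptions, not the authors' data structures; only the idea of primitives and the four connectivity types come from the slide.

```python
from dataclasses import dataclass, field
from enum import Enum


class JointType(Enum):
    # The four connectivity types listed on the slide.
    RIGID = "rigid"
    ROTATIONAL = "rotational"
    TRANSLATIONAL = "translational"
    ATTACHMENT = "attachment"


@dataclass
class Primitive:
    # Hypothetical description of one observable face.
    name: str
    size: tuple      # (width, depth, height) in metres, assumed
    height: float    # height of the centre above the ground, assumed


@dataclass
class Joint:
    # An edge connecting the primitives at indices a and b.
    a: int
    b: int
    kind: JointType


@dataclass
class Model:
    primitives: list = field(default_factory=list)
    joints: list = field(default_factory=list)

    def degrees_of_freedom(self) -> int:
        # Assumption for this sketch: each rotational or translational
        # joint contributes one degree of freedom, other joints none.
        movable = (JointType.ROTATIONAL, JointType.TRANSLATIONAL)
        return sum(1 for j in self.joints if j.kind in movable)


# A toy chair-like model with three primitives and one joint.
chair = Model(
    primitives=[
        Primitive("seat", (0.45, 0.45, 0.05), 0.45),
        Primitive("back", (0.45, 0.05, 0.50), 0.75),
        Primitive("base", (0.60, 0.60, 0.05), 0.05),
    ],
    joints=[Joint(0, 2, JointType.ROTATIONAL)],
)
print(chair.degrees_of_freedom())  # prints 1
```

Keeping the model this small is what makes it cheap to compare against many candidate objects during recognition.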

12

Joint Matching and Fitting

• Individual segmentation
  – Group by similar normals
• Initial matching
  – Focus on large parts
  – Use size, height, relative positions
  – Keep consistent match
• Joint primitive fitting
  – Add joints if necessary
  – Incrementally complete the model
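The "group by similar normals" step could be sketched as a greedy clustering of per-point normals. This is an assumed simplification for illustration: the function name `group_by_normals` and the 15-degree threshold are hypothetical, and a real segmentation would also use spatial connectivity, which is ignored here.

```python
import numpy as np


def group_by_normals(normals, angle_deg=15.0):
    """Assign each normal to the first group whose representative normal
    lies within angle_deg of it; otherwise start a new group."""
    normals = np.asarray(normals, dtype=float)
    cos_thresh = np.cos(np.radians(angle_deg))
    representatives = []
    labels = np.empty(len(normals), dtype=int)
    for i, n in enumerate(normals):
        n = n / np.linalg.norm(n)  # work with unit normals
        for g, rep in enumerate(representatives):
            if np.dot(n, rep) >= cos_thresh:
                labels[i] = g
                break
        else:
            # No existing group is close enough: open a new one.
            representatives.append(n)
            labels[i] = len(representatives) - 1
    return labels


normals = [[0, 0, 1], [0, 0.05, 1], [1, 0, 0], [0.99, 0.05, 0]]
print(group_by_normals(normals))  # prints [0 0 1 1]
```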

13

Approach


14

Approach


• Recognition
  – Use the model and relationship to recognize the objects

15

Hierarchy

• Ground plane and desk
• Objects
  – Isolated clusters
• Parts
  – Group by normals

• The segmentation is approximate and to be corrected later

$S = \{o_1, o_2, \ldots\}$

$o_i = \{p_1, p_2, \ldots\}$
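One way to obtain "isolated clusters" is plain Euclidean clustering: points closer than a radius are connected, and each connected component becomes an object. The sketch below assumes the ground plane and desk points have already been removed. The function name `isolated_clusters`, the 5 cm radius, and the brute-force distance computation are illustrative choices, not details from the talk.

```python
from collections import deque

import numpy as np


def isolated_clusters(points, radius=0.05):
    """Label connected components of the graph that links any two points
    whose distance is at most radius (brute force, O(n^2))."""
    points = np.asarray(points, dtype=float)
    labels = np.full(len(points), -1, dtype=int)
    current = 0
    for seed in range(len(points)):
        if labels[seed] != -1:
            continue  # already part of an earlier cluster
        labels[seed] = current
        queue = deque([seed])
        while queue:
            i = queue.popleft()
            dist = np.linalg.norm(points - points[i], axis=1)
            # Grow the cluster with unlabeled neighbours of point i.
            for j in np.nonzero((dist <= radius) & (labels == -1))[0]:
                labels[j] = current
                queue.append(j)
        current += 1
    return labels


pts = [[0, 0, 0], [0.03, 0, 0], [0.06, 0, 0], [1, 0, 0], [1.02, 0, 0]]
print(isolated_clusters(pts))  # prints [0 0 0 1 1]
```

For real scans a spatial index such as a k-d tree would replace the brute-force distance computation.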

16

Bottom-Up Approach

• Initial assignment for parts vs. primitives
  – Simple comparison of height, normal, size
  – Robust to deformation
  – Low false-negatives
• Refined assignment for objects vs. models
  – Iteratively solve for position, deformation and segmentation
  – Low false-positives
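The initial assignment can be pictured as a deliberately loose filter: a part is kept as a candidate for every primitive whose height, normal, and size are roughly compatible, so that few true matches are lost before the refinement step. The dictionary keys, thresholds, and function name below are assumptions made for this sketch, not values from the talk.

```python
import numpy as np


def candidate_primitives(part, primitives, height_tol=0.15,
                         angle_deg=30.0, size_slack=0.2):
    """Return indices of primitives that a part could plausibly match."""
    cos_thresh = np.cos(np.radians(angle_deg))
    matches = []
    for idx, prim in enumerate(primitives):
        height_ok = abs(part["height"] - prim["height"]) <= height_tol
        # Compare directions of unit normals, ignoring their sign.
        normal_ok = abs(np.dot(part["normal"], prim["normal"])) >= cos_thresh
        # An occluded part may be smaller than the primitive,
        # but it should not be much larger.
        size_ok = part["area"] <= (1.0 + size_slack) * prim["area"]
        if height_ok and normal_ok and size_ok:
            matches.append(idx)
    return matches


primitives = [
    {"height": 0.45, "normal": np.array([0.0, 0.0, 1.0]), "area": 0.20},  # seat
    {"height": 0.75, "normal": np.array([0.0, 1.0, 0.0]), "area": 0.22},  # back
]
part = {"height": 0.47, "normal": np.array([0.0, 0.1, 0.995]), "area": 0.15}
print(candidate_primitives(part, primitives))  # prints [0]
```

Loose tolerances trade extra candidates for fewer missed matches; the refined assignment is then responsible for rejecting the false positives.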

17

Bottom-Up Approach

• Initial assignment for parts vs. primitive nodes
• Refined assignment for objects vs. models

[Figure panels: Input points; Initial objects; Parts matched; Models matched; Refined objects]

18

Results

Data available:
http://www0.cs.ucl.ac.uk/staff/n.mitra/research/acquire_indoor/paper_docs/data_learning.zip
http://www0.cs.ucl.ac.uk/staff/n.mitra/research/acquire_indoor/paper_docs/data_recognition.zip

19

Synthetic Scene

Recognition speed: about 200ms per object

20

Synthetic Scene

21

Synthetic Scene

22

[Plot: Data type. Axes: Precision, Recall. Series: Gaussian 0.004, Gaussian 0.3, Gaussian 1.0, each for a different pair and a similar pair]
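For reference, the precision and recall reported in these plots follow the standard definitions. The short sketch below uses hypothetical counts rather than numbers from the experiments.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall


# Hypothetical example: 8 objects recognized correctly,
# 2 spurious detections, 4 objects missed.
p, r = precision_recall(8, 2, 4)
print(p, r)
```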

23

[Plot: Data type. Axes: Precision, Recall. Series: Different pair, Similar pair]

24

[Plot: Noise. Axes: Precision, Recall]

[Plot: Density. Axes: Precision, Recall. Series: density 0.4, 0.5, 0.6, 0.7, 0.8]

25

Office 1

trash bin

4 chairs

2 monitors

2 whiteboards

26

Office 2

27

Office 3

28

Deformations

drawer deformations

monitor

laptop

missed monitor

chair

29

Auditorium 1

Open table

30

Auditorium 2

Open table

Open chairs

31

Seminar Room 1

missed chairs

32

Seminar Room 2

missed chairs

33

Limitations

• Missing data
  – Occlusion, material, …
• Error in initial segmentation
  – Cluttered objects are merged into a single segment
  – The viewpoint sometimes separates a single object into pieces

34

Conclusion

• We present a system that can recognize repeating objects in cluttered 3D indoor environments.

• We used a purely geometric approach based on learned attributes and deformation modes.

• The recognized objects provide high-level scene understanding and can be replaced with high-quality CAD models for visualization (as shown in the previous talks!)

35

Thank You
• Qualcomm Corporation
• Max Planck Center for Visual Computing and Communications
• NSF grants 0914833 and 1011228
• a KAUST AEA grant
• Marie Curie Career Integration Grant 303541
• Stanford Bio-X Travel Subsidy
