LSD-SLAM: Large-Scale Direct Monocular SLAM
Jakob Engel, Thomas Schöps, Daniel Cremers
Computer Vision Group, Technical University of Munich

Computer Vision Group Technical University of Munich Jakob …pdfs.semanticscholar.org/c13c/b6dfd26a1b545d50d05b52c99... · 2018-11-28 · Technical University Munich Monocular Video


Page 1 (slide 1): LSD-SLAM: Large-Scale Direct Monocular SLAM

Jakob Engel, Thomas Schöps, Daniel Cremers
Computer Vision Group, Technical University of Munich

Monocular Video → Camera Motion and Scene Geometry

Page 2 (slide 2): Live Operation

Real-time operation on a laptop (no GPU)

Page 3 (slide 3): (Some) Related Work

DTAM: Dense Tracking and Mapping in Real-Time. Newcombe, Lovegrove, Davison; ICCV '11

MonoSLAM: Real-time single camera SLAM. Davison, Reid, Molton, Stasse; PAMI '07

Structure from Motion Causally Integrated Over Time. Chiuso, Favaro, Jin, Soatto; PAMI '02

SVO: Fast Semi-Direct Monocular Visual Odometry. Forster, Pizzoli, Scaramuzza; ICRA '14

Parallel Tracking and Mapping for Small AR Workspaces. Klein, Murray; ISMAR '07

Scalable monocular SLAM. Eade, Drummond; CVPR '06

Visual Odometry. Nistér, Naroditsky, Bergen; CVPR '04

Scale Drift-Aware Large Scale Monocular SLAM. Strasdat, Montiel, Davison; RSS '10

Page 4 (slide 4): LSD-SLAM: what's new?

Keypoint-Based
• Input Images
• Extract & Match Features (SIFT / SURF / BRIEF / ...): abstract images to feature observations
• Track: min. reprojection error (point distances)
• Map: est. feature-parameters (3D points / normals)

Direct (LSD-SLAM)
• Input Images: keep full image
• Track: min. photometric error (intensity difference)
• Map: est. per-pixel depth (semi-dense depth map)
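The two columns above differ mainly in the residual that tracking minimizes. The following is a minimal sketch of both residuals, not code from LSD-SLAM: the function names, the pinhole intrinsics tuple and the nested-list image layout are assumptions made for illustration.

```python
import math

def project(point, fx, fy, cx, cy):
    """Pinhole projection of a 3D point (camera coordinates) to pixel coordinates."""
    x, y, z = point
    return (fx * x / z + cx, fy * y / z + cy)

def reprojection_residual(point_3d, observed_px, intrinsics):
    """Keypoint-based error: pixel distance between the projected 3D point
    and the position where the matched feature was observed."""
    u, v = project(point_3d, *intrinsics)
    return math.hypot(u - observed_px[0], v - observed_px[1])

def photometric_residual(ref_image, new_image, ref_px, warped_px):
    """Direct error: intensity difference between a reference pixel and the
    pixel it is warped to in the new image (images indexed as image[row][col])."""
    (u_ref, v_ref), (u_new, v_new) = ref_px, warped_px
    return ref_image[v_ref][u_ref] - new_image[v_new][u_new]

intrinsics = (100.0, 100.0, 50.0, 50.0)   # hypothetical fx, fy, cx, cy
print(reprojection_residual((0.1, -0.2, 2.0), (58.0, 44.0), intrinsics))

ref = [[10, 20], [30, 40]]                # toy 2x2 intensity images
new = [[12, 20], [30, 40]]
print(photometric_residual(ref, new, (0, 0), (0, 0)))
```

The reprojection residual needs a feature match (the observed pixel position), whereas the photometric residual only needs the two images and a warp, which is why the direct pipeline can skip feature extraction.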

Page 5 (slide 5): ...and why do that?

Keypoint-based: can only use & reconstruct corners
Direct: can use & reconstruct whole image

Pages 6-7 (slides 6-7): Overview

• Input Video: 640x480 @ 30Hz
• Tracking: SE(3) alignment to current KF
• Depth Estimation
• Current KF
• Add to Map
• Map Optimization

Pages 8-11 (slide 8): Tracking

Figure panels: KF image, KF depth, back-warped new frame

Camera pose in SE(3)

Minimize using the Gauss-Newton algorithm (≈ forward-compositional Lucas-Kanade)
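To make the Gauss-Newton step concrete, here is a one-dimensional toy sketch, assuming a single scalar shift in place of the SE(3) camera pose and analytic signals in place of images. It is an illustration of minimizing a photometric error with Gauss-Newton, not the LSD-SLAM implementation; all names and the value of `TRUE_SHIFT` are hypothetical.

```python
import math

def gauss_newton_shift(ref, new, d_new, samples, shift=0.0, iterations=10):
    """Estimate the shift s that minimizes sum_x (ref(x) - new(x + s))^2."""
    for _ in range(iterations):
        residuals = [ref(x) - new(x + shift) for x in samples]
        jacobian = [-d_new(x + shift) for x in samples]   # d residual / d shift
        jtj = sum(j * j for j in jacobian)
        jtr = sum(j * r for j, r in zip(jacobian, residuals))
        if jtj == 0.0:
            break                                         # no gradient information
        shift -= jtr / jtj                                # Gauss-Newton update
    return shift

TRUE_SHIFT = 0.3   # hypothetical ground truth of the toy problem

def ref(x):        # stands in for the keyframe image
    return math.exp(-x * x)

def new(x):        # stands in for the new frame: same signal, moved by TRUE_SHIFT
    return ref(x - TRUE_SHIFT)

def d_new(x):      # analytic derivative of the new signal
    return -2.0 * (x - TRUE_SHIFT) * new(x)

samples = [-2.0 + 0.1 * i for i in range(41)]
estimate = gauss_newton_shift(ref, new, d_new, samples)
print(estimate)
```

Only samples where the signal has gradient contribute to the update, which mirrors why the direct method restricts itself to pixels with sufficient image gradient.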

Page 12 (slide 9): Tracking

• multi-resolution (track large motions)
• Huber norm instead of L2 (outliers & occlusions)
• statistical normalization (respect depth- and pixel-noise)

Single-core timings:
• 320x240: 5-10 ms
• 640x480: 20-30 ms
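The effect of replacing L2 with the Huber norm can be sketched on a scalar problem, assuming iteratively reweighted least squares as the solver. The data, the threshold `delta` and the function names are hypothetical; in tracking the same weights would multiply the per-pixel residuals inside each Gauss-Newton iteration.

```python
def huber_weight(residual, delta):
    """IRLS weight of the Huber norm: 1 inside [-delta, delta], delta/|r| outside."""
    magnitude = abs(residual)
    return 1.0 if magnitude <= delta else delta / magnitude

def robust_location(values, delta, iterations=20):
    """Minimize sum_i huber(values[i] - c) over c by iteratively reweighted least squares."""
    estimate = sum(values) / len(values)          # plain L2 solution as the start
    for _ in range(iterations):
        weights = [huber_weight(v - estimate, delta) for v in values]
        estimate = sum(w * v for w, v in zip(weights, values)) / sum(weights)
    return estimate

values = [1.0, 1.1, 0.9, 1.0, 10.0]               # the last entry plays the outlier
l2_estimate = sum(values) / len(values)
huber_estimate = robust_location(values, delta=0.5)
print(l2_estimate, huber_estimate)
```

The L2 estimate is pulled far towards the outlier, while the Huber estimate stays close to the four consistent values because large residuals only contribute linearly.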


Page 14 (slide 11): Overview

• Input Video: 640x480 @ 30Hz
• Tracking: SE(3) alignment to current KF
• Depth Estimation: Take KF? If yes, create new KF; otherwise refine KF
• Current KF
• Add to Map
• Map Optimization
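One plausible way to implement the "Take KF?" test is a threshold on how far the camera has moved from the current keyframe. The sketch below is an assumption, not the authors' code: the weights, the threshold and the scaling of the translation by the keyframe's mean inverse depth are hypothetical choices that make the test relative to scene depth.

```python
def keyframe_distance(translation, rotation_angle, w_trans=1.0, w_rot=1.0):
    """Weighted squared distance between the current frame and the keyframe."""
    t_squared = sum(c * c for c in translation)
    return w_trans * t_squared + w_rot * rotation_angle ** 2

def take_new_keyframe(translation, rotation_angle, mean_inverse_depth, threshold=0.05):
    """Return True if the camera moved far enough to create a new keyframe."""
    # Scale the translation so that the same motion counts less in a distant scene.
    scaled = [c * mean_inverse_depth for c in translation]
    return keyframe_distance(scaled, rotation_angle) > threshold

print(take_new_keyframe((0.5, 0.0, 0.0), 0.0, mean_inverse_depth=1.0))   # near scene
print(take_new_keyframe((0.5, 0.0, 0.0), 0.0, mean_inverse_depth=0.1))   # distant scene
```

When the test fails, the current keyframe is kept and the new frame is used to refine its depth map instead.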

Page 15 (slide 12): Depth Estimation

• pixelwise filtering (exploit video): small baseline → large baseline
• information selection: "only do stereo if sufficient information gain"
• edge-preserving smoothing
• distance-based KF selection

Figure panels: image, inverse depth, inverse depth variance

[Engel, Sturm, Cremers; ICCV '13]
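Pixelwise filtering can be sketched as fusing Gaussian inverse-depth estimates, assuming each pixel stores a mean and a variance and each new stereo measurement arrives with its own variance. The numbers and names below are hypothetical; the update is the standard product of two Gaussians.

```python
def fuse(prior, observation):
    """Fuse two Gaussian inverse-depth estimates given as (mean, variance)."""
    (d_prior, var_prior), (d_obs, var_obs) = prior, observation
    total = var_prior + var_obs
    mean = (var_obs * d_prior + var_prior * d_obs) / total
    variance = var_prior * var_obs / total
    return mean, variance

estimate = (0.5, 0.04)    # hypothetical initial inverse depth and variance of one pixel
for observation in [(0.6, 0.01), (0.58, 0.01), (0.59, 0.01)]:
    estimate = fuse(estimate, observation)
    print(estimate)
```

Each fused measurement shrinks the variance, so many noisy small-baseline measurements accumulate into a confident estimate, and the stored variance is what the statistical normalization in tracking can draw on.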


Page 18 (slide 15): Overview

• Input Video: 640x480 @ 30Hz
• Tracking: SE(3) alignment to current KF
• Depth Estimation: Take KF? If yes, create new KF; otherwise refine KF
• Current KF
• Add to Map: Sim(3) alignment to all nearby KFs
  Optional: FabMap for large loops
• Map Optimization: Sim(3) pose-graph
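Sim(3) extends a rigid-body pose by a scale factor, which lets constraints between keyframes absorb the scale drift inherent to monocular SLAM. A minimal sketch of applying and composing such transforms, assuming a (scale, rotation matrix, translation) tuple representation; the helper names and example values are hypothetical.

```python
import math

def sim3_apply(transform, point):
    """Apply a similarity transform: p' = s * R * p + t."""
    scale, rotation, translation = transform
    return tuple(
        scale * sum(rotation[i][k] * point[k] for k in range(3)) + translation[i]
        for i in range(3)
    )

def sim3_compose(a, b):
    """Return the transform equivalent to applying b first, then a."""
    s_a, r_a, _ = a
    s_b, r_b, t_b = b
    rotation = [[sum(r_a[i][k] * r_b[k][j] for k in range(3)) for j in range(3)]
                for i in range(3)]
    translation = sim3_apply(a, t_b)          # s_a * R_a * t_b + t_a
    return (s_a * s_b, rotation, translation)

def rot_z(angle):
    """Rotation matrix about the z axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

a = (1.1, rot_z(0.2), (0.5, 0.0, 0.0))
b = (0.9, rot_z(-0.1), (0.0, 0.3, 0.1))
p = (1.0, 2.0, 3.0)
print(sim3_apply(sim3_compose(a, b), p))
print(sim3_apply(a, sim3_apply(b, p)))
```

Both printed points agree up to rounding, and the composed scale is the product of the two scales, which is how scale changes accumulate along a chain of keyframe-to-keyframe constraints.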

Pages 19-23 (slide 16): Global Mapping

Direct Tracking with scale (on Sim(3)), defined via the warped point
+ GN optimization + multi-resolution + Huber norm + statistical norm.

Optimize pose-graph on Sim(3)
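For reference, a sketch of the Sim(3) alignment objective in the form given in the LSD-SLAM paper (Engel, Schöps, Cremers; ECCV '14). The symbols follow that paper and are assumptions with respect to this slide: $I_i, D_i$ are the image and inverse depth map of keyframe $i$, $\Omega_{D_i}$ is the set of pixels with a depth estimate, $\omega_s$ is the warp with scale, and $\|\cdot\|_\delta$ is the Huber norm.

```latex
\begin{aligned}
E(\xi_{ji}) &= \sum_{p \in \Omega_{D_i}}
  \left\| \frac{r_p^2(p, \xi_{ji})}{\sigma_{r_p}^2}
        + \frac{r_d^2(p, \xi_{ji})}{\sigma_{r_d}^2} \right\|_\delta,
  \qquad \xi_{ji} \in \mathfrak{sim}(3) \\
r_p(p, \xi_{ji}) &= I_i(p) - I_j\!\left([p']_{1,2}\right) \\
r_d(p, \xi_{ji}) &= [p']_3 - D_j\!\left([p']_{1,2}\right) \\
p' &= \omega_s\!\left(p, D_i(p), \xi_{ji}\right) \quad \text{(warped point)}
\end{aligned}
```

The depth residual $r_d$ is what distinguishes this from SE(3) tracking: with a photometric term alone the scale between two keyframes would not be observable, whereas comparing the two depth maps constrains it.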

Page 24 (slide 17): Global Mapping


Page 26 (slide 19): Results

6 minutes, 640x480 @ 50 fps: 16,000 tracked frames; 800 keyframes; 11,000 constraints; 51 million points

Page 27 (slide 20): Results

12 minutes, 640x480 @ 50 fps: 36,000 tracked frames; 1,000 keyframes; 18,000 constraints; 100 million points

Page 28 (slide 21): Results

Semi-Dense Visual Odometry for AR on a Smartphone. T. Schöps, J. Engel, D. Cremers; ISMAR '14

Pages 29-30 (slide 22): Key Ingredients

Direct Tracking

Semi-Dense Stereo
• filter over many small-baseline frames
• strict information selection

Pose-Graph on Sim(3)

Page 31 (slide 23): LSD-SLAM

• Large-scale direct mono-SLAM
• Fully direct (no keypoints / features)
• Real-time even on CPU
• Open-source code & data-sets