Direct Real-Time SLAM
Jakob Engel, Daniel Cremers, David Caruso, Thomas Schöps, Vladyslav Usenko, Jörg Stückler, Jürgen Sturm
Computer Vision Group, Technical University of Munich

Computer Vision Group Technical University of Munich Jakob ...labrococo/tutorial_icra_2016/... · 18 Dense vs. Sparse Not synonymous with direct vs. indirect! Dense+Indirect: Compute


What is "direct"?

Maximum Likelihood: Find model parameters X that maximise the probability of observing Y.

• Model parameters X: camera poses, camera intrinsics, 3D geometry, ...
• Observed data Y: the measurements.
• Probabilistic model P(Y | X): the likelihood.

Direct: Observations Y are pixel intensities. => P models (photometric) noise on pixel intensities.

Indirect: Observations Y are 2D point positions. => P models (geometric) noise on point positions.
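As a compact restatement of the slide, maximum likelihood estimation can be written as follows. The symbol $X^{*}$ for the estimate is notation introduced here, not taken from the slide.

```latex
X^{*} = \arg\max_{X} P(Y \mid X) = \arg\min_{X} \bigl( -\log P(Y \mid X) \bigr)
```

The negative log-likelihood on the right is the quantity that is treated as an energy to be minimized.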


A simple example

Direct 2-frame Image Alignment (with known depth)

[1] Robust Odometry Estimation for RGB-D Cameras (C. Kerl, J. Sturm, D. Cremers), in ICRA 2013.
[2] Real-Time Visual Odometry from Dense RGB-D Images (F. Steinbruecker, J. Sturm, D. Cremers), in Workshop at ICCV 2011.
[3] Accurate quadri-focal tracking for robust 3d visual odometry (A. Comport, E. Malis, P. Rives), in ICRA 2007.
[4] Lucas-Kanade 20 Years On: A Unifying Framework (S. Baker and I. Matthews), in IJCV 2004.
[5] ... and many, many more.


A simple example

Noisy observations:

Model assumptions:
• brightness constancy
• Gaussian white noise on pixels

=> Photometric residuals:

=> Negative log-likelihood (energy):
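One standard way to write these quantities is sketched below. The notation is assumed here and may differ from the slide: $I_{\text{ref}}$ and $I_{\text{new}}$ are the two images, $D_{\text{ref}}$ is the reference depth, $\xi$ is the camera pose, $\omega$ is the warp, and $\sigma$ is the noise standard deviation.

```latex
% Noisy observations under brightness constancy, for each valid reference pixel p:
I_{\text{new}}\bigl(\omega(\mathbf{p}, D_{\text{ref}}(\mathbf{p}), \xi)\bigr)
  = I_{\text{ref}}(\mathbf{p}) + \epsilon_{\mathbf{p}},
\qquad \epsilon_{\mathbf{p}} \sim \mathcal{N}(0, \sigma^{2})

% Photometric residual:
r_{\mathbf{p}}(\xi) = I_{\text{ref}}(\mathbf{p})
  - I_{\text{new}}\bigl(\omega(\mathbf{p}, D_{\text{ref}}(\mathbf{p}), \xi)\bigr)

% Negative log-likelihood, up to an additive constant and the factor 1/(2 sigma^2):
E(\xi) = \sum_{\mathbf{p}} r_{\mathbf{p}}(\xi)^{2}
```

With Gaussian white noise, maximizing the likelihood is therefore the same as minimizing a sum of squared photometric residuals.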


A simple example

Direct minimization of photometric error. The energy is built from:
• a sum over valid ref. pixels,
• the ref. image and the new image,
• the camera pose and the ref. depth,
• a function that "warps" a pixel from the ref. image to the new image.
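The warp can be sketched for a pinhole camera as back-projection, rigid-body transform, and projection. This is a minimal illustration under assumed conventions (the function name `warp`, a 3x3 intrinsic matrix `K`, and a pose given as rotation `R` plus translation `t`), not the exact parameterization used on the slide.

```python
import numpy as np

def warp(pixel, depth, K, R, t):
    """Warp a reference pixel with known depth into the new image.

    pixel: (u, v) in the reference image; depth: its depth in the reference frame.
    K: 3x3 pinhole intrinsics; R, t: rotation and translation that take points
    from the reference camera frame to the new camera frame.
    Returns the pixel in the new image and the 3D point before projection.
    """
    u, v = pixel
    p_ref = depth * (np.linalg.inv(K) @ np.array([u, v, 1.0]))  # back-project to 3D
    p_new = R @ p_ref + t                                        # move to the new frame
    uvw = K @ p_new                                              # project
    return uvw[:2] / uvw[2], p_new

K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
same, _ = warp((100.0, 50.0), 2.0, K, np.eye(3), np.zeros(3))
shifted, _ = warp((100.0, 50.0), 2.0, K, np.eye(3), np.array([0.1, 0.0, 0.0]))
```

With the identity pose the pixel maps to itself; a sideways translation of 0.1 at depth 2 moves it by fx * 0.1 / 2 = 25 pixels.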


A simple example

• Minimized using the Gauss-Newton / LM algorithm (we use left-multiplicative increments on SE(3)).
• In these slides, pose vectors represent elements of se(3) (internally, they are stored and multiplied as SE(3) elements (quaternion + position)).
• Define pose multiplication & inversion directly on this:
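A common way to define these two operations goes through the exponential and logarithm maps, as in the LSD-SLAM paper. The operator symbol $\circ$ and the exact form below are assumptions about what the slide showed.

```latex
\xi_{a} \circ \xi_{b} := \log\bigl(\exp(\xi_{a}) \, \exp(\xi_{b})\bigr),
\qquad
\xi^{-1} := \log\bigl(\exp(\xi)^{-1}\bigr) = -\xi
```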


A simple example

1. Linearize w.r.t. a left-multiplied increment to the pose:
2. Solve for the increment:
3. Apply the increment:
4. Iterate (until convergence).

Exactly the same as in Giorgio's lecture.
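The four steps can be sketched with a Gauss-Newton loop that uses left-multiplied SE(3) increments. To keep the example short and checkable, it aligns two 3D point sets (a geometric residual standing in for the photometric one); the function names and the (translation, rotation) ordering of the 6-vector are assumptions made here.

```python
import numpy as np

def hat(w):
    """Skew-symmetric matrix with hat(w) @ p == np.cross(w, p)."""
    return np.array([[0.0, -w[2], w[1]], [w[2], 0.0, -w[0]], [-w[1], w[0], 0.0]])

def se3_exp(xi):
    """Exponential map: xi = (translation part, rotation part) to a 4x4 matrix."""
    v, w = xi[:3], xi[3:]
    theta = np.linalg.norm(w)
    W = hat(w)
    if theta < 1e-8:  # first-order approximation near zero rotation
        R, V = np.eye(3) + W, np.eye(3) + 0.5 * W
    else:
        A = np.sin(theta) / theta
        B = (1.0 - np.cos(theta)) / theta**2
        C = (theta - np.sin(theta)) / theta**3
        R = np.eye(3) + A * W + B * (W @ W)
        V = np.eye(3) + B * W + C * (W @ W)
    T = np.eye(4)
    T[:3, :3], T[:3, 3] = R, V @ v
    return T

def gauss_newton_align(points, targets, iters=20):
    """Find T such that T * points matches targets."""
    T = np.eye(4)
    for _ in range(iters):
        warped = points @ T[:3, :3].T + T[:3, 3]
        r = (warped - targets).ravel()
        # 1. Linearize w.r.t. a left-multiplied increment: d(exp(d) p)/dd = [I | -hat(p)]
        J = np.vstack([np.hstack([np.eye(3), -hat(p)]) for p in warped])
        # 2. Solve the normal equations for the increment
        delta = np.linalg.solve(J.T @ J, -J.T @ r)
        # 3. Apply the increment on the left
        T = se3_exp(delta) @ T
        # 4. Iterate until the increment is negligible
        if np.linalg.norm(delta) < 1e-10:
            break
    return T
```

In the photometric case only the residual and its Jacobian change; the solve and the left-multiplied update stay the same.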


A simple example

Requires gradient of residual:

with
• the warped point (before projection),
• the intrinsic camera calibration,
• the image gradients.

Each residual gradient is a 1x6 row of the Jacobian.
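The chain rule behind this row can be sketched as image gradient times projection derivative times the derivative of the warped point w.r.t. the increment. The sketch assumes a pinhole camera, a residual of the form I_ref - I_new, and a (translation, rotation) ordering of the left-multiplied increment; all names are illustrative.

```python
import numpy as np

def hat(w):
    """Skew-symmetric matrix with hat(w) @ p == np.cross(w, p)."""
    return np.array([[0.0, -w[2], w[1]], [w[2], 0.0, -w[0]], [-w[1], w[0], 0.0]])

def photometric_jacobian_row(p_warped, image_gradient, K):
    """1x6 derivative of r = I_ref(p) - I_new(project(exp(delta) * p_warped)) at delta = 0.

    p_warped: 3D point in the new camera frame (before projection).
    image_gradient: (dI/du, dI/dv) of the new image at the projected pixel.
    K: 3x3 pinhole intrinsics.
    """
    x, y, z = p_warped
    fx, fy = K[0, 0], K[1, 1]
    # Derivative of the pinhole projection w.r.t. the 3D point (2x3)
    d_proj = np.array([[fx / z, 0.0, -fx * x / z**2],
                       [0.0, fy / z, -fy * y / z**2]])
    # Derivative of exp(delta) * p w.r.t. delta at delta = 0 (3x6)
    d_point = np.hstack([np.eye(3), -hat(p_warped)])
    # Minus sign because the new image enters the residual with a minus
    return -(np.asarray(image_gradient, dtype=float) @ d_proj @ d_point)

K = np.array([[100.0, 0.0, 0.0], [0.0, 100.0, 0.0], [0.0, 0.0, 1.0]])
row = photometric_jacobian_row(np.array([0.0, 0.0, 2.0]), (1.0, 0.0), K)
```

For a point on the optical axis at depth 2, a unit horizontal image gradient gives a sensitivity of fx / z = 50 to x-translation and fx = 100 to rotation about the y-axis, both negated by the residual convention.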


[1] Odometry from RGB-D Cameras for Autonomous Quadrocopters (C. Kerl), Master's thesis, TUM, 2012


A simple example

Coarse-to-fine: Minimize on down-scaled images first, and use the result as initialization for the next finer level.

• Makes it faster.
• Greatly increases the convergence radius.

[Figure: reference frame; axis: rotation around x (degrees)]
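Building the pyramid for coarse-to-fine alignment can be sketched as below. The 2x2 box filter and the half-pixel correction of the principal point (pixel centers at integer coordinates) are assumptions made for this sketch; other conventions exist.

```python
import numpy as np

def downscale_image(img):
    """Halve the resolution by averaging 2x2 blocks (odd borders are cropped)."""
    h, w = img.shape
    img = img[:h - h % 2, :w - w % 2]
    return 0.25 * (img[0::2, 0::2] + img[1::2, 0::2] + img[0::2, 1::2] + img[1::2, 1::2])

def downscale_K(K):
    """Intrinsics for the half-resolution image, assuming integer pixel centers."""
    K2 = K.astype(float).copy()
    K2[0, 0] *= 0.5
    K2[1, 1] *= 0.5
    K2[0, 2] = (K[0, 2] + 0.5) * 0.5 - 0.5
    K2[1, 2] = (K[1, 2] + 0.5) * 0.5 - 0.5
    return K2

def build_pyramid(img, K, levels=4):
    """Return [(image, K)] with level 0 at full resolution."""
    pyramid = [(img.astype(float), K.astype(float))]
    for _ in range(1, levels):
        prev_img, prev_K = pyramid[-1]
        pyramid.append((downscale_image(prev_img), downscale_K(prev_K)))
    return pyramid
```

Alignment then starts at the coarsest level and passes its pose estimate down to each finer level.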


A simple example

• Computation time (tracking only) is linear in the number of pixels.
• Accuracy is still relatively good at very low resolutions (120x160). This is useful to quickly try different loop-closure possibilities, different initializations, ...

[1] Large-Scale Direct SLAM with Stereo Cameras (J. Engel, J. Stückler, D. Cremers), IROS 2015.


A simple example

Final algorithm:

k = 0
for level = 3 ... 0
    compute down-scaled images & depth maps (factor = 2^level)
    compute down-scaled K (factor = 2^level)
    for i = 1..20
        compute Jacobian
        compute update
        apply update
        k++; maybe break early if the update is too small or if the residual increased
    done
done

+ robust weights (e.g. Huber), see iteratively reweighted least squares
+ Levenberg-Marquardt (LM) algorithm
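The robust-weights extension can be sketched as one iteratively reweighted Gauss-Newton step with Huber weights. The function names and the threshold parameter `delta` are illustrative choices, not values from the slides.

```python
import numpy as np

def huber_weights(r, delta):
    """IRLS weights for the Huber loss: 1 inside the threshold, delta/|r| outside."""
    a = np.abs(r)
    return np.where(a <= delta, 1.0, delta / np.maximum(a, 1e-12))

def weighted_gn_step(J, r, delta):
    """Solve the weighted normal equations (J^T W J) x = -J^T W r for the update x."""
    w = huber_weights(r, delta)
    H = J.T @ (w[:, None] * J)
    b = J.T @ (w * r)
    return np.linalg.solve(H, -b)
```

Weights are recomputed from the current residuals in every iteration, so large residuals (likely outliers) pull less on the update. An LM variant would add a damping term to the diagonal of `H` before solving.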


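Pose-Graph Integration

Incorporation into a pose-graph framework: the tracking energy is approximated by its second-order expansion in the left-multiplied increment around the tracked pose. At the minimum (b = 0), the energy is hence approximated as a purely quadratic form in that increment.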
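A sketch of the expansion referred to here, under assumed notation: $\delta$ is a left-multiplied increment around the tracked pose $\xi^{*}$, $r$ stacks the residuals at $\xi^{*}$, and $J$ is their Jacobian w.r.t. $\delta$.

```latex
% Gauss-Newton model of the energy around the tracked pose:
E(\delta \circ \xi^{*}) \approx \lVert r + J \delta \rVert^{2}
  = r^{\top} r + 2\, b^{\top} \delta + \delta^{\top} H \delta,
\qquad H = J^{\top} J, \quad b = J^{\top} r

% At the minimum, b = 0, which leaves
E(\delta \circ \xi^{*}) \approx E(\xi^{*}) + \delta^{\top} H \delta
```

Under this reading, $H$ plays the role of an information matrix for the relative-pose constraint that tracking contributes to the pose graph.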


Pose-Graph Integration

Incorporation into a pose-graph framework.

Pose-graph minimization: Minimize the sum over all individual errors, with residual function

use

for

use

for

(exactly equivalent (for Lie groups), since the adjoint is linear) (NOT recommended)


Direct vs. Indirect

Indirect | Direct
can only use (& reconstruct) corners | can use (& reconstruct) everything that has image gradient
faster | slower (but good for parallelism)
can deal well with geometric noise in the system (rolling shutter) | does not model geometric noise in the system (rolling shutter BAD)
data association (KP detection & matching) based on incomplete information | no initial, fixed data association; still need to determine visibility etc.
no need for good initialization | needs good initialization (solved by IMU)


How about Depth?

So far we assumed known geometry (depth). What happens if depth is unknown?


Dense vs. Sparse

Not synonymous with direct vs. indirect!
Dense+Indirect: Compute dense optical flow, then estimate (dense) geometry from flow-correspondences.
Sparse+Direct: Just use a selected subset of pixels & group them.

• For dense, we need to integrate a prior, since (from passive vision alone) dense geometry is not observable (white walls).
• Typically, this prior is some version of "the world is smooth".
• This causes correlations between geometry parameters (and significantly increases their number), which is bad for GN-like optimization methods.


How about Depth?

• As a result, real-time dense or semi-dense approaches refrain from jointly optimizing the complete model (3D geometry + color, camera extrinsics + intrinsics).
• They use alternating optimization (LSD-SLAM) or other optimization methods such as primal-dual (REMODE, DTAM).
• They accumulate massive amounts of data, ignoring or approximating correlations.
• This works well if the data is good.
• It fails if the configuration is close to degenerate, since we may accumulate wrong linearizations, which will not be corrected because we ignored correlations & don't re-linearize.

Here as an example: LSD-SLAM


LSD-SLAM Approach

(Theoretical) Hessian structure of LSD-SLAM:

• Poses: ~5k parameters.
• Inverse depth values: ~100M parameters.
• Pose-pose block: sparse block-matrix.
• Depth-depth block: very large, very sparse, but irregular (depth residuals + smoothness prior terms).
• Pose-depth correlations: sparse block-matrix.


LSD-SLAM Approach

LSD-SLAM real-time approach:

Tracking:
• Track relative frame-frame poses independently, treating depth as "noisy data" (the depth weights used are similar to marginalizing depth).
• Bundle frames afterwards using a pose-graph.

Mapping (depth estimation):
• Estimate depth pixel-wise, using filtering (i.e., never re-compute observations). Approximate noisy camera poses as "geometric error".
• Alternate one photometric "optimization step" (i.e., an EKF update) with smoothing (gradient descent step on the regularizer).
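The per-pixel filtering step can be sketched as a scalar Kalman update that fuses a prior inverse-depth estimate with a new observation, both modeled as Gaussians. This is a simplified illustration with hypothetical names; it leaves out the prediction step and the smoothing step on the regularizer.

```python
def fuse_inverse_depth(d_prior, var_prior, d_obs, var_obs):
    """Scalar Kalman update for one pixel's inverse depth.

    d_prior, var_prior: current estimate and its variance.
    d_obs, var_obs: new observation (e.g. from a stereo match) and its variance.
    Returns the fused estimate and its (smaller) variance.
    """
    gain = var_prior / (var_prior + var_obs)
    d_fused = d_prior + gain * (d_obs - d_prior)
    var_fused = (1.0 - gain) * var_prior
    return d_fused, var_fused

d, var = fuse_inverse_depth(1.0, 0.04, 2.0, 0.04)
```

Two equally uncertain estimates are averaged and the variance halves; past observations are summarized by the estimate and its variance and are never revisited, which is what "never re-compute observations" refers to.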


LSD-SLAM Result

ECCV sequence: 7 minutes, 640x480 @ 50 fps: 25,000 tracked frames, 450 keyframes, 7,000 constraints, 51 million points.

Engel, Schöps, Cremers; ECCV '14


Omni LSD-SLAM

1. Estimate and filter distance instead of depth.
2. Inverse compositional LK instead of forward compositional.

Piecewise pinhole model:
• Tracking: straightforward.
• Mapping: epipolar lines are straight.
• Ugly edge handling, artificial model.

Unified model:
• Tracking: straightforward (more costly!).
• Mapping: epipolar lines are conics -> costly.
• Clean model, close to physical lenses.
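The projection function of the unified omnidirectional model can be sketched as below, in its common textbook form with a single mirror parameter (called `xi` here, unrelated to the pose symbol). The parameter names are assumptions and distortion terms are omitted.

```python
import numpy as np

def project_unified(p, fx, fy, cx, cy, xi):
    """Project a 3D point with the unified omnidirectional camera model.

    The point is divided by (z + xi * ||p||) instead of z alone,
    so xi = 0 reduces to the ordinary pinhole projection.
    """
    x, y, z = p
    denom = z + xi * np.linalg.norm(p)
    return np.array([fx * x / denom + cx, fy * y / denom + cy])

pinhole_like = project_unified(np.array([1.0, 2.0, 4.0]), 100.0, 100.0, 320.0, 240.0, 0.0)
behind = project_unified(np.array([1.0, 0.0, -0.5]), 100.0, 100.0, 320.0, 240.0, 1.0)
```

Because the denominator depends on the distance to the point rather than its depth alone, points slightly behind the image plane can still be projected, which fits item 1 above (filtering distance instead of depth).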

Caruso, Engel, Cremers; IROS '15

Omni LSD-SLAM

Large-Scale Direct SLAM for Omnidirectional Cameras
Caruso, Engel, Cremers; IROS '15

Affine Lighting

Approach: Make error function invariant to affine lighting changes:

Need to be careful with outliers!
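One way to realize this is to estimate a gain `a` and an offset `b` between corresponding intensities and to down-weight outliers while doing so. The sketch below uses iteratively reweighted least squares with Huber weights; the convention a * I_new + b ~ I_ref, the function name, and the threshold are assumptions, not necessarily the formulation on the slide.

```python
import numpy as np

def fit_affine_lighting(i_ref, i_new, iters=5, huber_delta=10.0):
    """Estimate (a, b) such that a * i_new + b matches i_ref, robustly.

    i_ref, i_new: 1D arrays of corresponding pixel intensities.
    """
    i_ref = np.asarray(i_ref, dtype=float)
    i_new = np.asarray(i_new, dtype=float)
    a, b = 1.0, 0.0
    A = np.stack([i_new, np.ones_like(i_new)], axis=1)
    for _ in range(iters):
        r = a * i_new + b - i_ref
        # Huber weights: residuals beyond the threshold count less
        w = np.where(np.abs(r) <= huber_delta, 1.0,
                     huber_delta / np.maximum(np.abs(r), 1e-12))
        # Weighted normal equations for the two lighting parameters
        a, b = np.linalg.solve(A.T @ (w[:, None] * A), A.T @ (w * i_ref))
    return a, b

i_new = np.linspace(10.0, 200.0, 50)
a_est, b_est = fit_affine_lighting(1.2 * i_new + 5.0, i_new)
```

Without robust weights, a few occluded or mismatched pixels can bias `a` and `b`, which is the sense in which outliers need care.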

Engel, Stückler, Cremers; IROS '15

Stereo LSD-SLAM

Large-Scale Direct SLAM with Stereo Cameras
Engel, Stückler, Cremers; IROS '15

Stereo-Inertial LSD-SLAM

Direct Visual-Inertial Odometry with Stereo Cameras
Usenko, Engel, Stückler, Cremers; ICRA '16

Camera Calibration

Geometric calibration: 3D point -> 2D pixel coordinate. Well understood, and always done (often refined on-line).

Photometric calibration: Scene irradiance (lux) -> 8-bit pixel value. Widely ignored: features are generally invariant, brightness constancy is not.

A simple photometric model, with the following terms:
• pixel value
• camera response function (gamma function)
• pixel irradiance
• attenuation factor (vignetting)
• exposure time
• read noise
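How these terms combine can be sketched with a forward model and its inverse. The sketch assumes that the response is a pure gamma curve acting on exposure time x vignetting x irradiance, uses normalized units, and ignores read noise and 8-bit quantization; the function names and the gamma value are illustrative.

```python
import numpy as np

def apply_camera(B, V, t, gamma=2.2):
    """Forward model: irradiance B, vignette V, exposure time t -> pixel value in [0, 255]."""
    energy = np.clip(t * V * B, 0.0, 1.0)   # light collected by the pixel
    return 255.0 * energy ** (1.0 / gamma)  # camera response function

def photometric_correct(I, V, t, gamma=2.2):
    """Inverse model: undo response, vignetting and exposure to recover irradiance."""
    energy = (I / 255.0) ** gamma           # inverse response
    return energy / (t * V)
```

Recovering irradiance this way needs the response function, the vignette map, and the exposure time of each frame.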


Photometric Calibration

[Figure panels: log(irradiance B), constant read noise N, vignette V, hardware gamma G]


Photometric Calibration

Now just replace "brightness constancy" with "irradiance constancy" in the model (-> need exposure times for each frame).

[1] Dense Visual SLAM, Richard Newcombe, PhD Thesis, Imperial College London


New Mono-VO Dataset

New dataset coming soon (30+ sequences). It will be listed here: https://vision.in.tum.de/data/datasets


New Mono-VO Dataset

Evaluation:
1. Disable (large) loop-closures in your algorithm.
2. Compute drift (rotation, translation and scale) between start and end segment (we provide bundled / MoCap ground-truth).


Questions!
