Multi-Output Learning for Camera Relocalization Abner Guzmán-Rivera UIUC Pushmeet Kohli Ben Glocker Jamie Shotton Toby Sharp Andrew Fitzgibbon Shahram

Multi-Output Learning for Camera Relocalization

Abner Guzmán-Rivera UIUC

Pushmeet Kohli Ben Glocker Jamie Shotton Toby Sharp Andrew Fitzgibbon Shahram Izadi

Microsoft Research

2

Camera Relocalization from RGB-D images

World

Know 3D model

RGB-Depth

Observe single frame

Where is the camera?

6D camera pose H (rotation and translation)

3

Applications

Large scale 3D model reconstruction

4

Applications

Vehicle, robot, etc. localization

5

Applications

Augmented Reality

6

Other Approaches to Localization

Sparse key-point matching:

– Detectors: [Rosten et al. PAMI’10], [Holzer et al. ECCV’12]

– Descriptors: [Winder and Brown CVPR’07], [Calonder et al. ECCV’10], [Rublee et al. ICCV’11]

– Matching: [Lepetit and Fua PAMI’06], [Nistér and Stewénius CVPR’06], [Schindler et al. CVPR’07]

– Pose estimation: [Irschara et al. CVPR’09], [Dong et al. ICCV’09], [Yi et al. ECCV’10], [Baatz et al. IJCV’11], [Sattler et al. ICCV’11]

Whole key-frame matching [Klein and Murray ECCV’08], [Gee and Mayol-Cuevas BMVC’12]

Epitomic location recognition [Ni et al. PAMI’09]

7

Relocalization as Inverse Problem

Find the pose H* minimizing the error in a rendering of the model

[Diagram labels: 3D model of scene; view “renderer”; input RGB-D frame; rendering error]
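The inverse problem stated on this slide can be written compactly. The symbols below are assumptions introduced for illustration and are not taken from the slides: $I$ is the input RGB-D frame, $\mathcal{M}$ the 3D model of the scene, $R(H; \mathcal{M})$ the view rendered from pose $H$, and $\Delta$ the rendering error.

```latex
H^{*} \;=\; \operatorname*{arg\,min}_{H} \; \Delta\bigl(I,\; R(H; \mathcal{M})\bigr)
```

Under this reading, a discriminative predictor is a learned shortcut that proposes a pose directly from $I$ instead of searching over all $H$.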

8

Inverse Problem

Discriminative Predictor

9

Inverse Problem

10

Single Predictor Not Powerful Enough

Limited expressivity

The mapping is one-to-many

Input frame

11

Approx. Inverse Problem Stage 1

Portfolio of Discriminative Predictors

Want complementary or “diverse” predictions

12

Approx. Inverse Problem Stage 2

13

How to train such a portfolio of complementary predictors?

14

Discriminative Predictor [Shotton et al. CVPR’13]

15

Scene Coordinate Regression Forests

[Shotton et al. CVPR’13]

Regression tree: pixel comparison features (depth and RGB) map to an (x, y, z) world coordinate

Regression forest

. . .

16

Scene Coordinate Regression Forests


Sample pixels from input RGB-D frame

Forest predicts 3D world coordinates

Inliers for several hypotheses from RANSAC: H1, H2, H3, H4, H5, H6, . . .

17

Learning a portfolio of predictors

Would like to train a set of predictors to output a set of hypotheses that:

1. Are relevant, i.e., approx. local minimizers
2. Summarize well the output space

18

Learning a portfolio: previous work

Multiple Choice Learning [Guzman-Rivera et al. NIPS’12, AISTATS’14]

Set min-loss: Oracle penalizes the portfolio for the error in the best prediction in the output

– The portfolio is NOT penalized for being diverse
– Set min-loss applies to standard datasets
– Iterative training of fixed size portfolio

[Equation label: Standard task-loss]
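The set min-loss idea can be illustrated with a minimal sketch. The function names (`set_min_loss`, `abs_error`) and the scalar toy data are assumptions made for this example; they are not taken from the slides or from the Multiple Choice Learning papers.

```python
from typing import Callable, Sequence, TypeVar

P = TypeVar("P")  # prediction type
T = TypeVar("T")  # target type


def set_min_loss(
    predictions: Sequence[P],
    target: T,
    task_loss: Callable[[P, T], float],
) -> float:
    """Oracle loss: the task loss of the best prediction in the set.

    Only the best member of the portfolio contributes, so the other
    members are free to differ from it without being penalized.
    """
    if not predictions:
        raise ValueError("the portfolio must output at least one prediction")
    return min(task_loss(p, target) for p in predictions)


def abs_error(pred: float, target: float) -> float:
    """Hypothetical standard task loss for scalar outputs."""
    return abs(pred - target)


# Toy comparison for a target of 1.0: a spread-out portfolio with one good
# guess scores better than a tightly clustered portfolio with none.
loss_diverse = set_min_loss([0.9, 5.0, -3.0], 1.0, abs_error)
loss_clustered = set_min_loss([2.0, 2.1, 1.9], 1.0, abs_error)
```

The toy comparison shows why this loss does not discourage diversity: the poor predictions 5.0 and -3.0 have no effect on the score of the first portfolio.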

19

Learning a portfolio of predictors

Portfolio of predictors CVPR’13 SCoRe Forest

We already have the objective to optimize

and propose to approximate (1) by

20

– The portfolio is NOT penalized for being diverse
– Learning procedure is able to tune portfolio to the reconstruction error to be used at test-time
– Next we describe one way to achieve diversity

[Equation labels: Multi-Output Loss; Standard task-loss]

21

Training Algorithm

22

Loss to Example Weights

Diversity parameter (“variance” of the weights)

Multi-output loss for example j

Intuition: Want next predictor to emphasize accuracy on examples difficult thus far
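The formula on this slide did not survive extraction, so the following is only a sketch of one plausible loss-to-weight mapping. The saturating form `1 - exp(-loss / sigma**2)`, the function name `loss_to_weights`, and the normalization are all assumptions for illustration, not the authors' stated formula.

```python
import math
from typing import List, Sequence


def loss_to_weights(losses: Sequence[float], sigma: float) -> List[float]:
    """Map per-example multi-output losses to normalized example weights.

    ASSUMPTION: raw weight_j = 1 - exp(-loss_j / sigma**2). Examples the
    portfolio already handles well (loss near 0) get weight near 0, and
    examples that are difficult so far get larger weight. A small sigma
    saturates quickly (weights become nearly uniform over hard examples);
    a large sigma makes weights roughly proportional to the loss.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    if not losses:
        raise ValueError("need at least one example")
    if any(loss < 0 for loss in losses):
        raise ValueError("losses must be non-negative")

    raw = [1.0 - math.exp(-loss / sigma**2) for loss in losses]
    total = sum(raw)
    if total == 0.0:
        # Every example is already solved: fall back to uniform weights.
        return [1.0 / len(losses)] * len(losses)
    return [r / total for r in raw]


# Example: the third example has the largest loss so far and therefore
# receives the largest weight when training the next predictor.
weights = loss_to_weights([0.0, 1.0, 4.0], sigma=1.0)
```

The next predictor in the portfolio would then be trained on the weighted examples, which matches the stated intuition of emphasizing examples that are difficult thus far.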

23

Rendering Error

24

L1 Rendering Error

Input frame

1. Raycast depth frame for some hypothesis
2. Evaluate L1 distance between input depth and raycast depth
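The two steps above can be sketched as follows. This is a minimal illustration under stated assumptions: depth images are nested lists with `None` for missing readings, the error is averaged over pixels valid in both images (the slide does not say whether or how it is normalized), and `raycast` is a hypothetical callable standing in for a real renderer of the 3D model.

```python
from typing import Callable, Optional, Sequence, TypeVar

# None marks a pixel with no valid depth reading (an assumed convention).
DepthImage = Sequence[Sequence[Optional[float]]]
H = TypeVar("H")  # pose hypothesis type


def l1_rendering_error(input_depth: DepthImage, raycast_depth: DepthImage) -> float:
    """Mean absolute depth difference over pixels valid in both images."""
    total = 0.0
    count = 0
    for row_in, row_rc in zip(input_depth, raycast_depth):
        for d_in, d_rc in zip(row_in, row_rc):
            if d_in is None or d_rc is None:
                continue  # skip pixels missing in either image
            total += abs(d_in - d_rc)
            count += 1
    # With no overlapping valid pixels the hypothesis cannot be scored.
    return total / count if count else float("inf")


def select_hypothesis(
    input_depth: DepthImage,
    hypotheses: Sequence[H],
    raycast: Callable[[H], DepthImage],
) -> H:
    """Return the hypothesis whose raycast depth best matches the input."""
    return min(hypotheses, key=lambda h: l1_rendering_error(input_depth, raycast(h)))


# Toy usage: a dictionary lookup stands in for the raycaster.
observed = [[1.0, 2.0], [None, 4.0]]
renders = {
    "H1": [[1.0, 2.5], [3.0, 4.0]],
    "H2": [[2.0, 3.0], [3.0, 6.0]],
}
best = select_hypothesis(observed, ["H1", "H2"], renders.__getitem__)
```

Scoring every hypothesis from the portfolio with the same error is one way to pick a single output pose at test time.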

25

Results

26

7-Scenes Dataset

[Shotton et al. CVPR’13, Glocker et al. ISMAR’13]

27

Metric

Proportion Correct (single prediction)

– Correct if translational error ≤ 5 cm AND rotational error ≤ 5°

Competing Approaches

CVPR13: Scene Coordinate Regression Forests

CVPR13 + M-Best
– Take M-Best RANSAC hypotheses
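The correctness criterion can be checked with a short sketch. The representation is an assumption made for this example: poses are given as a 3x3 rotation matrix (nested lists) and a translation in meters, and the helper names (`rotation_error_deg`, `is_correct`, `proportion_correct`, `rot_z`) are hypothetical.

```python
import math
from typing import List, Sequence

Matrix = Sequence[Sequence[float]]


def rotation_error_deg(r_pred: Matrix, r_gt: Matrix) -> float:
    """Angle in degrees of the relative rotation between two rotation matrices."""
    # trace(r_gt^T r_pred) equals the sum of element-wise products.
    trace = sum(r_gt[i][j] * r_pred[i][j] for i in range(3) for j in range(3))
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))  # clamp rounding noise
    return math.degrees(math.acos(cos_angle))


def is_correct(
    r_pred: Matrix,
    t_pred: Sequence[float],
    r_gt: Matrix,
    t_gt: Sequence[float],
    max_trans_m: float = 0.05,  # 5 cm, assuming translations are in meters
    max_rot_deg: float = 5.0,
) -> bool:
    """True if both the translational and rotational errors are within bounds."""
    trans_err = math.dist(t_pred, t_gt)
    return trans_err <= max_trans_m and rotation_error_deg(r_pred, r_gt) <= max_rot_deg


def proportion_correct(flags: Sequence[bool]) -> float:
    """Fraction of test frames whose predicted pose is correct."""
    return sum(flags) / len(flags)


def rot_z(deg: float) -> List[List[float]]:
    """Rotation about the z axis, used only to build toy test poses."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


identity = rot_z(0.0)
flags = [
    is_correct(rot_z(3.0), [0.03, 0.0, 0.0], identity, [0.0, 0.0, 0.0]),   # within both bounds
    is_correct(rot_z(10.0), [0.03, 0.0, 0.0], identity, [0.0, 0.0, 0.0]),  # rotation too large
]
score = proportion_correct(flags)
```

Both conditions must hold at once, so a pose that is accurate in position but off by 10° in orientation still counts as incorrect.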

28

Office

Input frame

Multiple predictions:

Ground-truth (white), prediction (magenta):

29

Stairs

Input frame

Multiple predictions:

Ground-truth (white), prediction (magenta):

30

All Scene Average

[Plot: Proportion Correct (y-axis, 0.66 to 0.80) vs. Size of Portfolio (x-axis, 1 to 10); curves: CVPR13 + M-Best, Multi-Output, CVPR13]

31

All Scene Average

[Plot: Proportion Correct (y-axis, 0.66 to 0.80) vs. Size of Portfolio (x-axis, 1 to 10); curves: CVPR13 + M-Best, Multi-Output, CVPR13; annotation: Using aggregation]

32

Summary

Camera relocalization as inverse problem

Portfolio of complementary discriminative predictors

Method to learn such portfolio

State-of-the-art camera relocalization
