
Image source: octomap.github.io

Image source: pirobot.org/blog/0015/

• Map from first-person images to actions

• Need to learn how to reason about changing observations

• Add explicit Camera Projection and Differentiable Mapping

• Reason about the instruction on a static map

• Automatically handle changing first-person observations

Each pixel in the feature map encodes an image neighbourhood

[Figure: Input Image → Feature Map (image plane in camera frame) → Projected Features (map frame)]
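
The camera projection step can be illustrated with a pinhole model: each feature-map pixel defines a ray, and the ray is intersected with the ground plane to find the map cell it belongs to. This is a minimal sketch, not the presenters' implementation. The flat-ground assumption, the parameter names (`f`, `cx`, `cy`, `cam_height`, `pitch`) and the helper names `pixel_to_ground` and `to_map_frame` are all hypothetical.

```python
import math

def pixel_to_ground(u, v, f, cx, cy, cam_height, pitch):
    """Intersect the ray through pixel (u, v) with a flat ground plane.

    Assumed camera frame: x right, y down, z forward. The camera sits
    cam_height above the ground and is tilted down by `pitch` radians.
    Returns (forward, left) offsets from the point below the camera,
    or None if the ray never reaches the ground.
    """
    # Ray direction in the camera frame (pinhole model).
    x, y, z = (u - cx) / f, (v - cy) / f, 1.0
    # Rotate by the pitch angle into a level frame (forward, right, down).
    forward = z * math.cos(pitch) - y * math.sin(pitch)
    down = z * math.sin(pitch) + y * math.cos(pitch)
    right = x
    if down <= 1e-9:
        return None  # ray points at or above the horizon
    t = cam_height / down  # scale at which the ray reaches the ground
    return (t * forward, -t * right)

def to_map_frame(forward, left, x, y, yaw):
    """Move an agent-relative ground point into the global map frame."""
    mx = x + forward * math.cos(yaw) - left * math.sin(yaw)
    my = y + forward * math.sin(yaw) + left * math.cos(yaw)
    return (mx, my)

# Centre pixel of a camera 1 m up, pitched 45 degrees down: hits 1 m ahead.
hit = pixel_to_ground(320, 240, 500.0, 320, 240, 1.0, math.pi / 4)
```

Because the mapping from pixel to map cell depends only on camera geometry and pose, it needs no learned parameters, which is what lets the rest of the network reason on a static map.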

[Figure: Projected Features and Semantic Map at successive time steps]
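
The slides do not say how projected features from a new frame are merged into the semantic map. One simple possibility is a leaky average over the cells currently in view, sketched below. The update rule, the `decay` value and the function name `update_semantic_map` are assumptions for illustration only.

```python
def update_semantic_map(prev_map, projected, observed, decay=0.5):
    """Blend newly projected features into the map (assumed update rule).

    prev_map, projected: 2-D lists of floats (one feature channel).
    observed: 2-D list of bools, True where the camera currently sees a cell.
    Unobserved cells keep their previous value, so the map persists
    while the first-person view changes.
    """
    new_map = []
    for r, row in enumerate(prev_map):
        new_row = []
        for c, old in enumerate(row):
            if observed[r][c]:
                new_row.append((1 - decay) * old + decay * projected[r][c])
            else:
                new_row.append(old)
        new_map.append(new_row)
    return new_map

# One observed cell is blended, the unobserved cell is left alone.
merged = update_semantic_map([[0.0, 1.0]], [[1.0, 0.0]], [[True, False]])
```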

[Figure: Semantic Map → 1x1 Filter → Grounding Map → 9x9 Filter → Goal Map. An LSTM encodes the instruction "go to the left side of plane". Annotations: recognized airplane, inferred goal location.]
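
The two filters can be sketched on a toy grid. A 1x1 filter is a per-cell dot product across feature channels, which can pick out the cells that match the named landmark. A 9x9 filter looks at a neighbourhood, which is enough to shift that response toward a goal position beside the landmark. The hand-set weights below stand in for whatever the model computes from the instruction. They, the grid size and the function names are assumptions, not values from the slides.

```python
def conv1x1(semantic_map, weights):
    """Per-cell dot product across channels: (H, W, C) -> (H, W)."""
    return [[sum(w * x for w, x in zip(weights, cell)) for cell in row]
            for row in semantic_map]

def conv2d_same(grid, kernel):
    """Single-channel 2-D cross-correlation with zero padding."""
    h, w = len(grid), len(grid[0])
    k = len(kernel)
    pad = k // 2
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            acc = 0.0
            for i in range(k):
                for j in range(k):
                    rr, cc = r + i - pad, c + j - pad
                    if 0 <= rr < h and 0 <= cc < w:
                        acc += kernel[i][j] * grid[rr][cc]
            out[r][c] = acc
    return out

# Toy 11x11 semantic map with two channels: channel 0 marks a "plane",
# channel 1 marks some other landmark.
semantic = [[[0.0, 0.0] for _ in range(11)] for _ in range(11)]
semantic[5][5] = [1.0, 0.0]
semantic[2][8] = [0.0, 1.0]

# 1x1 filter selecting channel 0 gives the grounding map.
grounding = conv1x1(semantic, [1.0, 0.0])

# 9x9 filter whose single non-zero tap moves the response 3 columns left.
kernel = [[0.0] * 9 for _ in range(9)]
kernel[4][7] = 1.0
goal = conv2d_same(grounding, kernel)
```

Here `grounding` is non-zero only at the plane, and `goal` is non-zero only three cells to its left.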

[Figure: Grounding Map and Goal Map → Perceptron → Forward velocity, Yaw rate]

• Output the velocity command, given Grounding and Goal maps

• Sent to quadcopter’s flight controller
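
As a rough illustration of the control step, the two maps can be flattened and passed through a single linear layer with two outputs. The layer shape, the weights and the function name `velocity_command` are hypothetical. The slides only state that a perceptron produces forward velocity and yaw rate.

```python
def velocity_command(grounding_map, goal_map, weights, bias):
    """Linear layer over the flattened maps -> (forward velocity, yaw rate).

    weights: two rows, one per output, each as long as the flattened input.
    bias: two floats.
    """
    x = [v for row in grounding_map for v in row]
    x += [v for row in goal_map for v in row]
    outputs = [sum(w * xi for w, xi in zip(w_row, x)) + b
               for w_row, b in zip(weights, bias)]
    return outputs[0], outputs[1]

# Tiny 1x2 maps with hand-picked weights, purely for illustration.
forward, yaw = velocity_command(
    [[1.0, 0.0]], [[0.0, 1.0]],
    weights=[[0.5, 0.0, 0.0, 0.5], [0.0, 0.0, 0.0, -1.0]],
    bias=[0.0, 0.25],
)
```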

[Figure: the Agent maps Image and Instruction to an Action; the Oracle provides the ground truth action and the ground truth trajectory]

Modified variant of DAgger

Trade convergence guarantees for speed and memory efficiency
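
For reference, the standard DAgger loop looks roughly like the sketch below: roll out the current policy, label every visited state with the oracle's action, add those pairs to one growing dataset, and retrain. The slides do not describe how their variant differs, so this shows plain DAgger on a toy 1-D task. The environment, the tabular policy and all names here are assumptions.

```python
from collections import Counter, defaultdict

def oracle(pos, goal):
    """Ground-truth action: step toward the goal (+1, -1, or 0)."""
    return (goal > pos) - (goal < pos)

def train(dataset):
    """Fit a tabular policy: most frequent oracle action per observation."""
    votes = defaultdict(Counter)
    for obs, action in dataset:
        votes[obs][action] += 1
    return {obs: c.most_common(1)[0][0] for obs, c in votes.items()}

def rollout(policy, start, goal, steps=10):
    """Execute `policy` (None means follow the oracle) and record the
    oracle's action at every visited state."""
    pos, data = start, []
    for _ in range(steps):
        obs = goal - pos
        expert = oracle(pos, goal)
        data.append((obs, expert))
        action = expert if policy is None else policy.get(obs, 0)
        pos += action
    return data, pos

def dagger(tasks, iterations=3):
    dataset, policy = [], None  # the first iteration uses oracle rollouts
    for _ in range(iterations):
        for start, goal in tasks:
            data, _ = rollout(policy, start, goal)
            dataset.extend(data)  # aggregate, never discard
        policy = train(dataset)
    return policy

policy = dagger([(0, 3), (5, 2)])
_, final_pos = rollout(policy, 1, 3)
```

Keeping every labeled pair is what gives DAgger its guarantees, and it is also what makes memory grow with each iteration.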

3500 Instructions + Environments

Ground-truth trajectories

63 Landmarks

252 Possible Tasks

Go to right side of mushroom

Total number of rollouts: 3500 oracle, 2000 policy

[Bar chart, axis 0 to 100: GSMN (Ours) 83.47, NN with no Mapping 28.67, Oracle 87.87]

Outperform standard NN with no mapping

Very close to oracle performance

[Figure: full pipeline. Image → Feature Extraction → Image Features → Mapping → Semantic Map → 1x1 Filter → Grounding Map → 9x9 Filter → Goal Map → MLP → Action. An LSTM encodes the instruction "Go to the left side of plane" into the Instruction Embedding.]
