Learning Momentum: Integration and Experimentation
Brian Lee and Ronald C. Arkin
Mobile Robot Laboratory, Georgia Tech, Atlanta, GA


Page 1: Learning Momentum: Integration and Experimentation

Learning Momentum: Integration and Experimentation

Brian Lee and Ronald C. Arkin
Mobile Robot Laboratory
Georgia Tech
Atlanta, GA

Page 2: Learning Momentum: Integration and Experimentation

Motivation

- It's hard to manually derive controller parameters. The parameter space increases exponentially with the number of parameters.
- You don't always have a priori knowledge of the environment. Without prior knowledge, a user can't confidently derive appropriate parameter values, so it becomes necessary for the robot to adapt on its own to what it finds.
- Obstacle densities and layout in the environment may be heterogeneous. Parameters that work well for one type of environment may not work well with another.

Page 3: Learning Momentum: Integration and Experimentation

Adaptation and Learning Methods – DARPA MARS

- Investigate robot shaping at five distinct levels in a hybrid robot software architecture
- Implement algorithms within the MissionLab mission specification system
- Conduct experiments to evaluate the performance of each technique
- Combine techniques where possible
- Integrate on a platform more suitable for realistic missions and continue development

Page 4: Learning Momentum: Integration and Experimentation

Overview of Techniques

- CBR Wizardry: guide the operator
- Probabilistic Planning: manage complexity for the operator
- RL for Behavioral Assemblage Selection: learn what works for the robot
- CBR for Behavior Transitions: adapt to situations the robot can recognize
- Learning Momentum: vary robot parameters in real time

THE LEARNING CONTINUUM: Deliberative (premission) … Behavioral switching … Reactive (online adaptation)

Page 5: Learning Momentum: Integration and Experimentation

Basic Concepts of LM

- Provides adaptability to behavior-based systems; a crude form of reinforcement learning: if the robot is doing well, keep doing what it's doing; otherwise, try something different.
- Behavior parameters are changed in response to progress and obstacles.
- The system is still fully reactive. Although the robot changes its behavior, there is no deliberation.

Page 6: Learning Momentum: Integration and Experimentation

Currently Used Behaviors

- Move to Goal: always returns a vector pointing toward the goal position.
- Avoid Obstacles: returns a sum of weighted vectors pointing away from obstacles.
- Wander: returns vectors pointing in random directions.
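These three behaviors can be pictured as simple 2D vector fields. The sketch below is a minimal illustration in Python; the function names and signatures are ours, not MissionLab's API, and the linear weighting inside the sphere of influence is one plausible choice.

```python
import math

def move_to_goal(robot, goal, gain):
    """Unit vector toward the goal position, scaled by the goal gain."""
    dx, dy = goal[0] - robot[0], goal[1] - robot[1]
    d = math.hypot(dx, dy) or 1.0
    return (gain * dx / d, gain * dy / d)

def avoid_obstacles(robot, obstacles, gain, sphere):
    """Sum of weighted repulsive vectors from obstacles inside the sphere of influence."""
    vx = vy = 0.0
    for ox, oy in obstacles:
        dx, dy = robot[0] - ox, robot[1] - oy
        d = math.hypot(dx, dy)
        if 0.0 < d < sphere:
            w = (sphere - d) / sphere   # closer obstacles push harder
            vx += gain * w * dx / d
            vy += gain * w * dy / d
    return (vx, vy)

def wander(gain, heading):
    """Vector in an arbitrary direction; the caller holds `heading` fixed
    for `persistence` consecutive steps before redrawing it."""
    return (gain * math.cos(heading), gain * math.sin(heading))
```

In the controller these three vectors are simply summed to give the output direction.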

Page 7: Learning Momentum: Integration and Experimentation

Adjustable Parameters

- Move to goal vector gain
- Avoid obstacle vector gain
- Avoid obstacle sphere of influence: the radius around the robot inside of which obstacles are perceived
- Wander vector gain
- Wander persistence: the number of consecutive steps the wander vector points in the same direction

Page 8: Learning Momentum: Integration and Experimentation

Four Predefined Situations

- No movement: M < T_movement
- Progress toward the goal: M > T_movement and P > T_progress
- No progress with obstacles: M > T_movement, P < T_progress, O_count > T_obstacles
- No progress without obstacles: M > T_movement, P < T_progress, O_count < T_obstacles

where M = average movement, M_goal = average movement toward the goal, P = M_goal / M, O_count = number of obstacles encountered, T_movement = movement threshold, T_progress = progress threshold, and T_obstacles = obstacle threshold.
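The classification above can be expressed as a small function. The threshold values below are illustrative placeholders, not the values used in the experiments.

```python
# Illustrative thresholds (T_movement, T_progress, T_obstacles); not the
# values used in the authors' experiments.
T_MOVEMENT, T_PROGRESS, T_OBSTACLES = 0.1, 0.5, 5

def classify(m_avg, m_goal, o_count):
    """Map averaged movement statistics to one of the four situations.

    m_avg   -- M, average movement per step
    m_goal  -- M_goal, average movement toward the goal
    o_count -- O_count, number of obstacles encountered
    """
    if m_avg < T_MOVEMENT:
        return "no_movement"
    p = m_goal / m_avg                  # P = M_goal / M
    if p > T_PROGRESS:
        return "progress"
    if o_count > T_OBSTACLES:
        return "no_progress_with_obstacles"
    return "no_progress_without_obstacles"
```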

Page 9: Learning Momentum: Integration and Experimentation

Parameter Adjustments

Sample adjustment parameters for ballooning:

Situation                   | Goal Gain   | Obstacle Gain | Obstacle Sphere | Noise Gain  | Persistence
No Movement                 | -0.1 to 0.0 | -0.1 to 0.0   | -0.5 to 0.0     |  0.1 to 0.5 |  0 to 1
Progress                    |  0.5 to 1.0 | -0.1 to 0.0   | -0.5 to 0.0     | -0.1 to 0.0 | -1 to 0
No Progress w/ Obstacles    | -0.1 to 0.0 |  0.1 to 0.5   |  0.0 to 0.5     |  0.0 to 0.1 |  0 to 1
No Progress w/out Obstacles |  0.0 to 0.3 | -0.1 to 0.0   |  0.0 to 0.5     | -0.2 to 0.0 | -1 to 0
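One way to apply the table: each control cycle, draw a delta uniformly from the range selected by the current situation and add it to the corresponding parameter. The delta ranges below are copied from the table; the clamping bounds and the treatment of persistence as a float are our simplifying assumptions.

```python
import random

# Delta ranges from the ballooning table above, ordered as
# [goal gain, obstacle gain, obstacle sphere, wander (noise) gain, persistence].
BALLOONING = {
    "no_movement":                   [(-0.1, 0.0), (-0.1, 0.0), (-0.5, 0.0), (0.1, 0.5), (0, 1)],
    "progress":                      [(0.5, 1.0),  (-0.1, 0.0), (-0.5, 0.0), (-0.1, 0.0), (-1, 0)],
    "no_progress_with_obstacles":    [(-0.1, 0.0), (0.1, 0.5),  (0.0, 0.5),  (0.0, 0.1),  (0, 1)],
    "no_progress_without_obstacles": [(0.0, 0.3),  (-0.1, 0.0), (0.0, 0.5),  (-0.2, 0.0), (-1, 0)],
}

def adjust(params, situation, table=BALLOONING, low=0.0, high=20.0):
    """Return new parameters after one learning-momentum step.

    Each parameter is nudged by a delta drawn from its range and clamped
    to [low, high] (illustrative bounds). Persistence is kept as a float
    here for simplicity; a real system would round it to whole steps.
    """
    return [min(high, max(low, p + random.uniform(lo, hi)))
            for p, (lo, hi) in zip(params, table[situation])]
```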

Page 10: Learning Momentum: Integration and Experimentation

Two Possible Strategies

- Ballooning: the sphere of influence is increased when obstacles impede progress. The robot moves around large objects.
- Squeezing: the sphere of influence is decreased when obstacles impede progress. The robot moves between closely spaced objects.

Page 11: Learning Momentum: Integration and Experimentation

Integration: Base System

[Diagram: sensors feed position and goal information to Move To Goal (Gm), and obstacle information to Avoid Obstacles (Go, S) and Wander (Gw, P); the controller sums the three behavior vectors to produce the output direction.]

Gm = goal gain, Go = obstacle gain, S = obstacle sphere of influence, Gw = wander gain, P = wander persistence

Page 12: Learning Momentum: Integration and Experimentation

Integration: Integrated System

[Diagram: the base system with an added LM Module that monitors the controller and feeds back new Gm, Go, S, Gw, and P parameters each cycle.]

Gm = goal gain, Go = obstacle gain, S = obstacle sphere of influence, Gw = wander gain, P = wander persistence
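The LM Module's cycle (observe recent motion, classify the situation, overwrite the parameters) could be wrapped up as follows. The window size, thresholds, and all names here are illustrative assumptions, not the authors' implementation.

```python
import random
from collections import deque

class LMModule:
    """Sketch of the LM Module in the integrated-system diagram."""

    def __init__(self, params, table, window=20,
                 t_movement=0.1, t_progress=0.5, t_obstacles=5):
        self.params = list(params)            # [Gm, Go, S, Gw, P]
        self.table = table                    # situation -> list of (lo, hi) delta ranges
        self.history = deque(maxlen=window)   # (movement, movement toward goal) per step
        self.obstacles = 0
        self.thresholds = (t_movement, t_progress, t_obstacles)

    def observe(self, movement, toward_goal, obstacles_seen=0):
        """Record one control cycle's motion statistics."""
        self.history.append((movement, toward_goal))
        self.obstacles += obstacles_seen

    def step(self):
        """Classify the current situation and nudge each parameter."""
        t_m, t_p, t_o = self.thresholds
        m = sum(s[0] for s in self.history) / len(self.history)
        m_goal = sum(s[1] for s in self.history) / len(self.history)
        if m < t_m:
            situation = "no_movement"
        elif m_goal / m > t_p:
            situation = "progress"
        elif self.obstacles > t_o:
            situation = "no_progress_with_obstacles"
        else:
            situation = "no_progress_without_obstacles"
        self.params = [p + random.uniform(lo, hi)
                       for p, (lo, hi) in zip(self.params, self.table[situation])]
        return situation
```

Because the module only rewrites the parameters fed to the existing behaviors, the base controller remains fully reactive, matching the claim that no deliberation is added.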

Page 13: Learning Momentum: Integration and Experimentation

Experiments in Simulation

- 150m x 150m area; the robot moves from (10m, 10m) to (140m, 90m)
- Obstacle densities of 15% and 20% were used
- Obstacle radii varied between 0.38m and 1.43m

Page 14: Learning Momentum: Integration and Experimentation

Ballooning

Page 15: Learning Momentum: Integration and Experimentation

Observations on Ballooning

- Covers a lot of area
- Not as easily trapped in box canyon situations
- May settle in locally clear areas
- May require a high wander gain to carry the robot through closely spaced obstacles

Page 16: Learning Momentum: Integration and Experimentation

Squeezing

Page 17: Learning Momentum: Integration and Experimentation

Observations on Squeezing

- Results in a straighter path
- Moves easily through closely spaced obstacles
- May get trapped in small box canyon situations for long periods of time

Page 18: Learning Momentum: Integration and Experimentation

Simulations of the Real World

[Figure: simulated setup of the real-world environment, a 24m x 10m area with marked start and end places.]

Page 19: Learning Momentum: Integration and Experimentation

Completion Rates For Simulation

[Two bar charts of completion rates (%): Sets A–D for uniform obstacle size (1m radii) and Sets E–H for varying obstacle sizes (0.38m–1.43m radii).]

Uniform obstacle size (1m radii):

Bar | LM Strategy | Wander Gain | Wander Upper Limit
1   | None        | 0.3         | N/A
2   | None        | 0.5         | N/A
3   | None        | 1.0         | N/A
4   | Ballooning  | N/A         | 15
5   | Ballooning  | N/A         | 10
6   | Squeezing   | N/A         | 15

Varying obstacle sizes (0.38m–1.43m radii):

Bar | LM Strategy | Wander Gain | Wander Delta Range
1   | None        | 0.5         | N/A
2   | None        | 1.0         | N/A
3   | Ballooning  | N/A         | 0.0 – 0.1
4   | Ballooning  | N/A         | 0.0 – 0.5
5   | Squeezing   | N/A         | 0.0 – 0.1
6   | Squeezing   | N/A         | 0.0 – 0.5

Page 20: Learning Momentum: Integration and Experimentation

Average Steps to Completion

[Two bar charts of average steps to completion, comparing no LM, ballooning, and squeezing: Sets A–D for uniform obstacle size (1m radii) and Sets E–H for varying obstacle sizes (0.38m–1.43m radii). Bar legends are the same as on the completion-rate slide.]

Page 21: Learning Momentum: Integration and Experimentation

Results From Simulated Real Environment

[Two bar charts: percent complete and steps to completion for two no-LM configurations, ballooning, and squeezing.]

As before, there is an increase in completion rates with an accompanying increase in steps to completion.

Page 22: Learning Momentum: Integration and Experimentation

Simulation Results

- Completion rates can be drastically improved.
- Completion rate improvements come at a cost of time.
- Ballooning and squeezing strategies are geared toward different situations.

Page 23: Learning Momentum: Integration and Experimentation

Physical Robot Experiments

- Nomad 150 robot
- Sonar ring for obstacle avoidance
- Traverses the length of a 24m x 10m room while negotiating obstacles

Page 24: Learning Momentum: Integration and Experimentation

Outdoor Run (adaptive)

Page 25: Learning Momentum: Integration and Experimentation

Outdoor Run (non-adaptive)

Page 26: Learning Momentum: Integration and Experimentation

Physical Experiment Results

- Non-learning robots became stuck.
- Learning robots successfully negotiated the obstacles.
- Squeezing was faster than ballooning in this case.

[Bar chart: average steps to goal for no LM, ballooning, and squeezing.]

Page 27: Learning Momentum: Integration and Experimentation

Conclusions

- Improved success comes at a price of time.
- Performance of one strategy is very poor in situations better suited for another strategy.
- The ballooning strategy is generally faster: ballooning robots can move through closely spaced objects faster than squeezing robots can move out of box canyon situations.

Page 28: Learning Momentum: Integration and Experimentation

Conclusions (cont'd)

- If some general knowledge of the terrain is known a priori, an appropriate strategy can be chosen.
- If the terrain is totally unknown, ballooning is probably the better choice.
- A way to dynamically switch strategies should improve performance.