Bridging the Implementation Gap: From Sensorimotor Experience to Conceptual Knowledge

Anna Koop, Leah Hackman, Richard S. Sutton

Verifiable knowledge is ABOUT sensorimotor data

Playing Catch

[Diagram: the concepts involved in playing catch, each grounded in sensorimotor predictions]

throw procedure
    flick the tail with the appropriate direction and velocity
    succeeds when the ball rolls towards the partner
    terminates when the ball is not near the tail
        not near ball: tail IR sensors not predicting success, ball predictions not active

catch procedure
    position to intersect the ball
    succeeds when the tail surrounds the ball
        tail surrounds ball: IR sensors in the tail are low, ball predictions are active
    terminates when the ball position is unknown or the ball is unreachable
        ball unreachable: no known procedure can lead to ball predictions
        ball position unknown: no procedure reliably leads to ball predictions

ball: an object, known through predicted interactions (rolls when I bump into it, looks green) and the temporal coherence of those predictions
partner: another agent, known through predictions about interactions
play: reward for learning, with less chance of negative reward than competition
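As a concrete illustration of how such procedure definitions bottom out in verifiable sensorimotor predictions, here is a minimal sketch in Python. The predicate and field names (tail_ir_low, ball_predictions_active, and so on) are illustrative assumptions, not code from the poster.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Percept:
    """Hypothetical prediction-level state available to the agent."""
    tail_ir_low: bool              # IR sensors in the tail are reading low
    ball_predictions_active: bool  # ball-related predictions are currently active
    ball_position_known: bool      # some procedure reliably leads to ball predictions
    ball_reachable: bool           # some known procedure can lead to ball predictions

@dataclass
class Procedure:
    """A procedure with verifiable success and termination conditions."""
    policy: Callable[[Percept], str]
    success: Callable[[Percept], bool]
    terminate: Callable[[Percept], bool]

def tail_surrounds_ball(p: Percept) -> bool:
    # "tail surrounds ball" grounded directly in sensor and prediction signals
    return p.tail_ir_low and p.ball_predictions_active

catch = Procedure(
    policy=lambda p: "position to intersect ball",
    success=tail_surrounds_ball,
    terminate=lambda p: (not p.ball_position_known) or (not p.ball_reachable),
)
```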


The RL Perspective

[Diagram: the agent-environment loop. Motor signals m flow from the agent to the environment; sensor signals s flow from the environment back to the agent.]

The reinforcement learning agent is an input-output system, interacting with an environment that is only accessible via sensation and motor signals.

... m_{t-2} s_{t-2}   m_{t-1} s_{t-1}   m_t s_t   m_{t+1} s_{t+1}   m_{t+2} s_{t+2} ...

past | current | future
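To make the interaction loop concrete, here is a minimal sketch, assuming toy Agent and Environment classes rather than the Critterbot's actual interface:

```python
class Environment:
    """Toy environment: applies a motor signal, returns the next sensor signal."""
    def step(self, motor):
        return 0.0  # placeholder sensation

class Agent:
    """Toy agent: maps the latest sensation to a motor signal."""
    def act(self, sensation):
        return 0.0  # placeholder motor command

env, agent = Environment(), Agent()
experience = []     # the stream ..., (m_{t-1}, s_{t-1}), (m_t, s_t), ...
sensation = 0.0     # initial sensor reading

for t in range(10):
    motor = agent.act(sensation)   # m_t: motor signal sent at time t
    sensation = env.step(motor)    # s_t: sensor signal received in response
    experience.append((motor, sensation))
```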

The Gap

sensorimotor data        conceptual knowledge
temporal, dynamic        atemporal, (more) static
individual, subjective   shareable, objective
detailed, situated       abstract, general

Verifiable Signals

At every timestep the agent receives a sensor signal and sends a motor signal. Experience is made up of past, present, and potential future sensorimotor data.

[Diagram: constructed signals. Alongside the raw sensor signal s_t, a historic signal h is computed from past data x_{t-n}, ..., x_t; a predictive signal p is defined over future data x_t, ..., x_{t+n}; a compositional signal c is built from other constructed signals at time t.]

We can construct signals that abstract over time and data that are still verifiable statements about sensorimotor experience.
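For illustration, a minimal sketch of such constructed signals, with made-up example functions; the names and the particular computations are assumptions, not taken from the poster.

```python
def historic(past):
    """Historic signal: a function of past data x_{t-n}, ..., x_t
    (here, the mean of recent readings)."""
    return sum(past) / len(past) if past else 0.0

def predictive_target(future):
    """Predictive signal: a claim about future data x_t, ..., x_{t+n}.
    This is its verifiable target, computable once that future has arrived
    (here, whether every upcoming reading stays below a threshold)."""
    return all(x < 0.5 for x in future)

def compositional(past, prediction):
    """Compositional signal: a function of other constructed signals,
    still a verifiable statement about sensorimotor experience."""
    return historic(past) > 0.4 and prediction
```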

Different Views

We see the Critterbot playing catch with a ball.

The Critterbot sees various sensor and motor signals.
