Modeling goal inference in action observation


Introduction

The discovery of mirror neurons, which are selectively active during both execution and observation of similar goal-directed actions, suggests that recognising the goal-directed actions of others takes place through simulation using one's own action system. It has also been found that children tend to imitate the goals of observed actions and use their own preferred means to reach those goals. This is convenient because copying action goals allows the observer to imitate without using the exact same means, which may not be possible. This becomes evident when observer and observed actor have different bodies (robots and humans) or when their environments differ substantially (when an obstacle is present in one environment but absent in the other). Here we present a model of action observation based on goal inference [1].

References

[1] Cuijpers RH, Van Schie HT, Koppen M, Erlhagen W and Bekkering H (in press). Goals and means in action observation: a computational approach. Neural Networks.

Simulations

Action goal inference

The action goal (E) can be inferred even when the target likelihoods (C) and action alternative likelihoods (D) are ambiguous.

With knowledge about the ultimate goal state, the action goal is correctly inferred (F) after 25% of the movement time (MT).

Effect of personal preferences

The target preferences (B) can disambiguate the goal likelihood (C).

Without knowledge about the final goal state, the action preferences do not affect the goal likelihood (D); with that knowledge, they do (E).

The preferred action (when imitating) is determined by the preferences (F).

[Diagram: Model architecture. Action planning: task knowledge (Vf(j), κij) and preferences (p(j|i), p(Ak|i)) feed action goal planning, p(i→j|f), and action alternative planning, p(Ak|i,f). Action observation: the same task knowledge (with λk) and preferences (with p(cn|i)) combine with the observable ot through the action alternative likelihood p(ot|Ak, i→j) and the action goal likelihood p(ot|i→j) to drive action goal inference, p(i→j|f, ot).]

[Figure: Action goal inference, panels A–F. A: scene layout (x and y coordinates; components c1–c5) with the movement trajectory. B: rate of change of distance vs distance from target for c1–c5. C: component likelihood vs % of MT. D: action alternative likelihood vs % of MT (legend: A1: [1 5], A2: [2 5], A3: [1 3], A4: [1 4], A5: [2 3], A6: [2 4], A7: [3 5], A8: [4 5]). E, F: action goal likelihood vs % of MT (legend: j1: [1 2], j2: [3 4 5 6], j3: [7 8]), without (E) and with (F) final state knowledge.]

[Figure: Effect of personal preferences, panels A–F. A: scene layout (x and y coordinates; components c1–c5) with the movement trajectory. B: action alternative likelihood (A1: [1 5], A2: [2 5], A7: [3 5]) vs the component preference p(c1|i) = pc0 − p(c2|i). C: action goal likelihood (j1: [1 2], j2: [3 4 5 6], j3: [7 8]) vs the same preference. D, E: action goal likelihood vs the action preference p(A1|i) = pA0 − p(A7|i), without (D) and with (E) final state knowledge. F: imitation probability of A1, A2 and A7 vs the same preference.]

Raymond H. Cuijpers1, Hein T. van Schie1, Mathieu Koppen1, Wolfram Erlhagen2 and Harold Bekkering1

1 Nijmegen Institute for Cognition and Information, Radboud University, 6500 HE Nijmegen, The Netherlands
2 Department of Mathematics for Science and Technology, University of Minho, 4800-058 Guimaraes, Portugal

E-mail: r.cuijpers@nici.ru.nl

Each action alternative Ak entails a transition from goal state i to goal state j.

Each component cn may be used by different action alternatives Ak.

Model architecture

[Diagram: actor and observer share a building plan; intermediate states x1, x2 and x3 lead to the end state xE for both agents.]

Construction task

In the construction task two agents jointly construct a model from Baufix building blocks. Both agents know what must be built and how to manipulate the components. Body, action repertoire, and knowledge about how to reach the final goal state may differ.

[Diagram: from the current state i, goal states j1, j2, …, jM can be reached via action alternatives A1, A2, A3, each of which operates on components c1, c2, c3.]

ot = dn(t) + τ ḋn(t)

p(ot | Ak, i→j) = Σn [1/√(2πσn²)] exp(−ot²/(2σn²)) · p(cn|i) / Σl p(cl|i)

p(ot | i→j) = Σk p(ot | Ak, i→j) · p(Ak|i) / Σl p(Al|i)

p(Ak | i, f) = p(i→j|f) · p(Ak|i) / Σl p(Al|i)

p(i→j | f, ot) ~ p(ot | i→j) p(j|i) Vf(j)

p(i→j | f) ~ p(j|i) Vf(j)

janticipated = argmaxj p(i→j | f, ot)

kplanned = argmaxk p(Ak | i, f)

(Here i→j denotes the transition from goal state i to goal state j.)
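The inference chain above can be sketched in a few lines of Python. This is an illustrative reconstruction, not the authors' implementation: all function and variable names are invented, and a single noise width sigma replaces the per-component σn of the model.

```python
import numpy as np

def gauss(x, sigma):
    # likelihood of the observable o_t under zero-mean Gaussian noise
    return np.exp(-x**2 / (2 * sigma**2)) / np.sqrt(2 * np.pi * sigma**2)

def goal_posterior(o_t, comps_of_A, acts_of_j, p_c, p_A, p_j, V_f, sigma=0.1):
    """Normalised goal posterior p(i->j | f, o_t) for every goal j.
    comps_of_A[k]: indices of components used by action alternative A_k.
    acts_of_j[j]:  indices of alternatives realising goal transition i->j."""
    p_c = np.asarray(p_c, float) / np.sum(p_c)   # p(c_n|i) / sum_l p(c_l|i)
    p_A = np.asarray(p_A, float) / np.sum(p_A)   # p(A_k|i) / sum_l p(A_l|i)
    # p(o_t | A_k, i->j) = sum_n N(o_t; 0, sigma) p(c_n|i)
    p_ot_A = np.array([sum(gauss(o_t, sigma) * p_c[n] for n in comps)
                       for comps in comps_of_A])
    # p(o_t | i->j) = sum_k p(o_t | A_k, i->j) p(A_k|i)
    p_ot_j = np.array([sum(p_ot_A[k] * p_A[k] for k in acts)
                       for acts in acts_of_j])
    # posterior ~ p(o_t | i->j) p(j|i) V_f(j), normalised over goals
    post = p_ot_j * np.asarray(p_j, float) * np.asarray(V_f, float)
    return post / post.sum()
```

With uniform preferences and task knowledge the posterior stays uniform; biasing p(j|i) towards one goal shifts the posterior accordingly, mirroring the preference effects shown in the figures.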

A: scene layout of the bolts (circles), nuts (squares) and a 3-holed slat (diamond). The line indicates the movement trajectory, with a dot at every 10% of the movement time. B: rate of change of distance plotted as a function of the distance of the hand from the target. The solid black line indicates the line d + τḋ = 0, where τ = 0.1. C: likelihood given each component as a function of time (in % of movement time). D: likelihood given each action alternative as a function of time (in % of movement time). The lists of components associated with each action alternative are indicated between brackets in the legend. E: likelihood given each action goal without using knowledge about the desired final state. The lists of action alternatives corresponding to each action goal are indicated between brackets in the legend. F: likelihood given each action goal using knowledge about the desired final state. The vertical line indicates the point in time where the likelihood ratio of the first and second largest likelihood exceeds the threshold α = 1.5.

Effect of changing the component preferences (B, C) and the action alternative preferences (D, E, F) on the action goal inference. A: scene layout of the bolts (circles), nuts (squares) and a 3-holed slat (diamond). The line indicates the movement trajectory towards c5, with a dot at every 10% of the movement time. B: likelihood given each action alternative as a function of the preference p(c1|i) for component c1 under the constraint that p(c1|i) + p(c2|i) = pc0 = 2/5. The lists of components associated with each action alternative are indicated between brackets in the legend. C: likelihood given each action goal as a function of the preference for component c1. The lists of action alternatives corresponding to each action goal are indicated between brackets in the legend. D: likelihood given each action goal as a function of the preference p(A1|i) for action alternative A1 (without using knowledge about the desired final state f). The sum p(A1|i) + p(A7|i) = pA0 = 2/8 is kept constant. E: same as D except that knowledge about the final state f is used. F: likelihood of the planned action alternative when the inferred action goal (panel D) is imitated.

Core assumptions

Viewpoint invariance
Because the viewpoint is typically not shared, we use the distance between effector and target, and its rate of change, as perceptual input.
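As a sketch of this observable, assuming the effector trajectory is sampled at fixed intervals (the names, array shapes and finite-difference scheme are illustrative; τ = 0.1 follows the figure caption):

```python
import numpy as np

def observable(effector_xy, target_xy, dt, tau=0.1):
    """Viewpoint-invariant observable o_t = d(t) + tau * d'(t):
    distance from effector to target plus tau times its rate of change."""
    d = np.linalg.norm(np.asarray(effector_xy) - np.asarray(target_xy), axis=1)
    d_dot = np.gradient(d, dt)   # finite-difference rate of change of distance
    return d + tau * d_dot
```

On the line d + τḋ = 0 (panel B of the first figure) the observable is zero, which is where the Gaussian likelihood of a matching action alternative peaks.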

Use your own action system
Based on the action repertoire of the observer, all action alternatives (actions and the sets of components on which they operate) are enumerated. The perceptual evidence determines the likelihoods of these action alternatives (of the observer).

Infer goals instead of means
Different action alternatives (means) may entail the same action goal. By inferring this action goal an adequate response can be generated, even when the actor being observed uses different means to reach this goal.

Use task knowledge
The observer uses knowledge about which components can be combined and the ultimate goal to be reached.

Use personal preferences
When planning an action, the preferred action alternative is chosen. During action observation these preferences bias the inference process.
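A minimal sketch of this preference-based selection when imitating, following kplanned = argmax over k of p(i→j|f) p(Ak|i), restricted to the alternatives that realise the inferred goal (all names are illustrative, not from the paper):

```python
def planned_action(j_inferred, acts_of_j, p_A, p_goal_f):
    """Pick the observer's preferred action alternative for the inferred goal.
    acts_of_j[j]: indices of alternatives realising goal j;
    p_A[k]: preference p(A_k|i); p_goal_f[j]: goal likelihood p(i->j|f)."""
    return max(acts_of_j[j_inferred],
               key=lambda k: p_goal_f[j_inferred] * p_A[k])
```

Because p(i→j|f) is constant over the candidates, the choice reduces to the observer's own action preference, which is why imitation copies the goal but not necessarily the means.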
