Modeling goal inference in action observation

Raymond H. Cuijpers1, Hein T. van Schie1, Mathieu Koppen1, Wolfram Erlhagen2 and Harold Bekkering1

1 Nijmegen Institute for Cognition and Information, Radboud University, 6500 HE Nijmegen, The Netherlands
2 Department of Mathematics for Science and Technology, University of Minho, 4800-058 Guimaraes, Portugal

E-mail: [email protected]

Introduction
The discovery of mirror neurons, which are selectively active during execution and observation of similar goal-directed actions, suggests that recognising the goal-directed actions of others takes place through simulation using one's own action system. It has also been found that children tend to imitate the goals of observed actions and use their own preferred means to reach those goals. This is convenient because copying action goals allows the observer to imitate without using the exact same means, which may not be possible. This becomes evident when the observer and the observed actor have different bodies (robots and humans) or when the environments of actor and observer differ substantially (when an obstacle is present or absent in either environment). Here we present a model of action observation based on goal inference [1].

References
[1] Cuijpers RH, Van Schie HT, Koppen M, Erlhagen W and Bekkering H (in press). Goals and means in action observation: a computational approach. Neural Networks.

Simulations

Action goal inference

The action goal (E) can be inferred even when the target likelihoods (C) and action alternative likelihoods (D) are ambiguous.

With knowledge about the ultimate goal state, the action goal is correctly inferred (F) after 25% of the movement time (MT).
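The criterion for "correctly inferred" is the likelihood-ratio threshold α = 1.5 given in the figure caption below. A minimal sketch of that stopping rule, with the function name and array layout our own:

```python
import numpy as np

def decision_time(goal_likelihoods, alpha=1.5):
    """Return the first time index at which the largest action goal
    likelihood exceeds the second largest by the ratio alpha.

    goal_likelihoods: array of shape (timesteps, n_goals), e.g. the
    curves of panel F sampled over % of movement time.
    """
    for t, L in enumerate(goal_likelihoods):
        top, second = np.sort(L)[::-1][:2]
        if second > 0 and top / second >= alpha:
            return t
    return None  # no goal wins within the movement
```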

Effect of personal preferences

The target preferences (B) can disambiguate the goal likelihood (C).

The action preferences do not affect the goal likelihood (D) without knowledge about the final goal state, but they do with that knowledge (E).

The preferred action (when imitating) is determined by the preferences (F).

[Figure: Model architecture. Action planning (left): task knowledge (κij) and preferences (Vf(j)) determine p(j|i) and p(Ak|i), from which the action goal p(igj|f) and the action alternative p(Ak|i,f) are planned. Action observation (right): the same task knowledge and preferences (now including λk and p(cn|i)), together with the observation ot and the likelihoods p(ot|Ak,igj) and p(ot|igj), drive action goal inference p(igj|f,ot) and the action alternative likelihood.]

[Figure: Action goal inference simulation, panels A–F (full caption below). A: scene layout (x and y coordinates; components c1–c5). B: rate of change of distance vs. distance from target (c1–c5). C: component likelihood vs. % of MT (c1–c5). D: action alternative likelihood vs. % of MT (A1: [1 5], A2: [2 5], A3: [1 3], A4: [1 4], A5: [2 3], A6: [2 4], A7: [3 5], A8: [4 5]). E, F: action goal likelihood vs. % of MT (j1: [1 2], j2: [3 4 5 6], j3: [7 8]), without and with final state knowledge.]

[Figure: Effect of preferences, panels A–F (full caption below). A: scene layout (components c1–c5). B: action alternative likelihood vs. p(c1|i) = pc0 − p(c2|i) (A1: [1 5], A2: [2 5], A7: [3 5]). C: action goal likelihood vs. p(c1|i) (j1: [1 2], j2: [3 4 5 6], j3: [7 8]). D, E: action goal likelihood vs. p(A1|i) = pA0 − p(A7|i), without and with final state knowledge. F: imitation probability vs. p(A1|i) (A1, A2, A7).]


Each action alternative Ak entails a goal state transition igj (from state i to state j).

Each component cn may be used by different action alternatives Ak.
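As a concrete illustration, these many-to-many relations between components, action alternatives, and goal transitions can be written down directly; the numbers below are the ones listed in the legends of the simulation figures:

```python
# Components c_n used by each action alternative A_k
# (indices as in the legend of the first simulation figure):
alt_components = {
    "A1": [1, 5], "A2": [2, 5], "A3": [1, 3], "A4": [1, 4],
    "A5": [2, 3], "A6": [2, 4], "A7": [3, 5], "A8": [4, 5],
}

# Action alternatives realising each goal transition igj:
goal_alternatives = {
    "j1": ["A1", "A2"],
    "j2": ["A3", "A4", "A5", "A6"],
    "j3": ["A7", "A8"],
}
```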

Model architecture

[Figure: Construction task — actor and observer with the building plan; construction states x1, x2, x3 lead to the end state xE.]

Construction task
In the construction task two agents jointly construct a model from Baufix building blocks. Both agents know what must be built and how to manipulate the components. Body, action repertoire, and knowledge about how to reach the final goal state may differ.

[Figure: From the current goal state i, action alternatives A1, A2, A3 operate on components c1, c2, c3 and realise transitions to goal states j1, j2, …, jM.]

The perceptual evidence for component $c_n$ combines its distance to the effector and the rate of change of that distance:

$$o_t = d_n(t) + \tau\,\dot{d}_n(t)$$

The likelihood of the evidence given an action alternative $A_k$ realising the goal transition $igj$ is a preference-weighted mixture over the components on which $A_k$ operates:

$$p(o_t \mid A_k, igj) = \sum_n \frac{p(c_n \mid i)}{\sum_l p(c_l \mid i)}\; \frac{1}{\sqrt{2\pi\sigma^2}}\, \exp\!\left(-\frac{o_t^2}{2\sigma^2}\right)$$

$$p(o_t \mid igj) = \sum_k \frac{p(A_k \mid i)}{\sum_l p(A_l \mid i)}\; p(o_t \mid A_k, igj)$$

$$p(A_k \mid i, f) = p(igj \mid f)\, \frac{p(A_k \mid i)}{\sum_l p(A_l \mid i)}$$

$$p(igj \mid f, o_t) \propto p(o_t \mid igj)\, p(j \mid i)\, V_f(j)$$

$$p(igj \mid f) \propto p(j \mid i)\, V_f(j)$$

$$j_{\text{anticipated}} = \arg\max_j\, p(igj \mid f, o_t) \qquad k_{\text{planned}} = \arg\max_k\, p(A_k \mid i, f)$$
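A minimal NumPy sketch of one inference step, putting the equations above together. All function and variable names are our own, and the Gaussian width σ and the preference values in the usage example are illustrative, not the poster's parameters:

```python
import numpy as np

def gaussian(o, sigma):
    """Zero-mean Gaussian likelihood of the evidence o."""
    return np.exp(-o**2 / (2 * sigma**2)) / np.sqrt(2 * np.pi * sigma**2)

def infer_goal(o_t, alt_components, alt_goal, p_c, p_A, p_j, V_f, sigma=0.1):
    """One step of action goal inference.

    o_t            : o_t[n] = d_n(t) + tau * d_dot_n(t) per component c_n
    alt_components : alt_components[k] = component indices (0-based) of A_k
    alt_goal       : alt_goal[k] = index of the goal transition of A_k
    p_c, p_A       : component / action alternative preferences (unnormalised)
    p_j            : transition priors p(j|i)
    V_f            : task knowledge V_f(j)
    """
    p_c = np.asarray(p_c, float) / np.sum(p_c)   # p(c_n|i) / sum_l p(c_l|i)
    p_A = np.asarray(p_A, float) / np.sum(p_A)   # p(A_k|i) / sum_l p(A_l|i)

    p_o_goal = np.zeros(len(p_j))
    for k, comps in enumerate(alt_components):
        # p(o_t|A_k, igj): preference-weighted mixture over components of A_k
        p_o_Ak = sum(gaussian(o_t[n], sigma) * p_c[n] for n in comps)
        # p(o_t|igj): accumulate over the alternatives leading to goal j
        p_o_goal[alt_goal[k]] += p_A[k] * p_o_Ak

    # p(igj|f, o_t) ~ p(o_t|igj) p(j|i) V_f(j); normalised for readability
    post = p_o_goal * np.asarray(p_j, float) * np.asarray(V_f, float)
    return post / post.sum()

# Usage (two components, three alternatives, two goals; numbers made up):
o_t = np.array([0.05, 0.9])          # component c1 looks like the target
p = infer_goal(o_t, alt_components=[[0], [1], [0, 1]], alt_goal=[0, 0, 1],
               p_c=[1, 1], p_A=[1, 1, 1], p_j=[0.5, 0.5], V_f=[1.0, 1.0])
```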

A: scene layout of the bolts (circles), nuts (squares) and a 3-holed slat (diamond). The line indicates the movement trajectory with a dot at every 10% of the movement time. B: rate of change of distance plotted as a function of the distance of the hand from the target. The solid black line indicates the line $d + \tau\dot{d} = 0$, where $\tau = 0.1$. C: likelihood given each component as a function of time (in % of movement time). D: likelihood given each action alternative as a function of time (in % of movement time). The lists of components associated with each action alternative are indicated between brackets in the legend. E: likelihood given each action goal without using knowledge about the desired final state. The lists of action alternatives corresponding to each action goal are indicated between brackets in the legend. F: likelihood given each action goal using knowledge about the desired final state. The vertical line indicates the point in time where the likelihood ratio of the first and second largest likelihood exceeds the threshold $\alpha = 1.5$.

Effect of changing the component preferences (B, C) and the action alternative preferences (D, E, F) on the action goal inference. A: scene layout of the bolts (circles), nuts (squares) and a 3-holed slat (diamond). The line indicates the movement trajectory towards c5 with a dot at every 10% of the movement time. B: likelihood given each action alternative as a function of the preference p(c1|i) for component c1 under the constraint that p(c1|i)+p(c2|i)=pc0=2/5. The lists of components associated with each action alternative are indicated between brackets in the legend. C: likelihood given each action goal as a function of the preference for component c1. The lists of action alternatives corresponding to each action goal are indicated between brackets in the legend. D: likelihood given each action goal as a function of the preference p(A1|i) for action alternative A1 (without using knowledge about the desired final state f). The sum p(A1|i)+p(A7|i)=pA0=2/8 is kept constant. E: same as D except that knowledge about the final state f is used. F: likelihood of the planned action alternative when the inferred action goal (panel D) is imitated.

Core assumptions

Viewpoint invariance
Because the viewpoint is typically not shared, we use the distance between effector and target and its rate of change as perceptual input.
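A sketch of how that input could be computed from positions; the finite-difference derivative and the parameter names are assumptions (the simulations use τ = 0.1):

```python
import numpy as np

def perceptual_input(effector, effector_prev, targets, dt, tau=0.1):
    """Viewpoint-invariant evidence o_t = d_n(t) + tau * d_dot_n(t)
    for every candidate target component c_n.

    effector, effector_prev : effector positions at t and t - dt, shape (2,)
    targets                 : component positions, shape (n_components, 2)
    """
    d = np.linalg.norm(targets - effector, axis=1)
    d_prev = np.linalg.norm(targets - effector_prev, axis=1)
    d_dot = (d - d_prev) / dt            # finite-difference rate of change
    return d + tau * d_dot
```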

Use your own action system
Based on the action repertoire of the observer all action alternatives (actions and sets of components on which they operate) are enumerated. The perceptual evidence determines the likelihoods of these action alternatives (of the observer).

Infer goals instead of means
Different action alternatives (means) may entail the same action goal. By inferring this action goal an adequate response can be generated, even when the actor being observed uses different means to reach this goal.

Use task knowledge
The observer uses knowledge about which components can be combined and the ultimate goal to be reached.

Use personal preferences
When planning an action, the preferred action alternative is chosen. During action observation these preferences bias the inference process.
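In the same notation, planning during imitation simply reweights the observer's own action preferences by the inferred goal; a sketch using the quantities defined above (names are ours):

```python
import numpy as np

def plan_action(p_A, alt_goal, p_goal_f):
    """k_planned = argmax_k p(A_k|i,f), with
    p(A_k|i,f) = p(igj|f) * p(A_k|i) / sum_l p(A_l|i)."""
    p_A = np.asarray(p_A, float) / np.sum(p_A)
    p_goal_f = np.asarray(p_goal_f, float)
    return int(np.argmax(p_goal_f[alt_goal] * p_A))
```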
