Introduction
The discovery of mirror neurons, which are selectively active during both execution and observation of similar goal-directed actions, suggests that recognising the goal-directed actions of others takes place through simulation using one's own action system. It has also been found that children tend to imitate the goals of observed actions while using their own preferred means to reach those goals. Copying action goals is convenient because it allows the observer to imitate without using exactly the same means, which may not be possible. This becomes evident when observer and observed actor have different bodies (robots and humans) or when their environments differ substantially (e.g. when an obstacle is present in one environment but absent in the other). Here we present a model of action observation based on goal inference [1].
References
[1] Cuijpers RH, Van Schie HT, Koppen M, Erlhagen W and Bekkering H (in press). Goals and means in action observation: a computational approach. Neural Networks.
Simulations
Action goal inference
The action goal (E) can be inferred even when the target likelihoods (C) and the action alternative likelihoods (D) are ambiguous.
With knowledge about the ultimate goal state, the action goal is correctly inferred (F) after 25% of the movement time (MT).
Effect of personal preferences
The target preferences (B) can disambiguate the goal likelihood (C).
The action preferences do not affect the goal likelihood (D) without knowledge about the final goal state, but they do with that knowledge (E).
The preferred action (when imitating) is determined by the preferences (F).
[Diagram: model architecture, two panels. Left, "Action planning": task knowledge V_f(j) and preferences κ_ij, together with p(j|i) and p(A_k|i), feed action goal planning, p(ig_j|f), and action alternative planning, p(A_k|i,f). Right, "Action observation": the observation o_t, preferences λ_k, p(c_n|i), p(j|i) and p(A_k|i) feed the action alternative likelihood p(o_t|A_k,ig_j), the action goal likelihood p(o_t|ig_j), and action goal inference, p(ig_j|f,o_t).]
[Figure: simulation results, panels A-F (without vs. with final state knowledge). A: scene layout in x-y coordinates with components c1-c5. B: rate of change of distance vs. distance from target for c1-c5. C: component likelihood vs. % of MT. D: action alternative likelihood vs. % of MT, legend A1: [1 5], A2: [2 5], A3: [1 3], A4: [1 4], A5: [2 3], A6: [2 4], A7: [3 5], A8: [4 5]. E, F: action goal likelihood vs. % of MT, legend j1: [1 2], j2: [3 4 5 6], j3: [7 8]. See caption below.]
[Figure: effect of preferences, panels A-F (without vs. with final state knowledge). A: scene layout in x-y coordinates with components c1-c5. B: action alternative likelihood vs. the component preference p(c1|i) = pc0 - p(c2|i), legend A1: [1 5], A2: [2 5], A7: [3 5]. C: goal alternative likelihood vs. p(c1|i), legend j1: [1 2], j2: [3 4 5 6], j3: [7 8]. D, E: action goal likelihood vs. the action preference p(A1|i) = pA0 - p(A7|i). F: imitation probability vs. p(A1|i). See caption below.]
Modeling goal inference in action observation
Raymond H. Cuijpers¹, Hein T. van Schie¹, Mathieu Koppen¹, Wolfram Erlhagen² and Harold Bekkering¹
¹Nijmegen Institute for Cognition and Information, Radboud University, 6500 HE Nijmegen, The Netherlands
²Department of Mathematics for Science and Technology, University of Minho, 4800-058 Guimaraes, Portugal
E-mail: [email protected]
Each action alternative A_k entails a transition from goal state ig_j.
Each component c_n may be used by different action alternatives A_k.
Model architecture
[Diagram: actor and observer working jointly on a building plan, passing through intermediate states x1, x2, x3 towards the end state xE.]
Construction task
In the construction task two agents jointly construct a model from Baufix building blocks. Both agents know what must be built and how to manipulate the components. Body, action repertoire, and knowledge about how to reach the final goal state may differ.
[Diagram: from state i, action alternatives A1, A2, A3 operating on components c1, c2, c3 lead to goal states j1, j2, ..., jM.]
The perceptual input for component $c_n$ combines the effector-target distance with its rate of change:
$$o_t = d_n(t) + \tau\,\dot d_n(t).$$
The likelihood of the observation given action alternative $A_k$ and action goal $ig_j$ accumulates Gaussian evidence over the components, weighted by the normalised component preferences:
$$p(o_t \mid A_k, ig_j) = \sum_n \frac{1}{\sqrt{2\pi\sigma^2}} \exp\!\left(-\frac{o_t^2}{2\sigma^2}\right) \frac{p(c_n \mid i)}{\sum_l p(c_l \mid i)}.$$
Marginalising over the action alternatives, weighted by the normalised action preferences, yields the likelihood given each action goal:
$$p(o_t \mid ig_j) = \sum_k p(o_t \mid A_k, ig_j)\,\frac{p(A_k \mid i)}{\sum_l p(A_l \mid i)}.$$
During planning, task knowledge $V_f(j)$ biases the goal probability, from which the probability of choosing action alternative $A_k$ follows:
$$p(ig_j \mid f) \propto p(j \mid i)\,V_f(j), \qquad p(A_k \mid i, f) = p(ig_j \mid f)\,\frac{p(A_k \mid i)}{\sum_l p(A_l \mid i)}.$$
During observation, the inferred goal probability combines the perceptual evidence with the same prior:
$$p(ig_j \mid f, o_t) \propto p(o_t \mid ig_j)\,p(j \mid i)\,V_f(j).$$
The anticipated goal and the planned action are obtained by maximisation:
$$j_{\mathrm{anticipated}} = \arg\max_j\, p(ig_j \mid f, o_t), \qquad k_{\mathrm{planned}} = \arg\max_k\, p(A_k \mid i, f).$$
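The inference equations above can be sketched numerically. The following toy example uses hypothetical numbers (not the parameters of the poster's simulations) and a reduced set of alternatives and goals; it shows how evidence for a component propagates to the goal posterior:

```python
import math

# Hypothetical toy instantiation of the inference equations.
p_c = {n: 0.2 for n in range(1, 6)}               # component preferences p(c_n|i)
alternatives = {1: (1, 5), 2: (2, 5), 3: (3, 5)}  # A_k as sets of components
p_A = {1: 1 / 3, 2: 1 / 3, 3: 1 / 3}              # action preferences p(A_k|i)
goals = {1: [1, 2], 2: [3]}                       # ig_j as sets of alternatives
p_j = {1: 0.5, 2: 0.5}                            # goal prior p(j|i)
V_f = {1: 1.0, 2: 1.0}                            # task knowledge V_f(j)
sigma, tau = 0.1, 0.1

def lik_alternative(o, comps):
    """p(o_t|A_k,ig_j): Gaussian evidence over the components of A_k,
    weighted by the normalised component preferences."""
    z = sum(p_c.values())
    return sum(math.exp(-o[n] ** 2 / (2 * sigma ** 2))
               / math.sqrt(2 * math.pi * sigma ** 2) * p_c[n] / z
               for n in comps)

def goal_posterior(o):
    """p(ig_j|f,o_t) ~ p(o_t|ig_j) p(j|i) V_f(j), normalised over goals."""
    zA = sum(p_A.values())
    post = {}
    for j, alts in goals.items():
        lik = sum(lik_alternative(o, alternatives[k]) * p_A[k] / zA for k in alts)
        post[j] = lik * p_j[j] * V_f[j]
    s = sum(post.values())
    return {j: v / s for j, v in post.items()}

# Perceptual input o_t = d_n(t) + tau * d'_n(t): the hand is close to
# component 1 and still approaching it, far from all the others.
d = {1: 0.02, 2: 0.5, 3: 0.6, 4: 0.7, 5: 0.4}
d_dot = {1: -0.1, 2: 0.0, 3: 0.0, 4: 0.0, 5: 0.0}
o = {n: d[n] + tau * d_dot[n] for n in d}

post = goal_posterior(o)
print(post)  # goal 1 (reachable via A_1 or A_2) dominates
```

Because A_1 and A_2 both belong to goal 1, evidence for either alternative raises the same goal probability; this is the sense in which goals, rather than means, are inferred.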
A: scene layout of the bolts (circles), nuts (squares) and a 3-holed slat (diamond). The line indicates the movement trajectory with a dot at every 10% of the movement time. B: rate of change of distance plotted as a function of the distance of the hand from the target. The solid black line indicates the line d + τḋ = 0, where τ = 0.1. C: likelihood given each component as a function of time (in % of movement time). D: likelihood given each action alternative as a function of time (in % of movement time). The lists of components associated with each action alternative are indicated between brackets in the legend. E: likelihood given each action goal without using knowledge about the desired final state. The lists of action alternatives corresponding to each action goal are indicated between brackets in the legend. F: likelihood given each action goal using knowledge about the desired final state. The vertical line indicates the point in time where the likelihood ratio of the first and second largest likelihood exceeds the threshold α = 1.5.
Effect of changing the component preferences (B, C) and the action alternative preferences (D, E, F) on the action goal inference. A: scene layout of the bolts (circles), nuts (squares) and a 3-holed slat (diamond). The line indicates the movement trajectory towards c5 with a dot at every 10% of the movement time. B: likelihood given each action alternative as a function of the preference p(c1|i) for component c1 under the constraint that p(c1|i) + p(c2|i) = pc0 = 2/5. The lists of components associated with each action alternative are indicated between brackets in the legend. C: likelihood given each action goal as a function of the preference for component c1. The lists of action alternatives corresponding to each action goal are indicated between brackets in the legend. D: likelihood given each action goal as a function of the preference p(A1|i) for action alternative A1 (without using knowledge about the desired final state f). The sum p(A1|i) + p(A7|i) = pA0 = 2/8 is kept constant. E: same as D except that knowledge about the final state f is used. F: likelihood of the planned action alternative when the inferred action goal (panel D) is imitated.
Core assumptions
Viewpoint invariance
Because the viewpoint is typically not shared, we use the distance between effector and target and its rate of change as perceptual input.
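A minimal sketch of this input (hypothetical helper, 2-D coordinates and a finite-difference velocity estimate assumed):

```python
import math

def perceptual_input(hand, hand_prev, target, dt, tau=0.1):
    """o_t = d(t) + tau * d'(t): viewpoint-invariant input built from the
    effector-target distance and its estimated rate of change."""
    d = math.dist(hand, target)
    d_dot = (d - math.dist(hand_prev, target)) / dt
    return d + tau * d_dot

# A hand moving straight towards the target drives o_t towards zero,
# i.e. towards the line d + tau * d' = 0 shown in the figures.
o = perceptual_input(hand=(0.5, 0.0), hand_prev=(0.6, 0.0), target=(0.0, 0.0), dt=0.1)
print(o)  # 0.5 + 0.1 * (-1.0) = 0.4
```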
Use your own action system
Based on the action repertoire of the observer, all action alternatives (actions and the sets of components on which they operate) are enumerated. The perceptual evidence determines the likelihoods of these action alternatives (of the observer).
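For the construction task this enumeration can be sketched as follows; the grouping of components is a hypothetical one (assuming c1-c2 are the bolts, c3-c4 the nuts and c5 the slat) chosen to reproduce the eight alternatives A1-A8 listed in the figure legends:

```python
# Hypothetical component grouping: bolts c1, c2; nuts c3, c4; slat c5.
bolts, nuts, slats = [1, 2], [3, 4], [5]

# Every pairing the observer's repertoire allows: screw a bolt into the
# slat, screw a bolt into a nut, or put a nut onto the slat.
alternatives = (
    [(b, s) for b in bolts for s in slats]    # A1: (1,5), A2: (2,5)
    + [(b, n) for b in bolts for n in nuts]   # A3-A6: bolt-nut pairs
    + [(n, s) for n in nuts for s in slats]   # A7: (3,5), A8: (4,5)
)
print(alternatives)
```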
Infer goals instead of means
Different action alternatives (means) may entail the same action goal. By inferring this action goal, an adequate response can be generated even when the observed actor uses different means to reach it.
Use task knowledge
The observer uses knowledge about which components can be combined and about the ultimate goal to be reached.
Use personal preferences
When planning an action, the preferred action alternative is chosen. During action observation these preferences bias the inference process.