MÁTÉ LENGYEL
Computational and Biological Learning Lab
Department of Engineering
University of Cambridge
EPISODIC MEMORY: WHY AND HOW?
(THE POWERS AND PERILS OF BAYESIAN INFERENCE IN THE BRAIN)
Máté Lengyel: Episodic memory: why and how? BCCN 2009, 3 October 2009 http://www.eng.cam.ac.uk/~m.lengyel
EPISODIC MEMORY: AN EXAMPLE
I raised to my lips a spoonful of the tea in which I had soaked a morsel of the cake. ... And suddenly the memory returns. The taste was that of the little crumb of madeleine which on Sunday mornings at Combray, when I went to say good day to her in her bedroom, my aunt Léonie used to give me, dipping it first in her own cup of real or of lime-flower tea.
Marcel Proust: À la recherche du temps perdu
PART I: WHY DO WE HAVE SUCH MEMORIES?
• specific personal experiences
• organised into sequences of events
PART II: HOW DOES THIS HAPPEN?
• memories laid down in the past
• recalled in response to a cue
ON THE USE OF MEMORIES
[Diagram: experience (data) → learning → representation in memory → control → predictions, extended to planning for the future]
• recollection: ah, those nice days back in Combray
• prediction: if I soak my cake in my tea → it will taste good
• planning for the future (sequential decision making): what shall I do next to taste something nice in the end?
  - delayed rewards
  - temporal credit assignment
• ‘semantic’ memory: model of the environment (sufficient statistics); learning is clever, control is hard
• ‘episodic’ memory: select episodes (data points); learning is dull, control is easy
SEQUENTIAL DECISION-MAKING UNDER UNCERTAINTY
Lengyel & Dayan, NIPS 2007
SEMANTIC MEMORY: MODEL-BASED CONTROL
Lengyel & Dayan, NIPS 2007
[Figure: panels labelled T=0, T=1, T=100]
• learns a model of the environment (posterior distribution over parameters, etc.)
• selects actions by recursive ‘mental simulation’
• tree search implies combinatorial explosion
• approximations are necessary
• effective computational noise
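The model-based bullets above can be illustrated with a minimal sketch. It is not the controller from Lengyel & Dayan (2007): the class name, the use of empirical transition frequencies in place of a full posterior, and the rule that a reward is attached to the state entered are all assumptions made for illustration. The `nodes_expanded` counter is there only to make the growth of the tree search visible.

```python
from collections import defaultdict


class ModelBasedController:
    """Toy 'semantic' controller (hypothetical): learn a model, plan by tree search."""

    def __init__(self, actions):
        self.actions = list(actions)
        # (state, action) -> {next_state: visit count}; counts stand in for a posterior
        self.counts = defaultdict(lambda: defaultdict(int))
        self.rewards = {}        # assumed: reward observed on entering a state
        self.nodes_expanded = 0  # bookkeeping to expose the cost of the search

    def observe(self, state, action, next_state, reward):
        self.counts[(state, action)][next_state] += 1
        self.rewards[next_state] = reward

    def q_value(self, state, action, depth):
        """Expected return of an action under the learned model ('mental simulation')."""
        self.nodes_expanded += 1
        successors = self.counts.get((state, action), {})
        total = sum(successors.values())
        if total == 0:
            return 0.0  # nothing known about this pair yet
        return sum(
            (n / total) * (self.rewards.get(s2, 0.0) + self.value(s2, depth - 1))
            for s2, n in successors.items()
        )

    def value(self, state, depth):
        if depth <= 0:
            return 0.0
        return max(self.q_value(state, a, depth) for a in self.actions)

    def act(self, state, depth):
        return max(self.actions, key=lambda a: self.q_value(state, a, depth))
```

Every extra level of lookahead multiplies the number of `q_value` calls by roughly the number of actions times the number of known successors, which is the combinatorial explosion the slide refers to.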
EPISODIC MEMORY: MODEL-FREE CONTROL
Lengyel & Dayan, NIPS 2007
[Figure: panels labelled T=0, T=1, T=100]
• stores specific episodes retrospectively (state-action-…-reward sequences)
• selects action that yielded maximal ultimate reward in past episodes starting from current state
compatible with:
• hippocampal involvement in
  - processing sequential memories (Fortin et al, 2002; Ergorul & Eichenbaum, 2006; Manns et al, 2007; Lehn et al, 2009)
  - imagining new experiences (Hassabis et al, 2007)
• awake forward replay at decision points (Johnson & Redish, 2007)
• reward and episodic information integrated (Lansink et al, 2009; Rossato et al, 2009)
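As a rough illustration of the episodic controller described above, the sketch below stores each episode once its final reward is known and later picks the action with the best remembered outcome from the current state. The class and method names are hypothetical, and keeping only the best reward per state-action pair (rather than whole episodes) is a simplifying assumption.

```python
class EpisodicController:
    """Toy episodic controller (hypothetical): remember the best outcome per state-action pair."""

    def __init__(self):
        # (state, action) -> highest final reward of any stored episode passing through the pair
        self.best = {}

    def store_episode(self, steps, final_reward):
        """Store an episode retrospectively; steps is a list of (state, action) pairs."""
        for state, action in steps:
            key = (state, action)
            if final_reward > self.best.get(key, float("-inf")):
                self.best[key] = final_reward

    def act(self, state, actions):
        """Return the action with the maximal remembered reward, or None if nothing is stored."""
        known = [a for a in actions if (state, a) in self.best]
        if not known:
            return None
        return max(known, key=lambda a: self.best[(state, a)])
```

No model is learned and no search is run at decision time: control is a lookup, which is cheap but only as good as the episodes that happen to have been stored.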
COMPARING THE TWO SYSTEMS
Lengyel & Dayan, NIPS 2007
• vary complexity of the environment (A, B, and D)
  - A: number of actions
  - B: branching factor
  - D: depth
• vary amount of experience available to semantic and episodic memory systems
• compute average performance of three systems
  1. perfect semantic memory-based control → theoretical upper bound
  2. approximate semantic memory-based controller
  3. episodic memory-based controller
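To make the roles of A, B and D concrete, here is a small sketch of a random tree environment and of the exact value a perfect model-based controller would obtain in it. The layout is an assumption, not taken from the slides: every non-terminal state has B successor states shared by all A actions, each action has its own random distribution over those successors, and rewards are Gaussian and occur only at the leaves. All function names are hypothetical.

```python
import random


def make_tree(A, B, D, rng):
    """Build a random tree environment as nested dicts (assumed layout, see above)."""
    if D == 0:
        return {"reward": rng.gauss(0.0, 1.0)}  # assumed: rewards only at leaves
    probs = []
    for _ in range(A):
        weights = [rng.random() for _ in range(B)]
        total = sum(weights)
        probs.append([w / total for w in weights])  # one distribution per action
    children = [make_tree(A, B, D - 1, rng) for _ in range(B)]
    return {"probs": probs, "children": children}


def action_values(node):
    """Exact action values at a non-terminal node, given the true model."""
    child_values = [optimal_value(c) for c in node["children"]]
    return [sum(p * v for p, v in zip(row, child_values)) for row in node["probs"]]


def optimal_value(node):
    """Value under perfect model-based control: best action at every state."""
    if "reward" in node:
        return node["reward"]
    return max(action_values(node))


def count_state_action_pairs(node):
    """Number of state-action pairs a learner would have to experience."""
    if "reward" in node:
        return 0
    return len(node["probs"]) + sum(count_state_action_pairs(c) for c in node["children"])
```

Under these assumptions the number of state-action pairs grows geometrically with depth, so even modest values of B and D give environments that take a long time to explore.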
THE PERFECT MODEL-BASED SYSTEM
Lengyel & Dayan, NIPS 2007
core idea: analysing a sub-treelet → whole environment
[Diagram: a treelet with state value $V$, action values $Q_1, \dots, Q_A$, transition probabilities $p_{11}, \dots, p_{AB}$, and successor-state values $V'_1, V'_2, \dots, V'_B$]
• value of state: $V$
• values of available actions: $Q_1, \dots, Q_A$
• values of successor states: $V'_1, \dots, V'_B$
• statistics propagated up the treelet: $(\mu_{V'}, \sigma_{V'})$ → averaging → $(\mu_Q, \sigma_Q, C_Q)$ → max → $(\mu_V, \sigma_V)$
more complex environment → more potential for high rewards
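The averaging and max steps on the treelet can be written out as a worked equation. This is the standard one-step backup, given here as a sketch; it assumes no immediate reward and no discounting inside the treelet, neither of which the slide specifies.

```latex
% Averaging: each action value is the expected value of the successor states
Q_a = \sum_{b=1}^{B} p_{ab} \, V'_b , \qquad a = 1, \dots, A
% Max: the value of the state is the value of its best action
V = \max_{a \in \{1, \dots, A\}} Q_a
```

Applying the two steps recursively from the leaves upward carries the statistics of $V'$ through those of $Q$ to those of $V$, which is how the analysis of one treelet extends to the whole environment.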
THE EFFECTS OF APPROXIMATIONS
Lengyel & Dayan, NIPS 2007
true action values → noisy versions:
$\tilde{Q} = \eta_1 Q + \eta_2 z, \quad z \sim \mathcal{N}(0, 1)$
noise-to-signal ratio: $\omega^2 = \frac{\eta_2^2}{\eta_1^2 \sigma_Q^2}$
actual reward depends on $Q$ of the action with highest $\tilde{Q}$
[Figure, single state: distributions $P(\tilde{Q}_1 \mid Q_1)$ and $P(\tilde{Q}_2 \mid Q_2)$ around the true values $Q_1$ and $Q_2$, annotated with $\sigma_Q$ and $\omega$]
[Figure, whole environment: curves for increasing levels of computational noise]
deeper environment → noise is more deleterious
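A short simulation can illustrate the single-state case: the controller picks the action with the highest noisy value but receives the true value of that action. This is only a sketch. Drawing the true values independently from a zero-mean Gaussian with standard deviation `sigma_q` is an assumption, and the function names are hypothetical.

```python
import random


def noise_to_signal(eta1, eta2, sigma_q):
    """Noise-to-signal ratio: eta2^2 / (eta1^2 * sigma_q^2)."""
    return eta2 ** 2 / (eta1 ** 2 * sigma_q ** 2)


def mean_reward_under_noise(n_actions, eta1, eta2, sigma_q, trials, seed=0):
    """Average true value of the action chosen by maximising the noisy values."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # assumed: true action values are i.i.d. Gaussian with sd sigma_q
        q = [rng.gauss(0.0, sigma_q) for _ in range(n_actions)]
        # noisy version of each value: eta1 * Q + eta2 * z, with z ~ N(0, 1)
        noisy = [eta1 * qi + eta2 * rng.gauss(0.0, 1.0) for qi in q]
        chosen = max(range(n_actions), key=lambda i: noisy[i])
        total += q[chosen]  # reward depends on the true value of the chosen action
    return total / trials
```

Comparing, say, `mean_reward_under_noise(4, 1.0, 0.0, 1.0, 5000)` with the same call using `eta2=5.0` shows the average reward dropping as the noise-to-signal ratio grows.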
THE EFFECT OF LEARNING
Lengyel & Dayan, NIPS 2007
key idea: ignorance about the environment → additional noise
[Figure: three plots; axis labels and tick values lost in extraction]
• number of times each state-action pair is visited
• time required to collect it $\propto A B^D$
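The link between ignorance and noise can be sketched with a single unknown transition probability. Under an assumed Beta prior, the posterior standard deviation shrinks as the state-action pair is visited more often, and the total experience needed grows with the size of the environment. The Beta model, the tree layout (B shared successors per state) and the function names are illustrative assumptions, not details from the slides.

```python
def beta_posterior_sd(successes, visits, prior_a=1.0, prior_b=1.0):
    """Posterior standard deviation of a transition probability after some visits."""
    a = prior_a + successes
    b = prior_b + (visits - successes)
    variance = (a * b) / ((a + b) ** 2 * (a + b + 1.0))
    return variance ** 0.5


def experience_needed(A, B, D, visits_per_pair):
    """Samples needed to visit every state-action pair a fixed number of times.

    Assumes a tree in which each non-terminal state has A actions and B successors.
    """
    non_terminal_states = sum(B ** d for d in range(D))
    return visits_per_pair * A * non_terminal_states
```

With ten times more visits the remaining uncertainty falls by roughly a factor of three, while each extra level of depth multiplies the experience required by about B.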
THE EFFECTS OF COMPUTATIONAL + IGNORANCE NOISE
Lengyel & Dayan, NIPS 2007
[Figure: curves for increasing levels of computational noise]
approximations: more adverse effects early in learning
→ room for alternative decision-making systems in the low-data limit
EPISODIC VS. SEMANTIC MEMORY-BASED DECISION MAKING
Lengyel & Dayan, NIPS 2007
[Figure: horizontal axis: amount of experience; panels arranged by increasing environmental complexity]
• episodic advantage early in learning
• lasts longer for more complex environments
SUMMARY
[Diagram: axes labelled ‘learning’ and ‘environmental complexity’]
• semantic memory: neocortex
• episodic memory: hippocampus
• ‘value’ or procedural memory: striatum
• labels on the diagram: ‘consolidation’, ‘habitization’, ‘competing memory systems’ (Daw et al, 2005)
• consolidation: transfer of control rather than transfer of memories?
• environmental non-stationarity
OPEN QUESTIONS
• granularity of integration of episodic with model-based system
• episodes store outcomes (goal-directed) or rewards (habitual)
• arbitration between parallel systems → need to represent uncertainty (Daw et al, 2005)
• replay during sleep not for consolidation → keeping memory representations in register (Káli & Dayan, 2005)