Striatal Dopamine (DA) and Learning: Do Category
Learning (CL) data constrain computational models?
Alan Pickering, Department of Psychology
Overview
• Classic CL findings and questions
• DA, the striatum and learning
• Generate simple hypothesis about CL deficits in Parkinson’s Disease
• Generate simple biologically-constrained neural net to test hypothesis
• Simulate CL data on 2 types of matched CL tasks
• Conclusions – why model fails
Classic Findings and Questions
• Parkinson’s Disease (PD) patients are impaired at CL tasks.
• Why?
  - What psychological processes are impaired?
  - What brain regions and neurotransmitters are involved?
Category Learning in Parkinson’s Disease
Weather task: Knowlton et al, 1996
Category Learning in Parkinson’s Disease
Main Findings: Knowlton et al, 1996
Key Facts
• PD involves prominent damage to the striatum
• CL may (sometimes) involve procedural/habit learning
• Striatal structures are part of cortico-striato-pallido-thalamic loops possibly implicated in procedural learning
• The striatum is strongly innervated by ascending DA projections
Simple Interpretation
• CL deficits in PD may arise because of damage to …
loss of ascending DA signals
which compromise the functioning of (parts of) …
the striatum
Three Learning Processes Which Might Be DA-Related
1. Appetitive reinforcement and motivation
DA cell firing increases/decreases provide a positive/negative reinforcement signal which is required for synaptic strengthening/weakening
“3-factor learning rule”
(e.g., Wickens; Brown et al etc.)
Corticostriatal (Medium Spiny Cell) Synapse
DA Receptors in Striatum (after Schultz, 1998)
DA receptors: Unfilled rectangles
GLU receptors: Filled rectangles
DA-Related Processes (cont)
2. Reward Prediction Error
Midbrain DA neurons increase firing in response to unexpected rewards and decrease firing to nonoccurrence of expected rewards
Firing change = reward prediction error
Schultz, Suri, Dickinson, Dayan etc.
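The reward prediction error described on this slide can be written as the difference between the reward obtained and the reward predicted. A minimal Python sketch, assuming scalar reward and prediction values (the function name `reward_prediction_error` is illustrative and does not come from the talk):

```python
def reward_prediction_error(reward: float, predicted_reward: float) -> float:
    """Signed difference between the obtained and the predicted reward."""
    return reward - predicted_reward


# Unexpected reward: positive error (DA firing increases).
unexpected = reward_prediction_error(reward=1.0, predicted_reward=0.0)

# Fully predicted reward: zero error (no change in firing).
predicted = reward_prediction_error(reward=1.0, predicted_reward=1.0)

# Expected reward omitted: negative error (DA firing decreases).
omitted = reward_prediction_error(reward=0.0, predicted_reward=1.0)
```

The sign of the result matches the direction of the firing change listed above; the magnitude scale is arbitrary here.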
DA Cell Recordings: Evidence For Reward Prediction Error
CUE REWARD
DA-Related Processes (cont)
3. Modulation of Neural Signals
Floresco et al (2001): “DA receptor activity serves to strengthen salient inputs while inhibiting weaker ones”
Also: Nicola & Malenka; J.D. Cohen; Ashby & Casale; Salum et al; Nakahara; Schultz
Evidence For Modulation
Nicola & Malenka, 1997: recorded effect of DA on response of striatal cells to strong and weak inputs
[Figure panels: Strong; Weak]
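One common way to formalise "strengthen strong inputs, inhibit weak ones" is gain modulation of a sigmoidal response. The sketch below is only an illustration under that assumption; the logistic form, the `threshold` of 0.5 and the gain values are hypothetical choices, not parameters reported in the talk or in the cited studies.

```python
import math


def modulated_response(net_input: float, gain: float, threshold: float = 0.5) -> float:
    """Logistic response whose steepness is set by a DA-dependent gain (assumed form)."""
    return 1.0 / (1.0 + math.exp(-gain * (net_input - threshold)))


strong, weak = 0.9, 0.1  # hypothetical input strengths above and below threshold

# Low DA is modelled as low gain, high DA as high gain (an assumption).
low_da = (modulated_response(strong, gain=1.0), modulated_response(weak, gain=1.0))
high_da = (modulated_response(strong, gain=5.0), modulated_response(weak, gain=5.0))
```

With the higher gain, the response to the strong input rises and the response to the weak input falls, which is the qualitative pattern the slide describes.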
Linking 3-Factor Learning & Reward Prediction Error
[Diagram: Cue, Reward, Striatal Cell, DA Cell, Reward prediction, Reward prediction error. Connection types: Excitatory, Inhibitory, Reinforcement]
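The link between the two ideas can be illustrated with a toy simulation: a striatal cell learns to predict reward from a cue, and the DA cell reports the remaining prediction error, which in turn reinforces the cue-to-striatum weight. This is a minimal sketch assuming a single cue, a linear prediction and a hypothetical `learning_rate`; none of these values come from the talk.

```python
def simulate_cue_reward_pairings(n_trials=50, learning_rate=0.2, reward=1.0):
    """Repeatedly pair one cue with a reward and track the prediction error."""
    weight = 0.0  # cue -> striatal cell weight (starts with no prediction)
    errors = []
    for _ in range(n_trials):
        cue = 1.0
        prediction = weight * cue                # striatal cell: reward prediction
        error = reward - prediction              # DA cell: reward minus prediction
        weight += learning_rate * error * cue    # error reinforces the active synapse
        errors.append(error)
    return weight, errors


final_weight, errors = simulate_cue_reward_pairings()
```

Across trials the error shrinks towards zero as the prediction approaches the reward, mirroring the transfer of the DA response away from a predicted reward.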
Simple Working Hypothesis
• CL is impaired in PD patients (and other DA-compromised groups) due to “reduced DA function” in striatum (tail of caudate)
• The loss of ascending DA input reduces the reinforcing function of the reward prediction error signal innervating the striatum
Modelling
• Biologically-constrained neural net
• Data to be simulated taken from Ashby et al (2003)
• Data from young and old controls (YC, OC) and PD patients
• Study used matched CL tasks: rule-based (RB) and Information Integration (II)
• Ashby and colleagues believe these tasks are handled by distinct CL systems
Ashby et al: II Task
• 3 of the 4 dimensions determine categories
• Not readily verbalisable
[Figure panels: Cat A; Cat B]
Ashby et al: RB Task
• 1 dimension (background colour) determines category
• Readily verbalisable rule
[Figure panels: Cat B; Cat A]
Ashby et al: Results
• Proportion failing to learn to criterion in 200 trials
[Figure: proportion of nonlearners (axis 0 to 0.6) for YC, OC and PD; panels: II Task, RB Task]
RB Task: Results
• Trials to criterion for learners
[Figure: trials to criterion (axis 0 to 90) for YC, OC and PD; panels: II Task, RB Task]
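Both outcome measures above (proportion of nonlearners and trials to criterion for learners) can be computed from each participant's trial-by-trial accuracy. The slides do not state the learning criterion, so the sketch below assumes a hypothetical criterion of `run_length` consecutive correct responses and counts trials up to and including the end of that run; both choices are assumptions for illustration.

```python
from typing import Optional, Sequence


def trials_to_criterion(correct: Sequence[bool], run_length: int = 10) -> Optional[int]:
    """Trial number on which the criterion run is completed, or None for a nonlearner."""
    streak = 0
    for trial, is_correct in enumerate(correct, start=1):
        streak = streak + 1 if is_correct else 0
        if streak >= run_length:
            return trial
    return None


def summarise_group(group: Sequence[Sequence[bool]], run_length: int = 10):
    """Return (proportion of nonlearners, mean trials to criterion for learners)."""
    scores = [trials_to_criterion(p, run_length) for p in group]
    learners = [s for s in scores if s is not None]
    proportion_nonlearners = 1 - len(learners) / len(scores)
    mean_trials = sum(learners) / len(learners) if learners else None
    return proportion_nonlearners, mean_trials


# Hypothetical 200-trial records: one learner and one nonlearner.
learner = [False] * 3 + [True] * 197
nonlearner = [True, False] * 100
summary = summarise_group([learner, nonlearner])
```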
Modelling
• Constrained by input and output connections of striatum (caudate)
• Learning rule based on known 3-factor form of synaptic plasticity in striatum
• Learning rule consistent with reward prediction error properties of DA neurons
Connections of Striatum
[Diagram: Neocortex, Prefrontal Cortex, Striatum, SNc, VTA, Sth, Thalamus, GPi, GPe]
Schematic Model
[Diagram: Input: Stimulus Pattern; S-R Representation; Output: Response Decision; Reward; Reward prediction; DA]
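Model Learning Rule
When reward present, E > 0:
$\Delta w_{JK} = k_R \, E \, y_K^{out} \, x_J^{out}$
When reward absent, E < 0:
$\Delta w_{JK} = k_N \, E \, y_K^{out} \, x_J^{out}$
[Diagram labels: $x_J^{out}$, $y_K$, $y_K^{out}$, $w_{JK}$, reward prediction error $E$]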
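The learning rule on this slide changes each weight in proportion to the product of the reward prediction error E, the input-unit output and the striatal-unit output, with a separate rate constant for rewarded and unrewarded trials. A minimal NumPy sketch, assuming a weight matrix indexed as `w[J, K]` and hypothetical rate constants (`k_reward`, `k_nonreward` stand in for the slide's kR and kN; their values are not given in the talk):

```python
import numpy as np


def update_weights(w, x_out, y_out, error, k_reward=0.1, k_nonreward=0.05):
    """Three-factor update: weight change = k * E * x_J_out * y_K_out.

    w      : array of shape (n_inputs, n_outputs), w[J, K]
    x_out  : outputs of the input units, shape (n_inputs,)
    y_out  : outputs of the striatal units, shape (n_outputs,)
    error  : reward prediction error E (positive on rewarded trials)
    """
    k = k_reward if error > 0 else k_nonreward
    return w + k * error * np.outer(x_out, y_out)


w0 = np.zeros((2, 2))
x_out = np.array([1.0, 0.0])   # only input unit 0 is active
y_out = np.array([0.0, 1.0])   # only striatal unit 1 is active

w_rewarded = update_weights(w0, x_out, y_out, error=0.5)
w_unrewarded = update_weights(w0, x_out, y_out, error=-0.5)
```

Only the synapse between the co-active pair changes: it strengthens after a positive error and weakens after a negative one, while all other weights stay at zero.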
Modelling of Reduced DA Function
Loss of DA input to striatum (tail of caudate) modelled 2 ways (with same results):
a) loss of modifiability of cortico-striatal weights
b) proportional reduction of reward prediction error strength
Mean proportion of weights modifiable (s.d. = 0.15):
YC 0.8, OC 0.5, PD 0.2
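The two ways of modelling reduced DA function can be sketched as follows. The group means and the s.d. of 0.15 are taken from the slide; everything else is an assumption made for illustration: drawing each simulated participant's proportion from a normal distribution, clipping it to the range 0 to 1, and marking each weight as modifiable independently with that probability. The function names are hypothetical.

```python
import numpy as np

GROUP_MEAN_MODIFIABLE = {"YC": 0.8, "OC": 0.5, "PD": 0.2}  # from the slide
SD_MODIFIABLE = 0.15                                        # from the slide


def sample_modifiable_mask(group, shape, rng):
    """(a) Draw a participant's modifiable proportion and a boolean weight mask."""
    p = rng.normal(GROUP_MEAN_MODIFIABLE[group], SD_MODIFIABLE)
    p = float(np.clip(p, 0.0, 1.0))   # assumption: keep the proportion valid
    mask = rng.random(shape) < p      # assumption: independent per-weight draw
    return p, mask


def masked_update(w, delta_w, mask):
    """Apply a weight change only to the modifiable weights."""
    return w + delta_w * mask


def scaled_error(error, da_level):
    """(b) Proportionally reduce the reward prediction error strength."""
    return da_level * error


rng = np.random.default_rng(0)
p_pd, mask_pd = sample_modifiable_mask("PD", (20, 4), rng)
w_new = masked_update(np.zeros((20, 4)), np.ones((20, 4)), mask_pd)
```

Under approach (a) the frozen weights never change, whereas under approach (b) every weight still learns but more slowly; the slide reports that the two gave the same results.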
Modelling Process
• Found parameters which gave good fit to YC performance on II task and set DA parameters for PD to produce appropriate level of nonlearners on same task
• Varied OC DA values between YC and PD
• Looked at fit (with these parameters) to all other data cells, esp. RB task
Modelling II Task Results
[Figure: Data vs Model for YC and PD; panels: Proportion of non-learners (axis 0 to 0.6), Trials to criterion (learners) (axis 0 to 140)]
Modelling II Task Results
[Figure: Data vs Model for YC, OC and PD; panels: Proportion of non-learners (axis 0 to 0.6), Trials to criterion (learners) (axis 0 to 140)]
Modelling II Task Results*
[Figure: Data vs Model for YC, OC and PD; panels: Proportion of non-learners (axis 0 to 0.6), Trials to criterion (learners) (axis 0 to 90)]
Model Results II Task
Performance of learners in blocks of 16 trials
Modelling RB Task Results
[Figure: Data vs Model for YC, OC and PD; panels: Proportion of non-learners (axis 0 to 0.5), Trials to criterion (learners) (axis 0 to 100)]
Modelling RB Task Results*
[Figure: Data vs Model for YC, OC and PD; panels: Proportion of non-learners (axis 0 to 0.5), Trials to criterion (learners) (axis 0 to 70)]
Conclusions & Future Work 1
• Simplest realistic model of cortico-striatal learning captures only limited aspects of the CL data
• “Bimodal” nature (learn normally vs. fail) of data simulated only under some parameter settings
• No intermediate DA parameter settings in old controls which can be both PD-like for II task and YC-like for RB task
Conclusions & Future Work 2
• Model challenges hypothesis under test: PD (and OC) deficits in some CL tasks seem unlikely to be solely due to reduced DA-related reinforcement in striatum
• Findings are consistent with >1 CL system
• Future model should add rule system (cf. Ashby’s COVIS)
Alan Pickering: CL Refs 2001-
• Pickering, A.D., & Gray, J.A. (2001). Dopamine, appetitive reinforcement, and the neuropsychology of human learning: An individual differences approach. In A. Eliasz & A. Angleitner (Eds.), Advances in individual differences research (pp. 113-149). Lengerich, Germany: PABST Science Publishers.