Advanced topics in RSA
Nikolaus Kriegeskorte
MRC Cognition and Brain Sciences Unit
Cambridge, UK
Advanced topics menu
• How does the noise ceiling work?
• Why is Kendall’s tau a often needed to compare RDMs?
• How can we weight model features to explain brain RDMs?
• What’s the difference between dissimilarity, distance, and metric?
• What is the relationship between distance correlation and mutual information?
How does the noise ceiling work?
Correlation distance vs. squared Euclidean distance (for normalised patterns)

For patterns normalised to zero mean and unit norm, the Pearson correlation is the cosine of the angle between the pattern vectors, $\cos \alpha = r$. The squared Euclidean distance then follows from the right triangle with legs $1 - r$ and $\sqrt{1 - r^2}$:

$$d^2 = (1-r)^2 + 1 - r^2 = 1 - 2r + r^2 + 1 - r^2 = 2(1-r)$$
Nili et al. 2014
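A quick numerical check that, for patterns normalised to zero mean and unit norm, the squared Euclidean distance equals 2(1 − r) (a minimal NumPy sketch; the random patterns are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
a, b = rng.normal(size=20), rng.normal(size=20)

def normalise(x):
    """Normalise a pattern to zero mean and unit norm."""
    x = x - x.mean()
    return x / np.linalg.norm(x)

a, b = normalise(a), normalise(b)

r = a @ b                      # Pearson r = dot product of normalised patterns
d2 = np.sum((a - b) ** 2)      # squared Euclidean distance

print(np.isclose(d2, 2 * (1 - r)))  # True
```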
[Figure: accuracy of human-IT dissimilarity-matrix prediction (group-average rank correlation) for convolutional layers, fully connected layers, SVM discriminants, and a weighted combination of fully connected layers and SVM discriminants; the noise ceiling marks the highest accuracy any model can achieve, estimated using the other subjects’ average as model; p-values test accuracy above chance.]
The group-mean RDM minimises the sum of squared Euclidean distances to the single-subject RDMs. For rank-transformed RDMs, which all share the same mean and variance, minimising squared Euclidean distance is equivalent to minimising correlation distance, so the group-mean RDM also maximises the average correlation. The average correlation between the single-subject rank RDMs and their group mean therefore provides a precise and hard upper bound on the average Spearman correlation any model can achieve.
[Figure: single-subject RDMs (subjects 1–3), their group mean, the true model, and the upper bound, plotted in the space spanned by distance 1 and distance 2.]

For Kendall’s tau a, an iterative procedure is needed to find the hard upper bound.
Estimating the noise ceiling
Nili et al. 2014
[Figure: leave-one-out estimation: each held-out subject’s RDM is correlated with the average of the remaining subjects’ RDMs, giving the lower bound; correlation with the full group mean gives the upper bound.]
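The two bounds can be sketched as follows (a minimal illustration with simulated subject RDMs; the simulation parameters are assumptions, not values from the paper):

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
n_subj, n_pairs = 8, 45          # e.g. 45 dissimilarities for 10 conditions

# simulated single-subject RDMs (upper-triangle vectors): shared signal + noise
signal = rng.normal(size=n_pairs)
rdms = signal + 0.5 * rng.normal(size=(n_subj, n_pairs))

# upper bound: each subject's correlation with the overall group mean
upper = np.mean([spearmanr(s, rdms.mean(axis=0))[0] for s in rdms])

# lower bound: leave-one-out, correlating each subject with the others' mean
lower = np.mean([
    spearmanr(rdms[i], np.delete(rdms, i, axis=0).mean(axis=0))[0]
    for i in range(n_subj)
])

print(f"noise ceiling: [{lower:.3f}, {upper:.3f}]")
```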
Why is Kendall’s tau a often
needed to compare RDMs? Kendall’s tau a is needed when testing categorical models
that predict tied dissimilarities.
General correlation coefficient (Kendall 1944)

[Figure: Pearson, Spearman, and Kendall correlations illustrated as special cases of the general correlation coefficient, using a 5 × 5 difference matrix.]

Nili et al. 2014
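As a sketch of Kendall’s (1944) formulation (the standard statement, not transcribed from the slide): assign a score $a_{ij}$ to each ordered pair under one variable and $b_{ij}$ under the other; the general correlation coefficient is

```latex
\Gamma = \frac{\sum_{i \neq j} a_{ij}\, b_{ij}}
              {\sqrt{\sum_{i \neq j} a_{ij}^2 \;\sum_{i \neq j} b_{ij}^2}}
```

Choosing $a_{ij} = x_j - x_i$ yields Pearson’s r; using rank differences yields Spearman’s correlation; using $\mathrm{sgn}(x_j - x_i)$ yields Kendall’s tau.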
True model can lose to simplified step model (according to most correlation coefficients)

Nili et al. 2014

[Figure: simulated and real data. A true model predicting a continuous relationship and a simplified step model predicting ties are fitted to noisy data; panels show values and ranks across data points, with Pearson and tau-a comparisons indicated.]
Kendall’s tau a chooses the true model over a simplified model (which predicts ties in hard cases) more frequently than Pearson r, Spearman r, tau b, and tau c.

[Figure: proportion of cases in which the true model was preferred, from “true model never preferred” to “true model always preferred”, as a function of the noise level in the data.]
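The effect can be reproduced with toy numbers (an illustrative sketch, not data from the paper; tau-a is implemented directly since common library routines default to tau-b):

```python
import numpy as np
from itertools import combinations
from scipy.stats import spearmanr

def kendall_tau_a(x, y):
    """Tau-a: (concordant - discordant) / (n choose 2); tied pairs score 0."""
    n = len(x)
    s = sum(np.sign(x[j] - x[i]) * np.sign(y[j] - y[i])
            for i, j in combinations(range(n), 2))
    return s / (n * (n - 1) / 2)

# toy dissimilarities: noisy data in which two neighbouring values swapped
data       = np.array([2.0, 1.0, 3.0, 4.0])
true_model = np.array([1.0, 2.0, 3.0, 4.0])   # predicts the full ordering
step_model = np.array([1.0, 1.0, 2.0, 2.0])   # predicts ties in the hard cases

# Spearman rewards the step model for refusing to order the noisy pair...
print(spearmanr(step_model, data)[0] > spearmanr(true_model, data)[0])  # True

# ...whereas tau-a does not hand it that advantage
print(kendall_tau_a(step_model, data) <= kendall_tau_a(true_model, data))  # True
```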
Conclusions
• Rank correlation can be useful for comparing RDMs when measurement errors might distort a simple linear relationship.
• When categorical models (predicting tied ranks) are used, Kendall’s tau a is an appropriate rank correlation coefficient.
• Kendall’s tau b & c, Spearman correlation, and even Pearson correlation can all prefer models that predict ties for difficult comparisons over the true model.
How can we weight model features to
explain brain RDMs?
[Figure: accuracy of human-IT dissimilarity-matrix prediction (group average of Kendall’s tau a) for convolutional layers, fully connected layers, SVM discriminants, and a weighted combination of fully connected layers and SVM discriminants; the noise ceiling marks the highest accuracy any model can achieve, estimated using the other subjects’ average as model; p-values test accuracy above chance.]

Khaligh-Razavi & Kriegeskorte (2014)
Representational feature weighting with non-negative least-squares

[Figure: model features f1, f2, …, fk of the model RDM are scaled by weights w1, w2, …, wk to yield the weighted-model RDM.]

Khaligh-Razavi & Kriegeskorte (2014)
Notation: $w_k$ is the weight given to model feature $k$; $f_k(i)$ is model feature $k$ for stimulus $i$; $d_{i,j}$ is the distance between stimuli $i$ and $j$. The weight vector $\mathbf{w} = [w_1 \; w_2 \; \dots \; w_n]$ minimises the squared errors:

$$\hat{\mathbf{w}} = \arg\min_{\mathbf{w} \in \mathbb{R}_+^n} \sum_{i \neq j} \left( d_{i,j}^2 - \hat{d}_{i,j}^2 \right)^2$$

The predicted squared distance between stimuli $i$ and $j$ is

$$\hat{d}_{i,j}^2 = \sum_{k=1}^{n} \left[ w_k f_k(i) - w_k f_k(j) \right]^2 = \sum_{k=1}^{n} w_k^2 \left[ f_k(i) - f_k(j) \right]^2$$

The squared-distance RDM of weighted model features therefore equals a weighted sum of single-feature RDMs:

$$w_1^2 \cdot \text{RDM}_1 + w_2^2 \cdot \text{RDM}_2 + \dots + w_n^2 \cdot \text{RDM}_n = \text{weighted-model RDM}$$

so that

$$\hat{\mathbf{w}} = \arg\min_{\mathbf{w} \in \mathbb{R}_+^n} \sum_{i \neq j} \left( d_{i,j}^2 - \sum_{k=1}^{n} w_k^2 \cdot \text{RDM}_k(i,j) \right)^2$$

Khaligh-Razavi & Kriegeskorte (2014)
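Since the objective is linear in the squared weights, the fit reduces to non-negative least squares over single-feature RDMs. A minimal sketch with `scipy.optimize.nnls` (the toy features and simulated “brain” RDM are illustrative assumptions, not data from the paper):

```python
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(0)
n_stim, n_feat = 12, 4

# model features: one row per stimulus, one column per feature (toy data)
F = rng.normal(size=(n_stim, n_feat))

# single-feature squared-distance RDMs, vectorised over stimulus pairs i < j
iu = np.triu_indices(n_stim, k=1)
feature_rdms = np.stack(
    [(F[:, k][:, None] - F[:, k][None, :])[iu] ** 2 for k in range(n_feat)],
    axis=1,
)  # shape: (n_pairs, n_feat)

# simulate a "brain" RDM from known non-negative coefficients (the w_k^2)
true_w2 = np.array([0.0, 1.0, 2.0, 0.5])
data_rdm = feature_rdms @ true_w2

# non-negative least squares recovers the squared weights w_k^2
w2, _ = nnls(feature_rdms, data_rdm)
print(w2)
```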
What’s the difference between dissimilarity, distance, and metric?

A metric d satisfies all four axioms:
• non-negativity: d(x,y) ≥ 0
• identity of indiscernibles: d(x,y) = 0 ⇔ x = y
• symmetry: d(x,y) = d(y,x)
• triangle inequality: d(x,z) ≤ d(x,y) + d(y,z)

The categories are nested: metrics ⊂ distances ⊂ dissimilarity measures.
• Metrics: Euclidean distance, Mahalanobis distance, Minkowski distance with p ≥ 1
• Distances that are not metrics: squared Euclidean distance, correlation distance, Minkowski distance with p < 1 (these violate the triangle inequality)
• Dissimilarity measures that are not distances: the LD-t (a crossvalidated discriminability estimate, which can be negative)
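The triangle-inequality failure for squared Euclidean distance is easy to verify with three collinear points (a minimal sketch):

```python
import numpy as np

# three collinear points: x -- y -- z
x, y, z = np.array([0.0]), np.array([1.0]), np.array([2.0])

def sq_euclidean(a, b):
    """Squared Euclidean distance between two pattern vectors."""
    return float(np.sum((a - b) ** 2))

# d(x,z) = 4  >  d(x,y) + d(y,z) = 1 + 1 = 2, so the triangle inequality fails
print(sq_euclidean(x, z) > sq_euclidean(x, y) + sq_euclidean(y, z))  # True
```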