Advanced topics in RSA
Nikolaus Kriegeskorte
MRC Cognition and Brain Sciences Unit
Cambridge, UK
Advanced topics menu
• How does the noise ceiling work?
• Why is Kendall’s tau a often needed to compare RDMs?
• How can we weight model features to explain brain RDMs?
• What’s the difference between dissimilarity, distance, and metric?
• What is the relationship between distance correlation and mutual information?
How does the noise ceiling work?
Correlation distance vs. squared Euclidean distance (for normalised patterns)

For patterns normalised to zero mean and unit norm, the Pearson correlation is the cosine of the angle between the pattern vectors, $\cos \alpha = r$. The squared Euclidean distance then follows from the right triangle with legs $1 - r$ and $\sqrt{1 - r^2}$:

$$d^2 = (1-r)^2 + 1 - r^2 = 1 - 2r + r^2 + 1 - r^2 = 2(1-r)$$
Nili et al. 2014
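A quick numerical check that, for patterns normalised to zero mean and unit norm, the squared Euclidean distance equals 2(1 − r) (a minimal NumPy sketch; the random patterns are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
a, b = rng.normal(size=20), rng.normal(size=20)

def normalise(x):
    """Normalise a pattern to zero mean and unit norm."""
    x = x - x.mean()
    return x / np.linalg.norm(x)

a, b = normalise(a), normalise(b)

r = a @ b                      # Pearson r = dot product of normalised patterns
d2 = np.sum((a - b) ** 2)      # squared Euclidean distance

print(np.isclose(d2, 2 * (1 - r)))  # True
```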
[Figure: accuracy of human-IT dissimilarity-matrix prediction (group-average rank correlation) for convolutional layers, fully connected layers, SVM discriminants, and a weighted combination of fully connected layers and SVM discriminants; the noise ceiling marks the highest accuracy any model can achieve, estimated using the other subjects’ average as model; p-values test accuracy above chance.]
The group-mean RDM minimises the sum of squared Euclidean distances to the single-subject RDMs. For rank-transformed RDMs, which all share the same mean and variance, minimising squared Euclidean distance is equivalent to minimising correlation distance, so the group-mean RDM also maximises the average correlation. The average correlation between the single-subject rank RDMs and their group mean therefore provides a precise and hard upper bound on the average Spearman correlation any model can achieve.
[Figure: single-subject RDMs (subjects 1–3), their group mean, the true model, and the upper bound, plotted in the space spanned by distance 1 and distance 2.]

For Kendall’s tau a, an iterative procedure is needed to find the hard upper bound.
Estimating the noise ceiling
Nili et al. 2014
[Figure: leave-one-out estimation: each held-out subject’s RDM is correlated with the average of the remaining subjects’ RDMs, giving the lower bound; correlation with the full group mean gives the upper bound.]
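The two bounds can be sketched as follows (a minimal illustration with simulated subject RDMs; the simulation parameters are assumptions, not values from the paper):

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
n_subj, n_pairs = 8, 45          # e.g. 45 dissimilarities for 10 conditions

# simulated single-subject RDMs (upper-triangle vectors): shared signal + noise
signal = rng.normal(size=n_pairs)
rdms = signal + 0.5 * rng.normal(size=(n_subj, n_pairs))

# upper bound: each subject's correlation with the overall group mean
upper = np.mean([spearmanr(s, rdms.mean(axis=0))[0] for s in rdms])

# lower bound: leave-one-out, correlating each subject with the others' mean
lower = np.mean([
    spearmanr(rdms[i], np.delete(rdms, i, axis=0).mean(axis=0))[0]
    for i in range(n_subj)
])

print(f"noise ceiling: [{lower:.3f}, {upper:.3f}]")
```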
Why is Kendall’s tau a often
needed to compare RDMs? Kendall’s tau a is needed when testing categorical models
that predict tied dissimilarities.
General correlation coefficient (Kendall 1944)

[Figure: Pearson, Spearman, and Kendall correlations illustrated as special cases of the general correlation coefficient, using a 5 × 5 difference matrix.]

Nili et al. 2014
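As a sketch of Kendall’s (1944) formulation (the standard statement, not transcribed from the slide): assign a score $a_{ij}$ to each ordered pair under one variable and $b_{ij}$ under the other; the general correlation coefficient is

```latex
\Gamma = \frac{\sum_{i \neq j} a_{ij}\, b_{ij}}
              {\sqrt{\sum_{i \neq j} a_{ij}^2 \;\sum_{i \neq j} b_{ij}^2}}
```

Choosing $a_{ij} = x_j - x_i$ yields Pearson’s r; using rank differences yields Spearman’s correlation; using $\mathrm{sgn}(x_j - x_i)$ yields Kendall’s tau.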
True model can lose to simplified step model (according to most correlation coefficients)

Nili et al. 2014

[Figure: simulated and real data. A true model predicting a continuous relationship and a simplified step model predicting ties are fitted to noisy data; panels show values and ranks across data points, with Pearson and tau-a comparisons indicated.]
Kendall’s tau a chooses the true model over a simplified model (which predicts ties in hard cases) more frequently than Pearson r, Spearman r, tau b, and tau c.

[Figure: proportion of cases in which the true model was preferred, from “true model never preferred” to “true model always preferred”, as a function of the noise level in the data.]
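The effect can be reproduced with toy numbers (an illustrative sketch, not data from the paper; tau-a is implemented directly since common library routines default to tau-b):

```python
import numpy as np
from itertools import combinations
from scipy.stats import spearmanr

def kendall_tau_a(x, y):
    """Tau-a: (concordant - discordant) / (n choose 2); tied pairs score 0."""
    n = len(x)
    s = sum(np.sign(x[j] - x[i]) * np.sign(y[j] - y[i])
            for i, j in combinations(range(n), 2))
    return s / (n * (n - 1) / 2)

# toy dissimilarities: noisy data in which two neighbouring values swapped
data       = np.array([2.0, 1.0, 3.0, 4.0])
true_model = np.array([1.0, 2.0, 3.0, 4.0])   # predicts the full ordering
step_model = np.array([1.0, 1.0, 2.0, 2.0])   # predicts ties in the hard cases

# Spearman rewards the step model for refusing to order the noisy pair...
print(spearmanr(step_model, data)[0] > spearmanr(true_model, data)[0])  # True

# ...whereas tau-a does not hand it that advantage
print(kendall_tau_a(step_model, data) <= kendall_tau_a(true_model, data))  # True
```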
Conclusions
• Rank correlation can be useful for comparing RDMs when measurement errors might distort a simple linear relationship.
• When categorical models (predicting tied ranks) are used, Kendall’s tau a is an appropriate rank correlation coefficient.
• Kendall’s tau b & c, Spearman correlation, and even Pearson correlation can all prefer models that predict ties for difficult comparisons over the true model.
How can we weight model features to
explain brain RDMs?
[Figure: accuracy of human-IT dissimilarity-matrix prediction (group average of Kendall’s tau a) for convolutional layers, fully connected layers, SVM discriminants, and a weighted combination of fully connected layers and SVM discriminants; the noise ceiling marks the highest accuracy any model can achieve, estimated using the other subjects’ average as model; p-values test accuracy above chance.]

Khaligh-Razavi & Kriegeskorte (2014)
Representational feature weighting with non-negative least-squares

[Figure: model features f1, f2, …, fk of the model RDM are scaled by weights w1, w2, …, wk to yield the weighted-model RDM.]

Khaligh-Razavi & Kriegeskorte (2014)
Notation: $w_k$ is the weight given to model feature $k$; $f_k(i)$ is model feature $k$ for stimulus $i$; $d_{i,j}$ is the distance between stimuli $i$ and $j$. The weight vector $\mathbf{w} = [w_1 \; w_2 \; \dots \; w_n]$ minimises the squared errors:

$$\hat{\mathbf{w}} = \arg\min_{\mathbf{w} \in \mathbb{R}_+^n} \sum_{i \neq j} \left( d_{i,j}^2 - \hat{d}_{i,j}^2 \right)^2$$

The predicted squared distance between stimuli $i$ and $j$ is

$$\hat{d}_{i,j}^2 = \sum_{k=1}^{n} \left[ w_k f_k(i) - w_k f_k(j) \right]^2 = \sum_{k=1}^{n} w_k^2 \left[ f_k(i) - f_k(j) \right]^2$$

The squared-distance RDM of weighted model features therefore equals a weighted sum of single-feature RDMs:

$$w_1^2 \cdot \text{RDM}_1 + w_2^2 \cdot \text{RDM}_2 + \dots + w_n^2 \cdot \text{RDM}_n = \text{weighted-model RDM}$$

so that

$$\hat{\mathbf{w}} = \arg\min_{\mathbf{w} \in \mathbb{R}_+^n} \sum_{i \neq j} \left( d_{i,j}^2 - \sum_{k=1}^{n} w_k^2 \cdot \text{RDM}_k(i,j) \right)^2$$

Khaligh-Razavi & Kriegeskorte (2014)
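Since the objective is linear in the squared weights, the fit reduces to non-negative least squares over single-feature RDMs. A minimal sketch with `scipy.optimize.nnls` (the toy features and simulated “brain” RDM are illustrative assumptions, not data from the paper):

```python
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(0)
n_stim, n_feat = 12, 4

# model features: one row per stimulus, one column per feature (toy data)
F = rng.normal(size=(n_stim, n_feat))

# single-feature squared-distance RDMs, vectorised over stimulus pairs i < j
iu = np.triu_indices(n_stim, k=1)
feature_rdms = np.stack(
    [(F[:, k][:, None] - F[:, k][None, :])[iu] ** 2 for k in range(n_feat)],
    axis=1,
)  # shape: (n_pairs, n_feat)

# simulate a "brain" RDM from known non-negative coefficients (the w_k^2)
true_w2 = np.array([0.0, 1.0, 2.0, 0.5])
data_rdm = feature_rdms @ true_w2

# non-negative least squares recovers the squared weights w_k^2
w2, _ = nnls(feature_rdms, data_rdm)
print(w2)
```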
What’s the difference between dissimilarity, distance, and metric?

A metric d satisfies all four axioms:
• non-negativity: d(x,y) ≥ 0
• identity of indiscernibles: d(x,y) = 0 ⇔ x = y
• symmetry: d(x,y) = d(y,x)
• triangle inequality: d(x,z) ≤ d(x,y) + d(y,z)

The categories are nested: metrics ⊂ distances ⊂ dissimilarity measures.
• Metrics: Euclidean distance, Mahalanobis distance, Minkowski distance with p ≥ 1
• Distances that are not metrics: squared Euclidean distance, correlation distance, Minkowski distance with p < 1 (these violate the triangle inequality)
• Dissimilarity measures that are not distances: the LD-t (a crossvalidated discriminability estimate, which can be negative)
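The triangle-inequality failure for squared Euclidean distance is easy to verify with three collinear points (a minimal sketch):

```python
import numpy as np

# three collinear points: x -- y -- z
x, y, z = np.array([0.0]), np.array([1.0]), np.array([2.0])

def sq_euclidean(a, b):
    """Squared Euclidean distance between two pattern vectors."""
    return float(np.sum((a - b) ** 2))

# d(x,z) = 4  >  d(x,y) + d(y,z) = 1 + 1 = 2, so the triangle inequality fails
print(sq_euclidean(x, z) > sq_euclidean(x, y) + sq_euclidean(y, z))  # True
```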