Current Biology, Volume 26
Supplemental Information
Automatic Segmentation of Drosophila
Neural Compartments Using GAL4 Expression
Data Reveals Novel Visual Pathways
Karin Panser, Laszlo Tirian, Florian Schulze, Santiago Villalba, Gregory S.X.E. Jefferis, Katja Bühler, and Andrew D. Straw
Figure S1. (data related to Figure 1): Evaluation of k-medoids clustering for automatically segmenting brain regions into anatomical structures. (A) Repeatability scores across multiple runs of the k-medoids algorithm. The adjusted Rand index, a measure of repeatability, was calculated based on 10 repeated runs of the k-medoids algorithm for both datasets and several brain regions. (B) Colocalization similarity (measured as Dice coefficient s on the set of voxels in the manually annotated region and the set in the clustering result) between the Janelia FlyLight dataset and manual assignments using the same 3D template brain. Manual assignments were based on a manually segmented neuropil image. Glomeruli that could not be unambiguously identified were labeled “glomerulus”. (Janelia FlyLight data for the right antennal lobe region, run 1, 6502 voxels, 3462 driver lines, k = 60.) (C) Automatic segmentation of the central complex (CX). 3D axes scale 30 µm. (D) Individual singleton clusters (left) and average image of strongly expressing driver lines in each cluster with broad driver lines removed (right). Scale bars 20 µm. (E) Average images from agglomerated clusters (top) and dendrogram of agglomerated hierarchy. Scale bars 20 µm. (F) As in E, but from the Vienna dataset, k = 60. Scale bars 20 µm. Panels C-E: Janelia FlyLight data for CX, run 1, 27598 voxels, 3462 driver lines, k = 60.
[Figure S1 image. Panel A: adjusted Rand index (0.0 to 1.0) versus number of clusters k (0 to 160) for Janelia AL, CX, MB, oVLNP, and SEZ, for Vienna AL, CX, MB, and oVLNP, and for the mean. Panel B: co-localization similarity s (Dice coefficient, color scale 0.0 to 0.64) between the automatic singleton cluster assignment (C01 to C60) and the manual annotation of antennal lobe glomeruli. Panels C and D: singleton clusters and cluster average images. Panels E and F: agglomerated cluster average images and dendrograms for the Janelia dataset (k=60, run 1) and the Vienna dataset (k=60, run 1).]
Figure S2. (data related to Figures 2-3): Clustering quality for oVLNP in both datasets. (A) Quantification of similarity between clusters as measured by voxel-to-voxel co-expression distance ($1 - s$, where s is the Dice coefficient between the two sets of enhancer expression) for each medoid of every cluster of run 1 in the oVLNP region using the Janelia dataset. (B) Dendrogram of agglomerative hierarchical clustering using average linkage, showing a representation of co-expression distance between medoids in the oVLNP region of the Janelia dataset. (C) Quantification of similarity between clusters as measured by voxel-to-voxel co-expression distance for each medoid of every cluster of run 1 in the oVLNP region using the Vienna dataset. (D) Dendrogram as in B using the Vienna dataset.
[Figure S2 image. Panels A and C: matrices of co-expression distance (metric Dice distance, 1 - s, color scale 0.0 to 1.0) between Janelia FlyLight clusters (A) and between Vienna Tiles clusters (C). Panels B and D: dendrograms of the same clusters with an average linkage distance axis (0.0 to 1.0).]
Figure S3. (data related to Figures 2-3): Automatically assigned oVLNP singleton clusters colocalize with manually segmented optic glomeruli, repeated clustering of the same dataset gives similar results, and clustering of different datasets gives similar results. (A-B) Colocalization similarity (measured as Dice coefficient s on the set of voxels in the manually annotated region and the set in the clustering result) between the Janelia FlyLight dataset and manual assignments using the same 3D template brain. (Janelia FlyLight data for oVLNP, 42317 voxels, 3462 driver lines, k = 60.) (C-D) Colocalization similarity between the Vienna Tiles dataset and manual assignments using the same 3D template brain. (Vienna Tiles data for oVLNP, 13458 voxels, 6022 driver lines, k = 60.)
[Figure S3 image. Four matrices of co-localization similarity s (Dice coefficient, color scale 0.00 to 0.64), one each for run 1 and run 2 of the Janelia FlyLight dataset and run 1 and run 2 of the Vienna Tiles dataset. Columns: automatic singleton cluster assignment (C01 to C60). Rows: manual annotation (LC04, LC06, LC09, LC10, LC11, LC12, LC13, LC15, LC16, LC17, LC18, LC20, LC21, LC22/LPLC4, LC24, LPC1, LPLC1, LPLC2, LPLC3, MC61, MC62, MC63).]
Table S1. (data related to Figure 4): Table with VPN types, clusters, driver lines, and FlyCircuit IDs. Note: MC63 may be synonymous with VPN-MB1 [S3], which was published while this study was under review.
Columns C, C', C'', and C''' list the clusters corresponding to the optic glomerulus or tract associated with a VPN.

| VPN type | Synonyms | Best enhancers identified for neuron type from Janelia GAL4 library | Best enhancers identified for neuron type from Vienna tiles (VT) GAL4 library | FlyCircuit.tw - Single cell examples for neuron type | C (Janelia FlyLight dataset) | C' (Vienna Tiles dataset) | C'' (Janelia FlyLight dataset, 2nd run) | C''' (Vienna Tiles dataset, 2nd run) |
|---|---|---|---|---|---|---|---|---|
| LC04 | ColA (Mu et al., 2012; Strausfeld & Okamura, 2007; Strausfeld and Hausen, 1977) | GMR26G09, GMR47H03 | VT042758, VT046005 | Cha-F-000138, Cha-F-200257, Gad1-F-300256 | C33, C21, C15, C25 | C'26, C'39 | C''02, C''17 | C'''24 |
| LC06 | S4 (Fischbach and Lyly-Hünerberg, 1983) | GMR41C07, GMR22A07 | VT006549, VT009855 | Cha-F-000039, Gad1-F-400244, Gad1-F-200326 | C57 | C'27 | C''48 | C'''42 |
| LC09 | S4 (Fischbach and Lyly-Hünerberg, 1983) | GMR71C02, GMR14A11 | VT014209, VT005102, VT027704 | Cha-F-000028, Gad1-F-700145, Gad1-F-200274 | C32, C14 | C'59, C'60 | C''52, C''56, C''35 | C'''02 |
| LC10 | S3 (Fischbach and Lyly-Hünerberg, 1983) | GMR22D06, GMR35D04 | VT021760, VT043920 | Gad1-F-100080, Cha-F-300390, fru-F-800100 | C22, C09, C19 | C'32, C'55, C'48, C'29 | C''03, C''54, C''49, C''06 | C'''33, C'''34, C'''50 |
| LC11 | L1CN (Mu et al., 2012) | GMR23D02, GMR87B04, GMR51F09, GMR22H02 | VT004968, VT008647, VT004967 | Cha-F-000153, Cha-F-200132, Gad1-F-300060 | C07, C45 | C'18 | C''32, C''30 | C'''16 |
| LC12 | | GMR59B10, GMR35D04, GMR19G01 | VT062247, VT040919 | Cha-F-000124, Cha-F-000015, VGlut-F-000056, VGlut-F-400347 | C26, C05 | C'06 | C''45 | C'''39, C'''13 |
| LC13 | | GMR50C10, GMR14A11 | VT057283, VT025771 | Cha-F-000255, Cha-F-100003, Gad1-F-100040 | C46 | C'51 | C''26, C''01 | C'''58 |
| LC14 | DC neurons (Hassan et al., 2000) | GMR21H10, GMR12F01, GMR58H11 | VT037804 | Cha-F-400228, Cha-F-400231, Gad1-F-300016 | x | C'03 | C''34 | C'''08 |
| LC15 | | GMR42H06, GMR24A02 | VT014207, VT047878, VT012320 | Cha-F-000361, Cha-F-100351 | C28 | C'44 | C''33, C''21 | C'''41, C'''40 |
| LC16 | | GMR32D04, GMR25G03 | VT061079, VT025771 | Gad1-F-100202, Cha-F-000316, fru-F-000032, VGlut-F-000603 | C37, C03 | C'40, C'27 | C''47 | C'''49 |
| LC17 | | GMR21B04, GMR65C12 | VT034259, VT033301 | Cha-F-100017, Cha-F-000004, Gad1-F-000025 | C23, C26, C01 | C'35, C'38, C'58 | C''08, C''45 | C'''38, C'''29, C'''35, C'''11, C'''39, C'''60, C'''12 |
| LC18 | | GMR92B11 | VT008183 | 5-HT1B-F-500016, Cha-F-000333, fru-F-200061, Gad1-F-300054 | C29, C02 | C'01 | C''07, C''53 | C'''37, C'''44 |
| LC20 | | GMR17A04, GMR71G09 | VT025718 | VGlut-F-200564, VGlut-F-700163, Gad1-F-200101 | C43 | x | C''10 | x |
| LC21 | | GMR85F11, GMR25A07 | VT014960 | Gad1-F-400102, Cha-F-300208 | C40, C28, C07 | C'18 | C''30, C''40 | C'''40, C'''16 |
| LC22/LPLC4 | | GMR24A05 | VT058688 | LC22: Gad1-F-900022, Cha-F-600134, VGlut-F-500700; LPLC4: Gad1-F-200058, Cha-F-200302, Cha-F-200028 | C16 | C'42, C'19 | C''57 | C'''14 |
| LC24 | | GMR20G09 | VT038216 | Cha-F-000283, Cha-F-200073, Cha-F-400116 | C37 | C'40 | C''47 | C'''10 |
| LPLC1 | LPL2CN (Mu et al., 2012) | GMR36B06, GMR12G03 | VT007767 | Cha-F-200219, Cha-F-300035, Gad1-F-400140 | C18, C44, C25 | C'07 | C''25 | C'''30 |
| LPLC2 | | GMR75G12, GMR12E04 | VT007194, VT049479 | Gad1-F-000300, Cha-F-100287, Cha-F-300111 | C44 | C'21 | C''25 | C'''06, C'''30 |
| LPLC3 | | GMR9C11, GMR49A05 | VT044492, VT062624 | Cha-F-100027, Cha-F-300004, Gad1-F-200099, fru-F-500009 | C35, C55, C20, C30 | C'46, C'05, C'09 | C''59, C''13, C''19 | C'''28, C'''14 |
| LPC1 | | GMR37G12, GMR77A06, GMR81A05, GMR20A09 (subset) | VT046005 | VGlut-F-700361, Cha-F-000272, fru-F-000101 | C04, C30, C20 | C'05 | C''12, C''59, C''19 | C'''46 |
| MC61 | LC10c (Otsuna & Ito, 2006) | GMR53B08 | VT002072, VT021203 | Gad1-F-400023, Cha-F-300285, Cha-F-200026 | C56 | C'34, C'10 | C''49 | C'''17 |
| MC62 | | GMR78G04, GMR85C01 | VT062624 | none identified | C48 | C'56 | C''05 | x |
| MC63 | VPN-MB1? (Vogt et al., 2016) | GMR72C11 | VT022290, VT008183, VT017001 | Cha-F-200103 | C42, C48 | C'25, C'56 | C''04, C''11, C''05 | C'''55 |
| Lat | | GMR16G04, GMR13E10, GMR85G07, GMR39F04 | VT045604, VT014963, VT033613 | TH-F-200107, Trh-F-100019, TH-F-100004, Cha-F-300333 | C50, C42 | C'30, C'52, C'56, C'57 | C''04 | C'''55 |
Supplemental Experimental Procedures
Thresholding, Dice similarity, k-Medoids, and Hierarchical Agglomeration
GAL4 expression patterns were transformed into a binary representation in two steps.
First, the image was thresholded; second, morphological opening (dilation of the erosion
by a 3x3x3 structuring kernel) was applied to reduce clutter. The threshold was chosen so
that the resulting mask yielded 1% stained voxels. This simple heuristic was more reliable
for the datasets tested than other standard automatic thresholding methods.
From the binarized images, the set of expressing lines was assembled for each voxel.
Similarity between two voxels, based on the expression set A of one voxel and the expression
set B of the other, was computed using Dice’s coefficient as $s = \frac{2\,|A \cap B|}{|A| + |B|}$, where ∩ denotes
intersection and |x| denotes the number of elements in set x. To decrease the effects of
registration error and image acquisition noise and to increase the speed of subsequent
processing steps, we binned the original image voxel data into larger voxels, using a 3x3x3
nearest-neighbor downsampling. Analysis was performed on specific brain regions (e.g.
antennal lobe or oVLNP) defined by a 3D brain atlas of neuropils (included in the
supplemental data). Voxels in the bounding cube but not in the defined neuropil were
excluded. The k-medoids algorithm [S1] was run in Julia 0.4.0 using JuliaStats Clustering
0.5.0 (see Supplementary file 1). k-medoids clustering was performed on the Dice dissimilarity ($1 - s$).
To agglomerate the medoids, we used the fastcluster package [S2] with Python 2.7.10,
using average linkage on the Dice distance between medoids.
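The threshold heuristic and the Dice computation described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the code used for this study (which is provided in Supplementary file 1): the function names, the toy masks with their line names, and the choice to treat two empty expression sets as having similarity 0 are assumptions made for the example, and morphological opening and downsampling are omitted.

```python
from itertools import combinations


def threshold_for_fraction(intensities, fraction=0.01):
    """Return a threshold t such that about `fraction` of values satisfy v >= t."""
    ranked = sorted(intensities, reverse=True)
    n_keep = max(1, round(fraction * len(ranked)))
    return ranked[n_keep - 1]


def expression_sets(masks, n_voxels):
    """Invert per-line binary masks into one set of expressing lines per voxel.

    `masks` maps a driver line name to the set of voxel indices it stains.
    """
    sets = [set() for _ in range(n_voxels)]
    for line, voxels in masks.items():
        for v in voxels:
            sets[v].add(line)
    return sets


def dice(a, b):
    """Dice coefficient s = 2|A n B| / (|A| + |B|) of two sets."""
    if not a and not b:
        return 0.0  # assumption: voxels without any expression are not similar
    return 2 * len(a & b) / (len(a) + len(b))


def dice_distance_matrix(sets):
    """Symmetric n x n matrix of Dice dissimilarities 1 - s."""
    n = len(sets)
    dist = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        dist[i][j] = dist[j][i] = 1.0 - dice(sets[i], sets[j])
    return dist


# Toy example with hypothetical driver lines and three voxels.
masks = {"lineA": {0, 1}, "lineB": {1, 2}, "lineC": {1}}
voxel_sets = expression_sets(masks, n_voxels=3)
dist = dice_distance_matrix(voxel_sets)
```

In the toy example, voxel 1 is stained by all three lines and voxel 0 only by one of them, so their Dice coefficient is 2·1/(1+3) = 0.5, while voxels 0 and 2 share no line and are at the maximal distance of 1.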
Initial clustering was performed on a distance matrix found as follows. For each voxel
within the analyzed brain region (e.g. antennal lobe or lateral protocerebrum), we
calculated the set of driver lines for which GFP expression was higher than a threshold.
We used the Dice coefficient $s = \frac{2\,|A \cap B|}{|A| + |B|}$ (a measure of overlap, see above) to quantify
expression similarity between each possible pair of the n voxels. The resulting n x n matrix
of distances $1 - s$ was used to group voxels into clusters of similar expression using
k-medoids clustering, a standard clustering technique (Figure 1A, see Experimental
Procedures for details). Clustering with other standard algorithms such as mini-batch
k-means gave qualitatively similar results, and we focus here on k-medoids only for
convenience. As is typical for clustering algorithms, one parameter controls the number of
clusters; we chose several different values for k and evaluated the results for each choice
in each of the two independent datasets. Every voxel in the analysis is assigned to exactly
one cluster. Neither manual inspection nor calculation of a metric designed to measure
clustering repeatability, the adjusted Rand index (Figure S1A), showed an obvious optimal
value for k. Therefore, we chose k = 60, a value which appeared to provide sufficiently many
clusters to capture important structures at a small scale without producing an
overwhelming number. The result of the initial clustering algorithm is the assignment of
each voxel in the input brain region to one of k clusters. The second major step,
hierarchical clustering, took the cluster centers (medoids) from the first step and
agglomerated these ‘singletons’ into a hierarchy of 2k-1 clusters in total (the k singletons
plus k-1 merged clusters).
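The two clustering steps can be illustrated with a small pure-Python sketch. It is not the implementation used here (the study ran JuliaStats Clustering for k-medoids and fastcluster for the agglomeration): the simple alternating assign/update iteration below is one common k-medoids variant and may differ from the one in those packages, the naive average-linkage loop stands in for fastcluster, and all function names and the toy one-dimensional data are assumptions made for the example.

```python
import random


def k_medoids(dist, k, seed=0, max_iter=100):
    """Alternating k-medoids on a precomputed n x n distance matrix."""
    n = len(dist)
    medoids = random.Random(seed).sample(range(n), k)

    def assign(meds):
        # Each point joins the cluster of its nearest medoid.
        return [min(range(k), key=lambda c: dist[i][meds[c]]) for i in range(n)]

    for _ in range(max_iter):
        labels = assign(medoids)
        updated = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:  # keep the old medoid if a cluster became empty
                updated.append(medoids[c])
                continue
            # New medoid: the member with the smallest summed distance to the others.
            updated.append(min(members, key=lambda m: sum(dist[m][j] for j in members)))
        if updated == medoids:
            break
        medoids = updated
    return medoids, assign(medoids)


def average_linkage(dist, items):
    """Agglomerate `items` (indices into dist) by average linkage.

    Returns one (id_a, id_b, distance) tuple per merge. Ids 0..len(items)-1
    are the singletons; each merge creates the next free id.
    """
    clusters = {i: [item] for i, item in enumerate(items)}
    next_id = len(items)
    merges = []
    while len(clusters) > 1:
        best = None
        ids = sorted(clusters)
        for x in range(len(ids)):
            for y in range(x + 1, len(ids)):
                a, b = clusters[ids[x]], clusters[ids[y]]
                d = sum(dist[p][q] for p in a for q in b) / (len(a) * len(b))
                if best is None or d < best[0]:
                    best = (d, ids[x], ids[y])
        d, id_a, id_b = best
        clusters[next_id] = clusters.pop(id_a) + clusters.pop(id_b)
        merges.append((id_a, id_b, d))
        next_id += 1
    return merges


# Toy data: six "voxels" on a line, forming two obvious groups.
positions = [0, 1, 2, 10, 11, 12]
dist = [[abs(p - q) for q in positions] for p in positions]
medoids, labels = k_medoids(dist, k=2)
merges = average_linkage(dist, medoids)
n_clusters = len(medoids) + len(merges)  # k singletons + (k - 1) merges = 2k - 1
```

Because every merge joins exactly two existing clusters, k singletons always yield k-1 merges, which is where the total of 2k-1 clusters comes from (119 for k = 60).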
Evaluating repeatability of clustering
As discussed above, automatic calculation of a measure of repeatability (adjusted Rand
index, Figure S1A) found no obvious optimum value of k used in the initial clustering step.
Therefore, we sought to gain a more biologically meaningful sense of consistency across
multiple runs of the algorithm for k=60 by comparing visually the results of manual and
automatic segmentations. We did this for the oVLNP with each of four different clustering
runs, two from each dataset (Figure S3). The results show that, despite different random
number initialization seeds, most optic glomeruli have a strong correspondence with a
singleton cluster across repeated runs of the algorithm within and across the two datasets
(Vienna Tiles and Janelia FlyLight). This indicates substantial biologically meaningful
repeatability within and between datasets at the first clustering step, which agglomeration
then structures hierarchically.
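Both repeatability measures used here can be written down compactly. The following Python sketch shows the standard adjusted Rand index between two label vectors and a Dice-based match between a manually annotated region and the singleton clusters. It is illustrative only: the function names and toy inputs are hypothetical, averaging the index over all pairs of runs is an assumption about how repeated runs might be summarized, and returning 1.0 for two identical trivial partitions is a convention chosen for the example.

```python
from collections import Counter
from itertools import combinations
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index of two clusterings of the same n >= 2 voxels."""
    n = len(labels_a)
    sum_ab = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / comb(n, 2)
    maximum = (sum_a + sum_b) / 2
    if maximum == expected:
        return 1.0  # assumption: identical trivial partitions count as perfect agreement
    return (sum_ab - expected) / (maximum - expected)


def mean_pairwise_ari(runs):
    """Average the adjusted Rand index over all pairs of runs (an assumed summary)."""
    pairs = list(combinations(runs, 2))
    return sum(adjusted_rand_index(a, b) for a, b in pairs) / len(pairs)


def best_matching_cluster(manual_voxels, clusters):
    """Return (cluster name, Dice coefficient) of the cluster that best overlaps a manual region."""
    def dice(a, b):
        return 2 * len(a & b) / (len(a) + len(b))

    name = max(clusters, key=lambda c: dice(manual_voxels, clusters[c]))
    return name, dice(manual_voxels, clusters[name])


# Toy example: two runs that agree up to a relabeling of the clusters.
run1 = [0, 0, 1, 1]
run2 = [1, 1, 0, 0]
ari = adjusted_rand_index(run1, run2)

# Toy example: a manual region of four voxels and two hypothetical clusters.
match = best_matching_cluster({1, 2, 3, 4}, {"C01": {1, 2, 3}, "C02": {4, 5, 6, 7}})
```

The adjusted Rand index is invariant to cluster relabeling, so the two toy runs score 1; the Dice match mirrors the colocalization similarity s shown in Figure S3.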
Supplemental References
S1. Kaufman, L., and Rousseeuw, P. J. (1987). Clustering by Means of Medoids. In Statistical Data Analysis Based on the L1 Norm and Related Methods.
S2. Müllner, D. (2013). fastcluster: Fast Hierarchical, Agglomerative Clustering Routines for R and Python. Journal of Statistical Software 53(9). https://www.jstatsoft.org/article/view/v053i09.
S3. Vogt, K., Aso, Y., Hige, T., Knapek, S., Ichinose, T., Friedrich, A. B., Turner, G. C., Rubin, G. M., and Tanimoto, H. (2016). Direct neural pathways convey distinct visual information to Drosophila mushroom bodies. eLife 5, e14009.