L1 and B1 repeats blueprint the spatial organization of chromatin

J. Yuyang Lu1#, Lei Chang2#, Tong Li1#, Ting Wang1,3, Yafei Yin1,3, Ge Zhan1,3, Ke Zhang1,3,
Michelle Percharde4, Liang Wang1, Qi Peng1, Pixi Yan1, Hui Zhang1, Xue Han1, Xianju Bi1, Wen
Shao1, Yantao Hong1, Zhongyang Wu1, Peizhe Wang1, Wenzhi Li1, Jing Zhang1, Zai Chang1,
Yingping Hou2, Pilong Li1, Miguel Ramalho-Santos5, Jie Na1, Wei Xie1, Yujie Sun2*, Xiaohua
Shen1*

1 Tsinghua Center for Life Sciences, School of Medicine and School of Life Sciences, Tsinghua
University, Beijing 100084, China
2 State Key Laboratory of Membrane Biology, School of Life Sciences, and Biomedical
Pioneering Innovation Center (BIOPIC), Peking University, Beijing 100871, China
3 These authors contributed equally to this work.
4 MRC LMS & Imperial College London, London W12 0NN, UK
5 Lunenfeld-Tanenbaum Research Institute, University of Toronto, Ontario M5T 3H7, Canada
# These authors share the co-first authorship.

* Leading correspondence should be addressed to X.S. ([email protected]).
Correspondence may also be addressed to Y.S. ([email protected]).

Keywords: retrotransposons, L1 repeats, B1/Alu repeats, higher-order chromatin structure
bioRxiv preprint doi: https://doi.org/10.1101/802173; this version posted October 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
SUMMARY
Despite extensive mapping of three-dimensional (3D) chromatin structures, the basic
principles underlying genome folding remain unknown. Here, we report a fundamental role
for L1 and B1 retrotransposons in shaping the macroscopic 3D genome structure. Homotypic
clustering of B1 and L1 repeats in the nuclear interior or at the nuclear and nucleolar
peripheries, respectively, segregates the genome into mutually exclusive nuclear
compartments. This spatial segregation of L1 and B1 is conserved in mouse and human cells,
and occurs dynamically during establishment of the 3D chromatin structure in early
embryogenesis and the cell cycle. Depletion of L1 transcripts drastically disrupts the spatial
distributions of L1- and B1-rich compartments. L1 transcripts are strongly associated with
L1 DNA sequences and induce phase separation of the heterochromatin protein HP1α. Our
results suggest that genomic repeats act as the blueprint of chromatin macrostructure, thus
explaining the conserved higher-order structure of chromatin across mammalian cells.
INTRODUCTION
The mammalian genome is folded nearly one million-fold within the nucleus of a cell
(Annunziato, 2008). Hierarchical organization of genome folding at different scales is central to
gene expression, cellular function and development, and alterations in genome organization often
lead to disease (Dekker et al., 2017; Zane et al., 2014). Microscopic and 3C-based approaches
reveal that chromatin forms a "fractal globule" with compartments, domains, and loop interactions
(Boettiger et al., 2016; Dekker et al., 2002; Mirny, 2011; Wang et al., 2016). At the megabase scale,
chromatin is subdivided into two spatially segregated compartments, arbitrarily labeled as A and B,
with distinct transcriptional activity (Lieberman-Aiden et al., 2009). The A and B compartments
also differ in other features, including gene density, CpG frequency, histone modifications, and
DNA replication timing (Bonev and Cavalli, 2016; Dekker et al., 2013; Gibcus and Dekker, 2013;
Rivera-Mulia and Gilbert, 2016). Simulation of three-dimensional (3D) genome structures
predicted that the euchromatic A compartment adopts a central position, whereas the
heterochromatic B compartment moves towards the nuclear periphery and perinucleolar regions
(Buchwalter et al., 2019; Stevens et al., 2017). Within compartments at the kilobase-to-megabase
scale, chromatin is organized in topologically associating domains (TADs), which serve as
functional platforms for physical interactions between co-regulated genes and regulatory elements
(Dixon et al., 2012). At a finer scale, TADs are divided into smaller loop domains, in which distal
regulatory elements such as enhancers come into direct contact with their target genes via
chromatin loops (Kadauke and Blobel, 2009).

The organization of the nuclear genome into euchromatin and heterochromatin, which closely
match the A and B compartments, respectively, appears to be conserved from ciliates to humans
and has been maintained in eukaryotes over 500 million years of evolution (Solovei et al., 2016).
Intriguingly, most A/B compartments and TADs seem to be invariant in different mouse and human
cell types or during cell-fate transition, whereas sub-TAD loops are more variable to facilitate
differential gene expression (Dixon et al., 2015; Stadhouders et al., 2018). The insulator protein
CTCF (CCCTC-binding factor) and the ring-shaped cohesin complex, considered as the weavers
of genome organization, co-localize at the boundaries and anchor regions of contact domains and
loops (Rao et al., 2014; Splinter et al., 2006). However, depletion of CTCF or cohesin disrupts
some sub-TAD loop contacts, but fails to affect the formation and maintenance of A/B
compartments in cells (Bintu et al., 2018; Gassler et al., 2017; Haarhuis et al., 2017; Nora et al.,
2017; Nuebler et al., 2018; Rao et al., 2017; Schwarzer et al., 2017; Wutz et al., 2017). The direct
molecular drivers, other than CTCF and cohesin, that govern compartmental domains are yet to be
defined. A few mechanisms have been proposed for the formation of nuclear compartments, such
as anchoring heterochromatin to the nuclear lamina (Falk et al., 2019; Guelen et al., 2008; van
Steensel and Belmont, 2017; Wijchers et al., 2015; Zullo et al., 2012), preferential attraction of
chromatin harboring similar histone modifications and regulators (Jost et al., 2014; Strom et al.,
2017; Wang et al., 2016), and hypothetical models involving pairing of homologous sequences
mediated by active transcription or phase separation (Cook, 2002; Cook and Marenduzzo, 2018;
Rowley et al., 2017; Tang, 2017; van de Werken et al., 2017). A recent study based on imaging and
polymer simulations proposes that interactions between heterochromatic regions are crucial for
compartmentalization of the genome in inverted and conventional nuclei (Falk et al., 2019).
However, the molecular basis that drives heterochromatin compartmentalization and segregation
from active domains remains unclear (Bouwman and de Laat, 2015; Pueschel et al., 2016).

Repetitive elements comprise about half of human and mouse genomes (Biemont, 2010).
Once regarded as genomic parasites (Orgel and Crick, 1980), retrotransposons have been recently
implicated in playing active roles in re-wiring the genome and gene expression programs in diverse
biological processes (Bourque et al., 2008; Chuong et al., 2016; Durruthy-Durruthy et al., 2016;
Grow et al., 2015; Liu et al., 2018; Lynch et al., 2011; Rebollo et al., 2012; Wang et al., 2014).
Repetitive DNA is speculated to have a role in nuclear organization (Falk et al., 2019; Solovei et
al., 2016; Tang, 2017). Considering the diverse nature and widespread distribution of repeats in
mammalian genomes (Sundaram and Wang, 2018), it is imperative to determine whether and how
repetitive sequences are involved in this process. Long and short interspersed nuclear elements
(LINEs and SINEs, respectively) are the two predominant subfamilies of retrotransposons in most
mammals (Mandal and Kazazian, 2008). L1 (also known as LINE1 or LINE-1) is the most abundant
subclass of LINEs, and also constitutes the most abundant type of all repeat subclasses, making up
18.8% and 16.9% (0.9~1.0 million copies) of the genome in mouse and human, respectively
(Taylor et al., 2013). B1 in mouse and its closely related, primate-specific Alu elements in human
are the most abundant subclass of SINEs, constituting 2.7%~10.6% (0.6~1.3 million copies) of
mouse and human genomes (Lander et al., 2001; Mouse Genome Sequencing et al., 2002). L1 and
B1/Alu have distinct nucleotide compositions and sequence lengths. While Alu elements are ~300
bp long and rich in G and C nucleotides, L1 elements are 6–7 kb long and AT-rich (Jurka et al.,
2004). A previous study of metaphase chromosome banding showed roughly inverse distributions
of L1 and Alu elements in chromosomal regions with distinct biochemical properties (Korenberg
and Rykowski, 1988). Further studies reported that Alu/B1 elements are preferentially enriched in
regions that are generally gene-rich, whereas L1 elements are enriched in gene-poor regions, which
tend to be located in lamina-associated heterochromatin (Deininger, 2011; Meuleman et al., 2013;
Wijchers et al., 2015). However, systematic mapping of L1 and B1 elements in the genome and the
nuclear space is still lacking.

Here, we report that L1 and B1/Alu repeats harbor the basic structural information in their
DNA sequences to instruct the 3D organization of mammalian genomes. Co-segregation of B1/Alu
and L1 repeats with the A and B compartments, respectively, is highly conserved in different mouse
and human cell types. B1/Alu-rich regions cluster in the interior of the nucleus, while L1-rich
regions cluster at the nuclear and nucleolar periphery. This pattern of homotypic clustering and
segregation is dynamically established in the cell cycle and in early embryogenesis. L1 RNA is
critically required for the formation and maintenance of compartment segregation and the
higher-order chromatin structure. Most likely, L1 RNA transacts the structural information in L1 DNA
repeats by acting upon its own DNA sequences to promote phase separation of HP1α, thereby
inducing heterochromatin formation. Our findings uncover a fundamental principle of 3D genome
organization that is rooted in genomic DNA sequences, and also provide an answer to the
perplexing phenomenon of the conservation and robustness of chromatin compartmentalization
across mammalian species.

RESULTS
Homotypic clustering of B1 and L1 repeats segregates the A/B compartments
We analyzed the genomic positions of the major repeat subfamilies and observed positive
correlations within the L1 and SINE B1 subfamilies, but strong inverse correlations between them
(Figure S1A). This observation suggests that L1 and SINE B1 elements tend to be positioned away
from each other in the genome, while repeats from the same subfamily tend to be clustered. The
non-random positioning of repeat sequences in the genome prompted us to examine their relative
distributions in the A/B compartments. Analysis of Hi-C data in mouse embryonic stem cells
(mESCs) (Fraser et al., 2015) showed that dense L1 and B1 repeats appear to reside in distinct and
mutually exclusive compartments across the mouse genome, whereas other types of
retrotransposons such as ERV1, ERVK, and L2 tend to be randomly distributed (Figures 1A and
S1B). The compartments marked by B1 repeats show enrichment of active histone marks
(H3K4me3, H3K9ac, H3K27ac, H3K36me3), strong binding of RNA polymerase II (Pol II), and
high levels of chromatin accessibility and transcription activity. In contrast, the compartments
marked by L1 repeats show signatures of heterochromatin, including enrichment of the repressive
H3K9me2 and H3K9me3 marks, and strong binding of heterochromatin proteins such as HP1α
and the nuclear corepressor KRAB-associated protein-1 (KAP1 or TRIM28) (Figures 1A and S1B).
Quantitative sequence analysis of annotated A/B compartments revealed high levels of SINE
repeats (including B1, B2 and B4) in the A compartments and L1 repeats in the B compartments,
whereas repeats in other subclasses showed no specific enrichment (Figures S1C, D). This
opposing distribution pattern of B1/Alu and L1 in the A and B compartments was also detected in
five other cell types (Figure 1B): human embryonic stem cells (hESCs), lymphoblastoid cells
(GM12878) and erythroleukemic cells (K562), and mouse neural progenitor cells (NPCs) and
neurons (Fraser et al., 2015; Rao et al., 2014). Furthermore, unsupervised clustering revealed that
the genomic positions of A/B compartments in six distinct mouse and human cell types are highly
similar to each other, with an average Spearman correlation coefficient >0.73 within species
and >0.52 between species (Figure S1E). Compared to other subclass repeats, L1 and B1 are most
strongly related to the high-order chromatin structures, and their distributions appear to be
conserved in homologous regions of the mouse and human genomes (Figure S1E). For example, a
region in mouse chromosome 2 (chr2: 120-140 Mb) and its syntenic region in human chromosome
20 (chr20: 6-50 Mb) show similar patterns of Hi-C contact probabilities, and gene and repeat
compositions and distributions in the corresponding A and B compartments along the DNA
sequences (Figure 1C). Thus, co-segregation of B1/Alu and L1 repeats with the A and B
compartments appears to be invariable in different cell types in mouse and human.
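The cross-cell-type comparison above rests on a rank correlation between per-bin compartment scores. The following is a minimal sketch of such a calculation, assuming each cell type is summarized by one score per matched genomic bin (for example, a Hi-C eigenvector value). The function names and the toy scores are hypothetical and are not taken from the authors' pipeline.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # convention chosen here for constant input
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical compartment scores for five matched bins in two cell types
# (positive = A-like, negative = B-like).
cell_type_1 = [0.8, 0.5, -0.6, -0.9, 0.3]
cell_type_2 = [0.6, 0.7, -0.2, -0.8, 0.1]
rho = spearman(cell_type_1, cell_type_2)
```

Averaging `rho` over all pairs of cell types within a species, or between species on syntenic bins, would give summary values of the kind quoted in the text.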
We then took mouse chromosome 17 (chr17, 95 Mb in length) as an example to overlay L1
and B1 repeat features on the Hi-C interaction matrix of mESCs (Figures 1D-F). Interestingly, the
plaid pattern of enriched and depleted interaction blocks in the Hi-C map is largely correlated with
the compositions and distributions of B1 and L1 along the whole chr17 (Figure 1D). In a 42-Mb
region of chr17, four L1-enriched B compartments (denoted by c, e, g and i) and three B1-enriched
A compartments (denoted by D, F and H) are alternately positioned along the linear DNA sequence
(Figure 1E). Strong interactions were observed between L1-enriched compartments (represented
by ce, cg, eg, ei and gi) or B1-enriched compartments (represented by DF, DH and FH), but not
between these two compartments (Figure 1E, dotted boxes). The interaction frequencies between
D and F (DF) or between c and e (ce) are much stronger than those of D or F with c or e (cD, De,
eF), despite the fact that these regions are closer in the linear sequence.

Conversion of Hi-C contact frequencies into Pearson correlation coefficients sharpened our
view of the long-range chromatin interactions (Figure 1F). Strikingly, the plaid pattern of the Hi-C
correlation map appears to perfectly match the distribution and interaction status of L1 and B1.
L1-rich or B1-rich regions show strong enrichment of contacts with regions containing the same repeat
type (red blocks in Figure 1F). We refer to these as homotypic contacts. Contacts between regions
containing different repeat types (heterotypic contacts) are strongly depleted (blue blocks in
Figure 1F). For example, in one region of chr17 (35 to 95 Mb), L1-enriched segments (from e to
u) and B1-enriched segments (from F to T) exhibit high frequencies of homotypic contacts (Figure
1F, highlighted by arrows), but strong depletion of heterotypic contacts. Similarly, homotypic
contacts between L1-rich regions or B1-rich regions were also observed between chromosomes, as
illustrated by chromosomes 17 and 19 (Figure 1G). These results indicate that genomic regions
containing B1 or L1 repeats tend to interact with genomic regions containing repeat sequences
from similar subfamilies, but not from different subfamilies, regardless of linear proximity. Thus,
homotypic clustering of regions rich in B1 or L1 repeats partitions the genome into distinct A and
B compartments, respectively.
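To illustrate how a contact map can be turned into a correlation map of the kind discussed above, here is a minimal sketch. It assumes a common two-step recipe: divide each contact by the mean contact at the same genomic distance (observed/expected), then correlate the rows of the normalized matrix. Whether this matches the authors' exact processing is not stated in the text, and the bin labels and contact values are invented for the example.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # convention chosen here for constant rows
    return cov / math.sqrt(var_x * var_y)


def observed_over_expected(contacts):
    """Divide each entry by the mean contact at the same bin distance."""
    n = len(contacts)
    expected = []
    for d in range(n):
        diagonal = [contacts[i][i + d] for i in range(n - d)]
        expected.append(sum(diagonal) / len(diagonal))
    return [
        [contacts[i][j] / expected[abs(i - j)] if expected[abs(i - j)] > 0 else 0.0
         for j in range(n)]
        for i in range(n)
    ]


def correlation_map(contacts):
    """Pearson correlation between every pair of normalized rows."""
    oe = observed_over_expected(contacts)
    n = len(oe)
    return [[pearson(oe[i], oe[j]) for j in range(n)] for i in range(n)]


# Toy chromosome with six bins; labels are hypothetical repeat-rich types.
labels = ["B1", "B1", "L1", "L1", "B1", "L1"]
# Contacts decay with distance and are boosted between bins of the same type.
toy_contacts = [
    [(2.0 if labels[i] == labels[j] else 0.5) / (1 + abs(i - j))
     for j in range(len(labels))]
    for i in range(len(labels))
]
corr = correlation_map(toy_contacts)
```

In this toy case, `corr[0][4]` (two B1-rich bins) is positive and `corr[0][3]` (a B1-rich and an L1-rich bin) is negative, which is the sign structure behind the red and blue blocks of a plaid correlation map.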

DNA FISH reveals the spatial segregation of L1 and B1 compartments
To visualize the nuclear localization of B1 and L1 repeats in cells, we performed dual-color
DNA fluorescence in situ hybridization (FISH) using fluorescence-tagged oligonucleotide probes
that specifically target the consensus sequences of B1 and L1 elements. Consistent with the Hi-C
analysis, L1 and B1 exhibit distinct and complementary nuclear localizations in mESCs (Figures
2A and 2B). While B1 DNA shows punctate signals in the nuclear interior, L1 DNA exhibits highly
organized and concentrated signals that line the periphery of the nucleus and nucleolus. Weak L1
signals were also detected in a few subregions of the nuclear interior where B1 signals were
absent. Both B1 and L1 signals are absent from DAPI-dense regions, which likely represent
satellite repeat-enriched chromocenters (Mateos-Langerak et al., 2007). To confirm the L1
localization at the nucleolar periphery in mESCs, we performed immuno-FISH, which combines
DNA FISH with immunofluorescence staining using an antibody against the nucleolar marker
Nucleolin (NCL). Indeed, L1 signals surround and partially overlap with the ring-shaped signals
of NCL at the nucleolar periphery (Figure 2C).

To ask whether L1 and B1 localizations might vary with cell type, we performed immuno-FISH
in four other representative mouse and human cell lines: neural stem cells (NSC), fibroblasts
(NIH3T3), myoblasts (C2C12) and HeLa cells (Figure 2C). Similar to mESCs, all these cells show
non-overlapping localizations of B1/Alu repeats, which are present in the nuclear interior, and L1
repeats, which are mainly located at the nuclear and nucleolar peripheries. The distributions of
B1/Alu and L1 are reminiscent of the nuclear localization of euchromatin and heterochromatin,
respectively. Together, the results from Hi-C and imaging analysis demonstrate that homotypic
B1/Alu or L1 DNA sequences associate together to form larger clusters, which divide the nucleus
into distinct territories of A/B compartments, and nuclear co-segregation of B1/Alu and L1 repeats
with the A and B compartments is conserved across different cell types in mouse and human.

Dynamic establishment of L1 and B1 segregation during the cell cycle and embryonic development
As chromatin organization undergoes dynamic changes during mitosis and after fertilization,
we asked whether L1 and B1 compartments are established and re-constructed in these two
biological processes. DNA FISH analysis of synchronous mESCs showed that L1 and B1
localizations change dramatically at different cell-cycle stages (Figures 3A and S2A). S-phase cells
show non-overlapping and complementary localizations of L1 and B1 repeats (Figures 2 and 3A).
This is similar to the pattern we observed previously in asynchronous mESCs, most of which are
in the S phase of the cell cycle (Figure S2A). However, L1 and B1 DNA signals are mixed on
mitotic chromosomes in metaphase (M phase, including prophase and anaphase), when the nuclear
membrane and nucleoli are disassembled. As the cell cycle progresses into the G1 phase, L1 and
B1 DNA start to segregate again. To quantify the degree of segregation, we defined a FISH-based
segregation index as the negative value of Pearson's correlation coefficient of L1 and B1 DNA
signals in the nucleus. The FISH segregation index is lowest in M-phase cells, but increases
significantly in the G1 phase and peaks in the S phase (Figure 3B).
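The FISH-based segregation index defined above can be sketched in a few lines. This example assumes the two channels have already been reduced to paired intensity values for the same pixels inside a nuclear mask; the segmentation step, the function names, and the pixel values are assumptions made for illustration, not details given in the text.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # convention chosen here for a flat channel
    return cov / math.sqrt(var_x * var_y)


def fish_segregation_index(l1_pixels, b1_pixels):
    """Negative Pearson correlation of L1 and B1 intensities in one nucleus."""
    return -pearson(l1_pixels, b1_pixels)


# Hypothetical nuclear pixels: the channels are complementary (segregated) ...
segregated = fish_segregation_index([9, 8, 1, 0], [0, 1, 8, 9])
# ... or rise and fall together (mixed, as on mitotic chromosomes).
mixed = fish_segregation_index([5, 6, 4, 5], [5, 6, 4, 5])
```

With this sign convention the index approaches 1 when the two signals occupy complementary pixels and approaches -1 when they overlap completely, so a higher index means stronger segregation.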
To provide further molecular evidence for segregation of repeats during the cell cycle, we
analyzed the published Hi-C data from cell-cycle synchronized mESCs and HeLa cells (Nagano et
al., 2017; Naumova et al., 2013). In both cell types, G1-phase cells exhibit a classic plaid pattern
of hierarchical interactions, with enriched and depleted interaction blocks outside of the diagonal
region of the Hi-C interaction heatmap (Figures 3C, 3D, S2B, and S2C). In contrast, M-phase cells
exhibit stronger signals along the diagonal, which represent the linearly organized, longitudinally
compressed array of consecutive chromatin loops. To quantify this difference, we defined a
Hi-C-based segregation index by calculating the ratio of homotypic versus heterotypic interaction
frequencies between L1 and B1/Alu subfamilies. Indeed, the Hi-C segregation index is
significantly higher in G1-phase cells than in M-phase cells (Figures 3C and S2B). These results
indicate that segregation of B1/Alu and L1 repeats is dispersed by mitosis, and is re-established
when the higher-order chromatin structure forms during each cell cycle. This finding agrees with
previous reports that in metaphase, chromosome folding becomes homogeneous and large
megabase-scale A and B compartments are lost, whilst in interphase, chromosomes return to a
highly compartmentalized state (Nagano et al., 2017; Naumova et al., 2013).
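One way to read the Hi-C-based segregation index is as a ratio of average contact frequencies. The sketch below assumes each genomic bin has been labeled as L1-rich or B1-rich and takes the mean contact over same-label bin pairs divided by the mean over different-label pairs. How bins are labeled, whether means or sums are used, and whether distance decay is corrected first are not specified in the text, so these choices, like the function name and toy matrices, are assumptions.

```python
def hic_segregation_index(contacts, labels):
    """Mean homotypic contact divided by mean heterotypic contact.

    contacts: symmetric matrix (list of lists) of contact frequencies.
    labels:   one label per bin, e.g. "L1" or "B1"; None skips the bin.
    """
    homotypic, heterotypic = [], []
    n = len(labels)
    for i in range(n):
        for j in range(i + 1, n):  # upper triangle only, diagonal excluded
            if labels[i] is None or labels[j] is None:
                continue
            if labels[i] == labels[j]:
                homotypic.append(contacts[i][j])
            else:
                heterotypic.append(contacts[i][j])
    if not homotypic or not heterotypic:
        raise ValueError("need at least one homotypic and one heterotypic pair")
    mean_hetero = sum(heterotypic) / len(heterotypic)
    if mean_hetero == 0:
        raise ValueError("heterotypic contacts are all zero")
    return (sum(homotypic) / len(homotypic)) / mean_hetero


labels = ["L1", "B1", "L1", "B1"]
# Hypothetical interphase-like map: same-type bins contact each other more.
compartmentalized = [
    [0, 2, 6, 2],
    [2, 0, 2, 6],
    [6, 2, 0, 2],
    [2, 6, 2, 0],
]
# Hypothetical mitosis-like map: no preference for either contact type.
homogeneous = [[0 if i == j else 4 for j in range(4)] for i in range(4)]

index_compartmentalized = hic_segregation_index(compartmentalized, labels)
index_homogeneous = hic_segregation_index(homogeneous, labels)
```

A value near 1 means homotypic and heterotypic contacts are equally frequent, while larger values indicate stronger segregation. In real data the strong distance dependence of contacts would usually be normalized out before taking such a ratio.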
After fertilization, the chromatin undergoes extensive reprogramming from a markedly relaxed
state in zygotes to fully organized structures in blastocysts (Du et al., 2017; Flyamer et al., 2017;
Ke et al., 2017). We performed time-course DNA FISH analysis of L1 and B1 in early mouse
embryos (Figure 3E). L1 and B1 signals are largely overlapping in zygotes and early 2-cell embryos,
and become progressively more segregated during embryonic divisions (Figure 3E). A plot of the
FISH segregation indexes along the course of blastocyst development showed that the greatest
change (steepest trend-line) occurred between the zygote and the late 2-cell stage (Figure 3F). This
observation implies that the initiation of B1 and L1 compartmentalization may coincide with the
zygotic genome activation (ZGA), during which massive transcription switches on (Qiu et al.,
2003). We then analyzed the published Hi-C data from early mouse embryos (Du et al., 2017).
Consistent with our results, we detected a gradual increase of the segregation index (Figure 3G).
Early 2-cell embryos exhibit prevalent cis-chromosomal contacts along the diagonal of the Hi-C
interaction map, whereas the plaid patterns of Hi-C interactions become readily detectable in late
2-cell embryos and are fully established in the inner cell mass (ICM) cells of blastocysts (Figure
3H). Thus, in early embryos, compartmentalization of B1- and L1-rich regions appears to be
established in a stepwise manner, coincident with de novo establishment of higher-order chromatin
structures.

Repeat RNA and transcription promote L1 and B1 compartmentalization
Transcription and nuclear architecture are closely interdependent and influence each other
(Dowen et al., 2014; Rowley and Corces, 2018; Rowley et al., 2017). We sought to test whether
RNA transcribed from L1 and B1 repeats mediates the function of their DNA sequences. We first
analyzed the effects of inhibition of L1 and B1 expression in early mouse embryos, as both repeats
are activated and highly expressed in two-cell embryos (Fadloun et al., 2013; Jachowicz et al., 2017;
Rothstein et al., 1992). We recently showed that L1 RNA can be efficiently knocked down with
antisense morpholinos (AMO) in mouse embryos and mESCs (Percharde et al., 2018). We
successfully depleted L1 and B1 transcripts by injecting antisense morpholinos or oligonucleotides
(ASO) into mouse zygotes (Figures S3A-B). Embryos depleted of L1 or B1 RNA were able to pass
the first embryonic division and their development was indistinguishable from control embryos at
the 2-cell stage; however, they failed to divide further and became arrested at the 2-cell stage
(Figure 4A). We collected these embryos for DNA FISH analysis when the control group injected
with scramble AMO or ASO had grown to the late 2-cell stage. In embryos depleted of L1 or B1
RNA, L1/B1 segregation is impaired, as indicated by significantly lower L1/B1 segregation
indexes compared to scramble control and noninjected late 2-cell embryos (Figures 4B and S3C-E).
This result suggests that formation of L1 and B1 compartments is delayed upon depletion of L1
or B1 transcripts, which is consistent with the observed embryonic arrest.

It was reported that inhibition of Pol II transcription by α-amanitin causes embryonic arrest at
the late 2-cell stage (Du et al., 2017; Qiu et al., 2003). We analyzed the Hi-C data obtained by Du
et al. and found that α-amanitin-treated embryos exhibit predominantly cis-chromosomal contacts
along the diagonal of the Hi-C map, in a pattern similar to that of early 2-cell embryos, while the
control group shows the plaid-like pattern of late 2-cell embryos (Figure S4A). Consistent with
this, treating mESCs with the transcription inhibitors Actinomycin D (ActD) or DRB
(5,6-dichloro-1-β-D-ribofuranosylbenzimidazole) also altered L1/B1 compartmentalization as shown by loss of
L1 perinucleolar localization and gain of mixed L1 and B1 signals (Figures 4C, 4D and S4B).
Notably, inhibition of both Pol I and II by ActD caused a more severe effect than inhibition of Pol
II alone by DRB. These results indicate that transcription facilitates nuclear organization.

Next, we sought to deplete repeat transcripts in mESCs in order to dissect the effects on
chromatin organization independent of embryonic progression. B1/Alu repeats are known to be
broadly involved in diverse cellular processes, including transcription and RNA processing and
nuclear export (Chen et al., 2009; Dominissini et al., 2011; Lubelsky and Ulitsky, 2018; Polak and
Domany, 2006). Treatment of mESCs with B1 ASO led to severe cell death within hours of
transfection (data not shown), which precluded direct assessment of the function of B1 transcripts
in organizing the nuclear structure. L1 repeats, the most abundant of all repeat subclasses, are
predominantly enriched in heterochromatic B compartments. It was suggested that heterochromatin
interactions play a more important role in nuclear compartmentalization than euchromatin
interactions (Falk et al., 2019; Houda Belaghzal, 2019). To reveal a direct role of repeat RNA in
nuclear organization, we focused on L1 in subsequent investigations.

Treatment of mESCs with L1 AMOs drastically altered the formation of L1/B1 compartments
as shown by the appearance of the overlapping, yellowish FISH signals of B1 and L1 DNA (Figures
4E, 4F, and S5A). In particular, when L1 RNA was depleted by AMO treatment for 12 or 36 hours,
perinucleolar L1 signals were diffuse or absent and nucleoplasmic signals were dramatically
increased at both time-points. In contrast to the punctate pattern of B1 DNA in the control mESCs,
B1 signals became more uniformly dispersed in the nucleoplasm, which is indicative of decreased
clustering. In comparison, treatment with the drug azidothymidine (AZT), which blocks L1
retrotransposition activity (Xie et al., 2010), failed to affect L1 RNA levels and the nuclear
localization of L1 and B1 (Figures 4E, 4F, and S5B). In addition, statistical analysis indicated that
the FISH segregation indexes were significantly lower in L1-depleted mESCs than in mESCs
treated with scramble AMO or AZT. These results suggest a role of L1 transcripts in regulating the
nuclear localization and segregation of L1 and B1 DNA repeats, independently of L1
retrotransposition activity, in mESCs.

L1 RNA regulates the higher-order chromatin structure in mESCs
For in-depth characterization of the molecular defects caused by L1 RNA depletion, we
performed in situ Hi-C analysis at 36 hours after transfecting L1 AMO into mESCs. Direct
visualization of Hi-C interaction maps revealed obvious differences in the plaid pattern of
L1-depleted and control mESCs, as illustrated by chr17 (Figure 5A). In L1-depleted mESCs, Hi-C
contact signals were abnormally increased along the diagonal line, whereas the plaid signals outside
of the diagonal regions became fuzzy or even lost (Figure 5A, panel i). Accordingly, comparison
of Hi-C contact frequencies of the control versus L1-depleted cells showed decreased ratios (in
blue) across the diagonal and increased ratios (in red) in the periphery (Figure 5A, panel ii). These
changes indicate enhanced local chromosomal contacts but decreased long-range interactions in
L1-depleted mESCs.

A zoomed-in view of a 42-Mb region of chr17 in L1-depleted mESCs further shows decreased
homotypic chromatin contacts between L1-rich or B1-rich compartments, but abnormally
increased heterotypic contacts (Figure 5B, right; and Figure 5C). For example, B1-B1 interactions
(represented by DF, DH and FH) or L1-L1 interactions (represented by ce, cg, ci, eg, ei and gi)
were downregulated in L1-depleted cells, whereas aberrant B1-L1 contacts (represented by cD, De, 1
Dg, Di, eF, Fg, and Hi) were increased. Globally, L1-depleted cells show significantly lower Hi-C 2
segregation indexes compared to control mESCs (Figure 5A). Consistent with the overlapping 3
signals of L1 and B1 DNA FISH (Figure 4E), these results revealed defective nuclear segregation 4
of L1 and B1 compartments after depletion of L1 RNA (Figure 5B). 5
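
The comparison of homotypic and heterotypic contacts can be sketched in a few lines of Python. The definition used below (mean homotypic contact divided by mean heterotypic contact between distinct bins) is an assumption chosen for illustration and may differ from the Hi-C segregation index computed in this study. The function name, the bin labels, and the toy matrices are likewise hypothetical.

```python
def segregation_index(matrix, labels):
    """Mean homotypic contact divided by mean heterotypic contact.

    labels[i] is the compartment type of bin i (for example "L1" or "B1").
    Only pairs of distinct bins (i < j) are used, so the diagonal is ignored.
    This definition is an illustrative assumption, not the study's formula.
    """
    homotypic, heterotypic = [], []
    n = len(matrix)
    for i in range(n):
        for j in range(i + 1, n):
            if labels[i] == labels[j]:
                homotypic.append(matrix[i][j])
            else:
                heterotypic.append(matrix[i][j])
    if not homotypic or not heterotypic:
        raise ValueError("need both homotypic and heterotypic bin pairs")
    mean_hetero = sum(heterotypic) / len(heterotypic)
    if mean_hetero == 0:
        raise ValueError("mean heterotypic contact is zero")
    return (sum(homotypic) / len(homotypic)) / mean_hetero


# Toy data: four alternating bins; the contact values are invented.
labels = ["L1", "B1", "L1", "B1"]
control = [
    [0, 2, 6, 2],
    [2, 0, 2, 6],
    [6, 2, 0, 2],
    [2, 6, 2, 0],
]
depleted = [
    [0, 4, 3, 4],
    [4, 0, 4, 3],
    [3, 4, 0, 4],
    [4, 3, 4, 0],
]

control_si = segregation_index(control, labels)
depleted_si = segregation_index(depleted, labels)
print(f"control: {control_si:.2f}, L1-depleted: {depleted_si:.2f}")
```

Under this assumed definition, a value above 1 means that like compartments contact each other more than unlike ones, and a drop toward or below 1 corresponds to the loss of segregation described above.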

Next, to directly visualize changes in spatial contacts at specific loci, we performed oligopaint
dual-color DNA FISH (Figure 6A). We labeled four representative genomic regions on chr17, a
B1-enriched region (F) and three L1-enriched regions (e, g and q), with a set of 500 to 4,500 DNA
probes for each region at a density of 200 bp per probe, and measured the spatial distances between
the probe signals within mESC nuclei. In control mESCs, the two linearly adjacent L1- and
B1-enriched regions e and F are positioned away from each other in the nuclear space, with a median
distance of 1.72 ± 0.55 µm. Depletion of L1 RNA significantly shortened the distance between
them to 1.27 ± 0.47 µm (Figures 6B and 6C, left). In comparison, the two L1-enriched
compartments g and q reside in close spatial proximity (1.08 ± 0.45 µm) in the nuclei of control
cells even though they are separated by multiple L1 and B1 compartments in the linear genomic
sequence. L1 depletion significantly increased the nuclear distance between g and q to 1.81 ± 0.82
µm (Figure 6C, right). Together, the DNA FISH and Hi-C results demonstrate that L1 RNA is critical
for maintaining the higher-order chromatin structure of segregated L1- and B1-enriched compartments
in mESCs.
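
A comparison of probe-to-probe distances between two conditions can be sketched as follows. The statistical test behind the reported significance is not described in this section, so the permutation test on the difference of medians used below is an assumption, as are the function names and the distance values, which are invented and are not measurements from this study.

```python
import random
import statistics


def median_shift(control, treated):
    """Median of treated minus median of control (same units as the input)."""
    return statistics.median(treated) - statistics.median(control)


def permutation_p_value(control, treated, n_perm=2000, seed=0):
    """Two-sided permutation p-value for the difference of medians."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = abs(median_shift(control, treated))
    pooled = list(control) + list(treated)
    n_control = len(control)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        shuffled = abs(median_shift(pooled[:n_control], pooled[n_control:]))
        if shuffled >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical e-F distances in micrometres (one value per nucleus).
control_um = [1.9, 1.6, 1.7, 2.1, 1.5, 1.8, 1.75, 1.65]
depleted_um = [1.2, 1.3, 1.1, 1.4, 1.25, 1.35, 1.0, 1.3]

shift = median_shift(control_um, depleted_um)
p_value = permutation_p_value(control_um, depleted_um)
print(f"median shift = {shift:+.2f} um, permutation p = {p_value:.4f}")
```

A negative shift corresponds to two regions moving closer together after depletion, as described for regions e and F; the same functions apply unchanged to a pair such as g and q, where the shift would be positive.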

L1 repeats promote phase separation of HP1
Recent evidence suggests that the formation of heterochromatin domains entails a
phase-separation mechanism (Larson et al., 2017; Li et al., 2012; Strom et al., 2017; Zwicker et al., 2014).
HP1, a known component of heterochromatin, forms phase-separated droplets in the presence of
DNA or nucleosomes in vitro (Larson et al., 2017; Strom et al., 2017). We showed that HP1 binds
strongly to L1-enriched but not B1-enriched genomic regions (Figure 7A). In addition, L1 RNA
specifically binds L1 DNA sequences throughout the genome, and is significantly enriched in L1
repeat-associated compartments, but is depleted in B1-associated compartments (Figures 1A and
7A), as shown by chromatin isolation by RNA purification followed by sequencing of L1 RNA
(ChIRP-seq) (Lu et al., paper submitted). Given the abundance and co-residence of L1 repeats with
HP1 in similar heterochromatin contexts within the nucleus, we hypothesized that L1 RNA may
facilitate heterochromatin formation by promoting HP1 phase separation.

We first tested the direct protein-RNA interaction between the recombinant human HP1
protein and L1 RNA fragments that were transcribed in vitro. Interestingly, the L1 RNA mix
retained HP1 protein under highly stringent wash conditions (up to 1 M of salt and urea), indicating
that HP1 has a strong RNA-binding activity towards L1 (Figure 7B). Droplet formation assays
showed that HP1 alone fails to phase separate (Figure 7C), consistent with a previous report
(Larson et al., 2017). Intriguingly, upon addition of L1 RNA, HP1 forms spherical droplets with
the properties of liquid-like condensates, such as fusion of droplets and rapid recovery after
photobleaching (Figures 7C and S6A, B). HP1-L1 RNA droplets increase in size as increasing
concentrations of the components are added to the assay (Figure 7C). L1 DNA also promotes the
phase separation of HP1 (Figure S6C). These results suggest that L1 repeats play an active role
in heterochromatin formation.

Discussion
Although tremendous efforts have been dedicated to studies of structural chromatin proteins
and cataloging chromatin maps, the role of DNA sequences in 3D genome organization has been
largely ignored. It has been speculated that genome organization may occur through polymer phase
separation mediated by homologous sequence pairing (Cook, 2002; Cook and Marenduzzo, 2018;
Falk et al., 2019; Rowley et al., 2017; Tang, 2017). However, it was unclear which types of repeats
are involved in this process, and direct evidence was still lacking. In addition, the molecular
determinants and mechanisms that drive formation of the A/B compartments remain unclear. In
this study, we provide experimental evidence to support a fundamental role of L1 and B1/Alu
repeats in organizing chromatin macrostructure.

Complementary results based on genomics, Hi-C and imaging analysis revealed that L1 and
B1/Alu repeats tend to cluster with sequences from their own repeat subfamily and form mutually
exclusive domains in the nuclear space that are highly correlated with A/B compartments (Figures
1 and 2). The nuclear segregation of B1/Alu-enriched A compartments in the nuclear interior and
L1-enriched B compartments in the nuclear and nucleolar peripheries is relatively conserved and
stable across a variety of mouse and human cells, and re-occurs during the cell cycle (Figure 2). In
addition, de novo establishment of the segregated pattern of L1- and B1-rich compartments is
coincident with the formation of higher-order chromatin structures during early embryogenesis,
and appears to be critically regulated by RNA transcripts produced from L1 and B1 DNA sequences.
Depletion of L1 RNA by AMOs in mESCs drastically disrupts the nuclear localization of L1 repeats,
and leads to impaired segregation of L1 and B1 compartments and altered higher-order chromatin
structures. Moreover, L1 RNA binds to its own DNA sequences in cells, and promotes HP1 phase
separation in vitro. Together, our findings suggest that repeat transcripts may transact the structural
information embedded in their DNA sequences and facilitate formation of the nuclear architecture
with assistance from protein partners, such as HP1 (Figures 4-7).

Based on our observations together with previous reports in the literature, we propose a
hypothetical model in which repetitive elements organize the macroscopic structure of the genome
at three hierarchical levels (Figure 7D). First, L1 and B1/Alu repeats serve as the genetic basis for
A and B compartments. The abundance and scattering of these repeat elements in the genome
provide numerous nucleation points or "structural codes" to seed the formation of nuclear
subdomains. Homotypic clustering of L1-rich regions or B1-rich regions initiates genome folding.
Second, the structural information embedded in linear genomic DNA repeats is in part transacted
by their transcripts, particularly L1 RNA as demonstrated in this study, into spatially ordered
chromatin in the nucleus. Molecular crowding and specific interactions among similar molecules,
including DNA sequences and RNAs containing multiple copies of the same repeat type, and their
associated proteins, promote clustering into larger subdomains (Bergeron-Sandoval et al., 2016;
Cook, 2002; Misteli, 2007). Phase separation of individual subdomains may also lead to their
segregation in the nuclear space and the formation of distinct L1-enriched heterochromatin and
B1/Alu-enriched euchromatin domains. Third, chromatin compartmentalization may be further
reinforced and stabilized through attachment of repeat DNA sequences to subnuclear structures
such as the nucleolus and nuclear speckles (Quinodoz et al., 2018). Evidence to support this notion
includes our observation that L1 repeats are preferentially localized at the nuclear and nucleolar
peripheries, and depletion of L1 RNA disrupts the localization of L1 DNA at these sites.

In the absence of zygotic transcription, chromatin folding continues to occur in a significantly
delayed manner (Figures 4 and S4), and the higher-order chromatin structure still partially forms
(Du et al., 2017). This observation implies that the 3D organization of chromatin is an autonomous
process that is greatly facilitated by transcription. We surmise that homotypic clustering of L1-rich
or B1-rich regions is autonomous but dynamically regulated. L1/B1-driven nuclear
compartmentalization is reminiscent of the self-assembly of tandem repeat sequences such as
ribosomal DNA (rDNA) and satellite repeats, which promotes higher-order assemblages of the
nucleolus and pericentromeric domains, respectively (Falahati et al., 2016; Tjong et al., 2016).
Consistent with our model, in mouse ESCs carrying a human artificial chromosome, human L1
sequences spatially interact with mouse L1-enriched genomic regions but avoid the SINE-enriched
regions, and vice versa for human SINEs (van de Werken et al., 2017). Specific DNA-RNA-protein
interactions at homologous L1 or B1 chromosomal segments not only enhance the formation of
subdomains through phase separation, but may also promote their segregation in the nucleus. A
recent study reported that there does not appear to be cross-talk between different transcription
factors; instead, they interact among themselves via sequence-specific protein-protein interactions
to form concentrated hubs on synthetic lacO DNA arrays (Chong et al., 2018). L1 or B1/Alu repeats
tend to associate with distinct sets of transcription and chromatin regulators (Lu et al., paper
submitted). For example, heterochromatin proteins such as HP1, KAP1, and SETDB1 are
specifically enriched on L1 elements, whereas general transcription factors such as GTF3C2 and
CEBPB and RNA Pol II subunits show enriched binding on B1/Alu. In addition, different
physicochemical properties of phase-separated L1 or B1 subdomains, for example chromatin
compactness, may further promote the segregation of L1 and B1 compartments.

By focusing on L1 transcripts, we demonstrate an essential role for L1 repeat RNA in
promoting the formation of segregated higher-order chromatin structure in mESCs (Figures 4-6).
Heterochromatin domains appear to be more stable than euchromatin domains (Belaghzal et al.,
2019). In support of a heterochromatin-dominated model (Falk et al., 2019), our data further
suggest that L1-mediated heterochromatin formation may drive nuclear compartmentalization.
This function of L1 RNA might be achieved by stabilizing the binding of chromatin proteins such
as HP1 to L1 DNA sequences, and by recruiting additional RNA-binding proteins to increase
local protein concentrations and multivalent interactions for efficient phase separation. It was
reported that the RNA-binding activity of mammalian HP1 is required for its localization to
pericentromeric chromatin (Muchardt et al., 2002; Stunnenberg et al., 2015). On the other hand,
heterochromatic RNA reduces the mobility and spreading of the fission yeast HP1 protein in
heterochromatic domains (Keller et al., 2012; Muchardt et al., 2002; Stunnenberg et al., 2015).
From an evolutionary perspective, the properties of co-localization, abundance, and sequence
preference of L1 and HP1 might have co-evolved to ensure regulated formation of L1-enriched
heterochromatin in organizing the nucleus. L1 RNA binds NCL, and its depletion causes relocation
of DUX genes from the perinucleolar heterochromatin to the nucleoplasm (Percharde et al., 2018).
Abnormal activation or silencing of L1 in mouse zygotes led to altered chromatin accessibility and
compaction, which failed to be rescued by synthetic L1 RNA (Jachowicz et al., 2017). Together
with previous reports, our results suggest a model where L1 RNA acts in cis to anchor L1
repeat-enriched chromosomal segments to the nucleolar and nuclear peripheries. This scaffold function is
reminiscent of reported roles for several chromatin-bound noncoding RNAs in organizing
subnuclear domains. For example, Xist RNA binds the lamin B receptor in the inner nuclear
membrane to initiate and maintain the silencing of the inactive X chromosome (McHugh et al.,
2015). Transcription at major satellite repeats precedes chromocenter formation, and satellite
transcripts help to recruit the H3K9me3 methyltransferase SUV39H to centromeric DNA
sequences (Probst et al., 2010; Velazquez Camacho et al., 2017). However, the exact mechanism
underlying the function of L1 RNA remains to be fully elucidated by future studies.

In summary, our study starts to unravel the fundamental principle of 3D genome organization,
in which the distribution of L1 and B1 transposable elements provides the blueprint for chromatin
folding. As discovered by Anfinsen in the late 1950s, the central pillar of protein science tells us
that the amino acid sequence of a protein determines its structure and function (Anfinsen, 1972;
Anfinsen et al., 1954). Analogously, we propose that the primary DNA sequences determine how
the genome folds and functions. We believe that genome folding occurs autonomously, through a
process that is driven by homotypic clustering of regions containing L1 or B1 repeat sequences,
and is further facilitated by transcription and transcripts produced at these repeat elements.
Structural information embedded in L1 and SINE repeats may be universally recognizable, thus
contributing to the high degree of stability and conservation in compartmental and TAD
organization that is observed across mammalian species and cell types. We note that L1
and B1/Alu compartmental domains represent a structural and functional ground state of chromatin
organization, on which subsequent regulatory features, such as dynamic enhancer-promoter
interactions, are overlaid. Nevertheless, the same principle of homotypic clustering, phase
separation and spatial segregation of chromosomal segments may be reiterated at different genomic
scales, consequently folding the genome.

ACKNOWLEDGMENTS
We thank the Shen Laboratory members for insightful discussions, and we thank Dr. Hongxia Lv
at the core imaging facility of the School of Life Sciences, Peking University, for imaging support.
This work was supported in part by the National Basic Research Program of China
(2018YFA0107604, 2017YFA0504204 to X.S.); the National Natural Science Foundation of China
(31630095, 31471219 to X.S.; 21573013, 21825401 to Y.S.); the National Key R&D Program of
China (2017YFA0505300 to Y.S.); and the Center for Life Sciences (CLS) at Tsinghua University
(to X.S.).

AUTHOR CONTRIBUTIONS
X.S. conceived of and supervised the study. X.S., Y.S., J.Y.L. and L.C. designed the experiments.
J.Y.L. performed bioinformatics analysis. L.C. conducted all imaging experiments with help from
Y.Y., G.Z., T.W., X.H., and J.Z. T.L. performed phase separation and RNA pull-down experiments.
K.Z., Q.P. and P.Y. prepared the Hi-C library. T.W., W.L. and P.W. did embryo injections. P.L. and
L.W. provided HP1 proteins. M.P. and M.R.-S. contributed to early experiments of L1 RNA
knockdown. W.X., J.N., H.Z., X.B., W.S., Z.W. and Y.H. contributed technical
assistance/suggestions. X.S., J.Y.L. and L.C. wrote the manuscript with input from all authors.

DECLARATION OF INTERESTS
The authors declare no competing interests.

REFERENCES
Anfinsen, C.B. (1972). The formation and stabilization of protein structure. Biochem J 128, 737-749.
Anfinsen, C.B., Redfield, R.R., Choate, W.L., Page, J., and Carroll, W.R. (1954). Studies on the gross structure,
cross-linkages, and terminal sequences in ribonuclease. J Biol Chem 207, 201-210.
Annunziato, A.T. (2008). DNA Packaging: Nucleosomes and Chromatin. Nature Education 1(1):26.
Bergeron-Sandoval, L.P., Safaee, N., and Michnick, S.W. (2016). Mechanisms and Consequences of
Macromolecular Phase Separation. Cell 165, 1067-1079.
Biemont, C. (2010). A Brief History of the Status of Transposable Elements: From Junk DNA to Major Players in
Evolution. Genetics 186, 1085-1093.
Bintu, B., Mateo, L.J., Su, J.H., Sinnott-Armstrong, N.A., Parker, M., Kinrot, S., Yamaya, K., Boettiger, A.N., and
Zhuang, X. (2018). Super-resolution chromatin tracing reveals domains and cooperative interactions in single
cells. Science 362.
Boettiger, A.N., Bintu, B., Moffitt, J.R., Wang, S., Beliveau, B.J., Fudenberg, G., Imakaev, M., Mirny, L.A., Wu, C.T.,
and Zhuang, X. (2016). Super-resolution imaging reveals distinct chromatin folding for different epigenetic
states. Nature 529, 418-422.
Bonev, B., and Cavalli, G. (2016). Organization and function of the 3D genome. Nat Rev Genet 17, 661-678.
Bourque, G., Leong, B., Vega, V.B., Chen, X., Lee, Y.L., Srinivasan, K.G., Chew, J.L., Ruan, Y., Wei, C.L., Ng, H.H., et
al. (2008). Evolution of the mammalian transcription factor binding repertoire via transposable elements.
Genome Res 18, 1752-1762.
Bouwman, B.A., and de Laat, W. (2015). Getting the genome in shape: the formation of loops, domains and
compartments. Genome Biol 16, 154.
Buchwalter, A., Kaneshiro, J.M., and Hetzer, M.W. (2019). Coaching from the sidelines: the nuclear periphery in
genome regulation. Nat Rev Genet 20, 39-50.
Chen, C., Ara, T., and Gautheret, D. (2009). Using Alu elements as polyadenylation sites: A case of retroposon
exaptation. Mol Biol Evol 26, 327-334.
Chong, S., Dugast-Darzacq, C., Liu, Z., Dong, P., Dailey, G.M., Cattoglio, C., Heckert, A., Banala, S., Lavis, L.,
Darzacq, X., et al. (2018). Imaging dynamic and selective low-complexity domain interactions that control gene
transcription. Science 361.
Chuong, E.B., Elde, N.C., and Feschotte, C. (2016). Regulatory evolution of innate immunity through co-option
of endogenous retroviruses. Science 351, 1083-1087.
Cook, P.R. (2002). Predicting three-dimensional genome structure from transcriptional activity. Nat Genet 32,
347-352.
Cook, P.R., and Marenduzzo, D. (2018). Transcription-driven genome organization: a model for chromosome
structure and the regulation of gene expression tested through simulations. Nucleic Acids Res 46, 9895-9906.
Deininger, P. (2011). Alu elements: know the SINEs. Genome Biol 12, 236.
Dekker, J., Belmont, A.S., Guttman, M., Leshyk, V.O., Lis, J.T., Lomvardas, S., Mirny, L.A., O'Shea, C.C., Park, P.J.,
Ren, B., et al. (2017). The 4D nucleome project. Nature 549, 219-226.
Dekker, J., Marti-Renom, M.A., and Mirny, L.A. (2013). Exploring the three-dimensional organization of
genomes: interpreting chromatin interaction data. Nat Rev Genet 14, 390-403.
Dekker, J., Rippe, K., Dekker, M., and Kleckner, N. (2002). Capturing chromosome conformation. Science 295,
1306-1311.
Dixon, J.R., Jung, I., Selvaraj, S., Shen, Y., Antosiewicz-Bourget, J.E., Lee, A.Y., Ye, Z., Kim, A., Rajagopal, N., Xie,
W., et al. (2015). Chromatin architecture reorganization during stem cell differentiation. Nature 518, 331-336.
Dixon, J.R., Selvaraj, S., Yue, F., Kim, A., Li, Y., Shen, Y., Hu, M., Liu, J.S., and Ren, B. (2012). Topological domains
in mammalian genomes identified by analysis of chromatin interactions. Nature 485, 376-380.
Dominissini, D., Moshitch-Moshkovitz, S., Amariglio, N., and Rechavi, G. (2011). Adenosine-to-inosine RNA
editing meets cancer. Carcinogenesis 32, 1569-1577.
Dowen, J.M., Fan, Z.P., Hnisz, D., Ren, G., Abraham, B.J., Zhang, L.N., Weintraub, A.S., Schuijers, J., Lee, T.I., Zhao,
K., et al. (2014). Control of cell identity genes occurs in insulated neighborhoods in mammalian chromosomes.
Cell 159, 374-387.
Du, Z., Zheng, H., Huang, B., Ma, R., Wu, J., Zhang, X., He, J., Xiang, Y., Wang, Q., Li, Y., et al. (2017). Allelic
reprogramming of 3D chromatin architecture during early mammalian development. Nature 547, 232-235.
Durruthy-Durruthy, J., Sebastiano, V., Wossidlo, M., Cepeda, D., Cui, J., Grow, E.J., Davila, J., Mall, M., Wong,
W.H., Wysocka, J., et al. (2016). The primate-specific noncoding RNA HPAT5 regulates pluripotency during
human preimplantation development and nuclear reprogramming. Nat Genet 48, 44-52.
Fadloun, A., Le Gras, S., Jost, B., Ziegler-Birling, C., Takahashi, H., Gorab, E., Carninci, P., and Torres-Padilla, M.E.
(2013). Chromatin signatures and retrotransposon profiling in mouse embryos reveal regulation of LINE-1 by
RNA. Nat Struct Mol Biol 20, 332-338.
Falahati, H., Pelham-Webb, B., Blythe, S., and Wieschaus, E. (2016). Nucleation by rRNA Dictates the Precision
of Nucleolus Assembly. Curr Biol 26, 277-285.
Falk, M., Feodorova, Y., Naumova, N., Imakaev, M., Lajoie, B.R., Leonhardt, H., Joffe, B., Dekker, J., Fudenberg,
G., Solovei, I., et al. (2019). Heterochromatin drives compartmentalization of inverted and conventional nuclei.
Nature 570, 395-399.
Flyamer, I.M., Gassler, J., Imakaev, M., Brandao, H.B., Ulianov, S.V., Abdennur, N., Razin, S.V., Mirny, L.A., and
Tachibana-Konwalski, K. (2017). Single-nucleus Hi-C reveals unique chromatin reorganization at
oocyte-to-zygote transition. Nature 544, 110-114.
Fraser, J., Ferrai, C., Chiariello, A.M., Schueler, M., Rito, T., Laudanno, G., Barbieri, M., Moore, B.L., Kraemer, D.C.,
Aitken, S., et al. (2015). Hierarchical folding and reorganization of chromosomes are linked to transcriptional
changes in cellular differentiation. Mol Syst Biol 11, 852.
Gassler, J., Brandao, H.B., Imakaev, M., Flyamer, I.M., Ladstatter, S., Bickmore, W.A., Peters, J.M., Mirny, L.A., and
Tachibana, K. (2017). A mechanism of cohesin-dependent loop extrusion organizes zygotic genome
architecture. EMBO J 36, 3600-3618.
Gibcus, J.H., and Dekker, J. (2013). The hierarchy of the 3D genome. Mol Cell 49, 773-782.
Grow, E.J., Flynn, R.A., Chavez, S.L., Bayless, N.L., Wossidlo, M., Wesche, D.J., Martin, L., Ware, C.B., Blish, C.A.,
Chang, H.Y., et al. (2015). Intrinsic retroviral reactivation in human preimplantation embryos and pluripotent
cells. Nature 522, 221-225.
Guelen, L., Pagie, L., Brasset, E., Meuleman, W., Faza, M.B., Talhout, W., Eussen, B.H., de Klein, A., Wessels, L., de
Laat, W., et al. (2008). Domain organization of human chromosomes revealed by mapping of nuclear lamina
interactions. Nature 453, 948-951.
Haarhuis, J.H.I., van der Weide, R.H., Blomen, V.A., Yanez-Cuna, J.O., Amendola, M., van Ruiten, M.S., Krijger,
P.H.L., Teunissen, H., Medema, R.H., van Steensel, B., et al. (2017). The Cohesin Release Factor WAPL Restricts
Chromatin Loop Extension. Cell 169, 693-707 e614.
Belaghzal, H., Borrman, T., Stephens, A.D., Lafontaine, D.L., Venev, S.V., Weng, Z., Marko, J.F., and
Dekker, J. (2019). Compartment-dependent chromatin interaction dynamics revealed by liquid
chromatin Hi-C. bioRxiv.
Jachowicz, J.W., Bing, X.Y., Pontabry, J., Boskovic, A., Rando, O.J., and Torres-Padilla, M.E. (2017). LINE-1
activation after fertilization regulates global chromatin accessibility in the early mouse embryo. Nat Genet
49, 1502-1510.
Jost, D., Carrivain, P., Cavalli, G., and Vaillant, C. (2014). Modeling epigenome folding: formation and dynamics
of topologically associated chromatin domains. Nucleic Acids Res 42, 9553-9561.
Jurka, J., Kohany, O., Pavlicek, A., Kapitonov, V.V., and Jurka, M.V. (2004). Duplication, coclustering, and selection
of human Alu retrotransposons. Proc Natl Acad Sci U S A 101, 1268-1272.
Kadauke, S., and Blobel, G.A. (2009). Chromatin loops in gene regulation. Biochim Biophys Acta 1789, 17-25.
Ke, Y., Xu, Y., Chen, X., Feng, S., Liu, Z., Sun, Y., Yao, X., Li, F., Zhu, W., Gao, L., et al. (2017). 3D Chromatin
Structures of Mature Gametes and Structural Reprogramming during Mammalian Embryogenesis. Cell 170,
367-381 e320.
Keller, C., Adaixo, R., Stunnenberg, R., Woolcock, K.J., Hiller, S., and Buhler, M. (2012). HP1(Swi6) mediates the
recognition and destruction of heterochromatic RNA transcripts. Mol Cell 47, 215-227.
Korenberg, J.R., and Rykowski, M.C. (1988). Human genome organization: Alu, lines, and the molecular structure
of metaphase chromosome bands. Cell 53, 391-400.
Lander, E.S., Linton, L.M., Birren, B., Nusbaum, C., Zody, M.C., Baldwin, J., Devon, K., Dewar, K., Doyle, M.,
FitzHugh, W., et al. (2001). Initial sequencing and analysis of the human genome. Nature 409, 860-921.
Larson, A.G., Elnatan, D., Keenen, M.M., Trnka, M.J., Johnston, J.B., Burlingame, A.L., Agard, D.A., Redding, S.,
and Narlikar, G.J. (2017). Liquid droplet formation by HP1alpha suggests a role for phase separation in
heterochromatin. Nature 547, 236-240.
Li, P., Banjade, S., Cheng, H.C., Kim, S., Chen, B., Guo, L., Llaguno, M., Hollingsworth, J.V., King, D.S., Banani, S.F.,
et al. (2012). Phase transitions in the assembly of multivalent signalling proteins. Nature 483, 336-340.
Lieberman-Aiden, E., van Berkum, N.L., Williams, L., Imakaev, M., Ragoczy, T., Telling, A., Amit, I., Lajoie, B.R.,
Sabo, P.J., Dorschner, M.O., et al. (2009). Comprehensive mapping of long-range interactions reveals folding
principles of the human genome. Science 326, 289-293.
Liu, N., Lee, C.H., Swigut, T., Grow, E., Gu, B., Bassik, M.C., and Wysocka, J. (2018). Selective silencing of
euchromatic L1s revealed by genome-wide screens for L1 regulators. Nature 553, 228-232.
Lubelsky, Y., and Ulitsky, I. (2018). Sequences enriched in Alu repeats drive nuclear localization of long RNAs in
human cells. Nature 555, 107-111.
Lynch, V.J., Leclerc, R.D., May, G., and Wagner, G.P. (2011). Transposon-mediated rewiring of gene regulatory
networks contributed to the evolution of pregnancy in mammals. Nat Genet 43, 1154-1159.
Mandal, P.K., and Kazazian, H.H., Jr. (2008). SnapShot: Vertebrate transposons. Cell 135, 192-192 e191.
Mateos-Langerak, J., Brink, M.C., Luijsterburg, M.S., van der Kraan, I., van Driel, R., and Verschure, P.J. (2007).
Pericentromeric heterochromatin domains are maintained without accumulation of HP1. Mol Biol Cell
18, 1464-1471.
McHugh, C.A., Chen, C.K., Chow, A., Surka, C.F., Tran, C., McDonel, P., Pandya-Jones, A., Blanco, M., Burghard,
C., Moradian, A., et al. (2015). The Xist lncRNA interacts directly with SHARP to silence transcription through
HDAC3. Nature 521, 232-236.
Meuleman, W., Peric-Hupkes, D., Kind, J., Beaudry, J.B., Pagie, L., Kellis, M., Reinders, M., Wessels, L., and van
Steensel, B. (2013). Constitutive nuclear lamina-genome interactions are highly conserved and associated with
A/T-rich sequence. Genome Res 23, 270-280.
Mirny, L.A. (2011). The fractal globule as a model of chromatin architecture in the cell. Chromosome Res 19,
37-51.
Misteli, T. (2007). Beyond the sequence: cellular organization of genome function. Cell 128, 787-800.
Mouse Genome Sequencing Consortium, Waterston, R.H., Lindblad-Toh, K., Birney, E., Rogers, J., Abril, J.F., Agarwal, P.,
Agarwala, R., Ainscough, R., Alexandersson, M., et al. (2002). Initial sequencing and comparative analysis of the
mouse genome. Nature 420, 520-562.
Muchardt, C., Guilleme, M., Seeler, J.S., Trouche, D., Dejean, A., and Yaniv, M. (2002). Coordinated methyl and
RNA binding is required for heterochromatin localization of mammalian HP1 alpha. EMBO Rep 3, 975-981.
was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (whichthis version posted October 13, 2019. . https://doi.org/10.1101/802173doi: bioRxiv preprint
24
Nagano, T., Lubling, Y., Vaarnai, C., Dudley, C., Leung, W., Baran, Y., Cohen, N.M., Wingett, S., Fraser, P., and 1
Tanay, A. (2017). Cell-cycle dynamics of chromosomal organization at single-cell resolution. Nature 547, 61-67.
Naumova, N., Imakaev, M., Fudenberg, G., Zhan, Y., Lajoie, B.R., Mirny, L.A., and Dekker, J. (2013). Organization of the mitotic chromosome. Science 342, 948-953.
Nora, E.P., Goloborodko, A., Valton, A.L., Gibcus, J.H., Uebersohn, A., Abdennur, N., Dekker, J., Mirny, L.A., and Bruneau, B.G. (2017). Targeted Degradation of CTCF Decouples Local Insulation of Chromosome Domains from Genomic Compartmentalization. Cell 169, 930-944.
Nuebler, J., Fudenberg, G., Imakaev, M., Abdennur, N., and Mirny, L.A. (2018). Chromatin organization by an interplay of loop extrusion and compartmental segregation. Proc Natl Acad Sci U S A 115, E6697-E6706.
Orgel, L.E., and Crick, F.H. (1980). Selfish DNA: the ultimate parasite. Nature 284, 604-607.
Percharde, M., Lin, C.J., Yin, Y., Guan, J., Peixoto, G.A., Bulut-Karslioglu, A., Biechele, S., Huang, B., Shen, X., and Ramalho-Santos, M. (2018). A LINE1-Nucleolin Partnership Regulates Early Development and ESC Identity. Cell 174, 391-405 e319.
Polak, P., and Domany, E. (2006). Alu elements contain many binding sites for transcription factors and may play a role in regulation of developmental processes. BMC Genomics 7, 133.
Probst, A.V., Okamoto, I., Casanova, M., El Marjou, F., Le Baccon, P., and Almouzni, G. (2010). A strand-specific burst in transcription of pericentric satellites is required for chromocenter formation and early mouse development. Dev Cell 19, 625-638.
Pueschel, R., Coraggio, F., and Meister, P. (2016). From single genes to entire genomes: the search for a function of nuclear organization. Development 143, 910-923.
Qiu, J.J., Zhang, W.W., Wu, Z.L., Wang, Y.H., Qian, M., and Li, Y.P. (2003). Delay of ZGA initiation occurred in 2-cell blocked mouse embryos. Cell Res 13, 179-185.
Quinodoz, S.A., Ollikainen, N., Tabak, B., Palla, A., Schmidt, J.M., Detmar, E., Lai, M.M., Shishkin, A.A., Bhat, P., Takei, Y., et al. (2018). Higher-Order Inter-chromosomal Hubs Shape 3D Genome Organization in the Nucleus. Cell 174, 744-757 e724.
Rao, S.S., Huntley, M.H., Durand, N.C., Stamenova, E.K., Bochkov, I.D., Robinson, J.T., Sanborn, A.L., Machol, I., Omer, A.D., Lander, E.S., et al. (2014). A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. Cell 159, 1665-1680.
Rao, S.S.P., Huang, S.C., St Hilaire, B.G., Engreitz, J.M., Perez, E.M., Kieffer-Kwon, K.R., Sanborn, A.L., Johnstone, S.E., Bascom, G.D., Bochkov, I.D., et al. (2017). Cohesin Loss Eliminates All Loop Domains. Cell 171, 305-320.
Rebollo, R., Romanish, M.T., and Mager, D.L. (2012). Transposable Elements: An Abundant and Natural Source of Regulatory Sequences for Host Genes. Annual Review of Genetics 46, 21-42.
Rivera-Mulia, J.C., and Gilbert, D.M. (2016). Replication timing and transcriptional control: beyond cause and effect-part III. Curr Opin Cell Biol 40, 168-178.
Rothstein, J.L., Johnson, D., Deloia, J.A., Skowronski, J., Solter, D., and Knowles, B. (1992). Gene-Expression during Preimplantation Mouse Development. Gene Dev 6, 1190-1201.
Rowley, M.J., and Corces, V.G. (2018). Organizational principles of 3D genome architecture. Nat Rev Genet 19, 789-800.
Rowley, M.J., Nichols, M.H., Lyu, X.W., Ando-Kuri, M., Rivera, I.S.M., Hermetz, K., Wang, P., Ruan, Y.J., and Corces, V.G. (2017). Evolutionarily Conserved Principles Predict 3D Chromatin Organization. Mol Cell 67, 837-852.
Schwarzer, W., Abdennur, N., Goloborodko, A., Pekowska, A., Fudenberg, G., Loe-Mie, Y., Fonseca, N.A., Huber, W., Haering, C.H., Mirny, L., et al. (2017). Two independent modes of chromatin organization revealed by cohesin removal. Nature 551, 51-56.
Solovei, I., Thanisch, K., and Feodorova, Y. (2016). How to rule the nucleus: divide et impera. Current Opinion in Cell Biology 40, 47-59.
Splinter, E., Heath, H., Kooren, J., Palstra, R.J., Klous, P., Grosveld, F., Galjart, N., and de Laat, W. (2006). CTCF mediates long-range chromatin looping and local histone modification in the beta-globin locus. Genes Dev 20, 2349-2354.
Stadhouders, R., Vidal, E., Serra, F., Di Stefano, B., Le Dily, F., Quilez, J., Gomez, A., Collombet, S., Berenguer, C., Cuartero, Y., et al. (2018). Transcription factors orchestrate dynamic interplay between genome topology and gene regulation during cell reprogramming. Nature Genetics 50, 238-249.
Stevens, T.J., Lando, D., Basu, S., Atkinson, L.P., Cao, Y., Lee, S.F., Leeb, M., Wohlfahrt, K.J., Boucher, W., O'Shaughnessy-Kirwan, A., et al. (2017). 3D structures of individual mammalian genomes studied by single-cell Hi-C. Nature 544, 59-64.
Strom, A.R., Emelyanov, A.V., Mir, M., Fyodorov, D.V., Darzacq, X., and Karpen, G.H. (2017). Phase separation drives heterochromatin domain formation. Nature 547, 241-245.
Stunnenberg, R., Kulasegaran-Shylini, R., Keller, C., Kirschmann, M.A., Gelman, L., and Buhler, M. (2015). H3K9 methylation extends across natural boundaries of heterochromatin in the absence of an HP1 protein. Embo J 34, 2789-2803.
Sundaram, V., and Wang, T. (2018). Transposable Element Mediated Innovation in Gene Regulatory Landscapes of Cells: Re-Visiting the "Gene-Battery" Model. Bioessays 40.
Tang, S.J. (2017). Potential Role of Phase Separation of Repetitive DNA in Chromosomal Organization. Genes (Basel) 8.
Taylor, M.S., LaCava, J., Mita, P., Molloy, K.R., Huang, C.R., Li, D., Adney, E.M., Jiang, H., Burns, K.H., Chait, B.T., et al. (2013). Affinity proteomics reveals human host factors implicated in discrete stages of LINE-1 retrotransposition. Cell 155, 1034-1048.
Tjong, H., Li, W.Y., Kalhor, R., Dai, C., Hao, S.L., Gong, K., Zhou, Y.G., Li, H.C., Zhou, X.J., Le Gros, M.A., et al. (2016). Population-based 3D genome structure analysis reveals driving forces in spatial genome organization. P Natl Acad Sci USA 113, E1663-E1672.
van de Werken, H.J.G., Haan, J.C., Feodorova, Y., Bijos, D., Weuts, A., Theunis, K., Holwerda, S.J.B., Meuleman, W., Pagie, L., Thanisch, K., et al. (2017). Small chromosomal regions position themselves autonomously according to their chromatin class. Genome Res 27, 922-933.
van Steensel, B., and Belmont, A.S. (2017). Lamina-Associated Domains: Links with Chromosome Architecture, Heterochromatin, and Gene Repression. Cell 169, 780-791.
Velazquez Camacho, O., Galan, C., Swist-Rosowska, K., Ching, R., Gamalinda, M., Karabiber, F., De La Rosa-Velazquez, I., Engist, B., Koschorz, B., Shukeir, N., et al. (2017). Major satellite repeat RNA stabilize heterochromatin retention of Suv39h enzymes by RNA-nucleosome association and RNA:DNA hybrid formation. Elife 6.
Wang, J., Xie, G., Singh, M., Ghanbarian, A.T., Rasko, T., Szvetnik, A., Cai, H., Besser, D., Prigione, A., Fuchs, N.V., et al. (2014). Primate-specific endogenous retrovirus-driven transcription defines naive-like stem cells. Nature 516, 405-409.
Wang, S., Su, J.H., Beliveau, B.J., Bintu, B., Moffitt, J.R., Wu, C.T., and Zhuang, X. (2016). Spatial organization of chromatin domains and compartments in single chromosomes. Science 353, 598-602.
Wijchers, P.J., Geeven, G., Eyres, M., Bergsma, A.J., Janssen, M., Verstegen, M., Zhu, Y., Schell, Y., Vermeulen, C., de Wit, E., et al. (2015). Characterization and dynamics of pericentromere-associated domains in mice. Genome Res 25, 958-969.
Wutz, G., Várnai, C., Nagasaka, K., Cisneros, D.A., Stocsits, R.R., Tang, W., Schoenfelder, S., Jessberger, G., Muhar, M., Hossain, M.J., et al. (2017). Topologically associating domains and chromatin loops depend on cohesin and are regulated by CTCF, WAPL, and PDS5 proteins. The EMBO journal 36, 3573-3599.
Xie, Y., Rosser, J.M., Thompson, T.L., Boeke, J.D., and An, W. (2010). Characterization of L1 retrotransposition with high-throughput dual-luciferase assays. Nucleic acids research 39, e16-e16.
Zane, L., Sharma, V., and Misteli, T. (2014). Common features of chromatin in aging and cancer: cause or coincidence? Trends Cell Biol 24, 686-694.
Zullo, J.M., Demarco, I.A., Pique-Regi, R., Gaffney, D.J., Epstein, C.B., Spooner, C.J., Luperchio, T.R., Bernstein, B.E., Pritchard, J.K., Reddy, K.L., et al. (2012). DNA sequence-dependent compartmentalization and silencing of chromatin at the nuclear lamina. Cell 149, 1474-1487.
Zwicker, D., Decker, M., Jaensch, S., Hyman, A.A., and Julicher, F. (2014). Centrosomes are autocatalytic droplets of pericentriolar material organized by centrioles. P Natl Acad Sci USA 111, E2636-E2645.
FIGURE LEGENDS
Figure 1. B1- and L1-rich genomic regions homotypically interact and correlate with A and B compartments.
(A) Heatmaps of the distribution densities of B1, L1, L2, and ERV1 repeats and random genomic regions (panel i), DNase I hypersensitive sites (DHS) and ChIP-seq signals of Pol II, H3K4me3, HP1α, and H3K9me3 (panel ii), ChIRP-seq signals of Malat1 and L1 RNA (panel iii), and RNA-seq (panel iv) in mESCs across two adjacent compartments (Cn, Cn+1). All signals in 696 compartments annotated in mESCs were sorted according to the B1 distributions shown in panel (i).
(B) Relative contents of various repeats across the A and B compartments annotated in various cell types in mouse (top) and human (bottom). Three repeat subclasses are shown (B1 or Alu, L1, and ERV1), together with random genomic regions as the negative control.
(C) Genome browser shots showing conserved domain structures as indicated by heatmaps of the Hi-C contact matrix over a syntenic region in mouse (top) and human (bottom) ESCs. The B1 and L1 repeat densities, the A/B compartments shown by eigenvalues of the Hi-C contact matrix, and Refseq gene annotations are shown underneath each heatmap.
(D) Heatmap of normalized interaction frequencies at 100-kb resolution on chromosome 17 in mESCs. Genomic densities of B1 and L1 repeats are shown at the left and bottom.
(E) A zoomed-in view of the interaction matrix of the genomic region from 18 Mb to 60 Mb on mouse chr17 (40-kb resolution). Genomic densities of B1 and L1 repeats, eigenvalues of the Hi-C matrix representing A/B compartments from mESCs, neural progenitor cells (NPC) and neurons, and Pol II ChIP-seq signals in mESCs are also shown underneath. B1-enriched regions are arbitrarily labeled as D, F, and H in uppercase. L1-enriched regions are labeled as c, e, g, and i in lowercase. Some strong homotypic interactions between compartments rich in the same repeat subfamily (for example, between the B1-rich regions DF, DH, and FH, and between the L1-rich regions ce and cg) are highlighted by dotted boxes.
(F) Correlation heatmap showing Pearson correlation coefficients of the interaction frequencies of any two paired regions in a sub-region on chr17 (500-kb resolution). B1-enriched regions are labeled in uppercase as F, H … R, T. L1-enriched regions are labeled in lowercase as e, g … q, u. Dotted boxes and arrows highlight positive correlations (in red) of the anchor region F (indicated by *) with other B1-enriched genomic regions (horizontal), and of the anchor region g (indicated by *) with other L1-enriched genomic regions (vertical).
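As a rough illustration of the correlation heatmap described in panel (F), the sketch below correlates the interaction profile of each genomic bin with that of every other bin. It is a minimal example under stated assumptions, not the authors' pipeline: the function names `pearson` and `correlation_matrix`, the toy 4-bin matrix, and the bin labels are all hypothetical, and any normalization applied to the real Hi-C data before correlation is not reproduced here.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined for a constant profile
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(contacts):
    """Correlate the interaction profile (row) of every bin with every other bin."""
    n = len(contacts)
    return [[pearson(contacts[i], contacts[j]) for j in range(n)] for i in range(n)]


# Hypothetical toy data: bins 0 and 2 stand in for B1-rich regions,
# bins 1 and 3 for L1-rich regions. Values are made up for illustration.
toy = [
    [10, 1, 8, 2],
    [1, 10, 2, 8],
    [8, 2, 10, 1],
    [2, 8, 1, 10],
]
corr = correlation_matrix(toy)
for row in corr:
    print(" ".join(f"{value:+.2f}" for value in row))
```

In this toy matrix, bins that share a label have similar interaction profiles and therefore correlate positively, while bins with different labels correlate negatively, which is the pattern the dotted boxes in the panel are meant to highlight.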
(G) Correlation heatmap showing Pearson correlation coefficients of inter-chromosomal interaction frequencies between chromosomes 17 and 19 at 500-kb resolution. B1-enriched regions on chr19 are arbitrarily labeled in uppercase as V, W, X, Y, and Z. L1-enriched regions on chr17 are labeled as shown in panels (E-F) from c, e … q, to u. The horizontal dotted box and arrows highlight positive correlations of an L1-enriched region on chr19 with other L1-enriched regions on chr17. The vertical dotted box and arrows show positive correlations of a B1-enriched region on chr17 with other B1-enriched regions on chr19.
Figure 2. DNA FISH reveals the spatial segregation of L1 and B1 compartments.
(A) Representative images of L1 (green) and B1 (red) repeats revealed by DNA FISH in mESCs. DNA is labeled by DAPI (blue). All scale bars, 5 μm.
(B) Normalized fluorescence intensity profiles along the white line in the merged B1/L1 image shown in panel (A). The L1 channel shows four intensity peaks that surround the two intensity peaks in the B1 channel. These two peaks in the B1 channel correspond to valleys in the L1 channel.
(C) Representative images of immuno-DNA FISH analysis of L1 (green) and B1 (red) DNA repeats, and NCL protein (purple) in mESCs (n=37), NSC (n=23), NIH3T3 (n=18), C2C12 (n=11) and HeLa cells (n=15).
Figure 3. Dynamic segregation of L1 and B1 compartments during the cell cycle and embryonic development.
(A) and (B) DNA FISH analysis of synchronized mESCs at different cell cycle stages. Representative images and the scatterplot of the segregation index of L1 and B1 DNA are shown in panels (A) and (B), respectively. Data are presented as mean ± standard deviation (SD). N, number of nuclei analyzed. M phase and G1 phase data are compared to S phase using the two-tailed Student’s t-test. p values are shown at the top.
(C) Boxplot analysis of the ratio of homotypic interactions versus heterotypic interactions for B1 and L1 based on single-cell Hi-C data from mESCs. The y-axis shows the segregation index (Hi-C) of L1 and B1, which represents the ratio of average interaction frequency between compartments containing similar repeats (B1.B1 and L1.L1) to that between compartments containing different repeats (B1.L1) for all chromosomes (except X and Y). Larger values indicate a higher degree of homotypic interaction between B1- or L1-enriched compartments. p values are calculated with the two-tailed Student’s t-test.
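The Hi-C segregation index defined in panel (C) can be sketched as a ratio of two means. The example below is a minimal illustration, not the authors' code. It assumes that B1.B1 and L1.L1 pairs are pooled before averaging, that self-contacts are excluded, and that each compartment carries a single "B1" or "L1" label. The function name `segregation_index` and the toy values are hypothetical, and the per-chromosome handling mentioned in the legend is not modeled.

```python
from itertools import combinations


def segregation_index(contacts, labels):
    """Mean homotypic contact frequency divided by mean heterotypic contact frequency.

    contacts: symmetric matrix (list of lists) of interaction frequencies
              between compartments.
    labels:   one label per compartment, e.g. "B1" or "L1".
    """
    homotypic, heterotypic = [], []
    # Visit each unordered pair of distinct compartments once.
    for i, j in combinations(range(len(labels)), 2):
        target = homotypic if labels[i] == labels[j] else heterotypic
        target.append(contacts[i][j])
    if not homotypic or not heterotypic:
        raise ValueError("need at least one homotypic and one heterotypic pair")
    mean_hetero = sum(heterotypic) / len(heterotypic)
    if mean_hetero == 0:
        raise ValueError("mean heterotypic contact frequency is zero")
    return (sum(homotypic) / len(homotypic)) / mean_hetero


# Hypothetical toy data: four compartments alternating between B1-rich and L1-rich.
toy_labels = ["B1", "L1", "B1", "L1"]
toy_contacts = [
    [10, 1, 8, 2],
    [1, 10, 2, 8],
    [8, 2, 10, 1],
    [2, 8, 1, 10],
]
print(segregation_index(toy_contacts, toy_labels))
```

A value above 1 means that compartments with the same repeat type contact each other more often, on average, than compartments with different repeat types. A uniform contact matrix gives exactly 1.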
(D) Heatmaps of normalized interaction frequencies at 500-kb resolution on chromosome 17 in mESCs at M (left) and G1 (middle) phase. A comparison of contact frequencies between M and G1 phase [log2(G1/M)] for the whole of chromosome 17 is shown on the right. Genomic densities of B1 and L1 repeats are shown at the bottom.
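The log2(G1/M) comparison in panel (D) is an element-wise log ratio of two contact matrices. The sketch below shows one way to compute it. The function name `log2_ratio` and the toy matrices are hypothetical, and the small pseudocount used to avoid division by zero is an assumption of this example; the legend does not say how empty bins were handled.

```python
import math


def log2_ratio(numerator, denominator, pseudocount=1e-9):
    """Element-wise log2(numerator / denominator) for two equally sized matrices.

    The pseudocount (an assumption of this sketch) keeps bins with zero
    contacts from causing a division by zero or log of zero.
    """
    return [
        [math.log2((a + pseudocount) / (b + pseudocount)) for a, b in zip(row_n, row_d)]
        for row_n, row_d in zip(numerator, denominator)
    ]


# Hypothetical normalized contact frequencies for two bins.
g1_phase = [[4.0, 2.0], [2.0, 4.0]]
m_phase = [[4.0, 4.0], [4.0, 4.0]]
diff = log2_ratio(g1_phase, m_phase)
for row in diff:
    print(" ".join(f"{value:+.2f}" for value in row))
```

Positive entries mark bin pairs that contact each other more in the numerator condition (here G1), and negative entries mark pairs with more contacts in the denominator condition (here M).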
(E) and (F) DNA FISH analysis of early embryos. Representative FISH images and the scatterplot of the segregation index of L1 and B1 DNA are shown in panels (E) and (F), respectively. Data are presented as mean ± SD (embryos were collected and processed in two independent experiments). N, number of nuclei analyzed. Each sample is compared to the blastocyst stage and p values are calculated with the two-tailed Student’s t-test. Dotted lines show the trend-line.
(G) Boxplot showing a gradual increase of homotypic versus heterotypic interactions between L1-rich and B1-rich regions based on Hi-C analysis of early embryos from zygotes (including PN3 and PN5 stages) to 8-cell embryos and pluripotent mESCs (in vitro equivalent of the inner cell mass cells of blastocysts). Each sample is compared to mESCs and p values are calculated with the two-tailed Student’s t-test.
(H) The upper panels show heatmaps of normalized interaction frequencies at 500-kb resolution on chromosome 17 in mouse embryos at the early 2-cell, late 2-cell and inner cell mass (ICM) stages. The lower panels show comparisons of the contact frequencies between Late 2-cell and Early 2-cell (left), and ICM and Early 2-cell (right). Genomic densities of B1 and L1 repeats are also shown.
Figure 4. Repeat RNA and transcription promote the spatial segregation of L1 and B1 compartments.
(A) Developmental analysis of embryos microinjected with scramble or L1 AMO (left) or with scramble or two different B1 ASOs (ASO-1 and ASO-2) (right) at the zygote stage. N, number of embryos analyzed.
(B) Embryos depleted of L1 RNA by AMO show poorer L1/B1 segregation, as indicated by significantly lower L1/B1 segregation indexes compared to the scramble AMO control and noninjected late 2-cell embryos. Each dot represents an embryo analyzed. Black and grey dots represent injected and noninjected embryos, respectively. Data were collected from three independent experiments. p values are calculated with the two-tailed Student’s t-test.
(C) and (D) DNA FISH analysis of L1 (green) and B1 (red) repeats in mESCs treated with transcription inhibitors for 3 hours. Representative images and the scatterplot of the segregation index of L1 and B1 DNA are shown in (C) and (D), respectively. DMSO: mock control; DRB (100 μM): inhibits the release and elongation of Pol II; ActD (1 μg/ml): inhibits both Pol I and II. Both drug treatments disrupt the perinucleolar staining of L1 DNA and induce mixing of the L1 and B1 DNA signals. Treatment with ActD elicits a stronger mixing effect than DRB.
(E) and (F) DNA FISH analysis of L1 (green) and B1 (red) repeats in mESCs transfected with scramble AMO for 36 hours, or L1 AMO for 12 hours and 36 hours, or treated with the drug AZT for 12 hours. Representative images and the scatterplot of the segregation index of L1 and B1 DNA are shown in (E) and (F), respectively.
All scale bars, 5 μm. N, number of nuclei analyzed. Data are presented as mean ± SD (two independent experiments), and p values are calculated with the two-tailed Student’s t-test.
Figure 5. L1 RNA is required for the formation and maintenance of higher-order chromatin structure.
(A-B) Hi-C analysis of mESCs treated with scramble control (SCR) and L1 AMOs for 36 hours. Two independent experiments of AMO transfection followed by Hi-C were performed. Panel A shows Hi-C heatmaps of contact frequencies for SCR and L1 AMO (panel i) and the comparison of contact frequencies (panel ii) between scramble and L1 AMOs [log2(SCR/L1)]. The whole of chromosome 17 (chr17) is shown on the left (at 500-kb resolution) and the boxed sub-region (18-60 Mb) of chr17 is shown enlarged on the right. To better orient the visualization and comparison of these three sub-region heatmaps, representative homotypic interactions (ce, DF, and eg) are labeled with black boxes in the right panels. Relevant B1- and L1-enriched compartments are labeled as in Figure 1. Panel B shows a boxplot comparing the Hi-C-based segregation indexes. L1 AMO led to decreased ratios of homotypic interaction versus heterotypic interaction between L1-rich and B1-rich regions compared to SCR AMO. The p value was calculated by the two-tailed Student’s t-test.
(C) Schematic representation of compartmental interactions before and after depletion of L1 RNA based on the Hi-C data shown in panel (A). The control mESCs show strong homotypic interactions (indicated by blue solid lines), whereas L1-depleted cells show weakened homotypic interactions (dotted blue lines) and abnormal increases of heterotypic interactions (indicated by brown solid lines).
Figure 6. L1 RNA is required for spatial segregation of L1 and B1 compartments.
Oligopaint DNA FISH analysis of four representative loci before and after depletion of L1 RNA for 36 hours in mESCs. (A) Scheme of relative chromosomal localizations and spatial distances of the four compartments labeled by Oligopaint DNA FISH probes. B1-enriched domain: F. L1-enriched domains: e, g, and q. Representative images and quantification of the spatial distances between two labeled compartments are shown in panels (B) and (C), respectively. Each dot in panel (C) represents a nucleus. In cells treated with L1 AMO, the median distance between the L1 compartment e and the B1 compartment F is significantly decreased compared to the control, while the median distance between the two L1 compartments g and q is significantly increased. Data are presented as mean ± SD. N, number of nuclei analyzed. p values are calculated by the two-tailed Student’s t-test. Scale bars, 5 μm.
Figure 7. L1 RNA binds to its own DNA and promotes the phase separation of HP1α.
(A) Genome browser view of mouse chr17: 18 - 60 Mb (left) and chr19: 3 - 12 Mb (right). The first five rows show the genomic density of B1 repeats, B1-related sequencing data (ChIRP-seq signals of Malat1, ChIP-seq signals of Pol II, and H3K27ac), and RNA-seq in mESCs. The lower tracks show the genomic density of L1 repeats (highlighted by beige shading) and L1-related sequencing data, including ChIRP-seq signals of L1 RNA, ChIP-seq signals of HP1α and H3K9me3. Refseq gene annotations are also included.
(B) Biotinylated L1 RNA pulls down recombinant HP1α protein. Reactions without addition of L1 RNA served as the mock control.
(C) Representative images of droplet formation at different concentrations of HP1α protein and L1 RNA. Concentrations of HP1α and RNA are indicated at the top and left of the images, respectively. Scale bars, 10 μm. Data are representative of 3 independent experiments.
(D) The model. First, the intrinsic self-assembly property of L1 and B1/Alu repeats provides numerous nucleation points to seed the formation of nuclear subdomains. Repetitive DNA sequences also serve as anchor sites for transcription machinery, regulatory proteins and RNAs. Second, the embedded structural information in DNA repeats may be translated by their RNA transcripts, together with interacting DNA and/or RNA-binding proteins. Molecular crowding generated by interactions among DNA sequences, RNAs, and proteins in subnuclear domains seeded by individual clusters of L1 or B1/Alu may drive the aggregation of compartments containing the same repeat type through a phase-separation mechanism, which consequently folds the genome. Third, the nuclear segregation of L1-enriched B compartments and B1/Alu-enriched A compartments may be further reinforced by attaching their DNA sequences to subnuclear structures such as the nucleolus and nuclear speckles, respectively, which serve as scaffolds to stabilize the nuclear architecture.
[Figure 1 image. Recoverable labels: mESC intra-chromosome interaction, chr17 Log2 (interaction matrix), 0-95 Mb; zoomed region chr17:18 - 60 Mb with B1, L1, A/B (mESC, NPC, Neuron), and Pol II (mESC) tracks; mESC intra-chromosome correlation (chr17:35 - 95 Mb); mESC inter-chromosome correlation, Chr17 (0 - 95 Mb) versus Chr19 (0 - 61 Mb); relative content of B1/Alu, L1, ERV1, and random regions in A and B compartments for mouse (mESC, NPC, Neuron) and human (hESC, K562, GM12878); heatmaps of 696 compartments in mESC covering genomic distribution, chromatin state, ChIRP, and RNA (abundance, low to high); source label (Fraser et al., 2015).]
[Figure 2 image. Recoverable labels: DNA FISH and IF channels B1 (Alu), L1, NCL, DAPI, and merged images (L1/B1, L1/NCL, B1/L1/NCL) for mESC, NSC, NIH3T3, C2C12, and HeLa cells; line profile of fluorescent intensity (A.U., 0.0 to 1.0) versus distance (0 to 20 μm) for B1 and L1.]
[Figure 3 image. Recoverable labels: B1, L1, and merged FISH images at G1 phase, S phase, prophase, and anaphase; segregation index (FISH) of L1 and B1 with p values versus S phase; single-cell Hi-C analysis of the cell cycle (mESC) with segregation index (Hi-C) of L1 and B1; Log2 (interaction matrix) of chr17 at M phase and G1 phase with log2(G1/M); FISH images for zygote, early 2-cell, late 2-cell, 4-cell, morula, and blastocyst stages; segregation index (FISH, p values versus blastocyst) and segregation index (Hi-C, p values versus mESC); interaction matrices for early 2-cell, late 2-cell, and ICM with Log2 (Late 2-cell / Early 2-cell) and Log2 (ICM / Early 2-cell); source labels (Nagano et al., 2017) and (Du et al., 2017).]
[Figure 4 image. Recoverable labels: developmental progression (%) of injected embryos (2-cell stage versus embryos beyond 2-cell stage); segregation index (FISH) of L1 and B1; B1, L1, DAPI, and merged FISH images for DMSO, DRB, and ActD treatments and for SCR AMO, L1 AMO (12 h and 36 h), and AZT.]
[Figure 5 image. Recoverable labels: (i) Log2 (interaction matrix) of chr17 for (a) SCR AMO and (b) L1 AMO; (ii) Log2 (SCR/L1) of chr17; zoom-in views of chr17:18 - 60 Mb with B1, L1, and A/B tracks and labeled interactions ce, DF, and eg; segregation index (Hi-C) of L1 and B1 for SCR AMO and L1 AMO replicates #1 and #2; schematic annotations (L1-L1 strong, B1-B1 strong, B1-L1 absent; with L1 AMO: L1-L1 weak, B1-B1 weak, B1-L1 increased).]
[Figure 6 image. Recoverable labels: Oligopaint FISH scheme of loci e, F, g, and q on chr17:35 - 95 Mb with B1 and L1 tracks; images for SCR AMO (36 h) and L1 AMO (36 h) with (i) e in green and F in red and (ii) g in green and q in red; distance of two closest alleles (μm) for (i) e and F and (ii) g and q.]
[Figure 7 image. Recoverable labels: genome browser tracks for chr17:18 - 60 Mb and chr19:3 - 12 Mb (B1 DNA, Malat1 ChIRP-seq, Pol II, H3K27ac, RNA-seq, L1 DNA, L1 ChIRP-seq, HP1α, H3K9me3, Refseq genes); anti-HP1α western blot of input and NaCl (M) and urea elutions for mock and biotinylated L1 RNA; droplet assay grid of HP1α (μM) versus L1 RNA (ng/μL); model schematic of the nucleus showing L1-enriched compartments, B1-enriched compartments, nucleolus, nuclear speckle, lamina, LAD, and NAD, with a key for DNA (L1, SINE B1/Alu), RNA (L1, others such as Malat1), proteins (nucleolin, HP1α, Pol II), and biophysical forces (self-aggregation, phase separation).]