Upload
others
View
0
Download
0
Embed Size (px)
Citation preview
1
1 Drosophila Histone Locus Body assembly and function involves 2
multiple interactions 3 4 5
Running title: Drosophila HLB assembly and function 6 7 8 9 10
Kaitlin P. Koreski1,, Leila E. Rieder2, Lyndsey M. McLain3, 11 William F. Marzluff,1,3,4-6* and Robert J. Duronio1,3,4,6,7* 12
13 14
15
16
Affiliations 17
1 Curriculum in Genetics and Molecular Biology, University of North Carolina, Chapel Hill, NC 18
27599, USA. 19
2 Department of Biology, Emory University, Atlanta, GA 30322, USA 20
3 Department of Biology, University of North Carolina, Chapel Hill, NC 27599, USA. 21
4 Integrative Program for Biological and Genome Sciences, University of North Carolina, Chapel 22
Hill, NC 27599, USA. 23
5 Department of Biochemistry and Biophysics, University of North Carolina, Chapel Hill, NC 24
27599, USA. 25
6 Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, NC 27599, 26
USA. 27
7 Department of Genetics, University of North Carolina, Chapel Hill, NC 27599, USA. 28
* Correspondence: [email protected], [email protected] 29
.CC-BY-NC-ND 4.0 International licenseavailable under a(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made
The copyright holder for this preprintthis version posted March 18, 2020. ; https://doi.org/10.1101/2020.03.16.994483doi: bioRxiv preprint
2
Abstract 30
The histone locus body (HLB) assembles at replication-dependent (RD) histone loci and 31
concentrates factors required for RD histone mRNA biosynthesis. The D. melanogaster genome 32
has a single locus comprised of ~100 copies of a tandemly arrayed repeat unit containing one 33
copy of each of the 5 RD histone genes. To determine sequence elements required for D. 34
melanogaster HLB formation and histone gene expression, we used transgenic gene arrays 35
containing 12 copies of the histone repeat unit that functionally complement loss of the ~200 36
endogenous RD histone genes. A 12x histone gene array in which all H3-H4 promoters were 37
replaced with H2a-H2b promoters does not form an HLB or express high levels of RD histone 38
mRNA in the presence of the endogenous histone genes. In contrast, this same transgenic array is 39
active in HLB assembly and RD histone gene expression in the absence of the endogenous RD 40
histone genes and rescues the lethality caused by homozygous deletion of the RD histone locus. 41
The HLB formed in the absence of endogenous RD histone genes on the mutant 12x array 42
contains all known factors present in the wild type HLB including CLAMP, which normally 43
binds to GAGA repeats in the H3-H4 promoter. These data suggest that multiple protein-protein 44
and/or protein-DNA interactions contribute to HLB formation, and that the large number of 45
endogenous RD histone gene copies sequester available factor(s) from attenuated transgenic 46
arrays, thereby preventing HLB formation and gene expression. 47
48
.CC-BY-NC-ND 4.0 International licenseavailable under a(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made
The copyright holder for this preprintthis version posted March 18, 2020. ; https://doi.org/10.1101/2020.03.16.994483doi: bioRxiv preprint
3
Introduction 49
An important organizing principle in cells is the use of membraneless compartments to 50
spatially and temporally regulate diverse biological processes (Mitrea and Kriwacki, 2016). 51
Numerous membraneless compartments have been identified in both the nucleus (e.g. nucleoli, 52
Cajal bodies, histone locus bodies) and the cytoplasm (e.g. P-bodies, stress granules, germ 53
granules) and are collectively referred to as biomolecular condensates (Banani et al., 2017). 54
There is increasing evidence suggesting that membraneless compartments are formed through 55
liquid-liquid phase separation or condensation (Alberti et al., 2019). This occurs when proteins 56
and/or nucleic acids in the nucleoplasm or cytoplasm coalesce or demix into a condensed phase 57
that often resembles liquid droplets. Large nuclear condensates that are visible under light 58
microscopy are most often referred to as nuclear bodies (NBs), and represent an important 59
organizing feature of the nucleus. 60
The histone locus body (HLB) is a conserved NB that assembles at replication-dependent 61
(RD) histone genes and concentrates factors required for RD histone mRNA biogenesis (Duronio 62
and Marzluff, 2017). RD histone mRNAs are the only eukaryotic mRNAs that are not 63
polyadenylated (Marzluff and Koreski, 2017). The unique stem loop at the 3’end of RD histone 64
mRNAs results from a processing reaction requiring a specialized suite of factors, some of which 65
are constitutively localized in the HLB (Duronio and Marzluff, 2017). The HLB provides a 66
powerful system to study how NBs form and function because it contains a well characterized set 67
of factors involved in producing a unique class of cell-cycle regulated mRNAs. We previously 68
demonstrated that concentrating factors (e.g. FLASH and U7 snRNP) in the Drosophila 69
melanogaster HLB is critical for efficient histone pre-mRNA processing (Wagner et al., 2007; 70
.CC-BY-NC-ND 4.0 International licenseavailable under a(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made
The copyright holder for this preprintthis version posted March 18, 2020. ; https://doi.org/10.1101/2020.03.16.994483doi: bioRxiv preprint
4
Tatomer et al., 2016b). However, a full understanding of how the HLB participates in histone 71
mRNA biosynthesis requires knowledge of HLB assembly at the molecular level. 72
Prior studies of NBs have provided several important assembly concepts that are 73
applicable to the HLB. Many NB components have an intrinsic ability to self-associate, an 74
observation leading to two different models of NB assembly: (1) interactions among NB 75
components occur stochastically, wherein individual factors can be recruited to the body in any 76
order, or (2) components assemble in an ordered or hierarchical pathway, wherein the 77
recruitment of components is predicated on prior recruitment of others (Dundr and Misteli, 2010; 78
Rajendra et al., 2010). The HLB appears to employ a hybrid version of these two possibilities. 79
For example, genetic loss of function experiments suggest a partially ordered assembly pathway 80
of the Drosophila HLB with some components being required for subsequent recruitment of 81
others (White et al., 2011; Terzo et al., 2015; Tatomer et al., 2016a). The scaffolding protein 82
Mxc (Multi sex combs), the Drosophila ortholog of human NPAT (Nuclear Protein, Ataxia-83
Telangiectasia Locus), and FLASH (FLICE-Associated Huge protein) likely form the core of the 84
HLB and are required for subsequent recruitment of other factors (White et al., 2011). Tethering 85
experiments indicate that ectopic HLB formation also may be induced by several different HLB 86
components, supporting a stochastic model of assembly (Shevtsov and Dundr, 2011). 87
The initiation event in self-organizing NB assembly is the key step in the process and is 88
not well understood. A prevalent model postulates a “seeding” event that initiates the nucleation 89
of critical components that form a platform for further recruitment of other components (Dundr, 90
2011; Shevtsov and Dundr, 2011; Altmeyer et al., 2015; Falahati et al., 2016; Stanek and Fox, 91
2017; Gomes and Shorter, 2019). In some instances, RNA is thought to help seed NB assembly, 92
and NBs such as the nucleolus and ectopic paraspeckles can form at sites of specific transcription 93
.CC-BY-NC-ND 4.0 International licenseavailable under a(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made
The copyright holder for this preprintthis version posted March 18, 2020. ; https://doi.org/10.1101/2020.03.16.994483doi: bioRxiv preprint
5
(Matera et al., 2009; Mao et al., 2011; Falahati et al., 2016). Blocking transcription prevents 94
complete HLB assembly in both zebrafish and flies (Salzler et al., 2013; Heyn et al., 2017; Hur, 95
2019). However, the HLB is present at RD histone genes even in G1 when the genes are not 96
active, raising the possibility that histone genes themselves participate in seeding HLB assembly 97
(Zhao et al., 1998; Liu et al., 2006). 98
In Drosophila, the RD histone genes are present at a single locus with ~100 copies of a 99
tandemly arrayed 5kb repeat unit, each of which contains one copy of the divergently transcribed 100
H2a-H2b and H3-H4 gene pairs as well as the gene for linker histone H1 (Lifton et al., 1978; 101
McKay et al., 2015; Bongartz and Schloissnig, 2019). Using transgenes containing a wild type or 102
mutant derivative of a single histone repeat, we previously demonstrated that the bidirectional 103
H3-H4 promoter stimulated HLB assembly and transcription of the single histone repeat in 104
salivary glands (Salzler et al., 2013). We subsequently showed that the conserved GAGA repeat 105
elements present in the H3-H4 promoter region are targeted by the zinc-finger transcription 106
factor CLAMP (Chromatin-Linked Adaptor for MSL Proteins), and that this interaction 107
promotes HLB assembly (Rieder et al., 2017). Thus, the H3-H4 promoter region might act to 108
seed HLB assembly. 109
In this work, we leveraged transgenic histone gene arrays to test whether the H3-H4 110
promoter region is necessary for in vivo function of the RD histone locus. We found that 111
replacement of H3-H4 promoters with H2a-H2b promoters results in an attenuated transgenic 112
histone gene array that does not function in the presence of the intact endogenous RD histone 113
locus, but surprisingly provides full in vivo function, including normal HLB assembly and 114
histone gene expression, when the endogenous RD histone locus is absent. These results suggest 115
.CC-BY-NC-ND 4.0 International licenseavailable under a(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made
The copyright holder for this preprintthis version posted March 18, 2020. ; https://doi.org/10.1101/2020.03.16.994483doi: bioRxiv preprint
6
that multiple elements in the histone genes and core HLB proteins are involved in HLB 116
assembly. 117
118 Results 119
To study histone gene regulation and the role of histone post-translational modifications 120
in chromatin we previously developed an experimental system in which BAC-based, transgenic 121
histone gene arrays containing 12 copies of the 5kb histone repeat unit assemble HLBs and 122
functionally complement homozygous loss of the ~100 gene copies at the endogenous RD 123
histone locus, HisC (McKay et al., 2015). To study histone gene expression, we created a 124
12xHWT (Histone Wild Type) transgenic construct containing a synonymous polymorphism in the 125
H2a gene (i.e. mutation of a XhoI site) that allows us to distinguish transgenic H2a gene 126
expression from endogenous H2a gene expression (McKay et al., 2015). Here, we extended this 127
design and created a “Designer Wild Type” (DWT) 5kb histone repeat unit that has all five 128
histone genes marked in a similar manner (Fig. 1A, Fig. S1A). We also introduced restriction 129
enzyme sites around the mRNA 3’ end processing signals, the stem loop and histone downstream 130
element (Marzluff and Koreski, 2017), to allow us to readily manipulate those sequences (Fig. 131
S1). We used this DWT repeat to create a 12xDWT construct, and found that a transgenic version 132
of this array behaved identically to the original 12xHWT array. It rescued the lethality caused by 133
homozygous deletion of HisC, resulting in viable, fertile adult flies. In addition, we can maintain 134
a stock of flies lacking the endogenous RD histone locus and containing a homozygous 12xDWT 135
transgene. 136
137
The H3-H4 promoter stimulates HLB formation 138
.CC-BY-NC-ND 4.0 International licenseavailable under a(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made
The copyright holder for this preprintthis version posted March 18, 2020. ; https://doi.org/10.1101/2020.03.16.994483doi: bioRxiv preprint
7
To test whether the H3-H4 promoter region is necessary for histone locus function in vivo, we 139
engineered a derivative of the 12xDWT array in which all H3-H4 promoters were replaced with 140
H2a-H2b promoters. In this “Promoter Replacement” (12xPR) construct, we replaced the entire 141
298nt sequence between the initiation codons of the divergently transcribed H3 and H4 genes 142
with the 226nt sequence between the initiation codons of the divergently transcribed H2a and 143
H2b genes, leaving the H1 and H2a and H2b genes intact (Fig. 1A, Fig. S1B). One difference 144
between the two promoters is that the H2a-H2b region lacks the CLAMP-binding GAGA repeats 145
present in the H3-H4 promoter (Rieder et al., 2017). Otherwise it is an intact, native bidirectional 146
histone gene promoter. Consequently, the 12xPR construct should retain the ability to initiate 147
transcription from all RD histone genes. The 72nt size difference between the promoters 148
provides a way to unambiguously distinguish between the 12xPR and 12xDWT array genotypes 149
using PCR primers in the H3 and H4 coding regions in flies heterozygous (Fig. 1B) or 150
homozygous (Fig. 1C) for the HisC deletion. 151
We first assessed whether the 12xPR and 12xDWT transgenes could support HLB formation 152
in the presence of the endogenous histone genes in polytene chromosome spreads from 3rd instar 153
larval salivary glands. In these polyploid cells, the genome is amplified greater than 1000-fold 154
and sister chromatids line up in register, resulting in large chromosomes providing high-155
resolution cytology. We visualized the HLB by immunofluorescence using antibodies that 156
recognize one of several HLB components. These components include Multi sex combs (Mxc), 157
the Drosophila ortholog of human NPAT, which is necessary for HLB assembly and histone 158
gene expression (White et al., 2011; Terzo et al., 2015); histone pre-mRNA processing factors 159
FLASH (Yang et al., 2009) and Lsm11, a component of the U7 snRNP (Godfrey et al., 2009; 160
Burch et al., 2011); and Muscle wasted (Mute), an ortholog of human YARP (YY1AP-related 161
.CC-BY-NC-ND 4.0 International licenseavailable under a(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made
The copyright holder for this preprintthis version posted March 18, 2020. ; https://doi.org/10.1101/2020.03.16.994483doi: bioRxiv preprint
8
protein 1) that is a putative repressor of histone gene expression (Bulchand et al., 2010). In these 162
preparations, we also visualized both the endogenous histone locus and the ectopic, transgenic 163
histone genes by FISH using a probe derived from the entire histone repeat unit. We observed 164
HLB formation at the control 12xDWT transgenic locus but not at the 12xPR transgenic locus (Fig. 165
1D). These data are consistent with our previous results that sequences within the H3-H4 166
promoter are necessary for HLB assembly in the presence of the endogenous histone genes 167
(Salzler et al., 2013). 168
Both single copy transgenic histone gene repeats that fail to form an HLB (Salzler et al., 169
2013) and Mxc mutants that don’t form an HLB (Terzo et al., 2015) result in reduced histone 170
mRNA levels, suggesting that the HLB is necessary for efficient histone mRNA biosynthesis. 171
We therefore tested whether the 12xPR transgene could support histone gene expression in the 172
absence of HLB assembly. To determine expression from the ectopic 12xDWT or 12xPR genes in 173
the presence of the endogenous RD histone genes, we randomly prime cDNA from total RNA 174
and then amplify a fragment of each coding region containing the restriction enzyme site that is 175
changed in the DWT construct. By digesting the PCR fragment with the appropriate restriction 176
enzyme and resolving the fragments by gel electrophoresis we can determine the relative level of 177
expression from the transgenic and endogenous RD histone genes. For example, RT-PCR of the 178
H2a gene results in the same size amplification product from both the transgenic and endogenous 179
genes, but only the product from the endogenous histone genes is sensitive to digestion with 180
XhoI (Fig. 1E). When assayed in the presence of the endogenous genes, we found that H2a was 181
expressed from the 12xDWT at a higher level than from the 12xPR transgene, even though identical 182
promoters drove H2a in each transgenic array (Fig. 1E, Fig. 2A, lanes 2 and 4; Fig. 2B, lanes 2 183
and 3). These data are consistent with our previous observations using transgenes with a single 184
.CC-BY-NC-ND 4.0 International licenseavailable under a(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made
The copyright holder for this preprintthis version posted March 18, 2020. ; https://doi.org/10.1101/2020.03.16.994483doi: bioRxiv preprint
9
histone gene repeat unit (Salzler et al., 2013; Terzo et al., 2015). Together these results 185
demonstrate that the H3-H4 promoter is required both for HLB formation and histone gene 186
expression in the presence of the endogenous RD histone genes. 187
188
A histone gene array lacking the H3-H4 promoter forms HLBs and is expressed in the 189
absence of the endogenous genes 190
To assess the biological activity of the 12xPR transgene, we determined the developmental 191
outcome of having a 12xPR transgene as the only zygotic source of histone mRNA. Due to large 192
stores of maternal histone protein and mRNAs, embryos homozygous for a deletion that removes 193
the endogenous RD histone gene array develop normally through S phase of cell cycle 14, but 194
require zygotic RD histone gene expression for progression through S phase of cycle 15 and 195
beyond (Gunesdogan et al., 2010, 2014). Consequently, embryos lacking RD histone genes 196
arrest at nuclear cycle 15 and do not hatch. As noted above, this embryonic lethality is rescued 197
by a single 12xDWT transgene, which supports development of histone deletion progeny into 198
viable, fertile, adult flies. Surprisingly, we found that embryos lacking endogenous histone genes 199
and containing the 12xPR transgene hatched and developed into nearly the expected number of 200
fertile adults without any overt developmental delays, similar to embryos lacking endogenous 201
histone genes and rescued by the 12xDWT transgene. These data suggest that the 12xPR transgene 202
provides normal amounts of histone gene function when the endogenous RD histone genes are 203
absent. 204
We interrogated this unexpected result more thoroughly by analyzing 1) histone gene 205
expression, 2) HLB formation, and 3) histone pre-mRNA processing in both 5-7 hour old 206
embryos and 3rd instar larvae lacking endogenous histone genes and containing either the 12xDWT 207
.CC-BY-NC-ND 4.0 International licenseavailable under a(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made
The copyright holder for this preprintthis version posted March 18, 2020. ; https://doi.org/10.1101/2020.03.16.994483doi: bioRxiv preprint
10
or 12xPR transgene. In both genotypes, all histone mRNA in either 5-7 hr embryos (Fig. 2A, 208
lanes 3 and 5) or 3rd instar larvae (Fig. 2B, lanes 4 and 5) was derived only from the ectopic 209
histone gene array, indicating that the maternal histone mRNA stores had been degraded 210
normally. We observed robust histone gene expression from the 12xPR transgene when the 211
endogenous histone genes are absent, in stark contrast with the low level of expression from the 212
12xPR transgene when endogenous histone genes are present (Fig. 2A, lane 4; Fig. 2B lane 3). 213
High levels of histone gene expression are strongly correlated with the ability to form an 214
HLB (Salzler et al., 2013; Terzo et al., 2015; Rieder et al., 2017). Given that the 12xPR can 215
support histone gene expression in the absence of the endogenous genes, we assayed for HLB 216
formation in polytene chromosome spreads from 3rd instar larval salivary glands and in embryos. 217
We detected robust HLB formation at the 12xPR transgenic locus with antibodies against FLASH 218
and Lsm11 (Fig. 2C), or Mxc and Mute (Fig. 2D), similar to that observed at the 12xDWT locus. 219
Thus, the 12xPR transgene, which lacks H3-H4 promoter sequences, can support HLB formation 220
and histone gene expression in the absence of endogenous RD histone genes. 221
222
Histone mRNA from the 12xPR transgenic locus is properly processed 223
An important function of the HLB is concentrating factors to promote efficient histone pre-224
mRNA processing (Tatomer et al., 2016b). Therefore, we reasoned it was possible that the 225
attenuated 12xPR gene array might affect other aspects of histone mRNA biosynthesis, including 226
pre-mRNA processing. All Drosophila RD histone genes contain cryptic polyadenylation signals 227
downstream of the normal processing sites that are only used when the histone pre-mRNA 228
processing reaction is compromised, resulting in the production and accumulation of poly(A)+ 229
histone mRNA (Lanzotti et al., 2002; Godfrey et al., 2009; Tatomer et al., 2016b). We examined 230
.CC-BY-NC-ND 4.0 International licenseavailable under a(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made
The copyright holder for this preprintthis version posted March 18, 2020. ; https://doi.org/10.1101/2020.03.16.994483doi: bioRxiv preprint
11
pre-mRNA processing efficiency in flies rescued by the 12xPR transgene. Our PCR based assays 231
to detect histone mRNA expression cannot differentiate between properly processed and 232
misprocessed histone mRNAs. We therefore used northern blotting with a probe against the H3 233
coding region to determine whether histone pre-mRNA was efficiently processed. In contrast to a 234
FLASH mutant, which expresses large amounts of poly(A)+ histone mRNA, we did not detect 235
poly(A)+ histone mRNA from early embryos (Fig. 2E) or from whole 3rd instar larvae (Fig. 2F) 236
in HisC deletion animals rescued by the 12xPR transgene. Note also that the levels of histone 237
mRNA from the 12xPR transgenes are similar to wild type levels. These results indicate that the 238
HLB formed on the 12xPR array in the absence of endogenous histone genes supports efficient 239
histone pre-mRNA processing. 240
241
HLB assembly at the 12xPR transgenic locus occurs at the normal time during embryogenesis 242
The above data suggest that the 12xPR represents an attenuated histone gene array that 243
provides normal biological function when not in competition with the wild type endogenous 244
histone genes. To further explore this model, we determined if an HLB assembles on the 12xPR 245
array in the early embryo at the same time that it assembles on the 12xDWT array. The HLB 246
begins assembling in syncytial blastoderm embryos just prior to the onset of zygotic histone 247
transcription (White et al., 2007; White et al., 2011). We previously reported that a “proto-HLB” 248
consisting of Mxc and FLASH forms in nuclear cycle 10, followed by the onset of zygotic 249
histone gene expression and further recruitment of additional HLB components (Mute and 250
Lsm11) in cycle 11 (White et al., 2011; Salzler et al., 2013). To determine if the HLB forms at 251
the 12xPR transgenic locus with normal timing, we stained syncytial blastoderm embryos lacking 252
endogenous histone genes and rescued by a 12xPR transgene with antibodies against Mxc (Fig. 253
.CC-BY-NC-ND 4.0 International licenseavailable under a(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made
The copyright holder for this preprintthis version posted March 18, 2020. ; https://doi.org/10.1101/2020.03.16.994483doi: bioRxiv preprint
12
2G). In these experiments, HLB assembly during the syncytial blastoderm cycles was 254
indistinguishable from that of histone deletion embryos rescued by the control 12xDWT transgene. 255
Thus, in the absence of the endogenous genes, an HLB assembles on the 12xPR array at the same 256
time in early development as it does on a wild type array. 257
258
CLAMP is present in the 12xPR HLB in the absence of the endogenous RD histone genes 259
The H3-H4 promoter is highly conserved among Drosophilids and contains conserved GAGA 260
repeats (Salzler et al., 2013; Rieder et al., 2017), which we previously showed are essential for 261
HLB formation and expression of RD histone genes in the presence of the endogenous histone 262
locus (Rieder et al., 2017). Although in vitro these repeats can bind both the zinc-finger GA-263
repeat binding protein CLAMP, and the major Drosophila GAGA repeat binding protein GAF 264
(GAGA factor; trithorax-like) (Gilmour et al., 1989) only CLAMP is bound to the histone locus 265
in wild-type animals (Rieder et al., 2017). The H2a-H2b promoter and the rest of the histone 266
repeat unit do not contain any GAGA repeats longer than 4 nts, and CLAMP preferentially binds 267
to longer GAGA repeats (Kuzu et al., 2016). We asked whether CLAMP is recruited to the 12xPR 268
transgene in salivary gland polytene chromosomes. As with all other HLB components we 269
tested, CLAMP is not recruited to the 12xPR transgene or a 12x histone gene array in which the 270
GAGA sequences are replaced with lacO binding sites (GAM, Fig. 3A) when the endogenous RD 271
histone genes are present (Rieder et al., 2017). Surprisingly, we found that in the absence of 272
endogenous RD histone genes, CLAMP (Fig. 3C), but not GAF (Fig. 3E), is recruited to the 273
12xPR transgenic locus with similar intensity to CLAMP recruitment to the 12xDWT transgenic 274
locus (Fig. 3B, D). Furthermore, in this genotype the 12x GAM array supports H2a gene 275
expression in the absence of the endogenous genes (Fig. 3F). These data indicate that CLAMP 276
.CC-BY-NC-ND 4.0 International licenseavailable under a(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made
The copyright holder for this preprintthis version posted March 18, 2020. ; https://doi.org/10.1101/2020.03.16.994483doi: bioRxiv preprint
13
can be recruited to a histone gene array lacking GAGA repeats when the preferred GAGA 277
binding sites within the H3-H4 promoter at the endogenous RD histone locus are absent. The H3 278
and H4 genes are transcribed, indicating that CLAMP is not an essential DNA-binding 279
transcription factor for the H3 and H4 genes, but likely serves as a factor that alters chromatin 280
structure (Rieder et al, 2017). 281
We carried out ChIP-qPCR experiments on embryos containing only the 12xDWT array or 282
the 12xPR array to determine whether CLAMP is interacting with either the H2a-H2b or H3-H4 283
genes. In agreement with ChIP-seq results on the endogenous genes, which showed that CLAMP 284
is localized precisely to the H3-H4 promoter (Rieder et al., 2017), CLAMP was bound to the H3-285
H4 promoter but not to the H2a-H2b genes or promoter on the 12xDWT array (Fig. 3G). In 286
contrast, CLAMP was not bound to either gene pair on the 12xPR array (Fig. 3G), despite our 287
observation that CLAMP is present in the HLB at the 12xPR array (Fig. 3C). These data suggest 288
that CLAMP can be recruited to RD histone genes by protein-protein interactions independently 289
of its binding to DNA. 290
291
A wild type array can activate the 12xPR array when present in trans at the homologous locus 292
The results above demonstrate that the 12xPR transgenic locus does not form an HLB in 293
the presence of the endogenous histone genes unlike the wild type 12xDWT transgene, which 294
does. We tested whether juxtaposing a wild type histone gene array near the 12xPR locus would 295
nucleate a functional HLB and activate expression from the 12xPR transgene. Our BAC based 296
transgenes are inserted into the genome via site-specific recombination at the same chromosomal 297
location. We created flies in which the 12xPR was placed at position VK33 on chromosome 3 in 298
trans to the 12xHWT transgene used in our initial studies (McKay et al., 2015), and analyzed these 299
.CC-BY-NC-ND 4.0 International licenseavailable under a(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made
The copyright holder for this preprintthis version posted March 18, 2020. ; https://doi.org/10.1101/2020.03.16.994483doi: bioRxiv preprint
14
transgenes in the presence of the endogenous RD histone genes on chromosome 2. We examined 300
HLB formation at the ectopic VK33 location by staining whole mount salivary glands with 301
antibodies against Mxc and FLASH. Using this approach, we detect only a single, large HLB on 302
the endogenous RD histone genes in wild type nuclei (Fig. 4A-1, 4B). When 12xPR was the only 303
transgene present in addition to the endogenous RD histone genes, none of the nuclei contained 304
an ectopic HLB and we only detected the endogenous HLB (Fig. 4A-2, 4B), consistent with the 305
results of staining spread salivary gland polytene chromosomes (Fig. 1D). In contrast, we 306
observed formation of a second small HLB in addition to the single large endogenous HLB in 307
100% of the 12xHWT/12xPR nuclei (Fig. 4A-5, 4B), as expected since this HLB also forms when 308
only the 12xHWT array is present at this site. 309
We measured histone gene expression from the 12xPR transgene in these genotypes. In 310
the 12xHWT array the histone H2a gene, and not the other histone genes, is marked with a 311
restriction site change (McKay et al., 2015), whereas the 12xPR array has all five histone genes 312
marked with a restriction site change. Therefore, to specifically detect histone gene expression 313
from the 12xPR transgene, we digested embryonic H3 RT-PCR products with SacI, whose 314
recognition site is missing from the 12xPR H3 gene but is present in both the endogenous and 315
12xHWT H3 genes (Fig. 4C). Strikingly, the 12xPR transgene supports stronger H3 gene 316
expression in the presence compared to the absence of the endogenous histone genes when 317
present in trans with the 12xHWT transgene (Fig. 4C, lanes 1 and 2). These data demonstrate that 318
the wild type histone sequences in the 12xHWT were able to activate the 12xPR transgenic locus in 319
the presence of the endogenous histone genes, likely by nucleating ectopic HLB formation that 320
encompasses the paired homologous chromosomes. This result is similar to transvection, a 321
.CC-BY-NC-ND 4.0 International licenseavailable under a(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made
The copyright holder for this preprintthis version posted March 18, 2020. ; https://doi.org/10.1101/2020.03.16.994483doi: bioRxiv preprint
15
phenomenon in which transcription of a gene can be activated by an enhancer located in trans 322
through pairing of homologous chromosomes (Duncan, 2002; Fukaya and Levine, 2017). 323
324
Transgenes with a single copy of the wild type repeat partially activate a 12x array in trans 325
Since 12 copies of a wild type histone repeat at the same integration site stimulated 326
expression from the 12xPR transgene, we tested whether a single copy could do the same using 327
two different approaches; one where the single copy is present in trans and the other where it is 328
present in cis. For the first approach, we inserted a 1x HWT transgene at the VK33 integration 329
site. In these strains we detected the HLB in 90% of the salivary gland nuclei in the presence of 330
the endogenous genes (Fig. 4B), similar to what we observed previously (Salzler et al., 2013). 331
When this 1x HWT transgene was placed in trans with the 12xPR array, we detected an ectopic 332
HLB in 94% of the nuclei examined (Fig. 4A-6, 4B). We tested whether there was any activation 333
of expression of the 12xPR array using our RT-PCR assay. There was a modest activation of the 334
expression 12xPR array, but much weaker than when the 12xHWT was in trans with the 12xPR as 335
measured by the ratio of the bands (Fig. 4D, lanes 1 and 3). 336
We next tested if including a single wild type repeat in the 12xPR array could stimulate 337
HLB formation and histone mRNA expression from that array in the presence of the endogenous 338
genes. We created a 12x array (12xPR-1) containing one wild type histone repeat unit in the center 339
of 11 PR repeat units (Fig. 4E). Like 12xPR, the 12xPR-1 transgene rescued a deletion of the 340
endogenous histone genes, resulting in viable and fertile adults and indicating that it is likely 341
fully active when present as the only source of RD histone genes. We then examined HLB 342
formation in intact salivary glands from animals containing both the 12xPR-1 transgene and the 343
endogenous RD histone genes. In this genotype, we detected HLB formation at the ectopic 344
.CC-BY-NC-ND 4.0 International licenseavailable under a(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made
The copyright holder for this preprintthis version posted March 18, 2020. ; https://doi.org/10.1101/2020.03.16.994483doi: bioRxiv preprint
16
12xPR-1 transgenic locus in 59% of the nuclei (Fig. 4A-4, 4B), compared to 0% of nuclei from the 345
12xPR. We also analyzed expression from this array in the presence of the endogenous RD 346
histone genes and observed only modest activation of histone mRNA expression (Fig. 4D, lane 347
2) and less than the level of expression when the 1X WT was in trans to the 12X array (Fig. 4D, 348
lane 3). Thus, a single histone repeat unit can stimulate HLB formation, but activates expression 349
of neighboring genes better when placed in trans (i.e. on the other homolog) rather than in cis 350
(i.e. in the same 12x array). 351
We next tested a transgene containing only the H3-H4 promoter with no coding or 3’ end 352
sequences, which we previously demonstrated could form an HLB in salivary glands when 353
inserted into a ectopic site (Salzler et al., 2013). Ectopic HLBs were detected in 70% of the 354
salivary glands when only the H3-H4p promoter transgene was present and 69% when it was 355
placed in trans the 12xPR array (Fig. 4A-3, 4A-7, 4B). We assayed the expression of the H3 gene 356
from the 12xPR when the H3-H4p transgene was in trans to the 12xPR array. We detected low 357
level expression, similar to the expression detected when the 1xWT transgene was present 358
opposite the 12xPR array (Fig. 4D, lane 4). Collectively, these data demonstrate that the presence 359
of an HLB nucleating sequence in trans to 12xPR can induce formation of an ectopic HLB and 360
histone gene expression in the presence of the endogenous histone genes. Further, these data 361
emphasize that the H3-H4 promoter is a critical element in promoting HLB formation, consistent 362
with our previous observations (Salzler et al., 2013; Rieder et al., 2017). 363
364
Discussion 365
In this study, we used our histone gene replacement platform to analyze the cis acting 366
elements within the Drosophila histone repeat unit that are necessary for HLB formation and 367
.CC-BY-NC-ND 4.0 International licenseavailable under a(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made
The copyright holder for this preprintthis version posted March 18, 2020. ; https://doi.org/10.1101/2020.03.16.994483doi: bioRxiv preprint
17
histone gene expression. Previously we showed using a single, transgenic histone gene repeat 368
unit that the promoter region of the divergently transcribed H3-H4 gene pair is capable of 369
stimulating HLB formation (Salzler et al., 2013). We subsequently further mapped this 370
functionality using a 12x gene array to conserved GAGA repeats in this region that are targeted 371
by the CLAMP protein (Rieder et al., 2017). Here, we present evidence that although a 12xPR 372
histone gene array devoid of the H3-H4 promoter and lacking CLAMP binding elements cannot 373
assemble an HLB in the presence of the ~100 RD histone gene copies at the endogenous locus 374
(HisC), it surprisingly can rescue homozygous deletion of HisC and fully support the entire 375
Drosophila life cycle. In the HisC deletion background, the 12xPR array assembles an HLB and 376
expresses the same amount of properly processed histone mRNAs as the endogenous genes or as 377
a 12xDWT wild type array. Below we discuss the implications of these observations on HLB 378
assembly and organization. 379
380
The RD histone locus stimulates HLB formation in Drosophila 381
Biomolecular condensates form via a seeding event that promotes a high concentration of 382
factors at a discrete location, leading to recruitment of additional factors that ultimately results in 383
a structure that can be observed by light microscopy (Gomes and Shorter, 2019). A number of 384
putative seeding events for biomolecular condensates have been described (Dellaire et al., 2006; 385
Dundr, 2011; Mao et al., 2011; Shevtsov and Dundr, 2011; White et al., 2011), but in many 386
cases the precise mechanism of seeding is not known. Nucleic acids, particularly RNA, have 387
been proposed to seed different nuclear bodies. Both the nucleolus and the HLB are associated 388
with specific genomic loci (Mao et al., 2011; Shevtsov and Dundr, 2011; Salzler et al., 2013), 389
and it is likely that the DNA (or chromatin) and/or nascent RNA at the locus participates in the 390
.CC-BY-NC-ND 4.0 International licenseavailable under a(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made
The copyright holder for this preprintthis version posted March 18, 2020. ; https://doi.org/10.1101/2020.03.16.994483doi: bioRxiv preprint
18
seeding event. The activation of zygotic transcription of rRNA leads to the precise 391
spatiotemporal formation of the nucleolus in Drosophila embryos (Falahati et al., 2016). In the 392
absence of rDNA, Drosophila nucleolar components still form high concentration assemblies, 393
but these are smaller, more numerous, and do not form at the same time in the early embryo as 394
the wild type nucleolus. 395
Drosophila HLB components also stochastically assemble smaller and more unstable foci 396
in embryos lacking the RD histone locus (White et al., 2007; Salzler et al., 2013; Hur, 2019), 397
suggesting that HLBs and the nucleolus form similar seeding events. Indeed, the dynamics of 398
HLB assembly in single early Drosophila embryos display properties consistent with liquid-399
liquid phase transition seeded by HisC (Hur, 2019). Blocking transcription in the early embryo 400
prevents normal HLB growth (Hur, 2019), and a defective H3-H4 promoter (with mutated 401
TATA boxes) does not support HLB formation in the context of a single copy histone gene 402
repeat in salivary glands (Salzler et al., 2013). These data suggest that active transcription is 403
essential for forming a complete HLB. 404
It is important to note that HLBs assemble and persist in non-proliferating Drosophila 405
tissues that do not express histone mRNAs (Liu et al., 2006; White et al., 2007), and are also 406
present in G0/G1 mammalian cells (Ma et al., 2000; Zhao et al., 2000). Histone gene expression 407
is activated as a result of phosphorylation of Mxc/NPAT by Cyclin E/Cdk2, resulting in changes 408
in the HLB that promote histone gene transcription and pre-mRNA processing (Wei et al., 2003; 409
Ye et al., 2003; White et al., 2011). We propose that in early embryonic development the histone 410
locus DNA and/or chromatin seeds HLB assembly in Drosophila, with the H3-H4 promoter 411
region being particularly important. We further propose that subsequent transcriptional activation 412
of histone genes then drives HLB growth and maturation. 413
.CC-BY-NC-ND 4.0 International licenseavailable under a(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made
The copyright holder for this preprintthis version posted March 18, 2020. ; https://doi.org/10.1101/2020.03.16.994483doi: bioRxiv preprint
19
Formation of an HLB on a transgenic RD histone gene array requires that this array 414
compete effectively with the endogenous HisC locus for recruitment of HLB components. This is 415
the situation with 12xHWT and 12XDWT arrays, which form HLBs in the presence of HisC. These 416
results also indicate that there are no other elements within HisC that are necessary for HLB 417
formation. Because the 12xPR array does not form an HLB in the presence of HisC, but it does so 418
in the absence of HisC, we hypothesize that the endogenous RD histone gene array sequesters 419
critical HLB components, likely including Mxc and CLAMP, thereby preventing HLB assembly 420
at the transgenic locus. By removing the H3-H4 promoter from the transgene, we removed an 421
element that provided additional interactions with HLB components, notably CLAMP, 422
weakening the overall ability of the locus to stably nucleate an HLB. 423
424
HLB assembly involves multiple interactions among components 425
Interactions among multivalent proteins, or multivalent protein-nucleic acid interactions, 426
are driving forces in the assembly of biomolecular condensates (Shin and Brangwynne, 2017). 427
Mxc is likely the critical factor that together with histone genes seeds Drosophila HLB formation 428
and activates histone gene expression. Mxc is a large (~1800 aa) protein that oligomerizes in 429
vivo and likely provides a scaffold for multivalent interactions (Terzo et al., 2015). A C-terminal 430
truncation mutant of Mxc that fails to recruit histone pre-mRNA processing factors still forms an 431
HLB and activates histone gene expression at sufficient levels to complete development, 432
underscoring the multivalent nature of Mxc (White et al., 2011; Landais et al., 2014; Tatomer et 433
al., 2016b). 434
Surprisingly, the HLB that assembles on the 12xPR array in the absence of HisC contains 435
CLAMP, even though we have removed all of the known CLAMP binding sites from the histone 436
.CC-BY-NC-ND 4.0 International licenseavailable under a(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made
The copyright holder for this preprintthis version posted March 18, 2020. ; https://doi.org/10.1101/2020.03.16.994483doi: bioRxiv preprint
20
repeat. Although CLAMP may bind another sequence in the 12xPR array, no other favorable 437
GAGA repeats are present, and we were unable to detect CLAMP bound to any of the histone 438
promoters by ChIP-qPCR experiments. More likely, CLAMP may interact with other HLB 439
components, possibly Mxc or the Mxc-FLASH complex, providing multivalent contacts between 440
CLAMP and other HLB components. Deleting the GAGA sequences from the H3-H4 promoter 441
did not affect transcription of the H3 or H4 genes in the absence of HisC, suggesting that 442
CLAMP’s major function is to promote HLB assembly and not to act as a DNA binding 443
transcription factor. Supporting this interpretation is the observation that another, more abundant 444
transcription factor that binds to GAGA repeats, GAF, is not found at the HLB unless CLAMP is 445
absent (Rieder et al., 2017), consistent with CLAMP’s critical interactions with both the GAGA 446
repeats and HLB factors in seeding the HLB. 447
Because the 12xPR array is capable of assembling a completely functional HLB in the 448
absence of HisC, the H3-H4 promoter is not absolutely essential for HLB formation. One 449
possibility is that there are multiple pathways for assembling functional HLBs. Previous work 450
suggests that not all seeding events are equivalent in their ability to assemble biomolecular 451
condensates. In artificial systems, changes in scaffold stoichiometry, which can stem from 452
changes in valence, alter the recruitment of components (Banani et al., 2016). Further, 453
mathematical modeling has revealed that scaffolds can nucleate distinct complexes when at 454
different concentrations, and that this can qualitatively alter the transcriptional output (Yang and 455
Hlavacek, 2011). Additionally, P-bodies can form in multiple ways through different protein-456
protein or protein-nucleic acid interactions, with different interactions predominating under 457
different conditions (Rao and Parker, 2017). Therefore, different nucleators of the HLB (i.e. the 458
H3-H4 promoter or other sequences in the locus) may result in similar but not identical 459
.CC-BY-NC-ND 4.0 International licenseavailable under a(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made
The copyright holder for this preprintthis version posted March 18, 2020. ; https://doi.org/10.1101/2020.03.16.994483doi: bioRxiv preprint
21
outcomes. Collectively our results suggest that HLB formation results from the contribution of 460
many molecular interactions, and the loss of any single one may be overcome by other 461
multivalent interactions within the body. 462
463
Methods 464
Culture condition and fly strains 465
Original fly strains and crosses were used as in McKay et al. 2015. Stocks were maintained on 466
standard corn medium. Viability studies were performed as in (Penke et al., 2016). 467
Transgenic histone gene array construction 468
Construction of the 5kb histone repeat designed for this study was performed using the HiFi 469
DNA Assembly system from New England Biolabs. PCR amplification of fragments from 470
existing histone repeats, in addition to in vitro synthesized gblocks (IDT) containing the desired 471
nucleotide changes were employed for the building blocks of the reaction. Manufactures protocol 472
was followed with slight modification to the incubation time of the reaction. Engineered 5kb 473
histone repeats were then arrayed to 12 copies in pMultiBac as in (McKay et al., 2015). All 474
histone gene arrays were inserted into the VK33 attP site on chromosome 3L (65B2) 475
(Venken et al., 2006). 476
Northern Analysis 477
Northern analysis was performed using a 7 M 6% urea acrylamide gel to resolve histone 478
mRNAs. ~1ug of RNA from embryos or larvae was used with a radio-labeled probe to the 479
coding region of H3 as in (Lanzotti et al., 2002). 480
Histone Expression Analysis 481
.CC-BY-NC-ND 4.0 International licenseavailable under a(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made
The copyright holder for this preprintthis version posted March 18, 2020. ; https://doi.org/10.1101/2020.03.16.994483doi: bioRxiv preprint
22
Total RNA was prepared in Trizol and cDNA synthesized with random hexamers using 482
Superscript II (Invitrogen), according to the manufacturer’s instructions. RT-RCR was 483
performed using gene-specific primers to H2a (McKay et al. 2015) and H3. Each reaction was 484
performed at least 3 times with similar results. PCR products were digested using XhoI (H2a) or 485
SacI (H3). Digested amplicons were run on an 8% acrylamide gel or a 1.5% agarose gel. 486
Immunofluorescence 487
We used primary antibodies at the following concentrations: rabbit anti-CLAMP (1:1000), 488
guinea pig anti-Mxc (1:2000), guinea pig anti-Mute (1:5000), rabbit anti-FLASH (1:2000), rabbit 489
anti-Lsm10 (1:1000), mouse anti-MPM-2 (1:100; Millipore), rabbit anti-GAF (1:1000), mouse 490
anti-LacI (1:1000; Millipore). We used Alexa fluor secondary antibodies (Thermo Fisher 491
Scientific) at a concentration of 1:1000. In situ probes were detected using 15 µg/mL 492
streptavidin-DyLight-488 (Vector Laboratories). Salivary gland dissections and squashes were 493
performed as in (Tatomer et al., 2016b). Images were acquired with using a Zeiss Lsm710 with 494
ZEN DUO software. Images were analyzed using ImageJ. Ectopic HLBs in embryos were 495
quantified as (Salzler et al., 2013). 496
Fluorescent in situ hybridization 497
FISH probes were made from a PCR product that spanned the entire wild-type histone repeat 498
(primers [AAAGGAGGTTGGTAGGCAGC] and [ACGCTAGCGCTTTATCTGCA]). 499
Biotinylated FISH probes were made with nick translation by incubating 1 µg of purified PCR 500
product for 2 h at 15°C in a total of 50 µL containing 1×DNAPolI buffer (Fisher Optizyme); 501
0.05mM each of dCTP, dATP, and dGTP; 0.05 mM biotin-11-dUTP (Thermo Scientific); 10 502
mM 2-mercaptoethanol; 0.004 U of DNaseI (Fisher Optizyme); and 10 U of DNAPol I (Fisher 503
Optizyme). The reaction was purified on a PCR purification column (Thermo Scientific) and 504
.CC-BY-NC-ND 4.0 International licenseavailable under a(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made
The copyright holder for this preprintthis version posted March 18, 2020. ; https://doi.org/10.1101/2020.03.16.994483doi: bioRxiv preprint
23
diluted in hybridization buffer (2xSSC, 10%dextran sulfate, 50%formamide, 0.8 mg/mL salmon 505
sperm DNA) to a final volume of 220 µL. FISH probes were diluted in hybridization mixture and 506
added to slides containing salivary gland polytene chromosome spreads before heating. We 507
added a coverslip, sealed it with rubber cement, and heated the slide for 2 min on a 91°C heat 508
block. Slides were placed in a humid box and incubated at 37°C overnight. Immunostaining was 509
then performed by incubating the slides in primary antibody overnight at 4°C in a humid box. 510
Chromatin Immunoprecipitation 511
ChIP was performed according to (Blythe and Wieschaus, 2015) with the following particulars. 512
2-4 hours old embryos (~200 embryos for each of three biological replicates) from HisC/HisC; 513
12xDWT or 12xPR strains were collected, fixed and stored at −80°C until further processing. 514
Immunoprecipitation of chromatin preparations was performed using rabbit anti-CLAMP 515
antibody (Rieder et al., 2017) and a polyclonal rabbit IgG (Millipore-Sigma, 12-370). qPCR was 516
performed as described in (Urban et al., 2017). Three biological replicates and two technical 517
replicates were used for each genotype. We normalized CLAMP IP target abundance against IgG 518
IP target abundance. We analyzed data using a Student's t-test, comparing target abundance 519
between the DWT and PR lines. 520
521
Acknowledgements 522
We thank Erica Larschan’s lab at Brown University for assistance with ChIP experiments. This 523
work was supported by NIH grant R01GM058921 to R.J.D and W.F.M and F32GM109663, 524
K99HD092625, and R00HD092625 to L.E.R. 525
526
References 527
.CC-BY-NC-ND 4.0 International licenseavailable under a(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made
The copyright holder for this preprintthis version posted March 18, 2020. ; https://doi.org/10.1101/2020.03.16.994483doi: bioRxiv preprint
24
Alberti, S., Gladfelter, A., and Mittag, T. (2019). Considerations and Challenges in Studying 528 Liquid-Liquid Phase Separation and Biomolecular Condensates. Cell 176, 419-434. 529
Altmeyer, M., Neelsen, K.J., Teloni, F., Pozdnyakova, I., Pellegrino, S., Grofte, M., Rask, M.B., 530 Streicher, W., Jungmichel, S., Nielsen, M.L., and Lukas, J. (2015). Liquid demixing of 531 intrinsically disordered proteins is seeded by poly(ADP-ribose). Nat Commun 6, 8088. 532
Banani, S.F., Lee, H.O., Hyman, A.A., and Rosen, M.K. (2017). Biomolecular condensates: 533 organizers of cellular biochemistry. Nat Rev Mol Cell Biol 18, 285-298. 534
Banani, S.F., Rice, A.M., Peeples, W.B., Lin, Y., Jain, S., Parker, R., and Rosen, M.K. (2016). 535 Compositional Control of Phase-Separated Cellular Bodies. Cell 166, 1-13. 536
Blythe, S.A., and Wieschaus, E.F. (2015). Zygotic genome activation triggers the DNA 537 replication checkpoint at the midblastula transition. Cell 160, 1169-1181. 538
Bongartz, P., and Schloissnig, S. (2019). Deep repeat resolution-the assembly of the Drosophila 539 Histone Complex. Nucleic Acids Res 47, e18. 540
Bulchand, S., Menon, S.D., George, S.E., and Chia, W. (2010). Muscle wasted: a novel 541 component of the Drosophila histone locus body required for muscle integrity. J Cell Sci 123, 542 2697-2707. 543
Burch, B.D., Godfrey, A.C., Gasdaska, P.Y., Salzler, H.R., Duronio, R.J., Marzluff, W.F., and 544 Dominski, Z. (2011). Interaction between FLASH and Lsm11 is essential for histone pre-mRNA 545 processing in vivo in Drosophila. RNA 17, 1132-1147. 546
Dellaire, G., Eskiw, C.H., Dehghani, H., Ching, R.W., and Bazett-Jones, D.P. (2006). Mitotic 547 accumulations of PML protein contribute to the re-establishment of PML nuclear bodies in G1. J 548 Cell Sci 119, 1034-1042. 549
Duncan, I.W. (2002). Transvection effects in Drosophila. Annu Rev Genet 36, 521-556. 550
Dundr, M. (2011). Seed and grow: a two-step model for nuclear body biogenesis. J Cell Biol 551 193, 605-606. 552
Dundr, M., and Misteli, T. (2010). Biogenesis of nuclear bodies. Cold Spring Harb Perspect Biol 553 2, a000711. 554
Duronio, R.J., and Marzluff, W.F. (2017). Coordinating cell cycle-regulated histone gene 555 expression through assembly and function of the Histone Locus Body. RNA Biol 14, 726-738. 556
Falahati, H., Pelham-Webb, B., Blythe, S., and Wieschaus, E. (2016). Nucleation by rRNA 557 Dictates the Precision of Nucleolus Assembly. Curr Biol 26, 277-285. 558
Fukaya, T., and Levine, M. (2017). Transvection. Curr Biol 27, R1047-R1049. 559
.CC-BY-NC-ND 4.0 International licenseavailable under a(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made
The copyright holder for this preprintthis version posted March 18, 2020. ; https://doi.org/10.1101/2020.03.16.994483doi: bioRxiv preprint
25
Gilmour, D.S., Thomas, G.H., and Elgin, S.C. (1989). Drosophila nuclear proteins bind to 560 regions of alternating C and T residues in gene promoters. Science 245, 1487-1490. 561
Godfrey, A.C., White, A.E., Tatomer, D.C., Marzluff, W.F., and Duronio, R.J. (2009). The 562 Drosophila U7 snRNP proteins Lsm10 and Lsm11 are required for histone pre-mRNA 563 processing and play an essential role in development. RNA 15, 1661-1672. 564
Gomes, E., and Shorter, J. (2019). The molecular language of membraneless organelles. J Biol 565 Chem 294, 7115-7127. 566
Gunesdogan, U., Jackle, H., and Herzig, A. (2010). A genetic system to assess in vivo the 567 functions of histones and histone modifications in higher eukaryotes. EMBO Rep 11, 772-776. 568
Gunesdogan, U., Jackle, H., and Herzig, A. (2014). Histone supply regulates S phase timing and 569 cell cycle progression. Elife 3, e02443. 570
Heyn, P., Salmonowicz, H., Rodenfels, J., and Neugebauer, K.M. (2017). Activation of 571 transcription enforces the formation of distinct nuclear bodies in zebrafish embryos. RNA Biol 572 14, 752-760. 573
Hur, W., Tarzia, M., Deneke, V. E., Terzo, E. A., Duronio, R. J., Di Talia, S. (2019). CDK-574 regulated phase separation seeded by histone genes ensures precise growth and function of 575 Histone Locus Bodies. BioRXiv https://doi.org/10.1101/789933. 576
Kuzu, G., Kaye, E.G., Chery, J., Siggers, T., Yang, L., Dobson, J.R., Boor, S., Bliss, J., Liu, W., 577 Jogl, G., Rohs, R., Singh, N.D., Bulyk, M.L., Tolstorukov, M.Y., and Larschan, E. (2016). 578 Expansion of GA Dinucleotide Repeats Increases the Density of CLAMP Binding Sites on the 579 X-Chromosome to Promote Drosophila Dosage Compensation. PLoS Genet 12, e1006120. 580
Landais, S., D'Alterio, C., and Jones, D.L. (2014). Persistent replicative stress alters polycomb 581 phenotypes and tissue homeostasis in Drosophila melanogaster. Cell Rep 7, 859-870. 582
Lanzotti, D.J., Kaygun, H., Yang, X., Duronio, R.J., and Marzluff, W.F. (2002). Developmental 583 control of histone mRNA and dSLBP synthesis during Drosophila embryogenesis and the role of 584 dSLBP in histone mRNA 3' end processing in vivo. Mol Cell Biol 22, 2267-2282. 585
Lifton, R.P., Goldberg, M.L., Karp, R.W., and Hogness, D.S. (1978). The organization of the 586 histone genes in Drosophila melanogaster: functional and evolutionary implications. Cold Spring 587 Harb Symp Quant Biol 42, 1047-1051. 588
Liu, J.L., Murphy, C., Buszczak, M., Clatterbuck, S., Goodman, R., and Gall, J.G. (2006). The 589 Drosophila melanogaster Cajal body. J Cell Biol 172, 875-884. 590
Ma, T., Van Tine, B.A., Wei, Y., Garrett, M.D., Nelson, D., Adams, P.D., Wang, J., Qin, J., 591 Chow, L.T., and Harper, J.W. (2000). Cell cycle-regulated phosphorylation of p220(NPAT) by 592 cyclin E/Cdk2 in cajal bodies promotes histone gene transcription. Genes Dev 14, 2298-2313. 593
.CC-BY-NC-ND 4.0 International licenseavailable under a(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made
The copyright holder for this preprintthis version posted March 18, 2020. ; https://doi.org/10.1101/2020.03.16.994483doi: bioRxiv preprint
26
Mao, Y.S., Sunwoo, H., Zhang, B., and Spector, D.L. (2011). Direct visualization of the co-594 transcriptional assembly of a nuclear body by noncoding RNAs. Nat Cell Biol 13, 95-101. 595
Marzluff, W.F., and Koreski, K.P. (2017). Birth and Death of Histone mRNAs. Trends Genet 33, 596 745-759. 597
Matera, A.G., Izaguire-Sierra, M., Praveen, K., and Rajendra, T.K. (2009). Nuclear bodies: 598 random aggregates of sticky proteins or crucibles of macromolecular assembly? Dev Cell 17, 599 639-647. 600
McKay, D.J., Klusza, S., Penke, T.J., Meers, M.P., Curry, K.P., McDaniel, S.L., Malek, P.Y., 601 Cooper, S.W., Tatomer, D.C., Lieb, J.D., Strahl, B.D., Duronio, R.J., and Matera, A.G. (2015). 602 Interrogating the function of metazoan histones using engineered gene clusters. Dev Cell 32, 603 373-386. 604
Mitrea, D.M., and Kriwacki, R.W. (2016). Phase separation in biology; functional organization 605 of a higher order. Cell Commun Signal 14, 1. 606
Penke, T.J., McKay, D.J., Strahl, B.D., Matera, A.G., and Duronio, R.J. (2016). Direct 607 interrogation of the role of H3K9 in metazoan heterochromatin function. Genes Dev 30, 1866-608 1880. 609
Rajendra, T.K., Praveen, K., and Matera, A.G. (2010). Genetic analysis of nuclear bodies: from 610 nondeterministic chaos to deterministic order. Cold Spring Harb Symp Quant Biol 75, 365-374. 611
Rao, B.S., and Parker, R. (2017). Numerous interactions act redundantly to assemble a tunable 612 size of P bodies in Saccharomyces cerevisiae. Proc Natl Acad Sci U S A 114, E9569-E9578. 613
Rieder, L.E., Koreski, K.P., Boltz, K.A., Kuzu, G., Urban, J.A., Bowman, S.K., Zeidman, A., 614 Jordan, W.T., 3rd, Tolstorukov, M.Y., Marzluff, W.F., Duronio, R.J., and Larschan, E.N. (2017). 615 Histone locus regulation by the Drosophila dosage compensation adaptor protein CLAMP. 616 Genes Dev 31, 1494-1508. 617
Salzler, H.R., Tatomer, D.C., Malek, P.Y., McDaniel, S.L., Orlando, A.N., Marzluff, W.F., and 618 Duronio, R.J. (2013). A sequence in the Drosophila H3-H4 Promoter triggers histone locus body 619 assembly and biosynthesis of replication-coupled histone mRNAs. Dev Cell 24, 623-634. 620
Shevtsov, S.P., and Dundr, M. (2011). Nucleation of nuclear bodies by RNA. Nat Cell Biol 13, 621 167-173. 622
Shin, Y., and Brangwynne, C.P. (2017). Liquid phase condensation in cell physiology and 623 disease. Science 357, 1253. 624
Stanek, D., and Fox, A.H. (2017). Nuclear bodies: news insights into structure and function. Curr 625 Opin Cell Biol 46, 94-101. 626
.CC-BY-NC-ND 4.0 International licenseavailable under a(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made
The copyright holder for this preprintthis version posted March 18, 2020. ; https://doi.org/10.1101/2020.03.16.994483doi: bioRxiv preprint
27
Tatomer, D.C., Terzo, E., Curry, K.P., Salzler, H., Sabath, I., Zapotoczny, G., McKay, D.J., 627 Dominski, Z., Marzluff, W.F., and Duronio, R.J. (2016a). Concentrating pre-mRNA processing 628 factors in the histone locus body facilitates efficient histone mRNA biogenesis. J Cell Biol. 629
Tatomer, D.C., Terzo, E., Curry, K.P., Salzler, H., Sabath, I., Zapotoczny, G., McKay, D.J., 630 Dominski, Z., Marzluff, W.F., and Duronio, R.J. (2016b). Concentrating pre-mRNA processing 631 factors in the histone locus body facilitates efficient histone mRNA biogenesis. J Cell Biol 213, 632 557-570. 633
Terzo, E.A., Lyons, S.M., Poulton, J.S., Temple, B.R., Marzluff, W.F., and Duronio, R.J. (2015). 634 Distinct self-interaction domains promote Multi Sex Combs accumulation in and formation of 635 the Drosophila histone locus body. Mol Biol Cell 26, 1559-1574. 636
Urban, J.A., Doherty, C.A., Jordan, W.T., 3rd, Bliss, J.E., Feng, J., Soruco, M.M., Rieder, L.E., 637 Tsiarli, M.A., and Larschan, E.N. (2017). The essential Drosophila CLAMP protein 638 differentially regulates non-coding roX RNAs in male and females. Chromosome Res 25, 101-639 113. 640
Venken, K.J., He, Y., Hoskins, R.A., and Bellen, H.J. (2006). P[acman]: a BAC transgenic 641 platform for targeted insertion of large DNA fragments in D. melanogaster. Science 314, 1747-642 1751. 643
Wagner, E.J., Burch, B.D., Godfrey, A.C., Salzler, H.R., Duronio, R.J., and Marzluff, W.F. 644 (2007). A genome-wide RNA interference screen reveals that variant histones are necessary for 645 replication-dependent histone pre-mRNA processing. Mol Cell 28, 692-699. 646
Wei, Y., Jin, J., and Harper, J.W. (2003). The Cyclin E/Cdk2 Substrate and Cajal Body 647 Component p220(NPAT) Activates Histone Transcription through a Novel LisH-Like Domain. 648 Mol Cell Biol 23, 3669-3680. 649
White, A.E., Burch, B.D., Yang, X.C., Gasdaska, P.Y., Dominski, Z., Marzluff, W.F., and 650 Duronio, R.J. (2011). Drosophila histone locus bodies form by hierarchical recruitment of 651 components. J Cell Biol 193, 677-694. 652
White, A.E., Leslie, M.E., Calvi, B.R., Marzluff, W.F., and Duronio, R.J. (2007). Developmental 653 and cell cycle regulation of the Drosophila histone locus body. Mol Biol Cell 18, 2491-2502. 654
Yang, J., and Hlavacek, W.S. (2011). Scaffold-mediated nucleation of protein signaling 655 complexes: elementary principles. Math Biosci 232, 164-173. 656
Yang, X.C., Burch, B.D., Yan, Y., Marzluff, W.F., and Dominski, Z. (2009). FLASH, a 657 proapoptotic protein involved in activation of caspase-8, is essential for 3' end processing of 658 histone pre-mRNAs. Mol Cell 36, 267-278. 659
Ye, X., Wei, Y., Nalepa, G., and Harper, J.W. (2003). The cyclin E/Cdk2 substrate p220(NPAT) 660 is required for S-phase entry, histone gene expression, and Cajal body maintenance in human 661 somatic cells. Mol Cell Biol 23, 8586-8600. 662
.CC-BY-NC-ND 4.0 International licenseavailable under a(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made
The copyright holder for this preprintthis version posted March 18, 2020. ; https://doi.org/10.1101/2020.03.16.994483doi: bioRxiv preprint
28
Zhao, J., Dynlacht, B., Imai, T., Hori, T., and Harlow, E. (1998). Expression of NPAT, a novel 663 substrate of cyclin E-CDK2, promotes S- phase entry. Genes Dev 12, 456-461. 664
Zhao, J., Kennedy, B.K., Lawrence, B.D., Barbie, D.A., Matera, A.G., Fletcher, J.A., and 665 Harlow, E. (2000). NPAT links cyclin E-Cdk2 to the regulation of replication-dependent histone 666 gene transcription. Genes Dev 14, 2283-2297. 667 668
669
.CC-BY-NC-ND 4.0 International licenseavailable under a(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made
The copyright holder for this preprintthis version posted March 18, 2020. ; https://doi.org/10.1101/2020.03.16.994483doi: bioRxiv preprint
29
Figure Legends 670
671
Figure 1. The H3-H4 bidirectional promoter is required for ectopic HLB formation in the 672
presence of the endogenous RD histone genes. 673
A) Schematic of the endogenous Drosophila RD histone gene repeat, which is found in a tandem 674
array of ~100 copies at HisC, and the BAC-based DWT and PR synthetic histone repeats, which 675
arrayed to 12 copies and inserted at VK33 on chromosome 3. The colors indicate that each 676
transgenic histone gene was engineered to be distinguishable molecularly from the endogenous 677
histone genes (see Fig. S1). In the PR construct the H2a-H2b promoters (green) replaced the H3-678
H4 promoters (blue). B, C) Primers that anneal to H3 and H4 genes (A) were used to amplify the 679
promoter region from genomic DNA extracted from whole flies heterozygous (B) or embryos 680
homozygous (C) for a deletion of the HisC locus and containing either the 12xDWT or 12xPR 681
transgene. The amplicon from a wild type locus is larger (637nts) than the PR locus (565nts). D) 682
Polytene chromosome squashes from 3rd instar larval salivary glands containing both HisC and 683
either the 12xDWT or 12xPR transgenes hybridized with a probe to the histone locus and stained 684
for HLB components Lsm11 and Mxc. Note that absence of an HLB at 12xPR (arrows) when 685
HisC is present (right panels). Scale bar 10 microns. E) Assay for specific detection of 686
endogenous versus transgenic RD histone gene expression. Total RNA was prepared from 3rd 687
instar larvae containing the endogenous genes and one copy of the 12xDWT or 12xPR transgene. 688
RT-PCR with H2a primers followed by digestion with Xho I cleaves cDNA from the endogenous 689
genes but not the transgenes, which was visualized using an 8% polyacrylamide gel. 690
691
692
.CC-BY-NC-ND 4.0 International licenseavailable under a(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made
The copyright holder for this preprintthis version posted March 18, 2020. ; https://doi.org/10.1101/2020.03.16.994483doi: bioRxiv preprint
30
Figure 2. The PR transgene is expressed and forms an HLB in the absence of the 693
endogenous RD histone genes. 694
A, B) RNA was isolated from 5-7 hr embryos (A) or 3rd instar larvae (B) containing the 12xDWT 695
or 12xPR transgene in the presence or absence of HisC, with wild-type embryos as a control (lane 696
1). RT-PCR products using H2a primers were digested with Xho I and visualized using a 1.5% 697
agarose gel (which does not resolve the two restriction fragments from the endogenous genes as 698
in Fig. 1E). C, D) Polytene chromosomes from HisC deletion homozygous 3rd instar larval 699
salivary glands rescued by either the 12xDWT or 12xPR transgene stained with antibodies against 700
HLB components Lsm11 and FLASH (C) or Mxc and Mute (D). E, F) Total RNA from 0-3h or 701
3-5h embryos (D) or 3rd instar larvae (E) from wild-type (yw) or HisC deletion homozygotes 702
containing either the 12xDWT or 12xPR transgene analyzed by Northern blotting using a 703
radiolabeled H3 coding region probe. * = normally processed H3 mRNA. A FLASH mutant that 704
cannot recruit the processing machinery to the HLB (Tatomer et al., 2016b) was used as a 705
positive control for production of poly(A)+ histone mRNA. G) Syncytial, embryos from HisC 706
deletion homozygous stocks rescued by 12xDWT or 12xPR transgenes and stained with Mxc to 707
monitor HLB formation in development. 3 consecutive syncytial nuclear cycles (top to bottom) 708
were analyzed. 709
710
.CC-BY-NC-ND 4.0 International licenseavailable under a(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made
The copyright holder for this preprintthis version posted March 18, 2020. ; https://doi.org/10.1101/2020.03.16.994483doi: bioRxiv preprint
31
Figure 3. CLAMP but not GAF is recruited to HLBs that form in the absence of GA repeats 711 712 A) Schematic of the WT, PR, and GA mutant (GAM) synthetic histone repeats. For GAM, GA 713
sequences in the H3-H4 promoter were mutated to lacO sites or scrambled (Rieder et al., 2017). 714
B-E) Polytene chromosome squashes from 3rd instar larval salivary glands lacking endogenous 715
histone genes and rescued by a either a 12xDWT transgene (B,D) or a 12xPR transgene (C,E) were 716
stained with antibodies against FLASH and CLAMP (B,C) or FLASH and GAF (D,E). Scale bar 717
10 microns. F) RT-PCR analysis with H2a primers performed on cDNA from 5-7 hr embryos of 718
indicated genotypes and visualized with ethidium bromide on a 0.8% agarose gel. G) 2-4 hr old 719
HisC homozygous deletion embryos rescued by a 12xDWT or 12xPR transgene were analyzed by 720
ChIP-PCR using primers recognizing the indicated promoters. 721
722
Figure 4 Activation of PR transgenes by wild type histone gene arrays 723
A-1 to A-8) Intact salivary gland nuclei from 3rd instar larvae of the indicated genotypes, all of 724
which contained HisC and stained for the HLB components Mxc and FLASH. Ectopic HLBs are 725
indicated by arrows and dashed boxes, and were present in all genotypes except the 12xPR. Scale 726
bar 10 microns. B) Quantification of the percentage of ectopic HLBs in the salivary gland nuclei 727
in panel B. C) Schematic of the assay used to distinguish 12xPR from 12xHWT cDNA. RT-PCR 728
amplification of H3 mRNA from 5-7 hr old embryos of the indicated genotypes, followed by 729
digestion with SacI and visualized on an 8% polyacrylamide gel. D) RT-PCR amplification of 730
H2a mRNA from larvae of indicated genotypes, followed by digestion with XhoI and 731
visualization on an agarose gel. E) Top: Schematic of the BAC-based PR-1 histone array 732
containing 11 PR and 1 DWT repeat unit. Bottom: Analysis of BAC constructs containing WT 733
and PR repeats using PCR as in panel 1B and C. 734
.CC-BY-NC-ND 4.0 International licenseavailable under a(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made
The copyright holder for this preprintthis version posted March 18, 2020. ; https://doi.org/10.1101/2020.03.16.994483doi: bioRxiv preprint
32
Supplemental Figure 1. Annotated maps of engineered histone gene repeat unit used to 735
make transgenic 12x replication dependent histone gene arrays. A) Designer Wild Type. B) 736
Promoter Replacement. Restriction enzyme sites that were either added or deleted are shown 737
below each line of sequence. The maps were made using Snapgene. 738
.CC-BY-NC-ND 4.0 International licenseavailable under a(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made
The copyright holder for this preprintthis version posted March 18, 2020. ; https://doi.org/10.1101/2020.03.16.994483doi: bioRxiv preprint
merge
FISH
Lsm11
DWT
MxcMxc
ectopicEndo ectopicEndo
PR
Genotype Gel.
DWT PR
H3-H4p - 637ntsH2a-H2bp- 565 nts
+/+ | HisCΔ / +
H3-H4p - 637ntsH2a-H2bp- 565 nts
+/+ | HisCΔ /HisCΔ
XhoI
CTCGAG
Endogenous H2aXhoI
CTGGAA
Transgenic H2a
Trans H2a: 391nts
Endo H2a: 200+191
Endogenous Transgene
+ +DWT
H2a PCR Product digested with Xho I
FISH
Lsm11
merge
+PR
D
A
B C
E
H1DWT
PR
H3 H4 H2a H2b
H1H3 H4 H2a H2b
H1WT H3 H4 H2a H2b
DWT PR
x 100
x 12
x 12
Mxc
Koreski, et al., Figure 1
.CC-BY-NC-ND 4.0 International licenseavailable under a(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made
The copyright holder for this preprintthis version posted March 18, 2020. ; https://doi.org/10.1101/2020.03.16.994483doi: bioRxiv preprint
PR
Lsm11
FLASH
merge
FLASH
Lsm11
merge
DWT
A
B
E
C
F
DTransgene Endogenous
DWT DWT PR PR+ + +
1 2 3 4 5
Transgene Endogenous + + +
DWT PR DWT PR
1 2 3 4 5
Mxc Mxc/DNA
PR DWT
Trans H2a
Endo H2a
Trans H2a
Endo H2a
DWTpoly(A
) +
0-3hr 3-5hr
yw DWT PR yw PR FLASH mutant
H3
DWT*
yw DWT PR FLASH mutant
H3
poly(A) +
*
merge
Mute
Mxc
PR
G
merge
Mute
Mxc
DWT
Mxc Mxc/DNA
Koreski et al., Figure 2
.CC-BY-NC-ND 4.0 International licenseavailable under a(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made
The copyright holder for this preprintthis version posted March 18, 2020. ; https://doi.org/10.1101/2020.03.16.994483doi: bioRxiv preprint
HWT
A
FLASH
CLAMP
mergeΔHisC; PR
FLASH
CLAMP
mergeΔHisC; DWT
ΔHisC; DWT merge
GAF
FLASHFLASH
GAF
merge
G
ΔHisC; PR
F
B C
D E
GAM
DWTPR PRGAM DWTGAM
+
Trans H2a
Endo H2a
ΔΔ ΔTransgene
Endogenous + +
++ ++ + + +
0
20
40
60
80
100
120
140
H3H4 H2AH2B H1Fold
Enr
ichm
ent o
f CLA
MP
IP o
ver I
gG IP
PR DWT
H1DWT H3 H4 H2a H2b
PR H1H3 H4 H2a H2b
H1H3 H4 H2a H2b
Koreski, et al., Figure 3
.CC-BY-NC-ND 4.0 International licenseavailable under a(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made
The copyright holder for this preprintthis version posted March 18, 2020. ; https://doi.org/10.1101/2020.03.16.994483doi: bioRxiv preprint
DWT copies
PR copies
DWTPR
1
1
1
7
1 1
3 11
yw
PR-1
B
mergeMxcFLASH mergeMxcFLASH
212xPR
1yw
412XPR-1
61xHWT
12XPR
512xHWT
12XPR
3H3H4p
7H3H4p
12XPR
812xPR-1
12XPR
D
Transgene #1
Endogenous DWT
HWT
PR PR
+ +PR PR
HWT
Sac I
Endogenous andtransgenic HWT H3
Sac I
TransgenicDWT and PR H3
1 2 3 4 5
Transgene #2
Transgene #1
Endogenous PR PR-1
+ +
PR
H3H4PTransgene #2
+ +
1xHWT
PR E
1 2 3 4
A
C
PR5 DWT1 PR6
tg
endo
Koreski et al., Figure 4
.CC-BY-NC-ND 4.0 International licenseavailable under a(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made
The copyright holder for this preprintthis version posted March 18, 2020. ; https://doi.org/10.1101/2020.03.16.994483doi: bioRxiv preprint
Page 1
BspDIClaI
SalI AccI
5
3
80
H1-H3 intergenic region
G T C G A C T A A T G C A T A T G T G G C G A G G A A A T C G A T T G A T T T C A G A A C A A A T T A T T T T A A A A T A T G C A T G A A A A C A C A T T A A T
C A G C T G A T T A C G T A T A C A C C G C T C C T T T A G C T A A C T A A A G T C T T G T T T A A T A A A A T T T T A T A C G T A C T T T T G T G T A A T T A
160
H1-H3 intergenic region
A A C A A G C A A A C A C A T T A A T A A T T T A A G A A A A T A T T A T T T A T T A T A T T A A T A T T A T G T T A T T T A A G A A A G T A T C T G T A T T T
T T G T T C G T T T G T G T A A T T A T T A A A T T C T T T T A T A A T A A A T A A T A T A A T T A T A A T A C A A T A A A T T C T T T C A T A G A C A T A A A
PvuI
240
H1-H3 intergenic region
T T A A C G A T C G A A A A T T A T T T C T G A A T G C T G C T T T A A A G C A A A T T T T T C T G T A G T T C A A T G T G A A C T T A A A T C A A G T A A T A
A A T T G C T A G C T T T T A A T A A A G A C T T A C G A C G A A A T T T C G T T T A A A A A G A C A T C A A G T T A C A C T T G A A T T T A G T T C A T T A T
PacI
320
H1-H3 intergenic region
A A G T A T C T T A A T T A A T A A T A G A C G C T T C T T T T A G A A G C C C T T C A A G G G G A T G A A C G T T T C A A T T T T A A T A A A C A T T A C G A
T T C A T A G A A T T A A T T A T T A T C T G C G A A G A A A A T C T T C G G G A A G T T C C C C T A C T T G C A A A G T T A A A A T T A T T T G T A A T G C T
400
H1-H3 intergenic region
A T T A G T G A A A T A T T T G C A A T G A T T C T T A T T T T A T T A G A T G T T T T T T T A T A A A T T G G T C C A G T T A A A A A T T T G A A T A T A A A
T A A T C A C T T T A T A A A C G T T A C T A A G A A T A A A A T A A T C T A C A A A A A A A T A T T T A A C C A G G T C A A T T T T T A A A C T T A T A T T T
480
H1-H3 intergenic region
A A T T C A A T C A A C A T T T G A A A G T C T C A A A A C C C A T A T T A C A T C C T T T T A A A A A T G G A A A G T G A C G A A A A A A T T A T T T A A A A
T T A A G T T A G T T G T A A A C T T T C A G A G T T T T G G G T A T A A T G T A G G A A A A T T T T T A C C T T T C A C T G C T T T T T T A A T A A A T T T T
560
H1-H3 intergenic region
G T G T A G A A C T A T T A A A A C C T T A T T T T T A T T A A T G A T T T A A A A T A C T A A A A A A T T A A A A A A A G T T T A C A C T T C A A G C A A A C
C A C A T C T T G A T A A T T T T G G A A T A A A A A T A A T T A C T A A A T T T T A T G A T T T T T T A A T T T T T T T C A A A T G T G A A G T T C G T T T G
BsaBIPshAI
640
H1-H3 intergenic region H1 core promoter
T T C G A C A T A G T A A A T G A C T G A T G T C A G T A G C A T T G T T A A A G T G C T C T C C T C C T C G A T T C T C A T C A G A G C A A A G G A G G T T G
A A G C T G T A T C A T T T A C T G A C T A C A G T C A T C G T A A C A A T T T C A C G A G A G G A G G A G C T A A G A G T A G T C T C G T T T C C T C C A A C
BssHII
720
H1 transcription startH1 5'UTR
H1 transl. start
H1 coding region
G T A G G C A G C G C G C G A G C C A T T T T T A A C A G A A A A A A A G T G T T C T C A G T G A A A A A A A G A T G T C T G A T T C T G C A G T T G C A A C G
C A T C C G T C G C G C G C T C G G T A A A A A T T G T C T T T T T T T C A C A A G A G T C A C T T T T T T T C T A C A G A C T A A G A C G T C A A C G T T G C
800
H1 coding region
T C C G C T T C C C C A G T G G C T G C C C C A C C A G C G A C A G T T G A G A A G A A A G T G G T C C A A A A A A A G G C A T C T G G A T C T G C T G G C A C
A G G C G A A G G G G T C A C C G A C G G G G T G G T C G C T G T C A A C T C T T C T T T C A C C A G G T T T T T T T C C G T A G A C C T A G A C G A C C G T G
A Koreski et al., Figure S1.CC-BY-NC-ND 4.0 International licenseavailable under a
(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted March 18, 2020. ; https://doi.org/10.1101/2020.03.16.994483doi: bioRxiv preprint
Page 2
AflII
880
H1 coding region
Add AflII
A A A G G C A A A G A A A G C C T C T G C G A C G C C G T C A C A T C C G C C A A C T C A G C A A A T G G T G G A C G C T T C C A T T A A A A A T C T T A A G G
T T T C C G T T T C T T T C G G A G A C G C T G C G G C A G T G T A G G C G G T T G A G T C G T T T A C C A C C T G C G A A G G T A A T T T T T A G A A T T C C
960
H1 coding region
A A C G T G G C G G T T C A T C A C T T C T G G C A A T C A A A A A A T A T A T C A C T G C C A C T T A T A A A T G C G A C G C C C A A A A G T T A G C G C C A
T T G C A C C G C C A A G T A G T G A A G A C C G T T A G T T T T T T A T A T A G T G A C G G T G A A T A T T T A C G C T G C G G G T T T T C A A T C G C G G T
1040
H1 coding region
H1 Tag- Remove ScaI
T T C A T C A A G A A G T A T T T A A A A T C G G C C G T G G T C A A T G G A A A G C T T A T T C A A A C T A A G G G A A A G G G T G C A T C T G G A T C T T T
A A G T A G T T C T T C A T A A A T T T T A G C C G G C A C C A G T T A C C T T T C G A A T A A G T T T G A T T C C C T T T C C C A C G T A G A C C T A G A A A
1120
H1 coding region
Removed BamHI
C A A A C T G T C G G C C T C T G C C A A G A A G G A A A A A G A T C C G A A G G C A A A G T C G A A G G T T T T G T C T G C T G A G A A A A A A G T T C A A A
G T T T G A C A G C C G G A G A C G G T T C T T C C T T T T T C T A G G C T T C C G T T T C A G C T T C C A A A A C A G A C G A C T C T T T T T T C A A G T T T
1200
H1 coding region
G C A A G A A G G T A G C C T C T A A G A A G A T T G G T G T C T C C T C C A A A A A A A C T G C C G T T G G G G C T G C T G A C A A A A A G C C C A A A G C T
C G T T C T T C C A T C G G A G A T T C T T C T A A C C A C A G A G G A G G T T T T T T T G A C G G C A A C C C C G A C G A C T G T T T T T C G G G T T T C G A
1280
H1 coding region
A A G A A G G C T G T G G C T A C C A A A A A G A C T G C C G A A A A T A A G A A A A C T G A G A A G G C A A A A G C C A A G G A T G C C A A G A A A A C T G G
T T C T T C C G A C A C C G A T G G T T T T T C T G A C G G C T T T T A T T C T T T T G A C T C T T C C G T T T T C G G T T C C T A C G G T T C T T T T G A C C
1360
H1 coding region
A A T C A T A A A G T C G A A G C C C G C C G C A A C A A A G G C G A A A G T G A C T G C A G C G A A G C C A A A G G C T G T A A T A G C G A A A G C G T C A A
T T A G T A T T T C A G C T T C G G G C G G C G T T G T T T C C G C T T T C A C T G A C G T C G C T T C G G T T T C C G A C A T T A T C G C T T T C G C A G T T
1440
H1 coding region
A G G C A A A G C C A G C G G T G T C T G C A A A A C C C A A A A A G A C G G T G A A G A A A G C A T C G G T T T C T G C T A C C G C C A A G A A G C C G A A A
T C C G T T T C G G T C G C C A C A G A C G T T T T G G G T T T T T C T G C C A C T T C T T T C G T A G C C A A A G A C G A T G G C G G T T C T T C G G C T T T
HpaI
1520
H1 coding region
H1 transl. stopAdded HpaI
H1 3'UTR [3']
H1 3'UTR [5']
G C G A A G A C T A C G G C T G C C A A A A A G T A A G T T A A C A T T G T G A A A A A G T G C A G T A T T T G G T A C A T G T T C G C A A T T A A A A T T T T
C G C T T C T G A T G C C G A C G G T T T T T C A T T C A A T T G T A A C A C T T T T T C A C G T C A T A A A C C A T G T A C A A G C G T T A A T T T T A A A A
.CC-BY-NC-ND 4.0 International licenseavailable under a(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made
The copyright holder for this preprintthis version posted March 18, 2020. ; https://doi.org/10.1101/2020.03.16.994483doi: bioRxiv preprint
Page 3
PmeI
1600
H1 3'UTR [3'] H1 HDE
Removed BglIIH1 stem loop
A G A T T T A T G T T T T A T A G A C C T G A A A T T T G T T T A A A C A A G T C C T T T T C A G G G C T A C A A C G T T C C G T T G C A A G A G A A A A A A A
T C T A A A T A C A A A A T A T C T G G A C T T T A A A C A A A T T T G T T C A G G A A A A G T C C C G A T G T T G C A A G G C A A C G T T C T C T T T T T T T
BspHIZraI AatII
1680
H1 HDEAdded AatII
C T T T T A T G A C G T C T T T C T T C C A C T T A T T T A T T A G C T G A C G T T C G C A G C A A C A A T A A A A C G T T T C A T G T C A T G A A T T A C A T
G A A A A T A C T G C A G A A A G A A G G T G A A T A A A T A A T C G A C T G C A A G C G T C G T T G T T A T T T T G C A A A G T A C A G T A C T T A A T G T A
1760
T G C A T G T T G G T C G C A T T C A G T T T T C G T T C C C G A T T T T T T T G T A T T T A T T T G A A C A T T A C C C A A T T A C C C A T A T T G C G G G T
A C G T A C A A C C A G C G T A A G T C A A A A G C A A G G G C T A A A A A A A C A T A A A T A A A C T T G T A A T G G G T T A A T G G G T A T A A C G C C C A
BstBI
1840
Add BstBIH2b HDE
H2b 3'UTR
A A A T A A G T T T T A T T T G T A A A T T C A T A T T C G A T G A T T G G T G G T T C G A A T T G A A A A A T G C A T T T C T T T G G T A T A A C A C A T T G
T T T A T T C A A A A T A A A C A T T T A A G T A T A A G C T A C T A A C C A C C A A G C T T A A C T T T T T A C G T A A A G A A A C C A T A T T G T G T A A C
BglII
1920
H2b 3'UTRAdd BglII
H2b transl. stop
H2b coding sequence
H2b stem loop
T G G C C C T G A A A A G G G C C G T T T T G G A T T A T T G T C C G C A T T C G C A G G A G A A A A A G A T C T T T A T T T A G A G C T G G T G T A C T T G G
A C C G G G A C T T T T C C C G G C A A A A C C T A A T A A C A G G C G T A A G C G T C C T C T T T T T C T A G A A A T A A A T C T C G A C C A C A T G A A C C
SphI
2000
H2b coding sequence
T G A C A G C C T T G G T T C C C T C A C T G A C A G C A T G C T T G G C C A A C T C T C C A G G C A A A A G C A G G C G A A C A G C C G T T T G G A T C T C C
A C T G T C G G A A C C A A G G G A G T G A C T G T C G T A C G A A C C G G T T G A G A G G T C C G T T T T C G T C C G C T T G T C G G C A A A C C T A G A G G
NruI
2080
H2b coding sequence
Added NruI
C G A C T G G T G A T G G T C G A G C G C T T G T T G T A G T G A G C T A G T C G C G A C G C T T C G G C A G C A A T T C G C T C G A A A A T A T C A T T T A C
G C T G A C C A C T A C C A G C T C G C G A A C A A C A T C A C T C G A T C A G C G C T G C G A A G C C G T C G T T A A G C G A G C T T T T A T A G T A A A T G
BpuEIBspMIBfuAI
2160
H2b coding sequence
A A A G C T G T T C A T T A T G C T C A T C G C C T T C G A C G A A A T T C C G G T G T C A G G A T G G A C C T G C T T G A G A A C C T T G T A A A T G T A G A
T T T C G A C A A G T A A T A C G A G T A G C G G A A G C T G C T T T A A G G C C A C A G T C C T A C C T G G A C G A A C T C T T G G A A C A T T T A C A T C T
2240
H2b coding sequence
T G G C A T A G C T C T C C T T C C T T T T G C G C T T C T T T T T C T T G T C G G T C T T G G T G A T G T T C T T C T G A G C C T T G C C A G C C T T C T T G
A C C G T A T C G A G A G G A A G G A A A A C G C G A A G A A A A A G A A C A G C C A G A A C C A C T A C A A G A A G A C T C G G A A C G G T C G G A A G A A C
.CC-BY-NC-ND 4.0 International licenseavailable under a(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made
The copyright holder for this preprintthis version posted March 18, 2020. ; https://doi.org/10.1101/2020.03.16.994483doi: bioRxiv preprint
Page 4
PspOMISpeI
2320
H2b coding sequence
H2b transl. start
H2b 5' UTR
H2B transcription start site
G C T G C C T T T C C A C T A G T T T T C G G A G G C A T T G T T C A C G T T A C T T A T A T T T T C A C A A A C A C A A T T C A C T T A T T G T A A T G T G G
C G A C G G A A A G G T G A T C A A A A G C C T C C G T A A C A A G T G C A A T G A A T A T A A A A G T G T T T G T G T T A A G T G A A T A A C A T T A C A C C
FspIApaIBanII
2400
H2b TATAAA box
G C C C G A A C G C G T T C A C A T T T A T A C T T T T T T T C G A G C A G T C A A T T C A G G T C T A A G T C C C C C A C C C C T A A C T G A A T G C G C A G
C G G G C T T G C G C A A G T G T A A A T A T G A A A A A A A G C T C G T C A G T T A A G T C C A G A T T C A G G G G G T G G G G A T T G A C T T A C G C G T C
2480
H2A TATAAA box H2a transcription start siteH2a 5'UTR
G C A A A C G G A A A A G T A T A A A T A T T T C G C T G T C G G G G T T A G G C G A G C A T T C G T G T T C C G T G T G T A A A G T G A A C T A A G T G A A A
C G T T T G C C T T T T C A T A T T T A T A A A G C G A C A G C C C C A A T C C G C T C G T A A G C A C A A G G C A C A C A T T T C A C T T G A T T C A C T T T
XbaI*BmgBI
2560
H2a 5'UTR
H2a transl. start
H2A coding region
Add XbaI
T A A A C G C A A A G C A A A A T G T C T G G A C G T G G A A A A G G T G G C A A A G T G A A G G G A A A G G C A A A G T C T A G A T C A A A C C G T G C C G G
A T T T G C G T T T C G T T T T A C A G A C C T G C A C C T T T T C C A C C G T T T C A C T T C C C T T T C C G T T T C A G A T C T A G T T T G G C A C G G C C
2640
H2A coding region
Removed BspEI
T C T T C A A T T C C C T G T G G G C C G T A T T C A C C G T T T G C T T C G G A A G G G A A A C T A C G C A G A G C G T G T T G G T G C A G G C G C T C C A G
A G A A G T T A A G G G A C A C C C G G C A T A A G T G G C A A A C G A A G C C T T C C C T T T G A T G C G T C T C G C A C A A C C A C G T C C G C G A G G T C
BssS IBbvCI
2720
H2A coding region
T T T A C C T A G C T G C C G T A A T G G A A T A T C T G G C C G C T G A G G T T C T G G A A T T G G C T G G C A A T G C T G C T C G T G A C A A C A A G A A G
A A A T G G A T C G A C G G C A T T A C C T T A T A G A C C G G C G A C T C C A A G A C C T T A A C C G A C C G T T A C G A C G A G C A C T G T T G T T C T T C
2800
H2A coding region
A C T A G A A T T A T T C C G C G T C A T C T G C A A C T G G C C A T C C G C A A C G A C G A G G A G T T A A A C A A G C T G C T C T C C G G C G T C A C A A T
T G A T C T T A A T A A G G C G C A G T A G A C G T T G A C C G G T A G G C G T T G C T G C T C C T C A A T T T G T T C G A C G A G A G G C C G C A G T G T T A
StuIBsu36I Acc65I
KpnI
2880
H2A coding region
H2a transl. stop Add Acc65I to remove 3'UTRH2a 3'UTR
T G C A C A A G G T G G C G T G T T G C C T A A T A T A C A G G C T G T T C T G T T G C C C A A G A A G A C C G A G A A G A A G G C C T A A G G T A C C A C G T
A C G T G T T C C A C C G C A C A A C G G A T T A T A T G T C C G A C A A G A C A A C G G G T T C T T C T G G C T C T T C T T C C G G A T T C C A T G G T G C A
BlpI
2960
H2a 3'UTRH2a HDE
H2a stem loop
T T C A A A G G C T A A G C T A A A A A C C T A C A T G T A C A T A A A A T C G T C A A T C A A A C C G T C C T T T T C A G G A C G A C C A A A T T A T T A C C
A A G T T T C C G A T T C G A T T T T T G G A T G T A C A T G T A T T T T A G C A G T T A G T T T G G C A G G A A A A G T C C T G C T G G T T T A A T A A T G G
.CC-BY-NC-ND 4.0 International licenseavailable under a(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made
The copyright holder for this preprintthis version posted March 18, 2020. ; https://doi.org/10.1101/2020.03.16.994483doi: bioRxiv preprint
Page 5
BspEI
3040
H2a HDEAdd BspEI
A A A G A A T T G A A A A A T T T T T T T C C G G A A G C T T G G C A A T T T C T T G T A A T T A G T A A A T C A T A A A G A A T T A T T A A C G T A A A G A T
T T T C T T A A C T T T T T A A A A A A A G G C C T T C G A A C C G T T A A A G A A C A T T A A T C A T T T A G T A T T T C T T A A T A A T T G C A T T T C T A
3120
G G T A A T G T A G T A A G G G T T T T C T A C T A T A T G C G G T A T A A A C T A T A A T T T G C T T C T T T A A A C A A T C G C A C A C C A C G A T G T G A
C C A T T A C A T C A T T C C C A A A A G A T G A T A T A C G C C A T A T T T G A T A T T A A A C G A A G A A A T T T G T T A G C G T G T G G T G C T A C A C T
3200
T G C T G T A C A T G C G G T G T C T G A A A C C A T T T G T A C A G T C T G T A C A A A T C C A T G T T A G A A A T A C A C A T T C T A T T T G A A A G A G T
A C G A C A T G T A C G C C A C A G A C T T T G G T A A A C A T G T C A G A C A T G T T T A G G T A C A A T C T T T A T G T G T A A G A T A A A C T T T C T C A
AgeI
3280
AgeI Added H4 HDE
A C G A A C G A C A G A C A T T T A T T T T T A G T T T A A C A T A T T T T T T G G G A G T C C C G A C C A A T A A A A T T A A A T A C T T T A C C G G T T T G
T G C T T G C T G T C T G T A A A T A A A A A T C A A A T T G T A T A A A A A A C C C T C A G G G C T G G T T A T T T T A A T T T A T G A A A T G G C C A A A C
3360
H4 HDE H4 3'UTR
H4 stem loop
A A A A T C T T C C T C C T T T T A A A A A C T G A A T G G T G G T C C T G A A A A G G A C C G A T T G C T T A A T A G G G G T A C A C A G G A T G T A C A C T
T T T T A G A A G G A G G A A A A T T T T T G A C T T A C C A C C A G G A C T T T T C C T G G C T A A C G A A T T A T C C C C A T G T G T C C T A C A T G T G A
BclI*
3440
H4 3'UTR Add BclIH4 transl. stop [3'] [3']
H4 coding region
H4 Tag Removed NcoI [*]
T T T G A T C A T T A A C C G C C A A A T C C G T A G A G G G T G C G G C C T T G C C T C T T C A G A G C G T A C A C A A C A T C C A T G G C T G T A A C T G T
A A A C T A G T A A T T G G C G G T T T A G G C A T C T C C C A C G C C G G A A C G G A G A A G T C T C G C A T G T G T T G T A G G T A C C G A C A T T G A C A
3520
H4 coding region
C T T C C T C T T G G C G T G T T C C G T G T A G G T C A C G G C A T C A C G A A T T A C G T T C T C C A A G A A A A C C T T C A G A A C G C C A C G C G T T T
G A A G G A G A A C C G C A C A A G G C A C A T C C A G T G C C G T A G T G C T T A A T G C A A G A G G T T C T T T T G G A A G T C T T G C G G T G C G C A A A
NgoMIV NaeI
3600
H4 coding region
Added NaeI
C C T C G T A T A T G A G T C C A G A T A T G C G C T T C A C A C C G C C T C G A C G G G C C A A A C G G C G G A T A G C C G G C T T C G T G A T A C C T T G G
G G A G C A T A T A C T C A G G T C T A T A C G C G A A G T G T G G C G G A G C T G C C C G G T T T G C C G C C T A T C G G C C G A A G C A C T A T G G A A C C
3680
H4 coding regionH4 [*]
H4 transl. start [3']
A T G T T A T C A C G C A G C A C T T T G C G A T G A C G C T T G G C G C C A C C C T T T C C C A A G C C T T T G C C T C C T T T A C C A C G A C C A G T C A T
T A C A A T A G T G C G T C G T G A A A C G C T A C T G C G A A C C G C G G T G G G A A A G G G T T C G G A A A C G G A G G A A A T G G T G C T G G T C A G T A
3760
H4 5' UTRH4 transcription start site
T T T T C A C T G T T C T A T A C T A T T A T A C A C G C A C A G C A C G A A A G T C A C T A A A G A A C T A A T T T C A A C G T T T C T G T G T G C C C C T A
A A A A G T G A C A A G A T A T G A T A A T A T G T G C G T G T C G T G C T T T C A G T G A T T T C T T G A T T A A A G T T G C A A A G A C A C A C G G G G A T
.CC-BY-NC-ND 4.0 International licenseavailable under a(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made
The copyright holder for this preprintthis version posted March 18, 2020. ; https://doi.org/10.1101/2020.03.16.994483doi: bioRxiv preprint
Page 6
3840
H4 TATAAA box short GA repeat
T T T A T A G G T A A A A C G A C A A A A A C C C G A G A G A G T A C G A A C G A T A T G T T C G T T C G C T T T T C G C T C G T C A A A T G A A A T G G C C T
A A A T A T C C A T T T T G C T G T T T T T G G G C T C T C T C A T G C T T G C T A T A C A A G C A A G C G A A A A G C G A G C A G T T T A C T T T A C C G G A
3920
long GA repeatH3 TATAA box
C T G T T T C T C T C T C T C T C T C T C T C T C T C T C T T T C A C C C T C C A C G A T T G C T A T A T A A G T A G G T A G C A A A T G C T C T G A T C G T T
G A C A A A G A G A G A G A G A G A G A G A G A G A G A G A A A G T G G G A G G T G C T A A C G A T A T A T T C A T C C A T C G T T T A C G A G A C T A G C A A
XcmI
4000
H4 transcription start site
H3 5'UTR
H3 transl. start [3']
H3 coding region
T A T T G T G T T T T C A A A C G T G A A G T A G T G A A C G T G A A C T T T A G T G A A A C C C A A A T C G G A G A T G G C T C G T A C C A A G C A A A C T G
A T A A C A C A A A A G T T T G C A C T T C A T C A C T T G C A C T T G A A A T C A C T T T G G G T T T A G C C T C T A C C G A G C A T G G T T C G T T T G A C
BsrBISacII
4080
H3 coding region
Added SacII
C T C G C A A A T C G A C T G G T G G A A A G G C G C C G C G G A A A C A A C T G G C T A C T A A G G C C G C T C G C A A G A G T G C T C C A G C C A C C G G A
G A G C G T T T A G C T G A C C A C C T T T C C G C G G C G C C T T T G T T G A C C G A T G A T T C C G G C G A G C G T T C T C A C G A G G T C G G T G G C C T
4160
H3 coding region
G G T G T G A A G A A G C C C C A C C G C T A T C G C C C T G G A A C C G T G G C C T T G C G T G A A A T T C G T C G C T A C C A A A A G A G C A C C G A G C T
C C A C A C T T C T T C G G G G T G G C G A T A G C G G G A C C T T G G C A C C G G A A C G C A C T T T A A G C A G C G A T G G T T T T C T C G T G G C T C G A
PflMI
4240
H3 coding region
H3 Tag Remove SacI
T C T A A T C C G C A A G C T G C C T T T C C A G C G T C T G G T G C G T G A A A T C G C T C A G G A C T T T A A G A C G G A C T T G C G A T T C C A G A G T T
A G A T T A G G C G T T C G A C G G A A A G G T C G C A G A C C A C G C A C T T T A G C G A G T C C T G A A A T T C T G C C T G A A C G C T A A G G T C T C A A
BsaISexAI*
4320
H3 coding region
H3 Tag Remove SacI Remove BstBI
C G G C G G T T A T G G C T C T G C A G G A A G C T A G C G A A G C C T A C C T G G T T G G T C T C T T T G A A G A T A C C A A C T T G T G T G C C A T T C A T
G C C G C C A A T A C C G A G A C G T C C T T C G A T C G C T T C G G A T G G A C C A A C C A G A G A A A C T T C T A T G G T T G A A C A C A C G G T A A G T A
BsiWI
4400
H3 coding region
H3 transl. stopBsiWI Added H3 3'UTR
G C C A A G C G T G T C A C C A T A A T G C C C A A A G A C A T C C A G T T A G C G C G A C G C A T T C G C G G C G A G C G T G C T T A A C G T A C G G C T G A
C G G T T C G C A C A G T G G T A T T A C G G G T T T C T G T A G G T C A A T C G C G C T G C G T A A G C G C C G C T C G C A C G A A T T G C A T G C C G A C T
4480
H3 3'UTR H3 HDE
H3 stem loop
C A C G G C A T T A A C T T G C A G A T A A A G C G C T A G C G T A C T C T A T A A T C G G T C C T T T T C A G G A C C A C A A A C C A G A T T C A A T G A G A
G T G C C G T A A T T G A A C G T C T A T T T C G C G A T C G C A T G A G A T A T T A G C C A G G A A A A G T C C T G G T G T T T G G T C T A A G T T A C T C T
.CC-BY-NC-ND 4.0 International licenseavailable under a(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made
The copyright holder for this preprintthis version posted March 18, 2020. ; https://doi.org/10.1101/2020.03.16.994483doi: bioRxiv preprint
Page 7
4560
H3 HDENcoI added
H3-H1 intergenic region
T A A A A T T T T C T G T T C C A T G G G C C G A C T A T T T A T A A C T T T A A A A A A A A A A T A A G A A C A A A A T T C A T A T T C T A T T A T T T A T G
A T T T T A A A A G A C A A G G T A C C C G G C T G A T A A A T A T T G A A A T T T T T T T T T T A T T C T T G T T T T A A G T A T A A G A T A A T A A A T A C
4640
H3-H1 intergenic region
G C G C A A A C G G T A C T G G G T C T T A A A T C A T A T G T A A A A A T A C T A A T T C T G C C A G A G A A G G A A T A A A A A T A A T C T T A T T T T A A
C G C G T T T G C C A T G A C C C A G A A T T T A G T A T A C A T T T T T A T G A T T A A G A C G G T C T C T T C C T T A T T T T T A T T A G A A T A A A A T T
4720
H3-H1 intergenic region
T T G T C A G C T C A A C A T T T A T T A A A T T A A A G A A G A G G T T A A T A C A A A A A T A T A T A T T T T T A T T T G T T C T T T G T G C G A A C A T C
A A C A G T C G A G T T G T A A A T A A T T T A A T T T C T T C T C C A A T T A T G T T T T T A T A T A T A A A A A T A A A C A A G A A A C A C G C T T G T A G
4800
H3-H1 intergenic region
C T T T A A A G C A G T G A A A G T G T C G T G C G G G G C A A G G G A C T C T G A A C C T T A A A C A T C T A A A A A A A A A A T C T G A A T T C T G T G T A
G A A A T T T C G T C A C T T T C A C A G C A C G C C C C G T T C C C T G A G A C T T G G A A T T T G T A G A T T T T T T T T T T A G A C T T A A G A C A C A T
4880
H3-H1 intergenic region
A G A C A G T T T G A A A T T A A T G A A A T T A C A T T G A T G A C G G C A A T A T T T A T A A A A T A A C A G A A A A T A A T A A A A T A A A A C T A G C T
T C T G T C A A A C T T T A A T T A C T T T A A T G T A A C T A C T G C C G T T A T A A A T A T T T T A T T G T C T T T T A T T A T T T T A T T T T G A T C G A
4960
H3-H1 intergenic region
Removed HpaI Removed BsiWI
A T T T T A T A T T T T T T C C A T G T G C T A A C T G A A G A A T G T G T T A T T A T T G A A G A G G T C G T A T G G G A C A A T T G A C A C T G T C C C T T
T A A A A T A T A A A A A A G G T A C A C G A T T G A C T T C T T A C A C A A T A A T A A C T T C T C C A G C A T A C C C T G T T A A C T G T G A C A G G G A A
5040
H3-H1 intergenic region
C A A A C G T C T G T A A A A A A T A A A A C C T A T G T A A A A T T C A G C A C G G A A A T T G G C T A A T T T T G T T G C G G A A T G T A A T A T A T A T T
G T T T G C A G A C A T T T T T T A T T T T G G A T A C A T T T T A A G T C G T G C C T T T A A C C G A T T A A A A C A A C G C C T T A C A T T A T A T A T A A
PaeR7IXhoI
5120
H3-H1 intergenic region
A C A T A A T A A A G G A T A A T A C A A A A A T T G T T T C T T T T T A T T T T T T A T T T G A T T T A T T T A T T T G A C T A C A T A G A C G G C T C G A G
T G T A T T A T T T C C T A T T A T G T T T T T A A C A A A G A A A A A T A A A A A A T A A A C T A A A T A A A T A A A C T G A T G T A T C T G C C G A G C T C
NotI
3
5
5140
A A G G G C G A A T T C G C G G C C G C
T T C C C G C T T A A G C G C C G G C G
.CC-BY-NC-ND 4.0 International licenseavailable under a(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made
The copyright holder for this preprintthis version posted March 18, 2020. ; https://doi.org/10.1101/2020.03.16.994483doi: bioRxiv preprint
Page 1
BspDIClaI
SalI AccI
5
3
H1-H3 intergenic region
G T C G A C T A A T G C A T A T G T G G C G A G G A A A T C G A T T G A T T T C A G A A C A A A T T A T T T T A A A A T A T G C A T G A A A A C A C A T T A A T
C A G C T G A T T A C G T A T A C A C C G C T C C T T T A G C T A A C T A A A G T C T T G T T T A A T A A A A T T T T A T A C G T A C T T T T G T G T A A T T A
H1-H3 intergenic region
A A C A A G C A A A C A C A T T A A T A A T T T A A G A A A A T A T T A T T T A T T A T A T T A A T A T T A T G T T A T T T A A G A A A G T A T C T G T A T T T
T T G T T C G T T T G T G T A A T T A T T A A A T T C T T T T A T A A T A A A T A A T A T A A T T A T A A T A C A A T A A A T T C T T T C A T A G A C A T A A A
PvuI
H1-H3 intergenic region
T T A A C G A T C G A A A A T T A T T T C T G A A T G C T G C T T T A A A G C A A A T T T T T C T G T A G T T C A A T G T G A A C T T A A A T C A A G T A A T A
A A T T G C T A G C T T T T A A T A A A G A C T T A C G A C G A A A T T T C G T T T A A A A A G A C A T C A A G T T A C A C T T G A A T T T A G T T C A T T A T
PacI
H1-H3 intergenic region
A A G T A T C T T A A T T A A T A A T A G A C G C T T C T T T T A G A A G C C C T T C A A G G G G A T G A A C G T T T C A A T T T T A A T A A A C A T T A C G A
T T C A T A G A A T T A A T T A T T A T C T G C G A A G A A A A T C T T C G G G A A G T T C C C C T A C T T G C A A A G T T A A A A T T A T T T G T A A T G C T
H1-H3 intergenic region
A T T A G T G A A A T A T T T G C A A T G A T T C T T A T T T T A T T A G A T G T T T T T T T A T A A A T T G G T C C A G T T A A A A A T T T G A A T A T A A A
T A A T C A C T T T A T A A A C G T T A C T A A G A A T A A A A T A A T C T A C A A A A A A A T A T T T A A C C A G G T C A A T T T T T A A A C T T A T A T T T
H1-H3 intergenic region
A A T T C A A T C A A C A T T T G A A A G T C T C A A A A C C C A T A T T A C A T C C T T T T A A A A A T G G A A A G T G A C G A A A A A A T T A T T T A A A A
T T A A G T T A G T T G T A A A C T T T C A G A G T T T T G G G T A T A A T G T A G G A A A A T T T T T A C C T T T C A C T G C T T T T T T A A T A A A T T T T
H1-H3 intergenic region
G T G T A G A A C T A T T A A A A C C T T A T T T T T A T T A A T G A T T T A A A A T A C T A A A A A A T T A A A A A A A G T T T A C A C T T C A A G C A A A C
C A C A T C T T G A T A A T T T T G G A A T A A A A A T A A T T A C T A A A T T T T A T G A T T T T T T A A T T T T T T T C A A A T G T G A A G T T C G T T T G
BsaBIPshAI
H1-H3 intergenic region H1 core promoter
T T C G A C A T A G T A A A T G A C T G A T G T C A G T A G C A T T G T T A A A G T G C T C T C C T C C T C G A T T C T C A T C A G A G C A A A G G A G G T T G
A A G C T G T A T C A T T T A C T G A C T A C A G T C A T C G T A A C A A T T T C A C G A G A G G A G G A G C T A A G A G T A G T C T C G T T T C C T C C A A C
BssHII
H1 transcription startH1 5'UTR
H1 transl. start
H1 coding region
G T A G G C A G C G C G C G A G C C A T T T T T A A C A G A A A A A A A G T G T T C T C A G T G A A A A A A A G A T G T C T G A T T C T G C A G T T G C A A C G
C A T C C G T C G C G C G C T C G G T A A A A A T T G T C T T T T T T T C A C A A G A G T C A C T T T T T T T C T A C A G A C T A A G A C G T C A A C G T T G C
H1 coding region
T C C G C T T C C C C A G T G G C T G C C C C A C C A G C G A C A G T T G A G A A G A A A G T G G T C C A A A A A A A G G C A T C T G G A T C T G C T G G C A C
A G G C G A A G G G G T C A C C G A C G G G G T G G T C G C T G T C A A C T C T T C T T T C A C C A G G T T T T T T T C C G T A G A C C T A G A C G A C C G T G
B Koreski et al., Figure S1
80
160
240
320
400
480
560
640
720
800
.CC-BY-NC-ND 4.0 International licenseavailable under a(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made
The copyright holder for this preprintthis version posted March 18, 2020. ; https://doi.org/10.1101/2020.03.16.994483doi: bioRxiv preprint
Page 2
AflII
H1 coding region
Add AflII
A A A G G C A A A G A A A G C C T C T G C G A C G C C G T C A C A T C C G C C A A C T C A G C A A A T G G T G G A C G C T T C C A T T A A A A A T C T T A A G G
T T T C C G T T T C T T T C G G A G A C G C T G C G G C A G T G T A G G C G G T T G A G T C G T T T A C C A C C T G C G A A G G T A A T T T T T A G A A T T C C
H1 coding region
A A C G T G G C G G T T C A T C A C T T C T G G C A A T C A A A A A A T A T A T C A C T G C C A C T T A T A A A T G C G A C G C C C A A A A G T T A G C G C C A
T T G C A C C G C C A A G T A G T G A A G A C C G T T A G T T T T T T A T A T A G T G A C G G T G A A T A T T T A C G C T G C G G G T T T T C A A T C G C G G T
H1 coding region
H1 Tag- Remove ScaI
T T C A T C A A G A A G T A T T T A A A A T C G G C C G T G G T C A A T G G A A A G C T T A T T C A A A C T A A G G G A A A G G G T G C A T C T G G A T C T T T
A A G T A G T T C T T C A T A A A T T T T A G C C G G C A C C A G T T A C C T T T C G A A T A A G T T T G A T T C C C T T T C C C A C G T A G A C C T A G A A A
H1 coding region
Removed BamHI
C A A A C T G T C G G C C T C T G C C A A G A A G G A A A A A G A T C C G A A G G C A A A G T C G A A G G T T T T G T C T G C T G A G A A A A A A G T T C A A A
G T T T G A C A G C C G G A G A C G G T T C T T C C T T T T T C T A G G C T T C C G T T T C A G C T T C C A A A A C A G A C G A C T C T T T T T T C A A G T T T
H1 coding region
G C A A G A A G G T A G C C T C T A A G A A G A T T G G T G T C T C C T C C A A A A A A A C T G C C G T T G G G G C T G C T G A C A A A A A G C C C A A A G C T
C G T T C T T C C A T C G G A G A T T C T T C T A A C C A C A G A G G A G G T T T T T T T G A C G G C A A C C C C G A C G A C T G T T T T T C G G G T T T C G A
H1 coding region
A A G A A G G C T G T G G C T A C C A A A A A G A C T G C C G A A A A T A A G A A A A C T G A G A A G G C A A A A G C C A A G G A T G C C A A G A A A A C T G G
T T C T T C C G A C A C C G A T G G T T T T T C T G A C G G C T T T T A T T C T T T T G A C T C T T C C G T T T T C G G T T C C T A C G G T T C T T T T G A C C
H1 coding region
A A T C A T A A A G T C G A A G C C C G C C G C A A C A A A G G C G A A A G T G A C T G C A G C G A A G C C A A A G G C T G T A A T A G C G A A A G C G T C A A
T T A G T A T T T C A G C T T C G G G C G G C G T T G T T T C C G C T T T C A C T G A C G T C G C T T C G G T T T C C G A C A T T A T C G C T T T C G C A G T T
H1 coding region
A G G C A A A G C C A G C G G T G T C T G C A A A A C C C A A A A A G A C G G T G A A G A A A G C A T C G G T T T C T G C T A C C G C C A A G A A G C C G A A A
T C C G T T T C G G T C G C C A C A G A C G T T T T G G G T T T T T C T G C C A C T T C T T T C G T A G C C A A A G A C G A T G G C G G T T C T T C G G C T T T
HpaI
H1 coding region
H1 transl. stopAdded HpaI
H1 3'UTR [3']
H1 3'UTR [5']
G C G A A G A C T A C G G C T G C C A A A A A G T A A G T T A A C A T T G T G A A A A A G T G C A G T A T T T G G T A C A T G T T C G C A A T T A A A A T T T T
C G C T T C T G A T G C C G A C G G T T T T T C A T T C A A T T G T A A C A C T T T T T C A C G T C A T A A A C C A T G T A C A A G C G T T A A T T T T A A A A
880
960
1040
1120
1200
1280
1360
1440
1520
.CC-BY-NC-ND 4.0 International licenseavailable under a(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made
The copyright holder for this preprintthis version posted March 18, 2020. ; https://doi.org/10.1101/2020.03.16.994483doi: bioRxiv preprint
Page 3
PmeI
1600
H1 3'UTR [3'] H1 HDE
Removed BglIIH1 stem loop
A G A T T T A T G T T T T A T A G A C C T G A A A T T T G T T T A A A C A A G T C C T T T T C A G G G C T A C A A C G T T C C G T T G C A A G A G A A A A A A A
T C T A A A T A C A A A A T A T C T G G A C T T T A A A C A A A T T T G T T C A G G A A A A G T C C C G A T G T T G C A A G G C A A C G T T C T C T T T T T T T
BspHIZraI AatII
1680
H1 HDEAdded AatII
C T T T T A T G A C G T C T T T C T T C C A C T T A T T T A T T A G C T G A C G T T C G C A G C A A C A A T A A A A C G T T T C A T G T C A T G A A T T A C A T
G A A A A T A C T G C A G A A A G A A G G T G A A T A A A T A A T C G A C T G C A A G C G T C G T T G T T A T T T T G C A A A G T A C A G T A C T T A A T G T A
1760
T G C A T G T T G G T C G C A T T C A G T T T T C G T T C C C G A T T T T T T T G T A T T T A T T T G A A C A T T A C C C A A T T A C C C A T A T T G C G G G T
A C G T A C A A C C A G C G T A A G T C A A A A G C A A G G G C T A A A A A A A C A T A A A T A A A C T T G T A A T G G G T T A A T G G G T A T A A C G C C C A
BstBI
1840
Add BstBIH2b HDE
H2b 3'UTR
A A A T A A G T T T T A T T T G T A A A T T C A T A T T C G A T G A T T G G T G G T T C G A A T T G A A A A A T G C A T T T C T T T G G T A T A A C A C A T T G
T T T A T T C A A A A T A A A C A T T T A A G T A T A A G C T A C T A A C C A C C A A G C T T A A C T T T T T A C G T A A A G A A A C C A T A T T G T G T A A C
BglII
1920
H2b 3'UTRAdd BglII
H2b transl. stop
H2b coding sequence
H2b stem loop
T G G C C C T G A A A A G G G C C G T T T T G G A T T A T T G T C C G C A T T C G C A G G A G A A A A A G A T C T T T A T T T A G A G C T G G T G T A C T T G G
A C C G G G A C T T T T C C C G G C A A A A C C T A A T A A C A G G C G T A A G C G T C C T C T T T T T C T A G A A A T A A A T C T C G A C C A C A T G A A C C
SphI
2000
H2b coding sequence
T G A C A G C C T T G G T T C C C T C A C T G A C A G C A T G C T T G G C C A A C T C T C C A G G C A A A A G C A G G C G A A C A G C C G T T T G G A T C T C C
A C T G T C G G A A C C A A G G G A G T G A C T G T C G T A C G A A C C G G T T G A G A G G T C C G T T T T C G T C C G C T T G T C G G C A A A C C T A G A G G
NruI
2080
H2b coding sequence
Added NruI
C G A C T G G T G A T G G T C G A G C G C T T G T T G T A G T G A G C T A G T C G C G A C G C T T C G G C A G C A A T T C G C T C G A A A A T A T C A T T T A C
G C T G A C C A C T A C C A G C T C G C G A A C A A C A T C A C T C G A T C A G C G C T G C G A A G C C G T C G T T A A G C G A G C T T T T A T A G T A A A T G
BpuEIBspMIBfuAI
2160
H2b coding sequence
A A A G C T G T T C A T T A T G C T C A T C G C C T T C G A C G A A A T T C C G G T G T C A G G A T G G A C C T G C T T G A G A A C C T T G T A A A T G T A G A
T T T C G A C A A G T A A T A C G A G T A G C G G A A G C T G C T T T A A G G C C A C A G T C C T A C C T G G A C G A A C T C T T G G A A C A T T T A C A T C T
2240
H2b coding sequence
T G G C A T A G C T C T C C T T C C T T T T G C G C T T C T T T T T C T T G T C G G T C T T G G T G A T G T T C T T C T G A G C C T T G C C A G C C T T C T T G
A C C G T A T C G A G A G G A A G G A A A A C G C G A A G A A A A A G A A C A G C C A G A A C C A C T A C A A G A A G A C T C G G A A C G G T C G G A A G A A C
.CC-BY-NC-ND 4.0 International licenseavailable under a(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made
The copyright holder for this preprintthis version posted March 18, 2020. ; https://doi.org/10.1101/2020.03.16.994483doi: bioRxiv preprint
Page 4
SpeI
2320
H2b coding sequence
H2b transl. start
H2b 5' UTR
H2B transcription start site
G C T G C C T T T C C A C T A G T T T T C G G A G G C A T T G T T C A C G T T A C T T A T A T T T T C A C A A A C A C A A T T C A C T T A T T G T A A T G T G G
C G A C G G A A A G G T G A T C A A A A G C C T C C G T A A C A A G T G C A A T G A A T A T A A A A G T G T T T G T G T T A A G T G A A T A A C A T T A C A C C
2400
H2b TATAAA box
G C C C G A A C G C G T T C A C A T T T A T A C T T T T T T T C G A G C A G T C A A T T C A G G T C T A A G T C C C C C A C C C C T A A C T G A A T G C G C A G
C G G G C T T G C G C A A G T G T A A A T A T G A A A A A A A G C T C G T C A G T T A A G T C C A G A T T C A G G G G G T G G G G A T T G A C T T A C G C G T C
2480
H2A TATAAA box H2a transcription start siteH2a 5'UTR
G C A A A C G G A A A A G T A T A A A T A T T T C G C T G T C G G G G T T A G G C G A G C A T T C G T G T T C C G T G T G T A A A G T G A A C T A A G T G A A A
C G T T T G C C T T T T C A T A T T T A T A A A G C G A C A G C C C C A A T C C G C T C G T A A G C A C A A G G C A C A C A T T T C A C T T G A T T C A C T T T
XbaI*BmgBI
2560
H2a 5'UTR
H2a transl. start
H2A coding region
Add XbaI
T A A A C G C A A A G C A A A A T G T C T G G A C G T G G A A A A G G T G G C A A A G T G A A G G G A A A G G C A A A G T C T A G A T C A A A C C G T G C C G G
A T T T G C G T T T C G T T T T A C A G A C C T G C A C C T T T T C C A C C G T T T C A C T T C C C T T T C C G T T T C A G A T C T A G T T T G G C A C G G C C
2640
H2A coding region
Removed BspEI
T C T T C A A T T C C C T G T G G G C C G T A T T C A C C G T T T G C T T C G G A A G G G A A A C T A C G C A G A G C G T G T T G G T G C A G G C G C T C C A G
A G A A G T T A A G G G A C A C C C G G C A T A A G T G G C A A A C G A A G C C T T C C C T T T G A T G C G T C T C G C A C A A C C A C G T C C G C G A G G T C
BssS IBbvCI
2720
H2A coding region
T T T A C C T A G C T G C C G T A A T G G A A T A T C T G G C C G C T G A G G T T C T G G A A T T G G C T G G C A A T G C T G C T C G T G A C A A C A A G A A G
A A A T G G A T C G A C G G C A T T A C C T T A T A G A C C G G C G A C T C C A A G A C C T T A A C C G A C C G T T A C G A C G A G C A C T G T T G T T C T T C
2800
H2A coding region
A C T A G A A T T A T T C C G C G T C A T C T G C A A C T G G C C A T C C G C A A C G A C G A G G A G T T A A A C A A G C T G C T C T C C G G C G T C A C A A T
T G A T C T T A A T A A G G C G C A G T A G A C G T T G A C C G G T A G G C G T T G C T G C T C C T C A A T T T G T T C G A C G A G A G G C C G C A G T G T T A
StuIBsu36I Acc65I
KpnI
2880
H2A coding region
H2a transl. stop Add Acc65I to remove 3'UTRH2a 3'UTR
T G C A C A A G G T G G C G T G T T G C C T A A T A T A C A G G C T G T T C T G T T G C C C A A G A A G A C C G A G A A G A A G G C C T A A G G T A C C A C G T
A C G T G T T C C A C C G C A C A A C G G A T T A T A T G T C C G A C A A G A C A A C G G G T T C T T C T G G C T C T T C T T C C G G A T T C C A T G G T G C A
BlpI
2960
H2a 3'UTRH2a HDE
H2a stem loop
T T C A A A G G C T A A G C T A A A A A C C T A C A T G T A C A T A A A A T C G T C A A T C A A A C C G T C C T T T T C A G G A C G A C C A A A T T A T T A C C
A A G T T T C C G A T T C G A T T T T T G G A T G T A C A T G T A T T T T A G C A G T T A G T T T G G C A G G A A A A G T C C T G C T G G T T T A A T A A T G G
.CC-BY-NC-ND 4.0 International licenseavailable under a(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made
The copyright holder for this preprintthis version posted March 18, 2020. ; https://doi.org/10.1101/2020.03.16.994483doi: bioRxiv preprint
Page 5
BspEI
3040
H2a HDEAdd BspEI
A A A G A A T T G A A A A A T T T T T T T C C G G A A G C T T G G C A A T T T C T T G T A A T T A G T A A A T C A T A A A G A A T T A T T A A C G T A A A G A T
T T T C T T A A C T T T T T A A A A A A A G G C C T T C G A A C C G T T A A A G A A C A T T A A T C A T T T A G T A T T T C T T A A T A A T T G C A T T T C T A
3120
G G T A A T G T A G T A A G G G T T T T C T A C T A T A T G C G G T A T A A A C T A T A A T T T G C T T C T T T A A A C A A T C G C A C A C C A C G A T G T G A
C C A T T A C A T C A T T C C C A A A A G A T G A T A T A C G C C A T A T T T G A T A T T A A A C G A A G A A A T T T G T T A G C G T G T G G T G C T A C A C T
3200
T G C T G T A C A T G C G G T G T C T G A A A C C A T T T G T A C A G T C T G T A C A A A T C C A T G T T A G A A A T A C A C A T T C T A T T T G A A A G A G T
A C G A C A T G T A C G C C A C A G A C T T T G G T A A A C A T G T C A G A C A T G T T T A G G T A C A A T C T T T A T G T G T A A G A T A A A C T T T C T C A
AgeI
3280
AgeI Added H4 HDE
A C G A A C G A C A G A C A T T T A T T T T T A G T T T A A C A T A T T T T T T G G G A G T C C C G A C C A A T A A A A T T A A A T A C T T T A C C G G T T T G
T G C T T G C T G T C T G T A A A T A A A A A T C A A A T T G T A T A A A A A A C C C T C A G G G C T G G T T A T T T T A A T T T A T G A A A T G G C C A A A C
3360
H4 HDE H4 3'UTR
H4 stem loop
A A A A T C T T C C T C C T T T T A A A A A C T G A A T G G T G G T C C T G A A A A G G A C C G A T T G C T T A A T A G G G G T A C A C A G G A T G T A C A C T
T T T T A G A A G G A G G A A A A T T T T T G A C T T A C C A C C A G G A C T T T T C C T G G C T A A C G A A T T A T C C C C A T G T G T C C T A C A T G T G A
BclI*
3440
H4 3'UTR Add BclIH4 transl. stop [3'] [3']
H4 coding region
T T T G A T C A T T A A C C G C C A A A T C C G T A G A G G G T G C G G C C T T G C C T C T T C A G A G C G T A C A C A A C A T C C A T G G C T G T A A C T G T
A A A C T A G T A A T T G G C G G T T T A G G C A T C T C C C A C G C C G G A A C G G A G A A G T C T C G C A T G T G T T G T A G G T A C C G A C A T T G A C A
3520
H4 coding region
C T T C C T C T T G G C G T G T T C C G T G T A G G T C A C G G C A T C A C G A A T T A C G T T C T C C A A G A A A A C C T T C A G A A C G C C A C G C G T T T
G A A G G A G A A C C G C A C A A G G C A C A T C C A G T G C C G T A G T G C T T A A T G C A A G A G G T T C T T T T G G A A G T C T T G C G G T G C G C A A A
NgoMIV NaeI
3600
H4 coding region
Added NaeI
C C T C G T A T A T G A G T C C A G A T A T G C G C T T C A C A C C G C C T C G A C G G G C C A A A C G G C G G A T A G C C G G C T T C G T G A T A C C T T G G
G G A G C A T A T A C T C A G G T C T A T A C G C G A A G T G T G G C G G A G C T G C C C G G T T T G C C G C C T A T C G G C C G A A G C A C T A T G G A A C C
3680
H4 coding regionH4 [*]
H4 transl. start [3']
A T G T T A T C A C G C A G C A C T T T G C G A T G A C G C T T G G C G C C A C C C T T T C C C A A G C C T T T G C C T C C T T T A C C A C G A C C A G T C A T
T A C A A T A G T G C G T C G T G A A A C G C T A C T G C G A A C C G C G G T G G G A A A G G G T T C G G A A A C G G A G G A A A T G G T G C T G G T C A G T A
3760
H2b 5' UTRH2B transcription start site H2b TATAAA box
G T T C A C G T T A C T T A T A T T T T C A C A A A C A C A A T T C A C T T A T T G T A A T G T G G G C C C G A A C G C G T T C A C A T T T A T A C T T T T T T
C A A G T G C A A T G A A T A T A A A A G T G T T T G T G T T A A G T G A A T A A C A T T A C A C C C G G G C T T G C G C A A G T G T A A A T A T G A A A A A A
.CC-BY-NC-ND 4.0 International licenseavailable under a(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made
The copyright holder for this preprintthis version posted March 18, 2020. ; https://doi.org/10.1101/2020.03.16.994483doi: bioRxiv preprint
Page 6
3840
H2A TATAAA box
T C G A G C A G T C A A T T C A G G T C T A A G T C C C C C A C C C C T A A C T G A A T G C G C A G G C A A A C G G A A A A G T A T A A A T A T T T C G C T G T
A G C T C G T C A G T T A A G T C C A G A T T C A G G G G G T G G G G A T T G A C T T A C G C G T C C G T T T G C C T T T T C A T A T T T A T A A A G C G A C A
3920
H2a transcription start siteH2a 5'UTR
H3 transl. start [3']
H3 coding region
C G G G G T T A G G C G A G C A T T C G T G T T C C G T G T G T A A A G T G A A C T A A G T G A A A T A A A C G C A A A G C A A A A T G G C T C G T A C C A A G
G C C C C A A T C C G C T C G T A A G C A C A A G G C A C A C A T T T C A C T T G A T T C A C T T T A T T T G C G T T T C G T T T T A C C G A G C A T G G T T C
BsrBISacII
4000
H3 coding region
Added SacII
C A A A C T G C T C G C A A A T C G A C T G G T G G A A A G G C G C C G C G G A A A C A A C T G G C T A C T A A G G C C G C T C G C A A G A G T G C T C C A G C
G T T T G A C G A G C G T T T A G C T G A C C A C C T T T C C G C G G C G C C T T T G T T G A C C G A T G A T T C C G G C G A G C G T T C T C A C G A G G T C G
4080
H3 coding region
C A C C G G A G G T G T G A A G A A G C C C C A C C G C T A T C G C C C T G G A A C C G T G G C C T T G C G T G A A A T T C G T C G C T A C C A A A A G A G C A
G T G G C C T C C A C A C T T C T T C G G G G T G G C G A T A G C G G G A C C T T G G C A C C G G A A C G C A C T T T A A G C A G C G A T G G T T T T C T C G T
PflMI
4160
H3 coding region
C C G A G C T T C T A A T C C G C A A G C T G C C T T T C C A G C G T C T G G T G C G T G A A A T C G C T C A G G A C T T T A A G A C G G A C T T G C G A T T C
G G C T C G A A G A T T A G G C G T T C G A C G G A A A G G T C G C A G A C C A C G C A C T T T A G C G A G T C C T G A A A T T C T G C C T G A A C G C T A A G
BsaISexAI*
4240
H3 coding region
H3 Tag Remove SacI Remove BstBI
C A G A G T T C G G C G G T T A T G G C T C T G C A G G A A G C T A G C G A A G C C T A C C T G G T T G G T C T C T T T G A A G A T A C C A A C T T G T G T G C
G T C T C A A G C C G C C A A T A C C G A G A C G T C C T T C G A T C G C T T C G G A T G G A C C A A C C A G A G A A A C T T C T A T G G T T G A A C A C A C G
BsiWI
4320
H3 coding region
H3 transl. stopBsiWI Added
C A T T C A T G C C A A G C G T G T C A C C A T A A T G C C C A A A G A C A T C C A G T T A G C G C G A C G C A T T C G C G G C G A G C G T G C T T A A C G T A
G T A A G T A C G G T T C G C A C A G T G G T A T T A C G G G T T T C T G T A G G T C A A T C G C G C T G C G T A A G C G C C G C T C G C A C G A A T T G C A T
4400
BsiWI AddedH3 3'UTR
H3 stem loop
C G G C T G A C A C G G C A T T A A C T T G C A G A T A A A G C G C T A G C G T A C T C T A T A A T C G G T C C T T T T C A G G A C C A C A A A C C A G A T T C
G C C G A C T G T G C C G T A A T T G A A C G T C T A T T T C G C G A T C G C A T G A G A T A T T A G C C A G G A A A A G T C C T G G T G T T T G G T C T A A G
4480
H3 HDENcoI added
H3-H1 intergenic region
A A T G A G A T A A A A T T T T C T G T T C C A T G G G C C G A C T A T T T A T A A C T T T A A A A A A A A A A T A A G A A C A A A A T T C A T A T T C T A T T
T T A C T C T A T T T T A A A A G A C A A G G T A C C C G G C T G A T A A A T A T T G A A A T T T T T T T T T T A T T C T T G T T T T A A G T A T A A G A T A A
.CC-BY-NC-ND 4.0 International licenseavailable under a(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made
The copyright holder for this preprintthis version posted March 18, 2020. ; https://doi.org/10.1101/2020.03.16.994483doi: bioRxiv preprint
Page 7
4560
H3-H1 intergenic region
A T T T A T G G C G C A A A C G G T A C T G G G T C T T A A A T C A T A T G T A A A A A T A C T A A T T C T G C C A G A G A A G G A A T A A A A A T A A T C T T
T A A A T A C C G C G T T T G C C A T G A C C C A G A A T T T A G T A T A C A T T T T T A T G A T T A A G A C G G T C T C T T C C T T A T T T T T A T T A G A A
4640
H3-H1 intergenic region
A T T T T A A T T G T C A G C T C A A C A T T T A T T A A A T T A A A G A A G A G G T T A A T A C A A A A A T A T A T A T T T T T A T T T G T T C T T T G T G C
T A A A A T T A A C A G T C G A G T T G T A A A T A A T T T A A T T T C T T C T C C A A T T A T G T T T T T A T A T A T A A A A A T A A A C A A G A A A C A C G
4720
H3-H1 intergenic region
G A A C A T C C T T T A A A G C A G T G A A A G T G T C G T G C G G G G C A A G G G A C T C T G A A C C T T A A A C A T C T A A A A A A A A A A T C T G A A T T
C T T G T A G G A A A T T T C G T C A C T T T C A C A G C A C G C C C C G T T C C C T G A G A C T T G G A A T T T G T A G A T T T T T T T T T T A G A C T T A A
4800
H3-H1 intergenic region
C T G T G T A A G A C A G T T T G A A A T T A A T G A A A T T A C A T T G A T G A C G G C A A T A T T T A T A A A A T A A C A G A A A A T A A T A A A A T A A A
G A C A C A T T C T G T C A A A C T T T A A T T A C T T T A A T G T A A C T A C T G C C G T T A T A A A T A T T T T A T T G T C T T T T A T T A T T T T A T T T
4880
H3-H1 intergenic region
Removed HpaI Removed BsiWI
A C T A G C T A T T T T A T A T T T T T T C C A T G T G C T A A C T G A A G A A T G T G T T A T T A T T G A A G A G G T C G T A T G G G A C A A T T G A C A C T
T G A T C G A T A A A A T A T A A A A A A G G T A C A C G A T T G A C T T C T T A C A C A A T A A T A A C T T C T C C A G C A T A C C C T G T T A A C T G T G A
4960
H3-H1 intergenic region
G T C C C T T C A A A C G T C T G T A A A A A A T A A A A C C T A T G T A A A A T T C A G C A C G G A A A T T G G C T A A T T T T G T T G C G G A A T G T A A T
C A G G G A A G T T T G C A G A C A T T T T T T A T T T T G G A T A C A T T T T A A G T C G T G C C T T T A A C C G A T T A A A A C A A C G C C T T A C A T T A
5040
H3-H1 intergenic region
A T A T A T T A C A T A A T A A A G G A T A A T A C A A A A A T T G T T T C T T T T T A T T T T T T A T T T G A T T T A T T T A T T T G A C T A C A T A G A C G
T A T A T A A T G T A T T A T T T C C T A T T A T G T T T T T A A C A A A G A A A A A T A A A A A A T A A A C T A A A T A A A T A A A C T G A T G T A T C T G C
NotIPaeR7IBsoBIXhoIAvaI
BmeT110I
3
5
5067
H3-H1 intergenic region
G C T C G A G A A G G G C G A A T T C G C G G C C G C
C G A G C T C T T C C C G C T T A A G C G C C G G C G
.CC-BY-NC-ND 4.0 International licenseavailable under a(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made
The copyright holder for this preprintthis version posted March 18, 2020. ; https://doi.org/10.1101/2020.03.16.994483doi: bioRxiv preprint