15
Conservation of separase N-terminal domain 1 Michael Melesse 1 , Joshua N. Bembenek 1* , Igor B. Zhulin 2,3, * 2 3 4 1- Department of Biochemistry, Cellular and Molecular Biology, University of Tennessee, Knoxville, TN 5 2- Department of Microbiology, University of Tennessee, Knoxville, Tennessee, USA 6 3- Computational Sciences and Engineering Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA 7 *- Co-corresponding authors 8 . CC-BY-NC-ND 4.0 International license under a not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available The copyright holder for this preprint (which was this version posted December 19, 2017. ; https://doi.org/10.1101/236216 doi: bioRxiv preprint

Conservation of separase N-terminal domainDec 19, 2017  · 19 Introduction 20 Separase is a CD clan cysteine protease which cleaves multiple substrates to regulate cell 21 division

  • Upload
    others

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Conservation of separase N-terminal domainDec 19, 2017  · 19 Introduction 20 Separase is a CD clan cysteine protease which cleaves multiple substrates to regulate cell 21 division

Conservation of separase N-terminal domain 1

Michael Melesse1, Joshua N. Bembenek1*, Igor B. Zhulin2,3, * 2

3

4

1- Department of Biochemistry, Cellular and Molecular Biology, University of Tennessee, Knoxville, TN 5

2- Department of Microbiology, University of Tennessee, Knoxville, Tennessee, USA 6

3- Computational Sciences and Engineering Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA 7

*- Co-corresponding authors 8

.CC-BY-NC-ND 4.0 International licenseunder anot certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available

The copyright holder for this preprint (which wasthis version posted December 19, 2017. ; https://doi.org/10.1101/236216doi: bioRxiv preprint

Page 2: Conservation of separase N-terminal domainDec 19, 2017  · 19 Introduction 20 Separase is a CD clan cysteine protease which cleaves multiple substrates to regulate cell 21 division

Abstract 9

We report a reanalysis of the sequence conservation of the cell cycle regulatory protease, 10

separase. The sequence and structural conservation of the protease domain has long been 11

recognized. Here we reexamine the protein sequence conservation at the N-terminus using PSI-12

BLAST analysis and report our discovery of a cysteine rich motif (CxCXXC) conserved in 13

nematodes and vertebrates. This motif is found in a solvent exposed linker region connecting 14

two TPR-like helical motifs. Mutation of this motif in Caenorhabditis elegans separase leads to a 15

temperature sensitive hypomorphic protein, and several N-terminal residues identified as 16

intragenic suppressors are not conserved. Conservation of this motif in multiple organisms 17

raises the possibility that the motif plays similar roles across species. 18

.CC-BY-NC-ND 4.0 International licenseunder anot certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available

The copyright holder for this preprint (which wasthis version posted December 19, 2017. ; https://doi.org/10.1101/236216doi: bioRxiv preprint

Page 3: Conservation of separase N-terminal domainDec 19, 2017  · 19 Introduction 20 Separase is a CD clan cysteine protease which cleaves multiple substrates to regulate cell 21 division

Introduction 19

Separase is a CD clan cysteine protease which cleaves multiple substrates to regulate cell 20

division. Separase proteolytic activity is tightly regulated mainly by the binding of an inhibitory 21

chaperone called securin [1, 2], which is degraded at the onset of anaphase in a proteasome 22

dependent manner after polyubiquitination by the Anaphase Promoting Complex/Cyclosome 23

(APC/C) [3, 4]. Once activated, separase cleaves a subunit of cohesin which holds sister 24

chromatids together, allowing sister chromatids to segregate to opposite poles [5-8]. Later 25

during anaphase, separase cleaves a number of other substrates to regulate the spindle [9], 26

centrosome duplication [10], and exocytosis [11]. Separase is also found to be functionally 27

conserved from single-celled yeast to multicellular mammals with molecular mass ranging from 28

144 kDa (in Caenorhabditis elegans) to 244 kDa (in a A. thaliana) [12]. This highly conserved 29

protease is central to the regulation of anaphase in many systems. 30

Previous reports attempting to define sequence conservation of separase have mainly 31

identified conservation within the C-terminal protease domain of the protein [13-14]. More recent 32

structural analyses of C. elegans and yeast separases have found that the geometry of the 33

catalytic active site is also conserved [15-17]. However, separase is a large protein with a 34

sizable N-terminal domain that has been poorly characterized and N-termini of yeast and C. 35

elegans are structurally different [16, 17]. C. elegans separase is composed of an N-terminal α-36

solenoid domain connected to the C-terminal caspase-like protease domain [15] (Figure 1). The 37

α-solenoid domain comprises 25 α-helices, mainly arranged as a right-handed superhelix that 38

resembles a TPR superhelix [17]. In contrast to a canonical TPR superhelix, however, the 39

irregular length of its constituent α-helices creates a compact globular structure that lacks the 40

deep surface grooves typical of TPR proteins [17]. Modifications of the N-terminus, including 41

phosphorylation by Cdk1 [18,19], a change in conformation mediated by the peptidyl-prolyl 42

cis/trans isomerase (PPIase) pin1 [20] as well as separase auto cleavage [21, 22] regulate 43

.CC-BY-NC-ND 4.0 International licenseunder anot certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available

The copyright holder for this preprint (which wasthis version posted December 19, 2017. ; https://doi.org/10.1101/236216doi: bioRxiv preprint

Page 4: Conservation of separase N-terminal domainDec 19, 2017  · 19 Introduction 20 Separase is a CD clan cysteine protease which cleaves multiple substrates to regulate cell 21 division

separase activity. Therefore, it is important to understand how the N-terminal domain of 44

separase functions to better understand these different regulatory mechanisms. 45

In C. elegans, separase localizes to vesicles and regulates exocytosis events during 46

anaphase [11, 23]. A mutation (Cys450Tyr) in the N-terminus of the C. elegans separase (sep-47

1(e2406)) results in a temperature sensitive phenotype that leads to the failure of cortical 48

granule exocytosis (CGE) during anaphase, with little effect on chromosome segregation [11]. 49

During anaphase, the SEP-1(e2406) protein localizes to the meiotic spindle, but has reduced 50

localization to cortical granules and supports a lower number of exocytic events. In an extensive 51

suppressor screen, we identified multiple intragenic suppressors of sep-1(e2406) that 52

exclusively introduce mutations to the N-terminus [24]. Therefore, we investigated structural 53

conservation in the N-terminus to better understand the structural elements of this domain that 54

are affected by these mutations. 55

Previous computational studies of separase from different taxonomic lineages revealed 56

no sequence conservation in the N-terminal region corresponding to the α-solenoid domain, 57

whereas the C-terminal region including the caspase-like domain (recognized by the 58

Peptidase_C50 Pfam model) was found to be well conserved [13, 25]. No multiple sequence 59

alignments of the separase N-terminal region are available in the current literature due to the 60

reported extreme divergence; however, a pairwise sequence alignment between the human and 61

nematode separase sequences was produced based on the recently solved three-dimensional 62

structure of the C. elegans separase [17]. Twenty-five structurally identified helices, comprising 63

eleven TPR-like repeats in the N-terminal region of the C. elegans separase (accession 64

NP_491160.1, aa 1-700), were aligned to the twenty-five α-helices predicted in the N-terminal 65

region of the human separase (NP_036423.4, aa 651-1641). This analysis revealed more than 66

80 identical residues, including Cys450. The very low percentage of identity between the two 67

sequences and the fact that they were aligned manually, guided by structural information, 68

.CC-BY-NC-ND 4.0 International licenseunder anot certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available

The copyright holder for this preprint (which wasthis version posted December 19, 2017. ; https://doi.org/10.1101/236216doi: bioRxiv preprint

Page 5: Conservation of separase N-terminal domainDec 19, 2017  · 19 Introduction 20 Separase is a CD clan cysteine protease which cleaves multiple substrates to regulate cell 21 division

prompted us to explore potential sequence conservation in the N-terminal region of separase 69

using more conventional bioinformatics approaches. 70

First, we performed a BLAST [26] search (with default parameters) of the non-redundant 71

protein sequence database at the NCBI (NR) with the N-terminal (aa 1-740) portion of the C. 72

elegans separase (accession NP_491160.1) as a query. The search confidently retrieved 73

similar proteins from several distantly related nematode species and reciprocal BLAST searches 74

with full-length sequences validated their orthology. Next, we performed exhaustive PSI-BLAST 75

[27] searches (with default parameters) of the NR database with N-terminal regions of the 76

nematode sequences as queries. The search initiated with the N-terminal region (aa 1-760) of 77

the Toxicara canis separase (accession KHN86283.1) retrieved a separase from a vertebrate 78

Danio rerio (XP_001337869.1) with E value 9e-04 in the third iteration and a human separase 79

(accession NP_036423.4) with E value 2e-11 in the fourth iteration, among many other 80

separase sequences from vertebrates. While having statistically significant matches to the PSI-81

BLAST generated profile, vertebrate separase sequences showed low identity (10-12%) to 82

nematode sequences. We then constructed a multiple sequence alignment of representative 83

nematode and vertebrate sequences using MAFFT [28] with default parameters (Supplementary 84

Figure 1). Overall, the alignment, which was only minimally edited based on structural 85

information, looks similar to that of the C. elegans and H. sapiens sequences published by [17]; 86

however, there are drastic differences with respect to conserved positions revealed by these 87

alignments. While more than 80 identical residues in the C. elegans and H. sapiens sequences 88

were identified by (Supplementary Figure 4 of [17]), our alignment shows only 7 positions that 89

are identical in all nematode and vertebrate sequences. The majority of identical positions 90

revealed by comparing only two sequences do not match conservation patterns revealed by our 91

multiple sequence alignment. These alignments appear to be due to incorrect gap placements 92

or happen purely by chance (Supplementary Figure 1), which further emphasizes problems 93

.CC-BY-NC-ND 4.0 International licenseunder anot certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available

The copyright holder for this preprint (which wasthis version posted December 19, 2017. ; https://doi.org/10.1101/236216doi: bioRxiv preprint

Page 6: Conservation of separase N-terminal domainDec 19, 2017  · 19 Introduction 20 Separase is a CD clan cysteine protease which cleaves multiple substrates to regulate cell 21 division

associated with building and interpreting “manual alignments”, even when assisted by structural 94

information [29]. Among the very few identical positions that we identified in nematode and 95

vertebrate N-terminal sequences (Figure 1A), four are in various parts of the region: W93/W754 96

(C. elegans/H. sapiens), L158/L827, W584/W1294, and R685/R1629. The remaining three 97

identical residues, C448/C1146, C450/C1148, and C453/C1151, appear to form a motif, which 98

is located at the border of Insert 1 and helix H16 (see Supplementary Figure 1 for details). Other 99

conspicuous conserved positions include: F90/L751, E127/K789, F219/W898, K262/D940, 100

D330/E1006, and F520/Y1227. As expected, multiple residues, distributed throughout the N-101

terminal domain (Supplementary Figure 2), are conserved among nematode separases; which 102

are different from residues conserved among vertebrate separase (Figure 1B and 1C). None of 103

these conserved residues are in contact with the separase inhibitory chaperone, securin. The 104

nature of this conservation remains unclear. None of the highly conserved residues appear to 105

form a contact with another similarly conserved residue, as judged by the proximity of beta 106

carbons on the separase structure (Supplementary Table 1). However, their functional 107

importance is highlighted by the fact that no changes in these positions are observed in human 108

populations (Supplementary Figure 3) and mutation in one of them, C450, results in 109

temperature sensitive embryonic lethality in C. elegans [11, 23]. Residues that can, when 110

mutated, suppress sep-1(e2406) [24] are not conserved, do not contact securin and are 111

distributed throughout the N-terminus (Supplementary Figure 2C). Understanding effects of 112

these mutations on SEP-1 structure will require further investigation. Our analysis demonstrates 113

the utility of studies in C. elegans in understanding separase regulation in multiple organisms. 114

.CC-BY-NC-ND 4.0 International licenseunder anot certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available

The copyright holder for this preprint (which wasthis version posted December 19, 2017. ; https://doi.org/10.1101/236216doi: bioRxiv preprint

Page 7: Conservation of separase N-terminal domainDec 19, 2017  · 19 Introduction 20 Separase is a CD clan cysteine protease which cleaves multiple substrates to regulate cell 21 division

References 115 116 1. Michaelis C, Ciosk R, Nasmyth KA. Cohesins: Chromosomal proteins that prevent premature separation of sister 117 chromatids. Cell. 1997;91(1):35-45. doi:10.1016/S0092-8674(01)80007-6. 118 2. Waizenegger IC, Giménez-Abián JF, Wernic D, Peters JM. Regulation of human separase by securin binding and 119 autocleavage. Curr Biol. 2002;12(16):1368-1378. doi:10.1016/S0960-9822(02)01073-4. 120 3. Cohen-Fix O, Peters JM, Kirschner MW, Koshland D. Anaphase initiation in Saccharomyces cerevisiae is 121 controlled by the APC-dependent degradation of the anaphase inhibtitor Psd1p. Genes Dev. 1996;10:3081-3093. 122 doi:10.1101/gad.10.24.3081. 123 4. Peters JM. The anaphase-promoting complex: proteolysis in mitosis and beyond. Mol Cell. 2002;9(5):931-43. 124 doi:10.1016/S1097-2765(02)00540-3. 125 5. Ciosk R, Zachariae W, Michaelis C, Shevchenko A, Mann M, Nasmyth K. An ESP1/PDS1 complex regulates 126 loss of sister chromatid cohesion at the metaphase to anaphase transition in yeast. Cell. 1998;93(6):1067-1076. 127 doi:10.1016/S0092-8674(00)81211-8. 128 6. Uhlmann F, Lottspeich F, Nasmyth KA. Sister-chromatid separation at anaphase onset is promoted by cleavage of 129 the cohesin subunit Scc1. Nature. 1999;400(6739):37-42. doi:10.1038/21831. 130 7. Uhlmann F, Wernic D, Poupart Ma, Koonin EV, Nasmyth KA. Cleavage of cohesin by the CD clan protease 131 separin triggers anaphase in yeast. Cell. 2000;103(3):375-86. 132 8. Nasmyth KA. Segregating sister genomes: the molecular biology of chromosome separation. Science. 133 2002;297(5581):559-65. doi:10.1126/science.1074757. 134 9. Sullivan, M., C. Lehane, and F. Uhlmann, 2001 Orchestrating anaphase and mitotic exit: separase cleavage and 135 localization of Slk19. Nat Cell Biol. 2001;3(9):771-777. doi:10.1038/ncb0901-771. 136 10. Baskerville, C., M. Segal, and S. I. Reed, 2008 The protease activity of yeast separase (Esp1) is required for 137 anaphase spindle elongation independently of its role in cleavage of cohesin. Genetics 178: 2361–2372. 138 doi:10.1534/genetics.107.085308. 139 11. Bembenek JN, Richie CT, Squirrell JM, Campbell JM, Eliceiri KW, Poteryaev D, et al. Cortical granule 140 exocytosis in C. elegans is regulated by cell cycle components including separase. Development. 141 2007;134(21):3837-3848. doi:10.1242/dev.011361. 142 12. Kumar R. Separase: Function Beyond Cohesion Cleavage and an Emerging Oncogene. J Cell Biochem. 143 2017;118(6):1283-1299. doi:10.1002/jcb.25835. 144 13. Winter A, Schmid R, Bayliss R. Structural Insights into Separase Architecture and Substrate Recognition 145 through Computational Modelling of Caspase-Like and Death Domains. PLOS Comput Biol. 146 2015;11(10):e1004548. doi:10.1371/journal.pcbi.1004548. 147 14. Viadiu H, Stemmann O, Kirschner MW, Walz T. Domain structure of separase and its binding to securin as 148 determined by EM. Nat Struct Mol Biol. 2005;12(6):552-553. doi:10.1038/nsmb935. 149 15. Lin Z, Luo X, Yu H. Structural basis of cohesin cleavage by separase. Nature. 2016;(1):1-15. 150 doi:10.1038/nature17402. 151 16. Luo S, Tong L. Molecular mechanism for the regulation of yeast separase by securin. Nature. 152 2017;542(7640):255-259. doi:10.1038/nature21061. 153 17. Boland A, Martin TG, Zhang Z, Yang J, Bai Xc, Chang L, et al. Cryo-EM structure of a metazoan separase-154 securin complex at near-atomic resolution. Nat Struct Mol Biol. 2017;(March):1-7. doi:10.1038/nsmb.3386. 155 18. Gorr IH, Boos D, Stemmann O. Mutual inhibition of separase and Cdk1 by two-step complex formation. Mol 156 Cell. 2005;19(1):135-141. doi:10.1016/j.molcel.2005.05.022. 157 19. Stemmann O, Zou H, Gerber SA, Gygi SP, Kirschner MW. Dual inhibition of sister chromatid separation at 158 metaphase. Cell. 2001;107(6):715-726. doi:10.1016/S0092-8674(01)00603-1. 159 20. Pin Pi, Hellmuth S, Rata S, Hellmuth S, Rata S, Brown A, et al. Human Chromosome Segregation Involves 160 Multi- Layered Regulation of Separase by the Peptidyl- Human Chromosome Segregation Involves Multi-Layered 161 Regulation of Separase by the Peptidyl-Prolyl-Isomerase Pin1. Mol Cell. 2015;58(3):495-506. 162 doi:10.1016/j.molcel.2015.03.025. 163 21. Papi M, Berdougo E, Randall CL, Ganguly S, Jallepalli PV. Multiple roles for separase auto-cleavage during the 164 G2/M transition. Nat Cell Biol. 2005;7(10):1029-1035. doi:10.1038/ncb1303. 165 22. Zou H, Stemmann O, Anderson JS, Mann M, Kirschner MW, Stemman O, et al. Anaphase specific auto-166 cleavage of separase. FEBS Lett. 2002;528(1-3):246-250. doi:10.1016/S0014-5793(02)03238-6. 167 23. Bembenek JN, White JG, Zheng Y. A role for separase in the regulation of RAB-11-positive vesicles at the 168 cleavage furrow and midbody. Curr Biol. 2010;20(3):259-64. doi:10.1016/j.cub.2009.12.045. 169

.CC-BY-NC-ND 4.0 International licenseunder anot certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available

The copyright holder for this preprint (which wasthis version posted December 19, 2017. ; https://doi.org/10.1101/236216doi: bioRxiv preprint

Page 8: Conservation of separase N-terminal domainDec 19, 2017  · 19 Introduction 20 Separase is a CD clan cysteine protease which cleaves multiple substrates to regulate cell 21 division

24. Melesse M, Sloan DE, Benthal JT, Caylor Q, Gosine K, Bai X, Bembenek JN. Genetic Identification of 170 Separase regulators in Caenorhabditis elegans. Genes Genomes Genetics, 2017; accepted. 171 25. Bachmann G, Richards MW, Winter A, Beuron F, Morris E, Bayliss R. A closed conformation of the 172 Caenorhabditis elegans separase-securin complex. Open Biol. 2016;6(4):160032. doi:10.1098/rsob.160032. 173 26. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol. 174 1990;215(3):403-410. doi:10.1016/S0022-2836(05)80360-2. 175 27. Altschul SF, Madden TL, Schäffer AA, et al. Gapped BLAST and PSI-BLAST: A new generation of protein 176 database search programs. Nucleic Acids Res. 1997;25(17):3389-3402. doi:10.1093/nar/25.17.3389. 177 28. Katoh K, Rozewicki J, Yamada KD. MAFFT online service: multiple sequence alignment, interactive sequence 178 choice and visualization. Brief Bioinform. 2017;(December):1-7. doi:10.1093/bib/bbx108. 179 29. Iyer LM, Aravind L, Bork P, et al. Quoderat demonstrandum? The mystery of experimental validation of 180 apparently erroneous computational analyses of protein sequences. Genome Biol. 2001;2(12):RESEARCH0051. 181 doi:10.1186/gb-2001-2-12-research0051. 182 183

.CC-BY-NC-ND 4.0 International licenseunder anot certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available

The copyright holder for this preprint (which wasthis version posted December 19, 2017. ; https://doi.org/10.1101/236216doi: bioRxiv preprint

Page 9: Conservation of separase N-terminal domainDec 19, 2017  · 19 Introduction 20 Separase is a CD clan cysteine protease which cleaves multiple substrates to regulate cell 21 division

Figure legends 184

Figure 1: Separase N-terminal domain is conserved across multiple residues. A. C. 185 elegans separase Cryo-EM structure (PDB 5MZ6) illustrating N-terminal residues universally 186 conserved in nematodes and vertebrates. The N-terminal domain is depicted in gray and C-187 terminal protease domain is shown in light orange. The two visible conserved residues that 188 make up the cysteine motif (C450 and C453) are shown in yellow and are found as part of a 189 solvent exposed loop between helices 15 and 16, while C448 was not resolved in the crystal 190 structure. Other conserved residues shown in red are distributed across the N-terminus. W93, 191 L158 and R685 are found in the interior of the protein where as W584 is on the protein surface. 192 Different sets of N-terminal residues are conserved among nematode and vertebrate separase. 193 Protein diagram shows N-terminal residues conserved among nematode (B) and vertebrate (C) 194 separases in green. Residues conserved across all species are shown in black. The residue 195 numbering at the top of each diagram corresponds to C. elegans (B) and human (C) separase. 196

Supplementary Figure 1. Multiple sequence alignment of separases from nematodes and 197 vertebrates. Sequences from eleven nematode species (top portion of each panel) and from 198 seven representative vertebrate species (human, opossum, turtle, chicken, frog, coelacanth, 199 and fish) are shown. Twenty-five alpha helices (labeled H1 to H25) comprising eleven TPR-like 200 repeats (labeled TPR1A,B to TPR11A,B) in the C. elegans separase (PDB accession 5MZ6) are 201 shown above the alignment. Identical residues in each group are highlighted: negatively 202 charged, red; positively charged, blue; aromatic, green; aliphatic, yellow; alcohol, magenta; 203 small, grey. NCBI accession numbers: Caenorhabditis_elegans_1, NP_491160.1; 204 Caenorhabditis_brenneri_1, EGT38506; Caenorhabditis_briggsae_1, CAP33358; 205 Caenorhabditis_remanei_1, XP_003114963.1; Loa_loa_1, XP_003140515.1; 206 Wuchereria_bancrofti_1, EJW80934; Brugia_malayi_1, XP_001894870.1; 207 Dictyocaulus_viviparus_1, KJH53363.1; Dictyocaulus_viviparus_2, KJH53362.1; 208 Haemonchus_contortus_1, CDJ83415.1; Ancylostoma_duodenale_1, KIH65515.1; 209 Ancylostoma_ceylanicum_1, EYC45610.1; Toxocara_canis_1, KHN86283.1; Homo_sapiens_1, 210 NP_036423.4; Monodelphis_domestica_1, XP_007506592.1; Chelonia_mydas_1, 211 XP_007058605.1; Gallus_gallus_1, XP_015128534.1; Xenopus_tropicalis_1, XP_004912005.1; 212 Latimeria_chalumnae_1, XP_014347491.1; Maylandia_zebra_1, XP_014264400.1. “Identical 213 residues”* show positions defined as identical in a pairwise comparison of C. elegans and H. 214 sapiens sequences by Boland et al., 2017. 215

Supplementary Figure 2: Separase N-terminal residues conserved among nematodes are 216 distributed throughout the structure. C. elegans separase Cryo-EM structure (PDB 5MZ6) 217 illustrating N-terminal residues conserved among nematodes found in the interior (A) and on the 218 surface (B) of the TPR-like N-terminal domain. Intragenic suppressors of SEP-1(e2406) are 219 shown (C) and are not among the conserved residues. The structures are oriented with the N-220 terminus to the left with a perspective that best illustrates the distribution of each highlighted 221 residue. 222

Supplementary Figure 3: Mutations in human separase (ESPL-1) associated with 223 cancers. The collection of separase allelic variants of human Separase from the ExAC 224 collection (http://exac.broadinstitute.org) and the ICGC (https://icgc.org/) which collects genomic 225 sequences of various cancers. The frequency of each missense mutation are indicated. 226

.CC-BY-NC-ND 4.0 International licenseunder anot certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available

The copyright holder for this preprint (which wasthis version posted December 19, 2017. ; https://doi.org/10.1101/236216doi: bioRxiv preprint

Page 10: Conservation of separase N-terminal domainDec 19, 2017  · 19 Introduction 20 Separase is a CD clan cysteine protease which cleaves multiple substrates to regulate cell 21 division

227

228

.CC-BY-NC-ND 4.0 International licenseunder anot certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available

The copyright holder for this preprint (which wasthis version posted December 19, 2017. ; https://doi.org/10.1101/236216doi: bioRxiv preprint

Page 11: Conservation of separase N-terminal domainDec 19, 2017  · 19 Introduction 20 Separase is a CD clan cysteine protease which cleaves multiple substrates to regulate cell 21 division

Supplementary figure 1: 229

230 231

2D structure ------HHHHHHHHHHHHHHH-----HHHHHHHHHHHHHH------HHHH.........--...HHHHHHHHHHHHHHH..HHHHHHHHHHHHHHHHHHHHHHHHHHH--HHHHHHHHHHHHHHHHHHHHHHHH..------.HHHHHHHHH

“Identical residues”* * * * ** * * * ** * * * * * * * * * **

Caenorhabditis_elegans_1 5 SVDKQHIEKLDELRKNVSCTV-----IGFAEQTAELQQEI------SELFIAEFGVNGP--IDMNSLSKLARITSYYASSEYFQGLAKYQRTACKMFITWQTLRKEAM--ECRSKDREIFASIPAKLCFFYFYNGE------LCRAVVCLLD 137

Caenorhabditis_brenneri_1 5 GPNKEDKIKLDSLIKDVNTYV-----KSFVEQVDDLQQYV------AELFIREFGENGP--IDLNSLNKLARITSHYASSEVSQKLGRYQRAIQKLFTTWQSLRTEAL--ECTNREKEQIAVIPAKMSFFYFYNGE------LCKAVVCLLD 137

Caenorhabditis_briggsae_1 5 DPNKQDIEKLNELYNSANTII-----HSFLGKIDEIQREV------ADLFISEFGVNGS--IDLHSLGKLARITSQYASFECSQKLGKYQLSVQKFFNAWQSLRKEVA--ELSTKERVRYALIPAKICFFYFYNGE------LCKAVVCLLD 137

Caenorhabditis_remanei_1 5 SSTKRELDDLEILIKDVNSII-----TSFVEQTVELQQHV------ADCFLSEFDIHGS--IDLRAITKLARITSRYASSESFQNLGKYQRITQKLFNAWQSLRKDAL--ECSSKYKASIAVIPAKLCFFFFYNGE------LCKAVVCLLD 137

Dictyocaulus_viviparus_1 1 --------------------------------MPKHIIKV------FISLKDEFTHENV--CEPTSSPKYMTVVSTFASIAAAR--GDIYGNLNKCYTIWRDKIEPNV---RNSTKKHTYAELPLRLAACFLLNGD------YYKALSCLLH 101

Haemonchus_contortus_1 1 ---MKSSIDKRVVYNNCLQYL-----NLYIDGLVAVQEEV------FNALKNDFKPEDV--CEPTATPKYLTVLSKFTSSSASK--GNLYENVIKCYTIWKDKLEPVV---RDATKKSAYAELPLRLAACFFLNGD------YYKALLCLRH 125

Ancylostoma_duodenale_1 1 MSKKAPSISKRDVYRNCLEYV-----RRYLVDLASVQEEV------FANLKEEFKPEHV--CEPTASPKYLTAITKFT-SVHNK--TKIYSSIKKCYHIWKDKLQPVI---GDGGMKNSYAELPLRLAACFFLNGD------YYKALLCLRH 127

Ancylostoma_ceylanicum_1 1 MSKKASSVNKRDVYRNCLEYV-----RRYLVDLAVVQEEV------FANLKEEFKPEHV--CEPTASPKYLVAITKLA-SAHNK--VKIYSSIKKCYHVWKDKLQPVI---GDGGVKNSYAELPLRLAACFFLNGD------YYKALLCLRH 127

Loa_loa_1 1 -MVREGKRQLDSMKKELLLHM-----KNHIVATMSARREI------YDALKELFDTNSADFYNLSSTVKHVRIIRLLI-KCLEK--YKPHQHLQDAFFVWNSRLAPAVLAIADPAEQRKYSRHAKLLAACLLDSDE---ACFYTKAIICLGH 134

Wuchereria_bancrofti_1 1 -MIREGKPQLDSMKKELLLHM-----KNHIVATMNARREI------YDALKESFDTSSADFYNLSSTVKHVRIIRLLI-TCLEK--YKPHQHLQDAFSVWNWRLAPAVFALSDPVERQKFSKHAKLLAACLLDSDE------YTKAIICLGH 131

Brugia_malayi_1 1 -MIREGKPQLDSMKKELLLHM-----KNHIVATMSTRREV------YDALKESFDTSSTDFYNLSSTVKHVRIIQLLI-TCLEK--YKPHQHLQDAFSVWNCRLAPAVFALSDPVERQKFSKHAKLLAACLLDSDE------YTKAIICLGH 131

Toxocara_canis_1 56 IEVKYQIERVDTVDASPFLDLYPLV-DELFVSTQLGQRAISSCTFMFNAVKKE-EDNGFHFFDTTHFETHAALSKLFI-AALAK--HKAHFYLERAYSFWSEQFAPAVLALESKEERTTFAQFPKLLAAYLLVAGE------YSKVVVCLAH 197

“Identical residues”* * * * ** * * * ** * * * * * * * * * ** *

Homo_sapiens_1 670 ARDQLLDDKAQALLWLYICTL-----EAKMQEGIERDRRAQ-----APGNLEEFEVNDLNYED--KLQEDRFLYSNIAFNLAAD--AAQSKCLDQALALWKELLTKGQAPAVRCLQQTAASL--QILAALYQLVAKPMQALEVLLLLRIVSE 804

Monodelphis_domestica_1 694 ATDQFLDDKAQALLWLYICSL-----ETKMQKGIEQDLRLK-----LQPSSIELETNDLNYED--KIQDDHFLYSSIAFNLAVD--AAQSKCLDQALALWKEMLKKGQVPTVRCLQQTTASL--QILAALYQMVAKPLQALETHLLLRTLSE 827

Chelonia_mydas_1 658 NQEQLLDDKAQALLWLYICTL-----ESKMQESIERDQRAGDQ---GHKNLEDFEPNDLNYED--KLQDDKFLYNSISFNLVAD--SAQAKCLDDALALWKKLLAKKEVPVVRSAEQTVFSL--HFLAALYRMMAKPLQSMESYLLIRTLSS 797

Gallus_gallus_1 657 NREQLLDDRAQALLWLYICTL-----ESKLEEGIAREQRVKAL---GQKSLEDFEPNDLNYES--RLQEDAFLNSGISFNLLTE--LAQSKSLDDALALWKQLLASQGIPAVRSVEQTTASL--RVTAALYRLMDKPLQAMESYLLVRALCS 796

Xenopus_tropicalis_1 669 PTERLQDIKAQALLWQYICTL-----EANMREGLEEQNRKAKLEGQNQWTGVDYEPNDLNYED--KLHDDQSVRDGICFTLAGE--TGPLKGLDEALDLWSSLLSSSQVQYLNSAEQTIFSL--HLLGSLYRLMGKPLQSMQSFQLAGRLCH 809

Latimeria_chalumnae_1 617 DAEKLLDDKAHASLWLYISTL-----ESTVKKSIEREERVANL-GADLGNLEEQETNDLNYED--QLQDDKFVHDGINFNLAAD--SGQSQYLEDALVLWKKLVPEKRLPQVRSTDQTMASL--HIMASLYRLIGKPLQAIESYFLIVRLVK 756

Maylandia_zebra_1 659 NADKLKDDKAHALLWLYICTL-----EKNLQEAIDSDRKRKELREQTQCVTSSIGNNDFDYEDKQKTQDSIPVYEGLHFNLTAE--HKLCQPLESALDIWTTIFQSETLPSVREPKETCRSI--GVTASLFRLMGKPLQALKALQILTDFSR 803

Con_90% ...ph....tp..........-----...h.t....p.th..........h.p.t.pt....p..t..c.h.h.p.h....h.p...t....hpphh.hWpphht.......p...pt.hu....hhs.hh...sc.......h.hlhtl.p

Con_80% ...ph.p.htph..hh.....-----pthh.th.t.ppch......ht..hpphtsps....p..th.c.hhlhp.hh.s.htp..ht..p.lpchhthWpphht......hpshppp.hu...hhhuhhhhh.sc.......hhhlhsL.c

Con_70% ...pl.s.htps.hal.....-----cphh.pshphppcl......hpthhp-Fcsssh...-..ph.c.hhlhsths.shhsc..tt.tptlpcshslWpphht.sh...spshcppthu...thLAuhahhsuc......hhphlhsLhc

Con_60% t.cpLpshctpsLhal.....-----cshltpshphppcl......hpshhc-FcsNsh..h-.pphtchhtlhsths.shhsc..sc.ppsLpcsaslWpshlpssh...scssccspauhhsthLAAhahhsG-......hh+sllCLtc

H6 TPR3A H7 TPR3B H8 TPR4A H9 TPR4B H10 TPR5A H11 TPR5B

2D structure HHHHH-..HHHHHHHHHHHHHH....HHHHHHHH......HHH------------HHHHHHHHHHHH.----------.HHHHHHHHHHHHHHHH-HHHH.....HHHHHHHHH---HHHHHHHH................HHHHHHHHHHHHHHHHH

“Identical residues”* * * ** * * * * * * * * * * * * *

Caenorhabditis_elegans_1 138 YIDLS-DDTLAKEAALRWLMFLGETELIEKKLKTWKMDKSSKD------------MFSATEFAMNYLK----------KSEYRVEMLEKLMKLRD-KVKSDPTRSFSRYELASY---VSWLCSTLSNVPVGSALRECEFPDRVSHIQEAALKSDSLVR 268

Caenorhabditis_brenneri_1 138 YVELVPEDVLTMEATLRWLMFLGETELAEKKMKQWKTEKSSED------------IFKATRIEIGFLK----------HGDDRVELINELIALRD-RIKAENIRSFPRYELASY---VAWLCSTLSNSAVGASLTGLEFPDRLSQVQEAVNKAESIVR 269

Caenorhabditis_briggsae_1 138 YVELVPEDAMSMEAALRWLMFLGETELAEKKMKSWKMAKNSAD------------VFEATRIGISYLK----------TTESRSAALDQLIQLRD-RIKSENIRSFPKYELSSY---VHWLCSTLSNVAVGTSLSGCEFPDRMSQVHESSSKAEAIIR 269

Caenorhabditis_remanei_1 138 YIELVPEDVLTMEAALRWLMFLGETELAEKKIKQWKMEKSSED------------LFEATKIAISYLK----------KSDNRVDMLKKMIDLRD-KIRAENVRSFPKYELASY---VYWLCSTFSNVPVGKTLNGCEFPDRMSQLQEASSKAEAIIR 269

Dictyocaulus_viviparus_1 102 SINLESKNILPRVLALRWSAFLGEWEMAEMALAFENAKFMQKD-VEDYFASHLILTKEFVKYYVDFNR----------DHNTRSNSGTSLLKMTH-KINADTRTTFLTFEAQSF---ANWIY-------------------SFNCLHRAVVKSEGVMK 225

Haemonchus_contortus_1 126 TIYLEPKNLVPRILALRWSVYLGEWEMLEMALAFEKLKPPKSHGDDDFVMAQLRFIIEIAKSYVEFNR----------DHKSRPTSAKSILEMTN-LVNSDTRMTYYIFESQAF---ANWICAQLP-KVAKFSTDD-PLLDSYNCLHRAVSKSESVMK 267

Ancylostoma_duodenale_1 128 SIHLEPKSLYPRILALRWSVYLGEWEMAEMALAFEKAKPPQKNGVDDHLVEQLVFMTEMVRAYVDFNR----------DQRSRAVSAKTILTMTE-KTNAEERPTFPIFETQAF---ANWVAAHLP-KVAGFRVENLELFDTFNCLHRAVLKAESVMK 270

Ancylostoma_ceylanicum_1 128 SIHLEPKSLYPRILALRWSVYLGEWEMAEMALAFEKAKPPQKNGVDDHLVEQLVFMTEIVRAYVDFNR----------DQRSRAVSAKTILTMTE-KTNAEERPTFPIFETQAF---ANWVAAHLP-KVAGFRVENLELFDTFNCLHRAVLKSESVMK 270

Loa_loa_1 135 VVRLNEGDCHSRLLLTYWLCLLGYWSLAKTQVLCEEEVKRMYSIHKEI-------INEVINDIIHLHTGKTVGHYYGDALSEQDHILKTMQSFES-LLAKSKAKTFQQYEAQILVKRILIAANHFPYADLT------LIGDPLQNMDFVKARYETLVK 278

Wuchereria_bancrofti_1 132 AVRLHDGDVHSRLLLTYWLCFLGYWSLAKTQVLDEDEVKRMYSIHKEI-------INEVINDIIYLHAGKAAGHHSGDSLPAHDRILKTMQSFES-ILTKSKAKTFQQYEAQILVKRVLIAANHFSSADLT------LIGDPLQNMDFIKARYETLVK 275

Brugia_malayi_1 132 VVRLHDGDVHSRLLLTYWLCFLGYWSLAKTQVLDEDEVKRMYSIHKEI-------INEVINDIIYLHTGKAAGHHSGDSLPAHNRILKTMQSFES-ILTKSKAKTFQQYEAQILVKRVLIAANHFSSADLT------LIGDPLQNMDFIKARYETLVK 275

Toxocara_canis_1 198 AIRLNEFDPTCRILILRWLCHLGEWQLAKKQLDIEAKLPPISGTHYD-------LIIDIYRNIVALNT-----------LSKKNEVVEKLVMQWN-RLSEEGSKTFMQYQCQALLKRALFIASRLPHADIN------IIGDPLKNSEMEIARLTVLIK 330

“Identical residues”* * ** * * * * * * * * * * * * *

Homo_sapiens_1 805 RLKDHSKAAGSSCHITQLLLTLGCPSYAQLHLEEAASSLKHLDQTTDTY-----LLLSLTCDLLRSQL-------YWTHQKVTKGVSLLLSVLRD-PALQKSSKAWYLLRVQVLQLVAAYLS--LPSNNLSHSLWEQLCAQGWQTPEIALIDSHKLLR 947

Monodelphis_domestica_1 828 NVKDYIRAASASCHITRLLLTLDCPGYAQFYLEEAESSLQLSDHTSDGY-----LLHTQTCALLRSQL-------YCLQKKVTEGLSLLLSVVRD-STLQKSSKAWYLLRVQALQVVAVYLS--LPSSSLSVQQWEQLYTQGWQTSEIALIESHKLLR 970

Chelonia_mydas_1 798 VLGDWLGTANALCQVTKLLFLLESPSEAKIFLEEAESCLQKADCSSDSY-----LVLKQTCLILHSQL-------CCANHKVEEGLTLLLEVLQH-PALQKISKVWYLLRAHVLQLMAVYLS--LSSTSLLPELRQKLWAQGWKTPETALADAHKLFR 940

Gallus_gallus_1 797 ALGDNLGTAGALCQVTKLLLQLECPSYAKLFLEEAESCLQRTDGGSDSY-----LLLQQTCLVLRSQL-------YCASHRIKDGTALLLEVLQN-PALQKTAKVWYMLQAQVLQLTATYLS--LPPSHLSPEFRQYIFTQGWKTPETALSEAHRLLR 939

Xenopus_tropicalis_1 810 SLQDHIKEVGALCHLTKILFYLESPEYAQVILQKAEEILKRADRTSENY-----TLAELSCQLLRSHL-------CRVTRQVKEGVELLLGLLQH-PSLQKSSKAWYLLKVQVLQELSLFLM--LPLENLTSDLYRQLWAHGCHNPETALTEAHKLLR 952

Latimeria_chalumnae_1 757 ALSDDLPLVNALCQITKLLLTLGCPSAAQGYLDEAELWLQSADALSDGY-----SLMKLTCAVLRSQL-------YCSSQKL-----------------------WGNVELRHL----VYET--APSS--------------WKMQEAALLDAHKLLC 859

Maylandia_zebra_1 804 KLADNEACASSLIHSASILLDLGSTELAQTQLDQVESVLPSN--TAEGS-----SSLSLLAILVKAQC-------HYNNGQIDCGLRCLIEVLKQLKQQQKQSKSWYLLRARTLQTCASFLS--IDAALLPQVQKNLIMEQGISTSDGALYESLKLLC 945

Con_90% .lt....th.s...hh.h.h.Lt..phhph.h...t......t............h.p.hp.hl...h..........t...p......h....p....tp...sa..hpht.h.....hh...hs................hpp.p....c.ptlhp

Con_80% .lt...tsh.sh.hhhph.hhLsp.phAph.lt..c......s............hhphhp.hlt..h..........p.p.p.t....h..hpp..h.tpt.hsa..hchp.h...s.ahs..hs.s.h........h.p.hpp.c.sh.c.ctlh+

Con_70% hlp...tsshshhhhh+hLhhLGp.phAchtltptcthh.pts..t-........lhchsp.hlp.ph..........ptp.c.thhhhl..hpp..h.tcpt+sa.hhcspsh...s.als..hspssl........h.pshps.chAhhcucpll+

Con_60% slcLp.tsshuhhhhh+WLhaLGphphAchtLpptctthpptchhp-........lhchsphhlchpp..........spcs+sshhppl.shps.hhppcps+oa.haEsQsh...ssals..Lssssls.t..t..hscshps.ctAls+ucsLl+

H12 TPR6A H13 TPR6B H14 TPR7A H15 TPR7B

2D structure HHHH.........---------..........-------........-------HHHHHHHHHHHHHHHHHHH..........HHHHHHHHHHHHHHHHHH...HHH---------------HHHHHHHHHHHHHH..........HHHHHHHHHH

“Identical residues”* * * * ** * * * **

Caenorhabditis_elegans_1 269 NRIPGLASSQFDN---------SVNASIWPFL-------DGHQED-S-------NYYVHIGSTIAWHFEMRRECALVNVTTAQTRDSMSAMILNLRVALKSASFFRV---------------LQTTNTLAYYSSIIEEAGSEKNAKLMRVSCVNLL 385

Caenorhabditis_brenneri_1 270 NRIPGLAAYQFDN---------SVNSSIWPFL-------EGSRNGSS-------TDYIHLGSTLAWYFEMRRELALVNVATAQTRDSMSSMILNLRIGLKSASFFRI---------------LQTTNLLAYYTALVEEVGSERNAKLMRVSCFNLL 387

Caenorhabditis_briggsae_1 270 NRVHGLAAYQFDN---------SVNASICQFL-------NEGSHG---------TDYVHLGSTVAWHFEMRRELALVNVAAAQTRDSISAMILNLRIALKSASFFRI---------------LQMTNVLAYYTGLVEEVGCEKNARLMRISCVNLL 401

Caenorhabditis_remanei_1 270 NRVPGLAAYQFDN---------SVSTSIWPFL-------EEKHHGSS-------TDYIHLGSTVAWHFEMRREWSLVNVSTAQTRDSMSSMILNWRIALKSASFFRI---------------LQMTNLLAYYTTIIEEVGTEKNAKLMRISVVNLL 387

Dictyocaulus_viviparus_1 226 NRIPGIHKESE-----------SIDFR--------------KVTCDT-------TDFVKACASYAEFCELRREYSIELLNVGMVHDAAFHSQLGLWSALKIGVFFRI---------------QQFVNVNTLIRQCVVQVEFRKDLQFCMNIIRILY 333

Haemonchus_contortus_1 268 NRLPGIQKEHL-----------PMDFK--------------CVARDP-------LDFMKACSSFAECCDLRRDYVIELLHVGTIRDAAANSQLGLYCALKSGVLFRI---------------QQFVNTNTLLRSCVVHPDFKNDLHNCLDMLRMMY 376

Ancylostoma_duodenale_1 271 NRMQGIQKEDV-----------PLDLK--------------KIASDP-------LEFLKACASFAESCEVRRDYAIELLSVGMIRDAAAVSQLGLWCAVKAGVLFRLIKQLAVDASHPTSTILQFVNINTLIRACVVQQEFKKDLRTCMDIMHTIY 394

Ancylostoma_ceylanicum_1 271 NRMPGMQKEDV-----------PLDLK--------------KIASDP-------LEFLKACTSFAESCELRRDYAIELLSVGMIRDAAAVSQLGLWCAVKSGVLFRI---------------LQFVNINTLVRACVVQQEFKKDLRTCLDIMRTIY 379

Loa_loa_1 280 CRSSVYYGSDMKEEDAGCAAFAGGEFQETKLMSTVSSKLQPQFINIF-------DEYLKLSAVLAEHFKVVLQLCFKYIELGILRESESEGLKLLHQVLKTTNLQRC---------------LAVLNVLFLTSTLAHGNSVNVRPDPFRPVIASLW 427

Wuchereria_bancrofti_1 276 CRSSVYYGNDMKEEDVGCAAFAGGEFQETKIDETKI------------------DEYLKLSAAVAEHFKVVLQLCFKYIELGILRESESEGLKLLHQVLKTSNLQRC---------------LAVLNVLFLTSTLARGDIMNVRPDPFRPMIASLW 393

Brugia_malayi_1 276 SRSSVYYGSDMKEEDGGCATFAGGEFQEVNP-----------------------DDFLKLSAAVAEHFKIVLQLCFKYIELGILRESESEGLKLLHQVLKTSNLQRC---------------LAVLNVLFLTSTLARGDITNVRPDPFRPMIASLW 393

Toxocara_canis_1 331 NRYQAFFNYGE-----------GIELQ---------------KKVNP-------DDFLKLCSHISEHFEATILQCYELAETGVLRECEGIVMSLWRESFRLGSLPRA---------------LIAINLLMMLVMMS--NYMRKREQSLRQVYAALL 437

“Identical residues”* * * * ** * * *

Homo_sapiens_1 948 SIILLLMGSDI--------------------LSTQKAAVETSFLDYGCSEKLVCENLVQKWQVLSEVLSCSEKLVCHLGRLGSVSEAKAFCLEALKLTTKLQIPRQC---------------ALFLVLKGEL------ELARNDIDLCQSDLQQVL 1055

Monodelphis_domestica_1 971 SIIILLMGRDL--------------------LFVQQSAVDAAFLDYGCSEKLVTENLVQKWQVLAEVLGCSEKLVTLLGRLGTVSEAKAFCLEALRLAMKLQCVRQC---------------SLFLVLKGEL------ELARNDLDLCQSDLEQVL 1077

Chelonia_mydas_1 941 SILLLLMGGDL--------------------LSSPKTVTETQFVDHGCSEKLIAENLLQKWQVLADMLVCSEKLIALFSQVEMVCEAKAFCLEALKLAMKLQAIRWC---------------AGFLVLKAKL------ELQRSELELCHSDLQQAL 1052

Gallus_gallus_1 940 GIVLLLMGSNV--------------------LGSHKSAADIRFIDCGCSEHLVVDNVLQKWQVLADLLGCSEHLVVLLSRVEVMCKAKAFCLEAIKLAVKLQAVRWC---------------ASFLVLKAQL------ELQQSELELSHWNLQQAL 1051

Xenopus_tropicalis_1 953 SIVLLLLGNDV--------------------LASPKGSSEVRFVDNGCTHDLVKENLLQKWQVLADLLSCCRKLVHTLGRMGCVSEAKSFCLEGLRVAKTLQSMRHC---------------ADFLVRKAEL------ETLRSETQLCEEDLQYVL 1064

Latimeria_chalumnae_1 860 SIVILLTGKSI--------------------VPSHKANANIRFMDQGCCRKLVHENLLQKWQVLADLLSCTHDLVKLFGWVGSVCEAKAFCLEALKLTMKFKLLTQC---------------ADFLVSKAEL------ELQRSRSDLCHLDLQQVL 971

Maylandia_zebra_1 946 SLLVTLVGKGL-----------------YGTTTSSNTNVVSSFINQGCSKKMVSDNLALKWQLLAELLNCSKKMVSVRSDCGAINDARLQCLEALKLAIKLQTLSQC---------------SELLVMKAEL------ELMQGEKDESRTDLDIVR 1060

Con_90% sh..hh.t.t.............................................phhph.t.hu..hth..phsh...tht.hp-u.......hh.sh+.t.h..h.................hhsh.h.h.........p.p.p.....h..hh

Con_80% shh.hhht.ph..............................t..s.........pphlphhtshA.hhphp.phsh..hphu.hp-u.u.....Lh.shK.tsh.ph................thhsh.h.h.........ptp.c.hh..h..lh

Con_70% shh.hlhupsh..............................phhs.s.......pshlphhpslA-hhphpcchshhhhphu.lp-utu.sl.sL+huhK.tshhph...............hthlslhs.h......thhcpchc.hp..htplh

Con_60% sRl.slhupsh...........shp.p....h.......p.phhsps.......ssal+hssslA-hhchpcchshhhsplGhl+-utuhsl.sL+hAlKhsslhRs...............hphlNlhshl..h...phhcpchchh+ssltplh

.CC-BY-NC-ND 4.0 International licenseunder anot certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available

The copyright holder for this preprint (which wasthis version posted December 19, 2017. ; https://doi.org/10.1101/236216doi: bioRxiv preprint

Page 12: Conservation of separase N-terminal domainDec 19, 2017  · 19 Introduction 20 Separase is a CD clan cysteine protease which cleaves multiple substrates to regulate cell 21 division

232 233

INSERT 1 H16 H17 TPR8A

2D structure .---------------------------------....----.................-.......---........................-----------------------------...........HHHHHHH..HHHHHHHHHHHHHH

“Identical residues”* * * * * *

Caenorhabditis_elegans_1 386 S---------------------------------SNPI----IVRCSTPKETGATSRAHTPMAGSSV---SEKQNTMRPDLADLLGDLELLDEQ-----------------------------SFHPITRSCVCNVCTIYPLHSSFAAEYMMSYAIH 473

Caenorhabditis_brenneri_1 388 S---------------------------------SEPV----IVRCSTPKEPRVPSRAHTPLPGEKI--RPQQQSTMIADESDEFEDIYLL-EK-----------------------------PFHSITRSCSCTVCLHYPNSSTFAAEYMLSHCIY 475

Caenorhabditis_briggsae_1 402 S---------------------------------SQPI----VVRCSTPKETRAASRAHTPLPGEESGNRRSKSDT-LNILADELEELDLYNDQ-----------------------------IFHPITRSCNCNVCLHYPLSTTFAAEYMMAYCIN 491

Caenorhabditis_remanei_1 388 S---------------------------------ANPV----LVRCSTPKESRGPSRAQTPLPGEKSGYVPHISETMVAPLEDLMEDLDLFDEQ-----------------------------PFHSVTRTCPCNVCLHYPLSCTFAAEYMMSYCIH 478

Dictyocaulus_viviparus_1 334 LKEPSSKN-----------VFGGSKQK--EVLPDDSEM--SLLLTAKSPMGNRLFPASE-------------GQKSVAISIMESFEDLRLE------------------------------------------------------------------ 397

Haemonchus_contortus_1 377 SNGSSMTDM----------LFGCSKKRPPE-VSFSDELTTEIPLPRISTP--SRSPARV-------------GQKSLATTLIESFEELRLETGKTPQRGRSPSRIASKVIDSFHELTLSETVSSLHELSDSCSCNFCWFCRTNYNASFEYWFSSFLY 507

Ancylostoma_duodenale_1 395 SSDTLLSKT----------SLGCTKKKPPEMLEFVDNI--DLPIPCTSTPIRSKSPGRV-------------SHKSVALSLAESFEELRLESAKTPPRCKSPGRITSKFIDSFHELTLSETICSLHEISEECECNTCELCSSNYNAAFEYWMSSFLY 526

Ancylostoma_ceylanicum_1 380 SSENLSSKT----------SLGFTKKKPPEMLEFVDNI--DLPVPCTSTPIRSKSPARV-------------SHKSVALSLAESFEELRLESTKTPPRCKSPGRITSKFIDSFHELTLSETICSLHEISEDCECNTCELCSSNYNAAFEYWMSSFLY 511

Loa_loa_1 428 NRSRNET------------SVCVDKEM---------------------------------------------QNKKLNLSSKDQKIIDLTEDEPNNLPQANKENMENKL--------------ILEEHVEYCDCVICTLCDINPQMHIEAIFSKMLF 513

Wuchereria_bancrofti_1 394 NRSKSGI------------SVCTDSER---------------------------------------------LNRELNLS--SQKIIDLSEDEPNNLSQANKENMESKL--------------ILEEHVEYCDCVICTLCGMNPQMHIEAIFSKMLF 477

Brugia_malayi_1 394 NRSKNGI------------YVRTDNER---------------------------------------------LNKKLNLSSGSQEIIDLSEDEPSGLPEANKKNMESKL--------------ILEEHVEYCDCVICTLCGMNPQMHIEAIFSKMLF 479

Toxocara_canis_1 438 NVPKRSD------------DSSNTAER--------------------------------SPDPGTLA-------------KEPSLAALMLHAFCYGQESGDGEKKEEKIEDD-----------KLESHNEKCTCVVCATCEHDLQLQFDISFAAFLF 526

“Identical residues”* ** * * * * *

Homo_sapiens_1 1060 FLLESCT------------EFGGVTQHLDS-VK--------------KVHLQKGKQQAQVPCPPQLP----EEELFLRGPALELVATVAKEPGPIAPSTNSSPVLKTKPQPIP----------NFLSHSPTCDCSLCASPVLT-AVCLRWVLVTAGV 1170

Monodelphis_domestica_1 1082 FLLGSST------------EFEGIAQLPNT-VK--------------RIHPQKGQQEPRIQPGSELS----EEEAFLKGPALQLVATVEKDPGP----SVSSPVLKSKPRPCP----------AFLSHSLSCSCILCTDPALV-VVCLRWLLVSAGA 1189

Chelonia_mydas_1 1053 FLLESGTGEVSFTSDLKERNFETKEKQKSE-VKI-------------KPKKGRSKGSKHQGPSAEPA---GEEDGFLKGPSLEFVDTVSRQEK--ESVLTTSPVLKPKKKRCL----------NFLSHSALCSCSLCSDVTLS-AVCLHWVVSYGQW 1175

Gallus_gallus_1 1052 FLLESGT------------EFEADEKQRAP-LKI-------------LPRKGKPEGRKQRDEGSELP---GEDEGFLKGPALEFVAMVSGLEK--ADDLSSSPQLKPKRKRRL----------AFLAHPAACPCCLCSDLTLS-SLCLRWLLCCARG 1162

Xenopus_tropicalis_1 1065 FLMESCT------------DFAAKSKQK-E-VKI-------------KLCKGKPFHKVDITETPPSP----PKDDFLKAVDLHYVETRDLSKAP------ESPTCKPSLEKNL---------PHFLSHTQDCTCPLCSDLVLS-LLCTQWLVAKAEN 1170

Latimeria_chalumnae_1 972 FLIESCT------------DFDSRGQVQGK-MKI-------------KVKKQKAVRKQTTKTDPTAE----GEDAFIKGPSLRNIATVSRDK---EGALTASPVLKSKVRKQL----------SFLSHQPACWCLVCTDVELS-LACTKWTVVFGEA 1080

Maylandia_zebra_1 1061 NLLERST------------DFSDQVQRA-E-VKI-------------KPRKGRPVQRPQSPLP--------TIEDDLK----DILSTRWTSKEPIVKDLSGSPPLKALPHRWL----------SSLTHEPECQCPCCSEPCLG-RATARWAAAQADL 1163

Con_90% .........................................................................p..h.....p.h..........................................h...C.C.hC........hthph.hs.h..

Con_80% .........................................................................pt.hths..p.ht.h.h..t...............................h.ths..C.CshC.....s..hthchhhs.h..

Con_70% ...t................hts.tp......................h.h.tts.tp..............tpt.lths.hc.htpl.hppt.......t...ht..................hhphsttCsCslCt.hshs.thshcahhu.hhh

Con_60% sh.ppss............phtsptpp.....................pstpt+s.tc.....s........pppplphs.h-.htplphpst......ts...hpsK...............shhshsppCsCslCs.sshs.phshcahhuphha

H18 TPR8B H19 TPR9A H20 TPR9B H21 TPR10A

2D structure --------....HHHHHHHHHHHHHHHHHHHHHHHHH---------......-------.....HHHHHHHHHHHHHHHHHHHHH.....-----HHHHHHHHHHHHHHH....------..HHHHHHHHHHHHHHH...----------..

“Identical residues”* * * * * * * ** * * ** * * * * **

Caenorhabditis_elegans_1 474 --------SDFSQLSIKHFNDEFARIRERGMSSQVLM---------HRDSSV-------RPRPNIIQNEIFGMCVIRWLTKKLDSKESAD-----EDTMEIFNNALKIVRYLQQ------RTTDMILAVTQLGRQLEFPM----------EC 580

Caenorhabditis_brenneri_1 476 --------MNFSQDAIKHFNDEFSHIRERGMEFQLMM---------HRDRSV-------RP----------------------------------------------VVKYLNL------RTTDLLLAVTQLGRQLEFPF----------DV 541

Caenorhabditis_briggsae_1 492 --------SDFTQISMKHFNDEFAQIRERGMNAQTMM---------HRDKSV-------RPRPNIIQNEVFGLCVIQWLKRKVDTKQLVD-----TETLEIFKNALKIVKYLNL------RTTDLLLAVQQLGRQLEFPQ----------DC 598

Caenorhabditis_remanei_1 479 --------SNFNQTSLKHFNEEFIRIRERGMSSQTMM---------QRNTQV-------RPRPNIIQNEVFGMCVCQWLVRKIDSKQYVD-----QESLEIFNNALKIVRYLNI------RTTDLMLAVQQLGRQLEFPQ----------DC 585

Dictyocaulus_viviparus_2 1 ------------------------------------------------------------------MCETFAICVVRWLRRLEEACWNRENAALRLLREKIVQDALKICSYSSL------RTVPYRLAISQLERLQNLPQ----------RL 70

Haemonchus_contortus_1 508 --------ESMSESAIQCLEERFHALRARVAPEQCAV----------LGKKV-------KPRPPSIMCETFALCVVRWLRRLEDSSWDSENLKLKPLRDSIVKSALKICGYSSV------RTVCYRLSISQLERIRKLSR----------CY 618

Ancylostoma_duodenale_1 527 --------ESMSNESIQCLEKQFQILRARVAVEQRAV----------LGKKV-------KPRPPSIMCETFAMCALRWLRSLEDSEWDHDNSSLKALREQIIASTLK--------------------------------------------- 608

Ancylostoma_ceylanicum_1 512 --------ESMSNDSIQCLEKQFQMLRARVAVEQRAV----------LGKRV-------KPRPPSIMCETFAMCAVRWLRSLEDAGWDHDNSALRALREQIVASTLKICGYSSV------RTIYYRLAVSQLERIRRVLR----------PS 622

Loa_loa_1 514 --------YRYSSRMFRVLSERILKLPQETALDVIAICSSLQKQ--ATDKAIE------ERRYPLAALDLFLLAAIRWLRRVGEEEK-------PEIIQDVIRESLKLCSTDPI------YFRWQWLTIRQLARTPL--K----------IP 624

Wuchereria_bancrofti_1 478 --------YRYSSRIFRMFSERILKLSQETSLDVIAICSALQKE--ATDKAIE------ERKYPLEALDRFLLAAIRWLRRVSEEEK-------PEIIQDVIKESLKICLTDPI------HFRWQWLTIRQLARTPL--K----------IP 588

Brugia_malayi_1 480 --------YRYSSRMFRMFSERILKLSQETSLDVIDICSALQKE--ATDKAIE------ERKYPLEALDRFLHAAIRWLRRVSEEEK-------PEIIQDVIKESLKICLTDPI------HFRWQWLTIRQLARTPL--K----------IP 590

Toxocara_canis_1 527 --------DSFSSHSFQTLATKFRTLETKFARAHSCM-----------------------------------------------------------CNAIGACVDAELCELEPC------HHRWQWLTIRQLEREPMIDR----------TS 595

“Identical residues”* * * * * * * ** * * ** * * * * **

Homo_sapiens_1 1171 --RLAMGHQAQGLDLLQVVLKGCPEAAERLTQALQAS---LNHK--TPPSLV-------PSLLDEILAQAYTLLALEGL-NQPSNES----------LQKVLQSGLKFVAAR---IPHLEPWRASLLLIWALTKLGGLS----CCTTQLFAS 1290

Monodelphis_domestica_1 1190 --ELSSGHQAKGLDMLRSVLRRCPGATHHLAQVLQAT---LSQE--ASPSPA-------PGFLEDLIAQAYAHLALEGI-NHPPDKG----------LWDILETGLNFVASR---TPFLNPWHASLLVTKSLATLSSLS----CCTTQLFGN 1309

Chelonia_mydas_1 1176 --ELASGNRAEGLGLLQASLERCMVMTSRFSTMVREA---SQQK--GKKAAIWDRSM--LGLLDNLVANIYATVAAQSMGGHQPEKR----------LWELLEAGLTFLSSRVPCLPGLEYHKASLLLTKAVATIYSLASKHNGCMAHIFSS 1308

Gallus_gallus_1 1163 --ELAAGSAAEGLGLIQATLQRCGTVGPRFTVLLQDK---LGSG--HKAPA--------LRLLDDLVATGYAALALHSLAGLCPEEK--------- LEEHLKAGLSFLASCSLHLPSLGVSKASLLLAKATAAVCRLAAKHGGSVDAVFAS 1289

Xenopus_tropicalis_1 1171 --------NTQSRGLLKSALKRCKALTQRFSGIVQEI---LGKG--VTKPAV-------LGVASELMARVYLAMA-----SQSPSKL----------PEETLEEGLSFVSSR--QATLPKHWRAGLLLAKALCSIYNLAAKHGGCVADMFVQ 1285

Latimeria_chalumnae_1 1081 --EAAAGNHAECRRLFQTSLERCETLAERFAGFIQGT---LAKEKPAKSASLKGKGMPSFGFLDDLCAKIYSKMAAFSL-TPKPEKT----------TWKVLNAGLIFLTSKAVCLPSLEHAKAALLLAKAVALISSLAAKHCCSATELFSP 1216

Maylandia_zebra_1 1164 LLQLDPNDTSVSSKLQWATLSRSKTVTAKLGVKLAKL---FPLY--NTGKHI-----SKPSLMQDIVGRVYLHMALSGV-DPSLTKVCA--------IWKILEAGLAFVESTP--SPMLRPVRAGLMTAKAIASLISLAAKNECTPEELFSK 1284

Con_90% .........t.s........pt...h..ch......................................phh.hhs.............................sh.hh.............h..hh.h.tht...................

Con_80% .........t.s.t.hphh.pph.hhttch......h........................h......phahhhsh..h.t...tp.............phhttsLphht...........hh..hLhhptltp.....t............

Con_70% ........ts.upt.hpth.cch.tlpt+hs....th........................h.s.hhsphahhhshp.l.p...pc.............cllptuLphst.ps.......hhp..hLhhptLsp..thst...........s

Con_60% ....... pshopphhpsh.cchhplppRhu.t.pth...htpt..tpsttl........sh.s.lhscsauhsAlphLhphtspch..........h.cllpsuLchsthpsh......+hps.hLslppLu+h.phsp..........ss

INSERT 2 H22 TPR10B H23 TPR11A H24 TPR11B H25

2D structure ................--------------...HHHHHHHHHHHHHHHH.......HHHHHHHHHHHHH.---------...HHHHHHHHHHH...HHHHHHHHHHH...

“Identical residues”* * * * * * * *

Caenorhabditis_elegans_1 581 NYSWMRPTIRKPRVKAT-(29)---------FDKERFEKVRLAMRNEMNHYGHILYREWRCRLFAYVGRT---------SRDPWEAAYAWAESTQIGARNAVQSRLEKCKR 697

Caenorhabditis_brenneri_1 542 NYSWMQPIVKKPRSKKG-(30)---------VDKEKLLKIRSEARREMNNYAHILYREWRCRLFPYIGRI---------SNDPWEAAYAWAESTLIGSRNAIQCKLERCRK 657

Caenorhabditis_briggsae_1 599 NCSWMHPIVKKPRVKAR-(30)---------FNKERYDKVCHEIRREMTQYTHILYREWRSRLFPYVGRT---------SKDPWLSAYAWAESTLIGSRNAVQCRIAKCRK 729

Caenorhabditis_remanei_1 586 DYSWMQPVIKKPRVKAG-(30)---------FDSARYEKIRTEMKLEMNLYGHILYREWRCRLFPYIGRT---------STDPWEAGYAWAESTLIGSRNAIQCRLEKCRK 701

Dictyocaulus_viviparus_2 71 RYDWMEDDTRTNNFMNT-(27)------NSAEQKDLTKGIVDDALKEYYRYSGLLYRDWRFSLCTFLGEY---------ATCPWLAAMMWSESTLLATRQLARHISGKFKG 184

Haemonchus_contortus_1 619 RYEWMKDTSRAGKFMDA-(23)------EVVAQKIVEKNMMKEALSEYASFSNLLYRDWRFPVCTFLGEY---------VSCPWQAAMMWSESTMLATRQLSRFMSGKMNG 730

Ancylostoma_duodenale_1 609 ------------------------------EEQKKIEEAAIAEANAEYASYSNLLCRSMGSSNDVVRVN------------------------------------------ 647

Ancylostoma_ceylanicum_1 623 SHDWMTDPSLTNRVIST-(21)------DSEEQKKAEEAAIAEANAEYASYNNLLCRDWRYPMCTFLGEY---------SKCPWEAAMMWSESTLLTTRQHSRLLLGKMSG 732

Loa_loa_1 625 KLEWMHLHVSPI--KAG-(41)--NHKTVTNVTPSYIDMEMEEARKDFNSFSHLFYREWRFQTCSYLGQVCAFQYKADWIEPGWSAAYYFNEAMFISTRQLARMIYKKSEE 765

Wuchereria_bancrofti_1 589 KLEWLRLHVKARLRSPF---------------------------------------------------------------------------------------------- 605

Brugia_malayi_1 591 KLEWLRLHVSPV--KAG-(33)--NRNTVTNVINNYVDTEMEEARKDFDSFSHLFYREWRFQICSYLGQVCAFQYKANWIESGWSAAYYFNEAMFISTRQLARMISKKSEE 733

Toxocara_canis_1 596 HLPWMLPSVSSDRIRLS-(45)--DSETA-NGMNPRYCQLYETANEDFEQYSHLFYREWRYLNCSSLACA--------NLSDNWMSAYFFSEATCFATRQLARTVDEEGES 733

“Identical residues”* * * * * * * *

Homo_sapiens_1 1291 SWGWQPPLIKSVPGSEP-(262)-PSAPV-ATGLSTLDSICDSLSVAFRGISHCPPSGLYAHLCRFLALCLG-------HRDPYATAFLVTESVSITCRHQLLTHLHRQLS 1641

Monodelphis_domestica_1 1310 TWAWQPPAAKLSQMSKS-(256)-ASTPA-AFGQPTLESIYDSLSIAFRCVSHCPPSELYAQLCRLLALCLG-------HRDPHTTALLVTESMSITSRHQLLTTVHRHLH 1660

Chelonia_mydas_1 1309 AWAWKPLTPS-SETKGS-(322)-SSLAA-AVSLSSLDSVCNSLKAAFQAISHCPPGTLYCQLCRLLALCIG-------SRDPLTTAYLVSESVSITVRHQLMSNVHRRLY 1724

Gallus_gallus_1 1290 AWAWQPAPLSPTEPKAA-(361)-NVLPA-VDDASLLDTACGLLKSAFSCISHCPPGTLYSQLCQLLALATG-------DQDPITTAYLLAESVSVTTRHQLLSIIHKKLH 1745

Xenopus_tropicalis_1 1286 LWGWQPHQNGKKAADLN-(272)-SSLPA-TFSASDLSLVSSLLHEALESIAHFPPSTLYSRLCRLLALCRG-------RQDPYVTALLVSESVSITLRHQLLSDIHRKLR 1652

Latimeria_chalumnae_1 1217 LWAWKPFPIPDPAKEKA-(359)-SALTSDVADFSALDSVYACLCSAFHSISHCPPSILFSHLCQLLALCRG-------SCDPYTTAYFVSESVAITARHQMLTNIHKKLS 1671

Maylandia_zebra_1 1285 AWTWNMPKEGKDLKSEQ-(303)-SHADS-RPDNLTLEDVQSLLRSAWLTLQHFPPPTLYPTLCALLALTTG-------QQNSITTAMLHSHSVGITSRHRTIRHIASCLK 1692

Con_90% .h.W.......................................h..th..htth..................................hhsps..ht.p.......tp....

Con_80% thtW.....t......t.....................thht.hp.thtthsph..tphh..hh.hlu.h..........pssh.sAhhhsEuh.lssRp......tc....

Con_70% thsW.....p.s.hpts...................h.thhtphptthtphuHh..pphh.plhthlu.h..........pssa.sAhhhuESs.lssRp.h..hht+....

Con_60% pasWh.shsptsthpts..............hs..hhpplhsphpptapshoHh..pphhhpLCphLuhs.........ppsPapsAhhhuESshIsoRpthhphlc+....

.CC-BY-NC-ND 4.0 International licenseunder anot certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available

The copyright holder for this preprint (which wasthis version posted December 19, 2017. ; https://doi.org/10.1101/236216doi: bioRxiv preprint

Page 13: Conservation of separase N-terminal domainDec 19, 2017  · 19 Introduction 20 Separase is a CD clan cysteine protease which cleaves multiple substrates to regulate cell 21 division

Supplementary figure 2: 234

235

.CC-BY-NC-ND 4.0 International licenseunder anot certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available

The copyright holder for this preprint (which wasthis version posted December 19, 2017. ; https://doi.org/10.1101/236216doi: bioRxiv preprint

Page 14: Conservation of separase N-terminal domainDec 19, 2017  · 19 Introduction 20 Separase is a CD clan cysteine protease which cleaves multiple substrates to regulate cell 21 division

Supplementary figure 3: 236

237

.CC-BY-NC-ND 4.0 International licenseunder anot certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available

The copyright holder for this preprint (which wasthis version posted December 19, 2017. ; https://doi.org/10.1101/236216doi: bioRxiv preprint

Page 15: Conservation of separase N-terminal domainDec 19, 2017  · 19 Introduction 20 Separase is a CD clan cysteine protease which cleaves multiple substrates to regulate cell 21 division

Supplementary Table 1: Conserved residues found within helices that makeup the N-terminal 238 domain are listed. Residues that are within interacting distance (as assayed by measuring a 239

distance less than 6 angstroms (Å) between β-Carbons) are indicated. These residues are 240 generally located within the same helix and don’t appear to be important for stabilizing inter-241 helix interactions. 242

Conserved

residue

<6 Å ββββ-C

distance

Same

helix?

F90 W93 (5.5 Å) same

W93 F90 (5.5 Å) same

E127 R130 (4.7 Å) same

L158 I514 (5.3 Å) no

F220 Y222 (6.0 Å) same

D264 None

D330 R329 (5.1 Å) same

F520 None

W584 Y582 (5.7 Å) same

R685 None

243

.CC-BY-NC-ND 4.0 International licenseunder anot certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available

The copyright holder for this preprint (which wasthis version posted December 19, 2017. ; https://doi.org/10.1101/236216doi: bioRxiv preprint