15
Submitted 22 May 2017 Accepted 11 July 2017 Published 31 July 2017 Corresponding author Steven H.D. Haddock, [email protected] Academic editor Robert Toonen Additional Information and Declarations can be found on page 11 DOI 10.7717/peerj.3633 Copyright 2017 Francis et al. Distributed under Creative Commons CC-BY 4.0 OPEN ACCESS Symplectin evolved from multiple duplications in bioluminescent squid Warren R. Francis 1 ,2 , Lynne M. Christianson 1 and Steven H.D. Haddock 1 1 Monterey Bay Aquarium Research Institute, Moss Landing, CA, United States of America 2 Department of Biology, University of Southern Denmark, Odense, Denmark ABSTRACT The squid Sthenoteuthis oualaniensis, formerly Symplectoteuthis oualaniensis, generates light using the luciferin coelenterazine and a unique enzyme, symplectin. Genetic information is limited for bioluminescent cephalopod species, so many proteins, including symplectin, occur in public databases only as sequence isolates with few identifiable homologs. As the distribution of the symplectin/pantetheinase protein family in Metazoa remains mostly unexplored, we have sequenced the transcriptomes of four additional luminous squid, and make use of publicly available but unanalyzed data of other cephalopods, to examine the occurrence and evolution of this protein family. While the majority of spiralians have one or two copies of this protein family, four well- supported groups of proteins are found in cephalopods, one of which corresponds to symplectin. A cysteine that is critical for symplectin functioning is conserved across essentially all members of the protein family, even those unlikely to be used for bioluminescence. Conversely, active site residues involved in pantetheinase catalysis are also conserved across essentially all of these proteins, suggesting that symplectin may have multiple functions including hydrolase activity, and that the evolution of the luminous phenotype required other changes in the protein outside of the main binding pocket. Subjects Biochemistry, Evolutionary Studies, Genomics, Marine Biology Keywords Luciferase, Neofunctionalization, Coelenterazine, Squid, Gene duplication, Symplectin, Evolution, Bioluminescence INTRODUCTION Luminous cephalopods (squids and octopods) generate light through mechanisms both native to the animal and through control of symbiotic bacteria (Haddock, Moline & Case, 2010). Although two independent evolutionary events of bacterial bioluminescence have been identified in squid (Lindgren et al., 2012), native bioluminescence (also called autogenic bioluminescence) is by far more common. The bioluminescence mechanisms of many of these autogenic species are unknown, although it is known that most of the species make use of the same luciferin, coelenterazine, which is the consumable substrate for the reaction. Coelenterazine is the most widely occurring luciferin in marine bioluminescence, its use being report in at least eleven phyla (Haddock, Moline & Case, 2010). Some species have been found to use forms of coelenterazine with additional modifications: the firefly squid Watasenia scintillans utilizes sulfated coelenterazine (Inoue et al., 1975), while the squid Sthenoteuthis oualaniensis (Takahashi & Isobe, 1994) and the How to cite this article Francis et al. (2017), Symplectin evolved from multiple duplications in bioluminescent squid. PeerJ 5:e3633; DOI 10.7717/peerj.3633

Symplectin evolved from multiple duplications in bioluminescent … · 2017-07-31 · Pterygioteuthis hoylei Photophore Yes 6.8 93,201SRR5527418This study Dosidicus gigas Photophore

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Symplectin evolved from multiple duplications in bioluminescent … · 2017-07-31 · Pterygioteuthis hoylei Photophore Yes 6.8 93,201SRR5527418This study Dosidicus gigas Photophore

Submitted 22 May 2017Accepted 11 July 2017Published 31 July 2017

Corresponding authorSteven H.D. Haddock,[email protected]

Academic editorRobert Toonen

Additional Information andDeclarations can be found onpage 11

DOI 10.7717/peerj.3633

Copyright2017 Francis et al.

Distributed underCreative Commons CC-BY 4.0

OPEN ACCESS

Symplectin evolved from multipleduplications in bioluminescent squidWarren R. Francis1,2, Lynne M. Christianson1 and Steven H.D. Haddock1

1Monterey Bay Aquarium Research Institute, Moss Landing, CA, United States of America2Department of Biology, University of Southern Denmark, Odense, Denmark

ABSTRACTThe squid Sthenoteuthis oualaniensis, formerly Symplectoteuthis oualaniensis, generateslight using the luciferin coelenterazine and a unique enzyme, symplectin. Geneticinformation is limited for bioluminescent cephalopod species, so many proteins,including symplectin, occur in public databases only as sequence isolates with fewidentifiable homologs. As the distribution of the symplectin/pantetheinase proteinfamily inMetazoa remainsmostly unexplored, we have sequenced the transcriptomes offour additional luminous squid, andmake use of publicly available but unanalyzed dataof other cephalopods, to examine the occurrence and evolution of this protein family.While the majority of spiralians have one or two copies of this protein family, four well-supported groups of proteins are found in cephalopods, one of which corresponds tosymplectin. A cysteine that is critical for symplectin functioning is conserved acrossessentially all members of the protein family, even those unlikely to be used forbioluminescence. Conversely, active site residues involved in pantetheinase catalysisare also conserved across essentially all of these proteins, suggesting that symplectinmay have multiple functions including hydrolase activity, and that the evolution of theluminous phenotype required other changes in the protein outside of the main bindingpocket.

Subjects Biochemistry, Evolutionary Studies, Genomics, Marine BiologyKeywords Luciferase, Neofunctionalization, Coelenterazine, Squid, Gene duplication,Symplectin, Evolution, Bioluminescence

INTRODUCTIONLuminous cephalopods (squids and octopods) generate light through mechanisms bothnative to the animal and through control of symbiotic bacteria (Haddock, Moline &Case, 2010). Although two independent evolutionary events of bacterial bioluminescencehave been identified in squid (Lindgren et al., 2012), native bioluminescence (also calledautogenic bioluminescence) is by far more common. The bioluminescence mechanismsof many of these autogenic species are unknown, although it is known that most ofthe species make use of the same luciferin, coelenterazine, which is the consumablesubstrate for the reaction. Coelenterazine is the most widely occurring luciferin in marinebioluminescence, its use being report in at least eleven phyla (Haddock, Moline & Case,2010). Some species have been found to use forms of coelenterazine with additionalmodifications: the firefly squid Watasenia scintillans utilizes sulfated coelenterazine (Inoueet al., 1975), while the squid Sthenoteuthis oualaniensis (Takahashi & Isobe, 1994) and the

How to cite this article Francis et al. (2017), Symplectin evolved from multiple duplications in bioluminescent squid. PeerJ 5:e3633; DOI10.7717/peerj.3633

Page 2: Symplectin evolved from multiple duplications in bioluminescent … · 2017-07-31 · Pterygioteuthis hoylei Photophore Yes 6.8 93,201SRR5527418This study Dosidicus gigas Photophore

clam Pholas dactylus (Tanaka, Kuse & Nishikawa, 2009) use a version that is dehydrated atthe 2-position, called dehydrocoelenterazine (hereafter abbreviated as dhCtz).

Much of the work on coelenterazine-using bioluminescent systems has focused oncnidarians (both hydromedusae like Aequorea victoria or Obelia spp. and the octocoralRenilla reniformis) and crustaceans likeGaussia princeps andOplophorus. Nonetheless, somekey protein components have been identified in two squid species. The luciferase of thefirefly squid Watasenia scintillans was recently identified (Gimenez et al., 2016), belongingto the same protein superfamily as firefly luciferases. The other protein comes from thepurpleback flying squid Sthenoteuthis oualaniensis, where a 500-amino acid photoproteinhas been cloned and characterized (Isobe et al., 2008) and was named symplectin. Whilethe family of firefly luciferases and marine coelenterazine luciferases has been well studied,little attention has been given to the distribution and origins of symplectins.

Previous work highlighted the sequence similarity of symplectin to members ofthe biotinidase/pantetheinase family (Fujii et al., 2002), part of the superfamily ofcarbon-nitrogen hydrolases. Such proteins have been characterized in mammals, andin Drosophila (Swango & Wolf, 2001) and have roles in recycling of the enzymatic cofactorsbiotin and pantothenic acid. However, information about this protein family is limitedoutside of model organisms. Thus, very little insight could be offered to explain how abiotinidase evolved into a photoprotein, and how this apparently happened independentlyof other mechanisms of bioluminescence in squid.

Transcriptomics has proven helpful in identifying new fluorescent proteins (Hunt etal., 2012) and photoproteins (Powers et al., 2013) in organisms where traditional cloningapproaches have failed. Based on this, we sought to identify orthologs of symplectin intranscriptomes of other luminous squid. We present transcriptomes from light-producingtissues of four luminous squid, Chiroteuthis calyx, Octopoteuthis deletron, Vampyroteuthisinfernalis, Pterygioteuthis hoylei, and one luminous nudibranch Phylliroe bucephala. Wefound homologs of symplectin in all five species, and then further compared them tohomologs identified in other animal phyla.Many important catalytic residues are conservedacross the entire protein family, including among cephalopod proteins. As this proteinfamily has undergone many independent expansions in several other groups since the lastcommon ancestor of animals, it may be that large expansions of protein families facilitateevolution of novel phenotypes, including bioluminescence.

MATERIALS AND METHODSSpecimens and sequencingSpecimens of Chiroteuthis calyx,Octopoteuthis deletron, and Vampyroteuthis infernalis werecollected around the Monterey Bay, off the coast of California using detritus samplerson ROVs (remotely-operated vehicles) from the Monterey Bay Aquarium ResearchInstitute (MBARI). Pterygioteuthis hoylei and Phylliroe bucephala were caught in the Gulfof California by trawls and blue-water diving, respectively. ROVs were operated duringdaytime hours, between 07:00 and 19:00 h. Operations were conducted under permitSC-4029 issued to SHD Haddock by the California Department of Fish and Wildlife.

Francis et al. (2017), PeerJ, DOI 10.7717/peerj.3633 2/15

Page 3: Symplectin evolved from multiple duplications in bioluminescent … · 2017-07-31 · Pterygioteuthis hoylei Photophore Yes 6.8 93,201SRR5527418This study Dosidicus gigas Photophore

Table 1 Transcriptomic data sources.

Species Tissue Luminous Bases(Gb)

Assembledtranscripts

Accession Reference

Chiroteuthis calyx Arm Yes 6.3 78,445 SRR5527417 This studyOctopoteuthis deletron Arm tips Yes 6.1 122,672 SRR5527415 This studyVampyroteuthis infernalis Arm tips Yes 6.4 149,961 SRR5527416 This studyPterygioteuthis hoylei Photophore Yes 6.8 93,201 SRR5527418 This studyDosidicus gigas Photophore Yes 6.0 94,197 SRR5152122 Francis et al. (2013)Watasenia scintillans Arm/mantle Yes 30.0 216,307 GEDZ00000000.1 Gimenez et al. (2016)Uroteuthis edulis Photophore Bacterial 8.8 119,033 PRJNA257113 Pankey et al. (2014)Euprymna scolopes Multiple Bacterial 45.6 280,433 PRJNA257113 Pankey et al. (2014)Octopus vulgaris Nerve/brain No 1.2 59,859 JR435555 Zhang et al. (2012)Sepia pharaonis Whole animal No 4.6 131,176 SRR3011300 Wen et al. (2016)Loligo vulgaris Sucker No 3.1 43,951 SRR3472303 Jung et al. (2016)Doryteuthis pealeii Multiple No 15.2 212,516 PRJNA255916 Alon et al. (2015)Phylliroe bucephala Whole animal Yes 6.1 66,535 SRR5527414 This study

Species used are unprotected and unregulated, and no vertebrates or octopus were used,so the International and NIH ethics guidelines are not invoked, although organismswere treated humanely. All samples were flash-frozen in liquid nitrogen. Total RNA wasextracted with the Qiagen RNA-easy kit following manufacturer’s instructions. All sampleswere sequenced at the University of Utah using the Illumina HiSeq2000 platform. Librarieswere generated using the Illumina TruSeq kit, with oligo-dT selection, and were run withsix samples per lane. Assemblies were generated as previously described (Francis et al.,2013). NCBI SRA numbers are given in Table 1. All assemblies can be downloaded athttps://bitbucket.org/wrf/squid-transcriptomes.

Data acquisitionWe made use of public data from a number of previous studies (Table 1). Whenavailable, assembled transcriptomes were used. Otherwise, data were downloaded fromNCBI SRA and assembled with Trinity v2.2.0 (Grabherr et al., 2011), with the options--normalize_reads and --trimmomatic.

Gene treesHomologs of symplectin were identified by BLAST alignment using blastp or tblastn, usingsymplectin (NCBI accession: C6KYS2.2) as the query and an e-value threshold of 10−10. AllBLAST searches were done using the NCBI BLAST 2.2.29+ package (Camacho et al., 2009).As the query was 501 amino acids, partial proteins under 100 amino acids were unlikelyto provide useful comparisons and were excluded. Alignments for protein sequences werecreated using MAFFT v7.029b, with L-INS-i parameters for accurate alignments (Katoh& Standley, 2013). Phylogenetic trees were generated using either FastTree (Price, Dehal &Arkin, 2010) with default parameters or RAxML-HPC-PTHREADS v8.2.10 (Stamatakis,2014), using the PROTGAMMALG model for proteins and 100 bootstrap replicates withthe ‘‘rapid bootstrap’’ (-f a) algorithm.

Francis et al. (2017), PeerJ, DOI 10.7717/peerj.3633 3/15

Page 4: Symplectin evolved from multiple duplications in bioluminescent … · 2017-07-31 · Pterygioteuthis hoylei Photophore Yes 6.8 93,201SRR5527418This study Dosidicus gigas Photophore

Structure modellingA model structure of symplectin was generated using the HHPred webserver (Alva et al.,2016), with human vanin-1 (PDB accession: 4CYG) as the template protein, using defaultparameters. Vanin-1 was the highest scoring model available (probability of 100, e-valueof 1.4∗10−71). Model evaluations can be downloaded at https://bitbucket.org/wrf/squid-transcriptomes.

RESULTSSymplectin-like proteins in other cephalopodsWe generated approximately 6 Gb of pair-end RNAseq data for 4 squid and one luminousnudibranch (Table 1). Reads for each species were assembled de novo using Trinity(Grabherr et al., 2011). With symplectin as the query, we used BLAST (Camacho et al.,2009) to identify homologs in the transcriptomes and genomes of other animals. Ingeneral, cephalopods have four copies of this protein sharing a single origin (Fig. 1), whilethe precise number varies between species, likely subject to coverage limits of transcriptomesequencing or tissue-specific expression. We found full-length proteins of each of the fourgroups from only three species, Watasenia scintillans, Sepia pharaonis (non-luminous),and Pterygioteuthis hoylei. The symplectin group and group 1 proteins were not found inany octopodiform (represented by only V. infernalis and two Octopus species) (Zhang etal., 2012; Albertin et al., 2015).

Of the four protein groups, homologs of symplectin were found in two luminousspecies where the mechanisms of light production are unknown, D. gigas and P. hoylei.These homologs could potentially be the active protein in the bioluminescence of thesespecies. However, the firefly squid W. scintillans has five proteins in total, one of eachof the four groups, as well as an additional duplicate of the symplectin group. We werealso able to find symplectin homologs in three species that are not luminous, Sepiapharaonis, Loligo vulgaris, and Doryteuthis pealei, indicating that the tree position aloneis not enough to predict bioluminescence of the organism, or whether the enzyme couldact as a photoprotein; it could be that all four cephalopod protein groups can act asphotoproteins, though it could also be that symplectin is the only enzyme that acts as aphotoprotein in this entire protein family. Additionally, two species that have luminoussystems which do not involve coelenterazine still have members of this protein family: thesquid Euprymna scolopes and Uroteuthis edulis both generate light from interactions withsymbiotic, luminous bacteria, yet we still found homologs from protein groups 1, 2 and 3.

Symplectin-like proteins across metazoansTo better understand the relative importance of the cephalopod duplications, we examinedhomologs of symplectin across Metazoa (Fig. 2). At the sequence level, symplectin issimilar to two annotated proteins in human (Fujii et al., 2002), vanin-1 (pantetheinase,30% identity) and biotinidase (btd1, 31% identity), both of which are hydrolyzing amidasesand do not require other cofactors. Homologs of symplectin/biotinidase were found inmostanimal groups, including vertebrates, arthropods, cnidarians, sponges, placozoans, as wellas choanoflagellates (single-celled eukaryotes that are sister-group to animals). We were

Francis et al. (2017), PeerJ, DOI 10.7717/peerj.3633 4/15

Page 5: Symplectin evolved from multiple duplications in bioluminescent … · 2017-07-31 · Pterygioteuthis hoylei Photophore Yes 6.8 93,201SRR5527418This study Dosidicus gigas Photophore

0.6

Crassostrea_gigas_EKC31982

Uroteuthis_edulis_DN40275_c0_g1_i1

Sepia_pharaonis_DN22891_c0_g1_i1

Watasenia_scintillans_GEDZ01046679.1

Lotgi1|128172

Chiroteuthis_calyx_17333_comp13011_c7_seq1

Euprymna_scolopes_DN37210_c0_g1_i7

Sepia_pharaonis_DN24688_c1_g2_i1

Lotgi1|218395

Octopus_bimaculoides_stringtie.93053.2Vampyroteuthis_infernalis_comp70596_c0_seq2

Sepia_pharaonis_DN25074_c0_g1_i1

Dosidicus_gigas_comp9737_c1_seq1

Doryteuthis_pealei_DN34337_c15_g31_i3_partial

Pterygioteuthis_hoylei_comp1890_c0_seq1

Doryteuthis_pealei_DN27962_c0_g1_i1

Vampyroteuthis_infernalis_comp83932_c0_seq1_partialLotgi1|204287

Euprymna_scolopes_DN37748_c7_g3_i2

Pinctada_fucata_aug2.0_2079.1_32034.t1

Octopoteuthis_deletron_comp14997_c3_seq6

Doryteuthis_pealei_DN33820_c0_g1_i1

Watasenia_scintillans_GEDZ01116067.1_partial

Octopoteuthis_deletron_comp21891_c0_seq1

Symplectin

Pomacea_canaliculata_GBZZ01041301.1

Uroteuthis_edulis_DN39599_c0_g1_i1

Sepia_pharaonis_DN25762_c5_g1_i1

Phylliroe_bucephala_comp16391_c0_seq1

Pterygioteuthis_hoylei_comp18475_c0_seq1

Doryteuthis_pealei_DN33433_c0_g1_i1

Ocbimv22003864m.p

Watasenia_scintillans_GEDZ01127263.1

Chiroteuthis_calyx_comp21382_c2_seq1

Watasenia_scintillans_GEDZ01132136.1

Loligo_vulgaris_DN12358_c0_g1_i1

Dosidicus_gigas_comp8261_c0_seq1

Pterygioteuthis_hoylei_comp15250_c1_seq14Octopus_vulgaris_JR438964.1

Watasenia_scintillans_GEDZ01145005.1

Octopus_bimaculoides_stringtie.12337.1

Pterygioteuthis_hoylei_comp24624_c0_seq1

Pinctada_fucata_aug2.0_1686.1_21739.t1

Euprymna_scolopes_DN29795_c1_g2_i1

Phylliroe_bucephala_comp16052_c0_seq1

100

65

84

100

100

100

19

76

100

16

100

100

100

100

52

Symplectin group

Group 1

Group 2

Group 3

Other molluscs

EKS-triad

ERS-triad

Other proteinsBacterial biolum

Mechanism unknown

Not bioluminescent

Figure 1 Phylogenetic tree of symplectin and homologs from other molluscs. Four strongly supported groups are identified in cephalopods.Species that are luminous but have unknown mechanisms are high-lighted in blue, while those with bacterial bioluminescence or other known pro-teins are highlighted in orange and pink, respectively. Black labels indicate non-luminous species. Clades/proteins with a catalytic triad of E-K-Shave a green star, and those with E-R-S triad have a yellow star; all others have the conserved E-K-C triad. Internal bootstrap values are removed forclarity. Complete version of the same tree containing homologs from across metazoans is shown in Fig. 2.

unable to find any homologs in hemichordates (based on the genomes of Ptychodera flavaand Saccoglossus kowalevskii) nor any ctenophores (based on the genome of Mnemiopsisleidyi and transcriptomes from 11 species). Because of the presence in choanoflagellates,absence of this protein family in hemichordates and ctenophores is likely a secondary loss.However, hemichordates are only represented by two species so sequencing of additionalspecies, or deeper sequencing of the studied species, may reveal homologs in this clade.

Comparison to annotated proteinsIt is clear from the alignment that the original symplectin sequence is not the completeCDS (Fig. 3), as it does not start with methionine, and around 30 residues are missingcompared to symplectin homologs in other cephalopods. At present, no transcriptomic

Francis et al. (2017), PeerJ, DOI 10.7717/peerj.3633 5/15

Page 6: Symplectin evolved from multiple duplications in bioluminescent … · 2017-07-31 · Pterygioteuthis hoylei Photophore Yes 6.8 93,201SRR5527418This study Dosidicus gigas Photophore

0.6

100

8710065

100

99100

100

99

51

Symplectin

Cephalopods

Choanoflagellates

Arthropods

Cnidarian

Vanin?

VertebrateVanin

VertebrateBiotinidase

Group 1

Group 2

Group 3

Othermolluscs

SpongesEchinoderms

Placozoans

Cnidarian Biotinidase?

100100

36

100

Figure 2 Phylogenetic tree of known symplectin and homologs from all animals. Cephalopod-specificgroups are indicated as in Fig. 1. Symplectin is indicated by the blue star. Most species have only a singlehomolog and the proteins form a clade by phylum. This is not true of chordates (yellow and orange forbiotinidase and vanin-1/pantetheinase, respectively) and cnidarians (red). Internal bootstrap values areremoved for clarity, though many support values for backbone nodes are low. Detailed version of the sametree showing only molluscs is shown in Fig. 1.

or genomic data is available for S. oualaniensis, thus we could not examine the proteincompleteness or copy number in that species.

While symplectin does not have an available crystal structure, the structure of the humanprotein vanin-1 has been determined (Boersma et al., 2014), which is the only member ofthis family to have a crystal structure. The vanin-1 protein structure is divided into twodomains, the catalytic ‘‘nitrilase’’ domain, and the base domain, which has an unknownfunction in vanin-1 (Boersma et al., 2014). The catalytic triad residues of human vanin-1include a glutamate (E79), lysine (K178) and cysteine (C211) (Boersma et al., 2014). Twoof the residues, K178 and E79, are conserved in essentially all taxa, including cephalopods(though this cannot be evaluated in partial sequences). The cysteine is also conservedin most proteins, except some cephalopods sequences have a serine instead of cysteine,although serine could still function as a nucleophile for hydrolysis. All Group 3 proteinsare serine-containing, but otherwise the serine proteins do not form a clade, indicating thatthis mutation has occurred multiple times independently. The cysteine is followed by aphenylalanine (F212 in vanin-1) in nearly all proteins, while most of the cephalopod serineproteins are followed by tyrosine, suggesting some role of this position in the specificity orreactivity of the enzyme.

Francis et al. (2017), PeerJ, DOI 10.7717/peerj.3633 6/15

Page 7: Symplectin evolved from multiple duplications in bioluminescent … · 2017-07-31 · Pterygioteuthis hoylei Photophore Yes 6.8 93,201SRR5527418This study Dosidicus gigas Photophore

SymplectinPterygioteuthis_hoylei_comp18475_c0_seq1Dosidicus_gigas_comp8261_c0_seq1Watasenia_scintillans_GEDZ01046679.1VNN1_HUMAN

250 260 270 280

C F D F Q - - - - A VQ L I E Q Y N V RH I A Y P A S WVN L P P I Y Q S I Q SC F D I MF K K P T V E L I R KHN I RN I A Y S T AWV E R P P I F R S I AMS F D V F F K R P A I E L AD K Y Q I RN I A F S TGWY E M T P I F R S I Q VC F D I MF K E P T L E L I KQH K I E N I A Y S S AWV E H P P I F R S I QMC F D I L F HD P A V T L V KD F H VD T I V F P T AWMN V L P H L S A V E F

290 300 310 320

H S A F A R F A K I N L L A A S VH R L E T S T Y G S G I Y S P NG A E I F Y FH S G V A R A A K AN I L S AN L HWL P TN S Y G S G I Y T P DG A V T Y H YH S A F A R E AQ VN F L V S NMG - - - - - - - G S G I Y T P NG T K T Y Y YH S A L A R AG K VN I L A AN L HWL P T R S Y G S G I Y T P DG A A A Y Y HH S AWAMGMR VN F L A S N I H Y P S K KM TG S G I Y A P N S S R A F H Y

490 500 510 520 530 540 550 560

MY NN Y I A L D P F VWN Y T E AGG I E T K P G T S T P L H S AN L V A R I Y A KD S - - - S KH VHQ S H P I D E G V I KMA V K Y ML Y VMA A Y V Y A A S -MD E GQ Y K L A S D AWT Y N E HNG I K S Q P E Q T VH L H S AN L L A R V Y R RD V L L RD P N K K KN E G S E KW- VN S T I N F V L R T I T S F I F S I I VMDGD S Y E L V P NDWT Y S D E QG L Q S L AG K K I T L H S A T L MS RD F S KD I L VGG KW I E E K E E E - - - - - - - - - - - - - - - - - - - - - - - - -MDGGQ Y E L V P D AWT - S T F NG V E T R P G K T V K L H S AN L V A R V F R KD V L F VN P S QQQ K R K E N K Y L I I A S I KN F I RMMA S Y I L F - - -S E N - - - Q L A P G E F Q V S TDG R L F S L K P T S G P V L T V T L F G R L Y E KDW- - - - - - - - - A S N A S S G L T AQ A R I I ML I V I A P I VC S L S W

330 340 350

R P D I P K S K L L V A E I - - L P I H V K K P E - - - Q TR KND P K S K L L WAD I - - H K I DME A I E - - - E DR P H T TG S N L T V S D I - - Y Q I KN V K P N S K I E TK RN V P K S K L I I AD I - - HQ I D I E R E S - - - DDDMK T E E G K L L L S Q L D S H P S H S A V VN - - - WT

360 370 380 390 400

V VN F DN P V F P S E DDD VQD L F D RGD F A F L K Y K RM T T R AG T V E VC Q K S F C C KE Q E Q E D E I KD A - - - N V L D L F K T KD L P L F K F KH L T A R S G L V E VC QGQ F C C KD F N I S NDNN T K - - - HD T E L KQ I Y P G F E L K F KH L K T K S G S V T VC E K S F C C RE TD P T P P P T P T V - QQ V L D L K T VN P F S S F K F E R L K T K KG E A K VC H K S F C C ES Y A S S I E A L S S - - - GN K E F KG T V F F D E F T F V K L TG V AGN Y T VC Q KD L C C H

410 420

A R Y A V KD R F - K E V Y A VG V Y DA T Y A I E GHN - Q E D Y A VG I Y DA Y Y A I E G S F - E E T Y A L G V F NA S Y T I KG R F - R E V Y A VG I Y EL S Y KMS E N I P N E V Y A L G A F D

430 440 450 460 470 480

G L L S AG ANN L Y F Q I C T V I QC P H K K - - - C G L K I S K V R TH F K Y L N L R ADGWL D R Y V F P S Y T VG Y I L A S E K K L Y VQ I C T L L KC P N S E - - - C G T E I TN V R T V F S H F S L KG TG L D S Y F I F P S VMAD L ME I AGN KMY TQ I C T L S KC QD S R - - - C G S W I AN A K T V F S N I T L Q AGG V L S K Y MY P S L L TG Y L L A S KN K L Y VQ I C T L L KC Q S S E - - - C G T E I TD V S T E F S H V S L KG TD F I TG Y I Y P S L L TG - L H T V E G R Y Y L Q I C T L L KC K T TN L N TC GD S A E T A S T R F E MF S L S G T - F G TQ Y V F P E V L L

220 230 240

L F WE E GWF N S S KN Y E MA L WD T P I G K F G T F ML F - K E DWF N S S K S DD L A I F D T P F G K F A T F IL F - NDG Y F D A A K S P E A A I F D T P F G K F A T F IL F - E E DWF N S A K S V E MA I F D T P F G K F G T F IL F MG E NQ F N V P K E P E I V T F N T T F G S F G I F T

170 180 190 200 210

MY MV VNMAG R E P C R R A T E - P E C P GD KQ L L Y N TN V A F NN E GD V V A R Y Y K THMF I V VNMAG R E AC K - - P E - D E C P S DGQ F L Y N TN V V F S P KG E I I A R Y Y K TNMF L V VNMAG R E P C N - - NQ - S E C P P DN E Y L HN TN V I F S P KGQ V V A R Y Y K THMF L I I NMAG R E TC N - - P E L D E C P HD E Q Y L Y N TN V V F S P E G S V V A R Y Y K THI Y V V AN I GD K K P C D - - T S D P QC P P DG R Y Q Y N TD V V F D S QG K L V A R Y H KQN

10 20 30 40 50 60 70

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Y V - - - - - - - - - - - R P V S S WK V A V F E HQ V I P P K - TDME - - TMS VH - - - - - - - - - - R L S L L - - - - L L L L S L I Y V ADC R R S RN RQ S P P P D S Y R VG V V E H A V I P P S P P D VN - - TMA T S - Y RNN T S C T I R L A L L F F S S I L F I S F I NGDD - S RN TDDQ - - L L P S Y K VG V L D Y N V I V P Y - E V I D - - TML V S S R S NNM TMP - - WS L L - - - - L L I I S L L QC VHC K L T I NDQ - - P L P S Y K V A V F E H A V I P AD - KN L N - - SM T TQ - - - - - - - L P A Y V A I L - - - - L F Y V S R A S C QD - - - - - - - - - - - - - T F T A A V Y E H A A I L P N - A T L T P V S

80

R E E A L D A L K LR KMA L DGMKWR K K A L E AMQ KR RMA I Q AMK KR E E A L A L MN R

150 160

Y Q TQ S S E ML R T F S C MA K E NDF - - TGGQ VQ RWL S C S A R Y N YF KD E S T E VQ R L L S C I A K E N EF R S E N T A I L RML S C I A RDN R- R F GQ T P VQ E R L S C L A KNN S

90 100 110 120 130 140

N S D V Y H E A V L E S R S KG V KM I V F P E Y G L Y D I N T L T R T RMD L M A E K V P H P KHGH RN P C D E P EN VD A Y N R TMV E A K RMG V K L L V Y P E Y G L Y G T RM - T R E RMS L F A E T I P D P K E GH RN P C I D RDN I H V Y NQ T A I E A K L QG V K L L V F P E Y G I L G L G V F S R K L L K L F A E T I P D S K - S Q K I P C E D A EN I D V F N E T V I E A R S KG V KML V F P E Y G L Y D T Y M - T R E R I N L F A E P I P D T K T S H KN P C TD P EN L D I L E G A I T S A ADQG AH I I V T P E D A I Y GWN F - N RD S L Y P Y L E D I P D P E - VNW I P C NN RN

Nit

rila

se d

om

ain

Base

dom

ain

SymplectinPterygioteuthis_hoylei_comp18475_c0_seq1Dosidicus_gigas_comp8261_c0_seq1Watasenia_scintillans_GEDZ01046679.1VNN1_HUMAN

SymplectinPterygioteuthis_hoylei_comp18475_c0_seq1Dosidicus_gigas_comp8261_c0_seq1Watasenia_scintillans_GEDZ01046679.1VNN1_HUMAN

SymplectinPterygioteuthis_hoylei_comp18475_c0_seq1Dosidicus_gigas_comp8261_c0_seq1Watasenia_scintillans_GEDZ01046679.1VNN1_HUMAN

SymplectinPterygioteuthis_hoylei_comp18475_c0_seq1Dosidicus_gigas_comp8261_c0_seq1Watasenia_scintillans_GEDZ01046679.1VNN1_HUMAN

SymplectinPterygioteuthis_hoylei_comp18475_c0_seq1Dosidicus_gigas_comp8261_c0_seq1Watasenia_scintillans_GEDZ01046679.1VNN1_HUMAN

SymplectinPterygioteuthis_hoylei_comp18475_c0_seq1Dosidicus_gigas_comp8261_c0_seq1Watasenia_scintillans_GEDZ01046679.1VNN1_HUMAN

Figure 3 Alignment of symplectin and vanin-1.Multiple sequence alignment of symplectin and vanin-1 with top hits from D. gigas, P. hoylei, andW. scintillans. Intensity of blue color shows conservation. Catalytic residues (E, K, C) identified in vanin-1 are indicated by triangles beneath, andthe catalytic cysteine is shown in green, though this position is a serine for D. gigas. The dhCtz-binding cysteine is shown in yellow, indicated by theblue hexagon above. Disulfide bridges found in both symplectin and vanin-1 are shown in black. Those found only in vanin-1 are shown in green,while the one remaining disulfide bond found in symplectin is shown in red. For Group 2 cephalopod proteins (Fig. 1), the conserved cysteine atalignment position 440 is substituted and a neighboring residue (I383 in symplectin) is instead a cysteine.

Conservation and role of cysteinesInstead of using the more-common coelenterazine molecule, symplectin requiresdehydrocoelenterazine (dhCtz). Unlike cnidarian or ctenophore photoproteins, whichhold coelenterazine through a peroxide bond to tyrosine (Head et al., 2000), dhCtz islinked to symplectin by a thioether to a free cysteine residue (C390 in Symplectin) (Isobeet al., 2008). Symplectin has 11 cysteines (Fig. 3), initially suggesting that the odd numberallowed for a free cysteine to bind coelenterazine. However,mass spectrometry data indicatethat only three pairs are in the form of disulfide bridges (Kongjinda et al., 2011), thoughperhaps the remaining two pairs are not effectively captured. If the conserved cysteine(C196) is actually catalytic in the binding pocket of the nitrilase domain of the symplectin

Francis et al. (2017), PeerJ, DOI 10.7717/peerj.3633 7/15

Page 8: Symplectin evolved from multiple duplications in bioluminescent … · 2017-07-31 · Pterygioteuthis hoylei Photophore Yes 6.8 93,201SRR5527418This study Dosidicus gigas Photophore

Figure 4 Modeled structure of symplectin. PDB format structure based on the structure of vanin-1 (4CYF). (A) Overview of the two domains,named after those defined in vanin-1. Residues 1–290 (blue to green) compose the nitrilase domain while residues 291–465 compose the base do-main (yellow to red). (B) Close-up view of the catalytic triad E60-K163-C196 (C) Putative disulfide bridge of C390 and C385, while overall theseresidues are located close to a number of hydrophobic residues, potentially involved in dehydrocoelenterzine binding.

and does not form a bridge, and three pairs are already identified, then this leaves oneadditional pair of cysteines for disulfide bonding.

C390 is conserved in essentially all other proteins of this family, including biotinidase(C471) and vanin-1 (C411). In vanin-1, C411 is in close proximity to C403 and formsa disulfide bond, though apparently this cannot occur in symplectin between C385 andC390, since C390 would no longer be available for thioether bonding to dhCtz (Isobe etal., 2008). Mass spectrometry data suggest that C385 instead forms a bridge with C380(Kongjinda et al., 2011). However, when symplectin is modeled based on the structure ofvanin-1 (Fig. 4), C380 would be on another beta strand and point away from the bindingpocket to bridge with C345, while the adjacent cysteine (C344) bridges further away toC339 in a beta-hairpin motif. The cysteine pair for C345/C380 is found in all proteinssuggesting an important structural role of C380, and arguing against the formation of theC380/C385 bond. By comparison, the C344/C339 pair is not found in sponges, polychaetes,or choanoflagellates, suggesting it is dispensable.

This discrepancy may be reconciled three ways: (1) the structures of symplectin andvanin-1 differ enough that vanin-1 cannot reliably be used to model disulfide bridges insymplectin; (2) C385/C390 natively form a disulfide bond, though when digested, otherchanges in the redox state disrupt this bond and the portion containing C380 and C385dynamically forms a disulfide bond when fragmented. However, this presents an additionalproblem, as even if C385 and C390 do indeed form a disulfide bridge, this bond must breakin order for the thioether to form, and C385 is then a free cysteine; (3) the fragmentationdata are incorrect and do not reliably capture the disulfide bridges of the native protein, andthe free cysteine in the catalytic core of the nitrilase domain (C196) is actually responsiblefor the photoprotein activity.

Francis et al. (2017), PeerJ, DOI 10.7717/peerj.3633 8/15

Page 9: Symplectin evolved from multiple duplications in bioluminescent … · 2017-07-31 · Pterygioteuthis hoylei Photophore Yes 6.8 93,201SRR5527418This study Dosidicus gigas Photophore

DISCUSSIONCatalytic structure predictionsAssuming the overall structure is similar between vanin-1 and symplectin, the conservedcatalytic triad is located in the nitrilase domain of symplectin (E60, K163, C196), whilethe cysteine for the thioether linkage is found in the base domain. Therefore, catalyticbinding pocket of nitrilase domain is not responsible for the bioluminescence activity ofsymplectin, which is instead likely carried out by residues in close spatial proximity to C390.Although two other luciferases have solved crystal structures (Loening, Fenn & Gambhir,2007; Tomabechi et al., 2016), these were unbound forms so mechanistic generalizationscannot be made.

Nonetheless, three phenylalanines (F316, F321, and F323) and one tyrosine (Y359)are predicted to be close to C390, and may have a role in coordinating the binding ofcoelenterazine, although only the tyrosine is well-conserved outside of symplectin. Besidesthe non-polar residues, several other residues predicted to be in proximity to C390 mayhave a role in the catalytic activity, including D314, K325, and E357. All three are conservedin one protein each from P. hoylei and W. scintillans, which are the two closest proteins tosymplectin in the tree. However, given that the overall conservation is low, even proteinsevolutionarily close to symplectin may not have any luminescence activity.

It was also noted that the native vanin-1 structure is a homodimer (Boersma et al., 2014).Head-to-tail arrangement of the two domains within a homodimer would allow anotheralternative, where the nitrilase domain of one monomer catalyzes the oxidation of dhCtzbound to the other monomer. In this case, there may be no need for any residues directlysurrounding C390 to have a role in the enzymatic activity.

Evolution of the protein familiesInterpreting the phylogenetic tree and determining the ancestral number of genes ischallenging given the copy number in some animal lineages relative to the tree position. Formost cases, such as the majority of arthropods, lineage-specific duplications have createdmultiple copies, the most being eight homologs in the spider Parasteatoda tepidariorum.The same is seen for the two polychaetes and other non-cephalopod molluscs. However,cnidarians (only anemones and corals represented) and chordates have two separateprotein groups. For chordates, pantetheinase and biotinidase each have their own group,and although the copy number relative to the cephalochordate outgroup Branchiostomafloridae suggests a duplication specific to vertebrates, the tree does not indicate thisarrangement with strong support. For cnidarians, both groups have unknown functionsand are not monophyletic. Clearly a duplication must have occurred to give rise to thetwo groups, although the copy number in other groups (such as Porifera or Placozoa) isinconsistent with a duplication in the common ancestor of all animals. Two scenarios maybe considered. The two cnidarian groups resulted from a duplication specific to cnidarianseven though the tree (Fig. 2) does not indicate a single origin, suggesting that this proteinfamily is poorly resolved by existing phylogenetics programs. The samewould therefore alsobe true for the two vertebrate proteins. Alternatively, multiple bilaterian phyla (molluscs,

Francis et al. (2017), PeerJ, DOI 10.7717/peerj.3633 9/15

Page 10: Symplectin evolved from multiple duplications in bioluminescent … · 2017-07-31 · Pterygioteuthis hoylei Photophore Yes 6.8 93,201SRR5527418This study Dosidicus gigas Photophore

annelids, arthropods, echinoderms) must have lost at least one of the ancient copies thatwas otherwise retained by chordates.

Given the ubitquity of both biotin and pantothenic acid, metabolism of one or both ofthese molecules is likely to be the ancestral function. Biotinidase activity was detected inDrosophila melanogaster (Swango & Wolf, 2001), providing evidence that the monophyleticgroup of arthropod proteins are biotinidases, and possibly all bilaterian members of thisfamily. This would therefore suggest several points about the evolution of this proteinfamily. First, one or many of the cephalopod proteins may still have biotinidase activity oract as a hydrolase in other contexts via the nitrilase domain. Since symplectin is thereforepredicted to have two separate functional domains, it is possible that symplectin performsmultiple functions, and may even still have a role in biotin metabolism. Secondly, if theancestral function is a biotinidase, then the pantetheinase activity of vanin-1 and relatedproteins found only in vertebrates therefore is a derived function.

Among the four cephalopod protein groups, homologs of the symplectin group were stillfound in several non-luminous species, Sepia pharaonis, Loligo vulgaris, and Doryteuthispealei. For this reason, it could be that only a very small set of symplectin homologs areluminous because of key amino acid changes in the protein. While we were not able to findall four homologs in the transcriptomes of most species, we found five in W. scintillans,where two were in the symplectin group. Of these, one was highly similar to symplectin,including a symplectin-specific indel of two amino acids found in only one other protein(in P. hoylei). Candidate luciferases were already identified from W. scintillans (Gimenezet al., 2016) belonging to another protein family, so it is unclear what the role would beof the symplectin homologs identified here, or whether they have photoprotein activitywith dhCtz.

Bioluminescence in cephalopodsPhylogenetic analysis of cephalopods suggests a complex pattern of gains and losses ofbioluminescence (Lindgren et al., 2012), where a luminous phenotype appears to have beenacquired five times, and pelagic species were significantly more likely to display autogenicbioluminescence. In addition to the two independent gains of bacterial bioluminescence,there are a total of seven inventions of bioluminescence in this class. As the proteinmechanisms are only known for two species,W. scintillans and S. oualaniensis, is it possiblethat several other luciferases have evolved in this family.

Convergent evolution of bioluminescence is comparably commonplace within otheranimal groups. For instance, cnidarians are likely to have at least five separate evolutionaryevents of bioluminescence, in octocorals, deep-sea anemones (Johnsen et al., 2012),coronate and semaeostome scyphomedusae (Shimomura et al., 2001) and hydromedusae,though the proteins responsible have only been identified in octocorals (Renilla-typeluciferases) and hydromedusae (calcium-activated photoproteins). Thus, even within thesame clade, multiple separate origins of bioluminescence making use of the same luciferinwith different proteins has already happened at least once in metazoans.

For the proteins themselves, three inventions of luciferases have been found in familyof adenylating enzymes: for fireflies, the New Zealand glow worm Arachnocampa luminosa

Francis et al. (2017), PeerJ, DOI 10.7717/peerj.3633 10/15

Page 11: Symplectin evolved from multiple duplications in bioluminescent … · 2017-07-31 · Pterygioteuthis hoylei Photophore Yes 6.8 93,201SRR5527418This study Dosidicus gigas Photophore

(Sharpe et al., 2015), and the squid W. scintillans (Gimenez et al., 2016). This may suggestthat members of this protein family has some propensity for becoming luciferases. Becausemany luciferins are hydrophobic molecules, enzymes that already catalyze reactions onother hydrophobic molecules (such as fatty acids) may have some innate affinity forluciferins, and then mutations that allow binding of oxygen ultimately change the functionof these enzymes into luciferases.

Future directionsThe case of convergent evolution of luciferases from adenylating enzymes was essentiallyunpredictable. For this reason, it is just as plausible to consider that none of the symplectinhomologs here can function as luciferases as it is to consider that all of them function thisway. Cloning and characterizing all symplectin-group homologs in vitro may resolve this,but does not effectively account for the possibility where luciferase activity independentlyevolved twice from the same protein family. Thus, if it were necessary to clone all membersof a protein family to check for luciferase activity, thismay be prohibitively time consuming.For this reason, a more directed approach of proteomic investigations of bioluminescentmaterial from certain species may prove to be more useful, particularly for a species like thevampire squid, which has a secreted bioluminescence (Robison et al., 2003) and would becomparably easy to acquire concentrated luminous material. Such studies typically requirea reference genome or transcriptome, so potentially the transcriptomes presented herecould serve as a reference for these or closely related species. Further investigation of themechanisms of bioluminescence in this animal clade may reveal more general principlesof protein evolution, and how a few amino acid changes may have dramatic effects on thephenotype and biology of an organism.

ADDITIONAL INFORMATION AND DECLARATIONS

FundingThis workwas supported byNIH grantNIGMS-5-R01-GM087198 to S.H.D.H. The fundershad no role in study design, data collection and analysis, decision to publish, or preparationof the manuscript.

Grant DisclosuresThe following grant information was disclosed by the authors:NIH: NIGMS-5-R01-GM087198.

Competing InterestsThe authors declare there are no competing interests.

Author Contributions• Warren R. Francis conceived and designed the experiments, performed the experiments,analyzed the data, contributed reagents/materials/analysis tools, wrote the paper,prepared figures and/or tables.• Lynne M. Christianson performed the experiments.

Francis et al. (2017), PeerJ, DOI 10.7717/peerj.3633 11/15

Page 12: Symplectin evolved from multiple duplications in bioluminescent … · 2017-07-31 · Pterygioteuthis hoylei Photophore Yes 6.8 93,201SRR5527418This study Dosidicus gigas Photophore

• Steven H.D. Haddock conceived and designed the experiments, analyzed the data, wrotethe paper, prepared figures and/or tables.

Field Study PermissionsThe following information was supplied relating to field study approvals (i.e., approvingbody and any reference numbers):

Operations were conducted under permit SC-4029 issued to SHD Haddock by theCalifornia Department of Fish and Wildlife.

DNA DepositionThe following information was supplied regarding the deposition of DNA sequences:

Reads for all five transcriptomes are at NCBI SRA with accessions SRR5527414–SRR5527418.

Data AvailabilityThe following information was supplied regarding data availability:

All assemblies, alignments, and intermediate files are deposited at BitBucket:https://bitbucket.org/wrf/squid-transcriptomes.

Supplemental InformationSupplemental information for this article can be found online at http://dx.doi.org/10.7717/peerj.3633#supplemental-information.

REFERENCESAlbertin CB, Simakov O, Mitros T,Wang ZY, Pungor JR, Edsingergonzales E, Brenner

S, Ragsdale CW, Rokhsar DS. 2015. The octopus genome and the evolution ofcephalopod neural and morphological novelties. Nature 524(7564):220–224.

Alon S, Garrett SC, Levanon EY, Olson S, Graveley BR, Rosenthal JJC, Eisenberg E.2015. The majority of transcripts in the squid nervous system are extensively recodedby A-to-I RNA editing. eLife 4:1–17 DOI 10.7554/eLife.05198.

Alva V, Nam S-Z, Söding J, Lupas AN. 2016. The MPI bioinformatics toolkit as anintegrative platform for advanced protein sequence and structure analysis. NucleicAcids Research 44(W1):W410–W415 DOI 10.1093/nar/gkw348.

Boersma YL, Newman J, Adams TE, Cowieson N, Krippner G, Bozaoglu K,Peat TS. 2014. The structure of vanin 1: a key enzyme linking metabolic dis-ease and inflammation. Acta Crystallographica Section D 70(12):3320–3329DOI 10.1107/S1399004714022767.

Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, MaddenTL. 2009. BLAST+: architecture and applications. BMC Bioinformatics 10:421DOI 10.1186/1471-2105-10-421.

FrancisWR, Christianson LM, Kiko R, Powers ML, Shaner NC, Haddock SHD.2013. A comparison across non-model animals suggests an optimal sequenc-ing depth for de novo transcriptome assembly. BMC Genomics 14(1):167DOI 10.1186/1471-2164-14-167.

Francis et al. (2017), PeerJ, DOI 10.7717/peerj.3633 12/15

Page 13: Symplectin evolved from multiple duplications in bioluminescent … · 2017-07-31 · Pterygioteuthis hoylei Photophore Yes 6.8 93,201SRR5527418This study Dosidicus gigas Photophore

Fujii T, Ahn JY, Kuse M, Mori H, Matsuda T, IsobeM. 2002. A novel photoproteinfrom oceanic squid (Symplectoteuthis oualaniensis) with sequence similarity tomammalian carbon-nitrogen hydrolase domains. Biochemical and BiophysicalResearch Communications 293(2):874–879 DOI 10.1016/S0006-291X(02)00296-6.

Gimenez G, Metcalf P, Paterson NG, SharpeML. 2016.Mass spectrometry analysisand transcriptome sequencing reveal glowing squid crystal proteins are in the samesuperfamily as firefly luciferase. Scientific Reports 6:27638 DOI 10.1038/srep27638.

Grabherr MG, Haas BJ, Yassour M, Levin JZ, Thompson DA, Amit I, Adiconis X, FanL, Raychowdhury R, Zeng Q, Chen Z, Mauceli E, Hacohen N, Gnirke A, Rhind N,Di Palma F, Birren BW, NusbaumC, Lindblad-Toh K, Friedman N, Regev A. 2011.Full-length transcriptome assembly from RNA-Seq data without a reference genome.Nature Biotechnology 29(7):644–652 DOI 10.1038/nbt.1883.

Haddock SH, Moline MA, Case JF. 2010. Bioluminescence in the sea. Annual Review ofMarine Science 2(1):443–493 DOI 10.1146/annurev-marine-120308-081028.

Head JF, Inouye S, Teranishi K, Shimomura O. 2000. The crystal structure ofthe photoprotein aequorin at 2.3 A resolution. Nature 405(6784):372–376DOI 10.1038/35012659.

HuntME, Modi CK, Aglyamova GV, Ravikant DVS, Meyer E, Matz MV. 2012.Multi-domain GFP-like proteins from two species of marine hydrozoans. Photochemical &Photobiological Sciences 11(4):637–644 DOI 10.1039/c1pp05238a.

Inoue S, Sugiura S, Kakoi H, Hasizume K, Goto T, Iio H. 1975. Squid bioluminescenceII. Isolation fromWatasenia scintillans and synthesis of 2-(p-hydroxybenzyl)-6-(p-hydroxyphenyl)-3, 7-dihydroimidazo [1, 2-a] pyrazin-3-one. Chemistry Letters4(2):141–144 DOI 10.1246/cl.1975.141.

IsobeM, Kuse M, Tani N, Fujii T, Sobe BMI, Use MK, Ani ÃNT. 2008. Cysteine-390is the binding site of luminous substance with symplectin, a photoprotein fromOkinawan squid, Symplectoteuthis oualaniensis. Proceedings of the Japan 84:386–392DOI 10.2183/pjab/84.386.

Johnsen S, Frank TM, Haddock SHD,Widder EA, Messing CG. 2012. Light and visionin the deep-sea benthos: I. Bioluminescence at 500–1,000 m depth in the BahamianIslands. Journal of Experimental Biology 215(19):3335–3343 DOI 10.1242/jeb.072009.

Jung H, Pena-Francesch A, Saadat A, Sebastian A, KimDH, Hamilton RF, AlbertI, Allen BD, Demirel MC. 2016.Molecular tandem repeat strategy for eluci-dating mechanical properties of high-strength proteins. Proceedings of the Na-tional Academy of Sciences of the United States of America 113(23):6478–6483DOI 10.1073/pnas.1521645113.

Katoh K, Standley DM. 2013.MAFFT multiple sequence alignment software version7: improvements in performance and usability.Molecular Biology and Evolution30(4):772–780 DOI 10.1093/molbev/mst010.

Kongjinda V, Nakashima Y, Tani N, Kuse M, Nishikawa T, Yu CH, Harada N, IsobeM.2011. Dynamic chirality determines critical roles for bioluminescence in symplectin-dehydrocoelenterazine system. Chemistry-An Asian Journal 6(8):2080–2091DOI 10.1002/asia.201100089.

Francis et al. (2017), PeerJ, DOI 10.7717/peerj.3633 13/15

Page 14: Symplectin evolved from multiple duplications in bioluminescent … · 2017-07-31 · Pterygioteuthis hoylei Photophore Yes 6.8 93,201SRR5527418This study Dosidicus gigas Photophore

Lindgren AR, PankeyMS, Hochberg FG, Oakley TH. 2012. A multi-gene phylogenyof Cephalopoda supports convergent morphological evolution in association withmultiple habitat shifts in the marine environment. BMC Evolutionary Biology12(1):129 DOI 10.1186/1471-2148-12-129.

Loening AM, Fenn TD, Gambhir SS. 2007. Crystal structures of the luciferase andgreen fluorescent protein from Renilla reniformis. Journal of Molecular Biology374(4):1017–1028 DOI 10.1016/j.jmb.2007.09.078.

PankeyMS, Minin VN, Imholte GC, SuchardMA, Oakley TH. 2014. Predictabletranscriptome evolution in the convergent and complex bioluminescent organs ofsquid. Proceedings of the National Academy of Sciences of the United States of America111(44):E4736–E4742 DOI 10.1073/pnas.1416574111.

Powers ML, McDermott AG, Shaner NC, Haddock SHD. 2013. Expression andcharacterization of the calcium-activated photoprotein from the ctenophore Batho-cyroefosteri: insights into light-sensitive photoproteins. Biochemical and BiophysicalResearch Communications 431(2):360–366 DOI 10.1016/j.bbrc.2012.12.026.

Price MN, Dehal PS, Arkin AP. 2010. FastTree 2–approximately maximum-likelihoodtrees for large alignments. PLOS ONE 5(3):e9490 DOI 10.1371/journal.pone.0009490.

Robison BH, Reisenbichler KR, Hunt JC, Haddock SHD. 2003. Light production bythe arm tips of the deep-sea cephalopod Vampyroteuthis infernalis. The BiologicalBulletin 205(2):102–109 DOI 10.2307/1543231.

SharpeML, Dearden PK, Gimenez G, Krause KL. 2015. Comparative RNA seq analysisof the New Zealand glowworm Arachnocampa luminosa reveals bioluminescence-related genes. BMC Genomics 16:825 DOI 10.1186/s12864-015-2006-2.

Shimomura O, Flood PR, Inouye S, Bryan B, Shimomura A. 2001. Isolation andproperties of the luciferase stored in the ovary of the scyphozoan medusa Periphyllaperiphylla. Biological Bulletin 201(3):339–347 DOI 10.2307/1543612.

Stamatakis A. 2014. RAxML version 8: a tool for phylogenetic analysis and post-analysisof large phylogenies. Bioinformatics 30(9):1312–1313DOI 10.1093/bioinformatics/btu033.

Swango KL,Wolf B. 2001. Conservation of biotindase in mammals and identificationof the putative biotinidase gene in drosophila melanogaster.Molecular Genetics andMetabolism 74(4):492–499 DOI 10.1006/mgme.2001.3244.

Takahashi H, IsobeM. 1994. Photoprotein of luminous squid, symplectoteuthisoualaniensis and reconstruction of the luminous system. Chemistry Letters23(5):843–846 DOI 10.1246/cl.1994.843.

Tanaka E, Kuse M, Nishikawa T. 2009. Dehydrocoelenterazine is the organic sub-stance constituting the prosthetic group of pholasin. ChemBioChem 10(Scheme2):2725–2729 DOI 10.1002/cbic.200900503.

Tomabechi Y, Hosoya T, Ehara H, Sekine SI, ShirouzuM, Inouye S. 2016. Crystalstructure of nanoKAZ: the mutated 19 kDa component of Oplophorus luciferasecatalyzing the bioluminescent reaction with coelenterazine. Biochemical andBiophysical Research Communications 470(1):88–93 DOI 10.1016/j.bbrc.2015.12.123.

Francis et al. (2017), PeerJ, DOI 10.7717/peerj.3633 14/15

Page 15: Symplectin evolved from multiple duplications in bioluminescent … · 2017-07-31 · Pterygioteuthis hoylei Photophore Yes 6.8 93,201SRR5527418This study Dosidicus gigas Photophore

Wen J, Zhong H, Xiao J, Zhou Y, Chen Z, Zeng L, Chen D, Sun Y, Zhao J, Wang F. 2016.A transcriptome resource for pharaoh cuttlefish (Sepia pharaonis) after ink ejectionby brief pressing.Marine Genomics 28:53–56 DOI 10.1016/j.margen.2016.05.005.

Zhang X, Mao Y, Huang Z, QuM, Chen J, Ding S, Hong J, Sun T. 2012. Transcriptomeanalysis of the Octopus vulgaris central nervous system. PLOS ONE 7(6):1–11DOI 10.1371/journal.pone.0040320.

Francis et al. (2017), PeerJ, DOI 10.7717/peerj.3633 15/15