17
Genes within Genes: Multiple LAGLIDADG Homing Endonucleases Target the Ribosomal Protein S3 Gene Encoded within an rnl Group I Intron of Ophiostoma and Related Taxa J. Sethuraman,* A. Majer,* N. C. Friedrich, D. R. Edgell, and G. Hausner* *Department of Microbiology, University of Manitoba, Winnipeg, MB, Canada; and  Department of Biochemistry, Schulich School of Medicine and Dentistry, University of Western Ontario, London, ON, Canada In some ascomycete fungi, ribosomal protein S3 (Rps3) is encoded within a group I intron (mL2449) that is inserted in the U11 region of the mitochondrial large subunit rDNA (rnl) gene. Previous characterization of the mL2449 intron in strains of Ophiostoma novo-ulmi subspecies americana (Dutch Elm Disease) revealed a complex genes-within-genes arrangement whereby a LAGLIDADG homing endonuclease gene (HEG) is inserted into the RPS3 gene near the 3# terminus, creating a hybrid Rps3–LAGLIDADG fusion protein. Here, we examined 119 additional strains of Ophiostoma and related taxa representing 85 different species by a polymerase chain reaction– based survey and detected both short (;1.6 kb) and long (.2.2 kb) versions of the mL2449 intron in 88 and 31 strains, respectively. Among the long versions encountered, 21 were sequenced, revealing the presence of either intact or degenerated HEG-coding regions inserted within the RPS3 gene. Surprisingly, we identified two new HEG insertion sites in RPS3; one near the original C-terminal insertion site and one near the N-terminus of RPS3. In all instances, the HEGs are fused in-frame with the RPS3-coding sequences to create fusion proteins. However, comparative sequence analysis showed that upon insertion, the HEGs displaced a portion of the RPS3-coding region. Remarkably, the displaced RPS3-coding segments are duplicated and fused in-frame to the 3# end of RPS3, restoring a full-length RPS3 gene. We cloned and expressed the LAGLIDADG portion of two Rps3–HEG fusions, and showed that I-OnuI and I-LtrI generate 4 nucleotide (nt), 3# overhangs, and cleave at or 1 nt upstream of the HEG insertion site, respectively. Collectively, our data indicate that RPS3 genes are a refuge for distinct types of LAGLIDADG HEGs that are defined by the presence of duplicated segments of the host gene that restore the RPS3 gene, thus minimizing the impact of the HEG insertion on Rps3 function. Introduction Homing endonuclease genes (HEGs) code for rare cut- ting DNA endonucleases. HEGs are encoded within group I or group II introns, as in-frame fusions with inteins, or as free-standing open reading frames (ORFs, Gimble 2000; Belfort et al. 2002; Toor and Zimmerly 2002). The associ- ation of HEGs with self-splicing RNA or protein elements is thought to be a mutualistic relationship, where the self- splicing elements provide the HEGs with a phenotypically neutral insertion site to minimize damage to the host ge- nome, while the homing endonuclease (HEase) promotes mobility of the self-splicing element to related genomes (Belfort and Perlman 1995; Lambowitz et al. 1999; Scha ¨fer 2003). In contrast, free-standing HEGs are usually found inserted in intergenic regions between genes, thus minimiz- ing their impact on the host genome. Regardless of their insertion site, HEGs function as mobile elements by intro- ducing a double-strand break (DSB), or nick, in genomes that lack the endonuclease coding sequence. The homing process is completed by host DSB-repair (DSBR) pathways that use the HEG-containing allele as a template to repair the DSB (Dujon 1989; Dujon and Belcour 1989; Belfort et al. 2002; Haugen et al. 2005; Stoddard 2005). The repair results in the nonreciprocal transfer of the HEG into the HEG-minus allele, and is usually associated with coconver- sion of markers flanking the HEG insertion site (Belfort et al. 2002). In the case of HEGs encoded within introns or inteins, coconversion of flanking markers ensures that the self-splicing element is inherited during the mobility process. Studies on some intron-encoded HEGs demon- strated that HEGs are mobile elements that can move inde- pendent of their ‘‘host’’ self-splicing introns (Mota and Collins 1988; Sellem et al. 1996; Sellem and Belcour 1997; Haugen et al. 2004; Gibb and Hausner 2005). These observations support the concept that HEGs have the poten- tial to invade different genomic locations, or self-splicing introns and inteins, to create composite mobile genetic elements (Belfort and Roberts 1997; Belfort 2003). There are four families of HEase proteins (Chevalier and Stoddard 2001) designated by the presence of con- served amino acid sequence motifs: the GIY-YIG, His- Cys box, HNH, and LAGLIDADG families (Jurica and Stoddard 1999; Guhan and Muniyappa 2003). Recently, a fifth family has been recognized, an HEase encoded within a group I intron that interrupts cyanobacterial tRNA genes and that is similar to PD/E.X.K type restriction en- zymes (Bonocora and Shub 2001; Zhao et al. 2007). The LAGLIDADG endonucleases are the largest known family and are encountered in some bacteria and bacteriophages, and in organellar genomes of protozoans, fungi, plants, and sometimes in early branching Metazoans (Stoddard 2005). LAGLIDADG endonucleases possess one or two of the conserved LAGLIDADG amino acid sequence motifs, with the single-motif forms functioning as homodimers and the double-motif forms functioning as monomers (Chevalier and Stoddard 2001). The double-motif types are thought to have evolved by gene duplication of an ancestral single-motif HEG followed by a fusion event (Lambowitz et al. 1999; Haugen and Bhattacharya 2004). Although LAGLIDADG endonucleases commonly function to pro- mote mobility, they can also function as maturases to facil- itate splicing of their respective host introns (Caprara and Waring 2005). Fungal mitochondrial genomes appear to be in con- stant flux, in part due to the activities of group I and group Key words: LAGLIDADG homing endonuclease, HEG transduction/ exon shuffling, RPS3, evolution, fungi, mitochondrial genomes. E-mail: [email protected]. Mol. Biol. Evol. 26(10):2299–2315. 2009 doi:10.1093/molbev/msp145 Advance Access publication July 13, 2009 Ó The Author 2009. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: [email protected] Downloaded from https://academic.oup.com/mbe/article-abstract/26/10/2299/1108791 by guest on 02 April 2018

Genes-within-genes: Multiple LAGLIDADG homing endonucleases

  • Upload
    haque

  • View
    221

  • Download
    2

Embed Size (px)

Citation preview

Page 1: Genes-within-genes: Multiple LAGLIDADG homing endonucleases

Genes within Genes: Multiple LAGLIDADG Homing Endonucleases Target theRibosomal Protein S3 Gene Encoded within an rnl Group I Intron ofOphiostoma and Related Taxa

J. Sethuraman,* A. Majer,* N. C. Friedrich,� D. R. Edgell,� and G. Hausner**Department of Microbiology, University of Manitoba, Winnipeg, MB, Canada; and �Department of Biochemistry, Schulich Schoolof Medicine and Dentistry, University of Western Ontario, London, ON, Canada

In some ascomycete fungi, ribosomal protein S3 (Rps3) is encoded within a group I intron (mL2449) that is inserted inthe U11 region of the mitochondrial large subunit rDNA (rnl) gene. Previous characterization of the mL2449 intron instrains of Ophiostoma novo-ulmi subspecies americana (Dutch Elm Disease) revealed a complex genes-within-genesarrangement whereby a LAGLIDADG homing endonuclease gene (HEG) is inserted into the RPS3 gene near the 3#terminus, creating a hybrid Rps3–LAGLIDADG fusion protein. Here, we examined 119 additional strains of Ophiostomaand related taxa representing 85 different species by a polymerase chain reaction– based survey and detected both short(;1.6 kb) and long (.2.2 kb) versions of the mL2449 intron in 88 and 31 strains, respectively. Among the long versionsencountered, 21 were sequenced, revealing the presence of either intact or degenerated HEG-coding regions insertedwithin the RPS3 gene. Surprisingly, we identified two new HEG insertion sites in RPS3; one near the original C-terminalinsertion site and one near the N-terminus of RPS3. In all instances, the HEGs are fused in-frame with the RPS3-codingsequences to create fusion proteins. However, comparative sequence analysis showed that upon insertion, the HEGsdisplaced a portion of the RPS3-coding region. Remarkably, the displaced RPS3-coding segments are duplicated andfused in-frame to the 3# end of RPS3, restoring a full-length RPS3 gene. We cloned and expressed the LAGLIDADGportion of two Rps3–HEG fusions, and showed that I-OnuI and I-LtrI generate 4 nucleotide (nt), 3# overhangs, andcleave at or 1 nt upstream of the HEG insertion site, respectively. Collectively, our data indicate that RPS3 genes area refuge for distinct types of LAGLIDADG HEGs that are defined by the presence of duplicated segments of the hostgene that restore the RPS3 gene, thus minimizing the impact of the HEG insertion on Rps3 function.

Introduction

Homing endonuclease genes (HEGs) code for rare cut-ting DNA endonucleases. HEGs are encoded within group Ior group II introns, as in-frame fusions with inteins, or asfree-standing open reading frames (ORFs, Gimble 2000;Belfort et al. 2002; Toor and Zimmerly 2002). The associ-ation of HEGs with self-splicing RNA or protein elementsis thought to be a mutualistic relationship, where the self-splicing elements provide the HEGs with a phenotypicallyneutral insertion site to minimize damage to the host ge-nome, while the homing endonuclease (HEase) promotesmobility of the self-splicing element to related genomes(Belfort and Perlman 1995; Lambowitz et al. 1999; Schafer2003). In contrast, free-standing HEGs are usually foundinserted in intergenic regions between genes, thus minimiz-ing their impact on the host genome. Regardless of theirinsertion site, HEGs function as mobile elements by intro-ducing a double-strand break (DSB), or nick, in genomesthat lack the endonuclease coding sequence. The homingprocess is completed by host DSB-repair (DSBR) pathwaysthat use the HEG-containing allele as a template to repairthe DSB (Dujon 1989; Dujon and Belcour 1989; Belfortet al. 2002; Haugen et al. 2005; Stoddard 2005). The repairresults in the nonreciprocal transfer of the HEG into theHEG-minus allele, and is usually associated with coconver-sion of markers flanking the HEG insertion site (Belfortet al. 2002). In the case of HEGs encoded within intronsor inteins, coconversion of flanking markers ensures thatthe self-splicing element is inherited during the mobility

process. Studies on some intron-encoded HEGs demon-strated that HEGs are mobile elements that can move inde-pendent of their ‘‘host’’ self-splicing introns (Mota andCollins 1988; Sellem et al. 1996; Sellem and Belcour1997; Haugen et al. 2004; Gibb and Hausner 2005). Theseobservations support the concept that HEGs have the poten-tial to invade different genomic locations, or self-splicingintrons and inteins, to create composite mobile geneticelements (Belfort and Roberts 1997; Belfort 2003).

There are four families of HEase proteins (Chevalierand Stoddard 2001) designated by the presence of con-served amino acid sequence motifs: the GIY-YIG, His-Cys box, HNH, and LAGLIDADG families (Jurica andStoddard 1999; Guhan and Muniyappa 2003). Recently,a fifth family has been recognized, an HEase encodedwithin a group I intron that interrupts cyanobacterial tRNAgenes and that is similar to PD/E.X.K type restriction en-zymes (Bonocora and Shub 2001; Zhao et al. 2007). TheLAGLIDADG endonucleases are the largest known familyand are encountered in some bacteria and bacteriophages,and in organellar genomes of protozoans, fungi, plants, andsometimes in early branching Metazoans (Stoddard 2005).LAGLIDADG endonucleases possess one or two of theconserved LAGLIDADG amino acid sequence motifs, withthe single-motif forms functioning as homodimers and thedouble-motif forms functioning as monomers (Chevalierand Stoddard 2001). The double-motif types are thoughtto have evolved by gene duplication of an ancestralsingle-motif HEG followed by a fusion event (Lambowitzet al. 1999; Haugen and Bhattacharya 2004). AlthoughLAGLIDADG endonucleases commonly function to pro-mote mobility, they can also function as maturases to facil-itate splicing of their respective host introns (Caprara andWaring 2005).

Fungal mitochondrial genomes appear to be in con-stant flux, in part due to the activities of group I and group

Key words: LAGLIDADG homing endonuclease, HEG transduction/exon shuffling, RPS3, evolution, fungi, mitochondrial genomes.

E-mail: [email protected].

Mol. Biol. Evol. 26(10):2299–2315. 2009doi:10.1093/molbev/msp145Advance Access publication July 13, 2009

� The Author 2009. Published by Oxford University Press on behalf ofthe Society for Molecular Biology and Evolution. All rights reserved.For permissions, please e-mail: [email protected]

Downloaded from https://academic.oup.com/mbe/article-abstract/26/10/2299/1108791by gueston 02 April 2018

Page 2: Genes-within-genes: Multiple LAGLIDADG homing endonucleases

II introns (Clark-Walker 1992; Belcour et al. 1997; Salvoet al. 1998; Gobbi et al. 2003; Hausner 2003; Schafer 2003;Gibb and Hausner 2005). Mitochondrial introns and theirORFs have been associated with mitochondrial instabilitiesin Saccharomyces cereviseae, Podospora anserina,Neurospora crassa, and Ophiostoma ulmi (Dujon 1989;Cummings et al. 1990; Lambowitz and Perlman 1990;Abu-Amero et al. 1995), and mitochondrial DNA (mtDNA)introns are sometimes components of plasmid-like elements(Cummings et al. 1986, 1990; Dujon 1989; Abu-Ameroet al. 1995; Sethuraman et al. 2008). One interesting poly-morphism within ascomycetes mitochondria is a pervasivegroup I intron that is inserted within the U11 region of themtDNA rnl (large subunit rDNA) gene at a position thatcorresponds to Escherichia coli LSU position 2449 (Johansenand Haugen 2001). Encoded within the rnl U11 intron (alsoreferred to as the mL2449 intron) is an ORF for the putativeribosomal protein S3 (Burke and RajBhandary 1982; re-viewed in Gibb and Hausner 2005). The ribosomal proteinwas originally referred to as S5, but more recently this ORFand the free-standing mtDNA var1 gene were recognized asRPS3 homologs (Bullerwell et al. 2000). In spite of beingencoded within an intron, the Rps3 protein appears to befunctional and in N. crassa is incorporated into mitochon-drial ribosomes (LaPolla and Lambowitz 1981). By encod-ing an essential host gene such as RPS3, the optional groupI intron ensures its persistence in the mitochondrial ge-nome, because deletion of the intron would also deletethe RPS3 gene, presumably a lethal event. Interestingly,previous studies have shown that RPS3 itself might be a tar-get of HEGs; in strains of Cryphonectria parasitica andOphiostoma novo-ulmi subspecies americana, a complexORF is embedded within the mL2449 intron, where adouble-motif LAGLIDADG-type coding region is fusedin-frame to the C-terminus of the RPS3-coding region(Hausner et al. 1999; Gibb and Hausner 2005; Sethuramanet al. 2008). This arrangement suggests a recent insertion ofa mobile LAGLIDADG HEG into some RPS3 alleles, asother filamentous ascomycetes lack a HEG/maturase withinthis intron (Gibb and Hausner 2005).

Here, we surveyed species of Ophiostoma and relatedtaxa for the presence of HEGs within the RPS3 gene in or-der to gain a better understanding on the distribution andpersistence of LAGLIDADG HEGs within this mtDNAlocus. This study focussed on species of Ophiostoma, fungithat frequently are associates of bark beetles and are of eco-nomic importance as many are causal agents of blue stain insapwood, and some species are potential plant pathogens(Wingfield et al. 1993). We find that LAGLIDADG HEGsare widespread in Ophiostoma and related taxa and definethree distinct insertion sites of the HEG within the RPS3gene; one similar to the previously described insertion siteand two novel sites. Multiple HEG insertions were alsonoted within the same RPS3-coding region. We biochemi-cally characterized two HEGs: I-OnuI (from O. novo-ulmisubspecies americana), showing that it cleaves RPS3 al-leles that lack the HEG precisely at the site of insertionand I-LtrI (from Leptographium truncatum), which cleavesan HEG-minus RPS3 allele 1 bp downstream from its in-sertion site. Our data are consistent with the hypothesis thatthe LAGLIDADG HEGs are mobile endonucleases, pro-

moting their spread between RPS3 alleles, and are not in-volved in the mobility of the mL2449 group I intron.Moreover, our data raise intriguing questions as to thefunction of the Rps3–HEG fusions and suggest that HEGmobility promotes the transduction of flanking sequence.

Materials and MethodsSource and Maintenance of Fungal Cultures and DNAExtraction Protocols

Strains used in this study were from previous rDNAphylogenetic studies (Hausner et al. 1993, 2000; Hausnerand Reid 2003). The sources for all strains used in this studyare listed in supplementary table 1S, Supplementary Mate-rial online. All strains were cultured in petri dishes contain-ing 2% malt extract agar (20 g malt extract [Difco,Michigan] supplemented with 1-g yeast extract [YE; Gibco,Paisly, United Kingdom] and 20-g bacteriological agar[Gibco] per liter). From these cultures, agar plugs were re-moved and used to inoculate 125-ml flasks containing 50 mlof PYG liquid medium (1 g peptone, 1 g YE, and 3 g glucoseper liter) to generate biomass for DNA or RNA extraction(Hausner et al. 1992). The liquid cultures were still grown at20 �C for up to 5 days and then harvested onto Whatman #1filter paper via vacuum filtration. The harvested myceliumwas homogenized by vortexing in the presence of 4 ml (vol-ume) of small glass beads (equal ratio of 0.5- and 3-mmbeads) in 6 ml of extraction buffer (10 mM Tris–HClpH7.6, 1 mM ethylenediaminetetraacetic acid [EDTA],50 mM NaCl, 1% hexadecyl trimethyl ammonium bromide,and 0.5% sodium dodecyl sulfate [SDS]) and then incu-bated at 60 �C for 2 h. The lysate was mixed with an equalvolume of chloroform and centrifuged at 2,000� g. About 5ml of aqueous layer was recovered and mixed with 12 ml ofice cold 95% ethanol. The precipitated DNA was centri-fuged for 30 min at 3,000 � g, and the resulting pellet re-suspended in 400 ll Tris-EDTA buffer (Tris–HCl, 1.0 mMEDTA, pH 7.6).

Polymerase Chain Reaction (PCR) Amplification,Cloning of PCR Products, and DNA Sequencing

A PCR-based survey utilizing primers IP1 and IP2 (Bellet al. 1996) was conducted in order to examine the mt-rnlU11intron in members ofOphiostomaand related taxa for the pres-ence of potential HEG insertions. Between 50 and 100 ng ofwhole-cell DNA served as a template for PCR reactions. Taqpolymerase, buffers, and deoxyribonucleotide triphosphateswere obtained from Invitrogen (Life Technologies, Burling-ton,ON)andusedaccordingtothemanufacturer’srecommen-dations. Typically, PCR conditions were as follows: an initialdenaturationstep of94 �C for 3 min was followed by25 cyclesof denaturing (93 �C for 1 min), annealing (52.9 �C for 1 min30 s) and extension (70 �C for 4 min 30 s) followed by coolingthe reactions to 4 �C. PCR fragments were separated by gelelectrophoresis through a 1% agarose gel in Tris-borate-EDTA buffer (89 mM Tris–borate buffer with 10 mM EDTAat pH 8.0). DNA fragments were sized using the 1-kb-plusDNA ladder (Invitrogen) and the DNA fragments were visu-alized by staining with ethidium bromide (0.5 lg/ml).

2300 Sethuraman et al.

Downloaded from https://academic.oup.com/mbe/article-abstract/26/10/2299/1108791by gueston 02 April 2018

Page 3: Genes-within-genes: Multiple LAGLIDADG homing endonucleases

PCR products were used directly as templates forDNA sequence analysis or products cloned using the TopoTA cloning kit (Invitrogen). The PCR products were puri-fied with the Wizard SV Gel and PCR clean-up system(Promega), and plasmid DNA was purified using theWizard Plus Minipreps DNA purification system (Promega).The sequencing reactions were performed at the UniversityofCalgaryCoreDNAservicesfacility(Calgary,AB).Table1lists the strains that were examined by DNA sequenceanalysis and also provides the GenBank accession for se-quences obtained in this study. Initially, sequencing em-ployed the IP1 and IP2 primers, or when appropriate forcloned PCR products, the M13 forward and reverse primerswere used; thereafter, nested primers were designed asneeded. DNA sequences were obtained for both strands.Oligonucleotides used in this study were synthesized byAlpha DNA (Montreal, Que, Canada).

Reverse Transcriptase-PCR (RT-PCR) Analysis for thernl-U11 Segment

RNA was isolated from strain O. novo–ulmi subsp.americana WIN(M) 900 using the RNeasy kit for totalRNA isolation (Qiagen Sciences, MD) with some modifi-cations. Initially, the mycelium was ground in liquid nitro-gen. However, once the cell walls were broken, the RNAwas extracted and purified following the yeast protocolof the RNeasy kit. The RNA was treated with DNase(Ambion) following the manufacturer’s recommendation,and 1 lg of RNA was used as template for RT-PCR usingthe ThermoScript RT-PCR system (Invitrogen) accordingto manufacture’s recommendations. First-strand synthesiswas carried out with primer IP2 at a final concentrationof 10 lM and subsequent PCR amplification was carriedout with primers Lsex-2R (CCTTGGCCGTTAAAT-GCGGTC) and IP2 (10 lM concentration). The PCR prod-ucts generated by the RT-PCR reaction were cloned into theTopo TA cloning kit (Invitrogen) and sequenced with pri-mers Lsex2-R-RT (TAGACGAGAAGACCCTATGCAG)and IP2 (CTTGCGCAAATTAGC) (Bell et al. 1996).

Sequence and Phylogenetic Analysis

The individual sequences were assembled manuallyinto contigs using the GeneDoc program v2.5.010 (Nicholaset al. 1997). The ORF Finder program (http://www.ncbi.nlm.nih.gov/gorf/gorf.html) was used (setting: geneticcode for mtDNA of molds) to search for potential ORFswithin the rnl-U11 group I introns. The online resourceBlastP (Altschul et al. 1990) was used to retrieve sequencesthat were related to the putative ORFs obtained from ourstrains (table 1). Sequences were aligned and refined manu-ally with the aid of the GeneDoc program. For phylogeneticanalyses, only those segments of the alignment where all se-quences could be aligned unambiguously were retained.Phylogenetic estimates were generated by the programs con-tained within the PHYLIP package (Felsenstein 1989, 2005)and the MrBayes program v3.1 (Ronquist and Huelsenbeck2003; Ronquist 2004). In PHYLIP, a phylogenetic tree wasobtained by analyzing the alignment with the PROTPARS

(protein parsimony algorithm, version 3.55c) program incombination with bootstrap analysis (SEQBOOT) andCONSENSE to obtain the majority rule consensus tree alongwith an estimate of confidence levels for the major nodeswithin the phylogenetic tree (Felsenstein 1985). Phyloge-netic estimates were also generated within PHYLIP usingthe NEIGHBOR program using distance matrices gener-ated by PROTDIST (setting: Dayhoff PAM250 substitutionmatrix; Dayhoff et al. 1978).

The MrBayes program was used for Bayesian analy-sis. The amino acid substitution model setting for Bayesiananalysis was as follows: mixed models and gamma distri-bution with four gamma rate parameters. The Bayesian in-ference of phylogenies was initiated from a random startingtree and four chains were run simultaneously for 1,000,000generations; trees were sampled every 100 generations. Thefirst 25% of trees generated were discarded (‘‘burn-in’’) andthe remaining trees were used to compute the posteriorprobability values. Phylogenetic trees were drawn withthe TreeView program (Page 1996) using PHYLIP tree out-files or MrBayes tree files and annotated with Corel Draw(Corel Corporation and Corel Corporation Limited).

Expression and Purification of I-OnuI and I-LtrI

For expression of I-OnuI and I-LtrI in E. coli, codon-modified versions of these genes were constructed syntheti-cally, taking into account differences between the fungalmitochondrial andE. coli genetic code (BioS & T, Montreal,Que, Canada). Both the I-OnuI and I-LtrI genes were clonedinto pBlueScript II SKþ, and then subcloned into pTOPO-4(Invitrogen). Subsequently, the I-OnuI and I-LtrI sequenceswere moved into pET200/D-TOPO (Invitrogen) with the N-terminalHis-tag intact togeneratepI-OnuIandpI-LtrI,whichwere subsequently transformed into E. coli strain ER2566(New England Biolabs, NEB) for expression studies.

To express and purify I-OnuI or I-LtrI, a 10-ml E. coliculture containing pI-OnuI or pI-LtrI was grown overnightand diluted 1:100 into 1 l of Luria-Bertani media. The 1 lculture was grown at 37 �C until A600 ; 0.4, shifted to 27�C, and expression induced by adding isopropyl-b-D-thiogalactopyranoside to a final concentration of 1 mM.After additional growth for 2.5 h, cells were harvested bycentrifugation at 5000 rpm for 5 min and the pellet was frozenat �80 �C. For protein purification, the frozen cells werethawed in the presence of protease inhibitor (Roche Diag-nostic) and resuspended in 10 ml of lysis buffer (20 mMTris–HCl, pH 7.9, 500 mM NaCl, 40 mM imidazole and10% glycerol) per 1 gm of wet cell weight. Cells were dis-rupted by homogenization followed by centrifugation at27,200�g for 25 min at 4 �C. The supernatant was sonicatedto facilitate DNA fragmentation, and centrifuged at 20,400�g for 15 min at 4 �C. The supernatant was applied to a HisTrapHP Affinity column (GE Healthcare) that had been chargedwith 0.1 M NiSO4 and equilibrated with binding buffer(20 mM Tris–HCl, pH 7.9, 500 mM NaCl, 40 mM imidazole,and 10% glycerol). Bound proteins were eluted with elutionbuffer (20 mM Tris–HCl pH7.9, 500 mM NaCl, and 10%glycerol) over a linear gradient of imidazole from 0.08 to0.5 M, and 500-ll fractions were collected over 50 ml. Toprevent precipitation, 500 ll of 2 M NaCl and 10 ll of

Homing Endonucleases Targeting the mtDNA RPS3 Gene 2301

Downloaded from https://academic.oup.com/mbe/article-abstract/26/10/2299/1108791by gueston 02 April 2018

Page 4: Genes-within-genes: Multiple LAGLIDADG homing endonucleases

0.5 M EDTA, pH 8.0, were added to peak fractions. The peakfraction was loaded directly onto a Superdex 75 gel-filtrationcolumn (GE Healthcare) equilibrated with lysis buffer with-out immidazole. Fractions were collected in 0.25-ml aliquotsover 25 ml. Peak-containing fractions were pooled and ali-quoted and frozen at �80 �C.

Endonuclease Assays

In vitro cleavage assays were carried out with the I-OnuI protein using a variety of possible substrates: 1)The RPS3–HEG-minus sequence was PCR amplified fromO. novo-ulmi subsp. americana strain WIN(M) 904 (Gibband Hausner 2005) and inserted into a pTOPO-4 (Invitro-gen) vector. This construct (pRPS3) provided the HEG-minus target substrate for cleavage and mapping assays;2) a complete RPS3–HEG fusion was synthetically con-structed (BioS & T) and inserted into pET200/D-TOPO

(Invitrogen) to create pRPS3/HEG. This construct servedas the HEG-containing substrate for cleavage assays; and3) the mt-rnl-U7 region was amplified from Ceratocystispolonica strain WIN(M) 1409 using primers LSEX-1(GCTAGTAGAGAATACGAAGGC) and LSEX-2 (GACCGCATTTAACGGCCAAGG) (Sethuraman et al. 2008) andinserted into the TOPO-4 vector. This construct, pU71409,served as a negative control for the cleavage assay.

Cleavageassays were carriedout by incubating 200ngofplasmid substrate in a total volume of 20 ll containing 1 ll ofI-OnuI (25 ng), 2 ll NEB Buffer #3 (100 mM NaCl, 50 mMTris–HCl, pH 7.9, 10 mM MgCl2, and 1 mM dithiothreitol)and 17 ll of H2O at 37 �C. Aliquots were taken at 5-min in-tervals for 30 min and stopped by the addition of loadingbuffer and stop solution (0.1 M Tris–HCl, pH 7.8, 0.25 M ED-TA, 5% w/v SDS, 0.5 lg/ml proteinase K). Reactions wereanalyzed by agarose gel electrophoresis and fragments werevisualized by staining with ethidium bromide (0.5 lg/ml).

Table 1List of Strains, Presence and Absence of RPS3 HEG Insertions, Category of HEG Insertion, and Genbank AccessionNumbers

Organism Strain NumberPresenceof HEGa

Positionof HEGb Degeneratedc

GenbankAccession

Ceratocystiopsis brevicomi WINd(M) 1452 L C Yese FJ717840Ceratocystis curvicollis (5 Ophiostoma

nigrum sensu Upadhyay 1981)WIN(M) 55 L C Yes FJ717842

Ceratocystiopsis minuta-bicolor WIN(M) 480 S FJ717855Ceratocystiopsis parva WIN(M) 59 S FJ717754Ophiostoma aureum CBSf 438.69 S FJ717847Ophiostoma distortum WIN(M) 847 (5ATCCg 18998) L C Yes FJ717845Ophiostoma europhioides WIN(M) 449 L B Yes FJ717841

WIN(M) 1430 L B Yes FJ717836WIN(M) 1431 L B Yes FJ717839

Ophiostom himal-ulmi CBS 374.67 L C YES FJ717862Ophiostoma ips WIN(M) 923 L C# Yes FJ717857

WIN(M 1487 S FJ717858Ophiostoma laricis WIN(M) 1461 L A/B Yes (A/B) FJ717851Ophiostoma megalobrunneum WIN(M) 509 L C Yes FJ717856Ophiostoma minus WIN(M) 861 L C Yes FJ717860

WIN(M) 888 S FJ717859Ophiostoma nigrum CBS 163.61 S FJ717846Ophiostom novo-ulmi subsp. americana WIN(M) 900 L C No AY275136

WIN(M) 904 S AY275137Ophiostoma penicillatum WIN(M) 27 L C No FJ607136

WIN(M) 136 S FJ607138Ophiostoma piceaperdum WIN(M) 979 L A No FJ717837Ophiostoma pseudoeurophioides WIN(M) 42 S FJ717848Ophiostoma rollhansenianum WIN(M) 113 S FJ717853Ophiostoma tetropii WIN(M) 111 (5NFRIh 80-113/9) L C Yes FJ717843

WIN(M) 451 L C Yes FJ717844Ophiostoma torulosum WIN(M) 730 (5CBS 770.71) L C Yes FJ717861Ophiostoma ulmi WIN(M) 1223 L C No FJ717838Leptographium lundbergii WIN(M) 1250 S FJ717850Leptographium pithyophilum WIN(M) 1454 L B No FJ607137Leptographium truncatum WIN(M) 254 L B No FJ717852

WIN(M) 1434 L B No FJ717849WIN(M) 1435 S FJ717835

Sporothrix sp. WIN(M) 924 L C No FJ717834

a ‘‘S’’ indicates the absence of an HEG insertion whereas ‘‘L’’ suggests the presence of an insertion within the mL2449 encoded RPS3 gene.b Positions based on A, B, and C designations in figure 2.c Presence of frameshift mutations and premature stop codons are viewed as evidence for degeneration.d WIN(M) 5 University of Manitoba (Winnipeg) Collection.e Yes 5 HE ORF is degenerated, No 5 HE ORF appears to be intact.f CBS 5 Centraal Bureau voor Schimmelcultures, Utrecht, The Netherlands.g ATCC 5 American Type Culture Collection, Manassas, VA.h NFRI 5 Norwegian Forest Research Institute, As, Norway.

2302 Sethuraman et al.

Downloaded from https://academic.oup.com/mbe/article-abstract/26/10/2299/1108791by gueston 02 April 2018

Page 5: Genes-within-genes: Multiple LAGLIDADG homing endonucleases

Cleavage Site Mapping for I-OnuI and I-LtrI

In order to determine the cleavage sites for I-OnuI andI-LtrI, PCR products that included the putative cleavage sitelocated near the 3# end of the RPS3-coding sequence wereamplified from pRPS3 with primers end labeled on the non-coding (top) or coding (bottom) strand. The substrate mol-ecule for the I-OnuI assay was a 201-bp product amplifiedby using primers 900FP1 (AAATTAAATTCTAATA-TGC) and IP2 (Bell et al. 1996). Primers were 5#-end la-beled with OptiKinase (USB, Cleveland, OH) accordingto the manufacturer’s protocols using [c-

32

P]ATP. The201-bp amplicons were generated using either 900FP1 orIP2 5#-end-labeled primers; thus, substrates could be gen-erated where either the coding or the noncoding strandswere labeled. The end-labeled PCR products were incu-bated with 1 ll I-OnuI for 10 min at 37 �C in 20-ll reactionmixtures consisting of 5-ll substrate, and 1� NEB Buffer#3. The resulting cleavage products were resolved on a de-naturing 6% polyacrylamide/urea gel (19:1 acrylamide:bis-acrylamide) and electrophoresed alongside the correspondingsequencing ladders obtained from pRPS3 using the end-labeled primers (900FP1 and IP2) (USB Biologicals).

The substrate for the I-LtrI assay was an RPS3 PCRproduct derived from the HEG-minus strain of L. truncatumWIN(M)1435. The cleavage site mapping assay was per-formed as for I-OnuI, but the following primers were usedfor generating the cleavage substrate and correspondingDNA-sequencing ladders: 254synclmap1: AAAGATAAT-AAAGATATTGTAT TTG and IP2.

ResultsThe rnl U11 Intron and a PCR-Based Survey for RPS3HEG Insertions

The rnl-U11 intron was previously characterized froma variety of filamentous ascomycetes such as P. anserina,C. parasitica, and O. novo-ulmi subsp. americana (re-viewed in Hausner 2003; Gibb and Hausner 2005), andclassified as a group I intron belonging to the IA1 subgroupbased on sequence data and structural features. To confirmthat this region indeed represents an intron, we performedRT-PCR on total RNA isolated from O. novo-ulmi subsp.americana strain WIN(M)900. Using primers that flank theintron insertion site, a 3-kb product was amplified from ge-nomic DNA (fig. 1, lane 1), whereas a 0.65-kb product wasamplified from cDNA, the size expected to result from li-gation of exons after intron splicing (fig. 1, lane 3). We con-firmed that the 0.65-kb product corresponded to ligatedexons by cloning and sequencing the product, showing thatthe U11 insertion is indeed an intron. Based on the sequenceobtained from the RT-PCR product, the splice junction wasas follows: 5#exon-TAGGGAT/intron/AACAGG-3#exon.The intron insertion site corresponds to position L2449of the E. coli LSU rDNA.

To assess the diversity of HEG insertions within RPS3genes that are encoded in the mL2449 group I intron, weperformed a PCR-based survey with primers IP1 and IP2that flank the mL2449 insertion site using total DNA iso-lated from 119 strains of ophiostomatoid fungi representing85 species. Two categories of PCR products were ampli-

fied: short (1.6-kb) products for 88 strains, and long(2.4- to 3.0-kb) products for 31 strains (supplementary table1S, Supplementary Material online, table 1). Based on pre-vious work on ophiostomatoid fungi and related taxa (Gibband Hausner 2005; Sethuraman et al. 2008), we assumedthat short PCR fragments most likely represented RPS3genes within the L2449 intron that are not interrupted bya HEG (HEG-minus RPS3 alleles), whereas the long frag-ments represent RPS3 genes that are interrupted by a HEG(HEG-plus RPS3 alleles). We sequenced a total of 21 longPCR products to characterize the HEG insertions and alsosequenced 11 short PCR products from closely related spe-cies to accurately localize the HEG insertion point. In sum-mary, we identified three different HEG insertion siteswithin RPS3 alleles of ophiostomatoid fungi, all involvingdouble-motif LAGLIDADG HEases (fig. 2A). In additionto completely sequencing 21 of the long PCR products,we partially sequenced an additional 10 products, noneof which revealed novel insertion sites/HEGs and weretherefore not characterized any further. A-type HEG inser-tions were located in the N-terminal coding region of RPS3(fig. 2B), and B-type and C-type insertions were locatedwithin the C-terminal coding region of RPS3 (fig. 2Cand D). The C-type insertions are similar to the insertionpreviously described for O. novo-ulmi subsp. americana(Gibb and Hausner 2005). In addition, we found one exam-ple where an A- and B-type HEG had independently

FIG. 1.—RT-PCR assay to detect splicing of the mL2449 group Iintron in Ophiostoma novo-ulmi ssp americana strain WIN(M) 900. (A)Representative agarose gel of RT-PCR reactions. Lane 1 shows a PCRproduct (;3 kb as indicated) amplified from total DNA using primersLsex2-R and IP2. Lane 2 is an RT-PCR reaction performed without priorreverse transcriptase step, to confirm that all DNA has been degraded.Lane 3 represents the RT-PCR product generated with primers Lsex2-Rand IP2 after the reverse transcriptase step. Lanes indicated ‘‘M’’ are DNAsize standards (1 kb plus, Invitrogen). (B) Schematic representation of thernl region analyzed. Sequence of the RT-PCR product revealed the exon–exon junction to be 5#-CGCTAGGGAT/AACAGGCTAA-3#.

Homing Endonucleases Targeting the mtDNA RPS3 Gene 2303

Downloaded from https://academic.oup.com/mbe/article-abstract/26/10/2299/1108791by gueston 02 April 2018

Page 6: Genes-within-genes: Multiple LAGLIDADG homing endonucleases

inserted into a single RPS3 gene of Ophiostoma laricis(A/B-type insertion; fig. 2E). Each of these insertions isdescribed in detail below.

A-Type HEG Insertions Create Bi-ORFic U11 rnlIntrons

Sequencing of the Ophiostoma piceaperdum strain IPPCR product resolved the size of the mL2449 intron to be2.914 kb (fig. 2B), whereas sequencing of a closely relatedspecies Ophiostoma aureum (CBS 438.69; Hausner et al.1993) revealed a 1.6-kb mL2449 intron that lacked anHEG insertion in RPS3. This HEG-minus sequence wasused as a reference to determine the insertion point ofthe HEG in the RPS3 gene of O. piceaperdum. The inser-tion of the LAGLIDADG HEG within the O. piceaperdumL2449 intron has created two putative ORFs. The first ORFis 1.446 kb, encoding a 482 amino acid fusion protein con-sisting of the first 189 bp of RPS3 (the N-terminal 63 aminoacids) followed by 1.257 kb (419 amino acids) that corre-

sponds to a double-motif LAGLIDADG HEase. The secondORF within the O. piceaperdum U11 intron is separatedfrom the first ORF by a 79-bp spacer region, is 1.041 kb long,and encodes a Rps3 homolog of 347 amino acids. Interest-ingly, the origin of 79-bp spacer sequence and the first 38-bpsequence of the second ORF (Rps3) in O. piceaperdum areunknown, as similar sequences are not found in the closelyrelated O. aureum RPS3 sequence (or for that matter in anycharacterized rnl U11 sequence). However, after the novel39-bp RPS3 N-terminal sequence, the remaining RPS3 se-quence in O. piceaperdum is identical to the correspondingRPS3 sequence of O. aureum.

B- and C-Type Insertions Create Mono-ORFic mL2449Introns

All rnl-U11 regions that yielded PCR products of;2.4 kb were sequenced and found to contain a group Iintron-encoded RPS3 gene plus a single double-motif LA-GLIDADG HEG that was inserted in one of two locations

FIG. 2.—Schematic representation of the mL2449 intron, the intron-encoded RPS3 gene and the HEG insertion sites. (A) Three HEG insertion sites(A, B, and C) in the RPS3 gene of ophiostomatoid fungi and related taxa. Striped rectangles indicate intron sequence, whereas the open rectanglerepresents the RPS3 gene. LSU (rnl), large subunit rDNA gene. (B) Example of an A-type insertion in Ophiostoma piceaperdum WIN(M)979. Theshaded box indicates the LAGLIDADG HEG. (C) Example of a B-type HEG insertion in Ophiostoma europhioides WIN(M)449. (D) Example of a C-type insertion in Ophiostoma novo-ulmi subsp. americana WIN(M)900. The 4-bp direct repeats flanking the HEG are indicated by solid lines. The 52-bp spacer segment separating the HEG and downstream intron sequence is indicated by a dark box. (E) Example of an RPS3 gene with two HEGinsertions in Ophiostoma laricis WIN(M)1461. The HEGs are A- and B-type insertions, as described in panels B and C, respectively.

2304 Sethuraman et al.

Downloaded from https://academic.oup.com/mbe/article-abstract/26/10/2299/1108791by gueston 02 April 2018

Page 7: Genes-within-genes: Multiple LAGLIDADG homing endonucleases

within the RPS3 C-terminal region, herein referred to as theB- and C-type HEG insertions (see fig. 2C and D, table 1).These examples are designated as mono-ORFic as only oneRPS3–HEG fusion is present within the intron. The C-typeHEG insertion point and the arrangement of the HEase-coding region with respect to the upstream RPS3 codingregion is the same as previously described for O. novo-ulmisubsp. americana (Gibb and Hausner 2005). The newlyidentified C-type HEG insertions identified in this studyare listed in table 1. The C-type HEG insertions are asso-ciated with a short direct repeat, 5#-GAAT-3# (table 2). Inaddition, 52 bp separates the C-terminal (or 3# end) of theRps3–HEG fusion from the original RPS3 C-terminus thatwas displaced downstream by the insertion event; this dis-placed sequence is likely noncoding (fig. 3). The source ofthe 52-bp segment is not known as BlastN searches yieldedno significant hits. In each case, the HEG insertion eventdisplaced the original RPS3 C-terminal coding region(see fig. 3). However, the effect of the HEG insertion onRPS3 function is negated because the displacedRPS3-codingsegment is essentially duplicated to generate a new Rps3C-terminus. Significantly, we found that 12 of 16 C-typeHEGs showed evidence of degeneration caused by indelswithin the HEase-coding region that resulted in frameshiftmutations and premature termination codons.

Three strains of Ophiostoma europhioides (WIN(M)449, 1430, and 1431), one strain of Leptographium pithyo-philum, and two strains of L. truncatum (WIN(M) 254 and1434) were noted to have a single HEG insertion, referred toas the B site that is located about 28 bp upstream of the Cinsertion site (see fig. 2C and table 1). The O. europhioides,L. pithyophilum, and L. truncatum sequences were com-pared with each other’s rnl U11 region including theRPS3–HEG-minus O. aureum U11 sequence. Comparativeanalysis showed that within this group, the HEG is inserted

such that the original C-terminus (45 bp) of the residentRPS3 gene is displaced downstream from the resultantRPS3–HEG fusion. As observed for the C-type HEGs,the B-type HEG insertions are also associated with dupli-cations of the displaced RPS3 C-terminal sequences ensur-ing that the RPS3-coding regions remain intact. Similar toC-type insertions, the C-terminal (or 3# end) of the RPS3HEG–coding region is separated from the original RPS3C-terminus that was displaced by the insertion event(fig. 3). However, the spacer sequence is only 4 or 5 bp(figs. 2C and 3), as opposed to the longer 52-bp spacer as-sociated with C-type insertions. Furthermore, the spacer se-quences show no similarity to any other rnl-U11 sequence,suggesting that these sequences were introduced during theHEG insertion event. For B-type insertions, three HEaseORFs appear intact, whereas four possess indels and missensemutations resulting in premature stop codons (table 1). Theupstream RPS3-coding regions in all cases were alwaysnoted to be intact, that is, no premature stop codons.

Independent Insertion of Two LAGLIDADG HEGs ina Single RPS3 Gene

A variation of the O. piceaperdum mL2449 intronORF arrangement was noted in a strain of O. laricis(WIN(M) 1461) (fig. 2E). Here, the resident RPS3-codingregion was invaded independently by two double-motifLAGLIDADG-type HEGs, creating two hybrid fusionORFs. One HEG insertion is an A-type insertion, wherethe HEG is fused in-frame to the N-terminus of the originalRPS3 ORF. The second HEG insertion is a B-type insertion,where the HEG is fused in-frame to the C-terminus of theRPS3-coding region. However, both HEGs are character-ized by frameshift mutations, suggesting that they have de-generated. In both Rps3–HEG fusions, the RPS3-coding

Table 2Sequences Upstream and Downstream of RPS3 HEG Insertions

Organism and Strain NumberSequences Before (3#)

the HEG Insertion PointSequences After (5#)

the HEG Insertion Point Type

Ophiostoma ulmi (WIN(M) 1223) AGGTTGAAT GAAT.AAGTGGA COphiostoma novo-ulmi subsp americana (WIN(M) 900) AGGTTGAAT GAAT.AAGTGGA COphiostoma himal-ulmi (CBS 374.67) AGGTTGAAT GAAT.AAGTGGA CSporothrix sp. (WIN(M) 924) AGGTTGGaAT GAAT.AAGTGGA COphiostoma distortum (WIN(M) 847) AGGTTGAAT GAAT.AAGTGGA COphiostoma minus (WIN(M) 861) AGGTTGGAT GAAT.AAGTGGA CCeratocystiopsis brevicomi (WIN(M) 1452) AGGTTGAAT GAAT.AAGTGGA COphiostoma torulosum (WIN(M) 730) AGGTTGAAT GAAT.AAGTGGA COphiostoma penicillatum (WIN(M) 27) AGGTTGAAT GAAT.AAGTGGA CCeratocystis curvicollis (WIN(M) 55) AGGATGAAT GAAT.AAGTGGA COphiostoma tetropii (WIN(M) 111) AGGTTGAAT GAAT.AAGTGGA CO. tetropii (WIN(M) 451) AGGTTGAAT GAAT.AAGTGGA COphiostoma ips (WIN(M) 923) TAAAAGGTT GAAT.AATTGGA C#Ophiostoma europhioides (WIN(M) 1431) TCTAAACGT AGTATAGGAGC BO. europhioides (WIN(M) 1430) TCTAAACGT AGTATAGGAGC BO. europhioides (WIN(M) 449) TCTAAACGT AGTATAGGAGC BLeptographium truncatum (WIN(M) 1434) TCTAAACGT AGTATAGGAGC BL. truncatum (WIN(M) 254) TCTAAACGT AGTATAGGAGC BLeptographium pithyophilum (WIN(M) 1454) TCTAAACGT AGTATAGGAGC BOphiostoma laricis (WIN(M) 1461) TCTAAACGT AGTATAGGAGC BOphiostoma piceaperdum (WIN(M) 979) AATTTTCCT GTATATGAC AOphiostoma laricis (WIN(M) 1461) AATTTTCCT GTATATGAC A

a Nucleotides shown in bold indicate positions that deviate from the consensus sequence 3# to HEG insertion sites.

Homing Endonucleases Targeting the mtDNA RPS3 Gene 2305

Downloaded from https://academic.oup.com/mbe/article-abstract/26/10/2299/1108791by gueston 02 April 2018

Page 8: Genes-within-genes: Multiple LAGLIDADG homing endonucleases

regions are upstream of the HEase-coding segments, imply-ing that frameshift mutations within the HEGs should notdirectly affect the translation of Rps3. The two Rps3–HEGfusion ORFs are separated by a 36-bp sequence that lackssimilarity to U11 region/intron sequence, and the secondORF starts with a 38-bp segment that may representa new Rps3 N-terminus, similar to the situation describedfor A-type insertions in O. piceaperdum (see fig. 2B). Insummary, the resident RPS3 gene has essentially been splitsuch that the N- and C-termini are now components of twoORFs that each includes a LAGLIDADG HEase.

Phylogenetic Analysis of the LAGLIDADG HEGsInserted in RPS3 Genes

A BlastP search identified double-motif LAGLI-DADG HEases related to those we identified in this study.To analyze the evolutionary relationships among the HEGs,the sequences were combined into a single alignment andanalyzed by a variety of phylogenetic methods (fig. 4A andB). In particular, we were interested in determining if thedouble-motif HEGs identified in our study fit the model thatsuggests that double-motif LAGLIDADG HEGs evolvedby a gene-duplication event of a single, ancestral LAGLI-DADG motif type HEG. When the LAGLIDADG ORFswere treated as separate components (Sethuraman et al.2008; N-terminus including P1 and rA; C-terminus includ-ing P2 and rB), phylogenetic analyses yielded evolutionary

trees that grouped the N- and C-terminal sequences intoseparate clades (fig. 4B). This tree topology is that expectedif the two halves of the LAGLIDADG sequences originatedby a gene duplication event (Haugen and Bhattacharya2004). A similar finding was made when the HEGs weretreated as a continuous sequence; they grouped into threedistinct clades (fig. 4A). Both phylogenetic analyses suggestthat the C-terminally inserted HEGs (sites B and C) sharea recent common ancestor and are distantly related to the A-type HEG that inserted in the N-terminus of RPS3 gene.Group I intron–encoded LAGLIDADG ORFs recoveredfrom Genbank by BlastP analysis failed to identify a poten-tial intron-encoded ancestor for the RPS3 HEGs discoveredin this study, whereas the previously described HEG in-serted within the C. parasitica RPS3 gene appears to be re-lated to the C-type HEGs identified in species ofOphiostoma (including Leptographium) species.

The RPS3 Host Gene Phylogeny Suggests VerticalRather Than Horizontal Inheritance

To determine the phylogenetic relationship among thehost RPS3 genes, and to test for horizontal transfer of RPS3and HEG genes, we extracted related RPS3 sequences fromGenBank representing two major groups within the Pezizo-mycotina: the Eurotiomycetes and the Sordariomycetes(Blackwell et al. 2006). In total, 47 RPS3 sequences werecompiled of which our study generated 33 new RPS3

FIG. 3.—Details of the B- and C-type HEG insertions in RPS3. Shown are HEG-minus and HEG-containing RPS3 sequences of representative B-and C-type insertions, with translated amino acid sequence indicated above or below the coding-strand sequence. The dashed lines indicate thesequence that was inserted into RPS3, including the ‘‘duplicated’’ RPS3 sequence and the HEG. The ‘‘displaced’’ original RPS3 sequence is indicated bya dashed rectangle. Direct repeats flanking the C-type HEG insertion are in bold and enlarged font. There are insufficient examples of the A-type HEGsto provide details on the sequence changes that occurred during the HEG insertion.

2306 Sethuraman et al.

Downloaded from https://academic.oup.com/mbe/article-abstract/26/10/2299/1108791by gueston 02 April 2018

Page 9: Genes-within-genes: Multiple LAGLIDADG homing endonucleases

sequences for meiotic and mitotic members of the genusOphiostoma sensu lato. The phylogenetic analysis of theRPS3 data (fig. 5) yielded a tree that essentially reflectsthe expected relationships based on previous rDNA studiesof the various ascomycetous taxa examined in this study(Hausner et al. 1993, 2000). Furthermore, although RPS3is encoded within a potentially mobile group I intron,and in some instances the RPS3 ORF is associated with po-tentially mobile HEGs, the comparison between the RPS3and the HEG trees provides no evidence that the RPS3 genehas been transferred horizontally. Comparative phyloge-netic analysis of RPS3 sequences with their correspondingHEGs failed to show evidence for recent lateral transfers ofeither the HEG or RPS3 sequences, as the phylogenetictrees observed appeared to be congruent for both theRPS3- and HEase-coding regions (data not shown).

I-OnuI and I-LtrI Are Functional LAGLIDADGEnzymes That Cleave at or Near the HEG Insertion Site

Phylogenetic analysis showed that the B- and C-typeRPS3 HEGs may share a common ancestor. We focused ontwo HEG insertions, a B-type HEG in the RPS3 gene ofL. truncatum strain WIN(M) 254 and a C-type HEG inthe RPS3 gene of O. novo-ulmi subsp. americana strainWIN(M)900. Previous work based on comparative se-quence analysis suggested that for the C-type RPS3 inser-tion, a GAAT sequence is a logical candidate as a cleavage

and insertion site (Gibb and Hausner 2005). However, forthe B-type RPS3 insertions, potential cleavage–insertionsites were not as apparent; thus, the HEase was character-ized with regard to its cleavage site within the RPS3 gene.The cleavage site assays also determined whether the LA-GLIDADG HEases inserted within the C-terminus of theRPS3 gene are functional.

In order to characterize each HEase, we initiallysynthesized two gene constructs for each HEase for usein overexpression studies. One construct included the entireRPS3–HEG fusion, whereas a second construct corre-sponded to the LAGLIDADG endonuclease portion ofthe RPS3–HEG fusion. In each case, the genetic codewas optimized for expression in E. coli. Although both pro-teins expressed well, the Rps3–HEG fusion did not bind tonickel-charged resin, whereas the HEG-only construct wasreadily purified by nickel-affinity and gel-filtration chroma-tography (fig. 6A).

For the C-type HEG, purified HEase was incubatedwith plasmid substrate (pRPS3) containing a clonedRPS3–HEG-minus allele (source: O. novo-ulmi subsp.americana strain WIN(M) 904). As shown in figure 6B, cir-cular pRPS3 was linearized after addition of the purifiedHEase (fig. 6B, lanes 3–5). In contrast, no cleavage wasobserved by the HEase with a substrate that correspondedto HEG-plus allele (pRPS3/HEG), or a substrate containinga different group I intron–encoded ORF (mL1699 ORF;-pU7-1409) (fig. 6B). In accordance with standard

FIG. 4.—(A) Phylogenetic analyses of 32 double-motif LAGLIDADG sequences. Topology of trees shown in panels A and B are based onBayesian analysis of LAGLIDADG HE amino acid sequences. The numbers at nodes indicate the level of support based on bootstrap analysis incombination with parsimony and NJ analysis, respectively. The third number at the nodes below the line represents the posterior probability valuesobtained from the 50% majority consensus tree generated using Bayesian analysis. Numbers are provided for those nodes that generated high values,that is, posterior probability values of . 99% and bootstrap support values .95%. NA indicates a particular node was not observed with one of thephylogenetic reconstruction methods utilized in this analysis. Accession numbers [ ] are provided for those sequences obtained by BlastP searches. (B)Phylogenetic analysis where the N- and C-terminal domains of the LAGLIDADG HEases were treated as individual sequences, nodes labeled as inpanel A. The letters P and D following the HEG names indicate P 5 putative (i.e., HE activity not tested) and D 5 degenerated (based on the presenceof premature stop codons).

Homing Endonucleases Targeting the mtDNA RPS3 Gene 2307

Downloaded from https://academic.oup.com/mbe/article-abstract/26/10/2299/1108791by gueston 02 April 2018

Page 10: Genes-within-genes: Multiple LAGLIDADG homing endonucleases

nomenclature for HEases, we have named the endonucleaseI-OnuI.

The I-OnuI cleavage sites were mapped by incubatingthe enzyme with end-labeled substrate that included the pre-dicted I-OnuI insertion site. By resolving the cleavage prod-ucts next to corresponding DNA sequencing ladders, theI-OnuI cleavage site was mapped to positions 1214 and1210 on the coding and noncoding strands, respectively,of the O. novo-ulmi subsp. americana (WIN(M) 904)RPS3 gene (fig. 6C and D). These nucleotide positions cor-respond to the 5#-GAAT-3# sequence previously noted toform a 4-bp direct repeat flanking the HEG insertion site(figs. 3 and 6D, table 2). Similarly, the I-LtrI cleavage siteswere mapped as for I-OnuI, except the cleavage site substratewas derived from an RPS3-minus HEG allele obtained fromL. truncatum strain WIN(M)1435. For I-LtrI, the data showthat the HEase generated a 3# 4 nt overhang (GTAT; fig. 7).Based on comparative sequence analysis, the insertion sitefor I-LtrI is 1 bp upstream from the 4-bp cleavage site, thatis, 5#. . .GT[HEG]C[GTATYAGGA. . .3#, where [ and Y

denotes the bottom- and top-strand cleavage sites, respec-tively (see fig. 7).

DiscussionThe RPS3 Locus: a Refuge for LAGLIDADG HEGs

In the mitochondrial genomes of fungi, nonessentialmobile elements, such as group I introns and HEGs, havesuccessfully proliferated and in some instances comprisea significant percentage of the mtDNA, in spite of providingno apparent function to the host (Cummings et al. 1990;reviewed in Hausner 2003). In this work, we focused onthe U11 region of the rnl gene that in many filamentousascomycetes fungi is interrupted by a group I intron at po-sition 2449 (mL2449). Encoded within the mL2449 intronof some ascomycete fungi is the gene for RPS3, which ap-pears to be functional and incorporated into the mitochon-drial ribosomes (LaPolla and Lambowitz 1981; Burke andRajBhandary 1982; Lambowitz et al. 1999; reviewed inHausner 2003). In other fungi, RPS3 is not intron encoded

FIG. 5.—Phylogenetic relationships among 47 mL2449 intron–encoded Rps3 amino acid sequences. Tree topology is based on a 50% majorityconsensus tree generated using Bayesian analysis (Ronquist et al. 2003; Ronquist 2004). Among the 34 Ophiostoma and Leptographium Rps3sequences used, 24 had HEG insertions and 11 sequences (denoted by *) had no HEG insertions. Rps3 sequences marked with (þ) had remnants ofdegenerate LAGLIDADG ORFs and were not included in the HEG phylogenies (fig. 4A and B). Nodes, with regard to statistical support, were labeledas in figure 4. On the right side of the phylogenetic tree is a table indicating the presence/absence of HEGs inserted in RPS3 genes for each species. Thesizes of the IP1/IP2 PCR products obtained are indicated (short [S] 5 1.55 kb and long [L] . 2.4 kb). L indicates the presence and S the absence ofHEGs within RPS3. The HEG insertion positions are indicated by either A, B, or C (see fig. 2). Any evidence for ORF degeneration (i.e., premature stopcodons, frameshift mutations) is indicated by YES and the absence of degeneration by NO.

2308 Sethuraman et al.

Downloaded from https://academic.oup.com/mbe/article-abstract/26/10/2299/1108791by gueston 02 April 2018

Page 11: Genes-within-genes: Multiple LAGLIDADG homing endonucleases

FIG. 6.—Purification and characterization of I-OnuI. (A) ‘‘Top gel,’’ SDS-PAGE analysis of I-OnuI purification by HisTrapHP. Lanes are indicatedas follows: U, uninduced cells; I, induced cells; C, crude fraction from induced cells; P, insoluble fraction; S, soluble fraction; FT, flow through; W,wash. I-OnuI was eluted over an increasing linear gradient of immidazole as indicated by the left-facing triangle. ‘‘Bottom gel,’’ 6% SDS-gel showingthe peak fractions from Superdex 75 gel-filtration column, with fraction numbers indicated above the gel. (B) In vitro cleavage assay with I-OnuI. Lane1, uncut pRPS3; lane 2, pRPS3 linearized with PstI; lanes 3–5, cleavage assays with pRPS3 incubated for 0, 15, and 30 min with I-OnuI; lane 6,cleavage assay with pRPS3 þ HEG construct; lane 7, cleavage assay with pU7143 (mL1669 intron with ORF). The lane marked M is the 1-kb-plusLadder. (C) Physical map of the pRPS3 used for generating substrate molecules via PCR for cleavage mapping assays. In the diagram, open boxesoutline the RPS3 gene. Shown are relative positions of primers (IP1, IP2, 900FP1) used to generate substrate for mapping, with the position of theGAAT insertion site noted. (D) Mapping of I-OnuI cleavage sites. Shown is a representative gel where end-labeled PCR products (5 SUB for substrate)corresponding to the coding (top) or noncoding (bottom) strands were incubated with I-OnuI (þ) or with buffer (�). Cleavage products (5 CP) wereelectrophoresed alongside the corresponding sequencing ladders. Schematic representation of the I-OnuI cleavage sites, indicated by solid triangles onthe top strand and bottom strand. The HEG insertion site based on comparative sequence analysis would be after the GAAT.

Homing Endonucleases Targeting the mtDNA RPS3 Gene 2309

Downloaded from https://academic.oup.com/mbe/article-abstract/26/10/2299/1108791by gueston 02 April 2018

Page 12: Genes-within-genes: Multiple LAGLIDADG homing endonucleases

and found as a free-standing ORF or missing altogether(presumably nuclear encoded; Bullerwell et al. 2003). In-terestingly, the RPS3 gene itself has been invaded bymobile elements, as previous studies showed that the in-tron-encoded RPS3 genes of C. parasitica (Hausneret al. 1999) and O. novo-ulmi subsp. americana (Gibband Hausner 2005) are interrupted by putative LAGLI-DADG HEGs, creating an apparent Rps3–HEG fusionprotein. To gain an insight into the pervasiveness and func-tional consequences of HEG insertions into the RPS3 gene,we performed a systematic PCR-based survey on 119 ad-ditional ophiostomatoid fungi to identify the extent andtypes of HEG insertions. In summary, 21 strains were se-quenced that were positive by PCR screening for the pres-ence of HEGs, revealing three different positions within theRPS3 gene that were invaded by HEGs. These insertionsare referred to as A-, B-, and C-type insertions, respectively.

Complex organization of RPS3 genes has also beendescribed in other biological systems. For example, inmany Angiosperms, the mitochondrial RPS3 gene is inter-rupted by a group II intron (Laroche and Bousquet 1999). Amore dramatic example is found in Dictyostelium discoi-deum where the mtDNA RPS3 gene is split and both the5# and 3# segments are fused to separate ORFs of unknownfunction (ORF425 and ORF1740, respectively) (Iwamotoet al. 1998). Collectively, these data suggest that RPS3 isa hotspot for insertion of mobile elements and that RPS3

can tolerate insertions that rearrange the coding region with(presumably) little effect on Rps3 function.

Althoughit isunknownhowtheRPS3genewas insertedinto the mL2449 group I intron in fungi, such an arrangementmayoffer several advantages.Thisconfigurationensures thatthe RPS3 gene and ribosomal RNA are cotranscribed, yield-ing stoichiometric amounts of both components necessaryfor ribosome biogenesis. Although it is very likely that therns and rnl transcripts are under the control of differentpromoters (Kubelik et al. 1990), one would assume that bothtypes of rRNAs are produced at similar levels; thus, the rnlintron–encoded rps3 ORF should be produced in similaramounts to the rns RNA. However, significant RNA process-ing of the rnl transcript must occur, to both generate a func-tional rnl rRNA and liberate the group I intron, ultimatelyyielding a translatable mRNA. In this context, it is interestingto consider a possible link between intron splicing andRPS3expression. For instance, the HEase I-SceI is encodedwithin a group I intron that is inserted at position 2449 oftheS.cerevisiaernlgene,andithasbeensuggested that rRNAprocessingandsplicingof the intronplaysarole inmaturationof the3#endof the I-SceI mRNA (Zhuetal. 1987; Gillhaet al.1994; Johansen et al. 2007). The intron-encodedRPS3ORFs(including the RPS3–HEG fusions described here) are lo-cated in the P8 loop of the group I intron RNA, as is I-SceI,so expression of the RPS3-HEase ORFs may depend in parton the splicing of the intron generating appropriate mRNAsfor translation.

Consequences of the HEG Insertions on RPS3 Structureand Function

As is the case with many mobile intron and endonu-clease insertion sites, the HEG insertion sites in RPS3 de-scribed here correspond to highly conserved regions of theN- and C-terminal regions of Rps3 (fig. 8A and B). Theseobservations are consistent with the A-, B, and C-typeHEGs targeting conserved sequences to ensure homingto related organisms that possess HEG-minus RPS3 alleles(fig. 2). RPS3 orthologs from divergent species are notori-ously difficult to align at the amino acid level, but two con-served regions have been identified. The N-terminal regionis a KH RNA-binding domain (pfam00417) connected bya linker region to a well-conserved C-terminal domain(pfam00189). The B- and C-type HEGs are inserted in loopor nonstructured regions of the Rps3 protein, rather than inwell-defined regions of secondary structure (fig. 8C). Crys-tal structures of the 30S small subunit ribosome show thatRps3 makes multiple contacts to helices in the small rRNAand to other ribosomal proteins (Wilson and Nierhaus 2005;supplementary fig. 8D, Supplementary Material online).Previous biochemical studies in E. coli have found thatRps3 binds to the small ribosomal subunit at a late stepof ribosome biogenesis to aid in the assembly of additionalribosomal proteins. Moreover, the mitochondrially encodedRps3 of N. crassa was shown to be ribosome associated(LaPolla and Lambowitz 1981). Thus, our data showing thatRPS3 has been invaded multiple, independent times by LA-GLIDADG endonucleases raise the obvious question of theeffect of the insertions on Rps3 structure and function.

FIG. 7.—Mapping of I-LtrI cleavage sites. Shown is a representativegel where end-labeled PCR products (5 SUB for substrate) correspond-ing to the coding (top) or noncoding (bottom) strands were incubated withI-LtrI (þ) or with buffer only (�). Cleavage products (5 CP) wereelectrophoresed alongside the corresponding DNA sequencing ladders.Shown below is a schematic representation of the I-LtrI cleavage sites,indicated by solid triangles on the top strand and bottom strand; insertionsite for HEG is also noted by a vertical line.

2310 Sethuraman et al.

Downloaded from https://academic.oup.com/mbe/article-abstract/26/10/2299/1108791by gueston 02 April 2018

Page 13: Genes-within-genes: Multiple LAGLIDADG homing endonucleases

In the case of the B- and C-type insertions, the effect ofthe HEG insertion on RPS3 is likely to be minimal becausethe C-terminal coding region of Rps3 that is ‘‘displaced’’ is‘‘duplicated’’ and fused in-frame to the 3# end of the RPS3gene (fig. 3). Thus, a B- or C-type insertion is defined as theduplicated region of the 3# end of RPS3 plus the HEase-coding region (fig. 3). The original 3# end of the RPS3 geneis now separated from the RPS3–HEG fusion by a stop co-don and is presumably noncoding. For each insertion, theduplicated 3# RPS3 nucleotide sequence is not identical tothe displaced RPS3 sequence, but all changes result in con-servative amino acid replacements with probably little or nooverall effect on Rps3 function. Regardless of the insertiontype, the consequence of the B- and C-type insertions is anRps3 protein where the HEG is fused to the C-terminus.Many examples exist where intron-encoded LAGLIDADGORFs are fused in-frame to upstream exons (Cummingset al. 1989; Gonzalez et al. 1998). The expression of theseORFs is thought to involve proteolytic processing of thechimeric translation products, possibly by the m-AAA pro-tease (Arlt et al. 1998; Van Dyck et al. 1998;). Such a mech-anism could possibly explain how a functional Rps3 proteinwould be generated in the case of B- and C-type HEGs atthe extreme C-terminus of Rps3. Alternatively, the Rps3–HEG fusion may be assembled into the 30S ribosome sub-unit without any posttranslational processing.

Free-standing HEG-like elements reminiscent of thosediscussed here have been described in Allomyces macrogy-nus and other chytridiomycetes and zygomycetes fungi. Inthese fungi, HEG-like sequences are inserted within themtDNA atp6 gene (Paquin et al. 1994; Seif et al. 2005).For instance, in A. macrogynus, the element (ORF 360)encodes a GIY-YIG HEase that is a separate ORF down-stream of the 3# end of the atp6 gene (Paquin et al.1994). However, the sequence data show that the HEGinsertion resulted in a new C-terminus for the atp6 geneas the original resident C-terminus was displaced down-stream and is now noncoding. These characteristics are re-markably similar to the RPS3 insertions described here, asthe GIY-YIG HEG not only inserted into the atp6 gene butalso brought along a new atp6 C-terminus, fused in-frameto the resident atp6, to compensate for the displacement ofthe original C-terminus (Paquin et al. 1994; Paquin andLang 1996). In either of the RPS3 or atp6 cases, the mo-lecular mechanism(s) that duplicated the displaced host-gene sequence upon insertion of the HEG to restore theC-terminal region of Rps3 or Atp6 is unknown, but clearlymust involve illegitimate recombination.

However, for the two A-type insertions that interruptthe N-terminus of Rps3, the situation is far more complex asthe RPS3 genes have been effectively split into two halves(fig. 2). The first A-type HEG insertion creates two

FIG. 8.—(A) Sequence logos (Schneider and Stephens 1990) representing those segments of the Rps3 amino acid alignments corresponding tonucleotide positions that are invaded by HEGs at the gene level. Vertical lines indicated the three Rps3 HEG insertion sites: A, B, and C. The sequencelogos were generated using the online program WebLogo (Crooks et al. 2004; http://weblogo.berkeley.edu/).(B) The relative HEG insertion points withregard to the Rps3 amino acid sequence are shown with reference to the Rps3 amino acids sequence obtained from Ophiostom novo-ulmi subsp.americana strain WIN(M) 904 (a HEG-minus allele; GenBank accession: AY275137). (C). Structure of Escherichia coli Rps3 protein with the positionof the B- and C-type HEG insertion sites in the corresponding fungal Rps3 denoted by arrows (modified from PDB 1FKA; Schluenzen et al. 2000).Details of A-type insertions were not shown as the intron-encoded version of Rps3 appears to have no similarity with the N-terminal region of thebacterial type Rps3.

Homing Endonucleases Targeting the mtDNA RPS3 Gene 2311

Downloaded from https://academic.oup.com/mbe/article-abstract/26/10/2299/1108791by gueston 02 April 2018

Page 14: Genes-within-genes: Multiple LAGLIDADG homing endonucleases

independent ORFs: one ORF where the N-terminal 63amino acids of Rps3 is fused to the HEG-coding region,followed by a second ORF consisting of the C-terminalcomponent of the RPS3-coding region. The second A-typeHEG insertion we identified is accompanied by a B-typeHEG insertion (fig. 2E). Here, the N and C terminal–codingregions of RPS3 are components of two separate fusionORFs, one including an A-type HEG and the other a B-typeHEG, respectively. It is difficult to envision how RPS3function is maintained in these cases, but it is possible thateach fusion ORF is independently translated, and RPS3function is restored by the fusion proteins interacting post-translationally to reconstitute an active protein. A detailedtranscriptional analysis of the RPS3 A-type HEG insertionsmay provide clues as to how the two RPS3 containing ORFsare expressed.

Introns with two ORFs have been characterized for theP. anserina nad1-i4 intron (Sellem and Belcour 1997). Thesituation is somewhat different to that described here, as thegroup I intron is inserted within a protein-coding gene(nad1) and the first intron ORF (orf1) is a LAGLI-DADG-type HEase fused to the upstream nad1 exon,whereas the second independent intron ORF encodes a pu-tative GIY-YIG HEase. Sellem and Belcour (1994, 1997)showed that translation of the second ORF requires alter-native mRNA-splicing events that bring the second ORFin-frame with the upstream exon. It is difficult to envisiona similar alternative splicing event for expression of the sec-ond RPS3 ORF within the bi-ORFic mL2449 introns, be-cause alternative splicing would join the RPS3 mRNA to anrRNA transcript that is not a substrate for translation. Un-fortunately, little is known about cis-acting regulatory se-quences involved in mtDNA transcription and translationin filamentous ascomyetes fungi.

An alternative explanation regarding the effect of theHEG insertions on RPS3 structure and function is that theyare of little consequence, as nuclear copies of RPS3 or func-tionally equivalent ribosomal proteins may exist (Bonenand Calixte 2006). Many fungi do not encode anyribosomal proteins in the mitochondrial genome, the corre-sponding genes for which have presumably been trans-ferred to the nucleus. There is also the possibility ofheteroplasmy, where additional HEG-minus RPS3 allelesexist on other mitochondrial chromosomes within an indi-vidual. However, in samples that were HEG positive, weonly observed a single PCR product corresponding tothe HEG-containing allele, suggesting the presence of a sin-gle mtDNA genotype. Moreover, if the HEG-containingRPS3 alleles we described are not essential, the observa-tions that all B- and C-type HEG insertions occurred suchthat the 3# region of RPS3 is duplicated to restore a com-plete ORF would be unexpected. Furthermore, nonfunc-tional copies of RPS3 would rapidly degenerate,whereas, as discussed below, we only observed degenera-tion of the HEG-coding region. Moreover, we calculatedthe ratio of nonsynonymous (Ka) to synonymous (Ks)amino acid substitutions for RPS3 using a codon-based nu-cleotide alignment derived from 36 species and found theKa/Ks ratio to be consistently above 1.0, suggesting purify-ing selection for maintenance of function (Sethuraman J,data not shown).

Vertical Transmission and Degeneration Define theEvolutionary History of RPS3-Associated HEGs

LAGLIDADG HEases are found in two forms: a single-LAGLIDADG motif that dimerizes and double-motif formsderived from a gene fusion event between two monomericforms. Sequence analysis indicates that all of the HEGs in-serted into RPS3 are of the double-motif form, and phyloge-netic analysis showed that N- and C-terminal domainsformed separate clades, consistent with the evolution of dou-ble-motif LAGLIDADG endonucleases originating by geneduplication and fusion of a single-motif endonuclease. A sig-nificant finding from our phylogenetic analyses was that theB- and C-type HEGs are monophyletic, and the A-typeHEGs were not related to the latter two types. This relation-ship suggests that a single ancestral HEG (or variants thereof)invaded one of the two HEG insertion sites that we haveidentified within the C terminal–coding region of theRPS3 gene. Invasion of ectopic sites may have beenfacilitated by the relaxed requirement for substrate sequencesymmetry that is characteristic of the double-motifLAGLIDADG enzymes as compared with the single-LAGLIDADG motif enzymes. Subsequent to the origin ofHEG insertions in RPS3, our phylogenetic analyses are con-sistent with a vertical mode of inheritance, as opposed to hor-izontal or lateral transfer of the HEGs, as HEG and RPS3trees are essentially congruent for the B- and C-type inser-tions. Moreover, we noted many examples of HEGs that con-tained frameshift and missense mutations resulting inpremature stop codons that likely render these HEGs non-functional; only 4 of 14 C-type and 3 of 7 B-type HEG in-sertions have what appear to be intact ORFs. The sequencedata show that the combination of RPS3 genes fused withdegenerate HEGs are more frequent than the combinationof RPS3 genes fused with intact and presumably activeHEGs. HEG-like elements that spread vertically througha population can be subjected to a slow degenerative processdue to a lack of natural selection on a neutral genetic element(Goddard and Burt 1999; Gogarten and Hilario 2006). Ex-periments were conducted to observe whether the HEGs weidentified are mobile during sexual crosses. Suitable HEGþand HEG� strains with appropriate mating types were se-lected but, despite conducting reciprocal crosses, all the re-sulting progeny lacked RPS3–HEG elements. Additionalexperiments with more stringent conditions to ensure thatself-mating is not occurring are required to distinguish theparental mtDNA genotypes to aid in the identification ofprogeny that inherit HEGs during mating.

In addition to the phylogenetic evidence indicatingthat the B- and C-type HEGs follow a primarily verticalmode of inheritance, we have demonstrated that one B-typeHEG, I-LtrI, and one C-type HEG, I-OnuI, possess charac-teristics typical of LAGLIDADG HEases (figs. 6 and 7).The B-type cleavage site is 28 bp upstream of the C-typecleavage site and the data show that the I-LtrI HEase gen-erates 4-nt 3# overhangs, and the corresponding HEG is in-serted 1 bp upstream of the 4-bp cleavage site. This is incontrast to I-OnuI that cleaves HEG-minus RPS3 allelesat the site of the HEG insertion, generating 4-nt 3# over-hangs. The cleavage mapping data showing that I-OnuIand I-LtrI target the RPS3 gene imply that the HEases

2312 Sethuraman et al.

Downloaded from https://academic.oup.com/mbe/article-abstract/26/10/2299/1108791by gueston 02 April 2018

Page 15: Genes-within-genes: Multiple LAGLIDADG homing endonucleases

do not promote mobility of the mL2449 group I intron inwhich RPS3 is embedded. Instead, our data are consistentwith I-OnuI and I-LtrI promoting their own mobility be-tween alleles of RPS3 that lack the HEG insertion. It is alsoworth noting that, based on known characteristics of recog-nition sites of LAGLIDADG endonucleases, the I-OnuIrecognition site is likely to encompass sequence of boththe RPS3-coding region and the noncoding group I intron.

Mobility pathways of HEases have been well charac-terized in a number of systems, and all rely on DSBR path-ways of the host organism, a hallmark of which iscoconversion of nucleotide sequence flanking the insertionsite (Cho and Palmer 1999). Analysis of sequence sur-rounding the B- and C-type insertion sites failed to revealobvious coconversion tracts between HEG-minus andHEG-containing strains that would be expected if I-OnuIand I-LtrI utilized DSBR pathways for mobility. Lack ofcoconversion tracts might not be unexpected, however,given that many of the HEGs we identified show obvioussigns of degeneration and are probably not mobile.

An intriguing aspect of the C-type insertions is thepresence of direct repeats that flank the HEG, which inthe case of I-OnuI is 5#-GAAT-3#. The presence of the di-rect repeats flanking the HEG insertion has not been ob-served previously and is not compatible with the currentmodels for HEG homing. Assuming that both GAAT re-peats are actually host-gene sequences, a site-specific re-combination mechanism might explain the origin of thedirect repeats, analogous to that observed during mobilityof cut-and-paste transposons where staggered cuts are filledin to generate direct repeats. However, no such pathway hasyet been demonstrated for HEG homing or transposition.The second possibility is that the upstream GAAT sequenceis part of the ‘‘new Rps3 terminus’’ sequence the HEGbrought along during the homing-recombination event to en-sure that the HEG insertion minimizes their impact on thefunction of the host gene. The latter may be a more likelyscenario as the phylogenetically related B-type HEGs arenot associated with direct repeats (table 2). Another questionto be addressed in future work is how two families of HEasethat are derived from a common ancestor (the B and C-typeHEGs) evolved to target two closely spaced but different re-gions within the RPS3 gene. Clearly, for HEGs to persist andspread into new sites within a genome, HEGs must evolvethe ability to bind and cleave at new sites. The elements pre-sented in this work may offer a useful model system to ex-amine the molecular basis that facilitated invasion of newsites by two closely related LAGLIDADG HEGs, possiblyby identification of key amino acids that change DNA spec-ificity within a relatively short evolutionary time frame.Moreover, our data support the findings of Haugen andBhattacharya (2004) showing that double-motif LAGLI-DADG-type HEGs evolved from a duplication event of asingle-motif LAGLIDADG HEG that were subsequentlysuccessful in spreading to new genomic insertion sites.

Conclusion

Fungal mitochondrial genomes are a haven for mobilegenetic elements, as evidenced by the large number ofgroup I and II introns characterized to date. Our data,

and that of previous studies, suggest that the HEGs insertedinto RPS3 may represent a distinct class of mobile elementsthat are defined by the presence of duplicated segments ofthe host gene that restore gene function, minimizing the im-pact of the HEG insertion on gene function. Although ourdata indicate that the HEGs we describe here are typicalHEases, the initial insertion within the RPS3 gene musthave arisen by an illegitimate recombination event. Boththe initial insertion event and subsequent spread of the el-ement by typical homing pathways have the potential to in-fluence the evolution of functionally critical genes bygeneration of new alleles.

Supplementary Material

Supplementary table 1S and figure 8D are available atMolecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments

This research is supported by Discovery Grants fromthe Natural Sciences and Engineering Research Council ofCanada to G.H. and D.R.E. and in part by a Canadian In-stitute of Health Research grant to D.R.E. We also wouldlike to thank Shelly Rudski for help with sequencing and DrJames Reid for supplying fungal strains for this study.

Literature Cited

Abu-Amero SN, Charter NW, Buck KW, Brasier CM. 1995.Nucleotide-sequence analysis indicates that a DNA plasmid ina diseased isolate of Ophiostoma novo-ulmi is derived byrecombination between two long repeat sequences in themitochondrial large subunit ribosomal RNA gene. CurrGenet. 28:54–59.

Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. 1990.Basic local alignment search tool. J Mol Biol. 215:403–410.

Arlt H, Steglich G, Perryman R, Guiard B, Neupert W, Langer T.1998. The formation of respiratory chain complexes inmitochondria is under the proteolytic control of the m-AAAprotease. EMBO J. 17:4837–4847.

Belcour L, Rossignol M, Koll F, Sellem CH, Oldani C. 1997.Plasticity of the mitochondrial genome in Podospora. Poly-morphism for 15 optional sequences: group-I, group-IIintrons, intronic ORFs and an intergenic region. Curr Genet.31:308–317.

Belfort M. 2003. Two for the price of one: a bifunctional intron-encoded DNA endonuclease-RNA maturase. Genes Dev.17:2860–2863.

Belfort M, Derbyshire V, Parker MM, Cousineau B,Lambowitz AM. 2002. Mobile introns: pathways and proteins.In: Craig NL, Craigie R, Gellert M, Lambowitz AM, editors.Mobile DNA II. Washington (DC): American Society ofMicrobiology Press. p. 761–783.

Belfort M, Perlman PS. 1995. Mechanisms of intron mobility.J Biol Chem. 270:30237–30240.

Belfort M, Roberts RJ. 1997. Homing endonucleases: keepingthe house in order. Nucleic Acids Res. 25:3379–3388.

Bell JA, Monteiro-Vitorello CB, Hausner G, Fulbright DW,Bertrand H. 1996. Physical and genetic map of themitochondrial genome of Cryphonectria parasitica Ep155.Curr Genet. 30:34–43.

Homing Endonucleases Targeting the mtDNA RPS3 Gene 2313

Downloaded from https://academic.oup.com/mbe/article-abstract/26/10/2299/1108791by gueston 02 April 2018

Page 16: Genes-within-genes: Multiple LAGLIDADG homing endonucleases

Blackwell M, Hibbett DS, Taylor JW, Spatafora JW. 2006.Research coordination networks: a phylogeny for kingdomfungi (deep Hypha). Mycologia. 98:829–837.

Bonen L, Calixte S. 2006. Comparative analysis of bacterial-origin genes for plant mitochondrial ribosomal proteins. MolBiol Evol. 23:701–712.

Bonocora RP, Shub DA. 2001. A novel group I intron-encodedendonuclease specific for the anticodon region of tRNA(fMet)genes. Mol Microbiol. 39:1299–1306.

Bullerwell CE, Burger G, Lang BF. 2000. A novel motif foridentifying rps3 homologs in fungal mitochondrial genomes.Trends Biochem Sci. 25:363–365.

Bullerwell CE, Leigh J, Seif E, Longcore JE, Lang BF. 2003.Evolution of the fungi and their mitochondrial genomes. In:Arora DK, Khachatourians GG, editors. Applied mycologyand biotechnology, Vol. III: Fungal genomics. New York:Elsevier Science. p. 133–159.

Burke JM, RajBhandary UL. 1982. Intron within the large rRNAgene of N. crassa mitochondria: a long open reading frameand a consensus sequence possibly important in splicing. Cell.31:509–520.

Caprara MG, Waring RB. 2005. Group I introns and theirmaturases: uninvited, but welcome guests. Nucl Acids MolBiol. 16:103–119.

Chevalier BS, Stoddard BL. 2001. Homing endonucleases:structural and functional insight into the catalysts of intron/intein mobility. Nucleic Acids Res. 29:3757–3774.

Cho T, Palmer JD. 1999. Multiple acquisitions via horizontaltransfer of a group I intron in the mitochondrial cox1 geneduring evolution of the Araceae family. Mol Biol Evol.16:1155–1165.

Clark-Walker GD. 1992. Evolution of mitochondrial genomes infungi. Int Rev Cytol. 141:89–127.

Crooks GE, Hon G, Chandonia JM, Brenner SE. 2004. WebLogo:a sequence logo generator. Genome Res. 14:1188–1190.

Cummings DJ, Domenico JM, Nelson J. 1989. DNA sequenceand secondary structures of the large subunit rRNA codingregions and its two class I introns of mitochondrial DNA fromPodospora anserina. J Mol Evol. 28:242–255.

Cummings DJ, McNally KL, Domenico JM, Matsuura ET. 1990.The complete DNA sequence of the mitochondrial genome ofPodospora anserina. Curr Genet. 17:375–402.

Cummings DJ, Turker MS, Domenico JM. 1986. Mitochondrialexcision-amplification plasmids in senescent and long-livedcultures ofPodospora anserina. In: Wickner RB, Hinnebusch A,Lambowitz AM, Gonsalus IC, Hollaender A, editors. Extrac-hromosoma1 elements in lower eukoryotes. New York: PlenumPress. p. 129–146.

Dayhoff MO, Schwartz RM, Orcutt BC. 1978. A model ofevolutionary change in proteins. In: Dayhoff MO, editor. Atlasof protein sequence and structure. Washington (DC): NationalBiomedical Research Foundation. Suppl. 3:p. 345–352.

Dujon B. 1989. Group I introns as mobile genetic elements: factsand mechanistic speculations—a review. Gene. 82:91–114.

Dujon B, Belcour L. 1989. Mitochondrial DNA instabilities andrearrangements in yeasts and fungi. In: Berg DE, Howe MM,editors. Mobile DNA. Washington (DC): American Society ofMicrobiology. p. 861–878.

Felsenstein J. 1985. Confidence limits on phylogenies: anapproach using the bootstrap. Evolution. 39:783–791.

Felsenstein J. 1989. PHYLIP—Phylogeny Inference Package(Version 3.2). Cladistics. 5:164–166.

Felsenstein J. 2005. PHYLIP (Phylogeny Inference Package)version 3.6. Distributed by the author. Seattle (WA):Department of Genome Sciences, University of Washington.

Gibb EA, Hausner G. 2005. Optional mitochondrial introns andevidence for a homing-endonuclease gene in the mtDNA rnlgene in Ophiostoma ulmi s. lat. Mycol Res. 109:1112–1126.

Gillha NW, Boynton JE, Hauser CR. 1994. Translationalregulation of gene expression in chloroplasts and mitochon-dria. Annu Rev Genet. 28:71–93.

Gimble FS. 2000. Invasion of a multitude of genetic niches bymobile endonuclease genes. FEMS Microbiol Lett. 185:99–107.

Gobbi E, Firrao G, Carpanelli A, Locci R, Van Alfen NK. 2003.Mapping and characterization of polymorphism in mtDNA ofCryphonectria parasitica: evidence of the presence of anoptional intron. Fungal Genet Biol. 40:215–224.

Goddard MR, Burt A. 1999. Recurrent invasion and extinction ofa selfish gene. Proc Natl Acad Sci USA. 96:13880–13885.

Gogarten JP, Hilario E. 2006. Inteins, introns, and homingendonucleases: recent revelations about the life cycle ofparasitic genetic elements. BMC Evol Biol. 6:94. doi:10.1186/1471-2148-6-94.

Gonzalez P, Barroso G, Labarere J. 1998. Molecular analysis ofthe split cox1 gene from the Basidiomycota Agrocybeaegerita: relationship of its introns with homologousAscomycota introns and divergence levels from commonancestral copies. Gene. 220:45–53.

Guhan N, Muniyappa K. 2003. Structural and functionalcharacteristics of homing endonucleases. Crit Rev BiochemMol Biol. 38:199–248.

Haugen P, Bhattacharya D. 2004. The spread of LAGLIDADGhoming endonuclease genes in rDNA. Nucleic Acids Res.32:2049–2057.

Haugen P, Runge HJ, Bhattacharya D. 2004. Long-termevolution of the S788 fungal nuclear small subunit rRNAgroup I introns. RNA. 10:1084–1096.

Haugen P, Simon DM, Bhattacharya D. 2005. The natural historyof group I introns. Trends Genet. 21:111–119.

Hausner G. 2003. Fungal mitochondrial genomes, plasmids andintrons. In: Arora DK, Khachatourians GG, editors. Appliedmycology and biotechnology, Vol. III: fungal genomics. NewYork: Elsevier Science. p. 101–131.

Hausner G, Monteiro-Vitorello CB, Searles DB, Maland M,Fulbright DW, Bertrand H. 1999. A long open reading framein the mitochondrial LSU rRNA group-I intron of Crypho-nectria parasitica encodes a putative S5 ribosomal proteinfused to a maturase. Curr Genet. 35:109–117.

Hausner G, Reid J. 2003. Notes on Ceratocystis brunnea andOphiostoma based on partial ribosomal DNA sequence data.Can J Bot. 81:865–876.

Hausner G, Reid J, Klassen GR. 1992. Do galeate-ascosporemembers of the Cephaloascaceae, Endomycetaceae andOphiostomataceae share a common phylogeny? Mycologia.84:870–881.

Hausner G, Reid J, Klassen GR. 1993. On the phylogeny ofOphiostoma, Ceratocystis s.s., Microascus, and relationshipswithin Ophiostoma based on partial ribosomal DNA sequen-ces. Can J Bot. 71:1249–1265.

Hausner G, Reid J, Klassen GR. 2000. On the phylogeny of themembers of Ceratocystis s.l. that possess different anamorphicstates, with emphasis on the asexual genus Leptographium,based on partial ribosomal sequences. Can J Bot. 78:903–916.

Iwamoto M, Pi M, Kurihara M, Morio T, Tanaka Y. 1998. Aribosomal protein gene cluster is encoded in the mitochondrialDNA of Dictyostelium discoideum: UGA termination codonsand similarity of gene order to Acanthamoeba castellanii.Curr Genet. 33:304–310.

Johansen S, Haugen P. 2001. A new nomenclature of group Iintrons in ribosomal DNA. RNA. 7:935–936.

2314 Sethuraman et al.

Downloaded from https://academic.oup.com/mbe/article-abstract/26/10/2299/1108791by gueston 02 April 2018

Page 17: Genes-within-genes: Multiple LAGLIDADG homing endonucleases

Johansen SD, Haugen P, Nielsen H. 2007. Expression of protein-

coding genes embedded in ribosomal DNA. Biol Chem.

388:679–686.Jurica MS, Stoddard BL. 1999. Homing endonucleases: structure,

function and evolution. Cell Mol Life Sci. 55:1304–1326.Kubelik AR, Kennell JC, Akins RA, Lambowitz AM. 1990. Iden-

tification of Neurospora mitochondrial promoters and analysis of

synthesis of the mitochondrial small rRNA in wild-type and

the promoter mutant [poky]. J Biol Chem. 265:4515–4526.Lambowitz AM, Caprara MG, Zimmerly S, Perlman PS. 1999.

Group I and group II ribozymes as RNPs: clues to the past and

guides to the future. In: Gesteland RF, Cech TR, Atkins JF,

editors. The RNA world. New York: Cold Spring Harbor

Laboratory Press. p. 451–485.Lambowitz AM, Perlman PS. 1990. Involvement of aminoacyl-

tRNA synthetases and other proteins in group I and group II

intron splicing. Trends Biochem Sci. 15:440–444.LaPolla RJ, Lambowitz AM. 1981. Mitochondrial ribosome assembly

in Neurospora crassa. Purification of the mitochondrially

synthesized ribosomal protein, S-5. J Biol Chem. 256:7064–7067.Laroche J, Bousquet J. 1999. Evolution of the mitochondrial rps3

intron in perennial and annual angiosperms and homology to

nad5 intron 1. Mol Biol Evol. 16:441–452.Mota EM, Collins RA. 1988. Independent evolution of structural

and coding regions in a Neurospora mitochondrial intron.

Nature. 332:654–656.Nicholas KB, Nicholas HB Jr, Deerfield DW. 1997. GeneDoc:

analysis and visualization of genetic variation. EMBNEW NEWS.

4:14.Page RD. 1996. TreeView: an application to display phylogenetic

trees on personal computers. Comput Appl Biosci. 12:357–358.Paquin B, Laforest MJ, Lang BF. 1994. Interspecific transfer of

mitochondrial genes in fungi and creation of a homologous

hybrid gene. Proc Natl Acad Sci USA. 91:11807–11810.Paquin B, Lang BF. 1996. The mitochondrial DNA of Allomyces

macrogynus: the complete genomic sequence from an

ancestral fungus. J Mol Biol. 255:688–701.Ronquist F. 2004. Bayesian inference of character evolution.

Trends Ecol Evol. 19:475–481.Ronquist F, Huelsenbeck JP. 2003. MrBayes 3: bayesian

phylogenetic inference under mixed models. Bioinformatics.

19:1572–1574.Salvo JL, Rodeghier B, Rubin A, Troischt T. 1998. Optional

introns in mitochondrial DNA of Podospora anserina are the

primary source of observed size polymorphisms. Fungal

Genet Biol. 23:162–168.Schafer B. 2003. Genetic conservation versus variability in

mitochondria: the architecture of the mitochondrial genome in

the petite-negative yeast Schizosaccharomyces pombe. Curr

Genet. 43:311–326.

Schluenzen F, Tocilj A, Zarivach R, et al. (11 co-authors). 2000.Structure of functionally activated small ribosomal subunit at3.3 angstroms resolution. Cell. 102:615–623.

Schneider TD, Stephens RM. 1990. Sequence logos: a new wayto display consensus sequences. Nucleic Acids Res. 18:6097–6100.

Seif E, Leigh J, Liu Y, Roewer I, Forget L, Lang BF. 2005.Comparative mitochondrial genomics in zygomycetes: bacteria-like RNase P RNAs, mobile elements and a close source of thegroup I intron invasion in angiosperms. Nucleic Acids Res.33:734–744.

Sellem CH, Belcour L. 1994. The in vivo use of alternate 3#-splicesites in group I introns. Nucleic Acids Res. 22:1135–1137.

Sellem CH, Belcour L. 1997. Intron open reading frames asmobile elements and evolution of a group I intron. Mol BiolEvol. 14:518–526.

Sellem CH, d’Aubenton-Carafa Y, Rossignol M, Belcour L.1996. Mitochondrial intronic open reading frames in Podo-spora: mobility and consecutive exonic sequence variations.Genetics. 143:777–788.

Sethuraman J, Okoli CV, Majer A, Corkery TL, Hausner G.2008. The sporadic occurrence of a group I intron-likeelement in the mtDNA rnl gene of Ophiostoma novo-ulmisubsp. americana. Mycol Res. 112:564–582.

Stoddard BL. 2005. Homing endonuclease structure andfunction. Q Rev Biophys. 38:49–95.

Toor N, Zimmerly S. 2002. Identification of a family of group IIintrons encoding LAGLIDADG ORFs typical of group Iintrons. RNA. 8:1373–1377.

Upadhyay HP. 1981. A Monograph on Ceratocystis andCeratocystiopsis. Athens: University of Georgia Press. p. 176.

Van Dyck L, Neupert W, Langer T. 1998. The ATP-dependentPIM1 protease is required for the expression of intron-containing genes in mitochondria. Genes Dev. 12:1515–1524.

Wilson DN, Nierhaus KH. 2005. Ribosomal proteins in thespotlight. Crit Rev Biochem Mol Biol. 40:243–267.

Wingfield MJ, Seifert KA, Webber JF. 1993. In: Wingfield MJ,Seifert KA, Webber JF, editors. Ceratocystis and OphiostomaBiology, taxonomy and ecology. American PhytopathologicalSociety Press.ISBN 0-89054-156-6.

Zhao L, Bonocora RP, Shub DA, Stoddard BL. 2007. The restrictionfold turns to the dark side: a bacterial homing endonuclease witha PD–(D/E)–XK motif. EMBO J. 26:2432–2442.

Zhu H, Macreadie IG, Buttow RA. 1987. RNA processing andexpression of an intron-encoded protein in yeast mitochon-dria: role of a conserved docecamer sequence. Mol Cell Biol.7:2530–2537.

Charles Delwiche, Associate Editor

Accepted June 27, 2009

Homing Endonucleases Targeting the mtDNA RPS3 Gene 2315

Downloaded from https://academic.oup.com/mbe/article-abstract/26/10/2299/1108791by gueston 02 April 2018