

Copyright © 2007 by the Genetics Society of America
DOI: 10.1534/genetics.106.069724

Proofreading and Secondary Structure Processing Determine the Orientation Dependence of CAG·CTG Trinucleotide Repeat Instability in Escherichia coli

Rabaab Zahra,* John K. Blackwood,* Jill Sales† and David R. F. Leach*,1

*Institute of Cell Biology, School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JR, United Kingdom and †Biomathematics and Statistics Scotland, Edinburgh EH9 3JZ, United Kingdom

Manuscript received December 14, 2006
Accepted for publication February 8, 2007

ABSTRACT

Expanded CAG·CTG trinucleotide repeat tracts are associated with several human inherited diseases, including Huntington’s disease, myotonic dystrophy, and spinocerebellar ataxias. Here we describe a new model system to investigate repeat instability in the Escherichia coli chromosome. Using this system, we reveal patterns of deletion instability consistent with secondary structure formation in vivo and address the molecular basis of orientation-dependent instability. We demonstrate that the orientation dependence of CAG·CTG trinucleotide repeat deletion is determined by the proofreading subunit of DNA polymerase III (DnaQ) in the presence of the hairpin nuclease SbcCD (Rad50/Mre11). Our results suggest that, although initiation of slippage can occur independently of CAG·CTG orientation, the folding of the intermediate affects its processing and this results in orientation dependence. We propose that proofreading is inefficient on the CTG-containing strand because of its ability to misfold and that SbcCD contributes to processing in a manner that is dependent on proofreading and repeat tract orientation. Furthermore, we demonstrate that transcription and recombination do not influence instability in this system.

CAG·CTG repeat expansion is associated with human hereditary neurodegenerative diseases such as Huntington’s disease, myotonic dystrophy, and spinocerebellar ataxias (Cummings and Zoghbi 2000; Sinden et al. 2002). The mechanism(s) involved in trinucleotide repeat (TNR) instability (expansion and deletion) has been studied in a number of model systems, including Escherichia coli, yeast, mouse, and cultured cells (Maurer et al. 1996; Freudenreich et al. 1997; Kaytor et al. 1997; Miret et al. 1998; Sarkar et al. 1998; Iyer and Wells 1999; Kovtun and McMurray 2001; McMurray and Kovtun 2003; Pelletier et al. 2005). Instability is affected by both cis- and trans-acting factors. Cis-acting factors include the length and purity of the repeat array. In rapidly dividing cells, such as E. coli and yeast, instability is orientation dependent with respect to replication, suggesting a central role for DNA synthesis in instability. Trans-acting factors include the replication machinery and the repair and recombination proteins of the organism. Furthermore, mutations in replication genes in yeast (Polα, Polδ, PCNA, Fen1/Rad27, Dna2 helicase, and primase) have profound effects on repeat instability (Schweitzer and Livingston 1999; Callahan et al. 2003). On the other hand, studies of instability in mice and in cultured cells have revealed evidence for replication-independent expansion of repeat arrays (Takano et al. 1996; Hashida et al. 1997; Kovtun and McMurray 2001). This is consistent with expansion of repeats in human tissues, such as the brain, where cells seldom replicate. However, even these replication-independent expansion events are likely to be dependent on DNA synthesis at sites of DNA repair. It is therefore essential to understand how the machinery required to copy DNA interacts with these repetitive sequences.

E. coli provides an attractive model system in which to study the basic properties of DNA replication, recombination, and genetic instability because of the detailed understanding of its genetic and biochemical pathways, its rapid growth, and sophisticated biotechnology. Despite these advantages, all previous studies have considered instability on bacterial plasmid-borne repeats apart from one recently reported study on chromosomal trinucleotide repeat instability (Kim et al. 2006). The studies based on plasmid systems have included investigations of the effects of transcription (Bowater et al. 1997; Schumacher et al. 2001), mismatch repair (Jaworski et al. 1995; Schmidt et al. 2000), nucleotide excision repair (Oussatcheva et al. 2001), proofreading (Iyer et al. 2000), and recombination (Jakupciak and Wells 1999, 2000a,b; Hashem et al. 2004). This body of work has revealed many different effects, and contradictory conclusions have sometimes been reached. It is likely that aspects of plasmid biology such as copy number, size, and specialized processing pathways have contributed to the differences observed.

1 Corresponding author: Institute of Cell Biology, School of Biological Sciences, University of Edinburgh, The King’s Buildings, Mayfield Rd., Edinburgh EH9 3JR, United Kingdom. E-mail: [email protected]

Genetics 176: 27–41 (May 2007)

Orientation dependence of TNR instability has previously been seen for E. coli plasmids and in yeast. This has been explained on the basis of lagging-strand replication dynamics coupled to the greater thermodynamic stability of CTG repeat hairpins relative to CAG repeat hairpins (Gacy et al. 1995; Kang et al. 1995; Rosche et al. 1995; Maurer et al. 1996; Freudenreich et al. 1997; Miret et al. 1998). The model suggests that CTG repeat hairpins formed on the lagging-strand template lead to deletions while CTG repeat hairpins formed on the nascent lagging strand lead to expansions. Since the E. coli system is biased toward deletions and not many expansions are observed, even when CTG repeats are present on the nascent lagging strand, this model is at best an incomplete explanation of orientation dependence of repeat instability in E. coli.

Trinucleotide repeats, when single stranded, can fold into hairpins in vitro (Gacy et al. 1995; Mitas et al. 1995a; Smith et al. 1995; Yu et al. 1995a,b; Petruska et al. 1996), which can be attacked by SbcCD (Connelly et al. 1999). The SbcCD nuclease of E. coli is the homolog of the Mre11/Rad50 nuclease in eukaryotes (Sharples and Leach 1995). The enzyme has double-strand exonuclease and hairpin endonuclease activities and acts on a variety of substrates, including DNA hairpins (Connelly and Leach 1996; Connelly et al. 1997, 1999). It has also been shown to affect the frequency and nature of deletions between 101-bp direct repeats flanking an inverted repeat sequence and can affect the nature of a deletion event between 101-bp direct repeats, even in the absence of an inverted repeat sequence, in a manner consistent with nucleolytic processing of the slippage intermediate (Bzymek and Lovett 2001a,b). The observations of CTG and CAG hairpins (Gacy et al. 1995; Mitas et al. 1995b; Smith et al. 1995; Yu et al. 1995a; Petruska et al. 1996) and of slipped mispairing structures in vitro (Pearson et al. 1998; Sinden et al. 2002) have given rise to the hypothesis that misfolded structures play a role in instability. Preformed slipped mispairing structures are repaired in vitro in human cell extracts to delete, retain, or shorten the looped-out repeat-containing strand, demonstrating the presence of activities capable of processing misfolded structures (Panigrahi et al. 2005). The hypothesis that secondary structures may be involved in the processing of CAG·CTG repeats in vivo has received some support from observations implicating the SbcCD and Rad50–Mre11 complexes in E. coli and yeast. In E. coli, it has been reported that a multiply mutant strain (SURE) shows dramatic expansion of CAG·CTG repeats in a plasmid that is prevented by SbcCD (Sarkar et al. 1998). In yeast, a reduction in CAG repeat expansions associated with double-strand break repair in mre11Δ strains was suppressed by overexpressing the Rad50–Mre11 complex, suggesting that it may cleave hairpin structures (Richard et al. 2000). Furthermore, CTG repeat-induced spontaneous double-strand breaks were reduced in a rad50 mutant (Freudenreich et al. 1998). Arguments against the involvement of secondary structures come from the inability to drive a structural transition in vitro in supercoiled templates (Bacolla et al. 1997) and the lack of effect of the sbcCD genotype on CAG·CTG repeat instability in a plasmid system different from that of Sarkar et al. (1998) (Schmidt et al. 2000). Furthermore, the small changes in repeat length observed in plasmid substrates show no bias to even-numbered patterns of deletion products (except as a consequence of mismatch repair, which eliminates +1 and −1 repeat changes; see Schmidt et al. 2000) despite the existence of preferred folding patterns comprising even numbers of trinucleotide repeats in vitro (Petruska et al. 1998) and in vivo (Darlow and Leach 1995). Relevant to many of these arguments is the observation of an SbcCD effect on strand slippage in the absence of a hairpin-forming substrate (Bzymek and Lovett 2001a,b), which weakens any argument for DNA structures based on SbcCD or Rad50/MRE11 effects. It is clear that despite the strong evidence that CAG·CTG repeats can form secondary structures in vitro that are substrates for enzymatic processing, the question of whether secondary structure plays a role in instability in vivo has been more difficult to determine experimentally and the hypothesis remains controversial.

In E. coli, proofreading during DNA replication is performed by the 3′–5′ exonucleolytic ε-subunit of DNA polymerase III, which is encoded by the dnaQ gene. During replication, the proofreading function prevents slipped-strand pairing events that can lead to instability in repeated sequences. A mutation in the proofreading function of DNA polymerase III, dnaQ49 ts, was shown to enhance instability of CAG·CTG trinucleotide repeats (Iyer et al. 2000). Another mutation, mutD5, along with dnaQ49 ts, was shown to enhance instability of tandem repeat sequences (Saveson and Lovett 1997; Bzymek et al. 1999).

Here, we describe a study of trinucleotide repeat instability carried out in the E. coli chromosome. We demonstrate that instability is length and orientation dependent. Longer repeat arrays are more unstable than shorter repeat arrays and CTG repeats on the lagging-strand template are more unstable than on the leading-strand template. Furthermore, for both orientations of CAG·CTG trinucleotide repeat tracts, the distributions of deletion lengths are skewed in a way that is consistent with deletions stimulated by hairpin structures. This is direct evidence that secondary structure plays a role in CAG·CTG repeat instability in vivo. We also demonstrate that mutation of the gene encoding the proofreading subunit of DNA polymerase III (DnaQ) destabilizes CAG·CTG trinucleotide repeat tracts. Furthermore, orientation dependence of instability is lost in the dnaQ mutant, and SbcCD, whose homolog in eukaryotes is Rad50/Mre11, modulates the effect of DnaQ. These data argue that intermediates in the replicative pathway leading to trinucleotide repeat instability are detected and processed by both proofreading and the SbcCD (Rad50/Mre11) complex. Furthermore, we demonstrate that this replicative instability is not caused by transcription or recombination.

MATERIALS AND METHODS

Construction of pLacD1 and pLacD2: The plasmid pLacD1, a derivative of pTOF24 (Merlin et al. 2002), was created to insert TNR arrays into the beginning of lacZ in the E. coli genome using a region of lacZ homology. The 800-bp lacZ DNA fragment was constructed using crossover PCR. Using CSH100 (DL844) as template DNA, primers Lac1F (AAA AAC TGC AGT TGG TGC GGA TAT CTC GGT AGT GG) and Lac1R (GAA GAC GCA ATT GGA GAC CAT GGT CAT AGC TGT TTC CTG TG) were used to create one homology arm, while primers Lac2F (TTT TTC AGC TGA ATA ATT CGC GTC TGG CCT TCC TG) and Lac2R (GGT CTC CAA TTG CGT CTT CGT CGT TTT ACA ACG TCG TGA CTG) were used to create the other. Primers Lac1F and Lac2F were used to amplify the fusion of the two PCR products. CSH100 contains the L8 mutation in the CRP-binding site upstream of the lacZ start codon, which was introduced into the lac homology arms during cloning. The three restriction enzyme sites, BsaI, MfeI, and BbsI, were introduced 10 bp downstream from the start codon of lacZ without altering the reading frame, and the restriction sites for these enzymes in the plasmid pTOF24 were removed using site-directed mutagenesis (SDM) before inserting the 800 bp of lacZ homology between PstI and SalI restriction sites. MfeI was removed using the primers MfeI_SDM_F (CAT CTC AAC TGG TCT AGG TGA TTT TAA TCA CTA TAC CAA CTG AGA TGG G) and MfeI_SDM_R (CCC ATC TCA GTT GGT ATA GTG ATT AAA ATC ACC TAG ACC AGT TGA GAT G); BsaI was removed with the primers BsaI_SDM_F (GTC TAT TGC TGG TAT CGG TAC CCG ACC TGC AGG) and BsaI_SDM_R (CCT GCA GGT CGG GTA CCG ATA CCA GCA ATA GAC); and BbsI was removed with the primers BbsI_SDM_F (CGA CTC CTG CAT CCC TTT CAT CTT CGA ATA AAT ACC) and BbsI_SDM_R (GGT ATT TAT TCG AAG ATG AAA GGG ATG CAG GAG TCG). All alterations to pTOF24 were confirmed via restriction enzyme digestion and DNA sequencing. After rounds of site-directed mutagenesis, the 800-bp lacZ fragment was cloned using PstI and SalI. This modified plasmid was named pLacD1.

Plasmid pLacD2 was derived from pLacD1. pLacD1 had an extra site for BbsI in one lac homology arm, which was removed by changing A to G using primers SDM_BbsI_F [5′-GGG ATA CGA CGA TAC CGA GGA CAG CTC ATG-3′ (underlined sequences define the position of the BbsI site and the boldface G is the base that has been changed from A by SDM)] and SDM_BbsI_R (5′-CAT CAG CTG TCC TCG GTA TCG TCG TAT CCC-3′) by site-directed mutagenesis, resulting in the plasmid pLacD2. A list of plasmids is provided in Table 1.

Building of long repeat arrays: Repeat arrays were generated in the plasmid pLacD2, which was further used to integrate the repeats in chromosomes. (CAG)5 and (CTG)5 repeats were introduced between the lac homology arms of pLacD2 by site-directed mutagenesis using primer pairs ExCAG-01 (5′-CTA TGA CCA TGG TCT CGC AGC AGC AGC AGC AGG TCT TCG TCG TTT TAC-3′), ExCTG-01 (5′-GTA AAA CGA CGA AGA CCT GCT GCT GCT GCT GCG AGA CCA TGG TCA TAG-3′), ExCAG-02 (5′-CTA TGA CCA TGG TCT CGC TGC TGC TGC TGC TGG TCT TCG TCG TTT TAC-3′), and ExCTG-02 (5′-GTA AAA CGA CGA AGA CCA GCA GCA GCA GCA GCG AGA CCA TGG TCA TAG-3′), removing the MfeI site. Another unique restriction site, HindIII, was used to perform double digestions. The pLacD2 plasmid containing the repeats was digested by BsaI and HindIII and by BbsI and HindIII, giving two fragments of 2774 and 3702 bp. The fragments containing the repeats were extracted from gels and ligated together to increase the repeat number.
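
The primer pair listed above can be sanity-checked in a few lines. The Python sketch below confirms that ExCAG-01 and ExCTG-01 are reverse complements of each other and that each carries a run of five repeats flanked by the two type IIS sites. The recognition sequences (GGTCTC for BsaI, GAAGAC for BbsI) are general knowledge about these enzymes rather than something stated in the text, and the helper names are illustrative only.

```python
# Primer sequences as printed in the text, with the spacing removed.
EX_CAG_01 = "CTATGACCATGGTCTCGCAGCAGCAGCAGCAGGTCTTCGTCGTTTTAC"
EX_CTG_01 = "GTAAAACGACGAAGACCTGCTGCTGCTGCTGCGAGACCATGGTCATAG"

# ASSUMPTION: recognition sequences supplied from general knowledge of the
# enzymes; the paper itself does not spell them out.
BSAI_SITE = "GGTCTC"
BBSI_SITE = "GAAGAC"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def longest_repeat_run(seq: str, unit: str) -> int:
    """Return the largest n for which `unit` repeated n times occurs in seq."""
    n = 0
    while unit * (n + 1) in seq:
        n += 1
    return n


if __name__ == "__main__":
    print("pair is reverse complementary:",
          reverse_complement(EX_CAG_01) == EX_CTG_01)
    print("CAG units in ExCAG-01:", longest_repeat_run(EX_CAG_01, "CAG"))
    print("CTG units in ExCTG-01:", longest_repeat_run(EX_CTG_01, "CTG"))
    print("BsaI site on ExCAG-01 strand:", BSAI_SITE in EX_CAG_01)
    print("BbsI site on ExCTG-01 strand:", BBSI_SITE in EX_CTG_01)
```

The same two helpers can be applied to the ExCAG-02 and ExCTG-02 pair.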

Integration into chromosome: Repeat sequences were integrated into the chromosome in both CAG and CTG orientations in the 5′ part of the lacZ gene using the pKO3 integration strategy (Link et al. 1997). Figure 1A shows the structure of the CAG-leading (CAG) orientation where CAG repeats are on the leading-strand template and CTG repeats are on the lagging-strand template. Following integration, the presence of the repeat tract and the absence of the lacL8 mutation were checked by PCR and sequencing. A list of bacterial strains is provided in Table 2.

Instability assay: Twelve parental colonies were tested in each assay. Each colony was grown overnight in LB with and without IPTG (2 mM) at 37° with shaking. The cultures were diluted 10⁶-fold in LB and 100 μl was plated onto LB plates. The plates were incubated at 37° overnight. For each parental colony, 10 sibling colonies were selected for analysis from IPTG and no-IPTG cultures. PCR was carried out to check the length of the repeat tract. No significant differences were observed in any of the mutants between instabilities in the presence or absence of IPTG, and the data, plus or minus IPTG, were pooled.

Figure 1. (A) The location of repeats integrated into the chromosome. Repeats were integrated at the 5′-end of the lacZ coding sequence to generate in-frame insertions of CAG or CTG codons. The construct shows the CAG orientation where CAG repeats are on the leading-strand template and CTG repeats are on the lagging-strand template. (B) Map of plasmid pLacD2, showing the restriction sites (BsaI and BbsI) between lac1 and lac2 homology arms, where repeats were introduced by site-directed mutagenesis, and HindIII, which was used for double digestions. (C–E) Examples showing the data output of GeneMapper. (C) The arrow points to a 373-bp peak, which corresponds to the repeat length of 75 (373 bp = 3 × 75 bp + 148 bp) as the PCR product size without repeats is 148 bp. The inset shows a magnification of the region around the 373-bp peak to illustrate the ‘‘stutter’’ bands more clearly. (D) The arrow points to a peak of 230 bp, which is a deletion of (CAG)75 to (CAG)27. (E) An example of a mixed colony showing a parental length of (CAG)75 with a 373-bp peak (arrow on the right) along with a deletion peak of 205 bp (arrow on the left), which corresponds to (CAG)19.

GeneMapper analysis of repeats: Repeat tracts were amplified using primers Ex-test-F (5′-TTA TGC TTC CGG CTC GTA TG-3′) and Ex-test-R (5′-GGC GAT TAA GTT GGG TAA CG-3′). Primer Ex-test-F was labeled with the fluorescence tag 6-Fam (Metabion). A size standard (GeneScan-500 LIZ from ABI) was added to the fluorescent PCR product(s), and fragments were resolved by capillary electrophoresis on a polyacrylamide medium in an ABI 3730 genetic analyzer. The results were analyzed by using GeneMapper software version 4. Characteristic result outputs are shown in Figure 1, C–E. Here it can be seen that, in addition to the main peaks characteristic of the repeat array lengths, several ‘‘stutter’’ peaks are observed. These represent deletions and expansions that have arisen during the PCR reaction and not in vivo.
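
The conversion between fragment size and repeat number used in the Figure 1 legend (a 148-bp product without repeats plus 3 bp per repeat unit) can be written as a small helper. This is a minimal sketch: the function names are illustrative, and rounding to the nearest whole repeat is an assumption made here because called peak sizes need not be exact multiples of 3 bp above the flank (the 230-bp peak, for example).

```python
FLANK_BP = 148  # PCR product size without repeats (Figure 1 legend)
UNIT_BP = 3     # one trinucleotide repeat unit


def repeats_from_fragment(size_bp: float,
                          flank_bp: int = FLANK_BP,
                          unit_bp: int = UNIT_BP) -> int:
    """Estimate the repeat number from a called fragment size.

    ASSUMPTION: the estimate is rounded to the nearest whole repeat.
    """
    if size_bp < flank_bp:
        raise ValueError("fragment is shorter than the repeat-free product")
    return round((size_bp - flank_bp) / unit_bp)


def fragment_from_repeats(n_repeats: int,
                          flank_bp: int = FLANK_BP,
                          unit_bp: int = UNIT_BP) -> int:
    """Expected fragment size for an array of n_repeats units."""
    return flank_bp + unit_bp * n_repeats


if __name__ == "__main__":
    for size in (373, 230, 205):  # peak sizes quoted in the Figure 1 legend
        print(size, "bp ->", repeats_from_fragment(size), "repeats")
```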

The instability proportion was defined as the proportion of sibling colonies that had a repeat length changed from the parental length. To avoid counting deletions that had arisen on the plates, mixed colonies (containing cells with parental and new lengths) were classified as parental in the instability proportion. However, these lengths were included in the analysis of deletion length distributions. Rare sibling colonies, derived from one parental colony containing the same length of deletion, were counted only once on the assumption that they were sister clones. Rare expansions of the repeat array were detected but have not been included in this analysis.
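
The scoring rules in the paragraph above can be made concrete with a short Python sketch. It is one possible reading rather than the authors' procedure: the text does not say how sister clones and expansions affect the denominator, so dropping them from both numerator and denominator is an assumption, and the `Sibling` record and function name are hypothetical.

```python
from typing import Iterable, NamedTuple, Tuple


class Sibling(NamedTuple):
    parent_id: str          # parental colony the sibling was derived from
    parental_length: int    # repeat units in the parental array
    peaks: Tuple[int, ...]  # repeat lengths called for this sibling colony


def instability_proportion(siblings: Iterable[Sibling]) -> float:
    """Proportion of scored sibling colonies carrying a new repeat length."""
    seen_deletions = set()
    unstable = 0
    scored = 0
    for s in siblings:
        lengths = set(s.peaks)
        if not lengths:
            raise ValueError("each sibling colony needs at least one peak")
        if s.parental_length in lengths:
            # Pure parental or mixed colony: classified as parental.
            scored += 1
            continue
        if max(lengths) > s.parental_length:
            # ASSUMPTION: expansions are left out of the tally entirely.
            continue
        key = (s.parent_id, tuple(sorted(lengths)))
        if key in seen_deletions:
            # ASSUMPTION: a repeated deletion from the same parental colony
            # is treated as a sister clone and not scored again.
            continue
        seen_deletions.add(key)
        unstable += 1
        scored += 1
    if scored == 0:
        raise ValueError("no scorable colonies")
    return unstable / scored


if __name__ == "__main__":
    # Hypothetical calls for ten siblings of one (CAG)75 parental colony.
    example = (
        [Sibling("p1", 75, (75,))] * 7
        + [Sibling("p1", 75, (75, 19))]   # mixed colony, scored as parental
        + [Sibling("p1", 75, (27,))] * 2  # same deletion twice, scored once
    )
    print(instability_proportion(example))
```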

Logistic regression models were fitted to the CAG and CTG arrays separately, using Genstat 8th edition, to compare the instability proportions in the different arrays. Approximate 95% confidence intervals were calculated for each estimated instability proportion as the mean ± 2 × standard error.
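
The interval rule quoted above (mean ± 2 × standard error) is easy to illustrate. In the Python sketch below, the standard error is a plain binomial one, which is a stand-in chosen for simplicity and not the standard error produced by the logistic regression models fitted in Genstat; the counts in the example are hypothetical.

```python
import math
from typing import Tuple


def binomial_se(unstable: int, total: int) -> float:
    """Simple binomial standard error of a proportion.

    ASSUMPTION: this replaces the model-based standard error that the
    logistic regression in the paper would supply.
    """
    if total <= 0 or not 0 <= unstable <= total:
        raise ValueError("need 0 <= unstable <= total and total > 0")
    p = unstable / total
    return math.sqrt(p * (1 - p) / total)


def approx_ci(estimate: float, se: float, k: float = 2.0) -> Tuple[float, float]:
    """Approximate 95% interval as estimate ± k × standard error (k = 2)."""
    return (estimate - k * se, estimate + k * se)


if __name__ == "__main__":
    # Hypothetical counts: 24 unstable colonies out of 240 scored.
    unstable, total = 24, 240
    p = unstable / total
    low, high = approx_ci(p, binomial_se(unstable, total))
    print(f"proportion = {p:.3f}, approx. 95% CI = ({low:.3f}, {high:.3f})")
```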

RESULTS

Strategy for the construction and analysis of expanded CAG·CTG trinucleotide repeat tracts in the E. coli chromosome: A strategy was developed to generate a set of uninterrupted repeat tracts of various lengths in both CAG-leading and CTG-leading orientations. CAG leading is defined here as the orientation where the CAG repeat tract is on leading-strand template while CTG leading is defined as the orientation where the CTG repeat tract is on leading-strand template. For simplicity, CAG leading and CTG leading will be described as ‘‘CAG’’ and ‘‘CTG,’’ respectively, from now onward in this article.

Both CAG and CTG repeats of length 5 were introduced between BsaI and BbsI restriction sites of plasmid pLacD2 (Figure 1B) by site-directed mutagenesis. This was followed by rounds of DNA restriction and ligation to construct longer repeat lengths. The recognition sites of BsaI and BbsI direct cleavage inside the repeat sequence (Figure 2B), so in every restriction and ligation round, there was a doubling of the repeat array length coupled to the loss of two repeats. This method follows the formula $n_x = 2n_{x-1} - 2$, where n is the number of repeat units in the repeat tract and x is the round of restriction and ligation.
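
To see how quickly the array grows under this recurrence, the short Python sketch below iterates it from the starting length of 5 repeats. The closed form in the second function is derived here from the recurrence (subtracting 2 from both sides gives a pure doubling) and is not stated in the text; the number of rounds shown is arbitrary.

```python
from typing import List


def repeat_lengths(n0: int, rounds: int) -> List[int]:
    """Iterate n_x = 2 * n_(x-1) - 2 starting from n0 repeat units."""
    lengths = [n0]
    for _ in range(rounds):
        lengths.append(2 * lengths[-1] - 2)
    return lengths


def repeat_length_closed_form(n0: int, x: int) -> int:
    """Closed form of the same recurrence: n_x = 2**x * (n0 - 2) + 2."""
    return 2 ** x * (n0 - 2) + 2


if __name__ == "__main__":
    # Starting from the (CAG)5 or (CTG)5 tract made by mutagenesis.
    print(repeat_lengths(5, 5))
```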

A schematic of the strategy is shown in Figure 2A (for details, see materials and methods). This strategy is similar to one used previously (Grabczyk and Usdin 1999; Krasilnikova and Mirkin 2004). However, it is simpler in that it requires the ligation of only two fragments (as opposed to three) and all the products of ligation carry the expanded repeat array without any requirement for multiple rounds of ligation or dephosphorylation.

The following strategy was used to measure the instability of the repeat arrays. In every instability assay, 12 parental colonies were taken and grown in the presence and absence of IPTG (2 mM). For each parental colony, 10 sibling colonies were analyzed. PCR was carried out across the repeat array and the length of the repeat tract was determined by running the PCR products on an ABI 3730 genetic analyzer (Applied Biosystems, Foster City, CA), which automatically detects and determines the sizes of DNA fragments based on electrophoretic separation. The data collected from the ABI 3730 Genetic Analyzer were analyzed using GeneMapper software version 4.0. The fragments were visualized as peaks on a graph as displayed in Figure 1.

Instability increases with increasing repeat length and depends on repeat orientation with respect to the direction of replication: Three different lengths of the two orientations of CAG·CTG repeats were initially studied in wild-type cells. In both orientations, instability was found to be dependent on repeat tract length as the proportion of instability increased with the length of the repeat tract. The instability proportion represents the frequency of sibling colonies that had a repeat length changed (expanded or contracted) from the parental length (see materials and methods for a more detailed description of the instability proportion). Figure 3A shows the instability proportions of all repeat lengths studied. The instability proportion for (CAG)84 was 31-fold higher than that for (CAG)45. Similarly, (CTG)140 had a 28-fold higher instability proportion than (CTG)48. Notably, the longest repeat array studied in the CAG orientation, (CAG)84, had an instability proportion 1.7-fold higher than that for the longest array of the CTG orientation, i.e., (CTG)140. It is the orientation where CAG repeats lie on the leading-strand template that is more unstable than the opposite orientation, where CTG repeats lie on the leading-strand template. The same orientation dependence with respect to the direction of replication was observed for CAG·CTG repeats inserted at the λ attB site (data not shown).

CAG·CTG repeat arrays are destabilized in a dnaQ mutant and orientation dependence is lost: As can be seen in Figure 3B, instability was increased for both repeat orientations, with a more pronounced effect on the CTG orientation in a PolIII (dnaQ) proofreading mutant. The instability proportion for (CTG)75 (the repeat length was deleted down to 75 during the construction of mutant strain) was increased 8-fold in a dnaQ mutant compared with that of (CTG)95 in wild type while (CAG)75 had an instability proportion only 1.6-fold higher than (CAG)75 in wild type (P = 0.008). Similar increases in instability were not observed in a mutS mutant (deficient in mismatch repair pathway) in the CTG orientation (Figure 3B), suggesting that the instability observed in the dnaQ mutant is likely to be attributable to its proofreading defect and not to a nonspecific effect of elevated mutagenesis. A small increase in instability proportion was observed for the mutS mutant in the CAG orientation (P = 0.050). Most interestingly, the orientation dependence of instability was lost in the dnaQ mutant but not in the mutS mutant. This result indicates that orientation dependence could be a direct consequence of different efficiencies of proofreading of CAG and CTG templates.

CAG repeat instability is reduced in an sbcCD mutant and CTG repeat instability is reduced in an sbcCD dnaQ double mutant: The effect of SbcCD was investigated for both CAG and CTG repeat orientations. The instability proportion for the (CAG)75 repeat array was 1.8-fold lower in the sbcCD mutant (P = 0.027). The stabilization observed in an sbcCD mutant was lost in an sbcCD dnaQ double mutant (Figure 3B).

The CTG orientation was too stable at the length studied to obtain sufficient data permitting a statistical distinction between wild-type and sbcCD mutants (Figure 3B). However, an effect of sbcCD could be measured in an sbcCD dnaQ mutant where a significant decrease in the instability proportion was observed relative to that in dnaQ, even though the repeat length was shorter in the dnaQ mutant (P = 0.004). In contrast to the dnaQ mutant, orientation dependence was retained in sbcCD dnaQ (P < 0.001), suggesting that intermediates in the CTG orientation pathway can escape deletion in the absence of SbcCD and proofreading more easily than intermediates in the CAG orientation pathway.

Large deletions predominate over small deletions insbcCD and sbcCD dnaQ mutants in the CAG orientationand in a dnaQ mutant in both repeat orientations: Tosee the sizes of deletions obtained in CAG�CTG repeatsin wild-type and mutant cells, all observed deletionswere plotted as a function of the percentage of deletionsize against the number of events (Figure 4). Both CAGand CTG repeat deletion distributions are negativelyskewed as seen from the long tails toward the left in


Figure 4. The median (midpoint of the distribution) for (CAG)75 comes at 65% and for (CAG)84 at 70%. The (CTG)95 deletion distribution has a median at 58% while, for (CTG)140, it can be seen at 61%. These distributions suggest that large deletions are more frequent than small deletions, consistent with the existence of intermediates comprising many repeats as would be predicted if large hairpins could form. Given the low frequency of deletions and their origins in populations grown from single cells carrying the original length of repeat array, the vast majority of deletions will have arisen in single events from arrays of parental length. The distributions of deletion lengths in CAG repeat tracts are not influenced substantially by mutations in dnaQ or sbcCD (Figure 4), suggesting that these genes do not affect the nature of the primary intermediate in the pathway but instead the frequency of its processing to a product with a new repeat length. In the CTG repeat orientation, the number of events observed makes a comparison most meaningful between the dnaQ and dnaQ sbcCD mutants. Here, the negatively skewed distribution observed in dnaQ (median 63%) disappears in dnaQ sbcCD, giving a flat distribution with a median of 51% (Figure 4). This suggests that, contrary to the CAG orientation, the presence of SbcCD (in the absence of proofreading) favors the formation of large deletions.
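The medians and the "negatively skewed" description above can be illustrated with a short calculation. This is a sketch only: the observed repeat lengths are hypothetical, and the moment-based skewness coefficient is an assumed summary statistic, not one named in the text.

```python
import statistics

def percent_deleted(parental_repeats, observed_repeats):
    """Deletion size as a percentage of the parental tract length."""
    return 100.0 * (parental_repeats - observed_repeats) / parental_repeats

def sample_skewness(values):
    """Moment-based skewness; negative values indicate a long left tail."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical repeat lengths recovered from a (CAG)75 parent (not data from the paper).
observed = [70, 60, 30, 28, 26, 25, 24, 22, 20, 15]
sizes = [percent_deleted(75, r) for r in observed]

print("median % deleted:", statistics.median(sizes))
print("skewness:", sample_skewness(sizes))
```

In this toy data set most events remove well over half of the tract and a few remove very little, so the median is high and the skewness is negative, the same qualitative pattern the text describes for Figure 4.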

Mutations in recombination genes recA, recB, and recF do not influence CAG·CTG repeat instability: It can be seen in Figure 5 that mutations in the recombination genes recA, recB, and recF do not affect the proportions of instability for (CAG)75 and (CTG)95 repeat tracts. This suggests that recombination does not contribute to the instability of CAG·CTG repeats in the E. coli chromosome at the lengths studied in this work.

Transcription does not affect CAG·CTG repeat instability: To analyze the influence of transcription on the instability of CAG·CTG repeats, the repeats were integrated in the 5′-end of the lacZ gene in the wild-type strain that also bears a lacIq repressor gene. Under uninduced conditions (the absence of inducer IPTG), transcription from the lacZ promoter is repressed by the LacI repressor, while growing the cells in IPTG induces transcription. In the CTG orientation, the transcribed strand is the CTG strand, which is also the leading-strand template, while in the CAG orientation it is the CAG repeat-containing strand that acts as leading-strand template and the transcribed strand. The repeat lengths studied in both orientations showed no difference in the proportion of instability in the presence or absence of IPTG (Figure 6). To obtain further evidence for or against a potential effect of transcription and to specifically test whether stalling of transcription complexes might affect instability, the effect of the transcription-coupled repair factor Mfd was investigated. As shown in Figure 7, mfd mutants do not show a difference in proportion of instability as a function of induction of transcription with IPTG or as compared to their corresponding wild-type levels in either repeat orientation, demonstrating that in this system rescue of stalled RNA polymerases by Mfd protein is not involved in CAG·CTG repeat instability.
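A comparison such as "no difference in the proportion of instability in the presence or absence of IPTG" amounts to comparing two proportions. The statistical method used by the authors is not described in this section, so the sketch below uses a two-sided Fisher's exact test purely as an assumed stand-in, with hypothetical clone counts.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows are conditions (e.g., minus and plus IPTG); columns are clones with
    changed and unchanged repeat length.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x changed clones in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    observed = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= observed * (1 + 1e-9))
    return min(1.0, p)

# Hypothetical counts: 12 of 240 clones changed without IPTG, 14 of 240 with IPTG.
print(fisher_exact_two_sided(12, 228, 14, 226))
```

A large P value from such a test would be consistent with, but would not prove, the absence of a transcription effect at the sample sizes used.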

DISCUSSION

CAG·CTG trinucleotide repeat instability is known to be length and orientation dependent in E. coli. Here, we have devised a polymerization-independent strategy for the expansion of repeat arrays in vitro and have applied this to the generation and insertion of CAG·CTG

TABLE 1

Plasmids used in the study

Plasmid | Characteristics | Source | DL no.
pTOF24 | repAts sacB Cmr; used for SalI–PstI cloning | Millicent Masters | 1605
pLacD1 | pTOF24 derivative; contains BbsI, MfeI, and BsaI sites in center of two 400-bp lac homology arms, lacL8 | This study | 1823
pLacD2 | pLacD1 derivative; BbsI site in lacZ homology arm removed | This study | 2911
pLacD2 (CAG)5 | pLacD2 derivative; (CAG)5 in place of MfeI site | This study | 1816
pLacD2 (CAG)8 | pLacD2 derivative; (CAG)8 in place of MfeI site | This study | 1892
pLacD2 (CTG)8 | pLacD2 derivative; (CTG)8 in place of MfeI site | This study | 1893
pLacD2 (CAG)14 | pLacD2 derivative; (CAG)14 in place of MfeI site | This study | 1899
pLacD2 (CAG)26 | pLacD2 derivative; (CAG)26 in place of MfeI site | This study | 1894
pLacD2 (CTG)26 | pLacD2 derivative; (CTG)26 in place of MfeI site | This study | 1895
pLacD2 (CAG)28 | pLacD2 derivative; (CAG)28 in place of MfeI site | This study | 1900
pLacD2 (CTG)28 | pLacD2 derivative; (CTG)28 in place of MfeI site | This study | 1901
pLacD2 (CAG)50 | pLacD2 derivative; (CAG)50 in place of MfeI site | This study | 1911
pLacD2 (CTG)50 | pLacD2 derivative; (CTG)50 in place of MfeI site | This study | 1912
pLacD2 (CAG)98 | pLacD2 derivative; (CAG)98 in place of MfeI site | This study | 2912
pLacD2 (CTG)98 | pLacD2 derivative; (CTG)98 in place of MfeI site | This study | 2913
pTOF24-mfd | pTOF24 derivative to integrate mfd deletion | This study | 2519

"DL" indicates strains that are constructed in the Leach Lab.


repeats in the E. coli chromosome (Figure 2). Using this new model system, we have investigated the basis of the orientation dependence of instability.

CAG·CTG trinucleotide repeats show length- and orientation-dependent instability in the E. coli chromosome: We show that CAG·CTG repeat instability in the E. coli chromosome increases with increasing repeat tract length and that instability is orientation dependent (Figure 3A). In humans, length dependence of repeat array mutation underlies the phenomenon of dynamic mutation where expanded repeat arrays have an increased probability of further expansion leading to anticipation in the inheritance of disease phenotypes. The orientation in which elevated instability is observed is that where the CTG repeat lies on the template for the lagging strand of the replication fork. This is the same orientation dependence with respect to replication as observed in previous studies using bacterial plasmids and yeast (Kang et al. 1995; Rosche et al. 1995; Maurer et al. 1996; Freudenreich et al. 1997; Miret et al. 1998), but differs from a recent study of repeat instability in the E. coli chromosome (Kim et al. 2006) where the instability observed shows variable length and orientation dependence. The authors have used the same assay in both plasmid and chromosomal contexts and observed no length or orientation dependence in the plasmid. The system of Kim et al. (2006) also differs from ours in that it detects only a subset of deletion events that gives rise to chloramphenicol resistance and complete deletion of the repeat array is a common outcome. Since we do not observe complete deletion of the repeat array, it seems that the two assays are detecting different molecular events.

In both E. coli plasmids and yeast, instability is replicative and is strongly biased toward deletions, but in mice instability can occur in nondividing cells (Takano et al. 1996; Hashida et al. 1997; Kovtun and McMurray 2001) and can be influenced by the position of the transgenic insert (Monckton et al. 1997; Seznec et al. 2000). This correlates with the expansion patterns observed in humans (Seznec et al. 2000). However, the observation of nonreplicative instability in a mammalian system does not exclude replicative sources of instability or the importance of replicative stability in mammals. Furthermore, the nonreplicative instability itself is likely to involve DNA synthesis. This is particularly

TABLE 2

E. coli strains

Strain | Genotype | Source, reference, or construction
DL732 | F− thr-1 leuB6 proA2 his4 thi1 argE3 lacY1 galK2 rpsL supE44 ara-14 xyl-15 mtl-1, txs-33) sbcCD::Km | Leach Lab
CSH115 | ara Δ(gpt-lac)5 rpsL mutS::mini-Tn10 | Cold Spring Harbor Lab
DB1318 | recD1014 hsdR2 zjj-202::Tn10 recA::Cm | Wertman et al. (1986)
RM6972 | dnaQ::mini-Tn10 | Genevieve Maenhaut-Michel
JJC450 | recF400::Tn5 (KmR) | Benedicte Michel
JJC1086 | recB::Km | Benedicte Michel
DL1786 | MG1655 lacZχ− lacIq ZeoR χ+ | John Eykelenboom
DL1994 | DL1786 lacZ::(CTG)48 | This study
DL1995 | DL1786 lacZ::(CAG)75 | This study
DL2009 | DL1786 lacZ::(CTG)95 | This study
DL2079 | DL2009 recA::Cm | This study (P1 from DB1318)
DL2080 | DL2009 recB::Km | This study (P1 from JJC1086)
DL2081 | DL2009 recF400::Km | This study (P1 from JJC450)
DL2104 | DL2009 sbcCD::Km | This study (P1 from DL732)
DL2250 | DL1786 lacZ::(CAG)45 lacL8 | This study
DL2300 | DL2009 mutS::Tc | This study (P1 from CSH115)
DL2301 | DL1995 dnaQ::Tc | This study (P1 from RM6972)
DL2302 | DL1995 mutS::Tc | This study (P1 from CSH115)
DL2303 | DL1995 sbcCD::Km | This study (P1 from DL732)
DL2304 | DL1995 recA::Cm | This study (P1 from DB1318)
DL2305 | DL2009 expansion to (CTG)140 | This study
DL2437 | DL1995 recB::Km | This study (P1 from JJC1086)
DL2445 | DL1786 lacZ::(CTG)75 dnaQ::Tc | This study (P1 from RM6972)
DL2639 | DL1786 lacZ::(CAG)84 | This study
DL2831 | DL1995 mfd− | This study (using pDL2519)
DL2915 | DL2009 mfd− | This study (using pDL2519)
DL2976 | DL2303 dnaQ::Tc | This study (P1 from RM6972)
DL3046 | DL2104 dnaQ::Tc | This study (P1 from RM6972)
DL3138 | DL1786 lacZ::(CAG)80 recF400::Km | This study (P1 from JJC450)

P1 represents construction by P1 transduction. DL strains are strains constructed in the Leach Lab.


clear in the case of nonreplicative expansion where expansion in the absence of accompanying contraction implies net DNA synthesis.

The proofreading subunit of DNA polymerase III (DnaQ) determines orientation dependence of replicative instability in cells with active SbcCD nuclease: We demonstrate that a mutation in dnaQ destabilizes CAG·CTG repeat arrays and that orientation dependence of instability is lost in this mutant. This is a specific effect of dnaQ mutation since a mutation in mutS does not have a corresponding effect (Figure 3B). Previous studies suggested that the orientation dependence of CAG·CTG repeat instability is caused by the dynamics of lagging-strand DNA synthesis accompanied by the greater thermodynamic stability of CTG repeat hairpins compared to CAG repeat hairpins (Maurer et al. 1996; Freudenreich et al. 1997; Miret et al. 1998). However, it has not previously been shown how this process is mediated. Here, we suggest that proofreading is inefficient on the CTG repeat template of the lagging strand (CAG orientation), leading to orientation-dependent instability. It is the more foldable CTG repeat strand that behaves as if it is more refractory to proofreading. Although we favor the interpretation that the dnaQ mutation destabilizes the repeat tract because of its effect on proofreading, we cannot exclude the possibility that other indirect effects of the dnaQ mutation contribute or are responsible.

Our work is consistent with the previous observation that the dnaQ49 ts mutation destabilizes CTG repeats in bacterial plasmids (Iyer et al. 2000) but diverges from experiments demonstrating that proofreading mutants of DNA polymerases δ and ε do not destabilize these repeats in yeast (Schweitzer and Livingston 1999). The yeast results are interesting, given that homo- and dinucleotide repeats are destabilized in these mutants (Strand et al. 1993; Tran et al. 1997), suggesting some particular resistance to proofreading of CAG·CTG triplet repeats by polymerases δ and ε. In this context, it should be noted that DnaQ has been shown to share sequence homology with human DNA editing enzyme DNase III (Hoss et al. 1999). This enzyme is present in equal amounts in nondividing and proliferating cells, which suggests that it is involved in repair processes as well as in replication. An alternative is that eukaryotic cells might correct trinucleotide repeat slippage by sharing proofreading activities between polymerases (Pavlov et al. 2006). So, it is plausible that proofreading during replication and repair in human cells may contribute to repeat stability.

A number of studies have documented the stabilizing and destabilizing effects of MutS and its homologs on CAG·CTG repeat instability in E. coli, yeast, and mouse (Jaworski et al. 1995; Schumacher et al. 1998; Manley et al. 1999; Schmidt et al. 2000). In this system, we observe a small destabilizing effect of the mutS mutation in one orientation (CAG on leading-strand template) that lies on the border of significance and no significant effect in the other orientation (CTG on leading-strand template). The absence of a substantial effect of mutS on the frequency of deletion formation is consistent with our observation here that deletions are distributed over

Figure 2.—Schematic of the strategy for building long repeat arrays. (A) (CAG)5 and (CTG)5 were introduced between BsaI and BbsI restriction sites by site-directed mutagenesis. Double digestions of plasmid DNA were carried out using BbsI plus HindIII or BsaI plus HindIII. The bands were separated on a 1% agarose gel. The fragments containing repeats (A and B) were extracted from a gel and ligated to obtain a longer repeat tract length than in the original plasmid. (B) The restriction sites of BsaI and BbsI direct cleavage inside the repeat sequence so every new repeat tract length will be twice the original tract length minus 2 repeats. As shown, after cleavage by BsaI and BbsI and ligation, (CAG)5 gives rise to (CAG)8.
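The doubling rule stated in the legend (new tract length equals twice the original length minus 2 repeats) can be iterated to see which tract lengths successive rounds would produce. The sketch below is only an arithmetic illustration; the function names are hypothetical and it does not model the cloning itself.

```python
def next_tract_length(n_repeats):
    """One round of cleavage and ligation: twice the tract length minus 2 repeats."""
    return 2 * n_repeats - 2

def doubling_series(start, rounds):
    """Tract lengths after 0, 1, ..., `rounds` rounds of doubling."""
    lengths = [start]
    for _ in range(rounds):
        lengths.append(next_tract_length(lengths[-1]))
    return lengths

print(doubling_series(5, 5))  # [5, 8, 14, 26, 50, 98]
```

Starting from 5 repeats, five rounds give 8, 14, 26, 50, and 98 repeats, which matches the (CAG)5, (CAG)8, (CAG)14, (CAG)26, (CAG)50, and (CAG)98 plasmids listed in Table 1.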


a wide range of sizes, while mismatch repair can correct only small insertion/deletion loops of up to three nucleotides.

The SbcCD nuclease increases CAG repeat instability when proofreading is active and CTG repeat instability when proofreading is inactive: We show that CAG·CTG repeat instability is reduced in an sbcCD mutant when the CTG-containing strand is the template for the lagging strand of the replication fork (CAG orientation). This stabilizing effect of an sbcCD mutation is lost in a dnaQ sbcCD double mutant. This result argues for antagonistic action of SbcCD and the proofreading subunit of DNA polymerase III. It is consistent with the existence of an SbcCD-dependent pathway of deletion formation for the CAG orientation that is significant only in the presence of proofreading. This may be because the action of SbcCD is antagonistic to proofreading through removal of the structure signaling the need to proofread.

In the CTG orientation (CAG on the lagging-strand template), SbcCD plays an active role in stimulating deletions in the absence of proofreading. This is evidenced by the small but significant decrease in instability in an sbcCD dnaQ strain compared to dnaQ and the shift from a skewed distribution of deletion sizes (in dnaQ) to a flat distribution (in sbcCD dnaQ) for the CTG orientation. Furthermore, the observation of orientation dependence in sbcCD dnaQ (but not in dnaQ) implies a role of SbcCD in removing orientation dependence in the absence of proofreading. These observations are consistent with SbcCD having access to an intermediate in the CTG deletion pathway (CAG on the lagging-strand template) in the absence of dnaQ and stimulating its conversion to a deletion product rather than its return to a parental template. It has recently been shown that fluorescently tagged fusions of Bacillus subtilis SbcC localized with a pattern similar to that of the replication factory, consistent with action of SbcCD at the site of DNA replication (Meile et al. 2006). A similar colocalization of fluorescently tagged SbcC with a replication factory protein has also been observed in E. coli (E. Darmon, personal communication).

CAG·CTG repeat instability in the chromosome is not caused by recombination: We demonstrate that, at the lengths studied here, CAG·CTG repeat instability in the E. coli chromosome is not affected by mutations in recombination genes recA, recB, and recF. These findings are interesting since recombination has been reported to influence instability of repeats in plasmids (Jakupciak and Wells 1999, 2000a,b; Napierala et al. 2002; Pluciennik et al. 2002; Hashem et al. 2004; Hebert et al. 2004). Further work is required to reconcile these observations.

Transcription does not influence CAG·CTG instability in the E. coli chromosome: We report another observation that contrasts with several previous plasmid studies. We show here that, at the repeat lengths studied, instability in the chromosome is not affected by transcription. Previously, transcription was reported to affect plasmid-borne CAG·CTG repeat instability (Bowater et al. 1997; Schumacher et al. 2001; Mochmann and Wells 2004) although one study did not detect such an effect (Schmidt et al. 2000). As we observed no effect of transcription, we reasoned that the absence of effect might be caused by the enzymatic removal of stalled transcription complexes before they were able to affect instability. We therefore tested the effect of an mfd mutant on instability. Mfd protein, a transcription-repair coupling factor, ensures the repair of DNA damage in transcribed strands of active genes. It can bind DNA, RNA polymerase, and the UvrA protein, removes RNA polymerase from the DNA, and recruits the excision repair apparatus to the damaged site. It is also required in the removal of stalled transcription complexes (Park et al. 2002). A mutation in the mfd gene did not

Figure 3.—(A) Instability proportion of different repeat lengths for CAG and CTG orientations. The instability proportion is the frequency of sibling colonies that had a repeat length changed from the parental length. Each bar represents the pooled data of two independent assays and corresponds to the individual analysis of 480 clones by capillary electrophoresis and GeneMapper. The error bars show 95% confidence intervals. (B) Instability proportions for dnaQ, mutS, sbcCD, and sbcCD dnaQ mutants of CAG·CTG repeats compared with wild-type cells. The CAG repeat tract length studied in wild-type and all mutants was 75 and the CTG repeat tract length was 95 except in the dnaQ mutant where it was 75. Each bar represents the proportion of instability (pooled data of two independent assays of 240 clones each). The error bars show 95% confidence intervals.


significantly change the proportion of instability in the CAG or CTG repeat orientations, suggesting no role of the transcription-repair coupling factor Mfd in repeat instability at the lengths studied. This result lends no support to the hypothesis that stalled transcription complexes influence CAG·CTG repeat instability in E. coli (Kim et al. 2006).

Evidence for the influence of secondary structures on instability in vivo: Several of our observations lend weight to the hypothesis that secondary structures do influence the instability of CAG·CTG repeat arrays. First, in the CAG orientation (CTG on the lagging-strand template), the distribution of sizes of deletion products is negatively skewed, consistent with a preference for

Figure 4.—Distributions of deletion sizes in cells containing CAG·CTG repeats. The deletions observed are plotted as the percentage of the tract deleted against the number of times that the particular deletions were observed.


large deletions comprising enough repeats to form stable hairpins. This is independent of any of the genotypes tested here and argues for the formation of hairpins stable enough to influence the spectrum of deletion products irrespective of the presence of SbcCD. Second, we observe an effect of SbcCD on the frequency of instability in the orientation predicted to form the more stable CTG repeat hairpins, and we know that SbcCD is a hairpin nuclease that has been shown to cleave CTG repeat hairpins (Connelly et al. 1999). Third, we observe a negatively skewed distribution of deletion products for the CTG orientation (CAG on the lagging-strand template) in the presence of SbcCD and the absence of proofreading, suggesting the existence of less stable secondary structures in this orientation that require the presence of SbcCD to manifest themselves as deletions. Longer repeat arrays give rise to longer deletion products, arguing against the formation of a specific structure composed of a set number of repeats. Instead, it would appear that larger secondary structures are free to form within longer repeat arrays. The strongest evidence for secondary structures comes from the skewed distributions of deletion products as these reflect the nature of the slippage intermediates. Any arguments based on the action of SbcCD as a hairpin nuclease must be moderated by the observation that SbcCD can affect the nature of a deletion event between 101-bp direct repeats, even in the absence of an inverted repeat sequence (Bzymek and Lovett 2001a,b).

A model for replicative instability of CAG·CTG repeats: Combining the results of sbcCD and dnaQ mutations, we propose a model to explain the orientation dependence of CAG·CTG repeat instability during replication (Figure 8). Orientation dependence is determined by proofreading of slippage intermediates formed during replication of the lagging strand. We suggest that the CTG repeat template for the lagging strand is partially refractory to proofreading, leading to elevated frequencies of deletions in wild-type cells. Intermediates in the slippage reaction in this orientation of the repeat array are accessible to the SbcCD nuclease, which can increase instability by digesting the strands that signal the presence of a substrate for proofreading. In the absence of proofreading, SbcCD can no longer affect instability and we suggest that this is because its effect is to divert intermediates from effective proofreading. An alternative possibility is that cleavage with SbcCD is not possible in a proofreading mutant. The latter hypothesis is made less likely with the observation of an effect of SbcCD on instability in a

Figure 5.—Instability proportions of recA, recB, and recF mutants containing CAG·CTG repeats. The length studied for CTG repeats is 95 in wild type and mutants. For CAG repeats, it is 75 for all mutants except in recF, where it is 80. Each bar represents the instability proportion calculated from the data of two independent experiments (480 clones). Error bars represent 95% confidence intervals.

Figure 6.—Proportions of instability for CAG·CTG repeats in the presence or absence of IPTG. The cells were grown overnight in the presence and absence of IPTG (2 mM). The x-axis shows the repeat lengths for CAG and CTG repeat orientations. The bars represent the data from two independent assays and correspond to the individual analyses of 240 clones. Error bars represent 95% confidence intervals.

Figure 7.—Instability proportions of (CAG)75 and (CTG)95 repeats in mfd mutants in the presence or absence of IPTG. Each bar represents the pooled data of two independent experiments (240 clones). Error bars represent 95% confidence intervals.


dnaQ mutant when CAG repeats are the template for the lagging strand. In this orientation, we hypothesize that more unstable intermediates are formed and that the effect of SbcCD is to divert them from a proofreading-independent pathway of return to parental length. The model is explained in detail in the legend of Figure 8.

Conclusion: CAG·CTG repeat instability in the E. coli chromosome shows length- and orientation-dependent behavior consistent with preferential deletion when CTG repeats are the template for the lagging strand. Deletion patterns are consistent with the formation of hairpins that are sufficiently stable to influence the spectrum of products for this unstable orientation irrespective of the gene products investigated here. In the more stable orientation, the spectrum of deletion products is also consistent with secondary structure formation, but this is revealed only in the presence of the hairpin nuclease SbcCD (Rad50/Mre11). The observed patterns of CAG·CTG instability provide direct evidence for the formation of secondary structures in vivo. These structures must be large and their sizes determined by the lengths of the repeat arrays. In cells containing the active SbcCD nuclease, orientation dependence of replicative CAG·CTG repeat instability is determined by DnaQ, the proofreading subunit of DNA polymerase III. These results demonstrate an interaction between the proofreading of slippage intermediates and their processing by the SbcCD nuclease that affects replicative instability of repeated sequences. Whether similar processing reactions are important in the replicative or nonreplicative pathways of trinucleotide repeat instability in human cells remains to be determined.

Figure 8.—Model for orientation-dependent replicative instability. DNA synthesis arrests within a CAG or CTG repeat array (see DNA structure 1) and strand slippage occurs either accompanied (b) or not (c) with stable secondary structure formation (a). A template containing CAG repeats will form less stable secondary structures (see DNA structure 2) and so primarily will adopt "deletion pathway A" whereas a template with CTG repeats will form more stable secondary structures (see DNA structure 3 and DNA structure 4) and so primarily will adopt "deletion pathway B." Some interconversion between well-folded and less well-folded structures (and between differently folded structures, not shown) may also be possible (interchange d). The observation that orientation dependence of repeat instability is lost in a dnaQ mutant implies that the initiation of slippage is independent of whether the template contains CAG or CTG repeats. This has the further implication that initiation of slippage occurs independently of the potential of the strand to form stable secondary structures. The requirement for the presence of SbcCD for loss of orientation dependence in a dnaQ mutant implies that SbcCD ensures the efficient processing of slippage intermediates initiated on both CAG and CTG repeats. Orientation dependence is generated by poor proofreading of the CTG template strand. In the model, this is envisaged to be because 3′–5′ exonucleolytic proofreading does not remove a secondary structure if it is present (compare f with e) and so new synthesis has the potential to slip again on such a template (DNA structure 3 to DNA structure 4). In the absence of proofreading, intermediates with stable structures are committed to deletion (whether or not SbcCD is present) while intermediates with less stable secondary structures can escape through an inefficient disassembly pathway (g) that is significant only in the absence of SbcCD (and of proofreading).


We thank Camelia Mihaescu and Siarhei Mankou (past lab members) who obtained initial results demonstrating orientation dependence and the importance of proofreading in chromosomal instability using a set of CAG·CTG repeats they constructed and analyzed at the λ attB site. In addition, we thank Ewa Okely for technical support, Elise Darmon for critical reading of the manuscript, John Eykelenboom for providing the strain DL1786, and the School of Biological Sciences sequencing service for DNA fragment analysis. R.Z. holds a Ph.D. studentship funded by the Commonwealth Scholarship Commission, UK. The work is supported by a grant from the Medical Research Council to D.R.F.L.

LITERATURE CITED

Bacolla, A., R. Gellibolian, M. Shimizu, S. Amirhaeri, S. Kang et al., 1997 Flexible DNA: genetically unstable CTG.CAG and CGG.CCG from human hereditary neuromuscular disease genes. J. Biol. Chem. 272: 16783–16792.

Bowater, R. P., A. Jaworski, J. E. Larson, P. Parniewski and R. D. Wells, 1997 Transcription increases the deletion frequency of long CTG.CAG triplet repeats from plasmids in Escherichia coli. Nucleic Acids Res. 25: 2861–2868.

Bzymek, M., and S. T. Lovett, 2001a Evidence for two mechanisms of palindrome-stimulated deletion in Escherichia coli: single-strand annealing and replication slipped mispairing. Genetics 158: 527–540.

Bzymek, M., and S. T. Lovett, 2001b Instability of repetitive DNA sequences: the role of replication in multiple mechanisms. Proc. Natl. Acad. Sci. USA 98: 8319–8325.

Bzymek, M., C. J. Saveson, V. V. Feschenko and S. T. Lovett, 1999 Slipped misalignment mechanisms of deletion formation: in vivo susceptibility to nucleases. J. Bacteriol. 181: 477–482.

Callahan, J. L., K. J. Andrews, V. A. Zakian and C. H. Freudenreich, 2003 Mutations in yeast replication proteins that increase CAG/CTG expansions also increase repeat fragility. Mol. Cell. Biol. 23: 7849–7860.

Connelly, J. C., and D. R. Leach, 1996 The sbcC and sbcD genes of Escherichia coli encode a nuclease involved in palindrome inviability and genetic recombination. Genes Cells 1: 285–291.

Connelly, J. C., E. S. de Leau, E. A. Okely and D. R. Leach, 1997 Overexpression, purification, and characterization of the SbcCD protein from Escherichia coli. J. Biol. Chem. 272: 19819–19826.

Connelly, J. C., E. S. de Leau and D. R. Leach, 1999 DNA cleavage and degradation by the SbcCD protein complex from Escherichia coli. Nucleic Acids Res. 27: 1039–1046.

Cummings, C. J., and H. Y. Zoghbi, 2000 Fourteen and counting: unraveling trinucleotide repeat diseases. Hum. Mol. Genet. 9: 909–916.

Darlow, J. M., and D. R. Leach, 1995 The effects of trinucleotide repeats found in human inherited disorders on palindrome inviability in Escherichia coli suggest hairpin folding preferences in vivo. Genetics 141: 825–832.

Freudenreich, C. H., J. B. Stavenhagen and V. A. Zakian,1997 Stability of a CTG/CAG trinucleotide repeat in yeast is de-pendent on its orientation in the genome. Mol. Cell. Biol. 17:2090–2098.

Freudenreich, C. H., S. M. Kantrow and V. A. Zakian, 1998 Ex-pansion and length-dependent fragility of CTG repeats in yeast.Science 279: 853–856.

Gacy, A. M., G. Goellner, N. Juranic, S. Macura and C. T. McMurray,1995 Trinucleotide repeats that expand in human disease formhairpin structures in vitro. Cell 81: 533–540.

Grabczyk, E., and K. Usdin, 1999 Generation of microgram quan-tities of trinucleotide repeat tracts of defined length, intersper-sion pattern, and orientation. Anal. Biochem. 267: 241–243.

Hashem, V. I., W. A. Rosche and R. R. Sinden, 2004 Genetic recom-bination destabilizes (CTG)n.(CAG)n repeats in E. coli. Mutat.Res. 554: 95–109.

Hashida, H., J. Goto, H. Kurisaki, H. Mizusawa and I. Kanazawa,1997 Brain regional differences in the expansion of a CAG re-peat in the spinocerebellar ataxias: dentatorubral-pallidoluysianatrophy, Machado-Joseph disease, and spinocerebellar ataxiatype 1. Ann. Neurol. 41: 505–511.

Hebert, M. L., L. A. Spitz and R. D. Wells, 2004 DNA double-strandbreaks induce deletion of CTG.CAG repeats in an orientation-dependent manner in Escherichia coli. J. Mol. Biol. 336: 655–672.

Hoss, M., P. Robins, T. J. Naven, D. J. Pappin, J. Sgouros et al.,1999 A human DNA editing enzyme homologous to the Escher-ichia coli DnaQ/MutD protein. EMBO J. 18: 3868–3875.

Iyer, R. R., and R. D. Wells, 1999 Expansion and deletion of triplet repeat sequences in Escherichia coli occur on the leading strand of DNA replication. J. Biol. Chem. 274: 3865–3877.

Iyer, R. R., A. Pluciennik, W. A. Rosche, R. R. Sinden and R. D. Wells, 2000 DNA polymerase III proofreading mutants enhance the expansion and deletion of triplet repeat sequences in Escherichia coli. J. Biol. Chem. 275: 2174–2184.

Jakupciak, J. P., and R. D. Wells, 1999 Genetic instabilities in (CTG.CAG) repeats occur by recombination. J. Biol. Chem. 274: 23468–23479.

Jakupciak, J. P., and R. D. Wells, 2000a Gene conversion (recombination) mediates expansions of CTG·CAG repeats. J. Biol. Chem. 275: 40003–40013.

Jakupciak, J. P., and R. D. Wells, 2000b Genetic instabilities of triplet repeat sequences by recombination. IUBMB Life 50: 355–359.

Jaworski, A., W. A. Rosche, R. Gellibolian, S. Kang, M. Shimizu et al., 1995 Mismatch repair in Escherichia coli enhances instability of (CTG)n triplet repeats from human hereditary diseases. Proc. Natl. Acad. Sci. USA 92: 11019–11023.

Kang, S., A. Jaworski, K. Ohshima and R. D. Wells, 1995 Expansion and deletion of CTG repeats from human disease genes are determined by the direction of replication in E. coli. Nat. Genet. 10: 213–218.

Kaytor, M. D., E. N. Burright, L. A. Duvick, H. Y. Zoghbi and H. T. Orr, 1997 Increased trinucleotide repeat instability with advanced maternal age. Hum. Mol. Genet. 6: 2135–2139.

Kim, S. H., M. J. Pytlos, W. A. Rosche and R. R. Sinden, 2006 (CAG)*(CTG) repeats associated with neurodegenerative diseases are stable in the Escherichia coli chromosome. J. Biol. Chem. 281: 27950–27955.

Kovtun, I. V., and C. T. McMurray, 2001 Trinucleotide expansion in haploid germ cells by gap repair. Nat. Genet. 27: 407–411.

Krasilnikova, M. M., and S. M. Mirkin, 2004 Replication stalling at Friedreich’s ataxia (GAA)n repeats in vivo. Mol. Cell. Biol. 24: 2286–2295.

Link, A. J., D. Phillips and G. M. Church, 1997 Methods for generating precise deletions and insertions in the genome of wild-type Escherichia coli: application to open reading frame characterization. J. Bacteriol. 179: 6228–6237.

Manley, K., T. L. Shirley, L. Flaherty and A. Messer, 1999 Msh2 deficiency prevents in vivo somatic instability of the CAG repeat in Huntington disease transgenic mice. Nat. Genet. 23: 471–473.

Maurer, D. J., B. L. O’Callaghan and D. M. Livingston, 1996 Orientation dependence of trinucleotide CAG repeat instability in Saccharomyces cerevisiae. Mol. Cell. Biol. 16: 6617–6622.

McMurray, C. T., and I. V. Kortun, 2003 Repair in haploid male germ cells occurs late in differentiation as chromatin is condensing. Chromosoma 111: 505–508.

Meile, J. C., L. J. Wu, S. D. Ehrlich, J. Errington and P. Noirot, 2006 Systematic localisation of proteins fused to the green fluorescent protein in Bacillus subtilis: identification of new proteins at the DNA replication factory. Proteomics 6: 2135–2146.

Merlin, C., S. McAteer and M. Masters, 2002 Tools for characterization of Escherichia coli genes of unknown function. J. Bacteriol. 184: 4573–4581.

Miret, J. J., L. Pessoa-Brandao and R. S. Lahue, 1998 Orientation-dependent and sequence-specific expansions of CTG/CAG trinucleotide repeats in Saccharomyces cerevisiae. Proc. Natl. Acad. Sci. USA 95: 12438–12443.

Mitas, M., A. Yu, J. Dill and I. S. Haworth, 1995a The trinucleotide repeat sequence d(CGG)15 forms a heat-stable hairpin containing Gsyn.Ganti base pairs. Biochemistry 34: 12803–12811.

Mitas, M., A. Yu, J. Dill, T. J. Kamp, E. J. Chambers et al., 1995b Hairpin properties of single-stranded DNA containing a GC-rich triplet repeat: (CTG)15. Nucleic Acids Res. 23: 1050–1059.

Mochmann, L. H., and R. D. Wells, 2004 Transcription influences the types of deletion and expansion products in an orientation-dependent manner from GAC*GTC repeats. Nucleic Acids Res. 32: 4469–4479.

Monckton, D. G., M. I. Coolbaugh, K. T. Ashizawa, M. J. Siciliano and C. T. Caskey, 1997 Hypermutable myotonic dystrophy CTG repeats in transgenic mice. Nat. Genet. 15: 193–196.

Napierala, M., P. Parniewski, A. Pluciennik and R. D. Wells, 2002 Long CTG.CAG repeat sequences markedly stimulate intramolecular recombination. J. Biol. Chem. 277: 34087–34100.

Oussatcheva, E. A., V. I. Hashem, Y. Zou, R. R. Sinden and V. N. Potaman, 2001 Involvement of the nucleotide excision repair protein UvrA in instability of CAG*CTG repeat sequences in Escherichia coli. J. Biol. Chem. 276: 30878–30884.

Panigrahi, G. B., R. Lau, S. E. Montgomery, M. R. Leonard and C. E. Pearson, 2005 Slipped (CTG)*(CAG) repeats can be correctly repaired, escape repair or undergo error-prone repair. Nat. Struct. Mol. Biol. 12: 654–662.

Park, J. S., M. T. Marr and J. W. Roberts, 2002 E. coli transcription repair coupling factor (Mfd protein) rescues arrested complexes by promoting forward translocation. Cell 109: 757–767.

Pavlov, Y. I., C. Frahm, N. McElhinny, A. Niimi, M. Suzuki et al., 2006 Evidence that errors made by DNA polymerase alpha are corrected by DNA polymerase delta. Curr. Biol. 16: 202–207.

Pearson, C. E., Y. H. Wang, J. D. Griffith and R. R. Sinden, 1998 Structural analysis of slipped-strand DNA (S-DNA) formed in (CTG)n.(CAG)n repeats from the myotonic dystrophy locus. Nucleic Acids Res. 26: 816–823.

Pelletier, R., B. T. Farrell, J. J. Miret and R. S. Lahue, 2005 Mechanistic features of CAG*CTG repeat contractions in cultured cells revealed by a novel genetic assay. Nucleic Acids Res. 33: 5667–5676.

Petruska, J., N. Arnheim and M. F. Goodman, 1996 Stability of intrastrand hairpin structures formed by the CAG/CTG class of DNA triplet repeats associated with neurological diseases. Nucleic Acids Res. 24: 1992–1998.

Petruska, J., M. J. Hartenstine and M. F. Goodman, 1998 Analysis of strand slippage in DNA polymerase expansions of CAG/CTG triplet repeats associated with neurodegenerative disease. J. Biol. Chem. 273: 5204–5210.

Pluciennik, A., R. R. Iyer, M. Napierala, J. E. Larson, M. Filutowicz et al., 2002 Long CTG.CAG repeats from myotonic dystrophy are preferred sites for intermolecular recombination. J. Biol. Chem. 277: 34074–34086.

Richard, G. F., G. M. Goellner, C. T. McMurray and J. E. Haber, 2000 Recombination-induced CAG trinucleotide repeat expansions in yeast involve the MRE11–RAD50–XRS2 complex. EMBO J. 19: 2381–2390.

Rosche, W. A., T. Q. Trinh and R. R. Sinden, 1995 Differential DNA secondary structure-mediated deletion mutation in the leading and lagging strands. J. Bacteriol. 177: 4385–4391.

Sarkar, P. S., H. C. Chang, F. B. Boudi and S. Reddy, 1998 CTG repeats show bimodal amplification in E. coli. Cell 95: 531–540.

Saveson, C. J., and S. T. Lovett, 1997 Enhanced deletion formation by aberrant DNA replication in Escherichia coli. Genetics 146: 457–470.

Schmidt, K. H., C. M. Abbott and D. R. Leach, 2000 Two opposing effects of mismatch repair on CTG repeat instability in Escherichia coli. Mol. Microbiol. 35: 463–471.

Schumacher, S., R. P. Fuchs and M. Bichara, 1998 Expansion of CTG repeats from human disease genes is dependent upon replication mechanisms in Escherichia coli: the effect of long patch mismatch repair revisited. J. Mol. Biol. 279: 1101–1110.

Schumacher, S., I. Pinet and M. Bichara, 2001 Modulation of transcription reveals a new mechanism of triplet repeat instability in Escherichia coli. J. Mol. Biol. 307: 39–49.

Schweitzer, J. K., and D. M. Livingston, 1999 The effect of DNA replication mutations on CAG tract stability in yeast. Genetics 152: 953–963.

Seznec, H., A. S. Lia-Baldini, C. Duros, C. Fouquet, C. Lacroix et al., 2000 Transgenic mice carrying large human genomic sequences with expanded CTG repeat mimic closely the DM CTG repeat intergenerational and somatic instability. Hum. Mol. Genet. 9: 1185–1194.

Sharples, G. J., and D. R. Leach, 1995 Structural and functional similarities between the SbcCD proteins of Escherichia coli and the RAD50 and MRE11 (RAD32) recombination and repair proteins of yeast. Mol. Microbiol. 17: 1215–1217.

Sinden, R. R., V. N. Potaman, E. A. Oussatcheva, C. E. Pearson, Y. L. Lyubchenko et al., 2002 Triplet repeat DNA structures and human genetic disease: dynamic mutations from dynamic DNA. J. Biosci. 27: 53–65.

Smith, G. K., J. Jie, G. E. Fox and X. Gao, 1995 DNA CTG triplet repeats involved in dynamic mutations of neurologically related gene sequences form stable duplexes. Nucleic Acids Res. 23: 4303–4311.

Strand, M., T. A. Prolla, R. M. Liskay and T. D. Petes, 1993 Destabilization of tracts of simple repetitive DNA in yeast by mutations affecting DNA mismatch repair. Nature 365: 274–276.

Takano, H., O. Onodera, H. Takahashi, S. Igarashi, M. Yamada et al., 1996 Somatic mosaicism of expanded CAG repeats in brains of patients with dentatorubral-pallidoluysian atrophy: cellular population-dependent dynamics of mitotic instability. Am. J. Hum. Genet. 58: 1212–1222.

Tran, H. T., J. D. Keen, M. Kricker, M. A. Resnick and D. A. Gordenin, 1997 Hypermutability of homonucleotide runs in mismatch repair and DNA polymerase proofreading yeast mutants. Mol. Cell. Biol. 17: 2859–2865.

Wertman, K. F., A. R. Wyman and D. Botstein, 1986 Host/vector interactions which affect the viability of recombinant phage lambda clones. Gene 49: 253–262.

Yu, A., J. Dill and M. Mitas, 1995a The purine-rich trinucleotide repeat sequences d(CAG)15 and d(GAC)15 form hairpins. Nucleic Acids Res. 23: 4055–4057.

Yu, A., J. Dill, S. S. Wirth, G. Huang, V. H. Lee et al., 1995b The trinucleotide repeat sequence d(GTC)15 adopts a hairpin conformation. Nucleic Acids Res. 23: 2706–2714.

Communicating editor: S. T. Lovett
