

Proc. Natl. Acad. Sci. USA
Vol. 82, pp. 6990-6994, October 1985
Genetics

Generation of single base-pair deletions, insertions, and substitutions by a site-specific recombination system

(evolution/φ80/site-specific inversion/heteroduplex/mutation)

JOHN M. LEONG, SIMONE E. NUNES-DUBY, AND ARTHUR LANDY*
Division of Biology and Medicine, Brown University, Providence, RI 02912

Communicated by Allan M. Campbell, June 17, 1985

ABSTRACT The sequence analysis of both products of individual φ80 site-specific recombination events in vivo shows that recombination with a secondary attachment (att) site generates several different novel joints at the mismatched position: one recombination event resulted in a single base-pair deletion and two other recombination events resulted in two different single base-pair substitutions. The characterized products of recombination can be straightforwardly interpreted as the outcome of strand exchange involving staggered nicks bracketing the heterology within an overlap region of five to nine base pairs. In comparison, more complex segregation patterns have been observed in previous studies of λ recombination between nonidentical att sites; the nature of the overlap region heterology may have a significant effect on the segregation patterns. To recover both products of a single recombination event, we used a plasmid that carries the φ80 int and xis genes and both att sites. Because the two att sites are situated in opposite orientation, intramolecular recombination between them inverts rather than deletes the intervening segment of DNA. Although subsequent reinversion restores the original gross genetic arrangement, single base-pair insertions, deletions, and substitutions are introduced at the sites of recombination. One of the mutations improves the recombination efficiency of the secondary att site and thereby converts a formerly "stable" sequence to an efficient target for rearrangement, and other mutations are predicted to alter the specificity of recombination. These pathways may also provide useful models for the efficient generation of localized sequence diversity on a developmental (as well as evolutionary) time scale.

For members of the lambdoid family of bacteriophage, site-specific recombination between the phage attP site (POP') and the bacterial attB site (BOB') leads to integration of a circularized phage genome into the bacterial chromosome (1). We have been studying the site-specific recombination system of φ80, which is similar to the more extensively studied λ system (for review, see refs. 2 and 3) but has a different att site specificity. DNA sequence analysis of the φ80 att sites has revealed that attP and attB share a 17-base-pair (bp) perfect homology, the common core region (4). The sequences on either side of the core are designated P and P' for attP and B and B' for attB. The crossover occurs within the common core to generate an intact core in each of the prophage att sites, attL (BOP') and attR (POB'). This integration event requires the φ80-encoded protein, integrase (Int; refs. 5 and 6), and the host-encoded protein, integration host factor (4, 7, 8). Excision of the prophage, involving recombination between attL and attR to regenerate attP and attB, additionally requires the phage-encoded protein excisionase (Xis; unpublished data).

The φ80 core region has two sequences present as an inverted repetition that may function as the recognition elements for Int. This hypothesis is supported by the analogous arrangements of these sequences and the λ Int core recognition sequences (4, 9). In each system, the two symmetric elements are separated by a 3-bp interval. Additionally, the position of each repeat unit with respect to integration host factor binding sequences in the two att sites is almost identical (4, 10). Int molecules interacting with these core sites presumably carry out the cleavage and rejoining of DNA strands during recombination.

The first suggestion that λ integrative recombination involved a heteroduplex intermediate came from Shulman and Gottesman's genetic analysis of a mutant att site (11). Subsequently, the biochemical experiments of Mizuuchi and the characterization of Int nicking in vitro established that recombination involves staggered single-strand nicks at positions -3/-2 and +4/+5 within the common core (12, 13). DNA homology between the 7-bp "overlap regions" of two recombining att sites greatly enhances recombination efficiency (14-16). This homology, along with the binding specificity of Int protein (9), is responsible for the site specificity of the λ pathway.
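The relation between the nick positions and the 7-bp overlap quoted above can be checked with a short calculation. The sketch below is illustrative only: the function name and the tuple convention for a nick are assumptions of this example, and it assumes that the core numbering includes a base numbered 0, which is what makes the count come out at 7.

```python
def overlap_positions(left_nick, right_nick):
    """Return the core coordinates that lie between two staggered nicks.

    Each nick is written as a pair of adjacent coordinates, e.g. (-3, -2)
    for a cut between positions -3 and -2. The numbering is assumed to
    include a base 0 (hypothetical convention for this sketch).
    """
    first_inside = left_nick[1]   # first base to the right of the left nick
    last_inside = right_nick[0]   # last base to the left of the right nick
    return list(range(first_inside, last_inside + 1))


# Lambda nick positions as quoted in the text: -3/-2 and +4/+5.
lambda_overlap = overlap_positions((-3, -2), (4, 5))
print(lambda_overlap)        # [-2, -1, 0, 1, 2, 3, 4]
print(len(lambda_overlap))   # 7
```

Any heterology located at one of these seven coordinates would lie between the nicks and would therefore end up in heteroduplex after strand exchange.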

In contrast to the in vitro biochemical results with att sites having identical overlap regions, the genetic analyses of λ recombination between nonidentical att sites, such as the att24 and safG mutants (11, 14), have not yielded the segregation patterns predicted by the simple model of staggered cuts and heteroduplex formation (see Discussion). We have addressed this question by using a plasmid that permits easy recovery and analysis of both products of a single φ80 site-specific recombination event on the same molecule. The results are simply interpreted as the direct consequence of a strand exchange that proceeds via staggered cuts bracketing the overlap region heterology. In addition, the products of individual recombination events between attP and the secondary att site reveal several different novel joints.

MATERIALS AND METHODS

Plasmids. (i) pJL10 contains the φ80 5.4-kilobase-pair (kb) Sma I-F fragment cloned into the HincII site of pKT21, a derivative of pBR322 lacking the HincII site at position 650 (gift of K. Talmadge and W. Gilbert; ref. 4); (ii) Inv-A, Inv-B, and Inv-C were derived from pJL10 as described in Results; (iii) pSN21 was derived from pJL10 by an Int-mediated deletion extending from attP rightwards to a secondary att site in pKT21 sequence.

Plasmid DNA was restricted, 5' end labeled, and isolated as described (4, 9). Sequencing reactions were performed as described by Maxam and Gilbert (17).

Abbreviations: att site, attachment site; bp, base pair(s); kb, kilobase pair(s).
*To whom reprint requests should be addressed.


The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.


Recombination in Serially Propagated Cells. Cells carrying plasmids were diluted 1:500 daily and grown to saturation (i.e., serially propagated) in LB broth with 20 μg of tetracycline per ml at 32°C or 37°C. Recombination was monitored by isolating plasmid DNA from 10-ml cultures, cutting with Cla I, and analyzing the products on agarose gels. The rate of accumulation of plasmids in inverted orientation was variable and was influenced by several factors, including growth conditions and host strain; for the isolates described here, Escherichia coli HB101 was used, and between one and five passages were required to accumulate 10-20% "inverted" plasmids. E. coli W3110 and BR257 support inversion much less efficiently (for unknown reasons) and were used to isolate pure stocks of inverted and noninverted plasmids. Different recombination products were identified and isolated by screening plasmids for an Xmn I restriction site in the left side of the AattR core and for differences in the rate at which they accumulate reinverted plasmids (as a consequence of a second recombination event). Plasmids that have undergone a full cycle of inversion and reinversion were obtained by transforming HB101 with inverted plasmids, followed by serial propagation to accumulate reinverted plasmids.
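As a rough guide to what "one to five passages" means in cell generations, the dilution factor can be converted to doublings. This is a back-of-the-envelope sketch, not a figure reported in the paper: it assumes that each culture regrows from the 1:500 dilution to the same saturation density, with no cell death, and the function name is hypothetical.

```python
import math


def generations(passages, dilution=500):
    """Doublings needed to regrow after `passages` rounds of 1:`dilution`.

    Assumes (hypothetically) that every passage returns to the same
    saturation density, so each passage costs log2(dilution) doublings.
    """
    return passages * math.log2(dilution)


for n in (1, 5):
    print(n, round(generations(n), 1))
# 1 9.0
# 5 44.8
```

Under these assumptions, the 10-20% inverted plasmids seen in HB101 accumulate over roughly 9 to 45 generations.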

RESULTS

φ80 Int-Mediated Plasmid Inversion. The plasmid used in this study, pJL10, contains the 5.4-kb Sma I-F fragment from φ80h (4). This insert carries the φ80 int and xis genes, attP, and a secondary att site, attφ1.4, located on the left arm of the φ80 chromosome, approximately 1.4 kb to the left of attP. Because attφ1.4 and attP are situated in opposite (instead of direct) orientation, intramolecular recombination inverts (rather than deletes) the intervening DNA, and both recombinant products are retained on the same plasmid (Fig. 1). In keeping with the canonical nomenclature of attP (POP') and attB (BOB'), we refer to attφ1.4 as attA, or AOA'; the products of recombination between attP and attφ1.4 are termed AattL (AOP') and AattR (POA'). The inversion in pJL10 can be monitored by digestion with an endonuclease that gives a characteristic restriction profile for each of the two forms of plasmid (e.g., Cla I; Fig. 1). This inversion is Int-mediated, since small insertions in the int gene on pJL10 abolish inversion (unpublished data).

Sequence analysis of attA suggests that it should be a relatively efficient secondary att site in the φ80 system. In λ secondary att sites, positions homologous to the center of the core (i.e., the overlap region) are highly conserved, and adjacent sequences, responsible for Int recognition at the core, are conserved to a lesser extent (9). attA shares a homology of 9 of 10 bp with the center of the φ80 attP core region, from coordinates -5 to +4, with a single mismatch at position -2 (Figs. 2a and 3). attA also shares partial sequence homology with the potential core Int binding sequences (ref. 4; Fig. 3). Two sequences that match these sequences at four of the seven positions are situated in exactly analogous positions in attA and could act as left and right Int binding sequences, respectively. In addition, shifted one base leftward is an alternative sequence, matching at 5 of 7 bp, that could function as the left Int binding sequence. Outside of the core region, attA shows no striking homology to attP or attB.
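The 9-of-10 homology can be recomputed from the attP and attA sequences printed in Fig. 3. The sketch below is illustrative: the two strings are copied from the figure text as extracted here, and the placement of base 0 at the center of each 29-base string is an assumption of this example (it is the placement under which the mismatch falls at -2, as the text states).

```python
# Sequences as printed in Fig. 3 (29 bases each).
ATTP = "ATTAGAACACTTTCTTAAATTGTCATTTG"
ATTA = "CCCAAAAACCTTCCTTAAAAAGCTACGTG"

# Assumption of this sketch: base 0 is the central base of each string.
ZERO_INDEX = len(ATTP) // 2


def base_at(seq, pos):
    """Base at core coordinate `pos` (negative = left of base 0)."""
    return seq[ZERO_INDEX + pos]


def mismatches(start, end):
    """Coordinates in [start, end] where attP and attA differ."""
    return [p for p in range(start, end + 1)
            if base_at(ATTP, p) != base_at(ATTA, p)]


window_mismatches = mismatches(-5, 4)
matches = 10 - len(window_mismatches)
print(window_mismatches)                     # [-2]
print(f"{matches} of 10 positions match")    # 9 of 10 positions match
print(base_at(ATTP, -2), base_at(ATTA, -2))  # T C
```

With this alignment the single difference in the window is T in attP versus C in attA at -2, consistent with the T·A and C·G base pairs discussed for the recombinant joints.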

Single Recombination Events Generate Several Different Novel Joints. Cultures of E. coli HB101 carrying pJL10 were serially propagated to accumulate plasmids in inverted orientation (see Materials and Methods). Populations of mixed (inverted and noninverted) plasmids were transformed into strains that support recombination much less efficiently (see Materials and Methods), and transformants carrying plasmids in inverted orientation were identified. We isolated three plasmids, Inv-A, Inv-B, and Inv-C, that have undergone inversion. The recombinant att sites of these plasmids were sequenced (Fig. 2). Each of the three recombination events has generated a unique pair of recombinant joints (AattL and AattR), differing at the mismatched base pair (position -2): Inv-A has undergone a T·A → C·G transition, Inv-B has undergone a deletion of the C·G bp, and Inv-C has undergone a C·G → T·A transition (Fig. 4).

The most reasonable explanation for these recombination products is a mechanism involving staggered cuts and formation of a "heteroduplex intermediate" during strand exchange. In particular, it is difficult to account for these products by a strand exchange involving flush cuts (see also below). We presume that φ80 Int generates staggered nicks within the core during recombination and that the mismatched base at position -2 lies between the nicks. Int, in recognizing the potential binding sequences on attA, would generate staggered ends within the attA core: upon strand switching, "heteroduplexes" would be formed at both AattL and AattR, each with a single base mismatch at position -2. We do not know whether all four strands are rejoined and whether there are unligated nicks in the "heteroduplex." λ Int protein has been shown to efficiently nick a mismatched heteroduplex att site without subsequent rejoining, so that the two activities are apparently separable (16). For simplicity we shall consider all strands in the heteroduplex as being resealed.

The observed recombination products can be derived from heteroduplex att sites either by chromosomal replication or by mismatch repair (Fig. 5a). Since the two att sites are inverted relative to one another, the "top" sequence of attP and the analogous complement ("bottom") sequence of attA are on the same strand of plasmid DNA. Consequently, strand exchange followed by replication through the AattL and AattR heteroduplexes would give rise to two daughter inversion products, each originating from different parental strands; these products would differ at the mismatched position. Although this replication pathway (in the absence of mismatch correction) predicts an equal mixture of two recombinant types, our experiments do not support this. (Inv-A and Inv-C were in fact isolated from two independent recombination events.) We sequenced the AattL and AattR in an uncloned mixture of recombinants (rather than from an individual plasmid cloned out by transformation): only one inversion product (Inv-B; see below and Fig. 5b) was observed under conditions in which less (by a factor of 5-10) of the sibling product ("Inv-D") would have been detected if present (Fig. 2c). However, there are factors that might interfere with detection of the second product predicted by replicational segregation (e.g., the random loss of one of the two daughter cells).

FIG. 1. φ80 Int-mediated inversion. Vector DNA is indicated by the thin line; φ80 DNA is indicated by the thick line. The two att sites on pJL10, AOA' and POP', are designated by light and dark arrows, respectively. Recombination between them inverts the intervening segment of DNA and is monitored by Cla I digestion.

FIG. 2. DNA sequence of attP, attA, and prophage att sites. Base zero is indicated in each panel, and common core region sequences (i.e., the 17-bp sequence common to attP and wild-type attB) are represented by uppercase letters. P and P', A and A' denote flanking arm sequences. The mismatched base pairs at position -2 are indicated by solid arrowheads. The deleted base pair at position -2 in Inv-B is indicated by an open arrowhead (c). (a) att-containing fragment *Hpa I (+155 in A' arm)-Ava I (approximately -700 in A arm) from pSN21 (* in fragment name denotes position of 5' label). attA has also been sequenced from pJL10 (data not shown). (b) Left panel: AattR-containing fragment *Xmn I (+165 in A' arm)-Cla I (-900 in P arm) from Inv-A. Right panel: AattL-containing fragment *Nde I (-60 in A arm)-Xho I (+754 in P' arm) from Inv-A. (c) AattL-containing fragments *Nde I (-60)-EcoRV (+240 in P' arm) from the population of plasmids from which Inv-B was isolated.

          -10        0        +10
    attP  ATTAGAACACTTTCTTAAATTGTCATTTG
    attA  CCCAAAAACCTTCCTTAAAAAGCTACGTG

FIG. 3. Inverted repetitions and potential Int binding sequences in the attP and attA cores. Candidates for Int recognition are indicated by large letters and arrows. The fractions given above the arrows in attA (5/7 for the alternative left sequence; 4/7 and 4/7 for the left and right sequences) represent the fraction of positions that match the proposed binding sequence. Diamonds indicate mismatched positions within and outside the proposed overlap region.

The alternative explanation for the observed recombination products is mismatch repair (or excision repair of nicked strands) at AattL and AattR (Fig. 5a). Correction of both mismatches toward the C·G bp results in the net T·A → C·G transition found in Inv-A; correction toward the T·A bp at both sites yields the net C·G → T·A transition observed in Inv-C. According to this pathway, the mismatches at AattL and AattR were corrected by using the same chromosomal strand as a template. Studies of mismatch repair in E. coli suggest that the proximity of the two att sites (1.4 kb) would favor such corepair toward the same strand (18, 19). Therefore, the present data are compatible either with replicational segregation of the heteroduplex products of strand exchange or with mismatch correction.
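The two pathways can be compared by enumerating their outcomes at position -2. The following is a toy model under stated assumptions, not the authors' analysis: the strand labels "s1" and "s2" are hypothetical, and the only input taken from the text is that one plasmid strand templates a C·G pair and the other a T·A pair at both joints.

```python
from itertools import product

# Hypothetical strand labels; which physical strand carries which base
# is not specified here and does not matter for the enumeration.
TEMPLATE_BP = {"s1": "C:G", "s2": "T:A"}
JOINTS = ("AattL", "AattR")


def replication():
    """No repair: each parental strand templates one daughter plasmid."""
    return [{j: TEMPLATE_BP[s] for j in JOINTS} for s in ("s1", "s2")]


def mismatch_repair():
    """Repair: each joint is corrected to one strand; list every choice."""
    return [{j: TEMPLATE_BP[s] for j, s in zip(JOINTS, choice)}
            for choice in product(TEMPLATE_BP, repeat=len(JOINTS))]


def label(plasmid):
    """Name a product by the base pair found at its two joints."""
    pairs = set(plasmid.values())
    if pairs == {"C:G"}:
        return "Inv-A type"
    if pairs == {"T:A"}:
        return "Inv-C type"
    return "mixed joints"


print([label(p) for p in replication()])
# ['Inv-A type', 'Inv-C type']
print([label(p) for p in mismatch_repair()])
# ['Inv-A type', 'mixed joints', 'mixed joints', 'Inv-C type']
```

In this model, replication always yields the Inv-A and Inv-C types together as siblings, whereas repair yields a single product, which is of the Inv-A or Inv-C type only when both joints are corrected on the same strand (corepair). The "mixed joints" outcomes arise only when the two joints are repaired on different strands; they are not among the three isolates described above.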

FIG. 4. Single recombination events generate mutations at the mismatched position. Solid lines indicate sequences common to attP and attA in the center of the core. Homology is broken at position -2, as indicated. Sequences flanking this region are symbolized by wavy lines for attA and by dashed lines for attP.


FIG. 5. Proposed mechanisms for the generation of mutations (see text for detailed explanation). Curved arrows represent staggered nicks within the core. By analogy with λ, these staggered cuts are presented 7 bp apart, leaving 5' protruding ends; this detail is not essential as long as the mismatched base at position -2 is flanked by the nicks. The two DNA strands of each plasmid are represented by thin and thick lines. Dashed lines indicate newly synthesized DNA, as the result of either replication or mismatch correction. (a) Formation of Inv-A and Inv-C. Asterisks indicate mismatched base pairs. (b) Formation of Inv-B. Diamonds indicate the extra bases in the attA overlap region. "Inv-D" is a hypothetical product that has not been isolated.

The sequences of Inv-A and Inv-C are consistent with the supposition that the staggered ends generated on attA are identical in length to those generated on attP. However, for the generation of Inv-B, we recall that on the left side of the attA core there is an alternative "Int binding site" shifted 1 bp leftward (Fig. 3). If the Int protein recognizes this alternative sequence and if the position of Int binding and cutting on the right is not altered, staggered nicks would be made 1 bp further apart (Fig. 5b). Therefore, the two mismatched heteroduplexes formed at AattL and AattR upon strand exchange would each have an extra base on one strand. Inv-B is obtained from these heteroduplexes by using as template for correction or replication that plasmid strand without the extra bases. When the other plasmid strand, containing the extra bases, is used as the template, the product corresponding to Inv-D in Fig. 5b would be predicted (although this was not isolated in our studies).

A Full Cycle of Inversion and Reinversion Generates Single Base-Pair Mutations. Growth of HB101 carrying an inverted plasmid gives rise to a population of reinverted plasmids that have undergone a second site-specific recombination event (data not shown). Because the substrates of this second recombination are AattL and AattR, reinversion would be expected to require Xis as well as Int. Consistent with this, deletion of the xis gene on pJL10 abolishes reinversion (unpublished data).

Although this reinversion restores the gross genetic arrangement of the plasmid, most of the products should have a mutation in either attP or attA. We isolated and sequenced plasmids with a reinverted segment from each of several strains originally transformed with different inverted plasmids. All of the reinverted plasmids were indeed altered at the site of recombination: at the -2 position we found a T·A → C·G transition in attP, a C·G bp insertion in attP, and a C·G bp deletion in attA (data not shown). Each of these mutations can be derived by recombination between one of the AattL-AattR pairs shown in Fig. 5 by using the same mechanism described above.

Sequence homology in the overlap regions of two recombining λ att sites is required for efficient recombination, and phages with mutations in this region (saf mutants) possess altered insertion specificities (14, 15). The cycle of intramolecular inversion and reinversion described here provides an efficient mechanism for altering this sequence and thus for changing the target site specificity of the recombination system. Some of the mutations will result in increased homology between attP and attA (e.g., attP and attA of pSN28, derived from Inv-A, share a homology of 10 of 10 bp). If the base changes do not alter Int recognition sequences, a predicted outcome of this increased homology is an increased efficiency of recombination. For example, the reinversion of Inv-C should generate a wild-type attP core and an altered attA core that both have good matches to the potential Int binding sequences and also share perfect homology in the overlap region. As predicted, the reinverted derivative of Inv-C undergoes recombination with such high efficiency that it cannot be isolated in a homogeneous form in the presence of a functional int gene: after only one passage, all of the transformants screened contain a mixture of plasmids with at least 50% in inverted orientation (data not shown).

DISCUSSION

Relation to λ Recombination Between Nonidentical att Sites. Studies in vivo have focused on recombination between nonidentical att sites, a configuration that not only facilitates genetic analysis of strand exchange but also is of considerable biological interest. The first suggestion that λ site-specific recombination might involve staggered cuts and a heterozygote intermediate came from Int-dependent crosses with the att24 mutant (11), which was subsequently shown to be a single base-pair deletion within the overlap region (20). A heterozygote intermediate was inferred by analyzing the products of a replication-blocked cross between mutant and wild type, but an additional explanation was necessary to accommodate the segregation pattern: in most of the crosses examined, the mutation segregated as if it were to the left of the point of genetic exchange >90-95% of the time, in contrast to the 50% predicted by a mechanism involving a pair of staggered cuts bracketing the overlap mutation. Later studies of a 3-bp mutation within the overlap region, safG, established the importance of DNA homology between the overlap regions of recombining att sites and also yielded an unpredicted segregation pattern: in this case the mutation was to the right of the point of genetic exchange >95% of the time (14).

Two general classes of explanations were offered for the unexpected segregation patterns of att24 and safG. One class postulated a mismatch correction mechanism that was constrained to use the same chromosomal strand as a template for repairing both att site products. The other class of explanations postulated that the overlap region heterology (or the mutation) may have depressed one pair of Int-promoted strand exchanges, thus generating either a single-strand exchange Holliday structure or a recombinant product with unsealed nicks. In the former case, the Holliday structure would have to be resolved by a second-strand exchange (not necessarily mediated by Int) on the same side of the mutation as the first exchange or by replication through the Holliday structure. In the latter case, replication would have to lead to disassembly of the unsealed strand.

In contrast to att24 and safG, the results described here require no further assumptions to be congruent with the segregation patterns predicted by a mechanism involving staggered nicks bracketing the overlap heterology. Similar results have also been observed in a series of crosses between λ att sites with single base heterologies introduced into the overlap region by site-directed mutagenesis (C. Bauer and J. Gardner, personal communication). Thus, the simplest segregation patterns, and those closest to the predictions of staggered cuts and a heterozygote intermediate, come from crosses with a single base-pair mismatch or insertion in the overlap region, whereas the more complex segregation patterns are observed in crosses involving a single base-pair deletion or a 3-bp heterology in the parental overlap regions. This suggests that the nature, extent, and location of the overlap heterology have significant effects on the segregation patterns of site-specific recombination.

Limits of the φ80 Overlap Region. In contrast to the segregation pattern of the heterology at position -2, the mismatched base pairs at positions -6 and +5 always segregate with their respective parental arms: P and A or P' and A'. The staggered nicks responsible for strand exchange occur between these two positions, so the upper limit for the φ80 overlap region is 10 bp. If the staggered nicks are made at the same position in the two potential Int recognition sequences (see Fig. 3) the overlap region is between 5 and 9 bp in length. (The overlap region of the λ att site is 7 bp.)
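The 10-bp upper limit follows from simple coordinate arithmetic. The sketch below only restates that step; the function name is hypothetical, and it assumes, as elsewhere in the numbering used here, that the coordinates include a base 0.

```python
def max_overlap(left_marker, right_marker):
    """Positions strictly between two markers that stay with their arms.

    A marker that always segregates with its parental arm must lie
    outside the nicks, so the overlap can span at most the coordinates
    between the two markers (base 0 included; assumption of this sketch).
    """
    positions = list(range(left_marker + 1, right_marker))
    return positions, len(positions)


positions, upper_limit = max_overlap(-6, 5)
print(positions)     # [-5, -4, -3, -2, -1, 0, 1, 2, 3, 4]
print(upper_limit)   # 10
```

The interval from -5 to +4 is also the window over which attA and attP were compared above, and it contains the heterology at -2. The narrower estimate of 5 to 9 bp depends on where the nicks fall relative to the Int recognition sequences, which this arithmetic does not address.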

In the formation of Inv-B, we have suggested that φ80 Int generates staggered nicks on attA by interacting with two recognition sequences spaced 4 bp apart rather than the normal 3 bp (see Fig. 3). This requires at least limited flexibility in the spacing of the two Int molecules that nick attA. Some flexibility is apparent in λ site-specific recombination, since an att site with a single base-pair insertion in the overlap region can recombine efficiently with a wild-type att site (15).

Localized Recombination-Induced Mutations. Localized mutations in the overlap region were generated by cycles of inversion and reinversion. Because of their critical location, these mutations are predicted to alter the site specificity of recombination (14). Additionally, one of these mutations leads to increased attP by attA recombination. The recombination process can thus improve the efficiency of a secondary att site. This recruitment of new att sites and the mobilization of previously "stable" sequences suggests a mechanism by which DNA inversion systems that regulate gene expression may have evolved. Indeed, such a pathway has been suggested for the generation of the hin inversion system of Salmonella on the basis of DNA sequence comparisons (21).

Localized recombination-induced mutations may be a general feature of site-specific recombination systems and may have relevance in many organisms. The results described here highlight the high efficiency with which sequence diversity can be generated, even in the absence of gross rearrangements. This efficiency suggests that, in addition to its evolutionary significance, the genetic diversity generated by site-specific recombination may also be important on a developmental time scale, such as during the generation of antibody diversity.

We thank C. Lesser and P. Moalli for help in screening plasmid-bearing strains, A. Oser and B. Franz for some of the sequence data, L. Vargas and B. Rogen for preparation of plasmids, B. Tracy for technical assistance, S. Penkoff for the illustrations, and S. Rodrigues and J. Boyles for help in preparation of the manuscript. K. Talmadge and W. Gilbert generously provided vector plasmids. We are grateful to R. Weisberg, R. Fishel, R. Kolodner, W. Ross, P. Youderian, M. Susskind, C. Bauer, J. Gardner, and A. Campbell for helpful discussion and/or communication of results prior to publication. This work was supported by Grant 1-543 from the National Foundation-March of Dimes and by Grant AI13544 from the National Institutes of Health. J.M.L. was the recipient of a National Science Foundation graduate fellowship.

1. Campbell, A. (1962) Adv. Genet. 11, 101-145.
2. Nash, H. A. (1981) Annu. Rev. Genet. 15, 143-167.
3. Weisberg, R. A. & Landy, A. (1983) in Lambda II, eds. Hendrix, R. W., Stahl, F. W., Roberts, J. W. & Weisberg, R. A. (Cold Spring Harbor Laboratory, Cold Spring Harbor, NY), pp. 211-250.
4. Leong, J. M., Nunes-Düby, S. E., Lesser, C. F., Youderian, P., Susskind, M. M. & Landy, A. (1985) J. Biol. Chem. 260, 4468-4477.
5. Franklin, N. C., Dove, W. F. & Yanofsky, C. (1965) Biochem. Biophys. Res. Commun. 18, 910-923.
6. Signer, E. & Beckwith, J. (1966) J. Mol. Biol. 22, 33-51.
7. Miller, H. I. & Friedman, D. I. (1977) in DNA Insertion Elements, Plasmids and Episomes, eds. Bukhari, A. I., Shapiro, J. A. & Adhya, S. L. (Cold Spring Harbor Laboratory, Cold Spring Harbor, NY), pp. 349-356.
8. Kikuchi, A., Flamm, E. & Weisberg, R. A. (1985) J. Mol. Biol. 183, 129-140.
9. Ross, W. & Landy, A. (1983) Cell 33, 261-272.
10. Craig, N. & Nash, H. A. (1984) Cell 39, 707-716.
11. Shulman, M. & Gottesman, M. (1973) J. Mol. Biol. 81, 461-479.
12. Mizuuchi, K., Weisberg, R., Enquist, L., Mizuuchi, M., Buraczynska, M., Foeller, C., Hsu, P.-L., Ross, W. & Landy, A. (1981) Cold Spring Harbor Symp. Quant. Biol. 45, 429-438.
13. Craig, N. & Nash, H. A. (1983) Cell 35, 795-803.
14. Weisberg, R. A., Enquist, L., Foeller, C. & Landy, A. (1983) J. Mol. Biol. 170, 319-342.
15. de Massy, B., Studier, F. W., Dorgai, L., Appelbaum, E. & Weisberg, R. A. (1984) Cold Spring Harbor Symp. Quant. Biol. 49, 715-726.
16. Bauer, C. E., Hesse, S. D., Gardner, J. F. & Gumport, R. I. (1984) Cold Spring Harbor Symp. Quant. Biol. 49, 699-705.
17. Maxam, A. & Gilbert, W. (1980) Methods Enzymol. 65, 499-560.
18. Wagner, R. W., Jr., & Meselson, M. (1976) Proc. Natl. Acad. Sci. USA 73, 4135-4139.
19. Fishel, R. A. & Kolodner, R. (1983) in Cellular Responses to DNA Damage, eds. Friedberg, E. C. & Bridges, B. A. (Liss, New York), Vol. 11, 309-324.
20. Ross, W., Shulman, M. & Landy, A. (1982) J. Mol. Biol. 156, 505-520.
21. Szekely, E. & Simon, M. (1983) J. Bacteriol. 155, 74-81.
