Gene, 109 (1991) 1-11
© 1991 Elsevier Science Publishers B.V. All rights reserved. 0378-1119/91/$03.50

GENE 06183

Cloning, overexpression and nucleotide sequence of a thermostable DNA ligase-encoding gene

(Ligase chain reaction; mrrB; phoA promoter; protein-DNA interactions; recombinant DNA)

Francis Barany a and David H. Gelfand b

a Department of Microbiology, Hearst Microbiology Research Center, Cornell University Medical College, New York, NY 10021 (U.S.A.); and b PCR Division, Cetus Corporation, Emeryville, CA 94608 (U.S.A.) Tel. (415)420-3384

Received by J.L. Slightom: 6 August 1991
Accepted: 30 August 1991
Received at publishers: 19 September 1991

Correspondence to: Dr. F. Barany, Department of Microbiology, Box 62, Cornell University Medical College, New York, NY 10021 (U.S.A.) Tel. (212)746-6509; Fax (212)746-8587.

Abbreviations: aa, amino acid(s); Ap, ampicillin; bp, base pair(s); buffer A, 20 mM Tris·HCl pH 7.6/1 mM EDTA; ds, double strand(ed); IPTG, isopropyl-β-D-thiogalactopyranoside; kb, kilobase(s) or 1000 bp; LCR, ligase chain reaction; ligT, gene encoding Tth ligase; MTase, methyltransferase; nt, nucleotide(s); oligo, oligodeoxyribonucleotide; ORF, open reading frame; ori, origin of DNA replication; PAGE, polyacrylamide-gel electrophoresis; PMSF, phenylmethylsulfonyl fluoride; R, resistance/resistant; SDS, sodium dodecyl sulfate; ss, single strand(ed); TE, 10 mM Tris·HCl pH 8.0/1 mM EDTA; T., Thermus; Tth, T. thermophilus; u, unit(s); wt, wild type; [ ], designates plasmid-carrier state.

SUMMARY

Thermostable DNA ligase has been harnessed for the detection of single-base genetic diseases using the ligase chain reaction [Barany, Proc. Natl. Acad. Sci. USA 88 (1991) 189-193]. The Thermus thermophilus (Tth) DNA ligase-encoding gene (ligT) was cloned in Escherichia coli by genetic complementation of a ligts7 defect in an E. coli host. Nucleotide sequence analysis of the gene revealed a single chain of 676 amino acid residues with 47% identity to the E. coli ligase. Under phoA promoter control, Tth ligase was overproduced to greater than 10% of E. coli cellular proteins. Adenylated and deadenylated forms of the purified enzyme were distinguished by apparent molecular weights of 81 kDa and 78 kDa, respectively, after separation via sodium dodecyl sulfate-polyacrylamide-gel electrophoresis.

INTRODUCTION

DNA ligases form a covalent phosphate link between two strands of DNA (for reviews see Engler and Richardson, 1982; Lehman, 1974). These versatile enzymes use NAD (or ATP) as a cofactor to seal a nick in ds DNA, or join two fragments containing either complementary ss or blunt ends (Cozzarelli et al., 1967; Gefter et al., 1967; Gellert, 1967; Olivera and Lehman, 1967; Weiss and Richardson, 1967b). The ligation reaction occurs in three reversible steps: (i) formation of a high-energy enzyme intermediate by transfer of the adenyl group from the cofactor to the ε-amino group of a lysine residue; (ii) transfer of the adenyl group to the 5' phosphate of one DNA strand, and (iii) attack of this activated 5' end by a 3'-hydroxyl group on the adjacent DNA strand. This forms a phosphodiester link between the two DNA strands, and eliminates AMP (Becker et al., 1967; Gumport and Lehman, 1971; Modrich et al., 1973; Modrich and Lehman, 1973; Weiss and Richardson, 1967a; Weiss et al., 1968; Yudelevich et al., 1968; Zimmerman et al., 1967; Zimmerman and Oshinsky, 1969).

The preference of DNA ligases to join oligos with perfect complementarity at the junction has been exploited for detection of genetic diseases (Barany, 1991a; Barringer et al., 1990; Landegren et al., 1988; Nickerson et al., 1990; Wu and Wallace, 1989a,b). Use of a thermostable DNA



ligase allowed for practical development of the LCR method for amplifying and detecting a single-base genetic disease (Barany, 1991a,b). This enzyme, purified either from a clone or the original T. thermophilus strain, demonstrated NAD+-dependent ligase activity at temperatures up to 85°C (Barany, 1991a; Takahashi and Tsuneko, 1986; Takahashi et al., 1984). Ligation near the oligo melting temperatures (Tm) allowed for higher specificity in discriminating complementary from single-nucleotide mismatched substrates (Barany, 1991a). Furthermore, the cloned enzyme retained activity after repeated 1 min exposures to 94°C, conditions required to denature duplex DNA (Barany, 1991a). Thus, thermostable ligase significantly improved the specificity of amplification, as well as eliminating background amplification or the requirement to continually replenish enzyme, when compared to use of mesophilic enzymes (Barringer et al., 1990; Wu and Wallace, 1989a).

Mesophilic ligases from E. coli, bacteriophage T4, and yeasts have been cloned by genetic complementation (Cameron et al., 1975; Gottesman, 1976; Wilson and Murray, 1979), or by screening for ligase-adenylate intermediates (Barker et al., 1988; 1989). DNA sequence analysis of the E. coli gene (Ishino et al., 1986) revealed no significant identities to the T7, T4 or yeast ligases (Barker et al., 1988; 1989; Dunn and Studier, 1983; Krayev et al., 1983). This work reports the isolation and nt sequence of the T. thermophilus DNA ligase-encoding gene (ligT). A comparison of the aa sequence with its E. coli homologue is presented. By replacing the endogenous Thermus transcription/translation signals with E. coli signals, the enzyme could be overproduced and purified. This will facilitate the study of structure/function relationships which may lead to an understanding of the exquisite specificity of this thermostable ligase.

[Figure 1: schematic maps of plasmids pDZ1, pDZ6,7, pDZ12, pDZ13 and pDZ15 on a 0-7 kb scale, showing the lac, phoA and T7 promoters, the Tth ligase gene, the ApR gene, the lacZ alpha fragment, the pBR and f1 origins, restriction sites, and the positions of primers Nos. 66, 78, 85 and 94.]

Fig. 1. Genealogy of plasmids containing the Tth ligase gene. Plasmids are represented schematically as if opened at either a BamHI or PvuII site, with genes drawn approximately to scale. The Tth ligase gene and direction are represented by the strongly hatched arrow; the vector ApR (bla) gene is represented by the shaded arrow; the truncated end of the lacZ-α complementing fragment is represented by the patterned arrow; the truncated end of a nonfunctional TaqI endonuclease gene is represented by the lightly hatched arrow; and the pBR and f1 phage origins of replication are represented by the open bars. The phoA, T7 and lac promoters are indicated by right angle arrows and point in the direction of transcription. Restriction sites are: As, Asp718; Bm, BamHI; Bg, BglII (the Bg/Bm or Bm/Bg recombined site is not cleavable by either BamHI or BglII); Hd, HindIII; RI, EcoRI; Ps, PstI; Pv, PvuII. Polylinker regions from pTZ18R are indicated by the triangular 'rake', with only the outside restriction sites listed.

E. coli host strains used in the constructions described below were obtained from the following sources: RR1 (HB101 recA+, Maniatis et al., 1982), from G. Wilson; N3098 (ligts7, Wilson and Murray, 1979), from N. Murray; JH132 (mrr-, Tn10, Heitman and Model, 1987), from J. Heitman; and MM294 (endA-, hsdR-, hsdM+, thi-1, supE44, Meselson and Yuan, 1968), from our collection. Strains AK53 (mrrB-, MM294) and AK76 (mrrB-, N3098) were constructed by transducing the mrrB- phenotype from JH132 as described (Miller, 1972). Presence of the mrrB- phenotype was confirmed by tolerance of these strains to the Taq MTase-encoding gene present on plasmid pFBT71 (Barany, 1987; Barany et al., 1992a). T. thermophilus HB8 DNA purification and library constructions will be described elsewhere (Barany et al., 1992a). DNA constructs were introduced into E. coli strains as described (Hanahan, 1983), and the lig+ phenotype selected by growth on SOB plates (Hanahan, 1983) containing 0.2% maltose/0.2 mg IPTG per ml (to induce the lac promoter)/50 μg Ap/ml at 42-42.5°C overnight. Plasmid pDZ1 was isolated from a T. thermophilus HB8 DNA library containing inserts in the BamHI site of pTZ18R. Plasmid pDZ1 contains ligT downstream from both lac and T7 promoters present in the starting vector. The N terminus of the ligase gene was brought closer to the two promoters as follows: Plasmid pDZ1 was randomly linearized with the restriction endonuclease HinPI (G↓CGC) and blunt ended with Klenow enzyme, or with CviJI in the presence of ATP (PuG↓CPy; Xia et al., 1987) as described (Barany, 1988b). After DNA purification, these randomly linearized plasmids were then treated with Asp718, which cleaves within the polylinker site directly downstream from the two promoters, and blunt ended with Klenow. Resulting fragments were separated via electrophoresis in low melting agarose, sequential slices including full-length linear and progressively smaller DNA fragments excised, and DNA recovered as described (Barany, 1988b). The DNA fragments were subsequently recircularized by blunt end ligation, introduced into AK76 cells (ligts7 strain), and the lig+ phenotype selected at 42°C on SOB plates containing maltose/IPTG/Ap. Only plasmids containing deletions between the promoters and the start of ligT should confer viability under these conditions. Individual clones were picked, plasmid DNA prepared using standard methods (Holmes and Quigley, 1981; Birnboim, 1983), and analyzed by restriction digestion. Plasmids pDZ2, pDZ3, pDZ6 and pDZ7 lacked the 1.8-kb BamHI fragment, containing instead a 1.3, 1.4, 1.2 and 1.2-kb fragment respectively. Single-stranded DNA was made from these plasmids (Mead et al., 1985; 1986), and sequenced using the universal 'reverse primer' oligo 5'-dAGCGGATAACAATTTCACACAGGA and T7 DNA polymerase ('Sequenase II' from U.S. Biochemicals; Tabor and Richardson, 1987). Plasmids pDZ2, 3, 6 and 7 were fused 105 nt, 184 nt, 19 nt and 19 nt respectively, upstream from a long ORF of which the first 60 aa show better than 50% identity to E. coli DNA ligase. For optimizing subsequent constructions, it was necessary to remove the BglII site downstream from the ligase gene. Plasmid pDZ7 was partially digested with BamHI + BglII, the correct size smaller linear fragment separated from full length linear by electrophoresis, excised and purified as described above. Since BamHI and BglII generate the same cohesive ends (5'-GATC), the linear fragment could be recircularized with T4 ligase, and introduced into E. coli strain AK53 via transformation. Several clones had deleted the 0.5-kb BamHI-BglII fragment resulting in a 5.7-kb plasmid, and one such clone was designated pDZ12. Synthetic oligos Nos. 66, 78, 85 and 94 were synthesized, to allow for fusion of a phoA promoter (from pFBT64; Barany, 1987) and ribosome-binding sequence to the start of ligT using PCR (Horton et al., 1989). Oligos: No. 66, 5'-dCTGGCTTATCGAAATTAAT, 19 mer, PvuII site to T7 promoter through phoA promoter, top strand of pFBT64 (direction of TaqI endonuclease gene); No. 78, 5'-dCCAGGGTCATTTTATTTTCTCCATGTACAAAT, 32 mer, 5' end complementary to start of ligT corresponding to the nontranslated strand, 3' end complementary to Shine-Dalgarno side of phoA promoter, bottom strand of pFBT64; No. 85, 5'-dCATGGAGAAAATAAAATGACCCTGGAAGAGGCG, 33 mer, 5' end corresponding to Shine-Dalgarno side of phoA promoter, 3' end corresponding to start of ligT, top strand of pDZ7 (direction of ligase gene); No. 94, 5'-dAAGCCGGTCGTACTCGGC, 18 mer, bottom strand of pDZ7 corresponding to nontranslated strand aa residues 40-35 of ligT, downstream from BglII site (see Fig. 3). Approximate positions of oligonucleotide primers are depicted by arrows and primer numbers below pDZ13. In one reaction tube, 400 ng of primers Nos. 66 and 78 were added to 200 ng of PstI + PvuII digested pFBT64 containing 50 μmol of dATP, dCTP, dGTP and dTTP each, and 2.5 units Amplitaq™ in 100 μl PCR buffer and cycled as described (Saiki et al., 1988; Perkin-Elmer Cetus Instruments, Emeryville, CA). A second reaction tube contained 400 ng of primers Nos. 85 and 94, 200 ng of EcoRI + BamHI digested pDZ7, and 2.5 units Amplitaq™ in the same reaction buffer, and incubated as above. The products from the initial reaction (2 μl each) were combined in a third reaction tube containing 400 ng primers Nos. 66 and 94 and 2.5 units Amplitaq™ in the same reaction buffer, and incubated as above. After removal of Amplitaq™ the larger product PCR fragment was treated with BglII + EcoRI, electrophoresed in low melting agarose, and purified as described above. In addition, the 2.7-kb PstI-BglII ligase gene containing fragment from pDZ12 and the 2.4-kb PstI-EcoRI ApR gene and ori-containing fragment from pFBT64 were purified. All three fragments were combined in a three way ligation and introduced into E. coli strain AK53 via transformation. Several clones contained a 5.5-kb plasmid which overproduced ligase under phoA promoter control; one was designated pDZ13. To reverse the orientation of the ApR gene, the 2.3-kb PstI-PvuII fragment from pFBLT69 (Barany, 1988a) was ligated to the 3.2-kb PstI-PvuII ligase-encoding gene containing fragment of pDZ13. The ligation mix was transformed into E. coli strain AK53, and several transformants were analyzed by restriction digests to confirm orientation of the ApR gene. One such clone was designated pDZ15. Plasmid-carrying cells were grown in fortified broth (Moses et al., 1979) or on L-agar plates (Miller, 1972) supplemented with 10 mM K2HPO4 pH 7.6, to assure repression of the phoA promoter. For maximum induction of Tth ligase, cells were diluted 50-fold into MOPS medium (containing 0.2 mM K2HPO4; Neidhardt et al., 1974) and grown overnight at 37°C. Tth ligase activity was assayed as described in Fig. 5.
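The phoA-ligT fusion described in the legend to Fig. 1 relies on primers Nos. 78 and 85 being complementary across the junction, so that the two first-round PCR products can anneal to each other in the third reaction. The following is a minimal Python sketch of that check, not part of the original methods. It assumes the primer sequences as printed in the legend, with No. 78 read as 5'-CCAGGGTCATTTTATTTTCTCCATGTACAAAT; the function and variable names are illustrative only.

```python
# Primer sequences from the Fig. 1 legend (5' -> 3'). The reading of
# No. 78 repairs characters damaged in the printed copy (an assumption).
PRIMER_78 = "CCAGGGTCATTTTATTTTCTCCATGTACAAAT"   # bottom strand, 32-mer
PRIMER_85 = "CATGGAGAAAATAAAATGACCCTGGAAGAGGCG"  # top strand, 33-mer

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def suffix_prefix_overlap(left: str, right: str) -> str:
    """Longest suffix of `left` that is also a prefix of `right`."""
    for size in range(min(len(left), len(right)), 0, -1):
        if left.endswith(right[:size]):
            return right[:size]
    return ""


# Express primer 78 on the top strand so it can be compared with primer 85.
top_78 = reverse_complement(PRIMER_78)
overlap = suffix_prefix_overlap(top_78, PRIMER_85)

# Top-strand sequence spanned by the two primers once the products are fused.
junction = top_78 + PRIMER_85[len(overlap):]

print(f"primer 78 (top strand): {top_78}")
print(f"shared overlap ({len(overlap)} nt): {overlap}")
print(f"junction ({len(junction)} nt): {junction}")
```

With these two sequences the shared stretch is 25 nt long and contains the ATG start codon of ligT followed by ACC CTG G, consistent with the description of the primers as spanning the Shine-Dalgarno side of the phoA promoter and the start of ligT.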

RESULTS AND DISCUSSION

(a) Preparation of Thermus species DNA libraries in Escherichia coli

Thermus aquaticus YT1 (ATCC25104; Brock and Freeze, 1969) and T. thermophilus HB8 (ATCC27634; Oshima and Imahori, 1971) are currently classified by the ATCC as a single T. aquaticus species (Degryse et al., 1978). The aa sequences of T. aquaticus YT1 and T. thermophilus HB8 restriction endonucleases (Barany et al., 1992a,b; Slatko et al., 1987), methyltransferases (Barany et al., 1992a,b; Slatko et al., 1987), and DNA polymerases (D.H.G., unpublished results) show 77%, 79% and 88% identity, respectively. DNA homology studies of numerous thermophilic isolates suggest that these organisms may indeed be separate species (Williams, 1988). Until the taxonomy is resolved, both designations are correct and will be used interchangeably. To assure that the highly methylated T. thermophilus HB8 DNA (including TCGA and AATT sites; Barany et al., 1992a) did not undergo mrr-associated restriction or mutation (Heitman and Model, 1987), it was necessary to prepare libraries in an E. coli host strain lacking not only mrrA, but more importantly, a new methyl-dependent endonuclease termed mrrB (Barany et al., 1992a). The mrrB gene product restricts N6MeA DNA at TCGA and other sequences. Strain AK76 (mrrB-, ligts7) was constructed to serve as a host for isolating the ligT gene as described in the legend to Fig. 1.

(b) Isolation of the ligase-encoding gene

Thermostable ligase was cloned by screening for growth of an E. coli ligts7 derivative (AK76) at 42°C, when complemented with plasmid libraries of T. thermophilus HB8 DNA. The ligts7 mutation had previously been shown to reduce endogenous E. coli ligase to about 1% of normal levels at 25°C, and to cause loss of viability at 42°C (Gellert and Bullock, 1970; Gottesman et al., 1973; Konrad et al., 1973). Viability could be restored at 42°C by introduction of wt or heterologous ligase activity into the host cell (Cameron et al., 1975; Gottesman, 1976; Wilson and Murray, 1979). Plasmid libraries of T. thermophilus HB8 DNA screened for complementation ranged in size from 5000 to 27000 clones. Clones containing either complementing plasmid or revertants were observed in the BamHI, SacI, KpnI, HindIII, and PstI libraries. However, true complementation of the ligts7 defect could only be demonstrated for a single plasmid isolate, termed pDZ1, containing two contiguous BamHI fragments (Fig. 1). True complementation could be distinguished from revertants by: (i) pinpoint colony size of pDZ1-containing cells at 42°C, (ii) reintroduction of pDZ1 into fresh AK76 cells resulted in normal transformation frequencies when selected at 42°C, and (iii) presence of a thermostable NAD+-dependent nick-closing (DNA ligase) activity in crude extracts when assayed at 65°C. Colonies containing pDZ1 grew larger in the presence of IPTG at 42°C, suggesting that ligT was downstream from the lac promoter. In addition, subcloning each BamHI fragment individually, or inversion of the entire insert, resulted in a loss of complementation of the ligts7 defect. This suggested that expression of Tth ligase in quantities sufficient to complement the ligts7 defect required an exogenous promoter. The inability to isolate complementing plasmids from the KpnI, HindIII, or PstI libraries may reflect a requirement for a strong E. coli promoter just upstream from the ligase gene.

(c) Analysis of the nt and deduced aa sequence

(1) Sequence identities between Thermus thermophilus and Escherichia coli ligase genes

With ligT downstream from the lac promoter, it was possible to select for directional deletions which brought the promoter closer to the start of the gene (Fig. 1). The nt sequence analysis of four independent isolates revealed a putative ORF with > 50% aa identity to the E. coli ligase gene in the first 60 codons. The sequencing strategy, nt sequence and translation of the ligase ORF are shown in Fig. 2 and Fig. 3. The sequence had a nt composition of 66% G + C, closely correlating to the 69% G + C content of T. thermophilus HB8 (Kagawa et al., 1984). A comparison of the deduced aa sequence for Tth and E. coli ligase (Ishino et al., 1986) revealed a single chain with 47% identity and 66% similarity (Fig. 4). There are several regions which show extensive aa identities and conservation of charge: aa 24→61; aa 111→149; aa 168→178; aa 199→221; aa 303→321; aa 425→478; aa 485→498. Regions of high aa identities have less identity at the nt sequence level, suggesting an evolutionary conservation of these aa residues (see Table I). The biological significance of these regions remains to be determined.

[Figure 2: restriction map of the sequenced region on a 0-2 kb scale, with arrows marking the individual sequencing runs.]

Fig. 2. The nt sequencing strategy. The hatched arrow on the map delineates the coding sequence and direction of the Tth ligase gene. Arrows indicate sequence obtained in the sense (→) or nonsense (←) direction. Approx. 225-250 bp of new sequence information was obtained from each primer. Length of arrows is schematic and indicates minimum length of contig overlap. 100% of the DNA sequence was determined for both strands. Restriction sites are: Bm, BamHI; Bg1, BglI; Bg2, BglII; Bx, BstXI; Nc, NcoI; Nr, NarI; Tq, TaqI; Xh, XhoI. Sequencing was off single-stranded phagemid template prepared from pDZ1 (see Fig. 1); pDZ9, which contains the 2.1-kb BamHI fragment from pDZ7 cloned in pTZ19R; pDZ10, which contains the 2.1-kb BamHI fragment from pDZ7 cloned in the opposite orientation in pTZ19R; and pDZ11, which contains the 1.2-kb BamHI-KpnI fragment from pDZ7 cloned in pTZ19R. Plasmid pDZ1 and derivatives were sequenced by dideoxy chain termination using modified T7 polymerase ('Sequenase II™' from U.S. Biochemicals; Tabor and Richardson, 1987) as well as Taq polymerase with or without deaza-7-dGTP (Innis et al., 1988), and reactions electrophoresed on 100 mM Tris·borate pH 8.9/1 mM EDTA/7 M urea/6% polyacrylamide gels for 3 and 7 h at 60 W constant power. Primers used (nt positions) were: forward primers, LS210 (101→119), LS228 (308→326), LS252 (490→510), LS2449 (524→541), LS256 (677→696), LS224 (877→897), LS215 (1050→1067), LS212 (1118→1137), LS226 (1327→1346), LS246 (1553→1572), LS258 (1752→1771), LS264 (1846→1868), LS270 (1963→1982), LS295 (2046→2066); and reverse primers, LS211 (101←119), LS229 (308←326), LS253 (453←474), LS266 (565←588), LS254 (681←700), LS225 (884←904), LS255 (1055←1074), LS213 (1119←1137), LS257 (1356←1375), LS263 (1496←1516), LS247 (1553←1572), LS259 (1755←1774), LS260 (1884←1906), LS283 (2020←2040). For sequences at the end of the inserts, the reverse and universal M13 primers were also used. Oligo primers were synthesized on a Biosearch 8700 DNA synthesizer.

TABLE I

The location of aa sequence identity regions between Tth HB8 and Escherichia coli ligase

Sequence location a      Nucleotide identity    Amino acid identity

Tth HB8   34→40          17/21                  7/7
E. coli   32→38

Tth HB8   34→48          30/45                  13/15
E. coli   32→46

Tth HB8  137→142         13/18                  6/6
E. coli  134→139

Tth HB8  168→175         16/24                  8/8
E. coli  167→174

Tth HB8  199→210         25/36                  12/12
E. coli  198→209

Tth HB8  199→219         42/63                  20/21
E. coli  198→218

Tth HB8  333→339         16/21                  7/7
E. coli  329→335

Tth HB8  485→490         16/18                  6/6
E. coli  484→489

Tth HB8  485→496         26/36                  11/12
E. coli  484→495

a The aa sequence positions for Tth ligase are from Fig. 3, and for E. coli ligase from Ishino et al. (1986).
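The aa identity fractions in Table I can be reproduced by counting matching positions between two aligned segments. The short Python sketch below illustrates this; it is not the procedure used by the authors. The segment strings are read from Figs. 3 and 4, the E. coli numbering is taken to count residues without gaps, and skipping gap columns is an assumption of this sketch.

```python
def count_identities(aligned_a: str, aligned_b: str, gap: str = "-"):
    """Return (identical, compared) for two equal-length aligned strings.

    Columns in which either sequence carries a gap are not compared.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    identical = 0
    compared = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue
        compared += 1
        if res_a == res_b:
            identical += 1
    return identical, compared


# Tth ligase aa 34-48 and E. coli ligase aa 32-46 (second row of Table I).
TTH_34_48 = "DAEYDRLLRELKELE"
ECO_32_46 = "DAEYDRLMRELRELE"

# First 20 alignment columns of Fig. 4, including the two-residue gap.
TTH_START = "MTLEEARKRVNELRDLIRYH"
ECO_START = "M--ESIEQQLTELRTTLRHH"

for label, a, b in [("Table I, row 2", TTH_34_48, ECO_32_46),
                    ("Fig. 4, columns 1-20", TTH_START, ECO_START)]:
    identical, compared = count_identities(a, b)
    print(f"{label}: {identical}/{compared} "
          f"({100.0 * identical / compared:.1f}% identity)")
```

For the Table I segment this gives 13/15, matching the tabulated value. The two mismatches are the conservative Leu/Met and Lys/Arg replacements.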

(2) Codon usage and G + C content in third position of codons

Codon usage in ligT was heavily biased towards use of G and C in the third position (91.4% G + C, see Table II) as would be expected for an organism with G + C rich DNA. Essentially identical third-position codon bias has been observed for other Thermus genes: 95.5% G + C in third position for the gk24 gene encoding lactate dehydrogenase of T. caldophilus GK24 (Kunai et al., 1986); 95% and 92% for the trpB and trpA genes from T. thermophilus HB8 (Koyama and Furukawa, 1990); 94.8% for the mdh gene from T. flavus AT-62 (Nishiyama et al., 1986); 93.0% and 94.3% for the trpE and trpG genes from T. thermophilus HB8 (Sato et al., 1988); 91.8% for the polI gene of T. aquaticus YT1 (Lawyer et al., 1989); and 89.4% for the leuB gene from T. thermophilus HB8 (Kagawa et al., 1984). Codon usage is strikingly similar between the Taq polI gene and the Tth ligase gene, as well as the other genes described above (see Table II). Exceptions to this codon usage rule include the aqualysin I-encoding gene from T. aquaticus YT1 (Kwon et al., 1988), and the isoschizomeric TaqI and TthHB8 restriction endonucleases and methyltransferases (Barany et al., 1992a). A related bias is seen in the frequency of TaqI sites in the Tth ligase and other thermophilic genes: all TaqI sites are also XhoI sites (Kunai et al., 1986). Again

                                             Thermostable ligase:  M   T   L   E   E   A   R   K      8
TCGGAATAGG GGATGCGCCC CTAGTCCAAG GGAAAGTATA GCCCAAGGTA CACTAGGGCC ATG ACC CTG GAA GAG GCG AGG AAG     84

 R   V   N   E   L   R   D   L   I   R   Y   H   N   Y   R   Y   Y   V   L   A   D   P   E   I   S      33
CGG GTA AAC GAG TTA CGG GAC CTC ATC CGC TAC CAC AAC TAC CGC TAC TAC GTC CTG GCG GAC CCG GAG ATC TCC    159

 D   A   E   Y   D   R   L   L   R   E   L   K   E   L   E   E   R   F   P   E   L   K   S   P   D      58
GAC GCC GAG TAC GAC CGG CTT CTT AGG GAG CTC AAG GAG CTT GAG GAG CGC TTC CCC GAG CTC AAA AGC CCG GAC    234

 S   P   T   L   Q   V   G   A   R   P   L   E   A   T   F   R   P   V   R   H   P   T   R   M   Y      83
TCC CCC ACC CTT CAG GTG GGG GCG AGG CCT TTG GAG GCC ACC TTC CGC CCC GTC CGC CAC CCC ACC CGC ATG TAC    309

 S   L   D   N   A   F   N   L   D   E   L   K   A   F   E   E   R   I   E   R   A   L   G   R   K     108
TCC TTG GAC AAC GCC TTT AAC CTT GAC GAG CTC AAG GCC TTT GAG GAG CGG ATA GAA CGG GCC CTG GGG CGG AAG    384

 G   P   F   A   Y   T   V   E   H   K   V   D   G   L   S   V   N   L   Y   Y   E   E   G   V   L     133
GGC CCC TTC GCC TAC ACC GTG GAG CAC AAG GTG GAC GGG CTT TCC GTG AAC CTC TAC TAC GAG GAG GGG GTC CTG    459

 V   Y   G   A   T   R   G   D   G   E   V   G   E   E   V   T   Q   N   L   L   T   I   P   T   I     158
GTC TAC GGG GCC ACC CGG GGG GAC GGG GAG GTG GGG GAG GAG GTG ACC CAG AAC CTC CTC ACC ATC CCC ACC ATC    534

 P   R   R   L   K   G   V   P   E   R   L   E   V   R   G   E   V   Y   M   P   I   E   A   F   L     183
CCG AGG AGG CTC AAG GGG GTG CCG GAG CGC CTC GAG GTC CGG GGG GAG GTC TAC ATG CCC ATA GAG GCC TTC CTC    609

 R   L   N   E   E   L   E   E   R   G   E   R   I   F   K   N   P   R   N   A   A   A   G   S   L     208
CGG CTC AAC GAG GAG CTG GAG GAG CGG GGG GAG AGG ATC TTC AAA AAC CCT AGG AAT GCG GCG GCG GGT TCC TTA    684

 R   Q   K   D   P   R   I   T   A   K   R   G   L   R   A   T   F   Y   A   L   G   L   G   L   E     233
AGG CAA AAA GAC CCC CGC ATC ACC GCC AAG CGG GGC CTC AGG GCC ACC TTC TAC GCC TTA GGG CTT GGG CTG GAG    759

 E   V   E   R   E   G   V   A   T   Q   F   A   L   L   H   W   L   K   E   K   G   F   P   V   E     258
GAG GTG GAG AGG GAA GGG GTG GCG ACC CAG TTT GCC CTC CTC CAC TGG CTC AAG GAA AAA GGC TTC CCC GTG GAG    834

 H   G   Y   A   R   A   V   G   A   E   G   V   E   A   V   Y   Q   D   W   L   K   K   R   R   A     283
CAC GGC TAC GCC CGG GCC GTG GGG GCG GAA GGG GTG GAG GCG GTC TAC CAG GAC TGG CTC AAG AAG CGG CGG GCG    909

 L   P   F   E   A   D   G   V   V   V   K   L   D   E   L   A   L   W   R   E   L   G   Y   T   A     308
CTT CCC TTT GAG GCG GAC GGG GTG GTG GTG AAG CTG GAC GAG CTT GCC CTT TGG CGG GAG CTC GGC TAC ACC GCC    984

 R   A   P   R   F   A   I   A   Y   K   F   P   A   E   E   K   E   T   R   L   L   D   V   V   F     333
CGC GCC CCC CGG TTC GCC ATC GCC TAC AAG TTC CCC GCC GAG GAG AAG GAG ACC CGG CTT TTG GAC GTG GTC TTC  1,059

 Q   V   G   R   T   G   R   V   T   P   V   G   I   L   E   P   V   F   L   E   G   S   E   V   S     358
CAG GTG GGG CGC ACC GGG CGG GTG ACC CCC GTG GGG ATC CTC GAG CCC GTC TTC CTA GAG GGC AGC GAG GTC TCC  1,134

 R   V   T   L   H   N   E   S   Y   I   E   E   L   D   I   R   I   G   D   W   V   L   V   H   K     383
CGG GTC ACC CTG CAC AAC GAG AGC TAC ATA GAG GAG TTG GAC ATC CGC ATC GGG GAC TGG GTT TTG GTG CAC AAG  1,209

 A   G   G   V   I   P   E   V   L   R   V   L   K   E   R   R   T   G   E   E   R   P   I   R   W     408
GCG GGC GGG GTC ATC CCC GAG GTC CTC CGG GTC CTC AAG GAG AGG CGC ACG GGG GAG GAA AGG CCC ATT CGC TGG  1,284

 P   E   T   C   P   E   C   G   H   R   L   L   K   E   G   K   V   H   R   C   P   N   P   L   C     433
CCC GAG ACC TGC CCC GAG TGC GGC CAC CGC CTC CTC AAG GAG GGG AAG GTC CAC CGC TGC CCC AAC CCC TTG TGC  1,359

 P   A   K   R   F   E   A   I   R   H   F   A   S   R   K   A   M   D   I   Q   G   L   G   E   K     458
CCC GCC AAG CGC TTT GAG GCC ATC CGC CAC TTC GCC TCC CGC AAG GCC ATG GAC ATC CAG GGC CTG GGG GAA AAG  1,434

 L   I   E   R   L   L   E   K   G   L   V   K   D   V   A   D   L   Y   R   L   R   K   E   D   L     483
CTC ATT GAG AGG CTT TTG GAA AAG GGG CTG GTC AAG GAC GTG GCC GAC CTC TAC CGC TTG AGA AAG GAA GAC CTG  1,509

 V   G   L   E   R   M   G   E   K   S   A   Q   N   L   L   R   Q   I   E   E   S   K   K   R   G     508
GTG GGC CTG GAG CGC ATG GGG GAG AAG AGC GCC CAA AAC CTC CTC CGC CAG ATA GAG GAG AGC AAG AAA AGA GGC  1,584

 L   E   R   L   L   Y   A   L   G   L   P   G   V   G   E   V   L   A   R   N   L   A   A   R   F     533
CTG GAG CGC CTC CTC TAC GCC TTG GGG CTT CCC GGG GTG GGG GAG GTC TTG GCC CGG AAC CTG GCG GCC CGC TTC  1,659

 G   N   M   D   R   L   L   E   A   S   L   E   E   L   L   E   V   E   E   V   G   E   L   T   A     558
GGG AAC ATG GAC CGC CTC CTC GAG GCC AGC CTG GAG GAG CTC CTG GAG GTG GAG GAG GTG GGG GAG CTC ACG GCG  1,734

 R   A   I   L   E   T   L   K   D   P   A   F   R   D   L   V   R   R   L   K   E   A   G   V   E     583
AGG GCC ATC CTG GAG ACC TTG AAG GAC CCC GCC TTC CGC GAC CTG GTA CGG AGG CTC AAG GAG GCG GGG GTG GAG  1,809

 M   E   A   K   E   K   G   G   E   A   L   K   G   L   T   F   V   I   T   G   E   L   S   R   P     608
ATG GAG GCC AAG GAG AAG GGC GGG GAG GCC CTT AAA GGG CTC ACC TTC GTG ATC ACC GGG GAG CTT TCC CGC CCC  1,884

 R   E   E   V   K   A   L   L   R   R   L   G   A   K   V   T   D   S   V   S   R   K   T   S   Y     633
CGG GAA GAG GTG AAG GCC CTC CTA AGG CGC CTC GGG GCC AAG GTG ACG GAC TCC GTG AGC CGG AAG ACG AGC TAC  1,959

 L   V   V   G   E   N   P   G   S   K   L   E   K   A   R   A   L   G   V   P   T   L   T   E   E     658
CTC GTG GTG GGG GAG AAC CCG GGG AGC AAG CTG GAG AAG GCC AGG GCC CTC GGG GTC CCC ACC CTC ACG GAG GAG  2,034

 E   L   Y   R   L   L   E   A   R   T   G   K   K   A   E   E   L   V   *     676
GAG CTC TAC CGG CTC CTG GAG GCG CGG ACG GGG AAG AAG GCG GAG GAG CTC GTC TAA AGGGCTTCC  2,100

Fig. 3. The nt sequence of the ligT gene and deduced aa sequence. The nt positions are numbered consecutively starting 60 nt upstream from the gene. Restriction sites for BglII (A↓GATCT), BamHI (G↓GATCC), and TaqI (T↓CGA) are underlined. Potential in-frame start and stop (asterisk) codons are double underlined. The aa numbers (lesser) are shown above nt numbers (greater). This nt sequence is available from GenBank under accession No. M74792.
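The restriction sites marked in Fig. 3 can be located by a plain substring search, and the same search illustrates the observation in the text that TaqI sites in ligT fall within XhoI sites. The sketch below is illustrative Python, not part of the original analysis. The recognition sequences are those given in the legends, plus the standard XhoI site CTCGAG; the example fragment is the stretch of Fig. 3 encoding Gly-Ile-Leu-Glu-Pro (aa 345-349), and all helper names are hypothetical.

```python
# Recognition sequences (5' -> 3'); all four are palindromic, so only one
# strand needs to be scanned.
SITES = {
    "BglII": "AGATCT",
    "BamHI": "GGATCC",
    "TaqI": "TCGA",
    "XhoI": "CTCGAG",
}


def find_sites(seq: str, site: str) -> list:
    """Return 0-based start positions of every occurrence of `site`."""
    seq = seq.replace(" ", "").upper()
    width = len(site)
    return [i for i in range(len(seq) - width + 1) if seq[i:i + width] == site]


def taq_sites_within_xho(seq: str) -> bool:
    """True if every TaqI site lies inside an XhoI site.

    Also returns True when the sequence has no TaqI site at all.
    """
    taq_width = len(SITES["TaqI"])
    xho_width = len(SITES["XhoI"])
    xho_hits = find_sites(seq, SITES["XhoI"])
    return all(
        any(x <= t and t + taq_width <= x + xho_width for x in xho_hits)
        for t in find_sites(seq, SITES["TaqI"])
    )


# Codons for G-I-L-E-P from Fig. 3 (aa 345-349 of Tth ligase).
FRAGMENT = "GGG ATC CTC GAG CCC"

for name, site in SITES.items():
    print(f"{name:5s} {site:6s} -> {find_sites(FRAGMENT, site)}")
print("every TaqI site inside an XhoI site:", taq_sites_within_xho(FRAGMENT))
```

In this fragment a BamHI site, a TaqI site and an XhoI site overlap within 12 nt, which agrees with the clustered Bm, Tq, Xh label on the map in Fig. 2.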

the TaqI and TthHB8 restriction and modification genes are notable exceptions, suggesting an independent evolutionary origin as discussed elsewhere (Barany et al., 1992a).
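The third-position G + C values quoted in this section are the fraction of codons ending in G or C. A minimal Python sketch of the calculation follows. It is applied only to the first eight codons of ligT from Fig. 3, so its result is not comparable with the 91.4% reported for the whole gene; the function name is illustrative.

```python
def third_position_gc(cds: str) -> float:
    """Percentage of codons in an in-frame coding sequence ending in G or C."""
    cds = cds.replace(" ", "").upper()
    if not cds or len(cds) % 3 != 0:
        raise ValueError("coding sequence must be a non-empty multiple of 3 nt")
    third_bases = cds[2::3]  # every third nucleotide, starting at codon position 3
    gc_count = sum(base in "GC" for base in third_bases)
    return 100.0 * gc_count / len(third_bases)


# First eight codons of ligT (M T L E E A R K), nt 61-84 in Fig. 3.
FIRST_CODONS = "ATG ACC CTG GAA GAG GCG AGG AAG"

print(f"third-position G + C: {third_position_gc(FIRST_CODONS):.1f}%")
```

Seven of these eight codons end in G or C (87.5%); GAA is the only exception. Running the same function over the complete 676-codon ORF would be the way to check the whole-gene figure.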

(d) Overproduction and properties of Tth ligase

To obtain enzyme overproduction, it was necessary to replace the endogenous Thermus promoter/ribosome-binding site with a stronger E. coli promoter/ribosome-

TthHB8   MTLEEARKRV NELRDLIRYH NYRYYVLADP EISDAEYDRL LRELKELEER FPELKSPDSP TLQVGARPLE ATFRPVRHPT 80
E. coli  M--ESIEQQL TELRTTLRHH EYLYHVMDAP EIPDAEYDRL MRELRELETK HPELITPDSP TQRVGAAPL- ARFSQIRHEV

TthHB8   RMYSLDNAFN LDELKAFEER 100
E. coli  PMLSLDNVFD EESFLAFNKR

TthHB8   IERALGRKGP FAYTVEHKVD GLSVNLYYEE GVLVYGATRG DGEVGEEVTQ NLLTIPTIPR 160
E. coli  VQD?LKNNEK VTWCCELKLD GLAVSILYEN GVLVSAATRG DGTTGEDITS NVRTIRAIPL

TthHB8   RLKG--VPER LEVRGEVYMP IEAFLRLNEE LEERGERIFK NPRNAAAGSL RQKDPRITAK RGLRATFYAL GLGLEEVERE 238
E. coli  KLHGENIPAR LEVRGEVFLP QAGFEKINED ARRTGGKVFA NPRNAAAGSL RQLDPRITAK RPL--TFFCY GVGVLEGG-E

TthHB8   GVATQFALLH 248
E. coli  LPDTHLGRLL

TthHB8   WLKEKGFPVE HGYARAVGAE GVEAVYQDWL KKRRALPFEA DGVVVKLDEL ALWRELGYTA RAPRFAIAYK 318
E. coli  QFKKWGLPVS DRVTLCESAE EVLAFYHKVE EDRPTLGFDI DGVVIKVNSL AQQEQLGFVA RAPRWAVAFK

TthHB8   FPAEEKETRL LDVVFQVGRT GRVTPVGILE PVFLEGSEVS RVTLHNESYI EELDIRIGDW VLVHKAGGVI PEVLRVLKER 398
E. coli  FPAQEQMTFV RDVEFQVGRT GAITPVARLE PVHVAGVLVS NATLHNADEI ERLGLRIGDK VVIRRAGDVI PQVVNVVLSE

TthHB8   RTGEERPIRW PETCPECGHR 418
E. coli  RPEDTREVVF PTHCPVCGSD

TthHB8   L--LKEGKVH RCPNPL-CPA KRFEAIRHFA SRKAMDIQGL GEKLIERLLE KGLVKDVADL 475
E. coli  VERVEGEAVA RCTGGLICGA QRKESLKHFV SRRAMDVDGM GDKIIDQLVE KEYVHTPADL

TthHB8   YRLRKEDLVG LERMGEKSAQ NLLRQIEESK KRGLERLLYA LGLPGVGEVL ARNLAARFGN MDRLLEASLE ELLEVEEVGE 555
E. coli  FKLTAGKLTG LERMGPKSAQ NVVNALEKAK ETTFARFLYA LGIREVGEAT AAGLAAYFGT LEALEAASIE ELQKVPDVGI

TthHB8   LTARAILETL KDPAFRDLVR RLKEAGVEME AK-----EKG GEALKGLTFV ITGELSR-PR EEVKALLRRL GAKVTDSVSR 629
E. coli  VVASHVHNFF AEESNRNVIS ELLAEGVHWP APIVINAEEI DSPFAGKTVV LTGSLSQMSR DDAKARLVEL GAKVAGSVSK

TthHB8   KTSYLVVGEN PGSKLEKARA LGVPTLTEEE LYRLLEARTG KKAEELV * 676 aa
E. coli  KTDLVIAGEA AGSKLAKAQE LGIZVIDEAE MLRLLGS * 671 aa

Fig. 4. The aa sequence comparison of the DNA ligases from T. thermophilus and E. coli. Deduced aa sequences for T. thermophilus ligase (TthHB8) and E. coli ligase (Ishino et al., 1986) were analyzed for aa sequence identity by using the computer program GAP from the University of Wisconsin Genetics Computer Group (UWGCG). Colons denote aa identity; periods denote aa similarity; zeros denote lack of aa relatedness between aa residues in the two DNA ligases. Dashes indicate deletion of corresponding aa in that region. The aa numbering is for the T. thermophilus ligase, and positive and negative charges of various residues are indicated above or below the sequence.
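The identity score reported for the alignment in Fig. 4 amounts to counting aligned columns that hold the same residue. The following is a minimal sketch of that count, not the UWGCG GAP program itself; the function name `identity` is illustrative, and the choice to skip gapped columns in the denominator is an assumption, since alignment programs differ on this point.

```python
def identity(a: str, b: str) -> tuple[int, int]:
    """Return (identical columns, compared columns) for two aligned rows.

    Spaces are ignored. Columns with a gap ('-') in either row are
    skipped, which is an assumption about how identity is normalized.
    """
    a, b = a.replace(" ", ""), b.replace(" ", "")
    if len(a) != len(b):
        raise ValueError("aligned rows must have equal length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    return sum(x == y for x, y in pairs), len(pairs)


# One block of Fig. 4 (Tth residues 399-418 and the aligned E. coli row).
tth = "RTGEERPIRW PETCPECGHR"
eco = "RPEDTREVVF PTHCPVCGSD"
same, total = identity(tth, eco)
print(f"{same}/{total} identical ({100 * same / total:.0f}%)")
```

Applied block by block, the same function shows which stretches are highly conserved and which are divergent.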

[Gel photograph: lanes A to H; marker sizes of 92, 66 and 31 kDa; the Ligase-AMP and Ligase bands are indicated.]

Fig. 5. A 0.1% SDS-10% polyacrylamide gel electrophoresis of T. thermophilus ligase at different stages of purification. Lanes: A and H, marker proteins (molecular sizes in kDa); B, whole cells after induction; C, crude supernatant after sonication; D, pooled DEAE flow-through after heat treatment; E, one fraction (No. 23) after phosphocellulose chromatography; F, fraction No. 23 incubated with nicked DNA in ligase buffer in the absence of NAD+; G, fraction No. 23 incubated with NAD+ in ligase buffer in the absence of nicked DNA (see below). Higher molecular weight ligase (approx. 81 kDa, barely visible in lane E, and clearly visible in lane G) is the adenylated form, while lower molecular weight ligase (approx. 78 kDa) is the deadenylated form. Lanes F and G had half as much protein loaded as in lane E. Methods. The AK53 [pDZ15] cells (2 liters grown for 9 h in 0.2 mM phosphate-MOPS medium containing 50 µg Ap/ml) were harvested (4.6 g wet weight) and resuspended in 20 ml of buffer A (20 mM Tris·HCl, pH 7.6/1 mM EDTA) containing 10 mM 2-mercaptoethanol and 0.15 mM PMSF. After sonication (5 × 1 min at 50% power at 4°C), the solution was centrifuged at 39 000 × g for 60 min. The supernatant (40 ml) was brought to 300 mM KCl and passed through a 5 ml DEAE Sephacel column (to remove DNA) using 70 ml buffer A containing 0.3 M KCl. The flow-through fractions (containing the ligase) were combined and treated at 65°C for 20 min to irreversibly heat denature many E. coli enzymes including endonucleases and exonucleases. Denatured proteins were removed by centrifugation (39 000 × g for 15 min), and the ligase enzyme precipitated from the supernatant by adding an equal volume of saturated (NH4)2SO4 at room temperature (30 min). The ammonium sulfate precipitate was harvested by centrifugation and resuspended in 4 ml distilled water. Samples were dialyzed against buffer A, followed by buffer A containing 50 mM KCl.
The dialyzed protein solution was applied to a phosphocellulose column (40 ml) equilibrated with buffer A containing 50 mM KCl using an FPLC apparatus. After washing with 80 ml of the same buffer, the column was eluted with a 120 ml linear gradient of KCl (0.05-0.5 M) in buffer A. The enzyme eluted as a relatively sharp peak from 0.25-0.35 M KCl. Ligase activity was assayed for the ability to seal nicked plasmid DNA (pUC4KIXX; Barany, 1985) as monitored by electrophoresis on a 1% agarose gel. (Each plasmid has about 5-10 nicks introduced by DNase I.) One nick-closing unit of ligase is defined as the amount of ligase that circularizes 0.5 µg of nicked pUC4KIXX DNA in 20 µl of 20 mM Tris·HCl pH 7.6 (at 25°C) containing 50 mM KCl/10 mM MgCl2/1 mM EDTA/1 mM NAD+/10 mM dithiothreitol, overlaid with a drop of mineral oil, after a 15 min incubation at 65°C. Purity was monitored by presence of two bands of apparent molecular weight approx. 81 kDa (adenylated form) and 78 kDa (deadenylated form) on a 0.1% SDS-10% polyacrylamide gel (Laemmli, 1970). Each

binding site. This was achieved by identifying and sequencing the N-terminal region of the gene, and subsequently designing appropriate primers to fuse that portion to a strong promoter (phoA) using PCR (Fig. 1). In addition, yields of ligase enzyme could be slightly improved by reversing the bla (ApR) gene orientation (plasmid pDZ15). A similar effect of the bla gene orientation had been previously observed with overproduction of TaqI (Barany, 1988a). Enzyme purification was essentially as described (Panasenko et al., 1978; Takahashi et al., 1984), and further simplified by including a heat denaturation step to remove approximately half of the E. coli proteins including endogenous endonucleases and exonucleases (Fig. 5; and Barany, 1987).

During induction and purification of the ligase, two bands of varying intensities were observed in different preparations. By incubating purified enzyme either in the presence of DNA or NAD+, but not both, the higher molecular weight ligase (approx. 81 kDa) was shown to be the adenylated form, while lower molecular weight ligase (approx. 78 kDa) was the deadenylated form (Fig. 5). Such a ligase-adenylate intermediate was previously identified for the T4 and E. coli enzymes by acid precipitation of radioactively labeled enzyme, gel filtration, or by its slower mobility after electrophoresis through an SDS-polyacrylamide gel (Modrich et al., 1973; Weiss and Richardson, 1967a; Weiss et al., 1968; Zimmerman and Oshinsky, 1969). Proteolytic degradation of the adenylated form of the E. coli enzyme revealed adenosine 5' monophosphate linked to an ε-amino group of Lys (Gumport and Lehman, 1971). It is interesting to note that only 14 of the 42 Lys residues are conserved between the Tth and E. coli genes, a few of which are in highly conserved regions. Additional experiments including site-specific mutagenesis will be required to pinpoint the active site Lys.

(e) Conclusions

T. thermophilus and E. coli ligase genes have regions of high sequence similarity, and these may function in catalysis or substrate binding. Overproduction of the Tth ligase will

form could be converted to the other form by incubating 150 µg protein in ligase buffer containing either 25 µg nicked salmon sperm DNA without NAD+ (resulting in deadenylated form, lane F), or in ligase buffer with 1 mM NAD+ (resulting in the adenylated form, lane G) for 30 min at 65°C. An equal volume of 20 mM Tris·HCl pH 8.0 in 100% glycerol containing 1 mM EDTA/2 mM dithiothreitol (DTT)/200 µg per ml bovine serum albumin (Fraction V) was added (final glycerol concentration is 50%), and enzyme stored at either -70°C or -20°C. Protein concentration was determined by the method of Bradford (1976). Final yield was 6 mg ligase with a specific activity of 1.67 × 10^6 units/mg.
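The figures quoted in the Fig. 5 legend can be combined in a short back-of-the-envelope calculation. The following is a sketch under stated assumptions: the gradient is treated as ideally linear with no column dead volume, and the helper name `salt_to_volume` is illustrative rather than anything used by the authors.

```python
# Values quoted in the Fig. 5 legend.
YIELD_MG = 6.0               # final yield of ligase, mg
SPECIFIC_ACTIVITY = 1.67e6   # nick-closing units per mg
WET_CELLS_G = 4.6            # wet weight of harvested cells, g


def salt_to_volume(conc_m, start_m=0.05, end_m=0.5, gradient_ml=120.0):
    """Volume into an ideal linear KCl gradient at which conc_m is reached.

    Assumes a perfectly linear gradient and ignores column dead volume.
    """
    return gradient_ml * (conc_m - start_m) / (end_m - start_m)


total_units = YIELD_MG * SPECIFIC_ACTIVITY
print(f"total activity: {total_units:.3g} units")
print(f"per g wet cells: {total_units / WET_CELLS_G:.3g} units")
print(f"peak spans about {salt_to_volume(0.25):.0f}-"
      f"{salt_to_volume(0.35):.0f} ml of the gradient")
```

On these assumptions, the 0.25-0.35 M KCl elution window corresponds to roughly the middle third of the 120 ml gradient.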

TABLE II

Comparison of codon usage in Thermus aquaticus DNA polymerase I (TAQ POL) and Thermus thermophilus DNA ligase (TTH LIG) genes

aa    Codon   TAQ POL   TTH LIG
Arg   CGU         0         0
      CGC        24        27
      CGA         0         0
      CGG        27        26
      AGA         0         2
      AGG        25        17
      sum        76        72
Leu   UUA         0         3
      UUG         3        11
      CUU        20        15
      CUC        46        42
      CUA         5         2
      CUG        50        20
      sum       124        93
Ser   UCU         0         0
      UCC        15         9
      UCA         0         0
      UCG         1         0
      AGU         1         0
      AGC        14         9
      sum        31        18
Thr   ACU         0         0
      ACC        20        22
      ACA         0         0
      ACG        10         6
      sum        30        28
Pro   CCU         3         2
      CCC        34        25
      CCA         2         0
      CCG         9         5
      sum        48        32
Ala   GCU         2         0
      GCC        77        38
      GCA         0         0
      GCG        12        17
      sum        91        55
Gly   GGU         0         1
      GGC        28        12
      GGA         0         0
      GGG        30        41
      sum        58        54
Val   GUU         1         1
      GUC        21        19
      GUA         0         2
      GUG        29        32
      sum        51        54
Ile   AUU         3         2
      AUC        20        15
      AUA         2         4
      sum        25        21
His   CAU         0         0
      CAC        18        10
      sum        18        10
Gln   CAA         1         2
      CAG        15         7
      sum        16         9
Asn   AAU         0         1
      AAC        12        14
      sum        12        15
Lys   AAA         5         6
      AAG        37        36
      sum        42        42
Asp   GAU         3         0
      GAC        39        24
      sum        42        24
Glu   GAA         8        10
      GAG        79        82
      sum        87        92
Tyr   UAU         4         0
      UAC        20        21
      sum        24        21
Phe   UUU         8         5
      UUC        19        15
      sum        27        20
Cys   UGU         0         0
      UGC         0         4
      sum         0         4
Met   AUG        16         7
Trp   UGG        14         5

provide a ready source of this enzyme for use in LCR amplification/detection as well as the next generation of DNA diagnostic techniques (Barany, 1991a,b).
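The codon counts in Table II make it easy to quantify how strongly the ligT gene favors G or C in the third codon position. The following is a minimal sketch with the TTH LIG column transcribed by hand; the names `TTH_LIG` and `third_position_gc` are illustrative, and the stop codon is not counted because it does not appear in the table.

```python
# TTH LIG codon counts transcribed from Table II (RNA codons, stop excluded).
TTH_LIG = {
    "CGU": 0, "CGC": 27, "CGA": 0, "CGG": 26, "AGA": 2, "AGG": 17,  # Arg
    "UUA": 3, "UUG": 11, "CUU": 15, "CUC": 42, "CUA": 2, "CUG": 20,  # Leu
    "UCU": 0, "UCC": 9, "UCA": 0, "UCG": 0, "AGU": 0, "AGC": 9,      # Ser
    "ACU": 0, "ACC": 22, "ACA": 0, "ACG": 6,                          # Thr
    "CCU": 2, "CCC": 25, "CCA": 0, "CCG": 5,                          # Pro
    "GCU": 0, "GCC": 38, "GCA": 0, "GCG": 17,                         # Ala
    "GGU": 1, "GGC": 12, "GGA": 0, "GGG": 41,                         # Gly
    "GUU": 1, "GUC": 19, "GUA": 2, "GUG": 32,                         # Val
    "AUU": 2, "AUC": 15, "AUA": 4,                                    # Ile
    "CAU": 0, "CAC": 10, "CAA": 2, "CAG": 7,                          # His, Gln
    "AAU": 1, "AAC": 14, "AAA": 6, "AAG": 36,                         # Asn, Lys
    "GAU": 0, "GAC": 24, "GAA": 10, "GAG": 82,                        # Asp, Glu
    "UAU": 0, "UAC": 21, "UUU": 5, "UUC": 15,                         # Tyr, Phe
    "UGU": 0, "UGC": 4, "AUG": 7, "UGG": 5,                           # Cys, Met, Trp
}


def third_position_gc(counts):
    """Return (codons ending in G or C, total codons) for a count table."""
    total = sum(counts.values())
    gc = sum(n for codon, n in counts.items() if codon[2] in "GC")
    return gc, total


gc, total = third_position_gc(TTH_LIG)
print(f"{gc} of {total} codons end in G or C ({100 * gc / total:.1f}%)")
```

A useful sanity check on the transcription is that the total equals the 676 aa length of the deduced protein.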

ACKNOWLEDGEMENTS

We thank Antje Koller and Frances C. Lawyer for expert technical assistance, Jean Lee for DNA sequencing and the Cetus DNA synthesis group for oligo synthesis. We thank Hamilton O. Smith, John Sninsky, Tom Gingeras and Marc Kahn for critical reading and discussions, and David Cowburn, Richard Gumport, Leroy Hood, Anneel Aggarwal, Kenneth Berns, Jef Boeke, Eric Spitzer, John Zebala, Alan Mayer and Michael Danzitz for discussions. We thank Jim Van Etten for his generous gift of CviJI. We thank Paul Modrich, Joe Heitman, Geoffrey Wilson, Michael Gottesman and Noreen Murray for strains. This work was supported by grants from the National Institutes of Health (GM-41337-03) and the National Science Foundation (DMB-8714352) to F.B.

NOTE ADDED IN PROOF

After submission of this work, an alternative method for cloning thermophilic ligase was reported (Lauer et al., 1991).

REFERENCES

Barany, F.: Two codon insertion mutagenesis of plasmid genes using single-stranded hexameric oligonucleotides. Proc. Natl. Acad. Sci. USA 82 (1985) 4202-4206.

Barany, F.: A genetic system for isolation and characterization of TaqI restriction endonuclease mutants. Gene 56 (1987) 13-27.

Barany, F.: Overproduction, purification, and crystallization of TaqI restriction endonuclease. Gene 63 (1988a) 167-177.

Barany, F.: Procedures for linker insertion mutagenesis and use of new kanamycin resistance cassettes. DNA Prot. Eng. Tech. 1 (1988b) 29-35.

Barany, F.: Genetic disease detection and DNA amplification using cloned thermostable ligase. Proc. Natl. Acad. Sci. USA 88 (1991a) 189-193.

Barany, F.: The ligase chain reaction (LCR) in a PCR world. PCR Methods Applicat. 1 (1991b) 5-16.

Barany, F., Danzitz, M., Zebala, J. and Mayer, A.: Cloning and sequencing of the TthHB8I restriction and modification enzymes, and comparison with the isoschizomeric TaqI enzymes. Gene (in press).

Barany, F., Slatko, B., Danzitz, M., Cowburn, D., Schildkraut, I. and Wilson, G.: The corrected nucleotide sequence of the TaqI restriction and modification enzymes also reveals a large overlap. Gene (in press).

Barker, D.G., White, J.H.M. and Johnston, L.H.: Molecular characterization of the DNA ligase gene, CDC17, from the fission yeast Schizosaccharomyces pombe. Eur. J. Biochem. 162 (1988) 659-667.

Barker, D.G., White, J.H.M. and Johnston, L.H.: The nucleotide sequence of the DNA ligase gene (CDC9) from Saccharomyces cerevisiae: a gene which is cell-cycle regulated and induced in response to DNA damage. Nucleic Acids Res. 13 (1989) 8323-8337.

Barringer, K., Orgel, L., Wahl, G. and Gingeras, T.R.: Blunt-end and single-stranded ligations by Escherichia coli ligase: influence on an in vitro amplification scheme. Gene 89 (1990) 117-122.

Becker, A., Lyn, G., Gefter, M. and Hurwitz, J.: Enzymatic repair of DNA II. Characterization of phage-induced sealase. Proc. Natl. Acad. Sci. USA 58 (1967) 1996-2003.

Birnboim, H.C.: A rapid extraction method for the isolation of plasmid DNA. Methods Enzymol. 100 (1983) 243-255.

Bradford, M.M.: A rapid and sensitive method for the quantitation of microgram quantities of protein utilizing the principle of protein-dye binding. Anal. Biochem. 72 (1976) 248-254.

Brock, T.D. and Freeze, H.: Thermus aquaticus gen. n. and sp. n., a non-sporulating extreme thermophile. J. Bacteriol. 98 (1969) 289-297.

Cameron, J.R., Panasenko, S.M., Lehman, I.R. and Davis, R.W.: In vitro construction of bacteriophage carrying segments of the Escherichia coli chromosome: selection of hybrids containing the gene for DNA ligase. Proc. Natl. Acad. Sci. USA 72 (1975) 3416-3420.

Cozzarelli, N.R., Melechen, N.E., Jovin, T.M. and Kornberg, A.: Polynucleotide cellulose as a substrate for a polynucleotide ligase induced by phage T4. Biochem. Biophys. Res. Commun. 28 (1967) 578-586.

Degryse, E., Glansdorff, N. and Pierard, A.: A comparative analysis of extreme thermophilic bacteria belonging to the genus Thermus. Arch. Microbiol. 117 (1978) 189-196.

Dunn, J.J. and Studier, F.W.: The complete nucleotide sequence of the bacteriophage T7 DNA, and the locations of T7 genetic elements. J. Mol. Biol. 166 (1983) 477-535.

Engler, M.J. and Richardson, C.C.: DNA ligases. In: Boyer, P. (Ed.), The Enzymes, Vol. 15. Academic Press, New York, 1982, pp. 3-29.

Gefter, M.L., Becker, A. and Hurwitz, J.: Enzymatic repair of DNA I. Formation of circular DNA. Proc. Natl. Acad. Sci. USA 58 (1967) 240-247.

Gellert, M.: Formation of covalent circles of lambda DNA by E. coli extracts. Proc. Natl. Acad. Sci. USA 57 (1967) 148-155.

Gellert, M. and Bullock, M.L.: DNA ligase mutants of Escherichia coli. Proc. Natl. Acad. Sci. USA 67 (1970) 1580-1587.

Gottesman, M.M.: Isolation and characterization of a λ specialized transducing phage for the Escherichia coli DNA ligase gene. Virology 72 (1976) 33-44.

Gottesman, M.M., Hicks, M.L. and Gellert, M.: Genetics and function of DNA ligase in Escherichia coli. J. Mol. Biol. 77 (1973) 531-547.

Gumport, R.I. and Lehman, I.R.: Structure of the DNA ligase-adenylate intermediate: lysine (ε-amino)-linked adenosine monophosphoramidate. Proc. Natl. Acad. Sci. USA 68 (1971) 2559-2563.

Hanahan, D.: Studies on transformation of E. coli with plasmids. J. Mol. Biol. 166 (1983) 557-580.

Heitman, J. and Model, P.: Site-specific methylases induce the SOS DNA repair response in Escherichia coli. J. Bacteriol. 169 (1987) 3243-3250.

Holmes, D.S. and Quigley, M.: The rapid boiling method for the preparation of bacterial plasmids. Anal. Biochem. 114 (1981) 193-197.

Horton, R.M., Hunt, H.D., Ho, S.N., Pullen, J.K. and Pease, L.R.: Engineering hybrid genes without the use of restriction enzymes: gene splicing by overlap extension. Gene 77 (1989) 61-68.

Innis, M.A., Myambo, K.B., Gelfand, D.H. and Brow, M.A.D.: DNA sequencing with Thermus aquaticus DNA polymerase and direct sequencing of polymerase chain reaction-amplified DNA. Proc. Natl. Acad. Sci. USA 85 (1988) 9436-9440.

Ishino, Y., Shinagawa, H., Makino, K., Tunasawa, S., Sakiyama, F. and Nakata, A.: Nucleotide sequence of the lig gene and primary structure of DNA ligase of Escherichia coli. Mol. Gen. Genet. 204 (1986) 1-7.

Kagawa, Y., Nojima, H., Nukiwa, N., Ishizuka, M., Nakajima, T., Yasuhara, T., Tanaka, T. and Oshima, T.: High guanine plus cytosine content in the third letter of codons of an extreme thermophile: DNA sequence of the isopropylmalate dehydrogenase of Thermus thermophilus. J. Biol. Chem. 259 (1984) 2956-2960.

Konrad, E.B., Modrich, P. and Lehman, I.R.: Genetic and enzymatic characterization of a conditional lethal mutant of Escherichia coli K-12 with a temperature-sensitive DNA ligase. J. Mol. Biol. 77 (1973) 519-529.

Koyama, Y. and Furukawa, K.: Cloning and sequence analysis of tryptophan synthetase genes of an extreme thermophile, Thermus thermophilus HB27: plasmid transfer from replicated Escherichia coli recombinant colonies to competent T. thermophilus cells. J. Bacteriol. 172 (1990) 3490-3495.

Krayev, A.S., Zimin, A., Mironova, M.V., Janulaitis, A.A., Tanyashin, V.I., Skryabin, K.G. and Bayev, A.A.: The DNA ligase gene of bacteriophage T4. Dokl. Biochem. 264 (1983) 235-239.

Kunai, K., Machida, M., Matsuzawa, H. and Ohta, T.: Nucleotide sequence and characteristics of the gene L-lactate dehydrogenase of Thermus caldophilus GK24 and the deduced amino-acid sequence of the enzyme. Eur. J. Biochem. 160 (1986) 433-440.

Kwon, S.-T., Terada, I., Matsuzawa, H. and Ohta, T.: Nucleotide sequence of the gene for aqualysin I (a thermophilic serine protease) of Thermus aquaticus YT-1 and characteristics of the deduced primary structure of the enzyme. Eur. J. Biochem. 173 (1988) 491-497.

Laemmli, U.K.: Cleavage of structural proteins during the assembly of the head of bacteriophage T4. Nature 227 (1970) 680-685.

Landegren, U.K., Kaiser, R., Sanders, J. and Hood, L.: A ligase-mediated gene detection technique. Science 241 (1988) 1077-1080.

Lauer, G., Rudd, E.A., McKay, D.L., Ally, A., Ally, D. and Backman, K.C.: Cloning, nucleotide sequence, and engineered expression of Thermus thermophilus DNA ligase, a homolog of Escherichia coli DNA ligase. J. Bacteriol. 173 (1991) 5047-5053.

Lawyer, F.C., Stoffel, S., Saiki, R.K., Myambo, K., Drummond, R. and Gelfand, D.H.: Isolation, characterization, and expression in Escherichia coli of the DNA polymerase gene from Thermus aquaticus. J. Biol. Chem. 264 (1989) 6427-6437.

Lehman, I.R.: DNA ligase: structure, mechanism, and function. Science 186 (1974) 790-797.

Maniatis, T., Fritsch, E.F. and Sambrook, J.: Molecular Cloning. A Laboratory Manual. Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 1982.

Mead, D.A., Szczesna-Skorupa, E. and Byron, K.: Single stranded DNA SP6 promoter plasmids for engineering mutant RNAs and proteins: synthesis of a 'stretched' preproparathyroid hormone. Nucleic Acids Res. 13 (1985) 1103-1118.

Mead, D.A., Szczesna-Skorupa, E. and Byron, K.: Single-stranded DNA 'blue' T7 promoter plasmids: a versatile tandem promoter system for cloning and protein engineering. Prot. Eng. 1 (1986) 67-74.

Meselson, M. and Yuan, R.: DNA restriction enzyme from E. coli. Nature 217 (1968) 1110-1114.

Miller, J.H.: Experiments in Molecular Genetics. Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 1972, pp. 201-205.

Modrich, P. and Lehman, I.R.: Deoxyribonucleic acid ligase: a steady state kinetic analysis of the enzyme from Escherichia coli. J. Biol. Chem. 248 (1973) 7502-7511.

Modrich, P., Anraku, Y. and Lehman, I.R.: Deoxyribonucleic acid ligase: isolation and physical characterization of the homogeneous enzyme from Escherichia coli. J. Biol. Chem. 248 (1973) 7495-7501.

Moses, P.B., Boeke, J.D., Horiuchi, K. and Zinder, N.: Restructuring the bacteriophage f1 genome: expression of gene VIII in the intergenic space. Virology 104 (1980) 267-278.

Neidhardt, F.C., Bloch, P.L. and Smith, D.F.: Culture medium for enterobacteria. J. Bacteriol. 119 (1974) 736-747.

Nickerson, D.A., Kaiser, R., Lappin, S., Stewart, J., Hood, L. and Landegren, U.: Automated DNA diagnostics using an ELISA-based oligonucleotide ligation assay. Proc. Natl. Acad. Sci. USA 87 (1990) 8923-8927.

Nishiyama, M., Matsubara, N., Yamamoto, K., Iijima, S., Uozumi, T. and Beppu, T.: Nucleotide sequence of the malate dehydrogenase gene of Thermus flavus and its mutation directing an increase in enzyme activity. J. Biol. Chem. 261 (1986) 14178-14183.

Olivera, B.M. and Lehman, I.R.: Linkage of polynucleotides through phosphodiester bonds by an enzyme from Escherichia coli. Proc. Natl. Acad. Sci. USA 57 (1967) 1426-1433.

Oshima, T. and Imahori, K.: Isolation of an extreme thermophile and thermostability of its transfer ribonucleic acid and ribosomes. J. Gen. Appl. Microbiol. 17 (1971) 513-517.

Panasenko, S.M., Alazard, R.J. and Lehman, I.R.: A simple, three-step procedure for the large scale purification of DNA ligase from a hybrid lambda lysogen constructed in vitro. J. Biol. Chem. 253 (1978) 4590-4592.

Saiki, R.K., Gelfand, D.H., Stoffel, S., Scharf, S.J., Higuchi, R., Horn, G.T., Mullis, K.B. and Erlich, H.A.: Primer-directed enzymatic amplification of DNA with a thermostable DNA polymerase. Science 239 (1988) 487-491.

Sato, S., Nakada, Y., Kanaya, S. and Tanaka, T.: Molecular cloning and nucleotide sequence of Thermus thermophilus HB8 trpE and trpG. Biochim. Biophys. Acta 950 (1988) 303-312.

Slatko, B.E., Benner, J.S., Jager-Quinton, T., Moran, L.S., Simcox, T.G., Van Cott, E.M. and Wilson, G.G.: Cloning, sequencing and expression of the TaqI restriction-modification system. Nucleic Acids Res. 15 (1987) 9781-9796.

Tabor, S. and Richardson, C.C.: DNA sequence analysis with a modified bacteriophage T7 DNA polymerase. Proc. Natl. Acad. Sci. USA 84 (1987) 4767-4771.

Takahashi, M. and Tsuneko, U.: Thermophilic HB8 DNA ligase: effects of polyethylene glycols and polyamines on blunt-end ligation of DNA. J. Biochem. 100 (1986) 123-131.

Takahashi, M., Yamaguchi, E. and Uchida, T.: Thermophilic DNA ligase. J. Biol. Chem. 259 (1984) 10041-10047.

Weiss, B. and Richardson, C.C.: Enzymatic breakage and joining of deoxyribonucleic acid, III. An enzyme-adenylate intermediate in the polynucleotide ligase reaction. J. Biol. Chem. 242 (1967a) 4270-4272.

Weiss, B. and Richardson, C.C.: Enzymatic breakage and joining of deoxyribonucleic acid, I. Repair of single strand breaks in DNA by an enzyme system from Escherichia coli infected with T4 bacteriophage. Proc. Natl. Acad. Sci. USA 57 (1967b) 1021-1028.

Weiss, B., Thompson, A. and Richardson, C.C.: Enzymatic breakage and joining of deoxyribonucleic acid, VII. Properties of the enzyme-adenylate intermediate in the polynucleotide ligase reaction. J. Biol. Chem. 243 (1968) 4556-4563.

Williams, R.A.D.: Biochemical taxonomy of the genus Thermus. In: Da Costa, M.S., Duarte, J.C. and Williams, R.A.D. (Eds.), FEMS Symposium, Troia, Portugal, Vol. 49, Elsevier, Amsterdam, 1988, pp. 82-97.

Wilson, G.G. and Murray, N.E.: Molecular cloning of the DNA ligase gene from the bacteriophage T4, I. Characterization of the recombinants. J. Mol. Biol. 132 (1979) 471-491.

Wu, D.Y. and Wallace, R.B.: The ligation amplification reaction (LAR): amplification of specific DNA sequences using sequential rounds of template-dependent ligation. Genomics 4 (1989a) 560-569.

Wu, D.Y. and Wallace, R.B.: The specificity of nick-closing activity of bacteriophage T4 DNA ligase. Gene 76 (1989b) 245-254.

Xia, Y., Burbank, D.E., Uher, L., Rabussay, D. and Van Etten, J.L.: IL-3A virus infection of a Chlorella-like green alga induces a DNA restriction endonuclease with novel sequence specificity. Nucleic Acids Res. 15 (1987) 6075-6090.

Yudelevich, A., Ginsberg, B. and Hurwitz, J.: Discontinuous synthesis of DNA during replication. Proc. Natl. Acad. Sci. USA 61 (1968) 1129-1136.

Zimmerman, S.B. and Oshinsky, C.K.: Enzymatic joining of deoxyribonucleic acid strands, I. Further purification of the deoxyribonucleic acid ligase from Escherichia coli and multiple forms of the purified enzyme. J. Biol. Chem. 244 (1969) 4689-4695.

Zimmerman, S.B., Little, J.W., Oshinsky, C.K. and Gellert, M.: Enzymatic joining of DNA strands: a novel reaction of diphosphopyridine nucleotide. Proc. Natl. Acad. Sci. USA 57 (1967) 1841-1848.