
Nucleotide sequence of the pntA and pntB genes encoding the pyridine nucleotide transhydrogenase of Escherichia coli


Eur. J. Biochem. 158, 647-653 (1986) © FEBS 1986

David M. CLARKE ¹, Tip W. LOO ², Shirley GILLAM ² and Philip D. BRAGG ¹
Departments of ¹ Biochemistry and ² Pathology, University of British Columbia, Vancouver

(Received March 14/May 12, 1986) - EJB 86 0266

A 3240-base-pair DNA fragment spanning the pyridine nucleotide transhydrogenase (pnt) genes of Escherichia coli has been sequenced. The sequence contains two open reading frames, pntA and pntB, of 1506 and 1386 base pairs, coding for the transhydrogenase α and β subunits, respectively. The coding sequences are preceded by a promoter-like structure and are most likely co-transcribed. Each coding sequence is preceded by a Shine-Dalgarno sequence.

The amino-terminal amino acid sequences were determined from the purified α and β subunits of the transhydrogenase. These sequences agree with those predicted from the nucleotide sequences of the pntA and pntB genes. The predicted relative molecular masses of 53 906 (α) and 48 667 (β) are close to the values obtained by analysis of the subunits by sodium dodecyl sulfate/polyacrylamide gel electrophoresis. Several hydrophobic regions large enough to span the cytoplasmic membrane were observed in each subunit. These results indicate that transhydrogenase is an intrinsic membrane protein.

Pyridine nucleotide transhydrogenase, found in the cytoplasmic membrane of Escherichia coli and in the inner membrane of mitochondria, catalyzes the reversible transfer of a hydride ion equivalent between NAD and NADP. The transhydrogenation between NADH and NADP is coupled to respiration and ATP hydrolysis (see [1] for review).

During the past decade, a large body of evidence has accumulated from studies on the mitochondrial transhydrogenase to support Mitchell's hypothesis [2] that the transhydrogenase functions as a proton pump and translocates protons across the membrane according to the equation [3]:

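$n\,\mathrm{H}^{+}_{\mathrm{in}} + \mathrm{NADPH} + \mathrm{NAD} \rightleftharpoons n\,\mathrm{H}^{+}_{\mathrm{out}} + \mathrm{NADP} + \mathrm{NADH}$.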

Comparatively little work has been done with the corresponding enzyme from E. coli despite the advantages that genetic manipulation of this system can offer. Recently, we have cloned the pnt (pyridine nucleotide transhydrogenase) gene of E. coli into a bacterial plasmid and amplified its products [4]. The enzyme was found to consist of two polypeptides of Mr 50000 and 47000. Neither of the two subunits alone exhibited transhydrogenase activity.

Cloning of the pnt gene into multicopy plasmids resulted in greater than 70-fold overproduction of transhydrogenase in cells harboring the plasmid. These cells served as an excellent starting material for the purification of the transhydrogenase, as the two subunits, α (Mr 50000) and β (Mr 47000), were the two major proteins in the cytoplasmic membrane. The enzyme was purified to apparent homogeneity by a relatively simple purification scheme involving differential extraction of the membranes with detergents and centrifugation through a sucrose solution [5]. The purified enzyme was reconstituted into liposomes and shown to promote the acidification of the intravesicular space as indicated by the quenching of the fluorescence of 9-aminoacridine. Thus, the transhydrogenase could generate a proton gradient in the absence of other components of the electron transfer and energy-conserving systems such as the cytochromes and the ATPase. Both proton translocation and catalytic activities were inhibited by N,N'-dicyclohexylcarbodiimide, suggesting that the two activities are obligatorily linked.

In the present paper we describe the determination of the nucleotide sequence of the pntA and pntB genes. This has permitted the prediction of the amino acid sequence of the transhydrogenase α and β subunits and of the possible regions of the polypeptide chains which span the membrane. The primary structures of these proteins should aid in further studies on the structure of the transhydrogenase and the mechanism by which the transfer of a hydride ion equivalent is coupled to proton translocation.

Correspondence to P. D. Bragg, Department of Biochemistry, University of British Columbia, 2146 Health Sciences Mall, Vancouver, British Columbia, Canada V6T 1W5

Abbreviations. SDS, sodium dodecyl sulfate; kb, 10³ base pairs.

Enzyme. Pyridine nucleotide transhydrogenase or NAD(P)⁺ transhydrogenase (EC 1.6.1.1).

MATERIALS AND METHODS

Materials

Restriction endonucleases were obtained either from Amersham International Corp. or from Pharmacia P-L Biochemicals Ltd. Phage T4-DNA ligase and exonuclease BAL31 were purchased from Boehringer Mannheim Corp. DNA polymerase I (Klenow fragment) was obtained from Bethesda Research Laboratories Inc. The 17-nucleotide synthetic primer and [α-³²P]dATP were the products of Amersham International Corp. The M13mp18 and M13mp19 phages were generous gifts of R. T. A. MacGillivray of the Department of Biochemistry.

Fig. 1. Summary of clones used to establish the nucleotide sequence of the pnt gene. Horizontal arrows represent extent of sequences determined and their orientations. (.....) pUC13 vector DNA; Hi, HindIII; H, HpaI; Ps, PstI; P, PvuII; E, EcoRI; X, XhoI; Hc, HincII; Bs, BstEII; S, SstI; B, BamHI; Sm, SmaI; Sa, SalI

Cloning in bacteriophage M13

The plasmids pDC11 and pDC21 are hybrid plasmids of pUC13 containing the pntA and pntB genes on inserts of 4.05 × 10³ and 3.05 × 10³ bases, respectively [4]. The plasmids were grown in E. coli JM83 and isolated as described previously [4].

Most of the nucleotide sequence was derived by analysing the products of exonuclease BAL31 digestion. Three libraries of fragments were generated using BAL31 by first cleaving pDC21 at unique restriction sites at the 5'-terminus, 3'-terminus, or near the middle of the insert containing the pntA and pntB genes. The digestion was stopped by phenol extraction and the DNA precipitated with ethanol. The cleaved DNA was suspended in 10 mM Tris/HCl, pH 8.0, 1 mM EDTA and treated with BAL31 using the procedure described by Maniatis et al. [6]. An equal volume of twofold-concentrated BAL31 buffer (24 mM CaCl2, 24 mM MgCl2, 0.4 M NaCl, 40 mM Tris/HCl, pH 8.0, 2 mM EDTA) was added to the sample of DNA. The samples were incubated at 37°C for 3 min and then an appropriate amount of BAL31, determined experimentally, was added. At appropriate times (0.5-12 min) a sample was removed and EGTA (0.2 M, pH 8.0) was added to a final concentration of 20 mM. The samples of BAL31-digested DNA were extracted with phenol/chloroform and precipitated with ethanol. The shortened fragments were released from the vector by treatment with a restriction endonuclease and cloned into M13mp18 or M13mp19. Restriction endonuclease digestions were carried out according to the manufacturer's instructions. The conditions used for ligation, transformation and agarose gel electrophoresis have been described previously [4].
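The buffer arithmetic in this procedure can be checked with a short sketch. It computes the final concentrations after the DNA sample (in 10 mM Tris/HCl, 1 mM EDTA) is mixed with an equal volume of twofold-concentrated BAL31 buffer, and the volume of 0.2 M EGTA needed to reach 20 mM. The sketch assumes additive volumes and ignores the small volume of enzyme added; the function names and the 90 µl sample volume are hypothetical and chosen only for illustration.

```python
def mix_equal_volumes(solution_a_mM, solution_b_mM):
    """Final concentrations (mM) after mixing equal volumes of two solutions."""
    species = set(solution_a_mM) | set(solution_b_mM)
    return {
        s: (solution_a_mM.get(s, 0.0) + solution_b_mM.get(s, 0.0)) / 2
        for s in species
    }


def stock_volume_for_final(sample_volume, stock_conc, final_conc):
    """Volume of stock to add so the mixture reaches final_conc.

    Solves stock_conc * v / (sample_volume + v) = final_conc for v.
    """
    return sample_volume * final_conc / (stock_conc - final_conc)


# Buffer in which the cleaved DNA was suspended (mM).
dna_sample = {"Tris": 10.0, "EDTA": 1.0}
# Twofold-concentrated BAL31 buffer as listed in the text (mM).
bal31_2x = {"CaCl2": 24.0, "MgCl2": 24.0, "NaCl": 400.0, "Tris": 40.0, "EDTA": 2.0}

reaction = mix_equal_volumes(dna_sample, bal31_2x)

# Hypothetical 90 µl aliquot stopped with 0.2 M (200 mM) EGTA to 20 mM final.
egta_ul = stock_volume_for_final(sample_volume=90.0, stock_conc=200.0, final_conc=20.0)
```

Because the stock is ten times the target concentration, the EGTA volume is always one ninth of the aliquot volume under these assumptions.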

DNA sequencing

Three sets of fragments from plasmid pDC21 were generated using BAL31. The plasmid was linearized by cleavage at the unique restriction endonuclease SmaI or PstI site, treated with BAL31 for various lengths of time, and then the shortened fragments were released with PstI or BamHI, respectively. The fragments were cloned into M13mp19 cleaved with PstI/SmaI or BamHI/SmaI. Clones containing fragments of the opposite strand of pDC21 were generated by cleaving the plasmid with BstEII, treating the linearized plasmid for various lengths of time with BAL31, and then releasing two fragments containing opposite ends of the pnt gene by cleavage with HindIII and BamHI for cloning into either BamHI/HincII-treated M13mp18 or HindIII/HincII-treated M13mp19. The unique HindIII/HpaI fragment of pDC11 was also cloned into M13 vectors for sequencing.

Single-stranded DNA templates were prepared from recombinant M13 phages [7] grown in E. coli JM103 and sequenced by the dideoxy chain-termination method [8] using a 17-nucleotide sequencing primer [9].

Amino-terminal sequence analysis

Transhydrogenase was purified from E. coli strain JM83 carrying the pnt gene on the multicopy plasmid pDC21 as described by Clarke and Bragg [5]. The transhydrogenase α and β subunits were separated by preparative SDS/polyacrylamide gel electrophoresis in a vertical gradient slab gel by the method of Laemmli [10]. Purified transhydrogenase (450 µg) was dissolved in 400 µl of the sample buffer, containing 2% SDS and 5% 2-mercaptoethanol, and applied on the 4.0% stacking gel. The acrylamide concentration in the 1.5-mm-thick and 140-mm-long separating gel was 10%. Sodium mercaptoacetate (0.1 mM) was included in the cathode buffer reservoir to minimize the destruction of tryptophan, histidine and methionine side-chains by free radicals or oxidants trapped in the gel matrix [11]. After electrophoresis, the proteins were detected by immersing the gel in 0.1 M KCl at 4°C for about 5 min [12]. Gel slices containing the now visible protein bands were cut out, finely divided and eluted with 3 ml (about five times the volume of the wet gel slice) of 0.1% SDS/1 mM dithiothreitol by slow agitation at 22°C for 6 h. The filtered eluates of α and β proteins were made 12% in trichloroacetic acid, kept for 1 h at 0°C, and centrifuged in a Fisher micro-centrifuge for 15 min at 4°C. The recovered protein precipitates were washed three times with 1 ml of ice-cold 10% trichloroacetic acid and three times with acetone (at -20°C) by resuspension and recentrifugation for 5 min at 4°C. The samples were lyophilized. Amino-terminal sequence analysis was performed on a gas-phase sequenator at the University of Victoria.

RESULTS AND DISCUSSION

Sequencing strategy

The restriction endonuclease maps of plasmids pDC11 and pDC21 are illustrated in Fig. 1. The positions of the pntA and pntB genes relative to the restriction map are based on gene expression studies with recombinant plasmids [4]. These studies indicated that at least the 2.65-kb fragment bounded by the HpaI restriction sites was essential for the expression of enzyme activity. Two polypeptides with a combined Mr of 97000 would require a coding capacity of about 2.7 kb of DNA. Therefore, the plasmid pDC21 with an insert of 3.05 kb is close to the minimum amount of DNA required for the expression of complete pntA and pntB gene products.

Table 1. Amino acid sequences of the amino-terminal regions of the transhydrogenase α and β subunits
The transhydrogenase subunits were isolated as described in Materials and Methods and sequenced on a gas-phase sequenator at the University of Victoria. n.d., not determined

Subunit  Amino acid residue
         1    2    3    4    5    6    7    8    9    10   11   12   13   14   15   16   17   18   19   20   21   22   23   24   25   26   27   28
α        M    R    I    G    I    P    R    E    R    L    T    N    E    T    R    V    A    V    T    P    K    T    G    E    Q    n.d. n.d. K
β        M    S    G    G    L    V    T    A

Table 2. Codon usage in the E. coli pnt genes
The number of times that a particular codon is used for the indicated amino acid (AA) in the transhydrogenase α and β subunits is given

Codon  AA     α   β    Codon  AA     α   β    Codon  AA     α   β    Codon  AA     α   β
TTT    F      9   8    TCT    S      1   7    TAT    Y      4   5    TGT    C      2   2
TTC    F     11  13    TCC    S      2   4    TAC    Y      4   3    TGC    C      4   2
TTA    L      4   3    TCA    S      6   1    TAA    stop   0   0    TGA    stop   0   0
TTG    L      6   6    TCG    S      5   6    TAG    stop   0   0    TGG    W      8   3
CTT    L      8   3    CCT    P      2   4    CAT    H      2   7    CGT    R     10   7
CTC    L      4   3    CCC    P      0   0    CAC    H      2   5    CGC    R      6   4
CTA    L      2   0    CCA    P      5   4    CAA    Q      9   4    CGA    R      1   1
CTG    L     18  34    CCG    P     17   9    CAG    Q     11   6    CGG    R      1   1
ATT    I     24  19    ACT    T      5   4    AAT    N     10   4    AGT    S      5   2
ATC    I     11  18    ACC    T     17   9    AAC    N      8  14    AGC    S      8   5
ATA    I      3   1    ACA    T      3   2    AAA    K     20  12    AGA    R      2   0
ATG    M     15  19    ACG    T      5   7    AAG    K      4   5    AGG    R      0   0
GTT    V     13   9    GCT    A      2  13    GAT    D      9  11    GGT    G     10  22
GTC    V     11  10    GCC    A     11   6    GAC    D      7   7    GGC    G     22  14
GTA    V      4   7    GCA    A     18  12    GAA    E     24  15    GGA    G      1   5
GTG    V     23  22    GCG    A     28  22    GAG    E      7   4    GGG    G      8   7
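The coding-capacity estimate in the Sequencing strategy paragraph (two polypeptides with a combined Mr of 97000 needing about 2.7 kb) can be reproduced roughly as follows. This is a sketch under an assumption that is not stated in the paper: an average residue mass of about 110 Da, a common rule of thumb. The function name is illustrative.

```python
# Rule-of-thumb average mass of one amino acid residue in a protein (Da).
# This value is an assumption of the sketch, not a figure from the paper.
AVERAGE_RESIDUE_MASS = 110.0


def coding_capacity_bp(total_mr, residue_mass=AVERAGE_RESIDUE_MASS):
    """Approximate number of base pairs needed to encode protein of total_mr."""
    residues = total_mr / residue_mass
    return 3 * residues  # three nucleotides per codon


combined_mr = 50000 + 47000          # SDS gel estimates for the two subunits
required_bp = coding_capacity_bp(combined_mr)
insert_bp = 3050                     # size of the pDC21 insert
spare_bp = insert_bp - required_bp   # DNA left for non-coding sequence
```

With this residue mass the requirement comes out a little above 2.6 kb, which leaves only a few hundred base pairs of the 3.05-kb insert for promoter, ribosome binding sites and flanking sequence.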

The strategy for sequencing the pnt gene complex was to generate a library of progressively shortened DNA fragments using exonuclease BAL31 as described in Materials and Methods. Thus, the sequence was built up in an orderly and rapid manner in both orientations as summarized in Fig. 1.

The structural pntA and pntB genes

In order to predict the amino acid sequence of the α and β subunits of E. coli transhydrogenase, the nucleotide sequence of the insert of plasmid pDC21, which extends over positions 211-3240 (Fig. 2), was determined. When this sequence was examined for coding capacity, two large reading frames of 1506 (nucleotides 247-1752) and 1386 (nucleotides 1786-3171) base pairs were observed. The reading frames can encode polypeptides of 502 and 462 amino acids. The first amino acid residues predicted from the DNA sequence agreed well with the N-terminal sequences obtained by analysis of the purified transhydrogenase subunits (Table 1). The discrepancy at position 23 of the α subunit is probably due to an ambiguity in the protein sequencing. These data clearly show that the first open reading frame (pntA) encodes the α subunit and the second open reading frame (pntB) encodes the β subunit. The orientation of the two reading frames confirms predictions made previously from restriction endonuclease mapping of the cloned pnt gene [4].

Fig. 2. Nucleotide sequence of the pnt gene region (sequence listing with the predicted amino acid sequences of the α and β subunits). Transcription and translation are from left to right. Each gene is marked above the proposed points of initiation of translation (pntA and pntB). Promoter-like sequences are boxed and labeled -10 and -35. Proposed ribosome binding sites (Shine and Dalgarno sequences) are underlined

The usage of codons in the pnt gene coding regions further supports the assignment of the open reading frames (Table 2). It is non-random and typical of many E. coli genes [13]. The codons CTA, ATA, AGA and AGG are rarely used in E. coli [14]. These codons are also used sparingly in the pntA and pntB genes. Grosjean and Fiers [14] have analyzed codon usage in E. coli genes. They found that efficient in-phase translation is facilitated by proper choice of degenerate codewords ending with T or C, promoting a codon-anticodon interaction of intermediate strength (optimal energy) over those with very strong or very weak interaction energy. Generally, efficiently expressed genes show a clear preference for a C in the third base position of a codon if the first two nucleotides of the codon are T and/or A, and a preference for a T in the third base position of a codon if the first two nucleotides of the codon are C and/or G. Conversely, codon usage in weakly expressed genes such as repressor genes follows exactly the opposite rules. That is, codons for weakly expressed genes favour a T in the third base position if the first two nucleotides of the codon are T and/or A, and a C in the third base position if the first two nucleotides are C and/or G. Codon usage in both the α and β subunit genes (Table 2) does not clearly resemble codon usage in either weakly or strongly expressed genes. The codon usage reflects a moderately efficient translation of transhydrogenase mRNA. In normal E. coli cells, transhydrogenase represents only 0.1-0.5% of the cytoplasmic membrane protein. This is much more expression than weakly expressed genes such as repressor proteins but much less than efficiently expressed genes such as RNA polymerase and ribosomal proteins.
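The third-base rule of Grosjean and Fiers described in this section can be tallied directly from codon counts. The sketch below is illustrative only: the counts are transcribed from Table 2 as extracted here (so any transcription error carries over), only the T-ending and C-ending codons whose first two bases are both A/T or both G/C are included, and the function and variable names are hypothetical.

```python
# Codon -> (count in alpha subunit, count in beta subunit), from Table 2.
# Only codons whose first two bases are both A/T or both G/C and that end
# in T or C are listed, since these are the ones the rule addresses.
COUNTS = {
    "TTT": (9, 8),   "TTC": (11, 13),
    "TAT": (4, 5),   "TAC": (4, 3),
    "ATT": (24, 19), "ATC": (11, 18),
    "AAT": (10, 4),  "AAC": (8, 14),
    "CCT": (2, 4),   "CCC": (0, 0),
    "CGT": (10, 7),  "CGC": (6, 4),
    "GCT": (2, 13),  "GCC": (11, 6),
    "GGT": (10, 22), "GGC": (22, 14),
}


def third_base_tally(counts, subunit_index):
    """Sum T-ending and C-ending codon counts for AT-type and GC-type doublets."""
    tally = {"AT": {"T": 0, "C": 0}, "GC": {"T": 0, "C": 0}}
    for codon, pair in counts.items():
        first_two, third = codon[:2], codon[2]
        if set(first_two) <= set("AT"):
            group = "AT"
        elif set(first_two) <= set("GC"):
            group = "GC"
        else:
            continue  # mixed doublets are outside the rule
        if third in "TC":
            tally[group][third] += pair[subunit_index]
    return tally


alpha_tally = third_base_tally(COUNTS, 0)
beta_tally = third_base_tally(COUNTS, 1)
```

Each tally can then be compared with the two patterns described in the text: C after A/T doublets and T after G/C doublets for efficiently expressed genes, and the reverse for weakly expressed genes.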

Translational and transcriptional sequences

As discussed above, the proposed start points for translation (positions 247 and 1786) were rigorously established by determining the amino-terminal amino acid sequences of the transhydrogenase α and β subunits. Both coding regions start with ATG. The pntA coding region terminates with a TGA codon whereas that for pntB terminates with a TAA codon. Typical ribosome binding sites [15, 16] are present 7-9 base pairs upstream of the ATG codons for the amino-terminal formylmethionine (Fig. 2).

Upstream of the initiation point for translation of the pntA gene, at position 163-168 (Fig. 2, -10), is a structure like a Pribnow hexamer [17], which is separated by 19 base pairs from an RNA polymerase recognition site [18] at position 138-143 (Fig. 2, -35).

No promoter-like structure is present between the pntA and pntB genes. There are only 30 nucleotides between the pntA termination codon and the pntB start codon. Therefore, both genes are probably co-transcribed.

No obvious terminator-like structure is observed in the sequence of the inserted DNA following the coding region in pDC21. This is not surprising since the sequence following this coding region is short (69 nucleotides).
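The positions quoted in this section and in the description of the reading frames are mutually consistent, which a few lines of arithmetic confirm. The sketch assumes that the quoted reading-frame coordinates (247-1752 and 1786-3171) exclude the stop codons, an inference from the fact that 1506 and 1386 base pairs correspond exactly to 502 and 462 codons; the helper names are illustrative.

```python
def span_length(start, end):
    """Number of nucleotides from start to end, both inclusive."""
    return end - start + 1


def gap_between(left_end, right_start):
    """Number of nucleotides strictly between two positions."""
    return right_start - left_end - 1


PNTA = (247, 1752)    # pntA reading frame, stop codon assumed excluded
PNTB = (1786, 3171)   # pntB reading frame, stop codon assumed excluded
STOP_CODON = 3
INSERT_END = 3240

pnta_bp = span_length(*PNTA)
pntb_bp = span_length(*PNTB)
pnta_residues = pnta_bp // 3
pntb_residues = pntb_bp // 3

# Nucleotides between the pntA stop codon and the pntB start codon.
intergenic = gap_between(PNTA[1] + STOP_CODON, PNTB[0])

# Spacing between the -35 region (138-143) and the -10 region (163-168).
promoter_spacing = gap_between(143, 163)

# Sequence following the pntB coding region, counted from the last sense codon.
downstream = INSERT_END - PNTB[1]
```

Under this reading, the 69 nucleotides that follow the pntB coding region include the TAA stop codon itself.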

Amino acid composition of the α and β subunits

The predicted amino acid compositions of the transhydrogenase α and β subunits are shown in Table 3. The calculated Mr for the α and β subunits are 53 906 and 48 667, respectively. These results are in close agreement with the values of 52000 and 47000 obtained by sodium dodecyl sulfate/polyacrylamide gel electrophoresis of the subunits [4].
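The calculated Mr values can be approximated from the residue counts of Table 3. The sketch below sums standard average residue masses and adds one water for the free termini. The mass table is an assumption of this example (the paper does not say which values were used), so agreement to within a few daltons is all that should be expected.

```python
# Average masses of amino acid residues in Da (amino acid minus water).
RESIDUE_MASS = {
    "Gly": 57.0519, "Ala": 71.0788, "Ser": 87.0782, "Pro": 97.1167,
    "Val": 99.1326, "Thr": 101.1051, "Cys": 103.1388, "Leu": 113.1594,
    "Ile": 113.1594, "Asn": 114.1038, "Asp": 115.0886, "Gln": 128.1307,
    "Lys": 128.1741, "Glu": 129.1155, "Met": 131.1926, "His": 137.1411,
    "Phe": 147.1766, "Arg": 156.1875, "Tyr": 163.1760, "Trp": 186.2132,
}
WATER = 18.015  # one water for the free amino and carboxyl termini

# Residues per subunit from Table 3: (alpha, beta).
COMPOSITION = {
    "Gly": (41, 48), "Ala": (59, 55), "Leu": (42, 47), "Ile": (38, 38),
    "Val": (51, 48), "Pro": (24, 17), "Glu": (31, 19), "Gln": (20, 10),
    "Asp": (16, 18), "Asn": (18, 18), "Thr": (30, 22), "Ser": (27, 25),
    "Met": (15, 19), "Cys": (6, 4), "Arg": (20, 13), "Lys": (24, 17),
    "His": (4, 12), "Tyr": (8, 8), "Phe": (20, 21), "Trp": (8, 3),
}


def relative_molecular_mass(counts):
    """Mr of an unmodified polypeptide with the given residue counts."""
    return sum(n * RESIDUE_MASS[aa] for aa, n in counts.items()) + WATER


alpha_counts = {aa: pair[0] for aa, pair in COMPOSITION.items()}
beta_counts = {aa: pair[1] for aa, pair in COMPOSITION.items()}

alpha_mr = relative_molecular_mass(alpha_counts)
beta_mr = relative_molecular_mass(beta_counts)
```

The totals of the two count columns also serve as a check on the table, since they should equal the 502 and 462 residues of the two reading frames.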

The amino acid composition of bovine mitochondrial transhydrogenase has been determined [19-21]. The polarity index [22] of mitochondrial transhydrogenase, calculated by summing the molar percentages of polar amino acid residues (Asx, Glx, Ser, Thr, His, Lys, Arg), was 40% [20]. The polarity indices of the α (38%) and β (33%) subunits of the E. coli transhydrogenase are lower and more closely resemble the polarity of intrinsic membrane proteins [22]. The polarity of the mitochondrial transhydrogenase is at the borderline between that of intrinsic membrane proteins and soluble proteins [20].

Table 3. Amino acid composition of the transhydrogenase subunits
The composition was predicted from the nucleotide sequences of the pntA and pntB genes

Amino acid   No. of residues/subunit   Content (mol/100 mol)
             α      β                  α       β
Gly          41     48                 8.17    10.39
Ala          59     55                 11.75   11.90
Leu          42     47                 8.37    10.17
Ile          38     38                 7.57    8.22
Val          51     48                 10.16   10.39
Pro          24     17                 4.78    3.68
Glu          31     19                 6.18    4.11
Gln          20     10                 3.98    2.16
Asp          16     18                 3.19    3.90
Asn          18     18                 3.59    3.90
Thr          30     22                 5.98    4.76
Ser          27     25                 5.38    5.41
Met          15     19                 2.99    4.11
Cys          6      4                  1.20    0.87
Arg          20     13                 3.98    2.81
Lys          24     17                 4.78    3.68
His          4      12                 0.80    2.60
Tyr          8      8                  1.59    1.73
Phe          20     21                 3.98    4.55
Trp          8      3                  1.59    0.65
Total        502    462                -       -
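The polarity indices quoted for the two subunits follow from Table 3 by summing the polar residues (Asx, Glx, Ser, Thr, His, Lys, Arg) and dividing by the chain length. A minimal sketch, with illustrative names, is:

```python
# Polar residue counts from Table 3: (alpha, beta).
# Asx = Asp + Asn and Glx = Glu + Gln are listed separately here.
POLAR_COUNTS = {
    "Asp": (16, 18), "Asn": (18, 18),
    "Glu": (31, 19), "Gln": (20, 10),
    "Ser": (27, 25), "Thr": (30, 22),
    "His": (4, 12), "Lys": (24, 17), "Arg": (20, 13),
}
TOTAL_RESIDUES = (502, 462)  # alpha, beta


def polarity_index(subunit_index):
    """Polar residues as a percentage of all residues in the subunit."""
    polar = sum(pair[subunit_index] for pair in POLAR_COUNTS.values())
    return 100.0 * polar / TOTAL_RESIDUES[subunit_index]


alpha_polarity = polarity_index(0)
beta_polarity = polarity_index(1)
```

Rounded to whole percentages, these reproduce the 38% and 33% given in the text.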

Analysis of the amino acid sequences of the α and β subunits for transmembrane domains

The hydropathic properties of the α and β subunits were examined by the method of Kyte and Doolittle [23]. The hydrophobicity values were averaged over seven consecutive residues. Averaging over longer segments up to 19 residues made little difference to the form of the hydropathic profile. To assist in this analysis, the prediction of secondary structure by the procedure of Garnier et al. [24] was also applied to these sequences. This, and other predictive methods, often underestimate the amount of α-helical structure. In particular, strongly hydrophobic α-helical segments are often predicted as β strands. However, the prediction by this method of regions likely to be involved in reverse turns (and the distribution of charged amino acids in the sequence) has aided analysis. The results of these analyses are shown in Fig. 3.
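The seven-residue averaging described here is a simple sliding-window mean. The sketch below applies it to the amino-terminal sequence of the α subunit from Table 1 (residues 1-25), using the published Kyte-Doolittle hydropathy scale. The function name is illustrative, and the short input is only an example; the profiles in Fig. 3 cover the full subunits.

```python
# Kyte-Doolittle hydropathy values, one-letter amino acid code.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=7):
    """Mean hydropathy of every run of `window` consecutive residues."""
    if window > len(sequence):
        raise ValueError("window is longer than the sequence")
    values = [KYTE_DOOLITTLE[residue] for residue in sequence]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Residues 1-25 of the alpha subunit as determined by protein sequencing.
ALPHA_N_TERMINUS = "MRIGIPRERLTNETRVAVTPKTGEQ"

profile = hydropathy_profile(ALPHA_N_TERMINUS)
most_hydrophobic_window = max(range(len(profile)), key=profile.__getitem__)
```

Passing `window=19` gives the longer averaging span mentioned in the text, at the cost of fewer points along the sequence.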

Fig. 3. Hydropathy plots and secondary structure analyses of the α subunit (A) and β subunit (B), plotted against residue number. The hydropathy values are the averages of the Kyte-Doolittle parameters [23] over seven consecutive residues. The analysis of secondary structure followed the procedure of Garnier et al. [24]. Predicted α-helical regions and reverse turns are shown by horizontal lines and short vertical lines, respectively. Separate symbols mark the positions of positively charged and negatively charged amino acids in the sequence; I-VII, potential transmembrane hydrophobic sequences

The average hydropathy values of 0.27 and 0.56 for the α and β subunits indicate that the β subunit is the more non-polar of the two polypeptides. The first 200 amino acid residues of the β subunit appear to be organized into seven hydrophobic segments (I-VII) separated by polar regions containing some charged residues. Segments I-VI are also separated by regions in which a reverse turn can be predicted. Von Heijne [25] has calculated the free energies of transfer of residues from a helix in water to a helix in a non-polar phase lacking hydrogen-bonding capacity. The free energy changes for the helices of segments I-VII are in the range of -93 to -144 kJ/mol segment. This is within the range expected for non-polar membrane helices. The 250 carboxyl-terminal amino acid residues show no hydrophobic sequences long enough to span the membrane. Many charged residues are present.

In contrast to the β subunit, the possible membrane-spanning hydrophobic segments of the α subunit are largely in the carboxyl-terminal region of the molecule, even though the rules of Garnier et al. [24] predict that this is largely a region of β structure. Four possible segments (II-V) have been identified. The free energy for transfer of helical segments II, III and IV from water into a non-polar phase falls within the range of -85 to -120 kJ/mol segment. The value of -78.5 kJ/mol for segment V makes it less clear that this segment is situated within the membrane. The amino-terminal 80% of the α subunit contains many charged amino acids. One potential hydrophobic transmembrane segment (segment I) can be recognized in this region. Its free energy of transfer is -108 kJ/mol segment. This segment contains the sequence VIGAGVAGLAAIGAANSLGA, which shows homologies with the FAD and NAD(P)-binding folds of lipoyl dehydrogenase, glutathione reductase and mercuric reductase [26]. Thus, it seems possible that this region of the α subunit contains the binding site for one of the substrates of the transhydrogenase.
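As a rough illustration of how such a similarity might be screened for, the sketch below searches the quoted segment for the glycine-rich pattern G-x-G-x-x-G/A that is commonly associated with dinucleotide-binding folds. The use of this particular pattern is an assumption made here for illustration and is not a criterion stated in this paper or taken from [26]. The function name is also hypothetical.

```python
import re

# Assumption: a simplified glycine-rich fingerprint, G-x-G-x-x-(G or A).
# It is an illustrative stand-in, not the comparison performed in the paper.
FINGERPRINT = re.compile(r"G.G..[GA]")


def find_fingerprints(sequence):
    """Return (1-based start position, matched residues) for each
    non-overlapping occurrence of the fingerprint."""
    return [
        (match.start() + 1, match.group())
        for match in FINGERPRINT.finditer(sequence.upper())
    ]


if __name__ == "__main__":
    segment = "VIGAGVAGLAAIGAANSLGA"  # sequence quoted for segment I
    for position, residues in find_fingerprints(segment):
        print(f"fingerprint {residues} starts at residue {position} of the segment")
```

A match of this kind is only suggestive. A pattern this short occurs by chance in many proteins, so it cannot by itself establish a nucleotide-binding site.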

This work was supported by a grant from the Medical Research Council of Canada. D. M. Clarke is the recipient of a Medical Research Council studentship. We thank Sandy Kielland (Protein Sequencing Facility, University of Victoria) for determining amino acid sequences.

REFERENCES

1. Fisher, R. R. & Earle, S. R. (1982) in The pyridine nucleotide coenzymes (Everse, J., Anderson, B. & You, K.-S., eds) pp. 279-324, Academic Press, New York.
2. Mitchell, P. (1966) Biol. Rev. 41, 445-502.
3. Earle, S. R. & Fisher, R. R. (1980) J. Biol. Chem. 255, 827-830.
4. Clarke, D. M. & Bragg, P. D. (1985) J. Bacteriol. 162, 367-373.
5. Clarke, D. M. & Bragg, P. D. (1985) Eur. J. Biochem. 149, 517-523.
6. Maniatis, T., Fritsch, E. F. & Sambrook, J. (1982) Molecular cloning. A laboratory manual. Cold Spring Harbor Laboratory, Cold Spring Harbor, New York.
7. Sanger, F., Coulson, A. R., Barrell, B. G., Smith, A. J. H. & Roe, B. A. (1980) J. Mol. Biol. 143, 161-178.
8. Sanger, F., Nicklen, S. & Coulson, A. R. (1977) Proc. Natl Acad. Sci. USA 74, 5463-5467.
9. Duckworth, M. L., Gait, M. J., Goelet, P., Hong, G. F., Singh, M. & Titmas, R. C. (1981) Nucleic Acids Res. 9, 1691-1706.
10. Laemmli, U. K. (1970) Nature (Lond.) 227, 680-685.
11. Hunkapiller, M. W., Lujan, E., Ostrander, F. & Hood, L. E. (1983) Methods Enzymol. 91, 227-236.
12. Bhown, A. S. & Bennett, J. C. (1983) Methods Enzymol. 91, 450-455.
13. Grantham, R., Gautier, C., Gouy, M., Jacobzone, M. & Mercier, R. (1981) Nucleic Acids Res. 9, 43-74.
14. Grosjean, H. & Fiers, W. (1982) Gene 18, 199-209.
15. Shine, J. & Dalgarno, L. (1974) Proc. Natl Acad. Sci. USA 71, 1342-1346.
16. Steitz, J. A. & Jakes, K. (1975) Proc. Natl Acad. Sci. USA 72, 4734-4738.
17. Gold, L., Pribnow, D., Schneider, T., Shinedling, S., Singer, B. S. & Stormo, G. (1981) Annu. Rev. Microbiol. 35, 365-403.
18. Rosenberg, M. & Court, D. (1979) Annu. Rev. Genet. 13, 319-353.
19. Rydström, J. (1981) in Mitochondria and microsomes (Lee, C. P., Schatz, G. & Dallner, G., eds) pp. 317-335, Addison-Wesley Publishing Co., Reading MA.
20. Wu, L. N. Y., Pennington, R. M., Everett, J. D. & Fisher, R. R. (1982) J. Biol. Chem. 257, 4052-4055.
21. Persson, B., Enander, K., Tang, H.-L. & Rydström, J. (1984) J. Biol. Chem. 259, 8626-8632.
22. Capaldi, R. A. & Vanderkooi, G. (1972) Proc. Natl Acad. Sci. USA 69, 930-932.
23. Kyte, J. & Doolittle, R. F. (1982) J. Mol. Biol. 157, 105-132.
24. Garnier, J., Osguthorpe, D. J. & Robson, B. (1978) J. Mol. Biol. 120, 97-120.
25. von Heijne, G. (1981) Eur. J. Biochem. 116, 419-422.
26. Rice, D. W., Schulz, G. E. & Guest, J. R. (1984) J. Mol. Biol. 174, 483-496.