
Nucleotide sequence of the ilvH-fruR gene region of Escherichia coli K12 and Salmonella typhimurium LT2


Mol Gen Genet (1991) 226:332-336 002689259100112Z

© Springer-Verlag 1991

K. Jahreis 1, P.W. Postma 2, and J.W. Lengeler 1

1 Fachbereich Biologie/Chemie, Universität Osnabrück, Postfach 4469, W-4500 Osnabrück, Germany
2 E.C. Slater Institute for Biochemical Research, University of Amsterdam, P.O. Box 20151, NL-1000 HD Amsterdam, The Netherlands

Received October 30, 1990

Summary. We have sequenced the fruR gene and flanking DNA fragments from Escherichia coli K12 and Salmonella typhimurium LT2. The fruR gene codes for a protein that represses the fru operon and activates the pps gene for PEP synthase. The corresponding open reading frame (ORF), FruR, consists of 334 amino acid residues. The ORF contains an amino-terminal helix-turn-helix motif, characteristic of DNA-binding proteins, and has similarity to known repressor proteins. The sequence is identical to that of the E. coli shl gene (mnemonic for suppressor-H-linked phenotype). In both organisms it is flanked upstream by the ilvIH genes and downstream by orfB, a gene possibly involved in the regulation of cell division, and by the pbpB gene.

Key words: FruR repressor - Fructose phosphotransferase system - fru operon - shl gene - ilvIH-pbpB

Offprint requests to: J.W. Lengeler

In enteric bacteria, the FruR protein, encoded by the fruR gene, is involved in the regulation of fructose metabolism and gluconeogenesis (Geerse et al. 1989b). FruR is a repressor of the fru operon, which encodes the enzymes fructose phosphotransferase and fructose 1-phosphate kinase. At the same time it is an activator of the pps gene, which codes for phosphoenolpyruvate (PEP) synthase. Mutants lacking FruR thus express the fru operon constitutively and are also unable to grow on lactate and pyruvate due to the absence of PEP synthase.

The fruR gene has been cloned from Escherichia coli and Salmonella typhimurium and the gene product shown to be a protein with an apparent molecular weight of 36000 daltons, based on expression of the fruR gene in a maxicell system. The fruR gene has been mapped at approximately 2 min, adjacent to the ilvIH operon (Geerse et al. 1986, 1989a). Since the FruR protein plays a combined role, acting as a repressor of the fru operon and as an activator of the pps gene and possibly of other genes encoding gluconeogenic enzymes (Chin et al. 1987), it was of interest to determine the nucleotide and deduced amino acid sequences from both E. coli K12 and S. typhimurium LT2. Comparison of DNA sequences isolated from closely related organisms helps to determine the reading frame of the gene under consideration and of adjacent genes because most base exchanges occur in the wobble bases of ORFs and in the intergenic sequences.
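The wobble-base argument can be illustrated with a small sketch. It assumes two already aligned nucleotide sequences of equal length, counts mismatches at each codon position for the three possible frames, and picks the frame in which the third position carries the most exchanges. The function names and the toy sequences are hypothetical and are not taken from the sequenced fragments.

```python
def mismatches_by_codon_position(seq_a, seq_b, frame=0):
    """Count mismatches at codon positions 1, 2 and 3 for one reading frame.

    seq_a and seq_b must be aligned (same length); gap characters '-' are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    counts = [0, 0, 0]
    for i in range(frame, len(seq_a)):
        a, b = seq_a[i].upper(), seq_b[i].upper()
        if a == "-" or b == "-":
            continue
        if a != b:
            counts[(i - frame) % 3] += 1
    return counts


def best_frame(seq_a, seq_b):
    """Return the frame (0, 1 or 2) whose third codon position has the most mismatches."""
    scores = {f: mismatches_by_codon_position(seq_a, seq_b, f)[2] for f in range(3)}
    return max(scores, key=scores.get)


# Toy example (hypothetical sequences): all three exchanges fall in wobble positions of frame 0.
toy_a = "ATGAAACTGGAT"
toy_b = "ATGAAGCTCGAC"
print(mismatches_by_codon_position(toy_a, toy_b, 0))
print(best_frame(toy_a, toy_b))
```

A real comparison would also need an alignment step and some handling of ties between frames, both of which are left out here.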

Starting with plasmids pBCP33 and pBCP35 (Geerse et al. 1986), containing the fruR genes of E. coli and S. typhimurium respectively, both fruR genes were subcloned into the vector pBluescript II SK+ (Alting-Mees and Short 1989). By using the Exo III deletion method of Henikoff (1987), nested deletions of both strands of the 1.82 kb HindIII fragment of pBCP33, and of the 2.5 kb HindIII-PstI fragment from pBCP35, were prepared. Sequencing of both strands was done with the dideoxy chain-termination method (Sanger et al. 1977), using the T7 polymerase sequencing system obtained from Pharmacia.

In the case of pBCP33, there is a single open reading frame (ORF) of sufficient length (334 amino acids) to code for a protein of 36 kDa. The 5' terminal end of the E. coli sequence corresponded to the published sequence (Squires et al. 1983a) of the 3' terminal end of the ilvH gene with three exceptions: (i) the GA bases at positions 590 and 591 had been inverted; (ii) an A had been missed at position 672 such that the stop codon cannot be the proposed TAA (positions 678-680) but rather the TAG at positions 686-688; (iii) 7 bp had been missed (ATGCGCA) at position 700 in the intergenic region between ilvH and fruR. The larger subclone of pBCP35 from S. typhimurium contains, as expected, the corresponding ORF for fruR of 334 amino acids, but also the entire ilvH gene and the 3' end of the ilvI gene, two genes that are cryptic in S. typhimurium (Squires et al. 1983b) and which had not been sequenced previously.
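The reasoning that a 334-codon ORF is long enough for a 36 kDa protein can be sketched as follows. The scan looks for the longest ORF on one strand, accepting ATG and GTG as start codons, and converts a residue count into an approximate mass. The mean residue mass of 110 Da is a common rule of thumb and an assumption of this sketch, not a value from the paper; the function names and the toy sequence are likewise hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODONS = {"ATG", "GTG"}
MEAN_RESIDUE_MASS_DA = 110  # rule-of-thumb average residue mass (assumption)


def longest_orf(dna):
    """Return (start_index, n_codons) of the longest ORF on the given strand.

    n_codons counts the start codon but not the stop codon.
    Returns (None, 0) if no complete ORF is found.
    """
    dna = dna.upper()
    best = (None, 0)
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon in START_CODONS:
                start = i
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i - start) // 3
                if n_codons > best[1]:
                    best = (start, n_codons)
                start = None
    return best


def approx_mass_kda(n_residues):
    """Rough protein mass in kDa from the number of residues."""
    return n_residues * MEAN_RESIDUE_MASS_DA / 1000


# Toy sequence (hypothetical): a GTG-initiated ORF of 4 codons in frame 2.
print(longest_orf("CCGTGAAACTGGATTAACC"))
# 334 residues give roughly 37 kDa, in line with the 36 kDa seen in maxicells.
print(approx_mass_kda(334))
```

The scan ignores the complementary strand and any ORF without a stop codon inside the fragment, so it is only a first-pass check.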

The E. coli and S. typhimurium FruR protein sequences differed in only 4 positions, 2 of which were conservative exchanges (Fig. 1). The highly conserved DNA sequences (83.5% overall identity for fruR) had the great majority of the exchanges in the wobble bases, a pattern that helps to identify the true reading frame. Both ORFs start with a GTG, a rare start codon which, however, is found often in genes such as regulatory genes that are translated at low levels (Gualerzi and Pon 1990). They are preceded by a conserved Shine-Dalgarno sequence 7 bp upstream of the GTG, and conserved -10 and -35 boxes, the -10 box being surrounded by a putative palindromic sequence (Fig. 1).

Fig. 1. Nucleotide sequences and deduced amino acid sequences of the genes ilv'I, ilvH and fruR from Escherichia coli K12 and Salmonella typhimurium LT2. The nucleotide sequence of a 1.8 kb HindIII fragment from E. coli and a 2.5 kb PstI-HindIII fragment from S. typhimurium are shown, the former being completed with a region of 583 bp, which covers part of the ilvIH region and is denoted by small letters (from Squires et al. 1983a). The numbers of the nucleotides and of the deduced amino acid residues are given in parentheses; potential -35 and -10, as well as putative ribosome-binding (SD) sites are overlined; important dyad symmetries and palindromic sequences for a potential rho-independent ilvIH terminator and fruR promoter are underlined by horizontal arrows. Colons indicate nucleotides identical in both sequences. Important HindIII and PstI restriction sites are also given

Comparison of the FruR amino acid sequences with those of known regulatory proteins revealed the presence of a typical helix-turn-helix motif (residues 1-22) in the headpiece (residues 1-100). The similarity values for identical residues in the headpiece (out of 100 residues) were 38 for LacI, 36 for GalR, 34 for PurR, 28 for DeoR and 35 for RbtR (the lac, gal, pur and deo repressors from E. coli, and the rbt repressor from Klebsiella pneumoniae, respectively). The similarity level in the remainder of these proteins was lower (about 20%), as was the overall similarity with the CytR repressor of E. coli. A high headpiece similarity (42/100) was also found when FruR was compared to the ScrR repressor from a plasmid-coded metabolic pathway and the corresponding scr regulon (Schmid et al. 1988), again with lower overall similarity (26%). Interestingly this repressor, which has been shown to bind D-fructose, has two additional conserved motifs compared to FruR (residues 116-123 and 239-253 in Fig. 1) which are not conserved in the other repressors mentioned (Jahreis and Lengeler 1989). They might thus be involved in binding of the common inducer D-fructose. No extensive similarity was found with the regulatory proteins AraC, RhaC, MelR, ArgR, BioR or LysR from E. coli.
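A headpiece similarity value such as 42/100 is a count of identical residues within a window of an alignment. A minimal sketch of that count is given below, assuming two pre-aligned protein sequences of equal length with '-' as the gap character. The function name and the toy peptides are hypothetical and do not reproduce the alignments used in the paper.

```python
def identical_in_window(aligned_a, aligned_b, start=0, end=100):
    """Count identical residues between two aligned sequences in positions [start, end).

    Positions where either sequence has a gap ('-') are never counted as identical.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = zip(aligned_a[start:end], aligned_b[start:end])
    return sum(1 for x, y in pairs if x == y and x != "-")


# Toy peptides (hypothetical): 8 of 10 aligned positions are identical.
print(identical_in_window("MKLDEIARLA", "MKLDQIARVA", 0, 10))
```

Dividing the count by the window length gives the percentage figure quoted for the remainder of the proteins.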

While we were preparing this manuscript, Leclerc et al. (1990) published the sequence of an E. coli gene which they called shl, and which they postulated to be involved in the expression of the supH (serU) suppressor. The shl gene was cloned on a 1.8 kb HindIII fragment and selected by virtue of its growth-inhibiting effect on a supH strain. It was found to be linked to the leu locus (map distance 2 min), immediately downstream of ilvIH. The gene shl is identical to the fruR gene which had been cloned previously on the same HindIII fragment and had been identified by means of its ability to repress the fru operon and to activate pps expression (Geerse et al. 1986). The question that arises is why the clone should have the effects described by Leclerc et al. (1990) given that it encodes the FruR regulatory protein. These authors and Eggertsson and coworkers (Thorbjarnardóttir et al. 1985) suggested that in supH strains, a mutation from UCG to UUG in a minor seryl-tRNA (gene serU) leads to incorporation of serine instead of leucine during translation. It was hypothesized, furthermore, that the activity of supH caused a cell division defect, attributed to the synthesis of a faulty protein, and that the Shl protein regulates the synthesis or the activity of serU (supH).

How can we reconcile the different properties originally attributed to Shl and to the FruR protein? It occurred to us that the fruR gene contains three UCG codons for which there is no tRNA in supH strains. This would result in the absence of a functional FruR protein but, more importantly, in a stringent response when fruR is present on a multi-copy plasmid. If this interpretation is correct, the Shl phenotype should be triggered by over-production of any protein containing several UCG codons. Furthermore, supH mutants should have a FruR⁻ phenotype. Relatively drastic effects on general translation efficiency and cell growth have been demonstrated to result from limitation for a single tRNA, especially when the synthesis of regulatory proteins is affected (Chen et al. 1990).
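The prediction that any over-produced protein with several UCG codons should trigger the Shl phenotype suggests a simple screen: count the in-frame UCG (TCG in DNA) codons of a candidate ORF. The sketch below assumes the ORF is supplied in frame, starting at its first codon; the function name and the toy ORF are hypothetical.

```python
def count_codon(orf, codon="TCG"):
    """Count in-frame occurrences of a codon in an ORF given as DNA or RNA.

    The ORF is read in steps of three from its first base, so occurrences
    that straddle two codons are not counted.
    """
    orf = orf.upper().replace("U", "T")
    codon = codon.upper().replace("U", "T")
    return sum(1 for i in range(0, len(orf) - 2, 3) if orf[i:i + 3] == codon)


# Toy ORF (hypothetical): GTG AAA TCG CTC GGC TCG TAA
# contains two in-frame TCG codons; the out-of-frame TCG spanning CTC|GGC is ignored.
print(count_codon("GTGAAATCGCTCGGCTCGTAA"))
```

Applied to the fruR ORF, such a count would be expected to return the three UCG codons mentioned above.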

Based on the analysis of the nucleotide sequence of the 1.8 kb HindIII fragment, Leclerc et al. (1990) postulated the existence of a second ORF of 28 amino acid residues (nucleotides 700-786 in Fig. 1). It is preceded by a putative promoter and Shine-Dalgarno sequence, and includes a potential terminator hairpin followed by six uridine residues, similar to a rho-independent terminator. The authors suggested a possible transcription anti-termination mechanism in the expression of the fruR (shl) gene, analogous to that in several other catabolic systems, in particular the β-glucoside-PTS (bgl genes) of E. coli (Mahadevan and Wright 1987; Schnetz and Rak 1990), and the sucrose-PTS (sac genes) from Bacillus subtilis (Crutz et al. 1990). Several of our observations argue against such a model:

(a) The putative anti-terminator for fruR expression would contain in S. typhimurium a large insertion (nucleotides 709-790 in Fig. 1). No ORF can be deduced from this sequence.

(b) A classical promoter seems to be present in front of fruR.

(c) Comparison of the two highly related DNA sequences from E. coli and S. typhimurium allowed identification (using the wobble base exchange rule) of the reading frame for ilvH, which in both organisms starts at an equivalent ATG and ends at an equivalent TGA. Both sequences show very high homology at the DNA and protein sequence levels (98% identity at the amino acid level for the 163 residues). The putative promoter for the fruR anti-terminator gene would be located within this ilvH sequence, which seems rather unlikely.

(d) Using the same rule for comparing the published ilvI sequence from E. coli (Squires et al. 1983a) and the one from S. typhimurium, we postulate that apart from the sequence errors in ilvH mentioned before, a further nucleotide has been missed at position 38 (Fig. 1). Adjusting the reading frame from E. coli to that from Salmonella reveals that both are highly similar (99% identity) and terminate two nucleotides upstream of the ATG from ilvH. Such a topology normally indicates the tight translational coupling of the two genes thus linked, as has been observed for ilvI and ilvH (Squires et al. 1983a).

(e) The ilvIH genes are cryptic in S. typhimurium, but can be activated at high frequency by mutations located upstream of ilvI, most likely in a promoter (Squires et al. 1983b). This is compatible with the conclusion that the reading frames for the ilvIH genes in S. typhimurium are intact.

Consequently, it seems more reasonable to postulate that the rho-independent terminator described by Leclerc et al. (1990) (nucleotides 713-735 in Fig. 1) is the transcription terminator for the ilvIH operon, rather than being a fruR anti-terminator. This hypothesis would also explain why these authors failed to detect promoter activity with their expression vector pGLJ5.
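The hallmark discussed here, a hairpin followed by a run of uridines, can be searched for with a deliberately simplified scan. The sketch below looks only for a perfect inverted repeat (the stem) separated by a short loop and immediately followed by a run of T (U in the transcript). Stem length, loop range and T-run length are illustrative parameters chosen for this example, with the T-run default of six taken from the six uridines mentioned above; real terminator prediction also weighs GC content and mismatches in the stem, which this sketch does not. All names and the toy sequence are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def revcomp(seq):
    """Reverse complement of a DNA string; unknown symbols become 'N'."""
    return "".join(COMPLEMENT.get(base, "N") for base in reversed(seq))


def find_terminator_like(dna, stem=6, loop_range=(3, 8), min_t_run=6):
    """Return (stem_start, stem_end) pairs for perfect hairpins followed by a T-run.

    stem_start is the index of the first base of the left stem arm,
    stem_end is the index just past the right stem arm.
    """
    dna = dna.upper()
    hits = []
    for i in range(len(dna)):
        left = dna[i:i + stem]
        if len(left) < stem:
            break
        target = revcomp(left)
        for loop in range(loop_range[0], loop_range[1] + 1):
            j = i + stem + loop
            right = dna[j:j + stem]
            tail = dna[j + stem:j + stem + min_t_run]
            if right == target and tail == "T" * min_t_run:
                hits.append((i, j + stem))
    return hits


# Toy sequence (hypothetical): GCCGCC / loop TTAA / GGCGGC followed by TTTTTT.
print(find_terminator_like("CCGCCGCCTTAAGGCGGCTTTTTTCC"))
```

Running the same scan on both the E. coli and the S. typhimurium sequence and comparing hit positions would be one way to check whether a candidate structure is conserved.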

The very high degree of identity between the fruR genes (83.5% identity) ends exactly with the predicted TAA stop codons, the downstream DNA being enriched for A+T nucleotides as is characteristic for many intergenic sequences. Toward the end of the sequenced fragments, similarity levels increase again, suggesting that the next gene is nearby. According to our restriction map (Geerse et al. 1986) and the map published previously by Nakamura et al. (1983), which overlap in five characteristic endonuclease restriction sites, as well as the nucleotide sequence of the gene pbpB published recently (Gómez et al. 1990), this gene, encoding the penicillin-binding protein 3, is preceded by an open reading frame (termed orfB) possibly involved in the regulation of cell division. It begins 431 bp downstream of the HindIII site at the 3' end of our sequenced fragment. There is no indication for (an) extended open reading frame(s) preceded by transcriptional or translational consensus sequences in either direction on these fragments (positions 1873-2403 of our sequence plus positions 1-437 from the sequence of Gómez et al. 1990). The gene sequence in this part of the chromosome of E. coli K12 and S. typhimurium LT2 would consequently be ilvI ilvH fruR orfB pbpB, with the fruR and orfB genes separated by an unusually long intergenic sequence of 967 bp.

Acknowledgements. We would like to thank Eileen Placke for help in preparing the manuscript and the Deutsche Forschungsgemeinschaft for financial support through SFB171, TPC3.

"The sequences reported in this paper have been deposited in the EMBL Data Library under accession numbers as follows: X55456, S. typhimurium ilvH-fruR genes; X55457, E. coli fruR gene."

References

Alting-Mees MA, Short JM (1989) pBluescript II: gene mapping vectors. Nucleic Acids Res 17:9494

Chen K-S, Peters TC, Walker JR (1990) A minor arginine tRNA mutant limits translation preferentially of a protein dependent on the cognate codon. J Bacteriol 172:2405-2510

Chin AM, Feucht BU, Saier MH Jr (1987) Evidence for the regulation of gluconeogenesis by the fructose phosphotransferase system in Salmonella typhimurium. J Bacteriol 169:897-899

Crutz A-M, Steinmetz M, Aymerich S, Richter R, Le Coq D (1990) Induction of levan sucrase in Bacillus subtilis: an antitermination mechanism negatively controlled by the phosphotransferase system. J Bacteriol 172:1043-1050

Geerse RH, Ruig CR, Schuitema ARJ, Postma PW (1986) Relationship between pseudo-HPr and the PEP: fructose phosphotransferase system in Salmonella typhimurium and Escherichia coli. Mol Gen Genet 203:435-444

Geerse RH, Izzo F, Postma PW (1989a) The PEP: fructose phosphotransferase system in Salmonella typhimurium: FPr combines enzyme IIIFru and pseudo-HPr activities. Mol Gen Genet 216:517-525

Geerse RH, van der Pluijm J, Postma PW (1989b) The repressor of the PEP: fructose phosphotransferase system is required for the transcription of the pps gene of Escherichia coli. Mol Gen Genet 218:348-352

Gómez MJ, Fluoret B, van Heijenoort J, Ayala JA (1990) Nucleotide sequence of the regulatory region of the gene pbpB of Escherichia coli. Nucleic Acids Res 18:2813

Gualerzi CO, Pon CL (1990) Initiation of mRNA translation in procaryotes. Biochemistry 29:5881-5889

Henikoff S (1987) Unidirectional digestion with exonuclease III in DNA sequence analysis. Methods Enzymol 155:156-165

Jahreis K, Lengeler JW (1989) Klonierung und Sequenzierung der Regulatorgene (scrR) für den Sucroseabbau von Klebsiella pneumoniae und pUR400. Biol Chem Hoppe Seyler 370:913-914

Leclerc G, Noel G, Drapeau GR (1990) Molecular cloning, nucleotide sequence, and expression of shl, a new gene in the 2-minute region of the genetic map of Escherichia coli. J Bacteriol 172:4696-4700

Mahadevan S, Wright A (1987) A bacterial gene involved in transcription antitermination: regulation at a Rho-independent terminator in the bgl operon of E. coli. Cell 50:485-494

Nakamura M, Maruyama IN, Soma M, Kato J, Suzuki H (1983) On the process of cellular division in Escherichia coli: nucleotide sequence of the gene for penicillin-binding protein 3. Mol Gen Genet 191:1-9

Sanger F, Nicklen S, Coulson AR (1977) DNA sequencing with chain-terminating inhibitors. Proc Natl Acad Sci USA 74:5463-5467

Schmid K, Ebner R, Altenbuchner J, Schmitt R, Lengeler JW (1988) Plasmid-mediated sucrose metabolism in Escherichia coli K12: mapping of the scr genes of pUR400. Mol Microbiol 2:1-8

Schnetz K, Rak B (1990) The β-glucoside permease represses the bgl operon of E. coli by phosphorylation of the antiterminator protein and also interacts with enzyme IIIGlc, the key element in catabolite control. Proc Natl Acad Sci USA 87:5074-5078

Squires CH, DeFelice M, Devereux J, Calvo JM (1983a) Molecular structure of ilvIH and its evolutionary relationship to ilvG in Escherichia coli K12. Nucleic Acids Res 11:5299-5313

Squires CH, DeFelice M, Lago CT, Calvo JM (1983b) ilvHI locus of Salmonella typhimurium. J Bacteriol 154:1054-1063

Thorbjarnardóttir S, Uemura H, Dingermann T, Rafnar T, Thorsteinsdóttir S, Söll D, Eggertsson G (1985) Escherichia coli supH suppressor: temperature-sensitive missense suppression caused by an anticodon change in tRNASer. J Bacteriol 161:207-211

Communicated by H. Böhme