© INSTITUT PASTEUR/ELSEVIER, Paris 1990. Res. Microbiol. 1990, 141, 407-423

NUCLEOTIDE SEQUENCE OF THE GENE ENCODING THE 35-kDa PROTEIN OF MYCOBACTERIUM TUBERCULOSIS

S.P. O'Connor (1) (*), H.S. Rumschlag (1) and L.W. Mayer (2)

(1) Division of Bacterial Diseases, Center for Infectious Diseases, Centers for Disease Control, Atlanta, GA 30333 (USA)

(2) Division of Vector-Borne Infectious Diseases, Center for Infectious Diseases, Centers for Disease Control, Fort Collins, CO 80522 (USA)

SUMMARY

A 35-kilodalton (kDa) protein of Mycobacterium tuberculosis is expressed by recombinant Escherichia coli which possess a plasmid that contains a 2.4-kilobase fragment of M. tuberculosis chromosomal DNA. The nucleotide sequence of this fragment was determined by the dideoxynucleotide chain-termination method. Analysis of the sequence revealed four open reading frames that could encode proteins greater than 250 amino acids in length. The reading frame for the 35-kDa protein was identified by subcloning DNA fragments into expression vectors pTTQ18 and pTTQ19, and assaying for production of the 35-kDa protein by Western blotting. A protein with a primary structure of 270 amino acids and a predicted molecular weight of 29,260 daltons was deduced from the nucleotide sequence. A computer-aided search of nucleic and amino acid sequence databases did not identify any proteins with significant sequence similarity to this protein. The organization of the gene encoding this protein was compared with other mycobacterial genes that have been sequenced. Information obtained from the investigation of this protein may aid in the development of reagents to diagnose and control mycobacterial disease.

KEY-WORDS: Mycobacterium tuberculosis, 35-kDa protein, Nucleotide, Sequencing; Tuberculosis, Diagnosis.

Submitted December 15, 1989, accepted March 2, 1990.

(*) Mailing address: Steven P. O'Connor, Centers for Disease Control, Building 1, Room 2226, D11, Atlanta, GA 30333 (USA).

INTRODUCTION

The genus Mycobacterium includes a group of bacteria that are the aetiologic agents of tuberculosis in humans. This group, the M. tuberculosis complex, consists of the species M. tuberculosis, M. bovis, M. africanum, and M. microti. Although tuberculosis can be treated with antibiotics such as rifampin, ethambutol and isoniazid, and a vaccine is available, it is still a significant public health problem. It has been estimated by the World Health Organization that ten million new cases of active tuberculosis occur each year, with annual mortality estimated at three million people (Joint International Union Against Tuberculosis and World Health Study Group, 1982). Most of these cases occur in developing countries; however, the incidence of tuberculosis in at least one developed country has been on the increase in recent years. In the United States, the first significant increase in tuberculosis incidence since 1953 was recorded in 1986 (Center for Disease Control, 1988). A fraction of this increase was attributable to an influx of foreign-born people arriving in the United States. The remainder has been attributed to the susceptibility of patients with acquired immune deficiency syndrome (AIDS) to mycobacterial infections (Center for Disease Control, 1988, 1989). In New York City, 5 % of AIDS patients reported between 1981 and 1985 were also diagnosed as carrying active M. tuberculosis infections (Center for Disease Control, 1987), which apparently evolved from latent infections due to the immunosuppression resulting from infection with human immunodeficiency virus. In addition to M. tuberculosis infections, AIDS patients exhibit a marked susceptibility to infection by bacteria of the M. avium-M. intracellulare group (Wallace et al., 1984). The magnitude of this problem has prompted a renewed effort to characterize M. tuberculosis and develop better diagnostic and therapeutic reagents.

In recent years, genes from a variety of mycobacterial species have been cloned in Escherichia coli (Young et al., 1985; Shinnick et al., 1987; Thole et al., 1985). Some of the genes from M. tuberculosis have been shown to encode proteins that are immunodominant antigens for this species (Shinnick, 1987; Lu et al., 1987). The goal of this work is to identify proteins that may be used in improved vaccines or diagnostic tests for tuberculosis. Cohen et al. (1987) have cloned a 4.7-kilobase (kb) DNA fragment from the chromosome of M. tuberculosis on which resides a gene encoding a 35-kDa protein. To facilitate the study of this protein, and to further our understanding of the genetic organization of M. tuberculosis, we have determined and analysed the nucleotide sequence of this gene.

AIDS = acquired immune deficiency syndrome.
bp = base pair.
EDTA = ethylenediaminetetraacetic acid.
IPTG = isopropyl-D-thiogalactopyranoside.
kb = kilobase.
kDa = kilodalton.
mAb = monoclonal antibody.
ORF = open reading frame.
PBS = phosphate-buffered saline.
PBS-T = PBS/Tween-20.
SDS = sodium dodecyl sulphate.
X-gal = 5-bromo-4-chloro-3-indolyl-D-galactoside.

MATERIALS AND METHODS

E. coli strains, plasmids, and monoclonal antibody.

E. coli strains used in this study were TB1 (Bethesda Research Laboratories, 1984) as a recipient in plasmid-cloning experiments and 71-18 (Yanisch-Perron et al., 1985) for bacteriophage M13 cloning. Strain TB-11, a recombinant E. coli that contains plasmid pLWM2110, was the source of the DNA sequenced (Cohen et al., 1987). This plasmid is composed of vector pUC13 and a 2.4-kb insert of DNA derived from the genome of M. tuberculosis strain H37Ra (number 201, American Type Culture Collection, Rockville, MD). All E. coli strains were grown in L broth at 37°C. Where required, carbenicillin (Sigma Chemical Co., St Louis, MO) was added at a concentration of 200 µg/ml.

Expression vectors pTTQ18 and pTTQ19 (Stark, 1987) were obtained from Amersham Corp., Arlington Heights, IL. Bacteriophage M13 replicative form DNA was purchased from Pharmacia, Inc., Piscataway, NJ. A murine monoclonal antibody (mAb), designated 2B2, was used to detect the 35-kDa protein. The cell line which secretes this IgG2b molecule was created from spleen cells of mice immunized with recombinant 35-kDa protein (Rumschlag et al., in press).

Recombinant DNA methods.

Plasmid DNA was extracted as previously described (Ish-Horowicz and Burke, 1981), and purified over caesium chloride/ethidium bromide gradients (Maniatis et al., 1982). Restriction enzymes and T4 DNA ligase (New England Biolabs, Inc., Beverly, MA) were used according to the manufacturer's specifications. Restriction fragments were separated in 0.7 % agarose gels using Tris-acetate buffer (0.04 M Tris-acetate, 0.001 M EDTA pH 8.0). Electrophoresis was carried out at 10 mA constant current for 16 h. Restriction fragments to be subcloned were cut out of agarose gels and purified with "Geneclean" (Bio101, La Jolla, CA).

Transformation and screening of recombinant E. coli.

Plasmid or M13 ligation mixtures were transformed into strains TB1 or 71-18, respectively, as previously described (Hanahan, 1983). Plasmid-transformed cells were plated onto medium containing carbenicillin, 0.4 µg/ml 5-bromo-4-chloro-3-indolyl-D-galactoside (X-gal; Research Organics, Inc., Cleveland, OH), and 0.4 mM isopropyl-D-thiogalactopyranoside (IPTG; Research Organics). After incubation overnight at 37°C, 200 white colonies were transferred onto nitrocellulose filters (BA85, 0.45 µm pore size; Schleicher and Schuell, Inc., Keene, NH) that had been overlaid onto carbenicillin plates. The colonies were screened by DNA-DNA hybridization for Mycobacterium DNA. Filters were prepared for colony hybridization by sequential exposure to 0.5 M NaOH, 1 M Tris (pH 7.4), and Tris-saline buffer (0.5 M Tris pH 7.4, 1.5 M NaCl). The filters were washed with Tris-saline buffer, 100 ml chloroform, air dried, and then baked for 2 h at 80°C under vacuum. A 1.8-kb SalI fragment from TB-11 (fig. 1) was labelled with 32P by nick translation (Nick Translation Kit; Bethesda Research Laboratories, Gaithersburg, MD) and hybridized to the filters under conditions of high stringency (Tm - 20°C). Filters were washed and autoradiographed as previously described (Maniatis et al., 1982).

M13 transformations were plated in soft agar supplemented with carbenicillin, X-gal, and IPTG using standard techniques. White plaques were purified by subculturing and the presence of insert DNA was confirmed by agarose gel electrophoresis. Complementation analysis was used to identify clones containing inserts in both orientations where the insert DNA was not force-cloned, and to confirm that force-cloned fragments were complementary. DNA used as template in sequencing reactions was prepared as previously described (Schreier and Cortese, 1979).

Western blot analysis.

The production of 35-kDa protein by plasmid or M13 subclones was detected by Western blotting. The optical density (A[illegible]nm) of late-log-phase cultures was measured and adjusted to a value of 1.0. Samples (10 µl) of the cell suspensions were added to loading buffer containing SDS and boiled for 5 min. The total cellular proteins were separated by discontinuous SDS-polyacrylamide gel electrophoresis (12.5 % separation gel; Laemmli, 1970), and blotted onto nitrocellulose membrane (Schleicher and Schuell) by the method of Towbin et al. (1979). After transfer, filters were blocked by 3 ten-minute incubations in phosphate-buffered saline-0.3 % Tween-20 (PBS-T). Filters were incubated for 1.5 h in a 1/1 dilution of mAb 2B2 in PBS, washed once (10 min) in PBS-T, then in casein-thimerosal buffer (Kenna et al., 1970). Filters were incubated for 1 h in horseradish-peroxidase-conjugated goat anti-mouse IgG (Bio-Rad Laboratories, Richmond, CA) diluted 1/4,000 in casein-thimerosal buffer. After 2 washes in PBS, the filters were developed in buffer containing tetramethylbenzidine as substrate (Bos et al., 1981).

Nucleic acid sequencing.

Phage DNA isolated from M13 recombinants was sequenced by the dideoxynucleotide chain-termination method (Sanger et al., 1977) with alpha-35S-adenosine triphosphate and T7 polymerase (Sequenase; United States Biochemical Corp., Cleveland, OH) as described by the manufacturer. Reactions with some templates were carried out using inosine triphosphate in place of guanosine triphosphate to reduce compression artifacts observed when sequencing high-G + C-content DNA. The products of sequencing reactions were electrophoresed on 4, 6, and 8 % polyacrylamide gels containing 7 M urea. After electrophoresis, gels were dried under vacuum and exposed to film for 24-48 h at room temperature.

Computer analyses of sequence data were performed with the DNASTAR software package (DNASTAR, Inc., Madison, WI), or with the Genetics Computer Group Sequence Analysis Software package (Devereux et al., 1984).

RESULTS

Plasmid pLWM2020 consists of vector pUC13 and a 4.7-kb BamHI fragment of chromosomal DNA from M. tuberculosis strain H37Ra. Plasmid pLWM2110 was derived from pLWM2020 by deleting an internal 2.3-kb SphI fragment from the Mycobacterium DNA (fig. 1, panel A). The deleted fragment mapped more than 1 kb downstream from the gene which encodes the 35-kDa protein. The hybrid site formed by ligation of the SphI ends is located between the adjacent SalI and BamHI sites indicated in figure 1. This deletion had no effect on the level of expression of 35-kDa protein, and did not result in the production of a truncated protein, indicating that the gene encoding the 35-kDa protein had not been affected (Devereux et al., 1984).

FIG. 1. Construction of pLWM2110 from pLWM2020 (panel A), and restriction map and sequencing strategy of mycobacterial DNA insert in pLWM2110 (panel B).

A) ORF indicates the gene encoding the 35-kDa protein; Ap denotes the ampicillin resistance determinant of pUC13, and the hatched region in pLWM2020 is the SphI fragment which was deleted to give pLWM2110. Arrows associated with each gene indicate the direction of transcription.

B) The open reading frame for the 35-kDa protein is shown by the closed bar. The arrows indicate the direction and extent of sequences derived from the 5' end of the template DNA. Arrows preceded by a square indicate where synthetic oligonucleotides specific for the sequenced DNA were used as primers.

To sequence the remaining 2.4 kb of mycobacterial DNA in pLWM2110, restriction fragments spanning the entire fragment were subcloned into bacteriophage M13. A map of restriction enzyme sites in the mycobacterial DNA and a schematic representation of the strategy used for sequencing are shown in figure 1. Across two-thirds of the fragment both strands of DNA were sequenced. This includes the region later identified as encoding the 35-kDa protein.

Within the 2.4 kb of DNA sequenced, four open reading frames (ORF) greater than 750 nucleotides in length were identified. Three of these ORF began with the codon ATG (methionine), while the fourth began with GTG (valine). The relative positions of the ORF in the sequenced DNA were as follows: 556 to 1368 and 1387 to 2169 on strand 1; 2016 to 1240 and 1239 to 424 on the complementary strand.
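The ORF coordinates quoted above can be checked with a few lines of arithmetic. The following Python sketch is only an illustration, not part of the original analysis. It assumes the coordinates are 1-based and inclusive, and that each ORF includes its stop codon (an assumption consistent with the 813-base ORF encoding 270 amino acids). The function names are hypothetical.

```python
# ORF coordinates as reported in the text (assumed 1-based, inclusive).
# A start greater than the end marks an ORF on the complementary strand.
ORFS = [(556, 1368), (1387, 2169), (2016, 1240), (1239, 424)]


def orf_length(start, end):
    """Length in nucleotides of an ORF given inclusive coordinates."""
    return abs(end - start) + 1


def codon_count(start, end):
    """Number of complete codons spanned by the ORF."""
    return orf_length(start, end) // 3


for start, end in ORFS:
    strand = "strand 1" if start < end else "complementary strand"
    print(f"{start:>4} to {end:<4} {strand:<21} "
          f"{orf_length(start, end)} nt, {codon_count(start, end)} codons")
```

Under these assumptions all four ORF exceed the 750-nucleotide threshold, and the first ORF (556 to 1368) spans 271 codons, that is, 270 amino acids plus a stop codon.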

Three of the ORF were excluded as not encoding the 35-kDa protein on the basis of protein expression from a recombinant M13 generated for sequencing. A recombinant phage containing a 440-base-pair (bp) SphI/SstI fragment (see fig. 1) cloned into M13mp19 was found to produce a truncated protein (approximately 19 kDa) that reacted with mAb 2B2 (data not shown). The same fragment in the opposite orientation in M13mp18 did not express protein. To confirm that we had identified the correct ORF, we used a pair of expression vectors, pTTQ18 and pTTQ19 (Stark, 1987). These plasmids contain a synthetic tac promoter upstream from the lacZ gene and a multicloning polylinker that is in an inverted orientation in the respective plasmids. We subcloned a 985-bp SalI/PstI fragment (a partial PstI digest was used because a PstI site was located within the reading frame) that encompassed the entire ORF with an additional 130 bp of DNA upstream of the ORF.

The results of these experiments are shown in figure 2. A map of the resulting plasmids indicating the relevant features is shown in panel A of figure 2. To confirm that the fragments of Mycobacterium DNA were cloned, intact plasmid DNA was isolated and restricted with either PstI and SalI, or NcoI and DraII. The resulting fragments were separated by agarose gel electrophoresis and are shown in figure 2, panel B. Digestion of plasmids pSOC2320 and pSOC2340 with PstI and SalI results in the generation of three fragments: the vector DNA at approximately 4.7 kb, and two fragments of mycobacterial DNA 440 and 540 bp in length. This was in agreement with the sequence data. Digestion with NcoI and DraII confirmed that the mycobacterial DNA fragment was inserted in opposite orientations in the two plasmids. As seen in lanes 5 and 6 of panel B, the restriction fragments generated by these enzymes agree with the sizes predicted from the maps in panel A, confirming that the ORF in question had been cloned in both orientations relative to tac and lacZ. E. coli harbouring these plasmids were then analysed for expression of 35-kDa protein by Western blotting.

As shown in figure 3 (lane 2), strain TB1 containing pTTQ18 did not produce any proteins that bound mAb 2B2. For recombinant strain TB-11, 35-kDa protein is produced in the absence of the lac inducer (lane 3). As shown previously (Cohen et al., 1987), induction of lac with IPTG results in increased production of 35-kDa protein (lane 4). In this case, the increase measured was approximately 8-fold. The results obtained from the E. coli strain containing plasmid pSOC2320, again grown in the presence or absence of IPTG, are shown in lanes 5 and 6. No detectable 35-kDa protein was produced in the absence of IPTG. Induction of lac resulted in production of native-size 35-kDa protein, and the quantity produced was approximately 3.5-fold greater than that of TB-11 grown under the same conditions. The recombinant containing the Mycobacterium DNA in the opposite orientation did not produce a detectable level of 35-kDa protein, even in the presence of IPTG. These results confirm that we have correctly identified which ORF in the sequenced DNA encodes the 35-kDa protein.

FIG. 2. Schematic representation of plasmids pSOC2320 and pSOC2340 indicating relevant restriction sites and orientation of 35-kDa ORF relative to vector promoter regions (panel A), and restriction fragments of pSOC2320 and pSOC2340 after separation by agarose gel electrophoresis and staining with ethidium bromide (panel B).

Lane 1) 1-kb ladder; lane 2) pTTQ18, SalI; lane 3) pSOC2320, PstI and SalI; lane 4) pSOC2340, PstI and SalI; lane 5) pSOC2320, NcoI and DraII; lane 6) pSOC2340, NcoI and DraII. Size markers (kb): 6.1, 5.1, 4.1, 3.1, 2.0, 1.6, 1.0, 0.5.

FIG. 3. Western blot reacted with mAb 2B2, specific for M. tuberculosis 35-kDa protein.

E. coli strains were grown to log-phase, adjusted to equivalent cell concentration, and total cell proteins were separated by SDS-PAGE, then blotted to nitrocellulose. Lanes: 1) molecular weight standards; 2) TB1(pTTQ18); 3) JM107(pLWM2110); 4) JM107(pLWM2110) with IPTG; 5) TB1(pSOC2320); 6) TB1(pSOC2320) with IPTG; 7) TB1(pSOC2340); 8) TB1(pSOC2340) with IPTG. Molecular weight standards (kDa): 96, 55, 43, 36, 29.

The sequence of the gene encoding 35-kDa protein and 550 bases of upstream DNA is shown in figure 4. The ORF begins with the codon for methionine (ATG) and extends for a total of 813 bases. Located 15 bases upstream from the start codon is a putative ribosome-binding site that exhibits a 5 of 6 match with the consensus Shine-Dalgarno sequence. Between 50 and 100 bases upstream from the initiator codon were identified sequences corresponding to the -35 (5 of 6 match with the consensus E. coli sequence) and -10 (4 of 6 match with consensus) promoter regions. While the putative promoter sequences exhibit a high degree of homology with the E. coli consensus sequences, the distance which separates them (39 bases) is much greater than typically seen in promoters which function in E. coli (16 to 19 bases).

FIG. 4. Nucleotide sequence of the region that contains the 35-kDa protein gene.

The predicted amino acid sequence of the protein is shown below the nucleotide sequence. The regions underlined are those which exhibit the greatest homology with consensus E. coli promoter sequences and ribosome-binding site.
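The "5 of 6" and "4 of 6" scores above are simple position-by-position comparisons against consensus hexamers. The Python sketch below shows one way such scores and the spacer check could be computed. The consensus hexamers TTGACA (-35) and TATAAT (-10) are the commonly cited E. coli values. The two example hexamers are hypothetical, chosen only to illustrate the scoring, and are not taken from the M. tuberculosis sequence. The function names are likewise assumptions.

```python
def matches(site, consensus):
    """Count positions at which two equal-length sequences agree."""
    if len(site) != len(consensus):
        raise ValueError("sequences must have the same length")
    return sum(a == b for a, b in zip(site.upper(), consensus.upper()))


def spacer_is_typical(spacer, low=16, high=19):
    """True if the -35/-10 spacing falls in the range quoted in the text."""
    return low <= spacer <= high


# Commonly cited E. coli consensus hexamers.
CONSENSUS_35 = "TTGACA"
CONSENSUS_10 = "TATAAT"

# Hypothetical hexamers, NOT the actual mycobacterial sequences.
example_35 = "TTGACT"
example_10 = "TACGAT"

print(matches(example_35, CONSENSUS_35), "of 6 at -35")
print(matches(example_10, CONSENSUS_10), "of 6 at -10")
print("39-base spacer typical?", spacer_is_typical(39))
```

With these illustrative inputs the scores are 5 of 6 and 4 of 6, and a 39-base spacer falls well outside the 16 to 19 base range.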

Shown in table I is a comparison of codon usage frequency for the 35-kDa protein gene and the gene for the M. tuberculosis 65-kDa protein (Shinnick, 1987). In the case of both genes, there is a bias towards codons containing C or G in third position. For example, among the four codons for alanine, those that have C or G in the third position represent 79 % of the total codons for alanine in the gene encoding 35-kDa protein, and 90 % of the codons for alanine in the 65-kDa protein. The codon usage reflects the high G + C content in the DNA of mycobacteria.

TABLE I. Codon usage for M. tuberculosis 35-kDa and 65-kDa proteins.

                         Frequency in:
Amino acid    Codon    35-kDa    65-kDa
ALA           GCA        1.9       0.2
ALA           GCC       10.0       8.1
ALA           GCG        4.4       3.9
ALA           GCT        1.9       1.1
ARG           AGA        0.4       0
ARG           AGG        0         0
ARG           CGA        1.5       0.2
ARG           CGC        1.2       2.6
ARG           CGG        1.1       0.6
ARG           CGT        0.7       0.8
ASN           AAC        2.2       2.6
ASN           AAT        1.1       0.4
ASP           GAC        3.3       4.1
ASP           GAT        1.1       0.4
CYS           TGC        0         0
CYS           TGT        0         0
GLN           CAA        2.2       0.2
GLN           CAG        8.9       2.4
GLU           GAA        3.0       1.3
GLU           GAG        5.9       9.0
GLY           GGA        0.4       0.6
GLY           GGC        1.5       5.3
GLY           GGG        1.1       0.9
GLY           GGT        1.9       3.4
HIS           CAC        0.4       0.4
HIS           CAT        1.1       0
ILE           ATA        0         0
ILE           ATC        3.0       4.7
ILE           ATT        1.1       [illegible]
LEU           CTA        0         0.2
LEU           CTC        3.0       2.3
LEU           CTG        3.3       7.1
LEU           CTT        1.1       0
LEU           TTA        0         0
LEU           TTG        1.5       0.8
LYS           AAA        0.4       0.4
LYS           AAG        4.4       7.7
MET           ATG        3.3       0.9
PHE           TTC        1.0       1.1
PHE           TTT        0         0.2
PRO           CCA        1.5       0
PRO           CCC        1.1       0.9
PRO           CCG        1.5       2.1
PRO           CCT        0         0
SER           AGC        2.2       0.6
SER           AGT        0.7       0
SER           TCA        0.4       0
SER           TCC        0         1.9
SER           TCG        2.2       0.9
SER           TCT        0.4       0
TER           TAA        0         0
TER           TAG        0.4       0
TER           TGA        0         0.2
THR           ACA        0         0.4
THR           ACC        3.0       4.9
THR           ACG        1.5       0.9
THR           ACT        0.7       0.2
TRP           TGG        0.4       0.2
TYR           TAC        1.5       1.3
TYR           TAT        0         0
VAL           GTA        0         0.4
VAL           GTC        1.5       5.3
VAL           GTG        3.0       3.6
VAL           GTT        0.4       0.9
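The alanine example can be reproduced directly from the four alanine rows of table I. The Python sketch below is an illustrative check rather than the authors' procedure. It sums the frequencies of the codons ending in C or G and divides by the total for alanine. The dictionary layout and function name are assumptions.

```python
# Alanine codon frequencies from table I: (35-kDa gene, 65-kDa gene).
ALANINE = {
    "GCA": (1.9, 0.2),
    "GCC": (10.0, 8.1),
    "GCG": (4.4, 3.9),
    "GCT": (1.9, 1.1),
}


def gc3_share(freqs, column):
    """Percentage of codon usage with C or G in the third position."""
    total = sum(values[column] for values in freqs.values())
    gc3 = sum(values[column] for codon, values in freqs.items()
              if codon[2] in "GC")
    return 100.0 * gc3 / total


print(f"35-kDa gene: {gc3_share(ALANINE, 0):.0f} %")
print(f"65-kDa gene: {gc3_share(ALANINE, 1):.0f} %")
```

The two values round to 79 % and 90 %, in line with the figures given in the text. The same function could be applied to any other amino acid in the table by supplying its rows.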

The deduced amino acid sequence of the 35-kDa protein is shown in figure 4. The protein was 270 amino acids long, giving a calculated molecular weight of 29.26 kDa. The amino acid composition of the protein is shown in table II. All 20 amino acids except for cysteine are present in the protein. The protein contains 29 strongly basic amino acids (arginine and lysine) and 33 strongly acidic amino acids (aspartate and glutamate). The predicted isoelectric point of the protein was 5.81.

TABLE II. Amino acid composition of the M. tuberculosis 35-kDa protein derived from the nucleotide sequence.

Amino acid    No. of residues
ALA           49
ARG           16
ASN            9
ASP            9
CYS            0
GLU           24
GLN           30
GLY           13
HIS            4
ILE           11
LEU           24
LYS           13
MET            9
PHE            3
PRO            8
SER           16
THR           14
TRP            1
TYR            4
VAL           13
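The composition in table II can be cross-checked against the figures quoted in the text (270 residues, 29 basic, 33 acidic, about 29.26 kDa). The Python sketch below is a rough consistency check under stated assumptions. The residue masses are standard average values supplied here for illustration and do not come from the paper. One water molecule is added for the free termini. The software actually used by the authors may have applied slightly different constants.

```python
# Residue counts from table II.
COMPOSITION = {
    "ALA": 49, "ARG": 16, "ASN": 9, "ASP": 9, "CYS": 0,
    "GLU": 24, "GLN": 30, "GLY": 13, "HIS": 4, "ILE": 11,
    "LEU": 24, "LYS": 13, "MET": 9, "PHE": 3, "PRO": 8,
    "SER": 16, "THR": 14, "TRP": 1, "TYR": 4, "VAL": 13,
}

# Assumption: standard average residue masses in daltons (not from the paper).
RESIDUE_MASS = {
    "ALA": 71.0788, "ARG": 156.1875, "ASN": 114.1038, "ASP": 115.0886,
    "CYS": 103.1388, "GLU": 129.1155, "GLN": 128.1307, "GLY": 57.0519,
    "HIS": 137.1411, "ILE": 113.1594, "LEU": 113.1594, "LYS": 128.1741,
    "MET": 131.1926, "PHE": 147.1766, "PRO": 97.1167, "SER": 87.0782,
    "THR": 101.1051, "TRP": 186.2132, "TYR": 163.1760, "VAL": 99.1326,
}
WATER = 18.015  # added once for the free amino and carboxy termini

total_residues = sum(COMPOSITION.values())
basic = COMPOSITION["ARG"] + COMPOSITION["LYS"]
acidic = COMPOSITION["ASP"] + COMPOSITION["GLU"]
mol_weight = sum(n * RESIDUE_MASS[aa] for aa, n in COMPOSITION.items()) + WATER

print("residues:", total_residues)
print("basic (ARG + LYS):", basic)
print("acidic (ASP + GLU):", acidic)
print("approximate molecular weight (Da):", round(mol_weight))
```

Under these assumptions the counts agree with the text, and the estimated mass falls close to the reported 29,260 daltons.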

The 35-kDa protein contains only 37% hydrophobic residues, and a Hopp and Woods (1981) analysis did not show any significant regions of high hydrophobicity (fig. 5). Chou and Fasman (1978) analysis for secondary structure predicted that the amino-terminal two-thirds is capable of forming alpha-helical structures. A turn region was predicted through amino acids 180 to 185, and the carboxy-terminal end of the molecule was predicted to be primarily alpha-helical with four potential turns within thirty-five residues of the carboxy terminus.

FIG. 5. Hydrophobicity plot of the 35-kDa protein.

The hydrophobicity values for each amino acid residue are plotted against position in the protein's sequence. Numbering begins at the amino terminus. Values are calculated according to parameters previously described (Hopp and Woods, 1981) by using a six-amino-acid window.
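The windowed calculation behind a plot such as figure 5 can be sketched as follows. This is a minimal illustration, not the program used in this study: it averages per-residue values over a sliding six-residue window, using the Hopp and Woods (1981) scale as it is commonly tabulated. That scale is a hydrophilicity scale (positive values indicate hydrophilic stretches); the sign convention used for the published plot is not stated here, so the sketch reports the raw values. The function name and one-letter input format are assumptions.

```python
# Hopp-Woods values per residue (one-letter code), as commonly tabulated.
HOPP_WOODS = {
    "R": 3.0, "D": 3.0, "E": 3.0, "K": 3.0, "S": 0.3, "N": 0.2, "Q": 0.2,
    "G": 0.0, "P": 0.0, "T": -0.4, "A": -0.5, "H": -0.5, "C": -1.0,
    "M": -1.3, "V": -1.5, "I": -1.8, "L": -1.8, "Y": -2.3, "F": -2.5,
    "W": -3.4,
}


def window_profile(seq, window=6, scale=HOPP_WOODS):
    """Mean scale value for every full window along a protein sequence.

    Element i of the result is the average over residues i .. i+window-1
    (0-based, counted from the amino terminus).
    """
    seq = seq.upper()
    if len(seq) < window:
        return []
    values = [scale[aa] for aa in seq]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


if __name__ == "__main__":
    # Hypothetical peptide, used only to show the call.
    for position, score in enumerate(window_profile("MKRDLLVVAGSE"), start=1):
        print(position, round(score, 2))
```

A region of high hydrophobicity, such as a membrane anchor, would appear as a run of consecutive windows with strongly negative averages on this scale.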

DISCUSSION

The original impetus for cloning genes encoding M. tuberculosis antigens into E. coli was to obtain recombinants that could provide a source of DNA probes or protein for use as reagents in improved diagnostic tests or immunoprophylactic preparations for the detection and prevention of mycobacterial disease. The original screening of recombinants used polyclonal antiserum raised against live tubercle bacilli, with the objective that this approach would increase the chance of identifying clones that express antigens which are relevant to naturally occurring infections (Cohen et al., 1987). Using this approach, a recombinant that expresses a 35-kDa M. tuberculosis antigen was identified. This sequencing project was undertaken to increase our knowledge of this protein and to learn more about genetic organization within the genus Mycobacterium.

A total of 2,383 bases of mycobacterial DNA were sequenced. The overall G + C content of this DNA was 66.1%, in agreement with previous determinations of the base composition of M. tuberculosis DNA of 66% (Wayne and Gross, 1968). The ORF for the 35-kDa protein had a slightly lower G + C content of 63.3%. The high G + C content of the DNA resulted in sequencing artifacts in the form of compressions on the autoradiographs. To decrease our error rate in reading the nucleotide sequence, it was imperative that both strands of DNA spanning the ORF were sequenced. We also found that substituting inosine triphosphate for guanosine triphosphate in the sequencing reactions greatly reduced the incidence of compressions observed (data not shown).
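The G + C figures quoted above are simple base-composition percentages. A minimal sketch of the calculation is given below; the function name is illustrative, and ambiguous bases such as N are ignored by assumption.

```python
def gc_percent(seq):
    """Percentage of G and C among the unambiguous bases of a DNA sequence."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / counted


if __name__ == "__main__":
    # Sequencing primer quoted later in this Discussion.
    print(gc_percent("CGACAAGATGCGCGCCATGGGTGG"))
```

Applied to the full 2,383-base sequence and to the ORF alone, the same function would give the two values being compared in the text.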

Computer-assisted analysis of the sequence of this 2.4-kb fragment identified four ORF that could encode proteins with a primary structure greater than 250 amino acids in length (data not shown). Although amino acid sequence data for the protein would have identified the proper ORF, attempts to derive this information from purified protein were unsuccessful. No attempt was made to generate internal peptide fragments for amino-terminal sequencing.
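The kind of search described above can be sketched as a scan of all six reading frames for long open stretches. The sketch below is an illustration under stated assumptions, not the analysis package used in this study: an ORF is taken to run from the first ATG after a stop codon to the next in-frame stop, alternative start codons are not considered unless passed in, and stretches without a terminating stop are ignored.

```python
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=250, starts=("ATG",)):
    """List ORFs of at least min_codons codons in all six reading frames.

    Each hit is (strand, frame, start, end, n_codons). Coordinates are
    0-based on the strand being read; end is the index just past the stop
    codon. n_codons counts the start codon but not the stop codon.
    """
    seq = seq.upper()
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon in starts:
                    start = i          # first start codon since the last stop
                elif start is not None and codon in STOPS:
                    n_codons = (i - start) // 3
                    if n_codons >= min_codons:
                        hits.append((strand, frame, start, i + 3, n_codons))
                    start = None       # look for the next ORF in this frame
    return hits


if __name__ == "__main__":
    # Toy sequence; a real search would use the full fragment and the default
    # threshold of 250 codons.
    print(find_orfs("ATG" + "GCC" * 5 + "TAA", min_codons=3))
```

In DNA with a high G + C content, stop codons (TAA, TAG, TGA) are rare by chance, so several long ORF in one fragment are to be expected and further evidence is needed to identify the expressed one.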

We found that one of the subclones in M13 expressed a protein of approximately 19-kDa molecular mass that reacted with mAb 2B2. The Mycobacterium DNA fragment in this subclone encompassed the internal portion of two of the ORF. Because mycobacterial regulatory regions were not present on this fragment, the protein produced must have been a beta-galactosidase fusion protein expressed from the lac promoter. Since the fragment was force-cloned into M13, we could determine which ORF was encoding the 35-kDa protein. This conclusion was supported by the fact that no protein was produced when the Mycobacterium DNA was inserted in the opposite orientation (data not shown).

We then subcloned a DNA fragment that encompassed the entire ORF into a set of expression vectors (Stark, 1987) for two reasons. Cloning the fragment in opposite orientations in these vectors allowed us to confirm that the proper ORF had been identified, because we were able to induce expression with IPTG for only one orientation of the insert (fig. 3). Interestingly, we did not get any expression of 35-kDa protein from the clone possessing pSOC2320 in the absence of IPTG. This is contrary to previous results in which expression of this protein was obtained in the absence of IPTG, suggesting that the M. tuberculosis promoter was functioning in E. coli (Cohen et al., 1987). Although examination of the sequence revealed the presence of a properly positioned Shine-Dalgarno region, the distance which separates the putative -10 and -35 regions (chosen entirely on the basis of agreement with the E. coli consensus sequences) precludes their ability to function efficiently as a promoter in E. coli. Two possible explanations for this incongruity are that expression is from a mycobacterial promoter which is located upstream of the region which was subcloned into pSOC2320, or that the mycobacterial promoter does not function in E. coli and expression without IPTG was the result of incomplete repression of the lac promoter in pUC13.

We also cloned this gene into an expression vector in an attempt to increase the level of 35-kDa protein production. Protein production in E. coli rather than M. tuberculosis has several advantages related to the biochemical composition, slow generation time, and extreme infectivity of this pathogen. The protein has been successfully purified from recombinant strain TB-2 (Cohen et al., 1987), but with low yields (Rumschlag, unpublished data). Utilization of the expression vector resulted in a 3.5-fold relative increase in 35-kDa protein synthesis compared with the original clone; therefore, use of this recombinant strain as the source of protein may result in increased yields during purification.

Since the ORF identified as the gene encoding the 35-kDa protein had an internal SphI site, it was necessary to exclude the possibility that a segment of the ORF was lost in the construction of pLWM2110 from pLWM2020. This was done by performing double-stranded sequencing of pLWM2020 using an oligonucleotide (5' CGACAAGATGCGCGCCATGGGTGG 3') which primes a region approximately 150 bp upstream from the SphI site. Sequence data obtained from pLWM2020 extended from this region to a position 15 bp downstream from the SphI site, and it agreed exactly with the data obtained from pLWM2110 (data not shown). These data confirm that the ORF indicated in figure 4 represents the complete gene for the 35-kDa protein.

We compared the codon usage of this gene with that of the M. tuberculosis 65-kDa protein gene. The 65-kDa protein has been shown to be a heat-shock protein that is homologous to the GroEL protein of E. coli (Shinnick et al., 1988). The frequency of codon usage was found to be similar in the two genes (table ".). We have observed a similar pattern of codon usage in the M. leprae 65-kDa protein gene (Buchanan et al., 1987) and to a lesser extent in the 18-kDa protein gene from the same organism (Booth et al., 1988; data not shown). This bias towards codons with C or G in the third position is a reflection of the composition of the DNA from species within this genus (M. tuberculosis, 65 mol % G + C; M. leprae, 56 mol % G + C). The putative -10, -35, and ribosome-binding sites from both genes exhibit a high degree of homology with the consensus sequences of E. coli. A search at the nucleotide level for internal repeats did not reveal any significant repeated sequences.
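The third-position bias described above can be quantified as the fraction of codons ending in G or C, alongside a plain codon count for comparing two genes. The sketch below is illustrative only; the function names are assumptions, and the input is taken to be an in-frame coding sequence (trailing bases that do not fill a codon are dropped).

```python
from collections import Counter


def codons_of(cds):
    """Split an in-frame coding sequence into complete codons."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return [cds[i:i + 3] for i in range(0, usable, 3)]


def codon_usage(cds):
    """Count how often each codon occurs in the coding sequence."""
    return Counter(codons_of(cds))


def gc3_fraction(cds):
    """Fraction of codons with G or C in the third position."""
    codons = codons_of(cds)
    if not codons:
        return 0.0
    return sum(codon[2] in "GC" for codon in codons) / len(codons)


if __name__ == "__main__":
    # Toy sequence standing in for a real coding region.
    toy = "ATGGCCAAAGCGCTG"
    print(codon_usage(toy).most_common(3))
    print(gc3_fraction(toy))
```

For a genome of about 65 mol % G + C, the third-position value is expected to lie well above the overall G + C content, because synonymous positions are the least constrained by the encoded protein.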

The primary structure of the 35-kDa protein deduced from the nucleotide sequence revealed several interesting features. The mass of the protein predicted from the nucleotide sequence (29.26 kDa) differed from the empirically determined molecular mass. Similar findings have been observed with other proteins, including the 65-kDa protein of M. tuberculosis (Shinnick, 1987). The protein contained no cysteines, precluding the formation of intra- or extracellular disulphide bonds. No hydrophobic membrane anchor region or signal sequence could be identified for extracellular secretion of the protein, and the large amount of predicted alpha-helical structure is suggestive of a soluble protein (although purified 35-kDa protein has demonstrated some insolubility, perhaps due to aggregation in aqueous solutions). This is consistent with findings that indicate that the native location of this protein in M. tuberculosis is in the cytoplasm. The role this protein plays in the physiology of M. tuberculosis has yet to be determined. A search revealed no significant sequence similarity with nucleic or amino acid sequences deposited in the "GenBank", "EMBL", or "NBRF" databases.

Through the cloning and subsequent sequencing of this gene we have extended our knowledge of the genetic organization of M. tuberculosis and are currently applying the information learned to the development of reagents for potential use in improved diagnostic tests. Synthetic oligonucleotides based on the sequence data have demonstrated specificity in DNA-DNA hybridization experiments for DNA from the tubercle bacilli (data not shown), and therefore may be used as the basis for diagnostic tests for detecting mycobacteria in clinical specimens. Purified protein isolated from the recombinant E. coli is being studied for possible use as a skin test antigen. During these experiments it was found that this protein stimulates a strong antibody response even when administered to guinea pigs in small doses (Rumschlag et al., in press). Although cellular immune responses have a greater impact on the course of mycobacterial infections, the humoral response to this antigen is being studied for potential use in a vaccine to control the spread of tuberculosis.

RÉSUMÉ

NUCLEOTIDE SEQUENCE CODING FOR THE 35-kDa PROTEIN OF MYCOBACTERIUM TUBERCULOSIS

A 35-kDa protein of Mycobacterium tuberculosis is expressed by a recombinant strain of Escherichia coli carrying a plasmid that contains a 2.4-kb fragment of M. tuberculosis chromosomal DNA. The nucleotide sequence of this fragment was determined by dideoxynucleotide chain-termination sequencing. Analysis of this sequence reveals four open reading frames that could encode proteins 250 amino acids in length. The reading frame for the 35-kDa protein was identified by subcloning DNA fragments into the expression vectors pTTQ18 and pTTQ19, and by assaying for production of the 35-kDa protein by Western blot. A protein with a primary structure of 270 amino acids and a predicted molecular weight of 29.26 kDa was deduced from the nucleotide sequence. A search of nucleic acid and amino acid databases did not identify any proteins whose sequence is significantly similar to that of this protein. The organization of the gene encoding this protein was compared with that of other mycobacterial genes already sequenced. The information thus obtained on this protein could facilitate the development of reagents for the diagnosis and control of mycobacterial diseases.

KEY-WORDS: Mycobacterium tuberculosis, Nucleotide, Sequencing, 35-kDa protein; Tuberculosis, Diagnosis.

ACKNOWLEDGEMENTS

S.P.O. was supported by a National Research Council-CDC Research Associateship.

We thank Brian Holloway for preparing synthetic oligonucleotides used as sequencing primers, and Mitch Cohen for advice and review of the manuscript.


REFERENCES

BETHESDA RESEARCH LABORATORIES (1984), E. coli TB1 - host for pUC plasmids. Focus, 6, 7.

BOOTH, R.J., HARRIS, D.P., LOVE, J.M. & WATSON, J.D. (1988), Antigenic proteins of Mycobacterium leprae: complete sequence of the gene for the 18-kDa protein. J. Immunol., 140, 597-601.

BOS, E.S., VAN DER DOELEN, A.A., VAN ROOY, N. & SCHUURS, A.H. (1981), 3,3',5,5'-tetramethylbenzidine as an Ames-test-negative chromogen for horse-radish peroxidase in enzyme-immunoassay. J. Immunoassay, 2, 187-204.

BUCHANAN, T.M., NOMAGUCHI, H., ANDERSON, D.C., YOUNG, R.A., GILLIS, T.P., BRITTON, W.J., IVANYI, J., KOLK, A.H.J., CLOSS, O., BLOOM, B.R. & MEHRA, V. (1987), Characterization of antibody-reactive epitopes on the 65-kilodalton protein of Mycobacterium leprae. Infect. Immun., 55, 1000-1003.

CENTERS FOR DISEASE CONTROL (1987), Tuberculosis and acquired immunodeficiency syndrome - New York City. Morbid. Mortal. Weekly Rep., 36, 785-795.

CENTERS FOR DISEASE CONTROL (1988), Tuberculosis: United States, 1986. Morbid. Mortal. Weekly Rep., 36, 817-820.

CENTERS FOR DISEASE CONTROL (1989), Tuberculosis and human immunodeficiency virus infection: recommendations of the advisory committee for the elimination of tuberculosis (ACET). Morbid. Mortal. Weekly Rep., 38, 236-250.

CHOU, P.Y. & FASMAN, G.D. (1978), Prediction of secondary structure of proteins from their amino acid sequence. Advanc. Enzymol., 47, 45-148.

COHEN, M.L., MAYER, L.W., RUMSCHLAG, H.S., YAKRUS, M.A., JONES, W.D. Jr. & GOOD, R.C. (1987), Expression of proteins of Mycobacterium tuberculosis in Escherichia coli and potential of recombinant genes and proteins for development of diagnostic reagents. J. clin. Microbiol., 25, 1176-1180.

DEVEREUX, J., HAEBERLI, P. & SMITHIES, O. (1984), A comprehensive set of sequence analysis programs for the VAX. Nucl. Acids Res., 12, 387-395.

HANAHAN, D. (1983), Studies on transformation of Escherichia coli with plasmids. J. mol. Biol., 166, 557-580.

HOPP, T.P. & WOODS, K.R. (1981), Prediction of protein antigenic determinants from amino acid sequences. Proc. nat. Acad. Sci. (Wash.), 78, 3824-3828.

ISH-HOROWICZ, D. & BURKE, J.F. (1981), Rapid and efficient cosmid cloning. Nucl. Acids Res., 9, 2989-2998.

JOINT INTERNATIONAL UNION AGAINST TUBERCULOSIS AND WORLD HEALTH STUDY GROUP (1982), Tuberculosis control. Tubercle, 63, 157-169.

KENNA, J.G., MAJOR, G.N. & WILLIAMS, R.S. (1985), Methods for reducing nonspecific antibody binding in enzyme-linked immunosorbent assays. J. Immunol. Methods, 85, 409-419.

LAEMMLI, U.K. (1970), Cleavage of structural proteins during the assembly of the head of bacteriophage T4. Nature (Lond.), 227, 680-685.

LU, M.C., LIEN, M.H., BECKER, R.E., HEINE, H.C., BUGGS, A.M., LIPOVSEK, D., GUPTA, R., ROBBINS, P.W., GROSSKINSKY, C.M., HUBBARD, S.C. & YOUNG, R.A. (1987), Genes for immunodominant protein antigens are highly homologous in Mycobacterium tuberculosis, Mycobacterium africanum, and the vaccine strain Mycobacterium bovis BCG. Infect. Immun., 55, 2378-2382.

MANIATIS, T., FRITSCH, E.F. & SAMBROOK, J. (1982), Molecular cloning: a laboratory manual (p. 93). Cold Spring Harbor Laboratory, New York.

SANGER, F., NICKLEN, S. & COULSON, A.R. (1977), DNA sequencing with chain-terminating inhibitors. Proc. nat. Acad. Sci. (Wash.), 74, 5463-5467.

SCHREIER, P.H. & CORTESE, R. (1979), A fast and simple method for sequencing DNA cloned in the single-stranded bacteriophage M13. J. mol. Biol., 129, 169-172.


SHINNICK, T.M. (1987), The 65-kilodalton antigen of Mycobacterium tuberculosis. J. Bact., 169, 1080-1088.

SHINNICK, T.M., KRAT, C. & SCHADOW, S. (1987), Isolation and restriction site maps of the genes encoding five Mycobacterium tuberculosis proteins. Infect. Immun., 55, 1718-1721.

SHINNICK, T.M., VODKIN, M.H. & WILLIAMS, J.C. (1988), The Mycobacterium tuberculosis 65-kilodalton antigen is a heat-shock protein which corresponds to common antigen and to the Escherichia coli GroEL protein. Infect. Immun., 56, 446-451.

STARK, M.J.R. (1987), Multicopy expression vectors carrying the lac repressor gene for regulated high-level expression of genes in Escherichia coli. Gene, 51, 255-267.

THOLE, J.E.R., DAUWERSE, H.G., DAS, P.K., GROOTHUIS, D.G., SCHOULS, L.M. & VAN EMBDEN, J.D.A. (1985), Cloning of Mycobacterium bovis BCG DNA and expression of antigens in Escherichia coli. Infect. Immun., 50, 800-806.

TOWBIN, H., STAEHELIN, T. & GORDON, J. (1979), Electrophoretic transfer of proteins from polyacrylamide gels to nitrocellulose sheets: procedure and some applications. Proc. nat. Acad. Sci. (Wash.), 76, 4350-4354.

WALLACE, R.J., SWENSON, J.M., SILCOX, V.A., GOOD, R.C., TSCHEN, J.A. & STONE, M.S. (1984), Spectrum of disease due to rapidly growing mycobacteria. Rev. Infect. Dis., 5, 657-679.

WAYNE, L.G. & GROSS, W.M. (1968), Base compositions of deoxyribonucleic acid from mycobacteria. J. Bact., 96, 1915-1919.

YANISCH-PERRON, C., VIEIRA, J. & MESSING, J. (1985), Improved M13 phage-cloning vectors: nucleotide sequences of the M13mp18 and pUC19 vectors. Gene, 33, 103-119.

YOUNG, R.A., MEHRA, V., SWEETSER, D., BUCHANAN, T., CLARK-CURTIS, J., DAVIS, R.W. & BLOOM, B.R. (1985), Genes for the major antigens of the leprosy parasite Mycobacterium leprae. Nature (Lond.), 316, 450-452.

Use of trade names is for identification only and does not imply endorsement by the Public Health Service or by the US Dept of Health and Human Services.