
The complete nucleotide sequence of the gene coding for the nontoxic-nonhemagglutinin component of Clostridium botulinum type C progenitor toxin


Vol. 183, No. 3, 1992

March 31, 1992

BIOCHEMICAL AND BIOPHYSICAL RESEARCH COMMUNICATIONS

Pages 1273-1279

THE COMPLETE NUCLEOTIDE SEQUENCE OF THE GENE CODING FOR THE NONTOXIC-NONHEMAGGLUTININ COMPONENT OF CLOSTRIDIUM BOTULINUM TYPE C PROGENITOR TOXIN

K. Tsuzuki, K. Kimura, N. Fujii, N. Yokosawa, and K. Oguma*

Department of Microbiology, Sapporo Medical College, Sapporo 060, Japan

Received February 23, 1992

SUMMARY: The structural gene for a nontoxic-nonhemagglutinin component of Clostridium botulinum type C progenitor toxin was found to exist on a 7.8 kb DNA fragment obtained from a type C phage DNA. The gene existed between the neurotoxin and hemagglutinin genes, and consisted of a 3588 bp open reading frame (1196 amino acid residues). It was speculated that this gene and the neurotoxin gene were transcribed as a single mRNA (polycistronic transcription) in C. botulinum organisms. © 1992 Academic Press, Inc.

Clostridium botulinum type C strains produce two toxins of different molecular size, with molecular masses of approximately 300 kilodaltons (kDa) and 500 kDa, in their culture fluids (1, 2). Each is designated a progenitor toxin (12S and 16S toxins, respectively), and dissociates under alkaline conditions into a neurotoxin with a molecular weight (MW) of approximately 150 k and a nontoxic component (1, 2). The nontoxic component of 16S toxin shows hemagglutinating activity, but that of 12S toxin does not. It has been postulated that the nontoxic component of the 16S toxin is formed by conjugation of the nontoxic component of 12S toxin (designated the nontoxic-nonHA component in this paper) with hemagglutinin (HA) (3). On SDS gel electrophoresis, the nontoxic-nonHA component gives only one band with a MW of approximately 130 k, whereas the nontoxic component of 16S toxin gives several bands, including 130 k (corresponding to the nontoxic-nonHA component), 53 k, 33 k, and other faint bands, indicating that HA consists of several subcomponents (2, 4, 7).

We have already succeeded in cloning the genes coding for the neurotoxin and the 33 k subcomponent of HA from the bacteriophage DNA which governs both neurotoxin and HA production in C. botulinum type C (4, 5). The HA-33 gene and the 5'-terminus of the neurotoxin gene exist on the same 7.8 kbp DNA fragment; the HA-33 gene is located 4.3 kb (exactly 3873 bp) upstream of the neurotoxin gene, and these two genes are transcribed in opposite directions (4, 5). In this study, we found that the gene for the nontoxic-nonHA component (nontoxic-nonHA gene) exists between the HA-33 and neurotoxin genes (Fig. 1), and we determined its whole nucleotide sequence.

*To whom correspondence should be addressed.


0006-291X/92 $1.50 Copyright © 1992 by Academic Press, Inc.

All rights of reproduction in any form reserved.


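[Figure 1 diagram: the EcoRI fragments cloned as pCL8 (7.8 kbp) and pCH3 (3.0 kbp), with boxes for the HA-33, nontoxic-nonHA component, and neurotoxin genes. Lengths labeled on the map: 438 bp, 858 bp, 3588 bp, 3873 bp.]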

FIGURE 1. Schematic representation of the hemagglutinin (HA), nontoxic-nonHA, and C1 neurotoxin genes. The phage DNAs were digested with EcoRI. The fragments (7.8 and 3.0 kb) were cloned into pUC118 plasmids to give pCL8 and pCH3. The structural genes for HA, the nontoxic-nonHA component, and neurotoxin are indicated by boxes. The orientations of the open reading frames are indicated by arrows in the boxes.

MATERIALS AND METHODS

Purification of 16S toxin from type C culture. The 16S toxin was purified from the culture fluid of type C strain C-Stockholm. The organisms were cultured by a cellophane tube procedure (6). The culture supernatant was centrifuged (6,000 × g, 30 min), and the toxin was precipitated with 50% saturated ammonium sulfate. The precipitate was suspended in 50 mM sodium acetate buffer (pH 4.0) containing 200 mM NaCl and applied to a Sephadex G-75 (Pharmacia) column (6.5 × 48 cm) under the same conditions. The fractions showing both toxic and hemagglutinating activity were collected and applied to an SP-Sephadex C-50 (Pharmacia) column (3.2 × 23 cm) equilibrated with the same 50 mM sodium acetate buffer (pH 4.0) containing 200 mM NaCl. The adsorbed 16S toxin was eluted with an exponential gradient of NaCl (200 to 500 mM) in the same buffer, and then precipitated with saturated ammonium sulfate solution. The precipitate was dissolved in 50 mM sodium acetate buffer (pH 4.0) containing 500 mM NaCl and applied to a Sephadex G-200 (Pharmacia) column (2.2 × 90 cm) to obtain the purified 16S toxin.
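To give a sense of the scale of the three chromatography steps, the following sketch estimates the bed volume of each column from the dimensions quoted above. It assumes that each pair of numbers is inner diameter × bed height in centimetres, which the text does not state explicitly; the function name `bed_volume_ml` is illustrative only.

```python
import math

# Column dimensions as quoted in the purification procedure (cm).
# Assumption: the first number is the inner diameter, the second the bed height.
COLUMNS = {
    "Sephadex G-75": (6.5, 48.0),
    "SP-Sephadex C-50": (3.2, 23.0),
    "Sephadex G-200": (2.2, 90.0),
}


def bed_volume_ml(diameter_cm: float, height_cm: float) -> float:
    """Return the volume of a cylindrical bed in mL (1 cm^3 = 1 mL)."""
    radius_cm = diameter_cm / 2.0
    return math.pi * radius_cm ** 2 * height_cm


if __name__ == "__main__":
    for name, (diameter, height) in COLUMNS.items():
        print(f"{name}: about {bed_volume_ml(diameter, height):.0f} mL")
```

Under this reading, the first gel-filtration column is several times larger than the later ones, which fits its use on the crude ammonium sulfate precipitate.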

Amino acid sequencing of N-terminus of nontoxic-nonHA component. The amino acid sequence of the N-terminus of the nontoxic-nonHA component was determined by the direct protein microsequencing method described previously (4). Briefly, the purified 16S toxin preparation was subjected to SDS-PAGE in the presence of 2-mercaptoethanol (2-ME), and transferred to an Immobilon-P membrane (Millipore Corp., Bedford, Mass.). The membrane was stained and the visible protein band with a MW of approximately 130 k was cut out. The N-terminal amino acid sequence of this protein was determined with a gas-phase sequencer (model 470A; Applied Biosystems, Inc., Japan) coupled to an on-line high-performance liquid chromatograph (model 120A; Applied Biosystems, Inc., Japan).

Cloning and DNA sequencing of nontoxic-nonHA gene. The cloning and the entire nucleotide sequences of the neurotoxin and HA-33 genes have been reported previously (4, 5). The phage DNAs were digested with EcoRI, cloned into the EcoRI site of λgt11 phage DNA by using Packagene (Promega Corp., Madison, Wis.), and then mixed with Escherichia coli Y1090 cells. The DNA library was screened by the ProtoBlot Immunoscreening System (Promega) with antisera against neurotoxin and HA. Since the E. coli cells infected with the recombinant phage containing a 7.8 kbp DNA fragment produced both the N-terminal portion of the neurotoxin (63 k) and one subcomponent of HA (33 k), the 7.8 kbp DNA fragment was recloned into plasmid pUC118, digested with PstI and BamHI, and a series of deletions was then generated with a TAKARA deletion kit (TAKARA SHUZO Co., Ltd., Japan) containing exonuclease III, mung bean nuclease, and Klenow enzyme. After transformation of E. coli MV1184, the deleted DNA fragments of the proper sizes were selected by agarose gel electrophoresis. The nucleotide sequence was determined by the dideoxy chain termination method with a T7 sequencing kit (Pharmacia, Uppsala, Sweden).

Nucleotide sequence accession number. The nucleotide sequence reported herein will appear in the EMBL nucleotide sequence data bases under the accession number X62389.

RESULTS AND DISCUSSION

Characterization of purified 16S toxin. The 16S toxin was purified from the culture fluid of C-Stockholm by column chromatography under acidic conditions. The purified toxin was subjected to SDS-PAGE with a reducing agent. The preparation gave five major bands with MWs of approximately 130, 100, 53, 50, and 33 k, as well as several faint bands (Fig. 2). Based on previous reports (1, 2, 4, 7), it can be concluded that the 130 k protein is the nontoxic-nonHA component, and that the 100 and 50 k proteins are the heavy and light chains, respectively, of the neurotoxin. The remaining proteins seem to be subcomponents of HA.

Amino acid sequence of the N terminus of nontoxic-nonHA component. The amino acid sequence of the N terminus of the 130 k protein resolved by SDS-PAGE was determined by a direct protein microsequencing procedure. The N-terminal amino acid sequence was determined to be Met-Asp-Ile-Asn-Asp-Asp-Leu-Asn-Ile-Asn-Ser-Pro-Val-Asp-Asn-Lys-Asn-Val-Val-Ile.
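The microsequenced peptide can be checked against the first 20 codons of the open reading frame printed in Fig. 3 (nucleotides 1 to 60). The sketch below is an illustration rather than part of the original analysis: it translates those codons with the standard genetic code and compares the result with the peptide. The helper names `translate` and `to_one_letter` are hypothetical.

```python
from itertools import product

# Standard genetic code; codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Only the residues that occur in the 20-residue peptide are listed here.
THREE_TO_ONE = {
    "Met": "M", "Asp": "D", "Ile": "I", "Asn": "N", "Leu": "L",
    "Ser": "S", "Pro": "P", "Val": "V", "Lys": "K",
}

# First 20 codons of the ORF as printed in Fig. 3 (nucleotides 1 to 60).
ORF_START = (
    "ATG GAT ATA AAC GAT GAC TTA AAT ATA AAT "
    "TCT CCA GTG GAT AAT AAA AAT GTT GTA ATA"
)

# N-terminal sequence obtained by protein microsequencing.
PEPTIDE = (
    "Met-Asp-Ile-Asn-Asp-Asp-Leu-Asn-Ile-Asn-"
    "Ser-Pro-Val-Asp-Asn-Lys-Asn-Val-Val-Ile"
)


def translate(dna: str) -> str:
    """Translate a DNA string (spaces ignored) into one-letter amino acids."""
    seq = dna.replace(" ", "").upper()
    usable = len(seq) - len(seq) % 3  # ignore any trailing partial codon
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def to_one_letter(peptide: str) -> str:
    """Convert a hyphen-separated three-letter peptide to one-letter code."""
    return "".join(THREE_TO_ONE[residue] for residue in peptide.split("-"))


if __name__ == "__main__":
    deduced = translate(ORF_START)
    observed = to_one_letter(PEPTIDE)
    print(deduced)
    print(observed)
    print("identical" if deduced == observed else "different")
```

Both routes give the same 20 residues, which is the agreement the identification of the open reading frame rests on.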

[Figure 2 gel image: lanes 1 to 4 with MW markers at 200, 116, 92, 66, 45, 31, and 14 kDa. Bands indicated by arrows: Tox (whole), Nontoxic-nonHA, Tox (HC), HA-53, Tox (LC), and HA-33.]

FIGURE 2. SDS-PAGE of purified 16S toxin and 7S neurotoxin. Both toxins were heated for 5 min at 100°C in the absence of 2-ME (lanes 1 and 2) and in the presence of 2-ME (lanes 3 and 4). Neurotoxin (150 k) and its heavy (100 k) and light (50 k) chains, the subcomponents of HA (33 k, 53 k), and the nontoxic-nonHA component (130 k) are indicated by arrows. Lanes 1 and 3, 16S toxin; lanes 2 and 4, neurotoxin.


[Figure 3 sequence listing, first part: the region upstream of the nontoxic-nonHA gene, including the beginning of the HA-33 gene (read in the opposite direction) and the marked -35, -10, and S.D. sites, followed by nucleotides 1 to 1710 of the open reading frame with the deduced amino acid residues 1 to 570. The open reading frame begins ATG GAT ATA AAC GAT GAC (Met-Asp-Ile-Asn-Asp-Asp).]

FIGURE 3. Complete nucleotide sequence and deduced amino acid sequence of the nontoxic-nonHA gene. The possible promoter consensus sequences at positions -10 and -35 and a Shine-Dalgarno (S.D.) sequence of the nontoxic-nonHA gene, and an S.D. sequence of the neurotoxin gene are underlined. Also, a candidate which was presumed to work as a promoter for the neurotoxin gene in E. coli cells is shown with a dotted line.


[Figure 3 sequence listing, continued: nucleotides 1711 to 3588 of the open reading frame with the deduced amino acid residues 571 to 1196. The open reading frame ends TTG TGG ATA TTA GAA AGT TAG (Leu-Trp-Ile-Leu-Glu-Ser, stop), and is followed by the S.D. sequence and the beginning of the neurotoxin gene (Met-Pro-Ile-Thr-Ile-Asn-Asn-Phe-Asn-Tyr-Ser-Asp-Pro-Val-Asp-Asn-Lys-Asn-Ile-Leu-Tyr-Leu-Asp-Thr-His-Leu).]

FIGURE 3 - CONTINUED


Nucleotide sequence of nontoxic-nonHA gene. The nucleotide sequence of the DNA between the neurotoxin and HA-33 genes was determined. It was found that this region contains one open reading frame (ORF) coding for 1196 amino acid residues (Fig. 3). The N-terminal amino acid sequence deduced from this nucleotide sequence was identical to that of the nontoxic-nonHA component determined by the protein microsequencing procedure. The MW calculated for the product of this gene was 138,758, in good agreement with the approximately 130 k estimated by SDS-PAGE. Therefore, it was concluded that this ORF is the gene for the nontoxic-nonHA component. This gene consists of a 3588 bp ORF starting at position +1 and ending with a TAG codon at position +3588, and extends from 268 bases upstream of the HA-33 gene to 17 bases upstream of the C1 toxin gene. Therefore, the number of nucleotide bases between the neurotoxin and HA-33 genes, which had previously been estimated as 4.3 kb, was found to be 3873. The 5' noncoding region was analyzed for the presence of regulatory sequences. Six base pairs predicted to be a Shine-Dalgarno sequence (AGGAGA) were found 11 bases upstream of the start codon. A sequence resembling an E. coli promoter was also observed, consisting of the hexanucleotide TTTACA (homology of five of six nucleotides to TTGACA) and a TATATA box (homology of four of six nucleotides to TATAAT) separated by 19 nucleotides. Both the promoter and the open reading frame were A+T rich, and no typical signal peptide was found after the ATG start codon, as is also the case for the neurotoxin and HA-33 genes.
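The homology figures quoted for the putative promoter are simple position-by-position match counts against the E. coli consensus hexamers. As a minimal sketch (the function name `matching_positions` is illustrative, not from the paper), the counts can be reproduced like this:

```python
# E. coli consensus hexamers and the candidate hexamers found upstream of the ORF.
CONSENSUS_MINUS_35 = "TTGACA"
CONSENSUS_MINUS_10 = "TATAAT"
CANDIDATE_MINUS_35 = "TTTACA"
CANDIDATE_MINUS_10 = "TATATA"


def matching_positions(candidate: str, consensus: str) -> int:
    """Count positions at which two equal-length sequences carry the same base."""
    if len(candidate) != len(consensus):
        raise ValueError("sequences must have the same length")
    return sum(a == b for a, b in zip(candidate, consensus))


if __name__ == "__main__":
    for label, cand, cons in (
        ("-35", CANDIDATE_MINUS_35, CONSENSUS_MINUS_35),
        ("-10", CANDIDATE_MINUS_10, CONSENSUS_MINUS_10),
    ):
        n = matching_positions(cand, cons)
        print(f"{label}: {cand} vs {cons}: {n} of {len(cons)} positions match")
```

The comparison is ungapped and unweighted, so it says nothing about which positions of the consensus matter most for promoter strength.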

It became clear that only 17 bp exist between the stop codon (TAG) of the nontoxic-nonHA gene and the start codon (ATG) of the neurotoxin gene, and no promoter-like sequence was found in this region. In addition, the 5' termini of the nontoxic-nonHA and neurotoxin genes showed high homology (Fig. 4). From these results, it was suggested that the nontoxic-nonHA and neurotoxin genes are transcribed as a single mRNA (polycistronic transcription) in C. botulinum organisms.
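The lengths reported for this region can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers stated in the text; the variable names are illustrative, and the average residue mass is a derived figure that the paper does not report.

```python
# Values taken from the text.
ORF_BP = 3588                 # length of the nontoxic-nonHA open reading frame
RESIDUES = 1196               # deduced amino acid residues
GAP_TO_HA33_BP = 268          # bases between the ORF and the HA-33 gene
GAP_TO_NEUROTOXIN_BP = 17     # bases between the ORF and the neurotoxin gene
REPORTED_DISTANCE_BP = 3873   # reported distance between HA-33 and neurotoxin genes
CALCULATED_MW = 138_758       # MW calculated from the deduced sequence

# One codon (3 bases) per residue.
orf_matches_residues = ORF_BP == 3 * RESIDUES

# Flanking gaps plus the ORF should add up to the reported intergenic distance.
summed_distance_bp = GAP_TO_HA33_BP + ORF_BP + GAP_TO_NEUROTOXIN_BP

# Derived value, not stated in the paper: mean mass per residue.
mean_residue_mass = CALCULATED_MW / RESIDUES

if __name__ == "__main__":
    print(f"3 x {RESIDUES} = {3 * RESIDUES} bp (ORF length matches: {orf_matches_residues})")
    print(f"{GAP_TO_HA33_BP} + {ORF_BP} + {GAP_TO_NEUROTOXIN_BP} = {summed_distance_bp} bp")
    print(f"mean residue mass: {mean_residue_mass:.1f}")
```

The three segments sum to the 3873 bp given for the distance between the HA-33 and neurotoxin genes, and the 3588 bp ORF corresponds to exactly 1196 codons.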

In E. coli cells transformed with the recombinant pUC118 plasmid or λgt11 phage containing the 7.8 kbp DNA fragment, both the neurotoxin and the 33 k HA subcomponent were produced, but production of the nontoxic-nonHA component was not detected by Western blotting. This indicates that the predicted promoter of the nontoxic-nonHA gene may not work in E. coli, but that some nucleotide region within the nontoxic-nonHA gene may work as a promoter for the neurotoxin gene; a candidate for such a region is shown in Fig. 3.

    Nontoxic-nonHA    1  MDIN-DDLNINSPVDNKNVVIVRARK---TNTFFKAFKVAPNIWVAPERY--YGEP-LDI   53
    Neurotoxin        1  MPITINNFNYSDPVDNKNILYLDTHLNTLANEPEKAFRITGNIWVIFDRFSRNSNPNLNK   60

    Nontoxic-nonHA   54  AEEY-KLDGGIYDSNFLSQDSERENFLQAIIILLKRINNTISGKQLLSLISTAIPFPYGY  113
    Neurotoxin       61  PPRVTSPKSGYYDPNYLSTDSDKDTFLKEIIKLFKRINSREIGEELIYRLSTDIPFPGNN  120

FIGURE 4. Amino acid sequence alignment of neurotoxin and nontoxic-nonHA. Amino acid residues common to neurotoxin and nontoxic-nonHA are marked with asterisks.
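The asterisks in Fig. 4 mark alignment columns where the two proteins carry the same residue. The sketch below shows how such a marker line can be derived, using only the first 26 columns of the alignment. This stretch is chosen because both rows can be confirmed against the sequences in Fig. 3, with the first residue of the nontoxic-nonHA row written as Met as in the N-terminal sequence. The function name `identity_line` is illustrative.

```python
# First 26 columns of the Fig. 4 alignment ("-" marks a gap).
NONTOXIC_NONHA = "MDIN-DDLNINSPVDNKNVVIVRARK"
NEUROTOXIN = "MPITINNFNYSDPVDNKNILYLDTHL"


def identity_line(row_a: str, row_b: str, gap: str = "-") -> str:
    """Return a marker line with '*' under columns holding identical residues."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    return "".join(
        "*" if a == b and a != gap else " "
        for a, b in zip(row_a, row_b)
    )


if __name__ == "__main__":
    marks = identity_line(NONTOXIC_NONHA, NEUROTOXIN)
    print(NONTOXIC_NONHA)
    print(NEUROTOXIN)
    print(marks)
    print(f"{marks.count('*')} identical residues in {len(marks)} columns")
```

Most of the identities in this stretch fall in the shared run Pro-Val-Asp-Asn-Lys-Asn. Extending the count to the full alignment would require both rows to be verified against the published sequences first.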

Upstream of the HA-33 gene, there exists another ORF which is presumed to code for another subcomponent of HA. We are now trying to determine the whole nucleotide sequence of this gene in an attempt to clarify the molecular construction of HA.

ACKNOWLEDGMENT

This study was supported by grant 03304030 from the Ministry of Education of Japan.

REFERENCES

1. Oguma, K., Nakane, A., and Iida, H. (1978) Appl. Environ. Microbiol. 35, 462-464.

2. Ohishi, I., and Sakaguchi, G. (1980) Infect. Immun. 28, 303-309.

3. Sakaguchi, G., Kozaki, S., and Ohishi, I. (1984) In Alouf, J.E., Fehrenbach, F.J., Freer, J.H., and Jeljaszewicz, J. (ed.) Bacterial protein toxins. Academic Press, London, 435-443.

4. Tsuzuki, K., Kimura, K., Fujii, N., Yokosawa, N., Indou, T., Murakami, T., and Oguma, K. (1990) Infect. Immun. 58, 3173-3177.

5. Kimura, K., Fujii, N., Tsuzuki, K., Murakami, T., Indou, T., Yokosawa, N., Takeshi, K., Syuto, B., and Oguma, K. (1990) Biochem. Biophys. Res. Commun. 171, 1304-1311.

6. Yokosawa, N., Tsuzuki, K., Syuto, B., and Oguma, K. (1986) J. Gen. Microbiol. 132, 1981-1988.

7. Suzuki, N., Syuto, B., and Kubo, S. (1986) Jpn. J. Vet. Res. 34, 269-278.
