Eur. J. Biochem. 193, 767-773 (1990) © FEBS 1990

Isolation and nucleotide sequence of cDNA clone for bovine pancreatic anionic trypsinogen
Structural identity within the trypsin family

Isabelle LE HUEROU, Catherine WICKER, Paul GUILLOTEAU, René TOULLEC and Antoine PUIGSERVER

Centre de Biochimie et de Biologie Moléculaire du Centre National de la Recherche Scientifique, Marseille, France
Laboratoire du Jeune Ruminant de l'Institut National de la Recherche Agronomique, Rennes, France

(Received March 2, 1990) - EJB 90 0239

A cDNA clone encoding an anionic form of bovine trypsinogen was isolated from a pancreatic cDNA library. The corresponding 855-nucleotide mRNA contains a short 5' noncoding region of 8 nucleotides and a long 3' noncoding region of 56 nucleotides in addition to a poly(A) tail of at least 50 nucleotides. The deduced amino acid sequence for the anionic pretrypsinogen (247 residues) includes the N-terminal 15-amino-acid signal peptide followed by an 8-amino-acid activation peptide. The zymogen (232 residues) contains an additional C-terminal serine, compared with the amino acid sequence of bovine cationic trypsinogen. The identity between the anionic and cationic forms of bovine trypsinogen (65%) is lower than that existing between the anionic protein and other mammalian anionic trypsinogens (73-85%), suggesting that trypsin gene duplication in mammals occurred prior to the evolutionary events responsible for the species divergence. Bovine pancreatic anionic trypsin possesses all the key amino acids characteristic of the serine protease family.

The exocrine pancreas is known to synthesize about 20 digestive enzymes and proenzymes which are destined to be exported from the acinar cell to carry out hydrolysis of dietary substrates in the intestinal lumen [1]. Trypsinogens are specifically activated in the intestine by enterokinase, and the resulting trypsins are responsible for the activation of all remaining zymogens. Trypsins are also major digestive endopeptidases since they catalyze the hydrolysis of peptide bonds on the carboxyl side of lysine and arginine residues of proteins and long peptides. Like all the other pancreatic serine proteases, including several forms of chymotrypsin, elastase and kallikrein, trypsins owe their catalytic activity to the triad of residues His57, Asp102 and Ser195 (chymotrypsin numbering system).

Trypsinogen isozymes have been shown to be present in the pancreatic tissue of most animal species. In particular, anionic and cationic forms of trypsins have been isolated from human [2, 3], bovine [4], dog [5], porcine [6], and rat [7] pancreatic glands. The amino acid sequences of the porcine [8, 9] and bovine [10, 11] trypsins, as determined by sequential degradation of the polypeptide chains, and those of the human [12], rat [13, 14] and dog [15] trypsins, derived from nucleotide sequence analysis of the corresponding cloned cDNAs, reveal the existence of substantial structural identity.

In the bovine pancreas, anionic trypsin was reported to represent less than 10% of the total amount of trypsin [6]. This probably explains why the complete amino acid sequence of anionic trypsin has so far not been determined. However, the amino acid sequences of the two peptides released during the activation of bovine anionic trypsinogen are known. The major octapeptide, Phe-Pro-Ser-(Asp)4-Lys, and the minor hexapeptide, Ser-(Asp)4-Lys, account for 90% and 10% of the parent zymogens, respectively [6]. The presence of a hexapeptide, lacking the first two residues of the octapeptide, might be the result of some uncontrolled splitting during purification and/or activation of the zymogen, or might be due to the existence of two anionic trypsins encoded by distinct mRNAs.

Correspondence to I. Le Huerou, Centre de Biochimie et de Biologie Moléculaire du Centre National de la Recherche Scientifique, 31 Chemin Joseph-Aiguier, Boîte Postale 71, F-13402 Marseille Cedex 9, France

Enzymes. Trypsin(ogen) (EC 3.4.21.4); enterokinase (EC 3.4.21.9).

Note. The novel nucleotide sequence data published here has been deposited with the EMBL, GenBank and DDBJ nucleotide sequence data banks and is available under accession number X54703, trypsinogen anionic precursor.

We report here the amino acid sequence of the major form of bovine anionic trypsinogen as determined by nucleotide sequence analysis of the cloned cDNA. The 5' and 3' noncoding regions have also been sequenced since they are expected to play a part in the regulation of gene expression. The sequences of the bovine anionic and cationic trypsinogens have been compared to those of anionic trypsinogens I and II, and cationic trypsinogen from the rat pancreas.

MATERIALS AND METHODS

Materials

Guanidine hydrochloride and oligo(dT)-cellulose were obtained from Sigma Chemical Co. (St Louis, MO, USA). Guanidinium thiocyanate and formaldehyde were from Fluka AG (Basel, Switzerland). Nitrocellulose sheets (BA 83) and formamide were supplied by Schleicher & Schüll (Dassel, FRG) and Eastman Kodak Co. (Rochester, NY, USA), respectively. [α-32P]dCTP (> 110 TBq/mmol), [γ-32P]dATP (> 110 TBq/mmol), [35S]dATP[αS] (deoxyadenosine 5'-α-[35S]thiotriphosphate) (> 37 TBq/mmol), and the cDNA synthesis kit (RPN.1256) were from Amersham Corp. (Les Ulis, France). Reverse transcriptase from avian myeloblastosis virus and the pUC sequencing kit (Cat. 1013 106) were from Boehringer (Mannheim, FRG). Oligonucleotide purification cartridges (400771) were obtained from Applied Biosystems (California, USA).

[Figure 1: schematic of the mRNA showing the PRE and PRO regions, the anionic trypsin coding region, restriction sites and the poly(A) tail; scales in amino acids (-15 to 232) and nucleotides (0 to 855).]

Fig. 1. Sequencing strategy for anionic trypsin mRNA from bovine pancreas. Horizontal rectangles delineate the amino acid coding region; the signal peptide (PRE) and activation peptide (PRO) regions are indicated. Lines extending from the rectangles represent the noncoding regions with the 50-nucleotide poly(A) tail, indicated as poly(A)+, at the 3' end of the mRNA. The two thick horizontal lines below the main rectangle represent the position, direction, and length of the first (5'-CCAGGCTTGCCCTTCTG-3') and second (5'-TGGTAAGGGACGGAATT-3') synthetic oligonucleotides used as primers. Horizontal arrows indicate the direction and extent of each sequencing run, while the wavy horizontal arrow delineates the position and length of sequence determined by primer extension analysis. Cleavage sites for a few restriction enzymes are indicated using the following abbreviations: A, AluI; E, EcoRI; F, FnuDII; R, RsaI; T, TaqI

Construction and screening of a bovine pancreatic cDNA library

Bovine pancreatic total RNA was prepared by the guanidinium thiocyanate extraction procedure [16] with slight modification [17], and poly(A)-rich RNA was subsequently isolated by chromatography on oligo(dT)-cellulose [18]. Double-stranded cDNA was synthesized using the Amersham cDNA system and inserted into pUC 9 plasmid at the PstI site following G-C tailing. Escherichia coli JM 83 strain was then transformed by the recombinant plasmid mixture [19]. Screening of insert-containing colonies (1.4 × 10³) was performed using a rat pancreatic trypsin cDNA available in our laboratory and previously 32P-labelled by nick-translation [20]. High-stringency hybridization conditions were used to screen the library. Prehybridization and hybridization were performed as already described [21] in order to promote the identification of trypsin sequences. Filters were washed three times at 65°C for 45 min each with a solution containing 0.5 × NaCl/Cit (75 mM NaCl, 7.5 mM sodium citrate, pH 7.0) and 0.1% SDS.

Nucleotide sequencing

Nucleotide sequence determination of bovine trypsin cDNA was carried out by the dideoxy-chain-termination technique of Sanger et al. [22] using a sequencing kit from Boehringer. The oligonucleotide primers used for sequencing included the universal sequencing primer and the reverse sequencing primer from the kit. A first synthetic oligonucleotide (5'-CCAGGCTTGCCCTTCTG-3') corresponding to the 3' end of the bovine trypsin cDNA sequence (nucleotides 672-688) was also used (Fig. 1). It was synthesized by the β-cyanoethyl phosphoramidite method using an automatic DNA synthesizer (Applied Biosystems 381A), and separated from reaction by-products on an oligonucleotide purification cartridge [23]. The strategy for nucleotide sequencing is shown in Fig. 1.

Primer extension

The sequence of the first 43 nucleotides of bovine trypsin mRNA was determined directly from the mRNA by primer extension, as described [24], using a second synthetic oligonucleotide (5'-TGGTAAGGGACGGAATT-3') corresponding to the 5' end of the cDNA (nucleotides 105-121) as a primer (Fig. 1).
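Both synthetic primers anneal to the mRNA, so their reverse complements should read as sense-strand sequence. The following Python sketch makes that conversion explicit and checks that each 17-mer spans exactly the coordinate range quoted in the text. The helper names (`reverse_complement`, `to_rna`) are illustrative choices, not part of the original methods.

```python
# Illustrative helpers (not from the paper): turn an antisense DNA primer
# into the mRNA sense-strand stretch it is expected to anneal to.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(dna.upper()))


def to_rna(dna: str) -> str:
    """Write a DNA sequence in the RNA alphabet (T becomes U)."""
    return dna.upper().replace("T", "U")


# Primer sequences and mRNA coordinates as quoted in the text and in Fig. 1.
PRIMERS = {
    "first (sequencing)": ("CCAGGCTTGCCCTTCTG", 672, 688),
    "second (primer extension)": ("TGGTAAGGGACGGAATT", 105, 121),
}

for name, (sequence, start, end) in PRIMERS.items():
    target = to_rna(reverse_complement(sequence))
    # A 17-mer must cover exactly end - start + 1 positions.
    assert len(sequence) == end - start + 1
    print(f"{name}: mRNA {start}-{end} 5'-{target}-3'")
```

Read in frame, the target of the second primer begins AAU UCC GUC CCU UAC, i.e. Asn-Ser-Val-Pro-Tyr, which is the stretch found near the N-terminus of the deduced trypsin sequence in Fig. 2.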

Northern-blot analysis

Total RNA samples (16 µg) were denatured in 18 µl of a mixture containing 50% formamide (by vol.), 0.5 × Mops/NaOAc/EDTA (20 mM Mops, 5 mM sodium acetate, pH 7.0, plus 0.5 mM EDTA pH 8.0), 2.2 M formaldehyde and 2 µl loading solution (50% glycerol, 1 mM EDTA, 0.4% bromophenol blue, 0.4% xylene cyanol), for 3 min at 95°C. Samples were loaded on a 1.2% agarose/formaldehyde gel [19] run at 80 V for 4 h. The gel was treated and RNA transferred to nitrocellulose according to Maniatis et al. [19]. Nitrocellulose filters were baked for 2 h in a vacuum oven at 80°C. The trypsin cDNA plasmid was labelled with [α-32P]dCTP using the BRL nick-translation kit and finally hybridized at > 10⁶ cpm/ml hybridization buffer. Prehybridization, hybridization and washing were carried out according to Mueckler et al. [21].

RESULTS AND DISCUSSION

Screening the bovine cDNA library with a rat trypsin cDNA probe yielded some positive clones (2.3%). The sizes of cDNA inserts were determined by gel electrophoresis following digestion with restriction endonucleases. Among these clones, PBIO contained an 809-bp insert and was nearly full-length. Sequence analysis showed that the insert extended from position 47 to the 3' end of trypsin mRNA. The missing nucleotide sequence at the 5' end of the fragment was determined by primer extension using the reverse transcriptase and the second 17-mer oligonucleotide as a primer in the presence of chain-terminating dideoxynucleotides (Fig. 1).
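The figures in this paragraph can be cross-checked with simple arithmetic. The sketch below assumes that positions are counted on the 855-nucleotide mRNA including the 50-nucleotide poly(A) stretch, and that the 2.3% refers to the 1.4 × 10³ colonies mentioned in Materials and Methods; both are readings of the text rather than statements made by the authors.

```python
# Numbers quoted in the text; variable names are illustrative.
colonies_screened = 1.4e3      # insert-containing colonies screened
positive_fraction = 0.023      # 2.3% positive clones

# Approximate number of positive clones implied by the percentage.
positive_clones = round(colonies_screened * positive_fraction)

mrna_length = 855              # full-length mRNA, nucleotides
insert_start = 47              # first mRNA position covered by the insert

# An insert running from position 47 to the 3' end covers this many bases.
insert_length = mrna_length - insert_start + 1

print(positive_clones)   # about 32 clones
print(insert_length)     # 809, matching the reported insert size
```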

[Figure 2: nucleotide sequence of the 855-nucleotide mRNA, set out in rows of codons with the deduced amino acid sequence (residues -15 to 232) printed beneath.]

Fig. 2. Nucleotide sequence and deduced amino acid sequence of bovine anionic trypsinogen. Upper numbering corresponds to nucleotides, and lower numbering starts at the first amino acid of the zymogen. The amino acid sequence of the predicted prepeptide extends from residue -15 to residue -1. The activation peptide (residues +1 to +8) is indicated with vertical arrows. The conserved polyadenylation signal, AAUAAA, in the 3' noncoding region of the mRNA, is underlined. Important amino acid residues for enzyme activity, as discussed in the text, are enclosed in boxes

[Figure 3: Northern blot showing a single band (TRY); positions of 28S and 18S rRNA on the left, size markers at 1.77, 1.52, 1.28, 0.78, 0.53 and 0.40 kb on the right.]

Fig. 3. Northern-blot analysis of bovine pancreatic RNA. Total RNA isolated from bovine pancreas (16 µg) was denatured, loaded on a 1.2% agarose gel containing 2.2 M formaldehyde, transferred to a nitrocellulose filter and finally hybridized with the 32P nick-translated bovine trypsin cDNA probe, as indicated in Materials and Methods. Positions of 28S and 18S rRNA are shown on the left. Molecular mass standards (length in kb on the right) were denatured RNA fragments derived from bacteriophage T7 and rat prolactin gene from BRL. TRY, trypsin

Fig. 2 shows the nucleotide sequence of bovine anionic trypsin cDNA and the deduced amino acid sequence. The 855-base mRNA contains an open reading frame that encodes a protein of 247 amino acids with a calculated molecular mass of 26246 Da. The presence of His48, Asp92 and Ser185, corresponding to residues 57, 102 and 195 in the chymotrypsin numbering system, in highly conserved regions is characteristic of the serine protease family (Fig. 2).
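As an illustration of how the amino acid sequence is deduced from the open reading frame, the sketch below translates two short stretches of the mRNA with the standard genetic code. The codons are those legible in the first and last rows of Fig. 2; the function and variable names are illustrative, and the table is built with the usual U, C, A, G ordering.

```python
from itertools import product

# Standard genetic code, codons enumerated in UCAG order (first base slowest).
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate an in-frame mRNA stretch, stopping at the first stop codon."""
    mrna = mrna.upper().replace(" ", "")
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Start of the open reading frame (after the 8-nucleotide leader UCUCCACC).
first_codons = "AUG CAU CCC CUG CUU AUC CUU GCC UUU GUG"
# End of the open reading frame, including the UGA stop codon.
last_codons = "GUG GAC UGG AUU CAG GAG ACC AUC GCC GCC AAC AGC UGA"

print(translate(first_codons))  # MHPLLILAFV
print(translate(last_codons))   # VDWIQETIAANS
```

The first output corresponds to Met-His-Pro-Leu-Leu-Ile-Leu-Ala-Phe-Val at the start of the prepeptide, and the second ends with the C-terminal serine discussed in the text.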

Comparison of the coding nucleotide sequences between bovine anionic trypsinogen and the two rat anionic trypsinogens shows a high degree of structural identity. At the DNA level, bovine anionic trypsinogen is 81.1% and 84.3% identical to rat anionic trypsinogens I and II, respectively, while it only shares 74.9% identity with rat cationic trypsinogen.

5' and 3' noncoding regions of anionic trypsinogen mRNA

Electrophoresis of bovine total RNA on agarose gel under denaturing conditions followed by hybridization with the bovine anionic trypsin cDNA probe yielded a single band the size of which was approximately 0.9 kb (Fig. 3). Thus, about 50 nucleotides of the poly(A) tail were lacking in the cDNA probe.

The length of the 5' noncoding region of trypsinogen mRNA was estimated to be 8 nucleotides plus the 5' cap structure, since the primer extension method has been reported to allow determination of all but the first 5' nucleotide and the m7G cap structure [13], as illustrated in the case of the rat pancreatic trypsin I gene [25]. It was therefore likely that the nucleotide sequence of anionic trypsin mRNA shown in Fig. 2 was full-length, except for the first 5' nucleotide, which is generally an adenine [26]. The 5' noncoding region of anionic trypsinogen mRNA (8 nucleotides in length) is therefore shorter than the 5' noncoding regions of rat and dog anionic trypsin mRNAs (14 nucleotides in length). However, Fig. 4 shows a very high degree of structural identity in the group of anionic trypsinogens regarding their 5' noncoding segments. The presence of an adenosine and a cytosine at positions -3 and -4 upstream of the initiation codon, respectively, is in accordance with the sequences reported by Kozak for eukaryotic mRNA [27], where it has been suggested that they could enhance the efficiency of the initiation step during eukaryotic mRNA translation. Also of interest was the fact that, in contrast to the other anionic trypsinogen mRNAs, the first nucleotide after the initiator codon was not a purine in bovine anionic trypsinogen mRNA (Fig. 4).
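The statements about the nucleotides flanking the initiation codon can be checked directly against the 5' end of the sequence in Fig. 2 (leader UCUCCACC, then AUG CAU). The sketch below does so; the position convention (-1 is the nucleotide immediately 5' of AUG) and the names are assumptions made for this illustration.

```python
# 5' end of the mRNA as read from Fig. 2; names are illustrative.
FIVE_PRIME_LEADER = "UCUCCACC"   # 8-nucleotide 5' noncoding region
CODING_START = "AUGCAU"          # initiator codon followed by the His codon
PURINES = {"A", "G"}


def upstream(position: int) -> str:
    """Nucleotide at a negative position, with -1 directly 5' of the AUG."""
    return FIVE_PRIME_LEADER[position]


minus3 = upstream(-3)
minus4 = upstream(-4)
first_after_aug = CODING_START[3]

print(minus3, minus4)                                # A C
print(first_after_aug, first_after_aug in PURINES)   # C False
```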

A high degree of identity between the 5' noncoding segments of anionic enzymes has already been pointed out by Kern et al. [28] when comparing the rat and dog sequences. As stressed by these authors as well as by others [13], identical stretches of 6 nucleotides in length were found to be highly complementary to the 3' end of 18S rRNA. This is expected to favour the interaction between anionic trypsin mRNA and the 40S ribosomal subunit, leading to an increased translational efficiency of anionic forms as compared with cationic ones [28]. This could also explain the selective increase in anionic trypsin expression during caerulein stimulation [15] or after feeding a protein-free diet in the rat. In the latter case, the mRNAs specific for the three trypsinogen isozymes were increased [29], but only the overall synthesis of the anionic forms was enhanced [30, 31].

[Figure 4: aligned 5' noncoding sequences of the anionic trypsinogen mRNAs from bovine, rat I [25], rat II and dog [15].]

Fig. 4. Noncoding regions of four pancreatic anionic trypsinogen mRNAs. Sequences are aligned to allow comparison of identical sequences with the constraint that all of the initiation codons are superimposed. Initiation codons are underlined. Boxes highlight the two conserved nucleotide sequences observed in the 5' noncoding regions

The 3' region of bovine anionic trypsinogen mRNA consists of a UGA stop codon starting at position 750, followed by 53 nucleotides of untranslated message and a poly(A) tail of at least 50 nucleotides (Fig. 2). The hexanucleotide AAUAAA is also preserved close to the 3' end as in all polyadenylated eukaryotic mRNAs [32]. This polyadenylation recognition site is located 17 nucleotides upstream of the poly(A) tail of trypsinogen mRNA. It is worth stressing here that differences in length of 3' noncoding regions were observed between anionic and cationic trypsin mRNAs in rats and dogs. In both species, the 3' noncoding region of the cationic trypsinogen mRNA was longer than its anionic counterpart: 105 instead of 61 nucleotides in the dog [15], and > 80 versus 54 nucleotides in the rat [13, 14, 25], without the poly(A) tail. The 3' noncoding fragments of human trypsins I and II [12] and bovine anionic trypsin are quite comparable in length: 53, 55 and 56 nucleotides, respectively.
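The segment lengths quoted for the mRNA can be reconciled with the overall length of 855 nucleotides. The bookkeeping below uses only numbers given in the text; reading the 56-nucleotide 3' noncoding figure as the 53 untranslated nucleotides plus the 3-nucleotide stop codon is an inference, not something the text states.

```python
# Segment lengths in nucleotides, as quoted in the text (names illustrative).
five_prime_noncoding = 8
coding_codons = 247            # pretrypsinogen residues
stop_codon = 3                 # UGA
three_prime_after_stop = 53
poly_a_tail = 50

coding_length = coding_codons * 3
# First nucleotide of the stop codon, counting from position 1 of the mRNA.
stop_start = five_prime_noncoding + coding_length + 1
# Inferred reading of the "56 nucleotides" of 3' noncoding region.
three_prime_incl_stop = stop_codon + three_prime_after_stop
total_length = (five_prime_noncoding + coding_length + stop_codon
                + three_prime_after_stop + poly_a_tail)

print(coding_length)           # 741
print(stop_start)              # 750
print(three_prime_incl_stop)   # 56
print(total_length)            # 855
```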

Signal peptide

[Figure 5: alignment of the five pretrypsinogen sequences BAniTi, RAnITi [25], RAnIITi, BCatTi [33, 34] and RCatTi [14], numbered from -15 to 232.]

Fig. 5. Amino acid sequences of five pretrypsinogens. Numbering starts at the first amino acid residue of bovine anionic trypsinogen. Only those residues that differ from the bovine anionic trypsinogen sequence are indicated. (-) Non-identified amino acid; (. . . .) any amino acid at this position (gap). Underlined residues are different in anionic and cationic pretrypsinogens, but identical within each group. Vertical arrows indicate the predicted signal peptide (-15 to -1) and activation peptide (+1 to +8). BAniTi, bovine anionic pretrypsinogen; RAnITi, rat anionic I pretrypsinogen; RAnIITi, rat anionic II pretrypsinogen; BCatTi, bovine cationic trypsinogen; RCatTi, rat cationic pretrypsinogen

[Figure 6: aligned activation peptide sequences, in the order: *Rat (Cat.); Cow (Ani.); *Rat II (Ani.), *Mouse, Sheep, Roe-deer, Goat; *Dog (Ani.); Pig (Cat.), Pig (Ani.), Wild boar, Sea elephant; *Rat I (Ani.); *Rat (Neu.); *Dog (Cat.); *Human I (Neu.) and II (Ani.); Dromedary; Horse; Dogfish; Lungfish; Cow (Cat.), Turkey, Sheep, Roe-deer, Goat; Cow (Ani.).]

Fig. 6. Trypsinogen activation peptides from various species. Amino acid residues identical with those of bovine trypsinogen activation peptide are underlined. (*) Sequences deduced from cDNA analysis. Ani., anionic; Cat., cationic; Neu., neutral

Based upon the existence of a typical activation peptide in trypsinogen, i.e. an octapeptide and/or a hexapeptide both containing four aspartyl residues, as well as on the enzyme specificity responsible for the cleavage of prepeptides during the synthesis of other serine protease zymogens, the prepeptide cleavage site in bovine anionic pretrypsinogen was unambiguously located between an alanyl residue (amino acid -1) and the N-terminal phenylalanine of the zymogen (amino acid +1; Fig. 2). Thus, the signal peptide of bovine anionic pretrypsinogen is similar in length to that of rat pretrypsinogens, i.e. 15 amino acid residues (Fig. 5), and to those of dog [15, 35] and human [12] pretrypsinogens. The same amino acids extending from residues -1 to -5 were found in the C-terminal region of rat and bovine trypsinogen prepeptides. In contrast, the amino acid residues at positions -6, -10 and -11 are identical in anionic pretrypsinogens exclusively. As observed in several cases [36], the signal peptide of bovine anionic trypsinogen contains a basic amino acid (His) immediately after the first methionine, and a glycine five residues upstream of the cleavage site. The structural identity between the anionic trypsinogen prepeptides is 80%, while this value dropped to 60% when the prepeptide of rat cationic trypsinogen was compared to those of the anionic trypsinogens. As in most of the 40 transport peptide sequences described in the literature, the prepeptide of bovine anionic trypsinogen has, in addition to the charged amino acid residue (His14) following the first methionyl residue (Met15), two short clusters of four and two hydrophobic residues, (a) Leu12, Leu11, Ile10 and Leu9, and (b) Phe7 and Val6. It has been suggested that this hydrophobic region might represent the actual signal for binding of the nascent presecretory protein to the rough endoplasmic reticulum membrane [37]. Finally, in both anionic and cationic forms of bovine and rat trypsinogens, cleavage of the transport peptide sequence occurs after an alanine residue which is, with glycine, the most frequent C-terminal residue in eukaryotic signal peptides [36].
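For a 15-residue prepeptide, identities of 80% and 60% correspond to 12 and 9 matching positions. The sketch below shows one way such a percentage can be computed from two aligned sequences. The bovine prepeptide is taken from Fig. 2, whereas the two comparison sequences are invented purely to reproduce those counts and are not the rat sequences. Skipping gap positions is also an assumption, since the text only mentions optimal alignment.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identical positions between two aligned, equal-length sequences.

    Positions where either sequence has a gap ('-') are ignored; this gap
    treatment is an assumption of this sketch.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not compared:
        raise ValueError("no comparable positions")
    matches = sum(a == b for a, b in compared)
    return 100.0 * matches / len(compared)


# Bovine anionic prepeptide, residues -15 to -1 (Fig. 2).
bovine_prepeptide = "MHPLLILAFVGAAVA"
# Hypothetical sequences, invented for illustration only.
hypothetical_3_changes = "MQGLLILAWVGAAVA"   # 3 substitutions
hypothetical_6_changes = "MQGFKIFAWVGAAVA"   # 6 substitutions

print(percent_identity(bovine_prepeptide, hypothetical_3_changes))  # 80.0
print(percent_identity(bovine_prepeptide, hypothetical_6_changes))  # 60.0
```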

Activation peptide

Table 1. Amino acid sequence identity between bovine anionic trypsin and other trypsins, and some bovine serine proteases following optimal alignment of protein primary structures

Species   Enzyme                                         Charge            Identity (%)   Reference
Bovine    trypsin                                        cationic (a,d)    65.0           33, 34
Rat       trypsin                                        anionic I (c)     73.2           13
                                                         anionic II (c)    85.1           13
                                                         cationic (c)      69.4           14
Dog       trypsin                                        anionic (c)       78.2           15
                                                         cationic (c)      64.7           15
Human     trypsin                                        neutral I (c)     70.2           12
                                                         anionic II (c)    72.9           12
Mouse     trypsin                                        anionic (c)       74.9           38
Pig       trypsin                                        cationic (a,d)    72.7           9
Bovine    chymotrypsin A (b,d)                                             24.1           51
          chymotrypsin B (b,d)                                             20.7           52
          subunit III of the procarboxypeptidase A
          ternary complex (b,d)                                            15.1           53

(a) Zymogen sequence without signal peptide.
(b) Enzyme sequence without signal and activation peptides.
(c) Sequence deduced from cDNA analysis.
(d) Sequence determined by protein sequencing.

On account of the high degree of structural identity existing in trypsinogen activation peptides from various species (Fig. 6), that of bovine anionic trypsinogen included amino acids 1-8 (Figs 2 and 5), confirming previously reported results [6]. The comparison of the 27 trypsinogen activation peptides listed in Fig. 6 illustrates the very low variability in the amino acid sequence recognized by intestinal enteropeptidase and resulting in the specific activation of trypsinogen under physiological conditions. An important feature is the presence of four contiguous amino acid residues with the same carboxylic acid side chain (4 aspartyl) in most cases (89%), or with a single variation (1 glutamyl plus 3 aspartyl) exceptionally (3.7%), followed by lysine where cleavage of the activation peptide occurs. In this respect, the activation peptides of rat neutral trypsinogen (Asn-Asp-Asp-Asp) and lungfish trypsinogen (Ile-Glu-Glu-Asp) are quite unusual, while that of rat cationic trypsinogen is unique in having five instead of four contiguous aspartyl residues. In the tripeptide preceding the polyanionic sequence, the first and second amino acids are usually a phenylalanine (63%) and a proline (86%), respectively. However, much more variability is observed in this part of the activation peptide since in the dogfish and lungfish there are only two residues, while in ruminant species and in the turkey there is a single residue which is a valine or a serine. It is worth stressing here that in most if not all the ruminant species the activation segments of trypsinogens consist of an octapeptide together with a hexapeptide. However, the hexapeptide likely originates from the octapeptide by proteolysis rather than from another anionic trypsinogen encoded by a specific mRNA.
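The percentages quoted for the 27 activation peptides of Fig. 6 are consistent with whole-number counts. The counts used below are inferred from the percentages and from the exceptions named in the text (rat neutral and lungfish); they are not stated explicitly by the authors, so the sketch is only a plausibility check.

```python
TOTAL_PEPTIDES = 27  # activation peptides listed in Fig. 6

# Counts inferred from the quoted percentages (assumptions of this sketch).
inferred_counts = {
    "four or more contiguous Asp": 24,
    "one Glu plus three Asp": 1,
    "other motifs (rat neutral, lungfish)": 2,
}
phe_at_first_position = 17  # inferred from the quoted 63%


def percentage(count: int, total: int = TOTAL_PEPTIDES) -> float:
    """Express a count as a percentage of the peptides compared."""
    return 100.0 * count / total


for label, count in inferred_counts.items():
    print(f"{label}: {count}/{TOTAL_PEPTIDES} = {percentage(count):.1f}%")
print(f"Phe at first position: {percentage(phe_at_first_position):.1f}%")
```

With these counts the three motif classes add up to 27, and the rounded values reproduce the 89%, 3.7% and 63% given in the text.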

Trypsinogen and trypsin

Comparison of the amino acid sequence of bovine anionic trypsinogen with those of other pancreatic trypsinogens (Table 1) indicates that the highest structural identity was observed with anionic trypsinogens from other species. This is especially true with respect to rat anionic trypsinogen II (85.1%), while bovine anionic trypsinogen is only 65% identical to bovine cationic trypsinogen and < 25% identical to the other bovine pancreatic proteases. This suggests that the different members of the serine protease family probably diverged prior to the evolutionary events responsible for the divergence of anionic and cationic trypsinogens, and well before the divergence of the mammalian species.
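As a minimal sketch of how an identity percentage of the kind quoted above can be computed from two sequences that have already been aligned, the following function counts identical residues over the aligned columns. The function name, the gap symbol and the choice of denominator (columns in which neither sequence has a gap) are assumptions; the alignment procedure and the exact convention behind the values in Table 1 are not specified here. The input strings are toy sequences, not trypsinogen data.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue                  # skip gapped columns (assumed convention)
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * identical / compared


# Toy example: 6 identical positions out of 8 compared
print(percent_identity("IVGGYTCG", "IVGGYNCE"))  # 75.0
```

Other conventions, for instance dividing by the length of the shorter sequence or by the full alignment length, give slightly different percentages, so the denominator should be stated whenever such values are compared.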

The deduced molecular mass for bovine anionic trypsinogen is 24743 Da, a value which is close to that previously determined experimentally [6]. However, the number of residues derived from the amino acid analysis of purified anionic trypsinogen was overestimated. Unlike rat anionic trypsinogens I and II, bovine cationic trypsinogen and rat cationic trypsinogen, which contain 231 amino acids, the anionic trypsinogens from human (isoenzyme II), bovine and dog, as well as the neutral trypsinogens from human (isoenzyme I) and rat, have an additional C-terminal amino acid (a serine in most cases and an asparagine in the last case), so that their polypeptide chain consists of 232 residues.
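A molecular mass can be deduced from an amino acid sequence by summing average residue masses and adding one water molecule for the free termini. The sketch below illustrates that calculation; the function name and the optional disulfide correction are assumptions of this example, and the procedure actually used to obtain the 24743 Da value is not described here. The full trypsinogen sequence is not reproduced in this section, so the example uses a short illustrative peptide.

```python
# Average residue masses in Da (amino acid minus one water), standard values
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528          # added once for the N- and C-terminal groups
TWO_HYDROGENS = 2.01588   # lost on formation of each disulfide bridge


def average_mass(sequence: str, n_disulfides: int = 0) -> float:
    """Average molecular mass (Da) of a linear polypeptide in one-letter code."""
    total = WATER
    for residue in sequence.upper():
        if residue not in AVERAGE_RESIDUE_MASS:
            raise ValueError(f"unknown residue: {residue!r}")
        total += AVERAGE_RESIDUE_MASS[residue]
    return total - n_disulfides * TWO_HYDROGENS


# Illustrative hexapeptide (valine, four Asp, Lys)
print(round(average_mass("VDDDDK"), 2))
```

For a protein with six disulfide bridges the correction amounts to about 12 Da, which is small relative to a mass near 25 kDa.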

Twelve cysteine residues are found in the same positions as in bovine, dog, pig and rat trypsinogens, whereas human trypsinogens I and II contain only ten and eight cysteine residues, respectively. Although the pairing of half-cystine residues in bovine anionic trypsinogen was not determined, it seems likely that the six disulfide bridges linking cysteine residues 15-145, 33-49, 117-218, 124-191, 156-170 and 181-205 in bovine cationic trypsin [54] are also conserved in the former protein, as they are in all the animal species so far examined except human.

The net charge of trypsin is not thought to be of prime importance from a functional standpoint, since the charge of residues 40, 155 and 157 differs between the anionic and cationic trypsinogens, whereas that of the amino acid cluster 208-210 is the same in bovine anionic trypsinogen and rat cationic trypsinogen but differs by +2 from that of rat anionic trypsins (Fig. 5). However, the net charge difference between anionic and cationic trypsins, resulting from the above-mentioned amino acid residues as well as from others in the 83-87 region, which is quite variable in trypsins, may induce some preferential hydrolysis of protein substrates through secondary binding site interactions [14].

The binding of calcium ions by trypsinogens is known to promote autocatalytic activation of the zymogen and to result in cleavage of the critical peptide bond between the activation peptide and the resulting enzyme. The five amino acid residues involved in calcium ion binding, namely Glu60, Asn62, Val65, Glu67 and Glu70 [55], are conserved in bovine anionic trypsinogen (Fig. 2). Moreover, since the catalytic activity of trypsins is due to the ability of His48 to transfer a proton from Asp92 to Ser185, the amino acid sequences surrounding these three key residues are in general highly conserved. However, it is worth stressing that bovine anionic trypsin shows more amino acid variations around His48 and Asp92 than around Ser185. Finally, Asp179, Ser180, Gly204 and the surrounding sequences, which are of prime importance for the substrate specificity of trypsins [56], are also preserved in the anionic enzyme.
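The charge comparisons discussed above for individual residues and for short clusters can be approximated by counting basic and acidic side chains. The sketch below is a simplification under stated assumptions: Lys and Arg count as +1, Asp and Glu as -1, and histidine, the chain termini and any pKa shifts are ignored. The function name and the three-residue segments are hypothetical and are not taken from Fig. 5.

```python
POSITIVE = set("KR")   # side chains counted as +1 near neutral pH (assumption)
NEGATIVE = set("DE")   # side chains counted as -1 near neutral pH (assumption)


def approximate_net_charge(segment: str) -> int:
    """Approximate net side-chain charge of a segment in one-letter code."""
    return sum((r in POSITIVE) - (r in NEGATIVE) for r in segment.upper())


# Hypothetical three-residue clusters differing by two charge units
cluster_a = "SGK"
cluster_b = "SGD"
difference = approximate_net_charge(cluster_a) - approximate_net_charge(cluster_b)
print(difference)  # 2
```

Applied to complete sequences, the same count gives only a rough indication of whether a trypsinogen is likely to behave as an anionic or a cationic species; a proper estimate would use residue pKa values at a defined pH.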

In conclusion, the anionic form of bovine trypsinogen is more closely related to the anionic trypsinogens from other species, including human, rat, dog and mouse, than to bovine cationic trypsinogen. Thus, whatever the species, anionic trypsinogens should be considered as a distinct group of pancreatic enzymes within the trypsin family.

This work was partly supported by a research grant from the Conseil Régional de Bretagne.

REFERENCES

1. Desnuelle, P. (1986) in Molecular and cellular basis of digestion (Desnuelle, P., Sjöström, H. & Norén, O., eds) pp. 195-211, Elsevier, Amsterdam.
2. Travis, J. & Roberts, R. C. (1969) Biochemistry 8, 2884-2889.
3. Mallory, P. A. & Travis, J. (1973) Biochemistry 12, 2847-2851.
4. Puigserver, A. & Desnuelle, P. (1972) Biochim. Biophys. Acta 236, 499-502.
5. Ohlsson, K. & Tegner, H. (1973) Biochim. Biophys. Acta 317, 328-337.
6. Louvard, M. N. & Puigserver, A. (1974) Biochim. Biophys. Acta 371, 177-185.
7. Brodrick, J. W., Largman, C., Geokas, M. C., O'Rourke, M. & Ray, S. B. (1980) Am. J. Physiol. 239, G512-G525.
8. Charles, M., Rovery, M., Guidoni, A. & Desnuelle, P. (1963) Biochim. Biophys. Acta 69, 115-129.
9. Hermodson, M. A., Ericsson, L. H., Neurath, H. & Walsh, K. A. (1973) Biochemistry 12, 3146-3153.
10. Walsh, K. A. & Neurath, H. (1964) Proc. Natl Acad. Sci. USA 52, 884-889.
11. Mikes, O., Holeysovsky, V., Tomasek, V. & Sorm, F. (1966) Biochem. Biophys. Res. Commun. 24, 346-352.
12. Emi, M., Nakamura, Y., Ogawa, M., Yamamoto, T., Nishide, T., Mori, T. & Matsubara, K. (1986) Gene 41, 305-310.
13. MacDonald, R. J., Stary, S. J. & Swift, G. H. (1982) J. Biol. Chem. 257, 9724-9732.
14. Fletcher, T. S., Alhadeff, M., Craik, C. S. & Largman, C. (1987) Biochemistry 26, 3081-3086.
15. Pinsky, S. D., LaForge, K. S. & Scheele, G. (1985) Mol. Cell. Biol. 5, 2669-2676.
16. Chirgwin, J. M., Przybyla, A. E., MacDonald, R. J. & Rutter, W. J. (1979) Biochemistry 18, 5294-5299.
17. Le Huerou, I., Wicker, C., Guilloteau, P., Toullec, R. & Puigserver, A. (1990) Biochim. Biophys. Acta 1048, 257-264.
18. Aviv, H. & Leder, P. (1972) Proc. Natl Acad. Sci. USA 69, 1408-1412.
19. Maniatis, T., Fritsch, E. F. & Sambrook, J. (1982) in Molecular cloning: a laboratory manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY.
20. Rigby, P. W. J., Dieckman, M., Rhodes, C. & Berg, P. (1977) J. Mol. Biol. 113, 237-251.
21. Mueckler, M. M., Merril, M. J. & Pitot, H. C. (1983) J. Biol. Chem. 258, 6109-6124.
22. Sanger, F., Nicklen, S. & Coulson, A. R. (1977) Proc. Natl Acad. Sci. USA 74, 5463-5467.
23. Lo, K. M., Jones, S. S., Hackett, N. R. & Khorana, H. G. (1984) Proc. Natl Acad. Sci. USA 81, 2285-2289.
24. Clauser, E., Gardell, S. J., Craik, C. S., MacDonald, R. J. & Rutter, W. J. (1988) J. Biol. Chem. 263, 17837-17845.
25. Craik, C. S., Choo, Q.-L., Swift, G. H., Quinto, C., MacDonald, R. J. & Rutter, W. J. (1984) J. Biol. Chem. 259, 14255-14264.
26. Breathnach, R. & Chambon, P. (1981) Annu. Rev. Biochem. 50, 349-383.
27. Kozak, M. (1981) Nucleic Acids Res. 9, 5233-5252.
28. Kern, H. F., Rausch, U. & Scheele, G. (1987) Gut 28, 89-94.
29. Dakka, N., Puigserver, A. & Wicker, C. (1990) Biochem. J. 268, 471-474.
30. Schick, J., Verspohl, R., Kern, H. & Scheele, G. (1984) Am. J. Physiol. 247, G611-G616.
31. Dakka, N., Wicker, C. & Puigserver, A. (1988) Eur. J. Biochem. 176, 231-236.
32. Proudfoot, N. J. & Brownlee, G. G. (1976) Nature 263, 211-214.
33. Walsh, K. A. (1970) Methods Enzymol. 19, 41-63.
34. Hartley, B. S. (1970) Philos. Trans. R. Soc. London B257, 77-87.
35. Devillers-Thiery, A., Kindt, T., Scheele, G. & Blobel, G. (1975) Proc. Natl Acad. Sci. USA 72, 5016-5020.
36. Watson, M. E. E. (1984) Nucleic Acids Res. 13, 5145-5164.
37. Walter, P. & Blobel, G. (1981) J. Cell. Biol. 91, 557-561.
38. Stevenson, B. J., Hagenbuechle, O. & Wellauer, P. K. (1986) Nucleic Acids Res. 14, 8307-8330.
39. Bricteux-Gregoire, S., Schyns, R. & Florkin, M. (1966) Biochim. Biophys. Acta 127, 277-279.
40. Bricteux-Gregoire, S., Schyns, R. & Florkin, M. (1971) in Biochemical evolution and the origin of the life (Schoffeniels, E., ed.) pp. 130-149, North Holland Publishing Co., Amsterdam.
41. Bricteux-Gregoire, S., Schyns, R. & Florkin, M. (1971) Biochim. Biophys. Acta 229, 123-135.
42. Bricteux-Gregoire, S., Schyns, R. & Florkin, M. (1969) Arch. Int. Physiol. Biochim. 77, 544-545.
43. Bricteux-Gregoire, S., Schyns, R. & Florkin, M. (1974) Biochim. Biophys. Acta 351, 87-91.
44. Lütcke, H., Rausch, U., Vasiloudes, P., Scheele, G. A. & Kern, H. (1989) Nucleic Acids Res. 17, 6736.
45. Bricteux-Gregoire, S., Schyns, R. & Florkin, M. (1971) Biochim. Biophys. Acta 251, 19-82.
46. Harris, C. I. & Hofmann, T. (1969) Biochem. J. 114, 82P.
47. Titani, K., Ericsson, L. H., Neurath, H. & Walsh, K. A. (1975) Biochemistry 14, 1358-1366.
48. Hermodson, M. A., Tye, R. W., Reeck, G. R., Neurath, H. & Walsh, K. A. (1971) FEBS Lett. 14, 222-224.
49. Davie, E. W. & Neurath, H. (1955) J. Biol. Chem. 212, 515-529.
50. Kishida, T. & Liener, I. E. (1968) Arch. Biochem. Biophys. 126, 111-120.
51. Hartley, B. S. & Kauffman, D. L. (1966) Biochem. J. 101, 229-231.
52. Smillie, L. B., Furka, A., Nagabhushan, N., Stevenson, K. J. & Parkes, C. O. (1968) Nature 218, 343-346.
53. Venot, N., Sciaky, M., Puigserver, A., Desnuelle, P. & Laurent, G. (1986) Eur. J. Biochem. 157, 91-99.
54. Kauffman, D. L. (1965) J. Mol. Biol. 12, 929-932.
55. Bode, W. & Schwager, P. (1975) FEBS Lett. 56, 139-143.
56. Huber, R. & Bode, W. (1978) Acc. Chem. Res. 11, 114-122.