
The Mitochondrial Uncoupling Protein Gene


THE JOURNAL OF BIOLOGICAL CHEMISTRY © 1988 by The American Society for Biochemistry and Molecular Biology, Inc.

Vol. 263, No. 25, Issue of September 5, pp. 12274-12277, 1988. Printed in U.S.A.

The Mitochondrial Uncoupling Protein Gene CORRELATION OF EXON STRUCTURE TO TRANSMEMBRANE DOMAINS*

(Received for publication, February 23, 1988)

Leslie P. Kozak‡, James H. Britton, Ulrike C. Kozak, and James M. Wells

From The Jackson Laboratory, Bar Harbor, Maine 04609

The mitochondrial uncoupling protein, a protein essential for the thermogenic properties of brown fat in mammals, is inserted in the inner mitochondrial membrane by means of six α-helical hydrophobic transmembrane domains. We have sequenced a complete cDNA and parts of the gene to determine that the mitochondrial uncoupling protein gene is composed of six exons, each of which encodes a transmembrane domain. We also show that transcription of the uncoupling protein gene is from a single start site; however, the use of alternative poly(A) addition signal sequences results in two mRNAs, the major species of 1221 nucleotides, not including the poly(A) tail, and a minor species of about 1600 nucleotides. The 5′-untranslated region of the mRNA is composed of 231 nucleotides, and the 3′-untranslated region contains 81 nucleotides prior to addition of the poly(A) tail.

Three ion carrier proteins of the inner mitochondrial membrane, the ADP/ATP translocator, the uncoupling protein (Ucp)¹ of brown fat, and the phosphate carrier protein, are characterized by the presence of six α-helical hydrophobic transmembrane domains by which they are inserted into the mitochondrial membrane (1-4). Similarities in amino acid sequence among the proteins suggest that they evolved from a common ancestral gene (5). All three proteins have the same basic repetitive structure: a tripartite structure consisting of three homologous repeating 100-amino acid segments, with each segment containing two transmembrane domains (2). We now report that the structure of the nuclear gene for the mitochondrial Ucp reflects the domain structure. Each of the six exons in the gene encodes one of the transmembrane domains. We also show that transcription of the uncoupling protein gene is from a single start site and that the use of alternative poly(A) addition signal sequences results in two mRNAs, the major species of 1221 nucleotides, not including the poly(A) tail, and a minor species of about 1600 nucleotides.

MATERIALS AND METHODS³

* This work was supported by National Institutes of Health Grant HD08431 (to L. P. K.). The Jackson Laboratory is fully accredited by the American Association for Accreditation of Laboratory Animal Care. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

‡ To whom correspondence should be sent.

¹ The abbreviations used are: Ucp, mitochondrial uncoupling protein; kb, kilobase pairs; bp, base pairs.

³ Portions of this paper (including “Materials and Methods,” Footnote 2, and Fig. 1) are presented in miniprint at the end of this paper. Miniprint is easily read with the aid of a standard magnifying glass. Full size photocopies are included in the microfilm edition of the Journal that is available from Waverly Press.

RESULTS AND DISCUSSION

We previously isolated a cDNA clone for the mitochondrial uncoupling protein (UCP) from mouse brown fat (6). By Northern blot analysis, this cDNA hybridized to two mRNAs, a major species of approximately 1300 nucleotides and a minor species of 1700 nucleotides. Both mRNAs were highly induced when mice were exposed to the cold at 5 °C, suggesting they were derived from the same transcriptional unit. In order to determine the molecular mechanisms controlling Ucp gene expression in brown fat, we have been investigating the structure of both the Ucp mRNA and gene. Additional screening of our brown fat cDNA library has provided a cDNA sequence which is complementary to the full length of the major Ucp mRNA species (Fig. 1A). The Ucp gene was isolated from a Charon 28 BALB/c embryo DNA library as two overlapping clones. We have sequenced the cDNA clones (Fig. 1A). We have also used synthetic primers to sequence the Ucp mRNA directly to define the 5′ end of the mRNA, and we have sequenced parts of the 5′ and 3′ flanking regions and the exons of the Ucp gene (Fig. 1B).

The start site for transcription was determined by comparing the RNA sequence to the genomic sequence as illustrated in Fig. 2A. Reverse transcription of the RNA terminates with a G. Sequencing of an M13 clone of the 5′ flanking region of the gene establishes this G as the start site for transcription. The site is located 231 bp upstream of the ATG translation initiation codon and 26 bp downstream of the sequence TATATA. This latter sequence has the properties of the TATA box promoter region with respect to both sequence and position (14). The sequence GCCGGG at the 5′ end of the mRNA is also located in the cDNA clone, p-Ucp 2.
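The positional relationships in this paragraph can be checked mechanically once a sequence is available. The Python sketch below uses a made-up toy sequence, not the Ucp sequence, to show how the distance from a TATATA motif to a start site and from the start site to the first downstream ATG might be measured. The function name `promoter_distances`, the toy sequence, and the choice to measure from the first base of the motif are all assumptions made for illustration.

```python
def promoter_distances(genomic: str, start_index: int, tata: str = "TATATA"):
    """Return (bp from the first base of the nearest upstream TATA motif to the
    start site, bp from the start site to the first downstream ATG)."""
    tata_pos = genomic.rfind(tata, 0, start_index)   # nearest motif upstream
    atg_pos = genomic.find("ATG", start_index)       # first ATG at or after the start
    if tata_pos == -1 or atg_pos == -1:
        raise ValueError("motif not found")
    return start_index - tata_pos, atg_pos - start_index


# Hypothetical toy sequence, constructed so that the distances mimic the
# reported spacing (26 bp and 231 bp); it is NOT the Ucp gene sequence.
toy = "CC" + "TATATA" + "C" * 20 + "G" + "C" * 230 + "ATG" + "GTG"
start_site = toy.index("G")  # the single G placed before the ATG region

print(promoter_distances(toy, start_site))  # (26, 231)
```

Whether the 26 bp in the text is counted from the first or the last base of the motif is not stated, so the convention used here is only one possible reading.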

In addition, the 5′ end of the cDNA, p-Ucp 2, carries 179 nucleotides which do not match the Ucp mRNA sequence or any sequence in the genomic clones (Fig. 1A). Since this sequence hybridizes to an RNA in brown fat of approximately 4400 nucleotides and maps to a site on chromosome 7 near the apolipoprotein E gene (data not shown), whereas Ucp maps to chromosome 8, we conclude that the 179 nucleotides at the 5′ end of the cDNA were attached to the Ucp cDNA during construction of the cDNA library.

Sequencing of the cDNA defined the 3′ end of the major mRNA species (Fig. 1, Miniprint section). An untranslated sequence of 78 bases follows the TAA stop codon before addition of the poly(A) tail. A poly(A) addition signal sequence, AAUAAA, is located 17 bases upstream of the poly(A) addition site. Accordingly, the major Ucp mRNA has 231 bases of 5′-untranslated sequence, 918 bases of coding sequence, and 81 bases of 3′-untranslated sequence. Sequencing of the 3′ flanking region of the Ucp gene suggested an explanation for the minor mRNA species of 1700 bases. Two additional AAUAAA signal sequences are present: one is a single base downstream of the poly(A) addition site of the


[Fig. 1 graphic. A, restriction map of the cDNA clones p-Ucp 1 and p-Ucp 2, marking the 5′ noncoding, coding, and 3′ noncoding regions, with a 100-bp scale bar. B, map of the Ucp gene showing introns 1 to 5; the approximate intron sizes printed on the map are ~600, ~1800, ~500, and ~2600 bp.]

FIG. 1. A, restriction enzyme map of overlapping Ucp cDNA clones. The dashed line at the right-hand end of p-Ucp 2 represents the cloning artifact described in the text. The encircled P represents restored PstI sites at the insert/vector boundary. B, the sequencing strategy for the exons of the Ucp gene is shown. The size and position of the introns determined by partial sequencing and restriction enzyme mapping of the genomic clones are given. Sequencing reactions marked by a bar utilized synthetic deoxyoligonucleotide primers. The restriction enzyme sites are A, AccI; Av, AvaI; B, BamHI; Bg, BglII; Bs, BstNI; E, EcoRI; H, HindIII; Hc, HincII; M, MboI; N, NarI; Ns, NsiI; P, PstI; Pv, PvuII; S, SfaNI; and X, XbaI.

major species, and the other is 399 bases downstream of the first AAUAAA signal sequence. To evaluate the possibility that the latter signal sequence was used to produce the larger mRNA, a Northern blot containing brown fat RNA was hybridized (Fig. 2B) with an upstream probe containing sequences derived from the region defining the 3′ end of the cDNA (probe A) and a downstream probe which extends just beyond the third signal sequence (probe B). The upstream probe A hybridized to both RNAs, while the downstream probe B hybridized only to the minor, high molecular weight Ucp RNA. These results strongly indicate that the high molecular weight RNA is derived by utilization of the downstream AAUAAA site. In summary, the Ucp gene appears to have one major start site for transcription, and the sixth exon can have a short or long form depending on which poly(A) addition signal sequence is utilized. It is also possible that the second poly(A) addition signal sequence is used to yield an mRNA which could not be resolved from the major mRNA form by Northern blot analysis.
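The logic of the two-probe experiment can be sketched as an interval-overlap check. In the Python example below, the coordinates are hypothetical toy values (position 0 is taken as the first base of exon 6, and the probe boundaries are invented); only the 399-base spacing and the 1221-nucleotide length come from the text. The sketch also assumes that cleavage occurs at a similar distance downstream of each signal, so that the long form ends about 399 bases beyond the short form.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    start: int  # inclusive, toy coordinate along exon 6
    end: int    # exclusive

    def overlaps(self, other: "Interval") -> bool:
        return self.start < other.end and other.start < self.end


SHORT_END = 300             # hypothetical poly(A) site of the major mRNA
LONG_END = SHORT_END + 399  # assumes equal signal-to-site spacing at both sites

transcripts = {
    "major": Interval(0, SHORT_END),  # short form of exon 6
    "minor": Interval(0, LONG_END),   # long form of exon 6
}
probes = {
    "A": Interval(SHORT_END - 150, SHORT_END),     # upstream of the first poly(A) site
    "B": Interval(SHORT_END + 50, LONG_END + 20),  # reaches just past the third signal
}


def hybridization_table(transcripts, probes):
    """For each probe, list the transcripts sharing at least one base with it."""
    return {
        name: sorted(t for t, iv in transcripts.items() if iv.overlaps(piv))
        for name, piv in probes.items()
    }


table = hybridization_table(transcripts, probes)
estimated_minor_length = 1221 + 399  # major length plus the extra 3' sequence

print(table)                   # {'A': ['major', 'minor'], 'B': ['minor']}
print(estimated_minor_length)  # 1620
```

Under these assumptions the pattern matches the blot (probe A detects both RNAs, probe B only the larger one), and the estimate of 1620 nucleotides is consistent with the "about 1600" quoted for the minor species.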

The protein sequence for mouse UCP has been obtained by reverse translation of the cDNA sequence. This sequence shows strong homology to the hamster and rat proteins (2, 3, 15). All three species have 306 amino acids. The mouse protein differs from the rat protein at 8 positions and from the hamster protein at 24 positions, with 20 of the differences from the hamster being conservative. It has previously been noted that rat and hamster differ from each other at 26 positions (15). At the nucleotide level, the coding regions of the mouse and rat cDNAs share 93% sequence homology. The 5′-untranslated region shows less homology, but it cannot be fully evaluated until the complete sequence for the rat is known. The 5′-untranslated region of the mouse mRNA consists of 231 nucleotides, while only 177 nucleotides were reported for the largest rat cDNA clone (3).
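The difference counts given above translate directly into percent identity at the protein level. The short Python calculation below is a sketch that assumes the three 306-residue proteins align position for position without gaps, which the text implies but does not state outright; the function name is illustrative.

```python
PROTEIN_LENGTH = 306  # amino acids, the same in mouse, rat, and hamster

# Numbers of differing positions as reported in the text.
differences = {
    ("mouse", "rat"): 8,
    ("mouse", "hamster"): 24,
    ("rat", "hamster"): 26,
}


def percent_identity(n_diff: int, length: int = PROTEIN_LENGTH) -> float:
    """Percent of aligned positions that are identical (ungapped alignment)."""
    return 100.0 * (length - n_diff) / length


identities = {pair: round(percent_identity(n), 1) for pair, n in differences.items()}
for pair, value in identities.items():
    print(pair, value)
# ('mouse', 'rat') 97.4
# ('mouse', 'hamster') 92.2
# ('rat', 'hamster') 91.5
```

On this reckoning the mouse protein is closer to the rat protein (about 97% identical) than to the hamster protein (about 92%).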

The sequencing strategy for the exons in the Ucp gene is described in Fig. 1B. The sequence of these exons and adjoining intron regions shows that the Ucp gene is composed of six exons (Fig. 1, Miniprint section). Also shown in Fig. 1 (Miniprint section) are differences in sequence found between the cDNA sequence of p-Ucp 2 and the genomic sequence. These differences are located at position 145 in the 5′-untranslated region and at codons 247 and 299. Since these substitutions do not change the amino acid sequence, we think the differences are genuine and arise from differences in the genotype of the cDNA (C57BL/6J) and the genomic DNA (BALB/c).
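A substitution within a codon leaves the protein unchanged when both codons specify the same amino acid. The Python sketch below builds the standard genetic code and tests codon pairs for this property. The example codons are hypothetical: the actual substitutions at codons 247 and 299 are not given in the main text, and the helper names `translate` and `is_silent` are illustrative.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence in frame 1, stopping at the first stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def is_silent(codon_a: str, codon_b: str) -> bool:
    """True when two codons encode the same amino acid (a synonymous change)."""
    return CODON_TABLE[codon_a.upper()] == CODON_TABLE[codon_b.upper()]


# Hypothetical examples, not the substitutions observed in the Ucp gene.
print(is_silent("GGC", "GGT"))       # True: both encode glycine
print(is_silent("GAA", "GAC"))       # False: glutamate versus aspartate
print(translate("ATGGTGAACCCGTAA"))  # MVNP
```

A check of this kind distinguishes silent strain polymorphisms, such as those proposed here between C57BL/6J and BALB/c, from changes that would alter the protein.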

The location of the introns relative to the domain structure of the uncoupling protein is clearly evident (Fig. 3). Introns II and IV interrupt the coding sequence within codons 108 and 209 to divide the coding region into the three 100-amino acid repetitive segments observed by sequence similarities for the inner membrane carrier proteins (2, 4). In addition, introns I, III, and V further subdivide the coding region so that introns I and III interrupt transmembrane domains A and B and C and D, respectively, while intron V is located in codon 269, slightly within the region which encodes transmembrane domain F. Intron I interrupts the gene in a region, codon 42, which encodes a β strand possibly associated with a membrane pore (2). Remarkably, each of the transmembrane domains of the UCP is encoded by a separate exon.
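The division into three segments of roughly 100 amino acids follows from the intron positions by simple arithmetic. The Python sketch below uses the codon numbers given in the text; the position of intron III is not stated in the main text and is therefore left out. Counting each interrupted codon with the upstream segment is an assumption made here for simplicity.

```python
PROTEIN_LENGTH = 306  # codons in the Ucp coding region

# Codons interrupted by each intron, as given in the text.
# Intron III is omitted because its codon is not stated in the main text.
INTRON_CODONS = {"I": 42, "II": 108, "IV": 209, "V": 269}


def segment_lengths(length: int, boundaries: list[int]) -> list[int]:
    """Lengths of the segments obtained by cutting after each boundary codon."""
    edges = [0] + sorted(boundaries) + [length]
    return [edges[i + 1] - edges[i] for i in range(len(edges) - 1)]


# Introns II and IV separate the three repeated segments.
repeats = segment_lengths(PROTEIN_LENGTH, [INTRON_CODONS["II"], INTRON_CODONS["IV"]])
print(repeats)  # [108, 101, 97]
```

The three segments come out at 108, 101, and 97 residues, in line with the description of three homologous repeats of about 100 amino acids.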

A relation between introns and exons encoding transmembrane domains was first observed with bovine rhodopsin, which has seven transmembrane domains (16). Introns in the rhodopsin gene interrupt the coding regions at three positions which mark the boundaries of the hydrophobic regions. A more striking correlation was recently described for the 3-hydroxy-3-methylglutaryl coenzyme A reductase gene, where each of the seven transmembrane domains, which enable the protein to be inserted in the endoplasmic reticulum, is separated by an intron (17). The single transmembrane domain of the H-2 molecule is also defined as a separate exon (18). A striking contrast to a gene organization where each transmembrane domain is encoded by a separate exon is found in the


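[Fig. 2 graphic. A, sequencing gels comparing the genomic DNA and RNA sequences near the transcription start site, with the TATA box marked. B, map of exon 6 showing its short (6S) and long (6L) forms, the AATAAA signal sequences, probes A and B, and a 100-bp scale bar, beside a Northern blot with 23 S and 16 S size markers.]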

FIG. 2. A, comparison of the DNA sequence near the 5′ end of the Ucp gene to the Ucp RNA sequence. The DNA sequence of the genomic DNA was obtained by cloning a BglII/PstI fragment from the 5′ end of exon I into M13 and sequencing with dideoxynucleotide chain terminators (8). The RNA sequence of the Ucp RNA was obtained by a procedure described by Geliebter et al. (10) with a synthetic primer which was derived from a sequence overlapping the PstI site in exon I (Fig. 1B). B, Northern blot demonstrating two poly(A) addition signal sequences which result in two Ucp mRNAs, one derived from the short form of exon 6 and the other from the long form. Abbreviations for restriction enzymes are found in the legend to Fig. 1.

G-protein-coupled membrane receptor proteins. These proteins, which include the α2- and β-adrenergic receptors (19, 20), the M2 muscarinic receptor (21), and G-21 (22), have seven α-helical hydrophobic transmembrane domains, yet the

[Fig. 3 graphic: transmembrane domains A to F spanning the inner mitochondrial membrane between the cytosol and the matrix.]

FIG. 3. A schematic representation of the transmembrane domains in the inner mitochondrial membrane. The transmembrane domains are lettered A-F. The amino acid position at which the introns (arrows) interrupt the coding region is given. The dashed line represents the region of β-strand structure interrupted by intron I.

entire protein is encoded by a single exon. The only G-protein-coupled receptor which has an exon/intron structure is rhodopsin, and this is the first gene found to have a relationship between the transmembrane domains and intron/exon structure (16). It is unlikely that the presence or absence of introns in the coding region for transmembrane domains has any functional importance; however, insights into the evolutionary relationships among proteins with transmembrane domains may become evident when the structures of other genes encoding proteins with such domains are determined.

REFERENCES

1. Aquila, H., Misra, D., Eulitz, M., and Klingenberg, M. (1982) Hoppe-Seyler's Z. Physiol. Chem. 363, 345-349
2. Aquila, H., Link, T. A., and Klingenberg, M. (1985) EMBO J. 4, 2369-2376
3. Bouillaud, F., Weissenbach, J., and Ricquier, D. (1986) J. Biol. Chem. 261, 1487-1490
4. Runswick, M. J., Powell, S. J., Nyren, P., and Walker, J. E. (1987) EMBO J. 6, 1367-1373
5. Klingenberg, M. (1985) in Membrane Transport Driven by Ion Gradients (Semenza, G., and Kinne, R., eds) pp. 279-288, New York Academy of Science, New York
6. Jacobsson, A., Stadler, U., Glotzer, M. A., and Kozak, L. P. (1985) J. Biol. Chem. 260, 16250-16254
7. Benton, W. D., and Davis, R. W. (1977) Science 196, 180-182
8. Sanger, F., Nicklen, S., and Coulson, A. R. (1977) Proc. Natl. Acad. Sci. U. S. A. 74, 5463-5467
9. Haltiner, M., Kempe, T., and Tjian, R. (1985) Nucleic Acids Res. 13, 1015-1025
10. Geliebter, J., Zeff, R. A., Melvold, R. W., and Nathenson, S. G. (1986) Proc. Natl. Acad. Sci. U. S. A. 83, 3371-3375
11. Aviv, H., and Leder, P. (1972) Proc. Natl. Acad. Sci. U. S. A. 69, 1408-1412
12. Derman, E., Krauter, K., Walling, L., Weinberger, C., Roy, M., and Darnell, J. E., Jr. (1981) Cell 23, 731-739
13. Feinberg, A. P., and Vogelstein, B. (1983) Anal. Biochem. 132, 6-13
14. Corden, J., Wasylyk, B., Buchwalder, A., Sassone-Corsi, P., Kedinger, C., and Chambon, P. (1980) Science 209, 1406-1414
15. Ridley, R. G., Patel, H. V., Gerber, G. E., Morton, R. C., and Freeman, K. B. (1986) Nucleic Acids Res. 14, 4025-4035
16. Nathans, J., and Hogness, D. S. (1983) Cell 34, 807-814
17. Liscum, L., Finer-Moore, J., Stroud, R. M., Luskey, K. L., Brown, M. S., and Goldstein, J. L. (1985) J. Biol. Chem. 260, 522-530
18. Kaufman, J. F., Auffray, C., Korman, A. J., Shackelford, D. A., and Strominger, J. (1984) Cell 36, 1-13
19. Kobilka, B. K., Matsui, H., Kobilka, T. S., Yang-Feng, T. L., Francke, U., Caron, M. G., Lefkowitz, R. J., and Regan, J. W. (1987) Science 238, 650-656
20. Kobilka, B. K., Dixon, R. A. F., Frielle, T., Dohlman, H. G., Bolanowski, M. A., Sigal, I. S., Yang-Feng, T. L., Francke, U., Caron, M. G., and Lefkowitz, R. J. (1987) Proc. Natl. Acad. Sci. U. S. A. 84, 46-50
21. Peralta, E. G., Winslow, J. W., Peterson, G. L., Smith, D. H., Ashkenazi, A., Ramachandran, J., Schimerlik, M. I., and Capon, D. J. (1987) Science 236, 600-605
22. Kobilka, B. K., Frielle, T., Collins, S., Yang-Feng, T., Kobilka, T. S., Francke, U., Lefkowitz, R. J., and Caron, M. G. (1987) Nature 329, 75-79


Supplementary material (miniprint) to The Mitochondrial Uncoupling Protein Gene, by Kozak et al.

MATERIALS AND METHODS

Isolation of cDNA and Genomic Clones - The isolation of a partial cDNA clone, p-Ucp 1, from a mouse brown fat cDNA library has been described (6). Rescreening this library by hybridization with radiolabeled p-Ucp 1 as a probe resulted in the isolation of additional cDNA clones for Ucp. The restriction enzyme map for the largest cDNA clone, p-Ucp 2, is illustrated in Fig. 1A.

Radiolabeled p-Ucp 1 was also used to isolate genomic clones for the Ucp gene from a BALB/c embryo genomic DNA library constructed in Charon 28 by Dr. Phillip Leder². Plaque hybridizations of phage were performed essentially as described by Benton and Davis (7). The entire Ucp gene was contained in two overlapping fragments.

Subcloning - A detailed restriction enzyme map was made of the overlapping genomic clones. Restriction enzyme fragments found in the genomic clones were also observed on Southern blots of genomic DNA. HindIII fragments of 1.5, 2.2, 2.5, and 4.6 kb, as well as overlapping EcoRI, PstI, and [enzyme name illegible] fragments, were cloned into pUC vectors.

DNA Sequence Determination - Coding regions within the genomic clones were mapped by hybridizing Southern blots, used to prepare restriction enzyme maps, with subcloned regions of the p-Ucp 2 cDNA. These regions were sequenced by the use of double stranded and single stranded DNA sequencing methodologies using the universal M13 primer or synthetic deoxyoligonucleotide primers (8, 9). The RNA sequence of the 5′ end of the Ucp mRNA was determined by a procedure described by Geliebter et al. (10).

RNA Isolation - RNA for Northern blot analysis and RNA sequencing was extracted from the brown fat of cold adapted mice, i.e. mice maintained at 5 °C for 24 h, by the guanidinium-HCl procedure (6). Poly(A)+ RNA was isolated from total RNA by two passages through oligo(dT)-cellulose (11).

Blot Analysis - Northern blots were performed as described by Derman et al. (12) using probes radiolabeled with random primers (13).

² Phillip Leder, personal communication.

[Fig. 1 (Miniprint section): nucleotide sequence of the exons of the Ucp gene with adjoining intron sequences, shown with the deduced amino acid sequence (codons numbered up to 306) and the 3′-untranslated region. The sequence listing is too garbled in this copy to be reproduced reliably. Legible intron annotations include intron 2 (approximately 1900 bp) and intron 5 (approximately 2600 bp).]