

THE JOURNAL OF BIOLOGICAL CHEMISTRY Vol. 263, No. 26, Issue of September 15, pp. 13414-13418, 1988. Printed in U.S.A.

Construction of a Full-length cDNA Encoding Human Pro-α2(I) Collagen and Its Expression in Pro-α2(I)-deficient W8 Rat Cells*

(Received for publication, April 19, 1988)

Seung-Taek Lee, Barbara D. Smith‡, and Daniel S. Greenspan§

From the Department of Pathology, University of Wisconsin, Madison, Wisconsin 53706 and the ‡Laboratory of Connective Tissue and Aging, Veterans Administration Outpatient Clinic, Boston, Massachusetts 02108

We have constructed a cDNA encoding the entire human pro-α2(I) collagen molecule. Sequence determination for 2196 base pairs at the 5' end of the cDNA clone, and comparison with previously characterized human α2(I) sequences, identified a number of nucleotide and amino acid polymorphisms. Functionality of the cDNA clone, under control of the long terminal repeat of Rous sarcoma virus, was demonstrated by its introduction into the W8 cell line. The W8 line, a chemically transformed variant of K16 rat liver epithelial cells, has been previously shown to lack detectable levels of α2(I) RNA, but to secrete α1(I) homotrimers. Introduction of the human cDNA into W8 cells resulted in secretion of chimeric type I collagen comprised of rat α1(I) and human α2(I) chains. Availability of a functional full-length clone of human α2(I) cDNA, combined with the W8 cell line as expression system, will allow detailed analysis, through site-directed mutagenesis, of domains on the pro-α2(I) molecule involved in assembly, transport, secretion, and fibrillogenesis.

Collagen represents approximately 30% of total body proteins in humans. Type I collagen, the major fibrous component of connective tissues, is by far the most abundant of the collagens, and its orderly deposition is critical for normal morphogenesis and wound repair. Monomeric type I collagen is synthesized as procollagen, a precursor molecule formed through self-assembly of one pro-α2(I) and two pro-α1(I) chains into a heterotrimer. Type I procollagen differs from type I collagen in having amino- and carboxyl-terminal peptide extensions referred to as propeptides. These are cleaved upon secretion of procollagen from the cell, leaving the central, triple-helical part of the molecule as the mature monomer that makes up the highly structured fibrils characteristic of type I collagen (for a review, see Ref. 1).

Naturally occurring mutations underlying such inherited collagen disorders as osteogenesis imperfecta, Ehlers-Danlos syndrome, and Marfan's syndrome have been mapped to the two genes encoding type I collagen (1). Characterization of these genetic lesions has provided insight into the roles of various regions of each chain of procollagen in the biosynthesis of the mature molecule. For example, a naturally occurring mutation in a region of the carboxyl-terminal propeptide of the pro-α2(I) chain has demonstrated this region to be important in the initial association of pro-α chains (2). A more systematic approach toward mapping the functional topology of the procollagen molecule could involve mutagenesis of specific sites in cloned collagen genes combined with establishment of tissue culture expression systems for structure/function analysis. Such a system would complement analysis of naturally occurring mutations in patients with heritable collagen diseases.

* This work was supported by Grant GM-38544 (to D. S. G.) from the National Institutes of Health. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

§ To whom correspondence should be addressed.

We report here the cloning of a functional, full-length cDNA encoding the human pro-α2(I) chain and transfection of the cDNA into the W8 line of rodent cells. W8 is a chemically mutagenized line of liver epithelia that expresses pro-α1(I) but not pro-α2(I) chains (3). We show that the cloned human cDNA driven by a retroviral LTR¹ is expressed in the W8 cells and that the cells secrete a stable rat/human chimeric type I collagen molecule. The relative compactness of the cDNA version of the pro-α2(I) gene will make it more suitable for in vitro mutagenesis than the genomic version, which is extremely large, ranging from 35 kb in humans (4) to 38 kb in the chick (5). The reported system thus holds promise for defining functional domains of the human pro-α2(I) chain necessary for assembly, secretion, and fibrillogenesis.

MATERIALS AND METHODS

Isolation of cDNA and Plasmid Construction-A human cDNA library of SV80 transformed fibroblasts in λgt10 phage was obtained from D. Wolf (Harvard University) (6) and screened with a 3.2-kb EcoRI fragment corresponding to the 5' end of α2(I) genomic clone, NJ-3 (7). Inserts of 3.8 and 1.25 kb, from two of the positive clones, were subcloned into pUC9 to generate recombinants pSTL8 and pSTL11, respectively (Fig. 1). These inserts, combined at a common PstI site, comprise a sequence beginning 20 bp downstream of the cap site and ending 35 bp short of the translation termination codon. Joining, at a common XbaI site, with a 950-bp XbaI/EcoRI fragment from the partial-length cDNA clone Hf1131 (8) yields a cDNA insert containing all coding sequences for human pro-α2(I), pSTL14 (Fig. 1). This insert was excised at an SphI site at nucleotide +53 in the 5'-untranslated region and at a SmaI site 9 bp downstream of the insert in the pUC9 polylinker, recloned in pUC18 in order to place a HindIII site 6 bp upstream of the insert, re-excised with HindIII and SmaI, and cloned into expression vector pRSVcat (9) between a HindIII site downstream of the Rous sarcoma virus LTR and a BalI site upstream of the SV40 small t splice site and SV40 early polyadenylation signal. The resultant construct, pSTL22, was used in subsequent expression studies.

Cell Culture and Transfection-W8 and K16 have been described previously (3, 10). AHlF normal neonatal fibroblasts, from foreskin, were kindly provided by L. Allen-Hoffman (University of Wisconsin-Madison). Cells were cultured in Dulbecco's modified Eagle's medium with 10% fetal calf serum as described (10). Transfection of W8 cells was with 5 μg of pSV2neo and 25 μg of pSTL22/10-cm dish (10‘ cells) by calcium phosphate precipitation (11). Cells were shifted to medium containing 0.4 mg/ml G418 (GIBCO) 24 h after transfection. G418-resistant clonal lines of cells were picked with cloning cylinders and maintained in medium containing 0.2 mg/ml G418.

¹ The abbreviations used are: LTR, long terminal repeat; kb, kilobases; bp, base pairs.

RNA Analysis-Cytoplasmic RNA was isolated by disruption of cells in isotonic lysis buffer in the presence of Nonidet P-40 (Shell Chemicals), with removal of nuclei by centrifugation, as described (12). For Northern blot analysis, 25 μg/slot of RNA was separated on 0.7% agarose gels containing 2.2 M formaldehyde (13), transferred to nitrocellulose filters (14), and hybridized to DNA probes radiolabeled to a specific activity of 4-6 × 10⁹ cpm/μg by the random primer method (15) using the multiprime kit (Amersham Corp.).

Isolation and Salt Fractionation of Radiolabeled Collagen-Cultures were supplemented with ascorbate (75 μg/ml) 24 h prior to addition of isotopes. Cells were washed with serum-free medium and then radiolabeled for 16 h in fresh medium containing ascorbate, 3-aminopropionitrile fumarate (50 μg/ml, Sigma), 5% dialyzed fetal calf serum, and 20 μCi/ml L-[2,3-³H]proline (32.5 Ci/mmol, Du Pont-New England Nuclear). Culture medium was collected, combined with a serum-free medium wash of the cell layer, and chilled to 4 °C. Phenylmethylsulfonyl fluoride and EDTA were added to final concentrations of 0.1 and 25 mM, respectively, cell debris was removed by centrifugation, and collagens were precipitated in 25% saturated (NH₄)₂SO₄ overnight at 4 °C. Precipitates were resuspended in 0.05 M Tris-HCl, pH 7.5, 0.15 M NaCl, reprecipitated in 75% ethanol, and brought up into 0.5 M acetic acid for treatment with 100 μg/ml pepsin for 6 h at 4 °C. Samples were neutralized with NaOH, dialyzed against water, lyophilized, taken up into gel buffer (16), and heat denatured prior to electrophoresis. For salt fractionation, pepsinized and neutralized samples were dialyzed for 24 h against 2.5 M NaCl, 50 mM Tris-HCl (pH 7.5). After centrifugation (60 min at 30,000 × g), pellets were rinsed twice in 2.5 M NaCl, 50 mM Tris-HCl, lyophilized, and taken up into gel buffer. Supernatants were made 4.4 M in NaCl by addition of solid NaCl and, after 24 h, centrifuged for 60 min at 30,000 × g. Pellets were lyophilized and taken up into gel buffer.
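The second salt cut raises the supernatant from 2.5 to 4.4 M NaCl by adding solid salt. As a rough illustration only, the sketch below estimates the mass of NaCl involved. The function name and the 10-ml example volume are hypothetical, and the calculation assumes that the volume does not change on dissolution, which is not true at these concentrations; it should therefore be read as a first approximation and not as the procedure actually followed.

```python
NACL_G_PER_MOL = 58.44  # molar mass of NaCl

def solid_nacl_to_add(volume_ml, start_molar, target_molar):
    """Grams of solid NaCl needed to raise `volume_ml` of solution from
    `start_molar` to `target_molar`.

    ASSUMPTION: the solution volume is treated as constant, so the result
    is only an approximation for concentrated salt solutions.
    """
    if target_molar < start_molar:
        raise ValueError("target concentration must not be below the start")
    moles_needed = (target_molar - start_molar) * volume_ml / 1000.0
    return moles_needed * NACL_G_PER_MOL

# Hypothetical 10-ml supernatant taken from the 2.5 M NaCl step.
grams = solid_nacl_to_add(10.0, 2.5, 4.4)
print(f"{grams:.2f} g NaCl per 10 ml")
```

Under the constant-volume assumption this works out to about 0.11 g of NaCl per ml of supernatant.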

DNA Sequence Analysis-All sequences were obtained in both directions by dideoxy chain termination (17) using [α-³⁵S]dATP (Amersham Corp.), M13mp8, 18, and 19 as cloning vectors (18), and the Sequenase sequencing kit (United States Biochemical Corporation).

RESULTS

Sequence Analysis-To fully characterize that part of the α2(I)-coding sequence which had not been described previously, restriction fragments covering the 5' end of the full-length cDNA (Fig. 1) were cloned into M13 and both strands sequenced. As this paper was being prepared, a consensus sequence of human α2(I), as deduced from overlapping genomic and cDNA partial-length clones, was published by de Wet et al. (19). Comparison of the 2196-bp sequence derived from our full-length clone with the sequence of de Wet et al. shows 20 differences (Table I). Of these, two are insertions of single base pairs into the 5'-untranslated region, and the rest are single base pair substitutions, 17 of which are in the coding region. Five amino acid substitutions result, all of which are relatively conservative, with nonpolar residues substituting for either other nonpolar residues or for residues with uncharged polar R groups. All observed amino acid substitutions occur in the main triple helical region of the collagen molecule in either the X or Y position of the Gly-X-Y repeat.

FIG. 1. Restriction map, sequencing strategy, and constituent clones of the full-length human pro-α2(I) collagen cDNA. Overlaps of the three cDNA clones pSTL11, pSTL8, and Hf1131 are depicted above the restriction map of the full-length cDNA clone pSTL14. White, black, and hatched bars comprising pSTL14 represent parts of the cDNA encoding untranslated regions, terminal propeptides, and main triple helical region, respectively. E, EcoRI; S, SphI; B, BstEII; A, ApaI; Pv, PvuII; P, PstI; Sf, SfiI; Na, NaeI; N, NcoI; Bc, BclI; X, XbaI. Underlined restriction sites are present in pSTL14 but not in the sequence reported by de Wet et al. (19); sites in parentheses are present in the de Wet et al. sequence but not in pSTL14. Horizontal arrows represent sequenced fragments. NB, PvuII-EcoRI fragment used as Northern blot probe. S1, NcoI-BstEII fragment used as S1 probe; asterisk denotes ³²P-labeled 3' end and wavy line, 136 bp from M13 vector.
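The split between silent and amino acid-changing substitutions in Table I can be checked with simple codon-position arithmetic. The sketch below is illustrative only: it assumes that the double substitution at positions 1132-1133 (Pro to Val, i.e. CCN to GTN) occupies codon positions 1 and 2, which is an inference from the table and not a statement in the text, and it uses that to place every other coding-region substitution in the reading frame. The function and variable names are hypothetical.

```python
# Coding-region substitutions from Table I (de Wet et al. numbering).
CODING_SUBSTITUTIONS = [822, 840, 864, 879, 881, 903, 921, 961, 966,
                        1132, 1133, 1147, 1230, 1236, 1581, 1583, 1644]

# ASSUMPTION: nucleotide 1132 is the first base of a codon (CCN -> GTN).
FRAME_ANCHOR = 1132

def codon_position(nt_position, anchor=FRAME_ANCHOR):
    """Return 1, 2, or 3: the position of a nucleotide within its codon."""
    return (nt_position - anchor) % 3 + 1

def codon_index(nt_position, anchor=FRAME_ANCHOR):
    """Index of the codon containing the nucleotide, relative to the anchor."""
    return (nt_position - anchor) // 3

third_position = [p for p in CODING_SUBSTITUTIONS if codon_position(p) == 3]
first_or_second = [p for p in CODING_SUBSTITUTIONS if codon_position(p) != 3]
codons_hit = {codon_index(p) for p in first_or_second}

print("third-position substitutions:", third_position)
print("first/second-position substitutions:", first_or_second)
print("codons with a first/second-position change:", len(codons_hit))
```

Under this assumption, six substitutions (881, 961, 1132, 1133, 1147, and 1583) fall in first or second codon positions and affect five codons, which agrees with the five amino acid substitutions reported; the remaining eleven fall in third positions.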

As in chicken and mouse α2(I) genes (20), the 5' region of human α2(I) mRNA contains three AUG codons, the first and second of which are followed by short open reading frames potentially encoding a hexapeptide and tetrapeptide, respectively. The nucleotide substitution at position 71 (Table I) converts the terminal methionine of the hexapeptide predicted by the de Wet et al. (19) sequence to arginine. With this terminal arginine, the sequence of the potential hexapeptide becomes identical to that reported for the mouse (20). The inserted nucleotides at positions 97-98 and 116-117 also agree with the mouse sequence.
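The arrangement described here, with upstream AUG codons followed by short open reading frames, can be located with a simple scan of the leader. The sketch below is a minimal illustration; the leader sequence is invented for the example (it is not the human α2(I) sequence) and was built so that two AUGs open reading frames of six and four codons while a third AUG opens a frame that runs off the end of the fragment. The function name is hypothetical.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}

def upstream_orfs(leader):
    """Return (start_index, codons, terminated) for every AUG in `leader`.

    Each AUG is read in frame until a stop codon or the end of the sequence;
    `terminated` is True only if an in-frame stop codon was reached.
    """
    rna = leader.upper().replace("T", "U")
    orfs = []
    for start in range(len(rna) - 2):
        if rna[start:start + 3] != "AUG":
            continue
        codons, terminated = [], False
        for i in range(start, len(rna) - 2, 3):
            codon = rna[i:i + 3]
            if codon in STOP_CODONS:
                terminated = True
                break
            codons.append(codon)
        orfs.append((start, codons, terminated))
    return orfs

# Invented toy leader, for illustration only.
TOY_LEADER = ("GCC"
              "AUGGCUCCAGGAAAACGUUAA"   # AUG + 5 codons + stop
              "CC"
              "AUGCCCGGGAAAUAA"         # AUG + 3 codons + stop
              "GC"
              "AUGGGACCC")              # AUG opening a longer frame

for start, codons, terminated in upstream_orfs(TOY_LEADER):
    print(start, len(codons), "codons", "(stop found)" if terminated else "(open)")
```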

Based upon placement of splice junction consensus sequences and comparison of our sequence with that of chick α2(I) (21), we concluded that exon 3 of the human gene contains 15 bp compared to 18 bp in the chicken (Fig. 2). This has been confirmed (19). It is of interest, however, when comparing human and chick sequences, that although a Gly-X-Y triplet is lost to human exon 3 through deletion of a Gly codon, an additional Gly-X-Y triplet is gained in human exon 5 by the addition of 9 bp, thus maintaining the overall length of the small triple helical region in the NH₂-terminal propeptide.

TABLE I
Polymorphisms of human α2(I) collagen

No.   Nucleotide positionᵃ   Nucleotide changeᵇ   Amino acid changeᵇ
1     71                     G (T)                Arg (Met)
      97-98                  G insertion
      116-117                T insertion
14    822                    C (T)
15    840                    C (A)
      864                    G (A)
16    879                    C (T)
      881                    T (A)                Ile (Asn)
      903                    C (T)
      921                    C (T)
17    961                    G (A)                Ala (Thr)
      966                    T (A)
19    1132,1133              GT (CC)              Val (Pro)
      1147                   G (A)                Ala (Thr)
21    1230                   T (C)
      1236                   G (T)
25    1581                   A (C)
      1583                   C (T)                Ala (Val)
26    1644                   T (C)

ᵃ According to the numbering system of de Wet et al. (19).
ᵇ Nucleotide sequences were determined by the strategy shown in Fig. 1. Amino acid sequences were derived from the DNA sequence. Sequences in parentheses are from Ref. 19.

FIG. 2. Alignment of amino acids encoded by human and chick exons 3, 4, and 5. Dashes denote amino acids absent in one but present in the other of the two species. Homologies are denoted by asterisks. Arrowheads show exon junctions and open arrows mark the end points of the amino-terminal triple helical region. Chick amino acid sequences are from Ref. 21.

Expression of Human α2(I) Collagen in Transfected W8 Cells-The full-length pro-α2(I) cDNA was inserted into the expression vector pRSVcat downstream of the RSV-LTR, shown to be a strong promoter in a variety of eukaryotic cell types (9). Joining was such that the distance between the LTR transcriptional start site and the AUG which initiates translation of pro-α2(I) is about 132 bp. The resultant recombinant (pSTL22) was cotransfected with the selectable marker pSV2neo (22) into W8 cells. Clonal lines of transfected W8 cells resistant to the neomycin analog G418 were then isolated, labeled with [³H]proline, and the culture media analyzed for presence of α2(I) chains. Unlike pro-α1(I), pro-α2(I) chains do not form homotrimers and, in the absence of pro-α1(I) chains with which to complex, pro-α2(I) chains are rapidly degraded in the cell and are not secreted (23). Thus, detection of α2(I) chains in culture media indicates the secretion of such chains in combination with two pro-α1(I) chains in a heterotrimer. Resistance to limited pepsin digestion demonstrates the heterotrimer to be in the form of a stable triple helix. Pepsin-resistant α2(I) chains were detected in the culture media of 5 of 10 clonal lines derived from transfected W8 cells. The line with the highest level of α2(I) chains, A2 (Fig. 3A), was selected for further analysis.

Concomitant with α2(I) production, the A2 line reproducibly secreted increased amounts of α1(I) chains compared with that secreted by equal numbers of W8 cells grown to the same density (Fig. 3A). It is also evident that the ratio of α1(I) to α2(I) chains secreted by A2 cells is far greater than the 2:1 ratio expected of a pure population of heterotrimers. To determine whether collagen secreted by A2 cells might be comprised of both heterotrimers and α1(I) homotrimers, solubility differences between the two types of collagen were exploited in fractionation of pepsin-treated media by sequential precipitation in 2.5 and 4.4 M NaCl. As controls, similar fractionations were performed on the culture media of both W8 cells, which secrete only α1(I) homotrimers, and K16 cells, which secrete predominantly type I collagen heterotrimers. It can be seen that almost all secreted collagen of K16 cells is insoluble at 2.5 M NaCl (Fig. 3B) and is comprised of both α1(I) and α2(I) chains. In contrast, secreted collagen of W8 cells is comprised exclusively of α1(I) chains which are soluble in 2.5 M NaCl but which precipitate out at 4.4 M NaCl.

FIG. 3. Autofluorogram of [³H]proline-labeled α1(I) and α2(I) collagenous proteins secreted by K16, W8, and A2 cell lines. Pepsin-resistant collagen from equal volumes of W8, A2, and K16 culture media were resolved by sodium dodecyl sulfate, 6% polyacrylamide gel electrophoresis (A) directly after isolation or (B) after isolation and sequential fractionation in 2.5 and 4.4 M NaCl. Isolation and salt fractionation of collagens are detailed under "Materials and Methods." Cell lines were equally confluent at time of labeling (~5 × 10⁶ cells/100-mm dish). Electrophoretic migration positions of lathyritic rat skin collagen α1(I) and α2(I) chains, used as markers, are indicated.

As expected, the secreted collagens of A2 cells are comprised of both type I heterotrimers (Fig. 3B, 2.5 M) and α1(I) homotrimers (Fig. 3B, 4.4 M). Ratios of α1(I) to α2(I) chains were derived for 2.5 M NaCl-insoluble collagens of K16 (2.5:1) and A2 (3.5:1) cells by densitometric scanning of autofluorograms exposed for varying lengths of time (data not shown). The somewhat higher α1(I):α2(I) ratio observed in the 2.5 M NaCl fraction of A2 cells probably reflects trapping of some α1(I) homotrimers in the initial salt cut.
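A chain ratio above 2:1 can be translated into an approximate proportion of homotrimers with a two-species model. The sketch below is a back-of-the-envelope illustration and not an analysis performed in this study. It assumes that a fraction contains only heterotrimers (two α1(I) chains and one α2(I) chain) and homotrimers (three α1(I) chains), and that densitometric signal is proportional to chain number. With h heterotrimers and m homotrimers the chain ratio is r = (2h + 3m)/h, so the fraction of trimers that are homotrimers is m/(m + h) = (r - 2)/(r + 1). The function name is hypothetical.

```python
def homotrimer_fraction(alpha1_to_alpha2):
    """Fraction of trimers that are alpha1(I) homotrimers, given the
    alpha1(I):alpha2(I) chain ratio of a mixed population.

    ASSUMPTIONS: only [alpha1]2[alpha2] heterotrimers and [alpha1]3
    homotrimers are present, and signal is proportional to chain number.
    """
    r = float(alpha1_to_alpha2)
    if r < 2.0:
        raise ValueError("a ratio below 2:1 cannot arise from these two species")
    return (r - 2.0) / (r + 1.0)

# Ratios reported for the 2.5 M NaCl-insoluble fractions.
for label, ratio in (("K16", 2.5), ("A2", 3.5)):
    print(label, round(homotrimer_fraction(ratio), 3))
```

Read this way, the 3.5:1 ratio of the A2 fraction would correspond to roughly one homotrimer for every two heterotrimers. Because the K16 control also exceeds 2:1, part of the apparent excess may reflect labeling or scanning bias, so these figures are upper-bound estimates at best.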

Analysis of Collagen-specific RNA Species-Northern blot analysis of cytoplasmic RNA from A2 cells detected two bands corresponding to species of 4.5 and 4.3 kb (Fig. 4). Unexpectedly, these correspond in size to mRNAs which have utilized polyadenylation signal sequences located 16 and 180 bases downstream of the translation terminator in the 3'-untranslated region of the α2(I) cDNA. These sites are, together, employed by only 5-10% of α2(I) transcripts found in human fibroblasts (7, 19). Use of these sites in the two RNA species detectable by Northern blot analysis and the presence of only trace amounts of transcripts which extend downstream into SV40 sequences has been confirmed by S1 mapping (not shown). Thus, although the SV40 small t antigen intron and splice junctions had been placed downstream of the collagen cDNA, the majority of transcripts derived from the cDNA clone are, apparently, unspliced. This may, in part, explain their relatively low abundance compared to levels of α2 transcripts found in AHlF normal human fibroblasts (Fig. 4A, lane 4) or compared with the signal obtained from cross-hybridization of the human probe with rat α2(I) transcripts from K16 cells (Fig. 4A, lane 1).

FIG. 4. Northern blot analysis of α2(I)-specific RNA. Cytoplasmic RNA from K16 (lane 1), W8 (lane 2), A2 (lane 3), and AHlF (lane 4) cells was analyzed by hybridization to a probe specific for human α2(I) collagen (see Fig. 1), as described under "Materials and Methods." The autoradiogram was exposed for 88 h (A) or 24 h for clearer viewing of human fibroblast α2(I) RNA species (B). The positions of 5.1-, 4.9-, 4.6-, and 4.5-kb species of α2(I) mRNAs (21) and 28 S ribosomal RNA are shown.

FIG. 5. S1 nuclease analysis of human α2(I) and rat α1(I) transcripts in K16, W8, A2, and AHlF cells. Cytoplasmic RNA (50 μg) from the various cell lines was hybridized to (A) a double-stranded DNA probe specific for human α2(I) which extends from an NcoI site to a BstEII site in the region of the human cDNA which encodes the main triple helical region (see Fig. 1). Appended to the 249 bp of α2(I) sequences is a 135 bp "tail" of M13 sequences which provides a size difference to distinguish protected DNA fragments from undigested probe. B, a probe specific for rat α1(I) which extends 365 bp from a BamHI to a PstI site near the 3' end of the region encoding the main triple helical region of rat α1(I) (24). This probe is joined to a 12-bp tail from vector pUC18. Human and rat probes were 3' end-labeled at the NcoI and BamHI sites, respectively, hybridized at 60 °C to 50 μg of cytoplasmic RNA from K16 (lane 2), W8 (lane 3), A2 (lane 4), and AHlF (lane 5) cells, and subjected to digestion at 20 °C with S1 nuclease. Lane 1, undigested probe. M, ³²P-labeled markers of HinfI-EcoRV-digested pBR322 of 517, 506, 446, 396, 344, 298, 221, and 220 bp.

To confirm that α2(I) collagen chains secreted by A2 cells are encoded by the transfected human cDNA clone, and not produced by induction of endogenous rat genes, the more sensitive S1 nuclease protection assay was performed (Fig. 5). Hybridization of cytoplasmic RNA from A2 cells to a 385-bp human DNA probe with subsequent S1 digestion generates a protected DNA fragment of 249 bases, diagnostic for human α2(I) transcripts (Fig. 5A, lane 4). The protection is shown to be specific for human α2(I) RNA species since (i) a greater amount of the same fragment is protected by the cytoplasmic RNA of human diploid fibroblasts (Fig. 5A, lane 5), and (ii) the 249-base fragment is not protected by cytoplasmic RNA of either K16 (Fig. 5A, lane 2) or W8 (Fig. 5A, lane 3) cells.

The A2 line differs from W8 cells not only in the secretion of human α2(I) chains but in the secretion of increased amounts of rat α1(I) chains as well (see Fig. 3). To determine if the increased levels of α1(I) chains found in culture media of A2 cells correlate with an increased level of α1(I) transcripts, the cytoplasmic RNAs of K16, W8, and A2 cells were hybridized to a rat α1(I)-specific DNA probe and compared by S1 nuclease protection assay (Fig. 5B). Under conditions of probe excess, the amount of a protected DNA fragment is proportional to the amount of the corresponding RNA species in a sample. Under such conditions the levels of cytoplasmic α1(I)-specific RNA were found to be similar for W8 and A2 cell types (Fig. 5B, lanes 3 and 4).

DISCUSSION

This work describes the cloning of a full-length cDNA encoding the human α2(I) collagen and demonstrates its expression, under control of the RSV-LTR, in a cell culture system. The W8 line is shown to be a convenient system for studying expression of transfected α2(I) genes as it produces abundant α1(I) chains but does not produce detectable levels of endogenous α2(I) protein or mRNA. Secretion of human α2(I) chains by the transfected system indicates that they are combined with endogenous rat α1(I) chains, since α2(I) chains not complexed with α1(I) are rapidly turned over and not secreted (23). The complexed rat-human α chains are resistant to pepsin digestion, indicating a tight triple helical structure, and show solubility characteristics similar to those of normal type I collagen in salt fractionation experiments. We conclude, therefore, that the human α2(I) and rat α1(I) chains are combined in a chimeric rat/human heterotrimeric type I collagen. Moreover, both α1(I) and α2(I) chains from A2 cells migrate with the same electrophoretic mobility as chains from normal rat type I collagen. This suggests that formation of the chimeric triple helix is efficient, since delayed helix formation, as observed with structurally abnormal α chains of many patients with osteogenesis imperfecta (25, 26), results in excessive post-translational modification of α chains which retards their electrophoretic migration in gels.

Levels of rat α1(I) chains secreted by A2 cells were consistently higher than those of W8 cells, although quantitative S1 mapping showed similar levels of rat α1(I) RNA in the two cell lines. Expression of pro-α2(I) collagen chains, therefore, does not appear to influence transcription and/or stability of α1(I) mRNA but does influence, at the translational or post-translational level, the amount of α1(I) chains found in the media of transfected cells. These observations are supported by preliminary results involving transient expression of the α2(I) clone in W8 cells² and may represent the more efficient secretion and/or increased intracellular stability of pro-α1(I) chains when combined with pro-α2(I) chains as heterotrimers.

Expression of α2(I) chains encoded by transfected cDNA was at levels lower than those observed for endogenous chains in K16 cells or normal diploid human fibroblasts. This is most probably due to the nature of the recombinant DNA clone used in this particular study rather than an intrinsic property of the α2(I) cDNA. Unexpectedly, the majority of detectable cytoplasmic RNA species transcribed from the α2(I) cDNA were found to utilize polyadenylation signals remaining in the 3'-untranslated region of the cDNA which are employed infrequently in normal human fibroblasts. Available SV40 small t antigen intron and splice junctions downstream of the cDNA insert were not used by the great majority of transcripts. The apparent lack of splicing may decrease levels of expression, explaining the low levels of α2(I) chains in A2 cells relative to levels found in K16 cells or in normal human fibroblasts (AHlF). Removal of sequences which lie between the α2(I) translation terminator and downstream SV40 sequences should markedly increase expression levels in future studies. The α2(I) cDNA is also being placed in a retroviral vector, since use of this type of vector has yielded high levels of expression of the human α1(I) cDNA in transduction studies (27).

Availability of a functional cDNA clone encoding α2(I) will allow the systematic mapping, by site-directed mutagenesis and subsequent in vitro expression studies, of domains important in triple helix formation, secretion, and fibrillogenesis of type I collagen. Naturally occurring mutations of the α2(I) gene have been correlated with cases of Ehlers-Danlos syndrome, Marfan's syndrome, and osteogenesis imperfecta (1). These mutations can be expressed in a dominant fashion. Therefore, introduction of in vitro mutagenized α2(I) clones into transgenic mice could yield a model system for relating different mutations to various phenotypes.

Jaenisch and colleagues (27, 28) have introduced cloned α1(I) genes into α1(I)-deficient Mov-13 mouse cells, leading to formation of functional mouse-human hybrid type I collagen. The system presented here and that of the Jaenisch group clearly complement each other, as functional clones and expression systems are now available for study of both chains of human type I collagen.

Acknowledgments-We are grateful to Dr. Francesco Ramirez for supplying us with clones NJ-3 and Hf-1131 and to Dr. David Rowe for providing the pa lRl rat probe. We thank Guy Hoffman and Jennifer Harder for their technical assistance. We also thank Dr. Burton Goldberg for encouragement and critical reading of the manuscript.

REFERENCES

1. Prockop, D. J., and Kivirikko, K. I. (1984) N. Engl. J. Med. 311, 376-386

2. Pihlajaniemi, T., Dickson, L. A., Pope, F. M., Korhonen, V. R., Nicholls, A., Prockop, D. J., and Myers, J. C. (1984) J. Biol. Chem. 259, 12941-12944

3. Marsilio, E., Sobel, M. E., and Smith, B. D. (1984) J. Biol. Chem. 259, 1401-1404

4. Dickson, L. A., de Wet, W. J., Di Liberto, M., Weil, D., and Ramirez, F. (1985) Nucleic Acids Res. 13, 3427-3438

5. Ohkubo, H., Vogeli, G., Mudryj, M., Avvedimento, V. E., Sullivan, M., Pastan, I., and de Crombrugghe, B. (1980) Proc. Natl. Acad. Sci. U. S. A. 77, 7059-7063

² S.-T. Lee, B. D. Smith, and D. S. Greenspan, unpublished data.


6. Wolf, D., and Rotter, V. (1985) Proc. Natl. Acad. Sci. U. S. A. 82, 790-794

7. Myers, J. C., Dickson, L. A., de Wet, W. J., Bernard, M. P., Chu, M.-L., Di Liberto, M., Pepe, G., Sangiorgi, F. O., and Ramirez, F. (1983) J. Biol. Chem. 258, 10128-10135

8. Bernard, M. P., Myers, J. C., Chu, M.-L., Ramirez, F., Eikenberry, E. F., and Prockop, D. J. (1983) Biochemistry 22, 1139-1145

9. Gorman, C. M., Merlino, G. T., Willingham, M. C., Pastan, I., and Howard, B. H. (1982) Proc. Natl. Acad. Sci. U. S. A. 79, 6777-6781

10. Smith, B. D., and Niles, R. (1980) Biochemistry 19, 1820-1825

11. Wigler, M., Pellicer, A., Silverstein, S., Axel, R., Urlaub, G., and Chasin, L. (1979) Proc. Natl. Acad. Sci. U. S. A. 76, 1373-1376

12. Greenspan, D. S., and Weissman, S. M. (1985) Mol. Cell. Biol. 5, 1894-1900

13. Maniatis, T., Fritsch, E. F., and Sambrook, J. (1982) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY

14. Thomas, P. S. (1980) Proc. Natl. Acad. Sci. U. S. A. 77, 5201-5205

15. Feinberg, A. P., and Vogelstein, B. (1984) Addendum. Anal. Biochem. 137, 266-267

16. Laemmli, U. K. (1970) Nature 227, 680-685

17. Sanger, F., Nicklen, S., and Coulson, A. R. (1977) Proc. Natl. Acad. Sci. U. S. A. 74, 5463-5467

18. Yanisch-Perron, C., Vieira, J., and Messing, J. (1985) Gene (Amst.) 33, 103-119

19. de Wet, W., Bernard, M., Benson-Chanda, V., Chu, M.-L., Dickson, L., Weil, D., and Ramirez, F. (1987) J. Biol. Chem. 262, 16032-16036

20. Schmidt, A., Yamada, Y., and de Crombrugghe, B. (1984) J. Biol. Chem. 259, 7411-7415

21. Boedtker, H., Finer, M., and Aho, S. (1985) Ann. N. Y. Acad. Sci. 460, 85-116

22. Southern, P. J., and Berg, P. (1982) J. Mol. Appl. Genet. 1, 327-341

23. Dziadek, M., Timpl, R., and Jaenisch, R. (1987) Biochem. J. 244, 375-379

24. Genovese, C., Rowe, D., and Kream, B. (1984) Biochemistry 23, 6210-6216

25. Bateman, J. F., Mascara, T., Chan, D., and Cole, W. G. (1984) Biochem. J. 217, 103-115

26. Bonadio, J. F., and Byers, P. H. (1985) Nature 316, 363-366

27. Stacey, A., Mulligan, R., and Jaenisch, R. (1987) J. Virol. 61, 2549-2554

28. Schnieke, A., Dziadek, M., Bateman, J., Mascara, T., Harbers, K., Gelinas, R., and Jaenisch, R. (1987) Proc. Natl. Acad. Sci. U. S. A. 84, 764-768