

THE JOURNAL OF BIOLOGICAL CHEMISTRY © 1991 by The American Society for Biochemistry and Molecular Biology, Inc.

Vol. 266, No. 4, Issue of February 5, pp. 2466-2473, 1991 Printed in U.S.A.

Structure and Chromosomal Location of the Gene for Endothelial-Leukocyte Adhesion Molecule 1*

(Received for publication, June 8, 1990)

Tucker Collins‡§, Amy Williams‡, Geoffrey I. Johnston¶**, Jenny Kim‡, Roger Eddy∥, Thomas Shows∥, Michael A. Gimbrone, Jr.‡, and Michael P. Bevilacqua‡§ From the ‡Department of Pathology, Vascular Research Division, Brigham and Women’s Hospital, Boston, Massachusetts 02115, the ¶Department of Medicine, Oklahoma University Health Science Center/Cardiovascular Biology Research Program, Oklahoma Medical Research Foundation, Oklahoma City, Oklahoma 73104, and the ∥Department of Human Genetics, Roswell Park Memorial Institute, Buffalo, New York 14163

Endothelial-leukocyte adhesion molecule 1 is a cell surface glycoprotein expressed by cytokine-activated endothelium that mediates the adhesion of blood neutrophils. Endothelial-leukocyte adhesion molecule 1 is a member of the selectin family of cell adhesion molecules, each of which contains an amino-terminal lectin-like domain, followed by an epidermal growth factor-like domain and a variable number of short consensus repeats similar to those found in complement binding proteins. Genomic clones encoding the ELAM gene were isolated and the organization of the ELAM gene was determined. The gene, which is present in a single copy in the human genome, contains 14 exons spanning about 13 kilobases of DNA. The positions of exon-intron boundaries correlate with the putative functional subdivisions of the protein. Introns are found at similar positions in all of the six complement regulatory repeats, suggesting that these elements arose by internal gene duplication. A consensus TATAA element is located upstream of the transcriptional start site. The ELAM promoter contains an inverted CCAAT box and consensus NF-κB- and AP-1-binding sites. The ELAM gene was assigned to the q12→qter region of human chromosome 1 by analysis of human-mouse hybrid cell lines. Two other members of the selectin gene family, the leukocyte adhesion molecule 1 (LAM-1, TQ1, LEC-CAM 1, or Leu-8) and the granule membrane protein 140 (GMP-140, PADGEM, or CD62) have been localized to the long arm of chromosome 1, as have the structurally related complement binding proteins, suggesting that these genes may share a common evolutionary history.

* This work was supported by National Institutes of Health Grants PO1 HL-36028, HL-35716, GM20454, HG05196. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

The nucleotide sequence(s) reported in this paper has been submitted to the GenBank™/EMBL Data Bank with accession number(s) M58017.

§ Fellow of the Pew Scholars Program.

** Present address: Pharmaceutical Proteins Ltd., Kings Buildings, West Mains Rd., Edinburgh EH9 3JQ, Scotland.

Vascular endothelium is ideally positioned at the boundary between blood and tissues to regulate developmental, inflammatory, and immunological processes. Activation of endothelial cells by the cytokines interleukin 1 (IL-1)¹ and tumor necrosis factor (TNF), or by bacterial endotoxin, promotes the adhesion of blood leukocytes (reviewed in Refs. 1 and 2). This process appears to depend in large part on the de novo expression of endothelial cell surface adhesion molecules. Endothelial-leukocyte adhesion molecule 1 (ELAM-1) is a 115-kDa cell surface glycoprotein expressed by cytokine-activated endothelium that mediates the adhesion of blood neutrophils (3). In vivo, ELAM-1 expression is restricted to the endothelial lining of post-capillary venules at sites of active inflammation and certain immunologic disease processes (4). A full-length complementary DNA for ELAM-1 was isolated by transient expression in COS-1 cells (5). Cells transfected with the ELAM-1 cDNA support the adhesion of isolated human neutrophils or the promyelocytic cell line HL-60. Expression of ELAM-1 transcripts in cultured human endothelial cells is induced by cytokines such as IL-1 or TNF, reaching a maximum at 2-4 h and decaying by 24 h; cell surface expression of ELAM-1 protein parallels that of mRNA (5).

ELAM-1 belongs to a family of structurally related molecules, designated “selectins,” in which the known members all participate in endothelial-leukocyte adhesion. A second member of the family is the Mel-14 antigen (6, 7). Originally identified as a murine lymphocyte cell surface molecule that functioned as a lymph node homing receptor, the antigen was subsequently found on neutrophils and monocytes. Its human counterpart has been designated the human leukocyte adhesion molecule (LAM-1) and recognized to be the same as the Leu-8 or TQ1 antigen (8-11). The third member of this family is granule membrane protein 140 (GMP-140) (12), also known as platelet activation-dependent granule-external membrane protein (PADGEM) (13) or CD62. This protein is found in the secretory granules of platelets and endothelial cells. After cellular activation by agonists such as thrombin, this protein rapidly redistributes to the plasma membrane where it can mediate adhesion of neutrophils and monocytes (14, 15).

The members of the selectin gene family have extensive amino acid similarity and share a pattern of organization. Each member of the selectin family has an unusual mosaic structure with an amino-terminal lectin-like domain, an epidermal growth factor-like domain, a variable number of consensus repeats (about 60 amino acids each) similar to those found in complement regulatory (CR) proteins, a transmembrane segment and a short cytoplasmic domain (5-12). As an initial approach to examining the mechanisms regulating ELAM gene² expression and those regulating the tissue specificity of transcription, the human gene encoding ELAM was cloned and its genomic organization characterized.

¹ The abbreviations used are: IL-1, interleukin 1; bp, base pair; CR, complement regulatory; ELAM-1, endothelial leukocyte adhesion molecule 1; GMP-140, granule membrane protein-140; kb, kilobases; LAM-1, human leukocyte adhesion molecule 1; RCA, regulators of complement activation; TNF, tumor necrosis factor.

EXPERIMENTAL PROCEDURES

Cloning the ELAM-1 Gene-A bacteriophage λ Charon 4A library, prepared from human lymphocyte DNA (16), and a library of human peripheral blood DNA in the vector EMBL3 (17), were screened with the complete ELAM-1 cDNA (5), or 3' restriction fragments of the ELAM-1 cDNA, respectively. The libraries were screened with probes labeled with the Klenow fragment of DNA polymerase I in the presence of random hexanucleotide primers and [α-³²P]dCTP (18). The bacteriophage libraries were plated and nitrocellulose filters prepared as previously described (19). Filters were incubated with the radiolabeled restriction fragments according to standard procedures (19). Hybridizing phage were then purified and DNA prepared from phage stocks as described previously (20).

Sequencing the ELAM-1 Gene-Restriction fragments derived from bacteriophage containing the ELAM gene were ligated into the plasmid vectors pIBI 20 (International Biotechnologies Inc., New Haven, CT) or pBS (Stratagene, La Jolla, CA). Nucleotide sequence was determined by the dideoxynucleotide chain termination procedure with modified T7 DNA polymerase (United States Biochemical Corp., Cleveland, OH) and [α-³⁵S]dATP (21, 22). Oligonucleotide primers were synthesized using an oligonucleotide synthesizer (Applied Biosystems, Foster City, CA) and were used without purification.

Polymerase Chain Reaction-The sizes of some of the introns were measured using the polymerase chain reaction (PCR), performed (23) with Taq polymerase (Perkin-Elmer Cetus Instruments) and a Perkin-Elmer Cetus Instruments DNA Thermal Cycler. Typically about 0.1 µg of an insert from a plasmid subclone of the ELAM gene containing the intron was amplified with 2.5 units of Taq polymerase in 30 cycles of 15 s at 94 °C (denaturing), 15 s at 55 °C (annealing), and 30 s at 72 °C (extension) using primers corresponding to the upstream and downstream exon sequences. The size of the amplification product was measured on a 1% agarose gel.

Growth of Human Umbilical Vein Endothelial Cells, Cytokine Treatment, and mRNA Isolation-Human endothelial cells were harvested from two to six umbilical cord veins and established in primary culture as previously described (24). Cultures were serially passaged under the conditions described by Maciag (25) as modified by Thornton et al. (26). Endothelial cells were treated with 5 units/ml of recombinant IL-1 (Biogen, Boston, MA) for 2.5 h at 37 °C; poly(A)+ RNA was then prepared by detergent lysis and oligo(dT) cellulose batchwise absorption using a Fast Track Kit (Invitrogen, San Diego, CA).

Primer Extension Analysis-An oligonucleotide (ACAACTGCTTCCCAAAAC) complementary to the 5' end of the ELAM-1 mRNA (5) was labeled with [γ-³²P]dATP and polynucleotide kinase; about 5 × 10⁵ cpm were hybridized to each RNA sample (5 µg of IL-1-treated endothelial poly(A)+ RNA or tRNA) at 50 °C for 1 h in 12 µl of 100 mM KCl, 10 mM MgCl₂, and 25 mM Tris-Cl, pH 8.3. The reverse transcription reactions (40 µl) contained 30 mM KCl, 8 mM MgCl₂, 50 mM Tris-HCl, pH 8.3, 500 µM (each) of the dideoxynucleotides, 25 µg/ml of actinomycin D, 10 units of RNasin (Promega Biotec, Madison, WI), 50 units of avian myeloblastosis virus reverse transcriptase (Molecular Genetic Resources, Tampa, FL), and were performed at 42 °C for 60 min. Extension products were run alongside a Sanger DNA sequence, generated using a double strand DNA template and the same sequencing primer as used in the reverse transcription reactions. This method allows the size of the extension product to be directly compared with the genomic sequence.

S1 Nuclease Protection Analysis-S1 nuclease protection analysis was as described (20). An 875-bp fragment of the ELAM-1 gene spanning the 5' end was generated by PCR, labeled with [γ-³²P]dATP and polynucleotide kinase, and digested with AccI to generate a 563-bp probe (-384 to +179). 25 µg of IL-1-treated endothelial cell RNA was hybridized with the probe, and the resulting heteroduplexes were incubated with 250 units/ml S1 nuclease (Sigma) at room temperature for 2 h, prior to analysis on standard sequencing gels, as described previously (20). The digestion products were run alongside a Sanger DNA sequence, generated using a double strand DNA template and a sequencing primer which corresponded to the 3' PCR primer used to generate the nuclease protection probe.

² The Human Gene Mapping Nomenclature Committee has assigned the following designations utilized in this report: ELAM-1, ELAM; LAM-1, LYAM; GMP-140, GRMP.
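The probe coordinates can be checked against the stated probe length with a small calculation. The sketch below is illustrative only: the function name is hypothetical, and it assumes the usual numbering convention in which position +1 is the transcription start site and there is no position 0.

```python
def fragment_length(start: int, end: int) -> int:
    """Length in bp of a fragment spanning start..end (inclusive).

    Positions are numbered relative to the transcription start site (+1),
    and position 0 is assumed not to exist.
    """
    if start == 0 or end == 0:
        raise ValueError("position 0 does not exist in this numbering")
    if start > end:
        raise ValueError("start must not exceed end")
    length = end - start + 1
    # A span crossing the start site would otherwise count the
    # nonexistent position 0, so remove it.
    if start < 0 < end:
        length -= 1
    return length


# Probe described as extending from -384 to +179.
probe_bp = fragment_length(-384, 179)
print(probe_bp)
```

Under that convention, a fragment from -384 to +179 comes out at 563 bp, which agrees with the probe size given in the text.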

Chromosomal Mapping-Southern blots were prepared from BamHI-digested DNA from 41 human-mouse somatic hybrid cell lines, as well as parental human and mouse cell lines. The hybrids were derived from 18 unrelated human cell lines and four mouse cell lines (27-29). The hybrids were characterized by karyotypic analysis and by mapped enzyme markers (27, 28, 30). Blots were hybridized with a restriction fragment of the ELAM-1 cDNA labeled by random priming, washed at high stringency, and autoradiographed.

RESULTS

Isolation of the Human ELAM Gene-Overlapping phage clones containing the ELAM gene were isolated by screening two genomic libraries with an ELAM-1 cDNA (Fig. 1). Restriction maps for EcoRI were determined and refined by digestion with additional restriction enzymes and Southern blotting with cDNA probes or specific oligonucleotides. Nucleotide sequence analysis was facilitated by subcloning EcoRI restriction fragments into plasmid vectors. The sizes of the subcloned EcoRI fragments corresponded to the sizes of the hybridizing bands observed on a Southern blot of EcoRI cut human DNA probed with the ELAM-1 cDNA (data not shown); this pattern was consistent with a single copy of the gene being present in the human (haploid) genome.

Structural Organization of the ELAM-1 Gene-All of the exons, intron-exon boundaries, and portions of the introns were sequenced. The location and size of the 14 exons/introns are shown in Figs. 1 and 2, and in Table I. Exons range in size from 22 to 1,869 bp, while introns range from 106 to about 1,300 bp in length. Excluding the non-coding exon 14, the average length of an ELAM exon is 158 bp, which is consistent with the reported average exon size of 137 base pairs (31). Splice acceptor and donor sequences (Table II) agree with the "GT-AG" rule (32) and conform to the consensus proposed by Mount (33). In the coding region, no splice junctions occur between amino acid codons (type 0), 10 (91%) occur after the first nucleotide (type I), and 1 (9%) occurs after the second nucleotide of a codon (type II) (34). This can be compared to the values of 41% type 0, 36% type I, and 23% type II previously reported for vertebrate genes (35).
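The summary figures in this paragraph can be recomputed from the Table I entries. The sketch below is a hedged illustration, not the authors' calculation: it assumes the 158-bp average was taken over exons 2-13 (the length of exon 1 is not legible in the table as reproduced here), and the variable names are arbitrary.

```python
from collections import Counter

# Exon lengths (bp) for exons 2-13 as listed in Table I.
# Assumption: exon 1 is left out because its length is not given here;
# exon 14 is the non-coding 3' exon that the text excludes.
exon_bp = {2: 85, 3: 384, 4: 108, 5: 186, 6: 186, 7: 189, 8: 189,
           9: 189, 10: 177, 11: 108, 12: 22, 13: 73}

# Intron types for the introns that interrupt the coding region (Table I).
intron_type = {2: "I", 3: "I", 4: "I", 5: "I", 6: "I", 7: "I",
               8: "I", 9: "I", 10: "I", 11: "I", 12: "II"}

mean_exon_bp = sum(exon_bp.values()) / len(exon_bp)

counts = Counter(intron_type.values())
total = sum(counts.values())
percent = {t: round(100 * counts.get(t, 0) / total) for t in ("0", "I", "II")}

print(mean_exon_bp)
print(percent)
```

With these inputs the mean is 158 bp, and the 11 coding-region introns split into 0% type 0, 91% type I, and 9% type II, matching the values quoted above.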

FIG. 1. Structural organization of the human ELAM-1 gene. Exons are indicated by filled boxes and introns as well as 5'- and 3'-flanking sequences by lines. The locations of the putative domains are designated. EcoRI sites (R) present in the gene also are indicated. The sequencing strategy used is shown above the positions of the domains; the direction and distance sequenced are shown by horizontal arrows. SP, signal peptide; TM, transmembrane; UT, untranslated; KB, kilobases.


TABLE I
Location and size of exons and introns in the human ELAM-1 gene

Numbering of residues corresponds to Fig. 2. Amino acids with interrupted codons were assigned to the exon containing two of the three codon nucleotides. Ut, untranslated; Tm, transmembrane.

Exon  Length (bp)  Amino acids  Domain           Intron  Length (bp)ᵃ  Typeᵇ
1                               5' ut            1       308
2     85           12           Signal peptide   2       370*          I
3     384          128          Lectin           3       770*          I
4     108          36           EGF              4       1,200+        I
5     186          62           CR1              5       850*          I
6     186          62           CR2              6       113           I
7     189          63           CR3              7       800+          I
8     189          63           CR4              8       130           I
9     189          63           CR5              9       212           I
10    177          59           CR6              10      524+          I
11    108          36           Tm               11      650+          I
12    22           7            Cytoplasmic      12      105           II
13    73           19           cyto + 3' ut     13      1,300+
14    1,869                     3' ut

ᵃ An asterisk (*) indicates that the size of the intron was determined by PCR; a plus sign (+) indicates the intron size was measured by restriction enzyme digestion.

ᵇ Intron type is according to Sharp (1981); 0 indicates a splice between codons, I indicates a splice after the first nucleotide of a codon, and II indicates that the splice occurs after the second nucleotide.

Two nucleotide differences were noted between the ELAM-1 cDNA sequence (5) and the genomic sequence. A difference at nucleotide 4277 (T to C) results in a change in amino acids (tyrosine to histidine) in CR5. A second difference was identified at 5822 (C to T) in the 3'-untranslated region. These differences probably do not represent polymorphisms at these sites, since sequence analysis of additional endothelial cDNAs identifies the same nucleotide changes (36, 37).

The ELAM gene structure suggests a correlation between exon/intron architecture and protein structure. Exon 1 contains part of the 5'-untranslated region. Exon 2 encodes the remainder of the 5'-untranslated region and most of the hydrophobic signal peptide. Exon 3 contains the last eight amino acids of the signal peptide as well as the lectin domain. Exon 4 contains the epidermal growth factor domain. The six tandem repeats of about 60 amino acids (CR1-CR6) are found in exons 5-10. Exon 11 encodes the putative transmembrane region, and the cytoplasmic domain is contained in exons 12 and 13. The large 3'-untranslated region is contained in exons 13 and 14.

At the 3' end of the ELAM gene, exon 14 contains a single consensus polyadenylation signal. Poly(A) addition occurs 13 bp downstream from the AATAAA motif at a T residue (5). Although C(A) is the preferred sequence for polyadenylation sites, T(A) is frequently used (38). Located 5 bp downstream of the polyadenylation site is the sequence TGTGTTAA; this sequence is similar to the consensus sequence YGTGTTYY frequently found downstream of the polyadenylation site (39). The ELAM gene did not have the striking representation of the trinucleotide TGT in conjunction with oligo-T stretches (a "G/T cluster") frequently seen in certain other genes in the region 30 bp downstream from the AATAAA motif (38).
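The comparison between the downstream sequence and its consensus can be made explicit position by position. The following is a minimal sketch using the standard IUPAC nucleotide codes (Y = C or T); the helper name is hypothetical and only the two sequences quoted above are taken from the text.

```python
# Standard IUPAC nucleotide codes needed for this comparison.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "AG", "Y": "CT", "N": "ACGT"}


def matches_per_position(seq: str, consensus: str) -> list[bool]:
    """Return, for each position, whether the base fits the consensus code."""
    if len(seq) != len(consensus):
        raise ValueError("sequence and consensus must have equal length")
    return [base in IUPAC[code] for base, code in zip(seq, consensus)]


hits = matches_per_position("TGTGTTAA", "YGTGTTYY")
print(sum(hits), "of", len(hits), "positions fit the consensus")
```

The first six positions fit and the final two (A, A where pyrimidines are expected) do not, which is why the text describes the sequence as similar to, rather than matching, the consensus.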

Identification of the Transcriptional Initiation Site-The transcriptional start site of the ELAM-1 gene was defined by primer extension analysis as well as S1 nuclease protection.

FIG. 2. Nucleotide sequence of the human ELAM-1 gene. Sequences of exons, 5'-flanking region, and intron/exon boundaries are shown. The ELAM-1 transcriptional start site was determined by primer extension (see Fig. 3) and is indicated. The organization of the ELAM-1 exons and introns was determined by comparison with the sequence of the ELAM-1 cDNA. Exons are enclosed within boxes. The amino acid sequence is shown below the nucleotide sequence. The TATAA element and the consensus polyadenylation signal (AATAAA) are boxed. IVS refers to intervening sequence.


TABLE II
Intron-exon splice junction sequences in the human ELAM-1 gene

Intron      Splice junction sequences
            Exon     Intron                               Exon
Consensusᵃ  (C/A)AG  GT(A/G)AGT . . . . . . (T/C)11 N (C/T)AG
1           AACG     gtaagt . . . . . . gttttatctccccag   GAAA
2           TTGG     gtaagt . . . . . . tttttctcccactag   TGCT
3           ACAG     gtaggg . . . . . . ttgtattttccgtag   CTGC
4           CAAA     gtaagt . . . . . . gggtttctttttcag   TTGT
5           AATG     gtaaat . . . . . . ctttcaaatcctcag   TGGT
6           AAAG     gtagag . . . . . . gtatatttgttacag   CTGT
7           GAAG     gtaagc . . . . . . attttgatgtctcag   CTTT
8           GAAG     gtacag . . . . . . tctcgtgtgttccag   CTGT
9           CAAG     gtagaa . . . . . . acctaacatttgcag   TGGT
10          GAAG     gtgatg . . . . . . ctcttacttccttag   CTCC
11          AAAG     gtgagg . . . . . . tcttttcttttgcag   CAAA
12          CCAG     gtaagt . . . . . . tcatgtattccacag   CAGC
13          TCAG     gtaaga . . . . . . tgtttatatttacag   AAAC

ᵃ Consensus sequence from Mount (1982).
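The "GT-AG" rule can be checked mechanically against the intron ends listed in the table. The sketch below is illustrative; the function name is hypothetical, and the sequences are the 5' and 3' intron ends transcribed from the 13 rows above.

```python
# (5' intron end, 3' intron end) for introns 1-13, as listed in the table.
junctions = [
    ("gtaagt", "gttttatctccccag"),
    ("gtaagt", "tttttctcccactag"),
    ("gtaggg", "ttgtattttccgtag"),
    ("gtaagt", "gggtttctttttcag"),
    ("gtaaat", "ctttcaaatcctcag"),
    ("gtagag", "gtatatttgttacag"),
    ("gtaagc", "attttgatgtctcag"),
    ("gtacag", "tctcgtgtgttccag"),
    ("gtagaa", "acctaacatttgcag"),
    ("gtgatg", "ctcttacttccttag"),
    ("gtgagg", "tcttttcttttgcag"),
    ("gtaagt", "tcatgtattccacag"),
    ("gtaaga", "tgtttatatttacag"),
]


def obeys_gt_ag(donor: str, acceptor: str) -> bool:
    """True if the intron starts with GT and ends with AG."""
    return donor.lower().startswith("gt") and acceptor.lower().endswith("ag")


# Collect the intron numbers (1-based) that break the rule, if any.
violations = [n for n, (d, a) in enumerate(junctions, start=1)
              if not obeys_gt_ag(d, a)]
print(violations)
```

An empty list means every listed intron begins with GT and ends with AG, in line with the statement in Results.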

The primer extension analysis was performed on IL-1-treated endothelial cell poly(A)+ RNA using an end-labeled primer complementary to the 5' end of the ELAM-1 mRNA. Marker lanes display dideoxy DNA sequencing reactions primed from the same site on a plasmid subclone of this genomic region. Primer extension analysis reveals a major transcriptional initiation site corresponding to an A nucleotide (Fig. 3A). The nucleotide pair at this site in the ELAM-1 gene (A preceded by C) is the sequence most commonly found at eukaryotic transcriptional start sites (32). Also seen is a second prominent band which corresponds to the adjacent G residue, as well as several minor bands. The ELAM-1 start site was also mapped using an end-labeled fragment (extending from position -384 to +179, see Fig. 3B) and S1 nuclease protection analysis. After the 1.5 bp correction for the difference in mobility between fragments generated by DNA sequencing versus nuclease protection (40), the protected fragments are consistent with the start sites seen by primer extension. Both techniques place the initiation of ELAM-1 transcription about 30 base pairs downstream of a consensus “TATA box” sequence.

Identification of Potential cis-Acting Regulatory Elements-Examination of the human ELAM gene sequence 5' to the transcription initiation site reveals several notable features. Located at position -97 is an inverted consensus CCAAT sequence (ATTGG). Many promoters contain a CCAAT element approximately 70-80 bp upstream from the transcriptional start site (32, 41). This element can function in either orientation and ensures optimal promoter activity (42). The sequence GGGGATTTCC at position -94 is homologous to the consensus sequence (GGGR(C/A/T)TYYCC) for NF-κB binding (reviewed in Ref. 43). A sequence element (CCTGGGA) common to some acute phase reactant genes (44) was not found in the ELAM-1 promoter. Located just upstream of the TATAA box in the ELAM promoter is the sequence AGGAAG. This purine-rich sequence is recognized by a cellular factor designated polyomavirus enhancer A-binding protein (45). The sequence TGAGTCA, corresponding to a sequence identified as a binding site for the transcription factor AP-1, was found at position -495 (data not shown).
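Two of the sequence comparisons in this paragraph are easy to reproduce: the fit of GGGGATTTCC to the NF-κB consensus, and the relationship between ATTGG and CCAAT. The sketch below is a hedged illustration. It writes the consensus with the IUPAC code H for (C/A/T), and the helper names are hypothetical.

```python
import re

# IUPAC codes translated to regular-expression character classes.
IUPAC_RE = {"A": "A", "C": "C", "G": "G", "T": "T",
            "R": "[AG]", "Y": "[CT]", "H": "[ACT]", "N": "[ACGT]"}


def consensus_to_regex(consensus: str) -> re.Pattern:
    """Compile an IUPAC consensus string into a regular expression."""
    return re.compile("".join(IUPAC_RE[code] for code in consensus))


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


# GGGR(C/A/T)TYYCC written with H standing for C/A/T.
nfkb = consensus_to_regex("GGGRHTYYCC")
print(bool(nfkb.fullmatch("GGGGATTTCC")))

# The inverted CCAAT box is the CCAAT element read on the opposite strand.
print(reverse_complement("CCAAT"))
```

The promoter element satisfies the consensus at all ten positions, and the reverse complement of CCAAT is ATTGG, the inverted element reported at position -97.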

Chromosomal Location of the ELAM-1 Gene-DNA isolated from human-mouse somatic hybrid cell lines and their parental cells was examined for the presence or absence of the human ELAM gene by Southern blot techniques. The patterns of restriction fragments in the human and murine DNA hybridizing the ELAM-1 cDNA were readily distinguishable. In BamHI-digested DNA, human bands of 10.3 and 8.9 kb were noted while the mouse had a single faint band of 8.7 kb. Forty-one hybrids were examined for the presence or absence of human ELAM-1 sequences (Table III). Scoring for the human ELAM gene was determined by the presence or absence of bands corresponding to the human gene in the hybrids on the blots. Concordant hybrids have retained (or lost) the human bands together with a specific human chromosome. Discordant hybrids have either retained human bands but not a specific chromosome, or the converse. Percent discordancy indicates the degree of discordant segregation for a marker and a chromosome. A 0% discordancy is the basis for chromosomal assignment. The presence of the ELAM gene correlated with the presence of human chromosome 1. Analysis of hybrids carrying chromosomal fragments further refined this assignment to the q12→qter region of chromosome 1.
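The scoring scheme described here reduces to counting hybrids in which the gene and a given chromosome disagree. The sketch below illustrates the calculation only. The six-hybrid panel is entirely hypothetical (it is not the data of Table III), and the function and variable names are made up for the example.

```python
def percent_discordancy(gene_present: list[bool],
                        chromosome_present: list[bool]) -> float:
    """Percentage of hybrids in which gene and chromosome scores disagree."""
    if len(gene_present) != len(chromosome_present):
        raise ValueError("both score lists must cover the same hybrids")
    if not gene_present:
        raise ValueError("at least one hybrid is required")
    discordant = sum(g != c for g, c in zip(gene_present, chromosome_present))
    return 100 * discordant / len(gene_present)


# Hypothetical panel of six hybrids: True = human bands / chromosome present.
gene = [True, True, False, False, True, False]
panel = {
    "chromosome A": [True, True, False, False, True, False],  # segregates with gene
    "chromosome B": [True, False, False, True, True, False],  # two mismatches
}

for name, scores in panel.items():
    print(name, round(percent_discordancy(gene, scores), 1))
```

In this toy panel only "chromosome A" shows 0% discordancy, which is the criterion the text uses for assigning a gene to a chromosome.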

DISCUSSION

Regulation of ELAM Gene Expression-The pattern of ELAM gene expression is remarkable in that its product is dramatically increased by cytokines such as IL-1 and tumor necrosis factor (TNF) (3, 5). Analysis of the gene structure has provided some insights into the potential mechanism of this process. Contained in the ELAM promoter is a sequence GGGGATTTCC which may play an important role in the cytokine response. This ELAM promoter element conforms completely to a consensus site for NF-κB binding (GGGR(C/A/T)TYYCC) (43) and is identical to the sequence found in the human Ig κ enhancer (47). At least three DNA binding activities, NF-κB (43), H2TF1 (48), and PRDII-BF1 (49), specifically recognize this DNA sequence. One of these DNA-binding proteins, NF-κB, has been shown to participate in the cytokine-mediated increased expression of an acute phase reactant, human serum amyloid A protein, from the liver (50). NF-κB may also be involved in the lipopolysaccharide-mediated transcriptional activation of the TNF-α gene in primary macrophages (51). Preliminary experiments using gel shift analysis indicate that an IL-1 inducible factor is present in nuclear extracts of endothelial cells which binds to this region of the ELAM promoter.³

³ T. Collins, manuscript in preparation.


FIG. 3. Mapping of the ELAM-1 transcriptional start site. A, the transcriptional start site was determined by primer extension analysis. An oligonucleotide corresponding to the 5' end of the ELAM-1 cDNA was hybridized to IL-1-treated human umbilical vein endothelial cell poly(A)+ RNA, or tRNA. The products of annealing served as templates for reverse transcriptase. The extension products were run on a denaturing polyacrylamide gel alongside a Sanger sequence primed on a plasmid DNA template using the same primer as that used in the reverse transcription reaction. B, strategy for the mapping of the 5' end of the ELAM-1 gene by S1 nuclease digestion. C, S1 nuclease protection mapping of the 5' end of the ELAM-1 gene mRNA. Cytoplasmic RNA from IL-1-treated endothelial cells was hybridized to a 563-bp 5' ³²P-labeled probe spanning the immediate 5'-flanking region of the ELAM-1 gene. The resulting heteroduplexes were incubated with S1 and the protected fragments were analyzed on a standard sequence gel. The digestion products were run alongside a Sanger DNA sequence primed on a plasmid DNA template using as a primer the 3' polymerase chain reaction (PCR) oligonucleotide. The lanes, which have not been corrected (40), are labeled as described for part A.

Chromosomal Localization of the ELAM Gene-In addition to ELAM-1, both GMP-140 (the corresponding gene has been designated GRMP)² and the third member of the selectin gene family, LAM-1 (LYAM), have been localized to the long arm of chromosome 1 (9, 52). Recently, the members of the selectin gene family have been physically linked on chromosome 1 (53). In that study, GMP-140 was localized to chromosome 1, bands q21-24. Long range restriction mapping studies were then performed to link ELAM to the gene encoding GMP-140. In the present study, the ELAM gene was directly mapped to the long arm of chromosome 1 by an independent approach, analysis of somatic cell hybrids. Interestingly, the cluster of genes designated the regulators of complement activation (reviewed in Ref. 54), which include complement receptor 1 (CR1), complement receptor 2 (CR2), decay accelerating factor, membrane cofactor protein, factor H, and C4-binding protein, are encoded at the q32 band on the long arm of chromosome 1 (55). These genes consist primarily of a tandemly repeated motif (short consensus repeat), composed of about 60 amino acids. These repetitive elements are related to those found in the selectin gene family, although they contain 4 cysteines instead of the 6 found in the selectins (5-10). Using pulsed field gel electrophoresis analysis, the CR1, CR2, decay accelerating factor, and complement component C4-binding protein gene (C4BP) have been physically linked and aligned in an 800-kb DNA segment (56). The structural homology, as well as the presence of both the selectin and regulators of complement activation gene families on the long arm of chromosome 1, raises the possibility that these genes have a common evolutionary history.

The Domain Structure of ELAM-1 Correlates with the Structure of the Gene-It has been suggested that protein evolution could be facilitated if exons coded for functional or structural units in proteins (57, 58). As gene sequences accumulated, evidence for and against the concept of the exon as a unit of structure and function was reported (reviewed in Ref. 59). The striking correlation between the exon organization of ELAM and the protein functional elements is consistent with the proposal that the exons correspond to discrete structural/functional domains in the ELAM-1 protein. On the schematic representation of the ELAM-1 molecule, shown in Fig. 4, the various protein domains and the placement of the introns are aligned. The introns interrupt the protein coding sequence in such a way that all of the protein segments are revealed as products of individual exons. The entire lectin domain (as well as part of the signal sequence) is contained within a single exon, similar to the lectin domains found in the human pulmonary surfactant protein (60) and a rat mannose-binding protein (61). The epidermal growth factor domain of ELAM is encoded by a separate exon, as is the case for the human epidermal growth factor precursor (62), blood coagulation factors (63), and the cartilage matrix protein (64). The six tandem CR motifs of about 60 amino acids present in ELAM are also contained in separate exons. This is similar to the general pattern seen within the regulators of complement activation gene cluster, where each CR motif is also encoded by a single separate exon (65). Like that of many membrane proteins, the ELAM transmembrane domain is contained within a single exon. This domain consists of 23 hydrophobic amino acids having a tendency for α-helix formation; no charged residues are present that might promote membrane protein-protein interactions. The cytoplasmic domain is contained in exons 12 and 13. This region (32 amino acids) is too small to encode an enzyme, although it may mediate interaction of ELAM-1 with the endothelial cytoskeleton. Some of the 6 serine and 2 tyrosine residues present in this domain may be phosphorylation sites. As originally proposed, the existence of introns permits functional domains encoded by discrete exons to shuffle between different proteins (57, 58). This allows proteins to evolve as new combinations of preexisting functional units. Analysis of the ELAM gene supports the concept that ELAM-1 is a mosaic protein whose gene consists in part of exons that have been borrowed from other genes.

FIG. 4. Schematic diagram of the domain structure of the ELAM-1 gene and the positions of the introns. The positions of the introns are marked with arrowheads. LECTIN and EGF, lectin and epidermal growth factor-like domains, respectively; 1-6 designate the 60-residue complement regulatory-like repeats; TM refers to the transmembrane domain, and CYTO designates the cytoplasmic domain; nt., nucleotide; a.a., amino acid.

Strikingly, virtually all of the splice junctions within the coding region of ELAM-1 are between the first and second nucleotides of a codon (Table I), defining these as phase 1 introns (34). Thus, most of these exons could be spliced in or out of the mRNA without disturbing the reading frame. Such alternative splicing could produce soluble or cell surface molecules with different affinities or specificities for circulating leukocytes. For example, a soluble form of the selectin GMP-140 is capable of blocking activated neutrophil adhesion to endothelium (66). Two cDNAs for GMP-140 have been identified, one of which predicts a soluble form lacking the transmembrane domain (12). Recent analysis of the GMP-140 gene reveals that this form of the molecule is derived by alternative splicing (Johnston et al., 1990, J. Biol. Chem. 265, 21381-21385).
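The reading-frame argument can be made concrete with a little arithmetic. The sketch below is illustrative only and assumes that exon lengths are counted in coding nucleotides starting from the first base of the initiation codon; the exon lengths used in the usage note are hypothetical and are not taken from Table I. The phase of an intron is the number of codon bases that precede it (0, 1, or 2), and an internal exon can be removed or duplicated without shifting the downstream frame only when its two flanking introns have the same phase, which is equivalent to its length being a multiple of 3.

```python
from itertools import accumulate


def intron_phases(exon_lengths):
    """Phase (0, 1 or 2) of the intron that follows each exon but the last.

    `exon_lengths` lists coding nucleotides per exon, counted from the
    first base of the start codon (an assumption of this sketch).
    """
    return [total % 3 for total in accumulate(exon_lengths[:-1])]


def frame_preserved(exon_lengths, index):
    """True if internal exon `index` (0-based) can be skipped or duplicated
    without shifting the downstream reading frame."""
    if not 0 < index < len(exon_lengths) - 1:
        raise ValueError("index must refer to an internal exon")
    phases = intron_phases(exon_lengths)
    # Introns flanking the exon: phases[index - 1] upstream, phases[index] downstream.
    return phases[index - 1] == phases[index]


def skip_exon(exon_lengths, index):
    """Return the exon lengths with one internal exon removed."""
    return exon_lengths[:index] + exon_lengths[index + 1:]
```

For a hypothetical gene with coding exon lengths `[100, 186, 186, 95]`, `intron_phases` gives `[1, 1, 1]`, so either internal exon can be skipped or duplicated in frame, and the phases of the remaining introns are unchanged after `skip_exon`. With lengths `[100, 185, 187, 95]` the phases are `[1, 0, 1]`, and removing the second exon would shift the frame.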

While the evidence is indirect, the internal repetition of the selectin CR motif, similarities in exon structure, sequence comparisons between selectin family members, and potential clustering of the gene family members in a chromosomal region suggest that gene duplication probably played a part in the evolution of this gene family. Analysis of the splice junctions of the exons in the ELAM gene is also consistent with this proposal. Only exons that have introns of the same phase class at their 5' and 3' ends can be inserted, deleted, or duplicated by intronic recombination. Otherwise, disruption of the reading frame would result. Since most of the introns in the ELAM gene interrupt codons at the same location, that is, after the first base of the codon, the architecture of the gene would permit the insertion or duplication of the various modules in a preexisting gene without disruption of the reading frame. Therefore, the six exons comprising the coding regions for the CR1-6 domains could have arisen from duplication of an ancestral exon. The other members of the selectin family, LAM-1 (8-11) and GMP-140 (12), contain two and nine complement regulatory motifs, respectively. Interestingly, the recent report (67) characterizing the human LYAM gene is consistent with this proposal. This member of the selectin family also exhibits the striking correlation between putative functional domains and exon-intron boundaries. The phase of the introns between the putative functional domains in LYAM is the same as that found in ELAM-1, namely, phase 1. Similarly, analysis of exon-intron boundaries in GRMP reveals the same correlation with putative functional domains as well as phase 1 junctions (Johnston et al., 1990). This suggests that conservation of intron phase is a general property of the members of the selectin gene family. These findings are consistent with the proposal that the selectins evolved as a result of gene duplication and raise the possibility that other members of the family may exist. Furthermore, phase 1 introns are also found in other genes with encoded modules (e.g. growth factor domains) similar to those found in the selectins (68), suggesting that the primordial exons encoding these motifs may have been the progenitors for all of the exons encoding these sequences throughout the genome.

Johnston, G. I., Bliss, G. A., Newman, P. J., and McEver, R. P. (1990) J. Biol. Chem. 265, 21381-21385.

Acknowledgments-We would like to thank Dr. J. W. U. Fries for enthusiastic support and Drs. D. Bonthron, M. Cybulsky, D. Dorfman, J. Lawler, and S. Orkin for critical reading of the manuscript and helpful comments.

REFERENCES
1. Cotran, R. S. (1987) Am. J. Pathol. 129, 407-413
2. Pober, J. S. (1988) Am. J. Pathol. 133, 426-433
3. Bevilacqua, M. P., Pober, J. S., Mendrick, D. L., Cotran, R. S., and Gimbrone, M. A., Jr. (1987) Proc. Natl. Acad. Sci. U. S. A. 84, 9238-9242
4. Cotran, R. S., Gimbrone, M. A., Jr., Bevilacqua, M. P., Mendrick, D. L., and Pober, J. S. (1986) J. Exp. Med. 164, 661-666
5. Bevilacqua, M. P., Stengelin, S., Gimbrone, M. A., Jr., and Seed, B. (1989) Science 243, 1160-1165
6. Lasky, L. A., Singer, M. S., Yednock, T. A., Dowbenko, D., Fennie, C., Rodriguez, H., Nguyen, T., Stachel, S., and Rosen, S. D. (1989) Cell 56, 1045-1055
7. Siegelman, M. H., van de Rijn, M., and Weissman, I. L. (1989) Science 243, 1165-1172
8. Camerini, D., James, S. P., Stamenkovic, I., and Seed, B. (1989) Nature 342, 78-82
9. Tedder, T. F., Isaacs, C. M., Ernst, T. J., Demetri, G. D., Adler, D. A., and Disteche, C. M. (1989) J. Exp. Med. 170, 123-133
10. Bowen, B. R., Nguyen, T., and Lasky, L. A. (1989) J. Cell Biol. 109, 421-427
11. Siegelman, M. H., and Weissman, I. L. (1989) Proc. Natl. Acad. Sci. U. S. A. 86, 5562-5566
12. Johnston, G. I., Cook, R. G., and McEver, R. P. (1989) Cell 56, 1033-1044
13. Hsu-Lin, S.-C., Berman, C. L., Furie, B. C., August, D., and Furie, B. (1984) J. Biol. Chem. 259, 9121-9126
14. Geng, J-G., Bevilacqua, M. P., Moore, K. L., McIntyre, T. M., Prescott, S. M., Kim, J. M., Bliss, G. A., Zimmerman, G. A., and McEver, R. P. (1990) Nature 343, 757-760
15. Larsen, E., Celi, A., Gilbert, G. E., Furie, B. C., Erban, J. K., Bonfanti, R., Wagner, D. D., and Furie, B. (1989) Cell 59, 305-312
16. Maniatis, T., Hardison, R. C., Lacy, E., Lauer, J., O'Connell, C., Quon, D., Sim, G. K., and Efstratiadis, A. (1978) Cell 15, 678-688
17. Bonthron, D. T., Morton, C. C., Orkin, S. H., and Collins, T. (1988) Proc. Natl. Acad. Sci. U. S. A. 85, 1492-1496
18. Feinberg, A. P., and Vogelstein, B. (1983) Anal. Biochem. 132, 6-13
19. Duby, A. (1987) in Current Protocols in Molecular Biology (Ausubel, F. M., Brent, R., Kingston, R. E., Moore, D. D., Seidman, J. G., Smith, J. A., and Struhl, K., eds) pp. 6.1.1-6.1.4, John Wiley & Sons, New York
20. Sambrook, J., Fritsch, E. F., and Maniatis, T. (1989) Molecular Cloning: A Laboratory Manual, pp. 2-108-2-117, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY
21. Biggin, M. D., Gibson, T. J., and Hong, G. F. (1983) Proc. Natl. Acad. Sci. U. S. A. 80, 3963-3965
22. Tabor, S., and Richardson, C. C. (1987) Proc. Natl. Acad. Sci. U. S. A. 84, 4767-4771
23. Saiki, R. K., Bugawan, T. L., Horn, G. T., Mullis, K. B., and Erlich, H. A. (1986) Nature 324, 163-166
24. Gimbrone, M. A., Jr. (1976) in Progress in Hemostasis and Thrombosis (Spaet, T. H., ed) Vol. 3, pp. 1-28, Grune & Stratton, New York
25. Maciag, T., Hoover, G. A., Stemerman, M. B., and Weinstein, R. (1981) J. Cell Biol. 91, 420-426
26. Thornton, S. C., Mueller, S. N., and Levine, E. M. (1983) Science 222, 623-625
27. Shows, T. B., Sakaguchi, A. Y., and Naylor, S. L. (1982) in Advances in Human Genetics, Vol. 12 (Harris, H., and Hirschhorn, K., eds) pp. 341-452, Plenum Publishing Co., New York
28. Shows, T., Eddy, R., Haley, L., Byers, M., Henry, M., Fujita, T., Matsui, H., and Taniguchi, T. (1984) Somatic Cell Mol. Genet. 10, 315-318
29. Shows, T. B., Brown, J. A., Haley, L. L., Byers, M. G., Eddy, R. L., Cooper, E. S., and Goggin, A. P. (1978) Cytogenet. Cell Genet. 21, 99-104
30. Shows, T. B. (1983) in Isozymes: Current Topics in Biological and Medical Research (Rattazzi, M. C., Scandalios, J. G., and Whitt, G. S., eds) Vol. 10, pp. 323-339, Alan R. Liss, New York
31. Hawkins, M. D. (1988) Nucleic Acids Res. 16, 9893-9903
32. Breathnach, R., and Chambon, P. (1981) Annu. Rev. Biochem. 50, 349-383
33. Mount, S. M. (1982) Nucleic Acids Res. 10, 459-472
34. Sharp, P. A. (1981) Cell 23, 643-646
35. Smith, M. W. (1988) J. Mol. Evol. 27, 45-55
36. Hession, C., Osborn, L., Goff, D., Chi-Rosso, G., Vassallo, C., Pasek, M., Pittack, C., Tizard, R., Goelz, S., McCarthy, K., Hopple, S., and Lobb, R. (1990) Proc. Natl. Acad. Sci. U. S. A. 87, 1673-1677
37. Polte, T., Newman, W., and Gopal, T. V. (1990) Nucleic Acids Res. 18, 1083
38. Birnstiel, M. L., Busslinger, M., and Strub, K. (1985) Cell 41, 349-359
39. McLauchlan, J., Gaffney, D., Whitton, J. L., and Clements, J. B. (1985) Nucleic Acids Res. 13, 1347-1368
40. Sollner-Webb, B., and Reeder, R. H. (1979) Cell 18, 485-499
41. Maniatis, T., Goodbourn, S., and Fischer, J. A. (1987) Science 236, 1237-1245
42. Graves, B. J., Johnson, P. F., and McKnight, S. L. (1986) Cell 44, 565-576
43. Lenardo, M. J., and Baltimore, D. (1989) Cell 58, 227-229
44. Adrian, G. S., Korinek, B. W., Bowman, B. H., and Yang, F. (1986) Gene (Amst.) 49, 167-175
45. Martin, M. E., Piette, J., Yaniv, M., Tang, W-J., and Folk, W. R. (1988) Proc. Natl. Acad. Sci. U. S. A. 85, 5839-5843
46. Vogt, P., and Bos, T. (1990) Adv. Cancer Res. 85, 1-18
47. Emorine, L., Kuehl, M., Weir, L., Leder, P., and Max, E. E. (1983) Nature 304, 447-449
48. Baldwin, A. S., and Sharp, P. A. (1988) Proc. Natl. Acad. Sci. U. S. A. 85, 723-727
49. Fan, C-M., and Maniatis, T. (1990) Genes & Dev. 4, 29-42
50. Edbrooke, M. R., Burt, D. W., Cheshire, J. K., and Woo, P. (1989) Mol. Cell. Biol. 9, 1908-1916
51. Shakhov, A. N., Collart, M. A., Vassalli, P., Nedospasov, S. A., and Jongeneel, C. V. (1990) J. Exp. Med. 171, 35-47
52. Johnston, G. I., LeBeau, M. M., Lemons, R. S., and McEver, R. P. (1988) Blood 72, 327 (abstr.)
53. Watson, M. L., Kingsmore, S. F., Johnston, G. I., Siegelman, M. H., Le Beau, M. M., Lemons, R. S., Bora, N. S., Howard, T. A., Weissman, I. L., McEver, R. P., and Seldin, M. F. (1990) J. Exp. Med. 172, 263-272
54. Ahearn, J. M., and Fearon, D. T. (1989) Adv. Immunol. 46, 183-219
55. Weis, J. H., Morton, C. C., Bruns, G. A. P., Weis, J. J., Klickstein, L. B., Wong, W. W., and Fearon, D. T. (1987) J. Immunol. 138, 312-315
56. Rey-Campos, J., Rubinstein, P., and Rodriguez de Cordoba, S. (1988) J. Exp. Med. 167, 664-669
57. Gilbert, W. (1978) Nature 271, 501
58. Gilbert, W. (1985) Science 228, 823-824
59. Traut, T. W. (1988) Proc. Natl. Acad. Sci. U. S. A. 85, 2944-2948
60. White, R. T., Damm, D., Miller, J., Spratt, K., Schilling, J., Hawgood, S., Benson, B., and Cordell, B. (1985) Nature 317, 361-363
61. Drickamer, K., and McCreary, V. (1987) J. Biol. Chem. 262, 2582-2589
62. Sudhof, T. C., Russell, D. W., Goldstein, J. L., Brown, M. S., Sanchez-Pescador, R., and Bell, G. I. (1985) Science 228, 893-895
63. Foster, D. C., Yoshitake, S., and Davie, E. W. (1985) Proc. Natl. Acad. Sci. U. S. A. 82, 4673-4677
64. Kiss, I., Deak, F., Holloway, R. G., Jr., Delius, H., Mebust, K. A., Frimberger, E., Argraves, W. S., Tsonis, P. A., Winterbottom, N., and Goetinck, P. F. (1989) J. Biol. Chem. 264, 8126-8134
65. Kristensen, T., Ogata, R. T., Chung, L. P., Reid, K. B. M., and Tack, B. F. (1987) Biochemistry 26, 4668-4674
66. Gamble, J. R., Skinner, M. P., Berndt, M. C., and Vadas, M. A. (1990) Science 249, 414-417
67. Ord, D. C., Ernst, T. J., Zhou, L-J., Rambaldi, A., Spertini, O., Griffin, J., and Tedder, T. F. (1990) J. Biol. Chem. 265, 7760-7767
68. Patthy, L. (1987) FEBS Lett. 214, 1-7