

Proc. Natl. Acad. Sci. USA
Vol. 89, pp. 10925-10929, November 1992
Medical Sciences

Molecular cloning and primary structure of the human blood group RhD polypeptide

(erythrocyte membrane/RhD antigen/cDNA/gene analysis/PCR)

CAROLINE LE VAN KIM, ISABELLE MOURO, BAYA CHÉRIF-ZAHAR, VIRGINIE RAYNAL, CATHERINE CHERRIER, JEAN-PIERRE CARTRON*, AND YVES COLIN

Unité Institut National de la Santé et de la Recherche Médicale U76, Institut National de Transfusion Sanguine, 6 rue Alexandre Cabanel, 75015 Paris, France

Communicated by Vincente T. Marchesi, July 14, 1992

ABSTRACT The RH (rhesus) blood group locus from RhD-positive donors is composed of two homologous structural genes, one of which encodes the Cc and Ee polypeptides, whereas the other, which is missing in the RhD-negative condition, encodes the D protein that carries the major antigen of the RH system. Recently, different splicing isoforms transcribed from the CcEe gene were isolated. We now report the characterization of two other Rh clones, RhII and RhXIII, generated by alternative choices of poly(A) addition sites, that were identified as the RhD gene transcripts. That these cDNAs represented the RhD messenger and that the previously described Rh clones were derived from the CcEe gene was demonstrated by amplification of RhII/XIII sequences only from D-positive genomes and by cloning and sequencing of D- and CcEe-specific gene fragments. The predicted translation product of the RhD mRNA is a 417-amino acid protein (Mr = 45,500) that exhibited a membrane organization similar to that of the polypeptide encoded by the CcEe gene, with 13 bilayer-spanning domains. The D and Cc/Ee polypeptides differ by 36 amino acid substitutions (8.4% divergence), but the NH2- and COOH-terminal regions of the two proteins are well conserved. Similarly, five of the six cysteine residues of the Cc/Ee proteins were conserved in the D protein, including the unique exofacial cysteine, which is critical for antigenic reactivity. The sequence homology between the Cc/Ee and D proteins supports the concept that the genes encoding these polypeptides have evolved by duplication of a common ancestor gene.

The RH blood group system is important in transfusion and clinical medicine because it is involved in the hemolytic disease of the newborn, transfusion reactions, autoimmune hemolytic anemias, and hemolytic reactions of nonimmune origin (1-3). This system is recognized as one of the most complex polymorphisms in man (4), but until recently the structural basis of blood group antigens has remained poorly understood (for review, see ref. 5). Currently, individuals are classified as Rh-positive and Rh-negative according to the presence or the absence of the major D antigen on the surface of their erythrocytes, but >46 other antigens, including those of the Cc and Ee series, have been identified (6). The absence or the severe reduction of all of these antigens on erythrocytes from Rh-deficient individuals (Rhnull, Rhmod) is associated with morphological and functional abnormalities and a mild hemolytic anemia (3).

The D, c, and E epitopes are carried by at least three distinct homologous unglycosylated integral membrane proteins of apparent Mr 30,000-32,000 that share a common NH2-terminal sequence (reviewed in ref. 5). Using this structural information, the same Rh mRNA has been cloned independently by our group (7) and others (8) from a human bone marrow cDNA library (clones RhIXb and Rh30A, respectively). It has been shown subsequently by Southern blot analysis that the RH locus is composed of two homologous structural genes, one encoding the RhD polypeptide and the second encoding the Cc and Ee polypeptides (9), most likely by a mechanism of alternative splicing of a primary transcript (10). These findings are consistent with a two-locus model of Rh inheritance (11) and suggest that the two Rh genes have evolved through duplication of a common ancestor gene. In the genome of Rh-negative individuals the RhD gene is missing, either as a result of a gene deletion or possibly as a survival of the ancestral RH locus from before the duplication occurred (9).

The Rh antigen specificity of the protein encoded by the cDNAs cloned previously (7, 8) has not been formally established. Partial RhD protein sequence analysis suggested that neither clone encoded the postulated D polypeptide (8), but recent studies with polyclonal antibodies raised against synthetic peptides indicated that such a clone more likely encodes the E or e antigens (P. Hermand, I.M., M. Huet, C. Bloy, K. Suyama, J. Goldstein, J.-P.C., and P. Bailly, unpublished data). This report describes the characterization of the RhD gene product that encodes the major antigen RhD of the RH system.†

*To whom reprint requests should be addressed.
†The sequences reported in this paper have been deposited in the GenBank data base (accession nos. X63094 and X63097).

The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.

MATERIALS AND METHODS

cDNA Library Screening. A human bone marrow λgt11 cDNA library (Clontech) was screened with the previously isolated RhIXb cDNA (7).

Amplification of Genomic Sequences. One microgram of human leukocyte DNA from donors of different Rh phenotypes was used as template in PCR experiments (40 cycles: 94°C for 1 min, 46°C for 1 min, and 72°C for 1.5 min) as described (10) using a Perkin Elmer amplification kit (containing 2.5 units of Taq polymerase) and 300 ng of each primer. PCR A was performed between amplimers common to the RhIXb and RhXIII cDNAs: A1 (5'-GGTGTTGTAACCGAG-3'; sense primer, positions 941-955; +1 taken as the first residue of the initiator AUG) and A2 (5'-ATCATGCCATTGCCG-3'; antisense primer, positions 1076-1062). Amplification products were analyzed on agarose gel and hybridized at 45°C with an RhXIII-specific probe, pXIII (5'-GGCTCCGACGGTATC-3'; positions 1048-1062), or at 65°C with the Rh cDNA probe (an equimolar mixture of the RhIXb and RhXIII cDNAs). PCR B was performed between amplimers A3 (5'-TAAGCAAAAGCATCCAAGAA-3'; positions 1252-1271, sense primer common to RhIXb and RhXIII) and A-XIII (5'-ATGGTGAGATTCTCCTC-3'; positions 1437-1421, antisense primer specific for the RhXIII cDNAs), and the products were probed with the Rh cDNAs.
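The primer and probe coordinates quoted above can be cross-checked against the oligonucleotide lengths. The following is a minimal sketch in Python, not part of the original methods; it assumes that the hyphens inside the printed A1 and A2 sequences are line-break artifacts, and the helper names (`span_length`, `check_oligos`, `cycling_minutes`) are hypothetical.

```python
# Oligonucleotides as quoted in the Methods text: (sequence 5'->3', first
# position, last position). Positions are numbered from the first residue of
# the initiator AUG (+1); antisense primers are listed high-to-low.
OLIGOS = {
    "A1": ("GGTGTTGTAACCGAG", 941, 955),
    "A2": ("ATCATGCCATTGCCG", 1076, 1062),
    "pXIII": ("GGCTCCGACGGTATC", 1048, 1062),
    "A3": ("TAAGCAAAAGCATCCAAGAA", 1252, 1271),
    "A-XIII": ("ATGGTGAGATTCTCCTC", 1437, 1421),
}


def span_length(first, last):
    """Number of nucleotides covered by an inclusive coordinate range."""
    return abs(last - first) + 1


def check_oligos(oligos):
    """Map each oligo name to True if its length matches its coordinates."""
    return {
        name: len(seq) == span_length(first, last)
        for name, (seq, first, last) in oligos.items()
    }


def cycling_minutes(cycles, step_minutes):
    """Total hold time of a cycling program, ignoring temperature ramps."""
    return cycles * sum(step_minutes)


if __name__ == "__main__":
    print(check_oligos(OLIGOS))
    # 40 cycles of 1 min (94 C), 1 min (46 C), 1.5 min (72 C)
    print(cycling_minutes(40, (1, 1, 1.5)), "min of hold time")
```

All five oligonucleotides have lengths that agree with their stated coordinates, which supports reading the printed sequences without the internal hyphens.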

Southern Blot Analysis. Human leukocyte DNA was digested with EcoRI, resolved by agarose gel electrophoresis, and transferred to a nitrocellulose membrane (12). Hybridizations were performed at 65°C in 5x SSPE (1x SSPE = 0.18 M NaCl/10 mM sodium phosphate/1 mM EDTA), 3x Denhardt's solution, and 0.1% SDS with Rh exon-specific probes (9). Final washes were performed at 65°C for 15 min in 1x SSC (1x SSC = 0.15 M NaCl/15 mM sodium citrate) and 0.1% SDS.

Genomic Libraries. Three hundred micrograms of leukocyte DNA from a DCCee donor was digested by EcoRI and resolved by 0.6% agarose gel electrophoresis. Gel slices 1 cm thick were cut, DNA fractions were electroeluted, and an aliquot was hybridized with exon-specific probes. Three hundred nanograms of each positive fraction was ligated with 1 µg of λgt11 EcoRI arms and packaged in vitro. For each library, 3 x 10^5 plaques were transferred to nitrocellulose filters and hybridized with PCR probes in standard conditions.

DNA Sequencing. Genomic and cDNA inserts were subcloned into pUC18 vectors and sequenced on both strands by the dideoxy chain-termination method (13).
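The buffer definitions given for the Southern blot step translate directly into working concentrations. The short Python sketch below is only a worked illustration of that arithmetic; the function name `working_concentration` is hypothetical, and concentrations are kept in millimolar integers to avoid floating-point noise.

```python
# 1x compositions as defined in the text, expressed in mM.
SSPE_1X_MM = {"NaCl": 180, "sodium phosphate": 10, "EDTA": 1}
SSC_1X_MM = {"NaCl": 150, "sodium citrate": 15}


def working_concentration(one_x_mm, fold):
    """Scale a 1x recipe (mM per component) to the given fold strength."""
    return {component: mm * fold for component, mm in one_x_mm.items()}


if __name__ == "__main__":
    # Hybridization buffer: 5x SSPE
    print(working_concentration(SSPE_1X_MM, 5))
    # Final wash: 1x SSC
    print(working_concentration(SSC_1X_MM, 1))
```

Under these definitions the 5x SSPE hybridization buffer contains 0.9 M NaCl, 50 mM sodium phosphate, and 5 mM EDTA.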

RESULTS AND DISCUSSION

Preliminary attempts to characterize the RhD mRNA were carried out by an oligonucleotide approach using partial sequence information on the NH2-terminal 54 amino acids of the immunopurified RhD polypeptide published recently (8). Compared to the already cloned polypeptide presumably produced by the RhCcEe gene (7-9), the proposed RhD protein exhibits an identical NH2-terminal sequence up to residue 41 but diverges beyond this point (although it remains related). Different oligonucleotides deduced from the postulated RhD NH2-terminal sequence were used as primers in PCR experiments. Surprisingly, when reticulocyte and bone marrow RNA or leukocyte genomic DNA was used as template, no amplification product could be detected.

Isolation and Nucleotide Sequence of Putative RhD cDNA Clones. As the previously isolated RhIXb cDNA cross-hybridized with the two homologous Rh genes (D and CcEe) (7, 9), we have characterized all of the clones detected in the human bone marrow cDNA library with this probe. Two clones, RhVI and RhVIII, were clearly related to splicing isoforms of the RhIXb mRNA transcribed from the CcEe gene (10). Two other clones, RhII and RhXIII, were analyzed. Sequence analysis (Fig. 1) indicates that these cDNAs [1522 and 2789 base pairs (bp), respectively] differed only by the length of their 3' noncoding region (237 bp and 1504 bp, respectively). This is consistent with the detection of a faint 3- to 3.5-kilobase (kb) band in addition to a major 1.5- to 1.7-kb species on Northern blot of erythroblast RNAs (7). Two AATAAA poly(A) signals and a class IV Alu sequence were identified in the RhXIII 3' untranslated sequence. The AATAAA sequence at positions 2747-2752 precedes the poly(A) tail found at the 3' end of the RhXIII cDNA, whereas the other AATAAA signal at positions 1462-1467 is used to generate the RhII cDNA. These data indicate that RhII and RhXIII represent two mRNA isoforms generated by alternative poly(A) site choices. Examination of the 3' noncoding sequences of the RhII and RhXIII clones reveals that they are identical to the 3' untranslated region of the Rh cDNAs previously published (7-10) up to nucleotide 1358 but surprisingly differ after this position.

The coding sequences of the RhXIII and RhIXb cDNAs exhibited 3.5% divergence, which results in an 8.4% divergence at the amino acid level (Fig. 1). Such polymorphism in the coding region and the difference in the 3' end sequence suggest that the RhXIII mRNA is transcribed from a distinct Rh gene rather than from an allelic form of the CcEe gene coding for RhIXb. We assumed that the RhXIII-encoded polypeptide was a likely candidate for the RhD protein, although residues 42-54 were different from those attributed to the specific RhD protein by Avent et al. (8). Because of this difference, further evidence was needed to establish unambiguously whether the RhXIII cDNA corresponded to the RhD gene transcript.
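The statement that RhII and RhXIII differ only in the length of their 3' noncoding region can be checked from the reported sizes. The Python sketch below is a simple consistency check under the assumption that the total and 3' noncoding lengths were measured the same way for both clones; the dictionary layout and the function name `upstream_bp` are hypothetical.

```python
# Reported sizes (bp) of the two cDNA clones and of their 3' noncoding regions.
CLONES = {
    "RhII": {"total_bp": 1522, "utr3_bp": 237},
    "RhXIII": {"total_bp": 2789, "utr3_bp": 1504},
}


def upstream_bp(clone):
    """Length of everything upstream of the 3' noncoding region
    (5' noncoding sequence plus coding sequence)."""
    return clone["total_bp"] - clone["utr3_bp"]


if __name__ == "__main__":
    for name, clone in CLONES.items():
        print(name, upstream_bp(clone), "bp upstream of the 3' noncoding region")
    total_diff = CLONES["RhXIII"]["total_bp"] - CLONES["RhII"]["total_bp"]
    utr3_diff = CLONES["RhXIII"]["utr3_bp"] - CLONES["RhII"]["utr3_bp"]
    print("difference in total length:", total_diff)
    print("difference in 3' noncoding length:", utr3_diff)
```

Both clones leave the same 1285 bp upstream of the 3' noncoding region, and the 1267-bp difference in total length equals the difference in 3' noncoding length, as expected for isoforms that differ only in poly(A) site choice.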

Amplification of RhXIII Sequences from RhD-Positive DNA. To determine whether the RhXIII cDNA derived from the RhD gene, genomic DNA extracted from RhD-positive (DCCee, DccEE, Dccee) or RhD-negative (ddccee and ddccEE) donors was analyzed in different PCR experiments. At first, primers were selected to amplify identical size products from the CcEe and D genes (or cDNAs), which could be subsequently distinguished by hybridization to a gene-specific oligonucleotide located in a region of greater sequence divergence. Accordingly, using the A1 and A2 amplimers flanking nucleotide positions +956 to +1061, which show as much as 30% divergence between RhIXb and RhXIII (Fig. 1), a 136-bp product was amplified and detected in all DNA preparations with the Rh cDNA probe (Fig. 2, PCR A). However, when the pXIII oligonucleotide probe specific for the RhXIII clone was used, only the 136-bp product from RhD-positive DNAs or from the RhXIII cDNA used as control was detected, and no hybridization to the 136-bp product amplified from RhD-negative samples occurred. Thereafter, primers were chosen to amplify selectively a D gene fragment in RhD-positive genomes. Accordingly, oligonucleotides A3 and A-XIII flanking nucleotides 1272-1420 of the RhXIII cDNA were used as primers. Since the A-XIII amplimer is specific for the 3' noncoding sequence of RhXIII, an expected fragment of 186 bp could be amplified only in the RhD-positive DNA samples and in the control RhXIII cDNA, but no product was detected in RhD-negative samples (Fig. 2, PCR B).

These experiments demonstrated that specific RhXIII sequences are present in the RhD-positive DNA only, providing proof that this mRNA is transcribed from the RhD gene. To add further evidence, sequencing of D and CcEe gene fragments was performed.

Cloning and Sequencing of RhD Genomic Fragments. Although strongly related, the RhD and RhCcEe genes can be distinguished by restriction analysis of RhD-positive (two genes) and RhD-negative (one gene) genomic DNAs (9). Southern hybridization with probes specific for nucleotides 486-635 and nucleotides 1227-1347 of the RhIXb cDNA, which are complementary to exon 4 and exon 10 of the RhCcEe gene, respectively (B.C.-Z., C.L.V.K., V.R., J.-P.C., and Y.C., unpublished data), each revealed two fragments in the RhD-positive genomes but only one in the RhD-negative DNAs (Fig. 3). The 5.6- and 3.0-kb fragments detected in RhD-positive and -negative DNAs with the exon 4 (Fig. 3A) and exon 10 (Fig. 3B) probes correspond to the RhCcEe gene. The 3.0- and 5.2-kb fragments are D specific since they are revealed only in RhD-positive genomes.

Accordingly, four partial genomic libraries were constructed. Libraries CE4 and D4 were enriched in the exon 4-specific CcEe and D genomic fragments, respectively, whereas libraries CE10 and D10 were selectively enriched in the exon 10 complementary fragments derived from the CcEe and D genes, respectively. Sequence analysis indicated that the CE4 and CE10 genomic fragments contain sequences identical to those present in the RhIXb cDNA between positions 486 and 801 (exons 4 and 5) and 1227 and 1488 (exon 10), whereas the D4 and D10 genomic fragments contain the homologous but not identical exonic sequences found in the RhXIII cDNA, including the entire 3' noncoding region (not shown).


FIG. 1. Nucleotide sequence of the RhXIII cDNA and predicted amino acid sequence of the human RhD protein. Numbering of nucleotides and amino acids starts at the ATG codon and initiating methionine, respectively. Nucleotides and amino acid positions that differ from the RhIXb cDNA and deduced protein sequences (7) are underlined. The bracket indicates the limit of homology with the 3' noncoding sequence of a splicing isoform of RhIXb (10). The poly(A) signals used in the RhII and RhXIII cDNAs and the Alu sequence present in the 3' region of RhXIII are underlined. The vertical arrow (at nucleotide 1488) indicates the end of the RhII cDNA. Positions of the oligonucleotides used as amplimers or as probe in PCR experiments (see Fig. 2) are indicated by dashed lines; from 5' to 3': A1, pXIII, A2, A3, and A-XIII.

FIG. 2. Amplification of RhXIII-specific sequences from RhD-positive genomes. (Left) Diagram of D-positive and D-negative RH loci based on Southern blot analysis (9). Amplimers (arrows) and oligoprobe pXIII positions on the D and CcEe genes are indicated, as are the size and the hybridization pattern of the expected PCR products. (Right) DNA from donors with the indicated Rh phenotypes was used as template in PCR experiments. PCR A: amplification between A1 and A2 primers and hybridization with the Rh cDNA probe (RhIXb plus RhXIII) or with the oligoprobe pXIII specific for the RhXIII coding sequence. RhXIII and RhIXb cDNAs were used as controls. PCR B: amplification between primers A3 and A-XIII (specific for the 3' noncoding sequence of RhXIII) and hybridization with the cDNA probe. See Fig. 1 for the exact locations of the oligoprobes and amplimers.

These findings conclusively established that the RhIXb cDNA, as well as its splicing isoforms (10), derives from the RhCcEe gene and that the RhXIII mRNA is transcribed from the RhD gene. No spliceoform of the RhD cDNA could be identified, either by screening of the cDNA libraries or after PCR amplification of various RNA preparations (not shown). Elucidation of the D and CcEe gene structures might allow determination of whether polymorphisms in the intronic splicing sequences could account for this difference in the expression patterns of the two Rh genes. The strong homology between the CcEe and D cDNAs is in agreement with Southern blot analysis indicating that the two genes composing the RH locus of RhD-positive donors are related and may be derived by duplication from a common ancestral gene (9). As expected, the tissue-specific expression of the D gene explored by Northern blot analysis was found to be identical to that previously reported for the CcEe gene [represented by the RhIXb cDNA (7)], i.e., restricted to tissues or cell lines expressing erythroid/megakaryocytic characters (not shown).

Amino Acid Analysis of the RhD Gene Product. The predicted translation product of the RhD transcript (RhXIII) is a protein of 417 amino acids (Mr = 45,500) that exhibited 8.4% divergence compared with the RhIXb-encoded polypeptide. Hydropathy plot analysis suggests that the two proteins are organized similarly, starting with an intracellular NH2 terminus (14) followed by 13 hydrophobic transmembrane domains and terminating with a predicted exofacial COOH end (Fig. 4; ref. 7). Among the 36 amino acid substitutions between the RhIXb and the RhXIII proteins, only 2 are within the first 100 residues and only 1 is within the last 60 amino acids, indicating a high degree of conservation of the NH2- and COOH-terminal regions of the Rh proteins. Five of the six cysteine residues of the RhIXb polypeptide are conserved at the same positions in RhXIII, particularly the unique exofacial cysteine at position 285 (Fig. 4). Another difference includes a potential N-glycosylation site on Asn-331, which may not be used since the RhD protein is unglycosylated and because this residue is predicted to be located within a hydrophobic domain (Fig. 4).

Which among the 36 amino acid substitutions are essential for antigenic expression of RhD is still undetermined. Remarkably, however, <10 of these substitutions occur on the predicted extracellular hydrophilic loops connecting the transmembrane domains of the RhD protein or in close vicinity to the cell surface (Fig. 4), and it is postulated that these positions might be critical for the antigenic properties of the Cc/Ee and D proteins. These few amino acid substitutions are spread along the whole molecule (except for the first and fifth external loops, Fig. 4), and it is of interest that other investigations carried out with a panel of human monoclonal antibodies specific for the D antigen have suggested that at least eight RhD epitopes could be distinguished (16). Whether there may be some correlation between the amino acid exchanges and such RhD epitopes is purely speculative, but the possibility that the monoclonal antibodies may define discontinuous epitopes on the RhD protein is open to analysis. The folding and assembly of the loops and transmembrane domains of the RhD protein are not known; however, if the Rh proteins are approximately globular in shape with a diameter of 50-60 Å, it is likely that the D epitopes represent overlapping and conformation-dependent structures (17). Moreover, the lipid composition of the membrane and possibly also the fatty acylation of the RhD protein itself may modulate the D antigenic expression (18-20). Such variations are thought to affect the cell surface exposure of antigenic motifs, particularly among those that are specifically present on the RhD molecule. In addition, as the Rh proteins most likely associate with other glycoproteins that are deficient in Rhnull cells to form a membrane complex of relatively large size (5, 21), it cannot be excluded that interactions with other proteins of the Rh protein cluster (5) may also account for D antigen expression.
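The positional distribution of the substitutions discussed above can be tallied from the list given in the Fig. 4 legend. The Python sketch below is an illustration only: it uses the 35 entries whose positions are legible in the extracted legend (the 36th is garbled and is deliberately left out), several residue letters are repaired readings of the extracted text, the CcEe residue at position 331 is recorded as unknown, and the helper names are hypothetical.

```python
# (position, CcEe residue, RhD residue) as read from the Fig. 4 legend.
# None marks a residue that is illegible in the extracted text.
SUBSTITUTIONS = [
    (60, "L", "I"), (68, "N", "S"), (103, "P", "S"), (121, "M", "L"),
    (127, "A", "V"), (128, "G", "D"), (152, "T", "N"), (169, "L", "M"),
    (170, "R", "M"), (172, "F", "I"), (182, "T", "S"), (193, "K", "E"),
    (198, "N", "K"), (201, "R", "T"), (218, "M", "I"), (223, "V", "F"),
    (226, "P", "A"), (233, "Q", "E"), (238, "M", "V"), (245, "L", "V"),
    (263, "R", "G"), (267, "M", "K"), (306, "I", "V"), (311, "C", "Y"),
    (314, "V", "G"), (323, "H", "P"), (325, "I", "S"), (327, "V", "I"),
    (329, "H", "G"), (330, "S", "Y"), (331, None, "N"), (342, "T", "I"),
    (350, "H", "D"), (353, "W", "G"), (354, "N", "A"),
]

PROTEIN_LENGTH = 417


def count_in_range(subs, first, last):
    """Number of substitutions whose position lies in [first, last]."""
    return sum(1 for pos, _, _ in subs if first <= pos <= last)


def lost_residue(subs, residue):
    """Positions where the CcEe protein has the given residue but RhD does not."""
    return [pos for pos, cc_ee, d in subs if cc_ee == residue and d != residue]


if __name__ == "__main__":
    print("legible substitutions:", len(SUBSTITUTIONS))
    print("within first 100 residues:", count_in_range(SUBSTITUTIONS, 1, 100))
    print("within last 60 residues (legible entries only):",
          count_in_range(SUBSTITUTIONS, PROTEIN_LENGTH - 59, PROTEIN_LENGTH))
    print("cysteines not conserved in RhD:", lost_residue(SUBSTITUTIONS, "C"))
```

On this reading, two substitutions fall within the first 100 residues, in agreement with the text, and a single cysteine (position 311) is replaced, consistent with five of the six cysteines being conserved. None of the legible entries falls within the last 60 residues, so the single COOH-terminal substitution mentioned in the text would have to be the garbled entry.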

Recently, it was proposed that a 17-kDa chymotrypticfragment radiolabeled from intact erythrocytes and contain-ing the NH2-terminal domain of the RhD protein could forman immune complex with a polyclonal anti-D antibody (22).That such fragment was radiolabeled is surprising since

[Fig. 3 image: Southern blots probed with exon 4 (bands CE4, 0.6 kb, and D4) and with exon 10 (bands D10 and CE10).]

FIG. 3. Identification of RhD and RhCcEe EcoRI genomic fragments. RhD-positive or RhD-negative DNAs were digested by EcoRI and hybridized on Southern blot with exon 4- and exon 10-specific probes of the RhCcEe gene. Fragments of the DccEE DNA identified as D specific (D4 and D10) or CcEe specific (CE4 and CE10) were subcloned.


FIG. 4. Topology and protein sequence of the RhD gene product. Proposed transmembrane organization of the RhD protein based on hydropathy analysis (15) and on the intracellular orientation of the NH2 terminus of Rh proteins (14). Open symbols refer to identical positions within the CcEe protein (7) and black symbols refer to substitutions typical of the RhD protein: 60L → I; 68N → S; 103P → S; 121M → L; 127A → V; 128G → D; 152T → N; 169L → M; 170R → M; 172F → I; 182T → S; 193K → E; 198N → K; 201R → T; 218M → I; 223V → F; 226P → A; 233Q → E; 238M → V; 245L → V; 263R → G; 267M → K; 306I → V; 311C → Y; 314V → G; 323H → P; 325I → S; 327V → I; 329H → G; 330S → Y; 331I → N; 342T → I; 350H → D; 353W → G; 354N → A; 3[illegible]V → E.

previous reports indicated that the 125I label incorporated into tyrosine(s) by cell surface radioiodination was associated with a single chymotryptic peptide following two-dimensional peptide map analysis (23) and was located within the COOH-terminal region of the RhD protein, as it was rapidly removed by carboxypeptidase Y digestion (14, 24). It cannot be excluded, however, that the 17-kDa fragment is weakly labeled under the conditions used, but further investigations are required to clarify this point. Since the number of amino acid residues specific for the RhD protein that are exposed and available to antibodies is extremely low at the NH2 terminus (Fig. 4), only one of the RhD epitopes might be detected on the 17-kDa fragment.

From the above considerations on the complexity of the RhD epitopes, it is not surprising that preliminary attempts to express the recombinant Rh proteins (D and non-D) alone or together in eukaryotic cells of erythroid or nonerythroid origin have been unsuccessful to date (unpublished data). Determination of whether this is due to the incorrect transport or folding of these proteins to the cell surface awaits further investigation. However, the characterization of various mRNAs and deduced protein sequences encoded by the two genes at the RH locus is a critical step toward the analysis of the Rh polymorphisms and of the molecular defects responsible for the Rh-deficiency syndrome.
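The substitution notation used in the Fig. 4 legend (position, CcEe residue, arrow, RhD residue) lends itself to simple tabulation. The following is a minimal sketch, not part of the original analysis: the names `Substitution`, `parse_substitution`, and `parse_legend` are hypothetical, and the sample string holds only five of the entries from the legend.

```python
import re
from typing import List, NamedTuple


class Substitution(NamedTuple):
    """One CcEe -> RhD amino acid difference (hypothetical helper type)."""
    position: int  # residue number as given in the Fig. 4 legend
    ccee: str      # one-letter residue in the CcEe protein
    d: str         # one-letter residue in the RhD protein


# Position, CcEe residue, an arrow written as "→" or "->", RhD residue.
_PATTERN = re.compile(r"^\s*(\d+)\s*([A-Z])\s*(?:→|->)\s*([A-Z])\s*$")


def parse_substitution(text: str) -> Substitution:
    """Parse a single entry such as '68N → S'; reject anything malformed."""
    match = _PATTERN.match(text)
    if match is None:
        raise ValueError(f"unrecognized substitution: {text!r}")
    return Substitution(int(match.group(1)), match.group(2), match.group(3))


def parse_legend(legend: str) -> List[Substitution]:
    """Split a semicolon-separated legend string and parse each entry."""
    return [parse_substitution(item) for item in legend.split(";") if item.strip()]


# Five entries copied from the Fig. 4 legend, for illustration only.
SAMPLE = "60L → I; 68N → S; 128G → D; 311C → Y; 350H → D"

subs = parse_legend(SAMPLE)
for s in subs:
    print(f"position {s.position}: {s.ccee} in CcEe, {s.d} in RhD")
```

Rejecting malformed entries rather than guessing is deliberate here: an entry whose position or residue cannot be read should be checked against the figure, not silently repaired.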

We thank F. Piller for critical reading. Computer calculations were performed at the CITI2 (Paris). Support from the Institut National de la Santé et de la Recherche Médicale, the Caisse Nationale d'Assurances Maladie des Travailleurs Salariés, and the North Atlantic Treaty Organization (Grant 0556/88) is gratefully acknowledged.

1. Mollison, P. L., Engelfriet, C. P. & Contreras, M. (1987) Blood Transfusion in Clinical Medicine (Blackwell Scientific, Oxford), 8th ed.

2. Petz, L. D. & Garratty, G. (1980) Acquired Immune Hemolytic Anemias (Churchill Livingstone, New York).

3. Nash, R. & Shojania, A. M. (1987) Am. J. Hematol. 24, 267-275.

4. Race, R. R. & Sanger, R. (1975) Blood Groups in Man (Blackwell Scientific, Oxford), 6th ed.

5. Agre, P. & Cartron, J. P. (1991) Blood 78, 551-563.

6. Issitt, P. (1989) Transf. Med. Rev. 3, 1-12.

7. Cherif-Zahar, B., Bloy, C., Le Van Kim, C., Blanchard, D., Bailly, P., Hermand, P., Salmon, C., Cartron, J. P. & Colin, Y. (1990) Proc. Natl. Acad. Sci. USA 87, 6243-6247.

8. Avent, N. D., Ridgwell, K., Tanner, M. J. A. & Anstee, D. J. (1990) Biochem. J. 271, 821-825.

9. Colin, Y., Cherif-Zahar, B., Le Van Kim, C., Raynal, V., Van Huffel, V. & Cartron, J. P. (1991) Blood 78, 2747-2752.

10. Le Van Kim, C., Cherif-Zahar, B., Raynal, V., Mouro, I., Lopez, M., Cartron, J. P. & Colin, Y. (1992) Blood 80, 1074-1078.

11. Tippett, P. (1986) Ann. Hum. Genet. 50, 241-247.

12. Southern, E. (1975) J. Mol. Biol. 98, 503-517.

13. Sanger, F., Nicklen, S. & Coulson, A. R. (1977) Proc. Natl. Acad. Sci. USA 74, 5463-5467.

14. Bloy, C., Hermand, P., Blanchard, D., Cherif-Zahar, B., Gossens, D. & Cartron, J. P. (1990) J. Biol. Chem. 265, 21482-21487.

15. Engelman, D., Steitz, T. & Goldman, A. (1986) Annu. Rev. Biophys. Chem. 15, 321-353.

16. Tippett, P. (1990) J. Immunogenet. 17, 247-257.

17. Lomas, C., Tippett, P., Thompson, K. M., Melamed, M. D. & Hughes-Jones, N. C. (1986) Vox Sang. 57, 261-264.

18. Shinitzky, M. & Souroujon, M. (1979) Proc. Natl. Acad. Sci. USA 76, 4438-4440.

19. Hughes-Jones, N. C., Gorick, B. D. & Brown, D. (1991) Immunol. Lett. 27, 101-104.

20. de Vetten, M. P. & Agre, P. (1988) J. Biol. Chem. 263, 18193-18196.

21. Hartel-Schenk, S. & Agre, P. (1992) J. Biol. Chem. 267, 5569-5574.

22. Suyama, K. & Goldstein, J. (1992) Blood 79, 808-812.

23. Blanchard, D., Bloy, C., Hermand, P., Cartron, J. P., Saboori, A. M., Smith, B. & Agre, P. (1988) Blood 72, 1424-1427.

24. Krahmer, M. & Prohaska, R. (1987) FEBS Lett. 226, 105-108.
