
The Arabidopsis thaliana ribosomal protein S15 (rig) gene


Biochimica et Biophysica Acta, 1216 (1993) 221-226 © 1993 Elsevier Science Publishers B.V. All rights reserved 0167-4781/93/$06.00

BBAEXP 92543

Veena Sangwan, Todd R. Lenvik and J. Stephen Gantt Department of Plant Biology, University of Minnesota, St. Paul, MN (USA)

(Received 3 March 1993)

Key words: Ribosomal protein gene; cDNA; mRNA; (A. thaliana)

We have isolated cDNA and genomic clones for Arabidopsis thaliana cytosolic ribosomal protein S15 and determined their sequences. Like animal S15 genes, this plant S15 gene is composed of four exons and the first intron is located immediately following the ATG translational start codon. The 5' end of the S15 mRNA was mapped by RNase protection experiments, which showed that this mRNA contains a 5' untranslated region of approx. 83 nucleotides. Southern blot analyses suggest that Arabidopsis S15 is encoded by a small family of genes. The sequences of the predicted exons in the cloned S15 gene are identical to those of the S15 cDNA, demonstrating that this gene is transcriptionally active. Sequence analysis of the cloned A. thaliana S15 gene shows that it is tightly linked (approx. 500 nucleotides distant) to a gene of unknown function. The Arabidopsis S15 protein described here is about 75% identical to vertebrate S15, 70% identical to the homologous yeast protein (S21), 50% identical to archaebacterial S19, 30% identical to eubacterial S19, and about 30% identical to plant mitochondrial and plastid S19.

Introduction

Protein synthesis by ribosomes is a fundamental process that occurs in all known organisms. Three classes of ribosomes are found in eukaryotic cells. These ribosomes are distinguished from one another on the basis of their size, RNA and protein components, and subcellular location. Cytosolic protein synthesis in plants and animals appears to proceed by remarkably similar mechanisms and so it is not surprising that the cytosolic ribosomes found in members of these two kingdoms are also similar in size, structure, and composition. Where comparisons have been made, these similarities extend to the amino acid sequences of the ribosomal proteins (see, for example, Refs. 1-5). There is little doubt that plant and animal cytosolic ribosomes and many of the genes that encode their components share common ancestors.

The number of cytosolic ribosomes, and consequently the abundance of ribosomal proteins, in eukaryotic cells is typically proportional to the cellular growth rate or to the level of cellular protein synthesis. In plants, this correlation extends to ribosomal protein mRNA levels, which are high in rapidly growing tissues and low in mature differentiated tissues [1,3-10].

Correspondence to: J.S. Gantt, Department of Plant Biology, 220 Biological Science Center, 1445 Gortner Ave., St. Paul, MN 55108, USA. The accession numbers issued by the EMBL data library for the two sequences presented in this paper are Z23161 and Z23162.

Many different mechanisms, including the regulation of mRNA abundance, appear to control ribosomal protein gene expression in eukaryotes. Hariharan et al. [11] demonstrated that several mouse ribosomal protein genes were transcribed at nearly equal rates, that these genes have a remarkably similar architecture, and that in some instances these genes appear to bind common trans-acting factors. We are interested in studying the mechanisms that regulate plant ribosomal protein gene expression. Because plant and animal cytosolic ribosomal proteins are evolutionarily related and complex regulatory mechanisms undoubtedly existed in their common ancestor, we are also interested in comparing the mechanisms by which their synthesis is controlled in order to study the evolution of regulatory mechanisms and factors.

Previous results from our laboratory have shown that two Arabidopsis thaliana H1 histone genes are each tightly linked to sequences that we predicted are the 3' ends of two highly conserved active genes, the functions of which are unknown [12]. We tentatively named these two genes H1flk-1 and H1flk-2. We also isolated and sequenced a cDNA clone that represented yet another member of the H1flk gene family (H1flk-3) [12]. This cDNA was used to isolate a genomic clone that contained the corresponding gene and some flanking DNA. Here we report that tightly linked to the H1flk-3 gene is the gene encoding an A. thaliana cytosolic ribosomal protein that is homologous to vertebrate ribosomal protein S15 and prokaryotic ribosomal protein S19. The plant S15 protein is about 75% identical to the vertebrate protein and the S15 gene is composed of four exons and three introns, as are the vertebrate genes.

Methods

Isolation and subcloning of the A. thaliana genomic and cDNA clones

A 450 bp EcoRI fragment corresponding to the 3' end of the H1flk-3 cDNA clone [12] was isolated and labelled with [α-32P]dATP [13]. This probe was used to screen an A. thaliana genomic library constructed in λ Dash (Stratagene) [12]. Filters were hybridized and washed as previously described [14] and one hybridizing clone, λAtS15g, was purified. DNA from this recombinant phage was isolated [15] and a 2.8 kb EcoRI fragment that hybridized to the probe used above was subcloned into pBluescript KS+ (Stratagene) and named pAtS15g-2.8. One of the EcoRI restriction sites used to subclone this fragment is derived from the λ Dash vector and is not found in the genomic DNA.

pAtS15g-2.8 was digested with BclI and EcoRI and the 737 bp fragment containing a portion of the 3' end of the S15 coding sequence was isolated, labelled with 32P as described above, and used as a probe to screen an A. thaliana λgt10 cDNA library (kindly provided by Nigel Crawford, University of California, San Diego). A hybridizing recombinant phage was purified as described above and the cDNA insert was subcloned into pBluescript KS+ (pAtS15c).

DNA sequencing and sequence analysis

Both strands of the inserts in pAtS15g-2.8 and pAtS15c were sequenced as previously described [12]. Overlapping deletions of the inserts were generated as described by Henikoff [16]. Nucleic acid and amino acid sequences were analyzed using computer programs developed by Intelligenetics.

RNase protection assay

RNase protection assays were performed as described by Thompson et al. [14]. A 297 nucleotide ApaI/BstYI fragment from the genomic clone (corresponding to nucleotides 818 to 1114) was subcloned into BamHI/ApaI-digested pBluescript SK+ (Stratagene). Labeled anti-sense RNA was synthesized in vitro using [α-32P]UTP and T3 RNA polymerase (Stratagene) according to the manufacturer's instructions.

Southern blot analysis

Genomic DNA was extracted and purified from 3-week-old Arabidopsis seedlings essentially as described by Rogers and Bendich [17]. Following restriction digests, the DNA fragments were separated by agarose gel electrophoresis and blotted onto a Zetaprobe membrane (Bio-Rad). S15 cDNA, labelled as described above, was used as a hybridization probe. Moderately low stringency hybridizations were carried out in an aqueous solution containing 6× SSC, 5× Denhardt's solution, 50 μg/ml sheared salmon sperm DNA, 0.2% SDS, and 5 μg/ml poly(A) at 60°C. Filters were washed for 30 min three times in 3× SSC, 0.1% SDS at room temperature.

Results and Discussion

S15 cDNA and genomic cloning

We have previously reported the isolation and characterization of two A. thaliana H1 histone genes, H1-1 and H1-2 [12]. Tightly linked to both of these genes are highly conserved DNA sequences (named H1flk-1 and H1flk-2) of unknown function. We also reported the isolation of a cDNA clone (pH1flk-3c) whose sequence was similar to the H1flk-1 and H1flk-2 sequences but was clearly not derived from the two previously characterized genes. In an attempt to isolate the H1-3 gene, we screened a genomic library using the H1flk-3 cDNA as a probe. A genomic clone was isolated from this screening procedure and a portion of the H1flk-3 gene was localized to a 2.8 kb EcoRI restriction fragment (Fig. 1). Sequence analysis of the region downstream of the H1flk-3 gene showed that the H1-3 gene that we sought was not present; however, this analysis did reveal a sequence similar to that encoding animal ribosomal protein S15 (Fig. 1).

Fig. 1. Organization of the H1flk-3 and ribosomal protein S15 genes in Arabidopsis thaliana. The complete sequence of a 2.8 kb EcoRI restriction fragment was determined (solid line). The boxes represent exons, the filled regions of which represent the protein coding regions of the H1flk-3 and ribosomal protein S15 genes. Arrows show the direction of H1flk-3 and S15 gene transcription. Scale bar: 400 bp.

To examine further the possibility that we had cloned an A. thaliana S15 gene, we sought to isolate a corresponding cDNA clone. Using a portion of the pAtS15g genomic clone to screen an A. thaliana cDNA library, we isolated a single cDNA clone. The 695 bp cDNA insert contained a 27 nucleotide poly(A)-tail. Analysis of the cDNA indicated that the corresponding mRNA contains a single large open reading frame that is followed by a 207 nucleotide 3'-untranslated region. Contained within this open reading frame is a potential AUG translational start codon (corresponding to nucleotides 1120-1122, Fig. 2) located 30 nucleotides from the site corresponding to the 5' end of the cDNA (nucleotide 1091). Supporting the hypothesis that translation initiates at this codon is the presence of an upstream in-frame translational termination codon (corresponding to nucleotides 1108 to 1110). Additionally, the sequence AUG is not found between the 5' end of the mRNA (determined by RNase protection assay, see below), located near position 1037, and the putative translational initiation codon. The open reading frame terminates at position 2226 (Fig. 2), yielding a translated product that is 152 amino acids in length with a molecular mass of 17130 Da. As with many plant genes, a canonical polyadenylation signal (AATAAA) is not found in the 3'-untranslated region. With the exceptions of the poly(A)-tail in the cDNA clone and the presumptive intron sequences in the genomic clone, the sequence of the S15 cDNA exactly matches the sequence of the cloned gene.
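The length and molecular mass quoted for the translated product can be checked with a short calculation. The sketch below is illustrative only: the protein sequence is transcribed from the translation printed beneath the exons in Fig. 2, the residue masses are standard average values that do not come from this paper, and the helper name `average_mass` is hypothetical. It is not a description of the Intelligenetics programs used for the original analysis.

```python
# Standard average residue masses in Da (assumed values, not from the paper).
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.015  # one water per polypeptide chain (free N- and C-termini)

# Deduced S15 protein, transcribed from the translation shown in Fig. 2.
S15_PROTEIN = (
    "MADVEPEVAAAGVPKKRTFKKFAF"
    "KGVDLDALLDMSTDDLVKLFSSRI"
    "RRRFSRGLTRKPMALIKKLRKA"
    "KREAPQGEKPEPVRTHLRNMIIVPEM"
    "IGSIIGVYNGKTFNQVEIKPEMIG"
    "HYLAEFSISYKPVKHGRPGVGATH"
    "SSRFIPLK"
)


def average_mass(protein: str) -> float:
    """Sum the average residue masses and add one water molecule."""
    return sum(AVERAGE_RESIDUE_MASS[residue] for residue in protein) + WATER_MASS


protein_length = len(S15_PROTEIN)
protein_mass = average_mass(S15_PROTEIN)
print(protein_length, round(protein_mass))
```

Under these assumptions the sequence has 152 residues and a mass within a few daltons of the reported 17130 Da; small differences would be expected from the exact mass table used.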

G AAT TCA CAC AAG AGA TTT GCT AAT GCA TTC CCA AAG TAT TGT GAG CTT GTT GAT AAC GCT AGG CTT TAT 70 N S H K R F A N A F P K Y C E L V D N A R L Y

TGC ACC AAT GCT GTT GGT GGT CCT CCA AGG gtaaattag~caaaag~caatatcaaatctattacgatgtcgtttttgggaga 153 C T N A V G G P P R

gatactgatgtgtgtgttttaattggttggtggagcag CTT ATA GCG TGG AAA GAC GGA AAT AGT AAG CTT CTG GTG GAT 233 L I A W K D G N S K L L V D

CCA GAG GAT ATT GAT TGT CTA AAA ~ GTC AGC AGT CTC AAC CCG GAT GCA GAA TCC ATT TAC GAG CTT TAT 305 P E D I D C L K R V S S L N P D A E S I Y E L Y

CCG GAT CCA AGC CAA TTA AGC AAA CCC GGT TCT GTT TGG AAT GAT GTT GTG TTA GTT CCA TCG AGG CCC AAG 377 P D P S Q L S K P G S V W N D V V L V P S R P K

GTC CAG AAG GAA CTT AGC GAT GCA ATC AGG AGA ATC GAA AAG GCC CAA CCA AAA AAT TGA TGATATq~fGTGAACT 452 V Q K E L S D A I R R I E K A Q P K N *

AGACcFi-i.GAAATTATCTTCTTCCTGAATTTGTCATTTGTTTCCGTCAGcFF1~Fi.GGTTAAATAAGTTAATAAAATGG~FFFF1.GT?GTGAGTTG 546

acatagctagtggaattgagacccatagtcattggtcacgcacatcaaccaagaacaaattgcagtcgataaactaatattttgataacagttac 641

gact tgatcct at cgtat at aact t cgat t t aacaaacacaagt cgt aat t t ct aaaaaacaat t aact caat t tgt t tgaaaat aaaagt t t cc 736

t t t t t ct t cgt aacaagtgaagt caaatgt t t t t cact cgt at ct ct t acaacgt t t tgatgagatgat aagagaagagtgggcccacat at t tg 831

ct t aacaatacacgt t t t agggt t acgtgt cactaatgt t at tgggt t cat t t aatctgggct t aaaagaggcgt gt at catgt tgt aat at t cc 926

cat t tgggct t t cat at aatgaagt tgggccgacaaaaaaaaaaaggact agggt t cccaagt at cccact at at at aat t t t agt cgaagaaaa 1021

ttcttttatttttct cTTGCAAAGA~CTcTCAGCAGCCAACGACAGTTATCA/J~GCTTAAGCCG.i~FFi.AGC-G~F~-~AT~u-i~FF~.GCTGAGAT~ 1115

AATC ATG gtgagttcttcag~c~caacatctgactgaatcttcttctctgatttacattttccgattgctgaacaattaatggtgaacgttgt 1208 M

ag GCG GAT GTT GAA CCA GAG GTT GCT GCT GCT GGA GTT CCC AAG AAG AGG ACG TTT AAG AAG TTT GCC TTC 1279 A D V E P E V A A A G V P K K R T F K K F A F

AAA GGA GTT GAT CTC GAT GCT CTT CTC GAT ATG TCT ACT GAT GAT ~ GTC AAG CTC TTC TCT TCT CGT ATT 1351 K G V D L D A L L D M S T D D L V K L F S S R I

CGT AGA AG gca~acatcaaaccctagactagtctcacttttagattctgtgattgatgtttgacctca~tcgactgtttgtgaaa~gttc 1442 R R R

atag G TTC TCT AGA GGT TTG ACA AGG AAG CCA ATG GCT CTG A~F AAG AAG CTG AGG AAA GCG gCaatcatt 1513 F S R G L T R K P M A L I K K L R K A

tcttatatagtcttgaatgtttggttagttct~aagt~gc~cg~tta~cttcactttgttgaatacttggtgttgcttggt~agt~ctgaatatg 1608

gttttgcaacaattttagtgcaagtctcttctgtttgtgttctctctaagtttg~ttaagattggtgttgattgttttagtgatttctggatttg 1703

ttgtct~gtttcatacattgatacatttctgattctcgagttcatgattaaattccatg~tttattgagattcataaccttgttgctgtattag~ 1798

ctc~gtctggattgccttta~gaatgtt~acactgtgtttgagtatgaatgaattgttcctacatcagttactgaatctcttgtctatcc~gaat 1893

gtacaatagccttgtttctgattgttgcattctgta~atc~gagtaagtgtc~gat~tcgaacttgcgttttggttga~aaacag AAA AGG 1985 K R

GAG GCA CCA CAA GGT GAG AAG CCA GAA CCA GTG AGA ACC CAC TTG AGG AAC ATG ATC ATC GTC CCT GAA ATG 2057 E A P Q G E K P E P V R T H L R N M I I V P E M

ATT GGA AGC ATC ATT GGA GTG TAC AAC GGA AAG ACT TTC AAC CAG GTT GAG ATC AAG CCT GAG ATG ATT GGT 2129 I G S I I G V Y N G K T F N Q V E I K P E M I G

CAC TAC CTG GCT GAG TTC TCT ATC TCA TAC AAG CCG GTC AAG CAC GGT AGG CCT GGT GTT GGT GCT ACC CAC 2201 H Y L A E F S I S Y K P V K H G R P G V G A T H

TCT TCC AGA TTC Aq~f CCC CTT AAG TGA AAAGCTTCTATTGGGGAATGTGTTTACTTGCTCGTTATATGTGGATCTAAGTACGTGGA 2287 S S R F I P L K *

TCT CTTATG'I-I-I-I- I'GAAGc FFI'ATAATGACATTA~FIq'AGAAGu-I-I-I'CT CAA CT CTAGCT CTTGCG'I'FI" I'CATTAT CATGTTACAAGAGGAT CTT 2383 I

AAAT CAAATCA'I'I'FI'GGA'FF/'Fi'GTT atgt t t atgt tgt t at aat t t at t tggat t tgat t aat tgaaaat gaagtgacacct agaact t gct c 2476

aaacatgggt at tgt tgt t t tcct tgt t at taggact ctgct t aacat t acact t t tgt t aagagt aacaacat acat ct aaaat ct cat aacag 2571

aggat agaaat cgt acccactgcaaaaaccagat act caaat t agt ccact t cacgacaacagat t aatgggcct t t aagt at atggacatgggc 2666

ctttgtcagttataagtctcccttaaccagttgacttgtgttccctctaactttaacaaaagattgttcctgagaaattctagggtttctgaaat 2761

cgaggagagat ccgaat t c 2780

Fig. 2. The complete nucleotide sequence of the cytoplasmic ribosomal protein S15 gene and its flanking regions. A 2780 bp EcoRI restriction fragment was sequenced. The upper case letters are exon sequences, italicized lower case letters are intron sequences, the sequence in plain lower case letters from 547 to about 1036 and from 2410 to 2773 is intergenic (nucleotides 2774 to 2780 are derived from the cloning vector). The sequence from position 1 to 546 (denoted with an arrow) is the 3' end of the H1flk-3 gene and the region extending from about 1037 (marked with a star) to 2409 (marked with an arrow) contains the transcribed portion of the S15 gene. The 5' end of the S15 cDNA is denoted by a filled arrowhead at position 1091.


Site of transcriptional initiation

To map the 5' end of the S15 transcript, an anti-sense RNA probe, complementary to nucleotides 1116 to 817 (Fig. 2), was hybridized to A. thaliana total RNA, digested with RNases A and T1, and the products were separated by electrophoresis through a DNA sequencing gel. The major RNA fragment that was specifically protected by Arabidopsis RNA migrated with the mobility of an 86 nucleotide DNA fragment (Fig. 3). Several additional protected RNAs are also observed above and below this fragment. Since the electrophoretic mobility of RNA is only 90% to 95% that of DNA [18], the estimated size of the major protected RNA fragment is 80 ± 2 nucleotides. This result suggests that transcription initiates primarily between nucleotides 1035 and 1039, producing a transcript with about 83 nucleotides in its 5' untranslated region.
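The arithmetic behind this estimate can be laid out explicitly. The following is a minimal sketch, assuming that the protected fragment ends at nucleotide 1116 (the gene-proximal boundary of the probe) and that the start codon begins at nucleotide 1120; the function names are hypothetical and chosen only for illustration.

```python
DNA_MARKER_SIZE = 86               # nt, mobility of the major protected fragment
RNA_MOBILITY_RANGE = (0.90, 0.95)  # RNA mobility relative to DNA [18]
PROBE_BOUNDARY = 1116              # assumed 3' limit of the protected region
START_CODON_POSITION = 1120        # first nucleotide of the ATG (Fig. 2)


def rna_size_range(dna_size, mobility_range):
    """Convert an apparent DNA size into a range of RNA sizes."""
    low, high = mobility_range
    return dna_size * low, dna_size * high


def initiation_site(protected_length, probe_boundary):
    """Position of the 5'-most protected nucleotide for a given fragment length."""
    return probe_boundary - protected_length + 1


size_low, size_high = rna_size_range(DNA_MARKER_SIZE, RNA_MOBILITY_RANGE)

# Evaluate the quoted estimate of 80 +/- 2 nucleotides.
candidate_sites = sorted(initiation_site(n, PROBE_BOUNDARY) for n in (78, 80, 82))

# 5' untranslated region measured from the central estimate to the start codon.
utr_length = START_CODON_POSITION - candidate_sites[1]

print(size_low, size_high, candidate_sites, utr_length)
```

With these inputs the RNA size falls between roughly 77 and 82 nucleotides, the candidate initiation sites span positions 1035 to 1039, and the central estimate gives an 83 nucleotide 5' untranslated region, in line with the values stated in the text.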

The assignment of the S15 transcriptional initiation site is complicated by the possibility that the additional S15 genes that we have shown to be present in the Arabidopsis genome (see below) are transcribed and may be able to hybridize to the anti-sense probe. While it seems unlikely, it is also possible that the transcript's 5' untranslated region is encoded by two or more exons. Nevertheless, if our interpretation of the RNase protection experiments is correct, then transcription of this plant cytosolic ribosomal protein gene initiates in a polypyrimidine tract, which is similar to many of the genes encoding mammalian cytosolic ribosomal proteins [11,19]. Unlike many of the mammalian genes, the plant ribosomal protein gene contains an apparent TATA box about 36 nucleotides upstream of the putative site of transcriptional initiation and contains no GCG box motifs.

Fig. 3. Mapping the 5' end of Arabidopsis thaliana S15 mRNA. A uniformly labelled antisense RNA probe was hybridized to 20 μg of A. thaliana total RNA (lane 1) or 20 μg of yeast total RNA (lane 2). After digestion with RNase A and RNase T1, the protected RNAs were separated by electrophoresis on a 6% acrylamide sequencing gel. The major RNA fragment that is specifically protected by A. thaliana RNA is indicated by an arrow. A DNA sequencing ladder is also shown (lanes G, A, T, and C).

Amino acid sequence comparisons and analysis

The amino acid sequence deduced from the cDNA clone was used to search for similar sequences in sequence databases. Significant similarities were found with ribosomal proteins from an extremely diverse group of organisms and organelles and amino acid sequence alignments with several of these proteins are shown in Fig. 4. The plant ribosomal protein appears to be homologous to vertebrate S15 (about 75% identical), yeast S21 (about 70% identical), archaebacterial S19 (about 50% identical), and eubacterial, mitochondrial, and plastid ribosomal protein S19 (each about 30% identical). A. thaliana S15 appears to have seven extra, very hydrophobic, amino acids (amino acids 8-14) in its N-terminal region relative to the other eukaryotic S15 ribosomal proteins.

AtS15 1 MADVEPEVAAAGVPKKRTFKKFAFKGVDLDALLDMSTDDLVKLFSSRIRRRFSRGLTRKPMALIKKLRKAKREAPQ 76
AtS15 77 GEKPEPVRTHLRNMIIVPEMIGSIIGVYNGKTFNQVEIKPEMIGHYLAEFSISYKPVKHGRPGVGATHSSRFIPLK 152

Fig. 4. Amino acid sequence alignment of a plant cytosolic ribosomal protein S15 with the homologous protein from several vertebrates, a yeast, an archaebacterium, a eubacterium, and two eukaryotic organelles. Identical amino acids are indicated by dots; dashed lines indicate gaps introduced into the sequences to maximize sequence identity. The sequences are derived from Arabidopsis thaliana (AtS15, this report); rat (RrS15) [20]; frog (XlS15) [22]; chicken (GsS15) [22]; yeast (ScS21) [26]; Halobacterium halobium (HhS19) [27]; Escherichia coli (EcS19) [28]; tobacco plastid (NtCS19) [29]; and petunia mitochondrion (PhMS19) [30]. The amino acid sequence of the petunia mitochondrion S19 protein is derived from the fully edited gene transcript [30].
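Percent identity values such as those quoted for Fig. 4 are obtained by counting matching positions in an alignment. The sketch below shows one common convention, assuming the dot notation of Fig. 4 (a dot means "same residue as the reference row", a dash means a gap) and assuming that gapped columns are excluded from the denominator. The paper does not state which denominator was used, and the comparison row here is invented purely for illustration; it is not taken from any of the aligned proteins.

```python
def percent_identity(reference: str, row: str) -> float:
    """Percent identity of an aligned row against the reference row.

    A "." in `row` marks a residue identical to the reference; "-" in either
    row marks a gap, and gapped columns are skipped (an assumed convention).
    """
    if len(reference) != len(row):
        raise ValueError("aligned rows must have the same length")
    compared = 0
    identical = 0
    for ref_residue, row_residue in zip(reference, row):
        if ref_residue == "-" or row_residue == "-":
            continue  # skip columns that contain a gap
        compared += 1
        if row_residue == "." or row_residue == ref_residue:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


# First eight residues of AtS15 against a hypothetical comparison row.
toy_reference = "MADVEPEV"
toy_row = "..S.D..L"
toy_row_with_gap = "..S.D.-L"

print(percent_identity(toy_reference, toy_row))
print(round(percent_identity(toy_reference, toy_row_with_gap), 1))
```

Choosing a different denominator (for example the full length of the shorter protein) would change the percentages slightly, which is one reason the values in the text are given as approximate.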

Clones encoding vertebrate S15 ribosomal proteins were originally identified as being derived from the rat insulinoma gene (rig), the expression of which is dramatically elevated in insulinomas and in a variety of other transformed cells [20,21]. Even prior to its identification as a ribosomal protein, the rig protein was thought to be important in general cellular growth. Two regions in the protein, a putative nuclear localization signal and a possible DNA-binding domain, were given special attention [21,22]. Both of these regions are highly conserved in the A. thaliana protein as well as in those of all other eukaryotic organisms. The putative nuclear localization signal in the Arabidopsis protein extends from amino acids 68 to 75 and differs from the vertebrate sequence by the substitution of an arginine for a lysine. The potential DNA-binding domain extends from amino acids 102 to 112, and the plant amino acid sequence exactly matches the rat sequence and most other vertebrate S15 sequences. The identification of rig as a ribosomal protein gene [23] has brought into question whether the encoded protein actually functions as anything other than a ribosomal protein. Nevertheless, the remarkable conservation of amino acid sequence in this putative DNA-binding domain between plants and animals leaves open the possibility that the S15 protein may also function as a DNA-binding protein.
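The residue ranges mentioned in this section can be pulled out of the deduced protein sequence directly. This is a small sketch, assuming the sequence as transcribed from the translation in Fig. 2 and using 1-based, inclusive coordinates as in the text; the helper `region` and the dictionary keys are hypothetical names.

```python
# Deduced S15 protein, transcribed from the translation shown in Fig. 2.
S15_PROTEIN = (
    "MADVEPEVAAAGVPKKRTFKKFAF"
    "KGVDLDALLDMSTDDLVKLFSSRI"
    "RRRFSRGLTRKPMALIKKLRKA"
    "KREAPQGEKPEPVRTHLRNMIIVPEM"
    "IGSIIGVYNGKTFNQVEIKPEMIG"
    "HYLAEFSISYKPVKHGRPGVGATH"
    "SSRFIPLK"
)


def region(protein: str, start: int, end: int) -> str:
    """Return residues start..end using 1-based, inclusive coordinates."""
    return protein[start - 1:end]


regions = {
    "n_terminal_hydrophobic": region(S15_PROTEIN, 8, 14),
    "putative_nls": region(S15_PROTEIN, 68, 75),
    "putative_dna_binding": region(S15_PROTEIN, 102, 112),
}

for name, residues in regions.items():
    print(name, residues)
```

Under these assumptions the three segments are VAAAGVP, RKAKREAP, and GVYNGKTFNQV. The first is composed almost entirely of small nonpolar residues and the second is rich in lysine and arginine, consistent with how the text characterizes them.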

Southern blot analysis and structure of the S15 gene

S15 cDNA was used to probe Southern blots of A. thaliana genomic DNA (Fig. 5). Hybridization of the cDNA probe at moderately low stringency reveals the presence of between seven and ten fragments in each restriction enzyme digest. Due to the low sequence similarities exhibited between cytosolic S15 and the homologous proteins in plant mitochondria and plastids (Fig. 4), it is unlikely that the cytosolic S15 cDNA probe would detect the mitochondrial or plastid S19 genes. It is therefore possible that all of the hybridizing fragments contain cytosolic ribosomal protein S15 gene sequences, making it likely that S15 is encoded by a family of genes. Using moderately high stringency hybridization conditions, nearly all of the fragments seen in Fig. 5 continue to hybridize to the cDNA probe; however, the relative intensities differ somewhat (data not shown).

Fig. 5. Genomic Southern blot analysis of Arabidopsis thaliana DNA probed with ribosomal protein S15 cDNA. 5 μg of genomic DNA was digested with EcoRI (lane E), BamHI (lane B), or XbaI (lane X) and separated by electrophoresis on a 0.7% agarose gel. 32P-labelled S15 cDNA was used as a hybridization probe. Filter-bound DNA was hybridized to the probe at moderately low stringency. No hybridizing bands were visible below 1.5 kb. The positions of molecular length markers in kb are indicated.

Unlike animal genomes, which typically contain large numbers of ribosomal protein pseudo-genes and express a single gene, plant cells seem to express many, if not all, of the genes in their ribosomal protein gene families [1,4,8,24]. This observation has led to the speculation that plants may have functionally distinct subsets of cytosolic ribosomes [1,24]. Bailey-Serres and Freeling [25] have presented data showing that hypoxia induces alterations in the ribosomal protein complement of maize cytosolic ribosomes, suggesting that these ribosomes may have some unique function or property. Further analyses would be required to determine whether any of the S15 genes, in addition to the cloned gene, are expressed in A. thaliana; to examine the diversity of the expressed S15 genes; and to determine the effect that a heterogeneous population of ribosomes has on translation.

The genomic EcoRI fragment that was isolated and sequenced is 2780 bp in length and includes the last exon and a portion of the penultimate exon of the H1flk-3 gene (Fig. 1 and Fig. 2). Only about 490 nucleotides separate the 3' end of the H1flk-3 gene (position 546 or 547) and the 5' end of the S15 gene (near position 1037). The S15 cDNA sequence matches that of the genomic DNA exactly, demonstrating that the cloned gene is active and that the S15 cDNA was derived from a transcript of this gene. The S15 gene is split into four exons by three introns, as are the known vertebrate genes. The first two introns are relatively small (88 and 87 nucleotides, respectively) compared to the third intron (475 nucleotides). The first intron immediately follows the initiation codon as it does in all known vertebrate S15 genes. This gene architecture suggests that this intron was present in the common ancestor of plants and animals. The locations of the other introns in the A. thaliana S15 gene, relative to the amino acid coding sequence, have not been conserved between plants and animals. Interestingly, while many yeast ribosomal protein genes contain introns, the yeast gene homologous to S15 contains none [26].
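The intron sizes and the intergenic distance follow from the coordinates in Fig. 2. The sketch below is a consistency check only: the coding-segment boundaries are read off the position numbers printed at the line ends of Fig. 2 and are therefore an assumption of this example, not coordinates stated in the text. The segments cover only the protein-coding portion of each exon (start codon through stop codon), not the untranslated regions.

```python
# Protein-coding segments, 1-based inclusive. These boundaries are inferred
# from the Fig. 2 numbering (assumption); the last segment includes the stop codon.
CODING_SEGMENTS = [(1120, 1122), (1211, 1359), (1447, 1504), (1980, 2228)]

H1FLK3_3PRIME_END = 547    # "position 546 or 547" in the text
S15_5PRIME_END = 1037      # approximate transcription start from the text
PROTEIN_LENGTH = 152       # amino acids, from the text


def intron_lengths(segments):
    """Lengths of the gaps between consecutive coding segments."""
    return [
        next_start - previous_end - 1
        for (_, previous_end), (next_start, _) in zip(segments, segments[1:])
    ]


def coding_length(segments):
    """Total number of nucleotides covered by the coding segments."""
    return sum(end - start + 1 for start, end in segments)


introns = intron_lengths(CODING_SEGMENTS)
coding_nt = coding_length(CODING_SEGMENTS)
intergenic_nt = S15_5PRIME_END - H1FLK3_3PRIME_END

print(introns, coding_nt, intergenic_nt)
```

With these boundaries the three gaps are 88, 87, and 475 nucleotides, the coding segments total 459 nucleotides (152 codons plus a stop codon), and the two genes lie about 490 nucleotides apart, all of which agree with the figures given in the text.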

Despite the similarity in structure of the animal and plant S15 genes, little nucleotide sequence similarity exists outside of the amino acid coding regions. No sequences in the 5' region of the A. thaliana S15 gene bear obvious similarity to those involved in regulation of animal or yeast ribosomal protein genes. Several other A. thaliana ribosomal protein genes have been isolated and partially characterized, including one encoding L3 [24] and several encoding a 60S ribosomal protein that is synthesized as a ubiquitin conjugate [5]. Among these genes, only a small amount of upstream sequence has been determined and the sites of transcriptional initiation have not been reported, making sequence comparisons difficult. To determine whether plant ribosomal protein genes contain a unique set of conserved regulatory elements similar to those identified in mammals and yeast, more of these genes need to be isolated and analyzed.

Acknowledgements

We thank Michael Thompson, Sudam Pathirana, and Jocelyn Shaw for suggestions during the preparation of this manuscript. This work was supported by a grant from the Public Health Service National Institutes of Health (GM-38769).

References

1 Larkin, J.C., Hunsperger, J.P., Culley, D., Rubenstein, I. and Silflow, C.D. (1989) Genes Dev. 3, 500-509.

2 Gantt, J.S. and Thompson, M.D. (1990) J. Biol. Chem. 265, 2763-2767.

3 Taylor, M.A., Mad Arif, S.A., Pearce, S.R., Davies, H.V., Kumar, A. and George, L.A. (1992) Plant Physiol. 100, 1171-1176.

4 Stafstrom, J.P. and Sussex, I.M. (1992) Plant Physiol. 100, 1494-1502.

5 Callis, J., Raasch, J.A. and Vierstra, R.D. (1990) J. Biol. Chem. 265, 12486-12493.

6 Gantt, J.S. and Key, J.L. (1983) Biochemistry 22, 4131-4139.

7 Gantt, J.S. and Key, J.L. (1985) J. Biol. Chem. 260, 6175-6181.

8 Madsen, L.H., Kreiberg, J.D. and Gausing, K. (1991) Curr. Genet. 19, 417-422.

9 Crowell, D.N., Kadlecek, A.T., John, M.C. and Amasino, R.M. (1990) Proc. Natl. Acad. Sci. USA 87, 8815-8819.

10 Köhler, S., Coraggio, I., Becker, D. and Salamini, F. (1992) Planta 186, 227-235.

11 Hariharan, N. and Perry, R.P. (1990) Proc. Natl. Acad. Sci. USA 87, 1526-1530.

12 Gantt, J.S. and Lenvik, T.R. (1991) Eur. J. Biochem. 202, 1029-1039.

13 Feinberg, A.P. and Vogelstein, B. (1983) Anal. Biochem. 132, 6-13.

14 Thompson, M.D., Jacks, C.M., Lenvik, T.R. and Gantt, J.S. (1992) Plant Mol. Biol. 18, 931-944.

15 Maniatis, T., Fritsch, E.F. and Sambrook, J. (1982) Molecular Cloning: A Laboratory Manual, pp. 80-85, Cold Spring Harbor Laboratory Press, Cold Spring Harbor.

16 Henikoff, S. (1987) Methods Enzymol. 155, 156-165.

17 Rogers, S.O. and Bendich, A.J. (1988) in Plant Molecular Biology Manual (Gelvin, S.B., Schilperoort, R.A. and Verma, D.P.S., eds.), Vol. A6, pp. 1-11.

18 Ausubel, F.M., Brent, R., Kingston, R.E., Moore, D.D., Seidman, J.G., Smith, J.A. and Struhl, K. (1989) Current Protocols in Molecular Biology, Vol. 1, p. 4.7.3, Greene Publishing Associates/Wiley-Interscience, New York.

19 Weis, L. and Reinberg, D. (1992) FASEB J. 6, 3300-3309.

20 Takasawa, S., Yamamoto, H., Terazono, K. and Okamoto, H. (1986) Diabetes 35, 1178-1180.

21 Inoue, C., Shiga, K., Takasawa, S., Kitagawa, M. and Yamamoto, H. (1987) Proc. Natl. Acad. Sci. USA 84, 6659-6662.

22 Sugawara, A., Nata, K., Inoue, C., Takasawa, S., Yamamoto, H. and Okamoto, H. (1990) Biochem. Biophys. Res. Commun. 166, 1501-1507.

23 Kitagawa, M., Takasawa, S., Kikuchi, N., Itoh, T., Teraoka, H., Yamamoto, H. and Okamoto, H. (1991) FEBS Lett. 283, 210-214.

24 Kim, Y., Zhang, H. and Scholl, R.L. (1990) Gene 93, 177-182.

25 Bailey-Serres, J. and Freeling, M. (1990) Plant Physiol. 94, 1237-1243.

26 Takasawa, S., Tohgo, A., Unno, M., Yonekura, H. and Okamoto, H. (1992) FEBS Lett. 307, 318-323.

27 Mankin, A.S. (1989) FEBS Lett. 246, 13-16.

28 Zurawski, G. and Zurawski, S.M. (1985) Nucleic Acids Res. 13, 4521-4526.

29 Tanaka, M., Wakasugi, T., Sugita, M., Shinozaki, K. and Sugiura, M. (1986) Proc. Natl. Acad. Sci. USA 83, 6030-6034.

30 Conklin, P.L. and Hanson, M.R. (1991) Nucleic Acids Res. 19, 2701-2705.