
Gene. 112 (1992) 225-228 © 1992 Elsevier Science Publishers B.V. All rights reserved. 0378-1119/92/$05.00

GENE 06326


Sequence and organization of 5S ribosomal RNA-encoding genes of Arabidopsis thaliana

(Ecotypes; plant; restriction site polymorphisms; gene copy number)

Bruce R. Campell, Yonggen Song, Thomas E. Posch, Christopher A. Cullis and Christopher D. Town

Biology Department, Case Western Reserve University, Cleveland, OH 44106 (U.S.A.)

Received by T.D. McKnight: 24 June 1991; Accepted: 22 September 1991; Received at publishers: 9 December 1991

SUMMARY

We have isolated a genomic clone containing Arabidopsis thaliana 5S ribosomal RNA (rRNA)-encoding genes (rDNA) by screening an A. thaliana library with a 5S rDNA probe from flax. The clone isolated contains seven repeat units of 497 bp, plus 11 kb of flanking genomic sequence at one border. Sequencing of individual subcloned repeat units shows that the sequence of the 5S rRNA coding region is very similar to that reported for other flowering plants. Four A. thaliana ecotypes were found to contain approx. 1000 copies of 5S rDNA per haploid genome. Southern-blot analysis of genomic DNA indicates that 5S rDNA occurs in long tandem arrays, and shows the presence of numerous restriction-site polymorphisms among the six ecotypes studied.

INTRODUCTION

The 5S ribosomal RNA genes (5S rDNA) are highly conserved throughout the animal and plant kingdoms (Erdmann and Wolters, 1986) and are present in high copy number in many species, including some plants (Long and Dawid, 1980). In most organisms, the bulk of the 5S rDNA is organized into tandem arrays found at a small number of chromosomal locations (Long and Dawid, 1980). In certain fungi [e.g., Schizosaccharomyces pombe (Mao et al., 1982) and Neurospora crassa (Selker et al., 1981)] the 5S rDNA is dispersed throughout the genome. In flax, which contains an unusually large number of copies of 5S rDNA genes, these sequences are distributed throughout the genome, both in long tandem repeats and in shorter, divergent arrays (Schneeberger et al., 1989).

A. thaliana, a member of the Cruciferae which is rapidly gaining popularity as a model organism for plant molecular genetics, has the smallest known genome size among angiosperms, and contains very little repetitive DNA (Leutwiler et al., 1984). The large rRNA genes from this organism have already been cloned and partially characterized (Pruitt and Meyerowitz, 1986). Because of its small genome size and low fraction of repetitive nt sequences, it was of interest to characterize the 5S rDNA gene family in Arabidopsis. In the work described here, we report on the sequence of the 5S rDNA repeat unit and the organization of part of a tandem array in a genomic clone. We also present data concerning gene copy number and polymorphisms observed among the 5S rDNA genes of six different ecotypes of Arabidopsis.

Correspondence to: Dr. C.D. Town, Biology Department, Case Western Reserve University, Cleveland, OH 44106 (U.S.A.) Tel. (216)368-3593; Fax (216)368-4672.

Abbreviations: bp, base pair(s); FIGE, field-inversion gel electrophoresis; kb, kilobase(s) or 1000 bp; nt, nucleotide(s); rDNA, DNA containing genes coding for rRNA; rRNA, ribosomal RNA.

EXPERIMENTAL AND DISCUSSION

(a) Isolation of a phage clone and organization of Arabidopsis genomic insert

Only one 5S rDNA-containing clone, λTP1, was isolated from screening 20000 plaques of a library containing Arabidopsis (ecotype Nd-0) genomic DNA cloned in λEMBL4, indicating a low representation of these sequences in the library. No positively hybridizing plaques were found in a comparable screen of a library constructed from DNA of the Col-0 ecotype. Since the 5S rDNA is fairly abundant in the Arabidopsis genome, but may be highly methylated (see section d), the low representation of these sequences in the library may result from loss of sequences during cloning due to the use of a methylation-sensitive bacterial host, VCS-257 (Graham et al., 1990). Use of a methylation-insensitive host carrying mcrA- and mcrB- mutations increases the efficiency of cloning of rRNA genes from flax (R.G. Schneeberger, M.L. Agarwal and C.A.C., unpublished data).
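To put the "low representation" in numbers, the sketch below compares the single clone recovered with a rough expectation. It is an illustration only, not a calculation from the paper: it assumes that the fraction of 5S-positive plaques simply equals the 5S rDNA fraction of the genome, and it borrows the copy number, repeat length and genome size reported in section (c). All variable names are made up for this example.

```python
# Rough expectation for the library screen, under a simple proportionality
# assumption (positive plaques ~ fraction of the genome that is 5S rDNA).
PLAQUES_SCREENED = 20_000   # plaques screened, section (a)
COPIES = 1000               # approx. 5S repeat units per haploid genome, section (c)
REPEAT_BP = 497             # repeat unit length in bp, section (b)
GENOME_KB = 70_000          # haploid genome size assumed in section (c)

fraction_5s = COPIES * REPEAT_BP / (GENOME_KB * 1000)
expected_positive = PLAQUES_SCREENED * fraction_5s

print(f"5S fraction of genome: {fraction_5s:.2%}")
print(f"expected positives: about {round(expected_positive)}; observed: 1")
```

Under these assumptions on the order of a hundred positive plaques would be expected, which is why recovering only one points to loss of these sequences during cloning.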

Digestion of λTP1 DNA with EcoRI followed by FIGE indicated an insert size of 14.5 kb. Restriction analysis with a number of enzymes followed by probing with 5S rDNA and phage λ DNA sequences indicated that the cloned genomic fragment consists of an approx. 3.5-kb block containing the 5S rDNA sequence plus about 11 kb of non-5S DNA (Fig. 1). The number of repeat units present in the 5S rDNA block was determined by subjecting purified genomic insert to partial digestion with SalI. Probing blots of the partial digests with 5S rDNA showed that the insert contains a tandem array of seven repeat units of approx. 500 bp, with an additional SalI site occurring in one of the terminal units.
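The sizes quoted above can be cross-checked with simple arithmetic. The following minimal sketch uses the 497-bp repeat length determined in section (b); the function and variable names are illustrative and do not come from the paper.

```python
# Consistency check of the insert composition reported for the genomic clone.
REPEAT_BP = 497    # SalI repeat unit length (section b)
N_REPEATS = 7      # repeat units in the cloned tandem array
FLANK_KB = 11.0    # approximate non-5S flanking sequence

def array_kb(n_repeats: int, repeat_bp: int) -> float:
    """Length of a tandem array in kb."""
    return n_repeats * repeat_bp / 1000

block_kb = array_kb(N_REPEATS, REPEAT_BP)
insert_kb = block_kb + FLANK_KB
print(f"5S block: {block_kb:.2f} kb; whole insert: {insert_kb:.2f} kb")
```

Seven repeat units account for about 3.5 kb, and adding the 11 kb of flanking DNA gives about 14.5 kb, in agreement with the insert size measured by FIGE.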

(b) Length and sequence of a 5S rDNA repeat unit from Arabidopsis

Since preliminary genomic digests suggested (erroneously) that the Arabidopsis 5S rRNA gene contains a single BamHI site, BamHI-digested DNA from λTP1 was subcloned into Bluescript M13(-) for sequencing, generating plasmids pCT4.2 and pCT4.3, which contain repeat unit-sized inserts. Subsequent sequence analysis of these subclones revealed the presence of a second BamHI site within the repeat unit, raising the possibility that fragments of different repeat units had been ligated together in these clones, but also indicated a single SalI site in the repeat. Therefore, the complete sequence of an intact 5S rDNA repeat unit was obtained from a plasmid subclone derived from SalI-cut phage λTP1 DNA (pBC1.1), and was confirmed by the sequence data obtained from pCT4.2 and pCT4.3. The repeat unit defined by the SalI fragment is 497 bp in length (Fig. 2A). Within this unit lies a 121-bp (220-340) fragment which shows over 95% homology to the angiosperm consensus 5S rRNA coding sequence (Fig. 2B). The sequences derived from the SalI and BamHI clones are identical within the presumed coding region and differ by one bp (C177→G) in the nontranscribed spacer sequence.

[Restriction map: genomic insert between the LEFT ARM and RIGHT ARM of the vector; scale bar, 1 kb; filled box, 5S DNA.]

Fig. 1. Restriction map of genomic clone λTP1. A. thaliana (Nd-0) genomic DNA was partially digested with Sau3A, size-selected, and cloned into λEMBL4. λTP1 was identified by screening the resulting library with flax 5S rDNA (pBG13; Goldsbrough et al., 1981). Phage λ DNA was isolated and purified by the glycerol-step-gradient procedure (Sambrook et al., 1989). λTP1 was mapped by Southern blotting of single and double-restriction digests of phage λ DNA subjected to FIGE. The blots were probed sequentially with 5S rDNA (pCT4.2) and phage λ DNA. For detailed analysis of the 5S rDNA region, the vector arms were removed by digestion with EcoRI and the purified insert was subjected to partial digestion with SalI. Southern blots of the partial digests were probed with 5S rDNA. B, BglII; H, HindIII; S, SalI. No restriction sites for EcoRI and KpnI. SalI sites within the 5S DNA block are not shown.

(c) 5S rDNA gene copy number

The 5S rDNA gene copy number was estimated by comparing the hybridization of labeled insert excised from pCT4.3 to known amounts of immobilized (dot blotted) pCT4.2 and genomic DNAs (Micron Separations Inc. protocol guide, 1989). Assuming a haploid genome size of 70000 kb (Leutwiler et al., 1984), the 5S rDNA repeat unit was estimated to be present at approx. 1000 copies per genome in the ecotypes En-2, Col-0, An-1 and Nd-0. Therefore, the 5S rDNA comprises roughly 0.7% of the Arabidopsis genome. These results are comparable to those of Pruitt and Meyerowitz (1986), who estimated that the Columbia ecotype contains 570 copies of the large rRNA repeat unit per haploid genome.
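The arithmetic behind these figures can be sketched as follows. The first function reproduces the genome fraction from the numbers in the text. The second shows, in general terms, how a dot-blot comparison converts to a copy number; the masses passed to it (7.1 ng of repeat DNA matching the signal from 1000 ng of genomic DNA) are hypothetical values chosen for illustration, not data from the paper, and both function names are assumptions.

```python
def genome_fraction(copies: float, repeat_bp: int, genome_kb: float) -> float:
    """Fraction of the haploid genome occupied by a tandemly repeated unit."""
    return copies * repeat_bp / (genome_kb * 1000)

def copies_from_dot_blot(repeat_ng: float, genomic_ng: float,
                         genome_kb: float, repeat_bp: int) -> float:
    """Copy number implied when `repeat_ng` of pure repeat DNA gives the same
    hybridization signal as `genomic_ng` of genomic DNA."""
    mass_fraction = repeat_ng / genomic_ng
    return mass_fraction * genome_kb * 1000 / repeat_bp

# Values from the text: 1000 copies, 497-bp repeat, 70000-kb haploid genome.
frac = genome_fraction(1000, 497, 70_000)
print(f"{frac:.2%}")  # 0.71%

# Hypothetical dot-blot reading, for illustration only.
print(round(copies_from_dot_blot(7.1, 1000.0, 70_000, 497)))  # 1000
```

The two functions are inverses of each other, so a mass fraction of about 0.7% and a copy number of about 1000 are the same statement for a 497-bp unit in a 70000-kb genome.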

(d) Restriction site polymorphism analysis of 5S rDNA genes in six Arabidopsis ecotypes

Southern blots of DNA from six ecotypes digested with different restriction enzymes showed variability in the 5S rDNA gene sequences of Arabidopsis, and provided information concerning their organization within the genome. Digestion with HindIII generates the most easily distinguishable polymorphisms (Fig. 3A). In the En-2 ecotype, there do not appear to be any digestible HindIII sites in the array(s). In all the other ecotypes, the 497-bp monomer is not the predominant band and gaps appear in the ladder of 497-bp multimers. There are also a number of minor fragments whose lengths are not multiples of 497, some of which are common to more than one ecotype. Polymorphism occurs in λTP1, where only one of the seven repeat units has a HindIII site. The repeat unit present in pBC1.1 contains six hexanucleotide sequences which could be converted to HindIII sites by either a single bp change or an insertion. The minor fragments observed in the genomic digests could also represent some 5S rRNA genes apart from the main tandem array(s), or rearrangements at the ends of arrays.
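The idea of a hexanucleotide that is one change away from a HindIII site can be illustrated with a short scan. This is a minimal sketch, not the authors' analysis: it considers single-base substitutions only (insertions are not modelled), and it runs on nt 101-150 of the repeat unit as transcribed in Fig. 2A rather than on the whole 497-bp sequence, so it does not attempt to reproduce the count of six given in the text. The function name `near_sites` is an assumption.

```python
HINDIII = "AAGCTT"  # HindIII recognition sequence

# nt 101-150 of the pBC1.1 repeat unit, copied from Fig. 2A.
EXCERPT_START = 101
EXCERPT = "CATGCATCAAAGCTCGTTAAGACTAGATGGGGGATCCCTACATAGCGGGT"

def near_sites(seq, site, max_mismatch=1, first_nt=1):
    """List (coordinate, window, mismatches) for every window of len(site)
    that differs from `site` by at most `max_mismatch` substitutions."""
    seq = seq.upper()
    k = len(site)
    hits = []
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        mismatches = sum(a != b for a, b in zip(window, site))
        if mismatches <= max_mismatch:
            hits.append((first_nt + i, window, mismatches))
    return hits

hits = near_sites(EXCERPT, HINDIII, max_mismatch=1, first_nt=EXCERPT_START)
print(hits)  # [(110, 'AAGCTC', 1)]
```

In this excerpt the hexamer AAGCTC at nt 110 would become a HindIII site through a single C to T change, which is the kind of near-site the paragraph above refers to.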

Digestion with EcoRI also revealed differences among the six ecotypes (Fig. 3B). The Col-0 ecotype yielded a typical ladder of 497-bp multimers. A similar ladder hybridized with lower intensity in Nd-0. EcoRI did not cut within the 5S rDNA of λTP1, but the sequenced repeat units contain two hexanucleotide sequences which could become EcoRI sites through single bp alterations. A faint ladder appeared in An-1, and little or no cutting was observed in the other three ecotypes.

A

1 CCCAAATTTTGACCTTTAAGTACTTTTTCGGGCATTTTCGTGATTTGGGC 50
51 TAT?TTACGGACCCAAAATTACTTGTTCAAGCATTGTTTTCGAATTTTTT 100
101 CATGCATCAAAGCTCGTTAAGACTAGATGGGGGATCCCTACATAGCGGGT 150
151 GGGACCCACGGCGAATGGTTCATCAACTCTTCAAAAAAGAATATATACGA 200
201 TTGCATTGCATATACTAACGGATGCGATCATACCAGCACTAATGCACCGG 250
251 ATCCCATCAGAACTCCGCAGTTAAGCGTTCTTGGGCGAGAGTAGTACTAG 300
301 GATGGGTGACCTCCCGGGAAGTCCTCGTGTTGCATCCCTCTTTATATGTT 350
351 TAACCTTTTTTTTTTTGGTTAAAACTTTATGACTCCATAACTTTTAGACC 400
401 GTGAGCCAAACTTGGCATGTGATACCTTTTCGGAAAGCCCAAAGACAGCC 450
451 CTCCGACGAAAGAAGCAGGACAACTTTTCCATTGACTTTTTGTCGAC 497

B

Arabidopsis GGATGCGATC ATACCAGCAC TAATGCACCG GATCCCATCA GAACTCCGCA 50
Consensus   gGgTgCGATC ATACCAGCAC TAatGCACCG GATCCCaTCA GAACTCCgcA 50

Arabidopsis GTTAAGCGTT CTTGGGCGAG AGTAGTACTA GGATGGGTGA CCTCCCGGGA 100
Consensus   gTTAAgCGtG CTtgGGcGAG AGTAGTAC?A ggATGGGTGA cCtCcTGGGA 100

Arabidopsis AGTCCTCGTG TTGCATCCCT C 121
Consensus   AGTCCTcGTG TTGCA-cccc c 120

Fig. 2. Sequence of the A. thaliana 5S DNA repeat unit and comparison with plant consensus sequence. (A) Sequence of an entire 5S rDNA SalI repeat unit from pBC1.1. DNA isolated from phage λTP1 was digested with either BamHI or SalI, subcloned into Bluescript M13(-) (Stratagene, La Jolla, CA), and transformed into DH5α. Colonies containing repeat unit-sized 5S rDNA fragment(s) were identified by restriction analysis, and by probing Southern blots with flax 5S rDNA (pBG13; Goldsbrough et al., 1981). Plasmids containing 5S rDNA (pCT4.2, pCT4.3, pBC1.1) were introduced into E. coli JM109 to propagate them for sequencing. Sequencing of single-stranded and double-stranded templates was performed by the dideoxy termination method using Sequenase (US Biochemicals, Cleveland, OH). Single-strand DNA rescue was performed using the VCS helper phage and protocol supplied by Stratagene. Synthesis of labeled DNA was primed with the reverse, KS and T3 primers, and oligodeoxyribonucleotides complementary to both strands of the 5S coding region of the flax 5S repeat unit. The nt sequence has been submitted to GenBank and assigned the accession No. M65137. (B) Comparison of the A. thaliana 5S rDNA gene coding region with an angiosperm consensus sequence compiled from Pisum sativum, Vicia faba, Phaseolus vulgaris, Lupinus luteus, Matthiola incana, Lycopersicum esculentum, Helianthus annuus, Spinacia oleracea, Linum usitatissimum, Lemna minor, Secale cereale and Triticum aestivum (Ellis et al., 1988, and references therein). The nt in the A. thaliana gene which differ from the consensus sequence are overlined and in boldface; nt in lower case letters vary among the twelve species used to compile the consensus sequence (the most common nt is shown).
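The two BamHI sites discussed in section (b) can be located directly in the sequence of Fig. 2A. The sketch below is a minimal illustration rather than the authors' procedure: it scans nt 101-300 of the repeat unit, as transcribed in Fig. 2A, for exact matches to the BamHI recognition sequence. The function name `find_sites` and the constant names are assumptions made for this example.

```python
# nt 101-300 of the pBC1.1 repeat unit, copied from Fig. 2A (four rows of 50 nt).
EXCERPT_START = 101
EXCERPT = (
    "CATGCATCAAAGCTCGTTAAGACTAGATGGGGGATCCCTACATAGCGGGT"
    "GGGACCCACGGCGAATGGTTCATCAACTCTTCAAAAAAGAATATATACGA"
    "TTGCATTGCATATACTAACGGATGCGATCATACCAGCACTAATGCACCGG"
    "ATCCCATCAGAACTCCGCAGTTAAGCGTTCTTGGGCGAGAGTAGTACTAG"
)

BAMHI = "GGATCC"  # BamHI recognition sequence

def find_sites(seq, site, first_nt=1):
    """Return the coordinate of the first base of every exact match to `site`."""
    hits = []
    i = seq.find(site)
    while i != -1:
        hits.append(first_nt + i)
        i = seq.find(site, i + 1)  # step by one so overlapping matches are kept
    return hits

bamhi_sites = find_sites(EXCERPT, BAMHI, EXCERPT_START)
print(bamhi_sites)                       # [132, 249]
print(bamhi_sites[1] - bamhi_sites[0])   # 117
```

One site (nt 132) lies in the nontranscribed spacer and the other (nt 249) lies inside the 220-340 coding region, with the two sites 117 bp apart.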

When partially digested with BamHI, all six ecotypes produced a 497-bp ladder ranging up to at least the 15-mer, with some minor bands, while more complete digestion yielded mostly the monomer and dimer (not shown). This indicates either that one of the two BamHI sites found in the sequenced subclones occurs infrequently in the gene family, or that most of the 5S rDNA is not digested at one of these sites, possibly due to nearly complete methylation. Analysis of pBC1.1 shows that the BamHI site within the transcribed region is followed by a GG dinucleotide (Fig. 2A, nt 247-248 on the complementary strand), which would allow methylation of one or both C residues. Extensive methylation of the 5S rRNA genes would be consistent with their low representation in libraries made in the methylation-sensitive host, VCS-257. Interestingly, the one clone isolated contained non-5S flanking sequence. Similar results were obtained by Schneeberger et al. (1989), who isolated 5S rDNA sequences from a flax genomic library constructed with the same vector and host strain as used in this study. Ten of the eleven clones isolated by these workers contained junctions between 5S rDNA and non-5S sequences, indicating enhanced representation of these junction fragments in the library (relative to fragments entirely within the long tandem arrays). One possible explanation for this phenomenon is that the repeat units at the ends of the long tandem array(s) are not as highly methylated. Alternatively, λTP1 may represent part of a smaller array which is not methylated, or a methylated sequence which was nevertheless incorporated into the library.
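The reasoning about the BamHI digests can be made concrete with a small calculation. The sketch below lists the band sizes expected in a partial-digest ladder and the fragments expected from complete digestion of a long tandem array, depending on which sites are cleavable. The site coordinates 132 and 249 are taken from the sequence as transcribed in Fig. 2A and, like the function names, are assumptions of this example rather than values stated by the authors.

```python
REPEAT_BP = 497

def multimer_ladder(repeat_bp, max_units):
    """Band sizes for 1-mer up to max_units-mer of a tandem repeat."""
    return [n * repeat_bp for n in range(1, max_units + 1)]

def complete_digest(repeat_bp, site_offsets):
    """Fragment sizes when every repeat in a long tandem array is cut at each
    of the given offsets (end fragments of the array are ignored)."""
    offsets = sorted(site_offsets)
    sizes = []
    for i, start in enumerate(offsets):
        nxt = offsets[(i + 1) % len(offsets)]  # wraps into the next repeat unit
        gap = (nxt - start) % repeat_bp
        sizes.append(gap if gap else repeat_bp)
    return sorted(sizes)

print(multimer_ladder(REPEAT_BP, 15)[-1])      # 7455 (the 15-mer)
print(complete_digest(REPEAT_BP, [132, 249]))  # [117, 380] if both sites are cut
print(complete_digest(REPEAT_BP, [249]))       # [497] if only one site is cut
```

A digest dominated by the 497-bp monomer, rather than by 117-bp and 380-bp fragments, is what would be expected if only one of the two sites is cut in most repeat units, which is the situation described above.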

Digests using XhoI, which does not have a site within the sequenced 5S rDNA repeat unit, did not produce a multimer ladder in any ecotype (Fig. 3C). Examination of ethidium-bromide-stained gels showed extensive digestion of genomic DNA of all ecotypes by XhoI, but the only change in the distribution of hybridization after digestion with this enzyme was a shift to slightly lower Mr fragments (not shown). This suggests that the 5S rRNA genes are arrayed in tandems whose lengths exceed the size of the DNA which has been prepared and resolved on these gels.

The enzyme SalI also did not differentiate between the ecotypes. Very similar ladders of multimers were produced in each case (Fig. 3D). However, a single polymorphism was detected in λTP1, where digestion with SalI indicated the presence of an additional restriction site in one of the seven repeat units.

Fig. 3. Southern (1975) blots of different A. thaliana ecotypes probed with 5S rDNA. DNA was extracted by the procedure of Cullis (1981), omitting the RNase and pronase digestions. DNA from different plant ecotypes was digested, separated by vertical 1% agarose gel electrophoresis, blotted onto nylon membrane and hybridized with 5S rDNA from pCT4.2. DNA of ecotypes (left to right) Nd-0, Rld, La-0, An-1, En-2, and Col-0 was digested with: (A) HindIII, (B) EcoRI, (C) XhoI, (D) SalI.

(e) Conclusions

(1) An Arabidopsis genomic fragment containing a tandem array of seven 5S rRNA genes flanked on one side by 11 kb of non-5S sequence has been cloned in λEMBL4.

(2) The repeat length of the Arabidopsis 5S rRNA gene is 497 bp. The sequence of the presumed coding region is 121 bp long and shows a high degree of homology with the 5S rDNA from other angiosperm species.

(3) The 5S rDNA repeat unit is present in approx. 1000 copies per haploid genome, comprising 0.7% of the Arabidopsis genome. The gene copy number is similar in the four ecotypes examined.

(4) Restriction analysis and sequencing indicate minor heterogeneities among the repeat units within the genomic clone, and in the genome as a whole. Comparison of restriction patterns among ecotypes revealed a number of polymorphisms. Some of these may be due to the genomic methylation suggested by the BamHI digests, while others probably represent genuine sequence polymorphisms.

ACKNOWLEDGEMENTS

This work was supported in part by D.O.E. grant DE-FG02-88ER13907 to C.D.T., and by a grant from the Ohio Board of Regents. B.R.C. was supported in part by Public Health Service training grant 5-T32-HDI7104-12. We would like to thank Dr. D. Setzer, Dr. R.G. Schneeberger and Dr. S.W. Gorman for critical review of the manuscript and helpful advice.

REFERENCES

Cullis, C.A.: DNA sequence organization in the flax genome. Biochim. Biophys. Acta 652 (1981) 1-15.

Ellis, T.H.N., Lee, D., Thomas, C.M., Simpson, P.R., Cleary, W.G., Newman, M.-A. and Burcham, K.W.G.: 5S rRNA genes in Pisum: sequence, long range and chromosomal organization. Mol. Gen. Genet. 214 (1988) 333-342.

Erdmann, V.A. and Wolters, J.: Collection of published 5S, 5.8S and 4.5S ribosomal RNA sequences. Nucleic Acids Res. 14 suppl. (1986) r1-r35.

Goldsbrough, P.B., Ellis, T.H.N. and Cullis, C.A.: Organization of the 5S RNA genes in flax. Nucleic Acids Res. 9 (1981) 5895-5904.

Graham, M.W., Doherty, J.P. and Woodcock, D.M.: Efficient construction of plant genomic libraries requires the use of mcr- host strains and packaging mixes. Plant Mol. Biol. Rep. 8 (1990) 18-26.

Leutwiler, L.S., Hough-Evans, B.R. and Meyerowitz, E.M.: The DNA of Arabidopsis thaliana. Mol. Gen. Genet. 194 (1984) 15-23.

Long, E.O. and Dawid, I.B.: Repeated genes in eukaryotes. Annu. Rev. Biochem. 49 (1980) 727-764.

Mao, J., Appel, B., Schaack, J., Yamada, H. and Söll, D.: The 5S genes of Schizosaccharomyces pombe. Nucleic Acids Res. 10 (1982) 487-500.

Micron Separations, Inc. Protocol Guide. Westborough, MA, 1989.

Pruitt, R.E. and Meyerowitz, E.M.: Characterization of the genome of Arabidopsis thaliana. J. Mol. Biol. 187 (1986) 169-183.

Sambrook, J., Fritsch, E.F. and Maniatis, T.: Molecular Cloning. A Laboratory Manual, 2nd ed. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1989.

Schneeberger, R.G., Creissen, G.P. and Cullis, C.A.: Chromosomal and molecular analysis of 5S gene organization in flax, Linum usitatissimum. Gene 83 (1989) 75-84.

Selker, E.U., Yanofsky, C., Driftmier, C., Metzenberg, R.L., Alzner-DeWeerd, B. and RajBhandary, U.: Dispersed 5S genes in N. crassa: structures, expression and evolution. Cell 24 (1981) 819-828.

Southern, E.M.: Detection of specific sequences among DNA fragments separated by gel electrophoresis. J. Mol. Biol. 98 (1975) 503-517.