
    Plant Genome Projects

    Renate Schmidt, Max Planck Institute, Cologne, Germany

    A genome project aims to discover all genes and their functions in a particular species. Plant genome projects have focused on a few model organisms that are characterized by small genomes or their amenability to genetic studies.

    Introduction

    Plant genomes have been extensively studied at the cytological, genetic and molecular level and large differences in chromosome number, genome size and ploidy level have been found in the plant kingdom. Detailed chromosome maps and whole-genome sequences are prerequisites for a detailed structural description of a genome. To assess gene function, expression studies and mutational analyses are required. Comparative approaches serve as a tool to transfer the knowledge and resources that have been assembled for model plants, especially Arabidopsis, rice and maize, to a wide variety of species.

    Arabidopsis and Rice

    Arabidopsis thaliana, a small dicotyledonous crucifer, has one of the smallest known genomes in higher plants, at approximately 125 Mbp. The short life cycle, small stature and large number of progeny make it ideally suited for genetic and mutational analysis. The important crop plant rice, Oryza sativa, with 430 Mbp, has one of the smallest genomes known for monocotyledonous plants. Due to their small genome sizes, Arabidopsis and rice have been chosen for detailed genome analyses.

    Mutants have been identified in rice and Arabidopsis and many of them have been placed on genetic maps. Likewise, very extensive molecular marker maps have been assembled for the five Arabidopsis and the 12 rice chromosomes (Harushima et al., 1998). Restriction fragment length polymorphism (RFLP) markers constitute a particularly versatile molecular marker system. Genomic or complementary DNA (cDNA) clones are used to detect polymorphisms at restriction sites in the DNA of individuals in genomic blot hybridizations. Genetic linkage maps are constructed by analysing, with many different markers, the progeny of crosses between individual plants that are polymorphic at the DNA level.
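
    As a hedged illustration of how linkage between two markers can be estimated from such progeny data, the following Python sketch computes the recombination fraction, that is, the share of progeny in which the two markers were inherited from different parents. It assumes a population in which every individual can be scored as carrying the allele of parent A or parent B at each marker (for example a backcross or recombinant inbred population). The function name and the genotype strings are invented for illustration; F2 populations such as the one used by Harushima et al. require a more elaborate estimate.

```python
def recombination_fraction(marker1: str, marker2: str) -> float:
    """Estimate the recombination fraction between two markers.

    Each string holds one character per progeny plant: 'A' or 'B' for the
    parent the allele came from, '-' for a missing score (assumed coding).
    """
    # Keep only plants that were scored for both markers.
    scored = [(x, y) for x, y in zip(marker1, marker2) if x != "-" and y != "-"]
    if not scored:
        raise ValueError("no plant was scored for both markers")
    # A plant is recombinant if the two markers come from different parents.
    recombinants = sum(x != y for x, y in scored)
    return recombinants / len(scored)


# Hypothetical scores for ten progeny plants at two RFLP markers.
print(recombination_fraction("AABBABABAB", "AABBABABBA"))
```

    Markers with a small recombination fraction are placed close together on the linkage map; values near 0.5 indicate that the markers are unlinked.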

    For expressed sequence tag (EST) projects, thousands of partial sequences of randomly chosen cDNA clones are generated. These projects provide a catalogue of transcribed sequences for an organism in a cost-efficient manner. In such a collection, many genes will be represented multiple times. This is exploited to construct consensus sequences of ESTs that are longer than individual ESTs; in some cases even the full-length transcript of a gene can be reconstructed. Over 110 000 EST sequences have been generated for Arabidopsis thaliana and approximately 70 000 for rice. It has been estimated that these tags represent approximately 30–60% of all genes in these species. Particularly in rice, many EST sequences have been used as RFLP markers, thus enabling genes to be anchored on the genetic map.
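
    The idea of building a longer consensus from redundant ESTs can be sketched as a greedy overlap merge. This is a minimal sketch, not the method of any particular EST project: it assumes error-free reads from the same strand, exact suffix-to-prefix overlaps and an arbitrary minimum overlap length, and the function names and sequences are invented for illustration. Real assemblers also have to cope with sequencing errors and reverse-complemented reads.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that equals a prefix of b."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def merge_ests(ests: list[str], min_len: int = 3) -> list[str]:
    """Greedily merge the pair with the largest overlap until none is left."""
    seqs = list(dict.fromkeys(ests))  # drop exact duplicates, keep order
    # Drop reads that are fully contained in another read.
    seqs = [s for s in seqs if not any(s != t and s in t for t in seqs)]
    while True:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(seqs):
            for j, b in enumerate(seqs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            return seqs  # no remaining overlaps: these are the consensus sequences
        merged = seqs[best_i] + seqs[best_j][best_len:]
        seqs = [s for k, s in enumerate(seqs) if k not in (best_i, best_j)]
        seqs.append(merged)


# Four hypothetical ESTs from one transcript, one of them sequenced twice.
print(merge_ests(["ATGGCGTAC", "CGTACGGAT", "GGATTTCA", "ATGGCGTAC"]))
```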

    In order to study a genome in detail, it is necessary to establish clone libraries covering the entire genome. To do this, high-molecular weight plant DNA is cloned into bacterial (BAC) or yeast artificial chromosome (YAC) vectors and the resulting artificial chromosomes carrying inserts of plant DNA spanning 100 kbp or more are maintained alongside the bacterial or yeast chromosomes. Unique coordinates are assigned to any particular clone in a library, ensuring that all mapping results obtained with these libraries can be directly compared.

    Chromosome maps based on artificial chromosome clones can be generated by applying a map-based approach and a fingerprinting strategy. Using the map-based approach, molecular-mapped markers are used as probes to identify and anchor clones on the genetic map. Given a large number of markers and sufficiently redundant libraries with large DNA inserts, clones will be identified that span two or more markers. Those clones sharing the same markers can be assembled into a set of contiguous clones (contig). This strategy has been successfully used to generate YAC contigs spanning large areas of the Arabidopsis and rice genomes and maps covering entire chromosome arms have been assembled (Figure 1) (Schmidt et al., 1995).
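
    The grouping step of the map-based approach can be sketched in a few lines of Python. The sketch below treats two clones as linked when they hybridize to at least one common marker and reports each connected group of clones as a contig. The clone and marker names are invented, and the simplification that a single shared marker is sufficient evidence for overlap is an assumption of this example.

```python
from collections import defaultdict


def build_contigs(hits: dict[str, set[str]]) -> list[set[str]]:
    """Group clones into contigs; clones sharing a marker are linked."""
    # Index: marker -> all clones that were identified with this marker.
    by_marker = defaultdict(set)
    for clone, markers in hits.items():
        for marker in markers:
            by_marker[marker].add(clone)

    seen, contigs = set(), []
    for start in hits:
        if start in seen:
            continue
        # Walk from clone to clone through shared markers.
        contig, stack = set(), [start]
        while stack:
            clone = stack.pop()
            if clone in contig:
                continue
            contig.add(clone)
            for marker in hits[clone]:
                stack.extend(by_marker[marker] - contig)
        seen |= contig
        contigs.append(contig)
    return contigs


# Hypothetical hybridization results: clone -> markers that detected it.
hits = {
    "YAC1": {"m1", "m2"},
    "YAC2": {"m2", "m3"},
    "YAC3": {"m5"},
    "YAC4": {"m3", "m4"},
}
print(build_contigs(hits))
```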

    For a fingerprinting strategy, all clones of a library are digested with appropriate restriction endonucleases and the resulting fragments separated on gels. The sizes of all fragments are estimated and recorded. A comparison of the fragment patterns for all different clones reveals overlapping clones. According to these results, the clones are arranged into contigs. This strategy has been successfully applied to generate large BAC contigs for the Arabidopsis and rice genomes (Marra et al., 1999). Anchoring of the resulting contigs on the genetic map is performed using molecular-mapped markers as probes to identify corresponding BAC clones (Figure 1).
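
    The comparison of fragment patterns can be illustrated with a small Python sketch. It counts the fragments of two clones that agree in size within a tolerance and calls the clones overlapping when a sufficient fraction of bands is shared. The tolerance of 5 bp, the threshold of 0.5 and all fragment sizes are assumptions chosen for the example; established fingerprinting software uses statistical scores rather than a fixed fraction.

```python
def shared_bands(a: list[int], b: list[int], tol: int = 5) -> int:
    """Count fragments of a that match a distinct fragment of b within tol bp."""
    a, b = sorted(a), sorted(b)
    i = j = shared = 0
    # Walk both sorted size lists in parallel, pairing each band at most once.
    while i < len(a) and j < len(b):
        if abs(a[i] - b[j]) <= tol:
            shared += 1
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return shared


def likely_overlap(a: list[int], b: list[int], tol: int = 5,
                   min_fraction: float = 0.5) -> bool:
    """Call two clones overlapping if enough of the smaller pattern is shared."""
    if not a or not b:
        return False
    return shared_bands(a, b, tol) / min(len(a), len(b)) >= min_fraction


# Hypothetical fragment sizes (bp) for three BAC clones.
bac_a = [1200, 2500, 3100, 4800, 7600]
bac_b = [2498, 3103, 4802, 9100, 650]
bac_c = [900, 1800, 5600]
print(likely_overlap(bac_a, bac_b), likely_overlap(bac_a, bac_c))
```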


    ENCYCLOPEDIA OF LIFE SCIENCES / © 2001 Macmillan Publishers Ltd, Nature Publishing Group / www.els.net


    High-density BAC contig maps are currently providing the templates for large-scale sequencing of genomes (Figure 1). BAC clones sharing minimal overlaps are chosen for sequencing experiments. The identification of genes in the resulting genomic sequence must rely largely on predictions using suitable computer algorithms. Comparisons of genomic sequence with sequence databases, e.g. the EST databases, are equally important for annotation of gene sequences. Sequencing of the rice genome began in 1998 and is due to conclude in 2003. For Arabidopsis, more than 90% of the genomic sequence has been determined; only highly repetitive regions, such as centromeric and nucleolar organizing regions, have not been sequenced. Analysis of large contiguous segments of sequence has shown that a gene is found on average every 4–5 kbp in the Arabidopsis genome. Clusters of related genes are frequently observed. Approximately half of the predicted genes have sufficient similarity to known sequences to assign a putative function to the encoded proteins. Retroelement-like sequences are rarely found interspersed with genes and the majority of repetitive sequences are found clustered in the centromeric regions of the chromosomes (Lin et al., 1999; Mayer et al., 1999).
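
    Choosing BAC clones that share only minimal overlaps amounts to selecting a small set of clones that still covers the whole contig. The Python sketch below uses a standard greedy interval cover for this purpose. It assumes that each clone has already been assigned start and end coordinates along the contig; the clone names and coordinates (in kbp) are hypothetical.

```python
def minimal_tiling_path(clones: list[tuple[str, int, int]]) -> list[str]:
    """Greedy cover: repeatedly pick the clone that starts inside the region
    covered so far and extends it farthest to the right."""
    if not clones:
        return []
    clones = sorted(clones, key=lambda c: c[1])  # sort by start coordinate
    target = max(end for _, _, end in clones)
    path, covered_to, i = [], clones[0][1], 0
    while covered_to < target:
        best = None
        # Examine every clone that starts within the covered region.
        while i < len(clones) and clones[i][1] <= covered_to:
            if best is None or clones[i][2] > best[2]:
                best = clones[i]
            i += 1
        if best is None or best[2] <= covered_to:
            raise ValueError("gap in contig: no clone extends the covered region")
        path.append(best[0])
        covered_to = best[2]
    return path


# Hypothetical BAC clones as (name, start, end) along one contig.
bacs = [("B1", 0, 100), ("B2", 20, 110), ("B3", 90, 200),
        ("B4", 95, 180), ("B5", 190, 300)]
print(minimal_tiling_path(bacs))
```

    Clones left out of the path, such as B2 and B4 in this example, are redundant for sequencing because their inserts are already covered by the selected clones.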

    The Arabidopsis and rice genome projects have resulted in the construction of densely populated genetic maps and detailed clone contig maps that are highly integrated (Figure 1). This facilitates gene isolation procedures using map-based approaches, as has been documented by the successful completion of positional cloning experiments. Now the emphasis is shifting towards the functional analysis of genes. Insertional mutagenesis systems (see below) as well as global transcript analysis via high-density arrays of oligonucleotides or cDNAs will then play a crucial role.

    Maize

    The maize genome, with approximately 2500 Mbp, is much larger than the Arabidopsis or rice genomes. Furthermore, it is of polyploid origin, with most genes being present in duplicate. The number of genes has been estimated to be between 40 000 and 50 000. Retrotransposon-like sequences are frequently found interspersed with gene sequences (SanMiguel et al., 1996). The large proportion of these repetitive elements explains the large genome size.

    A large genome size poses special problems in genome analysis studies: although most of the described techniques can be applied to large as well as small genomes, the labour and cost involved are far higher for large genomes. Hence, the complete genome sequence is not the immediate goal of the maize genome project; rather, large-scale EST projects are carried out to describe most of the maize genes in a cost-efficient manner. A similar situation exists for other cereals such as wheat and barley. In parallel, high-density genetic maps are being assembled for maize and clone contig construction is under way to establish highly integrated genetic and physical maps. Furthermore, information on the smaller rice and sorghum genomes is being exploited to further genome studies in maize (see below).

    Because it possesses a number of extremely well-characterized mobile genetic elements (transposons), maize is highly amenable to gene function studies, as insertion mutants may readily be generated. Transposon mutagenesis (Figure 2a) has led to the discovery of many important genes in maize and large collections of lines carrying transposon insertions in different positions of the genome have been generated. Using a reverse genetics approach, insertion mutants in virtually any gene of interest can now be identified (Figure 2b).
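
    The logic of the reverse genetics screen can be mimicked in a toy Python model: a PCR product is only expected when a transposon lies within the gene of interest, so that the gene-specific primer and the transposon-specific primer face each other at an amplifiable distance. Everything in this sketch is a simplifying assumption made for illustration, including the coordinates, the primer placement at the gene start, the 50 bp primer offset inside the transposon, and the fact that transposon orientation is ignored.

```python
from typing import Optional


def pcr_product_length(gene: tuple[int, int], insertion: int,
                       te_primer_offset: int = 50) -> Optional[int]:
    """Expected product length in bp, or None if no product can form.

    gene is (start, end). The gene-specific primer is assumed to sit at the
    gene start and to point into the gene; te_primer_offset is the assumed
    distance from the transposon end to the transposon-specific primer.
    """
    start, end = gene
    if not start <= insertion <= end:
        return None  # transposon outside the gene: the primers never pair up
    return (insertion - start) + te_primer_offset


def screen_lines(gene: tuple[int, int],
                 lines: dict[str, list[int]]) -> dict[str, int]:
    """Return the lines that yield a PCR product, with the product length."""
    positives = {}
    for name, insertions in lines.items():
        for site in insertions:
            length = pcr_product_length(gene, site)
            if length is not None:
                positives[name] = length
                break  # one product is enough to flag the line
    return positives


# Hypothetical gene at 1000-3000 bp and three lines with known insertion sites.
lines = {"line1": [500, 7200], "line2": [1800], "line3": []}
print(screen_lines((1000, 3000), lines))
```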

    Figure 1 Components of a genome project. On the left, a schematic representation of a molecular marker map (A) for a chromosome is shown. Molecular markers are depicted as horizontal lines. Yeast artificial chromosome (YAC) clones, shown as long vertical black lines, are anchored by molecular markers onto the genetic map. The marker content of all clones in a particular region of the chromosome is assessed to build large contigs (B). High-density bacterial artificial chromosome (BAC) contigs (C), displayed as short vertical black lines, are established by fingerprinting techniques. Molecular markers anchor the contigs on the genetic map. For a completely sequenced BAC (D), predicted genes are shown as open boxes in the right part of the figure for the Watson and Crick strands. A gene corresponding to one of the genes of the sequenced BAC has been used to anchor YAC clones and for genetic mapping experiments; thus it provides a direct link between the genetic map, the physical map and the genomic sequence, indicated by the dashed line.


    Since maize transposons are also active if introduced into other plants, they are widely exploited for the generation of insertion mutants in other species, such as Arabidopsis and rice. In these species, too, the systematic elucidation of gene functions is carried out using reverse genetics.

    Comparative Genomics

    Genomes from closely related plant species show considerable DNA homology and often have the same number of chromosomes. Comparative genetic mapping experiments have been carried out to address the question whether the order of genes is also conserved between species.

    Genetic linkage maps based on molecular markers have been assembled for many different plant species. Often, cDNA or gene sequences are used as RFLP markers. Their high conservation during evolution allows the use of RFLP markers not only in the species they are derived from but also in closely related species. If the same set of molecular markers is used for genetic mapping experiments in different species, the resulting genetic maps can then be compared. Such experiments have been carried out in several different plant families and extensive conservation of marker repertoire and order has been found. A colinear order of markers has been observed for segments of chromosomes or in some cases even entire chromosomes (Figure 3a).
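
    How colinear segments are recognized when two maps share a marker set can be sketched as follows. The Python example walks through the markers in their order on a chromosome of species A and collects runs that lie on one chromosome of species B in consistently increasing or decreasing order; a decreasing run corresponds to an inverted segment. The marker names, map positions and the minimum run length of two markers are assumptions made for this illustration.

```python
def colinear_segments(order_a: list[str],
                      pos_b: dict[str, tuple[str, int]],
                      min_len: int = 2) -> list[list[str]]:
    """Split markers (in species A order) into runs that map to a single
    species B chromosome in a consistent direction."""
    shared = [m for m in order_a if m in pos_b]  # markers mapped in both species
    segments, run, direction = [], [], 0
    for marker in shared:
        if run:
            prev_chrom, prev_pos = pos_b[run[-1]]
            chrom, pos = pos_b[marker]
            step = (pos > prev_pos) - (pos < prev_pos)  # +1, -1 or 0
            if chrom == prev_chrom and step != 0 and direction in (0, step):
                run.append(marker)
                direction = step
                continue
            segments.append(run)  # run is broken: store it and start anew
        run, direction = [marker], 0
    if run:
        segments.append(run)
    return [s for s in segments if len(s) >= min_len]


# Hypothetical data: marker order in species A, (chromosome, rank) in species B.
order_a = ["m1", "m2", "m3", "m4", "m5", "m6", "m7"]
pos_b = {"m1": ("B1", 1), "m2": ("B1", 2), "m3": ("B1", 3),
         "m4": ("B2", 9), "m5": ("B2", 8), "m6": ("B2", 7),
         "m7": ("B3", 1)}
print(colinear_segments(order_a, pos_b))
```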

    Almost complete genome colinearity has been observed in the Solanaceae family. Differences in marker organization on the 12 tomato and potato chromosomes can be explained by five chromosomal inversions (Tanksley et al., 1992). For the grass family (Poaceae), a high degree of genome conservation has been established even between species which diverged as long as 60 million years ago and which differ considerably in genome size. A close examination of data for the rice, maize and wheat genomes has led to the conclusion that the organization of all different chromosomes in the grass family can be described by a limited number of evolutionarily related chromosome segments. This concept allows multiple alignments of chromosome maps. Comparison of the genetic maps of Arabidopsis thaliana and Brassica has also revealed many colinear segments.

    To obtain information about areas of the chromosomes lying between molecular markers it is necessary to clone and characterize these regions in detail. Using comparative physical mapping and sequencing experiments, the conservation of local gene order, orientation and spacing is addressed. So far, only a few studies have been carried out in the Poaceae and Brassicaceae families and more data are needed to draw firm conclusions about the degree of micro-colinearity between genomes. Through the analysis of small homologous chromosome segments in different species, the same genes are discovered. Furthermore, the order of genes is generally maintained, although their spacing can vary widely in different species. High sequence homologies are confined to exon sequences and repetitive elements are not conserved. Disruptions of the overall conservation of local gene order have also been found. Copy number changes of genes are observed, as well as insertions or deletions of gene sequences (Figure 3b) (Tikhonov et al., 1999).

    Their small sizes have made the Arabidopsis and rice genomes the best-studied plant genomes. Extensive genome colinearity at the genetic and molecular level has been established for closely related plant species. Therefore, genome analysis studies in the Poaceae and Brassicaceae families may benefit from the transfer of information and resources that are assembled in the framework of the Arabidopsis and rice genome projects.

    Figure 2 Insertion mutagenesis. (a) Gene inactivation upon insertion of a mobile element. The gene is displayed as an open box with a black rectangle corresponding to the mobile element. Upon insertion of the element, transcription of the gene indicated by an arrow can no longer proceed. If the nature of the element is known, the inactivated gene can be isolated. (b) The rationale of a reverse genetic approach. Only if a known element is inserted in the gene of interest, shown as an open box, can a DNA fragment be generated when primers specific for the gene of interest and the transposon, indicated by a black rectangle, are used for amplification of DNA sequences by polymerase chain reaction (PCR). Arrows correspond to primer sequences. The resulting PCR product is shown as a hatched bar.


    References

    Harushima Y, Yano M, Shomura A et al. (1998) A high-density rice genetic linkage map with 2275 markers using a single F2 population. Genetics 148: 479–494.

    Lin X, Kaul S, Rounsley S et al. (1999) Sequence and analysis of chromosome 2 of the plant Arabidopsis thaliana. Nature 402: 761–768.

    Marra M, Kucaba T, Sekhon M et al. (1999) A map for sequence analysis of the Arabidopsis thaliana genome. Nature Genetics 22: 265–270.

    Mayer K, Schüller C, Wambutt R et al. (1999) Sequence and analysis of chromosome 4 of the plant Arabidopsis thaliana. Nature 402: 769–777.

    SanMiguel P, Tikhonov A, Jin Y-K et al. (1996) Nested retrotransposons in the intergenic regions of the maize genome. Science 274: 765–768.

    Schmidt R, West J, Love K et al. (1995) Physical map and organization of Arabidopsis thaliana chromosome 4. Science 270: 480–483.

    Tanksley SD, Ganal MW, Prince JP et al. (1992) High density molecular linkage maps of the tomato and potato genomes. Genetics 132: 1141–1160.

    Tikhonov AP, SanMiguel PJ, Nakajima Y et al. (1999) Colinearity and its exceptions in orthologous adh regions of maize and sorghum. Proceedings of the National Academy of Sciences of the USA 96: 7409–7414.

    Further Reading

    Chang C and Meyerowitz EM (1991) Plant genome studies: restriction fragment length polymorphism and chromosome mapping information. Current Opinion in Genetics and Development 1: 112–118.

    Dean C and Schmidt R (1995) Plant genomes: a current molecular description. Annual Review of Plant Physiology and Plant Molecular Biology 46: 395–418.

    Gale MD and Devos KM (1998) Comparative genetics in the grasses. Proceedings of the National Academy of Sciences of the USA 95: 1971–1974.

    Rounsley S, Lin X and Ketchum KA (1998) Large-scale sequencing of plant genomes. Current Opinion in Plant Biology 1: 136–141.

    Sasaki T and Burr B (2000) International genome sequencing project: the effort to completely sequence the rice genome. Current Opinion in Plant Biology 3: 138–141.

    Schmidt R (1998) The Arabidopsis thaliana genome: towards a complete physical map. In: Anderson M and Roberts JA (eds) Arabidopsis. Annual Plant Reviews, vol. I, pp. 1–30. Sheffield: Sheffield Academic Press.

    Walbot V (2000) Saturation mutagenesis using maize transposons. Current Opinion in Plant Biology 3: 103–107.

    Database information accessible via the World Wide Web:

    Maize DB [http://www.agron.missouri.edu/]

    Rice Genome Research Program [http://rgp.dna.affrc.go.jp/]

    The Arabidopsis information resource [http://www.arabidopsis.org/]


    Figure 3 Patterns of genome colinearity. (a) Comparative genetic mapping. Using the same set of molecular markers for genetic mapping experiments in different species (A and B) allows the alignment of chromosome maps. Molecular markers are depicted as horizontal bars and markers which have been mapped in both species to the chromosomes shown are connected by lines. The chromosome of species A shares colinear segments with two chromosomes of species B, indicating translocation events. An example for an inversion event of a chromosomal segment is highlighted as a box. (b) Micro-colinearity. A comparison of homologous genomic regions derived from two different species (A and B) at the sequence level is shown. Gene sequences, black and white boxes, are highly conserved as indicated by grey bars. In contrast, intergenic sequences do not show significant homologies. The gene marked by an asterisk is duplicated in species A, whereas the gene indicated by an arrow is not found in species A.
