Plant Molecular Systematics Spring 2014



“Problems” with morphological data…

• Convergence and parallelisms
• Reduction and character loss
• Phenotypic vs. genotypic differences
• Evaluation of homology
• Misinterpretation of change or polarity
• Limitation on number of characters
• Phenotypic plasticity

Always searching for new types of characters…

Is molecular data intrinsically better than morphological data?

Central Dogma


Secondary Metabolites

• Lipid pigments: chlorophyll, lycopenes, xanthophylls, carotene
• Phenolics: flavonols, flavones, tannins, anthocyanins
• Iridoid compounds
• Alkaloids (N-containing), e.g. nicotine, caffeine, morphine, betalains
• Terpenes

Development of Molecular (Chemical) Systematic Methods

“Chemosystematics”

• Early methods relied on chromatography to separate complex mixtures of secondary metabolites, detect them, and then compare them between taxa (“spot botanists”) – very phenetic
• Better separation and identification methods developed – used pathway stages as cladistic characters (phytochemistry)
• Move away from secondary metabolites to proteins
• Early protein studies used immunological reactions
• Development of improved electrophoretic methods – permitted direct protein comparisons between taxa
• Comparison of seed storage proteins
• Development of direct estimates of genetic relationships based on allele frequency of enzyme variants
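The slides do not name a particular measure for estimating relationships from allele frequencies. As one possible illustration, the sketch below uses Nei's standard genetic distance, a commonly used choice for enzyme (allozyme) data. The function name, the data layout, and the allele frequencies are all invented for the example.

```python
import math

def nei_distance(pop_x, pop_y):
    """Nei's standard genetic distance, D = -ln(I), between two populations.

    Each population is a list of loci; each locus is a dict mapping an
    allele label to its frequency. Both lists must cover the same loci
    in the same order. (Data layout is an assumption of this sketch.)
    """
    jx = jy = jxy = 0.0
    for locus_x, locus_y in zip(pop_x, pop_y):
        alleles = set(locus_x) | set(locus_y)
        jx += sum(locus_x.get(a, 0.0) ** 2 for a in alleles)
        jy += sum(locus_y.get(a, 0.0) ** 2 for a in alleles)
        jxy += sum(locus_x.get(a, 0.0) * locus_y.get(a, 0.0) for a in alleles)
    if jxy == 0.0:
        return math.inf  # no shared alleles at any locus
    identity = jxy / math.sqrt(jx * jy)  # Nei's genetic identity, I
    return -math.log(identity)

# Invented allele frequencies at two enzyme loci for three populations.
pop_a = [{"fast": 0.9, "slow": 0.1}, {"a1": 1.0}]
pop_b = [{"fast": 0.8, "slow": 0.2}, {"a1": 0.9, "a2": 0.1}]
pop_c = [{"fast": 0.1, "slow": 0.9}, {"a2": 1.0}]

d_ab = nei_distance(pop_a, pop_b)
d_ac = nei_distance(pop_a, pop_c)
```

With these made-up frequencies, populations A and B come out much closer to each other than A and C, which is the kind of comparison the allele-frequency approach supports.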

Molecular (DNA) Systematics

• Next step was to examine DNA directly through examination and comparison of restriction fragments (RFLP bands)

• Technology evolved to make it feasible to sequence DNA directly

• Initially limited to single genes or non-coding regions

• Now feasible to sequence large numbers of genes or regions or increasingly even whole genomes relatively quickly
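To make the restriction-fragment step concrete, here is a minimal sketch of an in-silico digest of a linear sequence. It assumes the EcoRI recognition site GAATTC (cut after the first G); the two short sequences are invented, and real RFLP comparisons work on fragments that are far longer.

```python
def digest(sequence, site="GAATTC", cut_offset=1):
    """Return fragment lengths (largest first) after cutting a linear
    sequence at every occurrence of a restriction site.

    cut_offset is the position within the site where the enzyme cuts
    (1 for EcoRI, which cuts G^AATTC).
    """
    fragments = []
    start = 0
    pos = sequence.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(cut - start)
        start = cut
        pos = sequence.find(site, pos + 1)
    fragments.append(len(sequence) - start)  # last piece up to the end
    return sorted(fragments, reverse=True)

# Invented sequences: taxon_b carries a point mutation in the second site.
taxon_a = "AAAA" + "GAATTC" + "CCCCCCCC" + "GAATTC" + "TT"
taxon_b = "AAAA" + "GAATTC" + "CCCCCCCC" + "GAGTTC" + "TT"

bands_a = digest(taxon_a)  # three fragments
bands_b = digest(taxon_b)  # two fragments, because one site is lost
```

A single substitution inside a recognition site changes the banding pattern, which is why site gains and losses can be scored as characters.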

Molecular Systematics

- Can obtain phylogenetically informative characters from any genome of the organism
- Assumes that genomes accumulate molecular changes by lineage, as morphological characters do
- Possibly greater assurance of homology with molecular data (less likely to misinterpret characters), but homoplasy happens!
- Principal advantages are the much greater number of molecular characters available and greater comparability across lineages

How big are genomes of organisms?

Genomes of the Plant Cell

Nuclear

Plastid

Mitochondrial

Three genomes in plant cells

Chloroplast
- 135,000-160,000 bp
- Generally maternally inherited (seed parent)

Mitochondrion
- 200,000-2,500,000 bp
- Generally maternally inherited (seed parent)

Nucleus
- 1.1 x 10^6 to 1.1 x 10^11 kilobase pairs
- Biparentally inherited

Selection of DNA region to compare:

• Should be present in all taxa to be compared
• Must have some knowledge of the gene or other genomic region to develop primers, etc.
• Evolutionary rate of sequence changes must be appropriate to the taxonomic level(s) being investigated; “slow” genes versus “fast” genes
• Sequences should be readily alignable
• The biology of the gene (or other DNA sequence) must be understood to assure homology

Genes frequently used for phylogenetic studies of plants:

• Mitochondrial genome – uniparentally (maternally) inherited, but genes evolve very slowly and structural rearrangements happen very frequently, so it is generally not useful for studying relationships, with some exceptions

• Plastid genome – uniparentally (maternally) inherited
  - rbcL – ribulose-bisphosphate carboxylase large subunit
  - ndhF – NADH dehydrogenase subunit F
  - atpB – ATP synthase beta subunit
  - matK – maturase K
  - rpl16 intron – ribosomal protein L16 intron

• Nuclear genome – biparentally inherited
  - ITS region – internal transcribed spacers ITS1 and ITS2
  - 18S, 26S – nuclear ribosomal DNA repeat
  - adh – alcohol dehydrogenase
  - many other genes now available with next-generation sequencing

Plastid Genome

- Circular, derived from endosymbiosis of cyanobacteria
- Three zones: LSC (large single copy region), SSC (small single copy region), IR (inverted repeats)
- Genes related to photosynthesis and protein synthesis (Fig. 14.4)

The Polymerase Chain Reaction (PCR) (Fig. 14.2)
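As a rough illustration of what PCR targets, the sketch below finds the region bounded by a primer pair on a template strand. It is a simplification that assumes exact primer matches and ignores annealing temperature and mismatches. The template and primer sequences are invented.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]

def amplicon(template, forward_primer, reverse_primer):
    """Return the region a primer pair would amplify, or None.

    template is the top strand written 5'->3'; both primers are also
    written 5'->3'. The reverse primer anneals to the top strand where
    its reverse complement occurs, downstream of the forward primer.
    """
    start = template.find(forward_primer)
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse_primer)
    end = template.find(reverse_site, start + len(forward_primer))
    if end == -1:
        return None
    return template[start:end + len(reverse_site)]

# Invented template and primers.
template = "TTTT" + "ACGTAC" + "GGGGCCCC" + "TTGACA" + "AAAA"
product = amplicon(template, "ACGTAC", "TGTCAA")
```

The returned product includes both primer sites, which is why some knowledge of the flanking sequence is needed before a region can be amplified.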

Automated Sequencing

Scanning of gel to detect fluorescently-labeled DNAs; data fed directly to computer.

Fig. 14.3

How do we analyze molecular variation?

- DNA nucleotide sequences (point mutations)
- Structural rearrangements
  - insertions and deletions (indels)
  - inversions

Aligned DNA sequences showing substitutions
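One way to summarize substitutions in an alignment is to separate variable columns from parsimony-informative ones (columns where at least two different bases each occur in at least two sequences). The sketch below assumes equal-length aligned strings; the four sequences are invented.

```python
from collections import Counter

def classify_sites(alignment):
    """Return (variable, informative) lists of column indices.

    alignment is a list of equal-length aligned sequences. Gaps and
    ambiguity codes are ignored when counting bases in a column.
    """
    variable, informative = [], []
    for i, column in enumerate(zip(*alignment)):
        counts = Counter(base for base in column if base in "ACGT")
        if len(counts) > 1:
            variable.append(i)
            # informative: two or more states, each seen at least twice
            if sum(1 for n in counts.values() if n >= 2) >= 2:
                informative.append(i)
    return variable, informative

# Invented alignment of four taxa.
alignment = [
    "ACGTACGT",
    "ACGTACGA",
    "ACCTACGA",
    "ACCTATGT",
]
variable_sites, informative_sites = classify_sites(alignment)
```

Here column 5 varies in only one sequence, so it is variable but cannot group taxa, while columns 2 and 7 each split the four sequences two against two.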

Insertion-Deletion Events

- Can occur as single nucleotide gains or losses or as lengths of 2 to many base pairs
- Can also be “chunks” of DNA (i.e., losses of introns)
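In an alignment, indels appear as runs of gap characters. The sketch below locates gap runs in one aligned sequence and then scores a chosen region as a presence/absence character across taxa. It assumes "-" as the gap symbol; the taxon names and sequences are hypothetical.

```python
import re

def gap_runs(aligned_seq):
    """Return (start, length) for every run of '-' in an aligned sequence."""
    return [(m.start(), len(m.group())) for m in re.finditer(r"-+", aligned_seq)]

def score_deletion(alignment, start, length):
    """Score a region as a binary character per taxon.

    Returns 1 if the taxon is gapped across the whole region, else 0.
    alignment maps taxon name -> aligned sequence.
    """
    return {
        name: int(set(seq[start:start + length]) == {"-"})
        for name, seq in alignment.items()
    }

# Hypothetical taxa and sequences.
alignment = {
    "taxon_1": "ACGATTACCGGGGC",
    "taxon_2": "ACG-TTA----GGC",
    "taxon_3": "ACGATTA----GGC",
}
runs = gap_runs(alignment["taxon_2"])
deletion_char = score_deletion(alignment, 7, 4)
```

Scoring the shared 4 bp gap as a single character treats it as one event rather than four independent changes.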

A molecular synapomorphy for Subfamily Cactoideae (Cactaceae) – deletion of the plastid rpoC1 intron…

(Wallace & Cota, Current Genetics, 1995)

ancestral

derived

Cactaceae: trnL Intron Deletions

[Figure: trnL intron deletions – Columnar Cacti. Tree labels: North American Clades; South American Clades; Pachycereeae; Corryocactus; “Browningieae I”*; “Browningieae II”*; Cereeae; Trichocereeae; Leptocereeae; Hylocereeae. Shared Deletion 2: 268 bp.]

(*Tribe Browningieae polyphyletic)

23 kb inversion in all Asteraceae except for members of Tribe Barnadesieae (now Subfamily Barnadesioideae)

Chloroplast DNA Inversion

Fig. 14.6
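An inversion reverses a stretch of DNA, so on the same strand the segment reads as its reverse complement. The sketch below applies that to a toy sequence; the coordinates and the sequence are invented, and the real Asteraceae inversion spans about 23 kb.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def invert_segment(seq, start, end):
    """Return seq with seq[start:end] replaced by its reverse complement."""
    segment = seq[start:end]
    inverted = segment.translate(COMPLEMENT)[::-1]
    return seq[:start] + inverted + seq[end:]

# Invented example: invert the middle three bases.
ancestral = "AAACCGTTT"
derived = invert_segment(ancestral, 3, 6)
```

Applying the same inversion twice restores the original sequence, so in principle the character could reverse, but the chance of independently breaking at the same two endpoints is low.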

Comparative DNA Sequencing

• Obtain DNA samples from representative organisms (try to represent morphological diversity) and outgroups
• Identify DNA region(s) for comparison
• Extract DNA and use PCR to amplify targeted region
• Carry out sequencing reactions
• Run sequencing procedures (automated)
• Align sequences
• Use aligned sequences for phylogenetic analysis (various programs using various algorithms)
• Evaluate data in context of taxonomy and morphology
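The slides leave the choice of analysis method open. As one simple starting point, the sketch below computes uncorrected pairwise distances (the proportion of differing sites, often called p-distance) from aligned sequences. This is only a first summary and not a substitute for a full phylogenetic analysis. The taxon names and sequences are invented.

```python
from itertools import combinations

def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap ('-') are skipped.
    """
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(1 for x, y in pairs if x != y) / len(pairs)

def distance_table(alignment):
    """Return {(name1, name2): p-distance} for every pair of taxa."""
    return {
        (n1, n2): p_distance(alignment[n1], alignment[n2])
        for n1, n2 in combinations(alignment, 2)
    }

# Invented aligned sequences.
alignment = {
    "taxon_A": "ACGTACGT",
    "taxon_B": "ACGTACGA",
    "taxon_C": "ACCTATGA",
}
table = distance_table(alignment)
```

The resulting table can feed a distance-based tree method, or simply serve as a sanity check on the alignment before character-based analysis.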

Partial sequence of rbcL (plastid gene coding for Rubisco) in Poaceae

[Figure: phylogeny of Poaceae subfamilies. Annotation on tree: stamens reduced to 3; + 55 mya.]

Early-diverging subfamilies: Anomochlooideae, Pharoideae, Puelioideae

BEP Clade: Bambusoideae (bamboos), Pooideae (bluegrasses, wheat), Ehrhartoideae (rices and allies)

PACMAD Clade: Aristidoideae (wiregrasses), Panicoideae (maize, panicgrasses), Chloridoideae (love grasses), Danthonioideae (pampas grasses), Micrairoideae, Arundinoideae (reeds)

Crepet & Feldman 1991

Genetic Databases

International Nucleotide Sequence Database Collaboration

GenBank: National Institutes of Health (NIH) Genetic Sequence Database

http://www.ncbi.nlm.nih.gov/genbank/

EMBL: European Bioinformatics Institute Nucleotide Sequence Database

DDBJ: DNA Data Bank of Japan
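Sequences in GenBank can also be retrieved programmatically. The sketch below only builds a request URL for the NCBI E-utilities `efetch` service and does not contact the network. The helper name is invented, and the accession shown is a placeholder, not a real record.

```python
from urllib.parse import urlencode

EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"

def genbank_fasta_url(accession):
    """Build an efetch URL requesting one nucleotide record as FASTA text."""
    params = {
        "db": "nuccore",      # nucleotide database
        "id": accession,      # accession or GI of the record
        "rettype": "fasta",   # return the sequence in FASTA format
        "retmode": "text",
    }
    return f"{EFETCH}?{urlencode(params)}"

# "PLACEHOLDER" stands in for a real accession number.
url = genbank_fasta_url("PLACEHOLDER")
```

The URL could then be opened with any HTTP client; NCBI's usage guidelines on request rates apply to scripted access.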

Edwards et al., Science 2010, Fig. 4

Edwards & Smith, PNAS 2010, Fig. 1

Climatic Data
- Global Biodiversity Information Facility (GBIF)
- 1,584,351 independent collection sites
- 10,469 taxa

Genetic Data
- 2,684 taxa
- 8 regions (plastid and nuclear)
- phylogenetic analysis

Data mining