Course code: ZOO560
Week 3: a) Phylogenetics & b) Dynamic genomes
Advanced molecular biology (ZOO560) by Rania M. H. Baleela is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.
This week content
• Molecular phylogenetics
• Transposable Elements (TEs)
• Retroviruses
Molecular phylogenetics
Studies evolutionary relationships between organisms or genes by a combination of molecular biology & statistical techniques.
Relationships among a group of organisms are illustrated in:
• a bifurcating phylogenetic tree,
• a phylogenetic network.
[Figure: a tree in which an ancestral population (root) splits into 2 descendant populations and then 4 descendant populations, each with unique derived traits.]
To read/understand it
• Direction: from the past (root) to the recent (descendants).
• Nodes represent the taxonomic units.
Tree terminology
• Each branch in the tree = a clade.
• Monophyletic = a taxon that is derived from a single ancestral species. The only legitimate taxon type in a cladogram!
• Polyphyletic = a taxon whose members were derived from 2 or more ancestors not common to all members.
• Paraphyletic = a taxon that excludes some members that share a common ancestor with members included in the taxon.
Branches & clades
• A clade is a group of organisms that are all descendants of a common ancestor, i.e. a clade = an ancestor + all descendants of that ancestor.
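The clade definition can be checked mechanically. The following is a minimal sketch, assuming a toy tree written as nested tuples; the tree, the taxon names A to E and the function names are hypothetical and only illustrate the idea that a monophyletic group is exactly one node's complete set of descendants.

```python
# Hypothetical toy tree: leaves are strings, internal nodes are tuples of children.
TREE = ((("A", "B"), "C"), ("D", "E"))


def leaves(node):
    """Return the set of leaf names found under a node."""
    if isinstance(node, str):
        return {node}
    result = set()
    for child in node:
        result |= leaves(child)
    return result


def clades(node):
    """Yield the leaf set of every clade (each node with all of its descendants)."""
    yield frozenset(leaves(node))
    if not isinstance(node, str):
        for child in node:
            yield from clades(child)


def is_monophyletic(tree, taxa):
    """True if `taxa` is the complete set of descendants of a single node."""
    return frozenset(taxa) in set(clades(tree))


print(is_monophyletic(TREE, {"A", "B"}))       # a clade
print(is_monophyletic(TREE, {"A", "B", "C"}))  # a clade
print(is_monophyletic(TREE, {"A", "C"}))       # leaves out B, so not a clade
```

A group that fails this test is either paraphyletic or polyphyletic; telling those two apart needs extra information about the ancestors, which this sketch does not attempt.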
Tree building methods
Mathematical basis of molecular
phylogenetic reconstruction
Tree building methods
• Can be classified into 4 types:
1. distance-based methods,
2. maximum parsimony methods (nucleotides are used directly),
3. maximum likelihood methods (search for the maximum-likelihood (ML) value for the character state configurations between the sequences),
4. Bayesian methods (incorporate other (a priori) information & Bayes' theorem to infer phylogenies).
e.g. Distance-based methods
• Involve:
1. computing the evolutionary distances for all pairs of taxa,
2. constructing a tree using a clustering algorithm based on some functional relationships among the distance values.
(They are fast & produce a single tree, which makes them widely used.)
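Step 1 (computing distances for all pairs of taxa) can be sketched as follows. This is a minimal illustration, assuming the simplest distance measure, the p-distance (proportion of differing sites); the aligned sequences and the names `ALIGNMENT`, `p_distance` and `distance_matrix` are hypothetical. Real analyses usually apply a substitution-model correction, which is not shown here.

```python
from itertools import combinations

# Hypothetical aligned sequences (equal length, no gaps).
ALIGNMENT = {
    "taxon1": "ACGTACGTAC",
    "taxon2": "ACGTACGTAT",
    "taxon3": "ACGAACGTTT",
    "taxon4": "TCGAACCTTT",
}


def p_distance(seq_a, seq_b):
    """Proportion of aligned sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    differences = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return differences / len(seq_a)


def distance_matrix(alignment):
    """Return {(taxon_a, taxon_b): p-distance} for every pair of taxa."""
    return {
        (a, b): p_distance(alignment[a], alignment[b])
        for a, b in combinations(alignment, 2)
    }


for pair, value in distance_matrix(ALIGNMENT).items():
    print(pair, value)
```

The resulting pairwise table is the input that a clustering algorithm (step 2) works from.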
Examples of distance-based methods:
1. The unweighted pair-group method with arithmetic mean (UPGMA) (the simplest),
2. The Neighbor-Joining (NJ) method.
• A drawback of UPGMA is that it assumes identical rates of evolution along all branches, so fast- or slow-evolving lineages may cause errors in branching order.
• In the NJ method, if some distances are large or if the evolutionary rate varies greatly among sites, then accurate estimation of distances becomes difficult (Li, 1997), although the method is generally very robust (Page and Holmes, 2000).
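UPGMA can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the function name `upgma`, the input format and the example distances are assumptions. It repeatedly joins the two closest clusters, places the new node at half their distance, and averages distances weighted by cluster size.

```python
def upgma(names, dist):
    """Cluster taxa by UPGMA.

    names: list of taxon labels.
    dist:  dict mapping (name_a, name_b) tuples to pairwise distances.
    Returns (tree, root_height), where tree is a nested tuple of labels.
    """
    # Active clusters: integer id -> (subtree, number of taxa in it).
    clusters = {i: (name, 1) for i, name in enumerate(names)}
    index = {name: i for i, name in enumerate(names)}
    d = {}
    for (a, b), value in dist.items():
        i, j = sorted((index[a], index[b]))
        d[(i, j)] = value
    next_id = len(names)
    height = 0.0
    while len(clusters) > 1:
        i, j = min(d, key=d.get)      # closest pair of clusters
        height = d[(i, j)] / 2        # new node sits halfway between them
        tree_i, size_i = clusters.pop(i)
        tree_j, size_j = clusters.pop(j)
        # Distance from the merged cluster to each remaining cluster is the
        # size-weighted (arithmetic) mean of the two old distances.
        for k in clusters:
            d_ik = d.pop((min(i, k), max(i, k)))
            d_jk = d.pop((min(j, k), max(j, k)))
            d[(k, next_id)] = (size_i * d_ik + size_j * d_jk) / (size_i + size_j)
        del d[(i, j)]
        clusters[next_id] = ((tree_i, tree_j), size_i + size_j)
        next_id += 1
    tree, _ = next(iter(clusters.values()))
    return tree, height


# Hypothetical distances that fit a constant rate of evolution.
EXAMPLE = {
    ("A", "B"): 2, ("A", "C"): 6, ("B", "C"): 6,
    ("A", "D"): 10, ("B", "D"): 10, ("C", "D"): 10,
}
print(upgma(["A", "B", "C", "D"], EXAMPLE))
```

Because every node is placed at half the cluster distance, all tips end up equally far from the root, which is exactly the equal-rates assumption named as the drawback above.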
What do phylogeny shapes tell us?
[Figure panels: neutral; population growth (star-like phylogeny)]
(Kaessmann and Pääbo 2002, J. Intern. Med.)
Human mtDNA phylogeny (Cavalli-Sforza & Feldman 2003, Nat. Genet. Suppl.)
Mitochondrial Eve was African (~200,000 years ago)
The tree is fairly star-like (short internal branches and long external branches)
Human Y chromosome tree (Cavalli-Sforza & Feldman 2003, Nat. Genet. Suppl.)
It is also fairly star-like. What does a star-like tree mean?
Gene trees within species
Genetic diversity in humans is substantially reduced compared to apes
(Kaessmann and Pääbo 2002, J. Intern. Med.)
[Figure axis: average heterozygosity]
Timescale: how deep are the trees for sequences sampled within species?
(Garrigan & Hammer 2006, Nat. Rev. Genet.)
MRCA = most recent common ancestor (the limit ["horizon"] for population genetics studies)
The coalescent
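[Figure: genealogy of sequences A, B, C & D traced back to the MRCA (the most recent common ancestor); each merger of two lineages is a coalescent. Time is running backwards.]
Time of coalescence for n lineages: $T_n = \frac{4N_e}{n(n-1)}$
For the last two lineages (n = 2): $T_2 = 2N_e$
The last two lineages take about half of the total time to the MRCA to coalesce (at least half: $2N_e$ out of a total that approaches $4N_e$ as n grows).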
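The share of time taken by the last two lineages can be checked numerically. The following is a minimal sketch, assuming the expected coalescence time 4Ne / (n(n-1)) generations while n lineages remain; the effective population size `NE` and the function names are hypothetical.

```python
def expected_coalescence_time(n, ne):
    """Expected generations spent with n lineages: T_n = 4*Ne / (n*(n-1))."""
    if n < 2:
        raise ValueError("need at least two lineages")
    return 4 * ne / (n * (n - 1))


def expected_time_to_mrca(n, ne):
    """Sum of T_k for k = n, n-1, ..., 2 (expected depth of the whole tree)."""
    return sum(expected_coalescence_time(k, ne) for k in range(2, n + 1))


NE = 10_000  # hypothetical effective population size

for n in (2, 5, 20, 100):
    total = expected_time_to_mrca(n, NE)
    share = expected_coalescence_time(2, NE) / total
    print(f"n={n:3d}  total={total:9.0f} generations  last-two share={share:.3f}")
```

The total works out to 4Ne(1 - 1/n), so the share of the last two lineages falls towards one half as the sample size grows but never drops below it.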
b) TEs
“Without transposable elements we
would not be here & the living world
would probably look very different
from the one we know.” Labrador & Corces (2002)
What are TEs?
Transposable elements (TEs) are fragments of
DNA that can insert into new chromosomal
locations & often make duplicate copies of
themselves in the process.
Discovered in corn (Zea mays) by Barbara McClintock (1940s)
TEs (jumping genes)
• Cause mutations (e.g. corn kernel colour),
• Increase (or decrease) the DNA content,
• Are associated with antibiotic resistance in bacteria,
• Cause sterility of Drosophila sp. offspring (P element).
I. TEs classes = 2 (based on mechanism of movement)
II. Transposition (= movement) is a form of nonhomologous recombination
III. TEs can cause genetic changes
I. Classes
TEs:
Class I: RNA (only in eukaryotes)
Class II: DNA (in prokaryotes & eukaryotes)
Class I: Retrotransposons
• 2 groups:
A. Long terminal repeat retrotransposons
(LTRs).
B. Non-LTR retrotransposons.
“LTRs resemble retroviruses in
both their structure &
mechanism”
LTRs Gag & Pol
• Gag encodes structural proteins important
for the packaging of retrotransposon RNA,
• The pol gene encodes the enzymatic
activities needed for the retrotransposon life
cycle.
A. LTRs
• Have long terminal repeats (LTRs) (~100bp-5kb).
• Are divided into 2 groups (based on the enzymatic
activity differences):
Ty1-copia,
Ty3-gypsy.
~8% of human genome and 10% of mouse genome.
Peter J. Russell, iGenetics: Copyright © Pearson Education, Inc., publishing as Benjamin Cummings.
The Ty transposable element of yeast
1. Ty1-copia
Are abundant in species ranging from single-cell algae to bryophytes, gymnosperms & angiosperms.
[Figure: gymnosperms; Magnolia (angiosperm); liverwort (bryophyte)]
2. Ty3-gypsy
• Are widely distributed:
In plants (gymnosperms and angiosperms);
Lampreys,
Bony fishes,
Amphibians,
Reptiles,
Mammals.
“non-LTR retrotransposons
are the dominant element
type in mammalian
genomes, where they
appear to account for most
of the species-specific
differences”
B. Non-LTRs
• Are divided into 2 groups:
1. Long interspersed nuclear elements (LINEs) (e.g. L1).
2. Short interspersed nuclear elements (SINEs) (e.g. Alu).
LINEs and SINEs terminate in a simple sequence repeat, usually poly(A).
1. LINEs
• Encode 2 ORFs, which specify:
1. an RNA-binding protein (ORF1),
2. endonuclease & reverse transcriptase (RT) activities (ORF2).
2. SINEs
• Are characterized by an internal RNA pol III promoter.
• Heterogeneous group of TEs (length from 90-300bp).
• Do not have any coding capacity.
• Use LINE-specified functions to transpose.
Species-specific TEs
TEs vary from species to species in 2 important ways:
1. By the classes of TEs present and their fractional representation in the genome,
2. By the level of TE activity.
Type of TE (% of genome)   Human   Rat    Rice   Arabidopsis   Chicken   Caenorhabditis   Drosophila
LINE/SINE                  33.4    30.2   1.2    0.5           6.5       0.4              0.7
LTR                        8.1     9.0    14.8   4.8           1.3       0.0              1.5
Class II                   2.8     0.8    13.0   5.1           0.8       5.3              0.7
Total (+other TEs)         44.4    40.3   35.0   10.5          8.6       6.5              3.1
Alu sequence (SINE)
Karyotype from a ♀ lymphocyte. Chromosomes were hybridized with a probe for Alu sequences (green) and counterstained with TOPRO-3 (red).
Alu insertions & disease
Alu insertions are sometimes disruptive & can
result in inherited disorders.
Most Alu insertions act as markers that
segregate with the disease.
Diseases linked with Alu insertions include:
breast cancer, hypercholesterolemia, haemophilia A & B, diabetes mellitus type II, etc.
Class II: Transposons
• Most elements transpose by a ‘cut and paste’ mechanism mediated by a transposase that recognizes their short terminal inverted repeats (TIRs).
• Transposon structure is simple:
1. A short terminal inverted repeat (TIR) (typically ~10–40 bp, up to ~200 bp).
2. A single gene encoding the transposase.
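A terminal inverted repeat means that the start of the element reads as the reverse complement of its end. The following is a minimal sketch that measures the longest perfect TIR of a sequence; the toy element and the names `reverse_complement`, `tir_length` and `TOY_ELEMENT` are hypothetical, and real TIRs often contain mismatches that this check does not tolerate.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def tir_length(element, max_len=200):
    """Length of the longest perfect terminal inverted repeat.

    The 5' end is compared with the reverse complement of the 3' end,
    base by base, until the first mismatch.
    """
    element = element.upper()
    limit = min(max_len, len(element) // 2)
    rc = reverse_complement(element)
    length = 0
    while length < limit and element[length] == rc[length]:
        length += 1
    return length


# Hypothetical element: 8 bp TIR + internal region + inverted copy of the TIR.
LEFT_TIR = "CAGGGTTC"
INTERNAL = "ATGAAACCCGGGTTTAAA"  # stands in for the transposase gene
TOY_ELEMENT = LEFT_TIR + INTERNAL + reverse_complement(LEFT_TIR)

print(tir_length(TOY_ELEMENT))
```

The `max_len` default of 200 simply mirrors the upper TIR size quoted in the list above.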
Class II elements
• Plasmid-borne transposons are responsible for the rapid evolution of drug resistance in disease causing bacteria.
Examples:
1. Insertion sequence (IS) elements of E. coli,
2. Ac & Ds elements of corn,
3. P elements of Drosophila melanogaster.
Miniature inverted-repeat TEs (MITEs)
• MITEs are a special class of class II elements
(found in genomes at very high copy number).
• MITEs are short (< 500 bp).
• Are the most common TEs in plant genes (also
abundant in insects & fish).
“Most MITEs insert within a TA or a
TAA sequence (seem to target
very high AT-rich regions for
integration)”
mPing MITE: a case study
• Rice (Oryza sativa) genome size = ~430 Mb.
• Maize genome size = ~2500 Mb & barley ~5000 Mb.
• O. s. japonica is one of the 3 domesticated rice subspecies.
• mPing (a 429 bp MITE) is active in japonica rice varieties.
mPing copy number
• Temperate japonicas contain the highest number of mPing elements (> 1000 elements!).
• Tropical japonicas contain the least (many have only a single element).
Temperate & tropical cultivars diverged from a common ancestor since domestication: 5000-7000 yrs ago.
The dramatic difference is significant
• The 2 varietal groups are adapted to radically
different temperature & water regimes:
a) Tropical cultivars flourish in tropical & subtropical
environments,
b) Temperate cultivars were selected for productivity
in cool zones with very short growing seasons.
Justification
1. Stress activation of mPing elements during the domestication of temperate japonicas,
2. mPing preferential insertion into genic regions,
1 & 2 might have diversified these cultivars & hastened their domestication by creating new allelic combinations that might be favored by human selection.
Impact
• The impact of the bursts of mPing insertions on
genome evolution is unclear.
• 1000s of new insertions, presumably into gene
rich regions of the genome, will be the focus of
detailed analyses to determine which, if any,
contributed to adaptation and/or domestication.
Restructuring genomes!
3 ways to restructure the host genome
1. TE-mediated chromosome breakage & rejoining (i.e. nonhomologous recombination),
2. TEs as insertional mutagens,
3. TEs and epigenetic regulation.
TEs as insertional mutagens
I. purpurea TEs (the Tip100 element is an Ac/Ds-type element)
Variegated kernel colour in corn due to interaction between the TEs Ac & Ds
• Kernels contain 2 copies of Ds (chromosome 9 proximal to the locus of Cl)
• Cl is responsible for the purple anthocyanin pigment.
• The homologous chromosome carries an inactive mutant allele cl.
• The element Ac is present elsewhere in the genome.
• When Ac breaks chromosome 9 at the position of either Ds element, the tip of chromosome 9 containing the dominant Cl allele is lost, and the portion of the kernel that develops from such a cell is colourless.
• The colourless patches are large or small depending on whether the breakage occurred early or late in development.
(Weil & Wessler, 1993. The Plant Cell 5:515.)
Non-Mendelian
Peter J. Russell, iGenetics: Copyright © Pearson Education, Inc., publishing as Benjamin Cummings.
Kernel color in corn & transposon effects
Defending against the spread of TEs
1. Purifying selection (i.e. negative selection,
elimination by natural selection).
2. DNA methylation (addition of a methyl group, –CH3).
Positive impacts of TEs
• Transposition events can create advantageous mutations by mixing & matching fragments of genes and producing novel combinations that benefit the organism.
• The genes for the enzymes that rearrange immunoglobulin genes (RAG1 & RAG2) originated in a transposition event several hundred million yrs ago.
• Telomerase “domestication” within eukaryotes.
What do you know about retroviruses? Revisit ZOO405