Course code: ZOO560
Week 2: Evolution of genes & proteins
Advanced molecular biology (ZOO560) by Rania M. H. Baleela is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.


This week's lecture content

• Evolution of duplicate genes (paralogs)
• Evolution of pseudogenes
• Mathematical models of substitution
• Models of mutation:
1. IAM
2. ISM
3. SMM
4. Wright-Fisher
5. Coalescence


Genome Evolution

Genome changes due to:
1. Mutation
2. Recombination
3. Transposition
4. Gene transfer
5. Deletion and duplication

• A major mechanism for the expansion of genome size as organisms evolved from simple to more complex is the duplication of whole genomes as well as of specific sequences.

Gene duplication is an important source of phenotypic change and adaptive evolution (Dennis et al., 2012).

Fates of duplicate genes

1. Subfunctionalization
2. Neofunctionalization
3. Nonfunctionalization (pseudogene)
4. Evolve in concert

Definitions

1. Subfunctionalization: a pair of duplicate genes is said to be subfunctionalized if each of the 2 copies of the gene performs only a subset of the functions of the ancestral single-copy gene.

2. Neofunctionalization: a pair of duplicate genes in a population is said to be neofunctionalized if one of the 2 genes possesses a new, selectively beneficial function that was absent in the population before the duplication (e.g. hepatocyte growth factor vs. plasminogen).

Duplicate gene fate models

Figure: after duplication, a gene copy follows one of three fates: nonfunctionalization (pseudogene), neofunctionalization, or subfunctionalization.

A duplicated gene is unlikely to be fixed unless it acquires a novel & useful function.

Evolution of paralogs: sub/neo-functionalization

Gene duplication: history
• Time = 1936
• Scientist = Bridges
• Finding = duplication of a chromosomal band in a mutant of the fruit fly Drosophila melanogaster
• Observed result = extreme reduction in eye size

Gene duplicates = paralogs

X-linked duplication

http://www.nature.com/scitable/content/25319/pierce_9_7_large_2.jpg

The Bar gene duplication

Gene duplication may occur due to:

1. an error in homologous recombination,
2. a retrotransposition event,
3. duplication of an entire chromosome.

Which genes undergo duplication?

Features that might allow or prohibit the fixation of a duplicate copy of a gene in the population:
1. Functional biases in the types of genes that survive in duplicate (e.g. in yeast, humans, insects & bacteria).
2. Membership of certain categories: genes encoding transcription factors, kinases, and particular enzymes & transporters have unexpectedly high numbers of duplicates.

Duplications can be advantageous, deleterious or neutral

If an organism is exposed to a toxic environment, there may be an advantage in overproduction of detoxifying enzymes

A disadvantage results from overproduction of a protein that upsets the regulatory balance.

Most duplications are neutral, so their fate is determined by selection and drift.

Transition from Australopithecus to Homo, why? and the beginning of neocortex expansion, how?

Dennis et al., 2012

The cortical development gene Slit-Robo Rho GTPase-activating protein 2 (SRGAP2) duplicated 3 times exclusively in humans. The first duplication copied SRGAP2A to SRGAP2B; 2 larger duplications later copied SRGAP2B to SRGAP2C and to the proximal SRGAP2D. SRGAP2C is the most likely duplicate to encode a functional protein and is one of the most fixed human-specific duplicate genes. Incomplete duplication created a novel gene function at birth, 2–3 mya.

Evolution of (paralogs) Pseudogenes “fossil records”

Nonfunctionalization

Pseudogenes

A pseudogene is a DNA sequence that is nearly identical to that of a functional gene but contains one or more mutations that make it non-functional.

Pseudogenes were first recognized and named during the late 1970s.

2 types of pseudogenes

1. Unprocessed (duplicated):
– From genome duplication.
– Subsequently lost its function.
– Rapid degeneration observed in prokaryotes.
2. Processed (retrotransposed):
– From reverse transcription (no introns).

Regulatory role has been observed for human pseudogenes.

Many changes have occurred in a beta-globin gene since it became a pseudogene

Pseudogene descendants of human ribosomal protein gene (RPL21)

Pseudogenes may represent reservoirs of genetic information that participate in the evolution of new genes, not only relics of inactivated genes whose fate is genomic extinction.


21-hydroxylase (cytP21) gene

• A member of the cytochrome P450 gene family.
• cytP21 is located on chromosome 6 in humans.
• Has a paralogous pseudogene in the vicinity.
• 100s of mutations in the 21-hydroxylase gene have been described.
• 75% of them are due to gene conversion.


ξ-globin duplication
• The equine ξ-globin locus consists of a gene and a pseudogene.
• The duplication of the ξ-globin genes predates the radiation of placental mammals.
• Because of repeated gene conversion (GC) events, the gene and the pseudogene are identical in their alignable part.


Evolution by Gene Duplication Susumu Ohno, 1970

“Natural selection merely modified, while redundancy created”


Source of variation

mutation


Mutation

• Mutation is any heritable change in the genetic material.
• It is the ultimate source of genetic variation.
• Mutations include:
1. Changes of DNA sequence (e.g. substitution)
2. Chromosomal rearrangements (e.g. inversion)


Mutation

• Most wild type (wt) genes mutate at a very low rate

• Typical mutation rate= to new mutations/gene/generation.

• In a population of size N diploid organisms, there are 2N copies of each gene, each of which can mutate in any generation.

Mutation rate = probability of mutation.
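The counting argument on this slide can be made concrete: if each of the 2N gene copies mutates with probability μ per generation, the expected number of new mutations per gene per generation is 2Nμ. The sketch below is only an illustration; the function name and the values of N and μ are assumptions, not figures from the lecture.

```python
def expected_new_mutations(pop_size, mu):
    """Expected new mutations per gene per generation in a diploid population.

    pop_size: number of diploid individuals (N), giving 2N gene copies.
    mu: mutation rate per gene copy per generation (a probability).
    """
    if pop_size < 0 or not 0.0 <= mu <= 1.0:
        raise ValueError("pop_size must be >= 0 and mu must lie in [0, 1]")
    return 2 * pop_size * mu


# Hypothetical values, chosen only for illustration.
N = 10_000
MU = 1e-6
print(expected_new_mutations(N, MU))
```

Even with a very low per-copy rate, a large population can receive new mutations at a gene fairly regularly, because every one of the 2N copies is a target.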

Substitution models


Types of point mutations
• In DNA sequences:
1. Transitions: point mutations substituting a purine (A or G) for a purine (A or G) or a pyrimidine (T or C) for a pyrimidine (T or C).

2. Transversions: point mutations substituting a purine (A or G) for a pyrimidine (T or C), or vice versa.


Transitions are more common than transversions.
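The transition/transversion distinction reduces to asking whether both bases belong to the same chemical class. A minimal Python sketch follows; the function names are hypothetical, and the two sequences are assumed to be already aligned, gap-free, and of equal length.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref, alt):
    """Return 'transition' or 'transversion' for a single-base change."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid:
        raise ValueError("bases must be one of A, C, G, T")
    if ref == alt:
        raise ValueError("ref and alt are identical; no substitution")
    # Same class (purine<->purine or pyrimidine<->pyrimidine) is a transition.
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"


def count_ts_tv(seq1, seq2):
    """Count transitions and transversions between two aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to equal length")
    counts = {"transition": 0, "transversion": 0}
    for a, b in zip(seq1.upper(), seq2.upper()):
        # Skip identical sites and any non-ACGT symbols.
        if a != b and a in "ACGT" and b in "ACGT":
            counts[classify_substitution(a, b)] += 1
    return counts


print(count_ts_tv("AACGT", "GACTA"))
```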

Mathematical models of substitution

• Are essential for studying the dynamics of nucleotide substitutions:

1) Jukes & Cantor one-parameter model (JC) (1969),
2) Kimura's two-parameter model (K2P) (1980),
3) Felsenstein model (F81) (1981),
4) Hasegawa, Kishino & Yano model (HKY85) (1985),
5) A general reversible model (REV) (Rodríguez et al., 1990).


Jukes & Cantor model (JC)

assumes no bias in the direction of change, so that substitutions occur randomly among the four types of nucleotides with equal probability.
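One practical consequence of the JC model is a correction for multiple hits at the same site. If p is the observed proportion of differing sites between two aligned sequences, the estimated number of substitutions per site is d = -(3/4) ln(1 - 4p/3). The sketch below uses a hypothetical function name and assumes p has already been computed from an alignment.

```python
import math


def jc69_distance(p):
    """Jukes-Cantor distance (substitutions per site) from the observed
    proportion p of differing sites between two aligned sequences."""
    if not 0.0 <= p < 0.75:
        # At p >= 0.75 the sequences look random and the log is undefined.
        raise ValueError("p must satisfy 0 <= p < 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# The corrected distance exceeds the raw proportion of differences.
print(jc69_distance(0.30))
```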


Kimura's two-parameter model (K2P)

incorporates the observation that the transition rate per site (α) may differ from the transversion rate per site (β).
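Under K2P, transitional and transversional differences are counted separately. With P the observed proportion of sites showing a transition and Q the proportion showing a transversion, the distance is d = -(1/2) ln(1 - 2P - Q) - (1/4) ln(1 - 2Q). A minimal sketch, with a hypothetical function name and P and Q assumed to come from an existing alignment:

```python
import math


def k2p_distance(p_ts, q_tv):
    """Kimura two-parameter distance (substitutions per site).

    p_ts: proportion of sites differing by a transition (P).
    q_tv: proportion of sites differing by a transversion (Q).
    """
    a = 1.0 - 2.0 * p_ts - q_tv
    b = 1.0 - 2.0 * q_tv
    if p_ts < 0 or q_tv < 0 or a <= 0 or b <= 0:
        raise ValueError("P and Q are out of the range where K2P is defined")
    return -0.5 * math.log(a) - 0.25 * math.log(b)


print(k2p_distance(0.10, 0.05))
```

When Q = 2P, which is what JC predicts because there are twice as many possible transversions as transitions, this expression reduces to the JC distance for p = P + Q.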


Felsenstein model (F81)

allows the frequencies of the 4 nucleotides to differ, but assumes that they are approximately the same across all the sequences.


Both K2P & F81 extend JC


Hasegawa, Kishino & Yano model (HKY85)

merges the K2P & F81 models by:

1. allowing transitions & transversions to occur at different rates,

2. allowing base frequencies to vary as well.


The general reversible model (REV)

adopts a more general approach, with a probability matrix of six parameters, making it possible to generate any of the previous models.
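The nesting of these models can be sketched by building an instantaneous rate matrix Q in which the rate from base i to base j is an exchangeability for the pair {i, j} multiplied by the frequency of j. REV (often called GTR) allows six exchangeabilities; the simpler models are restrictions of it. The code below is a sketch under stated assumptions: the function names are hypothetical, the matrices are left unscaled (not normalized to one substitution per unit time), and the example frequencies and kappa value are made up.

```python
BASES = "ACGT"
TRANSITIONS = {frozenset("AG"), frozenset("CT")}


def gtr_rate_matrix(exch, freqs):
    """4x4 rate matrix with q[i][j] = exch[{i, j}] * freqs[j] for i != j."""
    q = [[0.0] * 4 for _ in range(4)]
    for i, a in enumerate(BASES):
        for j, b in enumerate(BASES):
            if i != j:
                q[i][j] = exch[frozenset((a, b))] * freqs[b]
        # The diagonal makes each row sum to zero.
        q[i][i] = -sum(q[i])
    return q


def exchangeabilities(ts_rate, tv_rate):
    """Two-rate special case: one value for transitions, one for transversions."""
    pairs = [frozenset((a, b)) for i, a in enumerate(BASES) for b in BASES[i + 1:]]
    return {p: (ts_rate if p in TRANSITIONS else tv_rate) for p in pairs}


UNIFORM = {b: 0.25 for b in BASES}


def jc69():
    return gtr_rate_matrix(exchangeabilities(1.0, 1.0), UNIFORM)


def k2p(kappa):
    return gtr_rate_matrix(exchangeabilities(kappa, 1.0), UNIFORM)


def f81(freqs):
    return gtr_rate_matrix(exchangeabilities(1.0, 1.0), freqs)


def hky85(kappa, freqs):
    return gtr_rate_matrix(exchangeabilities(kappa, 1.0), freqs)


# Hypothetical base frequencies and transition/transversion ratio.
example_freqs = {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3}
for row in hky85(2.0, example_freqs):
    print([round(x, 3) for x in row])
```

Setting kappa to 1 in `hky85` gives F81, and using uniform frequencies gives K2P, which mirrors the statement that HKY85 merges the two.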


Models of mutation


1. The infinite-alleles model (IAM)

assumes that every new mutation that arises in a population creates a new allele that had not existed previously (Kimura & Crow, 1964).


2. The infinite-sites model (ISM)

• assumes that each new mutation alters a site (i.e. a nucleotide) within a sequence or allele rather than creating an entirely new allele; if the mutation rate is sufficiently low, every polymorphic site segregates for just two nucleotides (Kimura, 1969).

3. The stepwise-mutation model (SMM)

• initially developed for allozyme variation and then adopted for microsatellite mutations;

• assumes that mutation only occurs to adjacent states (Ohta and Kimura, 1973);

• in the case of microsatellites, different alleles have different numbers of repeats, and mutation occurs only by adding or deleting one repeat.

• Unlike in the IAM, in the SMM mutation may produce alleles that are already present in the population (Hedrick, 2005).
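The contrast between the two models can be sketched with two small mutation functions: under the SMM an allele is a repeat count that moves one step up or down, whereas under the IAM every mutation returns a label never seen before. The function names, the integer allele labels, and the rule that a single-repeat allele can only gain a repeat are assumptions made for this illustration.

```python
import random


def smm_mutate(repeats, rng):
    """Stepwise mutation: gain or lose exactly one repeat unit."""
    if repeats <= 1:
        # Assumed boundary rule: an allele cannot drop below one repeat.
        return repeats + 1
    return repeats + rng.choice((-1, 1))


def iam_mutate(existing_alleles):
    """Infinite-alleles mutation: return an allele label not yet in use."""
    return max(existing_alleles, default=0) + 1


rng = random.Random(42)
allele = 10
visited = [allele]
for _ in range(20):
    allele = smm_mutate(allele, rng)
    visited.append(allele)

# Under the SMM the walk can return to repeat counts it has already produced.
print(visited)
print(len(set(visited)) < len(visited))
```

Two microsatellite alleles of the same length may therefore be identical in state without being identical by descent, which cannot happen under the IAM.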


4. The Wright-Fisher model
• is a simple representation of a population that Sewall Wright (1931) and Ronald Fisher (1930) used in developing the principles of population genetics (Hedrick, 2005).

• Assumes non-overlapping generations of individuals, random mating and a constant population size of N diploid individuals, resulting in a Poisson distribution of offspring number (Hey and Machado, 2003).


Non-overlapping generations mathematical model (adapted from Hartl and Clark, 1997, with modifications)

In this model, all organisms from one generation die before the members of the next generation mature. It applies literally only to organisms with a very simple life history, such as short-lived insects, but it can be used in population genetics as a first approximation to populations with more complex life histories.
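A Wright-Fisher population with non-overlapping generations can be simulated by redrawing all 2N gene copies each generation, with every copy inheriting the focal allele with probability equal to its current frequency. This is a minimal sketch without mutation or selection; the function name, parameters, and example values are assumptions for illustration.

```python
import random


def wright_fisher(pop_size, p0, generations, seed=None):
    """Simulate allele-frequency drift in a diploid Wright-Fisher population.

    pop_size: constant number of diploid individuals (N).
    p0: starting frequency of the focal allele.
    Returns the list of frequencies, starting with p0.
    """
    rng = random.Random(seed)
    n_copies = 2 * pop_size
    p = p0
    freqs = [p]
    for _generation in range(generations):
        # Binomial sampling: each new copy picks a parent copy at random.
        count = sum(1 for _copy in range(n_copies) if rng.random() < p)
        p = count / n_copies
        freqs.append(p)
    return freqs


trajectory = wright_fisher(pop_size=50, p0=0.5, generations=100, seed=1)
print(trajectory[-1])
```

Running the simulation with different seeds shows the frequency wandering until the allele is eventually lost or fixed, and the wandering is faster in smaller populations.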

5. The coalescent approach

• Credited to Kingman (1982).
• In brief, this theory builds on the Wright-Fisher model and works by tracing alleles back to their ancestors and calculating the times to the common ancestral allele.

• The point at which the ancestral allele is reached is called coalescence (Hedrick, 2005).

The coalescent

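Figure: genealogy of four sampled sequences (A, B, C and D) traced back through successive coalescent events to the most recent common ancestor (MRCA); time runs backwards.

Time of coalescence for n lineages

$T_n = \frac{4N_e}{n(n-1)}$

For the last two lineages (n = 2):

$T_2 = 2N_e$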

Coalescence of the last two lineages accounts for about half of the total expected time back to the MRCA.
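This statement can be checked numerically, assuming the standard diploid result that the expected time during which k lineages remain is 4Ne / (k(k-1)) generations. Summing from k = 2 to n gives the expected time to the MRCA, 4Ne(1 - 1/n). The function names and the value of Ne below are hypothetical.

```python
def expected_coalescence_time(k, ne):
    """Expected generations spent with k lineages: 4*Ne / (k*(k-1))."""
    if k < 2:
        raise ValueError("need at least two lineages")
    return 4.0 * ne / (k * (k - 1))


def expected_tmrca(n, ne):
    """Expected generations back to the MRCA of a sample of n sequences."""
    return sum(expected_coalescence_time(k, ne) for k in range(2, n + 1))


def last_pair_fraction(n, ne):
    """Share of the expected TMRCA spent with only two lineages left."""
    return expected_coalescence_time(2, ne) / expected_tmrca(n, ne)


NE = 1000  # hypothetical effective population size
for n in (2, 5, 10, 100):
    print(n, expected_tmrca(n, NE), round(last_pair_fraction(n, NE), 3))
```

The fraction equals n / (2(n - 1)), so it is 1 for a sample of two and approaches one half from above as the sample grows.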