Molecular Evolution Sylvia Nagl


•Relationships between DNA or amino acid sequence, 3D structure, and protein functions

•Use of this knowledge for prediction of function, molecular modelling, and design (e.g., new therapies)

Sequence-structure-function paradigm

CGCCAGCTGGACGGGCACACCATGAGGCTGCTGACCCTCCTGGGCCTTCTG…

TDQAAFDTNIVTLTRFVMEQGRKARGTGEMTQLLNSLCTAVKAISTAVRKAGIAHLYGIAGSTNVTGDQVKKLDVLSNDLVINVLKSSFATCVLVTEEDKNAIIVEPEKRGKYVVCFDPLDGSSNIDCLVSIGTIFGIYRKNSTDEPSEKDALQPGRNLVAAGYALYGSATML

A novel sequence or structure

Prediction based on “similarity” = evolutionary relatedness

Evolution as an algorithmic process

Random mutation (genotype) “mutate”

Selection (phenotype) “select”

Differential reproduction “replicate”

The term algorithm denotes a certain kind of formal process consisting of simple steps that are executed repetitively in a defined sequential order and will reliably produce a definite kind of result whenever the algorithm is run or ‘instantiated.’

‘cumulative’

•Cumulative selection will work on almost anything that can yield similar, but non-identical, copies of itself through some replication process.

•It depends on a medium that stores information and can be passed on to the next generation: DNA or RNA (viruses) in terrestrial life forms.

•Most genetic mutations are deleterious; proofreading and error-correction mechanisms and negative selection act against them.

•Whenever positive selection acts, it can be thought of as selecting DNA with particular phenotypic effects over others with different effects.

•Advantageous mutations may confer a survival and reproductive advantage on individuals who will then, on average, pass on more copies of their genetic material because they will tend to have a larger number of offspring.

•Over many generations, the accumulation of small changes can result in the evolution of DNA sequences with new associated phenotypic effects.
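The mutate, select, replicate cycle can be sketched as a short loop. This is only a toy illustration under strong assumptions: the function names (`mutate`, `fitness`, `evolve`), the mutation rate, and above all the fixed `target` sequence used as a stand-in for phenotypic fitness are hypothetical and not part of the slides; natural selection has no predefined target.

```python
import random

ALPHABET = "ACGT"

def mutate(seq, rate, rng):
    """Copy seq, redrawing each base at random with probability `rate`."""
    return "".join(rng.choice(ALPHABET) if rng.random() < rate else base
                   for base in seq)

def fitness(seq, target):
    """Toy phenotype score: number of positions that match the target."""
    return sum(a == b for a, b in zip(seq, target))

def evolve(target, generations=200, offspring=50, rate=0.02, seed=0):
    rng = random.Random(seed)
    parent = "".join(rng.choice(ALPHABET) for _ in target)
    history = [fitness(parent, target)]
    for _ in range(generations):
        # replicate + mutate: similar, but non-identical, copies of the parent
        brood = [mutate(parent, rate, rng) for _ in range(offspring)]
        # select: the fittest variant founds the next generation; the parent
        # stays in the pool, so the best score never decreases
        parent = max(brood + [parent], key=lambda s: fitness(s, target))
        history.append(fitness(parent, target))
    return parent, history
```

Because each generation starts from the previous winner rather than from scratch, small improvements accumulate, which is the sense of ‘cumulative’ used above.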

Roadmap of the human genome

•Genes and gene-related sequences (900 Mb): coding DNA (90 Mb) and noncoding DNA (810 Mb: pseudogenes, gene fragments, introns, leaders, trailers)

•Extragenic DNA (2100 Mb): unique and low-copy number (1680 Mb) and repetitive DNA (420 Mb)

•Repetitive DNA: genome-wide interspersed repeats (DNA transposons, LTR elements, LINEs, SINEs) and non-coding tandem repeats (satellite DNA, minisatellites, microsatellites)

•Other labels in the figure: single-copy genes, multi-gene families, regulatory sequences, dispersed, tandemly repeated
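The sizes in the roadmap can be cross-checked with a little arithmetic. The sketch below copies the Mb values from the figure; the dictionary layout and the helper names `subtotal` and `fractions` are illustrative assumptions, and the genome total is simply the sum of the listed parts.

```python
# Sizes in Mb, copied from the roadmap figure.
GENOME = {
    "genes and gene-related sequences": {
        "coding DNA": 90,
        "noncoding DNA": 810,
    },
    "extragenic DNA": {
        "unique and low-copy number": 1680,
        "repetitive DNA": 420,
    },
}

def subtotal(parts):
    """Sum the sizes of the sub-categories of one branch."""
    return sum(parts.values())

def fractions(genome):
    """Fraction of the summed genome size taken up by each sub-category."""
    total = sum(subtotal(parts) for parts in genome.values())
    return {name: size / total
            for parts in genome.values()
            for name, size in parts.items()}
```

With these numbers the two branches add up to 900 Mb and 2100 Mb, and coding DNA comes to 90 of 3000 Mb, i.e. 3% of the total.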

Multi-gene families: Evolution by gene duplication

•Gene duplication is the most important mechanism for generating new genes and new biochemical processes.

This mechanism has facilitated the evolution of complex organisms:

•In the genomes of eukaryotes, internal duplications of gene segments have occurred frequently. Many complex genes might have evolved from small primordial genes through internal duplication and subsequent modification.

•Vertebrate genomes contain many gene families absent in invertebrates.

•Many gene duplications occurred in the early evolution of animals (“Biology’s Big Bang”, “Cambrian explosion”, ~570-505 million years ago).

Types of duplication events

A duplication may involve

•a single gene (complete gene duplication)

•part of a gene (internal or partial gene duplication)

•part of a chromosome (partial polysomy)

•an entire chromosome (aneuploidy or polysomy)

•the whole genome (polyploidy)

Gene duplication: Mechanisms

•Unequal sister chromatid exchange at meiosis

•Unequal crossing-over at meiosis

•Transposition via an RNA intermediate: transcription to RNA, reverse transcription to cDNA, reintegration

•DNA transposons: transposon replication
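Unequal crossing-over can be pictured as exchanging chromosome arms at mismatched positions. The sketch below is a deliberately simplified model: chromosomes are plain lists of gene labels, and the function name `unequal_crossover` and the labels `x`, `a`, `y` are hypothetical.

```python
def unequal_crossover(chrom_a, chrom_b, break_a, break_b):
    """Exchange the right-hand arms of two chromosomes cut at different positions.

    Chromosomes are lists of gene labels; break_a and break_b are the list
    indices at which each chromosome is cut.
    """
    recombinant_1 = chrom_a[:break_a] + chrom_b[break_b:]
    recombinant_2 = chrom_b[:break_b] + chrom_a[break_a:]
    return recombinant_1, recombinant_2

chrom = ["x", "a", "y"]
# Misaligned pairing: one copy is cut after gene a, the other before it.
duplicated, deleted = unequal_crossover(chrom, chrom, 2, 1)
```

One product carries gene `a` twice (the duplication) and the other has lost it, which is why this mechanism yields tandemly arranged copies.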

Homology: Paralogy, orthology and xenology

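•Duplication: ancestral gene a gives rise to genes a and b.

•Speciation: species 1 and species 2 each inherit a and b.

•Paralogous: genes related by the duplication (a and b).

•Orthologous: the same gene in different species, related by speciation (a in species 1 and a in species 2).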
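The distinction can be stated operationally: two genes are orthologous if their last common ancestor in the gene tree is a speciation event, and paralogous if it is a duplication. The sketch below encodes the diagram (gene a duplicates into a and b, then species 1 and species 2 split) as nested tuples; this encoding and the names `TREE`, `leaves` and `relationship` are assumptions made for illustration, and the function expects two different leaves that are both in the tree.

```python
# Internal nodes: (event, left subtree, right subtree).
# Leaves: (gene, species).
TREE = ("duplication",
        ("speciation", ("a", "species 1"), ("a", "species 2")),
        ("speciation", ("b", "species 1"), ("b", "species 2")))

def is_leaf(node):
    return len(node) == 2

def leaves(node):
    """Set of all leaves below a node."""
    if is_leaf(node):
        return {node}
    return leaves(node[1]) | leaves(node[2])

def relationship(node, x, y):
    """Classify leaves x and y by the event at their last common ancestor."""
    for child in node[1:]:
        if not is_leaf(child) and {x, y} <= leaves(child):
            return relationship(child, x, y)
    return "orthologous" if node[0] == "speciation" else "paralogous"
```

For example, `relationship(TREE, ("a", "species 1"), ("a", "species 2"))` descends into the left speciation node, while any a/b pair is separated at the root duplication.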

‘Redundant copy’

Random mutation (genotype) “mutate”

Selection (phenotype) “select”

Differential reproduction “replicate”

Duplication – mutation in ‘redundant copy’ – paralogy – new function

Complete gene duplication

•Pseudogene (silent): deleterious mutations

•Invariant repeats (“tandem arrays”): increased gene product. Examples: large quantities of specific rRNAs or tRNAs, histone proteins; amplified esterase gene in Culex mosquito

•Variant repeats: sequence divergence; function or regulation may differ. Example: HOX/HOM genes

Dayhoff (1978):

at least 50% identity: gene family

>35% identity: homologous (super)family
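The Dayhoff thresholds can be applied to a pair of aligned sequences as follows. This is a minimal sketch: the function names are hypothetical, the sequences are assumed to be already aligned to equal length, and dividing by the full alignment length (gap columns included) is only one of several conventions for percent identity.

```python
def percent_identity(seq1, seq2):
    """Percent of alignment columns with identical residues (gaps never match)."""
    if len(seq1) != len(seq2) or not seq1:
        raise ValueError("sequences must be aligned to the same non-zero length")
    matches = sum(a == b and a != "-" for a, b in zip(seq1, seq2))
    return 100.0 * matches / len(seq1)

def classify(identity):
    """Apply the two identity thresholds quoted from Dayhoff (1978)."""
    if identity >= 50:
        return "gene family"
    if identity > 35:
        return "homologous (super)family"
    return "below both thresholds"
```

Two aligned sequences that agree at 3 of 4 positions give 75% identity and fall into the gene family class; a pair at 40% would be placed in the same homologous (super)family only.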

Evolution of Hox and HOM gene clusters by gene duplication

Figure: gene clusters of mouse, Amphioxus and Drosophila (Antennapedia and Bithorax complexes), derived from a hypothetical ancestor by gene duplication.

Internal gene duplication: Domain duplication

•Duplicated gene segments often correspond to functional or structural domains.

A domain is a well-defined region within a protein that either performs a specific function or constitutes a stable structural unit.

•Domain duplication is a form of internal duplication.

This mechanism may

•increase the number of active sites

•enable acquisition of a new function by modifying the redundant segment.

Domain duplication increases the functional complexity of genes in evolution.

Internal repeats in the apolipoprotein genes

The structural and functional module: a 22-mer repeat

In exon 4 of the genes belonging to this family, this 22-mer is repeated 1 to ~15 times.

The presence of many copies led to the evolution of new functions:

apoE now plays a role in neural regeneration, immunoregulation, growth and differentiation, via interactions with low-density lipoprotein receptors and apoE receptors.
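Counting how often a fixed-length unit is repeated in tandem can be sketched as below. The function `count_tandem_copies` and its mismatch tolerance are illustrative assumptions; real internal repeats have diverged and are normally detected with alignment-based methods, and no actual apolipoprotein sequence is used here.

```python
def count_tandem_copies(seq, unit_len, max_mismatches=0):
    """Count consecutive unit_len-long blocks, starting at the beginning of
    seq, that differ from the first block in at most max_mismatches positions."""
    if unit_len <= 0:
        raise ValueError("unit_len must be positive")
    reference = seq[:unit_len]
    if len(reference) < unit_len:
        return 0
    copies = 0
    for start in range(0, len(seq) - unit_len + 1, unit_len):
        block = seq[start:start + unit_len]
        mismatches = sum(a != b for a, b in zip(reference, block))
        if mismatches > max_mismatches:
            break  # the run of repeats ends at the first divergent block
        copies += 1
    return copies
```

For the 22-mer module described above one would call it with `unit_len=22` on the relevant stretch of exon 4, allowing a few mismatches per copy.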

Gene evolution by domain shuffling

1. Internal duplication

Duplication of one or more domains

2. Domain insertion

Structural or functional domains are exchanged between proteins or inserted into a protein

“Mosaic or chimeric proteins”

Examples: Two common domains

Kringle domain from plasminogen protein

EGF-like domain from coagulation factor X
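Both routes of domain shuffling can be modelled by treating a protein as an ordered list of domain names. The sketch is purely illustrative: the helper names, the `"core domain"` label and the resulting architecture are hypothetical and do not describe the domain order of any real protein.

```python
def duplicate_domain(protein, index):
    """Internal duplication: copy the domain at `index` right next to itself."""
    return protein[:index + 1] + [protein[index]] + protein[index + 1:]

def insert_domain(protein, domain, index):
    """Domain insertion: place a domain taken from another protein at `index`."""
    return protein[:index] + [domain] + protein[index:]

ancestor = ["core domain"]  # hypothetical single-domain protein
mosaic = insert_domain(ancestor, "kringle", 0)
mosaic = insert_domain(mosaic, "EGF-like", 0)
mosaic = duplicate_domain(mosaic, 1)  # duplicate the kringle domain
```

Both helpers return new lists, so the ancestral architecture stays available for comparison with the mosaic product.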

Domain insertion: “Mosaic proteins”

•Structural modules: plasminogen kringle domain, EGF domain, fibronectin finger domain, vitamin K-dependent calcium-binding domain (osteocalcin)

•Mosaic proteins: tissue plasminogen activator, urokinase, prothrombin, plasminogen

•Domain origins: trypsin-like serine protease, fibronectin, epidermal growth factor (EGF)