Book club Andreas Wagner, The Origins of Evolutionary Innovations Chapter 4 Book club presented by G. M. Dall'Olio, Pompeu Fabra, IBE-CEXS

Wagner chapter 4


Book club on "Origins of Evolutionary Innovations" by A. Wagner http://bioinfoblog.it/


Page 1: Wagner chapter 4

Book club

Andreas Wagner, The Origins of Evolutionary Innovations

Chapter 4

Book club presented by G. M. Dall'Olio, Pompeu Fabra, IBE-CEXS

Page 2: Wagner chapter 4

Reminder: Genotype network

A genotype network is a set of genotypes that have the same phenotype and are connected to each other through steps that change a single position

AAAAA AAAAC AAAAG AAAAT AAATT

AAACA AAACC AAACG AAACT AAATC

AACCA AACCC AACCG AACCT …..

ACCCA ACCCC ACCCG ACCCT …..

CCCCA CCCCC CCCCG CCCCT …..

….. ….. ….. ….. …..

Yellow = same phenotype = a genotype network

Note: genotype network == neutral network
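The definition above can be sketched in a few lines of Python. This is only an illustration: `toy_phenotype` is a made-up rule (it is not from the book), and `genotype_networks` simply groups genotypes that share a phenotype and can reach each other through single-letter changes.

```python
from itertools import product

ALPHABET = "ACGT"

def neighbors(genotype):
    """Yield every genotype that differs from `genotype` at exactly one position."""
    for i, letter in enumerate(genotype):
        for other in ALPHABET:
            if other != letter:
                yield genotype[:i] + other + genotype[i + 1:]

def genotype_networks(phenotype_of, length):
    """Group all genotypes of a given length into genotype networks.

    Two genotypes land in the same network when they share a phenotype and
    are linked by a path of single-letter changes that never leaves it.
    """
    unvisited = {"".join(p) for p in product(ALPHABET, repeat=length)}
    networks = []
    while unvisited:
        start = unvisited.pop()
        phenotype = phenotype_of(start)
        component, stack = {start}, [start]
        while stack:
            current = stack.pop()
            for n in neighbors(current):
                if n in unvisited and phenotype_of(n) == phenotype:
                    unvisited.remove(n)
                    component.add(n)
                    stack.append(n)
        networks.append((phenotype, component))
    return networks

def toy_phenotype(seq):
    """Hypothetical phenotype for illustration only: does the sequence contain 'CC'?"""
    return "CC" in seq

networks = genotype_networks(toy_phenotype, 3)
```

With this toy rule, every sequence of length 3 that contains "CC" ends up in one network, because each of them is a single change away from CCC.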

Page 3: Wagner chapter 4

Genotype Networks: a better representation!

The Genotype Space can be represented as a Hamming Graph

https://bitbucket.org/dalloliogm/genotype_space
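To make the Hamming graph concrete: nodes are sequences, and two nodes are joined when they differ at exactly one position. The sketch below is not taken from the linked repository; the function names are chosen here for illustration.

```python
def hamming_distance(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))

def genotype_space_size(length, alphabet_size=4):
    """Number of nodes in the Hamming graph: one per possible sequence."""
    return alphabet_size ** length

def neighbors_per_genotype(length, alphabet_size=4):
    """Degree of every node: each position can change to any other letter."""
    return length * (alphabet_size - 1)

# For the 5-letter DNA sequences used in the reminder slide:
nodes = genotype_space_size(5)        # 4**5 = 1024 sequences
degree = neighbors_per_genotype(5)    # 5 * 3 = 15 neighbors each
steps = hamming_distance("AAAAA", "AAATT")  # 2 single-letter changes apart
```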

Page 4: Wagner chapter 4

Chapter 4: Novel Molecules

This chapter describes the relationship between protein/RNA sequence and tertiary structure

Many RNAs/proteins share the same fold while having different sequences

Page 5: Wagner chapter 4

Novel Molecules, definitions (1)

Genotype:

def 1: the amino acid sequence of a protein (or the list of hydrophobic/hydrophilic residues)

def 2: the nucleotide sequence of an RNA

Page 6: Wagner chapter 4

A genotype space of sequences

Page 7: Wagner chapter 4

A genotype space of sequences (simplified)

O = any hydrophobic amino acid

Y = any hydrophilic amino acid
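The simplification can be sketched as a mapping from the 20-letter alphabet to the two letters O and Y. Which residues count as hydrophobic depends on the scale used; the set below is an assumption made for this example, not the classification used in the book.

```python
# Assumption: one common choice of hydrophobic residues (one-letter codes).
# Other hydrophobicity scales draw the line differently.
HYDROPHOBIC = set("AVLIMFWC")

def to_hp(sequence):
    """Reduce a protein sequence to O (hydrophobic) / Y (hydrophilic)."""
    return "".join("O" if aa in HYDROPHOBIC else "Y" for aa in sequence.upper())

# Hypothetical 5-residue peptide, for illustration only.
reduced = to_hp("MKVLA")  # "OYOOO"
```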

Page 8: Wagner chapter 4

Novel Molecules, definitions (2)

Phenotype:

The fold of a protein sequence

The secondary structure of an RNA molecule

Page 9: Wagner chapter 4

Protein Structures

It is also possible to predict the fold of a protein

But it is difficult, so here we focus on “lattice models”

In a lattice model, we only use hydrophobic or hydrophilic amino acids
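As a rough illustration of how a lattice model scores a fold, the sketch below places a chain on a 2D square grid and counts contacts between hydrophobic residues that touch on the grid without being neighbors in the chain. This contact-counting energy is an assumption borrowed from the simplest HP-style models; the studies cited in this deck may use different lattices and energy parameters.

```python
def contact_energy(hp_sequence, coords):
    """Energy of a chain folded on a 2D square lattice.

    hp_sequence: string of 'O' (hydrophobic) and 'Y' (hydrophilic).
    coords: list of (x, y) grid positions, one per residue, in chain order.
    Each hydrophobic-hydrophobic contact between residues that are not
    consecutive in the chain contributes -1 (assumed scoring rule).
    """
    if len(hp_sequence) != len(coords):
        raise ValueError("need one coordinate per residue")
    if len(set(coords)) != len(coords):
        raise ValueError("the chain must not visit a grid point twice")
    for (x1, y1), (x2, y2) in zip(coords, coords[1:]):
        if abs(x1 - x2) + abs(y1 - y2) != 1:
            raise ValueError("consecutive residues must be grid neighbors")

    position_of = {c: i for i, c in enumerate(coords)}
    energy = 0
    for i, (x, y) in enumerate(coords):
        if hp_sequence[i] != "O":
            continue
        # Look only right and up so that each contact is counted once.
        for neighbor in ((x + 1, y), (x, y + 1)):
            j = position_of.get(neighbor)
            if j is not None and abs(i - j) > 1 and hp_sequence[j] == "O":
                energy -= 1
    return energy

# The same 4-residue chain, folded into a square and stretched out.
folded = contact_energy("OYYO", [(0, 0), (1, 0), (1, 1), (0, 1)])     # -1
extended = contact_energy("OYYO", [(0, 0), (1, 0), (2, 0), (3, 0)])   # 0
```

In this picture the "fold" of a sequence is the conformation with the lowest energy, so two sequences share a phenotype when the same conformation is optimal for both.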

Page 10: Wagner chapter 4

A Genotype network

In this example, all orange sequences have the same fold:

Page 11: Wagner chapter 4

More sequences than folds

Li et al., 1996: study on lattice protein models:

There are many more protein sequences than folds

Some phenotypes are formed by more sequences than others

Sequences that produce the same fold can be very different

Rost, 1997: study on 272 proteins with similar folds. They shared only 8.5% of their amino acid sequence

Page 12: Wagner chapter 4

There are many more protein sequences than protein folds

Globins are a very common protein domain

Most globins have different sequences, but the same fold

Among some hemoglobins, only 12.4% of aa residues are identical
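Figures such as 8.5% or 12.4% are sequence identities. A minimal sketch of the calculation, assuming the two sequences are already aligned, have equal length, and contain no gaps (real comparisons need an alignment step first); the example sequences are invented:

```python
def percent_identity(a, b):
    """Percentage of aligned positions at which two sequences carry the same residue."""
    if len(a) != len(b) or not a:
        raise ValueError("need two non-empty sequences of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)

# Hypothetical aligned fragments, for illustration only.
identity = percent_identity("MKVLAAGI", "MRVLSTGI")  # 5 of 8 positions match
```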

Page 13: Wagner chapter 4

Do globins have a common origin?

Bailly, X., Chabasse, C., Hourdez, S., Dewilde, S., Martial, S., Moens, L. and Zal, F. (2007), Globin gene family evolution and functional diversification in annelids. FEBS Journal, 274: 2641–2652. doi: 10.1111/j.1742-4658.2007.05799.x

Goodman M, Pedwaydon J, Czelusniak J, Suzuki T, Gotoh T, Moens L, Shishikura F, Walz D, Vinogradov S. An evolutionary tree for invertebrate globin sequences. J Mol Evol. 1988;27(3):236-49. PubMed PMID: 3138426.

Page 14: Wagner chapter 4

Some folds are more common than others

Some folds can be obtained from a higher number of sequences than others

Number of protein sequences per structure (Ferrada & Wagner, 2010):

Ferrada, E. & Wagner, A., 2010. Evolutionary innovations and the organization of protein functions in genotype space. PloS one, 5(11), p.e14172. Available at: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2994758&tool=pmcentrez&rendertype=abstract

Page 15: Wagner chapter 4

The 10 most structurally promiscuous functions

Promiscuity of a function: when the function can be obtained by different structures/sequences

Ferrada, E. & Wagner, A., 2010. Evolutionary innovations and the organization of protein functions in genotype space. PloS one, 5(11), p.e14172. Available at: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2994758&tool=pmcentrez&rendertype=abstract

Page 16: Wagner chapter 4

Genotype networks of protein sequences

Sequences that have the same fold tend to be connected in a genotype network (from Li et al, 1996)

The situation is closer to figure 1 (above) than to figure 2 (below)

Page 17: Wagner chapter 4

RNA structures

RNA secondary structures can be predicted in silico

http://rna.ucsc.edu/rnacenter/ribosome_images.html

Page 18: Wagner chapter 4

RNA structure videogame

There is even a videogame on predicting RNA structure:

http://eterna.cmu.edu/

So, predicting RNA structures is (relatively) easy
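To give a feeling for why RNA secondary structure is computationally tractable, here is a sketch of the Nussinov base-pair maximization algorithm. It is a deliberately simplified stand-in: real prediction tools minimize a thermodynamic free energy rather than just counting pairs. The `min_loop` parameter (minimum number of unpaired bases inside a hairpin) and the inclusion of G-U wobble pairs are assumptions of this example.

```python
# Allowed base pairs: Watson-Crick plus G-U wobble (assumed for this sketch).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}

def nussinov(seq, min_loop=3):
    """Return a dot-bracket structure that maximizes the number of nested base pairs."""
    n = len(seq)
    # best[i][j] = maximum number of pairs in seq[i..j]; short spans stay 0.
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i + 1][j]  # case: base i stays unpaired
            for k in range(i + min_loop + 1, j + 1):  # case: i pairs with k
                if (seq[i], seq[k]) in PAIRS:
                    right = best[k + 1][j] if k + 1 <= j else 0
                    score = max(score, 1 + best[i + 1][k - 1] + right)
            best[i][j] = score

    # Trace back one optimal set of pairs.
    structure = ["."] * n
    stack = [(0, n - 1)]
    while stack:
        i, j = stack.pop()
        if j - i <= min_loop:
            continue
        if best[i][j] == best[i + 1][j]:
            stack.append((i + 1, j))
            continue
        for k in range(i + min_loop + 1, j + 1):
            if (seq[i], seq[k]) in PAIRS:
                right = best[k + 1][j] if k + 1 <= j else 0
                if best[i][j] == 1 + best[i + 1][k - 1] + right:
                    structure[i], structure[k] = "(", ")"
                    stack.append((i + 1, k - 1))
                    stack.append((k + 1, j))
                    break
    return "".join(structure)

hairpin = nussinov("GGGAAACCC")  # "(((...)))"
```

The table has one entry per pair of positions and each entry scans one split point, so the running time grows only with the cube of the sequence length.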

Page 19: Wagner chapter 4

Innovations in RNA folds

All the observations made for protein sequences are also valid for RNA, on a larger scale:

On average, 400 million RNA seqs per fold

Very long RNA sequences tend to have similar folds

Page 20: Wagner chapter 4

There are many more RNA sequences than RNA folds

Size rank of genotype set by frequency

Wagner, A., 2008. Robustness and evolvability: a paradox resolved. Proceedings. Biological sciences / The Royal Society, 275(1630), pp.91-100. Available at: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2562401&tool=pmcentrez&rendertype=abstract

Page 21: Wagner chapter 4

Frequent RNA structures

def. frequent RNA structure: an RNA structure that can be obtained from > 5000 sequences

Only 10% of RNA structures are frequent

93% of RNA sequences belong to frequent RNA structures
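The two percentages answer different questions: what share of structures is frequent, and what share of sequences folds into a frequent structure. A small sketch of both calculations follows; the counts are invented toy numbers, not data from the book.

```python
def frequency_summary(sequences_per_structure, threshold=5000):
    """Return (share of structures that are frequent, share of sequences folding into them).

    A structure is 'frequent' when more than `threshold` sequences fold into it.
    """
    frequent = {s: n for s, n in sequences_per_structure.items() if n > threshold}
    total_sequences = sum(sequences_per_structure.values())
    return (
        len(frequent) / len(sequences_per_structure),
        sum(frequent.values()) / total_sequences,
    )

# Hypothetical counts: one very large genotype set and four small ones.
toy_counts = {"s1": 90000, "s2": 4000, "s3": 3000, "s4": 2000, "s5": 1000}
structure_share, sequence_share = frequency_summary(toy_counts)  # 0.2 and 0.9
```

Even in this toy case, a minority of structures accounts for most sequences, which is the pattern the slide describes.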

Page 22: Wagner chapter 4

RNA sequences can withstand a lot of changes, without modifying the fold

Maximal genotype distance in a RNA gen. network:

A. Wagner, The Origins of Evolutionary Innovations. Figure 4.6

Page 23: Wagner chapter 4

RNA sequences can withstand a lot of changes, without modifying the fold

Different sequence, same fold:

http://eterna.cmu.edu/

Page 24: Wagner chapter 4

Neighbors of points in the genotype network

Most neighbors of sequences in the space have the same fold

A. Wagner, The Origins of Evolutionary Innovations. Figure 4.7

Page 25: Wagner chapter 4

Neighbors of points in the genotype network

Most neighbors of sequences in the space have the same fold

This means that the genotype network of an RNA fold is usually dense

An RNA genotype network is more likely to look like fig 1 than fig 2:

Fig 1 Fig 2
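The share of neighbors that keep the same fold can be computed directly for any genotype. In the sketch below, `phenotype_of` is any function from sequence to phenotype; the `ends_pair` rule is a made-up toy used only to keep the example self-contained.

```python
def neutral_fraction(genotype, phenotype_of, alphabet="ACGU"):
    """Fraction of one-mutant neighbors that keep the phenotype of `genotype`."""
    target = phenotype_of(genotype)
    total = neutral = 0
    for i, letter in enumerate(genotype):
        for other in alphabet:
            if other == letter:
                continue
            mutant = genotype[:i] + other + genotype[i + 1:]
            total += 1
            if phenotype_of(mutant) == target:
                neutral += 1
    return neutral / total

def ends_pair(seq):
    """Hypothetical toy phenotype: can the first and last base form a Watson-Crick pair?"""
    return (seq[0], seq[-1]) in {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}

# 9 of the 15 neighbors of GAAAC (all changes at the three middle positions) are neutral.
fraction = neutral_fraction("GAAAC", ends_pair)  # 0.6
```

A dense genotype network corresponds to a high value of this fraction for most of its members.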

Page 26: Wagner chapter 4

Neighbors of genotypes in a genotype network

Two sequences on a genotype network have, by definition, the same fold.

But what about their neighbors?

A. Wagner, The Origins of Evolutionary Innovations. Figure 2.6

Page 27: Wagner chapter 4

Phenotype of neighbors of genotype network

The neighbors of two genotypes on the same genotype network can have very different phenotypes

Page 28: Wagner chapter 4

Novel RNA phenotypes

Schultes and Bartel designed a new ribozyme from two existing ones

Existing enzymes had <25% sequence similarity and no common structure

Few mutations needed to obtain the hybrid

Schultes, E.A. & Bartel, D.P., 2000. One sequence, two ribozymes: implications for the emergence of new ribozyme folds. Science (New York, N.Y.), 289(5478), pp.448-52. Available at: http://www.ncbi.nlm.nih.gov/pubmed/10903205

Page 29: Wagner chapter 4

Take Home messages

There are many more sequences than protein/RNA folds

Some folds correspond to more sequences than others

Sequences that produce the same fold can be very different

New folds can be reached by changing only a few bases

Page 30: Wagner chapter 4

A Genotype network

All blue sequences have the same fold