Module 4: Introduction to concepts and methods in molecular phylogenetics
Céline Poux, Laboratoire EEP – UMR 8198, Université de Lille
Introduction to bioinformatics
Samuel Blanquart, Inria, Laboratoire CRIStAL – UMR 9189, Université de Lille



Introduction to bioinformatics

Part 1: Introduction – Preparing the dataset
Part 2: Phylogenetic reconstruction – ML phylogenetic reconstruction
Part 3: Reconstruction biases
Part 4: Phylogenetic reconstruction – Bayesian reconstruction
Part 5: Molecular dating – Bayesian dating

Reconstruction biases

Data
Signal: orthologous characters
Noise: where does it come from?

Poor signal + lot of noise => wrong tree

Data | Alignment & cleaning | Tree building | Evolutionary models | Biases | Tree reliability

Reconstruction biases

1. Stochastic errors: sampling errors
2. Systematic errors: methodological errors
3. Biological errors: gene trees and species trees are discordant

Reconstruction biases: the stochastic errors

Stochastic errors are sampling errors: the sample size of the analyzed characters is too small.

Example: 106 orthologous genes from 8 yeast genomes (Rokas et al. 2003).
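The effect of sample size can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not the analysis of Rokas et al.: each character is assumed to support the true grouping with probability `p_true` (0.40 here, an arbitrary value) and one of two wrong groupings otherwise, and the grouping supported by the most characters wins. The function name and all parameter values are hypothetical.

```python
import random

def fraction_correct(n_sites, p_true=0.40, n_replicates=500, seed=1):
    """Fraction of replicate datasets in which the true grouping is
    supported by strictly more characters than either alternative."""
    rng = random.Random(seed)
    p_alt = (1.0 - p_true) / 2.0  # the two wrong groupings share the rest
    wins = 0
    for _ in range(n_replicates):
        counts = [0, 0, 0]  # index 0 = true grouping, 1 and 2 = alternatives
        for _ in range(n_sites):
            u = rng.random()
            if u < p_true:
                counts[0] += 1
            elif u < p_true + p_alt:
                counts[1] += 1
            else:
                counts[2] += 1
        if counts[0] > max(counts[1], counts[2]):
            wins += 1
    return wins / n_replicates

if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, "characters:", fraction_correct(n))
```

With few characters the true grouping often loses by chance alone; with many characters it wins almost every time. This is why stochastic error shrinks as more characters are sampled.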

Reconstruction biases: the stochastic errors

Stochastic errors are sampling errors

[Figure from Rokas et al. 2003: 106-gene dataset; labels 1 to 5, Node 3 and Node 5]

Reconstruction biases: the stochastic errors

[Figure: methods for phylogenomic inference (Delsuc et al. 2005)]

Reconstruction biases: the stochastic errors

Stochastic errors are sampling errors

=> Adding more sequences (phylogenomics) is not always enough to resolve inconsistencies (Philippe et al. 2011)

Reconstruction biases: the systematic errors

Systematic errors: the method of inference is inconsistent

✓ The hypotheses of the reconstruction methods are violated.
✓ The models of sequence evolution are not accurate.
=> Multiple substitutions (homoplasy) can go undetected or be wrongly inferred.

• Rate variation among lineages => Long branch attraction
• Rate variation among sites => Saturation at some sites
• Heterogeneity of nucleotide composition among species => Compositional biases
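Saturation can be made concrete with the Jukes-Cantor (JC69) model, used here only as the simplest possible illustration (the slides do not prescribe a model). Under JC69, a true distance of d substitutions per site produces an expected proportion of observed differences p = 3/4 (1 - exp(-4d/3)), and the corrected distance is d = -3/4 ln(1 - 4p/3). The function names below are hypothetical.

```python
import math

def expected_p(d):
    """Proportion of differing sites expected under JC69 for a true
    distance of d substitutions per site."""
    return 0.75 * (1.0 - math.exp(-4.0 * d / 3.0))

def jc69_distance(p):
    """JC69-corrected distance from an observed proportion of differences p."""
    if not 0.0 <= p < 0.75:
        raise ValueError("JC69 distance is undefined for p >= 0.75 (saturation)")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

if __name__ == "__main__":
    # Observed differences level off near 0.75 while true distance keeps growing.
    for d in (0.1, 0.5, 1.0, 2.0, 5.0):
        print(f"true d = {d:4.1f}  expected observed p = {expected_p(d):.3f}")
```

Observed differences plateau at 0.75 however large the true distance becomes, so multiple substitutions at the same site are hidden. A method that ignores this underestimates long branches.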

Reconstruction biases: the systematic errors

Long Branch Attraction (LBA)

[Figure: Yang & Rannala 2012]
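Long branch attraction can be reproduced with a small simulation. The sketch below is an illustration under strong assumptions, not the figure from Yang & Rannala: characters are binary, change is symmetric, the true tree is ((A,B),(C,D)), and A and C sit on long branches. For four taxa, parsimony prefers the grouping supported by the most informative site patterns, so counting patterns is enough. The function names and the probabilities 0.4 and 0.05 are hypothetical choices.

```python
import random

def evolve(state, p_change, rng):
    """Flip a binary character with probability p_change along one branch."""
    return 1 - state if rng.random() < p_change else state

def simulate_site_patterns(p_long, p_short, n_sites=5000, seed=42):
    """Count parsimony-informative patterns simulated on the true tree
    ((A,B),(C,D)). A and C have long branches (p_long); B, D and the
    internal branch are short (p_short)."""
    rng = random.Random(seed)
    support = {"AB|CD": 0, "AC|BD": 0, "AD|BC": 0}
    for _ in range(n_sites):
        left = rng.randint(0, 1)            # ancestor of A and B
        right = evolve(left, p_short, rng)  # ancestor of C and D
        a = evolve(left, p_long, rng)
        b = evolve(left, p_short, rng)
        c = evolve(right, p_long, rng)
        d = evolve(right, p_short, rng)
        if a == b and c == d and a != c:
            support["AB|CD"] += 1           # supports the true grouping
        elif a == c and b == d and a != b:
            support["AC|BD"] += 1           # groups the two long branches
        elif a == d and b == c and a != b:
            support["AD|BC"] += 1
    return support

if __name__ == "__main__":
    print("long A and C:", simulate_site_patterns(0.4, 0.05))
    print("all short:   ", simulate_site_patterns(0.05, 0.05))
```

With two long branches, most informative sites support the wrong grouping AC|BD, and adding more sites only strengthens that support. This is what makes the error systematic rather than stochastic.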

Reconstruction biases: the systematic errors

Long Branch Attraction (LBA)

[Figure from Philippe et al. 2007: maximum parsimony (MP) tree, 11,959 amino-acid sites; labelled groups: Platyhelminthes, Tunicates]

Reconstruction biases: the systematic errors

Long Branch Attraction (LBA)

[Figure from Philippe et al. 2007: CAT mixture model]

Reconstruction biases: the systematic errors

Compositional biases & Saturation

[Figure from Jeffroy et al. 2006; panel labels: BI nt3, BI nt, BI nt12, BI AA]

Reconstruction biases: the systematic errors

Compositional biases & Saturation

16S rRNA (Blanquart & Lartillot 2006)

(A) Correct phylogeny.
(B) Classical reconstruction artifact in which mesophilic bacteria with similar GC content cluster in the tree.
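A first check for compositional bias is to compare GC content across taxa. A common remedy for nucleotide data is RY recoding, where purines (A, G) become R and pyrimidines (C, T/U) become Y. The sketch below is a minimal illustration; the taxon names and sequences are invented, and the helper names are hypothetical rather than part of any tool cited in these slides.

```python
def gc_content(seq):
    """Fraction of G and C among unambiguous nucleotides (gaps and N ignored)."""
    seq = seq.upper()
    total = sum(seq.count(base) for base in "ACGTU")
    if total == 0:
        return float("nan")
    return (seq.count("G") + seq.count("C")) / total

# Purines -> R, pyrimidines -> Y; other symbols (gaps, N) are left unchanged.
RY_TABLE = str.maketrans({"A": "R", "G": "R", "C": "Y", "T": "Y", "U": "Y"})

def ry_recode(seq):
    """Recode a nucleotide sequence into the two-state R/Y alphabet."""
    return seq.upper().translate(RY_TABLE)

if __name__ == "__main__":
    # Invented toy alignment, for illustration only.
    alignment = {
        "taxon_1": "GCGCGGCCGA-GC",
        "taxon_2": "GCGCAGCCGATGC",
        "taxon_3": "ATATAATTAATAT",
    }
    for name, seq in alignment.items():
        print(name, round(gc_content(seq), 2), ry_recode(seq))
```

Large differences in GC content between taxa suggest that a stationary model may be violated. RY recoding discards transitions, which carry most of the GC bias, at the cost of some signal.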

Reconstruction biases: biological errors

Gene phylogeny = species phylogeny

Is this hypothesis correct?

[Figure credit: Emmanuel Douzery]

Reconstruction biases: gene duplication

[Figure: Phol et al. 2009]


Reconstruction biases: gene duplication

3.6. Orthology and paralogy

[Figure sequence (credit: Guy Perrière); labels: Duplication, Speciation, Copy A, Copy B, True phylogeny, Reconstructed phylogeny]
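One simple way to tell duplication nodes from speciation nodes in a gene tree is the species-overlap heuristic: a node is called a duplication when its two child clades share at least one species. This heuristic is swapped in here for illustration and is not necessarily the approach behind the figures. The tree is a nested tuple, leaf names follow an assumed `species_copy` convention, and all names are hypothetical.

```python
def species_of(leaf):
    """Species part of a leaf named 'species_copy' (assumed naming convention)."""
    return leaf.split("_")[0]

def leaves(tree):
    """List the leaf names of a nested-tuple tree."""
    if isinstance(tree, str):
        return [tree]
    return [leaf for child in tree for leaf in leaves(child)]

def classify_nodes(tree):
    """Yield (sorted leaf names, event) for each internal node of a binary
    gene tree, children before parents."""
    if isinstance(tree, str):
        return
    left, right = tree
    yield from classify_nodes(left)
    yield from classify_nodes(right)
    species_left = {species_of(leaf) for leaf in leaves(left)}
    species_right = {species_of(leaf) for leaf in leaves(right)}
    # Shared species on both sides means both subtrees hold a copy: duplication.
    event = "duplication" if species_left & species_right else "speciation"
    yield sorted(leaves(tree)), event

if __name__ == "__main__":
    # Invented gene family: one duplication before three species diverged.
    gene_tree = ((("human_A", "mouse_A"), "chicken_A"),
                 (("human_B", "mouse_B"), "chicken_B"))
    for clade, event in classify_nodes(gene_tree):
        print(event, clade)
```

In this toy family, sampling human_A, chicken_A and mouse_B would give a gene tree that groups human with chicken, because the mouse sequence is a paralog of the other two. The reconstructed phylogeny would then differ from the true species phylogeny even if the gene tree itself were inferred without error.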

Reconstruction biases: horizontal gene transfer

[Figure: cover of MMBR, December 2009]

Reconstruction biases: horizontal gene transfer

• Horizontal gene transfer: transmission of genes between different taxa.

• The phenomenon is frequent among prokaryotes.

• It involves various mechanisms:
  – Transformation,
  – Conjugation,
  – Transduction.

An estimated 17.6% of E. coli genes are thought to have been acquired by transfer.

Reconstruction biases: horizontal gene transfer

[Figure from Calteau et al. 2004; labelled groups: Bacteria, Archaea]

Reconstruction biases: incomplete lineage sorting & ancestral polymorphism

[Figure sequence adapted from Leliaert et al. 2014, for three species (Sp. A, Sp. B, Sp. C): one panel where gene tree = species tree, one panel where gene tree ≠ species tree, and panels labelled with the groupings BC, AB and ABC]
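For three species, the standard coalescent result gives the chance that a gene tree matches the species tree as 1 - (2/3) exp(-T), where T is the length of the internal branch in coalescent units. For a diploid population, T = t / (2 Ne), with t in generations. The sketch below just evaluates this formula; the function name and the values of t and Ne are hypothetical.

```python
import math

def p_gene_tree_matches(t_generations, ne):
    """Probability that a three-species gene tree has the species-tree
    topology, given an internal branch of t_generations and a diploid
    effective population size ne."""
    branch_in_coalescent_units = t_generations / (2.0 * ne)
    return 1.0 - (2.0 / 3.0) * math.exp(-branch_in_coalescent_units)

if __name__ == "__main__":
    ne = 10_000  # illustrative effective population size
    for t in (0, 1_000, 20_000, 200_000):
        print(f"t = {t:>7} generations  P(match) = {p_gene_tree_matches(t, ne):.3f}")
```

When the internal branch is very short relative to Ne, the three topologies become almost equally likely (probability close to 1/3 each). This is the regime in which incomplete lineage sorting makes individual gene trees unreliable guides to the species tree.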

Reconstruction biases: hybridization & genome duplication

[Figure: Marcet-Houben & Gabaldón 2015]

Concluding remarks

• The phylogenetic signal is carried by orthologous sequences displaying synapomorphies characteristic of a shared evolutionary history. This signal increases with the number of analyzed characters.

• The non-phylogenetic signal can be due to:
  • Misaligned sequences,
  • Saturated sequences,
  • GC content biases,
  • Simplistic or inappropriate models of sequence evolution,
  • Comparison of paralogous, xenologous or ohnologous sequences,
  • Incomplete lineage sorting & ancestral polymorphism.

Concluding remarks

Strong phylogenetic signal, weak non-phylogenetic signal => correct topology
Weak phylogenetic signal, weak non-phylogenetic signal => results are not significant
Weak phylogenetic signal, strong non-phylogenetic signal => artifactual topology robustly inferred

Reconstruction biases

Advice:

In general:
✓ Use single-copy genes,
✓ Check for incongruences between individual gene trees and the species tree.

For deep-time phylogenies:
✓ Check for compositional biases (PhyloBayes) => recode the data,
✓ Check for homoplasy (PhyloBayes) => remove fast-evolving sites,
✓ Include more species (slowly evolving sequences, closely related outgroup).

For deep coalescence problems:
✓ Multispecies coalescent models (*BEAST).

For gene duplication and lateral gene transfer:
✓ Phylogenetic reconciliation (PHYLDOG).
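Checking a gene tree against the species tree can be sketched by comparing their clades. The example below assumes rooted trees written as nested tuples, single-copy genes, and leaves named after the species; these conventions and the function names are hypothetical, and dedicated software would normally be used on real data.

```python
def clades(tree):
    """Return the set of clades (frozensets of leaf names) of a rooted
    nested-tuple tree, including the clade containing all leaves."""
    found = set()

    def walk(node):
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(walk(child) for child in node))
        found.add(members)
        return members

    walk(tree)
    return found

def conflicting_clades(gene_tree, species_tree):
    """Clades present in the gene tree but absent from the species tree."""
    return clades(gene_tree) - clades(species_tree)

if __name__ == "__main__":
    species_tree = ((("A", "B"), "C"), "D")
    gene_trees = {
        "gene_1": ((("A", "B"), "C"), "D"),  # congruent
        "gene_2": ((("A", "C"), "B"), "D"),  # groups A with C
    }
    for name, tree in gene_trees.items():
        conflicts = [sorted(c) for c in conflicting_clades(tree, species_tree)]
        print(name, "conflicting clades:", conflicts)
```

A gene whose tree carries conflicting clades is a candidate for closer inspection (hidden paralogy, transfer, incomplete lineage sorting or reconstruction artifact) before it is included in a concatenated analysis.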