T-COFFEE, a novel method for Multiple Sequence Alignments

Cédric Notredame

Page 2

chite ---ADKPKRPLSAYMLWLNSARESIKRENPDFK-VTEVAKKGGELWRGLKD
wheat --DPNKPKRAPSAFFVFMGEFREEFKQKNPKNKSVAAVGKAAGERWKSLSE
trybr KKDSNAPKRAMTSFMFFSSDFRS----KHSDLS-IVEMSKAAGAAWKELGP
mouse -----KPKRPRSAYNIYVSESFQ----EAKDDS-AQGKLKLVNEAWKNLSP
            ***. ::: .: ..  .    :  . .      *  .  *: *

chite AATAKQNYIRALQEYERNGG-
wheat ANKLKGEYNKAIAAYNKGESA
trybr AEKDKERYKREM---------
mouse AKDDRIRYDNEMKSWEEQMAE
      *   : .* . :

Potential Uses of A Multiple Sequence Alignment?

Extrapolation

Motifs/Patterns

Phylogeny

Profiles

Structure Prediction

Multiple Alignments Are CENTRAL to MOST Bioinformatics Techniques.

Page 3

Why Is It Difficult To Compute A Multiple Sequence Alignment?

A CROSSROAD PROBLEM

BIOLOGY: What is A Good Alignment

COMPUTATION: What is THE Good Alignment

chite ---ADKPKRPLSAYMLWLNSARESIKRENPDFK-VTEVAKKGGELWRGLKD
wheat --DPNKPKRAPSAFFVFMGEFREEFKQKNPKNKSVAAVGKAAGERWKSLSE
trybr KKDSNAPKRAMTSFMFFSSDFRS----KHSDLS-IVEMSKAAGAAWKELGP
mouse -----KPKRPRSAYNIYVSESFQ----EAKDDS-AQGKLKLVNEAWKNLSP
            ***. ::: .: ..  .    :  . .      *  .  *: *

Page 4

Why Is It Difficult To Compute A Multiple Sequence Alignment?

BIOLOGY

CIRCULAR PROBLEM...

Good Sequences

Good Alignment

COMPUTATION

Page 5

Dynamic Programming Using A Substitution Matrix

Progressive Alignment
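
As a rough illustration of dynamic programming with a substitution matrix, the sketch below computes the score of the best global alignment of two sequences (the Needleman-Wunsch recurrence). The function names, the linear gap penalty, and `toy_score` (a stand-in for a real matrix such as BLOSUM or PAM) are assumptions made for this example; they are not taken from the slides.

```python
def global_alignment_score(a, b, score, gap=-4):
    """Score of the best global alignment of a and b (Needleman-Wunsch).

    score(x, y) plays the role of the substitution matrix;
    gap is a linear per-residue gap penalty (an assumption of this sketch).
    """
    # prev[j] = best score for aligning the current prefix of a with b[:j]
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap]  # a[:i] aligned against an empty prefix of b
        for j in range(1, len(b) + 1):
            cur.append(max(
                prev[j - 1] + score(a[i - 1], b[j - 1]),  # align two residues
                prev[j] + gap,                            # a[i-1] facing a gap
                cur[j - 1] + gap,                         # b[j-1] facing a gap
            ))
        prev = cur
    return prev[-1]


def toy_score(x, y):
    # Hypothetical scoring: +2 for identity, -1 otherwise.
    return 2 if x == y else -1
```

A progressive aligner applies the same recurrence repeatedly, first to pairs of sequences and then to groups of already aligned sequences, following a guide tree.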

Page 6

The T-Coffee Algorithm

Page 7

Progressive Alignment Principle and its Limitations…

Page 8

The Extended Library Principle…

Page 9

The Extended Library Principle…

Page 10

The Triplet Assumption

SEQ A

SEQ B

Page 11

Weighting And Extension

Extension=Using Information from Other Sequences

Weighting=Using The surrounding Information (Coffee)
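
The extension step can be sketched as follows, assuming the usual triplet rule: when residue x of one sequence and residue y of another are both paired with the same residue z of a third sequence, the pair (x, y) gains the smaller of the two weights, and contributions from all such z are summed. The data layout (residues as `(sequence_name, position)` tuples), the helper `pair`, and the toy weights are assumptions of this example, not the T-Coffee implementation.

```python
from collections import defaultdict


def pair(x, y):
    """Canonical (sorted) key for an unordered pair of residues."""
    return tuple(sorted((x, y)))


def extend_library(primary):
    """Return the extended library for a primary library.

    primary maps pair(residue, residue) -> weight, where a residue is a
    (sequence_name, position) tuple and keys are built with pair().
    """
    # For every residue, collect the residues it is paired with.
    neighbours = defaultdict(dict)
    for (x, y), w in primary.items():
        neighbours[x][y] = w
        neighbours[y][x] = w

    # Start from the primary weights, then add triplet contributions.
    extended = defaultdict(int, primary)
    for z, linked in neighbours.items():
        partners = sorted(linked.items())
        for a, (x, wx) in enumerate(partners):
            for y, wy in partners[a + 1:]:
                if x[0] != y[0]:  # only pairs between different sequences
                    extended[pair(x, y)] += min(wx, wy)
    return dict(extended)


# Toy primary library: one residue from each of three sequences.
primary = {
    pair(("A", 1), ("B", 1)): 88,
    pair(("A", 1), ("C", 1)): 77,
    pair(("B", 1), ("C", 1)): 100,
}
extended = extend_library(primary)
```

With these toy weights, the A/B pair keeps its primary weight of 88 and gains min(77, 100) = 77 through sequence C, for an extended weight of 165.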

Page 12

T-Coffee Progressive Alignment

Notredame, Higgins, Heringa, 2000

Dynamic Programming Using The extended Library

Page 13

Local Alignment Global Alignment

Extension

Multiple Sequence Alignment

Mixing Local and Global Alignments

Page 14

What is a library?

Extension+T-Coffee

Library Based Multiple Sequence Alignment

2
Seq1 MySeq
Seq2 MyotherSeq
#1 2
1 1 25
3 8 70
….

3
Seq1 anotherseq
Seq2 atsecondone
Seq3 athirdone
#1 2
1 1 25
#1 3
3 8 70
….
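
A minimal sketch of how such a library could be read, assuming the simplified layout sketched on this slide: a sequence count, one `name sequence` line per sequence, then `#i j` headers, each followed by `residue_i residue_j weight` lines. The function name is hypothetical, and the library files that T-Coffee itself reads and writes carry additional header and footer lines that this sketch does not handle.

```python
def parse_simple_library(text):
    """Parse the simplified library layout into (sequences, constraints).

    sequences   : list of (name, sequence) tuples
    constraints : list of (seq_i, res_i, seq_j, res_j, weight) tuples,
                  with 1-based sequence and residue indices
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    n_seq = int(lines[0])
    sequences = [tuple(ln.split()) for ln in lines[1:1 + n_seq]]

    constraints = []
    current = None  # (seq_i, seq_j) named by the most recent "#i j" header
    for ln in lines[1 + n_seq:]:
        if ln.startswith("#"):
            i, j = ln[1:].split()
            current = (int(i), int(j))
        else:
            res_i, res_j, weight = (int(tok) for tok in ln.split())
            constraints.append((current[0], res_i, current[1], res_j, weight))
    return sequences, constraints


example = """2
Seq1 MySeq
Seq2 MyotherSeq
#1 2
1 1 25
3 8 70
"""
sequences, constraints = parse_simple_library(example)
```

Each constraint says that a residue of one sequence should be aligned with a residue of another, with the given weight; libraries from different sources can be pooled because they all reduce to this form.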

Page 15

How Long Does It Take?

Page 16

Primary Lib: O(N²L²)

Extension: O(N³L²)

Tree: O(N²L²) + O(N³)

Aln: O(NL²)

Page 17

N times slower than ClustalW
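
The "N times" figure can be checked with a little arithmetic on the complexity terms, where N is the number of sequences and L their length. The sketch below drops all constant factors and assumes, for the sake of illustration, that the cost of ClustalW is dominated by its O(N²L²) pairwise stage; the function name is hypothetical.

```python
def tcoffee_cost_terms(n, l):
    """Leading-order operation counts for each stage (constants ignored)."""
    return {
        "primary_library": n**2 * l**2,
        "extension": n**3 * l**2,
        "tree": n**2 * l**2 + n**3,
        "alignment": n * l**2,
    }


terms = tcoffee_cost_terms(n=10, l=100)
# The extension dominates; relative to an N^2 L^2 pairwise stage the ratio is N.
ratio = terms["extension"] / terms["primary_library"]
```

For 10 sequences of length 100, the extension term is 10 times the pairwise term, which is where the factor of N comes from.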

Page 18

Validating T-Coffee

Page 19

What Is BaliBase?

BaliBase is a collection of reference Multiple Alignments.

The structures of the sequences are known and were used to assemble the multiple alignment (MALN).

Evaluation is carried out by comparing the structure-based reference alignment with its sequence-based counterpart.
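
One common way to make this comparison concrete is a sum-of-pairs style score: the fraction of residue pairs aligned in the reference that are also aligned in the test alignment. The sketch below is an illustration under that assumption, with hypothetical function names and toy alignments; it is not the scoring program distributed with BaliBase.

```python
def aligned_pairs(alignment):
    """Set of residue pairs placed in the same column.

    alignment maps sequence name -> gapped row ('-' for gaps), all rows
    having the same length. Residues are (name, index_in_ungapped_sequence).
    """
    names = sorted(alignment)
    counters = dict.fromkeys(names, 0)
    pairs = set()
    for col in range(len(alignment[names[0]])):
        present = []
        for name in names:
            if alignment[name][col] != "-":
                present.append((name, counters[name]))
                counters[name] += 1
        for a in range(len(present)):
            for b in range(a + 1, len(present)):
                pairs.add((present[a], present[b]))
    return pairs


def sum_of_pairs_score(test, reference):
    """Fraction of reference pairs recovered by the test alignment."""
    ref = aligned_pairs(reference)
    return len(ref & aligned_pairs(test)) / len(ref)


reference = {"a": "AC-G", "b": "ACTG"}
test = {"a": "ACG-", "b": "ACTG"}
score = sum_of_pairs_score(test, reference)
```

Here the test alignment recovers two of the three reference pairs. The score is undefined when the reference contains no aligned pairs, a case this sketch does not guard against.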

Page 20

BaliBase

DALI, Sap …

Method X

Comparison

Page 21

Validation Using BaliBase

T-Coffee Results

Page 22

Validation Using BaliBase

Page 23

Taking T-Coffee Further:

Using Structures

Page 24

Mixing Heterogeneous Information With T-Coffee

Local Alignment Global Alignment

Multiple Sequence Alignment

Multiple Alignment

Structural Specialist

Page 25

Running T-Coffee ONLINE

Page 26

WHERE ?

[email protected]

www.tcoffee.org

Page 27

The T-Coffee Server

Page 28

The T-Coffee Server

ES45, 4 Proc

1 Gb RAM

Page 29

Page 30

Future…

Page 31

Large Scale…

Page 32

Tailor Made…

Page 33

WHERE ?

[email protected]

www.tcoffee.org

Page 34

WHO?

WHO USES T-Coffee?

Dali Domain Dictionary
Pfam
SwissProt

WHO Makes T-Coffee?

Cédric Notredame
Des Higgins
Chantal Abergel
Olivier Poirot
Orla O’Sullivan

Page 35