3D-COFFEE Mixing Sequences and Structures Cédric Notredame

Page 1: 3D -COFFEE Mixing Sequences and Structures

3D-COFFEE Mixing Sequences and Structures

Cédric Notredame

Page 2: 3D -COFFEE Mixing Sequences and Structures

chite ---ADKPKRPLSAYMLWLNSARESIKRENPDFK-VTEVAKKGGELWRGLKD
wheat --DPNKPKRAPSAFFVFMGEFREEFKQKNPKNKSVAAVGKAAGERWKSLSE
trybr KKDSNAPKRAMTSFMFFSSDFRS----KHSDLS-IVEMSKAAGAAWKELGP
mouse -----KPKRPRSAYNIYVSESFQ----EAKDDS-AQGKLKLVNEAWKNLSP
      ***. ::: .: .. . : . . * . *: *

chite AATAKQNYIRALQEYERNGG-
wheat ANKLKGEYNKAIAAYNKGESA
trybr AEKDKERYKREM---------
mouse AKDDRIRYDNEMKSWEEQMAE
      * : .* . :

Potential Uses of A Multiple Sequence Alignment?

Extrapolation

Motifs/Patterns

Phylogeny

Profiles

Struc. Prediction

Multiple Alignments Are CENTRAL to MOST Bioinformatics Techniques.

Page 3: 3D -COFFEE Mixing Sequences and Structures

Why Is It Difficult To Compute A multiple Sequence Alignment?

A CROSSROAD PROBLEM

BIOLOGY: What is A Good Alignment

COMPUTATION: What is THE Good Alignment

chite ---ADKPKRPLSAYMLWLNSARESIKRENPDFK-VTEVAKKGGELWRGLKD
wheat --DPNKPKRAPSAFFVFMGEFREEFKQKNPKNKSVAAVGKAAGERWKSLSE
trybr KKDSNAPKRAMTSFMFFSSDFRS----KHSDLS-IVEMSKAAGAAWKELGP
mouse -----KPKRPRSAYNIYVSESFQ----EAKDDS-AQGKLKLVNEAWKNLSP
      ***. ::: .: .. . : . . * . *: *

Page 4: 3D -COFFEE Mixing Sequences and Structures

Why Is It Difficult To Compute A multiple Sequence Alignment?

BIOLOGY

CIRCULAR PROBLEM....

Good Sequences

Good Alignment

COMPUTATION

Page 5: 3D -COFFEE Mixing Sequences and Structures

The T-Coffee Algorithm

Page 6: 3D -COFFEE Mixing Sequences and Structures

Local Alignment Global Alignment

Extension

Multiple Sequence Alignment

Mixing Local and Global Alignments

Page 7: 3D -COFFEE Mixing Sequences and Structures

What is a library?

Extension+T-Coffee

Library Based Multiple Sequence Alignment

2
Seq1 MySeq
Seq2 MyotherSeq
#1 2
1 1 25
3 8 70
….

3
Seq1 anotherseq
Seq2 atsecondone
Seq3 athirdone
#1 2
1 1 25
#1 3
3 8 70
….
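The library layout on this slide can be read with a few lines of Python. This is a minimal sketch, assuming the simplified shape shown here: a sequence count, one `name sequence` line per sequence, then `#i j` headers naming a pair of sequences, each followed by `residue_i residue_j weight` lines. Real T-Coffee library files carry extra header fields that this sketch ignores, and `parse_library` is a hypothetical helper name.

```python
def parse_library(text):
    """Parse the simplified library layout into (sequences, constraints)."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    n_seq = int(lines[0])  # first line: number of sequences
    sequences = []
    for ln in lines[1:1 + n_seq]:
        name, seq = ln.split()
        sequences.append((name, seq))
    constraints = {}  # (seq_i, seq_j) -> list of (res_i, res_j, weight)
    pair = None
    for ln in lines[1 + n_seq:]:
        if ln.startswith("#"):
            i, j = ln[1:].split()
            pair = (int(i), int(j))
            constraints.setdefault(pair, [])
        else:
            fields = ln.split()
            if len(fields) != 3 or pair is None:
                continue  # skip anything that is not a weighted residue pair
            res_i, res_j, weight = map(int, fields)
            constraints[pair].append((res_i, res_j, weight))
    return sequences, constraints


example = """3
Seq1 anotherseq
Seq2 atsecondone
Seq3 athirdone
#1 2
1 1 25
#1 3
3 8 70
"""
sequences, constraints = parse_library(example)
```

Each entry reads as "residue 1 of Seq1 aligns with residue 1 of Seq2 with weight 25", which is the unit of evidence the rest of the method works with.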

Page 8: 3D -COFFEE Mixing Sequences and Structures

The Triplet Assumption

Diagram labels: X, Y, Z; SEQ A, SEQ B

Consistency Consensus
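The triplet idea can be sketched as a library extension step: a pair X-Y gains support every time a residue Z from a third sequence is aligned with both. The sketch below follows the published description of T-Coffee (direct weight plus, for each third residue, the minimum of the two connecting weights); the data layout and the name `extend_library` are assumptions made for illustration.

```python
from collections import defaultdict


def extend_library(primary):
    """primary maps ((seq_a, res_a), (seq_b, res_b)) -> weight."""
    # Symmetric neighbour table: residue -> {partner residue: weight}.
    neighbours = defaultdict(dict)
    for (x, y), w in primary.items():
        neighbours[x][y] = w
        neighbours[y][x] = w

    extended = defaultdict(int)
    # Direct evidence: keep the primary weight of every pair.
    for (x, y), w in primary.items():
        extended[tuple(sorted((x, y)))] += w

    # Indirect evidence: X-Z and Z-Y support X-Y with the weaker of the two weights.
    for z, partners in neighbours.items():
        items = sorted(partners.items())
        for a in range(len(items)):
            for b in range(a + 1, len(items)):
                (x, wx), (y, wy) = items[a], items[b]
                if len({x[0], y[0], z[0]}) < 3:
                    continue  # a triplet must span three different sequences
                extended[tuple(sorted((x, y)))] += min(wx, wy)
    return dict(extended)


# Illustrative weights only: residue 1 of A, B and C are mutually aligned.
primary = {
    (("A", 1), ("B", 1)): 88,
    (("A", 1), ("C", 1)): 77,
    (("C", 1), ("B", 1)): 77,
}
extended = extend_library(primary)
```

In this toy case the A-B pair ends with weight 88 + min(77, 77) = 165, because sequence C agrees with the direct alignment.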

Page 9: 3D -COFFEE Mixing Sequences and Structures

ClustalW T-Coffee

Page 10: 3D -COFFEE Mixing Sequences and Structures

Dynamic Programming Using An Extended Library

Progressive Alignment
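A pairwise version of this dynamic programming step can be sketched as follows. It assumes the library weights are non-negative and, as in the published T-Coffee description, uses no gap penalties: the alignment simply maximises the total library weight of the residue pairs it contains. The progressive stage aligns profiles as well as single sequences, which this sketch does not cover; `align_with_library` is a hypothetical name.

```python
def align_with_library(len_a, len_b, weight):
    """Return (score, pairs); weight maps 1-based (i, j) -> library weight."""
    # best[i][j]: best total weight using the first i residues of A and j of B.
    best = [[0] * (len_b + 1) for _ in range(len_a + 1)]
    for i in range(1, len_a + 1):
        for j in range(1, len_b + 1):
            best[i][j] = max(
                best[i - 1][j - 1] + weight.get((i, j), 0),  # align i with j
                best[i - 1][j],                              # leave i unaligned
                best[i][j - 1],                              # leave j unaligned
            )

    # Traceback, reporting only pairs that carry library support.
    pairs = []
    i, j = len_a, len_b
    while i > 0 and j > 0:
        w = weight.get((i, j), 0)
        if w > 0 and best[i][j] == best[i - 1][j - 1] + w:
            pairs.append((i, j))
            i -= 1
            j -= 1
        elif best[i][j] == best[i - 1][j]:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return best[len_a][len_b], pairs


# Illustrative weights: (2, 3) and (3, 2) conflict, the heavier one is kept.
score, pairs = align_with_library(3, 3, {(1, 1): 165, (2, 3): 50, (3, 2): 40})
```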

Page 11: 3D -COFFEE Mixing Sequences and Structures

What Is BaliBase

How Good is T-Coffee ???

Best Performing Method on MSA benchmark Datasets

BaliBase-Notredame-Sonnhammer

Ribosomal RNA-Katoh (Mafft)

Homstrad-Notredame

OxBench-Barton

Page 12: 3D -COFFEE Mixing Sequences and Structures

Mixing Heterogeneous Data With T-Coffee

Local Alignment

Global Alignment

Multiple Sequence Alignment

Multiple Alignment

Structural Specialist

Page 13: 3D -COFFEE Mixing Sequences and Structures

Mixing Sequences and Structures

Page 14: 3D -COFFEE Mixing Sequences and Structures

Why Do We Want To Mix Sequences and Structures?

1-Predicting Sequence Structures

STRUCTURE FUNCTION

Page 15: 3D -COFFEE Mixing Sequences and Structures

Why Do We Want To Mix Sequences and Structures?

•Sequences are Cheap and Common.

•Structures are Expensive and Rare.

Page 16: 3D -COFFEE Mixing Sequences and Structures

Why Do We Want To Mix Sequences and Structures?

Cheapest Structure determination:

Sequence-Structure Alignment

THREAD Or ALIGN

ADKPRRP---LS-YMLWLN
ADKPKRPKPRLSAYMLWLN

Page 17: 3D -COFFEE Mixing Sequences and Structures

Why Do We Want To Mix Sequences and Structures?

ADKPRRP---LS-YMLWLN
ADKPKRPKPRLSAYMLWLN

THREAD Or ALIGN

Convincing Alignment

Same Fold

Page 18: 3D -COFFEE Mixing Sequences and Structures

Why Do We Want To Mix Sequences and Structures?

Convincing Alignment

Same Fold

Distant sequences are hard to align

Page 19: 3D -COFFEE Mixing Sequences and Structures

Why Do We Want To Mix Sequences and Structures?

chite ---ADKPKRPLSAYMLWLNSARESIKRENPDFK-VTEVAKKGGELWRGLKD
wheat --DPNKPKRAPSAFFVFMGEFREEFKQKNPKNKSVAAVGKAAGERWKSLSE
trybr KKDSNAPKRAMTSFMFFSSDFRS----KHSDLS-IVEMSKAAGAAWKELGP
mouse -----KPKRPRSAYNIYVSESFQ----EAKDDS-AQGKLKLVNEAWKNLSP
      ***. ::: .: .. . : . . * . *: *

Multiple Sequence Alignments Help

Exploring the Twilight Zone

Page 20: 3D -COFFEE Mixing Sequences and Structures

Why Do We Want To Mix Sequences and Structures?

1-Predicting Sequence Structures

2-Produce Better Alignments

Page 21: 3D -COFFEE Mixing Sequences and Structures

Why Do We Want To Mix Sequences and Structures?

ADKPRRP---LS-YMLWLN
ADKPKRPKPRLSAYMLWLN

ALIGN

Unreliable alignment if %ID <30%
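The %ID threshold can be checked directly on a pairwise alignment. A minimal sketch, assuming identity is measured over the columns where both sequences have a residue (other definitions divide by the length of the shorter sequence or by the full alignment length); `percent_identity` is a hypothetical helper.

```python
def percent_identity(row_a, row_b):
    """Percent identity over gap-free columns of two aligned rows."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    # Keep only columns where neither sequence has a gap.
    aligned = [(a, b) for a, b in zip(row_a, row_b) if a != "-" and b != "-"]
    if not aligned:
        return 0.0
    matches = sum(1 for a, b in aligned if a == b)
    return 100.0 * matches / len(aligned)


pid = percent_identity("ADKPRRP---LS-YMLWLN", "ADKPKRPKPRLSAYMLWLN")
reliable = pid >= 30.0  # the 30% rule of thumb quoted on the slide
```

The toy pair used on the slide has 14 identical residues over 15 gap-free columns, so it sits well above the threshold; pairs in the twilight zone fall below it.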

Page 22: 3D -COFFEE Mixing Sequences and Structures

Why Do We Want To Mix Sequences and Structures?

Alignment Insensitive to %ID

ADKPRRP---LS-YMLWLN
ADKPKRPKPRLSAYMLWLN

Struc. Superposition

Folds evolve Slower than Sequences

Page 23: 3D -COFFEE Mixing Sequences and Structures

Why Do We Want To Mix Sequences and Structures?

Page 24: 3D -COFFEE Mixing Sequences and Structures

Why Do We Want To Mix Sequences and Structures?

Structure Superposition

Page 25: 3D -COFFEE Mixing Sequences and Structures

Why Do We Want To Mix Sequences and Structures?

1-Predicting Sequence Structures

2-Produce Better Alignments

Page 26: 3D -COFFEE Mixing Sequences and Structures

How To Mix Sequences and Structures

Page 27: 3D -COFFEE Mixing Sequences and Structures

Mixing Heterogeneous Data With T-Coffee

Local Alignment

Global Alignment

Multiple Sequence Alignment

Multiple Alignment

Structural Specialist

Page 28: 3D -COFFEE Mixing Sequences and Structures

Mixing Sequences and Structures with T-Coffee

Struct Vs Struct: Superpose

Seq Vs Struct: Thread

Seq Vs Seq: Local, Global

Evaluation on Homstrad

Page 29: 3D -COFFEE Mixing Sequences and Structures

The 3D-Coffee Libraries

Methods

•Global: Needleman and Wunsch

•Local: Sim (lalign)

•Threading: Fugue

•Superposition: SAP
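Once each method has produced its own list of weighted residue pairs, the lists have to be pooled. The sketch below sums the weights of pairs reported by more than one method, which matches the published description of T-Coffee's primary library; whether 3D-Coffee gives every method the same say is not stated on the slide, so treat that as an assumption. The weights and the name `merge_libraries` are illustrative only.

```python
def merge_libraries(*libraries):
    """Pool several {(residue_pair): weight} libraries into one."""
    merged = {}
    for library in libraries:
        for pair, weight in library.items():
            # A pair seen by several methods accumulates their weights.
            merged[pair] = merged.get(pair, 0) + weight
    return merged


# Hypothetical output of the four method families for sequences A and B.
global_lib = {(("A", 1), ("B", 1)): 40, (("A", 2), ("B", 2)): 40}
local_lib = {(("A", 2), ("B", 2)): 60}
threading_lib = {(("A", 2), ("B", 3)): 55}
superposition_lib = {(("A", 2), ("B", 3)): 90}

primary = merge_libraries(global_lib, local_lib, threading_lib, superposition_lib)
```

Here the two structure-aware methods agree on aligning A2 with B3 and together outweigh the sequence-only evidence for A2 with B2, which is the kind of correction structural information is meant to bring.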

Page 30: 3D -COFFEE Mixing Sequences and Structures

•Threading: Fugue

Page 31: 3D -COFFEE Mixing Sequences and Structures

Fugue

•Threading: Fugue

Page 32: 3D -COFFEE Mixing Sequences and Structures

Fugue

•Threading: Fugue

1-Turn Sequence into a profile:
-lower penalties in loops
-Structure specific matrix

2-Align Profile with Sequence
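The effect of lower gap penalties in loops can be illustrated with a small global alignment whose gap cost depends on the structural position. This is not Fugue's scoring scheme: the match/mismatch values stand in for its structure-specific matrices, and every parameter and name below is a hypothetical choice made for the sketch.

```python
def align_score(seq, struct_seq, struct_ss, match=2, mismatch=-1,
                gap_core=-4, gap_loop=-1):
    """Global alignment score of seq against a structure-annotated sequence.

    struct_ss has one character per structure residue: 'L' marks a loop,
    any other character marks a core position (helix or strand).
    """
    def gap(k):
        # Gap penalty charged at structure position k (0-based).
        return gap_loop if struct_ss[k] == "L" else gap_core

    n, m = len(seq), len(struct_seq)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for j in range(1, m + 1):
        score[0][j] = score[0][j - 1] + gap(j - 1)
    for i in range(1, n + 1):
        score[i][0] = score[i - 1][0] + gap(0)
        for j in range(1, m + 1):
            s = match if seq[i - 1] == struct_seq[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + s,         # align the two residues
                score[i - 1][j] + gap(j - 1),    # extra residue in seq
                score[i][j - 1] + gap(j - 1),    # structure residue left unmatched
            )
    return score[n][m]


# The same deletion costs less when it falls in a loop than in the core.
in_loop = align_score("ACDE", "ACGDE", "HHLHH")
in_core = align_score("ACDE", "ACGDE", "HHHHH")
```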

Page 33: 3D -COFFEE Mixing Sequences and Structures

Evaluating Fugue

•Threading: Fugue

1-Select 967 pairs of sequences in HOMSTRAD

2-Align each pair with T-Coffee and Fugue.

3-Compare the Two Alignments

FUGUE    T-Coffee

Compare

Page 34: 3D -COFFEE Mixing Sequences and Structures

Fugue

•Threading: Fugue

1-Select 967 pairs of sequences in HOMSTRAD

2-Align each pair with T-Coffee and Fugue.

3-Compare the Two Alignments

Plot regions: TCdef wins, Fugue wins

TCdef: 58.81%

Fugue: 61.81%
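The slides do not say which measure produced these percentages. One common choice, assumed here, is the fraction of residue pairs in the reference alignment that the tested alignment reproduces. Both function names are hypothetical.

```python
def aligned_pairs(row_a, row_b):
    """Set of (i, j) residue index pairs aligned in a pairwise alignment."""
    pairs = set()
    i = j = 0  # running residue counters, gaps do not advance them
    for a, b in zip(row_a, row_b):
        if a != "-":
            i += 1
        if b != "-":
            j += 1
        if a != "-" and b != "-":
            pairs.add((i, j))
    return pairs


def pair_accuracy(test, reference):
    """Percentage of reference residue pairs also present in the test alignment."""
    ref = aligned_pairs(*reference)
    if not ref:
        return 0.0
    return 100.0 * len(ref & aligned_pairs(*test)) / len(ref)


reference = ("ACD-E", "ACDFE")  # stand-in for a structure-based reference
test = ("ACDE-", "ACDFE")       # stand-in for a method's output
accuracy = pair_accuracy(test, reference)
```

Averaging such a score over all selected pairs gives one figure per method, which is how two methods can be ranked on the same benchmark.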

Page 35: 3D -COFFEE Mixing Sequences and Structures

Superposition:

SAP

Page 36: 3D -COFFEE Mixing Sequences and Structures

•Superposition: SAP

Page 37: 3D -COFFEE Mixing Sequences and Structures

•Superposition: SAP

1-High Level Dynamic Programming

Substitution Matrix when doing regular Alignments

2-Low Level DP. Forcing the alignment of two residues

Page 38: 3D -COFFEE Mixing Sequences and Structures

•Superposition: SAP

1-High Level Dynamic Programming

2-Low Level DP. Forcing the alignment of two residues

3-Rigid Body Superposition

RMSD
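RMSD is the figure of merit reported after a rigid body superposition. The sketch below only computes the RMSD between two coordinate sets that are assumed to be already superposed and listed in equivalent order; finding the optimal rotation and translation is a separate step that is not shown. The coordinates are made up.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equal-length lists of (x, y, z)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared distance between equivalent atoms.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Two toy traces, the second shifted by 1 unit along z.
trace_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
trace_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
value = rmsd(trace_a, trace_b)
```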

Page 39: 3D -COFFEE Mixing Sequences and Structures

•Superposition: SAP

1-High Level Dynamic Programming

2-Low Level DP. Forcing the alignment of two residues

3-Rigid Body Superposition

RMSD

Page 40: 3D -COFFEE Mixing Sequences and Structures

•Superposition: SAP

1-High Level Dynamic Programming

2-Low Level DP. Evaluate Every Pair

3-Rigid Body Superposition

Page 41: 3D -COFFEE Mixing Sequences and Structures

•Superposition: SAP

1-High Level Dynamic Programming

Structure Based Sequence Alignment

Make a DP on the accumulated traces

Use Traces like a Substitution Matrix

Page 42: 3D -COFFEE Mixing Sequences and Structures

•Superposition: SAP

SAP    T-Coffee

Compare

1-Select 967 pairs of sequences in HOMSTRAD

2-Align each pair with T-Coffee and SAP.

3-Compare the Two Alignments

Page 43: 3D -COFFEE Mixing Sequences and Structures

1-Select 967 pairs of sequences in HOMSTRAD

2-Align each pair with T-Coffee and SAP.

3-Compare the Two Alignments

•Superposition: SAP

TCdef: 58.81%

SAP: 86.31%

Page 44: 3D -COFFEE Mixing Sequences and Structures

•Fugue

TCdef: 58.81%

Fugue: 61.81%

•SAP

TCdef: 58.81%

SAP: 86.31%

Page 45: 3D -COFFEE Mixing Sequences and Structures

Sequences and Structures:

How Good is The Mixture ???

Page 46: 3D -COFFEE Mixing Sequences and Structures

Our Benchmark: HOM39

-HOMSTRAD: Structure based MSAs that can be used as References.

-COMPACT and DEMANDING

-HOM39: The 39 Most difficult datasets (percent ID lower than 25).

Page 47: 3D -COFFEE Mixing Sequences and Structures

Our Benchmark: Using HOM39

BENCHMARKING Strategy:

-Re-align HOM39 without using ALL the structures

-Compare the result with the reference

Page 48: 3D -COFFEE Mixing Sequences and Structures

Evaluating 3D-Coffee

1- Can a SINGLE structure Help ?

Page 49: 3D -COFFEE Mixing Sequences and Structures

Using ONE structure with 3D-Coffee

HOM39 with ONE Structure per MSA

Seq Vs Struct: Thread

Seq Vs Seq: Local, Global

Evaluation on HOM39

Page 50: 3D -COFFEE Mixing Sequences and Structures
Page 51: 3D -COFFEE Mixing Sequences and Structures
Page 52: 3D -COFFEE Mixing Sequences and Structures
Page 53: 3D -COFFEE Mixing Sequences and Structures

Evaluating 3D-Coffee

1- Can a SINGLE structure Help ?

2- Does it benefit ALL the Sequences ?

Is EVERYONE Happier if there is a STAR in the team…

Page 54: 3D -COFFEE Mixing Sequences and Structures

BaliBase

HOM39 TC-Fugue

+

Remove Provided Structure(s)

Comparison

Page 56: 3D -COFFEE Mixing Sequences and Structures

Evaluating 3D-Coffee

1- Can a SINGLE structure Help ?

2- Does it benefit all the sequences ?

3- Can We Use Two or More Structures ?

Page 57: 3D -COFFEE Mixing Sequences and Structures

Mixing Sequences and Structures with 3D-Coffee

HOM39 with TWO Structures/MSA

Struct Vs Struct: SAP, LSQ

Seq Vs Struct: Fugue

Seq Vs Seq: Local, Global

Evaluation on Homstrad

Page 58: 3D -COFFEE Mixing Sequences and Structures

Indirect Improvement

Direct Improvement

Page 60: 3D -COFFEE Mixing Sequences and Structures

Evaluating 3D-Coffee

1- Can a SINGLE structure Help ?

2- Does it benefit all the sequences ?

3- Can we use Two Structures ?

4- Relation Accuracy/ N-structures ???

Page 61: 3D -COFFEE Mixing Sequences and Structures

Mixing Sequences and Structures with T-Coffee

HOM39 with 1-N Structures per MSA

Struct Vs Struct: SAP

Seq Vs Struct: Fugue

Seq Vs Seq: Local, Global

Evaluation on Homstrad

Page 63: 3D -COFFEE Mixing Sequences and Structures

Induced Improvement

Page 64: 3D -COFFEE Mixing Sequences and Structures

Conclusion

Page 65: 3D -COFFEE Mixing Sequences and Structures

-Structures Help

BUT NOT SO MUCH

Page 66: 3D -COFFEE Mixing Sequences and Structures

The More Structures The Merrier

Page 67: 3D -COFFEE Mixing Sequences and Structures

The More Structures The Merrier

Page 68: 3D -COFFEE Mixing Sequences and Structures

Credits

Orla O’Sullivan: University College, Cork, Ireland

Des Higgins: University College, Cork, Ireland

Karsten Suhre: IGS-CNRS, Marseille, France

Page 69: 3D -COFFEE Mixing Sequences and Structures

Conclusion

The program is available on request from:

[email protected]
