IPANEMAP Integrative Probing Analysis of Nucleic Acids Empowered by Multiple Accessibility Profiles Afaf SAAIDI & Yann PONTY Bruno SARGUEIL & Delphine ALLOUCHE CNRS/Ecole Polytechnique Univ. Paris Descartes

Page 1

IPANEMAP

Integrative Probing Analysis of Nucleic Acids

Empowered by Multiple Accessibility Profiles

Afaf SAAIDI & Yann PONTY (CNRS/Ecole Polytechnique)

Bruno SARGUEIL & Delphine ALLOUCHE (Univ. Paris Descartes)

Page 2

Structure probing in a nutshell

Page 6

The SHAPE wars: 100% accuracy and beyond…

Now, wait a minute guys…!

David H Mathews

Spasic et al, NAR 2018

Page 7

The practice of RNA modeling

Probing is tricky to perform and interpret.

Beyond RNA-Puzzles, modelers usually try different techniques and reagents, but integration is no gimme…

Bruno Sargueil

How to integrate multiple probing profiles?

Page 8

A KISS approach

Reactivity profiles uniformly captured as pseudo-potentials in prediction methods (RNAsubopt + soft constraints)

Native structure should be represented in each of the pseudo-Boltzmann ensembles (maybe not at the top)

[Figure: pseudo-Boltzmann sampling (RNAsubopt)]

ACGAUGAUCGACUACGAUCGA

UCGACUAGCUACGUACUGACU

CGGCUAGAUUAGCUUAUGA…
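
To make the notion of a pseudo-potential concrete, here is a minimal sketch of one way a reactivity profile could be turned into per-nucleotide energy terms. It assumes a log-linear conversion of the form m·ln(r + 1) + b, in the spirit of Deigan et al.; the slides do not say which conversion IPANEMAP uses. The function name, the default slope and intercept, and the treatment of missing values are illustrative assumptions.

```python
import math
from typing import List, Optional


def pseudo_energies(reactivities: List[Optional[float]],
                    m: float = 1.8, b: float = -0.6) -> List[float]:
    """Convert reactivities to per-nucleotide pseudo-energies (kcal/mol).

    Assumed log-linear form: m * ln(r + 1) + b.
    Missing (None) or negative reactivities contribute no bonus or penalty.
    """
    energies = []
    for r in reactivities:
        if r is None or r < 0:
            energies.append(0.0)  # no usable data: leave the energy model untouched
        else:
            energies.append(m * math.log(r + 1.0) + b)
    return energies


profile = [0.0, 0.1, 1.5, None, 0.8]  # made-up reactivities, for illustration
print([round(e, 2) for e in pseudo_energies(profile)])
```

Under this assumed form, low reactivities yield a negative term (a bonus when the nucleotide is paired) and high reactivities a positive one (a penalty), which is what lets a sampler such as RNAsubopt be biased by the probing data without forbidding any structure outright.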

Page 12

IPANEMAP Pipeline

Page 21

Single condition/single structure dataset

IPANEMAP vs Rsample [Spasic et al, NAR 2018]

Hajdin et al dataset, SHAPE 1M7, RNAs of 50-500 nts

Geometric mean (GM): 80% for IPANEMAP vs 63% for Rsample

Reason: Rsample ≈ MEA (maximum expected accuracy)
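
The slide does not define the geometric mean it reports. A minimal sketch follows, assuming it is the geometric mean of sensitivity and positive predictive value (PPV) computed over base pairs, a convention often used when comparing predicted and reference secondary structures. The function names and the toy structures are illustrative, not taken from the benchmark.

```python
import math
from typing import Set, Tuple


def base_pairs(dot_bracket: str) -> Set[Tuple[int, int]]:
    """Return the set of (i, j) pairs of a pseudoknot-free dot-bracket string."""
    stack, pairs = [], set()
    for j, symbol in enumerate(dot_bracket):
        if symbol == "(":
            stack.append(j)
        elif symbol == ")":
            pairs.add((stack.pop(), j))
    return pairs


def geometric_mean_accuracy(predicted: str, reference: str) -> float:
    """Geometric mean of sensitivity and PPV over base pairs (assumed metric)."""
    pred, ref = base_pairs(predicted), base_pairs(reference)
    true_pos = len(pred & ref)
    if true_pos == 0:
        return 0.0
    sensitivity = true_pos / len(ref)   # fraction of reference pairs recovered
    ppv = true_pos / len(pred)          # fraction of predicted pairs that are correct
    return math.sqrt(sensitivity * ppv)


reference = "((((...))))"   # toy structures, not from the benchmark
predicted = "(((.....)))"
print(f"GM = {geometric_mean_accuracy(predicted, reference):.3f}")
```

In the toy case the prediction recovers three of the four reference pairs and predicts no wrong pair, so sensitivity is 0.75 and PPV is 1.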

Page 22

Multiple conditions improve predictions?

6 RMDB RNAs:

5S rRNA (E. coli)

Glycine riboswitch (F. nucleatum)

c-di-GMP riboswitch (V. cholerae)

P4-P6 domain (Tetrahymena ribozyme)

add adenine riboswitch

tRNA phenylalanine (yeast)

3 conditions:

SHAPE (NMIA)

DMS

CMCT

Page 30

Back to the lab for even more data

Lariat Capping Ribozyme (PDB 4P8Z) from Didymium iridis

Probed under 14 different conditions:

Techniques: SHAPE, DMS, CMCT

Reagents: 1M7, NMIA, DMS, CMCT, NAI, BzCN

Stop-based and mutation-based readouts

Presence/absence of magnesium

It has a pseudoknot

Page 31

Mono probing analysis

Different performances

No clear factor

Some outliers (NAI & CMCTMg)

[Figure: MCC]
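
MCC here is the Matthews correlation coefficient between predicted and reference base pairs. The sketch below shows one common way to compute it, assuming every position pair i < j counts as a candidate so that true negatives are all candidates that are neither predicted nor in the reference. The slides do not state which counting convention was used; the function name and the toy pair sets are illustrative.

```python
import math
from typing import Set, Tuple

Pair = Tuple[int, int]


def mcc(predicted: Set[Pair], reference: Set[Pair], length: int) -> float:
    """Matthews correlation coefficient over all candidate pairs i < j."""
    tp = len(predicted & reference)          # pairs found in both
    fp = len(predicted - reference)          # predicted but not in reference
    fn = len(reference - predicted)          # reference pairs that were missed
    tn = length * (length - 1) // 2 - tp - fp - fn
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy example on an 11-nt sequence (0-based positions), not real data
reference_pairs = {(0, 10), (1, 9), (2, 8), (3, 7)}
predicted_pairs = {(0, 10), (1, 9), (2, 8)}
print(f"MCC = {mcc(predicted_pairs, reference_pairs, 11):.3f}")
```

Because true negatives vastly outnumber the other classes for any realistic RNA length, this quantity stays close to the geometric mean of sensitivity and PPV.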

Page 32

Reproducibility

Page 37

More mono probing analysis

(Squared) Euclidean distance between conditions

14 conditions + no-probing (-)

Conditions cluster in 8 groups

5 singletons

Outliers confirmed: NAI & CMCTMg
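
The distance computation can be sketched as follows, assuming each condition is summarized by a fixed-length numeric vector (for instance a reactivity profile or a vector of base-pair probabilities). The slides do not specify which representation is compared, so the vectors and the condition labels below are placeholders.

```python
from itertools import combinations
from typing import Dict, List, Tuple


def squared_euclidean(u: List[float], v: List[float]) -> float:
    """Sum of squared coordinate differences between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum((a - b) ** 2 for a, b in zip(u, v))


def pairwise_distances(profiles: Dict[str, List[float]]) -> Dict[Tuple[str, str], float]:
    """Squared Euclidean distance for every unordered pair of conditions."""
    return {(p, q): squared_euclidean(profiles[p], profiles[q])
            for p, q in combinations(sorted(profiles), 2)}


# Made-up vectors; the labels are hypothetical placeholders
profiles = {
    "condA": [0.1, 0.9, 0.8, 0.0],
    "condB": [0.2, 0.8, 0.9, 0.1],
    "condC": [0.9, 0.1, 0.0, 0.7],
}
for pair, dist in sorted(pairwise_distances(profiles).items(), key=lambda kv: kv[1]):
    print(pair, round(dist, 2))
```

A distance table of this kind is what a clustering step would consume: conditions whose vectors are close end up in the same group, while a condition far from all others forms a singleton.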

Page 40

Dual probing analysis

Pairs of conditions barely improve predictions (+0.2% MCC), but they:

Mitigate the risk of poor conditions: +5% MCC on average against worst condition

Slightly increase expectation: +0.2% against mean MCC, +1% w/o NAI

Remain sensitive to contamination: -4.4% against best, -1.5% w/o NAI & CMCTMg

Page 45

Triplets of conditions improve predictions

Considering triplets of conditions (364 runs):

Improves MCC by 10% over worst MCC

Improves by 2.6% over mean MCC on avg (+3.4% median)

Degrades by 4% relative to best MCC of triplet (2% median)

Some triplets improve overall best MCCs:

1M7ILUMg + NMIAMgCE + 1M7ILU3

1M7ILUMg + 1M7 + BzCNMg

1M7ILUMg + 1M7ILU3 + 1M7

Beyond three conditions?

→ 85.3% MCC
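
The 364 runs quoted on this slide match the number of 3-element subsets of 14 conditions, C(14, 3) = 364. Assuming the runs were enumerated that way (the slide does not say so explicitly), the subsets can be generated as below; the condition labels are hypothetical placeholders.

```python
from itertools import combinations
from math import comb

conditions = [f"cond{i:02d}" for i in range(1, 15)]  # 14 hypothetical labels

pairs = list(combinations(conditions, 2))     # dual probing runs
triplets = list(combinations(conditions, 3))  # triplet probing runs

# The enumeration agrees with the closed-form binomial coefficients
assert len(pairs) == comb(14, 2)
assert len(triplets) == comb(14, 3)
print(len(pairs), len(triplets))  # prints: 91 364
```

The count grows quickly with subset size (1001 subsets of size four), which is one practical reason to select one representative per cluster rather than enumerate every larger combination.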

Page 49

The full monty: Multi probing

All 14 conditions: 80% MCC (vs 73% avg)

Within clustered conditions: limited info/improvement

Across clusters (one condition for each cluster):

79% MCC w/o NAI

74% MCC with NAI

But best (86.5% MCC) has NAI

Orthogonality? Complementarity?

Page 50

Preliminary: Treating mutants as conditions

Mutate-And-Map profiles produced by Das lab

Systematic single-point mutants, usually similar structure

Page 63

Conclusion

IPANEMAP is way more than just a cool acronym

Exploits the complementarity of probing experiments

Paves the way for quantifying a notion of orthogonality

Take home message:

Probing remains challenging (accuracy still slightly below 120%)

Having >2 reactivity profiles (even bad ones) appears to help

Future needs:

Better pseudo-potentials

Better clustering procedures (beyond majority rule)

Tests on multi-stable RNAs

Optimal/iterative design of probing experiments

More data to build a mechanistic understanding of SHAPE

Page 64

Thank you!

Afaf Saaidi

Postdoc?

Bruno Sargueil Delphine Allouche

Ronny (da main man) for modular extensions to Vienna package

Soft constraints x Sampling = Awesomeness

Supported by Fondation pour la Recherche Médicale