bioexcel.eu
PRODIGY, a web server to predict binding affinities
in protein-protein complexes
Presenter: Anna Vangone
Host: Adam Carter
BioExcel Educational Webinar Series #7
12 October, 2016
This webinar is being recorded
BioExcel Overview
• Excellence in Biomolecular Software
- Improve the performance, efficiency and scalability of key codes
• Excellence in Usability
- Devise efficient workflow environments with associated data integration
• Excellence in Consultancy and Training
- Promote best practices and train end users
[Figure: workflow architecture diagram. Labels: Domain Expert, DMI Expert, DADC Engineer, Portal / Workbench, DMI Request, DMI Gateway, DMI Planner, DMI Optimiser, DMI Validator, DMI Enactor, DMI Executor, DMI Monitor, Repository, Registry, Data Source, Data Delivery Point; arrows for service invocation, data flow and monitoring flow]
Interest Groups
• Integrative Modeling IG
• Free Energy Calculations IG
• Best practices for performance tuning IG
• Hybrid methods for biomolecular systems IG
• Biomolecular simulations entry level users IG
• Practical applications for industry IG
• Training
• Workflows
Support platforms: forums, code repositories, chat channel, video channel
http://bioexcel.eu/contact
Today’s Presenter
Anna Vangone studied Chemistry at the University of Salerno
(Italy) and received her MSc cum laude in 2009. She then
joined the doctoral program in bioinformatics at the University of
Salerno and worked as a visiting PhD student at the University of
Oxford (UK), obtaining her PhD in 2013.
After a few visits to King Abdullah University of Science and
Technology (Saudi Arabia) as an invited scientist, she joined
the computational structural biology group at Utrecht
University as a postdoctoral researcher. In 2015 she obtained a
prestigious Marie Curie individual fellowship, which has
supported her research at Utrecht University since October
2015.
Her research focuses on protein-protein interactions and on the
characterization and analysis of biological complexes by
computational modeling, docking in particular. Her work has
resulted in 15 peer-reviewed publications and several
international conference invitations.
Structural Biology Group www.bonvinlab.org
Anna Vangone
Computational Structural Biology group
Utrecht University
PRODIGY
A web-server to predict binding affinity
in protein-protein complexes
1. INTRODUCTION
- Binding affinity
- Computational prediction methods
2. THE METHOD
3. RESULTS
4. CONCLUSION
Protein-protein interactions
• DNA replication
• Immune response
• Signaling cascade
• … many more
Protein-protein interactions: binding energy
[Figure: energy versus conformational space, showing ΔG between the free proteins and the bound complex]
$\Delta G_{\mathrm{bind}} = G_{\mathrm{bound}} - G_{\mathrm{free}} = RT \ln K$, with $K$ the dissociation constant (Kd)
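The relation between binding free energy and the equilibrium constant can be sketched numerically. The snippet below is a minimal illustration, assuming K is the dissociation constant Kd in mol/L, a temperature of 298.15 K, and the gas constant in kcal mol⁻¹ K⁻¹; the function names are hypothetical and not part of PRODIGY.

```python
import math

# Gas constant in kcal mol^-1 K^-1 (1.9872 cal mol^-1 K^-1)
R_KCAL = 0.0019872


def delta_g_from_kd(kd_molar: float, temp_kelvin: float = 298.15) -> float:
    """Binding free energy (kcal/mol) from a dissociation constant in mol/L.

    Implements dG = RT ln(Kd); the 298.15 K default is an assumption.
    """
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_KCAL * temp_kelvin * math.log(kd_molar)


def kd_from_delta_g(delta_g: float, temp_kelvin: float = 298.15) -> float:
    """Inverse relation: Kd (mol/L) from a binding free energy in kcal/mol."""
    return math.exp(delta_g / (R_KCAL * temp_kelvin))


if __name__ == "__main__":
    for kd in (1e-6, 1e-9, 1e-12):
        print(f"Kd = {kd:.0e} M -> dG = {delta_g_from_kd(kd):6.2f} kcal/mol")
```

A tighter complex (smaller Kd) gives a more negative ΔG, which is the sign convention used throughout these slides.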
EXACT METHODS
• MD direct counting
• Umbrella Sampling
EMPIRICAL SCORING FUNCTIONS
Combination of energetic terms (En):
• Electrostatic
• Van der Waals
• Solvent exclusion
• ……
Inaccurate/fast
STRUCTURAL PROPERTIES
$\Delta G = f(\mathrm{BSA})$
BSA: Chothia & Janin, Nature (1975); Horton & Lewis (1992)
NIS: Kastritis et al., J Mol Biol (2014)
Structural properties
• Buried Surface Area (BSA): $\Delta G = f(\mathrm{BSA})$
• Non Interacting Surface (NIS): $\Delta G = f(\mathrm{BSA}, \mathrm{NIS})$
BSA: Chothia & Janin, Nature (1975); Horton & Lewis (1992)
NIS: Kastritis et al., J Mol Biol (2014)
2. THE METHOD
Contacts at the interface
Interfacial contacts (ICs): number of residue pairs within a distance cut-off (5.5 Å)
Classification of residues based on their physico-chemical properties:
• P1 is #ICs between charged-polar residues
• P2 is #ICs between polar-apolar residues
• ……
[Table: ICs total | ICs property P1 | ICs property P2 | … | r, with example counts 6 | 2 | 4 | …]
Performance: reported as Pearson's Correlation Coefficient (r)
Vangone and Bonvin, eLife (2015)
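Counting interfacial contacts by residue class can be sketched as follows. This is a minimal illustration, not the PRODIGY implementation: it assumes two residues are in contact when any pair of their atoms lies within 5.5 Å, and the three-class grouping of the 20 amino acids in `RESIDUE_CLASS` is an assumption made for the example. The data layout and function names are hypothetical.

```python
import math
from collections import Counter
from itertools import product

CUTOFF = 5.5  # Å, the distance cut-off quoted on the slide
ORDER = ("charged", "polar", "apolar")

# ASSUMPTION: this three-class grouping is illustrative, not taken from the slides.
RESIDUE_CLASS = {
    **dict.fromkeys(("ASP", "GLU", "LYS", "ARG", "HIS"), "charged"),
    **dict.fromkeys(("ASN", "GLN", "SER", "THR"), "polar"),
    **dict.fromkeys(("ALA", "CYS", "GLY", "PHE", "ILE", "LEU", "MET",
                     "PRO", "TRP", "TYR", "VAL"), "apolar"),
}


def in_contact(atoms_a, atoms_b, cutoff=CUTOFF):
    """True if any atom pair across the two residues is within the cut-off."""
    return any(math.dist(a, b) <= cutoff for a, b in product(atoms_a, atoms_b))


def count_contacts(chain_a, chain_b, cutoff=CUTOFF):
    """Count residue-residue contacts between two chains, per property class.

    Each chain is a list of (residue_name, [(x, y, z), ...]) tuples.
    """
    counts = Counter()
    for (name_a, atoms_a), (name_b, atoms_b) in product(chain_a, chain_b):
        if in_contact(atoms_a, atoms_b, cutoff):
            classes = sorted((RESIDUE_CLASS[name_a], RESIDUE_CLASS[name_b]),
                             key=ORDER.index)
            counts["/".join(classes)] += 1
            counts["total"] += 1
    return counts


if __name__ == "__main__":
    # Toy coordinates (one atom per residue), invented for illustration only.
    chain_a = [("LYS", [(0.0, 0.0, 0.0)]), ("ALA", [(8.0, 3.0, 0.0)])]
    chain_b = [("GLU", [(3.0, 0.0, 0.0)]), ("SER", [(4.0, 3.0, 0.0)]),
               ("LEU", [(30.0, 0.0, 0.0)])]
    print(dict(count_contacts(chain_a, chain_b)))
```

The per-class counts returned here play the role of the properties P1, P2, … that feed the regression model.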
Training the predictor
LINEAR REGRESSION MODEL: $\Delta G_{\mathrm{predicted}} = w_1 P_1 + w_2 P_2 + \dots$
Contact properties:
• P1 Charged/Charged
• P2 Charged/Polar
• P3 Charged/Apolar
• P4 Polar/Polar
• P5 Polar/Apolar
• P6 Apolar/Apolar
[Table: properties P1 to P6 with weights w1 … w6, shown as table N and, after feature selection, as table M]
FEATURE SELECTION (AIC, Akaike Information Criterion)
CROSS-VALIDATION: 4-fold cross-validation (dataset split 75% training, 25% prediction), repeated 10 times
[Figure: dataset divided into Fold_1 to Fold_4, with each fold held out in turn]
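The training scheme (a linear model evaluated by 4-fold cross-validation repeated 10 times) can be sketched with the standard library alone. This is an illustrative sketch under stated assumptions: the least-squares fit uses the normal equations, the data are synthetic, the AIC feature-selection step is not shown, and all function names are hypothetical.

```python
import math
import random


def fit_linear(X, y):
    """Ordinary least squares with an intercept, via the normal equations.

    Returns [intercept, w1, w2, ...].
    """
    rows = [[1.0] + list(x) for x in X]
    n = len(rows[0])
    # Augmented matrix [A^T A | A^T y]
    aug = [[sum(r[i] * r[j] for r in rows) for j in range(n)]
           + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [aug[i][n] for i in range(n)]


def predict(weights, x):
    """Evaluate the linear model for one feature vector."""
    return weights[0] + sum(w * v for w, v in zip(weights[1:], x))


def rmse(pred, obs):
    """Root-mean-square error between predictions and observations."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


def repeated_kfold_rmse(X, y, k=4, repeats=10, seed=0):
    """Mean held-out RMSE over `repeats` random k-fold splits."""
    rng = random.Random(seed)
    idx = list(range(len(y)))
    scores = []
    for _ in range(repeats):
        rng.shuffle(idx)
        folds = [idx[i::k] for i in range(k)]  # each fold is ~1/k of the data
        for held_out in folds:
            held = set(held_out)
            train = [i for i in idx if i not in held]
            w = fit_linear([X[i] for i in train], [y[i] for i in train])
            preds = [predict(w, X[i]) for i in held_out]
            scores.append(rmse(preds, [y[i] for i in held_out]))
    return sum(scores) / len(scores)


if __name__ == "__main__":
    rng = random.Random(42)
    # Synthetic data, invented for illustration: two contact-count features.
    X = [[rng.randint(0, 30), rng.randint(0, 30)] for _ in range(40)]
    y = [-10.0 - 0.1 * a + 0.2 * b + rng.gauss(0.0, 0.5) for a, b in X]
    print("weights:", fit_linear(X, y))
    print("mean CV RMSE:", repeated_kfold_rmse(X, y))
```

With k = 4, each round trains on roughly 75% of the complexes and predicts the remaining 25%, matching the split quoted on the slide.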
The dataset: protein-protein binding affinity benchmark
122 complexes with complete crystallographic structure
• Functional classes (antibody 12%, enzymes 41%, other 47%)
• ΔG (-4.3 / -18.6) kcal mol⁻¹
• BSA (808 – 3370) Å²
• Methods (Kd) (SPR, fluorescence, ITC…)
• Conformational changes (0.17-4.90) Å
[Figure: binding affinity scale from stronger (-18.6 kcal mol⁻¹) to weaker (-4.3 kcal mol⁻¹)]
Benchmark in: Kastritis et al., Protein Sci 2011
Conformational changes upon binding
• Rigid complex: i_rmsd 0.42 Å
• Flexible complex (i_rmsd > 1.0 Å): i_rmsd 3.28 Å
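The rigid/flexible split can be illustrated with a small RMSD calculation. This sketch assumes the two coordinate sets (for example, interface atoms of the unbound and bound forms) are already superimposed and listed in matching order; a real i_rmsd calculation needs an optimal superposition first, which is not shown. The 1.0 Å threshold is the one on the slide; the function names are hypothetical.

```python
import math

FLEXIBLE_THRESHOLD = 1.0  # Å, threshold quoted on the slide


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long coordinate lists.

    ASSUMPTION: the coordinates are already superimposed and paired by index.
    """
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must be non-empty and equally long")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


def classify(i_rmsd, threshold=FLEXIBLE_THRESHOLD):
    """Label a complex as 'flexible' if its i_rmsd exceeds the threshold."""
    return "flexible" if i_rmsd > threshold else "rigid"


if __name__ == "__main__":
    for value in (0.42, 3.28):  # the two example values from the slide
        print(f"i_rmsd = {value:.2f} Å -> {classify(value)}")
```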
3. RESULTS
ICs and impact of experimental techniques
[Figure: scatter plot of ICs (10 to 80) versus experimental ΔGs (-20 to -4 kcal mol⁻¹); R = -0.50, p-value < 0.0001]
Technique          r_ICs   r_BSA   #cases
All                -0.50   -0.32   122
Stopped-flow       -0.70   -0.55   8
Spectroscopy       -0.65   -0.27   14
ITC                -0.55   -0.64   20
SPR                -0.53   -0.44   39
Inhibition Assay    0.05   -0.08   17
Fluorescence        0.04    0.34   19
[Figure: scatter plot of ICs versus experimental ΔGs (kcal mol⁻¹), Inhibition Assay + Fluorescence; R = 0.05, p-value < 0.4]
[Figure: third scatter plot of ICs versus experimental ΔGs (kcal mol⁻¹); R = -0.59, p-value < 0.0001]
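The r values in the table are Pearson correlation coefficients between a structural descriptor (ICs or BSA) and the experimental ΔG. A minimal sketch of that calculation is given below; the numbers in the demo are invented for illustration and are not taken from the benchmark.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (spread_x * spread_y)


if __name__ == "__main__":
    # Invented example: more contacts tend to go with a more negative dG.
    n_contacts = [25, 40, 48, 55, 62, 70]
    delta_g = [-6.1, -8.0, -9.4, -9.0, -12.2, -13.5]
    print(f"r = {pearson_r(n_contacts, delta_g):.2f}")
```

A negative r, as for most techniques in the table, means that complexes with more interfacial contacts tend to bind more strongly (more negative ΔG).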
The predictor
$\Delta G_{\mathrm{pred}} = w_1 P_1 + w_2 P_2 + \dots$
ICs total   ICs property-based   NIS   r
✓                                      0.59
            ✓                          0.67
            ✓                    ✓     0.73
\[
\begin{aligned}
\Delta G_{\mathrm{predicted}} ={} & -0.09459\,\mathrm{ICs}_{\mathrm{charged/charged}} \\
& -0.10007\,\mathrm{ICs}_{\mathrm{charged/apolar}} \\
& +0.19577\,\mathrm{ICs}_{\mathrm{polar/polar}} \\
& -0.22671\,\mathrm{ICs}_{\mathrm{polar/apolar}} \\
& +0.18681\,\%\mathrm{NIS}_{\mathrm{apolar}} \\
& +0.13810\,\%\mathrm{NIS}_{\mathrm{charged}} \\
& -15.9433
\end{aligned}
\]
[Figure: predicted versus experimental ΔGs (kcal mol⁻¹), both axes from -20 to -4; r = 0.73, RMSE = 1.89 kcal mol⁻¹]
Vangone and Bonvin, eLife (2015)
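Applying the final linear model is a single weighted sum. The sketch below is illustrative only: the coefficient for %NIS_charged is taken as 0.13810, the value in the published model (Vangone and Bonvin, eLife 2015), and should be checked against the paper before use. The function name and the input values in the demo are hypothetical.

```python
def predict_delta_g(ics_charged_charged, ics_charged_apolar, ics_polar_polar,
                    ics_polar_apolar, nis_apolar_pct, nis_charged_pct):
    """Predicted binding free energy (kcal/mol) from contact counts and %NIS.

    Coefficients follow the model of Vangone and Bonvin, eLife (2015);
    verify them against the paper before relying on the output.
    """
    return (-0.09459 * ics_charged_charged
            - 0.10007 * ics_charged_apolar
            + 0.19577 * ics_polar_polar
            - 0.22671 * ics_polar_apolar
            + 0.18681 * nis_apolar_pct
            + 0.13810 * nis_charged_pct
            - 15.9433)


if __name__ == "__main__":
    # Hypothetical feature values, invented for illustration only.
    dg = predict_delta_g(ics_charged_charged=5, ics_charged_apolar=10,
                         ics_polar_polar=3, ics_polar_apolar=12,
                         nis_apolar_pct=35.0, nis_charged_pct=28.0)
    print(f"predicted dG = {dg:.2f} kcal/mol")
```

The ICs arguments are contact counts for each residue-class pair, and the two NIS arguments are percentages of the non-interacting surface.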
Comparison with other methods
Performance compared with 105 functions reported in CCharPPI¹, calculated on the same set of
structures (“composite scoring functions” reported in the plot)
[Figure: bar chart of Pearson's correlation (0.00 to 0.80) for ALL, RIGID and FLEXIBLE complexes]
Vangone and Bonvin, eLife (2015)
¹ CCharPPI web-server: Moal et al., Bioinformatics 2015
PRODIGY: the web-server
Xue, Rodrigues, Kastritis, Bonvin, Vangone. Bioinformatics (2016)
4. CONCLUSION
Take home message
The number and nature of interface contacts is a simple but robust predictor of binding affinity
http://milou.science.uu.nl/services/PRODIGY/
ACKNOWLEDGMENTS
CSB group @ UU
Li Xue
João Rodrigues
Panagiotis Kastritis
Alexandre Bonvin
Further info
Cross-validation data
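\[
\begin{aligned}
\Delta G_{\mathrm{predicted}} ={} & -0.09459\,\mathrm{ICs}_{\mathrm{charged/charged}} \\
& -0.10007\,\mathrm{ICs}_{\mathrm{charged/apolar}} \\
& +0.19577\,\mathrm{ICs}_{\mathrm{polar/polar}} \\
& -0.22671\,\mathrm{ICs}_{\mathrm{polar/apolar}} \\
& +0.18681\,\%\mathrm{NIS}_{\mathrm{apolar}} \\
& +0.13810\,\%\mathrm{NIS}_{\mathrm{charged}} \\
& -15.9433
\end{aligned}
\]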
Vangone and Bonvin, eLife (2015)
The advantage of using contacts instead of surfaces?
[Figure: ICs compared with BSA]
• ICs perform better than BSA
The BSA of each residue depends strongly on its ASA in the free form:
$\mathrm{BSA}_{aa} = \mathrm{ASA}_{aa,\mathrm{free}} - \mathrm{ASA}_{aa,\mathrm{complex}}$
For an aa at the interface: $\mathrm{ASA}_{aa,\mathrm{complex}} \approx 0$, so $\mathrm{BSA}_{aa} \approx \mathrm{ASA}_{aa,\mathrm{free}}$
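The per-residue relation between buried and accessible surface can be made concrete with a few numbers. This is a small sketch with invented ASA values (in Å²); the residue labels and the function name are hypothetical, and computing real ASA values requires a dedicated surface-area program, which is not shown.

```python
def buried_surface_area(asa_free, asa_complex):
    """Per-residue BSA = ASA(free) - ASA(complex), for residues in both dicts."""
    return {res: asa_free[res] - asa_complex[res]
            for res in asa_free if res in asa_complex}


if __name__ == "__main__":
    # Invented ASA values in Å²; the interface residues end up almost fully buried.
    asa_free = {"ARG45": 180.0, "GLY46": 40.0, "LEU99": 5.0}
    asa_complex = {"ARG45": 2.0, "GLY46": 1.0, "LEU99": 5.0}
    for res, bsa in buried_surface_area(asa_free, asa_complex).items():
        print(f"{res}: BSA = {bsa:.1f} Å² (ASA free = {asa_free[res]:.1f} Å²)")
```

For the two buried residues, BSA is close to the free-form ASA, so a large, exposed residue contributes far more BSA than a small one even though each counts as a single contact partner.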
Audience Q&A session
Please use the Questions function in the GoToWebinar application
Any other questions or points to discuss after the live webinar? Join the discussion at
http://bit.ly/2dc69Ur