New Walters D3R 2018 - D3R | Welcome... · 2018. 3. 5. · Pat Walters – D3R Workshop February...


How can we get better at this?

Pat Walters – D3R Workshop, February 23, 2018

I've Done This a Few Times

2012 2013 2014 2015 2016 2017 2018

How I Spend My Time On Challenges

Confidential | © 2017 Relay Therapeutics

Dealing with poorly formatted submissions

Validating evaluations

Making slides

The Evaluation Process

Pat Evaluate

Connor and Zied Evaluate

Final Comparisons

The Literature Makes It Look Like Activity Prediction is a Solved Problem

[Figure: Pearson r = 0.82, 0.80, 0.66, 0.65]

Scoring Performance From GC2 and GC3

We need to agree on
• What constitutes a reasonable dataset
• How data should be reported
• Evaluation metrics
• Statistics for comparison
• What constitutes a null model
• Format of supporting material
• Criteria for reproducibility

Guidelines For Reviewing "Scoring Function" Papers


When evaluating a regression model, the dataset should have a dynamic range similar to those observed in drug discovery projects (typically 4-6 logs).

Datasets Should Span a Reasonable Dynamic Range

This dataset (PDBbind v.2016 core set) spans 10 logs and doesn't provide an appropriate representation of correlation.

Correlations Can Change Dramatically With Dynamic Range

R² = 0.22, MAE = 0.69

R² = 0.76, MAE = 0.55

This is the same dataset. On the left we consider the entire set, which has an unrealistically large (~10 log) dynamic range. On the right we consider a more realistic subset with a 3 log dynamic range. Note the change in correlation.
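The effect of dynamic range on correlation can be illustrated with synthetic data. The sketch below is only an illustration under stated assumptions: the "experimental" values, the 1.0 log prediction error, and the 6-9 window are all made up and are not the PDBbind data from the slide. It compares R² and MAE on a 10 log set against a 3 log subset of the same points.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mean_absolute_error(x, y):
    """Mean absolute difference between paired values."""
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)


# Hypothetical data: values spread uniformly over 10 logs, with
# "predictions" that carry a constant 1.0 log Gaussian error.
rng = random.Random(0)
experimental = [rng.uniform(2.0, 12.0) for _ in range(2000)]
predicted = [value + rng.gauss(0.0, 1.0) for value in experimental]

# Restrict the same data to a 3 log window (an arbitrary choice).
pairs = [(e, p) for e, p in zip(experimental, predicted) if 6.0 <= e <= 9.0]
subset_exp = [e for e, _ in pairs]
subset_pred = [p for _, p in pairs]

r2_full = pearson_r(experimental, predicted) ** 2
r2_subset = pearson_r(subset_exp, subset_pred) ** 2
mae_full = mean_absolute_error(experimental, predicted)
mae_subset = mean_absolute_error(subset_exp, subset_pred)

print(f"10 log range: R2={r2_full:.2f} MAE={mae_full:.2f}")
print(f" 3 log range: R2={r2_subset:.2f} MAE={mae_subset:.2f}")
```

In this synthetic case the prediction error is identical in both sets, so the MAE barely moves while R² depends strongly on the range of the data.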

GC3 CatS Dataset Spans a Realistic Dynamic Range


Don't Cram Multiple Datasets onto the Same Plot

http://pubs.acs.org/doi/abs/10.1021/acs.jpcb.7b07224
http://pubs.acs.org/doi/abs/10.1021/ja512751q

Even My Friends Are Guilty

Mill and Neysa (Yesterday)

Trellising provides a much more effective means of comparing datasets


• Report Pearson, Spearman and Kendall correlations
• Favor R² over R when reporting a Pearson correlation coefficient
• Report MAE and/or RMSE

Always report correlations appropriately
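As a rough illustration of the metrics recommended here (Pearson R², Spearman, Kendall, MAE and RMSE), the following sketch computes all of them with the Python standard library only. The function names and the five data points are hypothetical, and the Kendall value is the tau-b variant, which is an assumption since the slides do not say which variant is meant.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


def kendall_tau(x, y):
    """Kendall tau-b from an O(n^2) count of concordant/discordant pairs."""
    concordant = discordant = ties_x = ties_y = 0
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            dx, dy = x[i] - x[j], y[i] - y[j]
            if dx == 0 and dy == 0:
                continue
            if dx == 0:
                ties_x += 1
            elif dy == 0:
                ties_y += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    untied = concordant + discordant
    return (concordant - discordant) / math.sqrt(
        (untied + ties_x) * (untied + ties_y)
    )


def report_metrics(experimental, predicted):
    """Collect the correlation and error metrics in one dictionary."""
    errors = [p - e for e, p in zip(experimental, predicted)]
    return {
        "pearson_r2": pearson_r(experimental, predicted) ** 2,
        "spearman_rho": spearman_rho(experimental, predicted),
        "kendall_tau": kendall_tau(experimental, predicted),
        "mae": sum(abs(err) for err in errors) / len(errors),
        "rmse": math.sqrt(sum(err ** 2 for err in errors) / len(errors)),
    }


# Hypothetical values, for illustration only.
experimental = [5.0, 6.0, 7.0, 8.0, 9.0]
predicted = [5.5, 5.8, 7.4, 7.9, 9.6]
metrics = report_metrics(experimental, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```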

I have no idea what this means

http://pubs.acs.org/doi/abs/10.1021/acs.jpcb.7b07224

Maximum Achievable Correlation

Start with experimental data
Add Gaussian error
• Mean = 0.0
• Standard deviation = 0.3 log
Calculate correlation
Repeat 1000 times

Brown, Scott P., Steven W. Muchmore, and Philip J. Hajduk. "Healthy skepticism: assessing realistic model performance." Drug Discovery Today 14.7 (2009): 420-427.
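The procedure above can be sketched in a few lines. This is a minimal reading of the steps as listed on the slide, not the code from the cited paper or from metk: it assumes the correlation is taken between the original values and their noise-perturbed copies, and the function name, the seed, and the example datasets are hypothetical.

```python
import math
import random
import statistics


def pearson_r(x, y):
    """Pearson correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def max_achievable_correlation(experimental, error_sd=0.3, n_trials=1000, seed=0):
    """Mean Pearson r between the data and noisy copies of itself.

    Each trial adds Gaussian error (mean 0.0, standard deviation
    error_sd) to every experimental value and correlates the result
    with the original values.
    """
    rng = random.Random(seed)
    correlations = []
    for _ in range(n_trials):
        noisy = [value + rng.gauss(0.0, error_sd) for value in experimental]
        correlations.append(pearson_r(experimental, noisy))
    return statistics.mean(correlations)


# Hypothetical datasets: 50 evenly spaced values over 3 logs and over 1 log.
wide = [6.0 + 3.0 * i / 49 for i in range(50)]
narrow = [6.0 + 1.0 * i / 49 for i in range(50)]

r_wide = max_achievable_correlation(wide)
r_narrow = max_achievable_correlation(narrow)
print(f"3 log range: mean r = {r_wide:.2f}")
print(f"1 log range: mean r = {r_narrow:.2f}")
```

With the same 0.3 log error, the narrower dataset has a lower ceiling on the correlation any model could reach.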

Maximum Achievable Correlation - HSP90 D3R1

https://github.com/PatWalters/metk

Open Source Evaluation Code (More to Come)


Ensure That Differences in Correlation Are Significant

In particular, both MM-PB/SA and MM-GB/SA produced better results by using a representative structure (R = 0.72-0.79) rather than averaging over the conformational ensemble of each given complex (R = 0.61-0.74)

[Figure: abs(Pearson r) for M1_dynamic, M1_static, M2_static, M3_dynamic, M3_static, M4_dynamic, M4_static (Table L2)]

A literature comparison of 7 methods for scoring protein-ligand interactions

Remember that correlations have confidence intervals and report these intervals
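One common way to put a confidence interval on a Pearson correlation is the Fisher z-transformation. The sketch below assumes that approach; the slides do not say which method was used for the plots. The function names, the correlations of 0.75 and 0.65, and the sample size of 30 are hypothetical values chosen for illustration.

```python
import math
from statistics import NormalDist


def pearson_confidence_interval(r, n, confidence=0.95):
    """Approximate confidence interval for a Pearson r from n data points.

    Uses the Fisher z-transformation: atanh(r) is treated as
    approximately normal with standard error 1 / sqrt(n - 3).
    """
    if n <= 3:
        raise ValueError("need more than 3 data points")
    z = math.atanh(r)
    standard_error = 1.0 / math.sqrt(n - 3)
    z_crit = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return (
        math.tanh(z - z_crit * standard_error),
        math.tanh(z + z_crit * standard_error),
    )


def intervals_overlap(a, b):
    """True if two (low, high) intervals share any values."""
    return a[0] <= b[1] and b[0] <= a[1]


# Hypothetical example: two methods scored on the same 30 compounds.
ci_method_a = pearson_confidence_interval(0.75, 30)
ci_method_b = pearson_confidence_interval(0.65, 30)
print(f"method A: r=0.75, 95% CI=({ci_method_a[0]:.2f}, {ci_method_a[1]:.2f})")
print(f"method B: r=0.65, 95% CI=({ci_method_b[0]:.2f}, {ci_method_b[1]:.2f})")
print("intervals overlap:", intervals_overlap(ci_method_a, ci_method_b))
```

Overlapping intervals are only a quick check. When two methods are evaluated on the same compounds their correlations are not independent, so a formal comparison would need a test designed for dependent correlations.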

It's All the Same!

[Figure: abs(Pearson r) for M1_dynamic, M1_static, M2_static, M3_dynamic, M3_static, M4_dynamic, M4_static (Table L2)]


Molecular weight and calculated LogP are poor null models

• Generate RDKit fingerprints for ligands
• Train on PDBbind refined set (n=4057)
• Test on PDBbind core set (n=290)
• Wall clock time < 5 min

Simple QSAR as a Null Model

What Constitutes an Appropriate Null Model

Molecular Weight XLogP Simple QSAR

A Null Model for RMSE

1. Sample N observed values
2. Calculate RMS
3. Repeat steps 1 and 2 1000 times
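The three steps above leave some details open, so the sketch below is one possible reading rather than the method used for the challenge plots. It assumes that the N sampled values are drawn with replacement from the observed values and used as stand-in "predictions", and that the RMS in step 2 is the root-mean-square error against the observed values. The function names and the example free energies are hypothetical.

```python
import math
import random
import statistics


def rmse(observed, predicted):
    """Root-mean-square error between paired values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def null_rmse_distribution(observed, n_trials=1000, seed=0):
    """RMSE values obtained when predictions are random draws of the data."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_trials):
        # Step 1: sample N observed values (with replacement, an assumption).
        sampled = rng.choices(observed, k=len(observed))
        # Step 2: calculate the RMS error against the observed values.
        results.append(rmse(observed, sampled))
    # Step 3 is the loop itself: repeat n_trials times.
    return results


# Hypothetical binding free energies in kcal/mol, for illustration only.
observed = [-6.0 - 0.2 * i for i in range(20)]
null_values = null_rmse_distribution(observed)
print(f"null model RMSE: {statistics.mean(null_values):.2f} kcal/mol (mean of 1000 trials)")
```

A submitted method whose RMSE is not clearly below this null distribution is doing no better than guessing values from the observed range.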

Null Model for GC1 HSP90 Free Energy Challenge

[Figure: RMSE (kcal/mol)]

Comparing RMS vs Null for GC1 HSP90 Challenge

Dashed line indicates the null model


• Always provide a machine-readable table (e.g. csv) of predicted and experimental values
• A table in a paper is not sufficient; it is often very difficult to extract tables from PDF files
• Chemical structures should be included as SDF or, where appropriate, SMILES to facilitate comparison with other methods
• Need to enable readers to evaluate correlations and errors

Include appropriate supporting information
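A machine-readable table of this kind takes only a few lines to produce. The sketch below writes and re-reads a CSV file with Python's csv module. The column names, compound identifiers, and values are hypothetical, since the slides do not prescribe a layout.

```python
import csv
import io

# Hypothetical column layout: identifier, structure, and the two values.
FIELDS = ["compound_id", "smiles", "experimental", "predicted"]

# Made-up records, for illustration only.
records = [
    {"compound_id": "cmpd_1", "smiles": "CCO", "experimental": 6.2, "predicted": 5.9},
    {"compound_id": "cmpd_2", "smiles": "c1ccccc1", "experimental": 7.1, "predicted": 7.4},
]


def write_predictions(rows, handle):
    """Write one row per compound, with a header line."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


def read_predictions(handle):
    """Read the table back, converting the numeric columns to float."""
    rows = []
    for row in csv.DictReader(handle):
        row["experimental"] = float(row["experimental"])
        row["predicted"] = float(row["predicted"])
        rows.append(row)
    return rows


# An in-memory buffer stands in for a supporting-information file.
buffer = io.StringIO(newline="")
write_predictions(records, buffer)
print(buffer.getvalue())
```

Keeping the structures in the same file as the predicted and experimental values lets a reader recompute correlations and errors without retyping anything from a PDF.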


Can I Reproduce Your Method?

• Code!!!
• A thorough description of your method
• A web implementation
• None of the above

What Constitutes Reproducibility?


How Can You Help?

Docking Challenges Have Become More Challenging

Are we spending enough time understanding compounds that docked poorly?
• Insufficient conformational sampling
• Insufficient pose sampling
• Inadequate scoring
• Ligand poses with limited density
Is everyone missing the same compounds?
Can groups work together to improve their methods?

Questions on Docking Challenges

D3R Participants
CSAR Participants
TDT Participants
SAMPL Participants

Rommie Amaro
Mike Gilson

Mill Lambert
Neysa Nevins

Connor Parks
Zied Gaieb

Shuai Liu

Acknowledgements


BACKUP

Looks Like Activity Prediction is a Solved Problem

[Figure: Pearson r = 0.82, 0.80, 0.66, 0.65]


Evaluate maximum possible correlation for a dataset given experimental error

https://www.sciencedirect.com/science/article/pii/S1359644609000403

Start with experimental data
Add Gaussian error
• Mean = 0.0
• Standard deviation = 0.3 log
Calculate correlation
Repeat 1000 times

Maximum Achievable Correlation
