This is the presentation from my talk at the excellent Gordon Research Conference on Computer Aided Drug Design 2013.
Uncertainty in QSAR Predictions – Bayesian Inference and the Magic of Bootstrap
Ullrika Sahlin PhD
Centre for Environmental and Climate Research (CEC)
QSAR integrated assessment
[Diagram: assessment model with three inputs (QSAR prediction, QSAR prediction, experimental value) leading to a decision node]
Uncertainty in hazard assessment – does it matter?
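Classification of 386 compounds at each level of hazard assessment (HA):
0. No HA: 386 unclassified (?)
1. QSAR predictions without uncertainty: 281 not toxic*, 105 very toxic
2. Median toxicity: 265 not toxic* (+16 very toxic)
3. Expected toxicity: 262 not toxic* (+3 very toxic)
4. Conservative value of toxicity: 153 not toxic* (+109 very toxic)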
Sahlin et al. 2013. Arguments for Considering Uncertainty in QSAR Predictions in Hazard and Risk Assessments. ATLA
QSAR integrated hazard assessment and the AD domain problem
[Figure: Predicted No Effect Concentration of 386 triazoles. x-axis: log min{EC50}; y-axis: molecular weight. Annotations: relative toxicity potential; low confidence in prediction.]
Modes of statistical inference
• Parametric inference
  – Explain
  – Hypothesis-driven
• Predictive inference
  – Predict to support decision making
  – Generate hypothesis
• Evidence synthesis
  – Consider quality
Geisser. Introduction to predictive inference 1993. Sutton and Abrams 2001. Bayesian methods in meta-analysis and evidence synthesis. Statistical Methods in Medical Research.
To predict…
is to make a statement about something we have not yet observed
A prediction is always made with uncertainty
A prediction is made using at least one model
How can I…
• Assess uncertainty in a prediction?
• Take my judgement of confidence in the model into account?
• Validate the assessment?
Principle for QSAR modelling
Principle to judge confidence in predictions
Principle to assess uncertainty
Uncertainty in a prediction
Predictive error
Predictive reliability: our confidence in using a model to predict what we want to predict
[Figure: two panels. Left: predictive mean vs hat value. Right: log EC50 vs nC.]
Discrepancy between model and reality
Quantitative
Qualitative
[Figure: predicted y vs nC]
Different kinds of errors
[Figure: scatter plot of prediction vs distance from model (log scale)]
Predictive reliability
Different measures of predictive reliability
• Similarity to points in the training data set
• Distance from the centre of training data
• Density of training data around the item to be predicted
• Sensitivity analysis, e.g. standard deviation in perturbed predictions
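The distance-type measures listed above can be sketched in a few lines. This is only an illustration under assumptions that are not in the slides: a linear model with intercept for the leverage (hat value), Euclidean distances, and the mean distance to the k nearest training points as a rough stand-in for local density. The function name `reliability_measures` and the parameter `k` are hypothetical. Sensitivity analysis is not covered here.

```python
import numpy as np

def reliability_measures(X_train, x_query, k=3):
    """Return three illustrative reliability measures for one query item."""
    X_train = np.asarray(X_train, dtype=float)
    x_query = np.asarray(x_query, dtype=float)
    centre = X_train.mean(axis=0)

    # Distance from the centre of the training data (Euclidean, assumed).
    dist_centre = np.linalg.norm(x_query - centre)

    # Leverage (hat value) of the query under a linear model with intercept.
    Xc = np.column_stack([np.ones(len(X_train)), X_train])
    xq = np.concatenate([[1.0], x_query])
    leverage = xq @ np.linalg.pinv(Xc.T @ Xc) @ xq

    # Local density proxy: mean distance to the k nearest training points
    # (a smaller value means denser training data around the query).
    d = np.linalg.norm(X_train - x_query, axis=1)
    knn_dist = np.sort(d)[:k].mean()

    return {"dist_centre": dist_centre, "leverage": leverage, "knn_dist": knn_dist}

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))                      # synthetic descriptors
inside = reliability_measures(X, X.mean(axis=0))  # query at the centre
outside = reliability_measures(X, X.mean(axis=0) + 10.0)  # query far away
```

All three measures grow as the query moves away from the training data, which is the behaviour a reliability measure of this kind is meant to capture.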
Predictive error of a regression
Predictive distribution
p(Y < y |X,θ)
Use likelihood to compare!
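One way to read "use likelihood to compare" is to score each assessment by the log of the predictive density at the observed values, summed over the evaluation data. The sketch below assumes Gaussian predictive distributions, which the slides only state for one of the approaches; the function name `gaussian_log_score` and the numbers are made up for illustration.

```python
import math

def gaussian_log_score(y_obs, means, sds):
    """Sum of log Gaussian predictive densities at the observed values."""
    total = 0.0
    for y, m, s in zip(y_obs, means, sds):
        total += -0.5 * math.log(2 * math.pi * s**2) - (y - m) ** 2 / (2 * s**2)
    return total

# Illustrative data: same point predictions, two different uncertainty claims.
y_obs = [0.1, -0.4, 1.2]
means = [0.0, 0.0, 1.0]
honest = gaussian_log_score(y_obs, means, [0.5, 0.5, 0.5])
overconfident = gaussian_log_score(y_obs, means, [0.05, 0.05, 0.05])
```

A higher (less negative) score is better. Here the assessment that claims far too little uncertainty is penalised heavily, even though its point predictions are identical.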
Different ways to assess
Assessment of predictive distribution
• Frequentist framework
  – Frequentist analytical
  – Sampling ("external data")
  – Re-sampling
    – Jackknifing ("without replacement")
    – Bootstrapping ("with replacement")
• Bayesian framework
  – Bayesian analytical
  – Bayesian sampling
1. Bayesian modelling
• Model parameters are uncertain
• Uncertainty is described by probability
• Prior information is subjective
• Data enters through Bayesian updating
[Figure: MCMC sampling; axes: parameter 1 and parameter 2]
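To make Bayesian updating by MCMC sampling concrete, here is a minimal random-walk Metropolis sketch for a single parameter: the mean of normally distributed data with known noise, under a normal prior. This is a toy model chosen for illustration, not the Bayesian lasso used in the talk. The function name, prior settings and data are assumptions.

```python
import numpy as np

def metropolis_normal_mean(y, prior_mean, prior_sd, noise_sd,
                           n_iter=5000, step=0.5, seed=0):
    """Draw from the posterior of a normal mean with a random-walk Metropolis sampler."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)

    def log_post(mu):
        # Unnormalised log posterior = log prior + log likelihood.
        log_prior = -0.5 * ((mu - prior_mean) / prior_sd) ** 2
        log_lik = -0.5 * np.sum(((y - mu) / noise_sd) ** 2)
        return log_prior + log_lik

    mu = prior_mean
    lp = log_post(mu)
    draws = np.empty(n_iter)
    for i in range(n_iter):
        prop = mu + step * rng.normal()          # symmetric proposal
        lp_prop = log_post(prop)
        if np.log(rng.uniform()) < lp_prop - lp:  # accept or keep current value
            mu, lp = prop, lp_prop
        draws[i] = mu
    return draws

y = np.array([1.8, 2.1, 2.4, 1.9, 2.3])           # made-up observations
draws = metropolis_normal_mean(y, prior_mean=0.0, prior_sd=10.0, noise_sd=0.5)
posterior_mean = draws[1000:].mean()               # discard burn-in
```

The retained draws describe the remaining uncertainty in the parameter as a probability distribution. With a vague prior, as here, the posterior mean ends up close to the sample mean of the data.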
1. Bayesian modelling
Pros
• Uncertainty is measured by probability
• Links to decision theory
• Motivated under small data
Cons
• Treatment of high-dimensional descriptor space?
• Limitation to specific models?
• Re-modelling of QSARs needed
Validation
Data: Fathead Minnow, from the QSARdata R package
Park and Casella (2008) Journal of the American Statistical Association, Gramacy and Pantaleo (2010) Bayesian Analysis.
[Figure: predicted vs observed. Training data: R2_Blasso = 0.79. Test data: R2_Blasso = 0.75.]
Validation
Empirical coverage
[Figure: empirical coverage, hit rate vs confidence, for training data and test data]
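Empirical coverage can be computed as the fraction of observations that fall inside the central predictive interval at each confidence level (the hit rate). The sketch below assumes Gaussian predictive distributions and uses simulated data; the function name `empirical_coverage` is hypothetical.

```python
import random
from statistics import NormalDist

def empirical_coverage(y_obs, means, sds, levels):
    """Hit rate of central Gaussian predictive intervals at each confidence level."""
    rates = []
    for level in levels:
        z = NormalDist().inv_cdf(0.5 + level / 2)  # half-width in sd units
        hits = sum(abs(y - m) <= z * s for y, m, s in zip(y_obs, means, sds))
        rates.append(hits / len(y_obs))
    return rates

# Simulated observations from N(0, 1), assessed by two predictive distributions.
random.seed(1)
n = 2000
y_obs = [random.gauss(0.0, 1.0) for _ in range(n)]
levels = [0.5, 0.8, 0.95]
calibrated = empirical_coverage(y_obs, [0.0] * n, [1.0] * n, levels)
too_narrow = empirical_coverage(y_obs, [0.0] * n, [0.5] * n, levels)
```

For a well-calibrated assessment the hit rate follows the 1:1 line against the confidence level. An assessment that understates uncertainty falls below it.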
2. Bootstrap sampling
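A minimal sketch of bootstrap sampling for a predictive distribution, assuming ordinary least squares as the model and case resampling with replacement. The model used in the talk may differ. Each replicate refits the model and adds one resampled residual to the prediction, which is one common way (assumed here) to let the spread reflect both parameter uncertainty and residual noise. The function name and data are made up.

```python
import numpy as np

def bootstrap_predictions(X, y, x_query, n_boot=500, seed=0):
    """Bootstrap draws from an approximate predictive distribution at x_query."""
    rng = np.random.default_rng(seed)
    X = np.column_stack([np.ones(len(X)), np.asarray(X, dtype=float)])
    y = np.asarray(y, dtype=float)
    xq = np.concatenate([[1.0], np.atleast_1d(x_query).astype(float)])
    n = len(y)
    draws = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)           # sample cases with replacement
        beta, *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)
        resid = y[idx] - X[idx] @ beta
        # Refitted prediction plus one resampled residual.
        draws[b] = xq @ beta + rng.choice(resid)
    return draws

rng = np.random.default_rng(1)
x = rng.uniform(0, 10, size=80)                     # synthetic descriptor
y = 1.0 + 2.0 * x + rng.normal(scale=1.0, size=80)  # synthetic response
draws = bootstrap_predictions(x, y, x_query=5.0)
lower, upper = np.percentile(draws, [2.5, 97.5])    # 95% predictive interval
```

The percentiles of the draws give a predictive interval without assuming a particular distribution for the errors.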
3. Assessment considering judgment in predictive reliability
Inspired by Denham 1997 and Clark 2009
Type of distribution: Gaussian
Mean: Point prediction yq
Variance: Local Predictive Error Sum of Squares divided by denominator
Observed prediction errors: $y_j - \hat{y}_j$
Measure of predictive reliability
Sampling from distribution of modified residuals
$$\mathrm{PRESS} = \sum_{j=1}^{n} (y_j - \hat{y}_j)^2$$
$$\mathrm{W.PRESS}_q = \frac{\sum_{j=1}^{n} w_{q,j}\,(y_j - \hat{y}_j)^2}{\sum_{j=1}^{n} w_{q,j}}$$
$$\mathrm{kNN.PRESS}_q = \sum_{j \in \mathrm{kNN}(w_{q,j})} (y_j - \hat{y}_j)^2$$
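The three error sums of squares on this slide (global PRESS, weighted W.PRESS and nearest-neighbour kNN.PRESS) can be sketched as follows. Several things here are assumptions, because the slide does not specify them: the weights `w` are taken as a decreasing function of distance to the query, the "denominator" is taken as n for PRESS and k for kNN.PRESS, and W.PRESS is used as is since it is already normalised by the sum of weights. All names and numbers are illustrative.

```python
import numpy as np

def press(residuals):
    """Global PRESS: sum of squared prediction errors."""
    return float(np.sum(residuals ** 2))

def w_press(residuals, weights):
    """W.PRESS for one query: weighted mean of squared prediction errors."""
    return float(np.sum(weights * residuals ** 2) / np.sum(weights))

def knn_press(residuals, weights, k):
    """kNN.PRESS for one query: squared errors of the k highest-weight items."""
    nearest = np.argsort(weights)[-k:]   # k largest weights = k most similar
    return float(np.sum(residuals[nearest] ** 2))

# Made-up prediction errors y_j - yhat_j and distances to the query q.
residuals = np.array([0.1, -0.2, 0.1, 1.5, -2.0])
dist_to_q = np.array([0.2, 0.3, 0.5, 3.0, 4.0])
weights = np.exp(-dist_to_q)             # assumed form of w_{q,j}

var_equal = press(residuals) / len(residuals)         # assumed denominator: n
var_w = w_press(residuals, weights)                   # already normalised
var_knn = knn_press(residuals, weights, k=3) / 3      # assumed denominator: k
```

In this example the query sits close to training items with small errors, so both local variants give a smaller predictive variance than the global one. Each variance would then be paired with the point prediction as the mean of a Gaussian predictive distribution.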
Validate the assessment
Evaluation on external data
[Figure: log likelihood score for each assessment of predictive error: equal, W euclidean, W leverage, W ADdens, kNN euclidean, kNN leverage, kNN ADdens]
[Figure: empirical coverage (external data), hit rate vs confidence level, with 1:1 line, for the same seven assessments]
So – which approach is the best?
[Figure: predicted vs observed. Training data: R2_pls = 0.77, R2_boot = 0.83, R2_Blasso = 0.79. Test data: R2_pls = 0.77, R2_boot = 0.78, R2_Blasso = 0.75.]
[Figure: empirical coverage, hit rate vs confidence, with 1:1 line. Training data: Blasso, Bootstrap, kNN leverage, equal. Test data: Blasso, Bootstrap, W euclidean, equal.]
Evaluation on training data
[Figure: log likelihood score for each assessment of predictive error: Blasso, Bootstrap, kNN leverage, equal]
Take home messages
• A prediction is complete when given with uncertainty specified by probability
• Assessment of uncertainty needs to be both theoretically motivated and proven honest in empirical evaluation of performance measures
• Three useful approaches are to assess uncertainty through modelling (Bayesian), sampling (e.g. bootstrapping), or post-modelling of predictive error
• Use appropriate measures to validate the assessment of uncertainty
Thank you for your attention
Drive safely in the statistical jungle!