A Method for the More Accurate Measurement and Communication of Model Error
Scott Fortmann-Roe
University of California, Berkeley
Predictions
1) More accurate assessment of prediction error
2) More accurate models
Inferences
3) More accurate measures of significance
4) Altered inferences and conclusions
Measures: R², p-value, AIC
The Method: A3
Accuracy
Accessibility
Adaptability
Applications
Housing Market
Predicting housing price based on house and market attributes
Harrison D, Rubinfeld DL (1978) Hedonic housing prices and the demand for clean air. Journal of Environmental Economics and Management 5: 81–102.
Term             Coefficient   Std. Error   t-Value   p-Value
(Intercept)            7.767        4.989     1.557      0.12
AGE                   -0.015        0.014    -1.096      0.27
ROOMS                  7.006        0.412    17.015    < 0.01
NOX                  -13.314        3.903    -3.412    < 0.01
PUPIL/TEACHER         -1.116        0.148    -7.544    < 0.01
HIGHWAY               -0.025        0.043    -0.584      0.56
Adjusted R²: 0.60; p-Value < 0.01
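As a quick sanity check on the regression table, each t-value in an ordinary least-squares summary is the coefficient divided by its standard error. The sketch below recomputes that ratio from the rounded figures shown on the slide; the dictionary layout and the `t_value` helper are illustrative names, and small gaps are expected because the slide's coefficients and standard errors are already rounded.

```python
# (coefficient, standard error, reported t-value), copied from the slide's table.
rows = {
    "(Intercept)":   (7.767, 4.989, 1.557),
    "AGE":           (-0.015, 0.014, -1.096),
    "ROOMS":         (7.006, 0.412, 17.015),
    "NOX":           (-13.314, 3.903, -3.412),
    "PUPIL/TEACHER": (-1.116, 0.148, -7.544),
    "HIGHWAY":       (-0.025, 0.043, -0.584),
}


def t_value(coefficient, std_error):
    """t-statistic for the null hypothesis that the coefficient is zero."""
    return coefficient / std_error


# Recompute each t-value and measure how far it is from the reported one.
recomputed = {name: t_value(c, se) for name, (c, se, _) in rows.items()}
max_gap = max(abs(recomputed[name] - rows[name][2]) for name in rows)

for name, (_, _, reported) in rows.items():
    print(f"{name:14s} reported {reported:8.3f}   recomputed {recomputed[name]:8.3f}")
```

The recomputed values agree with the reported ones up to the rounding of the inputs; the largest gap is for AGE, whose coefficient and standard error carry only two significant digits.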
A3: Linear Model
Term             Coefficient    CrVa R²   p-Value
-Full Model-                     59.3 %    < 0.01
(Intercept)            7.767    - 0.1 %      0.39
AGE                   -0.015    + 0.0 %      0.22
ROOMS                  7.006   + 22.9 %    < 0.01
NOX                  -13.314    + 0.8 %    < 0.01
PUPIL/TEACHER         -1.116    + 4.6 %    < 0.01
HIGHWAY               -0.025    - 0.2 %      1.00
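The slides report a "CrVa R²" column without defining it. Assuming it denotes a cross-validated R² (one minus the ratio of the model's out-of-fold squared error to that of a mean-only null model evaluated on the same folds), a minimal sketch could look like the following. The function names, the interleaved fold scheme, and the one-predictor least-squares fit are all illustrative assumptions, not the presenter's implementation.

```python
from statistics import mean


def k_fold_indices(n, k):
    """Split range(n) into k interleaved folds (deterministic, no shuffling)."""
    return [list(range(i, n, k)) for i in range(k)]


def fit_mean(train_x, train_y):
    """Null model: always predict the mean of the training response."""
    m = mean(train_y)
    return lambda x: m


def fit_simple_linear(train_x, train_y):
    """Ordinary least squares with a single predictor."""
    mx, my = mean(train_x), mean(train_y)
    sxx = sum((x - mx) ** 2 for x in train_x)
    sxy = sum((x - mx) * (y - my) for x, y in zip(train_x, train_y))
    slope = sxy / sxx
    intercept = my - slope * mx
    return lambda x: intercept + slope * x


def cv_squared_error(xs, ys, fit, k=5):
    """Sum of squared errors over held-out points across all k folds."""
    total = 0.0
    for fold in k_fold_indices(len(xs), k):
        held_out = set(fold)
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        predict = fit(train_x, train_y)
        total += sum((ys[i] - predict(xs[i])) ** 2 for i in fold)
    return total


def cross_validated_r2(xs, ys, fit, k=5):
    """Share of the null model's out-of-fold squared error that the model removes."""
    model_error = cv_squared_error(xs, ys, fit, k)
    null_error = cv_squared_error(xs, ys, fit_mean, k)
    return 1.0 - model_error / null_error
```

Under this definition a model that predicts no better than the training mean scores 0, and a model that predicts worse out of sample scores below 0, which is one way a per-term contribution could come out negative.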
A3: Random Forest Model
Term              CrVa R²   p-Value
-Full Model-       74.3 %    < 0.01
AGE               - 1.5 %      0.01
ROOMS            + 20.4 %    < 0.01
NOX               + 6.3 %    < 0.01
PUPIL/TEACHER     - 1.4 %    < 0.01
HIGHWAY           - 2.6 %      0.03
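Linear Regression
• CrVa R²: 0.593
• Significant at p = 0.05: ROOMS, NOX, PUPIL/TEACHER
• Not significant at p = 0.05: AGE, HIGHWAY
Random Forest
• CrVa R²: 0.743
• Significant at p = 0.05: AGE, ROOMS, NOX, PUPIL/TEACHER, HIGHWAY
• Not significant at p = 0.05: (none)
Support Vector Machines
• CrVa R²: 0.711
• Significant at p = 0.05: AGE, ROOMS, NOX, PUPIL/TEACHER
• Not significant at p = 0.05: HIGHWAY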
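The slides do not state how the p-values behind this significance comparison are computed. One model-agnostic option, shown here purely as an assumption, is a permutation test: shuffle the response many times, recompute the fit statistic on each shuffle, and report the share of shuffles that do at least as well as the real data. The helper names and the one-predictor in-sample R² statistic below are hypothetical stand-ins chosen to keep the sketch self-contained.

```python
import random
from statistics import mean


def r_squared(xs, ys):
    """In-sample R² of a one-predictor least-squares fit (squared correlation)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return (sxy * sxy) / (sxx * syy)


def permutation_p_value(xs, ys, statistic, n_permutations=999, seed=0):
    """Estimate how often shuffled responses match or beat the observed statistic."""
    rng = random.Random(seed)          # fixed seed so the estimate is reproducible
    observed = statistic(xs, ys)
    shuffled = list(ys)                # copy so the caller's data is not reordered
    at_least_as_large = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if statistic(xs, shuffled) >= observed:
            at_least_as_large += 1
    # Add-one correction keeps the estimate strictly positive.
    return (at_least_as_large + 1) / (n_permutations + 1)
```

Because the test only needs a statistic that can be recomputed on shuffled data, the same procedure can in principle be applied to linear regression, random forests, and support vector machines alike, which is what makes a side-by-side significance comparison possible.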
Environmental Productivity
Measure utility of an ecosystem based on different physical attributes
Maestre FT, Quero JL, Gotelli NJ, Escudero A, Ochoa V, et al. (2012) Plant Species Richness and Ecosystem Multifunctionality in Global Drylands. Science 335: 214–218.
Term           Coefficient   Std. Error   t-Value   p-Value
(Intercept)         1.0080        0.175     5.772    < 0.01
SR                  0.0099        0.004     2.351      0.02
SLO                 0.0176        0.006     3.139    < 0.01
SAC                -0.0174        0.002    -8.523    < 0.01
C1                 -0.0209        0.039    -0.537      0.59
C2                 -0.0677        0.053    -1.285      0.20
C3                  0.0348        0.036     0.979      0.33
C4                 -0.2663        0.038    -7.005    < 0.01
LAT                 0.0024        0.001     1.797      0.07
LONG               -0.0019        0.001    -3.474    < 0.01
ELE                -0.0002        0.000    -3.887    < 0.01
Adjusted R²: 0.56; p-Value < 0.01
A3: Linear Model
Term           Coefficient    CrVa R²   p-Value
-Full Model-                   52.5 %    < 0.01
(Intercept)          1.008    + 7.2 %    < 0.01
SR                   0.010    + 0.8 %      0.01
SLO                  0.018    + 1.7 %      0.01
SAC                 -0.017   + 16.3 %    < 0.01
C1                  -0.021    - 0.5 %      0.91
C2                  -0.068    + 0.0 %      0.15
C3                   0.035    - 0.2 %      0.28
C4                  -0.266   + 10.8 %    < 0.01
LAT                  0.002    + 0.2 %      0.09
LONG                -0.002    + 2.4 %    < 0.01
ELE                  0.000    + 3.0 %    < 0.01
A3: Random Forest Model
Term            CrVa R²   p-Value
-Full Model-     68.3 %    < 0.01
SR              + 1.2 %    < 0.01
SLO             - 1.3 %      0.95
SAC             + 4.0 %    < 0.01
C1              + 1.8 %    < 0.01
C2             - 0.04 %      0.02
C3              + 0.3 %      0.16
C4              + 0.6 %    < 0.01
LAT             + 0.5 %    < 0.01
LONG            + 0.2 %      0.02
ELE             + 0.4 %      0.02
[Figure: Relative Importance Predicting Productivity, on a 0 to 1 scale, for SR, SLO, SAC, C1, C2, C3, C4, and ELE. Series: Maestre et al. Relative Importance (Fig. 2A); Relative Importance using Random Forests.]
Applications Recap
Random forest models explained an additional 15-16% of the squared error relative to linear regression
Significantly altered inferences and conclusions about the underlying systems
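The recap figures can be traced back to the full-model CrVa R² values reported for the two applications. The short sketch below simply subtracts the linear-model value from the random-forest value for each data set; the dictionary keys are illustrative labels, and the numbers are those shown in the A3 tables earlier in the deck.

```python
# Full-model CrVa R² (in percent) as reported in the A3 tables.
cv_r2 = {
    "Housing Market": {"linear": 59.3, "random_forest": 74.3},
    "Environmental Productivity": {"linear": 52.5, "random_forest": 68.3},
}

# Gain of the random forest over the linear model, in percentage points.
gains = {
    name: round(values["random_forest"] - values["linear"], 1)
    for name, values in cv_r2.items()
}

for name, gain in gains.items():
    print(f"{name}: +{gain} percentage points")
```

Both differences fall between 15 and 16 percentage points.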
Questions?