KNN Ch. 3: Diagnostics and Remedial Measures
Applied Regression Analysis, BUSI 6220
Diagnostics for the Predictor Variable
Dot plots, sequence plots, stem-and-leaf plots
These plots mainly check for outlying observations, which is useful in later diagnosis.
Residual Analysis: Why Look at the Residuals?
To detect non-linearity of the regression function, heteroscedasticity (lack of constant variance), auto-correlation, outliers, non-normality, and important predictor variables left out.
Regression model assumptions: errors are independent (have zero covariance), have constant variance, and are normally distributed.
Diagnostics for Residuals
To detect non-linearity of the regression function, heteroscedasticity, auto-correlation, outliers, non-normality, and omitted predictor variables, examine the following plots of residuals:
1. Against the predictor (if X1 is the only predictor)
2. Absolute or squared residuals against the predictor
3. Against fitted values (when there are many Xi)
4. Against time
5. Against omitted predictor variables
6. Box plot
7. Normal probability plot
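As a minimal sketch of plots 1-3 (the data here are hypothetical), the residuals and fitted values from a simple OLS fit can be computed as follows; the standard diagnostic plots are then residuals vs. X, residuals vs. fitted values, and absolute residuals vs. X:

```python
import numpy as np

# Toy data (hypothetical) to illustrate the standard residual plots
rng = np.random.default_rng(7)
x = np.linspace(0, 10, 50)
y = 3 + 2 * x + rng.normal(0, 1, 50)

# Simple OLS fit
b1 = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
b0 = y.mean() - b1 * x.mean()
fitted = b0 + b1 * x
resid = y - fitted

# With an intercept, OLS residuals sum to (essentially) zero and are
# uncorrelated with X; plot resid vs. x, resid vs. fitted, and
# np.abs(resid) vs. x for diagnostics 1-3.
```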
Diagnostics for Residuals
Approximate expected value of the kth smallest residual:

E[e(k)] ≈ √MSE · z[(k − 0.375) / (n + 0.25)]

where z(A) is the A-th percentile of the standard normal distribution. Plotting the ordered residuals against these expected values gives the normal probability plot.
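The formula above can be sketched directly (the residuals here are simulated stand-ins, assuming a model with 2 estimated parameters):

```python
import numpy as np
from scipy import stats

def expected_ordered_residuals(e, mse):
    """Approximate E[e_(k)] = sqrt(MSE) * z[(k - 0.375) / (n + 0.25)]."""
    n = len(e)
    k = np.arange(1, n + 1)
    return np.sqrt(mse) * stats.norm.ppf((k - 0.375) / (n + 0.25))

# Hypothetical residuals from a model with p = 2 parameters
rng = np.random.default_rng(0)
e = rng.normal(0, 2, size=20)
mse = np.sum(e ** 2) / (len(e) - 2)
exp_vals = expected_ordered_residuals(e, mse)
# Normal probability plot: np.sort(e) against exp_vals; a roughly
# straight line supports the normality assumption.
```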
Tests involving Residuals
The Correlation Test for Normality
H0: The residuals are normal
HA: The residuals are not normal
Compute the correlation between the ordered residuals ei and their expected values under normality. Use Table B.6: the observed coefficient of correlation should be at least as large as the table value for a given level of significance to retain H0.
Tests involving Residuals
Other Tests for Normality
H0: The residuals are normal
HA: The residuals are not normal
Anderson-Darling (very powerful; may be used for small sets, n < 25)
Ryan-Joiner
Shapiro-Wilk
Kolmogorov-Smirnov
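Two of these tests are available in scipy; a minimal sketch on simulated residuals (Ryan-Joiner, essentially the correlation test above, is provided by Minitab rather than scipy):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
e = rng.normal(size=40)                  # stand-in for model residuals

w, p_sw = stats.shapiro(e)               # Shapiro-Wilk
ad = stats.anderson(e, dist='norm')      # Anderson-Darling

# Shapiro-Wilk: a small p-value means reject normality.
# Anderson-Darling: compare ad.statistic against ad.critical_values
# at the chosen significance level (15%, 10%, 5%, 2.5%, 1%).
```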
Tests involving Residuals (Constancy of Error Variance)
The Modified Levene Test
Partitions the observations by the independent variable into two groups (high X values and low X values), then tests the null
H0: The groups have equal error variances
Similar to a pooled-variance t-test for a difference in two means of independent samples. It is robust to departures from normality of the error terms. A large sample size is essential so that dependencies of the error terms on each other can be neglected. It uses the group "median" instead of the "mean" (why?).
Tests involving Residuals (Constancy of Error Variance)
The Modified Levene Test
Let d_i1 = |e_i1 − ẽ1| and d_i2 = |e_i2 − ẽ2|, where ẽ1 and ẽ2 are the medians of the residuals in the two groups. Now the d_i1 and d_i2 are the data points, i.e., the t-test is based on these two sets of data points:

t* = (d̄1 − d̄2) / (s · √(1/n1 + 1/n2))

where s² = [(n1 − 1)s1² + (n2 − 1)s2²] / (n1 + n2 − 2)
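The statistic above can be sketched as follows (the residual groups are simulated, with the high-X group given a visibly larger spread):

```python
import numpy as np
from scipy import stats

def modified_levene_t(e1, e2):
    """Two-sample t-test on absolute deviations from the group medians."""
    d1 = np.abs(e1 - np.median(e1))
    d2 = np.abs(e2 - np.median(e2))
    n1, n2 = len(d1), len(d2)
    # Pooled variance of the absolute deviations
    s2 = ((n1 - 1) * d1.var(ddof=1) + (n2 - 1) * d2.var(ddof=1)) / (n1 + n2 - 2)
    t = (d1.mean() - d2.mean()) / np.sqrt(s2 * (1 / n1 + 1 / n2))
    p = 2 * stats.t.sf(abs(t), n1 + n2 - 2)
    return t, p

# Hypothetical residual groups: high-X residuals have larger spread
rng = np.random.default_rng(3)
e_low = rng.normal(0, 1, 25)
e_high = rng.normal(0, 3, 25)
t_star, p = modified_levene_t(e_low, e_high)
# Small p => reject H0 of equal error variances.
```

For comparison, `scipy.stats.levene(e_low, e_high, center='median')` implements the closely related Brown-Forsythe F form of the same idea.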
Read "Comments" on page 118 and go through the Breusch-Pagan test on page 119.
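As a preview of the Breusch-Pagan test, here is a sketch of the version in the text (regress e² on X, then compare (SSR*/2)/(SSE/n)² to a chi-square with 1 df); the residuals below are simulated directly rather than taken from a fitted model:

```python
import numpy as np
from scipy import stats

def breusch_pagan(x, e):
    """Breusch-Pagan statistic: regress e^2 on X,
    then X2_BP = (SSR*/2) / (SSE/n)^2, compared to chi-square(1)."""
    n = len(e)
    e2 = e ** 2
    # Simple OLS of e^2 on x
    b1 = np.cov(x, e2, ddof=1)[0, 1] / np.var(x, ddof=1)
    fitted = e2.mean() + b1 * (x - x.mean())
    ssr_star = np.sum((fitted - e2.mean()) ** 2)   # regression SS of e^2 on x
    sse = np.sum(e2)                               # SSE of the original model
    x2 = (ssr_star / 2) / (sse / n) ** 2
    return x2, stats.chi2.sf(x2, df=1)

# Hypothetical heteroscedastic residuals: spread grows with x
rng = np.random.default_rng(4)
x = np.linspace(1, 10, 100)
e = x * rng.normal(0, 1, 100)
x2, p = breusch_pagan(x, e)
# Small p => reject constancy of the error variance.
```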
F Test for Lack of Fit
A comparison of the "full model" sum of squares error and the "lack of fit" sum of squares.
For best results, requires repeat observations at, at least, one X level.
Full model: Y_ij = μ_j + ε_ij (μ_j = mean response when X = X_j)
Reduced model: Y_ij = β0 + β1·X_j + ε_ij
(Why "reduced"?)
F Test for Lack of Fit
SSE(Full) = SSPE = Σ_j Σ_i (Y_ij − Ȳ_j)²
(Labeled "pure error" since it is an unbiased estimator of the true error variance. See 3.31 and 3.32, page 123.)
SSLF = SSE(Reduced) − SSPE, where SSE(Reduced) = SSE from the ordinary least squares regression model.
Test statistic:

F* = [SSLF / (c − p)] / [SSPE / (n − c)]

where c is the number of distinct X levels and n the total number of observations. (What is "p"?)
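The decomposition and F* can be sketched on a small hypothetical data set with repeat observations at each X level:

```python
import numpy as np
from scipy import stats

# Toy data (hypothetical) with repeats at each X level
x = np.array([1, 1, 2, 2, 3, 3, 4, 4], dtype=float)
y = np.array([2.1, 1.9, 4.2, 3.8, 5.9, 6.1, 9.2, 8.8])
n, p = len(y), 2                          # p = parameters in the linear model
levels = np.unique(x)
c = len(levels)                           # number of distinct X levels

# Reduced model: ordinary least squares line
b1 = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
b0 = y.mean() - b1 * x.mean()
sse = np.sum((y - (b0 + b1 * x)) ** 2)    # SSE(Reduced)

# Full model: level means give the pure-error sum of squares
sspe = sum(np.sum((y[x == lv] - y[x == lv].mean()) ** 2) for lv in levels)
sslf = sse - sspe                         # lack-of-fit sum of squares

f_star = (sslf / (c - p)) / (sspe / (n - c))
p_val = stats.f.sf(f_star, c - p, n - c)  # compare to F(c-p, n-c)
```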
Be sure to compare the ANOVA table on page 126 with the OLS ANOVA table.
Overview of Some Remedial Measures
The problem: simple linear regression is not appropriate.
The solution:
1. Abandon the model ("Eagle to Hawk; abort mission and return to base.")
2. Remedy the situation:
If non-independent error terms, then work with a model that calls for correlated error terms (Ch. 12).
If heteroscedasticity, then use the WLS method to estimate parameters (Ch. 10) or use transformations of the data.
If the scatter plot indicates non-linearity, then either use a non-linear regression function (Ch. 7) or transform to linear.
NEXT: We will look at one such powerful transformation method.
The Box-Cox Transformation Method
The family of power transforms on Y is given as: Y' = Y^λ
The family easily includes simple transforms such as the square root, the square, etc.
By definition, when λ = 0, then Y' = log_e(Y).
When the response variable is so transformed, the normal error regression model becomes: Y_i^λ = β0 + β1·X_i + ε_i
We would like to determine the "best" value of λ.
Method 1: Maximum likelihood estimation
Maximize over λ, β0, β1, σ²:

L = [1 / (2πσ²)^(n/2)] · exp{ −(1/2σ²) · Σ_{i=1..n} (Y_i^λ − β0 − β1·X_i)² }
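As a quick illustration, `scipy.stats.boxcox` returns the maximum-likelihood λ for the no-predictor case (the regression version above is what Method 2 handles via the SSE search); the data here are simulated and hypothetical:

```python
import numpy as np
from scipy import stats

# Right-skewed, positive data: lognormal, so log(Y) is exactly normal
rng = np.random.default_rng(5)
y = rng.lognormal(mean=1.0, sigma=0.5, size=200)

y_t, lam = stats.boxcox(y)   # MLE of lambda (no predictors)
# For lognormal data the estimated lambda should be near 0,
# pointing to the log transform.
```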
The Box-Cox Transformation Method
Method 2: Numerical search
Step 1: Set a value of λ.
Step 2: Standardize the Y_i observations:
If λ ≠ 0 then W_i = K1·(Y_i^λ − 1)
If λ = 0 then W_i = K2·log_e(Y_i)
where K2 = (Π_{i=1..n} Y_i)^(1/n) (the geometric mean of the Y_i) and K1 = 1 / (λ·K2^(λ−1)).
Step 3: Now regress the set W on the set X.
Step 4: Note the corresponding SSE.
Step 5: Change λ and repeat steps 2 to 4 until the lowest SSE is obtained.
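The five steps above can be sketched as a grid search (the data are hypothetical, built so that log(Y) is linear in X and the best λ should land near 0):

```python
import numpy as np

def boxcox_sse(x, y, lam):
    """SSE from regressing the standardized W on X for a given lambda
    (steps 2-4 of the numerical search)."""
    k2 = np.exp(np.mean(np.log(y)))          # geometric mean of the Y_i
    if abs(lam) > 1e-8:
        k1 = 1.0 / (lam * k2 ** (lam - 1))
        w = k1 * (y ** lam - 1)
    else:                                     # lambda = 0 case
        w = k2 * np.log(y)
    b1 = np.cov(x, w, ddof=1)[0, 1] / np.var(x, ddof=1)
    b0 = w.mean() - b1 * x.mean()
    return np.sum((w - (b0 + b1 * x)) ** 2)

# Hypothetical data where log(Y) is linear in X
rng = np.random.default_rng(6)
x = np.linspace(1, 10, 100)
y = np.exp(0.3 * x + rng.normal(0, 0.2, 100))

grid = np.linspace(-1.0, 1.0, 21)             # steps 1 and 5: lambda grid
sse = [boxcox_sse(x, y, lam) for lam in grid]
best = float(grid[int(np.argmin(sse))])       # lambda with the lowest SSE
```

Standardizing to W keeps the SSE values comparable across different λ, which is the whole point of the K1 and K2 scaling.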
Let's try both methods with the GMAT data. What should we get as the best λ?