Analysis of Residuals
©2005 Dr. B. C. Paul
Examining Residuals of Regression (From our Previous Example)
Set up your linear regression in the usual manner.
Selecting Plots
After setting your dependent and independent variables, and before clicking OK, click Plots instead.
Picking Residual Plots
Plot the residual on the Y axis against the predicted value on the X axis.
Ask for histograms and normal probability plots.
More Plots
Use the Next button to select another plot.
Then enter the residual on the Y axis against the dependent variable.
Finally, tell the computer to continue.
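For readers without the statistics package at hand, the quantities behind these plots can be sketched in a few lines of Python. This is only an illustration: the `weight` and `mpg` numbers are made-up placeholders, not the data from the example, and the helper names are hypothetical. The residual is taken as actual minus predicted, so a negative residual means the model over-predicted.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predicted_and_residuals(x, y):
    """Return the predicted values and the residuals (actual - predicted)."""
    slope, intercept = fit_line(x, y)
    predicted = [intercept + slope * xi for xi in x]
    residuals = [yi - pred for yi, pred in zip(y, predicted)]
    return predicted, residuals


# Hypothetical data (assumption): vehicle weight and gas mileage.
weight = [2.0, 2.5, 3.0, 3.5, 4.0, 4.5]
mpg = [33.0, 29.5, 26.0, 24.0, 20.5, 18.0]

predicted, residuals = predicted_and_residuals(weight, mpg)

# These pairs are what the two requested scatter plots would show:
# residual vs. predicted value, and residual vs. the dependent variable.
for actual, pred, res in zip(mpg, predicted, residuals):
    print(f"actual={actual:6.2f}  predicted={pred:6.2f}  residual={res:+6.2f}")
```

Plotting `residuals` against `predicted`, and then against `mpg`, gives the same two pictures requested in the dialog.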
You Will Still Get the Normal Tables we Saw Before
Scroll down to see what is new.
Some Abnormality in the Histogram
A histogram is a bar chart showing the number of results in different numeric intervals.
In this case we can see there may be two families of unexplained events, and one of them is causing the model to over-predict (note the negative tail).
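The counting behind a histogram can be sketched as follows. The residual values are invented for illustration (chosen so that a separate cluster sits in the negative tail), and `histogram_counts` is a hypothetical helper, not a function of any particular package.

```python
def histogram_counts(values, n_bins):
    """Count how many values fall into each of n_bins equal-width intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # The largest value belongs to the last bin, not a bin past the end.
        index = min(int((v - lo) / width), n_bins - 1)
        counts[index] += 1
    edges = [lo + i * width for i in range(n_bins + 1)]
    return edges, counts


# Hypothetical residuals (assumption): a main group near zero plus a
# smaller group of large negative values, i.e. over-predictions.
residuals = [-6.0, -5.5, -5.0, -1.0, -0.5, 0.0, 0.2, 0.5, 0.8, 1.0, 1.5, 2.0]

edges, counts = histogram_counts(residuals, 4)
for left, right, count in zip(edges, edges[1:], counts):
    print(f"[{left:5.1f}, {right:5.1f})  {'#' * count}")
```

With these made-up numbers the empty interval between the two groups is what would suggest two families of events.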
We Have a Cumulative Probability Plot
Cumulative probability counts all the samples that should have come up by a certain point (it is an integration of the probability distribution).
Normal would plot on a straight line. This is somewhat straight, but the slope at the center is wrong and the tails drift off. (More commentary on reading cumulative probability plots later.)
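One way such a plot can be built is sketched below, under stated assumptions: the observed cumulative probability of the i-th smallest residual is taken as (i - 0.5) / n, which is only one of several plotting-position conventions, and the expected value comes from a normal distribution with the residuals' own mean and standard deviation. The residuals are invented, and the software used in the example may compute its plot differently.

```python
from statistics import NormalDist, mean, stdev


def pp_plot_points(residuals):
    """Return (observed, expected) cumulative probabilities for each residual."""
    ordered = sorted(residuals)
    n = len(ordered)
    fitted = NormalDist(mean(ordered), stdev(ordered))
    points = []
    for rank, value in enumerate(ordered, start=1):
        observed = (rank - 0.5) / n    # fraction of samples seen so far
        expected = fitted.cdf(value)   # fraction a normal curve would predict
        points.append((observed, expected))
    return points


# Hypothetical residuals (assumption), skewed toward the negative side.
residuals = [-6.0, -5.5, -5.0, -1.0, -0.5, 0.0, 0.2, 0.5, 0.8, 1.0, 1.5, 2.0]

points = pp_plot_points(residuals)
for observed, expected in points:
    print(f"observed={observed:.3f}  expected={expected:.3f}")

# Normal residuals would keep observed and expected close together
# (points near the diagonal); a large gap signals a departure.
largest_gap = max(abs(o - e) for o, e in points)
print(f"largest gap from the diagonal: {largest_gap:.3f}")
```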
Look for Trends That Have Been Systematically Missed
This plot shows the residual (the amount we missed by) against the predicted value.
If there is a trend in the points, it may tell us what we missed.
In this case it is pretty scattered.
Missing Trends
We are still missing something because there is a definite trend in the residuals relative to the actual MPG.
We are missing a variable or factor (it might be linear).
Consider Another Data Set
We have an independent and a dependent variable.
(The data set could represent any problem we wished to model.)
Tell it to do a Regression of the Dependent against the Independent Variable.
Be sure we also ask for our residual plots.
Go to Results
The R^2 value is 0.996. Darn, 1.0 is a straight line. How much closer do you want to be?
This regression looks like it fits like a glove: the mean square (MS) for regression is 5 orders of magnitude greater than the MS for error.
The F statistic blows the null hypothesis off the map.
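How these three numbers relate can be sketched in Python. The data below are invented for illustration and are not the data set on the slide, so the printed values will not match the 0.996 reported above; `regression_anova` is a hypothetical helper name.

```python
def regression_anova(x, y):
    """R^2, mean squares and F statistic for a one-variable linear fit."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    predicted = [intercept + slope * xi for xi in x]

    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    ss_error = sum((yi - pred) ** 2 for yi, pred in zip(y, predicted))
    ss_regression = ss_total - ss_error

    ms_regression = ss_regression / 1   # 1 degree of freedom: the slope
    ms_error = ss_error / (n - 2)       # n - 2: slope and constant estimated
    return {
        "r_squared": ss_regression / ss_total,
        "ms_regression": ms_regression,
        "ms_error": ms_error,
        "f": ms_regression / ms_error,
    }


# Hypothetical, nearly linear data (assumption).
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0, 13.8, 16.1]

table = regression_anova(x, y)
for name, value in table.items():
    print(f"{name:14s} {value:12.4f}")
```

A large F means the regression mean square dwarfs the error mean square, which is the comparison made above. As the second data set goes on to show, it says nothing about whether the residuals still hide a pattern.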
No Chance the Slope or Constant are Zero
There is some evidence the distribution of residuals is a little skewed.
The residual distribution is definitely skewed off to one side.
Oh Boy – Can You See the Trend we missed here?
Here the residuals follow a clear and unmistakable shape of an effect we missed.
This Thing Has a Second Order or Curved Effect
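The same situation can be reproduced with a small made-up example: data generated from a curve with a modest second-order term, fitted with a straight line. The coefficients below are assumptions chosen for illustration, not the slide's data. The fit still reports a very high R^2, yet the signs of the residuals run positive, negative, positive across the range, which is the signature of a missed curved effect.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical curved data (assumption): a straight line plus a small x^2 term.
x = list(range(11))
y = [100 + 20 * xi + 0.5 * xi ** 2 for xi in x]

slope, intercept = fit_line(x, y)
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

mean_y = sum(y) / len(y)
ss_total = sum((yi - mean_y) ** 2 for yi in y)
ss_error = sum(r ** 2 for r in residuals)
r_squared = 1 - ss_error / ss_total

# One character per observation: '+' where the line under-predicts,
# '-' where it over-predicts.
signs = "".join("+" if r > 0 else "-" for r in residuals)

print(f"R^2 = {r_squared:.4f}")
print(f"residual signs across x: {signs}")
```

Randomly scattered residuals would mix their signs; an unbroken run in the middle with the opposite sign at both ends points to curvature.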
OK – Now What Do I Do?
Linear regression rapidly and quantitatively fits a simple linear function of one variable to another.
We noted that there had to be other effects present on the gas mileage, but the simple linear regression used here only handles one independent variable.
We also noted that sometimes there are second or higher order effects of a variable present, and a straight line just doesn't fit that.
We may want to have some more powerful tools to fall back on (we just try the easy stuff first).