Assumption checking in “normal” multiple regression with Stata


Assumptions in regression analysis

• No multi-collinearity
• All relevant predictor variables included
• Homoscedasticity: all residuals are from a distribution with the same variance
• Linearity: the “true” model should be linear
• Independent errors: having information about the value of a residual should not give you information about the value of other residuals
• Errors are distributed normally

FIRST THE ONE THAT LEADS TO NOTHING NEW IN STATA (NOTE: SLIDE TAKEN LITERALLY FROM MMBR)

Independent errors: having information about the value of a residual should not give you information about the value of other residuals

Detect: ask yourself whether it is likely that knowledge about one residual would tell you something about the value of another residual.
Typical cases:
- repeated measures
- clustered observations (people within firms / pupils within schools)

Consequences: as for heteroscedasticity. Usually, your confidence intervals are estimated too narrow (think about why that is!).

Cure: use multi-level analyses
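
A minimal sketch of what this cure could look like in Stata. Everything in it is an assumption that is not on the slides: auto.dta has no clustering, so the sketch uses the `nlswork` example data (repeated yearly observations per person, identified by `idcode`), which `webuse` downloads from the Stata website. The `mixed` command assumes Stata 13 or later.

```stata
* Assumption: nlswork stands in for clustered data (many rows per person)
webuse nlswork, clear

* Naive OLS: treats every person-year row as an independent observation
regress ln_wage age tenure

* Same coefficients, but standard errors allow correlation within a person
regress ln_wage age tenure, vce(cluster idcode)

* Multi-level (random-intercept) model: a separate intercept per person
mixed ln_wage age tenure || idcode:
```

Comparing the standard errors of the first two regressions shows how much the independence assumption matters for the confidence intervals.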

In Stata
Example: the Stata “auto.dta” data set

sysuse auto

corr (correlation)
vif (variance inflation factors)
ovtest (omitted variable test)
hettest (heteroscedasticity test)
predict e, resid
swilk (test for normality)
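
Put together, the commands above could be run as one session along the following lines. This is a sketch under assumptions: the model `price` on `mpg`, `weight` and `foreign` is chosen only for illustration, and the postestimation commands are written in their current documented form (`estat vif`, `estat ovtest`, `estat hettest`) rather than the short names used on the slide.

```stata
* Load the example data that ships with Stata
sysuse auto, clear

* Correlation matrix of the predictors (model choice is an illustration)
correlate mpg weight foreign

* Fit the regression; the checks below all refer to this model
regress price mpg weight foreign

estat vif        // variance inflation factors (multi-collinearity)
estat ovtest     // Ramsey RESET test (omitted variables / misspecification)
estat hettest    // Breusch-Pagan / Cook-Weisberg test (heteroscedasticity)

* Store the residuals and test them for normality
predict e, residuals
swilk e          // Shapiro-Wilk test
```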

Finding the commands

• “help regress”
• “regress postestimation”

and you will find most of them (and more) there

Multi-collinearity
A strong correlation between two or more of your predictor variables

You don’t want it, because:
1. It is more difficult to get higher R’s
2. The importance of predictors can be difficult to establish (b-hats tend to go to zero)
3. The estimates for b-hats are unstable under slightly different regression attempts (“bouncing beta’s”)

Detect:
1. Look at the correlation matrix of the predictor variables
2. Calculate VIF-factors while running the regression

Cure: remove variables until the multi-collinearity disappears, or combine the correlated variables into a single variable

Stata: calculating the correlation matrix (“corr”) and VIF statistics (“vif”)
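
A short sketch of both detection steps on auto.dta. The choice of predictors is an assumption made for illustration: `weight` and `length` are included together because they can be expected to be strongly correlated. The threshold mentioned in the comment is a commonly quoted rule of thumb, not a claim from the slides.

```stata
sysuse auto, clear

* Step 1: correlation matrix of the predictor variables
correlate mpg weight length

* Step 2: run the regression, then ask for the VIFs
regress price mpg weight length
estat vif        // a VIF above roughly 10 is often read as a warning sign

* One possible cure: drop one of the overlapping predictors and re-check
regress price mpg weight
estat vif
```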

Misspecification tests (replaces: all relevant predictor variables included)
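
As a sketch of what such tests could look like in Stata: `estat ovtest` runs the Ramsey RESET test, which adds powers of the fitted values to the model and tests whether they help. `linktest` is a related specification check that the slides do not mention and is included here as an assumption. The regression itself is an arbitrary illustration.

```stata
sysuse auto, clear
regress price mpg weight foreign

* Ramsey RESET: H0 is that the model has no omitted variables
estat ovtest

* Link test: refits the outcome on the prediction and its square;
* a significant _hatsq term points to misspecification
linktest
```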

Homoscedasticity: all residuals are from a distribution with the same variance

Consequences: Heteroscedasticity does not necessarily lead to biases in your estimated coefficients (b-hat), but it does lead to biases in the estimate of the width of the confidence interval, and the estimation procedure itself is not efficient.

Testing for heteroscedasticity in Stata

• Your residuals should have the same variance for all fitted values of Y: hettest
• Your residuals should have the same variance for all values of X: hettest, rhs
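
Both tests can be sketched as follows, using the current `estat` form of the command. The model is again an arbitrary illustration on auto.dta, and the residual-versus-fitted plot is an addition that is not on the slide.

```stata
sysuse auto, clear
regress price mpg weight foreign

* H0: constant variance; tested against the fitted values of price
estat hettest

* Same test, but against the right-hand-side variables of the model
estat hettest, rhs

* Visual check: residuals against fitted values; a funnel shape
* suggests that the variance changes with the prediction
rvfplot, yline(0)
```

In both tests a small p-value speaks against homoscedasticity.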

Errors distributed normally

Errors are distributed normally (just the errors, not the variables themselves!)

Detect: look at the residual plots, test for normality

Consequences (rule of thumb): if n > 600, no problem. Otherwise confidence intervals can be wrong.

Cure: try to fit a better model, or use more difficult ways of modeling instead (ask an expert).

First calculate the errors:
predict e, resid

Then test for normality:
swilk e
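
The two steps, together with the residual plots mentioned under “Detect”, might look like this in one run. The regression is an assumed example on auto.dta; the variable name `e` follows the slide.

```stata
sysuse auto, clear
regress price mpg weight foreign

* Residuals of the model just fitted, stored in a new variable e
predict e, residuals

* Shapiro-Wilk test: H0 is that e comes from a normal distribution
swilk e

* Plots: quantiles of e against normal quantiles, and a histogram
* with a normal density drawn on top
qnorm e
histogram e, normal
```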

