Transcript
Page 1: Violations of Assumptions In Least Squares Regression

Violations of Assumptions In Least Squares Regression

Page 2: Violations of Assumptions In Least Squares Regression

Standard Assumptions in Regression

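• Errors are normally distributed with mean 0

• Errors have constant variance

• Errors are independent

• X is measured without error

$$\varepsilon \sim N(\mathbf{0}, \sigma^2 I), \qquad V(\varepsilon) = \sigma^2 I = \begin{bmatrix} \sigma^2 & 0 & \cdots & 0 \\ 0 & \sigma^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma^2 \end{bmatrix}$$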

Page 3: Violations of Assumptions In Least Squares Regression

Example Xs and OLS Estimators

 t   X0(t)  X1(t)
 1     1      0
 2     1      1
 3     1      2
 4     1      3
 5     1      4
 6     1      5
 7     1      6
 8     1      7
 9     1      8
10     1      9
11     1     10
12     1      0
13     1      1
14     1      2
15     1      3
16     1      4
17     1      5
18     1      6
19     1      7
20     1      8
21     1      9
22     1     10

“t” is used to imply time ordering

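Model: $Y = X\beta + \varepsilon$, with $\beta_0 = 50$, $\beta_1 = 10$, $t = 1, \ldots, n$

$$\hat\beta = (X'X)^{-1}X'Y, \qquad V(\hat\beta) = \sigma^2 (X'X)^{-1}$$

$$X'X = \begin{bmatrix} n & \sum_{t=1}^n X_t \\ \sum_{t=1}^n X_t & \sum_{t=1}^n X_t^2 \end{bmatrix}$$

$$\hat\beta_1 = \frac{\sum_{t=1}^n (X_t - \bar X)(Y_t - \bar Y)}{\sum_{t=1}^n (X_t - \bar X)^2} = \sum_{t=1}^n a_t Y_t, \qquad a_t = \frac{X_t - \bar X}{\sum_{t=1}^n (X_t - \bar X)^2}$$

$$\hat\beta_0 = \bar Y - \hat\beta_1 \bar X$$

$$V(\hat\beta_1) = \frac{\sigma^2}{\sum_{t=1}^n (X_t - \bar X)^2}$$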
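As a minimal sketch of the estimator formulas for this design (standard library only; the variable names are illustrative and not from the slides), the following builds the entries of X'X, its inverse, and the weights $a_t$ that express the slope as a linear combination of the responses:

```python
# Example design from the slide: X1 runs 0..10 twice (n = 22); X0 is the intercept column.
x = list(range(11)) * 2
n = len(x)

sum_x = sum(x)                      # off-diagonal element of X'X
sum_x2 = sum(v * v for v in x)      # lower-right element of X'X
x_bar = sum_x / n
sxx = sum((v - x_bar) ** 2 for v in x)

# Closed-form inverse of the 2x2 matrix X'X = [[n, sum_x], [sum_x, sum_x2]].
det = n * sum_x2 - sum_x ** 2
xtx_inv = [[sum_x2 / det, -sum_x / det],
           [-sum_x / det, n / det]]

# Weights a_t such that the slope estimate equals sum(a_t * Y_t).
a = [(v - x_bar) / sxx for v in x]


def ols_slope(y):
    """Least squares slope written as a linear combination of the responses."""
    return sum(a_t * y_t for a_t, y_t in zip(a, y))


# With noise-free responses the true slope is recovered up to rounding.
y_exact = [50 + 10 * v for v in x]
print(n, sum_x, sum_x2, sxx, ols_slope(y_exact))
```

The lower-right entry of the inverse equals 1 divided by the corrected sum of squares of X, which is why $V(\hat\beta_1)$ reduces to $\sigma^2 / \sum (X_t - \bar X)^2$.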

Page 4: Violations of Assumptions In Least Squares Regression

Non-Normal Errors (Centered Gamma)

Page 5: Violations of Assumptions In Least Squares Regression

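Errors = Gamma(2, 3.6742) - 7.3485 (shape 2, scale 3.6742, so the uncentered mean is 2(3.6742) = 7.3485 and the variance is 2(3.6742)² = 27)

• $Y_t = 50 + 10X_t + (\varepsilon_t - 7.35) = \beta_0^* + \beta_1 X_t + \varepsilon_t^*$

• $E(\varepsilon_t^*) = 0, \qquad V(\varepsilon_t^*) = 27$

$$V(\hat\beta) = 27\,(X'X)^{-1} = 27 \begin{bmatrix} 22 & 110 \\ 110 & 770 \end{bmatrix}^{-1} = \begin{bmatrix} 4.295455 & -0.6136364 \\ -0.6136364 & 0.1227273 \end{bmatrix}$$

$$SE(\hat\beta_1) = \sqrt{0.1227273} = 0.35032$$

Based on 100,000 simulations, the 95% CI for $\beta_1$ contained 10 in 95.05% of the samples.
Average = 9.99887, SD = 0.3502
Average s²(b1) = 0.1224804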
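The coverage experiment can be sketched along the following lines. This is an illustrative re-creation, not the code behind the slide: it uses 5,000 replications instead of 100,000, a fixed seed, the tabled critical value t(0.975, 20) = 2.086, and a gamma scale of sqrt(13.5) so that the error variance is exactly 27.

```python
import math
import random

random.seed(1)

x = list(range(11)) * 2                 # design from the slides, n = 22
n = len(x)
x_bar = sum(x) / n
sxx = sum((v - x_bar) ** 2 for v in x)

SHAPE, SCALE = 2.0, math.sqrt(13.5)     # mean 7.3485, variance 27
T_CRIT = 2.086                          # t(0.975, df = 20), taken from a t table
REPS = 5000                             # assumption: far fewer than the slide's 100,000

theory_var = SHAPE * SCALE ** 2 / sxx   # sigma^2 / Sxx

slopes, covered = [], 0
for _ in range(REPS):
    # Centered gamma errors: subtract the gamma mean SHAPE * SCALE.
    y = [50 + 10 * v + random.gammavariate(SHAPE, SCALE) - SHAPE * SCALE for v in x]
    y_bar = sum(y) / n
    b1 = sum((v - x_bar) * (w - y_bar) for v, w in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((w - b0 - b1 * v) ** 2 for v, w in zip(x, y))
    se_b1 = math.sqrt(sse / (n - 2) / sxx)
    slopes.append(b1)
    covered += abs(b1 - 10) <= T_CRIT * se_b1   # does the 95% CI contain 10?

mean_b1 = sum(slopes) / REPS
coverage = covered / REPS
print(f"theory V(b1) = {theory_var:.7f}, SE = {math.sqrt(theory_var):.5f}")
print(f"mean b1 = {mean_b1:.4f}, coverage = {coverage:.3f}")
```

With only a few thousand replications the coverage estimate carries Monte Carlo error of a few tenths of a percentage point, so it should land near, not exactly on, the slide's 95.05%.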

Page 6: Violations of Assumptions In Least Squares Regression

Non-Constant Error Variance (Heteroscedasticity)

• Mean: $E(Y|X) = \beta_0 + \beta_1 X = 50 + 10X$

• Standard Deviation: $\sigma_{Y|X} = X + 0.5$

• Distribution: Normal

 X   E(Y|X)  σ(Y|X)
 0     50      0.5
 1     60      1.5
 2     70      2.5
 3     80      3.5
 4     90      4.5
 5    100      5.5
 6    110      6.5
 7    120      7.5
 8    130      8.5
 9    140      9.5
10    150     10.5

Page 7: Violations of Assumptions In Least Squares Regression

Non-Constant Error Variance (Heteroscedasticity)

$$E(Y) = X\beta, \qquad V(Y) = \mathrm{diag}(\sigma_1^2, \ldots, \sigma_n^2) = \mathrm{diag}(0.5^2, 1.5^2, \ldots, 10.5^2, 0.5^2, 1.5^2, \ldots, 10.5^2)$$

$$\hat\beta = (X'X)^{-1}X'Y$$

$$E(\hat\beta) = (X'X)^{-1}X'E(Y) = (X'X)^{-1}X'X\beta = \beta$$

$$V(\hat\beta) = (X'X)^{-1}X'\,V(Y)\,X(X'X)^{-1} = \begin{bmatrix} 2.289773 & -0.59205 \\ -0.59205 & 0.218409 \end{bmatrix}$$

$$SE(\hat\beta_1) = \sqrt{0.218409} = 0.4673$$

Based on 100,000 simulations, the 95% CI for $\beta_1$ contained 10 in 92.62% of the samples.
Average = 9.998828, SD = 0.467182
Average s²(b1) = 0.1813113 < 0.2184
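The sandwich variance $(X'X)^{-1}X'V(Y)X(X'X)^{-1}$ can be checked numerically for this design. The sketch below assumes the standard deviation model $\sigma_{Y|X} = X + 0.5$ from the previous slide and uses the fact that V(Y) is diagonal, so each entry is a weighted sum. The last two lines are an addition not on the slides: they evaluate the expected value of the usual constant-variance estimate s²(b1) through $E(SSE) = \sum (1 - h_{tt})\sigma_t^2$, which suggests why the simulated average falls below the true variance.

```python
import math

x = list(range(11)) * 2
n = len(x)
x_bar = sum(x) / n
sxx = sum((v - x_bar) ** 2 for v in x)

var_y = [(v + 0.5) ** 2 for v in x]          # diagonal of V(Y)

# Rows of (X'X)^{-1} X' in simple regression:
# slope weights a_t and intercept weights c_t.
a = [(v - x_bar) / sxx for v in x]
c = [1 / n - x_bar * a_t for a_t in a]

# V(Y) is diagonal, so the sandwich reduces to weighted sums.
v_b0 = sum(c_t ** 2 * s2 for c_t, s2 in zip(c, var_y))
v_b1 = sum(a_t ** 2 * s2 for a_t, s2 in zip(a, var_y))
cov_b0_b1 = sum(c_t * a_t * s2 for c_t, a_t, s2 in zip(c, a, var_y))

# Expected value of the naive estimate MSE / Sxx (leverages h_tt).
h = [1 / n + (v - x_bar) ** 2 / sxx for v in x]
expected_naive_v_b1 = sum((1 - h_t) * s2 for h_t, s2 in zip(h, var_y)) / (n - 2) / sxx

print(f"V(b0) = {v_b0:.6f}, Cov = {cov_b0_b1:.5f}, V(b1) = {v_b1:.6f}")
print(f"SE(b1) = {math.sqrt(v_b1):.4f}, E[naive s2(b1)] = {expected_naive_v_b1:.5f}")
```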

Page 8: Violations of Assumptions In Least Squares Regression

Correlated Errors

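$$\varepsilon_t = \rho\,\varepsilon_{t-1} + \nu_t, \qquad \nu_t \overset{iid}{\sim} N(0, \sigma^2), \qquad t = 2, \ldots, n$$

Assuming:

i) $\varepsilon_1 \sim N\!\left(0, \dfrac{\sigma^2}{1 - \rho^2}\right)$

ii) $\nu_t$ independent of $\varepsilon_{t-1}$, $t \ge 2$

iii) $|\rho| < 1$:

$$V(\varepsilon) = \frac{\sigma^2}{1 - \rho^2} \begin{bmatrix} 1 & \rho & \rho^2 & \cdots & \rho^{n-1} \\ \rho & 1 & \rho & \cdots & \rho^{n-2} \\ \rho^2 & \rho & 1 & \cdots & \rho^{n-3} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \rho^{n-1} & \rho^{n-2} & \rho^{n-3} & \cdots & 1 \end{bmatrix}$$

Example: $\sigma^2 = 9$, $\rho = 0.5$, $n = 22$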
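A first-order autoregressive error series with these example values can be generated as in the sketch below. The function name `ar1_errors` and the replication count are illustrative choices; the first error is drawn from the stationary distribution with variance $\sigma^2/(1-\rho^2)$, and each later error adds a fresh $N(0, \sigma^2)$ innovation.

```python
import math
import random


def ar1_errors(n, sigma2, rho, rng):
    """Draw e_1..e_n with e_1 stationary and e_t = rho * e_{t-1} + nu_t, nu_t ~ N(0, sigma2)."""
    e = [rng.gauss(0.0, math.sqrt(sigma2 / (1 - rho ** 2)))]
    for _ in range(n - 1):
        e.append(rho * e[-1] + rng.gauss(0.0, math.sqrt(sigma2)))
    return e


rng = random.Random(0)
draws = [ar1_errors(22, 9.0, 0.5, rng) for _ in range(4000)]

# Empirical variance of the last error and its covariance with the one before it.
last = [d[-1] for d in draws]
prev = [d[-2] for d in draws]
var_last = sum(v * v for v in last) / len(last)
cov_lag1 = sum(u * v for u, v in zip(prev, last)) / len(last)
print(f"variance of e_22: {var_last:.2f}, lag-1 covariance: {cov_lag1:.2f}")
```

Under the stated model the theoretical values are 9 / (1 - 0.25) = 12 for the variance and 0.5(12) = 6 for the lag-1 covariance, so the printed estimates should be close to those.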

Page 9: Violations of Assumptions In Least Squares Regression

Correlated Errors

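$$E(Y) = X\beta, \qquad V(Y) = V(\varepsilon) = \frac{9}{1 - 0.5^2} \begin{bmatrix} 1 & 0.5 & 0.5^2 & \cdots & 0.5^{21} \\ 0.5 & 1 & 0.5 & \cdots & 0.5^{20} \\ 0.5^2 & 0.5 & 1 & \cdots & 0.5^{19} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0.5^{21} & 0.5^{20} & 0.5^{19} & \cdots & 1 \end{bmatrix}$$

$$\hat\beta = (X'X)^{-1}X'Y$$

$$E(\hat\beta) = (X'X)^{-1}X'E(Y) = (X'X)^{-1}X'X\beta = \beta$$

$$V(\hat\beta) = (X'X)^{-1}X'\,V(Y)\,X(X'X)^{-1} = \begin{bmatrix} 3.841111 & -0.46078 \\ -0.46078 & 0.092157 \end{bmatrix}$$

$$SE(\hat\beta_1) = \sqrt{0.092157} = 0.303573$$

Based on 100,000 simulations, the 95% CI for $\beta_1$ contained 10 in 84.56% of the samples.
Average = 10.00216, SD = 0.3039251
Average s²(b1) = 0.0476444 < 0.092157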
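The variance matrix for the correlated-errors example can be reproduced with a short computation. This sketch assumes the AR(1) covariance $V_{ij} = \frac{\sigma^2}{1-\rho^2}\rho^{|i-j|}$ with $\sigma^2 = 9$, $\rho = 0.5$ and the 22-point design from the earlier slide. The final quantity, the expected value of the naive estimate s²(b1) obtained from $E(SSE) = \mathrm{tr}[(I - H)V]$, is an addition that is not on the slides.

```python
import math

x = list(range(11)) * 2
n = len(x)
x_bar = sum(x) / n
sxx = sum((v - x_bar) ** 2 for v in x)

sigma2, rho = 9.0, 0.5
marginal_var = sigma2 / (1 - rho ** 2)       # 12
V = [[marginal_var * rho ** abs(i - j) for j in range(n)] for i in range(n)]

# Rows of (X'X)^{-1} X': slope weights a_t and intercept weights c_t.
a = [(v - x_bar) / sxx for v in x]
c = [1 / n - x_bar * a_t for a_t in a]


def quad(u, w):
    """Return u' V w for weight vectors u and w."""
    return sum(u[i] * V[i][j] * w[j] for i in range(n) for j in range(n))


v_b0, cov_b0_b1, v_b1 = quad(c, c), quad(c, a), quad(a, a)

# tr(HV) with H_ij = 1/n + (x_i - x_bar)(x_j - x_bar)/Sxx.
ones = [1.0] * n
trace_hv = quad(ones, ones) / n + sxx * v_b1
expected_naive_v_b1 = (n * marginal_var - trace_hv) / (n - 2) / sxx

print(f"V(b0) = {v_b0:.6f}, Cov = {cov_b0_b1:.5f}, V(b1) = {v_b1:.6f}")
print(f"SE(b1) = {math.sqrt(v_b1):.6f}, E[naive s2(b1)] = {expected_naive_v_b1:.5f}")
```

The naive estimate is expected to be roughly half the true variance of the slope, which is in line with the confidence interval coverage dropping well below 95%.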

Page 10: Violations of Assumptions In Least Squares Regression

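Measurement Error in X

• Z = True value of independent variable (unobserved)

• X = Observed value of independent variable

• X = Z + U (Z can be fixed or random)

• Z, U independent (assumed) when Z is random

• V(U) is independent of Z when Z is fixed

$$U \sim N(0, \sigma_u^2)$$

Z random: $Z \sim N(\mu_z, \sigma_z^2)$

$$E(\hat\beta_1) \approx \beta_1 \frac{\sigma_z^2}{\sigma_z^2 + \sigma_u^2} = \frac{\beta_1}{1 + \sigma_u^2 / \sigma_z^2}$$

Z fixed (e.g. in this example):

$$E(\hat\beta_1) \approx \frac{\beta_1}{1 + \dfrac{n \sigma_u^2}{\sum_{i=1}^n (Z_i - \bar Z)^2}}$$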

Page 11: Violations of Assumptions In Least Squares Regression

Measurement Error in X

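$$X_i = Z_i + U_i, \qquad \sigma^2 = 9, \quad \sigma_u^2 = 1, \quad n = 22, \quad \sum_{i=1}^n (Z_i - \bar Z)^2 = 220$$

$$E(\hat\beta_1) \approx \frac{\beta_1}{1 + \dfrac{n \sigma_u^2}{\sum_{i=1}^n (Z_i - \bar Z)^2}} = \frac{10}{1 + \dfrac{22(1)}{220}} = \frac{10}{1.1} = 9.09$$

Without measurement error: $\hat\beta_1 \sim N(10, 0.20226^2)$

Based on 100,000 simulations, the 95% CI for $\beta_1$ contained 10 in 76.72% of the samples.
Average = 9.197568, SD = 0.6076758
Average s²(b1) = 0.4283653 $\gg (0.20226)^2$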
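The attenuation of the slope under measurement error can be illustrated with a small simulation. The setup below is an assumption, since the slides do not show their code: the true Z values are taken to be the 22-point design used earlier (0 to 10, twice), the errors are normal with $\sigma = 3$ and $\sigma_u = 1$, and only 5,000 replications are run with a fixed seed.

```python
import math
import random

random.seed(2)

z = list(range(11)) * 2                    # assumed true values, n = 22
n = len(z)
z_bar = sum(z) / n
szz = sum((v - z_bar) ** 2 for v in z)     # 220

SIGMA, SIGMA_U = 3.0, 1.0                  # sd of the regression error and of U
REPS = 5000                                # assumption: fewer than the slide's 100,000

# Approximation for fixed Z: beta1 / (1 + n * sigma_u^2 / Szz).
approx_mean = 10 / (1 + n * SIGMA_U ** 2 / szz)

slopes = []
for _ in range(REPS):
    x = [v + random.gauss(0, SIGMA_U) for v in z]           # observed X = Z + U
    y = [50 + 10 * v + random.gauss(0, SIGMA) for v in z]   # Y depends on the true Z
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((v - x_bar) ** 2 for v in x)
    sxy = sum((v - x_bar) * (w - y_bar) for v, w in zip(x, y))
    slopes.append(sxy / sxx)                                # regress Y on observed X

mean_b1 = sum(slopes) / REPS
sd_b1 = math.sqrt(sum((b - mean_b1) ** 2 for b in slopes) / (REPS - 1))
print(f"approximate E(b1) = {approx_mean:.2f}")
print(f"simulated mean b1 = {mean_b1:.3f}, SD = {sd_b1:.3f}")
```

The simulated mean should fall clearly below the true slope of 10, and the spread of the estimates should be well above the 0.20226 that applies without measurement error.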

