Violations of Assumptions In Least Squares Regression


Page 1: Violations of Assumptions In Least Squares Regression

Violations of Assumptions In Least Squares Regression

Page 2: Violations of Assumptions In Least Squares Regression

Standard Assumptions in Regression

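• Errors are normally distributed with mean 0

• Errors have constant variance

• Errors are independent

• X is measured without error

$$V(\boldsymbol{\varepsilon}) = \begin{bmatrix} \sigma^2 & 0 & \cdots & 0 \\ 0 & \sigma^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma^2 \end{bmatrix} = \sigma^2 \mathbf{I}$$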

Page 3: Violations of Assumptions In Least Squares Regression

Example Xs and OLS Estimators

t    X0(t)  X1(t)
1    1      0
2    1      1
3    1      2
4    1      3
5    1      4
6    1      5
7    1      6
8    1      7
9    1      8
10   1      9
11   1      10
12   1      0
13   1      1
14   1      2
15   1      3
16   1      4
17   1      5
18   1      6
19   1      7
20   1      8
21   1      9
22   1      10

“t” is used to imply time ordering

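$$Y_t = \beta_0 + \beta_1 X_t + \varepsilon_t, \quad t = 1,\ldots,n, \qquad \beta_0 = 50,\ \beta_1 = 10$$

$$\mathbf{Y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon}, \qquad \mathbf{X}'\mathbf{X} = \begin{bmatrix} n & \sum_{t=1}^{n} X_t \\ \sum_{t=1}^{n} X_t & \sum_{t=1}^{n} X_t^2 \end{bmatrix}$$

$$\hat{\boldsymbol{\beta}} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{Y}, \qquad V(\hat{\boldsymbol{\beta}}) = \sigma^2 (\mathbf{X}'\mathbf{X})^{-1}$$

$$\hat{\beta}_1 = \frac{\sum_{t=1}^{n} (X_t - \bar{X})(Y_t - \bar{Y})}{\sum_{t=1}^{n} (X_t - \bar{X})^2} = \sum_{t=1}^{n} a_t Y_t, \qquad a_t = \frac{X_t - \bar{X}}{\sum_{s=1}^{n} (X_s - \bar{X})^2}$$

$$\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}, \qquad V(\hat{\beta}_1) = \frac{\sigma^2}{\sum_{t=1}^{n} (X_t - \bar{X})^2}$$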
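As a rough illustration of the matrix form of the estimator, the sketch below builds the example design (an intercept column and X1 = 0, ..., 10 repeated twice) and evaluates X'X and (X'X)^{-1}X'Y with NumPy. The helper name `ols` and the noise-free response used for the check are assumptions made for this example, not part of the slides.

```python
import numpy as np

# Example design: intercept column X0 = 1 and X1 = 0..10 repeated twice (n = 22).
x1 = np.tile(np.arange(11.0), 2)
X = np.column_stack([np.ones_like(x1), x1])

# X'X has the form [[n, sum X_t], [sum X_t, sum X_t^2]].
XtX = X.T @ X
XtX_inv = np.linalg.inv(XtX)


def ols(X, y):
    """Hypothetical helper: OLS coefficients (X'X)^{-1} X'y via a linear solve."""
    return np.linalg.solve(X.T @ X, X.T @ y)


# Noise-free check: with Y = 50 + 10 X the estimator returns (50, 10).
y = 50.0 + 10.0 * x1
beta_hat = ols(X, y)

print(XtX)
print(beta_hat)
```

Multiplying `XtX_inv` by an error variance gives V(β̂) under the standard assumptions; for instance, a variance of 27 yields a slope variance of about 0.1227.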

Page 4: Violations of Assumptions In Least Squares Regression

Non-Normal Errors (Centered Gamma)

Page 5: Violations of Assumptions In Least Squares Regression

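Errors = Gamma(2, 3.6742) − 7.3485 (shape 2, scale 3.6742, so the uncentered mean is 7.3485 and the variance is 27)

• Y_t = 50 + 10X_t + (ε_t − 7.35) = β_0* + β_1 X_t + ε_t*

• E(ε_t*) = 0, V(ε_t*) = 27

$$V(\hat{\boldsymbol{\beta}}) = \sigma^2 (\mathbf{X}'\mathbf{X})^{-1} = 27 \begin{bmatrix} 22 & 110 \\ 110 & 770 \end{bmatrix}^{-1} = \begin{bmatrix} 4.295455 & -0.6136364 \\ -0.6136364 & 0.1227273 \end{bmatrix}$$

$$SE(\hat{\beta}_1) = \sqrt{0.1227273} = 0.35032$$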

Based on 100,000 simulations, the 95% CI for β1 contained 10 in 95.05% of the samples. Average of the estimates = 9.99887, SD = 0.3502. Average s²(b1) = 0.1224804.
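A simulation of this kind can be sketched as follows. This is not the original simulation code: the seed, the smaller replication count (20,000 rather than 100,000), and the gamma scale sqrt(13.5) ≈ 3.674 (chosen so that the errors have mean 7.3485 before centering and variance 27) are assumptions. The critical value 2.086 is the 0.975 quantile of a t distribution with n − 2 = 20 degrees of freedom.

```python
import numpy as np

rng = np.random.default_rng(0)      # assumed seed, for reproducibility only
reps, n = 20_000, 22                # fewer replications than the 100,000 quoted
x = np.tile(np.arange(11.0), 2)     # X1 = 0..10 repeated twice
xc = x - x.mean()
sxx = (xc ** 2).sum()               # 220
T_CRIT = 2.086                      # t(0.975, df = 20)

# Centered gamma errors: shape 2, scale sqrt(13.5) -> mean 0, variance 27.
scale = np.sqrt(13.5)
eps = rng.gamma(shape=2.0, scale=scale, size=(reps, n)) - 2.0 * scale
y = 50.0 + 10.0 * x + eps

# OLS slope, intercept and estimated slope variance for every simulated sample.
b1 = (y @ xc) / sxx
b0 = y.mean(axis=1) - b1 * x.mean()
resid = y - b0[:, None] - b1[:, None] * x
s2_b1 = (resid ** 2).sum(axis=1) / (n - 2) / sxx

coverage = np.mean(np.abs(b1 - 10.0) <= T_CRIT * np.sqrt(s2_b1))
print(f"coverage={coverage:.4f} mean={b1.mean():.4f} "
      f"sd={b1.std(ddof=1):.4f} mean s2(b1)={s2_b1.mean():.4f}")
```

With skewed but independent, constant-variance errors, the coverage should land near the nominal 95%, and the average of s²(b1) should sit near 27/220 ≈ 0.1227.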

Page 6: Violations of Assumptions In Least Squares Regression

Non-Constant Error Variance (Heteroscedasticity)

• Mean: E(Y|X) = β_0 + β_1 X = 50 + 10X

• Standard Deviation: σ(Y|X) = X + 0.5

• Distribution: Normal

X    E(Y|X)  σ(Y|X)
0    50      0.5
1    60      1.5
2    70      2.5
3    80      3.5
4    90      4.5
5    100     5.5
6    110     6.5
7    120     7.5
8    130     8.5
9    140     9.5
10   150     10.5

Page 7: Violations of Assumptions In Least Squares Regression

Non-Constant Error Variance (Heteroscedasticity)

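$$E(\hat{\boldsymbol{\beta}}) = E\left[(\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{Y}\right] = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'E(\mathbf{Y}) = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{X}\boldsymbol{\beta} = \boldsymbol{\beta}$$

$$V(\hat{\boldsymbol{\beta}}) = V\left[(\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{Y}\right] = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\,V(\mathbf{Y})\,\mathbf{X}(\mathbf{X}'\mathbf{X})^{-1}$$

$$V(\mathbf{Y}) = \begin{bmatrix} \sigma_1^2 & 0 & \cdots & 0 \\ 0 & \sigma_2^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_n^2 \end{bmatrix} = \mathrm{diag}\left(0.5^2, 1.5^2, \ldots, 10.5^2, 0.5^2, 1.5^2, \ldots, 10.5^2\right)$$

$$V(\hat{\boldsymbol{\beta}}) = \begin{bmatrix} 2.289773 & -0.59205 \\ -0.59205 & 0.218409 \end{bmatrix}, \qquad SE(\hat{\beta}_1) = \sqrt{0.218409} = 0.4673$$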

Based on 100,000 simulations, the 95% CI for β1 contained 10 in 92.62% of the samples. Average of the estimates = 9.998828, SD = 0.467182. Average s²(b1) = 0.1813113 < 0.2184, so the usual variance estimate is too small on average.
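The true variance of the OLS estimator under non-constant error variance can be checked numerically. The sketch below evaluates (X'X)^{-1}X' V(Y) X(X'X)^{-1} for the example design, assuming V(Y) is diagonal with entries (X + 0.5)²; variable names are illustrative.

```python
import numpy as np

# Example design: intercept and X1 = 0..10 repeated twice (n = 22).
x1 = np.tile(np.arange(11.0), 2)
X = np.column_stack([np.ones_like(x1), x1])

# Heteroscedastic errors: standard deviation of Y given X is X + 0.5.
V_y = np.diag((x1 + 0.5) ** 2)

# beta_hat = A @ Y with A = (X'X)^{-1} X', so V(beta_hat) = A V(Y) A'.
A = np.linalg.inv(X.T @ X) @ X.T
V_beta = A @ V_y @ A.T
se_b1 = np.sqrt(V_beta[1, 1])

print(V_beta)
print(se_b1)
```

The slope variance from this calculation (about 0.2184) is what the average of s²(b1) should be compared against.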

Page 8: Violations of Assumptions In Least Squares Regression

Correlated Errors

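$$\varepsilon_t = \rho\,\varepsilon_{t-1} + \nu_t, \qquad \nu_2,\ldots,\nu_n \overset{iid}{\sim} N(0,\sigma^2)$$

Assuming:

i) $\varepsilon_1 \sim N\left(0, \dfrac{\sigma^2}{1-\rho^2}\right)$

ii) $\varepsilon_1$ independent of $\nu_t$, $t \ge 2$

iii) $|\rho| < 1$:

$$V(\boldsymbol{\varepsilon}) = \frac{\sigma^2}{1-\rho^2} \begin{bmatrix} 1 & \rho & \rho^2 & \cdots & \rho^{n-1} \\ \rho & 1 & \rho & \cdots & \rho^{n-2} \\ \rho^2 & \rho & 1 & \cdots & \rho^{n-3} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \rho^{n-1} & \rho^{n-2} & \rho^{n-3} & \cdots & 1 \end{bmatrix}$$

Example: σ² = 9, ρ = 0.5, n = 22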

Page 9: Violations of Assumptions In Least Squares Regression

Correlated Errors

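$$E(\hat{\boldsymbol{\beta}}) = E\left[(\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{Y}\right] = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{X}\boldsymbol{\beta} = \boldsymbol{\beta}$$

$$V(\hat{\boldsymbol{\beta}}) = V\left[(\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{Y}\right] = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\,V(\mathbf{Y})\,\mathbf{X}(\mathbf{X}'\mathbf{X})^{-1}$$

$$V(\mathbf{Y}) = \frac{9}{1-0.5^2} \begin{bmatrix} 1 & 0.5 & 0.5^2 & \cdots & 0.5^{21} \\ 0.5 & 1 & 0.5 & \cdots & 0.5^{20} \\ 0.5^2 & 0.5 & 1 & \cdots & 0.5^{19} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0.5^{21} & 0.5^{20} & 0.5^{19} & \cdots & 1 \end{bmatrix}$$

$$V(\hat{\boldsymbol{\beta}}) = \begin{bmatrix} 3.841111 & -0.46078 \\ -0.46078 & 0.092157 \end{bmatrix}, \qquad SE(\hat{\beta}_1) = \sqrt{0.092157} = 0.303573$$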

Based on 100,000 simulations, the 95% CI for β1 contained 10 in 84.56% of the samples. Average of the estimates = 10.00216, SD = 0.3039251. Average s²(b1) = 0.0476444 < 0.092157, so the usual variance estimate is far too small on average.
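The same sandwich calculation applies to autocorrelated errors. The sketch below assumes a stationary AR(1) error process with innovation variance σ² = 9 and ρ = 0.5, so that the error covariance between times i and j is σ²/(1 − ρ²) · ρ^|i−j|; variable names are illustrative.

```python
import numpy as np

n, sigma2, rho = 22, 9.0, 0.5

# Example design: intercept and X1 = 0..10 repeated twice, in time order.
x1 = np.tile(np.arange(11.0), 2)
X = np.column_stack([np.ones(n), x1])

# Stationary AR(1) covariance: sigma^2 / (1 - rho^2) * rho^|i - j|.
idx = np.arange(n)
V_eps = sigma2 / (1.0 - rho ** 2) * rho ** np.abs(idx[:, None] - idx[None, :])

# V(beta_hat) = (X'X)^{-1} X' V(Y) X (X'X)^{-1}.
A = np.linalg.inv(X.T @ X) @ X.T
V_beta = A @ V_eps @ A.T
se_b1 = np.sqrt(V_beta[1, 1])

print(V_beta)
print(se_b1)
```

For comparison, treating the errors as independent with the same marginal variance (12) would give a slope variance of 12/220 ≈ 0.0545, noticeably below the value computed here.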

Page 10: Violations of Assumptions In Least Squares Regression

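Measurement Error in X

• Z = True value of independent variable (unobserved)

• X = Observed value of independent variable

• X = Z + U (Z can be fixed or random)

• Z, U independent (assumed) when Z is random

• V(U) is independent of Z when Z is fixed

$$U \sim N(0, \sigma_u^2)$$

$$Z \text{ random: } Z \sim N(\mu_z, \sigma_z^2) \ \Rightarrow\ E(\hat{\beta}_1) \approx \beta_1 \frac{\sigma_z^2}{\sigma_z^2 + \sigma_u^2} = \frac{\beta_1}{1 + \sigma_u^2/\sigma_z^2}$$

$$Z \text{ fixed (e.g. in this example): } E(\hat{\beta}_1) \approx \frac{\beta_1}{1 + \dfrac{n\,\sigma_u^2}{\sum_{i=1}^{n} (Z_i - \bar{Z})^2}}$$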

Page 11: Violations of Assumptions In Least Squares Regression

Measurement Error in X

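$$X_i = Z_i + U_i, \qquad U_i \sim N(0, \sigma_u^2 = 1), \qquad n = 22, \qquad \sum_{i=1}^{n} (Z_i - \bar{Z})^2 = 220, \qquad \sigma^2 = 9$$

$$E(\hat{\beta}_1) \approx \frac{\beta_1}{1 + \dfrac{n\,\sigma_u^2}{\sum_{i=1}^{n} (Z_i - \bar{Z})^2}} = \frac{10}{1 + \dfrac{22(1)}{220}} = \frac{10}{1.1} = 9.09$$

$$\text{Without measurement error: } \hat{\beta}_1 \sim N\left(10,\ 0.20226^2\right)$$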

Based on 100,000 simulations, the 95% CI for β1 contained 10 in 76.72% of the samples. Average of the estimates = 9.197568, SD = 0.6076758. Average s²(b1) = 0.4283653 >> (0.20226)².
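The attenuation of the slope can be reproduced with a small simulation. The sketch below is not the original simulation code. It assumes fixed true values Z = 0, ..., 10 repeated twice, measurement error U ~ N(0, 1), response errors with variance 9, a seed of 0, and 20,000 replications; the critical value 2.086 is the 0.975 quantile of a t distribution with 20 degrees of freedom.

```python
import numpy as np

rng = np.random.default_rng(0)      # assumed seed, for reproducibility only
reps, n = 20_000, 22
beta0, beta1 = 50.0, 10.0
sigma, sigma_u = 3.0, 1.0           # assumed: V(eps) = 9, V(U) = 1
T_CRIT = 2.086                      # t(0.975, df = 20)

z = np.tile(np.arange(11.0), 2)     # fixed true values of the predictor
szz = ((z - z.mean()) ** 2).sum()   # 220

# Y depends on the true Z, but the regression uses the noisy X = Z + U.
x = z + rng.normal(0.0, sigma_u, size=(reps, n))
y = beta0 + beta1 * z + rng.normal(0.0, sigma, size=(reps, n))

xc = x - x.mean(axis=1, keepdims=True)
sxx = (xc ** 2).sum(axis=1)
b1 = (xc * y).sum(axis=1) / sxx
b0 = y.mean(axis=1) - b1 * x.mean(axis=1)
resid = y - b0[:, None] - b1[:, None] * x
s2_b1 = (resid ** 2).sum(axis=1) / (n - 2) / sxx

coverage = np.mean(np.abs(b1 - beta1) <= T_CRIT * np.sqrt(s2_b1))
approx = beta1 / (1.0 + n * sigma_u ** 2 / szz)   # first-order value, 10 / 1.1

print(f"mean b1={b1.mean():.4f} approx={approx:.4f} coverage={coverage:.4f}")
```

The simulated mean slope falls below 10, in the direction the approximation predicts, and the confidence interval coverage drops well below the nominal 95%.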