
Page 1: Lecture # 8

Lecture #8
Studenmund (2006): Chapter 8: Multicollinearity
All rights reserved by Dr. Bill Wan Sing Hung - HKBU

Objectives
• Perfect and imperfect multicollinearity
• Effects of multicollinearity
• Detecting multicollinearity
• Remedies for multicollinearity

Page 2: Lecture # 8


The nature of Multicollinearity

Perfect multicollinearity: an exact linear (functional) relationship exists among the independent variables, that is,

ΣλiXi = 0, i.e., λ1X1 + λ2X2 + λ3X3 + … + λKXK = 0 (not all λi equal to zero)

For example, X1 + 2X2 = 0, i.e., X1 = −2X2.

If multicollinearity is perfect, the regression coefficients of the Xi variables, the βi's, are indeterminate, and their standard errors, the se(βi)'s, are infinite.

Page 3: Lecture # 8


Example: the 3-variable case

Ŷ = β̂0 + β̂1X1 + β̂2X2

In deviation form, the OLS estimators are

β̂1 = [(Σyx1)(Σx2²) − (Σyx2)(Σx1x2)] / [(Σx1²)(Σx2²) − (Σx1x2)²]

β̂2 = [(Σyx2)(Σx1²) − (Σyx1)(Σx1x2)] / [(Σx1²)(Σx2²) − (Σx1x2)²]

If x2 = λx1, substituting gives

β̂1 = [(Σyx1)(λ²Σx1²) − (λΣyx1)(λΣx1²)] / [(Σx1²)(λ²Σx1²) − λ²(Σx1²)²] = 0/0

Indeterminate. Similarly, β̂2 = 0/0: indeterminate.
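To make the 0/0 outcome concrete, here is a minimal numerical sketch (made-up data, with λ = 2 assumed for illustration) showing that the common denominator (Σx1²)(Σx2²) − (Σx1x2)² collapses to exactly zero when x2 = λx1:

```python
import numpy as np

# Made-up data with lambda = 2, so x2 = 2 * x1 exactly.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
x2 = 2.0 * x1

# Deviation form, matching the formulas above
x1d = x1 - x1.mean()
x2d = x2 - x2.mean()

denom = (x1d**2).sum() * (x2d**2).sum() - ((x1d * x2d).sum())**2
print(denom)   # 0.0 -> both beta-hats are 0/0: indeterminate
```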

Page 4: Lecture # 8


If multicollinearity is imperfect,

x2 = λx1 + ν, where ν is a stochastic error term (or x2 = λ0 + λ1x1 + ν),

then the regression coefficients, although determinate, possess large standard errors: the coefficients can be estimated, but with less accuracy. Substituting into the formula for β̂1:

β̂1 = [(Σyx1)(λ²Σx1² + Σν²) − (λΣyx1 + Σyν)(λΣx1² + Σx1ν)] / [(Σx1²)(λ²Σx1² + Σν²) − (λΣx1² + Σx1ν)²]

where Σx1ν = 0. (Why?) There is now no reason for the denominator to vanish, so β̂1 is determinate.

Page 5: Lecture # 8


Example: production function Yi = β0 + β1X1i + β2X2i + β3X3i + εi

Y: output; X1: capital; X2: labor; X3: land

  Y   X1   X2   X3
122   10   50   52
170   15   75   75
202   18   90   97
270   24  120  129
330   30  150  152

Note that X2 = 5X1 in every observation: perfect multicollinearity.
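A quick way to see the consequence, sketched below in numpy (the code itself is an addition, not part of the original slides): with X2 = 5X1 the matrix X'X is singular, so (X'X)⁻¹ does not exist and the individual coefficient estimates are indeterminate.

```python
import numpy as np

# The slide's production-function data; note X2 = 5 * X1 exactly.
Y  = np.array([122.0, 170.0, 202.0, 270.0, 330.0])
X1 = np.array([10.0, 15.0, 18.0, 24.0, 30.0])
X2 = np.array([50.0, 75.0, 90.0, 120.0, 150.0])
X3 = np.array([52.0, 75.0, 97.0, 129.0, 152.0])

X = np.column_stack([np.ones_like(Y), X1, X2, X3])
print(np.linalg.matrix_rank(X.T @ X))   # 3 < 4: X'X is singular, so
                                        # (X'X)^(-1) does not exist and the
                                        # beta-hats on X1 and X2 are indeterminate
```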

Page 6: Lecture # 8


Example: perfect multicollinearity

a. Suppose D1, D2, D3, and D4 = 1 for spring, summer, autumn, and winter, respectively:

Yi = β0 + α1D1i + α2D2i + α3D3i + α4D4i + β1X1i + εi

Since D1 + D2 + D3 + D4 = 1 for every observation, the four dummies are perfectly collinear with the intercept (the dummy variable trap; see the sketch after this list).

b. Yi = β0 + β1X1i + β2X2i + β3X3i + εi, where X1: nominal interest rate; X2: real interest rate; X3: CPI inflation, so that X1 ≈ X2 + X3.

c. Yt = β0 + β1Xt + β2ΔXt + β3Xt-1 + εt, where ΔXt = (Xt − Xt-1) is called the "first difference", so ΔXt is an exact linear combination of Xt and Xt-1.
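A small synthetic check of example (a), the dummy variable trap (the quarterly toy data are assumed for illustration): the intercept column plus all four seasonal dummies cannot have full rank, because D1 + D2 + D3 + D4 reproduces the intercept column exactly.

```python
import numpy as np

# 12 quarterly observations cycling through the four seasons.
seasons = np.tile([0, 1, 2, 3], 3)
D = np.eye(4)[seasons]                    # one-hot columns D1..D4
X = np.column_stack([np.ones(12), D])     # intercept + D1..D4: 5 columns
print(np.linalg.matrix_rank(X))           # 4 < 5 -> perfect multicollinearity
```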

Page 7: Lecture # 8


Imperfect Multicollinearity

Yi = β0 + β1X1i + β2X2i + … + βKXKi + εi

When some independent variables are linearly correlated, but the relation is not exact, there is imperfect multicollinearity:

λ0 + λ1X1i + λ2X2i + … + λKXKi + ui = 0

where u is a random error term and λk ≠ 0 for some k.

When will it be a problem?

Page 8: Lecture # 8


Consequences of imperfect multicollinearity

1. The estimated coefficients are still BLUE; however, the OLS estimators have large variances and covariances, making estimation less accurate.

2. Confidence intervals tend to be much wider, leading to the "zero" null hypothesis being accepted more readily.

3. The t-statistics of the coefficients tend to be statistically insignificant.

4. The R² can be very high.

5. The OLS estimators and their standard errors can be sensitive to small changes in the data.

These symptoms can be detected from the regression results.

Page 9: Lecture # 8


OLS estimators are still BLUE under imperfect multicollinearity. Why?

Remarks:

• Unbiasedness is a repeated-sampling property, not a property of the estimates in any given sample.

• Minimum variance does not mean small variance.

• Imperfect multicollinearity is just a sample phenomenon.

Page 10: Lecture # 8


Effects of Imperfect Multicollinearity

Unaffected:

a. OLS estimators are still BLUE.

b. The overall fit of the equation

c. The estimation of the coefficients of non-multicollinear variables

Page 11: Lecture # 8


The variances of the OLS estimators increase with the degree of multicollinearity.

Regression model:

Yi = β0 + β1X1i + β2X2i + εi

High correlation between X1 and X2 makes it difficult to isolate the effects of X1 and X2 from each other.

Page 12: Lecture # 8


Closer relation between X1 and X2 ⟹ larger r12² ⟹ larger VIF ⟹ larger variances,

where VIFk = 1/(1 − Rk²), k = 1, …, K, and Rk² is the coefficient of determination from regressing Xk on all other (K − 1) explanatory variables.
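This chain can be checked with a small Monte Carlo sketch (all data simulated; the data-generating process y = 1 + 2x1 + 3x2 + ε, the sample size, and the replication count are assumptions for illustration): as r12 grows, the VIF and the sampling variance of β̂1 grow together.

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps = 200, 2000

# Larger r12 -> larger VIF -> larger sampling variance of beta1-hat.
for r in [0.0, 0.5, 0.9, 0.99]:
    b1_draws = []
    for _ in range(reps):
        x1 = rng.standard_normal(n)
        x2 = r * x1 + np.sqrt(1 - r**2) * rng.standard_normal(n)  # corr(x1,x2) ~ r
        y = 1.0 + 2.0 * x1 + 3.0 * x2 + rng.standard_normal(n)
        X = np.column_stack([np.ones(n), x1, x2])
        b1_draws.append(np.linalg.lstsq(X, y, rcond=None)[0][1])
    print(f"r12 = {r:4.2f}   VIF = {1/(1 - r**2):6.2f}   var(b1-hat) = {np.var(b1_draws):.4f}")
```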


Page 14: Lecture # 8


Larger var(β̂k): larger variances tend to increase the standard errors of the estimated coefficients.

a. More likely to get unexpected signs.

b. se(β̂k) tends to be large.

c. Larger standard errors ⟹ lower t-values, since

tk = (β̂k − βk*) / se(β̂k)

Page 15: Lecture # 8


d. Larger standard errors ⟹ wider confidence intervals ⟹ less precise interval estimates:

β̂k ± t(α/2, df) · se(β̂k)

Page 16: Lecture # 8


Detection of Multicollinearity

Example: data set CONS8 (pp. 254-255)

COi = β0 + β1Ydi + β2LAi + εi

CO: annual consumption expenditure

Yd: annual disposable income

LA: liquid assets

Page 17: Lecture # 8


Since LA (liquid assets, savings, etc.) is highly related to Yd (disposable income), the estimates (Studenmund (2006), Eq. 8.9, p. 254) show:

Results: high R² and adjusted R², but less significant t-values.

⟹ Drop one variable.

Page 18: Lecture # 8


OLS estimates and SE’s can be sensitive to specification and small changes in data

Small changes:

Add or drop some observations

Change some data values

Specification changes:

Add or drop variables

Page 19: Lecture # 8


High Simple Correlation Coefficients

rij = Σ(Xi − X̄i)(Xj − X̄j) / √[ Σ(Xi − X̄i)² · Σ(Xj − X̄j)² ]

Remark: a high rij for any i and j is a sufficient indicator of the existence of multicollinearity, but not a necessary one.
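In practice the pairwise check amounts to inspecting the correlation matrix of the regressors; a minimal sketch (reusing the earlier production data as a stand-in):

```python
import numpy as np

# Pairwise correlation of two regressors: capital X1 and land X3
# from the production-function slide above.
X1 = np.array([10.0, 15.0, 18.0, 24.0, 30.0])
X3 = np.array([52.0, 75.0, 97.0, 129.0, 152.0])
print(np.corrcoef(X1, X3)[0, 1])   # close to 1: a warning sign
```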

Page 20: Lecture # 8


Variance Inflation Factor (VIF) Method

Procedure:

(1) Y = β0 + β1X1 + β2X2 + … + βKXK + ε

(2) Regress each Xk on the other explanatory variables, e.g. X1 = α2X2 + α3X3 + … + αKXK + u, and obtain Rk²

(3) VIF(β̂k) = 1 / (1 − Rk²)

Rule of thumb: VIF > 5 ⟹ multicollinearity. (A sketch of this procedure follows below.)
Notes: (a) Using the VIF is not a statistical test. (b) The cutoff point is arbitrary.
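A minimal numpy sketch of steps (2)-(3); the function name and the usage data are assumptions for illustration (statsmodels also ships a packaged variance_inflation_factor helper):

```python
import numpy as np

def vif(X):
    """Steps (2)-(3) above: regress each column X_k on the remaining
    columns (plus an intercept), get R_k^2, return 1/(1 - R_k^2)."""
    X = np.asarray(X, dtype=float)
    vifs = []
    for k in range(X.shape[1]):
        y = X[:, k]
        Z = np.column_stack([np.ones(len(y)), np.delete(X, k, axis=1)])
        resid = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
        r2 = 1.0 - (resid**2).sum() / ((y - y.mean())**2).sum()
        vifs.append(1.0 / (1.0 - r2))   # infinite under perfect collinearity
    return vifs

# Hypothetical usage: two strongly related regressors give VIFs above 5.
rng = np.random.default_rng(1)
x1 = rng.standard_normal(50)
x2 = 0.9 * x1 + 0.3 * rng.standard_normal(50)
print(vif(np.column_stack([x1, x2])))
```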

Page 21: Lecture # 8


Remedial Measures

1. Drop the Redundant Variable

Use theory to pick the variable(s) to drop. Do not drop a variable that is strongly supported by theory (danger of specification error).

Page 22: Lecture # 8


Example: in a regression that includes both M1 and M2, both coefficients are insignificant, since M1 and M2 are highly related; one of them is redundant.

Other examples of redundant pairs: CPI <=> WPI; CD rate <=> TB rate; GDP <=> GNP <=> GNI.

Page 23: Lecture # 8


Check after dropping variables:
• The estimated coefficients of the other variables are not affected. (necessary)
• R² does not fall much when some collinear variables are dropped. (necessary)
• More significant t-values vs. smaller standard errors. (likely)

Page 24: Lecture # 8


2. Redesigning the Regression Model

There is no definite rule for this method. Example (Studenmund (2006), p. 268):

Ft = β0 + β1PFt + β2PBt + β3lnYdt + β4Nt + β5Pt + εt, i.e., F = f(PF, PB, Yd, N, P)

Ft: average pounds of fish consumed per capita
PFt: price index for fish
PBt: price index for beef
Ydt: real per capita disposable income
Nt: the number of Catholics
Pt: dummy = 1 after the Pope's 1966 decision, = 0 otherwise

Page 25: Lecture # 8


Results: signs are unexpected and most t-values are insignificant, due to high correlations among the regressors:

VIF(PF) = 43.4, VIF(lnYd) = 23.3, VIF(PB) = 18.9, VIF(N) = 18.5, VIF(P) = 4.4

Page 26: Lecture # 8


Dropping N does not improve the results.

Instead, use the relative price RPt = PFt/PBt:

Ft = β0 + β1RPt + β2lnYdt + β3Pt + εt, i.e., F = f(RP, Yd, P)

⟹ Improved.

Page 27: Lecture # 8


Using the lagged term RPt-1 to allow for a lag effect in the regression:

Ft = β0 + β1RPt-1 + β2lnYdt + β3Pt + εt

⟹ Much improved.

Page 28: Lecture # 8


3. Using A Priori Information

From previous empirical work, e.g.,

Consi = β0 + β1Incomei + β2Wealthi + εi,

suppose a priori information gives β2 = 0.1. Then construct a new variable,

Cons*i = Consi − 0.1·Wealthi,

and run OLS on Cons*i = β0 + β1Incomei + εi.
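A minimal sketch of this construction (the numbers are hypothetical; only the transformation matters): imposing β2 = 0.1 removes the collinear Wealth regressor from the equation.

```python
import numpy as np

# Hypothetical data for illustration.
cons   = np.array([100.0, 110.0, 125.0, 140.0, 160.0])
income = np.array([110.0, 120.0, 140.0, 155.0, 180.0])
wealth = np.array([900.0, 1000.0, 1150.0, 1300.0, 1500.0])

# Impose beta2 = 0.1: Cons* = Cons - 0.1*Wealth, then regress on Income alone.
cons_star = cons - 0.1 * wealth
X = np.column_stack([np.ones_like(income), income])
beta0_hat, beta1_hat = np.linalg.lstsq(X, cons_star, rcond=None)[0]
print(beta0_hat, beta1_hat)
```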

Page 29: Lecture # 8


4. Transformation of the Model

Take first differences of the time-series data.

Original regression model:

Yt = β0 + β1X1t + β2X2t + εt

Transformed model (first differencing):

ΔYt = β'0 + β'1ΔX1t + β'2ΔX2t + ut

where ΔYt = Yt − Yt-1 (Yt-1 is called a lagged term), ΔX1t = X1t − X1,t-1, and ΔX2t = X2t − X2,t-1.
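A short sketch of the transformation (made-up series for illustration): np.diff returns Yt − Yt-1, so the differenced regression uses one fewer observation.

```python
import numpy as np

# Hypothetical time series.
Y  = np.array([100.0, 104.0, 109.0, 115.0, 123.0])
X1 = np.array([10.0, 11.0, 13.0, 14.0, 16.0])
X2 = np.array([5.0, 6.0, 6.0, 7.0, 9.0])

# First differences: delta_Y_t = Y_t - Y_{t-1}, etc.
dY, dX1, dX2 = np.diff(Y), np.diff(X1), np.diff(X2)
X = np.column_stack([np.ones_like(dY), dX1, dX2])
print(np.linalg.lstsq(X, dY, rcond=None)[0])   # [b0', b1', b2']
```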

Page 30: Lecture # 8


5. Collect More Data (expand the sample size)

A larger sample size means smaller variances of the estimators.

6. Doing Nothing

An option, unless the multicollinearity causes serious bias and a change of specification would give better results.