Econometrics
Week 4
Institute of Economic Studies, Faculty of Social Sciences
Charles University in Prague
Fall 2012
1 / 23
Recommended Reading
For today: Serial correlation and heteroskedasticity in time series regressions. Chapter 12 (pp. 376–404).
For next week: Pooling cross sections across time. Simple panel data methods. Chapter 13 (pp. 408–434).
2 / 23
Properties of OLS with Serially Correlated Errors: Unbiasedness and Consistency
OLS is unbiased under the first three Gauss-Markov assumptions for time series regression.
But these assumptions say nothing about the serial correlation that is often present in economic data.
As long as the explanatory variables are strictly exogenous, the OLS estimators β̂ are unbiased, regardless of the degree of serial correlation in the errors.
Last lecture, we also relaxed strict exogeneity to E(ut|xt) = 0 and, by assuming weak dependence of the data, showed that the OLS estimators are still consistent (although not necessarily unbiased).
But what about the assumption of no serial correlation?
3 / 23
Properties of OLS with Serially Correlated Errors: Efficiency and Inference
The Gauss-Markov theorem requires both homoskedasticity and serially uncorrelated errors.
Thus, OLS is no longer BLUE in the presence of serial correlation...
...and the standard errors and test statistics are not valid.
Let us assume a model with AR(1) errors:
yt = β0 + β1xt + ut,
ut = ρut−1 + εt,
for t = 1, 2, . . . , n, where |ρ| < 1 and the εt are uncorrelated random variables with zero mean and variance σε².
4 / 23
Properties of OLS with Serially Correlated Errors: Efficiency and Inference cont.
The OLS estimator is then:
β̂1 = β1 + SSTx⁻¹ ∑_{t=1}^{n} xt ut,
where SSTx = ∑_{t=1}^{n} xt².
5 / 23
Properties of OLS with Serially Correlated Errors: Efficiency and Inference cont.
The variance of β̂1 conditional on X is:
Var(β̂1) = SSTx⁻² Var(∑_{t=1}^{n} xt ut)
= SSTx⁻² [∑_{t=1}^{n} xt² Var(ut) + 2 ∑_{t=1}^{n−1} ∑_{j=1}^{n−t} xt xt+j E(ut ut+j)]
= σ²/SSTx + 2(σ²/SSTx²) ∑_{t=1}^{n−1} ∑_{j=1}^{n−t} ρ^j xt xt+j,
where the first term is the usual variance of β̂1 and the second term is the bias, σ² = Var(ut), and we used the fact from last lecture that E(ut ut+j) = Cov(ut, ut+j) = ρ^j σ².
If we ignore the serial correlation and estimate the variance in the usual way, the variance estimator will be biased (as ρ ≠ 0).
6 / 23
Properties of OLS with Serially Correlated Errors: Efficiency and Inference cont.
Consequences:
In most economic applications, ρ > 0, and the usual OLS variance formula underestimates the true variance of the OLS estimator.
We tend to think that the OLS slope estimator is more precise than it actually is.
The main consequence is that the standard errors are invalid ⇒ t statistics for testing single hypotheses are invalid ⇒ statistical inference is invalid.
7 / 23
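This underestimation can be seen in a small Monte Carlo experiment (a numpy sketch under assumed parameter values, not part of the lecture): with ρ = 0.9 in the errors and a positively autocorrelated regressor, the usual OLS standard error falls well short of the empirical spread of β̂1.

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps, rho = 100, 2000, 0.9
slopes, usual_ses = [], []
for _ in range(reps):
    eps = rng.standard_normal(n)
    v = rng.standard_normal(n)
    u = np.empty(n); x = np.empty(n)
    u[0], x[0] = eps[0], v[0]
    for t in range(1, n):
        u[t] = rho * u[t - 1] + eps[t]   # AR(1) errors
        x[t] = rho * x[t - 1] + v[t]     # positively autocorrelated regressor
    y = 1.0 + 2.0 * x + u
    X = np.column_stack([np.ones(n), x])
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ b
    s2 = resid @ resid / (n - 2)                       # usual OLS sigma^2 estimate
    se = np.sqrt(s2 * np.linalg.inv(X.T @ X)[1, 1])    # usual OLS se of the slope
    slopes.append(b[1]); usual_ses.append(se)

ratio = np.std(slopes) / np.mean(usual_ses)  # > 1: the usual se is too small
```

The autocorrelated regressor matters here: the bias term involves ρ^j xt xt+j, so with an i.i.d. regressor the cross terms would average out and the usual formula would be roughly right.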
Testing for AR(1) Serial Correlation
We need to be able to test for serial correlation in the error terms of the multiple linear regression model:
yt = β0 + β1xt1 + . . . + βkxtk + ut,
with ut = ρut−1 + εt, t = 1, 2, . . . , n.
The null hypothesis is that there is no serial correlation: H0 : ρ = 0.
With strictly exogenous regressors, the test is very straightforward: simply regress the OLS residuals ût on the lagged residuals ût−1.
The t statistic of the ρ̂ coefficient can be used to test H0 : ρ = 0 against HA : ρ ≠ 0 (or sometimes even HA : ρ > 0).
8 / 23
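The residual-on-lagged-residual regression takes only a few lines of numpy (an illustrative sketch, not the lecture's code; the simulated series with ρ = 0.8 is an assumption for the demo):

```python
import numpy as np

def ar1_t_test(resid):
    """Regress u_t on u_{t-1} (no intercept); return (rho_hat, t_stat)."""
    u, ulag = resid[1:], resid[:-1]
    rho = (ulag @ u) / (ulag @ ulag)
    e = u - rho * ulag
    s2 = (e @ e) / (len(u) - 1)            # one estimated coefficient
    t_stat = rho / np.sqrt(s2 / (ulag @ ulag))
    return rho, t_stat

# Illustration on simulated AR(1) residuals with rho = 0.8
rng = np.random.default_rng(1)
eps = rng.standard_normal(500)
u = np.empty(500); u[0] = eps[0]
for t in range(1, 500):
    u[t] = 0.8 * u[t - 1] + eps[t]
rho_hat, t_stat = ar1_t_test(u)
```

In practice `resid` would be the OLS residuals from the original model; a large t statistic leads to rejecting H0 : ρ = 0.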
Testing for AR(1) Serial Correlation cont.
An alternative is the Durbin-Watson (DW) statistic:
DW = ∑_{t=2}^{n} (ût − ût−1)² / ∑_{t=1}^{n} ût².
DW ≈ 2(1 − ρ̂).
ρ̂ ≈ 0 ⇒ DW ≈ 2.
ρ̂ > 0 ⇒ DW < 2.
The DW test is a little problematic: we have two sets of critical values, dL (lower) and dU (upper):
DW < dL ⇒ reject H0 : ρ = 0 in favor of HA : ρ > 0.
DW > dU ⇒ fail to reject H0 : ρ = 0.
dL ≤ DW ≤ dU ⇒ the test is inconclusive.
9 / 23
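The DW statistic itself is a one-liner; the sketch below (illustrative numpy, simulated data assumed) checks the two benchmark cases: white-noise residuals give DW near 2, while ρ = 0.8 residuals give DW near 2(1 − 0.8) = 0.4.

```python
import numpy as np

def durbin_watson(resid):
    """DW = sum_t (u_t - u_{t-1})^2 / sum_t u_t^2, roughly 2(1 - rho_hat)."""
    diff = np.diff(resid)
    return (diff @ diff) / (resid @ resid)

rng = np.random.default_rng(2)
white = rng.standard_normal(1000)          # no serial correlation -> DW near 2
eps = rng.standard_normal(1000)
u = np.empty(1000); u[0] = eps[0]
for t in range(1, 1000):
    u[t] = 0.8 * u[t - 1] + eps[t]         # rho = 0.8 -> DW well below 2
dw_white, dw_ar = durbin_watson(white), durbin_watson(u)
```

The critical values dL and dU still have to be looked up in DW tables for the given n and number of regressors, which the code does not replace.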
Testing for AR(1) Serial Correlation cont.
In case we do not have strictly exogenous regressors (one or more xtj is correlated with ut−1), neither the t test nor the DW test works.
In this case, we can regress ût on xt1, xt2, . . . , xtk, ût−1 for all t = 2, . . . , n.
The t statistic of the ρ̂ coefficient on ût−1 can be used to test the null of no serial correlation.
The inclusion of xt1, xt2, . . . , xtk explicitly allows each xtj to be correlated with ut−1 ⇒ no need for strict exogeneity.
10 / 23
Testing for Higher Order Serial Correlation
We can easily extend the test to second order (AR(2)) serial correlation.
In the model with ut = ρ1ut−1 + ρ2ut−2 + εt, we test
H0 : ρ1 = 0, ρ2 = 0.
We regress ût on xt1, xt2, . . . , xtk, ût−1, ût−2 for all t = 3, . . . , n...
...and obtain an F test for the joint significance of ût−1 and ût−2. If they are jointly significant, we reject the null ⇒ the errors are serially correlated of order two.
11 / 23
Testing for Higher Order Serial Correlation cont.
We can include q lags to test for higher order serial correlation:
Regress ût on xt1, xt2, . . . , xtk, ût−1, ût−2, . . . , ût−q for all t = (q + 1), . . . , n.
Use an F test for the joint significance of ût−1, ût−2, . . . , ût−q.
Or use the LM version of the test – the Breusch-Godfrey test:
LM = (n − q)R²û,
where R²û is the usual R² from the regression above.
Under the null hypothesis, LM is asymptotically distributed as χ²q.
12 / 23
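The LM version can be sketched directly from its definition (illustrative numpy only; the regressor matrix, ρ = 0.8 and q = 2 are assumptions for the demo, and here the simulated errors stand in for the OLS residuals):

```python
import numpy as np

def breusch_godfrey(resid, X, q):
    """LM = (n - q) R^2 from regressing u_t on X and q lagged residuals."""
    n = len(resid)
    cols = [X[q:]]                               # original regressors (incl. intercept)
    for j in range(1, q + 1):
        cols.append(resid[q - j : n - j, None])  # u_{t-j}
    Z = np.hstack(cols)
    u = resid[q:]
    b = np.linalg.lstsq(Z, u, rcond=None)[0]
    e = u - Z @ b
    tss = (u - u.mean()) @ (u - u.mean())
    r2 = 1.0 - (e @ e) / tss
    return (n - q) * r2                          # asymptotically chi^2_q under H0

rng = np.random.default_rng(3)
n = 500
x = rng.standard_normal(n)
X = np.column_stack([np.ones(n), x])
eps = rng.standard_normal(n)
u = np.empty(n); u[0] = eps[0]
for t in range(1, n):
    u[t] = 0.8 * u[t - 1] + eps[t]
lm_ar = breusch_godfrey(u, X, q=2)                          # far above chi2_2 critical values
lm_white = breusch_godfrey(rng.standard_normal(n), X, q=2)  # small under H0
```

If statsmodels is available, its diagnostic module offers a comparable ready-made test (`acorr_breusch_godfrey`), which also handles the degrees-of-freedom details.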
Correcting for Serial Correlation
When serial correlation is detected, we need to treat it.
We know that OLS may be inefficient.
So how do we obtain a BLUE estimator in the AR(1) setting?
We keep the first four Gauss-Markov assumptions, but we relax Assumption 5 and assume the errors follow an AR(1) process:
ut = ρut−1 + εt, t = 1, 2, . . . , with Var(ut) = σε²/(1 − ρ²).
We need to transform the regression equation so that there is no serial correlation in the errors.
13 / 23
Correcting for Serial Correlation cont.
Consider the following regression:
yt = β0 + β1xt + ut,
ut = ρut−1 + εt.
For t ≥ 2, we can write:
yt−1 = β0 + β1xt−1 + ut−1,
yt = β0 + β1xt + ut.
Multiplying the first equation by ρ and subtracting it from the second, we get:
ỹt = (1 − ρ)β0 + β1x̃t + εt,
where ỹt = yt − ρyt−1 and x̃t = xt − ρxt−1.
This is called quasi-differencing. BUT we never know the value of ρ.
14 / 23
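The transformation is mechanical once ρ is given (a numpy sketch assuming ρ known, which the next slide relaxes; the scaling of the first observation follows the Prais-Winsten treatment of t = 1):

```python
import numpy as np

def quasi_difference(z, rho):
    """z_tilde_t = z_t - rho * z_{t-1} for t >= 2; sqrt(1 - rho^2) scaling at t = 1."""
    zt = np.empty_like(z, dtype=float)
    zt[0] = np.sqrt(1.0 - rho**2) * z[0]
    zt[1:] = z[1:] - rho * z[:-1]
    return zt

# The transformed error u_t - rho * u_{t-1} recovers the uncorrelated eps_t exactly:
rng = np.random.default_rng(4)
rho = 0.7
eps = rng.standard_normal(200)
u = np.empty(200); u[0] = eps[0]
for t in range(1, 200):
    u[t] = rho * u[t - 1] + eps[t]
```

Applying the same transformation to yt and xt yields the quasi-differenced regression with serially uncorrelated errors.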
Feasible GLS Estimation with AR(1) Errors
The problem with this GLS estimator is that we never know the value of ρ.
But we already know how to obtain an estimator of ρ:
simply regress the OLS residuals on their lagged values and get ρ̂.
Feasible GLS (FGLS) Estimation with AR(1) Errors:
1. Run the OLS regression of yt on xt1, . . . , xtk and obtain the residuals ût, t = 1, 2, . . . , n.
2. Run the regression of ût on ût−1 to obtain the estimate ρ̂.
3. Run the OLS regression
ỹt = β0x̃t0 + β1x̃t1 + . . . + βkx̃tk + errort,
where x̃t0 = 1 − ρ̂ and x̃tj = xtj − ρ̂xt−1,j for t ≥ 2, while x̃10 = (1 − ρ̂²)^{1/2} and x̃1j = (1 − ρ̂²)^{1/2}x1j for t = 1 (ỹt is transformed analogously).
15 / 23
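The three steps above can be sketched for a single regressor (an illustrative numpy implementation with assumed simulation parameters β0 = 1, β1 = 2, ρ = 0.8, not the lecture's code):

```python
import numpy as np

def fgls_ar1(y, x):
    """Prais-Winsten style FGLS for y_t = b0 + b1*x_t + u_t with AR(1) errors."""
    n = len(y)
    X = np.column_stack([np.ones(n), x])
    b_ols = np.linalg.lstsq(X, y, rcond=None)[0]       # step 1: OLS
    u = y - X @ b_ols
    rho = (u[:-1] @ u[1:]) / (u[:-1] @ u[:-1])         # step 2: rho_hat from residuals
    w = np.sqrt(1.0 - rho**2)                          # step 3: quasi-differencing
    x0 = np.r_[w, np.full(n - 1, 1.0 - rho)]           # transformed intercept column
    x1 = np.r_[w * x[0], x[1:] - rho * x[:-1]]
    yt = np.r_[w * y[0], y[1:] - rho * y[:-1]]
    Z = np.column_stack([x0, x1])
    b_fgls = np.linalg.lstsq(Z, yt, rcond=None)[0]
    return b_fgls, rho

rng = np.random.default_rng(5)
n = 2000
v = rng.standard_normal(n); eps = rng.standard_normal(n)
x = np.empty(n); u = np.empty(n)
x[0], u[0] = v[0], eps[0]
for t in range(1, n):
    x[t] = 0.5 * x[t - 1] + v[t]
    u[t] = 0.8 * u[t - 1] + eps[t]
y = 1.0 + 2.0 * x + u
b_fgls, rho_hat = fgls_ar1(y, x)
```

Dropping the three `np.r_[...]` first-observation terms and starting at t = 2 would give the Cochrane-Orcutt variant discussed on the next slide.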
Feasible GLS Estimation with AR(1) Errors
GLS is BLUE under Assumptions 1–5, and we can use t and F tests from the transformed equation for inference.
These tests are asymptotically valid if Assumptions 1–5 hold in the transformed model (along with stationarity and weak dependence of the original variables).
The distributions conditional on X are exact (with minimum variance) if Assumption 6 holds for εt.
The FGLS estimator above is called the Prais-Winsten estimator.
If we just omit the first equation (t = 1), it is called the Cochrane-Orcutt estimator.
FGLS estimators are not unbiased, but they are consistent.
Asymptotically, both procedures are the same, and FGLS is more efficient than OLS.
This method can be extended to higher order serial correlation, AR(q), in the error term.
16 / 23
Serial Correlation-Robust Standard Errors

Problem: If the regressors are not strictly exogenous, FGLS is no longer consistent.
If strict exogeneity does not hold, it is still possible to calculate serial correlation (and heteroskedasticity) robust standard errors for the OLS estimates; we know OLS will be inefficient.
The idea is to scale the OLS standard errors to take serial correlation into account.

17 / 23
Serial Correlation-Robust Standard Errors cont.

Estimate the model by OLS to obtain the residuals ut, σ, and the usual standard errors "se(β1)", which are incorrect.
Run the auxiliary regression of xt1 on xt2, xt3, . . . , xtk (with a constant) and get the residuals rt.
For a chosen integer g > 0 (typically the integer part of n^(1/4)), compute

ν = Σ_{t=1}^{n} at² + 2 Σ_{h=1}^{g} [1 − h/(g + 1)] ( Σ_{t=h+1}^{n} at at−h ),

where at = rt ut, t = 1, 2, . . . , n.

Serial Correlation-Robust Standard Error

se(β1) = ["se(β1)"/σ]² √ν

Proceed similarly for the other βj.
SC-robust standard errors can behave poorly in small samples in the presence of strong serial correlation.

18 / 23
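The recipe above can be sketched as a minimal NumPy implementation. The truncation lag g = ⌊n^(1/4)⌋ follows the slide; everything else (function name, argument layout) is an illustrative assumption.

```python
import numpy as np

def sc_robust_se(y, X, j=1):
    """Serial-correlation robust standard error for coefficient j,
    following the scaling recipe on the slide (g = int(n**0.25))."""
    n, k = X.shape
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    u = y - X @ beta                               # OLS residuals
    sigma2 = (u @ u) / (n - k)                     # sigma-hat squared
    se_usual = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[j, j])
    # residuals from regressing x_j on the remaining columns
    others = np.delete(X, j, axis=1)
    gamma = np.linalg.lstsq(others, X[:, j], rcond=None)[0]
    r = X[:, j] - others @ gamma
    a = r * u
    g = int(n ** 0.25)
    nu = a @ a                                     # h = 0 term
    for h in range(1, g + 1):                      # Bartlett-weighted lags
        nu += 2.0 * (1.0 - h / (g + 1.0)) * (a[h:] @ a[:-h])
    return (se_usual / np.sqrt(sigma2)) ** 2 * np.sqrt(nu)
```

With serially uncorrelated homoskedastic errors the result is close to the usual OLS standard error, since the lag terms in ν are then approximately zero.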
Heteroskedasticity in Time Series Regressions

OLS estimators are unbiased (under Ass. 1 – 3) and consistent (under Ass. 1A – 3A).
OLS inference is invalid if Ass. 4 (homoskedasticity) fails.
Heteroskedasticity-robust statistics can be derived in the same manner as for cross-sectional data (if Ass. 1A, 2A, 3A, and 5A hold).
However, in small samples these robust standard errors may be large ⇒ we want to test for heteroskedasticity.
We can use the same tests as in the cross-sectional case, but we need no serial correlation in the errors.
Also, for the Breusch-Pagan test, where we specify u²t = δ0 + δ1 xt1 + . . . + δk xtk + νt and test H0 : δ1 = δ2 = . . . = δk = 0, we need νt to be homoskedastic and serially uncorrelated.
If we find heteroskedasticity, we can use heteroskedasticity-robust statistics.

19 / 23
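The Breusch-Pagan auxiliary regression above leads to the LM statistic n·R², which can be computed as follows (a sketch; compare the result with a χ² critical value, e.g. 3.84 for one restriction at the 5% level — names are illustrative):

```python
import numpy as np

def breusch_pagan_lm(u, Z):
    """LM statistic n*R^2 from regressing u_t^2 on Z (Z includes a
    constant). Under H0 (homoskedasticity, with nu_t homoskedastic and
    serially uncorrelated) it is approximately chi-squared with
    Z.shape[1] - 1 degrees of freedom."""
    u2 = u ** 2
    delta = np.linalg.lstsq(Z, u2, rcond=None)[0]
    resid = u2 - Z @ delta
    r2 = 1.0 - (resid @ resid) / np.sum((u2 - u2.mean()) ** 2)
    return len(u) * r2
```

As the slide notes, the test is only reliable in time series if the errors are serially uncorrelated, so it is usually applied after checking for serial correlation.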
Autoregressive Conditional Heteroskedasticity

A dynamic form of heteroskedasticity is often found in economic data.
We can have E[u²t|X] = Var(ut|X) = Var(ut) = σ², but still:
E[u²t|X, ut−1, ut−2, . . .] = E[u²t|X, ut−1] = α0 + α1 u²t−1.
Thus u²t = α0 + α1 u²t−1 + νt, where E[νt|X, ut−1, ut−2, . . .] = 0.
Engle (1982) suggested looking at the conditional variance of ut given past errors — the autoregressive conditional heteroskedasticity (ARCH) model.
So even when the errors are uncorrelated (Ass. 5 holds), their squares can be correlated.
OLS is still BLUE with ARCH errors, and inference is valid if Ass. 6 (normality) holds.
Even if normality does not hold, asymptotic OLS inference is valid under Ass. 1A – 5A, and we can still have ARCH effects.

20 / 23
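A minimal sketch of testing for ARCH(1) effects by regressing squared residuals on their first lag, as in the equation above (function name and the LM form are illustrative assumptions):

```python
import numpy as np

def arch1_test(u):
    """Regress u_t^2 on a constant and u_{t-1}^2.

    Returns (alpha1_hat, lm) where lm = (n-1)*R^2; a large lm relative
    to the chi2(1) critical value (3.84 at 5%) signals ARCH effects."""
    u2 = u ** 2
    yv, z = u2[1:], u2[:-1]
    Z = np.column_stack([np.ones(len(z)), z])
    coef = np.linalg.lstsq(Z, yv, rcond=None)[0]
    resid = yv - Z @ coef
    r2 = 1.0 - (resid @ resid) / np.sum((yv - yv.mean()) ** 2)
    return coef[1], len(yv) * r2
```

On simulated ARCH(1) errors the statistic is large even though the errors themselves are serially uncorrelated, illustrating the point above that the squares can be correlated while the levels are not.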
Autoregressive Conditional Heteroskedasticity cont.

So why do we need to care about ARCH errors?
Because we can obtain asymptotically more efficient estimators than OLS.
Details will be provided in Mgr. courses, not at the Bc. level.
ARCH models have become important in empirical finance, as they capture the time-varying volatility of stock markets.
Robert Engle received a Nobel Prize in 2003 for the ARCH model.
An example of stock market returns follows on the next slide.

21 / 23
Autoregressive Conditional Heteroskedasticity cont.

[Figure: Prices of the DJI stock market index, 2000 – 2011]
[Figure: Returns of the DJI stock market index, 2000 – 2011]
22 / 23
Thank you
Thank you very much for your attention!
23 / 23