Chapter 12: Autocorrelation

The classical linear regression model assumes that there is no autocorrelation in the data. Autocorrelation occurs when the error terms of any two observations are correlated, i.e. the error terms are not independent of each other. This is more likely to occur in a time-series setting, since the observations in such data follow a natural ordering and are therefore likely to be correlated. Thus, a shock (random factor) in one period is likely to have an effect in the next period.
Why does autocorrelation occur?
- Inertia: business cycles, momentum.
- Specification bias (excluded-variable case). Suppose the true model is

  Y_t = β1 + β2 X_2t + β3 X_3t + u_t

  but we fit the model

  Y_t = β1 + β2 X_2t + v_t.

  Then v_t = β3 X_3t + u_t; thus, to the extent that X3 influences Y, there will be a systematic pattern in the error terms.
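The excluded-variable effect can be illustrated with a short simulation (a sketch; the trend slope, coefficients and sample size are illustrative assumptions, not from the notes). Omitting a smoothly trending X3 leaves v_t = β3 X_3t + u_t in the error, so the residuals of the misspecified model are strongly autocorrelated:

```python
import numpy as np

# Sketch: omitting a trending regressor leaves a systematic
# component in the error term. All numbers are illustrative.
rng = np.random.default_rng(4)
n = 200
x2 = rng.normal(size=n)
x3 = 0.05 * np.arange(n)            # smooth (trending) omitted variable
y = 1.0 + 2.0 * x2 + 1.5 * x3 + rng.normal(size=n)

# Fit the misspecified model y ~ x2 only (x3 excluded).
X = np.column_stack([np.ones(n), x2])
resid = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]

# Lag-1 correlation of the residuals is far from zero.
r1 = np.corrcoef(resid[:-1], resid[1:])[0, 1]
```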
- Cobwebs. Example: overproduction in one year might cause underproduction in the next year.
- Lags (autoregression).
- Manipulation of data: averaging the data introduces smoothness, which may lend itself to a systematic pattern in the disturbances.
- Data transformation: taking first differences of the data introduces autocorrelation.
What are the consequences? As in the case of heteroscedasticity, the OLS estimators are still unbiased and consistent, but they are no longer efficient. The usual OLS standard errors are also biased, so the standard t and F tests can be misleading.
The error terms. We assume that the residuals are generated by an autoregressive process, i.e. the errors are determined by previous error values. A first-order autoregressive process (AR(1)) can be expressed as

u_t = ρ u_{t-1} + ε_t,   -1 < ρ < 1,

where the new error term ε_t is normally distributed with mean zero and constant variance, and does not exhibit autocorrelation.
The closer ρ is to 1 in absolute value, the more autocorrelation is present. A value of ρ = 1 means there is perfect positive autocorrelation, ρ = -1 means there is perfect negative autocorrelation, and ρ = 0 means there is no autocorrelation at all.
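The AR(1) mechanism is easy to simulate; the sketch below (the function name, ρ, n and the seed are illustrative choices of mine) generates u_t = ρ u_{t-1} + ε_t and checks that the lag-1 sample correlation of the series is close to ρ:

```python
import numpy as np

# Sketch: simulate an AR(1) error process u_t = rho*u_{t-1} + eps_t.
# rho, n, sigma and the seed are illustrative choices.
def simulate_ar1(rho, n, sigma=1.0, seed=0):
    rng = np.random.default_rng(seed)
    eps = rng.normal(0.0, sigma, n)   # white-noise innovations
    u = np.zeros(n)
    u[0] = eps[0]
    for t in range(1, n):
        u[t] = rho * u[t - 1] + eps[t]
    return u

u = simulate_ar1(rho=0.8, n=500)
# Lag-1 sample correlation should be close to rho = 0.8.
r1 = np.corrcoef(u[:-1], u[1:])[0, 1]
```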
Detecting autocorrelation
1) Graphical methods: plot the residuals against time, or plot them against their own lagged values, and see if there is a systematic pattern.
2) Formal tests:
a) Runs test. We define a run as an uninterrupted sequence of residuals of the same sign, and the length of a run as the number of elements in it. Example 12.6.1 (pg 465) shows 3 runs: one of length 9, one of length 21 and one of length 10. Very few runs are an indication of positive autocorrelation; too many runs are a sign of negative autocorrelation.
We consider the number of runs to determine whether or not the sequence is random. Let N = total number of observations, N1 = number of positive symbols, N2 = number of negative symbols, and K = number of runs. H0: the residuals are independent (no autocorrelation). We construct a confidence interval for K; if the actual number of runs falls outside this interval, we reject H0.
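The runs test can be sketched in code. Under H0, K is approximately normal with mean 2·N1·N2/N + 1 and variance 2·N1·N2·(2·N1·N2 - N) / (N²(N - 1)); the function name is mine, and the alternating example is illustrative:

```python
import math

# Sketch of the runs test on residual signs (function name is mine).
def runs_test(residuals):
    signs = [r > 0 for r in residuals]
    n1 = sum(signs)                    # number of positive residuals
    n2 = len(signs) - n1               # number of negative residuals
    n = n1 + n2
    k = 1 + sum(signs[i] != signs[i - 1] for i in range(1, n))  # runs
    mean = 2.0 * n1 * n2 / n + 1       # E(K) under randomness
    var = 2.0 * n1 * n2 * (2 * n1 * n2 - n) / (n ** 2 * (n - 1))
    z = (k - mean) / math.sqrt(var)    # reject H0 if |z| > 1.96 (95% level)
    return k, mean, z

# Alternating signs give the maximum number of runs, a sign of
# negative autocorrelation.
k, mean, z = runs_test([1, -1, 1, -1, 1, -1, 1, -1])
```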
b) Durbin-Watson d statistic. We assume: 1) an AR(1) process; 2) the model has an intercept term; 3) the explanatory variables are fixed; 4) the regression does not include lagged values of the dependent variable; 5) there are no missing observations in the data.
The Durbin-Watson statistic can be derived (see pg 468, Gujarati) to be the following:

d = Σ (e_t - e_{t-1})² / Σ e_t² ≈ 2(1 - ρ̂).

This implies 0 ≤ d ≤ 4: ρ = 0 implies d = 2 (no autocorrelation), ρ = 1 implies d = 0 (positive autocorrelation), and ρ = -1 implies d = 4 (negative autocorrelation).
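A minimal sketch of computing d = Σ(e_t - e_{t-1})²/Σe_t² from residuals (the helper name and the simulated series are mine): independent residuals give d near 2, while a highly persistent series gives d near 0:

```python
import numpy as np

# Sketch: compute the Durbin-Watson d statistic
#   d = sum((e_t - e_{t-1})^2) / sum(e_t^2)
def durbin_watson(e):
    e = np.asarray(e, dtype=float)
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

rng = np.random.default_rng(1)
white = rng.normal(size=1000)
d_white = durbin_watson(white)                   # near 2: no autocorrelation
d_persistent = durbin_watson(np.cumsum(white))   # random walk: near 0
```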
We look at the Durbin-Watson tables. They are organized in terms of a) the sample size N and b) the number of explanatory variables k' = k - 1. The table reports two critical values: the upper limit d_U and the lower limit d_L. The null is that there is no autocorrelation. NB: see diagram 12.10, pg 469, for the decision criteria of whether to reject or accept the null.

How do we correct for autocorrelation?
A) If ρ is known: we use a form of GLS, i.e. we transform the variables so that there is no autocorrelation. We start off with the model

Y_t = β1 + β2 X_t + u_t.   (1)

We also assume an AR(1) process, u_t = ρ u_{t-1} + ε_t.
We then assume the same relationship holds for the previous period:

Y_{t-1} = β1 + β2 X_{t-1} + u_{t-1}.

Multiplying by ρ on both sides we get

ρ Y_{t-1} = ρ β1 + ρ β2 X_{t-1} + ρ u_{t-1}.   (2)

Subtracting (2) from (1), we get

Y_t - ρ Y_{t-1} = β1 (1 - ρ) + β2 (X_t - ρ X_{t-1}) + ε_t,

where ε_t = u_t - ρ u_{t-1}. The OLS assumptions now hold, so the GLS estimators are BLUE. The slope coefficient is the same as in the original model; to get back to the intercept of the original model, we divide the GLS intercept by (1 - ρ).
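The quasi-difference transformation can be sketched as follows (ρ, the coefficients and the simulated data are illustrative assumptions, and the function name is mine). Regressing the transformed Y on the transformed X and dividing the fitted intercept by (1 - ρ) recovers the original parameters:

```python
import numpy as np

# Sketch of the known-rho GLS (quasi-difference) transformation.
def quasi_difference(y, x, rho):
    # Drops the first observation (Prais-Winsten would keep it).
    return y[1:] - rho * y[:-1], x[1:] - rho * x[:-1]

# Illustrative data: y = 1 + 2x + u, with AR(1) errors, rho = 0.7.
rng = np.random.default_rng(2)
n, rho = 200, 0.7
x = rng.normal(size=n)
eps = rng.normal(size=n)
u = np.zeros(n)
for t in range(1, n):
    u[t] = rho * u[t - 1] + eps[t]
y = 1.0 + 2.0 * x + u

y_star, x_star = quasi_difference(y, x, rho)
X = np.column_stack([np.ones(n - 1), x_star])
g = np.linalg.lstsq(X, y_star, rcond=None)[0]
b1_hat = g[0] / (1 - rho)   # recover the original intercept
b2_hat = g[1]               # slope is estimated directly
```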
Since we lose an observation when differencing, we can give the Prais-Winsten transformation command to avoid losing that observation.
B) If ρ is not known:
1) Assume ρ = 1 and estimate the corresponding first-difference model.
2) Estimate ρ from the Durbin-Watson statistic (ρ̂ ≈ 1 - d/2), transform the data and apply OLS.
3) Use the Cochrane-Orcutt iterative procedure.
a) Fit the OLS equation.
b) Use the residuals from that regression to run the regression e_t = ρ e_{t-1} + v_t and estimate ρ.
c) Run the generalized difference (GLS) equation to re-estimate the coefficients.
d) Use these estimates to calculate a new set of residuals.
e) Use these residuals to re-estimate ρ.
f) Run generalized least squares to get new estimates.
g) Continue until successive estimates of ρ differ by very small amounts (< 0.005).

Stata will run the Cochrane-Orcutt procedure automatically if you give the relevant command.
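The steps above can be sketched as an iterative loop (the function name and the simulated data are mine; this is a sketch, not Stata's implementation):

```python
import numpy as np

# Sketch of the Cochrane-Orcutt iterative procedure.
def cochrane_orcutt(y, x, tol=0.005, max_iter=50):
    n = len(y)
    X = np.column_stack([np.ones(n), x])
    beta = np.linalg.lstsq(X, y, rcond=None)[0]   # (a) initial OLS fit
    rho = 0.0
    for _ in range(max_iter):
        e = y - X @ beta                          # (b)/(d) residuals
        rho_new = (e[1:] @ e[:-1]) / (e[:-1] @ e[:-1])  # e_t on e_{t-1}
        # (c)/(f) generalized difference regression
        ys = y[1:] - rho_new * y[:-1]
        Xs = np.column_stack([np.ones(n - 1), x[1:] - rho_new * x[:-1]])
        g = np.linalg.lstsq(Xs, ys, rcond=None)[0]
        beta = np.array([g[0] / (1 - rho_new), g[1]])  # original scale
        if abs(rho_new - rho) < tol:              # (g) stop when rho settles
            break
        rho = rho_new
    return beta, rho_new

# Illustrative data: y = 1 + 2x + u, AR(1) errors with rho = 0.6.
rng = np.random.default_rng(3)
n = 300
x = rng.normal(size=n)
eps = rng.normal(size=n)
u = np.zeros(n)
for t in range(1, n):
    u[t] = 0.6 * u[t - 1] + eps[t]
y = 1.0 + 2.0 * x + u

beta_hat, rho_hat = cochrane_orcutt(y, x)
```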
Note: if we have to estimate ρ, the procedure is known as estimated generalized least squares (EGLS), and the properties of the estimators are valid only in large samples.