
Paper 1:

TESTING FOR UNIT ROOTS WITH COINTEGRATED DATA

Paper 2:

ON THE PRACTICAL IMPORTANCE OF HURWICZ BIAS IN MODELS WITH LAGGED DEPENDENT VARIABLES

TESTING FOR UNIT ROOTS WITH COINTEGRATED DATA

by

W. Robert Reed
Department of Economics and Finance
University of Canterbury, New Zealand

Email: [email protected]

Abstract

This paper demonstrates that unit root tests can suffer from inflated Type I error rates when data are cointegrated. Results from Monte Carlo simulations show that three commonly used unit root tests – the ADF, Phillips-Perron, and DF-GLS tests – frequently overreject the true null of a unit root for at least one of the cointegrated variables. The reason for this overrejection is that unit root tests, designed for random walk data, are often misspecified when data are cointegrated. While the addition of lagged differenced (LD) terms can eliminate the size distortion, this “success” is spurious, driven by collinearity between the lagged dependent variable and the LD explanatory variables. Accordingly, standard diagnostics such as (i) testing for serial correlation in the residuals and (ii) using information criteria to select among different lag specifications are futile. The implication of these results is that researchers should be conservative in the weight they attach to individual unit root tests when determining whether data are cointegrated.

Keywords: Unit root testing, cointegration, DF-GLS test, Augmented Dickey-Fuller test, Phillips-Perron test, simulation

JEL classification: C32, C22, C18

May 30, 2015

Acknowledgments: This paper has been much improved as a result of comments from David Giles, Peter Phillips, Morten Nielsen, Marco Reale, and seminar participants at the “1st Conference on Recent Developments in Financial Econometrics and Applications” held at Deakin University, 4-5 December, 2014. Remaining errors are the sole property of the author.


I. INTRODUCTION

When estimating relationships among time series data, it is standard practice to first test for

unit roots in the individual series. If the data are integrated, one then moves to testing

whether the variables are cointegrated. This note points out that unit root tests are likely to

suffer from size distortions precisely because the data are cointegrated. These size distortions

are often substantial.

I illustrate this using a simple autoregressive, distributed lag (ARDL) system of two

variables. The ARDL framework has a number of features which make it attractive for

modelling dynamic relationships. It allows for interactions between variables, and

incorporates both endogeneity and own and cross-lagged effects. These features capture

likely behaviors of real economic time series. The ARDL framework can be “solved” to

identify parameter values that cause the two variables to be cointegrated. Furthermore, the

ARDL framework is easily transformed to an error correction specification, which facilitates

interpretation of dynamic relationships.

TABLE 1 illustrates the problem with size. X and Y are two simulated data series

where the parameter values for the data generating process (DGP) have been chosen to ensure

that they are cointegrated. Because the series are cointegrated, each of the series must have a

unit root. I subject each series to three unit root tests: the augmented Dickey-Fuller test

(ADF), the Phillips-Perron test, and the DF-GLS test. I ran 10,000 simulations, each with a

sample size of 100, and set the significance level to 0.05. The table reports the associated

Type I error rates. All simulations were done using Stata, Version 14.¹

While the ADF and DF-GLS tests produce Type I error rates for X close to 0.05, the

Phillips-Perron test produces an error rate over 0.40. For Y, the results are much worse.

Type I error rates are 0.206, 1.000, and 0.685 for the ADF, Phillips-Perron, and DF-GLS

1 All programs used to produce the results for this paper are available from the author.


tests, respectively.² The ADF regressions show good diagnostics, with little serial correlation

evident in the residuals. As I show below, unit root test results such as these are quite easy to

produce with cointegrated data.

I proceed as follows. Section II presents the theory that motivates the simulation

work. Section III presents additional Monte Carlo evidence of size distortions for

cointegrated data. Section IV provides an explanation for my results. Section V concludes

by discussing the implication of these findings for estimation of error correction models.

II. THEORY

Consider the following ARDL(1,1) model.

, ~ 0,1 , 1) , ~ 0,1 ,

1,2, … , . This can be rewritten in VAR form as:

(2) .

where the parameters , i=1,2, j=1,2,3,4 are each functions of the and terms of

Equation (1).

Define . The determinant of the matrix

is the characteristic equation of , and the values of that

set this equation equal to zero are the associated characteristic roots, or eigenvalues:

(3) 0 .

A necessary condition for Y and X to be CI(1,1) is that the corresponding solutions to (3) be

given by $\lambda_1 = 1$, $|\lambda_2| < 1$.

2 Lag lengths for the Phillips-Perron and DF-GLS tests were chosen using the default options supplied by Stata.


The following conditions on , , and are sufficient to ensure that

1, | | 1. 3

(4a) 0 1

(4b) 0 1

(4c) 1 .

We can work backwards from (4a) – (4c) to obtain and values consistent with 1,

| | 1.

Let and take any values such that 1. Then

(5a)

(5b) 1

(5c)

(5d) 1

will produce , , and values such that 1, | | 1.

Equation (2) can be arranged in vector error correction (VEC) model form as:

∆ , 6) ∆ ,

where the coefficients , , and , as well as the error terms and , are functions of

the terms, i=1,2, j=1,2,3,4. This allows the long-run equilibrium relationship between

and , represented by the parameter ; and the speed-of-adjustment parameters and , to

all be expressed as functions of , , and .

7a)

7b)

3 These conditions are taken from Enders (2010, page 369). I have made the conditions more restrictive to make sure that the speed of adjustment parameters have the correct sign and size.


7c) .

The parameter values chosen in this way will ensure that (i) 1 0, and (ii) 1

0, so that the VEC model is well-behaved.

III. RESULTS

This section reports results from ten additional cases that highlight the problem with size

distortions. The first two columns of TABLE 2 describe the model parameters and time

series characteristics associated with the data generating processes (DGPs) for each case. I

have chosen cases that cover a wide range of behaviours. The cases are sorted in ascending

order of , the speed of adjustment coefficient for the Y series. ranges from a low of

-0.16 to a high of -0.90. The last column reports the characteristic roots associated with the

respective model parameters. In all cases, $\lambda_1 = 1$ and $|\lambda_2| < 1$.

TABLE 3 reports more simulation findings demonstrating that the results from

TABLE 1 are not isolated outcomes. The ten panels of TABLE 3 correspond to the ten cases

of TABLE 2. The top panel reports Monte Carlo results using the parameter values from

Case 1:

0, 2, 1.24, 1.70, 0, 5, 4.40, 1.75.

The fact that both and are nonzero implies that if one of the series is I(1), the other

must be as well (cf. Equation 1). These values generate a VEC model with long-run

equilibrium and speed of adjustment parameters 1.25, 0.16, and 0.25.

This implies that the long-run relationship between Y and X is given by 1.25 . A

one-unit increase in from its equilibrium value causes the next period’s value of Y to

decrease by 0.16 units. A one-unit increase in from its equilibrium value causes the next

period’s value of X to decrease by 0.25 units (= ). As in TABLE 1, the Monte Carlo

results are based on 10,000 simulations, each with a sample size of 100. As a point of comparison,


TABLE 3 also reports the results of unit root tests for a random walk variable,

$Z_t = Z_{t-1} + \varepsilon_t$, $\varepsilon_t \sim N(0,1)$. The Z column is useful for illustrating the range of deviations that can be

expected from sampling error.
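As a rough consistency check on the Case 1 values quoted above, the sketch below builds the reduced-form VAR(1) matrix implied by an error correction model with a long-run coefficient of 1.25 and own-deviation responses of -0.16 (for Y) and -0.25 (for X), and then computes its characteristic roots. The variable names and the exact error correction form are assumptions made for illustration, since the paper's notation is not reproduced here; the signs follow the wording "decrease by 0.16 units" and "decrease by 0.25 units".

```python
import math

# Case 1 values as described in the text (signs are an assumption based on
# the wording "decrease by 0.16 units" and "decrease by 0.25 units").
theta = 1.25    # long-run relationship: Y = theta * X
adj_y = -0.16   # change in next-period Y per unit deviation of Y
adj_x = -0.25   # change in next-period X per unit deviation of X

# Assumed error correction form (hypothetical notation):
#   dY_t = adj_y * (Y_{t-1} - theta * X_{t-1}) + error
#   dX_t = (-adj_x / theta) * (Y_{t-1} - theta * X_{t-1}) + error
# Adding the lagged level back gives the VAR(1) coefficients.
a11 = 1.0 + adj_y
a12 = -adj_y * theta
a21 = -adj_x / theta
a22 = 1.0 + adj_x


def eigenvalues_2x2(a, b, c, d):
    """Real eigenvalues of [[a, b], [c, d]], smallest first."""
    trace = a + d
    det = a * d - b * c
    disc = trace * trace - 4.0 * det
    if disc < 0:
        raise ValueError("characteristic roots are complex")
    root = math.sqrt(disc)
    return (trace - root) / 2.0, (trace + root) / 2.0


lam_small, lam_big = eigenvalues_2x2(a11, a12, a21, a22)
print(round(lam_big, 4), round(lam_small, 4))
```

Under these assumptions the roots come out as 1 and 0.59, which agrees with the Case 1 entry in the last column of TABLE 2.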

The results for the X variable demonstrate that the size distortions associated with

each of the tests can be quite substantial. The Type I error rates for the ADF, Phillips-Perron,

and DF-GLS tests are 53.6, 87.6, and 64.0 percent, respectively. Thus, given sample data

from this DGP and applying any of the three tests supplied in Stata, a researcher would

incorrectly conclude that the X variable was stationary over half the time. The results for Y

also show size distortions, but of a smaller degree. These results are to be compared to those

reported for the Z variable, which is a pure random walk process. All three unit root tests

produce Type I error rates for Z that are close to 5 percent.
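The benchmark behaviour of the Z column can be illustrated with a short simulation. The sketch below is not the author's Stata code. It is a minimal pure-Python Dickey-Fuller regression with a constant and no lagged differences, applied to simulated random walks of length 100, and it uses -2.89 as an approximate 5 percent critical value (an assumption taken from standard Dickey-Fuller tables for roughly 100 observations). All function names are hypothetical.

```python
import math
import random


def df_tstat(y):
    """t-statistic on rho from OLS of dy_t = c + rho * y_{t-1} + e_t."""
    lagged = y[:-1]
    diffs = [y[t] - y[t - 1] for t in range(1, len(y))]
    n = len(diffs)
    mean_lag = sum(lagged) / n
    mean_diff = sum(diffs) / n
    sxx = sum((v - mean_lag) ** 2 for v in lagged)
    sxy = sum((v - mean_lag) * (d - mean_diff) for v, d in zip(lagged, diffs))
    rho = sxy / sxx
    const = mean_diff - rho * mean_lag
    ssr = sum((d - const - rho * v) ** 2 for v, d in zip(lagged, diffs))
    se_rho = math.sqrt(ssr / (n - 2) / sxx)
    return rho / se_rho


def random_walk(length, rng):
    """Z_t = Z_{t-1} + e_t with e_t ~ N(0, 1), started at zero."""
    z = [0.0]
    for _ in range(length - 1):
        z.append(z[-1] + rng.gauss(0.0, 1.0))
    return z


def rejection_rate(reps=2000, length=100, crit=-2.89, seed=12345):
    """Share of simulated random walks for which the unit root null is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        if df_tstat(random_walk(length, rng)) < crit:
            rejections += 1
    return rejections / reps


print(rejection_rate())
```

With these settings the rejection rate should land near the nominal 5 percent, which is the pattern the Z column shows.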

As is well-known, results from unit root tests can differ substantially depending on the

number of lagged differenced (LD) terms included in the unit root specification. Stata

automatically selects the number of LD terms for the Phillips-Perron and DF-GLS tests. The

ADF test requires the user to supply the number of lags. For the ADF tests, I chose lag

orders that were sufficient to generate white noise behaviour in the residuals.⁴ The last row

of the panel reports the results of a Breusch-Godfrey test where the null hypothesis is no

serial correlation. The test results for the X and Y variables are close to the value of 0.05 that

one would expect were there no serial correlation. These are virtually identical to those for

the random walk Z variable which has no serial correlation by construction. Based on these

results, a researcher would conclude that the ADF test was correctly specified.

The next nine panels report more unit root test results. I have highlighted the results

that show substantial size distortions. In the second panel, unit root tests for both the X and Y

variables reveal Type I error rates well above 5 percent for all three tests. For the ADF test,

4 Lag lengths for Cases 1 to 10 are 2, 2, 4, 3, 6, 2, 3, 4, 4, and 5, respectively.


Type I error rates are 0.390 and 0.309, respectively. The Breusch-Godfrey test results

indicate that sufficient lags have been included in the ADF specification. For the Phillips-

Perron and DF-GLS tests, the Type I error rates are 0.770 and 0.655, and 0.492 and 0.394,

respectively. The results for the X and Y variables contrast with the results for the benchmark

Z variable, which are approximately 5 percent across all three tests. The results from this

second panel indicate that a researcher would frequently conclude that X and Y were

both stationary, and hence not cointegrated.

The highlighted areas in the subsequent panels accumulate further evidence that unit

root tests of cointegrated data are frequently characterized by substantial size distortions. An

egregious example is Case 6, where the Type I error rates for the Y variable are 0.921, 1.000,

and 0.931. A researcher would incorrectly classify the order of integration for this variable

over 90 percent of the time using any of the three unit root tests provided by Stata.

It turns out that the size distortions for the ADF test can be eliminated by adding

sufficient lagged differenced (LD) terms to the ADF specification. However, knowing the

correct number of LD terms to add is impossible in practice. Two common methods for

determining the number of LD terms are (i) testing the residuals for serial correlation; and (ii)

using information criteria such as the AIC and SIC to select the LD specification with the

lowest AIC/SIC value (Harris, 1992).

TABLE 4 reports the results of an analysis where these two methods are employed to

determine the appropriate number of LD terms to add to the ADF specification. The X and Y

data for TABLE 4 are generated using the DGP for Case 1. As before, the Z data are pure

random walk data and are included as a benchmark. One to ten LD terms are successively

added to the ADF specification. Breusch-Godfrey tests for each LD specification are

reported in the top panel of TABLE 4. Average AIC and SIC values for each LD

specification are reported in the subsequent two panels.


TABLE 4 is designed to address this thought experiment: based on the results from

the first three panels, how many LD terms would a researcher think is the “correct” number

of terms to add? For example, when LAGS = 1, the null hypothesis of no serial correlation is

rejected approximately 6.0 and 5.7 percent of the time for the X and Y series. For LAGS = 2,

rejection rates are 5.6 and 5.5 percent.⁵ The average AIC values for the X and Y series when

LAGS = 1 are 168.96 and 41.51. These successively increase as additional LD terms are

added. Likewise, the average SIC values for the X and Y series achieve their minimum when

LAGS = 1. Using the diagnostics from these three panels, a researcher might conclude that

the “correct” number of LD terms to add was 1 or 2.

The fourth panel of TABLE 4 reports the ADF Type I error rates for each LD

specification. When LAGS = 1, the Type I error rates for the X and Y variables are 0.734 and

0.148. When LAGS = 2, they are 0.546 and 0.106, respectively. In other words, using

commonly accepted methods for determining the appropriate number of LD terms, a

researcher would likely conclude that one, or at most two, LD terms were sufficient to control

for serial correlation in the ADF specification. Either strategy would result in the researcher

concluding that the X variable was stationary over half the time. In fact, it would take ten or

more LD terms to reduce the size of the ADF test to 5 percent. The diagnostic tests are

unable to identify the appropriate number of LD terms.

Similar results are obtained for the remaining cases (the Appendix reports the results

of following the same procedure for Case 2). In all cases, the information criteria select a

single LD term. Tests for serial correlation generally indicate that more than one LD term

should be included, but not so many as to eliminate the size distortion. The next section

explains why the diagnostic tests fail: the apparent “success” from

adding LD terms to the ADF specification is spurious.

5 In practice, a researcher only has a single test for serial correlation to go on, so it is likely that he/she would find a single LD term to be sufficient in this case.


IV. DISCUSSION

The explanation for the poor size performance of unit root tests with cointegrated data can be

linked to how critical values are determined in the Dickey-Fuller framework. Given Monte

Carlo simulation of the random walk process,

$y_t = y_{t-1} + \varepsilon_t$,

repeated OLS estimation of the DF specification below

$\Delta y_t = \rho y_{t-1} + \varepsilon_t$

produces an empirical distribution of t values, $t = \hat{\rho} / \mathrm{s.e.}(\hat{\rho})$, associated with testing the null

hypothesis $H_0: \rho = 0$, which is true given the random walk process.

In my simulations of the ARDL(1,1) data, the DGP is given by

,

,

and the corresponding differenced specifications are given by

∆

∆ ,

where 1 , 1 , , and

. While variables Y and X are each I(1) in the ten cases of TABLE 2,

1 0 and 1 0. In other words, the unit root tests are

misspecified: the null hypothesis : 0, is not true for those cases where / is not

equal to one.

This also explains why adding more LD terms “improves” the performance of the

ADF test. As in System GMM, the LD terms are predictors of the lagged dependent variable.

The more LD terms, the greater the multicollinearity in the ADF specification, and the more

statistically insignificant the coefficient becomes on the lagged dependent variable. This

makes it appear that the ADF test is more “successful” in identifying unit root processes as


more LD terms are added to the ADF specification. This is a spurious success due to

increased collinearity between the lagged dependent variable and the LD terms.
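The direction of the size distortion discussed in this section can be illustrated with a small simulation. The sketch below is an illustration under stated assumptions, not a reproduction of the paper's Stata experiments: it simulates the reduced-form VAR(1) implied by the Case 1 adjustment values (coefficients 0.84, 0.20, 0.20, 0.75, which are derived here and not quoted from the paper), draws the reduced-form errors as independent N(0,1) variables, and applies a Dickey-Fuller regression with a constant and no lagged differences, using -2.89 as an approximate 5 percent critical value. All names are hypothetical.

```python
import math
import random


def df_tstat(y):
    """t-statistic on rho from OLS of dy_t = c + rho * y_{t-1} + e_t."""
    lagged = y[:-1]
    diffs = [y[t] - y[t - 1] for t in range(1, len(y))]
    n = len(diffs)
    mean_lag = sum(lagged) / n
    mean_diff = sum(diffs) / n
    sxx = sum((v - mean_lag) ** 2 for v in lagged)
    sxy = sum((v - mean_lag) * (d - mean_diff) for v, d in zip(lagged, diffs))
    rho = sxy / sxx
    const = mean_diff - rho * mean_lag
    ssr = sum((d - const - rho * v) ** 2 for v, d in zip(lagged, diffs))
    return rho / math.sqrt(ssr / (n - 2) / sxx)


def simulate_case1(length, rng, burn=50):
    """Cointegrated pair from an assumed reduced-form VAR(1) with roots 1 and 0.59."""
    y, x = 0.0, 0.0
    ys, xs = [], []
    for t in range(burn + length):
        e, v = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        # Both right-hand sides use the previous period's values.
        y, x = 0.84 * y + 0.20 * x + e, 0.20 * y + 0.75 * x + v
        if t >= burn:
            ys.append(y)
            xs.append(x)
    return ys, xs


def rejection_rates(reps=1000, length=100, crit=-2.89, seed=2015):
    """Unit root rejection rates for Y and X across simulated samples."""
    rng = random.Random(seed)
    rej_y = rej_x = 0
    for _ in range(reps):
        ys, xs = simulate_case1(length, rng)
        rej_y += df_tstat(ys) < crit
        rej_x += df_tstat(xs) < crit
    return rej_y / reps, rej_x / reps


rate_y, rate_x = rejection_rates()
print("Y:", rate_y, "X:", rate_x)
```

Both series have a unit root by construction, so any rejection is a Type I error. Under these assumptions the rejection rate for X should come out well above the nominal 5 percent, in the same direction as the results in TABLE 3. The magnitudes are not comparable, because the error structure and lag choices differ from the paper's simulations.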

V. CONCLUSION

This paper demonstrates that unit root tests can suffer from inflated Type I error rates when

data are cointegrated. This should be of interest to researchers who are interested in

estimating relationships between nonstationary variables. Standard procedure calls for

testing variables for unit roots before proceeding to the estimation of error correction models.

The results of this study demonstrate that the very fact that the data are cointegrated can

render unit root tests unreliable. This suggests that researchers should be conservative in the

weight they attach to individual unit root tests, opting for a more holistic approach when

determining whether data are cointegrated.


REFERENCES

Enders, W. 2010. Applied Econometric Time Series, 3rd Edition, New York: John Wiley & Sons.

Harris, R.I.D. 1992. Testing for unit roots using the augmented Dickey-Fuller test: Some issues relating to the size, power and the structure of the test. Economics Letters 38: 381-386.


TABLE 1

Example of Unit Root Test Results Using Cointegrated Data

UNIT ROOT TEST X Y

ADF 0.051 0.206

Phillips-Perron 0.410 1.000

DF-GLS 0.087 0.685

NOTE: Values in the table are Type I error rates associated with the null

hypothesis that the data series have a unit root. The underlying DGP is the ARDL framework represented by Equation 1 in the text, with the following parameter values:

0, 3, 1, 3, 0, 1, 0.2, 0.2

The corresponding long-run equilibrium and speed of adjustment parameters (see Equations 7a)-7c) are given by: 0.1, 0.5, 1, 0.1.


TABLE 2

DESCRIPTION OF CASES

CASE   MODEL PARAMETERS (1)                  TIME SERIES CHARACTERISTICS (2)   CHARACTERISTIC ROOTS (3)

1      2.00, 1.24, 1.70, 5.00, 4.40, 1.75    0.16, 0.20, 0.25, 1.25            1, 0.59

2      2.00, 1.80, 1.60, 5.00, 4.50, 1.25    0.20, 0.50, 0.25, 0.50            1, 0.55

3      2.00, 0.35, 0.60, 3.00, 2.05, 2.80    0.25, 0.20, 0.80, 4.00            1, 0.05

4      1.00, 1.45, 0.90, 5.00, 3.65, 1.30    0.45, 0.90, 0.20, 0.22            1, 0.35

5      0.50, 0.72, 0.85, 1.00, 0.92, 1.10    0.48, 0.40, 0.50, 1.25            1, 0.02

6      3.00, 1.00, 3.00, 1.00, 0.20, 0.20    0.50, 1.00, 0.10, 0.10            1, 0.40

7      2.00, 2.20, 1.80, 5.00, 2.90, 1.35    0.60, 0.90, 0.15, 0.17            1, 0.25

8      0.80, 0.72, 1.08, 0.60, 0.64, 0.96    0.60, 0.40, 0.40, 1.00            1, 0

9      0.80, 0.52, 1.16, 0.60, 0.52, 1.06    0.80, 0.40, 0.30, 0.75            1, 0.1

10     2.00, 1.90, 1.90, 5.00, 1.40, 1.40    0.90, 0.90, 0.10, 0.11            1, 0

NOTE: Each of the cases above is based on the ARDL framework of Equation (1), with associated parameter values given in Column

(1). The values of the characteristics , , and reported in Column (2) are, respectively, the values of the speed of adjustment parameters and the long-run relationship parameter between Y and X in the error correction models of Equation (6) that correspond to the values of the model parameters for that case. identifies the systematic change in ∆ corresponding to a one-unit change in . Necessary conditions for the series to be well-behaved are (i) 1 0, and (ii) 1 0. The last column reports the values of the characteristic roots in the VAR specification of Equation (2) for that case. A necessary condition for the series to be cointegrated is that 1, | | 1.


TABLE 3

More Examples of Unit Root Test Results Using Cointegrated Data

CASE TEST X Y Z

1

ADF 0.536 0.100 0.055

Phillips-Perron 0.876 0.203 0.062

DF-GLS 0.640 0.121 0.045

BREUSCH-GODFREY TEST: 0.056 0.058 0.059

2

ADF 0.390 0.309 0.055


DF-GLS 0.492 0.394 0.045

BREUSCH-GODFREY TEST 0.059 0.057 0.059

3

ADF 0.113 0.048 0.049


DF-GLS 0.429 0.066 0.045


4

ADF 0.200 0.403 0.053


DF-GLS 0.419 0.680 0.045


5

ADF 0.159 0.106 0.045


DF-GLS 0.756 0.639 0.045


6

ADF 0.051 0.921 0.055


DF-GLS 0.038 0.931 0.045


7

ADF 0.083 0.578 0.053


DF-GLS 0.170 0.808 0.045


8

ADF 0.266 0.405 0.049


DF-GLS 0.677 0.782 0.045



9

ADF 0.087 0.311 0.049


DF-GLS 0.354 0.696 0.045


10

ADF 0.050 0.347 0.048


DF-GLS 0.085 0.826 0.045


NOTE: The values in the table are the rejection rates of the respective null hypothesis. For the unit root tests (ADF, Phillips-Perron, and DF-GLS), the null hypothesis is that the series has a unit root. For the Breusch-Godfrey tests, the null hypothesis is that the residuals associated with the ADF test are not serially correlated.


TABLE 4 The Effect of Adding Lagged Differenced Terms to the Dickey-Fuller Unit Root Regression Equation: Case 1

X Y Z

BREUSCH-GODFREY TESTS:

LAGS = 1 0.060 0.057 0.057

LAGS = 2 0.056 0.055 0.063

LAGS = 3 0.062 0.056 0.060

LAGS = 4 0.056 0.054 0.059

LAGS = 5 0.059 0.059 0.060

LAGS = 6 0.060 0.060 0.062

LAGS = 7 0.059 0.061 0.061

LAGS = 8 0.067 0.058 0.063

LAGS = 9 0.056 0.064 0.065

LAGS = 10 0.060 0.061 0.064

AIC VALUES:

LAGS = 1 168.96 41.51 279.67

LAGS = 2 169.93 42.39 280.44

LAGS = 3 170.91 43.55 281.41

LAGS = 4 171.84 44.49 282.39

LAGS = 5 172.91 45.38 283.40

LAGS = 6 173.65 46.40 284.03

LAGS = 7 174.65 47.31 284.93

LAGS = 8 175.51 48.31 285.79

LAGS = 9 176.73 49.25 286.81

LAGS = 10 177.49 50.05 287.48

SIC VALUES:

LAGS = 1 179.34 51.89 290.05

LAGS = 2 182.90 55.37 293.42

LAGS = 3 186.48 59.12 296.98

LAGS = 4 190.00 62.66 300.56

LAGS = 5 193.67 66.14 304.16

LAGS = 6 197.01 69.75 307.39

LAGS = 7 200.60 73.26 310.88

LAGS = 8 204.06 76.86 314.33

LAGS = 9 207.87 80.39 317.95

LAGS = 10 211.23 83.78 321.22


ADF UNIT ROOT TESTS:

LAGS = 1 0.734 0.148 0.054

LAGS = 2 0.546 0.106 0.050

LAGS = 3 0.405 0.085 0.055

LAGS = 4 0.291 0.068 0.054

LAGS = 5 0.222 0.063 0.047

LAGS = 6 0.166 0.054 0.048

LAGS = 7 0.143 0.054 0.043

LAGS = 8 0.116 0.046 0.042

LAGS = 9 0.092 0.051 0.045

LAGS = 10 0.082 0.046 0.044

NOTE: The values in the top panel (“Breusch-Godfrey Tests”) are the rejection rates associated with the null hypothesis of no serial correlation for alternative specifications of lagged differenced (LD) terms in the ADF specification. The values in the next two panels (“AIC Values” and “SIC Values”) are the average information criteria values associated with the respective LD specifications. The number of observations is held constant across the different specifications. The values in the bottom panel (“ADF Unit Root Tests”) are the Type I error rates associated with the null hypothesis of a unit root using the ADF test with the designated number of lagged, differenced terms.


APPENDIX The Effect of Adding Lagged Differenced Terms to the Dickey-Fuller Unit Root Regression Equation: Case 2

X Y Z

BREUSCH-GODFREY TESTS:

LAGS = 1 0.069 0.069 0.057

LAGS = 2 0.060 0.060 0.063

LAGS = 3 0.063 0.059 0.060

LAGS = 4 0.056 0.057 0.059

LAGS = 5 0.062 0.056 0.060

LAGS = 6 0.059 0.058 0.062

LAGS = 7 0.059 0.061 0.061

LAGS = 8 0.065 0.056 0.063

LAGS = 9 0.060 0.060 0.065

LAGS = 10 0.063 0.064 0.064

AIC VALUES:

LAGS = 1 179.18 14.35 279.67

LAGS = 2 179.93 15.04 280.44

LAGS = 3 181.00 16.15 281.41

LAGS = 4 181.89 16.95 282.39

LAGS = 5 182.89 17.86 283.40

LAGS = 6 183.62 18.93 284.03

LAGS = 7 184.67 19.96 284.93

LAGS = 8 185.51 20.94 285.79

LAGS = 9 186.71 21.91 286.81

LAGS = 10 187.58 22.78 287.48

SIC VALUES:

LAGS = 1 189.56 24.73 290.05

LAGS = 2 192.90 28.02 293.42

LAGS = 3 196.57 31.72 296.98

LAGS = 4 200.06 35.12 300.56

LAGS = 5 203.65 38.62 304.16

LAGS = 6 206.97 42.28 307.39

LAGS = 7 210.62 45.91 310.88

LAGS = 8 214.06 49.49 314.33

LAGS = 9 217.85 53.05 317.95

LAGS = 10 221.32 56.52 321.22


ADF UNIT ROOT TESTS:

LAGS = 1 0.592 0.477 0.054

LAGS = 2 0.411 0.324 0.050

LAGS = 3 0.287 0.218 0.055

LAGS = 4 0.202 0.163 0.054

LAGS = 5 0.145 0.123 0.047

LAGS = 6 0.113 0.094 0.048

LAGS = 7 0.098 0.083 0.043

LAGS = 8 0.079 0.065 0.042

LAGS = 9 0.067 0.056 0.045

LAGS = 10 0.059 0.052 0.044

NOTE: Values in the top panel (“Breusch-Godfrey Tests”) are the rejection rates associated with the null hypothesis of no serial correlation. Values in the next two panels (“AIC Values” and “SIC Values”) are the average information criteria values associated with the respective lag specifications. The number of observations is held constant across the different specifications. The values in the bottom panel (“ADF Unit Root Tests”) are the Type I error rates associated with the null hypothesis of a unit root using the ADF test with the designated number of lagged, differenced terms.

NOTE: This is an early draft of the paper. Please do not circulate.

ON THE PRACTICAL IMPORTANCE OF HURWICZ BIAS IN MODELS WITH LAGGED DEPENDENT VARIABLES

by

W. Robert Reed
Department of Economics and Finance
University of Canterbury, New Zealand

Email: [email protected]

Min Zhu
School of Economics and Finance
Queensland University of Technology, Australia

Email: [email protected]

Abstract

It is well-known that OLS estimation of a time series model in which $y_t$ is regressed on $y_{t-1}$ produces biased estimates of the autoregressive coefficient in finite samples, though OLS is consistent. This bias was first discussed in Hurwicz (1950), and investigated further in Phillips (1977). Most applied researchers are seemingly unperturbed by this fact, presumably resting their confidence on the asymptotic properties of the OLS estimator. The purpose of this note is to remind researchers that the size of the bias can be very large in reasonably-sized samples. This has important implications for estimation of dynamic models. I illustrate the size of the bias in three settings: (i) a single equation, time series model where $y_t$ is regressed on $y_{t-1}$; (ii) a single equation, auto-regressive, distributed lag model (ARDL) where the researcher is interested in estimating the long-run propensity of an independent explanatory variable $x$; and (iii) a single equation, dynamic panel data (DPD) model where the researcher is again interested in estimating the long-run propensity of $x$. In all three cases, Type I error rates are sufficiently large to vitiate the usefulness of hypothesis testing in many cases.

KEYWORDS: Hurwicz bias, Auto-Regressive Distributed-Lag models, ARDL, Dynamic Panel Data models, DPD, Anderson-Hsiao, Arellano-Bond, Difference GMM, System GMM, Nickell bias, simulation

JEL classification: C22, C23

Revised, 4 July 2015

Acknowledgments: This paper acknowledges helpful comments from David Frazier, Adrian Pagan, Yongcheol Shin, Richard Smith, Jeffrey Wooldridge, and seminar participants at the NZESG meetings in February 2015 and the NZAE meetings in July 2015. Remaining errors are the sole property of the authors.


I. INTRODUCTION

It is well-known that OLS estimation of the model,

(1) ,

t = 1,2,...,T, where ~ 0, and produces biased estimates of in finite samples,

though OLS is consistent. In particular, . This bias was first discussed in

Hurwicz (1950) and investigated further in Phillips (1977).

Despite this, most applied researchers are seemingly unperturbed,

presumably resting their confidence on the asymptotic properties of the OLS estimator. The

purpose of this note is to remind researchers that the size of the bias can be very large in

reasonably-sized samples when the dependent variable is substantially auto-correlated. This

has important implications for the estimation of dynamic models. I illustrate the size of the

bias in three settings.

The first setting is the simple linear regression model represented by Equation (1)

where researchers are interested in estimating . The second setting is an auto-regressive,

distributed lag (ARDL) model of the form

(2) ,

where researchers use OLS to estimate the value of the long-run propensity (LRP) of x, LRP

of x = 1⁄ . The third setting is a dynamic panel data (DPD) model of the form

(3) , , where the error term has a fixed effect and researchers are again interested in estimating the

LRP of x.

The paper proceeds as follows. Sections II to IV respectively investigate Hurwicz

bias in each of the settings above. Section V concludes.


LRP = ∑

The long-run propensity (LRP) defined above, despite its conceptual beauty, has serious statistical pitfalls. Because OLS estimation of the AR model is biased in finite samples, the estimated LRP is potentially biased as well. Even worse, the LRP is a nonlinear function of the AR coefficients, so a small bias in the denominator can be greatly magnified, especially when the AR coefficients are close to the unit circle. For illustration, suppose $p = 1$, which is the Koyck model, and $\beta = 1$. Marriott and Pope (1954) provide the first-order bias in the OLS estimate of the AR(1) coefficient, which is $-(1 + 3\rho)/T$. With $\rho = 0.95$ and $T = 100$, the first-order bias of the AR(1) coefficient is -0.0385, or about 4% of the true value. However, this translates into a 43.5% bias in the LRP (the LRP is 11.3 rather than 20)! The bias could be even more pronounced when $p > 1$. As pointed out by Patterson (2000), small biases in the individual AR coefficient estimates cumulate, often in a reinforcing rather than offsetting manner, in the estimate of the sum $\sum \rho_i$.
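The arithmetic in the illustration above can be retraced in a few lines. The sketch below simply evaluates the first-order bias formula and the LRP for the $p = 1$ case with the values quoted in the text; the function names are hypothetical.

```python
rho = 0.95   # true AR(1) coefficient from the illustration
T = 100      # sample size from the illustration


def first_order_bias(rho, T):
    """First-order bias of the OLS AR(1) estimate, -(1 + 3*rho)/T, as quoted in the text."""
    return -(1.0 + 3.0 * rho) / T


def long_run_propensity(beta, rho):
    """LRP for the p = 1 case: beta / (1 - rho)."""
    return beta / (1.0 - rho)


bias = first_order_bias(rho, T)
lrp_true = long_run_propensity(1.0, rho)
lrp_biased = long_run_propensity(1.0, rho + bias)
relative_bias = (lrp_biased - lrp_true) / lrp_true

print(round(bias, 4), round(lrp_true, 1), round(lrp_biased, 1), round(100 * relative_bias, 1))
# prints: -0.0385 20.0 11.3 -43.5
```

A bias of about 4 percent in the AR coefficient becomes a bias of about 43.5 percent in the LRP because the denominator $1 - \rho$ is only 0.05, so an absolute error of 0.0385 nearly doubles it.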

On top of the bias issue, estimation is problematic because ratio distributions are often heavy-tailed, and it may be difficult to work with such distributions and develop an associated statistical test. For the LRP, the OLS estimate of the ratio is the ratio of two correlated, normally distributed variables, which are themselves estimates. When the numerator and denominator are independent normal variables with mean zero, the ratio follows a Cauchy distribution. In general, the distribution of the ratio is heavy-tailed and has no moments. The shape of its density can be unimodal or bimodal, symmetric or asymmetric, depending heavily on the value of the coefficient of the denominator variable (Hinkley, 1969; Diaz-Frances and Rubio, 2013).
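The special case mentioned above, a ratio of two independent mean-zero normal variables, is easy to simulate. The sketch below assumes unit variances, so the ratio is standard Cauchy with quartiles at -1, 0, and 1; it prints the sample quartiles together with running sample means to show that the mean does not settle down. The function names and the seed are arbitrary choices.

```python
import random
import statistics


def ratio_draws(n, seed=0):
    """Ratios of two independent N(0, 1) draws; these follow a standard Cauchy law."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) / rng.gauss(0.0, 1.0) for _ in range(n)]


def quartiles(values):
    """Lower quartile, median, and upper quartile of a sample."""
    return statistics.quantiles(values, n=4)


draws = ratio_draws(100_000)
q1, q2, q3 = quartiles(draws)
print("quartiles:", round(q1, 2), round(q2, 2), round(q3, 2))
print("largest absolute draw:", round(max(abs(d) for d in draws), 1))

# Running sample means: unlike the quartiles, these need not stabilise.
for n in (1_000, 10_000, 100_000):
    print(n, round(statistics.fmean(draws[:n]), 2))
```

The quartiles are stable across seeds, while the sample mean is driven by a handful of extreme draws. This is one reason to summarise simulated ratio estimates with medians and percentile ranges.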

II. HURWICZ BIAS IN A SINGLE-EQUATION, SIMPLE LINEAR REGRESSION SETTING


The first part of my analysis focuses on the model 1 , t = 1,2,...,T,

~ 0,1 . The error terms are generated independently of the lagged dependent

variable values, $y_{t-1}$. T takes values between 15 and 1000 (T = 15, 30, 50, 100, and 1000),

and the autoregressive coefficient takes values between 0.10 and 0.99 (0.10, 0.50, 0.90, 0.95, and 0.99). For each

simulated sample, OLS is used to estimate the value of the coefficient. 10,000 replications were run for

each combination of T and the coefficient.
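The experiment described above can be sketched in pure Python. This is not the authors' Stata code, and several details are assumptions: the intercept is set to 1, each series is started at its unconditional mean and run through a burn-in period, and fewer replications are used to keep the run time short. Function names are hypothetical.

```python
import random


def simulate_ar1(beta, T, rng, intercept=1.0, burn=200):
    """Return T + 1 values of y_t = intercept + beta * y_{t-1} + e_t, e_t ~ N(0, 1)."""
    y = intercept / (1.0 - beta)  # assumed start: the unconditional mean
    path = []
    for t in range(burn + T + 1):
        y = intercept + beta * y + rng.gauss(0.0, 1.0)
        if t >= burn:
            path.append(y)
    return path


def ols_slope(path):
    """OLS slope from regressing y_t on a constant and y_{t-1}."""
    lagged, current = path[:-1], path[1:]
    n = len(current)
    mean_lag = sum(lagged) / n
    mean_cur = sum(current) / n
    sxx = sum((v - mean_lag) ** 2 for v in lagged)
    sxy = sum((v - mean_lag) * (c - mean_cur) for v, c in zip(lagged, current))
    return sxy / sxx


def mean_estimate(beta, T, reps=2000, seed=2015):
    """Average OLS estimate of the AR coefficient across simulated samples."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        total += ols_slope(simulate_ar1(beta, T, rng))
    return total / reps


for T in (15, 50, 100):
    print(T, round(mean_estimate(0.90, T, reps=500), 3))
```

The average estimate should sit below the true value of 0.90 and move toward it as T grows, which is the pattern of downward bias and consistency that the section describes.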

The results of the Monte Carlo experiments are reported in TABLE 1.¹ The top panel of the table reports the average estimated value of the autoregressive coefficient across the 10,000 replications. The bottom panel reports Type I error rates, where the values are the rejection rates of the null hypothesis that the coefficient equals its true value. For both panels, the columns represent the different sample sizes, and the rows represent the different true values of the coefficient.

For example, when the true value is 0.10 and T = 15, the average estimated value is 0.014. When T = 1000, the average estimated value increases to 0.098, illustrating the consistency of the OLS estimator. Likewise, when the true value is 0.10 and T = 15, the rejection rate of the null hypothesis that the coefficient equals 0.10 is 0.041 (significance level is 5%), rising to 0.054 when T = 1000 (see below).

Note that the nominal size of the bias is substantial when the true value is 0.90 for sample sizes of T = 50. For example, when the true value is 0.90, the average estimated value is 0.819, with a nominal bias of -0.081, or -9.0 percent.² Similarly, the Type I error rates are substantially larger than 0.05. When the true value is 0.90 and T = 50, the rejection rate of the null hypothesis that the coefficient equals 0.90 is 0.103 (significance level is 5%), rising to 0.227 when the true value is 0.99. This

1 Stata Version 13 was used for all the Monte Carlo experiments. Copies of the Stata .do files used to produce all the results in this paper are presented in the Appendix. Readers are encouraged to use the programs to confirm my results and to study alternative experimental parameters and settings.

2 Note that the percentage bias is often larger for smaller values of the autoregressive coefficient.


demonstrates that Hurwicz bias can be practically important in realistic empirical

environments in terms of both bias and inference.
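As a rough cross-check on the simulated magnitudes, the bias figures in TABLE 1 can be compared with the familiar first-order approximation for the OLS slope in a stationary first-order autoregression with an estimated intercept (see Marriott and Pope, 1954). The approximation is not used anywhere in the simulations; the symbol $\beta_y$ for the coefficient on the lagged dependent variable follows the `betay` parameter of the Appendix programs. The approximation becomes less reliable as $\beta_y$ approaches one and T becomes small.

```latex
\begin{align*}
\mathrm{E}(\hat{\beta}_y) - \beta_y &\approx -\frac{1 + 3\beta_y}{T} \\[4pt]
\beta_y = 0.10,\ T = 15: \quad -\frac{1.3}{15} &\approx -0.087
  \qquad (\text{TABLE 1: } 0.014 - 0.10 = -0.086) \\
\beta_y = 0.50,\ T = 30: \quad -\frac{2.5}{30} &\approx -0.083
  \qquad (\text{TABLE 1: } 0.418 - 0.50 = -0.082) \\
\beta_y = 0.90,\ T = 50: \quad -\frac{3.7}{50} &= -0.074
  \qquad (\text{TABLE 1: } 0.819 - 0.90 = -0.081)
\end{align*}
```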

III. HURWICZ BIAS IN A SINGLE-EQUATION, AUTO-REGRESSIVE, DISTRIBUTED LAG (ARDL) SETTING

The next set of Monte Carlo experiments focuses on the implications of Hurwicz bias for estimating the long-run propensity (LRP) of an explanatory variable. I study the ARDL model, $y_t = \beta_0 + \beta_y y_{t-1} + \beta_x x_t + \varepsilon_t$, $t = 1, 2, \ldots, T$, with $\beta_0 = \beta_x = 1$, where $\beta_y$ takes values 0.10, 0.50, 0.90, 0.95, and 0.99 (same as above), $\varepsilon_t \sim N(0,1)$, and the simulated $\varepsilon_t$ values are again generated independently of the $x_t$ values. The true LRP of x values are 1.11, 2, 10, 20, and 100, respectively. The estimated LRP of x values are calculated by $\hat{\beta}_x / (1 - \hat{\beta}_y)$.
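For readers who want the step behind the LRP expression, a short derivation follows. It assumes the ARDL specification with intercept $\beta_0$, lagged-dependent-variable coefficient $\beta_y$, and slope $\beta_x$ (the names mirror `beta0`, `betay`, and `betax` in the Appendix programs), and it evaluates the model at a steady state $(y^{*}, x^{*})$ with the error set to zero. The last line shows why small errors in $\hat{\beta}_y$ matter so much when $\beta_y$ is close to one.

```latex
\begin{align*}
y^{*} &= \beta_0 + \beta_y y^{*} + \beta_x x^{*}
  \;\;\Longrightarrow\;\;
  y^{*} = \frac{\beta_0 + \beta_x x^{*}}{1 - \beta_y} \\[4pt]
\mathrm{LRP} &= \frac{\partial y^{*}}{\partial x^{*}} = \frac{\beta_x}{1 - \beta_y}
  \qquad (\beta_x = 1:\ \beta_y = 0.90 \Rightarrow 10,\quad \beta_y = 0.99 \Rightarrow 100) \\[4pt]
\frac{\partial\, \mathrm{LRP}}{\partial \beta_y} &= \frac{\beta_x}{(1 - \beta_y)^{2}}
  \qquad (\beta_x = 1,\ \beta_y = 0.90:\ \text{slope} = 100)
\end{align*}
```

For instance, with $\beta_x = 1$, an estimate of 0.85 in place of a true value of 0.90 turns an LRP of 10 into $1/0.15 \approx 6.7$.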

TABLE 2 reports the results of these Monte Carlo experiments. The table is organized similarly to TABLE 1 with some minor differences. I now report three panels. The top panel reports the median (not average) value of the estimated LRP of x values across the 10,000 replications.³ The bottom panel reports Type I error rates associated with the null hypothesis, H0: LRP of x = true value. The middle panel reports the 90 percent empirical sample range calculated by taking the 5th and 95th percentile values of the 10,000 estimated LRP values. While the range of the true $\beta_y$ values is the same as above, I only report simulation results for sample sizes of T = 10, 50, and 1000.

As in TABLE 1, there is little nominal distortion associated with small values of $\beta_y$ when T = 50. For example, when $\beta_y = 0.10$, the median estimated value of the LRP of x is 1.1, essentially the same as the true value (1.11). The associated Type I error rate is 0.064. The 90 percent empirical sample range extends from 0.8 to 1.5, encompassing the true value.

However, substantial distortions arise for large values of $\beta_y$ when T = 50. When $\beta_y = 0.90$, the median estimated LRP of x value is 7.5, 25 percent less than its true value. The 90 percent empirical sample range, while encompassing the true value, is quite large, extending from 3.4 to 19.4. And the associated Type I error rate is 0.267. All of these values become substantially worse for larger values of $\beta_y$ and smaller values of T.

³ I report the median rather than the mean, because estimates of $\beta_y$ very close to 1 can cause the LRP to be extremely large, making the sample mean uninformative.
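The ARDL experiment can be sketched in Python as follows. This is an illustrative sketch rather than the Stata code behind TABLE 2: it assumes an intercept and x-coefficient of 1 (as in the Appendix programs), standard normal x and error terms, a burn-in of 100 observations, 2,000 replications, and a simple order-statistic rule for the percentiles. The function and parameter names are made up for the example.

```python
import random


def simulate_ardl(beta_y, n_obs, rng, beta0=1.0, beta_x=1.0, burn_in=100):
    """Return (y, x) holding the last n_obs + 1 observations of
    y_t = beta0 + beta_y*y_{t-1} + beta_x*x_t + e_t."""
    y = [beta0 / (1.0 - beta_y)]  # long-run equilibrium starting value
    x = [rng.gauss(0.0, 1.0)]
    for _ in range(burn_in + n_obs):
        x.append(rng.gauss(0.0, 1.0))
        y.append(beta0 + beta_y * y[-1] + beta_x * x[-1] + rng.gauss(0.0, 1.0))
    return y[-(n_obs + 1):], x[-(n_obs + 1):]


def ols_lrp(y, x):
    """Regress y_t on a constant, y_{t-1} and x_t; return b_x / (1 - b_y)."""
    cur, lag, reg = y[1:], y[:-1], x[1:]
    n = len(cur)
    m_cur, m_lag, m_reg = sum(cur) / n, sum(lag) / n, sum(reg) / n
    s_ll = sum((a - m_lag) ** 2 for a in lag)
    s_xx = sum((b - m_reg) ** 2 for b in reg)
    s_lx = sum((a - m_lag) * (b - m_reg) for a, b in zip(lag, reg))
    s_ly = sum((a - m_lag) * (c - m_cur) for a, c in zip(lag, cur))
    s_xy = sum((b - m_reg) * (c - m_cur) for b, c in zip(reg, cur))
    det = s_ll * s_xx - s_lx ** 2  # determinant of the 2x2 normal equations
    b_y = (s_ly * s_xx - s_lx * s_xy) / det
    b_x = (s_ll * s_xy - s_lx * s_ly) / det
    return b_x / (1.0 - b_y)


def lrp_summary(beta_y, n_obs, reps=2000, seed=13):
    """Return the 5th percentile, median and 95th percentile of the LRP estimates."""
    rng = random.Random(seed)
    draws = sorted(ols_lrp(*simulate_ardl(beta_y, n_obs, rng)) for _ in range(reps))

    def pick(q):
        return draws[int(q * (reps - 1))]

    return pick(0.05), pick(0.50), pick(0.95)


if __name__ == "__main__":
    print(lrp_summary(0.90, 50))
```

The median is used, as in the text, because a single estimate of the lagged-dependent-variable coefficient close to one can produce an arbitrarily large LRP value.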

IV. HURWICZ BIAS IN A SINGLE-EQUATION, DYNAMIC PANEL DATA (DPD) SETTING

The last of my analyses focuses on the implications of Hurwicz bias for estimating the long-run propensity (LRP) of an explanatory variable in dynamic panel data (DPD) models. The corresponding Monte Carlo programs generate simulated panel data having the DGP, $y_{it} = \beta_0 + \beta_y y_{i,t-1} + \beta_x x_{it} + \varepsilon_{it}$, $i = 1, 2, \ldots, N$, $t = 1, 2, \ldots, T$, with $\beta_0 = \beta_x = 1$, where $\beta_y$ again takes values 0.10, 0.50, 0.90, 0.95, and 0.99. $\varepsilon_{it}$ is an error term that has a fixed effect, $\varepsilon_{it} = a_i + u_{it}$, and each of the components of $\varepsilon_{it}$ is independently and identically standard normally distributed, $a_i \sim N(0,1)$, $u_{it} \sim N(0,1)$. The fixed effects are uncorrelated with $x_{it}$, so there is no endogeneity associated with the fixed effects. I begin with the panel data having dimensions N=50, T=10.

Given the values of $\beta_y$, the true LRP of x values are, again, 1.11, 2, 10, 20, and 100, respectively. The LRP is estimated using three dynamic panel data estimators: Anderson-Hsiao (AH), difference GMM (DGMM), and system GMM (SGMM). The results are reported in TABLE 3 and again consist of three panels: (i) the median estimated LRP value, (ii) the 90% empirical sample range of LRP values, and (iii) the Type I error rate. The columns now report the results for each of the three DPD estimators. Note that the panel sample sizes are the same for all the experiments reported in TABLE 3 (N=50, T=10).
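To make the panel DGP and the simplest of the three estimators concrete, the following Python sketch generates data with a unit-specific fixed effect and applies a stripped-down Anderson-Hsiao estimator. It is a simplification and not the Stata implementation in the Appendix: it uses a single instrument (the second lag of y in levels) rather than the two lags passed to `xtivreg`, omits the constant in the differenced equation, and uses a burn-in of 10 periods. All names are hypothetical.

```python
import random


def simulate_panel(beta_y, n_units, n_periods, rng,
                   beta0=1.0, beta_x=1.0, burn_in=10):
    """Return a list of (y, x) series, one pair per cross-sectional unit, for
    y_it = beta0 + beta_y*y_{i,t-1} + beta_x*x_it + a_i + u_it."""
    panel = []
    for _unit in range(n_units):
        a_i = rng.gauss(0.0, 1.0)  # time-invariant fixed effect
        y = [beta0 / (1.0 - beta_y)]
        x = [rng.gauss(0.0, 1.0)]
        for _period in range(burn_in + n_periods - 1):
            x.append(rng.gauss(0.0, 1.0))
            u_it = rng.gauss(0.0, 1.0)  # time-varying error component
            y.append(beta0 + beta_y * y[-1] + beta_x * x[-1] + a_i + u_it)
        panel.append((y[-n_periods:], x[-n_periods:]))
    return panel


def anderson_hsiao(panel):
    """Just-identified IV on first differences: dy_t on dy_{t-1} and dx_t,
    using y_{t-2} as the instrument for dy_{t-1} (dx_t instruments itself)."""
    a11 = a12 = a21 = a22 = r1 = r2 = 0.0
    for y, x in panel:
        for t in range(2, len(y)):
            dy = y[t] - y[t - 1]
            dy_lag = y[t - 1] - y[t - 2]
            dx = x[t] - x[t - 1]
            z = y[t - 2]
            a11 += z * dy_lag
            a12 += z * dx
            r1 += z * dy
            a21 += dx * dy_lag
            a22 += dx * dx
            r2 += dx * dy
    det = a11 * a22 - a12 * a21  # solve the 2x2 system Z'X b = Z'dy
    b_y = (r1 * a22 - a12 * r2) / det
    b_x = (a11 * r2 - a21 * r1) / det
    return b_y, b_x, b_x / (1.0 - b_y)


if __name__ == "__main__":
    rng = random.Random(13)
    print(anderson_hsiao(simulate_panel(0.5, 2000, 10, rng)))
```

First differencing removes the fixed effect, and the lagged level is a valid instrument because it is uncorrelated with the differenced error. The sketch is intended only to show the mechanics; reproducing TABLE 3 requires the Appendix programs.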

The consequences of Hurwicz bias are once again substantial for large values of $\beta_y$. For example, when $\beta_y = 0.90$, the median LRP estimates for the AH, DGMM, and SGMM estimators are 0.6, 5.1, and 25.6 (compared to a true value of 10). The 90% empirical sample ranges are very wide, and in none of the cases do they include the true value. The Type I error rates are 0.709, 0.821, and 0.001, rendering the associated tests worthless for hypothesis testing. Larger values of $\beta_y$ result in similarly distorted values.

TABLE 4 reports a similar analysis for panel data models having dimensions N=140, T=5. I chose these dimensions because they are the approximate dimensions of the “abdata” dataset bundled with Stata that is used to illustrate the functionality of the xtabond (difference GMM) and the xtdpdsys (system GMM) commands in Stata.⁴ The results are very similar to those from TABLE 3. Once again, substantial biases and size distortions are apparent for $\beta_y \geq 0.90$.

While much attention has been paid to Nickell bias in DPD models (Nickell, 1981),

the results above suggest that Hurwicz bias is a much more serious problem for estimation of

LRP values in DPD models.

V. BIAS REDUCTION UNHELPFUL FOR ESTIMATING LRP

We pointed out two statistical issues with estimating the LRP in the Introduction, namely, Hurwicz bias and heavy-tailed ratio distributions. Here we illustrate that controlling for Hurwicz bias does not necessarily improve LRP estimation.

We use a jackknife technique to reduce the leading-order bias of the OLS estimator. Suppose that $\hat{\theta}_T$ is a maximum-likelihood based estimator of $\theta$. Partition the observations along the time dimension, that is, split $t = 1, \ldots, T$ into $t = 1, \ldots, T/2$ and $t = T/2 + 1, \ldots, T$, and let $\bar{\theta}_{T/2} = 0.5\,(\hat{\theta}^{(1)}_{T/2} + \hat{\theta}^{(2)}_{T/2})$ be the average of the two half-sample estimates. Compared with the original biased estimator $\hat{\theta}_T$, the jackknifed estimator $2\hat{\theta}_T - \bar{\theta}_{T/2}$ is bias-reduced. For the ARDL model $y_t = \beta_0 + \beta_y y_{t-1} + \beta_x x_t + \varepsilon_t$, $t = 1, 2, \ldots, T$, we apply the jackknifing procedure to reduce the bias in $\hat{\beta}_y$. Note that $\hat{\beta}_x$ is not affected by finite-sample bias.

⁴ The abdata data set used for the xtabond procedure has 140 groups, with a minimum of 4 observations per group, and a maximum of 6. The corresponding minimum and maximum values for the data used for the xtdpdsys procedure are 5 and 7, with the number of groups again equalling 140.

The results are reported in TABLE 5. The top panel of the table reports the average estimated value of $\beta_y$ across 10,000 replications, for both the OLS estimator and the bias-reduced estimator. As shown, the jackknifing procedure greatly reduces the finite-sample bias in the OLS estimator. For example, when $\beta_y = 0.99$ and T = 10, the average estimated value of $\beta_y$ is 0.76. The average value improves to 0.93 after jackknifing. The middle panel reports the median value of the estimated LRP across the 10,000 replications based on the bias-reduced estimates. And the bottom panel reports the 90 percent empirical sample range of the estimated LRP values. We can see that a better estimate of $\beta_y$ does not necessarily lead to a better estimate of the LRP ratio. TABLE 6 reports results for the DPD model. The jackknifing procedure fails to reduce any finite-sample bias in this case.
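A half-sample jackknife of the kind described in this section can be sketched as follows. The sketch is illustrative only: it applies the correction (twice the full-sample estimate minus the average of the two half-sample estimates) to the slope of a simple first-order autoregression without an x variable, whereas the results in TABLE 5 are for the ARDL model. The burn-in length, replication count, and function names are assumptions made for the example.

```python
import random


def simulate_ar1(beta_y, n_obs, rng, beta0=1.0, burn_in=100):
    """Return the last n_obs + 1 values of y_t = beta0 + beta_y*y_{t-1} + e_t."""
    y = [beta0 / (1.0 - beta_y)]
    for _ in range(burn_in + n_obs):
        y.append(beta0 + beta_y * y[-1] + rng.gauss(0.0, 1.0))
    return y[-(n_obs + 1):]


def ols_slope(y):
    """OLS slope from regressing y_t on a constant and y_{t-1}."""
    lag, cur = y[:-1], y[1:]
    n = len(cur)
    mean_lag, mean_cur = sum(lag) / n, sum(cur) / n
    s_xy = sum((a - mean_lag) * (b - mean_cur) for a, b in zip(lag, cur))
    s_xx = sum((a - mean_lag) ** 2 for a in lag)
    return s_xy / s_xx


def jackknife_slope(y):
    """Half-sample jackknife: 2*full-sample estimate minus the average of the
    estimates from the first and second halves of the sample."""
    n_pairs = len(y) - 1
    half = n_pairs // 2
    full = ols_slope(y)
    first = ols_slope(y[:half + 1])  # pairs 1..half
    second = ols_slope(y[half:])     # pairs half+1..n_pairs
    return 2.0 * full - 0.5 * (first + second)


def compare(beta_y, n_obs, reps=2000, seed=13):
    """Return the average OLS and average jackknifed estimates of beta_y."""
    rng = random.Random(seed)
    ols_total = jack_total = 0.0
    for _ in range(reps):
        y = simulate_ar1(beta_y, n_obs, rng)
        ols_total += ols_slope(y)
        jack_total += jackknife_slope(y)
    return ols_total / reps, jack_total / reps


if __name__ == "__main__":
    print(compare(0.90, 50))
```

The correction works because the leading bias term is proportional to 1/T, so the half-sample estimates carry roughly twice the bias of the full-sample estimate. It reduces bias in the coefficient at the cost of extra variance, which is one reason the implied LRP ratio need not improve.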

VI. CONCLUSION

It is well known that OLS estimation of a single equation time series model having the specification $y_t = \beta_0 + \beta_y y_{t-1} + \varepsilon_t$, $\varepsilon_t \sim N(0, \sigma^2)$, produces biased estimates of $\beta_y$. This bias is known as “Hurwicz bias” and was identified more than sixty years ago (Hurwicz, 1950). To the best of my knowledge, the implications of this bias for estimation of auto-regressive, distributed lag (ARDL) and dynamic panel data (DPD) models have not been previously noted.

This paper uses Monte Carlo experiments to demonstrate that the practical consequences of Hurwicz bias can be substantial in finite samples when $y_t$ is characterized by a significant degree of auto-correlation. Estimates of the long-run propensity (LRP) of explanatory variables calculated from ARDL and DPD models can be biased by 50 or even 100 percent in realistic data settings. In some cases, 90 percent empirical sample ranges for LRP estimates do not include the true value. And Type I error rates are sufficiently distorted to render the associated tests virtually useless for hypothesis testing.

Again to the best of my knowledge, there is no econometric procedure that can

identify and/or correct for Hurwicz bias. One potential indicator that there is a problem is the

size of the estimated coefficient on the lagged dependent variable. If this coefficient is close

to one, then empirical estimates of long-run impacts of explanatory variables are likely to be

severely distorted. Of course, this is complicated by the fact that Hurwicz bias causes the

coefficient of the lagged dependent variable to be underestimated. However, programs like

the ones attached to the end of this study can be used to simulate DGPs that produce

estimates similar to those observed by the researcher. The DGP parameters uncovered in this

manner can be used to assess the severity of the problem.
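The simulation-based check suggested in this paragraph can be sketched as a simple grid search: for each candidate true coefficient, simulate the average OLS estimate at the researcher's sample size, and keep the candidate whose simulated average is closest to the estimate actually observed. The sketch below does this for a first-order autoregression in Python. The grid, replication count, burn-in, and function names are assumptions made for illustration, and a real application would use a DGP resembling the researcher's own model.

```python
import random


def simulate_ar1(beta_y, n_obs, rng, beta0=1.0, burn_in=100):
    """Return the last n_obs + 1 values of y_t = beta0 + beta_y*y_{t-1} + e_t."""
    y = [beta0 / (1.0 - beta_y)]
    for _ in range(burn_in + n_obs):
        y.append(beta0 + beta_y * y[-1] + rng.gauss(0.0, 1.0))
    return y[-(n_obs + 1):]


def ols_slope(y):
    """OLS slope from regressing y_t on a constant and y_{t-1}."""
    lag, cur = y[:-1], y[1:]
    n = len(cur)
    mean_lag, mean_cur = sum(lag) / n, sum(cur) / n
    s_xy = sum((a - mean_lag) * (b - mean_cur) for a, b in zip(lag, cur))
    s_xx = sum((a - mean_lag) ** 2 for a in lag)
    return s_xy / s_xx


def mean_estimate(beta_y, n_obs, reps, seed):
    """Average OLS estimate of beta_y across reps simulated samples."""
    rng = random.Random(seed)
    return sum(ols_slope(simulate_ar1(beta_y, n_obs, rng)) for _ in range(reps)) / reps


def calibrate_beta(observed, n_obs, grid, reps=500, seed=13):
    """Return the candidate in grid whose simulated mean OLS estimate is
    closest to the observed estimate."""
    best_beta, best_gap = None, None
    for beta_y in grid:
        gap = abs(mean_estimate(beta_y, n_obs, reps, seed) - observed)
        if best_gap is None or gap < best_gap:
            best_beta, best_gap = beta_y, gap
    return best_beta


if __name__ == "__main__":
    # Hypothetical case: an observed estimate of 0.82 from a sample of 50.
    print(calibrate_beta(0.82, 50, (0.5, 0.7, 0.8, 0.9, 0.95)))
```

The same seed is reused for every grid point so that differences across candidates reflect the coefficient rather than simulation noise.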

All of the discussion above assumes a relatively uncomplicated data environment.

For example, the presence of serial correlation in the error term is likely to compound the

problems identified here.⁵ Given these difficulties, researchers may want to consider

specifications that replace the lagged dependent variable with lagged values of the

explanatory variable(s). The resulting costs associated with diminished degrees of freedom

may be small compared to the distortions from including a lagged dependent variable.

In conclusion, it is hoped that the results of this study will encourage researchers

to be even more cautious in the use and interpretation of models with lagged dependent

variables.

⁵ The following papers provide useful discussions of the problems associated with serially-correlated error terms in dynamic models: Griliches (1961), Keele and Kelly (2006), and Beck and Katz (2011).

REFERENCES

Beck, N. and Katz, J.N. 2011. “Modelling Dynamics in Time-Series-Cross-Sectional Political Economy Data.” Annual Review of Political Science, Vol. 14, pp. 331-352.

Diaz-Frances, E. and Rubio, F.J. 2013. “On the Existence of a Normal Approximation to the Distribution of the Ratio of Two Independent Normal Random Variables.” Statistical Papers, Vol. 54, No. 2, pp. 309-323.

Griliches, Z. 1961. “A Note on Serial Correlation Bias in Estimates of Distributed Lags.” Econometrica, Vol. 29, pp. 65-73.

Hinkley, D.V. 1969. “On the Ratio of Two Correlated Normal Random Variables.” Biometrika, Vol. 56, No. 3, pp. 635-639.

Hurwicz, L. 1950. “Least Squares Bias in Time Series,” in Statistical Inference in Dynamic Economic Models, ed. by T. C. Koopmans. New York: Wiley.

Keele, L. and Kelly, N.J. 2006. “Dynamic Models for Dynamic Theories: The Ins and Outs of Lagged Dependent Variables.” Political Analysis, Vol. 14, pp. 186-205.

Marriott, F.H.C. and Pope, J.A. 1954. “Bias in the Estimation of Autocorrelations.” Biometrika, Vol. 41, pp. 390-402.

Nickell, S. 1981. “Biases in Dynamic Models with Fixed Effects.” Econometrica, Vol. 49, No. 6, pp. 1417-1426.

Patterson, K. 2000. “Finite Sample Bias of the Least Squares Estimators in an AR(p) Model: Estimation, Inference, Simulation and Examples.” Applied Economics, Vol. 32, No. 15, pp. 1993-2005.

Phillips, P.C.B. 1977. “Approximations to Some Finite Sample Distributions Associated with a First-Order Stochastic Difference Equation.” Econometrica, Vol. 45, No. 2, pp. 463-485.

TABLE 1
Demonstration of Hurwicz Bias in a Single Equation Time Series Model: $y_t = \beta_0 + \beta_y y_{t-1} + \varepsilon_t$

                                 SAMPLE SIZE
True Value of $\beta_y$     T=15     T=30     T=50     T=100    T=1000

Average Estimated Value of $\beta_y$
0.10                        0.014    0.057    0.074    0.086    0.098
0.50                        0.339    0.418    0.449    0.475    0.498
0.90                        0.640    0.764    0.819    0.860    0.896
0.95                        0.668    0.802    0.862    0.906    0.946
0.99                        0.698    0.830    0.893    0.941    0.986

Type I Error Rate (H0: $\beta_y$ = true value)
0.10                        0.041    0.046    0.051    0.047    0.054
0.50                        0.052    0.054    0.054    0.051    0.053
0.90                        0.131    0.129    0.103    0.085    0.055
0.95                        0.178    0.165    0.144    0.110    0.056
0.99                        0.221    0.238    0.227    0.203    0.089

SOURCE: All Monte Carlo experiments were run using Version 13 of Stata. 10,000 replications were run for each experiment. The experiments are described in more detail in Section II of the text. The associated simulation programs are given in “TABLE1A” and “TABLE1B” in the Appendix.

TABLE 2
Demonstration of Hurwicz Bias in an ARDL Model: $y_t = \beta_0 + \beta_y y_{t-1} + \beta_x x_t + \varepsilon_t$; LRP of x = $\beta_x / (1 - \beta_y)$

True Value of $\beta_y$ /                    SAMPLE SIZE
True Value of LRP of x        T=10              T=50              T=1000

Median Value of Estimated LRP of x
0.10 / 1.11                   1.1               1.1               1.1
0.50 / 2                      1.6               1.9               2.0
0.90 / 10                     2.8               7.5               9.9
0.95 / 20                     2.8               11.4              19.5
0.99 / 100                    2.7               16.0              85.9

90 Percent Empirical Sample Range for Estimated LRP of x
0.10 / 1.11                   0.3 to 2.3        0.8 to 1.5        1.0 to 1.2
0.50 / 2                      0.4 to 5.0        1.3 to 2.8        1.8 to 2.2
0.90 / 10                     -13.2 to 21.9     3.4 to 19.4       8.3 to 11.7
0.95 / 20                     -18.1 to 25.2     3.9 to 45.7       15.3 to 24.8
0.99 / 100                    -24.6 to 28.8     -72.1 to 125.4    51.2 to 152.9

Type I Error Rate (H0: LRP of x = true value)
0.10 / 1.11                   0.104             0.064             0.050
0.50 / 2                      0.185             0.081             0.051
0.90 / 10                     0.530             0.267             0.069
0.95 / 20                     0.661             0.393             0.084
0.99 / 100                    0.844             0.681             0.197

SOURCE: All Monte Carlo experiments were run using Version 13 of Stata. 10,000 replications were run for each experiment. The experiments are described in more detail in Section III of the text. The associated simulation programs are given in “TABLE2A” and “TABLE2B” in the Appendix.

TABLE 3
Demonstration of Hurwicz Bias in DPD Model: $y_{it} = \beta_0 + \beta_y y_{i,t-1} + \beta_x x_{it} + \varepsilon_{it}$, $\varepsilon_{it} = a_i + u_{it}$; LRP of x = $\beta_x / (1 - \beta_y)$; N=50, T=10

True Value of $\beta_y$ /                    ESTIMATOR
True Value of LRP of x        Anderson-Hsiao    Difference GMM    System GMM

Median Value of Estimated LRP of x
0.10 / 1.11                   1.1               1.1               1.1
0.50 / 2                      1.9               1.8               2.0
0.90 / 10                     0.6               5.1               25.6
0.95 / 20                     2.5               10.8              -56.1
0.99 / 100                    8.3               34.7              -22.2

Average 90 Percent Empirical Sample Range for Estimated LRP of x
0.10 / 1.11                   0.9 to 1.3        1.0 to 1.2        1.0 to 1.2
0.50 / 2                      1.3 to 3.5        1.6 to 2.1        1.7 to 2.3
0.90 / 10                     -5.9 to 7.0       3.3 to 8.3        11.4 to 99.7
0.95 / 20                     -27.3 to 27.8     6.6 to 21.1       -332.5 to 225.6
0.99 / 100                    -99.4 to 105.0    13.1 to 170.0     -30.0 to -17.9

Average Type I Error Rate (H0: LRP of x = true value)
0.10 / 1.11                   0.053             0.083             0.088
0.50 / 2                      0.077             0.194             0.110
0.90 / 10                     0.709             0.821             0.001
0.95 / 20                     0.514             0.642             0.499
0.99 / 100                    0.524             0.585             1.000

SOURCE: All Monte Carlo experiments were run using Version 13 of Stata. 10,000 replications were run for each experiment. The experiments are described in more detail in Section IV of the text. The associated simulation programs are given in “TABLE3A” and “TABLE3B” in the Appendix.

TABLE 4
Demonstration of Hurwicz Bias in DPD Model: $y_{it} = \beta_0 + \beta_y y_{i,t-1} + \beta_x x_{it} + \varepsilon_{it}$, $\varepsilon_{it} = a_i + u_{it}$; LRP of x = $\beta_x / (1 - \beta_y)$; N=140, T=5

True Value of $\beta_y$ /                    ESTIMATOR
True Value of LRP of x        Anderson-Hsiao    Difference GMM    System GMM

Median Value of Estimated LRP of x
0.10 / 1.11                   1.1               1.1               1.1
0.50 / 2                      2.0               1.8               1.9
0.90 / 10                     0.8               3.4               25.2
0.95 / 20                     1.8               7.7               -57.2
0.99 / 100                    8.3               25.6              -22.1

Average 90 Percent Empirical Sample Range for Estimated LRP of x
0.10 / 1.11                   0.9 to 1.3        1.0 to 1.2        1.0 to 1.2
0.50 / 2                      1.4 to 3.3        1.6 to 2.2        1.7 to 2.3
0.90 / 10                     -8.7 to 9.6       2.1 to 6.2        10.8 to 105.2
0.95 / 20                     -22.0 to 21.5     4.5 to 19.1       -263.0 to -21.4
0.99 / 100                    -103.1 to 105.6   -86.0 to 148.9    -27.2 to -18.6

Average Type I Error Rate (H0: LRP of x = true value)
0.10 / 1.11                   0.052             0.083             0.095
0.50 / 2                      0.073             0.191             0.143
0.90 / 10                     0.640             0.921             0.001
0.95 / 20                     0.583             0.746             0.618
0.99 / 100                    0.540             0.618             1.000

SOURCE: All Monte Carlo experiments were run using Version 13 of Stata. 10,000 replications were run for each experiment. The experiments are described in more detail in Section IV of the text. The associated simulation programs are given in “TABLE4A” and “TABLE4B” in the Appendix.

TABLE 5
Bias-reduced estimate in an ARDL Model: $y_t = \beta_0 + \beta_y y_{t-1} + \beta_x x_t + \varepsilon_t$; LRP of x = $\beta_x / (1 - \beta_y)$

True Value of $\beta_y$ /                    SAMPLE SIZE
True Value of LRP of x        T=10              T=50              T=1000

OLS estimate / Jackknifing estimate of $\beta_y$
0.10                          0.33 / 0.95       0.88 / 0.10       0.10 / 0.10
0.50                          0.37 / 0.48       0.48 / 0.50       0.50 / 0.50
0.90                          0.70 / 0.85       0.86 / 0.90       0.90 / 0.90
0.95                          0.73 / 0.89       0.90 / 0.95       0.95 / 0.95
0.99                          0.76 / 0.93       0.94 / 0.99       0.99 / 0.99

Median Value of Estimated LRP of x – Bias-corrected
0.10 / 1.11                   1.09              1.11              1.11
0.50 / 2                      1.5               2.0               2.0
0.90 / 10                     1.1               8.1               10.1
0.95 / 20                     0.9               9.1               20.2
0.99 / 100                    0.7               6.7               100.1

90 Percent Empirical Sample Range for Estimated LRP of x – Bias-corrected
0.10 / 1.11                   0.2 to 3.3        0.8 to 1.5        1.0 to 1.2
0.50 / 2                      -5.5 to 8.5       1.3 to 3.1        1.8 to 2.2
0.90 / 10                     -16.8 to 17.1     -34.2 to 48.8     8.0 to 12.0
0.95 / 20                     -17.3 to 19.7     -70.5 to 85.0     15.6 to 26.6
0.99 / 100                    -17.0 to 18.7     -99.6 to 101.7    47.1 to 335.3

TABLE 6
Bias-reduced estimate of Hurwicz Bias in DPD Model: $y_{it} = \beta_0 + \beta_y y_{i,t-1} + \beta_x x_{it} + \varepsilon_{it}$, $\varepsilon_{it} = a_i + u_{it}$; LRP of x = $\beta_x / (1 - \beta_y)$; N=50, T=10

True Value of $\beta_y$ /                    ESTIMATOR
True Value of LRP of x        Anderson-Hsiao    Difference GMM    System GMM

OLS estimate / Jackknifing estimate of $\beta_y$
0.10                          0.10 / 0.10       0.79 / 0.95       0.96 / 0.10
0.50                          0.49 / 0.50       0.46 / 0.50       0.79 / 0.51
0.90                          0.36 / 0.34       0.81 / 0.90       0.96 / 0.96
0.95                          0.86 / 0.98       0.91 / 0.97       1.01 / 1.00
0.99                          0.98 / 0.99       0.97 / 1.00       1.05 / 1.04

Median Value of Estimated LRP of x – Bias-corrected
0.10 / 1.11                   1.11              1.10              1.12
0.50 / 2                      2.0               2.0               2.0
0.90 / 10                     0.3               9.1               25.7
0.95 / 20                     1.1               20.1              -64.7
0.99 / 100                    6.3               -25.7             -25.9

90 Percent Empirical Sample Range for Estimated LRP of x – Bias-corrected
0.10 / 1.11                   0.9 to 1.4        1.0 to 1.2        1.0 to 1.3
0.50 / 2                      1.3 to 3.9        1.6 to 2.4        1.7 to 2.4
0.90 / 10                     -4.2 to 4.5       -14.9 to 46.6     9.4 to 113.7
0.95 / 20                     -20.1 to 23.5     -157.5 to 179.6   -502.9 to 419.7
0.99 / 100                    -91.0 to 87.3     -375.3 to 348.7   -37.8 to -20.0

APPENDIX: Stata .do files that accompany this paper

NOTE #1: These .do files are included to make it possible for readers to (i) confirm this study’s findings, and (ii) investigate alternative DGP specifications.

NOTE #2: The output for each table is produced by a two-part program. For Table X, the TABLEXA program must be run first, followed by the TABLEXB program. The latter program calls the former program and produces the results reported in the table. The computing time necessary to produce the results for a table is given at the beginning of the respective B program.

“TABLE1A” program

program drop _all
program define hurwicz, rclass
    version 13
    syntax, beta0(real) betay(real) betau(real) smallobs(integer) bigobs(integer)
    // Remove existing variables
    drop _all
    // Create the data
    set obs `bigobs'
    gen t = _n
    tsset t
    // We initialize y as its LR equilibrium value
    gen y = `beta0'/(1-`betay')
    replace y = `beta0' + `betay'*L.y + `betau'*rnormal() in 2/l
    // Estimate using only the last `smallobs' observations
    regress y L.y in -`smallobs'/l
    return scalar bhaty = _b[L.y]
    test _b[L.y] = `betay'
    return scalar pvalue = r(p)
end

“TABLE1B” program

// This program takes approximately 30 mins to run on my laptop.
// The programs must be run in the following order: (i) hurwicz1A, then (ii) hurwicz1B.
// The program "etime" is a user-written .do file that can be obtained online
etime, start
drop _all
clear
graph drop _all
set more off
set seed 13
matrix meanbhaty = J(5,5,0)
matrix meanRR = J(5,5,0)
local i = 1
foreach betay in 0.1 0.5 0.9 0.95 0.99 {
    local j = 1
    foreach smallobs in 15 30 50 100 1000 {
        simulate bhaty = r(bhaty) pvalue = r(pvalue), ///
            reps(10000): hurwicz, betay(`betay') smallobs(`smallobs') bigobs(1100) ///
            beta0(1) betau(1)
        summ bhaty, meanonly
        matrix meanbhaty[`i', `j'] = r(mean)
        // Rejection indicator at the 5 percent significance level
        generate RejectRate = 0
        replace RejectRate = 1 if pvalue < 0.05
        summ RejectRate, meanonly
        matrix meanRR[`i', `j'] = r(mean)
        local ++j
    }
    local ++i
}
matrix colnames meanbhaty = T15 T30 T50 T100 T1000
matrix rownames meanbhaty = B10 B50 B90 B95 B99
matrix colnames meanRR = T15 T30 T50 T100 T1000
matrix rownames meanRR = B10 B50 B90 B95 B99
matrix list meanbhaty
matrix list meanRR
etime


“TABLE2A” program

program drop _all
program define ARDLprog, rclass
    version 13
    syntax, beta0(real) betay(real) betax(real) betau(real) smallobs(integer) bigobs(integer)
    // Remove existing variables
    drop _all
    // Create the data
    set obs `bigobs'
    gen t = _n
    tsset t
    gen x = rnormal()
    gen u = rnormal()
    // We initialize y as its LR equilibrium value
    gen y = `beta0'/(1-`betay')
    replace y = `beta0' + `betax'*x + `betay'*L.y + u in 2/l
    // Estimate using only the last `smallobs' observations
    regress y L.y x in -`smallobs'/l
    return scalar LRP = _b[x]/(1-_b[L.y])
    testnl _b[x]/(1-_b[L.y]) = `betax'/(1-`betay')
    return scalar pLRP = r(p)
end


“TABLE2B” program

// This program takes approximately 20 mins to run on my laptop.
// The programs must be run in the following order: (i) hurwicz2A, then (ii) hurwicz2B.
// The program "etime" is a user-written .do file that can be obtained online
etime, start
drop _all
clear
graph drop _all
set more off
set seed 13
matrix medLRP = J(5,3,0)
matrix confidLRP = J(5,6,0)
matrix meanRR = J(5,3,0)
local i = 1
foreach betay in 0.1 0.5 0.9 0.95 0.99 {
    local j = 1
    foreach smallobs in 10 50 1000 {
        simulate LRP = r(LRP) pLRP = r(pLRP), ///
            reps(10000): ARDLprog, betay(`betay') smallobs(`smallobs') bigobs(1100) ///
            beta0(1) betau(1) betax(1)
        summ LRP, detail
        matrix medLRP[`i', `j'] = r(p50)
        matrix confidLRP[`i', (`j'-1)*2+1] = r(p5)
        matrix confidLRP[`i', (`j'-1)*2+2] = r(p95)
        // Rejection indicator at the 5 percent significance level
        generate RejectRate = 0
        replace RejectRate = 1 if pLRP < 0.05
        summ RejectRate, meanonly
        matrix meanRR[`i', `j'] = r(mean)
        local ++j
    }
    local ++i
}
matrix colnames medLRP = "T10(P50)" "T50(P50)" "T1000(P50)"
matrix rownames medLRP = B10 B50 B90 B95 B99
matrix colnames confidLRP = "T10(P5)" "T10(P95)" "T50(P5)" "T50(P95)" ///
    "T1000(P5)" "T1000(P95)"
matrix rownames confidLRP = B10 B50 B90 B95 B99
matrix colnames meanRR = T10 T50 T1000
matrix rownames meanRR = B10 B50 B90 B95 B99
matrix list medLRP
matrix list confidLRP
matrix list meanRR
etime


“TABLE3A” program

program drop _all
program define DPDprog, rclass
    version 13
    syntax, beta0(real) betax(real) betay(real) numN(integer) numT(integer) numNT(integer) ///
        beffect(real)
    drop _all
    set obs `numN'
    gen id = _n
    // "ai" is the part of the error term that doesn't change over time for a given
    // unit.
    gen ai = rnormal()
    expand `numT'
    bysort id: gen t = _n
    xtset id t
    gen x = (`beffect'*ai + rnormal())/sqrt(1+`beffect'^2)
    // "uit" is the part of the error term that varies across time
    gen uit = rnormal()
    // The total error term is the sum of "a" and "e"
    gen error = ai + uit
    gen y = `beta0'/(1-`betay')
    replace y = `beta0' + `betax'*x + `betay'*L.y + error if t > 1
    // This section calculates AH estimates
    xtivreg y x (L.y = L(2/3).y) if t > 10, fd
    return scalar AHLRP = _b[D.x]/(1-_b[LD.y])
    testnl _b[D.x]/(1-_b[LD.y]) = `betax'/(1-`betay')
    return scalar pAHLRP = r(p)
    // This section calculates Difference GMM
    xtabond y x if t > 10
    return scalar DGMMLRP = _b[x]/(1-_b[L.y])
    testnl _b[x]/(1-_b[L.y]) = `betax'/(1-`betay')
    return scalar pDGMMLRP = r(p)
    // This section calculates System GMM
    xtdpdsys y x if t > 10
    return scalar SGMMLRP = _b[x]/(1-_b[L.y])
    testnl _b[x]/(1-_b[L.y]) = `betax'/(1-`betay')
    return scalar pSGMMLRP = r(p)
end


“TABLE3B” program

// This program takes approximately 70 mins to run on my laptop.
// The programs must be run in the following order: (i) hurwicz3A, then (ii) hurwicz3B.
// The program "etime" is a user-written .do file that can be obtained online
etime, start
drop _all
clear
set more off
set seed 13
matrix medLRP = J(5,3,0)
matrix confidLRP = J(5,6,0)
matrix meanRR = J(5,3,0)
// The local commands below set all the parameters for the experiments.
local beffect = 0
// "beffect" controls the degree of correlation between x and the fixed effect.
// When beffect = 0, there is no omitted variable bias. As beffect increases,
// omitted variable bias increases.
local numN = 50
// This sets the number of cross-sectional units
local numT = 20
// This sets the number of time observations per unit
// Note that the estimations will only use the last 10 of these observations
local numNT = `numN'*`numT'
// This sets the total number of observations
local beta0 = 1
// This sets the intercept term
local betax = 1
// This sets the slope coefficient for x
local reps = 10000
local i = 1
foreach betay in 0.1 0.5 0.9 0.95 0.99 {
    simulate AHLRP = r(AHLRP) pAHLRP = r(pAHLRP) DGMMLRP = r(DGMMLRP) ///
        pDGMMLRP = r(pDGMMLRP) SGMMLRP = r(SGMMLRP) pSGMMLRP = r(pSGMMLRP), ///
        reps(`reps'): DPDprog, betay(`betay') betax(`betax') beta0(`beta0') ///
        numN(`numN') numT(`numT') numNT(`numNT') beffect(`beffect')
    summ AHLRP, detail
    matrix medLRP[`i',1] = r(p50)
    matrix confidLRP[`i',1] = r(p5)
    matrix confidLRP[`i',2] = r(p95)
    generate RRAHLRP = 0
    replace RRAHLRP = 1 if pAHLRP < 0.05
    summ RRAHLRP, meanonly
    matrix meanRR[`i', 1] = r(mean)
    summ DGMMLRP, detail
    matrix medLRP[`i',2] = r(p50)
    matrix confidLRP[`i',3] = r(p5)
    matrix confidLRP[`i',4] = r(p95)
    generate RRDGMMLRP = 0
    replace RRDGMMLRP = 1 if pDGMMLRP < 0.05
    summ RRDGMMLRP, meanonly
    matrix meanRR[`i', 2] = r(mean)
    summ SGMMLRP, detail
    matrix medLRP[`i',3] = r(p50)
    matrix confidLRP[`i',5] = r(p5)
    matrix confidLRP[`i',6] = r(p95)
    generate RRSGMMLRP = 0
    replace RRSGMMLRP = 1 if pSGMMLRP < 0.05
    summ RRSGMMLRP, meanonly
    matrix meanRR[`i', 3] = r(mean)
    local ++i
}
// These commands print out the results
matrix colnames medLRP = AHLRP(p50) DGMMLRP(p50) SGMMLRP(p50)
matrix rownames medLRP = B10 B50 B90 B95 B99
matrix colnames confidLRP = AHLRP(p5) AHLRP(p95) DGMMLRP(p5) DGMMLRP(p95) ///
    SGMMLRP(p5) SGMMLRP(p95)
matrix rownames confidLRP = B10 B50 B90 B95 B99
matrix colnames meanRR = AHLRP DGMMLRP SGMMLRP
matrix rownames meanRR = B10 B50 B90 B95 B99
matrix list medLRP
matrix list confidLRP
matrix list meanRR
etime


“TABLE4A” program

program drop _all
program define DPDprog, rclass
    version 13
    syntax, beta0(real) betax(real) betay(real) numN(integer) numT(integer) numNT(integer) ///
        beffect(real)
    drop _all
    set obs `numN'
    gen id = _n
    gen ai = rnormal()
    // "ai" is the part of the error term that doesn't change over time for a given unit.
    expand `numT'
    bysort id: gen t = _n
    xtset id t
    gen x = (`beffect'*ai + rnormal())/sqrt(1+`beffect'^2)
    gen uit = rnormal()
    // "uit" is the part of the error term that varies across time
    // The total error term is the sum of "a" and "e"
    gen error = ai + uit
    gen y = `beta0'/(1-`betay')
    replace y = `beta0' + `betax'*x + `betay'*L.y + error if t > 1
    // This section calculates AH estimates
    xtivreg y x (L.y = L(2/3).y) if t > 15, fd
    return scalar AHLRP = _b[D.x]/(1-_b[LD.y])
    testnl _b[D.x]/(1-_b[LD.y]) = `betax'/(1-`betay')
    return scalar pAHLRP = r(p)
    // This section calculates Difference GMM
    xtabond y x if t > 15
    return scalar DGMMLRP = _b[x]/(1-_b[L.y])
    testnl _b[x]/(1-_b[L.y]) = `betax'/(1-`betay')
    return scalar pDGMMLRP = r(p)
    // This section calculates System GMM
    xtdpdsys y x if t > 15
    return scalar SGMMLRP = _b[x]/(1-_b[L.y])
    testnl _b[x]/(1-_b[L.y]) = `betax'/(1-`betay')
    return scalar pSGMMLRP = r(p)
end


“TABLE4B” program

// This program takes approximately 100 mins to run on my laptop.
// The programs must be run in the following order: (i) hurwicz3A, then (ii) hurwicz3B.
// The program "etime" is a user-written .do file that can be obtained online
etime, start
drop _all
clear
set more off
set seed 13
matrix medLRP = J(5,3,0)
matrix confidLRP = J(5,6,0)
matrix meanRR = J(5,3,0)
// The local commands below set all the parameters for the experiments.
local beffect = 0
// "beffect" controls the degree of correlation between x and the fixed effect.
// When beffect = 0, there is no omitted variable bias. As beffect increases,
// omitted variable bias increases.
local numN = 140
// This sets the number of cross-sectional units
local numT = 20
// This sets the number of time observations per unit
// Note that the estimations will only use the last 5 of these observations
local numNT = `numN'*`numT'
// This sets the total number of observations
local beta0 = 1
// This sets the intercept term
local betax = 1
// This sets the slope coefficient for x
local reps = 10000
local i = 1
foreach betay in 0.1 0.5 0.9 0.95 0.99 {
    simulate AHLRP = r(AHLRP) pAHLRP = r(pAHLRP) DGMMLRP = r(DGMMLRP) ///
        pDGMMLRP = r(pDGMMLRP) SGMMLRP = r(SGMMLRP) pSGMMLRP = r(pSGMMLRP), ///
        reps(`reps'): DPDprog, betay(`betay') betax(`betax') beta0(`beta0') ///
        numN(`numN') numT(`numT') numNT(`numNT') beffect(`beffect')
    summ AHLRP, detail
    matrix medLRP[`i',1] = r(p50)
    matrix confidLRP[`i',1] = r(p5)
    matrix confidLRP[`i',2] = r(p95)
    generate RRAHLRP = 0
    replace RRAHLRP = 1 if pAHLRP < 0.05
    summ RRAHLRP, meanonly
    matrix meanRR[`i', 1] = r(mean)
    summ DGMMLRP, detail
    matrix medLRP[`i',2] = r(p50)
    matrix confidLRP[`i',3] = r(p5)
    matrix confidLRP[`i',4] = r(p95)
    generate RRDGMMLRP = 0
    replace RRDGMMLRP = 1 if pDGMMLRP < 0.05
    summ RRDGMMLRP, meanonly
    matrix meanRR[`i', 2] = r(mean)
    summ SGMMLRP, detail
    matrix medLRP[`i',3] = r(p50)
    matrix confidLRP[`i',5] = r(p5)
    matrix confidLRP[`i',6] = r(p95)
    generate RRSGMMLRP = 0
    replace RRSGMMLRP = 1 if pSGMMLRP < 0.05
    summ RRSGMMLRP, meanonly
    matrix meanRR[`i', 3] = r(mean)
    local ++i
}
// These commands print out the results
matrix colnames medLRP = AHLRP(p50) DGMMLRP(p50) SGMMLRP(p50)
matrix rownames medLRP = B10 B50 B90 B95 B99
matrix colnames confidLRP = AHLRP(p5) AHLRP(p95) DGMMLRP(p5) DGMMLRP(p95) ///
    SGMMLRP(p5) SGMMLRP(p95)
matrix rownames confidLRP = B10 B50 B90 B95 B99
matrix colnames meanRR = AHLRP DGMMLRP SGMMLRP
matrix rownames meanRR = B10 B50 B90 B95 B99
matrix list medLRP
matrix list confidLRP
matrix list meanRR
etime
