Uppsala University Department of Economics Uppsala, Sweden B.Sc. Thesis Fall, 2005 Small sample performances of two tests for overidentifying restrictions Author: Can Tongur Thesis advisor: Matz Dahlberg


Abstract

Two new specification tests for overidentifying restrictions proposed by Hahn and Hausman (2002:b) are here tested and compared to the classical Sargan test. Power properties are found to be very similar in overall performance, while the Sargan test generally has better size than the new tests. Also, size is distorted for one of the new tests, so that a tendency to over-reject prevails. In addition, sometimes severe bias is found which affects the performance of the tests, something that differs from earlier studies.

Keywords: specification tests, weak instruments, size, power.

Note: Gauss code for this study is available on request. Correspondence through my email:

[email protected]


Contents

1 Introduction
2 Fundamentals of the problem
2.1 Economic Introduction
2.2 Basics of endogeneity in cross-sectional modelling
2.3 Instrumenting
3 Earlier work
4 Specification tests for IV methods
4.1 Sargan and HH Test
4.2 The Bias of TSLS and OLS
5 The Monte Carlo study
5.1 The Design
5.2 The Data Generating Process
6 Results
7 Summary
Concluding remarks
References
Appendix A1: Tables
Appendix A2: The partialling out


1 Introduction

A common problem in many kinds of economic models is endogeneity in the explanatory variables: some explanatory variables are not predetermined but are determined within the model, so that a single equation contains more than one dependent variable. The econometric consequence is that Ordinary Least Squares (OLS) is no longer valid, since its fundamental assumptions are violated and an endogeneity bias prevails in the model. There is not always a straightforward way of solving the endogeneity problem once it is discovered, so an accurate solution method is needed to obtain consistency in econometric modelling. One general solution to this kind of problem is an Instrumental Variable (IV) approach, typically using a Two Stage Least Squares (TSLS) estimator instead of OLS to obtain consistent estimates of the underlying economic model. The main issue is then to have a correctly specified model in the TSLS, which is not always easy in applications. Variants of Sargan's test have mainly been used to test for a correctly specified model. Alternatives to this test have, however, evolved, and one of them is discussed in this paper.

The new specification tests proposed by Hahn and Hausman (2002:b) are here examined for their size and power properties in small samples by Monte Carlo simulations. A comparison is made with Sargan's test for different specifications of the data generating process, to determine whether the new tests have better size¹ and power properties when the instruments are either all weak or all weak except for one strong instrument. Weakness is a well-investigated issue, since strong instruments are hard to find in applications. Size has been examined earlier for some combinations, so the aim here is to study power. To my knowledge, no study exists of the power properties of the new tests, although such a study is proposed by Hausman et al. (2004). Even though the focus is on power, some size properties are examined for comparison with earlier studies, which makes the power results more transparent. In addition, since the new specification tests involve bias terms, either through second-order derivations or through bias corrections, bias will be reported for all cases.

The rest of the paper is structured as follows. The second section gives a brief introduction to the endogeneity problem and its remedy. The third section presents some earlier studies on solving endogeneity and on the corresponding specification tests. The fourth section presents the Sargan test and the Hahn and Hausman test (the HH test from now on) and their derivations. The fifth section presents the experimental design. The results from the simulations are given in the sixth section, and a brief discussion concludes the paper in section seven.

¹ Size is the probability of rejecting a correct null hypothesis and power is the probability of rejecting a false null hypothesis. It is desirable that the empirical size is close to the nominal size and that power is high.

2. The fundamentals of the problem

2.1 Economic introduction

In many econometric studies of economic problems, variables are found to be endogenous, which means that they are not predetermined. This implies, for instance, that one equation may have more than one dependent variable. In fact, this is very common: in macroeconomic applications, the variables in a model often determine each other. For instance, GNP is set by consumption, while consumption is set partly by GNP in the preceding year. In microeconomics, estimation of market equilibria depends on simultaneous movements in price, demand and supply, which are all interdependent. The very commonly used Mincer equations for labour wages face the same problem: factors explaining wages are often set by parental wages, and schooling is often correlated with social background. Inference in the presence of this kind of endogeneity requires an accurate solution, but good solutions are far from always available. The interaction of three factors is then fundamental: the method for solving the endogeneity, here the IV method illustrated by TSLS; the adequacy of the instruments; and the specification tests.

2.2 Basics of endogeneity in cross-sectional modelling

In a regular linear regression model estimated by OLS, the explanatory variables are expected to be predetermined. If they are determined within the model, the fundamental assumption that the regressors are orthogonal to the error term is violated, and the OLS estimates are inconsistent. If one regressor is not predetermined, there has to be another equation that explains the endogenous regressor, in order to avoid this bias. Thus a system of M equations is set up:

$$Y = \alpha + \beta_1 X_1 + \beta_2 X_2 + \varepsilon_1 \tag{1}$$

$$X_1 = \gamma + \eta Z + \nu_1 \tag{2}$$

Here, M = 2 and $X_1$ is an endogenous regressor. Equation (2) then solves the system by assigning presumably orthogonal instrumental variables Z to the endogenous regressor in (1).²

A system of equations is not solvable by OLS.³ Instead, an approach within the Generalized Method of Moments (GMM) may be used, as may Maximum Likelihood estimators (extremum estimators). The problem of non-orthogonality is then remedied by solving the M equations of the system simultaneously by GMM, an approach which uses the sample moments, i.e. the mean and the variance, of the data in (3) and clears out the endogeneity. Here the system is solved by Two-Stage Least Squares, a special case of single-equation GMM. If more endogenous variables entered the model, a multiple-equation GMM approach would be necessary.

2.3 Instrumenting

Since one of the regressors is not predetermined, it must be replaced by an instrument. The solution, using for instance TSLS, is to assign the instruments Z as substitutes for the endogenous variable. To be defined as valid, the instruments in Z are assumed to be uncorrelated with the error term in the structural equation (1) but correlated with the endogenous regressor. The excluded instruments Z then enter the system through equation (2), and a consistent solution is possible. If any of the instruments in Z is not orthogonal to the error term, it is defined as an invalid instrument, since it would suffer from the same kind of endogeneity as the regressor it replaces and give rise to bias.

The endogeneity in equation (1) may be considered weak when the correlation between the endogenous regressor and the original Left Hand Side (LHS) variable is weak. Conversely, if the correlation between them is strong, strong endogeneity prevails in the model.⁴

Two conditions must be fulfilled for IV estimation. First, instruments have to be relevant, i.e. have some explanatory power. Instruments are considered weak if their explanatory power in equation (2) is low, which implies that they are only weakly correlated with the endogenous regressor. This may be seen from a partial $R^2$ of Z in (1).

² The orthogonality assumption for any regressor j in any equation m for any individual i is $E(X_{mji} \cdot \varepsilon_{mi}) = 0$, which implies that the expected value of the product of any arbitrary regressor and the error term is zero, i.e. they do not have cross moments.

³ Of course, OLS may be conducted, with inconsistency and bias as a consequence.

⁴ This correlation affects the error component, resulting in non-orthogonality.

If the correlation between the instruments and the endogenous regressor tends to zero, that is, when the partial $R^2$ goes to zero, the TSLS model will tend to be based on irrelevant instruments and presumably be biased towards OLS. Since the order condition for identification of any Instrumental Variable approach is that at least as many instruments K as endogenous variables L are used, $K \geq L$, too weak instruments give a bias similar to that of OLS (Baum et al., 2003). If the correlation between the instruments and the endogenous regressor is strong, the instruments have greater explanatory power in (2) and we no longer have a weak instrument bias.

Second, apart from having explanatory power, instruments have to be valid, which means that

they must be orthogonal to the error term in (1), as explained in note 2.
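The relevance condition can be made concrete with a small numerical sketch. The Python code below (NumPy only) computes the $R^2$ of a first-stage regression like (2) for a strong and a weak set of instruments. The function name, the sample size and all coefficient values are illustrative assumptions, not taken from this study; with only a constant as included regressor, this $R^2$ coincides with the partial $R^2$ of the instruments.

```python
import numpy as np

def first_stage_r2(x, Z):
    """R^2 from regressing the endogenous regressor x on a constant and the instruments Z."""
    n = len(x)
    W = np.column_stack([np.ones(n), Z])          # constant + instruments
    coef, *_ = np.linalg.lstsq(W, x, rcond=None)  # first-stage OLS
    resid = x - W @ coef
    tss = np.sum((x - x.mean()) ** 2)
    return 1.0 - resid @ resid / tss

# Hypothetical data: three instruments, first-stage error v with unit variance
rng = np.random.default_rng(0)
n = 500
Z = rng.standard_normal((n, 3))
v = rng.standard_normal(n)
x_strong = Z @ np.array([1.0, 1.0, 1.0]) + v      # population R^2 = 3/4
x_weak = Z @ np.array([0.05, 0.05, 0.05]) + v     # population R^2 below 0.01

r2_strong = first_stage_r2(x_strong, Z)
r2_weak = first_stage_r2(x_weak, Z)
print(r2_strong, r2_weak)
```

A first-stage $R^2$ close to zero, as in the second case, is the situation in which TSLS is expected to be biased towards OLS.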

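If the required conditions are assumed to be met for TSLS, and if we divide the data as

$$\underset{(n \times L)}{X} = \begin{bmatrix} x_1' \\ x_2' \\ \vdots \\ x_n' \end{bmatrix}, \quad \underset{(n \times K)}{Z} = \begin{bmatrix} z_1' \\ z_2' \\ \vdots \\ z_n' \end{bmatrix}, \quad \underset{(n \times 1)}{y} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}, \tag{3}$$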

where the X matrix contains included endogenous and included predetermined regressors, the

Z matrix contains the originally excluded instruments (they appear first in equation (2)) and

included predetermined regressors. The y vector contains the dependent LHS variable. It is

then possible to derive the traditional TSLS estimator

$$\hat{\delta}_{TSLS} = [X'Z(Z'Z)^{-1}Z'X]^{-1} X'Z(Z'Z)^{-1}Z'y. \tag{4}$$

This estimator may be found in Hayashi (2000, p. 230; observe the converse notation of X and Z in the reference). The TSLS is applied to remedy the problem of endogeneity given orthogonal instruments, and may be considered computationally easy compared with other approaches, such as the LIML and the other no-moment estimators mentioned earlier. The no-moment estimators are referred to as no-moment since they do not use population moments in the computation of parameters as the GMM does; instead they are extremum estimators (the discussion of k-classification in section 4.1 explains more).
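As an illustration of the estimator in (4), the following Python sketch (NumPy only) implements the formula directly and compares it with OLS on simulated data. The data generating values below are hypothetical and chosen only so that the regressor is endogenous and the instruments are strong; they are not the design used in this study.

```python
import numpy as np

def tsls(y, X, Z):
    """TSLS estimator (4): [X'Z (Z'Z)^-1 Z'X]^-1 X'Z (Z'Z)^-1 Z'y."""
    ZtZ_inv_ZtX = np.linalg.solve(Z.T @ Z, Z.T @ X)   # (Z'Z)^-1 Z'X
    ZtZ_inv_Zty = np.linalg.solve(Z.T @ Z, Z.T @ y)   # (Z'Z)^-1 Z'y
    A = X.T @ Z @ ZtZ_inv_ZtX                         # X'Z (Z'Z)^-1 Z'X
    b = X.T @ Z @ ZtZ_inv_Zty                         # X'Z (Z'Z)^-1 Z'y
    return np.linalg.solve(A, b)

# Hypothetical simulated data with one endogenous regressor x1
rng = np.random.default_rng(1)
n = 5000
z = rng.standard_normal((n, 3))                  # excluded instruments
u = rng.standard_normal(n)                       # structural error
v = 0.8 * u + 0.6 * rng.standard_normal(n)       # first-stage error, correlated with u
x1 = z @ np.array([1.0, 1.0, 1.0]) + v
y = 10.0 + 3.0 * x1 + u

X = np.column_stack([np.ones(n), x1])            # included regressors (constant, x1)
Z = np.column_stack([np.ones(n), z])             # instruments (constant, z)

delta_tsls = tsls(y, X, Z)
delta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
print(delta_tsls, delta_ols)
```

In this sketch the OLS slope is pulled away from the true value of 3 by the correlation between x1 and u, while the TSLS slope stays close to it.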


3. Earlier work

In the field of instrumental variable methods (IV), two issues constitute the main part of

studies. Firstly, different instrumental variable methods are used to solve endogeneity

problems. The variety of estimators in this field has flourished, and some of these studies are

presented here since the importance of these estimators is large. Second, the estimators are

used in specification tests to test if the instrumental variable models are properly specified,

which may be hard to detect when instruments are weak as in our case.

The problem with detecting incorrectly specified models by using the Sargan test (1958) is a

well investigated issue. The small sample properties of the Sargan test have been well

explored and the general conclusion is that the test lacks adequacy in terms of size and power

and is easily affected by sample bias. For further knowledge, there is a vast literature to be

consulted. Bowsher (2000) and Dahlberg et al. (forthcoming, 2006) examine the power of the

Sargan test in dynamic panels and find the test to have poor power properties, from

substantially low to sometimes zero rejection in the Bowsher study. In another Monte Carlo

study, Blomquist et al. (2001) find that for a more general data generating process and with

many weak instruments and one strong instrument, the Sargan requires a large sample, around

1000 individuals, to be powerful.

The problem of incorrect models due to weak instruments, or detecting a failure of the

orthogonality condition, has been remedied somewhat by alternative estimators to the TSLS,

among which the Limited Information Maximum Likelihood (LIML) and various Jackknife

(JNTSLS) and Split-Sample estimators are used. A comparison of these estimators may be

found in a study by Blomquist and Dahlberg (1999), in which many weak instruments are

used. They find that the LIML has the least bias among the above mentioned ones. Hahn et al.

(2002) find that for very weak instruments, only the Fuller (1977) estimator may be

considered adequate, but for stronger instruments, LIML, Nagar and JNTSLS all perform

well. In the weak instruments case, Staiger and Stock (1997) derive the theoretical properties

of the TSLS and LIML to be non-equivalent even asymptotically. However, in a forthcoming

study by Blomquist and Dahlberg (2006), they find no estimator to be indeed satisfactory

when instruments are weak.

In conclusion, there is still an ongoing search for a good counterpart to the TSLS, and thereby also a search for the uniformly most powerful test corresponding to the Sargan test for overidentifying restrictions, especially when instruments are weak. This is a recent topic of research, to which Hahn and Hausman (2002:b) contribute by deriving a new specification test⁵ using second-order asymptotic theory for inference. They derive the asymptotic properties of the new test, study its size properties by Monte Carlo and find that the test tends to nominal size in many cases. Hahn and Hausman (2003) examine the asymptotic power properties of the HH test in the presence of exogeneity in the instruments, which means that they only derive the theoretical properties and not small sample performances. Hausman et al. (2005) examine the ability of the test to detect weak and sometimes irrelevant instruments and find a generally good pattern. The subject of power in small samples is thus still open, and this study focuses on it.

4. Specification tests for IV-methods

The field of straightforward tests developed to control for specification errors remains open. The most widely used test is still Sargan's statistic (1958), which tests for overidentifying restrictions under the assumption of conditional homoskedasticity. The Basmann (1960) F-test is similar to the Sargan statistic in its assumptions and, like Sargan, is no longer valid when the assumption of homoskedasticity is relaxed. The remedy is then to use an optimal weighting matrix for the error components, that is, efficient GMM. When this remedy against heteroskedasticity is used, the Sargan statistic becomes Hansen's J statistic. The main objection against using Hansen's J is that it requires finite fourth moments (Hayashi, 2000, p. 212), which in practice calls for a large sample, since the distribution is asymptotic. This is the main objection against efficient GMM, since the small sample properties may be very poor. As a solution, Hahn and Hausman (2002:b) propose a different approach to the problem of misspecification.
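A minimal Python sketch of the Sargan statistic under homoskedasticity is given here for orientation; its formal definition follows in (5) and (6). The statistic is computed as the TSLS residuals projected on the instruments, divided by the residual variance. The function name and the simulated data are assumptions made for illustration, and the instruments are deliberately strong, unlike in the main study.

```python
import numpy as np

def sargan(y, X, Z):
    """Sargan statistic e'P_Z e / (e'e/n), with e the TSLS residuals."""
    n = len(y)
    PZ_X = Z @ np.linalg.solve(Z.T @ Z, Z.T @ X)      # X projected on the instruments
    delta = np.linalg.solve(PZ_X.T @ X, PZ_X.T @ y)   # TSLS coefficients
    e = y - X @ delta                                 # TSLS residuals
    PZ_e = Z @ np.linalg.solve(Z.T @ Z, Z.T @ e)      # residuals projected on the instruments
    return (e @ PZ_e) / (e @ e / n)

# Hypothetical data: one endogenous regressor, three instruments
rng = np.random.default_rng(4)
n = 2000
Z = rng.standard_normal((n, 3))
u = rng.standard_normal(n)
v = 0.8 * u + 0.6 * rng.standard_normal(n)
x = Z @ np.ones(3) + v
X = x.reshape(-1, 1)

y_valid = 3.0 * x + u                      # all instruments orthogonal to the error
y_invalid = 3.0 * x + 0.5 * Z[:, 2] + u    # third instrument enters the error term

s_valid = sargan(y_valid, X, Z)            # approximately chi-square with K - L = 2 df
s_invalid = sargan(y_invalid, X, Z)        # far out in the right tail
s_exact = sargan(y_valid, X, Z[:, :1])     # exactly identified: statistic is zero
print(s_valid, s_invalid, s_exact)
```

The exactly identified case shows why overidentification is required: with as many instruments as regressors the projected residuals vanish and there is nothing left to test.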

4.1 Sargan Statistic and the HH Test

The Sargan statistic is a specification test for testing simultaneously if the restrictions for the

TSLS are fulfilled and also instrumental validity. The null hypothesis is that the model is

correctly specified so the instruments are orthogonal to the errors and the assumptions for the

TSLS estimator are fulfilled, while a rejection implies that either condition is violated. The

statistic is defined as

$$S = \frac{(y - X\hat{\delta}_{TSLS})' P (y - X\hat{\delta}_{TSLS})}{\hat{\sigma}^2} \sim \chi^2_{K-L}, \tag{5}$$

and as can be seen from the degrees of freedom, the model has to be overidentified, since the number of instruments K must be larger than the number of endogenous regressors L. The denominator in (5) is the TSLS variance, computed as

$$\hat{\sigma}^2 = \frac{\hat{\varepsilon}'\hat{\varepsilon}}{n}, \tag{6}$$

where, in the notation of (3) and (4), $P = Z(Z'Z)^{-1}Z'$ is the projection matrix on the subspace spanned by the instruments Z, $\hat{\varepsilon} = y - X\hat{\delta}_{TSLS}$ are the TSLS residuals from (4) and n is the sample size. The statistic in (5) is in our case based on the TSLS estimator $\hat{\delta}_{TSLS}$. Other estimators would technically work as well, but given the problems mentioned in the section on earlier studies, other estimators may lack adequacy.

⁵ They derive 3 variants of this test, of which 2 are used in this study.

An alternative way of testing the TSLS estimator, and thus of constructing a specification test, is proposed by Hahn and Hausman (2002:b), hence the HH test. From now on, we follow their notation in the derivations. If we start with the basic model

$$\begin{aligned} Y_1 &= \beta Y_2 + Z_1\gamma + u \\ Y_2 &= Z_1\pi_1 + Z_2\pi_2 + V, \end{aligned} \tag{7}$$

$Y_1$ and $Y_2$ are jointly endogenous variable vectors.⁶ $Z_2$ are the instruments, and the exogenous variables $Z_1$ in the first equation are partialled out from the system by multiplying through by the annihilator matrix of $Z_1$,

$$M_{Z_1} = I - Z_1(Z_1'Z_1)^{-1}Z_1' = I - P_{Z_1}, \tag{8}$$

where $P_{Z_1}$ is the projection matrix of $Z_1$.⁷ The remaining equations are then

$$\begin{aligned} y_1 &= \beta y_2 + \varepsilon_1 \\ y_2 &= z\pi_2 + \nu_1 \end{aligned} \tag{9}$$

and $\dim(\pi_2) = K$. No exogenous variables are then left in the first-stage equation, merely excluded instruments.

⁶ Throughout this study, we use only one endogenous right-hand-side (RHS) variable. Hahn and Hausman (2002:b) derive the corresponding features for multiple jointly endogenous RHS variables.

⁷ This procedure is derived in appendix A2.

In order to derive the test statistic, two different TSLS estimators are derived

$$b_{TSLS} = \frac{y_2'P_z y_1}{y_2'P_z y_2} \quad \text{and} \quad c_{TSLS} = \frac{y_1'P_z y_2}{y_1'P_z y_1}, \tag{10}$$

where $b_{TSLS}$ corresponds to the traditional forward estimator and $c_{TSLS}$ is the reverse estimator; they differ by the interchange of the left-hand-side (LHS) variable. $P_z$ is the projection matrix of the remaining Z variables in (9). The estimators have unit correlation and the inverse of $c_{TSLS}$ follows the same first-order distribution as the forward estimator, so with respect to first-order asymptotic theory the forward and reverse estimators are similar. If second-order theory is applied following Bekker (1994), the estimates will additionally differ by a second-order bias difference, derived as

$$\hat{B} \equiv -\frac{\hat{\Xi}}{\frac{1}{n}y_2'P_z y_2 \cdot \frac{1}{n}y_2'P_z y_1}, \tag{11}$$

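where the numerator is computed as

$$
\begin{aligned}
\hat{\Xi} = {} & \frac{\hat{\alpha}}{1-\hat{\alpha}} \left( \frac{1}{n} y_1'P_z y_1 - \frac{\hat{\alpha}}{1-\hat{\alpha}} \frac{1}{n} y_1'M_z y_1 \right) \left( \frac{1}{n} y_2'M_z y_2 \right) \\
& + \frac{\hat{\alpha}}{1-\hat{\alpha}} \left( \frac{1}{n} y_2'P_z y_2 - \frac{\hat{\alpha}}{1-\hat{\alpha}} \frac{1}{n} y_2'M_z y_2 \right) \left( \frac{1}{n} y_1'M_z y_1 \right) \\
& - 2\,\frac{\hat{\alpha}}{1-\hat{\alpha}} \left( \frac{1}{n} y_2'P_z y_1 - \frac{\hat{\alpha}}{1-\hat{\alpha}} \frac{1}{n} y_2'M_z y_1 \right) \left( \frac{1}{n} y_2'M_z y_1 \right) \\
& + \left( \frac{\hat{\alpha}}{1-\hat{\alpha}} \frac{1}{n} y_1'M_z y_1 \right) \left( \frac{\hat{\alpha}}{1-\hat{\alpha}} \frac{1}{n} y_2'M_z y_2 \right) - \left( \frac{\hat{\alpha}}{1-\hat{\alpha}} \frac{1}{n} y_2'M_z y_1 \right)^2
\end{aligned}
\tag{12}
$$

with $M_z = I - P_z$, and

$$\hat{\alpha} \equiv K/n \rightarrow \alpha. \tag{13}$$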

The numerator of the statistic is then defined as

$$H_0: \quad \hat{d}_1 = \sqrt{n}\left(b_{TSLS} - \frac{1}{c_{TSLS}} - \hat{B}\right) \rightarrow N(0, V), \tag{14}$$

where $\sqrt{n} \cdot \hat{B}$ is a consistent estimate of the probability limit of the difference in bias between the forward and reverse estimators.
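The partialling out in (8) and the forward and reverse estimators can be sketched numerically as follows. This is a minimal Python illustration (NumPy only) under stated assumptions: the reverse estimator is taken to be the coefficient from the reverse regression, so that its inverse is compared with the forward estimator; the included exogenous variable is only a constant; and all numerical values are hypothetical.

```python
import numpy as np

def proj(A):
    """Projection matrix onto the column space of A."""
    return A @ np.linalg.solve(A.T @ A, A.T)

def forward_reverse(y1, y2, z):
    """Forward estimator b and reverse estimator c computed from projections on z."""
    P = proj(z)
    b = (y2 @ P @ y1) / (y2 @ P @ y2)   # forward: y1 on y2
    c = (y1 @ P @ y2) / (y1 @ P @ y1)   # reverse: y2 on y1
    return b, c

# Hypothetical data with a constant as the only included exogenous variable
rng = np.random.default_rng(2)
n, beta = 400, 0.5
Z1 = np.ones((n, 1))
z_raw = rng.standard_normal((n, 3))
u = rng.standard_normal(n)
v = 0.5 * u + rng.standard_normal(n)
y2_raw = 1.0 + z_raw @ np.ones(3) + v
y1_raw = 2.0 + beta * y2_raw + u

# Partial out Z1 with the annihilator matrix M = I - P_{Z1}
M1 = np.eye(n) - proj(Z1)
y1, y2, z = M1 @ y1_raw, M1 @ y2_raw, M1 @ z_raw

b, c = forward_reverse(y1, y2, z)                    # overidentified: b and 1/c differ slightly
b_just, c_just = forward_reverse(y1, y2, z[:, :1])   # one instrument: b equals 1/c exactly
print(b, 1.0 / c, b_just, 1.0 / c_just)
```

With a single instrument the forward estimate and the inverse of the reverse estimate coincide, so the difference between them carries information only when the model is overidentified.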


The bias-subtracted form is normally distributed with variance V.⁸ The corresponding specification test is then t-distributed and takes the form

$$\hat{m}_1 = \frac{\hat{d}_1}{\hat{w}_1^{0.5}} \sim t, \tag{15}$$

where the denominator term, $\hat{w}_1^{0.5}$, is the square root of the consistent estimator $\hat{w}_1$ of the variance. Its expression, (16), is built from the factor $K/(n-K)$, the LIML residuals $y_{1i} - y_{2i}\beta_{LIML}$ summed over $i = 1, \dots, n$, and the quadratic forms $y_2'P_z y_2$ and $y_2'M_z y_2$; it is derived in Hahn and Hausman (2002:b).

The left hand side of equation (15) is from now on referred to as m1.

The assumption under the null is that the difference between the two estimates and the magnitude of the bias are small. Under the assumption of homoskedasticity, the HH test is directly comparable with the Sargan statistic. Also, as for the Sargan statistic, the minimum requirement is that the model is overidentified of degree one and, like the Sargan conditions, a rejection may occur in (15) if the instruments are not orthogonal or if they are too weak, tending to irrelevancy. The statistic, and especially the bias, is subject to an interaction between the sample size n, the $R^2$ of the reduced form equation, the number of instruments K and the orthogonality between the endogenous variables. If this test rejects the null, Hahn and Hausman propose, as a consistent alternative to the preceding Bekker asymptotic theory, that the model be tested with the Nagar-form estimator, originating from Nagar (1959), in which higher order moments are derived for k-class estimators (see below for an explanation). Again, they divide the estimators into forward and reverse⁹ estimators respectively,

$$b_{TSLS} \equiv \frac{y_2'P_z y_1 - \lambda\, y_2'M_z y_1}{y_2'P_z y_2 - \lambda\, y_2'M_z y_2}, \qquad c_{TSLS} \equiv \frac{y_2'P_z y_1 - \lambda\, y_2'M_z y_1}{y_1'P_z y_1 - \lambda\, y_1'M_z y_1}, \tag{17}$$

where¹⁰

$$\lambda = \frac{(K-2)/n}{1 - (K-2)/n}. \tag{18}$$

⁸ The variance itself and all theoretical properties are derived by Hahn and Hausman (2002:b); only the expressions are needed for this study.

⁹ This is how Hahn and Hausman (2002:b) and Hausman et al. (2004) derive the reverse estimator. To be theoretically correct and consistent with the reverse estimator in (10), the cross-product terms in the reverse estimator (17) should have the endogenous variables interchanged. However, the results are the same regardless of the order of multiplication in this case, since both are vectors and $\dim(P_z)$ is $n \times n$.

The difference between these two estimates sets up the numerator

$$\hat{d}_2 = \sqrt{n}\,(b_{TSLS} - 1/c_{TSLS}). \tag{19}$$

Since the Nagar estimators are both bias-corrected,¹¹ no bias term is subtracted, and the expression is simpler than in (14). The statistic is similar to (15),

$$\hat{m}_2 = \frac{\hat{d}_2}{\hat{w}_2^{0.5}} \sim t. \tag{20}$$

Equation (20) is from now on referred to as m2. The standard deviation $\hat{w}_2^{0.5}$ follows from the variance estimator $\hat{w}_2$ in (21), which is built from the factor $K/(n-K)$, the LIML residuals $y_{1i} - y_{2i}\beta_{LIML}$ summed over $i = 1, \dots, n$, the estimate $\beta_{LIML}$ and the quadratic forms $y_2'P_z y_2$ and $y_2'M_z y_2$; it is derived in Hahn and Hausman (2002:b).

For both of the variance calculations, the no-moments estimator LIML should have the identical theoretical properties of the asymptotic variance for the variance estimates, as proposed in Hahn and Hausman (2002:b). The modified LIML estimator proposed by Fuller (1977) may also be used, since it is considered better than the LIML;¹² however, Hausman et al. (2005) state that any other IV estimator may be used for this purpose. The LIML estimator, the modified LIML estimator and the Nagar type of estimators are all defined as k-class estimators, as they use a certain specification depending on k eigenvalues (see Hayashi, 2000, p. 541, for a simple derivation of the LIML in k-classification). OLS may be seen as an entry to this classification, a special case with k=0. Both the Maximum Likelihood estimators and Nagar's follow this k-classification.¹³

¹⁰ As can be seen, K>2 is required for the Nagar estimator to be well-defined. If the degree of overidentification is one, the forward and reverse estimators converge with each other in the Hahn and Hausman (2002:b) application.

¹¹ A derivation starting from the bias-corrected $\beta_{TSLS}$ converging to $\beta_{Nagar}$ may be found in Hahn and Hausman (2002:a).

¹² This is proposed by Jerry Hausman in e-mail communication. Also, Fuller has proved to have better size, see Hausman et al. (2005). However, it is still a no-moment estimator.

¹³ Since k-class estimators are used, Hahn and Hausman (2002:b) derive the Bekker approximations to be more accurate when few instruments are used, rather than using second order Edgeworth expansions commonly used for large numbers of instruments.
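The k-classification mentioned above can be illustrated with a short Python sketch (NumPy only) for the one-regressor model in (9). It assumes the standard k-class form $\beta_k = y_2'(I - kM_z)y_1 / y_2'(I - kM_z)y_2$, in which k = 0 gives OLS and k = 1 gives TSLS. Since $P_z - \lambda M_z = I - (1+\lambda)M_z$, an estimator of the form $(y_2'P_z y_1 - \lambda y_2'M_z y_1)/(y_2'P_z y_2 - \lambda y_2'M_z y_2)$ corresponds to $k = 1 + \lambda$. The data values are hypothetical.

```python
import numpy as np

def k_class(y1, y2, z, k):
    """k-class estimate of beta in y1 = beta * y2 + eps, with instruments z."""
    p_y2 = z @ np.linalg.solve(z.T @ z, z.T @ y2)   # P_z y2
    m_y2 = y2 - p_y2                                # M_z y2
    w = y2 - k * m_y2                               # (I - k M_z) y2
    return (w @ y1) / (w @ y2)

# Hypothetical data: K = 5 instruments of moderate strength
rng = np.random.default_rng(3)
n, K = 300, 5
z = rng.standard_normal((n, K))
u = rng.standard_normal(n)
y2 = z @ np.full(K, 0.5) + 0.5 * u + rng.standard_normal(n)
y1 = 0.5 * y2 + u

lam = ((K - 2) / n) / (1 - (K - 2) / n)   # lambda as in (18)
b_ols = k_class(y1, y2, z, 0.0)           # k = 0
b_tsls = k_class(y1, y2, z, 1.0)          # k = 1
b_nagar = k_class(y1, y2, z, 1.0 + lam)   # k = 1 + lambda

# Direct evaluation of the lambda form, for comparison with k = 1 + lambda
p_y2 = z @ np.linalg.solve(z.T @ z, z.T @ y2)
m_y2 = y2 - p_y2
b_nagar_direct = (p_y2 @ y1 - lam * (m_y2 @ y1)) / (p_y2 @ y2 - lam * (m_y2 @ y2))
print(b_ols, b_tsls, b_nagar, b_nagar_direct)
```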


4.2 The Bias of TSLS and OLS

One of the major differences between OLS and IV coefficient estimates, say TSLS estimates, when endogenous regressors are present is their respective bias. As stated in Hahn and Hausman (2002:a), empirical work shows that TSLS estimates often differ substantially in magnitude from the corresponding OLS estimates. The derivation of the bias, from the same study, illustrates the problem, here shown only through the approximations:

$$E[b_{TSLS}] - \beta \approx \frac{K\sigma_{\varepsilon_1 v_1}}{n\,\pi_2' R\, \pi_2 + K\sigma_{v_1 v_1}}, \tag{22}$$

and the approximate OLS bias follows as

$$E[b_{OLS}] - \beta \approx \frac{\sigma_{\varepsilon_1 v_1}}{\pi_2' R\, \pi_2 + \sigma_{v_1 v_1}}. \tag{23}$$

The parameters not yet introduced in (22) and (23) are R, which is the square root of $R^2$ from the second equation in (9), and $\sigma_{\varepsilon_1 v_1}$, which is the residual covariance from equation (9). The most influential difference between (22) and (23) lies in the denominators: the TSLS bias approximation has n in the denominator (it is n-consistent), which implies that for a large sample the bias tends to zero. This n-correction reminds us of CAN, Consistent and Asymptotically Normal (Hayashi, 2000, p. 95), which is the case for our statistics in (14) and (19). This is not the case for the OLS bias, so OLS is inconsistent. The bias in the parameter estimates will transfer to the specification tests, and their ability to correct for it will be a determinant of their outcome.

5. The Monte Carlo Study

5.1 The Design of the Monte Carlo

The model we use has to account for weak instruments and a certain endogeneity, as in the Hahn and Hausman study (2002:b), and in addition instruments have to be invalid. If the number of instruments is too large, say 180 as in the Angrist and Krueger (1991) study, and an influential part of the instruments is weak, each additional weak instrument contributes to bias in the TSLS estimator regardless of instrument validity. Among the many IV applications, there is no general conclusion on the number of instruments to include, while it is well known that too many weak instruments give bias, at least in theoretical derivations. One is, however, reluctant to omit relevant but weak instruments. For instance, the Blackburn and Neumark (1992) study of wages, based on the Griliches study (1976), uses four instruments, all rather strong, and Blomquist and Dahlberg (1999) use seven instruments for their labour supply model, with a rather high $R^2$ (>0.2), so the variation in the number of instruments is large. The Hahn and Hausman study (2002:b) is based on between 5 and 30 instruments, all weak. The interaction between the number of included instruments and the weakness of the instruments is thus likely to have great impact on the performance of the specification tests, so this is the fundamental design of this study: we focus on few instruments, of which either all are weak or all but one are weak and the remaining one is strong. We will thus assume the following situations for the instruments in our size and power tests, with respect to either weak or strong endogeneity in the model:

1) All K instruments are exogenous (valid). This tests size. The K:th instrument is either

weak like the K-1 instruments or it is strong.

2) K-1 instruments are exogenous (valid), but 1 instrument, the K:th is endogenous,

invalid. This tests power.

In the first situation we specify a regular size test: since the instruments are exogenous, they are valid, and the tests are expected to reject only at the nominal rate. For this situation, we test both when all K instruments are weak and when K-1 are weak and the K:th instrument is strong. For the power study, which is the second situation, the following cases are tested for both weak and strong endogeneity:

2:1) K-1 valid but weak instruments, K:th instrument invalid and weak.

2:2) K-1 valid but weak instruments, K:th instrument invalid and strong.

It would have been desirable to also run all of the cases above with K-1 strong instruments instead of weak. This is, however, not possible due to restrictions of the Cholesky approach, briefly explained in note 16.
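The logic of the two situations can be sketched as a small Monte Carlo experiment in Python (NumPy only). The sketch estimates the rejection rate of the Sargan test when all instruments are valid (size) and when the last instrument is invalid (power). It is not the design of this study: the data generating process, the sample size, the number of replications and the strength of the instruments are all simplifying assumptions, and the instruments are strong rather than weak.

```python
import numpy as np

CRIT_CHI2_2DF_5PCT = 5.991  # 95th percentile of the chi-square distribution with 2 df

def sargan_stat(y, x, Z):
    """Sargan statistic for one endogenous regressor x and instruments Z."""
    n = len(y)
    x_hat = Z @ np.linalg.solve(Z.T @ Z, Z.T @ x)   # first-stage fitted values
    beta = (x_hat @ y) / (x_hat @ x)                # TSLS estimate
    e = y - beta * x                                # TSLS residuals
    e_hat = Z @ np.linalg.solve(Z.T @ Z, Z.T @ e)   # residuals projected on Z
    return (e @ e_hat) / (e @ e / n)

def rejection_rate(gamma, n=200, reps=500, seed=0):
    """Share of replications in which the 5% Sargan test rejects.

    gamma = 0 makes all three instruments valid (size);
    gamma != 0 lets the third instrument enter the structural error (power).
    """
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(reps):
        Z = rng.standard_normal((n, 3))
        u = rng.standard_normal(n)
        v = 0.5 * u + rng.standard_normal(n)        # endogeneity through u
        x = Z @ np.ones(3) + v
        y = 0.5 * x + gamma * Z[:, 2] + u
        rejections += int(sargan_stat(y, x, Z) > CRIT_CHI2_2DF_5PCT)
    return rejections / reps

size = rejection_rate(gamma=0.0)
power = rejection_rate(gamma=0.5)
print(size, power)
```

In this simplified setting the empirical size should lie near the nominal 5% and the power near one; weakening the first-stage coefficients is the natural way to move the sketch towards the cases studied here.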

5.2 The Data Generating Process

The data generating process (DGP) for the tests is set differently from that of Hahn and Hausman (2002:b): they use a very simple model, while our model is more complex, specified following Blomquist and Dahlberg (1999) and Blomquist et al. (2003). This favours comparability of the results on the power properties of the Sargan test, and the approach should yield results for size somewhat similar to those of Hahn and Hausman (2002:b), or may at least be used as a benchmark. We do not know, however, how the new tests behave in terms of power. The DGP is set as

iiii yxy εββα +++= 22111 (24)

The constant $\alpha$ is set to 10 and the parameters are set to $\beta_1$ = 3 and $\beta_2$ = 0.5, as in the Blomquist et al. (2003) study. The restriction on the variables for this study is that only two jointly endogenous variables are used, $y_1$ and $y_2$. Since this study is based on pseudo-random numbers and a generated DGP, the partialling-out process need not be done; we assume that this has been done,14 as we compute the respective forward and reverse estimators without any included predetermined variables.15 Thereafter, we need to specify the relationship between all variables in the equation. The instruments are defined so that the vector $z_{21}'$ contains the valid instruments and $z_{22}$ is the instrument that is either valid or invalid, depending on the cases above.

The endogeneity, weakness and validity restrictions are imposed by the covariance matrix,16 designed as

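\[
\begin{pmatrix} \varepsilon_{1i} \\ y_{2i} \\ z_{21i}' \\ z_{22i} \end{pmatrix}
\sim N\left( \mu,\;
\begin{pmatrix}
\sigma_{\varepsilon_1 \varepsilon_1} & \sigma_{\varepsilon_1 y_2} & 0 & \sigma_{\varepsilon_1 z_{22}} \\
\sigma_{\varepsilon_1 y_2} & \sigma_{y_2 y_2} & \sigma_{y_2 z_{21}} & \sigma_{y_2 z_{22}} \\
0 & \sigma_{y_2 z_{21}} & I_{K-1} & 0 \\
\sigma_{\varepsilon_1 z_{22}} & \sigma_{y_2 z_{22}} & 0 & \sigma_{z_{22} z_{22}}
\end{pmatrix}
\right) \tag{25}
\]

[Here $\mu$ denotes the mean vector and $\sigma_{y_2 y_2}$ the variance of $y_2$. The numerical entries 100 and 136 of the original display cannot be assigned to specific positions from the available text.]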

14 This is how the approach differs from the traditional TSLS method; the difference is that the levels of the variables will change after the partialling out. Matlab code for the Nagar form was obtained from Jerry Hausman, and they did not use partialling out in their Monte Carlo either.

15 The included predetermined regressors are necessary for the computation of the LIML and Fuller estimators; the computation of the LIML may be found in Hayashi (2000), p. 540. If LIML were performed without any included predetermined regressors, the characteristic value would be one (a result obtained in simulations), so it would coincide with the TSLS estimator.

16 The idea of this decomposition comes from a study by Blomquist et al. (2003), which extends the regular correlation procedure for multivariate variables; see for instance Eklof (2001) for Gauss tips on the basic case. The Cholesky decomposition required some restrictions on the weakness of instruments and their variances, but these restrictions are applied throughout all simulations to give consistency in results. I use only some of their cases for the Sargan test, and while they make a comparison with the Hausman and Newey test from Newey (1985), I apply their DGP to the new HH tests.

Interpreting the covariance matrix above gives us the following relationships between the variables:

$\sigma_{\varepsilon_1 y_2} \neq 0$ imposes a covariance between the structural error and $y_2$, and hence between the two endogenous variables. The degree of endogeneity is set here, either weak or strong.

$\sigma_{y_2 z_{21}} \neq 0$ implies that the K-1 instruments are correlated with the endogenous regressor. The degree of weakness of the instruments is set here.

$\sigma_{\varepsilon_1 z_{21}} = 0$ implies that the K-1 instruments are uncorrelated with the LHS variable through the error component.

$\sigma_{z_{21} z_{22}} = 0$ implies that the K-1 instruments and the K:th instrument are independent. In addition, all K-1 instruments are independent of each other.

$\sigma_{z_{22} y_2} \neq 0$ implies that the K:th instrument is correlated with the endogenous regressor. The degree of weakness is set here for the K:th instrument.

$\sigma_{z_{22} \varepsilon_1}$: if this is zero, the K:th instrument is valid like the other instruments. If it is not zero, the instrument is invalid and we test for power.

These covariances are set implicitly, by defining the correlations between the variables. Endogeneity is either weak or strong, $\rho_{\varepsilon_1 y_2}$ = 0.1 or 0.6, respectively. The K-1 instruments are valid, so $\rho_{\varepsilon_1 z_{21}}$ = 0, and they are weak, $\rho_{y_2 z_{21}}$ = 0.1 (strong instruments were not possible to apply, as explained earlier). The K:th instrument is invalid if $\rho_{z_{22} \varepsilon_1} \neq 0$; otherwise all instruments are valid, and we test for size. The K:th instrument is weak or strong if $\rho_{z_{22} y_2}$ = 0.1 or 0.6, respectively. The number of instruments is set to K = 5 and K = 10, in view of the problem mentioned above of bias from too many weak instruments, since at least K-1 instruments are valid and weak.17 The number of individuals, N, is set to 100, 250 and 1000. N is also set to 2000 in the size study; this is not assumed to be necessary for power.

Measuring power implies that the assumptions for the null hypothesis of the Sargan and HH tests are violated, so the tests are expected to reject; this is why any form of bias is expected to increase the magnitude of the numerators in equations (15) and (20).
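
The correlation design described above can be sketched in code. The following Python sketch is an illustration only and is not the Gauss code used in the study. It assumes zero means and unit variances for all variables (so that covariances equal correlations), an independently drawn standard normal regressor x, and the hypothetical function names `build_corr` and `simulate_dgp`.

```python
import numpy as np

def build_corr(K, rho_ey, rho_yz21, rho_yz22, rho_ez22):
    """Correlation matrix of (eps1, y2, K-1 instruments z21, K:th instrument z22).

    Assumption: unit variances, so correlations and covariances coincide.
    """
    R = np.eye(K + 2)
    R[0, 1] = R[1, 0] = rho_ey                  # degree of endogeneity
    R[1, 2:K + 1] = R[2:K + 1, 1] = rho_yz21    # strength of the K-1 instruments
    R[1, K + 1] = R[K + 1, 1] = rho_yz22        # strength of the K:th instrument
    R[0, K + 1] = R[K + 1, 0] = rho_ez22        # validity of the K:th instrument
    return R

def simulate_dgp(N, R, alpha=10.0, beta1=3.0, beta2=0.5, seed=0):
    """Draw N observations; correlated normals via the Cholesky factor of R."""
    rng = np.random.default_rng(seed)
    L = np.linalg.cholesky(R)       # raises LinAlgError if R is not positive definite
    draws = rng.standard_normal((N, R.shape[0])) @ L.T
    eps1, y2, Z = draws[:, 0], draws[:, 1], draws[:, 2:]
    x = rng.standard_normal(N)      # assumption: independent exogenous regressor
    y1 = alpha + beta1 * x + beta2 * y2 + eps1
    return y1, y2, x, Z

# Weak endogeneity, K-1 weak valid instruments, K:th instrument invalid and strong
R = build_corr(K=5, rho_ey=0.1, rho_yz21=0.1, rho_yz22=0.6, rho_ez22=0.6)
y1, y2, x, Z = simulate_dgp(N=1000, R=R)
print(Z.shape)
```

In this sketch, setting `rho_yz21 = 0.6` with K = 5 makes the matrix fail to be positive definite, so `np.linalg.cholesky` raises an error. This may be the kind of restriction on strong instruments that note 16 refers to, although the note does not spell it out.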

17 The theoretical properties of the model in the situation of invalid instruments are derived by Hahn and Hausman (2003).

The simulations are made in Gauss, and results are based on 1000 replications at the 5 % significance level.18 Bias is computed as $\hat{\beta}_i / \beta_2$, where $\beta_2$ is set to 0.5 in the DGP and $\hat{\beta}_i$ is either the forward or the inverted reverse estimator for m1, or either the forward or the inverted reverse estimator for m2. This shows how much the estimated parameters differ from the true value set in the DGP. The estimated values should not be interpreted only as bias: although they measure bias, they also constitute a “distance”. The difference between the forward and reverse estimators should thus be in focus, since this is the measure of rejection for m1 and m2; if the distance between the two estimates is large, the statistics will reject after correcting for the variance.
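
How a simulated rejection frequency and a bias ratio arise from such a design can be illustrated for the Sargan test, which is computed as N times the R² from regressing the TSLS residuals on the instruments. The Python sketch below is a simplified illustration under stated assumptions, not the study's Gauss code: it uses zero means and unit variances, omits the constant and the regressor x, covers only the forward TSLS estimator (not the reverse estimators or the m1 and m2 statistics), and uses hypothetical function names.

```python
import numpy as np

# 95 % chi-square critical values for df = K - 1 (K = 5 and K = 10)
CHI2_95 = {4: 9.4877, 9: 16.9190}

def corr_matrix(K, rho_ey, rho_yz21, rho_yz22, rho_ez22):
    """Correlation matrix of (eps1, y2, K-1 instruments, K:th instrument)."""
    R = np.eye(K + 2)
    R[0, 1] = R[1, 0] = rho_ey
    R[1, 2:K + 1] = R[2:K + 1, 1] = rho_yz21
    R[1, K + 1] = R[K + 1, 1] = rho_yz22
    R[0, K + 1] = R[K + 1, 0] = rho_ez22
    return R

def tsls_and_sargan(y1, y2, Z):
    """Forward TSLS slope of y1 on y2 and the Sargan statistic N * R^2."""
    proj = lambda v: Z @ np.linalg.lstsq(Z, v, rcond=None)[0]   # P_Z v
    y2_hat = proj(y2)
    beta_hat = (y2_hat @ y1) / (y2_hat @ y2)
    u = y1 - beta_hat * y2                                      # TSLS residuals
    sargan = len(u) * (proj(u) @ u) / (u @ u)
    return beta_hat, sargan

def rejection_frequency(N, K, R, beta2=0.5, reps=1000, seed=0):
    """Share of replications in which Sargan rejects at 5 %, and mean beta_hat / beta2."""
    rng = np.random.default_rng(seed)
    L = np.linalg.cholesky(R)
    rejections, ratios = 0, []
    for _ in range(reps):
        d = rng.standard_normal((N, K + 2)) @ L.T
        eps1, y2, Z = d[:, 0], d[:, 1], d[:, 2:]
        y1 = beta2 * y2 + eps1
        beta_hat, s = tsls_and_sargan(y1, y2, Z)
        rejections += int(s > CHI2_95[K - 1])
        ratios.append(beta_hat / beta2)
    return rejections / reps, float(np.mean(ratios))

# Size: all instruments valid and weak. Power: K:th instrument invalid and weak.
size, ratio = rejection_frequency(250, 5, corr_matrix(5, 0.1, 0.1, 0.1, 0.0), reps=500)
power, _ = rejection_frequency(250, 5, corr_matrix(5, 0.1, 0.1, 0.1, 0.6), reps=500)
print(size, power, ratio)
```

Because the variances and means differ from those of the study's DGP, the numbers produced by this sketch are not expected to reproduce the tables in Appendix A1.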

6. Results

In Table 1, size properties of the tests are reported. Case 1:1, which implies weak endogeneity and only weak instruments, shows in column (a) that the Sargan test has desirable size for small N but becomes undersized for N=1000, at 2.9 %. With LIML residuals in the m1 (b), the test is oversized, while with Fuller residuals (c) it tends to nominal size, even though it exceeds it for small N. The Nagar forms, (d) and (e), show severe size distortions, and size increases with N, which is an unexpected pattern. These results differ substantially from the Hahn and Hausman (2002:b) study, in which sizes for the Nagar form tended to nominal size. The main objection against their results would be that they have fairly low bias; they use a very simple model and have a bias far from the ones in Table 2, Case 1:1. Bias for m1, seen in columns (a) and (b), is reasonably low; the difference between the forward and reverse estimators is not too large. This is not the case for the Nagar form, so the rejection occurs due to the large difference between the estimators. The fifth column (e) reports this feature, which was observed during the simulations: the absolute value of the bias for the reverse Nagar seemed to increase rapidly, which in turn increased with the N-correction. Thus the conclusion is that the bias correction may not be as efficient in this case, unlike the procedure in the bias-subtracted form, m1.

18 The Gauss code for the DGP was obtained from Lars Lindvall, Department of Economics, Uppsala University. Any modifications and errors are my own.

In case 1:2, with strong endogeneity and all weak instruments, the size properties of the HH tests are severely altered for small N. While Sargan still tends to nominal size, the LIML form tends to around 30 % size, the Fuller form in column (c) needs high N to tend to approximately 15 %, and the Nagar form is still greatly oversized. However, it is likely that the DGP has an important influence on the statistic. Hausman et al. (2005) have sizes up to 25 % for the Nagar form when they study the interactions of $\rho_{\varepsilon_1 y_2}$, $R^2$ and K, so we may again suspect that the Nagar estimator fails to correct for substantial bias and is very sensitive to the underlying model. The degree of endogeneity seems to have a heavy negative influence on size; the differences between cases 1:1 and 1:2 are sometimes large. When one instrument is strong and endogeneity is weak, as in case 1:3, the Sargan and m1 tests behave as earlier and the Nagar form is again oversized. Bias is small for the m1, while the bias difference is again great for the Nagar form, as shown in Table 2. Size of the m1 in cases 1:3 and 1:4 for N=2000 and 5 instruments showed unexpected values; size increased in these cases, but this was not the case for K=10, or in any other case, so we hope this is due to the randomness of random numbers.19
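
As a rough yardstick for how much variation simulation noise alone can produce, the Monte Carlo standard error of a rejection frequency can be worked out. The calculation below assumes independent replications, so that the number of rejections in R = 1000 replications is binomial; the symbols p (true rejection probability) and R (number of replications) are introduced here for illustration only.

\[
\operatorname{se}(\hat{p}) = \sqrt{\frac{p(1-p)}{R}}, \qquad
p = 0.05,\; R = 1000: \quad \sqrt{\frac{0.05 \times 0.95}{1000}} \approx 0.0069 .
\]

Under these assumptions, a test with exact 5 % size would give simulated sizes within about $0.05 \pm 1.96 \times 0.0069$, that is roughly between 0.036 and 0.064, in about 95 % of experiments.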

Table 3 depicts the power of the tests; cases 2:1 and 2:2 cover weak endogeneity. Case 2:1 shows that when one instrument is invalid and weak, which should intuitively be hard to detect since it is only weakly correlated with the endogenous regressor, Sargan tends to one (a), while the m1 with LIML moves slowly towards about 85 % as N increases. The m1 with Fuller is more sensitive to the number of instruments, K; only when K=10 does it exceed 60 %. The Nagar form, m2, has desirable power; however, we know that it has a tendency to be oversized, which means that it is more likely to reject than not in all cases. For small N, 100 and 250, the power of m1 for both LIML and Fuller seems reasonably high; they outperform the Sargan test, and (c) is even better, marginally though, than Sargan since they are closer to 0.95. Also, the bias for the reverse Nagar, see Table 4, columns (e) and (f), is quite severe. Case 2:2 shows the power when the invalid instrument is strong. In this case, bias for the Nagar form is substantially lower, and again overall bias tends to decrease.

When endogeneity is strong, as in case 2:3, and one instrument is invalid but weak, Sargan tends rapidly to one, while m1 with LIML (b) only reaches 82.5 % power for N=1000, and the Fuller form (c) shows a negative pattern: as N increases, power decreases, something it does not have in common with the Nagar form, which has desirable power. Also, Fuller seems very sensitive to K, in this case and in general. The reason for this is most likely its computation.20 Bias for Nagar, Table 4, case 2:3, is still very high, and the difference between the forward and reverse Nagar biases is in many cases very large, much larger than for the respective m1 estimators.

19 Restrictions on the capacity of software made it impossible to test for N=5000.

20 The Fuller estimator uses a modification of the smallest eigenvalue in the LIML, a modification that depends on the K instruments as well; see Fuller (1977), p. 951.

In case 2:4, endogeneity is strong, as is the invalid instrument. First of all, bias is relatively low compared to case 2:3. For all Nagar cases the bias of the reverse estimator is always positive, so columns (e) and (f) are identical in Table 4. The general pattern is that bias decreases as K increases in this case. In Table 3, case 2:4, the power of Sargan, m1 and m2 tends to one as N increases in all cases. For small N, power of the Fuller form (c) is above 90 %, far above case 2:3, but one should keep in mind its size distortions for small N with strong endogeneity. Overall, the strength of the invalid instrument seems to have greater influence on the performance of the tests and the magnitude of the bias than does the degree of endogeneity.

7. Summary

The power properties of the new specification tests were found to be fairly desirable; in almost all cases power was high, and it tended to one for large N. It was found that Sargan sometimes outperforms the m1, while the m2, the Nagar form, is always powerful. The m2 is, however, size-distorted, so interpretations of its power properties should be tempered. The m1 and Sargan behave like opposites in power for smaller sample sizes: if one is powerful, the other lacks power. For large samples they all tend to 1, so in terms of power, the m1 and the m2 are not necessarily to be preferred to the Sargan test. In general, sizes in this study were somewhat larger than the ones in the Hahn and Hausman (2002:b) study. For large N, the m1 form could be a possible counterpart to the Sargan test, but not necessarily. The main difference from their study was that they found heavily distorted sizes for Sargan (or, as it is called there, the $n \cdot R^2$). In this study, much smoother size movements and reliable size are found for the Sargan test, so a valid interpretation is that the DGP may have great influence on these results, as may be the case with the size properties of the Nagar form.

Concluding remarks

We found a substantial bias in all size tests for the Nagar form, and the bias of the m2 consistently exceeds that of the m1. Hahn et al. (2002) strongly recommend that no-moments estimators such as Nagar and LIML not be used when weak instruments are present, a conclusion valid for our study, and they propose one of its equivalents, the JNTSLS, as better with weak instruments. However, as was stated by earlier studies, all non-regular and no-moments estimators are shown to have somewhat unsatisfactory properties in Monte Carlo; see for instance the forthcoming study of Blomquist and Dahlberg (2006). The problem as it appears here may be that with a considerable number of weak instruments, the Nagar form tends to non-identification, even if not mathematically in this case (a derivation of non-identification may be found in Hahn and Hausman, 2002:a), so the estimator “blows up” due to no moments, while the TSLS “only” becomes inconsistent and biased. The results for the Nagar form are thus not in agreement with the Hahn and Hausman (2002:b) study. The sometimes abysmally large difference between the forward and reverse estimators for the Nagar form seems to dominate the rejection frequencies. Otherwise, the m1 results are not far from the results of Hahn and Hausman (2002:b), while their results for the Sargan test are sometimes similar to our results for the m2. In addition to the problems above, computation of Sargan is much easier and involves no no-moments estimator when applied to TSLS residuals, as we did here, unlike the complex computation of m1 and m2 and their respective variance terms. To summarize, the overall conclusion is that the Sargan test is easier to apply and more efficient; the moment method applied through TSLS seems to work well for the Sargan test. Of course, for really small samples, none of the tests showed perfection.

REFERENCES

Angrist, Joshua D. and Alan B. Krueger (1991), “Does compulsory school attendance affect schooling and earnings?”, Quarterly Journal of Economics 106, 979-1014.

Baum, Christopher F., Mark E. Schaffer and Steven Stillman (2003), “Instrumental variables and GMM: Estimation and testing”, Stata Journal 3, No. 1, 1-31.

Blackburn, McKinley and David Neumark (1992), “Unobserved ability, efficiency wages and interindustry wage differentials”, Quarterly Journal of Economics 107, 1421-1436.

Blomquist, Soren and Matz Dahlberg (1999), “Small sample properties of LIML and jackknife IV estimators: Experiments with weak instruments”, Journal of Applied Econometrics 14, 69-88.

Blomquist, Soren and Matz Dahlberg (forthcoming, 2006), “The case against JIVE: A comment”, Journal of Applied Econometrics.

Blomquist, Soren, Matz Dahlberg and Lars Lindvall (2003), “Testing for instrument validity when the instruments are weak”, unpublished.

Bowsher, Clive G. (2002), “On testing overidentifying restrictions in dynamic panel data models”, Economics Letters 77, No. 2, 211-220.

Dahlberg, Matz, Eva Mork and Per Tovmo (forthcoming, 2006), “On the performance of the Sargan test in the presence of measurement errors in dynamic panels”, Applied Economics Letters.

Eklof, Matias (2001), “A note for beginners on Gauss 3.2 for Windows”, available at http://www.anst.uu.se/matieklo/index.htm.

Fuller, Wayne A. (1977), “Some properties of a modification of the limited information estimator”, Econometrica 45, No. 4, 939-954.

Griliches, Zvi (1976), “Wages of very young men”, Journal of Political Economy 84, No. 2, 69-86.

Hayashi, Fumio (2000), Econometrics, Princeton University Press.

Hahn, Jinyong, Jerry Hausman and Guido Kuersteiner (2004), “Estimation with weak instruments: Accuracy of higher order bias and MSE approximations”, Econometrics Journal 7, No. 1, 272-306.

Hahn, Jinyong and Jerry Hausman (2002:a), “Notes on bias for estimators in simultaneous equation models”, Economics Letters 75, No. 2, 237-241.

Hahn, Jinyong and Jerry Hausman (2002:b), “A new specification test for the validity of instrumental variables”, Econometrica 70, No. 1, 163-189.

Hahn, Jinyong and Jerry Hausman (2003), “IV estimation with valid and invalid instruments”, Technical report, Department of Economics, Massachusetts Institute of Technology, Cambridge, Massachusetts.

Hausman, Jerry, James H. Stock and Motohiro Yogo (2005), “Asymptotic properties of the Hahn-Hausman test for weak instruments”, Economics Letters 89, No. 3, 333-342.

Nagar, A. L. (1959), “The bias and moment matrix of the general k-class estimators of the parameters in simultaneous equations”, Econometrica 27, No. 4, 575-595.

Staiger, Douglas and James H. Stock (1997), “Instrumental variables regression with weak instruments”, Econometrica 65, No. 3, 557-586.

Appendix A1: Tables

Table 1 Size properties of the tests

Table columns report the following: (a) Size of the Sargan test (5%) (b) Size of the m1 test with LIML residuals (5%) (c) Size of the m1 test with Fuller residuals (5%) (d) Size of the Nagar form with LIML residuals (5%) (e) Size of the Nagar form with Fuller residuals (5%)

Case 1:1. Weak endogeneity, $\rho_{\varepsilon_1 y_2}$ = 0.1. All K-1 and K:th instruments valid, $\rho_{\varepsilon_1 z_{21}} = \rho_{z_{22} \varepsilon_1}$ = 0, and weak, $\rho_{y_2 z_{21}} = \rho_{z_{22} y_2}$ = 0.1

N      K    (a)     (b)     (c)     (d)     (e)
100    5    0.046   0.218   0.107   0.445   0.357
250    5    0.036   0.139   0.064   0.569   0.470
1000   5    0.029   0.131   0.065   0.770   0.622
2000   5    0.054   0.116   0.071   0.811   0.685
100    10   0.040   0.241   0.198   0.479   0.429
250    10   0.050   0.128   0.094   0.630   0.553
1000   10   0.054   0.098   0.074   0.786   0.699
2000   10   0.047   0.091   0.067   0.830   0.721

Case 1:2. Strong endogeneity, $\rho_{\varepsilon_1 y_2}$ = 0.6. All K-1 and K:th instruments valid, $\rho_{\varepsilon_1 z_{21}} = \rho_{z_{22} \varepsilon_1}$ = 0, and weak, $\rho_{y_2 z_{21}} = \rho_{z_{22} y_2}$ = 0.1

N      K    (a)     (b)     (c)     (d)     (e)
100    5    0.082   0.540   0.328   0.601   0.379
250    5    0.067   0.366   0.184   0.763   0.488
1000   5    0.040   0.295   0.133   0.884   0.632
2000   5    0.064   0.250   0.107   0.911   0.680
100    10   0.089   0.707   0.616   0.677   0.536
250    10   0.089   0.449   0.312   0.791   0.605
1000   10   0.066   0.276   0.149   0.888   0.703
2000   10   0.052   0.259   0.133   0.906   0.740

Case 1:3. Weak endogeneity, $\rho_{\varepsilon_1 y_2}$ = 0.1. All K-1 instruments valid, $\rho_{\varepsilon_1 z_{21}}$ = 0, and weak, $\rho_{y_2 z_{21}}$ = 0. K:th instrument valid, $\rho_{z_{22} \varepsilon_1}$ = 0, but strong, $\rho_{z_{22} y_2}$ = 0.6

N      K    (a)     (b)     (c)     (d)     (e)
100    5    0.061   0.208   0.124   0.715   0.607
250    5    0.044   0.180   0.110   0.808   0.683
1000   5    0.036   0.129   0.071   0.835   0.704
2000   5    0.047   0.472   0.314   0.842   0.696
100    10   0.051   0.302   0.263   0.713   0.668
250    10   0.056   0.239   0.188   0.783   0.704
1000   10   0.055   0.136   0.106   0.831   0.731
2000   10   0.045   0.095   0.064   0.834   0.720

Case 1:4. Strong endogeneity, $\rho_{\varepsilon_1 y_2}$ = 0.6. All K-1 instruments valid, $\rho_{\varepsilon_1 z_{21}}$ = 0, and weak, $\rho_{y_2 z_{21}}$ = 0. K:th instrument valid, $\rho_{z_{22} \varepsilon_1}$ = 0, but strong, $\rho_{z_{22} y_2}$ = 0.6

N      K    (a)     (b)     (c)     (d)     (e)
100    5    0.079   0.402   0.226   0.798   0.599
250    5    0.055   0.332   0.151   0.868   0.676
1000   5    0.039   0.213   0.079   0.891   0.701
2000   5    0.044   0.577   0.318   0.892   0.696
100    10   0.082   0.582   0.500   0.809   0.707
250    10   0.071   0.446   0.317   0.840   0.720
1000   10   0.059   0.262   0.146   0.878   0.736
2000   10   0.049   0.183   0.091   0.882   0.737

Table 2 Bias for Size

Table columns report the following: (a) Mean bias of the forward TSLS estimator (b) Mean bias of the reverse TSLS estimator (c) Mean of the bias difference $\hat{B}$ (d) Mean bias of the forward Nagar (e) Mean bias of the reverse Nagar (f) Mean absolute bias of the reverse Nagar

Case 1:1. Weak endogeneity, $\rho_{\varepsilon_1 y_2}$ = 0.1. All K-1 and K:th instruments valid, $\rho_{\varepsilon_1 z_{21}} = \rho_{z_{22} \varepsilon_1}$ = 0, and weak, $\rho_{y_2 z_{21}} = \rho_{z_{22} y_2}$ = 0.1

N      K    (a)     (b)     (c)      (d)     (e)      (f)
100    5    1.582   0.066   -0.281   3.316   97.00    142.6
250    5    1.249   0.096   -0.078   1.019   4.912    38.72
1000   5    1.173   0.269   -0.015   1.118   0.202    19.93
2000   5    1.011   0.403   -0.007   0.980   12.97    25.98
100    10   1.914   0.074   -0.968   2.061   -20.55   73.37
250    10   1.396   0.104   -0.276   1.130   31.59    79.43
1000   10   1.065   0.250   -0.055   0.982   5.205    18.56
2000   10   1.065   0.226   -0.026   1.022   3.836    14.86

Case 1:2. Strong endogeneity, $\rho_{\varepsilon_1 y_2}$ = 0.6. All K-1 and K:th instruments valid, $\rho_{\varepsilon_1 z_{21}} = \rho_{z_{22} \varepsilon_1}$ = 0, and weak, $\rho_{y_2 z_{21}} = \rho_{z_{22} y_2}$ = 0.1

N      K    (a)     (b)     (c)      (d)     (e)      (f)
100    5    4.431   0.182   -0.237   1.349   9.659    79.08
250    5    2.595   0.211   -0.071   1.184   78.76    139.8
1000   5    1.551   0.372   -0.015   1.205   -4.550   28.33
2000   5    1.214   0.494   -0.007   1.036   -16.94   32.59
100    10   4.801   0.189   -0.823   9.585   -13.49   57.22
250    10   2.919   0.217   -0.251   1.133   -6.123   50.60
1000   10   1.517   0.352   -0.053   1.014   -2.715   16.03
2000   10   1.307   0.532   -0.026   1.053   3.532    11.70

Case 1:3. Weak endogeneity, $\rho_{\varepsilon_1 y_2}$ = 0.1. All K-1 instruments valid, $\rho_{\varepsilon_1 z_{21}}$ = 0, and weak, $\rho_{y_2 z_{21}}$ = 0. K:th instrument valid, $\rho_{z_{22} \varepsilon_1}$ = 0, and strong, $\rho_{z_{22} y_2}$ = 0.6

N      K    (a)     (b)     (c)      (d)     (e)      (f)
100    5    1.044   0.175   -1.017   0.950   1.857    25.88
250    5    1.052   0.392   -0.387   1.086   2.416    9.657
1000   5    1.035   1.213   -0.093   1.026   12.52    14.31
2000   5    0.996   1.842   -0.047   0.991   4.031    4.282
100    10   1.378   0.135   -2.342   1.204   5.731    37.85
250    10   1.087   0.253   -0.878   1.008   2.139    16.17
1000   10   1.019   0.796   -0.211   0.998   2.073    3.605
2000   10   1.027   1.322   -0.105   1.017   1.805    2.109

Case 1:4. Strong endogeneity, $\rho_{\varepsilon_1 y_2}$ = 0.6. All K-1 instruments valid, $\rho_{\varepsilon_1 z_{21}}$ = 0, and weak, $\rho_{y_2 z_{21}}$ = 0. K:th instrument valid, $\rho_{z_{22} \varepsilon_1}$ = 0, and strong, $\rho_{z_{22} y_2}$ = 0.6

N      K    (a)     (b)     (c)      (d)     (e)      (f)
100    5    1.646   0.276   -0.976   1.118   0.047    27.75
250    5    1.262   0.476   -0.380   1.054   4.704    13.89
1000   5    1.091   1.242   -0.092   1.039   0.951    4.366
2000   5    1.027   1.838   -0.046   1.000   1.114    2.311
100    10   2.472   0.245   -2.200   1.357   9.332    31.18
250    10   1.521   0.349   -0.855   1.036   -0.501   22.28
1000   10   1.123   0.850   -0.209   0.999   8.129    10.96
2000   10   1.083   1.349   -0.104   1.021   1.636    1.992

Table 3 Power properties of the tests

Table columns report the following: (a) Power of the Sargan test (5%) (b) Power of the m1 test with LIML residuals (5%) (c) Power of the m1 test with Fuller residuals (5%) (d) Power of the Nagar form with LIML residuals (5%) (e) Power of the Nagar form with Fuller residuals (5%)

Case 2:1. Weak endogeneity, $\rho_{\varepsilon_1 y_2}$ = 0.1. All K-1 instruments valid, $\rho_{\varepsilon_1 z_{21}}$ = 0, and weak, $\rho_{y_2 z_{21}}$ = 0.1. K:th instrument invalid, $\rho_{z_{22} \varepsilon_1}$ = 0.6, and weak, $\rho_{z_{22} y_2}$ = 0.1

N      K    (a)     (b)     (c)     (d)     (e)
100    5    0.850   0.437   0.306   0.767   0.702
250    5    0.978   0.492   0.337   0.941   0.957
1000   5    1.000   0.838   0.405   0.995   0.963
100    10   0.955   0.374   0.494   0.809   0.859
250    10   0.998   0.451   0.488   0.984   0.995
1000   10   1.000   0.867   0.620   0.999   0.998

Case 2:2. Weak endogeneity, $\rho_{\varepsilon_1 y_2}$ = 0.1. All K-1 instruments valid, $\rho_{\varepsilon_1 z_{21}}$ = 0, and weak, $\rho_{y_2 z_{21}}$ = 0.1. K:th instrument invalid, $\rho_{z_{22} \varepsilon_1}$ = 0.6, but strong, $\rho_{z_{22} y_2}$ = 0.6

N      K    (a)     (b)     (c)     (d)     (e)
100    5    0.268   0.978   0.832   0.520   0.408
250    5    0.521   0.997   0.868   0.804   0.614
1000   5    0.995   1.000   0.910   1.000   0.914
100    10   0.342   0.952   0.978   0.524   0.581
250    10   0.697   0.989   0.987   0.844   0.804
1000   10   1.000   1.000   0.993   1.000   0.999

Case 2:3. Strong endogeneity, $\rho_{\varepsilon_1 y_2}$ = 0.6. All K-1 instruments valid, $\rho_{\varepsilon_1 z_{21}}$ = 0, and weak, $\rho_{y_2 z_{21}}$ = 0.1. K:th instrument invalid, $\rho_{z_{22} \varepsilon_1}$ = 0.6, and weak, $\rho_{z_{22} y_2}$ = 0.1

N      K    (a)     (b)     (c)     (d)     (e)
100    5    0.935   0.640   0.456   0.843   0.720
250    5    0.994   0.648   0.438   0.963   0.893
1000   5    1.000   0.825   0.415   0.997   0.957
100    10   0.996   0.640   0.765   0.889   0.932
250    10   1.000   0.635   0.664   0.985   0.997
1000   10   1.000   0.838   0.674   1.000   0.999

Case 2:4. Strong endogeneity, $\rho_{\varepsilon_1 y_2}$ = 0.6. All K-1 instruments valid, $\rho_{\varepsilon_1 z_{21}}$ = 0, and weak, $\rho_{y_2 z_{21}}$ = 0.1. K:th instrument invalid, $\rho_{z_{22} \varepsilon_1}$ = 0.6, but strong, $\rho_{z_{22} y_2}$ = 0.6

N      K    (a)     (b)     (c)     (d)     (e)
100    5    0.445   0.996   0.904   0.726   0.530
250    5    0.891   1.000   0.900   0.953   0.727
1000   5    1.000   1.000   0.919   1.000   0.937
100    10   0.604   0.986   0.989   0.790   0.735
250    10   0.971   1.000   0.992   0.987   0.957
1000   10   1.000   1.000   0.997   1.000   0.999

Table 4 Bias for Power

Table columns report the following: (a) Mean bias of the forward TSLS estimator (b) Mean bias of the reverse TSLS estimator (c) Mean of the bias difference $\hat{B}$ (d) Mean bias of the forward Nagar (e) Mean bias of the reverse Nagar (f) Mean absolute bias of the reverse Nagar

Case 2:1. Weak endogeneity, $\rho_{\varepsilon_1 y_2}$ = 0.1. All K-1 instruments valid, $\rho_{\varepsilon_1 z_{21}}$ = 0, and weak, $\rho_{y_2 z_{21}}$ = 0.1. K:th instrument invalid, $\rho_{z_{22} \varepsilon_1}$ = 0.6, and weak, $\rho_{z_{22} y_2}$ = 0.1

N      K    (a)     (b)     (c)      (d)      (e)      (f)
100    5    7.821   0.055   -0.794   -0.161   191.8    346.1
250    5    10.27   0.057   -0.297   12.61    129.8    257.6
1000   5    12.42   0.059   -0.071   12.95    87.66    98.37
100    10   5.070   0.059   -1.817   9.400    -76.26   391.0
250    10   6.110   0.060   -0.650   7.466    111.3    256.2
1000   10   7.805   0.061   -0.153   7.453    47.45    100.9

Case 2:2. Weak endogeneity, $\rho_{\varepsilon_1 y_2}$ = 0.1. All K-1 instruments valid, $\rho_{\varepsilon_1 z_{21}}$ = 0, and weak, $\rho_{y_2 z_{21}}$ = 0.1. K:th instrument invalid, $\rho_{z_{22} \varepsilon_1}$ = 0.6, but strong, $\rho_{z_{22} y_2}$ = 0.6

N      K    (a)     (b)     (c)      (d)      (e)      (f)
100    5    10.66   0.276   -1.430   11.34    14.20    14.20
250    5    11.16   0.298   -0.563   11.44    13.31    14.20
1000   5    11.48   0.307   -0.138   11.55    12.98    12.98
100    10   9.037   0.248   -3.011   10.36    14.12    14.12
250    10   9.762   0.287   -1.171   10.32    13.12    13.12
1000   10   10.24   0.305   -0.288   10.39    12.90    12.90

Case 2:3. Strong endogeneity, $\rho_{\varepsilon_1 y_2}$ = 0.6. All K-1 instruments valid, $\rho_{\varepsilon_1 z_{21}}$ = 0, and weak, $\rho_{y_2 z_{21}}$ = 0.1. K:th instrument invalid, $\rho_{z_{22} \varepsilon_1}$ = 0.6, and weak, $\rho_{z_{22} y_2}$ = 0.1

N      K    (a)     (b)     (c)      (d)      (e)      (f)
100    5    10.93   0.074   -0.652   14.31    57.01    156.3
250    5    11.77   0.064   -0.247   12.81    80.47    142.4
1000   5    12.78   0.061   -0.059   13.02    74.15    74.15
100    10   7.940   0.093   -1.479   8.669    3.560    183.3
250    10   7.627   0.075   -0.545   7.340    81.66    233.0
1000   10   7.654   0.066   -0.130   7.612    71.38    71.38

Case 2:4. Strong endogeneity, $\rho_{\varepsilon_1 y_2}$ = 0.6. All K-1 instruments valid, $\rho_{\varepsilon_1 z_{21}}$ = 0, and weak, $\rho_{y_2 z_{21}}$ = 0.1. K:th instrument invalid, $\rho_{z_{22} \varepsilon_1}$ = 0.6, but strong, $\rho_{z_{22} y_2}$ = 0.6

N      K    (a)     (b)     (c)      (d)      (e)      (f)
100    5    11.27   0.291   -0.795   11.51    13.57    13.57
250    5    11.40   0.303   -0.304   11.50    13.13    13.13
1000   5    11.52   0.309   -0.073   11.54    12.93    12.93
100    10   10.05   0.276   -1.750   10.39    13.44    13.44
250    10   10.21   0.298   -0.655   10.36    12.97    12.97
1000   10   10.36   0.309   -0.158   10.38    12.85    12.85

Appendix A:2 Partialling out

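The exogenous variables in the first equation below can be partialled out as follows. First, set the basic model to

\[ Y_1 = \beta Y_2 + Z_1 \gamma + \varepsilon_1 \tag{A1} \]

\[ Y_2 = Z_1 \pi_1 + Z_2 \pi_2 + \varepsilon_2 \tag{A2} \]

Then, since $M_{z_1} = I - Z_1 (Z_1' Z_1)^{-1} Z_1' = I - P_{z_1}$ is the annihilator matrix of $Z_1$, we know that $M_{z_1} Z_1 = 0$. If we multiply through by $M_{z_1}$, it follows that

\[ M_{z_1} Y_1 = \beta M_{z_1} Y_2 + M_{z_1} Z_1 \gamma + M_{z_1} \varepsilon_1 = \beta M_{z_1} Y_2 + M_{z_1} \varepsilon_1 , \tag{A3} \]

\[ M_{z_1} Y_2 = M_{z_1} Z_1 \pi_1 + M_{z_1} Z_2 \pi_2 + M_{z_1} \varepsilon_2 = M_{z_1} Z_2 \pi_2 + M_{z_1} \varepsilon_2 . \tag{A4} \]

So we may redefine the remaining variables as $M_{z_1} Y_1 = y_1$, $M_{z_1} \varepsilon_1 = \nu_1$, $M_{z_1} Y_2 = y_2$, $M_{z_1} \varepsilon_2 = \nu_2$ and $M_{z_1} Z_2 = z$.

The remaining equations are then

\[ y_{1i} = \beta z_i' \pi_2 + v_{1i} \tag{A5} \]

\[ y_{2i} = z_i' \pi_2 + v_{2i} \tag{A6} \]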
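
The partialling-out steps in (A1) to (A4) can be checked numerically. The Python sketch below is an illustration under assumed values: the sample size, the coefficient vectors and the function name `annihilator` are hypothetical choices made here, not taken from the study.

```python
import numpy as np

def annihilator(Z1):
    """M = I - Z1 (Z1'Z1)^{-1} Z1', the residual-maker for the columns of Z1."""
    n = Z1.shape[0]
    return np.eye(n) - Z1 @ np.linalg.solve(Z1.T @ Z1, Z1.T)

rng = np.random.default_rng(0)
n, beta = 200, 0.5                                   # assumed sample size and slope
gamma = np.array([10.0, 3.0])                        # assumed coefficients on Z1
pi1, pi2 = np.array([1.0, 0.3]), np.array([0.2, 0.2, 0.2])

Z1 = np.column_stack([np.ones(n), rng.standard_normal(n)])   # included exogenous variables
Z2 = rng.standard_normal((n, 3))                             # excluded instruments
eps1, eps2 = rng.standard_normal(n), rng.standard_normal(n)

Y2 = Z1 @ pi1 + Z2 @ pi2 + eps2                      # equation (A2)
Y1 = beta * Y2 + Z1 @ gamma + eps1                   # equation (A1)

M = annihilator(Z1)
print(np.allclose(M @ Z1, 0.0))                                # M annihilates Z1
print(np.allclose(M @ Y1, beta * (M @ Y2) + M @ eps1))         # equation (A3)
print(np.allclose(M @ Y2, (M @ Z2) @ pi2 + M @ eps2))          # equation (A4)
```

Forming the full n-by-n matrix M is convenient for a check of this size; for large samples one would normally obtain the same residuals from a least-squares fit instead.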
