

Journal of Econometrics 30 (1985) 203-238. North-Holland

A METHOD FOR CALCULATING BOUNDS ON THE ASYMPTOTIC COVARIANCE MATRICES OF GENERALIZED

METHOD OF MOMENTS ESTIMATORS*

Lars Peter HANSEN

University of Chicago, Chicago, IL 60637, USA

For many time series estimation problems, there is an infinite-dimensional class of generalized method of moments estimators that are consistent and asymptotically normal. This paper suggests a procedure for calculating the greatest lower bound for the asymptotic covariance matrices of such estimators. The analysis focuses on estimation problems in which the data are generated by a stochastic process that is stationary and ergodic. The calculation of the bound uses martingale difference approximations as suggested by Gordin (1969) and a matrix version of Hilbert space methods.

1. Introduction

In many applications of instrumental variables estimation and more generally applications of generalized method of moments (GMM) estimation, there is a large set of candidates for instrumental variables and hence a large class of consistent, asymptotically normal GMM estimators. In fact the dimensionality of the set of instruments can be infinite for two reasons. First, in models in which the expectation of a disturbance vector is zero conditioned on a vector of random variables, virtually arbitrary functions of the random variables can be used as instrumental variables. Second, in many time series models when a particular variable is orthogonal to the disturbance vector, lagged values of that variable also are orthogonal to the disturbance vector. Hence, from the perspective of a large sample analysis, as the sample size grows the number of available instrumental variables grows as well. The purpose of this paper is to provide a general framework for calculating the greatest lower bound for the asymptotic covariance matrices of members of an infinite-dimensional class of GMM estimators.

We approach this problem from the perspective of an analyst with time series observations on a vector stochastic process that is stationary and ergodic. For such a process it is well-known that time series averages converge almost

*This research was supported in part by grants from the Sloan Foundation and the National Science Foundation. This research has benefited from conversations with Ricardo Barros, Bo Honore, and George Tauchen.

0304-4076/85/$3.30 © 1985, Elsevier Science Publishers B.V. (North-Holland)


surely to averages taken over hypothetical states of the world (i.e., unconditional expectations). Additional assumptions are required to obtain a central limit theory. Gordin (1969) has suggested an approach in which time series averages of a stationary process are approximated in a certain sense by time series averages of a martingale difference sequence. By adopting his approach, the existing central limit results for martingale difference sequences can be applied to stationary processes as well. In this paper we use Gordin's approach in studying the asymptotic distribution of GMM estimators.

In section 2 of this paper, we specify the underlying estimation environment and give some results that exploit the martingale difference approximation suggested by Gordin (1969). The central limit results for stationary processes can be stated conveniently in terms of the approximating martingale difference sequences. For this reason, we use the approximating martingale difference sequences to define a matrix counterpart to an inner product between vector stochastic processes that are jointly stationary and ergodic.

In section 3 we define a large (infinite-dimensional) class of GMM estimators. Alternative estimators are indexed by alternative specifications of the instrumental variables used in constructing the estimators. Each choice of instrumental variables defines a stochastic process of products of instrumental variables and disturbances. The asymptotic distribution of the resulting GMM estimators can be expressed in terms of the asymptotic distribution of the stochastic processes of cross-products of instrumental variables and disturbances. Hence, the inner product defined in section 2 induces an inner product on the class of GMM estimators that is useful in representing asymptotic covariance matrices of these estimators.

In section 4 we pose the problem of determining the smallest asymptotic covariance matrix for the class of GMM estimators defined in section 3. This problem turns out to be the matrix counterpart to a minimum norm problem in a Hilbert space. We show that the solution to this problem can be obtained by solving the related problem of finding an inner product representation for a particular matrix linear functional. For many applications, it is convenient to calculate the smallest asymptotic covariance matrix by solving this related problem.

In section 5 we illustrate how to use this analysis to characterize the bound for two examples. In the first example the stochastic process for the disturbance vector is a conditionally homoscedastic martingale difference sequence so that the approximation described in section 2 is trivial. The bound we calculate can be viewed as a time series version of the bounds calculated by Basmann (1957), Sargan (1958), Jorgenson and Laffont (1974), and Amemiya (1977). Hence, the analysis of this example provides a link to previous literature on covariance matrix bounds for instrumental variables estimators. In the second example the disturbance vector is a finite linear combination of current and past values of a conditionally homoscedastic martingale difference


sequence. We do not assume that variables used as instruments are strictly exogenous but only predetermined. The analysis of this second example illustrates the important role of the martingale approximation in calculating the bound. The two examples were chosen to illustrate the analysis in the previous sections. In Hansen (1985) bounds are calculated for several other examples with alternative restrictions imposed on the disturbance vector and the set of admissible instrumental variables.

There are two important limitations to the analysis in this paper. The first limitation is that we do not provide a decision-theoretic rationale for ranking estimators in terms of their asymptotic covariance matrices. In some circumstances particular GMM estimators are known to be locally minimax [e.g., see Chamberlain (1983)]; however, in our analysis we do not impose sufficient structure on the estimation problem and do not consider a sufficiently broad class of estimators to reach that conclusion. A second limitation is that we do not show how to construct estimators that attain the greatest lower bound. On the other hand, our method for calculating this lower bound often can be used to show how such a construction might take place. This second limitation will be addressed in a sequel to this paper.

2. Martingale difference approximation

In this section the probability structure underlying the generation of the time series data is specified, and the martingale difference approximation used for the central limit theory is defined. Following an approach suggested by Gordin (1969), a martingale difference approximation is used for obtaining central limit results. The approximating martingale difference sequences are then used to define a matrix counterpart to an inner product between two stochastic processes that are jointly stationary. This inner product will be used in subsequent sections to represent the asymptotic covariance matrices of GMM estimators and to calculate the greatest lower bound.

Let (Ω, A, Prob) denote the underlying probability space, and let S denote a measurable transformation mapping Ω onto itself that is measure-preserving and ergodic. The transformation S is used to specify how the time series data are generated. For instance, suppose h is a k-dimensional random vector. Then h and S together generate a time series via the relation

h_t(ω) = h[S^t(ω)]   for all ω in Ω,   (2.1)

where S^t is interpreted as the transformation S applied t times. We assume that S is one-to-one and S^{-1} is measurable so that S^t is well-defined for both positive and negative values of t. A vector stochastic process {h_t: -∞ < t < +∞} generated in this fashion is stationary and ergodic. Under this construction h_0 and h are identical. Hence, we can index a k-dimensional vector


stochastic process by a k-dimensional random vector corresponding to the time zero component of the process. Furthermore, any two index random vectors generate vector stochastic processes that are jointly stationary and ergodic. This indexation is used throughout our analysis.
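The shift construction in (2.1) can be sketched numerically. The finite window, the distribution of the draws, and the helper names below are our own illustration, not part of the paper: ω is represented by a two-sided path of draws, S shifts the path, and the index random vector h reads off the time-zero coordinate.

```python
import numpy as np

# A minimal sketch of the construction in (2.1); the finite window,
# distribution, and names are our own illustration.
rng = np.random.default_rng(0)
T = 7
omega = {t: rng.normal() for t in range(-T, T + 1)}  # finite window of a two-sided path

def S(path, t=1):
    """Apply the shift transformation t times: (S^t path)_j = path_{j+t}."""
    return {j: path[j + t] for j in path if j + t in path}

def h(path):
    """Index random vector: read off the time-zero coordinate of the path."""
    return path[0]

# h_t(omega) = h[S^t(omega)] recovers the t-th coordinate, as in (2.1).
assert h(S(omega, 3)) == omega[3]
assert h(S(omega, -2)) == omega[-2]
```

The point of the sketch is the indexation convention: the whole process {h_t} is generated from the single random vector h and the transformation S.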

Associated with this probability space is a sequential information structure.

Let B denote a sub-sigma algebra of A, and let

B_t = {b_t in A: b_t = {ω: S^t(ω) is in b} for some b in B}.   (2.2)

Under this specification, for any h that is measurable with respect to B, h_t is measurable with respect to B_t. We assume that the sequence of sigma algebras {B_t: -∞ < t < +∞} is non-decreasing. For many econometric examples, it is convenient to think of B as being the smallest sigma algebra with respect to which a set of stochastic forcing variables with non-positive time subscripts is measurable.

A central limit theorem proved by Gordin (1969) can be applied to a rich class of stochastic processes that are stationary and ergodic. Gordin's Theorem is used extensively in this section. In addition to the precise result stated in Gordin's Theorem, other results that are straightforward implications of his proof are required for our analysis. For this reason, we decompose the implications of Gordin's proof into three parts and make the appropriate extensions to each part.

To apply Gordin’s Theorem, it is necessary to restrict the k-dimensional random vector h used in generating a stochastic process. This restriction uses the set

G = {g: g is a k-dimensional random vector that is measurable with respect to B, has a finite second moment, and E(g|B_{-r}) = 0 for some non-negative r},   (2.3)

and the notation

P^T(h) = (1/√T) Σ_{t=1}^{T} h_t.   (2.4)

We then require that h satisfy:

Property 2.1. h is measurable with respect to B, h has a finite second moment, and

inf_{g in G} limsup_{T→∞} E[P^T(g - h)·P^T(g - h)] = 0.


Since {h_t: -∞ < t < +∞} is stationary and ergodic, {(1/√T)P^T(h): T ≥ 1} converges almost surely to E(h) as long as this expected value is finite.

When h satisfies Property 2.1, E(h) is zero and h can be approximated arbitrarily well by elements in G, although the approximation is not a conventional mean-square (L²) approximation.
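The scaling in (2.4) can be illustrated with a toy Monte Carlo of our own (an iid scalar process, which is trivially stationary and ergodic): (1/√T)P^T(h) is just the sample mean, which converges almost surely to E(h) = 0, while P^T(h) itself remains of order one.

```python
import numpy as np

# Toy check of the scaling in (2.4): h_t iid with E(h) = 0.
# (Our own illustration; the paper's h need not be iid.)
rng = np.random.default_rng(1)
T = 400_000
h = rng.exponential(scale=2.0, size=T) - 2.0   # stationary, ergodic, mean zero

PT = h.sum() / np.sqrt(T)            # P^T(h) = (1/sqrt(T)) sum_{t=1}^T h_t
sample_mean = PT / np.sqrt(T)        # (1/sqrt(T)) P^T(h) = (1/T) sum_t h_t

assert abs(sample_mean) < 0.02       # ergodic theorem: close to E(h) = 0
```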

There are two sets that are closely related to G. These sets are

H = {h: h is a k-dimensional random vector that satisfies Property 2.1},   (2.5)

and

F = {f in G: E(f|B_{-1}) = 0}.   (2.6)

The set H contains G which in turn contains F. All three sets, F, G, and H, satisfy the following linearity property:

Property 2.2. For any k-dimensional random vectors f and f* in F (G or H) and any k by k matrices of real numbers c and c*, cf + c*f* is in F (G or H).

Lemma 2.1. F, G, and H satisfy Property 2.2.

Proofs of all lemmas in this section are provided in Appendix A.

The first implication of the proof of Gordin’s Theorem is that elements in H and hence G can be approximated by elements in F. Our extended version of this first implication is given in the following lemma.

Lemma 2.2. There is a function M mapping H into F such that for any h in H, {P^T[h - M(h)]: T ≥ 1} converges in mean square (L²) to zero. Furthermore, M satisfies the property that for any two elements h and h* in H and any two k by k matrices of real numbers c and c*, M(ch + c*h*) = cM(h) + c*M(h*).

The first part of Lemma 2.2 is important because the stochastic process {M(h)_t: -∞ < t < +∞} is a stationary and ergodic martingale difference sequence. Billingsley (1961) proved a central limit theorem for martingale difference sequences that are stationary and ergodic. Hence, Billingsley's Theorem can be applied to {M(h)_t: -∞ < t < +∞}. Taken together, Lemma 2.2 and Billingsley's Theorem imply the second implication of Gordin's Theorem.

Lemma 2.3. For any h in H, {P^T[M(h)]: T ≥ 1} and hence {P^T(h): T ≥ 1} converge in distribution to a normally distributed random vector with mean zero and covariance matrix E[M(h)M(h)′].


The second part of Lemma 2.2 establishes the linearity of the function M. This linearity makes

(h|h*) = E[M(h)M(h*)′]   (2.7)

a valid matrix counterpart to an inner product on H. Since h and h* can be used in conjunction with S to generate stochastic processes that are stationary and ergodic, (2.7) can be used as an inner product of two such stochastic processes. Using this matrix inner product, the asymptotic covariance matrix of {P^T(h): T ≥ 1} is (h|h). The third implication of the proof of Gordin's Theorem is an alternative representation of the inner product given in (2.7).

Lemma 2.4. For any h and h* in H, (h|h*) = lim_{T→∞} E[P^T(h)P^T(h*)′].

Taken together, Lemmas 2.2-2.4 show that for any h in H, {P^T(h): T ≥ 1} is asymptotically normal with an asymptotic covariance matrix given by

(h|h) = lim_{T→∞} E[P^T(h)P^T(h)′].   (2.8)

This result is simply the vector counterpart to Gordin's Theorem. Since M is the identity transformation when restricted to F,

(h - M(h) | h - M(h)) = 0   for all h in H.   (2.9)

Consequently, when viewing H as an inner product space with inner product (2.7), or equivalently (2.8), one can just as well restrict attention to the subspace F. For many econometric estimation problems, one is led to examine subspaces of H when constructing econometric estimators. By applying the transformation M, the subspace of H can be mapped into a corresponding subspace of F. As we will see in the next two sections, the matrix inner product (·|·) defined using M is valuable in studying the large sample properties of infinite-dimensional classes of GMM estimators.
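A Monte Carlo sanity check of the inner product representation (2.8) for a concrete scalar case of our own choosing (not the paper's): for h_t = w_t + θw_{t-1} with w_t iid standard normal, the martingale difference approximation is M(h) = (1 + θ)w, so (h|h) = E[M(h)²] = (1 + θ)², and the variance of P^T(h) across simulated paths should approach that value.

```python
import numpy as np

# Monte Carlo check of (2.8) for the scalar MA(1) h_t = w_t + theta*w_{t-1},
# w_t iid N(0,1): the martingale approximation is M(h) = (1 + theta)*w,
# so (h|h) = E[M(h)^2] = (1 + theta)^2. (Our example, not the paper's.)
rng = np.random.default_rng(2)
theta, T, reps = 0.5, 2_000, 3_000
w = rng.normal(size=(reps, T + 1))
h = w[:, 1:] + theta * w[:, :-1]          # reps independent MA(1) paths
PT = h.sum(axis=1) / np.sqrt(T)           # P^T(h) for each path
lrv = PT.var()                            # estimates lim_T E[P^T(h)^2]

assert abs(lrv - (1 + theta) ** 2) < 0.2  # (h|h) = 1.5^2 = 2.25
```

Note that the naive per-period variance E(h²) = 1 + θ² = 1.25 is not the right object; it is the martingale approximation that delivers the asymptotic variance 2.25.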

3. Econometric model and GMM estimators

In this section we specify a general econometric model, and we consider a class of GMM estimators of the parameter vector of that model. Let y denote a random vector observed by the econometrician that enters directly into the econometric model, let β_0 denote a k-dimensional parameter vector, and let m denote a function mapping y and β_0 into a K-dimensional disturbance vector e. Formally,

e = m(y, β_0),   (3.1)


where m(·, β) is Borel measurable for all β in a neighborhood of β_0. The function m is known a priori while the parameter vector β_0 is to be estimated. In addition, m is continuously differentiable in its second argument so that the K by k random matrix

∂m(y, β_0)/∂β = d   (3.2)

is a well-defined matrix of random variables. The random vector y is assumed to be measurable with respect to B, which guarantees that the elements in d and e are also measurable with respect to B.

To identify and estimate β_0, there is a set Z of k by K matrices of random variables that are measurable with respect to B. In addition Z satisfies the following five properties:

Property 3.1. For any z in Z, ze is in H.

Property 3.2. For any z in Z, zd has a finite first moment and the random matrix function z[∂m(y, ·)/∂β] is first-moment continuous at β_0.¹

Property 3.3. There exists a z in Z for which E(zd) is a non-singular matrix.

Property 3.4. For any z and z* in Z and any k by k matrices of real numbers c and c*, cz + c*z* is in Z.

Property 3.5. For any sequence {z^j: j ≥ 1} in Z for which lim_{j→∞} (z^j e | z^j e) = 0, lim_{j→∞} E(z^j d) = 0.

Among other things, Property 3.1 guarantees that elements of Z are valid instruments to be used in estimating β_0 in the sense that

E(ze) = 0   for any z in Z.   (3.3)

Properties 3.2 and 3.3 are useful in verifying the asymptotic normality of GMM estimators that use elements of Z as instruments. Property 3.3 insures that β_0 is identified at least locally for one z in Z.² Property 3.4 can be

¹Consider a function Fnc with domain that is the Cartesian product of Ω and k-dimensional Euclidean space. For each β, Fnc(·, β) is assumed to be Borel measurable. Hansen (1982) defines Fnc to be jth-moment continuous at β_0 if

lim_{δ↓0} E[mod(·, δ)^j] = 0,

where

mod(ω, δ) = sup{|Fnc(ω, β) - Fnc(ω, β_0)|: |β - β_0| < δ}.

A matrix function is said to be jth-moment continuous if each row is jth-moment continuous.

²For β_0 to be locally identified using z as instruments we mean that there is a neighborhood of β_0 in which |E[zm(y, ·)]| has a unique minimum at β_0.


interpreted as the matrix counterpart to the requirement that Z be a linear space. A matrix inner product (·|·)* is induced on Z via

(z|z*)* = (ze|z*e).   (3.4)

Property 3.5 imposes a continuity requirement on the expected cross-product matrix functional defined by d. Among other things, this property implies that any two elements z and z* in Z for which

(z - z* | z - z*)* = 0   (3.5)

result in estimators with the same asymptotic covariance matrix. The disturbance vector e may generate a serially correlated stochastic process {e_t: -∞ < t < +∞}, and elements in Z are not restricted to be uncorrelated with e_t for all t. Therefore, this setup can accommodate models with serially correlated disturbances and instrumental variables that are predetermined but not strictly exogenous. Also, the set Z of matrices of instrumental variables can be infinite-dimensional. In section 5 we give illustrations of estimation problems for which Z is infinite-dimensional.

Alternative GMM estimators are indexed by alternative choices of z. For any given z we can think of estimating β_0 by minimizing

|P^T[zm(y, β)]|²   (3.6)

by choice of β in some admissible parameter space that contains an open neighborhood of β_0. However, our analysis is not limited to estimators {β_T: T ≥ 1} that minimize objective functions of the form (3.6). For instance, many estimators solve minimization (or maximization) problems with first-order conditions which imply that {P^T[zm(y, β_T)]: T ≥ 1} converges in probability to zero for some choice of z. Such estimators have an instrumental variables interpretation and have an asymptotic distribution that is identical with that of the GMM estimator we associate with z.³

In this paper we do not establish the consistency and asymptotic normality of the estimators under consideration. Instead, we refer the interested reader to

³Many econometric estimators solve maximization or minimization problems with first-order conditions of the form

P^T[z_T m(y, β_T)] = 0,   (*)

where {P^T[(z_T - z)e]: T ≥ 1} converges in probability to zero. In (*) z_T may depend on the estimator β_T or on any other estimated parameters as long as these estimators are consistent. Also, in time series problems in which elements of z depend on variables that are arbitrarily far back in the past, a z_T depending only on finite sample information can be found that approximates z in the sense that {P^T[(z_T - z)e]: T ≥ 1} converges in probability to zero.


Hansen (1982) for a formal study of these large sample properties.⁴ From results in that paper, we know that the asymptotic covariance matrix for an estimator that uses the matrix z for instruments is given by

cov(z) = [E(zd)]⁻¹ (z|z)* [E(zd)′]⁻¹.   (3.7)

If E(zd) is singular, the parameter vector β_0 ceases to be identified locally when z is used as the matrix of instruments, and we interpret (3.7) as being infinite.

4. Greatest lower bound calculation

A goal of this paper is to find the greatest lower bound for the asymptotic covariance matrices of GMM estimators for infinite-dimensional classes of estimators. In other words, the problem is to find a matrix inf(Z) for which

inf(Z) ≤ cov(z)   for all z in Z,   (4.1)

and for which there is a sequence {z^j: j ≥ 1} such that {cov(z^j): j ≥ 1} converges to inf(Z). As is true in the Gauss-Markov Theorem, the matrix inequality ≤ is a partial ordering interpreted to mean that the difference between the right-hand and left-hand matrices is positive semi-definite. The fact that the matrix inequality is only a partial ordering leaves open the possibility that the greatest lower bound is not well-defined. However, as is shown below, the restrictions placed on Z are sufficient for inf(Z) to exist. When E(zd) is singular implying that cov(z) is infinite, we interpret cov(z) as being greater than any finite positive semi-definite matrix.

Our strategy for solving this problem is to use minor modifications of standard Hilbert space methods. Assuming Z satisfies Property 3.1, it is convenient to transform the index set of the GMM estimators from Z to a vector linear subspace of F. Let

F⁰ = {f in F: f = M(ze) for some z in Z}.   (4.2)

For any elements z and z* in Z for which M(ze) and M(z*e) are equal,

(z - z* | z - z*)* = 0.   (4.3)

⁴The asymptotic distribution theory in Hansen (1982) imposes a more restrictive assumption than Property 3.1. However, given Lemma 2.3 it is straightforward to extend that analysis to situations in which Properties 3.1-3.5 are satisfied. The treatment of consistency given in Hansen requires additional assumptions; however, these assumptions are not always necessary. One set of assumptions used in Hansen is (i) the parameter space is compact; (ii) E[zm(y, ·)] exists and is finite at all points in the parameter space and has a unique zero at β_0; and (iii) zm(y, ·) is first-moment continuous at all points in the parameter space. In order to have the parameter space be independent of the choice of z, it may be necessary to include some additional orthogonality conditions that are common for all GMM estimators under consideration.


When Z satisfies Properties 3.2 and 3.5, E[(z - z*)d] is zero, or equivalently

E(zd) = E(z*d).   (4.4)

Therefore, we can define a matrix function D on F⁰ to be

D(f) = E(zd)   for any f = M(ze) in F⁰.   (4.5)

Using the function D and transforming the index set of GMM estimators from Z to F⁰, an alternative way to pose the general problem considered in this paper is to find the largest matrix inf(Z) for which

inf(Z) ≤ [D(f)]⁻¹ E(ff′) [D(f)′]⁻¹   for all f in F⁰.   (4.6)

In general F⁰ is not closed (in mean-square), which means that the greatest lower bound inf(Z) for the right-hand side of (4.6) may not be attained. To calculate inf(Z) it is convenient to add the closure points to F⁰. The set of all closure points is given by

F* = {f in F: lim_{j→∞} E[(f - f^j)·(f - f^j)] = 0 for some sequence {f^j: j ≥ 1} in F⁰}.   (4.7)

The linearity of M (Lemma 2.2) and Z (Property 3.4) induce linearity on F⁰ and F*, respectively.

Lemma 4.1. Suppose that Z satisfies Properties 3.1 and 3.4. Then F⁰ and F* satisfy Property 2.2.

Calculating inf(Z) is closely related to representing D using expected cross-products with an element of F*. A matrix counterpart to the Riesz-Fréchet Theorem can be applied to obtain such a representation.

Lemma 4.2. Suppose Z satisfies Properties 3.1-3.5. Then there exists a unique f^d in F* such that

E(f f^d′) = D(f)   for all f in F⁰.   (4.8)


Furthermore, E(f^d f^d′) is non-singular.⁵

It turns out that the f^d used to represent D can also be used to calculate inf(Z) in (4.1) and (4.6).

Lemma 4.3. Suppose Z satisfies Properties 3.1-3.5. Then inf(Z) = [E(f^d f^d′)]⁻¹. Furthermore, if f^d is in F⁰, then inf(Z) = [(z^d|z^d)*]⁻¹ for z^d such that M(z^d e) = f^d. Finally, cov(z) = inf(Z) if and only if M(ze) = M(cz^d e) for some k-dimensional non-singular matrix of real numbers c.

Lemmas 4.2 and 4.3 show that the problem of finding an f^d in F⁰ that attains inf(Z) is equivalent to the problem of finding an f^d in F⁰ that satisfies (4.8). Since F⁰ is not always closed, such an f^d may not exist, but it always can be approximated in the limit by a sequence {f^j: j ≥ 1} in F⁰. There is a corresponding sequence {z^j: j ≥ 1} in Z for which f^j = M(z^j e). Thus inf(Z) can always be approximated arbitrarily well by members of Z.

A problem that is equivalent to finding an f^d in F⁰ that satisfies (4.8) is to find a z^d in Z that satisfies

E(zd) = (z|z^d)* = lim_{T→∞} E[P^T(ze)P^T(z^d e)′]   for all z in Z.   (4.9)

Relation (4.9) can be viewed as a set of first-order conditions that are sufficient for cov(z^d) to attain the greatest lower bound inf(Z). It is not surprising that this bound is attained by other elements in Z obtained by premultiplying z^d by a k-dimensional non-singular matrix of real numbers c, since the equation system

P^T[z^d m(y, β)] = 0   (4.10)

is equivalent to

P^T[cz^d m(y, β)] = 0.   (4.11)
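The role of z^d can be illustrated numerically in a toy cross-sectional, conditionally homoscedastic setting of our own (not one of the paper's examples): with d = -x and a disturbance independent of the conditioning variable v, the sandwich variance is E(z²)E(e²)/E(zx)², and the familiar optimal-instrument result says z^d = E(x|v) beats any other function of v.

```python
import numpy as np

# Our own toy illustration of the bound: among instruments z = g(v), the
# choice z^d = E(x|v) minimizes cov(z) = E(z^2)E(e^2) / E(zx)^2 in this
# iid, homoscedastic design.  All names and coefficients are ours.
rng = np.random.default_rng(4)
n = 200_000
v = rng.normal(size=n)
x = v + 0.5 * (v ** 2 - 1) + rng.normal(size=n)  # E(x|v) = v + 0.5*(v^2 - 1)
e = rng.normal(size=n)                           # homoscedastic, independent of v

def cov_of(z):
    """Sample analogue of the sandwich (3.7) with d = -x, scalar parameter."""
    return ((z * e) ** 2).mean() / (z * x).mean() ** 2

cov_generic = cov_of(v)                          # instrument z = v
cov_optimal = cov_of(v + 0.5 * (v ** 2 - 1))     # instrument z^d = E(x|v)

# Theory for this design: cov(v) = 1 while cov(z^d) = 1/E[(z^d)^2] = 2/3.
assert cov_optimal < cov_generic
assert abs(cov_optimal - 2 / 3) < 0.05
```

Premultiplying the optimal instrument by a non-zero constant c leaves cov unchanged, matching the equivalence of (4.10) and (4.11).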

Lemma 4.2 guarantees the existence of an f^d in F* that satisfies (4.8) when, among other things, Z satisfies Property 3.5. For some examples, it turns out that the most convenient way of verifying that Z satisfies Property 3.5 is to find an f^d in F* that satisfies

E(zd) = E[M(ze) f^d′]   for all z in Z.   (4.12)

⁵Property 3.5 plays a crucial role in guaranteeing that E(f^d f^d′) is non-singular. In some time series estimation problems in which Property 3.5 is violated, it is possible to construct a sequence of estimators with asymptotic covariance matrices that converge to a singular matrix.


Lemma 4.4. Suppose Z satisfies Properties 3.1-3.4 and there is an f^d in F* satisfying (4.8). Then Z satisfies Property 3.5.⁶

Using the definition of D given in (4.5), it follows that the f^d that satisfies (4.12) also satisfies (4.8). Hence if we can find an f^d that satisfies (4.12), then we need not verify that Z satisfies Property 3.5.

In the remaining section of this paper, we provide a more explicit characterization of the function M and the matrix inf(Z) for two examples.

5. Illustrations

In this section we show how to apply the approach suggested in sections 2 through 4. We focus on two particular examples that were chosen for illustrative purposes. Hansen (1985) provides a more comprehensive set of applications of this approach. Some of the analysis underlying these two examples is common and will be presented first. For the first example we impose an additional restriction on the disturbance vector e, and for the second example we impose an additional restriction on the index set Z of instrumental variables.

Let w be an n-dimensional random vector that is measurable with respect to B and satisfies:

Assumption 5.1. E(w|B_{-1}) = 0.

Assumption 5.1 guarantees that the stochastic process generated by w in conjunction with S is a martingale difference sequence. The disturbance vector e is assumed to satisfy:

Assumption 5.2. e = Γ(L)w, and Γ(u) = Σ_{j=0}^{s-1} Γ_j u^j,

where L is the lag operator, u is a complex variable, and Γ_j is a K by n matrix of real numbers for 0 ≤ j < s. Taken together, Assumptions 5.1 and 5.2 imply

E(e|B_{-s}) = 0.   (5.1)
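A toy numerical check of Assumptions 5.1-5.2 and implication (5.1), with s = 2, scalar w, and coefficients of our own choosing: the disturbance is a two-term moving average of a martingale difference sequence, so it is uncorrelated with information dated two or more periods back but not with information one period back.

```python
import numpy as np

# Toy check of (5.1): with s = 2 and scalar w, e_t = G0*w_t + G1*w_{t-1}
# is a 2-step-ahead forecast error.  Coefficients and sample size are ours.
rng = np.random.default_rng(5)
T = 400_000
w = rng.normal(size=T)                 # martingale difference (iid here)
G0, G1 = 1.0, 0.7
e = G0 * w[1:] + G1 * w[:-1]           # e[k] = G0*w[k+1] + G1*w[k]

# E(e_t | B_{t-s}) = 0 for s = 2: e is uncorrelated with w two periods
# back, but remains correlated with w one period back.
assert abs(np.mean(e[1:] * w[:-2])) < 0.01        # cov(e_t, w_{t-2}) ≈ 0
assert abs(np.mean(e * w[:-1]) - G1) < 0.01       # cov(e_t, w_{t-1}) = G1
```

This is exactly the feature that lets instruments dated -s or earlier be predetermined without being strictly exogenous.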

Many rational expectations models have the implication that the disturbance vector is an s-step ahead forecast error of some K-dimensional random vector

⁶Relation (4.12) can be modified to require that there exists an f* in F such that

E(zd) = E[M(ze) f*′]   for all z in Z.

This is weaker than (4.12) since f* is not required to be in F*. This weaker condition is not particularly useful for our problem since we need to characterize f^d in order to characterize inf(Z).


where s can be greater than one [e.g., see Hansen and Hodrick (1980, 1983)]. Assumptions 5.1 and 5.2 are consistent with those models.

Let J be a closed (in mean-square) linear space of random variables with

finite second moments that are measurable with respect to B. One possibility is for J to contain all such random variables. A second possibility is for J to contain all linear combinations of a finite-dimensional random vector that is measurable with respect to B and has a finite second moment. A third possibility is for J to be the closed linear space generated by all finite linear combinations of current and past values of a random vector with a finite second moment. Flexibility in the specification of J is advantageous because it broadens the range of estimation problems to which the analysis applies. For instance, a researcher may wish to characterize the largest class of GMM estimators for which a particular estimator attains the greatest lower bound. In this way he can assess better the cost of obtaining further gains in asymptotic efficiency by expanding the class of estimators.

Let $B^*$ be the sigma algebra generated by J, and let

$Z = \{z : z$ is a k by K matrix of random variables with elements in $J_{-s}\}$. (5.2)

We assume that:

Assumption 5.3. $E(ww' \mid B^*_{-1}) = I$,

so that w is homoscedastic conditioned on $B^*_{-1}$.

Assumptions 5.1-5.3 and specification (5.2) of the index set Z imply that

$E(ze) = E[E(ze \mid B^*_{-s})] = E[zE(e \mid B^*_{-s})] = 0$ for all z in Z. (5.3)

In this sense, members of Z are valid candidates for instrumental variables. However, in general

$E(z_t e) \ne 0$ for positive values of t, (5.4)

so that members of Z may not be strictly exogenous. This feature of Z is central in the analysis of Example 2.
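The distinction between (5.3) and (5.4) can be checked numerically. The sketch below is hypothetical (scalar case with K = n = 1, s = 2, and assumed coefficients $\Gamma_0 = 1$ and $\Gamma_1 = 0.5$, none of which appear in the paper): the instrument $z_t = w_{t-2}$ lies in $J_{-s}$ and is orthogonal to $e_t$, while its forward shift $z_{t+2} = w_t$ is not.

```python
# Hypothetical illustration (not from the paper): scalar case with s = 2,
# e_t = G0*w_t + G1*w_{t-1} for i.i.d. standard normal w, and z_t = w_{t-2}.
import numpy as np

rng = np.random.default_rng(0)
T = 200_000
w = rng.standard_normal(T)
G0, G1 = 1.0, 0.5                       # assumed coefficients Gamma_0, Gamma_1
e = G0 * w
e[1:] += G1 * w[:-1]                    # e_t = G0 w_t + G1 w_{t-1}

z = np.zeros(T)
z[2:] = w[:-2]                          # z_t = w_{t-2}, an element of J_{-s}

print(np.mean(z * e))                   # close to 0: eq. (5.3), valid instrument
print(np.mean(z[2:] * e[:-2]))          # close to G0: eq. (5.4), E(z_{t+2} e_t) != 0
```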


The following characterization of M is valid for these examples:

Lemma 5.1. Suppose Assumptions 5.1-5.3 are satisfied and Z is given by (5.2). Then Z satisfies Property 3.1 and

$M(ze) = [z_0\Gamma(L^{-1})]w$. (5.5)

The proofs of lemmas presented in this section are given in appendix C.

Next, we verify that M(ze) as given by (5.5) is in F. Since z is measurable with respect to $B_{-s}$ and Assumption 5.2 is satisfied, the elements of $[z_0\Gamma(L^{-1})]$ are measurable with respect to $B_{-1}$. Consequently, Assumption 5.1 implies that

$E\{[z_0\Gamma(L^{-1})]w \mid B_{-1}\} = [z_0\Gamma(L^{-1})]E(w \mid B_{-1}) = 0$. (5.6)

Therefore, M(ze) as given by (5.5) is in F. Using (5.5), we can represent the matrix inner product $(\cdot|\cdot)_*$ as

$(z|z^*)_* = E\{[z_0\Gamma(L^{-1})]ww'[z_0^*\Gamma(L^{-1})]'\}$
$\qquad = E\{E\{[z_0\Gamma(L^{-1})]ww'[z_0^*\Gamma(L^{-1})]' \mid B^*_{-1}\}\}$ (5.7)
$\qquad = E\{[z_0\Gamma(L^{-1})][z_0^*\Gamma(L^{-1})]'\}$ for any z and $z^*$ in Z.

Let $\mathrm{proj}(\cdot|J_{-s})$ denote the linear least squares projection operator onto $J_{-s}$, defined relative to the standard inner product on the $L^2$ space of random variables with finite second moments. To ensure that Z satisfies Properties 3.2 and 3.3, we assume:

Assumption 5.4. $E[\mathrm{tr}(d'd)]$ is finite and $\partial m(y,\cdot)/\partial\beta$ is second-moment continuous at $\beta_0$.7

Assumption 5.5. $E[\mathrm{proj}(d'|J_{-s})\,\mathrm{proj}(d|J_{-s})]$ is non-singular.

Lemma 5.2. Suppose Assumptions 5.4 and 5.5 are satisfied and Z is given by (5.2). Then Z satisfies Properties 3.2 and 3.3.

Since J is a linear space of random variables, Z satisfies Property 3.4 (linearity) by construction. Finally, to guarantee that Z satisfies Property 3.5

7 See footnote 1 for a definition of second-moment continuity.


(continuity of D) we assume:

Assumption 5.6. $\det[\Gamma(u)\Gamma(u)'] \ne 0$ for all $|u| = 1$,

where prime denotes transposition combined with complex conjugation.

Lemma 5.3. Suppose Assumptions 5.1-5.4 and 5.6 are satisfied and Z is given by (5.2). Then Z satisfies Property 3.5 and $F^o = F^*$.

Taken together, Lemmas 5.1-5.3 show that the hypotheses of Lemmas 4.2 and 4.3 are satisfied. In the next two subsections we specialize our analysis in two distinct ways. In the first subsection, we restrict e by assuming s is one. In the second subsection we restrict Z by assuming J contains $J_{-1}$.

5.1. Example 1

In this example we assume that the disturbance vector process is a martingale difference sequence (s = 1). By altering the specification of J, we can obtain time series counterparts to bounds calculated by Basmann (1957), Sargan (1958), Jorgenson and Laffont (1974), and Amemiya (1977). To obtain counterparts to the bounds calculated by Basmann and Sargan, J is taken to be finite-dimensional. To obtain the counterpart to the bounds calculated by Jorgenson and Laffont and Amemiya, J is taken to be the set of all random variables with finite second moments that are measurable with respect to B. In this case J will be infinite-dimensional in general. An additional case included in our analysis is where J is generated by all finite linear combinations of current and past values of a random vector with a finite second moment. Since the linear combinations can include arbitrary lags of the random vector, J will be infinite-dimensional for this case as well.

By applying Lemmas 4.2 and 4.3, inf(Z) is attained by a member $z^d$ of Z that satisfies

$(z|z^d)_* = E(zd)$. (5.8)

Since the elements of z are in $J_{-1}$, it follows that

$E(zd) = E[z\,\mathrm{proj}(d|J_{-1})]$ for all z in Z. (5.9)

When s is one, (5.7) simplifies to

$(z|z^d)_* = E(z\Gamma_0\Gamma_0'z^{d\prime})$. (5.10)

In this example $\Gamma_0\Gamma_0'$ is the unconditional covariance matrix for e.


Substituting (5.9) and (5.10) into (5.8) gives

$E(z\Gamma_0\Gamma_0'z^{d\prime}) = E[z\,\mathrm{proj}(d|J_{-1})]$ for all z in Z. (5.11)

Members of Z have elements that are arbitrary elements of $J_{-1}$. Hence, relation (5.11) implies that

$\Gamma_0\Gamma_0'z^{d\prime} = \mathrm{proj}(d|J_{-1})$. (5.12)

Since $\Gamma(u)$ satisfies Assumption 5.6, $\Gamma_0\Gamma_0'$ is non-singular. Solving (5.12) for $z^d$ gives the following result:

Lemma 5.4. Suppose Assumptions 5.1-5.6 are satisfied and Z is given by (5.2). If s = 1, then

$z^d = \mathrm{proj}(d|J_{-1})'(\Gamma_0\Gamma_0')^{-1}$, (5.13)

and

$\inf(Z) = \{E[\mathrm{proj}(d|J_{-1})'(\Gamma_0\Gamma_0')^{-1}\mathrm{proj}(d|J_{-1})]\}^{-1}$. (5.14)

To derive characterization (5.14) of inf(Z), first apply Lemma 4.3 to obtain

$\inf(Z) = (z^d|z^d)_*^{-1} = [E(z^d\Gamma_0\Gamma_0'z^{d\prime})]^{-1}$, (5.15)

and then substitute for $z^d$ from (5.13).

In this example the choice of instrumental variables that attains inf(Z) is obtained by calculating (or estimating) the least squares regression $\mathrm{proj}(d|J_{-1})$ and the unconditional covariance matrix $\Gamma_0\Gamma_0'$.
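The Example 1 recipe can be sketched with sample analogues. Everything below is a hypothetical scalar illustration (k = K = 1, a two-variable space generating $J_{-1}$, assumed coefficients), not the paper's estimator: $\mathrm{proj}(d|J_{-1})$ is estimated by least squares, $\Gamma_0\Gamma_0'$ by the sample variance of e, and inf(Z) is formed from (5.14).

```python
# A sample-analogue sketch of eqs. (5.9)-(5.14); the data generating process and
# the two-variable space generating J_{-1} are hypothetical (k = K = 1).
import numpy as np

rng = np.random.default_rng(1)
T = 100_000
x = rng.standard_normal((T, 2))                         # variables spanning J_{-1}
d = x @ np.array([1.0, -0.5]) + rng.standard_normal(T)  # d_t, correlated with J_{-1}
e = rng.standard_normal(T)                              # martingale difference (s = 1)

# proj(d | J_{-1}): fitted values from least squares of d on x
beta_hat, *_ = np.linalg.lstsq(x, d, rcond=None)
proj_d = x @ beta_hat

sigma = np.mean(e**2)                       # sample analogue of Gamma_0 Gamma_0'
zd = proj_d / sigma                         # eq. (5.13), scalar case
bound = 1.0 / np.mean(proj_d**2 / sigma)    # eq. (5.14): inf(Z)
print(bound)                                # close to 1/1.25 = 0.8 for this design
```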

5.2. Example 2

In this example, we allow s to be greater than one so that the disturbance vector can be serially correlated. However, we adopt a more restrictive specification of the space J. We assume that J is specified so that $J_{-1}$ is a subspace of J. Consequently, for any variable in J, lagged values of that variable are also in J. This specification of J presumes that J is infinite-dimensional as long as there are elements in J that generate stochastic processes that are not perfectly forecastable. Among other things, the analysis in this section shows how to calculate the limiting asymptotic covariance matrix for the sequence of estimators suggested by Hayashi and Sims (1983).8 Their sequence of estimators is obtained by expanding the list of instrumental variables to include arbitrary lags of some underlying set of instrumental variables.

8 Hayashi and Sims (1983) do not impose the exact same assumptions that are imposed in this example, although their analysis is entirely compatible with the assumptions maintained here.

When s is greater than one, $[z_0\Gamma(L^{-1})]$ will not necessarily be in Z even though z is in Z. The operator $\Gamma(L^{-1})$ is a forward operator with terms in negative powers of L, and z may not be strictly exogenous. For this reason we obtain an alternative to representation (5.7) of the inner product $(\cdot|\cdot)_*$. This alternative representation involves a backwards operator with a one-sided inverse. The existence of such an operator is guaranteed by the Wold Decomposition Theorem.

Lemma 5.5. Suppose Assumptions 5.2 and 5.6 are satisfied. Then there exists an operator $\Gamma^*(L)$ such that

$\Gamma^*(u) = \sum_{j=0}^{s-1} \Gamma^*_j u^j$, (5.16)

where $\Gamma^*_0, \Gamma^*_1, \ldots, \Gamma^*_{s-1}$ are K by K matrices of real numbers;

$\Gamma^*(u)\Gamma^*(u)' = \Gamma(u)\Gamma(u)'$, (5.17)

for all complex numbers $|u| = 1$; and

$\det[\Gamma^*(u)] \ne 0$, (5.18)

for all u satisfying $|u| \le 1$.

Since $\Gamma^*$ satisfies (5.17), $(\cdot|\cdot)_*$ can be represented using $\Gamma^*$.

Lemma 5.6. Suppose Assumptions 5.1-5.6 are satisfied and Z is given by (5.2). For any z and $z^*$ in Z,

$(z|z^*)_* = E\{z[\Gamma^*(L^{-1})\Gamma^*(L)'z^{*\prime}]\}$ (5.19)
$\qquad = E\{[z_0\Gamma^*(L)][z_0^*\Gamma^*(L)]'\}$.

The following lemma characterizes $z^d$ and inf(Z) for Example 2:

Lemma 5.7. Suppose Assumptions 5.1-5.6 are satisfied and Z is given by (5.2). If J contains $J_{-1}$, then

$z^d = \mathrm{proj}\{[d'\Gamma^*(L^{-1})^{-1\prime}] \mid J_{-s}\}\Gamma^*(L)^{-1}$, (5.20)

and

$\inf(Z) = E\big(\mathrm{proj}\{[d'\Gamma^*(L^{-1})^{-1\prime}] \mid J_{-s}\}\,\mathrm{proj}\{[d'\Gamma^*(L^{-1})^{-1\prime}] \mid J_{-s}\}'\big)^{-1}$. (5.21)

Since $\Gamma^*$ satisfies (5.16) and (5.18) of Lemma 5.5, it follows that $\Gamma^*$ has a one-sided inverse on the space of stationary stochastic processes with finite second moments. In appendix C we use this fact to show that the right-hand side of (5.20) is a well-defined matrix of random variables in $J_{-s}$. Let $z^*$ denote this matrix.

To prove (5.20), we must show that $z^*$ equals $z^d$, where $z^d$ is the solution to

$E(zd) = (z|z^d)_*$. (5.22)

Using the representation of $(\cdot|\cdot)_*$ given in (5.19) of Lemma 5.6, it follows that

$(z|z^d)_* = E\{z[\Gamma^*(L^{-1})\Gamma^*(L)'z^{d\prime}]\}$. (5.23)

Combining (5.22) and (5.23) gives

$E(zd) = E\{z[\Gamma^*(L^{-1})\Gamma^*(L)'z^{d\prime}]\}$ for all z in Z. (5.24)

For (5.24) to be satisfied, it is necessary and sufficient that

$\mathrm{proj}(d|J_{-s}) = \mathrm{proj}\{[\Gamma^*(L^{-1})\Gamma^*(L)'z^{d\prime}] \mid J_{-s}\}$. (5.25)

To verify that $z^*$ equals $z^d$, we substitute $z^*$ as given by the right-hand side of (5.20) for $z^d$ in the right-hand side of (5.25). This gives

$\mathrm{proj}\{[\Gamma^*(L^{-1})\Gamma^*(L)'z^{*\prime}] \mid J_{-s}\} = \mathrm{proj}\big(\Gamma^*(L^{-1})\,\mathrm{proj}[\Gamma^*(L^{-1})^{-1}d \mid J_{-s}] \,\big|\, J_{-s}\big) = \mathrm{proj}(d|J_{-s})$, (5.26)

which verifies that $z^*$ satisfies (5.25). Since $z^*$ is in Z, $z^*$ equals $z^d$ and inf(Z) is attained by $z^*$. Lemmas 4.3 and 5.6 show that

inf(Z) = (z~/z~);’

= E{ [ &‘*( L)] [ z$*( L)]‘} -‘. (5.27)

To prove (5.21) of Lemma 5.7, we substitute (5.20) into (5.27).

In this example the choice of instrumental variables that attains inf(Z) is obtained by calculating (or estimating) the operator $\Gamma^*(L)$ from the serial correlation properties of $\{e_t : -\infty < t < +\infty\}$ and calculating the least squares regression $\mathrm{proj}\{[d'\Gamma^*(L^{-1})^{-1\prime}] \mid J_{-s}\}$.
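The operator $\Gamma^*$ of Lemma 5.5 can be computed explicitly in a scalar moving-average case. The sketch below is an assumed special case (K = 1, s = 2, with hypothetical coefficients), not the paper's general construction: it matches the autocovariance generating function $\Gamma(u)\Gamma(1/u)$ and selects the factor whose root lies outside the unit circle, so that (5.18) holds.

```python
# An assumed scalar special case (K = 1, s = 2) of Lemma 5.5, not the paper's
# general construction: factor Gamma(u)Gamma(1/u) so the resulting Gamma*(u)
# has its root outside the unit circle, i.e., det Gamma*(u) != 0 for |u| <= 1.
import numpy as np

def fundamental_ma1(g0, g1):
    # Autocovariances of e_t = g0 w_t + g1 w_{t-1}: c0 = g0^2 + g1^2, c1 = g0 g1.
    c0, c1 = g0**2 + g1**2, g0 * g1
    # Gamma(u)Gamma(1/u) = c1 u + c0 + c1/u; the roots of c1 u^2 + c0 u + c1
    # come in reciprocal pairs, and the root with modulus > 1 picks the factor.
    roots = np.roots([c1, c0, c1])
    r = roots[np.abs(roots) > 1][0]
    theta_star = -1.0 / r                   # |theta_star| < 1
    s0 = np.sqrt(c0 / (1 + theta_star**2))  # rescale so autocovariances match
    return s0, s0 * theta_star              # coefficients of Gamma*(u) = s0 + s1 u

s0, s1 = fundamental_ma1(1.0, 2.0)          # Gamma(u) = 1 + 2u is non-fundamental
print(s0, s1)                               # Gamma*(u) = 2 + u (up to sign)
```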

6. Conclusions

In this paper we suggested and documented a method for calculating bounds on the asymptotic covariance matrices for estimators in an infinite-dimensional class of GMM estimators. The method we suggested uses the theory of stationary stochastic processes coupled with martingale difference decompositions as suggested by Gordin (1969). We illustrated how to use the method by calculating greatest lower bounds for GMM estimators under two different sets of auxiliary assumptions. These two sets of assumptions are by no means exhaustive. In Hansen (1985) this method is applied in calculating greatest lower bounds for many other sets of auxiliary assumptions corresponding to applications of GMM estimation in the literature. A crucial ingredient in these examples is the characterization of the approximating martingale difference sequence shown to exist in this paper.

Appendix A

In this appendix we prove the lemmas given in section 2. Throughout all of the appendices we let $\|h\| = [E(h \cdot h)]^{1/2}$, where h is a k-dimensional random vector with a finite second moment. The norm $\|\cdot\|$ is the norm for the $L^2$ space induced by the product measure of Prob and the counting measure on the set $\{1, 2, \ldots, k\}$.

Proof of Lemma 2.1. This proof is divided into three parts. First, we establish the linearity of F, then of G, and finally of H.

(i) Linearity of F: Let f and $f^*$ be in F and c and $c^*$ be k by k matrices of real numbers. Since f and $f^*$ are in F,

$E(f|B_{-1}) = E(f^*|B_{-1}) = 0$. (A.1)

Hence,

$E(cf + c^*f^* \mid B_{-1}) = cE(f|B_{-1}) + c^*E(f^*|B_{-1}) = 0$. (A.2)


Also, the Triangle Inequality insures that $cf + c^*f^*$ has a finite second moment. Therefore, $cf + c^*f^*$ is in F.

(ii) Linearity of G: Let g and $g^*$ be in G and c and $c^*$ be k by k matrices of real numbers. Since g and $g^*$ are in G, there exist non-negative integers r and $r^*$ such that

$E(g|B_{-r}) = E(g^*|B_{-r^*}) = 0$. (A.3)

Without loss of generality, let $r \ge r^*$. Then (A.3) and the Law of Iterated Expectations imply that

$E(cg + c^*g^* \mid B_{-r}) = cE(g|B_{-r}) + c^*E(g^*|B_{-r}) = 0$. (A.4)

Also, the Triangle Inequality insures that $cg + c^*g^*$ has a finite second moment. Therefore, $cg + c^*g^*$ is in G.

(iii) Linearity of H: Let h and $h^*$ be in H and c and $c^*$ be k by k matrices of real numbers. Define

$h^+ = ch + c^*h^*$. (A.5)

Since h and $h^*$ are in H, there are sequences $\{g^j : j \ge 1\}$ and $\{g^{*j} : j \ge 1\}$ in G such that

$\limsup_{T\to\infty} \|P^T(h - g^j)\| < 1/j$ for $j \ge 1$, (A.6)

and

$\limsup_{T\to\infty} \|P^T(h^* - g^{*j})\| < 1/j$ for $j \ge 1$. (A.7)

Let $g^{+j} = cg^j + c^*g^{*j}$, and note that $g^{+j}$ is in G for all $j \ge 1$. Also,

$P^T(h^+ - g^{+j}) = cP^T(h - g^j) + c^*P^T(h^* - g^{*j})$. (A.8)

By the Triangle and Cauchy-Schwarz Inequalities,

$\|P^T(h^+ - g^{+j})\| \le \mathrm{tr}(cc')^{1/2}\|P^T(h - g^j)\| + \mathrm{tr}(c^*c^{*\prime})^{1/2}\|P^T(h^* - g^{*j})\|$, (A.9)

where $\mathrm{tr}(cc')$ denotes the trace of the matrix $cc'$. Hence,

$\limsup_{T\to\infty} \|P^T(h^+ - g^{+j})\| \le [\mathrm{tr}(cc')^{1/2} + \mathrm{tr}(c^*c^{*\prime})^{1/2}]/j$. (A.10)

Taking limits as j goes to infinity, we obtain

$\lim_{j\to\infty} \limsup_{T\to\infty} \|P^T(h^+ - g^{+j})\| = 0$. (A.11)

Therefore, $h^+$ is in H. Q.E.D.

Proof of Lemma 2.2. This proof relies heavily on the proof of Gordin's (1969) Theorem. In proving this lemma, it is convenient to reproduce some of his steps. Let h be in H and let $\{g^j : j \ge 1\}$ be a sequence in G such that

$\limsup_{T\to\infty} \|P^T(h - g^j)\| < 1/j$. (A.12)

Since $g^j$ is in G, there is an integer $\tau(j)$ such that

$E[g^j \mid B_{-\tau(j)}] = 0$ for $j \ge 1$. (A.13)

The set F is a closed linear subspace of the $L^2$ space defined at the beginning of this appendix. Since $g^j$ satisfies (A.13), it follows that

$g^j = \sum_{t=1}^{\tau(j)} \mathrm{proj}(g^j \mid F_{1-t})$, (A.14)

where $\mathrm{proj}(\cdot|F)$ denotes the least squares projection operator onto F. Following Gordin (1969), there is a sequence $\{f^j : j \ge 1\}$ in F that can be used to approximate the sequence $\{g^j : j \ge 1\}$. The approximating sequence is

$f^j = \sum_{t=1}^{\tau(j)} \mathrm{proj}(g^j_{t-1} \mid F)$ for $j \ge 1$. (A.15)

Then for each $j \ge 1$ and $T \ge \tau(j) - 1$,

$P^T(g^j - f^j) = (g^{*j}_0 - g^{*j}_T)/\sqrt{T}$, (A.16)

where

$g^{*j} = \sum_{t=2}^{\tau(j)} \sum_{s=t}^{\tau(j)} \mathrm{proj}(g^j_{s-1} \mid F)$. (A.17)

Notice that for each $j \ge 1$, $g^{*j}$ is in $L^2$, so that

$\limsup_{T\to\infty} \|(g^{*j}_0 - g^{*j}_T)/\sqrt{T}\| \le 2\lim_{T\to\infty} \|g^{*j}\|/\sqrt{T} = 0$. (A.18)


Taking norms and limits of (A.16) gives

$\lim_{T\to\infty} \|P^T(g^j - f^j)\| = 0$ for all $j \ge 1$. (A.19)

Using (A.12), (A.19) and the Triangle Inequality, we obtain the following inequalities:

$\limsup_{T\to\infty} \|P^T(h - f^j)\| < 1/j$ for all $j \ge 1$, (A.20)

and

$\limsup_{T\to\infty} \|P^T(f^j - f^{j^*})\| < 2/j^*$ for all $j \ge j^*$. (A.21)

Now for any f in F,

$\|P^T(f)\| = \|f\|$, (A.22)

so that (A.21) implies that

$\|f^j - f^{j^*}\| \le 2/j^*$ for all $j \ge j^*$. (A.23)

Consequently, $\{f^j : j \ge 1\}$ is a Cauchy sequence in $L^2$. Since F is closed, $\{f^j : j \ge 1\}$ converges in mean square ($L^2$) to some $f^o$ in F. Hence, (A.20), (A.22) and the Triangle Inequality imply that

$\limsup_{T\to\infty} \|P^T(h - f^o)\| = 0$. (A.24)

We define $M(h) = f^o$.

Next, we show that (A.24) defines a unique $f^o$ in F. Let $f^+$ be any other element in F for which

$\limsup_{T\to\infty} \|P^T(h - f^+)\| = 0$. (A.25)

Then, by the Triangle Inequality,

$\limsup_{T\to\infty} \|P^T(f^+ - f^o)\| = 0$. (A.26)

Using relation (A.22), we conclude that

$\|f^+ - f^o\| = 0$, (A.27)

so that $f^+ = f^o$. Therefore, (A.24) defines a unique value for M(h).


Finally, we show that M is linear. This logic parallels closely the logic used in proving Lemma 2.1. Let h and $h^*$ be in H and c and $c^*$ be k by k matrices of real numbers. Define

$h^+ = ch + c^*h^*$. (A.28)

Since F is linear, $cM(h) + c^*M(h^*)$ is in F. We must show that

$\limsup_{T\to\infty} \|P^T[h^+ - cM(h) - c^*M(h^*)]\| = 0$. (A.29)

The Cauchy-Schwarz and Triangle Inequalities imply that

$\|P^T[h^+ - cM(h) - c^*M(h^*)]\| \le \mathrm{tr}(cc')^{1/2}\|P^T[h - M(h)]\| + \mathrm{tr}(c^*c^{*\prime})^{1/2}\|P^T[h^* - M(h^*)]\|$. (A.30)

It follows that

$\limsup_{T\to\infty} \|P^T[h^+ - cM(h) - c^*M(h^*)]\| \le \mathrm{tr}(cc')^{1/2}\limsup_{T\to\infty}\|P^T[h - M(h)]\| + \mathrm{tr}(c^*c^{*\prime})^{1/2}\limsup_{T\to\infty}\|P^T[h^* - M(h^*)]\| = 0$. (A.31)

Q.E.D.

Proof of Lemma 2.3. This lemma is a simple application of Billingsley's (1961) Theorem extended to random vectors. Billingsley's Theorem implies that $\{\mu'P^T[M(h)] : T \ge 1\}$ converges in distribution to a normally distributed random variable with mean zero and variance $\mu'E[M(h)M(h)']\mu$ for any k-dimensional vector of real numbers $\mu$. Consequently, it implies that $\{P^T[M(h)] : T \ge 1\}$ converges in distribution to a normally distributed random vector with mean zero and covariance matrix $E[M(h)M(h)']$. Lemma 2.2 guarantees that $\{P^T[h - M(h)] : T \ge 1\}$ converges in mean square ($L^2$) to zero. Therefore, $\{P^T(h) : T \ge 1\}$ converges in distribution to the same random vector as does $\{P^T[M(h)] : T \ge 1\}$. Q.E.D.
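A small simulation illustrates the content of Lemma 2.3 in the scalar case. The process below is hypothetical (an i.i.d., hence martingale difference, sequence with assumed variance 0.7): across replications, the scaled partial sums $P^T[M(h)]$ have variance close to $E[M(h)M(h)']$.

```python
# Hypothetical scalar simulation of Lemma 2.3: scaled partial sums of a
# martingale difference sequence (here i.i.d. with variance 0.7) are
# approximately N(0, E[M(h)M(h)']) for large T.
import numpy as np

rng = np.random.default_rng(2)
T, reps = 2_000, 5_000
m = np.sqrt(0.7) * rng.standard_normal((reps, T))  # E[m^2] = 0.7
pt = m.sum(axis=1) / np.sqrt(T)                    # P^T[M(h)], one draw per row
print(pt.var())                                    # close to 0.7
```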


Proof of Lemma 2.4. The proof of this lemma relies heavily on Lemma 2.2. Lemma 2.2 implies that for any h and $h^*$ in H, $\{P^T[h - M(h)] : T \ge 1\}$ and $\{P^T[h^* - M(h^*)] : T \ge 1\}$ converge in mean square to zero. However,

$E[P^T(h)P^T(h^*)'] - E\{P^T[M(h)]P^T[M(h^*)]'\}$
$\qquad = E\{P^T[h - M(h)]P^T[h^* - M(h^*)]'\}$ (A.32)
$\qquad + E\{P^T[M(h)]P^T[h^* - M(h^*)]'\}$
$\qquad + E\{P^T[h - M(h)]P^T[M(h^*)]'\}$.

Since M(h) and $M(h^*)$ are in F,

$E\{P^T[M(h)]P^T[M(h^*)]'\} = E[M(h)M(h^*)']$, (A.33)

$\|P^T[M(h)]\| = \|M(h)\|$, (A.34)

and

$\|P^T[M(h^*)]\| = \|M(h^*)\|$. (A.35)

Taking limits of (A.32) and applying the Cauchy-Schwarz Inequality gives

$\lim_{T\to\infty} \{E[P^T(h)P^T(h^*)'] - E[M(h)M(h^*)']\} = 0$. (A.36)

Q.E.D.

Appendix B

In this appendix we prove the lemmas given in section 4.

Proof of Lemma 4.1. For any f and $f^*$ in $F^o$ there is a z and a $z^*$ in Z such that $f = M(ze)$ and $f^* = M(z^*e)$. Let c and $c^*$ be any k by k matrices of real numbers. Since Z satisfies Property 3.4, $cz + c^*z^*$ is in Z. Hence, $M[(cz + c^*z^*)e]$ is in $F^o$. Lemma 2.2 guarantees that M is linear in the sense that

$M[(cz + c^*z^*)e] = cM(ze) + c^*M(z^*e) = cf + c^*f^*$, (B.1)

which proves that $F^o$ satisfies Property 2.2.


Let f and $f^*$ be in $F^*$ and c and $c^*$ be k by k matrices of real numbers. Then there are sequences $\{f^j : j \ge 1\}$ and $\{f^{j*} : j \ge 1\}$ in $F^o$ such that

$\lim_{j\to\infty} \|f - f^j\| = 0$, (B.2)

and

$\lim_{j\to\infty} \|f^* - f^{j*}\| = 0$. (B.3)

Since $F^o$ satisfies Property 2.2, $\{cf^j + c^*f^{j*} : j \ge 1\}$ is a sequence in $F^o$. By the Triangle and Cauchy-Schwarz Inequalities,

$\|(cf^j + c^*f^{j*}) - (cf + c^*f^*)\| \le \|c(f^j - f)\| + \|c^*(f^{j*} - f^*)\| \le \mathrm{tr}(cc')^{1/2}\|f^j - f\| + \mathrm{tr}(c^*c^{*\prime})^{1/2}\|f^{j*} - f^*\|$. (B.4)

The limits in (B.2) and (B.3) imply that

$\lim_{j\to\infty} \|(cf^j + c^*f^{j*}) - (cf + c^*f^*)\| = 0$. (B.5)

Therefore, $cf + c^*f^*$ is in $F^*$ and $F^*$ satisfies Property 2.2.

Proof of Lemma 4.2. This proof is divided into three steps. First we show that D is a bounded linear matrix functional on $F^o$. Then we extend this functional to $F^*$. Finally, we use the same logic as is used in the Riesz-Frechet Theorem to obtain a representation of D.

To show that D is a linear matrix functional, let f and $f^*$ be in $F^o$, and let c and $c^*$ be k by k matrices of real numbers. Then there exist a z and a $z^*$ in Z such that $f = M(ze)$ and $f^* = M(z^*e)$. Hence,

$D(f) = E(zd)$, (B.6)

and

$D(f^*) = E(z^*d)$. (B.7)

Lemma 2.2 guarantees the linearity of M so that

$cf + c^*f^* = M[(cz + c^*z^*)e]$. (B.8)


Consequently,

$D(cf + c^*f^*) = E[(cz + c^*z^*)d] = cE(zd) + c^*E(z^*d) = cD(f) + c^*D(f^*)$, (B.9)

which proves that D is a linear matrix functional.

To prove that D is bounded, let

$\delta = \sup\{\mathrm{tr}[D(f)D(f)']^{1/2} : \|f\| = 1$ and f is in $F^o\}$. (B.10)

We must show that $\delta$ is finite. Suppose to the contrary that $\delta$ is infinite. Then there exists a sequence $\{f^j : j \ge 1\}$ in $F^o$ such that $\|f^j\| = 1$ and

$\mathrm{tr}[D(f^j)D(f^j)']^{1/2} > j$. (B.11)

Now

$f^{j*} = f^j/\mathrm{tr}[D(f^j)D(f^j)']^{1/2}$ (B.12)

is in $F^o$,

$\|f^{j*}\| \le 1/j$, (B.13)

and

$D(f^{j*}) = (1/\mathrm{tr}[D(f^j)D(f^j)']^{1/2})D(f^j)$, (B.14)

for $j \ge 1$. Hence,

$\lim_{j\to\infty} \|f^{j*}\| = 0$, (B.15)

and

$\mathrm{tr}[D(f^{j*})D(f^{j*})']^{1/2} = 1$ for all $j \ge 1$. (B.16)

Since $f^{j*}$ is in $F^o$, there exists a $z^j$ in Z such that

$f^{j*} = M(z^je)$ for all $j \ge 1$. (B.17)

In light of (B.15),

$\lim_{j\to\infty} (z^j|z^j)_* = 0$. (B.18)


Since Z satisfies Property 3.5,

$\lim_{j\to\infty} E(z^jd) = \lim_{j\to\infty} D(f^{j*}) = 0$. (B.19)

This contradicts (B.16). Therefore, D is bounded.

Next, we prove that there is an extension of D to $F^*$. For any given $f^*$ in $F^*$, let $\{f^j : j \ge 1\}$ be a sequence in $F^o$ for which

$\lim_{j\to\infty} \|f^j - f^*\| = 0$. (B.20)

By the Triangle Inequality,

$\lim_{j\to\infty} \|f^j\| = \|f^*\|$. (B.21)

Since D is bounded on $F^o$,

$\delta^* = \sup\{\mathrm{tr}[D(f^j)D(f^j)']^{1/2} : j \ge 1\}$ (B.22)

is finite, and there exists a subsequence $\{f^{j(r)} : r \ge 1\}$ for which

$\lim_{r\to\infty} D[f^{j(r)}] = D(f^*)$ (B.23)

exists and satisfies

$\mathrm{tr}[D(f^*)D(f^*)']^{1/2} \le \delta^*$. (B.24)

Property 3.5 of Z then guarantees that

$\lim_{j\to\infty} D(f^j) = D(f^*)$, (B.25)

since

$\lim_{r\to\infty} \|f^{j(r)} - f^*\| = 0$. (B.26)

Furthermore, Property 3.5 insures that any sequence in $F^o$ that converges to $f^*$ gives rise to the same definition of $D(f^*)$ via (B.23). Hence, D is well-defined on $F^*$. Furthermore, the linearity and boundedness of D carry over from $F^o$ to its closure $F^*$.

The remainder of the proof of this lemma mimics the Riesz-Frechet Theorem. Let

$F^+ = \{f$ in $F^* : D(f) = 0\}$. (B.27)


Then $F^+$ is a closed linear subspace of $L^2$. Since Z satisfies Properties 3.3 and 3.4, there exists an $f^a$ in $F^o$ such that

$D(f^a) = I$. (B.28)

Let

$f^* = f^a - \mathrm{proj}(f^a|F^+)$. (B.29)

Since D is a matrix linear functional,

$D(f^*) = D(f^a) - D[\mathrm{proj}(f^a|F^+)] = I$. (B.30)

The random vector $f^*$ is orthogonal to $F^+$ by construction; in other words,

$E(f^{*\prime}f) = 0$ for all f in $F^+$. (B.31)

Also, for any f in $F^+$ and any k by k matrix of real numbers c, cf is in $F^+$. Hence,

$E(ff^{*\prime}) = 0$ for all f in $F^+$. (B.32)

In addition, for any f in $F^*$, $f - D(f)f^*$ is in $F^*$ since $F^*$ satisfies Property 2.2 (see Lemma 4.1) and

$D[f - D(f)f^*] = D(f) - D(f)D(f^*) = 0$. (B.33)

Hence, for any f in $F^*$, $f - D(f)f^*$ is in $F^+$. Consequently, relation (B.32) and the linearity of E imply that

$E(ff^{*\prime}) = D(f)E(f^*f^{*\prime})$ for all f in $F^*$. (B.34)

If $E(f^*f^{*\prime})$ is non-singular, then (B.34) can be expressed as

$E(ff^{d\prime}) = D(f)$, (B.35)

where $f^d = E(f^*f^{*\prime})^{-1}f^*$, which is the desired representation of D.

To show that $E(f^*f^{*\prime})$ is non-singular, suppose to the contrary that $E(f^*f^{*\prime})$ is singular. Then there exists a k by k matrix of real numbers c that has at least one non-zero element and for which $cf^*$ is zero. Since $cf^*$ is zero (almost surely) and D is a matrix linear functional, $D(cf^*) = 0$; but the linearity of D also gives $D(cf^*) = cD(f^*) = c$, which implies that $c = 0$. This contradicts the fact that c has a non-zero element. Hence, $E(f^*f^{*\prime})$ is non-singular, as is $E(f^df^{d\prime})$.


Finally, we show that $f^d$ is unique. Let $f^{d*}$ be any element in $F^*$ for which

$E(ff^{d*\prime}) = E(ff^{d\prime}) = D(f)$ for all f in $F^*$. (B.36)

Since $f^{d*} - f^d$ is in $F^*$,

$E[(f^{d*} - f^d)f^{d*\prime}] - E[(f^{d*} - f^d)f^{d\prime}] = 0$. (B.37)

Hence,

$E[(f^{d*} - f^d)(f^{d*} - f^d)'] = 0$, (B.38)

which implies that $f^{d*} = f^d$. Q.E.D.

Proof of Lemma 4.3. The matrix linear functional D was defined over the entire domain $F^*$ in the proof of Lemma 4.2. Furthermore, Lemma 4.2 implies that

$D(f) = E(ff^{d\prime})$ for any f in $F^*$. (B.39)

Hence, for any f in $F^*$ for which D(f) is non-singular,

$E\{[D(f)^{-1}f - D(f^d)^{-1}f^d]f^{d\prime}\} = 0$. (B.40)

Consequently,

$E\{[D(f)^{-1}f - D(f^d)^{-1}f^d][D(f)^{-1}f - D(f^d)^{-1}f^d]'\} = D(f)^{-1}E(ff')[D(f)']^{-1} - D(f^d)^{-1}$, (B.41)

or

$D(f^d)^{-1} \le D(f)^{-1}E(ff')[D(f)']^{-1}$. (B.42)

From (B.41) it follows that

$D(f^d)^{-1} = D(f)^{-1}E(ff')[D(f)']^{-1}$ (B.43)

if and only if

$D(f)^{-1}f - D(f^d)^{-1}f^d = 0$, (B.44)

or

$f = cf^d$, (B.45)

for some k by k non-singular matrix of real numbers c. Since there exists a sequence $\{f^j : j \ge 1\}$ in $F^o$ for which

$\lim_{j\to\infty} \|f^j - f^d\| = 0$, (B.46)

it follows that

$\lim_{j\to\infty} D(f^j) = D(f^d)$, (B.47)

and

$\lim_{j\to\infty} E(f^jf^{j\prime}) = E(f^df^{d\prime})$. (B.48)

Therefore,

$\lim_{j\to\infty} D(f^j)^{-1}E(f^jf^{j\prime})[D(f^j)']^{-1} = D(f^d)^{-1}$, (B.49)

and

$\inf(Z) = D(f^d)^{-1} = E(f^df^{d\prime})^{-1}$. (B.50)

If $f^d$ is in $F^o$, then there exists a $z^d$ in Z such that

$f^d = M(z^de)$. (B.51)

In this case it follows from (B.45) that $\mathrm{cov}(z) = \inf(Z)$ if, and only if, $M(ze) = M(cz^de)$ for some k by k non-singular matrix c. Finally,

$(z^d|z^d)_* = (z^de|z^de) = E(f^df^{d\prime}) = [\inf(Z)]^{-1}$. Q.E.D. (B.52)

Proof of Lemma 4.4. Let $\{z^j : j \ge 1\}$ be a sequence in Z that satisfies

$\lim_{j\to\infty} E[M(z^je)M(z^je)'] = 0$, (B.53)

and let $f^d$ be an element of $F^*$ that satisfies (4.12). Then for $j \ge 1$, $M(z^je)$ is in $F^o$ (Property 3.1) and

$E(z^jd) = E[M(z^je)f^{d\prime}]$. (B.54)

By the matrix version of the Cauchy-Schwarz Inequality,

$\mathrm{tr}[E(z^jd)E(z^jd)']^{1/2} \le \|M(z^je)\|\,\|f^d\|$. (B.55)

Also, (B.53) implies that

$\lim_{j\to\infty} \|M(z^je)\| = 0$. (B.56)

Therefore, (B.55) and (B.56) imply that

$\lim_{j\to\infty} E(z^jd) = 0$. Q.E.D. (B.57)

Appendix C

In this appendix we prove the lemmas given in section 5.

Proof of Lemma 5.1. First, we show that for any z in Z, ze is in G and hence in H. Since the elements of z have finite second moments and are measurable with respect to $B_{-s}$, it follows from Assumptions 5.1-5.3 that ze has a finite second moment and

$E(ze \mid B_{-s}) = zE(e \mid B_{-s}) = 0$. (C.1)

Hence, ze is in G.

Next we show that M is given by (5.5). In the proof of Lemma 2.2, we established that for any g in G,

$M(g) = f^*$, (C.2)

where

$E(g \mid B_{-\tau}) = 0$ (C.3)

for some positive integer $\tau$, and

$f^* = \sum_{j=1}^{\tau} [E(g_{j-1} \mid B) - E(g_{j-1} \mid B_{-1})]$. (C.4)

Let $g = ze$ and $\tau = s$. Lemma 5.1 is proved once we show that $f^* = M(ze)$ as given by (5.5), or equivalently that

$\sum_{j=1}^{s} [E(z_{j-1}e_{j-1} \mid B) - E(z_{j-1}e_{j-1} \mid B_{-1})] = [z_0\Gamma(L^{-1})]w$. (C.5)


The matrix $z_{j-1}$ is measurable with respect to $B_{-1}$ for $j = 1, 2, \ldots, s$, and

$E(z_{j-1}e_{j-1} \mid B) - E(z_{j-1}e_{j-1} \mid B_{-1}) = z_{j-1}E(e_{j-1} \mid B) - z_{j-1}E(e_{j-1} \mid B_{-1}) = z_{j-1}\Gamma_{j-1}w$. (C.6)

Summing (C.6) gives

$f^* = \sum_{j=1}^{s} z_{j-1}\Gamma_{j-1}w = [z_0\Gamma(L^{-1})]w$. Q.E.D. (C.7)

Proof of Lemma 5.2. For any z in Z, the elements of z have finite second moments. Also, Assumption 5.4 implies that the elements of d have finite second moments. Hence, it follows from the matrix version of the Cauchy-Schwarz Inequality that the elements of zd have finite first moments. Since $\partial m(y,\cdot)/\partial\beta$ is second-moment continuous at $\beta_0$ and $\mathrm{tr}(zz')$ has a finite first moment, $z\,\partial m(y,\cdot)/\partial\beta$ is first-moment continuous at $\beta_0$. Therefore, Z satisfies Property 3.2. Since $\mathrm{proj}(d'|J_{-s})$ is in Z, Property 3.3 is satisfied for $z = \mathrm{proj}(d'|J_{-s})$. Q.E.D.

Next we consider an intermediate result that will be used to prove Lemma 5.3.

Lemma C.1. Suppose Assumptions 5.2 and 5.6 are satisfied. Then there exist positive numbers $\delta^a$ and $\delta^b$ such that for any K-dimensional random vector h with a finite second moment,

$\delta^a \|h\|^2 \le \|\Gamma(L^{-1})'h_0\|^2 \le \delta^b \|h\|^2$. (C.9)


Proof. Let $R^h$ be the spectral measure of the process generated by h. The random vector $\Gamma(L^{-1})'h_0$ generates a stochastic process with spectral measure $C(\theta)'\,dR^h(\theta)\,C(\theta)$, where $C(\theta) = \Gamma[\exp(-i\theta)]$. By the Fourier inversion formula,

$\|\Gamma(L^{-1})'h_0\|^2 = (1/2\pi)\int_{(-\pi,\pi]} \mathrm{tr}[C(\theta)'\,dR^h(\theta)\,C(\theta)]$, (C.10)

and

$\|h\|^2 = (1/2\pi)\int_{(-\pi,\pi]} \mathrm{tr}[dR^h(\theta)]$. (C.11)

Since C is a continuous matrix function on the compact domain $[-\pi,\pi]$ and Assumption 5.6 is satisfied, there exist positive numbers $\delta^a$ and $\delta^b$ such that the eigenvalues of the product $C(\theta)C(\theta)'$ are in the interval $[\delta^a, \delta^b]$ for all $\theta$ in $[-\pi,\pi]$. Therefore,

$\delta^a \int_{(-\pi,\pi]} \mathrm{tr}[dR^h(\theta)] \le \int_{(-\pi,\pi]} \mathrm{tr}[C(\theta)'\,dR^h(\theta)\,C(\theta)] \le \delta^b \int_{(-\pi,\pi]} \mathrm{tr}[dR^h(\theta)]$. (C.12)

Relation (C.9) then follows from substituting (C.10) and (C.11) into (C.12). Q.E.D.
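The bounds $\delta^a$ and $\delta^b$ of Lemma C.1 can be computed numerically in a scalar example (the coefficients below are assumed for illustration): they are the minimum and maximum over $\theta$ of the eigenvalues of $C(\theta)C(\theta)'$.

```python
# A scalar numeric check of Lemma C.1 with assumed coefficients: delta^a and
# delta^b bound the eigenvalues of C(theta)C(theta)', C(theta) = Gamma(exp(-i theta)).
import numpy as np

g0, g1 = 1.0, 0.5                              # Gamma(u) = 1 + 0.5u, K = 1
theta = np.linspace(-np.pi, np.pi, 10_001)
C = g0 + g1 * np.exp(-1j * theta)              # C(theta)
spec = np.abs(C)**2                            # the single eigenvalue of C C'
delta_a, delta_b = spec.min(), spec.max()
print(delta_a, delta_b)                        # 0.25 and 2.25 for this Gamma
```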

Proof of Lemma 5.3. First we show that Z satisfies Property 3.5. Then we show that $F^o = F^*$. Let $\{z^j : j \ge 1\}$ be a sequence in Z that satisfies

$\lim_{j\to\infty} (z^j|z^j)_* = 0$. (C.13)

Given (5.7),

$(z^j|z^j)_* = E\{[z^j_0\Gamma(L^{-1})][z^j_0\Gamma(L^{-1})]'\}$ for all $j \ge 1$. (C.14)

It then follows from Lemma C.1 that

$E[\mathrm{tr}(z^jz^{j\prime})] \le \mathrm{tr}[(z^j|z^j)_*]/\delta^a$. (C.15)

Thus, (C.13) implies that

$\lim_{j\to\infty} E[\mathrm{tr}(z^jz^{j\prime})] = 0$. (C.16)


Since the elements of d have finite second moments (Assumption 5.4), the matrix version of the Cauchy-Schwarz Inequality implies that

$\lim_{j\to\infty} E(z^jd) = 0$. (C.17)

Finally, we prove that $F^o = F^*$. Let $\{f^j : j \ge 1\}$ be any sequence in $F^o$ that is mean-square Cauchy. Then there exists a sequence $\{z^j : j \ge 1\}$ in Z such that $M(z^je) = f^j$ for $j \ge 1$. Thus

$E[(f^j - f^{\tau})(f^j - f^{\tau})'] = (z^j - z^{\tau} \mid z^j - z^{\tau})_*$, (C.18)

where $(\cdot|\cdot)_*$ is given in (5.7). However, (C.18) and Lemma C.1 imply that

$\mathrm{tr}[(z^j - z^{\tau} \mid z^j - z^{\tau})_*] \ge \delta^a\,\mathrm{tr}\{E[(z^j - z^{\tau})(z^j - z^{\tau})']\}$. (C.19)

Hence, the elements of $\{z^j : j \ge 1\}$ are mean-square Cauchy. Since J is closed in mean square, there exists a $z^0$ in Z such that

$\lim_{j\to\infty} E\{\mathrm{tr}[(z^j - z^0)(z^j - z^0)']\} = 0$. (C.20)

However,

$\|f^j - M(z^0e)\|^2 = \mathrm{tr}[(z^j - z^0 \mid z^j - z^0)_*] \le \delta^b\,\mathrm{tr}\{E[(z^j - z^0)(z^j - z^0)']\}$. (C.21)

Consequently, the limit in (C.20) guarantees that $\{f^j : j \ge 1\}$ converges in mean square to $M(z^0e)$. Therefore, $F^o = F^*$. Q.E.D.

Lemma 5.4 was proved in the text.

Proof oj Lemma 5.5. Let R” denote the spectral measure of the process generated by W. Then R” is absolutely continuous with respect to Lebesgue

measure and with density matrix I for almost all 8 in (-r, ~1. Furthermore, r( L-1) wO generates a stochastic process with a spectral measure that is absolutely continuous with respect to Lebesgue measure and has spectral density matrix given by

c( -B)C( -0)’ (C.22)

for almost all 0 in (- 7~, a] where C( - 19) = r[exp(ie)]. Since Assumption 5.5 is satisfied, the spectral density matrix given in (C.22) has constant rank K for all 0 in (-r, ~1 and the stochastic process { r( L-‘)w,: - ca -c t < -I CO} is


linearly regular. Hence, by the Wold Decomposition Theorem there exists a function $\Gamma^*$ satisfying

$\Gamma^*(\alpha) = \sum_{j=0}^{\infty} \Gamma_j^* \alpha^j$ where $\sum_{j=0}^{\infty} \operatorname{tr}(\Gamma_j^* \Gamma_j^{*\prime}) < \infty$, (C.23)

$C^*(\theta) = \lim_{r \uparrow 1} \Gamma^*[r\exp(-i\theta)]$ for almost all $\theta$ in $(-\pi, \pi]$, (C.24)

$C^*(\theta)C^*(\theta)' = C(-\theta)C(-\theta)'$, (C.25)

$\det[\Gamma^*(\alpha)] \ne 0$ for $|\alpha| < 1$. (C.26)

[See Rozanov (1967, pp. 56–63) for (C.23), (C.25), and (C.26), and see Zygmund (1959, p. 276) for (C.24).]

Let $h_t = \Gamma(L^{-1})w_t$. Since $w$ defines a process that is a vector white noise, Assumption 5.2 implies that

$E(h_t h_{t-\tau}') = 0$ for $\tau \ge s$. (C.27)

Thus

$\Gamma_j^* = 0$ for $j \ge s$. (C.28)

Taken together, (C.23) and (C.28) imply (5.16). Since $\Gamma^*$ and $\Gamma$ are polynomials that are well-defined for all complex values of $\alpha$, (C.24) and (C.25) imply (5.17). Finally, (C.26) and Assumption 5.6 imply (5.18). Q.E.D.
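The factorization in (C.25)–(C.26) can be illustrated numerically in a simple scalar case. The sketch below is a hypothetical illustration (the polynomial $1 - 2\alpha$, its minimum-phase counterpart $2 - \alpha$, and the helper `spectral_density` are not from the paper): a filter with a zero inside the unit circle and its minimum-phase counterpart generate processes with identical spectral densities.

```python
import numpy as np

# Scalar illustration of (C.25): Gamma(alpha) = 1 - 2*alpha has its zero
# at 1/2, inside the unit circle; the minimum-phase counterpart
# Gamma*(alpha) = 2 - alpha has its zero at 2, outside the unit circle.
gamma = np.array([1.0, -2.0])       # coefficients of Gamma(alpha) = 1 - 2*alpha
gamma_star = np.array([2.0, -1.0])  # coefficients of Gamma*(alpha) = 2 - alpha

thetas = np.linspace(-np.pi, np.pi, 401)


def spectral_density(coefs, thetas):
    # |sum_j c_j exp(-i j theta)|^2: the spectral density generated by
    # applying the moving-average filter with coefficients c_j to white noise.
    z = np.exp(-1j * np.outer(thetas, np.arange(len(coefs))))
    transfer = z @ coefs
    return np.abs(transfer) ** 2


f = spectral_density(gamma, thetas)
f_star = spectral_density(gamma_star, thetas)

# Both filters yield 5 - 4*cos(theta), the analogue of
# C*(theta) C*(theta)' = C(-theta) C(-theta)' in the scalar case.
assert np.allclose(f, f_star)
```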

Proof of Lemma 5.6. Relation (5.7) states that

$\langle z \mid z^* \rangle^* = E\{[z_t\Gamma(L^{-1})][z_t^*\Gamma(L^{-1})]'\}$ for all $z$ and $z^*$ in $Z$. (C.29)

Since $\{z_t : -\infty < t < +\infty\}$ and $\{z_t^* : -\infty < t < +\infty\}$ are jointly stationary processes,

$\langle z \mid z^* \rangle^* = E\{z_t[\Gamma(L)\Gamma(L^{-1})'z_t^{*\prime}]\}$. (C.30)

In light of (5.17) in Lemma 5.5, (C.30) can be written equivalently as

$\langle z \mid z^* \rangle^* = E\{z_t[\Gamma^*(L^{-1})\Gamma^*(L)'z_t^{*\prime}]\}$, (C.31)

which is the first equality in (5.19). Again, using the fact that $\{z_t : -\infty < t < +\infty\}$ and $\{z_t^* : -\infty < t < +\infty\}$ are jointly stationary, it follows that

$\langle z \mid z^* \rangle^* = E\{[z_t\Gamma^*(L)][z_t^*\Gamma^*(L)]'\}$, (C.32)

which is the second equality in (5.19). Q.E.D.


Proof of Lemma 5.7. Most of this lemma is proved in the text. The only missing step is to show that $z^d$ as given by (5.20) is well-defined and in $Z$. In Lemma 5.5 it is shown that $\Gamma^*(\alpha)$ is a finite-order polynomial for which the zeroes of $\det[\Gamma^*(\alpha)]$ have absolute values that are greater than one. Hence, $\Gamma^*(\alpha)^{-1}$ is well-defined and has a power series expansion that is uniformly convergent on the closed unit disc of the complex plane. Since Assumption 5.4 is satisfied, $[d_t'\Gamma^*(L^{-1})^{-1\prime}]$ is a well-defined mean-square limit. Thus, $\operatorname{proj}\{[d_t'\Gamma^*(L^{-1})^{-1\prime}] \mid J_{-1}\}$ is well-defined and has elements in $J_{-1}$. Similarly, $\operatorname{proj}\{[d_t'\Gamma^*(L^{-1})^{-1\prime}] \mid J_{-1}\}\Gamma^*(L)^{-1}$ is a well-defined mean-square limit. Finally, since $\Gamma^*(\alpha)^{-1}$ has a power series expansion in a region containing the closed unit disc of the complex plane, $\Gamma^*(L)^{-1}$ is a one-sided operator. Also, $J_{-1}$ is a closed linear subspace of $J$, so that $\operatorname{proj}\{[d_t'\Gamma^*(L^{-1})^{-1\prime}] \mid J_{-1}\}\Gamma^*(L)^{-1}$ has elements that are in $J_{-1}$. Therefore, $z^d$ as given by (5.20) is in $Z$.
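The one-sidedness of $\Gamma^*(L)^{-1}$ rests on the zeroes of $\det[\Gamma^*(\alpha)]$ lying outside the closed unit disc, so that the power series of the inverse converges there. A minimal numerical sketch, for a hypothetical scalar polynomial $\Gamma^*(\alpha) = 2 - \alpha$ not taken from the paper:

```python
import numpy as np

# Gamma*(alpha) = 2 - alpha has its zero at alpha = 2, outside the closed
# unit disc, so 1/Gamma*(alpha) = (1/2) * sum_{j>=0} (alpha/2)^j converges
# uniformly for |alpha| <= 1; Gamma*(L)^{-1} is therefore one-sided.
gamma_star = np.array([2.0, -1.0])     # coefficients of 2 - alpha
inv_coefs = 0.5 ** np.arange(1, 31)    # 1/2, 1/4, ..., 2^{-30}: truncated inverse series

# Convolving the truncated inverse expansion with Gamma* recovers the
# identity filter up to a geometrically small truncation error.
product = np.convolve(gamma_star, inv_coefs)
identity = np.zeros_like(product)
identity[0] = 1.0
assert np.max(np.abs(product - identity)) < 1e-8
```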

References

Amemiya, T., 1977, The maximum likelihood and the nonlinear three-stage least squares estimator in the general nonlinear simultaneous equation model, Econometrica 45, 955–968.

Basmann, R.L., 1957, A generalized classical method of linear estimation of coefficients in a structural equation, Econometrica 25, 77–83.

Billingsley, P., 1961, The Lindeberg–Lévy theorem for martingales, Proceedings of the American Mathematical Society 12, 788–792.

Chamberlain, G., 1983, Asymptotic efficiency in estimation with conditional moment restrictions, Social Systems Research Institute paper 8307 (University of Wisconsin, Madison, WI).

Gordin, M.I., 1969, The central limit theorem for stationary processes, Soviet Mathematics Doklady 10, 1174–1176.

Hansen, L.P., 1982, Large sample properties of generalized method of moments estimators, Econometrica 50, 1029–1054.

Hansen, L.P., 1985, Using martingale difference approximations to obtain covariance matrix bounds for generalized method of moments estimators, Manuscript.

Hansen, L.P. and R.J. Hodrick, 1980, Forward exchange rates as optimal predictors of future spot rates: An econometric analysis, Journal of Political Economy 88, 829–853.

Hansen, L.P. and R.J. Hodrick, 1983, Risk averse speculation in the forward foreign exchange market: An econometric analysis of linear models, in: J.A. Frenkel, ed., Exchange rates and international macroeconomics (University of Chicago Press, Chicago, IL).

Hayashi, F. and C. Sims, 1983, Nearly efficient estimation of time series models with predetermined, but not exogenous, instruments, Econometrica 51, 783–798.

Jorgenson, D.W. and J. Laffont, 1974, Efficient estimation of non-linear simultaneous equations with additive disturbances, Annals of Economic and Social Measurement 3, 615–640.

Rozanov, Y.A., 1967, Stationary random processes (Holden-Day, San Francisco, CA).

Sargan, J.D., 1958, The estimation of economic relationships using instrumental variables, Econometrica 26, 393–415.

White, H., 1982, Instrumental variables regression with independent observations, Econometrica 50, 483–499.

Zygmund, A., 1959, Trigonometric series, 2nd ed., Vol. 1 (Cambridge University Press, Cambridge).