20
The Lagrangian Multiplier Test Author(s): S. D. Silvey Source: The Annals of Mathematical Statistics, Vol. 30, No. 2 (Jun., 1959), pp. 389-407 Published by: Institute of Mathematical Statistics Stable URL: http://www.jstor.org/stable/2237089 . Accessed: 28/09/2013 21:09 Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at . http://www.jstor.org/page/info/about/policies/terms.jsp . JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide range of content in a trusted digital archive. We use information technology and tools to increase productivity and facilitate new forms of scholarship. For more information about JSTOR, please contact [email protected]. . Institute of Mathematical Statistics is collaborating with JSTOR to digitize, preserve and extend access to The Annals of Mathematical Statistics. http://www.jstor.org This content downloaded from 192.12.88.148 on Sat, 28 Sep 2013 21:09:29 PM All use subject to JSTOR Terms and Conditions

Author(s): S. D. Silvey Source: The Annals of Mathematical ... · LAGRANGIAN MULTIPLIER TEST 391 where X is a Lagrangian multiplier in R', and generally in the restricted maxi- mum

  • Upload
    others

  • View
    2

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Author(s): S. D. Silvey Source: The Annals of Mathematical ... · LAGRANGIAN MULTIPLIER TEST 391 where X is a Lagrangian multiplier in R', and generally in the restricted maxi- mum

The Lagrangian Multiplier TestAuthor(s): S. D. SilveySource: The Annals of Mathematical Statistics, Vol. 30, No. 2 (Jun., 1959), pp. 389-407Published by: Institute of Mathematical StatisticsStable URL: http://www.jstor.org/stable/2237089 .

Accessed: 28/09/2013 21:09

Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at .http://www.jstor.org/page/info/about/policies/terms.jsp

.JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide range ofcontent in a trusted digital archive. We use information technology and tools to increase productivity and facilitate new formsof scholarship. For more information about JSTOR, please contact [email protected].

.

Institute of Mathematical Statistics is collaborating with JSTOR to digitize, preserve and extend access to TheAnnals of Mathematical Statistics.

http://www.jstor.org

This content downloaded from 192.12.88.148 on Sat, 28 Sep 2013 21:09:29 PMAll use subject to JSTOR Terms and Conditions

Page 2: Author(s): S. D. Silvey Source: The Annals of Mathematical ... · LAGRANGIAN MULTIPLIER TEST 391 where X is a Lagrangian multiplier in R', and generally in the restricted maxi- mum

THE LAGRANGIAN MULTIPLIER TEST

BY S. D. SILVEY

University of Glasgow 1. Introduction. One of the problems which occurs most frequently in prac-

tical statistics is that of deciding, on the basis of a number of independent ob- servations on a random variable, whether a finite dimensional parameter involved in the distribution function of the random variable belongs to a proper subset X of the set Q of possible parameters. Naturally this problem has re- ceived considerable attention and the main method which is currently applied in dealing with it is the well-known Neyman-Pearson likelihood ratio test. Direct application of this test involves finding the supremum of the likelihood function in the set X and this in turn often involves the solution of restricted likelihood equations containing a Lagrangian multiplier. And the same set of of equations has to be solved if, irrespective of the likelihood ratio test, it is desired to obtain a maximum likelihood estimate in the set X of the unknown parameter. Rather surprisingly, since the problem is of such frequent occur- rence, little seems to have appeared in statistical literature on such restricted maximum likelihood estimates, the main results in this field being cont-ined in a recent paper by Aitchison and Silvey [1].

In this paper the authors introduced, on an intuitive basis, a method of testing whether the true parameter does belong to c, this method being based on the distribution of a random Lagrangian multiplier appearing in the re- stricted likelihood equations. It is the object of this present paper to discuss this Lagrangian multiplier test. In order to do so, it is necessary to consider how the results of the previous paper must be modified when the true parameter does niot belong to the set c, because onily in this way can we obtain any notion of the power of the test. Discussion of this point forms the initial part of the present paper. We will then show the connection between the Lagrangian mul- tiplier test and the likelihood ratio test. Finally, since often in practice situations arise where the information matrix is singular, we will consider how the Lagran- gian multiplier test must be adapted to meet this contingency.

The approach adopted by Aitchison and Silvey [1] in the discussion of re- stricted estimates is essentially Cramer's approach [4] to maximum likelihood estimates, i.e., attention is concentrated on solutions of the likelihood equations rather than on genuine maximum likelihood estimates. Such an approach is really unsuitable in the present instance where we do not necessarily assume that the true parameter does belong to the subset w. And we will use instead the method used by Wald [7] in his discussion of the consistency of maximum like- lihood estimators. As has been pointed out by Kraft and Le Cam [5], Wald's approach to unrestricted maximum likelihood estimation is much more illumi-

Received March 14, 1958.

389

This content downloaded from 192.12.88.148 on Sat, 28 Sep 2013 21:09:29 PMAll use subject to JSTOR Terms and Conditions

Page 3: Author(s): S. D. Silvey Source: The Annals of Mathematical ... · LAGRANGIAN MULTIPLIER TEST 391 where X is a Lagrangian multiplier in R', and generally in the restricted maxi- mum

390 S. D. SILVEY

nating than that of Cramer and, not surprisingly, this is still true of restricted estimation. Unfortunately the change in viewpoint necessitates certain changes in the notation used by Aitchison and Silvey, and these we will now introduce in describing mathematically the situation to be discussed.

2. Notation. The basic situation in which we shall be interested is described mathematically as follows.

Corresponding to each point 0 = (01, 02, ***, 0,) in some subset Q of s-di- mensional Euclidean space, denoted by R8, is a distribution function F(, 0) defined on R1; where a is some given integer. A random variable X, taking values in R' has distribution function F( *, Go) where 0o is known to belong to Q but is otherwise unknown; though it is suspected that 0o belongs to a subset WX= n I{0:h(0) = 0} of Ql, where h = (hi, h2 , hr) is a well-behaved futnction from R' into R', r < s.

We will assume, as is usual, that for all 0 E Q, F( *, 0) is either discrete or ab- solutely continuous, and admits an elementary probability law f(, 0). Theni for a given sequence x = (x, X2, X 2 , Xn, * X ') of independent observations on X, the log-likelihood function log Ln(x,-) is defined on Q by log Ln x, 0) =

= log f(xi, 0). By a maximum likelihood estimate of Oo in any subset W* of Q, we mean an element O.(x, w*) of w* which is such that

log Ln(x, O (x, w*)) = sup log Ln(x, 0) . Sew

If a single-valued function di ( *) is thus defined for almost all x, then n (* w*) is a random variable called a maximum likelihood estimator of Oo in W*. When we refer to "almost all x" we mean almost all with respect to the probability measure defined on the sequence space of points x by the consideration that the components of a sequence x are regarded as independent observations on a random variable X with distribution function F(., 0o). Similarly "almost all t E Ra" means almost all with respect to the probability measure defined on R by F( , Oo).

The matrix whose (i, j)th element is fR. alog f(t, 0)/a0i alog f(t, o)/aoj dF(t, 0), we will denote by Be. Further, Ho will denote the s X r matrix (ahj(0)/aOj). For any real function r defined on R', DD(O) will denote the col- umn vector whose ith component is cl(O)/a0j, while D2v(O) will denote the s X s matrix whose (i, j)th component is 02r(O)/CEiCj . Generally column vectors corresponding to points in Euclidean space will be printed in the corre- sponding boldface type so that, for example, the column vector 0 corresponds to the point 0.

We will be interested initially in the emergence of 6.(x, co) as a solution of the equations

n-'D log Ln(x, 0) + Ho= 0

h(0) = 0,

This content downloaded from 192.12.88.148 on Sat, 28 Sep 2013 21:09:29 PMAll use subject to JSTOR Terms and Conditions

Page 4: Author(s): S. D. Silvey Source: The Annals of Mathematical ... · LAGRANGIAN MULTIPLIER TEST 391 where X is a Lagrangian multiplier in R', and generally in the restricted maxi- mum

LAGRANGIAN MULTIPLIER TEST 391

where X is a Lagrangian multiplier in R', and generally in the restricted maxi- mum likelihood estimator An (*, w).

3. &,(x, w) and the likelihood equations. Naturally the discussion on which we have embarked will involve the introduction of various assumptions con- cerning F and h. The assumptions that we will introduce are not designed to achieve complete mathematical generality but are, we hope, of such a nature that they will not obscure the over-all mathematical picture and will be satisfied in many practical problems. The first of these assumptions is as follows.

Assumption 1. For every 6 E Q, z(0) = f R- log f(t, 0) dF(t, Go) exists. The whole problem of maximum likelihood estimation, restricted and un-

restricted, is closely bound up with the behaviour of the function z, because the Law of Large Numbers ensures that, for each 0, the sequence (n-' log Ln(x, 0)) converges, for almost all x, to z(6). If, further, this convergence is uniform with respect to 6, then for large n and most x, nA- logLn(X,-) will be uniformly near z and under suitable conditions will attain its supremum in X near the point (if such exists) where z attains its supremum in w. The assumptions which we will now introduce are designed to achieve this desirable situation.

A ssumption 2. Q is a convex compact subset of R8. Assumption 3. For almost all t - R', log f(t, * ) is continuous on Q. Assumption 4. For almost all t c R', and for every 0 E Q, a log f(t, )/100,

(i = 1, 2, - , s) exists and la log f(t, 0)1a0jl < g(t)(i = 1, 2, * . -, s) where JfRa g(t) dF(t, 0) is finite.

Assumption 5. The funietion h is continuous on Q. Assumption 6. There exists a point 6* e w such that z(6*) > z(6) when 0 e w

and 6 # 0*. Assumptions 2-4 ensure that for almost all x the sequence (n 1 log Ln(x, 0))

converges to z(6) uniformly with respect to 0 in the set Q. Assumptions 2 and 5 ensure that w is a compact subset of R8 and consequently that any continuous function on w attains its supremum at some point of w. In particular the function log L,,(x, .), for almost all x, attains its supremum in w at some point O.(x, w) of w. Assumption 6 then ensures that for almost all x the sequence (&1(x, w)) converges to 0*. The proofs of these results are fairly straightforward and we omit them.

It is of some interest to note that if Oo - w then usually Oo will satisfy the condition demanded of 6*. This has been proved by Wald [7]. In fact, when interest is concentrated on the case where 0o - w, Assumption 6 may be replaced by the following

Assumption 6A. 0o E w and if 0 # Oo then for at least one t E R', F(t, 0) # F(t, 0o). This is sufficient to ensure that z(Oo) > z(O) if 0 $ 0o.

As stated above, Assumptions 1-6 ensure the existence of a maximum likeli- hood estimator in w of Oo which converges with probability one to 0*. If in addi- tion we make the following Assumption 7 then for large n and most x, O1(x, W) will be an interior point of w and consequently will emerge as a solution of the restricted likelihood equations, when the function h is differentiable.

This content downloaded from 192.12.88.148 on Sat, 28 Sep 2013 21:09:29 PMAll use subject to JSTOR Terms and Conditions

Page 5: Author(s): S. D. Silvey Source: The Annals of Mathematical ... · LAGRANGIAN MULTIPLIER TEST 391 where X is a Lagrangian multiplier in R', and generally in the restricted maxi- mum

392 S. D. SILVEY

Assumption 7. 0* is an interior point of w. Now making assumptions 1-7, we will use these likelihood equations in discussing the asymptotic distribution of

4. The asymptotic distribution of 6n(- wo). The method by which the asymp- totic distribution of maximum likelihood estimators is usually derived, for ex- ample by Cram&r [4], involves expanding the likelihood function by Taylor's Theorem. In order that we may adopt this method in the present instance we now introduce the following assumptions, similar to those of Cramer.

Assumption 8. The functions hi possess first and second order partial deriva- tives which are continuous (alnd so bounded) on U.

Assumption 9. For almost all t ? R' the function log f(t, *) possesses con- tinuous second order partial derivatives in a neighborhood of 0*. Also, if 0 be- longs to this neighborhood, then 102 log f(t, 0)/a0id0j1 < G,(t) (i, j = 1, 2, ... , s) where f Ra Gi(t) dF(t, 0) is finite.

Assumption 10. For almost all t E R' the function log f(t, *) possesses third order partial derivatives in a neighborhood of 0* and, if 0 is in this neighbor- hood, then

1i3 logf(t, 0)/1aiOdjd0kj < G2(t) (i,j, ck = 1, 2, *. * s),

where f R G2(t) dF(t, Go) is finite.

(4.1) Important implications for our purposes of Assumptions 4, 9 and 10 are as follows.

(4.1.1) The vector Dz(0) exists for every 0 - Q2 and the sequence (Dn'1 log Ln(x, 0)) of vectors converges for almost all x to Dz(0) (Assumption 4).

(4.1.2) The matrix D2z( 0*) exists and the sequence (D2n-1 log L. (x, 0*)) of matrices converges for almost all x to D2z(0*) (Assumption 9).

(4.1.3) For almost all x and i, j, k = 1, 2, - , s the sequence (n'103 log L,,(x, 0)/00i00j90k) is bounded uniformly with respect to 0 in a neighborhood of 0* (Assumption 10).

Each of these three statements is almost a direct consequence of the Strong Law of Large Numbers.

We are now in a position to obtain the asymptotic distribution of n(-, W). For brevity we will now write

' instead of O'(x, co). Since 0 -- 0* for almost all

x, we find by applying Taylor's Theorem and using (4.1.2) and (4.1.3) that

(4.1.4) Dn 1 log Ln(x, f9) = Dn ' log L,(x, 0*) + [D2z(0*) + o(1)] [0- ']

for almost all x. Also because of the continuity of the first partial derivatives of the functions

hi, for almost all x,

(4.1.5) H6 = Ho* + o(1)

This content downloaded from 192.12.88.148 on Sat, 28 Sep 2013 21:09:29 PMAll use subject to JSTOR Terms and Conditions

Page 6: Author(s): S. D. Silvey Source: The Annals of Mathematical ... · LAGRANGIAN MULTIPLIER TEST 391 where X is a Lagrangian multiplier in R', and generally in the restricted maxi- mum

LAGRANGIAN MULTIPLIER TEST 393

and

(4.1.6) h(s) = [H'. + o(1)][6 - 0*1. For almost any x, if n is sufficiently large, 6 will, with a certain Lagrangian

multiplier XA(x), satisfy the restricted likelihood equations. So we have, writing X in place of XA(x) for brevity,

(4.1.7) Dn-1 log Ln(x, 9*) + [D2z(0*) + o(1)][6- 0*] + He;-O,

(4.1.8) [He* + o(1)][6 -*] = 0.

Since z(0*) is a maximum in the set w of the function z, there exists a Lagran- gian multiplier X* = (Xi, A*, * ** , xr) such that

(4.1.9) Dz(0*) + Ho.I* = 0,

and on subtracting (4.1.9) from (4.1.7), and using (4.1.5) we obtain

[Dn ' log Ln(x, 9*) - Dz(o*)] + [D2z(0*) + o(1)][6 - _*1 (4.1.10)

+ [Ho. + o()][" - L*] + [Hi - Ho.*]X*- 0.

Now on expanding the elements of the matrix H6 by Taylor's Theorem, we find that, because of the continuity of the second order partial derivatives of the func- tions hi, for almost all x,

(4.1.11) [H6 - Ho*.]* = [z Xi D2hi(0*) + o(i)] [6 - ].

We will denote by - Bet the matrix D2z( 8*) + 5=1 X*D hi(*). Then on substituting in (4.1.10) the expression for [H& - Ho*]I* contained in (4.1.11) we have

[Bet* + o(l)][6 - J*] -[Ho* + o(l)][l-

= Dn ' log Ln(x, 9*) -Dz(0*),

and combining (4.1.12) and (4.1.8) we may write L Bet + o(l) - H + o(l)1F6-6*1 l-Ho* + o(1) 0 ; *

(4.1.13) -H +o10 [Dn-1 log Ln(x, 0*) -Dz(0*)

L ~0

We will now make the final assumptions which enable us to derive the asymp- totic distribution of f(, co) and A(.)

Assumption 11. The matrix

L-H o -H| H* 0J

is non-singular.

This content downloaded from 192.12.88.148 on Sat, 28 Sep 2013 21:09:29 PMAll use subject to JSTOR Terms and Conditions

Page 7: Author(s): S. D. Silvey Source: The Annals of Mathematical ... · LAGRANGIAN MULTIPLIER TEST 391 where X is a Lagrangian multiplier in R', and generally in the restricted maxi- mum

394 s. D. SILVEY

Assumption 12. For i, j = 1, 2, , s, f3xj(0*) = fRa( log f(t, 0*)/1a90 a log f(t, 0*)/aojdF(t, Do) exists.

We now define

Pt Q Wj[ r s He.}

lQt Rt* J -Ho o j and VO* = (3ij((*)) - [Dz(0*)I[Dz(0*)I'. By the multivariate form of a Cen- tral Limit Theorem (Cram6r [3]) it follows from the existence of the matrix Vo. that the distribution of V\n[Dn-1 log Ln(., 0*) - Dz(0*)] is asymptotically normal with mean 0 and variance matrix Vo*. Then from (4.1.13), by the multi- variate extension of a theorem of Cram6r [4] we have the results stated in the following lemma.

LEMMA 1. Under Assumptions 1-12 the random vector

/n [;< () _ O*]

is asymptotically normally distributed with mean 0 and variance matrix

rPoV*t* pt* Qt *l [Qt' Vo.* PI Qt Ve. Qoj

We have now obtained a formal result regarding the behavior for large n of the restricted maximum likelihood estimator, a result which might be used in most practical situations to determine the large sample power function of the test of the hypothesis that 0o c w, proposed by Aitchison and Silvrey. (This might involve a considerable amount of computation). The extent to which the method of solving the likelihood equations which is proposed in the same paper can be used when Oo e w remains obscure, as does any general picture of the power of the test. However some light is shed on these questions by considering how the results here obtained particularize in the case when 0o e w.

(4.2) Accordingly we consider what happens when we replace Assumption 6 by Assumption 6A. Then Oo replaces 0* and z(Oo), the maximum of z in the set w, is also the maximum of z in the set U. Hence Dz(0*) = 0 and X* = 0. The *matrix Vo* becomes the matrix Bo,, and, with the mild additional assumption

Assumption 13. f R. 02f(t, 0o)/&0Oi90j dt 0 O (i, j = 1, 2, *.* , s), the matrix Bt* also becomes Be6. Consequently we have exactly the result of the previous paper [1] concerning the asymptotic distribution of the restricted estimator and the corresponding Lagrangian multiplier. The assumptions made here in deriv- ing this distribution are, so far as comparison is possible, stronger than the assumptions of the previous paper, but we have now obtained a result concern- ing the geniuine maximum likelihood estimator rather than merely a solution of the likelihood equations. (A greater degree of similarity betwveen the two sets

This content downloaded from 192.12.88.148 on Sat, 28 Sep 2013 21:09:29 PMAll use subject to JSTOR Terms and Conditions

Page 8: Author(s): S. D. Silvey Source: The Annals of Mathematical ... · LAGRANGIAN MULTIPLIER TEST 391 where X is a Lagrangian multiplier in R', and generally in the restricted maxi- mum

LAGRANGIAN MULTIPLIER TEST 395

of assumptions is apparent if we note that in the case where Oo E co we might replace Assumption 11 by the following

Assumption 11A. The matrix Beo is positive definite and the matrix Heo is of rank r).

(4.3) It is now possible to obtain a picture of the typical practical situation when n is large and 0o, while not belonging to the set w, is very near this set. Usually then z(Oo) will be supees z(0) and 0* will be near 0o so that Dz(0*) will be near Dz( Go) = 0. Then X* also will be near 0, though, since n is large, VnkX* may be- appreciably different from 0. Also the elements of D2z(O*) will be near those of D2z(Oo) = -B6o. If in addition f3ij (0*) is near the corresponding element of B0,, as will usually be the case, then we can say that approximately

v\n [6n~' z o-*

will have a multivariate normal distribution with mean 0 and variance matrix

rPoo ? [Po -Reol

this matrix being as defined in [1]. (It would be possible to give a rigorous mathe- matical derivation of this result by imagining the true parameter 0o to vary with n in such a way that the distance of 0o from the set w tended to 0 as n -00, and by imposing suitable restrictions on the functions f and h to ensure that what is here said to happen usually would in fact happen. But this does not seem particularly profitable).

(4.4) Finally in this connection, because of the remarks made in the pre- vious paragraph and of the flexibility of Newton's method of solving equations, we might expect that, in the case where 0o is near the set w and n is large, the iterative method of solving the restricted likelihood equations suggested in [1] will still apply.

5. Three tests of the hypothesis that 0o co. We will now compare three in- tuitively reasonable tests of the hypothesis that Oo E w. These are as follows.

(i) The likelihood ratio test. We accept the hypothesis if ,(x) = sUpoew Ln(x, 0)/supo,s Ln(x, 0) is "sufficiently near" 1.

(ii) The TVald test. Assuming the existence of On(x, Q), we accept the hypothe- sis if h(&n(x, Q)) is "sufficiently near" 0. (Wald [8]).

(iii) The Lagrangian multiplier test. Assuming the existence of &n(x, w) and An(x) we accept the hypothesis if An(x) is "sufficiently near" 0. (Aitchison and Silvey [1]).

For typographical brevity we will now write 0 for the unrestricted maximum likelihood estimator f)(* *X),

' for the restricted maximum, likelihood estimator

61( 7 w) and X for the random variable XA(.)-

The measure of the distance from 0 of h(6) used by Wald is, in our notation,

This content downloaded from 192.12.88.148 on Sat, 28 Sep 2013 21:09:29 PMAll use subject to JSTOR Terms and Conditions

Page 9: Author(s): S. D. Silvey Source: The Annals of Mathematical ... · LAGRANGIAN MULTIPLIER TEST 391 where X is a Lagrangian multiplier in R', and generally in the restricted maxi- mum

396 S. D. SILVEY

-n[h(6)]'Rd[h(6)]; his test is based on this random-i variable and he has showni that under genieral conditions the asymptotic distributions of -2 log A and -n[h(0)]'R6[h(6)] are the same. The measure of the distance from 0 of X in the test proposed by Aitchison and Silvey is -nA'Re'5;. We will now show that subject to the following assumptions A we have

plim 2 log , = plim n[h(0)]'R6[h(6)] = plim n45R'Rem5Z.

Assumptions A. By assumptionis A we mean the following set of assumptions: 1-5, 6A, 7-10, 11A, 13 and

Assumption 12A. The matrix Bo exists in a neighborhood of Oo, and its elements are continuous functions of 0 there. Of course when assumption 6A is made, 0* is replaced by Go in subsequent assumptions.

We have already seen that these assumptions imply that ' exists and almost

certainly converges to Go, and that for large n and most x, An(x) exists. It is not difficult to use the particular form to which (4.1;13) reduces when assump- tion 6A replaces assumption 6 to obtain the results

(5.1) n (' - Oo) -n-'PoOD log Ln( *, Go) + op(1),

(5.2) n-2Q1oD log L,(-, Go) + op(1).

Here op is used in the sense of MIann and Wald [6] and P0o, Qo0 are defined by

(5.3)LB0 Hj [P0 Ql [-Hoo 0o LQOO Rooj

Also it is easy to show by the same kind of argument as has been applied above that the assumptions A imply that 0 exists and almost certainly converges to Oo and that

(5.4) for almost any x and sufficiently large n,

D log L[x, O(x)] = 0,

(5.5) /n(b - Oo) = n'B-'D log L( ,Go) +o(1).

We will now use these results to prove the following lemmas. LEMMA 2. Subject to assumptions A,

-2 log n(= n - -) b3Be0( - 6b) + oP(1).

PROOF. Clearly from (5.1) and (5.5). , - Ol = Op(n-). Hence on expand- ing log Ln(-, 0x) by Taylor's Theorem, we have, in virtue of (4.1.3) and (5.4)

log Ln(., 0) = log L.( , 0) + 6)- [D2 log Ln(., 0)][0 - 61 + op(1).

Again from Taylor's Theorem we have n-'D2 log L.(- 0) = n-1D2 log L,,(, Go) + op(1), and from (4.1.2) and assumptions 9 and 13 (which imply DmZ(Go) = -Bog)

im-1'D log L,(-, Go) = -Boo + op(l).

This content downloaded from 192.12.88.148 on Sat, 28 Sep 2013 21:09:29 PMAll use subject to JSTOR Terms and Conditions

Page 10: Author(s): S. D. Silvey Source: The Annals of Mathematical ... · LAGRANGIAN MULTIPLIER TEST 391 where X is a Lagrangian multiplier in R', and generally in the restricted maxi- mum

LAGRANGIAN MULTIPLIER TEST 397

Hence

log ,u = log L(, 0) -log L.(- 0)

2n(6 6)'[Boo + o,(1)](6 - 6) + os,(1),

and the result follows because 110 - Oil = O0(n-). LEMMA 3. Subject to assumptions A, 2 log , = n"'R-' I + op(l). PROOF. We have

V/n(6-b6) n (Poo - Bon)D log Ln( G Xo) + op(l).

Now

[P00 - B']Boo[Poo - B_] = B-P = -QPoR-oQ@0

these matrix relationships following easily from the definition of POO, QoO and Roo in (5.3). Hence

n(6' - )'Boo(6 - 6) = -n-'[D log L.( *, 0o)'QooR-01Q'0[D log L.(., Go)] + op(l)

= -nx'Rs'5; + op(l), by (5.2).

Since, according to assumption 12A, the elements of the matrix Bo are continu- ous functions in a neighborhood of Go, and by 1lA Boo is positive definite, Bo will also be positive definite in a neighborhood of Go. Similarly Ho is of rank r in a neighborhood of Oo and so the matrix Ro exists and its elements are con- tinuous functions of 0 in a neighborhood of Oo . It follows from the strong conver- gence of 0 to Go that R`1 = R-? + op(l), and this completes the proof.

LEMMA 4. Subject to assumptions A, 2 log ,u = n[h(O)]'Ri[h(O)] + op(l). PROOF. Since the second derivatives of the functions hi are bounded on Q2

(assumption 9) and since 10 - Ooll is Op(n x), we have

h(O) -h(Oo) + H'0(b - 6O) + Op(n-')

- H O( 6- Oo) + Op(n-1),

since by 6A, G0o w . Hence V\nh(O) = n 'H' Bo-1D log L,,( X, Go) + op(1) and n[h(0)]'Ro0[h()] = n l[D log Ln( *, Go)]'B-1Ho0RooHeoB- [D logLn( X Gto)] + op(1). It is easy to show that BoO'HeOR0OHo03B' = Qo0R-1Q'O, and it follows that n[h(6)]'Ro0[h(6)] = ni'R-001 + op(1). The proof is then completed by the remark that, as in Lemma 3, Roo = R, + op(1).

LEMMA 5. Subject to assumptions A, each of the random variables -2 log , -n[h(0)]'R4[h(6)] and -nX'Re^ 1i is asymptotically distributed as x2 with r degrees of freedom.

This follows from lemmas 3 and 4 and from the fact that Vn5 is asymptoti- cally normally distributed with mean 0 and variance matrix -Ri.

In consequence of lemma 5, when n is large the natural choices of critical re- gions of size a for testing the hypothesis that G0 E w on the bases (i), (ii) and (iii)

This content downloaded from 192.12.88.148 on Sat, 28 Sep 2013 21:09:29 PMAll use subject to JSTOR Terms and Conditions

Page 11: Author(s): S. D. Silvey Source: The Annals of Mathematical ... · LAGRANGIAN MULTIPLIER TEST 391 where X is a Lagrangian multiplier in R', and generally in the restricted maxi- mum

398 S. D. SILVEY

are Cl 02 and C3 respectively where

C0 is the set of x on which -2 log IA> ka X.

C2 is the set of x on which -n[h(6)]'Re[h(6)] > k.,X and

C3 iS the set of x on which -n5k'RIe' > ka. Here ka is determined by Pr{ X(] > ka} - a.

Wald [8] has shown that usually the tests based on the critical regions C1 and C2 have asymptotically the same power. His argument shows essentially that if n is large and 0o is not near e, the power of each test is near 1, while if 0o is near w each of the random variables -2 log ,u and -n[h(O)I'R4h(O)] has approximately a non-central x2-distribution with the same parameters. We now inquire, without going into rigorous mathematical detail, whether this type of argument will usually hold when we compare the tests based on the critical regions Cl and C3.

We consider first what happens when n is large and Oo is near W. Then as we have seen, 0* will usually be near Oo anl we suppose that 0o is near enough co to ensure that 0* - 0 is near 0, though -Vn(0* - O0o) may be appreciably dif- ferent from 0. In virtue of the remarks made in (4.3) we will then have, in most practical situations,

(5.6) Vn(4~ O-*) n Pe D log Ln( *0 ), (5.7) V/n(5; - *) 'n QoOD'0 log Ln(*, 00)

where ' denotes approximate equality with probability near 1, for large n. Also since Dz(0*) + Ho4** = 0 and since usually Dz(Oo) = 0 and D2z(Oo)= -Boo, we will have

(5.8) Be (0* - Oo) + Heol* = 0, approximately. Since the distribution of 6 does not depend on whether Oo is in W or not, it will remain true (see (5.5)) that

(5.9) V/n( - Oo) - n Bo-,1D log Ln( 0*). Also examination of the details of the proof of lemma 2 shows that the result there obtained, namely

(5.10) -2 log , n(' - 6)Bo(O - )

still holds. Now from (5.6) and (5.9) we have

>- 6) +Vn(4* - 0) + n i(Peo - Bo1)D log Ln(*, 0f) V An(0 - Oo) + n'QoOR-1QIOD log Ln(., 00) - V/iB@6lH,o1* + VnQ,OR-'(5-

by (5.8) anid (5.7). It is not difficult to show that Q00Ro0 = Boo Hoo, and so

This content downloaded from 192.12.88.148 on Sat, 28 Sep 2013 21:09:29 PMAll use subject to JSTOR Terms and Conditions

Page 12: Author(s): S. D. Silvey Source: The Annals of Mathematical ... · LAGRANGIAN MULTIPLIER TEST 391 where X is a Lagrangian multiplier in R', and generally in the restricted maxi- mum

LAGRANGIAN MULTIPLIER TEST 399

V/n( 6 -6) 2BoO1Heo0. Hence, in the usual practical situation; when n is large and 0o is near enough w to ensure that 0* - 0o is near 0, we will have

-2 log - n(' - 6)'B9o(6 - 6) - n5'H IB-1Heo0 = ni'R-'5- n_'Re^L,, and consequently the tests based on the critical regions C1 and C3 will have ap- proximately the same power in these circumstances. Moreover it is easy to see that each of the random variables -2 log jA and -ni"'Ri01; will then have ap- proximately a non-central x2-distribution with r degrees of freedom and param- eter - n*'Ro^%*. (Again this argument could clearly be made rigorous by imagin- ing 0o to vary with n in such a way that 110* - Ooll = 0(n-') and by imposing suitable conditions on the functions f and h).

We now consider the power of the Lagrangian multiplier test when n is large and 0o is not near w. Then the asymptotic distribution of 'n will usually be as given in Lemma 1. Now, if X* is not near 0, then with a high probability x will be far from 0 and since normally the matrix -Ro will be positive definite, the power of the test based on C3 will be near 1. However there is a possibility that Oo might be such that the function z has a stationary value at 0*, in which case A* = 0. Then -ni'RA' would not necessarily be large with a high prob- ability and consequently the power of the test based on C3 would not be near 1 for such a Oo . But this is a contingency which does not seem likely to arise often (the author has been unable to find an example of it) and we may conclude that in most practical situations the Lagrangian multiplier test is equivalent, for large samples, to the likelihood ratio test.

6. Singular information matrices. As we have said previously the whole problem of maximum likelihood estimation is closely bound up with the be- havior of the function z. In particular, for unrestricted estimation it is important that z should have a maximnum turning value in Q at 0o, for this condition plays an important part in ensuring consistency of f9n( *, il). Now the demands that z(Oo) should be a maximum turning value of z in Q and that Boo should be posi- tive definite are not unrelated. For it is usually true that z has a stationary value at Oo, i.e., that Dz(Oo) = 0 and also that D2z(Oo) = -Beo: these results depend only on f being such that we can "differentiate under the integral sign." So that if 0 is near Oo we will usually have

(6.1) z(0) - z(Oo) = -1(0 - Oo)'B6o(0 - 00) + 0(110 - Oo 11).

Hence if B9o is not positive definite it may very well happen that z(Oo) is not a maximum turning value of z in Q and much of unirestricted estimation theory would then break down.

However, even if Boo is not positive definite and z(Oo) is not a maximum turn- ing value of z in Q, it may still be the case that if Oo belongs to the subset X of Q, z(Oo) is a maximum turning value of z in w so that restricted estimation theory may not need drastic revision. And it is of some theoretical interest to consider Just what revision is necessary in this case. Moreover this problem is of

This content downloaded from 192.12.88.148 on Sat, 28 Sep 2013 21:09:29 PMAll use subject to JSTOR Terms and Conditions

Page 13: Author(s): S. D. Silvey Source: The Annals of Mathematical ... · LAGRANGIAN MULTIPLIER TEST 391 where X is a Lagrangian multiplier in R', and generally in the restricted maxi- mum

400 S. D. SlLVEY

practical interest because it often happens that it is niatural, either for reasons of symmetry or for some other reason, to describe the distribution of a random variable in terms of a parameter 0 in such a way that neither is Bo0 positive defi- nite nor is z(Oo) a maximum of z in Q. For instance if X has a multinomial dis- tribution and describes an experiment in which an individual can fall into any one of s classes, it is natural for reasons of symmetry to denote the probabilities associated with the different classes by j/E!=j 0i (i = 1, 2, ... , s). The set Q of possible parameters is {0 E R8: Oi > 0 (i = 1,2, , s)}, and it is easy to verify that neither is B9 positive definite for any 0 in Q nor is z(Oo) a maximum turning value of z in U. (In this case it is clear that this is so because we have set in s-dimensional space a parameter that is really (s - 1)-dimensional). However it is obvious that there is no difficulty about restricted estimation in the subset of Q in which Z=1 Oi = 1.

We will Inow consider what revision is necessary of that part of the foregoing theory based on the assumptions A, if we drop the demand that Bo, be positive definite (assumption 1 A) and replace assumption 6A by the following assump- tion 6B, while maintaining the remainder of the assumptions A.

Assumption 6B. 0o E X and for any other point 0 of w, F(t, 0) $ F(t, O0) for at least one t. Roughly speaking, we may explain the introduction of assumption GB as follows. If assumption 6A is not satisfied, the parameter is not identifiable .in the set Q, i.e., there are different 0's in Qv which give the same distribution of X. However we wish 0o to be identifiable in the subset w, in order that restricted estimation may still be possible. Hence we make assumption 6B.

It is easy to verify that these assumptions imply the existence of a consistent estimator 6Q,( * X w) of Oo, that for almost any x and sufficiently large n, On(x, w) with a Lagrangian multiplier 'X(x) satisfies the restricted likelihood equations and that

'G2) F BO0 + o(i) -H?0 + o(1)1 6F(x,c) - 0 =] Dn logLn(x,0o)] H) L- , ? o(1) 0 JL W (x

for almost any x. Now however, since we have dropped the requirement that Boo be positive definite and since subsequent theory concerning the asymptotic distributions of fn(., w), X',, and associated random variables makes considerable use of the inverse of Boo, this theory no longer applies. To enable us to replace this theory we will now introduce assumption 1lB which is associated with assumption GB in the same manner as 1lA was shown at the beginning of this section to be associated with 6A. This assumption will provide a natural connec- tion between properties of the matrix Boo, the subset w and the facts that 00 is identifiable when it is kniown to belong to X (assumption 6B), but unidenti- fiable in S.

A ssumption 11 B. The matrix Hoo is of rank r. The matrix Bo. is of rank s -t where t ? r. There exists an s X t sub-matrix Hi of Ho. such that Bo. + H,Hi is positive definite. (Without any loss of generality we may assume that Hi is

This content downloaded from 192.12.88.148 on Sat, 28 Sep 2013 21:09:29 PMAll use subject to JSTOR Terms and Conditions

Page 14: Author(s): S. D. Silvey Source: The Annals of Mathematical ... · LAGRANGIAN MULTIPLIER TEST 391 where X is a Lagrangian multiplier in R', and generally in the restricted maxi- mum

LAGRANGIAN MULTIPLIER TEST 401

the matrix composed of the first t columns of Ho, and we may write

H,o = [H1 H2]).

We will now define the set of assumptions B. Assumptions B. By assumptions B we will mean the set of assumptions A

with 6B and 1lB replacing 6A and 1lA respectively. Now subject to assumptions B, if y denotes an s-dimensional random vector

normally distributed with mean 0 and variance matrix Boo and if we write 0 in place of- O'Q, w) and A in place of X',, then from (6.2) we have, as before,

and since /n Hoo(6 - Oo) - 0 it follows that

(G.4) + [Beo +Hi Hi H] -Ho Oo] [Y]

Since B,o + HiH/ is positive definite and Hoo is of rank r, the matrix

rBoo + HI Hi -Hoo L-Ho I 0 0 J

is non-singular and wve define PO*o, Q'* and R* by

(6.5) [Po Q0 Bo] [o ?Hi -HH

We will also define Soo by

It O (G.6) Soo = -Roo - 0]

where It denotes the unit t X t matrix. WVe will now prove two lemmas concerning the distributions of statistics in

which we are interested. LEMMA 6. Subject to assumptions B, the vector

0 0

is asymptotically normally distributed twith mcan 0 and variance matrix

[P 0o

O so]

This content downloaded from 192.12.88.148 on Sat, 28 Sep 2013 21:09:29 PMAll use subject to JSTOR Terms and Conditions

Page 15: Author(s): S. D. Silvey Source: The Annals of Mathematical ... · LAGRANGIAN MULTIPLIER TEST 391 where X is a Lagrangian multiplier in R', and generally in the restricted maxi- mum

4{)2 S. D. SILVEY

PROOF. From (6.4) we have, as previously, the result that

[ Oo

is asymptotically normal with mean 0 and variance matrix

PO Bso Poo PO Boo Qoo

lQoO Bo0 Poo QOO B0o Q*,j Now Pe-o B00 PoO = Peo (Boo + H1H')P* - P*H1HP*O and, as previously, the first term on the right hand side of this equation is P@o . Also from (6.5)

Po*Hoo = 0 = 0

and in particular P* H1 = 0. It follows that P* Boo P* = P* ; and in a similar manner it may be shown that PooBoo Qoo = 0. We also have

QOOBooQoo = -RnO-QOO H1 HI Qoo ,

and from (6.5) Q0UHoo = -I, so that, in particular, QeOH, = -[It 0]'. It follows that

Qoo B0o Qoo -Ro-o LO = So ,

and this completes the proof. LEMMA 7. Subject to assumptions B, -n1'R*^11' is asymptotically distributed

as x2 with r - t degrees of freedom. PROOF. Since Boo + H1H1 is positive definite and Boo is of rank s -t, there

exists a non-singular matrix W such that

W'(Bo + HI1H1)W = I8

and

As-t O

W'Boo W A0 0]

where A0-t is a diagonal s - t X s - t matrix. Then

t ^~~8-t ?

W'H1 Hi W=Is-[ 0

and since H1H/ is of ranlk t, it follows that As-t = I and that

This content downloaded from 192.12.88.148 on Sat, 28 Sep 2013 21:09:29 PMAll use subject to JSTOR Terms and Conditions

Page 16: Author(s): S. D. Silvey Source: The Annals of Mathematical ... · LAGRANGIAN MULTIPLIER TEST 391 where X is a Lagrangian multiplier in R', and generally in the restricted maxi- mum

LAGRANGIAN MULTIPLIER TEST 403

We now define an s-dimnensional random variable m = (ml , M2 M * mS)

by m = W'y. Then m is inormally distributed with mean 0 and variance matrix

Is_t O W'Boo W ; 0]

It followvs that ni1, i2, ***, mit are independent N(O, 1) random variables, while mist?l = Ms-t+2 = i = 0-.

Now from (6.4) wre have

[(WW) -Heo[6 Oo] [W m]

and so

(6.7) [W' -W'H00 [6 OoIf [m]

Hence

m'm 0 [6 '[(WW'Y)1 + Heo H0 -HWo W HJ ? m m,,- n I I e etej

i.e., since Ho, (6 o-o) 0,

(6.8) m'm 4 - n ]'Boo[ - OoI + n5I'Heo WW'Hoo 1"

Now from (6.4) Vn(-- -O) PO*(W')-'m and, as previously, P* is of rank s - r. Ilence asymptotically, when n[6 - Oo]'B8o[6 - Oo] is expressed as a quadratic form in ml , M2 . , mS8- , its rank is at most s - r. We will n1ow show that when n1"'HooWW'Ho05l is expressed as a quadratic form in ml, M2,

* n*,-t , its rank is at most r - t. From (6.7) we have, again since Hso( o- Oo) 0,

-H@OWm '-. VnH@o0W'H/o01.

Now

, ~H1 Wm H

I Wm = -H W]

and, since _ _

M'W'Hi HWmi-nm' O m =0, LO ItJ

we have H'Wm = 0. Heince

- VnHol, WW'Hoo I[ Wm.

This content downloaded from 192.12.88.148 on Sat, 28 Sep 2013 21:09:29 PMAll use subject to JSTOR Terms and Conditions

Page 17: Author(s): S. D. Silvey Source: The Annals of Mathematical ... · LAGRANGIAN MULTIPLIER TEST 391 where X is a Lagrangian multiplier in R', and generally in the restricted maxi- mum

404 S. D. SILVEY

Since the rank of H2 is at most r - t, it follows that, asymptotically, when n?,,'H 0WW'H8o0 is expressed as a quadratic form in mi , M2, ***, m8t, its rank is at most r - t. Now from (6.8) by applying Cochran's Theorem (Cram6r [41) we have the result that asymptotically n[' - Oo]'Boj[0 - 00] and

n5'HIIWW'Heo`

are independently distributed as x2 with s - r and r - t degrees of freedom respectively.

The proof of Lemma 7 is completed by the remarks that

HeoWW'H0o = H0o(Boo + HiH;)-1Hoo = -R

ari-d that RsO-' Ra'. The results proved in this section, and the methods of proof, make it clear

how the technique suggested by Aitchison and Silvey [1] for solving the re- stricted likelihood equations can usually be adapted, and how the Lagrangian multiplier test can usually be applied when the matrix Boo is singular and the function h is suitable. We will not amplify this point.

7. Different numbers of observations on several random variables. Experi- mental material beinig what it is, and experimenters being as they are, it is not often that the statistician is faced with an estimation problem in the ideal cir- cumstances of being given a number of observations on a vector valued random variable. The more usual situation confronting him is that he is given n1 ob- servations on a random variable X1 whose probability density function depends onl s8 parameters 01, 02, * - X - X n2 observations on a random variable X2 whose probability density functioii depends on S2 parameters 081+1, 081+2, ***,

081?+2, .... and nk observations on a random variable Xk, whose probability density function depends on Sk parameters 08?+82+-. +8k_1+1i * *, 8, where s = s3 + 82 + - + Sk. And he is presented with the problem of deciding whether the true parameter 0o = (0?, 00, * , 00) belongs to a set

co = {I0o:h(0) =0,

Q and h being as before. If ni = n2 = ... = nk then we may interpret the ob- servations as observations on a vector valued random variable and the fore- going theory applies. But if the n's are not all equal we cannot do this, and in order to enlarge the sphere of the Lagrangialn multiplier test we have to con- sider this situation separately. In discussing it we will avoid all mathematical detail and will be content to indicate very briefly the modifications necessary in the test.

We will denote by x* a given set of ni + n2 + * + nk observations on the random variables XI, X2, * * *, Xk, and log L (x*, 0) will denote the value of the log-likelihood function at the point 0. Now if Oo E X then the same kind of argument as we have used before may be used to show that it will usually be the case that 0(xc*, w) exists, is near 0o when ni , n2, **. , nki, and nk are all

This content downloaded from 192.12.88.148 on Sat, 28 Sep 2013 21:09:29 PMAll use subject to JSTOR Terms and Conditions

Page 18: Author(s): S. D. Silvey Source: The Annals of Mathematical ... · LAGRANGIAN MULTIPLIER TEST 391 where X is a Lagrangian multiplier in R', and generally in the restricted maxi- mum

LAGRANGIAN MULTIPLIER TEST 405

large, and with a Lagrangian multiplier X(x*) satisfies the likelihood equations

D log L(x*, 0) + Hol = 0

h(O) = 0.

We now introduce a matrix N defined by

ni 0 .. 0]

K n72 IL2 0

N=

L o nk L.J

The information matrix Bo is defined in this case by

Be- -N '[EoD2 log L(., 0)],

whiere Eo denotes expected value when 0 is take as the true parameter. Then again by the type of argument used previously we may show that for most x* , when ni, n2, * , nk are large,

(7.1) [NB0Noo -Ho] ( ) - D log L(x*, Oo[] LHoo 0 iL 5WX*) I ' 0 1

Also it will usually be true that D log L(x*, do) can he regarded as an observa- tion on a random variable which is approximately normal with mean 0 and variance matrix NBo .

Now in the case where Boo is positive definite we may use (7.1) in the same way as before to show that when 0o E w and n,, n2* nk are large,

L.'H4NRBd] 1H6I

will usually be distributed approximately as x2 with r degrees of freedom, and it is this statistic which we use in the modified form of the Lagrangiatn multi- plier test. Alternatively when Bo, is of rank s - t, when each of the functions hi, h2 , * -, hi is a function of only the parameters involved in the distribution of one of the X's and Bo, + H1H' is positive definite, the statistic on wlhich the test is based is 2/H4[N(B6 + H1H")K 1Hdi, which will usually be distributed

2 as x with r - t degrees of freedom when ni , n2 . , nk are large. We conclude by applying the Lagrangiaii mnultiplier test in a familiar situa-

tion. Homogeneity in the 2 X 2 contingency table. One of the three situa-

tions (Cochran [2]) in which the 2 X 2 con1tingency table arises is as follows. We are given ni observations oIn a random variable X1 whose distribution is defined by

Pr{il = (1, O)} = 07(0? + 0?)

Pr'Xi = (0, 1)} -? (0? + 6?),

This content downloaded from 192.12.88.148 on Sat, 28 Sep 2013 21:09:29 PMAll use subject to JSTOR Terms and Conditions

Page 19: Author(s): S. D. Silvey Source: The Annals of Mathematical ... · LAGRANGIAN MULTIPLIER TEST 391 where X is a Lagrangian multiplier in R', and generally in the restricted maxi- mum

406 s. D. SILVEY

and n2 observations on an ildepenident r-alndomn variable X2 whose distribution is defined similarly in terms of 03 and 04 . These observations can be summiiariised in a 2 X 2 contingency table as follows.

Number of occurrences of different values of Xi and X2.

(1, 0) (0, 1) Total

Xi nl n1 ni X2 n2l 122 fl2

Total MI m2 n

We suppose that the point 0C = (01 , 00, 03, O) is kniown to belong to the set Q { R 4:E _ O < 1/e (i = 1, 2, 3, 4)1 where E is a snmall positive inumber. In this case we also have

log L(x*, 0) -constant + nl1 log 01 + n12 log 02 - n lo3g (01 + 02)

+ n2 log0 03 + f122 log 04 - It2 1(g ( 03 + 0).

The matrix

[- O'- (01 + 02)-l -(1 + 02) 0 0

-(01 + 02) o2 - (01 + 02) 0 0

0- _ (0 + 04) 1 (03 + 04)

L 0 0 (03 + 04)< 041 - (3 + 04) _<

has rank 2. Homogeneity of X1 and X2 means that 01/(0 + -- 02) =03/( + -4) and we consider estimating Oq subject to the restrictions

01 + 02 - 1

h(0) - . + f 044- 1 =0,

L 01- 03 J

so that

_ l 0 O He= 0 1 -1.

0 1 0

If H1 is the leading 4 X 2 sub-matrix of Ho, then for any 0 r ,

B8 + H, H = o- o0 0 0- 0 0 j 1 0 0-1 0 0 Be + Hi Hi =

_ 0 0 04 which is positive definite.

This content downloaded from 192.12.88.148 on Sat, 28 Sep 2013 21:09:29 PMAll use subject to JSTOR Terms and Conditions

Page 20: Author(s): S. D. Silvey Source: The Annals of Mathematical ... · LAGRANGIAN MULTIPLIER TEST 391 where X is a Lagrangian multiplier in R', and generally in the restricted maxi- mum

JiAGIRANGIAN MUL4TIPLIER TEST 407

rThe likelihood equatiols are easily solved in this case and we find that

81(x*, co) = 03(x*, w) = m/n

while 02(V*, W) 04(X*, ,) = M2/n. It is n-ot difficult to verify that the statistic ^'H;[N(B6 + H,H')]-p1H4 is the usual statistic used in the x2-test of homo- geneity in a 2 X 2 table, so that this test is a particular case of the Lagrangian multiplier test. And it illustrates most aspects of the preceding theory. The computational procedure for applying the Lagrangian multiplier test in less familiar and mnore complicated situations will be set out in a subsequenit paper.

REFERENCES

11] J. AITCHISON ANID S. I). SILVEY, "Maximutm likelihood estimation of parameters sub- je(t to restraInts," Ann. Math. Stat., Vol. 29 (1958), pp. 813-828.

[21 W. G. COCHRAN, "The x2-test of goodness of fit," Ann. Math. Stat., Vol. 23 (1952), pp. 315-345.

[3l H. CRAMiR, ?antdota V'ariables and Probability Distributions, Cambridge University Press, 1937.

[41 H. CRAME'R, Matheimatical Methods of Statistics, Princeton University Press, 1946. [51 C. KRAFT AND L. LECAM, "A remark on the roots of the maximum likelihood equation,"

Ann. Math. Stat., Vol. 27 (1956), pp. 1174-1177. [61 H. B. MANN AND A. WALD, "On stochastic limit and order relationships," Ann. Math.

Stat., Vol. 14 (1943), pp. 217-226. [71 A. WALD, "Note oni the consistency of the maximum likelihood estimate," Ann. lMath.

Stat., Vol. 20 (1949), pp. 595-601. [81 A. WALD, "Tests of statistical hypotheses concerning several parameters when the

number of observations is large," Trans. Am. Math. Soc., Vol. 54 (1943), pp. 426- 482.

This content downloaded from 192.12.88.148 on Sat, 28 Sep 2013 21:09:29 PMAll use subject to JSTOR Terms and Conditions