5

Click here to load reader

A Note on Optimal Subset Selection Procedures

Embed Size (px)

Citation preview

Page 1: A Note on Optimal Subset Selection Procedures

A Note on Optimal Subset Selection ProceduresAuthor(s): Shanti S. Gupta and Deng-Yuan HuangSource: The Annals of Statistics, Vol. 8, No. 5 (Sep., 1980), pp. 1164-1167Published by: Institute of Mathematical StatisticsStable URL: http://www.jstor.org/stable/2240447 .

Accessed: 22/12/2014 19:39

Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at .http://www.jstor.org/page/info/about/policies/terms.jsp

.JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide range ofcontent in a trusted digital archive. We use information technology and tools to increase productivity and facilitate new formsof scholarship. For more information about JSTOR, please contact [email protected].

.

Institute of Mathematical Statistics is collaborating with JSTOR to digitize, preserve and extend access to TheAnnals of Statistics.

http://www.jstor.org

This content downloaded from 128.235.251.160 on Mon, 22 Dec 2014 19:39:46 PMAll use subject to JSTOR Terms and Conditions

Page 2: A Note on Optimal Subset Selection Procedures

The Annals of Statistics 1980, Vol. 8, No. 5, 1164-1167

A NOTE ON OPTIMAL SUBSET SELECT1ON PROCEDURES'

BY SHANTI S. GUPTA AND DENG-YUAN HUANG

Purdue University and Academia Sinica, Taipei A result for constructing an "optimal" selection rule for selecting a subset

of k (> 2) populations is given. Attention is restricted to the class of rules for which the infimum of the probability of a correct selection, over a subset of the parameter space, is guaranteed to be a specified number. In this class a rule is derived which minimizes the supremum of the expected size of the selected subset.

Let f,2, 2 . . , f represent k (> 2) independent populations (treatments) and let Xio, * *. , Xini be n, independent random observations from 7i. The quality of the ith population sr is characterized by a real-valued parameter 9i, usually unknown. Let 2 = {816# = (091, . . .k)) denote the parameter space. Let Tj= Tij(9) be a mea- sure of separation between 7i and xj. We assume that there exists a monotonically nonincreasing function h such that fyi = h(ij). Let Qi = {fIi > tTo Vj 5 i}, I < i < k, and S20 = ?2-Q, where S2 = U,.k is For this problem, we assume -r0 and as known with To > Ti for all i. Let Ti= minj_ j, I < i S k. We define *= max1 .<z k Tl. The population associated with T* will be called the best population. It should be pointed out that if 9 E Up, then Ti > Tj for all j, since for some jJ 7 i,i = h(Tij) < h(o) < h(Ti) = Tij < To. Thus if 0 E- i, then ff, is the best population. A selection of a subset containing the best population is called a correct selection (CS).

To illustrate the above notation, we assume that independent observations are drawn from r, which has a normal distribution with unknown mean 9i(i = 1,I* * ., k) and known variance a2. We define iTj O=9 [-k; then )i= A 9fkJ if 9, <Otk] and Ti = Oi-9[k-1J if 9, = 9[kJ, where 011, ** k). In this case, T,, = O for all i and the population with the largest mean, 0[kJ9 is the best. If, instead, Tij = Oj- i then the population with the smallest mean, 0H1', would be the best. In the above example, h(t) = - t, which is a decreasing function.

Let the observed sample vector be denoted by X' = (X X, , 3) where Xi has components Xi I, , ,Xini, i = 1, ... ,k. Let 8 = (81 ... * k) be a selection proce- dure where 8,(x) is the probability of selecting ff(l < i < k) based on the observed vector X = x. As measures of goodness of a selection rule, consider two quantities (cf. Lehmann [51) R(9, 8) and S(9, 8). We define S(9, 8) = P6(CSJ8) and R(9, 8) = z 1R(')(, 68i), where R(')(O, Si) = P{Selecting fjI }. Thus R(9,8) is the expected size of the selected subset. For a specified y, (O < y < 1), we restrict attention to

Received September 1976; revised June 1978. 'This research was supported by Office of Naval Research Contract N00014-75-C-0455 at Purdue

University. AMS 1970 subject classifications. Primary 62F07; secondary 62G30. Key words and phrases. Subset selection, restricted minimax.

1164

This content downloaded from 128.235.251.160 on Mon, 22 Dec 2014 19:39:46 PMAll use subject to JSTOR Terms and Conditions

Page 3: A Note on Optimal Subset Selection Procedures

OPTIMAL SUBSET SELECTION 1165

the class e of all 8 such that

(1) S(9,8) > Y for 9 E Q We are interested in constructing an optimal procedure 83 in C which minimizes the supremum of R(q, 8) over Q for all 8 E( C, i.e.,

(2) supee R(q,80) = min,,esupee.R(q,8).

REMARK. For some basic results and the motivation of the subset selection approach, reference can be made to Gupta [4]. Some (different) optimality results assuming a slippage configuration are given by Studden [7] for the exponential family. Recently Bjomrnstad [2] has obtained some results on the minimaxity aspects of the procedures of Gupta [4], Seal [6] and Studden [7].

We restrict attention to those selection procedures which depend on the observa- tions only through a sufficient statistic for 9.

Let the statistic Zij be based on the n, and n, independent observations from g and gj(i, j = 1, 2, * * , k), respectively, and suppose that for any i, the statistic Z'i = (Zil ... , Zik) is invariant sufficient under a transformation group G and let _ i = (Till * * Tik) be a maximal invariant under the induced group G. It is well known (see Ferguson [3]) that the distribution of Z, depends only on T,. For any i, let the joint density of Zij, Vj # i, be p9(zi) with SIP. Let p(z,i) be denoted by po(zi) when Til = * * * = Tik = Tii = constant and bypi(4,) when Til = . . .-Tik- To 1 < i < k. In the normal means example, a choice of Zij might be Xi- j, where Xi = l/nillni 7Xi, and Xj = l/n En', ljXji. Let v be a a-finite measure on Rk-l

Now we state and prove a theorem which provides a solution to the restricted minimax problem as stated in (1) and (2) (cf. Lehmann [5]).

THEOREM. Suppose that for any i,p,(Zi)/po(Z,) is nondecreasing in zi and that pe(z) has the stochastically increasing property. If R(9, 80) is maximized at Tij = T = constant, for all i, j, where 80 is given by

3,0(z,) = 1 if Pi(zi) > cpo(zi),

= Xi if Pi(zi) = cPo(Zi)

= 0 if Pi(z,) < CPo(z,)

such that c(> 0) and Xi are determined by f830pi = y, 1 < i < k. Then 80 -

(80 ... , * ) minimizes sup,E R(q, 8) subject to inf9Ee S(9, 8) > y.

PROOF. For any 8 E C, 9 E Q implies 9 E Qi for some i, thus

S(9,8) = fAO(zi)pel(4i)d(Czi) > minli<kinfeeQjf3i(Ci)pe(Ci)dv(Ci).

We have

iHfcfra S(, E ) = mifEfin ikiphefdi(zi)P9(4i) d<(zi) y

Hence for any 3 E CP, infeei1f3(zi)pe(zi)dv(zi) > y, 1 ( i ( k, and by the

This content downloaded from 128.235.251.160 on Mon, 22 Dec 2014 19:39:46 PMAll use subject to JSTOR Terms and Conditions

Page 4: A Note on Optimal Subset Selection Procedures

1166 SHANTI S. GUPTA AND DENG-YUAN HUANG

assumption that f8.?pi = y, it follows that

f(Si - 6i?)(Pi -cpO) 6 0

which implies

f86?po < fAPO.

By our assumption, 8,0(z,) is nondecreasing in zi, hence

inf9q,S(q,8O) = minl1,ikff8,Pi = y.

If R(9, 80) is maximized at Tj = Tij = constant, for all i, j, then

supee1 R(q, 8) > 'f,1p > k1f30p0 = supees R(, 8?)

which completes the proof. As an application of the preceding result, consider the following example:

EXAMPLE. Let 7r, T2'. * * Tk be k independent normal populations with means 01ir O*k and common known variance a2 = 1. The ordered O,'s are denoted by 0[1J S *- f 9[kJ It is assumed that there is no prior knowledge of the correct pairing of the ordered and the unordered ,i's. Our goal is to select a nonempty subset of the k populations so as to include the population associated with 0[kj.

Let Xi, 1 < i 6 k, denote the sample means of independent samples of size n from these populations. The likelihood function of 9 is then

p(X) = W, lpq,(5i)

where nl2 -(n/2)(i,-_i)2 1 6

Let Tij= -ij () = 9i - Oj, 1 S i, j < k, To = A > 0, Q= {9[kJ - [k-11 > A} and Zij = XYi- 1 < i, j < k. Let zi = (Zil,,* * Zik), T = (il* Tik), then since Zii = 0 and T,, = 0, Vi, the joint density of Zij, j # i, is given by

Pe( i) = (2 )(k-1)/2121 1- exp{- (z,i - (r i fi(}'

where

1 2 I'

Y-(k-I) x(k- 1) n 1 2

is the covariance matrix of Zij's. Since

Po(__ = exp{z2' A + A' 'z i - A'E 'A} = expf nk (zi +*** +Zik)}

is nondecreasing in zij, j 7# i, where A' = (A, , A). Hence

pi(zi)

Po(C )

This content downloaded from 128.235.251.160 on Mon, 22 Dec 2014 19:39:46 PMAll use subject to JSTOR Terms and Conditions

Page 5: A Note on Optimal Subset Selection Procedures

OPTIMAL SUBSET SELECTION 1167

is equivalent to

X ki > -izj=iFj + dJ

Since R(9,80) = 1PP{Xi > 1/(k - 1)2j:iXj + d} is the expected size of the selected subset for Seal's average-type procedure 8? [6], the following result of Berger [1] and Bjornstad [2] applies

5upesa R(q, 80) = R(, 80) iff inf9u S(Q9 d0) > k-I

Since the right-hand side is equivalent to J(((k - l)/k)2n2 d) S I/k, the left-hand side for every fixed A > 0 holds if and only if

7 = I-?((k- I

I2d^)>1 ( k k - I

where 4(.) is the cdf of the standard normal. Therefore, if for A > 0, Y is the chosen in such a way that the preceding inequality holds, then the result of the theorem can be applied.

Acknowledgment. The authors wish to thank the referees and the associate editor for their comments and suggestions which have improved and simplified the presentation.

REFERENCES

[11 BERGER, R. L. (1977). Minimax subset selection for loss measured by subset size. Mimeo Ser. # 189, Depart. Statist., Purdue Univ.

[21 BJdRNSTAD, J. (1978). The subset selection problem, II. On the optimality of some subset selection procedures. Mimeo. Series #78-27, Depart. Statist., Purdue Univ.

[3] FERGUSON, T. S. (1967). Mathematical Statistics: A Decision Theoretic Approach. Academic Press, New York.

[4] GUPTA, S. S. (1965). On some multiple decision (selection and ranking) rules. Technometrics 7 225-245.

[5] LEHmArNN, E. L. (1961). Some model I problems of selection. Ann. Math. Statist. 32 990-1012. [61 SEAL, K. C. (1955). On a class of decision procedures for ranking means of normal populations. Ann.

Math. Statist. 36 387-397. [7] STUDDEN, W. J. (1967). On selecting a subset of k populations containing the best. Ann. Math.

Statist. 38 1072-1078.

DEPARTMENT OF STATISTICS PuRDuE UNivERsrTy MATHEMATICAL SCIENCES BUILDiNG WEST LAFAYETTE, INDIANA 47907

ACADEMIA SWCA TAEPEI, TAIWAN

This content downloaded from 128.235.251.160 on Mon, 22 Dec 2014 19:39:46 PMAll use subject to JSTOR Terms and Conditions