
Optimal Stopping With Random Horizon With Application to the Full-Information Best-Choice Problem With Random Freeze


Author: Ester Samuel-Cahn. Source: Journal of the American Statistical Association, Vol. 91, No. 433 (Mar. 1996), pp. 357-364. Published by: American Statistical Association. Stable URL: http://www.jstor.org/stable/2291415


Ester Samuel-Cahn

A general theorem is given showing that an optimal stopping problem with independent random horizon is essentially equivalent to a "discounted" fixed-horizon optimal stopping problem derived from it. The result is used to treat the "full information" secretary problem, where a known number of applicants for a job are potentially available. Applicants are interviewed sequentially with no recall, and their X_i values are observed, where the X_i are iid random variables from a known continuous distribution. The goal is to pick the applicant with the highest X_i value. The "payoff" for this applicant is 1, and for any other applicant is zero. A random "freeze-time" variable M, with known distribution independent of the X_i's, makes it impossible to pick an applicant after time M. The optimal rule is described, and necessary and sufficient conditions for it to have a "monotone structure" are given. Uniform and geometric freeze variables are discussed, and some asymptotic results are given.

KEY WORDS: Discount factor; Job freeze; Monotone rule; Secretary problem.

1. INTRODUCTION

Consider a situation where your department has advertised an opening for a secretary. There are a known number, n, of applicants for the job, and the aim is to hire the best one. This aim is exaggerated in mathematical terms, so that the best carries a value of 1 and all the others have zero value. You interview the applicants sequentially, say one every day, and measure some quantity X (such as an aggregate score over several categories) for the applicant whom you are presently interviewing. The applicant must be told immediately after the interview whether he or she has been hired, and there cannot be any regrets later on. This is the classical full-information secretary problem. In real life this situation is often complicated by budgetary limits. With no connection to the quality of the applicants, the administration may declare a freeze on all jobs. In that case, if you have not already hired an applicant, you lose your chance of hiring one, and your "payoff" will surely be zero. This is the model made rigorous and studied here.

Let n be known and let X_i, i = 1, ..., n be iid random variables from some known continuous distribution F. The X_i's represent some measure of quality of applicants for a job; the applicant with the highest X value is considered "best." By considering F(X_i) instead of X_i, we can, and shall, without loss of generality, assume that the X_i are uniformly distributed on [0, 1]. Let L_k = max{X_1, ..., X_k}, k = 1, ..., n. In the "classical" full-information best-choice problem, the payoff is 1 if the stopping time t is such that X_t = L_n and is zero otherwise. It is well known (see, e.g., Gilbert and Mosteller 1966, abbreviated GM for future reference; or Samuels 1982, 1991) that the optimal stopping rule τ_n is of the form

$$\tau_n = \min\{i : X_i > b_{n-i+1},\ X_i = L_i\} \wedge n \qquad (1)$$

Ester Samuel-Cahn is Professor, Department of Statistics, The Hebrew University of Jerusalem, 91905, Israel. Part of this research was done while the author was visiting Rutgers University, New Brunswick, NJ. The constructive comments of the referees, associate editor, and editor are gratefully acknowledged. The help of Israel Einot with numerical evaluations is also gratefully acknowledged.

for some monotone increasing sequence

$$0 = b_1 < b_2 < \cdots, \qquad (2)$$

where the sequence (2) does not depend on n. As n → ∞, lim P(X_{τ_n} = L_n) ≈ .5802.

Now consider a "freeze time" variable M, independent of the Xi, with known distribution. (We allow a positive probability of M > n, which corresponds to "no freeze.") The goal of choosing the "best" among all n applicants and the payoff remain unchanged, except that any stopping after time M yields reward zero. The concept of a freeze vari- able was introduced by Samuel-Cahn (1995), who applied it to the usual ("uninformed") best-choice problem; that is, where the choice is based on relative ranks only and no accurate Xi are available.

For the "uninformed" version, where there is a random number N of applicants (here N = M A n), and the goal is to pick the best of the actually available N applicants, a solution was given by Presman and Sonin (1972). The "full information" version of this problem has to the best of our knowledge not been studied. (Sakaguchi (1986) studied the full information problem for a Poisson process of arrivals- on a time interval of random length.)

In Section 3 the structure of the optimal stopping rule under the freeze model is explored, and a necessary and sufficient condition for it to be of the form

$$t_n = \min\{i : X_i > a_{n,i},\ X_i = L_i\} \wedge n \qquad (3)$$

for a monotone sequence a_{n,1} > ... > a_{n,n} is given. In this case we say that the stopping rule is monotone. We also evaluate the probability of selecting the best. The cases of uniform and geometric freeze are studied in detail in Sections 4 and 5, and some limiting probabilities of selecting the best are obtained when the parameters of the distributions vary with n. On our way, in Section 2 we give a general theorem on optimal stopping in the presence of a freeze


variable M, independent of the observed sequence. The theorem essentially states that a freeze variable acts like a sequence of "discount factors" {q_i}, where q_i = P(M ≥ i). The proofs can be found in the Appendix.

2. OPTIMAL STOPPING UNDER FREEZE

Let Y_1, Y_2, ... be a sequence of random variables, with any dependence and known joint distribution, and let φ_i(Y_1, ..., Y_i) = φ_i(Ȳ_i) be the payoff if one stops after i observations, where E|φ_i(Ȳ_i)| < ∞. Let T be the set of all stopping rules t adapted to the sequence of σ-fields generated by the Y's (i.e., {t = i} ∈ F_i = σ(Ȳ_i), i = 1, 2, ..., and P(t < ∞) = 1). The goal is to maximize the expected value of the φ_i(Ȳ_i) at which one stops; that is, to find sup_{t∈T} E[φ_t(Ȳ_t)] = V_φ(Y_1, Y_2, ...) and, if it exists, to find a t that achieves this supremum. In the case where one has a finite sequence Y_1, ..., Y_n only, we write T_n and require that P(t ≤ n) = 1. In this case an optimal t always exists.

Now introduce a curtailing (freeze) random variable M, independent of the Y's, where we allow P(M = ∞) > 0. If P(M = ∞) > 0, then M is called "defective." Thus the random horizon is M ∧ n or M, depending on whether the finite or infinite problem is considered. Set φ*_i(Ȳ_i, M) = φ_i(Ȳ_i)I(M ≥ i), where I(·) is the indicator function. Thus there is no change in the reward if one stops no later than M, but stopping after M yields zero reward. Let F*_i be the σ-field generated by Ȳ_i and the events {M = j}, j ≤ i − 1. The interpretation of F*_i is that in addition to knowing the Y's up to and including the present time i, it is also known whether a freeze has occurred before time i. (Note that we do not automatically receive the last φ_i before the freeze; rather, we must select it in ignorance of whether it is indeed the last.) Let t* and T* denote the corresponding stopping rules and set of stopping rules, and let V*(Y_1, Y_2, ...) be defined accordingly.

Theorem 2.1. Let Y_1, Y_2, ... be any sequence of random variables and φ_i functions such that E|φ_i(Ȳ_i)| < ∞, i = 1, 2, .... Let M be a (possibly defective) curtailing random variable, independent of the Y's, with P(M ≥ i) = q_i, i = 1, 2, .... Let φ*_i(Ȳ_i, M) = φ_i(Ȳ_i)I(M ≥ i) and φ̂_i(Ȳ_i) = q_i φ_i(Ȳ_i). Then

$$V^*_{\phi^*}(Y_1, \ldots, Y_n) = V_{\hat\phi}(Y_1, \ldots, Y_n). \qquad (4)$$

Let t_n, t̂_n, and t*_n denote the optimal rules for the payoff sequences {φ_i(Ȳ_i)}, {φ̂_i(Ȳ_i)}, and {φ*_i(Ȳ_i, M)}. (If any of these rules is not unique, then choose the version which stops the earliest.) Then it is always possible to choose t*_n to satisfy

$$t^*_n = \hat t_n \wedge (M + 1). \qquad (5)$$

If φ_i(Ȳ_i) ≥ 0 a.s. for i = 1, ..., n, then

$$\hat t_n \le t_n \quad \text{a.s.} \qquad (6)$$

If φ_i(Ȳ_i) ≥ 0 a.s. for i = 1, 2, ..., then

$$V^*_{\phi^*}(Y_1, Y_2, \ldots) = V_{\hat\phi}(Y_1, Y_2, \ldots). \qquad (7)$$

Remarks.

a. The proof appears in Appendix A. The proof does not use q_1 = 1 (i.e., M ≥ 1). q_1 = 1 implies that Y_1 is observed with certainty, without freeze. There are examples where M ≥ 0 (i.e., q_1 < 1) is the more natural assumption.

b. The interpretation of (4) and (7) is that a freeze variable is equivalent to usual "discounting" by the sequence {q_i}. In particular, if M has a geometric distribution, then one has the usual "fixed rate discounting." In connection with the best-choice secretary problem (in the "uninformed" setting), fixed discounting has been studied by Rasmussen and Pliska (1976).

c. The interpretation of (5) and (6) is that one will always stop no later than before in the presence of a freeze variable.
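To make (4) and (5) concrete, here is a minimal numerical sketch (my illustration, not from the paper). It uses a toy payoff φ_i(Ȳ_i) = Y_i with Y_i iid uniform on [0, 1] and a geometric freeze variable, so q_i = β^{i−1}; backward induction on the discounted payoffs q_i Y_i produces a rule whose simulated reward in the freeze problem matches the discounted value, as Theorem 2.1 asserts.

```python
import numpy as np

# Toy instance of Theorem 2.1 (illustrative choices, not the paper's):
# payoff phi_i = Y_i, Y_i iid Uniform(0,1), horizon n, geometric freeze M.
rng = np.random.default_rng(0)
n, beta = 5, 0.8
q = beta ** np.arange(n)                       # q_i = P(M >= i) = beta^(i-1)

# Backward induction on the "discounted" problem with payoffs q_i * Y_i.
# v[i] = optimal value when observations i+1, ..., n are still to come.
v = np.zeros(n + 1)
for i in range(n - 1, -1, -1):
    c = min(v[i + 1] / q[i], 1.0)              # indifference threshold for Y_i
    v[i] = q[i] * (1.0 + c * c) / 2.0          # E[max(q_i Y, v[i+1])] for Y ~ U(0,1)

# Monte Carlo: run that rule on the freeze problem itself (reward 0 after M).
reps = 200_000
Y = rng.random((reps, n))
M = rng.geometric(1.0 - beta, size=reps)       # P(M = k) = beta^(k-1) (1 - beta)
thr = np.minimum(v[1:] / q, 1.0)               # stop at first i with Y_i >= thr_i
stop = np.argmax(Y >= thr, axis=1)             # thr[n-1] = 0, so a stop always occurs
reward = Y[np.arange(reps), stop] * (M >= stop + 1)
print(v[0], reward.mean())                     # agree up to Monte Carlo error
```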

3. THE SECRETARY PROBLEM: STRUCTURE OF OPTIMAL RULE AND VALUE

As stated in Section 1, in the classical full-information best-choice problem the payoff is 1 if t = i and X_i = L_n and is zero otherwise. If we identify X_i with Y_i of Section 2, then clearly this payoff is not measurable with respect to F_i. It must, and can, be replaced, as in the classical cases, by the conditional probability that X_i is best given the past; that is,

$$\phi_i(\bar X_i) = X_i^{\,n-i}\, I(X_i = L_i).$$

Now Theorem 2.1 can be applied to this payoff function. Because we need some of the quantities that can be used to derive the solution to the full-information best-choice problem when no freeze is present, we first briefly describe this derivation. It follows GM quite closely. A value k for which X_k = L_k is called a "candidate." To derive b_{i+1} of (2), suppose that the (i+1)st observation from the end is a candidate and that its value is x. The value b_{i+1} must be an "indifference value"; that is, the probability of winning now, with x, which is x^i (= probability that no future observation will exceed x), must equal the probability of winning later, which is Σ_{j=1}^{i} W_j^{(i)}(x), where W_j^{(i)}(x) is the probability that the first observation (after the (i+1)st from the end) to exceed x is the jth and that this observation is larger than any future observation. Thus, because the X_i are uniform on [0, 1],

$$W_j^{(i)}(x) = \int_x^1 x^{\,j-1} y^{\,i-j}\, dy = \frac{x^{\,j-1}\left(1 - x^{\,i-j+1}\right)}{i-j+1}, \qquad (8)$$

and b_{i+1} is the solution to

$$x^i = \sum_{j=1}^{i} W_j^{(i)}(x);$$

that is, to

$$1 + \sum_{k=1}^{i} k^{-1} = \sum_{k=1}^{i} \left(k\, x^k\right)^{-1}. \qquad (9)$$
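Equation (9) is easy to solve numerically. The following sketch (mine; it assumes only the uniform-[0, 1] reduction above) recovers b_m by root finding (Brent's method) and reproduces the b_n(1) column of Table 4 below.

```python
from scipy.optimize import brentq

def b(m):
    """Threshold b_m from equation (9): 1 + sum_{k<=i} 1/k = sum_{k<=i} 1/(k x^k),
    with i = m - 1 observations still to come."""
    i = m - 1
    if i == 0:
        return 0.0                              # b_1 = 0: the last observation must be taken
    lhs = 1.0 + sum(1.0 / k for k in range(1, i + 1))
    f = lambda x: sum(1.0 / (k * x ** k) for k in range(1, i + 1)) - lhs
    return brentq(f, 1e-9, 1.0)                 # f > 0 near 0, f(1) = -1

for m in (2, 4, 6, 8, 10):
    print(m, round(b(m), 4))                    # .5, .7758, .8559, .8939, .916
```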


Some values of b_i and their approximations have been tabulated in GM, and some asymptotics are given there; see also column b_n(1) in Table 4 here.

Now consider the problem with the added freeze variable M. Let P(M ≥ i) = q_i, i = 1, 2, .... It makes sense to assume q_1 = 1, though this is not needed. {M ≥ n} symbolizes "no freeze." By Theorem 2.1, the optimal rule for this problem and its "value" are equivalent to those of a problem without freeze, with payoff 0 if t = i and X_i ≠ L_n, and payoff q_i if t = i and X_i = L_n, i = 1, ..., n. By the independence of the future X_i's of the past, the previous argument of GM can again be used to show that the optimal rule has structure (3), where the a_{n,i}'s are "indifference values." A rigorous argument of this is readily obtained by backward induction. But the sequence {a_{n,i}} is not necessarily monotone in i. This is shown in the following simple example.

Example 3.1. Let P(M = 1) = p = 1 − P(M = n), for some 0 < p < 1. Because q_2 = ... = q_n, it follows that for i > 1, a_{n,i} = b_{n−i+1}, where the b_{n−i+1} are the optimal values of (1) (i.e., b_{i+1} is the solution to (9)), and hence satisfy a_{n,2} > ... > a_{n,n}. But the value a_{n,1} can be made arbitrarily small by choosing p sufficiently large.

The derivation of the optimal rule without freeze was based on the intuitive reasoning (which can easily be made rigorous) that if one is willing to stop with a candidate whose observed value is x, k stages from the end, then one should clearly be willing to stop with a candidate with this x value at any later stage (i.e., closer to the end). This argument fails in the presence of a freeze variable, as Example 3.1 shows.

Let m = sup{i : q_i > 0} and r = n ∧ m. Then clearly a_{n,r} = 0, and a_{n,i} need not be defined for i > r. Our aim is to find necessary and sufficient conditions for a_{n,1} > a_{n,2} > ... > a_{n,r}; that is, for the optimal rule to be monotone.

Theorem 3.2. A necessary and sufficient condition for a_{n,k} > a_{n,k+1} for k = r−2, r−3, ..., 1 is that

$$q_k/q_{k+1} + (r-k)^{-1} < 1 + \left[(r-k)\, a_{n,k+1}^{\,r-k}\right]^{-1} \qquad (10)$$

for k = r−2, r−3, ..., 1, where

$$a_{n,r-1} = q_r/[q_{r-1} + q_r]. \qquad (11)$$

Remarks.

a. Note that when all q's equal 1 (i.e., no freeze), then (10) holds for all a_{n,k+1} < 1, yielding a proof that the b_i sequence of (1) is monotone.

b. There exists a sequence {q_k} for which a_{n,k} = c for all k < r; that is, for which the optimal rule is a "simple threshold" rule. Let c be the value of (11). For given q_{k+1}, ..., q_{r−1}, q_r, substitute c for a_{n,k+1} in the right side of (10) and solve (10) with equality for q_k. The solution will always satisfy q_k > q_{k+1}. By rescaling all q_i through multiplication by a constant, one can always obtain q_1 = 1.
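The following sketch (mine) computes the indifference values a_{n,k} by solving equation (B.2) of Appendix B backward, under the assumption q_k > 0 for all k (so r = n); it reproduces both the no-freeze monotone thresholds and the non-monotone behavior of Example 3.1.

```python
import numpy as np
from scipy.optimize import brentq

def thresholds(q):
    """a_{n,k}, k = 1..n-1, solved backward from equation (B.2); assumes all
    q[k] > 0, so r = n.  Here q[k-1] = q_k = P(M >= k)."""
    r = len(q)
    a = np.zeros(r)                              # a[r-1] = a_{n,r} = 0
    for k in range(r - 1, 0, -1):
        js = np.arange(1, r - k + 1)
        qs = q[r - js]                           # q_{r+1-j}, j = 1, ..., r-k
        rhs = q[k - 1] + np.sum(qs / js)
        g = lambda x: np.sum(qs / (js * x ** js)) - rhs
        a[k - 1] = brentq(g, 1e-12, 1.0)         # g > 0 near 0, g(1) = -q_k < 0
    return a

n = 6
print(thresholds(np.ones(n)))        # no freeze: a_{n,k} = b_{n-k+1}, monotone decreasing
p = 0.9                              # Example 3.1: P(M = 1) = p, P(M = n) = 1 - p
print(thresholds(np.r_[1.0, np.full(n - 1, 1 - p)]))   # a_{n,1} drops far below a_{n,2}
```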

To obtain the "value" for a given monotone stopping rule with stopping constants d1 ? .. ?n 4 0, let P(k-) denote the probability of "winning" at time k, with no freeze. GM

proved that

P(1) = (1 -dn)/n

and

k

P(k + 1) = [k(n -k)]-l dj- [n(n -k)]- j=1

k

x E djn - d n 1 /n, 1< k <r-1. (12) j=1

Let W_M(t) denote the "value" of a stopping rule t; that is, the probability of choosing the best applicant, in the presence of freeze.

Theorem 3.3. Let t_d = min{i : X_i > d_i, X_i = L_i} ∧ r for a monotone nonincreasing sequence {d_i}. Then

$$W_M(t_d) = \sum_{k=1}^{r} q_k P(k). \qquad (13)$$

Proof. The proof is immediate, using (12) and Theorem 2.1.
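A direct evaluation of (12) and (13) (my sketch, not the author's code) is straightforward; with q_k ≡ 1 and the optimal constants d_j = b_{n−j+1} it reproduces the no-freeze value W_4(1) = .6554 of Table 4.

```python
import numpy as np

def win_probs(d):
    """P(k), k = 1..n, from P(1) = (1 - d_1^n)/n and equation (12);
    d must be nonincreasing."""
    n = len(d)
    P = np.zeros(n)
    P[0] = (1.0 - d[0] ** n) / n
    for k in range(1, n):                      # this k is the paper's k; entry is P(k+1)
        P[k] = (np.sum(d[:k] ** k) / (k * (n - k))
                - np.sum(d[:k] ** n) / (n * (n - k)) - d[k] ** n / n)
    return P

def value(d, q):
    """W_M(t_d) = sum_k q_k P(k), equation (13), with r = n."""
    return float(np.dot(q, win_probs(d)))

b = [0.0, 0.5, 0.6899, 0.7758]                 # b_1..b_4 solved from (9)
d = np.array(b[::-1])                          # optimal constants d_j = b_{n-j+1}
print(value(d, np.ones(4)))                    # ~.6554 = W_4(1) of Table 4
```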

As stated in Section 1, the limiting optimal value without freeze, as n → ∞, is approximately .58. Unfortunately, with freeze, for any fixed freeze variable M, the optimal value tends to zero.

Theorem 3.4. Let M be a fixed freeze variable with P(M < ∞) = 1. Then lim_{n→∞} [sup_t W_M(t)] = 0.

Proof. Let J_n be the random index i such that X_i = L_n; that is, the index of the best candidate. Let k(ε) be such that P(M > k(ε)) < ε. Because for every ε > 0, lim_{n→∞} P(J_n > k(ε)) = 1, the probability of reaching the best candidate before a freeze tends to zero as n → ∞.

Theorem 3.4 indicates that the only way to get meaningful limiting results is to consider a sequence of freeze variables {M_n}, the distributions of which depend on n.

4. UNIFORM FREEZE

In this section we consider the case where the freeze variable has a uniform distribution over the integers 1, ..., n. We prove that in this case the optimal rule is monotone. We tabulate some optimal thresholds and develop a method for getting good approximations to these thresholds when n is large. We also give the probabilities of choosing the best. Asymptotic values are obtained as the number of applicants, n, tends to infinity. We also give some results for "single threshold rules."

Let M_n be uniform on the integers 1, ..., n. Then q_k = (n − k + 1)/n, k = 1, ..., n, and r = n. We first show that the optimal rule is monotone. In Equation (B.2) substitute these q_k, and set z = x^{-1} to obtain

$$2(n-k) + 1 = z(z-1)^{-1}\left[z^{\,n-k} - 1\right], \qquad k = n-1, \ldots, 1, \qquad (14)$$


Table 1. Optimal Values and Their Approximations Under Uniform Freeze

   n       γ_n       δ_n      γ_n^{-1}  δ_n^{-1}   γ_n^n     W_n^U     V_n^U
   10    1.13155   1.11422    .88375    .89749    3.44135   .357577   .205793
   20    1.06428   1.05983    .93960    .94355    3.47630   .333831   .182881
   40    1.03177   1.03064    .96921    .97027    3.49437   .322290   .172136
   60    1.02110   1.02060    .97933    .97982    3.50049   .318492   .168663
   80    1.01580   1.01551    .98445    .98473    3.50356   .316601   .166947
  100    1.01262   1.01244    .98754    .98771    3.50542   .315470   .165923
1,000    1.00126   1.00126    .99874    .99875    3.51211   .311414   .162256
2,000    1.00063   1.00063    .99937    .99937    3.51249   .311190   .162102
5,000                                                       .311055   .161983
10,000                                                      .311010   .161943
   ∞                                              3.51286   .310965   .161902

NOTE: Values of γ_n and its approximation δ_n for selected values of n. γ_n^{-1} = a_{n+1,1} is the common optimal-rule threshold when n applicants remain. The two last columns are the probabilities of choosing the best, under uniform freeze, in the full-information and relative-rank cases.

the solution of which is a_{n,k}^{-1}. (For k = n − 1, this yields z = 3, corresponding to (11).) Equation (14) shows that a_{n,k} is a function of n − k only. Rewrite (14) as

$$2j + 1 = z(z-1)^{-1}\left(z^{\,j} - 1\right), \qquad j = 1, 2, \ldots, \qquad (15)$$

and let γ_j be its solution. Then for n − k = j, one has a_{n,k} = γ_j^{-1}. Here γ_1 = 3 and γ_2 = 10[1 + √21]^{-1} = 1.7912....

The optimal rule is monotone for all n iff γ_j > γ_{j+1} for all j. Condition (10) translates to 2 < a_{n,k+1}^{-(n-k)}; that is,

$$2 < \gamma_j^{\,j+1} \qquad \text{for } j = 1, 2, \ldots. \qquad (16)$$

Equation (15) can also be written as

$$2j + 1 = \sum_{k=1}^{j} z^k. \qquad (17)$$

Because the solution γ_j is greater than 1, and γ_j^{j} is the largest of the j summands in (17) when γ_j is substituted for z there, it follows that γ_j^{\,j} > (2j + 1)/j > 2; since γ_j > 1, also γ_j^{\,j+1} > 2, so (16) holds. That is, the optimal rule is monotone.
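The roots γ_j of (17) are easily computed; this sketch (mine) reproduces the n = 10 row of Table 1 and illustrates that γ_j^{j} stays above 2.

```python
from scipy.optimize import brentq

def gamma(j):
    """gamma_j: the root z > 1 of equation (17), 2j + 1 = z + z^2 + ... + z^j."""
    f = lambda z: sum(z ** k for k in range(1, j + 1)) - (2 * j + 1)
    return brentq(f, 1.0, 3.0)          # f(1) = -(j+1) < 0, f(3) >= 0 for all j >= 1

for j in (1, 2, 10, 20, 100):
    g = gamma(j)
    print(j, round(g, 5), round(1 / g, 5), round(g ** j, 5))
# j = 10 gives 1.13155, .88375, 3.44135 -- the n = 10 row of Table 1;
# gamma_j decreases in j, and gamma_j^j stays above 2, as shown in the text.
```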

For any monotone d_1 ≥ ... ≥ d_n and uniform freeze, the value given in Theorem 3.3 can be written explicitly. After simplification, we obtain (where U stands for uniform M_n)

$$W_U(t_d) = \frac{1}{n}\left\{1 + \sum_{k=1}^{n-1} \frac{1}{k} \sum_{j=1}^{k} d_j^{\,k}\right\} - \frac{1}{n^2} \sum_{k=1}^{n} (2n - 2k + 1)\, d_k^{\,n}. \qquad (18)$$

(It follows immediately that any reasonable rule should have d_n = 0, which could also be argued directly for the general case from P(n) of (12), or on obvious grounds.)

If d_i = d, i = 1, ..., n, then (18) simplifies to

$$A_n(d) = \frac{1 - d^{\,n}}{n(1 - d)} - d^{\,n}. \qquad (19)$$

(If we set d_n = 0 instead of d_n = d, then we get A_n^*(d) = A_n(d) + d^{\,n}/n^2.)

We have computed some values of γ_j directly, by solving (17). Some of these are listed in Table 1. The optimal value, denoted by W_n^U, is obtained by substituting d_j = a_{n,j} = γ_{n−j}^{-1} in (18). The values of W_n^U are also listed. For comparison, the last column also lists the corresponding optimal values V_n^U for the same best-choice problem based on relative ranks only (see Samuel-Cahn 1995).

It is easy to obtain good approximations for γ_j. By (15), γ_j satisfies

$$\gamma_j^{\,j+1} = (2j+1)(\gamma_j - 1) + \gamma_j. \qquad (20)$$

In Appendix B we show that γ_j^{j+1} tends to a finite limit as j → ∞. Denote this limit by α. Then from (20), for large j we get (2j + 1)(γ_j − 1) ≈ α − γ_j; that is,

$$\gamma_j \approx 1 + \frac{\alpha - 1}{2(j+1)}. \qquad (21)$$

Thus

$$\gamma_j^{\,j+1} \approx \left(1 + \frac{\alpha - 1}{2(j+1)}\right)^{\!j+1} \to e^{(\alpha - 1)/2}.$$

Therefore, the value α must satisfy

$$\alpha = e^{(\alpha - 1)/2}, \qquad (22)$$

the numerical solution of which is α = 3.5128624.... Denote the value of the right side of (21) for this α by δ_j. Table 1 lists some values of δ_j, as well as the values γ_j^{-1} and, for comparison, δ_j^{-1}. The approximations seem quite reasonable already for moderate n. The table also lists γ_j^{\,j}, which clearly converge to α. (The convergence is somewhat faster than that of γ_j^{\,j+1}.)

Let W^U = lim_{n→∞} W_n^U. This value can be obtained by substituting δ_j^{-1} for d_j in (18) and taking limits as n → ∞. Set b = (α − 1)/2, where α is the value that solves (22). It can then be seen that

$$W^U = \int_0^1 x^{-1}\left[\int_0^x e^{-bx/(1-y)}\, dy\right] dx - 2\int_0^1 y\, e^{-b/y}\, dy, \qquad (23)$$

where the first and second terms of (23) correspond to the limits of the corresponding terms of (18). The corresponding numerical values are

$$W^U = .466785 - .155820 = .310965.$$
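Both the constant α of (22) and the two integrals in (23) can be evaluated with standard quadrature; this sketch (mine) reproduces W^U = .310965.

```python
import numpy as np
from scipy.optimize import brentq
from scipy.integrate import quad, dblquad

alpha = brentq(lambda a: a - np.exp((a - 1) / 2), 2.0, 5.0)   # equation (22)
b = (alpha - 1) / 2
print(alpha)                                                  # 3.5128624...

# First term of (23): integrate e^{-bx/(1-y)} / x over 0 < y < x < 1.
t1 = dblquad(lambda y, x: np.exp(-b * x / (1 - y)) / x, 0, 1, 0, lambda x: x)[0]
# Second term of (23).
t2 = 2 * quad(lambda y: y * np.exp(-b / y), 0, 1)[0]
print(t1, t2, t1 - t2)                   # .466785, .155820, W^U = .310965
```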

This value should be compared with W_{10,000}^U = .311010. We have also considered the rule with a single threshold d, the value of which is given in (19). For fixed n, we computed the d_n^* for which max_d A_n(d) = A_n(d_n^*). Table 2 lists some values of d_n^* and A_n(d_n^*).

The value lim_{n→∞} A_n(d_n^*) can be found as follows. Let d_n(c) = 1 − c/n for a constant value c. Then

$$\lim_{n\to\infty} A_n(d_n(c)) = \left(1 - e^{-c}\right)/c - e^{-c}. \qquad (24)$$

The value in the right side of (24) is maximized for c = c* = 1.793280, and the value of the right side of (24) for this c* is .298426, easily seen to be lim_{n→∞} A_n(d_n^*). Thus the difference between the limiting optimal probability of choosing the best and the optimal probability of choosing the best with a single-threshold rule is only .01254. (The comparable values with no freeze are .58016 and .51735, with a difference of .06281; see GM.) Table 2 also lists, for some n values, A_n(1 − c*/n) = A_n(d_n(c*)). These are remarkably close to A_n(d_n^*) already for small n. The table also lists n(1 − d_n^*), to be compared with c*.
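The finite-n optimum d_n^* of (19) and the limiting constant c* of (24) can both be recovered numerically; a sketch (mine):

```python
import numpy as np
from scipy.optimize import minimize_scalar

A = lambda d, n: (1 - d ** n) / (n * (1 - d)) - d ** n        # equation (19)
lim = lambda c: (1 - np.exp(-c)) / c - np.exp(-c)             # right side of (24)

for n in (10, 100, 1000):
    r = minimize_scalar(lambda d: -A(d, n), bounds=(0.0, 0.999999), method="bounded")
    print(n, round(r.x, 5), round(-r.fun, 5))   # d_n^*, A_n(d_n^*): rows of Table 2

r = minimize_scalar(lambda c: -lim(c), bounds=(0.1, 5.0), method="bounded")
print(round(r.x, 6), round(-r.fun, 6))          # c* = 1.793280, value .298426
```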

5. GEOMETRIC FREEZE

Here we consider the case where the freeze variable has a geometric distribution. We first show that in this case the optimal rule also is monotone. Here there is always a positive chance that no freeze will occur. To get meaningful limiting results, we must let the "success probability" p of the geometric variable depend on n. The choice p = c/n for a fixed positive c value yields meaningful results. Numerical results are again given.

Let M be a geometric random variable, so that q_k = β^{k−1}, k = 1, 2, ..., for some fixed β, 0 < β < 1. Here r = n and a_{n,n−1} = β/(1 + β). We show that the optimal rule is monotone. Here (10) can be written as

$$\beta^{-1} + (n-k)^{-1} < 1 + (n-k)^{-1}\, a_{n,k+1}^{-(n-k)}. \qquad (25)$$

Replace a_{n,k+1} in (25) by x. With this replacement, the right side of (25) is a decreasing function of x.

Table 2. Simple Threshold Rules Under Uniform Freeze

   n      d_n^*    A_n(d_n^*)   A_n(1 − c*/n)   n(1 − d_n^*)
   10    .82562     .34190        .34178          1.74384
   20    .91155     .31970        .31968          1.76899
   40    .95547     .30895        .30895          1.78124
   60    .97025     .30542        .30542          1.78528
   80    .97766     .30366        .30366          1.78729
  100    .98212     .30261        .30261          1.78849
  200    .99105     .30051        .30051          1.79086
  500    .99642     .29926        .29926          1.79242
1,000    .99821     .29884        .29884          1.79261
   ∞                .29843        .29843          1.79328

NOTE: The values d_n^* are the optimal single thresholds, and A_n(d_n^*) are their probabilities of choosing the best. The fourth column contains values of easy approximations to these probabilities.

It therefore suffices to show that (25) holds for x = β/(1 + β); that is,

$$\beta^{-1} + j^{-1} < 1 + j^{-1}\left[\beta/(1+\beta)\right]^{-j} \qquad \text{for } j = 1, 2, \ldots. \qquad (26)$$

For j = 1, inequality (26) clearly holds. Because the left side of (26) is decreasing in j, it suffices to show that, for fixed β, 0 < β < 1, the right side is increasing in j (i.e., j^{-1} ≤ (j+1)^{-1}(1+β)/β), which clearly holds. Thus the rule is monotone, and a_{n,k} is the solution to the equation

$$1 + \sum_{j=1}^{n-k} \beta^{\,n-k+1-j}/j = \sum_{j=1}^{n-k} \beta^{\,n-k+1-j}\left[j\, x^j\right]^{-1}, \qquad (27)$$

which shows that a_{n,k} is a function of n − k only. (This can also be argued directly using the Markovian structure of the problem for geometric freeze.) Set z = x^{-1} and rewrite (27) as

$$1 + \sum_{j=1}^{i} \beta^{\,i+1-j}/j = \sum_{j=1}^{i} \beta^{\,i+1-j} z^j/j, \qquad i = 1, 2, \ldots, \qquad (28)$$

or, for short,

$$1 + S_i(\beta) = S_i(\beta, z), \qquad i = 1, 2, \ldots, \qquad (29)$$

where the terms in (29) correspond to those of (28); that is, S_i(β) = Σ_{j=1}^i β^{i+1−j}/j and S_i(β, z) = Σ_{j=1}^i β^{i+1−j} z^j/j. Denote the solution z by γ_{i+1}(β) = b_{i+1}^{-1}(β), where 0 = b_1(β) < b_2(β) < .... In particular, γ_2(β) = (1 + β)/β and γ_3(β) = [(1 + β)^2 + 2β^{-1}]^{1/2} − β. Clearly, the b_{i+1}(1) are the values of (2); that is, the solutions to (9). Note that S_i(β, 1) = S_i(β) and S_i(β, z) is increasing in z for every fixed i and β, which yields γ_i(β) > 1. It is easily seen that for every fixed β, 0 < β < 1, one has lim_{i→∞} S_i(β) = 0.
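Equation (28) can be solved by root finding; this sketch (mine) reproduces the threshold columns b_n(β) of Table 4 below.

```python
from scipy.optimize import brentq

def b_geo(m, beta):
    """b_m(beta) = 1/gamma_m(beta), where gamma_m(beta) solves (28) with i = m - 1."""
    i = m - 1
    if i == 0:
        return 0.0
    w = [beta ** (i + 1 - j) / j for j in range(1, i + 1)]     # beta^{i+1-j}/j
    f = lambda z: sum(wj * z ** j for j, wj in enumerate(w, 1)) - 1.0 - sum(w)
    return 1.0 / brentq(f, 1.0, 100.0)       # f(1) = -1 < 0; f grows without bound

for n in (2, 10, 20):
    print(n, [round(b_geo(n, beta), 4) for beta in (1.0, 0.995, 0.975, 0.95)])
# matches Table 4: e.g. n = 10 gives .9160, .9145, .9084, .9005
```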

Unlike the situation in the uniform freeze, here lim_{i→∞} γ_i^{\,i}(β) = ∞ for all 0 < β < 1. This can be seen as follows: From (29), we have 1 + S_i(β) ≤ γ_{i+1}^{\,i}(β) S_i(β), which implies γ_{i+1}^{\,i}(β) ≥ 1 + [S_i(β)]^{-1} → ∞ as i → ∞. Nevertheless, lim_{i→∞} γ_i(β) = 1. This can be seen as follows: Suppose that lim_{i→∞} γ_i(β) = 1 + δ, with δ > 0. Because the sequence {γ_i(β)} is decreasing, γ_i(β) ≥ 1 + δ for all i. Thus γ_{i+1}^{\,j}(β) ≥ (1 + δ)^j ≥ 1 + jδ + [j(j−1)/2]δ², j = 1, 2, .... Hence

$$1 + S_i(\beta) = S_i(\beta, \gamma_{i+1}(\beta)) \ge \sum_{j=1}^{i} \beta^{\,i+1-j}\left\{1 + j\delta + [j(j-1)/2]\delta^2\right\}/j; \qquad (30)$$

that is, after opening the curly bracket in (30) and cancelling the term S_i(β) on both sides, neglecting the second term, and taking the third term with j = i only, we get

$$1 \ge \beta\,\delta^2 (i-1)/2. \qquad (31)$$

But for δ > 0, the right side of (31) tends to ∞, which yields the desired contradiction.

We have not attempted a more careful analysis of the values of γ_i(β) and the corresponding b_i(β) = γ_i^{-1}(β), because these seem of little interest: by Theorem 3.4, the value of the optimal rule tends to zero, as n → ∞, for every fixed β. The only hope of obtaining meaningful limiting results is to consider a sequence {β_n} for which lim β_n = 1 as n → ∞.


Table 3. Asymptotically Best Simple Threshold Rules and Their Values Under Geometric Freeze

  c     d(c)     n=20    n=40    n=60    n=80    n=100   n=∞
  0    1.5029   .5287   .5230   .5211   .5201   .5196   .5174
 .1    1.5164   .5080   .5018   .4998   .4988   .4982   .4958
 .3    1.5440   .4700   .4632   .4609   .4598   .4592   .4565
 .5    1.5724   .4360   .4288   .4264   .4253   .4246   .4218
 .7    1.6015   .4056   .3982   .3958   .3946   .3939   .3910
 .9    1.6312   .3784   .3709   .3684   .3672   .3665   .3636
  1    1.6462   .3658   .3583   .3559   .3547   .3540   .3511
  2    1.8006   .2705   .2637   .2615   .2605   .2598   .2572
  4    2.0982   .1741   .1687   .1669   .1661   .1656   .1636
  6    2.3465   .1288   .1241   .1226   .1219   .1215   .1198
  8    2.5460   .1030   .0987   .0974   .0967   .0963   .0948
 10    2.7100   .0862   .0822   .0810   .0804   .0800   .0787

NOTE: The values under n are the probabilities V_n(c, d(c)) of choosing the best, and V^∞(c, d(c)). The threshold is a_n = 1 − d(c)/n, and the geometric probability parameter is β_n = 1 − c/n.

We consider β_n = 1 − c/n, where c > 0 is a constant. We consider simple threshold rules; that is, rules where a_{n,k} = a_n, k = 1, ..., n. For a simple threshold rule, the values of (12) simplify to

$$P_n(k) = a_n^{\,k-1}\left(1 - a_n^{\,n-k+1}\right)/(n - k + 1), \qquad k = 1, \ldots, n. \qquad (32)$$

(One can easily argue (32) directly, because to "win" with the kth observation, the previous (k − 1) must be less than a_n; the present must be some y where a_n < y < 1, and the remaining n − k observations must be less than y. (32) is readily obtained through integrating over y.) We consider a_n = 1 − d/n for some fixed d > 0. Denote the value for this β_n and a_n by V_n(c, d). By (13),

$$V_n(c, d) = \sum_{k=1}^{n} \left(1 - \frac{c}{n}\right)^{\!k-1}\left(1 - \frac{d}{n}\right)^{\!k-1}\left[1 - \left(1 - \frac{d}{n}\right)^{\!n-k+1}\right] \big/ (n - k + 1). \qquad (33)$$

It is not difficult to verify that lim_{n→∞} V_n(c, d) = V^∞(c, d) is given by

$$V^\infty(c, d) = \int_0^1 e^{-(c+d)x}\left[1 - e^{-d(1-x)}\right]\frac{dx}{1-x} = e^{-(c+d)} \int_0^1 \left[e^{(c+d)y} - e^{cy}\right]\frac{dy}{y} = e^{-(c+d)} \sum_{k=1}^{\infty} \frac{(c+d)^k - c^k}{k \cdot k!}.$$

For fixed c, the optimal value of d, denoted by d(c), can be found through differentiation. It is the unique value of d satisfying

$$\sum_{k=1}^{\infty} \frac{(c+d)^k - c^k}{k \cdot k!} = \frac{e^{c+d} - 1}{c + d},$$

and the optimal value simplifies to

$$V^\infty(c, d(c)) = \frac{1 - e^{-(c+d(c))}}{c + d(c)}.$$

Table 3 lists some values of d(c) and V^∞(c, d(c)) for a range of c values. For c = 0, this is the usual no-freeze, one-threshold value given in GM. For comparison we also list the values V_n(c, d(c)) of (33), using the fixed threshold 1 − d(c)/n, for n = 20(20)100.
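The limit V^∞(c, d) and the optimizer d(c) can be computed directly from the integral form above; this sketch (mine) reproduces several rows of Table 3.

```python
import numpy as np
from scipy.integrate import quad
from scipy.optimize import minimize_scalar

def V_inf(c, d):
    """V^inf(c, d) = int_0^1 e^{-(c+d)x} (1 - e^{-d(1-x)}) / (1 - x) dx."""
    f = lambda x: np.exp(-(c + d) * x) * (-np.expm1(-d * (1 - x))) / (1 - x)
    return quad(f, 0.0, 1.0)[0]

for c in (0.0, 0.5, 1.0, 2.0):
    r = minimize_scalar(lambda d: -V_inf(c, d), bounds=(0.5, 5.0), method="bounded")
    print(c, round(r.x, 4), round(-r.fun, 4))
# reproduces Table 3: d(0) = 1.5029 with value .5174; d(.5) = 1.5724 with .4218;
# d(1) = 1.6462 with .3511; d(2) = 1.8006 with .2572
```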

Clearly, as c increases (i.e., the freeze probability on each "trial" increases), so does d(c) (i.e., the threshold 1 − d(c)/n decreases). Table 4 lists the actual optimal thresholds obtained through (28) and the actual optimal values, denoted by W_n(β), for n = 2(2)20 and β = 1, .995, .975, and .95. The values for n = 20 should be compared to the values V_20(c, d(c)) of Table 3 for c = 0, .1, .5, and 1, because β = 1 − c/20 yields these β values. The ratio of the optimal value attainable with a simple threshold rule to the general optimal value is decreasing in β; that is, replacing

Table 4. Exact Optimal Thresholds and Optimal Values Under Geometric Freeze

  n    b_n(1)   b_n(.995)  b_n(.975)  b_n(.95)   W_n(1)   W_n(.995)  W_n(.975)  W_n(.95)
  2    .5000     .4987      .4937      .4872     .7500     .7481      .7407      .7314
  4    .7758     .7744      .7685      .7609     .6554     .6506      .6318      .6092
  6    .8559     .8545      .8484      .8406     .6288     .6211      .5916      .5573
  8    .8939     .8924      .8863      .8785     .6161     .6056      .5660      .5213
 10    .9160     .9145      .9084      .9005     .6087     .5954      .5461      .4922
 12    .9305     .9290      .9229      .9150     .6038     .5877      .5291      .4671
 14    .9408     .9393      .9331      .9252     .6004     .5815      .5141      .4448
 16    .9484     .9469      .9407      .9328     .5978     .5762      .5003      .4247
 18    .9542     .9527      .9466      .9387     .5958     .5715      .4875      .4063
 20    .9589     .9574      .9512      .9434     .5942     .5672      .4755      .3894

NOTE: The b_n(β) are the thresholds, and W_n(β) are the corresponding optimal values.


the optimal rule by the best simple threshold rule has a (relatively) less serious effect when the freeze probability (1 − β) is high.

6. CONCLUSIONS

A general theorem for the independent freeze model in optimal stopping theory states, loosely, that (a) a "freeze model" is equivalent to the same model without freeze but with an added "discount factor," and (b) the optimal rule will stop earlier in the presence of a freeze variable. When this model is applied to specific freeze variables, such as uniform or geometric freeze, and the original optimal stopping problem is that of "best choice" with full-information iid random variables X_i, i = 1, ..., n, with known distribution, the optimal rule assumes a particularly simple form: stop with the smallest i for which X_i exceeds a constant, where this constant depends only on the total number of (possibly) remaining steps until reaching the final horizon n. The sequence of constants is monotone increasing in the number of remaining steps.

The probability of choosing the best can be computed, and for simple asymptotic models the asymptotic probabilities are positive and computable as n → ∞. Specific values can be found in the tables.

APPENDIX A: PROOF OF THEOREM 2.1

Note that φ̂_i(Ȳ_i) = E[φ*_i(Ȳ_i, M) | Ȳ_i]. Clearly F_i ⊂ F*_i; thus T_n ⊂ T*_n and T ⊂ T*. On the other hand, for the finite problem, for each t* ∈ T*_n we exhibit a t ∈ T_n such that t ∧ (M + 1) is equivalent to t*, in the sense that they yield the same reward. (Note that t ∧ (M + 1) is a stopping rule with respect to the F*_i, because at any time j + 1, {M = j} ∈ F*_{j+1}.) t can be defined as follows: On the part where t* = i and M ≥ i, one can take t = i, because on this part t* depends on Y_1, ..., Y_i only, i = 1, ..., n. On the part where t* > M (which is the complementary part of the union of the previously mentioned sets, and hence depends on Y_1, ..., Y_n only), take t = n. Then clearly t ∈ T_n, and t ∧ (M + 1) is equivalent to t*. With this understanding, we can identify all rules in T*_n with rules in T_n. We show that for every t* ∈ T*_n and corresponding t ∈ T_n,

$$E\phi^*_{t^*}(\bar Y_{t^*}, M) = E\hat\phi_t(\bar Y_t) := E\!\left(q_t\,\phi_t(\bar Y_t)\right). \qquad (A.1)$$

This clearly yields (4) by taking the supremum over all t* ∈ T*_n (or, equivalently, over all t ∈ T_n). It also yields (5). To see (A.1), write

$$\phi^*_{t^*}(\bar Y_{t^*}, M) = \sum_{i=1}^{n} \phi^*_i(\bar Y_i, M)\, I(t^* = i) = \sum_{i=1}^{n} \phi_i(\bar Y_i)\, I(M \ge i)\, I(t^* = i) = \sum_{i=1}^{n} \phi_i(\bar Y_i)\, I(M \ge i)\, I(t = i), \qquad (A.2)$$

where the last equality follows because on {t* = i < t} necessarily {M < i}. Now take expectations on both sides of (A.2). On the right side, first take conditional expectation over M and use the fact that M is independent of the Y's ({M ≥ i} independent of Ȳ_i for i = 1, 2, ... would suffice) to obtain Σ_{i=1}^n φ_i(Ȳ_i) q_i I(t = i) [= Σ_{i=1}^n φ̂_i(Ȳ_i) I(t = i)]. This yields (A.1).

To see (6), use theorem 3.2 of Chow, Robbins, and Siegmund (1971, p. 50) for φ_i(Ȳ_i) ≥ 0, using the fact that q_1 ≥ ... ≥ q_n; (6) then follows easily by backward induction.

In the infinite case one cannot necessarily identify the sets of rules T and T*. For example, if M is unbounded and nondefective, then t** ≡ M + 1 is a legitimate stopping rule in T*, which has no counterpart in T. Actually, suppose that M is nondefective with q_i > 0 for all i, and let φ_i(Ȳ_i) ≤ −q_i^{-1} a.s. Then V_{φ̂}(Y_1, Y_2, ...) ≤ −1, while V*_{φ*}(Y_1, Y_2, ...) = Eφ*_{t**}(Ȳ_{t**}, M) = 0. The reason no such problem arises when φ_i(Ȳ_i) ≥ 0 is that though t** is still a bona fide stopping rule, its value (in all nontrivial cases) is less than that of the optimal rule. Any t* for which t* > t** on some part of the sample space can be replaced by a rule that uses t** on that part. Now (7) can be shown to hold by theorem 4.3 of Chow et al. (1971, p. 65). (Note that without further assumptions, no claim can be made about the existence of an optimal rule.)

APPENDIX B: PROOFS FOR THE SECRETARY PROBLEM

Proof of Theorem 3.2

By backward induction, a_{n,r−1} must solve the equation q_{r−1}x = q_r W_1^{(1)}(x), where W_j^{(i)}(x) is given in (8). Thus q_{r−1}x = q_r(1 − x), the solution of which is (11). Note that a_{n,r−1} ≤ 1/2, with equality iff P(M = r − 1) = 0. (Compare to b_2 = 1/2.) Now suppose that a_{n,k+1} > ... > a_{n,r−1} (possibly k + 1 = r − 1). Similar to the reasoning in the case with no freeze, we need that the solution a_{n,k} of x in the equation

$$q_k\, x^{\,r-k} = \sum_{j=1}^{r-k} q_{k+j}\, W_j^{(r-k)}(x) \qquad (B.1)$$

be greater than a_{n,k+1}. The solution of (B.1) satisfies

$$q_k + \sum_{j=1}^{r-k} q_{r+1-j}/j = \sum_{j=1}^{r-k} q_{r+1-j}\left[j\, x^j\right]^{-1}. \qquad (B.2)$$

(This coincides with (9) when all q's are equal.) Note that the right side of (B.2) is decreasing in x. Because a_{n,k+1} is the solution to an equation similar to (B.2) with k replaced by k + 1, we know that

$$q_{k+1} + \sum_{j=1}^{r-k-1} q_{r+1-j}/j = \sum_{j=1}^{r-k-1} q_{r+1-j}\left[j\, a_{n,k+1}^{\,j}\right]^{-1}. \qquad (B.3)$$

Thus a_{n,k} > a_{n,k+1} iff, when we substitute a_{n,k} for a_{n,k+1} in (B.3), the equality sign is replaced by >. But a_{n,k} solves (B.2). This yields (10).

Proof That for Uniform Freeze, γ_j^{j+1} Tends to a Finite Limit

It clearly suffices to show that γ_j^{\,j+1} ≤ B for all j sufficiently large and some finite B. If this holds, then for any subsequence {j_k} such that γ_{j_k}^{\,j_k+1} has a limit as k → ∞, the arguments in (20) and (21) show that this limit is the value α given in (22).

We show that for all j > j_0, one has γ_j < 1 + 4/j. Suppose that this is false. Then for some j sufficiently large, γ_j ≥ 1 + 4/j; thus

$$2j + 1 = \sum_{k=1}^{j} \gamma_j^{\,k} \ge \sum_{k=\lceil 3j/4 \rceil}^{j} \left(1 + \frac{4}{j}\right)^{\!k} \ge j\left[\left(e^3 - e^2\right)/4\right] > 3j > 2j + 1, \qquad (B.4)$$


which is a contradiction. Thus γ_j < 1 + 4/j; hence γ_j^{\,j+1} < (1 + 4/j)^{j+1} → e⁴. Take B = e⁴ + 1, and the proof is complete.

[Received May 1994. Revised July 1995.]

REFERENCES

Chow, Y. S., Robbins, H., and Siegmund, D. (1971), Great Expectations: The Theory of Optimal Stopping, Boston: Houghton Mifflin.

Gilbert, J., and Mosteller, F. (1966), "Recognizing the Maximum of a Sequence," Journal of the American Statistical Association, 61, 35-73.

Presman, E. L., and Sonin, I. M. (1972), "The Best Choice Problem for a Random Number of Objects," Theory of Probability and Its Applications, 17, 657-668.

Rasmussen, W. T., and Pliska, S. R. (1976), "Choosing the Maximum From a Sequence, With a Discount Function," Applied Mathematics and Optimization, 2, 279-289.

Sakaguchi, M. (1986), "Best Choice Problems for Randomly Arriving Offers During a Random Lifetime," Mathematica Japonica, 31, 107-117.

Samuel-Cahn, E. (1995), "The Best-Choice Secretary Problem With Random Freeze on Jobs," Stochastic Processes and Their Applications, 55, 315-327.

Samuels, S. M. (1982), "Exact Solutions for the Full Information Best Choice Problem," Technical Report 82-17, Purdue University, Dept. of Statistics.

(1991), "Secretary Problems," in Handbook of Sequential Analysis, eds. B. K. Ghosh and P. K. Sen, New York: Marcel Dekker, pp. 381-405.
