


Department of Mathematics

wtrnumber2014-3

Selecting the Smallest Extreme Value Population

with the Smallest Variance

Yan Zhang and Francis Pascual

April 2014

Postal address: Department of Mathematics, Washington State University, Pullman, WA 99164-3113 • Voice: 509-335-3926 • Fax: 509-335-1188 • Email: [email protected] • URL: http://www.sci.wsu.edu/math


Selecting the Smallest Extreme Value Population with the Smallest Variance

Yan Zhang and Francis Pascual, Department of Mathematics

Washington State University, Pullman WA 99164

I Introduction

I.1 Background

Researchers from the Civil Engineering Department of Washington State University wish to conduct a study on asphalt. The experiment is designed to prepare uniformly mixed particles in randomly selected laboratories.

There are 20 types of aggregates, half coarse and half fine. Three aggregate properties are of interest: form, angularity, and texture. Four measuring methods are employed: X-ray tomography, digital caliper, visual reference, and manual imaging and analysis. These methods give different readings on average. Of the four, visual reference is subjective, since each observation is obtained by a trained technician scoring aggregate samples against pre-set visual standards. The methods are therefore not expected to produce results that are similar in either average accuracy or variability.

I.2 Motivation

Because of these characteristics, the engineers desire a statistical evaluation and comparison of the standard methods. A careful study of the experimental design and the measuring methods shows that three statistical tasks are necessary.

First, a gage R&R (repeatability and reproducibility) analysis. This analysis quantifies the amount of variation, or noise, in the measurements of the different test methods. Especially when two or more operators are involved in assessing the specimens, this quantification of the inherent variation provides valuable information to the researchers.

Second, a comparison of the test methods. The test methods are expected to have not only different means but also different variances, and small variation in repeated measurements on the same specimen is a very desirable property of a test method. The goal here is to study the amount of variation of each measuring technique and to determine which methods correlate well in the measuring sense.

Lastly, the analytic hierarchy process is applied to evaluate and compare the different

test methods with respect to a set of specific criteria.

The second of these concerns motivates the study presented in this paper. The primary research tool used in this project is simulation. We rank test methods with respect to their sample variances and assess the performance of the proposed ranking procedure for various values of the proportion of observed failures, the number of treatments, the sample size, and the ratio of variances. Preliminary work done by the engineers suggests that the Weibull distribution is appropriate; this information is enough for a simulation study.

Asymptotic methods are important tools for approximating the large-sample behavior of the statistics used in the comparisons. Note that log(Weibull) data follow a smallest extreme value (SEV) distribution. For Type II censored SEV data with r failures, the SEV scale parameter has a pivotal quantity. Type II censored data are collected so that, instead of recording all failure times, the test is terminated once a certain number of failures has been observed. We can then use the distribution of this quantity to derive a theoretical lower bound on the probability of choosing the best population, i.e., the one with the smallest variance. Bain and Engelhardt [2] (1991, page 223) suggest the approximation $2r\hat{\sigma}/\sigma \sim \chi^2(2(r-1))$ for heavily censored Type II data and $n\hat{\sigma}^2/\sigma^2 \sim \chi^2(n-1)$ for complete data, where r is the number of failures and n is the sample size. These can be used to approximate the distributions of random variables involving σ.

I.3 Past work

Similar research can be found in the literature. Several studies are based on the Weibull distribution. In [8, 9, 10], Kingston and Patel studied selecting the best population from k two-parameter Weibull populations. Populations are ranked by their reliabilities or by a specified quantile, classified with respect to a control population, and, more generally, a restricted subset containing the l best populations is selected. Dudewicz [5] reviewed seven papers on multi-population ranking and selection based on P(CS).

More work has been done on other distributions. Bechhofer [3] described a single-sample multiple decision procedure. David [4] used two approaches to rank normal variances. Gupta and Sobel [6, 7] studied a statistic Y, the smallest of p correlated F statistics, and the problem of finding a subset of k normal populations containing the best one. Saxena [13] studied a random-width confidence interval for the largest variance of k normal populations. Ofosu [12] selected the normal population with the smallest variance by a minimax procedure. Arvesen and McCabe [1] selected a subset based on correlated variances.

II Study by Simulation

We first approach the problem by simulation, relying on the engineers' knowledge that the Weibull distribution adequately describes the data.

II.1 Notations and Assumptions

Let π1, π2, ..., πk be k (k ≥ 2) populations, where k is the number of treatments. We assume that the random variable Xi associated with πi, i = 1, 2, ..., k, has a Weibull distribution with scale parameter ηi, shape parameter βi, and cumulative distribution function (cdf)

$$F_i(x) = P(X_i \le x;\, \eta_i, \beta_i) = 1 - \exp\left[-\left(\frac{x}{\eta_i}\right)^{\beta_i}\right], \qquad \text{for } i = 1, 2, \ldots, k.$$

For i = 1, 2, ..., k, suppose we take a simple random sample of size n from population πi, denoted by Xi1, Xi2, ..., Xin, which are independently and identically distributed (iid) Weibull(ηi, βi). Now let Yij = ln(Xij) for i = 1, 2, ..., k and j = 1, 2, ..., n. Then Yi1, Yi2, ..., Yin are iid SEV with location parameter μi = ln(ηi), scale parameter σi = 1/βi, and cdf

$$F_i(y) = 1 - \exp\left[-\exp\left(\frac{y - \mu_i}{\sigma_i}\right)\right], \qquad \text{for } i = 1, 2, \ldots, k.$$
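As a quick illustration of this log-transform relationship (our own sketch, not part of the original study; the values of beta and eta are arbitrary), empirical quantiles of log(Weibull) data should track the SEV quantile function $\mu + \sigma\ln[-\ln(1-p)]$:

# Sketch: verify numerically that log(Weibull) quantiles match the SEV
# quantile function mu + sigma*log(-log(1-p)); beta and eta are
# arbitrary illustrative values, not from the study
beta <- 2
eta <- 100
y <- log(rweibull(10000, beta, eta))    # log(Weibull) sample
p <- c(0.1, 0.25, 0.5, 0.75, 0.9)
cbind(empirical = quantile(y, p),
      theoretical = log(eta) + (1/beta) * log(-log(1 - p)))

The two columns should agree up to simulation noise, with mu = log(eta) and sigma = 1/beta.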

II.2 Objectives and Procedures

The decision rule in this study for choosing the best population is to find the population that is believed, with a certain probability, to have the smallest variance.

The Yij follow SEV(μi, σi), i = 1, 2, ..., k and j = 1, 2, ..., n. Compute the maximum likelihood estimates (MLEs) $\hat{\mu}_i$ and $\hat{\sigma}_i$ of μi and σi. Note that $\hat{\sigma}_i/\sigma_i$ is pivotal for σi; that is, its distribution is free of unknown parameters.


Since the variance of the SEV distribution is $\pi^2\sigma_i^2/6$, finding the population with the smallest variance is equivalent to finding the smallest σi. Without loss of generality, we assume that $\sigma_1 \le \sigma_2 \le \sigma_3 \le \cdots \le \sigma_k$.

Let

1. $f(\hat{\sigma}_1, \hat{\sigma}_2, \ldots, \hat{\sigma}_k)$ be the joint density function of $\hat{\sigma}_1, \hat{\sigma}_2, \ldots, \hat{\sigma}_k$;

2. $f(\hat{\sigma}_1)$ be the marginal density function of $\hat{\sigma}_1$;

3. $f(\hat{\sigma}_2, \hat{\sigma}_3, \ldots, \hat{\sigma}_k \mid \hat{\sigma}_1)$ be the joint conditional density of $\hat{\sigma}_2, \hat{\sigma}_3, \ldots, \hat{\sigma}_k$ given $\hat{\sigma}_1$.

Then

$$P(CS) = P(\hat{\sigma}_1 \le \hat{\sigma}_2,\; \hat{\sigma}_1 \le \hat{\sigma}_3,\; \ldots,\; \hat{\sigma}_1 \le \hat{\sigma}_k) \qquad (1)$$

$$\phantom{P(CS)} = \int_0^\infty P(\hat{\sigma}_1 \le \hat{\sigma}_2, \ldots, \hat{\sigma}_1 \le \hat{\sigma}_k \mid \hat{\sigma}_1)\, f(\hat{\sigma}_1)\, d\hat{\sigma}_1,$$

and

$$P(CS) = P\left(\hat{\sigma}_1 \le \min\{\hat{\sigma}_2, \hat{\sigma}_3, \ldots, \hat{\sigma}_k\}\right) = P\left(\frac{\hat{\sigma}_1}{\sigma_1} \le \min\left\{\frac{\sigma_2}{\sigma_1}\frac{\hat{\sigma}_2}{\sigma_2},\; \frac{\sigma_3}{\sigma_1}\frac{\hat{\sigma}_3}{\sigma_3},\; \ldots,\; \frac{\sigma_k}{\sigma_1}\frac{\hat{\sigma}_k}{\sigma_k}\right\}\right)$$

$$\phantom{P(CS)} = P\left(Z_1 \le \min\{r_2 Z_2,\; r_3 Z_3,\; \ldots,\; r_k Z_k\}\right), \qquad (2)$$

where $Z_i = \hat{\sigma}_i/\sigma_i$ for $i = 1, 2, \ldots, k$, $r_i = \sigma_i/\sigma_1$ for $i = 2, 3, \ldots, k$, and CS denotes the event of "correct selection".

The joint distribution of $Z_1, Z_2, \ldots, Z_k$ does not depend on the $\sigma_i$, $i = 1, 2, \ldots, k$. It depends on the cdf of the underlying distribution, the sample size n, and the number r of observed failures under Type II censoring.

Recall that we assumed $\sigma_1 \le \sigma_2 \le \sigma_3 \le \cdots \le \sigma_k$, so that $1 \le r_2 \le r_3 \le \cdots \le r_k$. Hence $Z_1 \le r_2\min\{Z_2, Z_3, \ldots, Z_k\}$ implies $Z_1 \le \min\{r_2 Z_2,\; r_3 Z_3,\; \ldots,\; r_k Z_k\}$, and

$$P(CS) \ge P\left(Z_1 \le r_2\min\{Z_2, Z_3, \ldots, Z_k\}\right), \qquad (3)$$

where equality holds when $\sigma_2 = \sigma_3 = \cdots = \sigma_k$, or $r_2 = r_3 = \cdots = r_k$. Note that $r_1 = \sigma_1/\sigma_1 = 1$. Thus $P(Z_1 \le r_2\min\{Z_2, \ldots, Z_k\})$ can serve as a lower bound on P(CS).
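Because the $Z_i$ are pivotal, the bound (3) can be evaluated by Monte Carlo with all samples drawn at a single convenient value of σ, say σ = 1, so that $Z = \hat{\sigma}$. The skeleton below shows only the counting logic; zhat.sim is a hypothetical helper returning one simulated $\hat{\sigma}$ from a standard SEV Type II censored sample, and the full implementation is the function prob.cs in the appendix.

# Skeleton of the Monte Carlo evaluation of the bound (3);
# zhat.sim(n, r) is a hypothetical helper returning one simulated
# sigma.hat from a standard SEV sample (so that Z = sigma.hat)
pcs.bound.mc <- function(B, k, r2, n, r) {
  count <- 0
  for (b in 1:B) {
    z <- rep(NA, k)
    for (i in 1:k) z[i] <- zhat.sim(n, r)   # k independent pivotal draws
    if (z[1] <= r2 * min(z[-1])) count <- count + 1
  }
  return(count/B)
}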


III Asymptotic χ² Approximation

The χ² distribution can closely approximate the distribution of $\hat{\sigma}$ under Type II censoring, i.e., when tests are terminated after r failures are observed. For heavy censoring, or small r, the approximation

$$\frac{2r\hat{\sigma}_i}{\sigma_i} \sim \chi^2(2(r-1))$$

is used, while for complete data, or no censoring,

$$\frac{n_i\hat{\sigma}_i^2}{\sigma_i^2} \sim \chi^2(n_i - 1),$$

where $n_i$ is the sample size, is more appropriate. These approximations are given in Bain and Engelhardt [2] (1991, page 223).

III.1 Heavily Censored Data

Recall (1). We have

$$P(CS) = P(\hat{\sigma}_1 \le \hat{\sigma}_2,\; \hat{\sigma}_1 \le \hat{\sigma}_3,\; \ldots,\; \hat{\sigma}_1 \le \hat{\sigma}_k) = P\left(\frac{2r\hat{\sigma}_1}{\sigma_1} \le \frac{\sigma_2}{\sigma_1}\,\frac{2r\hat{\sigma}_2}{\sigma_2},\; \ldots,\; \frac{2r\hat{\sigma}_1}{\sigma_1} \le \frac{\sigma_k}{\sigma_1}\,\frac{2r\hat{\sigma}_k}{\sigma_k}\right).$$

Let $W_i = 2r\hat{\sigma}_i/\sigma_i$ for $i = 1, 2, \ldots, k$. Thus

$$P(CS) = P(W_1 \le r_2 W_2,\; W_1 \le r_3 W_3,\; \ldots,\; W_1 \le r_k W_k) = \int_0^\infty P(w_1 \le r_2 W_2, \ldots, w_1 \le r_k W_k \mid W_1 = w_1)\, f_1(w_1)\, dw_1,$$

where $f_i(w_i)$ is the pdf of $W_i$.

Let $F_i(w_i)$ be the cdf of $W_i$. By independence of the $W_i$'s, given $w_1$ we have

$$P(CS) = \int_0^\infty \prod_{i=2}^{k} P\left(\frac{w_1}{r_i} \le W_i\right) f_1(w_1)\, dw_1 = \int_0^\infty \prod_{i=2}^{k} \left[1 - F_i\left(\frac{w_1}{r_i}\right)\right] f_1(w_1)\, dw_1.$$

Note that $f_i$ and $F_i$ can be approximated by the $\chi^2(2(r-1))$ pdf and cdf, respectively; denote these common approximations by $f$ and $F$ below.

Now, since $1 \le r_2 \le r_3 \le \cdots \le r_k$, for $i = 2, \ldots, k$,

$$F_i\left(\frac{w_1}{r_i}\right) \le F_i\left(\frac{w_1}{r_2}\right), \qquad \text{so that} \qquad 1 - F_i\left(\frac{w_1}{r_i}\right) \ge 1 - F\left(\frac{w_1}{r_2}\right).$$

Thus,

$$P(CS) \ge \int_0^\infty \left[1 - F\left(\frac{w_1}{r_2}\right)\right]^{k-1} f(w_1)\, dw_1. \qquad (4)$$

The right-hand side provides a lower bound for P(CS) in choosing the smallest variance. It depends on the sample size, r2, k, and the amount of censoring.

III.2 Complete Data

Similarly, we can derive a lower bound for complete data using $V_i = n_i\hat{\sigma}_i^2/\sigma_i^2 \sim \chi^2(n_i - 1)$. The lower bound we get is slightly different, namely,

$$P(CS) \ge \int_0^\infty \left[1 - F\left(\frac{v_1}{r_2^2}\right)\right]^{k-1} f(v_1)\, dv_1, \qquad (5)$$

where $F$ and $f$ now denote the $\chi^2(n-1)$ cdf and pdf.

The lower bound in (4) or (5) is simpler than (3) because it depends only on r2, k, and the amount of censoring r. We shall see that this approximation is fairly good even for relatively small sample sizes.
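As a sketch of how (4) and (5) could be evaluated numerically (a plain Riemann sum over a fine grid is used here to stay independent of any particular integrate() interface; the function names are ours, and the appendix instead calls a compiled Fortran routine):

# Sketch: lower bounds (4) and (5) via the chi-square approximation,
# evaluated with a simple Riemann sum on a fine grid
pcs.lb.censored <- function(r, k, r2) {
  df <- 2 * (r - 1)                    # heavy censoring: chi-square(2(r-1))
  w <- seq(1e-4, qchisq(0.99999, df), length = 5000)
  h <- w[2] - w[1]
  sum((1 - pchisq(w/r2, df))^(k - 1) * dchisq(w, df)) * h
}
pcs.lb.complete <- function(n, k, r2) {
  df <- n - 1                          # complete data: chi-square(n-1)
  v <- seq(1e-4, qchisq(0.99999, df), length = 5000)
  h <- v[2] - v[1]
  sum((1 - pchisq(v/r2^2, df))^(k - 1) * dchisq(v, df)) * h
}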

III.3 Ranking SEV Scale Parameters

In the asphalt research, the engineers want to reduce the number of test methods and keep only the best treatments based on the preliminary data. So we choose the best t populations from a total of k populations; that is, we divide the test methods into two subgroups, the best t populations (unordered) and the worst k−t populations (unordered).

When making a decision, we want to know the probability that the decision is correct. Let the event of a correct decision be denoted by CD.

Assume $\sigma_1 \le \sigma_2 \le \sigma_3 \le \cdots \le \sigma_k$. Let

1. $f(\hat{\sigma}_1, \hat{\sigma}_2, \ldots, \hat{\sigma}_k)$ be the joint density function of $\hat{\sigma}_1, \hat{\sigma}_2, \ldots, \hat{\sigma}_k$;

2. $f(\hat{\sigma}_1), f(\hat{\sigma}_2), \ldots, f(\hat{\sigma}_t)$ be the marginal density functions of $\hat{\sigma}_1, \hat{\sigma}_2, \ldots, \hat{\sigma}_t$, respectively;

3. $f(\hat{\sigma}_{t+1}, \hat{\sigma}_{t+2}, \ldots, \hat{\sigma}_k \mid \hat{\sigma}_1, \hat{\sigma}_2, \ldots, \hat{\sigma}_t)$ be the joint conditional density of $\hat{\sigma}_{t+1}, \hat{\sigma}_{t+2}, \ldots, \hat{\sigma}_k$ given $\hat{\sigma}_1, \hat{\sigma}_2, \ldots, \hat{\sigma}_t$.


$$P(CD) = P\left(\max\{\hat{\sigma}_1, \ldots, \hat{\sigma}_t\} \le \min\{\hat{\sigma}_{t+1}, \ldots, \hat{\sigma}_k\}\right) = \sum_{j=1}^{t} P\left(\hat{\sigma}_j = \max\{\hat{\sigma}_1, \ldots, \hat{\sigma}_t\},\; \hat{\sigma}_j \le \min\{\hat{\sigma}_{t+1}, \ldots, \hat{\sigma}_k\}\right).$$

Let A(j) = {1, 2, ..., t}\{j} and B(t) = {t+1, t+2, ..., k}. Then

$$P(CD) = \sum_{j=1}^{t} P\left(\hat{\sigma}_{i_1} \le \hat{\sigma}_j \le \hat{\sigma}_{i_2} \text{ for } i_1 \in A(j) \text{ and } i_2 \in B(t)\right).$$

Under Type II censoring (r failures),

$$P(CD) = \sum_{j=1}^{t} P\left(\frac{\sigma_{i_1}}{\sigma_j}\frac{2r\hat{\sigma}_{i_1}}{\sigma_{i_1}} \le \frac{2r\hat{\sigma}_j}{\sigma_j} \le \frac{\sigma_{i_2}}{\sigma_j}\frac{2r\hat{\sigma}_{i_2}}{\sigma_{i_2}} \text{ for } i_1 \in A(j),\; i_2 \in B(t)\right).$$

Let $U_i = 2r\hat{\sigma}_i/\sigma_i$, and let $F_i(u)$ and $f_i(u)$ be its cdf and pdf, respectively. Let $s(i, j) = \sigma_i/\sigma_j$. Then

$$P(CD) = \sum_{j=1}^{t} P\left(s(i_1, j)\,U_{i_1} \le U_j \le s(i_2, j)\,U_{i_2} \text{ for } i_1 \in A(j),\; i_2 \in B(t)\right)$$

$$\phantom{P(CD)} = \sum_{j=1}^{t} \int_0^\infty P\left(s(i_1, j)\,U_{i_1} \le u \le s(i_2, j)\,U_{i_2} \text{ for } i_1 \in A(j),\; i_2 \in B(t) \,\middle|\, U_j = u\right) f_j(u)\, du$$

$$\phantom{P(CD)} = \sum_{j=1}^{t} \int_0^\infty \prod_{i_1 \in A(j)} F_{i_1}\!\left(\frac{u}{s(i_1, j)}\right) \prod_{i_2 \in B(t)} \left[1 - F_{i_2}\!\left(\frac{u}{s(i_2, j)}\right)\right] f_j(u)\, du. \qquad (6)$$

Again, note that the χ² pdf and cdf with the appropriate degrees of freedom can be used to approximate the true pdf and cdf in the calculation.

More generally, the k populations can also be ranked into t+1 groups: the 1st best, the 2nd best, ..., the tth best, and the worst k−t unordered populations. Complete ordering occurs when t = k−1.
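As an illustration of (6) (our simplification, not code from the report): if the best t populations share a common σ and the worst k−t populations share the common value r2·σ, then every s(i, j) is either 1 or r2, the t summands coincide, and (6) reduces to t times a one-dimensional integral:

# Sketch of (6) under the chi-square approximation, assuming the best t
# populations have common sigma and the worst k-t have sigma*r2, so that
# s(i,j) is 1 within the best group and r2 across groups
pcd.lb <- function(r, k, t, r2) {
  df <- 2 * (r - 1)
  u <- seq(1e-4, qchisq(0.99999, df), length = 5000)
  h <- u[2] - u[1]
  term <- pchisq(u, df)^(t - 1) *            # U_j is the largest of the best t
          (1 - pchisq(u/r2, df))^(k - t) *   # and below the scaled worst k-t
          dchisq(u, df)
  t * sum(term) * h                          # the t terms of (6) coincide
}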


IV Study Results

As stated earlier, the engineers wish to collect data for an asphalt study. Preliminary studies have suggested that the Weibull distribution is appropriate for the distribution of the measurements. Recall that log(Weibull) data follow a SEV distribution. The SEV distribution is a standard location-scale distribution, and its variance depends only on its scale parameter σ. These two properties make studying the SEV variance much easier than the Weibull variance. The SEV scale parameter σ has a pivotal quantity with properties that simplify our study.

There are k test methods. Accuracy and variability are both crucial to this study, but because there is a standard against which the engineers can compare values, it is more important here that treatments have small variance. The MLE σ̂ is computed for each treatment. The engineers want to pick the treatment with the smallest variance by choosing the smallest σ̂. However, this procedure does not guarantee a correct selection, because the smallest σ̂ does not always correspond to the smallest σ. Therefore the probability of correct selection needs to be evaluated, that is, the probability of choosing the population whose variance is in fact the smallest. There are two goals in this study: to estimate the probability of correct selection and to obtain a lower bound for it.

IV.1 Simulation Results

Let πi represent the ith population, with SEV scale parameter σi. Assume that π1 is the best population. Under the worst-case scenario, the k scale parameters satisfy $\sigma_1 \le \sigma_2 = \sigma_3 = \cdots = \sigma_k$; in this special case P(CS) takes on its smallest possible value. Let σ1 = 1, σ2 ≥ 1, and let the ratio $r_2 = \sigma_2/\sigma_1$ vary between 1 and 6; the study found that when r2 > 6, P(CS) is very close to 1. Without loss of generality, assume the SEV location parameter μ = 0 for all populations. Recall that in Section II we showed that $P(CS) \ge P(Z_1 \le r_2\min\{Z_2, Z_3, \ldots, Z_k\})$, with equality when $\sigma_2 = \sigma_3 = \cdots = \sigma_k$.


Procedures

Simulate a Type II censored simple random sample of size n from population πi, for i = 1, ..., k. The r smallest values are observed failures, while the other n−r values are censored at the rth smallest value. The r smallest order statistics in a sample of size n are generated from a standard SEV distribution. For each of the k generated samples, compute the maximum likelihood estimate σ̂i by numerically maximizing the SEV likelihood function for the ith sample:

$$L_i(\mu_i, \sigma_i) = \prod_{j=1}^{n} \left[f_i(y_{ij})\right]^{\delta_{ij}} \left[1 - F_i(y_{ij})\right]^{1 - \delta_{ij}}, \qquad \delta_{ij} = \begin{cases} 1, & \text{if } y_{ij} \text{ is a failure,} \\ 0, & \text{if } y_{ij} \text{ is censored,} \end{cases}$$

where

$$f_i(y_{ij}) = \frac{1}{\sigma_i} \exp\left[\frac{y_{ij} - \mu_i}{\sigma_i} - \exp\left(\frac{y_{ij} - \mu_i}{\sigma_i}\right)\right]$$

is the SEV pdf, and

$$F_i(y_{ij}) = 1 - \exp\left[-\exp\left(\frac{y_{ij} - \mu_i}{\sigma_i}\right)\right]$$

is the SEV cdf, for i = 1, ..., k.
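The appendix function log.likelihood implements the negative logarithm of this likelihood, so a single $\hat{\sigma}_i$ can be obtained by minimizing it with nlminb; for example (a sketch with illustrative values):

# One fitted sigma.hat using the appendix functions (sketch)
d <- sim.sev.dat(20, 6, 1, 1)$data   # n = 20, r = 6, beta = 1, eta = 1 (sigma = 1, mu = 0)
par <- nlminb(c(0, 0), log.likelihood, data = d)$parameters
sigma.hat <- 1/exp(par[1])           # sigma = 1/beta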

Since minimum variance is equivalent to minimum σi, the sample with the smallest σ̂i is chosen as the best. This completes one simulation of the experiment. Repeat this procedure 1000 times. The probability of correct selection is then estimated by counting the number of times σ̂1 is the smallest and dividing by 1000.

For values of r2 between 1 and 6, a Monte Carlo evaluation of P(CS), or of its lower bound, is the proportion of the time that $Z_1 \le r_2\min\{Z_2, Z_3, \ldots, Z_k\}$. When σi = 1 for all i, we have $Z_1 = \hat{\sigma}_1, Z_2 = \hat{\sigma}_2, \ldots, Z_k = \hat{\sigma}_k$, and the inequality becomes $\hat{\sigma}_1 \le r_2\min\{\hat{\sigma}_2, \ldots, \hat{\sigma}_k\}$.

For example, suppose the sample size of each treatment is 20, with four treatments and six observed failures. We generate a sample of size 20 from each of the four populations, with 6 observed failures and 14 observations censored at the 6th smallest value. Assume that σ1 = σ2 = σ3 = σ4 = 1 and μ1 = μ2 = μ3 = μ4 = 0, so that r2 = 1. Then from each simulated sample, numerically maximize the likelihood function with respect to the parameters μ and σ. The MLEs are σ̂1, σ̂2, σ̂3, σ̂4. If $\hat{\sigma}_1 \le \min\{\hat{\sigma}_2, \hat{\sigma}_3, \hat{\sigma}_4\}$, the counter increases by 1; otherwise it does not. After this exact process is repeated 1000 times, P(CS) is estimated by the counter value divided by 1000. This estimates the probability of correct selection when σ1 = σ2 = σ3 = σ4.

We expect this P(CS) to be 1/4, because each σ̂ has an equal chance of being the smallest. In general, P(CS) = 1/k, where k is the number of treatments. When r2 > 1, that is, σ2 > σ1, the Monte Carlo evaluation of P(CS) instead compares σ̂1 to $r_2\min\{\hat{\sigma}_2, \hat{\sigma}_3, \hat{\sigma}_4\}$ rather than to $\min\{\hat{\sigma}_2, \hat{\sigma}_3, \hat{\sigma}_4\}$, and P(CS) is estimated in the same fashion.
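In terms of the appendix code, this experiment corresponds to a call such as the following sketch, where int gives starting values for the optimizer and log.theta = (ln β, ln η) = (0, 0) encodes σ = 1 and μ = 0:

# Sketch: the n = 20, k = 4, r = 6 experiment via the appendix function prob.cs
out <- prob.cs(int = c(0, 0), log.theta = c(0, 0), n = 20, r = 6, k = 4, B = 1000)
plot(out$r2, out$pcs, type = "l", xlab = "r2", ylab = "Probability of Correct Selection")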

The above procedure is carried out using S-PLUS. The results are in Figure 4.1.

Figure 4.1 P(CS) Curve when n=20, k=4 and r=6

The y-axis shows the probability of correct selection; the x-axis shows $r_2 = \sigma_2/\sigma_1$. As the figure shows, the larger r2 is, the higher P(CS). When r2 = 1, the graph reads approximately 0.25, as expected. When r2 = 2.5, P(CS) is over 0.9.

The curve shows that P(CS) increases with r2, but the rate of increase is not uniform; it declines as r2 increases. So the higher r2 is, the more likely a correct selection, with the improvement most pronounced when r2 is relatively small. In practice the researchers usually have some idea of how big a difference in the σ's they want to detect, as reflected in r2, and this graph shows their chance of correctly picking the truly smallest variance.

There are, however, other ways to improve P(CS) besides increasing r2, which may not be a feasible approach. We generalize this study to other sample sizes of n = 20, 40, 50, 100, 200, proportions failing of 10%, 30%, 50%, 70%, 90% and



100%, and numbers of treatments of k = 2, 4, 6, 8, 10. A few selected graphs of these combinations follow. In the hope of finding a pattern in the behavior of P(CS), we vary one factor at a time while fixing the others.

Figure 4.2 P(CS) Curves when k=6 and proportion failing = 30%

In Figure 4.2, we first fix the experiment at 6 treatments and 30% failing and vary the sample size. The lowest curve has the smallest sample size and the highest curve the largest. The same trend is seen for other values of k and proportions failing, not shown here. Therefore we conclude that, as expected, P(CS) increases with sample size. Looking closely, all the curves start at almost the same place, where P(CS) is approximately 1/6, as expected. P(CS) increases rapidly until about r2 = 2, after which the rate of increase drops sharply, to almost 0 along the flat portion of the curves; this is because probability is bounded above by 1.

Note that the biggest gaps between these curves, where they are most distinguishable, all occur around r2 = 2. So if the researchers want to detect an r2 around 2 or smaller, increasing the sample size will improve P(CS) substantially. The curve for n = 200 reaches P(CS) = 0.9 at r2 = 1.2, while the curve for n = 20 has P(CS) < 0.4 there and does not reach 0.9 until r2 is about 3, a huge difference in terms of variance. However, a sample size of 200 may be impractical in some experiments. The gaps between the curves are also not uniform: although n = 200 exceeds


n = 50 by 150, their P(CS) difference is about the same as that between n = 40 and n = 20, which differ by only 20. Certainly, there are practical limits to increasing the sample size.

Figure 4.3 P(CS) Curves when n=40 and k=2

In Figure 4.3, we fix the experiment at a sample size of 40 with 2 treatments and study how the number of observed failures affects P(CS). Again all the curves start at P(CS) = 0.5, the highest possible starting point for any P(CS) curve, and they follow patterns similar to those in Figures 4.1 and 4.2: the more observed failures, the higher P(CS). Even though adjacent curves all differ by the same 20 percentage points of proportion failing, the P(CS) difference between 10% and 30% failing is a few times larger than that between 30% and 50% failing, and larger still compared with that between 70% and 90% failing; the improvement is not proportional to the change in percentage failing. Again, the most noticeable improvements occur around r2 = 2, so increasing the proportion failing when it is relatively low and r2 is believed to be around 2 will improve P(CS). Note that a low value of r results in shorter experiments, while a high value of r means longer experiments.


Figure 4.4 P(CS) Curves when n=20 and proportion failing = 50%

In Figure 4.4, we vary the number of treatments while fixing the sample size at 20 and the proportion failing at 50%. The most obvious difference from Figure 4.3 is that the curves start at different points. As discussed before, when r2 = 1, P(CS) = 1/k, so for different k the curves do not start at the same place. They then increase at a similar pace until close to 1. That is to say, when r2 is small, a change in r2 brings similar changes in P(CS) regardless of k; when r2 is high, little changes. The fewer the treatments, the higher P(CS).

We make the following conclusions.

If r2, by an available reasonable estimate, is between 1.5 and 2.5, increasing the sample size and the number of observed failures will greatly help bring a low P(CS) to a much higher level.

If r2 is believed to be smaller than 1.5, the previous recommendations still work, but eliminating unnecessary treatment(s) can improve P(CS) considerably.

If r2 is higher than 2.5, P(CS) may already be high and no action is necessary. If there is still room for improvement, increase the sample size, the number of observed failures, or both; changing the number of treatments has no significant effect.

This study helps the researchers determine an optimal combination of sample size, number of treatments, and proportion failing, based on their own research constraints, for future studies extending a preliminary study. But we should keep in mind that the


true P(CS) is higher than the value read off the graph: the plots give only a lower bound on P(CS). When the SEV scale parameter σ differs from population to population, the true P(CS) is larger than the plotted value. The approach is thus conservative, but it should serve as a practical guideline.

IV.2 Approximation Results

Alternatively, under the same assumptions, the probability of correct selection can be bounded from below by the theoretical approximation, using formulas (4) and (5) with the cdf and pdf of the χ² distribution. Since (4) and (5) have no closed-form solution, the lower bound is computed by numerical integration. For easier comparison, the parameter values are the same as in the simulation study. Figure 4.5 plots the P(CS) lower bound by χ² approximation when the sample size is 20, with four treatments and six observed failures.

Figure 4.5 P(CS) Curve by Approximation when n=20, k=4 and r=6

In Figure 4.5, the P(CS) curve has the same shape as that in Figure 4.1. Although it is a smoother curve, it has the same characteristics found in Figure 4.1: the rate of increase likewise diminishes as r2 increases, and the neighborhood of r2 = 2 is again a critical region for how P(CS) behaves, though the changes are smoother than in the simulation curves. By the time r2 reaches 6, P(CS) is very close to 1. This graph agrees with our conclusions from the simulations.


Figure 4.6 P(CS) Curves by Approximation when k=6 and proportion failing = 30%

Figure 4.6 corresponds to Figure 4.2. The two resemble each other, but there are differences; when they are set side by side, as in the next section, the following points emerge. Note that in this case the sample size varies, and how well the χ² approximation works depends crucially on how big the sample size is: the larger the sample size, the better the two lower bounds agree. When n = 200, their largest difference in P(CS) is 0.0367; when n = 20, the largest difference is 0.0588 and the average difference is 0.0265. The χ² approximation is thus very good for large samples, and it improves markedly once the sample size exceeds 50. Another observation is that all these curves start exactly at 1/6.


Figure 4.7 P(CS) Curves by Approximation when n=40 and k=2

Figure 4.7 is the χ² approximation version of Figure 4.3. Like Figures 4.6 and 4.2, Figure 4.7 supports the conclusions drawn from Figure 4.3. When the sample size is not very large, the χ² approximation curves are lower than the simulation curves by 0.02 to 0.03 in P(CS) in the neighborhood of r2 = 2. This provides further evidence for our findings in Figure 4.6.

Figure 4.8 P(CS) Curves by Approximation when n=20 and proportion failing = 50%


Figure 4.8 relates to Figure 4.4 and illustrates the same points in a smoother and clearer way. When r2 is low, the curves are nearly parallel. The difference between k = 6 and k = 8 is around 0.06, close to 1/6 − 1/8 ≈ 0.042, and the difference between k = 2 and k = 4 is about 0.24, close to 1/2 − 1/4 = 0.25. This further demonstrates that the rate of increase is not affected by the number of treatments.

IV.3 Comparison of Simulation to Approximation

Now, for a closer inspection, we put the curves from simulation and approximation on the same plot. Figure 4.9 combines Figures 4.1 and 4.5. The approximation bounds the simulation from below most of the time, and it is a good approximation, for a sample size of only 20, over part of the r2 range.

Figure 4.9 Comparison of P(CS) Curves when n=20, k=4 and r=6

Note that the biggest discrepancy between the two curves again occurs when r2 ∈ [2, 3]; the difference there is as large as 0.07 in probability.

Other possible cases were also studied, and they all yield similar results. In most cases, the lower bounds are close to the simulation results. In some cases with more treatments, smaller sample sizes, and lower proportions failing, the lower bounds from the χ² approximation tend to be lower than those from simulation. Increasing the sample size or the proportion failing improves the χ² approximation.


IV.4 Probability of Correct Decisions

More generally, more than one best population may be chosen, which involves breaking the populations into two subgroups. We modify the general algorithm of Section III to compute P(CD). In calculating the probability of a correct decision, under the same assumptions, we change only the condition for increasing the counter, from $\hat{\sigma}_1 \le \min\{\hat{\sigma}_2, \ldots, \hat{\sigma}_k\}$ to $\max\{\hat{\sigma}_1, \ldots, \hat{\sigma}_t\} \le \min\{\hat{\sigma}_{t+1}, \ldots, \hat{\sigma}_k\}$.

Taking n = 20, k = 4, and r = 6, we consider two situations: choosing the best two and choosing the best three populations.
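With the appendix function prob.cd, these two situations correspond to m = 2 and m = 3 (a sketch under the same settings as before):

# Sketch: P(CD) for choosing the best m of k = 4 treatments
cd2 <- prob.cd(int = c(0, 0), log.theta = c(0, 0), n = 20, r = 6, k = 4, B = 1000, m = 2)
cd3 <- prob.cd(int = c(0, 0), log.theta = c(0, 0), n = 20, r = 6, k = 4, B = 1000, m = 3)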

Figure 4.10 Comparison of P(CS) and P(CD) Curves when n=20, k=4 and r=6

Figure 4.10 shows the P(CS) curve and the P(CD) curves in one plot. The probability curve for breaking the populations into better and worse halves is the lowest one, while the curves for choosing the best population and for choosing the worst population are very close to each other. In fact, the probability of choosing the worst population is slightly higher than that of choosing the best; however, this results from the inequality conditions we used, and if strict inequalities were used in both cases, the positions of the two curves would be switched. The two probabilities are therefore nearly identical. By analogy, we conclude that the more balanced the two subgroups, the lower the probabilities, with the lowest point occurring when the two subgroups are of equal size, and that the chance of choosing the best t populations is the same as that of choosing the worst t populations.


V Case Study

We now apply our results to an independent example to illustrate their practical application. The data set is from Lee and Wang [11] (2003, page 25); see the appendix for the actual data.

V.1 Data Description

This is a diet study on laboratory rats with three treatments: low-fat, unsaturated-fat, and saturated-fat diets. Each treatment is randomly assigned to thirty experimental units. The times to death of the rats are recorded up to a fixed time, 200; all rats still alive at that point are censored at time 200. This is Type I censoring.

For our purposes, however, we modify the data to introduce Type II censoring: only the first half of each sample is treated as failures, and the other half is treated as censored at the time of death of the 15th rat.

The distribution of the data is assessed before our results are applied. A Weibull q-q plot shows that the data can be adequately described by a Weibull distribution. We then take logarithms of the data to transform them into SEV data.

V.2 Simulation Analysis

From these three SEV samples, we compute the MLEs for each sample. The values are $\hat{\mu}_1 = 1.8$, $\hat{\sigma}_1 = 3.05$; $\hat{\mu}_2 = 0.92$, $\hat{\sigma}_2 = 1.45$; $\hat{\mu}_3 = 0.71$, $\hat{\sigma}_3 = 1.09$. For the purpose of our study, assume these are the true parameter values. The third population has the smallest σ̂ value and is set to be the best population. Using the estimates as true parameter values, we generate a sample of size 30, with 15 failures, from each of the three populations, and then calculate the MLEs from the generated samples. If $\hat{\sigma}_3 \le \min\{\hat{\sigma}_1, \hat{\sigma}_2\}$, the counter is increased by 1; otherwise it is not. Repeating this 1000 times gives P(CS) = counter value / 1000 = 0.762, an estimate of the probability of correctly selecting population three as the best population. Since the parameter values are given, r2 is fixed at about 1.3, so in this case there is no P(CS) curve, only a single value.
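A sketch of the corresponding script, assuming the appendix functions are loaded and taking the fitted values above as the true parameters (β = 1/σ and η = exp(μ)):

# Sketch: case-study simulation with the appendix functions
sigma <- c(3.05, 1.45, 1.09)   # fitted sigma.hat values, taken as true
mu <- c(1.8, 0.92, 0.71)       # fitted mu.hat values, taken as true
count <- 0
for (b in 1:1000) {
  s.hat <- rep(NA, 3)
  for (i in 1:3) {
    d <- sim.sev.dat(30, 15, 1/sigma[i], exp(mu[i]))$data
    par <- nlminb(c(0, 0), log.likelihood, data = d)$parameters
    s.hat[i] <- 1/exp(par[1])
  }
  if (s.hat[3] <= min(s.hat[1:2])) count <- count + 1
}
count/1000                     # estimates P(CS), about 0.762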

V.3 Possible Scenarios

In addition to the above analysis, the focus of our study is on planning future experiments. The researchers may wish to extract more information from this data set to facilitate their future studies. Here are some possible cases:

1. When the treatment variances are unknown, what is the P(CS) curve, given that 30 experimental units are assigned to each treatment?


2. Assume we have enough information on the experiment to estimate the parameters, and one treatment has smaller variance than the other two. What is the P(CS) curve as the proportion of observed failures varies, given a fixed sample size, say 30, for each group?

3. Under the same assumptions as in cases 1 and 2, what are the P(CS) curves if we vary the common sample size?

4. Under the same assumptions as in case 1, suppose the researchers wish to keep the best two treatments for their study. The question then becomes how to choose the two populations that have smaller variances than the remaining one.

By answering these questions, the researchers can plan future experiments, choosing a good sample size and observed proportion failing according to their research goals while remaining confident of making the right choices.

We have shown that the simulated P(CS) lower bounds are better than the approximated ones when n is less than 50, so we approach the above cases by simulation; similar results can be obtained through the approximation. The results are given as P(CS) curves.

In case 1, the sample size is 30 and the number of treatments is 3, so the only parameter that can change is the observed proportion failing. See Figure 5.1.

Figure 5.1 P(CS) Curves of 30 Observations with Different Numbers of Failures (r = 3, 6, 12, 18, 24)

Figure 5.1 shows that observing more failures improves the chance of finding the best population at any given level of r2. When more failures are observed, P(CS)


improves faster as r2 increases; the curve is steeper when the proportion failing is higher. So it is recommended that more failures be observed where possible. For example, to detect a ratio of r2 = 2 with probability over 0.90, at least 12 failures out of 30 should be observed.

In case 2, we assume that r2 = 1.3, based on the data set. With n = 30 and k = 3, the P(CS) curve against the number of failures is given in Figure 5.2.

Figure 5.2 Plot P(CS) against the Number of Failures

This is consistent with our finding in Figure 5.1: more failures, higher P(CS). But the improvement is not drastic; ten more observed failures increase P(CS) by only about 0.15. The relationship between P(CS) and r is approximately linear. In reality, observing more failures can be costly in both time and money, so the researcher may seek other means of improving P(CS).

This brings up case 3. One possible solution is to relax the constraint on the sample size. Now assume that only k = 3 and r2 = 1.3 are known, and plot P(CS) against the sample size.


Figure 5.3 Plots of P(CS) against Sample Size when Proportion Failing Varies (20%, 40%, 60%, 80% failing)

Figure 5.3 shows two things: P(CS) increases when either the sample size or the observed proportion failing increases. Note that the curves do not cross; at any given sample size, P(CS) is always higher when there are more failures. All the curves have positive slopes, although the slopes are small, and the higher-proportion-failing curves are slightly steeper than the lower ones. For instance, when the sample size increases from 10 to 20, P(CS) increases from 0.45 to 0.5 with 20% observed failures, but from 0.52 to 0.61 with 80% observed failures.

In case 4, a new condition is introduced: the best two populations are chosen instead of one. Denote by P(CD) the probability of making a correct decision in choosing the best two populations. Given n = 30, k = 3, and r = 15, and using the MLEs as the true parameter values, we get P(CS) = 0.762 but P(CD) = 0.975. Because the variance ratio is assumed known in this case, there is no graphical illustration. We conclude that, in this case, it is much easier to identify the worst population than the best.

This example demonstrates the application of our study. We conclude that, for the purpose of choosing the best population, a larger sample size and a higher proportion failing are generally recommended.


VI Conclusions and Summary

In this study, we have used two independent tools to study the probability of correctly selecting the SEV population with the smallest variance. We find that:

sample size is positively associated with P(CS);

the observed proportion failing is positively associated with P(CS);

the number of treatments directly affects the level of P(CS), but the rate of increase with r2 does not change with the number of treatments;

simulation and the χ² approximation yield close results for n greater than 50;

P(CD) is generally lower than P(CS).

For each such study, graphs like the ones shown here can be produced using the algorithms described in Sections II and III. From such graphs, researchers can read off the combinations of sample size and proportion failing that give a particular probability of correctly detecting a specified difference. Values not included in the graphs can be interpolated between curves.

This completes our study on selecting the best SEV population, but it suggests some possible future research. We provide graphs for researchers to use in practice; for easier reference, a table of the sample sizes needed to achieve a given P(CS) or P(CD) lower bound for given k, r, and r2 could be produced. Another interesting topic is finding confidence intervals for P(CS) by simulation and approximation.


VII Bibliography

[1] Arvesen, J. N. and McCabe, G. P. (). Subset selection problems for variances with

applications to regression analysis.

[2] Bain, L. J. and Engelhardt, M. (1991). Statistical Analysis of Reliability and Life-Testing Models. New York: Marcel Dekker Inc.

[3] Bechhofer, R. (). A single-sample multiple decision procedure for ranking means of normal populations with known variances.

[4] David, H. A. (). The ranking of variances in normal populations.

[5] Dudewicz, E. J. (). Ranking (ordering) and selection: an overview of how to select the

best.

[6] Gupta, S. S. and Sobel, M. (). On the smallest of several correlated F statistics.

[7] Gupta, S. S. and Sobel, M. (). On selecting a subset containing the population with the

smallest variance.

[8] Kingston, J. V. and Patel, J. K. (1982). Classifying Weibull populations with respect to a control, Commun. Statistics Theor. Method, 11(8), 899-909.

[9] Kingston, J. V. and Patel, J. K. (1980). Selecting the best one of several Weibull

populations, Commun. Statistics Theor. Method, A9(4), 399-414.

[10] Kingston, J. V. and Patel, J. K. (). A restricted subset selection procedure for Weibull populations.

[11] Lee, E. T. and Wang, J. W. (2003). Statistical Methods for Survival Data Analysis, 3rd ed. New York: J. Wiley.

[12] Ofosu, J. B. (). A two-stage minimax procedure for selecting the normal population

with the smallest variance.

[13] Saxena, K. M. (). Interval estimation of the largest variance of k normal populations.


Appendix

Rat data set

Used in Section V; from Lee and Wang [11] (2003, page 25)

Low Fat          Saturated        Unsaturated
Time   Censor    Time   Censor    Time   Censor
140    0         124    0         112    0
177    0          58    0          68    0
 50    0          56    0          84    0
 65    0          68    0         109    0
 86    0          79    0         153    0
153    0          89    0         143    0
181    0         107    0          60    0
191    0          86    0          70    0
 77    0         142    0          98    0
 84    0         110    0         164    0
 87    0          96    0          63    0
 56    0         142    0          63    0
 66    0          86    0          77    0
 73    0          75    0          91    0
119    0         117    0          91    0
140    1          98    0          66    0
200    1         105    0          70    0
200    1         126    0          77    0
200    1          43    0          63    0
200    1          46    0          66    0
200    1          81    0          66    0
200    1         133    0          94    0
200    1         165    0         101    0
200    1         170    1         105    0
200    1         200    1         108    0
200    1         200    1         112    0
200    1         200    1         115    0
200    1         200    1         126    0
200    1         200    1         161    0
200    1         200    1         178    0

(Censor: 0 = observed death, 1 = censored at the recorded time.)


S-PLUS code

For computing P(CS) and graphing the curves

# Simulate a Type II censored sample of size n with r failures from the
# SEV distribution corresponding to Weibull(beta, eta), on the log scale
sim.sev.dat <- function(n, r, beta, eta) {
  data.mat <- matrix(NA, n, 2)
  data.mat[, 1] <- log(sort(rweibull(n, beta, eta)))
  data.mat[, 2] <- rep(0, n)
  data.mat[1:r, 2] <- rep(1, r)                  # r failures
  if (r != n)
    data.mat[(r+1):n, 1] <- data.mat[r, 1]       # n-r censored at the rth failure
  out <- list("data" = data.mat, "sample size" = n, "failures" = r,
              "beta" = beta, "eta" = eta)
  return(out)
}

# SEV pdf on the log scale, parameterized by the Weibull (beta, eta)
pdf.sev <- function(x, beta, eta) {
  sigma <- 1/beta
  mu <- log(eta)
  f <- 1/sigma * exp((x - mu)/sigma - exp((x - mu)/sigma))
  return(f)
}

# Survival function of the SEV distribution
suv.sev <- function(x, beta, eta) {
  sigma <- 1/beta
  mu <- log(eta)
  suv <- exp(-exp((x - mu)/sigma))
  return(suv)
}

# Negative SEV log likelihood for censored data; parameters enter on the
# log scale so that nlminb can minimize without constraints
log.likelihood <- function(log.theta, data) {
  beta <- exp(log.theta[1])
  eta <- exp(log.theta[2])
  fail <- data[, 1][data[, 2] == 1]   # failures
  cens <- data[, 1][data[, 2] == 0]   # censored
  if (length(cens) != 0)
    likelihood <- sum(log(pdf.sev(fail, beta, eta))) +
                  sum(log(suv.sev(cens, beta, eta)))
  else
    likelihood <- sum(log(pdf.sev(fail, beta, eta)))
  return(-likelihood)
}

# Probability of correct selection: Monte Carlo estimate of the lower
# bound P(Z1 <= r2 * min{Z2, ..., Zk}) over a grid of r2 values
prob.cs <- function(int, log.theta, n, r, k, B) {
  beta <- exp(log.theta[1])
  eta <- exp(log.theta[2])
  sigma.hat <- rep(NA, B)
  sigma.min <- rep(NA, B)
  sigma.vec <- rep(NA, k - 1)
  for (i in 1:B) {
    data <- sim.sev.dat(n, r, beta, eta)$data
    par <- nlminb(int, log.likelihood, data = data)$parameters
    theta <- exp(par)
    sigma.hat[i] <- 1/theta[1]
    for (j in 1:(k - 1)) {
      data <- sim.sev.dat(n, r, beta, eta)$data
      par <- nlminb(int, log.likelihood, data = data)$parameters
      theta <- exp(par)
      sigma.vec[j] <- 1/theta[1]
    }
    sigma.min[i] <- min(sigma.vec)
  }
  pcs.vec <- rep(NA, 39)
  r2.vec <- c(seq(1, 3, by = 0.1), seq(3.3, 6, by = 0.3), seq(6.5, 10, by = 0.5))
  for (i in 1:39) {
    r2 <- r2.vec[i]
    pcs.vec[i] <- sum(sigma.hat <= sigma.min * r2)/B
  }
  output <- list("n" = n, "r2" = r2.vec, "pcs" = pcs.vec)
  return(output)
}

# Probability of correct decision: choose the best m of k populations
prob.cd <- function(int, log.theta, n, r, k, B, m) {
  beta <- exp(log.theta[1])
  eta <- exp(log.theta[2])
  sigma.hat <- rep(NA, B)
  sigma.min <- rep(NA, B)
  sigma.vec <- rep(NA, max(k - m, m))
  for (i in 1:B) {
    for (j in 1:m) {
      data <- sim.sev.dat(n, r, beta, eta)$data
      par <- nlminb(int, log.likelihood, data = data)$parameters
      theta <- exp(par)
      sigma.vec[j] <- 1/theta[1]
    }
    sigma.hat[i] <- max(sigma.vec[1:m])          # largest sigma.hat among the m "best"
    for (j in 1:(k - m)) {
      data <- sim.sev.dat(n, r, beta, eta)$data
      par <- nlminb(int, log.likelihood, data = data)$parameters
      theta <- exp(par)
      sigma.vec[j] <- 1/theta[1]
    }
    sigma.min[i] <- min(sigma.vec[1:(k - m)])    # smallest sigma.hat among the k-m "worst"
  }
  pcd.vec <- rep(NA, 50)
  r2.vec <- seq(1, 10, length = 50)
  for (i in 1:50) {
    r2 <- r2.vec[i]
    pcd.vec[i] <- sum(sigma.hat <= sigma.min * r2)/B
  }
  plot(r2.vec, pcd.vec)
  output <- list("n" = n, "r2" = r2.vec, "pcd" = pcd.vec)
  return(output)
}

# Numerical integration: lower bound on P(CS) for choosing the best
# population, via a compiled Fortran routine (must be loaded beforehand)
pcs.lb.best <- function(nfail, ntrt, r2) {
  zout <- .Fortran("pcsbestlb", as.integer(nfail), as.integer(ntrt),
                   as.double(r2), pcslb = double(1))
  return(zout$pcslb)
}

# Lower bound on the probability of correct selection over a grid of r2
pcs.lb <- function(r, k) {
  lb.vec <- rep(NA, 39)
  r2.vec <- c(seq(1, 3, by = 0.1), seq(3.3, 6, by = 0.3), seq(6.5, 10, by = 0.5))
  for (i in 1:39) {
    r2 <- r2.vec[i]
    lb.vec[i] <- pcs.lb.best(r, k, r2)
  }
  return(lb.vec)
}