Research Article

Data Transformation for Confidence Interval Improvement: An Application to the Estimation of Stress-Strength Model Reliability

Alessandro Barbiero

Department of Economics, Management and Quantitative Methods, Università degli Studi di Milano, Via Conservatorio, 7-20122 Milan, Italy

Correspondence should be addressed to Alessandro Barbiero; alessandro.barbiero@unimi.it

Received 31 July 2014; Revised 19 September 2014; Accepted 29 September 2014; Published 23 October 2014

Academic Editor: David Bulger

Copyright © 2014 Alessandro Barbiero. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Hindawi Publishing Corporation, Advances in Decision Sciences, Volume 2014, Article ID 485629, 10 pages, http://dx.doi.org/10.1155/2014/485629

In many statistical applications, it is often necessary to obtain an interval estimate for an unknown proportion or probability or, more generally, for a parameter whose natural space is the unit interval. The customary approximate two-sided confidence interval for such a parameter, based on some version of the central limit theorem, is known to be unsatisfactory when its true value is close to zero or one or when the sample size is small. A possible way to tackle this issue is the transformation of the data through a proper function that is able to make the approximation to the normal distribution less coarse. In this paper, we study the application of several of these transformations to the context of the estimation of the reliability parameter for stress-strength models, with a special focus on the Poisson distribution. From this work, some practical hints emerge on which transformation may more efficiently improve standard confidence intervals in which scenarios.



1. Introduction

In many fields of applied statistics, it is often necessary to obtain an interval estimate for an unknown proportion or probability or, more generally, for a parameter whose natural space is the unit interval [1]. If $p$ is the unknown parameter of a binomial distribution, the customary approximate two-sided confidence interval (CI) for $p$ is known to be unsatisfactory when its true value is close to zero or one or when the sample size is small. In fact, estimation can cause difficulties because the variance of the corresponding point estimator depends on $p$ itself and because its distribution can be skewed. A number of papers have been devoted to the development of more refined CIs for $p$ (see, e.g., [1-4]). Here, we will consider the estimation of the probability $R = P(X < Y)$, where $X$ and $Y$ are two independent random variables (rv's). If $Y$ represents the strength of a certain system and $X$ the stress on it, $R$ represents the probability that the strength exceeds the stress, so that the system works ($R$ is then referred to as the "reliability" parameter). Such a statistical model is usually called the "stress-strength model" and in recent decades has attracted much interest from various fields [5, 6], ranging from engineering to biostatistics. In these works, inferential issues have been dealt with, mainly in the parametric context. The problem of constructing interval estimators for $R$ has been considered; when an exact analytical solution is not available, approximations based on the delta method and the asymptotic normality of point estimators are used, some of them relying on a transformation of the point estimate of $R$.

In this work, we concentrate on the case of Poisson-distributed stress and strength. Approximate large-sample CIs for the reliability $R$ have already been built and assessed for different parameter and sample size configurations and have been shown to give satisfactory results unless $R$ is close to 1 (or, symmetrically, 0) or the sample sizes are too small [7]. With the aim of improving the performance of such CIs, four transformation functions (logit, probit, arcsine root, and complementary log-log) are selected and applied to the maximum likelihood estimate of $R$, and the resulting CIs are empirically compared in terms of coverage probability and expected length.

The paper is laid out as follows: in Section 2, a brief discussion on data transformations is presented, with a special focus on those ordinarily used in connection with the estimation of the reliability parameter of a stress-strength model. Section 3 introduces the stress-strength model with independent Poisson-distributed stress and strength, recalling the formulas for the reliability, its maximum likelihood estimator, and the standard large-sample CI, and introducing refinements for the latter based on data transformations. Section 4 is devoted to the Monte Carlo simulation study, which empirically assesses the statistical performance of the CIs. An example of application on real data is provided in Section 5, and Section 6 concludes the paper with some final remarks.

2. Transformations and Application to the Estimation of a Stress-Strength Model

In almost all fields of research, one has to deal with data that are not normal. It is common practice to transform the nonnormal data at hand in order to exploit theoretical results that strictly hold only for the normal distribution, with the objective of building plausible or more efficient estimates. Citing [8], "transformations of statistical variables are used in applied statistics mainly for two purposes." The first one is "variance stabilization": "A transformation is applied in order to make it possible to use, at any rate approximately, the standard techniques associated with continuous normal variation, for example, the methods of analysis of variance. In particular, transformations are required which stabilize the variance, that is, which make the variance of the transformed variable approximately independent of, for example, the binomial probability or the mean value of the Poisson distribution [...]. The constant-variance condition has led to the introduction of the inverse sine, inverse sinh, and square-root transformations, which are used nowadays in many fields of applications." In linear regression models, the unequal variance of the error terms produces estimates that are unbiased but are no longer best in the sense of having the smallest variance [9]. The second purpose is "normalization": "A transformation is used in order to facilitate the computation of tail sums of the distribution by the aid of the normal probability integral [...]. A review of the literature shows that a considerable number of transformations of binomial, negative binomial, Poisson, and $\chi^2$ variables have been proposed." In regression models, for example, nonnormality of the response variable invalidates the standard tests of significance with small samples, since they are based on the normality assumption [9]. Reference [10] noted how "approximately symmetrizing transformation of a random variable may be a more effective method of normalizing it than stabilizing its variance." Reference [11], although dated, and the more recent reference [12] provide an exhaustive review of the transformations used in statistical data analysis.

If a random variable $X$, whose probability mass function or density function depends on a parameter $\theta$, is transformed by a function $Y = f(X)$, which we suppose henceforth to be strictly monotone, the standard deviation of $Y$, according to the delta method (see, e.g., [13]), is given approximately by

\[
\sigma_{f(x)}(\theta) \approx \frac{\partial f}{\partial x}\bigl(\mu_x(\theta)\bigr) \cdot \sigma_x(\theta), \tag{1}
\]

where $\mu_x(\theta)$ and $\sigma_x(\theta)$ are the expected value and standard deviation of $X$. Then the (approximate) standard deviation of the transformed random variable $Y$ can be made equal to a constant if the function $f$ is chosen so as to satisfy the following relationship:

\[
f(x) = \int^{x} \frac{k}{\sigma_x(\theta)} \, d\mu_x(\theta). \tag{2}
\]

Thus, for example, if $X$ is distributed as a Poisson random variable with parameter $\lambda$, so that $\mu_x = \sigma_x^2 = \lambda$, we obtain that the function $f(x) = \int^{x} k \lambda^{-1/2} \, d\lambda = k^{*} \sqrt{x}$ is a variance-stabilizing function, with $k$ and $k^{*}$ proper positive constants. Formula (2) has been empirically shown to provide reasonable stabilization in various applications, as confirmed by its extensive use, but other criteria can be employed, based on different "notions" of variance stabilization [8, 12]. Otherwise, modifications to the function derived by (2) can be proposed, for example, in order to reduce or remove the bias [14]. Reference [15] studied the root transformation $f(x) = \sqrt{x + c}$ for a Poisson-distributed random variable $X$ and demonstrated that, for $c = 1/4$, the root-transformed variable $\sqrt{X + c}$ has vanishing first-order bias and almost constant variance.
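The stabilizing effect of the root transformation can be checked numerically. The following is only a minimal sketch, not part of the original study: the function names, the seed, and the use of Knuth's multiplication method for drawing Poisson variates are illustrative assumptions. It compares the sample variance of raw Poisson draws with that of $\sqrt{X + 1/4}$.

```python
import math
import random


def poisson_sample(lam, rng):
    """Draw one Poisson(lam) variate (Knuth's multiplication method)."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def sample_variance(values):
    """Unbiased sample variance."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)


def root_transform_variances(lam, n_draws=20000, c=0.25, seed=1):
    """Return (variance of X, variance of sqrt(X + c)) for X ~ Poisson(lam)."""
    rng = random.Random(seed)  # fixed seed: an arbitrary, hypothetical choice
    raw = [poisson_sample(lam, rng) for _ in range(n_draws)]
    rooted = [math.sqrt(x + c) for x in raw]
    return sample_variance(raw), sample_variance(rooted)


if __name__ == "__main__":
    for lam in (5.0, 20.0, 50.0):
        var_raw, var_root = root_transform_variances(lam)
        print(f"lambda={lam:5.1f}  Var(X)={var_raw:7.3f}  Var(sqrt(X+1/4))={var_root:.4f}")
```

For moderately large $\lambda$, the variance of the raw draws should grow with $\lambda$, while the variance of the transformed draws should stay near $1/4$; for small $\lambda$ the approximation is expected to be rougher.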

Whichever criterion is selected, very often the proper transformation to be adopted is tied to the particular statistical distribution underlying the data. However, just as often, the exact distribution of an estimator for a probability/proportion is not easily derivable, even if the distribution that the sample data came from is known. This sometimes happens, for example, for the maximum likelihood estimator (MLE) of the reliability parameter $R$ of a stress-strength model. In this case, we do not know "a priori" which transformation may fit the data (i.e., the distribution of the MLE) best.

The focus of this paper is the estimation of a probability; thus, we will confine ourselves to transformations of proportions. For this case, among the most used transformations, here we recall the logit, probit, arcsine, and complementary log-log. Table 1 reports the expression, the image, the first derivative, and the inverse of these four functions.

The logit and probit transformations are widely used in the models of the same name [16] and, more generally, when dealing with skewed proportion distributions. They are similar, since they are both antisymmetric functions around the point $x = 0.5$, that is, $f(x + 0.5) = -f(-x + 0.5)$. The difference lies in the fact that the logit function takes larger absolute values than the probit, as can be seen in Figure 1. With regard to the estimation of the reliability parameter for the stress-strength model, the logit transformation has been by far the most used transformation for improving the statistical performance of standard large-sample CIs: among others, it has been considered by [17-19] (all these contributions concerning the Weibull distribution), by [20] for the Lindley distribution, by [21] when stress and strength follow a bivariate exponential distribution, and by [22, 23] in a nonparametric context.

Table 1: Transformations.

| $f(x)$ | Codomain | $f'(x)$ | $f^{-1}(x)$ |
|---|---|---|---|
| $\log[x/(1 - x)]$ | $(-\infty, +\infty)$ | $1/[x(1 - x)]$ | $e^{x}/(1 + e^{x})$ |
| $\Phi^{-1}(x)$ | $(-\infty, +\infty)$ | $1/\phi(\Phi^{-1}(x))$ | $\Phi(x)$ |
| $\arcsin(\sqrt{x})$ | $(0, +\pi/2)$ | $1/[2\sqrt{x(1 - x)}]$ | $\sin^{2}(x)$ |
| $\log[-\log(1 - x)]$ | $(-\infty, +\infty)$ | $1/[(1 - x)\lvert\log(1 - x)\rvert]$ | $1 - e^{-e^{x}}$ |

[Figure 1: Plot of the transformation functions (logit, probit, arcsin, log-log) against $x$ in $(0, 1)$, with dotted reference lines $y = 0$ and $x = 0.5$.]

Through the years, the arcsine root transformation has gained great favor and application among practitioners [24], perhaps more than its real merit. As noted previously, it should be chosen for stabilizing binomial data but is often used to stabilize sample proportions as well (i.e., relative binomial data). However, it presents a drawback, highlighted in [25] and related to the fact that its codomain is the limited interval $(0, \pi/2)$; thus, its normalizing effect may turn out to be meagre, as already pointed out in some studies where it has been used for estimating the reliability of stress-strength models [19, 25, 26].

The complementary log-log function, which is sometimes used in binomial regression models, is slightly different from logit and probit, since it assumes negative values for $x < 1 - e^{-1}$ (and positive values for $x > 1 - e^{-1}$) and takes large positive values only for values of $x$ very close to 1; for example, it takes values larger than 1 if and only if $x > 1 - e^{-e} = 0.934$.
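The four transformations, with the inverses and first derivatives listed in Table 1, can be written down directly. The sketch below is illustrative only: the dictionary name `TRANSFORMS` and the key labels are assumptions, and the standard normal functions come from Python's `statistics.NormalDist`.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: Phi, phi, and Phi^{-1}

# Hypothetical container: name -> (f, inverse of f, first derivative of f),
# following the rows of Table 1. All f are defined for 0 < x < 1.
TRANSFORMS = {
    "logit": (
        lambda x: math.log(x / (1.0 - x)),
        lambda t: 1.0 / (1.0 + math.exp(-t)),  # equals e^t / (1 + e^t)
        lambda x: 1.0 / (x * (1.0 - x)),
    ),
    "probit": (
        _STD_NORMAL.inv_cdf,
        _STD_NORMAL.cdf,
        lambda x: 1.0 / _STD_NORMAL.pdf(_STD_NORMAL.inv_cdf(x)),
    ),
    "arcsin": (
        lambda x: math.asin(math.sqrt(x)),
        lambda t: math.sin(t) ** 2,
        lambda x: 1.0 / (2.0 * math.sqrt(x * (1.0 - x))),
    ),
    "loglog": (
        lambda x: math.log(-math.log(1.0 - x)),
        lambda t: 1.0 - math.exp(-math.exp(t)),
        lambda x: 1.0 / ((1.0 - x) * abs(math.log(1.0 - x))),
    ),
}

if __name__ == "__main__":
    for name, (f, f_inv, f_prime) in TRANSFORMS.items():
        print(name, [round(f(x), 3) for x in (0.1, 0.5, 0.9)])
    # The complementary log-log exceeds 1 only beyond this point:
    print(round(1.0 - math.exp(-math.e), 3))
```

Evaluating each function at a few points makes the differences discussed above visible: logit and probit vanish at $x = 0.5$, while the complementary log-log vanishes at $x = 1 - e^{-1}$.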

In the next section, we will apply these transformations to the estimation of a stress-strength model with both stress and strength following a Poisson distribution.

3. Inference on the Reliability Parameter for a Stress-Strength Model with Independent Poisson Stress and Strength

Let $X$ and $Y$ be independent rv's modeling stress and strength, respectively, with $X \sim \text{Poisson}(\lambda_x)$ and $Y \sim \text{Poisson}(\lambda_y)$. Then the reliability $R = P(X < Y)$ of the stress-strength model is given by (see [6, page 103])

\[
R = P(X < Y) = \sum_{i=0}^{+\infty} \frac{\lambda_x^{i} e^{-\lambda_x}}{i!} \left[ 1 - \sum_{j=0}^{i} \frac{\lambda_y^{j} e^{-\lambda_y}}{j!} \right] = \lim_{k \to \infty} \sum_{i=0}^{k} \frac{\lambda_x^{i} e^{-\lambda_x}}{i!} \left[ 1 - \sum_{j=0}^{i} \frac{\lambda_y^{j} e^{-\lambda_y}}{j!} \right]. \tag{3}
\]

If two simple random samples of size $n_x$ and $n_y$ from $X$ and $Y$, respectively, are available, the reliability parameter can be estimated by the ML estimator $\hat{R}$, obtained by substituting in (3) the MLEs of the unknown parameters $\lambda_x$ and $\lambda_y$, that is, the sample means $\bar{x}$ and $\bar{y}$:

\[
\hat{R} = \sum_{i=0}^{+\infty} \frac{\bar{x}^{i} e^{-\bar{x}}}{i!} \left[ 1 - \sum_{j=0}^{i} \frac{\bar{y}^{j} e^{-\bar{y}}}{j!} \right]. \tag{4}
\]
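Expressions (3) and (4) can be evaluated by truncating the outer sum once the remaining probability mass of $X$ is negligible. The following is a minimal sketch under that assumption; the function names, the tolerance `tol`, and the cap `max_terms` are illustrative choices, not taken from the paper.

```python
import math


def poisson_reliability(lam_x, lam_y, tol=1e-12, max_terms=10000):
    """Truncated evaluation of R = P(X < Y) in (3) for independent
    X ~ Poisson(lam_x) and Y ~ Poisson(lam_y)."""
    pmf_x = math.exp(-lam_x)  # P(X = 0)
    pmf_y = math.exp(-lam_y)  # P(Y = 0)
    cdf_x = 0.0
    cdf_y = pmf_y             # P(Y <= 0)
    total = 0.0
    for i in range(max_terms):
        # add P(X = i) * P(Y > i); max() guards against tiny negative rounding
        total += pmf_x * max(0.0, 1.0 - cdf_y)
        cdf_x += pmf_x
        if 1.0 - cdf_x < tol:  # the neglected tail of X bounds the error
            break
        pmf_x *= lam_x / (i + 1)  # P(X = i + 1)
        pmf_y *= lam_y / (i + 1)  # P(Y = i + 1)
        cdf_y += pmf_y            # P(Y <= i + 1)
    return total


def reliability_mle(xs, ys):
    """Plug-in estimate (4): evaluate (3) at the two sample means."""
    return poisson_reliability(sum(xs) / len(xs), sum(ys) / len(ys))


if __name__ == "__main__":
    print(poisson_reliability(2.0, 2.0))
    print(reliability_mle([1, 3, 2, 2, 2], [4, 5, 3, 6, 4]))
```

The recursion updates the Poisson probabilities term by term instead of calling factorials, which keeps the computation stable for moderate parameter values; very large parameters (where $e^{-\lambda}$ underflows) would need a different approach.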

Deriving the expression of the variance of the MLE $\hat{R}$ is not straightforward; however, an approximate expression has been derived through the delta method in [7]. Based upon this approximation, a large-sample $1 - \alpha$ CI for $R$ has been built. Such an interval estimator has the usual expression

\[
(R_L, R_U) = \left( \hat{R} - z_{1-\alpha/2} \sqrt{V(\hat{R})}, \; \hat{R} + z_{1-\alpha/2} \sqrt{V(\hat{R})} \right), \tag{5}
\]

where $V(\hat{R})$ is the sample estimate of the (asymptotic) variance of $\hat{R}$ (see [7] for details).

Although such an estimator has been shown to behave satisfactorily in terms of coverage for several combinations of sample sizes and values of the reliability parameter, some decay of the performance is observed when $R$ gets close to the extreme values 0 and 1 and when sample sizes are small. In these cases, the approximation to the normal distribution is in fact very rough, especially because of the skewness of the distribution of $\hat{R}$. Thus, transformations of the values of the estimates can be considered in order to "make the data more normal" and produce CIs based on transformed data that have a coverage closer to the nominal level.

Table 2: CIs of the reliability parameter $R$ under variance-stabilizing/normalizing transformations.

| Transformation | $\hat{\theta}$ | CI for $\theta$, $(\theta_L, \theta_U)$ | CI for $R$, $(R_L, R_U)$ |
|---|---|---|---|
| Logit | $\log[\hat{R}/(1 - \hat{R})]$ | $\hat{\theta} \mp z_{1-\alpha/2} \sqrt{V(\hat{R})} / [\hat{R}(1 - \hat{R})]$ | $\left( \exp(\theta_L)/(1 + \exp(\theta_L)), \; \exp(\theta_U)/(1 + \exp(\theta_U)) \right)$ |
| Probit | $\Phi^{-1}(\hat{R})$ | $\hat{\theta} \mp z_{1-\alpha/2} \sqrt{V(\hat{R})} / \phi(\Phi^{-1}(\hat{R}))$ | $\left( \Phi(\theta_L), \; \Phi(\theta_U) \right)$ |
| Arcsin | $\arcsin \sqrt{\hat{R}}$ | $\hat{\theta} \mp z_{1-\alpha/2} \sqrt{V(\hat{R})} / [2\sqrt{\hat{R}(1 - \hat{R})}]$ | $\left( \sin^{2}(\theta_L), \; \sin^{2}(\theta_U) \right)$ |
| Loglog | $\log[-\log(1 - \hat{R})]$ | $\hat{\theta} \mp z_{1-\alpha/2} \sqrt{V(\hat{R})} / [(1 - \hat{R}) \lvert\log(1 - \hat{R})\rvert]$ | $\left( 1 - \exp(-\exp(\theta_L)), \; 1 - \exp(-\exp(\theta_U)) \right)$ |

Recalling Table 1 and considering the logit transformation of the reliability parameter $R$, $\theta = \log[R/(1 - R)]$, the MLE of $\theta$ is $\hat{\theta} = \log[\hat{R}/(1 - \hat{R})]$, and an approximate $(1 - \alpha)$ CI for $\theta$ is

\[
(\theta_L, \theta_U) = \left( \hat{\theta} - z_{1-\alpha/2} \cdot \frac{\sqrt{V(\hat{R})}}{\hat{R}(1 - \hat{R})}, \; \hat{\theta} + z_{1-\alpha/2} \cdot \frac{\sqrt{V(\hat{R})}}{\hat{R}(1 - \hat{R})} \right). \tag{6}
\]

Then an approximate $(1 - \alpha)$ CI for $R$ is

\[
(R_L, R_U) = \left( \frac{\exp(\theta_L)}{1 + \exp(\theta_L)}, \; \frac{\exp(\theta_U)}{1 + \exp(\theta_U)} \right). \tag{7}
\]

For the other transformations, following the same steps just outlined, approximate "transformed" CIs for $R$ can be obtained as alternatives to the standard naive CI of (5); they are summarized in Table 2.
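The construction of the transformed intervals can be sketched in a few lines. This is an illustration under stated assumptions, not the author's implementation: the variance estimate of the MLE (whose formula is given in [7] and is not reproduced here) is taken as an input `var_hat`, the function name is hypothetical, and the clamping of the arcsine limits to $[0, \pi/2]$ is an added safeguard that the source does not mention.

```python
import math
from statistics import NormalDist

_STD = NormalDist()


def transformed_ci(r_hat, var_hat, kind="logit", alpha=0.05):
    """Approximate (1 - alpha) CI for R: delta-method interval on the
    transformed scale, mapped back through the inverse transformation.
    var_hat is an externally supplied variance estimate of r_hat."""
    z = _STD.inv_cdf(1.0 - alpha / 2.0)
    se = math.sqrt(var_hat)
    if kind == "AN":  # untransformed large-sample interval, as in (5)
        return r_hat - z * se, r_hat + z * se
    if kind == "logit":
        theta = math.log(r_hat / (1.0 - r_hat))
        deriv = 1.0 / (r_hat * (1.0 - r_hat))
        back = lambda t: 1.0 / (1.0 + math.exp(-t))
    elif kind == "probit":
        theta = _STD.inv_cdf(r_hat)
        deriv = 1.0 / _STD.pdf(theta)
        back = _STD.cdf
    elif kind == "arcsin":
        theta = math.asin(math.sqrt(r_hat))
        deriv = 1.0 / (2.0 * math.sqrt(r_hat * (1.0 - r_hat)))
        # Assumption (not in the source): clamp to [0, pi/2] so that
        # sin^2 remains monotone when a limit leaves the codomain.
        back = lambda t: math.sin(min(max(t, 0.0), math.pi / 2.0)) ** 2
    elif kind == "loglog":
        theta = math.log(-math.log(1.0 - r_hat))
        deriv = 1.0 / ((1.0 - r_hat) * abs(math.log(1.0 - r_hat)))
        back = lambda t: 1.0 - math.exp(-math.exp(t))
    else:
        raise ValueError(f"unknown transformation: {kind}")
    half_width = z * se * deriv  # delta-method standard error times z
    return back(theta - half_width), back(theta + half_width)


if __name__ == "__main__":
    for kind in ("AN", "logit", "probit", "arcsin", "loglog"):
        print(kind, transformed_ci(0.9, 0.0025, kind))
```

With an estimate near 1 and a non-negligible standard error, the untransformed interval can approach or exceed 1, whereas the back-transformed limits always stay inside the unit interval.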

An alternative way to make inference on the reliability parameter $R$ for the stress-strength model with independent Poisson random variables can be summarized as follows: (1) transform (normalize) the samples from $X$ and $Y$ according to a proper transformation for the Poisson distribution; (2) compute point and interval estimates of $R$ using the methods for a stress-strength model with independent normal variables with known variances [6, page 112]. Letting $\Xi = \sqrt{X + 1/4}$ and $\Upsilon = \sqrt{Y + 1/4}$, then $\Xi$ and $\Upsilon$ are independent and approximately distributed as normal random variables with variance $\sigma_\xi^2 = \sigma_\upsilon^2 = 1/4$ and expected value $\mu_\xi = \sqrt{\lambda_x}$ and $\mu_\upsilon = \sqrt{\lambda_y}$, respectively. Then, instead of estimating $R = P(X < Y)$, one can estimate $R' = P(\Xi < \Upsilon)$; a $1 - \alpha$ confidence interval for $R'$ is given by

\[
\left( \Phi\left[ \hat{\eta} - \frac{z_{1-\alpha/2}}{\sqrt{M}} \right], \; \Phi\left[ \hat{\eta} + \frac{z_{1-\alpha/2}}{\sqrt{M}} \right] \right), \tag{8}
\]

with $M = (\sigma_\xi^2 + \sigma_\upsilon^2)/(\sigma_\xi^2/n_x + \sigma_\upsilon^2/n_y)$ and $\hat{\eta} = (\bar{\upsilon} - \bar{\xi})/\sigma$ being an estimator of $\eta = (\mu_\upsilon - \mu_\xi)/\sigma$, where $\bar{\xi}$ and $\bar{\upsilon}$ are the sample means of $\Xi$ and $\Upsilon$, respectively, and $\sigma = \sqrt{\sigma_\xi^2 + \sigma_\upsilon^2} = \sqrt{1/2}$.
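The two-step procedure leading to (8) can be sketched as follows. The function name and its return layout are illustrative assumptions; the variances of the root-transformed variables are fixed at $1/4$, as in the text.

```python
import math
from statistics import NormalDist, fmean


def root_normal_ci(xs, ys, alpha=0.05, c=0.25):
    """Point estimate and (1 - alpha) CI for R' = P(Xi < Upsilon), as in (8),
    after the root transformation sqrt(. + c) of both Poisson samples."""
    std = NormalDist()
    xi = [math.sqrt(x + c) for x in xs]    # transformed stress sample
    ups = [math.sqrt(y + c) for y in ys]   # transformed strength sample
    var_xi = var_ups = 0.25                # treated as known variances
    sigma = math.sqrt(var_xi + var_ups)    # sqrt(1/2)
    eta_hat = (fmean(ups) - fmean(xi)) / sigma
    m = (var_xi + var_ups) / (var_xi / len(xs) + var_ups / len(ys))
    half = std.inv_cdf(1.0 - alpha / 2.0) / math.sqrt(m)
    point = std.cdf(eta_hat)
    return point, (std.cdf(eta_hat - half), std.cdf(eta_hat + half))


if __name__ == "__main__":
    stress = [1, 3, 2, 2, 2]     # hypothetical data
    strength = [4, 5, 3, 6, 4]   # hypothetical data
    print(root_normal_ci(stress, strength))
```

Note that the interval refers to $R'$, the reliability on the transformed scale, which only approximates $R$.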

An alternative procedure to make inference about the reliability parameter $R$ can be provided by the parametric bootstrap [27, pages 53-56]. In the basic version, it works as follows for the estimation problem at hand.

(1) Estimate the unknown parameters of the Poisson rv's $X$ and $Y$ through their sample means $\bar{x}$ and $\bar{y}$.

(2) Draw independently a bootstrap sample $x^{*}$ of size $n_x$ from a Poisson rv $X^{*}$ with parameter $\bar{x}$ and a bootstrap sample $y^{*}$ of size $n_y$ from a Poisson rv $Y^{*}$ with parameter $\bar{y}$.

(3) Estimate the reliability parameter, say $\hat{R}^{*}$, on samples $x^{*}$ and $y^{*}$, using the very same expression in (4).

(4) Repeat steps 2 and 3 $B$ times ($B$ sufficiently large, e.g., 2000), thus obtaining the bootstrap distribution of $\hat{R}^{*}$.

(5) Estimate a $(1 - \alpha)$ bootstrap percentile CI for $R$ from the distribution of $\hat{R}^{*}$, taking the $\alpha/2$ and $1 - \alpha/2$ quantiles:

\[
\left( \hat{R}^{*}_{\alpha/2}, \; \hat{R}^{*}_{1-\alpha/2} \right). \tag{9}
\]
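The five steps above can be sketched as follows. This is a minimal illustration, not the code used in the paper: the function names, the seed, the Poisson generator (Knuth's multiplication method), the truncation rule for (3), and the order-statistic convention used for the quantiles are all assumptions.

```python
import math
import random


def poisson_reliability(lam_x, lam_y, tol=1e-12, max_terms=10000):
    """Truncated evaluation of R = P(X < Y) as in (3)."""
    pmf_x, pmf_y = math.exp(-lam_x), math.exp(-lam_y)
    cdf_x, cdf_y, total = 0.0, pmf_y, 0.0
    for i in range(max_terms):
        total += pmf_x * max(0.0, 1.0 - cdf_y)  # P(X = i) * P(Y > i)
        cdf_x += pmf_x
        if 1.0 - cdf_x < tol:
            break
        pmf_x *= lam_x / (i + 1)
        pmf_y *= lam_y / (i + 1)
        cdf_y += pmf_y
    return total


def poisson_sample(lam, rng):
    """Draw one Poisson(lam) variate (Knuth's multiplication method)."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def bootstrap_percentile_ci(xs, ys, n_boot=2000, alpha=0.05, seed=0):
    """Parametric bootstrap percentile CI for R, following steps (1)-(5)."""
    rng = random.Random(seed)
    lam_x = sum(xs) / len(xs)  # step 1: MLEs are the sample means
    lam_y = sum(ys) / len(ys)
    replicates = []
    for _ in range(n_boot):    # step 4: repeat steps 2 and 3
        xb = [poisson_sample(lam_x, rng) for _ in xs]  # step 2
        yb = [poisson_sample(lam_y, rng) for _ in ys]
        replicates.append(                             # step 3
            poisson_reliability(sum(xb) / len(xb), sum(yb) / len(yb))
        )
    replicates.sort()          # step 5: empirical quantiles
    lo_idx = int(math.floor((alpha / 2.0) * (n_boot - 1)))
    hi_idx = int(math.ceil((1.0 - alpha / 2.0) * (n_boot - 1)))
    return replicates[lo_idx], replicates[hi_idx]


if __name__ == "__main__":
    print(bootstrap_percentile_ci([1, 3, 2, 2, 2], [4, 5, 3, 6, 4]))
```

Each replicate re-evaluates the series in (3), so the cost grows linearly with the number of bootstrap replications.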

4. Simulation Study

4.1. Scope and Design. The simulation study aims at empirically comparing the performance of the interval estimators presented in the previous section, namely, the standard CI of (5), labeled "AN", those of Table 2 (labeled "logit", "probit", "arcsine", "loglog"), and the CIs of (8) and (9), in terms of coverage rate (and also lower and upper uncoverage rates) and expected length. In this Monte Carlo (MC) study, the value of the parameter $\lambda_x$ of the Poisson distribution for stress $X$ is set equal to a "reference" value 2, and the parameter $\lambda_y$ of the Poisson distribution modeling strength is varied in order to obtain, according to (3), several different levels of reliability $R$, namely, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, and 0.95.
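Since $R$ increases with $\lambda_y$ for fixed $\lambda_x$, the value of $\lambda_y$ matching a target reliability can be found numerically. The sketch below uses bisection; the paper does not state how these values were obtained, so the method, the function names, and the search bracket are assumptions.

```python
import math


def poisson_reliability(lam_x, lam_y, tol=1e-12, max_terms=10000):
    """Truncated evaluation of R = P(X < Y) as in (3)."""
    pmf_x, pmf_y = math.exp(-lam_x), math.exp(-lam_y)
    cdf_x, cdf_y, total = 0.0, pmf_y, 0.0
    for i in range(max_terms):
        total += pmf_x * max(0.0, 1.0 - cdf_y)  # P(X = i) * P(Y > i)
        cdf_x += pmf_x
        if 1.0 - cdf_x < tol:
            break
        pmf_x *= lam_x / (i + 1)
        pmf_y *= lam_y / (i + 1)
        cdf_y += pmf_y
    return total


def solve_lambda_y(target_r, lam_x=2.0, lower=0.0, upper=100.0, tol=1e-10):
    """Bisection for the lam_y giving R(lam_x, lam_y) = target_r.
    The bracket [lower, upper] is a hypothetical default."""
    while upper - lower > tol:
        mid = 0.5 * (lower + upper)
        if poisson_reliability(lam_x, mid) < target_r:
            lower = mid   # reliability too small: strength mean must grow
        else:
            upper = mid
    return 0.5 * (lower + upper)


if __name__ == "__main__":
    for r in (0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95):
        print(r, round(solve_lambda_y(r), 4))
```

Running the loop in the main guard lists one $\lambda_y$ per reliability level of the design, with $\lambda_x = 2$.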

For each couple $(\lambda_x, \lambda_y)$, a large number ($nS = 5000$) of samples of size $n_x$ and of size $n_y$ are drawn from $X \sim \text{Poisson}(\lambda_x)$ and $Y \sim \text{Poisson}(\lambda_y)$ independently. Different and unequal sample sizes are here considered (all the 16 possible combinations between the values $n_x = 5, 10, 20, 50$ and $n_y = 5, 10, 20, 50$). On each pair of samples, the 95% CIs for $R$ listed above are built, and the MC coverage rate and the length of the CIs are computed over the $nS$ MC samples. Moreover, the lower and upper uncoverage or error rates are computed, that is, the proportion of CIs for which $R < R_L$ and $R > R_U$, respectively.
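The quantities tallied over the MC samples can be computed with a short helper. The function below is a hypothetical illustration (its name and return format are not from the paper); it takes a list of intervals $(R_L, R_U)$ built on the simulated samples and the true value of $R$.

```python
def coverage_summary(intervals, true_r):
    """Coverage rate, lower/upper error rates, and mean width of a list of
    (R_L, R_U) intervals, given the true reliability true_r."""
    n = len(intervals)
    lower_err = sum(1 for lo, hi in intervals if true_r < lo)  # R < R_L
    upper_err = sum(1 for lo, hi in intervals if true_r > hi)  # R > R_U
    mean_width = sum(hi - lo for lo, hi in intervals) / n
    return {
        "coverage": (n - lower_err - upper_err) / n,
        "lower_error": lower_err / n,
        "upper_error": upper_err / n,
        "mean_width": mean_width,
    }


if __name__ == "__main__":
    toy = [(0.10, 0.30), (0.25, 0.40), (0.05, 0.15), (0.15, 0.50)]
    print(coverage_summary(toy, 0.2))
```

In the toy call, two intervals cover 0.2, one lies entirely above it (a lower error), and one lies entirely below it (an upper error).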


Note that for the smallest sample size, that is, when $n_x = 5$ (or $n_y = 5$), a practical problem arises, as the sample values of $X$ (or $Y$) may all be 0; in this case, the variance estimate $V(\hat{R})$ cannot be computed, and a standard approximate CI cannot be built. We decided to discard these samples from the 5000 MC samples planned for the simulation study and to compute the quantities of interest only on the "feasible" samples. Indeed, the rate of such "nonfeasible" samples is in any case very low under each scenario. For the worst ones, characterized by $R = 0.1$ and $n_y = 5$, the theoretical rate of "nonfeasible" samples is about 5%.

Since the coverage rates of some CIs are often close to the nominal level 0.95, we performed a test in order to check whether the actual coverage rate is significantly different from 0.95, that is, to state whether such CIs are conservative or liberal. The null hypothesis is that the true rate $\eta$ of CIs that do not cover the real value $R$ is equal to $\alpha = 0.05$, $H_0: \eta = \alpha = 0.05$, whereas the alternative hypothesis is that the rate $\eta$ is different from $\alpha$, $H_1: \eta \neq \alpha$. We employ the test suggested in [28, pages 518-519], which is based on the statistic

\[
\mathcal{F} = \frac{nR + 1}{nS - nR} \cdot \frac{1 - \alpha}{\alpha}, \tag{10}
\]

where $nR$ is the number of rejections of $H_0$ (in the present setting, the number of CIs that do not cover $R$) in $nS$ (5000 in our case, unless there are some nonfeasible samples) iterations of an MC simulation plan. Under $H_0$, $\mathcal{F}$ follows an $F$ distribution with $\nu_1 = 2(nS - nR)$ and $\nu_2 = 2(nR + 1)$ degrees of freedom. The test rejects $H_0$ at level $\gamma$ if either $\mathcal{F} \leq F_{\gamma/2, \nu_1, \nu_2}$ or $\mathcal{F} \geq F_{1-\gamma/2, \nu_1, \nu_2}$, where $F_{\gamma, \nu_1, \nu_2}$ denotes the $\gamma$ percentile point of an $F$ distribution with $\nu_1$ and $\nu_2$ degrees of freedom. The test was performed on each scenario for each kind of CI at a level $\gamma = 1\%$.
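The statistic in (10) and its degrees of freedom are straightforward to compute. The sketch below covers only that part; the function and argument names are assumptions. The critical values $F_{\gamma/2, \nu_1, \nu_2}$ and $F_{1-\gamma/2, \nu_1, \nu_2}$ require an $F$ quantile function, which is not in the Python standard library and is therefore left out (SciPy's `scipy.stats.f.ppf` would be one option, if available).

```python
def coverage_f_statistic(n_miss, n_sim, alpha=0.05):
    """Statistic (10) and its degrees of freedom.
    n_miss: number of CIs that do not cover R (nR in the text);
    n_sim: number of feasible MC samples (nS in the text)."""
    f_stat = (n_miss + 1) / (n_sim - n_miss) * (1.0 - alpha) / alpha
    df1 = 2 * (n_sim - n_miss)
    df2 = 2 * (n_miss + 1)
    return f_stat, df1, df2


if __name__ == "__main__":
    # hypothetical example: exactly 5% of 5000 intervals miss R
    print(coverage_f_statistic(250, 5000))
```

When the observed miss rate equals the nominal 5%, the statistic is close to 1; values far below or above 1 point to over- or undercoverage, respectively.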

4.2. Results and Discussion. The simulation results are (partially) reported in Table 3 ($\lambda_x = 2$, varying $R$ and the common size $n_x = n_y$). The results about the CIs (8) and (9) are not reported here, as their performances are overall unsatisfactory. Even if theoretically appealing, the procedure leading to the CI in (8) practically fails: some of the scenarios of the simulation plan were considered, and the CI built following this alternative procedure never provides satisfactory results. For the smallest sample size ($n = 5$), the coverage proves to be larger than that provided by the AN CI but still smaller than the nominal one; for the larger sample sizes ($n = 10, 20, 50$), the coverage rate dramatically decreases (even to values as low as 60%). Paradoxically, the discrete nature of the Poisson variable, and thus the quality of the normal approximation of step (1), affects results to a more relevant extent as the sample size increases. As to the bootstrap procedure yielding the CI in (9), this solution is computationally cumbersome (it becomes even more time consuming if bias-corrected accelerated CIs [27, pages 184-188] have to be calculated), and it provides barely satisfactory results, as proven by a preliminary simulation study, not reported here, that confirms the findings reported in [25] for parametric bootstrap inference on the reliability parameter in the bivariate normal case. The rejection of $H_0$ based on the test statistic (10), described in the previous subsection, is indicated by a "*" near the actual coverage rate of each CI. Figures 2 and 3 graphically display the values of the lower and upper uncoverage rates for the five interval estimators, varying $R$ in 0.5, 0.6, 0.7, 0.8, 0.9, 0.95, for $n_x = n_y = 5$ and $n_x = n_y = 50$, respectively. These results comprise only the equal sample size scenarios; those for unequal sample size scenarios do not add much value to the general discussion we are going to outline.

(i) First, the improvement provided by the four transformations to the standard naive CI of $R$ in terms of coverage is evident. This improvement is considerable for a small sample size and extreme values of $R$ (say 0.95), when the coverage rate of the standard naive CI tends to decrease dramatically, even below 90%. Note that for the sample sizes 5, 10, and 20, the actual coverage rate of the standard naive CI is always significantly different from (smaller than) the nominal level. Under some scenarios, namely, for $R$ = 0.3-0.7, the increase in coverage rate is accompanied also by a reduction in the CI average length. On the contrary, for the other values of $R$, there is an increase in the average width of the modified CIs, which is much more apparent for the logit transformation.

(ii) Examining Table 3 closely, one can note that the CI based on the arcsine root transformation is, except for one scenario, uniformly worse in terms of actual coverage than the other three CIs based on the logit, probit, and complementary log-log transformations. For small sample sizes ($n_x = 5, 10$), the coverage rate actually provided is significantly different from (smaller than) the nominal one. It can also be claimed that this unsatisfactory performance is due to its inability to symmetrize the distribution, as can be seen by glancing at the lower and upper uncoverage rates: for values of $R$ getting close to 1, there is an apparent undercoverage effect on the left side (just a bit smaller than that of the "original" standard approximate CI).

(iii) Logit, probit, and complementary log-log transformed CIs have overall a satisfactory performance in terms of closeness to the nominal confidence level. Logit and probit CIs exhibit a similar behavior, as both the lower and upper uncoverage rates they provide are close to the nominal value (2.5%). On the contrary, the complementary log-log function (as well as the arcsine root) often produces a larger undercoverage on one side (here, left), which is partially balanced by an overcoverage on the other side (here, right). This is clear evidence of the higher symmetrizing capability of the logit and probit transformations. However, taking into account the results of the hypothesis test based on the statistic (10), the probit and complementary log-log functions are those that overall perform best: the statistical hypothesis that their actual coverage rate is equal to the nominal one is always accepted except for one case ($R = 0.1$ and $n = 5$), whereas the same hypothesis is sometimes rejected (10 times out of 40) for the logit function (which tends to produce significantly larger coverage for small sample sizes).


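Table 3: Simulation results ($\lambda_x = 2$). Coverage and error rates are percentages.

| $R$ | $n_x, n_y$ | AN cov | AN LE | AN UE | Logit cov | Logit LE | Logit UE | Probit cov | Probit LE | Probit UE | Arcsin cov | Arcsin LE | Arcsin UE | Loglog cov | Loglog LE | Loglog UE | Width AN | Width Logit | Width Probit | Width Arcsin | Width Loglog |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.1 | 5, 5 | 89.94* | 0.34 | 9.72 | 97.17* | 2.66 | 0.17 | 96.73* | 2.07 | 1.20 | 94.35 | 1.12 | 4.53 | 96.20* | 3.63 | 0.17 | 0.2836 | 0.3835 | 0.3491 | 0.3152 | 0.4204 |
| 0.1 | 10, 10 | 88.49* | 0.40 | 11.11 | 96.23* | 2.51 | 1.26 | 95.57 | 1.85 | 2.59 | 92.52* | 1.06 | 6.42 | 95.65 | 3.23 | 1.12 | 0.2153 | 0.2501 | 0.2354 | 0.2214 | 0.2622 |
| 0.1 | 20, 20 | 92.26* | 0.36 | 7.38 | 95.20 | 3.06 | 1.74 | 94.94 | 2.32 | 2.74 | 93.62* | 1.44 | 4.94 | 95.12 | 3.26 | 1.62 | 0.1620 | 0.1709 | 0.1653 | 0.1600 | 0.1750 |
| 0.1 | 50, 50 | 93.34* | 0.94 | 5.72 | 95.44 | 2.32 | 2.24 | 95.28 | 1.98 | 2.74 | 94.50 | 1.56 | 3.94 | 95.48 | 2.54 | 1.98 | 0.1020 | 0.1044 | 0.1029 | 0.1015 | 0.1054 |
| 0.2 | 5, 5 | 87.51* | 1.66 | 10.83 | 96.01* | 3.11 | 0.88 | 95.11 | 2.53 | 2.37 | 91.66* | 1.66 | 6.68 | 95.19 | 4.29 | 0.52 | 0.4360 | 0.4717 | 0.4553 | 0.4425 | 0.5123 |
| 0.2 | 10, 10 | 90.68* | 1.12 | 8.20 | 95.30 | 2.68 | 2.02 | 94.72 | 2.22 | 3.06 | 93.14* | 1.74 | 5.12 | 94.86 | 3.64 | 1.50 | 0.3395 | 0.3429 | 0.3358 | 0.3312 | 0.3597 |
| 0.2 | 20, 20 | 93.20* | 1.36 | 5.44 | 95.22 | 2.66 | 2.12 | 95.22 | 2.18 | 2.60 | 94.32 | 1.80 | 3.88 | 94.90 | 3.22 | 1.88 | 0.2469 | 0.2473 | 0.2444 | 0.2428 | 0.2537 |
| 0.2 | 50, 50 | 93.96* | 1.38 | 4.66 | 95.42 | 2.48 | 2.10 | 95.16 | 2.20 | 2.64 | 94.78 | 1.96 | 3.26 | 95.20 | 2.76 | 2.04 | 0.1569 | 0.1572 | 0.1563 | 0.1559 | 0.1588 |
| 0.3 | 5, 5 | 89.11* | 2.50 | 8.39 | 96.42* | 2.24 | 1.34 | 94.72 | 2.24 | 3.04 | 92.22* | 2.24 | 5.54 | 94.72 | 4.48 | 0.80 | 0.5440 | 0.5284 | 0.5245 | 0.5264 | 0.5664 |
| 0.3 | 10, 10 | 91.10* | 1.92 | 6.98 | 95.40 | 2.48 | 2.12 | 94.90 | 2.06 | 3.04 | 93.70* | 1.92 | 4.38 | 94.28 | 4.12 | 1.60 | 0.4149 | 0.3991 | 0.3981 | 0.4002 | 0.4166 |
| 0.3 | 20, 20 | 93.18* | 1.82 | 5.00 | 95.34 | 2.56 | 2.10 | 95.04 | 2.36 | 2.60 | 93.98* | 2.32 | 3.70 | 94.96 | 3.48 | 1.56 | 0.3012 | 0.2947 | 0.2945 | 0.2956 | 0.3018 |
| 0.3 | 50, 50 | 94.38 | 1.82 | 3.80 | 95.46 | 2.42 | 2.12 | 95.32 | 2.20 | 2.48 | 95.10 | 1.96 | 2.94 | 95.40 | 2.80 | 1.80 | 0.1927 | 0.1909 | 0.1909 | 0.1912 | 0.1929 |
| 0.4 | 5, 5 | 88.98* | 3.86 | 7.16 | 96.52* | 2.10 | 1.38 | 94.48 | 2.32 | 3.20 | 92.40* | 2.90 | 4.70 | 95.04 | 4.14 | 0.82 | 0.6068 | 0.5592 | 0.5627 | 0.5733 | 0.5891 |
| 0.4 | 10, 10 | 92.14* | 2.70 | 5.16 | 95.86* | 2.08 | 2.06 | 94.78 | 2.28 | 2.94 | 93.72* | 2.46 | 3.82 | 94.98 | 3.82 | 1.62 | 0.4572 | 0.4305 | 0.4335 | 0.4397 | 0.4455 |
| 0.4 | 20, 20 | 93.64* | 2.60 | 3.76 | 95.58 | 2.44 | 1.98 | 95.22 | 2.44 | 2.34 | 94.70 | 2.48 | 2.82 | 95.14 | 3.32 | 1.54 | 0.3319 | 0.3210 | 0.3226 | 0.3254 | 0.3274 |
| 0.4 | 50, 50 | 94.72 | 2.18 | 3.10 | 95.48 | 2.20 | 2.32 | 95.28 | 2.20 | 2.52 | 95.06 | 2.20 | 2.74 | 95.64 | 2.52 | 1.84 | 0.2128 | 0.2098 | 0.2103 | 0.2111 | 0.2116 |
| 0.5 | 5, 5 | 89.32* | 5.04 | 5.64 | 96.72* | 1.56 | 1.72 | 95.04 | 2.38 | 2.58 | 92.78* | 3.48 | 3.74 | 94.90 | 4.34 | 0.76 | 0.6285 | 0.5691 | 0.5752 | 0.5889 | 0.5864 |
| 0.5 | 10, 10 | 91.98* | 3.82 | 4.20 | 95.54 | 2.04 | 2.42 | 94.94 | 2.36 | 2.70 | 93.84* | 2.78 | 3.38 | 94.88 | 3.62 | 1.50 | 0.4709 | 0.4406 | 0.4450 | 0.4526 | 0.4501 |
| 0.5 | 20, 20 | 93.50* | 3.32 | 3.18 | 95.36 | 2.30 | 2.34 | 94.80 | 2.54 | 2.66 | 94.64 | 2.62 | 2.74 | 94.76 | 3.50 | 1.74 | 0.3413 | 0.3290 | 0.3312 | 0.3345 | 0.3332 |
| 0.5 | 50, 50 | 94.62 | 2.46 | 2.92 | 95.38 | 2.18 | 2.44 | 95.16 | 2.32 | 2.52 | 94.96 | 2.34 | 2.70 | 95.42 | 2.72 | 1.86 | 0.2191 | 0.2157 | 0.2163 | 0.2173 | 0.2169 |
| 0.6 | 5, 5 | 89.44* | 6.36 | 4.20 | 96.52* | 1.44 | 2.04 | 94.70 | 2.50 | 2.80 | 92.66* | 4.04 | 3.30 | 94.78 | 4.28 | 0.94 | 0.6095 | 0.5587 | 0.5628 | 0.5741 | 0.5601 |
| 0.6 | 10, 10 | 91.78* | 4.84 | 3.38 | 95.52 | 1.78 | 2.70 | 94.88 | 2.32 | 2.80 | 93.92* | 3.06 | 3.02 | 94.92 | 3.48 | 1.60 | 0.4576 | 0.4305 | 0.4337 | 0.4401 | 0.4324 |
| 0.6 | 20, 20 | 93.40* | 4.02 | 2.58 | 95.44 | 2.02 | 2.54 | 95.12 | 2.34 | 2.54 | 94.46 | 2.98 | 2.56 | 95.00 | 3.34 | 1.66 | 0.3304 | 0.3197 | 0.3212 | 0.3240 | 0.3206 |
| 0.6 | 50, 50 | 95.06 | 2.68 | 2.26 | 95.54 | 1.98 | 2.48 | 95.38 | 2.14 | 2.48 | 95.24 | 2.28 | 2.48 | 95.58 | 2.58 | 1.84 | 0.2120 | 0.2090 | 0.2095 | 0.2103 | 0.2093 |
| 0.7 | 5, 5 | 89.52* | 7.54 | 2.94 | 96.06* | 1.48 | 2.46 | 95.04 | 2.50 | 2.46 | 92.56* | 4.76 | 2.68 | 94.54 | 4.54 | 0.92 | 0.5490 | 0.5266 | 0.5242 | 0.5279 | 0.5094 |
| 0.7 | 10, 10 | 91.70* | 5.82 | 2.48 | 95.32 | 1.64 | 3.04 | 95.02 | 2.22 | 2.76 | 93.78* | 3.66 | 2.56 | 94.92 | 3.44 | 1.64 | 0.4162 | 0.3991 | 0.3987 | 0.4014 | 0.3919 |
| 0.7 | 20, 20 | 93.28* | 4.68 | 2.04 | 95.62 | 1.90 | 2.48 | 95.18 | 2.34 | 2.48 | 94.44 | 3.26 | 2.30 | 95.30 | 3.26 | 1.44 | 0.2984 | 0.2920 | 0.2918 | 0.2930 | 0.2889 |
| 0.7 | 50, 50 | 94.74 | 3.24 | 2.02 | 95.48 | 1.88 | 2.64 | 95.40 | 2.16 | 2.44 | 95.24 | 2.52 | 2.24 | 95.38 | 2.64 | 1.98 | 0.1910 | 0.1893 | 0.1893 | 0.1896 | 0.1885 |
| 0.8 | 5, 5 | 88.94* | 9.60 | 1.46 | 96.00* | 1.24 | 2.76 | 94.92 | 2.32 | 2.76 | 92.54* | 5.78 | 1.68 | 95.00 | 4.18 | 0.82 | 0.4419 | 0.4657 | 0.4526 | 0.4436 | 0.4289 |
| 0.8 | 10, 10 | 91.66* | 7.02 | 1.32 | 95.18 | 1.56 | 3.26 | 94.96 | 2.34 | 2.70 | 93.70* | 4.18 | 2.12 | 95.22 | 3.36 | 1.42 | 0.3411 | 0.3408 | 0.3349 | 0.3315 | 0.3242 |
| 0.8 | 20, 20 | 93.22* | 5.64 | 1.14 | 95.22 | 1.88 | 2.90 | 94.94 | 2.52 | 2.54 | 94.32 | 3.64 | 2.04 | 95.00 | 3.34 | 1.66 | 0.2424 | 0.2425 | 0.2399 | 0.2386 | 0.2356 |
| 0.8 | 50, 50 | 94.60 | 3.94 | 1.46 | 95.34 | 1.94 | 2.72 | 95.20 | 2.32 | 2.48 | 95.04 | 2.86 | 2.10 | 95.40 | 2.70 | 1.90 | 0.1543 | 0.1544 | 0.1537 | 0.1533 | 0.1525 |
| 0.9 | 5, 5 | 88.32* | 11.14 | 0.54 | 95.28 | 1.14 | 3.58 | 95.04 | 2.58 | 2.38 | 91.96* | 6.76 | 1.28 | 94.70 | 4.38 | 0.92 | 0.2799 | 0.3557 | 0.3295 | 0.3043 | 0.3029 |
| 0.9 | 10, 10 | 90.80* | 8.50 | 0.70 | 95.12 | 1.44 | 3.44 | 94.76 | 2.52 | 2.72 | 93.40* | 4.98 | 1.62 | 95.08 | 3.30 | 1.62 | 0.2175 | 0.2410 | 0.2294 | 0.2187 | 0.2181 |
| 0.9 | 20, 20 | 92.48* | 6.92 | 0.60 | 95.24 | 1.74 | 3.02 | 95.12 | 2.36 | 2.52 | 94.22* | 4.28 | 1.50 | 95.20 | 3.20 | 1.60 | 0.1550 | 0.1621 | 0.1574 | 0.1532 | 0.1531 |
| 0.9 | 50, 50 | 94.52 | 4.62 | 0.86 | 95.10 | 1.98 | 2.92 | 95.30 | 2.30 | 2.40 | 95.06 | 3.22 | 1.72 | 95.40 | 2.62 | 1.98 | 0.0976 | 0.0996 | 0.0983 | 0.0971 | 0.0971 |
| 0.95 | 5, 5 | 86.86* | 12.96 | 0.18 | 95.16 | 1.12 | 3.72 | 95.10 | 2.62 | 2.28 | 91.52* | 7.72 | 0.76 | 95.02 | 4.12 | 0.86 | 0.1691 | 0.2593 | 0.2289 | 0.1973 | 0.2058 |
| 0.95 | 10, 10 | 89.90* | 9.84 | 0.26 | 94.80 | 1.48 | 3.72 | 94.98 | 2.50 | 2.52 | 92.62* | 6.28 | 1.10 | 94.84 | 3.46 | 1.70 | 0.1277 | 0.1611 | 0.1489 | 0.1359 | 0.1398 |
| 0.95 | 20, 20 | 92.30* | 7.50 | 0.20 | 95.22 | 1.74 | 3.04 | 95.12 | 2.48 | 2.40 | 94.02* | 4.76 | 1.22 | 94.92 | 3.32 | 1.76 | 0.0931 | 0.1021 | 0.0975 | 0.0925 | 0.0941 |
| 0.95 | 50, 50 | 94.20* | 5.32 | 0.48 | 95.30 | 1.88 | 2.82 | 95.24 | 2.40 | 2.36 | 95.08 | 3.46 | 1.46 | 95.32 | 2.78 | 1.90 | 0.0579 | 0.0602 | 0.0590 | 0.0577 | 0.0582 |

Legend: "cov" = actual coverage rate; "LE" = lower error rate; "UE" = upper error rate; * = rejection of $H_0$.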


[Figure 2 image: six panels, one for each of $R$ = 0.5, 0.6, 0.7, 0.8, 0.9, 0.95, each comparing the AN, Logit, Probit, Arcsin, and Log-log CIs.]

Figure 2: Lower and upper uncoverage rates ($\lambda_x = 2$, $R \ge 0.5$, $n_x = n_y = 5$).

(iv) Differences in behavior among the four "transformed" CIs tend to become more apparent for small sample sizes and for extreme values of $R$ (close to 0 or 1), as can be seen by looking at the bottom right graph of Figure 2. Clearly, the same differences tend to vanish as the sample sizes increase and $R$ tends to 0.5 (see the top left graph in Figure 3).

(v) Contrary to what one would predict, the performance in terms of coverage rate of the logit, probit, and complementary log-log-based CIs does not seem to be negatively affected by a small sample size; in fact, as underlined before, a (sometimes significant) increase in the coverage rate of the logit CIs is noticed when the sample size is small ($n_x = 5$), whereas the other two CIs show an almost constant trend. The CI exploiting the arcsine root transformation is the one that benefits most from an increase in sample size. As expected, the average length of all the CIs decreases as the sample sizes increase (see the "Width" columns of Table 3).

(vi) As one could expect, there is a symmetry in the behavior of each CI when considering two complementary values of $R$ (i.e., values of $R$ summing to 1): keeping the sample size fixed, the values of the coverage rate and average width are similar. As to the uncoverage rates, they are nearly exchangeable for the AN, logit, probit, and arcsine CIs; that is, a lower uncoverage error for a fixed value of $R$ is similar to the upper uncoverage error for its complementary value (or at least their relative magnitude is exchangeable). This feature does not hold for log-log CIs, which always present a left-side uncoverage error larger than the right-side one. These features are weakened for $R$ equal to 0.9 (and $1 - R$ then equal to 0.1), probably because of the presence of "nonfeasible" samples, as discussed in Section 4.1, which distorts this symmetry condition.

(vii) Finally (the relative results are not reported here for the sake of brevity), moving to larger values of $\lambda_x$ (i.e., larger than 2), keeping $R$ constant, seems to benefit the performance of all the CIs, as can be quantitatively noted by the reduced number of rejections of the null hypothesis of equivalence between actual and theoretical coverage. This may be explained by the fact that, for a large parameter $\lambda$, the Poisson distribution tends to a normal distribution; therefore, larger values of $\lambda_x$ imply a better normal approximation to the Poisson and, presumably, to the distribution of the MLE.

In Figure 4, for illustrative purposes, the MC distribution of $\hat{R}$ and of the transformed data according to the four transformations is displayed ($R = 0.8$ and $n_x = n_y = 5$). We can note at a glance that logit and probit produce distributions closer to normality than the arcsine root and complementary log-log functions, which do not seem completely able to "correct" the skewness of the original distribution. However, applying the Shapiro-Wilk test of normality to the four transformed distributions leads to very low $P$-values, practically equal to 0, except for the probit function, whose $P$-value is 0.03132; this indicates that, in this case, all the transformed data distributions are still far from normality.


[Figure 3 image: six panels, one for each of $R$ = 0.5, 0.6, 0.7, 0.8, 0.9, 0.95, each comparing the AN, Logit, Probit, Arcsin, and Log-log CIs.]

Figure 3: Lower and upper uncoverage rates ($\lambda_x = 2$, $R \ge 0.5$, $n_x = n_y = 50$).

[Figure 4 image: five histograms, with panels labeled $R$, Logit($R$), Probit($R$), Arcsin($R$), and log-log($R$).]

Figure 4: Histogram of the MC empirical distribution of $\hat{R}$ and $\hat{\theta}$ according to the transformations of Table 2 ($\lambda_x = 2$, $R = 0.8$, $n_x = n_y = 5$). Superimposed is the density function of the normal distribution with mean and standard deviation equal to the mean and standard deviation of the data. Note that neither the $y$-axis nor the $x$-axis shares the same scale in the different plots.

5. An Application

The application we illustrate here is based on the data described in [29] and already used in [7], to which we refer the reader for full details. On these data, we build the four 95% CIs based on the transformations of Table 2, along with the standard one. The results (lower and upper bounds and length of the CIs) are reported in Table 4. Although the five CIs do not differ much from each other, we can note that all four transformation-based intervals have both lower and upper bounds smaller than the corresponding bounds of the standard naïve interval (i.e., they are left-shifted with respect to it); the logit transformation yields an interval a bit wider than the standard one, whereas the other three transformations produce a slight decrease in its length.

Table 4: Results for the application data.

| Type | $R_L$ | $R_U$ | Width |
|---|---|---|---|
| AN | 0.7452 | 0.9232 | 0.1780 |
| Logit | 0.7256 | 0.9054 | 0.1799 |
| Probit | 0.7302 | 0.9080 | 0.1777 |
| Arcsine | 0.7365 | 0.9128 | 0.1763 |
| Log-log | 0.7363 | 0.9113 | 0.1750 |

6. Conclusions

This work provided an empirical analysis of the convenience of data transformations for estimating the reliability parameter in stress-strength models. We focused on the model involving two independent Poisson random variables, since this is a case where the distribution of the MLE is not easily derivable and exact CIs cannot be built; thus, transformation of the estimates is a viable tool to refine the standard naïve interval estimator derived from the central limit theorem. Four transformations were considered (logit, probit, arcsine root, and complementary log-log) and empirically assessed under a number of scenarios in terms of the coverage and average length of the CIs they produce. Results are in favor of the logit, probit, and complementary log-log functions (although to varying degrees), which ensure a coverage rate close to the nominal coverage even with small sample sizes. On the contrary, the arcsine root function, although improving the performance of the standard CI, often keeps its coverage rates under the nominal confidence level. These findings were to some extent predictable, since logit and probit transformations are very popular link functions in generalized regression models, but are a bit surprising with regard to the complementary log-log function, whose use is limited.

Future research will ascertain whether such results hold true for other distributions for the stress-strength model and will eventually inspect alternative data transformations.

Conflict of Interests

The author declares that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

The author would like to thank the editor and two anonymous referees, whose suggestions and comments improved the quality of the original version of the paper.

References

[1] J. L. Fleiss, B. Levin, and M. C. Paik, Statistical Methods for Rates and Proportions, John Wiley & Sons, Hoboken, NJ, USA, 3rd edition, 2003.

[2] A. Agresti and B. A. Coull, "Approximate is better than "exact" for interval estimation of binomial proportions," The American Statistician, vol. 52, no. 2, pp. 119–126, 1998.

[3] L. D. Brown, T. T. Cai, and A. Dasgupta, "Confidence intervals for a binomial proportion and asymptotic expansions," The Annals of Statistics, vol. 30, no. 1, pp. 160–201, 2002.

[4] A. M. Pires and C. Amado, "Interval estimators for a binomial proportion: comparison of twenty methods," REVSTAT Statistical Journal, vol. 6, no. 2, pp. 165–197, 2008.

[5] B. V. Gnedenko and I. A. Ushakov, Probabilistic Reliability Engineering, John Wiley & Sons, New York, NY, USA, 1995.

[6] S. Kotz, Y. Lumelskii, and M. Pensky, The Stress-strength Model and Its Generalizations: Theory and Applications, World Scientific, New York, NY, USA, 2003.

[7] A. Barbiero, "Inference for reliability of Poisson data," Journal of Quality and Reliability Engineering, vol. 2013, Article ID 530530, 8 pages, 2013.

[8] G. Blom, "Transformations of the binomial, negative binomial, Poisson and $\chi^2$ distributions," Biometrika, vol. 41, pp. 302–316, 1954.

[9] D. C. Montgomery, Design and Analysis of Experiments, John Wiley & Sons, New York, NY, USA, 6th edition, 2005.

[10] Y. P. Chaubey and G. S. Mudholkar, "On the symmetrizing transformations of random variables," Concordia University, unpublished preprint, Mathematics and Statistics, 1983.

[11] M. H. Hoyle, "Transformations: an introduction and a bibliography," International Statistical Review, vol. 41, no. 2, pp. 203–223, 1973.

[12] A. Foi, "Optimization of variance-stabilizing transformations," submitted.

[13] W. Greene, Econometric Analysis, Prentice Hall, New York, NY, USA, 7th edition, 2012.

[14] A. Dasgupta, Asymptotic Theory of Statistics and Probability, Springer, New York, NY, USA, 2008.

[15] L. Brown, T. Cai, R. Zhang, L. Zhao, and H. Zhou, "The root-unroot algorithm for density estimation as implemented via wavelet block thresholding," Probability Theory and Related Fields, vol. 146, no. 3-4, pp. 401–433, 2010.

[16] J. H. Aldrich and F. D. Nelson, Linear Probability, Logit, and Probit Models, Sage, Beverly Hills, Calif, USA, 1984.

[17] A. Baklizi, "Inference on Pr($X < Y$) in the two-parameter Weibull model based on records," ISRN Probability and Statistics, vol. 2012, Article ID 263612, 11 pages, 2012.

[18] K. Krishnamoorthy and Y. Lin, "Confidence limits for stress-strength reliability involving Weibull models," Journal of Statistical Planning and Inference, vol. 140, no. 7, pp. 1754–1764, 2010.

[19] S. P. Mukherjee and S. S. Maiti, "Stress-strength reliability in the Weibull case," in Frontiers in Reliability, vol. 4, pp. 231–248, World Scientific, Singapore, 1998.

[20] D. K. Al-Mutairi, M. E. Ghitany, and D. Kundu, "Inferences on stress-strength reliability from Lindley distributions," Communications in Statistics - Theory and Methods, vol. 42, no. 8, pp. 1443–1463, 2013.

[21] M. M. Shoukri, M. A. Chaudhary, and A. Al-Halees, "Estimating $P(Y < X)$ when $X$ and $Y$ are paired exponential variables," Journal of Statistical Computation and Simulation, vol. 75, no. 1, pp. 25–38, 2005.

[22] A. Baklizi and O. Eidous, "Nonparametric estimation of $P(X < Y)$ using kernel methods," Metron, vol. 64, no. 1, pp. 47–60, 2006.

[23] G. Qin and L. Hotilovac, "Comparison of non-parametric confidence intervals for the area under the ROC curve of a continuous-scale diagnostic test," Statistical Methods in Medical Research, vol. 17, no. 2, pp. 207–221, 2008.

[24] W. H. Ahrens, "Use of the arcsine and square root transformations for subjectively determined percentage data," Weed Science, vol. 38, no. 4-5, pp. 452–458, 1990.

[25] A. Barbiero, "Interval estimators for reliability: the bivariate normal case," Journal of Applied Statistics, vol. 39, no. 3, pp. 501–512, 2012.

[26] A. Barbiero, "Confidence intervals for reliability of stress-strength models in the normal case," Communications in Statistics - Simulation and Computation, vol. 40, no. 6, pp. 907–925, 2011.

[27] B. Efron and R. Tibshirani, An Introduction to the Bootstrap, Chapman and Hall, London, UK, 1993.

[28] I. R. Cardoso de Oliveira and D. F. Ferreira, "Multivariate extension of chi-squared univariate normality test," Journal of Statistical Computation and Simulation, vol. 80, no. 5-6, pp. 513–526, 2010.

[29] P. C. Wang, "Comparisons on the analysis of Poisson data," Quality and Reliability Engineering International, vol. 15, pp. 379–383, 1999.


empirically compared in terms of coverage probability and expected length.

The paper is laid out as follows. In Section 2, a brief discussion on data transformations is presented, with a special focus on those ordinarily used in connection with the estimation of the reliability parameter of a stress-strength model. Section 3 introduces the stress-strength model with independent Poisson-distributed stress and strength, recalling the formulas for the reliability, its maximum likelihood estimator, and the standard large-sample CI, and introducing refinements of the latter based on data transformations. Section 4 is devoted to the Monte Carlo simulation study, which empirically assesses the statistical performance of the CIs. An example of application to real data is provided in Section 5, and Section 6 concludes the paper with some final remarks.

2. Transformations and Application to the Estimation of a Stress-Strength Model

In almost all fields of research, one has to deal with data that are not normal. It is common practice to transform the nonnormal data at hand in order to exploit theoretical results that strictly hold only for the normal distribution, with the objective of building plausible or more efficient estimates. Citing [8], "transformations of statistical variables are used in applied statistics mainly for two purposes." The first one is "variance stabilization": "A transformation is applied in order to make it possible to use, at any rate approximately, the standard techniques associated with continuous normal variation, for example, the methods of analysis of variance. In particular, transformations are required which stabilize the variance, that is, which make the variance of the transformed variable approximately independent of, for example, the binomial probability or the mean value of the Poisson distribution. [...] The constant-variance condition has led to the introduction of the inverse sine, inverse sinh, and square-root transformations, which are used nowadays in many fields of applications." In linear regression models, the unequal variance of the error terms produces estimates that are unbiased but are no longer best in the sense of having the smallest variance [9]. The second purpose is "normalization": "A transformation is used in order to facilitate the computation of tail sums of the distribution by the aid of the normal probability integral. [...] A review of the literature shows that a considerable number of transformations of binomial, negative binomial, Poisson, and $\chi^2$ variables have been proposed." In regression models, for example, nonnormality of the response variable invalidates the standard tests of significance with small samples, since they are based on the normality assumption [9]. Reference [10] noted how "approximately symmetrizing transformation of a random variable may be a more effective method of normalizing it than stabilizing its variance." Reference [11], although dated, and the more recent reference [12] provide an exhaustive review of transformations used in statistical data analysis.

If a random variable $X$, whose probability mass function or density function depends on a parameter $\theta$, is transformed by a function $Y = f(X)$, which we suppose henceforth to be strictly monotone, the standard deviation of $Y$, according to the delta method (see, e.g., [13]), is given approximately by

$$\sigma_{f(x)}(\theta) \approx \frac{\partial f}{\partial x}\bigl(\mu_x(\theta)\bigr) \cdot \sigma_x(\theta), \tag{1}$$

where $\mu_x(\theta)$ and $\sigma_x(\theta)$ are the expected value and standard deviation of $X$. Then, the (approximate) standard deviation of the transformed random variable $Y$ can be made equal to a constant if the function $f$ is chosen so as to satisfy the following relationship:

$$f(x) = \int^{x} \frac{k}{\sigma_x(\theta)}\, d\mu_x(\theta). \tag{2}$$

Thus, for example, if $X$ is distributed as a Poisson random variable with parameter $\lambda$, so that $\mu_x = \sigma_x^2 = \lambda$, we obtain that the function $f(x) = \int^{x} k \lambda^{-1/2}\, d\lambda = k^{*}\sqrt{x}$ is a variance-stabilizing function, with $k$ and $k^{*}$ proper positive constants. Formula (2) has been empirically shown to provide reasonable stabilization in various applications, as confirmed by its extensive use, but other criteria can be employed, based on different "notions" of variance stabilization [8, 12]. Otherwise, modifications to the function derived by (2) can be proposed, for example, in order to reduce or remove the bias [14]. Reference [15] studied the root transformation $f(x) = \sqrt{x + c}$ for a Poisson-distributed random variable $X$ and demonstrated that, for $c = 1/4$, the root-transformed variable $\sqrt{X + c}$ has vanishing first-order bias and almost constant variance.
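The stabilizing effect of the root transformation can be checked numerically. The following is a minimal sketch, not part of the original study: it computes the mean and variance of $\sqrt{X + 1/4}$ for a Poisson variable by direct summation of the probability mass function. The function names and the truncation rule are illustrative assumptions.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Poisson probability P(X = k), computed on the log scale for stability."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def root_moments(lam: float, c: float = 0.25) -> tuple[float, float]:
    """Mean and variance of sqrt(X + c) for X ~ Poisson(lam).

    The infinite sum is truncated far in the right tail (an assumed,
    generous cut-off: mean plus 12 standard deviations plus 30).
    """
    upper = int(lam + 12 * math.sqrt(lam) + 30)
    mean = 0.0
    second_moment = 0.0
    for k in range(upper + 1):
        p = poisson_pmf(k, lam)
        mean += p * math.sqrt(k + c)
        second_moment += p * (k + c)  # E[(sqrt(X + c))^2] = E[X] + c
    return mean, second_moment - mean ** 2


for lam in (2.0, 5.0, 20.0, 100.0):
    mean, var = root_moments(lam)
    # Var(X) = lam grows with lam, whereas Var(sqrt(X + 1/4)) stays near 1/4
    print(f"lambda={lam:6.1f}  sqrt(lambda)={math.sqrt(lam):.4f}  "
          f"mean={mean:.4f}  variance={var:.4f}")
```

Under these assumptions, the printed variances should remain close to $1/4$ for all the values of $\lambda$ considered, and the means close to $\sqrt{\lambda}$, in line with the result of [15].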

Whichever criterion is selected, very often the proper transformation to be adopted is tied to the particular statistical distribution underlying the data. However, just as often, the exact distribution of an estimator of a probability or proportion is not easily derivable, even if the distribution that the sample data came from is known. This sometimes happens, for example, for the maximum likelihood estimator (MLE) of the reliability parameter $R$ of a stress-strength model. In this case, we do not know "a priori" which transformation may fit the data (i.e., the distribution of the MLE) best.

The focus of this paper is the estimation of a probability; thus, we will confine ourselves to transformations of proportions. For this case, among the most used transformations, we recall here the logit, probit, arcsine, and complementary log-log. Table 1 reports the expression, the image, the first derivative, and the inverse of these four functions.

The logit and probit transformations are widely used in the homonymous models [16] and, more generally, when dealing with skewed proportion distributions. They are similar, since they are both antisymmetric functions around the point $x = 0.5$, that is, $f(x + 0.5) = -f(-x + 0.5)$. The difference lies in the fact that the logit function takes absolute values larger than the probit, as can be noted by looking at Figure 1. With regard to the estimation of the reliability parameter for the stress-strength model, the logit transformation has been by far the most used transformation for improving the statistical performance of standard large-sample CIs; among others, it has been considered by [17–19], all these contributions concerning the Weibull distribution; by [20] for the Lindley distribution; by [21] when stress and strength follow a bivariate exponential distribution; and by [22, 23] in a nonparametric context.

Table 1: Transformations.

| $f(x)$ | Codomain | $f'(x)$ | $f^{-1}(x)$ |
|---|---|---|---|
| $\log[x/(1 - x)]$ | $(-\infty, +\infty)$ | $1/[x(1 - x)]$ | $e^{x}/(1 + e^{x})$ |
| $\Phi^{-1}(x)$ | $(-\infty, +\infty)$ | $1/\phi(\Phi^{-1}(x))$ | $\Phi(x)$ |
| $\arcsin(\sqrt{x})$ | $(0, +\pi/2)$ | $1/[2\sqrt{x(1 - x)}]$ | $\sin^{2}(x)$ |
| $\log[-\log(1 - x)]$ | $(-\infty, +\infty)$ | $1/[(1 - x)\,\lvert\log(1 - x)\rvert]$ | $1 - e^{-e^{x}}$ |

[Figure 1 image: curves of the Logit, Probit, Arcsin, and Log-log functions for $x$ between 0 and 1.]

Figure 1: Plot of transformation functions and dotted lines of equation $y = 0$ and $x = 0.5$.

Through the years, the arcsine root transformation has gained great favor and application among practitioners [24], perhaps more than it really merits. As noted previously, it should be chosen for stabilizing binomial data but is often used to stabilize sample proportions as well (i.e., relative binomial data). However, it presents a drawback, highlighted in [25] and related to the fact that its codomain is the limited interval $(0, \pi/2)$; thus, its normalizing effect may turn out to be meagre, as already pointed out in some studies where it has been used for estimating the reliability of stress-strength models [19, 25, 26].

The complementary log-log function, which is sometimes used in binomial regression models, is slightly different from logit and probit, since it assumes negative values for $x < 1 - e^{-1}$ (and positive values for $x > 1 - e^{-1}$) and takes large positive values only for values of $x$ very close to 1; for example, it takes values larger than 1 if and only if $x > 1 - e^{-e} \approx 0.934$.
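As an illustration of the four transformations and of the properties just described, here is a minimal sketch using only the Python standard library. The dictionary `TRANSFORMS` and its keys are illustrative names, not notation from the paper; the probit relies on `statistics.NormalDist`.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: Phi = cdf, Phi^{-1} = inv_cdf

# name -> (transformation f, inverse f^{-1}), following Table 1
TRANSFORMS = {
    "logit": (lambda x: math.log(x / (1 - x)),
              lambda t: 1 / (1 + math.exp(-t))),   # same as e^t / (1 + e^t)
    "probit": (_STD_NORMAL.inv_cdf, _STD_NORMAL.cdf),
    "arcsin": (lambda x: math.asin(math.sqrt(x)),
               lambda t: math.sin(t) ** 2),        # valid for t in (0, pi/2)
    "cloglog": (lambda x: math.log(-math.log(1 - x)),
                lambda t: 1 - math.exp(-math.exp(t))),
}

for name, (f, f_inv) in TRANSFORMS.items():
    # each inverse maps the transformed value back to the original proportion
    print(name, [round(f_inv(f(x)), 6) for x in (0.1, 0.5, 0.9)])

# Antisymmetry of logit and probit around x = 0.5
for name in ("logit", "probit"):
    f = TRANSFORMS[name][0]
    print(name, f(0.5 + 0.3), f(0.5 - 0.3))

# The complementary log-log changes sign at x = 1 - e^{-1}
# and exceeds 1 only beyond x = 1 - e^{-e}
print(1 - math.exp(-1), 1 - math.exp(-math.e))
```

Note that the arcsine root is the only one of the four whose image is bounded, which is why its inverse is meaningful only on $(0, \pi/2)$.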

In the next section, we will apply these transformations to the estimation of a stress-strength model with both stress and strength following a Poisson distribution.

3. Inference on the Reliability Parameter for a Stress-Strength Model with Independent Poisson Stress and Strength

Let $X$ and $Y$ be independent rv's modeling stress and strength, respectively, with $X \sim \text{Poisson}(\lambda_x)$ and $Y \sim \text{Poisson}(\lambda_y)$. Then, the reliability $R = P(X < Y)$ of the stress-strength model is given by (see [6, page 103])

$$R = P(X < Y) = \sum_{i=0}^{+\infty} \frac{\lambda_x^{i} e^{-\lambda_x}}{i!} \left[ 1 - \sum_{j=0}^{i} \frac{\lambda_y^{j} e^{-\lambda_y}}{j!} \right] = \lim_{k \to \infty} \sum_{i=0}^{k} \frac{\lambda_x^{i} e^{-\lambda_x}}{i!} \left[ 1 - \sum_{j=0}^{i} \frac{\lambda_y^{j} e^{-\lambda_y}}{j!} \right]. \tag{3}$$
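Formula (3) can be evaluated numerically by truncating the outer sum. The following is a minimal sketch, assuming a stopping rule (the remaining probability mass of $X$ falling below a tolerance) that is our own choice and is not prescribed in the paper; the function names are illustrative.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Poisson probability P(X = k); lam = 0 is the degenerate case at 0."""
    if lam == 0:
        return 1.0 if k == 0 else 0.0
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def reliability(lam_x: float, lam_y: float,
                tol: float = 1e-12, max_terms: int = 10_000) -> float:
    """R = P(X < Y) for independent Poisson stress X and strength Y, as in (3).

    The sum over i stops once P(X > i) < tol, so the truncation error
    is below tol (each omitted term is at most P(X = i)).
    """
    total = 0.0
    cdf_x = 0.0
    cdf_y = 0.0
    for i in range(max_terms):
        p_x = poisson_pmf(i, lam_x)
        cdf_y += poisson_pmf(i, lam_y)   # running value of P(Y <= i)
        total += p_x * (1.0 - cdf_y)     # P(X = i) * P(Y > i)
        cdf_x += p_x
        if 1.0 - cdf_x < tol:
            break
    return total


# With equal parameters, P(X < Y) = P(Y < X), so R = (1 - P(X = Y)) / 2 < 0.5
print(reliability(2.0, 2.0))
# A strength with a larger mean than the stress gives a larger reliability
print(reliability(2.0, 5.0))
```

Accumulating the distribution function of $Y$ inside the loop keeps the cost linear in the number of retained terms, instead of recomputing the inner sum of (3) for each $i$.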

If two simple random samples of size $n_x$ and $n_y$ from $X$ and $Y$, respectively, are available, the reliability parameter can be estimated by the ML estimator, obtained by substituting in (3) the MLEs of the unknown parameters $\lambda_x$ and $\lambda_y$, that is, the sample means $\bar{x}$ and $\bar{y}$:

$$\hat{R} = \sum_{i=0}^{+\infty} \frac{\bar{x}^{i} e^{-\bar{x}}}{i!} \left[ 1 - \sum_{j=0}^{i} \frac{\bar{y}^{j} e^{-\bar{y}}}{j!} \right]. \tag{4}$$

Deriving the expression of the variance of the MLE is not straightforward; however, an approximate expression has been derived through the delta method in [7]. Based upon this approximation, a large-sample $1 - \alpha$ CI for $R$ has been built. Such an interval estimator has the usual expression

$$(R_L, R_U) = \left( \hat{R} - z_{1-\alpha/2} \sqrt{\hat{V}(\hat{R})},\ \hat{R} + z_{1-\alpha/2} \sqrt{\hat{V}(\hat{R})} \right), \tag{5}$$

where $\hat{V}(\hat{R})$ is the sample estimate of the (asymptotic) variance of $\hat{R}$ (see [7] for details).
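The plug-in estimate (4) and the interval (5) can be sketched as follows. This is an illustration under stated assumptions, not the implementation of [7]: the analytic variance expression of [7] is not reproduced here, so the delta-method variance is approximated with numerical (central-difference) derivatives of $R$ with respect to $\lambda_x$ and $\lambda_y$, using $\text{Var}(\bar{x}) \approx \bar{x}/n_x$ and $\text{Var}(\bar{y}) \approx \bar{y}/n_y$. The sample values and function names are hypothetical.

```python
import math
from statistics import NormalDist, fmean


def poisson_pmf(k: int, lam: float) -> float:
    if lam == 0:
        return 1.0 if k == 0 else 0.0
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def reliability(lam_x: float, lam_y: float, tol: float = 1e-12) -> float:
    """Truncated version of formula (3); stops when P(X > i) < tol."""
    total = cdf_x = cdf_y = 0.0
    for i in range(10_000):
        p_x = poisson_pmf(i, lam_x)
        cdf_y += poisson_pmf(i, lam_y)
        total += p_x * (1.0 - cdf_y)
        cdf_x += p_x
        if 1.0 - cdf_x < tol:
            break
    return total


def standard_ci(x, y, alpha: float = 0.05, h: float = 1e-5):
    """Plug-in MLE (4) and large-sample CI (5).

    Assumption: the variance is a delta-method approximation built from
    numerical derivatives; both sample means must be larger than h.
    """
    n_x, n_y = len(x), len(y)
    x_bar, y_bar = fmean(x), fmean(y)
    r_hat = reliability(x_bar, y_bar)
    d_x = (reliability(x_bar + h, y_bar) - reliability(x_bar - h, y_bar)) / (2 * h)
    d_y = (reliability(x_bar, y_bar + h) - reliability(x_bar, y_bar - h)) / (2 * h)
    v_hat = d_x ** 2 * x_bar / n_x + d_y ** 2 * y_bar / n_y
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * math.sqrt(v_hat)
    return r_hat, v_hat, (r_hat - half_width, r_hat + half_width)


# Hypothetical small samples of stress (x) and strength (y)
stress = [1, 3, 2, 0, 2]
strength = [4, 6, 3, 5, 7]
r_hat, v_hat, (r_low, r_up) = standard_ci(stress, strength)
print(r_hat, v_hat, (r_low, r_up))
```

Nothing in (5) constrains the bounds to the unit interval, so with small samples and $\hat{R}$ near 1 the upper bound may exceed 1; this is one practical symptom of the poor normal approximation that motivates the transformations.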

Although such an estimator has been proved to have a satisfactory behavior in terms of coverage for several combinations of sample sizes and values of the reliability parameter, some decay of the performance is observed when $R$ gets close to the extreme values 0 and 1 and when sample sizes are small. In these cases, the approximation to the normal distribution is in fact very rough, especially because of the skewness of the distribution of $\hat{R}$. Thus, transformations of the values of the estimates can be considered in order to "make the data more normal" and produce CIs, based on transformed data, that have a coverage closer to the nominal level.


Table 2: CIs of the reliability parameter $R$ under variance-stabilizing/normalizing transformations.

| Transformation | $\hat{\theta}$ | CI for $\theta$: $(\theta_L, \theta_U)$ | CI for $R$: $(R_L, R_U)$ |
|---|---|---|---|
| Logit | $\log[\hat{R}/(1 - \hat{R})]$ | $\hat{\theta} \mp z_{1-\alpha/2} \sqrt{\hat{V}(\hat{R})} / [\hat{R}(1 - \hat{R})]$ | $\left( \exp(\theta_L)/(1 + \exp(\theta_L)),\ \exp(\theta_U)/(1 + \exp(\theta_U)) \right)$ |
| Probit | $\Phi^{-1}(\hat{R})$ | $\hat{\theta} \mp z_{1-\alpha/2} \sqrt{\hat{V}(\hat{R})} / \phi(\Phi^{-1}(\hat{R}))$ | $\left( \Phi(\theta_L),\ \Phi(\theta_U) \right)$ |
| Arcsin | $\arcsin\sqrt{\hat{R}}$ | $\hat{\theta} \mp z_{1-\alpha/2} \sqrt{\hat{V}(\hat{R})} / [2\sqrt{\hat{R}(1 - \hat{R})}]$ | $\left( \sin^{2}(\theta_L),\ \sin^{2}(\theta_U) \right)$ |
| Loglog | $\log[-\log(1 - \hat{R})]$ | $\hat{\theta} \mp z_{1-\alpha/2} \sqrt{\hat{V}(\hat{R})} / [(1 - \hat{R})\,\lvert\log(1 - \hat{R})\rvert]$ | $\left( 1 - \exp(-\exp(\theta_L)),\ 1 - \exp(-\exp(\theta_U)) \right)$ |

Recalling Table 1 and considering the logit transformation of the reliability parameter $R$, $\theta = \log[R/(1 - R)]$, the MLE of $\theta$ is $\hat{\theta} = \log[\hat{R}/(1 - \hat{R})]$, and an approximate $(1 - \alpha)$ CI for $\theta$ is

$$(\theta_L, \theta_U) = \left( \hat{\theta} - z_{1-\alpha/2} \cdot \frac{\sqrt{\hat{V}(\hat{R})}}{\hat{R}(1 - \hat{R})},\ \hat{\theta} + z_{1-\alpha/2} \cdot \frac{\sqrt{\hat{V}(\hat{R})}}{\hat{R}(1 - \hat{R})} \right). \tag{6}$$

Then, an approximate $(1 - \alpha)$ CI for $R$ is

$$(R_L, R_U) = \left( \frac{\exp(\theta_L)}{1 + \exp(\theta_L)},\ \frac{\exp(\theta_U)}{1 + \exp(\theta_U)} \right). \tag{7}$$

For the other transformations, following the same steps just outlined, approximate "transformed" CIs for $R$ can be obtained, which are alternatives to the standard naïve CI of (5); they are summarized in Table 2.
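The recipe of Table 2 (transform $\hat{R}$, build a symmetric interval on the transformed scale with the standard error rescaled by $f'(\hat{R})$, and map the bounds back) can be sketched as below. The function name `transformed_cis` is illustrative. As a check, the inputs are back-calculated from the AN row of Table 4: since the interval (5) is symmetric, its midpoint 0.8342 is taken as $\hat{R}$ and its half-width 0.0890 as $z_{1-\alpha/2}\sqrt{\hat{V}(\hat{R})}$; these two numbers are our derivation, not values reported in the paper.

```python
import math
from statistics import NormalDist

_N = NormalDist()  # standard normal distribution


def transformed_cis(r_hat: float, se: float, alpha: float = 0.05) -> dict:
    """CIs for R following Table 2; `se` is the standard error of r_hat.

    Each entry is (f, f', f^{-1}). All f are increasing, so bounds keep
    their order; the arcsine inverse assumes both bounds lie in (0, pi/2).
    """
    z = _N.inv_cdf(1 - alpha / 2)
    specs = {
        "AN": (lambda r: r, lambda r: 1.0, lambda t: t),
        "Logit": (lambda r: math.log(r / (1 - r)),
                  lambda r: 1 / (r * (1 - r)),
                  lambda t: math.exp(t) / (1 + math.exp(t))),
        "Probit": (_N.inv_cdf,
                   lambda r: 1 / _N.pdf(_N.inv_cdf(r)),
                   _N.cdf),
        "Arcsine": (lambda r: math.asin(math.sqrt(r)),
                    lambda r: 1 / (2 * math.sqrt(r * (1 - r))),
                    lambda t: math.sin(t) ** 2),
        "Log-log": (lambda r: math.log(-math.log(1 - r)),
                    lambda r: 1 / ((1 - r) * abs(math.log(1 - r))),
                    lambda t: 1 - math.exp(-math.exp(t))),
    }
    intervals = {}
    for name, (f, f_prime, f_inv) in specs.items():
        theta_hat = f(r_hat)
        half_width = z * se * f_prime(r_hat)  # delta-method standard error
        intervals[name] = (f_inv(theta_hat - half_width),
                           f_inv(theta_hat + half_width))
    return intervals


# Inputs back-calculated from the AN row of Table 4 (an assumption, see above)
z_975 = _N.inv_cdf(0.975)
for name, (low, up) in transformed_cis(0.8342, 0.0890 / z_975).items():
    print(f"{name:8s} {low:.4f} {up:.4f} {up - low:.4f}")
```

Up to rounding of the back-calculated inputs, the bounds obtained this way should agree with those of Table 4 to about three decimal places, which suggests that the table follows the construction of Table 2.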

An alternative way to make inference on the reliability parameter $R$ for the stress-strength model with independent Poisson random variables can be summarized as follows: (1) to transform (normalize) samples from $X$ and $Y$ according to a proper transformation for the Poisson distribution; (2) to compute point and interval estimates of $R$ using the methods for a stress-strength model with independent normal variables with known variances [6, page 112]. Letting $\Xi = \sqrt{X + 1/4}$ and $\Upsilon = \sqrt{Y + 1/4}$, then $\Xi$ and $\Upsilon$ are independent and approximately distributed as normal random variables with variance $\sigma_\xi^2 = \sigma_\upsilon^2 = 1/4$ and expected value $\mu_\xi = \sqrt{\lambda_x}$ and $\mu_\upsilon = \sqrt{\lambda_y}$, respectively. Then, instead of estimating $R = P(X < Y)$, one can estimate $R' = P(\Xi < \Upsilon)$; a $1-\alpha$ confidence interval for $R'$ is given by

$$\left(\Phi\left[\hat{\eta} - \frac{z_{1-\alpha/2}}{\sqrt{M}}\right],\ \Phi\left[\hat{\eta} + \frac{z_{1-\alpha/2}}{\sqrt{M}}\right]\right), \tag{8}$$

with $M = (\sigma_\xi^2 + \sigma_\upsilon^2)/(\sigma_\xi^2/n_x + \sigma_\upsilon^2/n_y)$ and $\hat{\eta} = (\bar{\upsilon} - \bar{\xi})/\sigma$ being an estimator of $\eta = (\mu_\upsilon - \mu_\xi)/\sigma$, where $\bar{\xi}$ and $\bar{\upsilon}$ are the sample means of $\Xi$ and $\Upsilon$, respectively, and $\sigma = \sqrt{\sigma_\xi^2 + \sigma_\upsilon^2} = \sqrt{1/2}$.
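A minimal Python sketch of the interval in (8) may help fix the notation. It assumes the transformation is $\sqrt{\text{count} + 1/4}$ with known variance 1/4; the function name `root_transform_ci` and the sample counts are hypothetical and serve only as an illustration.

```python
from math import sqrt
from statistics import NormalDist, fmean


def root_transform_ci(x, y, alpha=0.05):
    """CI of Eq. (8) for R' = P(Xi < Upsilon) after the square-root transform."""
    std_normal = NormalDist()
    # Transform the Poisson counts: sqrt(count + 1/4) has variance close to 1/4.
    xi = [sqrt(v + 0.25) for v in x]
    ups = [sqrt(v + 0.25) for v in y]
    var_xi = var_ups = 0.25             # variances treated as known
    sigma = sqrt(var_xi + var_ups)      # equals sqrt(1/2)
    m = (var_xi + var_ups) / (var_xi / len(x) + var_ups / len(y))
    eta_hat = (fmean(ups) - fmean(xi)) / sigma
    half = std_normal.inv_cdf(1 - alpha / 2) / sqrt(m)
    return std_normal.cdf(eta_hat - half), std_normal.cdf(eta_hat + half)


# Hypothetical stress (x) and strength (y) counts, for illustration only.
stress = [1, 3, 2, 0, 2]
strength = [4, 6, 3, 5, 7]
lower, upper = root_transform_ci(stress, strength)
print(f"95% CI for R': ({lower:.4f}, {upper:.4f})")
```

Note that the interval targets $R'$, not $R$ itself, so its quality depends on how well the normal approximation of the transformed counts holds.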

An alternative procedure to make inference about the reliability parameter $R$ can be provided by parametric bootstrap [27, pages 53–56]. In the basic version, it works as follows for the estimation problem at hand.

(1) Estimate the unknown parameters of the Poisson rv's $X$ and $Y$ through their sample means $\bar{x}$ and $\bar{y}$.

(2) Draw independently a bootstrap sample $x^*$ of size $n_x$ from a Poisson rv $X^*$ with parameter $\bar{x}$ and a bootstrap sample $y^*$ of size $n_y$ from a Poisson rv $Y^*$ with parameter $\bar{y}$.

(3) Estimate the reliability parameter, say $\hat{R}^*$, on samples $x^*$ and $y^*$, using the very same expression in (4).

(4) Repeat steps 2 and 3 $B$ times ($B$ sufficiently large, e.g., 2000), thus obtaining the bootstrap distribution of $\hat{R}^*$.

(5) Estimate a $(1-\alpha)$ bootstrap percentile CI for $R$ from the $\hat{R}^*$ distribution, taking the $\alpha/2$ and $1-\alpha/2$ quantiles:

$$\left(\hat{R}^*_{\alpha/2},\ \hat{R}^*_{1-\alpha/2}\right). \tag{9}$$
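The five steps above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the code used in the study: the reliability series is truncated once the remaining Poisson mass of $X$ is negligible, Poisson draws use Knuth's multiplication method (adequate for small means), and the percentile rule is one simple convention among several. All function names and the sample data are hypothetical.

```python
import math
import random
from statistics import fmean


def reliability(lam_x, lam_y, tol=1e-12, max_terms=10_000):
    """R = P(X < Y) for independent Poisson X and Y, via a truncated series."""
    p_x, p_y = math.exp(-lam_x), math.exp(-lam_y)   # P(X = 0), P(Y = 0)
    cdf_x = cdf_y = total = 0.0
    for i in range(max_terms):
        cdf_x += p_x
        cdf_y += p_y
        total += p_x * max(0.0, 1.0 - cdf_y)        # P(X = i) * P(Y > i)
        if 1.0 - cdf_x < tol:                       # remaining mass of X is negligible
            break
        p_x *= lam_x / (i + 1)                      # pmf recursion to i + 1
        p_y *= lam_y / (i + 1)
    return total


def poisson_draw(lam, rng):
    """One Poisson(lam) draw by Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def bootstrap_percentile_ci(x, y, alpha=0.05, n_boot=2000, seed=0):
    rng = random.Random(seed)
    lam_x, lam_y = fmean(x), fmean(y)                       # step 1
    r_star = []
    for _ in range(n_boot):                                 # step 4
        xs = [poisson_draw(lam_x, rng) for _ in x]          # step 2
        ys = [poisson_draw(lam_y, rng) for _ in y]
        r_star.append(reliability(fmean(xs), fmean(ys)))    # step 3
    r_star.sort()
    # step 5: empirical quantiles by index (a simple convention)
    lo = r_star[math.floor((alpha / 2) * (n_boot - 1))]
    hi = r_star[math.ceil((1 - alpha / 2) * (n_boot - 1))]
    return lo, hi


# Hypothetical counts, for illustration only.
print(bootstrap_percentile_ci([1, 3, 2, 0, 2], [4, 6, 3, 5, 7], n_boot=500, seed=1))
```

Fixing the seed makes the interval reproducible; a different seed or a larger `n_boot` will move the bounds slightly.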

4. Simulation Study

4.1. Scope and Design. The simulation study aims at empirically comparing the performance of the interval estimators presented in the previous section, namely, the standard CI of (5), labeled "AN"; those of Table 2 (labeled "logit," "probit," "arcsine," and "loglog"); and the CIs of (8) and (9), in terms of coverage rate (and also lower and upper uncoverage rates) and expected length. In this Monte Carlo (MC) study, the value of the parameter $\lambda_x$ of the Poisson distribution for stress $X$ is set equal to a "reference" value, 2, and the parameter $\lambda_y$ of the Poisson distribution modeling strength is varied in order to obtain, according to (3), several different levels of reliability $R$, namely, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, and 0.95.

For each couple $(\lambda_x, \lambda_y)$, a large number ($nS = 5{,}000$) of samples of size $n_x$ and of size $n_y$ are drawn from $X \sim \text{Poisson}(\lambda_x)$ and $Y \sim \text{Poisson}(\lambda_y)$, independently. Different and unequal sample sizes are here considered (all the 16 possible combinations between the values $n_x = 5, 10, 20, 50$ and $n_y = 5, 10, 20, 50$). On each pair of samples, the 95% CIs for $R$ listed above are built, and the MC coverage rate and the length of the CIs are computed over the $nS$ MC samples. Moreover, the lower and upper uncoverage (or error) rates are computed, that is, the proportion of CIs for which $R < R_L$ and $R > R_U$, respectively.
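The performance measures just defined are simple proportions over the simulated intervals. A short Python sketch, with hypothetical names and toy intervals, shows how coverage, lower error, upper error, and average width could be tallied for one scenario:

```python
def coverage_summary(intervals, true_r):
    """Return (coverage, lower error, upper error) rates in percent."""
    n = len(intervals)
    lower_err = sum(1 for lo, _ in intervals if true_r < lo)   # R < R_L
    upper_err = sum(1 for _, hi in intervals if true_r > hi)   # R > R_U
    covered = n - lower_err - upper_err
    return tuple(100.0 * c / n for c in (covered, lower_err, upper_err))


def mean_width(intervals):
    """Average length of the intervals."""
    return sum(hi - lo for lo, hi in intervals) / len(intervals)


# Hypothetical intervals for a true reliability of 0.8 (illustration only).
cis = [(0.70, 0.90), (0.82, 0.95), (0.60, 0.78), (0.75, 0.88)]
cov, le, ue = coverage_summary(cis, true_r=0.8)
print(cov, le, ue, round(mean_width(cis), 4))
```

By construction the three rates sum to 100, which is a convenient sanity check on any tabulated results.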


Note that for the smallest sample size, that is, when $n_x = 5$ (or $n_y = 5$), a practical problem arises, as the sample values of $X$ (or $Y$) may all be 0; in this case, the variance estimate $V(\hat{R})$ cannot be computed and a standard approximate CI cannot be built. We decided to discard these samples from the 5,000 MC samples planned for the simulation study and to compute the quantities of interest only on the "feasible" samples. Indeed, the rate of such "nonfeasible" samples is in any case very low under each scenario. For the worst ones, characterized by $R = 0.1$ and $n_y = 5$, the theoretical rate of "nonfeasible" samples is about 5%.

Since results for the coverage rates of some CIs are often close to the nominal level 0.95, we performed a test in order to check if the actual coverage rate is significantly different from 0.95, that is, to state if such CIs are conservative or liberal. The null hypothesis is that the true rate $\eta$ of CIs that do not cover the real value $R$ is equal to $\alpha = 0.05$, $H_0\colon \eta = \alpha = 0.05$, whereas the alternative hypothesis is that the rate $\eta$ is different from $\alpha$, $H_1\colon \eta \neq \alpha$. We employ the test suggested in [28, pages 518-519], which is based on the statistic

$$\mathcal{F} = \frac{nR + 1}{nS - nR}\cdot\frac{1-\alpha}{\alpha}, \tag{10}$$

where $nR$ is the number of rejections of $H_0$ in $nS$ (5,000 in our case, unless there are some nonfeasible samples) iterations of a MC simulation plan. Under $H_0$, $\mathcal{F}$ follows an $F$ distribution with $\nu_1 = 2(nS - nR)$ and $\nu_2 = 2(nR + 1)$ degrees of freedom. The test rejects $H_0$ at level $\gamma$ if either $\mathcal{F} \le F_{\gamma/2;\nu_1,\nu_2}$ or $\mathcal{F} \ge F_{1-\gamma/2;\nu_1,\nu_2}$, where $F_{\gamma;\nu_1,\nu_2}$ denotes the $\gamma$ percentile point of an $F$ distribution with $\nu_1$ and $\nu_2$ degrees of freedom. The test was performed on each scenario for each kind of CI at a level $\gamma = 1\%$.
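As a rough sketch of how the test based on (10) could be run without an $F$-distribution routine: by the standard relation between the Beta, $F$, and binomial distributions, the $F$ CDF with $\nu_1$ and $\nu_2$ degrees of freedom evaluated at the statistic equals $P(\text{Bin}(nS, \alpha) \le nR)$, so the decision can be computed from a binomial CDF. This reformulation is a substitution made here for convenience, not the procedure as described in [28]; the names `binom_cdf` and `coverage_test` are hypothetical.

```python
import math


def binom_cdf(k, n, p):
    """P(Bin(n, p) <= k), summing pmf terms computed in log space."""
    total = 0.0
    for j in range(k + 1):
        log_pmf = (math.lgamma(n + 1) - math.lgamma(j + 1) - math.lgamma(n - j + 1)
                   + j * math.log(p) + (n - j) * math.log1p(-p))
        total += math.exp(log_pmf)
    return min(total, 1.0)


def coverage_test(n_miss, n_sim, alpha=0.05, gamma=0.01):
    """Two-sided test of H0: non-coverage rate equals alpha, at level gamma.

    n_miss is the number of CIs that do not cover R out of n_sim feasible samples.
    """
    stat = (n_miss + 1) / (n_sim - n_miss) * (1 - alpha) / alpha   # Eq. (10)
    # F CDF at `stat` with (2(n_sim - n_miss), 2(n_miss + 1)) df equals this binomial CDF.
    f_cdf = binom_cdf(n_miss, n_sim, alpha)
    reject = f_cdf <= gamma / 2 or f_cdf >= 1 - gamma / 2
    return stat, f_cdf, reject


# 250 misses out of 5000 is exactly the nominal 5%; 553 misses is about 11%.
print(coverage_test(250, 5000))
print(coverage_test(553, 5000))
```

A rejection in the upper tail (many misses) flags a liberal interval, and one in the lower tail (few misses) flags a conservative one.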

4.2. Results and Discussion. The simulation results are (partially) reported in Table 3 ($\lambda_x = 2$, varying $R$ and the common size $n_x = n_y$). The results about the CIs (8) and (9) are not reported here, as their performances are overall unsatisfactory. Even if theoretically appealing, the procedure leading to the CI in (8) practically fails: some of the scenarios of the simulation plan were considered, and the CI built following this alternative procedure never provides satisfactory results. For the smallest sample size ($n = 5$), the coverage proves to be larger than that provided by the AN CI but still smaller than the nominal one; for the larger sample sizes ($n = 10, 20, 50$), the coverage rate dramatically decreases (even to values as low as 60%). Paradoxically, the discrete nature of the Poisson variable, and thus the quality of the normal approximation of step (1), affects results to a more relevant extent as the sample size increases. As to the bootstrap procedure yielding the CI in (9), this solution is computationally cumbersome (it becomes even more time consuming if bias-corrected accelerated CIs [27, pages 184–188] have to be calculated), and it provides barely satisfactory results, as proven by a preliminary simulation study, not reported here, that confirms the findings reported in [25] for parametric bootstrap inference on the reliability parameter in the bivariate normal case. The rejection of $H_0$ based on the test statistic (10) described in the previous subsection is indicated by a "*" near the actual coverage rate of each CI. Figures 2 and 3 graphically display the values of lower and upper uncoverage rates for the five interval estimators, varying $R$ in 0.5, 0.6, 0.7, 0.8, 0.9, 0.95, for $n_x = n_y = 5$ and $n_x = n_y = 50$, respectively. These results comprise only the equal sample size scenarios; those for unequal sample size scenarios do not add much value to the general discussion we are going to outline.

(i) First, the improvement provided by the four transformations to the standard naïve CI of $R$ in terms of coverage is evident. This improvement is considerable for a small sample size and extreme values of $R$ (say 0.95), when the coverage rate of the standard naïve CI tends to decrease dramatically, even below 90%. Note that, for the sample sizes 5, 10, and 20, the actual coverage rate of the standard naïve CI is always significantly different from (smaller than) the nominal level. Under some scenarios, namely, for $R = 0.3$–$0.7$, the increase in coverage rate is accompanied also by a reduction in the CI average length. On the contrary, for the other values of $R$, there is an increase in the average width of the modified CIs, which is much more apparent for the logit transformation.

(ii) Examining closely Table 3, one can note that the CI based on the arcsine root transformation is, except for one scenario, uniformly worse in terms of actual coverage than the other three CIs based on logit, probit, and complementary log-log transformations. For small sample sizes ($n_x = 5, 10$), the coverage rate actually provided is significantly different from (smaller than) the nominal one. It can also be claimed that this unsatisfactory performance is due to its incapability of symmetrizing the distribution, as can be seen by glancing at the lower and upper uncoverage rates: for values of $R$ getting close to 1, there is an apparent undercoverage effect on the left side (just a bit smaller than that of the "original" standard approximate CI).

(iii) Logit, probit, and complementary log-log transformed CIs have overall a satisfactory performance in terms of closeness to the nominal confidence level. Logit and probit CIs exhibit a similar behavior, as both the lower and upper uncoverage rates they provide are close to the nominal value (2.5%). On the contrary, the complementary log-log function (as well as arcsine root) often produces a larger undercoverage on one side (here, left), which is partially balanced by an overcoverage on the other side (here, right). This is clear evidence of the higher symmetrizing capability of the logit and probit transformations. However, taking into account the results of the hypothesis test based on the statistic (10), the probit and complementary log-log functions are those that overall perform best: the statistical hypothesis that their actual coverage rate is equal to the nominal one is always accepted, except for one case ($R = 0.1$ and $n = 5$), whereas the same hypothesis is sometimes rejected (10 times out of 40) for the logit function (which tends to produce significantly larger coverage for small sample sizes).


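Table 3: Simulation results ($\lambda_x = 2$).

| $R$ | $n_x, n_y$ | AN cov | AN LE | AN UE | Logit cov | Logit LE | Logit UE | Probit cov | Probit LE | Probit UE | Arcsin cov | Arcsin LE | Arcsin UE | Loglog cov | Loglog LE | Loglog UE | Width AN | Width Logit | Width Probit | Width Arcsin | Width Loglog |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.1 | 5, 5 | 89.94* | 0.34 | 9.72 | 97.17* | 2.66 | 0.17 | 96.73* | 2.07 | 1.20 | 94.35 | 1.12 | 4.53 | 96.20* | 3.63 | 0.17 | 0.2836 | 0.3835 | 0.3491 | 0.3152 | 0.4204 |
| 0.1 | 10, 10 | 88.49* | 0.40 | 11.11 | 96.23* | 2.51 | 1.26 | 95.57 | 1.85 | 2.59 | 92.52* | 1.06 | 6.42 | 95.65 | 3.23 | 1.12 | 0.2153 | 0.2501 | 0.2354 | 0.2214 | 0.2622 |
| 0.1 | 20, 20 | 92.26* | 0.36 | 7.38 | 95.20 | 3.06 | 1.74 | 94.94 | 2.32 | 2.74 | 93.62* | 1.44 | 4.94 | 95.12 | 3.26 | 1.62 | 0.1620 | 0.1709 | 0.1653 | 0.1600 | 0.1750 |
| 0.1 | 50, 50 | 93.34* | 0.94 | 5.72 | 95.44 | 2.32 | 2.24 | 95.28 | 1.98 | 2.74 | 94.50 | 1.56 | 3.94 | 95.48 | 2.54 | 1.98 | 0.1020 | 0.1044 | 0.1029 | 0.1015 | 0.1054 |
| 0.2 | 5, 5 | 87.51* | 1.66 | 10.83 | 96.01* | 3.11 | 0.88 | 95.11 | 2.53 | 2.37 | 91.66* | 1.66 | 6.68 | 95.19 | 4.29 | 0.52 | 0.4360 | 0.4717 | 0.4553 | 0.4425 | 0.5123 |
| 0.2 | 10, 10 | 90.68* | 1.12 | 8.20 | 95.30 | 2.68 | 2.02 | 94.72 | 2.22 | 3.06 | 93.14* | 1.74 | 5.12 | 94.86 | 3.64 | 1.50 | 0.3395 | 0.3429 | 0.3358 | 0.3312 | 0.3597 |
| 0.2 | 20, 20 | 93.20* | 1.36 | 5.44 | 95.22 | 2.66 | 2.12 | 95.22 | 2.18 | 2.60 | 94.32 | 1.80 | 3.88 | 94.90 | 3.22 | 1.88 | 0.2469 | 0.2473 | 0.2444 | 0.2428 | 0.2537 |
| 0.2 | 50, 50 | 93.96* | 1.38 | 4.66 | 95.42 | 2.48 | 2.10 | 95.16 | 2.20 | 2.64 | 94.78 | 1.96 | 3.26 | 95.20 | 2.76 | 2.04 | 0.1569 | 0.1572 | 0.1563 | 0.1559 | 0.1588 |
| 0.3 | 5, 5 | 89.11* | 2.50 | 8.39 | 96.42* | 2.24 | 1.34 | 94.72 | 2.24 | 3.04 | 92.22* | 2.24 | 5.54 | 94.72 | 4.48 | 0.80 | 0.5440 | 0.5284 | 0.5245 | 0.5264 | 0.5664 |
| 0.3 | 10, 10 | 91.10* | 1.92 | 6.98 | 95.40 | 2.48 | 2.12 | 94.90 | 2.06 | 3.04 | 93.70* | 1.92 | 4.38 | 94.28 | 4.12 | 1.60 | 0.4149 | 0.3991 | 0.3981 | 0.4002 | 0.4166 |
| 0.3 | 20, 20 | 93.18* | 1.82 | 5.00 | 95.34 | 2.56 | 2.10 | 95.04 | 2.36 | 2.60 | 93.98* | 2.32 | 3.70 | 94.96 | 3.48 | 1.56 | 0.3012 | 0.2947 | 0.2945 | 0.2956 | 0.3018 |
| 0.3 | 50, 50 | 94.38 | 1.82 | 3.80 | 95.46 | 2.42 | 2.12 | 95.32 | 2.20 | 2.48 | 95.10 | 1.96 | 2.94 | 95.40 | 2.80 | 1.80 | 0.1927 | 0.1909 | 0.1909 | 0.1912 | 0.1929 |
| 0.4 | 5, 5 | 88.98* | 3.86 | 7.16 | 96.52* | 2.10 | 1.38 | 94.48 | 2.32 | 3.20 | 92.40* | 2.90 | 4.70 | 95.04 | 4.14 | 0.82 | 0.6068 | 0.5592 | 0.5627 | 0.5733 | 0.5891 |
| 0.4 | 10, 10 | 92.14* | 2.70 | 5.16 | 95.86* | 2.08 | 2.06 | 94.78 | 2.28 | 2.94 | 93.72* | 2.46 | 3.82 | 94.98 | 3.82 | 1.62 | 0.4572 | 0.4305 | 0.4335 | 0.4397 | 0.4455 |
| 0.4 | 20, 20 | 93.64* | 2.60 | 3.76 | 95.58 | 2.44 | 1.98 | 95.22 | 2.44 | 2.34 | 94.70 | 2.48 | 2.82 | 95.14 | 3.32 | 1.54 | 0.3319 | 0.3210 | 0.3226 | 0.3254 | 0.3274 |
| 0.4 | 50, 50 | 94.72 | 2.18 | 3.10 | 95.48 | 2.20 | 2.32 | 95.28 | 2.20 | 2.52 | 95.06 | 2.20 | 2.74 | 95.64 | 2.52 | 1.84 | 0.2128 | 0.2098 | 0.2103 | 0.2111 | 0.2116 |
| 0.5 | 5, 5 | 89.32* | 5.04 | 5.64 | 96.72* | 1.56 | 1.72 | 95.04 | 2.38 | 2.58 | 92.78* | 3.48 | 3.74 | 94.90 | 4.34 | 0.76 | 0.6285 | 0.5691 | 0.5752 | 0.5889 | 0.5864 |
| 0.5 | 10, 10 | 91.98* | 3.82 | 4.20 | 95.54 | 2.04 | 2.42 | 94.94 | 2.36 | 2.70 | 93.84* | 2.78 | 3.38 | 94.88 | 3.62 | 1.50 | 0.4709 | 0.4406 | 0.4450 | 0.4526 | 0.4501 |
| 0.5 | 20, 20 | 93.50* | 3.32 | 3.18 | 95.36 | 2.30 | 2.34 | 94.80 | 2.54 | 2.66 | 94.64 | 2.62 | 2.74 | 94.76 | 3.50 | 1.74 | 0.3413 | 0.3290 | 0.3312 | 0.3345 | 0.3332 |
| 0.5 | 50, 50 | 94.62 | 2.46 | 2.92 | 95.38 | 2.18 | 2.44 | 95.16 | 2.32 | 2.52 | 94.96 | 2.34 | 2.70 | 95.42 | 2.72 | 1.86 | 0.2191 | 0.2157 | 0.2163 | 0.2173 | 0.2169 |
| 0.6 | 5, 5 | 89.44* | 6.36 | 4.20 | 96.52* | 1.44 | 2.04 | 94.70 | 2.50 | 2.80 | 92.66* | 4.04 | 3.30 | 94.78 | 4.28 | 0.94 | 0.6095 | 0.5587 | 0.5628 | 0.5741 | 0.5601 |
| 0.6 | 10, 10 | 91.78* | 4.84 | 3.38 | 95.52 | 1.78 | 2.70 | 94.88 | 2.32 | 2.80 | 93.92* | 3.06 | 3.02 | 94.92 | 3.48 | 1.60 | 0.4576 | 0.4305 | 0.4337 | 0.4401 | 0.4324 |
| 0.6 | 20, 20 | 93.40* | 4.02 | 2.58 | 95.44 | 2.02 | 2.54 | 95.12 | 2.34 | 2.54 | 94.46 | 2.98 | 2.56 | 95.00 | 3.34 | 1.66 | 0.3304 | 0.3197 | 0.3212 | 0.3240 | 0.3206 |
| 0.6 | 50, 50 | 95.06 | 2.68 | 2.26 | 95.54 | 1.98 | 2.48 | 95.38 | 2.14 | 2.48 | 95.24 | 2.28 | 2.48 | 95.58 | 2.58 | 1.84 | 0.2120 | 0.2090 | 0.2095 | 0.2103 | 0.2093 |
| 0.7 | 5, 5 | 89.52* | 7.54 | 2.94 | 96.06* | 1.48 | 2.46 | 95.04 | 2.50 | 2.46 | 92.56* | 4.76 | 2.68 | 94.54 | 4.54 | 0.92 | 0.5490 | 0.5266 | 0.5242 | 0.5279 | 0.5094 |
| 0.7 | 10, 10 | 91.70* | 5.82 | 2.48 | 95.32 | 1.64 | 3.04 | 95.02 | 2.22 | 2.76 | 93.78* | 3.66 | 2.56 | 94.92 | 3.44 | 1.64 | 0.4162 | 0.3991 | 0.3987 | 0.4014 | 0.3919 |
| 0.7 | 20, 20 | 93.28* | 4.68 | 2.04 | 95.62 | 1.90 | 2.48 | 95.18 | 2.34 | 2.48 | 94.44 | 3.26 | 2.30 | 95.30 | 3.26 | 1.44 | 0.2984 | 0.2920 | 0.2918 | 0.2930 | 0.2889 |
| 0.7 | 50, 50 | 94.74 | 3.24 | 2.02 | 95.48 | 1.88 | 2.64 | 95.40 | 2.16 | 2.44 | 95.24 | 2.52 | 2.24 | 95.38 | 2.64 | 1.98 | 0.1910 | 0.1893 | 0.1893 | 0.1896 | 0.1885 |
| 0.8 | 5, 5 | 88.94* | 9.60 | 1.46 | 96.00* | 1.24 | 2.76 | 94.92 | 2.32 | 2.76 | 92.54* | 5.78 | 1.68 | 95.00 | 4.18 | 0.82 | 0.4419 | 0.4657 | 0.4526 | 0.4436 | 0.4289 |
| 0.8 | 10, 10 | 91.66* | 7.02 | 1.32 | 95.18 | 1.56 | 3.26 | 94.96 | 2.34 | 2.70 | 93.70* | 4.18 | 2.12 | 95.22 | 3.36 | 1.42 | 0.3411 | 0.3408 | 0.3349 | 0.3315 | 0.3242 |
| 0.8 | 20, 20 | 93.22* | 5.64 | 1.14 | 95.22 | 1.88 | 2.90 | 94.94 | 2.52 | 2.54 | 94.32 | 3.64 | 2.04 | 95.00 | 3.34 | 1.66 | 0.2424 | 0.2425 | 0.2399 | 0.2386 | 0.2356 |
| 0.8 | 50, 50 | 94.60 | 3.94 | 1.46 | 95.34 | 1.94 | 2.72 | 95.20 | 2.32 | 2.48 | 95.04 | 2.86 | 2.10 | 95.40 | 2.70 | 1.90 | 0.1543 | 0.1544 | 0.1537 | 0.1533 | 0.1525 |
| 0.9 | 5, 5 | 88.32* | 11.14 | 0.54 | 95.28 | 1.14 | 3.58 | 95.04 | 2.58 | 2.38 | 91.96* | 6.76 | 1.28 | 94.70 | 4.38 | 0.92 | 0.2799 | 0.3557 | 0.3295 | 0.3043 | 0.3029 |
| 0.9 | 10, 10 | 90.80* | 8.50 | 0.70 | 95.12 | 1.44 | 3.44 | 94.76 | 2.52 | 2.72 | 93.40* | 4.98 | 1.62 | 95.08 | 3.30 | 1.62 | 0.2175 | 0.2410 | 0.2294 | 0.2187 | 0.2181 |
| 0.9 | 20, 20 | 92.48* | 6.92 | 0.60 | 95.24 | 1.74 | 3.02 | 95.12 | 2.36 | 2.52 | 94.22* | 4.28 | 1.50 | 95.20 | 3.20 | 1.60 | 0.1550 | 0.1621 | 0.1574 | 0.1532 | 0.1531 |
| 0.9 | 50, 50 | 94.52 | 4.62 | 0.86 | 95.10 | 1.98 | 2.92 | 95.30 | 2.30 | 2.40 | 95.06 | 3.22 | 1.72 | 95.40 | 2.62 | 1.98 | 0.0976 | 0.0996 | 0.0983 | 0.0971 | 0.0971 |
| 0.95 | 5, 5 | 86.86* | 12.96 | 0.18 | 95.16 | 1.12 | 3.72 | 95.10 | 2.62 | 2.28 | 91.52* | 7.72 | 0.76 | 95.02 | 4.12 | 0.86 | 0.1691 | 0.2593 | 0.2289 | 0.1973 | 0.2058 |
| 0.95 | 10, 10 | 89.90* | 9.84 | 0.26 | 94.80 | 1.48 | 3.72 | 94.98 | 2.50 | 2.52 | 92.62* | 6.28 | 1.10 | 94.84 | 3.46 | 1.70 | 0.1277 | 0.1611 | 0.1489 | 0.1359 | 0.1398 |
| 0.95 | 20, 20 | 92.30* | 7.50 | 0.20 | 95.22 | 1.74 | 3.04 | 95.12 | 2.48 | 2.40 | 94.02* | 4.76 | 1.22 | 94.92 | 3.32 | 1.76 | 0.0931 | 0.1021 | 0.0975 | 0.0925 | 0.0941 |
| 0.95 | 50, 50 | 94.20* | 5.32 | 0.48 | 95.30 | 1.88 | 2.82 | 95.24 | 2.40 | 2.36 | 95.08 | 3.46 | 1.46 | 95.32 | 2.78 | 1.90 | 0.0579 | 0.0602 | 0.0590 | 0.0577 | 0.0582 |

Legend: "cov" = actual coverage rate; "LE" = lower error rate; "UE" = upper error rate (all rates in percent); * = the rejection of $H_0$.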


Figure 2: Lower and upper uncoverage rates; $\lambda_x = 2$, $R \ge 0.5$, $n_x = n_y = 5$. (Panels for $R = 0.5, 0.6, 0.7, 0.8, 0.9, 0.95$, each comparing the AN, Logit, Probit, Arcsin, and Log-log CIs.)

(iv) Differences in behavior among the four "transformed" CIs tend to become more apparent for small sample sizes and for extreme values of $R$ (close to 0 or 1), as can be seen looking at the bottom right graph of Figure 2. Clearly, the same differences tend to vanish as the sample sizes increase and $R$ tends to 0.5 (see the top left graph in Figure 3).

(v) Contrary to what one would predict, the performances in terms of coverage rate of logit, probit, and complementary log-log-based CIs do not seem to be negatively affected by sample size; in fact, as underlined before, a (sometimes significant) increase of the coverage rate of the logit CIs is noticed when the sample size is small ($n_x = 5$), whereas the other two CIs show an almost constant trend. The CI exploiting the arcsine root transformation is the one taking most advantage from the increase in sample size. Obviously, there is a decrease in the average length of the CIs moving to larger sample sizes, as the widths in Table 3 show.

(vi) As one could expect, there is a symmetry in the behavior of each CI when considering two complementary values of $R$ (i.e., values of $R$ summing to 1): keeping the sample size fixed, the values of the coverage rate and average width are similar. As to the uncoverage rates, they are nearly exchangeable for AN, logit, probit, and arcsine CIs; that is, a lower uncoverage error for a fixed value of $R$ is similar to the upper uncoverage error for its complementary value (or at least their relative magnitude is exchangeable). This feature does not hold for log-log CIs, which always present a left-side uncoverage error larger than that of the right side. These features are weakened for $R$ equal to 0.9 (and $1 - R$ then equal to 0.1), probably because of the presence of "nonfeasible" samples, as discussed in Section 4.1, which distorts this symmetry condition.

(vii) Finally (the relative results are not reported here for the sake of brevity), moving to larger values (i.e., larger than 2) of $\lambda_x$, keeping $R$ constant, seems to bring benefit to the performance of all the CIs, as can be quantitatively noted by the reduced number of rejections of the null hypothesis of equivalence between actual and theoretical coverage. This may be explained by the fact that for a large parameter $\lambda$ the Poisson distribution tends to a normal distribution; therefore, larger values of $\lambda_x$ imply a better normal approximation to Poisson and presumptively to the MLE $\hat{R}$.

In Figure 4, for illustrative purposes, the MC distribution of $\hat{R}$ and the transformed data according to the four transformations are displayed ($R = 0.8$ and $n_x = n_y = 5$). We can note at a glance that logit and probit produce distributions closer to normality than arcsine root and complementary log-log functions, which do not seem completely able to "correct" the skewness of the original distribution. However, implementing the Shapiro-Wilk test of normality on the four transformed distributions leads to very low $P$-values, practically equal to 0, except for the probit function, whose $P$-value is 0.03132, thus proving that in this case all the transformed data distributions are still far from normality.


Figure 3: Lower and upper uncoverage rates; $\lambda_x = 2$, $R \ge 0.5$, $n_x = n_y = 50$. (Panels for $R = 0.5, 0.6, 0.7, 0.8, 0.9, 0.95$, each comparing the AN, Logit, Probit, Arcsin, and Log-log CIs.)

Figure 4: Histogram of the MC empirical distribution of $\hat{R}$ and $\hat{\theta}$ according to the transformations of Table 2 ($\lambda_x = 2$, $R = 0.8$, $n_x = n_y = 5$). Superimposed, the density function of the normal distribution with mean and standard deviation equal to the mean and standard deviation of the data. Note that neither the $y$-axis nor the $x$-axis shares the same scale in the different plots. (Panels: $R$, Logit($R$), Probit($R$), Arcsin($R$), log-log($R$).)

5. An Application

The application we illustrate here is based on the data described in [29] and already used in [7], to which we redirect for full details. On these data, we build the four 95% CIs based on the transformations of Table 2, along with the standard one. The results (lower and upper bounds and length of the CIs) are reported in Table 4. Although the five CIs are not much different from each other, we can note that all four transformation-based intervals have both lower and upper bounds smaller than the corresponding bound for the standard naïve interval (i.e., they are left shifted with respect to it); the logit transformation yields an interval a bit wider than the standard one, whereas the other three transformations produce a slight decrease in its length.

Table 4: Results for the application data.

| Type | $R_L$ | $R_U$ | Width |
|---|---|---|---|
| AN | 0.7452 | 0.9232 | 0.1780 |
| Logit | 0.7256 | 0.9054 | 0.1799 |
| Probit | 0.7302 | 0.9080 | 0.1777 |
| Arcsine | 0.7365 | 0.9128 | 0.1763 |
| Log-log | 0.7363 | 0.9113 | 0.1750 |
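As an illustrative consistency check (not part of the original analysis), the transformed intervals of Table 4 can be approximately reproduced from the AN row alone. The sketch assumes the AN interval is symmetric, $\hat{R} \mp z_{0.975}\sqrt{V(\hat{R})}$, so that its midpoint and half-width give back $\hat{R}$ and the standard error; the dictionary `TRANSFORMS` and the function `transformed_ci` are hypothetical names, and agreement with Table 4 holds only up to rounding of the published bounds.

```python
import math
from statistics import NormalDist

N = NormalDist()

# name: (f, f', f^{-1}), following the transformations listed in Table 1
TRANSFORMS = {
    "logit": (lambda r: math.log(r / (1 - r)),
              lambda r: 1 / (r * (1 - r)),
              lambda t: math.exp(t) / (1 + math.exp(t))),
    "probit": (N.inv_cdf,
               lambda r: 1 / N.pdf(N.inv_cdf(r)),
               N.cdf),
    "arcsine": (lambda r: math.asin(math.sqrt(r)),
                lambda r: 1 / (2 * math.sqrt(r * (1 - r))),
                lambda t: math.sin(t) ** 2),
    "loglog": (lambda r: math.log(-math.log(1 - r)),
               lambda r: 1 / ((1 - r) * abs(math.log(1 - r))),
               lambda t: 1 - math.exp(-math.exp(t))),
}


def transformed_ci(r_hat, se, name, alpha=0.05):
    """Back-transformed CI for R using the named transformation."""
    f, f_prime, f_inv = TRANSFORMS[name]
    z = N.inv_cdf(1 - alpha / 2)
    half = z * f_prime(r_hat) * se      # delta-method half-width on the theta scale
    theta = f(r_hat)
    return f_inv(theta - half), f_inv(theta + half)


# Back out R-hat and its standard error from the AN row of Table 4.
an_low, an_up = 0.7452, 0.9232
r_hat = (an_low + an_up) / 2
se = (an_up - an_low) / (2 * N.inv_cdf(0.975))

for name in TRANSFORMS:
    low, up = transformed_ci(r_hat, se, name)
    print(f"{name:8s} ({low:.4f}, {up:.4f})  width {up - low:.4f}")
```

The arcsine back-transformation $\sin^2$ is only valid while both bounds on the $\theta$ scale stay inside $(0, \pi/2)$, which is the case for these data.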

6. Conclusions

This work provided an empirical analysis of the convenience of data transformations for estimating the reliability parameter in stress-strength models. We focused on the model involving two independent Poisson random variables, since this is a case where the distribution of the MLE is not easily derivable and exact CIs cannot be built; thus, transformation of the estimates is a viable tool to refine the standard naïve interval estimator derived by the central limit theorem. Four transformations were considered (logit, probit, arcsine root, and complementary log-log) and empirically assessed under a number of scenarios in terms of the coverage and average length of the CI they produced. Results are in favor of logit, probit, and complementary log-log functions, although to varying degrees, which ensure a coverage rate close to the nominal coverage even with small sample sizes. On the contrary, the arcsine root function, although improving the performance of the standard CI, often keeps its coverage rates under the nominal confidence level. These findings were to some extent predictable, since logit and probit transformations are very popular link functions in generalized regression models, but are a bit surprising with regard to the complementary log-log function, whose use is limited.

Future research will ascertain if such results hold true for other distributions for the stress-strength model and will eventually inspect alternative data transformations.

Conflict of Interests

The author declares that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

The author would like to thank the editor and two anonymous referees, whose suggestions and comments improved the quality of the original version of the paper.

References

[1] J. L. Fleiss, B. Levin, and M. C. Paik, Statistical Methods for Rates and Proportions, John Wiley & Sons, Hoboken, NJ, USA, 3rd edition, 2003.

[2] A. Agresti and B. A. Coull, "Approximate is better than 'exact' for interval estimation of binomial proportions," The American Statistician, vol. 52, no. 2, pp. 119–126, 1998.

[3] L. D. Brown, T. T. Cai, and A. Dasgupta, "Confidence intervals for a binomial proportion and asymptotic expansions," The Annals of Statistics, vol. 30, no. 1, pp. 160–201, 2002.

[4] A. M. Pires and C. Amado, "Interval estimators for a binomial proportion: comparison of twenty methods," REVSTAT Statistical Journal, vol. 6, no. 2, pp. 165–197, 2008.

[5] B. V. Gnedenko and I. A. Ushakov, Probabilistic Reliability Engineering, John Wiley & Sons, New York, NY, USA, 1995.

[6] S. Kotz, Y. Lumelskii, and M. Pensky, The Stress-strength Model and Its Generalizations: Theory and Applications, World Scientific, New York, NY, USA, 2003.

[7] A. Barbiero, "Inference for reliability of Poisson data," Journal of Quality and Reliability Engineering, vol. 2013, Article ID 530530, 8 pages, 2013.

[8] G. Blom, "Transformations of the binomial, negative binomial, Poisson and $\chi^2$ distributions," Biometrika, vol. 41, pp. 302–316, 1954.

[9] D. C. Montgomery, Design and Analysis of Experiment, John Wiley & Sons, New York, NY, USA, 6th edition, 2005.

[10] Y. P. Chaubey and G. S. Mudholkar, "On the symmetrizing transformations of random variables," Concordia University, unpublished preprint, Mathematics and Statistics, 1983.

[11] M. H. Hoyle, "Transformations: an introduction and a bibliography," International Statistical Review, vol. 41, no. 2, pp. 203–223, 1973.

[12] A. Foi, "Optimization of variance-stabilizing transformations," submitted.

[13] W. Greene, Econometric Analysis, Prentice Hall, New York, NY, USA, 7th edition, 2012.

[14] A. Dasgupta, Asymptotic Theory of Statistics and Probability, Springer, New York, NY, USA, 2008.

[15] L. Brown, T. Cai, R. Zhang, L. Zhao, and H. Zhou, "The root-unroot algorithm for density estimation as implemented via wavelet block thresholding," Probability Theory and Related Fields, vol. 146, no. 3-4, pp. 401–433, 2010.

[16] J. H. Aldrich and F. D. Nelson, Linear Probability, Logit, and Probit Models, Sage, Beverly Hills, Calif, USA, 1984.

[17] A. Baklizi, "Inference on Pr($X < Y$) in the two-parameter Weibull model based on records," ISRN Probability and Statistics, vol. 2012, Article ID 263612, 11 pages, 2012.

[18] K. Krishnamoorthy and Y. Lin, "Confidence limits for stress-strength reliability involving Weibull models," Journal of Statistical Planning and Inference, vol. 140, no. 7, pp. 1754–1764, 2010.

[19] S. P. Mukherjee and S. S. Maiti, "Stress-strength reliability in the Weibull case," in Frontiers in Reliability, vol. 4, pp. 231–248, World Scientific, Singapore, 1998.

[20] D. K. Al-Mutairi, M. E. Ghitany, and D. Kundu, "Inferences on stress-strength reliability from Lindley distributions," Communications in Statistics: Theory and Methods, vol. 42, no. 8, pp. 1443–1463, 2013.

[21] M. M. Shoukri, M. A. Chaudhary, and A. Al-Halees, "Estimating P($Y < X$) when $X$ and $Y$ are paired exponential variables," Journal of Statistical Computation and Simulation, vol. 75, no. 1, pp. 25–38, 2005.

[22] A. Baklizi and O. Eidous, "Nonparametric estimation of ($X < Y$) using kernel methods," Metron, vol. 64, no. 1, pp. 47–60, 2006.

[23] G. Qin and L. Hotilovac, "Comparison of non-parametric confidence intervals for the area under the ROC curve of a continuous-scale diagnostic test," Statistical Methods in Medical Research, vol. 17, no. 2, pp. 207–221, 2008.

[24] W. H. Ahrens, "Use of the arcsine and square root transformations for subjectively determined percentage data," Weed Science, vol. 38, no. 4-5, pp. 452–458, 1990.

[25] A. Barbiero, "Interval estimators for reliability: the bivariate normal case," Journal of Applied Statistics, vol. 39, no. 3, pp. 501–512, 2012.

[26] A. Barbiero, "Confidence intervals for reliability of stress-strength models in the normal case," Communications in Statistics - Simulation and Computation, vol. 40, no. 6, pp. 907–925, 2011.

[27] B. Efron and R. Tibshirani, An Introduction to the Bootstrap, Chapman and Hall, London, UK, 1993.

[28] I. R. Cardoso de Oliveira and D. F. Ferreira, "Multivariate extension of chi-squared univariate normality test," Journal of Statistical Computation and Simulation, vol. 80, no. 5-6, pp. 513–526, 2010.

[29] P. C. Wang, "Comparisons on the analysis of Poisson data," Quality and Reliability Engineering International, vol. 15, pp. 379–383, 1999.


Table 1: Transformations.

| $f(x)$ | Codomain | $f'(x)$ | $f^{-1}(x)$ |
|---|---|---|---|
| $\log[x/(1-x)]$ | $(-\infty, +\infty)$ | $1/[x(1-x)]$ | $e^{x}/(1+e^{x})$ |
| $\Phi^{-1}(x)$ | $(-\infty, +\infty)$ | $1/\phi(\Phi^{-1}(x))$ | $\Phi(x)$ |
| $\arcsin(\sqrt{x})$ | $(0, +\pi/2)$ | $1/[2\sqrt{x(1-x)}]$ | $\sin^2(x)$ |
| $\log[-\log(1-x)]$ | $(-\infty, +\infty)$ | $1/[(1-x)\lvert\log(1-x)\rvert]$ | $1 - e^{-e^{x}}$ |

Figure 1: Plot of transformation functions (logit, probit, arcsin, and log-log, for $x$ between 0 and 1) and dotted horizontal and vertical lines of equations $y = 0$ and $x = 0.5$.

the stress-strength model, the logit transformation has been by far the most widely used transformation for improving the statistical performance of standard large-sample CIs. Among others, it has been considered by [17–19] (all of these contributions concern the Weibull distribution), by [20] for the Lindley distribution, by [21] when stress and strength follow a bivariate exponential distribution, and by [22, 23] in a nonparametric context.

Through the years, the arcsine root transformation has gained great favor and application among practitioners [24], perhaps more than its real merit warrants. As noted previously, it should be chosen for stabilizing binomial data, but it is often used to stabilize sample proportions as well (i.e., relative binomial data). However, it presents a drawback, highlighted in [25], related to the fact that its codomain is the limited interval $(0, \pi/2)$; thus its normalizing effect may turn out to be meagre, as already pointed out in some studies where it has been used for estimating the reliability of stress-strength models [19, 25, 26].

The complementary log-log function, which is sometimes used in binomial regression models, is slightly different from logit and probit, since it assumes negative values for $x < 1 - e^{-1}$ (and positive values for $x > 1 - e^{-1}$) and takes large positive values only for values of $x$ very close to 1; for example, it takes values larger than 1 if and only if $x > 1 - e^{-e} \approx 0.934$.
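As an illustration of the four transformations of Table 1 and their inverses, the following Python sketch may help. The dictionary name `TRANSFORMS` and its layout are choices made here for illustration only; the standard normal functions $\Phi$ and $\Phi^{-1}$ are taken from the standard library class `statistics.NormalDist`.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)

# Each entry maps a (hypothetical) label to the pair (f, f_inverse) of Table 1.
TRANSFORMS = {
    "logit": (lambda x: math.log(x / (1.0 - x)),
              lambda t: math.exp(t) / (1.0 + math.exp(t))),
    "probit": (STD_NORMAL.inv_cdf, STD_NORMAL.cdf),
    "arcsin": (lambda x: math.asin(math.sqrt(x)),
               lambda t: math.sin(t) ** 2),
    "loglog": (lambda x: math.log(-math.log(1.0 - x)),
               lambda t: 1.0 - math.exp(-math.exp(t))),
}

if __name__ == "__main__":
    # Evaluate each transformation at a few points of the unit interval.
    for name, (f, f_inv) in TRANSFORMS.items():
        print(name, [round(f(x), 4) for x in (0.1, 0.5, 0.9)])
    # Point above which the complementary log-log function exceeds 1.
    print(1.0 - math.exp(-math.e))
```

Logit and probit vanish at $x = 0.5$, whereas the complementary log-log function vanishes at $x = 1 - e^{-1}$, which is the asymmetry described above.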

In the next section, we will apply these transformations to the estimation of a stress-strength model with both stress and strength following a Poisson distribution.

3. Inference on the Reliability Parameter for a Stress-Strength Model with Independent Poisson Stress and Strength

Let $X$ and $Y$ be independent rv's modeling stress and strength, respectively, with $X \sim \text{Poisson}(\lambda_x)$ and $Y \sim \text{Poisson}(\lambda_y)$. Then the reliability $R = P(X < Y)$ of the stress-strength model is given by (see [6, page 103])

$$
R = P(X < Y) = \sum_{i=0}^{+\infty} \frac{\lambda_x^{i} e^{-\lambda_x}}{i!} \left[ 1 - \sum_{j=0}^{i} \frac{\lambda_y^{j} e^{-\lambda_y}}{j!} \right]
= \lim_{k \to \infty} \sum_{i=0}^{k} \frac{\lambda_x^{i} e^{-\lambda_x}}{i!} \left[ 1 - \sum_{j=0}^{i} \frac{\lambda_y^{j} e^{-\lambda_y}}{j!} \right]. \tag{3}
$$
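The limit in (3) suggests a direct numerical evaluation by truncating the outer sum. The following Python sketch does this; the function name `reliability` and the fixed truncation point `k_max` are assumptions made here for illustration, adequate only for moderate values of the Poisson rates.

```python
import math

def reliability(lam_x, lam_y, k_max=400):
    """Approximate R = P(X < Y) in (3) by truncating the outer sum at k_max.

    X ~ Poisson(lam_x) is the stress and Y ~ Poisson(lam_y) the strength.
    """
    px = math.exp(-lam_x)  # P(X = 0)
    py = math.exp(-lam_y)  # P(Y = 0)
    cdf_y = 0.0
    r = 0.0
    for i in range(k_max + 1):
        cdf_y += py                        # P(Y <= i)
        r += px * max(0.0, 1.0 - cdf_y)    # P(X = i) * P(Y > i)
        px *= lam_x / (i + 1)              # P(X = i + 1)
        py *= lam_y / (i + 1)              # P(Y = i + 1)
    return r

if __name__ == "__main__":
    print(reliability(2.0, 2.0))  # equal rates: R = (1 - P(X = Y)) / 2
    print(reliability(2.0, 5.0))  # a larger strength rate gives a larger R
```

Note that with equal rates $R$ is smaller than 0.5, because ties $X = Y$ have positive probability and do not count as successes.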

If two simple random samples of size $n_x$ and $n_y$ from $X$ and $Y$, respectively, are available, the reliability parameter can be estimated by the ML estimator $\hat{R}$, obtained by substituting in (3) the MLEs of the unknown parameters $\lambda_x$ and $\lambda_y$, that is, the sample means $\bar{x}$ and $\bar{y}$:

$$
\hat{R} = \sum_{i=0}^{+\infty} \frac{\bar{x}^{i} e^{-\bar{x}}}{i!} \left[ 1 - \sum_{j=0}^{i} \frac{\bar{y}^{j} e^{-\bar{y}}}{j!} \right]. \tag{4}
$$

Deriving the expression of the variance of the MLE $\hat{R}$ is not straightforward; however, an approximate expression has been derived through the delta method in [7]. Based upon this approximation, a large-sample $1 - \alpha$ CI for $R$ has been built. Such an interval estimator has the usual expression

$$
(R_L, R_U) = \left( \hat{R} - z_{1-\alpha/2} \sqrt{\hat{V}(\hat{R})},\ \hat{R} + z_{1-\alpha/2} \sqrt{\hat{V}(\hat{R})} \right), \tag{5}
$$

where $z_{1-\alpha/2}$ is the $1 - \alpha/2$ quantile of the standard normal distribution and $\hat{V}(\hat{R})$ is the sample estimate of the (asymptotic) variance of $\hat{R}$ (see [7] for details).
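The variance expression of [7] is not reproduced in this section. Purely as an illustration of how an interval of the form (5) could be computed, the Python sketch below assumes the generic first-order delta-method form $\hat{V}(\hat{R}) \approx (\partial R/\partial\lambda_x)^2\,\bar{x}/n_x + (\partial R/\partial\lambda_y)^2\,\bar{y}/n_y$, with the partial derivatives approximated by central differences; this may differ from the exact expression used in [7]. The names `reliability`, `standard_ci`, `k_max`, and `h` are hypothetical.

```python
import math
from statistics import NormalDist

def reliability(lam_x, lam_y, k_max=400):
    """Truncated evaluation of R = P(X < Y) for independent Poisson rv's."""
    px, py = math.exp(-lam_x), math.exp(-lam_y)
    cdf_y, r = 0.0, 0.0
    for i in range(k_max + 1):
        cdf_y += py
        r += px * max(0.0, 1.0 - cdf_y)
        px *= lam_x / (i + 1)
        py *= lam_y / (i + 1)
    return r

def standard_ci(x, y, alpha=0.05, h=1e-5):
    """Large-sample CI of the form (5); the variance form is an assumption."""
    xbar, ybar = sum(x) / len(x), sum(y) / len(y)
    if xbar <= 0 or ybar <= 0:
        # All-zero sample: treated as "nonfeasible" in this sketch.
        raise ValueError("variance estimate not available for an all-zero sample")
    r_hat = reliability(xbar, ybar)
    # Central-difference approximations of the partial derivatives of R.
    d_x = (reliability(xbar + h, ybar) - reliability(xbar - h, ybar)) / (2 * h)
    d_y = (reliability(xbar, ybar + h) - reliability(xbar, ybar - h)) / (2 * h)
    # Var(sample mean) of a Poisson sample is lambda / n, estimated by mean / n.
    v_hat = d_x ** 2 * xbar / len(x) + d_y ** 2 * ybar / len(y)
    half = NormalDist().inv_cdf(1 - alpha / 2) * math.sqrt(v_hat)
    return r_hat - half, r_hat + half, r_hat, v_hat

if __name__ == "__main__":
    stress = [2, 1, 3, 2, 2]      # made-up data, for illustration only
    strength = [4, 5, 3, 6, 4]
    print(standard_ci(stress, strength))
```

Nothing in this construction keeps the bounds inside the unit interval, which is one of the weaknesses of the untransformed interval when $\hat{R}$ is close to 0 or 1.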

Although such an estimator has been proved to have a satisfactory behavior in terms of coverage for several combinations of sample sizes and values of the reliability parameter, some decay of the performance is observed when $R$ gets close to the extreme values 0 and 1 and when sample sizes are small. In these cases, the approximation to the normal distribution is in fact very rough, especially because of the skewness of the distribution of $\hat{R}$. Thus, transformations of the values of the estimates can be considered in order to "make the data more normal" and produce CIs based on transformed data that have a coverage closer to the nominal level.


Table 2: CIs of the reliability parameter $R$ under variance-stabilizing/normalizing transformations.

| Transformation | $\hat{\theta}$ | CI for $\theta$: $(\theta_L, \theta_U)$ | CI for $R$: $(R_L, R_U)$ |
|---|---|---|---|
| Logit | $\log[\hat{R}/(1-\hat{R})]$ | $\hat{\theta} \mp z_{1-\alpha/2}\sqrt{\hat{V}(\hat{R})}/[\hat{R}(1-\hat{R})]$ | $\left(e^{\theta_L}/(1+e^{\theta_L}),\ e^{\theta_U}/(1+e^{\theta_U})\right)$ |
| Probit | $\Phi^{-1}(\hat{R})$ | $\hat{\theta} \mp z_{1-\alpha/2}\sqrt{\hat{V}(\hat{R})}/\phi(\Phi^{-1}(\hat{R}))$ | $\left(\Phi(\theta_L),\ \Phi(\theta_U)\right)$ |
| Arcsin | $\arcsin\sqrt{\hat{R}}$ | $\hat{\theta} \mp z_{1-\alpha/2}\sqrt{\hat{V}(\hat{R})}/[2\sqrt{\hat{R}(1-\hat{R})}]$ | $\left(\sin^{2}(\theta_L),\ \sin^{2}(\theta_U)\right)$ |
| Loglog | $\log[-\log(1-\hat{R})]$ | $\hat{\theta} \mp z_{1-\alpha/2}\sqrt{\hat{V}(\hat{R})}/[(1-\hat{R})\lvert\log(1-\hat{R})\rvert]$ | $\left(1-\exp(-\exp(\theta_L)),\ 1-\exp(-\exp(\theta_U))\right)$ |

Recalling Table 1 and considering the logit transformation of the reliability parameter $R$, $\theta = \log[R/(1-R)]$, the MLE of $\theta$ is $\hat{\theta} = \log[\hat{R}/(1-\hat{R})]$, and an approximate $(1-\alpha)$ CI for $\theta$ is

$$
(\theta_L, \theta_U) = \left( \hat{\theta} - z_{1-\alpha/2} \cdot \frac{\sqrt{\hat{V}(\hat{R})}}{\hat{R}(1-\hat{R})},\ \hat{\theta} + z_{1-\alpha/2} \cdot \frac{\sqrt{\hat{V}(\hat{R})}}{\hat{R}(1-\hat{R})} \right). \tag{6}
$$

Then an approximate $(1-\alpha)$ CI for $R$ is

$$
(R_L, R_U) = \left( \frac{\exp(\theta_L)}{1+\exp(\theta_L)},\ \frac{\exp(\theta_U)}{1+\exp(\theta_U)} \right). \tag{7}
$$

For the other transformations, following the same steps just outlined, approximate "transformed" CIs for $R$ can be obtained, which are alternatives to the standard naïve CI of (5); they are summarized in Table 2.
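The recipe of Table 2 can be sketched in Python as follows: given a point estimate $\hat{R}$ and a variance estimate $\hat{V}(\hat{R})$, build the CI on the transformed scale and map it back. The function name `transformed_cis` is hypothetical, and the clamping of the arcsine bounds to $[0, \pi/2]$ is a safeguard added here, not something stated in the text.

```python
import math
from statistics import NormalDist

def transformed_cis(r_hat, v_hat, alpha=0.05):
    """Return the AN interval (5) and the four intervals of Table 2 as a dict."""
    nd = NormalDist()
    z = nd.inv_cdf(1 - alpha / 2)
    se = math.sqrt(v_hat)
    cis = {"AN": (r_hat - z * se, r_hat + z * se)}

    # Logit: theta = log[R / (1 - R)], derivative 1 / [R (1 - R)].
    theta = math.log(r_hat / (1 - r_hat))
    half = z * se / (r_hat * (1 - r_hat))
    cis["logit"] = tuple(math.exp(t) / (1 + math.exp(t))
                         for t in (theta - half, theta + half))

    # Probit: theta = Phi^{-1}(R), derivative 1 / phi(Phi^{-1}(R)).
    theta = nd.inv_cdf(r_hat)
    half = z * se / nd.pdf(theta)
    cis["probit"] = (nd.cdf(theta - half), nd.cdf(theta + half))

    # Arcsine root: theta = arcsin(sqrt(R)), derivative 1 / [2 sqrt(R (1 - R))].
    theta = math.asin(math.sqrt(r_hat))
    half = z * se / (2 * math.sqrt(r_hat * (1 - r_hat)))
    lo = max(0.0, theta - half)            # clamp: assumption of this sketch
    hi = min(math.pi / 2, theta + half)    # clamp: assumption of this sketch
    cis["arcsin"] = (math.sin(lo) ** 2, math.sin(hi) ** 2)

    # Complementary log-log: theta = log[-log(1 - R)].
    theta = math.log(-math.log(1 - r_hat))
    half = z * se / ((1 - r_hat) * abs(math.log(1 - r_hat)))
    cis["loglog"] = tuple(1 - math.exp(-math.exp(t))
                          for t in (theta - half, theta + half))
    return cis

if __name__ == "__main__":
    # Values back-computed from the AN row of Table 4 (midpoint and half-width).
    r_hat = 0.8342
    v_hat = (0.0890 / NormalDist().inv_cdf(0.975)) ** 2
    for name, (low, up) in transformed_cis(r_hat, v_hat).items():
        print(f"{name:7s} {low:.4f} {up:.4f}")
```

Because every inverse function maps into the unit interval, the four transformed intervals cannot exceed $[0, 1]$, unlike the AN interval.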

An alternative way to make inference on the reliability parameter $R$ for the stress-strength model with independent Poisson random variables can be summarized as follows: (1) transform (normalize) the samples from $X$ and $Y$ according to a proper transformation for the Poisson distribution; (2) compute point and interval estimates of $R$ using the methods for a stress-strength model with independent normal variables with known variances [6, page 112]. Letting $\Xi = \sqrt{X + 1/4}$ and $\Upsilon = \sqrt{Y + 1/4}$, $\Xi$ and $\Upsilon$ are independent and approximately distributed as normal random variables with variance $\sigma_\xi^2 = \sigma_\upsilon^2 = 1/4$ and expected values $\mu_\xi = \sqrt{\lambda_x}$ and $\mu_\upsilon = \sqrt{\lambda_y}$, respectively. Then, instead of estimating $R = P(X < Y)$, one can estimate $R' = P(\Xi < \Upsilon)$; a $1 - \alpha$ confidence interval for $R'$ is given by

$$
\left( \Phi\left[ \hat{\eta} - \frac{z_{1-\alpha/2}}{\sqrt{M}} \right],\ \Phi\left[ \hat{\eta} + \frac{z_{1-\alpha/2}}{\sqrt{M}} \right] \right), \tag{8}
$$

with $M = (\sigma_\xi^2 + \sigma_\upsilon^2)/(\sigma_\xi^2/n_x + \sigma_\upsilon^2/n_y)$ and $\hat{\eta} = (\bar{\upsilon} - \bar{\xi})/\sigma$ being an estimator of $\eta = (\mu_\upsilon - \mu_\xi)/\sigma$, where $\bar{\xi}$ and $\bar{\upsilon}$ are the sample means of $\Xi$ and $\Upsilon$, respectively, and $\sigma = \sqrt{\sigma_\xi^2 + \sigma_\upsilon^2} = \sqrt{1/2}$.
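The interval in (8) involves only sample means of the transformed data, so it can be sketched in a few lines of Python. The function name `sqrt_transform_ci` is hypothetical; the constants follow the definitions given above.

```python
import math
from statistics import NormalDist

def sqrt_transform_ci(x, y, alpha=0.05):
    """CI (8) for R' = P(Xi < Upsilon) after the square-root transformation."""
    xi = [math.sqrt(v + 0.25) for v in x]        # transformed stress sample
    ups = [math.sqrt(v + 0.25) for v in y]       # transformed strength sample
    var_xi = var_ups = 0.25                      # approximate known variances
    sigma = math.sqrt(var_xi + var_ups)          # sqrt(1/2)
    m = (var_xi + var_ups) / (var_xi / len(x) + var_ups / len(y))
    eta_hat = (sum(ups) / len(ups) - sum(xi) / len(xi)) / sigma
    nd = NormalDist()
    half = nd.inv_cdf(1 - alpha / 2) / math.sqrt(m)
    return nd.cdf(eta_hat - half), nd.cdf(eta_hat + half)

if __name__ == "__main__":
    stress = [2, 1, 3, 2, 2]      # made-up data, for illustration only
    strength = [4, 5, 3, 6, 4]
    print(sqrt_transform_ci(stress, strength))
```

With equal sample sizes $n_x = n_y = n$, the quantity $M$ reduces to $n$, so the half-width on the $\eta$ scale is $z_{1-\alpha/2}/\sqrt{n}$.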

An alternative procedure to make inference about the reliability parameter $R$ can be provided by parametric bootstrap [27, pages 53–56]. In the basic version, it works as follows for the estimation problem at hand.

(1) Estimate the unknown parameters of the Poisson rv's $X$ and $Y$ through their sample means $\bar{x}$ and $\bar{y}$.

(2) Draw independently a bootstrap sample $x^{*}$ of size $n_x$ from a Poisson rv $X^{*}$ with parameter $\bar{x}$ and a bootstrap sample $y^{*}$ of size $n_y$ from a Poisson rv $Y^{*}$ with parameter $\bar{y}$.

(3) Estimate the reliability parameter, say $\hat{R}^{*}$, on samples $x^{*}$ and $y^{*}$, using the very same expression in (4).

(4) Repeat steps 2 and 3 $B$ times ($B$ sufficiently large, e.g., 2000), thus obtaining the bootstrap distribution of $\hat{R}^{*}$.

(5) Estimate a $(1-\alpha)$ bootstrap percentile CI for $R$ from the $\hat{R}^{*}$ distribution, taking the $\alpha/2$ and $1-\alpha/2$ quantiles:

$$
\left( \hat{R}^{*}_{\alpha/2},\ \hat{R}^{*}_{1-\alpha/2} \right). \tag{9}
$$
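The five steps above can be sketched in Python as follows. This is a minimal illustration, not the implementation used in the paper: the Poisson sampler uses Knuth's multiplication method (reasonable only for small rates), the quantiles are simple order statistics, and all function names are hypothetical.

```python
import math
import random

def reliability(lam_x, lam_y, k_max=400):
    """Truncated evaluation of R = P(X < Y) for independent Poisson rv's."""
    px, py = math.exp(-lam_x), math.exp(-lam_y)
    cdf_y, r = 0.0, 0.0
    for i in range(k_max + 1):
        cdf_y += py
        r += px * max(0.0, 1.0 - cdf_y)
        px *= lam_x / (i + 1)
        py *= lam_y / (i + 1)
    return r

def rpois(lam, rng):
    """Draw one Poisson(lam) value by Knuth's multiplication method."""
    limit, k, p = math.exp(-lam), 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k

def bootstrap_percentile_ci(x, y, alpha=0.05, n_boot=2000, seed=None):
    rng = random.Random(seed)
    xbar, ybar = sum(x) / len(x), sum(y) / len(y)               # step 1
    r_star = []
    for _ in range(n_boot):                                     # step 4
        xs = [rpois(xbar, rng) for _ in x]                      # step 2
        ys = [rpois(ybar, rng) for _ in y]
        r_star.append(reliability(sum(xs) / len(xs),
                                  sum(ys) / len(ys)))           # step 3
    r_star.sort()
    lo_idx = math.floor(alpha / 2 * (n_boot - 1))               # step 5
    hi_idx = math.ceil((1 - alpha / 2) * (n_boot - 1))
    return r_star[lo_idx], r_star[hi_idx]

if __name__ == "__main__":
    stress = [2, 1, 3, 2, 2]      # made-up data, for illustration only
    strength = [4, 5, 3, 6, 4]
    print(bootstrap_percentile_ci(stress, strength, seed=1))
```

Each replicate requires a fresh evaluation of (4), which is what makes the procedure comparatively expensive inside a Monte Carlo study.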

4. Simulation Study

4.1. Scope and Design. The simulation study aims at empirically comparing the performance of the interval estimators presented in the previous section, namely, the standard CI of (5), labeled "AN", those of Table 2 (labeled "logit", "probit", "arcsine", "loglog"), and the CIs of (8) and (9), in terms of coverage rate (and also lower and upper uncoverage rates) and expected length. In this Monte Carlo (MC) study, the value of the parameter $\lambda_x$ of the Poisson distribution for stress $X$ is set equal to a "reference" value 2, and the parameter $\lambda_y$ of the Poisson distribution modeling strength is varied in order to obtain, according to (3), several different levels of reliability $R$, namely, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, and 0.95.

For each couple $(\lambda_x, \lambda_y)$, a large number ($nS = 5000$) of samples of size $n_x$ and of size $n_y$ are drawn from $X \sim \text{Poisson}(\lambda_x)$ and $Y \sim \text{Poisson}(\lambda_y)$ independently. Different and unequal sample sizes are here considered (all the 16 possible combinations between the values $n_x = 5, 10, 20, 50$ and $n_y = 5, 10, 20, 50$). On each pair of samples, the 95% CIs for $R$ listed above are built, and the MC coverage rate and the length of the CIs are computed over the $nS$ MC samples. Moreover, the lower and upper uncoverage or error rates are computed, that is, the proportion of CIs for which $R < R_L$ and $R > R_U$, respectively.
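The design requires, for each target reliability level, the value of $\lambda_y$ that yields it through (3) when $\lambda_x = 2$. The text does not say how these values were obtained; one simple possibility, sketched below in Python, is bisection, since $R$ increases with $\lambda_y$. The function names, the search bracket, and the tolerance are assumptions of this sketch.

```python
import math

def reliability(lam_x, lam_y, k_max=400):
    """Truncated evaluation of R = P(X < Y) for independent Poisson rv's."""
    px, py = math.exp(-lam_x), math.exp(-lam_y)
    cdf_y, r = 0.0, 0.0
    for i in range(k_max + 1):
        cdf_y += py
        r += px * max(0.0, 1.0 - cdf_y)
        px *= lam_x / (i + 1)
        py *= lam_y / (i + 1)
    return r

def strength_rate_for(target_r, lam_x=2.0, upper=60.0, tol=1e-10):
    """Find lam_y such that P(X < Y) = target_r, by bisection on [0, upper]."""
    if not 0.0 < target_r < reliability(lam_x, upper):
        raise ValueError("target reliability outside the bracketed range")
    lo, hi = 0.0, upper
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if reliability(lam_x, mid) < target_r:
            lo = mid      # R too small: the strength rate must increase
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    for level in (0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95):
        print(level, round(strength_rate_for(level), 4))
```

The same helper also gives the probability that a strength sample of size 5 is entirely zero, $\exp(-5\lambda_y)$, for any scenario of the design.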


Note that for the smallest sample size, that is, when $n_x = 5$ (or $n_y = 5$), a practical problem arises, as the sample values of $X$ (or $Y$) may all be 0; in this case, the variance estimate $\hat{V}(\hat{R})$ cannot be computed and a standard approximate CI cannot be built. We decided to discard these samples from the 5000 MC samples planned for the simulation study and to compute the quantities of interest only on the "feasible" samples. Indeed, the rate of such "nonfeasible" samples is in any case very low under each scenario. For the worst ones, characterized by $R = 0.1$ and $n_y = 5$, the theoretical rate of "nonfeasible" samples is about 5%.

Since results for the coverage rates of some CIs are often close to the nominal level 0.95, we performed a test in order to check if the actual coverage rate is significantly different from 0.95, that is, to state if such CIs are conservative or liberal. The null hypothesis is that the true rate $\eta$ of CIs that do not cover the real value $R$ is equal to $\alpha = 0.05$, $H_0\colon \eta = \alpha = 0.05$, whereas the alternative hypothesis is that the rate $\eta$ is different from $\alpha$, $H_1\colon \eta \neq \alpha$. We employ the test suggested in [28, pages 518-519], which is based on the statistic

$$
F = \frac{nR + 1}{nS - nR} \cdot \frac{1 - \alpha}{\alpha}, \tag{10}
$$

where $nR$ is the number of rejections (here, the number of CIs that do not cover $R$) in $nS$ (5000 in our case, unless there are some nonfeasible samples) iterations of a MC simulation plan. Under $H_0$, $F$ follows an $F$ distribution with $\nu_1 = 2(nS - nR)$ and $\nu_2 = 2(nR + 1)$ degrees of freedom. The test rejects $H_0$ at level $\gamma$ if either $F \le F_{\gamma/2;\nu_1,\nu_2}$ or $F \ge F_{1-\gamma/2;\nu_1,\nu_2}$, where $F_{\gamma;\nu_1,\nu_2}$ denotes the $\gamma$ percentile point of an $F$ distribution with $\nu_1$ and $\nu_2$ degrees of freedom. The test was performed on each scenario, for each kind of CI, at a level $\gamma = 1\%$.
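The statistic in (10) and its degrees of freedom are straightforward to compute; the Python sketch below does only that. The function name `coverage_test_statistic` is hypothetical. The critical values $F_{\gamma/2;\nu_1,\nu_2}$ and $F_{1-\gamma/2;\nu_1,\nu_2}$ are not computed here, since the Python standard library has no $F$ quantile function; they could be obtained from a statistical library, for example `scipy.stats.f.ppf`, if SciPy is available.

```python
def coverage_test_statistic(n_reject, n_sim, alpha=0.05):
    """Return (F, nu1, nu2) for the coverage test based on statistic (10).

    n_reject: number of CIs that do not cover the true R (nR in the text).
    n_sim:    number of MC samples actually used (nS in the text).
    """
    if not 0 <= n_reject < n_sim:
        raise ValueError("n_reject must satisfy 0 <= n_reject < n_sim")
    f_stat = (n_reject + 1) / (n_sim - n_reject) * (1 - alpha) / alpha
    nu1 = 2 * (n_sim - n_reject)
    nu2 = 2 * (n_reject + 1)
    return f_stat, nu1, nu2

if __name__ == "__main__":
    # 250 noncovering CIs out of 5000 corresponds to an observed rate of 5%.
    print(coverage_test_statistic(250, 5000))
    # A markedly liberal interval: 500 noncovering CIs out of 5000.
    print(coverage_test_statistic(500, 5000))
```

When the observed noncoverage rate is close to $\alpha$, the statistic is close to 1; values well above 1 point to undercoverage and values well below 1 to overcoverage.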

4.2. Results and Discussion. The simulation results are (partially) reported in Table 3 ($\lambda_x = 2$, varying $R$ and the common size $n_x = n_y$). The results about the CIs (8) and (9) are not reported here, as their performances are overall unsatisfactory. Even if theoretically appealing, the procedure leading to the CI in (8) practically fails: some of the scenarios of the simulation plan were considered, and the CI built following this alternative procedure never provides satisfactory results. For the smallest sample size ($n = 5$), the coverage proves to be larger than that provided by the AN CI but still smaller than the nominal one; for the larger sample sizes ($n = 10, 20, 50$), the coverage rate decreases dramatically (even to values as low as 60%). Paradoxically, the discrete nature of the Poisson variable, and thus the quality of the normal approximation of step (1), affects results to a more relevant extent as the sample size increases. As to the bootstrap procedure yielding the CI in (9), this solution is computationally cumbersome (it becomes even more time consuming if bias-corrected accelerated CIs [27, pages 184–188] have to be calculated), and it provides barely satisfactory results, as proven by a preliminary simulation study, not reported here, that confirms the findings reported in [25] for parametric bootstrap inference on the reliability parameter in the bivariate normal case. The rejection of $H_0$ based on the test statistic (10), described in the previous subsection, is indicated by a "*" near the actual coverage rate of each CI. Figures 2 and 3 graphically display the values of lower and upper uncoverage rates for the five interval estimators, varying $R$ in $\{0.5, 0.6, 0.7, 0.8, 0.9, 0.95\}$, for $n_x = n_y = 5$ and $n_x = n_y = 50$, respectively. These results comprise only the equal sample size scenarios; those for unequal sample size scenarios do not add much value to the general discussion we are going to outline.

(i) First, the improvement provided by the four transformations to the standard naïve CI of $R$ in terms of coverage is evident. This improvement is considerable for a small sample size and extreme values of $R$ (say 0.95), when the coverage rate of the standard naïve CI tends to decrease dramatically, even below 90%. Note that for the sample sizes 5, 10, and 20, the actual coverage rate of the standard naïve CI is always significantly different from (smaller than) the nominal level. Under some scenarios, namely, for $R = 0.3$–$0.7$, the increase in coverage rate is also accompanied by a reduction in the CI average length. On the contrary, for the other values of $R$, there is an increase in the average width of the modified CIs, which is much more apparent for the logit transformation.

(ii) Examining Table 3 closely, one can note that the CI based on the arcsine root transformation is, except for one scenario, uniformly worse in terms of actual coverage than the other three CIs based on logit, probit, and complementary log-log transformations. For small sample sizes ($n_x = 5, 10$), the coverage rate actually provided is significantly different from (smaller than) the nominal one. It can also be claimed that this unsatisfactory performance is due to its inability to symmetrize the distribution, as can be seen by glancing at the lower and upper uncoverage rates: for values of $R$ getting close to 1, there is an apparent undercoverage effect on the left side (just a bit smaller than that of the "original" standard approximate CI).

(iii) Logit, probit, and complementary log-log transformed CIs have overall a satisfactory performance in terms of closeness to the nominal confidence level. Logit and probit CIs exhibit a similar behavior, as both the lower and upper uncoverage rates they provide are close to the nominal value (2.5%). On the contrary, the complementary log-log function (as well as arcsine root) often produces a larger undercoverage on one side (here left), which is partially balanced by an overcoverage on the other side (here right). This is clear evidence of the higher symmetrizing capability of the logit and probit transformations. However, taking into account the results of the hypothesis test based on the statistic (10), the probit and complementary log-log functions are those that overall perform best: the statistical hypothesis that their actual coverage rate is equal to the nominal one is always accepted, except for one case ($R = 0.1$ and $n = 5$), whereas the same hypothesis is sometimes rejected (10 times out of 40) for the logit function (which tends to produce significantly larger coverage for small sample sizes).


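Table 3: Simulation results ($\lambda_x = 2$). For each CI, "cov", "LE", and "UE" are reported as percentages; the last five columns report the average width.

| $R$ | $n_x, n_y$ | AN cov | AN LE | AN UE | Logit cov | Logit LE | Logit UE | Probit cov | Probit LE | Probit UE | Arcsin cov | Arcsin LE | Arcsin UE | Loglog cov | Loglog LE | Loglog UE | AN width | Logit width | Probit width | Arcsin width | Loglog width |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.1 | 5, 5 | 89.94* | 0.34 | 9.72 | 97.17* | 2.66 | 0.17 | 96.73* | 2.07 | 1.20 | 94.35 | 1.12 | 4.53 | 96.20* | 3.63 | 0.17 | 0.2836 | 0.3835 | 0.3491 | 0.3152 | 0.4204 |
| 0.1 | 10, 10 | 88.49* | 0.40 | 11.11 | 96.23* | 2.51 | 1.26 | 95.57 | 1.85 | 2.59 | 92.52* | 1.06 | 6.42 | 95.65 | 3.23 | 1.12 | 0.2153 | 0.2501 | 0.2354 | 0.2214 | 0.2622 |
| 0.1 | 20, 20 | 92.26* | 0.36 | 7.38 | 95.20 | 3.06 | 1.74 | 94.94 | 2.32 | 2.74 | 93.62* | 1.44 | 4.94 | 95.12 | 3.26 | 1.62 | 0.1620 | 0.1709 | 0.1653 | 0.1600 | 0.1750 |
| 0.1 | 50, 50 | 93.34* | 0.94 | 5.72 | 95.44 | 2.32 | 2.24 | 95.28 | 1.98 | 2.74 | 94.50 | 1.56 | 3.94 | 95.48 | 2.54 | 1.98 | 0.1020 | 0.1044 | 0.1029 | 0.1015 | 0.1054 |
| 0.2 | 5, 5 | 87.51* | 1.66 | 10.83 | 96.01* | 3.11 | 0.88 | 95.11 | 2.53 | 2.37 | 91.66* | 1.66 | 6.68 | 95.19 | 4.29 | 0.52 | 0.4360 | 0.4717 | 0.4553 | 0.4425 | 0.5123 |
| 0.2 | 10, 10 | 90.68* | 1.12 | 8.20 | 95.30 | 2.68 | 2.02 | 94.72 | 2.22 | 3.06 | 93.14* | 1.74 | 5.12 | 94.86 | 3.64 | 1.50 | 0.3395 | 0.3429 | 0.3358 | 0.3312 | 0.3597 |
| 0.2 | 20, 20 | 93.20* | 1.36 | 5.44 | 95.22 | 2.66 | 2.12 | 95.22 | 2.18 | 2.60 | 94.32 | 1.80 | 3.88 | 94.90 | 3.22 | 1.88 | 0.2469 | 0.2473 | 0.2444 | 0.2428 | 0.2537 |
| 0.2 | 50, 50 | 93.96* | 1.38 | 4.66 | 95.42 | 2.48 | 2.10 | 95.16 | 2.20 | 2.64 | 94.78 | 1.96 | 3.26 | 95.20 | 2.76 | 2.04 | 0.1569 | 0.1572 | 0.1563 | 0.1559 | 0.1588 |
| 0.3 | 5, 5 | 89.11* | 2.50 | 8.39 | 96.42* | 2.24 | 1.34 | 94.72 | 2.24 | 3.04 | 92.22* | 2.24 | 5.54 | 94.72 | 4.48 | 0.80 | 0.5440 | 0.5284 | 0.5245 | 0.5264 | 0.5664 |
| 0.3 | 10, 10 | 91.10* | 1.92 | 6.98 | 95.40 | 2.48 | 2.12 | 94.90 | 2.06 | 3.04 | 93.70* | 1.92 | 4.38 | 94.28 | 4.12 | 1.60 | 0.4149 | 0.3991 | 0.3981 | 0.4002 | 0.4166 |
| 0.3 | 20, 20 | 93.18* | 1.82 | 5.00 | 95.34 | 2.56 | 2.10 | 95.04 | 2.36 | 2.60 | 93.98* | 2.32 | 3.70 | 94.96 | 3.48 | 1.56 | 0.3012 | 0.2947 | 0.2945 | 0.2956 | 0.3018 |
| 0.3 | 50, 50 | 94.38 | 1.82 | 3.80 | 95.46 | 2.42 | 2.12 | 95.32 | 2.20 | 2.48 | 95.10 | 1.96 | 2.94 | 95.40 | 2.80 | 1.80 | 0.1927 | 0.1909 | 0.1909 | 0.1912 | 0.1929 |
| 0.4 | 5, 5 | 88.98* | 3.86 | 7.16 | 96.52* | 2.10 | 1.38 | 94.48 | 2.32 | 3.20 | 92.40* | 2.90 | 4.70 | 95.04 | 4.14 | 0.82 | 0.6068 | 0.5592 | 0.5627 | 0.5733 | 0.5891 |
| 0.4 | 10, 10 | 92.14* | 2.70 | 5.16 | 95.86* | 2.08 | 2.06 | 94.78 | 2.28 | 2.94 | 93.72* | 2.46 | 3.82 | 94.98 | 3.82 | 1.62 | 0.4572 | 0.4305 | 0.4335 | 0.4397 | 0.4455 |
| 0.4 | 20, 20 | 93.64* | 2.60 | 3.76 | 95.58 | 2.44 | 1.98 | 95.22 | 2.44 | 2.34 | 94.70 | 2.48 | 2.82 | 95.14 | 3.32 | 1.54 | 0.3319 | 0.3210 | 0.3226 | 0.3254 | 0.3274 |
| 0.4 | 50, 50 | 94.72 | 2.18 | 3.10 | 95.48 | 2.20 | 2.32 | 95.28 | 2.20 | 2.52 | 95.06 | 2.20 | 2.74 | 95.64 | 2.52 | 1.84 | 0.2128 | 0.2098 | 0.2103 | 0.2111 | 0.2116 |
| 0.5 | 5, 5 | 89.32* | 5.04 | 5.64 | 96.72* | 1.56 | 1.72 | 95.04 | 2.38 | 2.58 | 92.78* | 3.48 | 3.74 | 94.90 | 4.34 | 0.76 | 0.6285 | 0.5691 | 0.5752 | 0.5889 | 0.5864 |
| 0.5 | 10, 10 | 91.98* | 3.82 | 4.20 | 95.54 | 2.04 | 2.42 | 94.94 | 2.36 | 2.70 | 93.84* | 2.78 | 3.38 | 94.88 | 3.62 | 1.50 | 0.4709 | 0.4406 | 0.4450 | 0.4526 | 0.4501 |
| 0.5 | 20, 20 | 93.50* | 3.32 | 3.18 | 95.36 | 2.30 | 2.34 | 94.80 | 2.54 | 2.66 | 94.64 | 2.62 | 2.74 | 94.76 | 3.50 | 1.74 | 0.3413 | 0.3290 | 0.3312 | 0.3345 | 0.3332 |
| 0.5 | 50, 50 | 94.62 | 2.46 | 2.92 | 95.38 | 2.18 | 2.44 | 95.16 | 2.32 | 2.52 | 94.96 | 2.34 | 2.70 | 95.42 | 2.72 | 1.86 | 0.2191 | 0.2157 | 0.2163 | 0.2173 | 0.2169 |
| 0.6 | 5, 5 | 89.44* | 6.36 | 4.20 | 96.52* | 1.44 | 2.04 | 94.70 | 2.50 | 2.80 | 92.66* | 4.04 | 3.30 | 94.78 | 4.28 | 0.94 | 0.6095 | 0.5587 | 0.5628 | 0.5741 | 0.5601 |
| 0.6 | 10, 10 | 91.78* | 4.84 | 3.38 | 95.52 | 1.78 | 2.70 | 94.88 | 2.32 | 2.80 | 93.92* | 3.06 | 3.02 | 94.92 | 3.48 | 1.60 | 0.4576 | 0.4305 | 0.4337 | 0.4401 | 0.4324 |
| 0.6 | 20, 20 | 93.40* | 4.02 | 2.58 | 95.44 | 2.02 | 2.54 | 95.12 | 2.34 | 2.54 | 94.46 | 2.98 | 2.56 | 95.00 | 3.34 | 1.66 | 0.3304 | 0.3197 | 0.3212 | 0.3240 | 0.3206 |
| 0.6 | 50, 50 | 95.06 | 2.68 | 2.26 | 95.54 | 1.98 | 2.48 | 95.38 | 2.14 | 2.48 | 95.24 | 2.28 | 2.48 | 95.58 | 2.58 | 1.84 | 0.2120 | 0.2090 | 0.2095 | 0.2103 | 0.2093 |
| 0.7 | 5, 5 | 89.52* | 7.54 | 2.94 | 96.06* | 1.48 | 2.46 | 95.04 | 2.50 | 2.46 | 92.56* | 4.76 | 2.68 | 94.54 | 4.54 | 0.92 | 0.5490 | 0.5266 | 0.5242 | 0.5279 | 0.5094 |
| 0.7 | 10, 10 | 91.70* | 5.82 | 2.48 | 95.32 | 1.64 | 3.04 | 95.02 | 2.22 | 2.76 | 93.78* | 3.66 | 2.56 | 94.92 | 3.44 | 1.64 | 0.4162 | 0.3991 | 0.3987 | 0.4014 | 0.3919 |
| 0.7 | 20, 20 | 93.28* | 4.68 | 2.04 | 95.62 | 1.90 | 2.48 | 95.18 | 2.34 | 2.48 | 94.44 | 3.26 | 2.30 | 95.30 | 3.26 | 1.44 | 0.2984 | 0.2920 | 0.2918 | 0.2930 | 0.2889 |
| 0.7 | 50, 50 | 94.74 | 3.24 | 2.02 | 95.48 | 1.88 | 2.64 | 95.40 | 2.16 | 2.44 | 95.24 | 2.52 | 2.24 | 95.38 | 2.64 | 1.98 | 0.1910 | 0.1893 | 0.1893 | 0.1896 | 0.1885 |
| 0.8 | 5, 5 | 88.94* | 9.60 | 1.46 | 96.00* | 1.24 | 2.76 | 94.92 | 2.32 | 2.76 | 92.54* | 5.78 | 1.68 | 95.00 | 4.18 | 0.82 | 0.4419 | 0.4657 | 0.4526 | 0.4436 | 0.4289 |
| 0.8 | 10, 10 | 91.66* | 7.02 | 1.32 | 95.18 | 1.56 | 3.26 | 94.96 | 2.34 | 2.70 | 93.70* | 4.18 | 2.12 | 95.22 | 3.36 | 1.42 | 0.3411 | 0.3408 | 0.3349 | 0.3315 | 0.3242 |
| 0.8 | 20, 20 | 93.22* | 5.64 | 1.14 | 95.22 | 1.88 | 2.90 | 94.94 | 2.52 | 2.54 | 94.32 | 3.64 | 2.04 | 95.00 | 3.34 | 1.66 | 0.2424 | 0.2425 | 0.2399 | 0.2386 | 0.2356 |
| 0.8 | 50, 50 | 94.60 | 3.94 | 1.46 | 95.34 | 1.94 | 2.72 | 95.20 | 2.32 | 2.48 | 95.04 | 2.86 | 2.10 | 95.40 | 2.70 | 1.90 | 0.1543 | 0.1544 | 0.1537 | 0.1533 | 0.1525 |
| 0.9 | 5, 5 | 88.32* | 11.14 | 0.54 | 95.28 | 1.14 | 3.58 | 95.04 | 2.58 | 2.38 | 91.96* | 6.76 | 1.28 | 94.70 | 4.38 | 0.92 | 0.2799 | 0.3557 | 0.3295 | 0.3043 | 0.3029 |
| 0.9 | 10, 10 | 90.80* | 8.50 | 0.70 | 95.12 | 1.44 | 3.44 | 94.76 | 2.52 | 2.72 | 93.40* | 4.98 | 1.62 | 95.08 | 3.30 | 1.62 | 0.2175 | 0.2410 | 0.2294 | 0.2187 | 0.2181 |
| 0.9 | 20, 20 | 92.48* | 6.92 | 0.60 | 95.24 | 1.74 | 3.02 | 95.12 | 2.36 | 2.52 | 94.22* | 4.28 | 1.50 | 95.20 | 3.20 | 1.60 | 0.1550 | 0.1621 | 0.1574 | 0.1532 | 0.1531 |
| 0.9 | 50, 50 | 94.52 | 4.62 | 0.86 | 95.10 | 1.98 | 2.92 | 95.30 | 2.30 | 2.40 | 95.06 | 3.22 | 1.72 | 95.40 | 2.62 | 1.98 | 0.0976 | 0.0996 | 0.0983 | 0.0971 | 0.0971 |
| 0.95 | 5, 5 | 86.86* | 12.96 | 0.18 | 95.16 | 1.12 | 3.72 | 95.10 | 2.62 | 2.28 | 91.52* | 7.72 | 0.76 | 95.02 | 4.12 | 0.86 | 0.1691 | 0.2593 | 0.2289 | 0.1973 | 0.2058 |
| 0.95 | 10, 10 | 89.90* | 9.84 | 0.26 | 94.80 | 1.48 | 3.72 | 94.98 | 2.50 | 2.52 | 92.62* | 6.28 | 1.10 | 94.84 | 3.46 | 1.70 | 0.1277 | 0.1611 | 0.1489 | 0.1359 | 0.1398 |
| 0.95 | 20, 20 | 92.30* | 7.50 | 0.20 | 95.22 | 1.74 | 3.04 | 95.12 | 2.48 | 2.40 | 94.02* | 4.76 | 1.22 | 94.92 | 3.32 | 1.76 | 0.0931 | 0.1021 | 0.0975 | 0.0925 | 0.0941 |
| 0.95 | 50, 50 | 94.20* | 5.32 | 0.48 | 95.30 | 1.88 | 2.82 | 95.24 | 2.40 | 2.36 | 95.08 | 3.46 | 1.46 | 95.32 | 2.78 | 1.90 | 0.0579 | 0.0602 | 0.0590 | 0.0577 | 0.0582 |

Legend: "cov" = actual coverage rate; "LE" = lower error rate; "UE" = upper error rate; "*" = rejection of $H_0$.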


Figure 2: Lower and upper uncoverage rates ($\lambda_x = 2$, $R \ge 0.5$, $n_x = n_y = 5$). Six panels, one for each of $R = 0.5, 0.6, 0.7, 0.8, 0.9, 0.95$; each panel reports the AN, Logit, Probit, Arcsin, and Log-log CIs.

(iv) Differences in behavior among the four "transformed" CIs tend to become more apparent for small sample sizes and for extreme values of $R$ (close to 0 or 1), as can be seen by looking at the bottom right graph of Figure 2. Clearly, the same differences tend to vanish as the sample sizes increase and $R$ tends to 0.5 (see the top left graph in Figure 3).

(v) Contrary to what one would predict, the performances in terms of coverage rate of logit, probit, and complementary log-log-based CIs do not seem to be negatively affected by a small sample size; in fact, as underlined before, a (sometimes significant) increase of the coverage rate of the logit CIs is noticed when the sample size is small ($n_x = 5$), whereas the other two CIs show an almost constant trend. The CI exploiting the arcsine root transformation is the one taking most advantage from the increase in sample size. Obviously, the average length of the CIs decreases moving to larger sample sizes.

(vi) As one could expect, there is a symmetry in the behavior of each CI when considering two complementary values of $R$ (i.e., values of $R$ summing to 1): keeping the sample size fixed, the values of the coverage rate and average width are similar. As to the uncoverage rates, they are nearly exchangeable for AN, logit, probit, and arcsin CIs; that is, a lower uncoverage error for a fixed value of $R$ is similar to the upper uncoverage error for its complementary value (or at least their relative magnitude is exchangeable). This feature does not hold for log-log CIs, which always present a left-side uncoverage error larger than that of the right side. These features are weakened for $R$ equal to 0.9 (and $1 - R$ then equal to 0.1), probably because of the presence of "nonfeasible" samples, as discussed in Section 4.1, which distorts this symmetry condition.

(vii) Finally (the relative results are not reported here for the sake of brevity), moving to larger values (i.e., larger than 2) of $\lambda_x$, keeping $R$ constant, seems to bring benefit to the performance of all the CIs, as can be quantitatively noted by the reduced number of rejections of the null hypothesis of equivalence between actual and theoretical coverage. This may be explained by the fact that, for a large parameter $\lambda$, the Poisson distribution tends to a normal distribution; therefore, larger values of $\lambda_x$ imply a better normal approximation to the Poisson and presumably to the MLE $\hat{R}$.

In Figure 4, for illustrative purposes, the MC distribution of $\hat{R}$ and of the transformed data according to the four transformations are displayed ($R = 0.8$ and $n_x = n_y = 5$). We can note at a glance that logit and probit produce distributions closer to normality than the arcsine root and complementary log-log functions, which do not seem completely able to "correct" the skewness of the original distribution. However, implementing the Shapiro-Wilk test of normality on the four transformed distributions leads to very low $P$-values, practically equal to 0, except for the probit function, whose $P$-value is 0.03132, thus proving that in this case all the transformed data distributions are still far from normality.


Figure 3: Lower and upper uncoverage rates ($\lambda_x = 2$, $R \ge 0.5$, $n_x = n_y = 50$). Six panels, one for each of $R = 0.5, 0.6, 0.7, 0.8, 0.9, 0.95$; each panel reports the AN, Logit, Probit, Arcsin, and Log-log CIs.

Figure 4: Histogram of the MC empirical distribution of $\hat{R}$ and $\hat{\theta}$ according to the transformations of Table 2 ($\lambda_x = 2$, $R = 0.8$, $n_x = n_y = 5$). Superimposed, the density function of the normal distribution with mean and standard deviation equal to the mean and standard deviation of the data. Note that neither the $y$-axis nor the $x$-axis shares the same scale in the different plots. Panels: $R$, Logit($R$), Probit($R$), Arcsin($R$), log-log($R$).

5. An Application

The application we illustrate here is based on the data described in [29] and already used in [7], to which we refer the reader for full details. On these data, we build the four 95% CIs based on the transformations of Table 2, along with the standard one. The results (lower and upper bounds and length of the CIs) are reported in Table 4. Although the five CIs are not much different from each other, we can note that all four transformation-based intervals have both lower and upper bounds smaller than the corresponding bounds for the standard naïve interval (i.e., they are left shifted with respect to it); the logit transformation yields an interval a bit wider than the standard one, whereas the other three transformations produce a slight decrease in its length.

Table 4: Results for the application data.

| Type | $R_L$ | $R_U$ | Width |
|---|---|---|---|
| AN | 0.7452 | 0.9232 | 0.1780 |
| Logit | 0.7256 | 0.9054 | 0.1799 |
| Probit | 0.7302 | 0.9080 | 0.1777 |
| Arcsine | 0.7365 | 0.9128 | 0.1763 |
| Log-log | 0.7363 | 0.9113 | 0.1750 |

6. Conclusions

This work provided an empirical analysis of the convenience of data transformations for estimating the reliability parameter in stress-strength models. We focused on the model involving two independent Poisson random variables, since this is a case where the distribution of the MLE is not easily derivable and exact CIs cannot be built; thus, transformation of the estimates is a viable tool to refine the standard naïve interval estimator derived by the central limit theorem. Four transformations were considered (logit, probit, arcsine root, and complementary log-log) and empirically assessed under a number of scenarios in terms of the coverage and average length of the CI they produced. Results are in favor of logit, probit, and complementary log-log functions (although to varying degrees), which ensure a coverage rate close to the nominal coverage even with small sample sizes. On the contrary, the arcsine root function, although improving the performance of the standard CI, often keeps its coverage rates under the nominal confidence level. These findings were to some extent predictable, since logit and probit transformations are very popular link functions in generalized regression models, but are a bit surprising with regard to the complementary log-log function, whose use is limited.

Future research will ascertain if such results hold true for other distributions for the stress-strength model and will eventually inspect alternative data transformations.

Conflict of Interests

The author declares that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

The author would like to thank the editor and two anonymous referees, whose suggestions and comments improved the quality of the original version of the paper.

References

[1] J. L. Fleiss, B. Levin, and M. C. Paik, Statistical Methods for Rates and Proportions, John Wiley & Sons, Hoboken, NJ, USA, 3rd edition, 2003.

[2] A. Agresti and B. A. Coull, "Approximate is better than "exact" for interval estimation of binomial proportions," The American Statistician, vol. 52, no. 2, pp. 119–126, 1998.

[3] L. D. Brown, T. T. Cai, and A. Dasgupta, "Confidence intervals for a binomial proportion and asymptotic expansions," The Annals of Statistics, vol. 30, no. 1, pp. 160–201, 2002.

[4] A. M. Pires and C. Amado, "Interval estimators for a binomial proportion: comparison of twenty methods," REVSTAT Statistical Journal, vol. 6, no. 2, pp. 165–197, 2008.

[5] B. V. Gnedenko and I. A. Ushakov, Probabilistic Reliability Engineering, John Wiley & Sons, New York, NY, USA, 1995.

[6] S. Kotz, Y. Lumelskii, and M. Pensky, The Stress-strength Model and Its Generalizations: Theory and Applications, World Scientific, New York, NY, USA, 2003.

[7] A. Barbiero, "Inference for reliability of Poisson data," Journal of Quality and Reliability Engineering, vol. 2013, Article ID 530530, 8 pages, 2013.

[8] G. Blom, "Transformations of the binomial, negative binomial, Poisson and $\chi^2$ distributions," Biometrika, vol. 41, pp. 302–316, 1954.

[9] D. C. Montgomery, Design and Analysis of Experiment, John Wiley & Sons, New York, NY, USA, 6th edition, 2005.

[10] Y. P. Chaubey and G. S. Mudholkar, "On the symmetrizing transformations of random variables," Concordia University, unpublished preprint, Mathematics and Statistics, 1983.

[11] M. H. Hoyle, "Transformations: an introduction and a bibliography," International Statistical Review, vol. 41, no. 2, pp. 203–223, 1973.

[12] A. Foi, "Optimization of variance-stabilizing transformations," submitted.

[13] W. Greene, Econometric Analysis, Prentice Hall, New York, NY, USA, 7th edition, 2012.

[14] A. Dasgupta, Asymptotic Theory of Statistics and Probability, Springer, New York, NY, USA, 2008.

[15] L. Brown, T. Cai, R. Zhang, L. Zhao, and H. Zhou, "The root-unroot algorithm for density estimation as implemented via wavelet block thresholding," Probability Theory and Related Fields, vol. 146, no. 3-4, pp. 401–433, 2010.

[16] J. H. Aldrich and F. D. Nelson, Linear Probability, Logit, and Probit Models, Sage, Beverly Hills, Calif, USA, 1984.

[17] A. Baklizi, "Inference on Pr(X < Y) in the two-parameter Weibull model based on records," ISRN Probability and Statistics, vol. 2012, Article ID 263612, 11 pages, 2012.

[18] K. Krishnamoorthy and Y. Lin, "Confidence limits for stress-strength reliability involving Weibull models," Journal of Statistical Planning and Inference, vol. 140, no. 7, pp. 1754–1764, 2010.

[19] S. P. Mukherjee and S. S. Maiti, "Stress-strength reliability in the Weibull case," in Frontiers in Reliability, vol. 4, pp. 231–248, World Scientific, Singapore, 1998.

[20] D. K. Al-Mutairi, M. E. Ghitany, and D. Kundu, "Inferences on stress-strength reliability from Lindley distributions," Communications in Statistics: Theory and Methods, vol. 42, no. 8, pp. 1443–1463, 2013.

[21] M. M. Shoukri, M. A. Chaudhary, and A. Al-Halees, "Estimating P(Y < X) when X and Y are paired exponential variables," Journal of Statistical Computation and Simulation, vol. 75, no. 1, pp. 25–38, 2005.

[22] A. Baklizi and O. Eidous, "Nonparametric estimation of P(X < Y) using kernel methods," Metron, vol. 64, no. 1, pp. 47–60, 2006.

[23] G. Qin and L. Hotilovac, "Comparison of non-parametric confidence intervals for the area under the ROC curve of a continuous-scale diagnostic test," Statistical Methods in Medical Research, vol. 17, no. 2, pp. 207–221, 2008.

[24] W. H. Ahrens, "Use of the arcsine and square root transformations for subjectively determined percentage data," Weed Science, vol. 38, no. 4-5, pp. 452–458, 1990.

[25] A. Barbiero, "Interval estimators for reliability: the bivariate normal case," Journal of Applied Statistics, vol. 39, no. 3, pp. 501–512, 2012.

[26] A. Barbiero, "Confidence intervals for reliability of stress-strength models in the normal case," Communications in Statistics: Simulation and Computation, vol. 40, no. 6, pp. 907–925, 2011.

[27] B. Efron and R. Tibshirani, An Introduction to the Bootstrap, Chapman and Hall, London, UK, 1993.

[28] I. R. Cardoso de Oliveira and D. F. Ferreira, "Multivariate extension of chi-squared univariate normality test," Journal of Statistical Computation and Simulation, vol. 80, no. 5-6, pp. 513–526, 2010.

[29] P. C. Wang, "Comparisons on the analysis of Poisson data," Quality and Reliability Engineering International, vol. 15, pp. 379–383, 1999.


Page 4: Research Article Data Transformation for Confidence

4 Advances in Decision Sciences

Table 2: CIs of the reliability parameter $R$ under variance-stabilizing/normalizing transformations.

| Transformation | $\hat{\theta}$ | CI for $\theta$: $(\theta_L, \theta_U)$ | CI for $R$: $(R_L, R_U)$ |
|---|---|---|---|
| Logit | $\log[\hat{R}/(1-\hat{R})]$ | $\hat{\theta} \mp z_{1-\alpha/2}\sqrt{\hat{V}(\hat{R})}/[\hat{R}(1-\hat{R})]$ | $\left(\exp(\theta_L)/(1+\exp(\theta_L)),\ \exp(\theta_U)/(1+\exp(\theta_U))\right)$ |
| Probit | $\Phi^{-1}(\hat{R})$ | $\hat{\theta} \mp z_{1-\alpha/2}\sqrt{\hat{V}(\hat{R})}/\phi(\Phi^{-1}(\hat{R}))$ | $\left(\Phi(\theta_L),\ \Phi(\theta_U)\right)$ |
| Arcsin | $\arcsin\sqrt{\hat{R}}$ | $\hat{\theta} \mp z_{1-\alpha/2}\sqrt{\hat{V}(\hat{R})}/[2\sqrt{\hat{R}(1-\hat{R})}]$ | $\left(\sin^2(\theta_L),\ \sin^2(\theta_U)\right)$ |
| Loglog | $\log[-\log(1-\hat{R})]$ | $\hat{\theta} \mp z_{1-\alpha/2}\sqrt{\hat{V}(\hat{R})}/[(1-\hat{R})\lvert\log(1-\hat{R})\rvert]$ | $\left(1-\exp(-\exp(\theta_L)),\ 1-\exp(-\exp(\theta_U))\right)$ |

Recalling Table 1 and considering the logit transformation of the reliability parameter $R$, $\theta = \log[R/(1-R)]$, the MLE of $\theta$ is $\hat{\theta} = \log[\hat{R}/(1-\hat{R})]$, and an approximate $(1-\alpha)$ CI for $\theta$ is

$$(\theta_L, \theta_U) = \left(\hat{\theta} - z_{1-\alpha/2}\cdot\frac{\sqrt{\hat{V}(\hat{R})}}{\hat{R}(1-\hat{R})},\ \hat{\theta} + z_{1-\alpha/2}\cdot\frac{\sqrt{\hat{V}(\hat{R})}}{\hat{R}(1-\hat{R})}\right). \tag{6}$$

Then, an approximate $(1-\alpha)$ CI for $R$ is

$$(R_L, R_U) = \left(\frac{\exp(\theta_L)}{1+\exp(\theta_L)},\ \frac{\exp(\theta_U)}{1+\exp(\theta_U)}\right). \tag{7}$$

For the other transformations, following the same steps just outlined, approximate "transformed" CIs for $R$ can be obtained, which are alternatives to the standard naive CI of (5); they are summarized in Table 2.
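The construction of Table 2 can be sketched in a few lines of Python. This is only an illustrative sketch, not the code used for the paper: the function name `transformed_ci` and its arguments are hypothetical, the point estimate and its estimated variance are assumed to be already available from (4) and (5), and the clamping of the arcsine limits to $[0, \pi/2]$ before back-transforming is an added safeguard that is not part of Table 2.

```python
import math
from statistics import NormalDist

_STD = NormalDist()  # standard normal: cdf = Phi, inv_cdf = Phi^{-1}, pdf = phi


def transformed_ci(r_hat, var_hat, kind="logit", alpha=0.05):
    """Back-transformed (1 - alpha) CI for R, one branch per row of Table 2.

    r_hat   : point estimate of R, strictly inside (0, 1)
    var_hat : estimated variance of r_hat (assumed to be given)
    """
    if not 0.0 < r_hat < 1.0:
        raise ValueError("r_hat must lie strictly between 0 and 1")
    z = _STD.inv_cdf(1.0 - alpha / 2.0)
    se = math.sqrt(var_hat)

    if kind == "logit":
        theta = math.log(r_hat / (1.0 - r_hat))
        se_theta = se / (r_hat * (1.0 - r_hat))
        back = lambda t: math.exp(t) / (1.0 + math.exp(t))
    elif kind == "probit":
        theta = _STD.inv_cdf(r_hat)
        se_theta = se / _STD.pdf(theta)
        back = _STD.cdf
    elif kind == "arcsin":
        theta = math.asin(math.sqrt(r_hat))
        se_theta = se / (2.0 * math.sqrt(r_hat * (1.0 - r_hat)))
        # Assumption (not in Table 2): clamp to [0, pi/2], where sin^2 is monotone.
        back = lambda t: math.sin(min(max(t, 0.0), math.pi / 2.0)) ** 2
    elif kind == "loglog":
        theta = math.log(-math.log(1.0 - r_hat))
        se_theta = se / ((1.0 - r_hat) * abs(math.log(1.0 - r_hat)))
        back = lambda t: 1.0 - math.exp(-math.exp(t))
    else:
        raise ValueError("kind must be 'logit', 'probit', 'arcsin' or 'loglog'")

    # Delta-method CI on the theta scale, then mapped back to the R scale.
    return back(theta - z * se_theta), back(theta + z * se_theta)
```

As a rough plausibility check, taking the midpoint and half-width of the "AN" interval of Table 4 as inputs (that is, $\hat{R} \approx 0.8342$ and a standard error of about $0.0454$, values back-derived here and not reported in the paper) gives intervals that agree with the four transformed rows of Table 4 to about three decimal places.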

An alternative way to make inference on the reliability parameter $R$ for the stress-strength model with independent Poisson random variables can be summarized as follows: (1) transform (normalize) the samples from $X$ and $Y$ according to a proper transformation for the Poisson distribution; (2) compute point and interval estimates of $R$ using the methods for a stress-strength model with independent normal variables with known variances [6, page 112]. Letting $\Xi = \sqrt{X + 1/4}$ and $\Upsilon = \sqrt{Y + 1/4}$, $\Xi$ and $\Upsilon$ are independent and approximately distributed as normal random variables with variance $\sigma_\xi^2 = \sigma_\upsilon^2 = 1/4$ and expected value $\mu_\xi = \sqrt{\lambda_x}$ and $\mu_\upsilon = \sqrt{\lambda_y}$, respectively. Then, instead of estimating $R = P(X < Y)$, one can estimate $R' = P(\Xi < \Upsilon)$; a $1-\alpha$ confidence interval for $R'$ is given by

$$\left(\Phi\left[\hat{\eta} - \frac{z_{1-\alpha/2}}{\sqrt{M}}\right],\ \Phi\left[\hat{\eta} + \frac{z_{1-\alpha/2}}{\sqrt{M}}\right]\right), \tag{8}$$

with $M = (\sigma_\xi^2 + \sigma_\upsilon^2)/(\sigma_\xi^2/n_x + \sigma_\upsilon^2/n_y)$ and $\hat{\eta} = (\bar{\upsilon} - \bar{\xi})/\sigma$ being an estimator of $\eta = (\mu_\upsilon - \mu_\xi)/\sigma$, where $\bar{\xi}$ and $\bar{\upsilon}$ are the sample means of $\Xi$ and $\Upsilon$, respectively, and $\sigma = \sqrt{\sigma_\xi^2 + \sigma_\upsilon^2} = \sqrt{1/2}$.
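A minimal sketch of the interval in (8) is given below, only to make the two steps concrete. The function name `root_transform_ci` and the sample values in the usage note are hypothetical; the constants (shift $1/4$, common variance $1/4$) are those stated above.

```python
import math
from statistics import NormalDist, fmean


def root_transform_ci(x, y, alpha=0.05):
    """CI (8) for R' = P(Xi < Upsilon) after the square-root transformation."""
    std = NormalDist()
    # Step (1): normalize the Poisson samples.
    xi = [math.sqrt(v + 0.25) for v in x]
    ups = [math.sqrt(v + 0.25) for v in y]
    # Step (2): normal stress-strength model with known variances 1/4.
    var_xi = var_ups = 0.25
    sigma = math.sqrt(var_xi + var_ups)  # equals sqrt(1/2)
    m = (var_xi + var_ups) / (var_xi / len(x) + var_ups / len(y))
    eta_hat = (fmean(ups) - fmean(xi)) / sigma
    half = std.inv_cdf(1.0 - alpha / 2.0) / math.sqrt(m)
    return std.cdf(eta_hat - half), std.cdf(eta_hat + half)
```

For instance, `root_transform_ci([1, 2, 0, 3, 2], [4, 5, 3, 6, 4])` returns the lower and upper limits for $R'$ on these made-up samples. When the two samples coincide, $\hat{\eta} = 0$ and the interval is symmetric around 0.5.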

An alternative procedure to make inference about the reliability parameter $R$ can be provided by parametric bootstrap [27, pages 53–56]. In the basic version, it works as follows for the estimation problem at hand.

(1) Estimate the unknown parameters of the Poisson rv's $X$ and $Y$ through their sample means $\bar{x}$ and $\bar{y}$.

(2) Draw independently a bootstrap sample $x^*$ of size $n_x$ from a Poisson rv $X^*$ with parameter $\bar{x}$ and a bootstrap sample $y^*$ of size $n_y$ from a Poisson rv $Y^*$ with parameter $\bar{y}$.

(3) Estimate the reliability parameter, say $\hat{R}^*$, on samples $x^*$ and $y^*$, using the very same expression in (4).

(4) Repeat steps 2 and 3 $B$ times ($B$ sufficiently large, e.g., 2000), thus obtaining the bootstrap distribution of $\hat{R}^*$.

(5) Estimate a $(1-\alpha)$ bootstrap percentile CI for $R$ from the $\hat{R}^*$ distribution, taking the $\alpha/2$ and $1-\alpha/2$ quantiles:

$$\left(\hat{R}^*_{\alpha/2},\ \hat{R}^*_{1-\alpha/2}\right). \tag{9}$$
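The five steps above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the implementation used in the study: the estimator in (4) is assumed to be the plug-in evaluation of $R = P(X < Y) = \sum_x P(X = x)\,P(Y > x)$ at the sample means, the series is truncated at a hypothetical cut-off `kmax`, Poisson variates are drawn with Knuth's multiplication method (adequate for small parameters only), and the quantiles are obtained by linear interpolation. All function names are hypothetical.

```python
import math
import random
from statistics import fmean


def reliability(lam_x, lam_y, kmax=200):
    """R = P(X < Y) for independent Poisson X, Y (series truncated at kmax)."""
    px = math.exp(-lam_x)      # P(X = 0)
    py = math.exp(-lam_y)      # P(Y = 0)
    cdf_y = py                 # P(Y <= 0)
    r = 0.0
    for k in range(kmax + 1):
        r += px * max(0.0, 1.0 - cdf_y)   # P(X = k) * P(Y > k)
        px *= lam_x / (k + 1)             # move on to P(X = k + 1)
        py *= lam_y / (k + 1)
        cdf_y += py
    return r


def rpois(lam, rng):
    """One Poisson(lam) draw by Knuth's multiplication method."""
    limit, k, prod = math.exp(-lam), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def quantile(sorted_vals, q):
    """Empirical quantile with linear interpolation."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(math.floor(pos))
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1.0 - frac) + sorted_vals[hi] * frac


def bootstrap_percentile_ci(x, y, b=2000, alpha=0.05, seed=0):
    rng = random.Random(seed)
    lam_x, lam_y = fmean(x), fmean(y)                  # step (1)
    stats = []
    for _ in range(b):                                 # step (4)
        xs = [rpois(lam_x, rng) for _ in x]            # step (2)
        ys = [rpois(lam_y, rng) for _ in y]
        stats.append(reliability(fmean(xs), fmean(ys)))  # step (3)
    stats.sort()
    return quantile(stats, alpha / 2.0), quantile(stats, 1.0 - alpha / 2.0)  # step (5)
```

A call such as `bootstrap_percentile_ci([1, 2, 0, 3, 2], [4, 5, 3, 6, 4], b=2000)` returns the pair of percentile limits in (9) for these made-up samples; the `seed` argument only makes the sketch reproducible.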

4. Simulation Study

4.1. Scope and Design. The simulation study aims at empirically comparing the performance of the interval estimators presented in the previous section, namely, the standard CI of (5), labeled "AN," those of Table 2 (labeled "logit," "probit," "arcsine," and "loglog"), and the CIs of (8) and (9), in terms of coverage rate (and also lower and upper uncoverage rates) and expected length. In this Monte Carlo (MC) study, the value of the parameter $\lambda_x$ of the Poisson distribution for stress $X$ is set equal to a "reference" value, 2, and the parameter $\lambda_y$ of the Poisson distribution modeling strength is varied in order to obtain, according to (3), several different levels of reliability $R$, namely, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, and 0.95.
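The value of $\lambda_y$ yielding a target level of $R$ for a given $\lambda_x$ has no closed form, but it can be found numerically, since $R = P(X < Y)$ increases with $\lambda_y$. The sketch below uses plain bisection; it assumes that (3) is the series $R = \sum_x P(X = x)\,P(Y > x)$, truncated at a hypothetical cut-off `kmax`, and the function names are illustrative.

```python
import math


def reliability(lam_x, lam_y, kmax=200):
    """R = P(X < Y) for independent Poisson X, Y (series truncated at kmax)."""
    px = math.exp(-lam_x)
    py = math.exp(-lam_y)
    cdf_y = py
    r = 0.0
    for k in range(kmax + 1):
        r += px * max(0.0, 1.0 - cdf_y)   # P(X = k) * P(Y > k)
        px *= lam_x / (k + 1)
        py *= lam_y / (k + 1)
        cdf_y += py
    return r


def lambda_y_for(target_r, lam_x=2.0, tol=1e-10):
    """Bisection on lam_y so that P(X < Y) equals target_r (0 < target_r < 1)."""
    lo, hi = 0.0, 1.0
    while reliability(lam_x, hi) < target_r:   # expand until the root is bracketed
        hi *= 2.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if reliability(lam_x, mid) < target_r:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0
```

With $\lambda_x = 2$, `lambda_y_for(0.1)` returns a value close to 0.6, and `lambda_y_for(0.5)` returns a value above 2, because $P(X < Y) < 0.5$ when the two Poisson parameters are equal (ties have positive probability). The truncation at `kmax = 200` is adequate only for moderate parameter values.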

For each couple $(\lambda_x, \lambda_y)$, a large number ($nS = 5000$) of samples of size $n_x$ and of size $n_y$ are drawn from $X \sim \text{Poisson}(\lambda_x)$ and $Y \sim \text{Poisson}(\lambda_y)$ independently. Both equal and unequal sample sizes are considered here (all the 16 possible combinations between the values $n_x = 5, 10, 20, 50$ and $n_y = 5, 10, 20, 50$). On each pair of samples, the 95% CIs for $R$ listed above are built, and the MC coverage rate and the length of the CIs are computed over the $nS$ MC samples. Moreover, the lower and upper uncoverage (or error) rates are computed, that is, the proportions of CIs for which $R < R_L$ and $R > R_U$, respectively.
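The performance measures just defined can be computed from the simulated intervals as in the following sketch. The function name `summarize_intervals` and the toy intervals in the usage note are hypothetical; the definitions of the lower and upper error rates are those given above.

```python
def summarize_intervals(intervals, r_true):
    """MC coverage, lower/upper error rates and mean width of a list of CIs.

    intervals : list of (r_low, r_up) pairs, one per feasible MC sample
    r_true    : true value of the reliability parameter R
    """
    n = len(intervals)
    lower_err = sum(1 for lo, hi in intervals if r_true < lo) / n   # R < R_L
    upper_err = sum(1 for lo, hi in intervals if r_true > hi) / n   # R > R_U
    coverage = 1.0 - lower_err - upper_err
    mean_width = sum(hi - lo for lo, hi in intervals) / n
    return coverage, lower_err, upper_err, mean_width
```

For example, with the four toy intervals `[(0.1, 0.3), (0.25, 0.5), (0.0, 0.15), (0.15, 0.4)]` and `r_true = 0.2`, two intervals cover the true value, one lies entirely above it, and one lies entirely below it, so the coverage is 0.5 and both error rates are 0.25.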


Note that for the smallest sample size, that is, when $n_x = 5$ (or $n_y = 5$), a practical problem arises, as the sample values of $X$ (or $Y$) may all be 0; in this case, the variance estimate $\hat{V}(\hat{R})$ cannot be computed and a standard approximate CI cannot be built. We decided to discard these samples from the 5000 MC samples planned for the simulation study and to compute the quantities of interest only on the "feasible" samples. Indeed, the rate of such "nonfeasible" samples is in any case very low under each scenario. For the worst ones, characterized by $R = 0.1$ and $n_y = 5$, the theoretical rate of "nonfeasible" samples is about 5%.

Since results for the coverage rates of some CIs are often close to the nominal level 0.95, we performed a test in order to check if the actual coverage rate is significantly different from 0.95, that is, to state if such CIs are conservative or liberal. The null hypothesis is that the true rate $\eta$ of CIs that do not cover the real value $R$ is equal to $\alpha = 0.05$, $H_0: \eta = \alpha = 0.05$, whereas the alternative hypothesis is that the rate $\eta$ is different from $\alpha$, $H_1: \eta \neq \alpha$. We employ the test suggested in [28, pages 518-519], which is based on the statistic

$$\mathcal{F} = \frac{nR + 1}{nS - nR}\cdot\frac{1-\alpha}{\alpha}, \tag{10}$$

where $nR$ is the number of rejections of $H_0$ in $nS$ (5000 in our case, unless there are some nonfeasible samples) iterations of a MC simulation plan. Under $H_0$, $\mathcal{F}$ follows an $F$ distribution with $\nu_1 = 2(nS - nR)$ and $\nu_2 = 2(nR + 1)$ degrees of freedom. The test rejects $H_0$ at level $\gamma$ if either $\mathcal{F} \le \mathcal{F}_{\gamma/2;\nu_1,\nu_2}$ or $\mathcal{F} \ge \mathcal{F}_{1-\gamma/2;\nu_1,\nu_2}$, where $\mathcal{F}_{\gamma;\nu_1,\nu_2}$ denotes the $\gamma$ percentile point of an $F$ distribution with $\nu_1$ and $\nu_2$ degrees of freedom. The test was performed on each scenario, for each kind of CI, at a level $\gamma = 1\%$.
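The decision rule based on (10) can be sketched without $F$ quantile tables by using a standard identity between the $F$, beta, and binomial distributions: with $\nu_1 = 2(nS - nR)$ and $\nu_2 = 2(nR + 1)$, the probability that an $F_{\nu_1,\nu_2}$ variable does not exceed the observed statistic equals $P(B \le nR)$, where $B$ is binomial with $nS$ trials and success probability $\alpha$. The code below relies on this equivalence, which is a substitution made here for convenience and not the way the test is presented in [28]; the function names are hypothetical, and $nR < nS$ is assumed.

```python
import math


def f_statistic(n_rej, n_sim, alpha=0.05):
    """Test statistic (10); requires n_rej < n_sim."""
    return (n_rej + 1) / (n_sim - n_rej) * (1.0 - alpha) / alpha


def binom_cdf(k, n, p):
    """P(B <= k) for B ~ Binomial(n, p), summed on the log scale."""
    log_p, log_q = math.log(p), math.log1p(-p)
    total = 0.0
    for j in range(k + 1):
        log_term = (math.lgamma(n + 1) - math.lgamma(j + 1)
                    - math.lgamma(n - j + 1) + j * log_p + (n - j) * log_q)
        total += math.exp(log_term)
    return min(total, 1.0)


def coverage_differs(n_rej, n_sim, alpha=0.05, gamma=0.01):
    """True if H0 (noncoverage rate = alpha) is rejected at level gamma."""
    # Equals P(F_{nu1,nu2} <= observed statistic) by the F/beta/binomial identity.
    lower_tail = binom_cdf(n_rej, n_sim, alpha)
    return lower_tail <= gamma / 2.0 or lower_tail >= 1.0 - gamma / 2.0
```

With $nS = 5000$ and $\alpha = 0.05$, a CI that misses the true value 250 times (coverage 95%) is not flagged, whereas one that misses it 503 times (coverage 89.94%) is. Small values of $\mathcal{F}$ correspond to conservative intervals and large values to liberal ones.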

4.2. Results and Discussion. The simulation results are (partially) reported in Table 3 ($\lambda_x = 2$, varying $R$ and the common size $n_x = n_y$). The results about the CIs (8) and (9) are not reported here, as their performances are overall unsatisfactory. Even if theoretically appealing, the procedure leading to the CI in (8) practically fails: some of the scenarios of the simulation plan were considered, and the CI built following this alternative procedure never provides satisfactory results. For the smallest sample size ($n = 5$), the coverage proves to be larger than that provided by the AN CI but still smaller than the nominal one; for the larger sample sizes ($n = 10, 20, 50$), the coverage rate dramatically decreases (even to values as low as 60%). Paradoxically, the discrete nature of the Poisson variable, and thus the quality of the normal approximation of step (1), affects results to a more relevant extent as the sample size increases. As to the bootstrap procedure yielding the CI in (9), this solution is computationally cumbersome (it becomes even more time consuming if bias-corrected accelerated CIs [27, pages 184–188] have to be calculated), and it provides barely satisfactory results, as shown by a preliminary simulation study, not reported here, which confirms the findings reported in [25] for parametric bootstrap inference on the reliability parameter in the bivariate normal case. The rejection of $H_0$ based on the test statistic (10) described in the previous subsection is indicated by a "*" near the actual coverage rate of each CI. Figures 2 and 3 graphically display the values of lower and upper uncoverage rates for the five interval estimators, varying $R$ in {0.5, 0.6, 0.7, 0.8, 0.9, 0.95}, for $n_x = n_y = 5$ and $n_x = n_y = 50$, respectively. These results comprise only the equal sample size scenarios; those for unequal sample size scenarios do not add much value to the general discussion we are going to outline.

(i) First, the improvement provided by the four transformations to the standard naive CI of $R$ in terms of coverage is evident. This improvement is considerable for a small sample size and extreme values of $R$ (say 0.95), when the coverage rate of the standard naive CI tends to decrease dramatically, even below 90%. Note that, for the sample sizes 5, 10, and 20, the actual coverage rate of the standard naive CI is always significantly different from (smaller than) the nominal level. Under some scenarios, namely, for $R$ = 0.3–0.7, the increase in coverage rate is also accompanied by a reduction in the CI average length. On the contrary, for the other values of $R$, there is an increase in the average width of the modified CIs, which is much more apparent for the logit transformation.

(ii) Examining Table 3 closely, one can note that the CI based on the arcsine root transformation is, except for one scenario, uniformly worse in terms of actual coverage than the other three CIs based on logit, probit, and complementary log-log transformations. For small sample sizes ($n_x = 5, 10$), the coverage rate actually provided is significantly different from (smaller than) the nominal one. It can also be claimed that this unsatisfactory performance is due to its inability to symmetrize the distribution, as can be seen by glancing at the lower and upper uncoverage rates: for values of $R$ getting close to 1, there is an apparent undercoverage effect on the left side (just a bit smaller than that of the "original" standard approximate CI).

(iii) Logit, probit, and complementary log-log transformed CIs have overall a satisfactory performance in terms of closeness to the nominal confidence level. Logit and probit CIs exhibit a similar behavior, as both the lower and upper uncoverage rates they provide are close to the nominal value (2.5%). On the contrary, the complementary log-log function (as well as arcsine root) often produces a larger undercoverage on one side (here left), which is partially balanced by an overcoverage on the other side (here right). This is clear evidence of the higher symmetrizing capability of the logit and probit transformations. However, taking into account the results of the hypothesis test based on the statistic (10), the probit and complementary log-log functions are those that overall perform best: the statistical hypothesis that their actual coverage rate is equal to the nominal one is always accepted, except for one case ($R = 0.1$ and $n = 5$), whereas the same hypothesis is sometimes rejected (10 times out of 40) for the logit function (which tends to produce significantly larger coverage for small sample sizes).

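Table 3: Simulation results ($\lambda_x = 2$).

| $R$ | $n_x = n_y$ | AN cov | AN LE | AN UE | Logit cov | Logit LE | Logit UE | Probit cov | Probit LE | Probit UE | Arcsin cov | Arcsin LE | Arcsin UE | Loglog cov | Loglog LE | Loglog UE | Width AN | Width Logit | Width Probit | Width Arcsin | Width Loglog |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.1 | 5 | 89.94* | 0.34 | 9.72 | 97.17* | 2.66 | 0.17 | 96.73* | 2.07 | 1.20 | 94.35 | 1.12 | 4.53 | 96.20* | 3.63 | 0.17 | 0.2836 | 0.3835 | 0.3491 | 0.3152 | 0.4204 |
| 0.1 | 10 | 88.49* | 0.40 | 11.11 | 96.23* | 2.51 | 1.26 | 95.57 | 1.85 | 2.59 | 92.52* | 1.06 | 6.42 | 95.65 | 3.23 | 1.12 | 0.2153 | 0.2501 | 0.2354 | 0.2214 | 0.2622 |
| 0.1 | 20 | 92.26* | 0.36 | 7.38 | 95.20 | 3.06 | 1.74 | 94.94 | 2.32 | 2.74 | 93.62* | 1.44 | 4.94 | 95.12 | 3.26 | 1.62 | 0.1620 | 0.1709 | 0.1653 | 0.1600 | 0.1750 |
| 0.1 | 50 | 93.34* | 0.94 | 5.72 | 95.44 | 2.32 | 2.24 | 95.28 | 1.98 | 2.74 | 94.50 | 1.56 | 3.94 | 95.48 | 2.54 | 1.98 | 0.1020 | 0.1044 | 0.1029 | 0.1015 | 0.1054 |
| 0.2 | 5 | 87.51* | 1.66 | 10.83 | 96.01* | 3.11 | 0.88 | 95.11 | 2.53 | 2.37 | 91.66* | 1.66 | 6.68 | 95.19 | 4.29 | 0.52 | 0.4360 | 0.4717 | 0.4553 | 0.4425 | 0.5123 |
| 0.2 | 10 | 90.68* | 1.12 | 8.20 | 95.30 | 2.68 | 2.02 | 94.72 | 2.22 | 3.06 | 93.14* | 1.74 | 5.12 | 94.86 | 3.64 | 1.50 | 0.3395 | 0.3429 | 0.3358 | 0.3312 | 0.3597 |
| 0.2 | 20 | 93.20* | 1.36 | 5.44 | 95.22 | 2.66 | 2.12 | 95.22 | 2.18 | 2.60 | 94.32 | 1.80 | 3.88 | 94.90 | 3.22 | 1.88 | 0.2469 | 0.2473 | 0.2444 | 0.2428 | 0.2537 |
| 0.2 | 50 | 93.96* | 1.38 | 4.66 | 95.42 | 2.48 | 2.10 | 95.16 | 2.20 | 2.64 | 94.78 | 1.96 | 3.26 | 95.20 | 2.76 | 2.04 | 0.1569 | 0.1572 | 0.1563 | 0.1559 | 0.1588 |
| 0.3 | 5 | 89.11* | 2.50 | 8.39 | 96.42* | 2.24 | 1.34 | 94.72 | 2.24 | 3.04 | 92.22* | 2.24 | 5.54 | 94.72 | 4.48 | 0.80 | 0.5440 | 0.5284 | 0.5245 | 0.5264 | 0.5664 |
| 0.3 | 10 | 91.10* | 1.92 | 6.98 | 95.40 | 2.48 | 2.12 | 94.90 | 2.06 | 3.04 | 93.70* | 1.92 | 4.38 | 94.28 | 4.12 | 1.60 | 0.4149 | 0.3991 | 0.3981 | 0.4002 | 0.4166 |
| 0.3 | 20 | 93.18* | 1.82 | 5.00 | 95.34 | 2.56 | 2.10 | 95.04 | 2.36 | 2.60 | 93.98* | 2.32 | 3.70 | 94.96 | 3.48 | 1.56 | 0.3012 | 0.2947 | 0.2945 | 0.2956 | 0.3018 |
| 0.3 | 50 | 94.38 | 1.82 | 3.80 | 95.46 | 2.42 | 2.12 | 95.32 | 2.20 | 2.48 | 95.10 | 1.96 | 2.94 | 95.40 | 2.80 | 1.80 | 0.1927 | 0.1909 | 0.1909 | 0.1912 | 0.1929 |
| 0.4 | 5 | 88.98* | 3.86 | 7.16 | 96.52* | 2.10 | 1.38 | 94.48 | 2.32 | 3.20 | 92.40* | 2.90 | 4.70 | 95.04 | 4.14 | 0.82 | 0.6068 | 0.5592 | 0.5627 | 0.5733 | 0.5891 |
| 0.4 | 10 | 92.14* | 2.70 | 5.16 | 95.86* | 2.08 | 2.06 | 94.78 | 2.28 | 2.94 | 93.72* | 2.46 | 3.82 | 94.98 | 3.82 | 1.62 | 0.4572 | 0.4305 | 0.4335 | 0.4397 | 0.4455 |
| 0.4 | 20 | 93.64* | 2.60 | 3.76 | 95.58 | 2.44 | 1.98 | 95.22 | 2.44 | 2.34 | 94.70 | 2.48 | 2.82 | 95.14 | 3.32 | 1.54 | 0.3319 | 0.3210 | 0.3226 | 0.3254 | 0.3274 |
| 0.4 | 50 | 94.72 | 2.18 | 3.10 | 95.48 | 2.20 | 2.32 | 95.28 | 2.20 | 2.52 | 95.06 | 2.20 | 2.74 | 95.64 | 2.52 | 1.84 | 0.2128 | 0.2098 | 0.2103 | 0.2111 | 0.2116 |
| 0.5 | 5 | 89.32* | 5.04 | 5.64 | 96.72* | 1.56 | 1.72 | 95.04 | 2.38 | 2.58 | 92.78* | 3.48 | 3.74 | 94.90 | 4.34 | 0.76 | 0.6285 | 0.5691 | 0.5752 | 0.5889 | 0.5864 |
| 0.5 | 10 | 91.98* | 3.82 | 4.20 | 95.54 | 2.04 | 2.42 | 94.94 | 2.36 | 2.70 | 93.84* | 2.78 | 3.38 | 94.88 | 3.62 | 1.50 | 0.4709 | 0.4406 | 0.4450 | 0.4526 | 0.4501 |
| 0.5 | 20 | 93.50* | 3.32 | 3.18 | 95.36 | 2.30 | 2.34 | 94.80 | 2.54 | 2.66 | 94.64 | 2.62 | 2.74 | 94.76 | 3.50 | 1.74 | 0.3413 | 0.3290 | 0.3312 | 0.3345 | 0.3332 |
| 0.5 | 50 | 94.62 | 2.46 | 2.92 | 95.38 | 2.18 | 2.44 | 95.16 | 2.32 | 2.52 | 94.96 | 2.34 | 2.70 | 95.42 | 2.72 | 1.86 | 0.2191 | 0.2157 | 0.2163 | 0.2173 | 0.2169 |
| 0.6 | 5 | 89.44* | 6.36 | 4.20 | 96.52* | 1.44 | 2.04 | 94.70 | 2.50 | 2.80 | 92.66* | 4.04 | 3.30 | 94.78 | 4.28 | 0.94 | 0.6095 | 0.5587 | 0.5628 | 0.5741 | 0.5601 |
| 0.6 | 10 | 91.78* | 4.84 | 3.38 | 95.52 | 1.78 | 2.70 | 94.88 | 2.32 | 2.80 | 93.92* | 3.06 | 3.02 | 94.92 | 3.48 | 1.60 | 0.4576 | 0.4305 | 0.4337 | 0.4401 | 0.4324 |
| 0.6 | 20 | 93.40* | 4.02 | 2.58 | 95.44 | 2.02 | 2.54 | 95.12 | 2.34 | 2.54 | 94.46 | 2.98 | 2.56 | 95.00 | 3.34 | 1.66 | 0.3304 | 0.3197 | 0.3212 | 0.3240 | 0.3206 |
| 0.6 | 50 | 95.06 | 2.68 | 2.26 | 95.54 | 1.98 | 2.48 | 95.38 | 2.14 | 2.48 | 95.24 | 2.28 | 2.48 | 95.58 | 2.58 | 1.84 | 0.2120 | 0.2090 | 0.2095 | 0.2103 | 0.2093 |
| 0.7 | 5 | 89.52* | 7.54 | 2.94 | 96.06* | 1.48 | 2.46 | 95.04 | 2.50 | 2.46 | 92.56* | 4.76 | 2.68 | 94.54 | 4.54 | 0.92 | 0.5490 | 0.5266 | 0.5242 | 0.5279 | 0.5094 |
| 0.7 | 10 | 91.70* | 5.82 | 2.48 | 95.32 | 1.64 | 3.04 | 95.02 | 2.22 | 2.76 | 93.78* | 3.66 | 2.56 | 94.92 | 3.44 | 1.64 | 0.4162 | 0.3991 | 0.3987 | 0.4014 | 0.3919 |
| 0.7 | 20 | 93.28* | 4.68 | 2.04 | 95.62 | 1.90 | 2.48 | 95.18 | 2.34 | 2.48 | 94.44 | 3.26 | 2.30 | 95.30 | 3.26 | 1.44 | 0.2984 | 0.2920 | 0.2918 | 0.2930 | 0.2889 |
| 0.7 | 50 | 94.74 | 3.24 | 2.02 | 95.48 | 1.88 | 2.64 | 95.40 | 2.16 | 2.44 | 95.24 | 2.52 | 2.24 | 95.38 | 2.64 | 1.98 | 0.1910 | 0.1893 | 0.1893 | 0.1896 | 0.1885 |
| 0.8 | 5 | 88.94* | 9.60 | 1.46 | 96.00* | 1.24 | 2.76 | 94.92 | 2.32 | 2.76 | 92.54* | 5.78 | 1.68 | 95.00 | 4.18 | 0.82 | 0.4419 | 0.4657 | 0.4526 | 0.4436 | 0.4289 |
| 0.8 | 10 | 91.66* | 7.02 | 1.32 | 95.18 | 1.56 | 3.26 | 94.96 | 2.34 | 2.70 | 93.70* | 4.18 | 2.12 | 95.22 | 3.36 | 1.42 | 0.3411 | 0.3408 | 0.3349 | 0.3315 | 0.3242 |
| 0.8 | 20 | 93.22* | 5.64 | 1.14 | 95.22 | 1.88 | 2.90 | 94.94 | 2.52 | 2.54 | 94.32 | 3.64 | 2.04 | 95.00 | 3.34 | 1.66 | 0.2424 | 0.2425 | 0.2399 | 0.2386 | 0.2356 |
| 0.8 | 50 | 94.60 | 3.94 | 1.46 | 95.34 | 1.94 | 2.72 | 95.20 | 2.32 | 2.48 | 95.04 | 2.86 | 2.10 | 95.40 | 2.70 | 1.90 | 0.1543 | 0.1544 | 0.1537 | 0.1533 | 0.1525 |
| 0.9 | 5 | 88.32* | 11.14 | 0.54 | 95.28 | 1.14 | 3.58 | 95.04 | 2.58 | 2.38 | 91.96* | 6.76 | 1.28 | 94.70 | 4.38 | 0.92 | 0.2799 | 0.3557 | 0.3295 | 0.3043 | 0.3029 |
| 0.9 | 10 | 90.80* | 8.50 | 0.70 | 95.12 | 1.44 | 3.44 | 94.76 | 2.52 | 2.72 | 93.40* | 4.98 | 1.62 | 95.08 | 3.30 | 1.62 | 0.2175 | 0.2410 | 0.2294 | 0.2187 | 0.2181 |
| 0.9 | 20 | 92.48* | 6.92 | 0.60 | 95.24 | 1.74 | 3.02 | 95.12 | 2.36 | 2.52 | 94.22* | 4.28 | 1.50 | 95.20 | 3.20 | 1.60 | 0.1550 | 0.1621 | 0.1574 | 0.1532 | 0.1531 |
| 0.9 | 50 | 94.52 | 4.62 | 0.86 | 95.10 | 1.98 | 2.92 | 95.30 | 2.30 | 2.40 | 95.06 | 3.22 | 1.72 | 95.40 | 2.62 | 1.98 | 0.0976 | 0.0996 | 0.0983 | 0.0971 | 0.0971 |
| 0.95 | 5 | 86.86* | 12.96 | 0.18 | 95.16 | 1.12 | 3.72 | 95.10 | 2.62 | 2.28 | 91.52* | 7.72 | 0.76 | 95.02 | 4.12 | 0.86 | 0.1691 | 0.2593 | 0.2289 | 0.1973 | 0.2058 |
| 0.95 | 10 | 89.90* | 9.84 | 0.26 | 94.80 | 1.48 | 3.72 | 94.98 | 2.50 | 2.52 | 92.62* | 6.28 | 1.10 | 94.84 | 3.46 | 1.70 | 0.1277 | 0.1611 | 0.1489 | 0.1359 | 0.1398 |
| 0.95 | 20 | 92.30* | 7.50 | 0.20 | 95.22 | 1.74 | 3.04 | 95.12 | 2.48 | 2.40 | 94.02* | 4.76 | 1.22 | 94.92 | 3.32 | 1.76 | 0.0931 | 0.1021 | 0.0975 | 0.0925 | 0.0941 |
| 0.95 | 50 | 94.20* | 5.32 | 0.48 | 95.30 | 1.88 | 2.82 | 95.24 | 2.40 | 2.36 | 95.08 | 3.46 | 1.46 | 95.32 | 2.78 | 1.90 | 0.0579 | 0.0602 | 0.0590 | 0.0577 | 0.0582 |

Legend: "cov" = actual coverage rate; "LE" = lower error rate; "UE" = upper error rate (all rates in %); "*" = rejection of $H_0$.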

Figure 2: Lower and upper uncoverage rates ($\lambda_x = 2$, $R \ge 0.5$, $n_x = n_y = 5$). [Figure not reproduced: six panels, for $R$ = 0.5, 0.6, 0.7, 0.8, 0.9, and 0.95, each showing the lower and upper uncoverage rates of the AN, Logit, Probit, Arcsin, and Log-log CIs.]

(iv) Differences in behavior among the four "transformed" CIs tend to become more apparent for small sample sizes and for extreme values of $R$ (close to 0 or 1), as can be seen looking at the bottom right graph of Figure 2. Clearly, the same differences tend to vanish as the sample sizes increase and $R$ tends to 0.5 (see the top left graph in Figure 3).

(v) Contrary to what one would predict, the performances in terms of coverage rate of logit, probit, and complementary log-log-based CIs do not seem to be negatively affected by sample size; in fact, as underlined before, a (sometimes significant) increase of the coverage rate of the logit CIs is noticed when the sample size is small ($n_x = 5$), whereas the other two CIs show an almost constant trend. The CI exploiting the arcsine root transformation is the one taking most advantage from the increase in sample size. Obviously, there is a decrease in the average length of the CIs moving to larger sample sizes.

(vi) As one could expect, there is a symmetry in the behavior of each CI when considering two complementary values of $R$ (i.e., values of $R$ summing to 1): keeping the sample size fixed, the values of the coverage rate and average width are similar. As to the uncoverage rates, they are nearly exchangeable for AN, logit, probit, and arcsine CIs; that is, a lower uncoverage error for a fixed value of $R$ is similar to the upper uncoverage error for its complementary value (or at least their relative magnitude is exchangeable). This feature does not hold for log-log CIs, which always present a left-side uncoverage error larger than that of the right side. These features are weakened for $R$ equal to 0.9 (and $1 - R$ then equal to 0.1), probably because of the presence of "nonfeasible" samples, as discussed in Section 4.1, which distorts this symmetry condition.

(vii) Finally (the related results are not reported here for the sake of brevity), moving to larger values (i.e., larger than 2) of $\lambda_x$, keeping $R$ constant, seems to bring benefit to the performance of all the CIs, as can be quantitatively noted by the reduced number of rejections of the null hypothesis of equivalence between actual and theoretical coverage. This may be explained by the fact that, for a large parameter $\lambda$, the Poisson distribution tends to a normal distribution; therefore, larger values of $\lambda_x$ imply a better normal approximation to the Poisson and presumably to the MLE.

In Figure 4, for illustrative purposes, the MC distribution of $\hat{R}$ and of the transformed data according to the four transformations are displayed ($R = 0.8$ and $n_x = n_y = 5$). We can note at a glance that logit and probit produce distributions closer to normality than the arcsine root and complementary log-log functions, which do not seem completely able to "correct" the skewness of the original distribution. However, implementing the Shapiro-Wilk test of normality on the four transformed distributions leads to very low $P$-values, practically equal to 0, except for the probit function, whose $P$-value is 0.03132, thus showing that in this case all the transformed data distributions are still far from normality.

Figure 3: Lower and upper uncoverage rates ($\lambda_x = 2$, $R \ge 0.5$, $n_x = n_y = 50$). [Figure not reproduced: six panels, for $R$ = 0.5, 0.6, 0.7, 0.8, 0.9, and 0.95, each showing the lower and upper uncoverage rates of the AN, Logit, Probit, Arcsin, and Log-log CIs.]

Figure 4: Histogram of the MC empirical distribution of $\hat{R}$ and $\hat{\theta}$ according to the transformations of Table 2 ($\lambda_x = 2$, $R = 0.8$, $n_x = n_y = 5$). Superimposed, the density function of the normal distribution with mean and standard deviation equal to the mean and standard deviation of the data. Note that neither the $y$-axis nor the $x$-axis shares the same scale in the different plots. [Figure not reproduced: five panels, labeled R, Logit(R), Probit(R), Arcsin(R), and log-log(R).]

5. An Application

The application we illustrate here is based on the data described in [29] and already used in [7], to which we refer the reader for full details. On these data, we build the four 95% CIs based on the transformations of Table 2, along with the standard one. The results (lower and upper bounds and length of the CIs) are reported in Table 4. Although the five CIs are not much different from each other, we can note that all four transformation-based intervals have both lower and upper bounds smaller than the corresponding bounds for the standard naive interval (i.e., they are left shifted with respect to it); the logit transformation yields an interval a bit wider than the standard one, whereas the other three transformations produce a slight decrease in its length.

Table 4: Results for the application data.

| Type | $R_L$ | $R_U$ | Width |
|---|---|---|---|
| AN | 0.7452 | 0.9232 | 0.1780 |
| Logit | 0.7256 | 0.9054 | 0.1799 |
| Probit | 0.7302 | 0.9080 | 0.1777 |
| Arcsine | 0.7365 | 0.9128 | 0.1763 |
| Log-log | 0.7363 | 0.9113 | 0.1750 |

6. Conclusions

This work provided an empirical analysis of the convenience of data transformations for estimating the reliability parameter in stress-strength models. We focused on the model involving two independent Poisson random variables, since this is a case where the distribution of the MLE is not easily derivable and exact CIs cannot be built; thus, transformation of the estimates is a viable tool to refine the standard naive interval estimator derived by the central limit theorem. Four transformations were considered (logit, probit, arcsine root, and complementary log-log) and empirically assessed under a number of scenarios in terms of the coverage and average length of the CI they produced. Results are in favor of the logit, probit, and complementary log-log functions (although to varying degrees), which ensure a coverage rate close to the nominal coverage even with small sample sizes. On the contrary, the arcsine root function, although improving the performance of the standard CI, often keeps its coverage rates under the nominal confidence level. These findings were to some extent predictable, since logit and probit transformations are very popular link functions in generalized regression models, but are a bit surprising with regard to the complementary log-log function, whose use is limited.

Future research will ascertain if such results hold true for other distributions for the stress-strength model and will eventually inspect alternative data transformations.

Conflict of Interests

The author declares that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

The author would like to thank the editor and two anonymous referees, whose suggestions and comments improved the quality of the original version of the paper.

References

[1] J L Fleiss B Levin andM C Paik Statistical Methods for Ratesand Proportions John Wiley amp Sons Hoboken NJ USA 3rdedition 2003

[2] A Agresti and B A Coull ldquoApproximate is better than ldquoexactrdquofor interval estimation of binomial proportionsrdquoThe AmericanStatistician vol 52 no 2 pp 119ndash126 1998

[3] L D Brown T T Cai and A Dasgupta ldquoConfidence intervalsfor a binomial proportion and asymptotic expansionsrdquo TheAnnals of Statistics vol 30 no 1 pp 160ndash201 2002

[4] A M Pires and C Amado ldquoInterval estimators for a binomialproportion comparison of twenty methodsrdquo REVSTAT Statis-tical Journal vol 6 no 2 pp 165ndash197 2008

[5] B V Gnedenko and I A Ushakov Probabilistic Reliability Engi-neering John Wiley amp Sons New York NY USA 1995

[6] S Kotz Y Lumelskii and M PenskyThe Stress-strength Modeland Its Generalizations Theory and Applications World Scien-tific New York NY USA 2003

[7] A Barbiero ldquoInference for reliability of Poisson datardquo Journal ofQuality and Reliability Engineering vol 2013 Article ID 5305308 pages 2013

[8] G Blom ldquoTransformations of the binomial negative binomialpoisson and X2 distributionsrdquo Biometrika vol 41 pp 302ndash3161954

[9] D C Montgomery Design and Analysis of Experiment JohnWiley amp Sons New York NY USA 6th edition 2005

[10] Y P Chaubey and G S Mudholkar ldquoOn the symmetrizingtransformations of random variablesrdquo Concordia Universityunpublished preprint Mathematics and Statistics 1983

[11] M. H. Hoyle, "Transformations: an introduction and a bibliography," International Statistical Review, vol. 41, no. 2, pp. 203–223, 1973.

[12] A. Foi, "Optimization of variance-stabilizing transformations," submitted.

[13] W. Greene, Econometric Analysis, Prentice Hall, New York, NY, USA, 7th edition, 2012.

[14] A. Dasgupta, Asymptotic Theory of Statistics and Probability, Springer, New York, NY, USA, 2008.

[15] L. Brown, T. Cai, R. Zhang, L. Zhao, and H. Zhou, "The root-unroot algorithm for density estimation as implemented via wavelet block thresholding," Probability Theory and Related Fields, vol. 146, no. 3-4, pp. 401–433, 2010.

[16] J. H. Aldrich and F. D. Nelson, Linear Probability, Logit, and Probit Models, Sage, Beverly Hills, Calif, USA, 1984.

[17] A. Baklizi, "Inference on Pr(X < Y) in the two-parameter Weibull model based on records," ISRN Probability and Statistics, vol. 2012, Article ID 263612, 11 pages, 2012.

[18] K. Krishnamoorthy and Y. Lin, "Confidence limits for stress-strength reliability involving Weibull models," Journal of Statistical Planning and Inference, vol. 140, no. 7, pp. 1754–1764, 2010.

[19] S. P. Mukherjee and S. S. Maiti, "Stress-strength reliability in the Weibull case," in Frontiers in Reliability, vol. 4, pp. 231–248, World Scientific, Singapore, 1998.

[20] D. K. Al-Mutairi, M. E. Ghitany, and D. Kundu, "Inferences on stress-strength reliability from Lindley distributions," Communications in Statistics: Theory and Methods, vol. 42, no. 8, pp. 1443–1463, 2013.

[21] M. M. Shoukri, M. A. Chaudhary, and A. Al-Halees, "Estimating P(Y < X) when X and Y are paired exponential variables," Journal of Statistical Computation and Simulation, vol. 75, no. 1, pp. 25–38, 2005.

[22] A. Baklizi and O. Eidous, "Nonparametric estimation of P(X < Y) using kernel methods," Metron, vol. 64, no. 1, pp. 47–60, 2006.

[23] G. Qin and L. Hotilovac, "Comparison of non-parametric confidence intervals for the area under the ROC curve of a continuous-scale diagnostic test," Statistical Methods in Medical Research, vol. 17, no. 2, pp. 207–221, 2008.

[24] W. H. Ahrens, "Use of the arcsine and square root transformations for subjectively determined percentage data," Weed Science, vol. 38, no. 4-5, pp. 452–458, 1990.

[25] A. Barbiero, "Interval estimators for reliability: the bivariate normal case," Journal of Applied Statistics, vol. 39, no. 3, pp. 501–512, 2012.

[26] A. Barbiero, "Confidence intervals for reliability of stress-strength models in the normal case," Communications in Statistics: Simulation and Computation, vol. 40, no. 6, pp. 907–925, 2011.

10 Advances in Decision Sciences

[27] B. Efron and R. Tibshirani, An Introduction to the Bootstrap, Chapman and Hall, London, UK, 1993.

[28] I. R. Cardoso de Oliveira and D. F. Ferreira, "Multivariate extension of chi-squared univariate normality test," Journal of Statistical Computation and Simulation, vol. 80, no. 5-6, pp. 513–526, 2010.

[29] P. C. Wang, "Comparisons on the analysis of Poisson data," Quality and Reliability Engineering International, vol. 15, pp. 379–383, 1999.


Note that for the smallest sample size, that is, when $n_x = 5$ (or $n_y = 5$), a practical problem arises, as the sample values of $X$ (or $Y$) may all be 0; in this case, the variance estimate $V(\hat{R})$ cannot be computed and a standard approximate CI cannot be built. We decided to discard these samples from the 5000 MC samples planned for the simulation study and to compute the quantities of interest only on the "feasible" samples. Indeed, the rate of such "nonfeasible" samples is in any case very low under each scenario. For the worst ones, characterized by $R = 0.1$ and $n_y = 5$, the theoretical rate of "nonfeasible" samples is about 5%.

Since results for the coverage rates of some CIs are often close to the nominal level 0.95, we performed a test in order to check whether the actual coverage rate is significantly different from 0.95, that is, to establish whether such CIs are conservative or liberal. The null hypothesis is that the true rate $\eta$ of CIs that do not cover the real value $R$ is equal to $\alpha = 0.05$, $H_0: \eta = \alpha = 0.05$, whereas the alternative hypothesis is that the rate $\eta$ is different from $\alpha$, $H_1: \eta \neq \alpha$. We employ the test suggested in [28, pages 518-519], which is based on the statistic

$$\mathcal{F} = \frac{nR + 1}{nS - nR} \cdot \frac{1 - \alpha}{\alpha}, \tag{10}$$

where $nR$ is the number of rejections of $H_0$ (here, the number of CIs that do not cover $R$) in $nS$ (5000 in our case, unless there are some nonfeasible samples) iterations of a MC simulation plan. Under $H_0$, $\mathcal{F}$ follows an $F$ distribution with $\nu_1 = 2(nS - nR)$ and $\nu_2 = 2(nR + 1)$ degrees of freedom. The test rejects $H_0$ at level $\gamma$ if either $\mathcal{F} \le F_{\gamma/2;\nu_1,\nu_2}$ or $\mathcal{F} \ge F_{1-\gamma/2;\nu_1,\nu_2}$, where $F_{\gamma;\nu_1,\nu_2}$ denotes the $\gamma$ percentile point of an $F$ distribution with $\nu_1$ and $\nu_2$ degrees of freedom. The test was performed on each scenario, for each kind of CI, at a level $\gamma = 1\%$.
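As a rough illustration of how the statistic in (10) can be evaluated, the following Python sketch computes it, together with the two degrees of freedom, from a count of non-covering intervals. The function name `coverage_test_statistic` and the example count of 250 misses are hypothetical and are not taken from the paper; the quantiles of the $F$ distribution needed for the decision rule are left to a statistical library.

```python
def coverage_test_statistic(n_r, n_s, alpha=0.05):
    """Return (F, nu1, nu2) for the coverage test based on statistic (10).

    n_r   : number of MC intervals that do not cover the true R
    n_s   : number of (feasible) MC iterations
    alpha : nominal non-coverage rate under the null hypothesis
    """
    if not 0 <= n_r < n_s:
        raise ValueError("need 0 <= n_r < n_s")
    f_stat = (n_r + 1) / (n_s - n_r) * (1 - alpha) / alpha
    nu1 = 2 * (n_s - n_r)  # first degrees of freedom
    nu2 = 2 * (n_r + 1)    # second degrees of freedom
    return f_stat, nu1, nu2


# Hypothetical example: 250 non-covering intervals out of 5000 iterations.
f_stat, nu1, nu2 = coverage_test_statistic(250, 5000)
print(f_stat, nu1, nu2)
```

The returned value would then be compared with the $\gamma/2$ and $1 - \gamma/2$ percentile points of the $F$ distribution with the returned degrees of freedom, obtained for instance from SciPy's `scipy.stats.f.ppf` if that package is available.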

4.2. Results and Discussion. The simulation results are (partially) reported in Table 3 ($\lambda_x = 2$, varying $R$ and the common size $n_x = n_y$). The results about the CIs (8) and (9) are not reported here, as their performances are overall unsatisfactory. Even if theoretically appealing, the procedure leading to the CI in (8) practically fails: some of the scenarios of the simulation plan were considered, and the CI built following this alternative procedure never provides satisfactory results. For the smallest sample size ($n = 5$), the coverage proves to be larger than that provided by the AN CI but still smaller than the nominal one; for the larger sample sizes ($n = 10, 20, 50$), the coverage rate decreases dramatically (even to values as low as 60%). Paradoxically, the discrete nature of the Poisson variable, and thus the quality of the normal approximation of step (1), affects results to a greater extent as the sample size increases. As to the bootstrap procedure yielding the CI in (9), this solution is computationally cumbersome (it becomes even more time consuming if bias-corrected accelerated CIs [27, pages 184–188] have to be calculated), and it provides barely satisfactory results, as shown by a preliminary simulation study, not reported here, that confirms the findings reported in [25] for parametric bootstrap inference on the reliability parameter in the bivariate normal case. The rejection of $H_0$ based on the test statistic (10), described in the previous subsection, is indicated by a "*" near the actual coverage rate of each CI. Figures 2 and 3 graphically display the values of lower and upper uncoverage rates for the five interval estimators, varying $R$ in {0.5, 0.6, 0.7, 0.8, 0.9, 0.95}, for $n_x = n_y = 5$ and $n_x = n_y = 50$, respectively. These results comprise only the equal sample size scenarios; those for unequal sample size scenarios do not add much value to the general discussion we are going to outline.

(i) First, the improvement provided by the four transformations to the standard naïve CI of $R$ in terms of coverage is evident. This improvement is considerable for a small sample size and extreme values of $R$ (say 0.95), when the coverage rate of the standard naïve CI tends to decrease dramatically, even below 90%. Note that, for the sample sizes 5, 10, and 20, the actual coverage rate of the standard naïve CI is always significantly different from (smaller than) the nominal level. Under some scenarios, namely, for $R$ = 0.3–0.7, the increase in coverage rate is also accompanied by a reduction in the CI average length. On the contrary, for the other values of $R$, there is an increase in the average width of the modified CIs, which is much more apparent for the logit transformation.

(ii) Examining Table 3 closely, one can note that the CI based on the arcsine root transformation is, except for one scenario, uniformly worse in terms of actual coverage than the other three CIs based on logit, probit, and complementary log-log transformations. For small sample sizes ($n_x = 5, 10$), the coverage rate actually provided is significantly different from (smaller than) the nominal one. It can also be claimed that this unsatisfactory performance is due to its inability to symmetrize the distribution, as can be seen by glancing at the lower and upper uncoverage rates: for values of $R$ getting close to 1, there is an apparent undercoverage effect on the left side (just a bit smaller than that of the "original" standard approximate CI).

(iii) Logit, probit, and complementary log-log transformed CIs have overall a satisfactory performance in terms of closeness to the nominal confidence level. Logit and probit CIs exhibit a similar behavior, as both the lower and upper uncoverage rates they provide are close to the nominal value (2.5%). On the contrary, the complementary log-log function (as well as the arcsine root) often produces a larger undercoverage on one side (here left), which is partially balanced by an overcoverage on the other side (here right). This is clear evidence of the higher symmetrizing capability of the logit and probit transformations. However, taking into account the results of the hypothesis test based on the statistic (10), the probit and complementary log-log functions are those that overall perform best: the statistical hypothesis that their actual coverage rate is equal to the nominal one is never rejected, except for one case ($R = 0.1$ and $n = 5$), whereas the same hypothesis is sometimes rejected (10 times out of 40) for the logit function (which tends to produce significantly larger coverage for small sample sizes).


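Table 3: Simulation results ($\lambda_x = 2$). Coverage and error rates are percentages.

| $R$ | $n_x, n_y$ | AN cov | AN LE | AN UE | Logit cov | Logit LE | Logit UE | Probit cov | Probit LE | Probit UE | Arcsin cov | Arcsin LE | Arcsin UE | Log-log cov | Log-log LE | Log-log UE | AN width | Logit width | Probit width | Arcsin width | Log-log width |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.1 | 5, 5 | 89.94* | 0.34 | 9.72 | 97.17* | 2.66 | 0.17 | 96.73* | 2.07 | 1.20 | 94.35 | 1.12 | 4.53 | 96.20* | 3.63 | 0.17 | 0.2836 | 0.3835 | 0.3491 | 0.3152 | 0.4204 |
| 0.1 | 10, 10 | 88.49* | 0.40 | 11.11 | 96.23* | 2.51 | 1.26 | 95.57 | 1.85 | 2.59 | 92.52* | 1.06 | 6.42 | 95.65 | 3.23 | 1.12 | 0.2153 | 0.2501 | 0.2354 | 0.2214 | 0.2622 |
| 0.1 | 20, 20 | 92.26* | 0.36 | 7.38 | 95.20 | 3.06 | 1.74 | 94.94 | 2.32 | 2.74 | 93.62* | 1.44 | 4.94 | 95.12 | 3.26 | 1.62 | 0.1620 | 0.1709 | 0.1653 | 0.1600 | 0.1750 |
| 0.1 | 50, 50 | 93.34* | 0.94 | 5.72 | 95.44 | 2.32 | 2.24 | 95.28 | 1.98 | 2.74 | 94.50 | 1.56 | 3.94 | 95.48 | 2.54 | 1.98 | 0.1020 | 0.1044 | 0.1029 | 0.1015 | 0.1054 |
| 0.2 | 5, 5 | 87.51* | 1.66 | 10.83 | 96.01* | 3.11 | 0.88 | 95.11 | 2.53 | 2.37 | 91.66* | 1.66 | 6.68 | 95.19 | 4.29 | 0.52 | 0.4360 | 0.4717 | 0.4553 | 0.4425 | 0.5123 |
| 0.2 | 10, 10 | 90.68* | 1.12 | 8.20 | 95.30 | 2.68 | 2.02 | 94.72 | 2.22 | 3.06 | 93.14* | 1.74 | 5.12 | 94.86 | 3.64 | 1.50 | 0.3395 | 0.3429 | 0.3358 | 0.3312 | 0.3597 |
| 0.2 | 20, 20 | 93.20* | 1.36 | 5.44 | 95.22 | 2.66 | 2.12 | 95.22 | 2.18 | 2.60 | 94.32 | 1.80 | 3.88 | 94.90 | 3.22 | 1.88 | 0.2469 | 0.2473 | 0.2444 | 0.2428 | 0.2537 |
| 0.2 | 50, 50 | 93.96* | 1.38 | 4.66 | 95.42 | 2.48 | 2.10 | 95.16 | 2.20 | 2.64 | 94.78 | 1.96 | 3.26 | 95.20 | 2.76 | 2.04 | 0.1569 | 0.1572 | 0.1563 | 0.1559 | 0.1588 |
| 0.3 | 5, 5 | 89.11* | 2.50 | 8.39 | 96.42* | 2.24 | 1.34 | 94.72 | 2.24 | 3.04 | 92.22* | 2.24 | 5.54 | 94.72 | 4.48 | 0.80 | 0.5440 | 0.5284 | 0.5245 | 0.5264 | 0.5664 |
| 0.3 | 10, 10 | 91.10* | 1.92 | 6.98 | 95.40 | 2.48 | 2.12 | 94.90 | 2.06 | 3.04 | 93.70* | 1.92 | 4.38 | 94.28 | 4.12 | 1.60 | 0.4149 | 0.3991 | 0.3981 | 0.4002 | 0.4166 |
| 0.3 | 20, 20 | 93.18* | 1.82 | 5.00 | 95.34 | 2.56 | 2.10 | 95.04 | 2.36 | 2.60 | 93.98* | 2.32 | 3.70 | 94.96 | 3.48 | 1.56 | 0.3012 | 0.2947 | 0.2945 | 0.2956 | 0.3018 |
| 0.3 | 50, 50 | 94.38 | 1.82 | 3.80 | 95.46 | 2.42 | 2.12 | 95.32 | 2.20 | 2.48 | 95.10 | 1.96 | 2.94 | 95.40 | 2.80 | 1.80 | 0.1927 | 0.1909 | 0.1909 | 0.1912 | 0.1929 |
| 0.4 | 5, 5 | 88.98* | 3.86 | 7.16 | 96.52* | 2.10 | 1.38 | 94.48 | 2.32 | 3.20 | 92.40* | 2.90 | 4.70 | 95.04 | 4.14 | 0.82 | 0.6068 | 0.5592 | 0.5627 | 0.5733 | 0.5891 |
| 0.4 | 10, 10 | 92.14* | 2.70 | 5.16 | 95.86* | 2.08 | 2.06 | 94.78 | 2.28 | 2.94 | 93.72* | 2.46 | 3.82 | 94.98 | 3.82 | 1.62 | 0.4572 | 0.4305 | 0.4335 | 0.4397 | 0.4455 |
| 0.4 | 20, 20 | 93.64* | 2.60 | 3.76 | 95.58 | 2.44 | 1.98 | 95.22 | 2.44 | 2.34 | 94.70 | 2.48 | 2.82 | 95.14 | 3.32 | 1.54 | 0.3319 | 0.3210 | 0.3226 | 0.3254 | 0.3274 |
| 0.4 | 50, 50 | 94.72 | 2.18 | 3.10 | 95.48 | 2.20 | 2.32 | 95.28 | 2.20 | 2.52 | 95.06 | 2.20 | 2.74 | 95.64 | 2.52 | 1.84 | 0.2128 | 0.2098 | 0.2103 | 0.2111 | 0.2116 |
| 0.5 | 5, 5 | 89.32* | 5.04 | 5.64 | 96.72* | 1.56 | 1.72 | 95.04 | 2.38 | 2.58 | 92.78* | 3.48 | 3.74 | 94.90 | 4.34 | 0.76 | 0.6285 | 0.5691 | 0.5752 | 0.5889 | 0.5864 |
| 0.5 | 10, 10 | 91.98* | 3.82 | 4.20 | 95.54 | 2.04 | 2.42 | 94.94 | 2.36 | 2.70 | 93.84* | 2.78 | 3.38 | 94.88 | 3.62 | 1.50 | 0.4709 | 0.4406 | 0.4450 | 0.4526 | 0.4501 |
| 0.5 | 20, 20 | 93.50* | 3.32 | 3.18 | 95.36 | 2.30 | 2.34 | 94.80 | 2.54 | 2.66 | 94.64 | 2.62 | 2.74 | 94.76 | 3.50 | 1.74 | 0.3413 | 0.3290 | 0.3312 | 0.3345 | 0.3332 |
| 0.5 | 50, 50 | 94.62 | 2.46 | 2.92 | 95.38 | 2.18 | 2.44 | 95.16 | 2.32 | 2.52 | 94.96 | 2.34 | 2.70 | 95.42 | 2.72 | 1.86 | 0.2191 | 0.2157 | 0.2163 | 0.2173 | 0.2169 |
| 0.6 | 5, 5 | 89.44* | 6.36 | 4.20 | 96.52* | 1.44 | 2.04 | 94.70 | 2.50 | 2.80 | 92.66* | 4.04 | 3.30 | 94.78 | 4.28 | 0.94 | 0.6095 | 0.5587 | 0.5628 | 0.5741 | 0.5601 |
| 0.6 | 10, 10 | 91.78* | 4.84 | 3.38 | 95.52 | 1.78 | 2.70 | 94.88 | 2.32 | 2.80 | 93.92* | 3.06 | 3.02 | 94.92 | 3.48 | 1.60 | 0.4576 | 0.4305 | 0.4337 | 0.4401 | 0.4324 |
| 0.6 | 20, 20 | 93.40* | 4.02 | 2.58 | 95.44 | 2.02 | 2.54 | 95.12 | 2.34 | 2.54 | 94.46 | 2.98 | 2.56 | 95.00 | 3.34 | 1.66 | 0.3304 | 0.3197 | 0.3212 | 0.3240 | 0.3206 |
| 0.6 | 50, 50 | 95.06 | 2.68 | 2.26 | 95.54 | 1.98 | 2.48 | 95.38 | 2.14 | 2.48 | 95.24 | 2.28 | 2.48 | 95.58 | 2.58 | 1.84 | 0.2120 | 0.2090 | 0.2095 | 0.2103 | 0.2093 |
| 0.7 | 5, 5 | 89.52* | 7.54 | 2.94 | 96.06* | 1.48 | 2.46 | 95.04 | 2.50 | 2.46 | 92.56* | 4.76 | 2.68 | 94.54 | 4.54 | 0.92 | 0.5490 | 0.5266 | 0.5242 | 0.5279 | 0.5094 |
| 0.7 | 10, 10 | 91.70* | 5.82 | 2.48 | 95.32 | 1.64 | 3.04 | 95.02 | 2.22 | 2.76 | 93.78* | 3.66 | 2.56 | 94.92 | 3.44 | 1.64 | 0.4162 | 0.3991 | 0.3987 | 0.4014 | 0.3919 |
| 0.7 | 20, 20 | 93.28* | 4.68 | 2.04 | 95.62 | 1.90 | 2.48 | 95.18 | 2.34 | 2.48 | 94.44 | 3.26 | 2.30 | 95.30 | 3.26 | 1.44 | 0.2984 | 0.2920 | 0.2918 | 0.2930 | 0.2889 |
| 0.7 | 50, 50 | 94.74 | 3.24 | 2.02 | 95.48 | 1.88 | 2.64 | 95.40 | 2.16 | 2.44 | 95.24 | 2.52 | 2.24 | 95.38 | 2.64 | 1.98 | 0.1910 | 0.1893 | 0.1893 | 0.1896 | 0.1885 |
| 0.8 | 5, 5 | 88.94* | 9.60 | 1.46 | 96.00* | 1.24 | 2.76 | 94.92 | 2.32 | 2.76 | 92.54* | 5.78 | 1.68 | 95.00 | 4.18 | 0.82 | 0.4419 | 0.4657 | 0.4526 | 0.4436 | 0.4289 |
| 0.8 | 10, 10 | 91.66* | 7.02 | 1.32 | 95.18 | 1.56 | 3.26 | 94.96 | 2.34 | 2.70 | 93.70* | 4.18 | 2.12 | 95.22 | 3.36 | 1.42 | 0.3411 | 0.3408 | 0.3349 | 0.3315 | 0.3242 |
| 0.8 | 20, 20 | 93.22* | 5.64 | 1.14 | 95.22 | 1.88 | 2.90 | 94.94 | 2.52 | 2.54 | 94.32 | 3.64 | 2.04 | 95.00 | 3.34 | 1.66 | 0.2424 | 0.2425 | 0.2399 | 0.2386 | 0.2356 |
| 0.8 | 50, 50 | 94.60 | 3.94 | 1.46 | 95.34 | 1.94 | 2.72 | 95.20 | 2.32 | 2.48 | 95.04 | 2.86 | 2.10 | 95.40 | 2.70 | 1.90 | 0.1543 | 0.1544 | 0.1537 | 0.1533 | 0.1525 |
| 0.9 | 5, 5 | 88.32* | 11.14 | 0.54 | 95.28 | 1.14 | 3.58 | 95.04 | 2.58 | 2.38 | 91.96* | 6.76 | 1.28 | 94.70 | 4.38 | 0.92 | 0.2799 | 0.3557 | 0.3295 | 0.3043 | 0.3029 |
| 0.9 | 10, 10 | 90.80* | 8.50 | 0.70 | 95.12 | 1.44 | 3.44 | 94.76 | 2.52 | 2.72 | 93.40* | 4.98 | 1.62 | 95.08 | 3.30 | 1.62 | 0.2175 | 0.2410 | 0.2294 | 0.2187 | 0.2181 |
| 0.9 | 20, 20 | 92.48* | 6.92 | 0.60 | 95.24 | 1.74 | 3.02 | 95.12 | 2.36 | 2.52 | 94.22* | 4.28 | 1.50 | 95.20 | 3.20 | 1.60 | 0.1550 | 0.1621 | 0.1574 | 0.1532 | 0.1531 |
| 0.9 | 50, 50 | 94.52 | 4.62 | 0.86 | 95.10 | 1.98 | 2.92 | 95.30 | 2.30 | 2.40 | 95.06 | 3.22 | 1.72 | 95.40 | 2.62 | 1.98 | 0.0976 | 0.0996 | 0.0983 | 0.0971 | 0.0971 |
| 0.95 | 5, 5 | 86.86* | 12.96 | 0.18 | 95.16 | 1.12 | 3.72 | 95.10 | 2.62 | 2.28 | 91.52* | 7.72 | 0.76 | 95.02 | 4.12 | 0.86 | 0.1691 | 0.2593 | 0.2289 | 0.1973 | 0.2058 |
| 0.95 | 10, 10 | 89.90* | 9.84 | 0.26 | 94.80 | 1.48 | 3.72 | 94.98 | 2.50 | 2.52 | 92.62* | 6.28 | 1.10 | 94.84 | 3.46 | 1.70 | 0.1277 | 0.1611 | 0.1489 | 0.1359 | 0.1398 |
| 0.95 | 20, 20 | 92.30* | 7.50 | 0.20 | 95.22 | 1.74 | 3.04 | 95.12 | 2.48 | 2.40 | 94.02* | 4.76 | 1.22 | 94.92 | 3.32 | 1.76 | 0.0931 | 0.1021 | 0.0975 | 0.0925 | 0.0941 |
| 0.95 | 50, 50 | 94.20* | 5.32 | 0.48 | 95.30 | 1.88 | 2.82 | 95.24 | 2.40 | 2.36 | 95.08 | 3.46 | 1.46 | 95.32 | 2.78 | 1.90 | 0.0579 | 0.0602 | 0.0590 | 0.0577 | 0.0582 |

Legend: "cov" = actual coverage rate; "LE" = lower error rate; "UE" = upper error rate; "*" = rejection of $H_0$.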
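The columns of Table 3 can be tallied from the simulated intervals along the following lines. This is a minimal sketch: the function name `coverage_summary` is hypothetical, and it assumes that a "lower error" means the true $R$ falls below the lower bound of the interval and an "upper error" means it falls above the upper bound, a convention that the legend does not spell out.

```python
def coverage_summary(intervals, true_r):
    """Summarize MC confidence intervals given as (lower, upper) pairs.

    Returns coverage, lower error and upper error rates (in percent)
    and the average interval width.
    """
    n = len(intervals)
    # Assumed convention: lower error = true value below the lower bound.
    lower_err = sum(1 for lo, hi in intervals if true_r < lo)
    # Assumed convention: upper error = true value above the upper bound.
    upper_err = sum(1 for lo, hi in intervals if true_r > hi)
    covered = n - lower_err - upper_err
    return {
        "cov": 100.0 * covered / n,
        "LE": 100.0 * lower_err / n,
        "UE": 100.0 * upper_err / n,
        "width": sum(hi - lo for lo, hi in intervals) / n,
    }


# Tiny made-up example with four intervals and true R = 0.5.
summary = coverage_summary(
    [(0.10, 0.30), (0.40, 0.60), (0.50, 0.90), (0.55, 0.70)], 0.5
)
print(summary)
```

By construction the three rates add up to 100, which is also the pattern visible in most rows of Table 3.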

Figure 2: Lower and upper uncoverage rates for the AN, logit, probit, arcsine, and log-log CIs ($\lambda_x = 2$; $R \ge 0.5$; $n_x = n_y = 5$). Panels: $R$ = 0.5, 0.6, 0.7, 0.8, 0.9, and 0.95.

(iv) Differences in behavior among the four "transformed" CIs tend to become more apparent for small sample sizes and for extreme values of $R$ (close to 0 or 1), as can be seen by looking at the bottom right graph of Figure 2. Clearly, the same differences tend to vanish as the sample sizes increase and $R$ tends to 0.5 (see the top left graph in Figure 3).

(v) Contrary to what one would predict, the performances in terms of coverage rate of logit, probit, and complementary log-log-based CIs do not seem to be negatively affected by a small sample size; in fact, as underlined before, a (sometimes significant) increase of the coverage rate of the logit CIs is noticed when the sample size is small ($n_x = 5$), whereas the other two CIs show an almost constant trend. The CI exploiting the arcsine root transformation is the one taking most advantage from the increase in sample size. Obviously, the average length of the CIs decreases moving to larger sample sizes.

(vi) As one could expect, there is a symmetry in the behavior of each CI when considering two complementary values of $R$ (i.e., values of $R$ summing to 1): keeping the sample size fixed, the values of the coverage rate and average width are similar. As to the uncoverage rates, they are nearly exchangeable for AN, logit, probit, and arcsine CIs; that is, a lower uncoverage error for a fixed value of $R$ is similar to the upper uncoverage error for its complementary value (or at least their relative magnitude is exchangeable). This feature does not hold for log-log CIs, which always present a left-side uncoverage error larger than the right-side one. These features are weakened for $R$ equal to 0.9 (and $1 - R$ then equal to 0.1), probably because of the presence of "nonfeasible" samples, as discussed in Section 4.1, which distorts this symmetry condition.

(vii) Finally (the related results are not reported here for the sake of brevity), moving to larger values (i.e., larger than 2) of $\lambda_x$, keeping $R$ constant, seems to benefit the performance of all the CIs, as can be quantitatively noted from the reduced number of rejections of the null hypothesis of equivalence between actual and theoretical coverage. This may be explained by the fact that, for a large parameter $\lambda$, the Poisson distribution tends to a normal distribution; therefore, larger values of $\lambda_x$ imply a better normal approximation to the Poisson and presumably to the MLE.
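The role of the "nonfeasible" samples recalled in item (vi) can be made concrete with a small computation. The sketch below is only illustrative and rests on stated assumptions: it takes $R = P(X < Y)$ for independent Poisson variables, and it uses $\lambda_y = 0.6$, a hypothetical value chosen here because it gives $R$ close to 0.1 when $\lambda_x = 2$; neither the function names nor this value of $\lambda_y$ come from the paper.

```python
import math


def poisson_pmf(k, lam):
    """Probability that a Poisson(lam) variable equals k."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def reliability(lam_x, lam_y, k_max=60):
    """R = P(X < Y) for independent X ~ Poisson(lam_x), Y ~ Poisson(lam_y).

    The infinite sum over the values of Y is truncated at k_max.
    """
    r = 0.0
    cdf_x = 0.0  # running value of P(X <= y - 1) = P(X < y)
    for y in range(1, k_max + 1):
        cdf_x += poisson_pmf(y - 1, lam_x)
        r += poisson_pmf(y, lam_y) * cdf_x
    return r


def all_zero_rate(lam, n):
    """Probability that all n observations from Poisson(lam) are 0."""
    return math.exp(-n * lam)


lam_x, lam_y = 2.0, 0.6  # lam_y is an assumed, illustrative value
r = reliability(lam_x, lam_y)
rate = all_zero_rate(lam_y, 5)
print(round(r, 4), round(rate, 4))
```

Under these assumptions $R$ comes out close to 0.1 and the all-zero rate for $n_y = 5$ is $e^{-3}$, roughly 5%, in line with the theoretical rate of nonfeasible samples quoted in Section 4.1.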

In Figure 4, for illustrative purposes, the MC distribution of $\hat{R}$ and of the transformed data according to the four transformations are displayed ($R = 0.8$ and $n_x = n_y = 5$). We can note at a glance that logit and probit produce distributions closer to normality than the arcsine root and complementary log-log functions, which do not seem completely able to "correct" the skewness of the original distribution. However, implementing the Shapiro-Wilk test of normality on the four transformed distributions leads to very low $p$-values, practically equal to 0, except for the probit function, whose $p$-value is 0.03132, thus indicating that in this case all the transformed data distributions are still far from normality.

Figure 3: Lower and upper uncoverage rates for the AN, logit, probit, arcsine, and log-log CIs ($\lambda_x = 2$; $R \ge 0.5$; $n_x = n_y = 50$). Panels: $R$ = 0.5, 0.6, 0.7, 0.8, 0.9, and 0.95.

Figure 4: Histogram of the MC empirical distribution of $\hat{R}$ and $\theta$, according to the transformations of Table 2 ($\lambda_x = 2$, $R = 0.8$, $n_x = n_y = 5$). Superimposed: the density function of the normal distribution with mean and standard deviation equal to the mean and standard deviation of the data. Note that neither the $y$-axis nor the $x$-axis shares the same scale in the different plots. Panels: $R$, logit($R$), probit($R$), arcsin($R$), and log-log($R$).

5. An Application

The application we illustrate here is based on the data described in [29] and already used in [7], to which we refer the reader for full details. On these data, we build the four 95% CIs based on the transformations of Table 2, along with the standard one. The results (lower and upper bounds and length of the CIs) are reported in Table 4. Although the five CIs do not differ much from each other, we can note that all four transformation-based intervals have both lower and upper bounds smaller than the corresponding bounds for the standard naïve interval (i.e., they are left shifted with respect to it); the logit transformation yields an interval a bit wider than the standard one, whereas the other three transformations produce a slight decrease in its length.

Table 4: Results for the application data.

| Type | $R_L$ | $R_U$ | Width |
|---|---|---|---|
| AN | 0.7452 | 0.9232 | 0.1780 |
| Logit | 0.7256 | 0.9054 | 0.1799 |
| Probit | 0.7302 | 0.9080 | 0.1777 |
| Arcsine | 0.7365 | 0.9128 | 0.1763 |
| Log-log | 0.7363 | 0.9113 | 0.1750 |
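To see how transformation-based intervals of this kind can be obtained, the following Python sketch applies the delta method on the transformed scale and then maps the bounds back to the unit interval. Several ingredients are assumptions rather than the paper's own code: the four transformations are written in their standard textbook forms, which are assumed to match Table 2; the names `TRANSFORMS` and `transformed_ci` are hypothetical; and the point estimate and standard error are backed out from the AN row of Table 4 by treating that interval as symmetric around $\hat{R}$.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()

# name -> (g, derivative of g, inverse of g); standard forms, assumed to match Table 2
TRANSFORMS = {
    "logit": (lambda r: math.log(r / (1 - r)),
              lambda r: 1 / (r * (1 - r)),
              lambda t: 1 / (1 + math.exp(-t))),
    "probit": (STD_NORMAL.inv_cdf,
               lambda r: 1 / STD_NORMAL.pdf(STD_NORMAL.inv_cdf(r)),
               STD_NORMAL.cdf),
    # the inverse sin(t)**2 is only valid for t in [0, pi/2]
    "arcsine": (lambda r: math.asin(math.sqrt(r)),
                lambda r: 1 / (2 * math.sqrt(r * (1 - r))),
                lambda t: math.sin(t) ** 2),
    "cloglog": (lambda r: math.log(-math.log(1 - r)),
                lambda r: -1 / ((1 - r) * math.log(1 - r)),
                lambda t: 1 - math.exp(-math.exp(t))),
}


def transformed_ci(r_hat, se, name, level=0.95):
    """Delta-method CI: g(r_hat) +/- z * g'(r_hat) * se, mapped back by g^{-1}."""
    g, dg, g_inv = TRANSFORMS[name]
    z = STD_NORMAL.inv_cdf(0.5 + level / 2)
    center = g(r_hat)
    half_width = z * dg(r_hat) * se
    return g_inv(center - half_width), g_inv(center + half_width)


# Point estimate and standard error backed out from the AN row of Table 4.
z95 = STD_NORMAL.inv_cdf(0.975)
r_hat = (0.7452 + 0.9232) / 2
se = (0.9232 - 0.7452) / (2 * z95)

results = {name: transformed_ci(r_hat, se, name) for name in TRANSFORMS}
for name, (low, high) in results.items():
    print(f"{name:8s} {low:.4f} {high:.4f}")
```

With these inputs the four intervals should land close to the logit, probit, arcsine, and log-log rows of Table 4, up to the rounding of the AN bounds used to recover $\hat{R}$ and its standard error.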

6. Conclusions

This work provided an empirical analysis of the convenience of data transformations for estimating the reliability parameter in stress-strength models. We focused on the model involving two independent Poisson random variables, since this is a case where the distribution of the MLE is not easily derivable and exact CIs cannot be built; thus, transformation of the estimates is a viable tool to refine the standard naïve interval estimator derived from the central limit theorem. Four transformations were considered (logit, probit, arcsine root, and complementary log-log) and empirically assessed under a number of scenarios in terms of the coverage and average length of the CIs they produce. Results are in favor of the logit, probit, and complementary log-log functions (although to varying degrees), which ensure a coverage rate close to the nominal coverage even with small sample sizes. On the contrary, the arcsine root function, although improving the performance of the standard CI, often keeps its coverage rates under the nominal confidence level. These findings were to some extent predictable, since logit and probit transformations are very popular link functions in generalized regression models, but are a bit surprising with regard to the complementary log-log function, whose use is limited.

Future research will ascertain whether such results hold true for other distributions for the stress-strength model and will eventually inspect alternative data transformations.

Conflict of Interests

The author declares that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

The author would like to thank the editor and two anonymous referees, whose suggestions and comments improved the quality of the original version of the paper.

References

[1] J. L. Fleiss, B. Levin, and M. C. Paik, Statistical Methods for Rates and Proportions, John Wiley & Sons, Hoboken, NJ, USA, 3rd edition, 2003.

[2] A. Agresti and B. A. Coull, "Approximate is better than 'exact' for interval estimation of binomial proportions," The American Statistician, vol. 52, no. 2, pp. 119–126, 1998.

[3] L. D. Brown, T. T. Cai, and A. Dasgupta, "Confidence intervals for a binomial proportion and asymptotic expansions," The Annals of Statistics, vol. 30, no. 1, pp. 160–201, 2002.

[4] A. M. Pires and C. Amado, "Interval estimators for a binomial proportion: comparison of twenty methods," REVSTAT Statistical Journal, vol. 6, no. 2, pp. 165–197, 2008.

[5] B. V. Gnedenko and I. A. Ushakov, Probabilistic Reliability Engineering, John Wiley & Sons, New York, NY, USA, 1995.

[6] S. Kotz, Y. Lumelskii, and M. Pensky, The Stress-strength Model and Its Generalizations: Theory and Applications, World Scientific, New York, NY, USA, 2003.

[7] A. Barbiero, "Inference for reliability of Poisson data," Journal of Quality and Reliability Engineering, vol. 2013, Article ID 530530, 8 pages, 2013.

[8] G. Blom, "Transformations of the binomial, negative binomial, Poisson and χ² distributions," Biometrika, vol. 41, pp. 302–316, 1954.

[9] D. C. Montgomery, Design and Analysis of Experiment, John Wiley & Sons, New York, NY, USA, 6th edition, 2005.

[10] Y. P. Chaubey and G. S. Mudholkar, "On the symmetrizing transformations of random variables," Concordia University, unpublished preprint, Mathematics and Statistics, 1983.

[11] M. H. Hoyle, "Transformations: an introduction and a bibliography," International Statistical Review, vol. 41, no. 2, pp. 203–223, 1973.

[12] A. Foi, "Optimization of variance-stabilizing transformations," submitted.

[13] W. Greene, Econometric Analysis, Prentice Hall, New York, NY, USA, 7th edition, 2012.

[14] A. Dasgupta, Asymptotic Theory of Statistics and Probability, Springer, New York, NY, USA, 2008.

[15] L. Brown, T. Cai, R. Zhang, L. Zhao, and H. Zhou, "The root-unroot algorithm for density estimation as implemented via wavelet block thresholding," Probability Theory and Related Fields, vol. 146, no. 3-4, pp. 401–433, 2010.

[16] J. H. Aldrich and F. D. Nelson, Linear Probability, Logit, and Probit Models, Sage, Beverly Hills, Calif, USA, 1984.

[17] A. Baklizi, "Inference on Pr(X < Y) in the two-parameter Weibull model based on records," ISRN Probability and Statistics, vol. 2012, Article ID 263612, 11 pages, 2012.

[18] K. Krishnamoorthy and Y. Lin, "Confidence limits for stress-strength reliability involving Weibull models," Journal of Statistical Planning and Inference, vol. 140, no. 7, pp. 1754–1764, 2010.

[19] S. P. Mukherjee and S. S. Maiti, "Stress-strength reliability in the Weibull case," in Frontiers in Reliability, vol. 4, pp. 231–248, World Scientific, Singapore, 1998.

[20] D. K. Al-Mutairi, M. E. Ghitany, and D. Kundu, "Inferences on stress-strength reliability from Lindley distributions," Communications in Statistics: Theory and Methods, vol. 42, no. 8, pp. 1443–1463, 2013.

[21] M. M. Shoukri, M. A. Chaudhary, and A. Al-Halees, "Estimating P(Y < X) when X and Y are paired exponential variables," Journal of Statistical Computation and Simulation, vol. 75, no. 1, pp. 25–38, 2005.

[22] A. Baklizi and O. Eidous, "Nonparametric estimation of P(X < Y) using kernel methods," Metron, vol. 64, no. 1, pp. 47–60, 2006.

[23] G. Qin and L. Hotilovac, "Comparison of non-parametric confidence intervals for the area under the ROC curve of a continuous-scale diagnostic test," Statistical Methods in Medical Research, vol. 17, no. 2, pp. 207–221, 2008.

[24] W. H. Ahrens, "Use of the arcsine and square root transformations for subjectively determined percentage data," Weed Science, vol. 38, no. 4-5, pp. 452–458, 1990.

[25] A. Barbiero, "Interval estimators for reliability: the bivariate normal case," Journal of Applied Statistics, vol. 39, no. 3, pp. 501–512, 2012.

[26] A. Barbiero, "Confidence intervals for reliability of stress-strength models in the normal case," Communications in Statistics: Simulation and Computation, vol. 40, no. 6, pp. 907–925, 2011.

[27] B. Efron and R. Tibshirani, An Introduction to the Bootstrap, Chapman and Hall, London, UK, 1993.

[28] I. R. Cardoso de Oliveira and D. F. Ferreira, "Multivariate extension of chi-squared univariate normality test," Journal of Statistical Computation and Simulation, vol. 80, no. 5-6, pp. 513–526, 2010.

[29] P. C. Wang, "Comparisons on the analysis of Poisson data," Quality and Reliability Engineering International, vol. 15, pp. 379–383, 1999.


Page 6: Research Article Data Transformation for Confidence

6 Advances in Decision Sciences

Table3Simulationresults

(120582119909=2)

119877119899119909119899119910

AN

Logit

Prob

itArcsin

Loglog

AN

Logit

Prob

itArcsin

Loglog

cov

LEUE

cov

LEUE

cov

LEUE

cov

LEUE

cov

LEUE

Width

01

55

8994lowast

034

972

9717lowast

266

017

9673lowast

207

120

9435

112

453

9620lowast

363

017

02836

03835

03491

03152

04204

1010

8849lowast

040

1111

9623lowast

251

126

9557

185

259

9252lowast

106

642

9565

323

112

02153

02501

02354

02214

02622

2020

9226lowast

036

738

9520

306

174

9494

232

274

9362lowast

144

494

9512

326

162

01620

01709

01653

01600

01750

5050

9334lowast

094

572

9544

232

224

9528

198

274

9450

156

394

9548

254

198

01020

01044

01029

01015

01054

02

55

8751lowast

166

1083

9601lowast

311

088

9511

253

237

9166lowast

166

668

9519

429

052

04360

04717

04553

044

2505123

1010

9068lowast

112

820

9530

268

202

9472

222

306

9314lowast

174

512

9486

364

150

03395

03429

03358

03312

03597

2020

9320lowast

136

544

9522

266

212

9522

218

260

9432

180

388

9490

322

188

02469

02473

02444

02428

02537

5050

9396lowast

138

466

9542

248

210

9516

220

264

9478

196

326

9520

276

204

01569

01572

01563

01559

01588

03

55

8911lowast

250

839

9642lowast

224

134

9472

224

304

9222lowast

224

554

9472

448

080

05440

05284

05245

05264

05664

1010

9110lowast

192

698

9540

248

212

9490

206

304

9370lowast

192

438

9428

412

160

04149

03991

03981

040

0204166

2020

9318lowast

182

500

9534

256

210

9504

236

260

9398lowast

232

370

9496

348

156

03012

02947

02945

02956

03018

5050

9438

182

380

9546

242

212

9532

220

248

9510

196

294

9540

280

180

01927

01909

01909

01912

01929

04

55

8898lowast

386

716

9652lowast

210

138

9448

232

320

9240lowast

290

470

9504

414

082

060

6805592

05627

05733

05891

1010

9214lowast

270

516

9586lowast

208

206

9478

228

294

9372lowast

246

382

9498

382

162

04572

04305

04335

04397

044

5520

209364lowast

260

376

9558

244

198

9522

244

234

9470

248

282

9514

332

154

03319

03210

03226

03254

03274

5050

9472

218

310

9548

220

232

9528

220

252

9506

220

274

9564

252

184

02128

02098

02103

02111

02116

R = 0.5
5, 5 | 89.32* 5.04 5.64 | 96.72* 1.56 1.72 | 95.04 2.38 2.58 | 92.78* 3.48 3.74 | 94.90 4.34 0.76 | 0.6285 0.5691 0.5752 0.5889 0.5864
10, 10 | 91.98* 3.82 4.20 | 95.54 2.04 2.42 | 94.94 2.36 2.70 | 93.84* 2.78 3.38 | 94.88 3.62 1.50 | 0.4709 0.4406 0.4450 0.4526 0.4501
20, 20 | 93.50* 3.32 3.18 | 95.36 2.30 2.34 | 94.80 2.54 2.66 | 94.64 2.62 2.74 | 94.76 3.50 1.74 | 0.3413 0.3290 0.3312 0.3345 0.3332
50, 50 | 94.62 2.46 2.92 | 95.38 2.18 2.44 | 95.16 2.32 2.52 | 94.96 2.34 2.70 | 95.42 2.72 1.86 | 0.2191 0.2157 0.2163 0.2173 0.2169

R = 0.6
5, 5 | 89.44* 6.36 4.20 | 96.52* 1.44 2.04 | 94.70 2.50 2.80 | 92.66* 4.04 3.30 | 94.78 4.28 0.94 | 0.6095 0.5587 0.5628 0.5741 0.5601
10, 10 | 91.78* 4.84 3.38 | 95.52 1.78 2.70 | 94.88 2.32 2.80 | 93.92* 3.06 3.02 | 94.92 3.48 1.60 | 0.4576 0.4305 0.4337 0.4401 0.4324
20, 20 | 93.40* 4.02 2.58 | 95.44 2.02 2.54 | 95.12 2.34 2.54 | 94.46 2.98 2.56 | 95.00 3.34 1.66 | 0.3304 0.3197 0.3212 0.3240 0.3206
50, 50 | 95.06 2.68 2.26 | 95.54 1.98 2.48 | 95.38 2.14 2.48 | 95.24 2.28 2.48 | 95.58 2.58 1.84 | 0.2120 0.2090 0.2095 0.2103 0.2093

R = 0.7
5, 5 | 89.52* 7.54 2.94 | 96.06* 1.48 2.46 | 95.04 2.50 2.46 | 92.56* 4.76 2.68 | 94.54 4.54 0.92 | 0.5490 0.5266 0.5242 0.5279 0.5094
10, 10 | 91.70* 5.82 2.48 | 95.32 1.64 3.04 | 95.02 2.22 2.76 | 93.78* 3.66 2.56 | 94.92 3.44 1.64 | 0.4162 0.3991 0.3987 0.4014 0.3919
20, 20 | 93.28* 4.68 2.04 | 95.62 1.90 2.48 | 95.18 2.34 2.48 | 94.44 3.26 2.30 | 95.30 3.26 1.44 | 0.2984 0.2920 0.2918 0.2930 0.2889
50, 50 | 94.74 3.24 2.02 | 95.48 1.88 2.64 | 95.40 2.16 2.44 | 95.24 2.52 2.24 | 95.38 2.64 1.98 | 0.1910 0.1893 0.1893 0.1896 0.1885

R = 0.8
5, 5 | 88.94* 9.60 1.46 | 96.00* 1.24 2.76 | 94.92 2.32 2.76 | 92.54* 5.78 1.68 | 95.00 4.18 0.82 | 0.4419 0.4657 0.4526 0.4436 0.4289
10, 10 | 91.66* 7.02 1.32 | 95.18 1.56 3.26 | 94.96 2.34 2.70 | 93.70* 4.18 2.12 | 95.22 3.36 1.42 | 0.3411 0.3408 0.3349 0.3315 0.3242
20, 20 | 93.22* 5.64 1.14 | 95.22 1.88 2.90 | 94.94 2.52 2.54 | 94.32 3.64 2.04 | 95.00 3.34 1.66 | 0.2424 0.2425 0.2399 0.2386 0.2356
50, 50 | 94.60 3.94 1.46 | 95.34 1.94 2.72 | 95.20 2.32 2.48 | 95.04 2.86 2.10 | 95.40 2.70 1.90 | 0.1543 0.1544 0.1537 0.1533 0.1525

R = 0.9
5, 5 | 88.32* 11.14 0.54 | 95.28 1.14 3.58 | 95.04 2.58 2.38 | 91.96* 6.76 1.28 | 94.70 4.38 0.92 | 0.2799 0.3557 0.3295 0.3043 0.3029
10, 10 | 90.80* 8.50 0.70 | 95.12 1.44 3.44 | 94.76 2.52 2.72 | 93.40* 4.98 1.62 | 95.08 3.30 1.62 | 0.2175 0.2410 0.2294 0.2187 0.2181
20, 20 | 92.48* 6.92 0.60 | 95.24 1.74 3.02 | 95.12 2.36 2.52 | 94.22* 4.28 1.50 | 95.20 3.20 1.60 | 0.1550 0.1621 0.1574 0.1532 0.1531
50, 50 | 94.52 4.62 0.86 | 95.10 1.98 2.92 | 95.30 2.30 2.40 | 95.06 3.22 1.72 | 95.40 2.62 1.98 | 0.0976 0.0996 0.0983 0.0971 0.0971

R = 0.95
5, 5 | 86.86* 12.96 0.18 | 95.16 1.12 3.72 | 95.10 2.62 2.28 | 91.52* 7.72 0.76 | 95.02 4.12 0.86 | 0.1691 0.2593 0.2289 0.1973 0.2058
10, 10 | 89.90* 9.84 0.26 | 94.80 1.48 3.72 | 94.98 2.50 2.52 | 92.62* 6.28 1.10 | 94.84 3.46 1.70 | 0.1277 0.1611 0.1489 0.1359 0.1398
20, 20 | 92.30* 7.50 0.20 | 95.22 1.74 3.04 | 95.12 2.48 2.40 | 94.02* 4.76 1.22 | 94.92 3.32 1.76 | 0.0931 0.1021 0.0975 0.0925 0.0941
50, 50 | 94.20* 5.32 0.48 | 95.30 1.88 2.82 | 95.24 2.40 2.36 | 95.08 3.46 1.46 | 95.32 2.78 1.90 | 0.0579 0.0602 0.0590 0.0577 0.0582

Legend: “cov” = actual coverage rate; “LE” = lower error rate; “UE” = upper error rate. * The rejection of H0

Advances in Decision Sciences 7

Figure 2: Lower and upper uncoverage rates of the AN, logit, probit, arcsine, and log-log CIs ($\lambda_x = 2$, $R \ge 0.5$, $n_x = n_y = 5$). Panels: $R$ = 0.5, 0.6, 0.7, 0.8, 0.9, 0.95.

(iv) Differences in behavior among the four “transformed” CIs tend to become more apparent for small sample sizes and for extreme values of $R$ (close to 0 or 1), as can be seen by looking at the bottom right graph of Figure 2. Clearly, the same differences tend to vanish as the sample sizes increase and $R$ tends to 0.5 (see the top left graph in Figure 3).

(v) Contrary to what one would predict, the performance in terms of coverage rate of the logit, probit, and complementary log-log-based CIs does not seem to be negatively affected by a small sample size; in fact, as underlined before, a (sometimes significant) increase in the coverage rate of the logit CIs is noticed when the sample size is small ($n_x = 5$), whereas the other two CIs show an almost constant trend. The CI exploiting the arcsine root transformation is the one taking most advantage from the increase in sample size. Obviously, the average length of the CIs decreases when moving to larger sample sizes.

(vi) As one could expect, there is a symmetry in the behavior of each CI when considering two complementary values of $R$ (i.e., values of $R$ summing to 1): keeping the sample size fixed, the values of the coverage rate and average width are similar. As to the uncoverage rates, they are nearly exchangeable for the AN, logit, probit, and arcsine CIs; that is, the lower uncoverage error for a fixed value of $R$ is similar to the upper uncoverage error for its complementary value (or at least their relative magnitude is exchangeable). This feature does not hold for the log-log CIs, which always present a left-side uncoverage error larger than the right-side one. These features are weakened for $R$ equal to 0.9 (and $1 - R$ then equal to 0.1), probably because of the presence of “nonfeasible” samples, as discussed in Section 4.1, which distorts this symmetry condition.

(vii) Finally (the related results are not reported here for the sake of brevity), moving to larger values of $\lambda_x$ (i.e., larger than 2) while keeping $R$ constant seems to benefit the performance of all the CIs, as can be quantitatively noted from the reduced number of rejections of the null hypothesis of equivalence between actual and theoretical coverage. This may be explained by the fact that, for a large parameter $\lambda$, the Poisson distribution tends to a normal distribution; therefore, larger values of $\lambda_x$ imply a better normal approximation to the Poisson and, presumably, to the distribution of the MLE.
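The normal-approximation argument in item (vii) can be made a little more explicit with a standard property of the Poisson distribution, which is not stated in the original text and is added here only as a supporting note: both the skewness $\gamma_1$ and the excess kurtosis $\gamma_2$ vanish as $\lambda$ grows.

```latex
X \sim \mathrm{Poisson}(\lambda): \qquad
\gamma_1(X) = \frac{\mathrm{E}\left[(X-\lambda)^3\right]}{\lambda^{3/2}} = \lambda^{-1/2},
\qquad
\gamma_2(X) = \lambda^{-1}.
% For a sample of size n, \sum_i X_i \sim \mathrm{Poisson}(n\lambda), so the
% sample mean has skewness (n\lambda)^{-1/2}.
```

For instance, with $\lambda_x = 2$ the sample mean has skewness $10^{-1/2} \approx 0.32$ when $n_x = 5$ and $100^{-1/2} = 0.1$ when $n_x = 50$. How this carries over to the MLE of $R$ is not derived here.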

In Figure 4, for illustrative purposes, the MC distribution of $\hat{R}$ and of the transformed data according to the four transformations is displayed ($R = 0.8$ and $n_x = n_y = 5$). We can note at a glance that logit and probit produce distributions closer to normality than the arcsine root and complementary log-log functions, which do not seem completely able to “correct” the skewness of the original distribution. However, applying the Shapiro-Wilk test of normality to the four transformed distributions leads to very low $P$-values, practically equal to 0, except for the probit function, whose $P$-value is 0.03132, thus indicating that, in this case, all the transformed data distributions are still far from normality.
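A rough way to explore this kind of comparison numerically is sketched below. It is not the author's simulation code: the function names, the truncation of the infinite sum, the number of runs, the use of sample skewness instead of the Shapiro-Wilk test, and the rule for discarding samples with $\hat{R}$ outside $(0, 1)$ are all assumptions made for illustration. The sketch computes $R = P(X < Y)$ for two independent Poisson variables, plugs sample means in to obtain $\hat{R}$, and compares the skewness of $\hat{R}$ with that of its logit.

```python
import math
import random
from statistics import fmean


def poisson_pmf(k, lam):
    """P(K = k) for K ~ Poisson(lam), evaluated on the log scale."""
    if lam == 0:
        return 1.0 if k == 0 else 0.0
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def reliability(lam_x, lam_y):
    """R = P(X < Y) for independent Poisson X and Y (truncated sum)."""
    spread = lam_x + lam_y
    upper = int(spread + 20 * math.sqrt(spread + 1) + 20)  # assumed cutoff
    total, cdf_x = 0.0, 0.0
    for y in range(1, upper):
        cdf_x += poisson_pmf(y - 1, lam_x)  # running P(X <= y - 1)
        total += poisson_pmf(y, lam_y) * cdf_x
    return total


def solve_lam_y(lam_x, target, lo=1e-6, hi=100.0):
    """Bisection for the lam_y giving reliability(lam_x, lam_y) = target."""
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if reliability(lam_x, mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def draw_poisson(lam, rng):
    """Knuth's multiplication method; adequate for small lam."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def simulate(lam_x, lam_y, n, runs, seed=1):
    """Plug-in estimates of R; estimates outside (0, 1) are dropped (assumption)."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(runs):
        xbar = fmean(draw_poisson(lam_x, rng) for _ in range(n))
        ybar = fmean(draw_poisson(lam_y, rng) for _ in range(n))
        r_hat = reliability(xbar, ybar)
        if 0.0 < r_hat < 1.0:
            estimates.append(r_hat)
    return estimates


def skewness(values):
    """Moment-based sample skewness."""
    m = fmean(values)
    s2 = fmean((v - m) ** 2 for v in values)
    return fmean((v - m) ** 3 for v in values) / s2 ** 1.5


if __name__ == "__main__":
    lam_y = solve_lam_y(2.0, 0.8)  # scenario of Figure 4: lambda_x = 2, R = 0.8
    r_hats = simulate(2.0, lam_y, n=5, runs=2000)
    logits = [math.log(r / (1 - r)) for r in r_hats]
    print("skewness of R-hat:       ", round(skewness(r_hats), 3))
    print("skewness of logit(R-hat):", round(skewness(logits), 3))
```

Skewness is only a crude summary of departure from normality; a formal test such as Shapiro-Wilk, as used in the text, would require a statistics library.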


Figure 3: Lower and upper uncoverage rates of the AN, logit, probit, arcsine, and log-log CIs ($\lambda_x = 2$, $R \ge 0.5$, $n_x = n_y = 50$). Panels: $R$ = 0.5, 0.6, 0.7, 0.8, 0.9, 0.95.

Figure 4: Histograms of the MC empirical distribution of $\hat{R}$ and of $\hat{\theta}$ according to the transformations of Table 2 ($\lambda_x = 2$, $R = 0.8$, $n_x = n_y = 5$). Superimposed is the density function of the normal distribution with mean and standard deviation equal to the mean and standard deviation of the data. Note that neither the $y$-axis nor the $x$-axis shares the same scale in the different plots. Panels: $R$, Logit($R$), Probit($R$), Arcsin($R$), log-log($R$).

5. An Application

The application we illustrate here is based on the data described in [29] and already used in [7], to which we refer the reader for full details. On these data, we build the four 95% CIs based on the transformations of Table 2, along with the standard one. The results (lower and upper bounds and length of the CIs) are reported in Table 4. Although the five CIs do not differ much from each other, we can note that all four transformation-based intervals have both lower and upper bounds smaller than the corresponding bounds of the standard naïve interval (i.e., they are shifted to the left with respect to it); the logit transformation yields an interval a bit wider than the standard one, whereas the other three transformations produce a slight decrease in its length.

Table 4: Results for the application data.

Type      R_L      R_U      Width
AN        0.7452   0.9232   0.1780
Logit     0.7256   0.9054   0.1799
Probit    0.7302   0.9080   0.1777
Arcsine   0.7365   0.9128   0.1763
Log-log   0.7363   0.9113   0.1750
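The mechanics behind Table 4 can be sketched as follows: build a normal-theory interval for $\theta = g(R)$ using the delta-method standard error $|g'(\hat{R})| \cdot \mathrm{se}(\hat{R})$, then map its bounds back through $g^{-1}$. This is a minimal sketch, not the author's code. The point estimate and its standard error are not given in this section, so they are back-derived here from the AN row of Table 4 (midpoint and half-width), and the four transformations are written in their usual textbook form, which is assumed to agree with Table 2.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()

# For each transformation: (g, derivative g', inverse g^{-1}).
# Textbook definitions, assumed to coincide with Table 2 of the paper.
TRANSFORMS = {
    "logit": (
        lambda r: math.log(r / (1 - r)),
        lambda r: 1 / (r * (1 - r)),
        lambda t: 1 / (1 + math.exp(-t)),
    ),
    "probit": (
        STD_NORMAL.inv_cdf,
        lambda r: 1 / STD_NORMAL.pdf(STD_NORMAL.inv_cdf(r)),
        STD_NORMAL.cdf,
    ),
    "arcsine": (
        lambda r: math.asin(math.sqrt(r)),
        lambda r: 1 / (2 * math.sqrt(r * (1 - r))),
        # clamp to [0, pi/2] so the back-transform stays monotone
        lambda t: math.sin(min(max(t, 0.0), math.pi / 2)) ** 2,
    ),
    "cloglog": (
        lambda r: math.log(-math.log(1 - r)),
        lambda r: 1 / ((1 - r) * -math.log(1 - r)),
        lambda t: 1 - math.exp(-math.exp(t)),
    ),
}


def transformed_ci(r_hat, se, name, level=0.95):
    """Delta-method CI for R built on the scale of the chosen transformation."""
    g, dg, g_inv = TRANSFORMS[name]
    z = STD_NORMAL.inv_cdf(0.5 + level / 2)
    theta = g(r_hat)
    se_theta = se * abs(dg(r_hat))  # delta method
    return g_inv(theta - z * se_theta), g_inv(theta + z * se_theta)


# Assumption: estimate and standard error recovered from the AN interval
# (0.7452, 0.9232) reported in Table 4, not taken from the original data.
Z95 = STD_NORMAL.inv_cdf(0.975)
R_HAT = (0.7452 + 0.9232) / 2
SE = (0.9232 - 0.7452) / (2 * Z95)

if __name__ == "__main__":
    for name in TRANSFORMS:
        low, high = transformed_ci(R_HAT, SE, name)
        print(f"{name:8s} {low:.4f} {high:.4f} {high - low:.4f}")
```

Because the inputs are rounded to four decimals, the bounds computed this way can be expected to agree with Table 4 only up to rounding.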

6. Conclusions

This work provided an empirical analysis of the convenience of data transformations for estimating the reliability parameter in stress-strength models. We focused on the model involving two independent Poisson random variables, since this is a case where the distribution of the MLE is not easily derivable and exact CIs cannot be built; thus, transformation of the estimates is a viable tool to refine the standard naïve interval estimator derived from the central limit theorem. Four transformations were considered (logit, probit, arcsine root, and complementary log-log) and empirically assessed under a number of scenarios in terms of the coverage and average length of the CIs they produced. Results are in favor of the logit, probit, and complementary log-log functions (although to varying degrees), which ensure a coverage rate close to the nominal one even with small sample sizes. On the contrary, the arcsine root function, although improving on the performance of the standard CI, often keeps its coverage rates under the nominal confidence level. These findings were to some extent predictable, since logit and probit transformations are very popular link functions in generalized regression models, but are a bit surprising with regard to the complementary log-log function, whose use is limited.

Future research will ascertain whether such results hold true for other distributions for the stress-strength model and will eventually inspect alternative data transformations.

Conflict of Interests

The author declares that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

The author would like to thank the editor and two anonymous referees, whose suggestions and comments improved the quality of the original version of the paper.

References

[1] J. L. Fleiss, B. Levin, and M. C. Paik, Statistical Methods for Rates and Proportions, John Wiley & Sons, Hoboken, NJ, USA, 3rd edition, 2003.

[2] A. Agresti and B. A. Coull, “Approximate is better than “exact” for interval estimation of binomial proportions,” The American Statistician, vol. 52, no. 2, pp. 119–126, 1998.

[3] L. D. Brown, T. T. Cai, and A. Dasgupta, “Confidence intervals for a binomial proportion and asymptotic expansions,” The Annals of Statistics, vol. 30, no. 1, pp. 160–201, 2002.

[4] A. M. Pires and C. Amado, “Interval estimators for a binomial proportion: comparison of twenty methods,” REVSTAT Statistical Journal, vol. 6, no. 2, pp. 165–197, 2008.

[5] B. V. Gnedenko and I. A. Ushakov, Probabilistic Reliability Engineering, John Wiley & Sons, New York, NY, USA, 1995.

[6] S. Kotz, Y. Lumelskii, and M. Pensky, The Stress-strength Model and Its Generalizations: Theory and Applications, World Scientific, New York, NY, USA, 2003.

[7] A. Barbiero, “Inference for reliability of Poisson data,” Journal of Quality and Reliability Engineering, vol. 2013, Article ID 530530, 8 pages, 2013.

[8] G. Blom, “Transformations of the binomial, negative binomial, Poisson and χ² distributions,” Biometrika, vol. 41, pp. 302–316, 1954.

[9] D. C. Montgomery, Design and Analysis of Experiment, John Wiley & Sons, New York, NY, USA, 6th edition, 2005.

[10] Y. P. Chaubey and G. S. Mudholkar, “On the symmetrizing transformations of random variables,” Concordia University, unpublished preprint, Mathematics and Statistics, 1983.

[11] M. H. Hoyle, “Transformations: an introduction and a bibliography,” International Statistical Review, vol. 41, no. 2, pp. 203–223, 1973.

[12] A. Foi, “Optimization of variance-stabilizing transformations,” submitted.

[13] W. Greene, Econometric Analysis, Prentice Hall, New York, NY, USA, 7th edition, 2012.

[14] A. Dasgupta, Asymptotic Theory of Statistics and Probability, Springer, New York, NY, USA, 2008.

[15] L. Brown, T. Cai, R. Zhang, L. Zhao, and H. Zhou, “The root-unroot algorithm for density estimation as implemented via wavelet block thresholding,” Probability Theory and Related Fields, vol. 146, no. 3-4, pp. 401–433, 2010.

[16] J. H. Aldrich and F. D. Nelson, Linear Probability, Logit, and Probit Models, Sage, Beverly Hills, Calif, USA, 1984.

[17] A. Baklizi, “Inference on Pr(X < Y) in the two-parameter Weibull model based on records,” ISRN Probability and Statistics, vol. 2012, Article ID 263612, 11 pages, 2012.

[18] K. Krishnamoorthy and Y. Lin, “Confidence limits for stress-strength reliability involving Weibull models,” Journal of Statistical Planning and Inference, vol. 140, no. 7, pp. 1754–1764, 2010.

[19] S. P. Mukherjee and S. S. Maiti, “Stress-strength reliability in the Weibull case,” in Frontiers in Reliability, vol. 4, pp. 231–248, World Scientific, Singapore, 1998.

[20] D. K. Al-Mutairi, M. E. Ghitany, and D. Kundu, “Inferences on stress-strength reliability from Lindley distributions,” Communications in Statistics - Theory and Methods, vol. 42, no. 8, pp. 1443–1463, 2013.

[21] M. M. Shoukri, M. A. Chaudhary, and A. Al-Halees, “Estimating P(Y < X) when X and Y are paired exponential variables,” Journal of Statistical Computation and Simulation, vol. 75, no. 1, pp. 25–38, 2005.

[22] A. Baklizi and O. Eidous, “Nonparametric estimation of P(X < Y) using kernel methods,” Metron, vol. 64, no. 1, pp. 47–60, 2006.

[23] G. Qin and L. Hotilovac, “Comparison of non-parametric confidence intervals for the area under the ROC curve of a continuous-scale diagnostic test,” Statistical Methods in Medical Research, vol. 17, no. 2, pp. 207–221, 2008.

[24] W. H. Ahrens, “Use of the arcsine and square root transformations for subjectively determined percentage data,” Weed Science, vol. 38, no. 4-5, pp. 452–458, 1990.

[25] A. Barbiero, “Interval estimators for reliability: the bivariate normal case,” Journal of Applied Statistics, vol. 39, no. 3, pp. 501–512, 2012.

[26] A. Barbiero, “Confidence intervals for reliability of stress-strength models in the normal case,” Communications in Statistics - Simulation and Computation, vol. 40, no. 6, pp. 907–925, 2011.

[27] B. Efron and R. Tibshirani, An Introduction to the Bootstrap, Chapman and Hall, London, UK, 1993.

[28] I. R. Cardoso de Oliveira and D. F. Ferreira, “Multivariate extension of chi-squared univariate normality test,” Journal of Statistical Computation and Simulation, vol. 80, no. 5-6, pp. 513–526, 2010.

[29] P. C. Wang, “Comparisons on the analysis of Poisson data,” Quality and Reliability Engineering International, vol. 15, pp. 379–383, 1999.



Page 8: Research Article Data Transformation for Confidence

8 Advances in Decision Sciences

R = 05

6 5 4 3 2 1 0 1 2 3

AN

Logit

Probit

Arcsin

Log-log

6 5 4 3 2 1 0 1 2 3

R = 08

AN

Logit

Probit

Arcsin

Log-log

6 5 4 3 2 1 0 1 2 3

R = 09

AN

Logit

Probit

Arcsin

Log-log

6 5 4 3 2 1 0 1 2 3

R = 06

AN

Logit

Probit

Arcsin

Log-log

6 5 4 3 2 1 0 1 2 3

R = 07

AN

Logit

Probit

Arcsin

Log-log

6 5 4 3 2 1 0 1 2 3

AN

Logit

Probit

Arcsin

Log-logR = 095

Figure 3 Lower and upper uncoverage rates 120582119909= 2 119877 ge 05 119899

119909= 119899119910= 50

[Figure 4 (plot not reproduced): five histogram panels, for $R$, Probit($R$), Logit($R$), Log-log($R$), and Arcsin($R$).]

Figure 4: Histogram of the MC empirical distribution of $\hat{R}$ and $\hat{\theta}$ according to the transformations of Table 2 ($\lambda_x = 2$, $R = 0.8$, $n_x = n_y = 5$). Superimposed is the density function of the normal distribution with mean and standard deviation equal to the mean and standard deviation of the data. Note that neither the $y$-axis nor the $x$-axis shares the same scale in the different plots.

5. An Application

The application we illustrate here is based on the data described in [29] and already used in [7], to which we refer the reader for full details. On these data, we build the four 95% CIs based on the transformations of Table 2, along with the standard one. The results (lower and upper bounds and length of the CIs) are reported in Table 4. Although the five CIs do not differ much from each other, all four transformation-based intervals have both lower and upper bounds smaller than the corresponding bounds of the standard naïve interval (i.e., they are left-shifted with respect to it); the logit transformation yields an interval a bit wider than the standard one, whereas the other three transformations produce a slight decrease in its length.

Table 4: Results for the application data.

Type       $R_L$     $R_U$     Width
AN         0.7452    0.9232    0.1780
Logit      0.7256    0.9054    0.1799
Probit     0.7302    0.9080    0.1777
Arcsine    0.7365    0.9128    0.1763
Log-log    0.7363    0.9113    0.1750
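As an illustration only, a transformation-based interval of the kind reported in Table 4 can be sketched with the delta method: the point estimate is mapped to $\theta = g(R)$, a normal interval is built on that scale with standard error $g'(\hat{R})\,\mathrm{se}(\hat{R})$, and the bounds are mapped back through $g^{-1}$. The Python sketch below makes several assumptions: the four transformations are written in their textbook forms (Table 2 is not reproduced here), the function name `transformed_ci` is hypothetical, and the inputs are backed out of the AN row of Table 4 (midpoint and half-width) rather than computed from the data in [29].

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: cdf, pdf, inv_cdf

# For each transformation: (g, derivative of g, inverse of g).
# Textbook forms, assumed to match Table 2 of the paper.
TRANSFORMS = {
    "Logit": (
        lambda r: math.log(r / (1 - r)),
        lambda r: 1 / (r * (1 - r)),
        lambda t: 1 / (1 + math.exp(-t)),
    ),
    "Probit": (
        _STD_NORMAL.inv_cdf,
        lambda r: 1 / _STD_NORMAL.pdf(_STD_NORMAL.inv_cdf(r)),
        _STD_NORMAL.cdf,
    ),
    "Arcsine": (
        lambda r: math.asin(math.sqrt(r)),
        lambda r: 1 / (2 * math.sqrt(r * (1 - r))),
        # Clamp to [0, pi/2] so the back-transform stays monotone.
        lambda t: math.sin(min(max(t, 0.0), math.pi / 2)) ** 2,
    ),
    "Log-log": (  # complementary log-log
        lambda r: math.log(-math.log(1 - r)),
        lambda r: -1 / ((1 - r) * math.log(1 - r)),
        lambda t: 1 - math.exp(-math.exp(t)),
    ),
}


def transformed_ci(r_hat, se, name, level=0.95):
    """Two-sided CI for R from a point estimate r_hat in (0, 1) and its
    standard error se; name is "AN" or a key of TRANSFORMS."""
    z = _STD_NORMAL.inv_cdf(0.5 + level / 2)
    if name == "AN":  # standard interval, no transformation
        return r_hat - z * se, r_hat + z * se
    g, g_prime, g_inv = TRANSFORMS[name]
    theta = g(r_hat)
    half = z * se * g_prime(r_hat)  # delta-method half-width on the g scale
    return g_inv(theta - half), g_inv(theta + half)


if __name__ == "__main__":
    # Inputs backed out of the AN row of Table 4 (an assumption, see above).
    r_hat = (0.7452 + 0.9232) / 2
    se = (0.9232 - 0.7452) / 2 / _STD_NORMAL.inv_cdf(0.975)
    for name in ("AN", "Logit", "Probit", "Arcsine", "Log-log"):
        lo, hi = transformed_ci(r_hat, se, name)
        print(f"{name:8s} {lo:.4f} {hi:.4f} {hi - lo:.4f}")
```

With these backed-out inputs, the bounds should land close to the corresponding rows of Table 4, which suggests, without proving, that the table was obtained along these lines. All four functions are increasing on (0, 1), so the order of the bounds is preserved by the back-transformation.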

6. Conclusions

This work provided an empirical analysis of the convenience of data transformations for estimating the reliability parameter in stress-strength models. We focused on the model involving two independent Poisson random variables, since this is a case where the distribution of the MLE is not easily derivable and exact CIs cannot be built; thus, transformation of the estimates is a viable tool to refine the standard naïve interval estimator derived by the central limit theorem. Four transformations were considered (logit, probit, arcsine root, and complementary log-log) and empirically assessed under a number of scenarios in terms of the coverage and average length of the CIs they produced. Results are in favor of the logit, probit, and complementary log-log functions, although to varying degrees, which ensure a coverage rate close to the nominal coverage even with small sample sizes. On the contrary, the arcsine root function, although improving the performance of the standard CI, often keeps its coverage rates under the nominal confidence level. These findings were to some extent predictable, since logit and probit transformations are very popular link functions in generalized regression models, but are a bit surprising with regard to the complementary log-log function, whose use is limited.

Future research will ascertain if such results hold true for other distributions for the stress-strength model and will eventually inspect alternative data transformations.

Conflict of Interests

The author declares that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

The author would like to thank the editor and two anonymous referees whose suggestions and comments improved the quality of the original version of the paper.

References

[1] J. L. Fleiss, B. Levin, and M. C. Paik, Statistical Methods for Rates and Proportions, John Wiley & Sons, Hoboken, NJ, USA, 3rd edition, 2003.

[2] A. Agresti and B. A. Coull, "Approximate is better than 'exact' for interval estimation of binomial proportions," The American Statistician, vol. 52, no. 2, pp. 119–126, 1998.

[3] L. D. Brown, T. T. Cai, and A. Dasgupta, "Confidence intervals for a binomial proportion and asymptotic expansions," The Annals of Statistics, vol. 30, no. 1, pp. 160–201, 2002.

[4] A. M. Pires and C. Amado, "Interval estimators for a binomial proportion: comparison of twenty methods," REVSTAT Statistical Journal, vol. 6, no. 2, pp. 165–197, 2008.

[5] B. V. Gnedenko and I. A. Ushakov, Probabilistic Reliability Engineering, John Wiley & Sons, New York, NY, USA, 1995.

[6] S. Kotz, Y. Lumelskii, and M. Pensky, The Stress-strength Model and Its Generalizations: Theory and Applications, World Scientific, New York, NY, USA, 2003.

[7] A. Barbiero, "Inference for reliability of Poisson data," Journal of Quality and Reliability Engineering, vol. 2013, Article ID 530530, 8 pages, 2013.

[8] G. Blom, "Transformations of the binomial, negative binomial, Poisson and $\chi^2$ distributions," Biometrika, vol. 41, pp. 302–316, 1954.

[9] D. C. Montgomery, Design and Analysis of Experiment, John Wiley & Sons, New York, NY, USA, 6th edition, 2005.

[10] Y. P. Chaubey and G. S. Mudholkar, "On the symmetrizing transformations of random variables," Concordia University, unpublished preprint, Mathematics and Statistics, 1983.

[11] M. H. Hoyle, "Transformations: an introduction and a bibliography," International Statistical Review, vol. 41, no. 2, pp. 203–223, 1973.

[12] A. Foi, "Optimization of variance-stabilizing transformations," submitted.

[13] W. Greene, Econometric Analysis, Prentice Hall, New York, NY, USA, 7th edition, 2012.

[14] A. Dasgupta, Asymptotic Theory of Statistics and Probability, Springer, New York, NY, USA, 2008.

[15] L. Brown, T. Cai, R. Zhang, L. Zhao, and H. Zhou, "The root-unroot algorithm for density estimation as implemented via wavelet block thresholding," Probability Theory and Related Fields, vol. 146, no. 3-4, pp. 401–433, 2010.

[16] J. H. Aldrich and F. D. Nelson, Linear Probability, Logit, and Probit Models, Sage, Beverly Hills, Calif, USA, 1984.

[17] A. Baklizi, "Inference on Pr($X < Y$) in the two-parameter Weibull model based on records," ISRN Probability and Statistics, vol. 2012, Article ID 263612, 11 pages, 2012.

[18] K. Krishnamoorthy and Y. Lin, "Confidence limits for stress-strength reliability involving Weibull models," Journal of Statistical Planning and Inference, vol. 140, no. 7, pp. 1754–1764, 2010.

[19] S. P. Mukherjee and S. S. Maiti, "Stress-strength reliability in the Weibull case," in Frontiers in Reliability, vol. 4, pp. 231–248, World Scientific, Singapore, 1998.

[20] D. K. Al-Mutairi, M. E. Ghitany, and D. Kundu, "Inferences on stress-strength reliability from Lindley distributions," Communications in Statistics: Theory and Methods, vol. 42, no. 8, pp. 1443–1463, 2013.

[21] M. M. Shoukri, M. A. Chaudhary, and A. Al-Halees, "Estimating P($Y < X$) when $X$ and $Y$ are paired exponential variables," Journal of Statistical Computation and Simulation, vol. 75, no. 1, pp. 25–38, 2005.

[22] A. Baklizi and O. Eidous, "Nonparametric estimation of P($X < Y$) using kernel methods," Metron, vol. 64, no. 1, pp. 47–60, 2006.

[23] G. Qin and L. Hotilovac, "Comparison of non-parametric confidence intervals for the area under the ROC curve of a continuous-scale diagnostic test," Statistical Methods in Medical Research, vol. 17, no. 2, pp. 207–221, 2008.

[24] W. H. Ahrens, "Use of the arcsine and square root transformations for subjectively determined percentage data," Weed Science, vol. 38, no. 4-5, pp. 452–458, 1990.

[25] A. Barbiero, "Interval estimators for reliability: the bivariate normal case," Journal of Applied Statistics, vol. 39, no. 3, pp. 501–512, 2012.

[26] A. Barbiero, "Confidence intervals for reliability of stress-strength models in the normal case," Communications in Statistics - Simulation and Computation, vol. 40, no. 6, pp. 907–925, 2011.


[27] B. Efron and R. Tibshirani, An Introduction to the Bootstrap, Chapman and Hall, London, UK, 1993.

[28] I. R. Cardoso de Oliveira and D. F. Ferreira, "Multivariate extension of chi-squared univariate normality test," Journal of Statistical Computation and Simulation, vol. 80, no. 5-6, pp. 513–526, 2010.

[29] P. C. Wang, "Comparisons on the analysis of Poisson data," Quality and Reliability Engineering International, vol. 15, pp. 379–383, 1999.
