

IEEE TRANSACTIONS ON RELIABILITY, VOL. R-23, NO. 4, OCTOBER 1974 225

EDITORIALS

Why, Oh Why, Oh Why

Fitting a probability distribution to data is a time honored activity among statisticians and their followers. But why is it done? It is a way of summarizing data: a few numbers can regenerate great quantities of equivalent data. But often a tractable probability distribution is fitted to the data in order to estimate some property of the real distribution, e.g., mean, 10th percentile, or median.

When there are only a few data (say less than 30) the tails of the true distribution are not well defined by the data. Rarely will the sample Cdf be like the real Cdf in the tail regions. Why then do we expect that the tail regions of the fitted tractable Cdf will correspond to the tail regions of the real Cdf? Some properties of a probability distribution are quite sensitive to the behavior in the tail regions; especially the mean and variance.

If the distribution is being fitted just to interpolate, why not simply use one of the many existing interpolation methods? If it is being done to extrapolate (always a dangerous procedure), one of the more occult or metaphysical methods of extrapolation might be more appropriate. If it is being done to calculate a property of the real distribution, e.g., the mean or variance, why not just use the sample values (equating sample values to population values is a time honored technique of fitting distributions anyway)?

Don't blindly fit a distribution to the data in order to hide from yourself the gross uncertainties involved in an estimation process.

-R. A. E.
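The editorial's warning about tails can be illustrated with a small simulation. The setup here is my own, not from the editorial: samples of n = 20 are drawn from a heavier-tailed lognormal distribution, and a "tractable" normal distribution is fitted by the method of moments. The fitted Cdf's far tail disagrees badly with the real Cdf's far tail, even though the fit matches the mean and variance of the data.

```python
import math
import random

# Hypothetical setup (sizes, distributions, and seed are my own choices):
# fit a normal by the method of moments to n = 20 lognormal observations,
# then read the 99th percentile off the fitted distribution.
random.seed(1974)

N, TRIALS = 20, 2000
Z99 = 2.3263                       # standard-normal 99th percentile
true_p99 = math.exp(Z99)           # 99th pct of lognormal(0, 1), about 10.2

fitted = []
for _ in range(TRIALS):
    xs = [random.lognormvariate(0.0, 1.0) for _ in range(N)]
    m = sum(xs) / N                                      # sample mean
    s = math.sqrt(sum((x - m) ** 2 for x in xs) / (N - 1))  # sample std dev
    fitted.append(m + Z99 * s)     # 99th pct of the fitted normal

fitted_p99_avg = sum(fitted) / TRIALS
print(f"true 99th percentile:           {true_p99:.1f}")
print(f"average fitted 99th percentile: {fitted_p99_avg:.1f}")
```

On average the fitted-normal tail percentile falls well short of the true one: the tractable distribution hides, rather than resolves, the uncertainty in the tail.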

Bias Is Not Bad

In reliability statistics there is too much emphasis on s-bias of estimators. Being s-biased is not particularly bad for an estimator, as long as it has some other good properties. Being s-consistent (approaching the true value as the sample size becomes larger) and being s-efficient (having a reasonably small rms error) are probably the two most important statistical properties for an s-estimator of a measure of reliability.

s-Bias is a special arithmetic property that measures the difference between the average value of the estimator and the true value. If any estimator is s-unbiased, any nonlinear function of the estimator is s-biased (because of the special definition of s-bias). Thus, if an estimator of the constant failure rate is s-unbiased, the reliability calculated using that estimator will be s-biased, as will the mean life (reciprocal of failure rate). Now, no reliability engineer knows which of the 3 functions (reliability, failure rate, mean life) he wants to have s-unbiased; so let's use estimators whose good properties hold for functions of those estimators. For many common estimators, the s-bias is small compared to the standard deviation anyway; so the rms error is negligibly larger than the standard deviation.

So, let's quit worrying about s-bias.

s- implies "statistical"

-R. A. E.
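The point that an s-unbiased estimator has s-biased nonlinear functions can be checked directly. In this sketch (the exponential model, sample size, and seed are my own assumptions, not the editorial's), lifetimes are exponential with failure rate 1.0, so mean life is also 1.0. The sample mean is s-unbiased for mean life, yet its reciprocal overestimates the failure rate on average, by the factor n/(n-1).

```python
import random

# Assumed model: exponential lifetimes, true failure rate = mean life = 1.0.
random.seed(1974)

N, TRIALS = 5, 100000
mean_life_est, rate_est = [], []
for _ in range(TRIALS):
    xs = [random.expovariate(1.0) for _ in range(N)]
    theta_hat = sum(xs) / N            # s-unbiased for mean life
    mean_life_est.append(theta_hat)
    rate_est.append(1.0 / theta_hat)   # nonlinear function -> s-biased

avg = lambda v: sum(v) / len(v)
print(f"avg mean-life estimate:    {avg(mean_life_est):.3f}")  # ~1.00
print(f"avg failure-rate estimate: {avg(rate_est):.3f}")       # ~n/(n-1) = 1.25
```

The s-bias of the reciprocal shrinks as n grows (the estimator is s-consistent), which is the editorial's point: the good properties worth insisting on are the ones that carry over to functions of the estimator.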