Process Capability Indices for Non Normal Distributions


  • ABSTRACT

    PROCESS CAPABILITY INDICES FOR NON NORMAL

    DISTRIBUTIONS

    Sulagna Das, M.S.

    Department of Mathematical Sciences

    Division of Statistics

    Northern Illinois University, 2009

    Dr Alan M. Polansky, Director

    Process capability analysis is a statistical technique that is used to identify
    and reduce the variability of a manufacturing process in order to produce items
    that meet certain specifications. Many different process capability indices have
    been developed to measure the capability of a manufacturing process, but they all
    have some drawbacks. The biggest drawback is that they can be applied only to
    processes that are normally distributed. This thesis attempts to deal with the
    problem of non-normality by developing an index based on quantiles. To measure
    the accuracy of the estimates, confidence intervals have been computed in four
    different ways. Finally, using samples obtained by the bootstrap method, the
    thesis shows that these confidence intervals work well only for large sample sizes.

  • NORTHERN ILLINOIS UNIVERSITY

    DE KALB, ILLINOIS

    AUGUST 2009

    PROCESS CAPABILITY INDICES FOR NON NORMAL DISTRIBUTIONS

    BY

    SULAGNA DAS

    © 2009 Sulagna Das

    A THESIS SUBMITTED TO THE GRADUATE SCHOOL

    IN PARTIAL FULFILLMENT OF THE REQUIREMENTS

    FOR THE DEGREE

    MASTER OF SCIENCE

    DEPARTMENT OF MATHEMATICAL SCIENCES

    Thesis Director: Dr. Alan M. Polansky

  • UMI Number: 1468057


    UMI Microform 1468057 Copyright 2009 by ProQuest LLC

    All rights reserved. This microform edition is protected against unauthorized copying under Title 17, United States Code.

    _______________________________________________________________

    ProQuest LLC 789 East Eisenhower Parkway

    P.O. Box 1346 Ann Arbor, MI 48106-1346

  • ACKNOWLEDGMENTS

    I would like to express my gratitude to my advisor, Prof. Alan Polansky, for

    his guidance and help in writing my thesis. He introduced me to this topic and

    I got interested right away. I am grateful to him for his continued support and

    time in spite of his busy schedule. I also wish to thank all of my professors and

    friends who offered their suggestions from time to time. Finally, I cannot forget the

    incredible support of our office staff, without which it would have been difficult for

    me to complete my academic program.

  • TABLE OF CONTENTS

    LIST OF TABLES

    LIST OF FIGURES

    Chapter

    1. Introduction
       1.1 Process Capability
       1.2 Process Capability Indices
           1.2.1 The Cp Index
           1.2.2 One-Sided Indices
           1.2.3 The Cpk Index
           1.2.4 The Cpm Index
       1.3 Some Indices Robust to Non-Normality
           1.3.1 The Cpc Index
           1.3.2 Indices Based on Quantiles

    2. Confidence Intervals for the Cnpk Index
       2.1 The Standard Bootstrap Confidence Interval
       2.2 The Percentile Method Bootstrap Confidence Interval
       2.3 The Bootstrap-t Confidence Interval
       2.4 The Hybrid Bootstrap Confidence Interval
       2.5 Simulation Study
           2.5.1 Conclusion

    3. R Program Code

    REFERENCES

  • LIST OF TABLES

    Table

    2.1. Simulation Results for samples from a Standard Normal Density
    2.2. Simulation Results for samples from a Skewed Unimodal Density
    2.3. Simulation Results for samples from a Strongly Skewed Unimodal Density
    2.4. Simulation Results for samples from a Kurtotic Unimodal Density

  • LIST OF FIGURES

    Figure

    2.1. Skewed Unimodal Density
    2.2. Strongly Skewed Unimodal Density
    2.3. Kurtotic Unimodal Density

  • CHAPTER 1

    Introduction

    1.1 Process Capability

    Consider a manufacturing process. Even with the most well-designed manufac-

    turing system a certain amount of inherent variability in the manufactured items

    always exists. This inherent or natural variability is usually the effect of many small

    unavoidable causes. An unavoidable cause is one that cannot be attributed to a

    specific reason and occurs purely by chance. A process that is operating with only

    unavoidable, or chance, causes of variation is said to be in statistical control. There

    may be other kinds of variability in a manufacturing system that could be attributed

    to a cause like operator error, defective raw materials, or improperly adjusted ma-

    chines. Such variability is usually large compared to the natural variability in a

    process and affects the performance of the manufacturing process. Such sources of

    variability are referred to as assignable causes of variation. Statistical control can be

    restored in a process that is not in control by detecting and eliminating assignable

    causes in the process. Once a process is in control, one can then focus on the quality

    of the manufactured items.

    Process capability refers to how capable a process is of producing items that
    meet the product requirements or specifications. The statistical technique of
    identifying and reducing process variability in order to produce items within
    specifications is called a process capability analysis. A process capability
    analysis is a formal study

    that can be used to study the variability of a process. Such an analysis usually

    focuses on the variation in parameters or quality characteristics of a product that

    are required to meet certain specifications.

    Specifications refer to the range of a quality characteristic of an item where

    the item is useful or of acceptable quality. Specification limits are set for a man-

    ufacturing process and are determined by Industrial Engineers. USL refers to the

    upper specification limit and LSL refers to the lower specification limit for a single

    univariate quality characteristic. Process capability analysis can be done without

    specifications by simply describing the process variation. However the analysis is

    much more meaningful when done in terms of specifications.

    The extent or majority of the variation in a quality characteristic is defined as

    the natural tolerance of a process. For a normally distributed quality characteristic

    with mean $\mu$ and standard deviation $\sigma$ the natural tolerance is the $6\sigma$ interval in

    the distribution around the process mean. This measure, in conjunction with the

    specification limits, can be used as a measure of process capability.

    When defining the natural tolerance, or natural variability, of a normally dis-

    tributed product quality characteristic there are certain additional assumptions that

    should be kept in mind. For example, the process is also required to be stable, or

    in control. A stable process refers to one that does not exhibit changes in process

    distribution with time.

    The normality of a quality characteristic can be verified by plotting a histogram,
    using a normal quantile plot, or by using a Shapiro-Wilk test for normality. The
    shape and spread of the histogram help to determine if the distribution is
    approximately normal. A histogram also gives an immediate visual impression of the
    process performance. The normal quantile plot and the Shapiro-Wilk test provide
    more formal statistical methods for assessing normality. For small samples a
    histogram may not provide reliable results. In these cases a normal quantile plot
    or the Shapiro-Wilk test can be used as an alternative to the histogram, as these
    produce reasonable results for moderately small samples.
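    The two formal checks mentioned above can be sketched in R as follows. This is only an illustration: the exponential sample `x` is made-up data chosen to be clearly non-normal, and it is not part of the thesis.

```r
# Illustrative (hypothetical) data: an exponential sample is strongly skewed.
set.seed(42)
x <- rexp(50)

# Shapiro-Wilk test: a small p-value is evidence against normality.
sw <- shapiro.test(x)

# Coordinates of the normal quantile plot, computed without drawing it.
qq <- qqnorm(x, plot.it = FALSE)

sw$p.value          # p-value of the test
cor(qq$x, qq$y)     # values well below 1 suggest a curved quantile plot
```

    Calling `qqnorm(x)` with its default arguments draws the plot itself.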

    Some uses of process capability analysis are:

    - Predicting how well the process will hold to tolerances prescribed by the
      specification limits. Process capability is often measured in terms of the natural
      tolerance of the process compared to the range of the specification set. Hence,
      process capability indicates how much of the process is within engineering
      tolerances.

    - Selecting and modifying processes. This measure tells if a process is capable
      enough to meet specifications and hence helps in determining if a manufacturing
      process requires modifications.

    - Constructing plans for process monitoring. A process capability analysis of
      a process can help monitor the process and also give warning signs when a
      process does not meet capability standards.

    - Selecting between competing vendors. A better manufacturing process can
      be judged by comparing the relative manufacturing process capabilities of the
      competing vendors. See, for example, Chou [1], Tseng and Wu [14], Huang
      and Lee [4], and Polansky [11,12].

    1.2 Process Capability Indices

    Process capability ratios (PCRs) express the capability of a process to manufac-

    ture products that meet specifications. PCRs provide a convenient way of expressing

    the capability of a process with a unit-less measure usually formed as a ratio of the

    acceptable variability of the process to the actual variability in the process. Several

    such measures of process capability have been proposed. A few of these proposals

    are presented here. A complete overview of the process capability indices can be

    found in Kotz and Johnson (1993).

    1.2.1 The Cp Index

    The most basic process capability index is the $C_p$ index. Let $\sigma$ be the process
    standard deviation. The $C_p$ index is defined as

    \[ C_p = \frac{USL - LSL}{6\sigma}. \]

    In practice, when $\sigma$ is not known it is replaced by an estimate such as the sample
    standard deviation of some observed process data, or an unbiased estimate such as
    $\bar{R}/d_2$, where $\bar{R}$ is the average range computed from an R-chart, or
    $\bar{S}/c_4$, where $\bar{S}$ is determined from an S-chart. Here $d_2$ and $c_4$ are
    constants that change with sample size. See Appendix 6 of Montgomery (2009).
    Therefore, an estimate of $C_p$ is given by

    \[ \hat{C}_p = \frac{USL - LSL}{6\hat{\sigma}}, \]

    where $\hat{\sigma}$ is the estimated process standard deviation.
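    As a minimal sketch of this estimate, assuming the sample standard deviation is used for the estimated process standard deviation, the calculation could look like the following. The function name `cp_hat`, the simulated data, and the specification limits are all illustrative assumptions.

```r
# Hypothetical helper: estimate Cp using the sample standard deviation.
cp_hat <- function(x, lsl, usl) {
  (usl - lsl) / (6 * sd(x))
}

# Illustrative data: a simulated process with mean 10 and standard deviation 0.5.
set.seed(1)
x <- rnorm(100, mean = 10, sd = 0.5)
cp_hat(x, lsl = 8.5, usl = 11.5)
```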

    Practical interpretation of the $C_p$ index is only valid when the quality
    characteristic has a normal distribution, the process is in statistical control, and
    the process mean, $\mu$, is centered between the upper and lower specification
    limits. That is, when

    \[ \mu = \frac{USL + LSL}{2}. \]

    This can be verified using a hypothesis test of

    \[ H_0 : \mu = \frac{USL + LSL}{2} \]

    against

    \[ H_a : \mu \neq \frac{USL + LSL}{2}. \]

    These hypotheses can be tested using the standard t-test, under the assumption
    that the quality characteristic approximately follows a normal distribution.

    In practical situations a common problem is that the assumption of normality is
    violated. Since the capability index $C_p$ uses $6\sigma$ as the natural tolerance,
    the index requires that the quality characteristic follow a normal distribution,
    and hence non-normal process data can lead to erroneous results. That is,
    statements made about expected fallout or percentage of non-conformity may be in
    error. Also, $C_p$ does not take into account where the process mean is located
    relative to the specifications. It simply measures the spread of the specifications
    relative to the $6\sigma$ spread in the process. An off-center process has lower
    capability than a centered process in that it does not operate at the midpoint of
    the interval between the specifications, where the lowest proportion of
    non-conformity would occur. For these reasons the $C_p$ index is not considered a
    process capability index that can be used in general situations.

    1.2.2 One-Sided Indices

    Often there are processes that have only an upper or only a lower specification
    limit. For example, strength often has just a lower specification limit, and time
    often has just an upper specification limit. When a process has just an upper
    specification limit, a measure of process capability is defined as

    \[ C_{pu} = \frac{USL - \mu}{3\sigma}. \]

    When a process has just a lower specification limit, a measure of process
    capability is defined as

    \[ C_{pl} = \frac{\mu - LSL}{3\sigma}. \]

    Estimates of $C_{pu}$ and $C_{pl}$ are obtained by replacing $\mu$ and $\sigma$ by
    estimates $\hat{\mu}$ and $\hat{\sigma}$ respectively. The estimate of $\mu$ is usually
    the sample mean of an observed sample of process data. The estimate of $\sigma$ is
    the same as used for $C_p$. Some important assumptions should be kept in mind. The
    quality characteristic should be normally distributed and the process should be in
    statistical control.

    1.2.3 The Cpk Index

    The quantity $C_{pk}$ is a process capability index defined by Kane (1986) to
    address some of the problems encountered with the $C_p$ index. The $C_{pk}$ index is
    the minimum of $C_{pu}$ and $C_{pl}$. If $C_p = C_{pk}$ the process is centered at the
    midpoint of the specification set, but when $C_p$ and $C_{pk}$ differ the process is
    off-center. Hence, the $C_{pk}$ index provides a better measure of process capability
    than $C_p$ when the process is not centered. In general $C_{pk}$ is no greater than
    $C_p$. The two indices are related by

    \[ C_{pk} = \left[ 1 - \frac{\left| \mu - \frac{USL + LSL}{2} \right|}{\frac{USL - LSL}{2}} \right] C_p. \]

    An estimate of $C_{pk}$ is given by

    \[ \hat{C}_{pk} = \left[ 1 - \frac{\left| \hat{\mu} - \frac{USL + LSL}{2} \right|}{\frac{USL - LSL}{2}} \right] \hat{C}_p, \]

    where $\hat{\mu}$ and $\hat{C}_p$ are specified above.
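    The sketch below estimates the one-sided indices and the Cpk index, and checks numerically that the minimum of the two one-sided estimates agrees with the relation to Cp. It assumes the sample mean and sample standard deviation are used as the estimates. The function name `capability` and the example data are illustrative.

```r
# Hypothetical helper returning the estimated indices for a sample x.
capability <- function(x, lsl, usl) {
  m <- mean(x)
  s <- sd(x)
  cp  <- (usl - lsl) / (6 * s)
  cpu <- (usl - m) / (3 * s)
  cpl <- (m - lsl) / (3 * s)
  # Relative distance of the mean from the midpoint of the specifications.
  k <- abs(m - (usl + lsl) / 2) / ((usl - lsl) / 2)
  c(cp = cp, cpu = cpu, cpl = cpl,
    cpk = min(cpu, cpl),          # definition as a minimum
    cpk_from_cp = (1 - k) * cp)   # relation to Cp
}

# Illustrative data: mean 2, standard deviation 1, limits 0 and 6.
r <- capability(c(1, 2, 3), lsl = 0, usl = 6)
r
```

    For this off-center example both versions of the Cpk estimate equal 2/3, while the Cp estimate equals 1.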

    1.2.4 The Cpm Index

    The $C_{pm}$ index was developed to deal with the problems often encountered with
    the $C_p$ and $C_{pk}$ process capability indices. The $C_{pk}$ index was developed as
    an alternative to $C_p$, which does not work well for a process where the mean is not
    centered between the specification limits. However, the $C_{pk}$ index has a
    limitation when $\sigma$ approaches zero. The $C_{pk}$ index depends inversely on
    $\sigma$ and hence becomes large as $\sigma$ decreases. A large value of $C_{pk}$
    gives no information about the relative location of the mean in the interval LSL
    to USL.

    The $C_{pm}$ index was proposed by Chan, Cheng and Spiring (1988) as a better
    indicator of process centering. This index is given by

    \[ C_{pm} = \frac{USL - LSL}{6\tau}, \]

    where $\tau = \sqrt{E[(X - T)^2]} = \sqrt{\sigma^2 + (\mu - T)^2}$, and $T$ is the
    target value for the process. Hence

    \[ C_{pm} = \frac{USL - LSL}{6\sqrt{\sigma^2 + (\mu - T)^2}} = \frac{C_p}{\sqrt{1 + \xi^2}}, \]

    where

    \[ \xi = \frac{\mu - T}{\sigma}. \]

    It can be seen that as $(\mu - T) \to \infty$, $C_{pm} \to 0$ whereas
    $C_{pk} \to -\infty$. A necessary condition for $C_{pm} \geq 1$ is that
    $|\mu - T| < \frac{USL - LSL}{6}$. This means that if the target value $T$ is the
    midpoint of the specifications, a $C_{pm}$ index of one or greater implies that the
    mean lies within the middle third of the specification band.

    To estimate $C_{pm}$ we usually use

    \[ \hat{C}_{pm} = \frac{\hat{C}_p}{\sqrt{1 + V^2}}, \]

    where

    \[ V = \frac{\hat{\mu} - T}{\hat{\sigma}}, \]

    and $\hat{C}_p$, $\hat{\mu}$ and $\hat{\sigma}$ are as specified above.
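    A minimal sketch of this estimate follows, assuming the sample mean and sample standard deviation are used and that the target defaults to the midpoint of the specifications. The name `cpm_hat` and its `target` argument are illustrative.

```r
# Hypothetical helper: estimate Cpm as Cp-hat / sqrt(1 + V^2).
cpm_hat <- function(x, lsl, usl, target = (lsl + usl) / 2) {
  s  <- sd(x)
  v  <- (mean(x) - target) / s      # standardized distance from the target
  cp <- (usl - lsl) / (6 * s)
  cp / sqrt(1 + v^2)
}

# Illustrative data: mean 2, standard deviation 1, limits 0 and 6.
x <- c(1, 2, 3)
cpm_hat(x, lsl = 0, usl = 6)               # target 3, one sd from the mean
cpm_hat(x, lsl = 0, usl = 6, target = 2)   # on target, so Cpm-hat equals Cp-hat
```

    With the mean one standard deviation from the target, the estimate is the Cp estimate divided by the square root of 2.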

    1.3 Some Indices Robust to Non-Normality

    Several nonparametric indices have been formulated to deal with the problem

    of non-normal data. The most commonly used approach deals with the problem of

    non-normality by transforming the data and specification limits. There are various

    graphical and analytical approaches to selecting a transformation. See Polansky and

    Kirmani (2003). Once a suitable transformation brings the data close to a normal
    distribution, capability indices can be computed and interpreted on the
    transformed scale. A popular transformation is taking the reciprocal of the
    original data. A skewed distribution often responds well to the square root of the
    original data. However, a major disadvantage of the transformation method is that
    it involves further calculations. Also, some practitioners may find transformed
    data difficult to handle and interpret. Hence this method is often discouraged.

    Another approach is to fit the observed process data to a family of distributions.

    Indices specialized to the family of distributions are then computed to measure the

    process capability. One needs to make sure that the parameter estimates are based
    on a large enough sample to give reliable results. Also, the chosen family of
    distributions may not always fit the data well.

    1.3.1 The Cpc Index

    The $C_{pc}$ index, developed by Luceño (1996), is another attempt to define
    capability in the case when the data are not normally distributed. The $C_{pc}$
    index is defined as

    \[ C_{pc} = \frac{USL - LSL}{6\sqrt{\frac{\pi}{2}}\, E|X - T|}, \]

    where $T$ is the target value for the process, which is often taken to be the
    midpoint of the specification set, given by

    \[ T = \frac{USL + LSL}{2}, \]

    and $X$ is a random variable equal to the quality characteristic. The $C_{pc}$ index
    can be estimated by estimating $E|X - T|$ with

    \[ \hat{c} = \frac{1}{n} \sum_{i=1}^{n} |X_i - T|, \]

    where $X_1, X_2, \ldots, X_n$ is a sample of process data. Therefore, an estimate of
    the $C_{pc}$ index is given by

    \[ \hat{C}_{pc} = \frac{USL - LSL}{6\sqrt{\frac{\pi}{2}}\, \hat{c}}. \]

    The denominator $6\sqrt{\frac{\pi}{2}}\,\hat{c}$ is a more robust measure of natural
    tolerance than $6\hat{\sigma}$ is when the quality characteristic data are non-normal.
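    The estimate can be sketched in a few lines of R. For normal data centered at the target, the mean absolute deviation equals the standard deviation times the square root of 2/pi, so the denominator then matches six standard deviations. The name `cpc_hat` and the example data are illustrative assumptions.

```r
# Hypothetical helper: estimate Cpc from the mean absolute deviation about the target.
cpc_hat <- function(x, lsl, usl, target = (lsl + usl) / 2) {
  c_hat <- mean(abs(x - target))
  (usl - lsl) / (6 * sqrt(pi / 2) * c_hat)
}

# Illustrative data: both points lie one unit from the target of 3.
cpc_hat(c(2, 4), lsl = 0, usl = 6)
```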

    1.3.2 Indices Based on Quantiles

    Alternative process capability indices have been proposed that use a more uni-

    versal measurement of the natural tolerance of a distribution. These measures are

    usually based on quantiles of the process distribution. For these measures, the as-

    sumption of normality is not required, but the indices may require large sample sizes

    to obtain accurate estimates. For example, an alternative to the $C_p$ index is given
    by

    \[ C_{pq} = \frac{USL - LSL}{Q_{0.99865} - Q_{0.00135}}, \]

    where $Q_y$ is the $y$th quantile of the process distribution.

    Since for a normal distribution $Q_{0.00135} = \mu - 3\sigma$ and
    $Q_{0.99865} = \mu + 3\sigma$, in the case of normally distributed data $C_{pq}$
    reduces to $C_p$. The $C_{pq}$ index can be estimated with

    \[ \hat{C}_{pq} = \frac{USL - LSL}{\hat{Q}_{0.99865} - \hat{Q}_{0.00135}}, \]

    where $\hat{Q}_y$ is the $y$th sample quantile, that is, the value below which a
    fraction $y$ of the data in a given dataset falls.
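    A minimal sketch of this estimate is given below. It assumes R's default sample quantile definition (type 7 in `quantile`), which is only one of several possible choices. The name `cpq_hat` is illustrative. The last line checks that the normal quantile range is very close to six standard deviations.

```r
# Hypothetical helper: estimate Cpq from the extreme sample quantiles.
cpq_hat <- function(x, lsl, usl) {
  q <- quantile(x, probs = c(0.00135, 0.99865), names = FALSE)
  (usl - lsl) / (q[2] - q[1])
}

# Illustrative data on an evenly spaced grid.
cpq_hat(0:100, lsl = 0, usl = 100)

# Width of the 0.00135 to 0.99865 quantile range for a standard normal.
normal_width <- qnorm(0.99865) - qnorm(0.00135)
normal_width
```

    Very large samples are needed before quantiles this far in the tails are estimated well, which is the large-sample caveat noted above.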

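    Another similar capability index, based on the same principle, was developed by
    McCormack et al. (2000) as an alternative to the $C_{pk}$ index. The $C_{npk}$ index
    is defined as

    \[ C_{npk} = \min(C_{npl}, C_{npu}), \]

    where

    \[ C_{npl} = \frac{Q_{50} - LSL}{Q_{50} - Q_{0.5}} \]

    and

    \[ C_{npu} = \frac{USL - Q_{50}}{Q_{99.5} - Q_{50}}. \]

    Here the subscript of $Q$ is expressed as a percentage, so $Q_{50}$ is the median of
    the process distribution. An estimate of the $C_{npk}$ index is given by

    \[ \hat{C}_{npk} = \min(\hat{C}_{npl}, \hat{C}_{npu}), \]

    where

    \[ \hat{C}_{npl} = \frac{\hat{Q}_{50} - LSL}{\hat{Q}_{50} - \hat{Q}_{0.5}} \]

    and

    \[ \hat{C}_{npu} = \frac{USL - \hat{Q}_{50}}{\hat{Q}_{99.5} - \hat{Q}_{50}}, \]

    where $\hat{Q}_y$ is the $y$th sample percentile from a sample of observations from
    the process distribution.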

    This thesis explores methods for computing four different approximate confidence

    intervals for Cnpk. We will empirically determine how well they perform in terms of

    capturing the true value of Cnpk. This is done using computer based simulations.

  • CHAPTER 2

    Confidence Intervals for the Cnpk Index

    As discussed in the previous chapter, it is clear that when the distribution of a

    process deviates from normality, statements made about many of the process capa-

    bility indices could be in error if the usual process capability indices such as Cp, Cpk

    or Cpm are used. Hence, in this work we have elected to focus on the Cnpk process

    capability index which does not require the assumption of normality. In order to

    make useful statements about a manufacturing process when the true value of the

    Cnpk index is not known, we wish to develop confidence intervals for the Cnpk index.

    It is required that the sampling distribution of the capability index be determined

    before computing statistics like a confidence interval, since confidence intervals are

    required to account for the sample variation in the estimates of the capability index.

    The sampling distribution of $\hat{C}_{npk}$ is very complicated because it is the
    minimum of two functions that involve ratios of sample quantiles. Moreover, the
    distribution of the sample quantiles depends on the population density $f$. For
    large samples there is an asymptotic normal result for sample quantiles. Let
    $0 < p < 1$. If the distribution function of the process, $F$, possesses a density
    $f$ in a neighborhood of $Q_p$ and $f$ is positive and continuous at $Q_p$, then the
    sample quantile $\hat{Q}_p$ has an approximate normal distribution with mean $Q_p$
    and variance

    \[ \frac{p(1 - p)}{n f^2(Q_p)} \]

    when $n$ is large. Therefore, the variance of $\hat{Q}_p$ depends on the unknown
    density $f$ evaluated at the unknown quantile $Q_p$. Since the density $f$ is not
    known, it is difficult to use this result in practice. Note that even if the
    distribution $F$ were known it would still be a difficult task to derive the
    sampling distribution of $\hat{C}_{npk}$.
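    The asymptotic variance can be checked by simulation in a case where the density is known. The sketch below uses the median of standard normal samples, for which the formula gives pi/(2n). The sample size, number of replications, and seed are arbitrary illustrative choices.

```r
set.seed(1)
n <- 400       # sample size
reps <- 4000   # number of simulated samples
p <- 0.5       # the median

# Sample medians from repeated standard normal samples.
medians <- replicate(reps, quantile(rnorm(n), probs = p, names = FALSE))

# Asymptotic variance p(1 - p) / (n f(Q_p)^2), with f and Q_p known here.
asymptotic_var <- p * (1 - p) / (n * dnorm(qnorm(p))^2)
simulated_var <- var(medians)

ratio <- simulated_var / asymptotic_var   # should be close to 1
ratio
```

    In practice the density is unknown, so this check is not available for real process data.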

    To deal with the problem of computing confidence intervals for Cnpk for non-

    normal data, alternative methods were considered. These methods can estimate

    the sampling distribution of Cnpk without having to specify the unknown density

    f . These methods are based on the concept of bootstrap estimation developed by

    Efron (1979). Four different types of bootstrap confidence intervals are considered.

    2.1 The Standard Bootstrap Confidence Interval

    Consider a random sample $X_1, X_2, \ldots, X_n$ from a process that follows some
    unknown distribution $F$. To compute a standard bootstrap confidence interval, we
    begin by simulating $b$ samples of size $n$ from the empirical distribution of the
    sample. These samples are selected, with replacement, from the observed random
    sample $X_1, X_2, \ldots, X_n$. Such samples are called resamples. For each resample,
    $\hat{C}_{npk}$ is computed. Suppose $\hat{C}^*_{npk(1)}, \hat{C}^*_{npk(2)}, \ldots,
    \hat{C}^*_{npk(b)}$ are the $b$ estimates of the process capability index $C_{npk}$
    computed on the resamples. Then the standard bootstrap confidence interval for
    $C_{npk}$ is given by

    \[ \left[ \hat{C}_{npk} - z_{\alpha/2}\, SE(\hat{C}^*_{npk}),\; \hat{C}_{npk} + z_{\alpha/2}\, SE(\hat{C}^*_{npk}) \right], \]

    where $z_{\alpha/2}$ is the upper $\alpha/2$ quantile of the standard normal
    distribution,

    \[ SE(\hat{C}^*_{npk}) = \sqrt{ \frac{1}{b - 1} \sum_{i=1}^{b} \left( \hat{C}^*_{npk(i)} - \bar{C}^*_{npk} \right)^2 }, \]

    and

    \[ \bar{C}^*_{npk} = \frac{1}{b} \sum_{i=1}^{b} \hat{C}^*_{npk(i)}. \]
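    The steps above can be sketched as follows. This is an illustrative outline rather than the simulation code used in the thesis: the helper names `cnpk_hat` and `standard_boot_ci`, the default of 1000 resamples, and the example data are assumptions.

```r
# Hypothetical helper: sample estimate of Cnpk from the 0.5th, 50th and 99.5th percentiles.
cnpk_hat <- function(x, lsl, usl) {
  q <- quantile(x, probs = c(0.005, 0.5, 0.995), names = FALSE)
  min((q[2] - lsl) / (q[2] - q[1]), (usl - q[2]) / (q[3] - q[2]))
}

standard_boot_ci <- function(x, lsl, usl, b = 1000, alpha = 0.10) {
  est <- cnpk_hat(x, lsl, usl)
  # b resamples of size n, drawn with replacement from the observed sample.
  boot <- replicate(b, cnpk_hat(sample(x, replace = TRUE), lsl, usl))
  se <- sd(boot)                 # sd() uses the 1 / (b - 1) divisor
  z <- qnorm(1 - alpha / 2)
  c(lower = est - z * se, upper = est + z * se)
}

# Illustrative data: standard normal sample with limits at -3 and 3.
set.seed(2009)
x <- rnorm(250)
ci <- standard_boot_ci(x, lsl = -3, usl = 3, b = 500)
ci
```

    The interval is symmetric about the estimate by construction, which may be a poor fit when the sampling distribution is skewed.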

    2.2 The Percentile Method Bootstrap Confidence Interval

    Consider a random sample $X_1, X_2, \ldots, X_n$ from a process that follows some
    unknown distribution $F$. To compute the percentile method bootstrap confidence
    interval, we begin by simulating $b$ resamples of size $n$. These samples are
    selected, with replacement, from the observed random sample
    $X_1, X_2, \ldots, X_n$. On each resample $\hat{C}_{npk}$ is computed. Suppose
    $\hat{C}^*_{npk(1)}, \hat{C}^*_{npk(2)}, \ldots, \hat{C}^*_{npk(b)}$ are the $b$
    estimates of the process capability index $C_{npk}$ computed on the resamples. Sort
    these values in ascending order and let $\hat{C}^*_{npk[1]}, \hat{C}^*_{npk[2]},
    \ldots, \hat{C}^*_{npk[b]}$ denote the sorted values. A $100(1 - \alpha)\%$ bootstrap
    percentile method confidence interval for $C_{npk}$ is then given by

    \[ \left[ \hat{C}^*_{npk[b(\alpha/2)]},\; \hat{C}^*_{npk[b(1 - \alpha/2)]} \right]. \]
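    A minimal sketch of the percentile method follows. The rounding of the two order-statistic indices with `floor` and `ceiling` is an assumption made here, since the text does not say how non-integer indices are handled. The helper names and example data are likewise illustrative.

```r
# Hypothetical helper: sample estimate of Cnpk.
cnpk_hat <- function(x, lsl, usl) {
  q <- quantile(x, probs = c(0.005, 0.5, 0.995), names = FALSE)
  min((q[2] - lsl) / (q[2] - q[1]), (usl - q[2]) / (q[3] - q[2]))
}

percentile_boot_ci <- function(x, lsl, usl, b = 1000, alpha = 0.10) {
  # Sorted bootstrap estimates of Cnpk.
  boot <- sort(replicate(b, cnpk_hat(sample(x, replace = TRUE), lsl, usl)))
  lo <- max(1, floor(b * alpha / 2))          # assumed rounding rule
  hi <- min(b, ceiling(b * (1 - alpha / 2)))  # assumed rounding rule
  c(lower = boot[lo], upper = boot[hi])
}

set.seed(2009)
x <- rnorm(250)
ci <- percentile_boot_ci(x, lsl = -3, usl = 3, b = 500)
ci
```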

    2.3 The Bootstrap-t Confidence Interval

    Consider a random sample $X_1, X_2, \ldots, X_n$ that follows some unknown
    distribution $F$. To compute the bootstrap-t confidence interval, we begin by
    simulating $b$ resamples of size $n$. These samples are selected, with replacement,
    from the observed random sample $X_1, X_2, \ldots, X_n$. On each resample
    $\hat{C}_{npk}$ is computed. Suppose $\hat{C}^*_{npk(1)}, \hat{C}^*_{npk(2)}, \ldots,
    \hat{C}^*_{npk(b)}$ are the $b$ estimates of the process capability index $C_{npk}$.
    A second level of bootstrap samples is then generated by resampling from each of
    the $b$ resamples generated above. Suppose $c$ resamples are generated from each of
    the $b$ resamples, and let $\hat{C}^{**}_{npk(i,1)}, \hat{C}^{**}_{npk(i,2)}, \ldots,
    \hat{C}^{**}_{npk(i,c)}$ be the $c$ estimates of $C_{npk}$ computed from the
    second-level resamples of the $i$th resample. Thus $cb$ second-level values of
    $\hat{C}_{npk}$ are computed in total. The standard error of $\hat{C}^*_{npk(i)}$ is
    computed for each of the $b$ bootstrap resamples as

    \[ SE^*(\hat{C}^*_{npk(i)}) = \sqrt{ \frac{1}{c - 1} \sum_{j=1}^{c} \left( \hat{C}^{**}_{npk(i,j)} - \bar{C}^{**}_{npk(i)} \right)^2 }, \]

    where

    \[ \bar{C}^{**}_{npk(i)} = \frac{1}{c} \sum_{j=1}^{c} \hat{C}^{**}_{npk(i,j)}. \]

    This is followed by computing the measure

    \[ T^*_i = \frac{\hat{C}^*_{npk(i)} - \hat{C}_{npk}}{SE^*(\hat{C}^*_{npk(i)})} \]

    for each of the $b$ original bootstrap resamples. Finally the $b$ values of $T^*$ are
    sorted in ascending order. These are denoted as $T^*[1], T^*[2], \ldots, T^*[b]$. A
    $100(1 - \alpha)\%$ bootstrap-t confidence interval for $C_{npk}$ is then defined as

    \[ \left[ \hat{C}_{npk} - T^*[b(1 - \alpha/2)]\, SE(\hat{C}_{npk}),\; \hat{C}_{npk} - T^*[b(\alpha/2)]\, SE(\hat{C}_{npk}) \right], \]

    where $SE(\hat{C}_{npk})$ is as computed previously.
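    The nested resampling can be sketched as below. Two choices here are assumptions rather than details taken from the thesis: the standard error in the final interval is taken to be the standard deviation of the first-level bootstrap estimates, and the order-statistic indices are rounded with `floor` and `ceiling`. The helper names, resample counts, and example data are illustrative.

```r
# Hypothetical helper: sample estimate of Cnpk.
cnpk_hat <- function(x, lsl, usl) {
  q <- quantile(x, probs = c(0.005, 0.5, 0.995), names = FALSE)
  min((q[2] - lsl) / (q[2] - q[1]), (usl - q[2]) / (q[3] - q[2]))
}

boot_t_ci <- function(x, lsl, usl, b = 200, b_inner = 50, alpha = 0.10) {
  est <- cnpk_hat(x, lsl, usl)
  boot <- numeric(b)
  t_star <- numeric(b)
  for (i in seq_len(b)) {
    xi <- sample(x, replace = TRUE)                 # first-level resample
    boot[i] <- cnpk_hat(xi, lsl, usl)
    # Second-level resamples give a standard error for this resample.
    inner <- replicate(b_inner, cnpk_hat(sample(xi, replace = TRUE), lsl, usl))
    t_star[i] <- (boot[i] - est) / sd(inner)        # studentized statistic
  }
  t_sorted <- sort(t_star)
  se <- sd(boot)                                    # assumed overall standard error
  lo <- max(1, floor(b * alpha / 2))
  hi <- min(b, ceiling(b * (1 - alpha / 2)))
  c(lower = est - t_sorted[hi] * se, upper = est - t_sorted[lo] * se)
}

set.seed(2009)
x <- rnorm(100)
ci <- boot_t_ci(x, lsl = -3, usl = 3)
ci
```

    The cost grows with the product of the two resample counts, so this method is much slower than the other three.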

    2.4 The Hybrid Bootstrap Confidence Interval

    Consider a random sample $X_1, X_2, \ldots, X_n$ that follows some unknown process
    distribution $F$. We begin by simulating $b$ resamples of size $n$. These samples are
    selected, with replacement, from the observed random sample
    $X_1, X_2, \ldots, X_n$. On each resample $\hat{C}_{npk}$ is computed. Suppose
    $\hat{C}^*_{npk(1)}, \hat{C}^*_{npk(2)}, \ldots, \hat{C}^*_{npk(b)}$ are the $b$
    estimates of the process capability index $C_{npk}$ computed on the resamples. We
    compute the measure $H^* = \hat{C}^*_{npk} - \hat{C}_{npk}$ for each of the $b$
    original bootstrap samples. The $b$ values of $H^*$ are then sorted in ascending
    order. These are denoted as $H^*[1], H^*[2], \ldots, H^*[b]$. A $100(1 - \alpha)\%$
    hybrid bootstrap confidence interval is then given by

    \[ \left[ \hat{C}_{npk} - H^*[b(1 - \alpha/2)],\; \hat{C}_{npk} - H^*[b(\alpha/2)] \right]. \]
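    The sketch below computes the hybrid interval together with the percentile interval from the same set of resamples. The hybrid interval is the percentile interval reflected about the estimate, so the two always have the same length. The helper names, the index rounding rule, and the example data are illustrative assumptions.

```r
# Hypothetical helper: sample estimate of Cnpk.
cnpk_hat <- function(x, lsl, usl) {
  q <- quantile(x, probs = c(0.005, 0.5, 0.995), names = FALSE)
  min((q[2] - lsl) / (q[2] - q[1]), (usl - q[2]) / (q[3] - q[2]))
}

boot_intervals <- function(x, lsl, usl, b = 1000, alpha = 0.10) {
  est <- cnpk_hat(x, lsl, usl)
  boot <- sort(replicate(b, cnpk_hat(sample(x, replace = TRUE), lsl, usl)))
  lo <- max(1, floor(b * alpha / 2))          # assumed rounding rule
  hi <- min(b, ceiling(b * (1 - alpha / 2)))  # assumed rounding rule
  h <- boot - est                             # already sorted because boot is sorted
  list(estimate = est,
       percentile = c(boot[lo], boot[hi]),
       hybrid = c(est - h[hi], est - h[lo]))
}

set.seed(2009)
x <- rnorm(250)
res <- boot_intervals(x, lsl = -3, usl = 3, b = 500)
res$hybrid
diff(res$hybrid) - diff(res$percentile)   # zero up to rounding error
```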

    2.5 Simulation Study

    A computer based simulation was developed to study the performance of the

    four bootstrap confidence intervals introduced above. Using samples from known
    distributions, the four different bootstrap confidence intervals were computed and

    their ability to capture the true parameter value was studied.

    The algorithm is as follows. The sample size n, upper specification limit (USL)

    and lower specification limit (LSL) were specified. For each distribution, USL and

    LSL were selected to give a proportion non-conforming equal to 0.0027. The true

    value of Cnpk was computed using the specified limits. A random sample of size n

    was generated from the specified distribution. 90% confidence intervals were created

    using the four methods on the generated sample. This operation was repeated 1000

    times and each time it was determined if the true value of Cnpk was in each of the

    intervals. The width of the intervals was also computed.
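    The first steps of the algorithm can be sketched for the standard normal case. The text states only that the proportion non-conforming is 0.0027. Splitting it equally between the two tails is an assumption made here for illustration.

```r
# Assumed equal-tail specification limits giving 0.0027 non-conforming in total.
lsl <- qnorm(0.00135)
usl <- qnorm(0.99865)

# True Cnpk from the 0.5th, 50th and 99.5th percentiles of the standard normal.
q <- qnorm(c(0.005, 0.5, 0.995))
true_cnpk <- min((q[2] - lsl) / (q[2] - q[1]), (usl - q[2]) / (q[3] - q[2]))
true_cnpk

# Proportion of the distribution outside the limits.
pnorm(lsl) + (1 - pnorm(usl))
```

    Coverage is then the fraction of the 1000 simulated samples whose interval contains `true_cnpk`.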

    Normal mixtures were used in the study. Four different kinds of distributions

    were used. They were the normal, skewed unimodal, strongly skewed unimodal

    and kurtotic unimodal. The last three densities were studied by Marron and Wand

    (1992). Density plots of the skewed unimodal, strongly skewed unimodal, and kur-

    totic unimodal distributions are given in Figures 2.1-2.3.

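    The skewed unimodal density has the form

    \[ \tfrac{1}{5}\phi(x) + \tfrac{1}{5}\phi\left(x; \mu = \tfrac{1}{2}, \sigma = \tfrac{2}{3}\right) + \tfrac{3}{5}\phi\left(x; \mu = \tfrac{13}{12}, \sigma = \tfrac{5}{9}\right), \]

    where

    \[ \phi(x; \mu, \sigma) = (2\pi\sigma^2)^{-1/2} \exp\left[ -\frac{1}{2} \frac{(x - \mu)^2}{\sigma^2} \right] \]

    and $\phi(x)$ denotes the standard normal density $\phi(x; 0, 1)$. This density is
    plotted in Figure 2.1.

    [Figure 2.1: Skewed Unimodal Density. Horizontal axis: x; vertical axis: Density.]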
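    A normal mixture such as the skewed unimodal density can be evaluated and sampled with a few lines of R. The function names `dnormmix` and `rnormmix` are illustrative and are not taken from the thesis code.

```r
# Density of a finite normal mixture with weights w, means mu and sds sigma.
dnormmix <- function(x, w, mu, sigma) {
  dens <- 0 * x
  for (k in seq_along(w)) dens <- dens + w[k] * dnorm(x, mu[k], sigma[k])
  dens
}

# Random sample: pick a component for each draw, then sample from it.
rnormmix <- function(n, w, mu, sigma) {
  k <- sample(seq_along(w), size = n, replace = TRUE, prob = w)
  rnorm(n, mean = mu[k], sd = sigma[k])
}

# Parameters of the skewed unimodal density.
w <- c(1, 1, 3) / 5
mu <- c(0, 1 / 2, 13 / 12)
sigma <- c(1, 2 / 3, 5 / 9)

total <- integrate(dnormmix, -Inf, Inf, w = w, mu = mu, sigma = sigma)$value
set.seed(1)
x <- rnormmix(250, w, mu, sigma)
```

    The same two functions apply to the other mixtures in the study once their weights, means, and standard deviations are supplied.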

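    The strongly skewed unimodal density has the form

    \[ \tfrac{1}{8}\phi(x)
       + \tfrac{1}{8}\phi\left(x; \mu = -1, \sigma = \tfrac{2}{3}\right)
       + \tfrac{1}{8}\phi\left(x; \mu = -\tfrac{5}{3}, \sigma = \tfrac{4}{9}\right)
       + \tfrac{1}{8}\phi\left(x; \mu = -\tfrac{19}{9}, \sigma = \tfrac{8}{27}\right)
       + \tfrac{1}{8}\phi\left(x; \mu = -\tfrac{65}{27}, \sigma = \tfrac{16}{81}\right)
       + \tfrac{1}{8}\phi\left(x; \mu = -\tfrac{211}{81}, \sigma = \tfrac{32}{243}\right)
       + \tfrac{1}{8}\phi\left(x; \mu = -\tfrac{665}{243}, \sigma = \tfrac{64}{729}\right)
       + \tfrac{1}{8}\phi\left(x; \mu = -\tfrac{2059}{729}, \sigma = \tfrac{128}{2187}\right), \]

    and is plotted in Figure 2.2.

    [Figure 2.2: Strongly Skewed Unimodal Density. Horizontal axis: x; vertical axis: Density.]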

    The kurtotic unimodal density has the form

    \[ \tfrac{2}{3}\phi(x) + \tfrac{1}{3}\phi\left(x; \mu = 0, \sigma = \tfrac{1}{10}\right), \]

    and is plotted in Figure 2.3.

    [Figure 2.3: Kurtotic Unimodal Density. Horizontal axis: x; vertical axis: Density.]

    Table 2.1: Simulation Results for samples from a Standard Normal Density

      n    Method        Coverage   Length
      25   Percentile    22.9%      0.890
           Hybrid        62.6%      0.890
           Bootstrap-t   58.8%      1.085
           Standard      62.5%      1.0248
      50   Percentile    40.7%      0.537
           Hybrid        50.0%      0.537
           Bootstrap-t   57.3%      0.825
           Standard      70.0%      0.6323
      100  Percentile    64.2%      0.371
           Hybrid        46.1%      0.371
           Bootstrap-t   52.5%      0.725
           Standard      78.5%      0.4447
      250  Percentile    90.2%      0.355
           Hybrid        66.1%      0.355
           Bootstrap-t   76.3%      0.457
           Standard      88.9%      0.3793
      500  Percentile    93.3%      0.297
           Hybrid        75.5%      0.297
           Bootstrap-t   81.5%      0.327
           Standard      89.6%      0.2935

    Table 2.2: Simulation Results for samples from a Skewed Unimodal Density

      n    Method        Coverage   Length
      25   Percentile    23.3%      1.003640
           Hybrid        57.9%      1.003640
           Bootstrap-t   54.8%      1.205209
           Standard      61.6%      1.164100
      50   Percentile    40.3%      0.6135
           Hybrid        50.0%      0.6135
           Bootstrap-t   57.3%      0.9377
           Standard      71.0%      0.7144
      100  Percentile    64.6%      0.4048
           Hybrid        47.4%      0.4048
           Bootstrap-t   53.1%      0.7656
           Standard      78.6%      0.4797
      250  Percentile    90.6%      0.3826
           Hybrid        67.0%      0.3826
           Bootstrap-t   75.8%      0.4765
           Standard      89.9%      0.4076
      500  Percentile    93.1%      0.3203
           Hybrid        75.2%      0.3203
           Bootstrap-t   80.3%      0.3509
           Standard      90.8%      0.3233

    Table 2.3: Simulation Results for samples from a Strongly Skewed Unimodal Density

      n    Method        Coverage   Length
      25   Percentile    65.3%      0.4607
           Hybrid        78.0%      0.4607
           Bootstrap-t   71.1%      0.4074
           Standard      96.7%      0.5333
      50   Percentile    75.1%      0.2704
           Hybrid        62.9%      0.3149
           Bootstrap-t   62.9%      0.3149
           Standard      88.6%      0.3025
      100  Percentile    85.5%      0.1870
           Hybrid        40.7%      0.1870
           Bootstrap-t   43.9%      0.3246
           Standard      70.5%      0.2148
      250  Percentile    82.2%      0.2004
           Hybrid        64.7%      0.2004
           Bootstrap-t   76.1%      0.2652
           Standard      81.4%      0.2125
      500  Percentile    52.8%      0.1889
           Hybrid        70.4%      0.1889
           Bootstrap-t   72.9%      0.2003
           Standard      71.5%      0.1835

    Table 2.4: Simulation Results for samples from a Kurtotic Unimodal Density

    n     Method        Coverage   Length
    25    Percentile    22.8%      1.1720
    25    Hybrid        65.7%      1.1720
    25    Bootstrap-t   63.3%      1.5094
    25    Standard      71.9%      1.4476
    50    Percentile    39.8%      0.6363
    50    Hybrid        55.7%      0.6363
    50    Bootstrap-t   58.9%      1.0802
    50    Standard      71.70%     0.7752
    100   Percentile    62.6%      0.4214
    100   Hybrid        44.0%      0.4214
    100   Bootstrap-t   50.6%      0.8821
    100   Standard      79.6%      0.5100
    250   Percentile    90.8%      0.3999
    250   Hybrid        67.7%      0.3999
    250   Bootstrap-t   77.0%      0.4933
    250   Standard      90.7%      0.4277
    500   Percentile    92.1%      0.3297
    500   Hybrid        73.2%      0.3297
    500   Bootstrap-t   79.6%      0.3583
    500   Standard      90.5%      0.3348

    2.5.1 Conclusion

    From the tables above it is clear that:

    None of the methods does well for small samples.

    Coverage generally improves as the sample size increases, although the improvement is not monotone for every method, and the strongly skewed unimodal density (Table 2.3) is a clear exception.

    The Standard bootstrap method seems to do the best regarding coverage. Hence it would be the recommended approach for calculating confidence intervals for the capability index Cnpk, for the distributions used in this study.

  • CHAPTER 3

    R Program Code

    Below is the R code that was used to perform the simulation.

    # Sample estimate of Cnpk from the 0.5th, 50th, and 99.5th sample percentiles of x

    cnpkf=function(x,lsl,usl){

    x50=quantile(x,0.5)

    x99.5=quantile(x,0.995)

    x0.5=quantile(x,0.005)

    cnpl=(x50-lsl)/(x50-x0.5)

    cnpu=(usl-x50)/(x99.5-x50)

    cnpk=min(cnpl,cnpu)

    return(cnpk)

    }

    # True Cnpk for a standard normal process, from the exact normal quantiles

    cnpkt=function(lsl,usl){

    z50=qnorm(0.5)

    z99.5=qnorm(0.995)

    z0.5=qnorm(0.005)

    cnpl=(z50-lsl)/(z50-z0.5)

    cnpu=(usl-z50)/(z99.5-z50)

    true.cnpk=min(cnpl,cnpu)

    return(true.cnpk)

    }

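    # For one sample x, build the bootstrap intervals from b resamples and

    # record in coverage (1 x 4) whether each interval contains the true Cnpk

    cnpkbootpm=function(x,lsl,usl,b)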

    {

    coverage=matrix(0,1,4)

    n=length(x)

    cnpk=cnpkt(lsl,usl)

    T1=matrix(0,b,1)

    H1=matrix(0,b,1)

    T1S=matrix(0,b,1)

    H1S=matrix(0,b,1)

    cnpkstar=matrix(0,b,1)

    cnpkstars=matrix(0,b,1)

    cnpks=matrix(0,b,1)

    sampleSD=sd(x)

    sigma=min(sampleSD,IQR(x)/1.349)

    h=1.587*sigma*n^(-1/3)

    cnpkhat=cnpkf(x,lsl,usl)

    for(i in 1:b)

    {

    xstar=sample(x,n,replace=T)

    cnpkstar[i]=cnpkf(xstar,lsl,usl)

    cnpkstar1=matrix(0,100,1)   # the inner loop fills 100 entries; sizing by b would pad sd() with zeros

    for(j in 1:100)

    {

    xstar1=sample(xstar,n,replace=T)

    cnpkstar1[j]=cnpkf(xstar1,lsl,usl)

    }

    std=sd(cnpkstar1)

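    T1[i]=(cnpkstar[i]-cnpkhat)/std   # studentized pivot, centered at the sample estimate

    H1[i]=cnpkstar[i]-cnpkhat         # hybrid pivot, centered at the sample estimate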

    }

    cnpks=sort(cnpkstar)

    se=sd(cnpkstar)

    # standard bootstrap------------------------

    BS.CL=cnpkhat-1.96*(se)

    BS.CU=cnpkhat+1.96*(se)

    if ((cnpk>=BS.CL)&&(cnpk=BSP.CL)&&(cnpk

    tsort=sort(T1)

    d1=b*.95

    d2=b*.05

    l1=tsort[as.integer(d1)]

    u1=tsort[as.integer(d2)]

    BST.CL=cnpkhat-se*l1

    BST.CU=cnpkhat-se*u1

    if ((cnpk>=BST.CL)&&(cnpk=BSH.CL)&&(cnpk

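    # Repeat the coverage experiment iter times on standard normal samples of size n;

    # row i of covmat holds the four coverage results for sample i

    cnpksim=function(n,iter,lsl,usl)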

    {

    covmat=matrix(0,iter,4)

    b=1000

    for (i in 1:iter)

    {

    x=rnorm(n,0,1)

    covmat[i, ]=cnpkbootpm(x,lsl,usl,b)

    }

    return (covmat)

    }
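The following is a minimal, self-contained sketch of how the four interval types compared in the tables (standard, percentile, hybrid, bootstrap-t) could be computed for a single sample. It is not the thesis code: the names cnpk_hat and cnpk_intervals are hypothetical, and it assumes a nominal 95% level for every method, pivots centered at the sample estimate, and arbitrary specification limits of -3 and 3 in the usage lines.

```r
# Quantile-based index, equivalent to cnpkf (repeated so the sketch stands alone).
cnpk_hat <- function(x, lsl, usl) {
  q <- quantile(x, c(0.005, 0.5, 0.995), names = FALSE)
  min((q[2] - lsl) / (q[2] - q[1]), (usl - q[2]) / (q[3] - q[2]))
}

# Hypothetical helper: four bootstrap intervals for Cnpk from one sample.
# Assumptions: same nominal level for all methods; b outer resamples and
# b_inner inner resamples (used only for the bootstrap-t standard errors).
cnpk_intervals <- function(x, lsl, usl, b = 1000, b_inner = 100, level = 0.95) {
  n <- length(x)
  est <- cnpk_hat(x, lsl, usl)
  alpha <- 1 - level
  boot <- numeric(b)
  tstat <- numeric(b)
  for (i in seq_len(b)) {
    xstar <- sample(x, n, replace = TRUE)
    boot[i] <- cnpk_hat(xstar, lsl, usl)
    inner <- replicate(b_inner,
                       cnpk_hat(sample(xstar, n, replace = TRUE), lsl, usl))
    tstat[i] <- (boot[i] - est) / sd(inner)   # studentized pivot
  }
  se <- sd(boot)
  z <- qnorm(1 - alpha / 2)
  qb <- quantile(boot, c(alpha / 2, 1 - alpha / 2), names = FALSE)
  qt <- quantile(tstat, c(alpha / 2, 1 - alpha / 2), names = FALSE, na.rm = TRUE)
  rbind(
    standard    = c(est - z * se, est + z * se),
    percentile  = qb,
    hybrid      = c(2 * est - qb[2], 2 * est - qb[1]),
    bootstrap_t = c(est - se * qt[2], est - se * qt[1])
  )
}

# Usage with reduced resample counts to keep the run short.
set.seed(2009)
x <- rnorm(100)
ci <- cnpk_intervals(x, lsl = -3, usl = 3, b = 200, b_inner = 50)

# True index for a standard normal process with the assumed limits.
true_cnpk <- min((0 - (-3)) / (0 - qnorm(0.005)), (3 - 0) / (qnorm(0.995) - 0))
covered <- ci[, 1] <= true_cnpk & true_cnpk <= ci[, 2]
```

Averaging the covered indicators over many simulated samples gives coverage estimates of the kind reported in Tables 2.1 to 2.4. In this sketch the hybrid interval is the percentile interval reflected about the estimate, so the two always have the same length.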

  • REFERENCES

    [1] Chou, Y.-M. (1994). Selecting a better supplier by testing process capability indices. Quality Engineering, 6, 427-438.

    [2] Chan, L.K., Cheng, S.W. and Spiring, F.A. (1988). A new measure of process capability, Cpm. Journal of Quality Technology, 20, 160-175.

    [3] Efron, B. (1979). Bootstrap methods: Another look at the jackknife. The Annals of Statistics, 7, 1-26.

    [4] Huang, D.-Y. and Lee, R.F. (1995). Selecting the largest capability index from several quality control processes. Journal of Statistical Planning and Inference, 46, 335-346.

    [5] Kane, V.E. (1986). Process capability indices. Journal of Quality Technology, 18, 41-52.

    [6] Kotz, S. and Johnson, N.L. (1993). Process Capability Indices. Chapman and Hall, London.

    [7] Luceno, A. (1996). A process capability ratio with reliable confidence intervals. Communications in Statistics, Simulation and Computation, 25, 235-246.

    [8] McCormack, D.W., Harris, I.R., Horwitz, A.M. and Spagon, P.D. (2000). Capability indices for non-normal data. Quality Engineering, 12, 489-495.

    [9] Montgomery, D.C. (2009). Introduction to Statistical Quality Control. Sixth Edition.

    [10] Marron, J.S. and Wand, M.P. (1992). Exact mean integrated squared error. The Annals of Statistics, 20, 712-736.

    [11] Polansky, A.M. (2003). Supplier selection based on bootstrap confidence regions of process capability indices. International Journal of Reliability, Quality and Safety Engineering, 10, 1-14.

    [12] Polansky, A.M. (2006). Permutation methods for comparing process capability indices. Journal of Quality Technology, 38, 254-266.

    [13] Polansky, A.M. and Kirmani, S.N.U.A. (2003). Quantifying the capability of industrial processes. Handbook of Statistics, Volume 22. Elsevier Science, 625-656.

    [14] Tseng, S.-T. and Wu, T.-Y. (1991). Selecting the best manufacturing process. Journal of Quality Technology, 23, 53-62.