24
Contributed Research Articles 194 Statistical Quality Control with the qcr Package by Miguel Flores, Rubén Fernández-Casal, Salvador Naya, Javier Tarrío-Saavedra Abstract The R package qcr for Statistical Quality Control (SQC) is introduced and described. It includes a comprehensive set of univariate and multivariate SQC tools that completes and increases the SQC techniques available in R. Apart from integrating different R packages devoted to SQC (qcc, MSQC), qcr provides nonparametric tools that are highly useful when Gaussian assumption is not met. This package computes standard univariate control charts for individual measurements, ¯ x, S, R, p, np, c, u, EWMA, and CUSUM. In addition, it includes functions to perform multivariate control charts such as Hotelling T 2 , MEWMA and MCUSUM. As representative features, multivariate nonparametric alternatives based on data depth are implemented in this package: r, Q and S control charts. The qcr library also estimates the most complete set of capability indices from first to the fourth generation, covering the nonparametric alternatives, and performing the corresponding capability analysis graphical outputs, including the process capability plots. Moreover, Phase I and II control charts for functional data are included. Introduction Throughout the last decades, there has been an increasing interest to measure, improve, and control the quality of products, services, and procedures. This is connected to the strong relationship between quality, productivity, prestige, trust, and brand image. In fact, implementing procedures of statistical quality control (SQC) is currently related to increasing companies’ competitiveness. The concept of quality control has been extended from the first definitions based on the idea of adjusting production to a standard model to satisfy customer requirements and include all participants. Nowadays, SQC is not only applied to manufactured products but to all industrial and service processes. DEFINE Process mapping Ishikawa Pareto MEASURE Gage R & R Exploratory data analysis ANALYZE Exploratory data analysis Statistical inference Process capability analysis IMPROVE Design of experiments ANOVA Regression CONTROL Control charts Figure 1: Statistical tools applied in each steps of the Six Sigma methodology. The use of different SQC techniques was standardized With the development of the Six Sigma method by Motorola in 1997 (Pande et al., 2000). Six Sigma is a methodology or even philosophy focused on variability reduction that promotes the use of statistical methods and tools in order to improve processes in industry and services. The Six Sigma application is composed of five stages: Define, Measure, Analyze, Improve, and Control (DEMAIC). Figure 1 shows some representative statistical techniques applied in each of the Six Sigma stages. The two most representative statis- tical tools of SQC are the control charts and the process capability analysis (Montgomery, 2009). Therefore, the proposed qcr package has been developed in order to provide users a comprehensive and free set of programming tools to apply control charts and perform capability analysis in the SQC framework. The control stage is characterized by the use of tools based on anomaly detection and correction (Montgomery, 2009). The most representative techniques of this stage and the primary tool of the Statistical Process Control (SPC) are the control charts (Champ and Woodall, 1987). They have been The R Journal Vol. 13/1, June 2021 ISSN 2073-4859

Prácticas de CEC con R

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Prácticas de CEC con R

Contributed Research Articles 194

Statistical Quality Control with the qcrPackageby Miguel Flores, Rubén Fernández-Casal, Salvador Naya, Javier Tarrío-Saavedra

Abstract The R package qcr for Statistical Quality Control (SQC) is introduced and described. Itincludes a comprehensive set of univariate and multivariate SQC tools that completes and increasesthe SQC techniques available in R. Apart from integrating different R packages devoted to SQC (qcc,MSQC), qcr provides nonparametric tools that are highly useful when Gaussian assumption is notmet. This package computes standard univariate control charts for individual measurements, x, S, R,p, np, c, u, EWMA, and CUSUM. In addition, it includes functions to perform multivariate controlcharts such as Hotelling T2, MEWMA and MCUSUM. As representative features, multivariatenonparametric alternatives based on data depth are implemented in this package: r, Q and S controlcharts. The qcr library also estimates the most complete set of capability indices from first tothe fourth generation, covering the nonparametric alternatives, and performing the correspondingcapability analysis graphical outputs, including the process capability plots. Moreover, Phase I andII control charts for functional data are included.

Introduction

Throughout the last decades, there has been an increasing interest to measure, improve, and controlthe quality of products, services, and procedures. This is connected to the strong relationshipbetween quality, productivity, prestige, trust, and brand image. In fact, implementing procedures ofstatistical quality control (SQC) is currently related to increasing companies’ competitiveness.The concept of quality control has been extended from the first definitions based on the ideaof adjusting production to a standard model to satisfy customer requirements and include allparticipants. Nowadays, SQC is not only applied to manufactured products but to all industrial andservice processes.

4

Prácticas de CEC con R

Salvador Naya Javier Tarrío Miguel Flores

qcc

Univariate and multivariate

control charts

Process capability

OC curves

Ishikawa

Pareto

qcr

Univariate and multivariante control charts

Multivariate memory-type control charts:

MEWMA, CUSUM

Nonparametric control charts

Univariante and multivariate

process capability

Nonparametric process

capability

qualityTools

Ishikawa

Pareto

Gage R & R

Process capability

Design of experiments

IQCC

Univariate and multivariate

control charts

SixSigma

Process mapping

Ishikawa

Process capability

Gage R&R

Loss function

DEFINE

Process mapping

Ishikawa

Pareto

MEASURE

Gage R & R

Exploratory data analysis

ANALYZE

Exploratory data analysis

Statistical inference

Process capability analysis

IMPROVE

Design of experiments

ANOVA

Regression

CONTROL

Control charts

Figure 1: Statistical tools applied in each steps of the Six Sigma methodology.

The use of different SQC techniques was standardized With the development of the Six Sigmamethod by Motorola in 1997 (Pande et al., 2000). Six Sigma is a methodology or even philosophyfocused on variability reduction that promotes the use of statistical methods and tools in order toimprove processes in industry and services. The Six Sigma application is composed of five stages:Define, Measure, Analyze, Improve, and Control (DEMAIC). Figure 1 shows some representativestatistical techniques applied in each of the Six Sigma stages. The two most representative statis-tical tools of SQC are the control charts and the process capability analysis (Montgomery, 2009).Therefore, the proposed qcr package has been developed in order to provide users a comprehensiveand free set of programming tools to apply control charts and perform capability analysis in theSQC framework.

The control stage is characterized by the use of tools based on anomaly detection and correction(Montgomery, 2009). The most representative techniques of this stage and the primary tool of theStatistical Process Control (SPC) are the control charts (Champ and Woodall, 1987). They have been

The R Journal Vol. 13/1, June 2021 ISSN 2073-4859

Page 2: Prácticas de CEC con R

Contributed Research Articles 195

developed to evaluate the process performance and at any time. The use of control charts preventsthe process from getting out of control, and helping to detect the assignable causes corresponding tovariations of the critical-to-quality features (CTQs), thus performing process changes when actuallyrequired. Furthermore, control charts provide estimates of the natural range of process variation(natural control limits), allowing us to compare this range with those limits specified by standards,company managers, or customers (specification limits). Hence, the process monitoring can be carriedout by comparing each new observation with these natural limits, preventing defects in the finalproduct. Briefly, a control chart is a two-dimensional graph whose axis represents the variable orattribute that is being monitored (CTQ variables). The estimation of natural control limits of theCTQ variables is developed by a process composed of two phases: In Phase I, the natural controllimits are estimated using a preliminary sample (calibration sample) where we assume that thecauses of variation are only random. In Phase II, each new observation is plotted on the controlchart along with the natural limits obtained in the previous step. The set of new observations(twhich are not used to calculate the natural control limits) make up the so-called monitoring sample.Patterns, observations of out of control limits, runs of more than six observations on one side of thecentral line, among others, are some of the different criteria to identify out of control states in aspecific process, providing also valuable information about the detection of any assignable causes ofvariation in the monitoring.

Anomaly detection-quality control approach to improve and ensure energy efficiency and hygrothermal comfort in buildings

Multivariate statistics

Parametric

T2

Ellipse

MEWMA and MCUSUM

Nonparametric

R

Q

S

Univariate statistics

Parametric charts for variables

𝑋𝑋�-R

𝑋𝑋�-S

I-MR

EWMA and CUSUM

Parametric charts for atributes

p and np

u and c

Nonparametric charts

r

Q

S

FDA statistics

Phase I

Functional data chart

Depth control chart

Phase II

Functional data chart

Rank control chart

Figure 2: Control charts implemented in the qcr package.

The most used control charts are based on the assumptions of normality and independence ofthe studied CTQ variables. These charts are used to control the position and dispersion of CTQattributes and variables. Figure 2 shows some of the most important types of control charts. Thesecan be classified according to the type of feature that is being controlled (attribute or variable),the variable dimension (univariate or multivariate), and assuming or not a parametric distributionof the variable (parametric or nonparametric). The qcr package provides charts for the mean (x),standard deviation (s), range (R), individual measurements (I), moving ranges (MR), proportionof nonconforming units (p), number of nonconforming units (np), number of defects per unit (c),mean number of defects per control unit (u), exponentially weighted moving average (EWMA), andcumulative sum control chart (CUSUM). The last two techniques are also called memory controlcharts, and they are specially designed to detect shifts of less than two standard deviations, bothwhen using rational samples or individual measurements. On the other hand, new control chartsbased on the concept of data depth and developed by Liu (1995) are implemented in qcr. Thoseare the r, Q, and S control charts, the nonparametric alternatives for individual measurements,mean control chart, and CUSUM control chart, respectively. When more than one variable definesthe process quality, multivariate control charts are applied. If the Gaussian assumption is met, theHotelling T2 control chart can be applied. If we want to detect small deviations, multivariate EWMA(MEWMA) and multivariate CUSUM (MCUSUM) can be implemented. When no parametricdistribution is assumed, r, Q, and S charts can be used.

Another interesting SQC tool, which is very useful in the industry, is the Process CapabilityAnalysis (PCA). It estimates how well a process meets the tolerances defined by the company,customers, standards, etc., by comparing the specification tolerances with respect to the naturalrange of variation of CTQ features. The capability of the process is measured using capability

The R Journal Vol. 13/1, June 2021 ISSN 2073-4859

Page 3: Prácticas de CEC con R

Contributed Research Articles 196

indicators. Process Capability Ratio (PCR) is a numerical score that helps the manufacturers knowwhether the output of a process meets the engineering specifications. Large PCR values show thatthe industrial or service process is capable of meeting the customer requirements. There have beenmany different PCRs developed in the last four decades that require the Gaussian assumption forthe CTQ variable (Boyles, 1991). However, many processes in industry and real applications do notmeet this hypothesis. Thus, we could innacuratelly estimate the capability using PCR. Hence, manyauthors have studied different nonparametric alternatives to traditional PCR (Polansky, 2007).

The qcr package has been developed in R (R Core Team, 2021) under the GNU license. Nowadays,there are other R packages that currently provide quality control tools for users. The use of eachone is shown in Figure 3.

The qcc package (Scrucca, 2004) was developed by Professor Luca Scrucca of the Department ofEconomics, Finance, and Statistics at the University of Perugia. It enables us to perform Shewhartquality control charts for variables and attributes, as well as the CUSUM and EWMA charts for de-tecting small changes in the CTQ variable. Multivariate analysis is performed applying the HotellingT2 control chart. Additionally, it has functions implemented to obtain the operating characteristiccurves (OC) and to estimate process capability analysis indices. Pareto and Ishikawa diagrams arealso implemented. Otherwise, the IQCC package (Barros, 2017) is maintained by Professor EmanuelP. Barbosa of the Institute of Mathematics in the State University of Campinas. It has a smallernumber of control charts implemented, but it incorporates multivariate graphics. The qualityToolspackage (Roth, 2016) was developed to aid learning in quality sciences. Figure 3 shows some ofits utilities, e.g., capability analysis (providing a comprehensive set of parametric distributions)and design of experiments. In addition, the SixSigma library (Cano et al., 2012, 2015) provides al-ternative functions to qualityTools and qcc packages and the possibility of implementing process maps.

4

Prácticas de CEC con R

Salvador Naya Javier Tarrío Miguel Flores

qcc

Univariate and multivariate

control charts

Process capability

OC curves

Ishikawa

Pareto

qcr

Univariate and multivariante control charts

Multivariate memory-type control charts:

MEWMA, CUSUM

Nonparametric control charts

Univariante and multivariate

process capability

Nonparametric process

capability

qualityTools

Ishikawa

Pareto

Gage R & R

Process capability

Design of experiments

IQCC

Univariate and multivariate

control charts

SixSigma

Process mapping

Ishikawa

Process capability

Gage R&R

Loss function

DEFINE

Process mapping

Ishikawa

Pareto

MEASURE

Gage R & R

Exploratory data analysis

ANALYZE

Exploratory data analysis

Statistical inference

Process capability analysis

IMPROVE

Design of experiments

ANOVA

Regression

CONTROL

Control charts

Figure 3: Comparison between the main packages in R devoted to Statistical Quality Control andthe qcr package.

Furthermore, there are other libraries specifically focused on control chart applications. Namely,the spcadjust (Gandy and Kvaloy, 2013) that allows us to estimate the calibration control limits ofShewhart, CUSUM, and EWMA control charts, and the spc (Knoth, 2021) which provides toolsfor the evaluation of EWMA, CUSUM, and Shiryaev-Roberts control charts by using Average RunLength and RL quantiles criteria. Moreover, the MSQC package (Santos-Fernandez, 2013) is a setof tools for multivariate process control, mainly control charts. It contains the main alternativesfor multivariate control charts such as Hotelling (T2), Chi-squared, MEWMA, MCUSUM, andGeneralized Variance control charts. It also includes some tools to evaluate the multivariate normalassumption. The corresponding multivariate capability analysis can be performed using the MPCIlibrary (Santos-Fernández and Scagliarini, 2012) that provides different multivariate capabilityindices. It is also interesting to mention the edcc package (Zhu and Park, 2013) for its economicdesign of control charts by minimizing the expected cost per hour of the studied process.

It is important to emphasize that the qcr package also includes new applications such asnonparametric approaches of control charts and capability indices (also covering the capability plots),

The R Journal Vol. 13/1, June 2021 ISSN 2073-4859

Page 4: Prácticas de CEC con R

Contributed Research Articles 197

which are currently unavailable in the R software.

Datasets in the qcr package

The qcr package contains new databases (see table 1) based on study cases tackled by the authorsduring their professional activity as well as well-known datasets implemented on other packagesfocused on statistical quality control such as:

• archery1: It consists of a stage in which the archer shoots 72 arrows. The information isgiven in x and y coordinates. It is implemented in the MSQC package (Santos-Fernandez2013).

• circuit: Number of nonconformities observed in 26 successive samples of 100 printed circuitboards. It is implemented in the qcc package (Scrucca 2004).

• dowel1: Diameter and length of a dowel pin. It is implemented in the MSQC package(Santos-Fernandez 2013).

• orangejuice: Frozen concentrated orange juice is packed in 6-oz cartons. These cartons areformed on a machine by spinning them from a cardboard stock and attaching a metal bottompanel. A can is then inspected to determine whether, when filled, the liquid could possiblyleak either on the side seam or around the bottom joint. If this occurs, a can is considerednonconforming. The data were collected as 30 samples of 50 cans each at half-hour intervalsover a three-shift period in which the machine was in continuous operation. It is implementedin the qcc package (Scrucca 2004).

• pcmanufact: A personal computer manufacturer counts the number of nonconformities perunit on the final assembly line. He collects data on 20 samples of 5 computers each. It isimplemented in the qcc package (Scrucca 2004).

• pistonrings: Piston rings for an automotive engine are produced by a forging process. Theinside diameter of the rings manufactured by the process is measured on 25 samples, eachof size 5, drawn from a process being considered ‘in control’. It is implemented in the qccpackage (Scrucca 2004).

Univariate and multivariate parametric control charts in qcr

The construction of a control chart is equivalent to the plotting of the acceptance regions of asequence of hypothesis tests over time. Namely, the x chart is a control chart used to monitor theprocess mean µ. It plots the sample means, X’s, corresponding to subgroups of the {X1, X2, ...}observations and is equivalent to test the hypotheses H0 : µ = µ0 versus Hα : µ = µ0 (for sometarget value µ0) conducted over time, using x as the test statistic. Here we assume that {X1, X2, ...}are the sample measurements of a particular CTQ feature that follows the F distribution with meanµ and standard deviation σ. When there is insufficient evidence to reject H0, we can state thatthe process is under control; otherwise, the process is out of control. In other words, processes areunder control when their sources of variation are only the sources common to the process (Brownand Wetherill, 1990). The decision to reject or not H0 is based on the value of the sample meanx observed at each time interval (Liu and Tang, 1996). The control charts are easy to construct,visualize, and interpret, and most important, have proven their effectiveness in practice since the1920’s.

Control charts are defined, on the one hand, by a center line that represents the average valueof the CTQ feature corresponding to the in-control state and, on the other hand, two horizontallines, called the upper control limit (UCL) and the lower control limit (LCL). The region betweenthe control limits corresponds to the region where H0 is not rejected (defined in the previoussection). As a consequence, the process will be out of control when an observed rational sampleor an individual measurement falls outside the limits. Let w be a sample statistic that measures aquality characteristic of interest, and suppose that the mean of w is µw and the standard deviationof w is σw. Then the center line, the upper control limit, and the lower control limit become:

UCL = µw + Lσw

CL = µw

LCL = µw − Lσw,

The R Journal Vol. 13/1, June 2021 ISSN 2073-4859

Page 5: Prácticas de CEC con R

Contributed Research Articles 198

Name Description

counters

A water supply company wants to control the performance of the watercounters installed throughout a city. For this purpose, 60 rationalsamples have been taken, each one composed by 3 measurements, fromthe same age (10 years) and caliber water counters corresponding totwo different brands, and during a period of 5 years. This datasetis based on a study case of A Coruña’s water supply company,Empresa Municipal de Aguas de La Coruña (Emalcsa).

employment

A Spaniard-Argentinian hotel company wants to control the level ofoccupancy (measured in %) in their establishments through theapplication of a continuous control. For this purpose, 48 subsamples havebeen taken from six hotels corresponding to two different countries.

oxidation

This database contains information about the resistance against theoxidation of olive oil of the Picual variety. Five measurements of theOnset Oxidation Temperature (OOT, index that measures theresistance against the oxidation) are obtained from 50 batches of Picualolive oil produced in chronological order. It is importantly to notethat OOT decreases as the oil is progressively mixed with otherolive oil varieties defined by a lower OOT.

plates

A chemical company is developing a patent for a new variant of artificialstone mostly made of quartz (93wt% and polyester resin). This companyis launching a pilot plant where it begins to produce plates of thismaterial on an industrial scale. The CTQ variable of this product is theVickers hardness. In order to measure the hardness level and hardnesshomogeneity of the product, 50 plates have been measured 5 times indifferent sections. The characteristic learning curves, through graduallevel change, can be observed.

presion

A shipyard of recreational boats is intended to optimize and control themechanical properties of the yacht hulls made of a composite based onepoxy resin. In this regard, the modulus of elasticity due to tensileefforts is measured after applying two different curing pressures: 0.1and 10 MPa. Overall, 60 subsamples, composed of three measurements,obtained from 60 vessels, have been taken.

Table 1: Some of the specific datasets included in the qcr package

where L is the “distance” of the control limits from the center line, expressed in standard deviationunits.

When several random variables characterize the quality of a process/service, applying statisticalmultivariate quality control techniques becomes necessary. In fact, if we analyze each variableseparately, the probability that an observation of a variable will fall within the calculated limitswhen it is known that the process is actually under control will no longer be 0.9973 for 6σ amplitude.Assuming independence, it will be 0.9973p, where p is the number of CTQ features, while theprobability of type I will actually lead to α′ = 1 − (1 − α)p. Therefore, the control limits aredifferent from those drawn, assuming the control of each CTQ variable independently from theothers. Moreover, if the variables are dependent, the calculation of α becomes more complex. Thissubject is particularly important today, as automatic inspection procedures make it customary tomeasure many parameters of each product over time. The more common multivariate parametriccontrol charts are the Hotelling T2 (to identify big shifts) and the multivariate CUSUM (MCUSUM)and EWMA (MEWMA) for identifying small shifts.

The functions that compute the quality control statistics for the different univariate controlcharts (involving continuous, attribute or count data) are shown in Table 2. For the sake of simplicityand taking into account that these types of control charts are implemented in other packages, theuse of these functions is not shown in this work. More details are given in the help of qcr package(Flores et al., 2021).

The R Journal Vol. 13/1, June 2021 ISSN 2073-4859

Page 6: Prácticas de CEC con R

Contributed Research Articles 199

Statistical quality control charts forFunction Chart name Variables

qcs.xbar XSample means of a continuous process variable areplotted to control the process average.

qcs.R RSample ranges of a continuous process variable areplotted to control the process variability.

qcs.S SSample standard deviations of a continuous variableare plotted to control the process variability.

qcs.one ISample values from a I chart data of a continuousprocess variable to control the level (position)of the process.

Attributes

qcs.p pProportion of nonconforming units is plotted, thenumber of defective items follow a binomial distribution.

qcs.np npNumber of nonconforming units is plotted, and the chartis constructed based on the average of the process.

qcs.c cNonconformities per unit are plotted, number of defectsin a large population follow a Poisson distribution.

qcs.u uAverage nonconformities per unit are plotted, this chartdoes not require a constant number of units.

qcs.g gNumber of non-events between events are plotted, itcounts the number of events between rarely-occurringerrors or nonconforming incidents.

Attributes and variables

qcs.cusum CUSUMCumulative sums for individual observations or for theaverages of rational subgroups are plotted to monitorthe process mean.

qcs.ewma EWMAThe exponential weighed average of CTQ variables areplotted to identify small changes in the process(measured as rational samples or individual observations).

Multivariate control charts

mqcs.t2 T2 Multivariante Hotelling T2 control chart for individualobservations (vectors).

mqcs.mcusum MCUSUM Multivariate Cumulative Sum control chart for individualobservations (vectors).

mqcs.ewma MEWMA Multivariate EWMA control chart for individualobservations (vectors).

Functional data control charts

fdqcs.depth Phase I Phase I control chart for functional data:depth control chart and deepest curves envelope.

fdqcs.rank Phase II Phase II control chart for functional data:rank control chart and deepest curves envelope.

plot.fdqcs FDA plots Graphical outputs for Phase I and Phase II control chartsfor functional data.

Table 2: Univariate Shewhart, multivariate Hotelling T2, univariate and multivariate CUSUM andEWMA and FDA control charts available in the qcr package

Nonparametric control charts based on data depth

The control charts presented in this section were proposed by Liu (1995) as an alternative to thosedescribed in previous section. The main idea of its control graphs is to reduce each multivariatemeasure to the univariate index, that is, its relative center-exterior classification induced by a depthof data. This approach is completely nonparametric, and therefore, these control charts are notdefined by any parametric assumption regarding the process model. Thus, they are applicable in awider number of case studies than those counterparts such as T 2, MCUSUM, and MEWMA controlcharts. In addition, these graphs allow the simultaneous detection of the change of location (shift ofthe mean) and the increase of the scale (change in variability) in a process.

The R Journal Vol. 13/1, June 2021 ISSN 2073-4859

Page 7: Prácticas de CEC con R

Contributed Research Articles 200

Liu (1995) proposed and justified three types of control charts, the r, Q, and S charts which canbe considered as data-depth-based multivariate generalizations of the univariate X, x, and CUSUMcharts, respectively.

Data depth

In multivariate analysis, the term depth refers to the degree of centrality of a point regarding a datacloud or a probability distribution. Therefore, it is possible to define a rank in the multidimensionalEuclidean space through the calculation of observation depth. According to Dyckerhoff (2004) andCascos et al. (2011), the depth function can be defined as a bounded function Dp : Rd −→ R, withP the distribution set in Rd, that assigns at each point of Rd its degree of centrality with respect toP . Depth functions with which control charts can be performed are the

• Simplicial depth (Liu 1990),• Mahalanobis depth (Mahalanobis 1936),• Halfspace or Tukey depth (Tukey 1975),• Likelihood depth (Fraiman et al. 1997), and• Random projection depth (Zuo and Serfling 2000).

Statistics derived from data depth

Let G a k-dimensional distribution, and let Y1, . . . , Ym be m random observations from G. Thesample Y1, . . . , Ym is generally the reference sample of a CTQ variable in the context of quality control,composed of measurements from products obtained by an under control process. If X1, X2, . . . arethe new observations from the manufacturing process, assuming that the different Xi values followan F distribution if the quality of the studied product has been deteriorated or, in other words, ifthe process is out of control. Otherwise, they follow a G distribution. Let DG(·) denote a notion ofdepth, and assume that G and F are two continuous distributions. Thus, if all the DG(Yi) valuesare sorted in increasing order, and Y[j] denotes the sample value associated with the jth smallestdepth value, then Y[1], . . . , Y[m] are the order statistics of Yi’s, with Y[m] being the most centralpoint. Therefore, the smaller the order (or the rank) of a point, the farther that point will be fromthe underlying distribution G(·).

Liu (1995) defines the rank statistic as

rG (y) = P {DG (Y ) ≤ DG (y) | Y ∼ G}

whereby Y ∼ G indicates that the random variable Y follows the distribution G. When G isunknown, the empirical distribution Gm of the sample {Y1, . . . , Ym} can be used instead, and thestatistic is defined by

rGm(y) =

#{

DGm(Yj) ≤ DGm

(y) , j = 1, . . . , m}

m

In the same way that rG and rGm, the Q statistics can be also defined as follows

Q (G, F ) = P {DG (Y ) ≤ DG (X) | Y ∼ G, X ∼ F } = EF [rG (X)]

Q (G, Fn) =1n

n∑i=1

rG (Xi)

Q (Gm, Fn) =1n

n∑i=1

rGm(Xi) ,

whereby Fn(·) denotes the empirical distribution of the sample {X1, . . . , Xn}. The control chartscorresponding to these statistics can be developed as described in the following sections.

The r chart

Calculate {rG (X1) , rG (X2) , . . . , rG (Xn)} or {rGm(X1) , rGm

(X2) , . . . , rGm(Xn)} if G is un-

known but Y1, . . . , Ym are available. As a result, the r chart consists of plotting the rank statistic in

The R Journal Vol. 13/1, June 2021 ISSN 2073-4859

Page 8: Prácticas de CEC con R

Contributed Research Articles 201

regard to time. The control chart central line is CL = 0.5, whereas the lower limit is LCL =α, withα accounting for the false alarm rate. The process will be out of control if rG(·) falls under LCL. Asmall value of the rank statistic rGm(X) means that only a very small proportion of Yi values aremore outlying than X. Therefore, assuming that X ∼ F , then a small value of rGm(X) suggests apossible deviation from G to F . This may be due to a shifting in the location and/or an increase inthe scale of the studied CTQ variable. Taking into account that the UCL is not defined for the rchart, the CL line serves as a reference to identify emerging patterns, runs, or trends. If rGm(X) isgreater than 0.5, there is evidence of scale decreasing, and also could take place a negligible locationshift. This case should be tackled as an improvement in quality given a gain in the accuracy, andthus the process should not be considered as out of control.

The Q chart

The idea behind the Q chart is similar to the one behind the x chart. If X1, X2, . . . are univariateand G is a normal distribution, the x chart plots the averages of consecutive subsets of the differentXi. A goal of this type of chart is that it can prevent the identification of a false alarm when theprocess is actually in control (even when some individual sample points fall out of control limits dueto random fluctuations).The Q chart is the nonparametric alternative to the x chart. It is performed by plotting the averagesof consecutive subsets of size n corresponding to the rank statistic (rG(Xi) or rGm(Xi)), given byQ

(G, F j

n

)or Q

(Gm, F j

n

), whereas F j

n is the empirical distribution of the Xi’s in the jth subset,j = 1, 2, . . . . Accordingly, if only {Y1, Y2, . . . , Ym} are available, the Q chart plots the sequence{

Q(

Gm, F jn

), Q

(Gm, F j

n

), . . .

}.

Depending on the value of n, the corresponding control limits are as follows:

• If n ≥ 5, CL = 0.5 and

– LCL = 0.5 − Zα (12n)12 for Q

(G, F j

n

).

– LCL = 0.5 − Zα

√1

12( 1

m + 1n

)for Q

(Gm, F j

n

).

• If n < 5, CL = 0.5 and LCL = (n!α)1n

n .

The S control chart

The S control chart is based on the CUSUM univariate control chart, which is basically the plotof

∑ni=1 (X − µ), which reflects the pattern of the total deviation from the expected value. As

mentioned above, it is more effective than the X chart or the x chart in detecting small processchanges. The nonparametric CUSUM chart based on data depth suggests plotting Sn(G) andSn(Gm), defined by

Sn (G) =

n∑i=1

(rG (Xi) − 1

2

)with control limits CL = 0 and LCL = −Zα

(n12

) 12 and

Sn (Gm) =

n∑i=1

(rGm

(Xi) − 12

).

If only Y1, . . . Ym are available, the control limits are CL = 0 and LCL = −Zα

√n2 (

1m + 1

n )12 . The

LCL control limits in both cases constitute a curve instead of a straight line; if n is large, the controlchart S should be standardized as follows:

S∗n (G) =

S∗n (G)√

n12

S∗n (Gm) =

Sn (Gm)√n2 (

1m + 1

n )12

Therefore, this S∗ chart is defined by CL = 0 and LCL = −Zα.

The R Journal Vol. 13/1, June 2021 ISSN 2073-4859

Page 9: Prácticas de CEC con R

Contributed Research Articles 202

Examples of r, Q and S control charts applied using synthetic data

A bivariate data set is used to illustrate how the previously discussed control charts arise. In fact,a synthetic dataset composed of 540 observations of a bidimensional standard Gaussian variablehas been simulated, in addition to 40 individuals corresponding to another bidimensional Gaussianvariable with mean and standard deviation equal to 2.

R> mu <- c(0, 0)R> Sigma <- matrix(c(1, 0, 0, 1), nrow = 2)R> Y <- rmvnorm(540, mean = mu, sigma = Sigma)R> u <- c(2, 2)R> S <- matrix(c(4, 0, 0, 4), nrow = 2)R> x <- rmvnorm(40, mean = u, sigma = S)

Prior to the application of nonparametric control charts, the dataset has to be converted into anpqcsd object. The synthetic dataset is arranged as two matrices, G composed of the 500 first rows(multivariate observations) of Y, and x with the remaining ones and including those belonging to thesecond bidimensional variable

R> x <- rbind(Y[501:540, ], x)R> G <- Y[1:500, ]R> data.npqcd <- npqcd(x, G)

In the same way, the npqcd function creates a data object for non parametric quality control, thenpqcs.r(), npqcs.Q(), and npqcs.S() functions computes all the statistics required to obtain the r,Q, and S control charts, respectively. The argument method = c("Tukey","Liu","Mahalanobis","RP","LD")specifies the data depth function, and alpha is the signification level that defines the LCL. SeeFlores et al. (2021) to obtain additional information about these functions and their arguments.

r chart

The r control chart can be obtained by applying the npqcs.r() function to the npqcd object andplotting the result.

R> res.npqcs <- npqcs.r(data.npqcd, method = "Tukey", alpha = 0.025)R> plot(res.npqcs, title = " r Control Chart")

The resulting chart is shown in Figure 4, where it can be observed that the process is out of controlfrom the 42nd observation, as expected, taking into account that most of the rGm(Xi) values arefalling below the LCL.

Q chart

In this case, the dataset is assumed to be composed of rational samples of size 4. Thus, the Qnonparametric alternative of x chart is proposed and applied to control the bidimensional process:

R> n <- 4 # samplesR> m <- 20 # measurementsR> k <- 2 # number of variablesR> x.a <- array( , dim = c(n, k, m))R> for (i in 1:m) {+ x.a[, , i] <- x[(1 + (i - 1) * n):(i * n), ]+ }R> data.npqcd <- npqcd(x.a, G)R> res.npqcs <- npqcs.Q(data.npqcd, method = "Tukey", alpha = 0.025)R> plot(res.npqcs, title = "Q Control Chart")

Figure 5 clearly shows that the process is out of control in the second half, from the 20th rationalsample. We can also see that the high random fluctuations of the r chart are attenuated in the Qchart due to the averaging effect.

S chart

Finally, the nonparametric counterpart of CUSUM control chart is performed from the multivariateindividual observations.

The R Journal Vol. 13/1, June 2021 ISSN 2073-4859

Page 10: Prácticas de CEC con R

Contributed Research Articles 203

r Control Chart

Sample

Ran

k

1 5 9 14 20 26 32 38 44 50 56 62 68 74 80

0.0

0.2

0.4

0.6

0.8

1.0

LCL= 0.025

CL

Figure 4: r control chart.

Q Control Chart

Sample

Q_n

1 2 3 4 5 6 7 8 9 11 13 15 17 19

0.0

0.2

0.4

0.6

LCL= 0.220027934198348

CL

Figure 5: Q control chart.

R> data.npqcd <- npqcd(x, G)R> res.npqcs <- npqcs.S(data.npqcd, method = "Tukey", alpha = 0.05)R> plot(res.npqcs, title = "S Control Chart")

Figure 6 shows that the process is out of control from the 48th observation. Note that the S graphperforms better in identifying small changes in a process. In this case, the performance of the Qchart is better than the corresponding to the S chart.

The R Journal Vol. 13/1, June 2021 ISSN 2073-4859

Page 11: Prácticas de CEC con R

Contributed Research Articles 204

S Control Chart

Sample

S_n

1 5 9 14 20 26 32 38 44 50 56 62 68 74 80

−4

−2

02

LCL= −1.64485362695147

CL

Figure 6: S control chart.

Control charts for functional data based on data depth

In the paradigm of Industry 4.0, processes and services are many times described by continuouslymonitored data of hourly, daily, monthly curves or smooth functions. When the processes are definedby functional data, the authors encourage to apply control charts based on Functional Data Analysis(FDA) in order to implement control and improvement tasks. In this section, the use of controlcharts presented in Flores et al. (Flores et al., 2020). Summarizing, this methodology consists ofthe proposal of new Phase I and Phase II control charts to be applied in those case study in whichthe datum unit is a curve. Phase I control chart is based on the computation of functional datadepth (specifically Fraiman and Muniz (Fraiman and Muniz, 2001), Mode (Cuevas et al., 2007),and random projections (Cuevas et al., 2007) data depth) from which a data depth control chartis developed. Once the in-control calibration sample is obtained, the Phase II control chart basedon functional data depth and rank nonparametric control chart can be applied. In addition to thePhase I functional data depth and Phase II rank control charts, plots of functional envelopes fromthe original curves are provided in order to help to identify the possible assignable causes of out ofcontrol states.

Estimating a Phase I control chart for functional data (calibration)

A dataset is simulated in order to illustrate the use of FDA control charts for Phase I and II. Afunctional mean, mu0, and a functional standard deviation, sigma, are defined as shown in (Floreset al., 2020). An n0 = 100 hundred curves composed of m = 30 points are simulated. They accountfor the calibration or retrospective sample.

R> library(fda.usc)R> m <- 30R> tt<-seq(0,1,len=m)# H0R> mu_0<-30 * tt * (1 - tt)^(3/2)R> n0 <- 100R> mdata<-matrix(NA,ncol=m,nrow=n0)R> sigma <- exp(-3*as.matrix(dist(tt))/0.9)R> for (i in 1:n0) mdata[i,]<- mu_0+0.5*mvrnorm(mu = mu_0,Sigma = sigma )

Prior to the application of control charts, the dataset is converted in a specific format by thefdqcd function. A plot function is also programmed to properly show the original functional data,plot.fdqcd. Figure 7 shows the original functional data consisting of curves.

The R Journal Vol. 13/1, June 2021 ISSN 2073-4859

Page 12: Prácticas de CEC con R

Contributed Research Articles 205

R> fdchart <- fdqcd(mdata)R> plot(fdchart,type="l",col="gray",main="Functional data")

0 5 10 15 20 25 30

02

46

810

Functional data

t

X(t

)

Figure 7: Original curves that account for the calibration sample.

The following step is to identify those curves that account for the in-control process. This task isdone by the application of a Phase I control chart for functional data. This method is implementedin the qcr package by the fdqcs.depth function. Specifically, the arguments and default values forthese functions are

R> fdqcs.depth.default <- function(x, data.name=NULL,func.depth = depth.mode,nb=200,+ type = c("trim","pond"),ns = 0.01,+ plot = TRUE, trim = 0.025, smo =0.05,+ draw.control = NULL,...)

where func.depth is the type of depth measure, by default depth.mode, nb the number of bootstrapresamples, type accounts for the method used to trim the data, trim or pond (Flores et al., 2020), ns isthe quantile to determine the cutoff from the bootstrap procedure (Flores et al., 2020), plot a logicalvalue indicating that it should be plotted, trim the percentage of the trimming, smo the smoothingparameter for the bootstrap resampling (Flores et al., 2020), whereas draw.control specifies thecol, lty, and lwd for the fdataobj, statistic, IN and OUT objects. When the fdqcs.depth functionis applied to the curves of the calibration sample, the fddep object of fdqcs.depth class is obtained.It is composed of the original data, the depth corresponding to each curve, the lower control limit ofthe depth chart, the index of those curves out of control, the curves that account for the limits ofthe envelope composed by the deepest cures, and the deepest curve or functional median.

R> fddep <- fdqcs.depth(fdchart)R> summary(fddep)

Length Class Modefdata 100 fdata list

The R Journal Vol. 13/1, June 2021 ISSN 2073-4859

Page 13: Prácticas de CEC con R

Contributed Research Articles 206

Depth 100 -none- numericLCL 1 -none- numericout 1 -none- numericfmin 1 fdata listfmax 1 fdata listfmed 1 fdata listns 1 -none- numeric

R> class(fddep)

[1] "fdqcs.depth"

R> plot(fddep,title.fdata = "FDA chart",title.depth = "Depth chart")R> out <- fddep$out; out

[1] 29

Figure 8 shows the control chart for the depth of the curves (right panel). The LCL is estimatedby a smoothed bootstrap procedure (Flores et al., 2020). In order to provide a tool to identify theassignable cause of each out-of-control curve, the original curves with the envelope with the 99%of the deepest curves are also shown (left panel). The analysis of the shape and magnitude of thecurves in and out of bounds can help to associate each curve out of control to an assignable cause,allowing for processes control, maintenance, and improvement.

0 5 10 15 20 25 30

02

46

810

FDA chart

t

X(t

)

Calibration curvesMedian (Deepest)Envelope 99 %Outliers ●

●●

●●

●●

●●

0 20 40 60 80 100

510

1520

Depth chart

t

Dep

th

Figure 8: Left panel: Original curves with the envelope composed of 90% of the deepest curves.Right panel: Control chart for the depths of the curves (the LCL has been estimated by bootstrapprocedures at a signification level of 10%.

Phase I ends when a calibration sample without curves out of control is obtained. The iterativeprocedure to obtain an in control calibration sample is shown in the following lines.

R> alpha <- 0.1R> trim <- 0.1R> while (length(out)>0) {R> mdata <- fddep$fdata$data[-out,]R> fddep <- fdqcs.depth(mdata,ns = alpha, trim=trim, plot=FALSE)R> out <- fddep$outR> }R> plot(fddep,title.fdata = "Envelope with the 90\% deepest curves",+ title.depth = "Depth control chart")

Figure 9 is obtained from the application of plot function to fddep object. It shows that all thecurves of the calibration sample are in control, and thus, the natural variability of the process isestimated.

The R Journal Vol. 13/1, June 2021 ISSN 2073-4859

Page 14: Prácticas de CEC con R

Contributed Research Articles 207

0 5 10 15 20 25 30

02

46

8

Envelope with the 90% deepest curves

t

X(t

)Calibration curvesMedian (Deepest)Envelope 90 %Outliers

● ●

●●

0 10 20 30 40

6.5

7.0

7.5

8.0

8.5

9.0

Control chart of curves depth

t

Dep

th

Figure 9: Results corresponding to the second iteration to obtain the in control calibration sample.Left panel: Original curves with the envelope composed of the 90% of the deepest curves. Right panel:Control chart for the depths of the curves (the LCL has been estimated by bootstrap procedures ata signification level of 10%.

Estimating a Phase II control chart for functional data (monitoring)

The next step is to perform the Phase II of process control. The monitoring phase is performed bythe application of Phase II control charts for functional data based on multivariate nonparametriccontrol charts. Firstly, a monitoring sample composed of 50 curves is simulated by the followingcode.

R> mu_a<- 30 * tt^(3/2) * (1 - tt)R> n_a <- 50R> mdata_a<-matrix(NA,ncol=m,nrow=n_a)R> for (i in 1:n_a) mdata_a[i,]<- mu_a+0.5*mvrnorm(mu = mu_a,Sigma = sigma )

The curves of the monitoring sample are defined with fdqcd format and a control chartfor Phase II is developed by applying the fdqcs.rank function. It is composed by the follow-ing arguments, fdqcs.rank(x,y = x,func.depth = depth.FM,alpha = 0.01,plot = TRUE,trim= 0.1,draw.control = NULL,...).

R> fdchart_a <- fdqcd(mdata_a,"Monitoring curves")R> phase2.chart <- fdqcs.rank(fdchart,fdchart_a)R> plot(phase2.chart)R> summary(phase2.chart)

Figure 10 accounts for the FDA chart with the calibration sample and its envelope composedby the deepest curves. Moreover, the monitoring sample is also included and compared with thecalibration sample by using the FDA chart. In addition, the Phase II rank control chart for functionaldata is shown including both calibration and monitoring samples or only the ranks correspondingto the monitoring sample. The second population that corresponds with the monitoring sample isidentified by the control chart from the first monitored curve (panels below in Figure 10).

Process capability analysis

The analysis of the capability of a process in the case of statistical quality control is done throughthe calculation of the so-called capability. These indices measure whether a process is capable or notof meeting the corresponding technical specifications set by the customer, or the manufacturer, bycomparing those with the natural variability of the CTQ variable that characterizes the process.The interpretation of these indices is associated with the result of this relation. Capability indicesare generally calculated as the ratio between the length of the specification interval and the naturalvariability of the process in terms of σ. Large values of these indices mean that the correspondingprocess is capable of producing articles that meet the requirements of the client and manufacturers.

The R Journal Vol. 13/1, June 2021 ISSN 2073-4859

Page 15: Prácticas de CEC con R

Contributed Research Articles 208

0 5 10 15 20 25 30

02

46

810

Fdata Chart

t

X(t

)

Calibration curvesTrim MeanMedian (Deepest)Envelope 99 %

0 5 10 15 20 25 30

02

46

810

Fdata Chart

t

X(t

)

Envelope 99 %MonitoringOutliers

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

0 50 100 150

0.0

0.2

0.4

0.6

0.8

1.0

Phase II: Rank Chart

Index

● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

0 10 20 30 40 50

0.00

00.

004

0.00

8

Phase II: Rank Chart

Index

Figure 10: First row: the left and right panels show the FDA charts for the calibrating andmonitoring samples with a signification level of 0.01. Second row: The left and right panels showthe Phase II rank control chart for functional data, including calibration and monitoring sample(left panel) and only monitoring sample (right panel).

In other words, the larger the value of the capability index, the smaller the number of productsoutside the specification limits.

In this section, we describe the capability indices for processes whose underlying distributionis normal and not normal (exponential, Weibull, etc.). However, it is important to note that thedevelopment of programming tools for nonparametric capability analysis is one of the main goalsand contributions of the qcr package. In addition to the estimation of capability indices, a graphicaloutput is provided. Based on the proposal of qualityTools package, the qcr graphical output forcapability analysis includes a normality test for the CTQ variable, a Q-Q plot, a histogram withthe theoretical Gaussian distribution density, parametric and nonparametric estimates of capabilityindices and a contour capability control chart. In the following lines, parametric and nonparametriccapability analysis utilities are described using different examples of applications.

Assuming a normal distribution

The most widely used capability indexes in the industry analyze the process capability under theassumptions of a stabilized process (in control) and a Gaussian distributed CTQ variable. Table 3shows the main parametric (assuming Gaussian distribution) indices, namely Cp, Cpk, Cpm, andCpmk.

Vännman (1995) proposed a general formulation of these indices by an expression that dependson the non-negative parameters u and v:

Cp (u, v) =d − u|µ − m|

3√

σ2 + v (µ − T )2,

whereby d = (USL − LSL)/2, m = (LSL+USL)/2, USL is the upper specification limit, the LSLis the lower specification limit, σ is the theoretical standard deviation, µ accounts for the theoreticalmean of the CTQ variable, and T is the specification target (by default the mean between the LSLand USL). The indices shown in Table 3 are obtained from this expression just considering valuesof 0 or 1 for u and v: Cp (0, 0) = Cp, Cp (1, 0) = Cpk, Cp (0, 1) = Cpm, Cp (1, 1) = Cpmk.

The piston rings data set is used to illustrate the calculation of the capability indices using theqcs.cp() function based on the expressions previously described in Table 3. From the statisticsobtained from the x control chart of pistonrings dataset, the γ and β values are estimated, and thecorresponding capability index is computed.

R> data("pistonrings")R> xbar <- qcs.xbar(pistonrings[1:125, ], plot = FALSE)R> limits <- c(lsl = 73.99, usl = 74.01)

The R Journal Vol. 13/1, June 2021 ISSN 2073-4859

Page 16: Prácticas de CEC con R

Contributed Research Articles 209

Potential capability Cp = USL−LSL6σ

Actual capability with respectto the specification limits

Cp,lower = µ−LSL3σ

Cp,upper = USL−µ3σ

Cpk = min[

USL−µ3σ , µ−LSL

]Shifting of the mean withrespect to the target Cpm =

Cp√1+

(µ−T

σ

)2

Cpk correction for detectingdeviations with respect to the target Cpkm =

Cpk√1+

(µ−T

σ

)2

Table 3: PCR from first to fourth generation, USL is the upper specification limit, LSL is the lowerspecification limit, µ is the real mean, µ is the estimated mean, and σ is the estimated standarddeviation.

R> # qcs.cp(object = xbar, parameters = c(0, 0), limits = limits,R> # contour = FALSE)R> # qcs.cp(object = xbar, parameters = c(1, 0), limits = limits,R> # contour = FALSE)R> # qcs.cp(object = xbar, parameters = c(0, 1), limits = limits,R> # contour = FALSE)R> qcs.cp(object = xbar, parameters = c(1, 1), limits = limits,+ contour = FALSE)

Cpmk delta.usl gamma.usl0.2984 0.1176 0.9785

Consequently, the obtained results are Cp = 0.3407, Cpk = 0.3006, Cpm = 0.3382, andCpmk = 0.2984, respectively. The argument parameters account for u and v values, while object isthe type of control chart from which the σ is estimated, limits are the specification control limits,and contour is the parameter that indicates when the process capability contour chart is plotted.

Process capability plot

In Vännman (2001) and Deleryd and Vännman (1999), a graphical method (based on commoncapability indices) to analyze the capability of a process is proposed. The goal of using this typeof plot (if compared with respect to only capability indices calculation) is to provide immediateinformation of the location and spread of the CTQ feature and about the capability to meet thespecifications of the corresponding process. When using this chart, a process will be capable if theprocess capability index is higher than a certain value k, with k > 1. The most used values for kare k = 1, k = 4/3, or k = 5/3, even 2 at a Six Sigma level, taking into account the usual indexlimits for which a process could be assumed capable. It will also be assumed that the target valuematches the center of the specification interval, that is, T = (USL+LSL)

2 = m. Then, one of theindices defined by the Cp (u, v) family is used, e.g., Cpk or Cpm, and the process will be defined ascapable if Cp (u, v) > k, given the values of u, v, and k. Also note that if µ = T , all the Cp (u, v)indices are defined by the same expression as the Cp. Moreover, different setting for u, v, and kimpose different constraints on the process parameters (µ, σ). This can be easily seen through aprocess capability plot. This graph is a contour plot of Cp (u, v) = k as a function of µ and σ, butit can also be defined as a function of δ and γ, with δ = µ−T

d and γ = σd . The contour line is

obtained by rewriting the index Cp (u, v) as a function of δ and γ as follows Cp (u, v) =1−u|δ|

3√

γ2+v(δ)2 .

Therefore, the Cp (u, v) = k equation is solved, plotting γ depending on the values of δ. The resultingexpressions are:

γ =

√(1 − u|δ|)

9k2 − vδ2, |δ| ≤ 1u + 3k

√v

, (u, v) = (0, 0)

When u = v = 0, that is, when we consider the index Cp = k, we have γ = 13k and |δ| ≤ 1. It is

important to highlight that the γ-axis accounts for the process spread, whereas the δ-axis accountsfor the process location. The values of the parameters µ and σ which provide values (δ, γ) withinthe region bounded by the contour line Cp (u, v) = k and the δ-axis will provide a larger Cp (u, v)

The R Journal Vol. 13/1, June 2021 ISSN 2073-4859

Page 17: Prácticas de CEC con R

Contributed Research Articles 210

value than k, leading a capable process. Furthermore, values of µ and σ which provide values (δ, γ)outside this region will provide a value Cp (u, v) smaller than k, i.e., a non-capable process. In thecase of the process not being capable, this type of plot is useful to understand if the correctiveactions have to be performed to decrease the process spread, or the process location (deviation withrespect to target), or even when both changes are needed to improve the process capability. This canbe observed by observing the distance with respect to the x and y-axis. Below are some examples ofcapability plot application which can be generated through the application of the qcs.cp functionwith contour=TRUE and k=1 (default values):

R> oldpar <- par(mfrow = c(2, 2))R> qcs.cp(object = xbar, parameters = c(0, 0), limits = limits,+ ylim = c(0, 1))R> qcs.cp(object = xbar, parameters = c(1, 0), limits = limits,+ ylim = c(0, 1))R> qcs.cp(object = xbar, parameters = c(0, 1), limits = limits,+ ylim = c(0, 1))R> qcs.cp(object = xbar, parameters = c(1, 1), limits = limits,+ ylim = c(0, 1))R> par(oldpar)

The result is shown in Figure 11. In all the cases, the points in red are out of the area defined by theline in blue and the δ axis. Thus, the corresponding process is not capable, no matter the capabilityindex that is used. In any case, note that the Cp index is useless in identifying non-capable processesdue to location shifts with respect to the target. In the same way, the Cpk index assumes as capableprocesses that are far from the target as long as they were close to the specification limits (as shownin Figure 11). Thus, the use of the Cpm and Cpmk are recommended due to they take into accountboth shifts from the target and the spread. In the present case, the process is not capable due tothe spread rather than the target shift. Therefore, the process changes could be due to decreases inthe variability process.

Contour plot: Cp

delta

gam

ma

−1.0 −0.5 0.0 0.5 1.0

0.0

0.4

0.8

Contour plot: Cpk

delta

gam

ma

−1.0 −0.5 0.0 0.5 1.0

0.0

0.4

0.8

Contour plot: Cpm

delta

gam

ma

−0.3 −0.1 0.1 0.3

0.0

0.4

0.8

Contour plot: Cpmk

delta

gam

ma

−0.2 0.0 0.1 0.2

0.0

0.4

0.8

Figure 11: Process capability plots using the Cp, Cpk, Cpm, and Cpmk indexes.

The R Journal Vol. 13/1, June 2021 ISSN 2073-4859

Page 18: Prácticas de CEC con R

Contributed Research Articles 211

Estimated process capability plot

In practice, the process parameters are unknown and we need to estimate them. We can perform adecision rule based on the sample statistics that provide a sample estimate of the capability indexand, finally the so called estimated process capability plot, also called γ∗ − δ∗ plot (Deleryd andVännman, 1999). It allows us to decide whether a process is capable or not assuming that µ andσ parameters are unknown and estimated by µ = X and σ2 = 1

n

∑ni=1 X2

i − X2. They are themaximum likelihood estimators when the CTQ variable of the process is normally distributed, andX1, X2, . . . , Xn is a random sample of a normal distribution with µ mean and σ2 variance.The qcr package only provides the γ∗ − δ∗ plot corresponding to the Cpm index taking into accountthat the other capability indices do not consider shifts from the target value in their calculations.For the general case, see the work of Vännman (2001). In order to obtain an appropriate decisionrule for the case of Cpm index, we test the hypotheses H0 : Cpm ≤ k0 versus H1 : Cpm > k0, using

Cpm =d

3√

σ2 + (µ − T )2

as test statistic. The null hypothesis will be rejected if Cpm > cα, where the constant cα isdetermined by previously defining a signifition test level α . Vännman (2001) showed that the nullhypothesis H0 : Cpm ≤ k0 can be reduced to H0 : Cpm = k0. Thus, for given values of α and n, theprocess will be considered capable if Cpm > cα, with cα > k0. Hubele and Vannman (2004) provedthat, when the Cpm index is used, the critical value for a given α is obtained as

cα = k0

√n

χ2α,n

,

where χ2α,n is the quantile α of a χ2 distribution with n degrees of freedom. The qcr package

includes the qcs.hat.cpm() function to obtain both the theoretical capability plot and the estimatedcapability plot from sample statistics. Among other options, the user can indicate the control chartfrom which the estimates µ and σ are obtained (alternatively, µ and σ can be introduced through muand std.dev), and the specification limits using limits. Furthermore, the signification level and thecapability limit can be modified, as they are set to α = 0.05 and k0 = 1 by default. The followingcode illustrates its application to pistonrings data.

R> xbar <- qcs.xbar(pistonrings[1:125, ], plot = FALSE)R> limits <- c(lsl = 73.99, usl = 74.01)R> # qcs.hat.cpm(object = xbar, limits = limits, ylim = c(0,1))R> mu <- xbar$centerR> std.dev <- xbar$std.devR> qcs.hat.cpm(limits = limits, mu = mu, std.dev = std.dev, ylim = c(0,1))

The result is shown in Figure 12. The contour line corresponding to the capability region obtainedfrom the capability index sample is always more restrictive than the corresponding theoretical one.

Nonparametric capability analysis

Traditional assumptions about data such as normality or independence are frequently violated inmany real situations. Thus, in scenarios in which assumptions of normality are not verified, theindices defined in the previous sections are not valid. Pearn and Chen (1997) and Tong and Chen(1998) proposed generalizations of Cp (u, v) for the case of arbitrary distributions of data

CNp (u, v) =d − u|M − m|

3√(

F99.865−F0.1356

)2+ v (M − T )2

,

where Fα is the percentile α% of the corresponding distribution and M the median of the process.However, the distribution of the underlying process is always unknown. Chang and Lu (1994)calculated estimates for F99.865, F0.135 and M based on the sample percentiles.

Pearn and Chen (1997) proposed the following estimator

CNp (u, v) =d − u|M − m|

3√(

Up−Lp

6

)2+ v

(M − T

)2

The R Journal Vol. 13/1, June 2021 ISSN 2073-4859

Page 19: Prácticas de CEC con R

Contributed Research Articles 212

Contour plot: Cpm

delta

gam

ma

−0.3 −0.2 −0.1 0.0 0.1 0.2 0.3

0.0

0.2

0.4

0.6

0.8

1.0

TheoryEmpirical

Figure 12: Comparison between theorical and estimated process capability plots.

where Up is an estimator for F99.865, Lp is an estimator for F0.135, and M is an estimator for M ,obtained from the tables developed by Gruska et al. (1989).

The qcs.cpn() function of qcr calculates CNp, CNpk, CNpm, and CNpmk using the formulationdescribed by Tong and Chen (1998). The code that illustrates its use is shown below. To obtain thenonparametric capability indices it is necessary to indicate the u and v parameters.

R> xbar <- qcs.xbar(pistonrings[1:125, ], plot = FALSE)R> limits <- c(lsl = 73.99, usl = 74.01)R> # x <- xbar$statistics[[1]]R> # median <- median(x)R> # q = quantile(x, probs = c(0.00135, 0.99865)) # c(lq, uq)R> # qcs.cpn(parameters = c(0, 0), limits = limits, median = median, q = q)R> # qcs.cpn(object = xbar, parameters = c(0, 0), limits = limits)R> # qcs.cpn(object = xbar, parameters = c(1, 0), limits = limits)R> # qcs.cpn(object = xbar, parameters = c(0, 1), limits = limits)R> qcs.cpn(object = xbar, parameters = c(1, 1), limits = limits)

CNpmk0.9015

Thus, the values obtained are CNp = 1.0082, CNpk = 0.9275, CNpm = 0.9799 and CNpmk = 0.9015.If a capability limit of k = 1 or k = 1.33 is assumed, we can infer that the process is not actuallycapable to meet the customers or manager’s requirements.

Tools for a comprehensive processs capability analysis

Function qcs.ca() provides a comprehensive information of the capability of a process, summarizedthrough a graphical output. This function calculates the process capability indices Cp, Cpk, CpL,CpU , Cpm, Cpmk from a qcs object, assuming a Gaussian distribution. Moreover, it computesconfidence limits for Cp using the method described by Chou et al. (1990). Approximate confidencelimits for Cpl, Cpu, and Cpk are also estimated using the method described in Bissell (1990), whilethe confidence limits for Cpm are based on the approximated method of Boyles (1991) that assumesthe target is the mean of the specification limits. Moreover, the CNp, CNpk, CNpm, and CNpmk

nonparametric capability indices are also obtained. There is also a specific box within the summaryplot that shows the proportion of observations and expected observations under the Gaussianassumption out of the specification limits (nonconforming observations). Further, a histogram of the

The R Journal Vol. 13/1, June 2021 ISSN 2073-4859

Page 20: Prácticas de CEC con R

Contributed Research Articles 213

data sample is provided, in addition to the corresponding Gaussian density curves obtained fromthe sample estimates (one per standard deviation estimate procedure). They are displayed alongwith the specification limits, a quantile-quantile plot for the specified distribution, and a processcapability plot obtained from the Cpm index (both using theoretical and sample alternatives). Inorder to describe the qcs.ca() performance, the following code corresponds to the analysis of thefirst 125 observations of the pistonrings dataset (the corresponding output is shown in Figure 13).

R> qcs.ca(xbar, limits = c(lsl = 73.99, usl = 74.01))

Process Capability Analysis

Call:qcs.ca(object = xbar, limits = c(lsl = 73.99, usl = 74.01))

Number of obs = 125 Target = 74Center = 74 LSL = 73.99StdDev = 0.009785 USL = 74.01

Paremetric Capability indices:

Value 0.1% 99.9%Cp 0.3407 0.2771 0.4065Cp_l 0.3807 0.2739 0.4875Cp_u 0.3006 0.2021 0.3991Cp_k 0.3006 0.1944 0.4068Cpm 0.3382 0.2749 0.4038

Non parametric Capability indices:

ValueCNp 1.0082CNpK 0.9275CNpm 0.9799CNpmk 0.9015

PPM:

Exp<LSL 1.267e+07 Obs<LSL 0Exp>USL 1.836e+07 Obs>USL 8e+05

Exp Total 3.103e+07 Obs Total 8e+05

Test:

Anderson Darling Test for normal distribution

data: xbarA = 0.1399, mean = 74.001, sd = 0.005, p-value = 0.9694alternative hypothesis: true distribution is not equal to normal

The R Journal Vol. 13/1, June 2021 ISSN 2073-4859

Page 21: Prácticas de CEC con R

Contributed Research Articles 214

Den

sity

73.985 73.990 73.995 74.000 74.005 74.010 74.015

1020

3040

5060

70

LSL = 74 USL = 74LSL = 74 USL = 74

TARGETSTLT

Process Capability for x

Process DataSample n = 125

A = 0.14p = 0.969LSL = 74

Target = 74USL = 74

std.dev (ST) = 0.00979mean = 74

std.dev (LT) = 0.0101

Quantiles for x[, 1]

Process Capability for x

Parametric Capability Process (ST)

Cp = 0.341

Cpu = 0.301

Cpl = 0.381

Cpk = 0.301

Cpm = 0.338

Cpmk = 0.298

NonParametric Capability Process (ST)

CNp = 1.01

CNpk = 0.927

CNpm = 0.98

CNpmk = 0.901

Process Capability for x

Performance − ST [%]

Exp<LSL = 12.7

Exp>USL = 18.4

Exp Total = 31

Obs<LSL = 0

Obs>USL = 0.8

Obs Total = 0.8

TheoryEmpirical

Figure 13: A complete analysis of the process capability.

Conclusions

The qcr package has been developed to provide users with a comprehensive set of functions thatmanage statistical process control, ranging from univariate parametric analysis to multivariate andFDA nonparametric statistics. This package includes the main types of control charts and capabilityindices. It combines the main features of reputed SQC packages in R such as qcc and qualityToolswith the proposal of a new graphical appearance and the implementation of new SQC tools withincreasing importance in Industry 4.0 such as multivariate and nonparametric analysis.In addition to some utilities provided by reference R packages such as qcc, SixSigma, and qualityTools,qcr implements very important statistical techniques of Control and Analysis tasks of the Six Sigmaprocedure that are not included in other libraries. In the case of multivariate control charts, thesetools are the MEWMA and MCUSUM multivariate control charts, on the one hand, and the r, Qand S nonparametric control charts based on data depth, on the other hand. In addition, Phase Iand Phase II control charts for functional data (monthly, daily, hourly curves) based on functionaldata depth, bootstrap procedures, and nonparametric rank charts have also been implemented in theqcr package. These control charts for functional data provide tools to control and improve processeswhen their CTQ variables are obtained as hourly, monthly, daily, yearly smooth curves.It is also very important to note that qcr provides functions to perform nonparametric capabilityanalysis. In addition, the new implementation of the process capability plots for the main parametriccapability indices allows us to analyze if improvements in process spread or/and process location areneeded to obtain a capable process. The comparison between suppliers, machines, etc., is enabledthrough capability plots.All these utilities intend to make qcr a useful tool for users of a wide variety of industries, providinga competitive alternative to commercial software.

Acknowledgments

The work of Salvador Naya, Javier Tarrío-Saavedra, Miguel Flores and Rubén Fernández-Casalhas been supported by MINECO grant MTM2017-82724-R, and by the Xunta de Galicia (Gruposde Referencia Competitiva ED431C-2020-14 and Centro de Investigación del Sistema universitariode Galicia ED431G 2019/01), all of them through the ERDF. The research of Miguel Flores hasbeen partially supported by Grant PII-DM-002-2016 of Escuela Politécnica Nacional of Ecuador.In addition, the research of Javier Tarrío-Saavedra has been also founded by the eCOAR project(PC18/03) of CITIC. The authors thank Mateo Larco and Bibiana Santiso for their valuable helpwith the English edition.

The R Journal Vol. 13/1, June 2021 ISSN 2073-4859

Page 22: Prácticas de CEC con R

Contributed Research Articles 215

Bibliography

F. Barros. IQCC: Improved Quality Control Charts, 2017. URL https://CRAN.R-project.org/package=IQCC. R package version 0.7. [p196]

A. Bissell. How reliable is your capability index? Applied Statistics, pages 331–340, 1990. [p212]

R. A. Boyles. The Taguchi capability index. Journal of Quality Technology, 23:17–26, 1991. [p196,212]

D. W. Brown and G. B. Wetherill. Statistical process control. Chapman and Hall, 1990. [p197]

E. L. Cano, J. M. Moguerza, and A. Redchuk. Six Sigma with R, volume 36 of Use R! Springer-Verlag,New York, 2012. [p196]

E. L. Cano, J. M. Moguerza, and M. P. Corcoba. Quality Control with R. Use R! Springer-Verlag,New York, 2015. [p196]

I. Cascos, A. López, and J. Romo. Data depth in multivariate statistics. Boletín de Estadística eInvestigación Operativa, 27(3):151–174, 2011. [p200]

C. W. Champ and W. H. Woodall. Exact results for Shewhart control charts with supplementaryruns rules. Technometrics, 29(4):393–399, 1987. [p194]

P.-L. Chang and K.-H. Lu. PCI calculations for any shape of distribution with percentile. QualityWorld Technical Supplement, pages 110–114, 1994. [p211]

Y.-M. Chou, D. Owen, and S. Borrego A. Lower confidence limits on process capability indices.Journal of Quality Technology, 22(3):223–229, 1990. [p212]

A. Cuevas, M. Febrero, and R. Fraiman. Robust estimation and classification for functional data viaprojection-based depth notions. Computational Statistics, 22(3):481–496, 2007. [p204]

M. Deleryd and K. Vännman. Process capability plots – a quality improvement tool. Quality andReliability Engineering International, 15(3):213–227, 1999. [p209, 211]

R. Dyckerhoff. Data depths satisfying the projection property. Allgemeines Statistisches Archiv, 88(2):163–190, 2004. [p200]

M. Flores, S. Naya, R. Fernández-Casal, S. Zaragoza, P. Raña, and J. Tarrío-Saavedra. Constructinga control chart using functional data. Mathematics, 8(1):58, 2020. [p204, 205, 206]

M. Flores, R. Fernández-Casal, S. Naya, and J. Tarrío-Saavedra. qcr: Quality Control Review, 2021.URL https://CRAN.R-project.org/package=qcr. R package version 1.3. [p198, 202]

R. Fraiman and G. Muniz. Trimmed means for functional data. Test, 10(2):419–440, 2001. [p204]

R. Fraiman, R. Y. Liu, and J. Meloche. Multivariate density estimation by probing depth. LectureNotes-Monograph Series, pages 415–430, 1997. [p200]

A. Gandy and J. T. Kvaloy. Guaranteed conditional performance of control charts via bootstrapmethods. Scandinavian Journal of Statistics, 40:647–668, 2013. doi: 10.1002/sjos.12006. [p196]

G. Gruska, K. Mirkhani, and L. Lamberson. Non normal data analysis. St. Clair Shores, MI: AppliedComputer Solutions, Inc, 1989. [p212]

N. F. Hubele and K. Vannman. The effect of pooled and un-pooled variance estimators on cpm subwhen using subsamples. Journal of quality technology, 36(2):207, 2004. [p211]

S. Knoth. spc: Statistical Process Control – Calculation of ARL and Other Control Chart PerformanceMeasures, 2021. URL https://CRAN.R-project.org/package=spc. R package version 0.6.5.[p196]

R. Liu. On a notion of data depth based on random simplices. The Annals of Statistics, 18(1):405–414, 1990. [p200]

R. Y. Liu. Control charts for multivariate processes. Journal of the American Statistical Association,90(432):1380–1387, 1995. [p195, 199, 200]

The R Journal Vol. 13/1, June 2021 ISSN 2073-4859

Page 23: Prácticas de CEC con R

Contributed Research Articles 216

R. Y. Liu and J. Tang. Control charts for dependent and independent measurements based onbootstrap methods. Journal of the American Statistical Association, 91(436):1694–1700, 1996.[p197]

P. C. Mahalanobis. On the generalised distance in statistics. Proceedings of the National Institute ofSciences of India, 1936, pages 49–55, 1936. [p200]

D. C. Montgomery. Introduction to statistical quality control. John Wiley & Sons (New York), 2009.[p194]

P. S. Pande, R. P. Neuman, and R. R. Cavanagh. The six sigma way: How GE, Motorola, and othertop companies are honing their performance. McGraw-Hill (New York), 2000. [p194]

W. Pearn and K. Chen. A practical implementation of the process capability index cpk. QualityEngineering, 9(4):721–737, 1997. [p211]

A. M. Polansky. Process capability indices, nonparametric. Encyclopedia of Statistics in Qualityand Reliability, 2007. [p196]

R Core Team. R: A Language and Environment for Statistical Computing. R Foundation forStatistical Computing, Vienna, Austria, 2021. URL https://www.R-project.org/. [p196]

T. Roth. qualityTools: Statistics in Quality Science., 2016. URL http://www.r-qualitytools.org.R package version 1.55 http://www.r-qualitytools.org. [p196]

E. Santos-Fernandez. Multivariate Statistical Quality Control Using R, volume 14. Springer-Verlag,2013. ISBN 9781461454533. URL http://www.springer.com/statistics/computational+statistics/book/978-1-4614-5452-6. [p196, 197]

E. Santos-Fernández and M. Scagliarini. MPCI: An R package for computing multivariate processcapability indices. Journal of Statistical Software, 47(7):1–15, 2012. URL http://www.jstatsoft.org/v47/i07/. [p196]

L. Scrucca. qcc: An R package for quality control charting and statistical process control. R News,4/1:11–17, 2004. URL https://cran.r-project.org/doc/Rnews/. [p196, 197]

L.-I. Tong and J.-P. Chen. Lower confidence limits of process capability indices for non-normalprocess distributions. International Journal of Quality & Reliability Management, 15(8/9):907–919,1998. [p211, 212]

J. W. Tukey. Mathematics and the picturing of data. In Proceedings of the international congress ofmathematicians, volume 2, pages 523–531, 1975. [p200]

K. Vännman. A unified approach to capability indices. Statistica Sinica, pages 805–820, 1995. [p208]

K. Vännman. A graphical method to control process capability. In Frontiers in Statistical QualityControl 6, pages 290–311. Springer-Verlag, 2001. [p209, 211]

W. Zhu and C. Park. edcc: An R package for the economic design of the control chart. Journal ofStatistical Software, 52(9):1–24, 2013. URL http://www.jstatsoft.org/v52/i09/. [p196]

Y. Zuo and R. Serfling. General notions of statistical depth function. Annals of statistics, pages461–482, 2000. [p200]

Miguel FloresMODES, SIGTI, ADIAAC, Departamento de Matemáticas, Escuela Politécnica Nacional170517 [email protected]

Rubén Fernández-CasalMODES, CITIC, Universidade da CoruñaFacultade de Informática, Campus de Elviña, A Coruñ[email protected]

The R Journal Vol. 13/1, June 2021 ISSN 2073-4859

Page 24: Prácticas de CEC con R

Contributed Research Articles 217

Salvador NayaMODES, CITIC, ITMATI, Universidade da CoruñaEscola Politécnica Superior, Mendizábal s/n, [email protected]

Javier Tarrío-SaavedraMODES, CITIC, Universidade da CoruñaEscola Politécnica Superior, Mendizábal s/n, [email protected]

The R Journal Vol. 13/1, June 2021 ISSN 2073-4859