



Computational Statistics & Data Analysis 39 (2002) 227–241
www.elsevier.com/locate/csda

Regression analysis of experiments with complex confounding patterns guided by the alias matrix

John Lawson

Department of Statistics, Brigham Young University, 230 TMCB, P.O. Box 26575, Provo, UT 84602-6575, USA

Received 1 November 2000; received in revised form 1 June 2001

Abstract

Resolution III experiments with complex confounding patterns are often called main effect plans. Their main effects are partially confounded with two-factor interactions, rather than being either independent of or completely aliased with them as they are in regular designs constructed from a defining relation. It is possible to detect some two-factor interactions from experiments with complex confounding patterns if only a few of the factors are active. The partial confounding of interactions with main effects in these experiments allows one to estimate interactions with regression analysis. Recent papers have shown how the interactions can be detected using repeated runs of stepwise regression, or computationally intensive Bayesian methods. This paper shows that, with a short list of candidate interactions, the important interactions can be detected efficiently by a single pass of all subsets regression. The short list of candidate interactions for the all subsets regression is selected with regard to their potential for contributing to the value of the large estimated effects calculated from the data. The potential is determined by summing coefficients in the alias matrix. This paper reviews the concept of the alias matrix, and defines the alias plot that can be obtained from the alias matrix. The alias plot provides a simple graphic for viewing the potential contribution of interactions to large estimated effects. The interactions identified on the alias plot, along with main effects, are used as candidates in an all subsets regression analysis of data from partially confounded experiments. The resulting strategy allows exploration of a wide range of the potential model space, can reveal several plausible models for the data if they exist, and can be completed using standard statistical software. Examples are presented.
© 2002 Published by Elsevier Science B.V.

Keywords: Alias matrix; Alias plot; Effect sparsity; Effect heredity

E-mail address: [email protected] (J. Lawson).

0167-9473/02/$ - see front matter © 2002 Published by Elsevier Science B.V.
PII: S 0167-9473(01)00056-1


1. Introduction

Many screening designs such as the Plackett–Burman designs (Lawson and Erjavec, 2001), certain 3^(k−p) fractional factorials such as Latin squares and Greco-Latin squares (Hunter, 1985), and fractions of mixed level factorial designs such as the L18 (an 18 run fraction of a 2 × 3^7), have complex confounding patterns. By complex confounding we mean that interactions are partially confounded with many main effects, rather than being orthogonal to, or completely confounded with, one main effect, as they are in 2^(k−p) fractional factorials. Traditionally, interactions were ignored in screening experiments with complex confounding patterns, and the data was analyzed by an analysis of variance of the main effects only. In fact, these experimental designs were often called main effect plans since it was thought that no information about interactions could be obtained from them.

Recently, non-traditional methods of analysis have been proposed that would allow an analyst to detect a few important two-factor interactions in a screening design with a complex confounding pattern. These methods have relied on the hidden projection property (Lin and Draper, 1992; Wang and Wu, 1995) inherent in the experimental design and the principles of effect sparsity (Box and Meyer, 1986) and effect heredity (Hamada and Wu, 1992). The hidden projection property of designs with a complex confounding pattern allows estimation of some interactions even though the design is not of the right resolution. Some authors have extended the study of projection properties of designs to recommend follow-up experiments to increase the initial design resolution or to create entirely new designs (Draper and Lin, 1995; Cheng, 1995; Church, 1995, 1996). But, for purposes of analyzing data from screening designs that have already been conducted, we are interested in projection properties of the design only insofar as they allow interaction effects to be estimated using regression analysis.

The other two principles that allow non-traditional methods to detect interactions in screening designs are effect sparsity and effect heredity. The principle of effect sparsity means that in a screening design it is likely that only a few of the factors tested will have large or important effects. The principle of effect heredity means that if there are important interactions they will most likely involve factors that also have large main effects. These two principles make the analysis feasible by reducing the number of effects we need to consider for a model of the data.

Hamada and Wu (1992) proposed an iterative strategy for analyzing screening designs with complex confounding patterns based on regression. Their approach was motivated by the fact that it was infeasible to perform an all subsets regression with most designs, because the number of main effects plus two-factor interactions is usually much larger than the number of runs in the design and it would take an extensive time to complete the calculations. Instead their strategy employs several iterations of a forward stepwise regression. This method will not work if interactions are larger in magnitude than main effects. In that case they proposed a tedious alternative that requires k additional stepwise regressions, where k is the number of factors.

Lin (1998–1999) proposed a simpler alternative method for spotlighting interactions in main effect plans through use of a forward stepwise regression on a list of variables that included all main effects and all possible two-factor interactions. However, this method, like the Hamada and Wu (1992) method, is restricted by the forward stepwise regression search. When a variable enters the model in a forward search, it will never be removed, even if it becomes superfluous after a subsequent variable enters the model. Due to this fact a forward search only explores a subset of the potential model space.

Box and Meyer (1993) took a Bayesian approach to identifying subsets of factors they would consider to be active in a screening experiment with a complex confounding pattern. Their method assigns prior probabilities to a series of models, Mi, each of which consists of a subset of factors. Marginal posterior probabilities that a factor fj is active are then calculated. By including in the class of models all subsets of factors up to a certain limit, this method can explore a much larger model space, like an all subsets regression.

Chipman et al. (1997) proposed another Bayesian method of analysis based on the stochastic search variable selection algorithm of George and McCulloch (1993). They propose relaxed weak heredity priors, or strict weak heredity priors, that capture the dependence relation between the importance of an interaction term and the main effects from which it is formed. The stochastic nature of the variable search algorithm means that all possible models have a positive probability of being visited, thus widening the possible model space searched. The algorithm is sensitive to the prior values of other parameters in the model, which the authors describe as tuning parameters. One of the advantages of this method is that it can identify several models that can potentially explain the available data. When the method identifies several models with comparable posterior probabilities, the authors point out that the data is not informative about the exact model choice.

Although the methods discussed above for analyzing screening experiments with complex confounding patterns are much more powerful than the traditional main effects ANOVA, certain shortcomings or difficulties remain. The first difficulty that would affect the typical data analyst is availability of software. Two of the methods (Chipman et al., 1997; Box and Meyer, 1993) require special purpose programs to perform the calculations. The other two methods, those of Hamada and Wu (1992) and Lin (1998), can be performed using a standard statistical package that includes a forward stepwise regression routine, but they do not cover the model space well, and will not indicate when the data is not informative about the exact model.

In this paper we propose a new method for analyzing screening experiments with complex confounding patterns that eliminates the shortcomings of the methods discussed above. This method explores a wide subset of the model space by employing the all subsets regression routine that is available in most standard statistical packages. This eliminates the disadvantage of the forward stepwise regression proposed by Hamada and Wu (1992) and Lin (1998). In the proposed method, the number of interactions considered by the all subsets regression is restricted by considering properties of the alias matrix, which prevents exploration of unlikely models. The restricted number of candidate interactions substantially reduces the computation time for all subsets regression. The alias matrix can also be computed in statistical packages that include matrix computations such as SAS proc iml, MINITAB and S-Plus.


Therefore, another advantage of this method is that it eliminates the need for special software. Unlike the Chipman et al. (1997) Bayesian method, which considers models unlikely based on relaxed weak heredity priors or strict weak heredity priors, this method considers models unlikely if they contain interactions with small potential to bias the large calculated effects. Thus, another advantage of the method is that it is not dependent on prior information, but rather on the data and the alias structure of the design. Finally, this method can identify several comparable models, when warranted by the data, as the Chipman et al. (1997) Bayesian method does.

The remainder of the paper is organized as follows. Section 2 describes the alias matrix and how it can be used to reduce the number of candidate interaction terms in all subsets regression. Section 3 outlines the proposed strategy. Sections 4 and 5 show examples of the use of the method on simulated and real data. Section 6 provides a discussion and conclusions.

2. Limiting candidate interactions using the alias structure

The alias matrix was defined by Box and Wilson (1951), and Daniel (1976) used it to determine the aliases of main effect contrasts with two-factor interaction contrasts in the orthogonal main effect plans (OME) of Addelman and Kempthorne (1961, 1962a, b). Lin and Draper (1993) show how to quickly construct the alias matrix sequentially for large Plackett–Burman designs. For a saturated n-run experimental design, the alias matrix is defined as the n × u matrix product A = (X1′X1)⁻¹(X1′X2), where

y = X1β1 + X2β2 + e   (1)

is the model for the experimental data, y is an n × 1 vector of response data for an experiment, X1 is an n × n matrix of contrast coefficients for the intercept, main effects and estimable two-factor interactions, β1 is the corresponding n × 1 vector of regression coefficients, X2 is an n × u matrix of contrast coefficients for the inestimable two-factor and higher order interactions, β2 is the corresponding u × 1 vector of regression coefficients for the inestimable two-factor interactions, and e is an n × 1 vector of independent identically distributed random errors with means equal to zero.

The alias matrix, A, is obtained by noting that if the saturated model y = X1β1 + e is fit to the data by least squares, the estimated coefficients are β̂1 = (X1′X1)⁻¹X1′y. However, these coefficients are biased, since E(β̂1) = β1 + (X1′X1)⁻¹(X1′X2)β2.

The alias matrix shows how linear combinations of the inestimable two-factor interaction coefficients bias the regression coefficients for the intercept, the main effects, and the estimable two-factor interactions. The rows of the alias matrix correspond to the biased regression coefficients and the columns correspond to the regression coefficients for inestimable two-factor interactions. The values in each row are the multipliers for the linear combination of interaction regression coefficients that bias each estimable regression coefficient.
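To make the computation concrete, a minimal sketch of the alias matrix calculation is given below. This code is illustrative and not part of the original paper; the function name alias_matrix and the small 2^(3-1) demonstration design are assumptions introduced here.

```python
import numpy as np

def alias_matrix(X1, X2):
    """Alias matrix A = (X1'X1)^-1 (X1'X2) for model (1).

    Rows index the estimable coefficients beta1, columns index the
    inestimable interaction coefficients beta2, and
    E(beta1_hat) = beta1 + A @ beta2.
    """
    return np.linalg.solve(X1.T @ X1, X1.T @ X2)

# Tiny demonstration: a regular 2^(3-1) fraction with C = AB, where every
# main effect is completely aliased with one two-factor interaction.
A = np.array([-1, 1, -1, 1])
B = np.array([-1, -1, 1, 1])
C = A * B
X1 = np.column_stack([np.ones(4), A, B, C])      # intercept and main effects
X2 = np.column_stack([A * B, A * C, B * C])      # two-factor interactions AB, AC, BC
print(alias_matrix(X1, X2))                      # rows 1, A, B, C; columns AB, AC, BC
```

For a design with a complex confounding pattern the same calculation produces fractional entries, as in the 12 run example that follows.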


Table 1
Factor levels and response for reactor experiment

Run   A  B  C  D  E    6  7  8  9  10  11     Y
 1    +  −  +  −  −    −  +  +  +  −   +     56
 2    +  +  −  +  −    −  −  +  +  +   −     93
 3    −  +  +  −  +    −  −  −  +  +   +     67
 4    +  −  +  +  −    +  −  −  −  +   +     60
 5    +  +  −  +  +    −  +  −  −  −   +     77
 6    +  +  +  −  +    +  −  +  −  −   −     65
 7    −  +  +  +  −    +  +  −  +  −   −     95
 8    −  −  +  +  +    −  +  +  −  +   −     49
 9    −  −  −  +  +    +  −  +  +  −   +     44
10    +  −  −  −  +    +  +  −  +  +   −     63
11    −  +  −  −  −    +  +  +  −  +   +     63
12    −  −  −  −  −    −  −  −  −  −   −     61

Example. Consider the Reactor example taken from Box and Meyer (1993). The design was a 12 run Plackett–Burman design with 5 factors (A–E). The 12 runs were a subset of the runs in a full factorial that is presented in Box et al. (1978). Table 1 shows the coded factor settings and experimental data. The first 5 columns represent the factor settings and the columns labeled 6–11 are unassigned factors. The unassigned factors are estimable and will be seen to represent linear combinations of inestimable interaction effects.

The 12 × 11 X1 matrix of coefficients for the estimable effects is shown below, along with the 12 × 10 X2 matrix of coefficients for the inestimable two-factor interactions. The columns in X2 were formed by multiplying elementwise the corresponding columns in X1.

X1 =

      A   B   C   D   E   6   7   8   9  10  11
      1  −1   1  −1  −1  −1   1   1   1  −1   1
      1   1  −1   1  −1  −1  −1   1   1   1  −1
     −1   1   1  −1   1  −1  −1  −1   1   1   1
      1  −1   1   1  −1   1  −1  −1  −1   1   1
      1   1  −1   1   1  −1   1  −1  −1  −1   1
      1   1   1  −1   1   1  −1   1  −1  −1  −1
     −1   1   1   1  −1   1   1  −1   1  −1  −1
     −1  −1   1   1   1  −1   1   1  −1   1  −1
     −1  −1  −1   1   1   1  −1   1   1  −1   1
      1  −1  −1  −1   1   1   1  −1   1   1  −1
     −1   1  −1  −1  −1   1   1   1  −1   1   1
     −1  −1  −1  −1  −1  −1  −1  −1  −1  −1  −1


X2 =

     AB  AC  AD  AE  BC  BD  BE  CD  CE  DE
     −1   1  −1  −1  −1   1   1  −1  −1   1
      1  −1   1  −1  −1   1  −1  −1   1  −1
     −1  −1   1  −1   1  −1   1  −1   1  −1
     −1   1   1  −1  −1  −1   1   1  −1  −1
      1  −1   1   1  −1   1   1  −1  −1   1
      1   1  −1   1   1  −1   1  −1   1  −1
     −1  −1  −1   1   1   1  −1   1  −1  −1
      1  −1  −1  −1  −1  −1  −1   1   1   1
      1   1  −1  −1   1  −1  −1  −1  −1   1
     −1  −1  −1   1   1   1  −1   1  −1  −1
     −1   1   1   1  −1  −1  −1   1   1   1
      1   1   1   1   1   1   1   1   1   1

The alias matrix is shown below. The fraction multiplier 1/12 takes the place of the diagonal matrix (X1′X1)⁻¹, which has 1/12 on the diagonal.

A = (X1′X1)⁻¹X1′X2 = (1/12) ×

          AB  AC  AD  AE  BC  BD  BE  CD  CE  DE
    A      0   0   0   0  −4   4   4  −4  −4  −4
    B      0  −4   4   4   0   0   0  −4   4  −4
    C     −4   0  −4  −4   0  −4   4   0   0  −4
    D      4  −4   0  −4  −4   0  −4   0  −4   0
    E      4  −4  −4   0   4  −4   0  −4   0   0
    6     −4   4  −4   4   4  −4  −4   4  −4  −4
    7     −4  −4  −4   4  −4   4  −4   4  −4   4
    8      4   4  −4  −4  −4  −4  −4  −4   4   4
    9     −4  −4  −4  −4   4   4  −4  −4  −4  −4
    10    −4  −4   4  −4  −4  −4  −4   4   4  −4
    11    −4   4   4  −4  −4  −4   4  −4  −4   4

From this matrix we can see how the inestimable interaction effects bias many of the estimable main effects and unassigned effects. For example, reading down the first column of A, we see that if the AB interaction was a positive value, it would positively bias main effects D and E and unassigned effect 8, while it would negatively bias main effect C and unassigned effects 6, 7, 9, 10, and 11. In a sense we could say the value of an inestimable interaction leaks into the estimated values of many of the estimable effects when we fit the model y = X1β1 + e by regression. Looking at it from the other view (across the rows of A), we can see what inestimable interactions could potentially bias an estimable effect. For example, looking at the first row we see that main effect A could be negatively biased by interactions BC, CD, CE and DE, and positively biased by BD and BE.
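As an illustration (not part of the original paper, whose calculations used SAS), the matrices above can be reproduced with a few lines of NumPy; the string encoding of the Table 1 settings is simply a convenient way of entering the design.

```python
import numpy as np
from itertools import combinations

# Coded settings from Table 1: columns A-E and the unassigned columns 6-11.
runs = ["+-+---+++-+", "++-+---+++-", "-++-+---+++", "+-++-+---++",
        "++-++-+---+", "+++-++-+---", "-+++-++-+--", "--+++-++-+-",
        "---+++-++-+", "+---+++-++-", "-+---+++-++", "-----------"]
X1 = np.array([[1 if c == "+" else -1 for c in r] for r in runs])   # 12 x 11
labels = list("ABCDE") + [str(k) for k in range(6, 12)]

# Inestimable two-factor interactions of the assigned factors: AB, AC, ..., DE.
pairs = list(combinations(range(5), 2))
X2 = np.column_stack([X1[:, i] * X1[:, j] for i, j in pairs])        # 12 x 10

A = np.linalg.solve(X1.T @ X1, X1.T @ X2)    # alias matrix; X1'X1 = 12 I here
for lab, row in zip(labels, np.round(12 * A).astype(int)):
    print(lab, row)                          # entries 0 or +/-4, matching the display above
```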


In an experimental design with a complex confounding pattern, like the 12 run Plackett–Burman design shown in this example, some of the inestimable interactions can actually be estimated if we pick a subset of main effects and interactions to include in a regression analysis that is smaller in number than the runs in the design. The problem is choosing the effects and interactions to include in the model before we know what is important. Hamada and Wu (1992) solved the problem using several runs of a forward stepwise regression starting with only the main effects as candidates. An all subsets regression that entertains all main effects and interactions could be used, but it is normally infeasible since the computation time is excessive. Another solution is to use an all subsets regression that entertains all estimable main effects and a subset of the interaction effects as candidates. The information in the alias matrix can help us choose a subset of the interactions that are likely to contribute most to the model.

We obtain this information by looking at the bias coefficients in the alias matrix in the rows that correspond to the six largest effects. In the Reactor example above, the largest effects (in absolute value) are B, E, 11, 8, 9 and D, see Box and Meyer (1993). B, D, and 9 had positive effects while the others had negative effects. If these large estimable effects were negligible, their large estimated values could be due to bias from the inestimable interactions. The reasoning for this is similar to the justification for Daniel's (1976) method of detecting outliers in unreplicated designs. Each outlier or interaction will have a positive or negative bias or influence on each estimable effect. To see the potential contribution of any particular interaction simultaneously upon all the large effects, we can algebraically sum the coefficients in the column of the alias matrix corresponding to that interaction, weighted by the sign of the six largest estimable effects that represent rows in the alias matrix.

For example, to see the potential contribution of the AB interaction to the large contrasts, we calculate the sum:

Large effect   Sign of large effect   Coefficient from alias matrix
B              +                      (0)
D              +                      (+4/12)
E              −                      (+4/12)
8              −                      (+4/12)
9              +                      (−4/12)
11             −                      (−4/12)

Algebraic sum of coefficients: −4/12

We can see the relative potential contribution of each of the interactions to the large effects by calculating a similar sum for each interaction. By plotting the absolute values of these sums in a Pareto chart (Lawson and Erjavec, 2001), we can visualize the potential contribution of each inestimable interaction to the large estimable effects. We call this special Pareto chart the alias plot.
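A short sketch of this calculation (again illustrative, repeating the Table 1 encoding used in the earlier sketch) computes the signed column sums and the bar heights of the alias plot:

```python
import numpy as np
from itertools import combinations

runs = ["+-+---+++-+", "++-+---+++-", "-++-+---+++", "+-++-+---++",
        "++-++-+---+", "+++-++-+---", "-+++-++-+--", "--+++-++-+-",
        "---+++-++-+", "+---+++-++-", "-+---+++-++", "-----------"]
X1 = np.array([[1 if c == "+" else -1 for c in r] for r in runs])
y = np.array([56, 93, 67, 60, 77, 65, 95, 49, 44, 63, 63, 61])
labels = list("ABCDE") + [str(k) for k in range(6, 12)]

pairs = list(combinations(range(5), 2))
pair_names = [labels[i] + labels[j] for i, j in pairs]
X2 = np.column_stack([X1[:, i] * X1[:, j] for i, j in pairs])

effects = np.linalg.solve(X1.T @ X1, X1.T @ y)   # least squares estimates
A = np.linalg.solve(X1.T @ X1, X1.T @ X2)        # alias matrix

large = np.argsort(-np.abs(effects))[:6]         # six largest effects: B, E, 11, 8, 9, D
sums = np.sign(effects[large]) @ A[large, :]     # signed column sums over those rows
for k in np.argsort(-np.abs(sums)):              # bar heights of the alias plot
    print(pair_names[k], round(abs(sums[k]), 3))
```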

Fig. 1 shows an alias plot for the reactor data given in Table 1.


Fig. 1. Alias plot for reactor data in Table 1.

From this alias plot we can see that the interactions AC, BD, DE, and BE have the highest potential of contributing to the large effects estimated from the data. Other interactions such as AB, AD, etc., are unlikely to have contributed to the large effects unless they were extremely large. Therefore, we can limit the candidate terms for an all subsets regression routine to the main effects (A, B, C, D, and E) and the interactions that had the highest potential of biasing the large effects, i.e. (AC, BD, DE, BE). Adding to the candidates for the all subsets regression the interactions AB–CE, which have little potential of contributing to the large effects, does not help in identifying the correct model.

The results of an all subsets regression using these variables showed 4–6 variables would be an appropriate model size. The best 4 variable model contained the variables B, D, BD and DE and had an R² = 0.954, while the best 3 variable model only had R² = 0.896. Adding a fifth or sixth variable to the model did not add much, and 5 of the best 10 five variable models and 7 of the best 10 six variable models contained the terms B, D, BD and DE. These are the same terms that Box and Meyer (1993) indicated to be correct for the data, and they were found to be significant in the full factorial by Box et al. (1978).
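For readers who wish to reproduce this step, the following is a minimal all subsets regression sketch over the candidate terms named above (illustrative code, not the SAS macros used for the paper):

```python
import numpy as np
from itertools import combinations

# Reactor data from Table 1: factors A-E and response Y.
runs = ["+-+--", "++-+-", "-++-+", "+-++-", "++-++", "+++-+",
        "-+++-", "--+++", "---++", "+---+", "-+---", "-----"]
F = np.array([[1 if c == "+" else -1 for c in r] for r in runs])
y = np.array([56, 93, 67, 60, 77, 65, 95, 49, 44, 63, 63, 61])
main = dict(zip("ABCDE", F.T))

# Candidates: all main effects plus the interactions flagged by the alias plot.
cand = dict(main)
for name in ["AC", "BD", "DE", "BE"]:
    cand[name] = main[name[0]] * main[name[1]]

def r2(terms):
    """R^2 of the least squares fit of y on an intercept plus the named terms."""
    X = np.column_stack([np.ones(len(y))] + [cand[t] for t in terms])
    res = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
    return 1 - (res @ res) / ((y - y.mean()) @ (y - y.mean()))

for size in range(2, 7):                         # best subset of each size
    best = max(combinations(cand, size), key=r2)
    print(size, best, round(r2(best), 3))
```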

3. The proposed method of analysis

A summary of the proposed method is listed below:

1. Determine the contrast coefficients in the estimable set (X1) and the inestimable set (X2).
2. Calculate the least squares estimates of the estimable effects, β̂1 = (X1′X1)⁻¹X1′y. Normally, this will be a saturated set of orthogonal contrasts like the set shown in the last section.
3. Calculate the alias matrix A = (X1′X1)⁻¹(X1′X2).
4. Determine the set of large contrasts, L, to be those effects whose rank of absolute value is greater than n/2, where n is the number of experiments or runs in the design.
5. Determine if any of the main effects appear to be significant, and if so eliminate them from the list of large effects, L.
6. Algebraically sum (over the rows corresponding to the large effects in L) the coefficients in the alias matrix for each inestimable effect. Make an alias plot (Pareto diagram) of the absolute values of these sums.
7. Perform an all subsets regression using as candidates the main effects in (X1) and the interactions identified to have large contributions in the Pareto style alias plot, and any other interactions thought to be important by the experimenter.
8. Consider the best fitting models found in the all subsets regression, and the principle of effect heredity. Choose a model that appears appropriate and make a confirmation fit to determine the value and significance of the individual effects. (A small computational sketch of steps 2–6 is given after this list.)
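The following compact function is one way steps 2–6 might be coded (an illustrative sketch, not the author's macros; the name alias_guided_candidates and the use of the n/2 largest absolute effects for the set L are assumptions made here). The returned ranking, together with the main effects, supplies the candidate list for the all subsets regression of step 7.

```python
import numpy as np

def alias_guided_candidates(X1, X2, y, interaction_names):
    """Rank the inestimable interactions in X2 by their potential to bias
    the large effects estimated from the saturated model y = X1*beta1 + e."""
    b1 = np.linalg.solve(X1.T @ X1, X1.T @ y)      # step 2: estimable effects
    A = np.linalg.solve(X1.T @ X1, X1.T @ X2)      # step 3: alias matrix
    n = len(y)
    large = np.argsort(-np.abs(b1))[: n // 2]      # step 4: set L of large contrasts
    # step 5 (dropping clearly significant main effects from L) is left to the analyst
    sums = np.sign(b1[large]) @ A[large, :]        # step 6: alias plot values
    order = np.argsort(-np.abs(sums))
    return [(interaction_names[k], abs(sums[k])) for k in order]

# Usage: ranked = alias_guided_candidates(X1, X2, y, int_names); the top-ranked
# interactions, plus the main effects, become candidates for all subsets regression.
```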

Significant main effects are eliminated from the list of large effects in step 5 because their interpretation is clear and should not confuse the identification of potential interactions. Some popular methods of identifying appropriate models in an all subsets regression in step 8 are by reference to the R² statistic, the adjusted R² statistic, Mallows' Cp statistic and the MSE. If the contrasts in the design do not have the same variances (i.e., the diagonal elements of (X1′X1)⁻¹ are not equal), like the coding used for the effects in the L18 design used by Hamada and Wu (1992), steps 2 and 3 above are modified by calculating the standardized effects

β1s = Sqrt(Diag((X1′X1)⁻¹)) (X1′X1)⁻¹ X1′y   (2)

and the standardized alias matrix

As = Sqrt(Diag((X1′X1)⁻¹)) (X1′X1)⁻¹ (X1′X2).   (3)

When there are no large or active interactions in a screening design with a complex confounding pattern, a half-normal or normal plot of the calculated effects would have a clear interpretation with a few significant main effects. However, when there are one or more active interactions in a screening design, their values tend to leak into many of the estimable effects and nothing appears significant on a half-normal plot. In many of these cases the half-normal plot will appear with a non-zero intercept, as if there were outliers in the data.

If n is the number of experiments in the design, examples in the appendix show that the method proposed in this paper works well when the total number of active effects is less than n/4, with less than half of these being interaction effects, or when there are no interactions and the total number of active effects is less than n/3. The method has been tested extensively with data from common screening designs with complex confounding patterns such as the 12 and 20 run Plackett–Burman designs and the L18. The calculations in the tests were performed using SAS on a 166 MHz PC. The only limitation for larger designs such as the 24 or 28 run Plackett–Burman designs is computing time, when the number of candidate variables in the all subsets regression is greater than 25.
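The dependence of the computing time on the number of candidates is easy to see by counting the models an exhaustive search must fit; the small sketch below (an illustration, not from the paper) performs that count.

```python
from math import comb

def n_models(candidates, max_size):
    """Number of subset models an all subsets regression must fit."""
    return sum(comb(candidates, j) for j in range(1, max_size + 1))

# Subsets up to size 8 (roughly n/3 for a 24 run design) for several candidate counts.
for k in (9, 15, 25, 30):
    print(k, n_models(k, 8))
```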


4. Example with simulated data

Hamada and Wu (1992) constructed an example, using computer simulation, with 11 factors, labeled A–K, in a 12 run Plackett–Burman design. The true model was Y = A + 2AB + 2AC + ε, with ε ~ N(0, σ = 0.25). A listing of their actual data appears in Chipman et al. (1997). This is an example where the interactions are larger in magnitude than the main effect, and the Hamada and Wu (1992) primary method could not identify the correct model. This example is included here to show that the proposed method will identify the correct model.
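The published data listing is in Chipman et al. (1997); the sketch below only regenerates data of the same form from the stated model, so the particular 12 run Plackett–Burman construction, the random seed, and the simulated responses are illustrative assumptions and will not reproduce the published values.

```python
import numpy as np

rng = np.random.default_rng(1)

# A 12 run Plackett-Burman design built from cyclic shifts of the usual
# generator row plus a final row of minus ones (11 factors A-K).
first = np.array([1, 1, -1, 1, 1, 1, -1, -1, -1, 1, -1])
X = np.vstack([np.roll(first, k) for k in range(11)] + [-np.ones(11)])
A, B, C = X[:, 0], X[:, 1], X[:, 2]

# True model from the text: Y = A + 2AB + 2AC + eps, eps ~ N(0, sigma = 0.25).
y = A + 2 * A * B + 2 * A * C + rng.normal(0.0, 0.25, size=12)
print(np.round(y, 2))
```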

Fig. 2 shows a half-normal plot of the effects with the graphical significance limit of Lawson et al. (1998). It can be seen that none of the effects are significant, but the six labeled effects may be inflated or biased by the two inestimable interactions. An alias plot was constructed by algebraically summing coefficients in the alias matrix for the labeled effects in the half-normal plot. The resulting alias plot is shown in Fig. 3. In this alias plot it can be seen that the interactions that potentially contribute to the large effects are AB, DE, HJ, AC, DF, GJ and IK. Therefore, an all subsets regression was run with the candidate variables being the main effects A–K and the interactions listed above. A plot of the mean square error (MSE) from the fitted models versus the number of independent variables in the models showed that no more than 4 or 5 independent variables are needed in the fitted equation. The 10 best subsets of sizes 2–5 were examined. There was a large jump in R² from 0.863, for the best 2 variable model, to 0.994 for the best 3 variable model, and very little additional increase in R² for 4 or 5 variable models. The terms A, AB, and AC appeared in all the best 4 and 5 variable models. From this evidence it is convincing that the best model contains only the terms A, AB and AC.

Fig. 2. Half-normal plot of effects from the Hamada and Wu simulated data.


Fig. 3. Alias plot from the Hamada and Wu simulated data.

A final confirmation regression was run to obtain the fitted equation:

Y = −0.0005 + 1.097A + 2.0AB + 1.981AC.

The coefficients on A, AB and AC were all highly significant and there was no collinearity in the model, since both the VIFs and condition indices were 1.0.

5. Example with real published data

A 12 run Plackett–Burman design was employed to study the effect of 7 factors upon fatigue life of weld repaired castings, see Hunter et al. (1982). This experiment was previously analyzed for the presence of interactions by Hamada and Wu (1992), Box and Meyer (1993), and Lin (1998–1999). The data is shown in those papers. The analysis of this data will be presented in a brief form, since the last example illustrated the details of the plots and regression results for the proposed method of analysis. The largest absolute effects on the half-normal plot were (in order of magnitude) F, D, unassigned effect 9, unassigned effect 8, A and B. The inestimable interaction effects having a likely potential contribution to the large effects were: AE FG | AC BC BE | AB AD BD CG DF. The vertical bars in the list are breaks between bars of equal height on the Pareto style alias plot.

An all subsets regression was run using as candidate variables the seven main effects A–G and the interactions listed above. Examining the popular statistics would lead to the conclusion that 3 or 4 variables should be included in the model. Table 2 shows some of the best models of sizes 2, 3 and 4. There are actually 10 3-variable models with R² > 0.90 and 10 4-variable models with R² > 0.94. Hamada and Wu (1992) and Box and Meyer (1993) both pointed out that the model including F, G and their interaction FG provided a plausible explanation for the data, while Lin's (1998) forward stepwise regression method selected the model F, FG, AE, which fit the data slightly better but did not obey the effect heredity principle.


Table 2
Some models for cast fatigue data

Model            R²      Cp
F, FG            0.89    16.6
F, FG, AE        0.95     5.9
F, FG, BD        0.93    10.4
F, D, FG         0.92    13.0
F, G, FG         0.91    14.8
E, F, FG         0.91    15.5
F, AE, FG, AD    0.96     5.5
F, AE, FG, BD    0.96     5.8
C, F, AE, FG     0.96     5.8
E, F, AE, FG     0.96     6.3

By searching a wider model space using the all subsets regression, we can see many other models that also appear plausible. For example, the models (F, D, FG), (E, F, FG), and (E, F, AE, FG) all appear to obey the effect heredity principle, since each interaction involves at least one main effect in the model, and each model explains the data well, with R² values over 0.91. Other models that contain many interactions and do not obey effect heredity principles, such as (F, AE, FG, AD), seem less plausible. With this many reasonable models possible, this data does not appear to be very informative about the exact model, and this is a case where the experimenter's knowledge should be used or further experimentation carried out before selecting the final model. The forward selection regression methods of Hamada and Wu and of Lin would miss this recommendation.
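A simple helper makes the heredity screen used in this paragraph explicit (illustrative code; the term labels follow the notation of Table 2):

```python
def obeys_heredity(model):
    """True if every interaction term (e.g. 'FG') shares at least one letter
    with a main effect term (e.g. 'F' or 'G') that is also in the model."""
    mains = {t for t in model if len(t) == 1}
    return all(set(term) & mains for term in model if len(term) > 1)

for m in [("F", "G", "FG"), ("F", "D", "FG"), ("E", "F", "AE", "FG"),
          ("F", "FG", "AE"), ("F", "AE", "FG", "AD")]:
    print(m, obeys_heredity(m))   # the last two models fail the heredity screen
```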

6. Discussion and conclusions

This paper presents a simple method of detecting models with a few interactions in screening experiments with complex confounding patterns. This method will not work with regular fractions like the 2^(k−p). The method was tested extensively using simulated data and the 12 and 20 run Plackett–Burman designs and the L18 design. The method should also work well for the 24 and 28 run Plackett–Burman designs and the L36 orthogonal array design. The limiting factors are the number of candidate variables in the all subsets regression and the maximum subset size considered (usually n/3).

By analyzing simulated data, it was found that a half-normal plot of the effects was not clear from a design with a complex confounding pattern and active interaction effects. The fact that active but inestimable interactions leak into many other effects inflates the values of those effects, and the half-normal plot of the absolute estimable effects appears as if there are outliers and no significant effects. The information in the alias matrix helps to identify which inestimable interaction effects are inflating the large estimable effects. In the test cases, it was easy to identify the true model using an all subsets regression with the main effects and interactions identified in the alias plot. Only in cases where the number of active effects is large (> n/3) or the number of active interactions was large relative to the number of main effects did the method perform inconsistently.

In real data, where effects are not simply active or not active, but instead range over a spectrum of real values, there may be several models involving the larger effects and interactions that adequately explain the data. Using an all subsets regression method and an appropriate set of candidate variables, many of these potential models can be identified. Once a list of potential models has been identified using all subsets regression, we recommend the experimenter and statistician review the list to select one or more reasonable models, or run additional experiments to reduce confounding and help in identifying the model. This is an after-the-analysis alternative to picking the prior probabilities for various models required by the Bayesian methods of Box and Meyer (1993) and Chipman et al. (1997).

While reviewing the list of models after the analysis, things such as effect heredity and effect sparsity should be kept in mind. Models that contain many interactions and few main effects are usually unlikely, and models that contain interactions that do not involve main effects in the model should not be selected if they do not make physical sense. High R² values can be expected, and effect sparsity should be considered when selecting the final number of variables in the model. The popular methods of R², adjusted R², MSE, and Cp, and the Rencher and Pun (1980) tables for the expected value and upper 95th percentile of R² (expected from all subsets regression with no active effects), can be used to select the final model size. Some of these criteria have been used in the examples presented above. SAS macros were used to perform the calculations in this paper, and they may be obtained from the author ([email protected]) by email.
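For completeness, the standard formulas for these model size criteria could be coded as follows (an illustrative sketch; mse_full denotes the mean squared error of the largest model entertained, used as the error variance estimate in Cp):

```python
import numpy as np

def fit_stats(X, y, mse_full):
    """R^2, adjusted R^2 and Mallows' Cp for a candidate subset model.

    X must include the intercept column; p counts all fitted coefficients.
    """
    n, p = X.shape
    resid = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
    sse = float(resid @ resid)
    sst = float((y - y.mean()) @ (y - y.mean()))
    r2 = 1 - sse / sst
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p)
    cp = sse / mse_full - (n - 2 * p)
    return r2, adj_r2, cp
```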

Appendix. More simulation results

To further illustrate the applicability of the all subsets regression method proposed in this paper, this appendix presents further simulation results. The first example illustrates the fact that the method can detect interactions in non-hierarchical models. Using a 20 run Plackett–Burman design with 15 factors A–O, data was simulated from the model Y = 8 − 3.5A + 3B + 3D + 2.5G + 2AB + 3CF + ε, where ε ~ N(0, 1). This model contains two interactions, but is non-hierarchical in the sense that there is a strong interaction between factors C and F, and neither C nor F has an active main effect.

The largest calculated absolute effects were (in order) G, A, B, M, H, D. None of the effects were significant, but these 6 appeared larger than the others on the half-normal plot. The inestimable interactions that were largest on the alias plot were: EG | CN | AB DL HL | AF AJ CF CO FI JO LN. An all subsets regression was performed using as candidates the 15 main effects A–O and the 12 interactions in the list above. The plot of MSE versus number of terms in the model seemed to indicate that no more than 7 or 8 terms were needed in the regression model. The best 6 variable model contained the terms A, B, D, G, AB, and CF and had an R² = 0.985, a substantial improvement over the best 5 variable model (A, B, D, G, CF) that had R² = 0.904. All the best 7 and 8 variable models included all the terms in the best 6 variable model, giving strong evidence to support the model A, B, D, G, AB, CF.


Table 3
Results of tests with simulated data (n runs, p active effects)^a

Number of active effects                        p ≤ n/4                        n/4 ≤ p ≤ n/3
Effect heredity holds                    No               Yes                       Yes
Size of interaction effects
relative to main effects            Smaller  Larger  Smaller  Larger         Smaller  Larger

Proportion of effects that
are interactions   < 0.40              O       O        O       O               O       O
                   > 0.40              O       O        O       O               X       O

^a O—method able to detect actual model; X—method does not consistently detect actual model.

The proposed method was tested in over 40 additional simulated cases using the 12 and 20 run Plackett–Burman designs and the L18 design. Interactive SAS macros were written to generate the data, calculate the alias matrix, and make the alias plot. A variety of scenarios were tested. It was found that the method does not consistently identify the correct model when the effect sparsity principle does not hold and when there are more than n/3 active effects in a design with n runs. The scenarios tested explored variation in other conditions such as the proportion of active effects that are interactions, the size of interaction effects relative to main effects, and whether the effect heredity principle holds (e.g., all active interactions involve at least one active main effect). Table 3 summarizes the results of the tests. Only when the number of active effects is 25–33% of the runs in the design, and a large proportion of the active effects are interactions whose magnitude is less than the main effect size, did the method perform inconsistently in identifying the true model. In other cases the method was able to detect the actual model (or protect against false negatives).

References

Addelman, S., Kempthorne, O., 1961. Orthogonal Main Effect Plans. ASTIA, Arlington Hall Station, Arlington, VA.
Addelman, S., Kempthorne, O., 1962a. Orthogonal main effect plans for asymmetrical factorial experiments. Technometrics 4 (1), 21–46.
Addelman, S., Kempthorne, O., 1962b. Symmetrical and asymmetrical fractional factorial experiments. Technometrics 4 (1), 47–58.
Box, G.E.P., Hunter, W.G., Hunter, J.S., 1978. Statistics for Experimenters. Wiley, New York.
Box, G.E.P., Meyer, R.D., 1986. An analysis for unreplicated fractional factorials. Technometrics 28 (1), 11–18.
Box, G.E.P., Meyer, R.D., 1993. Finding the active factors in fractionated screening experiments. J. Quality Technol. 25 (2), 94–105.
Box, G.E.P., Wilson, K.B., 1951. On the experimental attainment of optimum conditions. J. Roy. Statist. Soc. B 13 (1), 1–38.
Cheng, C., 1995. Some projection properties of orthogonal arrays. Ann. Statist. 23, 1223–1233.
Chipman, H., Hamada, M., Wu, C.F.J., 1997. A Bayesian variable-selection approach for analyzing designed experiments with complex aliasing. Technometrics 39 (4), 372–381.
Church, A., 1995. Projection methods for generating mixed level fractional factorial designs. ASA Proceedings of the Section on Physical and Engineering Statistics, Vol. 95, pp. 268–273.
Church, A., 1996. Projection methods for generating supersaturated designs. ASA Proceedings of the Section on Physical and Engineering Statistics, Vol. 96, pp. 8–11.
Daniel, C., 1976. Applications of Statistics to Industrial Experimentation. Wiley, New York.
Draper, N.R., Lin, D.K.J., 1995. Characterizing projected designs: repeat and mirror-image runs. Commun. Statist. A Theory Methods 24, 775–795.
George, E.I., McCulloch, R.E., 1993. Variable selection via Gibbs sampling. J. Amer. Statist. Assoc. 88, 881–889.
Hamada, M., Wu, C.F.J., 1992. Analysis of designed experiments with complex aliasing. J. Quality Technol. 24 (3), 130–137.
Hunter, J.S., 1985. Statistical design applied to product design. J. Quality Technol. 17, 210–221.
Hunter, G.B., Hodi, F.S., Eager, T.W., 1982. High-cycle fatigue of weld repaired cast Ti–6Al–4V. Metalur. Trans. 13A, 1589–1594.
Lawson, J., Erjavec, J., 2001. Modern Statistics for Engineering and Quality Improvement. Duxbury, Pacific Grove, CA (Chapter 2).
Lawson, J., Grimshaw, S., Burt, J., 1998. A quantitative method for identifying active contrasts in unreplicated designs based on the half normal plot. Comput. Statist. Data Anal. 26 (4), 425–436.
Lin, D.K.J., 1998–1999. Spotlight interaction effects in main-effect plans: a supersaturated design approach. Quality Eng. 11 (1), 133–139.
Lin, D.K.J., Draper, N.R., 1992. Projection properties of Plackett and Burman designs. Technometrics 34, 423–428.
Lin, D.K.J., Draper, N.R., 1993. Generating alias relationships for two-level Plackett and Burman designs. Comput. Statist. Data Anal. 15, 147–157.
Rencher, A.C., Pun, F.C., 1980. Inflation of R² in best subset regression. Technometrics 22, 49–53.
Wang, J.C., Wu, C.F.J., 1995. A hidden projection property of Plackett–Burman and related designs. Statist. Sinica 5, 235–250.