
Quality Technology & Quantitative Management, Vol. 6, No. 4, pp. 353-369, 2009

A Bayesian Reliability Approach to Multiple Response Optimization with Seemingly Unrelated Regression Models

John J. Peterson¹, Guillermo Mir-Quesada² and Enrique del Castillo³

¹Research Statistics Unit, GlaxoSmithKline Pharmaceuticals, King of Prussia, PA, USA
²Bioprocess Research and Development, Lilly Technical Center-North, Indianapolis, IN, USA
³Department of Industrial and Manufacturing Engineering, The Pennsylvania State University, University Park, PA, USA

(Received September 2006, accepted April 2008)

    ______________________________________________________________________

Abstract: This paper presents a Bayesian predictive approach to multiresponse optimization experiments. It generalizes the work of Peterson [33] in two ways that make it more flexible for use in applications. First, a multivariate posterior predictive distribution of seemingly unrelated regression models is used to determine optimum factor levels by assessing the reliability of a desired multivariate response. It is shown that it is possible for optimal mean response surfaces to appear satisfactory yet be associated with unsatisfactory overall process reliabilities. Second, the use of a multivariate normal distribution for the vector of regression error terms is generalized to that of the (heavier tailed) multivariate t-distribution. This provides a Bayesian sensitivity analysis with regard to moderate outliers. The effect of adding design points is also considered through a preposterior analysis. The advantages of this approach are illustrated with two real examples.

Keywords: Design space, desirability function, Gibbs sampling, multivariate t-distribution, posterior predictive distribution, robust parameter design, robust regression.

    ______________________________________________________________________

    1. Introduction

Statistically designed experiments and associated response surface methods are considered effective methods for optimizing products and processes. Much has been written about experiments involving a single response, but less has been written about multiple response experiments, although they are quite prevalent. Popular statistical packages such as Design Expert and JMP allow experimenters to analyze multiple response experiments by providing procedures based upon "overlapping mean responses" or "desirability functions" of mean responses. The overlapping mean response approach provides an overlay plot of the mean response surfaces to see if there is a configuration of factor levels that simultaneously satisfies the conformance criteria of the experimenter. A listing of articles providing examples or discussion of this approach can be found in Montgomery and Bettencourt [30]. Harrington [16] first proposed an approach based upon a notion of a desirability function. The idea here is that for each response type, a function $d_i(\hat{y}_i)$, taking values on [0, 1], is created to express the desirability of the $i$th mean response as a function of the factor levels. (Here, $\hat{\mathbf{y}}$ is an $r \times 1$ vector of estimated mean responses, and $d_i(\hat{y}_i)$'s closer to 1 are more desirable.) The overall desirability, $D(\hat{\mathbf{y}})$, is a geometric mean of the individual $d_i(\hat{y}_i)$ values. Later on, Derringer and Suich [7], del Castillo et al. [6], and Kim and Lin [21] proposed modifications of Harrington's desirability function.


Another type of approach, based upon a quadratic loss $Q(\hat{\mathbf{y}})$ about a set of target values, has been discussed by Khuri and Conlon [20], Pignatiello [35], Ames et al. [1], Vining [41], and Ko et al. [22]. Some of these approaches try to model the joint predictive distribution of $\hat{\mathbf{y}}$, but do not capture the uncertainty due to the estimated variance-covariance matrix parameters. Furthermore, for the quadratic loss function and desirability function approaches (except perhaps the one due to Harrington) it may be difficult to assign values of the function to a scale that can be converted into "poor", "good", "excellent", etc. Here, a panel of experts may be required (Derringer [8]) to obtain an informative "univariate response index" (Hunter [17]) for a multiresponse optimization problem.

The overlapping mean response, desirability function, and quadratic loss function approaches have the drawback that they do not completely characterize the uncertainty associated with future multivariate responses and their associated optimization measures. The danger of this is that an experimenter may use one of these methods to get an optimal factor configuration, validate it with two or three successful runs, and then begin production. For example, suppose that the probability that a future multivariate response is satisfactory is only 0.7. Even so, the chance of getting three successful, independent validation runs is $0.7^3 = 0.343$, which can easily happen. Hunter [17] states that the variance of univariate response indices for multiresponse optimization "can be disturbing" and further study is needed to assess the influence of parameter uncertainty.

For the optimal conditions obtained by these approaches, Peterson [33] used a real-data example to show that the probability (i.e. reliability) of a good multivariate response, as measured by these optimization criteria, can be unacceptably low. Furthermore, he showed that ignoring the model parameter uncertainty can lead to reliability estimates that are too large. A practical drawback of the methodology in Peterson [33] is that the regression models were limited to the standard (normal theory) multivariate regression (SMR) model (cf. Johnson and Wichern [19]) having the same covariate structure across response types.

In this paper we generalize the applicability of the Bayesian methodology in Peterson [33] to make it more widely useful for addressing commonly occurring multivariate response surface problems. This is done by introducing a method for utilizing seemingly unrelated regression (SUR) models (Zellner [44]) where each response type has its own covariate structure. In addition to the multivariate normal distribution assumption for the vector of regression errors, we provide a further modification of this approach to handle regression-error vectors that have a multivariate t-distribution. The t-distribution modeling is useful for many typical response surface experiments from a Bayesian sensitivity analysis perspective. Many response surface designs have sample sizes sufficiently small as to make it difficult to assess normality of the residuals. Obvious outliers can be removed, but moderate outliers may be somewhat confounded with factor effects. The t-distribution allows the experimenter to vary the thickness of the outer tails of the distribution of the residual errors by varying the associated degrees of freedom parameter. These two modifications of Peterson [33] provide flexibility that may be needed for typical response surface experiments.

In this paper, we re-analyze the two examples found in Peterson [33] using the above two generalizations. In the first example, a mixture experiment, we show that use of SUR modeling can make a noticeable difference. The second example provides an illustration that the optimal conditions for the process under study provide a high likelihood of meeting specifications whether we use a normal distribution or a t-distribution with heavier tails. This distribution sensitivity analysis provides some assurance that extensive validation efforts may not be needed before implementing this optimized process.


For these examples, process optimality is measured by $p(\mathbf{x}) = \Pr(\mathbf{Y} \in A \mid \mathbf{x}, \text{data})$, where $\mathbf{Y}$ is an $r \times 1$ vector of responses, $\mathbf{x}$ is a $k \times 1$ vector of process factors, and $A$ is a specification set that describes desirable or acceptable values for $\mathbf{Y}$. The approach used in this paper models the posterior predictive distribution of $\mathbf{Y}$ given $\mathbf{x}$ to find a value of $\mathbf{x}$ that maximizes $p(\mathbf{x})$. Here, the probability of response conformance, or reliability, $p(\mathbf{x})$, is easy to interpret. However, other process reliabilities can be constructed if one wishes to employ a desirability function, $D(\mathbf{Y})$, or a quadratic loss function, $Q(\mathbf{Y})$. For example, using the posterior predictive distribution one can compute $p(\mathbf{x}) = \Pr(D(\mathbf{Y}) \ge D^* \mid \mathbf{x}, \text{data})$ or $p(\mathbf{x}) = \Pr(Q(\mathbf{Y}) \le Q^* \mid \mathbf{x}, \text{data})$ if informative values of $D^*$ or $Q^*$ are available. An illustration is given in Peterson [33]. The predictive nature of this approach also easily allows for the incorporation of noise variables to help the experimenter create a robust process. See, for example, Mir-Quesada et al. [29] and Rajagopal et al. [40].

An important application of the $p(\mathbf{x})$ function is the set of all $\mathbf{x}$-points such that $p(\mathbf{x})$ is at least some prespecified reliability value. An example of such a set appears in Peterson [33]. This type of set has applications for process capability, and in fact has been proposed by Peterson [34] for construction of a "design space", a pharmaceutical manufacturing process capability region described in the FDA document "Guidance for Industry Q8 Pharmaceutical Development" [10], available at http://www.fda.gov/cber/gdlns/ichq8pharm.htm.

    2. The Statistical Model for a Bayesian Reliability

To compute $p(\mathbf{x}) = \Pr(\mathbf{Y} \in A \mid \mathbf{x}, \text{data})$ for multiresponse process optimization, we need to obtain the posterior predictive distribution for $\mathbf{Y}$ given $\mathbf{x}$. The regression model considered here is the one that allows the experimenter to use a different (parametrically) linear model for each response type. This will allow for more flexible and accurate modeling of $\mathbf{Y}$ than one would obtain with the SMR model.

Here, $\mathbf{Y} = (Y_1, \ldots, Y_r)'$ is a vector of $r$ response types and $\mathbf{x}$ is a $k \times 1$ vector of factors that influence $\mathbf{Y}$ by way of the functions

$$Y_i = \mathbf{z}_i(\mathbf{x})'\boldsymbol{\beta}_i + e_i, \quad i = 1, \ldots, r, \qquad (1)$$

where $\boldsymbol{\beta}_i$ is a $p_i \times 1$ vector of regression model parameters and $\mathbf{z}_i(\mathbf{x})$ is a $p_i \times 1$ vector of covariates which are arbitrary functions of $\mathbf{x}$. Furthermore, $\mathbf{e} = (e_1, \ldots, e_r)'$ is a random variable with a multivariate normal distribution having mean vector $\mathbf{0}$ and variance-covariance matrix $\boldsymbol{\Sigma}$. The model in (1) has been referred to as the "seemingly unrelated regressions" (SUR) model (Zellner [43]). When $\mathbf{z}_1(\mathbf{x}) = \cdots = \mathbf{z}_r(\mathbf{x}) = \mathbf{z}(\mathbf{x})$, we obtain the SMR model.

In order to model all of the data and obtain a convenient form for estimating the regression parameters, consider the following vector-matrix form,

$$\mathbf{Y} = \mathbf{Z}\boldsymbol{\beta} + \mathbf{e}, \qquad (2)$$

where $\mathbf{Y} = [\mathbf{Y}_1', \ldots, \mathbf{Y}_r']'$, $\boldsymbol{\beta} = [\boldsymbol{\beta}_1', \ldots, \boldsymbol{\beta}_r']'$, $\mathbf{e} = [\mathbf{e}_1', \ldots, \mathbf{e}_r']'$, and $\mathbf{Z}$ is an $m \times p$ block diagonal matrix of the form $\mathrm{diag}(\mathbf{Z}_1, \ldots, \mathbf{Z}_r)$ with $p = p_1 + \cdots + p_r$. Here, $\mathbf{Y}_i = (Y_{i1}, \ldots, Y_{in})'$, $\mathbf{e}_i = (e_{i1}, \ldots, e_{in})'$, and $\mathbf{Z}_i = [\mathbf{z}_i(\mathbf{x}_1), \ldots, \mathbf{z}_i(\mathbf{x}_n)]'$ for $i = 1, \ldots, r$.

For the SMR model under a noninformative prior (which is proportional to $|\boldsymbol{\Sigma}|^{-(r+1)/2}$), the posterior predictive density function of $\mathbf{Y}$ has the multivariate t-distribution form. See, for example, Press [36].


Simulation of a multivariate t-distribution r.v. with $v$ df can be done simply by simulating a multivariate normal r.v. and an independent chi-square r.v. with $v$ df (Johnson [18]). For the SMR model then, $p(\mathbf{x}) = \Pr(\mathbf{Y} \in A \mid \mathbf{x}, \text{data})$ can be computed directly for each $\mathbf{x}$ by Monte Carlo simulation. This was done by Peterson [33] as a way to do multiresponse surface optimization by maximizing $p(\mathbf{x})$ over the experimental region. Mir-Quesada et al. [29] extended these results for the SMR case to include noise variables. Multivariate t-distribution probabilities over hyper-rectangles can also be computed efficiently by numerical integration (Genz and Bretz [13]).

For the SUR model, no closed-form density or sampling procedure exists. However, using Gibbs sampling (Griffiths [15]) it is easy to generate random pairs of SUR model parameters from the posterior distribution of $(\boldsymbol{\beta}, \boldsymbol{\Sigma})$. Using the SUR model in (1) it is then straightforward to simulate r.v.'s from the posterior predictive distribution of $\mathbf{Y}$ given $\mathbf{x}$.
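As a concrete illustration of the normal/chi-square construction just mentioned for the SMR case, the following minimal Python sketch draws from a multivariate t-distribution with a given location vector, scale matrix, and degrees of freedom. The function name and arguments are our own illustration, not code from the paper.

```python
import numpy as np

def rmvt(mean, scale, df, size, rng=None):
    """Draw `size` multivariate t r.v.'s by combining a multivariate normal
    draw with an independent chi-square draw (the construction in Johnson [18])."""
    rng = np.random.default_rng() if rng is None else rng
    mean = np.asarray(mean, dtype=float)
    z = rng.multivariate_normal(np.zeros(len(mean)), scale, size=size)  # N(0, scale)
    u = rng.chisquare(df, size=size)                                    # independent chi-square, df
    return mean + z / np.sqrt(u / df)[:, None]                          # t_df(mean, scale)
```

Each returned row can then be checked against the specification set $A$ to build a Monte Carlo estimate of $p(\mathbf{x})$ for the SMR model.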

    3. Computing the Bayesian Reliability

    3.1. The SUR Model with Normally Distributed Error Terms

Before describing the Bayesian analysis, it is convenient to discuss some (conditional) maximum likelihood estimates for the SUR model. For a given $\boldsymbol{\Sigma}$, the maximum likelihood estimate (MLE) of $\boldsymbol{\beta}$ can be expressed as

$$\hat{\boldsymbol{\beta}}(\boldsymbol{\Sigma}) = [\mathbf{Z}'(\boldsymbol{\Sigma}^{-1} \otimes \mathbf{I}_n)\mathbf{Z}]^{-1}\mathbf{Z}'(\boldsymbol{\Sigma}^{-1} \otimes \mathbf{I}_n)\mathbf{Y}, \qquad (3)$$

where $\mathbf{I}_n$ is the $n \times n$ identity matrix and $\otimes$ is the Kronecker direct product operator. The variance-covariance matrix of $\hat{\boldsymbol{\beta}}(\boldsymbol{\Sigma})$ is $\mathrm{Var}(\hat{\boldsymbol{\beta}}) = [\mathbf{Z}'(\boldsymbol{\Sigma}^{-1} \otimes \mathbf{I}_n)\mathbf{Z}]^{-1}$.

For a given $\boldsymbol{\beta}$, the variance-covariance matrix $\boldsymbol{\Sigma}$ can be estimated by

$$\hat{\boldsymbol{\Sigma}}(\boldsymbol{\beta}) = \frac{1}{n}\sum_{j=1}^{n} \mathbf{e}_j(\boldsymbol{\beta})\,\mathbf{e}_j(\boldsymbol{\beta})', \qquad (4)$$

where $\mathbf{e}_j(\boldsymbol{\beta}) = (e_{1j}(\boldsymbol{\beta}), \ldots, e_{rj}(\boldsymbol{\beta}))'$ and $e_{ij}(\boldsymbol{\beta}) = y_{ij} - \mathbf{z}_i(\mathbf{x}_j)'\boldsymbol{\beta}_i$, $i = 1, \ldots, r$. Let $\tilde{\boldsymbol{\beta}}_i$ be the maximum likelihood estimator of $\boldsymbol{\beta}_i$ for each response type independently of the other responses, and define $\tilde{\boldsymbol{\beta}} = [\tilde{\boldsymbol{\beta}}_1', \ldots, \tilde{\boldsymbol{\beta}}_r']'$. The estimator

$$\boldsymbol{\beta}^* = [\mathbf{Z}'(\hat{\boldsymbol{\Sigma}}(\tilde{\boldsymbol{\beta}})^{-1} \otimes \mathbf{I}_n)\mathbf{Z}]^{-1}\mathbf{Z}'(\hat{\boldsymbol{\Sigma}}(\tilde{\boldsymbol{\beta}})^{-1} \otimes \mathbf{I}_n)\mathbf{Y} \qquad (5)$$

is called the two-stage Aitken estimator (Zellner [43]).

In order to compute and maximize $p(\mathbf{x}) = \Pr(\mathbf{Y} \in A \mid \mathbf{x}, \text{data})$ over the experimental region, it is important to have a relatively efficient method for approximating $p(\mathbf{x})$ by Monte Carlo simulation. The approach taken in this paper is to simulate a large number of r.v.'s from the posterior distribution of $(\boldsymbol{\beta}, \boldsymbol{\Sigma})$, and use each $(\boldsymbol{\beta}, \boldsymbol{\Sigma})$ value to generate a $\mathbf{Y}$ r.v. for each $\mathbf{x}$. In this way, the sample of $(\boldsymbol{\beta}, \boldsymbol{\Sigma})$ values can be re-used for simulating $\mathbf{Y}$ values at each $\mathbf{x}$ point, instead of having to do the Gibbs sampling all over again for each $\mathbf{x}$-point.


Consider the noninformative prior for $(\boldsymbol{\beta}, \boldsymbol{\Sigma})$ which is proportional to $|\boldsymbol{\Sigma}|^{-(r+1)/2}$ (Percy [32] and Griffiths [15]). Note that the posterior distribution of $\boldsymbol{\beta}$ given $\boldsymbol{\Sigma}$ is modeled by

$$\boldsymbol{\beta} \mid \boldsymbol{\Sigma} \sim N\!\left(\hat{\boldsymbol{\beta}}(\boldsymbol{\Sigma}),\ [\mathbf{Z}'(\boldsymbol{\Sigma}^{-1} \otimes \mathbf{I}_n)\mathbf{Z}]^{-1}\right), \qquad (6)$$

where $\hat{\boldsymbol{\beta}}(\boldsymbol{\Sigma})$ has the form as in (3). This follows from Srivastava and Giles [39]. Note also that the posterior distribution of $\boldsymbol{\Sigma}^{-1}$ given $\boldsymbol{\beta}$ is described by

$$\boldsymbol{\Sigma}^{-1} \mid \boldsymbol{\beta} \sim W\!\left(n,\ n^{-1}\hat{\boldsymbol{\Sigma}}(\boldsymbol{\beta})^{-1}\right), \qquad (7)$$

where $W(n, n^{-1}\hat{\boldsymbol{\Sigma}}(\boldsymbol{\beta})^{-1})$ is the Wishart distribution with $n$ df and scale parameter $n^{-1}\hat{\boldsymbol{\Sigma}}(\boldsymbol{\beta})^{-1}$, and $\hat{\boldsymbol{\Sigma}}(\boldsymbol{\beta})$ has the form as in (4). This follows from a slight modification of expression (7) in Percy [32]. Sampling values from the posterior distribution of $(\boldsymbol{\beta}, \boldsymbol{\Sigma})$ can be done as follows using Gibbs sampling:

Step 0. Initialize the Gibbs sampling chain using $\boldsymbol{\beta} = \boldsymbol{\beta}^* + \omega\,\boldsymbol{\Sigma}_{\beta}^{*\,1/2}\mathbf{e}$, where $\boldsymbol{\beta}^*$ corresponds to (5), $\boldsymbol{\Sigma}_{\beta}^*$ corresponds to the variance-covariance form associated with (5), and $\mathbf{e} \sim N(\mathbf{0}, \mathbf{I})$. Here, $\omega$ can be used to induce a slight overdispersion as recommended by Gelman et al. [12]. In this article, $\omega = 2$ is used. This initialization is done since $\boldsymbol{\beta}$ is approximately normal with mean $\boldsymbol{\beta}^*$ and variance-covariance matrix $\boldsymbol{\Sigma}_{\beta}^*$.

Step 1. Generate a $\boldsymbol{\Sigma}^{-1}$ value as in (7) by using the most recently simulated $\boldsymbol{\beta}$ and the decomposition $\boldsymbol{\Sigma}^{-1} = \mathbf{S}^{*\prime}\,\mathbf{T}\,\mathbf{S}^*$, where $\mathbf{S}^{*\prime}\mathbf{S}^* = n^{-1}\hat{\boldsymbol{\Sigma}}(\boldsymbol{\beta})^{-1}$ and $\mathbf{T} = \sum_{i=1}^{n}\boldsymbol{\eta}_i\boldsymbol{\eta}_i'$. Here, $\boldsymbol{\eta}_1, \ldots, \boldsymbol{\eta}_n$ are iid $N(\mathbf{0}, \mathbf{I}_r)$ distributed.

Step 2. Generate a $\boldsymbol{\beta}$ value as in (6) by using the most recently simulated $\boldsymbol{\Sigma}$ and $\boldsymbol{\beta} = \hat{\boldsymbol{\beta}}(\boldsymbol{\Sigma}) + \mathbf{R}'\boldsymbol{\eta}_0$, where $\mathbf{R}'\mathbf{R} = [\mathbf{Z}'(\boldsymbol{\Sigma}^{-1} \otimes \mathbf{I}_n)\mathbf{Z}]^{-1}$ and $\boldsymbol{\eta}_0$ is distributed as $N(\mathbf{0}, \mathbf{I})$.

Following Percy [32], we use a burn-in of 100 iterations for steps 1 and 2. See Geweke [14] for use of (conditionally conjugate) informative priors for $\boldsymbol{\beta}$ and $\boldsymbol{\Sigma}$.
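To make Steps 1 and 2 concrete, here is a minimal Python sketch of the two-step sampler under the noninformative prior. It is our own illustration, not the authors' code: the names (`gibbs_sur_normal`, `Z_list`, `Y`) are assumptions, the Wishart draw uses the sum-of-outer-products construction described in Step 1, and equation-by-equation least squares stands in for the two-stage initialization.

```python
import numpy as np
from scipy.linalg import block_diag

def gibbs_sur_normal(Z_list, Y, n_draws=1000, burn_in=100, seed=0):
    """Two-step Gibbs sampler (Steps 1-2 of Section 3.1) for the normal-error SUR
    model under the noninformative prior proportional to |Sigma|^{-(r+1)/2}.

    Z_list : list of r design matrices, each of shape (n, p_i)
    Y      : (n, r) matrix of responses
    Returns arrays of posterior draws of the stacked beta and of Sigma.
    """
    rng = np.random.default_rng(seed)
    n, r = Y.shape
    Z = block_diag(*Z_list)              # (r*n, p) block-diagonal design, p = sum p_i
    y = Y.T.reshape(-1)                  # responses stacked as [Y_1', ..., Y_r']'
    splits = np.cumsum([Zi.shape[1] for Zi in Z_list])[:-1]

    def sigma_hat(beta):
        # Residuals e_ij = y_ij - z_i(x_j)'beta_i, then Sigma_hat = E'E / n as in (4)
        fitted = np.column_stack([Zi @ bi for Zi, bi in zip(Z_list, np.split(beta, splits))])
        E = Y - fitted
        return E.T @ E / n

    def beta_hat(Sigma_inv):
        # GLS form (3): [Z'(Sigma^{-1} (x) I_n)Z]^{-1} Z'(Sigma^{-1} (x) I_n) y
        W = np.kron(Sigma_inv, np.eye(n))
        A = Z.T @ W @ Z
        return np.linalg.solve(A, Z.T @ W @ y), A

    # Initialize with equation-by-equation least squares (in the spirit of (5))
    beta = np.concatenate([np.linalg.lstsq(Zi, Y[:, i], rcond=None)[0]
                           for i, Zi in enumerate(Z_list)])
    betas, Sigmas = [], []
    for it in range(burn_in + n_draws):
        # Step 1: Sigma^{-1} | beta ~ Wishart(n, n^{-1} Sigma_hat(beta)^{-1})
        V = np.linalg.inv(n * sigma_hat(beta))     # scale matrix n^{-1} Sigma_hat^{-1}
        L = np.linalg.cholesky(V)
        eta = rng.standard_normal((n, r))
        Sigma_inv = L @ (eta.T @ eta) @ L.T        # eta'eta ~ Wishart(n, I_r)
        # Step 2: beta | Sigma ~ N(beta_hat(Sigma), [Z'(Sigma^{-1} (x) I_n)Z]^{-1})
        mean, A = beta_hat(Sigma_inv)
        beta = rng.multivariate_normal(mean, np.linalg.inv(A))
        if it >= burn_in:
            betas.append(beta)
            Sigmas.append(np.linalg.inv(Sigma_inv))
    return np.array(betas), np.array(Sigmas)
```

The returned draws of $(\boldsymbol{\beta}, \boldsymbol{\Sigma})$ are the quantities that get re-used at every candidate $\mathbf{x}$ in the next step.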

To compute $p(\mathbf{x})$, $N$ $\mathbf{Y}$-vectors are generated for each $\mathbf{x}$. Each simulated $\mathbf{Y}$-vector, $\mathbf{Y}^{(s)}$, is generated using

$$\mathbf{Y}^{(s)} = \begin{pmatrix} \mathbf{z}_1(\mathbf{x})'\boldsymbol{\beta}_1^{(s)} \\ \vdots \\ \mathbf{z}_r(\mathbf{x})'\boldsymbol{\beta}_r^{(s)} \end{pmatrix} + \mathbf{e}^{(s)}, \qquad (8)$$

where $(\boldsymbol{\beta}^{(s)}, \boldsymbol{\Sigma}^{(s)})$ is sampled using the Gibbs sampler and $\mathbf{e}^{(s)}$ is sampled from $N(\mathbf{0}, \boldsymbol{\Sigma}^{(s)})$, $s = 1, \ldots, N$. For each new $\mathbf{x}$-point, the same $N$ $(\boldsymbol{\beta}^{(s)}, \boldsymbol{\Sigma}^{(s)})$ pairs are used. The Bayesian reliability, $p(\mathbf{x})$, is approximated by

$$\hat{p}(\mathbf{x}) = \frac{1}{N}\sum_{s=1}^{N} I\!\left(\mathbf{Y}^{(s)} \in A\right), \qquad (9)$$

for large $N$, where $I(\cdot)$ is an indicator function.
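The following short Python sketch mirrors (8) and (9). It is an illustrative implementation under assumed helper names (`z_block`, `in_A`), not code from the paper; the key point it shows is that the same posterior draws are recycled for every candidate $\mathbf{x}$.

```python
import numpy as np

def p_hat(x, z_block, beta_draws, Sigma_draws, in_A, rng=None):
    """Monte Carlo estimate of p(x) = Pr(Y in A | x, data) as in (8)-(9).

    z_block(x)  : hypothetical helper returning the r x p block matrix whose i-th row
                  is z_i(x)' placed against the corresponding beta_i block
    beta_draws  : (N, p) stacked beta draws from the Gibbs sampler
    Sigma_draws : (N, r, r) Sigma draws
    in_A        : callable mapping a simulated length-r response to True/False
    """
    rng = np.random.default_rng() if rng is None else rng
    Zx = z_block(x)
    hits = 0
    for beta_s, Sigma_s in zip(beta_draws, Sigma_draws):
        mean = Zx @ beta_s                              # z_i(x)'beta_i^(s), i = 1..r
        y_s = rng.multivariate_normal(mean, Sigma_s)    # add e^(s) ~ N(0, Sigma^(s))
        hits += in_A(y_s)                               # indicator I(Y^(s) in A)
    return hits / len(beta_draws)
```

For example, in the mixture experiment of Section 6.1 one would take `in_A = lambda y: (y[0] <= 240) and (y[1] <= 19)` and evaluate `p_hat` over a grid on the design simplex to locate the maximizing $\mathbf{x}$.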

Percy [32] provides a similar, but three-step, Gibbs sampling procedure that generates a $(\boldsymbol{\beta}, \boldsymbol{\Sigma}, \mathbf{Y})$ triplet for a given $\mathbf{x}$ value. However, this is not efficient for our purposes, as this Gibbs sampling procedure would have to be re-done for many $\mathbf{x}$-points in order to optimize $p(\mathbf{x})$. Percy also proposes a multivariate normal approximation to the posterior predictive distribution of $\mathbf{Y}$ given $\mathbf{x}$. However, such an approximation may not be accurate for small sample sizes. This is because one would expect the true posterior predictive distribution of $\mathbf{Y}$ given $\mathbf{x}$ to have heavier tails than a normal distribution due to model parameter uncertainty; this is indeed the case with the SMR model.

    3.2. The SUR Model with t-Distribution Error Terms

The multivariate t-distribution can be a useful generalization of the multivariate normal distribution for applied statistics (Liu and Rubin [25]). In particular, it can be a useful tool for modeling bell-shaped distributions that have heavier than normal "tails". Liu [26] illustrates the utility of using t-distribution errors for robust data analysis within the context of an SMR model. Rajagopal et al. [40] provide an example for the univariate regression case.

In this subsection, we show how to sample from the posterior predictive distribution of a SUR model with multivariate t-distribution errors. This will allow the experimenter to perform a sensitivity analysis with regard to a distribution that spans a continuum from Cauchy (df = 1) to normal (df = $\infty$). This can be useful for many typically used response surface designs that may not provide enough data to perform discriminating tests of normality. Our experience is that (if the mean response is in $A$) the Bayesian reliability $p(\mathbf{x})$ gets smaller as the df get smaller, reflecting a more disperse posterior predictive distribution for $\mathbf{Y}$ at each $\mathbf{x}$-point. If $p(\mathbf{x})$ is acceptably large for both small and large df, then our sensitivity analysis provides some confidence that we have a reliable process, provided that the residual distribution appears bell-shaped and we have found good regression models for each response type.

The SUR model with t-distribution errors has the same model form as in (1) but with the $e_i$'s replaced by $\varepsilon_i$'s, where the vector $\boldsymbol{\varepsilon} = (\varepsilon_1, \ldots, \varepsilon_r)'$ has a multivariate t-distribution with location parameter $\mathbf{0}$, scale (matrix) parameter $\boldsymbol{\Sigma}$, and df parameter $v$. The inverse of $\boldsymbol{\Sigma}$ is sometimes called the precision matrix. For details about the multivariate t-distribution see Kotz and Johnson [23]. Here, we are assuming that $v$ is known. Some authors recommend using df $v \ge 4$ (which implies three finite moments) for the t-distribution for purposes of modeling heavy-tailed errors. See, for example, Lange et al. [24], Gelman et al. [12], and Congdon [5]. The same noninformative prior, proportional to $|\boldsymbol{\Sigma}|^{-(r+1)/2}$, can be used for the t-distribution errors model. This prior is used in this article.

To do Gibbs sampling from a SUR model with t-distribution errors, first consider the following weighted SUR model

$$Y_{ij} = \mathbf{z}_i(\mathbf{x}_j)'\boldsymbol{\beta}_i + \frac{e_{ij}}{\sqrt{w_j}}, \quad i = 1, \ldots, r,\ \ j = 1, \ldots, n, \qquad (10)$$

where (10) is defined as in (1) but with an index $j$ to represent the observation number and with $e_{ij}$ replaced by $e_{ij}/\sqrt{w_j}$. Here, $\mathbf{e}_j = (e_{1j}, \ldots, e_{rj})' \sim$ iid $N(\mathbf{0}, \boldsymbol{\Sigma})$, $j = 1, \ldots, n$. Conditional on the $w_j$'s, (10) is a weighted SUR model. Note that here the weight is the same for each $r \times 1$ vector of responses, $\mathbf{Y}_j$, but different for each ($j$th) observation. If $w_j = u_j/v$ for $j = 1, \ldots, n$, where $u_1, \ldots, u_n$ are iid chi-square with $v$ df, then (unconditional on the $w_j$'s) the $r \times 1$ error vectors, $(e_{1j}/\sqrt{w_j}, \ldots, e_{rj}/\sqrt{w_j})'$, $j = 1, \ldots, n$, are iid multivariate $t$ with $v$ df, location parameter vector $\mathbf{0}$, and scale parameter matrix $\boldsymbol{\Sigma}$ (Congdon [5]). As with the normal errors SUR model, we use the noninformative prior which is proportional to $|\boldsymbol{\Sigma}|^{-(r+1)/2}$.

In order to set up the Gibbs sampling, we need to define some estimator-like functions of the data and model parameters. First we define

$$\hat{\boldsymbol{\beta}}(\boldsymbol{\Sigma}, \mathbf{W}) = [\mathbf{Z}'(\boldsymbol{\Sigma}^{-1} \otimes \mathbf{W})\mathbf{Z}]^{-1}\mathbf{Z}'(\boldsymbol{\Sigma}^{-1} \otimes \mathbf{W})\mathbf{Y},$$

where $\mathbf{W} = \mathrm{diag}(w_1, \ldots, w_n)$. In addition, let

$$\mathbf{V}(\boldsymbol{\Sigma}, \mathbf{W}) = [\mathbf{Z}'(\boldsymbol{\Sigma}^{-1} \otimes \mathbf{W})\mathbf{Z}]^{-1}.$$

Finally, let

$$\hat{\boldsymbol{\Sigma}}(\boldsymbol{\beta}, \mathbf{W}) = \frac{1}{n}\sum_{i=1}^{n} w_i\,(\mathbf{y}_i - \mathbf{Z}_i\boldsymbol{\beta})(\mathbf{y}_i - \mathbf{Z}_i\boldsymbol{\beta})'.$$

    The basic steps of the Gibbs sampling are as follows.

Step 0. Initialize the Gibbs sampling chain using (5) for $\boldsymbol{\beta}$. For $\mathbf{W} = \mathrm{diag}(w_1, \ldots, w_n)$, simulate the $w_i$'s, where $w_i = u_i/v$ and the $u_i$'s have independent chi-square distributions with $v$ df $(i = 1, \ldots, n)$.

Step 1. Simulate $\boldsymbol{\Sigma} \mid \boldsymbol{\beta}, \mathbf{W}$ according to $\boldsymbol{\Sigma}^{-1} \mid \boldsymbol{\beta}, \mathbf{W} \sim W(n,\ n^{-1}\hat{\boldsymbol{\Sigma}}(\boldsymbol{\beta}, \mathbf{W})^{-1})$.

Step 2. Simulate $\boldsymbol{\beta} \mid \boldsymbol{\Sigma}, \mathbf{W}$ according to $\boldsymbol{\beta} \mid \boldsymbol{\Sigma}, \mathbf{W} \sim N(\hat{\boldsymbol{\beta}}(\boldsymbol{\Sigma}, \mathbf{W}),\ \mathbf{V}(\boldsymbol{\Sigma}, \mathbf{W}))$.

Step 3. Simulate $\mathbf{W} = \mathrm{diag}(w_1, \ldots, w_n)$ conditional on $\boldsymbol{\beta}$ and $\boldsymbol{\Sigma}$ by simulating each $w_i$ independently according to a gamma distribution, $w_i \sim G(b, c_i)$, $i = 1, \ldots, n$, where $G(b, c)$ denotes a gamma distribution with density function

$$g(w; b, c) = \frac{w^{b-1}e^{-w/c}}{\Gamma(b)\,c^{b}} \quad \text{for } w > 0.$$

Here, $b = (v + r)/2$ and

$$c_i = \left[\frac{v}{2} + \frac{1}{2}(\mathbf{y}_i - \mathbf{Z}_i\boldsymbol{\beta})'\boldsymbol{\Sigma}^{-1}(\mathbf{y}_i - \mathbf{Z}_i\boldsymbol{\beta})\right]^{-1}. \qquad (11)$$

Computing $\mathbf{Y}^{(s)}$ and $p(\mathbf{x})$ follows as in (8) and (9). If so desired, it is clear from the above Gibbs sampling steps that one can use the same (conditionally conjugate) informative priors for $\boldsymbol{\beta}$ and $\boldsymbol{\Sigma}$ as for the normal errors SUR model.
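As an illustration of Step 3 only, the following minimal Python sketch draws the latent weights under (11). The function and argument names, including the per-observation covariate layout `Z_obs`, are our own assumptions rather than the paper's notation.

```python
import numpy as np

def update_weights(Y, Z_obs, beta, Sigma_inv, v, rng):
    """Step 3 sketch: draw w_1..w_n for the t-error SUR model from G(b, c_i) in (11).

    Y        : (n, r) response matrix, row i is y_i
    Z_obs    : list of n matrices, each r x p, so that Z_obs[i] @ beta is the mean of y_i
    beta     : stacked regression coefficients, length p
    Sigma_inv: r x r precision matrix (inverse of Sigma)
    v        : t-distribution degrees of freedom
    """
    n = Y.shape[0]
    r = Sigma_inv.shape[0]
    b = (v + r) / 2.0                                   # shape parameter b = (v + r)/2
    w = np.empty(n)
    for i in range(n):
        resid = Y[i] - Z_obs[i] @ beta                  # y_i - Z_i beta
        quad = resid @ Sigma_inv @ resid                # (y_i - Z_i beta)' Sigma^{-1} (y_i - Z_i beta)
        c_i = 1.0 / (v / 2.0 + 0.5 * quad)              # scale c_i as in (11)
        w[i] = rng.gamma(shape=b, scale=c_i)            # w_i ~ G(b, c_i)
    return w
```

Steps 1 and 2 then proceed exactly as in the normal-error sketch above, but with the weighted forms $\hat{\boldsymbol{\beta}}(\boldsymbol{\Sigma}, \mathbf{W})$, $\mathbf{V}(\boldsymbol{\Sigma}, \mathbf{W})$, and $\hat{\boldsymbol{\Sigma}}(\boldsymbol{\beta}, \mathbf{W})$.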

    3.3. The Addition of Noise Variables

    One advantage of this posterior predictive approach to multiresponse optimization isthat it easily allows the experimenter to incorporate noise variables and thereby do

    robust-parameter-design process optimization. A noise variable is a factor that may be

    precisely controlled in a laboratory setting but not in actual production use. To see how

    noise variables can be incorporated, let c 1 1( ,... , ,..., )h h kx x x xx 1,...,h kx xwhere arenoise variables. Here, it is typically assumed that the ( 1,..., )jx j h k

    (0,1).N

    are scaled such

    that they are iid By simulating c1( ,..., )h kx x 1 1( ,..., ) Pr( | ,..., , data)h hp x x A x xY

    1( ,..., )hp x x

    and substituting into the simulation

    for (8), can be computed. Maximizing

    provides for a way to do robust process optimization. Details for the SMR

    case are discussed in Mir-Quesada et al. [29] and Rajagopal et al. [40]. Extension to the

    SUR case for normal ort-distributed errors is straightforward.
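A hedged sketch of this modification, reusing the layout of the `p_hat` sketch from Section 3.1 (again, the helper names are our own assumptions), simply draws fresh standard normal noise-variable values for every posterior draw before simulating the response:

```python
import numpy as np

def p_hat_noise(x_ctrl, z_block, beta_draws, Sigma_draws, in_A, n_noise, rng=None):
    """Robust-process version of (8): controllable settings x_1..x_h are fixed,
    while the noise variables x_{h+1}..x_k are re-simulated as iid N(0,1) for
    every posterior draw before the response is generated."""
    rng = np.random.default_rng() if rng is None else rng
    hits = 0
    for beta_s, Sigma_s in zip(beta_draws, Sigma_draws):
        x_noise = rng.standard_normal(n_noise)        # (x_{h+1},...,x_k)' ~ iid N(0,1)
        x_full = np.concatenate([x_ctrl, x_noise])
        mean = z_block(x_full) @ beta_s
        y_s = rng.multivariate_normal(mean, Sigma_s)
        hits += in_A(y_s)
    return hits / len(beta_draws)                      # estimates p(x_1,...,x_h)
```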


    4. Optimization of the Bayesian Reliability

If there are 2-3 controllable factors, then it is easy to maximize $p(\mathbf{x})$ by gridding over the experimental region. For a larger number of controllable factors, two other approaches are possible. One approach is to use a general optimization procedure such as can be found in Nelder and Mead [31], Price [37], or Chatterjee et al. [4]. Another approach is to create a closed-form approximate model for $p(\mathbf{x})$ using logistic regression or some other regression procedure such as a generalized additive model (Wood [42]). By creating a coarse to moderately dense grid over the experimental region, logistic regression can be applied to the $(I(\mathbf{Y}^{(s)} \in A), \mathbf{x})$ data. For example, the grid can be an $m^k$ factorial design with $m = 5$-10 points per factor, say. Since we can simulate many $(I(\mathbf{Y}^{(s)} \in A), \mathbf{x})$ pairs for each of many $\mathbf{x}$-points, it should be possible to create a good approximate closed-form model, $\tilde{p}(\mathbf{x})$, for $p(\mathbf{x})$. One can then maximize $\tilde{p}(\mathbf{x})$ using some suitable optimization procedure. See Peterson [33] for an example using the SMR model.
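One possible way to build such a closed-form surrogate, sketched here with scikit-learn (a tooling choice of ours, not the paper's), is to fit a polynomial-term logistic regression to the simulated indicator data over the grid:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

def fit_p_surrogate(x_grid, indicator_draws, degree=2):
    """Fit a surrogate p~(x) by logistic regression on the binary outcomes
    I(Y^(s) in A) simulated at each grid point x.

    x_grid          : (G, k) array of grid points over the experimental region
    indicator_draws : (G, N) array of 0/1 indicators, N posterior-predictive draws per point
    """
    G, N = indicator_draws.shape
    X = np.repeat(x_grid, N, axis=0)           # one row per simulated indicator
    y = indicator_draws.reshape(-1)
    model = make_pipeline(PolynomialFeatures(degree), LogisticRegression(max_iter=1000))
    model.fit(X, y)
    return model                               # model.predict_proba(x)[:, 1] approximates p(x)
```

A generalized additive model (Wood [42]) could be substituted for the logistic regression step without changing the overall workflow.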

    5. A Preposterior Analysis

As will be seen in the next section, it may happen that standard multiresponse optimization procedures indicate that satisfactory results for the mean response surfaces are possible while the associated Bayesian reliability, $p(\mathbf{x})$, is not satisfactory. If this happens, it is because the posterior predictive distribution is too disperse or possibly even oriented in a way that causes $p(\mathbf{x})$ to be too small. One remedial possibility is to reduce the process variation or change the correlation structure in such a way as to increase $p(\mathbf{x})$. However, this may not always be possible, and in some cases difficult or costly when possible. There is another approach which will increase $p(\mathbf{x})$ to some degree, provided that the mean response surfaces provide satisfactory results. Some of the dispersion of the posterior predictive distribution is due to the uncertainty of the model parameters. This uncertainty can be reduced by increasing the sample size. Increasing the number of observations will not make $p(\mathbf{x})$ go to one, due to the uncertainty of the natural process variation itself, but it may be useful to assess how much $p(\mathbf{x})$ will increase as additional data are added.

    5.1. A Preposterior Analysis with Normally Distributed Error Terms

One way to assess how additional data might affect the posterior predictive distribution is to impute new data in such a way as to predict the effect of having additional data using the information we currently have at hand. In this paper we take two different approaches to imputing additional data for the SUR model with normally distributed error terms. The first approach is based upon single imputation, where we impute an (imaginary) additional data set that has the property that it keeps $\hat{\boldsymbol{\beta}}(\boldsymbol{\Sigma})$ and $\hat{\boldsymbol{\Sigma}}(\boldsymbol{\beta})$ in the Gibbs sampling the same. However, the additional data involves augmenting the regression design matrix and df. It is evident from (6) that the posterior distribution of $\boldsymbol{\beta}$ given $\boldsymbol{\Sigma}$ can be modified to behave as if additional data were used by augmenting the design matrix $\mathbf{Z}$ and the size of the identity matrix $\mathbf{I}_n$. Accordingly, from (7) one can see that the posterior distribution of $\boldsymbol{\Sigma}^{-1}$ given $\boldsymbol{\beta}$ can be changed to behave as if more data were added by increasing the $n$ in the Wishart distribution, both for the df and for the $n$ in the $n^{-1}$ coefficient of $\hat{\boldsymbol{\Sigma}}(\boldsymbol{\beta})^{-1}$ in (7).

The second approach is based upon (parametric bootstrap) multiple imputation. Let $(\hat{\boldsymbol{\beta}}, \hat{\boldsymbol{\Sigma}})$ be estimates of $(\boldsymbol{\beta}, \boldsymbol{\Sigma})$ (such as $\boldsymbol{\beta}^*$ and $\hat{\boldsymbol{\Sigma}}$ in (5)). Using the model form in (2), we simulate new response values, $\mathbf{Y}_i^a$, from $N(\mathbf{Z}_i\hat{\boldsymbol{\beta}}, \hat{\boldsymbol{\Sigma}})$, $i = n+1, \ldots, n+n_a$, and then generate $N$ realizations of $(\boldsymbol{\beta}, \boldsymbol{\Sigma})$ conditional on the augmented data set, using the Gibbs sampling process. These realizations are then used to generate $N$ realizations of a new response variable, $\mathbf{Y}$, using (8). This whole process is repeated $m$ times to get $m$ estimates of


$\Pr(\mathbf{Y} \in A \mid \mathbf{x}, \text{data}, \mathbf{Y}_{n+1}^a, \ldots, \mathbf{Y}_{n+n_a}^a, \mathbf{Z}_{n+1}, \ldots, \mathbf{Z}_{n+n_a})$. These values are then averaged to get a final estimate of

$$E_{\mathbf{Y}_{n+1}^a, \ldots, \mathbf{Y}_{n+n_a}^a \mid \text{data}}\!\left\{\Pr(\mathbf{Y} \in A \mid \mathbf{x}, \text{data}, \mathbf{Y}_{n+1}^a, \ldots, \mathbf{Y}_{n+n_a}^a, \mathbf{Z}_{n+1}, \ldots, \mathbf{Z}_{n+n_a})\right\}.$$

This preposterior estimate of $\Pr(\mathbf{Y} \in A \mid \mathbf{x}, \text{data})$ is a (parametric) bootstrap estimate of an expected value; as such it seems reasonable to use $m$ equal to 200 (Efron and Tibshirani [9]). This multiple imputation approach, though more computationally intensive, can also be used to produce a histogram of simulated realizations from the random variable $\Pr(\mathbf{Y} \in A \mid \mathbf{x}, \text{data}, \mathbf{Y}_{n+1}^a, \ldots, \mathbf{Y}_{n+n_a}^a, \mathbf{Z}_{n+1}, \ldots, \mathbf{Z}_{n+n_a})$.

Note that multiple imputation here is not done by simulating from the posterior predictive distribution, but instead from $N(\mathbf{Z}_i\hat{\boldsymbol{\beta}}, \hat{\boldsymbol{\Sigma}})$, $i = n+1, \ldots, n+n_a$, where $\hat{\boldsymbol{\beta}}$ and $\hat{\boldsymbol{\Sigma}}$ are point estimates of $\boldsymbol{\beta}$ and $\boldsymbol{\Sigma}$, respectively. The reason for this is that, for any fixed $\mathbf{x}$-point, simulating from the posterior predictive distribution will get us nowhere. This is because multiple imputation from the posterior predictive distribution produces an estimate of

$$E_{\mathbf{Y}_{n+1}, \ldots, \mathbf{Y}_{n+n_a} \mid \mathbf{x}, \text{data}}\!\left\{\Pr(\mathbf{Y} \in A \mid \mathbf{x}, \text{data}, \mathbf{Y}_{n+1}, \ldots, \mathbf{Y}_{n+n_a})\right\}. \qquad (12)$$

But (12) equals $p(\mathbf{x}) = \Pr(\mathbf{Y} \in A \mid \mathbf{x}, \text{data})$. This follows from the well known result that $E_1(E_2(Y_2 \mid Y_1)) = E(Y_2)$. One may be able to use new responses simulated from the posterior predictive distribution to compute

$$E_{\mathbf{Y}_{n+1}, \ldots, \mathbf{Y}_{n+n_a} \mid \text{data}}\!\left\{\max_{\mathbf{x} \in R}\ \Pr(\mathbf{Y} \in A \mid \mathbf{x}, \text{data}, \mathbf{Y}_{n+1}, \ldots, \mathbf{Y}_{n+n_a})\right\}.$$

But such a two-tiered Monte Carlo computation (with a maximization in between) could become rather burdensome.

    5.2. A Preposterior Analysis with t-Distributed Error Terms

The Gibbs sampling for the SUR model with t-distributed error terms poses a difficulty for the single imputation approach. This is due to the fact that we need to use the terms $c_i = [v/2 + (1/2)(\mathbf{y}_i - \mathbf{Z}_i\boldsymbol{\beta})'\boldsymbol{\Sigma}^{-1}(\mathbf{y}_i - \mathbf{Z}_i\boldsymbol{\beta})]^{-1}$ in (11) in the Gibbs sampling simulations for the $w_i$ $(i = n+1, \ldots, n+n_a)$, but each $c_i$ term depends upon $\mathbf{y}_i$. As such, we need to use the more computationally intensive multiple imputation (parametric bootstrap) approach.

    Such modifications will give the experimenter an idea of how much the reliability can

    be expected to increase by reducing model uncertainty. For example, the experimenter can

forecast the effects of replicating the experiment a certain number of times. This idea is similar in spirit to the notion of a "preposterior" analysis as described by Raiffa and

    Schlaiffer [38].

    6. Examples

    6.1. A Mixture Experiment

This example involves a mixture experiment to study the surfactants and emulsification variables involved in pseudolatex formation for controlled-release drug-containing beads (Frisbee and McGinity [11]). An extreme vertices design was used to study the influence of surfactant blends on the size of the particles in the pseudolatex and the glass transition temperature of films cast from those pseudolatexes. The factors chosen were: $x_1$ = "% of Pluronic F68", $x_2$ = "% of polyoxyethylene 40 monostearate", and $x_3$ = "% of polyoxyethylene sorbitan fatty acid ester NF". The experimental design used was a modified McLean-Anderson design (McLean and Anderson [28]) with two centroid points, resulting in a sample size of eleven. The response variables measured were particle size and glass transition temperature, which are denoted here as $Y_1$ and $Y_2$, respectively. The goal of the study was to find values of $x_1$, $x_2$, and $x_3$ to minimize as best as possible both $Y_1$ and $Y_2$. Here, we choose an upper bound for $Y_1$ of 240 and an upper bound for $Y_2$ of 19. Anderson and Whitcomb [2] also analyze this data set to illustrate Design Expert's capability to map out overlapping mean response surfaces.

Frisbee and McGinity [11] and Anderson and Whitcomb [2] use an SMR model with second-order terms to model the bivariate response data. For this example, however, a severe outlier run was deleted. The resulting regression models obtained were:

$$\hat{y}_1 = 248x_1 + 272x_2 + 533x_3 - 485x_1x_3 - 424x_2x_3, \qquad (13)$$

$$\hat{y}_2 = 18.7x_1 + 14.1x_2 + 35.4x_3 - 36.7x_1x_3 - 18.0x_2x_3. \qquad (14)$$

For this paper, several different mixture-experiment regression models were fit for each response type. For $Y_2$, the Becker-type model (Becker [3]),

$$\hat{y}_2 = 18.8x_1 + 15.6x_2 + 35.4x_3 - 3.59\min(x_1, x_2) - 17.7\min(x_1, x_3) - 10.0\min(x_2, x_3), \qquad (15)$$

resulted in a mean squared error of 1.71, which is a 53% reduction over the quadratic model for $Y_2$ in (14). The adjusted $R^2$ for the model in (15) is 96.4%. It turned out that the model forms in (13) and (15) gave the best overall fits to the data. As such, these two different (SUR) model forms were chosen to model the response surfaces. The Wilks-Shapiro test for normality of the residuals for each regression model yields $p$-values greater than 0.05. Tests for multivariate normality via skewness and kurtosis (Mardia [27]) were not significant at the 5% level, although such tests would not be very sensitive for the small sample size used in this example.

    Figure 1 shows the points where the predicted mean responses associated with the

    model in (13) are less than 240. Likewise, Figure 2 shows the points where the predicted

    mean responses associated with the model in (15) are less than 19. Figure 3 shows the

    points where the predicted mean responses associated with both models in (13) and (15) are

    less than 240 and 19, respectively.

We define $A = \{(y_1, y_2): y_1 \le 240,\ y_2 \le 19\}$ and $p(\mathbf{x}) = \Pr(\mathbf{Y} \in A \mid \mathbf{x}, \text{data})$. All probabilities are computed here using $N = 1000$ simulated $\mathbf{Y}$ values for each $\mathbf{x}$ point. For this example, ten independent Gibbs sampling chains were simulated for 1000 iterations following a burn-in of 100 iterations. Each chain was thinned to take only every tenth simulation. Here, $N = 1000$ was taken as a reasonable value (Gelman et al. [12]). For binomial probabilities, an $N$ of 1000 produces a standard error of at most 0.0158 (assuming roughly independent posterior simulations). The Gelman-Rubin convergence statistics for all of the model parameters were all very good (less than 1.1 as recommended by Gelman et al. [12]). Gridding over the design simplex was done using 32,761 grid points. Using the SUR models in (13) and (15) we obtain $\max_{\mathbf{x}} p(\mathbf{x}) = p(\mathbf{x}^*) = 0.622$ at $\mathbf{x}^* = (0.81, 0, 0.19)'$. Clearly, this is no indication of a reliable process at $\mathbf{x}^*$. If the experimenter had instead used the classical SMR model (forms in (13) and (14)) to maximize $p(\mathbf{x})$ over the design simplex, then he/she would obtain $\max_{\mathbf{x}} p(\mathbf{x}) = p(\mathbf{x}^*) = 0.863$ at $\mathbf{x}^* = (0.78, 0, 0.22)'$. Hence the optimal $p(\mathbf{x})$ for the SMR model is about a 39% increase in process reliability, though still possibly unacceptable. The noticeable difference in probabilities is due to the fact that the (better fitting) model in (15), while having a smaller MSE, also has a larger mean predicted value, $\hat{y}_2$, than the model in (14) when $x_1$ is greater than 0.5. The probabilities $p(\mathbf{x})$ for the SUR model were also computed assuming that the residual errors had a t-distribution with 4 df. In this case, $\max_{\mathbf{x}} p(\mathbf{x}) = p(\mathbf{x}^*) = 0.613$ at $\mathbf{x}^* = (0.83, 0, 0.17)'$. (Gelman et al. [12] suggest a t-distribution with 4 df for doing a robust data analysis.)
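For reference, the "at most 0.0158" standard error quoted above is the worst-case binomial value at $\hat{p} = 0.5$ with $N = 1000$ roughly independent draws:

$$\operatorname{SE}(\hat{p}) = \sqrt{\frac{\hat{p}(1-\hat{p})}{N}} \le \sqrt{\frac{(0.5)(0.5)}{1000}} \approx 0.0158.$$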

Figure 1. The gray area is the part of the response surface associated with the model in (13) where the predicted mean response, $\hat{y}_1$, is less than or equal to 240 (ternary plot over the design simplex with vertices $x_1 = 1$, $x_2 = 1$, $x_3 = 1$).

Figure 2. The gray area is the part of the response surface associated with the model in (15) where the predicted mean response, $\hat{y}_2$, is less than or equal to 19 (ternary plot over the design simplex with vertices $x_1 = 1$, $x_2 = 1$, $x_3 = 1$).


Figure 3. The gray area is the part of the response surface associated with the models in (13) and (15) where both $\hat{y}_1 \le 240$ and $\hat{y}_2 \le 19$ (ternary plot over the design simplex with vertices $x_1 = 1$, $x_2 = 1$, $x_3 = 1$).

All of these models may indicate the need for remedial action. Such action could be of the form of reducing the process variability, decreasing the means, and/or removing uncertainty due to the unknown model parameter values. Since the first two actions may be difficult to achieve, we consider the effects of adding more replications to the experimental design by way of a preposterior analysis. To assess the effect of adding additional data, the preposterior analyses discussed in section 5 were performed. To keep the computations tractable, the same optimized $\mathbf{x}$-points associated with each model and the original data set were used. For each model, $p(\mathbf{x})$ was computed using its own optimal $\mathbf{x}$-point: $\mathbf{x}^* = (0.78, 0, 0.22)'$ for the SMR model (with normal errors), $\mathbf{x}^* = (0.81, 0, 0.19)'$ for the SUR model (with normal errors), and $\mathbf{x}^* = (0.83, 0, 0.17)'$ for the SUR model with t-distribution errors (with 4 df). For each model, the entire design matrix was replicated 2, 3, or 4 times. For the SMR model the degrees of freedom were adjusted accordingly. For the SUR model with normally distributed errors, both preposterior approaches (single and multiple imputation) discussed in section 5.1 were used. For the SUR model with the t-distribution errors, the multiple imputation approach as discussed in section 5.2 was used.

Figure 4 shows the increase in $p(\mathbf{x}^*)$ as the number of replications is increased from one to four for both the SMR and SUR models. Here, it is evident that the SMR model might lead the experimenter to believe that reduction of model parameter uncertainty by using three or four replications would provide sufficient evidence that the process has a high rate of conformance with the specifications given by the set $A$. However, the better fitting SUR models, using either normal or t-distribution errors, indicate that increasing the number of experimental replications may not validate that the process has a high rate of conformance, even with four replications. Instead, the SUR models are indicating that the experimenter must improve the process means and/or variances to obtain conformance with higher probability. The single imputation and (bootstrap) multiple imputation results for the SUR model with normal errors are reasonably close, but further research needs to be done on how these two preposterior approaches compare. Nonetheless, this shows the


importance of improved modeling, which can be achieved by generalizing from the SMR model to the more flexible SUR model.

Figure 4. Probabilities of conformance for increasing numbers of design replications (legend: SMR normal errors; SUR normal errors; SUR normal errors with bootstrap; SUR t-dist. errors with bootstrap). Reps 2-4 are the preposterior probability estimates. (Bootstrapping is not applicable for Rep=1.) For each model, $p(\mathbf{x})$ was computed using its own optimal $\mathbf{x}$-point from Rep=1.

    6.2. Optimization of an HPLC Assay

This example illustrates the optimization of an event probability $\Pr(\mathbf{Y} \in A \mid \mathbf{x}, \text{data})$ for a high performance liquid chromatography (HPLC) assay, as originally discussed in Peterson [33]. Here there are three factors ($x_1$ = percent of isopropyl alcohol (pipa), $x_2$ = temperature (temp), and $x_3$ = pH) and four responses ($y_1$ = resolution (rs), $y_2$ = run time, $y_3$ = signal-to-noise ratio (s/n), $y_4$ = tailing). For this assay, the chemist desires to have the event

$$A = \{\mathbf{y}: y_1 \ge 1.8,\ y_2 \le 15,\ y_3 \ge 300,\ 0.75 \le y_4 \le 0.85\}, \qquad (16)$$

occur with high probability. As such, it is desired to maximize $p(\mathbf{x}) = \Pr(\mathbf{Y} \in A \mid \mathbf{x}, \text{data})$ as a function of $\mathbf{x}$.

A Box-Behnken experimental design was run, with three center points, to gather data to fit four quadratic response surfaces. For the SMR model, full second-order quadratic regression forms were used for each response.

All of the response surface models fit well, with all $R^2$ values above 99%. As in example 1, the Wilks-Shapiro test for normality of the residuals for each regression model yields $p$-values greater than 0.05, and the Mardia tests for multivariate skewness and



kurtosis were not significant at the 5% level. The factor levels were coded so that all values were between -1 and +1, with the center of the experimental region at the origin.

Some of the factor terms for the second-order response surface models were not statistically significant, so a SUR model was created from an SMR model by removing some of the non-significant terms, while still preserving model-term hierarchy. Using the STEPWISE option in SAS PROC REG, the four regression models obtained for the SUR model analysis were:

$$y_1 = \beta_0^{(1)} + \beta_1^{(1)}x_1 + \beta_2^{(1)}x_2 + \beta_{11}^{(1)}x_1^2 + \beta_{22}^{(1)}x_2^2 + \beta_{12}^{(1)}x_1x_2 + e_1,$$
$$y_2 = \beta_0^{(2)} + \beta_1^{(2)}x_1 + \beta_2^{(2)}x_2 + \beta_3^{(2)}x_3 + \beta_{11}^{(2)}x_1^2 + \beta_{22}^{(2)}x_2^2 + \beta_{12}^{(2)}x_1x_2 + e_2,$$
$$y_3 = \beta_0^{(3)} + \beta_1^{(3)}x_1 + \beta_2^{(3)}x_2 + \beta_3^{(3)}x_3 + \beta_{33}^{(3)}x_3^2 + \beta_{12}^{(3)}x_1x_2 + e_3,$$
$$y_4 = \beta_0^{(4)} + \beta_1^{(4)}x_1 + \beta_2^{(4)}x_2 + \beta_{11}^{(4)}x_1^2 + \beta_{22}^{(4)}x_2^2 + e_4. \qquad (17)$$

For comparison purposes, a sensitivity analysis involving three models was performed. The three models were:

Model 1: An SMR model using a full second-order polynomial with normally distributed errors.

Model 2: A SUR model as shown in (17) above with normally distributed errors.

Model 3: A SUR model as shown in (17) above with errors having a t-distribution with 4 df.

For Models 1-3, 10,000 Monte Carlo simulations were done, as it appears that the true underlying Bayesian probabilities were extreme (close to 1). One hundred burn-in simulations were done to get each independent simulated value. Gridding steps of 0.1 were used across the coded design space. For the SMR model (Model 1), the maximum $p(\mathbf{x})$ value is $p(\mathbf{x}^*) = 0.964$, where $\mathbf{x}^* = (73.5, 43, 0.1)'$. However, for the SUR model with normal errors (Model 2), the maximum $p(\mathbf{x})$ value is $p(\mathbf{x}^*) = 1$, where $\mathbf{x}^* = (73.5, 43, 0.1)'$ (although a neighborhood containing $\mathbf{x}^*$ also had values of $p(\mathbf{x}) = 1$). Replacing the normal errors assumption in Model 2 with t-distribution errors (with 4 df) (Model 3) produced a maximum $p(\mathbf{x})$ value of $p(\mathbf{x}^*) = 0.978$, where $\mathbf{x}^* = (74.5, 44.9, 0.06)'$.

It is interesting to note that the $p(\mathbf{x}^*)$'s for the SUR models are larger than for the SMR model. In this example, Model 2 is simply a special case of Model 1 where the non-significant regression terms are removed. Apparently, this removal of non-significant terms for Model 2 tightens up the posterior predictive distribution enough to increase the optimal $p(\mathbf{x})$ value over that of Model 1. Even the optimal $p(\mathbf{x})$ value for Model 3 is slightly larger than that for Model 1, despite the use of a residual error t-distribution with 4 df. For this example, the sensitivity analysis tells us that for all three models the worst case probability is 0.964. If this smallest reliability estimate is adequate, then we need not do a preposterior analysis to check the effects of gathering additional data.

    7. Summary

The SMR model (with normal errors) has a closed-form posterior predictive distribution allowing quick and easy computation of $p(\mathbf{x}) = \Pr(\mathbf{Y} \in A \mid \mathbf{x}, \text{data})$ or other posterior predictive metrics, as shown in Peterson [33]. However, in some cases the use of the more general SUR model will be preferable. One such case was shown in the first example (section 6.1), where the fit of one of the response types was greatly improved by a change in the basic model form. For the second example (section 6.2), all of the individual regression models had some terms that were not statistically significant. The larger posterior probability of conformance value for the SUR model over the SMR model indicates that


some further efficiency can be obtained by removing terms in some of the models that do not appear predictive.

    The preposterior analysis discussed in section 5 allows the investigator to assess the

    effect of model parameter uncertainty on the posterior predictive probability of conformance.

    If the process means are all in conformance with process specifications, then an increase in

    data will result in some increase in posterior predictive probability of conformance. If this

    predicted increase is satisfactory, then the experimenter may want to gather more data to

    confirm this. If this predicted increase is not satisfactory, then the experimenter may wish

    to take different action and consider the possibility of process modification to improve

    response means and/or variances. At this point, it is not clear in general how the single and

    multiple imputation preposterior analyses compare to each other. Further research is needed

to investigate the properties of preposterior analyses for response surface optimization.

    Useful modifications of the SUR model are possible with the addition of noise variables

and a t-distribution model for the residual errors. Further research in this area to make the variance-covariance matrix a function of the controllable factors may also prove helpful to

    experimenters.

    Acknowledgements

    We would like to thank Joseph Schaffer for a helpful discussion on the imputation

    aspects of this work as related to the preposterior analysis.

    References

1. Ames, A. E., Mattucci, N., MacDonald, S., Szonyi, G. and Hawkins, D. M. (1997). Quality loss functions for optimization across multiple response surfaces. Journal of Quality Technology, 29, 339-346.

2. Anderson, M. J. and Whitcomb, P. J. (1998). Find the most favorable formulations. Chemical Engineering Progress, April, 63-67.

3. Becker, N. G. (1968). Models for the response of a mixture. Journal of the Royal Statistical Society, Series B, 30, 349-358.

4. Chatterjee, S., Laudato, M. and Lynch, L. A. (1996). Genetic algorithms and their statistical applications: an introduction. Computational Statistics and Data Analysis, 22, 633-651.

5. Congdon, P. (2006). Bayesian Statistical Modeling, 2nd edition. John Wiley and Sons Ltd., Chichester.

6. del Castillo, E., Montgomery, D. C. and McCarville, D. R. (1996). Modified desirability functions for multiple response optimization. Journal of Quality Technology, 28, 337-345.

7. Derringer, G. and Suich, R. (1980). Simultaneous optimization of several response variables. Journal of Quality Technology, 12, 214-219.

8. Derringer, G. (1994). A balancing act: optimizing a product's properties. Quality Progress, June, 51-58.

9. Efron, B. and Tibshirani, R. J. (1993). An Introduction to the Bootstrap. Chapman and Hall/CRC, Boca Raton.

10. Food and Drug Administration (2006). Guidance for Industry - Q8 Pharmaceutical Development. U.S. Department of Health and Human Services, CDER, CBER, USA.

11. Frisbee, S. E. and McGinity, J. W. (1994). Influence of nonionic surfactants on the physical and chemical properties of a biodegradable pseudolatex. European Journal of Pharmaceutics and Biopharmaceutics, 40, 355-363.

12. Gelman, A., Carlin, J. B., Stern, H. S. and Rubin, D. B. (2004). Bayesian Data Analysis, 2nd edition. Chapman and Hall/CRC, Boca Raton.

13. Genz, A. and Bretz, F. (2002). Methods for the computation of multivariate t-probabilities. Journal of Computational and Graphical Statistics, 11, 950-971.

14. Geweke, J. (2005). Contemporary Bayesian Econometrics and Statistics. John Wiley and Sons, Inc., Hoboken, NJ.

15. Griffiths, W. (2003). Bayesian inference in the seemingly unrelated regressions model. In Computer-Aided Econometrics, ed. D. E. A. Giles, Marcel Dekker, New York, 263-290.

16. Harrington, E. C. (1965). The desirability function. Industrial Quality Control, 21, 494-498.

17. Hunter, J. S. (1999). Discussion of response surface methodology: current status and future directions. Journal of Quality Technology, 31, 54-57.

18. Johnson, M. E. (1987). Multivariate Statistical Simulation. John Wiley, New York.

19. Johnson, R. A. and Wichern, D. W. (2002). Applied Multivariate Statistical Analysis, 5th edition. Prentice Hall, Englewood Cliffs.

20. Khuri, A. I. and Conlon, M. (1981). Simultaneous optimization of multiple responses represented by polynomial regression functions. Technometrics, 23, 363-375.

21. Kim, K. and Lin, D. K. J. (2000). Simultaneous optimization of mechanical properties of steel by maximizing exponential desirability functions. Journal of the Royal Statistical Society, Series C, 49, 311-325.

22. Ko, Y. H., Kim, K. J. and Jun, C. H. (2005). A new loss function-based method for multiresponse optimization. Journal of Quality Technology, 37, 50-59.

23. Kotz, S. and Johnson, R. (1985). Encyclopedia of Statistical Sciences, 6, 129-130.

24. Lange, K., Little, R. and Taylor, J. (1989). Robust statistical modeling using the t-distribution. Journal of the American Statistical Association, 84, 881-896.

25. Liu, C. and Rubin, D. B. (1995). ML estimation of the t-distribution using EM and its extensions, ECM and ECME. Statistica Sinica, 5, 19-39.

26. Liu, C. (1996). Bayesian robust multivariate linear regression with incomplete data. Journal of the American Statistical Association, 91, 1219-1227.

27. Mardia, K. V. (1974). Applications of some measures of multivariate skewness and kurtosis in testing normality and robustness studies. Sankhya B, 36, 115-128.

28. McLean, R. A. and Anderson, V. L. (1966). Extreme vertices design of mixture experiments. Technometrics, 8, 447-454.

29. Mir-Quesada, G., del Castillo, E. and Peterson, J. J. (2004). A Bayesian approach for multiple response surface optimization in the presence of noise variables. Journal of Applied Statistics, 31, 251-270.

30. Montgomery, D. C. and Bettencourt, V. M. (1977). Multiple response surface methods in computer simulation. Simulation, 29, 113-121.

31. Nelder, J. A. and Mead, R. (1965). A simplex method for function minimization. The Computer Journal, 7, 308-313.

32. Percy, D. F. (1992). Prediction for seemingly unrelated regressions. Journal of the Royal Statistical Society, Series B, 54, 243-252.

33. Peterson, J. J. (2004). A posterior predictive approach to multiple response surface optimization. Journal of Quality Technology, 36, 139-153.

34. Peterson, J. J. (2008). A Bayesian approach to the ICH Q8 definition of design space. Journal of Biopharmaceutical Statistics, 18, 958-974.

35. Pignatiello, Jr., J. J. (1993). Strategies for robust multiresponse quality engineering. IIE Transactions, 25, 5-15.

36. Press, S. J. (2003). Subjective and Objective Bayesian Statistics: Principles, Models, and Applications, 2nd edition. John Wiley, New York.

37. Price, W. L. (1977). A controlled random search procedure for global optimization. The Computer Journal, 20, 367-370.

38. Raiffa, H. and Schlaiffer, R. (2000). Applied Statistical Decision Theory. John Wiley, New York.

39. Srivastava, V. K. and Giles, D. E. A. (1987). Seemingly Unrelated Regression Equations Models. Marcel Dekker, New York.

40. Rajagopal, R., del Castillo, E. and Peterson, J. J. (2005). Model and distribution-robust process optimization with noise factors. Journal of Quality Technology, 37, 210-222. (Corrigendum: 38, p. 83.)

41. Vining, G. G. (1998). A compromise approach to multiresponse optimization. Journal of Quality Technology, 30, 309-313.

42. Wood, S. N. (2006). Generalized Additive Models: An Introduction with R. Chapman and Hall/CRC, Boca Raton, FL.

43. Zellner, A. (1962). An efficient method of estimating seemingly unrelated regressions and tests of aggregation bias. Journal of the American Statistical Association, 57, 500-509.

44. Zellner, A. (1971). An Introduction to Bayesian Inference in Econometrics. John Wiley, New York.

Authors' Biographies:

John J. Peterson is a Senior Director in the Research Statistics Unit of GlaxoSmithKline Pharmaceuticals. He received his B.S. in Applied Mathematics and in Computer Science (double major) from the State University of New York at Stony Brook and his Ph.D. in statistics from The Pennsylvania State University. Dr. Peterson has over 20 years experience as a statistician in the pharmaceutical industry. His current research area is in response surface methodology as applied to pharmaceutical industry problems, including applications to "chemistry, manufacturing, and control" (CMC) and combination drug studies. Dr. Peterson is a Fellow of the American Statistical Association and a Senior Member of the American Society for Quality. He is also on the editorial boards of the Journal of Quality Technology and the journal Applied Stochastic Models in Business and Industry.

Guillermo Mir-Quesada is a Sr. Research Scientist in the Bioprocess Research and Development department at Eli Lilly and Co. He received a Ph.D. in Industrial Engineering and Operations Research from Pennsylvania State University. He has worked in the Biotech division of Eli Lilly and Co. since 2003, where he has supported the development of manufacturing processes for active pharmaceutical ingredients. He is involved in activities related to integrating Quality by Design principles in the drug development plan and assessing the capability of manufacturing processes in development.

Enrique del Castillo is a Distinguished Professor of Engineering in the Department of Industrial & Manufacturing Engineering at the Pennsylvania State University. He also holds an appointment as Professor of Statistics at PSU and directs the Engineering Statistics Laboratory. Dr. del Castillo's research interests include Engineering Statistics with particular emphasis on Response Surface Methodology and Time Series Control. An author of over 80 refereed journal papers, he is the author of the textbooks Process Optimization, a Statistical Approach (Springer, 2007) and Statistical Process Adjustment for Quality Control (Wiley, 2002), and co-editor (with B. M. Colosimo) of the book Bayesian Process Monitoring, Control, and Optimization (CRC, 2006). He is currently (2006-2009) editor-in-chief of the Journal of Quality Technology.