Connie M. Borror, Arizona State University West Christine M. Anderson-Cook, Los Alamos National Laboratory Bradley Jones, JMP SAS Institute Construction and Evaluation of Response Surface Designs Incorporating Bias from Model Misspecification


Connie M. Borror, Arizona State University West

Christine M. Anderson-Cook, Los Alamos National Laboratory

Bradley Jones, JMP SAS Institute

Construction and Evaluation of Response Surface Designs Incorporating Bias from Model Misspecification


Motivation

Response surface design evaluation (and creation) assumes a particular model:
  Single-number efficiencies
  Prediction variance performance
  Mean squared error
Model misspecification? What effect does this have on prediction and optimization?


Motivation

Examine the effect of model misspecification on:
  Expected squared bias (ESB)
  Prediction variance (PV)
  Expected mean squared error (EMSE)
  Using fraction of design space (FDS) plots and box plots
Evaluate designs based on the contribution of ESB relative to PV.


Scenario

Cuboidal regions
True form of the model is of higher order than the model being fit.
Examine:
  Response surface models when the true form is cubic
  Screening experiment when the true form is full second order


Model Specifications

The model to be fit is
  $Y = X_1 \beta_1 + \varepsilon$
  $X_1$ = $n \times p$ design matrix for the assumed form of the model
The true form of the model is
  $Y = X_1 \beta_1 + X_2 \beta_2 + \varepsilon$
  $X_2$ = $n \times q$ design matrix pertaining to those parameters ($\beta_2$) not present in the model to be fit (assumed model).


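Model Specifications

The parameters $\beta_2$, in general, are not fully estimable.
Assume $\beta_2 \sim N(0, \Sigma_{\beta_2})$, where
  $\Sigma_{\beta_2} = \begin{pmatrix} \sigma_{\beta_2}^2 & & 0 \\ & \ddots & \\ 0 & & \sigma_{\beta_2}^2 \end{pmatrix} = \sigma_{\beta_2}^2 I_q$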


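Criteria

Mean squared error:
  $MSE[\hat{y}(x)] = \sigma^2 w'(X_1'X_1)^{-1} w + \beta_2'(A'w - z)(A'w - z)'\beta_2$
Expected squared bias (ESB), taking the expectation over $\beta_2 \sim N(0, \sigma_{\beta_2}^2 I_q)$:
  $ESB = \sigma_{\beta_2}^2 (A'w - z)'(A'w - z)$
Expected MSE (EMSE) is the sum of PV and ESB:
  $EMSE = \sigma^2 w'(X_1'X_1)^{-1} w + \sigma_{\beta_2}^2 (A'w - z)'(A'w - z)$
Here $w$ and $z$ are the prediction location $x$ expanded to the terms of $X_1$ and $X_2$, respectively, and $A = (X_1'X_1)^{-1}X_1'X_2$ is the alias matrix.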
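The criteria above can be evaluated numerically at any location once a design is fixed. The following is a minimal sketch, not the authors' code: it assumes a two-factor face-centered CCD with one center run (the slides do not give the run list), a second-order assumed model, and a cubic true model. The function names `quad_terms`, `cubic_extra_terms`, and `criteria` are hypothetical.

```python
import numpy as np

def quad_terms(x):
    """Terms of the assumed second-order model at location x (the vector w)."""
    x1, x2 = x
    return np.array([1.0, x1, x2, x1 * x2, x1**2, x2**2])

def cubic_extra_terms(x):
    """Cubic terms missing from the assumed model at location x (the vector z)."""
    x1, x2 = x
    return np.array([x1**2 * x2, x1 * x2**2, x1**3, x2**3])

# Assumed design: face-centered CCD with one center run (9 runs).
design = np.array([[-1, -1], [1, -1], [-1, 1], [1, 1],
                   [-1, 0], [1, 0], [0, -1], [0, 1], [0, 0]], dtype=float)

X1 = np.array([quad_terms(x) for x in design])         # n x p
X2 = np.array([cubic_extra_terms(x) for x in design])  # n x q
XtX_inv = np.linalg.inv(X1.T @ X1)
A = XtX_inv @ X1.T @ X2                                # alias matrix, p x q

def criteria(x, sigma2=1.0, sigma2_beta=1.0):
    """Return (PV, ESB, EMSE) at location x."""
    w, z = quad_terms(x), cubic_extra_terms(x)
    pv = sigma2 * (w @ XtX_inv @ w)
    d = A.T @ w - z                  # bias direction A'w - z
    esb = sigma2_beta * (d @ d)
    return pv, esb, pv + esb

pv, esb, emse = criteria(np.array([1.0, 1.0]))
print(f"PV={pv:.4f}  ESB={esb:.4f}  EMSE={emse:.4f}")
```

Changing `sigma2_beta` relative to `sigma2` shifts the balance between the variance and bias contributions without requiring specific values for the missing coefficients.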


Fraction of Design Space (FDS) Plots

Zahran, Anderson-Cook, and Myers (2003): scaled prediction variance (SPV) values are plotted versus the fraction of the design space that has SPV at or below the given value.
Adapt this to plot ESB and EMSE as well as PV.
We use FDS plots and box plots to assess the designs.

[Figure: FDS plot of SPV (vertical axis, 3 to 9) versus fraction of design space (horizontal axis, 0 to 1) for the CCD-1CR, Hexagon-1CR, and CCD-3CR designs, with a 100% G-eff reference.]
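An FDS curve of the kind described on this slide can be approximated by sampling the design region and sorting the criterion values. The sketch below is illustrative only: it assumes a two-factor cuboidal region on [-1, 1]^2, a second-order model, and uniform random sampling. The function names `model_terms` and `fds_curve` are hypothetical.

```python
import numpy as np

def model_terms(pts):
    """Second-order model expansion for an (m x 2) array of locations."""
    x1, x2 = pts[:, 0], pts[:, 1]
    return np.column_stack([np.ones(len(pts)), x1, x2, x1 * x2, x1**2, x2**2])

def fds_curve(design, n_samples=5000, seed=0):
    """Return (fraction, sorted SPV) for a two-factor design on [-1, 1]^2."""
    rng = np.random.default_rng(seed)
    X = model_terms(design)
    XtX_inv = np.linalg.inv(X.T @ X)
    pts = rng.uniform(-1.0, 1.0, size=(n_samples, 2))
    W = model_terms(pts)
    # SPV = N * w'(X'X)^{-1} w, evaluated row by row.
    spv = len(design) * np.einsum("ij,jk,ik->i", W, XtX_inv, W)
    fraction = np.arange(1, n_samples + 1) / n_samples
    return fraction, np.sort(spv)

# Assumed example design: 3 x 3 grid (face-centered CCD with one center run).
levels = [-1.0, 0.0, 1.0]
ccd = np.array([[u, v] for u in levels for v in levels])
fraction, spv_sorted = fds_curve(ccd)
print("median SPV:", np.median(spv_sorted))
```

Plotting `spv_sorted` against `fraction` gives the FDS curve. Replacing the SPV line with an ESB or EMSE calculation gives the adapted plots mentioned above.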


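Cases

I. Two-factor response surface design
Assume a second-order model:
  $y = \beta_0 + \sum_i \beta_i x_i + \sum_{i<j} \beta_{ij} x_i x_j + \sum_i \beta_{ii} x_i^2 + \varepsilon$
True form of the model is cubic:
  $y = \beta_0 + \sum_i \beta_i x_i + \sum_{i<j} \beta_{ij} x_i x_j + \sum_i \beta_{ii} x_i^2 + \sum_{i \ne j} \beta_{iij} x_i^2 x_j + \sum_i \beta_{iii} x_i^3 + \varepsilon$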


Case I Designs

  Central Composite Design (CCD)
  Quadratic I-optimal (Q I-opt)
  Quadratic D-optimal (Q D-opt)
  Cubic I-optimal (C I-opt)
  Cubic D-optimal (C D-opt)
  Cubic Bayes I-optimal (C Bayes I-opt)
  Cubic Bayes D-optimal (C Bayes D-opt)


Case I CCD

If we assume $\sigma_{\beta_2}^2 = 1$


Case I Designs CCD (ESB and EMSE performance as bias increases)


Case I Designs PV for all designs


Case I Designs ESB for all designs


Case I Designs EMSE for all designs


Case I Designs FDS for EMSE for all designs


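Case II

Four-factor response surface design
Assume a second-order model:
  $y = \beta_0 + \sum_i \beta_i x_i + \sum_{i<j} \beta_{ij} x_i x_j + \sum_i \beta_{ii} x_i^2 + \varepsilon$
True form of the model is cubic:
  $y = \beta_0 + \sum_i \beta_i x_i + \sum_{i<j} \beta_{ij} x_i x_j + \sum_i \beta_{ii} x_i^2 + \sum_{i \ne j} \beta_{iij} x_i^2 x_j + \sum_{i<j<k} \beta_{ijk} x_i x_j x_k + \sum_i \beta_{iii} x_i^3 + \varepsilon$
There are 20 additional terms as we move from the second-order model to the cubic model.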


Case II Designs

Six possible designs, with n = 27 runs:
  Central Composite Design (CCD)
  Box-Behnken Design (BBD)
  Quadratic I-optimal (Q I-Opt)
  Quadratic D-optimal (Q D-Opt)
  Cubic Bayes I-optimal (C Bayes I-Opt)
  Cubic Bayes D-optimal (C Bayes D-Opt)
Note: Cubic I- and D-optimal designs are not possible with the available design size.


Case II PV for all designs


Case II EMSE for all designs


Case II FDS plot of EMSE for Four Factors


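Case III

Eight-factor screening design
Assume a first-order model:
  $y = \beta_0 + \sum_i \beta_i x_i + \varepsilon$
True form of the model is full second-order:
  $y = \beta_0 + \sum_i \beta_i x_i + \sum_{i<j} \beta_{ij} x_i x_j + \sum_i \beta_{ii} x_i^2 + \varepsilon$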


Case III Designs

  $2^{8-4}$ fractional factorial design with 4 center runs
  D-optimal (for first order)
  Bayes I-optimal (for second order)
  Bayes D-optimal (for second order)
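To see how the first design in this list interacts with the missing second-order terms, one can build it and inspect the alias matrix $A = (X_1'X_1)^{-1}X_1'X_2$. This is a sketch under stated assumptions: the slides do not say which $2^{8-4}$ fraction was used, so the generators E=BCD, F=ACD, G=ABC, H=ABD (a common resolution IV choice) are an assumption, and all variable names are hypothetical.

```python
import itertools
import numpy as np

# Full 2^4 factorial in the base factors A, B, C, D.
base = np.array(list(itertools.product([-1.0, 1.0], repeat=4)))
a, b, c, d = base.T

# Assumed generators (not stated in the slides): E=BCD, F=ACD, G=ABC, H=ABD.
frac = np.column_stack([a, b, c, d, b * c * d, a * c * d, a * b * c, a * b * d])
design = np.vstack([frac, np.zeros((4, 8))])    # 16 factorial + 4 center runs

# Assumed first-order model: intercept plus 8 main effects.
X1 = np.column_stack([np.ones(len(design)), design])

# Missing second-order terms: 28 two-factor interactions, then 8 pure quadratics.
pairs = list(itertools.combinations(range(8), 2))
interactions = np.column_stack([design[:, i] * design[:, j] for i, j in pairs])
X2 = np.hstack([interactions, design**2])

A = np.linalg.solve(X1.T @ X1, X1.T @ X2)       # alias matrix, 9 x 36

print("max |alias| on main-effect rows:", np.abs(A[1:]).max())
print("intercept aliases with pure quadratics:", A[0, 28:])
```

With these generators the main-effect rows of the alias matrix are zero, so the missing terms bias only the intercept (through the pure quadratic terms).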


Case III Designs

Moving from the assumed model to the true model, the number of terms (excluding the intercept) increases from 8 to 44, so 36 terms are missing from the assumed model.
We would expect bias to quickly dominate EMSE.
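The term counts on this slide follow from counting monomials. A small sketch (the helper name `n_terms` is hypothetical) that reproduces them:

```python
from math import comb

def n_terms(k, order):
    """Number of polynomial terms of degree 1..order in k factors (no intercept)."""
    return comb(k + order, order) - 1

first_order = n_terms(8, 1)     # main effects only
second_order = n_terms(8, 2)    # main effects, interactions, pure quadratics
print(first_order, second_order, second_order - first_order)
```

The same helper gives the number of cubic terms added to a second-order model in four factors as `n_terms(4, 3) - n_terms(4, 2)`.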


Case III PV for all designs


Case III ESB for all designs


Case III EMSE for all designs


Design Notes

For the two-factor case:
  The I-optimal design and the CCD were equivalent.
  They performed best based on minimizing the maximum EMSE.
  They performed best based on prediction variance.


Design Notes

For the four-factor case:
  The BBD was best based on EMSE criteria (in particular, the 95th percentile, median, and mean) when the coefficients of the missing terms are moderate to large.
  The I-optimal design was competitive for this case only if small amounts of bias were present.
  As the number of missing cubic terms increases, the BBD was best for EMSE.


Design Notes

I-optimal designs were highly competitive over 95% of the design region, but not with respect to the maximum PV, ESB, and EMSE.
Cubic Bayesian designs did not perform well.


Design Notes

In the screening design example:
  The D-optimal designs are best if the assumed model is correct, but break down quickly if quadratic terms are in the model.
    This is much more pronounced than in the response surface design cases.
  The quadratic Bayesian I-optimal design was best based on the mean, median, and 95th percentile of EMSE.
  The $2^{8-4}$ fractional factorial design was best with respect to the maximum EMSE.
  The $2^{8-4}$ design was best for both PV and ESB when the PV and ESB contributions to the model were balanced.


Conclusions

The appropriate design can depend strongly on the assumption that we know the true form of the underlying model.
If we select designs carefully, it is often possible to select a model that predicts well in the design space, and to provide some protection against missing model terms.
The ESB approach to assessing the effect of missing terms is advantageous:
  We do not have to specify coefficient values for the true underlying model.
  Instead, the size of the missing terms can be calibrated relative to the variance of the observations.


Conclusions

The size of the bias variance relative to the observational error needed to balance the contributions from PV and ESB is highly dependent on the number of terms missing from the assumed model.
As the number of missing terms increases, the ability of designs to cope with the bias decreases substantially.
  Different designs are able to handle this increasing bias differently.
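One way to make the balance point concrete is to compute the ratio $\sigma_{\beta_2}^2 / \sigma^2$ at which the average ESB over the design region equals the average PV. The sketch below is not the authors' procedure: it assumes the two-factor case (second-order assumed model, cubic true model), a 3 x 3 grid design, and uniform sampling of [-1, 1]^2. The names `assumed_terms`, `missing_terms`, and `balance_ratio` are hypothetical.

```python
import numpy as np

def assumed_terms(p):
    """Second-order model expansion for an (m x 2) array of locations."""
    x1, x2 = p[:, 0], p[:, 1]
    return np.column_stack([np.ones(len(p)), x1, x2, x1 * x2, x1**2, x2**2])

def missing_terms(p):
    """Cubic terms absent from the assumed model."""
    x1, x2 = p[:, 0], p[:, 1]
    return np.column_stack([x1**2 * x2, x1 * x2**2, x1**3, x2**3])

def balance_ratio(design, n_samples=20000, seed=0):
    """sigma_beta2^2 / sigma^2 at which average ESB equals average PV."""
    X1, X2 = assumed_terms(design), missing_terms(design)
    XtX_inv = np.linalg.inv(X1.T @ X1)
    A = XtX_inv @ X1.T @ X2                       # alias matrix
    pts = np.random.default_rng(seed).uniform(-1.0, 1.0, size=(n_samples, 2))
    W, Z = assumed_terms(pts), missing_terms(pts)
    pv_unit = np.einsum("ij,jk,ik->i", W, XtX_inv, W)   # PV / sigma^2
    D = W @ A - Z                                       # rows are (A'w - z)'
    esb_unit = np.sum(D**2, axis=1)                     # ESB / sigma_beta2^2
    return pv_unit.mean() / esb_unit.mean()

levels = [-1.0, 0.0, 1.0]
ccd = np.array([[u, v] for u in levels for v in levels])
ratio = balance_ratio(ccd)
print("balancing ratio sigma_beta2^2 / sigma^2:", ratio)
```

Running the same calculation with more missing terms (for example a larger `missing_terms` expansion) shows how the balancing ratio shifts as the number of missing terms grows.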