Application of Monte Carlo Methods for Process Modeling

John Kauffman, Changning Guo
FDA\CDER Division of Pharmaceutical Analysis

Jean-Marie Geoffroy
Takeda Global Research and Development

The opinions expressed in this presentation are those of the authors, and do not necessarily represent the opinions or policies of the FDA.

2

Outline

1. Why propagate uncertainty in regression-based process models?

2. Why use Monte Carlo (MC) simulation?

3. Why solve regression models with MC?

3

What is Design Space?

• Design space is “the multidimensional combination and interaction of input variables and process parameters that have been demonstrated to provide assurance of quality”. (ICH Q8)

• Process modeling (DOE) is a central component of design space determination.

4

Design Space Schematic

[Figure: schematic plotted against Parameter #1 and Parameter #2, with regions labeled Knowledge Space and Design Space]

5

Design Space Schematic with uncertainty

[Figure: schematic plotted against Parameter #1 and Parameter #2, with regions labeled Knowledge Space and Design Space]

6

Case Study #1: Modeling 45 minute dissolution (D45) of a tableting process

• 3^2 Factorial Experimental Design
– Granulating Water (GS: 36-38 kg)
– Granulating Power (P: 18.5-22.5 kW)

• Nested Compression Factors
– Compression Force (CF: 11.5-17.5 kN)
– Press Speed (S: 70-110 kTPH)

• Least Squares Predictive Model*
D45 = 68.35 - 1.34(GS) - 2.88(P) - 8.95(CF) + 2.43(GS)^2

* Parameter values are mean-centered and range-scaled.

Publication Reference: “Application of Quality by Design Knowledge (QbD) From Site Transfers to Commercial Operations Already in Progress,” J. PAT, Jan/Feb, pg. 8, 2006.

7

Diagram of Experimental Design

[Figure: 3 x 3 grid of granulation water (36, 37, 38 kg) against granulation power (18.5, 20.5, 22.5 kW), with nested speed and force axes]

8

Experimentation and Process Modeling

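Experiment 1: inputs GS_exp1, P_exp1, CF_exp1; measured response D45_exp1
Experiment 2: inputs GS_exp2, P_exp2, CF_exp2; measured response D45_exp2
Experiment 3: inputs GS_exp3, P_exp3, CF_exp3; measured response D45_exp3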

9

Experimentation and Process Modeling

D45_pred1 = B0 + B1·GS_exp1 + B2·P_exp1 + B3·CF_exp1 + B4·(GS_exp1)^2

D45_pred2 = B0 + B1·GS_exp2 + B2·P_exp2 + B3·CF_exp2 + B4·(GS_exp2)^2

D45_pred3 = B0 + B1·GS_exp3 + B2·P_exp3 + B3·CF_exp3 + B4·(GS_exp3)^2

Propagation of uncertainty in process model predictions:

All model coefficient variances and process variable variances contribute to each predicted response uncertainty in a model-dependent manner.

10

Propagation of Uncertainty in Regression Modeling

• What procedures can be used to estimate uncertainty in design space?

• What is the benefit of propagating uncertainty using Monte Carlo simulation?

11

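The Process Model: Matrix Representation

R = DB

\[
\underbrace{\begin{bmatrix} D45_{pred\,1} \\ D45_{pred\,2} \\ D45_{pred\,3} \end{bmatrix}}_{\text{Response matrix } R}
=
\underbrace{\begin{bmatrix}
1 & GS_{exp\,1} & P_{exp\,1} & CF_{exp\,1} & GS^2_{exp\,1} \\
1 & GS_{exp\,2} & P_{exp\,2} & CF_{exp\,2} & GS^2_{exp\,2} \\
1 & GS_{exp\,3} & P_{exp\,3} & CF_{exp\,3} & GS^2_{exp\,3}
\end{bmatrix}}_{\text{Design matrix } D}
\underbrace{\begin{bmatrix} B_0 \\ B_1 \\ B_2 \\ B_3 \\ B_4 \end{bmatrix}}_{B \text{ matrix}}
\]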

12

Least Squares Solution to a Process Model

Matrix Representation of Process Model: R = DB

Solving for the Model Coefficients: D†R = B

The pseudoinverse solution of a matrix equation gives the least squares best estimates of the B coefficients!

Define the pseudoinverse of D: D† = (D^T D)^-1 D^T
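The pseudoinverse solution can be sketched numerically as follows. This is a minimal illustration, not the authors' calculation: the coded design points and the noise-free synthetic responses are hypothetical, and only the five coefficient values are taken from the Case Study #1 slide.

```python
import numpy as np

def design_matrix(gs, p, cf):
    """Build D with columns [1, GS, P, CF, GS^2] from coded inputs."""
    gs, p, cf = (np.asarray(v, dtype=float) for v in (gs, p, cf))
    return np.column_stack([np.ones_like(gs), gs, p, cf, gs**2])

# Hypothetical coded settings (not the study data): a 3x3 GS-by-P grid
# with an assumed CF level at each point.
gs = np.repeat([-1.0, 0.0, 1.0], 3)
p = np.tile([-1.0, 0.0, 1.0], 3)
cf = np.array([-1.0, 0.0, 1.0, 0.0, 1.0, -1.0, 1.0, -1.0, 0.0])

D = design_matrix(gs, p, cf)
b_true = np.array([68.35, -1.34, -2.88, -8.95, 2.43])  # coefficients from the slide
R = D @ b_true                                          # noise-free synthetic responses

D_pinv = np.linalg.inv(D.T @ D) @ D.T   # D† = (D^T D)^-1 D^T
B = D_pinv @ R                          # least squares estimate of the coefficients
print(np.round(B, 2))
```

In practice `np.linalg.pinv(D)` or `np.linalg.lstsq(D, R, rcond=None)` gives the same estimate with better numerical behavior than forming the inverse explicitly.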

13

Estimating Variance in Prediction: The Basis for Uncertainty in Design Space

Jth experimental variance = Jth diagonal element of Cov(R)

Response covariance matrix: Cov(R) = B[Cov(D)]B^T

Assumptions: Only D has uncertainty.

Problems:
1.) We know that B has uncertainty.
2.) We know that uncertainties in D will be correlated, but we don’t know Cov(D).

14

Estimating Variance of Process Model Regression Coefficients

Jth model coefficient variance = Jth diagonal element of Cov(B)

Cov(B) = D†[Cov(R)]D†^T

Assumptions: Only R has uncertainty; errors are uncorrelated and constant.

Model coefficient covariance matrix: Cov(B) = [D^T D]^-1 σ_R^2

Response variance (p = # model coefficients, N = # experiments):

\[
\sigma_R^2 = \frac{\sum_{i=1}^{N} \left(R_i - \hat{R}_i\right)^2}{N - p}
\]

Problems:
1.) We know that D (matrix of input variables) has uncertainty.
2.) We suspect that uncertainties may be correlated.
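The two expressions for Cov(B) can be compared numerically. The sketch below uses a hypothetical coded design and an assumed response noise of 2%, so the numbers are illustrative only. The name `n_coef` stands in for p on the slide, because `p` is already used for granulator power.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical coded design: a 3x3 GS-by-P grid run twice, with assumed CF levels.
gs = np.tile(np.repeat([-1.0, 0.0, 1.0], 3), 2)
p = np.tile(np.tile([-1.0, 0.0, 1.0], 3), 2)
cf = np.tile([-1.0, 0.0, 1.0, 0.0, 1.0, -1.0, 1.0, -1.0, 0.0], 2)
D = np.column_stack([np.ones_like(gs), gs, p, cf, gs**2])

b_true = np.array([68.35, -1.34, -2.88, -8.95, 2.43])
R = D @ b_true + rng.normal(0.0, 2.0, size=gs.size)  # assumed 2% response noise

N, n_coef = D.shape
D_pinv = np.linalg.pinv(D)
B = D_pinv @ R
residuals = R - D @ B
sigma_R2 = residuals @ residuals / (N - n_coef)   # response variance

cov_B = np.linalg.inv(D.T @ D) * sigma_R2         # Cov(B) = (D^T D)^-1 sigma_R^2
se_B = np.sqrt(np.diag(cov_B))                    # coefficient standard deviations

# General form with Cov(R) = sigma_R^2 * I (uncorrelated, constant errors).
cov_B_general = D_pinv @ (sigma_R2 * np.eye(N)) @ D_pinv.T
print(np.allclose(cov_B, cov_B_general))
```

The agreement holds only under the stated assumptions. Once Cov(R) is not a multiple of the identity, or D itself is uncertain, the shortcut formula no longer applies.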

15

Monte Carlo Methods

• Develop a mathematical model.
– The Process Model.

• Add random variables.
– Replace quantities of interest with random numbers selected from appropriate distribution functions that are expected to describe the variables.

• Monitor selected output variables.
– Output variables become distributions whose properties are determined by the model and the distributions of the random variables.

• Advantage #1: We make no assumptions concerning sources of uncertainty or covariance between variables.

16

Case Study #1: Influence of Process Parameter Variation on Prediction

• Model Conditions
– GS mean = 36 kg
– P mean = 20 kW
– CF mean = 14 kN
– Input parameter standard deviations were varied.
– Dissolution values were predicted.

17

Example: Simulation 1

[Figure: input distributions for Water (kg), Power (kW), and Force (kN), and the resulting output distribution for D45 Simulation #1]

GS Mean = 36 kg, Std. Dev. = 0.25 kg

P Mean = 20 kW, Std. Dev. = 1 kW

CF Mean = 14 kN, Std. Dev. = 1 kN

D45 Simulation Result: Mean = 74.6%, Std. Dev. = 3.70%

D45 = 68.35 - 1.34(GS) - 2.88(P) - 8.95(CF) + 2.43(GS)^2
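A simulation of this kind can be sketched in a few lines. Two points are assumptions rather than statements from the slides: the inputs are drawn from normal distributions, and "mean-centered and range-scaled" is read as (value - range midpoint) / (half range), using the ranges given for Case Study #1.

```python
import numpy as np

# Assumed coding: (value - range midpoint) / (half range).
RANGES = {"GS": (36.0, 38.0), "P": (18.5, 22.5), "CF": (11.5, 17.5)}

def code(name, x):
    lo, hi = RANGES[name]
    return (x - (lo + hi) / 2.0) / ((hi - lo) / 2.0)

def d45(gs, p, cf):
    """Predictive model from the slide, applied to coded inputs."""
    return 68.35 - 1.34 * gs - 2.88 * p - 8.95 * cf + 2.43 * gs**2

def simulate_d45(means, sds, n=100_000, seed=0):
    """Draw normal inputs in engineering units and return simulated D45 values."""
    rng = np.random.default_rng(seed)
    draws = {k: rng.normal(means[k], sds[k], n) for k in RANGES}
    return d45(*(code(k, draws[k]) for k in ("GS", "P", "CF")))

sim1 = simulate_d45({"GS": 36.0, "P": 20.0, "CF": 14.0},
                    {"GS": 0.25, "P": 1.0, "CF": 1.0})
print(f"mean = {sim1.mean():.1f}%, std. dev. = {sim1.std(ddof=1):.2f}%")
```

Under these assumptions the simulated mean and standard deviation come out close to, but not exactly at, the reported 74.6% and 3.70%. The small gap suggests the original scaling or distributions differ slightly from the ones assumed here.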

18

Example: Simulations 1-4

[Figure: histograms of predicted D45 for Simulations #1 through #4]

Simulation   GS Std. Dev.   P Std. Dev.   CF Std. Dev.   D45 Std. Dev.   Simulated D45 Mean   Simulated D45 Std. Dev.
#1           0.25 kg        1 kW          1 kN           0%              74.6%                3.70%
#2           0.5 kg         1 kW          1 kN           0%              75.0%                4.59%
#3           1 kg           1 kW          1 kN           0%              76.9%                7.89%
#4           1 kg           2 kW          1 kN           0%              76.9%                8.26%

19

Influence of Process Parameters Variation

• Increase in granulation water mass (GS) variance:
– Increases predicted D45 variance.
– Slightly shifts predicted D45 means.
– Skews the predicted D45 distributions.

• Increase in granulator power (P) endpoint variance:
– Increases predicted D45 variance.
– Does not shift predicted D45 means.
– Does not skew the predicted D45 distributions.
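The different behavior of GS and P follows from the form of the model: GS enters through a squared term, while P enters only linearly. As a short worked derivation, assume the coded GS value x is normally distributed with mean μ and standard deviation σ (an assumption, as is the half-range scaling used for the numbers below):

```latex
\begin{aligned}
E\!\left[B_4 x^2\right] &= B_4\left(\mu^2 + \sigma^2\right), \\
\Delta\,\overline{D45} &= B_4\,\sigma^2 = 2.43\,\sigma^2 .
\end{aligned}
```

If the GS half range is 1 kg, standard deviations of 0.25, 0.5, and 1 kg give shifts of about 0.15, 0.61, and 2.43 percentage points. That is in line with the simulated means rising from 74.6% to 75.0% to 76.9%. The square of a normal variable is also skewed, which accounts for the skew in the D45 distribution. A linear term such as P only rescales a symmetric distribution, so it widens the output without shifting or skewing it.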

20

Influence of Dissolution Measurement Error

• Model Conditions
– GS mean = 36 kg
– P mean = 20 kW
– CF mean = 14 kN
– Input parameter standard deviations were varied.
– Dissolution measurement error was added.
– Dissolution values were predicted.

21

Example: Simulations 5-7

[Figure: histograms of predicted D45 for the control simulation and Simulations #5 through #7]

Simulation   GS Std. Dev.   P Std. Dev.   CF Std. Dev.   D45 Std. Dev.   Simulated D45 Mean   Simulated D45 Std. Dev.
Control      0.5 kg         1 kW          0.5 kN         0%              75.0%                3.88%
#5           0.5 kg         1 kW          0.5 kN         2%              75.0%                4.37%
#6           0.5 kg         1 kW          0.5 kN         4%              75.0%                5.58%
#7           0.5 kg         1 kW          0.5 kN         6%              75.0%                7.16%

22

Influence of Dissolution Measurement Error

• Increase in D45 measurement variance:
– does not shift predicted D45 means.
– does not appear to skew predicted D45 distributions.
– increases predicted D45 variance.

• Advantage #2: We get the distribution, not just the standard deviation.

• Advantage #3: Sensitivity analysis allows us to prioritize process improvement.
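The reported standard deviations for Simulations 5-7 can be checked against a simple rule. If the measurement error is independent of the process inputs and simply added to the prediction (an assumption about how the simulation was set up), the variances add. The values below are the ones reported for the control run and Simulations 5-7.

```python
import math

control_sd = 3.88  # D45 std. dev. (%) of the control run, no measurement error
reported = {2.0: 4.37, 4.0: 5.58, 6.0: 7.16}  # measurement SD (%) -> simulated D45 SD (%)

quadrature = {}
for meas_sd, sim_sd in reported.items():
    # Independent error sources combine as sqrt(control^2 + measurement^2).
    quadrature[meas_sd] = math.hypot(control_sd, meas_sd)
    print(f"measurement SD {meas_sd:.0f}%: quadrature {quadrature[meas_sd]:.2f}%, "
          f"simulation {sim_sd:.2f}%")
```

The quadrature values agree with the simulated ones to within about 0.01 percentage points. This matches the observation that measurement error widens the D45 distribution without shifting or skewing it.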

23

Measurement Uncertainty and Prediction Uncertainty

Benchmark: D45 (%)

Experiment   Measured Mean   Model Prediction   Measured St. Dev.
1            69.0            69.2               3.1
2            73.5            72.3               3.1
3            71.9            72.1               1.9
4            67.1            65.5               2.9
5            69.8            71.2               1.9
6            75.5            75.0               0.8
7            65.4            66.6               2.7
8            75.8            77.3               1.1
9            56.1            59.4               3.7
10           61.0            59.4               3.7
11           67.2            68.3               2.4
12           77.0            77.3               1.1
13           72.7            68.3               2.4

Standard Error of Prediction: 2.4
Standard Error (RMS Measurement Standard Deviation): 2.6

24

Monte Carlo Prediction Error

D45 Monte Carlo Simulation Parameters

                           D45 Measurement   #1           #2           #3           #4
N =                        6                 6            10000        10000        6
B matrix                   -                 regression   regression   regression   regression
B Std. Dev.                -                 regression   regression   regression   0
R Std. Dev.                -                 0            0            0            1.7%
P Std. Dev.                -                 0            0            0.5          0.5
GS Std. Dev.               -                 0            0            0.25         0.25
CF Std. Dev.               -                 0            0            0.5          0.5
Std. Error of Prediction   2.6%              1.7%         1.6%         2.6%         2.5%

Slide annotations: Random Response; Random Coefficients; Random Inputs; Result using estimated coefficient St. Dev.

25

Prediction Error Based on Estimated Coefficient Standard Deviations

• Estimated model coefficient standard deviations do not predict the observed response uncertainty.

• Can we use Monte Carlo simulation to provide better estimates of model coefficient standard deviations?

26

Propagation of Uncertainty in Process Modeling

1. Assign random variables to Dissolution values (R) and use Monte Carlo simulations to propagate error to the model coefficients (B).

2. Assign random variables to Process Parameters (D) and use Monte Carlo simulations to propagate error to B.

Solving for the Model Coefficients: D†R = B

B1 = D†_11·R_Exp1 + D†_12·R_Exp2 + D†_13·R_Exp3 + …

B2 = D†_21·R_Exp1 + D†_22·R_Exp2 + D†_23·R_Exp3 + …

The pseudoinverse of D: D† = (D^T D)^-1 D^T
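Both approaches amount to re-solving D†R = B many times with perturbed data. The sketch below uses a hypothetical coded design, synthetic responses, and assumed noise levels. It illustrates the procedure and is not a reconstruction of the authors' simulations.

```python
import numpy as np

rng = np.random.default_rng(1)

def design_matrix(gs, p, cf):
    """Columns [1, GS, P, CF, GS^2] for coded inputs."""
    return np.column_stack([np.ones_like(gs), gs, p, cf, gs**2])

# Hypothetical coded design and noise-free responses (not the study data).
gs0 = np.repeat([-1.0, 0.0, 1.0], 3)
p0 = np.tile([-1.0, 0.0, 1.0], 3)
cf0 = np.array([-1.0, 0.0, 1.0, 0.0, 1.0, -1.0, 1.0, -1.0, 0.0])
b_true = np.array([68.35, -1.34, -2.88, -8.95, 2.43])
R0 = design_matrix(gs0, p0, cf0) @ b_true

def mc_coefficients(n_trials=5000, r_sd=0.0, d_sd=0.0):
    """Refit B = D†R on each trial with noisy responses and/or noisy inputs."""
    out = np.empty((n_trials, b_true.size))
    for t in range(n_trials):
        R = R0 + rng.normal(0.0, r_sd, R0.size)              # random responses
        dgs, dp, dcf = rng.normal(0.0, d_sd, (3, gs0.size))  # random process parameters
        D = design_matrix(gs0 + dgs, p0 + dp, cf0 + dcf)
        out[t] = np.linalg.pinv(D) @ R
    return out

random_R = mc_coefficients(r_sd=2.0)    # approach 1: uncertainty in R only
random_D = mc_coefficients(d_sd=0.25)   # approach 2: uncertainty in D only

print("random R: mean", random_R.mean(axis=0).round(2), "sd", random_R.std(axis=0).round(2))
print("random D: mean", random_D.mean(axis=0).round(2), "sd", random_D.std(axis=0).round(2))
```

Each row of `random_R` or `random_D` is one simulated coefficient vector, so histograms of the columns give the coefficient distributions. Comparing the column means with `b_true` shows whether a given noise source shifts a coefficient as well as widening its distribution.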

27

How Do Variances in Process Parameters Influence Model Coefficients?

• Simulation #1 (1-0.25-1)
– Measured D45 means and standard deviations.
– P 19-23 kW ± 1 kW
– GS 36-38 kg ± 0.25 kg
– CF 12-18 kN ± 1 kN

• Compare to regression distributions
– Model coefficient means
– Model coefficient standard deviations

28

How Do Variances in Process Parameters Influence Model Coefficients?

[Figure: histograms of the model coefficient distributions (“Bias”, Power (P), Water (GS), Force (CF), Water^2) for the random-input Monte Carlo simulation and for the regression]

Increase in process parameter variance causes a shift in some model coefficients.

Increase in process parameter variance increases model coefficient variance.

29

How Do Variances in Process Parameters Influence Model Coefficients?

– Simulation #1 (1-0.25-1)
– Simulation #2 (1-0.5-1)
– Simulation #3 (1-1-1)
– Simulation #4 (2-1-1)

• Increasing input parameter variance:
– increases variance in the model coefficients.
– can skew the model coefficient distribution.
– can shift model coefficient means.

30

Estimated Model Coefficient Uncertainties from Monte Carlo Simulation

Monte Carlo Simulation Parameters

                           Regression      #1              #2              #3              #4
P Std. Dev.                -               -               1               1               0.5
GS Std. Dev.               -               -               0.25            0.25            0.25
CF Std. Dev.               -               -               1               1               0.5
R Std. Dev.                -               measured        -               measured        2.5%
B0                         68.35 ± 0.83    68.35 ± 0.93    68.41 ± 1.06    68.39 ± 1.44    68.39 ± 0.9
B(GS)                      -1.34 ± 1.07    -1.34 ± 1.21    -1.23 ± 1.49    -1.22 ± 1.96    -1.24 ± 1.28
B(P)                       -2.88 ± 0.96    -2.88 ± 1.04    -2.13 ± 1.18    -2.11 ± 1.53    -2.63 ± 1.06
B(CF)                      -8.95 ± 1.17    -8.95 ± 1.38    -7.37 ± 1.35    -7.37 ± 1.87    -8.52 ± 1.2
B(GS^2)                    2.44 ± 1.35     2.44 ± 1.52     2.13 ± 1.84     2.14 ± 2.43     2.15 ± 1.57
Std. Error of Prediction   1.6%            1.8%            2.1%            2.8%            1.8%

31

Case Study #2: Nasal Spray Performance Models

• A nasal spray product is a combination of a therapeutic formulation and a delivery device.

• 3-level, 4-factor Box-Behnken designs
– Pfeiffer nasal spray pump
– Placebo formulations (CMC & Tween 80 solutions)
– References:

• Changning Guo, Keith J. Stine, John F. Kauffman, William H. Doub. 2008. “Assessment of the influence factors on in vitro testing of nasal sprays using Box-Behnken experimental design.” European Journal of Pharmaceutical Sciences 35 (12) 417–426.

• Changning Guo, Wei Ye, John F. Kauffman, William H. Doub. “Evaluation of Impaction Force of Nasal Sprays and Metered-Dose Inhalers Using the Texture Analyser.” Journal of Pharmaceutical Sciences. In press.

32

Response Variables

Plume geometry: measures the side view of a spray plume at its fully developed phase

Spray Pattern: measures the cross sectional uniformity of the spray

Parameters used to describe the shape of a nasal spray plume: spray pattern area, plume width.

33

Response Variables

• Droplet Size Distribution
– Volume Median Diameter D50

• Impaction Force

[Figure: droplet size distribution plot showing cumulative distribution Q3 / % and density distribution q3* against particle size / µm]

34

Nasal Spray Response Models

• Optimized regression models

Responses            Prediction Equations                                              R^2
Spray Pattern Area   R = 338 + 47S + 156V - 279C - 89VC - 54V^2 + 148C^2               0.97
Plume Width          R = 26.7 + 8.3V - 8.1C - 4.3V^2 + 4.3C^2                          0.96
D50                  [coefficients garbled in source; terms: offset, V, C, VC, V^2]    0.90
Impaction Force      R = 4.45 + 0.34S + 1.51V + 0.23T + 0.41S^2 - 0.74V^2              0.94

35

Spray Pattern Model

Variances from input variables and spray pattern area measurements have a similar level of influence on the model coefficients.

Spray pattern     Regression        Random R          Random D          Random R&D
model term        Mean  Std. Dev.   Mean  Std. Dev.   Mean  Std. Dev.   Mean  Std. Dev.
offset            338   15          338   4           338   4           338   5
S                 47    13          46    4           46    4           46    5
V                 156   13          155   3           155   4           155   4
C                 -279  13          -279  5           -279  4           -279  6
VC                -89   23          -89   7           -89   7           -89   8
V^2               -54   18          -54   5           -54   6           -54   7
C^2               148   18          148   5           148   5           148   7

R = 338 + 47S + 156V - 279C - 89VC - 54V^2 + 148C^2

36

Plume Width Model

Variances from plume width measurements have more influence on the model coefficients than those from the input variables.

Plume geometry    Regression        Random R          Random D          Random R&D
model term        Mean  Std. Dev.   Mean  Std. Dev.   Mean  Std. Dev.   Mean  Std. Dev.
offset            26.7  0.6         26.7  0.3         26.7  0.2         26.7  0.3
V                 8.3   0.5         8.3   0.3         8.3   0.1         8.3   0.4
C                 -8.1  0.5         -8.1  0.3         -8.1  0.1         -8.1  0.3
V^2               -4.4  0.8         -4.4  0.4         -4.3  0.2         -4.3  0.4
C^2               4.4   0.7         4.4   0.4         4.4   0.2         4.4   0.4

R = 26.7 + 8.3V - 8.1C - 4.3V^2 + 4.3C^2

37

Droplet Size Model – D50

D50 = [coefficients garbled in source; terms: offset, V, C, VC, V^2]

[Figure: histograms of the five model coefficients (offset, V, C, VC, V^2) for three simulations: Random R only, Random D only, and Random R&D; 10000 values per histogram]

Random R only
Coefficient          Minimum    Maximum    Mean       Std Dev
Coef_1_2 (offset)    33.8212    35.0931    34.4472    0.1449
Coef_2_2 (V)         -1.4152    -1.3451    -1.3817    0.00983
Coef_3_2 (C)         18.9851    20.1786    19.5939    0.1707
Coef_4_2 (VC)        -1.4020    -1.3050    -1.3549    0.0134
Coef_5_2 (V^2)       0.04949    0.05399    0.05200    0.000606

Random D only
Coefficient          Minimum    Maximum    Mean       Std Dev
Coef_1_2 (offset)    32.5884    36.5062    34.4707    0.4972
Coef_2_2 (V)         -1.5192    -1.2604    -1.3799    0.0306
Coef_3_2 (C)         17.4895    21.8571    19.5707    0.5912
Coef_4_2 (VC)        -1.5820    -1.1279    -1.3509    0.0644
Coef_5_2 (V^2)       0.0415     0.0628     0.0516     0.00238

Random R&D
Coefficient          Minimum    Maximum    Mean       Std Dev
Coef_1_2 (offset)    32.7020    36.3831    34.4713    0.5187
Coef_2_2 (V)         -1.4974    -1.2670    -1.3800    0.0317
Coef_3_2 (C)         17.3591    22.1043    19.5710    0.6144
Coef_4_2 (VC)        -1.6118    -1.1057    -1.3509    0.0667
Coef_5_2 (V^2)       0.0414     0.0599     0.0516     0.00244

Variances from input variables have more influence on the model coefficients than those from the D50 measurements.

38

Impaction Force Model

R = 4.45 + 0.34S + 1.51V + 0.23T + 0.41S^2 - 0.74V^2

Impaction force   Regression         Random R           Random D           Random R&D
model term        Mean   Std. Dev.   Mean   Std. Dev.   Mean   Std. Dev.   Mean   Std. Dev.
offset            4.44   0.11        4.44   0.04        4.45   0.04        4.45   0.06
S                 0.34   0.09        0.34   0.04        0.33   0.04        0.33   0.06
V                 1.51   0.09        1.51   0.03        1.50   0.03        1.50   0.04
T                 0.23   0.09        0.23   0.04        0.23   0.03        0.23   0.05
S^2               0.41   0.13        0.41   0.05        0.39   0.06        0.39   0.07
V^2               -0.72  0.12        -0.72  0.04        -0.72  0.04        -0.72  0.08

Variances from input variables and impaction force measurements have a similar level of influence on the model coefficients.

39

How Do Variances in Formulation and Actuation Influence Model Coefficients?

• The means of the model coefficients show good agreement between regression results and Monte Carlo simulations.

• The standard deviations of the model coefficients obtained from regression are larger than those from Monte Carlo simulation.

– The estimated standard deviations from regression may overestimate the uncertainties in the model coefficients.

– Using regression-based coefficient standard deviations to define the design space may yield a narrower range of allowable input variable values than is necessary to meet the desired confidence level.

40

Advantages of Monte Carlo Simulation

1. We make no assumptions concerning sources of uncertainty or variable covariance.

2. We see the distribution of output variable values, not just a standard deviation.

3. Sensitivity analysis allows us to prioritize high risk input variables and improve process control.
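One simple way to carry out such a sensitivity analysis is to fix one input at its mean at a time and see how much the output spread drops. The sketch below applies this to the Case Study #1 model with the Simulation 1 settings. It is a one-at-a-time illustration, not necessarily the method the authors used, and the normal input distributions and half-range scaling are assumptions.

```python
import numpy as np

# Assumed scaling: coded value = (value - range midpoint) / (half range).
MID = {"GS": 37.0, "P": 20.5, "CF": 14.5}
HALF_RANGE = {"GS": 1.0, "P": 2.0, "CF": 3.0}
MEANS = {"GS": 36.0, "P": 20.0, "CF": 14.0}
SDS = {"GS": 0.25, "P": 1.0, "CF": 1.0}  # Simulation 1 settings

def d45_sd(sds, n=200_000, seed=0):
    """Standard deviation of simulated D45 for the given input standard deviations."""
    rng = np.random.default_rng(seed)
    x = {k: (rng.normal(MEANS[k], sds[k], n) - MID[k]) / HALF_RANGE[k] for k in MEANS}
    y = 68.35 - 1.34 * x["GS"] - 2.88 * x["P"] - 8.95 * x["CF"] + 2.43 * x["GS"] ** 2
    return y.std(ddof=1)

baseline = d45_sd(SDS)
# Remove the variation of one input at a time and record the drop in D45 spread.
reduction = {k: baseline - d45_sd({**SDS, k: 0.0}) for k in SDS}
ranked = sorted(reduction, key=reduction.get, reverse=True)

for k in ranked:
    print(f"{k}: D45 std. dev. drops by {reduction[k]:.2f} percentage points")
```

Under these assumptions, compression force is the largest contributor to D45 variability at the Simulation 1 settings, so tighter control of CF would be the first candidate for process improvement.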
