Upload
miles-higgins
View
216
Download
0
Tags:
Embed Size (px)
Citation preview
1
Application of Monte Carlo Methods for Process Modeling
John Kauffman, Changning GuoFDA\CDER Division of Pharmaceutical Analysis
Jean-Marie GeoffroyTakeda Global Research and Development
The opinions expressed in this presentation are those of the authors, and do not necessarily represent the opinions or policies of the FDA.
2
Outline
1. Why propagate uncertainty in regression-based process models?
2. Why use Monte Carlo (MC) simulation?
3. Why solve regression models with MC?
3
What is Design Space?
• Design space is “the multidimensional combination and interaction of input variables and process parameters that have been demonstrated to provide assurance of quality”. (ICH Q8)
• Process modeling (DOE) is a central component of design space determination.
4
Parameter#1
Parameter#2
Design Space Schematic
Knowledge Space
Design Space
5
Parameter#1
Parameter#2
Design Space Schematicwith uncertainty
Knowledge Space
Design Space
6
Case Study #1: Modeling 45 minute dissolution (D45) of a tableting process
• 32 Factorial Experimental Design– Granulating Water (GS: 36-38 kg)– Granulating Power (P: 18.5-22.5 kW)
• Nested Compression Factors– Compression Force (CF: 11.5-17.5 kN)– Press Speed (S: 70-110 kTPH)
• Least Squares Predictive Model*D45 = 68.35 – 1.34(GS) – 2.88(P) - 8.95(CF) + 2.43(GS)2
* Parameter values are mean-centered and range-scaled.
Publication ReferenceApplication of Quality by Design Knowledge (QbD) From Site Transfers to Commercial Operations Already in Progress,” J. PAT, Jan/Feb, pg. 8, 2006.
7
Diagram of Experimental Design
granulationwater (kg)
granulation power (kW)
38
36
37
18.5 20.5 22.5
speed
force
8
Experimentation and Process Modeling
D45exp 1
D45exp 2
D45exp 3
GSexp 1 Pexp 1 CFexp 1
GSexp 2 Pexp 2 CFexp 2
GSexp 3 Pexp 3 CFexp 3
9
Propagation of uncertainty in process model predictions:
All Model Coefficient variances and Process Variable variances contribute to each predicted Response uncertainty in a model-dependent manner.
D45pred 1 = B0 + B1·GSexp 1 + B2·Pexp 1 + B3·CFexp 1 + B4·GS2exp 1
D45pred 2 = B0 + B1·GSexp 2 + B2·Pexp 2 + B3·CFexp 2 + B4·GS2exp 2
D45pred 3 = B0 + B1·GSexp 3 + B2·Pexp 3 + B3·CFexp 3 + B4·GS2exp 3
Experimentation and Process Modeling
10
Propagation of Uncertainty in Regression Modeling
• What procedures can be used to estimate uncertainty in design space?
• What is the benefit of propagating uncertainty using Monte Carlo simulation?
11
B0
B1
B2
B3
B4
The Process Model: Matrix Representation
R = DB
D45pred 1
D45pred 2
D45pred 3
1 GSexp 1 Pexp 1 CFexp 1 GS2exp 1
1 GSexp 2 Pexp 2 CFexp 2 GS2exp 2
1 GSexp 3 Pexp 3 CFexp 3 GS2exp 3
=
Response
matrix
Design
matrix
B
matrix
12
Least Squares Solution to a Process Model
Matrix Representation of Process Model: R = DB
Solving for the Model Coefficients: D†R = B
The pseudoinverse solution of a matrix equation gives the least squares best estimates of the B coefficients!
Define the pseudoinverse of D: D† = (DTD)-1DT
13
Estimating Variance in Prediction:The Basis for Uncertainty in Design Space
Jth experimental variance = Jth diagonal element of Cov(R)
Response Covariance matrix Cov(R) = B[Cov(D)]BT
Assumptions: Only D has uncertainty.
Problems: 1.) We know that B has uncertainty. 2.) We know that uncertainties in D will be correlated, but we don’t know Cov(D)
14
Estimating Variance of Process Model Regression Coefficients
Jth Model coefficient variance = Jth diagonal element of Cov(B)
Problems: 1.) We know that D (matrix of input variables) has uncertainty. 2.) We suspect that uncertainties may be correlated.
Model coefficient Covariance matrix Cov(B)=[DTD]-1R
2
Response variance( p = # model coefficients N = # experiments)
R
2= (Ri – Ri)2
N - p
^
i=1
N
Assumptions: Only R has uncertainty; Errors uncorrelated and constant
Cov(B)=D†[Cov(R)]D†T
15
Monte Carlo Methods
• Develop a mathematical model.– The Process Model.
• Add random variables.– Replace quantities of interest with random numbers selected
from appropriate distribution functions that are expected to describe the variables.
• Monitor selected output variables.– Output variables become distributions whose properties are
determined by the model and the distributions of the random variables.
• Advantage #1: We make no assumptions concerning sources of uncertainty or covariance between variables.
16
Case Study #1: Influence of Process Parameter Variation on Prediction
• Model Conditions– GS mean = 36 kg– P mean = 20 kW– CF mean = 14 kN– Input parameter standard deviations were varied.– Dissolution values were predicted.
17
Example: Simulation 1
Distribution for Water (kg)
0
0.2
0.4
0.6
0.8
1
1.2
1.4
1.6
1.8
35 35.5 36 36.5 37
Distribution for Power (kW)
0
0.05
0.1
0.15
0.2
0.25
0.3
0.35
0.4
0.45
15 17.5 20 22.5 25
Distribution for Force (kN)
0
0.05
0.1
0.15
0.2
0.25
0.3
0.35
0.4
0.45
9 12 15 18
D45 Simulation #1
0
0.02
0.04
0.06
0.08
0.1
0.12
60 70 80 90 100
GS Mean = 36 kg Std. Dev. = 0.25 kg
P Mean = 20 kW Std. Dev. = 1 kW
CF Mean = 14 kN Std. Dev. = 1 kN
D45 Simulation Result Mean = 74.6% Std. Dev. = 3.70%
D45 = 68.35 – 1.34(GS) – 2.88(P) - 8.95(CF) + 2.43(GS)2
18
Example: Simulations 1-4
D4
5 S
imula
tion #
1
0
0.02
0.04
0.06
0.08
0.1
0.12
6070
8090
100
GS Std. Dev.=0.25 kgP Std. Dev.=1 kWCF Std. Dev.= 1 kND45 Std. Dev.= 0%
D45 Simulation Mean = 74.6% Std. Dev. = 3.70%
D4
5 S
imula
tion #
2
0
0.02
0.04
0.06
0.08
0.1
0.12
6070
8090
100
D4
5 S
imula
tion #
3
0
0.01
0.02
0.03
0.04
0.05
0.06
0.07
0.08
6070
8090
100
D4
5 S
imula
tion #
4
0
0.01
0.02
0.03
0.04
0.05
0.06
0.07
6070
8090
100
D45 Simulation Mean = 75.0% Std. Dev. = 4.59%
D45 Simulation Mean = 76.9% Std. Dev. = 7.89%
D45 Simulation Mean = 76.9% Std. Dev. = 8.26%
GS Std. Dev.=0.5 kgP Std. Dev.=1 kWCF Std. Dev.= 1 kND45 Std. Dev.= 0%
GS Std. Dev.=1 kgP Std. Dev.=1 kWCF Std. Dev.= 1 kND45 Std. Dev.= 0%
GS Std. Dev.=1 kgP Std. Dev.=2 kWCF Std. Dev.= 1 kND45 Std. Dev.= 0%
19
Influence of Process Parameters Variation
• Increase in granulation water mass (GS) variance:– Increases predicted D45 variance.– Slightly shifts predicted D45 means.– Skews the predicted D45 distributions.
• Increase in granulator power (P) endpoint variance:– Increases predicted D45 variance.– Does not shift predicted D45 means.– Does not skew the predicted D45 distributions.
20
Influence of Dissolution Measurement Error
• Model Conditions– GS mean = 36 kg– P mean = 20 kW– CF mean = 14 kN– Input parameter standard deviations were varied.– Dissolution measurement error was added.– Dissolution values were predicted.
21
Example: Simulations 5-7
GS Std. Dev=0.5 kgP Std. Dev=1 kWCF Std. Dev=0.5 kND45 Std. Dev=0%(Control)
D45 Simulation Mean = 75.0% Std. Dev. = 3.88%
D45 Simulation Mean = 75.0% Std. Dev. = 4.37%
D45 Simulation Mean = 75.0% Std. Dev. = 5.58%
D45 Simulation Mean = 75.0% Std. Dev. = 7.16%
Sim
ulatio
n #5
0
0.02
0.04
0.06
0.08
0.1
0.12
5065
8095
Sim
ulato
n 5-7
Co
ntrol
0
0.02
0.04
0.06
0.08
0.1
0.12
0.14
5060
7080
90100
Sim
ulatio
n #6
0
0.01
0.02
0.03
0.04
0.05
0.06
0.07
0.08
5060
7080
90100
Sim
ulatio
n #7
0
0.01
0.02
0.03
0.04
0.05
0.06
0.07
5064
7892
GS Std. Dev=0.5 kgP Std. Dev=1 kWCF Std. Dev=0.5 kND45 Std. Dev=2%
GS Std. Dev=0.5 kgP Std. Dev=1 kWCF Std. Dev=0.5 kND45 Std. Dev=4%
GS Std. Dev=0.5 kgP Std. Dev=1 kWCF Std. Dev=0.5 kND45 Std. Dev=6%
22
Influence of Dissolution Measurement Error
• Increase in D45 measurement variance:– does not shift predicted D45 means.– does not appear to skew predicted D45
distributions.– increases predicted D45 variance.
• Advantage #2, we get the distribution, not just the standard deviation.
• Advantage #3, sensitivity analysis allows us to prioritize process improvement.
23
Measurement Uncertainty and Prediction Uncertainty
Benchmark
Experiment Measured Mean Model Prediction Measured St. Dev.1 69.0 69.2 3.12 73.5 72.3 3.13 71.9 72.1 1.94 67.1 65.5 2.95 69.8 71.2 1.96 75.5 75.0 0.87 65.4 66.6 2.78 75.8 77.3 1.19 56.1 59.4 3.710 61.0 59.4 3.711 67.2 68.3 2.412 77.0 77.3 1.113 72.7 68.3 2.4
2.42.6
Standard Error of PredictionStandard Error (RMS Measurement Standard Deviation)
D45 (%)
24
Monte Carlo Prediction Error
RandomResponse
RandomCoefficients
RandomInputs
Result using estimated coefficient St. Dev.
D45 Measurement #1 #2 #3 #4
N = 6 6 10000 10000 6B matrix - regression regression regression regression
B Std. Dev. - regression regression regression 0R Std. Dev. - 0 0 0 1.7%P Std. Dev. - 0 0 0.5 0.5
GS Std. Dev. - 0 0 0.25 0.25CF Std. Dev. - 0 0 0.5 0.5
Std. Error of Prediction 2.6% 1.7% 1.6% 2.6% 2.5%
D45 Monte Carlo Simulation Parameters
Std. Error
25
Prediction Error Based on Estimated Coefficient Standard Deviations
• Estimated model coefficient standard deviations do not predict the observed response uncertainty.
• Can we use Monte Carlo simulation to provide better estimates of model coefficient standard deviations?
26
Propagation of Uncertainty in Process Modeling
1. Assign random variables to Dissolution values (R) and use Monte Carlo simulations to propagate error to the model coefficients (B).
Solving for the Model Coefficients: D†R=B
2. Assign random variables to Process Parameters (D) and use Monte Carlo simulations to propagate error to B.
B1 = D†11·RExp 1 + D†
12·RExp 2 + D†13·RExp 3 +…
B2 = D†21·RExp 1 + D†
22·RExp 2 + D†23·RExp 3 +…
The pseudoinverse of D: D†=(DTD)-1DT
27
How Do Variances in Process Parameters Influence Model Coefficients?
• Simulation # 1 (1-0.25-1)– Measured D45 means and standard deviations.– P 19-23 kW ± 1 kW– GS 36-38 kg ± 0.25 kg– CF 12-18 ± 1 kN
• Compare to regression distributions– Model coefficient means– Model coefficient standard deviations
28
How Do Variances in Process Parameters Influence Model Coefficients?
“Bias” Power (P) Water (GS) Force (CF) Water2
Ran
. In
put Bias Coefficient
0
0.05
0.1
0.15
0.2
0.25
0.3
0.35
0.4
0.45
0.5
64 67 70 73
Power Coefficient
0
0.05
0.1
0.15
0.2
0.25
0.3
0.35
0.4
-8 -6 -4 -2 0 2 4
Water Coefficient
0
0.05
0.1
0.15
0.2
0.25
0.3
0.35
-8 -6 -4 -2 0 2 4 6
Water2 Coefficient
0
0.05
0.1
0.15
0.2
0.25
0.3
-4 -2 0 2 4 6 8 10
Force Coefficient
0
0.05
0.1
0.15
0.2
0.25
0.3
0.35
-16 -14 -12 -10 -8 -6 -4 -2
Increase in process parameter variance causes a shift in some model coefficients.
Increase in process parameter variance increases model coefficient variance.
Bias Coefficient
0
0.05
0.1
0.15
0.2
0.25
0.3
0.35
0.4
0.45
0.5
0.55
64 65 66 67 68 69 70 71 72 73
Power Coefficient
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
-3.5 -2.375 -1.25 -0.125 1
Water Coefficient
0
0.05
0.1
0.15
0.2
0.25
0.3
0.35
0.4
-7 -6 -5 -4 -3 -2 -1 0 1 2 3 4
Force Coefficient
0
0.2
0.4
0.6
0.8
1
1.2
-5 -4 -3 -2 -1
Water2 Coefficient
0
0.05
0.1
0.15
0.2
0.25
0.3
0.35
-4 -2 0 2 4 6 8 10
Reg
ress
ion
29
How Do Variances in Process Parameters Influence Model Coefficients?
– Simulation # 1 (1-.25-1)– Simulation # 2 (1-0.5-1)– Simulation # 3 ( 1-1-1)– Simulation # 4 ( 2-1-1)
• Increasing input parameter variance:– increases variance in the model coefficients.– can skew the model coefficient distribution.– can shift model coefficient means.
30
Estimated Model Coefficient Uncertainties from Monte Carlo Simulation
Regression #1 #2 #3 #4P Std. Dev. - - 1 1 0.5
GS Std. Dev. - - 0.25 0.25 0.25CF Std. Dev. - - 1 1 0.5R Std. Dev. - measured - measured 2.5%
B0 68.35 ± 0.83 68.35 ± 0.93 68.41 ± 1.06 68.39 ± 1.44 68.39 ± 0.9
B(GS) -1.34 ± 1.07 -1.34 ± 1.21 -1.23 ± 1.49 -1.22 ± 1.96 -1.24 ± 1.28
B(P) -2.88 ± 0.96 -2.88 ± 1.04 -2.13 ± 1.18 -2.11 ± 1.53 -2.63 ± 1.06
B(CF) -8.95 ± 1.17 -8.95 ± 1.38 -7.37 ± 1.35 -7.37 ± 1.87 -8.52 ± 1.2
B(GS^2) 2.44 ± 1.35 2.44 ± 1.52 2.13 ± 1.84 2.14 ± 2.43 2.15 ± 1.57Std. Error of Prediction 1.6% 1.8% 2.1% 2.8% 1.8%
Monte Carlo Simulation Parameters
Std. Error
31
Case Study #2: Nasal Spray Performance Models
• A nasal spray product is a combination of a therapeutic formulation and a delivery device.
• 3-level, 4-factor Box-Behnken designs – Pfeiffer nasal spray pump– Placebo formulations (CMC & Tween 80 solutions)– Reference:
• Changning Guo, Keith J. Stine, John F. Kauffman, William H. Doub. 2008 “Assessment of the influence factors on in vitro testing of nasal sprays using Box-Behnken experimental design”, European Journal of Pharmaceutical Sciences 35 (12 ) 417–426
• Changning Guo, Wei Ye, John F. Kauffman, William H. Doub. “Evaluation of Impaction Force of Nasal Sprays and Metered-Dose Inhalers Using the Texture Analyser.” Journal of Pharmaceutical Sciences. In press
32
Response Variables
Plume geometry: measures the side view of a spray plume at its fully developed phase
Spray Pattern: measures the cross sectional uniformity of the spray
Parameters used to describe the shape of a nasal spray plume: spray pattern area, plume width.
33
Response Variables
• Droplet Size Distribution– Volume Median Diameter D50
• Impaction Force
0
10
20
30
40
50
60
70
80
90
100
Cu
mu
lativ
e d
istr
ibu
tion
Q3
/ %
0
0.2
0.4
0.6
0.8
1.0
1.2
1.4
1.6
1.8
2.0
2.2
De
nsi
ty d
istr
ibu
tion
q3
*
0.100.10 0.5 1 5 10 50 100particle size / µm
34
Nasal Spray Response Models
• Optimized regression models
Responses Prediction Equations R2
Spray Pattern Area 22 148548927915647338 CVVCCVSR 0.97
Plume Width 22 3.43.41.83.87.26 CVCVR 0.96
D50 28.201.276.196.275.34 VVCCVR 0.90
Impaction Force 22 74.041.023.051.134.045.4 VSTVSR 0.94
35
Spray Pattern Model
Variances from input variables and spray pattern area measurements have similar level of influence on the model coefficients.
regression Random R Random D Random
R&D Spray pattern model term Mean
Std. Dev.
Mean Std. Dev.
Mean Std. Dev.
Mean Std. Dev.
offset 338 15 338 4 338 4 338 5 S 47 13 46 4 46 4 46 5 V 156 13 155 3 155 4 155 4 C -279 13 -279 5 -279 4 -279 6
VC -89 23 -89 7 -89 7 -89 8 V2 -54 18 -54 5 -54 6 -54 7 C2 148 18 148 5 148 5 148 7
22 148548927915647338 CVVCCVSR
36
Plume Width Model
Variance from plume width measurements have more influence on the model coefficients than those from the input variables.
regression Random R Random D Random R&D Plume Geometry
model term Mean Std. Dev.
Mean Std. Dev.
Mean Std. Dev.
Mean Std. Dev.
offset 26.7 0.6 26.7 0.3 26.7 0.2 26.7 0.3 V 8.3 0.5 8.3 0.3 8.3 0.1 8.3 0.4 C -8.1 0.5 -8.1 0.3 -8.1 0.1 -8.1 0.3 V2 -4.4 0.8 -40.4 0.4 -4.3 0.2 -4.3 0.4 C2 4.4 0.7 4.4 0.4 4.4 0.2 4.4 0.4
22 3.43.41.83.87.26 CVCVR
37
Droplet Size Model – D50
28.201.276.196.275.3450 VVCCVD
323334353637
0.0
0.5
1.0
1.5
2.0
2.5
3.0
Coef_1_2
Minimum33.8212Maximum35.0931Mean 34.4472Std Dev 0.1449Values 10000
-1.50-1.45-1.40-1.35-1.30-1.25
0
10
20
30
40
50
Coef_2_2
Minimum-1.4152Maximum-1.3451Mean -1.3817Std Dev 0.00983Values 10000
171819202122
0.0
0.5
1.0
1.5
2.0
2.5
Coef_3_2
Minimum18.9851Maximum20.1786Mean 19.5939Std Dev 0.1707Values 10000
-1.6-1.5-1.4-1.3-1.2-1.1
0
5
10
15
20
25
30
35
Coef_4_2
Minimum-1.4020Maximum-1.3050Mean -1.3549Std Dev 0.0134Values 10000
444648505254565860
Values in Thousandths
0
100
200
300
400
500
600
700
Coef_5_2
Minimum 0.04949Maximum0.05399Mean 0.05200Std Dev 0.000606Values 10000
444648505254565860
Values in Thousandths
0
100
200
300
400
500
600
700
Coef_5_2
Minimum 0.0415Maximum 0.0628Mean 0.0516Std Dev 0.00238Values 10000
-1.6-1.5-1.4-1.3-1.2-1.1
0
5
10
15
20
25
30
35
Coef_4_2
Minimum-1.5820Maximum-1.1279Mean -1.3509Std Dev 0.0644Values 10000
171819202122
0.0
0.5
1.0
1.5
2.0
2.5
Coef_3_2
Minimum17.4895Maximum21.8571Mean 19.5707Std Dev 0.5912Values 10000
-1.50-1.45-1.40-1.35-1.30-1.25
0
10
20
30
40
50
Coef_2_2
Minimum-1.5192Maximum-1.2604Mean -1.3799Std Dev 0.0306Values 10000
323334353637
0.0
0.5
1.0
1.5
2.0
2.5
3.0
Coef_1_2
Minimum32.5884Maximum36.5062Mean 34.4707Std Dev 0.4972Values 10000
444648505254565860
Values in Thousandths
0
100
200
300
400
500
600
700
Coef_5_2
Minimum 0.0414Maximum 0.0599Mean 0.0516Std Dev 0.00244Values 10000
-1.6-1.5-1.4-1.3-1.2-1.1
0
5
10
15
20
25
30
35
Coef_4_2
Minimum-1.6118Maximum-1.1057Mean -1.3509Std Dev 0.0667Values 10000
171819202122
0.0
0.5
1.0
1.5
2.0
2.5
Coef_3_2
Minimum17.3591Maximum22.1043Mean 19.5710Std Dev 0.6144Values 10000
-1.50-1.45-1.40-1.35-1.30-1.25
0
10
20
30
40
50
Coef_2_2
Minimum-1.4974Maximum-1.2670Mean -1.3800Std Dev 0.0317Values 10000
323334353637
0.0
0.5
1.0
1.5
2.0
2.5
3.0
Coef_1_2
Minimum32.7020Maximum36.3831Mean 34.4713Std Dev 0.5187Values 10000
Random R only
Random D only
Random R&D
offset V C VC V2
Variance from input variables have more influence on the model coefficients than those from the D50 measurements.
38
Impaction Force Model
22 74.041.023.051.134.045.4 VSTVSR
regression Random R Random D Random R&D force
Mean Std. Dev.
Mean Std. Dev.
Mean Std. Dev.
Mean Std. Dev.
offset 4.44 0.11 4.44 0.04 4.45 0.04 4.45 0.06 S 0.34 0.09 0.34 0.04 0.33 0.04 0.33 0.06 V 1.51 0.09 1.51 0.03 1.50 0.03 1.50 0.04 T 0.23 0.09 0.23 0.04 0.23 0.03 0.23 0.05 S2 0.41 0.13 0.41 0.05 0.39 0.06 0.39 0.07 V2 -0.72 0.12 -0.72 0.04 -0.72 0.04 -0.72 0.08
Variances from input variables and impaction force measurements have similar level of influence on the model coefficients.
39
• The means of model coefficients show good agreement between regression results and Monte Carlo simulations
• The standard deviations of model coefficients obtained from regression results are larger than those from Monte Carlo simulation.
– The estimated standard deviations from regression may overestimate the uncertainties in the model coefficients.
– Regression based coefficient standard deviations in defining design space may result in a smaller selection range of input variable values that are necessary to meet the desired confidence level.
How Do Variances in Formulation and Actuation Influence Model Coefficients?
40
Advantages of Monte Carlo Simulation
1. We make no assumptions concerning sources of uncertainty or variable covariance.
2. We see the distribution of output variable values, not just a standard deviation.
3. Sensitivity analysis allows us to prioritize high risk input variables and improve process control.