

gxpandjvt.com Journal of Validation Technology [Summer 2011] 47

Statistical Viewpoint. David LeBlond, Daniel Griffith, and Kelly Aubuchon

For more author information, go to gxpandjvt.com/bios

ABOUT THE AUTHORS
David LeBlond, Ph.D., is senior statistician in Exploratory Statistics, Global Pharmaceutical R&D, Abbott Global Pharmaceutical, Abbott Park, IL. He may be contacted by e-mail at david.leblond@abbott.com. Daniel Griffith is a statistician in the Technical Support Department at Minitab Inc. Kelly Aubuchon is a statistician in the Technical Support Department at Minitab Inc., State College, PA.

Linear Regression 102: Stability Shelf Life Estimation Using Analysis of Covariance

“Statistical Viewpoint” addresses principles of statistics useful to practitioners in compliance and validation. We intend to present these concepts in a meaningful way so as to enable their application in daily work situations.

Reader comments, questions, and suggestions are needed to help us fulfill our objective for this column. Please contact managing editor Susan Haigney at [email protected] with comments, suggestions, or manuscripts for publication.

KEY POINTS
The following key points are discussed:

• Analysis of covariance (ANCOVA) is an important kind of multiple regression that involves two predictor variables: one continuous (e.g., time) and one categorical (e.g., batch of material).

• Like simple linear regression, simple ANCOVA fits straight lines to response measurements (e.g., potency, related substance, or moisture content) over time: one line for each level (i.e., batch) of the categorical variable.

• A key objective of ANCOVA is to determine whether the straight lines for all batches are best described as having a common-intercept-common-slope (CICS) model, a separate-intercepts-common-slope (SICS) model, or a separate-intercepts-separate-slopes (SISS) model.

• In ANCOVA, model choice is based on two statistical F-tests: one comparing slopes and one comparing intercepts among batches. In the case of pharmaceutical shelf life estimation, the US Food and Drug Administration recommends a p-value < 0.25 for significance in these tests.

• ANCOVA model adequacy can be assessed by examining measures such as root mean square error (RMSE), lack of fit, PRESS, and predicted R-square.

• Once the appropriate model (i.e., CICS, SICS, or SISS) has been identified for a given data set, it can be used to obtain expected values, confidence intervals, and prediction intervals of potency of a given lot at a given time.

• When a lower or an upper specification limit can be identified for the response, the ANCOVA model can be used to estimate the shelf life for the batches tested.

• The shelf life for a pharmaceutical batch is defined as the maximum storage period within which the 95% confidence interval for the batch mean response level remains within the specification range. Depending on the response, the confidence interval may be one or two sided.

• The shelf life for a pharmaceutical product is taken to be the minimum shelf life for batches on stability.

• ANCOVA analysis and shelf life estimation using the Minitab Stability Studies Macro are illustrated in the cases of pharmaceutical potency, related substance, and moisture content responses.


INTRODUCTION
A previous installment of “Statistical Viewpoint” described simple linear regression in which there is a single continuous independent variable such as time, temperature, concentration, or weight (1). Many important relationships involve multiple independent variables, some of which may be categorical in nature (e.g., batch of material, supplier, manufacturing site, laboratory, preservative type, clinical subject). Understanding such relationships requires the use of multiple linear regression. In this installment, we deal with the simplest kind of multiple linear regression in which there are two independent variables: one continuous (called the “covariate”) and one categorical. The following are some examples in which this kind of relationship is important:

• Pre-clinical studies. Ten xenograft rodents are treated with a range of doses of an anti-tumor agent and the tumor weight for each animal decreases as dose increases. The objective is to quantify the animal-to-animal differences in dose-response profile. Here, tumor weight is the dependent variable, rodent identity is the categorical variable, and dose is the covariate.

• Process scale-up. Active pharmaceutical ingredient (API) concentration is measured over time in three chemical reactors. The reactors differ in size (scale). The objective is to estimate scale effects on the rate of API synthesis. Here, API concentration is the dependent variable, scale is the categorical variable, and reaction time is the covariate.

• Analytical methods. An assay measures the concentration of an analyte in plasma samples based on a fluorescence response. Samples are tested in duplicate. Each test provides a blank response and a test response. The objective is to compare analyte concentrations among samples, while correcting each for the effect of the blank. In this case, the test response is the dependent variable, sample identity is the categorical variable, and blank is the covariate.

• Pharmaceutical product stability. The drug potency, related substance (a degradation product), and moisture level are measured over time in multiple batches of product stored in a temperature- and humidity-controlled chamber. The objective is to estimate the shelf life of the product. Here, the potency, related substance, and moisture levels are the dependent variables, batch identity is the categorical variable, and storage time is the covariate.

Notice the following distinctions in these examples.

The relative importance of the covariate and categorical variable differs. Sometimes, as in the pharmaceutical stability example, the primary interest may be in the effects of the covariate (i.e., stability over time), where the categorical variable, batch, is merely an unavoidable nuisance variable. In other cases, as in the analytical methods example, differences among levels of the categorical variable, sample, are of primary interest, while the effect of the covariate, blank, is an unavoidable nuisance. In still other cases, as with the pre-clinical studies or process scale-up examples, both the differences among levels of the categorical variable (rodent or scale) and the effects of the covariate (dose or reaction time) may be of equal interest.

The covariate may or may not be truly independent. Sometimes the covariate may be a truly independent variable whose value is well controlled and known with certainty, such as dose level or time. In other cases, the covariate is actually a measured value, such as an analytical blank. This violates one of the assumptions of regression, that the predictor variables are known without error (1). We still often use regression in these cases as long as the covariate is measured relatively accurately.

The experiment may include all or only some of the categorical variable levels of interest. Sometimes we include all levels of a categorical variable that are of interest, such as with the analytical methods example where we are concerned only with the samples being tested. In other cases, the categorical variable levels in our experiment are merely a sampling of all possible levels drawn from a larger population, such as all possible rodents or all possible manufactured batches. In these latter cases, we must remember that the methods we discuss here do not allow us to make strong inferences about that larger population; our conclusions will be limited primarily to the categorical levels (e.g., rodents, batches) we have tested. To make stronger inferences about the larger population, more advanced statistical methods are required.

This article focuses on the important example of pharmaceutical product stability. Thus our categorical variable will be batch and our covariate will be storage time. Design and analysis of stability studies is a mature discipline and such studies may include additional continuous covariates such as dosage strength, storage temperature, or excipient levels as well as additional categorical variables such as excipient lot or packaging type. These more complex studies are referred to as multi-factor stability studies (2).


The analysis of such studies is beyond the scope of this article.

Because batches often differ in stability, stability studies on a single batch of product are not of interest. The number of batches in such studies is often small, yet the objective is inevitably a shelf-life estimate to be applied to the population of all future manufactured batches. This is somewhat troubling. In light of the last distinction mentioned above, we advise caution and encourage the reader to discuss with a statistician the possibility of using mixed model or Bayesian approaches (2) where appropriate. We will proceed with our description of the traditional approach without apologies because it is common industry practice.

MODELS OF INSTABILITY
We will assume here that, for a given batch, the change in level over time can be approximated by a straight line. Chemists refer to this as a pseudo-zero-order kinetic mechanism. The real kinetic mechanism is almost certainly more complex, but this linear assumption is often found to be adequate. In any real application this linear assumption should be justified. In some cases, the response measurements or the time scale can be altered using appropriate transformation(s) to obtain a linear stability profile.

Consider the case where stability data are available for three batches of product. Figure 1 illustrates possible models, or scenarios, of product instability where the response is, for instance, the level of some related substance or degradation product of the active drug. However, the models described in Figure 1 apply equally well for decreasing responses (e.g., potency) or for responses that may rise or fall over time (e.g., moisture). In Figure 1, the mean response level for each batch is indicated by a different colored line.

Each line can be defined by its intercept (i.e., response level at time zero) and slope (i.e., rate of change in response over time).

The common intercept and common slope (CICS) model represents a scenario where the stability profiles of all batches have a common intercept and common slope. This might be the result of a well-controlled manufacturing process where the initial levels of all components, as well as their stability over time, are uniform across batches. The CICS model generally will result in a longer estimated product shelf life because it allows tighter estimates of the mean slope and intercept that are common to all batches.

The separate intercept and common slope (SICS) model represents a scenario where batches have separate intercepts but a common slope. This could result from a manufacturing process in which the initial level of the component of interest is not well controlled batch to batch. However, other aspects of the process that govern batch stability are uniform such that the rate of change in the level of the component of interest is the same for all batches.

The separate intercept and separate slope (SISS) model represents a scenario where batches have separate intercepts and separate slopes. This could result from an uncontrolled manufacturing process in which neither the initial level nor the stability of the component of interest is well controlled batch to batch.

Clearly the CICS model is most desirable. The SICS model may be acceptable as long as the initial level non-uniformity is controlled within acceptance limits. However, the SISS model is the least desirable scenario because batches may become increasingly less uniform over time. The presence of large batch-to-batch variability makes it difficult to accurately estimate a shelf life for the process from only a few batches.
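The models differ only in which parameters are shared across batches, so their error sums of squares can be obtained from pooled and within-batch sums of squares. The following is a minimal Python sketch under that assumption; it is not the Minitab macro's implementation, and the function name `fit_sse` and the record layout (response, time, batch) are illustrative choices. It also reports the intercept-only ("CINS") fit, which is simply the total sum of squares.

```python
from collections import defaultdict

def _sums(ts, ys):
    """Return (Sxx, Sxy, Syy) computed about the means of the given points."""
    n = len(ts)
    tbar = sum(ts) / n
    ybar = sum(ys) / n
    sxx = sum((t - tbar) ** 2 for t in ts)
    sxy = sum((t - tbar) * (y - ybar) for t, y in zip(ts, ys))
    syy = sum((y - ybar) ** 2 for y in ys)
    return sxx, sxy, syy

def fit_sse(rows):
    """Error sum of squares for the CINS, CICS, SICS, and SISS models.

    rows: iterable of (response, time, batch) records.
    Assumes every batch has at least two distinct time points.
    """
    rows = list(rows)
    by_batch = defaultdict(list)
    for y, t, b in rows:
        by_batch[b].append((t, y))

    # Pooled sums ignore batch identity: intercept-only and single-line fits.
    sxx, sxy, syy = _sums([t for _, t, _ in rows], [y for y, _, _ in rows])
    sse = {"CINS": syy, "CICS": syy - sxy ** 2 / sxx}

    # Within-batch sums: each batch is centered on its own means.
    within = [_sums([t for t, _ in pts], [y for _, y in pts])
              for pts in by_batch.values()]
    wxx = sum(w[0] for w in within)
    wxy = sum(w[1] for w in within)
    wyy = sum(w[2] for w in within)

    # SICS: separate intercepts, one slope pooled across batches.
    sse["SICS"] = wyy - wxy ** 2 / wxx
    # SISS: a separate simple regression for every batch.
    sse["SISS"] = sum(w[2] - w[1] ** 2 / w[0] for w in within)
    return sse
```

For batches that lie exactly on parallel lines, the SICS and SISS errors are zero while the CICS error is positive, which mirrors the SICS panel of Figure 1.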

Figure 1: Multiple-batch models of instability: CICS (common intercept and common slope), SICS (separate intercept and common slope), SISS (separate intercept and separate slope).


Some readers may notice that a CISS (common intercept and separate slopes) model is missing from Figure 1. Certainly there is no scientific reason to exclude a manufacturing process in which initial levels of batches are very well controlled but that other components (such as stabilizers) or process settings that affect batch stability might not be well controlled. However, while the initial levels may be relatively well controlled they are unlikely to be identical, at least for batches derived from blended powders or unit-dose filling processes. So, unless there are compelling scientific reasons to consider the CISS model, we must use the stability data to choose either the SISS, SICS, or CICS models.

A model that is important in building the analysis of covariance (ANCOVA) table but is not considered in the evaluation of stability data is what we might call the “common intercept, no slope” (CINS) model. The CINS model assumes that the common slope of all batches is zero. This implies a perfectly stable product. While very stable pharmaceutical products do exist, we never make an assumption of perfect stability in evaluating stability data.

ANCOVA MODEL SELECTION
Well-controlled processes that follow a CICS model will more likely result in a longer shelf-life estimate than those that follow the SICS or SISS models. Because the estimate of shelf life depends on the model choice, the first task is to choose the model. While there may be development experience or theoretical reasons to expect one model over another, the traditional approach is to let the stability data themselves guide us to the most appropriate model. The ANCOVA is the statistical procedure for selecting the most appropriate of the three models. ANCOVA is a close cousin of the analysis of variance (ANOVA) associated with simple linear regression (1). Like ANOVA, ANCOVA partitions the variance in the observed measurement in a specific way. This partitioning allows us to make two statistical F-tests for batch differences among slopes and intercepts.

The algebra behind the ANCOVA F-tests is complicated. But it is not necessary to understand the algebra because the calculations are easily handled by statistical software packages such as Minitab Statistical Software (3). However, it is necessary to understand what these F-tests are comparing, what the criteria for test acceptance or rejection are, and to be familiar with the ANCOVA table that statistical software produces.

The ANCOVA F-tests make a comparison between two models: a simple (null or reduced) model and a more complicated (alternative or full) model. The p-value associated with the test F statistic is used to decide whether the portion of response variance attributable to the extra features of the more complicated model is larger than can be explained by measurement variation alone. If so, we reject the simpler model in favor of the more complicated one. Table I shows the models being compared in the ANCOVA F-tests.

The p-value obtained from either ANCOVA test in Table I is the probability of obtaining an F statistic that is as extreme as or more extreme than the one we observed, given that the null hypothesis (i.e., the simpler model) is true. If the p-value is below some fixed value, we should select the more complicated model; otherwise we choose the simpler model. This fixed value is referred to as the “alpha” or “type I error” level. In many applications, we choose a limit value of 0.05 for our hypothesis tests. However, in the case of pharmaceutical product stability, it is traditional to use the more conservative limit of 0.25 for the p-value (4). The 0.25 limit is controversial because it implies that, when the simpler model is in fact correct, we will incorrectly choose the more complicated (and less desirable) model 25% of the time.

The rationale for choosing this more conservative limit has to do with safety and efficacy. If we incorrectly choose the more complicated model, the shelf-life estimate will likely be too short. Consumers of the drug product will likely not suffer side effects if a manufacturer establishes a shelf life that is shorter than necessary. On the other hand, if we incorrectly choose the simpler model, the shelf-life estimate will likely be too long. In that case, consumers who use product near the end of its shelf life may be undermedicated (if potency declines with time) or be exposed to higher levels of harmful degradation products. Consequently, regulatory agencies have established the more conservative p-value limit of 0.25 to reduce the likelihood of establishing a shelf life that is too long. This practice seems undesirable from a manufacturer’s point of view, but remember that the shelf life is meant to apply to the population of all future batches. Establishing a shelf life that a process cannot support adds to the cost of operations due to out-of-specification investigations of batches on stability and potential product recalls.

Table I: Model comparisons made in the ANCOVA F-tests.

ANCOVA F-test                 Simple model    More complicated model
Test for common slopes        SICS            SISS
Test for common intercepts    CICS            SICS

The ANCOVA decision process is diagrammed in Figure 2. It starts with the stability data at the top. There are of course many ways to organize stability data. The format shown in Figure 2 is what is required for input into most statistical packages, such as Minitab, for ANCOVA analysis. In this format, there are three columns: the response level column, the time (covariate) column, and the batch (categorical variable) column. For brevity, only the first four and last observations are shown.

We start with the worst-case presumption that slopes and intercepts vary among batches. The F-test for separate slopes is examined first. If this test is statistically significant (i.e., p-value < 0.25) then the ANCOVA process concludes with the selection of SISS as the final stability model. As discussed previously, unless there is a compelling scientific argument, an F-test comparing the SISS and CISS models is not made at this point.

If the F-test for separate slopes is not statistically significant (i.e., p-value ≥ 0.25), there is no evidence in the data for a difference in slopes among the batches, and we can presume a common slope model, SICS. Next, we perform the second F-test in Table I that tests for separate batch intercepts, assuming that batch slopes are common. This test is a comparison of models SICS and CICS. If the test is statistically significant, then the ANCOVA concludes with the selection of the SICS model. If the test is not statistically significant, then the remaining model, CICS, is selected.
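The two-step decision just described can be written as a short function. This is an illustrative sketch only; the name `select_model` is hypothetical, and the two p-values are assumed to come from the Batch*Time (slopes) and Batch (intercepts) F-tests.

```python
def select_model(p_slope, p_intercept, alpha=0.25):
    """Pick the stability model from the two ANCOVA p-values.

    p_slope:     p-value of the test for separate slopes (SICS vs. SISS).
    p_intercept: p-value of the test for separate intercepts (CICS vs. SICS).
    alpha:       significance limit; 0.25 is the limit discussed in the text.
    """
    # Slopes are examined first; a significant result ends the process.
    if p_slope < alpha:
        return "SISS"
    # Slopes are treated as common, so intercepts are examined next.
    if p_intercept < alpha:
        return "SICS"
    return "CICS"
```

The intercept p-value is consulted only when the slope test is not significant, which matches the order of the tests in the decision process.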

An ANCOVA table that is produced by the Minitab stability macro is shown in Table II. It consists of five rows and five columns of statistical quantities. The quantities in each column, how they are obtained, and what they represent are described as follows.

Source. A label indicates the variable or interaction that contributes variation to the measurement. This label also indicates the particular F-test that this row represents. The Time source provides an F-test of the hypothesis that the common slope is zero in the CICS model. A low p-value suggests that some instability is present, but is of no interest to us in model selection here because we never entertain a model with zero slope. The Error source does not include an F-test but provides an estimate of total analytical variance (the quantity mean square error [MSE]), assuming that the SISS model is appropriate. The degrees of freedom (DF) and the Seq SS in the Total source row are merely the sum of those quantities in the rows above.

The Batch and Batch*Time sources provide the ANCOVA F-tests for intercept and slope, respectively, that are of interest to us here. The p-values from these F-tests are used to make the model choice as described in Table I and Figure 2.

Figure 2: The ANCOVA model selection process.

Table II: ANCOVA table output from the Minitab stability macro.

Source       DF             Seq SS                         Seq MS                 F                   P
Time         DF_T = 1       SS_T = SSE_CINS - SSE_CICS     MS_T = SS_T/DF_T       F_T = MS_T/MSE      p-value_T
Batch        DF_B = B-1     SS_B = SSE_CICS - SSE_SICS     MS_B = SS_B/DF_B       F_B = MS_B/MSE      p-value_B
Batch*Time   DF_BT = B-1    SS_BT = SSE_SICS - SSE_SISS    MS_BT = SS_BT/DF_BT    F_BT = MS_BT/MSE    p-value_BT
Error        DF_E = N-2*B   SSE = SSE_SISS                 MSE = SSE/DF_E
Total        DF_tot = N-1   SS_tot = SSE_CINS


DF. This gives the degrees of freedom associated with each source. This is a measure of the amount of information available in the data to estimate the statistics associated with this source. B is the number of batches in the data set, and N is the total number of independent measurements in the data set. Notice how the DF for the Total source equals the sum of the values above it.

Seq SS. This is the sum of squares associated with this source. Larger Seq SS values represent sources that contribute more to variation in the data. This quantity is obtained from the ANOVA error sum of squares (see Reference 1) from the multiple regression fit to models CINS, CICS, SICS, and SISS. The error sum of squares is indicated as SSE_model, where the subscript gives the fitted model. Notice how the Seq SS for the Total source equals the sum of the values above it.

Seq MS. Seq MS gives the mean square (or variance) associated with the source. This is simply the respective Seq SS divided by the DF.

F. This gives the F-value for this source, which is simply the ratio of the respective Seq MS to some measure of error variance. In the case of pharmaceutical stability ANCOVA, it is common to use MSE as the error variance for all F-tests, but in a traditional ANCOVA table, the quantity SSE_SICS/(N-B-1) is used as the measure of error variance for the test for common intercept (the Batch source). MSE is used because it is smaller than the traditional quantity. This leads to a larger F-value, which is more likely to lead to statistical significance and a more conservative final model choice.

P. This gives the p-value for the F-test associated with this Source. This p-value is the complement of the cumulative F-distribution with quantile = F_Source, numerator degrees of freedom = DF_Source, and denominator degrees of freedom = DF_E.
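The column definitions above can be combined into a small calculation. The sketch below is an assumption-laden illustration in plain Python, not the macro's code: it takes the four error sums of squares as given, uses MSE as the error variance for every F-test as described in the text, and evaluates the upper tail of the F-distribution through a standard continued-fraction form of the regularized incomplete beta function. The names `ancova_table`, `f_upper_tail`, and `_betainc` are hypothetical.

```python
import math

def _betacf(a, b, x, max_iter=300, eps=1e-14):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    tiny = 1e-300
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        for aa in (m * (b - m) * x / ((qam + m2) * (a + m2)),
                   -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))):
            d = 1.0 + aa * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + aa / c
            c = c if abs(c) > tiny else tiny
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h

def _betainc(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = math.exp(math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                     + a * math.log(x) + b * math.log(1.0 - x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _betacf(a, b, x) / a
    return 1.0 - front * _betacf(b, a, 1.0 - x) / b

def f_upper_tail(f, dfn, dfd):
    """P(F > f) for an F-distribution with (dfn, dfd) degrees of freedom."""
    if f <= 0.0:
        return 1.0
    return _betainc(dfd / 2.0, dfn / 2.0, dfd / (dfd + dfn * f))

def ancova_table(sse, n, b):
    """Build the sequential ANCOVA rows from SSE values.

    sse: dict with keys "CINS", "CICS", "SICS", "SISS".
    n:   total number of independent measurements; b: number of batches.
    """
    dfe = n - 2 * b
    mse = sse["SISS"] / dfe
    table = {}
    sources = (("Time", 1, sse["CINS"] - sse["CICS"]),
               ("Batch", b - 1, sse["CICS"] - sse["SICS"]),
               ("Batch*Time", b - 1, sse["SICS"] - sse["SISS"]))
    for name, df, ss in sources:
        ms = ss / df
        f = ms / mse  # MSE is the error variance for all three tests
        table[name] = {"DF": df, "SS": ss, "MS": ms, "F": f,
                       "P": f_upper_tail(f, df, dfe)}
    table["Error"] = {"DF": dfe, "SS": sse["SISS"], "MS": mse}
    table["Total"] = {"DF": n - 1, "SS": sse["CINS"]}
    return table
```

In practice a statistical library would normally supply the F-distribution tail probability; it is written out here only so that the sketch has no dependencies.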

To summarize, p-value_B and p-value_BT in Table II are calculated from the stability data and are used to test for common intercept and slope, respectively, as described in Table I and Figure 2. The outcome of the ANCOVA process is a final stability model that is used to estimate the product shelf life. It is important to remember that the model selected through the ANCOVA process may change if data are re-analyzed after additional stability time points are acquired.

DETERMINATION OF SHELF LIFE
Shelf life for a pharmaceutical product is based on measurements of one or more stability indicating responses for which upper or lower acceptance limits have been established. The responses are measured on a few (typically three) batches of product that are stored under carefully controlled temperature and humidity in the intended packaging. Traditionally, a pharmaceutical product shelf life for a batch is based on the 95% confidence limit for the mean response level over time, as estimated from the available stability batch data. The 95% confidence limit for a mean regression line is described briefly in Reference 1. The shelf life (S) is based on the shortest Time at which the estimated 95% confidence bound crosses an acceptance limit. Shelf-life estimation for a single batch in three common situations is illustrated in Figure 3.

The left panel of Figure 3 illustrates an increasing response level over time (such as a degradation product) for which only an upper acceptance limit is set. In this case, it is common to use a one-sided upper confidence bound. The middle panel illustrates a decreasing response level (such as tablet potency) with only a lower acceptance limit set. In this case, it is common to use a one-sided lower confidence bound. The right panel illustrates the situation for a response level (such as moisture) that may either increase or decrease on storage and for which both upper and lower limits have been set. Cases do exist where lower (or upper) limits are in place for responses expected to increase (or decrease) over time. In such cases, it may be desirable to employ two-sided limits. One-sided confidence limits will lead to longer shelf-life estimates so their use must be risk justified.
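For the middle-panel situation (a decreasing response with a lower acceptance limit), the shelf life of a single batch can be sketched as the last time at which the one-sided lower confidence bound on the regression mean is still at or above the limit. The Python below is a simplified illustration under stated assumptions: it fits one batch by simple linear regression, expects the caller to supply the one-sided t critical value for n-2 degrees of freedom (for example, read from a t table), and locates the crossing by a grid search. The function name `shelf_life_lower` and its parameters are hypothetical.

```python
import math

def shelf_life_lower(times, values, lower_limit, t_crit, horizon, step=0.01):
    """Shelf life of one batch against a lower acceptance limit.

    times, values: stability time points and responses for one batch.
    t_crit:  one-sided t critical value with n-2 degrees of freedom (assumed input).
    horizon: longest storage time to search.
    Returns the last grid time at which the lower bound is still >= lower_limit.
    """
    n = len(times)
    tbar = sum(times) / n
    ybar = sum(values) / n
    sxx = sum((t - tbar) ** 2 for t in times)
    sxy = sum((t - tbar) * (y - ybar) for t, y in zip(times, values))
    slope = sxy / sxx
    intercept = ybar - slope * tbar
    sse = sum((y - intercept - slope * t) ** 2 for t, y in zip(times, values))
    s = math.sqrt(sse / (n - 2))  # residual standard deviation

    def lower_bound(t):
        # Standard error of the fitted mean response at time t.
        se = s * math.sqrt(1.0 / n + (t - tbar) ** 2 / sxx)
        return intercept + slope * t - t_crit * se

    last_ok = None
    for k in range(int(round(horizon / step)) + 1):
        t = k * step
        if lower_bound(t) < lower_limit:
            break  # first crossing found; stop at the shortest such time
        last_ok = t
    return 0.0 if last_ok is None else last_ok
```

An upper acceptance limit would be handled symmetrically by adding the margin and testing for the bound rising above the limit.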

Usually, multiple response data from multiple batches are used to set shelf life. ANCOVA is employed to identify the appropriate stability model for the batches at hand. Regression procedures (1) based on the selected ANCOVA model are used to obtain 95% confidence bounds for each batch. Assignment of shelf life for a product is based on “worst-case”: the response, batch, and side (upper or lower) giving the shortest shelf life is used to set the shelf life for the product. Shelf life estimates are often based on extrapolation beyond storage periods of available stability batches. International Conference on Harmonisation (ICH) Q1E guidelines state that the maximum allowable shelf life is two times the maximum storage period of available stability data (4).
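The worst-case rule and the extrapolation cap stated above reduce to two minimum operations. The following is a small illustrative sketch; the function name `product_shelf_life` and the dictionary keyed by (response, batch, side) are assumptions made for the example, and the cap simply encodes the two-times rule as it is stated in the text.

```python
def product_shelf_life(estimates, max_storage_period):
    """Worst-case product shelf life.

    estimates: dict mapping (response, batch, side) to a shelf-life estimate.
    max_storage_period: longest storage time with available stability data.
    """
    # Worst case: the shortest estimate over responses, batches, and sides.
    worst = min(estimates.values())
    # Extrapolation cap: no more than twice the available storage period.
    return min(worst, 2.0 * max_storage_period)
```

Time units only need to be consistent between the estimates and the storage period (for example, months throughout).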

Because our focus here is on the ANCOVA decision process, we will emphasize this aspect of the computer output in the examples below. The shelf life estimation process involves simple or multiple regression and results in additional tables of computer output. This consists of prediction equations and stability profile graphs for all batches (CICS model) or for each batch (SICS or SISS models), model summary statistics, and an ANOVA that may include a lack-of-fit (LOF) test of nonlinearity in the stability profile. This output will be illustrated in the examples that follow. Interested readers can learn more details about multiple regression from standard statistical textbooks (5).

ANCOVA DATA ANALYSIS USING THE MINITAB STABILITY MACRO
ANCOVA and regression analysis for shelf-life estimation can be obtained using many commercially available statistical packages. We illustrate this process here using a convenient Minitab macro that may be downloaded and saved (3). Once saved, an analysis is made as follows:

1. Start Minitab.
2. Enter stability data into a worksheet using the format given in Figure 2.
3. Select Edit, then Command Line Editor.
4. Type a short script into the Command Line Editor (syntax described below) that describes the type of analysis desired.
5. Choose Submit Commands to execute the script.

The script will invoke the macro, and the ANOVA and regression results, including stability profile graphs and shelf-life estimates, will be produced. A typical stability macro script is given as follows:

%stability ycol tcol bcol;
store out.1-out.n;
itype it;
confidence cl;
life c.1 c.z;
xvalues xpredt xpredb;
nograph;
criteria alpha.

The script syntax consists of a main command (%stability…), given in the first line, and a set of optional subcommands, each given on subsequent lines. The order of appearance of the subcommands is not important. All commands and subcommands must end in a semicolon except the last subcommand, which must end in a period. Each command and subcommand consists of a key word followed by user-specified input parameters whose values tell the macro what worksheet columns to use for data and calculated predictions, and the kind of confidence interval to employ. In the %stability command, ycol indicates the column in your worksheet containing your response (e.g., potency, related substance, or moisture level), tcol indicates the column for storage time, and bcol indicates the column identifying the batch. The bcol worksheet column can be formatted as either numeric (i.e., 1, 2, 3, …) or text (i.e., A, B, C, …). The macro has a limit of up to 100 batches in the worksheet. The other subcommands are explained in Table III.
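When many analyses are scripted, the termination rule (a semicolon after every line except the last, which ends in a period) is easy to get wrong by hand. The helper below is a hypothetical Python convenience, not part of Minitab or the macro; it only assembles text following the rule described above, and ending a script that has no subcommands with a period is an assumption.

```python
def build_stability_script(ycol, tcol, bcol, subcommands=()):
    """Assemble a %stability script as text.

    subcommands: sequence of tuples such as ("itype", -1) or ("nograph",),
                 where the first item is the key word and the rest are its
                 parameters.
    """
    lines = [f"%stability {ycol} {tcol} {bcol}"]
    for keyword, *args in subcommands:
        lines.append(" ".join([keyword, *map(str, args)]))
    # Every line ends in ';' except the last, which ends in '.'.
    terminated = [line + ";" for line in lines[:-1]] + [lines[-1] + "."]
    return "\n".join(terminated)
```

For instance, passing the columns c1, c2, c3 with the subcommands ("store", "c4", "c5", "c6"), ("itype", -1), ("confidence", 0.95), ("life", 95), and ("criteria", 0.25) yields a six-line script whose final line is `criteria 0.25.`.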

Figure 3: Illustration of shelf-life determination for a single batch. Red horizontal lines indicate upper (U) or lower (L) acceptance limits. The solid straight line is the mean regression line, and the dashed line is the upper or lower confidence interval. The maximum batch shelf life is indicated by S.


STABILITY ANALYSIS

The following illustrates five stability analyses using this macro. The potency data used were obtained from an actual literature example (6). The related substance and moisture data are realistic, but artificially constructed.

Example One: Potency Stability (CICS Model, One- or Two-Sided Limit)

Table IV provides potency stability data (%LC) obtained over a 24-month period from B=3 batches (batches numbered 2, 5, and 7) of a drug product. A total of N=31 independent measurements are available. The first three columns of this table are in the format required by the Minitab macro. Notice that independent replicate measurements on each batch are available for months 3-24. Such independent replicates provide a test of the linearity assumption as described below. Note also that we are assuming independence of each measurement here (as discussed in Reference 1), but independence is a key assumption that must be justified. The lower acceptance limit for potency for this product is 95% LC.

We can use the following script to analyze these data and obtain an estimate for the product shelf life:

%stability c1 c2 c3;
store c4 c5 c6;
itype -1;
confidence 0.95;
life 95;
criteria 0.25.

Table III: Stability macro subcommands.

Subcommand: LIFE
Input parameters: c.1 c.z
Definition: Required in order to obtain shelf-life estimation. Specifies the acceptance limit(s) of your response as constants. If you have only an upper or lower spec limit, indicate this using only c.1. Use both c.1 and c.2 for two-sided limits.

Subcommand: STORE
Input parameters: out.1-out.n
Definition: Specifies storage columns for the fitted values and confidence/prediction limits for each row of data. Either 3 or 5 columns are needed for one- or two-sided limits, respectively. These may be separated by spaces (c4 c5 c6 …) or given as a range using a dash (c4-c6). When using the xvalues subcommand, fits and limits are provided only for the batches/times in the columns specified in the xvalues subcommand, and not for every value in the data set.

Subcommand: ITYPE
Input parameters: it
Definition: Defines the type of confidence limit.
it = 1 for an upper confidence bound
it = 0 for a two-sided confidence interval
it = -1 for a lower confidence bound
If ITYPE is not used, the LIFE subcommand parameters are used to select it. If both c.1 and c.z are specified, it is set to 0; if only c.1 is given, it is set to -1. ITYPE must be used if an upper bound is desired.

Subcommand: CONFIDENCE
Input parameters: cl
Definition: cl is the confidence level used to estimate confidence/prediction intervals. By default, cl = 0.95. The type of interval depends on it:
it = 0 produces a two-sided 100*cl% central confidence interval
it = -1 (or 1) produces a single-sided lower (or upper) bound.

Subcommand: XVALUES
Input parameters: xpredt xpredb
Definition: Requests fitted values and limits for batch/time combinations that were not included in your stability data set. The desired times and batches are entered into columns xpredt and xpredb, respectively, prior to invoking the macro. The xvalues subcommand always needs to be used in conjunction with the store subcommand.

Subcommand: NOGRAPH
Input parameters: N/A
Definition: Suppresses the output of graphs.

Subcommand: CRITERIA
Input parameters: alpha
Definition: Defines the significance level used in the ANCOVA F-tests. By default, the significance level is 0.25.

Table V provides the ANCOVA and other computer output. Compare the ANCOVA output in Table V to that shown in Table II and to the ANCOVA decision process shown in Figure 2. The p-value associated with the test for separate slopes (Source = Batch*Time) is 0.797, which is > 0.25, so the data provide no evidence for separate slopes among the batches. The p-value associated with the test for separate intercepts (Source = Batch) is 0.651, which is > 0.25, so the data provide no evidence for separate intercepts among the batches. Consequently, we take CICS as an appropriate stability model for estimating shelf life. As seen in Table V, the Minitab macro output refers to the CICS model as “Model 1”.
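The model choice described for example one can be written as a two-step rule. The Python sketch below reflects the decision logic as it is applied in the examples of this article (slopes are tested first, then intercepts, each against the 0.25 level); the function name choose_stability_model is hypothetical, and the macro's internal implementation is not shown in this article.

```python
def choose_stability_model(p_slopes, p_intercepts, alpha=0.25):
    """Return 'SISS', 'SICS', or 'CICS' from the two ANCOVA p-values.

    p_slopes     : p-value for the Batch*Time source (separate slopes)
    p_intercepts : p-value for the Batch source (separate intercepts)
    alpha        : significance level (0.25 is the macro default)
    """
    if p_slopes < alpha:
        return "SISS"   # slopes differ, so each batch gets its own line
    if p_intercepts < alpha:
        return "SICS"   # common slope, batch-specific intercepts
    return "CICS"       # one line describes all batches


# p-values reported for example one (Table V)
print(choose_stability_model(p_slopes=0.797, p_intercepts=0.651))  # CICS
```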

The output in Table V provides the regression equation with common intercept (100.567 %LC) and slope (-0.192994 %LC/month). The negative slope indicates that potency is decreasing with time. The output includes the following summary statistics.
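The common intercept and slope can be checked independently of Minitab with an ordinary least-squares fit to the potency and month columns of Table IV. The sketch below uses only the Python standard library; the helper name fit_line is hypothetical, and the data are transcribed from Table IV.

```python
# Month and potency columns transcribed from Table IV (31 rows).
months = [0, 0, 0, 1, 1, 1, 2,
          3, 3, 3, 3, 3, 3,
          6, 6, 6, 6, 6, 6,
          12, 12, 12, 12, 12, 12,
          24, 24, 24, 24, 24, 24]
potency = [101.0, 102.0, 101.3, 101.3, 101.4, 101.5, 100.8,
           99.8, 100.2, 100.2, 99.2, 99.7, 99.8,
           99.5, 98.8, 99.0, 97.8, 98.5, 98.5,
           97.4, 98.0, 98.5, 97.2, 97.1, 97.4,
           96.9, 96.6, 96.6, 96.0, 96.1, 96.4]


def fit_line(x, y):
    """Simple linear regression by least squares; returns (intercept, slope)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


intercept, slope = fit_line(months, potency)
print(f"y = {intercept:.3f} + ({slope:.6f}) * time")
print(f"fit at 24 months: {intercept + slope * 24:.3f}")
```

The fitted value at 24 months should agree with the Fit column of Table IV for the 24-month rows.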

S. Root mean square error of the final model 1 fit. This estimates the total analytical standard deviation.

PRESS. Prediction sum-of-squares (PRESS). This gives a robust estimate of your model’s predictive error. In general, the smaller the PRESS value, the better the model’s predictive ability.

R-Sq(pred). A robust version of Adjusted R-Sq useful for comparing models because it is calculated using observations not included in model estimation. Predicted R-Sq normally falls between 0 and 100%, although it can be negative for a model with little predictive ability (see Table XIV). Larger values of predicted R-Sq suggest models of greater predictive ability.

R-Sq(adj). A robust version of R-Sq, the percentage of response variation that is explained by the model, adjusted for the complexity of the model.
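These summary statistics are simple functions of the sums of squares in the ANOVA table. The sketch below recomputes them from the example one values in Table V using the usual textbook definitions, which are assumed here to match what the macro reports. PRESS is taken as given rather than recomputed, and the function name model_summary is hypothetical.

```python
import math


def model_summary(sse, sst, press, n, n_params):
    """Return (S, R-Sq, R-Sq(adj), R-Sq(pred)); the last three as fractions.

    sse      : error sum of squares
    sst      : total sum of squares
    press    : prediction sum of squares (taken from the output)
    n        : number of observations
    n_params : number of fitted coefficients (2 for the CICS line)
    """
    df_error = n - n_params
    s = math.sqrt(sse / df_error)
    r_sq = 1 - sse / sst
    r_sq_adj = 1 - (sse / df_error) / (sst / (n - 1))
    r_sq_pred = 1 - press / sst
    return s, r_sq, r_sq_adj, r_sq_pred


# Values from Table V (example one, CICS model)
s, r_sq, r_sq_adj, r_sq_pred = model_summary(
    sse=18.0580, sst=98.4168, press=20.4369, n=31, n_params=2)
print(f"S = {s:.6f}  R-Sq = {100 * r_sq:.2f}%  "
      f"R-Sq(adj) = {100 * r_sq_adj:.2f}%  R-Sq(pred) = {100 * r_sq_pred:.2f}%")
```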

The output in Table V also includes an ANOVA table. This ANOVA table is similar to that described previously (1), but has a few additional statistical tests. Interested readers are referred to standard statistical textbooks for more information on complex ANOVA (5). One useful feature of the ANOVA in Table V is the LOF test. Simply put, this LOF test splits the model’s residual variation into a lack-of-fit part and a pure-replication part, and compares the two mean squares to form an F ratio. If this ratio is large and the p-value is significant (i.e., < 0.05), either there is evidence for nonlinearity, or the replicates are not truly independent. Such is the case in this example (p-value = 0.0000037). If it is determined that this nonlinearity is impacting the shelf-life estimation, it may be advisable to alter the model, transform the response, or analyze replicate averages rather than individual replicates. We will assume in this example that the LOF has no impact and, for illustration, will use this model to estimate shelf life.
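The LOF F ratio can be reproduced from the Lack-of-Fit and Pure Error rows of the ANOVA table. The short sketch below does this for example one; the function name lack_of_fit_f is hypothetical, and the p-value is not computed because it would require an F-distribution routine that is not in the Python standard library.

```python
def lack_of_fit_f(ss_lof, df_lof, ss_pure, df_pure):
    """F ratio of the lack-of-fit mean square to the pure-error mean square."""
    ms_lof = ss_lof / df_lof
    ms_pure = ss_pure / df_pure
    return ms_lof / ms_pure


# Lack-of-Fit and Pure Error rows of Table V (example one)
f_ratio = lack_of_fit_f(ss_lof=13.1613, df_lof=5, ss_pure=4.8967, df_pure=24)
print(f"LOF F = {f_ratio:.3f} on 5 and 24 degrees of freedom")

# The two pieces add back up to the Error row (SS = 18.0580, DF = 29).
print(f"Error SS = {13.1613 + 4.8967:.4f}, Error DF = {5 + 24}")
```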

The shelf-life estimate for this example is given at the bottom of Table V as 26 months. This estimate is illustrated in Figure 4. This plot shows the individual measurements for each batch as separate colors. The solid black line is the best-fit regression line for the mean potency of all three batches. The red dashed line gives the one-sided lower 95% confidence bound of the mean potency. It can be seen that this line intersects the lower acceptance limit for the product (95% LC) at about 26 months. It is common practice to round a shelf-life estimate down to the nearest whole month.
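The crossing point of the lower confidence bound and the acceptance limit can be located numerically. The sketch below uses the standard confidence-bound formula for a simple linear regression mean, with the intercept, slope, and S taken from Table V and the month summaries computed from Table IV. The critical value 1.699127 (95th percentile of Student's t with 29 degrees of freedom) is an assumption taken from standard t tables, and bisection is used here only for illustration; the macro's own root-finding method is not described in this article.

```python
import math

# Summary quantities for example one (CICS model)
N = 31                          # observations in Table IV
T_BAR = 275 / 31                # mean of the Month column
S_TT = 4597 - 275 ** 2 / 31     # corrected sum of squares of Month
INTERCEPT, SLOPE = 100.567, -0.192994   # Table V regression equation
S = 0.789106                    # Table V root mean square error
T_CRIT = 1.699127               # assumed: t quantile, 0.95, 29 df (from tables)


def lower_bound(month):
    """One-sided lower 95% confidence bound for the mean potency."""
    se_mean = S * math.sqrt(1 / N + (month - T_BAR) ** 2 / S_TT)
    return INTERCEPT + SLOPE * month - T_CRIT * se_mean


def shelf_life(limit, lo=0.0, hi=120.0, tol=1e-6):
    """Bisection for the time at which the lower bound meets the limit.

    Assumes the bound is above the limit at `lo` and below it at `hi`.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if lower_bound(mid) > limit:
            lo = mid
        else:
            hi = mid
    return lo


estimate = shelf_life(95.0)
print(f"estimated shelf life: {estimate:.2f} months")
```

The result should land close to the 26 months reported at the bottom of Table V.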

Table IV: Example one potency stability data and estimated fits and limits.

c1 c2 c3 c4 c5 c6

Potency Month Batch Fit Lower CL Lower PL

1 101.0 0 2 100.567 100.215 99.1808

2 102.0 0 5 100.567 100.215 99.1808

3 101.3 0 7 100.567 100.215 99.1808

4 101.3 1 2 100.374 100.043 98.9928

5 101.4 1 5 100.374 100.043 98.9928

6 101.5 1 7 100.374 100.043 98.9928

7 100.8 2 5 100.181 99.869 98.8043

8 99.8 3 2 99.988 99.693 98.6152

9 100.2 3 5 99.988 99.693 98.6152

10 100.2 3 7 99.988 99.693 98.6152

11 99.2 3 2 99.988 99.693 98.6152

12 99.7 3 5 99.988 99.693 98.6152

13 99.8 3 7 99.988 99.693 98.6152

14 99.5 6 2 99.409 99.154 98.0442

15 98.8 6 5 99.409 99.154 98.0442

16 99.0 6 7 99.409 99.154 98.0442

17 97.8 6 2 99.409 99.154 98.0442

18 98.5 6 5 99.409 99.154 98.0442

19 98.5 6 7 99.409 99.154 98.0442

20 97.4 12 2 98.251 97.994 96.8857

21 98.0 12 5 98.251 97.994 96.8857

22 98.5 12 7 98.251 97.994 96.8857

23 97.2 12 2 98.251 97.994 96.8857

24 97.1 12 5 98.251 97.994 96.8857

25 97.4 12 7 98.251 97.994 96.8857

26 96.9 24 2 95.935 95.436 94.5045

27 96.6 24 5 95.935 95.436 94.5045

28 96.6 24 7 95.935 95.436 94.5045

29 96.0 24 2 95.935 95.436 94.5045

30 96.1 24 5 95.935 95.436 94.5045

31 96.4 24 7 95.935 95.436 94.5045


Notice the additional numbers in columns c4-c6 of Table IV. The stability macro will place these numbers in the worksheet as a result of the store subcommand (see the script above used for this analysis). The Fit and Lower CL (columns c4 and c5) correspond to the black and red dashed lines, respectively, in Figure 4. The Lower PL in Table IV is the lower 95% prediction limit for individual observations. This limit is more conservative (lower) than the 95% confidence bound for the mean (red line) and reflects the scatter of individual values about the fitted line (see Reference 1 for more description). Notice in Table IV that this prediction limit is below the acceptance limit at 24 months. Thus in this case, while a 26-month shelf life for the product may be acceptable from a regulatory point of view, a sponsor may want to consider the risk of out-of-specification results for this product near the end of shelf life.
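The difference between the confidence limit and the prediction limit is a single extra term under the square root. The sketch below recomputes the three stored columns for the 24-month rows of Table IV using the standard simple-regression formulas; as above, the t critical value is an assumption taken from standard tables and the function name lower_limits is hypothetical.

```python
import math

N, T_BAR = 31, 275 / 31                  # from the Month column of Table IV
S_TT = 4597 - 275 ** 2 / 31
INTERCEPT, SLOPE, S = 100.567, -0.192994, 0.789106   # from Table V
T_CRIT = 1.699127        # assumed: t quantile, 0.95, 29 df (from tables)


def lower_limits(month):
    """Return (fit, lower confidence limit, lower prediction limit)."""
    fit = INTERCEPT + SLOPE * month
    leverage = 1 / N + (month - T_BAR) ** 2 / S_TT
    lower_cl = fit - T_CRIT * S * math.sqrt(leverage)       # mean response
    lower_pl = fit - T_CRIT * S * math.sqrt(1 + leverage)   # single new result
    return fit, lower_cl, lower_pl


fit, lower_cl, lower_pl = lower_limits(24)
print(f"fit = {fit:.3f}, lower CL = {lower_cl:.3f}, lower PL = {lower_pl:.4f}")
```

The extra 1 under the square root represents the variance of one new measurement, which is why the prediction limit falls below 95 %LC at 24 months while the confidence limit does not.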

Table V: Example one ANCOVA, regression, ANOVA, and estimated shelf-life output from the Minitab stability macro.

ANCOVA

Source        DF   Seq SS   Seq MS        F       P
Time           1   80.359   80.359  117.167   0.000
Batch          2    0.598    0.299    0.436   0.651
Batch*Time     2    0.314    0.157    0.229   0.797
Error         25   17.146    0.686
Total         30   98.417

Model 1 Analysis

Regression Equation

y = 100.567 - 0.192994 time

Summary of Model

S = 0.789106   R-Sq = 81.65%   R-Sq(adj) = 81.02%
PRESS = 20.4369   R-Sq(pred) = 79.23%

Analysis of Variance

Source        DF   Seq SS   Adj SS   Seq MS        F          P
Regression     1  80.3588  80.3588  80.3588  129.051  0.0000000
Time           1  80.3588  80.3588  80.3588  129.051  0.0000000
Error         29  18.0580  18.0580   0.6227
Lack-of-Fit    5  13.1613  13.1613   2.6323   12.901  0.0000037
Pure Error    24   4.8967   4.8967   0.2040
Total         30  98.4168

Estimated shelf-life for all batches: 26

Figure 4: Example one potency stability profile for all batches based on a CICS model and a one-sided lower acceptance limit.

So far we have assumed a one-sided lower limit of 95%LC. If the product had an upper limit of 105%LC as well and there were a risk of batches exceeding the upper limit, then we might want a shelf life based on a two-sided 95% confidence interval. In that case we could use the following analysis script:

%stability c1 c2 c3;
life 95 105.

The resulting stability profile is shown in Figure 5. Notice in this case that the shelf-life estimate is slightly lower (25.5 months, which we would likely round down to 25 months). This is because a two-sided 95% interval is wider than a one-sided 95% bound and thus intersects the limit sooner.

Example Two: Potency Stability (SICS Model Two, One-Sided Lower Limit)

Another set of potency stability data is given in columns C1-C3 of Table VI. As before, we will assume a one-sided lower acceptance limit of 95%LC.

We will use the following script to estimate the product shelf life based on these data:

%stability c1 c2 c3;
store c4 c5 c6;
itype -1;
confidence 0.95;
life 95;
criteria 0.25.

The ANCOVA and other statistical output from this analysis are given in Table VII.

There is no evidence for separate slopes (p-value = 0.834). However, there is evidence for separate intercepts (p-value < 0.001). A comparison with the ANCOVA decision process of Figure 2 shows that the SICS model is appropriate in this case. The regression equations in Table VII show that the estimated slope (-0.213121 %LC/month) is common to each batch, but the intercepts differ. As in example one, the LOF test is significant (p-value = 0.0258), but we will assume that the straight-line assumption is adequate for illustration purposes here.
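The SICS estimates can be checked by hand: the common slope pools the within-batch sums of squares and cross products, and each intercept then follows from that batch's means. The sketch below applies this textbook calculation to the data in columns C1-C3 of Table VI using only the Python standard library. The function name fit_common_slope is hypothetical, and this is not claimed to be the computation the macro performs internally.

```python
from collections import defaultdict

# (potency, month, batch) rows transcribed from Table VI
rows = [
    (104.8, 0, 3), (104.0, 0, 4), (102.0, 0, 5), (101.4, 1, 5),
    (100.8, 2, 5), (103.0, 3, 3), (103.2, 3, 4), (100.2, 3, 5),
    (101.2, 3, 3), (99.7, 3, 5), (100.8, 6, 3), (102.8, 6, 4),
    (98.8, 6, 5), (99.2, 6, 3), (103.3, 6, 4), (98.5, 6, 5),
    (98.6, 12, 3), (102.4, 12, 4), (98.0, 12, 5), (97.2, 12, 3),
    (101.2, 12, 4), (97.1, 12, 5), (97.6, 24, 3), (99.1, 24, 4),
    (96.6, 24, 5), (98.0, 24, 3), (99.5, 24, 4), (96.1, 24, 5),
]


def fit_common_slope(data):
    """Separate-intercepts-common-slope fit; returns (slope, {batch: intercept})."""
    groups = defaultdict(list)
    for y, t, batch in data:
        groups[batch].append((t, y))

    s_tt = s_ty = 0.0
    means = {}
    for batch, pts in groups.items():
        t_bar = sum(t for t, _ in pts) / len(pts)
        y_bar = sum(y for _, y in pts) / len(pts)
        means[batch] = (t_bar, y_bar)
        # pool within-batch sums of squares and cross products
        s_tt += sum((t - t_bar) ** 2 for t, _ in pts)
        s_ty += sum((t - t_bar) * (y - y_bar) for t, y in pts)

    slope = s_ty / s_tt
    intercepts = {b: y_bar - slope * t_bar for b, (t_bar, y_bar) in means.items()}
    return slope, intercepts


slope, intercepts = fit_common_slope(rows)
print(f"common slope = {slope:.6f}")
for batch in sorted(intercepts):
    print(f"batch {batch}: intercept = {intercepts[batch]:.3f}")
```

The printed slope and intercepts can be compared with the regression equations in Table VII.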

Figure 6 provides the separate stability profiles for each batch. Because the intercepts differ, the macro produces a separate plot for each batch. The shelf life estimated for each batch, based on when its 95% confidence lower bound crosses the acceptance limit of 95%LC, is given on the upper right corner of each plot. Batch 5 has the lowest estimated shelf life (23.4 months). Therefore, by the “worst-case” logic of pharmaceutical shelf-life estimation, Batch 5 limits the shelf life for the product to 23.4 months, as is also indicated in Table VII. In practice, we would likely round this down to 23 months. As described in example one, columns C4-C6 of Table VI provide the numeric Fit and interval estimates based on the store subcommand request.

Example Three: Potency Stability (SISS Model, One-Sided Lower Limit With Predictions)

Yet another set of potency stability data is provided in columns C1-C3 of Table VIII.

These data are analyzed using the following script:

%stability c1 c2 c3;
store c4 c5 c6;
itype -1;
confidence 0.95;
life 95;
criteria 0.25.

Table IX shows the ANCOVA and other statistical output from this analysis.

There is evidence for both separate slopes (p-value = 0.17) and intercepts (p-value < 0.01). Both p-values are below the regulatory limit of 0.25. A comparison with the ANCOVA decision process of Figure 2 shows that the SISS model is appropriate in this case. The regression equations for each batch are given in Table IX, and the slopes and intercepts differ for each batch as expected. We note that in this case, the LOF test is not statistically significant (p-value = 0.100568). For this test we use the traditional Type I error rate of 0.05 to judge statistical significance.

Figure 5: Example one potency stability profile for all batches based on a CICS model and a two-sided acceptance limit.


Figure 6: Example two potency stability profiles for each batch based on a SICS model and a one-sided lower acceptance limit.


Table VI: Example two potency stability data and estimated fits and limits.

C1 C2 C3 C4 C5 C6

Potency Month Batch Fit Lower CL Lower PL

1 104.8 0 3 102.176 101.434 100.192

2 104.0 0 4 104.255 103.463 102.252

3 102.0 0 5 100.820 100.163 98.866

4 101.4 1 5 100.607 99.971 98.660

5 100.8 2 5 100.394 99.777 98.453

6 103.0 3 3 101.536 100.857 99.575

7 103.2 3 4 103.616 102.887 101.637

8 100.2 3 5 100.181 99.581 98.245

9 101.2 3 3 101.536 100.857 99.575

10 99.7 3 5 100.181 99.581 98.245

11 100.8 6 3 100.897 100.261 98.950

12 102.8 6 4 102.976 102.295 101.014

13 98.8 6 5 99.541 98.977 97.617

14 99.2 6 3 100.897 100.261 98.950

15 103.3 6 4 102.976 102.295 101.014

16 98.5 6 5 99.541 98.977 97.617

17 98.6 12 3 99.618 98.999 97.677

18 102.4 12 4 101.698 101.045 99.745

19 98.0 12 5 98.263 97.688 96.335

20 97.2 12 3 99.618 98.999 97.677

21 101.2 12 4 101.698 101.045 99.745

22 97.1 12 5 98.263 97.688 96.335

23 97.6 24 3 97.061 96.215 95.035

24 99.1 24 4 99.140 98.291 97.113

25 96.6 24 5 95.705 94.853 93.677

26 98.0 24 3 97.061 96.215 95.035

27 99.5 24 4 99.140 98.291 97.113

28 96.1 24 5 95.705 94.853 93.677


Table VII: Example two ANCOVA, regression, ANOVA, and estimated shelf-life output from the Minitab stability macro.

ANCOVA

Source DF Seq SS Seq MS F P

Time 1 74.489 74.489 60.008 0.000

Batch 2 53.968 26.984 21.738 0.000

Batch*Time 2 0.455 0.227 0.183 0.834

Error 22 27.309 1.241

Total 27 156.221

Model 2 Analysis

Regression Equation

batch

3 y = 102.176 - 0.213121 time

4 y = 104.255 - 0.213121 time

5 y = 100.82 - 0.213121 time

Summary of Model

S = 1.07556 R-Sq = 82.23% R-Sq(adj) = 80.01%

PRESS = 37.3301 R-Sq(pred) = 76.10%

Analysis of Variance

Source DF Seq SS Adj SS Seq MS F P

Regression 3 128.457 128.457 42.8191 37.0144 0.0000000

time 1 74.489 88.734 74.4895 64.3914 0.0000000

batch 2 53.968 53.968 26.9839 23.3259 0.0000024

Error 24 27.764 27.764 1.1568

Lack-of-Fit 13 22.179 22.179 1.7061 3.3602 0.0258372

Pure Error 11 5.585 5.585 0.5077

Total 27 156.221

Overall minimum estimated shelf-life: 23.4

Stability profiles for each batch are given in Figure 7. As seen in Figure 7 and Table IX, the product shelf life estimated by these data is limited by Batch 8 to 15.6 months. We would likely round this down to 15 months in practice. However, it would be interesting in this case to see what potencies the model would predict for these batches at 15 months. No real stability testing was done at 15 months of storage, but we can use the stability model to obtain estimates by including the desired times and batch numbers in columns c4 and c5, respectively, prior to the analysis and employing the following script:

%stability c1 c2 c3;
itype 0;
confidence 0.95;
life 95 105;
xvalues c4 c5;
store c6 c7 c8 c9 c10.

For illustration, we are requesting two-sided 95% confidence limits (it=0). This amounts to requesting a 97.5% confidence lower bound, which is more conservative than a 95% confidence lower bound. The same lower bound could be obtained using it = -1 and cl = 0.975. Columns C4 and C5 contain the time points and batches for which we want predictions. The above script performs the fit as given previously in Table IX, and the xvalues subcommand produces the predictions in columns C6-C10 of Table X. Note that for the limiting batch 8 the lower confidence bound is still within the limit of 95%LC, although the lower prediction bound, which reflects individual result variation, is below the acceptance limit.

Table VIII: Example three potency data and estimated fits and limits.

Row       C1     C2     C3       C4        C5        C6
     Potency  Month  Batch      Fit  Lower CL  Lower PL
  1    104.0      0      4  104.071   103.402   102.729
  2    102.0      0      5  100.782   100.280    99.515
  3    101.6      0      8  101.259   100.375    99.798
  4    101.4      1      5  100.573   100.101    99.318
  5    100.8      2      5  100.365    99.919    99.119
  6    103.2      3      4  103.482   102.921   102.191
  7    100.2      3      5  100.156    99.736    98.919
  8    100.0      3      8  100.269    99.618    98.936
  9     99.7      3      5  100.156    99.736    98.919
 10    102.8      6      4  102.894   102.419   101.637
 11     98.8      6      5   99.530    99.164    98.311
 12     99.0      6      8   99.278    98.754    98.002
 13    103.3      6      4  102.894   102.419   101.637
 14     98.5      6      5   99.530    99.164    98.311
 15    102.4     12      4  101.717   101.302   100.482
 16     98.0     12      5   98.279    97.897    97.054
 17     97.8     12      8   97.297    96.514    95.895
 18    101.2     12      4  101.717   101.302   100.482
 19     97.1     12      5   98.279    97.897    97.054
 20     97.0     12      8   97.297    96.514    95.895
 21     99.1     24      4   99.363    98.605    97.975
 22     96.6     24      5   95.775    95.027    94.392
 23     99.5     24      4   99.363    98.605    97.975
 24     96.1     24      5   95.775    95.027    94.392
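The 15-month predictions in Table X can be reproduced from the Table IX regression equations, and the two-sided limits for batch 8 can be recomputed from that batch's own sampling times (0, 3, 6, 12, and 12 months in Table VIII) together with the pooled S. The sketch below assumes the standard per-batch interval formula with the pooled error variance and takes the critical value 2.100922 (97.5th percentile of Student's t with 18 degrees of freedom) from standard t tables. Both are assumptions rather than details confirmed by the macro documentation, and the function names are hypothetical.

```python
import math

# Table IX regression equations: batch -> (intercept, slope)
EQUATIONS = {4: (104.071, -0.196151),
             5: (100.782, -0.208609),
             8: (101.259, -0.330208)}
S = 0.670850              # pooled root mean square error (Table IX)
T_CRIT = 2.100922         # assumed: t quantile, 0.975, 18 df (from tables)
BATCH8_MONTHS = [0, 3, 6, 12, 12]   # sampling times for batch 8 (Table VIII)


def fitted(batch, month):
    intercept, slope = EQUATIONS[batch]
    return intercept + slope * month


def two_sided_limits(batch, month, months_tested):
    """Return (lower CL, upper CL, lower PL, upper PL) for one batch."""
    n = len(months_tested)
    t_bar = sum(months_tested) / n
    s_tt = sum((t - t_bar) ** 2 for t in months_tested)
    leverage = 1 / n + (month - t_bar) ** 2 / s_tt
    fit = fitted(batch, month)
    half_cl = T_CRIT * S * math.sqrt(leverage)
    half_pl = T_CRIT * S * math.sqrt(1 + leverage)
    return fit - half_cl, fit + half_cl, fit - half_pl, fit + half_pl


for batch in (4, 5, 8):
    print(f"batch {batch}: fit at 15 months = {fitted(batch, 15):.3f}")

lower_cl, upper_cl, lower_pl, upper_pl = two_sided_limits(8, 15, BATCH8_MONTHS)
print(f"batch 8: CL = ({lower_cl:.3f}, {upper_cl:.3f}), "
      f"PL = ({lower_pl:.4f}, {upper_pl:.3f})")
```

Small differences in the last printed digit relative to Table X are expected because the coefficients are entered here as rounded in Table IX.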

Example Four: Related Substance Stability (SISS Model Three, One-Sided Upper Limit)

To illustrate estimation of shelf life for a response whose level increases on storage, we will use the data for a related substance (degradation product of the active ingredient) given in columns C1-C3 of Table XI. The levels in column C1 are expressed as a percent of label claim for the active ingredient, and the upper limit for this particular related substance is assumed to be 0.3%LC.

We can obtain the shelf life based on this response by using the following script:

%stability c1 c2 c3;
store c4 c5 c6;
itype 1;
confidence 0.95;
life 0.3;
criteria 0.25.

Notice in this case that we are requesting a one-sided upper confidence limit (it=1) of 95% (cl=0.95). The output from this analysis is shown in Table XII.

As in example three, the ANCOVA output in Table XII indicates an SISS model. The separate slopes and intercepts are given in Table XII along with an LOF test that is not statistically significant, and an estimated shelf life of 15.61 months (which we would usually round down to 15 months).

Stability profiles for these batches are given in Figure 8, which confirms that batch 8 is the stability-limiting batch for the product shelf life. Numeric predictions, requested using the STORE subcommand, are given in columns C4-C6 of Table XI.
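For a response that increases on storage, the shelf life is the time at which the upper confidence bound reaches the upper limit. The sketch below locates that crossing for batch 8 using its regression equation and the pooled S from Table XII, its sampling times from Table XI (0, 3, 6, 12, and 12 months), and an assumed critical value of 1.734064 (95th percentile of Student's t with 18 degrees of freedom, from standard tables). The per-batch bound formula with pooled error variance and the bisection search are illustrative assumptions, not a description of the macro's internals.

```python
import math

INTERCEPT, SLOPE = 0.112219, 0.00990625   # batch 8 equation, Table XII
S = 0.0201255                             # pooled root mean square error
T_CRIT = 1.734064       # assumed: t quantile, 0.95, 18 df (from tables)
MONTHS = [0, 3, 6, 12, 12]                # batch 8 sampling times, Table XI

N = len(MONTHS)
T_BAR = sum(MONTHS) / N
S_TT = sum((t - T_BAR) ** 2 for t in MONTHS)


def upper_bound(month):
    """One-sided upper 95% confidence bound for the batch 8 mean level."""
    se_mean = S * math.sqrt(1 / N + (month - T_BAR) ** 2 / S_TT)
    return INTERCEPT + SLOPE * month + T_CRIT * se_mean


def shelf_life_upper(limit, lo=0.0, hi=120.0, tol=1e-6):
    """Bisection; assumes the bound is below the limit at lo and above at hi."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if upper_bound(mid) < limit:
            lo = mid
        else:
            hi = mid
    return lo


estimate = shelf_life_upper(0.3)
print(f"batch 8 shelf life: {estimate:.2f} months")
```

The result should be close to the 15.61 months reported in Table XII.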

Example Five: Moisture Stability (CICS Model, Two-Sided Limits)

As a final example of a response that may either increase or decrease on storage, we examine the moisture data given in columns C1-C3 of Table XIII. The moisture measurements in column C1 have units of %(w/w). We will take the acceptance limits for this product to be 1.5 to 3.5 %(w/w).

We can analyze these data using the following script. Notice that we have specified both the lower and upper acceptance limits using the life subcommand and requested two-sided confidence limits using the itype subcommand.

%stability c1 c2 c3;
itype 0;
confidence 0.95;
life 1.5 3.5;
criteria 0.25.

The results of this analysis are provided in Table XIV. Notice in this case, the ANCOVA analysis leads to the CICS model because neither the test for separate slopes nor the test for separate intercepts is statistically significant


Table IX: Example three ANCOVA, regression, ANOVA, and estimated shelf-life output from the Minitab stability macro.

ANCOVA

Source DF Seq SS Seq MS F P

Time 1 45.451 45.451 100.994 0.00

Batch 2 64.918 32.459 72.124 0.00

Batch*Time 2 1.760 0.880 1.955 0.17

Error 18 8.101 0.450

Total 23 120.230

Model 3 Analysis

Regression Equation

batch

4 y = 104.071 - 0.196151 time

5 y = 100.782 - 0.208609 time

8 y = 101.259 - 0.330208 time

Summary of Model

S = 0.670850 R-Sq = 93.26% R-Sq(adj) = 91.39%

PRESS = 13.5446 R-Sq(pred) = 88.73%

Analysis of Variance

Source DF Seq SS Adj SS Seq MS F P

Regression 5 112.129 112.129 22.4258 49.831 0.000000

time 1 45.451 45.950 45.4513 100.994 0.000000

batch 2 64.918 21.635 32.4588 72.124 0.000000

time*batch 2 1.760 1.760 0.8800 1.955 0.170420

Error 18 8.101 8.101 0.4500

Lack-of-Fit 10 6.156 6.156 0.6156 2.532 0.100568

Pure Error 8 1.945 1.945 0.2431

Total 23 120.230

Overall minimum estimated Shelf-Life: 15.6

Table X: Example three fit, confidence limit, and prediction limit estimates for time and batch combinations not present in the stability data.

C4 C5 C6 C7 C8 C9 C10

Xvalue_Month Xvalue_Batch Fit Lower CL Upper CL Lower PL Upper PL

15 4 101.128 100.574 101.683 99.6139 102.643

15 5 97.653 97.110 98.195 96.1426 99.163

15 8 96.306 95.036 97.577 94.4088 98.204


Figure 7: Example three potency stability profiles for each batch based on a SISS model and a one-sided lower acceptance limit.


Table XI: Example four related substance stability data and estimated fits and limits.

C1 C2 C3 C4 C5 C6

Related Month Batch Fit Upper CL Upper PL

1 0.030 0 4 0.027881 0.047950 0.068139

2 0.054 3 4 0.045534 0.062376 0.084284

3 0.066 6 4 0.063188 0.077421 0.100878

4 0.051 6 4 0.063188 0.077421 0.100878

5 0.078 12 4 0.098495 0.110942 0.135547

6 0.114 12 4 0.098495 0.110942 0.135547

7 0.177 24 4 0.169110 0.191852 0.210765

8 0.165 24 4 0.169110 0.191852 0.210765

9 0.090 0 5 0.126544 0.141610 0.164556

10 0.108 1 5 0.132802 0.146984 0.170472

11 0.126 2 5 0.139060 0.152420 0.176429

12 0.144 3 5 0.145319 0.157933 0.182427

13 0.159 3 5 0.145319 0.157933 0.182427

14 0.186 6 5 0.164093 0.175072 0.200678

15 0.195 6 5 0.164093 0.175072 0.200678

16 0.210 12 5 0.201643 0.213096 0.238373

17 0.237 12 5 0.201643 0.213096 0.238373

18 0.252 24 5 0.276742 0.299188 0.318236

19 0.267 24 5 0.276742 0.299188 0.318236

20 0.102 0 8 0.112219 0.138754 0.156060

21 0.150 3 8 0.141938 0.161447 0.181919

22 0.180 6 8 0.171656 0.187385 0.209936

23 0.216 12 8 0.231094 0.254586 0.273163

24 0.240 12 8 0.231094 0.254586 0.273163


Table XII: Example four ANCOVA, regression, ANOVA, and estimated shelf-life output from the Minitab stability macro.

ANCOVA

Source DF Seq SS Seq MS F P

Time 1 0.041 0.041 100.994 0.00

Batch 2 0.058 0.029 72.124 0.00

Batch*Time 2 0.002 0.001 1.955 0.17

Error 18 0.007 0.000

Total 23 0.108

Model 3 Analysis

Regression Equation

batch

4 y = 0.0278806 + 0.00588454 time

5 y = 0.126544 + 0.00625826 time

8 y = 0.112219 + 0.00990625 time

Summary of Model

S = 0.0201255 R-Sq = 93.26% R-Sq(adj) = 91.39%

PRESS = 0.0121902 R-Sq(pred) = 88.73%

Analysis of Variance

Source DF Seq SS Adj SS Seq MS F P

Regression 5 0.100916 0.100916 0.0201832 49.831 0.000000

time 1 0.040906 0.041355 0.0409062 100.994 0.000000

batch 2 0.058426 0.019472 0.0292129 72.124 0.000000

time*batch 2 0.001584 0.001584 0.0007920 1.955 0.170420

Error 18 0.007291 0.007291 0.0004050

Lack-of-Fit 10 0.005540 0.005540 0.0005540 2.532 0.100568

Pure Error 8 0.001751 0.001751 0.0002188

Total 23 0.108207

Overall minimum estimated shelf-life: 15.61


Figure 8: Example four related substance stability profiles for each batch based on a SISS model and a one-sided upper acceptance limit.

Table XIII: Example five moisture stability data.

C1 C2 C3

Moisture Month Batch

1 2.20059 0 1

2 1.70372 1 1

3 3.32395 2 1

4 2.75907 3 1

5 2.43192 3 1

6 1.76331 6 1

7 1.56801 6 1

8 2.19423 12 1

9 3.22311 12 1

10 3.16325 24 1

11 1.54837 24 1

12 2.81078 0 2

13 1.94915 1 2

14 2.49058 2 2

15 2.00485 3 2

16 3.30700 3 2

17 2.99309 6 2

18 3.30159 6 2

19 2.72512 12 2

20 1.88341 12 2

21 2.77215 24 2

22 1.69048 24 2

23 2.45301 0 3

24 2.16138 1 3

25 2.26631 2 3

26 2.12853 3 3

27 2.51775 3 3

28 2.31034 6 3

29 3.36915 6 3

30 2.32070 12 3

31 2.72001 12 3

32 2.19393 24 3

33 3.45895 24 3


(i.e., p-values of 0.483 and 0.705, respectively). The stability profile given in Figure 9 indicates a shelf life for all batches of 45.35 months, which agrees with the estimate at the bottom of Table XIV. In this case, it is the 95% confidence upper bound that crosses the upper limit earliest and that, therefore, governs the product shelf life.

Table XIV: Example five ANCOVA, regression, ANOVA, and estimated shelf-life output from the Minitab stability macro.

ANCOVA

Source        DF   Seq SS   Seq MS      F      P
Time           1    0.012    0.012  0.033  0.858
Batch          2    0.251    0.125  0.354  0.705
Batch*Time     2    0.531    0.265  0.748  0.483
Error         27    9.573    0.355
Total         32   10.366

Model 1 Analysis

Regression Equation

y = 2.45678 + 0.0022724 time

Summary of Model

S = 0.577939   R-Sq = 0.11%   R-Sq(adj) = -3.11%
PRESS = 12.0483   R-Sq(pred) = -16.23%

Analysis of Variance

Source        DF   Seq SS   Adj SS    Seq MS         F         P
Regression     1   0.0116   0.0116  0.011599  0.034726  0.853386
time           1   0.0116   0.0116  0.011599  0.034726  0.853386
Error         31  10.3544  10.3544  0.334013
Lack-of-Fit    5   1.0545   1.0545  0.210898  0.589613  0.707875
Pure Error    26   9.2999   9.2999  0.357689
Total         32  10.3660

Estimated shelf life for all batches: 45.35

CONCLUSION

We have illustrated here the ANCOVA process that is used to set product shelf life for pharmaceutical products. We have also illustrated the use of a convenient Minitab macro that can be used to perform the ANCOVA analysis, choose the appropriate stability model, and execute the multiple regressions to estimate shelf life and produce other useful statistical tests and statistics. The macro is flexible enough to handle a variety of common situations and produces graphics that serve as useful regression diagnostics.

It is essential to stress here the critical aspect of software validation. Validation is a regulatory requirement for any software used to estimate pharmaceutical product shelf life. Reliance on any statistical software, whether “validated” or not, carries with it the risk of producing misleading results. It is incumbent on the users of statistical software to determine, not only that the statistical packages they use can produce accurate results, given a battery of standard data sets, but also that the statistical model and other assumptions being made apply to the particular data set being analyzed, and that data and command language integrity are maintained. It is not uncommon for a computer package to perform differently when installed on different computing equipment, in different environments, or when used under different operating systems. In our hands, using a number of representative data sets, the Minitab Stability macro performs admirably compared to other statistical packages such as JMP, SAS, and R. However, we can make no general claim that it will not be found lacking in other environments. Readers are advised to enlist the aid of local statisticians to assure that the statistical packages they use are properly validated.

REFERENCES

1. Hu Yanhui, “Linear Regression 101,” Journal of Validation Technology 17(2), 15-22, 2011.
2. LeBlond D., “Chapter 23,” Statistical Design and Analysis of Long-Term Stability Studies for Drug Products, In Qui Y, Chen Y, Zhang G, Liu L, Porter W (Eds.), 539-561, 2009.
3. Minitab Stability Studies Macro (2011). A technical support document describing the use of the Macro in Minitab version 16 is available from the Minitab Knowledgebase at http://www.minitab.com/support/answers/answer.aspx?id=2686.
4. International Conference on Harmonization. ICH Q1E, Step 4: Evaluation for Stability Data, 2003. http://www.ich.org/products/guidelines/quality/article/quality-guidelines.html
5. Neter J, Kutner MH, Nachtsheim CJ, and Wasserman W, Applied Linear Statistical Models, Chapter 23, 3rd edition, Irwin, Chicago, 1996.
6. Schuirmann DJ, “Current Statistical Approaches in the Center for Drug Evaluation and Research, FDA,” Proceedings of Stability Guidelines, AAPS and FDA Joint Conference, Arlington, VA, Dec 11-12, 1989.

ARTICLE ACRONYM LISTING

ANCOVA      Analysis of Covariance
ANOVA       Analysis of Variance
API         Active Pharmaceutical Ingredient
CICS        Common Intercept and Common Slope
CL          Confidence Limit
DF          Degrees of Freedom
LOF         Lack of Fit
%LC         Percent of Label Claim
MSE         Mean Square Error
PL          Prediction Limit
PRESS       Predicted Residual Sum of Squares
RMSE        Root Mean Squared Error
R-Sq        R-square
R-Sq(adj)   Adjusted R-square
R-Sq(pred)  Prediction R-square
SICS        Separate Intercept and Common Slope
SISS        Separate Intercept and Separate Slope

Figure 9: Example five moisture stability profile for all batches based on a CICS model and a two-sided acceptance limit.