5. Regression in the Toolbar of Minitab’s Help

  • 7/27/2019 5. Regression in the Toolbar of Minitabs Help

    Regression in the Toolbar of Minitab’s Help

    1. Example of simple linear regression

    You are a manufacturer who wants to obtain a quality measure on a

    product, but the procedure to obtain the measure is expensive. There is

    an indirect approach, which uses a different product score (Score 1) in

    place of the actual quality measure (Score 2). This approach is less costly

    but also is less precise. You can use regression to see if Score 1 explains a

    significant amount of variance in Score 2 to determine if Score 1 is an

    acceptable substitute for Score 2.

    1 Open the worksheet EXH_REGR.MTW.

    2 Choose Stat > Regression > Regression.

    3 In Response, enter Score2.

    4 In Predictors, enter Score1.

    5 Click OK.
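For readers without Minitab at hand, the fit that these steps request can be sketched in Python with NumPy. This is only an illustration of ordinary least squares, not Minitab's implementation. The arrays below are hypothetical values invented for the sketch, not the contents of EXH_REGR.MTW, and the helper name `fit_line` is an assumption.

```python
import numpy as np

# Hypothetical scores for illustration only (NOT the worksheet data).
score1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
score2 = np.array([1.1, 1.9, 3.2, 3.9, 5.1, 6.0])

def fit_line(x, y):
    """Least-squares fit of y = b0 + b1*x; returns (b0, b1, r_squared)."""
    x_mean, y_mean = x.mean(), y.mean()
    sxx = np.sum((x - x_mean) ** 2)
    sxy = np.sum((x - x_mean) * (y - y_mean))
    b1 = sxy / sxx                      # slope
    b0 = y_mean - b1 * x_mean           # intercept
    sse = np.sum((y - (b0 + b1 * x)) ** 2)   # error sum of squares
    sst = np.sum((y - y_mean) ** 2)          # total sum of squares
    return b0, b1, 1.0 - sse / sst

b0, b1, r2 = fit_line(score1, score2)
print(f"Score2 = {b0:.3f} + {b1:.3f} * Score1, R-sq = {100 * r2:.1f}%")
```

The R-sq figure printed here is the same quantity that Minitab reports in the Session window: the share of the variance in the response that the fitted line accounts for.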

    Session window output

    Interpreting the results

    Minitab displays the results in the Session window by default.

    The p-value in the Analysis of Variance table (0.000) indicates

    that the relationship between Score 1 and Score 2 is statistically

    significant at an α-level of 0.05. This is also shown by the p-value

    for the estimated coefficient of Score 1, which is 0.000.

    The R² value shows that Score 1 explains 95.7% of the variance in

    Score 2, indicating that the model fits the data extremely well.

    Observation 9 is identified as an unusual observation because its

    standardized residual is less than -2. This could indicate that this

    observation is an outlier. See Identifying outliers.
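The rule used above (flag an observation when its standardized residual is below -2 or above 2) can be sketched as follows. The sketch assumes the usual definition of a standardized residual, the raw residual divided by s·sqrt(1 - h), where s is the residual standard deviation and h is the observation's leverage. The data and the function name `standardized_residuals` are hypothetical; the last response is made deliberately low so that one point is flagged.

```python
import numpy as np

def standardized_residuals(x, y):
    """Standardized residuals e_i / (s * sqrt(1 - h_ii)) for y = b0 + b1*x."""
    X = np.column_stack([np.ones(len(x)), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    n, p = X.shape
    s = np.sqrt(np.sum(resid ** 2) / (n - p))   # residual standard deviation
    # Leverages: diagonal of the hat matrix X (X'X)^-1 X'.
    h = np.einsum("ij,jk,ik->i", X, np.linalg.inv(X.T @ X), X)
    return resid / (s * np.sqrt(1.0 - h))

# Hypothetical data (NOT the worksheet values).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0])
y = np.array([1.1, 2.0, 2.9, 4.2, 5.0, 6.1, 6.9, 8.0, 9.1, 7.0])

r = standardized_residuals(x, y)
unusual = np.flatnonzero(np.abs(r) > 2) + 1   # 1-based observation numbers
print("Unusual observations:", unusual.tolist())
```

A flagged point is a candidate outlier, not a confirmed one; as in the text, it is a prompt to inspect the observation.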

    Because the model is significant and explains a large part of the variance

    in Score 2, the manufacturer decides to use Score 1 in place of Score 2 as

    a quality measure for the product.

    2. Example of multiple regression

    As part of a test of solar thermal energy, you measure the total heat

    flux from homes. You wish to examine whether total heat flux (HeatFlux)

    can be predicted by the position of the focal points in the east, south, and

    north directions. Data are from [27]. You found, using best subsets

    regression, that the best two-predictor model included the variables

    North and South and the best three-predictor model added the variable East. You

    evaluate the three-predictor model using multiple regression.

    1 Open the worksheet EXH_REGR.MTW.

    2 Choose Stat > Regression > Regression.

    3 In Response, enter HeatFlux.

    4 In Predictors, enter East South North.

    5 Click Graphs.

    6 Under Residuals for Plots, choose Standardized.

    7 Under Residual Plots, choose Individual Plots. Check

    Histogram of residuals, Normal plot of residuals, and

    Residuals versus fits. Click OK.

    8 Click Options. Under Display, check PRESS and predicted

    R-square. Click OK in each dialog box.
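As a rough counterpart to the dialog steps above, the following Python sketch fits a three-predictor model by ordinary least squares and reports each coefficient with its standard error and t-ratio. It is not Minitab's code. The data are synthetic stand-ins generated with a fixed seed, not the HeatFlux worksheet, and `fit_multiple` is a hypothetical helper name.

```python
import numpy as np

def fit_multiple(X, y):
    """OLS with an intercept; returns coefficients, standard errors, t-ratios."""
    A = np.column_stack([np.ones(len(y)), X])    # prepend intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    n, p = A.shape
    mse = np.sum(resid ** 2) / (n - p)           # error variance estimate
    se = np.sqrt(mse * np.diag(np.linalg.inv(A.T @ A)))
    return beta, se, beta / se

# Synthetic stand-ins for the East, South, North columns (NOT the worksheet data).
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 3))
y = (5.0 + 0.5 * X[:, 0] + 2.0 * X[:, 1] - 3.0 * X[:, 2]
     + rng.normal(scale=0.5, size=30))

beta, se, t = fit_multiple(X, y)
for name, b, s, tv in zip(["Constant", "East", "South", "North"], beta, se, t):
    print(f"{name:8s} coef={b:8.3f}  SE={s:6.3f}  T={tv:7.2f}")
```

Turning a t-ratio into a p-value needs the t distribution with n - p degrees of freedom, which the sketch leaves out to stay NumPy-only.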

    Session window output

    Interpreting the results

    Session window output

    The p-value in the Analysis of Variance table (0.000) shows that

    the model estimated by the regression procedure is significant at an

    α-level of 0.05. This indicates that at least one coefficient is different

    from zero.

    The p-values for the estimated coefficients of North and South are

    both 0.000, indicating that they are significantly related to HeatFlux.

    The p-value for East is 0.092, indicating that it is not significantly related to

    HeatFlux at an α-level of 0.05. Additionally, the sequential sum of

    squares indicates that the predictor East doesn't explain a

    substantial amount of unique variance. This suggests that a model with

    only North and South may be more appropriate.

    The R² value indicates that the predictors explain 87.4% of the

    variance in HeatFlux. The adjusted R² is 85.9%, which accounts for

    the number of predictors in the model. Both values indicate that the

    model fits the data well.
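The adjustment mentioned above can be written as a one-line function. This sketch uses the standard textbook formula for adjusted R². The sample size and R² in the example call are made up for illustration, since the source does not state the number of observations in the worksheet.

```python
def adjusted_r_squared(r2, n, p):
    """Adjusted R-sq for n observations and p predictors (intercept not counted)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)

# Hypothetical inputs: R-sq of 0.80 from 20 observations and 3 predictors.
print(f"{100 * adjusted_r_squared(0.80, 20, 3):.2f}%")
```

Because the penalty grows with the number of predictors, the adjusted value is always below the plain R² whenever the model has at least one predictor and R² is less than 1.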

    The predicted R² value is 78.96%. Because the predicted R² value is

    close to the R² and adjusted R² values, the model does not appear to

    be overfit and has adequate predictive ability.
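PRESS and predicted R² can be computed without refitting the model once per observation, because each leave-one-out prediction error equals the ordinary residual divided by (1 - leverage). The sketch below relies on that identity. The data are synthetic and the function name `press_and_predicted_r2` is a hypothetical one chosen for this illustration.

```python
import numpy as np

def press_and_predicted_r2(X, y):
    """Return (PRESS, predicted R-sq) for an OLS model with an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    # Leverages: diagonal of the hat matrix A (A'A)^-1 A'.
    h = np.einsum("ij,jk,ik->i", A, np.linalg.inv(A.T @ A), A)
    press = np.sum((resid / (1.0 - h)) ** 2)   # sum of squared deleted residuals
    sst = np.sum((y - y.mean()) ** 2)
    return press, 1.0 - press / sst

# Synthetic data for illustration only (NOT the worksheet values).
rng = np.random.default_rng(1)
X = rng.normal(size=(25, 3))
y = 1.0 + X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.7, size=25)

press, pred_r2 = press_and_predicted_r2(X, y)
print(f"PRESS = {press:.3f}, predicted R-sq = {100 * pred_r2:.2f}%")
```

A predicted R² far below the ordinary R² would suggest that the model is tailored to the sample at hand, which is the overfitting check described in the paragraph above.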

    Observations 4 and 22 are identified as unusual because the absolute

    values of their standardized residuals are greater than 2. This may

    indicate they are outliers. See Checking your model, Identifying

    outliers, and Choosing a residual type.

    Graph window output

    The histogram indicates that outliers may exist in the data, shown

    by the two bars on the far right side of the plot.

    The normal probability plot shows an approximately linear pattern

    consistent with a normal distribution. The two points in the

    upper-right corner of the plot may be outliers. Brushing the graph identifies

    these points as 4 and 22, the same points that are labeled unusual

    observations in the output. See Checking your model and Identifying

    outliers.

    The plot of residuals versus the fitted values shows that the

    residuals get smaller (closer to the reference line) as the fitted

    values increase, which may indicate the residuals have non-constant

    variance. See [9] for information on non-constant variance.

    3. Example of Fitted Regression Line

    You are studying the relationship between a particular machine

    setting and the amount of energy consumed. This relationship is known to

    have considerable curvature, and you believe that a log transformation of

    the response variable will produce a more symmetric error

    distribution. You choose to model the relationship between the machine

    setting and the amount of energy consumed with a quadratic model.

    1 Open the worksheet EXH_REGR.MTW.

    2 Choose Stat > Regression > Fitted Line Plot.

    3 In Response (Y), enter EnergyConsumption.

    4 In Predictor (X), enter MachineSetting.

    5 Under Type of Regression Model, choose Quadratic.

    6 Click Options. Under Transformations, check Logten of Y and

    Display logscale for Y variable. Under Display Options, check

    Display confidence interval and Display prediction interval.

    Click OK in each dialog box.
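The model requested in these steps, a quadratic in the machine setting fitted to the base-10 logarithm of energy consumption, can be sketched with `numpy.polyfit`. The readings below are hypothetical values chosen to show curvature, not the worksheet data, and `predict_energy` is an assumed helper name. The confidence and prediction intervals that Minitab draws are not reproduced.

```python
import numpy as np

# Hypothetical settings and energy readings (NOT the worksheet data).
setting = np.array([10.0, 12.0, 14.0, 16.0, 18.0, 20.0, 22.0, 24.0])
energy = np.array([25.0, 12.0, 7.0, 5.5, 6.0, 9.0, 16.0, 33.0])

# Fit log10(y) = b0 + b1*x + b2*x^2; np.polyfit returns the highest power first.
b2, b1, b0 = np.polyfit(setting, np.log10(energy), deg=2)

def predict_energy(x):
    """Back-transform the fitted log10 value to the original energy scale."""
    return 10.0 ** (b0 + b1 * x + b2 * x ** 2)

print(f"log10(y) = {b0:.4f} + {b1:.4f} x + {b2:.5f} x^2")
print(f"Predicted energy at setting 15: {predict_energy(15.0):.2f}")
```

Fitting on the log scale and back-transforming means the fitted curve tracks the typical (geometric) level of energy consumption rather than its arithmetic mean.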

    Session window output

    Interpreting the results

    The quadratic model (p-value = 0.000, or actually p-value