Simple Linear Regression - Statistical Inference


Reading: Sections 12.3, 12.4, and 12.5

Learning Objectives: Students should be able to:
• Describe the relationship between two variables using plots and correlation.
• Make inference about population parameters
  – Confidence intervals
  – Hypothesis tests
• Make predictions for new observations of the dependent variable based on known values of the independent variable


Is the Simple Linear Regression Model Useful?
Coefficient of determination and correlation coefficient

• Coefficient of Determination ($r^2$): the larger the $r^2$, the greater the proportion of variation in Y explained by its linear relationship to x.

• Correlation coefficient (r) describes how “tight” a linear relationship is between two variables.
  – Positive: larger x values tend to associate with larger y values
  – Negative: larger x values tend to associate with smaller y values

Relationship between COD ($r^2$) and CC (r)

Pearson correlation of y and x = 1.000

The regression equation is y = 5.00 + 1.00 x
S = 0   R-Sq = 100.0%   R-Sq(adj) = 100.0%

Source          DF      SS      MS     F      P
Regression       1  10.254  10.254     *      *
Residual Error   8   0.000   0.000
Total            9  10.254


Pearson correlation of y1 and x1 = 0.000

The regression equation is y1 = 4.75 + 0.000 x1
S = 1.13376   R-Sq = 0.0%   R-Sq(adj) = 0.0%

Source          DF      SS     MS     F      P
Regression       1   0.000  0.000  0.00  0.999
Residual Error   8  10.283  1.285
Total            9  10.283


Regression Analysis: MPG versus motorsz

The regression equation is
MPG = 33.7 - 0.0474 motorsz

S = 3.06705   R-Sq = 77.2%   R-Sq(adj) = 76.4%

Analysis of Variance

Source          DF       SS      MS       F      P
Regression       1   955.34  955.34  101.56  0.000
Residual Error  30   282.20    9.41
Total           31  1237.54

Correlations: MPG, motorsz

Pearson correlation of MPG and motorsz = -0.879


Correlations: motorsz, weight

Pearson correlation of motorsz and weight = 0.947
P-Value = 0.000

Regression Analysis: motorsz versus weight

The regression equation is
motorsz = - 135 + 0.117 weight

Predictor      Coef   SE Coef      T      P
Constant    -134.67     26.84  -5.02  0.000
weight     0.116932  0.007241  16.15  0.000

S = 38.2179   R-Sq = 89.7%   R-Sq(adj) = 89.3%

Analysis of Variance

Source          DF      SS      MS       F      P
Regression       1  380883  380883  260.77  0.000
Residual Error  30   43818    1461
Total           31  424701
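
The Minitab outputs above can be used to check how the two measures relate in simple linear regression: R-Sq should equal SS(Regression)/SS(Total), and r should be the square root of R-Sq carrying the sign of the slope. The sketch below redoes that arithmetic with the printed numbers; the helper name `r_from_anova` is only illustrative and is not part of Minitab or any package.

```python
import math

def r_from_anova(ss_regression, ss_total, slope):
    """Return (r_squared, r) for a simple linear regression.

    r_squared = SS(Regression) / SS(Total); r takes the sign of the slope.
    """
    r_squared = ss_regression / ss_total
    r = math.copysign(math.sqrt(r_squared), slope)
    return r_squared, r

# Values copied from the two Minitab outputs.
r2_mpg, r_mpg = r_from_anova(955.34, 1237.54, slope=-0.0474)
r2_size, r_size = r_from_anova(380883, 424701, slope=0.117)

print(f"MPG vs motorsz:    R-Sq = {r2_mpg:.1%}, r = {r_mpg:.3f}")
print(f"motorsz vs weight: R-Sq = {r2_size:.1%}, r = {r_size:.3f}")
```

Both results agree with the reported R-Sq and Pearson correlation values to the printed precision.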

Is the Simple Linear Regression Model Useful?
Inference about $\beta_1$

• The larger the |r|, the larger the $|\hat{\beta}_1|$
• Small |r| may be indicative of true $\beta_1 = 0$
• Will make inference about $\beta_1$ using CI and HT.

If |r| or $r^2$ is reasonably large
  – CI for $\beta_1$ will not contain 0
  – Hypothesis of $\beta_1 = 0$ will be rejected

Then utility of model is confirmed

$\beta_1$ is the true expected change in Y for every one unit change in x. If $\beta_1 = 0$, then changes in x do not influence changes in Y. Hence the model is not useful.

(1-α)100% Confidence Interval for $\beta_1$

$$\hat{\beta}_1 \sim N\left(\beta_1,\ \sigma^2_{\hat{\beta}_1}\right) = N\left(\beta_1,\ \frac{\sigma^2}{S_{XX}}\right)$$

Point Estimator and its distribution (t-distribution)
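
When $\sigma^2$ is unknown and is estimated from the residuals, the standardized slope follows a t-distribution with $n-2$ degrees of freedom, which leads to the usual interval. The block below is a sketch in standard textbook notation; the symbols $s$ (residual standard deviation, Minitab's S) and $t_{\alpha/2,\,n-2}$ (upper-tail critical value) are assumed here rather than taken from the slide.

```latex
T = \frac{\hat{\beta}_1 - \beta_1}{s / \sqrt{S_{XX}}} \sim t_{n-2},
\qquad s^2 = \frac{SSE}{n-2}

% (1 - \alpha)100\% confidence interval for the slope
\hat{\beta}_1 \pm t_{\alpha/2,\, n-2} \cdot \frac{s}{\sqrt{S_{XX}}}
```

The quantity $s/\sqrt{S_{XX}}$ is what Minitab reports as "SE Coef" for the predictor.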

Hypothesis-Testing Procedure (t-test)
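
A sketch of the usual t-test for the slope, written in standard textbook notation. The hypothesized value $\beta_{10}$ (most often 0) and the critical values $t_{\alpha,\,n-2}$ are notation assumed here, not taken from the slide.

```latex
H_0: \beta_1 = \beta_{10}
\qquad
t = \frac{\hat{\beta}_1 - \beta_{10}}{s / \sqrt{S_{XX}}}

% Rejection regions for a level-\alpha test
H_a: \beta_1 \neq \beta_{10}: \quad |t| > t_{\alpha/2,\, n-2}
H_a: \beta_1 > \beta_{10}: \quad t > t_{\alpha,\, n-2}
H_a: \beta_1 < \beta_{10}: \quad t < -t_{\alpha,\, n-2}
```

With $\beta_{10} = 0$, the statistic is the "T" column in the Minitab coefficient table (Coef divided by SE Coef).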


Hypothesis-Testing Procedure (F-test)

Source of Variation   D.F.    Sum of Squares   Mean Square         F-test
Regression            1       SSR              MSR = SSR/1         f = MSR/MSE
Error                 n - 2   SSE              MSE = SSE/(n - 2)
Total                 n - 1   SST

Reject $H_0: \beta_1 = 0$ if $f > F_{\alpha,\,1,\,n-2}$ for a level α test.

Or compute p-value

Example: MPG and Motorsize

Regression Analysis: MPG versus motorsz (edited output from MINITAB)

The regression equation is MPG = 33.7 - 0.0474 motorsz (n=32)

Predictor       Coef   SE Coef      T      P
Constant      33.727     1.446  23.33  0.000
motorsz    -0.047428  0.004706

S = 3.06705   R-Sq = 77.2%

Analysis of Variance

Source          DF       SS      MS   F   P
Regression       1   955.34  955.34
Residual Error       282.20    9.41
Total               1237.54
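
The blank entries in this edited output can be recovered from the numbers that remain. The sketch below computes the missing degrees of freedom, the t statistic for the slope, the F statistic, and a 95% confidence interval for the slope. The critical value `T_CRIT_30` is an assumption taken from a standard t-table (30 degrees of freedom, upper 2.5% point); it does not appear in the output.

```python
# Values printed in the edited Minitab output.
n = 32
coef, se_coef = -0.047428, 0.004706   # slope estimate and its standard error
ss_reg, ss_err = 955.34, 282.20       # regression and residual sums of squares

# Assumption: upper 2.5% point of the t-distribution with 30 df,
# taken from a standard t-table rather than from the output.
T_CRIT_30 = 2.0423

df_error = n - 2                      # residual degrees of freedom
df_total = n - 1                      # total degrees of freedom
t_stat = coef / se_coef               # t statistic for H0: beta_1 = 0
f_stat = (ss_reg / 1) / (ss_err / df_error)   # F = MSR / MSE

# 95% confidence interval for the slope.
half_width = T_CRIT_30 * se_coef
ci_low, ci_high = coef - half_width, coef + half_width

print(f"df: error = {df_error}, total = {df_total}")
print(f"T = {t_stat:.2f}, T^2 = {t_stat**2:.2f}, F = {f_stat:.2f}")
print(f"95% CI for slope: ({ci_low:.4f}, {ci_high:.4f})")
print("Reject H0 at the 5% level:", abs(t_stat) > T_CRIT_30)
```

In simple linear regression the squared t statistic equals F, so the two printed values differ only through rounding of the coefficient and its standard error. The interval lies entirely below 0, which is consistent with rejecting $H_0: \beta_1 = 0$.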


Confidence Interval for Mean Y Value
CI for $\mu_{Y \cdot x^*}$

$$\mu_{Y \cdot x^*} = E(Y \mid x^*) = \beta_0 + \beta_1 x^*$$

Point Estimation
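
A sketch of the point estimate and the interval for the mean response at $x^*$, in standard textbook notation. The symbols $\hat{y}^*$, $\bar{x}$, $s$, and $t_{\alpha/2,\,n-2}$ are assumed here rather than taken from the slide.

```latex
% Point estimate of the mean response at x^*
\hat{y}^* = \hat{\beta}_0 + \hat{\beta}_1 x^*

% (1 - \alpha)100\% confidence interval for the mean response
\hat{y}^* \pm t_{\alpha/2,\, n-2} \cdot s
\sqrt{\frac{1}{n} + \frac{(x^* - \bar{x})^2}{S_{XX}}}
```

The standard error in this interval is what Minitab reports as "SE Fit"; it grows as $x^*$ moves away from $\bar{x}$.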

Prediction Interval for a Future Y Value
PI for $Y \mid x^*$

Prediction
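
A sketch of the prediction interval for a single future observation at $x^*$, in standard textbook notation. The symbols $\hat{y}^*$, $\bar{x}$, $s$, and $t_{\alpha/2,\,n-2}$ are assumed here rather than taken from the slide.

```latex
% Point prediction for a new observation at x^*
\hat{y}^* = \hat{\beta}_0 + \hat{\beta}_1 x^*

% (1 - \alpha)100\% prediction interval for a future Y at x^*
\hat{y}^* \pm t_{\alpha/2,\, n-2} \cdot s
\sqrt{1 + \frac{1}{n} + \frac{(x^* - \bar{x})^2}{S_{XX}}}
```

The extra 1 under the square root accounts for the variability of an individual observation around its mean, so a prediction interval is always wider than the confidence interval for the mean response at the same $x^*$.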

Example: MPG and Motorsize

The regression equation is MPG = 33.7 - 0.0474 motorsz (n=32)

Predictor       Coef   SE Coef      T      P
Constant      33.727     1.446  23.33  0.000
motorsz    -0.047428  0.004706

S = 3.06705   R-Sq = 77.2%

Obs     Fit  SE Fit  95% CI  95% PI
1    19.499   0.547

(1) Suppose we want to get some idea of the MPG of cars with a motor size of 300. Do we want a CI or a PI?


(2) Suppose you want to get some idea of the MPG of a car with a motor size of 300 that you are considering purchasing. Do you want a CI or a PI?

CI and PI Using Minitab

Predicted Values for New Observations

New Obs     Fit  SE Fit            95% CI            95% PI
1        19.499   0.547  (18.382, 20.616)  (13.136, 25.862)
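
Both intervals in this output can be rebuilt from Fit, SE Fit, and S. The sketch below does so under one assumption: the critical value `T_CRIT_30` (upper 2.5% point of the t-distribution with 30 degrees of freedom) is taken from a standard t-table, not from the output. The helper `interval` is illustrative only.

```python
import math

# Values printed in the Minitab output for motorsz = 300.
fit = 19.499      # point estimate of MPG at x* = 300
se_fit = 0.547    # standard error of the fitted mean
s = 3.06705       # residual standard deviation S

# Assumption: upper 2.5% point of the t-distribution with n - 2 = 30 df,
# taken from a standard t-table.
T_CRIT_30 = 2.0423

def interval(center, std_error, t_crit):
    """Return (lower, upper) = center -/+ t_crit * std_error."""
    margin = t_crit * std_error
    return center - margin, center + margin

# CI for the mean response uses SE Fit alone.
ci = interval(fit, se_fit, T_CRIT_30)

# PI for one new car adds the residual variance s^2 to SE Fit^2.
se_pred = math.sqrt(s**2 + se_fit**2)
pi = interval(fit, se_pred, T_CRIT_30)

print(f"95% CI: ({ci[0]:.3f}, {ci[1]:.3f})")
print(f"95% PI: ({pi[0]:.3f}, {pi[1]:.3f})")
```

The CI describes the average MPG of all cars with motor size 300, while the PI describes the MPG of one particular car, which is why the PI is much wider.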

Extrapolation


Would it be wise to use our estimated simple linear regression model to predict the MPG of a car with motorsz = 600?

Extrapolation is making estimations/predictions for Y conditional on values of x outside of those observed in the data used to estimate the regression parameters.
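
One way to guard against extrapolation is to compare the requested x value with the range of x observed in the data before trusting the prediction. The sketch below uses the fitted coefficients from the Minitab output; the function name `predict_mpg` and the range limits `X_MIN` and `X_MAX` are hypothetical, since the observed range of motorsz is not reported in the output.

```python
def predict_mpg(motorsz, observed_min, observed_max):
    """Point prediction from the fitted line, flagged when extrapolating.

    Returns (prediction, is_extrapolation). The coefficients are the ones
    printed in the Minitab output; the observed range must be supplied by
    the caller because it is not reported in the output.
    """
    b0, b1 = 33.727, -0.047428
    prediction = b0 + b1 * motorsz
    is_extrapolation = not (observed_min <= motorsz <= observed_max)
    return prediction, is_extrapolation

# Hypothetical observed range of motorsz, for illustration only.
X_MIN, X_MAX = 70, 470

for x_star in (300, 600):
    mpg, extrapolated = predict_mpg(x_star, X_MIN, X_MAX)
    note = "extrapolation, treat with caution" if extrapolated else "within observed range"
    print(f"motorsz = {x_star}: predicted MPG = {mpg:.2f} ({note})")
```

The fitted line still returns a number outside the observed range, but nothing in the data supports the assumption that the linear relationship continues to hold there.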

Summary

Use the estimated linear regression model to:
• Evaluate the linear relationship between Y and x:
  – Coefficient of determination
  – Confidence interval for $\beta_1$
  – Hypothesis test for $H_0: \beta_1 = 0$
• Predict values of Y conditional on known x
  – Point estimate
  – Confidence interval or prediction interval
