Simple Linear Regression Statistics 700 Week of November 27


Page 1: Simple Linear Regression

Simple Linear Regression

Statistics 700

Week of November 27

Page 2: Simple Linear Regression

Week of 11/27/2000 Simple Linear Regression 2

Example for Illustration

• The human body takes in more oxygen when exercising than when it is at rest. To deliver oxygen to the muscles, the heart must beat faster. Heart rate is easy to measure, but measuring oxygen uptake requires elaborate equipment. If oxygen uptake (VO2) can be accurately predicted from heart rate (HR), the predicted values may replace actually measured values for various research purposes. Unfortunately, not all human bodies are the same, so no single prediction equation works for all people. Researchers can, however, measure both HR and VO2 for one person under varying sets of exercise conditions and calculate a regression equation for predicting that person’s oxygen uptake from heart rate.

Page 3: Simple Linear Regression


Data From An Individual

• Goals in this illustration:

• Draw a scatterplot: is the relationship linear or not?

• Obtain the best-fitting line using least squares.

• Test whether the model is significant.

• Obtain a confidence interval for the regression coefficient.

• Obtain predictions.

HR    94     96     95     95     94     95     94     104    104    106
VO2   0.473  0.753  0.929  0.939  0.832  0.983  1.049  1.178  1.176  1.292

HR    108    110    113    113    118    115    121    127    131
VO2   1.403  1.499  1.529  1.599  1.749  1.746  1.897  2.040  2.231

Page 4: Simple Linear Regression


The Scatterplot

[Figure: scatterplot of OxygenUptake (vertical axis, 0.4 to 2.4) against HeartRate (horizontal axis, 90 to 130)]

Page 5: Simple Linear Regression


Simple Linear Regression Model

1. Conditional on $X = x$, the response variable $Y$ has mean equal to $\mu(x) = \alpha + \beta x$.

2. $\alpha$ is the y-intercept, while $\beta$ is the slope of the regression line, which can be interpreted as the change in the mean value of $Y$ per unit change in the independent variable.

3. For each $X = x$, the conditional distribution of $Y$ is normal with mean $\mu(x)$ and variance $\sigma^2$.

4. $Y_1, Y_2, \ldots, Y_n$ are independent of each other.

Shorthand:

$Y_i = \alpha + \beta x_i + \varepsilon_i$ with $\varepsilon_i$ IID $N(0, \sigma^2)$
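To make the four model assumptions concrete, here is a minimal Python sketch that draws responses from the shorthand form of the model (intercept plus slope times x plus an independent normal error). The function name `simulate_responses` and the parameter values (intercept -2.8, slope 0.04, error standard deviation 0.1) are assumptions chosen only for illustration; they are not taken from the slides.

```python
import random


def simulate_responses(xs, alpha, beta, sigma, seed=0):
    """Draw Y_i = alpha + beta * x_i + eps_i, with eps_i independent N(0, sigma^2)."""
    rng = random.Random(seed)  # private generator so results are repeatable
    return [alpha + beta * x + rng.gauss(0.0, sigma) for x in xs]


# Hypothetical design points and parameter values, for illustration only.
xs = [90, 100, 110, 120, 130]
ys = simulate_responses(xs, alpha=-2.8, beta=0.04, sigma=0.1)

for x, y in zip(xs, ys):
    mean_at_x = -2.8 + 0.04 * x  # conditional mean mu(x)
    print(f"x = {x:3d}  mu(x) = {mean_at_x:.2f}  simulated y = {y:.3f}")
```

Each simulated y scatters around its conditional mean with the same spread at every x, which is what assumptions 1 and 3 describe.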

Page 6: Simple Linear Regression


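Formulas for Simple Linear Regression

\[
\begin{aligned}
SXX &= \sum_{i=1}^{n} (X_i - \bar{X})^2 = \sum_{i=1}^{n} X_i^2 - n\bar{X}^2 \\
SYY &= \sum_{i=1}^{n} (Y_i - \bar{Y})^2 = \sum_{i=1}^{n} Y_i^2 - n\bar{Y}^2 \\
SXY &= \sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y}) = \sum_{i=1}^{n} X_i Y_i - n\bar{X}\bar{Y} \\
r &= \frac{SXY}{\sqrt{(SXX)(SYY)}} \\
b &= \frac{SXY}{SXX} \quad \text{(Estimator of } \beta\text{)} \\
a &= \bar{Y} - b\bar{X} \quad \text{(Estimator of } \alpha\text{)} \\
\hat{Y} &= a + bX \quad \text{(Prediction Line)} \\
R_i &= Y_i - \hat{Y}_i = Y_i - (a + bX_i), \quad i = 1, 2, \ldots, n
\end{aligned}
\]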
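The formulas for SXX, SYY, SXY, r, b, a and the residuals can be checked numerically. The sketch below applies them to the HR and VO2 data from the individual in this example, using only the Python standard library; the variable names are my own.

```python
from math import sqrt

# Heart rate (X) and oxygen uptake (Y) for the 19 observations in the example.
hr = [94, 96, 95, 95, 94, 95, 94, 104, 104, 106,
      108, 110, 113, 113, 118, 115, 121, 127, 131]
vo2 = [0.473, 0.753, 0.929, 0.939, 0.832, 0.983, 1.049, 1.178, 1.176, 1.292,
       1.403, 1.499, 1.529, 1.599, 1.749, 1.746, 1.897, 2.040, 2.231]

n = len(hr)
x_bar = sum(hr) / n
y_bar = sum(vo2) / n

# Computational forms: sum of squares (or products) minus n times the mean term.
sxx = sum(x * x for x in hr) - n * x_bar ** 2
syy = sum(y * y for y in vo2) - n * y_bar ** 2
sxy = sum(x * y for x, y in zip(hr, vo2)) - n * x_bar * y_bar

r = sxy / sqrt(sxx * syy)  # sample correlation
b = sxy / sxx              # least-squares slope
a = y_bar - b * x_bar      # least-squares intercept

# Residuals R_i = Y_i - (a + b * X_i); they sum to zero up to rounding error.
residuals = [y - (a + b * x) for x, y in zip(hr, vo2)]

print(f"SXX = {sxx:.0f}, SYY = {syy:.6f}, SXY = {sxy:.3f}")
print(f"b = {b:.6f}, a = {a:.5f}, r = {r:.4f}")
```

The computational forms involve subtracting two large, nearly equal numbers, so with less friendly data the centered forms (sums of squared deviations) are usually the numerically safer choice.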

Page 7: Simple Linear Regression


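\[
\begin{aligned}
SSE &= \sum_{i=1}^{n} R_i^2 = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 \\
SSR &= \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2 \\
SYY &= SSR + SSE \\
MSE &= \frac{SSE}{n-2} \quad \text{is an unbiased estimator of } \sigma^2 \\
F_c &= \frac{MSR}{MSE} = \frac{SSR/1}{SSE/(n-2)} \\
R^2 &= \frac{SSR}{SYY} \quad \text{(Coefficient of Determination)}
\end{aligned}
\]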
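As a rough check on the sums-of-squares decomposition, the following sketch computes SSE, SSR, MSE, the F statistic and the coefficient of determination for the HR and VO2 data in this example. It uses only the standard library, and the variable names are my own.

```python
# Heart rate (X) and oxygen uptake (Y) for the 19 observations in the example.
hr = [94, 96, 95, 95, 94, 95, 94, 104, 104, 106,
      108, 110, 113, 113, 118, 115, 121, 127, 131]
vo2 = [0.473, 0.753, 0.929, 0.939, 0.832, 0.983, 1.049, 1.178, 1.176, 1.292,
       1.403, 1.499, 1.529, 1.599, 1.749, 1.746, 1.897, 2.040, 2.231]

n = len(hr)
x_bar = sum(hr) / n
y_bar = sum(vo2) / n
sxx = sum((x - x_bar) ** 2 for x in hr)
syy = sum((y - y_bar) ** 2 for y in vo2)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(hr, vo2))

b = sxy / sxx
a = y_bar - b * x_bar
fitted = [a + b * x for x in hr]

sse = sum((y - f) ** 2 for y, f in zip(vo2, fitted))  # residual sum of squares
ssr = sum((f - y_bar) ** 2 for f in fitted)           # regression sum of squares
msr = ssr / 1                                         # 1 degree of freedom
mse = sse / (n - 2)                                   # n - 2 degrees of freedom
f_stat = msr / mse
r_squared = ssr / syy

print(f"SSR = {ssr:.6f}, SSE = {sse:.6f}, SSR + SSE = {ssr + sse:.6f}, SYY = {syy:.6f}")
print(f"MSE = {mse:.5f}, F = {f_stat:.4f}, R^2 = {r_squared:.3f}")
```

The printed SSR + SSE should agree with SYY up to floating-point rounding, which is the decomposition the F statistic relies on.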

Page 8: Simple Linear Regression


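\[
\begin{aligned}
\hat{\sigma}^2(b) &= \frac{MSE}{SXX} \\
\hat{\sigma}^2(a) &= MSE\left(\frac{1}{n} + \frac{\bar{X}^2}{SXX}\right) \\
\hat{\sigma}^2[\hat{Y}(x_0)] &= MSE\left(\frac{1}{n} + \frac{(x_0 - \bar{X})^2}{SXX}\right)
\end{aligned}
\]

Confidence Interval for $\mu(x_0)$:

\[
(a + b x_0) \pm t_{n-2;\,\alpha/2} \sqrt{MSE\left(\frac{1}{n} + \frac{(x_0 - \bar{X})^2}{SXX}\right)}
\]

Prediction Interval at $X = x_0$:

\[
(a + b x_0) \pm t_{n-2;\,\alpha/2} \sqrt{MSE\left(1 + \frac{1}{n} + \frac{(x_0 - \bar{X})^2}{SXX}\right)}
\]

In the subscript of $t$, $\alpha/2$ is the upper-tail probability of the $t$ distribution with $n - 2$ degrees of freedom (0.025 for a 95% interval), not the intercept.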
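The two interval formulas differ only by the extra "1 +" inside the square root, and a small numerical sketch makes the consequence visible. The code below evaluates both intervals for the HR and VO2 data of this example. Two things are assumptions of mine rather than part of the slides: the heart rate x0 = 100 is an arbitrary illustrative value, and the critical value 2.110 (upper 2.5% point of the t distribution with 17 degrees of freedom) is typed in from a standard t table rather than computed.

```python
from math import sqrt

# Heart rate (X) and oxygen uptake (Y) for the 19 observations in the example.
hr = [94, 96, 95, 95, 94, 95, 94, 104, 104, 106,
      108, 110, 113, 113, 118, 115, 121, 127, 131]
vo2 = [0.473, 0.753, 0.929, 0.939, 0.832, 0.983, 1.049, 1.178, 1.176, 1.292,
       1.403, 1.499, 1.529, 1.599, 1.749, 1.746, 1.897, 2.040, 2.231]

n = len(hr)
x_bar = sum(hr) / n
y_bar = sum(vo2) / n
sxx = sum((x - x_bar) ** 2 for x in hr)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(hr, vo2))
b = sxy / sxx
a = y_bar - b * x_bar
mse = sum((y - (a + b * x)) ** 2 for x, y in zip(hr, vo2)) / (n - 2)

# Assumption: t table value for 17 degrees of freedom, 95% two-sided intervals.
T_CRIT = 2.110


def intervals(x0, t_crit=T_CRIT):
    """Return (fit, confidence interval for the mean, prediction interval) at x0."""
    fit = a + b * x0
    se_mean = sqrt(mse * (1 / n + (x0 - x_bar) ** 2 / sxx))
    se_pred = sqrt(mse * (1 + 1 / n + (x0 - x_bar) ** 2 / sxx))
    conf_int = (fit - t_crit * se_mean, fit + t_crit * se_mean)
    pred_int = (fit - t_crit * se_pred, fit + t_crit * se_pred)
    return fit, conf_int, pred_int


fit, conf_int, pred_int = intervals(100)  # x0 = 100 is a hypothetical heart rate
print(f"fitted value at x0 = 100: {fit:.4f}")
print(f"95% CI for the mean:      ({conf_int[0]:.4f}, {conf_int[1]:.4f})")
print(f"95% prediction interval:  ({pred_int[0]:.4f}, {pred_int[1]:.4f})")
```

The prediction interval is always wider than the confidence interval because it accounts for the variability of a single new observation in addition to the uncertainty in the estimated mean. Both intervals are narrowest at the mean heart rate and widen as x0 moves away from it.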

Page 9: Simple Linear Regression


X = HeartRate, Y = OxygenUptake

X       Y       X^2     Y^2       XY        Predicted  Residual      ResSq
94      0.473   8836    0.223729  44.462    0.828944   -0.35594369   0.126696
96      0.753   9216    0.567009  72.288    0.906248   -0.1532479    0.023485
95      0.929   9025    0.863041  88.255    0.867596   0.061404206   0.00377
95      0.939   9025    0.881721  89.205    0.867596   0.071404206   0.005099
94      0.832   8836    0.692224  78.208    0.828944   0.00305631    9.34E-06
95      0.983   9025    0.966289  93.385    0.867596   0.115404206   0.013318
94      1.049   8836    1.100401  98.606    0.828944   0.22005631    0.048425
104     1.178   10816   1.387684  122.512   1.215465   -0.03746474   0.001404
104     1.176   10816   1.382976  122.304   1.215465   -0.03946474   0.001557
106     1.292   11236   1.669264  136.952   1.292769   -0.00076895   5.91E-07
108     1.403   11664   1.968409  151.524   1.370073   0.032926843   0.001084
110     1.499   12100   2.247001  164.89    1.447377   0.051622633   0.002665
113     1.529   12769   2.337841  172.777   1.563334   -0.03433368   0.001179
113     1.599   12769   2.556801  180.687   1.563334   0.035666318   0.001272
118     1.749   13924   3.059001  206.382   1.756594   -0.00759421   5.77E-05
115     1.746   13225   3.048516  200.79    1.640638   0.105362109   0.011101
121     1.897   14641   3.598609  229.537   1.872551   0.02444948    0.000598
127     2.04    16129   4.1616    259.08    2.104463   -0.06446315   0.004155
131     2.231   17161   4.977361  292.261   2.259072   -0.02807157   0.000788

Totals:
2033    25.297  220049  37.68948  2804.105             2.77556E-15   0.246664

SXX  2518        B    0.038652
SYY  4.008519    A    -2.80435
SXY  97.326

SSR  3.761855    SSE  0.246664
MSR  3.761855    MSE  0.01451

F    259.2659

Page 10: Simple Linear Regression


Results of Regression Analysis (using Minitab)

Regression Analysis

The regression equation is

OxygenUptake = - 2.80 + 0.0387 HeartRate

Predictor   Coef      StDev     T       P
Constant    -2.8044   0.2583    -10.86  0.000
HeartRat    0.038652  0.002400  16.10   0.000

S = 0.1205 R-Sq = 93.8% R-Sq(adj) = 93.5%

Analysis of Variance

Source          DF  SS      MS      F       P
Regression      1   3.7619  3.7619  259.27  0.000
Residual Error  17  0.2467  0.0145
Total           18  4.0085
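Several entries of the Minitab output can be reproduced by hand from MSE, SXX and SYY. The sketch below recomputes the StDev and T columns of the coefficient table together with S, R-Sq and R-Sq(adj). It is a plain standard-library calculation with variable names of my own choosing, not a reproduction of how Minitab computes these quantities internally, and it leaves out the P column because that requires t-distribution tail probabilities.

```python
from math import sqrt

# Heart rate (X) and oxygen uptake (Y) for the 19 observations in the example.
hr = [94, 96, 95, 95, 94, 95, 94, 104, 104, 106,
      108, 110, 113, 113, 118, 115, 121, 127, 131]
vo2 = [0.473, 0.753, 0.929, 0.939, 0.832, 0.983, 1.049, 1.178, 1.176, 1.292,
       1.403, 1.499, 1.529, 1.599, 1.749, 1.746, 1.897, 2.040, 2.231]

n = len(hr)
x_bar = sum(hr) / n
y_bar = sum(vo2) / n
sxx = sum((x - x_bar) ** 2 for x in hr)
syy = sum((y - y_bar) ** 2 for y in vo2)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(hr, vo2))
b = sxy / sxx
a = y_bar - b * x_bar
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(hr, vo2))
mse = sse / (n - 2)

se_b = sqrt(mse / sxx)                       # StDev of the slope
se_a = sqrt(mse * (1 / n + x_bar ** 2 / sxx))  # StDev of the intercept
t_b = b / se_b                               # T for HeartRate
t_a = a / se_a                               # T for Constant
s = sqrt(mse)                                # residual standard deviation S
r_sq = 1 - sse / syy                         # R-Sq
r_sq_adj = 1 - mse / (syy / (n - 1))         # R-Sq(adj)

print(f"Constant   {a:9.4f} {se_a:9.4f} {t_a:8.2f}")
print(f"HeartRate  {b:9.6f} {se_b:9.6f} {t_b:8.2f}")
print(f"S = {s:.4f}  R-Sq = {100 * r_sq:.1f}%  R-Sq(adj) = {100 * r_sq_adj:.1f}%")
```

In simple linear regression the squared T statistic for the slope equals the F statistic in the analysis of variance table, so the two tests of the slope are equivalent.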

Page 11: Simple Linear Regression


Fitted Line on the Scatterplot

[Figure: Regression Plot. Scatterplot of OxygenUptake (vertical axis, 0.5 to 2.5) against HeartRate (horizontal axis, 90 to 130) with the fitted line Y = -2.80435 + 3.87E-02X; R-Sq = 93.8 %]

Page 12: Simple Linear Regression


[Figure: Regression Plot. OxygenUptake (vertical axis, 0.5 to 2.5) against HeartRate (horizontal axis, 90 to 130) showing the regression line Y = -2.80435 + 3.87E-02X together with 95% CI and 95% PI bands; R-Sq = 93.8 %]