Ch 15 – Inference for Regression

Example #1: The following data are pulse rates and heights for a group of 10 female statistics students.

Height 55 59 60 63 64 64 66 70 70 72

Pulse 53 53 58 62 63 65 68 70 73 76

a. Sketch a scatterplot of the data. What is the least-squares regression line for predicting pulse rate from height?

$\hat{y} = -27.5784 + 1.42579x$, where $\hat{y}$ = predicted pulse rate

x = height
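As a cross-check on the calculator output, the intercept and slope can be recomputed from the usual least-squares formulas. This is a minimal sketch in plain Python; the function name `least_squares` is illustrative and does not come from the slides.

```python
# Least-squares line for predicting pulse from height (Example #1 data).
heights = [55, 59, 60, 63, 64, 64, 66, 70, 70, 72]
pulses = [53, 53, 58, 62, 63, 65, 68, 70, 73, 76]

def least_squares(xs, ys):
    """Return (intercept a, slope b) of the least-squares line y = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of cross-deviations and sum of squared x-deviations.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b

a, b = least_squares(heights, pulses)
print(f"y-hat = {a:.4f} + {b:.5f} x")
```

The intercept comes out negative, which is why the regression line carries a minus sign in front of 27.5784.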

b. What is the correlation coefficient between height and pulse rate? Interpret this number.

r = 0.9746

Strong, Positive relationship
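The correlation can be checked the same way. The sketch below computes r from the sums of squares and cross-products; the helper name `correlation` is an assumption made for this illustration.

```python
import math

heights = [55, 59, 60, 63, 64, 64, 66, 70, 70, 72]
pulses = [53, 53, 58, 62, 63, 65, 68, 70, 73, 76]

def correlation(xs, ys):
    """Pearson correlation coefficient r between xs and ys."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

r = correlation(heights, pulses)
print(f"r = {r:.4f}")
```

A value close to +1 is what justifies describing the relationship as strong and positive.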

c. What is the predicted pulse rate of a 59” tall student?

$\hat{y} = -27.5784 + 1.42579x$

$\hat{y} = -27.5784 + 1.42579(59)$

$\hat{y} = 56.54$

d. What is the residual for the 59” student?

Height 55 59 60 63 64 64 66 70 70 72

Pulse 53 53 58 62 63 65 68 70 73 76

$y - \hat{y} = 53 - 56.54 = -3.54$
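The prediction and residual for the 59-inch student are plain arithmetic once the line is known. A short sketch, using the rounded coefficients from part (a); the function name `predict_pulse` is hypothetical.

```python
# Rounded coefficients of the least-squares line from part (a).
INTERCEPT = -27.5784
SLOPE = 1.42579

def predict_pulse(height):
    """Predicted pulse rate for a given height in inches."""
    return INTERCEPT + SLOPE * height

observed = 53                      # actual pulse of the 59-inch student
predicted = predict_pulse(59)
residual = observed - predicted    # residual = observed - predicted

print(f"predicted = {predicted:.2f}, residual = {residual:.2f}")
```

The negative residual says this student's actual pulse is below what the line predicts.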

e. Construct a residual plot and describe its meaning.

No pattern, so good linear model
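Drawing the residual plot needs a graphing tool, but the values it plots can be listed directly. The sketch below refits the line from the Example #1 data and prints each residual next to its height; variable names are illustrative. Residuals from a least-squares line sum to zero up to rounding, which is a handy sanity check.

```python
heights = [55, 59, 60, 63, 64, 64, 66, 70, 70, 72]
pulses = [53, 53, 58, 62, 63, 65, 68, 70, 73, 76]

n = len(heights)
x_bar = sum(heights) / n
y_bar = sum(pulses) / n
sxx = sum((x - x_bar) ** 2 for x in heights)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(heights, pulses))
b = sxy / sxx
a = y_bar - b * x_bar

# One residual (observed - predicted) per student, in data order.
residuals = [y - (a + b * x) for x, y in zip(heights, pulses)]

for x, e in zip(heights, residuals):
    print(f"height {x}: residual {e:+.2f}")
print(f"sum of residuals = {sum(residuals):.2e}")
```

If the signs and sizes of the residuals show no curve or fan shape as height increases, the linear model is reasonable.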

Ok, so what is the new stuff for chapter 15?

$\hat{y} = a + bx$   This is not the true line for the population!

$\mu_y = \alpha + \beta x$   where $\alpha$ = true y-intercept and $\beta$ = true slope of the population

Remember: Residuals tell you information about the line and if it is a good model

Chapter 15 only focuses on slope.

We are going to determine if there is a linear relationship between two variables (that is, whether $\beta = 0$ or not).

Conditions for Inference:

• The observations are independent
  Can’t do repeated observations on the same individual!

• The relationship is linear
  Look for patterns in the residual plot

• The standard deviation of the response about the true line is the same everywhere
  Look for even spread in the residual plot

• The response varies Normally about the true regression line
  Make a histogram of the residuals and check that it is approximately Normal

Standard Error about the LSRL:

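s = estimate of $\sigma$, the standard deviation of the response about the true line ($s^2$ is an unbiased estimator of $\sigma^2$)

Standard deviation of residuals:

$s = \sqrt{\dfrac{\sum (y - \hat{y})^2}{n - 2}} = \sqrt{\dfrac{\sum \text{residuals}^2}{n - 2}}$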
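To see the formula in action, the sketch below computes s for the pulse and height data from the opening example. It is an illustration only: the function name `residual_std_error` is an assumption, and the slides do not report s for this data set.

```python
import math

heights = [55, 59, 60, 63, 64, 64, 66, 70, 70, 72]
pulses = [53, 53, 58, 62, 63, 65, 68, 70, 73, 76]

def residual_std_error(xs, ys):
    """s = sqrt(sum of squared residuals / (n - 2)) about the least-squares line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_bar - b * x_bar
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    # Divide by n - 2 because two parameters (a and b) were estimated.
    return math.sqrt(sse / (n - 2))

s = residual_std_error(heights, pulses)
print(f"s = {s:.4f}")
```

The divisor is n - 2 rather than n - 1 because both the intercept and the slope are estimated from the data.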

Calculator Tip! Standard Error

Stat – Tests - LinRegTTest

L1: x
L2: y
Use ≠ 0
Leave RegEq blank
Calculates s = standard error

Confidence Intervals for Regression Slope:

$b \pm t^*_{n-2}\, SE_b$

where $SE_b$ = standard error of the slope

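$SE_b = \dfrac{s}{\sqrt{\sum (x - \bar{x})^2}} = \dfrac{\sqrt{\dfrac{\sum (y - \hat{y})^2}{n - 2}}}{\sqrt{\sum (x - \bar{x})^2}}$

$SE_b$ estimates the variability in the sampling distribution of the estimated slope (how much slopes vary from experiment to experiment).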
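The formula can be applied directly to raw data. The following sketch computes the standard error of the slope for the pulse and height data from the opening example; the function name `slope_standard_error` is hypothetical, and the slides do not report this value themselves.

```python
import math

heights = [55, 59, 60, 63, 64, 64, 66, 70, 70, 72]
pulses = [53, 53, 58, 62, 63, 65, 68, 70, 73, 76]

def slope_standard_error(xs, ys):
    """SE_b = s / sqrt(sum((x - x_bar)^2)), with s based on n - 2 degrees of freedom."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_bar - b * x_bar
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(sse / (n - 2))
    return s / math.sqrt(sxx)

se_b = slope_standard_error(heights, pulses)
print(f"SE_b = {se_b:.4f}")
```

Because the x-values sit in the denominator, data that are more spread out in x give a smaller standard error for the slope.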

Minitab Printout:

The regression equation is
Predicted y = y-intercept + slope × x-variable   ($\hat{y} = a + bx$)

Predictor    Coef              StDev    T                P
Constant     y-intercept (a)   ignore   ignore           ignore
X-variable   Slope (b)         SEb      test-statistic   p-value (2-sided)

s = standard deviation of residuals   R-sq = r²   R-sq(adj) = ignore

Example #1: Infants who cry easily may be more easily stimulated than others. This may be a sign of higher IQ. Child development researchers explored the relationship between the crying of infants four to ten days old and their later IQ test scores. A snap of a rubber band on the sole of the foot caused the infants to cry. The researchers recorded the crying and measured its intensity by the number of peaks in the most active 20 seconds. They later measured the children’s IQ at age three years using the Stanford-Binet IQ test. The data are below.

Crying IQ Crying IQ Crying IQ Crying IQ

10 87 20 90 17 94 12 94

12 97 16 100 19 103 12 103

9 103 23 103 13 104 14 106

16 106 27 108 18 109 10 109

18 109 15 112 18 112 23 113

15 114 21 114 16 118 9 119

12 119 12 120 19 120 16 124

20 132 15 133 22 135 31 135

16 136 17 141 30 155 22 157

33 159 13 162

a. Label all important parts of the Minitab printout.

The regression equation is
IQ = 91.3 + 1.49 Crycount   (LSRL)

Predictor   Coef   StDev    T       P
Constant    91.3   8.934    10.22   0.000
Crycount    1.49   0.4870   3.07    0.004

s = 17.50   R-sq = 20.7%   R-sq(adj) = 18.5%

91.3 = y-intercept, 1.49 = slope, 0.4870 = SEb

s = 17.50 (standard deviation of the residuals)

R-sq = 20.7% (coefficient of determination)

b. Sketch a scatterplot of the data.

c. Calculate the standard deviation of the residuals using your calculator.

$s = \sqrt{\dfrac{\sum \text{residuals}^2}{n - 2}} = 17.4987$

d. Construct a 95% confidence interval for the slope.

P: True slope of the line for crying vs. IQ

A: The observations are independent

Each infant should be tested separately, so that one infant's test does not influence the next.

A: The relationship is linear

No apparent patterns in the residuals

A: The standard deviation of the response about the true line is the same everywhere

Residuals spread out evenly

A: The response varies Normally about the true regression line

Slightly skewed right.

N: Line of regression t-interval

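I: $b \pm t^*_{n-2}\, SE_b$

$1.49 \pm t^*_{38-2}\,(0.4870)$

$1.49 \pm 2.042\,(0.4870)$

(0.49844, 2.48735)

Note: 2.042 is the table value for df = 30, the nearest table row below df = 36. The endpoints are centered on the unrounded slope, about 1.4929.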
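The interval arithmetic is easy to script. This sketch plugs in the rounded slope (1.49), the standard error (0.4870) and the critical value (2.042) shown on the slide; the function name `slope_confidence_interval` is an assumption. Because it starts from the rounded slope, its endpoints differ slightly from the slide's (0.49844, 2.48735), which appear to come from the unrounded slope.

```python
def slope_confidence_interval(b, se_b, t_star):
    """Return (low, high) for the interval b +/- t_star * se_b."""
    margin = t_star * se_b
    return b - margin, b + margin

# Values read off the Minitab printout and the t table in the example.
low, high = slope_confidence_interval(b=1.49, se_b=0.4870, t_star=2.042)
print(f"({low:.4f}, {high:.4f})")

# The interval supports a linear relationship only if it excludes 0.
print("0 inside interval:", low <= 0 <= high)
```

A larger critical value gives a wider interval, so using the df = 30 table row instead of df = 36 is the conservative choice.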

C: I am 95% confident the true slope of the line for crying vs. IQ is between 0.49844 and 2.48735.

Note: 0 is not in the interval! This means there is evidence of a linear relationship.

OR

I am 95% confident the mean IQ increases by between 0.49844 and 2.48735 points for each additional peak in crying.

Ch 15B – Hypothesis Testing for Slope

Remember:

$b = r\,\dfrac{s_y}{s_x}$

so, if r = 0, then b = 0
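The identity linking the slope to the correlation can be confirmed numerically. The sketch below uses the pulse and height data from the opening example and compares the slope computed directly with r times the ratio of sample standard deviations; variable names are illustrative.

```python
import math
import statistics

heights = [55, 59, 60, 63, 64, 64, 66, 70, 70, 72]
pulses = [53, 53, 58, 62, 63, 65, 68, 70, 73, 76]

n = len(heights)
x_bar = sum(heights) / n
y_bar = sum(pulses) / n
sxx = sum((x - x_bar) ** 2 for x in heights)
syy = sum((y - y_bar) ** 2 for y in pulses)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(heights, pulses))

r = sxy / math.sqrt(sxx * syy)          # correlation coefficient
b_direct = sxy / sxx                    # least-squares slope
# statistics.stdev is the sample standard deviation (divides by n - 1).
b_from_r = r * statistics.stdev(pulses) / statistics.stdev(heights)

print(f"b directly = {b_direct:.5f}, b from r = {b_from_r:.5f}")
```

Since the standard deviations are positive, the slope and the correlation always share the same sign, and one is zero exactly when the other is.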

Ho: $\beta = 0$

Or there is no true linear relationship between x and y.

Test Statistic:

$t = \dfrac{b}{SE_b}$

Calculator Tip! Line Regression Test

Stat – Tests - LinRegTTest

L1: x
L2: y
Leave RegEq blank

Example #1: How well does the number of beers a student drinks predict his or her blood alcohol content (BAC)? Sixteen student volunteers at Ohio State University drank a randomly assigned number of cans of beer. Thirty minutes later, a police officer measured their BAC. The data are below.

Stu # 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16

Beer 5 2 9 8 3 7 3 5 3 5 4 6 5 7 1 4

BAC 0.10 0.03 0.19 0.12 0.04 0.095 0.07 0.06 0.02 0.05 0.07 0.10 0.085 0.09 0.01 0.05

a. What is the least-squares regression line?

$\hat{y} = -0.0127 + 0.0180x$, where $\hat{y}$ = predicted BAC

x = # of beers

b. Make a scatterplot of the data and describe its shape.

Positive, strong, linear relationship

c. What is the correlation coefficient? What does it mean?

r = 0.894

Strong, positive relationship

d. Label all important parts of the Minitab printout.

The regression equation is
BAC = – 0.0127 + 0.0180 Beers   (LSRL)

Predictor   Coef       StDev      T       P
Constant    – 0.0127   0.01264    –1.00   0.332
Beers       0.017964   0.002402   7.48    0.000

s = 0.02044   R-sq = 80%   R-sq(adj) = 78.6%

– 0.0127 = y-intercept, 0.017964 = slope

0.002402 = SEb

7.48 = test statistic

0.000 = p-value (2-tailed)

s = 0.02044 (standard deviation of the residuals)

R-sq = 80% (coefficient of determination)

e. Verify the results by using your calculator.

Stat – Tests - LinRegTTest

L1: x
L2: y
Leave RegEq blank
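For readers without the calculator, the same quantities can be reproduced from the raw data. The sketch below is a plain-Python stand-in for the slope-related part of the output, not a reimplementation of the TI routine; the function name `lin_reg_t_test` and the dictionary keys are assumptions made for this illustration.

```python
import math

beers = [5, 2, 9, 8, 3, 7, 3, 5, 3, 5, 4, 6, 5, 7, 1, 4]
bac = [0.10, 0.03, 0.19, 0.12, 0.04, 0.095, 0.07, 0.06,
       0.02, 0.05, 0.07, 0.10, 0.085, 0.09, 0.01, 0.05]

def lin_reg_t_test(xs, ys):
    """Return intercept, slope, s, SE_b, t and df for the line y = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_bar - b * x_bar
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(sse / (n - 2))        # standard deviation of the residuals
    se_b = s / math.sqrt(sxx)           # standard error of the slope
    return {"a": a, "b": b, "s": s, "se_b": se_b, "t": b / se_b, "df": n - 2}

result = lin_reg_t_test(beers, bac)
for name, value in result.items():
    print(f"{name:>5} = {value:.6g}")
```

The printed values should agree with the Minitab printout to the number of digits shown there.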

$SE_b = \dfrac{s}{\sqrt{\sum (x - \bar{x})^2}} = \dfrac{0.02044}{\sqrt{72.4375}} = 0.002402$

f. Conduct the hypothesis test to see if there is a positive relationship between # beers and BAC.

P: determine if there is a positive linear relationship between # beers and BAC

H:

Ho: $\beta = 0$   The number of beers has no linear effect on BAC.

Ha: $\beta > 0$   The number of beers has a positive linear effect on BAC.

A: The relationship is linear

No apparent patterns in the residuals

A: The standard deviation of the response about the true line is the same everywhere

Residuals spread out evenly

A: The response varies Normally about the true regression line

N: Line of Regression T-Test

T: $t = \dfrac{b}{SE_b} = \dfrac{0.017964}{0.002402} = 7.48$

O: P(t > 7.48) = ?

df = n – 2 = 16 – 2 = 14

From the table: less than 0.0005

Or, on the calculator:

P(t > 7.48) = 0.000001
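The calculator's tail probability can be approximated without a statistics library by integrating the t density numerically. This is a rough sketch using Simpson's rule; the function names `t_pdf` and `upper_tail` are assumptions, and a library routine such as a t-distribution survival function would normally be used instead.

```python
import math

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def upper_tail(t_obs, df, steps=20000):
    """P(T > t_obs) for t_obs >= 0, as 0.5 minus the area from 0 to t_obs.

    The area is computed with Simpson's rule; steps must be even.
    """
    h = t_obs / steps
    total = t_pdf(0.0, df) + t_pdf(t_obs, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_pdf(i * h, df)
    return 0.5 - total * h / 3

p_value = upper_tail(7.48, df=14)
print(f"P(t > 7.48) with df = 14 is about {p_value:.2e}")
```

The result is on the order of one in a million, consistent with the table bound of less than 0.0005.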

M: p < $\alpha$

0.000001 < 0.05

Reject the null hypothesis

S: There is enough evidence to claim that an increased number of beers does increase BAC.