
Wednesday, August 11 (131 minutes)stevewillott.com/17-18 ap stats notes in word/3 notes 16... · Web view3.1 147, 148 R3.1 Interpret the correlation. 3.1 152 R3.3, R3.4 Understand


Name _____________________________

Chapter 3 Learning Objectives

Learning Objective | Section | Related Example on Page(s) | Relevant Chapter Review Exercise(s) | Can I do this?
Identify explanatory and response variables in situations where one variable helps to explain or influences the other. | 3.1 | 144 | R3.4 |
Make a scatterplot to display the relationship between two quantitative variables. | 3.1 | 145, 148 | R3.4 |
Describe the direction, form, and strength of a relationship displayed in a scatterplot and recognize outliers in a scatterplot. | 3.1 | 147, 148 | R3.1 |
Interpret the correlation. | 3.1 | 152 | R3.3, R3.4 |
Understand the basic properties of correlation, including how the correlation is influenced by outliers. | 3.1 | 152, 156, 157 | R3.1, R3.2 |
Use technology to calculate correlation. | 3.1 | Activity on 152, 171 | R3.4 |
Explain why association does not imply causation. | 3.1 | Discussion on 156, 190 | R3.6 |
Interpret the slope and y intercept of a least-squares regression line. | 3.2 | 166 | R3.2, R3.4 |
Use the least-squares regression line to predict y for a given x. Explain the dangers of extrapolation. | 3.2 | 167, Discussion on 168 | R3.2, R3.4, R3.5 |
Calculate and interpret residuals. | 3.2 | 169 | R3.3, R3.4 |
Explain the concept of least squares. | 3.2 | Discussion on 169 | R3.5 |
Determine the equation of a least-squares regression line using technology or computer output. | 3.2 | Technology Corner on 171, 181 | R3.3, R3.4 |
Construct and interpret residual plots to assess whether a linear model is appropriate. | 3.2 | Discussion on 175, 180 | R3.3, R3.4 |
Interpret the standard deviation of the residuals and $r^2$ and use these values to assess how well the least-squares regression line models the relationship between two variables. | 3.2 | 180 | R3.3, R3.5 |
Describe how the slope, y intercept, standard deviation of the residuals, and $r^2$ are influenced by outliers. | 3.2 | Discussion on 188 | R3.1 |
Find the slope and y intercept of the least-squares regression line from the means and standard deviations of x and y and their correlation. | 3.2 | 183 | R3.5 |


3.1 Describing Relationships

Read 143–144

What is the difference between an explanatory variable and a response variable?

Read 145–149 On scatterplots, the explanatory variable goes on the horizontal axis and we don’t necessarily start each axis at (0,0). Start each axis near the data.

What is the easiest way to lose points when making a scatterplot? (xkcd.com/833)

Track and Field Day! The table below shows data for 13 students in a statistics class. Each member of the class ran a 40-yard sprint and then did a long jump (with a running start). Make a scatterplot of the relationship between sprint time (in seconds) and long jump distance (in inches).

Sprint Time (s):          5.41  5.05  9.49  8.09  7.01  7.17  6.83  6.73  8.01  5.68  5.78  6.31  6.04
Long Jump Distance (in):  171   184   48    151   90    65    94    78    71    130   173   143   141


Four characteristics you should consider when interpreting a scatterplot: “DUFS”

Always:
Form – linear or nonlinear
Direction – positive or negative
Strength – very weak, weak, moderate, strong, very strong, perfect

Sometimes:
Unusual stuff – outliers, clusters

The following scatterplot shows the lengths (in 1000s of feet) and the speeds (in miles/hour) for the 78 roller coasters I rode from Oct. 2007 to Aug. 2009. Describe the relationship between length and speed.

Caution! A strong association between two variables does not automatically indicate a cause-and-effect relationship! Correlation is NOT causation! Examples:
$ of ice cream sales/week in South Carolina vs. # of shark attacks each week in South Carolina
# of kids with cell phones vs. # of kids who pass AP® Stats exam
monthly sunblock sales vs. # of shark attacks monthly
ice cream sales vs. drowning deaths
hand span vs. vocabulary size

Read 150: Using technology to create scatterplots

HW #26: page 159 (1, 5, 7, 9, 11, 13, 34)


3.1 Correlation

Just like two distributions can have the same shape and center with different spreads, two associations can have the same direction and form, but very different strengths. These are both linear and positive, but which one is stronger?

Read 150–151

What is the correlation r? It is a numerical measure of the direction and strength of linear patterns.

Linear? Use r. Not linear? Don’t use r.

Draw an oval around the points. A near circle indicates a correlation near 0 (weak). Long, skinny ovals indicate a correlation closer to +1 or -1 (strong).

Correlation won't tell you the form of a relationship. Graph first, then you might be able to use correlation to describe strength and direction.

Correlation, r, is NOT a resistant measure of strength; outliers will affect it.

The formula for correlation is on the formula packet, so don't memorize it.


Read 154–157

Read 154-155

Eleven important facts about correlation, r
1. Switching x & y won't change r.
2. x and y must be quantitative (if either variable isn't quantitative, use "association").
3. Units don't affect r & r has no units.
4. Correlation does not imply causation.
5. The sign of r tells direction.
6. -1 ≤ r ≤ 1
7. If the scatterplot is linear, then r near ±1 means strong, near 0 means weak.
8. r measures linear relationships only.
9. Correlation is not resistant; outliers will affect r.
10. r won't tell you whether something is linear.
11. r doesn't tell the whole story; identify means and standard deviations for both x and y.
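Facts 1, 3, 5, and 6 can be checked numerically. The following is a minimal sketch, assuming the Track and Field Day data from earlier in these notes and the usual z-score form of the correlation formula; the function name `correlation` and the variable names are illustrative, not from the textbook.

```python
from math import sqrt

def correlation(xs, ys):
    """Correlation r: sum of products of z-scores, divided by n - 1."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    s_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    s_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    z_products = (((x - mean_x) / s_x) * ((y - mean_y) / s_y) for x, y in zip(xs, ys))
    return sum(z_products) / (n - 1)

# Track and Field Day data (sprint time in seconds, long jump in inches)
sprint = [5.41, 5.05, 9.49, 8.09, 7.01, 7.17, 6.83, 6.73, 8.01, 5.68, 5.78, 6.31, 6.04]
jump = [171, 184, 48, 151, 90, 65, 94, 78, 71, 130, 173, 143, 141]

r = correlation(sprint, jump)
r_swapped = correlation(jump, sprint)                 # fact 1: switch x and y
r_feet = correlation(sprint, [d / 12 for d in jump])  # fact 3: inches -> feet

print(round(r, 3), round(r_swapped, 3), round(r_feet, 3))
```

All three printed values should agree, and the negative sign reflects that slower sprinters tend to have shorter jumps.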

HW #27: page 161 (15–18, 21, 27–32)

3.2 Least-Squares Regression

Read 164–168

The general form of a regression equation: $\hat{y} = a + bx$

The difference between y and $\hat{y}$:
y = an observed value, a real data value
$\hat{y}$ = a predicted value, a value from the equation

To interpret slope, you must address the ideas of: rate, units, prediction, & average.

To interpret the intercept, you must address the ideas of: units, prediction when x is 0 (in context), & average.

To predict a y for some x, you must address the ideas of: units, prediction when x is a # (in context), & average.

Used Ford Escapes

The following scatterplot shows the number of miles driven (in thousands) and advertised price (in thousands) for 11 used Ford Escapes from the 2012-2014 model years. The regression line shown on the scatterplot is $\hat{y} = 24600 - 0.114x$.

a) Interpret the slope and y intercept of a regression line.

b) Predict the price of a used Escape with 50,000 miles.


Miles Driven   Price
6000           26998
8000           20998
12000          19599
15000          24999
19000          25998
24000          19599
44000          19998
45000          18599
47000          21599
57000          15599
71000          17599


c) Predict the price of a used Escape with 250,000 miles. How confident are you in this prediction?

Extrapolation can be a bad thing:
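As a rough illustration of why, the sketch below plugs 50,000 and 250,000 miles into the Ford Escape line given in these notes. It assumes x is in miles and y is in dollars, as in the data table; `predicted_price` is an illustrative name.

```python
def predicted_price(miles):
    """Predicted asking price in dollars from the line y-hat = 24600 - 0.114x."""
    return 24600 - 0.114 * miles

# The mileages in the data table run from 6,000 to 71,000.
MIN_MILES, MAX_MILES = 6000, 71000

for miles in (50_000, 250_000):
    inside_data = MIN_MILES <= miles <= MAX_MILES
    label = "within the data" if inside_data else "extrapolation"
    print(f"{miles} miles: predicted price ${predicted_price(miles):,.0f} ({label})")
```

At 250,000 miles the line predicts a negative price, which cannot happen. The line was fit to cars with 6,000 to 71,000 miles and says nothing reliable far outside that range.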

Using the Track and Field data from earlier, the equation of the least-squares regression line is $\hat{y} = 305 - 27.6x$, where y = long jump distance and x = sprint time.

a) Interpret the slope.

b) Does it make sense to interpret the y-intercept? Explain.
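The equation above can be reproduced from the Track and Field Day table. This is a minimal sketch, assuming the slope formula b = r(s_y / s_x) and the fact that the line passes through the point of means; the helper names are illustrative.

```python
from math import sqrt

sprint = [5.41, 5.05, 9.49, 8.09, 7.01, 7.17, 6.83, 6.73, 8.01, 5.68, 5.78, 6.31, 6.04]
jump = [171, 184, 48, 151, 90, 65, 94, 78, 71, 130, 173, 143, 141]

def mean(values):
    return sum(values) / len(values)

def sample_sd(values):
    m = mean(values)
    return sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))

def correlation(xs, ys):
    mx, my, sx, sy = mean(xs), mean(ys), sample_sd(xs), sample_sd(ys)
    return sum((x - mx) / sx * (y - my) / sy for x, y in zip(xs, ys)) / (len(xs) - 1)

r = correlation(sprint, jump)
slope = r * sample_sd(jump) / sample_sd(sprint)   # inches per second of sprint time
intercept = mean(jump) - slope * mean(sprint)     # predicted jump at a sprint time of 0 s

print(f"predicted jump = {intercept:.0f} + ({slope:.1f})(sprint time)")
```

The intercept is the predicted jump for a sprint time of 0 seconds, which no runner can achieve, so it has no meaningful interpretation in this context.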

HW #28 page 193 (35, 37, 39, 41, 45)

3.2 Residuals

Read 168–172

What is a residual?

Interpreting a residual: The residual is how far the actual y is above or below what we predict.

Positive residuals mean points are above the line. Negative residuals mean points are below the line.

[Scatterplot: long jump distance (in) vs. sprint time (seconds), with the line LongJump = (-27.6 in/s)Sprint + 305 in]


Calculate and interpret the residual for the Ford Escape with 57,000 miles and an asking price of $15,599.
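A residual is the observed value minus the predicted value. The sketch below works this one out, assuming the line ŷ = 24600 − 0.114x from these notes with x in miles and y in dollars; the names are illustrative.

```python
def predicted_price(miles):
    """Predicted asking price in dollars from y-hat = 24600 - 0.114x."""
    return 24600 - 0.114 * miles

observed_price = 15599
miles = 57000

# residual = observed y - predicted y
residual = observed_price - predicted_price(miles)
print(round(predicted_price(miles)), round(residual))
```

The residual is negative, so this Escape is advertised for about $2,500 less than the line predicts for a car with 57,000 miles.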

How can we determine the “best” regression line for a set of data?

The least-squares regression line is not resistant to outliers.
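The "best" line, in the least-squares sense, is the one that makes the sum of the squared residuals as small as possible. The sketch below illustrates this with the Ford Escape data by comparing the fitted line with a few other lines; the alternative lines are arbitrary choices for illustration.

```python
miles = [6000, 8000, 12000, 15000, 19000, 24000, 44000, 45000, 47000, 57000, 71000]
price = [26998, 20998, 19599, 24999, 25998, 19599, 19998, 18599, 21599, 15599, 17599]

def sum_squared_residuals(intercept, slope):
    """Sum of (observed - predicted)^2 over all 11 cars for the given line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(miles, price))

# Least-squares slope and intercept from the data
n = len(miles)
mean_x, mean_y = sum(miles) / n, sum(price) / n
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(miles, price))
         / sum((x - mean_x) ** 2 for x in miles))
intercept = mean_y - slope * mean_x

best = sum_squared_residuals(intercept, slope)
shifted_up = sum_squared_residuals(intercept + 1000, slope)  # same slope, higher line
flatter = sum_squared_residuals(intercept, slope * 0.5)      # same intercept, flatter line

print(best < shifted_up, best < flatter)
```

Any line other than the least-squares line gives a larger sum of squared residuals for the same data.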

Example: 12 of Taco Bell’s chicken menu items:

(a) Calculate the equation of the least-squares regression line using technology. Make sure to define variables! Sketch the scatterplot with the graph of the least-squares regression line.

(b) Interpret the slope and y-intercept in context.

(c) Calculate and interpret the residual for the first item listed, the Chicken Burrito Supreme, with 12g of fat and 50g of carbs.

Read 172-178

To know a line is the right model to use, we can examine a residual plot.

What is a residual plot? A scatterplot is a graph of the explanatory variable (x) and the response variable (y). A residual plot is a graph of the explanatory variable (x) and the residuals.


What is the purpose of a residual plot? A residual plot can reveal patterns that were not apparent (not obvious) in the original scatterplot.

What would we see in a residual plot that tells us that our linear model is not a good one?

Bad: A curved pattern, meaning the relationship between x and y is not linear. A line is a bad model to choose.

Bad: Changing vertical spread, meaning predictions will be less accurate for some values of x.

What would we see in a residual plot that tells us that our linear model is appropriate?

Good: A uniform, even, random scatter, meaning any errors in our predictions will be random and about equal for any value of x.

Construct and interpret a residual plot for the Ford Escape data.

HW #29: page 193 (43, 47, 49, 51)

3.2 Standard deviation of the residuals and $r^2$

Read page 177

What is the standard deviation of the residuals? How do you calculate and interpret it?

Calculate and interpret the standard deviation of the residuals for the Ford Escape data.
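One way to carry out this calculation is sketched below. It assumes the usual formula s = sqrt(sum of squared residuals / (n − 2)) and fits the line directly from the 11 data points rather than using the rounded equation; the variable names are illustrative.

```python
from math import sqrt

miles = [6000, 8000, 12000, 15000, 19000, 24000, 44000, 45000, 47000, 57000, 71000]
price = [26998, 20998, 19599, 24999, 25998, 19599, 19998, 18599, 21599, 15599, 17599]

n = len(miles)
mean_x, mean_y = sum(miles) / n, sum(price) / n
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(miles, price))
         / sum((x - mean_x) ** 2 for x in miles))
intercept = mean_y - slope * mean_x

# residual = observed price - predicted price, one per car
residuals = [y - (intercept + slope * x) for x, y in zip(miles, price)]

# standard deviation of the residuals: divide by n - 2, then take the square root
s = sqrt(sum(e ** 2 for e in residuals) / (n - 2))
print(round(s))
```

The value of s is in dollars, the units of the response variable. It estimates the typical distance between an actual asking price and the price predicted by the line.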

Suppose that you see a used Ford Escape for sale. Predict the asking price for this Escape, using the average asking price.

Would our predictions be better if we used miles driven to help estimate the sale price?

Read 178-180


What is the coefficient of determination $r^2$? How do you calculate it?

$r^2$ is a numerical measure of how much better our predictions are by using the regression equation instead of just using the average of the y values. $r^2$ is called the coefficient of determination.

How do you interpret $r^2$? ___% of the variation in ____ (put the response variable there) is accounted for by the LSRL relating ____ (put the response variable there) to ____ (put the explanatory variable there).
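The comparison described above (predicting with the average versus predicting with the regression line) can be sketched for the Ford Escape data as follows. It assumes the formula r² = 1 − (sum of squared residuals from the line) / (sum of squared deviations from the mean); the names are illustrative.

```python
from math import sqrt

miles = [6000, 8000, 12000, 15000, 19000, 24000, 44000, 45000, 47000, 57000, 71000]
price = [26998, 20998, 19599, 24999, 25998, 19599, 19998, 18599, 21599, 15599, 17599]

n = len(miles)
mean_x, mean_y = sum(miles) / n, sum(price) / n
sxx = sum((x - mean_x) ** 2 for x in miles)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(miles, price))
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Prediction errors when we ignore miles and always guess the average price
sst = sum((y - mean_y) ** 2 for y in price)
# Prediction errors when we use the least-squares line
ssr = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(miles, price))

r_squared = 1 - ssr / sst
r = sxy / sqrt(sxx * sst)   # correlation, for comparison with r_squared

print(round(r_squared, 2), round(r, 2))
```

The printed r² is the fraction of the variation in asking price that is accounted for by the line relating price to miles driven, and it equals the square of r.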

How is $r^2$ related to r?

If $r^2 = 0.49$, then $r = \pm 0.7$. To know whether r is +0.7 or -0.7, we will need more information.

Both tell about how well the line fits our data.

Neither tells about form.

How is $r^2$ related to s?

As ___, then ___ (but ___).

$r^2$ has no units because r has no units.

s has the same units as the response variable.

HW #30: page 193 (55, 57)

Interpreting Computer Output

Read 181–182


Several factors contribute to the speed of a roller coaster: largest drop, maximum angle of drop, initial speed at the top of the largest hill, etc. Some parks report both height and drop, others only report the height. To investigate the relationship between height and speed for steel sit-down coasters, output from a regression analysis of 39 such American rides is shown below.

(a) What is the equation of the least-squares regression line? Define any variables you use.

(b) Interpret the slope of the least-squares regression line.

(c) What is the correlation?

(d) Is a linear model appropriate for this data? Explain.

(e) Would you be willing to use the linear model to predict the speed of a roller coaster that is 500 feet tall, if it were built? Explain.

(f) Calculate and interpret the residual for Mr. Freeze, which is 218 ft. tall and has a speed of 70 mph.

Predictor   Coef   SE Coef   T      P
Constant    25     1.13      11.7   0.0000
Height      0.24   0.02      12     0.0000

S = 6.54 R-Sq = 86.0% R-Sq(adj) = 87.2%
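Parts (a), (c), and (f) can be read off the output. The sketch below shows one way to do so, assuming height is in feet and speed is in mph (the units used for Mr. Freeze) and taking the coefficients and R-Sq exactly as printed; the names are illustrative.

```python
from math import sqrt

# "Coef" column of the output: Constant is the y intercept, Height is the slope
intercept, slope = 25, 0.24
r_squared = 0.860   # R-Sq = 86.0%

def predicted_speed(height_ft):
    """Predicted speed (mph) for a coaster of the given height (ft)."""
    return intercept + slope * height_ft

# The slope is positive, so r is the positive square root of R-Sq.
r = sqrt(r_squared)

# Mr. Freeze: 218 ft tall, actual speed 70 mph
residual = 70 - predicted_speed(218)

print(round(r, 3), round(predicted_speed(218), 2), round(residual, 2))
```

The negative residual means Mr. Freeze is slower than the line predicts for a coaster of its height.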


(g) Interpret the values of $r^2$ and s.

(h) If the speed were measured in km/h instead of mph, how would the values of $r^2$ and s be affected? Explain.

HW #31 page 193 (59, 61, 1–9 below)

In addition to other factors affecting climate, temperatures generally tend to decrease as one moves north from the equator to the North Pole. Here are average June high temperatures and latitude for 12 cities along the US east coast.

1. Interpret the scatterplot for this data.

2. Calculate the equation of the least squares regression line.

3. Interpret the slope and y-intercept in the context of the problem.


City              Latitude   Avg. June High
Portland          43.4       73
Boston            42.3       76
Providence        41.8       78
Philadelphia      40.0       83
Baltimore         39.3       85
Washington, DC    38.9       83
Virginia Beach    36.8       84
Wilmington, NC    34.1       87
Savannah          32.0       90
Jacksonville      30.2       90
Daytona           29.1       88
Miami             25.8       89


4. Calculate and interpret the value of the correlation coefficient.

5. If the temperature were measured in Celsius instead of Fahrenheit, how would the correlation change? Explain.

6. Calculate and interpret the residual for Washington, DC.

7. Interpret the residual plot.

8. Identify and interpret the value of s in the context of the problem.

9. Identify and interpret the value of $r^2$ in the context of the problem.
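Several of these questions can be checked with a short computation. The sketch below fits the line to the 12 cities, refits after converting temperatures to Celsius, and computes the residual for Washington, DC. It assumes the June highs are in degrees Fahrenheit; the function `fit` and the variable names are illustrative.

```python
from math import sqrt

latitude = [43.4, 42.3, 41.8, 40.0, 39.3, 38.9, 36.8, 34.1, 32.0, 30.2, 29.1, 25.8]
june_high_f = [73, 76, 78, 83, 85, 83, 84, 87, 90, 90, 88, 89]

def fit(xs, ys):
    """Return (intercept, slope, r) for the least-squares line of ys on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope, sxy / sqrt(sxx * syy)

a_f, b_f, r_f = fit(latitude, june_high_f)

# Same temperatures in Celsius: the slope and intercept change, r does not
june_high_c = [(t - 32) * 5 / 9 for t in june_high_f]
a_c, b_c, r_c = fit(latitude, june_high_c)

# Washington, DC: latitude 38.9, observed average June high 83
dc_residual = 83 - (a_f + b_f * 38.9)

print(round(b_f, 3), round(r_f, 3), round(r_c, 3), round(dc_residual, 2))
```

Because changing units does not change z-scores, the correlation is identical in Fahrenheit and Celsius, while the slope is rescaled by 5/9.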


Regression Wisdom, etc.

Read 186-191

Does it matter which variable is x and which is y?

Which of the following has the highest correlation?

[Figure: four scatterplots to compare]

Examine the hypothetical scatterplot heights & weights of H. S. students below (Form, Direction, Strength?)

Add each of these men and consider: Outlier in height or weight? Still fit the overall pattern? Influential?

Basketball player Shaquille O'Neal (7'1", 325 lbs.)
Sumo wrestler Kasugao Katsumasa (6'0", 340 lbs.)
Basketball player Manute Bol (7'6", 200 lbs.)

How do outliers affect the correlation, least-squares regression line, & standard deviation of the residuals? Are all outliers influential?

Outlier in pattern:
r gets stronger (closer to 1 or -1)
LSRL won't move much
s won't change much

Outlier out of pattern:
r gets weaker (closer to 0)
LSRL will change a lot
s will increase


Read 182-185

To calculate the equation of the least-squares regression line using summary statistics, use these 2 facts:

1. Slope: $b = r \frac{s_y}{s_x}$

2. The LSRL passes through the point $(\bar{x}, \bar{y})$.

Notice that for each increase of 1 standard deviation in x, the predicted value of y increases by r times the standard deviation of y.

example: Given: Mean for x=3, mean for y=7, std. dev. for x=1.5, std. dev. for y=2.5, and r = .89, what is the equation of the LSRL?
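The two facts can be applied to this example as follows. This is a minimal sketch using only the summary statistics given above; the variable names are illustrative.

```python
# Summary statistics given in the example
mean_x, mean_y = 3, 7
sd_x, sd_y = 1.5, 2.5
r = 0.89

# Fact 1: slope = r * (s_y / s_x)
slope = r * sd_y / sd_x

# Fact 2: the line passes through (mean_x, mean_y), so mean_y = intercept + slope * mean_x
intercept = mean_y - slope * mean_x

print(f"y-hat = {intercept:.2f} + {slope:.3f}x")
```

With these numbers the slope is about 1.483 and the intercept is 2.55.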

HW #32 page 193 (65, 71–78) Review Chapter 3

HW #33 page 202 Chapter 3 Review Exercises

Review / FRAPPY!

FRAPPY: 2005 #3

HW #34 page 203 Chapter 3 AP Statistics Practice Test

Chapter 3 Test

A note on predictions: consider these two interpretations and notice how the second one is better than the first one.

“The model tells us that a 6th grader with a forearm circumference of 12 inches should have a 1 rep max of 50.8 pounds.”

“The average 1 rep max for a 6th grader (from the sampled population) with a forearm circumference of 12 inches is expected to be approximately 50.8 pounds.”


Plot, statistic, or equation | Characteristics: Form? | Direction? | Strength? | Shows relationship? | Use for predicting or estimating? | Other facts
Scatterplot | | | | | |
Correlation r | | | | | |
Coefficient of determination $r^2$ | | | | | |
Least-Squares Regression Line y=ax+b or y=b0+b1x | | | | | |
Residual plot | | | | | |
