MATH& 146 Lesson 36 Section 5.2 Correlation



Correlation

If the ordered pairs (x,y) tend to follow a straight-

line path, there is linear correlation, or

correlation for short.


Positive Correlation

A positive correlation is a relationship between two

variables where if one variable increases, the other

one also increases.

• The more time you spend running on a treadmill, the

more calories you will tend to burn.

• Taller people tend to have larger shoe sizes.

• The longer your hair grows, the more shampoo you

will tend to need.


Negative Correlation

A negative correlation means that there is an inverse relationship between two variables: when one variable increases, the other tends to decrease.

• A student who has many absences tends to have

lower grades.

• As weather gets colder, heating costs tend to increase.

• The older a man gets, the less hair that he tends to

have.


Correlation Coefficient

If you suspect a correlation exists between two

variables, it can be measured with the correlation

coefficient.

The correlation coefficient, r or R, measures the strength and direction of the linear relationship in a scatter plot, all in one number between –1 and +1.


Example 1

Using the 8 graphs below, describe the relationship

between a scatterplot and its correlation coefficient.


The Correlation Coefficient

Calculating R involves multiplying the z-scores for

the x- and y-coordinates of each ordered pair,

adding up the products, and dividing by n – 1.

$$R = \frac{1}{n-1} \sum z_x z_y$$
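The z-score recipe above can be sketched in a few lines of Python. This is a minimal illustration, not the course's official method: the function name `correlation` and the `hours` and `score` data are hypothetical, and the sample standard deviation (dividing by n – 1) is assumed for the z-scores.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Return R as the sum of z-score products divided by n - 1."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations (divide by n - 1)
    # Multiply the z-scores of the x- and y-coordinates of each ordered pair
    z_products = [((x - x_bar) / s_x) * ((y - y_bar) / s_y)
                  for x, y in zip(xs, ys)]
    return sum(z_products) / (n - 1)

# Hypothetical data, not taken from the lesson
hours = [1, 2, 3, 4, 5]
score = [52, 60, 61, 70, 75]
r = correlation(hours, score)
print(round(r, 4))
```

Because the points rise almost in a straight line, the result lands close to +1.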


The Correlation Coefficient

Fortunately, while we can compute the correlation

using the formula, we will usually perform the

calculations on a computer or calculator.

For graphing calculators, the correlation coefficient

can also be computed with the LinRegTTest

command.


Example 2

Use the formula to find the correlation coefficient.

The mean of the x-values is 2 and the mean of the y-values is 5. The standard deviation of each is 1.


Example 3

Calculate the linear correlation coefficient, R, for

the following set of data.

x    y
2    5
8    7
5    6
3    4
6    8


Correlation Coefficient

One way to interpret R is in terms of focus. Values

close to +1 or –1 will have very clear pictures. Values

close to 0 will often look like a vague swarm of dots.

[Figure: scatterplot panels labeled "Unfocused" and "Focused"]

No Correlation

However, a correlation coefficient close to zero can

mean other things, such as a nonlinear pattern.

It can't be emphasized enough that the scatter plot

must be analyzed first before making any conclusions

about correlation.
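A small sketch of why the scatter plot matters: the points below lie exactly on a parabola, so y is completely determined by x, yet R comes out to zero. The data and the helper `correlation` are illustrative assumptions, not part of the lesson. The helper uses the sums-of-deviations form of R, which is algebraically the same as the z-score formula.

```python
from math import sqrt

def correlation(xs, ys):
    """Pearson's R computed from sums of deviations from the means."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)

# Hypothetical data: a perfect but nonlinear (quadratic) pattern
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r = correlation(xs, ys)
print(r)  # zero linear correlation despite a perfect curved relationship
```

The positive and negative halves of the parabola cancel each other in the sum, which is why R alone cannot distinguish "no pattern" from "a pattern that is not a line".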


Example 4

Match the calculated correlations to the corresponding

scatterplot.

a) R = 0.49

b) R = –0.48

c) R = –0.03

d) R = –0.85


Example 5

It appears no straight line would fit any of the

datasets represented in the graphs. Try drawing

nonlinear curves on each plot. Once you create a

curve for each, describe what is important in your

fit.


Possible Explanations for Correlation

"Correlation does not imply causation" is a phrase

used in science and statistics to emphasize that a

correlation between two variables does not necessarily

imply that one causes the other.

For example, every person who learned math in the

17th century is dead. However, learning math does not

necessarily cause death!

Correlations can help us search for cause-and-effect

relationships. But causality is not the only possible

explanation for a correlation.


Possible Explanations for Correlation

For example, children with bigger feet tend to read

better than children with smaller feet, but bigger feet

will not cause a child to be a better reader. In this

case, there is an underlying cause: children with

bigger feet also tend to be older and have been in

school longer.

For another example, the more firemen fighting a fire, the bigger the fire is observed to be. However, having more firemen does not cause the fire to grow.


Possible Explanations for Correlation

1) The correlation may be a coincidence.

2) Both variables might be directly influenced by

some common underlying cause.

3) One of the correlated variables may actually be

a cause of the other. Note that, even in this

case, we may have identified only one of

several causes.
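The second explanation, a common underlying cause, can be illustrated with simulated data based on the foot-size and reading example. Everything in this sketch is an assumption made for illustration: the model, the numbers, the seed, and the names `ages`, `foot`, and `reading` are invented, not measured. Age drives both variables, and foot length has no effect on reading at all.

```python
import random
from math import sqrt

def correlation(xs, ys):
    """Pearson's R computed from sums of deviations from the means."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)

random.seed(146)  # fixed seed so the run is repeatable

# Made-up model: age is the underlying cause of both variables
ages = [random.uniform(5, 12) for _ in range(20000)]
foot = [15 + 1.0 * a + random.gauss(0, 1) for a in ages]   # hypothetical cm
reading = [10 * a + random.gauss(0, 10) for a in ages]     # hypothetical score

# Across all children, foot length and reading score move together
r_all = correlation(foot, reading)

# Hold the underlying cause roughly fixed: only children aged 8 to 8.5
band = [(f, s) for a, f, s in zip(ages, foot, reading) if 8 <= a < 8.5]
r_band = correlation([f for f, _ in band], [s for _, s in band])

print(round(r_all, 2), round(r_band, 2))
```

In this setup the overall correlation is strong, while the correlation among children of nearly the same age is close to zero, which is what a common underlying cause would predict.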


Coincidence

A coincidence is a sequence of events that, although

accidental, seem to have been planned, arranged, or

correlated. Some examples include:

• As stock prices go up, skirt lengths get shorter.

• As Internet Explorer's market share has decreased, so has the U.S. murder rate.

• As the number of pirates in the world has decreased, global temperatures have increased.


Underlying Cause

An underlying cause (sometimes referred to as a "lurking variable" or "confounding factor") is an overlooked variable that influences both observed variables, making them appear to be directly related. Some examples include:

• Ice cream sales rise and fall with the number of shark attacks on swimmers.

• Sleeping with one's shoes on is strongly correlated

with waking up with a headache.


Causation

We say one event causes a second event when the

second event is a consequence of the first. Some

examples include:

• Flossing regularly reduces gingivitis.

• Outdoor temperature determines how fast a cricket

chirps.

• Flattering your statistics instructor will give you

better grades.


Example 6

Describe a possible explanation for each

correlation.

a) When I exercise regularly, I tend to lose weight.

b) Cities with more homicides also tend to have

more churches.

c) With a decrease in the wearing of hats, there

has been an increase in global warming over

the same period.


Coefficient of Determination

If we are convinced that the association we are

examining is linear, then the regression line provides

the best numerical summary of the relationship. But

how good is "best"?

A common way to explain the strength of a linear fit is the coefficient of determination, R². Literally, it is the correlation squared, and it describes the proportion of variation in the response that is explained by the least squares line.


Example 7

At a concert, the number of tattoos, X, and the number of piercings, Y, that a person had was recorded:

Find the regression line for the data. What are R and R²?

X    Y
2    5
8    7
5    6
3    4
6    8
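One way to check this example by computer is sketched below. It assumes the usual least squares formulas (slope = S_xy / S_xx, intercept = ȳ – slope · x̄) and uses the table's five data pairs. The variable names are illustrative only.

```python
from math import sqrt

tattoos = [2, 8, 5, 3, 6]     # X values from the table
piercings = [5, 7, 6, 4, 8]   # Y values from the table

n = len(tattoos)
x_bar = sum(tattoos) / n
y_bar = sum(piercings) / n

# Sums of squared deviations and cross-products
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(tattoos, piercings))
s_xx = sum((x - x_bar) ** 2 for x in tattoos)
s_yy = sum((y - y_bar) ** 2 for y in piercings)

slope = s_xy / s_xx
intercept = y_bar - slope * x_bar
r = s_xy / sqrt(s_xx * s_yy)

print(f"y = {intercept:.4f} + {slope:.4f}x")      # y = 3.4737 + 0.5263x
print(f"R = {r:.4f}, R^2 = {r ** 2:.4f}")         # R = 0.7947, R^2 = 0.6316
```

On a graphing calculator, the same values should appear in the output of the LinRegTTest command.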


Coefficient of Determination

Because R is always between –1 and 1, R-squared is always between 0 and 1. Often, R-squared is written as a percent.

A value of 100% means the relationship is

perfectly linear and the regression line perfectly

predicts the observations.

A value of 0% means there is no linear relationship

and the regression line does a very poor job.


Coefficient of Determination

In the previous example, the correlation between x

and y was R = 0.7947.

So the coefficient of determination is 0.7947² ≈ 0.6316, which we report as 63.2%.


Coefficient of Determination

What does this value of 63.2% mean?

A useful interpretation of R-squared is that it

measures how much of the variation in the

response variable is explained by the explanatory

variable. For example, 63.2% of the variation in

the y-values can be explained by the x-values.
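The "variation explained" reading can be made concrete by comparing the variation left over after fitting the line with the total variation in y. The sketch below assumes the 63.2% figure refers to the tattoo and piercing data from Example 7, and that "variation" means sums of squared deviations. The names `ss_total` and `ss_residual` are illustrative.

```python
xs = [2, 8, 5, 3, 6]   # tattoos (explanatory variable)
ys = [5, 7, 6, 4, 8]   # piercings (response variable)

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Least squares line
slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
intercept = y_bar - slope * x_bar
predicted = [intercept + slope * x for x in xs]

# Total variation in y, and the variation the line leaves unexplained
ss_total = sum((y - y_bar) ** 2 for y in ys)
ss_residual = sum((y - p) ** 2 for y, p in zip(ys, predicted))

r_squared = 1 - ss_residual / ss_total
print(f"{r_squared:.1%}")  # 63.2%
```

The fraction of variation that is not left in the residuals matches the squared correlation, which is why the two descriptions of R-squared agree.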


Example 8

If a linear model has a strong negative relationship

with a correlation of –0.87, how much of the

variation in the response is explained by the

explanatory variable?
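A worked version of the arithmetic, assuming the question asks for the coefficient of determination expressed as a percent:

```latex
R^2 = (-0.87)^2 = 0.7569 \approx 75.7\%
```

Squaring removes the sign, so a correlation of +0.87 would explain the same share of the variation.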
