Stat 203 Additional (FOR INTEREST) material. You are not …jackd/Stat203_2011/Wk08_Extra.pdf · 2012. 7. 5. · We’ve been looking at the Pearson correlation r without looking

Stat 203 Additional (FOR INTEREST) material.

You are not responsible for knowing this.

We’ve been looking at the Pearson correlation r without looking at how it’s calculated.

For correlating the response variable to multiple explanatory variables, the easiest way is to use the sum of squares error and total (SSE and SST).
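
As a rough sketch of what that looks like (the symbols R^2, \hat{y}_i and \bar{y} are standard notation assumed here, not taken from the slides), the squared correlation between y and its fitted values can be written as:

```latex
\[
  R^2 = 1 - \frac{SSE}{SST}, \qquad
  SSE = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2, \qquad
  SST = \sum_{i=1}^{n} (y_i - \bar{y})^2
\]
```

With a single explanatory variable, R^2 equals r^2, so r is the square root of R^2 carrying the sign of the slope.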

With only one y variable and one x variable, we have a more direct way.
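
The formula itself is not reproduced in this text. A standard form that matches the descriptions on the following slides, assumed here rather than copied from the original, is:

```latex
\[
  r = \frac{1}{n-1} \sum_{i=1}^{n}
      \left( \frac{x_i - \bar{x}}{s_x} \right)
      \left( \frac{y_i - \bar{y}}{s_y} \right)
\]
```

Here \bar{x} and \bar{y} are the sample means, and s_x and s_y are taken to be the sample standard deviations (n - 1 divisor).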

r is the Pearson correlation coefficient.

n is the sample size.

The parts in the brackets are “how many standard deviations above the x mean” and “how many standard deviations above the y mean”, respectively.

The following notation isn’t exactly right, but it will serve our purposes: zx and zy are the standardized scores of the raw scores x and y.
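
In that notation the correlation is presumably written along these lines (a reconstruction from the definitions above, with s_x and s_y assumed to be sample standard deviations):

```latex
\[
  z_x = \frac{x - \bar{x}}{s_x}, \qquad
  z_y = \frac{y - \bar{y}}{s_y}, \qquad
  r = \frac{1}{n-1} \sum z_x \, z_y
\]
```

The sum runs over the n cases; the case index is dropped, which is the sense in which the notation is not exactly right.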

For a set of 5 dragons, we might have a dataset like this:

Length in cm (x)    Weight in grams (y)
34.3                670
24.8                373
30.0                557
28.7                480
30.9                567

Which produces this scatterplot:

[Scatterplot: weight in grams (y) against length in cm (x)]

If y (weight) increases with x (length), then above-average x values will occur for the same cases as above-average y values, and below-average x values for the same cases as below-average y values.

So zx > 0 usually when zy > 0, and zx < 0 usually when zy < 0.

That means, for most values, (zx)(zy) > 0.

In the correlation formula you’re adding mostly positive numbers, and your correlation will end up positive.

If y decreases as x increases, below-average x occurs with above-average y, and above-average x with below-average y.

So zx < 0 usually when zy > 0, and zx > 0 usually when zy < 0.

That means, for most values, (zx)(zy) < 0.

In the correlation formula you’re adding mostly negative numbers, and your correlation will end up negative.
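
The sign argument can be checked numerically. A minimal sketch, using made-up data (not from the slides) in which y falls steadily as x rises, and assuming z-scores are computed with the sample standard deviation:

```python
from statistics import mean, stdev

# Hypothetical data, invented for illustration: y falls as x rises.
xs = [1, 2, 3, 4, 5]
ys = [10, 8, 6, 4, 2]


def standardize(values):
    """Return z-scores using the sample standard deviation (n - 1 divisor)."""
    centre, spread = mean(values), stdev(values)
    return [(v - centre) / spread for v in values]


# One product per case: below-average x paired with above-average y, and vice versa.
products = [zx * zy for zx, zy in zip(standardize(xs), standardize(ys))]

# Sum of the products divided by n - 1.
r = sum(products) / (len(xs) - 1)

print([round(p, 2) for p in products])
print(round(r, 3))
```

None of the products is positive, so the sum, and therefore r, comes out negative.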

First, standardize the scores.

Length in cm (x)    zx       Weight in grams (y)    zy
34.3                 1.32    670                     1.27
24.8                -1.43    373                    -1.41
30.0                 0.08    557                     0.25
28.7                -0.30    480                    -0.45
30.9                 0.34    567                     0.34
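
The standardizing step can be sketched in a few lines of Python. This assumes the z-scores use the sample standard deviation (n - 1 divisor), which is what `statistics.stdev` computes; the names `lengths_cm`, `weights_g` and `standardize` are illustrative, not from the slides.

```python
from statistics import mean, stdev

# Dragon data from the table above.
lengths_cm = [34.3, 24.8, 30.0, 28.7, 30.9]
weights_g = [670, 373, 557, 480, 567]


def standardize(values):
    """Return z-scores: (value - mean) / sample standard deviation."""
    centre = mean(values)
    spread = stdev(values)  # sample SD, n - 1 divisor
    return [(v - centre) / spread for v in values]


z_x = standardize(lengths_cm)
z_y = standardize(weights_g)

# Print each raw score next to its z-score.
for x, zx, y, zy in zip(lengths_cm, z_x, weights_g, z_y):
    print(f"{x:5.1f}  z = {zx:5.2f}   {y:4d}  z = {zy:5.2f}")
```

The printed z-scores agree with the table to within rounding in the second decimal place.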

Then multiply each pair of z-scores together.

Length in cm (x)    zx       Weight in grams (y)    zy       (zx)(zy)
34.3                 1.32    670                     1.27    1.68
24.8                -1.43    373                    -1.41    2.02
30.0                 0.08    557                     0.25    0.02
28.7                -0.30    480                    -0.45    0.13
30.9                 0.34    567                     0.34    0.11

Then add the multiplied values.

Length in cm (x)    zx       Weight in grams (y)    zy       (zx)(zy)
34.3                 1.32    670                     1.27    1.68
24.8                -1.43    373                    -1.41    2.02
30.0                 0.08    557                     0.25    0.02
28.7                -0.30    480                    -0.45    0.13
30.9                 0.34    567                     0.34    0.11
TOTAL                                                        3.97
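
The multiply-and-add steps can be reproduced as follows. This is a sketch under the same assumption as the tables (z-scores based on the sample standard deviation); the helper name `standardize` is illustrative.

```python
from statistics import mean, stdev

# Dragon data from the table above.
lengths_cm = [34.3, 24.8, 30.0, 28.7, 30.9]
weights_g = [670, 373, 557, 480, 567]


def standardize(values):
    """Return z-scores using the sample standard deviation (n - 1 divisor)."""
    centre, spread = mean(values), stdev(values)
    return [(v - centre) / spread for v in values]


# One product per dragon, then the column total.
products = [
    zx * zy
    for zx, zy in zip(standardize(lengths_cm), standardize(weights_g))
]
total = sum(products)

print([round(p, 2) for p in products])
print(round(total, 2))
```

Every product is positive here, including the one for the shortest, lightest dragon, where both z-scores are negative.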

This does almost all of the formula for us; dividing the total by n - 1 = 4 finishes it.

r = 3.97 / 4 ≈ 0.99, a very strong positive correlation.
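
As a cross-check, the z-score calculation can be compared with a longer but algebraically equivalent deviation form. This is a sketch: the function names are illustrative, and the deviation form used here is an assumption about what "longer form" means, since the textbook's exact version is not shown in the slides.

```python
from math import sqrt
from statistics import mean, stdev

# Dragon data from the table above.
lengths_cm = [34.3, 24.8, 30.0, 28.7, 30.9]
weights_g = [670, 373, 557, 480, 567]


def pearson_from_z(xs, ys):
    """Sum of z-score products divided by n - 1 (sample standard deviations)."""
    n = len(xs)
    mx, my = mean(xs), mean(ys)
    sx, sy = stdev(xs), stdev(ys)
    total = sum(((x - mx) / sx) * ((y - my) / sy) for x, y in zip(xs, ys))
    return total / (n - 1)


def pearson_from_deviations(xs, ys):
    """Cross-deviations divided by the root of the product of squared deviations."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


r_z = pearson_from_z(lengths_cm, weights_g)
r_dev = pearson_from_deviations(lengths_cm, weights_g)

print(f"r from z-scores:   {r_z:.3f}")
print(f"r from deviations: {r_dev:.3f}")
```

Both routes give the same value for the five dragons, about 0.993 at full precision.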

Final note: The correlation formula doesn’t show up in your textbook in this form, but in an equivalent but longer form.

For the equivalence and more information I recommend

http://en.wikipedia.org/wiki/Pearson_product-moment_correlation_coefficient