Correlation: How Strong Is the Linear Relationship?
Lecture 46
Sec. 13.7
Mon, Dec 3, 2007
The Correlation Coefficient
The correlation coefficient r is a number between –1 and +1.
It measures the direction and strength of the linear relationship.
If r > 0, then the relationship is positive.
If r < 0, then the relationship is negative.
The closer r is to +1 or –1, the stronger the relationship.
The closer r is to 0, the weaker the relationship.
Strong Positive Linear Association
[Scatter plot of y against x.] In this display, r is close to +1.
Strong Negative Linear Association
[Scatter plot of y against x.] In this display, r is close to –1.
Almost No Linear Association
[Scatter plot of y against x.] In this display, r is close to 0.
Interpretation of r
[Number line from –1 to 1, marked at –1, –0.8, –0.2, 0, 0.2, 0.8, and 1.]
–1 to –0.8: Strong Negative
–0.8 to –0.2: Weak Negative
–0.2 to 0.2: No Significant Correlation
0.2 to 0.8: Weak Positive
0.8 to 1: Strong Positive
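The regions on the number line can be turned into a small lookup. The following is a minimal sketch, reading the marks as –1, –0.8, –0.2, 0, 0.2, 0.8, and 1. The function name `interpret_r` is hypothetical, and the treatment of values that fall exactly on a cutoff is an assumption, since the slide does not say which region the boundary points belong to.

```python
def interpret_r(r):
    """Label a correlation coefficient using the cutoffs 0.2 and 0.8.

    Assumption: a value exactly at 0.8 counts as strong and a value
    exactly at 0.2 counts as weak; the slide leaves this open.
    """
    if not -1 <= r <= 1:
        raise ValueError("r must lie between -1 and +1")
    size = abs(r)
    if size < 0.2:
        return "no significant correlation"
    strength = "strong" if size >= 0.8 else "weak"
    direction = "positive" if r > 0 else "negative"
    return f"{strength} {direction}"


for value in (0.95, -0.5, 0.1):
    print(value, "->", interpret_r(value))
```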
Correlation vs. Cause and Effect
If the value of r is close to +1 or –1, that indicates that x is a good predictor of y.
It does not indicate that x causes y (or that y causes x).
The correlation coefficient alone cannot be used to determine cause and effect.
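Calculating the Correlation Coefficient
There are many formulas for r.
The most basic formula is
$$ r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \, \sum (y - \bar{y})^2}} $$
Another formula is
$$ r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left(n\sum x^2 - \left(\sum x\right)^2\right)\left(n\sum y^2 - \left(\sum y\right)^2\right)}} $$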
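As a check that the form of r built from deviations about the means and the form built from raw sums describe the same quantity, the sums of deviations can be expanded. This is a short sketch rather than part of the lecture, and it uses only the definitions \bar{x} = \sum x / n and \bar{y} = \sum y / n.

$$
\begin{aligned}
\sum (x - \bar{x})(y - \bar{y}) &= \sum xy - \bar{y}\sum x - \bar{x}\sum y + n\bar{x}\bar{y} = \frac{n\sum xy - \sum x \sum y}{n}, \\
\sum (x - \bar{x})^2 &= \frac{n\sum x^2 - \left(\sum x\right)^2}{n}, \\
\sum (y - \bar{y})^2 &= \frac{n\sum y^2 - \left(\sum y\right)^2}{n}.
\end{aligned}
$$

When the first line is divided by the square root of the product of the other two, the factor 1/n in the numerator cancels against the factor 1/n that comes out of the square root, leaving the raw-sum form.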
Example
Consider again the data
x y
1 8
3 12
4 9
5 14
8 16
9 20
11 17
15 24
Example
We found earlier that
SSX = 150
SSY = 206
SSXY = 165
Example
Then compute r.
$$ r = \frac{165}{\sqrt{150 \cdot 206}} = 0.9387 $$
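The arithmetic in this example can be reproduced directly from the eight data points. The sketch below assumes that SSX, SSY, and SSXY denote the sums of squared deviations and cross-products about the means; the helper name `correlation` is hypothetical.

```python
from math import sqrt

# Data from the example.
x = [1, 3, 4, 5, 8, 9, 11, 15]
y = [8, 12, 9, 14, 16, 20, 17, 24]


def correlation(xs, ys):
    """Return (SSX, SSY, SSXY, r) from deviations about the means."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    ssx = sum((a - x_bar) ** 2 for a in xs)
    ssy = sum((b - y_bar) ** 2 for b in ys)
    ssxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(xs, ys))
    return ssx, ssy, ssxy, ssxy / sqrt(ssx * ssy)


ssx, ssy, ssxy, r = correlation(x, y)
print(ssx, ssy, ssxy, round(r, 4))  # 150.0 206.0 165.0 0.9387
```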
TI-83 – Calculating r
To calculate r on the TI-83,
First, be sure that Diagnostic is turned on.
Press CATALOG and select DiagnosticOn.
Then, follow the procedure that produces the regression line.
In the same window, the TI-83 reports r² and r.
TI-83 – Calculating r
Use the TI-83 to calculate r in the preceding example.
Find r for the S/T Ratio vs. Graduation Rate.
Find r for SOL-Eng Passing Rate vs. Graduation Rate.
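Another Formula for r
It turns out that
$$ r^2 = \frac{\mathrm{SSR}}{\mathrm{SST}} $$
where
$$ \mathrm{SSR} = \sum (\hat{y} - \bar{y})^2, \qquad \mathrm{SST} = \sum (y - \bar{y})^2 $$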
Another Formula for r
For the free-lunch participation vs. graduation rate data,
SSR = 1896.7,
SST = 2598.2.
So we get
$$ r^2 = \frac{1896.7}{2598.2} = 0.7300 $$
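The same ratio can be checked numerically on the smaller eight-point example from earlier in the lecture. This is a sketch under two assumptions: that SSR is the sum of squared deviations of the fitted values from the mean of y, and that the fitted values come from the ordinary least-squares line. The variable names are illustrative only.

```python
# Data from the earlier eight-point example.
x = [1, 3, 4, 5, 8, 9, 11, 15]
y = [8, 12, 9, 14, 16, 20, 17, 24]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

ssx = sum((a - x_bar) ** 2 for a in x)
ssxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))

# Least-squares line: y_hat = intercept + slope * x.
slope = ssxy / ssx
intercept = y_bar - slope * x_bar
y_hat = [intercept + slope * a for a in x]

ssr = sum((p - y_bar) ** 2 for p in y_hat)  # variation explained by the line
sst = sum((b - y_bar) ** 2 for b in y)      # total variation in y
r_squared = ssr / sst

print(round(ssr, 1), round(sst, 1), round(r_squared, 4))
```

For these data the ratio comes out near 0.8811, whose square root agrees with the value r = 0.9387 computed in the earlier example.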
The Coefficient of Determination
r² is called the coefficient of determination.
It is interpreted as telling us how much of the variation in y is determined by the variation in x.
So, 73% of the variation in graduation rates is determined by the variation in participation in the free-lunch program.
The Coefficient of Determination
What percentage of the variation in graduation rate is determined by the variation in S/T ratio?
What percentage of variation in graduation rate is determined by the variation in teachers’ average salary?
How Does r Work?
How does r indicate the direction of the relationship?
Consider the numerator of the formula.
$$ r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \, \sum (y - \bar{y})^2}} $$
How Does r Work?
Consider the lunch vs. graduation data:

District          Free Lunch   Grad. Rate
Amelia               41.2         68.9
Caroline             40.2         62.9
Charles City         45.8         67.7
Chesterfield         22.5         80.5
Colonial Hgts        25.7         73.0
Cumberland           55.3         63.9
Dinwiddie            45.2         71.4
Goochland            23.3         76.3
Hanover              13.7         90.1
Henrico              30.2         81.1
Hopewell             63.1         63.4
King and Queen       59.9         64.1
King William         27.9         67.0
Louisa               44.9         80.1
New Kent             13.9         77.0
Petersburg           61.6         54.6
Powhatan             12.2         89.3
Prince George        30.9         85.0
Richmond             74.0         46.9
Sussex               74.8         59.0
West Point           19.1         82.0
How Does r Work?
Consider the lunch vs. graduation data:

   x       y     x – x̄    y – ȳ    (x – x̄)(y – ȳ)
 41.2    68.9     1.9     -2.7        -5.13
 40.2    62.9     0.9     -8.7        -7.83
 45.8    67.7     6.5     -3.9       -25.35
 22.5    80.5   -16.8      8.9      -149.52
 25.7    73.0   -13.6      1.4       -19.04
 55.3    63.9    16.0     -7.7      -123.20
 45.2    71.4     5.9     -0.2        -1.18
 23.3    76.3   -16.0      4.7       -75.20
 13.7    90.1   -25.6     18.5      -473.60
 30.2    81.1    -9.1      9.5       -86.45
 63.1    63.4    23.8     -8.2      -195.16
(first half)
How Does r Work?
Consider the lunch vs. graduation data:

   x       y     x – x̄    y – ȳ    (x – x̄)(y – ȳ)
 59.9    64.1    20.6     -7.5      -154.50
 27.9    67.0   -11.4     -4.6        52.44
 44.9    80.1     5.6      8.5        47.60
 13.9    77.0   -25.4      5.4      -137.16
 61.6    54.6    22.3    -17.0      -379.10
 12.2    89.3   -27.1     17.7      -479.67
 30.9    85.0    -8.4     13.4      -112.56
 74.0    46.9    34.7    -24.7      -857.09
 74.8    59.0    35.5    -12.6      -447.30
 19.1    82.0   -20.2     10.4      -210.08
(second half)
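The sign of r comes from the sum of the products of deviations: a point contributes a negative product when x and y fall on opposite sides of their means. The sketch below recomputes the products for all 21 districts and lists the ones with a positive product. It uses unrounded means, so individual products differ slightly from the tabulated values, which appear to be based on means rounded to one decimal place. The variable names are illustrative only.

```python
from math import sqrt

# (district, free-lunch rate x, graduation rate y), copied from the data table.
data = [
    ("Amelia", 41.2, 68.9), ("Caroline", 40.2, 62.9),
    ("Charles City", 45.8, 67.7), ("Chesterfield", 22.5, 80.5),
    ("Colonial Hgts", 25.7, 73.0), ("Cumberland", 55.3, 63.9),
    ("Dinwiddie", 45.2, 71.4), ("Goochland", 23.3, 76.3),
    ("Hanover", 13.7, 90.1), ("Henrico", 30.2, 81.1),
    ("Hopewell", 63.1, 63.4), ("King and Queen", 59.9, 64.1),
    ("King William", 27.9, 67.0), ("Louisa", 44.9, 80.1),
    ("New Kent", 13.9, 77.0), ("Petersburg", 61.6, 54.6),
    ("Powhatan", 12.2, 89.3), ("Prince George", 30.9, 85.0),
    ("Richmond", 74.0, 46.9), ("Sussex", 74.8, 59.0),
    ("West Point", 19.1, 82.0),
]

n = len(data)
x_bar = sum(x for _, x, _ in data) / n
y_bar = sum(y for _, _, y in data) / n

# Product of deviations for each district (the last column of the table).
products = {name: (x - x_bar) * (y - y_bar) for name, x, y in data}
positive = sorted(name for name, p in products.items() if p > 0)

numerator = sum(products.values())
ssx = sum((x - x_bar) ** 2 for _, x, _ in data)
ssy = sum((y - y_bar) ** 2 for _, _, y in data)
r = numerator / sqrt(ssx * ssy)

print("positive products:", positive)
print("r =", round(r, 3))
```

Only two districts have positive products; the other 19 are negative, so the numerator is negative and r comes out near –0.854, consistent with r² = 0.7300.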
Scatter Plot
[Scatter plot of Graduation Rate (vertical axis, 50 to 90) against Free Lunch Rate (horizontal axis, 20 to 80).]
The two oddballs