Linear Correlation

PSYC 6130, PROF. J. ELDER



Perfect Correlation

• Two variables x and y are perfectly correlated if they are related by an affine transform

y = ax + b, with a ≠ 0

• The correlation is positive if a > 0 and negative if a < 0.

• As a corollary, two variables are perfectly positively correlated if and only if each pair of corresponding values has the same z-score.

• If the two variables are perfectly negatively correlated, corresponding z-scores will be equal in magnitude but opposite in sign.
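
The z-score property can be checked numerically. The sketch below is only an illustration: the data values and the helper name `z_scores` are made up, and population standard deviations are used (the same conclusion holds with sample standard deviations).

```python
import math

def z_scores(values):
    """Population z-scores: (value - mean) / population standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]

x = [1.0, 2.0, 4.0, 7.0, 11.0]           # made-up illustrative data
y_pos = [3.0 * v + 2.0 for v in x]       # affine transform with a = 3 > 0
y_neg = [-0.5 * v + 10.0 for v in x]     # affine transform with a = -0.5 < 0

zx, zy_pos, zy_neg = z_scores(x), z_scores(y_pos), z_scores(y_neg)

# a > 0: corresponding z-scores match; a < 0: they match with the sign flipped.
same = all(math.isclose(p, q, abs_tol=1e-12) for p, q in zip(zx, zy_pos))
opposite = all(math.isclose(p, -q, abs_tol=1e-12) for p, q in zip(zx, zy_neg))
print(same, opposite)
```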


Pearson’s r

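Pearson's correlation coefficient

for a population:

\[ \rho = \frac{\sum z_x z_y}{N}, \quad \text{where } z_x = \frac{X - \mu_X}{\sigma_X} \text{ and } z_y = \frac{Y - \mu_Y}{\sigma_Y} \]

for a sample:

\[ r = \frac{\sum z_x z_y}{N - 1}, \quad \text{where } z_x = \frac{X - \bar{X}}{s_X} \text{ and } z_y = \frac{Y - \bar{Y}}{s_Y} \]

r = 1 if X and Y are perfectly positively correlated

r = -1 if X and Y are perfectly negatively correlated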
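
As a rough illustration of the z-score definition, the following sketch computes the coefficient as the average product of z-scores, dividing by N - 1 for a sample or N for a population. The function name `pearson_r` and the data are assumptions made for the example. Because the same divisor appears in the standard deviations and in the average, the two versions return the same number for a given data set.

```python
import math

def pearson_r(xs, ys, sample=True):
    """Mean product of z-scores; divisor is N - 1 (sample) or N (population)."""
    n = len(xs)
    denom = n - 1 if sample else n
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / denom)
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / denom)
    return sum(((x - mean_x) / sd_x) * ((y - mean_y) / sd_y)
               for x, y in zip(xs, ys)) / denom

xs = [1.0, 2.0, 3.0, 4.0, 5.0]   # made-up data
ys = [2.0, 1.0, 4.0, 3.0, 5.0]

r_sample = pearson_r(xs, ys, sample=True)
rho_population = pearson_r(xs, ys, sample=False)
r_perfect = pearson_r(xs, [2.0 * x + 1.0 for x in xs])   # y = 2x + 1
print(r_sample, rho_population, r_perfect)
```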


Scatterplots

[Figure: scatterplot for PSYC 6130 Section A 2005-2006, Assignment 2 Grade against Assignment 1 Grade. Fitted line: y = 0.9678x + 0.86; r² = 0.5576; r = 0.75.]


Pearson’s r only measures linear dependence

• Two variables can have low correlation and still be highly dependent.
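
A small made-up example makes the point concrete: below, y is completely determined by x (y = x²), yet the linear correlation is zero because the x values are symmetric about zero. The helper `pearson_r` is written for this sketch only.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r from sums of squares and cross-products."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

xs = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]   # symmetric about zero
ys = [x ** 2 for x in xs]                     # y depends entirely on x

r = pearson_r(xs, ys)
print(r)   # no linear relationship despite perfect dependence
```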


Higher-Order Models

[Figure: CSE 3101 2006F, Assignment 2 (%) against Assignment 1 (%). Linear fit: y = 0.74x + 20, r² = 0.40. Quadratic fit: y = 0.0065x² - 0.15x + 48, r² = 0.410.]

[Figure: PSYC 6130A 2005, Final Exam (%) against Fall Exam (%). Linear fit: y = 0.53x + 0.45, r² = 0.38. Quadratic fit: y = -1.4x² + 2.3x - 0.11, r² = 0.46.]


Pearson’s r depends on the range of the variables under study

• r² measures the proportion of variance in one variable accounted for by the other.

• If the range of variable X is restricted, it will account for less of the variance in Y.

[Figure: PSYC 6130 Section A 2005-2006, Assignment 2 Grade against Assignment 1 Grade. Fitted line: y = 0.867x + 0.105; r² = 0.640; r = 0.80.]

[Figure: PSYC 6130 Section A 2005-2006 (A+ Grades Only), Assignment 2 Grade against Assignment 1 Grade. Fitted line: y = 0.144x + 0.799; r² = 0.020; r = 0.14.]
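
The effect of range restriction can be reproduced with simulated data. This is a sketch under stated assumptions, not the course data: X is standard normal, Y = X plus independent standard normal noise, and the "restricted" sample keeps only cases with X above 1. The seed, sample size and cutoff are arbitrary choices.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson's r from sums of squares and cross-products."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

random.seed(0)                                      # arbitrary seed
xs = [random.gauss(0.0, 1.0) for _ in range(5000)]
ys = [x + random.gauss(0.0, 1.0) for x in xs]       # same noise level everywhere

r_full = pearson_r(xs, ys)

# Keep only the top of the X range; the noise in Y is unchanged,
# so X now accounts for a smaller share of the variance in Y.
kept = [(x, y) for x, y in zip(xs, ys) if x > 1.0]
r_restricted = pearson_r([p[0] for p in kept], [p[1] for p in kept])

print(round(r_full, 2), round(r_restricted, 2))
```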


Pearson’s r is Sensitive to Outliers

[Figure: PSYC 6130 Section A 2005-2006, Assignment 2 Grade against Assignment 1 Grade. Fitted line: y = 0.867x + 0.105; r² = 0.640; r = 0.80.]

[Figure: the same data with one outlier (a fake student) added. Fitted line: y = 0.40x + 0.539; r² = 0.164; r = 0.40.]
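
The same sensitivity shows up in a toy data set. The numbers below are invented for illustration (they are not the class grades): ten nearly collinear points, followed by the same points with a single discordant case appended.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r from sums of squares and cross-products."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up data lying close to the line y = x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
ys = [1.2, 1.9, 3.1, 4.2, 4.8, 6.1, 6.9, 8.2, 9.1, 9.8]

r_without = pearson_r(xs, ys)

# One hypothetical outlier: a high X paired with a very low Y.
r_with = pearson_r(xs + [10.0], ys + [1.0])

print(round(r_without, 3), round(r_with, 3))
```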


Standard Definition of Correlation (Population)

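Recall that the population variance $\sigma_X^2$ of $X$ is

\[ \sigma_X^2 = E\left[(X - \mu_X)^2\right] = \frac{1}{N}\sum (X - \mu_X)^2 \]

We define the population covariance $\sigma_{XY}$ of $X$ and $Y$ as

\[ \sigma_{XY} = E\left[(X - \mu_X)(Y - \mu_Y)\right] = \frac{1}{N}\sum (X - \mu_X)(Y - \mu_Y) \]

The Pearson correlation $\rho$ between $X$ and $Y$ is then given by

\[ \rho = \frac{\sigma_{XY}}{\sigma_X \sigma_Y} \]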


Standard Definition of Correlation (Sample)

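Recall that the sample variance $s_X^2$ of $X$ is

\[ s_X^2 = \frac{1}{N - 1}\sum (X - \bar{X})^2 \]

We define the sample covariance $s_{XY}$ of $X$ and $Y$ as

\[ s_{XY} = \frac{1}{N - 1}\sum (X - \bar{X})(Y - \bar{Y}) \]

The Pearson correlation $r$ between $X$ and $Y$ is then given by

\[ r = \frac{s_{XY}}{s_X s_Y} \]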


Computational Formula

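For a population:

\[ \rho = \frac{\sum z_x z_y}{N} = \frac{\sum (X - \mu_X)(Y - \mu_Y)}{N \sigma_X \sigma_Y} = \frac{\frac{\sum XY}{N} - \mu_X \mu_Y}{\sigma_X \sigma_Y} \]

The numerator of the last expression is the covariance.

For a sample:

\[ r = \frac{\sum z_x z_y}{N - 1} = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{(N - 1)\, s_X s_Y} = \frac{\frac{1}{N - 1}\left(\sum XY - N \bar{X} \bar{Y}\right)}{s_X s_Y} \]

The numerator of the last expression is the unbiased covariance.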


Example: 6130A 2005-2006 Assignment Marks

                  Assignment 1   Assignment 2
                  X              Y              XY
                  0.8671         0.8176         0.7089
                  0.8150         0.8239         0.6715
                  0.8497         0.8428         0.7161
                  0.8555         0.8679         0.7425
                  0.9017         0.8365         0.7543
                  0.9538         0.8742         0.8338
                  0.9191         0.9308         0.8555
                  0.9306         0.9308         0.8663
                  0.9480         0.9182         0.8705
                  0.9364         0.9371         0.8775
                  0.9480         0.9308         0.8824
                  0.9422         0.9434         0.8889
                  0.9480         0.9560         0.9062

Mean              0.9088         0.8931         0.8134
Pop. Std. Dev.    0.04473        0.04847
Sample Std. Dev.  0.04655        0.05045
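
The computational formula can be applied to these marks directly. The sketch below uses the thirteen (X, Y) pairs from the table, computes the sample r from the sum of products, and cross-checks it against the z-score definition; the variable names are choices made for this example. Since the tabulated values are rounded to four decimals, the result is only expected to agree approximately with the r of about 0.80 reported on the scatterplot for this class.

```python
import math

x = [0.8671, 0.8150, 0.8497, 0.8555, 0.9017, 0.9538, 0.9191,
     0.9306, 0.9480, 0.9364, 0.9480, 0.9422, 0.9480]   # Assignment 1
y = [0.8176, 0.8239, 0.8428, 0.8679, 0.8365, 0.8742, 0.9308,
     0.9308, 0.9182, 0.9371, 0.9308, 0.9434, 0.9560]   # Assignment 2

n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n
s_x = math.sqrt(sum((v - mean_x) ** 2 for v in x) / (n - 1))   # sample std. dev.
s_y = math.sqrt(sum((v - mean_y) ** 2 for v in y) / (n - 1))

# Computational formula: unbiased covariance from the sum of products XY.
sum_xy = sum(a * b for a, b in zip(x, y))
cov_xy = (sum_xy - n * mean_x * mean_y) / (n - 1)
r_computational = cov_xy / (s_x * s_y)

# Definition: sum of z-score products divided by N - 1.
r_definition = sum(((a - mean_x) / s_x) * ((b - mean_y) / s_y)
                   for a, b in zip(x, y)) / (n - 1)

print(round(mean_x, 4), round(mean_y, 4), round(r_computational, 2))
```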


End of Lecture 7

Wed, Oct 29 2008


Correlation and the Power of Matched Tests


Correlation and the Power of Matched t-tests

• Now that we understand correlation, we can better understand the power of matched t-tests when scores in the two conditions are correlated.

[Figure: PSYC 6130 Section A 2005-2006, Assignment 2 Grade against Assignment 1 Grade. Fitted line: y = 0.867x + 0.105; r² = 0.640.]


Recall formulae for standard error for independent and matched tests

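• Independent t-test:

\[ s_{\bar{X}_1 - \bar{X}_2}^2 = \frac{1}{n}\left(s_1^2 + s_2^2\right) \]

• Matched t-test:

\[ s_{\bar{D}}^2 = \frac{s_1^2 + s_2^2}{n} - \frac{2 r s_1 s_2}{n} \]

For the purpose of power calculations, assume homogeneity of variance:

\[ E[s_1^2] = E[s_2^2] = \sigma^2. \]

Then:

\[ \sigma_{\bar{X}_1 - \bar{X}_2}^2 = \frac{2\sigma^2}{n} \]

\[ \sigma_{\bar{D}}^2 = \frac{2\sigma^2}{n} - \frac{2\rho\sigma^2}{n} = \frac{2\sigma^2 (1 - \rho)}{n} \]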
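
The matched-test standard error can be verified numerically. In the sketch below (made-up paired scores, helper names chosen for the example), the standard error of the mean difference computed directly from the difference scores is compared with the value from s₁² + s₂² - 2·r·s₁·s₂, all divided by n. With a positive correlation it comes out smaller than the independent-samples standard error.

```python
import math

def mean(values):
    return sum(values) / len(values)

def sample_cov(a, b):
    """Unbiased sample covariance (divides by n - 1)."""
    ma, mb = mean(a), mean(b)
    return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / (len(a) - 1)

# Made-up paired scores for two conditions.
x1 = [12.0, 15.0, 11.0, 18.0, 14.0, 16.0]
x2 = [10.0, 14.0, 10.0, 15.0, 13.0, 13.0]
n = len(x1)

s1 = math.sqrt(sample_cov(x1, x1))     # sample std. dev. of condition 1
s2 = math.sqrt(sample_cov(x2, x2))     # sample std. dev. of condition 2
r = sample_cov(x1, x2) / (s1 * s2)     # Pearson's r between conditions

# Standard error of the mean difference, computed from the differences.
d = [a - b for a, b in zip(x1, x2)]
se_matched_direct = math.sqrt(sample_cov(d, d) / n)

# The same quantity from the formula, and the independent-samples version.
se_matched_formula = math.sqrt((s1 ** 2 + s2 ** 2 - 2 * r * s1 * s2) / n)
se_independent = math.sqrt((s1 ** 2 + s2 ** 2) / n)

print(round(r, 3), round(se_matched_formula, 3), round(se_independent, 3))
```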


Knowing the expected std error, we can estimate the expected t-value

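• Independent t-test:

\[ E[t] \approx \frac{\mu_1 - \mu_2}{\sigma_{\bar{X}_1 - \bar{X}_2}} = \frac{\mu_1 - \mu_2}{\sigma\sqrt{2/n}} = \sqrt{\frac{n}{2}}\, d = \delta_{independent} \]

• Matched t-test:

\[ E[t] \approx \frac{\mu_1 - \mu_2}{\sigma_{\bar{D}}} = \frac{\mu_1 - \mu_2}{\sigma\sqrt{2(1 - \rho)/n}} = \frac{1}{\sqrt{1 - \rho}} \sqrt{\frac{n}{2}}\, d = \delta_{matched} \]

where $d = (\mu_1 - \mu_2)/\sigma$ is the effect size.

Thus

\[ \delta_{matched} = \frac{1}{\sqrt{1 - \rho}}\, \delta_{independent} \]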


The power of matched t-tests

• Large positive correlations between scores in the two conditions will mean a greater expected t-score for the matched design.

• But keep in mind that the critical value for the matched design will be somewhat larger as well, due to a smaller df.

• Which test is more powerful is decided by the exact tradeoff between these two effects.

\[ \delta_{matched} = \frac{1}{\sqrt{1 - \rho}}\, \delta_{independent} \]

\[ df_{independent} = 2(n - 1), \qquad df_{matched} = n - 1 \]
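
To see both sides of the tradeoff for a particular design, the expected t-values and degrees of freedom can be tabulated together. This is a sketch with hypothetical inputs (n = 20 per condition, effect size d = 0.5, ρ = 0.6); it assumes equal variances as in the derivation, writes the expected t-values as δ, and does not look up critical values.

```python
import math

def delta_independent(n, d):
    """Expected t for the independent design: sqrt(n / 2) * d."""
    return math.sqrt(n / 2.0) * d

def delta_matched(n, d, rho):
    """Expected t for the matched design: delta_independent / sqrt(1 - rho)."""
    return delta_independent(n, d) / math.sqrt(1.0 - rho)

n, d, rho = 20, 0.5, 0.6        # hypothetical design values

d_ind = delta_independent(n, d)
d_mat = delta_matched(n, d, rho)
df_ind = 2 * (n - 1)            # independent design
df_mat = n - 1                  # matched design: fewer df, larger critical t

print(round(d_ind, 3), df_ind)
print(round(d_mat, 3), df_mat)
```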


Applying Correlation Analysis


Adjusted Correlation Coefficient

Although the sample covariance $s_{XY}$ is an unbiased estimator of the population covariance $\sigma_{XY}$, the sample correlation coefficient $r$ is not an unbiased estimator of the population correlation coefficient $\rho$.

A less-biased estimate of $\rho$ is given by the adjusted correlation coefficient $r_{adj}$:

\[ r_{adj} = \sqrt{1 - \frac{(1 - r^2)(N - 1)}{N - 2}} \]
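
A minimal sketch of the adjustment, r_adj = sqrt(1 - (1 - r²)(N - 1)/(N - 2)), follows. The function name is invented for the example, and returning 0 when the quantity under the square root is negative (which happens for small r and small N) is an assumption of this sketch rather than something stated on the slide. The inputs r = 0.80 and N = 13 match the assignment-marks example.

```python
import math

def adjusted_r(r, n):
    """Adjusted correlation: sqrt(1 - (1 - r^2)(N - 1) / (N - 2))."""
    radicand = 1.0 - (1.0 - r ** 2) * (n - 1) / (n - 2)
    if radicand < 0.0:
        return 0.0   # assumption: treat a negative radicand as zero correlation
    return math.sqrt(radicand)

r_adj = adjusted_r(0.80, 13)
print(round(r_adj, 4))   # slightly smaller in magnitude than the raw r
```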


Testing Pearson’s r for Significance

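When $\rho = 0$, $r$ is approximately $t$-distributed on $N - 2$ degrees of freedom, with standard deviation

\[ s_r = \sqrt{\frac{1 - r^2}{N - 2}} \]

Thus

\[ t = \frac{r}{\sqrt{\dfrac{1 - r^2}{N - 2}}} = \frac{r\sqrt{N - 2}}{\sqrt{1 - r^2}} \]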
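
As a worked instance of the test statistic t = r·sqrt(N - 2)/sqrt(1 - r²), the sketch below uses r = 0.80 and N = 13 from the assignment-marks example. The function name is hypothetical, and the p-value lookup is left to a t-table or statistics package.

```python
import math

def t_for_r(r, n):
    """t statistic for H0: rho = 0, on n - 2 degrees of freedom."""
    df = n - 2
    t = r * math.sqrt(df) / math.sqrt(1.0 - r ** 2)
    return t, df

t, df = t_for_r(0.80, 13)
print(round(t, 3), df)
```

For comparison, standard t-tables give a two-tailed .05 critical value of 2.201 on 11 degrees of freedom.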


Underlying Assumptions (For Inference)

• Independent random sampling

• Bivariate normal distribution

[Figure: a bivariate normal distribution, shown as a scatterplot of Y against X and as a probability density surface over X and Y.]


Applications of Pearson’s r

• Measuring reliability and validity

– Examples:

• Test-retest reliability

• Split-half reliability

• Inter-rater reliability

• Criterion validity of self-report (correlate self-report against behavioural measure)

• Correlation between tests that are supposed to measure the same thing.

• Correlation between algorithmic model and human responses in behavioural studies.

• Measuring relationships between variables (correlational studies)

– e.g., frequency of cannabis and alcohol use

• Measuring relationships between IVs and DVs (experimental studies, when the IV is on an interval/ratio scale)

– e.g., exam performance as a function of alcohol consumption on previous night.


Power Analysis for Pearson’s r

Let $\rho_A$ = expected correlation under the alternate hypothesis.

Then

\[ E[t] \approx \rho_A \sqrt{N - 1} \]
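
The relation E[t] ≈ ρ_A·sqrt(N - 1) can be used in either direction: to get the expected t for a planned N, or to solve for the N needed to reach a target expected t. The sketch below does both. The function names, ρ_A = 0.3 and the target of 2.8 are assumptions for illustration; 2.8 is roughly 1.96 + 0.84, which under a normal approximation corresponds to about 80% power for a two-tailed test at α = .05.

```python
import math

def expected_t(rho_a, n):
    """Expected t under the alternate hypothesis: rho_A * sqrt(N - 1)."""
    return rho_a * math.sqrt(n - 1)

def required_n(rho_a, target_t):
    """Smallest N with rho_A * sqrt(N - 1) >= target_t."""
    return math.ceil((target_t / rho_a) ** 2 + 1)

delta = expected_t(0.3, 50)        # hypothetical: rho_A = 0.3, N = 50
n_needed = required_n(0.3, 2.8)    # hypothetical target expected t of 2.8

print(round(delta, 2), n_needed)
```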


Confidence Intervals for Pearson’s r

• Pearson’s r is bounded on [-1, 1].

• Consequently, the sampling distribution for r is not normal.

• The sampling distribution for ρ > 0 is negatively skewed.

• The sampling distribution for ρ < 0 is positively skewed.

• Thus confidence intervals are generally not symmetric.


Fisher Transform

• Fisher transform (Appendix r′): Method for symmetrizing r to facilitate calculation of confidence interval using standard normal table.

\[ r' = \frac{1}{2} \ln \frac{1 + r}{1 - r} \]

$r'$ is approximately normally distributed around $\rho'$ with standard deviation

\[ s_{r'} = \frac{1}{\sqrt{N - 3}} \]


Confidence Intervals on r

\[ \rho' = r' \pm z_{\alpha/2}\, s_{r'} \]

Note that since $s_{r'}$ does not depend on statistics computed from the sample, $r'$ approximates a normal distribution, not a $t$-distribution, and we can use the $z$-table for our calculations.

The inverse transform can then be applied to convert the confidence limits back to $\rho$:

\[ r = \frac{e^{2r'} - 1}{e^{2r'} + 1} \]

These transforms can also be done directly using Appendix r′.
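
The whole procedure (transform, add and subtract z·s, transform back) fits in a few lines. The sketch below is one way to do it with the Python standard library; `fisher_ci` is a name invented here, `math.atanh` and `math.tanh` are used because they equal the Fisher transform and its inverse, and r = 0.80 with N = 13 is taken from the assignment-marks example.

```python
import math
from statistics import NormalDist

def fisher_ci(r, n, confidence=0.95):
    """Confidence interval for rho via the Fisher transform."""
    r_prime = math.atanh(r)                  # 0.5 * ln((1 + r) / (1 - r))
    s_r_prime = 1.0 / math.sqrt(n - 3)       # standard deviation of r'
    z_crit = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    lo_prime = r_prime - z_crit * s_r_prime
    hi_prime = r_prime + z_crit * s_r_prime
    # Inverse transform: (e^(2r') - 1) / (e^(2r') + 1) = tanh(r')
    return math.tanh(lo_prime), math.tanh(hi_prime)

r, n = 0.80, 13
lo, hi = fisher_ci(r, n)
print(round(lo, 3), round(hi, 3))
```

The interval is not symmetric about r: the lower limit lies much further from 0.80 than the upper limit does, consistent with the negative skew of the sampling distribution for positive ρ.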


End of Lecture 8

Nov 5 2008


Testing Difference of Pearson Correlations from 2 Independent Samples

• Converting the skewed r distribution to an (approximately) normal distribution allows straightforward two-sample testing:

\[ z = \frac{r_1' - r_2'}{s_{r_1' - r_2'}} \]

where

\[ s_{r_1' - r_2'} = \sqrt{\frac{1}{N_1 - 3} + \frac{1}{N_2 - 3}} \]

(Remember that variances add.)


Example

[Figure: COSC 3101 2003-04 Section N, Final Exam Grade (%) against Midterm Exam Grade (%); r = 0.303.]

[Figure: COSC 3101 2003-04 Section M, Final Exam Grade (%) against Midterm Exam Grade (%); r = 0.4486.]

Sample sizes: N = 43 and N = 44.
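
A sketch of the two-sample test for this example follows: each r is Fisher-transformed, and the difference is divided by sqrt(1/(N₁ - 3) + 1/(N₂ - 3)). It assumes that N = 43 belongs to Section N and N = 44 to Section M (the order in which the values appear); because the two sizes are nearly equal, swapping them would barely change the result. The function names are hypothetical.

```python
import math
from statistics import NormalDist

def fisher(r):
    """Fisher transform r' = 0.5 * ln((1 + r) / (1 - r))."""
    return 0.5 * math.log((1.0 + r) / (1.0 - r))

def compare_correlations(r1, n1, r2, n2):
    """z statistic and two-tailed p for H0: rho1 = rho2 (independent samples)."""
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))   # variances add
    z = (fisher(r1) - fisher(r2)) / se
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p

# Section N: r = 0.303 (assumed N = 43); Section M: r = 0.4486 (assumed N = 44).
z, p = compare_correlations(0.303, 43, 0.4486, 44)
print(round(z, 3), round(p, 3))
```

Under these assumptions |z| falls well short of 1.96, so the difference between the two sections' correlations would not be significant at the .05 level.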