Correlation

Chapter 9 : Linear Correlation

Page 1: Chapter 9 : Linear Correlation

Correlation

Page 2: Chapter 9 : Linear Correlation

Correlational Research

Correlational research describes the relationship between two or more naturally occurring variables.

– Is age related to political conservatism?
– Are highly extraverted people less afraid of rejection than less extraverted people?
– Is depression correlated with hypochondriasis?
– Is I.Q. related to reaction time?

Page 3: Chapter 9 : Linear Correlation

Measure two variables and determine whether a relationship is present.

predictor <-> criterion

Correlation does not establish causality because of:

– Direction: there is no way to tell which variable is the cause and which is the effect.
– Third variable problem: some third variable that was not measured could be responsible for the relationship.
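The third variable problem can be illustrated with a small simulation. This is only a sketch on made-up data: the choice of age as the unmeasured variable, the variable names, and the noise levels are all assumptions for illustration. Foot size and reading skill are each generated from age plus independent noise, so neither influences the other, yet they still correlate.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson's r from sums of squares and cross-products."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical unmeasured third variable (assumed here to be age).
age = [rng.gauss(0, 1) for _ in range(2000)]

# Each measured variable depends on age plus its own independent noise;
# neither variable appears in the other's formula.
foot_size = [a + rng.gauss(0, 1) for a in age]
reading_skill = [a + rng.gauss(0, 1) for a in age]

r_observed = pearson_r(foot_size, reading_skill)
print(f"r(foot size, reading skill) = {r_observed:.2f}")
```

With this setup the expected correlation is about .50, even though the only link between the two measured variables is the shared third variable.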

Page 4: Chapter 9 : Linear Correlation

3rd variable problem

– red car <-> speeding ticket: ?
– midnight basketball <-> less crime: ?
– vitamins <-> healthier: ?
– larger feet <-> reading skills: ?

Dr. Dimwit

Page 5: Chapter 9 : Linear Correlation

Scales of Measurement and Indicators

Scale of Measurement   Indicator of Central Tendency   Indicator of Variability         Indicator of Association
Nominal/Categorical    Mode                            Variation Ratio                  Cramér’s Phi (ϕc)
Ordinal                Median                          Semi-Interquartile Range (SIQ)   Spearman’s Rho (rs)
Interval/Ratio         Mean                            Standard Deviation               Pearson’s (r)

Page 6: Chapter 9 : Linear Correlation

Correlation Coefficients

Correlation Coefficient   Predictor (X)         Criterion (Y)
Cramér’s Phi (ϕc)         Nominal/Categorical   Nominal/Categorical
Spearman’s Rho (rs)       Ordinal               Ordinal
Pearson’s (r)             Interval or Ratio     Interval or Ratio

Page 7: Chapter 9 : Linear Correlation

Estimate r for each case:

[Figure: four scatterplots]
– No relationship
– Positive linear relationship
– Negative linear relationship
– Curvilinear relationship (linear correlation not appropriate)

Page 8: Chapter 9 : Linear Correlation

• Correlation coefficient (r)

+1.00 = perfect positive correlation
-1.00 = perfect negative correlation
0 = no linear correlation

|r| = magnitude of relationship
sign of r = direction of relationship
r² = proportion of variance in Y explained by X

Page 9: Chapter 9 : Linear Correlation

Types of correlation coefficients

Pearson’s correlation coefficient: linear relationship between two interval / ratio variables.

Spearman’s rank-order correlation: relationship between two variables measured using ordinal (ranked) scores. It is Pearson’s r computed on the ranks, so it captures any consistently increasing or decreasing (monotonic) relationship.

Point-biserial correlation: linear relationship between the scores from one continuous variable and one dichotomous (0 or 1) variable.
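The three coefficients can all be computed with the same Pearson routine. The sketch below assumes made-up data and hypothetical helper names (`average_ranks`, `spearman_rho`): Spearman's rho is obtained by ranking the scores first, and the point-biserial correlation by coding the dichotomous variable as 0 or 1.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r from sums of squares and cross-products."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of tied values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(xs, ys):
    """Spearman's rho: Pearson's r applied to the ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# Made-up data: y always rises with x, but not in a straight line.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
rho = spearman_rho(x, y)
r_linear = pearson_r(x, y)

# Point-biserial: Pearson's r with one variable coded 0 or 1 (made-up scores).
group = [0, 0, 0, 1, 1, 1]
score = [2, 3, 4, 6, 7, 8]
r_pb = pearson_r(group, score)

print(f"Spearman rho = {rho:.2f}, Pearson r = {r_linear:.2f}, point-biserial r = {r_pb:.2f}")
```

Because the made-up y values keep the same rank order as x, rho reaches 1 while Pearson's r on the raw scores stays slightly below 1.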

Page 10: Chapter 9 : Linear Correlation

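Conceptual Formula for Pearson’s correlation:

Positive r: z’s from x and y have the same sign.
Negative r: z’s from x and y have different signs.

[Figure: two scatterplots, each divided into quadrants by Neg zx | Pos zx (horizontal axis) and Neg zy | Pos zy (vertical axis)]

$$r_{population} = \frac{\sum z_x z_y}{N}$$

$$z_x = \frac{X_i - \bar{X}}{s_x} \qquad z_y = \frac{Y_i - \bar{Y}}{s_y}$$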

Page 11: Chapter 9 : Linear Correlation

Research question: Is education about other ethnicities correlated with tolerant attitudes towards others?

$$\bar{X} = 31.5 \qquad \bar{Y} = 11.1$$

Sx = 6.22 Sy = 3.41

$$z_x = \frac{X_i - \bar{X}}{s_x} \qquad z_y = \frac{Y_i - \bar{Y}}{s_y} \qquad s_x = \sqrt{\frac{\sum (X_i - \bar{X})^2}{n - 1}}$$

Education Score (X)   Tolerance Score (Y)   Zx      Zy      ZxZy
25                    3                     -1.05   -2.38   2.50
25                    9                     -1.05   -.62    .65
33                    14                    .24     .85     .20
35                    11                    .56     -.03    -.02
38                    13                    1.05    .56     .59
36                    14                    .72     .85     .61
31                    12                    -.08    .26     -.02
29                    12                    -.40    .26     -.10
22                    9                     -1.53   -.62    .95
41                    14                    1.53    .85     1.30
Sum: 315              111                                   6.66

Page 12: Chapter 9 : Linear Correlation

$$r = \frac{\sum z_x z_y}{n - 1} = \frac{6.66}{10 - 1} = .74$$
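The z-score calculation for the education and tolerance data can be reproduced as a short script. This is a sketch using the ten score pairs from the table; the function names are chosen here for illustration. Working with unrounded z-scores gives r of about .73, slightly below the .74 obtained from z-scores rounded to two decimals.

```python
import math

# The ten score pairs from the example table.
education = [25, 25, 33, 35, 38, 36, 31, 29, 22, 41]
tolerance = [3, 9, 14, 11, 13, 14, 12, 12, 9, 14]

def mean(values):
    return sum(values) / len(values)

def sample_sd(values):
    """Standard deviation with n - 1 in the denominator."""
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))

def z_scores(values):
    m, s = mean(values), sample_sd(values)
    return [(v - m) / s for v in values]

zx = z_scores(education)
zy = z_scores(tolerance)
n = len(education)

# Sum of the z-score products divided by n - 1.
r = sum(a * b for a, b in zip(zx, zy)) / (n - 1)

print(f"mean X = {mean(education):.1f}, mean Y = {mean(tolerance):.1f}")
print(f"Sx = {sample_sd(education):.2f}, Sy = {sample_sd(tolerance):.2f}")
print(f"r = {r:.2f}")
```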

Could this be (1) due to chance, such as random error, or

(2) very UNLIKELY to occur due to chance (< 5%)?

Inferential statistics are needed.

[Figure: scatterplot of Tolerance Score (y-axis, 0 to 16) against Education Score (x-axis, 0 to 50)]

Page 13: Chapter 9 : Linear Correlation

Testing Pearson’s r for significance

H0: ρ = 0 x<->y association does not exist

Ha: ρ ≠ 0 x<->y association exists (non-directional)

Using the t distribution:

$$t = \frac{r\sqrt{N - 2}}{\sqrt{1 - r^2}}$$

Using the table of critical values:

df = N – 2 (N is the number of pairs of scores)
= 10 – 2 = 8
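As a quick check of the formula, the conversion from r to t can be written as a small function. This is a sketch; the function name `t_from_r` is hypothetical. Plugging in r = .74 and N = 10 pairs gives t of about 3.11 on 8 degrees of freedom.

```python
import math

def t_from_r(r, n_pairs):
    """t statistic for testing H0: rho = 0, returned with df = n_pairs - 2."""
    df = n_pairs - 2
    t = r * math.sqrt(df) / math.sqrt(1 - r ** 2)
    return t, df

t_value, df = t_from_r(0.74, 10)
print(f"t({df}) = {t_value:.2f}")
```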

Page 14: Chapter 9 : Linear Correlation

Using a t-table

Ha: an association exists between education & tolerance (two-tailed)
alpha = .05
df = N – 2 = 10 – 2 = 8

Critical values of t (two-tailed):

df   p = 0.05   p = 0.01
1    12.71      63.66
2    4.30       9.92
3    3.18       5.84
4    2.78       4.60
5    2.57       4.03
6    2.45       3.71
7    2.36       3.50
8    2.31       3.36
9    2.26       3.25
10   2.23       3.17
11   2.20       3.11
12   2.18       3.05
13   2.16       3.01
14   2.14       2.98

If |t| > 2.31, reject H0, left with Ha.
If |t| <= 2.31, retain H0.
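The table lookup and decision rule can be sketched as follows. The dictionary holds the two-tailed critical values from the table, and the function name `decide` is hypothetical. The absolute value of t is compared with the critical value, so a large negative t also leads to rejecting H0.

```python
# Two-tailed critical values of t, keyed by df, then by alpha.
T_CRITICAL = {
    1: {0.05: 12.71, 0.01: 63.66},
    2: {0.05: 4.30, 0.01: 9.92},
    3: {0.05: 3.18, 0.01: 5.84},
    4: {0.05: 2.78, 0.01: 4.60},
    5: {0.05: 2.57, 0.01: 4.03},
    6: {0.05: 2.45, 0.01: 3.71},
    7: {0.05: 2.36, 0.01: 3.50},
    8: {0.05: 2.31, 0.01: 3.36},
    9: {0.05: 2.26, 0.01: 3.25},
    10: {0.05: 2.23, 0.01: 3.17},
    11: {0.05: 2.20, 0.01: 3.11},
    12: {0.05: 2.18, 0.01: 3.05},
    13: {0.05: 2.16, 0.01: 3.01},
    14: {0.05: 2.14, 0.01: 2.98},
}

def decide(t_value, df, alpha=0.05):
    """Two-tailed decision: reject H0 only if |t| exceeds the critical value."""
    critical = T_CRITICAL[df][alpha]  # KeyError if df or alpha is not in the table
    return "reject H0" if abs(t_value) > critical else "retain H0"

print(decide(2.5, 8))              # |2.5| > 2.31
print(decide(2.5, 8, alpha=0.01))  # |2.5| <= 3.36
```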

Page 15: Chapter 9 : Linear Correlation

Hypotheses

Directional hypothesis – Ha states whether the correlation is expected to be positive or negative (one-tailed test appropriate).

Nondirectional hypothesis – Ha states that there is an association, but does not specify the direction (two-tailed test appropriate).

[Figure: t distributions with df = 8 and α level = .05, marked at t = -2.0, t = +2.0, and t = -1.67]

Page 16: Chapter 9 : Linear Correlation

Our example

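$$t = \frac{r\sqrt{N - 2}}{\sqrt{1 - r^2}} = \frac{.74\sqrt{10 - 2}}{\sqrt{1 - .74^2}} = 3.11$$

tr = 3.11 df = 10 – 2 = 8

tcrit (df = 8, two-tailed) = 2.31 at alpha = .05; 3.36 at alpha = .01

APA Style: r(df) = value obtained, p = .##

r(8) = .74, p < .05 (t = 3.11 exceeds the .05 critical value of 2.31 but not the .01 critical value of 3.36)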

Page 17: Chapter 9 : Linear Correlation

Hypothesis Testing

Rejecting the null hypothesis – concluding that the null hypothesis is wrong. This leaves us with the alternative hypothesis (Ha) that there is an association between predictor and criterion.

Failing to reject the null hypothesis – concluding that the null hypothesis (no association) remains a plausible possibility.

We do not “accept” the null hypothesis (H0), because the null hypothesis can never be proven.

Page 18: Chapter 9 : Linear Correlation

Errors

• Type I error – a researcher rejects the null hypothesis when it is true (a false positive)
– Alpha – the probability of a Type I error (most commonly alpha = .05).

• Type II error – a researcher fails to reject the null hypothesis when it is false (a false negative)
– Beta – the probability of a Type II error (most commonly beta = .20).
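The meaning of alpha can be checked by simulation. The sketch below assumes normally distributed scores, 10 pairs per study, and the two-tailed critical value of 2.31 for df = 8; the seed and number of trials are arbitrary. Every simulated study draws two unrelated variables, so any rejection of H0 is a Type I error, and the rejection rate should come out near .05.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson's r from sums of squares and cross-products."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)

rng = random.Random(1)  # fixed seed so the run is repeatable
n_pairs = 10
critical_t = 2.31       # two-tailed, alpha = .05, df = 8
trials = 4000

false_positives = 0
for _ in range(trials):
    # H0 is true by construction: x and y are generated independently.
    x = [rng.gauss(0, 1) for _ in range(n_pairs)]
    y = [rng.gauss(0, 1) for _ in range(n_pairs)]
    r = pearson_r(x, y)
    t = r * math.sqrt(n_pairs - 2) / math.sqrt(1 - r ** 2)
    if abs(t) > critical_t:
        false_positives += 1

type_i_rate = false_positives / trials
print(f"Proportion of studies rejecting a true H0: {type_i_rate:.3f}")
```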

Page 19: Chapter 9 : Linear Correlation

Statistical Decisions and Outcomes

                                  Reality (unknown)
Statistical Decision              Null hypothesis false                                      Null hypothesis true
Reject null hypothesis            Correct: correlation exists                                Type I Error (α): incorrectly conclude there is a correlation
Fail to reject (we retain) null   Type II Error (β): incorrectly conclude no correlation     Correct: correlation does not exist

Page 20: Chapter 9 : Linear Correlation

Power

• Power is the probability that a study will detect effects that are really present (correctly reject the null hypothesis).

• Power = 1-beta. Typically set at .80, or 80% chance of observing an effect when present.

• Power analysis is used to decide how many participants are needed to detect a significant effect, since increasing participants increases power.

Page 21: Chapter 9 : Linear Correlation

Power Table: power for sample size n (rows) and correlation r (columns)

n      .10   .20     .30     .40     .50     .60     .70     .80   .90
15     .06   .11     .19     .32     .50     .70     .88     .98   >.995
30     .08   .16     .37     .61     .83     .95     >.995
50     .11   .29     .57     .83     .97     >.995
100    .17   .52     .86     .99     >.995
200    .29   .81     .99     >.995
1000   .89   >.995

Power has a direct impact on the likelihood of success, and a power analysis is often required for master’s and dissertation proposals and for fellowship and grant applications.

Know your power, use your power!
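Power for a planned sample size can be approximated in a few lines. The sketch below uses the Fisher z approximation, which is an assumption made here and not necessarily the method behind the power table, so its values will not match the table exactly. The function name `approx_power` is hypothetical.

```python
import math
from statistics import NormalDist

def approx_power(rho, n, alpha=0.05):
    """Approximate two-tailed power for testing H0: rho = 0 (Fisher z method)."""
    normal = NormalDist()
    z_crit = normal.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic when the true correlation is rho.
    shift = math.atanh(rho) * math.sqrt(n - 3)
    # Probability of landing in either rejection region.
    return normal.cdf(shift - z_crit) + normal.cdf(-shift - z_crit)

for n in (15, 30, 50, 100, 200):
    print(f"n = {n:4d}: power to detect r = .30 is about {approx_power(0.30, n):.2f}")
```

Power rises with n for a fixed r, which is why a power analysis is used to choose the number of participants before data collection.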

Page 22: Chapter 9 : Linear Correlation

Effect Size

Effect size: how strongly variables are related to each other.

Coefficient of determination (r²): the proportion of variability in the criterion that is accounted for by the predictor (range: .00 to 1.00). One indicator of effect size.

r² = .74² = .55

55% of the variance in the criterion (tolerance) is explained by the predictor (education).

Page 23: Chapter 9 : Linear Correlation

Limitations

• Pearson’s r only measures the degree of linear correlation.

• Problems in generalizing from sample correlations
– Restricted or truncated ranges (results in smaller magnitude correlation)
– Bivariate outliers

Page 24: Chapter 9 : Linear Correlation

RESTRICTION OF RANGE

[Figure: scatterplot with a restricted portion of the range marked]

Full range: r = .60
Restricted range: r = .20

Restriction of range often decreases r
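Restriction of range can be demonstrated on simulated data. In this sketch the population correlation of .60, the sample size, and the cutoff (keeping only cases with x above 1 SD) are all assumptions chosen for illustration. The correlation is computed once on the full sample and once on the restricted subsample.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson's r from sums of squares and cross-products."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)

rng = random.Random(2)  # fixed seed so the run is repeatable
rho = 0.6               # assumed population correlation

# Generate pairs whose population correlation is rho.
x = [rng.gauss(0, 1) for _ in range(5000)]
y = [rho * xi + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1) for xi in x]
r_full = pearson_r(x, y)

# Restrict the range: keep only cases scoring more than 1 SD above the mean on x.
kept = [(xi, yi) for xi, yi in zip(x, y) if xi > 1.0]
r_restricted = pearson_r([k[0] for k in kept], [k[1] for k in kept])

print(f"Full range:       r = {r_full:.2f} (n = {len(x)})")
print(f"Restricted range: r = {r_restricted:.2f} (n = {len(kept)})")
```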

Page 25: Chapter 9 : Linear Correlation

Marital Satisfaction Over Time

[Figure: line graph of Marital Satisfaction (y-axis) by Years of Marriage (x-axis, 1 to 10), with separate lines for Wife and Husband]

Page 26: Chapter 9 : Linear Correlation

Marital Satisfaction Over Time

[Figure: Marital Satisfaction (y-axis) across stages plotted along the Years of Marriage axis: No Child, Infant, Preschool, School, Adolescent, Young Adult, Empty Nest, Retirement]

Previous slide data showed restriction of range!

Page 27: Chapter 9 : Linear Correlation

Outliers

• An outlier is a score that is so deviant from the rest of the data that one can question whether it belongs in the data set.

• Rule of thumb: more than +/- 3 SD from the mean.

• On-line outliers fall in the same pattern as the rest of the data, artificially inflating r.

• Off-line outliers fall outside of the pattern of the rest of the data, artificially deflating r.
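The effect of a single outlier on r can be seen with a tiny made-up data set. In this sketch the six base points and the two outlier positions are invented for illustration: one extra point continues the existing upward pattern (on-line), the other lies far from it (off-line).

```python
import math

def pearson_r(xs, ys):
    """Pearson's r from sums of squares and cross-products."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)

# Made-up base data with a moderately strong positive relationship.
base_x = [1, 2, 3, 4, 5, 6]
base_y = [2, 1, 4, 3, 6, 5]
r_base = pearson_r(base_x, base_y)

# On-line outlier: extreme on both variables, in line with the upward trend.
r_on = pearson_r(base_x + [20], base_y + [20])

# Off-line outlier: extreme on x but with a low y, against the trend.
r_off = pearson_r(base_x + [20], base_y + [1])

print(f"No outlier:       r = {r_base:.2f}")
print(f"On-line outlier:  r = {r_on:.2f}")
print(f"Off-line outlier: r = {r_off:.2f}")
```

One added point out of seven is enough to push r close to 1 in the on-line case and below zero in the off-line case.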

Page 28: Chapter 9 : Linear Correlation

IMPACT OF OUTLIERS ON CORRELATION

[Figure: two scatterplots, one with an on-line outlier and one with an off-line outlier]

Page 29: Chapter 9 : Linear Correlation

Assumptions of the significance test

Independent random sampling

Normal distribution (and bivariate normal distribution)

Interval or ratio scale variables

Page 30: Chapter 9 : Linear Correlation

SPSS

Pearson’s r:

Analyze → Correlate → Bivariate

Select the variables you wish to correlate and place them in the Variables box

Make sure Pearson is checked

Choose one-tailed or two-tailed

OK

Page 31: Chapter 9 : Linear Correlation

SPSS

Scatter Plot:

Graphs → Legacy Dialogs → Scatter/Dot

Select Simple Scatter

Select variables for the X and Y axes

OK

Note: Select 3D Scatter to look at bivariate

normal assumption for r.