Comparing r and b How to Choose, Moving From One to the Other, and Sampling Distributions


Questions

• How is the raw score slope related to the correlation coefficient?

• Describe a concrete situation where two groups have the same correlation between two variables but different slopes.

• Describe a concrete situation where you would prefer r to b.

• Describe the sampling distribution of r. Include bias, sampling variance, skew, sample size, power.

• Draw a picture and describe the sampling distribution of the regression line.


Slope Estimates

• r is b when X and Y are z scores.

• A significance test of the difference between groups means 2 different things for r and for b.

• Males: r = .30, SX = 50, SY = 1, b = .006.

• Females: r = .60, SX = 100, SY = 1, b = .006.

• What if r = .60 for both groups? b = ?

$$b_{Y.X} = r_{XY}\,\frac{S_Y}{S_X}$$

With correlation, there is only standardized slope. With regression, there is slope, intercept and standard error of prediction.
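The conversion between r and b can be checked with a few lines of arithmetic. This is a minimal sketch in Python, assuming the relation b = r(SY/SX) from this slide; the function names `slope_from_r` and `r_from_slope` are hypothetical and chosen only for illustration.

```python
def slope_from_r(r, s_x, s_y):
    """Raw-score slope of Y on X: b = r * S_Y / S_X."""
    return r * s_y / s_x


def r_from_slope(b, s_x, s_y):
    """Correlation recovered from the raw-score slope: r = b * S_X / S_Y."""
    return b * s_x / s_y


# Values from the slide: different correlations, same raw-score slope.
b_males = slope_from_r(0.30, s_x=50, s_y=1)
b_females = slope_from_r(0.60, s_x=100, s_y=1)

# The "what if" case: same correlation in both groups, different slopes.
b_males_same_r = slope_from_r(0.60, s_x=50, s_y=1)
b_females_same_r = slope_from_r(0.60, s_x=100, s_y=1)

# With z scores both standard deviations are 1, so the slope equals r.
b_z_scores = slope_from_r(0.30, s_x=1, s_y=1)

print(b_males, b_females)
print(b_males_same_r, b_females_same_r)
print(b_z_scores)
```

With r = .60 in both groups, the male slope comes out at about .012 and the female slope at about .006, because SX differs between the groups.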


Choice Between r and b

• Always report the correlation matrix with means (M) and standard deviations (SD) so readers can choose.

• Correlation to show strength of association between variables or across settings

• Regression for prediction problems

• If units have meaning, may want regression (consider slope and intercept), e.g., SAT = 0; change in GPA, graduation rate, etc.


Sampling Distribution of r

Sampling distribution depends on N and ρ.

[Figure: Sampling Distribution of r, 1000 Samples of N=50, rho = .30. Histogram of r (x-axis, -0.2 to 0.8) against frequency (y-axis, 0 to 80), with rho = .30 and r = .27 marked.]

M=.295, slight bias.

Slight negative skew.

Big power problem.

With N = 50, the critical value of r is about .27, so about half of the observed values of r will not be significant; power is about .5. Correlations and samples of this size are common in psychology.
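The critical value and the power figure can be roughly reproduced. The sketch below uses the Fisher z approximation (an assumption on my part, not necessarily how the slide's numbers were obtained) with a two-tailed alpha of .05; `critical_r` and `power_r` are hypothetical helper names.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def critical_r(n, alpha=0.05):
    """Approximate two-tailed critical value of r under rho = 0 (Fisher z)."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    return math.tanh(z_crit / math.sqrt(n - 3))


def power_r(rho, n, alpha=0.05):
    """Approximate power of the two-tailed test of rho = 0 (Fisher z)."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = math.atanh(rho) * math.sqrt(n - 3)
    upper_tail = 1 - STD_NORMAL.cdf(z_crit - shift)
    lower_tail = STD_NORMAL.cdf(-z_crit - shift)
    return upper_tail + lower_tail


print(round(critical_r(50), 3))
print(round(power_r(0.30, 50), 3))
```

Under this approximation the critical value lands near .28 and the power a little above .5, in line with the rounded figures on the slide.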

$$\sigma_r^2 = \frac{(1-\rho^2)^2}{N-1}$$
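A small simulation can stand in for the histogram: draw 1000 samples of N = 50 from a bivariate normal population with rho = .30 and compare the mean and standard deviation of r with the usual large-sample approximation, variance of r ≈ (1 − ρ²)²/(N − 1). This is a sketch using only the Python standard library; the seed and the helper names `pearson_r` and `sample_r` are arbitrary choices for illustration, so the exact numbers will differ from the slide's M = .295.

```python
import math
import random
import statistics


def pearson_r(xs, ys):
    """Pearson correlation computed from sums of squares and cross products."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def sample_r(n, rho, rng):
    """Draw n bivariate-normal pairs with correlation rho and return r."""
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(0, 1)
        e = rng.gauss(0, 1)
        xs.append(x)
        ys.append(rho * x + math.sqrt(1 - rho ** 2) * e)
    return pearson_r(xs, ys)


N, RHO, REPS = 50, 0.30, 1000
rng = random.Random(0)  # fixed seed so the run is repeatable
rs = [sample_r(N, RHO, rng) for _ in range(REPS)]

mean_r = statistics.fmean(rs)
sd_r = statistics.stdev(rs)
sd_formula = (1 - RHO ** 2) / math.sqrt(N - 1)  # square root of the variance formula

print(f"mean r = {mean_r:.3f}, sd of r = {sd_r:.3f}, formula sd = {sd_formula:.3f}")
```

The formula gives a standard deviation of .13 for these settings, and the simulated value should fall close to it.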


Sampling Distribution of r (2)

[Figure: Sampling Distribution of r, 1000 Samples of N = 50, rho = .80. Histogram of r (x-axis, 0.5 to 1.0) against frequency (y-axis, 0 to 100), with rho = .80 marked.]

M=.795, slight bias. A little more negative skew.

No power problem.

Uncommon situation unless you are estimating reliability. Correlation and regression demand large samples for significant results unless the effects of the IV are very large. Large effects are not common in most areas of psychology (social science generally).
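To put a number on "large samples", the sketch below estimates the N needed to detect a given rho. It assumes the Fisher z approximation, a two-tailed alpha of .05, and a target power of .80; the power target and the function name `n_for_power` are my assumptions, not values taken from the slides.

```python
import math
from statistics import NormalDist


def n_for_power(rho, power=0.80, alpha=0.05):
    """Approximate N needed to reject rho = 0 with the requested power."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    # Fisher z has standard error 1 / sqrt(N - 3), hence the "+ 3".
    return math.ceil(((z_alpha + z_beta) / math.atanh(rho)) ** 2 + 3)


for rho in (0.10, 0.30, 0.80):
    print(rho, n_for_power(rho))
```

Under these assumptions a correlation of .30 needs a sample in the mid-80s, while .80 needs only about ten cases and .10 needs several hundred.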


Empirical Sampling Distributions, rho = .0 to rho = .9

[Figure: ten histogram panels, one each for rho = 0, .1, .2, .3, .4, .5, .6, .7, .8, and .9. Each plots r (x-axis, -1.0 to 1.0) against frequency (y-axis, 0 to 800; 0 to 1000 for rho = .9).]


Sampling Distribution of Regression Line

[Figure: Sampling Distribution of a Regression Line, Y = -11+1.6X, N=100. Fitted lines plotted with X on the horizontal axis (0 to 20) and Y on the vertical axis (-20 to 20). rho=.8; Sigma(Y) = 2, Sigma(X) = 1; Mu(Y) = 5, Mu(X) = 10.]

Note the fan shape. You will see this in the line’s confidence interval.

The means of X and Y are typically pretty well estimated. The line always goes through (Xbar, Ybar).

A small difference in the slope has little impact on the line close to the mean of X, but more and more impact the farther we get from the mean of X. Note the relation to leverage.
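The fan shape can be reproduced numerically. The sketch below draws repeated samples of N = 100 from a bivariate normal population with the parameters given for the figure (rho = .8, Mu(X) = 10, Mu(Y) = 5, Sigma(X) = 1, Sigma(Y) = 2), fits a least-squares line to each sample, and compares the spread of predicted Y at the mean of X with the spread far from it. The number of replications (300), the seed, the comparison point X = 14, and all function names are arbitrary choices for illustration.

```python
import math
import random
import statistics

MU_X, MU_Y = 10.0, 5.0
SIGMA_X, SIGMA_Y = 1.0, 2.0
RHO, N, REPS = 0.8, 100, 300


def draw_sample(rng):
    """One sample of N (X, Y) pairs from the bivariate normal population."""
    xs, ys = [], []
    for _ in range(N):
        zx = rng.gauss(0, 1)
        zy = RHO * zx + math.sqrt(1 - RHO ** 2) * rng.gauss(0, 1)
        xs.append(MU_X + SIGMA_X * zx)
        ys.append(MU_Y + SIGMA_Y * zy)
    return xs, ys


def fit_line(xs, ys):
    """Least-squares intercept and slope; the line passes through the means."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


rng = random.Random(0)  # fixed seed so the run is repeatable
lines = [fit_line(*draw_sample(rng)) for _ in range(REPS)]


def sd_of_prediction(x0):
    """Standard deviation of predicted Y at x0 across the fitted lines."""
    return statistics.stdev([a + b * x0 for a, b in lines])


mean_slope = statistics.fmean(b for _, b in lines)
sd_at_mean = sd_of_prediction(10.0)  # at the population mean of X
sd_far_out = sd_of_prediction(14.0)  # four standard deviations above it

print(f"mean slope = {mean_slope:.3f}")
print(f"sd of predicted Y at X=10: {sd_at_mean:.3f}, at X=14: {sd_far_out:.3f}")
```

The average slope should sit near the population value of 1.6, and the predictions at X = 14 should vary several times more than those at X = 10, which is the fan.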


Review

• How is the raw score slope related to the correlation coefficient?

• Describe a concrete situation where two groups have the same correlation between two variables but different slopes.

• Describe a concrete situation where you would prefer r to b.

• Describe the sampling distribution of r. Include bias, sampling variance, skew, sample size, power.

• Draw a picture and describe the sampling distribution of the regression line.