Multiple Regression worked example (July 2014 updated)

REGRESSION ANALYSIS July 2014 updated

Prepared by Michael Ling Page 1

QUANTITATIVE RESEARCH METHODS

SAMPLE OF

REGRESSION ANALYSIS

Prepared by

Michael Ling

PROBLEM

Create a multiple regression model to predict the level of daily ice-cream sales Mr Whippy can expect to make, given the daily temperature and humidity. Using the base model (50 marks):

• What is the regression model and regression equation?

• What interpretation do you make of the findings?

• Is the regression model valid?

• Is the sample size adequate?

Create an interaction term for temperature and humidity:

• Is there an interaction effect in the model?

• What is the effect size (f²) of the interaction?

• What interpretation do you make of the findings?

• Show the interaction effect graphically (e.g., using ModGraph)

SOLUTION

Base Model

The regression model is Sales = a + b*temperature + c*humidity + e, where Sales is the criterion variable; temperature and humidity are the predictors; a is the intercept (the point where the regression plane crosses the Sales axis); b and c are regression coefficients; and e is an error term. The regression equation is Sales = -24.112 + 3.513*temperature + 7.589*humidity (Table 1).
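To make the regression equation concrete, it can be evaluated for a given day. The following is a minimal sketch that uses the Table 1 coefficients; the function name and the input values (temperature = 30, humidity = 5) are hypothetical and chosen only for illustration, since the report does not state the measurement units.

```python
# Coefficients copied from Table 1 (base model).
INTERCEPT = -24.112
B_TEMPERATURE = 3.513
B_HUMIDITY = 7.589


def predict_sales(temperature: float, humidity: float) -> float:
    """Predicted daily sales from the base regression equation."""
    return INTERCEPT + B_TEMPERATURE * temperature + B_HUMIDITY * humidity


# Hypothetical inputs, for illustration only.
example_prediction = predict_sales(temperature=30, humidity=5)
print(round(example_prediction, 3))
```

Predictions of this kind are only meaningful for temperature and humidity values within the range covered by the 40 observations.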

Since R² = .629, 62.9% of the variance in ice-cream sales can be explained by temperature and humidity (Table 2). Compared to R², adjusted R² (.609, or 60.9%) provides a less biased estimate of the extent of the relationship between the variables in the population.
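The step from R² to adjusted R² can be reproduced from the reported figures. The sketch below assumes the standard formula, adjusted R² = 1 - (1 - R²)(n - 1)/(n - k - 1), with n = 40 observations (total df of 39 in Table 3) and k = 2 predictors; the function name is illustrative.

```python
def adjusted_r_squared(r_squared: float, n_obs: int, n_predictors: int) -> float:
    """Shrink R-squared to account for the number of predictors."""
    return 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)


# R-squared from Table 2; n = 40 and k = 2 for the base model.
base_adjusted = adjusted_r_squared(0.629, n_obs=40, n_predictors=2)
print(round(base_adjusted, 3))
```

The result rounds to .609, the adjusted R² shown in Table 2.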

The ANOVA is significant (F = 31.397, df(regression) = 2, df(residual) = 37, Sig < .001), which means that the two predictors collectively account for a statistically significant proportion of the variance in the criterion variable (Table 3).
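The F statistic and R² both follow from the sums of squares in Table 3. As a rough check, assuming the usual definitions (F is the regression mean square divided by the residual mean square, and R² is the regression sum of squares divided by the total), the reported values can be recomputed; the variable names are illustrative.

```python
# Sums of squares and degrees of freedom copied from Table 3.
ss_regression, df_regression = 14084.540, 2
ss_residual, df_residual = 8299.060, 37

# Mean squares are sums of squares divided by their degrees of freedom.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

f_statistic = ms_regression / ms_residual
r_squared = ss_regression / (ss_regression + ss_residual)

print(round(f_statistic, 3), round(r_squared, 3))
```

Both values agree with the SPSS output to three decimal places (F = 31.397 in Table 3, R² = .629 in Table 2).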

The B weight for temperature is 3.513, which means that, after controlling for humidity, a 1-unit increase in temperature will result in a predicted 3.513-unit increase in ice-cream sales. The B weight for humidity is 7.589, which means that, after controlling for temperature, a 1-unit increase in humidity will result in a predicted 7.589-unit increase in ice-cream sales (Table 1). The standardized coefficient (Beta) for temperature is .712, which means that, after controlling for humidity, a 1 standard deviation (SD) increase in temperature will result in a .712 SD increase in ice-cream sales. Similarly, a 1 SD increase in humidity will result in a .229 SD increase in ice-cream sales (Table 1). Temperature accounts for a significant proportion of unique variance in ice-cream sales (t = 6.943, Sig < .001) (Table 1). Humidity also accounts for a significant proportion of unique variance in ice-cream sales (t = 2.238, Sig < .05) (Table 1). The Pearson's correlation between temperature and ice-cream sales is r = .761, and that between humidity and ice-cream sales is r = .382 (Table 1).

The partial correlation between temperature and ice-cream sales is .752 and that between humidity and ice-cream sales is .345 (Table 1). The part correlation (sr) for temperature is .695, indicating that approximately 48.3% (.695²) of the variance in ice-cream sales can be uniquely attributed to temperature (Table 1). Similarly, approximately 5% (.224²) of the variance in ice-cream sales can be uniquely attributed to humidity (Table 1).
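The unique-variance figures are the squares of the part correlations. A short sketch, using the Table 1 values, shows the arithmetic; the final line, which subtracts the unique shares from R² = .629 to estimate the variance explained jointly by the two predictors, is an added illustration rather than something reported in the SPSS output.

```python
# Part (semipartial) correlations copied from Table 1.
part_correlations = {"temperature": 0.695, "humidity": 0.224}

# Squaring sr gives the share of variance in sales uniquely
# attributable to each predictor.
unique_variance = {name: sr ** 2 for name, sr in part_correlations.items()}

for name, share in unique_variance.items():
    print(f"{name}: {share:.1%}")

# Remaining explained variance, shared between the two predictors.
shared_variance = 0.629 - sum(unique_variance.values())
print(f"shared: {shared_variance:.1%}")
```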

The Variance Inflation Factors (VIF) of temperature and humidity are both 1.048. As they are both close to 1, multicollinearity is not a problem. In the normal P-P plot, the points are clustered tightly along the diagonal, which suggests that the residuals are normally distributed (Figure 1). The absence of any clear pattern in the spread of points in the scatterplot indicates that the assumptions of normality, linearity and homoscedasticity of residuals are met (Figure 2).
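The VIF is the reciprocal of the tolerance reported alongside it. The sketch below recomputes it from the tolerance of .954 in Table 1 and, assuming the two-predictor case where tolerance equals 1 - r², also recovers the size of the correlation between temperature and humidity; that correlation is derived here and is not reported in the original tables.

```python
import math

tolerance = 0.954  # Table 1, same value for both predictors

# VIF is the reciprocal of tolerance.
vif = 1 / tolerance

# With only two predictors, tolerance = 1 - r**2, where r is the
# correlation between temperature and humidity (sign not recoverable).
predictor_r_squared = 1 - tolerance
predictor_r_abs = math.sqrt(predictor_r_squared)

print(round(vif, 3), round(predictor_r_abs, 3))
```

A VIF of about 1.048 corresponds to the two predictors sharing only about 4.6% of their variance.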

Using G*Power with alpha = .05 (two-tailed), power = 0.8 and 2 predictors, the required sample sizes are shown in Table A. The dataset contains 40 observations, which is just short of the 42 required for an effect size of .25 and above the 28 required for an effect size of .35. The sample is therefore adequate to detect a medium-to-large effect.

Interaction Model

The ANOVA is significant (F = 40.819, df(regression) = 3, df(residual) = 36, Sig < .001), which indicates that the interaction model is statistically significant (Table 4). Since R² = .773, 77.3% of the variance in ice-cream sales can be explained by the model that includes the interaction term, an increase of 14.4 percentage points over the base model (Table 5).
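The gain in R² from adding the interaction term, and the F test of that gain, can be recomputed from the ANOVA tables. This is a sketch under the usual hierarchical-regression definitions, using the sums of squares from Table 3 and Table 4; the variable names are illustrative.

```python
# Sums of squares copied from Table 3 (base) and Table 4 (interaction).
ss_total = 22383.600
ss_regression_base = 14084.540
ss_regression_full = 17298.244
ss_residual_full, df_residual_full = 5085.356, 36
added_terms = 1  # only the temperature*humidity product is added

# Extra variance explained by the interaction term.
ss_added = ss_regression_full - ss_regression_base
r2_change = ss_added / ss_total

# F change: added mean square over the full model's residual mean square.
f_change = (ss_added / added_terms) / (ss_residual_full / df_residual_full)

print(round(r2_change, 3), round(f_change, 3))
```

The results match the R Square Change (.144) and F Change (22.750) rows of Table 7.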

The regression equation is Sales = 257.096 - 6.976*temperature - 76.825*humidity + 3.123*temperature*humidity (Table 6). Temperature accounts for a significant proportion of unique variance in ice-cream sales (t = -3.121, Sig < .005) (Table 6). Humidity accounts for a significant proportion of unique variance in ice-cream sales (t = -4.292, Sig < .001) (Table 6). The interaction variable accounts for a significant proportion of unique variance in ice-cream sales (t = 4.770, Sig < .001) (Table 6). The partial correlation between temperature and ice-cream sales is -.461 and that between humidity and ice-cream sales is -.582 (Table 6). The part correlation (sr) for temperature is reduced to -.248, indicating that approximately 6.2% (.248²) of the variance in ice-cream sales can be uniquely attributed to temperature (Table 6). Approximately 11.6% (.341²) of the variance in ice-cream sales can be uniquely attributed to humidity (Table 6), and approximately 14.4% (.379²) of the variance in ice-cream sales can be uniquely attributed to the interaction variable (Table 6). The effect size of the interaction is f² = (.773 - .629) / (1 - .773) = .634, where .773 and .629 are the R² values of the interaction model and the base model. Since it is greater than .35, the result is a large effect.
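Cohen's f² for an added term is conventionally computed from the R² values of the two models directly. A minimal sketch, assuming the definition f² = (R²_full - R²_reduced) / (1 - R²_full) and Cohen's conventional cut-offs of .02, .15 and .35, with R² = .629 (Table 2) and R² = .773 (Table 5); the function names are illustrative.

```python
def cohens_f2(r2_reduced: float, r2_full: float) -> float:
    """Cohen's f-squared for the terms added to the reduced model."""
    return (r2_full - r2_reduced) / (1 - r2_full)


def effect_size_label(f2: float) -> str:
    """Classify f-squared using Cohen's conventional thresholds."""
    if f2 >= 0.35:
        return "large"
    if f2 >= 0.15:
        return "medium"
    if f2 >= 0.02:
        return "small"
    return "negligible"


f2_interaction = cohens_f2(r2_reduced=0.629, r2_full=0.773)
effect_label = effect_size_label(f2_interaction)
print(round(f2_interaction, 3), effect_label)
```

With these inputs f² is about .634, well above the .35 threshold for a large effect.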

The use of VIFs to assess multicollinearity in a regression model that has interaction effects is erroneous when the variables are uncentered [1]. As a result, the moderating effect is examined by applying ModGraph [2] to centered scores. The centered scores used for the interaction model are the z-scores (Table 7 and Table 8). Two ModGraphs are plotted: one examines the moderating relationship when temperature is the main effect (Figure 3) and the other examines the moderating relationship when humidity is the main effect (Figure 4).

Referring to Figure 3, ice-cream sales increase with temperature only when humidity is high; ice-cream sales decrease as temperature rises when humidity is medium or low. Thus, humidity moderates the relationship between ice-cream sales and temperature. Referring to Figure 4, ice-cream sales increase with humidity only when temperature is high; ice-cream sales decrease as humidity rises when temperature is medium or low. Thus, temperature moderates the relationship between ice-cream sales and humidity.
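The moderation pattern can also be read off the interaction equation as simple slopes: the effect of one predictor is a linear function of the other. The sketch below uses the unstandardized coefficients from Table 6 rather than the z-scores plotted by ModGraph, so it illustrates the mechanism and does not reproduce Figures 3 and 4. The function names are hypothetical, and the crossover values are in the original (unstated) measurement units.

```python
# Unstandardized coefficients copied from Table 6 (interaction model).
B_TEMPERATURE = -6.976
B_HUMIDITY = -76.825
B_INTERACTION = 3.123


def temperature_slope(humidity: float) -> float:
    """Change in predicted sales per 1-unit rise in temperature at a fixed humidity."""
    return B_TEMPERATURE + B_INTERACTION * humidity


def humidity_slope(temperature: float) -> float:
    """Change in predicted sales per 1-unit rise in humidity at a fixed temperature."""
    return B_HUMIDITY + B_INTERACTION * temperature


# Moderator values at which each simple slope changes sign.
humidity_crossover = -B_TEMPERATURE / B_INTERACTION
temperature_crossover = -B_HUMIDITY / B_INTERACTION

print(round(humidity_crossover, 3), round(temperature_crossover, 3))
```

Under this equation the temperature slope is positive only when humidity exceeds roughly 2.23, and the humidity slope is positive only when temperature exceeds roughly 24.6, which is consistent with sales rising with one variable only at high levels of the other.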

References:

1. Robinson, C. & Schumacker, R. E. (2009). Interaction Effects: Centering, Variance Inflation Factor, and Interpretation Issues. Multiple Linear Regression Viewpoints, 35(1), 6-11.

2. http://www.victoria.ac.nz/psyc/paul-jose-files/modgraph/modgraph.php

Appendix

Table 1: Base Model - Coefficients

(B and Std. Error are unstandardized coefficients; Beta is the standardized coefficient; Lower Bound and Upper Bound give the 95.0% confidence interval for B; Zero-order, Partial and Part are correlations.)

Model                   B         Std. Error  Beta    t        Sig.   Lower Bound   Upper Bound   Zero-order  Partial  Part
1 (Constant)            -24.112   15.933              -1.513   .139   -56.394       8.171
  temperature           3.513     .506        .712    6.943    .000   2.488         4.538         .761        .752     .695
  humidity              7.589     3.392       .229    2.238    .031   .717          14.461        .382        .345     .224

a. Dependent Variable: sales

Collinearity Statistics

Model           Tolerance  VIF
1 (Constant)
  temperature   .954       1.048
  humidity      .954       1.048

Table 2: Base Model Summary(b)

Model  R        R Square  Adjusted R Square  Std. Error of the Estimate
1      .793(a)  .629      .609               14.977

a. Predictors: (Constant), humidity, temperature

b. Dependent Variable: sales

Table 3: Base Model - ANOVA(b)

Model          Sum of Squares  df  Mean Square  F       Sig.
1 Regression   14084.540       2   7042.270     31.397  .000(a)
  Residual     8299.060        37  224.299
  Total        22383.600       39

a. Predictors: (Constant), humidity, temperature

b. Dependent Variable: sales

Table A: Results of G*Power

Effect Size  .35  .25  .15
Sample Size  28   42   66

Figure 1: Normal P-P Plot

Figure 2: Scatterplot

Table 4: ANOVA (Interaction Model)(b)

Model          Sum of Squares  df  Mean Square  F       Sig.
1 Regression   17298.244       3   5766.081     40.819  .000(a)
  Residual     5085.356        36  141.260
  Total        22383.600       39

a. Predictors: (Constant), temp_humidity, temperature, humidity

b. Dependent Variable: sales

Collinearity Statistics

Model           Tolerance  VIF
1 (Constant)
  temperature   .954       1.048
  humidity      .954       1.048

Table 5: Model Summary (Interaction Model)(b)

Model  R        R Square  Adjusted R Square  Std. Error of the Estimate
1      .879(a)  .773      .754               11.885

a. Predictors: (Constant), temp_humidity, temperature, humidity

b. Dependent Variable: sales

Table 6: Coefficients (Interaction Model)(a)

(B and Std. Error are unstandardized coefficients; Beta is the standardized coefficient; Lower Bound and Upper Bound give the 95.0% confidence interval for B; Zero-order, Partial and Part are correlations.)

Model                   B         Std. Error  Beta    t        Sig.   Lower Bound   Upper Bound   Zero-order  Partial  Part
1 (Constant)            257.096   60.297              4.264    .000   134.807       379.384
  temperature           -6.976    2.235       -1.413  -3.121   .004   -11.510       -2.443        .761        -.461    -.248
  humidity              -76.825   17.901      -2.322  -4.292   .000   -113.130      -40.519       .382        -.582    -.341
  temp_humidity         3.123     .655        3.674   4.770    .000   1.795         4.451         .745        .622     .379

a. Dependent Variable: sales

Table 7: Model Summary (Interaction Model)

Model  R        R Square  Adjusted R Square  Std. Error of the Estimate  R Square Change  F Change  df1  df2  Sig. F Change
1      .793(a)  .629      .609               14.977                      .629             31.397    2    37   .000
2      .879(b)  .773      .754               11.885                      .144             22.750    1    36   .000

a. Predictors: (Constant), Zscore(humidity), Zscore(temperature)

b. Predictors: (Constant), Zscore(humidity), Zscore(temperature), Zscore(temp_humidity)

c. Dependent Variable: sales

Table 8: Coefficients (Interaction Model)

(B and Std. Error are unstandardized coefficients; Beta is the standardized coefficient; Lower Bound and Upper Bound give the 95.0% confidence interval for B; Zero-order, Partial and Part are correlations.)

Model                     B         Std. Error  Beta    t        Sig.   Lower Bound   Upper Bound   Zero-order  Partial  Part
1 (Constant)              96.100    2.368               40.583   .000   91.302        100.898
  Zscore(temperature)     17.049    2.456       .712    6.943    .000   12.073        22.024        .761        .752     .695
  Zscore(humidity)        5.495     2.456       .229    2.238    .031   .519          10.470        .382        .345     .224
2 (Constant)              96.100    1.879               51.138   .000   92.289        99.911
  Zscore(temperature)     -33.860   10.850      -1.413  -3.121   .004   -55.864       -11.855       .761        -.461    -.248
  Zscore(humidity)        -55.623   12.961      -2.322  -4.292   .000   -81.909       -29.337       .382        -.582    -.341
  Zscore(temp_humidity)   88.020    18.454      3.674   4.770    .000   50.594        125.446       .745        .622     .379

a. Dependent Variable: sales

Figure 3: ModGraph 1 – zscore(temp) as main effect, zscore(humidity) as moderator, zscore(temp*humidity) as interaction variable

Figure 4: ModGraph 2 – zscore(humidity) as main effect, zscore(temperature) as moderator, zscore(temp*humidity) as interaction variable

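[Plot with Temperature on the x-axis. Title: Temperature and Humidity. X-axis: Temperature (low, med, high). Y-axis: Sale of Ice-cream (-50.00 to 300.00). Lines: Humidity high, med, low.]

[Plot with Humidity on the x-axis. Title: Temperature and Humidity. X-axis: Humidity. Y-axis: Grade. Lines: Temperature high, med, low.]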