The Chicago Guide to Writing about Multivariate Analysis, 2nd edition. Calculating interaction effects from OLS coefficients: Interaction between 1 categorical and 1 continuous independent variable. Jane E. Miller, PhD

Page 1: Jane E. Miller, PhD

The Chicago Guide to Writing about Multivariate Analysis, 2nd edition.

Calculating interaction effects from OLS coefficients:

Interaction between 1 categorical and 1 continuous independent variable

Jane E. Miller, PhD

Page 2: Jane E. Miller, PhD

Overview

• General equation for a model with main effects and interactions
• Coding of main effects and interaction terms
• Solving for the interaction pattern based on estimated coefficients
– Intercept
– Slope
• Graphical depiction of the sum of coefficients for particular combinations of the independent variables

Page 3: Jane E. Miller, PhD

Review: Contingency of coefficients in an interaction model

Y = β0 + β1X1 + β2X2 + β3X1_X2

• Inclusion of the interaction term X1_X2 means that the βi coefficients on the main effects terms X1 and X2 no longer apply to all values of X1 and X2.
– The main effects and interaction βi coefficients for X1 and X2 are contingent upon one another and cannot be considered separately.

Page 4: Jane E. Miller, PhD

Review: Implications for interpreting main effects and interaction βs

Y = β0 + β1X1 + β2X2 + β3X1_X2

• In the interaction model:
– β1 estimates the effect of X1 on Y when X2 = 0,
– β2 estimates the effect of X2 on Y when X1 = 0,
– β3 must also be considered in order to calculate the shape of the overall pattern among X1, X2, and Y.
  • E.g., when X1 and X2 take on other values.
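One way to see this contingency is to compute the effect of X1 at several values of X2. The sketch below assumes the interaction variable is the product X1 × X2, so that a one-unit increase in X1 changes predicted Y by β1 + β3 × X2. The function name and the coefficient values are hypothetical, chosen only for illustration.

```python
def effect_of_x1(b1: float, b3: float, x2: float) -> float:
    """Change in predicted Y for a 1-unit increase in X1, holding X2 fixed.

    Assumes Y = b0 + b1*X1 + b2*X2 + b3*(X1*X2), i.e. the interaction
    variable is the product of X1 and X2.
    """
    return b1 + b3 * x2


# Hypothetical coefficients, for illustration only (not from the slides).
B1, B3 = 10.0, -2.0

for x2 in (0, 1, 3):
    print(f"X2 = {x2}: effect of X1 = {effect_of_x1(B1, B3, x2)}")
```

Only at X2 = 0 does the effect of X1 equal β1 alone, which is why the main effect cannot be read in isolation.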

Page 5: Jane E. Miller, PhD

Review: Some possible patterns of association between IPR, race, and birth weight

[Figure: four panels, each plotting birth weight (BW) against IPR, with separate lines for white and black infants.]

• No racial difference in IPR/BW relation: intercept and slope same for blacks & whites.
• Blacks & whites have same intercept but different slopes of IPR/BW curves.
• Blacks & whites have different slopes and intercepts of IPR/BW curves.
• Blacks & whites have same slope but different intercepts of IPR/BW curves.

Page 6: Jane E. Miller, PhD

General equation for predicted value of DV based on an interaction model

• The general equation to calculate the predicted value of the dependent variable includes
– main effects coefficients
– interaction term coefficients
– values of the independent variables

Predicted birth weight = β0 + (βNHB × NHB) + (βIPR × IPR) + (βNHB_IPR × NHB_IPR)

Page 7: Jane E. Miller, PhD

Calculating overall effect of interaction for specific case characteristics

Predicted birth weight = β0 + (βNHB × NHB) + (βIPR × IPR) + (βNHB_IPR × NHB_IPR)

• Each coefficient is multiplied by the value of the associated variable for cases with the characteristics of interest.

• To see which coefficients pertain to which cases, fill in values of variables for different combinations of race and the income-to-poverty ratio (IPR).

Page 8: Jane E. Miller, PhD

Example: Estimated coefficients

Term                                 β
Intercept                            3,106
Main effect terms
  Non-Hispanic black (NHB)           –177
  Income-to-poverty ratio (IPR)      23
Interaction term
  NHB_IPR                            –5

IPR = family income ($) / Federal Poverty Level for a family of that size and age composition.
Reference category: Non-Hispanic whites.
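The general equation can be written as a small function that multiplies each coefficient by the value of its variable. This is a minimal sketch using the four estimated coefficients listed above. The function and dictionary names are hypothetical, and the interaction variable is assumed to be the product NHB × IPR.

```python
# Estimated coefficients from the example (birth weight in grams).
COEFS = {"intercept": 3106, "NHB": -177, "IPR": 23, "NHB_IPR": -5}


def predicted_birth_weight(nhb: int, ipr: float, coefs: dict = COEFS) -> float:
    """Apply the general equation: each coefficient times its variable's value.

    nhb is 1 for non-Hispanic black infants and 0 for the reference
    category (non-Hispanic white). The interaction variable is nhb * ipr.
    """
    nhb_ipr = nhb * ipr
    return (
        coefs["intercept"]
        + coefs["NHB"] * nhb
        + coefs["IPR"] * ipr
        + coefs["NHB_IPR"] * nhb_ipr
    )


print(predicted_birth_weight(nhb=0, ipr=0))  # reference category at IPR = 0
print(predicted_birth_weight(nhb=1, ipr=1))  # non-Hispanic black at IPR = 1
```

Passing the coefficients as a dictionary keeps the formula in one place, so different combinations of race and IPR only change the inputs.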

Page 9: Jane E. Miller, PhD

Interpreting the intercept

• The intercept β0 from an OLS model is an estimate of the level of the dependent variable when continuous variables take the value 0, for infants in the reference category for all categorical variables.
• In a model where
– The dependent variable is birth weight in grams.
– The reference category is specified to be non-Hispanic white infants.
• β0 is an estimate of birth weight when IPR = 0, for non-Hispanic white infants.

Page 10: Jane E. Miller, PhD

Review: Coding of main effect and interaction term variables: race and income

                                   Main effects terms    Interaction term
Case characteristics               NHB      IPR          NHB_IPR
(SELECTED VALUES)
Non-H white & IPR = 0.0            0        0.0          0
Non-H white & IPR = 0.5            0        0.5          0
Non-H white & IPR = 1.0            0        1.0          0

For a two-category race variable (non-Hispanic white = reference category).

E.g., IPR = 0.5 means family income is half the Federal Poverty Level (FPL); IPR = 2.0 means family income is twice the FPL.
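The coding can be reproduced for any case with a few lines of code. The sketch below assumes the interaction variable is the product NHB × IPR, which matches the values shown for the reference category. The function name and the "white"/"black" labels are hypothetical shorthand.

```python
def code_case(race: str, ipr: float) -> dict:
    """Return the main-effect and interaction values for one case.

    race is "white" (reference category) or "black"; both labels are
    shorthand for the non-Hispanic groups in the example.
    """
    nhb = 1 if race == "black" else 0
    return {"NHB": nhb, "IPR": ipr, "NHB_IPR": nhb * ipr}


for race in ("white", "black"):
    for ipr in (0.0, 0.5, 1.0):
        print(race, code_case(race, ipr))
```

For the reference category the interaction variable is always 0, whatever the value of IPR; for non-Hispanic black cases it equals IPR.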

Page 11: Jane E. Miller, PhD

Calculating the value of the intercept for one group

Case characteristics               NHB      IPR      NHB_IPR
Non-H white & IPR = 0.0            0        0.0      0.0

Predicted birth weight = β0 + (βNHB × NHB) + (βIPR × IPR) + (βNHB_IPR × NHB_IPR)

The intercept for non-Hispanic whites is calculated:
= β0 + (βNHB × 0) + (βIPR × 0.0) + (βNHB_IPR × 0.0) = β0

Thus, the intercept for non-Hispanic white infants (when IPR = 0) collapses to include only β0 because all of the other coefficients in the formula are multiplied by a value of 0.

Page 12: Jane E. Miller, PhD

Interpreting the IPR/birth weight pattern

• IPR is a continuous variable.
– The coefficient is an estimate of the effect on the dependent variable of a 1-unit increase in the continuous IV, with categorical variables set to their reference category values.
• So βIPR estimates the increment in birth weight for every one-unit increase in IPR (e.g., from family income at the poverty line to twice the poverty line).
– It is the slope of the IPR/birth weight curve for infants in the reference category, in this case, non-Hispanic white infants.

Page 13: Jane E. Miller, PhD

Calculating values for the IPR/birth weight curve for white infants

Case characteristics               NHB      IPR      NHB_IPR
Non-H white & IPR = 1.5            0        1.5      0.0

Predicted birth weight = β0 + (βNHB × NHB) + (βIPR × IPR) + (βNHB_IPR × NHB_IPR)
= β0 + (βNHB × 0) + (βIPR × 1.5) + (βNHB_IPR × 0)
= β0 + (βIPR × 1.5)

Because non-Hispanic whites are the reference category for race, the equation collapses to include only the intercept (β0) and the IPR main effect (βIPR); the other coefficients are multiplied by 0. For white infants at any value of IPR:
= β0 + (βIPR × IPR)

Page 14: Jane E. Miller, PhD

Calculating values for the IPR/birth weight curve for white infants

Case characteristics               NHB      IPR      NHB_IPR
Non-H white & IPR = 3.0            0        3.0      0.0

Predicted birth weight = β0 + (βNHB × NHB) + (βIPR × IPR) + (βNHB_IPR × NHB_IPR)
= β0 + (βNHB × 0) + (βIPR × 3.0) + (βNHB_IPR × 0)
= β0 + (βIPR × 3.0)

Page 15: Jane E. Miller, PhD

Interpreting the race main effect

• The main effect βNHB estimates the difference in birth weight between non-Hispanic black infants and those in the reference category (non-Hispanic whites), when continuous variables are set at the value 0.

• It is an estimate of the difference in intercept between black and white infants when IPR is 0.

Page 16: Jane E. Miller, PhD

Calculating the intercept for different values of the categorical variable

Case characteristics               NHB      IPR      NHB_IPR
Non-H white & IPR = 0.0            0        0.0      0.0
Non-H black & IPR = 0.0            1        0.0      0.0

As we saw a moment ago, the intercept for non-Hispanic whites is calculated:
= β0 + (βNHB × 0) + (βIPR × 0.0) + (βNHB_IPR × 0.0) = β0

For non-Hispanic blacks, the intercept is calculated:
= β0 + (βNHB × 1) + (βIPR × 0.0) + (βNHB_IPR × 0.0) = β0 + βNHB

Page 17: Jane E. Miller, PhD

More on the race main effect

• βNHB is an estimate of the difference in intercept between black and white infants when IPR is 0.
– Intercept for black infants = β0 + βNHB = 3,106 + (–177) = 2,929
• In other words, black infants born to families with an IPR of zero have a predicted birth weight of 2,929 grams, or 177 grams LOWER than that of their white counterparts.
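The two intercepts and their difference can be checked with a short calculation. This is a minimal sketch using the example's coefficients; the variable names are hypothetical.

```python
B0, B_NHB = 3106, -177  # intercept and NHB main effect from the example

intercept_white = B0          # NHB = 0, IPR = 0
intercept_black = B0 + B_NHB  # NHB = 1, IPR = 0

# The difference between the two intercepts is the NHB main effect itself.
print(intercept_white, intercept_black, intercept_black - intercept_white)
```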

Page 18: Jane E. Miller, PhD

Calculating values for the IPR/birth weight curve for white infants

Predicted birth weight = β0 + (βNHB × NHB) + (βIPR × IPR) + (βNHB_IPR × NHB_IPR)
= β0 + (βNHB × 0) + (βIPR × IPR) + (βNHB_IPR × 0)
= β0 + (βIPR × IPR)

Because non-Hispanic whites are the reference category for race, the equation collapses to include only the intercept (β0) and the IPR main effect (βIPR); the other coefficients are multiplied by 0.

Page 19: Jane E. Miller, PhD

Calculating values for the IPR/birth weight curve for black infants

Case characteristics               NHB      IPR      NHB_IPR
Non-H black & IPR = 1.5            1        1.5      1.5

Predicted birth weight = β0 + (βNHB × 1) + (βIPR × 1.5) + (βNHB_IPR × 1.5)

For non-Hispanic blacks, the equation includes all three terms (βNHB, βIPR, and βNHB_IPR) in addition to β0, because each of those coefficients is multiplied by a non-zero value.

Page 20: Jane E. Miller, PhD

Interpreting the coefficient on the interaction between race and IPR

• The slope
– for blacks = βIPR + βNHB_IPR = 23 + (–5) = 18
– for whites = βIPR = 23
• The race_IPR coefficient tests whether the slope of the IPR/birth weight pattern is different for non-Hispanic black infants than for their non-Hispanic white counterparts.
– βNHB_IPR is thus the estimated difference in slope for blacks compared to whites.
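The group-specific slopes follow directly from the two coefficients involving IPR. A minimal sketch, with a hypothetical function name and the example's coefficient values:

```python
B_IPR, B_NHB_IPR = 23, -5  # IPR main effect and NHB_IPR interaction


def ipr_slope(nhb: int) -> int:
    """Slope of the IPR/birth weight curve for a group.

    nhb = 0 is the reference category (non-Hispanic white);
    nhb = 1 is non-Hispanic black.
    """
    return B_IPR + B_NHB_IPR * nhb


print("white:", ipr_slope(0))
print("black:", ipr_slope(1))
print("difference in slope:", ipr_slope(1) - ipr_slope(0))
```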

Page 21: Jane E. Miller, PhD

More on the race/IPR interaction

• The estimated coefficients mean that each 1-unit increase in IPR is associated with
– 23 grams more birth weight among non-Hispanic white infants.
– 18 grams more birth weight among non-Hispanic black infants.
• Those values are the slopes of the respective IPR/BW curves for the two racial/ethnic groups.

Page 22: Jane E. Miller, PhD

Preparing to graph the slope of IPR/birth weight by race

• For infants in the reference category (non-Hispanic white),
– Multiply selected values of IPR by βIPR and add to β0 to obtain predicted birth weight at interesting values of IPR.
• For non-Hispanic black infants,
– Multiply selected values of IPR by (βIPR + βNHB_IPR), then add to (β0 + βNHB).

Page 23: Jane E. Miller, PhD

Calculated birth weight by race for selected values of IPR

β0 = 3,106; βIPR = 23; βNHB = –177; βNHB_IPR = –5
IPR = family income in multiples of the FPL.

IPR = 0
– Non-Hispanic white: β0 + 0 × βIPR = 3,106 + 0 × 23 = 3,106
– Non-Hispanic black: β0 + βNHB + 0 × (βIPR + βNHB_IPR) = 3,106 – 177 + 0 × (23 – 5) = 2,929

IPR = 1
– Non-Hispanic white: β0 + 1 × βIPR = 3,106 + 1 × 23 = 3,106 + 23 = 3,129
– Non-Hispanic black: β0 + βNHB + 1 × (βIPR + βNHB_IPR) = 3,106 – 177 + 1 × (23 – 5) = 2,929 + 1 × 18 = 2,929 + 18 = 2,947

IPR = 6
– Non-Hispanic white: β0 + 6 × βIPR = 3,106 + 6 × 23 = 3,106 + 138 = 3,244
– Non-Hispanic black: β0 + βNHB + 6 × (βIPR + βNHB_IPR) = 3,106 – 177 + 6 × (23 – 5) = 2,929 + 6 × 18 = 2,929 + 108 = 3,037
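The same results can be reproduced programmatically, which guards against arithmetic slips in the intermediate steps (for example, 6 × 23 = 138, so the white-infant value at IPR = 6 is 3,106 + 138 = 3,244). The sketch below uses the group-specific intercept and slope; the function name is hypothetical.

```python
B0, B_NHB, B_IPR, B_NHB_IPR = 3106, -177, 23, -5


def predict(nhb: int, ipr: int) -> int:
    """Predicted birth weight (grams) from a group's intercept and slope."""
    intercept = B0 + B_NHB * nhb       # 3,106 for white, 2,929 for black
    slope = B_IPR + B_NHB_IPR * nhb    # 23 for white, 18 for black
    return intercept + slope * ipr


print(f"{'IPR':>3} {'NH white':>9} {'NH black':>9}")
for ipr in (0, 1, 6):
    print(f"{ipr:>3} {predict(0, ipr):>9,} {predict(1, ipr):>9,}")
```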

Page 24: Jane E. Miller, PhD

Use a spreadsheet to calculate and graph the interaction

• Spreadsheets can
– Store
  • The estimated coefficients
  • The input values of the independent variables
  • The correct generalized formula to calculate the predicted values for many combinations of the IVs involved in the interaction
– Graph the overall pattern
• See spreadsheet template and voice-over explanation
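For readers who prefer a script to a spreadsheet, the same three ingredients (coefficients, input values, formula) can be stored in code and exported for graphing. This is only a sketch of one possible approach, not the author's template. The column names `NH_white` and `NH_black` and the choice of IPR values 0 through 6 are assumptions.

```python
import csv
import io

B0, B_NHB, B_IPR, B_NHB_IPR = 3106, -177, 23, -5  # stored coefficients


def build_grid(ipr_values):
    """Predicted birth weight for each race at each input value of IPR."""
    rows = []
    for ipr in ipr_values:
        white = B0 + B_IPR * ipr
        black = (B0 + B_NHB) + (B_IPR + B_NHB_IPR) * ipr
        rows.append({"IPR": ipr, "NH_white": white, "NH_black": black})
    return rows


def to_csv(rows):
    """Serialize the grid as CSV text that a spreadsheet or plotting tool can read."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["IPR", "NH_white", "NH_black"])
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


grid = build_grid(range(0, 7))
print(to_csv(grid))
```

Each row of the output is one point on the two lines of the graph, so plotting `NH_white` and `NH_black` against `IPR` gives the overall pattern.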

Page 25: Jane E. Miller, PhD

Predicted birth weight by race/ethnicity and IPR

[Figure: predicted birth weight (vertical axis, 2,800 to 3,300) plotted against IPR (horizontal axis, 0 to 6), with one line per racial/ethnic group. Annotations on the chart:]

• β0 = intercept = 3,106 = predicted BW for ref cat *
• β0 + βNHB = 3,106 + (–177) = 2,929 = intercept for black infants
• βIPR = 23 = slope of IPR/BW curve for ref cat *
• βIPR + βNHB_IPR = 23 – 5 = 18 = slope of IPR/BW curve for non-Hispanic black infants

* Ref cat = Reference category = non-Hispanic white infants.

Page 26: Jane E. Miller, PhD

Overall shape of the race/IPR/birth weight pattern

• Based on this set of βs, black infants have
– a lower birth weight than whites at all IPR levels.
  • Negative coefficient on the NHB main effect yields a lower intercept for blacks than for whites.
– a slower rate of birth weight increase as IPR rises.
  • Negative coefficient on NHB_IPR, which yields a shallower slope of the IPR/birth weight curve for blacks than for whites.
• Thus the deficit in birth weight for blacks widens with increasing IPR.
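The widening deficit can be made explicit by subtracting the white prediction from the black prediction: β0 and βIPR cancel, leaving βNHB + βNHB_IPR × IPR. A minimal sketch with a hypothetical function name:

```python
B_NHB, B_NHB_IPR = -177, -5


def black_white_gap(ipr: float) -> float:
    """Predicted birth weight for black infants minus that for white infants.

    The intercept and IPR main effect appear in both predictions and
    cancel, so only the NHB main effect and the interaction term remain.
    """
    return B_NHB + B_NHB_IPR * ipr


for ipr in (0, 1, 6):
    print(f"IPR = {ipr}: gap = {black_white_gap(ipr)} grams")
```

The gap is negative at IPR = 0 and becomes more negative as IPR rises, which is the widening described above.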

Page 27: Jane E. Miller, PhD

Using the three-way chart to verify your multivariate results

• Check the pattern calculated from the estimated coefficients against the simple three-way chart.
• If the shapes are wildly inconsistent with one another, that probably reflects an error in either
– How you specified the model, or
– How you calculated the overall pattern from the coefficients.

• Small changes in the shape or size of the pattern may occur due to controlling for other variables in your multivariate model.

Page 28: Jane E. Miller, PhD

Summary

• An interaction between a continuous and a categorical independent variable will yield differences in the intercept and/or slope of the association between the continuous IV and the DV.
• Calculating the overall shape of an interaction requires adding together the pertinent main effects and interaction term βs for combinations of the categorical IV and selected values of the continuous IV in the interaction.
– A spreadsheet can be helpful for storing and organizing the βs, input values, and formulas.

Page 29: Jane E. Miller, PhD

Suggested resources

• Chapters 9 and 16 of Miller, J.E. 2013. The Chicago Guide to Writing about Multivariate Analysis, 2nd Edition.

• Chapters 8 and 9 of Cohen et al. 2003. Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences, 3rd Edition. Florence, KY: Routledge.

Page 30: Jane E. Miller, PhD

Supplemental online resources

• Podcasts
– Introduction to interactions
– Creating variables to test for interactions
– Specifying models to test for interactions
– Interpreting multivariate regression coefficients

• Spreadsheet template for calculating overall effect of an interaction between a categorical and a continuous independent variable.

Page 31: Jane E. Miller, PhD

Suggested practice exercises

• Study guide to The Chicago Guide to Writing about Multivariate Analysis, 2nd Edition.
– Question #4 in the problem set for Chapter 16
– Suggested course extensions for Chapter 16

• “Applying statistics and writing” exercise #2.

Page 32: Jane E. Miller, PhD

Contact information

Jane E. Miller, [email protected]

Online materials available at http://press.uchicago.edu/books/miller/multivariate/index.html