Effect Modification and Non-Linear Associations: Regression Based Approaches

Lecture 9


Lecture Set Overview

n  Testing for effect modification and estimating different outcome/predictor associations for different levels of a potential effect modifier via the use of interaction terms in regression

n  Conceptualizing non-linearity as a type of effect-modification and showing another way to model it in a regression context (without categorizing the continuous predictor)

Section A

Regression With Interaction Terms


Learning Objectives

n  Describe the “interaction term” approach to estimating separate outcome/predictor associations for different levels of an effect modifier, and for testing for effect modification


Assessing EM By Presenting Stratified Results

n  Example 1: Suicide Outcomes and Sexual Identity¹

¹ Pinney T, Millman S. Asian/Pacific Islander Adolescent Sexual Orientation and Suicide Risk in Guam. American Journal of Public Health (2004). Vol 94, No 7. pp 1204-1206.

Assessing EM By Presenting Stratified Results

n  Example 1: Suicide Outcomes and Sexual Identity

Table: Relationship Between Attempted Suicide and Sexual Orientation, Presented Separately by Sex

            Odds Ratio of Suicide Attempt¹
            Unadjusted            Adjusted²
Males       5.01 (2.13, 11.78)    5.06 (1.65, 15.55)
Females     2.65 (1.17, 6.00)     2.17 (0.84, 5.60)

¹ Compares the odds of having had at least one suicide attempt for youth who identify as homosexual compared to youth who do not identify as homosexual
² Adjusted for ethnicity, relationship abuse, alcohol abuse and markers of depression

Assessing EM By Presenting Stratified Results

n  Example 2: Coffee Drinking and Mortality² (multivariate: adjusted for body-mass index; race or ethnic group; level of education; alcohol consumption; the number of cigarettes smoked per day, use or nonuse of pipes or cigars, and time of smoking cessation (<1 year, 1 to <5 years, 5 to <10 years, or ≥10 years before baseline); health status; diabetes (yes vs. no); marital status; physical activity; total energy intake; consumption of fruits, vegetables, red meat, white meat, and saturated fat; use or nonuse of vitamin supplements; and use or nonuse of postmenopausal hormone therapy)

² Freedman N, et al. Association of Coffee Drinking with Total and Cause-Specific Mortality. New England Journal of Medicine (2012). 366 (20) 1891-1904.

Another Approach

n  Sometimes, however, the researcher may want to estimate separate associations for one predictor only, and pool the estimates for the other predictors across all groups. For example:

-  Estimate the sex-specific associations between wages and years of education, after adjusting for other factors (across both males and females)

-  Estimate age-specific associations between mortality and race in dialysis patients, after adjusting for other factors (using data from all age groups combined)

n  This can be done by including an “interaction term” in a multiple regression model

Assessing EM With an Interaction Term

n  Example 1: Hourly Wages and Years of Education

Table 1: Unadjusted and Adjusted Linear Regression Slopes: Years of Education (outcome is Hourly Wages, n = 534)

Index   Slope (95% CI)       Estimate is Adjusted for:
A       0.75 (0.59, 0.91)    (unadjusted)
B       0.75 (0.59, 0.91)    Sex
C       0.76 (0.60, 0.92)    Sex, Union Membership
D       0.47 (0.29, 0.65)    Sex, Union Membership, Job Type
E       0.49 (0.31, 0.67)    Sex, Union Membership, Job Type, Sector, Marital Status

n  Results from Model C, plotted separately for males and females

[Figure: Regression Estimates of Hourly Wages for Years of Education, Plotted Separately By Sex. Lines: Males, Females. x-axis: Years of Education; y-axis: Hourly Wages (US $). Adjusted for sex and union membership.]

Assessing EM With an Interaction Term

n  The slope of each sex-specific regression line is the same: it is the slope of years of education (0.76) from the MLR that includes years of education, sex, and the other adjustment variables

n  The difference between the estimated wages for males and females is the same at each value of years of education: it is the difference in hourly wages between males and females, adjusted for years of education (and the other adjustment variables). In this example, this difference is $1.89

n  Similar graphics could be shown for the other models


Assessing EM With an Interaction Term

n  Once sex has been adjusted for, the wages/years of education relationship is the same in each level of sex

n  Once years of education has been adjusted for, the wages/sex relationship is the same at each value of years of education

n  Suppose, however, we are interested in investigating whether the relationship between wages and years of education is modified by sex (after adjustment for union membership)


Assessing EM With an Interaction Term

n  Approach 1: stratify the sample by sex, and estimate two different regressions of wages on years of education after adjusting for union membership

n  Results:

Slope of Years of Education (95% CI)
Males     0.71 (0.50, 0.91)
Females   0.84 (0.62, 1.06)

Assessing EM With an Interaction Term

n  Approach 2: add an interaction term between years of education and sex to the model that includes the other adjustment variables

n  Here’s how it works:
-  Add an interaction term to the model which already includes years of education (x1) and sex (x2) as predictors, as well as the other predictors

n  This interaction term, x3, can be created by taking the product of x1 and x2: x3 = x1*x2

n  New model with interaction term

$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2 + \hat{\beta}_3 x_3 + (\text{other slopes and } x\text{s})$

Where x1 = years of education, x2 = sex (1 for females), and x3 = interaction term (x1*x2)

n  What is value of x3 for:
-  Males?
-  Females?
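The product construction of the interaction term can be sketched in a few lines of Python. The records below are invented illustrative values, not rows from the wage data, and the function and parameter names (`interaction_term`, `education_years`, `female`) are assumptions made for this sketch.

```python
# Hypothetical records: (years of education x1, sex indicator x2 with 1 = female).
# These values are invented for illustration; they are not from the wage data.
records = [(12, 0), (16, 0), (12, 1), (16, 1)]


def interaction_term(education_years, female):
    """Return x3 = x1 * x2, the education-by-sex interaction term."""
    return education_years * female


x3_values = [interaction_term(x1, x2) for x1, x2 in records]

for (x1, x2), x3 in zip(records, x3_values):
    label = "female" if x2 == 1 else "male"
    print(f"x1={x1:2d}  x2={x2} ({label:6s})  x3={x3}")
```

With this coding, x3 is 0 for every male record and equals the years of education for every female record.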

Assessing EM With an Interaction Term

n  Results

$\hat{y} = 0.40 + 0.70x_1 - 3.69x_2 + 0.14x_3 + (\text{other slopes and } x\text{s})$

n  Slope of years of education

-  Males (x2=0, x3=0):

$\hat{y} = 0.40 + 0.70x_1 + (\text{other slopes and } x\text{s})$

-  Females (x2=1, x3=x1*1):

$\hat{y} = 0.40 + 0.70x_1 - 3.69 + 0.14(x_1 * 1) + (\text{other slopes and } x\text{s})$

$\hat{y} = 0.40 - 3.69 + 0.70x_1 + 0.14x_1 + (\text{other slopes and } x\text{s})$

$\hat{y} = 0.40 - 3.69 + (0.70 + 0.14)x_1 + (\text{other slopes and } x\text{s})$
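The substitution for males and females can be checked numerically. The following is a minimal Python sketch that uses only the four coefficients shown on the slide (0.40, 0.70, -3.69, 0.14); the contribution of the other adjustment variables is left out, and the function names are assumptions made for illustration.

```python
# Coefficients reported on the slide (other adjustment terms omitted).
B0, B_EDUC, B_FEMALE, B_INTERACTION = 0.40, 0.70, -3.69, 0.14


def education_slope(female):
    """Slope of years of education for a given sex (female = 0 or 1)."""
    return B_EDUC + B_INTERACTION * female


def partial_prediction(education_years, female):
    """Part of y-hat that depends on education and sex only.

    The other predictors are left out, so this is not a full predicted wage.
    """
    x3 = education_years * female  # interaction term x1 * x2
    return B0 + B_EDUC * education_years + B_FEMALE * female + B_INTERACTION * x3


male_slope = education_slope(0)
female_slope = education_slope(1)
print(f"male slope:   {male_slope:.2f}")
print(f"female slope: {female_slope:.2f}")

# With an interaction term, the female-minus-male gap depends on education.
for years in (8, 12, 16):
    gap = partial_prediction(years, 1) - partial_prediction(years, 0)
    print(f"{years} years of education: female - male = {gap:.2f}")
```

Unlike the model without an interaction term, the estimated difference between females and males is no longer constant: it is -3.69 + 0.14 times the years of education.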

n  Results from model with interaction term plotted separately for males and females

[Figure: Regression Estimates of Hourly Wages for Years of Education, Plotted Separately By Sex. Lines: Males, Females. x-axis: Years of Education; y-axis: Hourly Wages (US $). Adjusted for sex and union membership, with an interaction term for sex and years of education.]

Assessing EM With an Interaction Term

n  Results

$\hat{y} = 0.40 + 0.70x_1 - 3.69x_2 + 0.14x_3 + (\text{other slopes and } x\text{s})$

n  Testing formally if slope of x3 is statistically significant is called “a test of interaction”

$H_o: \beta_3 = 0$     $H_A: \beta_3 \neq 0$

-  In this example, the p-value is 0.38: there is not a statistically significant interaction between years of education and sex after adjusting for union membership status

Assessing EM With an Interaction Term

n  Example: Mortality in patients with primary biliary cirrhosis (Mayo clinic data)

n  Randomized trial: patients randomized to receive DPCA or placebo

n  Results: (unadjusted) HR (DPCA to placebo) 1.06 (0.75, 1.50)

n  Question: Is effect of drug modified by age of patient?


Assessing EM With an Interaction Term

n  Age of patient: categorized into quartiles

-  Quartile 1: < 42 years
-  Quartile 2: [42, 50)
-  Quartile 3: [50, 57)
-  Quartile 4: >= 57 years

n  To investigate whether age modifies the effect of the drug, we will need to fit a Cox model that includes drug, the age quartile indicators, and interaction terms between drug and each of the age quartile indicators


Assessing EM With an Interaction Term

n  Model Results

$\ln(\text{hazard}) = \hat{\lambda}_o[t] - 0.07x_1 + 0.02x_2 + 0.58x_3 + 1.22x_4 + 0.28(x_1 * x_2) + 0.10(x_1 * x_3) - 0.25(x_1 * x_4)$

where x1 = 1 for DPCA (0 for placebo), and x2, x3, and x4 are indicators for age quartiles 2, 3, and 4 (quartile 1 is the reference)

n  For each of the age quartiles:

-  Age quartile 1: $\ln(\text{hazard}) = \hat{\lambda}_o[t] - 0.07x_1$
-  Age quartile 2: $\ln(\text{hazard}) = \hat{\lambda}_o[t] - 0.07x_1 + 0.02 + 0.28(x_1 * 1)$
-  Age quartile 3: $\ln(\text{hazard}) = \hat{\lambda}_o[t] - 0.07x_1 + 0.58 + 0.10(x_1 * 1)$
-  Age quartile 4: $\ln(\text{hazard}) = \hat{\lambda}_o[t] - 0.07x_1 + 1.22 - 0.25(x_1 * 1)$

Assessing EM With an Interaction Term

n  For each of the age quartiles:

-  Age quartile 1: $\ln(\text{hazard}) = \hat{\lambda}_o[t] - 0.07x_1$
-  Age quartile 2: $\ln(\text{hazard}) = \hat{\lambda}_o[t] + 0.02 + (-0.07 + 0.28)x_1$
-  Age quartile 3: $\ln(\text{hazard}) = \hat{\lambda}_o[t] + 0.58 + (-0.07 + 0.10)x_1$
-  Age quartile 4: $\ln(\text{hazard}) = \hat{\lambda}_o[t] + 1.22 + (-0.07 - 0.25)x_1$

n  HRs comparing drug to placebo for each of the age quartiles:

-  Age quartile 1: 0.94 (0.38, 2.33)
-  Age quartile 2: 1.24 (0.59, 2.69)
-  Age quartile 3: 1.03 (0.54, 1.97)
-  Age quartile 4: 0.73 (0.40, 1.34)
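The quartile-specific hazard ratios come from exponentiating the sum of the drug slope and the matching interaction slope. The following Python sketch works from the rounded coefficients shown on the slides, so its results can differ from the reported HRs in the second decimal place. Confidence intervals are not reproduced, because they need the standard errors and covariances of the estimates, which the slides do not show. The names used are assumptions made for illustration.

```python
import math

# Log hazard ratio coefficients as reported (rounded) on the slides.
B_DRUG = -0.07  # drug effect in age quartile 1 (the reference quartile)
B_DRUG_BY_QUARTILE = {2: 0.28, 3: 0.10, 4: -0.25}  # drug-by-quartile interactions


def drug_log_hr(quartile):
    """ln(HR), drug vs. placebo, within one age quartile."""
    return B_DRUG + B_DRUG_BY_QUARTILE.get(quartile, 0.0)


hazard_ratios = {q: math.exp(drug_log_hr(q)) for q in (1, 2, 3, 4)}

for quartile, hr in hazard_ratios.items():
    print(f"Age quartile {quartile}: HR = {hr:.2f}")
```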

Assessing EM With an Interaction Term

n  Testing for an interaction between age and treatment

$\ln(\text{hazard}) = \hat{\lambda}_o[t] - 0.07x_1 + 0.02x_2 + 0.58x_3 + 1.22x_4 + 0.28(x_1 * x_2) + 0.10(x_1 * x_3) - 0.25(x_1 * x_4)$

-  Ho: the slopes of all three interaction terms are 0
-  Resulting p-value: 0.74

Assessing EM With an Interaction Term

n  Conclusion: there is no statistically significant evidence that the effect of the drug is modified by age (although the results looked promising for the oldest age quartile, this was not significant after accounting for sampling variability)


Assessing EM With an Interaction Term

n  The inclusion of interaction terms in a multiple regression model (linear, logistic, or Cox) allows for the assessment of effect modification with (or without) adjustment for potentially multiple other predictors

Section B

Examples of Interaction Terms in Published Research


Learning Objectives

n  Exposure to several examples of the use of interaction terms in published analyses


Linear Regression With Interaction Term

n  Example 1: Depression and PTSD¹

¹ Chemtob C, et al. Maternal Posttraumatic Stress Disorder and Depression in Pediatric Primary Care: Association With Child Maltreatment and Frequency of Child Exposure to Traumatic Events. JAMA Pediatrics (2013).

Linear Regression With Interaction Term

n  Example 1: Depression and PTSD: Interaction model for outcome of psychological abuse score for child

$\hat{y} = \hat{\beta}_o + 0.15x_1 + 0.26x_2 - 0.02x_3$

where $\hat{y}$ = mean psychological abuse score, $x_1$ = depression scale, $x_2$ = PTSD scale, and $x_3 = x_1 * x_2$

Linear Regression With Interaction Term

n  Let’s look at several examples of the relationship between psychological abuse score and depression for specific values of PTSD

$\hat{y} = \hat{\beta}_o + 0.15x_1 + 0.26x_2 - 0.02x_3$

-  PTSD = 3: $\hat{y} = \hat{\beta}_o + 0.15x_1 + 0.26(3) - 0.02(3 * x_1)$
-  PTSD = 4: $\hat{y} = \hat{\beta}_o + 0.15x_1 + 0.26(4) - 0.02(4 * x_1)$
-  PTSD = 10: $\hat{y} = \hat{\beta}_o + 0.15x_1 + 0.26(10) - 0.02(10 * x_1)$
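Collecting the x1 terms at a fixed PTSD value gives a depression slope of 0.15 - 0.02 × PTSD. The short Python sketch below evaluates that slope at the three PTSD values used in the example; the function and constant names are assumptions made for illustration, and the intercept is left out because the slides do not report it.

```python
# Slopes from the interaction model for psychological abuse score.
B_DEPRESSION, B_PTSD, B_INTERACTION = 0.15, 0.26, -0.02


def depression_slope(ptsd_score):
    """Slope of the depression scale (x1) when the PTSD scale (x2) is fixed."""
    return B_DEPRESSION + B_INTERACTION * ptsd_score


slopes = {ptsd: depression_slope(ptsd) for ptsd in (3, 4, 10)}

for ptsd, slope in slopes.items():
    print(f"PTSD = {ptsd:2d}: slope of depression = {slope:.2f}")
```

The slope shrinks by 0.02 for each additional point on the PTSD scale and changes sign once the PTSD score exceeds 0.15 / 0.02 = 7.5.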

Linear Regression With Interaction Term

n  Overall Interpretation: the association between psychological abuse scores and depression decreases with increasing PTSD scores (the slope of depression is 0.15 - 0.02*PTSD)

Cox Regression With Interaction Term

n  Example 2: Secondhand Smoke and Fiber Consumption²

² Clark M, et al. Dietary fiber intake modifies the association between secondhand smoke exposure and coronary heart disease mortality among Chinese non-smokers in Singapore. Nutrition (2013) Vol 29: 1304–1309

Cox Regression With Interaction Term

n  Example 3: Association of Race and Age With Survival Among Patients Undergoing Dialysis³

³ Kucircka K, et al. Association of Race and Age With Survival Among Patients Undergoing Dialysis. Journal of the American Medical Association (2011); 306(6).

Cox Regression With Interaction Term

n  Methods Section Excerpts

Seven Age Groups
-  18-30 years
-  31-40 years
-  41-50 years
-  51-60 years
-  61-70 years
-  71-80 years
-  > 80 years


Cox Regression With Interaction Term

n  Model used by authors

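$\ln(\text{hazard of death}) = \hat{\lambda}_o[t] + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2 + \hat{\beta}_3 x_3 + \hat{\beta}_4 x_4 + \hat{\beta}_5 x_5 + \hat{\beta}_6 x_6 + \hat{\beta}_7 x_7 + \hat{\beta}_8 (x_1 * x_2) + \hat{\beta}_9 (x_1 * x_3) + \hat{\beta}_{10} (x_1 * x_4) + \hat{\beta}_{11} (x_1 * x_5) + \hat{\beta}_{12} (x_1 * x_6) + \hat{\beta}_{13} (x_1 * x_7) + (\text{other } x\text{'s and betas})$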
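The structure of this model can be sketched in Python. The slides do not define the x's, so the coding below is an assumption made for illustration: x1 is taken to be a 0/1 race indicator and x2 through x7 are taken to be indicators for the second through seventh age groups, with 18-30 years as the reference. The sketch lists which slopes make up the race ln(HR) within each age group; it uses no numeric estimates, since none appear in the slides.

```python
# Assumed coding (not stated on the slides): 18-30 years is the reference group.
AGE_GROUPS = ["18-30", "31-40", "41-50", "51-60", "61-70", "71-80", ">80"]


def design_row(race_indicator, age_group):
    """Return [x1, x2..x7, x1*x2..x1*x7] for one patient (13 values)."""
    if age_group not in AGE_GROUPS:
        raise ValueError(f"unknown age group: {age_group}")
    age_indicators = [1 if age_group == g else 0 for g in AGE_GROUPS[1:]]
    interactions = [race_indicator * a for a in age_indicators]
    return [race_indicator] + age_indicators + interactions


def race_log_hr_terms(age_group):
    """Names of the slopes that sum to the race ln(HR) within one age group."""
    exposed = design_row(1, age_group)
    reference = design_row(0, age_group)
    return [
        f"beta{i + 1}"
        for i, (a, b) in enumerate(zip(exposed, reference))
        if a != b
    ]


for group in AGE_GROUPS:
    print(f"{group:>5s}: ln(HR) for race = {' + '.join(race_log_hr_terms(group))}")
```

Under this assumed coding, the reference age group uses the race slope alone, and every other age group adds its own interaction slope to it.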

Cox Regression With Interaction Term

n  Close-up of Table 2


Mentioning Investigation of Interaction

n  Many articles will mention that the researchers investigated interaction, even if no interactions were found or reported

Mentioning Investigation of Effect Modification

n  From abstract⁴

⁴ Jagsi R, et al. Gender Differences in the Salaries of Physician Researchers. Journal of the American Medical Association (2012); 307(22): 2410-2417.

Mentioning Investigation of Effect Modification

n  “We explored pairwise interactions between gender and the other characteristics”


Section C

Non-linear Relationships with Continuous Predictors in Regression: the Spline Approach


Learning Objectives

n  Get a brief overview of another method for handling non-linearity in a regression setting that allows for a piecewise approach to estimating the relationship between an outcome and a continuous predictor

Example 1

n  Scatterplot: Arm Circumference and Weight, 1,000 Nepalese Children 0-60 months

[Figure: Arm Circumference and Weight, 1,000 Children, 0-60 Months Old. x-axis: Weight (kg); y-axis: Arm Circumference (cm)]

Example 1

n  Scatterplot : Linear or not?

[Figure: Arm Circumference and Weight, 1,000 Children, 0-60 Months Old. x-axis: Weight (kg); y-axis: Arm Circumference (cm)]

Example 1: Investigating Nonlinearity

n  Option 1: categorize weight into 4 groups

[Figure: Arm Circumference and Weight, 1,000 Children, 0-60 Months Old. x-axis: Weight (kg); y-axis: Arm Circumference (cm)]

Example 1: Investigating Nonlinearity

n  Option 2: fit a curve ($\hat{y} = \hat{\beta}_o + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_1^2$)

[Figure: Arm Circumference and Weight, 1,000 Children, 0-60 Months Old. x-axis: Weight (kg); y-axis: Arm Circumference (cm)]

Example 1: Investigating Nonlinearity

n  Option 3: More than 1 Line (spline)

[Figure: Arm Circumference and Weight, 1,000 Children, 0-60 Months Old. x-axis: Weight (kg); y-axis: Arm Circumference (cm)]

Example 1: Investigating Nonlinearity

n  Linear spline approach
-  Allows for nonlinearity to be investigated via fitting lines with differing slopes across the predictor range
-  Researcher can pick points where line slope can change
-  Slope changes can be estimated at multiple points

n  Effect modification analogy
-  Non-linearity occurs when an outcome/predictor relationship is modified by the predictor
-  Ex: relationship between arm circumference (AC) and weight depends on weight

Example 1: Investigating Nonlinearity

n  Option 3: More than 1 Line (spline)

[Figure: Arm Circumference and Weight, 1,000 Children, 0-60 Months Old. x-axis: Weight (kg); y-axis: Arm Circumference (cm)]

Example 1: Investigating Nonlinearity

n  Estimated association with a spline at 5 kg

$\hat{y} = 6.25 + 1.17x_1 - 0.86(x_1 - 5)^+$

where $x_1 = \text{weight}$ and

$(x_1 - 5)^+ = \begin{cases} 0 & \text{if } x_1 < 5 \\ (x_1 - 5) & \text{if } x_1 \geq 5 \end{cases}$

Example 1: Investigating Nonlinearity

n  Why does this work?

-  When x1 < 5:

$\hat{y} = 6.25 + 1.17x_1$

-  When x1 ≥ 5:

$\hat{y} = 6.25 + 1.17x_1 - 0.86(x_1 - 5)$

$\hat{y} = 6.25 + 1.17x_1 - 0.86x_1 + (-0.86)(-5)$

$\hat{y} = 6.25 + (-0.86)(-5) + 1.17x_1 - 0.86x_1$

$\hat{y} = 10.55 + (1.17 - 0.86)x_1$
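The two-line behavior of the spline model can be checked with a small Python sketch. It uses the estimated coefficients from the slide (6.25, 1.17, -0.86) and a knot at 5 kg; the function names are assumptions made for illustration.

```python
KNOT_KG = 5.0


def positive_part(value):
    """Return (value)+ : the value itself if positive, otherwise 0."""
    return value if value > 0 else 0.0


def predicted_arm_circumference(weight_kg):
    """Estimated mean arm circumference (cm) from the linear spline model."""
    return 6.25 + 1.17 * weight_kg - 0.86 * positive_part(weight_kg - KNOT_KG)


def weight_slope(weight_kg):
    """Slope of weight below the knot, or at and above the knot."""
    return 1.17 if weight_kg < KNOT_KG else 1.17 - 0.86


for weight in (3, 5, 10):
    print(
        f"weight = {weight:2d} kg: "
        f"y-hat = {predicted_arm_circumference(weight):.2f} cm, "
        f"slope = {weight_slope(weight):.2f}"
    )
```

Above the knot the fitted line has slope 1.17 - 0.86 = 0.31 and intercept 6.25 + 4.30 = 10.55, and the two line segments meet at 5 kg, so the fitted relationship has no jump at the knot.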

Example 1: Investigating Nonlinearity

n  Option 3: More than 1 Line (spline)

[Figure: Arm Circumference and Weight, 1,000 Children, 0-60 Months Old. x-axis: Weight (kg); y-axis: Arm Circumference (cm)]

Example 1: Investigating Nonlinearity

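n  Testing for change in slope at 5 kg

$\hat{y} = 6.25 + 1.17x_1 - 0.86(x_1 - 5)^+$

-  The slope of the spline term (-0.86) estimates the change in slope at 5 kg, so testing whether this slope is 0 tests for a change in slope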

Example 2: Investigating Nonlinearity

n  NHANES Data: Obesity and age: lowess plot of ln(odds) of obesity versus age

[Figure: Lowess smoother, logit transformed smooth, bandwidth = .8. x-axis: Age (years); y-axis: ln(Odds of Being Obese)]

Example 2: Investigating Nonlinearity

n  NHANES Data: Obesity and age: lowess plot of ln(odds) of obesity versus age

n  Fit a model that allows for changes in slope at 40 and 60, plus adjustment for HDL levels and sex

$\ln(\text{odds of obesity}) = \hat{\beta}_o + \hat{\beta}_1 x_1 + \hat{\beta}_2 (x_1 - 40)^+ + \hat{\beta}_3 (x_1 - 60)^+ + \hat{\beta}_4 x_4 + \hat{\beta}_5 x_5$

where $x_1$ = age (years), $x_4$ = HDL (mg/dL), $x_5$ = sex (1 = F)

Example 2: Investigating Nonlinearity

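n  Adjusted ln(odds ratios) for age

$\ln(\text{odds of obesity}) = \hat{\beta}_o + \hat{\beta}_1 x_1 + \hat{\beta}_2 (x_1 - 40)^+ + \hat{\beta}_3 (x_1 - 60)^+ + \hat{\beta}_4 x_4 + \hat{\beta}_5 x_5$

age       ln(OR) for one year difference in age
< 40      $\hat{\beta}_1$
40-60     $\hat{\beta}_1 + \hat{\beta}_2$
>= 60     $\hat{\beta}_1 + \hat{\beta}_2 + \hat{\beta}_3$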

Example 2: Investigating Nonlinearity

n  Results

Table 1: Logistic Regression Results for Predictors of Obesity

                 Odds Ratio (95% CI)
Predictor        Unadjusted           Adjusted¹
Age (years)
  < 40 years     1.04 (1.03, 1.05)    1.04 (1.02, 1.05)
  40-60 years    0.97 (0.95, 0.99)    1.00 (0.99, 1.01)
  >= 60 years    0.96 (0.94, 0.98)    0.99 (0.95, 1.01)

¹ adjusted for HDL levels and sex
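Per-year odds ratios from a spline model can be combined to compare two ages that fall in different segments. The Python sketch below assumes that each adjusted odds ratio in the table is the odds ratio for a one-year difference in age within its own segment; that reading of the table, and the function names, are assumptions made for illustration. Confidence intervals are not computed.

```python
import math

# Adjusted per-year odds ratios by age segment, read from the results table.
# Assumption: each is the OR for a one-year difference within its segment.
SEGMENTS = [
    (float("-inf"), 40.0, 1.04),
    (40.0, 60.0, 1.00),
    (60.0, float("inf"), 0.99),
]


def log_odds_ratio(age_from, age_to):
    """ln(OR) of obesity comparing age_to with age_from (age_to >= age_from)."""
    total = 0.0
    for low, high, per_year_or in SEGMENTS:
        # Years of the comparison interval that fall inside this segment.
        years_in_segment = max(0.0, min(age_to, high) - max(age_from, low))
        total += years_in_segment * math.log(per_year_or)
    return total


odds_ratio_35_to_65 = math.exp(log_odds_ratio(35, 65))
print(f"OR comparing age 65 to age 35: {odds_ratio_35_to_65:.2f}")
```

The comparison of age 65 with age 35 spends 5 years in the first segment, 20 in the second, and 5 in the third, so the ln(OR) is the sum of those three contributions.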

Example 3: Investigating Nonlinearity

n  Soda consumption and physical education classes1

1 Chen H, and Wang Y. Influence of School Beverage Environment on the Association of Beverage Consumption With Physical Education Participation Among US Adolescents. American Journal of Public Health (2013). 103 (11)

Example 3: Investigating Nonlinearity

n  Soda consumption and physical education classes

n  Model 4:

$\hat{y} = \hat{\beta}_o - 0.26x_1 - 0.18x_2 - 0.61(x_2 - 3)^+ + (\text{other slopes and } x\text{s})$

Where:

x1 = number of days of moderate to vigorous activity
x2 = number of days participating in physical education class

$(x_2 - 3)^+ = \begin{cases} 0 & \text{if } x_2 < 3 \\ (x_2 - 3) & \text{if } x_2 \geq 3 \end{cases}$

Example 3: Investigating Nonlinearity

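n  Model 4: When x2<3:

$\hat{y} = \hat{\beta}_o - 0.26x_1 - 0.18x_2 - 0.61(x_2 - 3)^+ + (\text{other slopes and } x\text{s})$

$\hat{y} = \hat{\beta}_o - 0.26x_1 - 0.18x_2 + (\text{other slopes and } x\text{s})$

Example 3: Investigating Nonlinearity

n  Model 4: When x2≥3:

$\hat{y} = \hat{\beta}_o - 0.26x_1 - 0.18x_2 - 0.61(x_2 - 3)^+ + (\text{other slopes and } x\text{s})$

$\hat{y} = \hat{\beta}_o - 0.26x_1 - 0.18x_2 - 0.61(x_2 - 3) + (\text{other slopes and } x\text{s})$

$\hat{y} = \hat{\beta}_o - 0.26x_1 - 0.18x_2 - 0.61x_2 + 3(0.61) + (\text{other slopes and } x\text{s})$

$\hat{y} = \hat{\beta}_o + 3(0.61) - 0.26x_1 - (0.18 + 0.61)x_2 + (\text{other slopes and } x\text{s})$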
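The two cases for Model 4 can be verified with a short Python sketch. It uses the slopes shown on the slides for the physical education term (-0.18) and its spline term (-0.61) with a knot at 3 days; the intercept and the other predictors are left out, and the function names are assumptions made for illustration.

```python
B_PE, B_PE_SPLINE, KNOT_DAYS = -0.18, -0.61, 3


def pe_slope(pe_days):
    """Slope of physical education days (x2) below, or at and above, the knot."""
    return B_PE if pe_days < KNOT_DAYS else B_PE + B_PE_SPLINE


def pe_contribution(pe_days):
    """Part of y-hat contributed by x2 and its spline term only."""
    return B_PE * pe_days + B_PE_SPLINE * max(0, pe_days - KNOT_DAYS)


for days in range(0, 6):
    print(
        f"x2 = {days}: slope = {pe_slope(days):.2f}, "
        f"contribution = {pe_contribution(days):.2f}"
    )
```

At and above 3 days the contribution equals 3(0.61) - (0.18 + 0.61)x2, which matches the last line of the algebra for x2 ≥ 3.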

Summary

n  Linear splines offer an alternative to categorizing a continuous predictor when investigating and/or handling potential non-linearity in an outcome/exposure association estimated with regression (simple or multiple)

n  This approach is useful when the “per unit” change in a measure of association (mean difference, odds ratio, hazard ratio) is of scientific interest, but the association is not necessarily linear on the regression scale