1 Psych 5510/6510 Chapter 13 ANCOVA: Models with Continuous and Categorical Predictors Part 2: Controlling for Confounding Variables Spring, 2009

Psych 5510/6510

Chapter 13

ANCOVA: Models with Continuous and Categorical Predictors

Part 2: Controlling for Confounding Variables

Spring, 2009

Contexts

In this chapter we are looking at three contexts in which we might want to combine continuous and categorical variables in our model:

1. Within a ‘true experimental’ design, where we can use this approach to increase the power of the design and to add sophistication to our model.

2. Within a ‘quasi-experimental’ or ‘static group’ design, where we can use this approach to control a confounding variable.

3. Within a correlational design, where we can introduce a categorical variable to better understand a continuous variable.

Context 2: Controlling for Confounding Variables

In this context we are looking at including continuous variables in the model as a way of controlling confounding variables, particularly within quasi-experimental designs and static group designs.

Quasi-Experimental Designs

In a quasi-experimental design subjects are not randomly assigned to groups, usually for some practical or ethical reason. The groups are still assumed to be essentially as equal as they would have been had random assignment been possible. The independent variable is applied to the groups, and if the group means are subsequently found to differ, this difference is attributed to the independent variable.

Confounding Variables

Because the subjects are not randomly assigned to groups, the door is open for a certain type of confounding variable: pre-existing differences between the groups that could account for the difference between the group means found after the independent variable is applied.

Static Group Design

In a static group design the independent variable is what we use to divide subjects into groups. When, for example, the independent variable is ‘gender’, the subjects are assigned to groups based upon their gender, and we then measure some dependent variable to see if there is a difference between the genders. The independent variable ‘gender’ is a pre-existing property of the subjects; it is not a variable that is manipulated by the experimenter. In general, in a static group design the independent variable is not manipulated by the experimenter.

Quasi-experimental vs. Static Group

Make sure you understand the difference between the two designs. Both involve non-random assignment to groups. In a quasi-experimental design you hope the groups start off as equal as they would have been if you could have randomly assigned subjects to groups, and you then manipulate the independent variable to see if you can make the groups different. In a static group design you use the independent variable to divide subjects into groups, then measure to see if the groups are different.

Example (Quasi-Experimental)

We want to examine the effectiveness of two different types of curriculum for teaching mathematics (curriculums ‘b1’ and ‘b2’). We have two different teachers (teachers ‘a1’ and ‘a2’). Each teacher teaches both curriculums in different classes, giving us four treatment combinations (classes) in a 2 by 2 factorial design. The dependent variable is the score on an identical final exam that is used by both teachers with both curriculums.

Design

2-factor design (note: as there are only two levels of each I.V., we can code each I.V. and the interaction term with one contrast each).

I.V. A = Teacher a1 or Teacher a2 (i.e. X1i)

I.V. B = Old Curriculum or New Curriculum (i.e. X2i)

Interaction of A and B (i.e. X3i)

D.V. = scores on a final (i.e. Yi)

If we could randomly assign students to treatment combinations it would be a true experimental design. In this case, however, students are allowed to select which class they take. As 1) there is non-random assignment to groups, and 2) the independent variables (teacher and curriculum) are manipulated by the experimenter, this is a quasi-experimental design.
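The one-contrast-per-term coding described in this design can be sketched as follows. This is a minimal illustration: the +1/-1 values and the cell labels (a1/a2 for teacher, b1/b2 for curriculum) are assumptions, since the slides do not list the actual codes used.

```python
# Contrast codes for a 2 x 2 design (teacher x curriculum).
# The +1/-1 values are an assumed coding, not taken from the slides.
cells = [("a1", "b1"), ("a1", "b2"), ("a2", "b1"), ("a2", "b2")]

def contrast_codes(teacher, curriculum):
    """Return (X1, X2, X3) for a student in the given cell."""
    x1 = 1 if teacher == "a1" else -1       # teacher contrast
    x2 = 1 if curriculum == "b1" else -1    # curriculum contrast
    x3 = x1 * x2                            # teacher x curriculum interaction
    return x1, x2, x3

codes = {cell: contrast_codes(*cell) for cell in cells}

def dot(j, k):
    """Sum of products of two code columns across the four cells."""
    return sum(c[j] * c[k] for c in codes.values())

for cell, (x1, x2, x3) in codes.items():
    print(cell, x1, x2, x3)
```

With equal cell sizes the three code columns are mutually orthogonal (each pairwise `dot` is zero), which is why one contrast per term is enough here.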

Analysis Without Controlling for Confounding Variables

Model: Ŷi=β0 + β1X1i+ β2X2i+ β3X3i

Model Summary

Model   R       R Square   Adjusted R Square   Std. Error of the Estimate
1       .318a   .101       .078                3.05451

a. Predictors: (Constant), X3, X2, X1

ANOVA(b)

Model 1      Sum of Squares   df    Mean Square   F       Sig.
Regression   121.947          3     40.649        4.357   .006a
Residual     1082.284         116   9.330
Total        1204.232         119

a. Predictors: (Constant), X3, X2, X1
b. Dependent Variable: Y
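As a quick check on how the summary statistics relate to the sums of squares, the arithmetic can be reproduced directly from the ANOVA output above. The variable names are my own; the numbers are copied from the output.

```python
# Values copied from the ANOVA output for the model with X1, X2, X3.
ss_regression, df_regression = 121.947, 3
ss_residual, df_residual = 1082.284, 116

ss_total = ss_regression + ss_residual

# R squared (the PRE for the whole model) is the share of total SS explained.
r_squared = ss_regression / ss_total

# F is the ratio of the two mean squares.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual
f_ratio = ms_regression / ms_residual

print(round(r_squared, 3))  # about .101, matching R Square
print(round(f_ratio, 3))    # about 4.357, matching F
```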

Analysis Without Controlling for Confounding Variables (cont.)

Model: Ŷi=β0 + β1X1i+ β2X2i+ β3X3i

Coefficients(a)

Model 1      B        Std. Error   Beta   t         Sig.
(Constant)   70.624   .279                253.279   .000
X1           .652     .279         .206   2.339     .021
X2           .363     .279         .115   1.303     .195
X3           .677     .279         .214   2.429     .017

B and Std. Error are the unstandardized coefficients; Beta is the standardized coefficient.
a. Dependent Variable: Y

The effect of teacher (X1) and the teacher x curriculum interaction (X3) are both statistically significant; the effect of curriculum (X2) is not.

Confounding Variable

But letting the students select which class they take introduces possible confounding variables. The one we will examine is the possibility that the students who are better at math may prefer one teacher over the other, or one approach over the other, and this in turn influences which class they take. If this is the case, then the difference between the cell means might not be due to the independent variables but instead be due to the better students deciding to enroll in a particular class. In other words, we would have gotten different results if the students were randomly assigned to the cells.

Covariate

To control this confounding variable we decide to give everyone a pre-test that measures mathematical ability (our ‘covariate’ in ANOVA terms). We will call this variable ‘Z’ to make it easier to keep track of it when we include it in a model with the categorical variables X1, X2, and X3.

Redundancy

If we had randomly assigned students to classes then we would expect the mean value of Z (math ability) to be about the same in each cell, which would make Z not redundant with our categorical variables.

But in this case we are including Z specifically because we think its mean value is not the same in each cell. We think that knowing which cell a student is in would help us predict their Z score, and vice versa. In other words we think that Z and the categorical variables are at least somewhat redundant.

Testing to Determine Whether Z is a Confounding Variable

To test whether math ability (measured by variable Z) is a confounding variable we will test whether it is redundant with the independent variables of the experiment. If we can use the independent variables to predict Z then Z is at least somewhat redundant with the independent variables. If it is redundant then it can account for some of the same error that the independent variables can (making it a confounding variable).

Testing to Determine Whether Z is a Confounding Variable

Model C: Ẑi = β0

Model A: Ẑi = β0 + β1X1i + β2X2i + β3X3i

If the independent variables, coded by X1, X2, and X3 can predict Z then they are redundant with Z.
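The Model C versus Model A comparison for Z can be sketched in a few lines. Because X1, X2, and X3 together distinguish all four cells, Model A's predictions are simply the cell means of Z, so no regression routine is needed. The pre-test scores below are hypothetical values made up for illustration, not the data behind the slides.

```python
from statistics import mean

# Hypothetical pre-test (Z) scores, four students per cell.
z_by_cell = {
    ("a1", "b1"): [74, 78, 71, 77],
    ("a1", "b2"): [72, 75, 70, 73],
    ("a2", "b1"): [68, 71, 66, 69],
    ("a2", "b2"): [67, 70, 65, 72],
}

all_z = [z for scores in z_by_cell.values() for z in scores]
grand_mean = mean(all_z)

# Model C predicts the grand mean for everyone.
sse_c = sum((z - grand_mean) ** 2 for z in all_z)

# Model A (X1, X2, X3) predicts each student's cell mean.
sse_a = sum((z - mean(scores)) ** 2
            for scores in z_by_cell.values() for z in scores)

pre = (sse_c - sse_a) / sse_c            # R squared for predicting Z
df_extra = len(z_by_cell) - 1            # three added parameters
df_resid = len(all_z) - len(z_by_cell)   # n minus parameters in Model A
f_ratio = (pre / df_extra) / ((1 - pre) / df_resid)
tolerance = 1 - pre                      # tolerance of Z given X1, X2, X3

print(round(pre, 3), round(f_ratio, 3), round(tolerance, 3))
```

A PRE well above zero here means cell membership helps predict Z, which is the redundancy the test is looking for.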

Model C: Ẑi = β0

Model A: Ẑi = β0 + β1X1i + β2X2i + β3X3i

Z is redundant with the independent variables and thus is a confounding variable. Note that 1 - R² = 1 - .068 = .932 is the tolerance of Z in a design that contains Z, X1, X2, and X3 as predictors. This is not a lot of redundancy but it is still statistically significant (p = .041).

Model Summary

Model   R       R Square   Adjusted R Square   Std. Error of the Estimate
1       .261a   .068       .044                5.14225

a. Predictors: (Constant), X3, X2, X1

ANOVA(b)

Model 1      Sum of Squares   df    Mean Square   F       Sig.
Regression   225.023          3     75.008        2.837   .041a
Residual     3067.353         116   26.443
Total        3292.376         119

a. Predictors: (Constant), X3, X2, X1
b. Dependent Variable: Z

Remember that if the PRE of using X1, X2, and X3 to predict Z is statistically significant then we can conclude that Z is redundant. Given the nature of null hypothesis testing, however, if the PRE is not statistically significant that does not prove that Z is not redundant (sorry about the double negative), for failure to reject H0 does not prove H0 is true. In other words, we can show that Z is a confounding variable but we can’t show that it is not.

Including Z in the Model

If we conclude that Z is a confounding variable then we will want to include it in our model.

Ŷi= β0+ β1X1 +β2X2 +β3X3 +β4Z

Controlling the Confounding Variable

Ŷi= β0+ β1X1 +β2X2 +β3X3 +β4Z

Remember that when we analyze β1, β2 , and β3 we are looking at how much their corresponding variables add to the model after all of the other terms (including Z) are included. So we are looking for the effects of the independent variables after individual differences on math ability have already been accounted for, thus we have taken the confounding variable (Z) out of the analyses of the independent variables.
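A small sketch of this idea, using made-up, noise-free data rather than the course data set: the teachers are constructed to be equally effective, but the stronger students (higher Z) sit in teacher a1's classes. The `ols` helper, the +1/-1 codes, and all the numbers are assumptions for illustration only.

```python
def ols(columns, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(columns)
    a = []
    for i in range(k):
        row = [sum(p * q for p, q in zip(columns[i], columns[j])) for j in range(k)]
        row.append(sum(p * q for p, q in zip(columns[i], y)))
        a.append(row)
    # Gauss-Jordan elimination with partial pivoting on the augmented matrix.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * p for v, p in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]

ones = [1.0] * 8
x1 = [1, 1, 1, 1, -1, -1, -1, -1]        # teacher: a1 = +1, a2 = -1 (assumed)
x2 = [1, 1, -1, -1, 1, 1, -1, -1]        # curriculum (assumed codes)
x3 = [p * q for p, q in zip(x1, x2)]     # interaction
z = [80, 76, 78, 74, 70, 66, 72, 64]     # pre-test: higher in a1's classes
y = [40 + 0.5 * zi for zi in z]          # final exam depends on Z only

b_without_z = ols([ones, x1, x2, x3], y)
b_with_z = ols([ones, x1, x2, x3, z], y)

print("b1 without Z:", round(b_without_z[1], 3))
print("b1 with Z:   ", round(b_with_z[1], 3))
```

Without Z the teacher coefficient picks up the difference in student ability; once Z is in the model, the same coefficient falls to essentially zero, which is what the data were built to contain.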

Power

We are including Z specifically because it is redundant with the categorical predictors, but this can hurt the power of our test to see if the independent variables had an effect (see next slide). This, however, is what we want, for we are removing the confounding variable that was giving us an unrealistic picture of the effects of the independent variables.

PRE When Predictors are Redundant

Analysis When Controlling for Confounding Variable Z

Model: Ŷi=β0 + β1X1i+ β2X2i+ β3X3i + β4Zi

Model Summary

Model   R       R Square   Adjusted R Square   Std. Error of the Estimate
1       .341a   .116       .086                3.04172

a. Predictors: (Constant), Z, X3, X2, X1

ANOVA(b)

Model 1      Sum of Squares   df    Mean Square   F       Sig.
Regression   140.248          4     35.062        3.790   .006a
Residual     1063.984         115   9.252
Total        1204.232         119

a. Predictors: (Constant), Z, X3, X2, X1
b. Dependent Variable: Y

Analysis (cont.)

Coefficients(a)

Model 1      B        Std. Error   Beta   t        Sig.
(Constant)   65.169   3.888               16.760   .000
X1           .547     .287         .173   1.904    .059
X2           .349     .278         .110   1.255    .212
X3           .676     .278         .213   2.435    .016
Z            .077     .055         .128   1.406    .162

B and Std. Error are the unstandardized coefficients; Beta is the standardized coefficient.
a. Dependent Variable: Y

1) When we included the confounding variable in the model (thus removing its effect from the analysis of the independent variables), variable X1 (the difference between the two teachers) was no longer statistically significant (p = .059, compared with p = .021 without Z). This makes sense (to me), for when I created the data I simulated the better students preferring to take the classes offered by teacher a1. When this confound was removed from the analysis, the difference in effectiveness between the two teachers was no longer as great.

2) The effect of curriculum (X2) remained non-significant.

3) The interaction of the two independent variables (X3) remained significant; in fact its p value went down a little when Z was included in the model (from .017 to .016). Why would that happen? In this case the answer probably lies in the tolerances: the tolerance of X3 (tolerances not shown above) is 1.00, thus it wasn’t redundant at all with the other predictor variables (including Z). If it wasn’t redundant with Z, then Z probably served to remove some of the within-group variance involved in that contrast.

What We Have Done (ANOVA)

Ŷi= β0+ β1X1 +β2X2 +β3X3 +β4Z

In terms of ANOVA, we have used Z to adjust the mean of each cell in such a way that the effect of the confounding variable is removed before the analysis of each independent variable begins (because each variable is analyzed as if it were added last to a model that contains the other variables).
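One common way to write this adjustment, assuming the additive model (the same slope for Z in every cell), is sketched below. The bar notation and the cell subscript jk are my own additions; b4 is the estimated coefficient for Z.

```latex
\bar{Y}_{jk}^{\,\mathrm{adj}} = \bar{Y}_{jk} - b_4\left(\bar{Z}_{jk} - \bar{Z}\right)
```

With b4 = .077 from the coefficients output, a cell whose pre-test mean was, hypothetically, 2 points above the overall pre-test mean would have its final-exam mean adjusted downward by .077 × 2 = .154 points. The cell means of Z are not reported in the slides, so the 2-point figure is only an illustration.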

What We Have Done (Model-Comparison Approach)

Ŷi= β0+ β1X1 +β2X2 +β3X3 +β4Z

In model-comparison terms we are simply interested in whether it is worthwhile to include the pre-test (math ability) scores in our model, and how that influences the worthwhileness of the other variables.
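The question of whether Z is worth adding can be checked from the two residual sums of squares reported earlier (1082.284 for the model without Z, 1063.984 for the model with Z). The sketch below only reproduces that arithmetic; the variable names are my own.

```python
# Residual sums of squares copied from the two ANOVA outputs for Y.
sse_compact = 1082.284     # model with X1, X2, X3
sse_augmented = 1063.984   # model with X1, X2, X3, Z
df_augmented = 115         # residual df of the augmented model
extra_parameters = 1       # Z is the only added term

# Proportional reduction in error from adding Z.
pre = (sse_compact - sse_augmented) / sse_compact

# F for the added parameter; its square root equals |t| for Z.
f_ratio = ((sse_compact - sse_augmented) / extra_parameters) / (
    sse_augmented / df_augmented)
t_equivalent = f_ratio ** 0.5

print(round(pre, 4))           # about .0169
print(round(f_ratio, 2))       # about 1.98
print(round(t_equivalent, 3))  # about 1.406, the t reported for Z
```

The square root of F matches the t of 1.406 reported for Z in the coefficients output, as it should for a single added parameter.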

Summary of the Two Approaches

In the ANOVA perspective we are trying to remove a confounding variable from the analysis of the effects of our independent variables. In the model-comparison approach we are trying to come up with the best model of the dependent variable.

Caveat

This provides a simple way to control for confounding variables in an experiment that does not have random assignment to groups. You identify possible confounding variables, measure them, determine if they are confounding, then see if the effects of the independent variable change when the effects of the confounding variables are controlled by adding them to the model.

This is not, however, a sure-fire approach, for in a quasi-experimental design it is hard to know for sure whether or not you have thought of all possible confounding variables.

Further Explorations

Following the model-comparison perspective, let’s say we find that it is worthwhile to have both the categorical variables (coding the independent variables) and the continuous variable (our confounding variable) in our model. It might be interesting to see if it would be worthwhile to move from an additive model to an interactive model:

Ŷi= β0+ β1X1 +β2X2 +β3X3 +β4Z +β5ZX1 +β6ZX2 +β7ZX3

Further Explorations

Let’s examine what the interaction terms would measure: X1 codes which teacher is teaching the class, so ZX1 would look at whether the difference in effectiveness between the teachers was dependent upon the math ability of the student. It could be that for the students of low math ability the difference between the teachers was quite high, but for those of high math ability there was little difference between the teachers. Or, it could be that one teacher was better with students of low ability while the other was better with students of high ability. Very interesting!

Further Explorations

X2 codes which curriculum was used, so ZX2 would look at whether the difference in curriculum was dependent upon the math ability of the student. That could be quite interesting.

X3 codes the interaction between teacher and curriculum, so ZX3 would look at whether the interaction between the variables was dependent upon the math ability of the student. That’s a little harder to conceptualize but that could be quite interesting as well.
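One way to see what the three interaction terms do together: in the interactive model the slope of Z is no longer a single number but depends on the cell, namely β4 + β5X1 + β6X2 + β7X3. The sketch below uses hypothetical coefficients and assumed +1/-1 contrast codes; none of these values come from the slides.

```python
# Hypothetical coefficients for the interactive model (not estimates).
b = {"b0": 60.0, "b1": 0.5, "b2": 0.3, "b3": 0.6,
     "b4": 0.50, "b5": 0.10, "b6": 0.00, "b7": -0.05}

def predict(x1, x2, x3, z):
    """Y-hat = b0 + b1X1 + b2X2 + b3X3 + b4Z + b5ZX1 + b6ZX2 + b7ZX3."""
    return (b["b0"] + b["b1"] * x1 + b["b2"] * x2 + b["b3"] * x3
            + b["b4"] * z + b["b5"] * z * x1
            + b["b6"] * z * x2 + b["b7"] * z * x3)

def slope_of_z(x1, x2, x3):
    """Change in the prediction per one-point increase in Z within a cell."""
    return b["b4"] + b["b5"] * x1 + b["b6"] * x2 + b["b7"] * x3

# Assumed +1/-1 contrast codes for the four cells (X3 = X1 * X2).
cells = {"a1b1": (1, 1, 1), "a1b2": (1, -1, -1),
         "a2b1": (-1, 1, -1), "a2b2": (-1, -1, 1)}

for name, codes in cells.items():
    print(name, round(slope_of_z(*codes), 3))
```

If β5, β6, and β7 were all zero, every cell would share the slope β4 and the model would reduce to the additive one.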

Effects of Confounding Variables

It is important to note that confounding variables can hide real effects of the independent variable as well as create apparent effects of the independent variables that don’t really exist. So far we have taken a look at the latter case, where math ability created a difference between teacher effectiveness that went away when the confounding variable was included in the model.

Second Data Set

In the first data set I had the better students choose to take classes from teacher a1, when actually teachers a1 and a2 were equally effective. The effect of the confounding variable was to artificially increase the differences between the apparent effectiveness of the two teachers.

In this second data set I had the better students take the class from teacher a1 when teacher a2 was actually more effective. The greater effectiveness of teacher a2, however, is being hidden by the confounding variable: the better students selected the class taught by the less effective teacher a1.
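Both data sets can be summarized with one expression. As a sketch, assume the additive model with Z, equal cell sizes, and a contrast code of +1 for teacher a1 and -1 for teacher a2 (the actual codes are not given in the slides). The X2 and X3 terms then average to zero within each teacher, and the expected difference between the unadjusted teacher means is:

```latex
\bar{Y}_{a1} - \bar{Y}_{a2} = 2\beta_1 + \beta_4\left(\bar{Z}_{a1} - \bar{Z}_{a2}\right)
```

In the first data set β1 = 0 but the pre-test difference is positive, so the second term creates an apparent teacher effect. In the second data set 2β1 is negative under this coding (a2 is better) while the second term is positive, so the two terms partly cancel and the real effect is hidden.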

Analysis Without Controlling for Confounding Variable Z

Model: Ŷi=β0 + β1X1i+ β2X2i+ β3X3i

Coefficients(a)

Model 1      B       Std. Error   Beta   t        Sig.
(Constant)   -.061   .060                -1.010   .315
X1           .094    .060         .143   1.570    .119
X2           .077    .060         .117   1.282    .202
X3           .057    .060         .086   .945     .346

B and Std. Error are the unstandardized coefficients; Beta is the standardized coefficient.
a. Dependent Variable: Y

Notice that none of the variables that code the independent variables are statistically significant, including that of teacher effectiveness (X1). Despite my having created data where teacher a2 is better than teacher a1, that difference is being hidden by the confounding variable.

Analysis With Controlling for Confounding Variable Z

Model: Ŷi=β0 + β1X1i+ β2X2i+ β3X3i + β4Zi

Notice that when we control the confounding variable by including it in the model, variable X1 (which codes the difference in effectiveness between the two teachers) is now statistically significant.

Coefficients(a)

Model 1      B       Std. Error   Beta   t        Sig.
(Constant)   -.076   .060                -1.282   .203
X1           .129    .061         .195   2.101    .038
X2           .072    .059         .109   1.219    .225
X3           .056    .059         .085   .953     .343
Z            .127    .059         .201   2.165    .032

B and Std. Error are the unstandardized coefficients; Beta is the standardized coefficient.
a. Dependent Variable: Y

Static Group Designs

The exact same procedures for controlling a confounding variable can be used in a static group design.

In a static group design subjects are not randomly assigned to groups, and the independent variable is not manipulated by the experimenter. Instead, the independent variable is used to sort subjects into groups.

Example

An example of a static group design would be to examine the effect of gender on salaries in some company. Gender is the independent variable, but it is not something that is ‘done’ to the subjects; instead it is the criterion by which subjects are assigned to one group or the other.

Confounding Variable

Confounding variables in static group designs have to do with other systematic differences between the groups that arise when you use the independent variable to sort the subjects. In our example, if we use gender to sort employees into two groups, we might also end up sorting them by how many years they have worked for the company. It could be that only within the last ten years has the company given women an equal opportunity in the hiring process. If that is the case, and we find that the two groups have different mean incomes, is that attributable to our independent variable (gender) or is it due to the confounding variable (years of service)?

Controlling the Confounding Variable

To control this confounding variable we simply add to our model how many years the employee has worked at the company.

Independent Variable: Gender (X1)
Confounding Variable: Years of Employment (X2)
Dependent Variable: Salary (Y)

Model: Ŷi= β0+ β1X1 +β2X2

Now when we analyze the effect of gender (X1) on salary (Y) we will be doing so after years of employment (X2) has been accounted for (i.e. held steady, i.e. looking at differences in salary between genders for people who have been there an equal amount of time).
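What "after years of employment has been accounted for" means can be sketched by partialling years out of both gender and salary and then relating what is left over. This gives the same gender coefficient as fitting the two-predictor model directly. The data, the +1/-1 gender code, and the salary rule below are all hypothetical and noise-free, chosen only so the result is easy to verify by hand.

```python
from statistics import mean

def slope(x, y):
    """Simple-regression slope of y on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx

def residuals(x, y):
    """What is left of y after removing its linear relation to x."""
    b = slope(x, y)
    a = mean(y) - b * mean(x)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]

# Hypothetical employees: gender coded +1 / -1, with the +1 group
# having more years of service on average.
gender = [1, 1, 1, 1, -1, -1, -1, -1]
years = [12, 15, 9, 20, 3, 6, 8, 5]
# Made-up salary rule (in thousands): 2 per year plus a gender term of 1.5.
salary = [30 + 2 * yr + 1.5 * g for yr, g in zip(years, gender)]

# Gender coefficient ignoring years of employment.
raw_b1 = slope(gender, salary)

# Gender coefficient with years partialled out of both variables.
adjusted_b1 = slope(residuals(years, gender), residuals(years, salary))

print(round(raw_b1, 3), round(adjusted_b1, 3))
```

Here the unadjusted coefficient (10) mixes the built-in gender term (1.5) with the difference in years of service; the adjusted coefficient recovers the 1.5.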

More Work to be Done

This study is still pretty unsophisticated. For example, when we measure how many years each employee has worked for the company we are missing out on how much experience they had before they were hired.

There also could be other confounding variables we have yet to add, for example, differences in educational levels between the genders.

And there are other variables that would be interesting to add as we progress. For example, we could ask whether it is worthwhile to add to the model the gender of the person making the decisions regarding promotions.

Interaction

We might also include in our model the interaction of gender with years at the job. This could tell us whether the rate of getting salary increases over the years is dependent upon gender.
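Written out, the model with the interaction can be regrouped to show why it speaks to the rate of salary increase. This is a sketch using the same symbols as the additive model (X1 for gender, X2 for years of employment), with β3 as an added coefficient for the product term.

```latex
\hat{Y}_i = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_1 X_2
          = (\beta_0 + \beta_1 X_1) + (\beta_2 + \beta_3 X_1)\, X_2
```

If gender were coded +1 and -1 (an assumed coding), the salary increase per year of employment would be β2 + β3 for one gender and β2 - β3 for the other, so the two rates differ by 2β3. Testing whether β3 is worth including is therefore a test of whether the rate of increase depends on gender.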