
Communications in Statistics—Theory and Methods, 42: 4540–4556, 2013
Copyright © Taylor & Francis Group, LLC
ISSN: 0361-0926 print/1532-415X online
DOI: 10.1080/03610926.2011.650267

Identification of Confounding versus Dispersing Covariates by Confounding Influence

XIAOQIN WANG1 AND LI YIN2

1Department of Electronics, Mathematics and Natural Sciences, University of Gävle, Gävle, Sweden
2Department of Medical Epidemiology and Biostatistics, Karolinska Institute, Stockholm, Sweden

As is well known, omission of non-confounding covariates identified by the treatment assignment may lead to considerable bias for the estimated treatment effect, even in a simple randomized trial. In this article we identify confounding vs. dispersing covariates by the confounding influence, which characterizes the variance change and the bias risk of the estimated treatment effect due to constraints on the effects of these covariates. Consequently, a consistent constraint on the effects of identified confounding covariates reduces the variance of the estimated treatment effect, whereas an inconsistent constraint on the effects of identified dispersing covariates, such as omission of identified dispersing covariates, leads to little bias for the estimated treatment effect.

Keywords Confounding covariate; Confounding measure; Confounding influence; Dispersing covariate; Treatment assignment; Treatment effect.

Mathematics Subject Classification Primary 62H12; Secondary 62F10.

1. Introduction

Many statistical analyses infer the effect of treatments on an outcome in the presence of covariates. Although it is advocated that one should include in the model all covariates that have effects on the outcome, one sometimes uses a small model for reasons of efficiency and robustness (e.g., Lindsey, 1996) and because some covariates are not observed. In this case, one needs to evaluate not only the size of the covariate effects but also the influence of the covariates on the estimation of the treatment effect. Omission of some covariates leads to considerable bias for the estimated treatment effect. We say that such covariates have confounding character. Omission of some covariates changes variances but causes little bias for the estimated treatment effect. We say that such covariates have dispersing character.

Received November 23, 2010; Accepted December 12, 2011
Address correspondence to Li Yin, Department of Medical Epidemiology and Biostatistics, Karolinska Institute, Box 281, SE-171 77 Stockholm, Sweden; E-mail: [email protected]


One usually identifies confounding covariates based on the treatment assignment, i.e., the probability distribution of treatments given covariates (e.g., Rothman and Greenland, 1998). Covariates are identified as confounding covariates if they are associated with treatments and have effects on the outcome; see Rosenbaum and Rubin (1983), Rubin (2005), Greenland and Robins (1986), and Geng et al. (2002) for identifications of confounding covariates in the framework of causal inference. Covariates are identified as non-confounding if they are independent of treatments given the confounding covariates. But such identified confounding covariates may have dispersing character, whereas such identified non-confounding covariates may have confounding character.

Consider a simple logistic model for a dichotomous outcome with one dichotomous treatment variable and one dichotomous covariate but without treatment-covariate interaction. Under randomized treatment assignment, omission of the identified non-confounding covariate leads to considerable bias for the estimated treatment effect, which is measured by the odds ratio (Gail et al., 1984; Robinson and Jewell, 1991; Neuhaus and Jewell, 1993; Neuhaus, 1998); this phenomenon, sometimes called non-collapsibility of the odds ratio (Greenland et al., 1999), suggests that the identified non-confounding covariate has confounding character. On the other hand, under certain non-randomized treatment assignments, omission of the identified confounding covariate leads to no bias but a variance change for the estimated treatment effect (Robinson and Jewell, 1991; Neuhaus, 1998); this phenomenon suggests that the identified confounding covariate has dispersing character. Similar phenomena have been observed in the propensity score approach to causal inference (e.g., Senn et al., 2007; Austin et al., 2007) and in survival analysis estimating the treatment effect measured as a hazard ratio (Ford et al., 1995). See also Lee and Nelder (2004) for insight into these phenomena from the perspective of conditional vs. marginal models.

The purpose of this article is to identify confounding vs. dispersing covariates. The identification is based on the confounding influence, which characterizes the variance change and the bias risk of the estimated treatment effect due to constraints on covariate effects. Omission of covariates is a special constraint on covariate effects; equalities among covariate effects are another example of such a constraint. Using our identification, one can find out whether a constraint on covariate effects improves efficiency and robustness or leads to bias in the inference of the treatment effect.

In Sec. 2, we define confounding versus dispersing covariates by using the confounding influence. In Sec. 3, we study the confounding influence as a function of the treatment assignment based on the generalized linear model. In Sec. 4, we apply the results of Sec. 3 to the aforementioned example. In Sec. 5, we evaluate the confounding influence and partition covariates into confounding vs. dispersing covariates. A real example is used to illustrate statistical modeling based on analysis of the confounding influence. In Sec. 6, we conclude the article with final remarks.

2. Confounding Influence on Estimation of Treatment Effect due to Constraint on Covariate Effects

2.1. Setting of the Article

The study units are a simple random sample of N units from some population. The data comprise independent observations $(y_i, z_i, x_i)$, $i = 1, \ldots, N$, of an outcome variable Y, a treatment variable Z, and a set X of covariates on the sample.


The number of the observable covariates in X is finite. The model relating $y_i$ to $(z_i, x_i)$ is given by

g(\mu(z_i, x_i)) = h(z_i, x_i),   (1)

where $g(\cdot)$ is a monotonic smooth link function, $\mu(z_i, x_i)$ is the mean of Y given $(Z = z_i, X = x_i)$, and $h(\cdot)$ is the predictor function. All model parameters are constructed from the means $\mu(Z, X)$, which are estimable from the data. We ignore the variability of Z and X, and henceforth Z and X are running variables for the observed values of the corresponding random variables. Let $\Pr(A)$ denote the proportion of A in the sample and $\Pr(A \mid B)$ the proportion of substratum A in stratum B. Then a treatment assignment is expressed by the proportion $\Pr(Z \mid X)$.

The parametrization we use for the model (1) comprises three vectors $(\beta, \gamma, \lambda)$ of parameters. The first vector $\beta = (\beta_1, \ldots, \beta_a)'$ describes average effects of Z over the sample, which is of our interest. We call $\beta$ the treatment effect, which can be defined in various ways. One way is to consider the average of the mean $\mu(Z, X)$ over X of the sample and then define the treatment effect $\beta$ to be a vector of parameters describing the change of this average mean with Z. Another way is to consider a vector of effects of Z at X, describing the change of $\mu(Z, X)$ with Z given X, and then define the treatment effect $\beta$ to be the average of this vector over X of the sample. For a generalized linear model, effects of Z at X are easy to estimate and so is the treatment effect $\beta$, as shown in Sec. 3.1. The treatment effect is the causal effect of Z on Y if the treatment assignment is strongly ignorable given X (Rosenbaum and Rubin, 1983; Rubin, 2005).

Partition X into subsets $X_1$ and $X_2$. Then the second vector $\gamma = (\gamma_1, \ldots, \gamma_b)'$ describes the change of $\mu(Z, X)$ with the covariates in $X_1$ for any given Z and any given $X_2$. We call $\gamma$ the vector of effects of $X_1$. The third vector $\lambda$ contains the rest of the parameters. We impose constraints on $\gamma$ in an attempt to reduce the number of model parameters and simplify the model. In this article we focus on constraints on $\gamma$, i.e., on all the effects of $X_1$, but one can readily extend the results to constraints on part of the effects of $X_1$.

As an example, consider the model $g(\mu(z_i, x_i)) = \alpha + \theta z_i + \theta^{(1)} x_i + \theta^{(2)} z_i x_i$. The parametrization comprising $(\alpha, \theta, \theta^{(1)}, \theta^{(2)})$ is not of our interest, so we construct the parametrization comprising $(\beta, \gamma, \lambda)$. Because $g(\mu(Z \neq 0, X)) - g(\mu(Z = 0, X)) = Z(\theta + \theta^{(2)} X)$, the effect of Z at X is equal to $\theta + \theta^{(2)} X$. Averaging this effect over X of the sample, we get the treatment effect $\beta = \theta + \theta^{(2)} \bar{x}$, where $\bar{x} = (1/N) \sum_i x_i$. The vector of effects of X is $\gamma = (\theta^{(1)}, \theta^{(2)})'$. The remaining parameter is $\lambda = \alpha$.

In this article, we use a conditional distribution of Y given (Z, X) to infer $\beta$, $\gamma$, and $\lambda$. The likelihood function for $\beta$, $\gamma$, and $\lambda$ based on the model (1) is

L(\mu(Z, X); \{y_i\}, \{z_i\}, \{x_i\}).   (2)

Suppose regularity conditions hold that allow consistency and asymptotic normality of the maximum likelihood estimates $\hat\beta$, $\hat\gamma$, and $\hat\lambda$. In this article, we focus on asymptotic properties of these estimates.

2.2. Variance Change of Estimated Treatment Effect

We consider a constraint on all the effects of $X_1$: $\gamma = \gamma^*$, where $\gamma^*$ is a constant vector. Under the constraint $\gamma = \gamma^*$, the likelihood (2) yields the estimate $\hat\beta_{\gamma=\gamma^*}$ of $\beta$ and the variance $\mathrm{var}(\hat\beta_{\gamma=\gamma^*})$. Denote the true value of $\gamma$ by $\gamma_0$. Under the consistent constraint $\gamma = \gamma_0$, we have $\hat\beta_{\gamma=\gamma_0}$ and $\mathrm{var}(\hat\beta_{\gamma=\gamma_0})$. Then we decompose the variance change $\mathrm{var}(\hat\beta) - \mathrm{var}(\hat\beta_{\gamma=\gamma^*})$ by

\mathrm{var}(\hat\beta) - \mathrm{var}(\hat\beta_{\gamma=\gamma^*}) = [\mathrm{var}(\hat\beta) - \mathrm{var}(\hat\beta_{\gamma=\gamma_0})] + [\mathrm{var}(\hat\beta_{\gamma=\gamma_0}) - \mathrm{var}(\hat\beta_{\gamma=\gamma^*})].   (3)

The first term on the right side of the equality is called the primary variance change and the second term the secondary variance change; the latter is equal to the zero matrix under the consistent constraint. We define a relative primary variance change by $[\mathrm{var}(\hat\beta) - \mathrm{var}(\hat\beta_{\gamma=\gamma_0})]\,\mathrm{var}^{-1}(\hat\beta)$, which depends little on the sample size. Similarly, we define a relative variance change by $[\mathrm{var}(\hat\beta) - \mathrm{var}(\hat\beta_{\gamma=\gamma^*})]\,\mathrm{var}^{-1}(\hat\beta)$ and a relative secondary variance change by $[\mathrm{var}(\hat\beta_{\gamma=\gamma_0}) - \mathrm{var}(\hat\beta_{\gamma=\gamma^*})]\,\mathrm{var}^{-1}(\hat\beta_{\gamma=\gamma_0})$.

The expected information matrix I, partitioned according to $\lambda$, $\beta$, $\gamma$, is given by

I = \begin{pmatrix} I_{\lambda\lambda} & I_{\lambda\beta} & I_{\lambda\gamma} \\ I_{\beta\lambda} & I_{\beta\beta} & I_{\beta\gamma} \\ I_{\gamma\lambda} & I_{\gamma\beta} & I_{\gamma\gamma} \end{pmatrix}.   (4)

The inverse $I^{-1}$ is the variance of $\hat\beta$, $\hat\gamma$, and $\hat\lambda$ all together. In particular, the $(\beta, \beta)$ submatrix of $I^{-1}$ is the variance $\mathrm{var}(\hat\beta)$ of $\hat\beta$. Using (4), direct calculation leads to

\mathrm{var}^{-1}(\hat\beta) = I_{\beta\beta} - (I_{\beta\lambda} \; I_{\beta\gamma}) \begin{pmatrix} I_{\lambda\lambda} & I_{\lambda\gamma} \\ I_{\gamma\lambda} & I_{\gamma\gamma} \end{pmatrix}^{-1} \begin{pmatrix} I_{\lambda\beta} \\ I_{\gamma\beta} \end{pmatrix}.   (5)

Under the consistent constraint, the information matrix is obtained by removing those submatrices involving $\gamma$ from I given by (4). Using this information matrix, direct calculation leads to

\mathrm{var}^{-1}(\hat\beta_{\gamma=\gamma_0}) = I_{\beta\beta} - I_{\beta\lambda} I_{\lambda\lambda}^{-1} I_{\lambda\beta}.   (6)

In Appendix A.1 we derive the following efficiency formula for the primary variance change:

\mathrm{var}^{-1}(\hat\beta_{\gamma=\gamma_0}) - \mathrm{var}^{-1}(\hat\beta) = C (I_{\gamma\gamma} - I_{\gamma\lambda} I_{\lambda\lambda}^{-1} I_{\lambda\gamma})^{-1} C',   (7)

where $C = I_{\beta\gamma} - I_{\beta\lambda} I_{\lambda\lambda}^{-1} I_{\lambda\gamma}$. As is shown below, the efficiency formula (7) and the matrix factor C play key roles in the identification of confounding versus dispersing covariates. The matrix factor $I_{\gamma\gamma} - I_{\gamma\lambda} I_{\lambda\lambda}^{-1} I_{\lambda\gamma}$ in the formula (7) is positive definite, because

I_{\gamma\gamma} - I_{\gamma\lambda} I_{\lambda\lambda}^{-1} I_{\lambda\gamma} = \mathrm{var}^{-1}(\hat\gamma_{\beta=\beta_0}),   (8)

and $\mathrm{var}^{-1}(\hat\gamma_{\beta=\beta_0})$ is positive definite, where $\beta_0$ is the true value of $\beta$. The formula (8) is obtained by a method similar to that used for the formula (6).

Combining (3) and (7), we see the following facts. If $C \neq 0$, the primary variance change is not zero and the consistent constraint reduces the variance of $\hat\beta$. If $C = 0$, the primary variance change is zero. Furthermore, the secondary variance change is zero under the consistent constraint. Consequently, the consistent constraint does not reduce the variance of $\hat\beta$ in the case of $C = 0$.
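For readers who want to check these quantities numerically, the following sketch (Python with NumPy; the partitioned information matrix is an arbitrary illustrative assumption, not taken from any data in this article) computes the confounding measure C and verifies that the difference between the precisions (6) and (5) equals the right-hand side of the efficiency formula (7):

    import numpy as np

    # Hypothetical expected information matrix, partitioned in the order (lambda, beta, gamma),
    # with dim(lambda) = 1, dim(beta) = 1, dim(gamma) = 2; any symmetric positive-definite
    # matrix of this form would serve equally well.
    I = np.array([[10.0, 2.0, 1.0, 0.5],
                  [ 2.0, 8.0, 1.5, 0.8],
                  [ 1.0, 1.5, 6.0, 1.2],
                  [ 0.5, 0.8, 1.2, 5.0]])
    l, b, g = slice(0, 1), slice(1, 2), slice(2, 4)   # index ranges of lambda, beta, gamma

    I_ll, I_lb, I_lg = I[l, l], I[l, b], I[l, g]
    I_bl, I_bb, I_bg = I[b, l], I[b, b], I[b, g]
    I_gl, I_gb, I_gg = I[g, l], I[g, b], I[g, g]

    # Confounding measure C = I_bg - I_bl I_ll^{-1} I_lg
    C = I_bg - I_bl @ np.linalg.solve(I_ll, I_lg)

    # Precision of beta-hat under the consistent constraint, formula (6)
    prec_constrained = I_bb - I_bl @ np.linalg.solve(I_ll, I_lb)

    # Precision of beta-hat without the constraint, formula (5)
    I_nuis = np.block([[I_ll, I_lg], [I_gl, I_gg]])
    prec_full = I_bb - np.hstack([I_bl, I_bg]) @ np.linalg.solve(I_nuis, np.vstack([I_lb, I_gb]))

    # Efficiency formula (7): the difference of precisions equals C (I_gg - I_gl I_ll^{-1} I_lg)^{-1} C'
    M = I_gg - I_gl @ np.linalg.solve(I_ll, I_lg)
    print(prec_constrained - prec_full)   # left-hand side of (7)
    print(C @ np.linalg.solve(M, C.T))    # right-hand side of (7); the two agree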


The treatment assignment determines the primary variance change through C, despite perturbation from the positive-definite matrix factor $I_{\gamma\gamma} - I_{\gamma\lambda} I_{\lambda\lambda}^{-1} I_{\lambda\gamma}$, according to the efficiency formula (7). We explain in Sec. 4 and illustrate in Sec. 5 how to choose the treatment assignment such that C = 0.

2.3. Bias Risk of Estimated Treatment Effect

Under an inconsistent constraint $\gamma = \gamma^*$ ($\gamma^* \neq \gamma_0$), the estimate $\hat\beta_{\gamma=\gamma^*}$ of $\beta$ is biased. In Appendix A.2, we expand $\hat\beta_{\gamma=\gamma^*}$ in terms of $\hat\gamma - \gamma^*$ and get the bias risk

\hat\beta - \hat\beta_{\gamma=\gamma^*} = -(I_{\beta\beta} - I_{\beta\lambda} I_{\lambda\lambda}^{-1} I_{\lambda\beta})^{-1} C (\hat\gamma - \gamma^*) + \delta,   (9)

where the matrix factor $C = I_{\beta\gamma} - I_{\beta\lambda} I_{\lambda\lambda}^{-1} I_{\lambda\gamma}$ also appears in the efficiency formula (7) for the primary variance change. The first term is a linear term in $(\hat\gamma - \gamma^*)$ and is called the primary bias risk, while the symbol $\delta$ contains all the higher-order terms and is called the secondary bias risk.

If $C \neq 0$, then the inconsistent constraint may lead to a primary bias risk. If $C = 0$, then the primary bias risk is zero even under the inconsistent constraint. Similar to the primary variance change, the treatment assignment determines the primary bias risk through C, despite perturbation from the positive-definite matrix factor $I_{\beta\beta} - I_{\beta\lambda} I_{\lambda\lambda}^{-1} I_{\lambda\beta}$.

The treatment assignment influences the bias risk through the moments E(Y), E(XY), E(ZY), and E(XZ'Y), whereas it influences the primary bias risk through the information matrix. Therefore, the treatment assignment has a different influence on the bias risk than on the primary bias risk. Consequently, the treatment assignment has a different influence on the primary bias risk than on the secondary one.
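Continuing the numerical sketch of Sec. 2.2, the linear term of the bias risk formula (9) can be evaluated once a value is assumed for $\hat\gamma - \gamma^*$; the discrepancy vector below is purely illustrative.

    # Assumed discrepancy between the unconstrained estimate gamma-hat and the constrained
    # value gamma-star; purely illustrative, reusing C and prec_constrained from the sketch above.
    gamma_discrepancy = np.array([[0.3], [-0.2]])
    primary_bias_risk = -np.linalg.solve(prec_constrained, C @ gamma_discrepancy)
    print(primary_bias_risk)   # linear term of (9); the higher-order remainder delta is ignored here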

2.4. Confounding and Dispersing Covariates

Definition 2.1. Both the primary variance change $\mathrm{var}(\hat\beta) - \mathrm{var}(\hat\beta_{\gamma=\gamma_0})$ and the primary bias risk $-(I_{\beta\beta} - I_{\beta\lambda} I_{\lambda\lambda}^{-1} I_{\lambda\beta})^{-1} C (\hat\gamma - \gamma^*)$ are called the confounding influence. The matrix factor $C = I_{\beta\gamma} - I_{\beta\lambda} I_{\lambda\lambda}^{-1} I_{\lambda\gamma}$ is called the confounding measure.

This definition clarifies the fact that the confounding influence depends on the model given the treatment assignment $\Pr(Z \mid X)$.

Theorem 2.1. The necessary and sufficient condition for zero confounding influence is C = 0.

Proof. The proof follows from the observation that the matrix factor $I_{\gamma\gamma} - I_{\gamma\lambda} I_{\lambda\lambda}^{-1} I_{\lambda\gamma}$ in (7) is positive definite according to (8), and so is $I_{\beta\beta} - I_{\beta\lambda} I_{\lambda\lambda}^{-1} I_{\lambda\beta}$ in (9) according to (6).

Now we define confounding versus dispersing covariates by using the confounding measure C.

Definition 2.2. A set of covariates is called a confounding set if the corresponding C is not equal to the zero matrix. A set of covariates is called a dispersing set if the corresponding C is equal to the zero matrix.


If a confounding set contains one covariate, then this covariate is called a confounding covariate. If a dispersing set contains one covariate, then this covariate is called a dispersing covariate.

As described in Secs. 2.2 and 2.3, a confounding set so defined has the following characters: an inconsistent constraint on the effects of the confounding set may lead to a primary bias risk, while a consistent constraint on these effects reduces the variance of $\hat\beta$. A dispersing set so defined has the following characters: an inconsistent constraint on the effects of the dispersing set leads to no primary bias risk, while a consistent constraint on these effects does not reduce the variance of $\hat\beta$. These characters of confounding vs. dispersing sets imply the confounding vs. dispersing characters described in the Introduction.

3. Confounding Influence based on the Generalized Linear Model

3.1. Conditional vs. Marginal Models

Consider the generalized linear model

g(\mu(z_i, x_i)) = \alpha + p_i' \theta + q_i' \theta^{(1)} + u_i' \theta^{(2)},   (10)

where the parameters and the associated design vectors are both partitioned into four vectors. The parameter $\alpha$ is the baseline or intercept parameter when $z_i = 0$ (control treatment) and $x_i = 0$ (reference of the covariate). The vector $p_i = (p_{i1}, \ldots, p_{ia})'$ is the design vector of $z_i$ with component $p_{ij} = p_j(z_i)$, which can be a dummy variable for a level of a categorical Z or a power of $z_i$ for a continuous Z. Similarly, $q_i = (q_{i1}, \ldots, q_{ib^{(1)}})'$ is the design vector of $x_i$ with component $q_{ik} = q_k(x_i)$, which can be a dummy variable for a level of a categorical component of X, a power of a continuous component of X, or an interaction between components of X. The design vector of the $z_i$-$x_i$ interaction is $u_i = (u_{i1}, \ldots, u_{ib^{(2)}})'$ with component $u_{il} = u_l(z_i, x_i) = p_{j(l)}(z_i) q_{k(l)}(x_i)$, where the integer function $j(l)$ indicates the element $p_{j(l)}(z_i)$ from p and $k(l)$ the element $q_{k(l)}(x_i)$ from q. The parameter vector $\theta$ is associated with $p_i$, $\theta^{(1)}$ with $q_i$, and $\theta^{(2)}$ with $u_i$. Stacked together, $(\theta^{(1)\prime}, \theta^{(2)\prime})'$ has dimension $b = b^{(1)} + b^{(2)}$.

The parametrization comprising $(\alpha, \theta, \theta^{(1)}, \theta^{(2)})$ is not of our interest, so we construct the following parametrization comprising the vectors $(\beta, \gamma, \lambda)$ of parameters. From model (10), we have $g(\mu(Z \neq 0, X)) - g(\mu(Z = 0, X)) = \sum_{j=1}^{a} p_j(Z) (\theta_j + \sum_{l \in L_j} q_{k(l)}(X) \theta^{(2)}_l)$, where $L_j = \{l : j(l) = j\}$. Therefore, the effect of component $p_j(Z)$ at X is $\beta_j(X) = \theta_j + \sum_{l \in L_j} q_{k(l)}(X) \theta^{(2)}_l$. Averaging $\beta_j(X)$ over X of the sample and using $\bar{q}_{k(l)} = (1/N) \sum_i q_{k(l)}(x_i)$, we get the treatment effect of component $p_j(Z)$:

\beta_j = \theta_j + \sum_{l \in L_j} \bar{q}_{k(l)} \theta^{(2)}_l,   (11)

where $j = 1, \ldots, a$. We call $\beta = (\beta_1, \ldots, \beta_a)'$ the treatment effect. Note that $\beta$ is not equal to the $\theta_{\theta^{(2)}=0}$ obtained by setting $\theta^{(2)} = 0$ in the model (10).

The vector of effects of X is given by $\gamma = (\gamma^{(1)\prime}, \gamma^{(2)\prime})'$, and the vector of the rest of the parameters is given by $\lambda = \alpha$. Using $(\beta, \gamma, \lambda)$, we rewrite (10) as

g(\mu(z_i, x_i)) = \lambda + p_i' \beta + q_i' \gamma^{(1)} + v_i' \gamma^{(2)},   (12)

where $v_i = (v_{i1}, \ldots, v_{ib^{(2)}})'$ with component $v_{il} = v_l(z_i, x_i) = p_{j(l)}(z_i)(q_{k(l)}(x_i) - \bar{q}_{k(l)})$, and $\bar{q}_{k(l)} = (1/N) \sum_i q_{k(l)}(x_i)$. Both the models (12) and (10) are called conditional models, referring to their dependence on both $z_i$ and $x_i$.
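As a concrete check of the reparametrization (10)-(12) and of formula (11), the following sketch (Python with NumPy and statsmodels; the simulated data and parameter values are assumptions for illustration) fits a logistic model with an uncentered and with a centered treatment-covariate interaction and shows that the treatment effect computed via (11) coincides with the coefficient of the treatment in the centered fit.

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(0)
    n = 2000
    x = rng.binomial(1, 0.4, n)                    # one dichotomous covariate
    z = rng.binomial(1, 0.5, n)                    # one dichotomous treatment
    eta = -0.5 + 1.0 * z + 0.8 * x + 0.6 * z * x   # assumed linear predictor of model (10)
    y = rng.binomial(1, 1.0 / (1.0 + np.exp(-eta)))

    # Model (10): uncentered interaction; coefficients are theta (for z) and theta^(2) (for z*x)
    fit10 = sm.Logit(y, sm.add_constant(np.column_stack([z, x, z * x]))).fit(disp=0)
    theta, theta2 = fit10.params[1], fit10.params[3]
    beta_via_11 = theta + x.mean() * theta2        # formula (11): beta = theta + q_bar * theta^(2)

    # Model (12): centered interaction z * (x - x_bar); the coefficient of z is beta directly
    fit12 = sm.Logit(y, sm.add_constant(np.column_stack([z, x, z * (x - x.mean())]))).fit(disp=0)
    print(beta_via_11, fit12.params[1])            # the two agree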

A special constraint $\gamma = 0$, which omits X from the model (12), leads to

g(\mu(z_i)) = \lambda_{\gamma=0} + p_i' \beta_{\gamma=0},   (13)

where $\mu(z_i) = E(Y \mid Z = z_i)$. The model (13) is called a marginal model, referring to its dependence on the treatment $z_i$ only.

The information matrix based on the model (12) is

I = \sum_{i=1}^{N} e_i e_i' w(\mu(z_i, x_i)),

where $e_i = (1, p_i', q_i', v_i')'$ is the design vector of unit i and $w(\mu(z_i, x_i))$ is the weight function (McCullagh and Nelder, 1989; Lindsey, 1996). For notational convenience, we rewrite I as

I = N \sum_{Z, X} e e' w(\mu(Z, X)) \Pr(Z, X),   (14)

where $e = (1, p'(Z), q'(X), v'(Z, X))'$ and $\Pr(A)$ is the proportion of stratum A in the sample. In Appendix A.3 we describe how to calculate the confounding measure $C = I_{\beta\gamma} - I_{\beta\lambda} I_{\lambda\lambda}^{-1} I_{\lambda\gamma}$.

If $C \neq 0$, then X is confounding. Thus the marginal model (13) may lead to a biased estimate of $\beta$. Therefore, one uses the conditional model (12) or (10) and reduces the variance of $\hat\beta$ by consistent constraints on the effects of X. If $C = 0$, then X is dispersing. If X has a small secondary bias risk, one uses the marginal model (13) to estimate $\beta$. If X has a large secondary bias risk, one uses the conditional model (12) or (10) but cannot reduce the variance of $\hat\beta$ by consistent constraints on the effects of X. In the following sections, we find conditions for C = 0 under several treatment assignments.
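A minimal sketch of the stratum-level computation (14) for one dichotomous treatment and one dichotomous covariate under a logistic model without interaction is given below; the parameter values and the proportions Pr(Z, X) are assumptions for illustration, and the confounding measure is read off from the blocks of I.

    import numpy as np
    from itertools import product
    from scipy.special import expit

    # Assumed logistic model without interaction and assumed proportions Pr(Z, X)
    lam, beta, gamma = -0.5, 1.0, 0.8
    pr = dict(zip(product([0, 1], repeat=2), [0.3, 0.2, 0.2, 0.3]))   # keys (z, x), values Pr(Z, X)
    N = 1000

    I = np.zeros((3, 3))
    for (z, x), p in pr.items():
        e = np.array([1.0, z, x])                  # design vector (1, p(Z), q(X))
        mu = expit(lam + beta * z + gamma * x)
        I += np.outer(e, e) * mu * (1 - mu) * p    # logistic weight w(mu) = mu(1 - mu)
    I *= N                                         # formula (14)

    # Blocks ordered (lambda, beta, gamma); confounding measure C = I_bg - I_bl I_ll^{-1} I_lg
    C = I[1, 2] - I[1, 0] * I[0, 2] / I[0, 0]
    print(C)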

3.2. Randomized Treatment Assignment

In a randomized trial, an assignment mechanism allocates treatments randomly among study units, so that Z is independent of X and Z is strongly ignorable given the null covariate (Rosenbaum and Rubin, 1983; Rubin, 2005). For simplicity, suppose the proportions of Z and X strictly satisfy the relation $\Pr(Z \mid X) = \Pr(Z)$.

In Appendix A.3, we prove the following theorem, which gives several sufficient conditions for a zero confounding measure C = 0 under randomized treatment assignment.

Theorem 3.1.

(1) Suppose there is no Z-X interaction, i.e., $\gamma^{(2)} = 0$ in the model (12). Randomized treatment assignment leads to a zero confounding measure C = 0 under any of the following three conditions (a)–(c) on the weight function $w(\mu(Z, X))$: (a) $w(\mu(Z, X)) = w_0$ is a constant; (b) either $\gamma^{(1)} = 0$ or $\beta = 0$; (c) $w(\mu(Z, X)) = w_0 \exp(g(\mu(Z, X)))$.


(2) Suppose there exists a Z-X interaction, i.e., $\gamma^{(2)} \neq 0$. Randomized treatment assignment leads to a zero confounding measure C = 0 under either the condition (a) or the condition (d): $\gamma = (\gamma^{(1)}, \gamma^{(2)}) = 0$.

These conditions can be satisfied for certain outcome distributions and special link functions. For instance, the condition (a) is satisfied by classical linear regression. The condition (c) is satisfied by the Poisson distribution with the log link function. The condition (d) is approximately satisfied for small covariate effects. In all these cases, the covariates are dispersing.
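The following simulation sketch (simulated data; statsmodels GLM fits; all parameter values are assumptions) illustrates condition (c): for a Poisson outcome with the log link under randomized assignment and no Z-X interaction, the marginal fit that omits X estimates essentially the same treatment effect as the conditional fit, as expected for a dispersing covariate.

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(1)
    n = 50000
    x = rng.binomial(1, 0.5, n)
    z = rng.binomial(1, 0.5, n)                        # randomized assignment: Pr(Z | X) = Pr(Z)
    y = rng.poisson(np.exp(0.2 + 0.5 * z + 0.7 * x))   # Poisson outcome, log link, no Z-X interaction

    conditional = sm.GLM(y, sm.add_constant(np.column_stack([z, x])),
                         family=sm.families.Poisson()).fit()
    marginal = sm.GLM(y, sm.add_constant(z), family=sm.families.Poisson()).fit()
    print(conditional.params[1], marginal.params[1])   # both close to 0.5: X acts as a dispersing covariate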

3.3. Non-Randomized Treatment Assignment

A non-randomized treatment assignment implies $\Pr(Z \mid X) \neq \Pr(Z)$. We may have C = 0 under certain non-randomized treatment assignments. From the weight function $w(\mu(Z, X))$ we derive a weighted proportion $\Pr^w(Z, X)$ such that $\Pr^w(Z, X) = n\, w(\mu(Z, X)) \Pr(Z, X)$, where n is the normalizing factor. In Appendix A.4, we prove

Theorem 3.2.

(1) Suppose there is no Z-X interaction, i.e., $\gamma^{(2)} = 0$ in the model (12). The confounding measure C is the zero matrix if $\Pr^w(Z, X) = \Pr^w(Z) \Pr^w(X)$, i.e., Z and X are independent under the weighted proportion $\Pr^w(Z, X)$.

(2) Suppose there exists a Z-X interaction, i.e., $\gamma^{(2)} \neq 0$. The confounding measure C is the zero matrix if the independence condition in statement (1) holds and, further, if $\bar{q}_{k(l)} = \sum_X q_{k(l)}(X) \Pr^w(X)$ for all $l = 1, \ldots, b^{(2)}$.

The independence condition of statement (1) can be satisfied by an appropriate choice of the treatment assignment $\Pr(Z \mid X)$. That is, given the weight $w(\mu(Z, X))$, there may exist a $\Pr(Z \mid X)$ such that Z is independent of X under $\Pr^w(Z, X)$. The additional condition in statement (2) can be satisfied if $\Pr(X)$ is equal to $\Pr^w(X)$. In these cases, the covariates are dispersing.

3.4. Non-Biasing Treatment Assignment

Robinson and Jewell (1991) and Neuhaus (1998) illustrated that the marginal model (13) may lead not only to a consistent estimate $\hat\beta_{\gamma=0}$ but also to a considerable variance reduction in comparison with $\hat\beta$.

Consider the model parameter of (13), $\beta_{\gamma=0} = g(\mu(Z = 1)) - g(\mu(Z = 0))$, where $\mu(Z) = \sum_X \mu(Z, X) \Pr(X \mid Z)$. There may exist $\Pr(X \mid Z)$ such that $\beta_{\gamma=0} = \beta$, which in turn implies consistency of $\hat\beta_{\gamma=0}$. The consistency implies either (1) that both the primary and secondary biases are simultaneously zero, or (2) that both biases are non-zero but cancel each other. In case (2), we have $C \neq 0$, which implies that the covariates are confounding and the marginal model (13) yields a primary variance change given by the efficiency formula (7).


4. Confounding Influence in One Example

We analyze the confounding influence and identify confounding versus dispersing covariates in the example mentioned in the Introduction. The conditional model is

g(\mu(z_i, x_i)) = \lambda + z_i \beta + x_i \gamma,   (15)

while the marginal model is

g(\mu(z_i)) = \lambda_{\gamma=0} + z_i \beta_{\gamma=0},   (16)

where both the treatment variable Z and the covariate X are 0-1 dummy indicator variables. The design vector of unit i is $e_i = (1, z_i, x_i)'$, and thus the information matrix has the form

I = \sum_i e_i e_i' w(\mu(z_i, x_i)),   (17)

where $w(\mu(z_i, x_i))$ is the weight function of unit i. Let $w_{zx} = \sum_i w(\mu(z_i, x_i)) 1_{zx}(z_i, x_i)$, where $1_{zx}(z_i, x_i)$ is the indicator function taking the value one if $z_i = z$ and $x_i = x$ and zero otherwise. Let $w_{z\cdot} = \sum_i w(\mu(z_i, x_i)) 1_z(z_i)$, $w_{\cdot x} = \sum_i w(\mu(z_i, x_i)) 1_x(x_i)$, and $w_{\cdot\cdot} = \sum_i w(\mu(z_i, x_i))$. In the same way, let $N_{zx} = \sum_i 1_{zx}(z_i, x_i)$, $N_{z\cdot} = \sum_i 1_z(z_i)$, $N_{\cdot x} = \sum_i 1_x(x_i)$, and N be the total number of units. From (17), we get $I_{\lambda\lambda} = w_{\cdot\cdot}$, $I_{\lambda\beta} = I_{\beta\beta} = w_{1\cdot}$, $I_{\lambda\gamma} = I_{\gamma\gamma} = w_{\cdot 1}$, and $I_{\beta\gamma} = w_{11}$.

Using (17), direct calculation yields $C = w_{11} - w_{1\cdot} w_{\cdot 1} / w_{\cdot\cdot}$. From the bias formula (9), we get the bias risk for the marginal model (16):

\hat\beta - \hat\beta_{\gamma=0} = -\frac{w_{\cdot\cdot}}{w_{0\cdot} w_{1\cdot}} C (\hat\gamma - 0) + \delta,   (18)

where the symbol $\delta$ represents the secondary bias risk. From the efficiency formula (7), we have

\mathrm{var}^{-1}(\hat\beta_{\gamma=\gamma_0}) - \mathrm{var}^{-1}(\hat\beta) = \frac{w_{\cdot\cdot}}{w_{\cdot 0} w_{\cdot 1}} C^2,   (19)

from which one can evaluate the primary variance change $\mathrm{var}(\hat\beta) - \mathrm{var}(\hat\beta_{\gamma=\gamma_0})$.
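A small sketch of this cell-level computation is given below (Python; the logistic weight $w(\mu) = \mu(1-\mu)$, the cell counts, and the parameter values are illustrative assumptions). The counts happen to be balanced, as in case (I) below.

    import numpy as np
    from scipy.special import expit

    # Illustrative cell counts N[z, x] and parameters (lambda, beta, gamma) of model (15)
    N = np.array([[400.0, 400.0],
                  [400.0, 400.0]])                 # randomized: N11/N.1 = N10/N.0
    lam, beta, gamma = -0.5, 1.0, 0.8

    mu = expit(lam + beta * np.array([[0.0], [1.0]]) + gamma * np.array([[0.0, 1.0]]))  # mu[z, x]
    w = N * mu * (1.0 - mu)                        # cell totals w_zx of the logistic weights

    C = w[1, 1] - w[1, :].sum() * w[:, 1].sum() / w.sum()
    print(C)   # non-zero: under this randomized assignment X is confounding for the odds ratio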

(I) Randomized treatment assignment. Under randomized treatment assignment, where $N_{11}/N_{\cdot 1} = N_{10}/N_{\cdot 0}$, the confounding measure C can be non-zero, e.g., for the logistic regression of a dichotomous outcome. In this case, X is confounding, and the marginal model (16) leads to a primary bias risk given by (18) and a primary variance change given by (19).

(II) Non-randomized treatment assignment. A zero confounding measure C = 0 may arise from a non-randomized treatment assignment. The condition C = 0 is equivalent to the following relation between the proportions $N_{11}/N_{\cdot 1}$ and $N_{10}/N_{\cdot 0}$:

\left(1 - \frac{1}{N_{10}/N_{\cdot 0}}\right) = \frac{w(\mu(0, 1))\, w(\mu(1, 0))}{w(\mu(1, 1))\, w(\mu(0, 0))} \left(1 - \frac{1}{N_{11}/N_{\cdot 1}}\right).

In this case, X is dispersing, implying zero primary bias risk and zero primary variance change, as seen from (18) and (19).
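Continuing the sketch above, one choice of assignment proportions that makes C vanish, consistent with Theorem 3.2, is to take Pr(Z = z | X = x) inversely proportional to the per-unit weight w(mu(z, x)); the construction below is an illustrative assumption, not a prescription.

    # Continuing the sketch above: Pr(Z = z | X = x) proportional to 1 / w(mu(z, x)) makes
    # the weighted proportion factorize, and the confounding measure vanishes.
    w_unit = mu * (1.0 - mu)                       # per-unit logistic weights w(mu(z, x))
    Nx = np.array([800.0, 800.0])                  # covariate margins N.x, illustrative
    pz1 = (1.0 / w_unit[1, :]) / (1.0 / w_unit[1, :] + 1.0 / w_unit[0, :])
    N_disp = np.vstack([(1.0 - pz1) * Nx, pz1 * Nx])   # new cell counts N[z, x]

    w2 = N_disp * w_unit
    C_disp = w2[1, 1] - w2[1, :].sum() * w2[:, 1].sum() / w2.sum()
    print(C_disp)   # essentially zero: under this assignment X is dispersing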


(III) Non-biasing treatment assignment. A non-zero confounding measure $C \neq 0$ may arise from a non-biasing treatment assignment. With an appropriate choice of $N_{11}/N_{1\cdot}$ and $N_{01}/N_{0\cdot}$, we can get, for Z = 0, 1,

\mu(Z) = \mu(Z, 1)\, N_{Z1}/N_{Z\cdot} + \mu(Z, 0)\, N_{Z0}/N_{Z\cdot},

such that $\beta_{\gamma=0} = g(\mu(Z = 1)) - g(\mu(Z = 0)) = \beta$. Compared with case (II), we see that the treatment assignment leading to a zero confounding measure is not the same as a non-biasing treatment assignment. Consequently, a non-biasing treatment assignment may yield $C \neq 0$. In this case, the covariate X is confounding and the marginal model (16) yields a primary variance change given by (19).

5. Statistical Modeling based on Analysis of Confounding Influence

In practical statistical modeling, covariates are rarely all dispersing and the marginal model (13) is hardly ever valid. More often, covariates are partitioned into confounding versus dispersing subsets according to their confounding influences. Once the partition is accomplished, one adjusts for confounding and dispersing covariates, respectively, using available methods such as the quasi-likelihood method (McCullagh and Nelder, 1989; Lindsey, 1996), and reduces the variance of the estimated treatment effect by testing and imposing consistent constraints on the effects of the confounding covariates.

5.1. Evaluation of Confounding Influence

In practice, it is of more interest to evaluate the confounding influence, i.e., the primary variance change and the primary bias risk for the treatment effect, than the confounding measure. As seen in Sec. 2.2, we can measure the confounding influence by the relative primary variance change, i.e., $[\mathrm{var}(\hat\beta) - \mathrm{var}(\hat\beta_{\gamma=\gamma_0})]/\mathrm{var}(\hat\beta)$, where $\gamma = \gamma_0$ is the consistent constraint.

The estimate $\hat\beta$ and its variance $\mathrm{var}(\hat\beta)$ are readily obtained. But the true value $\gamma_0$ is unknown, so one cannot get $\hat\beta_{\gamma=\gamma_0}$ and its variance $\mathrm{var}(\hat\beta_{\gamma=\gamma_0})$. Therefore, we use $\hat\gamma$ to approximate $\gamma_0$ and the constraint $\gamma = \hat\gamma$ to approximate the consistent constraint $\gamma = \gamma_0$. Under the constraint $\gamma = \hat\gamma$, we calculate $\hat\beta_{\gamma=\hat\gamma}$ and $\mathrm{var}(\hat\beta_{\gamma=\hat\gamma})$ and measure the confounding influence by $[\mathrm{var}(\hat\beta) - \mathrm{var}(\hat\beta_{\gamma=\hat\gamma})]/\mathrm{var}(\hat\beta)$, which we still call the relative primary variance change for simplicity.

If a set of covariates has a relative primary variance change smaller than a threshold value, then this set is dispersing, and otherwise confounding. Furthermore, one determines whether the set is over- or under-dispersing by the sign of the secondary variance change, which is obtained by subtracting the primary variance change from the variance change. For a dispersing set, the primary bias risk is zero, so the secondary bias risk is equal to the bias risk.
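In practice, the constraint $\gamma = \hat\gamma$ described above can be imposed by refitting the model with the covariate's contribution supplied as a fixed offset; the sketch below (simulated data, statsmodels, illustrative parameter values) computes the relative primary variance change for a single covariate in this way.

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(2)
    n = 3000
    x = rng.binomial(1, 0.5, n)
    z = rng.binomial(1, 0.4 + 0.2 * x, n)          # a non-randomized assignment Pr(Z | X)
    y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(-0.3 + 0.6 * z + 0.9 * x))))

    full = sm.GLM(y, sm.add_constant(np.column_stack([z, x])),
                  family=sm.families.Binomial()).fit()
    var_full = full.cov_params()[1, 1]             # variance of the treatment coefficient

    gamma_hat = full.params[2]
    constrained = sm.GLM(y, sm.add_constant(z), family=sm.families.Binomial(),
                         offset=gamma_hat * x).fit()   # effect of X fixed at its estimate
    var_constrained = constrained.cov_params()[1, 1]

    print((var_full - var_constrained) / var_full) # relative primary variance change for X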

5.2. Partition of Covariates into Confounding Versus Dispersing Subsets

The following two theorems are useful to carry out the partition.

Theorem 5.1. A set $X_c = \{X_{cj}\}$ of confounding covariates is a confounding set.


Proof. Let $\gamma_{cj}$ denote the vector of effects of $X_{cj}$ and $\gamma_c$ that of $X_c$. Let $\gamma_{cj0}$ be the true value of $\gamma_{cj}$ and $\gamma_{c0}$ that of $\gamma_c$. The consistent constraint $\gamma_c = \gamma_{c0}$ contains the consistent constraint $\gamma_{cj} = \gamma_{cj0}$. According to Altham (1984), we have $\mathrm{var}(\hat\beta_{\gamma_c=\gamma_{c0}}) \le \mathrm{var}(\hat\beta_{\gamma_{cj}=\gamma_{cj0}})$. Furthermore, Definition 2.2 and the efficiency formula (7) imply $\mathrm{var}(\hat\beta_{\gamma_{cj}=\gamma_{cj0}}) < \mathrm{var}(\hat\beta)$ for any j such that $X_{cj}$ is a confounding covariate. The two inequalities imply $\mathrm{var}(\hat\beta_{\gamma_c=\gamma_{c0}}) < \mathrm{var}(\hat\beta)$. By the efficiency formula (7), we have $C \neq 0$ for $X_c$, thus proving that $X_c$ is a confounding set.

This theorem means that one may form a confounding set by finding individual confounding covariates separately. Also, notice that if confounding covariates are identified by the treatment assignment, then the set of these covariates may not necessarily be a confounding set identified by the treatment assignment.

To find whether a confounding covariate exists among a set of covariates, we have the following.

Theorem 5.2. Every covariate in a dispersing set $X_d = \{X_{dj}\}$ is a dispersing covariate.

Proof. Let $\gamma_{dj}$ denote the vector of effects of $X_{dj}$ and $\gamma_d$ that of $X_d$. Let $\gamma_{dj0}$ be the true value of $\gamma_{dj}$ and $\gamma_{d0}$ that of $\gamma_d$. The consistent constraint $\gamma_d = \gamma_{d0}$ contains the consistent constraint $\gamma_{dj} = \gamma_{dj0}$. According to Altham (1984), we have $\mathrm{var}(\hat\beta_{\gamma_d=\gamma_{d0}}) \le \mathrm{var}(\hat\beta_{\gamma_{dj}=\gamma_{dj0}}) \le \mathrm{var}(\hat\beta)$. Definition 2.2 and the efficiency formula (7) imply $\mathrm{var}(\hat\beta_{\gamma_d=\gamma_{d0}}) = \mathrm{var}(\hat\beta)$ if $X_d$ is a dispersing set. Therefore, $\mathrm{var}(\hat\beta_{\gamma_{dj}=\gamma_{dj0}}) = \mathrm{var}(\hat\beta)$. By the efficiency formula (7), we have C = 0 for $X_{dj}$ for any j, thus proving that $X_{dj}$ is a dispersing covariate.

One can further partition the dispersing set into over- and under-dispersing subsets, or into subsets according to the secondary bias risks.

5.3. A Real Example

Medical researchers examined the short-term survival of cardia cancer patients after their diagnoses at different hospitals in Sweden. The hospitals were categorized into large versus small types (Z = 1, 0) according to their volumes of cardia cancer patients. The outcome was dichotomous: Y = 1, 0 for survival or not at one year after the diagnosis. We also had measurements of the following covariates: gender ($X_1$ = 1, 0 for male or female), age ($X_2$ = 1, 0 for old or young), and area type of residence ($X_3$ = 1, 0 for rural or urban). The area type reflected the accessibility of hospitals in patients' residential areas. The data contained records of 158 cardia cancer patients diagnosed between 1988 and 1995.

The probability of one-year survival of each unit i is modeled by

\mathrm{logit}(\Pr(Y = 1 \mid z_i, x_i)) = \lambda + \beta z_i + \gamma_1 x_{i1} + \gamma_2 x_{i2} + \gamma_3 x_{i3},   (20)

where $\beta$ represents the hospital type effect, $\gamma_1$ the gender effect, $\gamma_2$ the age effect, and $\gamma_3$ the area type effect. Table 1 presents the ML estimates and their variances for the parameters of the model (20).

Table 2 presents the relative primary variance change for the hospital type effect (Column (2)) under the constraints $\gamma_1 = \hat\gamma_1$, $\gamma_2 = \hat\gamma_2$, and $\gamma_3 = \hat\gamma_3$, respectively, together with the relative variance change (Column (4)) and the bias risk (Column (5)) for the hospital type effect under the constraints $\gamma_1 = 0$, $\gamma_2 = 0$, and $\gamma_3 = 0$, respectively.


Table 1
ML estimates and variances for parameters of the model (20)

Parameter (covariate)      (ML estimate, variance)
$\lambda$ (baseline)       (1.511, 0.270)
$\beta$ (hospital type)    (−0.159, 0.132)
$\gamma_1$ (gender)        (−0.902, 0.190)
$\gamma_2$ (age)           (0.040, 0.155)
$\gamma_3$ (area type)     (−0.327, 0.129)

As seen in Column (2), age is essentially a dispersing covariate, implying a weak primary bias risk and variance change. Further, age has a small secondary variance change, as seen by comparing Columns (2) and (4), and a small bias risk, as seen in Column (5). Hence, there is no need to keep age in the model. Area type is a rather strong confounding covariate, as seen in Column (2). Gender is on the borderline between the dispersing and confounding covariates, as seen in Column (2). Upon omission from the model (20), gender has, as seen in Column (5), a large bias risk, in which one cannot distinguish the primary from the secondary bias risk.

Since area type is a confounder, we keep it in the model (20). We also keep gender in the model (20) due to its large bias risk. After omitting age, i.e., imposing $\gamma_2 = 0$, the model (20) becomes

\mathrm{logit}(\Pr(Y = 1 \mid z_i, x_i)) = \lambda_{\gamma_2=0} + \beta_{\gamma_2=0} z_i + \gamma_{1,\gamma_2=0} x_{i1} + \gamma_{3,\gamma_2=0} x_{i3},   (21)

which yields the ML estimate and its variance for the hospital type effect, (−0.155, 0.130), as read from Column (3) of Table 2. Because area type is a confounder, a consistent constraint on its effect reduces the variance of the estimated hospital type effect. One possible constraint is to set the area type effect equal to the area type effect for stomach cancer, which is similar to cardia cancer, so that the data for stomach cancer can be used for cardia cancer (Yin et al., 2006).

Table 2
Confounding influences of the covariates (gender, age, and area type) on the estimated hospital type effect, based on the model (20). Column (1): (ML estimate, variance) of the hospital type effect under the constraints $\gamma_1 = \hat\gamma_1$, $\gamma_2 = \hat\gamma_2$, and $\gamma_3 = \hat\gamma_3$, respectively. Column (2): relative primary variance change for the hospital type effect in Column (1). Column (3): (ML estimate, variance) of the hospital type effect under the constraints $\gamma_1 = 0$, $\gamma_2 = 0$, and $\gamma_3 = 0$, respectively. Column (4): relative variance change in Column (3). Column (5): bias risk in Column (3).

                           Confounding influence
Parameter (covariate)      (1)               (2)     (3)               (4)      (5)
$\gamma_1$ (gender)        (−0.159, 0.128)   3.0%    (−0.019, 0.123)   6.8%     −0.140
$\gamma_2$ (age)           (−0.159, 0.130)   1.5%    (−0.155, 0.130)   1.5%     −0.004
$\gamma_3$ (area type)     (−0.159, 0.119)   9.8%    (−0.055, 0.118)   10.6%    −0.104


In contrast, no constraint needs to be imposed on the gender effect, because gender is a possibly dispersing covariate.

6. Discussion

To conclude this article, we discuss the roles of the treatment assignment and the confounding measure in the confounding influence.

In this article, we showed that the treatment assignment determines the confounding influence through the confounding measure. Thus one cannot directly identify confounding versus dispersing covariates by the treatment assignment. Misidentification due to the treatment assignment is often observed for the generalized linear model with a non-linear link function. Omission of covariates identified as non-confounding by the treatment assignment mis-specifies the outcome model, due to Jensen's inequality, thus biasing the estimated treatment effect.

It is the confounding measure that plays the direct role in the confounding influence. By evaluating the confounding measure or the confounding influence, one identifies confounding vs. dispersing covariates. By adjusting for the identified confounding and dispersing covariates, respectively, one estimates the treatment effect unbiasedly and improves the estimation by imposing consistent constraints on the effects of the confounding covariates.

Appendix

A.1 Derivation of the Efficiency Formula (7)

Consider the $(\lambda, \gamma)$ submatrix $I_{(\lambda,\gamma)}$ of the information matrix (4) and its inverse $I^{-1}_{(\lambda,\gamma)}$:

I_{(\lambda,\gamma)} = \begin{pmatrix} I_{\lambda\lambda} & I_{\lambda\gamma} \\ I_{\gamma\lambda} & I_{\gamma\gamma} \end{pmatrix}, \qquad I^{-1}_{(\lambda,\gamma)} = \begin{pmatrix} I^{\lambda\lambda} & I^{\lambda\gamma} \\ I^{\gamma\lambda} & I^{\gamma\gamma} \end{pmatrix}.

By using $I_{(\lambda,\gamma)} I^{-1}_{(\lambda,\gamma)} = 1$, we obtain

I^{\lambda\lambda} = (I_{\lambda\lambda} - I_{\lambda\gamma} I_{\gamma\gamma}^{-1} I_{\gamma\lambda})^{-1},   (22)

I^{\gamma\gamma} = (I_{\gamma\gamma} - I_{\gamma\lambda} I_{\lambda\lambda}^{-1} I_{\lambda\gamma})^{-1},   (23)

I^{\lambda\gamma} = -I^{\lambda\lambda} I_{\lambda\gamma} I_{\gamma\gamma}^{-1} = -I_{\lambda\lambda}^{-1} I_{\lambda\gamma} I^{\gamma\gamma}.   (24)

Applying the matrix inverse formula

(A + BCB^T)^{-1} = A^{-1} - A^{-1} B (C^{-1} + B^T A^{-1} B)^{-1} B^T A^{-1}

to Eq. (22), we obtain

I^{\lambda\lambda} = I_{\lambda\lambda}^{-1} + I_{\lambda\lambda}^{-1} I_{\lambda\gamma} I^{\gamma\gamma} I_{\gamma\lambda} I_{\lambda\lambda}^{-1}.   (25)

We rewrite the formula (5) as

\mathrm{var}^{-1}(\hat\beta) = I_{\beta\beta} - I_{\beta\lambda} I^{\lambda\lambda} I_{\lambda\beta} - I_{\beta\lambda} I^{\lambda\gamma} I_{\gamma\beta} - I_{\beta\gamma} I^{\gamma\lambda} I_{\lambda\beta} - I_{\beta\gamma} I^{\gamma\gamma} I_{\gamma\beta}.   (26)


Applying (25) to the second term on the right-hand side of (26) and using $\mathrm{var}^{-1}(\hat\beta_{\gamma=\gamma_0}) = I_{\beta\beta} - I_{\beta\lambda} I_{\lambda\lambda}^{-1} I_{\lambda\beta}$ (see the formula (6)), we obtain

\mathrm{var}^{-1}(\hat\beta) = \mathrm{var}^{-1}(\hat\beta_{\gamma=\gamma_0}) - I_{\beta\lambda} I_{\lambda\lambda}^{-1} I_{\lambda\gamma} I^{\gamma\gamma} I_{\gamma\lambda} I_{\lambda\lambda}^{-1} I_{\lambda\beta} - I_{\beta\lambda} I^{\lambda\gamma} I_{\gamma\beta} - I_{\beta\gamma} I^{\gamma\lambda} I_{\lambda\beta} - I_{\beta\gamma} I^{\gamma\gamma} I_{\gamma\beta}.   (27)

Applying (24) to the second and fourth terms on the right-hand side of (27), we obtain

\mathrm{var}^{-1}(\hat\beta) = \mathrm{var}^{-1}(\hat\beta_{\gamma=\gamma_0}) + I_{\beta\lambda} I^{\lambda\gamma} I_{\gamma\lambda} I_{\lambda\lambda}^{-1} I_{\lambda\beta} - I_{\beta\lambda} I^{\lambda\gamma} I_{\gamma\beta} + I_{\beta\gamma} I^{\gamma\gamma} I_{\gamma\lambda} I_{\lambda\lambda}^{-1} I_{\lambda\beta} - I_{\beta\gamma} I^{\gamma\gamma} I_{\gamma\beta}.   (28)

By rearranging (28) and using $C = I_{\beta\gamma} - I_{\beta\lambda} I_{\lambda\lambda}^{-1} I_{\lambda\gamma}$, we obtain

\mathrm{var}^{-1}(\hat\beta) = \mathrm{var}^{-1}(\hat\beta_{\gamma=\gamma_0}) - I_{\beta\lambda} I^{\lambda\gamma} C' - I_{\beta\gamma} I^{\gamma\gamma} C',   (29)

or equivalently

\mathrm{var}^{-1}(\hat\beta) = \mathrm{var}^{-1}(\hat\beta_{\gamma=\gamma_0}) - (I_{\beta\lambda} I^{\lambda\gamma} + I_{\beta\gamma} I^{\gamma\gamma}) C'.   (30)

Application of (24) to the submatrix $I^{\lambda\gamma}$ in (30) yields

\mathrm{var}^{-1}(\hat\beta) = \mathrm{var}^{-1}(\hat\beta_{\gamma=\gamma_0}) - (-I_{\beta\lambda} I_{\lambda\lambda}^{-1} I_{\lambda\gamma} I^{\gamma\gamma} + I_{\beta\gamma} I^{\gamma\gamma}) C'.   (31)

Application of (23) to (31) leads to (7).

A.2 Derivation of the Bias Risk Formula (9)

From the likelihood (2) we obtain the likelihood equations

U_\lambda(\hat\lambda, \hat\beta, \hat\gamma) = 0,
U_\beta(\hat\lambda, \hat\beta, \hat\gamma) = 0,
U_\gamma(\hat\lambda, \hat\beta, \hat\gamma) = 0,   (32)

where the score function $U_\lambda$ is the first derivative of the log-likelihood $l(\mu(Z, X); \{y_i\}, \{x_i\}, \{z_i\})$ with respect to $\lambda$, and similarly for the other score functions. By solving these equations we obtain the estimates $\hat\lambda$, $\hat\beta$, and $\hat\gamma$.

Under the constraint $\gamma = \gamma^*$ we obtain the constrained likelihood equations

U_\lambda(\hat\lambda_{\gamma=\gamma^*}, \hat\beta_{\gamma=\gamma^*}, \gamma^*) = 0,
U_\beta(\hat\lambda_{\gamma=\gamma^*}, \hat\beta_{\gamma=\gamma^*}, \gamma^*) = 0.   (33)

By solving these equations we obtain the estimates $\hat\lambda_{\gamma=\gamma^*}$ and $\hat\beta_{\gamma=\gamma^*}$. By Taylor-expanding the first two score functions of (32) about $\hat\lambda_{\gamma=\gamma^*}$, $\hat\beta_{\gamma=\gamma^*}$, and $\gamma^*$ and keeping only the linear terms, we obtain

0 = U_\lambda(\hat\lambda, \hat\beta, \hat\gamma) \approx U_\lambda(\hat\lambda_{\gamma=\gamma^*}, \hat\beta_{\gamma=\gamma^*}, \gamma^*) - I_{\lambda\lambda}(\hat\lambda - \hat\lambda_{\gamma=\gamma^*}) - I_{\lambda\beta}(\hat\beta - \hat\beta_{\gamma=\gamma^*}) - I_{\lambda\gamma}(\hat\gamma - \gamma^*),   (34)

0 = U_\beta(\hat\lambda, \hat\beta, \hat\gamma) \approx U_\beta(\hat\lambda_{\gamma=\gamma^*}, \hat\beta_{\gamma=\gamma^*}, \gamma^*) - I_{\beta\lambda}(\hat\lambda - \hat\lambda_{\gamma=\gamma^*}) - I_{\beta\beta}(\hat\beta - \hat\beta_{\gamma=\gamma^*}) - I_{\beta\gamma}(\hat\gamma - \gamma^*).   (35)


Using (34) and (33), we express $(\hat\lambda - \hat\lambda_{\gamma=\gamma^*})$ in terms of $(\hat\beta - \hat\beta_{\gamma=\gamma^*})$ and $(\hat\gamma - \gamma^*)$. Substituting this expression for $(\hat\lambda - \hat\lambda_{\gamma=\gamma^*})$ into (35) and using (33), we obtain the primary bias risk, i.e., $(\hat\beta - \hat\beta_{\gamma=\gamma^*})$ expressed in terms of $(\hat\gamma - \gamma^*)$. Completed with the secondary bias risk, we obtain the formula (9).

A.3 Proof of Theorem 3.1

In statement (1) we have $\gamma^{(2)} = 0$ in the model (12) and therefore only consider $C_{\beta_j \gamma^{(1)}_k} = I_{\beta_j \gamma^{(1)}_k} - I_{\beta_j \lambda} I_{\lambda\lambda}^{-1} I_{\lambda \gamma^{(1)}_k}$, where $\beta_j$ is an element of $\beta$ and $\gamma^{(1)}_k$ of $\gamma^{(1)}$. From (14), we get $I_{\beta_j \gamma^{(1)}_k} = N \sum_{Z,X} p_j(Z) q_k(X) w(\mu(Z,X)) \Pr(Z,X)$, $I_{\lambda\lambda} = N \sum_{Z,X} w(\mu(Z,X)) \Pr(Z,X)$, $I_{\lambda \gamma^{(1)}_k} = N \sum_{Z,X} q_k(X) w(\mu(Z,X)) \Pr(Z,X)$, and $I_{\beta_j \lambda} = N \sum_{Z,X} p_j(Z) w(\mu(Z,X)) \Pr(Z,X)$. Let $\Pr^*(Z,X) = w(\mu(Z,X)) \Pr(Z,X)$ for notational simplicity. Then,

C_{\beta_j \gamma^{(1)}_k} = N \left( \sum_{Z,X} p_j(Z) q_k(X) \Pr^*(Z,X) - \sum_{Z,X} p_j(Z) \Pr^*(Z,X) \left[ \sum_{Z,X} \Pr^*(Z,X) \right]^{-1} \sum_{Z,X} q_k(X) \Pr^*(Z,X) \right).   (36)

The randomized treatment assignment implies independence between X and Z, i.e., $\Pr(Z,X) = \Pr(Z)\Pr(X)$. Each of the conditions (a)–(c) in Theorem 3.1 implies $w(\mu(Z,X)) = w(Z) w(X)$. Inserting these two relations into (36), we have $C_{\beta_j \gamma^{(1)}_k} = 0$ for all (j, k), thus proving statement (1).

To prove statement (2), we first consider $C_{\beta_j \gamma^{(1)}_k}$. In the proof of statement (1), we saw that condition (a) leads to $C_{\beta_j \gamma^{(1)}_k} = 0$ for all (j, k). Condition (d) implies condition (b), which also leads to $C_{\beta_j \gamma^{(1)}_k} = 0$ for all (j, k). Now we consider $C_{\beta_j \gamma^{(2)}_l} = I_{\beta_j \gamma^{(2)}_l} - I_{\beta_j \lambda} I_{\lambda\lambda}^{-1} I_{\lambda \gamma^{(2)}_l}$, where $\beta_j$ is an element of $\beta$ and $\gamma^{(2)}_l$ of $\gamma^{(2)}$. From (14), we get $I_{\beta_j \gamma^{(2)}_l} = N \sum_{Z,X} p_j(Z) v_l(Z,X) w(\mu(Z,X)) \Pr(Z,X)$ and $I_{\lambda \gamma^{(2)}_l} = N \sum_{Z,X} v_l(Z,X) w(\mu(Z,X)) \Pr(Z,X)$, where $v_l(Z,X) = p_{j(l)}(Z)(q_{k(l)}(X) - \bar{q}_{k(l)})$ and $j(l)$ and $k(l)$ are the integer functions indicating elements from p and q, respectively. In analogy with (36), we have

C_{\beta_j \gamma^{(2)}_l} = N \left( \sum_{Z,X} p_j(Z) v_l(Z,X) \Pr^*(Z,X) - \sum_{Z,X} p_j(Z) \Pr^*(Z,X) \left[ \sum_{Z,X} \Pr^*(Z,X) \right]^{-1} \sum_{Z,X} v_l(Z,X) \Pr^*(Z,X) \right).   (37)

Either of the conditions (a) and (d) in Theorem 3.1 implies $w(\mu(Z,X)) = w(Z)$. The randomized treatment assignment implies independence between Z and X, i.e., $\Pr(Z,X) = \Pr(Z)\Pr(X)$. Then $\Pr^*(Z,X) = w(Z) \Pr(Z) \Pr(X)$. Inserting this relation into (37), we find that each of the two terms on the right-hand side of (37) is zero. Therefore we have $C_{\beta_j \gamma^{(2)}_l} = 0$ for all (j, l), thus proving statement (2).


A.4 Proof of Theorem 3.2

In statement (1) we have $\gamma^{(2)} = 0$ in the model (12) and therefore only consider $C_{\beta_j \gamma^{(1)}_k}$, where $\beta_j$ is an element of $\beta$ and $\gamma^{(1)}_k$ of $\gamma^{(1)}$. Denote the weighted proportion of (Z, X) by $\Pr^w(Z, X) = n \Pr^*(Z, X) = n\, w(\mu(Z, X)) \Pr(Z, X)$, where the normalizing factor is $n = N/I_{\lambda\lambda}$. Replacing $\Pr^*(Z, X)$ by $\Pr^w(Z, X)$, we transform (36) into

C_{\beta_j \gamma^{(1)}_k} = I_{\lambda\lambda} \left[ \sum_{Z,X} p_j(Z) q_k(X) \Pr^w(Z, X) - \sum_{Z,X} p_j(Z) \Pr^w(Z, X) \sum_{Z,X} q_k(X) \Pr^w(Z, X) \right].   (38)

If $\Pr^w(Z, X) = \Pr^w(Z) \Pr^w(X)$, i.e., Z and X are independent under the weighted distribution, then we have $C_{\beta_j \gamma^{(1)}_k} = 0$ for all (j, k), thus proving statement (1).

To prove statement (2), we first consider $C_{\beta_j \gamma^{(1)}_k}$. In the proof of statement (1), we saw that the independence condition leads to $C_{\beta_j \gamma^{(1)}_k} = 0$ for all (j, k). We then consider $C_{\beta_j \gamma^{(2)}_l}$, where $\gamma^{(2)}_l$ is an element of $\gamma^{(2)}$. Replacing $\Pr^*(Z, X)$ by $\Pr^w(Z, X)$, we transform (37) into

C_{\beta_j \gamma^{(2)}_l} = I_{\lambda\lambda} \left[ \sum_{Z,X} p_j(Z) v_l(Z, X) \Pr^w(Z, X) - \sum_{Z,X} p_j(Z) \Pr^w(Z, X) \sum_{Z,X} v_l(Z, X) \Pr^w(Z, X) \right],   (39)

where $v_l(Z, X) = p_{j(l)}(Z)(q_{k(l)}(X) - \bar{q}_{k(l)})$. If $\Pr^w(Z, X) = \Pr^w(Z) \Pr^w(X)$, then we rewrite (39) as

C_{\beta_j \gamma^{(2)}_l} = I_{\lambda\lambda} \left[ \sum_{Z} p_j(Z) p_{j(l)}(Z) \Pr^w(Z) - \sum_{Z} p_j(Z) \Pr^w(Z) \sum_{Z} p_{j(l)}(Z) \Pr^w(Z) \right] \sum_{X} (q_{k(l)}(X) - \bar{q}_{k(l)}) \Pr^w(X).   (40)

Therefore, if $\bar{q}_{k(l)} = \sum_X q_{k(l)}(X) \Pr^w(X)$ for all l, we have $C_{\beta_j \gamma^{(2)}_l} = 0$ for all (j, l), thus proving statement (2).

References

Austin, P. C., Grootendorst, P., Normand, S. L. T., Anderson, G. M. (2007). Conditioning on the propensity score can result in biased estimation of common measures of treatment effect: A Monte Carlo study. Statist. Med. 26:754–768.
Altham, P. (1984). Improving the precision of estimation by fitting a model. J. Roy. Statist. Soc. B 46:118–119.
Ford, I., Norrie, J., Ahmadi, S. (1995). Model inconsistency, illustrated by the Cox proportional hazards model. Statist. Med. 14:735–746.


Geng, Z., Guo, J., Fung, W. (2002). Criteria for confounders in epidemiological studies. J. Roy. Statist. Soc. B 64:3–15.
Gail, M. H., Wieand, S., Piantadosi, S. (1984). Biased estimates of treatment effect in randomized experiments with nonlinear regressions and omitted covariates. Biometrika 71:431–444.
Greenland, S., Robins, J. M. (1986). Identifiability, exchangeability, epidemiological confounding. Int. J. Epidemiol. 15:413–419.
Greenland, S., Robins, J. M., Pearl, J. (1999). Confounding and collapsibility in causal inference. Statist. Sci. 14:29–46.
Lee, Y., Nelder, J. A. (2004). Conditional and marginal models: Another view. Statist. Sci. 19:219–228.
Lindsey, J. K. (1996). Parametric Statistical Inference. Oxford: Clarendon Press.
McCullagh, P., Nelder, J. A. (1989). Generalized Linear Models. London: Chapman & Hall.
Neuhaus, J. M. (1998). Estimation efficiency with omitted covariates in generalized linear models. J. Amer. Statist. Assoc. 93:1124–1129.
Neuhaus, J. M., Jewell, N. P. (1993). A geometric approach to assess bias due to omitted covariates in generalized linear models. Biometrika 80:807–816.
Robinson, L. D., Jewell, N. P. (1991). Some surprising results about covariate adjustment in logistic regression models. Int. Statist. Rev. 58:227–240.
Rosenbaum, P. R., Rubin, D. B. (1983). The central role of the propensity score in observational studies for causal effects. Biometrika 70:41–55.
Rothman, K. L., Greenland, S. (1998). Modern Epidemiology. Philadelphia: Lippincott Williams & Wilkins.
Rubin, D. B. (2005). Causal inference using potential outcomes: design, modeling, decisions. J. Amer. Statist. Assoc. 100:322–331.
Senn, S., Graf, E., Caputo, A. (2007). Stratification for the propensity score compared with linear regression techniques to assess the effect of treatment or exposure. Statist. Med. 26:5529–5544.
Yin, L., Sundberg, R., Wang, X., Rubin, D. B. (2006). Control of confounding through secondary samples. Statist. Med. 25:3814–3825.
