7/30/2019 articolo Dreassi ridotto
1/15
DOI: 10.1007/s10260-003-0078-7
Statistical Methods & Applications (2004) 13: 87-101
© Springer-Verlag 2004
A multilevel Bayesian model for contextual effect
of material deprivation
Annibale Biggeri, Emanuela Dreassi, Marco Marchi
Dipartimento di Statistica "G. Parenti", Università di Firenze, Viale Morgagni 59, 50134 Firenze, Italy
(e-mail: {abiggeri,dreassi,marchi}@ds.unifi.it)
Received: January 10, 2002 / Revised version: June 23, 2003
Abstract. The relationship between socioeconomic factors and health has been
studied in many circumstances. Whether the association takes place at individual
level only, or also at population level (contextual effect) is still unclear. We present
a multilevel hierarchical Bayesian model to investigate the joint contribution of
individual and population-based socioeconomic factors to mortality, using data from
the census cohort of the general population of the city of Florence, Italy (Tuscany
Longitudinal Study, 1991-1995). Evidence supporting a contextual effect of
deprivation on mortality at a very fine level of aggregation is found. Inappropriate
modelling of individual and aggregate variables could strongly bias effect estimates.
Key words: Hierarchical Bayesian model, multilevel model, material deprivation
index, contextual effects, ecological fallacy
1. Introduction: Material deprivation and contextual effect
Material deprivation indicators usually refer to the occurrence of subject states such
as unemployment, low education, living in a very small dwelling, overcrowding,
not having a car (e.g. see Townsend et al., 1988, Jarman, 1983 and Morris and
Carstairs, 1991). So far, they have been used as aggregate-level covariates to adjust
ecological regression coefficients in small area studies (St Leger, 1995). In fact, a
strong association between area-based deprivation and mortality, on one side, and between
area-based deprivation and exposure to environmental/individual hazards, on the other
side (Eachus et al. 1996, Davey Smith et al. 1998, Pell et al. 2000) was repeatedly
found. Many authors stressed the correlation of material deprivation with the prevalence
of known risk factors, like cigarette smoking, in agreement with the hypothesis that
The research on Tuscany Longitudinal Study (Studio Longitudinale Toscano, SLTo) was supported
by the Regione Toscana Servizio Statistica.
material deprivation is responsible for indirect effects only, or simply acts as a
surrogate variable. For example, Sundquist et al. (1999) conducted an individual
level study showing that the prevalence of material deprivation is explained by
individual-level risk factors for cardiovascular disease (obesity, hypertension).
In epidemiology the interpretation of the effect of any aggregate-level variable
is, however, controversial. Diez-Roux (1998) discussed the atomistic versus the ecological
fallacy, and proposed including aggregate variables at different levels of
aggregation in individual studies. Only a few papers considered individual and
aggregate data as hierarchically structured: Anderson et al. (1997), for example, used
the same variable (personal income) at the individual and aggregate level (the cen-
sus tract median). The idea was to study whether the aggregate variable could still
be predictive of the response after having considered the individual level variable
(Firebaugh, 1978).
In the present paper, the analysis of contextual variables such as material deprivation
indicators is reframed using Cronbach's model (Cronbach and Webb,
1975) and a Bayesian multilevel approach. Our aims are: 1) to show that different
ways of modelling contextual variables assume different prior beliefs about the
existence and nature of the effects, and 2) to show that a simple individual-level
analysis could produce more biased results than a simple aggregate-level analysis.
This is done using a material deprivation index and data derived from the Tuscany
Longitudinal Cohort Study (see Biggeri et al. 2001).
In Sect. 2, as a motivating example, we present the data and a descriptive analysis
of mortality by city ward and deprivation level. Section 3 introduces the different
statistical models used: individual, aggregate and multilevel models. More details
are given to describe Bayesian modelling and the difference between the contextual and
Cronbach approaches. The results and the main conclusions are shown in
Sects. 4 and 5, respectively.
2. Materials: the Florence Census Cohort Study
The data come from a census-based cohort study. All residents in Florence (Tuscany,
Italy) on census day, October 31, 1991, were enrolled and their mortality
was followed up by automated record-linkage procedures until December 31, 1995.
The death certificates were collected by the Tuscany Mortality
Register. Observed and expected deaths (all causes, males, age groups greater than
14 years), using internal standardization, were calculated by census tract and
sub-urban area (city ward). A total of 163613 people were enrolled, 639662.5
person-years at risk were observed in the follow-up period and 8612 deaths
occurred. The crude death rate was 13.46 per thousand, reflecting the high
percentage of old people in the population considered. The city is composed of 14
city wards (Fig. 1) and 2752 census tracts (Fig. 2). Table 1 shows, for each city
ward, observed deaths for all causes, the corresponding Standardized Mortality
Ratios (SMR), the Bayesian relative risks (RR) and the 95% credibility intervals
evaluated on the simulated posterior distributions (respectively as the mean and
the 2.5% and 97.5% of the sampled values) estimated using the spatial Bayesian
Fig. 1. The wards of the City of Florence (Tuscany, Italy)
model of Besag et al. (1991). There is a strong gradient in mortality among city
wards. Two wards (Mantignano and Ponte di Mezzo) appeared at higher risk (about
14% excess) and one (Poggetto) significantly lower (about 10% deficit). Figure 3
shows the map of relative risks in the city: the western area appears more affected.
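As a quick check of the figures quoted above, the crude death rate follows directly from the reported number of deaths and person-years at risk. The snippet is only a minimal sketch; the variable names are ours and do not come from the study.

```python
# Figures reported in the text for the Florence cohort (males, 1991-1995).
deaths = 8612
person_years = 639662.5

# Crude death rate expressed per 1000 person-years at risk.
crude_rate_per_1000 = 1000 * deaths / person_years

print(round(crude_rate_per_1000, 2))  # 13.46
```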
Material deprivation has been defined for each individual as the frequency of the
following unfavourable events: unemployment, low education (less than 6 years
of schooling), poor housing condition (less than 25 square metres), and absence
of bathroom in the flat. In Table 2 crude death rates by material deprivation are
reported. Material deprivation strongly affects mortality, with a clear trend from a
crude rate of 13.12 per thousand among non-deprived people up to 20.66 per
thousand among the most deprived (2 or more unfavourable events). The prevalence
of deprived people by city ward is reported in Table 3. There is some evidence
that higher mortality correlates with higher prevalence of material deprivation on
purely ecological comparison: the wards with the highest and lowest mortality are also
those with the highest and lowest prevalence of deprived people. The presence of
a contextual effect could be speculated restricting the analysis to the stratum of
not deprived people (Table 4): higher SMRs and Bayesian relative risks are still
observed in the city wards with higher prevalence of material deprivation.
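The definition of the deprivation index given above (a count of unfavourable events, later grouped as 0, 1 and 2 or more) can be sketched as follows. The function and argument names are hypothetical and the thresholds are simply those quoted in the text; this is not the authors' code.

```python
def deprivation_index(unemployed, years_of_schooling, dwelling_m2, has_bathroom):
    """Count the unfavourable events listed in the text (result is 0 to 4)."""
    events = [
        unemployed,               # unemployment
        years_of_schooling < 6,   # low education
        dwelling_m2 < 25,         # poor housing condition
        not has_bathroom,         # no bathroom in the flat
    ]
    return sum(bool(e) for e in events)


def deprivation_category(index):
    """Collapse the index into the three categories used in the tables."""
    return "2+" if index >= 2 else str(index)


# A non-deprived subject and a subject with all four unfavourable events.
print(deprivation_category(deprivation_index(False, 8, 60, True)))
print(deprivation_category(deprivation_index(True, 5, 20, False)))
```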
The emphasis here is not on interpreting such hypothesized effects, but on showing
how this kind of data should properly be analyzed. The next section introduces the
statistical models used.
3. Methods
We use individual data or cross-tabulated data (where the statistical unit is the cell,
after count data have been generated collapsing by deprivation level categorized as
Table 1. Observed deaths, Standardized Mortality Ratios (SMR), Bayesian relative risks (RR) and 95%
credibility interval (CI95%) for all causes mortality by city ward, Florence 1991-1995, males. Tuscany
Longitudinal Study
city ward        obs   SMR    RR     CI95%
Duomo            679   1.02   1.02   0.94-1.09
Gavinana         719   0.94   0.95   0.88-1.01
Santo Spirito    637   1.02   1.01   0.94-1.09
Legnaia          877   1.00   1.01   0.95-1.08
Mantignano       548   1.14   1.13   1.04-1.22
Novoli           747   1.01   1.01   0.95-1.09
Ponte di mezzo   440   1.15   1.13   1.03-1.24
San Jacopino     529   1.09   1.08   1.00-1.17
Le Panche        355   0.91   0.92   0.83-1.01
Poggetto         755   0.90   0.91   0.85-0.97
San Gallo        531   1.00   1.00   0.92-1.09
Oberdan          795   1.02   1.01   0.95-1.08
Campo di Marte   483   0.96   0.96   0.88-1.05
Coverciano       530   0.95   0.95   0.87-1.03
Table 2. Observed deaths and crude rates (per thousand and 95% confidence interval, CI95%) for all
causes mortality by deprivation index, Florence 1991-1995, males. Tuscany Longitudinal Study
deprivation index   obs    rate (per 1000)   CI95%
0                   7166   13.12             12.80-13.40
1                   1308   15.05             14.20-15.90
2+                  138    20.66             17.50-24.40
0, 1, 2 or more unfavourable events, census tract and age-group). Let i denote the
generic individual (or cell), j the census tract and k the city ward.
Let $x_{ijk}$ denote the material deprivation index for the generic i-th subject (or
cell) living in the j-th census tract within the k-th city ward; $\bar{x}_{jk}$ the census tract average
and $\bar{x}_k$ the city ward average of the deprivation index. In order to compare the effect
size of each variable in the subsequent regression analyses, all the variables have
been standardized by dividing them by their respective sample standard deviations.
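The city ward average of the deprivation index can be sketched from the counts in Table 3. The snippet assumes, as our own simplification, that the open-ended category "2+" is scored as exactly 2; under that assumption the averages for the three wards below agree with the ward means listed in Table 7 to about five decimal places. The function and variable names are hypothetical.

```python
def ward_mean_deprivation(n_one, n_two_plus, n_total):
    """Ward average of the index, scoring the '2+' category as 2 (assumption)."""
    # Subjects with index 0 contribute nothing to the numerator.
    return (1 * n_one + 2 * n_two_plus) / n_total


# (count with index 1, count with index 2+, ward total), copied from Table 3.
table3_counts = {
    "Duomo": (2185, 213, 13625),
    "Ponte di mezzo": (1282, 138, 7107),
    "Poggetto": (1400, 85, 13671),
}

ward_means = {
    ward: ward_mean_deprivation(*counts) for ward, counts in table3_counts.items()
}

for ward, mean in ward_means.items():
    print(ward, round(mean, 5))
```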
We will compare the results from the following regression models:
- Cox proportional hazard regression (using individual data and age as time axis);
- Poisson regression models (using cross-tabulated data).
The models were fitted to data at different levels of aggregation:
- individual level;
- aggregate level;
- multilevel, following both the contextual and Cronbach's definition (see Cronbach
  and Webb, 1975, Boyd and Iversen, 1979 and Kreft et al., 1995).
When individual data are considered, the response variable $Y_{ijk}$ is an indicator
of status (dead or alive) joined to a time-to-event variable (i = 1, ..., 163613). We
used a Cox proportional hazard regression having specified attained age as the time
Table 3. Distribution of the study cohort by city ward and deprivation index (number of unfavourable
events, see text, for each cohort member), Florence 1991-1995, males. Tuscany Longitudinal Study
city ward        0: n    prop.   1: n   prop.   2+: n   prop.   total
Duomo            11227   0.82    2185   0.16    213     0.02    13625
Gavinana         11464   0.85    1928   0.14    147     0.01    13539
Santo Spirito    9752    0.82    1990   0.17    182     0.02    11924
Legnaia          14234   0.85    2345   0.14    117     0.01    16696
Mantignano       10348   0.82    2142   0.17    186     0.02    12676
Novoli           14581   0.85    2419   0.14    228     0.01    17228
Ponte di mezzo   5687    0.80    1282   0.18    138     0.02    7107
San Jacopino     7727    0.88    985    0.11    52      0.01    8764
Le Panche        6229    0.84    1089   0.15    84      0.01    7402
Poggetto         12186   0.89    1400   0.10    85      0.01    13671
San Gallo        7601    0.87    1113   0.13    71      0.01    8785
Oberdan          11797   0.89    1386   0.10    98      0.01    13281
Campo di Marte   7530    0.90    810    0.10    39      0.01    8379
Coverciano       9065    0.86    1377   0.13    94      0.01    10536
Table 4. Observed deaths, Standardized Mortality Ratios (SMR), Bayesian relative risks (RR) and 95%
credibility interval (CI95%) for not-deprived people. Mortality for all causes, Florence 1991-1995,
males. Tuscany Longitudinal Study
city ward        obs   SMR    RR     CI95%
Duomo            575   1.05   1.03   0.96-1.10
Gavinana         575   0.92   0.94   0.87-1.00
Santo Spirito    490   0.98   0.99   0.91-1.06
Legnaia          720   1.00   1.01   0.94-1.07
Mantignano       410   1.14   1.10   1.00-1.19
Novoli           610   1.04   1.04   0.96-1.10
Ponte di mezzo   341   1.14   1.09   0.99-1.18
San Jacopino     468   1.10   1.07   0.99-1.15
Le Panche        289   0.93   0.97   0.88-1.05
Poggetto         663   0.91   0.94   0.87-1.00
San Gallo        455   1.00   1.00   0.92-1.07
Oberdan          689   1.01   1.00   0.93-1.06
Campo di Marte   425   0.95   0.96   0.89-1.03
Coverciano       456   0.98   0.98   0.90-1.06
axis and allowing for left truncation (delayed entry at the age at enrolment). We consider
$\log \theta_{ijk}$, the logarithm of the risk ratio, specifying in the linear predictor the
material deprivation index at several levels.
Individual data have also been collapsed, generating counts by five-year
age groups and calculating the expected number of deaths by indirect internal
age-standardization on the person-years tabulated by deprivation index and census tract
(i = 1, ..., 42902). A Poisson regression model has then been used to estimate
covariate effects. In particular, we defined as response $Y_{ijk}$ the number of observed
events and we specified $Y_{ijk} \sim \mathrm{Poisson}(E_{ijk}\theta_{ijk})$, where $\theta_{ijk}$ represents the
relative risk for the generic individual i, living in the j-th census tract and k-th city
ward, and $E_{ijk}$ is a population denominator (the expected number of deaths). We then
specified a linear model for $\log \theta_{ijk}$ with material deprivation at different levels as
predictors.
3.1. Models for individual level
The linear predictor is formulated in the following way:
$$\log \theta_{ijk} = \beta_0 + \beta_1 x_{ijk},$$
where the covariate $x_{ijk}$ is defined for the i-th subject (or cell).
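To illustrate how a model of this form can be fitted to cross-tabulated counts, the sketch below runs a plain Newton-Raphson iteration for a Poisson regression with the expected deaths as offset, that is, counts with mean $E \exp(\beta_0 + \beta_1 x)$. It is not the software used in the paper: the function name, the toy data and the "true" coefficients are all hypothetical, and the responses are noise-free pseudo-counts chosen so that the fit recovers the generating coefficients.

```python
import math


def fit_poisson_offset(y, expected, x, n_iter=25):
    """Newton-Raphson fit of log(theta_i) = b0 + b1 * x_i, Y_i ~ Poisson(E_i * theta_i)."""
    b0 = math.log(sum(y) / sum(expected))  # start from the overall observed/expected ratio
    b1 = 0.0
    for _ in range(n_iter):
        mu = [e * math.exp(b0 + b1 * xi) for e, xi in zip(expected, x)]
        # Score vector (first derivatives of the Poisson log-likelihood).
        u0 = sum(yi - mi for yi, mi in zip(y, mu))
        u1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        # Fisher information matrix (2 x 2, symmetric).
        i00 = sum(mu)
        i01 = sum(xi * mi for xi, mi in zip(x, mu))
        i11 = sum(xi * xi * mi for xi, mi in zip(x, mu))
        det = i00 * i11 - i01 * i01
        # Newton step: beta <- beta + I^{-1} U.
        b0 += (i11 * u0 - i01 * u1) / det
        b1 += (i00 * u1 - i01 * u0) / det
    return b0, b1


# Hypothetical cells: deprivation score and expected deaths per cell.
x = [0, 1, 2, 0, 1, 2]
expected = [50.0, 12.0, 2.0, 40.0, 9.0, 1.5]
true_b0, true_b1 = 0.1, 0.3  # illustrative values, not estimates from the paper

# Noise-free pseudo-counts equal to their Poisson means.
y = [e * math.exp(true_b0 + true_b1 * xi) for e, xi in zip(expected, x)]

b0_hat, b1_hat = fit_poisson_offset(y, expected, x)
print(round(b0_hat, 6), round(b1_hat, 6))
```

With real data the counts are integers and the estimates carry sampling error; standard errors would come from the inverse of the information matrix at convergence.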
3.2. Models for aggregate level
These models are formulated on two different levels of data aggregation using
the covariate $\bar{x}_{jk}$ or $\bar{x}_k$ defined at census tract or city ward level. The models are
respectively:
$$\log \theta_{jk} = \beta_0 + \beta_1 \bar{x}_{jk},$$
$$\log \theta_{k} = \beta_0 + \beta_1 \bar{x}_{k}.$$
These analyses have been performed by means of Poisson regression models for
cross-tabulated data.
3.3. Contextual multilevel models
These models are specified using both individual $x_{ijk}$ and aggregate $\bar{x}_{jk}$ (or $\bar{x}_k$)
covariates in the same regression model:
$$\log \theta_{ijk} = \beta_0 + \beta_1 x_{ijk} + \beta_2 \bar{x}_{jk} + \beta_3 \bar{x}_k.$$
Models of this kind, involving both individual and averaged variables, are called
contextual models (Boyd and Iversen, 1979). In epidemiological applications
the term contextual has been used more broadly, to refer to any aggregate
variable even in the absence of the corresponding individual level variable.
3.4. Cronbach's multilevel models
The previous models could be unstable due to multicollinearity (covariates usually
exhibit a strong correlation). A simple centering of the deprivation index variables
gives rise to Cronbach's model, a multiple regression model with all the variables
being centered. The model becomes:
$$\log \theta_{ijk} = \beta_0 + \beta_1 (x_{ijk} - \bar{x}_{jk}) + \beta_2 (\bar{x}_{jk} - \bar{x}_k) + \beta_3 (\bar{x}_k - \bar{x}).$$
This model, proposed for the analysis of educational data in the late seventies, has
a nice interpretation of model parameters and has not yet been widely used in
the epidemiological literature. Although exact algebraic correspondence is valid
for Gaussian linear models only, covariance decomposition applies to this model
(Sheppard, 2003). Cronbach's and contextual models can in general be compared
in the non-linear case (see Sheppard, 2003); only when a purely ecological model is
fitted (i.e. a model with only aggregate response and explanatory variables) do we lose
perfect algebraic comparability. The aggregate regression coefficients $\{\beta_2, \beta_3\}$ are not
confounded by the individual level covariate $x_{ijk}$. Cox and Poisson regression models
have been fitted as in the previous models. Poisson regression has also been performed
within a hierarchical multilevel Bayesian modelling approach, as follows.
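To make the link between the two parameterizations explicit, the short derivation below expands Cronbach's linear predictor on the raw (unstandardized) covariate scale. The symbols $\gamma_0, \dots, \gamma_3$ for the contextual-model coefficients are introduced here only for illustration, since the paper writes both models with $\beta$. Because each variable in the analyses is standardized by its own standard deviation, this is a sketch of the algebra rather than a relation that the tabulated coefficients should be expected to satisfy exactly.

```latex
\begin{align*}
\log \theta_{ijk}
  &= \beta_0 + \beta_1 (x_{ijk} - \bar{x}_{jk})
     + \beta_2 (\bar{x}_{jk} - \bar{x}_k)
     + \beta_3 (\bar{x}_k - \bar{x}) \\
  &= (\beta_0 - \beta_3 \bar{x}) + \beta_1 x_{ijk}
     + (\beta_2 - \beta_1)\, \bar{x}_{jk}
     + (\beta_3 - \beta_2)\, \bar{x}_k \\
  &= \gamma_0 + \gamma_1 x_{ijk} + \gamma_2 \bar{x}_{jk} + \gamma_3 \bar{x}_k ,
\end{align*}
\[
\gamma_0 = \beta_0 - \beta_3 \bar{x}, \qquad
\gamma_1 = \beta_1, \qquad
\gamma_2 = \beta_2 - \beta_1, \qquad
\gamma_3 = \beta_3 - \beta_2 .
\]
```

Read in this way, the individual level coefficient is the same in both forms, while each aggregate coefficient of Cronbach's model accumulates the effects of the levels below it: $\beta_2 = \gamma_1 + \gamma_2$ and $\beta_3 = \gamma_1 + \gamma_2 + \gamma_3$.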
3.5. Hierarchical multilevel Bayesian Models
A Bayesian model has been specified to take into account the hierarchies implied
in the data, where individuals are grouped by census tracts and city wards. By
means of hierarchical Bayesian modelling we are able to consider multiple sources
of variability at the same time, possibly including a spatial dependence among
neighboring census tracts or city wards. Indeed, the previous regression approaches
fail in estimating the uncertainty in the effect estimate of higher level covariates
(see Goldstein 1995 and Greenland 2002 for an epidemiological perspective).
Hierarchical Bayesian models have been specified on the number of observed
and expected events under internal age-standardization by deprivation index and
census tract:
$$Y_{ijk} \sim \mathrm{Poisson}(E_{ijk}\theta_{ijk})$$
where $\theta_{ijk}$ is the relative risk for the generic individual i, living in the j-th census
tract and k-th city ward, with a given degree of material deprivation, as mentioned
before.
A simple regression model for the relative risk consists in separate random
intercepts for each area unit (census tract/ward). The intercepts can be parameterized
as realizations of random variables with fixed zero means and unknown variances.
The model becomes
$$\log \theta_{ijk} = \beta_{0jk} + \beta_1 x_{ijk}$$
where the random coefficients are assumed to follow a known parametric distribution,
for example $\beta_{0jk} \sim \mathrm{Normal}(0, (\tau_0)^{-1})$.
Alternatively, both the intercepts and the slopes can be parameterized as a
realization of random variable(s) with fixed mean and unknown covariance matrix
$$\log \theta_{ijk} = \beta_{0jk} + \beta_{1jk} x_{ijk}$$
and $(\beta_{0jk}, \beta_{1jk}) \sim \mathrm{Multivariate\ Normal}(\mu, T^{-1})$.
Both of them are examples of general ANCOVA (Analysis of Covariance) models,
which could be used to get unbiased estimates of the effect of the individual
level covariate while adjusting for the aggregate (hierarchical) nature of the data
(subjects within census tracts within city wards). However, these models are highly
parameterized and, more parsimoniously, between area units variability could be
explained by aggregate level covariates.
Different models, depending on the assumed structure of random effect terms
have been specified:
a) Not spatially structured random intercepts and slopes for each city ward.
b) Spatially dependent (using a Gaussian conditional autoregressive model,
see Bernardinelli et al., 1995) random intercepts and slopes for each city ward.
c) Spatially dependent random intercepts and slopes for each city ward and not
spatially structured random intercepts for each census tract.
d) Random intercepts and slopes for each city ward and random intercepts for
each census tract (both spatially unstructured). This last model is
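$$\log \theta_{ijk} = (\beta_{0k} + \beta_{4j}) + \beta_{1k}(x_{ijk} - \bar{x}_{jk}) + \beta_2(\bar{x}_{jk} - \bar{x}_k) + \beta_3(\bar{x}_k - \bar{x})$$
with prior distributions Normal(0, 10000) for the fixed coefficients $\beta_2$ and $\beta_3$; prior
distributions for each k-th pair of random coefficients
$$(\beta_{0k}, \beta_{1k}) \sim \mathrm{Multivariate\ Normal}(\mu, T^{-1}),$$
and a Normal prior for each j-th random coefficient
$$\beta_{4j} \sim \mathrm{Normal}(\mu_4, (\tau_4)^{-1}).$$
Hyperpriors for $\mu$ and $T$ are, respectively,
$$\mu \sim \mathrm{Multivariate\ Normal}\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 0.0001 & 0 \\ 0 & 0.0001 \end{pmatrix}^{-1} \right)$$
and
$$T \sim \mathrm{Wishart}\left( \begin{pmatrix} 0.1 & 0.005 \\ 0.005 & 0.01 \end{pmatrix}^{-1}, 2 \right)$$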
The hyperprior for $\mu_4$ is Normal(0, 10000), for $\tau_4$ is Gamma(0.001, 10000). These
priors and hyperpriors can be regarded as non-informative since they have a very
large variance. In the absence of prior knowledge, the prior distribution can be
chosen to be vague; then the prior distribution has only a negligible influence on
the results and the shape of the posterior will be nearly the same (for a review of
non-informative prior distributions in Bayesian inference see Kass and Wasserman,
1996).
Models (a) and (b) ignore the census tract level. The former assumes exchangeable
random terms while the latter specifies a conditionally autoregressive structure
among city wards. This assumption is more realistic, as could be argued from Fig. 3.
Model (c) introduces the census tract level as spatially unstructured. The spatial
dependence at the lower level (census tract) has not been considered because it has been
Fig. 2. The 1991 census tracts of the City of Florence (Tuscany, Italy)
Fig. 3. Relative risk for all causes mortality, Florence 1991-1995, males. Tuscany Longitudinal Study
included when we define a spatial dependence at the higher level (city ward). The shapes
of census tracts and city wards seem to suggest that a spatial structure based on area
adjacencies is more appropriate when considering the city ward subdivision. Finally,
in model (d) both census tract and city ward effects are spatially unstructured.
Model comparison has been performed using the expected predictive deviance
(EPD) criterion:
$$2 \sum_{ijk} \left[ (Y_{ijk} + 0.05) \log \frac{Y_{ijk} + 0.05}{Y^{*}_{ijk} + 0.05} - Y_{ijk} + Y^{*}_{ijk} \right],$$
where the predicted data $Y^{*}_{ijk}$ are sampled from a Poisson($E_{ijk}\hat{\theta}_{ijk}$) and $\hat{\theta}_{ijk}$ are the
estimates obtained from the posterior distributions. The EPD measures the
discrepancy between the observed and predicted data, which can be expressed (see
Table 5. Log-Relative Risks and standard errors for standardized scores (relative effects) of the deprivation
index obtained by different models (see text). All causes, Florence 1991-1995, males. Tuscany
Longitudinal Study
model        covariates                                  individual data   cross-tabulated data
individual   individual $x_{ijk}$                        0.058 (0.010)     0.076 (0.010)
aggregate    census tract $\bar{x}_{jk}$                 -                 0.066 (0.010)
aggregate    ward $\bar{x}_k$                            -                 0.028 (0.011)
contextual   individual $x_{ijk}$                        0.040 (0.011)     0.061 (0.010)
             avg-census $\bar{x}_{jk}$                   0.057 (0.011)     0.045 (0.011)
             avg-ward $\bar{x}_k$                        0.013 (0.011)     0.009 (0.011)
Cronbach     individual $(x_{ijk} - \bar{x}_{jk})$       0.037 (0.010)     0.058 (0.010)
             census tract $(\bar{x}_{jk} - \bar{x}_k)$   0.067 (0.010)     0.061 (0.010)
             ward $(\bar{x}_k - \bar{x})$                0.032 (0.011)     0.026 (0.011)
Table 6. Hierarchical Bayesian model estimates for fixed coefficients; expected posterior deviance (EPoD),
expected predictive deviance (EPD) and model complexity. All causes, Florence 1991-1995, males. Tuscany
Longitudinal Study. The lowest EPoD and EPD values are those of model (d)
model   $(\bar{x}_{jk} - \bar{x}_k)$   $(\bar{x}_k - \bar{x})$   EPoD               EPD                 complexity
(a)     0.0636 (0.0101)                0.0405 (0.0279)           13983.10 (53.20)   28702.77 (285.16)   14719.67
(b)     0.0636 (0.0101)                0.0385 (0.0327)           13979.19 (51.58)   28696.32 (283.23)   14717.13
(c)     0.0621 (0.0114)                0.0308 (0.0348)           13418.14 (69.69)   28237.55 (289.22)   14819.41
(d)     0.0625 (0.0113)                0.0292 (0.0300)           13401.50 (68.78)   28222.20 (284.09)   14820.70
(a) Not spatially structured random intercepts and slopes for each city ward.
(b) Spatially dependent random intercepts and slopes for each city ward.
(c) Spatially dependent random intercepts and slopes for each city ward and not
spatially structured random intercepts for each census tract.
(d) Random intercepts and slopes for each city ward and random intercepts for each census tract
(both spatially unstructured).
Gelfand and Ghosh, 1998) as the sum of a goodness-of-fit term (the Expected
Posterior Deviance, EPoD) and a penalty term for model complexity.
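The deviance criterion described above can be sketched in a few lines. The code below assumes that the expected predictive deviance is obtained by averaging the deviance over replicated data sets drawn from the posterior predictive distribution; the function names and the toy counts are hypothetical, and the constant 0.05 is the continuity correction quoted in the text.

```python
import math


def predictive_deviance(observed, predicted, c=0.05):
    """Poisson-type deviance between observed and predicted counts."""
    return 2.0 * sum(
        (y + c) * math.log((y + c) / (y_rep + c)) - y + y_rep
        for y, y_rep in zip(observed, predicted)
    )


def expected_predictive_deviance(observed, predictive_draws, c=0.05):
    """Average the deviance over replicated data sets (one list per draw)."""
    deviances = [predictive_deviance(observed, draw, c) for draw in predictive_draws]
    return sum(deviances) / len(deviances)


# Toy example: three cells and two replicated data sets.
observed = [3, 0, 5]
draws = [[3, 0, 5], [2, 1, 5]]

print(predictive_deviance(observed, draws[0]))  # identical data give zero deviance
print(expected_predictive_deviance(observed, draws))
```

Each term of the sum is non-negative and vanishes only when the replicated count equals the observed one, which is why smaller values indicate better predictive agreement.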
4. Results
The logarithms of the relative risks and their standard errors obtained from the Cox
model and the Poisson regression for each level of data aggregation are reported
in Table 5.
The individual level analysis provides only effect estimates of individual level
covariates. If contextual effects are present, those estimates will be biased.
In the case of the linear Gaussian model it can be proved that the bias depends on the
11/15
A multilevel model for Contextual Effect of Material Deprivation 97
Table 7. Individual effects (constant and coefficient) and descriptive measure of the mean deprivation
for each city ward ($\bar{x}_k$) under model (d). All causes, Florence 1991-1995, males. Tuscany Longitudinal
Study. For the less deprived wards (lower $\bar{x}_k$) the individual effect is greater
city ward        constant           coefficient $(x_{ijk} - \bar{x}_{jk})$   $\bar{x}_k$
Duomo            0.0906 (0.14023)   0.0179 (0.02723)                         0.191633
Gavinana         0.1439 (0.14053)   0.0737 (0.02701)                         0.164119
Santo Spirito    0.1059 (0.14566)   0.0805 (0.02720)                         0.197417
Legnaia          0.1115 (0.13873)   0.0694 (0.02605)                         0.154468
Mantignano       0.0440 (0.14147)   0.0360 (0.02828)                         0.198328
Novoli           0.1018 (0.14001)   0.0408 (0.02755)                         0.166880
Ponte di mezzo   0.0661 (0.14883)   0.0387 (0.03241)                         0.219221
San Jacopino     0.0065 (0.14391)   0.0601 (0.03180)                         0.124258
Le Panche        0.1771 (0.14585)   0.0501 (0.03540)                         0.169819
Poggetto         0.1388 (0.14362)   0.0774 (0.03157)                         0.114842
San Gallo        0.0709 (0.14160)   0.0462 (0.03395)                         0.142857
Oberdan          0.0934 (0.14643)   0.0230 (0.03426)                         0.119127
Campo di Marte   0.0910 (0.15350)   0.0877 (0.03987)                         0.105980
Coverciano       0.1546 (0.14440)   0.0146 (0.03474)                         0.148538
Table 8. Precision matrix T for the individual effects for each city ward of the Bayesian model (d), the
mean and standard deviations of posterior distribution
precision element                 posterior mean and standard deviation
$T_{\beta_{0k}\beta_{0k}}$        97.4502 (39.53246)
$T_{\beta_{1k}\beta_{1k}}$        490.0663 (238.7085)
$T_{\beta_{0k}\beta_{1k}}$        12.92921 (63.99769)
size of the contextual level effect times the ratio between the variance of $\bar{x}_j$ and the
variance of $x_{ij}$.
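The bias result for the linear Gaussian case can be illustrated with a small simulation. In the sketch below the response is generated without noise from a hypothetical individual effect and a hypothetical contextual effect of the group mean; regressing the response on the individual covariate alone then returns the individual effect plus the contextual effect times the ratio of the two variances. All names and numerical values are our own illustrative choices.

```python
import random


def variance(values):
    """Population variance (the same divisor is used for every quantity below)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def ols_slope(x, y):
    """Slope of the least-squares regression of y on x (with intercept)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


random.seed(1)
beta_individual, beta_contextual = 0.5, 0.3  # hypothetical true effects
n_groups, n_per_group = 200, 20

x, x_group_mean, y = [], [], []
for _ in range(n_groups):
    centre = random.gauss(0, 1)
    members = [centre + random.gauss(0, 1) for _ in range(n_per_group)]
    group_mean = sum(members) / n_per_group
    for value in members:
        x.append(value)
        x_group_mean.append(group_mean)
        # Noise-free linear response with individual and contextual effects.
        y.append(beta_individual * value + beta_contextual * group_mean)

naive_slope = ols_slope(x, y)  # individual-level analysis ignoring the context
predicted_slope = beta_individual + beta_contextual * variance(x_group_mean) / variance(x)

print(round(naive_slope, 4), round(predicted_slope, 4))
```

The two printed numbers coincide because the sample covariance between a covariate and its own group mean equals the variance of the group mean, so the individual-only analysis overstates the individual effect whenever the contextual effect is positive.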
On the contrary, the aggregate level analysis provides unbiased effect estimates
of the overall effect of X (in the linear case, the sum of the true individual and
contextual effects). It should be noticed that the bias of the individual level effect
estimates reflects the importance of properly accounting for the hierarchies in the
data. This bias is opposite to the ecological fallacy, which arises when the effect
estimates obtained by aggregate level analysis are used as an approximation to the
true individual effect estimates. From Table 5, in the column for cross-tabulated
data, the overall effect is estimated as 0.066 by the aggregate model at census tract level,
while the individual level model gives an overestimated coefficient of 0.076.
In principle, contextual models provide unbiased estimates of the true individual
effect. Note that only if the analysis is conducted at census tract level will we
obtain unbiased estimates of the individual effect. The general rule is that estimates of
the effect of individual covariates are biased unless the appropriate aggregate level of
analysis is specified. The true individual effect is estimated as 0.040 by the Cox model
and 0.061 by Poisson regression. These compare with 0.058 and 0.076, respectively, when
individual models were fitted.
Cronbach's models provide unbiased estimates of individual effects as well.
The effect estimates of the aggregate variables are comparable to those obtained
Fig. 4. Posterior deviance distributions for the hierarchical Bayesian models (a)-(d)
fitting the model to aggregate data (0.061 and 0.026 compared to 0.066 and 0.028
for census tract and city ward respectively).
The Bayesian multilevel approach is needed to ensure the validity
of effect estimates and their precisions. Effect estimates for the multilevel
hierarchical Bayesian models (a)-(d) are reported in Table 6. We note that estimates of
contextual effects are very close to those obtained by the Cox model and Poisson
regression (Table 5), but with larger standard errors, as expected (using model (d) we
obtain 0.0625, standard error 0.0113, for the census tract level; 0.0292, standard error
0.0300, for the city ward level). Hierarchical Bayesian models properly address multiple
sources of variability, with special regard to the uncertainty of effect estimates of
higher level covariates (see Sect. 3.5). The underestimation of standard errors
is proportional, for each level of the hierarchy, to the number of clusters and the
between/within variance component ratios. The reader is invited to note the big
change in the size of standard errors for the covariate defined at ward level (only
14 wards) and the minor change for the covariate defined at census tract level (2752
tracts).
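The size of this change can be read off the two tables with a line of arithmetic. The sketch below divides the standard errors of model (d) in Table 6 by those of Cronbach's model fitted to cross-tabulated data in Table 5; the choice of that comparison row and the variable names are ours.

```python
# Standard errors copied from Table 5 (Cronbach, cross-tabulated data)
# and from Table 6 (hierarchical Bayesian model (d)).
se_poisson = {"census tract": 0.010, "city ward": 0.011}
se_bayes_d = {"census tract": 0.0113, "city ward": 0.0300}
n_clusters = {"census tract": 2752, "city ward": 14}

# Ratio of the multilevel standard error to the single-level one.
inflation = {level: se_bayes_d[level] / se_poisson[level] for level in se_poisson}

for level, ratio in inflation.items():
    print(f"{level}: {n_clusters[level]} clusters, SE ratio {ratio:.2f}")
```

The ratio is close to 1 for the 2752 census tracts and well above 2 for the 14 wards, in line with the remark that the correction matters most where clusters are few.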
The selection among the fitted Bayesian models has been done using EPD
values. The mean and standard deviation of the posterior and predictive deviance
distributions for the considered models are also reported in Table 6; the
posterior deviance distributions for models (a)-(d) are shown in Fig. 4, the predictive
deviance distributions in Fig. 5. Introducing the census tract level decreased
the posterior deviance substantially, by much more than the increase in model
complexity; model (d) performed best.
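The comparison can be retraced from the values in Table 6: the complexity column is the difference between EPD and EPoD, and the preferred model is the one with the smallest EPD. The dictionary layout and names below are only an illustrative sketch.

```python
# (EPoD, EPD) posterior means copied from Table 6.
table6 = {
    "a": (13983.10, 28702.77),
    "b": (13979.19, 28696.32),
    "c": (13418.14, 28237.55),
    "d": (13401.50, 28222.20),
}

# Penalty for model complexity: EPD minus the goodness-of-fit term EPoD.
complexity = {model: epd - epod for model, (epod, epd) in table6.items()}

# Model with the smallest expected predictive deviance.
best_model = min(table6, key=lambda model: table6[model][1])

# Gain in fit versus extra complexity when moving from model (a) to model (d).
fit_gain = table6["a"][0] - table6["d"][0]
extra_complexity = complexity["d"] - complexity["a"]

print(best_model, round(fit_gain, 2), round(extra_complexity, 2))
```

Moving from model (a) to model (d) lowers the EPoD by about 582 while the complexity term grows by about 101, which is the trade-off summarized in the text.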
The selected model was therefore the model with only spatially unstructured
effects for city wards and census tracts. For this model the estimated individual
Fig. 5. Predictive deviance distributions for the hierarchical Bayesian models (a)-(d)
Fig. 6. Individual effects for each city ward under hierarchical Bayesian model (d)
effects by city ward are shown in Fig. 6. The means and standard deviations of the
posterior distributions of individual effects are shown in Table 7 together with the
city ward average deprivation index. The means and standard deviations of the posterior
distributions of the hyperparameters contained in the precision matrix T are shown
in Table 8.
5. Conclusions
A class of models for contextual analyses (when covariates are measured at
individual and aggregate level) is reviewed. We suggest models which include
centering of covariates (Cronbach's model) and a multilevel Bayesian approach to
cope with random effects and to estimate precision parameters consistently. Various
alternative models have been discussed, with special emphasis on random effects
spatial models. Bayesian model comparison is performed using measures which
take model complexity into account. The example highlights the difficulties and
biases of simple analyses conducted at only one level of the hierarchy.
Any analysis of individual level variables which does not consider the multilevel
data structure will give biased results, unless the contextual effect is null. In turn,
an analysis conducted at the aggregate level only gives biased standard errors
and could provide biased point estimates if the contextual effect is null and ecological
confounding or effect modification is present. However, it will give an estimate of
the overall effect (contextual plus individual), provided no confounding is acting.
In fact, when considering the data hierarchy, in the presence of an individual
level effect only, the aggregate level effect in Cronbach's model would be equal
to the individual level effect, provided no ecological bias is in action.
In the presence of an aggregate level effect only, the individual level effect in
Cronbach's or the contextual model would be close to the null value.
In other cases the aggregate level effect in Cronbach's model could be
interpreted as the overall covariate effect, the sum of individual and contextual
effects. The explanation of the causal mechanism involved in contextual effects is
a matter of subject-specific research. There are still several subtleties to be considered,
especially in the non-linear case, which are beyond the scope of our paper. The
interested reader is referred to Sheppard (2003).
In the Florence census cohort 1991-1995, material deprivation appeared to be
strongly associated with mortality for all causes. Our findings suggest the presence
of complex patterns of association between deprivation and mortality, involving
individual and small area effects. The analysis restricted to sub-groups of the population
(most or least deprived) suggested a certain contribution of contextual effects at
census tract level. Using aggregate data at census tract level will give an unbiased estimate
of the overall effect, provided no confounding is active.
References
Anderson RT, Sorlie P, Backlund E, Johnson N, Kaplan GA (1997) Mortality Effects of Community
Socioeconomic Status. Epidemiology 8, 42-47
Bernardinelli L, Clayton D, Pascutto C, Montomoli C, Ghislandi M, Songini M (1995) Bayesian analysis
of space-variation in disease risk. Statistics in Medicine 14, 2433-2443
Besag J, York J, Mollié A (1991) Bayesian image restoration, with two applications in spatial statistics
(with discussion). Annals of the Institute of Statistical Mathematics 43, 1-59
Biggeri A, Gorini G, Dreassi E, Kalala N, Lisi C (2001) Condizione socio-economica e mortalità in
Toscana, Studi e Ricerche, n. 7, Edizioni Regione Toscana, Centro Stampa Giunta Regionale,
Firenze
Boyd LH, Iversen GR (1979) Contextual Analysis: Concepts and Statistical Techniques. Belmont, CA:
Wadsworth
Cronbach LJ, Webb N (1975) Between class and Within class Effects in a Reported Aptitude × Treatment
Interaction: A reanalysis of a study by G.L. Anderson. Journal of Educational Psychology 67, 717-724
Davey Smith G, Hart C, Watt G, Hole D, Hawthorne V (1998) Individual social class, area-based
deprivation, cardiovascular disease risk factors, and mortality: the Renfrew and Paisley study.
Journal of Epidemiology & Community Health 52, 399-405
Diez-Roux AV (1998) Bringing context back into epidemiology: variables and fallacies in multilevel
analysis. American Journal of Public Health 88, 216-222
Eachus J, Williams M, Chan P, Davey Smith G, Grainge M, Donovan J, Frankel S (1996) Deprivation and
cause specific morbidity: evidence from the Somerset and Avon survey of health. British Medical
Journal 312, 287-292
Firebaugh G (1978) A rule for inferring individual-level relationships from aggregate data. American
Sociological Review 43, 557-572
Gelfand AE, Ghosh SK (1998) Model choice: a minimum posterior predictive loss approach. Biometrika
85, 1-11
Goldstein H (1995) Multilevel Statistical Models. Second Edition, London: Edward Arnold
Greenland S (2002) A review of multilevel model theory for ecologic analyses. Statistics in Medicine
21, 389-395
Jarman B (1983) Identification of underprivileged areas. British Medical Journal 1705-1709
Kass RE, Wasserman L (1996) The selection of Prior Distributions by Formal Rules. Journal of the
American Statistical Association 91, 1343-1370
Kreft IGG, de Leeuw J, Aiken L (1995) The Effect of Different Forms of Centering in Hierarchical Linear
Models. Multivariate Behavioral Research 30, 1-22
Morris R, Carstairs V (1991) Which deprivation? A comparison of selected deprivation indexes. Journal
of Public Health Medicine 13, 318-326
Pell JP, Pell ACH, Norrie J, Ford I, Cobbe SM (2000) Effect of socioeconomic deprivation on waiting
time for cardiac surgery: retrospective cohort study. British Medical Journal 320, 15-18
Sheppard L (2003) Insight on bias and information in group-level studies. Biostatistics 4, 265-278
St Leger S (Ed.) (1995) Use of deprivation indices in small area studies of environment and health.
Journal of Epidemiology & Community Health S2, 49, 1-88
Sundquist J, Malmström M, Johansson SE (1999) Cardiovascular Risk Factors and the Neighbourhood
Environment. International Journal of Epidemiology 28, 841-845
Townsend P, Phillimore P, Beattie A (1988) Health and deprivation: inequalities and the north. London:
Croom Helm