15
This article was downloaded by: [Florida State University] On: 17 October 2014, At: 18:12 Publisher: Taylor & Francis Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK Journal of Applied Statistics Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/cjas20 Bayesian estimation of log odds ratios over two-way contingency tables with intraclass correlated cells Haydar Demirhan a a Department of Statistics , Hacettepe University , Beytepe, Ankara , 06800 , Turkey Published online: 19 Jun 2013. To cite this article: Haydar Demirhan (2013) Bayesian estimation of log odds ratios over two-way contingency tables with intraclass correlated cells, Journal of Applied Statistics, 40:10, 2303-2316, DOI: 10.1080/02664763.2013.810196 To link to this article: http://dx.doi.org/10.1080/02664763.2013.810196 PLEASE SCROLL DOWN FOR ARTICLE Taylor & Francis makes every effort to ensure the accuracy of all the information (the “Content”) contained in the publications on our platform. However, Taylor & Francis, our agents, and our licensors make no representations or warranties whatsoever as to the accuracy, completeness, or suitability for any purpose of the Content. Any opinions and views expressed in this publication are the opinions and views of the authors, and are not the views of or endorsed by Taylor & Francis. The accuracy of the Content should not be relied upon and should be independently verified with primary sources of information. Taylor and Francis shall not be liable for any losses, actions, claims, proceedings, demands, costs, expenses, damages, and other liabilities whatsoever or howsoever caused arising directly or indirectly in connection with, in relation to or arising out of the use of the Content. This article may be used for research, teaching, and private study purposes. Any substantial or systematic reproduction, redistribution, reselling, loan, sub-licensing, systematic supply, or distribution in any form to anyone is expressly forbidden. Terms & Conditions of access and use can be found at http://www.tandfonline.com/page/terms- and-conditions

Bayesian estimation of log odds ratios over two-way contingency tables with intraclass correlated cells

  • Upload
    haydar

  • View
    214

  • Download
    1

Embed Size (px)

Citation preview

Page 1: Bayesian estimation of log odds ratios over two-way contingency tables with intraclass correlated cells

This article was downloaded by: [Florida State University]On: 17 October 2014, At: 18:12Publisher: Taylor & FrancisInforma Ltd Registered in England and Wales Registered Number: 1072954 Registeredoffice: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK

Journal of Applied StatisticsPublication details, including instructions for authors andsubscription information:http://www.tandfonline.com/loi/cjas20

Bayesian estimation of log odds ratiosover two-way contingency tables withintraclass correlated cellsHaydar Demirhan aa Department of Statistics , Hacettepe University , Beytepe,Ankara , 06800 , TurkeyPublished online: 19 Jun 2013.

To cite this article: Haydar Demirhan (2013) Bayesian estimation of log odds ratios over two-waycontingency tables with intraclass correlated cells, Journal of Applied Statistics, 40:10, 2303-2316,DOI: 10.1080/02664763.2013.810196

To link to this article: http://dx.doi.org/10.1080/02664763.2013.810196

PLEASE SCROLL DOWN FOR ARTICLE

Taylor & Francis makes every effort to ensure the accuracy of all the information (the“Content”) contained in the publications on our platform. However, Taylor & Francis,our agents, and our licensors make no representations or warranties whatsoever as tothe accuracy, completeness, or suitability for any purpose of the Content. Any opinionsand views expressed in this publication are the opinions and views of the authors,and are not the views of or endorsed by Taylor & Francis. The accuracy of the Contentshould not be relied upon and should be independently verified with primary sourcesof information. Taylor and Francis shall not be liable for any losses, actions, claims,proceedings, demands, costs, expenses, damages, and other liabilities whatsoever orhowsoever caused arising directly or indirectly in connection with, in relation to or arisingout of the use of the Content.

This article may be used for research, teaching, and private study purposes. Anysubstantial or systematic reproduction, redistribution, reselling, loan, sub-licensing,systematic supply, or distribution in any form to anyone is expressly forbidden. Terms &Conditions of access and use can be found at http://www.tandfonline.com/page/terms-and-conditions

Page 2: Bayesian estimation of log odds ratios over two-way contingency tables with intraclass correlated cells

Journal of Applied Statistics, 2013Vol. 40, No. 10, 2303–2316, http://dx.doi.org/10.1080/02664763.2013.810196

Bayesian estimation of log odds ratios overtwo-way contingency tables with intraclass

correlated cells

Haydar Demirhan∗

Department of Statistics, Hacettepe University, Beytepe, Ankara 06800, Turkey

(Received 16 September 2011; accepted 28 May 2013)

In this article, a Bayesian approach is proposed for the estimation of log odds ratios and intraclasscorrelations over a two-way contingency table, including intraclass correlated cells. Required likelihoodfunctions of log odds ratios are obtained, and determination of prior structures is discussed. Hypothesistesting for log odds ratios and intraclass correlations by using the posterior simulations is outlined. Becausethe proposed approach includes no asymptotic theory, it is useful for the estimation and hypothesis testingof log odds ratios in the presence of certain intraclass correlation patterns. A family health status and lim-itations data set is analyzed by using the proposed approach in order to figure out the impact of intraclasscorrelations on the estimates and hypothesis tests of log odds ratios. Although intraclass correlations aresmall in the data set, we obtain that even small intraclass correlations can significantly affect the estimatesand test results, and our approach is useful for the estimation and testing of log odds ratios in the presenceof intraclass correlations.

Keywords: Bayes factor; categorical data; credible interval; cross table; neighborhood-level diseaseclustering; epidemiology; familial correlation; normal distribution; odds ratio; population genetics; priordistribution

1. Introduction

In some studies, categorical data include intraclass correlations if the data are collected oversampling units including more than one observation unit and these units are correlated. Forinstance, in a public health survey, if categorical data are collected over families, intraclass corre-lations arise from within family dependence. When these intraclass correlations are disregarded,reliability of analyses, such as the common chi-square test, becomes controversial because con-ventional methods of inference can be severely affected by these intraclass correlations [5]. Also,most of the proposed estimators of intraclass correlation coefficient (ICC) are subject to bias [6].When a two-way contingency table is considered, if there are intraclass correlations, cell countsare no longer distributed as multinomial or Poisson and assumptions of the classical chi-square test

∗Email: [email protected]

© 2013 Taylor & Francis

Dow

nloa

ded

by [

Flor

ida

Stat

e U

nive

rsity

] at

18:

12 1

7 O

ctob

er 2

014

Page 3: Bayesian estimation of log odds ratios over two-way contingency tables with intraclass correlated cells

2304 H. Demirhan

do not hold due to the intraclass correlations [5,14]. Cohen [14] took into account the existence ofintraclass correlations in the classical setting when each cluster included two units. The case thateach cluster includes more than two units was considered by Altham [5], who gave the estimationand testing of cell probabilities and logs of them in the classical setting. However, the approachgiven for logs is valid for large samples [5]. Brier [8] proposed the use of the Dirichlet-multinomialmodel for the estimation and testing of quantities over contingency tables in the classical settingunder the presence of intraclass correlations. Choi and McHugh [13] proposed a correction for theclassical goodness-of-fit or independence test of a data set, which comes from cluster samplingand has an intraclass correlation pattern. Helu and Naik [23] considered classical estimation andtesting of intraclass correlations using the Kotz-type log-likelihood for the estimation and Srivastaestimator for the testing. Doner [19] presented a review of methodology on the estimation of ICCover the random effect models. Commenges and Jacqmin [15] defined the ICC within the contextof random effect models, and derived a distribution-free score test. Harris and Burch [22] proposeda family of estimators based on the pivotal quantities for the estimation of the ICC in the balancedone-way random effects model. Ye and Wang [28] considered estimation and testing of the ICCover a two-way random effects model with interaction. Atenafu et al. [6] proposed a new bias-corrected estimator for the ICC in the context of the random effects model. Burch and Harris [9]used Bayesian methods to estimate the ICC with-in the one-way random effect models context.Nandram and Choi [26,27] considered the Bayesian estimation and testing of cell probabilitiesand intraclass correlations a two-way contingency table from the Bayesian point of view. It wasmentioned by Nandram and Choi [26,27] that Bayesian approaches used for intraclass correlatedcontingency tables do not include asymptotic theory. Also, they explained problematic respectsof the asymptotic theory used in the classical setting. Heo and Kim [24] developed various priorstructures for the estimation of intraclass correlation parameters.

Bayesian estimation of log odds ratios from 2 × 2 and k 2 × 2 contingency tables was con-sidered by Albert [4]. Demirhan and Hamurkaroglu [16] proposed Bayesian estimation of logodds ratios from R × C and 2 × 2 × K contingency tables. They derived the required likelihoodfunctions and both informative and non-informative prior distributions.

Classical estimation and testing of log odds ratios over contingency tables are somewhat prob-lematic if there are intraclass correlations. Approaches given for the classical estimation of a logodds ratio and its variance are valid under large samples and mainly based on the assumption thatcell counts follow a multinomial distribution [1]. If there are intraclass correlations, the distribu-tion of cell counts is no longer multinomial. Hence, if the classical theory is applied neglecting theintraclass correlations, we have some results based on a distribution other than the multinomialdistribution. Inferring the results as if they are based on the multinomial distribution is not aconvenient manner. A likelihood function for odds ratios can be obtained under the intraclass cor-relations, and classical estimation and testing can be carried on this likelihood function. However,there would be some problems such as the estimation of standard deviations that remain in theclassical approach. If some cells of the considered contingency table contain zero or very smallcounts, reliability of classical point and interval estimates of log odds ratios is questionable, andresults of hypothesis tests based on the standard error estimates that require a large sample aresuspicious. The asymptotic distribution of the test statistic is chi-square when the table of interestis balanced and does not contain many zero counts. However, if the table of interest is unbalancedor the number of zero cells is high, and there are intraclass correlations, the distribution of thetest statistic is no longer chi-square, and estimates of cell proportions and results of hypothesistests are inaccurate (see Nandram and Choi [26] for a detailed discussion). Because the Bayesiansetting does not include asymptotic theory, we can get reliable results for log odds ratios with theBayesian setting even if some cells of the considered table are problematic [16,26]. Moreover,it is possible to include available prior knowledge into the analysis and utilize from the generaladvantages of the Bayesian way.

Dow

nloa

ded

by [

Flor

ida

Stat

e U

nive

rsity

] at

18:

12 1

7 O

ctob

er 2

014

Page 4: Bayesian estimation of log odds ratios over two-way contingency tables with intraclass correlated cells

Journal of Applied Statistics 2305

In the Bayesian way, posterior distribution of the odds ratio can be derived in terms of posteriordistributions of the cell probabilities, and inferences can be based on this posterior distribution.This is one of the reasons making the Bayesian approach attractive. However, use of this wayis inappropriate in the Bayesian estimation of odds ratios over a contingency table. The relationbetween cell probabilities and the corresponding odds ratio is many-to-one. Assume we havefour cell probabilities that pi, i = 1, 2, 3, 4. The many-to-one transformation used to obtain theodds ratio is θ = (p1p4)/(p2p3). When we induce non-informative priors on the cell probabilitiesand apply the transformation, it is quite probable that the corresponding priors of log odds ratioswill not be non-informative due to the used many-to-one transformation. Therefore, when weapply the transformation over the posterior distributions of cell probabilities, we overlook thisnon-informativeness issue. If we are intending to conduct an informative analysis, we shouldevaluate whether the many-to-one transformation consistently transfers the prior information onfour cell probabilities to the corresponding odds ratio or not. Let the joint prior distribution ofcell probabilities be fp(p1, p2, p3, p4). The marginal prior distribution of θ = θ1 = (p1p4)/(p2p3)

is obtained as f (θ1) = ∫Rθ4

∫Rθ3

∫Rθ2

fp(θ1θ2θ3/θ4, θ2, θ3, θ4)(θ2θ3/θ4) dθ2 dθ3 dθ4, where θ2 = p2,θ3 = p3, and θ4 = p4. We do not have many options for the choice of fp(p1, p2, p3, p4). If weconsider uniform and beta distributions by assuming independency of cell probabilities and theDirichlet distribution without the assumption of independence, then it is straightforwardly seenfrom the marginal expected value of θ1 that f (θ1) does not contain any consistent prior informationfor the odds ratio of interest in all of these cases. Thus, it is impossible to guarantee that the many-to-one transformation consistently transfers the prior information from cell probabilities to thecorresponding odds ratio even in such a simple case. For a reliable analysis, one should transformhis prior knowledge on an odds ratios into prior knowledge on each of the four cell probabilities.This seems to be impossible, because some of the four cell probabilities are contained in other oddsratios at the same time. As another option, one can use conditional inference. There is a one-to-onecorrespondence between odds ratios and the cell probabilities, given the marginal probabilities ofthe table. However, under the Bayesian setting, cell probabilities are random variables and hencethe marginal probabilities are also random variables. Therefore, the transformation between cellprobabilities and odds ratios is still many-to-one, and the mentioned problems remain. Instead, useof a log odds ratio for the estimation of the corresponding odds ratio is a more convenient manner.Because the relationship between an odds ratio and the corresponding log odds ratio is one-to-one,there is no problem in the combination of prior information, and prior information on an oddsratio is straightforwardly transformed into that of on a log odds ratio. Consequently, we shoulduse a specific probability model for odds ratios instead of using the many-to-one transformationof cell probabilities.

With these motivations, we propose Bayesian approaches for the estimation and testing oflog odds ratios over a two-way contingency table under the existence of intraclass correlationpatterns. We simultaneously estimate and test the ICCs. In our approach, prior knowledge on eachodds ratio is suitably reflected to the analysis without transforming prior knowledge on an oddsratio into that of cell probabilities. Therefore, we successfully include expert information alongwith sample information into the estimation and testing of log odds ratios with our approach. Weuse no asymptotic theory, and hence log odds ratios can be estimated over unbalanced or sparsetables. We apply the proposed approach to a data set from the 2006 National Health InterviewSurvey (NHIS) conducted by the US National Center for Health Statistics (NCHS). This data setis unbalanced and contains very small cell counts. We expect to get small intraclass correlations.Thus, we illustrate the impact of intraclass correlations, even if they are small, along with theimpact of unbalance and very small cell counts on the estimates and tests of log odds ratios. Inaddition, we obtain non-informative posterior estimates of cell probabilities over the data set,and use these estimates to obtain the estimates of odds ratios. We compare these estimates ofodds ratios with their counterparts obtained by our direct Bayesian approach. By this operation,

Dow

nloa

ded

by [

Flor

ida

Stat

e U

nive

rsity

] at

18:

12 1

7 O

ctob

er 2

014

Page 5: Bayesian estimation of log odds ratios over two-way contingency tables with intraclass correlated cells

2306 H. Demirhan

we practically aim to reveal the necessity of a direct Bayesian analysis for estimation of oddsratios.

In Section 2, derivation of the joint likelihood function of log odds ratios and intraclass correla-tions used prior structure and required posterior calculations are mentioned. In Section 3, we givean approach for the Bayesian testing of log odds ratios and intraclass correlations. In Section 4,application of considered approaches to the NHIS data set is given. The last section is devoted toa short discussion.

2. Bayesian estimation

2.1 Likelihood function

We need a likelihood function to incorporate intraclass correlations into the estimation and testingof log odds ratios and ICCs over a two-way (R × C) contingency table. The likelihood functionis obtained by using the key formula given by Altham [5] along with the approach given byDemirhan and Hamurkaroglu [16]. The key formula is also used by Nandram and Choi [26].

Let nijk be the number of individuals from ith cluster falling in the jth row and kth column of theconsidered R×C contingency table. Here and hereafter i = 1, . . . , I , j = 1, . . . , R and k = 1, . . . , Cunless otherwise stated. The total number of individuals in the ith cluster is ni. The probabilitythat all individuals in the ith cluster fall in the same cell of a contingency table of interest is as inthe following [5]:

θtπjk + (1 − θt)(πjk)t , 0 ≤ θt ≤ 1,

where θt denotes the intraclass correlation coefficient for clusters including t = ni individuals fort = 2, . . . , T , T is the greatest family size and θ1 = 0. The probability that the individuals are indifferent but specified cells is as follows:

(1 − θt)∏j,k

(πjk)nijk .

Nandram and Choi [26] derived the likelihood function of intraclass correlations and cellprobabilities using the described model as follows:

�(θ, π | n) =∏t,j,k

(θtπjk + (1 − θt)(πjk)t)gtjk

{∏t

(1 − θt)gt

} ∏i/∈C

⎧⎨⎩ni!

∏j,k

(πjk)nijk

nijk!

⎫⎬⎭ ,

where C is the set of clusters in which all individuals fall in a single cell of the contingency tableof interest, gtjk is the number of clusters of size t with all individuals in the cell (j, k) in C, and gt

is the number of clusters of size t in C.Under the multinomial or Poisson sampling, the kernel of likelihood function is l(πij | yij) ∝∏i,j(πij)

yij [1]. A local log odds ratio is defined for the R × C table structure as follows:

λjk = log

(πjkπ(j+1)(k+1)

πj(k+1)π(j+1)k

), j = 1, . . . , R − 1; k = 1, . . . , C − 1.

Let λ = (λjk) include considered log odds ratios. In order to derive a likelihood function for λ,πjk is transformed into λjkajk . πjk and ajk are obtained as in the following [16]:

πjk = exp(λjk + ajk), ajk = log

(πj(k+1)π(j+1)k

π(j+1)(k+1)

), for j = 1, . . . , R − 1; k = 1, . . . , C − 1;

πjC = [exp(λj(C−1) + aj(C−1))]−1, ajC = log

(π(j+1)(C−1)

πi(C−1)π(j+1)C

), for j = 1, . . . , R − 1;

Dow

nloa

ded

by [

Flor

ida

Stat

e U

nive

rsity

] at

18:

12 1

7 O

ctob

er 2

014

Page 6: Bayesian estimation of log odds ratios over two-way contingency tables with intraclass correlated cells

Journal of Applied Statistics 2307

πRk = [exp(λ(R−1)k + a(R−1)k)]−1, aRj = log

(π(R−1)(k+1)

π(R−1)kπR(k+1)

), for k = 1, . . . , C − 1;

πRC = exp(λ(R−1)(C−1) + aRC), aRC = log

(π(R−1)CπR(C−1)

π(R−1)(C−1)

).

By using these definitions, the required likelihood function of intraclass correlations, log oddsratios and nuisance parameters (a) is obtained as follows:

�(θ, λ | n) ∝T∏

t=1

⎧⎨⎩

R−1∏j=1

C−1∏k=1

[θt exp(λjk + ajk) + (1 − θt) exp(t(λjk + ajk))]gtjk

×R−1∏j=1

[θt exp(−λj(C−1) − aj(C−1)) + (1 − θt) exp(−t(λj(C−1) + aj(C−1)))]gtj(C−1)

×C−1∏k=1

[θt exp(−λ(R−1)k − a(R−1)k) + (1 − θt) exp(−t(λ(R−1)k + a(R−1)k))]gt(R−1)k

×[θt exp(λRC + aRC) + (1 − θt) exp(t(λRC + aRC))]gtRC

⎫⎬⎭

[T∏

t=1

(1 − θt)gt

]

×∏i/∈C

⎧⎨⎩

⎡⎣R−1∏

j=1

C−1∏k=1

exp(nijk(λjk + ajk))

⎤⎦

⎡⎣R−1∏

j=1

exp(−nij(C−1)(λj(C−1) + aj(C−1)))

⎤⎦

×[

C−1∏k=1

exp(−ni(R−1)k(λ(R−1)k + a(R−1)k))

]exp(niRC(λRC + aRC))

⎫⎬⎭ .

The jk indices are converted to � to simplify the equation.The total number of concerned log oddsratios is K = (R − 1)(C − 1). When log odds ratios are matched with cell probabilities, somelog odds ratios correspond to the same cell probability more than once. Thus, log odds ratiosare indexed such that � = (j − 1)(C − 1) + k, j < R, k < C, � = (j − 1)(C − 1) + (C − 1),j < R, k = C, � = (R − 2)(C − 1) + k, j = R, k < C and � = (R − 1)(C − 1), j = R, k = C.This type of indexing is used hereafter.

2.2 Prior structure

Prior distribution of each intraclass correlation parameter is taken as Beta(ξt , γt). We use the betadistribution in order to be able to determine an informative prior for each intraclass correlationcoefficient when prior knowledge is available. However, it is hard to obtain prior information aboutintraclass correlations. We a priori assume that ICCs are independently identically distributed.With this assumption, we can determine prior expected value and prior precision of each ICCseparately. This enables us to induce low prior precision (non-informative prior) for some of theICCs and high prior precision (informative prior) for others.

We consider the determination of prior distribution of nuisance parameters and log odds ratios,simultaneously. We follow the discussions given by Demirhan and Hamurkaroglu [16] for thedetermination of prior structure for log odds ratios. Multivariate normal (MVN) distributionswith known and unknown dispersion parameters are used as prior distributions. Parameters a andλ are not assumed to be independent. Thus, the joint prior is obtained as p(λ, a) = p(λ | a)p(a).

Dow

nloa

ded

by [

Flor

ida

Stat

e U

nive

rsity

] at

18:

12 1

7 O

ctob

er 2

014

Page 7: Bayesian estimation of log odds ratios over two-way contingency tables with intraclass correlated cells

2308 H. Demirhan

Prior probability density function p(a) is taken as a multivariate flat distribution so as not toinduce any subjective prior information on a. The p(λ | a) is taken as an MVN distribution withparameters μ and �, which are user-specified hyper-parameters of p(λ, a | μ, �). Subjective priorinformation, if available, is the information on the value of an odds ratio for a possibly intraclasscorrelated table. In order to translate subjective prior information on an odds ratio to a value forthe relevant element of μ, a natural logarithm is calculated. � reflects the precision of the priorbelief in each log odds ratio. It is effectual in the specification of an informative prior distributionfor log odds ratios. When they are a priori assumed to be independent, it is taken that � = τ−1I.Here τ represents our degree of belief in prior information by determining prior variance. Thisis the straightforward way for the determination of a precision parameter for Bayesian log-linearmodeling. Close to 0 values of τ reflect a weak belief in prior information. One should choose atending to infinity value for τ to make prior variances of corresponding log odds ratios close to zero,if he has a strong belief in the induced prior information. When log odds ratios are not assumedto be independent, information on the covariances between log odds ratios is reflected by the off-diagonal elements of �. They are chosen by the researcher according to his experiences. However,positive definiteness of � is not guaranteed in this manner. Demirhan and Hamurkaroglu [16]give an approach for the determination of prior distribution when independence of parameters isnot assumed or the table of interest includes ordinal variables.

2.3 Joint posterior distribution and posterior inferences

A vector h is defined such that h = (μT, αT, βT). The nuisance parameter vector a is integratedout from the joint posterior distribution of θ, λ and a given data, h and � as follows:∫

Ra

p(θ, λ | a, h, �)p(a | h, �) da.

As mentioned in Box and Tiao [7, p. 70] and Demirhan and Hamurkaroglu [16], p(a | h, �) canbe perceived as a weight function. p(a | h, �) is expected to be sharp because the sample providesmuch information about a. Thus, most of its probability content is concentrated around its modea. In this case, integrating out a is nearly equivalent to assigning the a to a in the p(θ, λ | a, h, �),namely p(θ, λ | h, �)

.= p(θ, λ | a, h, �) [7]. Therefore, the relevant terms in the p(θ, λ | a, h, �)

can be perceived as a proportionality constant, and we use p(θ, λ | h, �) in the following equations.Consequently, p(θ, λ | h, �) is obtained as follows:

p(θ, λ | h, �) ∝{

T∏t=2

(θt)ξt−1(1 − θt)

γt−1

} ⎧⎨⎩

∏i/∈C

⎡⎣ R∏

j=1

C∏k=1

exp(δjknijkλ�)

⎤⎦

⎫⎬⎭

×T∏

t=2

(1 − θt)gt

⎧⎨⎩

R∏j=1

C∏k=1

[θt exp(δjkλ�) + (1 − θt) exp(δjk tλ�)]gtjk

⎫⎬⎭

× exp

{−1

2

[K∑

�=1

σ−2�� (λ� − μ�)

2 − 2K∑

�=1,�<s

σ−2�s (λ� − μ�)(λs − μs)

]}, (1)

where δjk = −1 for (j < R, k = C) or (j = R, k < C) and δjk = 1 otherwise, σ−2�� and σ−2

�s arediagonal and off-diagonal elements of �−1, respectively.

A data augmentation approach can be employed with an auxiliary variable for posterior com-putations over (1). Such an approach is used by Nandram and Choi [26] in the estimation of cellprobabilities.Also, it can be adapted for the estimation of log odds ratios and seems to be a straight-forward way for posterior calculations. However, the data augmentation approach adds auxiliary

Dow

nloa

ded

by [

Flor

ida

Stat

e U

nive

rsity

] at

18:

12 1

7 O

ctob

er 2

014

Page 8: Bayesian estimation of log odds ratios over two-way contingency tables with intraclass correlated cells

Journal of Applied Statistics 2309

variables to the joint posterior distribution of log odds ratios and intraclass correlations. Thisincreases the complexity and uncertainty of estimation process. Use of the Metropolis–Hastings(MH) algorithm (see Chen et al. [10, p. 23] for details) for posterior calculations is a more effec-tive and economical way of providing that appropriate proposal distributions are available. Anormal distribution with a moderate variance and a mean of zero is used as the proposal distribu-tion of each log odds ratio. A Beta(1, η) distribution is used as the proposal distribution of eachintraclass correlation parameter. These are arbitrary choices. Sensitivity of results to the proposaldistribution can be evaluated by repeating posterior calculations for several proposal distributions.Acceptance rates also help the assessment of the suitability of a proposal distribution. For normalproposal densities, we expect to get acceptance rates about 45–50% for an univariate case and25–30% for a multivariate case to conclude appropriateness of a proposal distribution [12].

3. Bayesian testing

We consider testing of log odds ratios and intraclass correlations from the Bayesian point of view.We test simple hypotheses Hλ

0 : λr = λr0 for −∞ < λr

0 < ∞, r = 1, . . . , K and Hθ0 : θt = ρ t

0 for0 ≤ ρ t

0 ≤ 1, t = 1, . . . , T . There is no specific alternative hypothesis, so we have Hλ1 : λr �= λr

0and Hθ

1 : θt �= ρ t0 alternative hypotheses. In general, Bayes factors (see Kass and Raftery [25] for

details and sample applications) are used to test these hypotheses. The Bayes factor is the ratio ofthe marginal likelihoods. Calculation of the marginal likelihoods in the analysis of log odds ratiosand intraclass correlations is problematic because the required integrals cannot be analyticallysolved. In general, various asymptotic approximations such as Laplace’s method or Monte Carlomethods such as Gaussian quadrature, importance sampling, bridge sampling or path samplingare used to solve the integrals. Each approach has some difficulties in itself, and is suitable fordifferent specific cases [11,21,25]. Bayes factors are sensitive to the choice of prior distributions.When a diffuse prior is induced on the parameters of interest and Bayes factor is used to test apoint null hypothesis, it is very probable that the Bayes factor supports H0 while it is incorrectactually [3]. In addition, one should find a specific Bayes factor for each test. These make the useof Bayes factors troublesome.

Instead of dealing with difficulties of Bayes factors, we adapt the approach given by Demp-ster [17,18], Aitkin [2] and Aitkin et al. [3]. They considered the likelihood ratio (LR) between thenull and alternative hypotheses as a random variable, and proposed to obtain posterior distributionof LR and to make the decision about null hypothesis according to the posterior distribution ofLR. The logic under their approach is as follows: for a parameter vector β, LR is defined asLR(β) = �(β0)/�(β). Here �(β0) is the likelihood calculated under the null hypothesis and �(β)

is an unknown function of β. Therefore, LR(β) can be considered as a random variable. One canobtain p(LR(β) | y) and calculate P = P(LR(β) < c | y) for an arbitrary constant c. The poste-rior probability P is then used to assess the evidence against the null hypothesis. For instance, ifP > 0.9 for a c < 1, we have quite strong evidence against the null hypothesis [2,3].

In our testing problem, βT = (λT, θT). For instance, when we are testing Hλ0 : λ2 = λ2

0 thenβT

0 = (λ1, λ20, λ3, . . . , λK , θT). In this case, we only know the value of λ2, and values of the rest of

parameters, which are nuisance parameters currently, are unknown.Aitkin et al. [3] use maximum-likelihood estimates (MLEs) of unknown nuisance parameters, and give various approaches forthe calculation of P. However, their cases are more tractable than our problem. In our case,use of MLEs is inappropriate, so we use our posterior estimates of nuisance parameters in thecalculation of �(β0). We use our MH output in order to approach p(LR(β) | y). We directlyuse the MH output and calculate �(β), and hence LR(β) at each step. As a result, we get anestimate of the posterior distribution of LR(β), (p(LR(β) | y)), and able to obtain an estimate ofP, (P = P(LR(β) < c | y)). A decision about the null hypothesis is made by using P.

Dow

nloa

ded

by [

Flor

ida

Stat

e U

nive

rsity

] at

18:

12 1

7 O

ctob

er 2

014

Page 9: Bayesian estimation of log odds ratios over two-way contingency tables with intraclass correlated cells

2310 H. Demirhan

4. Analysis of the NHIS data set

Concerned data set is collected within the 2006 NHIS conducted by the NCHS and taken fromthe web site of NCHS (http://www.cdc.gov/nchs/nhis.htm). We derived a particular data set fromthe 2006 NHIS. The ‘personsx: Family Health Status & Limitations’ file is used. Householdsincluding only one family, size of which is less than eight, in the south region are considered(Filters: FMX = 1, FPX < 8, REGION = 3) in the ‘personsx’file.Variables of interest are gender(Variable name: SEX; Categories: male/female) and duration of depression/anxiety/emotionalproblem in the family (Variable name: LADURB17; Categories: any/less than 3 months/3–5months / 6–12 months/more than 1 year). A 2 × 5 contingency table is obtained and given inTable 1.

In the data collection process, families were taken as sampling units, and each member ofa family was taken as an observation unit. It is quite probable that if a member of a familyhas a limitation due to a psychological problem such as depression, anxiety or an emotionalproblem, another member of the same family is affected from this limitation and may develop apsychological problem resulting in another limitation in the family. Therefore, we expect to haveintraclass correlations in this data set.

We obtain the following vectors for calculations. The vector S includes number of t = 2, . . . , 7sized families.

g1 = (g1jk) =(

0 0 0 1 350 1 1 4 25

)N = (Njk) =

(205 1 1 4 125190 3 3 11 159

)

g2 = (g2jk) =(

1 0 0 0 10 0 0 0 2

)g = (gt) = (0 100 57 32 6 11 5)T

gt = 0, t �= 1, 2 S = (68 104 57 32 6 11 5)T

In addition, we derived a 209 × 2 × 5 matrix n = (nijk) for i �= C, and it is available upon requestfrom the author. Since the three dimensional matrix g has a small number of counts, we expect tohave small intraclass correlations. We have chosen such a data set because we intended to exhibitthe severity of the impacts of intraclass correlations on the estimates and significance tests of logodds ratios even if they are small. Four log odds ratios are defined between columns and rows ofTable 1 (� = 1, . . . , 4). Classical estimates of log odds ratios, odds ratios, standard deviations andclassical estimates of 95% confidence intervals of log odds are given in Table 2.

We conclude from the classical estimates that having limitations less than 1 year due to depres-sion, anxiety or an emotional problem in males and females are equally likely. Note that this is astrong inference with such an unbalanced data set, in which some counts are very small, 40% ofcells have an expected cell count less than five, and also there are intraclass correlations arisingfrom the data collection process. All log odds ratios are found insignificant when 95% confidenceintervals of log odds ratios are considered. This inference is also unreliable, especially for λ2 andλ3, due to the intraclass correlations and very small cell counts.

Proposal distributions and implementation conditions of the employed MH algorithm are deter-mined as follows. Supports of proposal distributions of log odds ratios and intraclass correlations

Table 1. Gender – duration of limitation cross tabulation.

Duration

Gender Any <3 months 3–5 months 6–12 months >1 year

Male 206 1 1 4 126Female 190 3 3 11 161

Dow

nloa

ded

by [

Flor

ida

Stat

e U

nive

rsity

] at

18:

12 1

7 O

ctob

er 2

014

Page 10: Bayesian estimation of log odds ratios over two-way contingency tables with intraclass correlated cells

Journal of Applied Statistics 2311

Table 2. Classical estimates and confidence intervals.

95% CI of λ�

� exp(λ�) λ� Sλ�

LB UB

1 3.253 1.179 1.159 −1.093 3.4512 1.000 0.000 1.632 −3.200 3.2003 0.917 −0.086 1.294 −2.622 2.4504 0.465 −0.766 0.596 −1.934 0.402

Notes: CI, confidence interval; LB, lower bound; UB, upper bound.

should be (−∞, ∞) and (0, 1), respectively. Thus, a normal and a beta distribution are chosenas proposal distributions for log odds ratios and intraclass correlations, respectively. Parametersof the proposal distributions affect the acceptance rate and covered portion of the sample by thechain over the spread of the proposal distribution. For a proposal distribution, a large spreadcauses a low acceptance rate, whereas a too small spread causes a low proportion of coverage. Ifthese situations occur, we likely get high autocorrelations in sample values [12]. We tune up theparameters of the proposal distributions according to these issues. The total number of iterationswas taken as 10,000; first 1000 iterations were left as burin period to filtrate the effect of startingvalues; a record was made after each 3 iterations to decrease the effect of autocorrelation. A totalof 10 chains were run under these settings. The overall average of the trace of each parameter wastaken as the posterior estimate of the corresponding parameter. The potential scale reduction factor(R) given by Gelman [20] was used to check the convergence for each parameter. The greatest Rvalues for the elements of θ and λ were 1.01 and 1.008, respectively. Thus, we conclude that theconvergence is achieved for all parameters.

We determine non-informative prior distributions for θ and λ. Thus, our Bayesian infer-ences are comparable with classical counterparts. For the prior distribution of θ, it is takenthat ξt = γt = 1.1, ∀t. This choice makes the prior distribution of θ diffuse enough. We haveno strict information on the prior variances of log odds ratios, and we may get unreasonableposterior estimates for unsuitable prior variances. Therefore, we apply a sensitivity analysis forthe determination of a diffuse prior distribution for log odds ratios. We expect to have poste-rior estimates close to prior information when the value of τ of Section 2.2 is great, and closeto information contained in the sample when the value of τ is small. The mean vector of priordistribution of λ is taken as μ = (0.8, 0.8, 0.8, 0.8)T arbitrarily, and each τ�, for � = 1, . . . , 4 istaken as seen in the horizontal axis of Figure 1 for this prior information. Additionally, we use

Figure 1. Posterior estimates for the first prior information.

Dow

nloa

ded

by [

Flor

ida

Stat

e U

nive

rsity

] at

18:

12 1

7 O

ctob

er 2

014

Page 11: Bayesian estimation of log odds ratios over two-way contingency tables with intraclass correlated cells

2312 H. Demirhan

Figure 2. Posterior estimates for the second prior information.

another prior information in the opposite direction that (−0.8, −0.8, −0.8, −0.8)T in order tomake a cross-check of non-informative posterior estimates. Each τ� is taken as seen in the hor-izontal axis of Figure 2 for the second prior information. If our posterior estimates are indeednon-informative, we should have very similar non-informative posterior estimates for both of theprior means. Posterior estimates of log odds ratios and intraclass correlations are obtained foreach combination of prior information and τ = (τ�) by using the MH algorithm.

The plot of posterior estimates of the log odds ratios versus the values of τ for the first priorinformation is given in Figure 1. We get posterior estimates very close to the prior informationfor greater values of τ as expected. Amount of shrinkage of posterior estimates from the priorinformation is small for both λ2 and λ3. Posterior estimates get closer and closer to the sampleinformation as the value of τ gets smaller. The lines become flatter for the values of τ for which theposterior estimates are supposed to be very close to MLEs that will be obtained under intraclasscorrelations, and hence non-informative posterior estimates. We can decide which value of τ

should be used to get non-informative posterior estimates by investigating the flatness. The flatnessis seen near τ = 0.3 for all log odds ratios in Figure 1.

Plot of posterior estimates of log odds ratios versus the mentioned values of τ for the secondprior information is given in Figure 2. Again we get posterior estimates of log odds ratios closeto the second prior information, which is in the opposite direction to the first prior, for greatervalues of τ . Intervals of τ in which the lines get flatter are not the same for all log odds ratios inFigure 2. The flatness is seen for τ1 = 0.04, τ2 = τ3 = 0.0175 and τ4 = 0.07. Similar posteriorestimates of log odds ratios to those obtained for the first prior are obtained for these values ofτ , and hence they are supposed to be non-informative and close to the MLEs. We obtain similarnon-informative posterior estimates under two distinct prior information, which are in oppositedirections. This implies that our Bayesian estimation approach is consistent for different priorsettings, and posterior estimates obtained under the first and second prior settings are close to themaximum-likelihood counterparts.

Posterior estimates of ICCs are stable near the values presented in Table 3 for each value of theτ under both the first and second prior information settings. Consequently, we report posteriorestimates for τ = 0.33 under the first prior information. Grand averages of acceptance rates over10 parallel chains are 0.50, 0.53, 0.53 and 0.50 for each log odds ratio, and 0.66, 0.67, 0.67,0.69, 0.67 and 0.70 for each intraclass correlation. The latter acceptance rates are greater than0.50. This value is suggested for normal proposal distributions, but our proposal distribution isbeta. Grand averages of autocorrelations over 10 parallel chains at the first lag are 0.009, 0.016,

Dow

nloa

ded

by [

Flor

ida

Stat

e U

nive

rsity

] at

18:

12 1

7 O

ctob

er 2

014

Page 12: Bayesian estimation of log odds ratios over two-way contingency tables with intraclass correlated cells

Journal of Applied Statistics 2313

Table 3. Posterior estimates, estimated standard deviations and HPD intervals for log odds ratios and intraclasscorrelations.

95% HPDI of θt 95% HPDI of λ�

t θt Sθt

LB UB � exp(λ�) λ� Sλ�

LB UB

2 0.012 0.011 0.000 0.034 1 2.894 1.063 0.442 0.551 1.6163 0.021 0.018 0.000 0.058 2 2.260 0.815 0.398 0.507 1.6224 0.037 0.033 0.000 0.103 3 2.376 0.865 0.401 0.542 1.6145 0.146 0.112 0.000 0.369 4 3.168 1.153 0.445 0.507 1.6556 0.093 0.076 0.000 0.2507 0.173 0.130 0.000 0.433

Notes: HPDI, highest probability density interval; LB, lower bound; UB, upper bound.

0.014 and 0.013 for each log odds ratio, and −0.010, 0.009, 0.016, 0.0135, 0.038 and −0.015 foreach intraclass correlation. These imply the appropriateness of the used proposal distributions.Non-informative posterior estimates of intraclass correlations (θt) and odds ratios (exp(λ�)), theirestimated standard deviations (Sθt

and Sλ�) and 95% HPD intervals are given in Table 3. We

also present posterior estimates of the log odds ratios in terms of odds ratios to enhance theinterpretation. In addition, MCMC diagnostic plots for θ and λ are given in the supplementarymaterial.

Although estimates of intraclass correlations are very small for families of size 2, 3 and 4, andhigher for the rest, non-informative Bayesian estimates of odds ratios are very different from theclassical counter parts presented in Table 2. This implies that even a small intraclass correlationcan effect the estimates of log odds ratios very much. Especially, λ2, λ3 and λ4 are severely affectedby disregarding the intraclass correlations. According to the exp(λ1), when compared with peoplewho have had a limitation between 6 and 12 months due to the considered psychological problems,the odds of suffering from one of the problems for males who have had a limitation more than1 year is 3.17 times the odds of suffering from the same causes for females who have had alimitation more than 1 year. Note that the classical estimate of this odds ratio is 0.465. Effect ofintraclass correlations is apparent in this result also.

We obtain non-informative Bayesian estimates of cell probabilities (πjk) and intraclass correla-tions (θt) by the approach of Nandram and Choi [26]. While the posterior calculations were beingdone, estimates of odds ratios (exp(λ�)) were also calculated by using the posterior estimates ofcell probabilities, and we call them indirect estimates of odds ratios. Although the invariance ofprior distributions is not guaranteed, we present these estimates in order to see their inappropri-ateness. If indirect estimation is suitable, we should have close estimates for odds ratios in boththe direct and indirect ways. In addition, classical estimates of cell probabilities are provided toevaluate the effect of intraclass correlations on the estimates of cell probabilities. Posterior esti-mates of cell probabilities (π) and intraclass correlations (θt), their estimated standard deviations(Sπ and Sθt

) and 95% HPD intervals, and classical estimates of cell probabilities (π) are given in

Table 4. Indirect estimates of log odds ratios (λ), corresponding odds ratios, estimated standarddeviations (Sλ) and 95% HPD intervals of log odds ratios, which are obtained over MCMC output,are given in Table 5.

It is seen from Tables 3 and 4 that non-informative posterior estimates of intraclass correlationsobtained by using two different approaches are close to each other. This strengthens our inferencethat posterior inferences obtained by our approach are non-informative, and hence the sampleinformation about intraclass correlations is successfully derived. Sample information is containedin matrices N, g1, g2, g, S and n. In the classical estimation of odds ratios, only N is considered. Inthe indirect estimation, all matrices are included but the many-to-one transformation is employed

Dow

nloa

ded

by [

Flor

ida

Stat

e U

nive

rsity

] at

18:

12 1

7 O

ctob

er 2

014

Page 13: Bayesian estimation of log odds ratios over two-way contingency tables with intraclass correlated cells

2314 H. Demirhan

Table 4. Posterior estimates, estimated standard deviations and HPD intervals of cell probabilities andintraclass correlations.

95% HPDI of π 95% HPDI of θt

Cell π π Sπ LB UB t θt Sθt

LB UB

(1, 1) 0.292 0.289 0.017 0.256 0.322 2 0.009 0.009 0.000 0.034(1, 2) 0.001 0.003 0.002 0.000 0.008 3 0.017 0.017 0.000 0.061(1, 3) 0.001 0.003 0.002 0.000 0.008 4 0.029 0.029 0.001 0.106(1, 4) 0.006 0.007 0.003 0.002 0.014 5 0.125 0.110 0.004 0.408(1, 5) 0.178 0.178 0.014 0.151 0.207 6 0.078 0.072 0.002 0.267(2, 1) 0.269 0.265 0.016 0.234 0.298 7 0.143 0.124 0.004 0.458(2, 2) 0.004 0.006 0.003 0.002 0.012(2, 3) 0.004 0.006 0.003 0.002 0.012(2, 4) 0.016 0.017 0.005 0.009 0.027(2, 5) 0.228 0.228 0.016 0.198 0.259

Notes: HPDI, HPD interval; LB, lower bound; UB, upper bound.

Table 5. Indirect estimates, estimated standard deviations and HPDintervals of odds ratios.

95% HPDI of λ�

� exp(λ�) λ� Sλ�

LB UB

1 2.522 0.925 0.963 −0.922 2.8052 0.999 −0.001 1.345 −2.860 2.4813 1.100 0.095 1.110 −2.371 2.0814 0.504 −0.685 0.564 −1.950 0.314

Notes: HPDI, HPD interval; LB, lower bound; UB, upper bound.

to obtain estimates of odds ratios. In our procedure, all matrices are included and log odds ratiosare directly modeled. Indirect estimates of λ2 and λ3 and their standard deviations are very closeto their classical counterparts. However, if intraclass correlations are appropriately taken intoaccount, we should see the effect of information contained in g1, g2, g, S and n on the estimatesof, at least, λ2 and λ3 in Table 5. Our posterior estimates of λ2, λ3 and λ4 are different from thoseobtained by classical and indirect ways due to the effect of intraclass correlations.

We set λr0 and θ t

0 of Section 3 to 0 for r = 1, . . . , 4 and t = 2, . . . , 7, respectively. In order toconduct the hypothesis tests for log odds ratios and intraclass correlations, values of P(LR(β) <

0.1 | y) for each parameter are given in Table 6.According to the posterior probabilities, we have a strong evidence against the insignificance of

all log odds ratios. Zero falls outside the 95% HPD intervals of all log odds ratios. This supports the

Table 6. p(LR(β) < 0.1 | y) values.

t For θt � For λ�

2 0.476 1 0.9983 0.461 2 0.9734 0.450 3 0.9885 0.469 4 0.9796 0.4567 0.469

Dow

nloa

ded

by [

Flor

ida

Stat

e U

nive

rsity

] at

18:

12 1

7 O

ctob

er 2

014

Page 14: Bayesian estimation of log odds ratios over two-way contingency tables with intraclass correlated cells

Journal of Applied Statistics 2315

inferences obtained by p(LR(β) | y) for log odds ratios. The evidence against the insignificanceof intraclass correlations is weak. This is supported by close to zero lower bounds of 95% HPDintervals of intraclass correlations. When we use classical confidence intervals of log odds ratios(Table 2), we infer that all log odds ratios are insignificant at 0.05 level of significance. Bayesianand classical inferences about the significance of log odd ratios conspicuously differ. It is seenhere that we may draw misleading inferences with classical hypothesis testing in the presence ofintraclass correlations.

5. Discussion

In this article, we deal with Bayesian estimation and testing of log odds ratios and intraclass correla-tions from a two-way contingency table under the existence of intraclass correlations. We proposea Bayesian approach for the estimation of log odds ratios, intraclass correlation coefficients andtheir HPD intervals. We estimate posterior probabilities of LRs for intraclass correlations and logodds ratios over the output of the MH algorithm and use them for the hypothesis testing of logodds ratios and intraclass correlations. This is an application of the approach of Aitkin et al. [3]to the analysis of log odds ratios and intraclass correlations.

We analyze a 2 × 5 contingency table from the 2006 NHIS conducted by the US NCHS. Wefind out in the analysis of the data set that intraclass correlations are effectual on the estimatesof log odds ratios and results of hypothesis tests. Especially, when cell counts of the consideredcontingency table are unbalanced, classical estimates are already unreliable even if there is no intr-aclass correlations. If there are also intraclass correlated cells, classical estimates and hypothesistests are severely affected. One can get just the opposite results from the classical hypothesis tests.On the other hand, one can obtain estimates of odds ratios effortless by using the classical theory,whereas it is more difficult to obtain Bayesian estimates of odds ratios. Bearing this conflict inmind, the classical theory should be used cautiously unless the sample size is very large and thecontingency table of interest is well-balanced.

Epidemiology and population genetics studies are two important fields in which the pro-posed approaches can be applied effectively. Intraclass correlation patterns arise in family orneighborhood-level disease clustering. It is possible to have intraclass correlations within variouscohorts in cohort studies. It is nearly inevitable to have intraclass correlations in population healthsurveys. Deterministic sampling of clusters in genetic epidemiological studies causes intraclasscorrelations within some clusters. In a population genetics study of a familial disease, it is highlypossible to have intraclass correlations. In these studies, data are mostly presented by contin-gency tables, and analyses are carried on these tables and odds ratios. Therefore, the approachesproposed in this article is suitable in these fields.

In conclusion, although there are difficulties in the posterior computations, use of the proposedBayesian setting in the estimation of log odds ratios and intraclass correlation coefficients overtwo-way contingency tables and their significance tests is seen as reliable and suggestible.

Acknowledgements

We thank three referees for constructive criticisms and suggestions that significantly improved the clarity and quality ofthe manuscript.

References

[1] A. Agresti, Categorical Data Analysis, John Wiley and Sons, New York, 2002.[2] M. Aitkin, The calibration of P-values, posterior Bayes factors and the AIC from the posterior distribution of the

likelihood (with discussion), Stat. Comput. 7 (1997), pp. 253–272.

Dow

nloa

ded

by [

Flor

ida

Stat

e U

nive

rsity

] at

18:

12 1

7 O

ctob

er 2

014

Page 15: Bayesian estimation of log odds ratios over two-way contingency tables with intraclass correlated cells

2316 H. Demirhan

[3] M. Aitkin, J.R. Boys, and T. Chadwick, Bayesian point null hypothesis testing via the posterior likelihood ratio,Stat. Comput. 15 (2005), pp. 217–230.

[4] J.H. Albert, Bayesian estimation of odds ratios under prior hypotheses of independence and exchangeability, J. Stat.Comput. Simul. 27 (1987), pp. 251–268.

[5] P.M. Altham, Discrete variable analysis for individuals grouped into families, Biometrika 63 (1976), pp. 263–269.[6] E.G. Atenafu, J.S. Hamid, T. To, A.R. Willan, B.M. Feldman, and J. Beyene, Bias-corrected estimator for intraclass

correlation coefficient in the balanced one-way random effects model, BMC Med. Res. Methodol. 12 (2012), articleno. 126.

[7] G.E.P. Box and G.C. Tiao, Bayesian Inference in Statistical Analysis, Addison-Wesley, Reading, MA, 1973.[8] S.S. Brier, Analysis of contingency tables under cluster sampling, Biometrika 67 (1980), pp. 591–596.[9] B.D. Burch and I.R. Harris, Bayesian estimators of the intraclass correlation coefficient in the one-way random

effects model, Comm. Statist. Theory Methods 28 (1999), pp. 1247–1272.[10] M.-H. Chen, Q.-M. Shao, and J.G. Ibrahim, Monte Carlo Methods in Bayesian Computation, Springer, Berlin, 2000.[11] S. Chib, Marginal likelihood from the Gibbs output, J. Amer. Statist. Assoc. 90 (1995), pp. 1313–1321.[12] S. Chib and E. Greenberg, Understanding the Metropolis–Hastings algorithm,Amer. Statist. 49 (1995), pp. 327–335.[13] J.W. Choi and R.B. McHugh, A reduction factor in goodness-of-fit and independence tests for clustered and weighted

observations, Biometrics 45 (1989), pp. 979–996.[14] J.E. Cohen, The distribution of the chi-squared statistic under clustered sampling from contingency tables, J. Amer.

Statist. Assoc. 71 (1976), pp. 665–670.[15] D. Commenges and H. Jacqmin, The intraclass correlation coefficient: Distribution-free definition and test,

Biometrics 50 (1994), pp. 517–526.[16] H. Demirhan and C. Hamurkaroglu, Bayesian estimation of log odds ratios from R × C and 2 × 2 × K contingency

tables, Statist. Neerlandica 62 (2008), pp. 405–424.[17] A.P. Dempster, The direct use of likelihood in significance testing, in Proc. Conf. Foundational Questions in Statistical

Inference, O. Barndorff-Nielsen, P. Blaesild, and G. Sihon, eds., Kluwer, Hingham, MA, 1974, pp. 335–352.[18] A.P. Dempster, The direct use of likelihood in significance testing, Statist. Comput. 7 (1997), pp. 247–252.[19] A. Donner, A review of inference procedures for the intraclass correlation coefficient in the one-way random effects

model, Int. Stat. Rev. 54 (1986), pp. 67–82.[20] A. Gelman, Inference and monitoring convergence, in Markov Chain Monte Carlo in Practice, W.R. Gilks, S.

Richardson, and D.J. Spiegelhalter, eds., Chapman&Hall/CRC, London, 1996, pp. 131–143.[21] J.K. Gosh, M. Delampady, and T. Samanta, An Introduction to Bayesian Analysis Theory and Methods, Springer,

New York, 2006.[22] I.R. Harris and B.D. Burch, Pivotal estimation with applications for the intraclass correlation coefficient in the

balanced one-way random effects model, J. Statist. Plann. Inference 83 (2000), pp. 257–276.[23] A. Helu and D.N. Naik, Estimation of sib–sib correlation via a Kotz-type density function, Comm. Statist. Theory

Methods 36 (2007), pp. 1021–1029.[24] J. Heo and Y.H. Kim, Developing noninformative priors for the familial data, J. Korean Statist. Soc. 36 (2007),

pp. 77–91.[25] R.E. Kass and A.E. Raftery, Bayes factors, J. Amer. Statist. Assoc. 90 (1995), pp. 773–795.[26] B. Nandram and J.W. Choi, Bayesian analysis of a two-way categorical table incorporating intraclass correlation,

J. Stat. Comput. Simul. 76 (2006), pp. 233–249.[27] B. Nandram and J.W. Choi, Alternative tests of independence in two-way categorical tables, J. Data Sci. 5 (2007),

pp. 217–237.[28] R.-D. Ye and S.-G. Wang, Inferences on the intraclass correlation coefficients in the unbalanced two-way random

effects model with interaction, J. Statist. Plann. Inference 139 (2009), pp. 396–410.

Dow

nloa

ded

by [

Flor

ida

Stat

e U

nive

rsity

] at

18:

12 1

7 O

ctob

er 2

014