
Computational Statistics and Data Analysis 60 (2013) 90–96


journal homepage: www.elsevier.com/locate/csda

Objective Bayesian higher-order asymptotics in models with nuisance parameters

Laura Ventura a,∗, Nicola Sartori a, Walter Racugno b

a Department of Statistical Sciences, University of Padova, Italy
b Department of Mathematics and Informatics, University of Cagliari, Italy

Article info

Article history:
Received 11 May 2012
Received in revised form 29 October 2012
Accepted 30 October 2012
Available online 7 November 2012

Keywords:
Asymptotic expansion
Directed and modified directed likelihood
Matching prior
Modified profile likelihood
Tail area probability

Abstract

A higher-order approximation to the marginal posterior distribution for a scalar parameter of interest in the presence of nuisance parameters is proposed. The approximation is obtained using a matching prior. The procedure improves on the first-order normal approximation and has several advantages. It requires neither prior elicitation on the nuisance parameters, nor numerical integration, nor Monte Carlo simulation, and it enables us to perform accurate Bayesian inference even for small sample sizes. Numerical illustrations are given for models of practical interest, such as linear non-normal models and logistic regression. Finally, it is shown how the proposed approximation can routinely be applied in practice using results from likelihood asymptotics and the R package bundle hoa.

© 2012 Elsevier B.V. All rights reserved.

1. Introduction

Let us consider a model with a scalar parameter of interest ψ, a d-dimensional nuisance parameter λ and likelihood function $L(\psi, \lambda) = L(\psi, \lambda; y)$, where $y = (y_1, \ldots, y_n)$ is a random sample of size n. Given a prior $\pi(\psi, \lambda)$ over the entire parameter, under general regularity conditions Bayesian inference about ψ is based on the marginal posterior distribution

$$\pi_m(\psi \mid y) \propto \int \pi(\psi, \lambda)\, L(\psi, \lambda)\, d\lambda. \tag{1}$$

For objective Bayesian inference, when agreement between Bayesian and non-Bayesian inference is of interest, the class of matching priors can be considered (see, for instance, Datta and Mukerjee, 2004, for a comprehensive review). In particular, in the presence of nuisance parameters, a suitable solution was discussed by Tibshirani (1989), who suggested a prior such that the resulting marginal posterior intervals have accurate frequentist coverage.

In this paper we discuss higher-order asymptotic inference based on (1), using Tibshirani’s matching prior. In particular, we derive an explicit higher-order approximation for $\pi_m(\psi \mid y)$ in terms of a higher-order pivotal quantity. Such an approximation enables one to perform accurate Bayesian inference on ψ even for small sample sizes, with a remarkable improvement over the first-order normal approximation.

The proposed procedure has several advantages, since it requires neither elicitation on the nuisance parameters, nor numerical integration, nor Markov Chain Monte Carlo (MCMC) simulation. A remarkable further advantage in the use of such an approximation is that its expression automatically includes the matching prior without requiring its explicit computation. Moreover, the result can be used for tail area approximation. In particular, we show that the resulting credible intervals coincide with the frequentist confidence intervals based on the modified directed likelihood (see, e.g., Barndorff-Nielsen, 1986), thus suggesting its use also for Bayesian inference.

∗ Correspondence to: Department of Statistical Sciences, University of Padova, Via C. Battisti 241, Italy. Tel.: +39 0498274177; fax: +39 0498274170. E-mail address: [email protected] (L. Ventura).

0167-9473/$ – see front matter © 2012 Elsevier B.V. All rights reserved. doi:10.1016/j.csda.2012.10.022

Examples in the context of models of practical interest, such as linear non-normal models and logistic regression, are discussed. Moreover, we also show how the proposed accurate approximation can routinely be applied in practice using results from likelihood asymptotics and the R package bundle hoa (Brazzale et al., 2007).

The outline of the paper is as follows. Background theory is briefly reviewed in Section 2. In Section 3 we discuss higher-order asymptotics for $\pi_m(\psi \mid y)$. Examples are illustrated in Section 4. Some final remarks conclude the paper.

2. Matching priors

Assume that the likelihood $L(\psi, \phi)$ is given under the orthogonal parameterization $(\psi, \phi)$. Tibshirani (1989) and Nicolaou (1993) showed that a class of matching priors for $(\psi, \phi)$, i.e., priors that ensure approximate frequentist validity of posterior credible sets, is given by

$$\pi_T(\psi, \phi) \propto i_{\psi\psi}(\psi, \phi)^{1/2}\, g(\phi), \tag{2}$$

where $g(\phi)$ is an arbitrary function and $i_{\psi\psi}(\psi, \phi)$ is the $(\psi, \psi)$ component of the Fisher information matrix $i(\psi, \phi)$. The computation of the marginal posterior distribution (1) with this prior requires the choice of the arbitrary function $g(\phi)$ in (2) and possibly cumbersome numerical integration.

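These drawbacks can be avoided using results in Ventura et al. (2009, Appendix B). In particular, the marginal posterior distribution (1) based on the prior $\pi_T(\psi, \phi)$ can be written to second order in the original non-orthogonal parameterization $(\psi, \lambda)$ as

$$\pi_m(\psi \mid y) \propto \pi_{mp}(\psi)\, L_M(\psi), \tag{3}$$

with

$$\pi_{mp}(\psi) \propto i_{\psi\psi\cdot\lambda}(\psi, \hat\lambda_\psi)^{1/2} \tag{4}$$

the matching prior for ψ only (see Ventura et al., 2009; Ventura and Racugno, 2011), and

$$L_M(\psi) = L_P(\psi)\, M(\psi) \tag{5}$$

the modified profile likelihood for ψ (see, e.g., Barndorff-Nielsen and Cox, 1994, Chapter 8; Severini, 2000, Chapter 9). In (4),

$$i_{\psi\psi\cdot\lambda}(\psi, \lambda) = i_{\psi\psi}(\psi, \lambda) - i_{\psi\lambda}(\psi, \lambda)\, i_{\lambda\lambda}(\psi, \lambda)^{-1}\, i_{\lambda\psi}(\psi, \lambda)$$

is the partial information for ψ, where $i_{\psi\psi}(\psi, \lambda)$, $i_{\psi\lambda}(\psi, \lambda)$, $i_{\lambda\lambda}(\psi, \lambda)$ and $i_{\lambda\psi}(\psi, \lambda)$ are blocks of the expected Fisher information $i(\psi, \lambda)$ from the log-likelihood $\ell(\psi, \lambda) = \log L(\psi, \lambda)$, and $\hat\lambda_\psi$ is the Maximum Likelihood Estimator (MLE) of λ for fixed ψ. In (5), $L_P(\psi) = \exp(\ell_P(\psi)) = \exp(\ell(\psi, \hat\lambda_\psi))$ is the profile likelihood and $M(\psi)$ is a suitably defined correction term. For instance, the modified profile likelihood of Barndorff-Nielsen (1983) is given by

$$M(\psi) = \frac{|j_{\lambda\lambda}(\psi, \hat\lambda_\psi)|^{1/2}\, |j_{\lambda\lambda}(\hat\psi, \hat\lambda)|^{1/2}}{|\ell_{\lambda;\hat\lambda}(\psi, \hat\lambda_\psi)|}, \tag{6}$$

where $j_{\lambda\lambda}(\psi, \lambda)$ is the $(\lambda, \lambda)$-block of the observed information $j(\psi, \lambda)$, $\ell_{\lambda;\hat\lambda}(\psi, \lambda) = \partial^2 \ell(\psi, \lambda)/(\partial\lambda\, \partial\hat\lambda^{T})$ is a sample space derivative, and $(\hat\psi, \hat\lambda)$ is the MLE of $(\psi, \lambda)$. Other expressions for $M(\psi)$ are found in Severini (2000, Chapter 9); see also Pace and Salvan (2006).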
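The partial information used in (4) is a Schur complement of the expected information matrix. The following Python sketch shows that computation for a small dense matrix with ψ stored in position 0. The function names `solve` and `partial_information` are illustrative choices, not part of the paper or of the hoa bundle, and the numerical example uses the standard gamma shape/rate information blocks as an assumption for illustration.

```python
import math


def solve(a, b):
    """Solve a x = b for a small dense system (Gaussian elimination, partial pivoting)."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def partial_information(info):
    """Partial information for psi: i_pp - i_pl i_ll^{-1} i_lp, with psi in position 0."""
    i_pp = info[0][0]
    i_pl = info[0][1:]                     # row block i_{psi lambda}
    i_ll = [row[1:] for row in info[1:]]   # block i_{lambda lambda}
    i_lp = [row[0] for row in info[1:]]    # column block i_{lambda psi}
    w = solve(i_ll, i_lp)                  # i_ll^{-1} i_lp without forming the inverse
    return i_pp - sum(a * b for a, b in zip(i_pl, w))


# Illustrative input: gamma model with shape psi = 2, rate lambda = 1, n = 10.
# trigamma(2) = pi^2 / 6 - 1 is a known closed form.
n, psi, lam = 10, 2.0, 1.0
info = [[n * (math.pi ** 2 / 6 - 1), -n / lam],
        [-n / lam, n * psi / lam ** 2]]
i_partial = partial_information(info)
```

Solving the linear system rather than inverting the nuisance block is a design choice that keeps the sketch stable when the nuisance parameter has more than one component.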

3. Higher-order asymptotics for πm(ψ|y)

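In the following, we discuss higher-order asymptotic inference for $\pi_m(\psi \mid y)$ based on a matching prior. Let us denote by $\ell_P'(\psi) = \partial\ell_P(\psi)/\partial\psi$ and $j_P(\psi) = -\partial\ell_P'(\psi)/\partial\psi$ the profile score function and the profile observed information, respectively. Moreover, let $s_P(\psi) = \ell_P'(\psi)/j_P(\hat\psi)^{1/2}$ be the profile score statistic and let $r_P(\psi) = \mathrm{sign}(\hat\psi - \psi)\{2(\ell_P(\hat\psi) - \ell_P(\psi))\}^{1/2}$ be the directed profile likelihood. Finally, let

$$r_P^*(\psi) = r_P(\psi) + \frac{1}{r_P(\psi)} \log\frac{q(\psi)}{r_P(\psi)} \tag{7}$$

be the modified directed likelihood, where $q(\psi)$ is a suitable quantity; see, e.g., Severini (2000, Chapter 7) for a review. For instance, Barndorff-Nielsen and Chamberlin (1994) used

$$q(\psi) = \frac{\ell_P'(\psi)}{j_P(\hat\psi)^{1/2}}\, \frac{i_{\psi\psi\cdot\lambda}(\hat\psi, \hat\lambda)^{1/2}}{i_{\psi\psi\cdot\lambda}(\psi, \hat\lambda_\psi)^{1/2}}\, \frac{|\ell_{\lambda;\hat\lambda}(\psi, \hat\lambda_\psi)|}{|j_{\lambda\lambda}(\hat\psi, \hat\lambda)|^{1/2}\, |j_{\lambda\lambda}(\psi, \hat\lambda_\psi)|^{1/2}}. \tag{8}$$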


The modified directed likelihood (7) is well known in the non-Bayesian framework as a higher-order pivotal quantity, with a standard normal null distribution to third-order accuracy, that is, with an error of order $O(n^{-3/2})$.
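Formula (7) is simple to evaluate once the directed likelihood and the quantity q are available. A minimal Python sketch, in which the function name `r_star` is an illustrative choice and the inputs are plain numbers supplied by the caller:

```python
import math


def r_star(r, q):
    """Modified directed likelihood r* = r + log(q / r) / r, as in formula (7)."""
    if r == 0.0 or q / r <= 0.0:
        # (7) is undefined at r = 0 and requires q and r to share the same sign
        raise ValueError("r* needs r != 0 and q / r > 0")
    return r + math.log(q / r) / r


same = r_star(1.5, 1.5)              # q = r leaves the statistic unchanged
shifted = r_star(2.0, 2.0 * math.e)  # log(q / r) = 1, so r* = 2 + 1 / 2
```

The explicit guard makes the singularity at r = 0 visible to the caller instead of returning a misleading number.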

The prior $\pi_{mp}(\psi)$ used in (3) is also a strong matching prior (Fraser and Reid, 2002; Ventura and Racugno, 2011) in the sense that a frequentist p-value coincides with a Bayesian posterior survivor probability to a high degree of approximation. In general, a strong matching prior for ψ only, which guarantees an equivalence between a frequentist p-value and a Bayesian tail area probability, can be expressed as

$$\pi^*(\psi) \propto r_P(\psi)/(M(\psi)\, q(\psi)). \tag{9}$$

This prior, as well as (4), is a data dependent prior. Such priors, although unorthodox in the Bayesian framework, are used in several contexts; see, for instance, Reid (2003, Section 3.5).

3.1. Approximation for πm(ψ |y)

The following proposition gives a second-order approximation for (3).

Proposition 1. The marginal posterior distribution for ψ based on the matching prior (4) can be written in the following form

$$\pi_m(\psi \mid y) \propto \exp\Bigl(-\tfrac{1}{2}\, r_P^*(\psi)^2\Bigr) \left|\frac{s_P(\psi)}{r_P(\psi)}\right|, \tag{10}$$

where the symbol ‘‘∝’’ means second order equivalence.

Proof. Consider the strong matching prior $\pi^*(\psi)$ given in (9), and let $r_M(\psi) = \mathrm{sign}(\hat\psi_M - \psi)\{2(\log L_M(\hat\psi_M) - \log L_M(\psi))\}^{1/2}$ be the directed modified profile likelihood, where $\hat\psi_M$ is the maximizer of $L_M(\psi)$. Then, using results in Sartori et al. (1999) and discarding additive constants not depending on ψ, we have

$$\begin{aligned}
\log(L_M(\psi)\, \pi^*(\psi)) &= -\tfrac{1}{2}(r_M(\psi))^2 + \log\pi^*(\psi) \\
&= -\tfrac{1}{2}(r_P(\psi))^2 - r_P(\psi)\, NP + \log\pi^*(\psi) \\
&= -\tfrac{1}{2}(r_P(\psi))^2 - r_P(\psi)\Bigl[NP + \frac{1}{r_P(\psi)} \log\frac{1}{\pi^*(\psi)}\Bigr] \\
&= -\tfrac{1}{2}(r_P(\psi))^2 - r_P(\psi)\, [NP + INF],
\end{aligned}$$

where $NP = -r_P(\psi)^{-1} \log M(\psi)$ and $INF = r_P(\psi)^{-1} \log(q(\psi) M(\psi)/r_P(\psi))$ are known as the nuisance parameters and the information adjustments, respectively, in the modified directed likelihood decomposition (see, e.g., Barndorff-Nielsen and Cox, 1994, Section 6.6.4). Then,

$$\begin{aligned}
\log(L_M(\psi)\, \pi^*(\psi)) &= -\tfrac{1}{2}(r_P(\psi) + NP + INF)^2 + O_p(n^{-1}) \\
&= -\tfrac{1}{2}\, r_P^*(\psi)^2 + O_p(n^{-1}).
\end{aligned} \tag{11}$$

Using (6) and (8) we have $\pi^*(\psi) \propto \pi_{mp}(\psi)\, r_P(\psi)/s_P(\psi)$, and therefore we obtain

$$\log\pi_m(\psi \mid y) = -\tfrac{1}{2}\, r_P^*(\psi)^2 + \log\left|\frac{s_P(\psi)}{r_P(\psi)}\right| + O_p(n^{-1}), \tag{12}$$

and thus (10) follows. Note that the approximation is with an error of order $O(n^{-1})$ in a moderate deviation region, that is for $\psi - \hat\psi = O(n^{-1/2})$. □
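The step leading to (11) is a completion of the square. As a sketch, write $A = NP + INF$ and assume, as is standard in this setting but not spelled out in the proof, that $A = O_p(n^{-1/2})$ in the moderate deviation region:

```latex
% A = NP + INF, assumed O_p(n^{-1/2}) for \psi - \hat\psi = O(n^{-1/2})
\begin{align*}
-\tfrac12\, r_P(\psi)^2 - r_P(\psi)\,A
  &= -\tfrac12\,\{r_P(\psi) + A\}^2 + \tfrac12\,A^2 \\
  &= -\tfrac12\,\{r_P(\psi) + A\}^2 + O_p(n^{-1}), \\
r_P(\psi) + A
  &= r_P(\psi) + \frac{1}{r_P(\psi)}
     \Bigl\{-\log M(\psi) + \log\frac{q(\psi)\,M(\psi)}{r_P(\psi)}\Bigr\} \\
  &= r_P(\psi) + \frac{1}{r_P(\psi)}\log\frac{q(\psi)}{r_P(\psi)}
   = r_P^*(\psi).
\end{align*}
```

The factor $M(\psi)$ cancels between the two adjustments, which is why only $q(\psi)$ and $r_P(\psi)$ appear in (7).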

A remarkable advantage of this approximation is that its expression automatically includes the matching prior, without requiring its explicit computation.

Moreover, note that in (10) the modified directed likelihood (7) may be replaced by the modified directed likelihood of Barndorff-Nielsen (1991) or by the adjusted directed likelihoods discussed in Barndorff-Nielsen and Chamberlin (1994). Indeed, all these versions are closely related to each other in the sense that they are equivalent to second order (see Barndorff-Nielsen and Chamberlin, 1994, Section 5).

As a final remark, note that the approximation (10), like the modified directed likelihood $r_P^*(\psi)$, has a numerical singularity at $\psi = \hat\psi$. However, this is not a concern for practical purposes as it can be avoided by numerical interpolation (see Brazzale et al., 2007, Section 9.3).


Table 1. Frequentist coverage probabilities of approximate 0.95 equi-tailed credible intervals, with left and right noncoverage probabilities (in brackets), under the gamma model.

| | n = 5 | n = 10 | n = 15 | n = 20 |
|---|---|---|---|---|
| $\pi_m^a(\psi \mid y)$ | 0.900 (0.009, 0.089) | 0.928 (0.010, 0.061) | 0.937 (0.014, 0.048) | 0.942 (0.014, 0.044) |
| $\pi_m(\psi \mid y)$ | 0.948 (0.024, 0.027) | 0.951 (0.022, 0.027) | 0.950 (0.026, 0.024) | 0.950 (0.024, 0.026) |

3.2. Tail area approximation

Accurate tail probabilities are easily computable by direct integration of $\pi_m(\psi \mid y)$. In particular, using results in Ventura and Racugno (2011), it can be shown that

$$\int_{-\infty}^{\psi_0} \pi_m(\psi \mid y)\, d\psi \equiv \Phi(r_P^*(\psi_0)), \tag{13}$$

where $r_P^*(\psi)$ is the modified directed likelihood given in (7), with formula (8) for $q(\psi)$, the symbol ‘‘≡’’ means third-order asymptotic equivalence, and $\Phi(\cdot)$ is the standard normal distribution function.

In practice, using (13), an asymptotic equi-tailed credible interval for ψ can be computed as $\{\psi : |r_P^*(\psi)| \le z_{1-\alpha/2}\}$, i.e., as a confidence interval for ψ based on (7) with approximate level $(1 - \alpha)$, where $z_{1-\alpha/2}$ is the $(1 - \alpha/2)$-quantile of the standard normal distribution. Therefore, this credible interval is also an accurate likelihood-based confidence interval for ψ with approximate level $(1 - \alpha)$. Note also that from (13) the median posterior estimator (MPE) of (3) can be computed as the solution $\psi^*$ in ψ of the estimating equation $r_P^*(\psi) = 0$, and thus it coincides with the frequentist estimator defined as the zero-level confidence interval based on $r_P^*(\psi)$ (Skovgaard, 1989). Such an estimator has been shown to be a refinement of the maximum likelihood estimator $\hat\psi$ (see Pace and Salvan, 1999 and Giummolé and Ventura, 2002).
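Both the interval $\{\psi : |r_P^*(\psi)| \le z_{1-\alpha/2}\}$ and the MPE reduce to one-dimensional root finding. The Python sketch below assumes the caller supplies a continuous, monotone function `rstar` and a bracket `[lo, hi]` that contains all three roots; the names `bisect_root` and `credible_summary` and the linear toy pivot are hypothetical and serve only to exercise the code.

```python
from statistics import NormalDist


def bisect_root(f, lo, hi, tol=1e-10, max_iter=200):
    """Root of f on [lo, hi] by bisection; f(lo) and f(hi) must differ in sign."""
    f_lo, f_hi = f(lo), f(hi)
    if f_lo * f_hi > 0:
        raise ValueError("root not bracketed")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


def credible_summary(rstar, lo, hi, alpha=0.05):
    """Return (lower, median, upper) from the equations r*(psi) = +z, 0, -z."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    a = bisect_root(lambda p: rstar(p) - z, lo, hi)
    b = bisect_root(lambda p: rstar(p) + z, lo, hi)
    med = bisect_root(rstar, lo, hi)
    return min(a, b), med, max(a, b)


# Toy pivot, linear in psi; a real r* would come from (7) for a fitted model.
toy_rstar = lambda p: (1.05 - p) / 0.5
lower, median, upper = credible_summary(toy_rstar, -5.0, 5.0)
```

With a genuine $r_P^*$ the supplied function should already bridge the singularity at the MLE, for instance by interpolation, so that the bracket can safely span it.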

4. Examples and numerical illustrations

The aim of this section is to provide some illustrations of the accuracy of the higher-order approximation (10) and of the related tail area approximation (13). In particular, in applications of practical interest, we study the accuracy of (10), in comparison to the first-order approximation

$$\pi_m^a(\psi \mid y) \sim N(\hat\psi, j_P(\hat\psi)^{-1}), \tag{14}$$

using likelihood asymptotics tools and the R package bundle hoa (Brazzale et al., 2007).

Example 1 (Gamma Distribution). Let $(y_1, \ldots, y_n)$ be a random sample from the gamma density $p(y; \psi, \lambda) = \lambda^\psi y^{\psi-1} \exp(-\lambda y)\, \Gamma(\psi)^{-1}$, $y > 0$, $\psi, \lambda > 0$. We take ψ as the parameter of interest, with the rate (inverse scale) parameter λ as nuisance. The profile log-likelihood is $\ell_P(\psi) = \psi(t - n) - n \log\Gamma(\psi) + n\psi \log(\psi/\bar y)$, with $t = \sum \log y_i$ and $\bar y$ the sample mean, and the modified profile log-likelihood is $\ell_M(\psi) = \ell_P(\psi) - 0.5 \log\psi$, since $M(\psi) = 1/\sqrt{\psi}$. Moreover, simple calculations give $i_{\psi\psi\cdot\lambda}(\psi, \lambda) = (n/\psi)(\psi\rho(\psi) - 1) = j_P(\psi)$, with $\rho(\psi) = (\partial^2/\partial\psi^2) \log\Gamma(\psi)$. Note that the matching prior $\pi_{mp}(\psi)$ does not depend on the nuisance parameter λ. This is a general result when ψ is the index parameter of a group model and λ is the group element, and (3) corresponds to the Laplace approximation of the marginal posterior distribution (1) based on the Chang–Eaves reference prior discussed in Datta and Ghosh (1995).
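The quantities of Example 1 can be assembled into $r_P^*(\psi)$ with the standard library alone. The Python sketch below is not the hoa implementation. The expression used for q is our own specialisation of (8) to the gamma model, $q(\psi) = \ell_P'(\psi)\, j_P(\psi)^{-1/2} (\psi/\hat\psi)^{1/2}$, obtained by combining $i_{\psi\psi\cdot\lambda} = j_P$ with $M(\psi) \propto 1/\sqrt{\psi}$ normalised to one at $\hat\psi$; it should be read as an assumption. The data vector is made up, the helper names are illustrative, and the singularity at $\hat\psi$ is bridged by linear interpolation over a small window.

```python
import math
from statistics import NormalDist


def digamma(x):
    """Digamma for x > 0: upward recurrence, then an asymptotic series."""
    acc = 0.0
    while x < 6.0:
        acc -= 1.0 / x
        x += 1.0
    u = 1.0 / (x * x)
    return acc + math.log(x) - 0.5 / x - u * (1 / 12 - u * (1 / 120 - u / 252))


def trigamma(x):
    """Trigamma for x > 0, i.e. rho(psi) in the text."""
    acc = 0.0
    while x < 6.0:
        acc += 1.0 / (x * x)
        x += 1.0
    u = 1.0 / (x * x)
    return acc + 1.0 / x + 0.5 * u + (u / x) * (1 / 6 - u * (1 / 30 - u / 42))


def bisect(f, lo, hi, tol=1e-10):
    """Root of f on a sign-changing bracket [lo, hi]."""
    f_lo = f(lo)
    if f_lo * f(hi) > 0:
        raise ValueError("root not bracketed")
    while hi - lo > tol * (1.0 + abs(lo)):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


def gamma_rstar(y):
    """Return (psi_hat, rstar) for the gamma shape psi, with the rate as nuisance."""
    n = len(y)
    t = sum(math.log(v) for v in y)
    ybar = sum(y) / n

    def lp(psi):     # profile log-likelihood of Example 1, constants dropped
        return psi * (t - n) - n * math.lgamma(psi) + n * psi * math.log(psi / ybar)

    def score(psi):  # derivative of lp with respect to psi
        return t - n * digamma(psi) + n * math.log(psi / ybar)

    def jp(psi):     # profile observed information, equal to the partial information
        return n * (trigamma(psi) - 1.0 / psi)

    psi_hat = bisect(score, 1e-6, 1e6)
    h = 1e-3 * psi_hat  # half-width of the interpolation window around psi_hat

    def raw(psi):
        dev = max(2.0 * (lp(psi_hat) - lp(psi)), 0.0)
        r = math.copysign(math.sqrt(dev), psi_hat - psi)
        q = score(psi) / math.sqrt(jp(psi)) * math.sqrt(psi / psi_hat)  # assumed form
        return r + math.log(q / r) / r

    def rstar(psi):
        if abs(psi - psi_hat) < h:  # bridge the removable singularity linearly
            a, b = raw(psi_hat - h), raw(psi_hat + h)
            return a + (b - a) * (psi - (psi_hat - h)) / (2.0 * h)
        return raw(psi)

    return psi_hat, rstar


y = [0.2, 1.5, 0.7, 2.3, 0.9, 1.1, 0.4, 3.0, 0.6, 1.8]  # made-up data
psi_hat, rstar = gamma_rstar(y)
z = NormalDist().inv_cdf(0.975)
lower = bisect(lambda p: rstar(p) - z, psi_hat / 100, psi_hat)
upper = bisect(lambda p: rstar(p) + z, psi_hat, psi_hat * 100)
median = bisect(rstar, lower, upper)
```

The pair `(lower, upper)` is the 0.95 equi-tailed interval from the tail area result and `median` is the MPE; the search brackets `psi_hat / 100` and `psi_hat * 100` are ad hoc and may need widening for other data.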

The behavior of (10) under the gamma model is illustrated through a simulation study based on 10000 Monte Carlo trials, with ψ = λ = 1. Table 1 gives the empirical frequentist coverages for 0.95 posterior equi-tailed credible intervals with left and right noncoverage probabilities from $\pi_m^a(\psi \mid y)$ and from $\pi_m(\psi \mid y)$. From Table 1 we observe that, for every n, (10) clearly improves on $\pi_m^a(\psi \mid y)$. Larger sample sizes would show, as one would expect, rather little difference between the results of all the procedures.
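A coverage study of this kind can be organised around a small harness that repeatedly simulates data, builds an interval and records on which side the true value is missed. The Python sketch below is a generic illustration and not the authors' simulation code. The name `coverage`, the convention that "left" counts trials where the true value falls below the lower limit, and the toy normal-mean interval are all assumptions made for the example.

```python
import random
from statistics import NormalDist, fmean


def coverage(interval, sampler, truth, trials=10000, seed=1):
    """Empirical coverage with left and right noncoverage for an interval procedure."""
    rng = random.Random(seed)
    below = above = 0
    for _ in range(trials):
        lo, hi = interval(sampler(rng))
        below += truth < lo   # true value to the left of the interval
        above += truth > hi   # true value to the right of the interval
    left, right = below / trials, above / trials
    return 1.0 - left - right, left, right


# Toy check: exact 0.95 interval for a normal mean with known unit variance, n = 5.
z = NormalDist().inv_cdf(0.975)


def sampler(rng, n=5):
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def interval(y):
    m, half = fmean(y), z / len(y) ** 0.5
    return m - half, m + half


cov, left, right = coverage(interval, sampler, truth=0.0)
```

Swapping `interval` for a procedure based on $r_P^*$ and `sampler` for a gamma generator would give a study with the same layout as Table 1.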

We also evaluated the finite-sample properties of the posterior median of (10). The MPEs of (10) and of the first-order approximation are compared in terms of the usual centering and dispersion measures, i.e., bias and standard deviation. From Table 2 it is clear that the MPE of (10) exhibits a smaller bias than the maximum profile estimator.

Example 2 (Survival Times). Let us consider the survival times $t_i$ in weeks of n = 17 patients with leukemia along with their white blood cell counts $x_i$ at the time of diagnosis, $i = 1, \ldots, 17$ (see Cox and Snell, 1981, Example U). We use these data to illustrate the higher-order approximation (10) for a Weibull model with shape parameter κ and scale parameter $\lambda_i = \beta_1 \exp(\beta_2(x_i - \bar x))$, $i = 1, \ldots, n$. To this end, we use the fact that $y_i = \log t_i$ follows a non-normal regression and scale model of the form $y_i = x_i^{T}\beta + \sigma\varepsilon_i$, with here $x_i^{T}\beta = \log\beta_1 + \beta_2(x_i - \bar x)$, $\sigma = 1/\kappa$ and $\varepsilon_i$ a log-Weibull random variable, also called an extreme-value or Gumbel random variable, with density $f(\varepsilon_i) = \exp(\varepsilon_i - e^{\varepsilon_i})$.
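The log transformation that turns the Weibull model into a regression and scale model can be checked by simulation. The Python sketch below draws Weibull times, standardises their logarithms and compares the empirical distribution with the distribution function $1 - \exp(-e^{\varepsilon})$ implied by the stated density. The shape and scale values are hypothetical and are not estimates from the leukemia data; the function names are illustrative.

```python
import math
import random


def log_weibull_residuals(times, scale, shape):
    """eps_i = (log t_i - log scale) / sigma, with sigma = 1 / shape."""
    sigma = 1.0 / shape
    return [(math.log(t) - math.log(scale)) / sigma for t in times]


def extreme_value_cdf(e):
    """Distribution function belonging to the density f(e) = exp(e - exp(e))."""
    return 1.0 - math.exp(-math.exp(e))


rng = random.Random(0)
shape, scale = 2.0, 50.0  # hypothetical values chosen for illustration
# random.weibullvariate takes the scale first and the shape second
times = [rng.weibullvariate(scale, shape) for _ in range(20000)]
eps = log_weibull_residuals(times, scale, shape)
empirical = sum(e <= 0.0 for e in eps) / len(eps)
theoretical = extreme_value_cdf(0.0)
```

In the regression setting the common scale is replaced by $\lambda_i = \beta_1 \exp(\beta_2(x_i - \bar x))$, so the same standardisation is applied observation by observation.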


Table 2. Bias (and standard deviations) of the MPEs of $\pi_m^a(\psi \mid y)$ and of $\pi_m(\psi \mid y)$ under the gamma model.

| | n = 5 | n = 10 | n = 15 | n = 20 |
|---|---|---|---|---|
| $\pi_m^a(\psi \mid y)$ | 1.21 (4.09) | 0.34 (0.72) | 0.20 (0.49) | 0.14 (0.37) |
| $\pi_m(\psi \mid y)$ | 0.03 (1.64) | 0.01 (0.51) | 0.00 (0.40) | 0.00 (0.31) |

Table 3. Frequentist coverage probabilities of approximate 0.95 equi-tailed credible intervals, with left and right noncoverage probabilities (in brackets), under the extreme-value model.

| | n = 10 | n = 17 | n = 34 |
|---|---|---|---|
| $\pi_m^a(\psi \mid y)$ | 0.892 (0.043, 0.065) | 0.915 (0.041, 0.044) | 0.932 (0.036, 0.032) |
| $\pi_m(\psi \mid y)$ | 0.948 (0.029, 0.023) | 0.951 (0.026, 0.023) | 0.951 (0.025, 0.024) |

[Figure 1: posterior density (vertical axis) against beta2 (horizontal axis).]

Fig. 1. Posteriors $\pi_m(\psi \mid y)$ (solid line) and first-order approximation $\pi_m^a(\psi \mid y)$ (dashed line) for ψ = β2 in the leukemia data.

The proposed Bayesian procedure for inference in non-normal regression models can be easily fitted by means of the rsm fitting routine, provided by the marg section of the library HOA, and (10) can be obtained from the corresponding cond method. Fig. 1 gives the posterior distribution (10) together with the posterior (14) for ψ = β2: the corresponding asymptotic equi-tailed credible intervals for ψ with approximate level 0.95 are (−3.06, −1.43) and (−2.96, −1.63), respectively, and the two MPEs are −2.30 and −2.28, respectively.

As in Example 1, Tables 3 and 4 give the results of a simulation study based on 10000 Monte Carlo trials with n = 10, 17, 34. In all three scenarios, we used as true parameter values the observed MLE, that is log β1 = 3.95, β2 = −0.99 and σ = 1.00. From Table 3 we observe that, even for small n, $\pi_m(\psi \mid y)$ for the regression coefficient ψ = β2 clearly improves on $\pi_m^a(\psi \mid y)$. From Table 4 it can be noted that the MPE of (3) is more accurate than the maximum profile estimator.

Example 3 (Binary Data). Let us consider the bank dataset, which is made of four measurements on 100 genuine Swiss banknotes and 100 counterfeit ones (see Marin and Robert, 2007, Chapter 4). The response variable y is the status of the banknote. The explanatory variables are the length of the bill $x_1$, the width of the left edge $x_2$, the width of the right edge $x_3$, and the bottom margin width $x_4$, all expressed in millimeters. A logit model is used to predict the type of banknote, i.e., to detect counterfeit banknotes, based on the four regressors $x_1$, $x_2$, $x_3$ and $x_4$.

Here, we focus our attention on inference on the scalar parameter ψ = β4, i.e., the coefficient of $x_4$. The proposed Bayesian procedure for inference on ψ can be easily fitted by means of the cond method for glm objects (Brazzale et al., 2007). The posterior distribution $\pi_m(\psi \mid y)$ is compared with the MCMC approximation ($10^5$ simulations) for the posteriors obtained with a flat prior and with Zellner’s uninformative G-prior (see Fig. 2), discussed in Marin and Robert (2007, Chapter 4).

We also performed a simulation study with $10^4$ samples simulated from the fitted model (MLE), that is with β1 = −2.44, β2 = 1.88, β3 = 2.01 and β4 = 2.05. For the MCMC approximation for $\pi_m^{\text{flat}}(\psi \mid y)$ and $\pi_m^{G\text{-prior}}(\psi \mid y)$ we used $10^4$ replications. Table 5 gives the empirical frequentist coverages for 0.95 posterior equi-tailed credible intervals with left and right noncoverage probabilities. The accuracy of our proposal in terms of frequentist coverage is again remarkable. On the other hand, the frequentist coverage of the equi-tailed credible interval based on Zellner’s G-prior is less accurate but still satisfactory, while this is not the case with the equi-tailed credible interval based on the commonly used flat prior.

Table 4. Bias (and standard deviations) of the MPEs of $\pi_m^a(\psi \mid y)$ and of $\pi_m(\psi \mid y)$ under the extreme-value model.

| | n = 10 | n = 17 | n = 34 |
|---|---|---|---|
| $\pi_m^a(\psi \mid y)$ | 0.066 (1.53) | 0.039 (0.80) | 0.013 (0.64) |
| $\pi_m(\psi \mid y)$ | 0.025 (1.46) | 0.029 (0.77) | 0.011 (0.62) |

Table 5. Frequentist coverage probabilities of approximate 0.95 equi-tailed credible intervals, with left and right noncoverage probabilities (in brackets), under the logistic model.

| | $\pi_m^a(\psi \mid y)$ | $\pi_m(\psi \mid y)$ | $\pi_m^{\text{flat}}(\psi \mid y)$ | $\pi_m^{G\text{-prior}}(\psi \mid y)$ |
|---|---|---|---|---|
| Coverage | 0.956 | 0.950 | 0.893 | 0.939 |
| Noncoverage (left, right) | (0.022, 0.022) | (0.025, 0.025) | (0.100, 0.007) | (0.040, 0.021) |

[Figure 2: posterior density (vertical axis) against beta4 (horizontal axis).]

Fig. 2. Higher-order (solid line) and first-order posteriors (dashed), MCMC posteriors with flat prior (long dash) and with uninformative G-prior for ψ = β4 (dot-dashed). The corresponding 0.95 equi-tailed credible intervals are (1.38, 2.68), (1.38, 2.72), (1.54, 2.94) and (1.41, 2.74).

5. Final remarks

For the purpose of making objective Bayesian inferences for a scalar interest parameter, higher-order asymptotic theory is discussed. Advantages of the proposed approximations are that neither elicitation on the nuisance parameters, nor multidimensional integration, nor MCMC simulation is necessary in order to obtain $\pi_m(\psi \mid y)$, and no orthogonal parameterization is required in order to specify the matching prior. A further advantage of this approximation is that its expression automatically includes the matching prior, without requiring its explicit computation. From a practical point of view, the proposed approximation enables one to compute accurate invariant equi-tailed credible intervals for ψ as $\{\psi : |r_P^*(\psi)| \le z_{1-\alpha/2}\}$, i.e., as accurate likelihood-based confidence intervals for ψ with approximate level $(1 - \alpha)$. This result suggests that the modified directed likelihood can be used for Bayesian, as well as for non-Bayesian, inference.

Other possible applications of the proposed methods are, among others, the bi-normal and bi-exponential stress–strength models P(X < Y) discussed in Ventura and Racugno (2011) (some R software is available at homes.stat.unipd.it/ventura/?page=Software), nonlinear regression models (with the profile method available for objects of class nlreg), linear mixed effects models (see Guolo et al., 2006), and generalized mixed effects models.

As a final remark, note that, while the statistic $r_P^*(\psi)$ requires the MLE $\hat\psi$ as an ingredient, the posterior (3) does not. In view of this, expression (3) can be useful in those situations where $\hat\psi$ can be infinite, using a suitably defined modified profile likelihood (see Severini, 2000, Chapter 9).

Acknowledgments

We thank the referees for their useful comments that greatly improved the paper. This work was supported by a grant from the University of Padua (Progetti di Ricerca di Ateneo 2011) and by the Cariparo Foundation Excellence grant 2011/2012.


References

Barndorff-Nielsen, O.E., 1983. On a formula for the distribution of the maximum likelihood estimator. Biometrika 70, 343–365.
Barndorff-Nielsen, O.E., 1986. Inference on full or partial parameters based on the standardized signed log likelihood ratio. Biometrika 73, 307–322.
Barndorff-Nielsen, O.E., 1991. Modified signed log likelihood ratio. Biometrika 78, 557–563.
Barndorff-Nielsen, O.E., Chamberlin, S.R., 1994. Stable and invariant adjusted directed likelihoods. Biometrika 81, 485–499.
Barndorff-Nielsen, O.E., Cox, D.R., 1994. Inference and Asymptotics. Chapman and Hall, London.
Brazzale, A.R., Davison, A.C., Reid, N., 2007. Applied Asymptotics. Cambridge University Press, Cambridge.
Cox, D.R., Snell, E.J., 1981. Applied Statistics: Principles and Examples. Chapman and Hall, London.
Datta, G.S., Ghosh, J.K., 1995. Noninformative priors for maximal invariant parameter in group models. TEST 4, 95–114.
Datta, G.S., Mukerjee, R., 2004. Probability Matching Priors: Higher-Order Asymptotics. In: Lecture Notes in Statistics, Springer.
Fraser, D.A.S., Reid, N., 2002. Strong matching of frequentist and Bayesian parametric inference. J. Statist. Plann. Inference 103, 263–285.
Giummolé, F., Ventura, L., 2002. Practical point estimation from higher-order pivots. J. Stat. Comput. Simul. 72, 419–430.
Guolo, A., Brazzale, A.R., Salvan, A., 2006. Improved inference on a scalar fixed effect of interest in nonlinear mixed-effects models. Comput. Statist. Data Anal. 51, 1602–1613.
Marin, J.M., Robert, C., 2007. Bayesian Core: A Practical Approach to Computational Bayesian Statistics. Springer.
Nicolaou, A., 1993. Bayesian intervals with good frequentist behaviour in the presence of nuisance parameters. J. R. Stat. Soc. Ser. B 55, 377–390.
Pace, L., Salvan, A., 1999. Point estimation based on confidence intervals: exponential families. J. Stat. Comput. Simul. 64, 1–21.
Pace, L., Salvan, A., 2006. Adjustments of the profile likelihood from a new perspective. J. Statist. Plann. Inference 136, 3554–3564.
Reid, N., 2003. The 2000 Wald memorial lectures: asymptotics and the theory of inference. Ann. Statist. 31, 1695–1731.
Sartori, N., Bellio, R., Salvan, A., Pace, L., 1999. The directed modified profile likelihood in models with many nuisance parameters. Biometrika 86, 735–742.
Severini, T.A., 2000. Likelihood Methods in Statistics. Oxford University Press.
Skovgaard, I.M., 1989. A review of higher order likelihood inference. Bull. Int. Statist. Inst. 53, 331–351.
Tibshirani, R., 1989. Noninformative priors for one parameter of many. Biometrika 76, 604–608.
Ventura, L., Cabras, S., Racugno, W., 2009. Prior distributions from pseudo-likelihoods in the presence of nuisance parameters. J. Amer. Statist. Assoc. 104, 768–774.
Ventura, L., Racugno, W., 2011. Recent advances on Bayesian inference for P(X < Y). Bayesian Anal. 6, 411–428.