“reflections on the probability space induced by moment conditions with implications for Bayesian inference”: a discussion

Christian P. Robert, Université Paris-Dauphine, Paris & University of Warwick, Coventry

[email protected]

DESCRIPTION

A discussion I will present at the 6th French Econometrics Conference in Dauphine, Friday Dec. 5

Outline

what is the question?

what could the question be?

what is the answer?

what could the answer be?

what is the question?

“If one specifies a set of moment functions collected together into a vector $m(x, \theta)$ of dimension $M$, regards $\theta$ as random and asserts that some transformation $Z(x, \theta)$ has distribution $\psi$, then what is required to use this information and then possibly a prior to make valid inference?” R. Gallant, p.4

Priors without efforts

• quest for model-induced priors dating back to the early 1900s [Lhoste, 1923]
• reference priors, such as Jeffreys' prior, induced by the sampling distribution [Jeffreys, 1939]
• fiducial distributions as Fisher's attempted answer [Fisher, 1956]

Fisher’s t fiducial distribution

When considering

$$t = \frac{\bar{x} - \theta}{s/\sqrt{n}}$$

the ratio has a frequentist $t$ distribution with $n - 1$ degrees of freedom

Fisher’s t fiducial distribution

However, there is no equivalent justification for asserting that

$$t = \frac{\bar{x} - \theta}{s/\sqrt{n}}$$

has a $t$ posterior distribution with $n - 1$ degrees of freedom on $\theta$, given $(\bar{x}, s)$, except when using a non-informative and improper prior $\pi(\theta, \sigma^2) \propto 1/\sigma^2$, since then

$$\theta \sim \mathcal{T}_{n-1}(\bar{x}, s/\sqrt{n})$$
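The improper-prior case can be checked by simulation. The sketch below is an illustration only and not part of the original slides: it assumes a Normal sample, relies on the standard decomposition of the posterior under $\pi(\theta, \sigma^2) \propto 1/\sigma^2$ (a Gamma posterior on $\sigma^{-2}$, then a Normal posterior on $\theta$ given $\sigma^2$), and the function name `posterior_t_draws` and the toy data are hypothetical.

```python
import math
import random
import statistics


def posterior_t_draws(x, n_draws, rng):
    """Draw theta from its posterior under pi(theta, sigma^2) ∝ 1/sigma^2
    (Normal sampling model assumed) and return the standardised values
    (theta - x_bar) / (s / sqrt(n))."""
    n = len(x)
    x_bar = statistics.mean(x)
    s = statistics.stdev(x)  # uses the n - 1 denominator
    shape = (n - 1) / 2.0
    rate = (n - 1) * s * s / 2.0
    draws = []
    for _ in range(n_draws):
        # sigma^{-2} | x ~ Gamma(shape, rate); gammavariate takes a scale
        precision = rng.gammavariate(shape, 1.0 / rate)
        sigma = 1.0 / math.sqrt(precision)
        # theta | sigma^2, x ~ N(x_bar, sigma^2 / n)
        theta = rng.gauss(x_bar, sigma / math.sqrt(n))
        draws.append((theta - x_bar) / (s / math.sqrt(n)))
    return draws


rng = random.Random(2014)
x = [4.1, 5.3, 3.8, 6.0, 5.1, 4.7, 5.9, 4.4, 5.6, 5.0]  # toy data, n = 10
t_draws = posterior_t_draws(x, 100_000, rng)

# 2.262 is the 0.975 quantile of a t distribution with 9 degrees of freedom
coverage = sum(abs(t) <= 2.262 for t in t_draws) / len(t_draws)
print(f"posterior mass of |t| <= 2.262: {coverage:.3f}")
```

If the posterior is indeed $\mathcal{T}_{n-1}(\bar{x}, s/\sqrt{n})$, the printed mass should land near 0.95.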

Fisher’s t fiducial distribution

Furthermore, neither the Bayesian nor the frequentist interpretation implies that

$$t = \frac{\bar{x} - \theta}{s/\sqrt{n}}$$

has a $t$ posterior distribution with $n - 1$ degrees of freedom jointly

what could the question be?

Given a set of moment equations

$$\mathbb{E}[m(X_1, \ldots, X_n, \theta)] = 0$$

(where both the $X_i$'s and $\theta$ are random), can one derive a likelihood function and a prior distribution compatible with those constraints?

coherence across sample sizes n

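Highly complex question, since it implies that the integral equation

$$\int_{\Theta \times \mathcal{X}^n} m(x_1, \ldots, x_n, \theta)\, \pi(\theta)\, f(x_1|\theta) \cdots f(x_n|\theta)\, \mathrm{d}\theta\, \mathrm{d}x_1 \cdots \mathrm{d}x_n = 0$$

must or should have a solution in $(\pi, f)$ for all $n$'s. Is this possible outside of a likelihood × prior modelling?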
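As a point of comparison, a plain likelihood × prior model satisfies the integral equation for every $n$ without effort. The Monte Carlo sketch below illustrates this for one arbitrary choice that is not taken from the slides: $m(x_1, \ldots, x_n, \theta) = \bar{x}_n - \theta$, $\theta \sim \mathcal{N}(0, 1)$ and $X_i | \theta \sim \mathcal{N}(\theta, 1)$. The function name `joint_moment_estimate` is hypothetical.

```python
import random


def joint_moment_estimate(n, n_sims, rng):
    """Monte Carlo estimate of E[m(X_1..X_n, theta)] over the joint
    distribution of (theta, X), with m = x_bar - theta (assumed example)."""
    total = 0.0
    for _ in range(n_sims):
        theta = rng.gauss(0.0, 1.0)                      # theta ~ pi
        xs = [rng.gauss(theta, 1.0) for _ in range(n)]   # X_i | theta ~ f
        total += sum(xs) / n - theta
    return total / n_sims


rng = random.Random(1)
estimates = {n: joint_moment_estimate(n, 20_000, rng) for n in (1, 5, 20)}
for n, value in estimates.items():
    print(f"n = {n:2d}: estimated E[m] = {value:+.4f}")
```

The estimates should hover around zero for each sample size. The hard direction, going from $m$ alone to a pair $(\pi, f)$, is what the slide questions.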

Zellner’s Bayesian method of moments

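Given moment conditions on the parameters $\theta$ and $\sigma^2$

$$\mathbb{E}[\theta | x_1, \ldots, x_n] = \bar{x}_n, \qquad \mathbb{E}[\sigma^2 | x_1, \ldots] = s_n^2, \qquad \mathrm{var}(\theta | \sigma^2, x_1, \ldots) = \sigma^2/n$$

derivation of a maximum entropy posterior

$$\theta | \sigma^2, x_1, \ldots \sim \mathcal{N}(\bar{x}_n, \sigma^2/n), \qquad \sigma^{-2} | x_1, \ldots \sim \mathrm{Exp}(s_n^2)$$

[Zellner, 1996]

but incompatible with the corresponding predictive distribution [Geisser & Seidenfeld, 1999]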

what is the answer?

Under the condition that $Z(\cdot, \theta)$ is surjective,

$$p^*(x|\theta) = \psi(Z(x, \theta))$$

and arbitrary choice of prior $\pi(\theta)$

• lhs and rhs operate on different spaces
• no reason why the density $\psi$ should integrate against Lebesgue measure in $n$-dimensional Euclidean space
• no direct connection with a genuine likelihood function, i.e., the product of the densities of the $X_i$'s (conditional on $\theta$)

what could the answer be?

“A common situation that requires consideration of the notions that follow is that deriving the likelihood from a structural model is analytically intractable and one cannot verify that the numerical approximations one would have to make to circumvent the intractability are sufficiently accurate.” R. Gallant, p.7

Approximative Bayesian answers

Defining a joint distribution on $(\theta, x_1, \ldots, x_n)$ through moment equations prevents regular Bayesian inference, as the likelihood is unavailable; there may be alternatives available:

• approximate Bayesian computation (ABC) and empirical likelihood based Bayesian inference [Tavaré et al., 1999; Owen, 201; Mengersen et al., 2013]
• INLA (Laplace), EP (expectation propagation) [Martino et al., 2008; Barthelmé & Chopin, 2014]
• variational Bayes [Jaakkola & Jordan, 2000]

Bayesian approximative answers

• Using a fake likelihood does not prohibit Bayesian analysis, as shown in the paper with the model in eqn. (45)
• However, this requires a case-by-case consistency analysis, since pseudo-likelihoods do not offer the same guarantees
• Example of ABC model choice based on insufficient statistics [Marin et al., 2014]

Empirical likelihood (EL)

Dataset $x$ made of $n$ independent replicates $x = (x_1, \ldots, x_n)$ of a rv $X \sim F$

Generalized moment condition pseudo-model

$$\mathbb{E}_F[h(X, \phi)] = 0,$$

where $h$ is a known function and $\phi$ an unknown parameter

Induced empirical likelihood

$$L_{el}(\phi | x) = \max_p \prod_{i=1}^n p_i$$

for all $p$ such that $0 \le p_i \le 1$, $\sum_i p_i = 1$, $\sum_i p_i h(x_i, \phi) = 0$

[Owen, 1988, Biometrika, & Empirical Likelihood, 2001]
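For readers who want the constrained maximisation spelled out, the usual Lagrange-multiplier argument gives the optimal weights in closed form up to a multiplier. This worked step is added here as a sketch and is not on the slide; $\lambda$ is an auxiliary symbol introduced for the purpose, and the solution assumes that $0$ lies inside the convex hull of the $h(x_i, \phi)$'s (otherwise no admissible $p$ exists and $L_{el}(\phi | x)$ is conventionally set to $0$).

```latex
\begin{aligned}
&\max_{p}\ \sum_{i=1}^n \log p_i
  \quad\text{subject to}\quad \sum_i p_i = 1,\quad \sum_i p_i\, h(x_i,\phi) = 0 \\
&\Longrightarrow\quad
  p_i(\phi) = \frac{1}{n}\,\frac{1}{1 + \lambda^{\mathsf T} h(x_i,\phi)},
  \qquad\text{with } \lambda \text{ solving}\quad
  \sum_{i=1}^n \frac{h(x_i,\phi)}{1 + \lambda^{\mathsf T} h(x_i,\phi)} = 0 \\
&\Longrightarrow\quad
  \log L_{el}(\phi \mid x)
  = -\sum_{i=1}^n \log\bigl\{ n\,[\,1 + \lambda^{\mathsf T} h(x_i,\phi)\,] \bigr\}
\end{aligned}
```

When $\lambda = 0$ solves the equation, all weights equal $1/n$ and $L_{el}$ reaches its upper bound $n^{-n}$.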

Raw ABCel sampler

Naïve implementation: act as if EL were an exact likelihood [Lazar, 2003, Biometrika]

for i = 1 → N do
    generate $\phi_i$ from the prior distribution $\pi(\cdot)$
    set the weight $\omega_i = L_{el}(\phi_i | x_{obs})$
end for
return $(\phi_i, \omega_i)$, $i = 1, \ldots, N$

• Output: weighted sample of size $N$

[Mengersen et al., 2013, PNAS]
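The loop above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than the authors' implementation: the constraint is the scalar mean condition $h(x, \phi) = x - \phi$, the empirical likelihood is computed through the usual Lagrange-multiplier form $p_i = 1/\{n(1 + \lambda h_i)\}$ with $\lambda$ found by bisection, the prior is an arbitrary $\mathcal{N}(0, 5^2)$, and the names `log_el_mean` and `raw_abcel` are hypothetical.

```python
import math
import random


def log_el_mean(x_obs, phi, n_iter=60):
    """Log empirical likelihood of phi under E_F[X - phi] = 0 (scalar case)."""
    h = [x - phi for x in x_obs]
    h_min, h_max = min(h), max(h)
    if not (h_min < 0.0 < h_max):
        return -math.inf  # 0 outside the convex hull: no admissible weights
    # lambda must keep every 1 + lambda * h_i positive
    lo, hi = -1.0 / h_max, -1.0 / h_min
    pad = 1e-10 * (hi - lo)
    lo, hi = lo + pad, hi - pad
    for _ in range(n_iter):
        lam = 0.5 * (lo + hi)
        # sum_i h_i / (1 + lam * h_i) is decreasing in lam
        if sum(v / (1.0 + lam * v) for v in h) > 0.0:
            lo = lam
        else:
            hi = lam
    lam = 0.5 * (lo + hi)
    n = len(h)
    return -sum(math.log(n * (1.0 + lam * v)) for v in h)


def raw_abcel(x_obs, prior_sampler, n_draws, log_el=log_el_mean):
    """Raw ABCel sampler: prior draws weighted by the empirical likelihood."""
    sample = []
    for _ in range(n_draws):
        phi = prior_sampler()
        weight = math.exp(log_el(x_obs, phi))  # exp(-inf) gives weight 0.0
        sample.append((phi, weight))
    return sample


rng = random.Random(0)
x_obs = [rng.gauss(2.0, 1.0) for _ in range(30)]   # toy observations
sample = raw_abcel(x_obs, lambda: rng.gauss(0.0, 5.0), n_draws=2000)

total = sum(w for _, w in sample)
posterior_mean = sum(phi * w for phi, w in sample) / total
print(f"weighted posterior mean of phi: {posterior_mean:.3f}")
```

Prior draws falling outside the range of the data receive weight zero, which is one reason a diffuse prior can waste most of the $N$ simulations in this raw version.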

Raw ABCel sampler

• Performance evaluated through the effective sample size

$$\mathrm{ESS} = 1 \Big/ \sum_{i=1}^N \Big\{ \omega_i \Big/ \sum_{j=1}^N \omega_j \Big\}^2$$

[Mengersen et al., 2013, PNAS]
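The effective sample size is a one-liner once the weights are normalised. The helper below is a small sketch; the name `effective_sample_size` is hypothetical, and the error raised for an all-zero weight vector is a choice made here, not something specified on the slide.

```python
def effective_sample_size(weights):
    """ESS = 1 / sum_i (w_i / sum_j w_j)^2 for non-negative weights."""
    total = sum(weights)
    if total <= 0.0:
        raise ValueError("at least one weight must be positive")
    # normalise first so that very small raw weights do not underflow
    return 1.0 / sum((w / total) ** 2 for w in weights)


equal = effective_sample_size([1.0, 1.0, 1.0, 1.0])       # all draws count
degenerate = effective_sample_size([1.0, 0.0, 0.0, 0.0])  # one draw dominates
print(equal, degenerate)
```

The ESS ranges from 1, when a single draw carries all the weight, to $N$, when the weights are equal.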