7/26/2019 Monte Carlo Notes From Main Book
Markov Chain Monte Carlo Methods

Christian P. Robert
Université Paris Dauphine and CREST-INSEE
http://www.ceremade.dauphine.fr/~xian

3-6 May 2005
Outline
Motivation and leading example
Random variable generation
Monte Carlo Integration
Notions on Markov Chains
The Metropolis-Hastings Algorithm
The Gibbs Sampler
MCMC tools for variable dimension problems
Sequential importance sampling
New [2004] edition:
Motivation and leading example
  Introduction
  Likelihood methods
  Missing variable models
  Bayesian Methods
  Bayesian troubles
Introduction

Latent structures make life harder!

Even simple models may lead to computational complications, as in latent variable models

    f(x|θ) = ∫ f(x, x*|θ) dx*

If (x, x*) observed: fine!
If only x observed: trouble!
Example (Mixture models)

Models of mixtures of distributions:

    X ~ f_j  with probability p_j,

for j = 1, 2, ..., k, with overall density

    X ~ p_1 f_1(x) + ... + p_k f_k(x).

For a sample of independent random variables (X_1, ..., X_n), the sample density is

    ∏_{i=1}^n { p_1 f_1(x_i) + ... + p_k f_k(x_i) }.

Expanding this product involves k^n elementary terms: prohibitive to compute in large samples.
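The k^n blow-up can be seen directly by expanding the product over all component allocations. The minimal Python sketch below (not from the slides; the sample values and parameters are arbitrary illustrations) compares the O(nk) direct evaluation of the sample density with the O(k^n) expanded sum, which agree by distributivity:

```python
import itertools
import math

def normal_pdf(x, mu, sigma=1.0):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def likelihood_direct(xs, ps, mus):
    # O(n*k): product over observations of the k-term mixture density
    out = 1.0
    for x in xs:
        out *= sum(p * normal_pdf(x, mu) for p, mu in zip(ps, mus))
    return out

def likelihood_expanded(xs, ps, mus):
    # O(k^n): sum over all k^n component allocations (z_1, ..., z_n)
    k, n = len(ps), len(xs)
    total = 0.0
    for z in itertools.product(range(k), repeat=n):
        term = 1.0
        for x, zi in zip(xs, z):
            term *= ps[zi] * normal_pdf(x, mus[zi])
        total += term
    return total

xs = [-0.5, 0.3, 1.2, 2.1]
ps, mus = [0.3, 0.7], [0.0, 2.0]
a = likelihood_direct(xs, ps, mus)
b = likelihood_expanded(xs, ps, mus)   # already 2^4 = 16 terms for n = 4
print(abs(a - b) < 1e-12)
```

Already at n = 40 and k = 2, the expanded form has about 10^12 terms, while the direct product stays at 80 density evaluations.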
[Figure: case of the 0.3 N(μ_1, 1) + 0.7 N(μ_2, 1) likelihood]
Likelihood methods

Maximum likelihood methods

For an iid sample X_1, ..., X_n from a population with density f(x|θ_1, ..., θ_k), the likelihood function is

    L(θ|x) = L(θ_1, ..., θ_k|x_1, ..., x_n) = ∏_{i=1}^n f(x_i|θ_1, ..., θ_k).

Global justifications from asymptotics.
Computational difficulty depends on structure, e.g. latent variables.
Example (Mixtures again)

For a mixture of two normal distributions,

    p N(μ, τ^2) + (1 − p) N(θ, σ^2),

the likelihood is proportional to

    ∏_{i=1}^n [ p τ^{-1} φ((x_i − μ)/τ) + (1 − p) σ^{-1} φ((x_i − θ)/σ) ],

containing 2^n terms.
Standard maximization techniques often fail to find the global maximum because of multimodality of the likelihood function.

Example

In the special case

    f(x|μ, σ) ∝ (1 − ε) exp{−(1/2) x^2} + (ε/σ) exp{−(1/2σ^2)(x − μ)^2}    (1)

with ε > 0 known, whatever n, the likelihood is unbounded:

    lim_{σ→0} ℓ(μ = x_1, σ | x_1, ..., x_n) = ∞
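This divergence is easy to check numerically. The sketch below (illustrative only; the data and the choice ε = 0.1 are arbitrary, not from the slides) evaluates the log-likelihood of model (1) at μ = x_1 along a decreasing sequence of σ values — each step increases the log-likelihood without bound:

```python
import math

def loglik(mu, sigma, xs, eps=0.1):
    # log-likelihood of the contaminated-normal model (1), up to constants
    s = 0.0
    for x in xs:
        s += math.log((1 - eps) * math.exp(-0.5 * x * x)
                      + (eps / sigma) * math.exp(-0.5 * ((x - mu) / sigma) ** 2))
    return s

xs = [1.3, -0.4, 0.8, 2.1, -1.7]
mu = xs[0]                       # plant the mean on the first observation
vals = [loglik(mu, s, xs) for s in (1.0, 0.1, 0.01, 1e-4, 1e-8)]
print(vals)  # increasing: the likelihood blows up as sigma -> 0
```

The i = 1 term contributes ε/σ, which dominates as σ → 0, while every other term stays bounded below by (1 − ε) exp{−x_i^2/2} > 0.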
Missing variable models

The special case of missing variable models

Consider again a latent variable representation

    g(x|θ) = ∫_Z f(x, z|θ) dz

Define the completed (but unobserved) likelihood

    L^c(θ|x, z) = f(x, z|θ)

Useful for optimisation algorithms.
The EM Algorithm

Algorithm (Expectation-Maximisation)

Iterate (in m):

1. (E step) Compute

       Q(θ|θ^(m), x) = E[log L^c(θ|x, Z) | θ^(m), x],

2. (M step) Maximise Q(θ|θ^(m), x) in θ and take

       θ^(m+1) = arg max_θ Q(θ|θ^(m), x),

until a fixed point [of Q] is reached.
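As a concrete illustration of the E and M steps, here is a minimal EM sketch (not from the slides) for a two-component normal mixture with known weight p and unit variances, estimating only the two means; the data-generating values 0.0 and 2.5 and the starting point are arbitrary choices:

```python
import math
import random

def em_two_normal_means(xs, p=0.3, iters=50, start=(-1.0, 3.0)):
    # EM for the means of p*N(mu1,1) + (1-p)*N(mu2,1); weight and variances known
    mu1, mu2 = start
    phi = lambda x, m: math.exp(-0.5 * (x - m) ** 2)
    for _ in range(iters):
        # E step: posterior probability that each x_i came from component 1
        w = [p * phi(x, mu1) / (p * phi(x, mu1) + (1 - p) * phi(x, mu2))
             for x in xs]
        # M step: the weighted means maximise Q(.|current fit, x)
        sw = sum(w)
        mu1 = sum(wi * x for wi, x in zip(w, xs)) / sw
        mu2 = sum((1 - wi) * x for wi, x in zip(w, xs)) / (len(xs) - sw)
    return mu1, mu2

random.seed(1)
xs = [random.gauss(0.0, 1.0) if random.random() < 0.3 else random.gauss(2.5, 1.0)
      for _ in range(2000)]
m1, m2 = em_two_normal_means(xs)
print(m1, m2)  # close to the true means 0.0 and 2.5
```

Each iteration increases the observed-data likelihood, so the sequence converges to a local maximum (here, near the generating values).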
[Figure: histogram of an N(0, 1) sample ("Échantillon N(0,1)") — sample from (1)]
[Figure: likelihood of the .7 N(μ_1, 1) + .3 N(μ_2, 1) model and EM steps]
Bayesian Methods

The Bayesian Perspective

In the Bayesian paradigm, the information brought by the data x, realization of

    X ~ f(x|θ),

is combined with prior information specified by a prior distribution with density π(θ).
Central tool

Summary in a probability distribution, π(θ|x), called the posterior distribution.

Derived from the joint distribution f(x|θ) π(θ), according to

    π(θ|x) = f(x|θ) π(θ) / ∫ f(x|θ) π(θ) dθ,

[Bayes Theorem]

where

    m(x) = ∫ f(x|θ) π(θ) dθ

is the marginal density of X.
Central tool... central to Bayesian inference

Posterior defined up to a constant as

    π(θ|x) ∝ f(x|θ) π(θ)

- Operates conditional upon the observations
- Integrates simultaneously prior information and the information brought by x
- Avoids averaging over the unobserved values of x
- Coherent updating of the information available on θ, independent of the order in which i.i.d. observations are collected
- Provides a complete inferential scope and a unique motor of inference
Bayesian troubles

Conjugate bonanza...

Example (Binomial)

For an observation X ~ B(n, p), the so-called conjugate prior is the family of beta Be(a, b) distributions.

The classical Bayes estimator is the posterior mean

    [Γ(a + b + n) / (Γ(a + x) Γ(n − x + b))] ∫_0^1 p · p^{x+a−1} (1 − p)^{n−x+b−1} dp = (x + a)/(a + b + n).
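The closed form can be checked against brute-force quadrature of the unnormalised posterior. A small Python sketch (the values of x, n, a, b are arbitrary illustrations):

```python
def posterior_mean_numeric(x, n, a, b, grid=20000):
    # E[p | x] under a Be(a,b) prior and B(n,p) likelihood, by Riemann sum
    num = den = 0.0
    for i in range(1, grid):
        p = i / grid
        w = p ** (x + a - 1) * (1 - p) ** (n - x + b - 1)  # unnormalised posterior
        num += p * w
        den += w
    return num / den

x, n, a, b = 7, 20, 2.0, 3.0
numeric = posterior_mean_numeric(x, n, a, b)
closed = (x + a) / (a + b + n)
print(numeric, closed)  # agree to several decimals
```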
Example (Normal)

In the normal N(μ, σ^2) case, with both μ and σ unknown, conjugate prior on θ = (μ, σ^2) of the form

    (σ^2)^{−λ_σ} exp{−(λ_μ (μ − ξ)^2 + α) / 2σ^2}

since

    π((μ, σ^2)|x_1, ..., x_n) ∝ (σ^2)^{−λ_σ} exp{−(λ_μ (μ − ξ)^2 + α) / 2σ^2}
                                × (σ^2)^{−n/2} exp{−(n(μ − x̄)^2 + s_x^2) / 2σ^2}
                                ∝ (σ^2)^{−λ_σ − n/2} exp{−((λ_μ + n)(μ − ξ_x)^2
                                  + α + s_x^2 + [n λ_μ/(n + λ_μ)] (x̄ − ξ)^2) / 2σ^2}

with ξ_x = (λ_μ ξ + n x̄)/(λ_μ + n).
...and conjugate curse

The use of conjugate priors for computational reasons

- implies a restriction on the modeling of the available prior information
- may be detrimental to the usefulness of the Bayesian approach
- gives an impression of subjective manipulation of the prior information disconnected from reality.
A typology of Bayes computational problems

(i). use of a complex parameter space, as for instance in constrained parameter sets like those resulting from imposing stationarity constraints in dynamic models;

(ii). use of a complex sampling model with an intractable likelihood, as for instance in missing data and graphical models;

(iii). use of a huge dataset;

(iv). use of a complex prior distribution (which may be the posterior distribution associated with an earlier sample);

(v). use of a complex inferential procedure, as for instance Bayes factors

    B_01(x) = [ P(θ ∈ Θ_0 | x) / P(θ ∈ Θ_1 | x) ] ÷ [ π(Θ_0) / π(Θ_1) ].
Example (Mixture once again)

Observations from

    x_1, ..., x_n ~ f(x|θ) = p φ(x; μ_1, σ_1) + (1 − p) φ(x; μ_2, σ_2)

Prior

    μ_i | σ_i ~ N(ξ_i, σ_i^2/n_i),   σ_i^2 ~ IG(ν_i/2, s_i^2/2),   p ~ Be(α, β)

Posterior

    π(θ|x_1, ..., x_n) ∝ ∏_{j=1}^n { p φ(x_j; μ_1, σ_1) + (1 − p) φ(x_j; μ_2, σ_2) } π(θ)
                       = Σ_{ℓ=0}^n Σ_{(k_t)} ω(k_t) π(θ|(k_t))

where the inner sum runs over the allocations (k_t) sending ℓ of the n observations to the first component.
Example (Mixture once again (contd))

For a given permutation (k_t), conditional posterior distribution

    π(θ|(k_t)) = N(ξ_1(k_t), σ_1^2/(n_1 + ℓ)) × IG((ν_1 + ℓ)/2, s_1(k_t)/2)
               × N(ξ_2(k_t), σ_2^2/(n_2 + n − ℓ)) × IG((ν_2 + n − ℓ)/2, s_2(k_t)/2)
               × Be(α + ℓ, β + n − ℓ)
Example (Mixture once again (contd))

where

    x̄_1(k_t) = (1/ℓ) Σ_{t=1}^{ℓ} x_{k_t},        ŝ_1^2(k_t) = Σ_{t=1}^{ℓ} (x_{k_t} − x̄_1(k_t))^2,
    x̄_2(k_t) = (1/(n − ℓ)) Σ_{t=ℓ+1}^{n} x_{k_t},  ŝ_2^2(k_t) = Σ_{t=ℓ+1}^{n} (x_{k_t} − x̄_2(k_t))^2

and

    ξ_1(k_t) = (n_1 ξ_1 + ℓ x̄_1(k_t)) / (n_1 + ℓ),
    ξ_2(k_t) = (n_2 ξ_2 + (n − ℓ) x̄_2(k_t)) / (n_2 + n − ℓ),

    s_1(k_t) = s_1^2 + ŝ_1^2(k_t) + [n_1 ℓ/(n_1 + ℓ)] (ξ_1 − x̄_1(k_t))^2,
    s_2(k_t) = s_2^2 + ŝ_2^2(k_t) + [n_2 (n − ℓ)/(n_2 + n − ℓ)] (ξ_2 − x̄_2(k_t))^2,

posterior updates of the hyperparameters.
Example (Mixture once again)

Bayes estimator of θ:

    δ^π(x_1, ..., x_n) = Σ_{ℓ=0}^n Σ_{(k_t)} ω(k_t) E^π[θ | x, (k_t)]

Too costly: 2^n terms.
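The 2^n-term decomposition can be made concrete in a stripped-down version of the model (means and variances known, Beta prior on the weight p only; everything below is an illustrative sketch, not the slides' full conjugate setup). Enumerating all 2^n allocations reproduces the posterior mean of p obtained by direct quadrature:

```python
import itertools
import math

def phi(x, mu):
    return math.exp(-0.5 * (x - mu) ** 2) / math.sqrt(2 * math.pi)

def beta_fn(a, b):
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)

def post_mean_p_enumeration(xs, mu1, mu2, a, b):
    # Expand prod_j [p*phi1 + (1-p)*phi2] into 2^n allocation terms; each
    # allocation z contributes (prod of component densities) * a Beta integral
    n = len(xs)
    num = den = 0.0
    for z in itertools.product((0, 1), repeat=n):
        c, l = 1.0, 0
        for x, zi in zip(xs, z):
            c *= phi(x, mu1) if zi == 0 else phi(x, mu2)
            l += zi == 0          # count of observations allocated to component 1
        den += c * beta_fn(a + l, b + n - l)
        num += c * beta_fn(a + l + 1, b + n - l)
    return num / den

def post_mean_p_grid(xs, mu1, mu2, a, b, grid=20000):
    # same quantity by brute-force quadrature over p
    num = den = 0.0
    for i in range(1, grid):
        p = i / grid
        w = p ** (a - 1) * (1 - p) ** (b - 1)
        for x in xs:
            w *= p * phi(x, mu1) + (1 - p) * phi(x, mu2)
        num += p * w
        den += w
    return num / den

xs = [-0.2, 0.5, 2.3, 1.9, 0.1]
e = post_mean_p_enumeration(xs, 0.0, 2.0, 2.0, 2.0)   # 2^5 = 32 terms
g = post_mean_p_grid(xs, 0.0, 2.0, 2.0, 2.0)
print(e, g)  # agree
```

At n = 5 the enumeration has 32 terms; at n = 50 it would have ~10^15, which is exactly the "too costly" point of the slide.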
Example (Poly-t priors)

Normal observation x ~ N(θ, 1), with conjugate prior

    θ ~ N(μ, τ^2)

Closed form expression for the posterior mean

    ∫ θ f(x|θ) π(θ) dθ / ∫ f(x|θ) π(θ) dθ = (x + τ^{-2} μ) / (1 + τ^{-2}).
Example (Poly-t priors (2))

More involved prior distribution: poly-t distribution

[Bauwens, 1985]

    π(θ) = ∏_{i=1}^k [ α_i + (θ − ξ_i)^2 ]^{−ν_i},   α_i, ν_i > 0

Computation of E[θ|x]???
Example (AR(p) model)

Auto-regressive representation of a time series,

    x_t = Σ_{i=1}^p θ_i x_{t−i} + ε_t

If the order p is unknown, the predictive distribution of x_{t+1} is given by

    π(x_{t+1}|x_t, ..., x_1) ∝ ∫ f(x_{t+1}|θ, x_t, ..., x_{t−p+1}) π(θ, p|x_t, ..., x_1) dp dθ,
Example (AR(p) model (contd))

Integration over the parameters of all models

    Σ_{p=0}^∞ ∫ f(x_{t+1}|θ, x_t, ..., x_{t−p+1}) π(θ|p, x_t, ..., x_1) dθ  π(p|x_t, ..., x_1).
Example (AR(p) model (contd))

Multiple layers of complexity:

(i). complex parameter space within each AR(p) model because of the stationarity constraint;
(ii). if p unbounded, infinity of models;
(iii). θ varies between models AR(p) and AR(p + 1), with a different stationarity constraint (except for root reparameterisation);
(iv). if prediction used sequentially, every tick/second/hour/day, the posterior distribution π(θ, p|x_t, ..., x_1) must be re-evaluated.
Random variable generation
  Basic methods
  Uniform pseudo-random generator
  Beyond Uniform distributions
  Transformation methods
  Accept-Reject Methods
  Fundamental theorem of simulation
  Log-concave densities
Random variable generation

- Rely on the possibility of producing (computer-wise) an endless flow of random variables (usually iid) from well-known distributions
- Given a uniform random number generator, illustration of methods that produce random variables from both standard and nonstandard distributions

Basic methods
The inverse transform method

For a function F on R, the generalized inverse of F, F⁻, is defined by

    F⁻(u) = inf{ x ; F(x) ≥ u }.

Definition (Probability Integral Transform)

If U ~ U[0,1], then the random variable F⁻(U) has the distribution F.

The inverse transform method (2)

To generate a random variable X ~ F, simply generate

    U ~ U[0,1]

and then make the transform

    x = F⁻(u)
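A short Python sketch of the method (illustrative; the exponential cdf is used because its generalized inverse is available in closed form):

```python
import math
import random

def exp_inverse_transform(lam, u):
    # F(x) = 1 - exp(-lam*x)  =>  F-(u) = -log(1 - u)/lam
    return -math.log(1.0 - u) / lam

random.seed(0)
lam = 2.0
xs = [exp_inverse_transform(lam, random.random()) for _ in range(100000)]
mean = sum(xs) / len(xs)
print(mean)  # should be close to 1/lam = 0.5
```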
Uniform pseudo-random generator

Desiderata and limitations

- Production of a deterministic sequence of values in [0, 1] which imitates a sequence of iid uniform random variables U[0,1].
- Can't use the physical imitation of a random draw [no guarantee of uniformity, no reproducibility]
- Random sequence in the sense: having generated (X_1, ..., X_n), knowledge of X_n [or of (X_1, ..., X_n)] imparts no discernible knowledge of the value of X_{n+1}.
- Deterministic: given the initial value X_0, the sample (X_1, ..., X_n) is always the same
- Validity of a random number generator is based on a single sample X_1, ..., X_n when n tends to +∞, not on replications (X_11, ..., X_1n), (X_21, ..., X_2n), ..., (X_k1, ..., X_kn) where n is fixed and k tends to infinity
Uniform pseudo-random generator

An algorithm starting from an initial value 0 ≤ u_0 ≤ 1 and a transformation D, which produces a sequence

    (u_i) = (D^i(u_0))

in [0, 1]. For all n,

    (u_1, ..., u_n)

reproduces the behavior of an iid U[0,1] sample (V_1, ..., V_n) when compared through usual tests.
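One classical concrete choice of the transformation D is a linear congruential generator. The sketch below is illustrative; the multiplier and increment are the common "Numerical Recipes" constants, an assumption rather than anything prescribed by the slides:

```python
def lcg(seed, n, a=1664525, c=1013904223, m=2 ** 32):
    # Minimal linear congruential generator: D(x) = (a*x + c) mod m,
    # with u_i = D^i(u_0) rescaled to [0, 1)
    out, x = [], seed
    for _ in range(n):
        x = (a * x + c) % m
        out.append(x / m)
    return out

us = lcg(12345, 100000)
mean = sum(us) / len(us)
print(mean)  # near 0.5 for a uniform-looking stream
```

Note the two desiderata in action: the stream is fully reproducible from the seed, yet its empirical mean behaves like that of an iid U[0,1] sample.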
Uniform pseudo-random generator (2)

Validity means the sequence U_1, ..., U_n leads to accept the hypothesis

    H_0 : U_1, ..., U_n are iid U[0,1].

The set of tests used is generally of some consequence:

- Kolmogorov-Smirnov and other nonparametric tests
- Time series methods, for correlation between U_i and (U_{i−1}, ..., U_{i−k})
- Marsaglia's battery of tests called Die Hard(!)
Usual generators

In R and S-plus, procedure runif()

    The Uniform Distribution

    Description:

         runif generates random deviates.

    Example:

         u

[Figure: trace and histogram of a uniform sample]
Usual generators (2)

In C, procedure rand() or random()

    SYNOPSIS

         #include <stdlib.h>

         long int random(void);

    DESCRIPTION

         The random() function uses a non-linear additive
         feedback random number generator employing a
         default table of size 31 long integers to return
         successive pseudo-random numbers in the range
         from 0 to RAND_MAX. The period of this random
         generator is very large, approximately
         16*((2**31)-1).

    RETURN VALUE

         random() returns a value between 0 and RAND_MAX.
Usual generators (3)

In Scilab, procedure rand()

    rand() : with no arguments gives a scalar whose
    value changes each time it is referenced. By
    default, random numbers are uniformly distributed
    in the interval (0,1). rand('normal') switches to
    a normal distribution with mean 0 and variance 1.

    EXAMPLE

         x=rand(10,10,'uniform')
Random variable generation
Beyond Uniform distributions
Beyond Uniform generators
http://find/7/26/2019 Monte Carlo Notes From Main Book
80/584
Generation of any sequence of random variables can beformally implemented through a uniform generator
Distributions with explicit F (for instance, exponential, andWeibull distributions), use the probability integral
transform here
Markov Chain Monte Carlo Methods
Random variable generation
Beyond Uniform distributions
Beyond Uniform generators
http://find/7/26/2019 Monte Carlo Notes From Main Book
81/584
Generation of any sequence of random variables can beformally implemented through a uniform generator
Distributions with explicit F (for instance, exponential, andWeibull distributions), use the probability integral
transform here Case specific methods rely on properties of the distribution (forinstance, normal distribution, Poisson distribution)
Markov Chain Monte Carlo Methods
Random variable generation
Beyond Uniform distributions
Beyond Uniform generators
http://find/7/26/2019 Monte Carlo Notes From Main Book
82/584
Generation of any sequence of random variables can beformally implemented through a uniform generator
Distributions with explicit F (for instance, exponential, andWeibull distributions), use the probability integral
transform here Case specific methods rely on properties of the distribution (forinstance, normal distribution, Poisson distribution)
More generic methods (for instance, accept-reject andratio-of-uniform)
Markov Chain Monte Carlo Methods
Random variable generation
Beyond Uniform distributions
Beyond Uniform generators
http://find/7/26/2019 Monte Carlo Notes From Main Book
83/584
- Generation of any sequence of random variables can be formally implemented through a uniform generator
- Distributions with explicit F (for instance, exponential and Weibull distributions) use the probability integral transform
- Case-specific methods rely on properties of the distribution (for instance, normal or Poisson distributions)
- More generic methods (for instance, accept-reject and ratio-of-uniforms)
- Simulation of the standard distributions is accomplished quite efficiently by many numerical and statistical programming packages.
Transformation methods
Case where a distribution F is linked in a simple way to another distribution that is easy to simulate.

Example (Exponential variables)
If U ~ U[0,1], the random variable
  X = −log(U)/λ
has distribution
  P(X ≤ x) = P(−log U ≤ λx) = P(U ≥ e^{−λx}) = 1 − e^{−λx},
the exponential distribution Exp(λ).
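As a minimal sketch of the probability integral transform in code (not from the slides; the generator, sample size and λ below are arbitrary illustrative choices):

```python
import math
import random

def exp_inverse_transform(lam, n, rng):
    # X = -log(U)/lam ~ Exp(lam) when U ~ U[0,1]
    # use 1 - rng.random() so the argument of log is in (0, 1]
    return [-math.log(1.0 - rng.random()) / lam for _ in range(n)]

rng = random.Random(42)
sample = exp_inverse_transform(2.0, 100_000, rng)
mean = sum(sample) / len(sample)  # should be close to 1/lam = 0.5
```

With a seeded generator the empirical mean lands close to 1/λ, as the law of large numbers suggests.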
Other random variables that can be generated starting from an exponential include
  Y = −2 Σ_{j=1}^{ν} log(U_j) ~ χ²_{2ν}
  Y = −(1/β) Σ_{j=1}^{a} log(U_j) ~ Ga(a, β)
  Y = Σ_{j=1}^{a} log(U_j) / Σ_{j=1}^{a+b} log(U_j) ~ Be(a, b)
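These transformations can be sketched directly (a minimal illustration, assuming the rate parameterization Ga(a, β) with mean a/β; parameters and sample sizes are arbitrary):

```python
import math
import random

def gamma_from_exponentials(a, beta, rng):
    # Ga(a, beta) as -(1/beta) * sum of a iid log(U_j); integer a only
    return -sum(math.log(1.0 - rng.random()) for _ in range(a)) / beta

def beta_from_exponentials(a, b, rng):
    # Be(a, b) as a ratio of sums of log-uniforms; integer a, b only
    logs = [math.log(1.0 - rng.random()) for _ in range(a + b)]
    return sum(logs[:a]) / sum(logs)

rng = random.Random(0)
g = [gamma_from_exponentials(3, 2.0, rng) for _ in range(50_000)]
b = [beta_from_exponentials(2, 3, rng) for _ in range(50_000)]
g_mean = sum(g) / len(g)   # E[Ga(3, 2)] = 3/2
b_mean = sum(b) / len(b)   # E[Be(2, 3)] = 2/5
```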
Points to note
- Transformation quite simple to use
- There are more efficient algorithms for gamma and beta random variables
- Cannot generate gamma random variables with a non-integer shape parameter
- For instance, cannot get a χ²₁ variable, which would give us a N(0, 1) variable.
Box-Muller Algorithm
Example (Normal variables)
If r, θ are the polar coordinates of (X₁, X₂), then
  r² = X₁² + X₂² ~ χ²₂ = Exp(1/2)
and
  θ ~ U[0, 2π]
Consequence: If U₁, U₂ iid U[0,1],
  X₁ = √(−2 log(U₁)) cos(2πU₂)
  X₂ = √(−2 log(U₁)) sin(2πU₂)
are iid N(0, 1).
Box-Muller Algorithm (2)
1. Generate U₁, U₂ iid U[0,1] ;
2. Define
  x₁ = √(−2 log(u₁)) cos(2πu₂) ,
  x₂ = √(−2 log(u₁)) sin(2πu₂) ;
3. Take x₁ and x₂ as two independent draws from N(0, 1).
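The three steps above can be sketched as follows (a minimal illustration; seed and sample size are arbitrary):

```python
import math
import random

def box_muller(rng):
    # One pair of independent N(0,1) draws from two uniforms
    u1 = 1.0 - rng.random()  # in (0, 1], keeps log(u1) finite
    u2 = rng.random()
    r = math.sqrt(-2.0 * math.log(u1))
    return r * math.cos(2 * math.pi * u2), r * math.sin(2 * math.pi * u2)

rng = random.Random(1)
draws = [z for _ in range(50_000) for z in box_muller(rng)]
m = sum(draws) / len(draws)               # should be near 0
v = sum(z * z for z in draws) / len(draws)  # should be near 1
```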
Box-Muller Algorithm (3)
[Figure: scatterplot and histogram of a simulated normal sample]
- Unlike algorithms based on the CLT, this algorithm is exact
- Get two normals for the price of two uniforms
- Drawback (in speed): calculating log, cos and sin.
More transforms
Example (Poisson generation)
Poisson–exponential connection: If N ~ P(λ) and Xᵢ ~ Exp(λ), i ∈ ℕ*,
  P(N = k) = P(X₁ + ··· + X_k ≤ 1 < X₁ + ··· + X_{k+1}).
More Poisson
- A Poisson can be simulated by generating Exp(λ) variables until their sum exceeds 1.
- This method is simple, but is really practical only for smaller values of λ.
- On average, the number of exponential variables required is λ.
- Other approaches are more suitable for large λ's.
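The Poisson–exponential connection above can be sketched directly (a minimal illustration; λ, seed and sample size are arbitrary):

```python
import math
import random

def poisson_via_exponentials(lam, rng):
    # Count Exp(lam) arrivals until their running sum exceeds 1
    n, total = 0, 0.0
    while True:
        total += -math.log(1.0 - rng.random()) / lam
        if total > 1.0:
            return n
        n += 1

rng = random.Random(7)
sample = [poisson_via_exponentials(4.0, rng) for _ in range(50_000)]
mean = sum(sample) / len(sample)  # E[N] = lam = 4
```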
Atkinson's Poisson
To generate N ~ P(λ):
1. Define β = π/√(3λ), α = βλ and k = log c − λ − log β ;
2. Generate U₁ ~ U[0,1] and calculate
  x = {α − log{(1 − u₁)/u₁}}/β
until x > −0.5 ;
3. Define N = ⌊x + 0.5⌋ and generate U₂ ~ U[0,1] ;
4. Accept N if
  α − βx + log (u₂/{1 + exp(α − βx)}²) ≤ k + N log λ − log N!.
Negative extension
A generator of Poisson random variables can produce negative binomial random variables since
  Y ~ Ga(n, (1 − p)/p) , X | y ~ P(y)
implies X ~ Neg(n, p)
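A sketch of this gamma-Poisson mixture, assuming the rate parameterization Ga(n, β) used above (so `gammavariate`, which takes shape and scale, is called with scale p/(1−p); parameters, seed and sample size are arbitrary):

```python
import math
import random

def neg_binomial(n, p, rng):
    # Gamma-Poisson mixture: Y ~ Ga(n, rate (1 - p)/p), then X | Y = y ~ P(y)
    y = rng.gammavariate(n, p / (1 - p))  # gammavariate takes shape and SCALE
    # Poisson(y): count standard-exponential arrivals until their sum exceeds y
    count, total = 0, 0.0
    while True:
        total += -math.log(1.0 - rng.random())
        if total > y:
            return count
        count += 1

rng = random.Random(3)
sample = [neg_binomial(4, 0.5, rng) for _ in range(20_000)]
mean = sum(sample) / len(sample)  # E[X] = E[Y] = n p/(1 - p) = 4 here
```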
Mixture representation
- The representation of the negative binomial is a particular case of a mixture distribution
- The principle of a mixture representation is to represent a density f as the marginal of another distribution, for example
  f(x) = Σ_{i ∈ Y} pᵢ fᵢ(x)
- If the component distributions fᵢ(x) can be easily generated, X can be obtained by first choosing fᵢ with probability pᵢ and then generating an observation from fᵢ.
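The two-stage recipe can be sketched as follows (a minimal illustration with a two-component normal mixture; weights, components and seed are arbitrary choices):

```python
import random

def sample_mixture(weights, samplers, rng):
    # Choose component i with probability p_i, then draw from f_i
    u, acc = rng.random(), 0.0
    for p, draw in zip(weights, samplers):
        acc += p
        if u <= acc:
            return draw(rng)
    return samplers[-1](rng)  # guard against rounding of the weights

rng = random.Random(5)
weights = [0.3, 0.7]
samplers = [lambda r: r.gauss(-2.0, 1.0), lambda r: r.gauss(3.0, 1.0)]
sample = [sample_mixture(weights, samplers, rng) for _ in range(50_000)]
mean = sum(sample) / len(sample)  # 0.3*(-2) + 0.7*3 = 1.5
```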
Partitioned sampling
Special case of mixture sampling when
  fᵢ(x) = f(x) I_{Aᵢ}(x) / ∫_{Aᵢ} f(x) dx
and
  pᵢ = Pr(X ∈ Aᵢ)
for a partition (Aᵢ)ᵢ
Accept-Reject algorithm
- There are many distributions from which it is difficult, or even impossible, to directly simulate.
- Another class of methods only requires the functional form of the density f of interest up to a multiplicative constant.
- The key to this method is to use a simpler (simulation-wise) density g, the instrumental density, from which the simulation from the target density f is actually done.
Fundamental theorem of simulation
Lemma
Simulating
  X ~ f(x)
is equivalent to simulating
  (X, U) ~ U{(x, u) : 0 < u < f(x)}
[Figure: uniform sampling of (x, u) under the graph of f(x)]
The Accept-Reject algorithm
Given a density of interest f, find a density g and a constant M such that
  f(x) ≤ M g(x)
on the support of f.
1. Generate X ~ g, U ~ U[0,1] ;
2. Accept Y = X if U ≤ f(X)/(M g(X)) ;
3. Return to 1. otherwise.
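A minimal sketch of these steps (not from the slides): the target Beta(2, 2), uniform proposal, and M = 1.5 are arbitrary illustrative choices, with M the maximum of f/g on [0, 1].

```python
import random

def accept_reject(f, g_sample, g_pdf, M, rng):
    # Generate X ~ g, U ~ U[0,1]; accept if U <= f(X)/(M g(X))
    while True:
        x = g_sample(rng)
        if rng.random() <= f(x) / (M * g_pdf(x)):
            return x

f = lambda x: 6.0 * x * (1.0 - x)   # Beta(2, 2) density on [0, 1]
g_sample = lambda r: r.random()     # uniform proposal
g_pdf = lambda x: 1.0
M = 1.5                             # max of f/g, attained at x = 1/2

rng = random.Random(11)
sample = [accept_reject(f, g_sample, g_pdf, M, rng) for _ in range(50_000)]
mean = sum(sample) / len(sample)    # Beta(2, 2) has mean 1/2
```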
Validation of the Accept-Reject method
Guarantee:
This algorithm produces a variable Y distributed according to f
Two interesting properties
First, it provides a generic method to simulate from any density f that is known up to a multiplicative factor.
Property particularly important in Bayesian calculations, where the posterior distribution
  π(θ|x) ∝ π(θ) f(x|θ)
is specified up to a normalizing constant
Second, the probability of acceptance in the algorithm is 1/M, i.e., the expected number of trials until a variable is accepted is M
More interesting properties
When f and g are both probability densities, the constant M is necessarily larger than 1.
The size of M, and thus the efficiency of the algorithm, is a function of how closely g can imitate f, especially in the tails
For f/g to remain bounded, it is necessary for g to have tails thicker than those of f.
It is therefore impossible to use the A-R algorithm to simulate a Cauchy distribution f using a normal distribution g; however, the reverse works quite well.
No Cauchy!
Example (Normal from a Cauchy)
Take
  f(x) = (1/√(2π)) exp(−x²/2)
and
  g(x) = (1/π) · 1/(1 + x²),
densities of the normal and Cauchy distributions.
Then
  f(x)/g(x) = √(π/2) (1 + x²) e^{−x²/2} ≤ √(2π/e) ≈ 1.52,
attained at x = ±1.
Example (Normal from a Cauchy (2))
So probability of acceptance
1/1.52 = 0.66,
and, on the average, one out of every three simulated Cauchyvariables is rejected.
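A sketch of this normal-from-Cauchy sampler (seed and sample size are arbitrary), counting proposals to check the ≈ 0.66 acceptance rate:

```python
import math
import random

M = math.sqrt(2 * math.pi / math.e)  # bound on f/g, about 1.52

def normal_from_cauchy(rng):
    # Cauchy C(0,1) proposal via inverse CDF; returns (draw, number of tries)
    tries = 0
    while True:
        tries += 1
        x = math.tan(math.pi * (rng.random() - 0.5))
        f = math.exp(-x * x / 2) / math.sqrt(2 * math.pi)  # normal density
        g = 1.0 / (math.pi * (1 + x * x))                  # Cauchy density
        if rng.random() <= f / (M * g):
            return x, tries

rng = random.Random(2)
draws, total_tries = [], 0
for _ in range(20_000):
    x, t = normal_from_cauchy(rng)
    draws.append(x)
    total_tries += t
accept_rate = len(draws) / total_tries          # near 1/M
mean = sum(draws) / len(draws)
var = sum(x * x for x in draws) / len(draws)    # near 1 for N(0,1)
```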
No Double!
Example (Normal/Double Exponential)
Generate a N(0, 1) by using a double-exponential distribution with density
  g(x|α) = (α/2) exp(−α|x|)
Then
  f(x)/g(x|α) ≤ √(2/π) α⁻¹ e^{α²/2}
and the minimum of this bound (in α) is attained for α = 1
Example (Normal/Double Exponential (2))
Probability of acceptance
  √(π/(2e)) = 0.76
To produce one normal random variable requires on average 1/0.76 ≈ 1.3 uniform variables.
Example (Gamma generation)
Illustrates a real advantage of the Accept-Reject algorithm.
The gamma distribution Ga(α, β) can be represented as a sum of exponential random variables only if α is an integer
Example (Gamma generation (2))
Can use the Accept-Reject algorithm with instrumental distribution
  Ga(a, b), with a = ⌊α⌋, α > 0.
(Without loss of generality, β = 1.)
Up to a normalizing constant,
  f/g_b = b⁻ᵃ x^{α−a} exp{−(1 − b)x} ≤ b⁻ᵃ [(α − a)/((1 − b)e)]^{α−a}
for b ≤ 1. The maximum is attained at b = a/α.
Cheng and Feast's Gamma generator
Gamma Ga(α, 1), α > 1 distribution
1. Define c₁ = α − 1, c₂ = (α − (1/6α))/c₁, c₃ = 2/c₁, c₄ = 1 + c₃, and c₅ = 1/√α.
2. Repeat
  generate U₁, U₂
  take U₁ = U₂ + c₅(1 − 1.86 U₁) if α > 2.5
until 0 < U₁ < 1.
Example (Truncated Normal distributions)
Constraint x ≥ μ̲ produces a density proportional to
  e^{−(x−μ)²/(2σ²)} I_{x ≥ μ̲}
for a bound μ̲ large compared with μ
Truncated Normal simulation
There exist alternatives far superior to the naive method of generating a N(μ, σ²) until exceeding μ̲, which requires an average number of
  1/Φ((μ − μ̲)/σ)
simulations from N(μ, σ²) for a single acceptance.
Example (Truncated Normal distributions (2))
Instrumental distribution: translated exponential distribution, E(α, μ̲), with density
  g_α(z) = α e^{−α(z − μ̲)} I_{z ≥ μ̲}.
The ratio f/g_α is bounded by
  f/g_α ≤ (1/α) exp(α²/2 − αμ̲)  if α > μ̲,
  f/g_α ≤ (1/α) exp(−μ̲²/2)   otherwise.
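A sketch of this sampler for a standard normal truncated to [μ̲, ∞), assuming the common choice α = (μ̲ + √(μ̲² + 4))/2 > μ̲, for which the acceptance probability simplifies to exp(−(z − α)²/2); the bound μ̲ = 2 and seed are arbitrary:

```python
import math
import random

def truncated_normal(mu_lower, rng):
    # Target: standard normal restricted to [mu_lower, inf)
    # Proposal: translated exponential E(alpha, mu_lower)
    alpha = (mu_lower + math.sqrt(mu_lower ** 2 + 4)) / 2
    while True:
        z = mu_lower - math.log(1.0 - rng.random()) / alpha
        # acceptance probability exp(-(z - alpha)^2 / 2) since alpha > mu_lower
        if rng.random() <= math.exp(-0.5 * (z - alpha) ** 2):
            return z

rng = random.Random(9)
sample = [truncated_normal(2.0, rng) for _ in range(20_000)]
mean_t = sum(sample) / len(sample)  # phi(2)/(1 - Phi(2)) ~ 2.373
```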
Log-concave densities (1)
Densities f whose logarithm is concave, for instance Bayesian posterior distributions such that
  log π(θ|x) = log π(θ) + log f(x|θ) + c
is concave
Log-concave densities (2)
Take
  S_n = {x_i, i = 0, 1, . . . , n + 1} ⊂ supp(f)
such that h(x_i) = log f(x_i) is known up to the same constant.
By concavity of h, the line L_{i,i+1} through (x_i, h(x_i)) and (x_{i+1}, h(x_{i+1}))
[Figure: log f(x) with the chord L_{2,3}(x) through the points x₁, …, x₄]
lies below h in [x_i, x_{i+1}] and above this graph outside this interval
Log-concave densities (3)
For x ∈ [x_i, x_{i+1}], if
  h̲_n(x) = min{L_{i−1,i}(x), L_{i+1,i+2}(x)}  and  h̄_n(x) = L_{i,i+1}(x) ,
the envelopes are
  h̲_n(x) ≤ h(x) ≤ h̄_n(x)
uniformly on the support of f, with
  h̲_n(x) = −∞  and  h̄_n(x) = min(L_{0,1}(x), L_{n,n+1}(x))
on [x₀, x_{n+1}]ᶜ.
Log-concave densities (4)
Therefore, if
  f̲_n(x) = exp h̲_n(x)  and  f̄_n(x) = exp h̄_n(x)
then
  f̲_n(x) ≤ f(x) ≤ f̄_n(x) = ϖ_n g_n(x) ,
where ϖ_n is the normalizing constant of f̄_n
ARS Algorithm
1. Initialize n and S_n.
2. Generate X ~ g_n(x), U ~ U[0,1].
3. If U ≤ f̲_n(X)/(ϖ_n g_n(X)), accept X;
   otherwise, if U ≤ f(X)/(ϖ_n g_n(X)), accept X
Example (Northern Pintail ducks)
Ducks captured at time i with probability pᵢ, with both the pᵢ and the size N of the population unknown.
Dataset
  (n₁, . . . , n₁₁) = (32, 20, 8, 5, 1, 2, 0, 2, 1, 1, 0)
Number of recoveries over the years 1957–1968 of N = 1612 Northern Pintail ducks banded in 1956
Example (Northern Pintail ducks (2))
Corresponding conditional likelihood
  L(p₁, . . . , p_I | N, n₁, . . . , n_I) = [N!/(N − r)!] ∏_{i=1}^{I} pᵢ^{nᵢ} (1 − pᵢ)^{N−nᵢ} ,
where I is the number of captures, nᵢ the number of captured animals during the i-th capture, and r the total number of different captured animals.
Example (Northern Pintail ducks (3))
Prior selection
If
  N ~ P(λ)
and
  αᵢ = log( pᵢ / (1 − pᵢ) ) ~ N(μᵢ, σ²),
[Normal logistic]
Example (Northern Pintail ducks (4))
Posterior distribution
  π(α, N | λ, n₁, . . . , n_I) ∝ [N!/(N − r)!] (λᴺ/N!) ∏_{i=1}^{I} (1 + e^{αᵢ})^{−N} ∏_{i=1}^{I} exp{ αᵢ nᵢ − (αᵢ − μᵢ)²/(2σ²) }
Example (Northern Pintail ducks (5))
For the conditional posterior distribution
  π(αᵢ | N, n₁, . . . , n_I) ∝ exp{ αᵢ nᵢ − (αᵢ − μᵢ)²/(2σ²) } (1 + e^{αᵢ})^{−N} ,
the ARS algorithm can be implemented since
  αᵢ nᵢ − (αᵢ − μᵢ)²/(2σ²) − N log(1 + e^{αᵢ})
is concave in αᵢ.
Posterior distributions of capture log-odds ratios for the years 1957–1965.
[Figure: posterior densities of the capture log-odds ratios, one panel per year, 1957–1965]
[Figure: 1960 — true distribution versus histogram of simulated sample]
Monte Carlo integration

Motivation and leading example
Random variable generation
Monte Carlo Integration
  Introduction
  Monte Carlo integration
  Importance Sampling
  Acceleration methods
  Bayesian importance sampling
Notions on Markov Chains
The Metropolis-Hastings Algorithm
Quick reminder
Two major classes of numerical problems arise in statistical inference:
Optimization - generally associated with the likelihood approach
Integration - generally associated with the Bayesian approach
Example (Bayesian decision theory)
Bayes estimators are not always posterior expectations, but rather solutions of the minimization problem
  min_δ ∫_Θ L(δ, θ) π(θ) f(x|θ) dθ .
Proper loss: for L(δ, θ) = (δ − θ)², the Bayes estimator is the posterior mean.
Absolute error loss: for L(δ, θ) = |δ − θ|, the Bayes estimator is the posterior median.
With no loss function, use the maximum a posteriori (MAP) estimator
  arg max_θ ℓ(θ|x) π(θ)
Monte Carlo integration
Theme: Generic problem of evaluating the integral
  I = E_f[h(X)] = ∫_X h(x) f(x) dx
where X is uni- or multidimensional, f is a closed form, partly closed form, or implicit density, and h is a function
Monte Carlo integration (2)
Monte Carlo solution
First use a sample (X₁, . . . , X_m) from the density f to approximate the integral I by the empirical average
  h̄_m = (1/m) Σ_{j=1}^{m} h(x_j)
which converges
  h̄_m → E_f[h(X)]
by the Strong Law of Large Numbers
Monte Carlo precision
Estimate the variance with
v_m = (1/m) · (1/(m − 1)) Σ_{j=1}^{m} [h(x_j) − h̄_m]² ,
and for m large,
  (h̄_m − E_f[h(X)]) / √v_m ~ N(0, 1).
Note: This can lead to the construction of a convergence test and of confidence bounds on the approximation of E_f[h(X)].
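The estimator and its variance estimate can be sketched as follows (a minimal illustration, not from the slides; the choice h(x) = x² with f = N(0, 1), so that E_f[h] = 1, is arbitrary):

```python
import math
import random

rng = random.Random(4)
m = 100_000
h_vals = [rng.gauss(0.0, 1.0) ** 2 for _ in range(m)]  # h(x) = x^2 under f = N(0,1)
h_bar = sum(h_vals) / m                                # estimates E_f[h(X)] = 1
# v_m = (1/m) * (1/(m-1)) * sum of squared deviations
v_m = sum((h - h_bar) ** 2 for h in h_vals) / (m * (m - 1))
half_width = 1.96 * math.sqrt(v_m)                     # approximate 95% bound
```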
Example (Cauchy prior/normal sample)
For estimating a normal mean, a robust prior is a Cauchy prior
  X ~ N(θ, 1), θ ~ C(0, 1).
Under squared error loss, posterior mean
  δ^π(x) = [ ∫ θ/(1 + θ²) e^{−(x−θ)²/2} dθ ] / [ ∫ 1/(1 + θ²) e^{−(x−θ)²/2} dθ ]
Example (Cauchy prior/normal sample (2))
Form of δ^π suggests simulating iid variables
  θ₁, . . . , θ_m ~ N(x, 1)
and calculating
  δ̂^π_m(x) = [ Σ_{i=1}^{m} θᵢ/(1 + θᵢ²) ] / [ Σ_{i=1}^{m} 1/(1 + θᵢ²) ] .
The Law of Large Numbers implies
  δ̂^π_m(x) → δ^π(x) as m → ∞.
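This estimator is a few lines of code (a minimal sketch; x = 10, the seed and m are arbitrary illustrative choices):

```python
import random

def posterior_mean(x, m, rng):
    # theta_i ~ N(x, 1); ratio of sums of theta/(1+theta^2) and 1/(1+theta^2)
    num = den = 0.0
    for _ in range(m):
        t = rng.gauss(x, 1.0)
        w = 1.0 / (1.0 + t * t)
        num += t * w
        den += w
    return num / den

rng = random.Random(6)
est = posterior_mean(10.0, 200_000, rng)  # shrinks x = 10 slightly toward 0
```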
[Figure: Range of estimators δ̂^π_m for 100 runs and x = 10]
Importance sampling
Paradox
Simulation from f (the true density) is not necessarily optimal
Alternative to direct sampling from f is importance sampling, based on the alternative representation
  E_f[h(X)] = ∫_X h(x) [f(x)/g(x)] g(x) dx ,
which allows us to use distributions other than f
Importance sampling algorithm
Evaluation of
E_f[h(X)] = ∫_X h(x) f(x) dx
by
1. Generate a sample X₁, . . . , X_n from a distribution g
2. Use the approximation
  (1/m) Σ_{j=1}^{m} [f(X_j)/g(X_j)] h(X_j)
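The two steps can be sketched directly (a minimal illustration; the choice f = N(0, 1), h(x) = x² and a Cauchy C(0, 1) instrumental, whose tails dominate those of f, is arbitrary):

```python
import math
import random

def normal_pdf(x):
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def cauchy_pdf(x):
    return 1.0 / (math.pi * (1 + x * x))

rng = random.Random(8)
m = 200_000
total = 0.0
for _ in range(m):
    x = math.tan(math.pi * (rng.random() - 0.5))      # X_j ~ C(0, 1) by inversion
    total += (x * x) * normal_pdf(x) / cauchy_pdf(x)  # h(X) f(X)/g(X)
estimate = total / m  # E_f[X^2] = 1 for f = N(0, 1)
```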
Same thing as before!!!
Convergence of the estimator
  (1/m) Σ_{j=1}^{m} [f(X_j)/g(X_j)] h(X_j) → ∫_X h(x) f(x) dx
This convergence holds for any choice of the distribution g [as long as supp(g) ⊃ supp(f)]
Important details
- Instrumental distribution g chosen from distributions easy to simulate
- The same sample (generated from g) can be used repeatedly, not only for different functions h, but also for different densities f
- Even dependent proposals can be used, as seen later [PMC chapter]
Although g can be any density, some choices are better than others:
Finite variance only when
  E_f[ h²(X) f(X)/g(X) ] = ∫_X h²(x) [f²(x)/g(x)] dx < ∞
Ratio of the densities
  ρ(x) = p(x)/p₀(x) = √(2/π) e^{x²/2}/(1 + x²)
very badly behaved: e.g.,
  ∫ ρ(x)² p₀(x) dx = ∞ .
Poor performances of the associated importance sampling estimator
Markov Chain Monte Carlo Methods
Monte Carlo Integration
Importance Sampling
200
http://goforward/http://find/http://goback/7/26/2019 Monte Carlo Notes From Main Book
156/584
0 2000 4000 6000 8000 10000
0
50
100
150
iterations
Range and average of500 replications of IS estimate ofE[exp X] over 10, 000 iterations.
Optimal importance function
The choice of g that minimizes the variance of the importance sampling estimator is
  g*(x) = |h(x)| f(x) / ∫_Z |h(z)| f(z) dz .
Practical impact
  [ Σ_{j=1}^{m} h(X_j) f(X_j)/g(X_j) ] / [ Σ_{j=1}^{m} f(X_j)/g(X_j) ] ,
where f and g are known up to constants.
- Also converges to I by the Strong Law of Large Numbers.
- Biased, but the bias is quite small
- In some settings, beats the unbiased estimator in squared error loss.
- Using the optimal solution does not always work:
  [ Σ_{j=1}^{m} h(x_j) f(x_j)/(|h(x_j)| f(x_j)) ] / [ Σ_{j=1}^{m} f(x_j)/(|h(x_j)| f(x_j)) ] = (#positive h − #negative h) / [ Σ_{j=1}^{m} 1/|h(x_j)| ]
Self-normalised importance sampling
For the ratio estimator
  δⁿ_h = [ Σ_{i=1}^{n} ωᵢ h(Xᵢ) ] / [ Σ_{i=1}^{n} ωᵢ ]
with Xᵢ ~ g(y) and ωᵢ such that
  E[ωᵢ | Xᵢ = x] = κ f(x)/g(x)
Self-normalised variance
then
  var(δⁿ_h) ≈ (1/(n²κ²)) [ var(Sⁿ_h) − 2 E^π[h] cov(Sⁿ_h, Sⁿ₁) + E^π[h]² var(Sⁿ₁) ] ,
for
  Sⁿ_h = Σ_{i=1}^{n} Wᵢ h(Xᵢ) ,  Sⁿ₁ = Σ_{i=1}^{n} Wᵢ
Rough approximation:
  var(δⁿ_h) ≈ (1/n) var_π(h(X)) { 1 + var_g(W) }
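The ratio estimator only needs f up to a constant, which can be sketched as follows (a minimal illustration; the unnormalized target exp(−x²/2), Cauchy proposal, seed and n are arbitrary choices):

```python
import math
import random

def self_normalized_is(h, log_f_unnorm, g_sample, g_pdf, n, rng):
    # Ratio estimator: sum w_i h(x_i) / sum w_i with w_i = f(x_i)/g(x_i),
    # f known only up to a multiplicative constant
    num = den = 0.0
    for _ in range(n):
        x = g_sample(rng)
        w = math.exp(log_f_unnorm(x)) / g_pdf(x)
        num += w * h(x)
        den += w
    return num / den

# Target: N(0,1) up to a constant; proposal: Cauchy C(0,1)
log_f = lambda x: -x * x / 2
g_sample = lambda r: math.tan(math.pi * (r.random() - 0.5))
g_pdf = lambda x: 1.0 / (math.pi * (1 + x * x))

rng = random.Random(12)
est = self_normalized_is(lambda x: x, log_f, g_sample, g_pdf, 100_000, rng)
# estimates E[X] = 0 without ever computing the normalizing constant
```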
Example (Student's t distribution)
X ~ T(ν, θ, σ²), with density
  f(x) = [Γ((ν + 1)/2) / (σ √(νπ) Γ(ν/2))] (1 + (x − θ)²/(νσ²))^{−(ν+1)/2} .
Without loss of generality, take θ = 0, σ = 1.
Problem: Calculate the integral
  ∫_{2.1}^{∞} (sin(x)/x)ⁿ f(x) dx.
Example (Student's t distribution (2))
Simulation possibilities:
- Directly from f, since f = N(0, 1)/√(χ²_ν/ν)
- Importance sampling using Cauchy C(0, 1)
- Importance sampling using a normal N(0, 1) (expected to be nonoptimal)
- Importance sampling using a U([0, 1/2.1]) change of variables
[Figure: Sampling from f (solid lines), importance sampling with Cauchy instrumental (short dashes), U([0, 1/2.1]) instrumental (long dashes) and normal instrumental (dots)]
IS suffers from curse of dimensionality
As dimension increases, discrepancy between importance and target worsens
Markov Chain Monte Carlo Methods
Monte Carlo Integration
Importance Sampling
Since E[Vn+1] = 1 and Vn+1 independent fromFn,
E(Un+1| F
n) =UnE(Vn+1| F
n) =Un,
and thus{Un}n0 martingaleSi
b J i lit
7/26/2019 Monte Carlo Notes From Main Book
171/584
Sincex x concave, by Jensen s inequality,
E(Un+1 | Fn) E(Un+1 | Fn) Unand thus{Un}n0 supermartingaleAssume E(
Vn+1)< 1. Then
E(Un) =n
k=1
E(Vk) 0, n .
But {√U_n}_{n≥0} is a nonnegative supermartingale and thus √U_n converges a.s. to a random variable Z ≥ 0. By Fatou's lemma,

E(Z) = E( lim_{n→∞} √U_n ) ≤ lim inf_{n→∞} E(√U_n) = 0 .

Hence, Z = 0 and U_n → 0 a.s., which implies that the martingale {U_n}_{n≥0} is not regular.
Apply these results to V_k = (dμ/dν)(ξ_k^i), i ∈ {1, . . . , N}:

E[ √( (dμ/dν)(ξ_k^i) ) ] ≤ √( E[ (dμ/dν)(ξ_k^i) ] ) = 1 ,

with equality iff dμ/dν = 1, ν-a.e., i.e. μ = ν.
Thus all importance weights converge to 0.
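A small simulation illustrates this degeneracy. With the hypothetical choice f = N(0, 1) and g = N(0, 2²), each likelihood ratio V_k = f(X_k)/g(X_k), X_k ∼ g, has mean one under g, yet the running products U_n = V_1 ⋯ V_n collapse to zero:

```python
import numpy as np

rng = np.random.default_rng(1)

def log_ratio(x):
    # log f(x) - log g(x) for f = N(0, 1), g = N(0, 2^2); constants cancel
    return -0.5 * x**2 + 0.125 * x**2 + np.log(2.0)

n, reps = 200, 1000
x = 2.0 * rng.standard_normal((reps, n))   # draws from g
log_U = log_ratio(x).sum(axis=1)           # log U_n = sum_k log V_k

# E_g[V_k] = 1, yet E_g[log V_k] < 0: almost every product U_n is numerically 0
print(np.median(log_U))
```

The median of log U_n is strongly negative, so virtually every weight product has underflowed to zero even though each factor averages to one.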
too volatile!

Example (Stochastic volatility model)

y_t = β exp(x_t/2) ε_t ,  ε_t ∼ N(0, 1)

with AR(1) log-variance process (or volatility)

x_{t+1} = φ x_t + σ u_t ,  u_t ∼ N(0, 1)
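The model is straightforward to simulate forward, which is useful both for generating test data and for checking the importance sampler later on. The parameter values β = 1, φ = 0.9, σ = 0.5 below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)

beta, phi, sigma = 1.0, 0.9, 0.5     # illustrative parameter values (assumed)
T = 500

x = np.empty(T)
x[0] = rng.normal(0.0, sigma / np.sqrt(1 - phi**2))   # stationary initial state
for t in range(1, T):
    x[t] = phi * x[t - 1] + sigma * rng.standard_normal()
y = beta * np.exp(x / 2) * rng.standard_normal(T)     # observations
```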
[Figure: Evolution of IBM stocks (corrected from trend and log-ratio-ed).]
Example (Stochastic volatility model (2))

Observed likelihood unavailable in closed form.
Joint posterior (or conditional) distribution of the hidden state sequence {X_k}_{1≤k≤K} can be evaluated explicitly as

∏_{k=2}^{K} exp( −{ σ⁻²(x_k − φ x_{k−1})² + β⁻² exp(−x_k) y_k² + x_k }/2 ) ,  (2)

up to a normalizing constant.
Computational problems

Example (Stochastic volatility model (3))

Direct simulation from this distribution impossible because of
(a) dependence among the X_k's,
(b) dimension of the sequence {X_k}_{1≤k≤K}, and
(c) exponential term exp(−x_k) y_k² within (2).
Importance sampling

Example (Stochastic volatility model (4))

Natural candidate: replace the exponential term with a quadratic approximation to preserve Gaussianity.
E.g., expand exp(−x_k) around its conditional expectation φ x_{k−1} as

exp(−x_k) ≈ exp(−φ x_{k−1}) { 1 − (x_k − φ x_{k−1}) + ½ (x_k − φ x_{k−1})² }
Example (Stochastic volatility model (5))

Corresponding Gaussian importance distribution with mean

μ_k = [ φ x_{k−1} {σ⁻² + y_k² exp(−φ x_{k−1})/2} − {1 − y_k² exp(−φ x_{k−1})}/2 ] / [ σ⁻² + y_k² exp(−φ x_{k−1})/2 ]

and variance

σ_k² = ( σ⁻² + y_k² exp(−φ x_{k−1})/2 )⁻¹

Prior proposal on X₁,

X₁ ∼ N(0, σ²)
Example (Stochastic volatility model (6))

Simulation starts with X₁ and proceeds forward to X_n, each X_k being generated conditional on Y_k and the previously generated X_{k−1}.
Importance weight computed sequentially as the product of

exp( −{ σ⁻²(x_k − φ x_{k−1})² + exp(−x_k) y_k² + x_k }/2 ) / ( σ_k⁻¹ exp{ −σ_k⁻²(x_k − μ_k)²/2 } )  (1 ≤ k ≤ K)
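This sequential sampler can be sketched as follows, on data simulated from the model itself, with β = 1 and the assumed values φ = 0.9, σ = 0.5 (not fixed by the slides). The local proposal mean and variance follow the quadratic-approximation formulas of the previous slide, and weights are accumulated on the log scale for numerical stability:

```python
import numpy as np

rng = np.random.default_rng(3)
phi, sigma = 0.9, 0.5          # assumed values; beta is taken equal to 1
K, N = 100, 10_000             # path length, number of importance samples

# Simulate observations y_1..y_K from the model itself
x_true = np.zeros(K)
for k in range(1, K):
    x_true[k] = phi * x_true[k - 1] + sigma * rng.standard_normal()
y = np.exp(x_true / 2) * rng.standard_normal(K)

# Forward sequential importance sampling
x = rng.normal(0.0, sigma, N)          # prior proposal X_1 ~ N(0, sigma^2)
logw = np.zeros(N)
paths = [x]
for k in range(1, K):
    # local Gaussian proposal from the quadratic expansion around phi * x_{k-1}
    prec = sigma**-2 + y[k] ** 2 * np.exp(-phi * x) / 2
    var = 1.0 / prec
    mu = phi * x + var * (y[k] ** 2 * np.exp(-phi * x) - 1) / 2
    x_new = rng.normal(mu, np.sqrt(var))
    # log of the target increment minus log of the proposal density
    log_target = -0.5 * ((x_new - phi * x) ** 2 / sigma**2
                         + y[k] ** 2 * np.exp(-x_new) + x_new)
    log_prop = -0.5 * np.log(2 * np.pi * var) - (x_new - mu) ** 2 / (2 * var)
    logw += log_target - log_prop
    x = x_new
    paths.append(x)

w = np.exp(logw - logw.max())              # stabilised importance weights
w /= w.sum()
x_hat = np.array([w @ p for p in paths])   # weighted estimate of each X_k
```

As the slides warn, the normalised weights degenerate as K grows, so only a handful of particles end up carrying the estimate for long series.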
[Figure: Histogram of the logarithms of the importance weights (left) and comparison between the true volatility and the best fit, based on 10,000 simulated importance samples.]
[Figure: Corresponding range of the simulated {X_k}_{1≤k≤100}, compared with the true value.]
Markov Chain Monte Carlo Methods
Monte Carlo Integration
Acceleration methods

Correlated simulations

Negative correlation reduces variance.
Special technique, but efficient when it applies.
Two samples (X₁, . . . , X_m) and (Y₁, . . . , Y_m) from f to estimate

I = ∫_ℝ h(x) f(x) dx

by

Î₁ = (1/m) ∑_{i=1}^{m} h(X_i)  and  Î₂ = (1/m) ∑_{i=1}^{m} h(Y_i)

with mean I and variance σ².
Variance reduction

Variance of the average:

var( (Î₁ + Î₂)/2 ) = σ²/2 + (1/2) cov(Î₁, Î₂) .

If the two samples are negatively correlated,

cov(Î₁, Î₂) ≤ 0 ,

they improve on two independent samples of the same size.
Antithetic variables

- If f symmetric about μ, take Y_i = 2μ − X_i
- If X_i = F⁻¹(U_i), take Y_i = F⁻¹(1 − U_i)
- If (A_i)_i partition of X, partitioned sampling by sampling the X_j's in each A_i (requires the knowledge of Pr(A_i))
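The symmetric case can be sketched as follows, with the illustrative choices X ∼ N(0, 1) and h(x) = eˣ (so μ = 0 and Y_i = −X_i); here cov(h(X), h(−X)) = 1 − e < 0, so the pairing reduces variance relative to independent draws:

```python
import numpy as np

rng = np.random.default_rng(4)
m = 50_000
h = np.exp                     # illustrative h; E[exp(X)] = e^{1/2} for X ~ N(0,1)

x = rng.standard_normal(m)
pairs = (h(x) + h(-x)) / 2     # antithetic pairs: f symmetric about mu = 0, Y = -X
anti = pairs.mean()

x2 = rng.standard_normal(m)    # independent second sample, for comparison
indep = (h(x).mean() + h(x2).mean()) / 2

var_anti = pairs.var(ddof=1) / m                     # variance of antithetic average
var_indep = h(np.concatenate([x, x2])).var(ddof=1) / (2 * m)   # i.i.d. counterpart
print(anti, indep, var_anti, var_indep)
```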
Control variates

out of control!

For

I = ∫ h(x) f(x) dx

unknown and

I₀ = ∫ h₀(x) f(x) dx

known, I₀ estimated by Î₀ and I estimated by Î.
Control variates (2)

Combined estimator

Î* = Î + β (Î₀ − I₀)

Î* is unbiased for I and

var(Î*) = var(Î) + β² var(Î₀) + 2β cov(Î, Î₀)
Optimal control

Optimal choice of β:

β* = − cov(Î, Î₀) / var(Î₀) ,

with

var(Î*) = (1 − ρ²) var(Î) ,

where ρ is the correlation between Î and Î₀.
Usual solution: regression coefficient of the h(x_i)'s over the h₀(x_i)'s.
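A minimal sketch of the optimal control variate, assuming for illustration the target I = E[eˣ] with X ∼ N(0, 1) and the control h₀(x) = x, whose mean I₀ = 0 is known; β* is estimated as minus the empirical regression coefficient:

```python
import numpy as np

rng = np.random.default_rng(5)
m = 100_000

x = rng.standard_normal(m)
hx, h0x = np.exp(x), x          # target integrand and control variate
I0 = 0.0                        # known mean of h0 under N(0,1)

I_hat = hx.mean()
I0_hat = h0x.mean()
beta = -np.cov(hx, h0x)[0, 1] / h0x.var(ddof=1)   # beta* = -cov/var, via regression
I_star = I_hat + beta * (I0_hat - I0)              # combined estimator

print(I_hat, I_star)            # both estimate e^{1/2}; I_star has smaller variance
```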
Example (Quantile approximation)

Evaluate

ϱ = Pr(X > a) = ∫_a^∞ f(x) dx

by

ϱ̂ = (1/n) ∑_{i=1}^{n} I(X_i > a) ,

with X_i iid f.
If Pr(X > μ) = ½ known
Example (Quantile approximation (2))

Control variate

ϱ̃ = (1/n) ∑_{i=1}^{n} I(X_i > a) + β [ (1/n) ∑_{i=1}^{n} I(X_i > μ) − Pr(X > μ) ]

improves upon ϱ̂ if β < 0 and |β| < 2 cov(ϱ̂, ϱ̂₀)/var(ϱ̂₀), with ϱ̂₀ = (1/n) ∑_{i=1}^{n} I(X_i > μ).
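The indicator control variate can be checked numerically. The choices X ∼ N(0, 1), a = 2 and μ = 0 (so that Pr(X > μ) = 1/2 is known exactly) are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(6)
n = 200_000
a, mu = 2.0, 0.0               # X ~ N(0,1): Pr(X > mu) = 1/2 is known exactly

x = rng.standard_normal(n)
d = (x > a).astype(float)      # I(X_i > a)
d0 = (x > mu).astype(float)    # I(X_i > mu), the control variate

rho_hat = d.mean()
beta = -np.cov(d, d0)[0, 1] / d0.var(ddof=1)   # estimated optimal beta (negative)
rho_tilde = rho_hat + beta * (d0.mean() - 0.5)

print(rho_hat, rho_tilde)      # both estimate Pr(X > 2), about 0.0228
```

The gain is modest here because the two indicators are weakly correlated when a is far out in the tail.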
Integration by conditioning

Use the Rao-Blackwell Theorem:

var( E[δ(X) | Y] ) ≤ var( δ(X) )
Consequence

If Î is an unbiased estimator of I = E_f[h(X)], with X simulated from a joint density f(x, y), where

∫ f(x, y) dy = f(x) ,

the estimator

Î* = E_f[Î | Y₁, . . . , Y_n]

dominates Î(X₁, . . . , X_n) variance-wise (and is unbiased).
Example (Student's t expectation)

For

E[h(X)] = E[exp(−X²)] with X ∼ T(ν, 0, σ²) ,

a Student's t distribution can be simulated as

X | y ∼ N(0, σ² y) and Y⁻¹ ∼ χ²_ν/ν .
Example (Student's t expectation (2))

Empirical average

(1/m) ∑_{j=1}^{m} exp(−X_j²)

can be improved from the joint sample

((X₁, Y₁), . . . , (X_m, Y_m))

since

(1/m) ∑_{j=1}^{m} E[exp(−X²) | Y_j] = (1/m) ∑_{j=1}^{m} 1/√(2σ² Y_j + 1)

is the conditional expectation.
In this example, precision ten times better.
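The Rao-Blackwellised estimator is easy to reproduce; ν = 5 and σ = 1 below are illustrative assumptions. The conditional expectation E[exp(−X²) | Y] = (2σ²Y + 1)^(−1/2) follows from X | y ∼ N(0, σ²y):

```python
import numpy as np

rng = np.random.default_rng(7)
m = 10_000
nu, sigma = 5.0, 1.0           # illustrative values; mu = 0 as in the example

# (X, Y) with Y^{-1} ~ chi2_nu / nu and X | y ~ N(0, sigma^2 y)
y = nu / rng.chisquare(nu, m)
x = rng.normal(0.0, sigma * np.sqrt(y))

naive = np.exp(-x**2).mean()                       # plain Monte Carlo average
rb = (1 / np.sqrt(2 * sigma**2 * y + 1)).mean()    # Rao-Blackwellised average

print(naive, rb)
```

Both averages target the same expectation, but the conditioned version integrates out the Gaussian noise analytically and so fluctuates far less.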