Efficient strategy of MCMC in high-dimension and its application to diffusion processes Kengo Kamatani (Osaka Univ. and CREST, JST) Mar 2015 at LeMans




1. New algorithm:

• Markov chain Monte Carlo (MCMC) produces a Markov chain $X_0, \ldots, X_{M-1}$ with a given invariant probability measure $P$. If it is ergodic, we have

\[ M^{-1} \sum_{m=0}^{M-1} f(X_m) \to P(f) = \int f(x)\, P(dx). \]

We can approximate $P(f)$ by the empirical average.

• MCMC ∋ RWM, Gibbs, MALA, Slice Sampler, HMC, etc.

• Almost all MCMC methods satisfy reversibility, i.e., if $X_0 \sim P(dx)$,

\[ \mathcal{L}(X_0, X_1, \ldots, X_M) = \mathcal{L}(X_M, X_{M-1}, \ldots, X_0). \]


1-a. RWM Algorithm: Let $P(dx) = p(x)\,dx$ be the target on $\mathbb{R}^d$.

1. Generate $x^* = x + w$ where $w \sim N_d(0, \sigma^2 I_d) = \Gamma_d$.

2. Accept $x^*$ as the next state with probability $\alpha(x, x^*)$, and otherwise discard $x^*$, where

\[ \alpha(x, x^*) = \min\left\{1, \frac{p(x^*)}{p(x)}\right\}. \]

The proposal kernel [ $x$ to $x^*$ ] is reversible with respect to the uniform distribution on $\mathbb{R}^d$.
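The two RWM steps can be sketched in a few lines of Python. This is a minimal illustration, not the speaker's implementation: the function name `rwm`, the standard normal test target, the dimension, the seed and the step size 2.38/sqrt(d) are all assumptions chosen for the example. The acceptance test is done on the log scale to avoid overflow.

```python
import math
import random

def rwm(log_p, x0, sigma, n_iter, rng):
    """Gaussian random-walk Metropolis; returns the chain and the acceptance rate."""
    x = list(x0)
    lp = log_p(x)
    chain, accepted = [], 0
    for _ in range(n_iter):
        # Step 1: propose x* = x + w with w ~ N_d(0, sigma^2 I_d).
        x_star = [xi + sigma * rng.gauss(0.0, 1.0) for xi in x]
        lp_star = log_p(x_star)
        # Step 2: accept with probability min{1, p(x*)/p(x)}.
        # 1 - rng.random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) <= lp_star - lp:
            x, lp = x_star, lp_star
            accepted += 1
        chain.append(x)
    return chain, accepted / n_iter

def log_std_normal(x):
    # Log density of N_d(0, I_d) up to an additive constant (illustrative target).
    return -0.5 * sum(xi * xi for xi in x)

d = 10
rwm_chain, rwm_acc = rwm(log_std_normal, [0.0] * d, 2.38 / math.sqrt(d),
                         5000, random.Random(0))
# Empirical average of ||x||^2, which should be close to d for this target.
rwm_mean_sq_norm = sum(sum(xi * xi for xi in x) for x in rwm_chain) / len(rwm_chain)
```

Only evaluations of log p are needed, so any unnormalised target can be passed as `log_p`.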


1-b. pCN Algorithm: Fix $\rho \in (0,1)$. For $x = (x_1, \ldots, x_d) \in \mathbb{R}^d$ let $\|x\| = (\sum_{i=1}^d x_i^2)^{1/2}$.

1. Generate $x^* = \rho^{1/2} x + (1-\rho)^{1/2} w$ where $w \sim N_d(0, I_d)$.

2. Accept $x^*$ with probability $\alpha(x, x^*)$ where

\[ \alpha(x, x^*) = \min\left\{1, \frac{p(x^*)\,\phi(x)}{p(x)\,\phi(x^*)}\right\}, \]

where $\phi$ is the pdf of $N_d(0, I_d)$.

The proposal kernel [ $x$ to $x^*$ ] is reversible with respect to $N_d(0, I_d)$.
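A minimal Python sketch of the pCN steps follows. The function name `pcn`, the value rho = 0.8, the dimension and the seed are illustrative assumptions. The ratio p/phi is handled on the log scale, and the normalising constant of phi cancels in the acceptance ratio.

```python
import math
import random

def pcn(log_p, x0, rho, n_iter, rng):
    """pCN sampler; returns the chain and the acceptance rate."""
    def log_weight(z):
        # log of p(z)/phi(z) up to an additive constant.
        return log_p(z) + 0.5 * sum(zi * zi for zi in z)

    x = list(x0)
    lw = log_weight(x)
    a, b = math.sqrt(rho), math.sqrt(1.0 - rho)
    chain, accepted = [], 0
    for _ in range(n_iter):
        # Step 1: x* = rho^{1/2} x + (1 - rho)^{1/2} w with w ~ N_d(0, I_d).
        x_star = [a * xi + b * rng.gauss(0.0, 1.0) for xi in x]
        lw_star = log_weight(x_star)
        # Step 2: accept with probability min{1, p(x*)phi(x) / (p(x)phi(x*))}.
        if math.log(1.0 - rng.random()) <= lw_star - lw:
            x, lw = x_star, lw_star
            accepted += 1
        chain.append(x)
    return chain, accepted / n_iter

def log_std_normal(x):
    # Log density of N_d(0, I_d) up to an additive constant (illustrative target).
    return -0.5 * sum(xi * xi for xi in x)

d = 10
pcn_chain, pcn_acc = pcn(log_std_normal, [0.0] * d, 0.8, 5000, random.Random(1))
pcn_mean_sq_norm = sum(sum(xi * xi for xi in x) for x in pcn_chain) / len(pcn_chain)
```

For the standard normal target the weight p/phi is constant, so every proposal is accepted. This reflects the reversibility of the proposal kernel with respect to the standard normal distribution.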


1-c. MpCN Algorithm (New method):

1. Generate $r \sim \mathrm{Gamma}(d/2, \|x\|^2/2)$.

2. Generate $x^* = \rho^{1/2} x + (1-\rho)^{1/2} r^{-1/2} w$ where $w \sim N_d(0, I_d)$.

3. Accept $x^*$ with probability $\alpha(x, x^*)$ where

\[ \alpha(x, x^*) = \min\left\{1, \frac{p(x^*)\,\|x\|^{-d}}{p(x)\,\|x^*\|^{-d}}\right\}. \]

The proposal kernel [ $x$ to $x^*$ ] is reversible with respect to $\|x\|^{-d}\,dx$.
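The three MpCN steps can be sketched as below. Assumptions made for this illustration: the second argument of the Gamma distribution is read as a rate, which is the reading under which the proposal is reversible with respect to the measure with density ||x||^{-d}. Python's `random.gammavariate` takes a scale, so the rate is inverted. The function name `mpcn`, rho = 0.8, the standard normal test target and the seed are also illustrative choices. The starting point must be non-zero.

```python
import math
import random

def mpcn(log_p, x0, rho, n_iter, rng):
    """MpCN sampler; returns the chain and the acceptance rate."""
    d = len(x0)

    def sq_norm(z):
        return sum(zi * zi for zi in z)

    def log_weight(z):
        # log of p(z) * ||z||^d, the target density relative to ||z||^{-d} dz.
        return log_p(z) + 0.5 * d * math.log(sq_norm(z))

    x = list(x0)  # must be non-zero, otherwise ||x||^{-d} is undefined
    lw = log_weight(x)
    a, b = math.sqrt(rho), math.sqrt(1.0 - rho)
    chain, accepted = [], 0
    for _ in range(n_iter):
        # Step 1: r ~ Gamma(shape d/2, rate ||x||^2/2), i.e. scale 2/||x||^2.
        r = rng.gammavariate(0.5 * d, 2.0 / sq_norm(x))
        # Step 2: x* = rho^{1/2} x + (1 - rho)^{1/2} r^{-1/2} w.
        s = b / math.sqrt(r)
        x_star = [a * xi + s * rng.gauss(0.0, 1.0) for xi in x]
        lw_star = log_weight(x_star)
        # Step 3: accept with probability
        # min{1, p(x*) ||x||^{-d} / (p(x) ||x*||^{-d})}.
        if math.log(1.0 - rng.random()) <= lw_star - lw:
            x, lw = x_star, lw_star
            accepted += 1
        chain.append(x)
    return chain, accepted / n_iter

def log_std_normal(x):
    # Log density of N_d(0, I_d) up to an additive constant (illustrative target).
    return -0.5 * sum(xi * xi for xi in x)

d = 10
mpcn_chain, mpcn_acc = mpcn(log_std_normal, [1.0] * d, 0.8, 5000, random.Random(2))
mpcn_mean_sq_norm = sum(sum(xi * xi for xi in x) for x in mpcn_chain) / len(mpcn_chain)
```

Compared with pCN, the only extra work per iteration is one Gamma draw, and the proposal scale adapts to the current norm of x.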


Note for application to Bayesian inference for complicated models

• $x = \theta$ and $P(dx) = P(d\theta \mid X_n) = p(\theta \mid X_n)\,d\theta$.

• For many advanced MCMC methods, we need to calculate $(\log p(x))'$ ≈ (score function) in each iteration (e.g. $10^6$ times!). Sometimes we also need to calculate $(\log p(x))''$.

• The previous three methods are attractive from this point of view, since they require no derivatives, as long as their performance is good.


2. Application

2-a. Toy examples; P (dx) = standard normal distribution

[Figure: for each of Gaussian RWM, pCN and MpCN, the trajectory of (||x||^2−d)/sqrt(2d) over 1000 iterations and its autocorrelation function (acf) up to lag 100.]

2-a. Toy examples; t-distribution

[Figure: for each of Gaussian RWM, pCN and MpCN, the trajectory of ||x||^2/d over 1000 iterations and its autocorrelation function (acf) up to lag 10000.]

2-b. Stochastic processes

Realistic examples, computed in R with the Yuima package. We consider Bayesian parameter estimation for discretely observed stochastic processes.

Note

• LA (likelihood analysis) is not available, so we use QLA (quasi-likelihood analysis).

• QLA has been studied extensively. See Yoshida [9] and references therein.


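Consider

\[ dX_t = a(X_t, \theta)\,dt + b(X_t)\,dW_t; \quad X_0 = 2,\ t \in [0, T] \]

where

\[ a(x, \theta) = \theta_1 - \theta_2 x + 2\sin(\theta_3 x), \qquad b(x) = \frac{0.5 + x^2}{1 + 0.3x^2}. \]

$N = 5000$, $T = 250$.

\[ P(d\theta \mid X^N) \propto \exp\left(-\frac{1}{2} \sum_{n=1}^{N} \frac{\left(X_{nh} - X_{(n-1)h} - a(X_{(n-1)h}, \theta)\,h\right)^2}{h\, b(X_{(n-1)h})^2}\right) P(d\theta) \]

where $h = T/N$ ($Nh^3 = 0.625$). The true value is $\theta = (3, 7, 5)$.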
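The exponent of the quasi-posterior above is cheap to evaluate, which is what makes derivative-free samplers convenient here. The sketch below is in Python rather than R/Yuima and is only an illustration. It assumes the data are generated by an Euler-Maruyama scheme on the observation grid (the slides do not say how the data were simulated), and the function names and the seed are hypothetical.

```python
import math
import random

def drift(x, theta):
    # a(x, theta) = theta_1 - theta_2 x + 2 sin(theta_3 x)
    t1, t2, t3 = theta
    return t1 - t2 * x + 2.0 * math.sin(t3 * x)

def diffusion(x):
    # b(x) = (0.5 + x^2) / (1 + 0.3 x^2)
    return (0.5 + x * x) / (1.0 + 0.3 * x * x)

def simulate(theta, n_obs, horizon, x0, rng):
    """Euler-Maruyama path with step h = horizon / n_obs (an assumed data generator)."""
    h = horizon / n_obs
    path = [x0]
    for _ in range(n_obs):
        x = path[-1]
        noise = diffusion(x) * math.sqrt(h) * rng.gauss(0.0, 1.0)
        path.append(x + drift(x, theta) * h + noise)
    return path

def quasi_loglik(theta, path, h):
    """Exponent of the quasi-posterior, without the prior term."""
    total = 0.0
    for x_prev, x_next in zip(path, path[1:]):
        resid = x_next - x_prev - drift(x_prev, theta) * h
        total += resid * resid / (h * diffusion(x_prev) ** 2)
    return -0.5 * total

N, T = 5000, 250.0
theta_true = (3.0, 7.0, 5.0)
path = simulate(theta_true, N, T, 2.0, random.Random(0))
ql_true = quasi_loglik(theta_true, path, T / N)
ql_far = quasi_loglik((0.0, 0.0, 0.0), path, T / N)
```

Adding the log prior to `quasi_loglik` gives a function that can be passed as the log target to any of the three samplers.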


• Generate a discrete observation $X^N$ from the model for a true parameter.

• Run each MCMC for $M = 10^5$ iterations from 100 different starting points.

• Plot the empirical average of each of the 100 trials, which approximates $\int \theta\, P(d\theta \mid X^N)$.

• We compare RWM ($\sigma = 1.5, 2, 4$), pCN and MpCN.


[Figures: for each method, two scatter plots of the 100 points (one per starting point) in the (b1, b3) plane, with a colour legend ll ranging from 6600 to 6660. The first panel has axis ticks from 0 to 8 for b1 and from 0 to 50 for b3. The second panel is a zoom with ticks from 3.2 to 3.5 for b1 and from 4.6 to 4.9 for b3. The panel titles report how many of the 100 points are out of the plotted region:]

• RWMH sd = 1.5: 0/100 (first panel), 89/100 (zoom)

• RWMH sd = 2: 0/100 (first panel), 88/100 (zoom)

• RWMH sd = 4: 0/100 (first panel), 50/100 (zoom)

• pCN: 54/100 (first panel), 97/100 (zoom)

• MpCN: 0/100 (first panel), 12/100 (zoom)

• Optim: 0/100 (first panel), 89/100 (zoom)

MpCN

• (Essentially) No tuning parameter.

• No derivative.

• Good performance.


3. Theoretical results

• We study high-dimensional asymptotics (HDA) for MCMC.

• HDA is a "strong assumption ⇒ strong conclusion" type of framework.

• HDA was developed by Gelman et al. [5] and Roberts et al. [8] (RGG97).


3-a. What is HDA? RGG97’s results (Here, X is the parameter.)

• They considered asymptotic properties of a $d$-dimensional Markov chain $X^d = (X^d_m)_{m \in \mathbb{N}_0}$ as $d \to \infty$, where $X^d \sim$ Gaussian RWM.

• Set

\[ P_d(dx) = \prod_{i=1}^{d} f(x_i)\,dx_i \quad (x = (x_1, \ldots, x_d)). \]

• Under some regularity conditions on $f$, $P_d \approx N_d(0, \sigma^2 I_d)$.


• Introduce the time scaling $t \mapsto [dt]$, where $[x]$ is the integer part of $x$, and consider

\[ X^d_{[dt]}. \]

• Introduce the projection $\pi_E(x) = (x_i)_{i \in E}$ for $E \subset \{1, \ldots, d\}$ where $x = (x_1, \ldots, x_d)$. Ex.

\[ \pi_{\{3,5,10\}}(x) = (x_3, x_5, x_{10}) \quad \text{if } E = \{3, 5, 10\} \]

and consider

\[ Y^d_t := \pi_{\{1\}}(X^d_{[dt]}). \]

• Introduce the proposal scaling

\[ \sigma^2 = l^2/d. \]


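Theorem (RGG97). $Y^d \Rightarrow Y$ where

\[ dY_t = h(l)\,\frac{(\log f)'(Y_t)}{2}\,dt + \sqrt{h(l)}\,dW_t \]

where

\[ h(l) = 2 l^2\, \Phi\!\left(-\frac{l\sqrt{I}}{2}\right), \qquad I = \int \left\{(\log f)'(x)\right\}^2 f(x)\,dx. \]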


Interpretation of RGG97’s result

• The rate of convergence is $d$. Thus the number of iterations should be proportional to $d$.

• For the limit process $Y$, the convergence rate is determined by $h(l)$.

• The function $h(l)$ is maximised when the average acceptance probability is approximately 0.23.

The result gives a criterion for constructing a good RWM.
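The 0.23 figure can be reproduced numerically from the formula for h(l). The sketch below takes I = 1 (a standard normal f) and maximises h on a grid. The limiting acceptance probability 2Φ(−l√I/2) is the expression from RGG97 and is not stated on the slides, and the grid and function names are illustrative choices.

```python
import math

def std_normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def speed(l, fisher_info=1.0):
    # h(l) = 2 l^2 Phi(-l sqrt(I) / 2)
    return 2.0 * l * l * std_normal_cdf(-0.5 * l * math.sqrt(fisher_info))

def acceptance(l, fisher_info=1.0):
    # Limiting average acceptance probability 2 Phi(-l sqrt(I) / 2) (from RGG97).
    return 2.0 * std_normal_cdf(-0.5 * l * math.sqrt(fisher_info))

# Grid search over l in (0, 10] with step 0.001.
grid = [i / 1000.0 for i in range(1, 10001)]
l_opt = max(grid, key=speed)
acc_opt = acceptance(l_opt)
print(f"l_opt = {l_opt:.3f}, acceptance = {acc_opt:.3f}")
```

For I = 1 the maximiser is close to l = 2.38, which corresponds to an acceptance probability of about 0.234.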


After the seminal paper RGG97, there have been many studies generalizing the result.

Generalization of $P_d$: non-i.i.d. targets (Bedard [2]), perturbations of Gaussian targets (Beskos et al. [4]), etc.

Better convergence rate: Metropolis adjusted Langevin algorithm (MALA, $d^{1/3}$, Roberts and Rosenthal [7]), Hybrid Monte Carlo ($d^{1/4}$, Beskos et al. [3]), Metropolis-Coupled MCMC (Atchade et al. [1]), etc.


Our plan:

• (Perturbation of) Gaussian = ideal situation. Heavy-tail ≈ realistic, non-ideal situation. We want to know the rate of convergence (time scaling). It is $d$ for RWM in the Gaussian case.

• We want to construct an MCMC method that works well for a difficult target distribution.

• We only consider a special class of heavy-tailed distributions. This lets us apply Stein’s techniques and Malliavin calculus.


3-b. Setting

• $P_d$ is a scale mixture of the Gaussian distribution;

\[ P_d = \mathcal{L}(X^d_0), \quad \text{where } X^d_0 \mid Y \sim N_d(0, I_d Y),\ Y \sim Q(dy). \]

• The class of $P_d$ ∋ $N_d(0, I_d)$, Student t-distribution and the stable distribution.

If $P_d$ is heavy-tailed, the rate of convergence is difficult to define.
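A short Python sketch of sampling from such a scale mixture is given below. As an example it uses the standard representation of the multivariate Student t-distribution with nu degrees of freedom, where Y = nu / chi^2_nu. The function names, nu = 3, the dimension and the seed are illustrative assumptions.

```python
import math
import random

def sample_scale_mixture(d, sample_y, rng):
    """Draw Y ~ Q, then X | Y ~ N_d(0, Y I_d); returns (x, y)."""
    y = sample_y(rng)
    x = [math.sqrt(y) * rng.gauss(0.0, 1.0) for _ in range(d)]
    return x, y

def student_t_mixing(nu):
    """Mixing law Q for the multivariate Student t with nu degrees of freedom."""
    def sample_y(rng):
        # chi^2_nu is Gamma(shape nu/2, scale 2), so Y = nu / chi^2_nu.
        return nu / rng.gammavariate(0.5 * nu, 2.0)
    return sample_y

rng = random.Random(0)
d = 2000
x, y = sample_scale_mixture(d, student_t_mixing(3.0), rng)
# For large d, ||x||^2 / d concentrates around the mixing variable y.
r_d = sum(xi * xi for xi in x) / d
```

The last line illustrates why ||x||^2/d is a natural summary in high dimension: conditionally on Y it is an average of d i.i.d. terms with mean Y.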


• (Usual) Consistency

$\xi^d = (\xi^d_m)_{m \in \mathbb{N}_0}$ ($d \in \mathbb{N}$) is consistent if

\[ \frac{1}{M} \sum_{m=0}^{M-1} f(\xi^d_m) - \int f(x)\, \Pi_d(dx) = o_P(1) \quad (M, d \to \infty) \]

for any bounded continuous function $f$ (K. 2014 [6]).

• Since the dimension grows as $d \to \infty$, this definition is not suitable for the current study. We generalize it.


• If, for any bounded continuous function $f : \mathbb{R}^k \to \mathbb{R}$ and for any sequence $E^k_d \subset \{1, \ldots, d\}$ s.t. $\sharp E^k_d = k$,

\[ \frac{1}{M_d} \sum_{m=0}^{M_d - 1} f \circ \pi_{E^k_d}(X^d_m) - \int f \circ \pi_{E^k_d}(x)\, P_d(dx) = o_P(1) \]

for any $M_d \to \infty$, then we call $(X^d)_d$ consistent.

• If the above holds for all $M_d$ such that $M_d/T_d \to \infty$, then we call $T_d$ the convergence rate.

• This is just a formalisation of the rate of convergence used in the HDA community.


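3-c. Gaussian case; $P_d = N_d(0, I_d)$

• Let $\mu_k(\sigma) = E[|\xi|^k \exp(-\xi^+)]$ for $\xi \sim N(\sigma^2/2, \sigma^2)$, and $\xi^+ = \max\{0, \xi\}$.

• Let

\[ r_d(x) = \sqrt{d}\left(\frac{\|x\|^2}{d} - 1\right) \quad (x \in \mathbb{R}^d). \]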
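The quantities μ_k(σ) are one-dimensional Gaussian expectations and are easy to evaluate numerically. The sketch below uses a plain trapezoidal rule. The grid size, the truncation at ten standard deviations and the function names are illustrative choices. As a sanity check, a direct Gaussian computation gives μ_0(σ) = 2Φ(−σ/2), the expected acceptance probability of a proposal whose log acceptance ratio is −ξ.

```python
import math

def mu(k, sigma, n=20001, width=10.0):
    """mu_k(sigma) = E[|xi|^k exp(-xi^+)] for xi ~ N(sigma^2/2, sigma^2)."""
    mean = 0.5 * sigma * sigma
    lo, hi = mean - width * sigma, mean + width * sigma
    step = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        xi = lo + i * step
        dens = math.exp(-0.5 * ((xi - mean) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
        val = abs(xi) ** k * math.exp(-max(0.0, xi)) * dens
        # Trapezoidal rule: half weight at the two end points.
        total += (0.5 if i in (0, n - 1) else 1.0) * val
    return total * step

def std_normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

l = 2.38
mu0, mu1, mu2 = mu(0, l), mu(1, l), mu(2, l)
mu0_closed_form = 2.0 * std_normal_cdf(-0.5 * l)
```

Here `mu0` and `mu0_closed_form` should agree up to the quadrature error.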


Proposition (Gaussian RWM). Consider the Gaussian RWM and set $\sigma^2 = l^2/d$. Set $Y^d_t = r_d(X^d_{[dt]})$. Then $Y^d \Rightarrow Y$ where

\[ dY_t = -\frac{\sigma(l)^2}{4}\, Y_t\,dt + \sigma(l)\,dW_t; \quad Y_0 \sim N(0, 2), \]

where $\sigma(l)^2 = 4\mu_2(l)$. Hence the Gaussian RWM is weakly consistent with the rate $d$.

Theorem (Optimality). The above RWM attains the optimal rate among all the RWM algorithms.

Proposition. Both pCN and MpCN algorithms have the rate 1.


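The key of the proof is reversibility.

\[ P\left(\left|\|X^d_1\|^2 - \|X^d_0\|^2\right| > \epsilon\right) = 2P\left(\|X^d_1\|^2 - \|X^d_0\|^2 < -\epsilon\right) \le 2P\left(\|X^d_0 + W^d_1\|^2 - \|X^d_0\|^2 < -\epsilon\right) = 2P\left(2Z^d < -\epsilon\right) \]

where

\[ Z^d := \frac{\|X^d_0 + W^d_1\|^2 - \|X^d_0\|^2}{2} = \langle X^d_0, W^d_1 \rangle + \frac{\|W^d_1\|^2}{2}. \]

We have

\[ Z^d = \frac{1}{2}\left(\|W^d_1\| + \left\langle X^d_0, \frac{W^d_1}{\|W^d_1\|} \right\rangle\right)^2 - \frac{1}{2}\left\langle X^d_0, \frac{W^d_1}{\|W^d_1\|} \right\rangle^2 \ge -\frac{1}{2}\left\langle X^d_0, \frac{W^d_1}{\|W^d_1\|} \right\rangle^2. \]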


3-d. Heavy-tail case

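Set

\[ r_d(x) = \frac{\|x\|^2}{d} \quad (x \in \mathbb{R}^d). \]

Proposition. Let $\Gamma_d = N_d(0, l^2 I_d/d)$ and set $Y^d_t = r_d(X^d_{[d^2 t]})$. Then $Y^d \Rightarrow Y$ where

\[ dY_t = a(Y_t)\,dt + \sqrt{b(Y_t)}\,dW_t; \quad Y_0 \sim Q \]

where

\[ a(y) = 2\left(y + (\log q)'(y)\, y^2\right)\mu_2(l/\sqrt{y}) + l^2 \mu_1(l/\sqrt{y}), \qquad b(y) = 4y^2 \mu_2(l/\sqrt{y}). \]

In particular, the Gaussian RWM has the rate $d^2$.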


Theorem. The above RWM attains the optimal rate for the weak consistency. Thus $d^2$ is the optimal rate of RWM.

Proposition. In this case, pCN does not have any polynomial rate and MpCN has the rate $d$.

Summary

          Light-tail    Heavy-tail
RWM       $d$           $d^2$
pCN       1             ∞
MpCN      1             $d$


Summary

• We propose a new MCMC algorithm, MpCN algorithm.

• It works well for both toy models, and stochastic process

examples.

• High-dimensional asymptotic theory was provided.


[1] Yves F. Atchade, Gareth O. Roberts, and Jeffrey S. Rosenthal. Towards optimal scaling of metropolis-coupled markov chain monte carlo. Statistics and Computing, 21(4):555–568, October 2011. ISSN 0960-3174. doi: 10.1007/s11222-010-9192-1. URL http://dx.doi.org/10.1007/s11222-010-9192-1.

[2] Mylene Bedard. Weak convergence of Metropolis algorithms for non-i.i.d. target distributions. Ann. Appl. Probab., 17(4):1222–1244, 2007. ISSN 1050-5164. doi: 10.1214/105051607000000096. URL http://dx.doi.org/10.1214/105051607000000096.

[3] A. Beskos, N. Pillai, G.O. Roberts, J.-M. Sanz-Serna, and A.M. Stuart. Optimal tuning of hybrid monte-carlo. to appear, 2013.

[4] Alexandros Beskos, Gareth Roberts, and Andrew Stuart. Optimal scalings for local Metropolis-Hastings chains on nonproduct targets in high dimensions. Ann. Appl. Probab., 19(3):863–898, 2009. ISSN 1050-5164. doi: 10.1214/08-AAP563. URL http://dx.doi.org/10.1214/08-AAP563.

[5] A. Gelman, G. O. Roberts, and W. R. Gilks. Efficient Metropolis jumping rules. In Bayesian statistics, 5 (Alicante, 1994), Oxford Sci. Publ., pages 599–607. Oxford Univ. Press, New York, 1996.

[6] Kengo Kamatani. Local consistency of Markov chain Monte Carlo methods. Ann. Inst. Statist. Math., 66(1):63–74, 2014. ISSN 0020-3157. doi: 10.1007/s10463-013-0403-3.

[7] Gareth O. Roberts and Jeffrey S. Rosenthal. Optimal scaling of discrete approximations to langevin diffusions. J. R. Stat. Soc. Ser. B Stat. Methodol., 60(1):255–268, 1998. ISSN 1467-9868. doi: 10.1111/1467-9868.00123.

[8] Gareth O. Roberts, Andrew Gelman, and Walter R. Gilks. Weak convergence and optimal scaling of random walk Metropolis algorithms. Ann. Appl. Probab., 7(1):110–120, 1997. ISSN 1050-5164. doi: 10.1214/aoap/1034625254.

[9] Nakahiro Yoshida. Polynomial type large deviation inequalities and quasi-likelihood analysis for stochastic differential equations. Ann. Inst. Statist. Math., 63(3):431–479, 2011. ISSN 0020-3157. doi: 10.1007/s10463-009-0263-z.