1
Variance Reduction via Lattice Rules
By Pierre L’Ecuyer and Christiane Lemieux
Presented by Yanzhi Li
2
Outline
• Motivation
• Lattice rules
• Functional ANOVA decomposition
• Lattice selection criterion
• Random shifts
• Examples
• Conclusions
3
Motivation - MC
$\mu = E[f(U)] = \int_{[0,1)^t} f(u)\,du$, where U is a t-dimensional vector of i.i.d. Unif(0,1) r.v.'s
Monte Carlo method (MC):
• Sample n points $u_0,\dots,u_{n-1}$ uniformly in $[0,1)^t$ and estimate $\mu$ by $Q_n = \frac{1}{n}\sum_{i=0}^{n-1} f(u_i)$
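As a concrete sketch of this estimator (the integrand below is a toy function with known mean, chosen purely for illustration; it is not from the paper):

```python
import random

def mc_estimate(f, t, n, seed=42):
    """Plain Monte Carlo: average f over n i.i.d. uniform points in [0,1)^t."""
    rng = random.Random(seed)
    return sum(f([rng.random() for _ in range(t)]) for _ in range(n)) / n

# Toy integrand (illustrative assumption): f(u) = sum(u_j^2), so E[f(U)] = t/3.
est = mc_estimate(lambda u: sum(x * x for x in u), t=5, n=10000)
# est lands within a few multiples of sigma/sqrt(n) of the exact value 5/3
```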
4
Motivation - MC
MC gives an $O(n^{-1/2})$ convergence rate for the error; MC performs worse for larger t.
• A large number of points is required for the sample to cover $[0,1)^t$ evenly
Can we do better? Quasi-Monte Carlo (QMC):
• Constructs the point set $P_n$ more evenly and uses a relatively smaller number of points
5
Given fixed number of points, which one is better?
[Figure: point sets of the same size n generated by MC (left) and QMC (right)]
6
Motivation - QMC
$D(P_n)$: measures the non-uniformity of $P_n$, i.e., the discrepancy between the empirical distribution of $P_n$ and the uniform distribution
Koksma–Hlawka bound: $|Q_n - \mu| \le V(f)\,D^*(P_n) = V(f)\,O(n^{-1}(\ln n)^t)$, where $V(f)$ is the total variation of f and $D^*(P_n)$ is the rectangular star discrepancy
($D^*(P_n) = \sup_J \left|\,\mathrm{card}(P_n \cap J)/n - \mathrm{vol}(J)\,\right|$, where J ranges over rectangular boxes $[0,u_1)\times\cdots\times[0,u_t)$)
QMC performs better than MC asymptotically
7
Motivation - QMC
Drawback
• For larger t, the convergence rate is better than that of MC only for impractically large values of n
• $D^*(P_n)$ is difficult to compute
• The bound is very loose for typical functions
Good news:
• Low-discrepancy point sets seem to effectively reduce the integration error, even for larger t
8
Motivation - Questions
Why does QMC perform better than MC empirically?
• Fourier expansion
• Traditional error bound
• Variance reduction viewpoint
How to select quasi-random point sets $P_n$? In particular, how to select an integration lattice?
9
Lattice Rules
(Integration) lattice
• A discrete subset of the real space $R^t$ closed under addition and subtraction: $L_t = \{\sum_{j=1}^t z_j v_j : z_j \in Z\}$ for a basis $v_1,\dots,v_t$ of $R^t$
• Contains $Z^t$
• Dual lattice: $L_t^* = \{ h \in R^t : h \cdot v \in Z \text{ for all } v \in L_t \}$
• Lattice rule: an integration method that approximates $\mu$ by $Q_n$ using the node set $P_n = L_t \cap [0,1)^t$
10
Being a lattice doesn’t mean the points are well distributed
11
Lattice Rules - Integration error
Fourier expansion of f:
• $f(u) = \sum_{h \in Z^t} \hat f(h)\, e^{2\pi i\, h \cdot u}$, with $\hat f(h) = \int_{[0,1)^t} f(u)\, e^{-2\pi i\, h \cdot u}\, du$
For a lattice point set: $Q_n - \mu = \sum_{0 \ne h \in L_t^*} \hat f(h)$
12
Functional ANOVA Decomposition
Writes f(u) as a sum of orthogonal functions: $f(u) = \sum_{I \subseteq \{1,\dots,t\}} f_I(u)$, where $f_\emptyset = \mu$
The variance $\sigma^2$ decomposes as $\sigma^2 = \sum_{\emptyset \ne I \subseteq \{1,\dots,t\}} \sigma_I^2$, where $\sigma_I^2 = \mathrm{Var}[f_I(U)]$
13
Functional ANOVA Decomposition
The best mean-square approximation of f(·) by a sum of d-dimensional functions is $\sum_{|I| \le d} f_I(\cdot)$
f has a low effective dimension in the superposition sense when the approximation is good for small d, which is frequent
This suggests that the point sets $P_n$ should be chosen on the basis of the quality of the distribution of the points over the subspaces I that are deemed important
When |I| is small, it is possible to make sure $P_n(I)$ covers the subspace very well
14
Lattice Selection Criterion
It is desirable that $P_n$ is (for a rank-1 lattice):
• Fully projection-regular, i.e., for any non-empty $I \subseteq \{1,\dots,t\}$, $P_n(I)$ contains as many distinct points as $P_n$
• Or dimension-stationary, i.e., $P_n(\{i_1,\dots,i_d\}) = P_n(\{i_1+j,\dots,i_d+j\})$ for all $i_1,\dots,i_d$ and j
One fully projection-regular example (a Korobov rule) is:
• $P_n = \{ (j/n)\,v \bmod 1 : 0 \le j < n \}$ for $v = (1, a, a^2, \dots, a^{t-1})$, where a is an integer, 0 < a < n, and gcd(a, n) = 1
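A minimal sketch of this Korobov construction (the parameter values n = 101, a = 12 are arbitrary choices for illustration, not from the paper):

```python
from math import gcd

def korobov_points(n, t, a):
    """Rank-1 lattice point set P_n = {(j/n) v mod 1 : 0 <= j < n}
    with generating vector v = (1, a, a^2, ..., a^(t-1)) mod n."""
    assert 0 < a < n and gcd(a, n) == 1
    v = [pow(a, k, n) for k in range(t)]   # v_k = a^k mod n
    return [[(j * vk % n) / n for vk in v] for j in range(n)]

P = korobov_points(n=101, t=4, a=12)
# Full projection-regularity: since n is prime, every one-dimensional
# projection of P already contains n distinct values.
assert all(len({p[i] for p in P}) == len(P) for i in range(4))
```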
15
Lattice Selection Criterion
$L_t(I)$ is a lattice => its points are contained in families of equidistant parallel hyperplanes
Choose the family whose hyperplanes are farthest apart and let $d_t(I)$ be the distance between successive hyperplanes
$d_t(I) = 1/\ell_I$, where $\ell_I$ is the Euclidean length of the shortest nonzero vector in the dual lattice $L_t^*(I)$, which has a tight upper bound $\ell_d^*(n) = c_d\, n^{1/d}$, where d = |I|
Define a figure of merit $\ell_I / \ell_d^*(n)$ so that we can compare the quality of projections of different dimensions
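For tiny examples, $\ell_I$ and $d_t(I)$ can be found by brute force over candidate dual vectors (real implementations use the spectral test with lattice-basis reduction; the n, v values below are illustrative assumptions):

```python
from itertools import product
from math import sqrt

def shortest_dual_vector(n, v, bound=10):
    """Length of the shortest nonzero h in the dual lattice
    {h in Z^t : h . v = 0 (mod n)}, searched over |h_i| <= bound."""
    best = None
    for h in product(range(-bound, bound + 1), repeat=len(v)):
        if any(h) and sum(hi * vi for hi, vi in zip(h, v)) % n == 0:
            length = sqrt(sum(hi * hi for hi in h))
            if best is None or length < best:
                best = length
    return best

# 2-dimensional projection I = {1, 2} of a Korobov lattice: n = 101, v = (1, 12)
l_I = shortest_dual_vector(101, [1, 12])
d_I = 1 / l_I   # distance between adjacent parallel hyperplanes covering P_n(I)
```

Here the search is exhaustive over a small box, so it is only usable for small n and |I|.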
16
Lattice Selection Criterion
Minimize $d_t(I)$ <=> maximize $\ell_I / \ell_d^*(n)$
Worst-case figure of merit:
• For arbitrary $d \ge 1$ and $t_1 \ge \dots \ge t_d \ge d$, define
$M_{t_1,\dots,t_d} = \min\left[\ \min_{d < s \le t_1} \frac{\ell_{\{1,\dots,s\}}}{\ell_s^*(n)},\ \ \min_{2 \le s \le d}\ \min_{I \in S(s,t_s)} \frac{\ell_I}{\ell_s^*(n)}\ \right]$, where $S(s,t_s) = \{ I = \{i_1,\dots,i_s\} : 1 = i_1 < \dots < i_s \le t_s \}$
This means $M_{t_1,\dots,t_d}$ takes into account the projections over s successive dimensions for all $s \le t_1$, and over no more than d non-successive dimensions that are not too far apart
17
18
Random Shifts
When Pn is deterministic, the integration error is also deterministic and hard to estimate.
To estimate the error, we use independent random shifts
• Generate a r.v. $U \sim \mathrm{Unif}[0,1)^t$ and replace each $u_i$ by $u_i' = (u_i + U) \bmod 1$
• Let $P_n' = \{u_0',\dots,u_{n-1}'\}$ and $Q_n' = \frac{1}{n}\sum_{i=0}^{n-1} f(u_i')$
• Repeat this m times, independently, with the same $P_n$, thus obtaining m i.i.d. copies of $Q_n'$, denoted $X_1,\dots,X_m$
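A sketch of this randomization on a toy integrand with known mean t/3 (the Korobov vector (1, 12, 43, 11) = (1, 12, 12², 12³) mod 101 and all parameter values are illustrative, not from the paper):

```python
import random

def shifted_lattice_estimates(f, n, v, m, seed=42):
    """Return m i.i.d. copies X_1..X_m of Qn' obtained from m independent
    random shifts of the same rank-1 lattice point set."""
    rng = random.Random(seed)
    base = [[(j * vk % n) / n for vk in v] for j in range(n)]
    xs = []
    for _ in range(m):
        shift = [rng.random() for _ in v]          # U ~ Unif[0,1)^t
        xs.append(sum(f([(ui + si) % 1.0 for ui, si in zip(u, shift)])
                      for u in base) / n)
    return xs

# Toy integrand (illustrative): f(u) = sum(u_j^2), so mu = t/3 = 4/3 here.
f = lambda u: sum(x * x for x in u)
xs = shifted_lattice_estimates(f, n=101, v=[1, 12, 43, 11], m=25)
xbar = sum(xs) / len(xs)                               # unbiased estimate of mu
s2 = sum((x - xbar) ** 2 for x in xs) / (len(xs) - 1)  # estimates Var(Qn')
```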
19
Random Shifts
Let $\bar X = \frac{1}{m}\sum_{j=1}^m X_j$ and $S_X^2 = \frac{1}{m-1}\sum_{j=1}^m (X_j - \bar X)^2$
We have:
• $E[\bar X] = \mu$ and $E[S_X^2] = \mathrm{Var}(Q_n')$
• If $\sigma^2 < \infty$, with the MC method, $\mathrm{Var}(Q_n) = \sigma^2/n = \frac{1}{n}\sum_{0 \ne h \in Z^t} |\hat f(h)|^2$
For a randomly shifted lattice rule: $\mathrm{Var}(Q_n') = \sum_{0 \ne h \in L_t^*} |\hat f(h)|^2$
20
Random Shifts
Since $L_t^*$ contains exactly 1/n of the points of $Z^t$,
the randomly shifted lattice rule reduces the variance compared with MC
iff the “average” squared Fourier coefficient is smaller over $L_t^*$ than over $Z^t$, which is true for typical well-behaved functions
The previous selection criterion also aims at avoiding small vectors h in the dual lattice $L_t^*$ for the sets I deemed important, since $|\hat f(h)|$ tends to be largest for small h
21
Example: Stochastic activity network
• Each arc k, $1 \le k \le N(A)$, is an activity with random duration $\sim F_k(\cdot)$
• Estimate $\mu = P[T \le x]$, where T is the length of the longest path (the project completion time)
• Generate N(A) Unif[0,1) r.v.'s per replicate; N(A) is the number of activities and N(P) is the number of paths
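The paper uses a specific larger network; as a stand-in, here is a hypothetical 5-activity network with exponential durations, where every rate and the threshold x are made-up values for illustration:

```python
import random
from math import log

# Hypothetical toy network from node 0 to node 3 (NOT the paper's example):
# arcs with exponential durations, parameterized by (arc, rate).
ARCS = [((0, 1), 1.0), ((1, 3), 1.0), ((0, 2), 0.5), ((2, 3), 0.5), ((0, 3), 0.2)]
PATHS = [[(0, 1), (1, 3)], [(0, 2), (2, 3)], [(0, 3)]]

def completion_time(u):
    """Longest-path length T from one Unif(0,1) variate per activity,
    using inversion of each exponential cdf F_k."""
    dur = {arc: -log(1.0 - ui) / rate for (arc, rate), ui in zip(ARCS, u)}
    return max(sum(dur[a] for a in path) for path in PATHS)

def estimate_prob(x, n, seed=42):
    """Plain MC estimate of mu = P[T <= x], using N(A) = 5 uniforms per replicate."""
    rng = random.Random(seed)
    hits = sum(completion_time([rng.random() for _ in ARCS]) <= x
               for _ in range(n))
    return hits / n

p = estimate_prob(x=5.0, n=20000)
```

A randomly shifted lattice rule would feed the same `completion_time` function with shifted lattice points instead of i.i.d. uniforms.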
22
Estimated Variance Reduction Factors w.r.t. MC
MC: Monte Carlo
LR: randomly shifted lattice rule
CMC: conditional Monte Carlo
t: dimension of the integration
n: number of points in $P_n$
m = 100 (number of random shifts)
23
Conclusions
Explains the success of QMC via variance reduction instead of the traditional discrepancy measure
Proposes a new way of generating lattices and choosing their parameters:
• Pay more attention to the important subspaces
Things we don’t cover:
• Rules of higher rank
• Polynomial lattice rules
• Massaging the problem
24
Variance Reduction via Lattice Rules
Thank you!
Q&A