1
Variance Reduction via Lattice Rules
By Pierre L’Ecuyer and Christiane Lemieux
Presented by Yanzhi Li
2
Outline
• Motivation
• Lattice rules
• Functional ANOVA decomposition
• Lattice selection criterion
• Random shifts
• Examples
• Conclusions
3
Motivation - MC
$\mu = E[f(U)] = \int_{[0,1)^t} f(u)\,du$, where U is a t-dimensional vector of i.i.d. Unif(0,1) r.v.'s
Monte Carlo method (MC):
• Sample n points $u_0,\dots,u_{n-1}$ uniformly in $[0,1)^t$ and estimate $\mu$ by $Q_n = \frac{1}{n}\sum_{i=0}^{n-1} f(u_i)$
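As a concrete sketch of this estimator (the integrand below is a toy function with known mean, chosen purely for illustration; it is not from the paper):

```python
import random

def mc_estimate(f, t, n, seed=42):
    """Plain Monte Carlo: average f over n i.i.d. uniform points in [0,1)^t."""
    rng = random.Random(seed)
    return sum(f([rng.random() for _ in range(t)]) for _ in range(n)) / n

# Toy integrand (illustrative assumption): f(u) = sum(u_j^2), so E[f(U)] = t/3.
est = mc_estimate(lambda u: sum(x * x for x in u), t=5, n=10000)
# est lands within a few multiples of sigma/sqrt(n) of the exact value 5/3
```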
4
Motivation - MC
MC gives an $O(n^{-1/2})$ convergence rate for the error; MC performs worse for larger t.
• A large number of points is required for the sample to cover $[0,1)^t$ evenly
Can we do better? Quasi-Monte Carlo (QMC):
• Constructs the point set $P_n$ more evenly and uses a relatively smaller number of points
5
Given fixed number of points, which one is better?
[Figure: point sets of the same size n generated by MC (left) and QMC (right)]
6
Motivation - QMC
$D(P_n)$: measures the non-uniformity of $P_n$, i.e., the discrepancy between the empirical distribution of $P_n$ and the uniform distribution
Koksma–Hlawka bound: $|Q_n - \mu| \le V(f)\,D^*(P_n) = V(f)\,O(n^{-1}(\ln n)^t)$, where $V(f)$ is the total variation of f and $D^*(P_n)$ is the rectangular star discrepancy
($D^*(P_n) = \sup_J \left|\,\mathrm{card}(P_n \cap J)/n - \mathrm{vol}(J)\,\right|$, where J ranges over rectangular boxes $[0,u_1)\times\cdots\times[0,u_t)$)
QMC performs better than MC asymptotically
7
Motivation - QMC
Drawback
• For larger t, the convergence rate is better than that of MC only for impractically large values of n
• $D^*(P_n)$ is difficult to compute
• The bound is very loose for typical functions
Good news:
• Low-discrepancy point sets seem to effectively reduce the integration error, even for larger t
8
Motivation - Questions
Why does QMC perform better than MC empirically?
• Fourier expansion
• Traditional error bound
• Variance reduction viewpoint
How to select quasi-random point sets $P_n$? In particular, how to select an integration lattice?
9
Lattice Rules
(Integration) lattice
• A discrete subset of the real space $R^t$ closed under addition and subtraction: $L_t = \{\sum_{j=1}^t z_j v_j : z_j \in Z\}$ for a basis $v_1,\dots,v_t$ of $R^t$
• Contains $Z^t$
• Dual lattice: $L_t^* = \{ h \in R^t : h \cdot v \in Z \text{ for all } v \in L_t \}$
• Lattice rule: an integration method that approximates $\mu$ by $Q_n$ using the node set $P_n = L_t \cap [0,1)^t$
10
Being a lattice doesn’t mean the points are well distributed
11
Lattice Rules - Integration error
Fourier expansion of f:
• $f(u) = \sum_{h \in Z^t} \hat f(h)\, e^{2\pi i\, h \cdot u}$, with $\hat f(h) = \int_{[0,1)^t} f(u)\, e^{-2\pi i\, h \cdot u}\, du$
For a lattice point set: $Q_n - \mu = \sum_{0 \ne h \in L_t^*} \hat f(h)$
12
Functional ANOVA Decomposition
Writes f(u) as a sum of orthogonal functions: $f(u) = \sum_{I \subseteq \{1,\dots,t\}} f_I(u)$, where $f_\emptyset = \mu$
The variance $\sigma^2$ decomposes as $\sigma^2 = \sum_{\emptyset \ne I \subseteq \{1,\dots,t\}} \sigma_I^2$, where $\sigma_I^2 = \mathrm{Var}[f_I(U)]$
13
Functional ANOVA Decomposition
The best mean-square approximation of f(·) by a sum of d-dimensional functions is $\sum_{|I| \le d} f_I(\cdot)$
f has a low effective dimension in the superposition sense when the approximation is good for small d, which is frequent
This suggests that the point sets $P_n$ should be chosen on the basis of the quality of the distribution of the points over the subspaces I that are deemed important
When |I| is small, it is possible to make sure $P_n(I)$ covers the subspace very well
14
Lattice Selection Criterion
It is desirable that $P_n$ is (for a rank-1 lattice):
• Fully projection-regular, i.e., for any non-empty $I \subseteq \{1,\dots,t\}$, $P_n(I)$ contains as many distinct points as $P_n$
• Or dimension-stationary, i.e., $P_n(\{i_1,\dots,i_d\}) = P_n(\{i_1+j,\dots,i_d+j\})$ for all $i_1,\dots,i_d$ and j
One fully projection-regular example (a Korobov rule) is:
• $P_n = \{ (j/n)\,v \bmod 1 : 0 \le j < n \}$ for $v = (1, a, a^2, \dots, a^{t-1})$, where a is an integer, 0 < a < n, and gcd(a, n) = 1
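A minimal sketch of this Korobov construction (the parameter values n = 101, a = 12 are arbitrary choices for illustration, not from the paper):

```python
from math import gcd

def korobov_points(n, t, a):
    """Rank-1 lattice point set P_n = {(j/n) v mod 1 : 0 <= j < n}
    with generating vector v = (1, a, a^2, ..., a^(t-1)) mod n."""
    assert 0 < a < n and gcd(a, n) == 1
    v = [pow(a, k, n) for k in range(t)]   # v_k = a^k mod n
    return [[(j * vk % n) / n for vk in v] for j in range(n)]

P = korobov_points(n=101, t=4, a=12)
# Full projection-regularity: since n is prime, every one-dimensional
# projection of P already contains n distinct values.
assert all(len({p[i] for p in P}) == len(P) for i in range(4))
```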
15
Lattice Selection Criterion
$L_t(I)$ is a lattice => its points are contained in families of equidistant parallel hyperplanes
Choose the family whose hyperplanes are farthest apart and let $d_t(I)$ be the distance between successive hyperplanes
$d_t(I) = 1/\ell_I$, where $\ell_I$ is the Euclidean length of the shortest nonzero vector in the dual lattice $L_t^*(I)$, which has a tight upper bound $\ell_d^*(n) = c_d\, n^{1/d}$, where d = |I|
Define a figure of merit $\ell_I / \ell_d^*(n)$ so that we can compare the quality of projections of different dimensions
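For tiny examples, $\ell_I$ and $d_t(I)$ can be found by brute force over candidate dual vectors (real implementations use the spectral test with lattice-basis reduction; the n, v values below are illustrative assumptions):

```python
from itertools import product
from math import sqrt

def shortest_dual_vector(n, v, bound=10):
    """Length of the shortest nonzero h in the dual lattice
    {h in Z^t : h . v = 0 (mod n)}, searched over |h_i| <= bound."""
    best = None
    for h in product(range(-bound, bound + 1), repeat=len(v)):
        if any(h) and sum(hi * vi for hi, vi in zip(h, v)) % n == 0:
            length = sqrt(sum(hi * hi for hi in h))
            if best is None or length < best:
                best = length
    return best

# 2-dimensional projection I = {1, 2} of a Korobov lattice: n = 101, v = (1, 12)
l_I = shortest_dual_vector(101, [1, 12])
d_I = 1 / l_I   # distance between adjacent parallel hyperplanes covering P_n(I)
```

Here the search is exhaustive over a small box, so it is only usable for small n and |I|.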
16
Lattice Selection Criterion
Minimize $d_t(I)$ <=> maximize $\ell_I / \ell_d^*(n)$
Worst-case figure of merit:
• For arbitrary $d \ge 1$ and $t_1 \ge \dots \ge t_d \ge d$, define
$M_{t_1,\dots,t_d} = \min\left[\ \min_{d < s \le t_1} \frac{\ell_{\{1,\dots,s\}}}{\ell_s^*(n)},\ \ \min_{2 \le s \le d}\ \min_{I \in S(s,t_s)} \frac{\ell_I}{\ell_s^*(n)}\ \right]$, where $S(s,t_s) = \{ I = \{i_1,\dots,i_s\} : 1 = i_1 < \dots < i_s \le t_s \}$
This means $M_{t_1,\dots,t_d}$ takes into account the projections over s successive dimensions for all $s \le t_1$, and over no more than d non-successive dimensions that are not too far apart
17
18
Random Shifts
When Pn is deterministic, the integration error is also deterministic and hard to estimate.
To estimate the error, we use independent random shifts
• Generate a r.v. $U \sim \mathrm{Unif}[0,1)^t$ and replace each $u_i$ by $u_i' = (u_i + U) \bmod 1$
• Let $P_n' = \{u_0',\dots,u_{n-1}'\}$ and $Q_n' = \frac{1}{n}\sum_{i=0}^{n-1} f(u_i')$
• Repeat this m times, independently, with the same $P_n$, thus obtaining m i.i.d. copies of $Q_n'$, denoted $X_1,\dots,X_m$
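A sketch of this randomization on a toy integrand with known mean t/3 (the Korobov vector (1, 12, 43, 11) = (1, 12, 12², 12³) mod 101 and all parameter values are illustrative, not from the paper):

```python
import random

def shifted_lattice_estimates(f, n, v, m, seed=42):
    """Return m i.i.d. copies X_1..X_m of Qn' obtained from m independent
    random shifts of the same rank-1 lattice point set."""
    rng = random.Random(seed)
    base = [[(j * vk % n) / n for vk in v] for j in range(n)]
    xs = []
    for _ in range(m):
        shift = [rng.random() for _ in v]          # U ~ Unif[0,1)^t
        xs.append(sum(f([(ui + si) % 1.0 for ui, si in zip(u, shift)])
                      for u in base) / n)
    return xs

# Toy integrand (illustrative): f(u) = sum(u_j^2), so mu = t/3 = 4/3 here.
f = lambda u: sum(x * x for x in u)
xs = shifted_lattice_estimates(f, n=101, v=[1, 12, 43, 11], m=25)
xbar = sum(xs) / len(xs)                               # unbiased estimate of mu
s2 = sum((x - xbar) ** 2 for x in xs) / (len(xs) - 1)  # estimates Var(Qn')
```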
19
Random Shifts
Let $\bar X = \frac{1}{m}\sum_{j=1}^m X_j$ and $S_X^2 = \frac{1}{m-1}\sum_{j=1}^m (X_j - \bar X)^2$
We have:
• $E[\bar X] = \mu$ and $E[S_X^2] = \mathrm{Var}(Q_n')$
• If $\sigma^2 < \infty$, with the MC method, $\mathrm{Var}(Q_n) = \sigma^2/n = \frac{1}{n}\sum_{0 \ne h \in Z^t} |\hat f(h)|^2$
For a randomly shifted lattice rule: $\mathrm{Var}(Q_n') = \sum_{0 \ne h \in L_t^*} |\hat f(h)|^2$
20
Random Shifts
Since $L_t^*$ contains exactly 1/n of the points of $Z^t$,
the randomly shifted lattice rule reduces the variance compared with MC
iff the “average” squared Fourier coefficient is smaller over $L_t^*$ than over $Z^t$, which is true for typical well-behaved functions
The previous selection criterion also aims at avoiding small vectors h in the dual lattice $L_t^*$ for the sets I deemed important, since $|\hat f(h)|$ tends to be largest for small h
21
Example: Stochastic activity network
• Each arc k, $1 \le k \le N(A)$, is an activity with random duration $\sim F_k(\cdot)$
• Estimate $\mu = P[T \le x]$, where T is the length of the longest path (the project completion time)
• Generate N(A) Unif[0,1) r.v.'s per replicate; N(A) is the number of activities and N(P) is the number of paths
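The paper uses a specific larger network; as a stand-in, here is a hypothetical 5-activity network with exponential durations, where every rate and the threshold x are made-up values for illustration:

```python
import random
from math import log

# Hypothetical toy network from node 0 to node 3 (NOT the paper's example):
# arcs with exponential durations, parameterized by (arc, rate).
ARCS = [((0, 1), 1.0), ((1, 3), 1.0), ((0, 2), 0.5), ((2, 3), 0.5), ((0, 3), 0.2)]
PATHS = [[(0, 1), (1, 3)], [(0, 2), (2, 3)], [(0, 3)]]

def completion_time(u):
    """Longest-path length T from one Unif(0,1) variate per activity,
    using inversion of each exponential cdf F_k."""
    dur = {arc: -log(1.0 - ui) / rate for (arc, rate), ui in zip(ARCS, u)}
    return max(sum(dur[a] for a in path) for path in PATHS)

def estimate_prob(x, n, seed=42):
    """Plain MC estimate of mu = P[T <= x], using N(A) = 5 uniforms per replicate."""
    rng = random.Random(seed)
    hits = sum(completion_time([rng.random() for _ in ARCS]) <= x
               for _ in range(n))
    return hits / n

p = estimate_prob(x=5.0, n=20000)
```

A randomly shifted lattice rule would feed the same `completion_time` function with shifted lattice points instead of i.i.d. uniforms.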
22
Estimated Variance Reduction Factors w.r.t. MC
MC: Monte Carlo
LR: randomly shifted lattice rule
CMC: conditional Monte Carlo
t: dimension of the integration
n: number of points in $P_n$
m = 100 (number of random shifts)
23
Conclusions
Explains the success of QMC via variance reduction instead of the traditional discrepancy measure
Proposes a new way of generating lattices and choosing their parameters:
• Pay more attention to the important subspaces
Things we don’t cover:
• Rules of higher rank
• Polynomial lattice rules
• Massaging the problem
24
Variance Reduction via Lattice Rules
Thank you!
Q&A