Comparison of Latin Hypercube and Quasi Monte Carlo Sampling Techniques

Sergei Kucherenko (1), Daniel Albrecht (2), Andrea Saltelli (2)

(1) Imperial College London, UK. Email: s.kucherenko@imperial.ac.uk
(2) The European Commission, Joint Research Centre, Ispra (VA), Italy

Outline

Monte Carlo integration methods

Latin Hypercube sampling design

Quasi Monte Carlo methods. Sobol' sequences and their properties

Comparison of sample distributions generated by different techniques

Do QMC methods lose their efficiency in higher dimensions?

Global Sensitivity Analysis and effective dimensions

Comparison results

Monte Carlo integration methods

$$ I[f] = \int_{H^n} f(x)\, dx, $$

seen as an expectation: $I[f] = E[f(x)]$.

Monte Carlo: $I_N[f] = \frac{1}{N} \sum_{i=1}^{N} f(z_i)$, where $\{z_i\}$ is a sequence of random points in $H^n$.

Error: $\varepsilon_N = |I[f] - I_N[f]|$,
$$ \varepsilon_N = \left( E(\varepsilon_N^2) \right)^{1/2} = \frac{\sigma(f)}{N^{1/2}}. $$

Convergence does not depend on dimensionality, but it is slow.

Improve MC convergence by decreasing $\sigma(f)$. Use variance reduction techniques: antithetic variables; control variates; stratified sampling -> LHS sampling.
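To make the $1/N^{1/2}$ behaviour concrete, here is a minimal Monte Carlo sketch (standard library only). The integrand $f(x) = \sum_i x_i$ over $[0,1]^5$, with exact integral 2.5, is an illustrative choice, not one of the test functions used later in the slides:

```python
import random

def f(x):
    # illustrative additive integrand; its exact integral over [0,1]^5 is 2.5
    return sum(x)

def mc_integrate(f, dim, n, rng):
    # crude Monte Carlo: average f over n uniform random points in [0,1]^dim
    return sum(f([rng.random() for _ in range(dim)]) for _ in range(n)) / n

rng = random.Random(42)
for n in (100, 1_000, 10_000):
    est = mc_integrate(f, 5, n, rng)
    print(n, round(abs(est - 2.5), 4))  # error shrinks roughly like 1/sqrt(n)
```

The error decreases by roughly a factor of 3 per tenfold increase in N, as the $\sigma(f)/N^{1/2}$ estimate predicts.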

Latin Hypercube sampling

Latin Hypercube sampling is a type of stratified sampling.

To sample N points in d dimensions:

Divide each dimension into N equal intervals => N^d subcubes.

Take one point in each of N chosen subcubes so that, when projected onto lower dimensions, the points do not overlap.

Latin Hypercube sampling

$\{\pi_k\}$, $k = 1, \dots, n$ - independent random permutations of $\{1, \dots, N\}$, each uniformly distributed over all $N!$ possible permutations.

LHS coordinates:
$$ x_i^k = \frac{\pi_k(i) - 1 + U_i^k}{N}, \quad i = 1, \dots, N, \quad k = 1, \dots, n, \quad U_i^k \sim U(0,1). $$

LHS is built by superimposing well stratified one-dimensional samples. It cannot be expected to provide good uniformity properties in an n-dimensional unit hypercube.
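The coordinate formula above transcribes directly into Python (standard library only; the sample count and dimension below are arbitrary illustrative values):

```python
import random

def latin_hypercube(n_points, n_dims, rng):
    """LHS sample: x_i^k = (pi_k(i) - 1 + U_i^k) / N, with pi_k an independent
    uniform random permutation of {1, ..., N} for each dimension k."""
    sample = [[0.0] * n_dims for _ in range(n_points)]
    for k in range(n_dims):
        perm = list(range(1, n_points + 1))
        rng.shuffle(perm)                      # pi_k
        for i in range(n_points):
            u = rng.random()                   # U_i^k ~ U(0,1)
            sample[i][k] = (perm[i] - 1 + u) / n_points
    return sample

pts = latin_hypercube(8, 2, random.Random(0))
# each 1-D projection has exactly one point per interval [j/N, (j+1)/N)
for k in range(2):
    cells = sorted(int(p[k] * 8) for p in pts)
    print(cells)  # [0, 1, 2, 3, 4, 5, 6, 7]
```

By construction every one-dimensional stratum receives exactly one point, which is the defining property of the design.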

Deficiencies of LHS sampling

1) Space is badly explored (a)

2) Possible correlation between variables (b)

3) Points cannot be sampled sequentially

=> Not suited for integration

Discrepancy. Quasi Monte Carlo.

Discrepancy is a measure of deviation from uniformity.

Definitions: $Q(y) \in H^n$, $Q(y) = [0, y_1) \times [0, y_2) \times \dots \times [0, y_n)$; $m(Q)$ - volume of $Q$; $N_Q$ - number of points in $Q$.

$$ D_N^* = \sup_{Q(y) \in H^n} \left| \frac{N_Q}{N} - m(Q) \right| $$

Random sequences: $D_N^* \le c \left( \ln \ln N / N \right)^{1/2} \sim 1/N^{1/2}$.

Low discrepancy sequences (LDS): $D_N^* \le c(n) \frac{(\ln N)^n}{N}$.

Convergence: $\varepsilon_N^{QMC} = |I[f] - I_N[f]| \le V(f)\, D_N^*$.

Asymptotically $\varepsilon_{QMC} \sim O(1/N)$, much higher than $\varepsilon_{MC} \sim O(1/N^{1/2})$, although formally $\varepsilon_{QMC} = O\!\left( \frac{(\ln N)^n}{N} \right)$.
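The supremum in $D_N^*$ is expensive to compute directly; the discrepancy plots later in these slides use the $L_2$ variant, which has a closed form (Warnock's formula). A pure-Python sketch, with a 1-D analytic sanity check:

```python
from math import prod, sqrt
import random

def l2_star_discrepancy(points):
    """Warnock's closed form for the L2 star discrepancy of points in [0,1)^d:
       D^2 = 3^{-d} - (2/N) sum_i prod_k (1 - x_ik^2)/2
             + (1/N^2) sum_{i,j} prod_k (1 - max(x_ik, x_jk))"""
    n, d = len(points), len(points[0])
    t1 = 3.0 ** (-d)
    t2 = 2.0 / n * sum(prod((1 - x * x) / 2 for x in p) for p in points)
    t3 = sum(prod(1 - max(a, b) for a, b in zip(p, q))
             for p in points for q in points) / n ** 2
    return sqrt(t1 - t2 + t3)

# sanity check: a single 1-D point at x has D^2 = x^2 - x + 1/3,
# so x = 0.5 gives D = sqrt(1/12)
print(abs(l2_star_discrepancy([[0.5]]) - sqrt(1 / 12)) < 1e-12)  # True

# discrepancy of a random 2-D sample (decreases only slowly with N)
rng = random.Random(0)
print(l2_star_discrepancy([[rng.random(), rng.random()] for _ in range(256)]))
```

The double sum makes this $O(N^2 d)$, which is fine for the sample sizes used here but would need a faster algorithm for very large N.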

QMC. Sobol' sequences

Convergence: $\varepsilon_N = O\!\left( \frac{(\ln N)^n}{N} \right)$ for all LDS.

For Sobol' LDS: $\varepsilon_N = O\!\left( \frac{(\ln N)^{n-1}}{N} \right)$, if $N = 2^k$, $k$ - integer.

Sobol' LDS:
1. Best uniformity of distribution as N goes to infinity.
2. Good distribution for fairly small initial sets.
3. A very fast computational algorithm.

"Preponderance of the experimental evidence amassed to date points to Sobol' sequences as the most effective quasi-Monte Carlo method for application in financial engineering."

Paul Glasserman, Monte Carlo Methods in Financial Engineering, Springer, 2003

Sobol' LDS. Property A and Property A'

A low-discrepancy sequence is said to satisfy Property A if for any binary segment (not an arbitrary subset) of the n-dimensional sequence of length 2^n there is exactly one point in each of the 2^n hyper-octants that result from subdividing the unit hypercube along each of its length extensions into half.

A low-discrepancy sequence is said to satisfy Property A' if for any binary segment (not an arbitrary subset) of the n-dimensional sequence of length 4^n there is exactly one point in each of the 4^n hyper-octants that result from subdividing the unit hypercube along each of its length extensions into four equal parts.

Property A    Property A'

Distributions of 4 points in two dimensions

Property A

MC -> No
LHS -> No
Sobol' -> Yes

Distributions of 16 points in two dimensions

Property A'

MC -> No
LHS -> No
Sobol' -> Yes
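These two checks can be reproduced with a minimal two-dimensional Sobol' generator. The direction integers below (m_j = 1, 3, 5, 15, 17, 51 for the second dimension; the first dimension is the van der Corput sequence in base 2) are the classic Sobol' values, but this is a bare sketch, not the SobolSeq generator used for the slides' experiments:

```python
BITS = 30  # fixed-point precision

# direction integers v_j = m_j / 2^(j+1) in 30-bit fixed point
V1 = [1 << (BITS - 1 - j) for j in range(6)]                      # dim 1
V2 = [m << (BITS - 1 - j) for j, m in enumerate([1, 3, 5, 15, 17, 51])]  # dim 2

def sobol_2d(i):
    """i-th point of the (unscrambled) 2-D Sobol' sequence, i >= 0:
    XOR of the direction numbers selected by the binary digits of i."""
    x = y = 0
    j = 0
    while i >> j:
        if (i >> j) & 1:
            x ^= V1[j]
            y ^= V2[j]
        j += 1
    return x / 2 ** BITS, y / 2 ** BITS

pts = [sobol_2d(i) for i in range(64)]

# Property A: every aligned binary segment of 2^n = 4 points has exactly
# one point per quadrant
for k in range(0, 64, 4):
    assert len({(int(2 * x), int(2 * y)) for x, y in pts[k:k + 4]}) == 4

# Property A': every aligned binary segment of 4^n = 16 points has exactly
# one point per cell of the 4 x 4 subdivision
for k in range(0, 64, 16):
    assert len({(int(4 * x), int(4 * y)) for x, y in pts[k:k + 16]}) == 16

print("Properties A and A' hold for the first 64 points")
```

The same check applied to pseudo-random or LHS points fails, which is exactly the "MC -> No, LHS -> No, Sobol' -> Yes" pattern in the figures.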

Comparison of Discrepancy I. Low Dimensions

Use standard MC and LHS generators.

Sobol' sequence generator: SobolSeq (www.broda.co.uk). Sobol' sequences satisfy Properties A and A'.

[Figures: Log(D_N_L2) vs Log2(N), N = 128 ... 32768, for MC, LHS and QMC-Sobol]

Result: QMC in low dimensions shows much smaller discrepancy than MC and LHS.

Comparison of Discrepancy II. High Dimensions

[Figure: Log(D_N_L2) vs Log2(N), N = 128 ... 32768, for MC, LHS and QMC-Sobol]

All sampling methods in high dimensions have comparable discrepancy.

Do QMC methods lose their efficiency in higher dimensions?

$$ \varepsilon_{QMC} = O\!\left( \frac{(\ln N)^n}{N} \right) $$

Asymptotically $\varepsilon_{QMC} \sim O(1/N)$, but $\frac{(\ln N)^n}{N}$ increases with $N$ until $N^* \approx \exp(n)$.

For $n = 50$, $N^* \approx 5 \cdot 10^{21}$ - not achievable for practical applications.

Is QMC better than MC and LHS in higher dimensions ($n \ge 20$)?

ANOVA decomposition and Sensitivity Indices

Consider a model $Y = f(x)$, where $x = (x_1, x_2, \dots, x_k)$ is a vector of input variables, $0 \le x_i \le 1$, and $f(x)$ is integrable.

ANOVA decomposition:
$$ Y = f(x) = f_0 + \sum_i f_i(x_i) + \sum_i \sum_{j>i} f_{ij}(x_i, x_j) + \dots + f_{1,2,\dots,k}(x_1, x_2, \dots, x_k), $$
$$ \int_0^1 f_{i_1 \dots i_s}(x_{i_1}, \dots, x_{i_s})\, dx_{i_k} = 0, \quad \forall k, \ 1 \le k \le s. $$

Variance decomposition:
$$ \sigma^2 = \sum_i \sigma_i^2 + \sum_{i<j} \sigma_{ij}^2 + \dots + \sigma_{1,2,\dots,k}^2 $$

Sobol' SI:
$$ 1 = \sum_i S_i + \sum_{i<j} S_{ij} + \sum_{i<j<l} S_{ijl} + \dots + S_{1,2,\dots,k} $$

Sobol' Sensitivity Indices (SI)

Definition: $S_{i_1 \dots i_s} = \sigma_{i_1 \dots i_s}^2 / \sigma^2$.

Partial variances: $\sigma_{i_1 \dots i_s}^2 = \int_0^1 f_{i_1 \dots i_s}^2(x_{i_1}, \dots, x_{i_s})\, dx_{i_1} \dots dx_{i_s}$.

Total variance: $\sigma^2 = \int_0^1 \left( f(x) - f_0 \right)^2 dx$.

Sensitivity indices for subsets of variables: $x = (y, z)$,
$$ \sigma_y^2 = \sum_{s=1}^{m} \sum_{(i_1 < \dots < i_s) \in K} \sigma_{i_1 \dots i_s}^2. $$

Total variance for a subset: $\left( \sigma_y^{tot} \right)^2 = \sigma^2 - \sigma_z^2$.

Corresponding global sensitivity indices:
$$ S_y = \sigma_y^2 / \sigma^2, \qquad S_y^{tot} = \left( \sigma_y^{tot} \right)^2 / \sigma^2. $$
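First-order indices $S_i$ can be estimated by Monte Carlo with the standard "pick-and-freeze" correlation trick (correlate $f(x)$ with $f$ evaluated at a second point sharing only coordinate $i$). The estimator below and the test function $f = x_1 + 2x_2$ (for which $S_1 = 1/5$, $S_2 = 4/5$ analytically) are illustrative choices, not the slides' estimator:

```python
import random

def sobol_first_order(f, dim, i, n, rng):
    """Crude pick-and-freeze MC estimate of S_i = sigma_i^2 / sigma^2:
    Cov(f(x), f(x')) / Var(f), where x' shares only coordinate i with x."""
    fx, fxp = [], []
    for _ in range(n):
        x = [rng.random() for _ in range(dim)]
        xp = [rng.random() for _ in range(dim)]
        xp[i] = x[i]                       # freeze coordinate i
        fx.append(f(x))
        fxp.append(f(xp))
    m, mp = sum(fx) / n, sum(fxp) / n
    var = sum(v * v for v in fx) / n - m * m          # total variance
    cov = sum(a * b for a, b in zip(fx, fxp)) / n - m * mp  # sigma_i^2
    return cov / var

f = lambda x: x[0] + 2 * x[1]   # Var = 5/12; S_1 = 1/5, S_2 = 4/5
rng = random.Random(1)
print(sobol_first_order(f, 2, 0, 100_000, rng))  # ~0.2
print(sobol_first_order(f, 2, 1, 100_000, rng))  # ~0.8
```

Higher-order and total indices follow the same pattern with different frozen subsets.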

Effective dimensions

Let $|u|$ be the cardinality of a set of variables $x_u$.

The effective dimension of $f(x)$ in the superposition sense is the smallest integer $d_S$ such that
$$ \sum_{0 < |u| \le d_S} S_u \ge 1 - \varepsilon, \quad \varepsilon \ll 1. $$
It means that $f(x)$ is almost a sum of $d_S$-dimensional functions.

The function $f(x)$ has effective dimension $d_T$ in the truncation sense if
$$ \sum_{u \subseteq \{1, 2, \dots, d_T\}} S_u \ge 1 - \varepsilon, \quad \varepsilon \ll 1. $$

Important property: $d_S \le d_T$.

Example: $f(x) = \sum_{i=1}^{n} x_i \;\rightarrow\; d_S = 1, \; d_T = n.$
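The example can be made concrete. For $f(x) = \sum_i x_i$ with independent $U(0,1)$ inputs only the first-order ANOVA terms survive, so $S_{\{i\}} = 1/n$ and all interaction indices vanish; the two definitions then evaluate as follows (a small sketch, with n and epsilon chosen for illustration):

```python
# Sobol' indices of f(x) = sum_i x_i, n independent U(0,1) inputs:
# only first-order terms survive, each with S_{i} = 1/n
n, eps = 10, 0.01
S = {frozenset([i]): 1.0 / n for i in range(n)}

def d_superposition(S, n, eps):
    # smallest d such that indices of order <= d sum to at least 1 - eps
    for d in range(1, n + 1):
        if sum(v for u, v in S.items() if len(u) <= d) >= 1 - eps:
            return d

def d_truncation(S, n, eps):
    # smallest d such that indices over subsets of {x_1,...,x_d} sum to >= 1 - eps
    for d in range(1, n + 1):
        if sum(v for u, v in S.items() if u <= frozenset(range(d))) >= 1 - eps:
            return d

print(d_superposition(S, n, eps), d_truncation(S, n, eps))  # 1 10
```

The gap between d_S = 1 and d_T = n is exactly the property $d_S \le d_T$ stated above.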

Classification of functions

Type A. Variables are not equally important:
$$ S_y^{tot} \gg S_z^{tot}, \quad n_y \ll n_z \;\leftrightarrow\; d_T \ll n. $$

Types B, C. Variables are equally important:
$$ S_i \approx S_j \;\leftrightarrow\; d_T \approx n. $$

Type B. Dominant low order indices:
$$ \sum_{i=1}^{n} S_i \approx 1 \;\leftrightarrow\; d_S \ll n. $$

Type C. Dominant higher order indices:
$$ \sum_{i=1}^{n} S_i \ll 1 \;\leftrightarrow\; d_S \approx n. $$

When is LHS more effective than MC?

ANOVA: $f(x) = f_0 + \sum_i f_i(x_i) + r(x)$, where $r(x)$ - higher order interaction terms.

LHS (Stein, 1987):
$$ E\!\left[ (\varepsilon_N^{LHS})^2 \right] = \frac{1}{N} \int_{H^n} r^2(x)\, dx + o\!\left( \frac{1}{N} \right). $$

MC:
$$ E\!\left[ (\varepsilon_N^{MC})^2 \right] = \frac{1}{N} \sum_i \int_0^1 f_i^2(x_i)\, dx_i + \frac{1}{N} \int_{H^n} r^2(x)\, dx. $$

If $\int_{H^n} r^2(x)\, dx$ is small (Type B functions, small $d_S$):
$$ E\!\left[ (\varepsilon_N^{LHS})^2 \right] < E\!\left[ (\varepsilon_N^{MC})^2 \right]. $$
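Stein's result is easy to see empirically. For a purely additive function $r(x) = 0$, so the $1/N$ term in the LHS error vanishes entirely; a seeded experiment (sizes are illustrative choices) comparing the spread of repeated estimates:

```python
import random
import statistics

def latin_hypercube(n, d, rng):
    # standard LHS: one point per stratum in every 1-D projection
    cols = []
    for _ in range(d):
        perm = list(range(n))
        rng.shuffle(perm)
        cols.append([(p + rng.random()) / n for p in perm])
    return list(zip(*cols))

f = lambda x: sum(x)          # purely additive => r(x) = 0, an extreme Type B case
dim, n, runs = 5, 64, 100
rng = random.Random(0)

mc_est = [sum(f([rng.random() for _ in range(dim)]) for _ in range(n)) / n
          for _ in range(runs)]
lhs_est = [sum(f(p) for p in latin_hypercube(n, dim, rng)) / n
           for _ in range(runs)]

print("MC  std:", statistics.stdev(mc_est))
print("LHS std:", statistics.stdev(lhs_est))  # far smaller than the MC spread
```

As soon as interaction terms dominate (Type C), the two spreads become comparable, in line with the classification table below.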

Classification of functions

| Function type | Description | Relationship between S_i and S_i^tot | d_T | d_S | QMC more efficient than MC | LHS more efficient than MC |
|---|---|---|---|---|---|---|
| A | A few dominant variables | S_y^tot / n_y >> S_z^tot / n_z | << n | << n | Yes | No |
| B | No unimportant subsets; only low-order interaction terms are present | S_i ≈ S_j, ∀ i, j; S_i / S_i^tot ≈ 1, ∀ i | ≈ n | << n | Yes | Yes |
| C | No unimportant subsets; high-order interaction terms are present | S_i ≈ S_j, ∀ i, j; S_i / S_i^tot << 1, ∀ i | ≈ n | ≈ n | No | No |

How to monitor convergence of MC, LHS and QMC calculations?

The root mean square error is defined as
$$ \varepsilon = \left( \frac{1}{K} \sum_{k=1}^{K} \left( I - I_N^k \right)^2 \right)^{1/2}, $$
where K is the number of independent runs.

MC and LHS: all runs should be statistically independent (use a different seed point).

QMC: for each run a different part of the Sobol' LDS was used (start from a different index number).

The root mean square error is approximated by the formula $\varepsilon \approx c N^{-\alpha}$, $0 < \alpha < 1$.

MC: $\alpha = 0.5$. QMC: $\alpha \le 1$. LHS: ?

Integration error vs. N. Type A

(a) $f(x) = \sum_{j=1}^{n} (-1)^j \prod_{i=1}^{j} x_i$, n = 360; (b) $f(x) = \prod_{i=1}^{n} \frac{|4x_i - 2| + a_i}{1 + a_i}$, n = 100.

$$ \varepsilon = \left( \frac{1}{K} \sum_{k=1}^{K} \left( I - I_N^k \right)^2 \right)^{1/2}, \qquad \varepsilon \sim N^{-\alpha}, \; 0 < \alpha < 1. $$

Type A: $S_y^{tot} \gg S_z^{tot}$, $n_y \ll n_z \leftrightarrow d_T \ll n$.

[Figures: log2(error) vs log2(N), N = 2^8 ... 2^20. Fitted exponents alpha: (a) MC 0.50, QMC-SOB 0.94, LHS 0.52; (b) MC 0.49, QMC-SOB 0.65, LHS 0.50]

Integration error. Type A

$$ \varepsilon = \left( \frac{1}{K} \sum_{k=1}^{K} \left( I - I_N^k \right)^2 \right)^{1/2}, \qquad \varepsilon \sim N^{-\alpha}, \; 0 < \alpha < 1. $$

Type A: $S_y^{tot} \gg S_z^{tot}$, $n_y \ll n_z \leftrightarrow d_T \ll n$.

| Index | Function | Dim n | Slope MC | Slope QMC | Slope LHS |
|---|---|---|---|---|---|
| 1A | $\sum_{j=1}^{n} (-1)^j \prod_{i=1}^{j} x_i$ | 360 | 0.50 | 0.94 | 0.52 |
| 2A | $\prod_{i=1}^{n} \frac{|4x_i - 2| + a_i}{1 + a_i}$, a_1 = a_2 = 0, a_3 = ... = a_100 = 6.52 | 100 | 0.49 | 0.65 | 0.50 |

Integration error vs. N. Type B

Dominant low order indices: $\sum_{i=1}^{n} S_i \approx 1 \leftrightarrow d_S \ll n$.

(a) $f(x) = \prod_{i=1}^{n} \frac{n - x_i}{n - 0.5}$, n = 360. Fitted exponents alpha: MC 0.52, QMC-SOB 0.96, LHS 0.69.

(b) $f(x) = \prod_{i=1}^{n} \left( 1 + \frac{1}{n} \right) x_i^{1/n}$, n = 360. Fitted exponents alpha: MC 0.50, QMC-SOB 0.87, LHS 0.62.

[Figures: log2(error) vs log2(N), N = 2^8 ... 2^20]

Integration error. Type B functions

Dominant low order indices: $\sum_{i=1}^{n} S_i \approx 1 \leftrightarrow d_S \ll n$.

| Index | Function | Dim n | Slope MC | Slope QMC | Slope LHS |
|---|---|---|---|---|---|
| 1B | $\prod_{i=1}^{n} \frac{n - x_i}{n - 0.5}$ | 30 | 0.52 | 0.96 | 0.69 |
| 2B | $\prod_{i=1}^{n} \left( 1 + \frac{1}{n} \right) x_i^{1/n}$ | 30 | 0.50 | 0.87 | 0.62 |
| 3B | $\prod_{i=1}^{n} \frac{|4x_i - 2| + a_i}{1 + a_i}$, a_i = 6.52 | 30 | 0.51 | 0.85 | 0.55 |

The integration error vs. N. Type C

Dominant higher order indices: $\sum_{i=1}^{n} S_i \ll 1 \leftrightarrow d_S \approx n$.

(a) $f(x) = \prod_{i=1}^{n} \frac{|4x_i - 2| + a_i}{1 + a_i}$, $a_i = 0 \;\rightarrow\; f(x) = \prod_{i=1}^{n} |4x_i - 2|$, n = 10. Fitted exponents alpha: MC 0.47, QMC-SOB 0.64, LHS 0.50.

(b) $f(x) = \prod_{i=1}^{n} \frac{x_i}{1/2}$, n = 10. Fitted exponents alpha: MC 0.49, QMC-SOB 0.68, LHS 0.51.

[Figures: log2(error) vs log2(N), N = 2^8 ... 2^20]

Integration error for Type C functions

Dominant higher order indices: $\sum_{i=1}^{n} S_i \ll 1 \leftrightarrow d_S \approx n$.

| Index | Function | Dim n | Slope MC | Slope QMC | Slope LHS |
|---|---|---|---|---|---|
| 1C | $\prod_{i=1}^{n} |4x_i - 2|$ | 10 | 0.47 | 0.64 | 0.50 |
| 2C | $\prod_{i=1}^{n} \frac{x_i}{1/2}$ | 10 | 0.49 | 0.68 | 0.51 |

The integration error vs. N. Function 1A

$$ f(x) = \sum_{j=1}^{n} (-1)^j \prod_{i=1}^{j} x_i, \quad n = 360. $$

[Figure: running MC, QMC and LHS estimates of the integral vs log2(N), against the theoretical value (approximately -1/3)]

QMC: convergence is monotonic.
MC and LHS: convergence curves are oscillating.
QMC is 30 times faster than MC and LHS.

LHS: it is not possible to incrementally add a new point while keeping the old LHS design.

Evaluation of quantiles I. Low quantile

$$ f(x) = \sum_{i=1}^{n} x_i^2, \quad x_i \sim N(0,1). $$

Distribution dimension n = 5; the $x_i$ are independent standard normal variates.

[Figure: Log(mean error of the low quantile) vs Log2(N) for MC, QMC and LHS, with exponential fits]

Low quantile (percentile of the cumulative distribution function) = 0.05.

A superior convergence of the QMC method.

Evaluation of quantiles II. High quantile

$$ f(x) = \sum_{i=1}^{n} x_i^2, \quad x_i \sim N(0,1). $$

Distribution dimension n = 5; the $x_i$ are independent standard normal variates.

[Figure: Log(mean error of the high quantile) vs Log2(N) for MC, QMC and LHS, with exponential fits]

High quantile (percentile of the cumulative distribution function) = 0.95.

QMC converges faster than MC and LHS.
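A plain-MC version of the same quantile experiment fits in a few lines (standard library only; the QMC and LHS variants compared in the figures are omitted here). Since $f(x) = \sum_{i=1}^{5} x_i^2$ with standard normal inputs is chi-squared with 5 degrees of freedom, the estimates can be checked against the standard table values 1.1455 (p = 0.05) and 11.0705 (p = 0.95):

```python
import random

def quantile_mc(n_dims, n_samples, p, rng):
    # MC estimate of the p-quantile of f(x) = sum of squares of n_dims
    # independent N(0,1) variates (chi-squared with n_dims degrees of freedom)
    vals = sorted(
        sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n_dims))
        for _ in range(n_samples)
    )
    return vals[int(p * n_samples)]

rng = random.Random(0)
low = quantile_mc(5, 100_000, 0.05, rng)
high = quantile_mc(5, 100_000, 0.95, rng)
print(low, high)  # near the chi-squared(5) quantiles 1.1455 and 11.0705
```

Replacing the pseudo-random normals with inverse-CDF-transformed Sobol' points is what gives the faster QMC convergence shown in both figures.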

Summary

Sobol' sequences possess additional uniformity properties which MC and LHS techniques do not have (Properties A and A').

Comparison of L2 discrepancies shows that the QMC method has the lowest discrepancy in low dimensions (up to 20).

QMC method outperforms MC and LHS for Type A and B functions (problems with low effective dimensions).

LHS method outperforms MC only for Type B functions.

QMC remains the most efficient method among the three techniques for non-uniform distributions.
