Estimating mean dimensionality of ANOVA decompositions

Ruixue Liu¹ and Art B. Owen²

Department of Statistics
Stanford University
Stanford CA, 94305

Orig: June 2003. Revised: March 2005

Abstract

The analysis of variance is now often applied to functions defined on the unit cube, where it serves as a tool for the exploratory analysis of functions. The mean dimension of a function, defined as a natural weighted combination of its ANOVA mean squares, provides one measure of how hard or easy the function is to integrate by quasi-Monte Carlo sampling. This paper presents some new identities relating the mean dimension, and some analogously defined higher moments, to the variable importance measures of Sobol' (1993). As a result we are able to measure the mean dimension of certain functions arising in computational finance. We produce an unbiased and non-negative estimate of the variance contribution of the highest order interaction, which avoids the cancellation problems of previous estimates. In an application to extreme value theory, we find among other things, that the minimum of $d$ independent $U[0,1]$ random variables has a mean dimension of $2(d+1)/(d+3)$.

Keywords: Effective Dimension, Extreme Value Theory, Functional ANOVA, Global Sensitivity Analysis, Quasi-Monte Carlo

¹Ruixue Liu is a doctoral candidate in Statistics at Stanford University.
²Art B. Owen is a professor of Statistics at Stanford University.
This work was supported by the U.S. NSF under grants DMS-0072445 and DMS-0306612. We thank an associate editor and two anonymous referees for comments that have improved this article.

1 INTRODUCTION

The analysis of variance (ANOVA) for square integrable functions on $[0,1]^d$ is becoming a widely used tool for the exploratory analysis of functions. The ANOVA allows us to quantify the notion that some variables and interactions are much more important than others. The result is a form of global sensitivity analysis, distinct from local methods based on partial derivatives. Saltelli, Chan, and Scott (2000) provide a survey of global sensitivity analysis with numerous applications in the physical sciences.

Within the ANOVA formulation, we may answer questions about variable importance, via numerical integration. Sobol' and his co-authors (Sobol' 1990; Sobol' 1993; Archer, Saltelli, and Sobol' 1997; Sobol' 2001) have developed unbiased Monte Carlo methods for estimating global sensitivity indices expressed through variances of ANOVA component functions.

The ANOVA of $[0,1]^d$ involves $2^d - 1$ effects and corresponding mean squares. For moderately large $d$ it becomes difficult to estimate them all. It is much less difficult to estimate certain interpretable weighted sums of these mean squares.

This paper develops some new theory and algorithms for the ANOVA of $[0,1]^d$. Then it uses these techniques to investigate some functions from financial valuation and extreme value theory. In addition to sensitivity analysis, finance, and extreme value theory, the ideas presented here also have useful applications to the bootstrap, which we omit due to space considerations.

Section 2 introduces our notation, presents the ANOVA of $[0,1]^d$, some variable importance measures from global sensitivity analysis, and the dimension distribution. Section 3 presents new results in the ANOVA of $[0,1]^d$. Included are a measure of variable importance motivated by a problem in machine learning and a Monte Carlo algorithm for estimating it, some new identities relating global sensitivity measures to some weighted combinations of ANOVA mean squares, and an inequality suitable for bounding effective dimension from dimension moments. Section 4 uses the estimation method from Section 3 to explain why some quasi-Monte Carlo rules can perform well, even for functions where the discrepancy bound on error is infinite. It also shows an example from finance in which lower mean dimension corresponds to better gains from quasi-Monte Carlo sampling. Section 5 develops the ANOVA for the minimum of $d$ random variables, as studied in extreme value theory. Section 6 describes the Monte Carlo and quasi-Monte Carlo methods that we used for our numerical answers. The proofs for the extreme value material appear in an appendix.

2 BACKGROUND

Let $f \in L^2[0,1]^d$ and for $x \in [0,1]^d$ write $x = (x_1, \dots, x_d)$. Here we present a brief outline of the ANOVA decomposition for $[0,1]^d$, Sobol's global sensitivity indices, the notion of effective dimension, and the dimension distribution. For more details, the reader may turn to the cited references.

The ANOVA decomposition of $L^2[0,1]^d$ was first presented in Hoeffding (1948) for his analysis of $U$-statistics. Sobol' (1969) uses it in quadrature problems, and Efron and Stein (1981) use it to study the jackknife. Takemura (1983) gives a survey of applications in statistics.

The functional ANOVA is well known for the analysis of statistical functionals. Let $T(Y_1, \dots, Y_d)$ be a function of $d$ independent and identically distributed random variables $Y_j$. Suppose that $Y_j$ has cumulative distribution function $G(y) = \Pr(Y_j \le y)$. We may write $Y_j = G^{-1}(x_j)$ for independent $x_j \sim U[0,1]$ and $G^{-1}(u) = \inf\{y \mid G(y) \ge u\}$. Then $f(x) = T(G^{-1}(x_1), \dots, G^{-1}(x_d))$ represents the statistic $T$ as a function on the unit cube. Under some smoothness conditions in von Mises (1947), the function $f$ becomes dominated by an additive approximation in the large $d$ limit. This result underlies central limit theorems for $T(Y_1, \dots, Y_d)$.

Sobol's sensitivity indices describe the relative importance to $f$ of the $d$ input variables $x_j$ considered individually and in subsets, as presented below. The sensitivity indices are based in turn on the analysis of variance (ANOVA) decomposition.

2.1 Notation

For subsets $u \subseteq D = \{1, \dots, d\}$, let $|u|$ denote the cardinality of $u$, $v - u$ denote the set difference $\{j \mid j \in v, j \notin u\}$, and $-u$ denote the complement $D - u$. By $x_u$ we denote the $|u|$-tuple of components $x_j$ for $j \in u$. The domain of $x_u$ is a copy of $[0,1]^{|u|}$ written as $[0,1]^u$.

For $x$ and $z$ in $[0,1]^d$ the expression $f(x_u, z_{-u})$ means $f$ evaluated at the point $p \in [0,1]^d$ with $p_j = x_j$ for $j \in u$ and $p_j = z_j$ for $j \notin u$. When $g(x_u, x_{-u}) = g(x_u, z_{-u})$ for all $x, z \in [0,1]^d$, then we say that the function $g$ depends on $x$ only through $x_u$, or equivalently, that $g$ does not depend on $x_{-u}$.

Let $\int f(x)\,dx = I$ and suppose $\sigma^2 = \int (f(x) - I)^2\,dx < \infty$. To rule out trivialities, we also assume that $\sigma^2 > 0$. Integrals of the form $\int g(x)\,dx$ are taken to be over $[0,1]^d$ and produce scalar values. Integrals of the form $\int g(x)\,dx_v$ represent integration with respect to $x_v \in [0,1]^v$, with the result viewed as a function of $x$ that does not depend on $x_v$.

2.2 ANOVA decomposition

In the ANOVA decomposition we write
\[ f(x) = \sum_{u \subseteq \{1,\dots,d\}} f_u(x) \tag{1} \]
where $f_u(x)$ depends on $x$ only through $x_u$. The term $f_u$ is obtained by subtracting from $f$ all terms for strict subsets of $u$, and then averaging over $x_{-u}$ to give a function not depending on $x_{-u}$:
\[ f_u(x) = \int \Bigl( f(x) - \sum_{v \subsetneq u} f_v(x) \Bigr)\,dx_{-u} = \int f(x)\,dx_{-u} - \sum_{v \subsetneq u} f_v(x). \tag{2} \]
Using usual conventions, $f_\emptyset$ is the constant function equal to $I$ for all $x \in [0,1]^d$.

It follows by induction on $|u|$ that when $j \in u$ then $\int_0^1 f_u(x)\,dx_j = 0$, so that $\int f_u(x) f_v(x)\,dx = 0$ when $u \ne v$. More generally $\int f_u(x) g_v(x)\,dx = 0$ when $u \ne v$ and $f, g \in L^2[0,1]^d$. The variance of $f_u$ is written as $\sigma^2_u$. Clearly $\sigma^2_\emptyset = 0$, while if $u \ne \emptyset$ then $\sigma^2_u = \int f_u(x)^2\,dx$. The ANOVA is named for the following easily proved property:
\[ \sigma^2 = \sum_{u \subseteq \{1,\dots,d\}} \sigma^2_u. \tag{3} \]

2.3 Sensitivity and variable importance

Sobol' (1993) gives two measures of the importance of a subset $u$ of the variables, which we label
\[ \underline{\tau}^2_u = \sum_{v \subseteq u} \sigma^2_v, \quad\text{and,} \tag{4} \]
\[ \bar{\tau}^2_u = \sum_{v \cap u \ne \emptyset} \sigma^2_v. \tag{5} \]
These can be thought of as lower and upper limits, respectively, on the importance of the subset $u$. It is easy to show that $0 \le \underline{\tau}^2_u \le \bar{\tau}^2_u \le \sigma^2$ and that $\underline{\tau}^2_u + \bar{\tau}^2_{-u} = \sigma^2$. Normalized versions, $\underline{\tau}^2_u/\sigma^2$ and $\bar{\tau}^2_u/\sigma^2$, are known as global sensitivity indices.

Let $g^* \in L^2[0,1]^d$ be the minimizer of $\int (f(x) - g(x))^2\,dx$ among functions $g$ that depend only on $x_u$. Then $g^* = \sum_{v \subseteq u} f_v$ and $\underline{\tau}^2_u = \int g^*(x)^2\,dx - I^2$. If $\underline{\tau}^2_u/\sigma^2$ is close to one then $f$ is close to a function that depends only on $x_u$. If $\bar{\tau}^2_u$ is small, then as Sobol' (1993) describes, the variables $x_u$ may be considered unessential, and in some applications we might choose to fix them at default values.

Equation (4) expresses $2^d$ values $\underline{\tau}^2_u$ as linear combinations of $2^d$ values $\sigma^2_u$. The inverse linear relation is
\[ \sigma^2_u = \sum_{v \subseteq u} (-1)^{|u-v|}\, \underline{\tau}^2_v. \tag{6} \]
To compute $\sigma^2_u$ from $\bar{\tau}^2_v$ we can combine equation (6) with the identity $\sigma^2 = \underline{\tau}^2_v + \bar{\tau}^2_{-v}$. Sobol' (1993) gives the identities:
\[ I^2 + \underline{\tau}^2_u = \int f(x_u, x_{-u})\, f(x_u, z_{-u})\,dx\,dz_{-u}, \quad\text{and,} \]
\[ \bar{\tau}^2_u = \frac{1}{2} \int \bigl( f(x_u, x_{-u}) - f(z_u, x_{-u}) \bigr)^2\,dx\,dz_u. \]
The integrals in these identities provide the basis for Monte Carlo or quasi-Monte Carlo estimation of sensitivity indices. For small $d$ it is feasible to estimate $I^2$ and all $2^d - 1$ non-degenerate $\underline{\tau}^2_u$ values. Then the ANOVA components $\sigma^2_u$ can be estimated from (6).

For large $|u|$ it often happens that $\sigma^2_u$ is small compared to the numerical error in the estimates of some $\underline{\tau}^2_v$ for $v \subseteq u$. Then the subtractions in (6) may yield estimates of $\sigma^2_u$ with large relative errors.
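To make the sampling concrete, here is a minimal Monte Carlo sketch of the two identities above. The code, the helper name `sobol_indices`, and the toy product test function are our illustrative choices, not the authors' implementation.

```python
import numpy as np

def sobol_indices(f, d, u, n=100_000, rng=None):
    """Plain MC estimates of I^2 + lower tau^2_u and of upper tau^2_u,
    from Sobol's two identities. u is a list of coordinate indices."""
    rng = np.random.default_rng(rng)
    u = list(u)
    x = rng.random((n, d))
    z = rng.random((n, d))
    xz = z.copy()
    xz[:, u] = x[:, u]                          # the point (x_u, z_{-u})
    zx = x.copy()
    zx[:, u] = z[:, u]                          # the point (z_u, x_{-u})
    fx = f(x)
    i2_plus_lower = np.mean(fx * f(xz))         # estimates I^2 + lower tau^2_u
    upper = 0.5 * np.mean((fx - f(zx)) ** 2)    # estimates upper tau^2_u
    return i2_plus_lower, upper

# Toy check: a product function whose ANOVA is known in closed form,
# with I = 1 and sigma^2_v = 12^{-|v|}.
g = lambda x: np.prod(1.0 + (x - 0.5), axis=1)
print(sobol_indices(g, d=3, u=[0, 1], n=200_000, rng=0))
```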

2.4 Effective dimension

A function may be thought to have an effective dimension smaller than $d$ if it can be closely approximated by certain sums of functions that involve fewer than $d$ components of $x$. Caflisch, Morokoff, and Owen (1997) define the effective dimension of a function in two senses. The function $f$ has effective dimension $s$ in the superposition sense if $\sum_{|u| \le s} \sigma^2_u \ge 0.99\sigma^2$, and it has effective dimension $s$ in the truncation sense if $\sum_{u \subseteq \{1,\dots,s\}} \sigma^2_u \ge 0.99\sigma^2$.

The choice of the 99th percentile is arbitrary, but reasonable in the context of quasi-Monte Carlo sampling. Hickernell (1998) makes the threshold quantile a parameter in the definition. We will emphasize superposition.

The extreme of low effective dimension is attained by additive functions. An additive function can be integrated very effectively by Latin hypercube sampling (McKay, Beckman, and Conover 1979), and it can be optimized by optimizing separately over each input variable. For instance, one can optimize $f(1/2, \dots, x_j, \dots, 1/2)$ for $j = 1, \dots, d$ to obtain the global optimum $(x_1, \dots, x_d)$ of $f$. Stein (1987) shows that Latin hypercube sampling remains an effective integration tool for functions that are nearly additive. For nearly-additive $f$, separate optimization remains a useful heuristic, though we can construct functions for which it will fail badly.

2.5 Dimension distribution

The dimension distribution (in the superposition sense) is a discrete probability distribution with mass function
\[ \nu(j) = \frac{1}{\sigma^2} \sum_{|u| = j} \sigma^2_u, \qquad j = 1, \dots, d. \]
If one chooses a non-empty set $U \subseteq \{1,\dots,d\}$ at random, such that the probability that $U = u$ is $\sigma^2_u/\sigma^2$, then $\Pr(|U| = j) = \nu(j)$.

Owen (2003) computes the dimension distribution for some test functions used in quasi-Monte Carlo. Some widely used test functions for numerical integration have very low effective dimension, making them relatively easy. Other test functions are more intrinsically of high dimension. Wang and Fang (2002) give a recursive algorithm for the dimension distribution of functions of product form.

The effective dimension is defined through a quantile of the dimension distribution. Such quantiles can be hard to estimate directly. Moments are easier to estimate, and in some cases, simple bounds such as those of Chebychev or Markov yield usable quantile bounds from moment bounds. When comparing functions of low effective dimension, it can happen that all of the functions being compared have the same low effective dimension. See Wang and Fang (2002) for examples. Then the mean dimension may serve as an easy to compute tie breaker.

3 NEW ANOVA RESULTS

Here we present new results for the ANOVA of $[0,1]^d$. Section 3.1 presents a notion of the importance of $x_u$ based on supersets of $u$. Section 3.2 presents some new identities relating dimension moments to global sensitivity indices. Section 3.3 proves an inequality bounding tail probabilities of the dimension distribution in terms of dimension moments.

3.1 Superset importance

The quantity
\[ \Upsilon^2_u = \sum_{w \supseteq u} \sigma^2_w \tag{7} \]
is used in the study of black box functions $f$ in machine learning. The interpretability of $f$ can be improved by ignoring certain collections of high order interactions among the variables. Then $\Upsilon^2_u$ represents the cost, in lost model fidelity, of ignoring $f_v$ for all $v \supseteq u$ (Hooker 2004). Clearly $\Upsilon^2_u \le \bar{\tau}^2_u$, so that ignoring the supereffects of $u$ is a less severe simplification than freezing $x_u$. The inverse of (7) is
\[ \sigma^2_u = \sum_{w \supseteq u} (-1)^{|w-u|}\, \Upsilon^2_w. \tag{8} \]

For $|u| = 1$ we find $\Upsilon^2_{\{j\}} = \bar{\tau}^2_{\{j\}}$. For $|u| = d$, $\Upsilon^2_D = \sigma^2_D$ describes the effect of the full $d$-dimensional interaction $f_D(x)$. When we are interested in $\sigma^2_u$ for $|u|$ near $d$, then (8) may introduce much less cancellation than (6). For some functions $f_D$ is the only discontinuous ANOVA term, the others being smoothed by the integration step in (2).

The quantities $\Upsilon^2_u$ can be directly estimated as $d + |u|$ dimensional integrals. We take 2 independent random values $x_j$ and $z_j$ for each $j \in u$, both from the $U[0,1]$ distribution. There are $2^{|u|}$ ways to combine these values to sample a point from $[0,1]^u$. Every such combination is then completed with a single random point $z_{-u} \sim U[0,1]^{-u}$, and the expected mean square of the $u$-effect is related to $\Upsilon^2_u$ as follows:

Theorem 1
\[ \Upsilon^2_u = \frac{1}{2^{|u|}} \int \Bigl( \sum_{v \subseteq u} (-1)^{|u-v|} f(x_v, z_{-v}) \Bigr)^2 dx_u\,dz. \tag{9} \]

Proof: First $\sum_{v \subseteq u} (-1)^{|u-v|} f(x_v, z_{-v}) = \sum_{w \subseteq D} \sum_{v \subseteq u} (-1)^{|u-v|} f_w(x_v, z_{-v})$. Suppose that there is a $j \in u$ with $j \notin w$. Then
\[ \sum_{v \subseteq u} (-1)^{|u-v|} f_w(x_v, z_{-v}) = \sum_{v \subseteq u - \{j\}} (-1)^{|u-v|} \bigl( f_w(x_v, z_{-v}) - f_w(x_{v \cup \{j\}}, z_{-v-\{j\}}) \bigr) = 0 \]
because $f_w$ does not depend on $x_j$. Therefore $\sum_{v \subseteq u} (-1)^{|u-v|} f(x_v, z_{-v}) = \sum_{v \subseteq u} (-1)^{|u-v|} \sum_{w \supseteq u} f_w(x_v, z_{-v})$, and so
\begin{align*}
\int \Bigl( \sum_{v \subseteq u} (-1)^{|u-v|} & f(x_v, z_{-v}) \Bigr)^2 dx_u\,dz \\
&= \sum_{v_1 \subseteq u} (-1)^{|u-v_1|} \sum_{v_2 \subseteq u} (-1)^{|u-v_2|} \sum_{w_1 \supseteq u} \sum_{w_2 \supseteq u} \int f_{w_1}(x_{v_1}, z_{-v_1})\, f_{w_2}(x_{v_2}, z_{-v_2})\,dx_u\,dz \\
&= \sum_{v_1 \subseteq u} (-1)^{|u-v_1|} \sum_{v_2 \subseteq u} (-1)^{|u-v_2|} \sum_{w \supseteq u} \int f_w(x_{v_1}, z_{-v_1})\, f_w(x_{v_2}, z_{-v_2})\,dx_u\,dz \\
&= \sum_{v \subseteq u} (-1)^{2|u-v|} \sum_{w \supseteq u} \int f_w(x_v, z_{-v})^2\,dx_u\,dz \\
&= 2^{|u|} \sum_{w \supseteq u} \sigma^2_w. \qquad \Box
\end{align*}

The result in Theorem 1 can also be obtained via the classical formulas for expected mean squares in the discrete ANOVA, established by Cornfield and Tukey (1956). The sampling scheme in Theorem 1 is a full $2^{|u|}$ factorial experiment with crossed random effects for $x_u$, and the variables of $x_{-u}$ subsumed into the error term. To translate the language of fixed and random effects in nested and crossed designs to the present setting, and then apply the multistep algorithm for expected mean squares (Montgomery 2000), is more awkward than it is to establish the result directly. There may be other classical experimental designs worth randomly embedding in the unit cube. The embedding of randomized orthogonal array designs, such as fractional factorial designs, was considered in Owen (1992) and Owen (1994).

To see the advantage of using $\Upsilon^2_u$, consider $u = D = \{1,\dots,d\}$. Then the estimate of $\Upsilon^2_D = \sigma^2_D$ is an average of squared differences. It is numerically better to average squared differences than to take a difference of large squared quantities. The latter operation is subject to cancellation error and may even give a negative value. This advantage of using $\Upsilon^2_u$ will extend to other $u$ with $|u|$ near $d$. When $d$ is large, it becomes expensive to obtain $2^d n$ evaluations of $f$. But the alternative approach based on (6) also runs into difficulty, as it requires estimates of $2^d$ integrals.

In a Monte Carlo evaluation of $\Upsilon^2_u$ based on Theorem 1 we need $2^{|u|} n$ function evaluations, corresponding to $n$ pairs $(x^u_i, z_i) \sim U[0,1]^{|u|+d}$. We can use these same function values to estimate $\Upsilon^2_v$ for all $v \subseteq u$. Let $v$ be a non-empty subset of $u$. Then
\[ \frac{1}{n 2^{|v|}} \sum_{i=1}^n \Bigl( \sum_{w \subseteq v} (-1)^{|v-w|} f(x^w_i, z^{-w}_i) \Bigr)^2 \tag{10} \]
has expected value $\Upsilon^2_v$ by Theorem 1. Equation (10) only uses $2^{|v|} n$ of the function values. An alternative estimate for $\Upsilon^2_v$ averages together $2^{|u-v|}$ estimates like (10),
\[ \frac{1}{n 2^{|u|}} \sum_{i=1}^n \sum_{w' \subseteq u-v} \Bigl( \sum_{w \subseteq v} (-1)^{|v-w|} f(x^{w \cup w'}_i, z^{-w-w'}_i) \Bigr)^2, \tag{11} \]
and hence uses all $2^{|u|} n$ function values.
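As an illustration, the following minimal sketch implements estimator (10). The code and the helper name `upsilon2` are ours, not from the paper; the test function is the minimum studied in Section 5, for which closed-form answers are available for comparison.

```python
import itertools
import numpy as np

def upsilon2(f, d, u, n=50_000, rng=None):
    """Monte Carlo estimate of Upsilon^2_u via equation (10): the mean of
    squared 2^{|u|}-fold alternating differences, divided by 2^{|u|}."""
    rng = np.random.default_rng(rng)
    u = list(u)
    x = rng.random((n, d))
    z = rng.random((n, d))
    total = np.zeros(n)
    for k in range(len(u) + 1):
        for w in itertools.combinations(u, k):
            pt = z.copy()                     # start from z everywhere ...
            pt[:, list(w)] = x[:, list(w)]    # ... then use x on the coordinates in w
            total += (-1) ** (len(u) - k) * f(pt)
    return np.mean(total ** 2) / 2 ** len(u)

# Example: the full interaction variance sigma^2_D for f = min over [0,1]^4.
fmin = lambda x: x.min(axis=1)
print(upsilon2(fmin, d=4, u=[0, 1, 2, 3], n=100_000, rng=1))
```

Because the estimate is an average of squares, it is non-negative by construction, which is exactly the advantage over the subtractions in (6).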

3.2 Dimension moments

Here we present new formulas that relate the mean, variance, and higher moments of the dimension distribution to certain sums of $\underline{\tau}^2_u$, $\bar{\tau}^2_u$, and $\Upsilon^2_u$. Direct estimates of $\underline{\tau}^2_u$, $\bar{\tau}^2_u$, and $\Upsilon^2_u$, combined with an estimate of $\sigma^2$, then allow estimates of moments of the dimension distribution without having to combine estimates of $2^d - 1$ integrals.

Theorem 2 below shows that we can estimate the mean of the dimension distribution (in the superposition sense) through $d$ integrals. To estimate the variance, an additional $d(d-1)/2$ integrals suffice. Generally, an estimate of the first $k$ moments of the dimension distribution can be made with $O(d^k)$ integrals, as shown in Theorem 3. Recall that $\sigma^2 > 0$ by assumption.

Theorem 2 Let $U$ be a randomly chosen subset of $\{1,\dots,d\}$ with $\Pr(U = u) = \sigma^2_u/\sigma^2$. Then
\[ E(|U|) = \frac{1}{\sigma^2} \sum_{j=1}^d \bar{\tau}^2_{\{j\}}, \quad\text{and for } d \ge 2, \tag{12} \]
\[ E(|U|^2) = (2d-1)\,E(|U|) - \frac{2}{\sigma^2} \sum_{j=2}^d \sum_{k=1}^{j-1} \bar{\tau}^2_{\{j,k\}}. \tag{13} \]

Proof:
\[ \sum_{j=1}^d \bar{\tau}^2_{\{j\}} = \sum_{j=1}^d \sum_{u \cap \{j\} \ne \emptyset} \sigma^2_u = \sum_u \sigma^2_u \sum_{j=1}^d 1_{j \in u} = \sum_u |u|\,\sigma^2_u = \sigma^2 E(|U|), \]
establishing (12). Next,
\[ \sum_{|v|=2} \bar{\tau}^2_v = \sum_u \sigma^2_u \sum_{|v|=2} 1_{u \cap v \ne \emptyset}. \]
Among subsets $v$ of $\{1,\dots,d\}$ with $|v| = 2$, there are $|u|(d-|u|)$ with $|u \cap v| = 1$ and $|u|(|u|-1)/2$ with $|u \cap v| = 2$. Therefore
\[ \sum_{|v|=2} 1_{u \cap v \ne \emptyset} = |u|(d-|u|) + |u|(|u|-1)/2 = |u|(d - 1/2) - |u|^2/2 \]
and so
\[ \frac{2}{\sigma^2} \sum_{|v|=2} \bar{\tau}^2_v = \frac{1}{\sigma^2} \sum_u \sigma^2_u \bigl( |u|(2d-1) - |u|^2 \bigr) = (2d-1)E(|U|) - E(|U|^2), \]
establishing (13). $\Box$
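To make (12) operational, each $\bar{\tau}^2_{\{j\}}$ can be estimated by the squared-difference identity of Section 2.3 and divided by a sample estimate of $\sigma^2$. The sketch below is our illustration, not the authors' code; the sample size and test function are arbitrary choices.

```python
import numpy as np

def mean_dimension(f, d, n=100_000, rng=None):
    """Estimate E(|U|) via (12): sum over j of upper tau^2_{j}, each from the
    squared-difference identity of Section 2.3, divided by a sample variance."""
    rng = np.random.default_rng(rng)
    x = rng.random((n, d))
    z = rng.random((n, d))
    fx = f(x)
    total = 0.0
    for j in range(d):
        xj = x.copy()
        xj[:, j] = z[:, j]                     # resample only coordinate j
        total += 0.5 * np.mean((fx - f(xj)) ** 2)
    return total / fx.var()

# Sanity check against Theorem 5: the minimum has mean dimension 2(d+1)/(d+3).
d = 10
print(mean_dimension(lambda x: x.min(axis=1), d), 2 * (d + 1) / (d + 3))
```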

Theorem 2 can be generalized for $E(|U|^k)$ with $1 \le k \le d$. The result is an expression relating a $k$'th order polynomial in $|u|$ to the sum of $\bar{\tau}^2_v$ over $|v| = k$. These in turn can be used to obtain expressions for $k$'th and higher moments of $|U|$.

Theorem 3 For $1 \le k \le d$,
\[ \sum_{|v|=k} \bar{\tau}^2_v = \binom{d}{k} \sigma^2 - \sum_u \sigma^2_u \binom{d-|u|}{k}, \quad\text{and,} \tag{14} \]
\[ \sum_{|v|=k} \underline{\tau}^2_v = \sum_u \sigma^2_u \binom{d-|u|}{d-k}, \tag{15} \]
where $u, v \subseteq \{1,\dots,d\}$.

Proof: By summing over $r = |u \cap v|$,
\begin{align*}
\sum_{|v|=k} \bar{\tau}^2_v &= \sum_u \sigma^2_u \sum_{r=1}^k \binom{|u|}{r} \binom{d-|u|}{k-r} \\
&= \sum_u \sigma^2_u \left( \Bigl( \sum_{r=0}^k \binom{|u|}{r} \binom{d-|u|}{k-r} \Bigr) - \binom{d-|u|}{k} \right) \\
&= \sum_u \sigma^2_u \left( \binom{d}{k} - \binom{d-|u|}{k} \right) \\
&= \binom{d}{k} \sigma^2 - \sum_u \sigma^2_u \binom{d-|u|}{k},
\end{align*}
establishing (14). Equation (15) then follows from $\underline{\tau}^2_v = \sigma^2 - \bar{\tau}^2_{-v}$. $\Box$

The $k$'th factorial moment of the dimension distribution is
\[ \mu_{(k)} = \mu_{(k)}(f) = E\bigl( |U| (|U|-1) \cdots (|U|-k+1) \bigr), \]
for $1 \le k \le d$. The first $k$ ordinary moments $E(|U|^k)$ can be calculated from the first $k$ factorial moments, as illustrated below. Factorial moments may be estimated through sums of $\Upsilon^2_v$.
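Concretely, the conversion from factorial to ordinary moments uses Stirling numbers of the second kind, via the standard identity $x^k = \sum_{j \le k} S(k,j)\, x(x-1)\cdots(x-j+1)$. A small sketch (the helper names are ours, not the paper's):

```python
def stirling2(k, j):
    """Stirling number of the second kind S(k, j), by the usual recurrence."""
    if k == j:
        return 1
    if j == 0 or j > k:
        return 0
    return j * stirling2(k - 1, j) + stirling2(k - 1, j - 1)

def ordinary_from_factorial(factorial_moments):
    """Convert [mu_(1), ..., mu_(K)] into [E|U|, ..., E|U|^K] using
    E(|U|^k) = sum_{j<=k} S(k, j) mu_(j)."""
    K = len(factorial_moments)
    return [sum(stirling2(k, j) * factorial_moments[j - 1] for j in range(1, k + 1))
            for k in range(1, K + 1)]

# Example: mu_(1) = 1.5 and mu_(2) = 1.0 give E|U| = 1.5 and E|U|^2 = 2.5.
print(ordinary_from_factorial([1.5, 1.0]))
```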

Theorem 4 For $1 \le k \le d$,
\[ \sum_{|v|=k} \Upsilon^2_v = \sum_u \binom{|u|}{k} \sigma^2_u = \frac{\sigma^2}{k!}\, \mu_{(k)}(f), \tag{16} \]
where $u, v \subseteq \{1,\dots,d\}$. Let $U$ be a randomly chosen subset of $\{1,\dots,d\}$ with $\Pr(U = u) = \sigma^2_u/\sigma^2$. Then
\[ E(|U|) = \frac{1}{\sigma^2} \sum_{j=1}^d \Upsilon^2_{\{j\}}, \quad\text{and for } d \ge 2, \tag{17} \]
\[ E(|U|^2) = \frac{2}{\sigma^2} \sum_{j=2}^d \sum_{k=1}^{j-1} \Upsilon^2_{\{j,k\}} + E(|U|). \tag{18} \]

Proof:
\[ \sum_{|v|=k} \Upsilon^2_v = \sum_{|v|=k} \sum_{u \supseteq v} \sigma^2_u = \sum_u \sigma^2_u \sum_{|v|=k} 1_{u \supseteq v} = \sum_u \binom{|u|}{k} \sigma^2_u, \]
establishing (16). Equations (17) and (18) then follow by taking $k = 1$ and $k = 2$, respectively. $\Box$

For estimation of $E(|U|)$, Theorems 2 and 4 are equivalent because $\Upsilon^2_u = \bar{\tau}^2_u$ when $|u| = 1$. For estimation of $E(|U|^2)$, the theorems differ.

The function $f(x)$ is symmetric if $f(x_1, \dots, x_d) = f(x_{\pi(1)}, \dots, x_{\pi(d)})$ for any permutation $\pi$ of $1, \dots, d$. Most commonly considered statistical functionals are symmetric in this way. The complete dimension distribution of a symmetric function can be obtained with just $d$ integrals, such as $\underline{\tau}^2_{\{1,\dots,k\}}$ for $1 \le k \le d$, but there remains the problem of error cancellation.

For a symmetric function $f$, Theorem 4 yields
\[ \mu_{(k)}(f) = \frac{k!}{\sigma^2} \binom{d}{k} \Upsilon^2_{\{1,\dots,k\}} = \frac{d!}{(d-k)!\,\sigma^2}\, \Upsilon^2_{\{1,\dots,k\}}. \]

3.3 From moments to tail probabilities

When $E(|U|) \le 1 + \varepsilon$ it is easy to show that $\nu(1) \ge 1 - \varepsilon$. For small $\varepsilon$, then, the moment provides an adequate bound on the quantile. More generally:

Lemma 1 If $E(|U|^k) \le 1 + \varepsilon$ for $k > 0$, then for $r > 1$,
\[ \Pr(|U| \ge r) \le \frac{\varepsilon}{r^k - 1}. \tag{19} \]

Proof: Because $|U|^k - 1 \ge 0$, we may apply Markov's inequality to get $\Pr(|U|^k \ge z + 1) \le \varepsilon/z$ for any $z > 0$. Taking $z = r^k - 1$ yields (19). $\Box$
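For example (with hypothetical numbers), if a sampling experiment gives the estimate $E(|U|) \le 1.05$, then taking $k = 1$ and $r = 2$ in (19) bounds $\Pr(|U| \ge 2)$ by $0.05/(2-1) = 0.05$, so the additive part of $f$ must carry at least 95% of the variance, in agreement with the remark that $\nu(1) \ge 1 - \varepsilon$.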

4 MULTIDIMENSIONAL INTEGRATION

Our original motivation for looking at the dimension structure of functions comes from the theory of quasi-Monte Carlo (QMC) integration. In QMC the integral $I = \int f(x)\,dx$ is approximated by $\hat{I} = (1/n) \sum_{i=1}^n f(x_i)$, for carefully chosen points $x_i \in [0,1]^d$. A survey of QMC is beyond the scope of this article. The reader is referred to Niederreiter (1992) and Sloan and Joe (1994). A recent survey of randomized QMC is given in L'Ecuyer and Lemieux (2002). The error in QMC is
\[ |\hat{I} - I| \le \sum_{|u| > 0} \biggl| \frac{1}{n} \sum_{i=1}^n f_u(x_i) \biggr|. \tag{20} \]

Each term in (20) has an upper bound of the form $D(x^u_1, \dots, x^u_n)\, \|f_u\|_u$, where the discrepancy $D$ measures distance between the $U[0,1]^u$ distribution and the sample distribution of its arguments, and $\|\cdot\|_u$ is a compatible norm. Hickernell (1996) presents a general family of such quadrature error bounds. Widely used QMC constructions have small discrepancies in low dimensional projections. They can be very effective if $\|f_u\|_u$ is small whenever $|u|$ is large. The norms $\|f_u\|_u$ can be quite different from $L^2$, but Caflisch and Morokoff (1995) and Schlier (2002) find that QMC error is closely related to variance, in practice. For example, a non-axis-oriented discontinuity in $f$ can make (20) infinite for the best known (Koksma-Hlawka) bound, where $D$ is the star discrepancy and $\|\cdot\|_u$ is total variation in the sense of Hardy and Krause. Yet in practice, QMC often works very well despite the discontinuity in $f$.

The QMC error (20) uses equal observation weights on all the $x_i$. If a term $f_u$ has a negligible variance $\sigma^2_u$, then we would not be surprised to find that it makes a negligible contribution to the error $\hat{I} - I$. It could do otherwise if $f_u$ had some very large values that happened to include the actual points $x_i$ used in the QMC rule. While such a coincidence is not impossible, neither is it to be expected in general applications.

To illustrate mean dimension calculations, we consider an option valuation problem. For background in mathematical finance see Duffie (2001) or Hull (2002). Let $S_1, \dots, S_d$ be prices of an asset at times $t_1 < t_2 < \cdots < t_d$. A widely used model for $S_j$ is geometric Brownian motion wherein
\[ S_j(x) = S_0 \exp\Bigl( (t_j - t_0)\bigl(r - \tfrac{1}{2}\sigma^2_S\bigr) + \sigma_S \sum_{\ell=1}^j (t_\ell - t_{\ell-1})^{1/2}\, \Phi^{-1}(x_\ell) \Bigr). \tag{21} \]
Here $S_0$ is the price at time $t_0 < t_1$, which is usually the present time, $r > 0$ is a drift parameter usually equal to an interest rate, $\sigma_S > 0$ is the volatility, and $\Phi$ is the $N(0,1)$ cumulative distribution function. We consider a down and out barrier option with strike price $K$ and barrier $B$, that pays $(S_d - K)_+ 1_{\min_{1 \le j \le d} S_j > B}$ at time $t_d$. If any $S_j$ is below $B$ the option is knocked out, and becomes worthless. The value of this option at time $t_0$ is $I = \int f(x)\,dx$ where
\[ f(x) = e^{-r(t_d - t_0)} (S_d(x) - K)_+ \times 1_{\min_{1 \le j \le d} S_j(x) \ge B}. \tag{22} \]
For the barrier option (22) we always use parameters $r = 0.06$, $\sigma_S = 0.25$, $S_0 = 40$, with $d = 12$ and $t_j = j/12$ for $0 \le j \le d$. The values of $K$ and $B$ are varied below.
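For concreteness, here is a minimal sketch of the integrand (21)-(22) as we read it. The code is our illustration, not the authors' implementation; it uses `norm.ppf` from scipy for $\Phi^{-1}$.

```python
import numpy as np
from scipy.stats import norm

# Parameters from the text: r = 0.06, sigma_S = 0.25, S0 = 40, d = 12, t_j = j/12.
r, sigma_S, S0, d = 0.06, 0.25, 40.0, 12
t = np.arange(d + 1) / 12.0

def barrier_integrand(x, K, B):
    """f(x) of equation (22): discounted down-and-out call payoff with the
    geometric Brownian motion path (21). x has shape (n, d)."""
    dt = np.diff(t)                                     # t_j - t_{j-1}
    bm = np.cumsum(np.sqrt(dt) * norm.ppf(x), axis=1)   # scaled Brownian path
    S = S0 * np.exp((t[1:] - t[0]) * (r - 0.5 * sigma_S ** 2) + sigma_S * bm)
    payoff = np.maximum(S[:, -1] - K, 0.0) * (S.min(axis=1) >= B)
    return np.exp(-r * (t[-1] - t[0])) * payoff

x = np.random.default_rng(0).random((100_000, d))
print(barrier_integrand(x, K=30.0, B=30.0).mean())      # crude MC estimate of I
```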

The mean dimension of $f$ was estimated on a grid of barrier values $B$ and strike values $K$, as described in Section 6. The result is shown in Figure 1. When both $B$ and $K$ are very low, we see a mean dimension barely larger than one. Here the function $f(x)$ is usually equal to $e^{-r}(S_d(x) - K)_+$ and $S_d$ is close to additive in the $x_j$. Generally, increasing the value of $K$ makes the integrand $f$ more spiky and increases the mean dimension. When the barrier $B$ is low, it is seldom reached, and then small changes in $B$ have almost no effect on the function $f$, including its mean dimension. The $(K, B)$ points giving the highest mean dimension tend to have a function $f$ that is usually 0 apart from a spike. In these settings we expect QMC to bring less benefit, and some sort of importance sampling might be advantageous.

K     n       QMC var    MC var     QMC eff
30    2,197   0.000804   0.047135   58.63
54    2,197   0.000450   0.006042   13.43
30    28,561  0.000037   0.002962   80.05
54    28,561  0.000018   0.000558   31.00

Table 1: This table compares the efficiency gain of RQMC versus MC for two integrands described in the text. The one with strike $K = 30$ has low mean dimension and the one with strike $K = 54$ has a relatively high mean dimension. For RQMC, $13^2 = 169$ internal replicates of $n = 13^3$ or $13^4$ were run and their sample variance is reported. Next RQMC was replaced by MC for 169 replicates of $n$. The ratio of MC sample variance to RQMC sample variance appears in the column headed QMC eff.

Our theory predicts that lower mean dimension yields a better improvement when MC is replaced by QMC. To test this theory, we considered two instances of the down and out barrier integrand. Both had barrier $B = 30$. One had strike $K = 30$ and the relatively low mean dimension, 1.13. The other instance had $K = 54$, leading to more knockouts, and a spikier integrand with higher mean dimension, 1.97. The results appear in Table 1. For $n = 13^3$ = 2,197 we find that RQMC is about 59 times as efficient as MC on the integrand with low mean dimension, and only about 13.4 times as efficient on the other integrand. For $n = 13^4$ = 28,561, RQMC attains a variance improvement of about 80 for low mean dimension, and only 31 for high mean dimension. In both cases the benefit of RQMC increased with the sample size.

The functions $f_u$ for $|u| < d$ can be smoother than $f$ because they are defined through integrals of $f$. Often, $f_D$ is the only discontinuous ANOVA component of a financial integrand. As a case in point, consider an Asian option with an up and out feature. The payoff is $(A - K)_+ \times 1_{A \le U}$ where $A = (1/d) \sum_{j=1}^d S_j$ is the average asset price.

For the same geometric Brownian motion $S_j$ described above, taking $K = 40$ and $U = 60$, we find that $\sigma^2_D/\sigma^2 \doteq 0.0082$ using methods described in Section 6. Thus (R)QMC sampling is little affected by the discontinuity in this $f$. It is not surprising that a discontinuous function can be approximated by a continuous one, as continuous functions are dense in $L^2[0,1]^d$. We could for instance replace $f$ by a kernel smoothed version of $f$ with a very small bandwidth. The significance here is that the function $f - f_D$ is very close to $f$, while having ANOVA components identical to those of $f$, for $|u| < d$.

5 EXTREME VALUES

Many financial options are valued according to the best or the worst of several choices. For a function of the form $f(x) = \min(g_1(x), \dots, g_k(x))$, one might wonder if $f$ must have low dimensionality when the underlying $g_j$ do. Alternatively, the presence of the min operator might severely increase the dimensionality. To investigate this issue, we consider a simplified problem in which $f(x) = G^{-1}(\min(x_1, \dots, x_d))$ for distribution functions $G$ with a finite lower bound and finite variance. The limiting distribution of $f(x)$ for $x \sim U[0,1]^d$ is Weibull, with shape depending on $G$ near the origin. A recent reference on extreme values is Coles (2001).

The mean dimension is a tractable tool for comparing results for different $G$. We find that the mean dimension depends very strongly on $G$. For some $G$ the mean dimension is close to $d$. We do not expect that the mean dimension can approach 1 as $d \to \infty$, for then $f$ would be additive and a central limit theorem should apply. We do find that for some $G$, the mean dimension can be just slightly smaller than 1.22 for arbitrarily large $d$. For uniform $G$, we show that the limiting dimension distribution is geometric.

In the theory of von Mises (1947), the asymptotic distribution of a statistic is dominated by the joint effects of the variables taken $r$ at a time, when the statistic is smooth and precisely $r - 1$ functional derivatives vanish. The usual case has $r = 1$ and a central limit theorem holding. The minimum function is not a differentiable statistical functional, and the limiting distribution of $f(x)$ is typically not dominated by $f_u$ with $|u| = r$ for any single $r$. The dimension distribution describes another way in which extreme values are not well approximated in von Mises' framework.

5.1 Minimum of d uniform variables

To begin, let $x \sim U[0,1]^d$ and take $f(x) = \min_{1 \le j \le d} x_j$. The following identity, for the normalization constant of the Beta distribution, is useful below. For $\alpha > 0$ and $\beta > 0$,
\[ \int_0^1 y^{\alpha-1} (1-y)^{\beta-1}\,dy = \frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha+\beta)}, \tag{23} \]
where $\Gamma(\alpha) = \int_0^\infty y^{\alpha-1} e^{-y}\,dy$ is the Gamma function. For integer $\alpha \ge 1$ we note that $\Gamma(\alpha) = (\alpha-1)!$.

The minimum $y$ of $d$ independent uniform random variables has probability density function $d(1-y)^{d-1}$ on $0 < y < 1$. Using (23) we easily find that $I = 1/(d+1)$ and $\sigma^2 + I^2 = 2/[(d+1)(d+2)]$, so that
\[ \sigma^2 = \frac{d}{(d+1)^2(d+2)}. \tag{24} \]

Lemma 2 Let $x_j$ be independent $U[0,1]$ random variables for $1 \le j \le d$ and let $f(x) = \min(x_1, \dots, x_d)$. Then for non-empty $u \subseteq \{1,\dots,d\}$,
\[ \sum_{v \subseteq u} f_v(x) = \frac{1}{d - |u| + 1} \Bigl( 1 - \bigl(1 - \min_{j \in u} x_j\bigr)^{d-|u|+1} \Bigr), \tag{25} \]
\[ I^2 + \underline{\tau}^2_u = \frac{2}{(d+1)(2d - |u| + 2)}, \quad\text{and,} \tag{26} \]
\[ \underline{\tau}^2_u = \frac{|u|}{(d+1)^2(2d - |u| + 2)}. \tag{27} \]

Proof: See appendix.

Theorem 5 Let $f(x) = \min(x_1, \dots, x_d)$ for $0 \le x_j \le 1$, $j = 1, \dots, d$. Then the mean dimension of $f(x)$ is $2(d+1)/(d+3)$.

Proof: See appendix.

As $d \to \infty$, the mean dimension of the minimum of $d$ independent $U[0,1]$ random variables increases slowly to 2. Thus, while the minimum does not become approximately additive, neither do the high order ANOVA components dominate. The variables in $x_u$ explain a proportion
\[ \frac{\underline{\tau}^2_u}{\sigma^2} = \frac{|u|/(2d - |u| + 2)}{d/(d+2)} \]
of the variance of $f$. In particular the single variable $x_j$ explains $(d+2)/(d(2d+1))$ of the variance of $f$, and so the entire additive component of $f$ accounts for $(d+2)/(2d+1) \to 1/2$ of the variance. Similar calculations show that $\nu(2) = (d-1)(d+2)/[2d(2d+1)] \to 1/4$ and $\nu(3) = (d-2)(d-1)(d+2)/[(2d-1)2d(2d+1)] \to 1/8$. The rest of this section is devoted to establishing that $\nu(k) \to 2^{-k}$, as $d \to \infty$. That is, the limiting dimension distribution is a geometric one.

For $1 \le k \le d$, let $\underline{\tau}^2_{[k]}$ denote the common value of $\underline{\tau}^2_u$ for $|u| = k$. Similarly let $\sigma^2_{[k]}$ denote the value of $\sigma^2_u$ for $|u| = k$. To these values, adjoin $\underline{\tau}^2_{[0]} = \sigma^2_{[0]} = 0$. The identity (4) becomes
\[ \underline{\tau}^2_{[k]} = \sum_{j=0}^k \binom{k}{j} \sigma^2_{[j]}, \]
and the inverse relationship (6) becomes
\[ \sigma^2_{[k]} = \sum_{j=0}^k (-1)^{k-j} \binom{k}{j} \underline{\tau}^2_{[j]}. \]
These expressions for $\underline{\tau}^2_{[k]}$ and $\sigma^2_{[k]}$ allow us to prove:

Theorem 6 Let $f(x) = \min(x_1, \dots, x_d)$ for $0 \le x_j \le 1$, $j = 1, \dots, d$. Then the limiting dimension distribution of $f$ is geometric: for $k \ge 1$, $\lim_{d \to \infty} \nu(k) = 2^{-k}$.

Proof: See appendix.
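The closed forms above make this easy to check numerically. The following sketch is our code, not the authors'; it evaluates $\nu(k)$ exactly with rational arithmetic, which sidesteps the cancellation issue noted in Section 2.3, and it also confirms the Theorem 5 mean dimension.

```python
from fractions import Fraction
from math import comb

def nu(k, d):
    """Exact nu(k) for f = min(x_1, ..., x_d), combining (24), (27) and the
    binomial inversion sigma^2_[k] = sum_j (-1)^{k-j} C(k,j) tau^2_[j]."""
    tau = lambda j: Fraction(j, (d + 1) ** 2 * (2 * d - j + 2))   # eq. (27)
    sigma2_k = sum((-1) ** (k - j) * comb(k, j) * tau(j) for j in range(k + 1))
    sigma2 = Fraction(d, (d + 1) ** 2 * (d + 2))                  # eq. (24)
    return comb(d, k) * sigma2_k / sigma2

d = 200
print([float(nu(k, d)) for k in range(1, 5)])    # close to 1/2, 1/4, 1/8, 1/16
mean_dim = sum(k * nu(k, d) for k in range(1, d + 1))
print(float(mean_dim), 2 * (d + 1) / (d + 3))    # matches Theorem 5
```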

5.2 Extrema for bounded random variables

Suppose that $Y_i$ are independent random variables with cumulative distribution function $G$ and density function $g$, and that $E(Y_i^4) < \infty$. Suppose that 0 is the lower bound for $G$: $G(0) = 0$ and $G(y) > 0$ for all $y > 0$.

Let $\varepsilon > 0$. For large $d$, the minimum of $Y_1, \dots, Y_d$ is below $\varepsilon$ with probability $1 - (1 - G(\varepsilon))^d \to 1$, as $d \to \infty$. Suppose that $G(y) = H(y)$ for $0 \le y \le \varepsilon$ and $\int y^4\,dH(y) < \infty$. Let $f_G(x) = \min(G^{-1}(x_1), \dots, G^{-1}(x_d))$ and $f_H(x) = \min(H^{-1}(x_1), \dots, H^{-1}(x_d))$. Then applying Cauchy-Schwarz to $(f_G - f_H)^2 = (f_G - f_H)^2 1_{[\varepsilon,1]^d}$ yields
\[ \int (f_G(x) - f_H(x))^2\,dx \le (1 - \varepsilon)^{d/2} \Bigl( \int (f_G(x) - f_H(x))^4\,dx \Bigr)^{1/2}. \]
It follows that every ANOVA component of $f_G - f_H$ has mean square at most $(1 - \varepsilon)^{d/2} M$ for some $M < \infty$. Accordingly, for large $d$, the dimension distribution depends only on $G$ near 0.

To study distributions $H(y) = O(y^A)$ as $y \to 0^+$, we use $G(y) = y^A$, with $g(y) = A y^{A-1}$. Here $0 \le y \le 1$ and $A > 0$. Then $\min_{1 \le j \le r} Y_j$ has cumulative distribution function $G_r(y) = 1 - (1 - G(y))^r$ and density function $g_r(y) = r(1 - G(y))^{r-1} g(y)$.

Let $x \in [0,1]^d$, and let $f(x) = G^{-1}(\min(x_1, \dots, x_d))$. If $x \sim U[0,1]^d$, then the distribution of $d^{1/A} f(x)$ approaches the Weibull distribution with shape $A$ and scale 1 as $d \to \infty$. Here $A = 1$ corresponds to uniformly distributed $x_j$ and an asymptotically exponentially distributed minimum. Lemma 3 below gives a formula for the mean dimension of $f(x)$ as a function of $A > 0$ and dimension $d$. The case $A = 1$ recaptures the uniform case with mean dimension about 2. When $A > 1$, small values are relatively rare, while for $A < 1$ the density $g$ has an integrable singularity at 0. The mean of the dimension distribution is easier to find analytically than are the quantiles.

Lemma 3 For integer $d \ge 1$ and real $A > 0$, the mean dimension of $f(x) = \min(x_1, \dots, x_d)^{1/A}$ is
\[ \mu_{A,d} = \frac{\dfrac{2d}{A + A^2}\,(d + 1 + 2/A)^{-1}}{1 - \prod_{j=1}^d \dfrac{j(j + 2/A)}{(j + 1/A)^2}}. \tag{28} \]

Proof: See appendix.
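Formula (28) is easy to evaluate directly. The sketch below (our code) checks that $A = 1$ recovers the $2(d+1)/(d+3)$ of Theorem 5, and that large $A$ approaches the trigamma limit (29) of Theorem 7 below.

```python
def mu(A, d):
    """Evaluate the mean dimension mu_{A,d} from equation (28)."""
    num = (2 * d / (A + A * A)) / (d + 1 + 2 / A)
    prod = 1.0
    for j in range(1, d + 1):
        prod *= j * (j + 2 / A) / (j + 1 / A) ** 2
    return num / (1 - prod)

d = 30
print(mu(1.0, d), 2 * (d + 1) / (d + 3))   # A = 1 recovers Theorem 5
print(mu(1000.0, d))                       # large A: near the trigamma limit (29)
```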

Figure 2 shows the mean dimension for varying choices of $A$ and $d$. For fixed $d$ and small $A$, we see a mean dimension that is near $d$. As $A$ increases, we see the mean dimension decrease to a value that is near 1.2. For small values of $A$, the mean dimension approaches a larger limit as $d \to \infty$ than it does for larger values of $A$.

In the limit as $A \to \infty$, the density $A x^{A-1}$ concentrates near 1. The density of $1 - X$ concentrates near zero, and the dimension distribution of the maximum for $1 - X$ is the same as that of the minimum for $X$. An $A$-dependent rescaling of the variable $1 - X$ to yield variance 1 does not change the dimension distribution of the maximum. Letting $A \to \infty$, the rescaled random variable approaches an exponential distribution. Thus the dimension distribution of $\min(x_1, \dots, x_d)^{1/A}$ for large $A$ approximates the dimension distribution of the maximum of $d$ independent exponential random variables. We find that the limits with $A \to \infty$ involve the trigamma function $\psi'(z) = (d^2/dz^2) \log \Gamma(z)$.

Theorem 7 For $d \ge 1$,
\[ \lim_{A \to \infty} \mu_{A,d} = \frac{2d}{[\psi'(1) - \psi'(d+1)](d+1)}, \quad\text{and,} \tag{29} \]
\[ \lim_{d \to \infty} \lim_{A \to \infty} \mu_{A,d} = \frac{2}{\psi'(1)} \doteq 1.216. \tag{30} \]

Proof: See appendix.

6 QUADRATURE METHODS

Our primary focus in the numerical quadratures was to get reliable numbers, with a rough idea of their accuracy. We did not use the most sophisticated or formal statistical analyses that we could have. Monte Carlo methods allow the luxury of using very large sample sizes.

The estimates of mean dimension and related quantities were computed several ways. First, two independently coded Monte Carlo implementations were run. The larger of these used $n = 10^6$ points in a Latin hypercube sample, the smaller used $n = 10^5$ pure Monte Carlo points. When we were satisfied that both were estimating the same quantities, we substituted randomized quasi-Monte Carlo (RQMC) for MC points in the larger implementation.
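For reference, here is a minimal sketch of the standard Latin hypercube construction of McKay, Beckman, and Conover (1979); this is our illustration, not the code used for the paper.

```python
import numpy as np

def latin_hypercube(n, d, rng=None):
    """Standard Latin hypercube sample in [0,1]^d: each coordinate uses a
    random permutation of n strata, with a uniform jitter inside each stratum."""
    rng = np.random.default_rng(rng)
    perms = np.column_stack([rng.permutation(n) for _ in range(d)])
    return (perms + rng.random((n, d))) / n

# Each column of x has exactly one point in each interval [i/n, (i+1)/n).
x = latin_hypercube(100_000, 12, rng=0)
```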

Estimates of $\underline{\tau}^2_u$, $\bar{\tau}^2_u$, and $\Upsilon^2_u$ require integrals of a dimension between $d+1$ and $2d$. We used RQMC points in $[0,1]^{2d}$ for all of them. The points we used were from $(\lambda, t, m, s)$-nets in base $b$ as described in Owen (1997). These nets have $n = \lambda b^m$ points in dimension $s$, and can be viewed as $\lambda$ replicated $(t, m, s)$-nets.

The pooled answer for $\Upsilon^2_D/\sigma^2 = \sigma^2_D/\sigma^2$ for the up and out Asian option using $U = 60$ and $K = 40$ is 0.0082. This answer was based on 10 replicates of Latin hypercube sampling with $n$ = 100,000. The answers from the replicates were:

0.007923  0.007776  0.008335  0.007918  0.008461
0.008275  0.008144  0.008040  0.008103  0.008779

The standard error of this value is 0.000094.

The contours shown in Figure 1 are based on Latin hypercube sampling in 5 replicates of 200,000 points at each of the $(B, K)$ combinations shown. We also ran RQMC points at each $(B, K)$ combination. Where the mean dimension was small, we found that the 5 RQMC replicates agreed well. For example, with $B = 36$ and $K = 40$ the replicates were (1.57, 1.54, 1.56, 1.55, 1.54) and the pooled estimate was 1.55, very close to the LHS value of 1.54. Where the mean dimension was larger, we found that LHS mean dimension values tended to be more stable than RQMC ones, probably due to their larger sample size and the spikiness of the integrands.

Appendix

This appendix presents the proofs of the results for the extreme value example.

Proof of Lemma 2: Equations (25), (26), and (27) are easy for $u = \{1,\dots,d\}$. For $u \subset \{1,\dots,d\}$ with $0 < |u| < d$, let $y = y(x) = \min_{j \in u} x_j$ and $z = z(x) = \min_{j \notin u} x_j$. Then $f(x) = \min(y, z)$. Now
\begin{align*}
\sum_{v \subseteq u} f_v(x) &= \int \min(y(x), z(x))\,dx_{-u} \\
&= \int_0^1 \min(y, z)\,(d-|u|)(1-z)^{d-|u|-1}\,dz \\
&= \int_0^y z\,(d-|u|)(1-z)^{d-|u|-1}\,dz + \int_y^1 y\,(d-|u|)(1-z)^{d-|u|-1}\,dz.
\end{align*}
Transforming $z$ into $1 - z$ we find that $(d-|u|)^{-1} \sum_{v \subseteq u} f_v(x)$ equals
\[ \int_{1-y}^1 (1-z) z^{d-|u|-1}\,dz + y \int_0^{1-y} z^{d-|u|-1}\,dz = \frac{1 - (1-y)^{d-|u|}}{d-|u|} - \frac{1 - (1-y)^{d-|u|+1}}{d-|u|+1} + \frac{y(1-y)^{d-|u|}}{d-|u|}. \]
After simplification
\[ \sum_{v \subseteq u} f_v(x) = \frac{1 - (1-y)^{d-|u|+1}}{d-|u|+1}, \]
establishing (25).

To show (26), we integrate the square of $\sum_{v \subseteq u} f_v(x)$, getting
\begin{align*}
\frac{1}{(d-|u|+1)^2} & \int_0^1 \bigl(1 - (1-y)^{d-|u|+1}\bigr)^2\, |u| (1-y)^{|u|-1}\,dy \\
&= \frac{|u|}{(d-|u|+1)^2} \int_0^1 \bigl(1 - y^{d-|u|+1}\bigr)^2 y^{|u|-1}\,dy \\
&= \frac{|u|}{(d-|u|+1)^2} \Bigl( \frac{1}{|u|} - \frac{2}{d+1} + \frac{1}{2(d+1) - |u|} \Bigr) \\
&= \frac{(d+1)(2(d+1) - |u|) - 2|u|(2(d+1) - |u|) + |u|(d+1)}{(d-|u|+1)^2 (d+1)(2d - |u| + 2)} \\
&= \frac{2}{(d+1)(2d - |u| + 2)},
\end{align*}
after some simplification, establishing (26). Then (27) follows by subtracting (24) from (26). $\Box$

Proof of Theorem 5: By symmetry and Theorem 2, the mean dimension is
\begin{align*}
\frac{d\, \bar{\tau}^2_{\{d\}}}{\sigma^2} &= \frac{d}{\sigma^2} \bigl( \sigma^2 - \underline{\tau}^2_{\{1,\dots,d-1\}} \bigr) \\
&= d - \frac{d(d-1)}{(d+1)^2(2d - (d-1) + 2)} \cdot \frac{(d+1)^2(d+2)}{d} \\
&= d - \frac{(d-1)(d+2)}{d+3} = \frac{2(d+1)}{d+3}. \qquad \Box
\end{align*}

Proof of Theorem 6: We write $\nu(k)$ as
\begin{align*}
\binom{d}{k} \frac{\sigma^2_{[k]}}{\sigma^2} &= \frac{(d+1)^2(d+2)}{d} \binom{d}{k} \sum_{j=0}^k (-1)^{k-j} \binom{k}{j} \frac{j}{(d+1)^2(2d - j + 2)} \\
&= \frac{d+2}{d} \binom{d}{k} \sum_{j=0}^k (-1)^{k-j} \binom{k}{j} \frac{j}{2d - j + 2},
\end{align*}
which we recognize as a $k$-fold difference of $j/(2d - j + 2)$. Let $g(z) = z/(1-z) = z(1 + z + z^2 + \cdots)$ for $-1 < z < 1$. Then
\[ \nu(k) = \frac{d+2}{d} \binom{d}{k} \sum_{j=0}^k (-1)^{k-j} \binom{k}{j}\, g\Bigl( \frac{j}{2(d+1)} \Bigr) = \frac{d+2}{d} \binom{d}{k} \frac{g^{(k)}(z)}{[2(d+1)]^k}, \]
where $0 \le z \le k/[2(d+1)]$. For fixed $k \ge 1$, and with $d \to \infty$, $g^{(k)}(z) = g^{(k)}(0) + O(1/d) = k! + O(1/d)$, and so
\[ \lim_{d \to \infty} \nu(k) = \lim_{d \to \infty} \binom{d}{k} k!\, (2(d+1))^{-k} = 2^{-k}. \qquad \Box \]

Proof of Lemma 3: For $u \subset \{1,\dots,d\}$ with $1 \le |u| < d$, let $y = \min\{G^{-1}(x_j) \mid j \in u\}$ and $z = \min\{G^{-1}(x_j) \mid j \notin u\}$, so that $f(x) = \min(y, z)$. Then
\[ \sum_{v \subseteq u} f_v(x) = \int_0^1 \min(y, z)\, g_{d-|u|}(z)\,dz = \int_0^y z\, g_{d-|u|}(z)\,dz + y \int_y^1 g_{d-|u|}(z)\,dz. \]
For $|u| = d - 1$ we find $\sum_{v \subseteq u} f_v(x)$ is
\[ \int_0^y z g(z)\,dz + y \int_y^1 g(z)\,dz = \int_0^y A z^A\,dz + y \int_y^1 A z^{A-1}\,dz = y - \frac{y^{A+1}}{A+1}. \]
Then $I^2 + \underline{\tau}^2_{[d-1]}$ equals
\begin{align*}
\int_0^1 & \Bigl( y - \frac{y^{A+1}}{A+1} \Bigr)^2 (d-1)(1 - G(y))^{d-2} g(y)\,dy \\
&= (d-1) \int_0^1 \Bigl( 1 - \frac{y^A}{A+1} \Bigr)^2 (1 - y^A)^{d-2} A y^{A+1}\,dy \\
&= (d-1) \int_0^1 \Bigl( 1 - \frac{z}{A+1} \Bigr)^2 (1-z)^{d-2} A z^{(A+1)/A} A^{-1} z^{-1+1/A}\,dz \\
&= (d-1) \int_0^1 \Bigl( 1 - \frac{2z}{A+1} + \frac{z^2}{(A+1)^2} \Bigr) (1-z)^{d-2} z^{2/A}\,dz \\
&= \Gamma(d) \Bigl( \frac{\Gamma(1+2/A)}{\Gamma(d+2/A)} - \frac{2}{A+1} \frac{\Gamma(2+2/A)}{\Gamma(d+1+2/A)} + \frac{1}{(A+1)^2} \frac{\Gamma(3+2/A)}{\Gamma(d+2+2/A)} \Bigr).
\end{align*}
An easy calculation shows that for $p > 0$,
\[ \int f(x)^p\,dx = \frac{d!\,\Gamma(1 + p/A)}{\Gamma(d + 1 + p/A)}, \]
so that
\[ I = \frac{d!\,\Gamma(1 + 1/A)}{\Gamma(d + 1 + 1/A)}, \quad\text{and,}\quad I^2 + \sigma^2 = \frac{d!\,\Gamma(1 + 2/A)}{\Gamma(d + 1 + 2/A)}. \]
The mean dimension is $d - d(\underline{\tau}^2_{[d-1]} + I^2)/\sigma^2 + dI^2/\sigma^2$, which equals
\[ d - \frac{d! \Bigl( \frac{\Gamma(1+2/A)}{\Gamma(d+2/A)} - \frac{2}{A+1} \frac{\Gamma(2+2/A)}{\Gamma(d+1+2/A)} + \frac{1}{(A+1)^2} \frac{\Gamma(3+2/A)}{\Gamma(d+2+2/A)} \Bigr) - d \Bigl( \frac{d!\,\Gamma(1+1/A)}{\Gamma(d+1+1/A)} \Bigr)^2}{\frac{d!\,\Gamma(1+2/A)}{\Gamma(d+1+2/A)} - \Bigl( \frac{d!\,\Gamma(1+1/A)}{\Gamma(d+1+1/A)} \Bigr)^2}, \]
and after some rearrangement, we get
\[ \mu_{A,d} = \frac{d \frac{\Gamma(1+2/A)}{\Gamma(d+1+2/A)} - \frac{\Gamma(1+2/A)}{\Gamma(d+2/A)} + \frac{2}{A+1} \frac{\Gamma(2+2/A)}{\Gamma(d+1+2/A)} - \frac{1}{(A+1)^2} \frac{\Gamma(3+2/A)}{\Gamma(d+2+2/A)}}{\frac{\Gamma(1+2/A)}{\Gamma(d+1+2/A)} - d! \Bigl( \frac{\Gamma(1+1/A)}{\Gamma(d+1+1/A)} \Bigr)^2}. \tag{31} \]
After some tedious algebra equation (31) simplifies to equation (28). $\Box$

Proof of Theorem 7 (Outline): Substitute $\varepsilon = 1/A$ in (31), and then expand the numerator and denominator in a polynomial in $\varepsilon$, keeping terms up to order $\varepsilon^2$. The coefficients of 1 and $\varepsilon$ vanish in both numerator and denominator, and the ratio of the coefficients of $\varepsilon^2$ is (29). Equation (30) follows because $\psi'(z) \to 0$ as $z \to \infty$. $\Box$

References

Archer, G. E. B., A. Saltelli, and I. M. Sobol' (1997). Sensitivity measures, ANOVA-like techniques and the use of bootstrap. Journal of Statistical Computing and Simulation 58, 99-120.

Caflisch, R. E. and W. Morokoff (1995). Quasi-Monte Carlo integration. Journal of Computational Physics 122, 218-230.

Caflisch, R. E., W. Morokoff, and A. B. Owen (1997). Valuation of mortgage backed securities using Brownian bridges to reduce effective dimension. Journal of Computational Finance 1, 27-46.

Coles, S. (2001). An Introduction to Statistical Modeling of Extreme Values. New York: Springer.

Cornfield, J. and J. W. Tukey (1956). Average values of mean squares in factorials. Annals of Mathematical Statistics 27, 907-949.

Duffie, D. (2001). Dynamic Asset Pricing Theory (3rd ed.). Princeton, NJ: Princeton University Press.

Efron, B. and C. Stein (1981). The jackknife estimate of variance. Annals of Statistics 9, 586-596.

Hickernell, F. J. (1996). Quadrature error bounds and figures of merit for quasi-random points. SIAM Journal of Numerical Analysis 33, 1995-2016. Corrected printing of Sections 3-6 in ibid., 34 (1997), 853-866.

Hickernell, F. J. (1998). Lattice rules: how well do they measure up? In P. Hellekalek and G. Larcher (Eds.), Random and Quasi-Random Point Sets, pp. 109-168. New York: Springer.

Hoeffding, W. (1948). A class of statistics with asymptotically normal distribution. Annals of Mathematical Statistics 19, 293-325.

Hooker, G. (2004). Discovering additive structure in black box functions. In Proceedings of KDD 2004. To appear.

Hull, J. C. (2002). Options, Futures and Other Derivatives. Upper Saddle River, NJ: Prentice Hall.

L'Ecuyer, P. and C. Lemieux (2002). A survey of randomized quasi-Monte Carlo methods. In M. Dror, P. L'Ecuyer, and F. Szidarovszki (Eds.), Modeling Uncertainty: An Examination of Stochastic Theory, Methods, and Applications, pp. 419-474. Kluwer Academic Publishers.

McKay, M. D., R. J. Beckman, and W. J. Conover (1979). A comparison of three methods for selecting values of input variables in the analysis of output from a computer code. Technometrics 21(2), 239-245.

Montgomery, D. C. (2000). Design and Analysis of Experiments (5th ed.). John Wiley & Sons.

Niederreiter, H. (1992). Random Number Generation and Quasi-Monte Carlo Methods. Philadelphia, PA: SIAM.

Owen, A. B. (1992). Orthogonal arrays for computer experiments, integration and visualization. Statistica Sinica 2, 439-452.

Owen, A. B. (1994). Lattice sampling revisited: Monte Carlo variance of means over randomized orthogonal arrays. The Annals of Statistics 22, 930-945.

Owen, A. B. (1997). Monte Carlo variance of scrambled equidistribution quadrature. SIAM Journal of Numerical Analysis 34(5), 1884-1910.

Owen, A. B. (2003). The dimension distribution and quadrature test functions. Statistica Sinica 13(1), 1-17.

Saltelli, A., K. Chan, and E. M. Scott (2000). Sensitivity Analysis. Chichester: Wiley.

Schlier, C. (2002). A practitioner's view on QMC integration. Technical report, Universität Freiburg, Fakultät für Physik.

Sloan, I. H. and S. Joe (1994). Lattice Methods for Multiple Integration. Oxford: Oxford Science Publications.

Sobol', I. M. (1969). Multidimensional Quadrature Formulas and Haar Functions. Moscow: Nauka. (In Russian).

Sobol', I. M. (1990). Sensitivity estimates for non-linear mathematical models. Matematicheskoe Modelirovanie 2, 112-118. (In Russian).

Sobol', I. M. (1993). Sensitivity estimates for nonlinear mathematical models. Mathematical Modeling and Computational Experiment 1, 407-414.

Sobol', I. M. (2001). Global sensitivity indices for nonlinear mathematical models and their Monte Carlo estimates. Mathematics and Computers in Simulation 55, 271-280.

Stein, M. (1987). Large sample properties of simulations using Latin hypercube sampling. Technometrics 29(2), 143-151.

Takemura, A. (1983). Tensor analysis of ANOVA decomposition. Journal of the American Statistical Association 78, 894-900.

von Mises, R. (1947). On the asymptotic distribution of differentiable statistical functionals. Annals of Mathematical Statistics 18, 309-348.

Wang, X. and K.-T. Fang (2002). The effective dimensions and quasi-Monte Carlo integration. Technical report, Hong Kong Baptist University, Department of Mathematics.

Figure 1: Mean Dimension of the Barrier Integrand. The horizontal axis shows the strike price $K$ ranging from 24 to 60, the vertical axis shows the barrier value $B$ ranging from 24 to 40, and the contours are for the mean dimension of the function $f$ in (22), with parameter values given in the text. The positions of the raw data points are indicated by dots.

Figure 2: Mean Dimension of Minimum. The upper plot shows the mean dimension of $f(x) = \min(x_1, \dots, x_d)^{1/A}$ versus $\log_{10}(A)$. The curves, from top to bottom, are for $d$ decreasing from 50 to 10 by steps of 10. A reference point is plotted for $A = 1$ and the limiting mean dimension of 2, as $d \to \infty$ for $A = 1$. The lower plot shows the mean dimension of $f$ versus $d$. The curves, from top to bottom, are for $A$ increasing from 0.01 to 100 by multiples of 10.