
Submitted to the Annals of Applied Probability

EFFICIENT MONTE CARLO FOR HIGH EXCURSIONS OF GAUSSIAN RANDOM FIELDS

By Robert J. Adler*, Jose H. Blanchet† and Jingchen Liu‡

Technion, Columbia, Columbia

Our focus is on the design and analysis of efficient Monte Carlo methods for computing tail probabilities for the suprema of Gaussian random fields, along with conditional expectations of functionals of the fields given the existence of excursions above high levels, $b$. Naïve Monte Carlo takes an exponential, in $b$, computational cost to estimate these probabilities and conditional expectations for a prescribed relative accuracy. In contrast, our Monte Carlo procedures achieve, at worst, polynomial complexity in $b$, assuming only that the mean and covariance functions are Hölder continuous. We also explain how to fine tune the construction of our procedures in the presence of additional regularity, such as homogeneity and smoothness, in order to further improve the efficiency.

1. Introduction. This paper centers on the design and analysis of efficient Monte Carlo techniques for computing probabilities and conditional expectations related to high excursions of Gaussian random fields. More specifically, suppose that $f: T\times\Omega\to\mathbb{R}$ is a continuous Gaussian random field over a $d$-dimensional compact set $T\subset\mathbb{R}^d$. (Additional regularity conditions on $T$ will be imposed below, as needed.)

Our focus is on tail probabilities of the form
$$w(b) = P\Big(\max_{t\in T} f(t) > b\Big) \tag{1.1}$$
and on conditional expectations
$$E\Big[\Gamma(f)\,\Big|\,\max_{t\in T} f(t) > b\Big], \tag{1.2}$$

as $b\to\infty$, where $\Gamma$ is a functional of the field which, for concreteness, we take to be positive and bounded.

While much of the paper will concentrate on estimating the exceedance probability (1.1), it is important to note that our methods, based on importance sampling, are broadly applicable to the efficient evaluation of conditional expectations of the form (1.2). Indeed, as we shall explain at the end of Section 4, our approach to efficient importance sampling is based on a procedure which mimics the conditional

* Research supported in part by US-Israel Binational Science Foundation, grant 2008262, and NSF, grant DMS-0852227.
† Research supported in part by .....
‡ Research supported in part by .....
AMS 2000 subject classifications: Primary 60G15, 65C05; secondary 60G60, 62G32.
Keywords and phrases: Gaussian random fields, high level excursions, Monte Carlo, tail distributions, efficiency.

distribution of $f$ given that $\max_{t\in T} f(t) > b$. Moreover, once an efficient (in a precise mathematical sense described in Section 2) importance sampling procedure is in place, it follows, under mild regularity conditions on $\Gamma$, that an efficient estimator for (1.2) is immediately obtained by exploiting an efficient estimator for (1.1).

The need for an efficient estimator of $w(b)$ should be reasonably clear. Suppose one could simulate
$$f^* := \sup_{t\in T} f(t)$$
exactly (i.e. without bias) via naïve Monte Carlo. Such an approach would typically require a number of replications of $f^*$ which would be exponential in $b^2$ to obtain an accurate estimate (in relative terms). Indeed, since in great generality (see [22]) $w(b) = \exp(-cb^2 + o(b^2))$ as $b\to\infty$ for some $c\in(0,\infty)$, it follows that the average of $n$ i.i.d. Bernoulli trials, each with success parameter $w(b)$, estimates $w(b)$ with a relative mean squared error equal to $n^{-1/2}(1-w(b))^{1/2}/w(b)^{1/2}$. To control the size of the error therefore requires$^1$ $n = \Omega(w(b)^{-1})$, which is typically prohibitively large. In addition, there is also a problem in that typically $f^*$ cannot be simulated exactly, and that some discretization of $f$ is required.
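To make the preceding cost estimate concrete, here is a minimal sketch (ours, not one of the paper's algorithms) of crude Monte Carlo for $w(b)$, with the field discretized on a grid of a one-dimensional $T = [0,1]$ and a squared-exponential covariance; all numerical choices are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative setup (our assumptions): T = [0,1] on a grid of M points,
# centered field with squared-exponential covariance.
M = 100
t = np.linspace(0.0, 1.0, M)
Sigma = np.exp(-(t[:, None] - t[None, :]) ** 2 / 0.1)
A = np.linalg.cholesky(Sigma + 1e-10 * np.eye(M))  # exact Gaussian sampling

def crude_mc(b, n):
    """Average of n i.i.d. Bernoulli indicators 1{max_t f(t) > b}."""
    Z = rng.standard_normal((n, M))
    return ((Z @ A.T).max(axis=1) > b).mean()

# Relative error of the Bernoulli average is (1 - w(b))^{1/2} / (n w(b))^{1/2},
# so fixed relative accuracy forces n = Omega(1/w(b)) = exp(Omega(b^2)).
for b in (1.0, 2.0, 3.0):
    print(b, crude_mc(b, n=200_000))
```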

Our goal is to introduce and analyze simulation estimators that can be applied to a general class of Gaussian fields, and that can be shown to require at most a polynomial number of function evaluations in $b$ to obtain estimates with a prescribed relative error. The model of computation that we use to count function evaluations, and the precise definition of an algorithm with polynomial complexity, is given in Section 2. Our proposed estimators are, in particular, asymptotically optimal. (This property, which is a popular notion in the context of rare-event simulation (cf. [6, 12]), essentially requires that the second moments of estimators decay at the same exponential rate as the square of the first moments.) The polynomial complexity of our estimators requires no more than the assumption that the underlying Gaussian field is Hölder continuous (see Theorem 3.1 in Section 3). Therefore, our methods provide means for efficiently computing probabilities and expectations associated with high excursions of Gaussian random fields in wide generality.

In the presence of enough smoothness, we shall also show how to design estimators that can be shown to be strongly efficient, in the sense that their associated coefficient of variation remains uniformly bounded as $b\to\infty$. Moreover, the associated path generation of the conditional field (given a high excursion) can basically be carried out with the same computational complexity as the unconditional sampling procedure (uniformly in $b$). This is Theorem 3.3 in Section 3.

High excursions of Gaussian random fields appear in a wide number of applications, including, but not limited to:

• Physical oceanography. Here the random field can be water pressure or surface temperature. See [3] for many examples.

• Cosmology. This includes the analysis of COBE and WMAP microwave data on a sphere, or galactic density data, e.g. [8, 20, 21].

$^1$Given $h$ and $g$ positive, we shall use the familiar asymptotic notation: $h(x) = O(g(x))$ if there is $c < \infty$ such that $h(x)\le cg(x)$ for all $x$ large enough; $h(x) = \Omega(g(x))$ if $h(x)\ge cg(x)$ for some $c > 0$ and all $x$ sufficiently large; $h(x) = o(g(x))$ as $x\to\infty$ if $h(x)/g(x)\to 0$ as $x\to\infty$; and $h(x) = \Theta(g(x))$ if $h(x) = O(g(x))$ and $h(x) = \Omega(g(x))$.


• Quantum chaos. Here random planar waves replace deterministic (but unobtainable) solutions of Schrödinger equations, e.g. the recent review [13].

• Brain mapping. This application is the most developed, and very widely used. For example, the paper by Friston et al. [15], which introduced random field methodology to the brain imaging community, has, at the time of writing, over 4500 (Google) citations.

Many of these applications deal with twice differentiable, constant variance random fields, or random fields that have been normalized to have constant variance, the reason being that they require estimates of the tail probabilities (1.1), and these are only really well known in the smooth, constant variance case. In particular, it is known that, with enough smoothness assumptions,

$$\liminf_{b\to\infty}\ -b^{-2}\log\Big|P\Big(\sup_{t\in T} f(t)\ge b\Big) - E\big[\chi\big(\{t\in T:\ f(t)\ge b\}\big)\big]\Big| \ \ge\ \frac12 + \frac1{2\sigma_c^2}, \tag{1.3}$$

where $\chi(A)$ is the Euler characteristic of the set $A$, and the term $\sigma_c^2$ is related to a geometric quantity, known as the critical radius of $T$, and depends on the covariance structure of $f$. See [4, 23] for details. The expectation in (1.3) has an explicit form that is readily computed for Gaussian and Gaussian-related random fields of constant variance (see [4, 5] for details), although if $T$ is geometrically complicated, or the covariance of $f$ highly non-stationary, there can be terms in the expectation that can only be evaluated numerically or estimated statistically (e.g. [1, 24]). Nevertheless, when available, (1.3) provides excellent approximations, and simulation studies have shown that the approximations are numerically useful for quite moderate $b$, of the order of 2 standard deviations.

However, as we have already noted, (1.3) holds only for constant variance fields, which also need to be twice differentiable. In the case of less smooth $f$, other classes of results occur, in which the expansions are less reliable and, in addition, typically involve the unknown Pickands' constants (cf. [7, 19]).

These are some of the reasons why, despite a well developed theory, Monte Carlo techniques still have a significant role to play in understanding the behavior of Gaussian random fields at high levels. The estimators proposed in this paper basically reduce the rare event calculations associated to high excursions in Gaussian random fields to calculations that are roughly comparable to the evaluation of expectations or integrals in which no rare event is involved. In other words, the computational complexity required to implement the estimators discussed here is similar, in some sense, to the complexity required to evaluate a given integral in finite dimension, or an expectation where no tail parameter is involved. To the best of our knowledge, these types of reductions have not been studied in the development of numerical methods for high excursions of random fields. This feature distinguishes the present work from the application of other numerical techniques that are generic (such as quasi-Monte Carlo and other numerical integration routines) and that, in particular, might be also applicable to the setting of Gaussian fields.

Contrary to our methods, which are designed to have provably good performance uniformly over the level of excursion, a generic numerical approximation procedure, such as quasi-Monte Carlo, will typically require an exponential increase in the number of function evaluations in order to maintain a prescribed level of relative accuracy. This phenomenon is unrelated to the setting of Gaussian random fields. In particular, it can be easily seen to happen even when evaluating a one dimensional integral with a small integrand. On the other hand, we believe that our estimators can, in practice, be easily combined with quasi-Monte Carlo or other numerical integration methods. The rigorous analysis of such hybrid approaches, although of great interest, requires an extensive treatment and will be pursued in the future. As an aside, we note that quasi-Monte Carlo techniques have been used in the excursion analysis of Gaussian random fields in [7].

The remainder of the paper is organized as follows. In Section 2 we introduce the basic notions of polynomial algorithmic complexity, which are borrowed from the general theory of computation. Section 3 discusses the main results in light of the complexity considerations of Section 2. Section 4 provides a brief introduction to importance sampling, a simulation technique that we shall use heavily in the design of our algorithms. The analysis of finite fields, which is given in Section 5, is helpful to develop the basic intuition behind our procedures for the general case. Section 6 provides the construction and analysis of a polynomial time algorithm for high excursion probabilities of Hölder continuous fields. Finally, in Section 7, we add additional smoothness assumptions, along with stationarity, and explain how to fine tune the construction of our procedures in order to further improve efficiency in these cases.

2. Basic Notions of Computational Complexity. In this section we shall discuss some general notions of efficiency and computational complexity related to the approximation of the probability of the rare events $\{B_b : b\ge b_0\}$, for which $P(B_b)\to 0$ as $b\to\infty$. In essence, efficiency means that computational complexity is, in some sense, controllable uniformly in $b$. A notion that is popular in the efficiency analysis of Monte Carlo methods for rare events is weak efficiency (also known as asymptotic optimality), which requires the coefficient of variation of a given estimator $L_b$ of $P(B_b)$ to be dominated by $1/P(B_b)^\varepsilon$ for any $\varepsilon > 0$. More formally, we have

Definition 2.1. A family of estimators $\{L_b : b\ge b_0\}$ is said to be polynomially efficient for estimating $P(B_b)$ if $E(L_b) = P(B_b)$ and there exists a $q\in(0,\infty)$ for which
$$\sup_{b\ge b_0}\frac{\mathrm{Var}(L_b)}{[P(B_b)]^2\,|\log P(B_b)|^q} < \infty. \tag{2.1}$$
Moreover, if (2.1) holds with $q = 0$, then the family is said to be strongly efficient.

Below we often refer to $L_b$ as a strongly (polynomially) efficient estimator, by which we mean that the family $\{L_b : b > 0\}$ is strongly (polynomially) efficient. In order to understand the nature of this definition, let $L_b^{(j)}$, $1\le j\le n$, be a collection of i.i.d. copies of $L_b$. The averaged estimator
$$\bar L_n(b) = \frac1n\sum_{j=1}^n L_b^{(j)}$$
has a relative mean squared error equal to $[\mathrm{Var}(L_b)]^{1/2}/[n^{1/2}P(B_b)]$. A simple consequence of Chebyshev's inequality is that
$$P\Big(\Big|\frac{\bar L_n(b)}{P(B_b)} - 1\Big|\ge\varepsilon\Big)\le\frac{\mathrm{Var}(L_b)}{\varepsilon^2 n[P(B_b)]^2}.$$
Thus, if $L_b$ is polynomially efficient and we wish to compute $P(B_b)$ with at most $\varepsilon$ relative error and at least $1-\delta$ confidence, it suffices to simulate
$$n = \Theta\big(\varepsilon^{-2}\delta^{-1}|\log P(B_b)|^q\big)$$

i.i.d. replications of $L_b$. In fact, in the presence of polynomial efficiency, the bound $n = \Theta(\varepsilon^{-2}\delta^{-1}|\log P(B_b)|^q)$ can be boosted to $n = \Theta(\varepsilon^{-2}\log(\delta^{-1})|\log P(B_b)|^q)$ using the so-called median trick (see [18]).

Naturally, the cost per replication must also be considered in the analysis, and we shall do so, but the idea is that evaluating $P(B_b)$ via crude Monte Carlo would require, given $\varepsilon$ and $\delta$, $n = \Theta(1/P(B_b))$ replications. Thus a polynomially efficient estimator makes the evaluation of $P(B_b)$ exponentially faster relative to crude Monte Carlo, at least in terms of the number of replications.

Note that a direct application of deterministic algorithms (such as quasi-Monte Carlo or quadrature integration rules) might improve (under appropriate smoothness assumptions) the computational complexity relative to Monte Carlo, although only by a polynomial rate (i.e. the absolute error decreases to zero at rate $n^{-p}$ for $p > 1/2$, where $n$ is the number of function evaluations and $p$ depends on the dimension of the function that one is integrating; see, for instance, [6]). We believe that the procedures that we develop in this paper can guide the construction of efficient deterministic algorithms with small relative error, and with complexity that scales at a polynomial rate in $|\log P(B_b)|$. This is an interesting research topic that we plan to explore in the future.

An issue that we shall face in designing our Monte Carlo procedure is that, due to the fact that $f$ will have to be discretized, the corresponding estimator $L_b$ will not be unbiased. In turn, in order to control the relative bias with an effort that is comparable to the bound on the number of replications discussed in the preceding paragraph, one must verify that the relative bias can be reduced to an amount less than $\varepsilon$, with probability at least $1-\delta$, at a computational cost of the form $O(\varepsilon^{-q_0}|\log P(B_b)|^{q_1})$. If $L_b(\varepsilon)$ can be generated with $O(\varepsilon^{-q_0}|\log P(B_b)|^{q_1})$ cost, and satisfying
$$\big|P(B_b) - EL_b(\varepsilon)\big|\le\varepsilon P(B_b),$$
and if
$$\sup_{b>0}\frac{\mathrm{Var}(L_b(\varepsilon))}{P(B_b)^2\,|\log P(B_b)|^q} < \infty$$
for some $q\in(0,\infty)$, then $\bar L_n'(b,\varepsilon) = \sum_{j=1}^n L_b^{(j)}(\varepsilon)/n$, where the $L_b^{(j)}(\varepsilon)$'s are i.i.d. copies of $L_b(\varepsilon)$, satisfies
$$P\Big(\Big|\frac{\bar L_n'(b,\varepsilon)}{P(B_b)} - 1\Big|\ge 2\varepsilon\Big)\le\frac{\mathrm{Var}(L_b(\varepsilon))}{\varepsilon^2\times n\times P(B_b)^2}.$$
Consequently, taking $n = \Theta(\varepsilon^{-2}\delta^{-1}|\log P(B_b)|^q)$ suffices to give an estimator with at most $\varepsilon$ relative error and $1-\delta$ confidence, and the total computational cost is $\Theta(\varepsilon^{-2-q_0}\delta^{-1}|\log P(B_b)|^{q+q_1})$.

We shall measure computational cost in terms of function evaluations, such as a single addition, a multiplication, a comparison, the generation of a single uniform random variable on $T$, the generation of a single standard Gaussian random variable, and the evaluation of $\Phi(x)$ for fixed $x\ge 0$, where
$$\Phi(x) = 1 - \Psi(x) = \frac1{\sqrt{2\pi}}\int_{-\infty}^x e^{-s^2/2}\,ds.$$
All of these function evaluations are assumed to cost at most a fixed amount $c$ of computer time. Moreover, we shall also assume that first and second order moment characteristics of the field, such as $\mu(t) = Ef(t)$ and $C(s,t) = \mathrm{Cov}(f(t), f(s))$, can be computed in at most $c$ units of computer time for each $s,t\in T$. We note that similar models of computation are often used in the complexity theory of continuous problems; see [25].

The previous discussion motivates the next definition, which has its roots in the general theory of computation, in both continuous and discrete settings [17, 25]. In particular, completely analogous notions in the setting of complexity theory of continuous problems lead to the notion of "tractability" of a computational problem [28].

Definition 2.2. A Monte Carlo procedure is said to be a fully polynomial randomized approximation scheme (FPRAS) for estimating $P(B_b)$ if, for some $q, q_1, q_2\in[0,\infty)$, it outputs an averaged estimator that is guaranteed to have at most $\varepsilon > 0$ relative error, with confidence at least $1-\delta\in(0,1)$, in $\Theta(\varepsilon^{-q_1}\delta^{-q_2}|\log P(B_b)|^q)$ function evaluations.

The terminology adopted, namely FPRAS, is borrowed from the complexity theory of randomized algorithms for counting [17]. Many counting problems can be expressed as rare event estimation problems. In such cases it typically occurs that the previous definition (expressed in terms of a rare event probability) coincides precisely with the standard counting definition of a FPRAS (in which there is no reference to any rare event to estimate). This connection is noted, for instance, in [10]. Our terminology is motivated precisely by this connection.

By letting $B_b = \{f^* > b\}$, the goal in this paper is to design a class of fully polynomial randomized approximation schemes that are applicable to a general class of Gaussian random fields. In turn, since our Monte Carlo estimators will be based on importance sampling, it turns out that we shall also be able to straightforwardly construct FPRAS's to estimate quantities such as $E[\Gamma(f) \mid \sup_{t\in T}f(t) > b]$, for a suitable class of functionals $\Gamma$ for which $\Gamma(f)$ can be computed with an error of at most $\varepsilon$ with a cost that is polynomial as a function of $\varepsilon^{-1}$. We shall discuss this observation in Section 4, which deals with properties of importance sampling.

3. Main Results. In order to state and discuss our main results, we need some notation. For each $s,t\in T$, define
$$\mu(t) = E(f(t)),\quad C(s,t) = \mathrm{Cov}(f(s), f(t)),\quad \sigma^2(t) = C(t,t) > 0,\quad r(s,t) = \frac{C(s,t)}{\sigma(s)\sigma(t)}.$$
Moreover, given $x\in\mathbb{R}^d$ and $\beta > 0$, we write $|x| = \sum_{j=1}^d|x_j|$, where $x_j$ is the $j$-th component of $x$. We shall assume that, for each fixed $s,t\in T$, both $\mu(t)$ and $C(s,t)$ can be evaluated in at most $c$ units of computing time.

Our first result shows that, under modest continuity conditions on $\mu$, $\sigma$ and $r$, it is possible to construct an explicit FPRAS for $w(b)$ under the following regularity conditions.

A1. The field $f$ is almost surely continuous on $T$.
A2. For some $\delta > 0$ and all $|s-t| < \delta$, the mean and variance functions satisfy
$$|\sigma(t) - \sigma(s)| + |\mu(t) - \mu(s)|\le\kappa_H|s-t|^\beta.$$
A3. For some $\delta > 0$ and all $|s-s'| < \delta$, $|t-t'| < \delta$, the correlation function of $f$ satisfies
$$|r(t,s) - r(t',s')|\le\kappa_H\big[|t-t'|^\beta + |s-s'|^\beta\big].$$
A4. $0\in T$. There exist $\kappa_0$ and $\omega_d$ such that, for any $t\in T$ and $\varepsilon$ small enough,
$$m(B(t,\varepsilon)\cap T)\ge\kappa_0\varepsilon^d\omega_d,$$
where $m$ is the Lebesgue measure, $B(t,\varepsilon) = \{s: |t-s|\le\varepsilon\}$, and $\omega_d = m(B(0,1))$.

The assumption that $0\in T$ is of no real consequence and is adopted only for notational convenience.

Theorem 3.1. Suppose that $f: T\to\mathbb{R}$ is a Gaussian random field satisfying Conditions A1–A4 above. Then Algorithm 6.1 provides a FPRAS for $w(b)$.

The polynomial rate of the intrinsic complexity bound inherent in this result is discussed in Section 6, along with similar rates in results to follow. The conditions of the previous theorem are weak, and hold for virtually all applied settings involving continuous Gaussian fields on compact sets.

Not surprisingly, the complexity bounds of our algorithms can be improved upon under additional assumptions on $f$. For example, in the case of finite fields (i.e. when $T$ is finite) with a non-singular covariance matrix, we can show that the complexity of the algorithm is actually bounded as $b\to\infty$. We summarize this in the next result, whose proof, which is given in Section 5, is useful for understanding the main ideas behind the general procedure.

Theorem 3.2. Suppose that $T$ is a finite set and $f$ has a non-singular covariance matrix over $T\times T$. Then Algorithm 5.3 provides a FPRAS with $q = 0$.

As we indicated above, the strategy behind the discrete case provides the basis for the general case. In the general situation, the underlying idea is to discretize the field with an appropriate sampling (discretization) rule that depends on the level $b$ and the continuity characteristics of the field. The number of sampling points grows as $b$ increases, and the complexity of the algorithm is controlled by finding a good sampling rule. There is a trade-off between the number of points sampled, which has a direct impact on the complexity of the algorithm, and the fidelity of the discrete approximation to the continuous field. Naturally, in the presence of enough smoothness and regularity, more information can be obtained with the same sample size. This point is illustrated in the next result, Theorem 3.3, which considers smooth homogeneous fields. Note that, in addition to controlling the error induced by discretizing the field, the variance is strongly controlled, and the discretization rule is optimal in a sense explained in Section 7. For Theorem 3.3 we require the following additional regularity conditions.

B1. $f$ is homogeneous and almost surely twice continuously differentiable.
B2. $0\in T\subset\mathbb{R}^d$ is a $d$-dimensional convex set with nonempty interior. Denoting its boundary by $\partial T$, assume that $\partial T$ is a $(d-1)$-dimensional manifold without boundary. For any $t\in T$, assume that there exists $\kappa_0 > 0$ such that
$$m(B(t,\varepsilon)\cap T)\ge\kappa_0\varepsilon^d$$
for any $\varepsilon < 1$, where $m$ is Lebesgue measure.

Theorem 3.3. Let $f$ satisfy conditions B1–B2. Then Algorithm 7.3 provides a FPRAS. Moreover, the underlying estimator is strongly efficient, and there exists a discretization scheme for $f$ which is optimal in the sense of Theorem 7.4.

4. Importance Sampling. Importance sampling is based on the basic identity, for fixed measurable $B$,
$$P(B) = \int 1(\omega\in B)\,dP(\omega) = \int 1(\omega\in B)\,\frac{dP}{dQ}(\omega)\,dQ(\omega), \tag{4.1}$$

where we assume that the probability measure $Q$ is such that $P(\cdot\cap B)$ is absolutely continuous with respect to the measure $Q(\cdot\cap B)$. If we use $E^Q$ to denote expectation under $Q$, then (4.1) trivially yields that the random variable
$$L(\omega) = 1(\omega\in B)\,\frac{dP}{dQ}(\omega)$$
is an unbiased estimator for $P(B) > 0$ under the measure $Q$, or, symbolically, $E^Q L = P(B)$.

An averaged importance sampling estimator based on the measure $Q$, which is often referred to as an importance sampling distribution or a change-of-measure, is obtained by simulating $n$ i.i.d. copies $L^{(1)},\dots,L^{(n)}$ of $L$ under $Q$, and computing the empirical average $\bar L_n = (L^{(1)} + \dots + L^{(n)})/n$. A central issue is that of selecting $Q$ in order to minimize the variance of $\bar L_n$. It is easy to verify that if $Q^*(\cdot) = P(\cdot \mid B) = P(\cdot\cap B)/P(B)$, then the corresponding estimator has zero variance. However, $Q^*$ is clearly a change of measure that is of no practical value, since $P(B)$ (the quantity that we are attempting to evaluate in the first place) is unknown. Nevertheless, when constructing a good importance sampling distribution for a family of sets $\{B_b : b\ge b_0\}$, for which $0 < P(B_b)\to 0$ as $b\to\infty$, it is often useful to analyze the asymptotic behavior of $Q^*$ as $P(B_b)\to 0$, in order to guide the construction of a good $Q$.
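As a toy illustration of (4.1), and of letting the asymptotic shape of $Q^*$ guide the choice of $Q$ (this is not the field algorithm of the later sections), consider estimating $P(X > b)$ for $X\sim N(0,1)$ by sampling under the mean-shifted law $Q = N(b,1)$; the likelihood ratio is $dP/dQ(x) = e^{-bx + b^2/2}$.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
b, n = 5.0, 10_000

x = rng.standard_normal(n) + b              # samples from Q = N(b, 1)
L = np.exp(-b * x + 0.5 * b**2) * (x > b)   # 1(omega in B) dP/dQ
print(L.mean(), norm.sf(b))                 # IS estimate vs exact tail
```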

We now describe briefly how an efficient importance sampling estimator for $P(B_b)$ can also be used to estimate a large class of conditional expectations given $B_b$. Suppose that a single replication of the corresponding importance sampling estimator
$$L_b \stackrel{\Delta}{=} 1(\omega\in B_b)\,\frac{dP}{dQ}$$
can be generated in $O(|\log P(B_b)|^{q_0})$ function evaluations, for some $q_0 > 0$, and that
$$\mathrm{Var}(L_b) = O\big([P(B_b)]^2\,|\log P(B_b)|^{q_0}\big).$$
These assumptions imply that, by taking the average of i.i.d. replications of $L_b$, we obtain a FPRAS.

Fix $\beta\in(0,\infty)$ and let $\mathcal{X}(\beta, q)$ be the class of random variables $X$ satisfying $0\le X\le\beta$, with
$$E[X \mid B_b] = \Omega\big(1/|\log P(B_b)|^q\big). \tag{4.2}$$

Then, by noting that
$$\frac{E^Q(XL_b)}{E^Q(L_b)} = E[X \mid B_b] = \frac{E[X;\,B_b]}{P(B_b)}, \tag{4.3}$$
it follows easily that a FPRAS can be obtained by constructing the natural estimator for $E[X \mid B_b]$, i.e. the ratio of the corresponding averaged importance sampling estimators suggested by the ratio on the left of (4.3). Of course, when $X$ is difficult to simulate exactly, one must assume the bias $E[X;\,B_b]$ can be reduced in polynomial time. The estimator is naturally biased, but the discussion on FPRAS and biased estimators given in Section 2 can be directly applied.
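A generic sketch of this ratio estimator is given below; `sample_under_Q` (returning a pair $(\omega, L_b)$ under an efficient $Q$) and the bounded functional `gamma` are hypothetical stand-ins.

```python
def conditional_expectation_is(sample_under_Q, gamma, n, rng):
    """Ratio estimator from (4.3): E[X | B_b] ~ sum_i X_i L_i / sum_i L_i,
    with X = gamma(omega) and L the importance sampling weight."""
    num = den = 0.0
    for _ in range(n):
        omega, L = sample_under_Q(rng)
        num += gamma(omega) * L
        den += L
    return num / den
```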

In the context of Gaussian random fields, we have that $B_b = \{f^* > b\}$, and one is very often interested in random variables $X$ of the form $X = \Gamma(f)$, where $\Gamma: C(T)\to\mathbb{R}$ and $C(T)$ denotes the space of continuous functions on $T$. Endowing $C(T)$ with the uniform topology, consider functions $\Gamma$ that are non-negative and bounded by a positive constant. An archetypical example, with $\beta = m(T)$, is given by the volume of (conditioned) high level excursion sets, which is known to satisfy (4.2). However, there are many other examples of $\mathcal{X}(\beta, q)$ with $\beta = m(T)$ which satisfy (4.2) for a suitable $q$, depending on the regularity properties of the field. In fact, if the mean and covariance properties of $f$ are Hölder continuous, then, using techniques similar to those given in the arguments of Section 6, it is not difficult to see that $q$ can be estimated. In turn, we have that a FPRAS based importance sampling algorithm for $w(b)$ would typically also yield a polynomial time algorithm for functional characteristics of the conditional field given high level excursions. Since this is a very important and novel application, we devote the remainder of this paper to the development of efficient importance sampling algorithms for $w(b)$.

5. The Basic Strategy: Finite Fields. In this section we develop our main ideas in the setting in which $T$ is a finite set of the form $T = \{t_1,\dots,t_M\}$. To emphasize the discrete nature of our algorithms, in this section we write $X_i = f(t_i)$ for $1\le i\le M$ and set $X = (X_1,\dots,X_M)$. This section is mainly of an expository nature, since much of it has already appeared in [2]. Nevertheless, it is included here as a useful guide to the intuition behind the algorithms for the continuous case.

We have already noted that, in order to design an efficient importance sampling estimator for $w(b) = P(\max_{1\le i\le M}X_i > b)$, it is useful to study the asymptotic conditional distribution of $X$ given that $\max_{1\le i\le M}X_i > b$. Thus we begin with some basic large deviation results.

Proposition 5.1. For any set of random variables $X_1,\dots,X_M$,
$$\max_{1\le i\le M}P(X_i > b)\ \le\ P\Big(\max_{1\le i\le M}X_i > b\Big)\ \le\ \sum_{j=1}^M P(X_j > b).$$

Moreover, if the $X_j$ are mean zero Gaussian, and the correlation between $X_i$ and $X_j$ is strictly less than 1, then
$$P(X_i > b,\ X_j > b) = o\big(\max[P(X_i > b),\,P(X_j > b)]\big).$$
Thus, if the associated covariance matrix of $X$ is non-singular,
$$w(b) = (1 + o(1))\sum_{j=1}^M P(X_j > b).$$

Proof. The lower bound in the first display is trivial, and the upper bound follows by the union bound. The second display follows easily by working with the joint density of a bivariate Gaussian distribution (e.g. [9, 16]), and the third claim is a direct consequence of the inclusion-exclusion principle. □

As noted above, $Q^*$ corresponds to the conditional distribution of $X$ given that $X^* := \max_{1\le i\le M}X_i > b$. It follows from Proposition 5.1 that, conditional on $X^* > b$, the probability that two or more $X_j$ exceed $b$ is negligible. Moreover, it also follows that
$$P(X_i = X^* \mid X^* > b) = (1 + o(1))\,\frac{P(X_i > b)}{\sum_{j=1}^M P(X_j > b)}.$$

The following corollary now follows as an easy consequence of these observations.

Corollary 5.2. As $b\to\infty$,
$$d_{TV}(Q^*, Q)\to 0,$$
where $d_{TV}$ denotes the total variation norm and $Q$ is defined, for Borel $B\subset\mathbb{R}^M$, as
$$Q(X\in B) = \sum_{i=1}^M p(i,b)\,P[X\in B \mid X_i > b],$$
where
$$p(i,b) = \frac{P(X_i > b)}{\sum_{j=1}^M P(X_j > b)}.$$

Proof. Pick an arbitrary Borel $B$. Then we have that
$$Q^*(X\in B) = \frac{P[X\in B,\ \max_{1\le i\le M}X_i > b]}{w(b)}\le\sum_{i=1}^M\frac{P[X\in B,\ X_i > b]}{w(b)} = \sum_{i=1}^M P[X\in B \mid X_i > b]\,p(i,b)\,(1 + o(1)).$$
The above, which follows from the union bound and the last part of Proposition 5.1 combined with the definition of $Q$, yields that for each $\varepsilon > 0$ there exists $b_0$ (independent of $B$) such that, for all $b\ge b_0$,
$$Q^*(X\in B)\le Q(X\in B)/(1 - \varepsilon).$$
The lower bound follows similarly, using the inclusion-exclusion principle and the second part of Proposition 5.1. □

Corollary 5.2 provides support for choosing $Q$ as an importance sampling distribution. Of course, we still have to verify that the corresponding algorithm is a FPRAS. The importance sampling estimator induced by $Q$ takes the form
$$L_b = \frac{dP}{dQ} = \frac{\sum_{j=1}^M P(X_j > b)}{\sum_{j=1}^M 1(X_j > b)}. \tag{5.1}$$

Note that under $Q$ we have that $X^* > b$ almost surely, so the denominator in the expression for $L_b$ is at least 1. Therefore we have that
$$E_Q L_b^2\le\Big(\sum_{j=1}^M P(X_j > b)\Big)^2,$$
and by virtue of Proposition 5.1 we conclude (using $\mathrm{Var}_Q$ to denote the variance under $Q$) that
$$\frac{\mathrm{Var}_Q(L_b)}{P(X^* > b)^2}\to 0$$
as $b\to\infty$. In particular, it follows that $L_b$ is strongly efficient. Our proposed algorithm can now be summarized as follows.

Algorithm 5.3. There are two steps in the algorithm.

Step 1. Simulate $n$ i.i.d. copies $X^{(1)},\dots,X^{(n)}$ of $X$ from the distribution $Q$.
Step 2. Compute and output
$$\bar L_n = \frac1n\sum_{i=1}^n L_b^{(i)},\qquad\text{where}\quad L_b^{(i)} = \frac{\sum_{j=1}^M P(X_j^{(i)} > b)}{\sum_{j=1}^M 1(X_j^{(i)} > b)}.$$

Since the generation of each $L_b^{(i)}$ under $Q$ takes $O(M^3)$ function evaluations, we conclude, based on the analysis given in Section 2, that Algorithm 5.3 is a FPRAS with $q = 0$. This implies Theorem 3.2, as promised.
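For concreteness, the following is a sketch of Algorithm 5.3 for a centered Gaussian vector $X\sim N(0,\Sigma)$ with nonsingular $\Sigma$; the truncated-normal inverse-CDF step, the conditional Gaussian step and all parameter names are our illustrative choices (the algorithm itself only prescribes sampling from $Q$ and averaging (5.1)).

```python
import numpy as np
from scipy.stats import norm

def algorithm_5_3(Sigma, b, n, rng):
    """Sketch of Algorithm 5.3: importance sampling estimate of
    w(b) = P(max_i X_i > b) for a centered X ~ N(0, Sigma)."""
    M = Sigma.shape[0]
    sd = np.sqrt(np.diag(Sigma))
    tail = norm.sf(b / sd)                       # P(X_j > b), j = 1..M
    p = tail / tail.sum()                        # p(i, b) of Corollary 5.2
    total = 0.0
    for _ in range(n):
        i = rng.choice(M, p=p)
        # X_i | X_i > b, by inverse-CDF sampling of the truncated normal.
        xi = sd[i] * norm.isf(rng.uniform() * tail[i])
        # Remaining coordinates from their conditional law given X_i = xi.
        idx = np.arange(M) != i
        w = Sigma[idx, i] / Sigma[i, i]
        cov = Sigma[np.ix_(idx, idx)] - np.outer(w, Sigma[i, idx])
        x = np.empty(M)
        x[i] = xi
        x[idx] = rng.multivariate_normal(w * xi, cov)
        total += tail.sum() / np.count_nonzero(x > b)   # L_b of (5.1)
    return total / n

# Example: exchangeable correlation 0.5, M = 20, b = 4.
rng = np.random.default_rng(2)
Sig = 0.5 * np.ones((20, 20)) + 0.5 * np.eye(20)
print(algorithm_5_3(Sig, b=4.0, n=2_000, rng=rng))
```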

6. A FPRAS for Hölder Continuous Gaussian Fields. In this section we shall describe the algorithm and the analysis behind Theorem 3.1. Throughout the section, unless stated otherwise, we assume conditions A1–A4 of Section 3.

There are two issues related to the complexity analysis. First, since $f$ is assumed continuous, the entire field cannot be generated on a (discrete) computer, so the algorithm used in the discrete case needs adaptation. Once this is done, we need to carry out an appropriate variance analysis.

Developing an estimator directly applicable to the continuous field will be carried out in Section 6.1. This construction will not only be useful when studying the performance of a suitable discretization, but will also help to explain some of the features of our discrete construction. Then, in Section 6.2, we introduce a discretization approach and study the bias caused by the discretization. In addition, we provide bounds on the variance of this discrete importance sampling estimator.

6.1. A Continuous Estimator. We start with a change of measure motivated by the discrete case in Section 5. A natural approach is to consider an importance sampling strategy analogous to that of Algorithm 5.3. The continuous adaptation involves first sampling $\tau_b$ according to the probability measure
$$P(\tau_b\in\cdot) = \frac{E[m(A_b\cap\cdot)]}{E[m(A_b)]}, \tag{6.1}$$
where $A_b = \{t\in T: f(t) > b\}$. The idea of introducing $\tau_b$ in the continuous setting is not necessarily to locate the point at which the maximum is achieved, as was the situation in the discrete case. Rather, $\tau_b$ will be used to find a random point which has a reasonable probability of being in the excursion set $A_b$. (This probability will tend to be higher if $f$ is non-homogeneous.) This relaxation will prove useful in the analysis of the algorithm. Note that $\tau_b$, with the distribution indicated in equation (6.1), has a density function (with respect to Lebesgue measure) given by
$$h_b(t) = \frac{P(f(t) > b)}{E[m(A_b)]},$$
and that we also can write
$$E[m(A_b)] = E\int_T 1(f(t) > b)\,dt = \int_T P(f(t) > b)\,dt = m(T)\,P(f(U) > b),$$
where $U$ is uniformly distributed over $T$.
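As an aside, the identity above reduces the normalizer $E[m(A_b)]$ of the density $h_b$ to a one-dimensional average of Gaussian tail probabilities; a sketch for a centered field on $T = [0,1]$ with an assumed variance function follows.

```python
import numpy as np
from scipy.stats import norm

def mean_excursion_volume(sigma2, b, n, rng):
    """Monte Carlo evaluation of E m(A_b) = m(T) P(f(U) > b), U uniform
    on T = [0,1] (so m(T) = 1); sigma2 is an assumed variance function
    of a centered field, supplied by the user."""
    U = rng.uniform(size=n)
    return np.mean(norm.sf(b / np.sqrt(sigma2(U))))
```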

Once $\tau_b$ is generated, the natural continuous adaptation of the strategy described by Algorithm 5.3 proceeds by sampling $f$ conditional on $f(\tau_b) > b$. Note that if we use $Q$ to denote the change-of-measure induced by such a continuous sampling strategy, then the corresponding importance sampling estimator takes the form
$$L_b = \frac{dP}{dQ} = \frac{E[m(A_b)]}{m(A_b)}.$$

The second moment of the estimator then satisfies
$$E_Q[(L_b)^2] = E(L_b;\ A_b\ne\emptyset) = E[m(A_b)]\,P(f^* > b)\,E[m(A_b)^{-1} \mid A_b\ne\emptyset]. \tag{6.2}$$
Unfortunately, it is easy to construct examples for which $E_Q[(L_b)^2]$ is infinite. Consequently, the above construction needs to be modified slightly.

Elementary extreme value theory considerations for the maximum of finitely many Gaussian random variables, and their analogues for continuous fields, give that the overshoot of $f$ over a given level $b$ will be of order $\Theta(1/b)$. Thus, in order to keep $\tau_b$ reasonably close to the excursion set, we shall also consider the possibility of an undershoot of size $\Theta(1/b)$ right at $\tau_b$. As we shall see, this relaxation will allow us to prevent the variance in (6.2) becoming infinite. Thus, instead of (6.1), we shall consider $\tau_{b-a/b}$ with density
$$h_{b-a/b}(t) = \frac{P(f(t) > b - a/b)}{E[m(A_{b-a/b})]}, \tag{6.3}$$
for some $a > 0$. To ease later notation, write
$$\gamma_{ab} := b - a/b,\qquad \tau_{\gamma_{ab}} := \tau_{b-a/b}.$$

Let $Q'$ be the change of measure induced by sampling $f$ as follows. Given $\tau_{\gamma_{ab}}$, sample $f(\tau_{\gamma_{ab}})$ conditional on $f(\tau_{\gamma_{ab}}) > \gamma_{ab}$. In turn, the rest of $f$ follows its conditional distribution (under the nominal, or original, measure) given the observed value $f(\tau_{\gamma_{ab}})$. We then have that the corresponding Radon-Nikodym derivative is
$$\frac{dP}{dQ'} = \frac{E[m(A_{\gamma_{ab}})]}{m(A_{\gamma_{ab}})}, \tag{6.4}$$

and the importance sampling estimator $L_b'$ is
$$L_b' = \frac{dP}{dQ'}\,1(A_b\ne\emptyset) = \frac{E[m(A_{\gamma_{ab}})]}{m(A_{\gamma_{ab}})}\,1(m(A_b) > 0). \tag{6.5}$$

Note that we have used the continuity of the field in order to write $\{m(A_b) > 0\} = \{A_b\ne\emptyset\}$ almost surely. The motivation behind this choice lies in the fact that, since $m(A_{\gamma_{ab}})\ge m(A_b) > 0$ on this event, the denominator may now be big enough to control the second moment of the estimator. As we shall see, introducing the undershoot of size $a/b$ will be very useful in the technical development, both in the remainder of this section and in Section 7. In addition, its introduction also provides insight into the appropriate form of the estimator needed when discretizing the field.

6.2. Algorithm and Analysis. We still need to face the problem of generating $f$ in a computer. Thus we now concentrate on a suitable discretization scheme, still having in mind the change of measure leading to (6.4). Since our interest is to ultimately design algorithms that are efficient for estimating expectations such as $E[\Gamma(f) \mid f^* > b]$, where $\Gamma$ may be a functional of the whole field, we shall use a global discretization scheme.

Consider $U = (U_1,\dots,U_M)$, where the $U_i$ are i.i.d. uniform random variables taking values in $T$ and independent of the field $f$. Set $T_M = \{U_1,\dots,U_M\}$ and $X_i = X_i(U_i) = f(U_i)$ for $1\le i\le M$. Then $X = (X_1,\dots,X_M)$ (conditional on $U$) is a multivariate Gaussian random vector with conditional means $\mu(U_i)\stackrel{\Delta}{=}E(X_i \mid U_i)$ and covariances $C(U_i,U_j)\stackrel{\Delta}{=}\mathrm{Cov}(X_i,X_j \mid U_i,U_j)$. Our strategy is to approximate $w(b)$ by
$$w_M(\gamma_{ab}) = P\Big(\max_{t\in T_M}f(t) > \gamma_{ab}\Big) = E\Big[P\Big(\max_{1\le i\le M}X_i > \gamma_{ab}\,\Big|\,U\Big)\Big].$$

Given the development in Section 5, it might not be surprising that, if we can ensure that $M = M(\varepsilon, b)$ is polynomial in $1/\varepsilon$ and $b$, then we shall be in a good position to develop a FPRAS. The idea is to apply an importance sampling strategy similar to that we considered in the construction of $L_b'$ of (6.5), but this time it will be conditional on $U$. In view of our earlier discussions, we propose sampling from $Q''$, defined via
$$Q''(X\in B \mid U) = \sum_{i=1}^M p_U(i)\,P[X\in B \mid X_i > \gamma_{ab},\,U],$$
where
$$p_U(i) = \frac{P(X_i > \gamma_{ab} \mid U)}{\sum_{j=1}^M P(X_j > \gamma_{ab} \mid U)}.$$

We then obtain the (conditional) importance sampling estimator
$$L_b(U) = \frac{\sum_{i=1}^M P(X_i > \gamma_{ab} \mid U)}{\sum_{i=1}^M 1(X_i > \gamma_{ab})}. \tag{6.6}$$
It is clear that
$$w_M(\gamma_{ab}) = E_{Q''}[L_b(U)].$$

Suppose for the moment that $M$, $a$ and the number of replications $n$ have been chosen. (Our future analysis will, in particular, guide the selection of these parameters.) Then the procedure is summarized by the next algorithm.

Algorithm 6.1. The algorithm has three steps.

Step 1. Simulate $U^{(1)},\dots,U^{(n)}$, which are $n$ i.i.d. copies of the vector $U = (U_1,\dots,U_M)$ described above.
Step 2. Conditional on each $U^{(i)}$, for $i = 1,\dots,n$, generate $L_b^{(i)}(U^{(i)})$ as described by (6.6), by considering the distribution of $X^{(i)}(U^{(i)}) = (X_1^{(i)}(U_1^{(i)}),\dots,X_M^{(i)}(U_M^{(i)}))$. Generate the $X^{(i)}(U^{(i)})$ independently, so that at the end we obtain that the $L_b^{(i)}(U^{(i)})$ are $n$ i.i.d. copies of $L_b(U)$.
Step 3. Output
$$\bar L_n(U^{(1)},\dots,U^{(n)}) = \frac1n\sum_{i=1}^n L_b^{(i)}(U^{(i)}).$$
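A compact sketch of Algorithm 6.1 on $T = [0,1]^d$ for a centered field is given below; the covariance kernel, the jitter term and all numerical choices are our illustrative assumptions (the paper requires only A1–A4, with $M$ and $a$ chosen as in Theorem 6.6 below).

```python
import numpy as np
from scipy.stats import norm

def algorithm_6_1(kernel, d, b, a, M, n, rng):
    """Sketch of Algorithm 6.1: each replication redraws the uniform design
    U (Step 1), applies the finite-field change of measure Q'' at level
    gamma_ab = b - a/b and evaluates the estimator (6.6) (Step 2), and the
    replications are averaged (Step 3)."""
    gamma = b - a / b
    total = 0.0
    for _ in range(n):
        U = rng.uniform(size=(M, d))
        Sigma = kernel(U[:, None, :], U[None, :, :]) + 1e-10 * np.eye(M)
        sd = np.sqrt(np.diag(Sigma))
        tail = norm.sf(gamma / sd)                      # P(X_i > gamma | U)
        i = rng.choice(M, p=tail / tail.sum())          # index ~ p_U(i)
        xi = sd[i] * norm.isf(rng.uniform() * tail[i])  # X_i | X_i > gamma
        idx = np.arange(M) != i
        w = Sigma[idx, i] / Sigma[i, i]
        cov = Sigma[np.ix_(idx, idx)] - np.outer(w, Sigma[i, idx])
        x = np.empty(M)
        x[i] = xi
        x[idx] = rng.multivariate_normal(w * xi, cov)
        total += tail.sum() / np.count_nonzero(x > gamma)   # (6.6)
    return total / n

# Example kernel (an assumption): C(s, t) = exp(-|s - t|^2).
sq_exp = lambda s, t: np.exp(-np.sum((s - t) ** 2, axis=-1))
# print(algorithm_6_1(sq_exp, d=1, b=3.0, a=1.0, M=50, n=500,
#                     rng=np.random.default_rng(3)))
```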

6.3. Running Time of Algorithm 6.1: Bias and Variance Control. The remainder of this section is devoted to the analysis of the running time of Algorithm 6.1. The first step lies in estimating the bias and second moment of $L_b(U)$ under the change of measure induced by the sampling strategy of the algorithm, which we denote by $Q''$. We start with a simple bound for the second moment.

Proposition 6.2. There exists a finite $\lambda_0$, depending on $\mu_T = \max_{t\in T}|\mu(t)|$ and $\sigma_T^2 = \max_{t\in T}\sigma^2(t)$, for which
$$E_{Q''}[L_b(U)^2]\le\lambda_0 M^2 P\Big(\max_{t\in T}f(t) > b\Big)^2.$$

Proof. Observe that
$$E_{Q''}[L_b(U)^2]\le E\Big(\sum_{i=1}^M P(X_i > \gamma_{ab} \mid U_i)\Big)^2\le E\Big(\sum_{i=1}^M\sup_{t\in T}P(f(U_i) > \gamma_{ab} \mid U_i = t)\Big)^2 = M^2\max_{t\in T}P(f(t) > \gamma_{ab})^2\le\lambda_0 M^2\max_{t\in T}P(f(t) > b)^2\le\lambda_0 M^2 P\Big(\max_{t\in T}f(t) > b\Big)^2,$$
which completes the proof. □

Next we obtain a preliminary estimate of the bias

Proposition 6.3. For each $M\ge 1$, we have
$$|w(b) - w_M(\gamma_{ab})|\le E\Big[\exp\Big(-\frac{M\,m(A_{\gamma_{ab}})}{m(T)}\Big);\ A_b\cap T\ne\emptyset\Big] + P\Big(\max_{t\in T}f(t) > \gamma_{ab},\ \max_{t\in T}f(t)\le b\Big).$$

Proof. Note that
$$|w(b) - w_M(\gamma_{ab})|\le P\Big(\max_{t\in T_M}f(t)\le\gamma_{ab},\ \max_{t\in T}f(t) > b\Big) + P\Big(\max_{t\in T_M}f(t) > \gamma_{ab},\ \max_{t\in T}f(t)\le b\Big).$$
The second term is easily bounded by
$$P\Big(\max_{t\in T_M}f(t) > \gamma_{ab},\ \max_{t\in T}f(t)\le b\Big)\le P\Big(\max_{t\in T}f(t) > \gamma_{ab},\ \max_{t\in T}f(t)\le b\Big).$$
The first term can be bounded as follows:
$$P\Big(\max_{t\in T_M}f(t)\le\gamma_{ab},\ \max_{t\in T}f(t) > b\Big)\le E\Big[\big(P[f(U_i)\le\gamma_{ab} \mid f]\big)^M\,1(A_b\cap T\ne\emptyset)\Big]\le E\Big[\Big(1 - \frac{m(A_{\gamma_{ab}})}{m(T)}\Big)^M;\ A_b\cap T\ne\emptyset\Big]\le E\Big[\exp\Big(-\frac{M\,m(A_{\gamma_{ab}})}{m(T)}\Big);\ A_b\cap T\ne\emptyset\Big].$$
This completes the proof. □

The previous proposition shows that controlling the relative bias of $L_b(U)$ requires finding bounds for
$$E\big[\exp(-M\,m(A_{\gamma_{ab}})/m(T));\ A_b\cap T\ne\emptyset\big] \tag{6.7}$$
and
$$P\Big(\max_{t\in T}f(t) > \gamma_{ab},\ \max_{t\in T}f(t)\le b\Big), \tag{6.8}$$
and so we develop these. To bound (6.7) we take advantage of the importance sampling strategy based on $Q'$ introduced earlier in (6.4). Write
$$E\big[\exp(-M\,m(A_{\gamma_{ab}})/m(T));\ m(A_b) > 0\big] = E_{Q'}\Big(\frac{\exp(-M\,m(A_{\gamma_{ab}})/m(T))}{m(A_{\gamma_{ab}})};\ m(A_b) > 0\Big)\,E\,m(A_{\gamma_{ab}}). \tag{6.9}$$
Furthermore, note that for each $\alpha > 0$ we have
$$E_{Q'}\Big(\frac{\exp(-M\,m(A_{\gamma_{ab}})/m(T))}{m(A_{\gamma_{ab}})};\ m(A_b) > 0\Big)\le\alpha^{-1}\exp(-M\alpha/m(T)) + E_{Q'}\Big(\frac{\exp(-M\,m(A_{\gamma_{ab}})/m(T))}{m(A_{\gamma_{ab}})};\ m(A_{\gamma_{ab}})\le\alpha,\ m(A_b) > 0\Big). \tag{6.10}$$

The next result, whose proof is given in Section 6.4, gives a bound for the above expectation.

Proposition 6.4. Let $\beta$ be as in Conditions A2 and A3. For any $v > 0$ there exist constants $\kappa, \lambda_2\in(0,\infty)$ (independent of $a\in(0,1)$ and $b$, but dependent on $v$) such that, if we select
$$\alpha^{-1}\ge\kappa^{d/\beta}(b/a)^{(2+v)2d/\beta}$$
and define $W$ such that $P(W > x) = \exp(-x^{\beta/d})$ for $x\ge 0$, then
$$E_{Q'}\Big(\frac{\exp(-M\,m(A_{\gamma_{ab}})/m(T))}{m(A_{\gamma_{ab}})};\ m(A_{\gamma_{ab}})\le\alpha,\ m(A_b) > 0\Big)\le\frac{E[W^2]\,m(T)}{\lambda_2^{2d/\beta}M}\Big(\frac ba\Big)^{4d/\beta}. \tag{6.11}$$

The following result gives us a useful upper bound on (6.8). The proof is given in Section 6.5.

Proposition 6.5. Assume that Conditions A2 and A3 are in force. For any $v > 0$, let $\rho = 2d/\beta + dv + 1$, where $d$ is the dimension of $T$. There exist constants $b_0, \lambda\in(0,\infty)$ (independent of $a$, but depending on $\mu_T = \max_{t\in T}|\mu(t)|$, $\sigma_T^2 = \max_{t\in T}\sigma^2(t)$, $v$, and the Hölder parameters $\beta$ and $\kappa_H$) so that, for all $b\ge b_0\ge 1$, we have
$$P\Big(\max_{t\in T}f(t)\le b + a/b\,\Big|\,\max_{t\in T}f(t) > b\Big)\le\lambda ab^\rho. \tag{6.12}$$
Consequently,
$$P\Big(\max_{t\in T}f(t) > \gamma_{ab},\ \max_{t\in T}f(t)\le b\Big) = P\Big(\max_{t\in T}f(t)\le b\,\Big|\,\max_{t\in T}f(t) > \gamma_{ab}\Big)\,P\Big(\max_{t\in T}f(t) > \gamma_{ab}\Big)\le\lambda ab^\rho\,P\Big(\max_{t\in T}f(t) > \gamma_{ab}\Big).$$
Moreover,
$$P\Big(\max_{t\in T}f(t) > \gamma_{ab}\Big)(1 - \lambda ab^\rho)\le P\Big(\max_{t\in T}f(t) > b\Big).$$

Propositions 6.4 and 6.5 allow us to prove Theorem 3.1, which is rephrased in the form of the following theorem, which contains the detailed rate of complexity and so the main result of this section.

Theorem 6.6. Suppose $f$ is a Gaussian random field satisfying conditions A1–A4 in Section 3. Given any $v > 0$, put $a = \varepsilon/(4\lambda b^\rho)$ (with $\lambda$ and $\rho$ as in Proposition 6.5) and $\alpha^{-1} = \kappa^{d/\beta}(b/a)^{(2+v)d/\beta}$. Then there exist $c, \varepsilon_0 > 0$ such that, for all $\varepsilon\le\varepsilon_0$,
$$|w(b) - w_M(\gamma_{ab})|\le w(b)\,\varepsilon \tag{6.13}$$
if $M = \lceil c\varepsilon^{-1}(b/a)^{(4+4v)d/\beta}\rceil$. Consequently, by our discussion in Section 2 and the bound on the second moment given in Proposition 6.2, it follows that Algorithm 6.1 provides a FPRAS with running time $O(M^3\times M^2\times\varepsilon^{-2}\delta^{-1})$.

Proof. Combining (6.3), (6.9) and (6.10) with Propositions 6.3–6.5, we have that
$$|w(b) - w_M(\gamma_{ab})|\le\alpha^{-1}\exp(-M\alpha/m(T))\,E[m(A_{\gamma_{ab}})] + \frac{E[W^2]\,m(T)}{\lambda_2^{2d/\beta}M}\Big(\frac ba\Big)^{4d/\beta}E[m(A_{\gamma_{ab}})] + \Big(\frac{\lambda ab^\rho}{1 - \lambda ab^\rho}\Big)w(b).$$
Furthermore, there exists a constant $K\in(0,\infty)$ such that
$$E\,m(A_{\gamma_{ab}})\le K\max_{t\in T}P(f(t) > b)\,m(T)\le Kw(b)\,m(T).$$
Therefore we have that
$$\frac{|w(b) - w_M(\gamma_{ab})|}{w(b)}\le\alpha^{-1}Km(T)\exp(-M\alpha/m(T)) + \frac{E[W^2]\,Km(T)^2}{\lambda_2^{2d/\beta}M}\Big(\frac ba\Big)^{4d/\beta} + \Big(\frac{\lambda ab^\rho}{1 - \lambda ab^\rho}\Big).$$
Moreover, since $a = \varepsilon/(4\lambda b^\rho)$, we obtain that, for $\varepsilon\le 1/2$,
$$\frac{|w(b) - w_M(\gamma_{ab})|}{w(b)}\le\alpha^{-1}Km(T)\exp(-M\alpha/m(T)) + \frac{E[W^2]\,Km(T)^2}{\lambda_2^{2d/\beta}M}\Big(\frac ba\Big)^{4d/\beta} + \frac\varepsilon2.$$
From the selection of $\alpha$ and $M$, it follows easily that the first two terms on the right hand side of the previous display can be made less than $\varepsilon/2$ for all $\varepsilon\le\varepsilon_0$, by taking $\varepsilon_0$ sufficiently small.

The complexity count given in the theorem now corresponds to the following estimates. The factor $O(M^3)$ represents the cost of a Cholesky factorization, required to generate a single replication of a finite field of dimension $M$. In addition, Proposition 6.2 gives us that $O(M^2\varepsilon^{-2}\delta^{-1})$ replications are required to control the relative variance of the estimator. □
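The parameter choices of Theorem 6.6 are mechanical once the field-dependent constants are known; the helper below merely evaluates them. Note that $\lambda$, $\kappa$ and $c$ are the constants of Propositions 6.4–6.5 and Theorem 6.6, whose existence is asserted by the theory but whose numerical values must be supplied; the routine is a bookkeeping sketch, not part of the paper.

```python
import math

def theorem_6_6_parameters(eps, b, d, beta, v, lam, kappa, c):
    """Evaluate a, alpha and M as prescribed by Theorem 6.6; lam, kappa
    and c are user-supplied field-dependent constants."""
    rho = 2 * d / beta + d * v + 1                  # Proposition 6.5
    a = eps / (4.0 * lam * b ** rho)                # undershoot scale
    alpha = (kappa ** (d / beta) * (b / a) ** ((2 + v) * d / beta)) ** -1
    M = math.ceil(c / eps * (b / a) ** ((4 + 4 * v) * d / beta))
    return a, alpha, M
```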

We now proceed to prove Propositions 6.4 and 6.5.

6.4. Proof of Proposition 6.4. We concentrate on the analysis of the left hand side of (6.11). An important observation is that, conditional on the random variable $\tau_{\gamma_{ab}}$ with distribution
$$Q'(\tau_{\gamma_{ab}}\in\cdot) = \frac{E[m(A_{\gamma_{ab}}\cap\cdot)]}{E[m(A_{\gamma_{ab}})]}$$
and given $f(\tau_{\gamma_{ab}})$, the rest of the field, namely $(f(t): t\in T\setminus\{\tau_{\gamma_{ab}}\})$, is another Gaussian field with a computable mean and covariance structure. The second term in (6.10) indicates that we must estimate the probability that $m(A_{\gamma_{ab}})$ takes small values under $Q'$. For this purpose, we shall develop an upper bound for
$$P\big(m(A_{\gamma_{ab}}) < y^{-1},\ m(A_b) > 0\,\big|\,f(t) = \gamma_{ab} + z/\gamma_{ab}\big) \tag{6.14}$$

for $y$ large enough. Our argument proceeds in two steps. For the first, in order to study (6.14), we shall estimate the conditional mean and covariance of $(f(s): s\in T)$, given that $f(t) = \gamma_{ab} + z/\gamma_{ab}$. Then we use the fact that the conditional field is also Gaussian, and take advantage of general results from the theory of Gaussian random fields, to obtain a bound for (6.14). For this purpose, we recall some useful results from the theory of Gaussian random fields. The first result is due to Dudley [14].

Theorem 6.7. Let $U$ be a compact subset of $\mathbb{R}^n$ and let $\{f_0(t): t\in U\}$ be a mean zero, continuous Gaussian random field. Define the canonical metric $d$ on $U$ as
$$d(s,t) = \sqrt{E[f_0(t) - f_0(s)]^2},$$
and put $\mathrm{diam}(U) = \sup_{s,t\in U}d(s,t)$, which is assumed to be finite. Then there exists a finite universal constant $\kappa > 0$ such that
$$E\Big[\max_{t\in U}f_0(t)\Big]\le\kappa\int_0^{\mathrm{diam}(U)/2}[\log N(\varepsilon)]^{1/2}\,d\varepsilon,$$
where the entropy $N(\varepsilon)$ is the smallest number of $d$-balls of radius $\varepsilon$ whose union covers $U$.

The second general result that we shall need is the so-called B-TIS (Borell–Tsirelson–Ibragimov–Sudakov) inequality [4, 11, 26].

Theorem 6.8. Under the setting described in Theorem 6.7,
$$P\Big(\max_{t\in U}f_0(t) - E\Big[\max_{t\in U}f_0(t)\Big]\ge b\Big)\le\exp\big(-b^2/(2\sigma_U^2)\big),$$
where
$$\sigma_U^2 = \max_{t\in U}E[f_0^2(t)].$$

We can now proceed with the main proof. We shall assume from now on that $\tau_{\gamma_{ab}} = 0$ since, as will be obvious from what follows, all estimates hold uniformly over $\tau_{\gamma_{ab}}\in T$. This is a consequence of the uniform Hölder assumptions A2 and A3. Define a new process $\tilde f$:
$$(\tilde f(t): t\in T)\stackrel{\mathcal L}{=}\big(f(t): t\in T\,\big|\,f(0) = \gamma_{ab} + z/\gamma_{ab}\big).$$
Note that we can always write $\tilde f(t) = \tilde\mu(t) + g(t)$, where $g$ is a mean zero Gaussian random field on $T$. We have that
$$\tilde\mu(t) = E\tilde f(t) = \mu(t) + \sigma(0)^{-2}C(0,t)\big(\gamma_{ab} + z/\gamma_{ab} - \mu(0)\big),$$
and that the covariance function of $\tilde f$ is given by
$$C_g(s,t) = \mathrm{Cov}(g(s), g(t)) = C(s,t) - \sigma(0)^{-2}C(0,s)C(0,t).$$

The following lemma describes the behavior of $\tilde\mu(t)$ and $C_g(t,s)$.

Lemma 6.9. Assume that $|s|$ and $|t|$ are small enough. Then the following three conclusions hold.

(i) There exist constants $\lambda_0$ and $\lambda_1 > 0$ such that
$$|\tilde\mu(t) - (\gamma_{ab} + z/\gamma_{ab})|\le\lambda_0|t|^\beta + \lambda_1|t|^\beta(\gamma_{ab} + z/\gamma_{ab}),$$
and, for all $z\in(0,1)$ and $\gamma_{ab}$ large enough,
$$|\tilde\mu(s) - \tilde\mu(t)|\le\kappa_H\gamma_{ab}|s-t|^\beta.$$
(ii) $C_g(s,t)\le 2\kappa_H\sigma(t)\sigma(s)\{|t|^\beta + |s|^\beta + |t-s|^\beta\}$.
(iii) $D_g(s,t) = \sqrt{E([g(t) - g(s)]^2)}\le\lambda_1^{1/2}|t-s|^{\beta/2}$.

Proof. All three consequences follow from simple algebraic manipulations. The details are omitted. □

Proposition 6.10. For any $v > 0$ there exist $\kappa$ and $\lambda_2$ such that, for all $t\in T$, $y^{-\beta/d}\le a^{2+v}/(\kappa b^{2+v})$, $a$ sufficiently small and $z > 0$,
$$P\big(m(A_{\gamma_{ab}})^{-1} > y,\ m(A_b) > 0\,\big|\,f(t) = \gamma_{ab} + z/\gamma_{ab}\big)\le\exp\big(-\lambda_2 a^2 y^{\beta/d}/b^2\big).$$

Proof. For notational simplicity, and without loss of generality, we assume that $t = 0$. First consider the case $z\ge 1$. Then there exist $c_1, c_2$ such that, for all $c_2 y^{-\beta/d} < b^{-2-v}$ and $z > 1$,
$$P\big(m(A_{\gamma_{ab}}\cap T)^{-1} > y,\ m(A_b) > 0\,\big|\,f(0) = \gamma_{ab} + z/\gamma_{ab}\big)\le P\Big(\inf_{|t| < c_1 y^{-1/d}}\tilde f(t)\le\gamma_{ab}\,\Big|\,f(0) = \gamma_{ab} + z/\gamma_{ab}\Big) = P\Big(\inf_{|t| < c_1 y^{-1/d}}\{\tilde\mu(t) + g(t)\}\le\gamma_{ab}\Big)\le P\Big(\inf_{|t| < c_1 y^{-1/d}}g(t)\le -\frac1{2\gamma_{ab}}\Big).$$

Now apply (iii) from Lemma 6.9, from which it follows that $N(\varepsilon)\le c_3 m(T)/\varepsilon^{2d/\beta}$ for some constant $c_3$. By Theorem 6.7, $E(\sup_{|t| < c_1 y^{-1/d}}g(t)) = O(y^{-\beta/(2d)}\log y)$. By Theorem 6.8, for some constant $c_4$,
$$P\big(m(A_{\gamma_{ab}}\cap T)^{-1} > y,\ m(A_b) > 0\,\big|\,f(0) = \gamma_{ab} + z/\gamma_{ab}\big)\le P\Big(\inf_{|t| < c_1 y^{-1/d}}g(t)\le -\frac1{2\gamma_{ab}}\Big)\le\exp\Big(-\frac1{c_4\gamma_{ab}^2 y^{-\beta/d}}\Big)$$
for $c_2 y^{-\beta/d} < b^{-2-v}$ and $z > 1$.

Now consider the case $z\in(0,1)$. Let $t^*$ be the global maximum of $\tilde f(t)$. Then
$$P\big(m(A_{\gamma_{ab}}\cap T)^{-1} > y,\ m(A_b) > 0\,\big|\,f(0) = \gamma_{ab} + z/\gamma_{ab}\big)\le P\Big(\inf_{|t - t^*| < c_1 y^{-1/d}}\tilde f(t) < \gamma_{ab},\ \tilde f(t^*) > b\,\Big|\,f(0) = \gamma_{ab} + z/\gamma_{ab}\Big)\le P\Big(\sup_{|s - t| < c_1 y^{-1/d}}|\tilde f(s) - \tilde f(t)| > a/b\,\Big|\,f(0) = \gamma_{ab} + z/\gamma_{ab}\Big).$$

Consider the new field $\xi(s,t) = g(s) - g(t)$, with parameter space $T\times T$. Note that
$$\sqrt{\mathrm{Var}(\xi(s,t))} = D_g(s,t)\le\lambda_1^{1/2}|s-t|^{\beta/2}.$$
Via basic algebra, it is not hard to show that the entropy of $\xi(s,t)$ is bounded by $N_\xi(\varepsilon)\le c_3 m(T\times T)/\varepsilon^{2d/\beta}$. In addition, from (i) of Lemma 6.9 we have
$$|\tilde\mu(s) - \tilde\mu(t)|\le\kappa_H\gamma_{ab}|s-t|^\beta.$$
Similarly, for some $\kappa > 0$, all $y^{-\beta/d}\le\frac1\kappa(a/b)^{2+v}$ and $a < 1$, there exists $c_5$ such that
$$P\big(m(A_{\gamma_{ab}}\cap T)^{-1} > y,\ m(A_b) > 0\,\big|\,f(0) = \gamma_{ab} + z/\gamma_{ab}\big)\le P\Big(\sup_{|s - t| < c_1 y^{-1/d}}|\tilde f(s) - \tilde f(t)| > a/b\,\Big|\,f(0) = \gamma_{ab} + z/\gamma_{ab}\Big)\le P\Big(\sup_{|s - t| < c_1 y^{-1/d}}|\xi(s,t)| > \frac a{2b}\Big)\le\exp\Big(-\frac{a^2}{c_5 b^2 y^{-\beta/d}}\Big).$$

Combining the two cases $z > 1$ and $z\in(0,1)$, and choosing $c_5$ large enough, we have
$$P\big(m(A_{\gamma_{ab}}\cap T)^{-1} > y,\ m(A_b) > 0\,\big|\,f(0) = \gamma_{ab} + z/\gamma_{ab}\big)\le\exp\Big(-\frac{a^2}{c_5 b^2 y^{-\beta/d}}\Big)$$
for $a$ small enough and $y^{-\beta/d}\le\frac1\kappa(a/b)^{2+v}$. Renaming the constants completes the proof. □

The final ingredient needed for the proof of Proposition 6.4 is the following lemma, involving stochastic domination. The proof follows from an elementary argument and is therefore omitted.

Lemma 6.11. Let $v_1$ and $v_2$ be finite measures on $\mathbb{R}$, and define $\eta_j(x) = \int_x^\infty v_j(ds)$. Suppose that $\eta_1(x)\ge\eta_2(x)$ for each $x\ge x_0$. Let $(h(x): x\ge x_0)$ be a non-decreasing, positive and bounded function. Then
$$\int_{x_0}^\infty h(s)\,v_1(ds)\ge\int_{x_0}^\infty h(s)\,v_2(ds).$$

Proof of Proposition 6.4. Note that
$$E_{Q'}\Big(\frac{\exp(-M\,m(A_{\gamma_{ab}})/m(T))}{m(A_{\gamma_{ab}})};\ m(A_{\gamma_{ab}})\in(0,\alpha),\ m(A_b) > 0\Big)$$
$$= \int_T\int_{z=0}^\infty E\Big(\frac{\exp(-M\,m(A_{\gamma_{ab}})/m(T))}{m(A_{\gamma_{ab}})};\ m(A_{\gamma_{ab}})\in(0,\alpha),\ m(A_b) > 0\,\Big|\,f(t) = \gamma_{ab} + \frac z{\gamma_{ab}}\Big)\,P(\tau_{\gamma_{ab}}\in dt)\,P\big(\gamma_{ab}[f(t) - \gamma_{ab}]\in dz\,\big|\,f(t) > \gamma_{ab}\big)$$
$$\le\sup_{z > 0,\,t\in T}E\Big(\frac{\exp(-M\,m(A_{\gamma_{ab}})/m(T))}{m(A_{\gamma_{ab}})};\ m(A_{\gamma_{ab}})\in(0,\alpha),\ m(A_b) > 0\,\Big|\,f(t) = \gamma_{ab} + \frac z{\gamma_{ab}}\Big).$$

Now define $Y(b/a) = (b/a)^{2d/\beta}\lambda_2^{-d/\beta}W$, with $\lambda_2$ as chosen in Proposition 6.10, and $W$ with distribution given by $P(W > x) = \exp(-x^{\beta/d})$. By Lemma 6.11,
$$\sup_{t\in T}E\Big(\frac{\exp(-M\,m(A_{\gamma_{ab}})/m(T))}{m(A_{\gamma_{ab}})};\ m(A_{\gamma_{ab}})\in(0,\alpha),\ m(A_b) > 0\,\Big|\,f(t) = \gamma_{ab} + \frac z{\gamma_{ab}}\Big)\le E\big[Y(b/a)\exp\big(-MY(b/a)^{-1}/m(T)\big);\ Y(b/a) > \alpha^{-1}\big].$$
Now let $Z$ be exponentially distributed with mean 1 and independent of $Y(b/a)$. Then we have (using the definition of the tail distribution of $Z$ and Chebyshev's inequality)
$$\exp\big(-MY(b/a)^{-1}/m(T)\big) = P\big(Z > MY(b/a)^{-1}/m(T)\,\big|\,Y(b/a)\big)\le\frac{m(T)\,Y(b/a)}{M}.$$
Therefore,
$$E\big[\exp\big(-MY(b/a)^{-1}/m(T)\big)\,Y(b/a);\ Y(b/a)\ge\alpha^{-1}\big]\le\frac{m(T)}{M}E\big[Y(b/a)^2;\ Y(b/a)\ge\alpha^{-1}\big]\le\frac{m(T)}{M\lambda_2^{2d/\beta}}\Big(\frac ba\Big)^{4d/\beta}E(W^2),$$
which completes the proof. □

6.5. Proof of Proposition 6.5. We start with the following result of Tsirelson [27].

Theorem 6.12. Let $f$ be a continuous, separable Gaussian process on a compact (in the canonical metric) domain $T$. Suppose that the standard deviation function $\sigma$ of $f$ is continuous, and that $\sigma(t) > 0$ for $t\in T$. Moreover, assume that $\mu = Ef$ is also continuous and $\mu(t)\ge 0$ for all $t\in T$. Define
$$\sigma_T^2\stackrel{\Delta}{=}\max_{t\in T}\mathrm{Var}(f(t)),$$
and set $F(x) = P\{\max_{t\in T}f(t)\le x\}$. Then $F$ is continuously differentiable on $\mathbb{R}$. Furthermore, let $y$ be such that $F(y) > 1/2$, and define $y_*$ by
$$F(y) = \Phi(y_*).$$
Then, for all $x > y$,
$$F'(x)\le\Psi\Big(\frac{xy_*}{y}\Big)\Big(\frac{xy_*}{y}(1 + 2\alpha) + 1\Big)\frac{y_*}{y}(1 + \alpha),$$
where
$$\alpha = \frac{y^2}{x(x-y)y_*^2}.$$

We can now prove the following lemma.

Lemma 6.13. There exists a constant $A\in(0,\infty)$, independent of $a$ and $b\ge 0$, such that
$$P\Big(\sup_{t\in T}f(t)\le b + a/b\,\Big|\,\sup_{t\in T}f(t) > b\Big)\le aA\,\frac{P(\sup_{t\in T}f(t)\ge b - 1/b)}{P(\sup_{t\in T}f(t)\ge b)}. \tag{6.15}$$

Proof. By subtracting $\inf_{t\in T}\mu(t) > -\infty$, and redefining the level $b$ to be $b - \inf_{t\in T}\mu(t)$, we may simply assume that $Ef(t)\ge 0$, so that we can apply Theorem 6.12. Adopting the notation of Theorem 6.12, first we pick $b_0$ large enough so that $F(b_0) > 1/2$, and assume that $b\ge b_0 + 1$. Now let $y = b - 1/b$ and $F(y) = \Phi(y_*)$. Note that there exists $\delta_0\in(0,\infty)$ such that $\delta_0 b\le y_*\le\delta_0^{-1}b$ for all $b\ge b_0$. This follows easily from the fact that
$$\log P\Big\{\sup_{t\in T}f(t) > x\Big\}\sim\log\sup_{t\in T}P\{f(t) > x\}\sim -\frac{x^2}{2\sigma_T^2}.$$

On the other hand, by Theorem 6.12, $F$ is continuously differentiable, and so
$$P\Big\{\sup_{t\in T}f(t) < b + a/b\,\Big|\,\sup_{t\in T}f(t) > b\Big\} = \frac{\int_b^{b+a/b}F'(x)\,dx}{P\{\sup_{t\in T}f(t) > b\}}. \tag{6.16}$$

Moreover,
$$F'(x)\le\Big(1 - \Phi\Big(\frac{xy_*}{y}\Big)\Big)\Big(\frac{xy_*}{y}(1 + 2\alpha(x)) + 1\Big)\frac{y_*}{y}(1 + \alpha(x))\le(1 - \Phi(y_*))\Big(\frac{xy_*}{y}(1 + 2\alpha(x)) + 1\Big)\frac{y_*}{y}(1 + \alpha(x)) = P\Big(\max_{t\in T}f(t) > b - \frac1b\Big)\Big(\frac{xy_*}{y}(1 + 2\alpha(x)) + 1\Big)\frac{y_*}{y}(1 + \alpha(x)).$$

Therefore,
$$\int_b^{b+a/b}F'(x)\,dx\le P\Big(\sup_{t\in T}f(t) > b - \frac1b\Big)\int_b^{b+a/b}\Big(\frac{xy_*}{y}(1 + 2\alpha(x)) + 1\Big)\frac{y_*}{y}(1 + \alpha(x))\,dx.$$
Recalling that $\alpha(x) = y^2/[x(x-y)y_*^2]$, we can use the fact that $y_*\ge\delta_0 b$ to conclude that, if $x\in[b, b + a/b]$, then $\alpha(x)\le\delta_0^{-2}$, and therefore
$$\int_b^{b+a/b}\Big(\frac{xy_*}{y}(1 + 2\alpha(x)) + 1\Big)\frac{y_*}{y}(1 + \alpha(x))\,dx\le 4\delta_0^{-8}a.$$
We thus obtain that
$$\frac{\int_b^{b+a/b}F'(x)\,dx}{P\{\sup_{t\in T}f(t) > b\}}\le 4a\delta_0^{-8}\,\frac{P\{\sup_{t\in T}f(t) > b - 1/b\}}{P\{\sup_{t\in T}f(t) > b\}} \tag{6.17}$$
for any $b\ge b_0$. This inequality, together with the fact that $F$ is continuously differentiable on $(-\infty,\infty)$, yields the proof of the lemma for $b\ge 0$. □

The previous result translates a question that involves the conditional distribution of $\max_{t\in T}f(t)$ near $b$ into a question involving the tail distribution of $\max_{t\in T}f(t)$. The next result then provides a bound on this tail distribution.

Lemma 6.14. For each $v > 0$, there exists a constant $C(v)\in(0,\infty)$ (possibly depending on $v > 0$, but otherwise independent of $b$) such that
$$P\Big(\max_{t\in T}f(t) > b\Big)\le C(v)\,b^{2d/\beta + dv + 1}\max_{t\in T}P(f(t) > b)$$
for all $b\ge 1$.

Proof. The proof of this result follows along the same lines as that of Theorem 2.6.2 in [5]. Consider an open cover $T = \cup_{i=1}^{N(\theta)}T_i(\theta)$, where $T_i(\theta) = \{s: |s - t_i| < \theta\}$. We choose the $t_i$ carefully, such that $N(\theta) = O(\theta^{-d})$ for $\theta$ arbitrarily small. Write $f(t) = g(t) + \mu(t)$, where $g(t)$ is a centered Gaussian random field, and note, using A2 and A3, that
$$P\Big(\max_{t\in T_i(\theta)}f(t) > b\Big)\le P\Big(\max_{t\in T_i(\theta)}g(t) > b - \mu(t_i) - \kappa_H\theta^\beta\Big).$$

Now we wish to apply the Borell–TIS inequality (Theorem 6.8), with $U = T_i(\theta)$, $f_0 = g$ and $d(s,t) = E^{1/2}([g(t) - g(s)]^2)$, which, as a consequence of A2 and A3, is bounded above by $C_0|t - s|^{\beta/2}$ for some $C_0\in(0,\infty)$. Thus, applying Theorem 6.7, we have that $E\max_{t\in T_i(\theta)}g(t)\le C_1\theta^{\beta/2}\log(1/\theta)$ for some $C_1\in(0,\infty)$. Consequently, the Borell–TIS inequality yields that there exists $C_2(v)\in(0,\infty)$ such that, for all $b$ sufficiently large and $\theta$ sufficiently small, we have
$$P\Big(\max_{t\in T_i(\theta)}g(t) > b - \mu(t_i) - \kappa_H\theta^\beta\Big)\le C_2(v)\exp\Big(-\frac{\big(b - \mu(t_i) - C_1\theta^{\beta/(2+\beta v)}\big)^2}{2\sigma_{T_i}^2}\Big),$$

where $\sigma_{T_i} = \max_{t\in T_i(\theta)}\sigma(t)$. Now select $v > 0$ small enough, and set $\theta^{\beta/(2+\beta v)} = b^{-1}$. Straightforward calculations yield that
$$P\Big(\max_{t\in T_i(\theta)}f(t) > b\Big)\le P\Big(\max_{t\in T_i(\theta)}g(t) > b - \mu(t_i) - \kappa_H\theta^\beta\Big)\le C_3(v)\max_{t\in T_i(\theta)}\exp\Big(-\frac{(b - \mu(t))^2}{2\sigma(t)^2}\Big)$$
for some $C_3(v)\in(0,\infty)$. Now recall the well known inequality (valid for $x > 0$)
$$\phi(x)\Big(\frac1x - \frac1{x^3}\Big)\le 1 - \Phi(x)\le\frac{\phi(x)}{x},$$
where $\phi = \Phi'$ is the standard Gaussian density. Using this inequality, it follows that $C_4(v)\in(0,\infty)$ can be chosen so that
$$\max_{t\in T_i(\theta)}\exp\Big(-\frac{(b - \mu(t))^2}{2\sigma(t)^2}\Big)\le C_4(v)\,b\max_{t\in T_i}P(f(t) > b)$$
for all $b\ge 1$. We then conclude that there exists $C(v)\in(0,\infty)$ such that
$$P\Big(\max_{t\in T}f(t) > b\Big)\le N(\theta)\,C_4(v)\,b\max_{t\in T}P(f(t) > b)\le C\theta^{-d}b\max_{t\in T}P(f(t) > b) = Cb^{2d/\beta + dv + 1}\max_{t\in T}P(f(t) > b),$$
giving the result. □

We can now complete the proof of Proposition 65

Proof of Proposition 65 The result is a straightforward corollary of the previoustwo lemmas By (615) in Lemma 613 and Lemma 614 there exists λ isin (0infin) forwhich

P

(maxtisinT

f(t) le b + ab | maxtisinT

f(t) gt b

)

le aAP (maxtisinT f (t) ge bminus 1b)

P (maxtisinT f (t) ge b)

le aCAb2dβ+dv+1 maxtisinT P (f (t) gt bminus 1b)P (maxtisinT f (t) ge b)

le aCAb2dβ+dv+1 maxtisinT P (f (t) gt bminus 1b)maxtisinT P (f (t) gt b)

le aλb2dβ+dv+1

The last two inequalities follow from the obvious bound

P

(maxtisinT

f (t) ge b

)ge max

tisinTP (f (t) gt b)

and standard properties of the Gaussian distribution This yields (612) from whichthe remainder of the proposition follows 2

MONTE CARLO FOR RANDOM FIELDS 27

7 Fine Tuning Twice Differentiable Homogeneous Fields In the pre-ceeding section we constructed a polynomial time algorithm based on a randomizeddiscretization scheme Our goal in this section is to illustrate how to take advantageof additional information to further improve the running time and the efficiency ofthe algorithm In order to illustrate our techniques we shall perform a more refinedanalysis in the setting of smooth and homogeneous fields and shall establish op-timality of the algorithm in a precise sense to described below Our assumptionsthroughout this section are B1ndashB2 of Section 3

Let C(sminus t) = Cov(f(s) f(t)) be the covariance function of f which we assumealso has mean zero Note that it is an immediate consequence of homogeneity anddifferentiability that partiC(0) = part3

ijkC(0) = 0We shall need the following definition

Definition 71 We call T = t1 tM sub T a θ-regular discretization of Tif and only if

mini 6=j

|ti minus tj | ge θ suptisinT

mini|ti minus t| le 2θ

Regularity ensures that points in the grid T are well separated Intuitively sincef is smooth having tight clusters of points translates to a waste of computingresources as a result of sampling highly correlated values of f Also note thatevery region containing a ball of radius 2θ has at least one representative in T Therefore T covers the domain T in an economical way One technical convenienceof θ-regularity is that for subsets A sube T that have positive Lebesgue measure (inparticular ellipsoids)

limMrarrinfin

(A cap T )M

=m(A)m(T )

where here and throughout the remainder of the section (A) denotes the cardi-nality of the set A

Let T = t1 tM be a θ-regular discretization of T and consider

X = (X1 XM )T ∆= (f (t1) f (tM ))T

We shall concentrate on estimating wM (b) = P (max1leileM Xi gt b) The next result(which we prove in Section 71) shows that if θ = εb then the relative bias is O (

radicε)

Proposition 72 Suppose f is a Gaussian random field satisfying conditionsB1 and B2 There exist c0 c1 b0 and ε0 such that for any finite εb-regulardiscretization T of T

(71) P (suptisinT

f(t) lt b| suptisinT

f(t) gt b) le c0

radicε and (T ) le c1m(T )εminusdbd

for all ε isin (0 ε0] and b gt b0

Note that the bound on the bias obtained for twice differentiable fields is muchsharper than that of the general Holder continuous fields given by (613) in Theorem

28 ADLER BLANCHET LIU

66 This is partly because the conditional distribution of the random field aroundlocal maxima is harder to describe in the Holder continuous than in the case oftwice differentiable fields In addition to the sharper description of the bias weshall also soon show in Theorem 74 that our choice of discretization is optimal in acetain sense Finally we point out that the bound of

radicε in the first term of (71) is

not optimal In fact there seems to be some room of improvement and we believethat a more careful analysis might yield a bound of the form c0ε

2We shall estimate wM (b) by using a slight variation of Algorithm 53 In partic-

ular since the Xirsquos are now identically distributed we redefine Q to be

(72) Q (X isin B) =Msum

i=1

1M

P [X isin B|Xi gt bminus 1b]

Our estimator then takes the form

(73) Lb =M times P (X1 gt bminus 1b)sumM

j=1 1 (Xj gt bminus 1b)1( max

1leileMXi gt b)

Clearly we have that EQ(Lb) = wM (b) (The reason for for subtracting the fcatorof 1b was explained in Section 61)

Algorithm 73 Given a number of replications n and an εb-regular dis-cretization T the algorithm is as follows

Step 1 Sample X(1) X(n) iid copies of X with distribution Q given by (72)Step 2 Compute and output

Ln =1n

nsum

i=1

L(i)b

where

L(i)b =

M times P (X1 gt bminus 1b)sumM

j=1 1(X(i)j gt bminus 1b)

1( max1leileM

X(i)j gt b)

Theorem 75 later guides the selection of n in order to achieve a prescribedrelative error In particular our analysis together with considerations from Section2 implies that choosing n = O

(εminus2δminus1

)suffices to achieve ε relative error with

probability at least 1minus δAlgorithm 73 improves on Algorithm 61 for Holder continuous fields in two

important ways The first aspect is that it is possible to obtain information on thesize of the relative bias of the estimator In Proposition 72 we saw that in orderto overcome bias due to discretization it suffices to take a discretization of sizeM = (T ) = Θ(bd) That this selection is also asymptotically optimal in the sensedescribed in the next result will be proven in Section 71

MONTE CARLO FOR RANDOM FIELDS 29

Theorem 74 Suppose f is a Gaussian random field satisfying conditions B1and B2 If θ isin (0 1) then as b rarrinfin

sup(T )lebθd

P (suptisinT

f(t) gt b| suptisinT

f(t) gt b) rarr 0

This result implies that the relative bias goes to 100 as b rarrinfin if one chooses adiscretization scheme of size O

(bθd

)with θ isin (0 1) Consequently d is the smallest

power of b that achieves any given bounded relative bias and so the suggestionabove of choosing M = O

(bd

)points for the discretization is in this sense optimal

The second aspect of improvement involves the variance In the case of Holdercontinuous fields the ratio of the second moment of the estimator and w (b)2 wasshown to be bounded by a quantity that is of order O(M2) In contrast in thecontext of smooth and homogeneous fields considered here the next result showsthat this ratio is bounded uniformly for b gt b0 and M = (T ) ge cbd That is thevariance remains strongly controlled

Theorem 75 Suppose f is a Gaussian random field satisfying conditions B1and B2 Then there exist constants c b0 and ε0 such that for any εb-regulardiscretization T of T we have

supbgtb0εisin[0ε0]

EQL2b

P 2 (suptisinT f (t) gt b)le sup

bgtb0εisin[0ε0]

EQL2b

P 2(sup

tisinTf (t) gt b

) le c

for some c isin (0infin)

The proof of this result is given in Section 72 The fact that the number ofreplications remains bounded in b is a consequence of the strong control on thevariance

Finally we note that the proof of Theorem 33 follows as a direct corollary of The-orem 75 together with Proposition 72 and our discussion in Section 2 Assumingthat placing each point in T takes no more than c units of computer time the totalcomplexity is according to the discussion in section Section 2 O

(nM3 + M

)=

O(εminus2δminus1M3 + M

) The contribution of the term M3 = O

(εminus6b3

)comes from

the complexity of applying Cholesky factorization and the term M = O(εminus2b)corresponds to the complexity of placing T

Remark 76 Condition B2 imposes a convexity assumption on the boundaryof T This assumption although convenient in the development of the proofs of The-orems 74 and 75 is not necessary The results can be generalized at the expenseof increasing the length and the burden in the technical development to the casein which T is a d-dimensional manifold satisfying the so-called Whitney conditions([4])

The remainder of this section is devoted to the proof of Proposition 72 Theorem74 and Theorem 75

30 ADLER BLANCHET LIU

71 Bias Control Proofs of Proposition 72 and Theorem 74 We start withsome useful lemmas for all of which we assume that Conditions B1 and B2 are sat-isfied We shall also assume that the global maximum of f over T is achieved withprobability one at a single point in T Additional conditions under which this willhappen can be found in [4] and require little more than the non-degeneracy of thejoint distribution of f and its first and second order derivatives Of these lemmasLemma 77 the proof of which we defer to Section 73 is of central importance

Lemma 77 For any a0 gt 0 there exists clowast δlowast b0 and δ0 (which depend onthe choice of a0) such that for any s isin T a isin (0 a0) δ isin (0 δ0) and b gt b0

P ( min|tminuss|ltδabminus1

f(t) lt b|f(s) = b + ab s isin L) le clowast exp(minusδlowast

δ2

)

where L is the set of local maxima of f

The next lemma is a direct corollary of Lemma 77

Lemma 78 Let tlowast be the point in T at which the global maximum of f isattained Then with the same choice of constants as in Lemma 77 for any a isin(0 a0) δ isin (0 δ0) and b gt b0 we have

P ( min|tminustlowast|ltδabminus1

f(t) lt b|f(tlowast) = b + ab) le 2clowast exp(minusδlowast

δ2

)

Proof Note that for any s isin T

P

(min

|t|leδabf(t) lt b|f(s) = b + ab s = tlowast

)

=P

(min|t|leδab f(t) lt b f(s) = b + ab s = tlowast

)

P (f(s) = b + ab s = tlowast)

le P(min|t|leδab f(t) lt b f(s) = b + ab s isin L)

P (f(s) = b + ab s = tlowast)

= P ( min|tminuss|ltδabminus1

f(t) lt b|f(s) = b + ab s isin L)P (f(s) = b + ab s isin L)P (f(s) = b + ab s = tlowast)

le clowast exp(minusδlowast

δ2

)P (f(s) = b + ab s isin L)P (f(s) = b + ab s = tlowast)

Appling this bound and the fact that

P (f(s) = b + ab s isin L)P (f(s) = b + ab s = tlowast)

rarr 1

as b rarrinfin the conclusion of the Lemma holds 2

MONTE CARLO FOR RANDOM FIELDS 31

The following lemma gives a bound on the density of suptisinT f(t) which will beused to control the size of overshoot beyond level b

Lemma 79 Let pflowast(x) be the density function of suptisinT f(t) Then there existsa constant cflowast and b0 such that

pflowast(x) le cflowastxd+1P (f(0) gt x)

for all x gt b0

Proof Recalling (13) let the continuous function pE(x) x isin R be defined bythe relationship

E(χ(t isin T f(t) ge b)) =int infin

b

pE(x) dx

where the left hand side is the expected value of the Euler-Poincare characteristicof Ab Then according to Theorem 810 in [7] there exists c and δ such that

|pE(x)minus pflowast(x)| lt cP (f(0) gt (1 + δ)x)

for all x gt 0 In addition thanks to the result of [4] which providesintinfin

bpE(x)dx in

closed form there exists c0 such that for all x gt 1

pE(x) lt c0xd+1P (f(0) gt x)

Hence there exists cflowast such that

pflowast(x) le c0xd+1P (f(0) gt x) + cP (f(0) gt (1 + δ)x) le cflowastx

d+1P (f(0) gt x)

for all x gt 1 2

The last ingredients required to provide the proof of Proposition 72 and Theorem74 are stated in the following well known Theorem adapted from Lemma 61 andTheorem 72 in [19] to the twice differentiable case

Theorem 710 There exists a constant H (depending on the covariance func-tion C such that

P (suptisinT

f(t) gt b) = (1 + o (1))Hm(T )bdP (f(0) gt b)(74)

as b rarrinfinSimilarly choose δ small enough so that [0 δ]d sub T and let ∆0 = [0 bminus1]d Then

there exists a constant H1 such that

P ( suptisin∆0

f(t) gt b) = (1 + o(1))H1P (f(0) gt b)(75)

as b rarrinfin

32 ADLER BLANCHET LIU

We now are ready to provide the proof of Proposition 72 and Theorem 74

Proof of Proposition 72 The fact that there exists c1 such that

(T ) le c1m(T )εminusdbd

is immediate from assumption B2 Therefore we proceed to provide a bound forthe relative bias We choose ε0 lt δ2

0 (δ0 is given in Lemmas 77 and 78) and notethat for any ε lt ε0

P (suptisinT

f(t) lt b| suptisinT

f(t) gt b)

le P (suptisinT

f(t) lt b + 2radic

εb| suptisinT

f(t) gt b) + P (suptisinT

f(t) lt b| suptisinT

f(t) gt b + 2radic

εb)

By (74) and Lemma 79 there exists c2 such that the first term above can boundedby

P (suptisinT

f(t) lt b + 2radic

εb| suptisinT

f(t) gt b) le c2

radicε

The second term can be bounded by

P (suptisinT

f(t) lt b| suptisinT

f(t) gt b + 2radic

εb)

le P ( sup|tminustlowast|lt2εbminus1

f(t) lt b| suptisinT

f(t) gt b + 2radic

ε)

le 2clowast exp(minusδlowastεminus1

)

Hence there exists c0 such that

P (suptisinT

f(t) lt b| suptisinT

f(t) gt b) le c2

radicε + 2clowasteminusδlowastε le c0

radicε

for all ε isin (0 ε0) 2

Proof of Theorem 74 We write θ = 1minus 3δ isin (0 1) First note that by (74)

P (supT

f(t) gt b + b2δminus1| supT

f(t) gt b) rarr 0

as b rarrinfin Let tlowast be the position of the global maximum of f in T According to theexact Slepian model in Section 73 and an argument similar to the proof of Lemma78

(76) P ( sup|tminustlowast|gtb2δminus1

f(t) gt b|b lt f(tlowast) le b + b2δminus1) rarr 0

as b rarrinfin Consequently

P ( sup|tminustlowast|gtb2δminus1

f(t) gt b| supT

f(t) gt b) rarr 0

MONTE CARLO FOR RANDOM FIELDS 33

LetB(T b2δminus1) = cup

tisinTB(t b2δminus1)

We have

P (supT

f(t) gt b| supT

f(t) gt b)

le P ( sup|tminustlowast|gtb2δminus1

f(t) gt b| supT

f(t) gt b)

+ P (supT

f(t) gt b sup|tminustlowast|gtb2δminus1

f(t) le b| supT

f(t) gt b)

le o(1) + P (tlowast isin B(T b2δminus1)| supT

f(t) gt b)

le o(1) + P ( supB(T b2δminus1)

f(t) gt b| supT

f(t) gt b)

Since (T ) le b(1minus3δ)d we can find a finite set T prime = tprime1 tprimel sub T and let ∆k =tprimek + [0 bminus1] such that l = O(b(1minusδ)d) and B(T b2δminus1) sub cupl

k=1∆k The choice ofl only depends on (T ) not the particular distribution of T Therefore applying(75)

sup(Tlebθd)

P ( supB(T b2δminus1)

f(t) gt b) le O(b(1minusδ)d)P (f(0) gt b)

This together with (74) yields

sup(Tlebθd)

P ( supB(T b2δminus1)

f(t) gt b| supT

f(t) gt b) le O(bminusδd) = o(1)

for b ge b0 which clearly implies the statement of the result 2

72 Variance Control Proof of Theorem 75 We proceed directly to the proofof Theorem 75

34 ADLER BLANCHET LIU

Proof of Theorem 75 Note that

EQL2b

P 2 (suptisinT f (t) gt b)

=E(Lb)

P 2 (suptisinT f (t) gt b)

=E

(MtimesP (X1gtbminus1b)sumn

j=11(Xjgtbminus 1

b )maxj Xj gt b

)

P 2 (suptisinT f (t) gt b)

=E

(MtimesP (X1gtbminus1b)sumn

j=11(Xjgtbminus 1

b )maxj Xj gt b suptisinT f (t) gt b

)

P 2 (suptisinT f (t) gt b)

=E

(MP (X1gtbminus1b)1(maxj Xjgtb)sumn

j=11(Xjgtbminus1b)

∣∣∣∣ suptisinT f (t) gt b

)

P (suptisinT f (t) gt b)

= E

(M1 (maxj Xj gt b)sumnj=1 1 (xj gt bminus 1b)

∣∣∣∣∣ suptisinT

f (t) gt b

)P (X1 gt bminus 1b)

P (suptisinT f (t) gt b)

The remainder of the proof involves showing that the last conditional expectationhere is of order O(bd) This together with (74) will yield the result Note that forany A (b ε) such that A (b ε) = Θ(M) uniformly over b and ε we can write

E

(M times 1 (maxj Xj gt b)sumM

j=1 1(Xj gt bminus 1

b

)∣∣∣∣∣ sup

tisinTf (t) gt b

)(77)

le E

M times 1

(sumMj=1 1 (Xj gt bminus 1b) ge M

A(bε)

)

sumMj=1 1

(xj gt bminus 1

b

)∣∣∣∣∣∣suptisinT

f (t) gt b

+ E

M times 1

(1 le sumM

j=1 1 (Xj gt bminus 1b) lt MA(bε)

)

sumMj=1 1

(xj gt bminus 1

b

)∣∣∣∣∣∣suptisinT

f (t) gt b

We shall select A(b ε) appropriately in order to bound the expectations above ByLemma 78 for any 4ε le δ le δ0 there exist constants cprime and cprimeprime isin (0infin) such that

clowast exp(minusδlowast

δ2

)ge P

(min

|tminustlowast|leδbf (t) lt bminus 1b

∣∣∣∣ suptisinT

f (t) gt b

)

ge P

Msum

j=1

1 (Xj gt bminus 1b) le cprimeδdεd

∣∣∣∣∣∣suptisinT

f (t) gt b

ge P

(MsumM

j=1 1(Xj gt bminus 1b)ge bdcprimeprime

δd

∣∣∣∣∣ suptisinT

f (t) gt b

)(78)

MONTE CARLO FOR RANDOM FIELDS 35

The first inequality is an application of Lemma 78 The second inequality is dueto the fact that for any ball B of radius 4ε or larger (T cap B) ge cprimedεminusd for somecprime gt 0 Inequality (78) implies that for all x such that bdcprimeprimeδd

0 lt x lt bdcprimeprime[4dεd]there exists δlowastlowast gt 0 such that

P

(MsumM

j=1 1(Xj gt bminus 1

b

)bdge x

∣∣∣∣∣ suptisinT

f (t) gt b

)le clowast exp

(minusδlowastlowastx2d

)

Now let A (b ε) = bdcprimeprime(4dεd) and observe that by the second result in (71) wehave A (b ε) = Θ (M) and moreover that there exists c3 such that the first termon the right hand side of (77) is bounded by

(79) E

M times 1

(sumMj=1 1 (Xj gt bminus 1b) ge M

A(bε)

)

sumMj=1 1 (Xj gt bminus 1b)

∣∣∣∣∣∣supT

f (t) gt b

le c3b

d

Now we turn to the second term on the right hand side of (77) We use the factthat MA(b ε) le cprimeprimeprime isin (0infin) (uniformly as b rarrinfin and ε rarr 0) There exist c4 andc5 such that for ε le δ0c4

E

M1

(1 le sumM

j=1 1(Xj gt bminus 1

b

)lt M

A(bε)

)

sumMj=1 1

(Xj gt bminus 1

b

)∣∣∣∣∣∣supT

f (t) gt b

(710)

le MP

nsum

j=1

1(

Xj gt bminus 1b

)lt cprimeprimeprime

∣∣∣∣∣∣supT

f (t) gt b

le MP

(min

|tminustlowast|lec4εbf (t) lt bminus 1b

∣∣∣∣ supT

f (t) gt b

)

le c1m(T )bdεminusd exp(minus δlowast

c24ε

2

)le c5b

d

The second inequality holds from the fact that ifsumM

j=1 1(Xj gt bminus 1

b

)is less than

cprimeprimeprime then the minimum of f in a ball around the local maximum and of radius c4εb

must be less than bminus 1b Otherwise there are more than cprimeprimeprime elements of T insidesuch a ball The last inequality is due to Lemma 78 and Theorem 72

Putting (79) and (710) together we obtain for all εb -regular discretizationswith ε lt ε0 = min(14 1c4)δ0

EQL2b

P (suptisinT f (t) gt b)le E

(M1 (maxj Xj gt b)sumM

j=1 1(Xj gt bminus 1

b

)∣∣∣∣∣ sup

tisinTf (t) gt b

)P (X1 gt bminus 1b)

P (suptisinT f (t) gt b)

le (c3 + c5)bdP (X1 gt bminus 1b)P (suptisinT f (t) gt b)

Applying now (74) and Proposition 72 we have that

P (suptisinT

f(t) gt b) ltP (sup

tisinTf(t) gt b)

1minus c0radic

ε

36 ADLER BLANCHET LIU

we have

supbgtb0εisin[0ε0]

EQL2b

P(sup

tisinTf (t) gt b

) lt infin

as required 2

73 A final proof Lemma 77 Without loss of generality we assume that thefield has mean zero and unit variance Before getting into the details of the proofof Lemma 77 we need a few additional lemmas for which we adopt the followingnotation Let Ci and Cij be the first and second order derivatives of C and definethe vectors

micro1(t) = (minusC1(t) minusCd(t))micro2(t) = vech ((Cij(t) i = 1 d j = i d))

Let f prime(0) and f primeprime(0) be the gradient and vector of second order derivatives of f at 0where f primeprime(0) is arranged in the same order as micro2(0) Furthermore let micro02 = microgt20 bea vector of second order spectral moments and micro22 a matrix of fourth order spectralmoments The vectors micro02 and micro22 are arranged so that

1 0 micro02

0 Λ 0micro20 0 micro22

is the covariance matrix of (f(0) f prime(0) f primeprime(0)) where Λ = (minusCij(0)) It then followsthat

micro2middot0 = micro22 minus micro20micro02

be the conditional variance of f primeprime(0) given f(0) The following lemma given in [5]provides a stochastic representation of the f given that it has a local maxima atlevel u at the origin

Lemma 711 Given that f has a local maximum with height u at zero (aninterior point of T ) the conditional field is equal in distribution to

(711) fu(t) uC (t)minusWuβgt (t) + g (t)

g(t) is a centered Gaussian random field with covariance function

γ(s t) = C(sminus t)minus (C(s) micro2(s))(

1 micro02

micro20 micro22

)minus1 (C(t)microgt2 (t)

)minus micro1(s)Λminus1microgt1 (t)

and Wu is a d(d+1)2 random vector independent of g(t) with density function

(712) ψu (w) prop |det (rlowast(w)minus uΛ) | exp(minus1

2wgtmicrominus1

2middot0w)1 (rlowast(w)minus uΛ isin N )

MONTE CARLO FOR RANDOM FIELDS 37

where rlowast(w) is a dtimesd symmetric matrix whose upper triangular elements consist ofthe components of w The set of negative definite matrices is denoted by N Finallyβ(t) is defined by

(α (t) β (t)) = (C (t) micro2 (t))(

1 micro02

micro20 micro22

)minus1

The following two technical lemmas which we shall prove after completing theproof of Lemma 77 provide bounds for the last two terms of (711)

Lemma 712 Using the notation in (711) there exist δ0 ε1 c1 and b0 suchthat for any u gt b gt b0 and δ isin (0 δ0)

P

(sup

|t|leδab

∣∣Wuβgt (t)∣∣ gt

a

4b

)le c1 exp

(minusε1b

2

δ4

)

Lemma 713 There exist c δ and δ0 such that for any δ isin (0 δ0)

P

(max|t|leδab

|g(t)| gt a

4b

)le c exp

(minus δ

δ2

)

Proof of Lemma 77 Using the notation of Lemma 711 given any s isin L forwhich f(s) = b we have that the corresponding conditional distribution of s is thatof fb(middot minus s) Consequently it suffices to show that

P

(min

|t|leδabfb(t) lt bminus ab

)le clowast exp

(minusδlowast

δ2

)

We consider firstly the case for which the local maximum is in the interior of T Then by the Slepian model (711)

fb (t) = bC (t)minusWbβgt (t) + g (t)

We study the three terms of the Slepian model individually Since

C(t) = 1minus tgtΛt + o(|t|2

)

there exists a ε0 such thatbC (t) ge bminus a

4b

38 ADLER BLANCHET LIU

for all |t| lt ε0radic

ab According to Lemmas 712 and 713 for δ lt min(ε0radic

a0 δ0)

P

(min

|t|leδabfu(t) lt bminus ab

)

le P

(max|t|ltδab

|g (t)| gt a

4b

)+ P

(sup

|t|leδab

∣∣Wbβgt(t)

∣∣ gta

4b

)

le c exp

δ2

)+ c1 exp

(minusε1b

2

δ4

)

le clowast exp(minusδlowast

δ2

)

for some clowast and δlowastNow consider the case for which the local maximum is in the (dminus1)-dimensional

boundary of T Due to convexity of T and without loss of generality we assumethat the tangent space of partT is generated by partpartt2 partparttd the local maximumis located at the origin and T is a subset of the positive half-plane t1 ge 0 Thatthese arguments to not involve a loss of generality follows from the arguments onpages 291ndash192 of [4] which rely on the assumed stationarity of f (for translations)and the fact that rotations while changing the distributions will not affect theprobabilities that we are currently computing)

For the origin positioned as just described to be a local maximum it is necessaryand sufficient that the gradient of f restricted to partT is the zero vector the Hessianmatrix restricted to partT is negative definite and part1f(0) le 0 Applying a version ofLemma 711 for this case conditional on f(0) = u and 0 being a local maximumthe field is equal in distribution to

(713) uC(t)minus Wuβgt(t) + micro1(t)Λminus1 (Z 0 0)T + g(t)

where Z le 0 corresponds to part1f (0) and it follows a truncated (conditional on thenegative axis) Gaussian random variable with mean zero and a variance parameterwhich is computed as the conditional variance of part1f (0) given (part2f (0) partdf (0))The vector Wu is a (d(d + 1)2)-dimensional random vector with density function

ψu(w) prop |det(rlowast(w)minus uΛ)| exp(minus12wgtmicrominus1

2middot0w)1(wlowast minus uΛ isin N )

where Λ is the second spectral moment of f restricted to partT and rlowast(w) is the(dminus 1)times (dminus 1) symmetric matrix whose upper triangular elements consist of thecomponents of w In the representation (713) the vector Wu and Z are independentAs in the proof of Lemma 712 one can show that a similar bound holds albeitwith with different constants Thus since micro1 (t) = O (t) there exist cprimeprime and δprimeprime suchthat the third term in (713) can be bounded by

P [ max|t|leδab

∣∣∣micro1 (t) Λminus1 (Z 0 0)T∣∣∣ ge a(4b)] le cprimeprime exp

(minusδprimeprime

δ2

)

MONTE CARLO FOR RANDOM FIELDS 39

Consequently we can also find clowast and δlowast such that the conclusion holds and weare done 2

We complete the paper with the proofs of Lemmas 712 and 713

Proof of Lemma 712 It suffices to prove the Lemma for the case a = 1 Since

fu (0) = u = uminusWuβgt (0) + g (0)

and Wu and g are independent β (0) = 0 Furthermore since C prime (t) = O (t) andmicroprime2 (t) = O (t) there exists a c0 such that |β(t)| le c0|t|2 In addition Wu has densityfunction proportional to

ψu(w) prop |det(rlowast(w)minus uΛ)| exp(minus12wgtmicrominus1

2middot0w)1 (wlowast minus uΛ isin N )

Note that det(rlowast(w) minus uΛ) is expressible as a polynomial in w and u and thereexists some ε0 and c such that

∣∣∣∣det (rlowast(w)minus uΛ)

det (minusuΛ)

∣∣∣∣ le c

if |w| le ε0u Hence there exist ε2 c2 gt 0 such that

ψu (w) le ψ (w) = c2 exp(minus1

2ε2w

gtmicrominus12middot0w

)

for all u ge 1 The right hand side here is proportional to a multivariate Gaussiandensity Thus

P (|Wu| gt x) =int

|w|gtx

ψu(w)dw leint

|w|gtx

ψu (w) dw = c3P (|W | gt x)

where W is a multivariate Gaussian random variable with density function propor-tional to ψ Therefore by choosing ε1 and c1 appropriately we have

P

(sup|t|leδb

∣∣Wuβgt∣∣ gt

14b

)le P

(|Wu| gt b

c20δ

2

)le c1 exp

(minusε1b

2

δ4

)

for all u ge b 2

Proof of Lemma 713 Once again it suffices to prove the Lemma for the casea = 1 Since

fb(0) = b = bminusWbβgt(0) + g(0)

the covariance function (γ (s t) s t isin T ) of the centered field g satisfies γ(0 0) = 0It is also easy to check that

partsγ(s t) = O(|s|+ |t|) parttγ(s t) = O(|s|+ |t|)

40 ADLER BLANCHET LIU

Consequently there exists a constant cγ isin (0infin) for which

γ(s t) le cγ(|s|2 + |t|2) γ(s s) le cγ |s|2

We need to control the tail probability of sup|t|leδb |g(t)| For this it is useful tointroduce the following scaling Define

gδ (t) =b

δg

(δt

b

)

Then sup|t|leδb g (t) ge 14b if and only if sup|t|le1 gδ (t) ge 1

4δ Let

σδ (s t) = E (gδ (s) gδ (t))

ThensupsisinR

σδ(s s) le cγ

Because γ(s t) is at least twice differentiable applying a Taylor expansion we easilysee that the canonical metric dg corresponding to gδ (s) (cf Theorem 67) can bebounde as follows

d2g(s t) = E (gδ (s)minus gδ (t))2 =

b2

δ2

(δs

bδs

b

)+ γ

(δt

bδt

b

)minus 2γ

(δs

bδt

b

)]

le c |sminus t|2

for some constant c isin (0infin) Therefore the entropy of gδ evaluated at δ is boundedby Kδminusd for any δ gt 0 and with an appropriate choice of K gt 0 Therefore for allδ lt δ0

P ( sup|t|leδb

|g(t)| ge 14b

) = P ( sup|t|le1

gδ (t) ge 14δ

) le cdδminusdminusη exp(minus 1

16cγδ2)

for some constant cd and η gt 0 The last inequality is a direct application of The-orem 411 of [4] The conclusion of the lemma follows immediately by choosing c

and δ appropriately 2

REFERENCES

[1] RJ Adler K Bartz and S Kou Estimating curvatures in the Gaussian kinematic formulaIn preparation

[2] RJ Adler J Blanchet and J Liu Efficient simulation for tail probabilities of Gaussianrandom fields In WSC rsquo08 Proceedings of the 40th Conference on Winter Simulationpages 328ndash336 Winter Simulation Conference 2008

[3] RJ Adler P Muller and Rozovskii B editors Stochastic Modelling in Physical Oceanog-raphy Birkhauser Boston 1996

[4] RJ Adler and JE Taylor Random Fields and Geometry Springer New York USA 2007

[5] RJ Adler JE Taylor and K Worsley Applications of Random Fields and GeometryFoundations and Case Studies httpwebeetechnionacilpeopleadlerhrfpdf

MONTE CARLO FOR RANDOM FIELDS 41

[6] S Asmussen and P Glynn Stochastic Simulation Algorithms and Analysis Springer-VerlagNew York NY USA 2008

[7] JM Azais and M Wschebor Level Sets and Extrema of Random Processes and FieldsWiley New York NY 2009

[8] JM Bardeen JR Bond N Kaiser and AS Szalay The statistics of peaks of Gaussianrandom fields The Astrophysical Journal 30415ndash61 1986

[9] SM Berman Sojourns and Extremes of Stochastic Processes The Wadsworth ampBrooksCole StatisticsProbability Series Wadsworth amp BrooksCole Advanced Books ampSoftware Pacific Grove CA 1992

[10] J Blanchet Efficient importance sampling for binary contingency tables Ann of ApplProb 19949ndash982 2009

[11] C Borell The Brunn-Minkowski inequality in Gauss space Invent Math 30205ndash216 1975

[12] J Bucklew Introduction to Rare-Event Simulation Springer-Verlag New York NY 2004

[13] MR Dennis Nodal densities of planar Gaussian random waves Eur Phys J 145191ndash2102007

[14] RM Dudley Sample functions of the Gaussian process Ann Probab 166ndash103 1973

[15] KJ Friston KJ Worsley RSJ Frackowiak JC Mazziotta and AC Evans Assessing thesignificance of focal activations using their spatial extent Human Brain Mapping 1214ndash2201994

[16] MR Leadbetter G Lindgren and H Rootzen Extremes and Related Properties of RandomSequences and Processes Springer-Verlag New York 1983

[17] M Mitzenmacher and E Upfal Probability and Computing Randomized Algorithms andProbabilistic Analysis Cambridge University Press 2005

[18] W Niemiro and P Pokarowski Fixed precision MCMC estimation by median of productsof averages J Appl Probab 46309ndash329 2009

[19] V Piterbarg Asymptotic Methods in the Theory of Gaussian Processes American Mathe-matical Society USA 1995

[20] SF Shandarin Testing non-Gaussianity in cosmic microwave background maps by morpho-logical statistics Mon Not R Astr Soc 331865ndash874 2002

[21] SF Shandarin HA Feldman Y Xu and M Tegmark Morphological measures of non-gaussianity in cosmic microwave background maps Astrophys J Suppl 1411ndash11 2002

[22] L Shepp and H Landau On the supremum of a Gaussian process Sankhya A 32369ndash378December 1970

[23] JE Taylor A Takemura and RJ Adler Validity of the expected Euler characteristicheuristic Ann Probab 33(4)1362ndash1396 2005

[24] JE Taylor and KJ Worsley Detecting sparse signals in random fields with an applicationto brain mapping J Amer Statist Assoc 102(479)913ndash928 2007

[25] J Traub G Wasilokowski and H Wozniakowski Information-Based Complexity AcademicPress New York NY 1988

[26] BS Tsirelson IA Ibragimov and VN Sudakov Norms of Gaussian sample functionsProceedings of the Third Japan-USSR Symposium on Probability Theory (Tashkent 1975)55020ndash41 1976

[27] V S Tsirelrsquoson The density of the maximum of a gaussian process Theory of Probab Appl20847ndash856 1975

[28] H Wozniakowski Computational complexity of continuous problems 1996 Technical Re-port

Robert J AdlerElectrical EngineeringTechnion Haifa Israel 32000E-mail roberteetechnionacilURL webeetechnionacilpeopleadler

Jose H BlanchetIndustrial Engineering amp Operations ResearchColumbia Uninversity340 SW Mudd Building500 W 120 StreetNew York New York 10027E-mail joseblanchetcolumbiaeduURL wwwieorcolumbiaedufac-biosblanchetfacultyhtml

42 ADLER BLANCHET LIU

Jingchen LiuDepartment of StatisticsColumbia University1255 Amsterdam Avenue Room 1030New York NY 10027E-mail jcliustatcolumbiaeduURL statcolumbiaedu~jcliu

Page 2: EFFICIENT MONTE CARLO FOR HIGH EXCURSIONS OF … › ~jblanche › papers › GRF1.pdf · 1. Introduction. This paper centers on the design and analysis of e–cient Monte Carlo techniques

2 ADLER BLANCHET LIU

distribution of f given that maxT f(t) gt b Moreover once an efficient (in a pre-cise mathematical sense described in Section 2) importance sampling procedure isin place it follows under mild regularity conditions on Γ that an efficient estimatorfor (12) is immediately obtained by exploiting an efficient estimator for (11)

The need for an efficient estimator of ω(b) should be reasonably clear Supposeone could simulate

flowast suptisinT

f(t)

exactly (ie without bias) via naıve Monte Carlo Such an approach would typicallyrequire a number of replications of flowast which would be exponential in b2 to obtainan accurate estimate (in relative terms) Indeed since in great generality (see [22])w(b) = exp(minuscb2 + o(b2)) as b rarrinfin for some c isin (0infin) it follows that the averageof n iid Bernoulli trials each with success parameter w(b) estimates w(b) witha relative mean squared error equal to nminus12(1 minus w(b))12w(b)12 To control thesize of the error therefore requires1 n = Ω(w(b)minus1) which is typically prohibitivelylarge In addition there is also a problem in that typically flowast cannot be simulatedexactly and that some discretization of f is required

Our goal is to introduce and analyze simulation estimators that can be appliedto a general class of Gaussian fields and that can be shown to require at most apolynomial number of function evaluations in b to obtain estimates with a prescribedrelative error The model of computation that we use to count function evaluationsand the precise definition of an algorithm with polynomial complexity is given inSection 2 Our proposed estimators are in particular asymptotically optimal (Thisproperty which is a popular notion in the context of rare-event simulation (cf[6 12]) essentially requires that the second moments of estimators decay at the sameexponential rate as the square of the first moments) The polynomial complexityof our estimators requires to assume no more than that the underlying Gaussianfield is Holder continuous (see Theorem 31 in Section 3) Therefore our methodsprovide means for efficiently computing probabilities and expectations associatedwith high excursions of Gaussian random fields in wide generality

In the presence of enough smoothness we shall also show how to design estima-tors that can be shown to be strongly efficient in the sense that their associatedcoefficient of variation remains uniformly bounded as b rarrinfin Moreover the associ-ated path generation of the conditional field (given a high excursion) can basicallybe carried out with the same computational complexity as the unconditional sam-pling procedure (uniformly in b) This is Theorem 33 in Section 3

High excursions of Gaussian random fields appear in wide number of applicationsincluding but not limited to

bull Physical oceanography Here the random field can be water pressure or surfacetemperature See [3] for many examples

bull Cosmology This includes the analysis of COBE and WMAP microwave dataon a sphere or galactic density data eg [8 20 21]

1Given h and g positive we shall use the familiar asymptotic notation h(x) = O(g(x)) if thereis c lt infin such that h(x) le cg(x) for all x large enough h(x) = Ω(g(x)) if h(x) ge cg(x) if x issufficiently large and h(x) = o(g(x)) as x rarr infin if h(x)g(x) rarr 0 as x rarr infin and h(x) = Θ(g(x))if h(x) = O(g(x)) and h(x) = Ω(g(x))

MONTE CARLO FOR RANDOM FIELDS 3

bull Quantum chaos Here random planar waves replace deterministic (but unob-tainable) solutions of Schrodinger equations eg the recent review [13]

bull Brain mapping This application is the most developed and very widely usedFor example the paper by Friston et al [15] that introduced random fieldmethodology to the brain imaging community has at the time of writingover 4500 (Google) citations

Many of these applications deal with twice differentiable constant variance ran-dom fields or random fields that have been normalized to have constant variancethe reason being that they require estimates of the tail probabilities (11) and theseare only really well known in the smooth constant variance case In particular itis known that with enough smoothness assumptions

(13) lim infbrarrinfin

minusbminus2 log∣∣P (sup

tisinTf(t) ge b)minus E(χ(t isin T f(t) ge b))∣∣ ge 1

2+

12σ2

c

where χ(A) is the Euler characteristic of the set A and the term σ2c is related

to a geometric quantity known as the critical radius of T and depends on thecovariance structure of f See [4 23] for details The expectation in (13) has anexplicit form that is readily computed for Gaussian and Gaussian-related randomfields of constant variance (see [4 5] for details) although if T is geometricallycomplicated or the covariance of f highly non-stationary there can be terms in theexpectation that can only be evaluated numerically or estimated statistically (eg[1 24]) Nevertheless when available (13) provides excellent approximations andsimulation studies have shown that the approximations are numerically useful forquite moderate b of the order of 2 standard deviations

However as we have already noted (13) holds only for constant variance fieldswhich also need to be twice differentiable In the case of less smooth f other classesof results occur in which the expansions are less reliable and in addition typicallyinvolve the unknown Pickandsrsquo constants (cf [7 19])

These are some of the reasons why despite a well developed theory Monte-Carlotechniques still have a significant role to play in understanding the behavior of Gaus-sian random fields at high levels The estimators proposed in this paper basicallyreduce the rare event calculations associated to high excursions in Gaussian randomfields to calculations that are roughly comparable to the evaluation of expectationsor integrals in which no rare event is involved In other words the computationalcomplexity required to implement the estimators discussed here is similar in somesense to the complexity required to evaluate a given integral in finite dimension oran expectation where no tail parameter is involved To the best of our knowledgethese types of reductions have not been studied in the development of numericalmethods for high excursions of random fields This feature distinguishes the presentwork from the application of other numerical techniques that are generic (such asquasi-Monte Carlo and other numerical integration routines) and that in particularmight be also applicable to the setting of Gaussian fields

Contrary to our methods which are designed to have provably good performanceuniformly over the level of excursion a generic numerical approximation proceduresuch as quasi-Monte Carlo will typically require an exponential increase in the

4 ADLER BLANCHET LIU

number of function evaluations in order to maintain a prescribed level of relativeaccuracy This phenomenon is unrelated to the setting of Gaussian random fields Inparticular it can be easily seen to happen even when evaluating a one dimensionalintegral with a small integrand On the other hand we believe that our estimatorscan in practice be easily combined with quasi-Monte Carlo or other numericalintegration methods The rigorous analysis of such hybrid approaches although ofgreat interest requires an extensive treatment and will be pursued in the futureAs an aside we note that quasi-Monte Carlo techniques have been used in theexcursion analysis of Gaussian random fields in [7]

The remainder of the paper is organized as follows In Section 2 we introducethe basic notions of polynomial algorithmic complexity which are borrowed fromthe general theory of computation Section 3 discusses the main results in light ofthe complexity considerations of Section 2 Section 4 provides a brief introductionto importance sampling a simulation technique that we shall use heavily in thedesign of our algorithms The analysis of finite fields which is given in Section 5is helpful to develop the basic intuition behind our procedures for the general caseSection 6 provides the construction and analysis of a polynomial time algorithm forhigh excursion probabilities of Holder continuous fields Finally in Section 7 weadd additional smoothness assumptions along with stationarity and explain how tofine tune the construction of our procedures in order to further improve efficiencyin these cases

2 Basic Notions of Computational Complexity In this section we shalldiscuss some general notions of efficiency and computational complexity related tothe approximation of the probability of the rare events Bb b ge b0 for whichP (Bb) rarr 0 as b rarrinfin In essence efficiency means that computational complexityis in some sense controllable uniformly in b A notion that is popular in theefficiency analysis of Monte Carlo methods for rare events is weak efficiency (alsoknown as asymptotic optimality) which requires that the coefficient of variation ofa given estimator Lb of P (Bb) to be dominated by 1P (Bb)

ε for any ε gt 0 Moreformally we have

Definition 21 A family of estimators Lb b ge b0 is said to be polynomiallyefficient for estimating P (Bb) if E(Lb) = P (Bb) and there exists a q isin (0infin) forwhich

(21) supbgeb0

V ar(Lb)[P (Bb)]

2 |log P (Bb)|qlt infin

Moreover if (21) holds with q = 0 then the family is said to be strongly efficient

Below we often refer to Lb as a strongly (polynomially) efficient estimator bywhich we mean that the family Lb b gt 0 is strongly (polynomially) efficientIn order to understand the nature of this definition let L(j)

b 1 le j le n be acollection of iid copies of Lb The averaged estimator

Ln (b) =1n

nsum

j=1

L(j)b

MONTE CARLO FOR RANDOM FIELDS 5

has a relative mean squared error equal to [V ar(Lb)]12[n12P (Bb)] A simpleconsequence of Chebyshevrsquos inequality is that

P(∣∣∣Ln (b) P (Bb)minus 1

∣∣∣ ge ε)le V ar (Lb)

ε2nP [(Bb)]2

Thus if Lb is polynomially efficient and we wish to compute P (Bb) with at mostε relative error and at least 1minus δ confidence it suffices to simulate

n = Θ(εminus2δminus1 |log P (Bb)|q)

iid replications of Lb In fact in the presence of polynomial efficiency the boundn = Θ

(εminus2δminus1 |log P (Bb)|q

)can be boosted to n = Θ

(εminus2 log(δminus1) |log P (Bb)

∣∣q)using the so-called median trick (see [18])

Naturally the cost per replication must also be considered in the analysis andwe shall do so but the idea is that evaluating P (Bb) via crude Monte Carlo wouldrequire given ε and δ n = Θ (1P (Bb)) replications Thus a polynomially efficientlyestimator makes the evaluation of P (Bb) exponentially faster relative to crudeMonte Carlo at least in terms of the number of replications

Note that a direct application of deterministic algorithms (such as quasi MonteCarlo or quadrature integration rules) might improve (under appropriate smooth-ness assumptions) the computational complexity relative to Monte Carlo althoughonly by a polynomial rate (ie the absolute error decreases to zero at rate nminusp forp gt 12 where n is the number of function evaluations and p depends on the di-mension of the function that one is integrating see for instance [6]) We believe thatthe procedures that we develop in this paper can guide the construction of efficientdeterministic algorithms with small relative error and with complexity that scalesat a polynomial rate in |log P (Bb)| This is an interesting research topic that weplan to explore in the future

An issue that we shall face in designing our Monte Carlo procedure is that dueto the fact that f will have to be discretized the corresponding estimator Lb willnot be unbiased In turn in order to control the relative bias with an effort that iscomparable to the bound on the number of replications discussed in the preceedingparagraph one must verify that the relative bias can be reduced to an amountless than ε with probability at least 1 minus δ at a computational cost of the formO(εminusq0 |log P (Bb)|q1) If Lb(ε) can be generated with O(εminusq0 |log P (Bb)|q1) costand satisfying

∣∣∣P (Bb)minus ELb(ε)∣∣∣ le εP (Bb) and if

supbgt0

V ar(Lb(ε))P (Bb)

2 |log P (Bb)|qlt infin

for some q isin (0infin) then Lprimen (b ε) =sumn

j=1 L(j)b (ε)n where the L

(j)b (ε)rsquos are iid

copies of Lb(ε) satisfies

P (|Lprimen (b ε) P (Bb)minus 1| ge 2ε) le V ar(Lb(ε))ε2 times ntimes P (Bb)

2

6 ADLER BLANCHET LIU

Consequently taking n = Θ(εminus2δminus1 |log P (Bb)|q

)suffices to give an estimator with

at most ε relative error and 1 minus δ confidence and the total computational cost isΘ

(εminus2minusq0δminus1 |log P (Bb)|q+q1

)

We shall measure computational cost in terms of function evaluations such as asingle addition a multiplication a comparison the generation of a single uniformrandom variable on T the generation of a single standard Gaussian random variableand the evaluation of Φ(x) for fixed x ge 0 where

Φ(x) = 1minusΨ(x) =1radic2π

int x

minusinfineminuss22 ds

All of these function evaluations are assumed to cost at most a fixed amount c ofcomputer time Moreover we shall also assume that first and second order momentcharacteristics of the field such as micro (t) = Ef (t) and C (s t) = Cov (f (t) f (s))can be computed in at most c units of computer time for each s t isin T We note thatsimilar models of computation are often used in the complexity theory of continuousproblems see [25]

The previous discussion motivates the next definition which has its roots in thegeneral theory of computation in both continuous and discrete settings [17 25]In particular completely analogous notions in the setting of complexity theory ofcontinuous problems lead to the notion of ldquotractabilityrdquo of a computational problem[28]

Definition 22 A Monte Carlo procedure is said to be a fully polynomialrandomized approximation scheme (FPRAS) for estimating P (Bb) if for someq q1 q2 isin [0infin) it outputs an averaged estimator that is guaranteed to have at mostε gt 0 relative error with confidence at least 1minusδ isin (0 1) in Θ(εminusq1δminusq2 | log P (Bb) |q)function evaluations

The terminology adopted namely FPRAS is borrowed from the complexitytheory of randomized algorithms for counting [17] Many counting problems canbe expressed as rare event estimation problems In such cases it typically occursthat the previous definition (expressed in terms of a rare event probability) coincidesprecisely with the standard counting definition of a FPRAS (in which there is noreference to any rare event to estimate) This connection is noted for instance in[10] Our terminology is motivated precisely by this connection

By letting Bb = flowast gt b the goal in this paper is to design a class of fully poly-nomial randomized approximation schemes that are applicable to a general class ofGaussian random fields In turn since our Monte Carlo estimators will be based onimportance sampling it turns out that we shall also be able to straightforwardlyconstruct FPRASrsquos to estimate quantities such as E[Γ(f)| suptisinT f(t) gt b] for asuitable class of functionals Γ for which Γ (f) can be computed with an error ofat most ε with a cost that is polynomial as function of εminus1 We shall discuss thisobservation in Section 4 which deals with properties of importance sampling

MONTE CARLO FOR RANDOM FIELDS 7

3 Main Results In order to state and discuss our main results we need somenotation For each s t isin T define

micro(t) = E(f(t)) C (s t) = Cov(f (s) f (t)) σ2(t) = C(t t) gt 0 r(s t) =C(s t)

σ(s)σ(t)

Moreover given x isin Rd and β gt 0 we write |x| =sumd

j=1 |xj | where xj is the j-thcomponent of x We shall assume that for each fixed s t isin T both micro (t) and C (s t)can be evaluated in at most c units of computing time

Our first result shows that under modest continuity conditions on micro σ and r itis possible to construct an explicit FPRAS for w(b) under the following regularityconditions

A1 The field f is almost surely continuous on T A2 For some δ gt 0 and |sminus t| lt δ the mean and variance functions satisfies

|σ (t)minus σ (s)|+ |micro (t)minus micro (s)| le κH |sminus t|β

A3 For some δ gt 0 and |s minus sprime| lt δ |t minus tprime| lt δ the correlation function of fsatisfies

|r (t s)minus r (tprime sprime)| le κH [|tminus tprime|β + |sminus sprime|β ]

A4 0 isin T There exist κ0 and ωd such that for any t isin T and ε small enough

m(B(t ε) cap T ) ge κ0εdωd

where m is the Lebesgue measure B(t ε) = s |tminus s| le ε and ωd =m (B(0 1))

The assumption that 0 isin T is of no real consequence and is adopted only fornotational convenience

Theorem 31 Suppose that f T rarr R is a Gaussian random field satisfyingConditions A1ndashA4 above Then Algorithm 61 provides a FPRAS for w (b)

The polynomial rate of the intrinsic complexity bound inherent in this result isdiscussed in Section 6 along with similar rates in results to follow The conditions ofthe previous theorem are weak and hold for virtually all applied settings involvingcontinuous Gaussian fields on compact sets

Not surprisingly the complexity bounds of our algorithms can be improved uponunder additional assumptions on f For example in the case of finite fields (ie whenT is finite) with a non-singular covariance matrix we can show that the complexityof the algorithm is actually bounded as b infin We summarize this in the nextresult whose proof which is given in Section 5 is useful for understanding themain ideas behind the general procedure

Theorem 32 Suppose that T is a finite set and f has a non-singular covari-ance matrix over T times T Then Algorithm 53 provides a FPRAS with q = 0

8 ADLER BLANCHET LIU

As we indicated above the strategy behind the discrete case provides the basisfor the general case In the general situation the underlying idea is to discretize thefield with an appropriate sampling (discretization) rule that depends on the levelb and the continuity characteristics of the field The number of sampling pointsgrows as b increases and the complexity of the algorithm is controlled by findinga good sampling rule There is a trade off between the number of points sampledwhich has a direct impact on the complexity of the algorithm and the fidelityof the discrete approximation to the continuous field Naturally in the presence ofenough smoothness and regularity more information can be obtained with the samesample size This point is illustrated in the next result Theorem 33 which considerssmooth homogeneous fields Note that in addition to controlling the error inducedby discretizing the field the variance is strongly controlled and the discretizationrule is optimal in a sense explained in Section 7 For Theorem 33 we require thefollowing additional regularity conditions

B1 f is homogeneous and almost surely twice continuously differentiableB2 0 isin T sub Rd is a d-dimensional convex set with nonempty interior Denoting

its boundary by partT assume that partT is a (dminus1)-dimensional manifold withoutboundary For any t isin T assume that there exists κ0 gt 0 such that

m(B(t ε) cap T ) ge κ0εd

for any ε lt 1 where m is Lebesgue measure

Theorem 33 Let f satisfy conditions B1ndashB2 Then Algorithm 73 provides aFPRAS Moreover the underlying estimator is strongly efficient and there exists adiscretization scheme for f which is optimal in the sense of Theorem 74

4 Importance Sampling Importance sampling is based on the basic iden-tity for fixed measurable B

(41) P (B) =int

1 (ω isin B) dP (ω) =int

1 (ω isin B)dP

dQ(ω) dQ (ω)

where we assume that the probability measure Q is such that Q (middot capB) is abso-lutely continuous with respect to the measure P (middot capB) If we use EQ to denoteexpectation under Q then (41) trivially yields that the random variable

L (ω) = 1 (ω isin B)dP

dQ(ω)

is an unbiased estimator for P (B) gt 0 under the measure Q or symbolicallyEQL = P (B)

An averaged importance sampling estimator based on the measure Q which isoften referred as an importance sampling distribution or a change-of-measure isobtained by simulating n iid copies L(1) L(n) of L under Q and computing theempirical average Ln = (L(1) + +L(n))n A central issue is that of selecting Q inorder to minimize the variance of Ln It is easy to verify that if Qlowast(middot) = P (middot|B) =

MONTE CARLO FOR RANDOM FIELDS 9

P (middot capB)P (B) then the corresponding estimator has zero variance However Qlowast isclearly a change of measure that is of no practical value since P (B) ndash the quantitythat we are attempting to evaluate in the first place ndash is unknown Neverthelesswhen constructing a good importance sampling distribution for a family of setsBb b ge b0 for which 0 lt P (Bb) rarr 0 as b rarrinfin it is often useful to analyze theasymptotic behavior of Qlowast as P (Bb) rarr 0 in order to guide the construction of agood Q

We now describe briefly how an efficient importance sampling estimator forP (Bb) can also be used to estimate a large class of conditional expectations givenBb Suppose that a single replication of the corresponding importance samplingestimator

Lb∆= 1 (ω isin Bb) dPdQ

can be generated in O (log |P (Bb)|q0) function evaluations for some q0 gt 0 andthat

V ar (Lb) = O([P (Bb)]2 log |P (Bb)|q0

)

These assumptions imply that by taking the average of iid replications of Lb weobtain a FPRAS

Fix β isin (0infin) and let X (β q) be the class of random variables X satisfying0 le X le β with

E[X|Bb] = Ω[1 log(P (Bb))q](42)

Then by noting that

(43)EQ (XLb)EQ (Lb)

= E[X|Bb] =E[X Bb]P (Bb)

it follows easily that a FPRAS can be obtained by constructing the natural estima-tor for E[X|Bb] ie the ratio of the corresponding averaged importance samplingestimators suggested by the ratio in the left of (43) Of course when X is difficultto simulate exactly one must assume the bias E[X Bb] can be reduced in poly-nomial time The estimator is naturally biased but the discussion on FPRAS onbiased estimators given in Section 2 can be directly applied

In the context of Gaussian random fields we have that Bb = flowast gt b andone is very often interested in random variables X of the form X = Γ (f) whereΓ C (T ) rarr R and C(T ) denotes the space of continuous functions on T EndowingC(T ) with the uniform topology consider functions Γ that are non-negative andbounded by a positive constant An archetypical example is given by the volumeof (conditioned) high level excursion sets with β = m(T ) is known to satisfy (42)However there are many other examples of X (β q) with β = m(T ) which satisfy(42) for a suitable q depending on the regularity properties of the field In fact ifthe mean and covariance properties of f are Holder continuous then using similartechniques as those given in the arguements of Section 6 it is not difficult to see

10 ADLER BLANCHET LIU

that q can be estimated In turn we have that a FPRAS based importance sam-pling algorithm for w (b) would typically also yield a polynomial time algorithm forfunctional characteristics of the conditional field given high level excursions Sincethis is a very important and novel application we devote the remainder of thispaper to the development of efficient importance sampling algorithms for w(b)

5 The Basic Strategy Finite Fields In this section we develop our mainideas in the setting in which T is a finite set of the form T = t1 tM Toemphasize the discrete nature of our algorithms in this section we write Xi = f (ti)for 1 M and set X = (X1 XM ) This section is mainly of an expositorynature since much of it has already appeared in [2] Nevertheless it is includedhere as a useful guide to the intuition behind the algorithms for the continuouscase

We have already noted that in order to design an efficient importance samplingestimator for w (b) = P (max1leileM Xi gt b) it is useful to study the asymptoticconditional distribution of X given that max1leileM Xi gt b Thus we begin withsome basic large deviation results

Proposition 51 For any set of random variables X1 XM

max1leileM

P (Xi gt b) le P

(max

1leileMXi gt b

)le

Msum

j=1

P (Xj gt b)

Moreover if the Xj are mean zero Gaussian and the correlation between Xi andXj is strictly less than 1 then

P (Xi gt bXj gt b) = o (max[P (Xi gt b) P (Xj gt b)])

Thus if the associated covariance matrix of X is non-singular

w (b) = (1 + o (1))Msum

j=1

P (Xj gt b)

Proof The lower bound in the first display is trivial and the upper bound followsby the union bound The second display follows easily by working with the jointdensity of a bivariate Gaussian distribution (eg [9 16]) and the third claim is adirect consequence of the inclusion-exclusion principle 2

As noted above Qlowast corresponds to the conditional distribution of X given thatXlowast max1leileM Xi gt b It follows from Proposition 51 that conditional on Xlowast gtb the probability that two or more Xj exceed b is negligible Moreover it alsofollows that

P (Xi = Xlowast|Xlowast gt b) = (1 + o (1))P (Xi gt b)sumM

j=1 P (Xj gt b)

The following corollary now follows as an easy consequence of these observations

MONTE CARLO FOR RANDOM FIELDS 11

Corollary 52dTV (QlowastQ) rarr 0

as b rarrinfin where dTV denotes the total variation norm and Q is defined for BorelB sub RM as

Q (X isin B) =Msum

i=1

p (i b)P [X isin B|Xi gt b]

where

p (i b) =P (Xi gt b)sumM

j=1 P (Xj gt b)

Proof Pick an arbitrary Borel B Then we have that

Qlowast (X isin B) =P [X isin B max1leileM Xi gt b]

w (b)le

Msum

i=1

P [X isin B Xi gt b]w (b)

=Msum

i=1

P [X isin B|Xi gt b]p (i b)

(1 + o (1))

The above which follows from the union bound and the last part of Proposition51 combined with the definition of Q yields that for each ε gt 0 there exists b0

(independent of B) such that for all b ge b0

Qlowast (X isin B) le Q (X isin B) (1minus ε)

The lower bound follows similarly using the inclusion-exclusion principle and thesecond part of Proposition 51 2

Corollary 52 provides support for choosing Q as an importance sampling dis-tribution Of course we still have to verify that the corresponding algorithm is aFPRAS The importance sampling estimator induced by Q takes the form

(51) Lb =dP

dQ=

sumMj=1 P (Xj gt b)

sumMj=1 1 (Xj gt b)

Note that under Q we have that Xlowast gt b almost surely so the denominator in theexpression for Lb is at least 1 Therefore we have that

EQL2b le

Msum

j=1

P (Xj gt b)

2

and by virtue of Proposition 51 we conclude (using V arQ to denote the varianceunder Q) that

V arQ (Lb)P (Xlowast gt b)2

rarr 0

12 ADLER BLANCHET LIU

as b rarrinfin In particular it follows that Lb is strongly efficientOur proposed algorithm can now be summarized as follows

Algorithm 53 There are two steps in the algorithm

Step 1 Simulate n iid copies X(1) X(n) of X from the distribution QStep 2 Compute and output

Ln =1n

nsum

i=1

L(i)b

where L(i)b =

sumMj=1 P (X(i)

j gt b)sumM

j=1 1(X(i)j gt b)

Since the generation of Li under Q takes O(M3

)function evaluations we con-

clude based on the analysis given in Section 2 that Algorithm 53 is a FPRASwith q = 0 This implies Theorem 32 as promised

6 A FPRAS for Holder Continuous Gaussian Fields In this sectionwe shall describe the algorithm and the analysis behind Theorem 31 Throughoutthe section unless stated otherwise we assume conditions A1ndashA4 of Section 3

There are two issues related to the complexity analysis Firstly since f is assumedcontinuous the entire field cannot be generated in a (discrete) computer and so thealgorithm used in the discrete case needs adaptation Once this is done we need tocarry out an appropriate variance analysis

Developing an estimator directly applicable to the continuous field will be car-ried out in Section 61 This construction will not only be useful when studyingthe performance of a suitable discretization but will also help to explain some ofthe features of our discrete construction Then in Section 62 we introduce a dis-cretization approach and study the bias caused by the discretization In additionwe provide bounds on the variance of this discrete importance sampling estimator

61 A Continuous Estimator We start with a change of measure motivatedby the discrete case in Section 5 A natural approach is to consider an importancesampling strategy analogous to that of Algorithm 53 The continuous adaptationinvolves first sampling τb according to the probability measure

(61) P (τb isin middot) =E[m (Ab cap middot)]

E[m (Ab)]

where Ab = t isin T f (t) gt b The idea of introducing τb in the continuous settingis not necessarily to locate the point at which the maximum is achieved as was thesituation in the discrete case Rather τb will be used to find a random point whichhas a reasonable probability of being in the excursion set Ab (This probability willtend to be higher if f is non-homogenous) This relaxation will prove useful in theanalysis of the algorithm Note that τb with the distribution indicated in equation(61) has a density function (with respect to Lebesgue measure) given by

hb(t) =P (f(t) gt b)E[m(Ab)]

MONTE CARLO FOR RANDOM FIELDS 13

and that we also can write

E[m (Ab)] = E

int

T

1 (f (t) gt b) dt =int

T

P (f (t) gt b) dt = m (T ) P (f (U) gt b)

where U is uniformly distributed over T Once τb is generated the natural continuous adaptation corresponding to the

strategy described by Algorithm 53 proceeds by sampling f conditional on f(τb) gtb Note that if we use Q to denote the change-of-measure induced by such a con-tinuous sampling strategy then the corresponding importance sampling estimatortakes the form

Lb =dP

dQ=

E[m (Ab)]m (Ab)

The second moment of the estimator then satisfies

EQ[(Lb)2] = E(LbAb 6= empty

)= E[m(Ab)]P (flowast gt b)E[m(Ab)minus1|Ab 6= empty](62)

Unfortunately it is easy to construct examples for which EQ[(Lb

)2] is infiniteConsequently the above construction needs to be modified slightly

Elementary extreme value theory considerations for the maximum of finitelymany Gaussian random variables and their analogues for continuous fields givethat the overshoot of f over a given level b will be of order Θ(1b) Thus in orderto keep τb reasonably close to the excursion set we shall also consider the possibilityof an undershoot of size Θ (1b) right at τb As we shall see this relaxation will allowus to prevent the variance in (62) becoming infinite Thus instead of (61) we shallconsider τbminusab with density

(63) hbminusab(t) =P (f(t) gt bminus ab)

E[m(Abminusab)]

for some a gt 0 To ease on later notation write

γab bminus ab τγab= τbminusab

Let Qprime be the change of measure induced by sampling f as follows Given τγab

sample f(τγab) conditional on f(τγab

) gt γab In turn the rest of f follows its con-ditional distribution (under the nominal or original measure) given the observedvalue f(τγab

) We then have that the corresponding Radon-Nikodym derivative is

(64)dP

dQprime =E[m(Aγab

)]m(Aγab

)

and the importance sampling estimator Lprimeb is

(65) Lprimeb =dP

dQprime 1 (Ab 6= empty) =E[m(Aγab

)]m(Aγab

)1 (m(Ab) gt 0)

Note that we have used the continuity of the field in order to write m (Ab) gt 0 =Ab 6= empty almost surely The motivation behind this choice lies in the fact that

14 ADLER BLANCHET LIU

since m(Aγab) gt m (Ab) gt 0 the denominator may now be big enough to control

the second moment of the estimator As we shall see introducing the undershootof size ab will be very useful in the technical development both in the remainderof this section and in Section 7 In addition its introduction also provides insightinto the appropriate form of the estimator needed when discretizing the field

62 Algorithm and Analysis We still need to face the problem of generatingf in a computer Thus we now concentrate on a suitable discretization schemestill having in mind the change of measure leading to (64) Since our interest isto ultimately design algorithms that are efficient for estimating expectations suchas E[Γ (f) |flowast gt b] where Γ may be a functional of the whole field we shall use aglobal discretization scheme

Consider U = (U1 UM ) where Ui are iid uniform random variables takingvalues in T and independent of the field f Set TM = U1 UM and Xi =Xi (Ui) = f (Ui) for 1 le i le M Then X = (X1 Xm) (conditional on U) isa multivariate Gaussian random vector with conditional means micro(Ui)

∆= E(Xi|Ui)and covariances C (Ui Uj)

∆= Cov (Xi Xj |Ui Uj) Our strategy is to approximatew(b) by

wM (γab) = P (maxtisinTM

f(t) gt γab) = E[P ( max1leileM

Xi gt γab|U)]

Given the development in Section 5 it might not be surprising that if we can ensurethat M = M (ε b) is polynomial in 1ε and b then we shall be in a good positionto develop a FPRAS The idea is to apply an importance sampling strategy similarto that we considered in the construction of Lprimeb of (65) but this time it will beconditional on U In view of our earlier discussions we propose sampling from Qprimeprime

defined via

Qprimeprime (X isin B|U) =Msum

i=1

pU (i)P [X isin B|Xi gt γab U ]

where

pU (i) =P (Xi gt γab|U)sumM

j=1 P (Xj gt γab|U)

We then obtain the (conditional) importance sampling estimator

(66) Lb(U) =sumM

i=1 P (Xi gt γab|U)sumMi=1 1(Xi gt γab)

It is clear thatwM (γab) = EQprimeprime [Lb(U)]

Suppose for the moment that M a and the number of replications n have beenchosen Our future analysis will in particular guide the selection of these parametersThen the procedure is summarized by the next algorithm

MONTE CARLO FOR RANDOM FIELDS 15

Algorithm 61 The algorithm has three steps

Step 1 Simulate U (1) U (n) which are n iid copies of the vector U = (U1 UM )described aboveStep 2 Conditional on each U (i) for i = 1 n generate L

(i)b (U (i)) as described

by (66) by considering the distribution of X(i)(U (i)) = (X(i)1 (U (i)

1 ) X(i)M (U (i)

M ))Generate the X(i)(U (i)) independently so that at the end we obtain that the L

(i)b (U (i))

are n iid copies of Lb(U)Step 3 Output

Ln(U (1) U (n)) =1n

nsum

i=1

L(i)b (U (i))

63 Running Time of Algorithm 61 Bias and Variance Control The remain-der of this section is devoted to the analysis of the running time of the Algorithm61 The first step lies in estimating the bias and second moment of Lb(U) underthe change of measure induced by the sampling strategy of the algorithm whichwe denote by Qprimeprime We start with a simple bound for the second moment

Proposition 62 There exists a finite λ0 depending on microT = maxtisinT |micro (t)|and σ2

T = maxtisinT σ2 (t) for which

EQprimeprime [Lb(U)2] le λ0M2P

(maxtisinT

f (t) gt b

)2

Proof Observe that

EQprimeprime [Lb(U)2] le E

(Msum

i=1

P (Xi gt γab|Ui)

)2

le E

(Msum

i=1

suptisinT

P (f (Ui) gt γab|Ui = t)

)2

= M2 maxtisinT

P (f (t) gt γab)2

le λ0M2 max

tisinTP (f (t) gt b)2

le λ0M2P

(maxtisinT

f (t) gt b

)2

which completes the proof 2

Next we obtain a preliminary estimate of the bias

16 ADLER BLANCHET LIU

Proposition 63 For each M ge 1 we have

|w (b)minus wM (b)| le E[exp

(minusMm(Aγab

)m(T )

) Ab cap T 6= empty]

+ P

(maxtisinT

f (t) gt γab maxtisinT

f (t) le b

)

Proof Note that

|w (b)minus wM (γab)|le P (max

tisinTM

f (t) le γab maxtisinT

f (t) gt b) + P (maxtisinTM

f (t) gt γab maxtisinT

f (t) le b)

The second term is easily bounded by

P (maxtisinTM

f (t) gt γabmaxtisinT

f (t) le b) le P (maxtisinT

f (t) gt γab maxtisinT

f (t) le b)

The first term can be bounded as follows

P

(maxtisinTM

f (t) le γabmaxtisinT

f (t) gt b

)

le E[(P [f (Ui) le γab| f ])M 1(Ab cap T 6= empty)

]

le E[(

1minusm(Aγab)m (T )

)M Ab cap T 6= empty)]

le E[exp

(minusMm(Aγab)m (T )

) Ab cap T 6= empty)

]

This completes the proof 2

The previous proposition shows that controlling the relative bias of Lb(U) re-quires finding bounds for

(67) E[exp(minusMm(Aγab

)m(T )) Ab cap T 6= empty]

and

(68) P (maxtisinT

f (t) gt γabmaxtisinT

f (t) le b)

and so we develop these To bound (67) we take advantage of the importancesampling strategy based on Qprime introduced earlier in (64) Write

E[exp

(minusMm(Aγab)m (T )

) m(Ab) gt 0

](69)

= EQprime(

exp(minusMm(Aγab

)m (T ))

m(Aγab)

m(Ab) gt 0

)Em

(Aγab

)

MONTE CARLO FOR RANDOM FIELDS 17

Furthermore note that for each α gt 0 we have

EQprime(

exp(minusMm

(Aγab

)m (T )

)

m(Aγab

) m(Ab) gt 0

)(610)

le αminus1 exp (minusMαm(T ))

+ EQprime(

exp(minusMm(Aγab

)m(T ))

m(Aγab

) m(Aγab) le αm(Ab) gt 0

)

The next result whose proof is given in Section 64 gives a bound for the aboveexpectation

Proposition 64 Let β be as in Conditions A2 and A3 For any v gt 0 thereexist constants κ λ2 isin (0infin) (independent of a isin (0 1) and b but dependent on v)such that if we select

αminus1 ge κdβ(ba)(2+v)2dβ

and define W such that P (W gt x) = exp(minusxβd

)for x ge 0 then

EQprime(

exp(minusMm(Aγab

)m (T ))

m(Aγab)

m(Aγab) le αm(Ab) gt 0

)(611)

le EW 2 m (T )

λ2dβ2 M

(b

a

)4dβ

The following result gives us a useful upper bound on (68) The proof is givenin Section 65

Proposition 65 Assume that Conditions A2 and A3 are in force For anyv gt 0 let ρ = 2dβ + dv + 1 where d is the dimension of T There exist constantsb0 λ isin (0infin) (independent of a but depending on microT = maxtisinT |micro (t)| σ2

T =maxtisinT σ2 (t) v the Holder parameters β and κH) so that for all b ge b0 ge 1 wehave

(612) P

(maxtisinT

f(t) le b + ab | maxtisinT

f(t) gt b

)le λabρ

Consequently

P

(maxtisinT

f (t) gt γab maxtisinT

f (t) le b

)

= P

(maxtisinT

f (t) le b|maxtisinT

f (t) gt γab

)P

(maxtisinT

f(t) gt γab

)

le λabρP

(maxtisinT

f(t) gt γab

)

18 ADLER BLANCHET LIU

Moreover

P

(maxtisinT

f(t) gt γab

)(1minus λabρ) le P

(maxtisinT

f(t) gt b

)

Propositions 64 and 65 allow us to prove Theorem 31 which is rephrased inthe form of the following theorem which contains the detailed rate of complexityand so the main result of this section

Theorem 66 Suppose f is a Gaussian random field satisfying conditions A1ndashA4 in Section 3 Given any v gt 0 put a = ε(4λbρ) (where λ and ρ as in Proposition65) and αminus1 = κdβ(ba)(2+v)dβ Then there exist c ε0 gt 0 such that for allε le ε0

(613) |w (b)minus wM (γab)| le w (b) ε

if M =lceilcεminus1(ba)(4+4v)dβ

rceil Consequently by our discussion in Section 2 and the

bound on the second moment given in Proposition 62 it follows that Algorithm 61provides a FPRAS with running time O

((M)3 times (M)2 times εminus2δminus1

)

Proof Combining (63) (69) and (610) with Propositions 63ndash65 we have that

|w (b)minus wM (b) |le αminus1 exp (minusMαm (T )) E

[m

(Aγab

)]

+E[W 2]m (T )

λ2dβ2 M

(b

a

)4dβ

E[m

(Aγab

)]+

(λabρ

1minus λabρ

)w (b)

Furthermore there exists a constant K isin (0infin) such that

Em(Aγab

) le K maxtisinT

P (f (t) gt b)m (T ) le Kw (b) m (T )

Therefore we have that

|w (b)minus wM (b) |w (b)

le αminus1Km(T ) exp (minusMαm (T ))

+E[W 2]Km (T )2

λ2dβ2 M

(b

a

)4dβ

+(

λabρ

1minus λabρ

)

Moreover since a = ε(4λbρ) we obtain that for ε le 12

|w (b)minus wN (b) |w (b)

le αminus1Km(T ) exp (minusMαm (T ))

+[EW 2]Km (T )2

λ2dβ2 M

(b

a

)4dβ

+ ε2

MONTE CARLO FOR RANDOM FIELDS 19

From the selection of α M and θ it follows easily that the first two terms on theright hand side of the previous display can be made less than ε2 for all ε le ε0 bytaking ε0 sufficiently small

The complexity count given in the theorem now corresponds to the followingestimates The factor O((M)3) represents the cost of a Cholesky factorization re-quired to generate a single replication of a finite field of dimension M In additionthe second part of Proposition 62 gives us that O(M2εminus2δminus1) replications are re-quired to control the relative variance of the estimator 2

We now proceed to prove Propositions 64 and 65

64 Proof of Proposition 64 We concentrate on the analysis of the left handside of (611) An important observation is that conditional on the random variableτγab

with distribution

Qprime(τγabisin middot) =

E[m(Aγab

cap middot)]

E[m(Aγab

)]

and given f(τγab) the rest of the field namely (f (t) t isin Tτγab

) is anotherGaussian field with a computable mean and covariance structure The second termin (610) indicates that we must estimate the probability that m(Aγab

) takes smallvalues under Qprime For this purpose we shall develop an upper bound for

(614) P(m(Aγab

) lt yminus1 m(Ab) gt 0∣∣ f (t) = γab + zγab

)

for y large enough Our arguments proceeds in two steps For the first in order tostudy (614) we shall estimate the conditional mean covariance of f (s) s isin Tgiven that f (t) = γab + zγab Then we use the fact that the conditional field isalso Gaussian and take advantage of general results from the theory of Gaussianrandom fields to obtain a bound for (614) For this purpose we recall some usefulresults from the theory of Gaussian random fields The first result is due to Dudley[14]

Theorem 67 Let U be a compact subset of Rn and let f0(t) t isin U be amean zero continuous Gaussian random field Define the canonical metric d on Uas

d (s t) =radic

E[f0 (t)minus f0 (s)]2

and put diam (U) = supstisinU d (s t) which is assumed to be finite Then there existsa finite universal constant κ gt 0 such that

E[maxtisinU

f0 (t)] le κ

int diam(U)2

0

[log (N (ε))]12dε

where the entropy N (ε) is the smallest number of dminusballs of radius ε whose unioncovers U

20 ADLER BLANCHET LIU

The second general result that we shall need is the so-called B-TIS (Borel-Tsirelson-Ibragimov-Sudakov) inequality [4 11 26]

Theorem 68 Under the setting described in Theorem 67

P

(maxtisinU

f0 (t)minus E[maxtisinU

f0 (t)] ge b

)le exp

(minusb2(2σ2U )

)

whereσ2U = max

tisinUE[f2

0 (t)]

We can now proceed with the main proof We shall assume from now on thatτγab

= 0 since as will be obvious from what follows all estimates hold uniformlyover τγab

isin T This is a consequence of the uniform Holder assumptions A2 andA3 Define a new process f

(f (t) t isin T ) L= (f (t) t isin T | f (0) = γab + zγab)

Note that we can always write f (t) = micro (t)+g (t) where g is a mean zero Gaussianrandom field on T We have that

micro (t) = Ef (t) = micro (t) + σ (0)minus2C (0 t) (γab + zγab minus micro (0))

and that the covariance function of f is given by

Cg (s t) = Cov (g (s) g (t)) = C (s t)minus σ (0)minus2C (0 s)C (0 t)

The following lemma describes the behavior of micro (t) and Cg (t s)

Lemma 69 Assume that |s| and |t| small enough Then the following threeconclusions hold

(i) There exist constants λ0 and λ1 gt 0 such that

|micro (t)minus (γab + zγab)| le λ0 |t|β + λ1 |t|β (γab + zγab)

and for all z isin (0 1) and γab large enough

|micro (s)minus micro (t) | le κHγab|sminus t|β

(ii)Cg (s t) le 2κHσ (t)σ (s) |t|β + |s|β + |tminus s|β

(iii)

Dg (s t) =radic

E([g (t)minus g (s)]2) le λ121 |tminus s|β2

Proof All three consequences follow from simple algebraic manipulations Thedetails are omitted 2

MONTE CARLO FOR RANDOM FIELDS 21

Proposition 610 For any v gt 0 there exist κ and λ2 such that for all t isin T yminusβd le a2+v

κb(2+v) a sufficiently small and z gt 0

P(m(Aγab

)minus1 gt y m(Ab) gt 0∣∣ f (t) = γab + zγab

) le exp(minusλ2a

2yβdb2)

Proof For notational simplicity and without loss of generality we assume thatt = 0 First consider the case that z ge 1 Then there exist c1 c2 such that for allc2y

minusβd lt bminus2minusv and z gt 1

P(m

(Aγab

cap T)minus1

gt ym(Ab) gt 0| f (0) = γab + zγab

)

le P

(inf

|t|ltc1yminus1df(t) le γab|f (0) = γab + zγab

)

= P

(inf

|t|ltc1yminus1dmicro (t) + g(t) le γab

)

le P

(inf

|t|ltc1yminus1dg(t) le minus 1

2γab

)

Now apply (iii) from Lemma 69 from which it follows that N (ε) le c3m(T )ε2dβ

for some constant c3 By Theorem 67 E(sup|t|ltc1yminus1d f(t)) = O(yminusβ(2d) log y)By Theorem 68 for some constant c4

P(m

(Aγab

cap T)minus1

gt y m(Ab) gt 0| f (0) = γab + zγab

)

le P

(inf

|t|ltc1yminus1dg(t) le minus 1

2γab

)

le exp

(minus 1

c4γ2aby

minusβd

)

for c2yminusβd lt bminus2minusv and z gt 1

Now consider the case z isin (0 1) Let tlowast be the global maximum of f(t) Then

P(m

(Aγab

cap T)minus1

gt y m(Ab) gt 0|f (0) = γab + zγab

)

le P

(inf

|tminustlowast|ltc1yminus1df(t) lt γab f(tlowast) gt b|f (0) = γab + zγab

)

le P

(sup

|sminust|ltc1yminus1d

|f(s)minus f(t)| gt ab|f (0) = γab + zγab

)

Consider the new field ξ(s t) = g(s)minus g(t) with parameter space T times T Note thatradic

V ar(ξ(s t)) = Dg(s t) le λ1|sminus t|β2

Via basic algebra it is not hard to show that the entropy of ξ(s t) is bounded byNξ (ε) le c3m(T times T )ε2dβ In addition from (i) of Lemma 69 we have

|micro (s)minus micro (t) | le κHγab|sminus t|β

22 ADLER BLANCHET LIU

Similarly for some κ gt 0 and all yminusβd le 1κ

(ab

)2+v a lt 1 there exists c5 such that

P(m

(Aγab

cap T)minus1

gt y m(Ab) gt 0|f (0) = γab + zγab

)

le P

(sup

|sminust|ltc1yminus1d

|f(s)minus f(t)| gt ab|f (0) = γab + zγab

)

= P

(sup

|sminust|ltc1yminus1d

|ξ(s t)| gt a

2b

)

le exp(minus a2

c5b2yminusβd

)

Combining the two cases z gt 1 and z isin (0 1) and choosing c5 large enough we have

P(m

(Aγab

cap T)minus1

gt y m(Ab) gt 0∣∣ f (0) = γab + zγab

)

le exp(minus a2

c5b2yminusβd

)

for a small enough and yminusβd le 1κ

(ab

)2+v Renaming the constants completes theproof 2

The final ingredient needed for the proof of Proposition 64 is the following lemmainvolving stochastic domination The proof follows an elementary argument and istherefore omitted

Lemma 611 Let v1 and v2 be finite measures on R and define ηj(x) =intinfin

xvj(ds)

Suppose that η1 (x) ge η2 (x) for each x ge x0 Let (h (x) x ge x0) be a non-decreasing positive and bounded function Then

int infin

x0

h (s) v1(dx) geint infin

x0

h (s) v2 (ds)

Proof of Proposition 64 Note that

EQprime(exp

(minusMm(Aγab

)m (T )

)

m(Aγab

) m(Aγab

) isin (0 α) m(Ab) gt 0)

=int

T

int infin

z=0

E(exp

(minusMm(Aγab

)m (T )

)

m (Aγab)

m(Aγab

) isin (0 α) m(Ab) gt 0∣∣∣

f (t) = γab +z

γab

)P

(τγab

isin dt)P (γab[f (t)minus γab] isin dz| f (t) gt γab)

le supzgt0tisinT

E(exp

(minusMm(Aγab

)m (T )

)

m (Aγab)

m(Aγab

) isin (0 α) m(Ab) gt 0∣∣∣

f (t) = γab +z

γab

)

MONTE CARLO FOR RANDOM FIELDS 23

Now define Y (ba) = (ba)2dβλminusdβ2 W with λ2 as chosen in Proposition 610

and W with distribution given by P (W gt x) = exp(minusxβd) By Lemma 611

suptisinT

E(exp

(minusMm(Aγab

)m (T )

)

m (Aγab)

m(Aγab

) isin (0 α) m(Ab) gt 0∣∣∣

f (t) = γab +z

γab

)

le E[Y (ba) exp(minusMY (ba)minus1m(T ))Y (ba) gt αminus1

]

Now let Z be exponentially distributed with mean 1 and independent of Y (ba)Then we have (using the definition of the tail distribution of Z and Chebyshevrsquosinequality)

exp(minusMY (ba)minus1m (T )

)= P

(Z gt MY (ba)minus1m (T )

∣∣ Y (ba))

le m (T )Y (ba)M

Therefore

E[exp(minusMY (ba)minus1m (T )

)Y (ba) Y (ba) ge αminus1]

le m (T )M

E[Y (ba)2 Y (ba) ge αminus1]

le m (T )

Mλ2dβ2

(b

a

)4dβ

E(W 2)

which completes the proof 2

65 Proof of Proposition 65 We start with the following result of Tsirelson[27]

Theorem 612 Let f be a continuous separable Gaussian process on a compact(in the canonical metric) domain T Suppose that V ar (f) = σ is continuous andthat σ (t) gt 0 for t isin T Moreover assume that micro = Ef is also continuous andmicro(t) ge 0 for all t isin T Define

σ2T

∆= maxtisinT

Var(f(t))

and set F (x) = PmaxtisinT f(t) le x Then F is continuously differentiable on RFurthermore let y be such that F (y) gt 12 and define ylowast by

F (y) = Φ(ylowast)

Then for all x gt y

F prime(x) le Ψ(

xylowasty

)(xylowasty

(1 + 2α) + 1)

(1 + α)

24 ADLER BLANCHET LIU

where

α =y2

x(xminus y)y2lowast

We can now prove the following lemma

Lemma 613 There exists a constant A isin (0infin) independent of a and b ge 0such that

(615) P

(suptisinT

f(t) le b + ab | suptisinT

f(t) gt b

)le aA

P (suptisinT f (t) ge bminus 1b)P (suptisinT f (t) ge b)

Proof By subtracting inftisinT micro (t) gt minusinfin and redefining the level b to be b minusinftisinT micro (t) we may simply assume that Ef (t) ge 0 so that we can apply Theorem612 Adopting the notation of Theorem 612 first we pick b0 large enough so thatF (b0) gt 12 and assume that b ge b0 + 1 Now let y = bminus 1b and F (y) = Φ(ylowast)Note that there exists δ0 isin (0infin) such that δ0b le ylowast le δminus1

0 b for all b ge b0 Thisfollows easily from the fact that

logP

suptisinT

f(t) gt x

sim log sup

tisinTP f(t) gt x sim minus x2

2σ2T

On the other hand by Theorem 612 F is continuously differentiable and so

(616) P

suptisinT

f(t) lt b + ab | suptisinT

f(t) gt b

=

int b+ab

bF prime(x) dx

P suptisinT f(t) gt b

Moreover

F prime(x) le(

1minus Φ(

xylowasty

)) (xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x))

le (1minus Φ(ylowast))(

xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x))

= P

(maxtisinT

f (t) gt bminus 1b

)(xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x))

Thereforeint b+ab

b

F prime(x) dx

le P

(suptisinT

f (t) gt bminus 1b

) int b+ab

b

(xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x)) dx

Recall that α (x) = y2[x(xminus y)y2lowast] we can use the fact that ylowast ge δ0b to conclude

that if x isin [b b + ab] then α (x) le δminus20 and therefore

int b+ab

b

(xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x)) dx le 4δminus80 a

MONTE CARLO FOR RANDOM FIELDS 25

We thus obtain that

(617)

int b+ab

bF prime(x) dx

P suptisinT f(t) gt b le 4aδminus80

P suptisinT f(t) gt bminus 1bP suptisinT f(t) gt b

for any b ge b0 This inequality together with the fact that F is continuously differ-entiable on (minusinfininfin) yields the proof of the lemma for b ge 0 2

The previous result translates a question that involves the conditional distri-bution of maxtisinT f (t) near b into a question involving the tail distribution ofmaxtisinT f (t) The next result then provides a bound on this tail distribution

Lemma 614 For each v gt 0 there exists a constant C(v) isin (0infin) (possiblydepending on v gt 0 but otherwise independent of b) so that such that

P

(maxtisinT

f (t) gt b

)le C(v)b2dβ+dv+1 max

tisinTP (f (t) gt b)

for all b ge 1

Proof The proof of this result follows along the same lines of Theorem 262 in[5] Consider an open cover of T = cupM

i=1Ti(θ) where Ti(θ) = s |s minus ti| lt θWe choose ti carefully such that N(θ) = O(θminusd) for θ arbitrarily small Writef (t) = g (t)+micro (t) where g (t) is a centered Gaussian random field and note usingA2 and A3 that

P

(max

tisinTi(θ)f (t) gt b

)le P

(max

tisinTi(θ)g (t) gt bminus micro (ti)minus κHθβ

)

Now we wish to apply the Borel-TIS inequality (Theorem 68) with U = Ti (θ) f0 =g d (s t) = E12([g (t)minus g (s)]2) which as a consequence of A2 and A3 is boundedabove by C0 |tminus s|β2 for some C0 isin (0infin) Thus applying Theorem 67 we havethat E maxtisinTi(θ) g (t) le C1θ

β2 log(1θ) for some C1 isin (0infin) Consequently theBorel-TIS inequality yields that there exists C2 (v) isin (0infin) such that for all bsufficiently large and θ sufficiently small we have

P

(max

tisinTi(θ)g (t) gt bminus micro (ti)minus κHθβ

)le C2 (v) exp

(minus

(bminus micro (ti)minus C1θ

β(2+βv))2

2σ2Ti

)

where σTi = maxtisinTi(θ) σ (t) Now select v gt 0 small enough and set θβ(2+βv) =bminus1 Straightforward calculations yield that

P

(max

tisinTi(θ)f (t) gt b

)le P

(max

tisinTi(θ)g (t) gt bminus micro (ti)minus κHθβ

)

le C3 (v) maxtisinTi(θ)

exp

(minus (bminus micro (t))2

2σ (t)2

)

26 ADLER BLANCHET LIU

for some C3(v) isin (0infin) Now recall the well known inequality (valid for x gt 0)that

φ (x)(

1xminus 1

x3

)le 1minus Φ(x) le φ (x)

x

where φ = Φprime is the standard Gaussian density Using this inequality it follows thatC4 (v) isin (0infin) can be chosen so that

maxtisinTi(θ)

exp

(minus (bminus micro (t))2

2σ (t)2

)le C4 (v) bmax

tisinTi

P (f (t) gt b)

for all b ge 1 We then conclude that there exists C (v) isin (0infin) such that

P

(maxtisinT

f (t) gt b

)le N (θ)C4 (v) bmax

tisinTP (f (t) gt b)

le CθminusdbmaxtisinT

P (f (t) gt b) = Cb2dβ+dv+1 maxtisinT

P (f (t) gt b)

giving the result 2

We can now complete the proof of Proposition 65

Proof of Proposition 65 The result is a straightforward corollary of the previoustwo lemmas By (615) in Lemma 613 and Lemma 614 there exists λ isin (0infin) forwhich

P

(maxtisinT

f(t) le b + ab | maxtisinT

f(t) gt b

)

le aAP (maxtisinT f (t) ge bminus 1b)

P (maxtisinT f (t) ge b)

le aCAb2dβ+dv+1 maxtisinT P (f (t) gt bminus 1b)P (maxtisinT f (t) ge b)

le aCAb2dβ+dv+1 maxtisinT P (f (t) gt bminus 1b)maxtisinT P (f (t) gt b)

le aλb2dβ+dv+1

The last two inequalities follow from the obvious bound

P

(maxtisinT

f (t) ge b

)ge max

tisinTP (f (t) gt b)

and standard properties of the Gaussian distribution This yields (612) from whichthe remainder of the proposition follows 2

MONTE CARLO FOR RANDOM FIELDS 27

7 Fine Tuning Twice Differentiable Homogeneous Fields In the pre-ceeding section we constructed a polynomial time algorithm based on a randomizeddiscretization scheme Our goal in this section is to illustrate how to take advantageof additional information to further improve the running time and the efficiency ofthe algorithm In order to illustrate our techniques we shall perform a more refinedanalysis in the setting of smooth and homogeneous fields and shall establish op-timality of the algorithm in a precise sense to described below Our assumptionsthroughout this section are B1ndashB2 of Section 3

Let C(sminus t) = Cov(f(s) f(t)) be the covariance function of f which we assumealso has mean zero Note that it is an immediate consequence of homogeneity anddifferentiability that partiC(0) = part3

ijkC(0) = 0We shall need the following definition

Definition 71 We call T = t1 tM sub T a θ-regular discretization of Tif and only if

mini 6=j

|ti minus tj | ge θ suptisinT

mini|ti minus t| le 2θ

Regularity ensures that points in the grid T are well separated Intuitively sincef is smooth having tight clusters of points translates to a waste of computingresources as a result of sampling highly correlated values of f Also note thatevery region containing a ball of radius 2θ has at least one representative in T Therefore T covers the domain T in an economical way One technical convenienceof θ-regularity is that for subsets A sube T that have positive Lebesgue measure (inparticular ellipsoids)

limMrarrinfin

(A cap T )M

=m(A)m(T )

where here and throughout the remainder of the section (A) denotes the cardi-nality of the set A

Let T = t1 tM be a θ-regular discretization of T and consider

X = (X1 XM )T ∆= (f (t1) f (tM ))T

We shall concentrate on estimating wM (b) = P (max1leileM Xi gt b) The next result(which we prove in Section 71) shows that if θ = εb then the relative bias is O (

radicε)

Proposition 72 Suppose f is a Gaussian random field satisfying conditionsB1 and B2 There exist c0 c1 b0 and ε0 such that for any finite εb-regulardiscretization T of T

(71) P (suptisinT

f(t) lt b| suptisinT

f(t) gt b) le c0

radicε and (T ) le c1m(T )εminusdbd

for all ε isin (0 ε0] and b gt b0

Note that the bound on the bias obtained for twice differentiable fields is muchsharper than that of the general Holder continuous fields given by (613) in Theorem

28 ADLER BLANCHET LIU

66 This is partly because the conditional distribution of the random field aroundlocal maxima is harder to describe in the Holder continuous than in the case oftwice differentiable fields In addition to the sharper description of the bias weshall also soon show in Theorem 74 that our choice of discretization is optimal in acetain sense Finally we point out that the bound of

radicε in the first term of (71) is

not optimal In fact there seems to be some room of improvement and we believethat a more careful analysis might yield a bound of the form c0ε

2We shall estimate wM (b) by using a slight variation of Algorithm 53 In partic-

ular since the Xirsquos are now identically distributed we redefine Q to be

(72) Q (X isin B) =Msum

i=1

1M

P [X isin B|Xi gt bminus 1b]

Our estimator then takes the form

(73) Lb =M times P (X1 gt bminus 1b)sumM

j=1 1 (Xj gt bminus 1b)1( max

1leileMXi gt b)

Clearly we have that EQ(Lb) = wM (b) (The reason for for subtracting the fcatorof 1b was explained in Section 61)

Algorithm 73 Given a number of replications n and an εb-regular dis-cretization T the algorithm is as follows

Step 1 Sample X(1) X(n) iid copies of X with distribution Q given by (72)Step 2 Compute and output

Ln =1n

nsum

i=1

L(i)b

where

L(i)b =

M times P (X1 gt bminus 1b)sumM

j=1 1(X(i)j gt bminus 1b)

1( max1leileM

X(i)j gt b)

Theorem 75 later guides the selection of n in order to achieve a prescribedrelative error In particular our analysis together with considerations from Section2 implies that choosing n = O

(εminus2δminus1

)suffices to achieve ε relative error with

probability at least 1minus δAlgorithm 73 improves on Algorithm 61 for Holder continuous fields in two

important ways The first aspect is that it is possible to obtain information on thesize of the relative bias of the estimator In Proposition 72 we saw that in orderto overcome bias due to discretization it suffices to take a discretization of sizeM = (T ) = Θ(bd) That this selection is also asymptotically optimal in the sensedescribed in the next result will be proven in Section 71

MONTE CARLO FOR RANDOM FIELDS 29

Theorem 74 Suppose f is a Gaussian random field satisfying conditions B1and B2 If θ isin (0 1) then as b rarrinfin

sup(T )lebθd

P (suptisinT

f(t) gt b| suptisinT

f(t) gt b) rarr 0

This result implies that the relative bias goes to 100 as b rarrinfin if one chooses adiscretization scheme of size O

(bθd

)with θ isin (0 1) Consequently d is the smallest

power of b that achieves any given bounded relative bias and so the suggestionabove of choosing M = O

(bd

)points for the discretization is in this sense optimal

The second aspect of improvement involves the variance In the case of Holdercontinuous fields the ratio of the second moment of the estimator and w (b)2 wasshown to be bounded by a quantity that is of order O(M2) In contrast in thecontext of smooth and homogeneous fields considered here the next result showsthat this ratio is bounded uniformly for b gt b0 and M = (T ) ge cbd That is thevariance remains strongly controlled

Theorem 75 Suppose f is a Gaussian random field satisfying conditions B1and B2 Then there exist constants c b0 and ε0 such that for any εb-regulardiscretization T of T we have

supbgtb0εisin[0ε0]

EQL2b

P 2 (suptisinT f (t) gt b)le sup

bgtb0εisin[0ε0]

EQL2b

P 2(sup

tisinTf (t) gt b

) le c

for some c isin (0infin)

The proof of this result is given in Section 72 The fact that the number ofreplications remains bounded in b is a consequence of the strong control on thevariance

Finally we note that the proof of Theorem 33 follows as a direct corollary of The-orem 75 together with Proposition 72 and our discussion in Section 2 Assumingthat placing each point in T takes no more than c units of computer time the totalcomplexity is according to the discussion in section Section 2 O

(nM3 + M

)=

O(εminus2δminus1M3 + M

) The contribution of the term M3 = O

(εminus6b3

)comes from

the complexity of applying Cholesky factorization and the term M = O(εminus2b)corresponds to the complexity of placing T

Remark 76 Condition B2 imposes a convexity assumption on the boundaryof T This assumption although convenient in the development of the proofs of The-orems 74 and 75 is not necessary The results can be generalized at the expenseof increasing the length and the burden in the technical development to the casein which T is a d-dimensional manifold satisfying the so-called Whitney conditions([4])

The remainder of this section is devoted to the proof of Proposition 72 Theorem74 and Theorem 75

30 ADLER BLANCHET LIU

71 Bias Control Proofs of Proposition 72 and Theorem 74 We start withsome useful lemmas for all of which we assume that Conditions B1 and B2 are sat-isfied We shall also assume that the global maximum of f over T is achieved withprobability one at a single point in T Additional conditions under which this willhappen can be found in [4] and require little more than the non-degeneracy of thejoint distribution of f and its first and second order derivatives Of these lemmasLemma 77 the proof of which we defer to Section 73 is of central importance

Lemma 77 For any a0 gt 0 there exists clowast δlowast b0 and δ0 (which depend onthe choice of a0) such that for any s isin T a isin (0 a0) δ isin (0 δ0) and b gt b0

P ( min|tminuss|ltδabminus1

f(t) lt b|f(s) = b + ab s isin L) le clowast exp(minusδlowast

δ2

)

where L is the set of local maxima of f

The next lemma is a direct corollary of Lemma 77

Lemma 78 Let tlowast be the point in T at which the global maximum of f isattained Then with the same choice of constants as in Lemma 77 for any a isin(0 a0) δ isin (0 δ0) and b gt b0 we have

P ( min|tminustlowast|ltδabminus1

f(t) lt b|f(tlowast) = b + ab) le 2clowast exp(minusδlowast

δ2

)

Proof Note that for any s isin T

P

(min

|t|leδabf(t) lt b|f(s) = b + ab s = tlowast

)

=P

(min|t|leδab f(t) lt b f(s) = b + ab s = tlowast

)

P (f(s) = b + ab s = tlowast)

le P(min|t|leδab f(t) lt b f(s) = b + ab s isin L)

P (f(s) = b + ab s = tlowast)

= P ( min|tminuss|ltδabminus1

f(t) lt b|f(s) = b + ab s isin L)P (f(s) = b + ab s isin L)P (f(s) = b + ab s = tlowast)

le clowast exp(minusδlowast

δ2

)P (f(s) = b + ab s isin L)P (f(s) = b + ab s = tlowast)

Appling this bound and the fact that

P (f(s) = b + ab s isin L)P (f(s) = b + ab s = tlowast)

rarr 1

as b rarrinfin the conclusion of the Lemma holds 2

MONTE CARLO FOR RANDOM FIELDS 31

The following lemma gives a bound on the density of suptisinT f(t) which will beused to control the size of overshoot beyond level b

Lemma 79 Let pflowast(x) be the density function of suptisinT f(t) Then there existsa constant cflowast and b0 such that

pflowast(x) le cflowastxd+1P (f(0) gt x)

for all x gt b0

Proof Recalling (13) let the continuous function pE(x) x isin R be defined bythe relationship

E(χ(t isin T f(t) ge b)) =int infin

b

pE(x) dx

where the left hand side is the expected value of the Euler-Poincare characteristicof Ab Then according to Theorem 810 in [7] there exists c and δ such that

|pE(x)minus pflowast(x)| lt cP (f(0) gt (1 + δ)x)

for all x gt 0 In addition thanks to the result of [4] which providesintinfin

bpE(x)dx in

closed form there exists c0 such that for all x gt 1

pE(x) lt c0xd+1P (f(0) gt x)

Hence there exists cflowast such that

pflowast(x) le c0xd+1P (f(0) gt x) + cP (f(0) gt (1 + δ)x) le cflowastx

d+1P (f(0) gt x)

for all x gt 1 2

The last ingredients required to provide the proof of Proposition 72 and Theorem74 are stated in the following well known Theorem adapted from Lemma 61 andTheorem 72 in [19] to the twice differentiable case

Theorem 710 There exists a constant H (depending on the covariance func-tion C such that

P (suptisinT

f(t) gt b) = (1 + o (1))Hm(T )bdP (f(0) gt b)(74)

as b rarrinfinSimilarly choose δ small enough so that [0 δ]d sub T and let ∆0 = [0 bminus1]d Then

there exists a constant H1 such that

P ( suptisin∆0

f(t) gt b) = (1 + o(1))H1P (f(0) gt b)(75)

as b rarrinfin

32 ADLER BLANCHET LIU

We now are ready to provide the proof of Proposition 72 and Theorem 74

Proof of Proposition 72 The fact that there exists c1 such that

(T ) le c1m(T )εminusdbd

is immediate from assumption B2 Therefore we proceed to provide a bound forthe relative bias We choose ε0 lt δ2

0 (δ0 is given in Lemmas 77 and 78) and notethat for any ε lt ε0

P (suptisinT

f(t) lt b| suptisinT

f(t) gt b)

le P (suptisinT

f(t) lt b + 2radic

εb| suptisinT

f(t) gt b) + P (suptisinT

f(t) lt b| suptisinT

f(t) gt b + 2radic

εb)

By (74) and Lemma 79 there exists c2 such that the first term above can boundedby

P (suptisinT

f(t) lt b + 2radic

εb| suptisinT

f(t) gt b) le c2

radicε

The second term can be bounded by

P (suptisinT

f(t) lt b| suptisinT

f(t) gt b + 2radic

εb)

le P ( sup|tminustlowast|lt2εbminus1

f(t) lt b| suptisinT

f(t) gt b + 2radic

ε)

le 2clowast exp(minusδlowastεminus1

)

Hence there exists c0 such that

P (suptisinT

f(t) lt b| suptisinT

f(t) gt b) le c2

radicε + 2clowasteminusδlowastε le c0

radicε

for all ε isin (0 ε0) 2

Proof of Theorem 74 We write θ = 1minus 3δ isin (0 1) First note that by (74)

P (supT

f(t) gt b + b2δminus1| supT

f(t) gt b) rarr 0

as b rarrinfin Let tlowast be the position of the global maximum of f in T According to theexact Slepian model in Section 73 and an argument similar to the proof of Lemma78

(76) P ( sup|tminustlowast|gtb2δminus1

f(t) gt b|b lt f(tlowast) le b + b2δminus1) rarr 0

as b rarrinfin Consequently

P ( sup|tminustlowast|gtb2δminus1

f(t) gt b| supT

f(t) gt b) rarr 0

MONTE CARLO FOR RANDOM FIELDS 33

LetB(T b2δminus1) = cup

tisinTB(t b2δminus1)

We have

P (supT

f(t) gt b| supT

f(t) gt b)

le P ( sup|tminustlowast|gtb2δminus1

f(t) gt b| supT

f(t) gt b)

+ P (supT

f(t) gt b sup|tminustlowast|gtb2δminus1

f(t) le b| supT

f(t) gt b)

le o(1) + P (tlowast isin B(T b2δminus1)| supT

f(t) gt b)

le o(1) + P ( supB(T b2δminus1)

f(t) gt b| supT

f(t) gt b)

Since (T ) le b(1minus3δ)d we can find a finite set T prime = tprime1 tprimel sub T and let ∆k =tprimek + [0 bminus1] such that l = O(b(1minusδ)d) and B(T b2δminus1) sub cupl

k=1∆k The choice ofl only depends on (T ) not the particular distribution of T Therefore applying(75)

sup(Tlebθd)

P ( supB(T b2δminus1)

f(t) gt b) le O(b(1minusδ)d)P (f(0) gt b)

This together with (74) yields

sup(Tlebθd)

P ( supB(T b2δminus1)

f(t) gt b| supT

f(t) gt b) le O(bminusδd) = o(1)

for b ge b0 which clearly implies the statement of the result 2

72 Variance Control Proof of Theorem 75 We proceed directly to the proofof Theorem 75

34 ADLER BLANCHET LIU

Proof of Theorem 75 Note that

EQL2b

P 2 (suptisinT f (t) gt b)

=E(Lb)

P 2 (suptisinT f (t) gt b)

=E

(MtimesP (X1gtbminus1b)sumn

j=11(Xjgtbminus 1

b )maxj Xj gt b

)

P 2 (suptisinT f (t) gt b)

=E

(MtimesP (X1gtbminus1b)sumn

j=11(Xjgtbminus 1

b )maxj Xj gt b suptisinT f (t) gt b

)

P 2 (suptisinT f (t) gt b)

=E

(MP (X1gtbminus1b)1(maxj Xjgtb)sumn

j=11(Xjgtbminus1b)

∣∣∣∣ suptisinT f (t) gt b

)

P (suptisinT f (t) gt b)

= E

(M1 (maxj Xj gt b)sumnj=1 1 (xj gt bminus 1b)

∣∣∣∣∣ suptisinT

f (t) gt b

)P (X1 gt bminus 1b)

P (suptisinT f (t) gt b)

The remainder of the proof involves showing that the last conditional expectationhere is of order O(bd) This together with (74) will yield the result Note that forany A (b ε) such that A (b ε) = Θ(M) uniformly over b and ε we can write

E

(M times 1 (maxj Xj gt b)sumM

j=1 1(Xj gt bminus 1

b

)∣∣∣∣∣ sup

tisinTf (t) gt b

)(77)

le E

M times 1

(sumMj=1 1 (Xj gt bminus 1b) ge M

A(bε)

)

sumMj=1 1

(xj gt bminus 1

b

)∣∣∣∣∣∣suptisinT

f (t) gt b

+ E

M times 1

(1 le sumM

j=1 1 (Xj gt bminus 1b) lt MA(bε)

)

sumMj=1 1

(xj gt bminus 1

b

)∣∣∣∣∣∣suptisinT

f (t) gt b

We shall select A(b ε) appropriately in order to bound the expectations above ByLemma 78 for any 4ε le δ le δ0 there exist constants cprime and cprimeprime isin (0infin) such that

clowast exp(minusδlowast

δ2

)ge P

(min

|tminustlowast|leδbf (t) lt bminus 1b

∣∣∣∣ suptisinT

f (t) gt b

)

ge P

Msum

j=1

1 (Xj gt bminus 1b) le cprimeδdεd

∣∣∣∣∣∣suptisinT

f (t) gt b

ge P

(MsumM

j=1 1(Xj gt bminus 1b)ge bdcprimeprime

δd

∣∣∣∣∣ suptisinT

f (t) gt b

)(78)

MONTE CARLO FOR RANDOM FIELDS 35

The first inequality is an application of Lemma 78 The second inequality is dueto the fact that for any ball B of radius 4ε or larger (T cap B) ge cprimedεminusd for somecprime gt 0 Inequality (78) implies that for all x such that bdcprimeprimeδd

0 lt x lt bdcprimeprime[4dεd]there exists δlowastlowast gt 0 such that

P

(MsumM

j=1 1(Xj gt bminus 1

b

)bdge x

∣∣∣∣∣ suptisinT

f (t) gt b

)le clowast exp

(minusδlowastlowastx2d

)

Now let A (b ε) = bdcprimeprime(4dεd) and observe that by the second result in (71) wehave A (b ε) = Θ (M) and moreover that there exists c3 such that the first termon the right hand side of (77) is bounded by

(79) E

M times 1

(sumMj=1 1 (Xj gt bminus 1b) ge M

A(bε)

)

sumMj=1 1 (Xj gt bminus 1b)

∣∣∣∣∣∣supT

f (t) gt b

le c3b

d

Now we turn to the second term on the right hand side of (77) We use the factthat MA(b ε) le cprimeprimeprime isin (0infin) (uniformly as b rarrinfin and ε rarr 0) There exist c4 andc5 such that for ε le δ0c4

E

M1

(1 le sumM

j=1 1(Xj gt bminus 1

b

)lt M

A(bε)

)

sumMj=1 1

(Xj gt bminus 1

b

)∣∣∣∣∣∣supT

f (t) gt b

(710)

le MP

nsum

j=1

1(

Xj gt bminus 1b

)lt cprimeprimeprime

∣∣∣∣∣∣supT

f (t) gt b

le MP

(min

|tminustlowast|lec4εbf (t) lt bminus 1b

∣∣∣∣ supT

f (t) gt b

)

le c1m(T )bdεminusd exp(minus δlowast

c24ε

2

)le c5b

d

The second inequality holds from the fact that ifsumM

j=1 1(Xj gt bminus 1

b

)is less than

cprimeprimeprime then the minimum of f in a ball around the local maximum and of radius c4εb

must be less than bminus 1b Otherwise there are more than cprimeprimeprime elements of T insidesuch a ball The last inequality is due to Lemma 78 and Theorem 72

Putting (79) and (710) together we obtain for all εb -regular discretizationswith ε lt ε0 = min(14 1c4)δ0

EQL2b

P (suptisinT f (t) gt b)le E

(M1 (maxj Xj gt b)sumM

j=1 1(Xj gt bminus 1

b

)∣∣∣∣∣ sup

tisinTf (t) gt b

)P (X1 gt bminus 1b)

P (suptisinT f (t) gt b)

le (c3 + c5)bdP (X1 gt bminus 1b)P (suptisinT f (t) gt b)

Applying now (74) and Proposition 72 we have that

P (suptisinT

f(t) gt b) ltP (sup

tisinTf(t) gt b)

1minus c0radic

ε

36 ADLER BLANCHET LIU

we have

supbgtb0εisin[0ε0]

EQL2b

P(sup

tisinTf (t) gt b

) lt infin

as required 2

73 A final proof Lemma 77 Without loss of generality we assume that thefield has mean zero and unit variance Before getting into the details of the proofof Lemma 77 we need a few additional lemmas for which we adopt the followingnotation Let Ci and Cij be the first and second order derivatives of C and definethe vectors

micro1(t) = (minusC1(t) minusCd(t))micro2(t) = vech ((Cij(t) i = 1 d j = i d))

Let f prime(0) and f primeprime(0) be the gradient and vector of second order derivatives of f at 0where f primeprime(0) is arranged in the same order as micro2(0) Furthermore let micro02 = microgt20 bea vector of second order spectral moments and micro22 a matrix of fourth order spectralmoments The vectors micro02 and micro22 are arranged so that

1 0 micro02

0 Λ 0micro20 0 micro22

is the covariance matrix of (f(0) f prime(0) f primeprime(0)) where Λ = (minusCij(0)) It then followsthat

micro2middot0 = micro22 minus micro20micro02

be the conditional variance of f primeprime(0) given f(0) The following lemma given in [5]provides a stochastic representation of the f given that it has a local maxima atlevel u at the origin

Lemma 711 Given that f has a local maximum with height u at zero (aninterior point of T ) the conditional field is equal in distribution to

(711) fu(t) uC (t)minusWuβgt (t) + g (t)

g(t) is a centered Gaussian random field with covariance function

γ(s t) = C(sminus t)minus (C(s) micro2(s))(

1 micro02

micro20 micro22

)minus1 (C(t)microgt2 (t)

)minus micro1(s)Λminus1microgt1 (t)

and Wu is a d(d+1)2 random vector independent of g(t) with density function

(712) ψu (w) prop |det (rlowast(w)minus uΛ) | exp(minus1

2wgtmicrominus1

2middot0w)1 (rlowast(w)minus uΛ isin N )

MONTE CARLO FOR RANDOM FIELDS 37

where rlowast(w) is a dtimesd symmetric matrix whose upper triangular elements consist ofthe components of w The set of negative definite matrices is denoted by N Finallyβ(t) is defined by

(α (t) β (t)) = (C (t) micro2 (t))(

1 micro02

micro20 micro22

)minus1

The following two technical lemmas which we shall prove after completing theproof of Lemma 77 provide bounds for the last two terms of (711)

Lemma 712 Using the notation in (711) there exist δ0 ε1 c1 and b0 suchthat for any u gt b gt b0 and δ isin (0 δ0)

P

(sup

|t|leδab

∣∣Wuβgt (t)∣∣ gt

a

4b

)le c1 exp

(minusε1b

2

δ4

)

Lemma 713 There exist c δ and δ0 such that for any δ isin (0 δ0)

P

(max|t|leδab

|g(t)| gt a

4b

)le c exp

(minus δ

δ2

)

Proof of Lemma 77 Using the notation of Lemma 711 given any s isin L forwhich f(s) = b we have that the corresponding conditional distribution of s is thatof fb(middot minus s) Consequently it suffices to show that

P

(min

|t|leδabfb(t) lt bminus ab

)le clowast exp

(minusδlowast

δ2

)

We consider firstly the case for which the local maximum is in the interior of T Then by the Slepian model (711)

fb (t) = bC (t)minusWbβgt (t) + g (t)

We study the three terms of the Slepian model individually Since

C(t) = 1minus tgtΛt + o(|t|2

)

there exists a ε0 such thatbC (t) ge bminus a

4b

38 ADLER BLANCHET LIU

for all |t| lt ε0radic

ab According to Lemmas 712 and 713 for δ lt min(ε0radic

a0 δ0)

P

(min

|t|leδabfu(t) lt bminus ab

)

le P

(max|t|ltδab

|g (t)| gt a

4b

)+ P

(sup

|t|leδab

∣∣Wbβgt(t)

∣∣ gta

4b

)

le c exp

δ2

)+ c1 exp

(minusε1b

2

δ4

)

le clowast exp(minusδlowast

δ2

)

for some clowast and δlowastNow consider the case for which the local maximum is in the (dminus1)-dimensional

boundary of T Due to convexity of T and without loss of generality we assumethat the tangent space of partT is generated by partpartt2 partparttd the local maximumis located at the origin and T is a subset of the positive half-plane t1 ge 0 Thatthese arguments to not involve a loss of generality follows from the arguments onpages 291ndash192 of [4] which rely on the assumed stationarity of f (for translations)and the fact that rotations while changing the distributions will not affect theprobabilities that we are currently computing)

For the origin positioned as just described to be a local maximum it is necessaryand sufficient that the gradient of f restricted to partT is the zero vector the Hessianmatrix restricted to partT is negative definite and part1f(0) le 0 Applying a version ofLemma 711 for this case conditional on f(0) = u and 0 being a local maximumthe field is equal in distribution to

(713) uC(t)minus Wuβgt(t) + micro1(t)Λminus1 (Z 0 0)T + g(t)

where Z le 0 corresponds to part1f (0) and it follows a truncated (conditional on thenegative axis) Gaussian random variable with mean zero and a variance parameterwhich is computed as the conditional variance of part1f (0) given (part2f (0) partdf (0))The vector Wu is a (d(d + 1)2)-dimensional random vector with density function

ψu(w) prop |det(rlowast(w)minus uΛ)| exp(minus12wgtmicrominus1

2middot0w)1(wlowast minus uΛ isin N )

where Λ is the second spectral moment of f restricted to partT and rlowast(w) is the(dminus 1)times (dminus 1) symmetric matrix whose upper triangular elements consist of thecomponents of w In the representation (713) the vector Wu and Z are independentAs in the proof of Lemma 712 one can show that a similar bound holds albeitwith with different constants Thus since micro1 (t) = O (t) there exist cprimeprime and δprimeprime suchthat the third term in (713) can be bounded by

P [ max|t|leδab

∣∣∣micro1 (t) Λminus1 (Z 0 0)T∣∣∣ ge a(4b)] le cprimeprime exp

(minusδprimeprime

δ2

)

MONTE CARLO FOR RANDOM FIELDS 39

Consequently we can also find clowast and δlowast such that the conclusion holds and weare done 2

We complete the paper with the proofs of Lemmas 712 and 713

Proof of Lemma 712 It suffices to prove the Lemma for the case a = 1 Since

fu (0) = u = uminusWuβgt (0) + g (0)

and Wu and g are independent β (0) = 0 Furthermore since C prime (t) = O (t) andmicroprime2 (t) = O (t) there exists a c0 such that |β(t)| le c0|t|2 In addition Wu has densityfunction proportional to

ψu(w) prop |det(rlowast(w)minus uΛ)| exp(minus12wgtmicrominus1

2middot0w)1 (wlowast minus uΛ isin N )

Note that det(rlowast(w) minus uΛ) is expressible as a polynomial in w and u and thereexists some ε0 and c such that

∣∣∣∣det (rlowast(w)minus uΛ)

det (minusuΛ)

∣∣∣∣ le c

if |w| le ε0u Hence there exist ε2 c2 gt 0 such that

ψu (w) le ψ (w) = c2 exp(minus1

2ε2w

gtmicrominus12middot0w

)

for all u ge 1 The right hand side here is proportional to a multivariate Gaussiandensity Thus

P (|Wu| gt x) =int

|w|gtx

ψu(w)dw leint

|w|gtx

ψu (w) dw = c3P (|W | gt x)

where W is a multivariate Gaussian random variable with density function propor-tional to ψ Therefore by choosing ε1 and c1 appropriately we have

P

(sup|t|leδb

∣∣Wuβgt∣∣ gt

14b

)le P

(|Wu| gt b

c20δ

2

)le c1 exp

(minusε1b

2

δ4

)

for all u ge b 2

Proof of Lemma 713 Once again it suffices to prove the Lemma for the casea = 1 Since

fb(0) = b = bminusWbβgt(0) + g(0)

the covariance function (γ (s t) s t isin T ) of the centered field g satisfies γ(0 0) = 0It is also easy to check that

partsγ(s t) = O(|s|+ |t|) parttγ(s t) = O(|s|+ |t|)

40 ADLER BLANCHET LIU

Consequently there exists a constant cγ isin (0infin) for which

γ(s t) le cγ(|s|2 + |t|2) γ(s s) le cγ |s|2

We need to control the tail probability of sup|t|leδb |g(t)| For this it is useful tointroduce the following scaling Define

gδ (t) =b

δg

(δt

b

)

Then sup|t|leδb g (t) ge 14b if and only if sup|t|le1 gδ (t) ge 1

4δ Let

σδ (s t) = E (gδ (s) gδ (t))

ThensupsisinR

σδ(s s) le cγ

Because γ(s t) is at least twice differentiable applying a Taylor expansion we easilysee that the canonical metric dg corresponding to gδ (s) (cf Theorem 67) can bebounde as follows

d2g(s t) = E (gδ (s)minus gδ (t))2 =

b2

δ2

(δs

bδs

b

)+ γ

(δt

bδt

b

)minus 2γ

(δs

bδt

b

)]

le c |sminus t|2

for some constant c isin (0infin) Therefore the entropy of gδ evaluated at δ is boundedby Kδminusd for any δ gt 0 and with an appropriate choice of K gt 0 Therefore for allδ lt δ0

P ( sup|t|leδb

|g(t)| ge 14b

) = P ( sup|t|le1

gδ (t) ge 14δ

) le cdδminusdminusη exp(minus 1

16cγδ2)

for some constant cd and η gt 0 The last inequality is a direct application of The-orem 411 of [4] The conclusion of the lemma follows immediately by choosing c

and δ appropriately 2

REFERENCES

[1] RJ Adler K Bartz and S Kou Estimating curvatures in the Gaussian kinematic formulaIn preparation

[2] RJ Adler J Blanchet and J Liu Efficient simulation for tail probabilities of Gaussianrandom fields In WSC rsquo08 Proceedings of the 40th Conference on Winter Simulationpages 328ndash336 Winter Simulation Conference 2008

[3] RJ Adler P Muller and Rozovskii B editors Stochastic Modelling in Physical Oceanog-raphy Birkhauser Boston 1996

[4] RJ Adler and JE Taylor Random Fields and Geometry Springer New York USA 2007

[5] RJ Adler JE Taylor and K Worsley Applications of Random Fields and GeometryFoundations and Case Studies httpwebeetechnionacilpeopleadlerhrfpdf

MONTE CARLO FOR RANDOM FIELDS 41

[6] S Asmussen and P Glynn Stochastic Simulation Algorithms and Analysis Springer-VerlagNew York NY USA 2008

[7] JM Azais and M Wschebor Level Sets and Extrema of Random Processes and FieldsWiley New York NY 2009

[8] JM Bardeen JR Bond N Kaiser and AS Szalay The statistics of peaks of Gaussianrandom fields The Astrophysical Journal 30415ndash61 1986

[9] SM Berman Sojourns and Extremes of Stochastic Processes The Wadsworth ampBrooksCole StatisticsProbability Series Wadsworth amp BrooksCole Advanced Books ampSoftware Pacific Grove CA 1992

[10] J Blanchet Efficient importance sampling for binary contingency tables Ann of ApplProb 19949ndash982 2009

[11] C Borell The Brunn-Minkowski inequality in Gauss space Invent Math 30205ndash216 1975

[12] J Bucklew Introduction to Rare-Event Simulation Springer-Verlag New York NY 2004

[13] MR Dennis Nodal densities of planar Gaussian random waves Eur Phys J 145191ndash2102007

[14] RM Dudley Sample functions of the Gaussian process Ann Probab 166ndash103 1973

[15] KJ Friston KJ Worsley RSJ Frackowiak JC Mazziotta and AC Evans Assessing thesignificance of focal activations using their spatial extent Human Brain Mapping 1214ndash2201994

[16] MR Leadbetter G Lindgren and H Rootzen Extremes and Related Properties of RandomSequences and Processes Springer-Verlag New York 1983

[17] M Mitzenmacher and E Upfal Probability and Computing Randomized Algorithms andProbabilistic Analysis Cambridge University Press 2005

[18] W Niemiro and P Pokarowski Fixed precision MCMC estimation by median of productsof averages J Appl Probab 46309ndash329 2009

[19] V Piterbarg Asymptotic Methods in the Theory of Gaussian Processes American Mathe-matical Society USA 1995

[20] SF Shandarin Testing non-Gaussianity in cosmic microwave background maps by morpho-logical statistics Mon Not R Astr Soc 331865ndash874 2002

[21] SF Shandarin HA Feldman Y Xu and M Tegmark Morphological measures of non-gaussianity in cosmic microwave background maps Astrophys J Suppl 1411ndash11 2002

[22] L Shepp and H Landau On the supremum of a Gaussian process Sankhya A 32369ndash378December 1970

[23] JE Taylor A Takemura and RJ Adler Validity of the expected Euler characteristicheuristic Ann Probab 33(4)1362ndash1396 2005

[24] JE Taylor and KJ Worsley Detecting sparse signals in random fields with an applicationto brain mapping J Amer Statist Assoc 102(479)913ndash928 2007

[25] J Traub G Wasilokowski and H Wozniakowski Information-Based Complexity AcademicPress New York NY 1988

[26] BS Tsirelson IA Ibragimov and VN Sudakov Norms of Gaussian sample functionsProceedings of the Third Japan-USSR Symposium on Probability Theory (Tashkent 1975)55020ndash41 1976

[27] V S Tsirelrsquoson The density of the maximum of a gaussian process Theory of Probab Appl20847ndash856 1975

[28] H Wozniakowski Computational complexity of continuous problems 1996 Technical Re-port

Robert J AdlerElectrical EngineeringTechnion Haifa Israel 32000E-mail roberteetechnionacilURL webeetechnionacilpeopleadler

Jose H BlanchetIndustrial Engineering amp Operations ResearchColumbia Uninversity340 SW Mudd Building500 W 120 StreetNew York New York 10027E-mail joseblanchetcolumbiaeduURL wwwieorcolumbiaedufac-biosblanchetfacultyhtml

42 ADLER BLANCHET LIU

Jingchen LiuDepartment of StatisticsColumbia University1255 Amsterdam Avenue Room 1030New York NY 10027E-mail jcliustatcolumbiaeduURL statcolumbiaedu~jcliu

Page 3: EFFICIENT MONTE CARLO FOR HIGH EXCURSIONS OF … › ~jblanche › papers › GRF1.pdf · 1. Introduction. This paper centers on the design and analysis of e–cient Monte Carlo techniques

MONTE CARLO FOR RANDOM FIELDS 3

bull Quantum chaos Here random planar waves replace deterministic (but unob-tainable) solutions of Schrodinger equations eg the recent review [13]

bull Brain mapping This application is the most developed and very widely usedFor example the paper by Friston et al [15] that introduced random fieldmethodology to the brain imaging community has at the time of writingover 4500 (Google) citations

Many of these applications deal with twice differentiable constant variance ran-dom fields or random fields that have been normalized to have constant variancethe reason being that they require estimates of the tail probabilities (11) and theseare only really well known in the smooth constant variance case In particular itis known that with enough smoothness assumptions

(1.3)    liminf_{b→∞} −b^{−2} log | P( sup_{t∈T} f(t) ≥ b ) − E( χ({t ∈ T : f(t) ≥ b}) ) | ≥ 1/2 + 1/(2σ_c^2),

where χ(A) is the Euler characteristic of the set A, and the term σ_c^2 is related to a geometric quantity known as the critical radius of T and depends on the covariance structure of f. See [4, 23] for details. The expectation in (1.3) has an explicit form that is readily computed for Gaussian and Gaussian-related random fields of constant variance (see [4, 5] for details), although if T is geometrically complicated, or the covariance of f highly non-stationary, there can be terms in the expectation that can only be evaluated numerically or estimated statistically (e.g. [1, 24]). Nevertheless, when available, (1.3) provides excellent approximations, and simulation studies have shown that the approximations are numerically useful for quite moderate b, of the order of 2 standard deviations.

However, as we have already noted, (1.3) holds only for constant variance fields, which also need to be twice differentiable. In the case of less smooth f, other classes of results occur, in which the expansions are less reliable and, in addition, typically involve the unknown Pickands' constants (cf. [7, 19]).

These are some of the reasons why, despite a well developed theory, Monte Carlo techniques still have a significant role to play in understanding the behavior of Gaussian random fields at high levels. The estimators proposed in this paper basically reduce the rare event calculations associated with high excursions of Gaussian random fields to calculations that are roughly comparable to the evaluation of expectations or integrals in which no rare event is involved. In other words, the computational complexity required to implement the estimators discussed here is similar, in some sense, to the complexity required to evaluate a given integral in finite dimension, or an expectation, where no tail parameter is involved. To the best of our knowledge, these types of reductions have not been studied in the development of numerical methods for high excursions of random fields. This feature distinguishes the present work from the application of other numerical techniques that are generic (such as quasi-Monte Carlo and other numerical integration routines) and that, in particular, might also be applicable to the setting of Gaussian fields.

Contrary to our methods, which are designed to have provably good performance uniformly over the level of excursion, a generic numerical approximation procedure, such as quasi-Monte Carlo, will typically require an exponential increase in the number of function evaluations in order to maintain a prescribed level of relative accuracy. This phenomenon is unrelated to the setting of Gaussian random fields. In particular, it can easily be seen to happen even when evaluating a one dimensional integral with a small integrand. On the other hand, we believe that our estimators can, in practice, be easily combined with quasi-Monte Carlo or other numerical integration methods. The rigorous analysis of such hybrid approaches, although of great interest, requires an extensive treatment and will be pursued in the future. As an aside, we note that quasi-Monte Carlo techniques have been used in the excursion analysis of Gaussian random fields in [7].

The remainder of the paper is organized as follows. In Section 2 we introduce the basic notions of polynomial algorithmic complexity, which are borrowed from the general theory of computation. Section 3 discusses the main results in light of the complexity considerations of Section 2. Section 4 provides a brief introduction to importance sampling, a simulation technique that we shall use heavily in the design of our algorithms. The analysis of finite fields, which is given in Section 5, is helpful for developing the basic intuition behind our procedures for the general case. Section 6 provides the construction and analysis of a polynomial time algorithm for high excursion probabilities of Hölder continuous fields. Finally, in Section 7, we add additional smoothness assumptions, along with stationarity, and explain how to fine tune the construction of our procedures in order to further improve efficiency in these cases.

2. Basic Notions of Computational Complexity. In this section we shall discuss some general notions of efficiency and computational complexity related to the approximation of the probability of rare events {B_b : b ≥ b_0}, for which P(B_b) → 0 as b → ∞. In essence, efficiency means that computational complexity is, in some sense, controllable uniformly in b. A notion that is popular in the efficiency analysis of Monte Carlo methods for rare events is weak efficiency (also known as asymptotic optimality), which requires the coefficient of variation of a given estimator L_b of P(B_b) to be dominated by 1/P(B_b)^ε for any ε > 0. More formally, we have

Definition 2.1. A family of estimators {L_b : b ≥ b_0} is said to be polynomially efficient for estimating P(B_b) if E(L_b) = P(B_b) and there exists a q ∈ (0,∞) for which

(2.1)    sup_{b≥b_0} Var(L_b) / ( [P(B_b)]^2 |log P(B_b)|^q ) < ∞.

Moreover, if (2.1) holds with q = 0, then the family is said to be strongly efficient.

Below, we often refer to L_b as a strongly (polynomially) efficient estimator, by which we mean that the family {L_b : b > 0} is strongly (polynomially) efficient. In order to understand the nature of this definition, let L_b^{(j)}, 1 ≤ j ≤ n, be a collection of iid copies of L_b. The averaged estimator

L̄_n(b) = (1/n) ∑_{j=1}^n L_b^{(j)}


has a relative mean squared error equal to Var(L_b)^{1/2} / [ n^{1/2} P(B_b) ]. A simple consequence of Chebyshev's inequality is that

P( | L̄_n(b)/P(B_b) − 1 | ≥ ε ) ≤ Var(L_b) / ( ε^2 n [P(B_b)]^2 ).

Thus, if L_b is polynomially efficient and we wish to compute P(B_b) with at most ε relative error and at least 1 − δ confidence, it suffices to simulate

n = Θ( ε^{−2} δ^{−1} |log P(B_b)|^q )

iid replications of L_b. In fact, in the presence of polynomial efficiency, the bound n = Θ(ε^{−2} δ^{−1} |log P(B_b)|^q) can be boosted to n = Θ(ε^{−2} log(δ^{−1}) |log P(B_b)|^q) using the so-called median trick (see [18]).

Naturally, the cost per replication must also be considered in the analysis, and we shall do so, but the idea is that evaluating P(B_b) via crude Monte Carlo would require, given ε and δ, n = Θ(1/P(B_b)) replications. Thus a polynomially efficient estimator makes the evaluation of P(B_b) exponentially faster, relative to crude Monte Carlo, at least in terms of the number of replications.

Note that a direct application of deterministic algorithms (such as quasi-Monte Carlo or quadrature integration rules) might improve (under appropriate smoothness assumptions) the computational complexity relative to Monte Carlo, although only by a polynomial rate (i.e. the absolute error decreases to zero at rate n^{−p} for p > 1/2, where n is the number of function evaluations and p depends on the dimension of the function that one is integrating; see, for instance, [6]). We believe that the procedures that we develop in this paper can guide the construction of efficient deterministic algorithms with small relative error and with complexity that scales at a polynomial rate in |log P(B_b)|. This is an interesting research topic that we plan to explore in the future.

An issue that we shall face in designing our Monte Carlo procedure is that, due to the fact that f will have to be discretized, the corresponding estimator L_b will not be unbiased. In turn, in order to control the relative bias with an effort that is comparable to the bound on the number of replications discussed in the preceding paragraph, one must verify that the relative bias can be reduced to an amount less than ε, with probability at least 1 − δ, at a computational cost of the form O(ε^{−q_0} |log P(B_b)|^{q_1}). If L_b(ε) can be generated at O(ε^{−q_0} |log P(B_b)|^{q_1}) cost and satisfies | P(B_b) − E L_b(ε) | ≤ ε P(B_b), and if

sup_{b>0} Var(L_b(ε)) / ( P(B_b)^2 |log P(B_b)|^q ) < ∞

for some q ∈ (0,∞), then L'_n(b,ε) = ∑_{j=1}^n L_b^{(j)}(ε)/n, where the L_b^{(j)}(ε)'s are iid copies of L_b(ε), satisfies

P( | L'_n(b,ε)/P(B_b) − 1 | ≥ 2ε ) ≤ Var(L_b(ε)) / ( ε^2 n P(B_b)^2 ).


Consequently, taking n = Θ(ε^{−2} δ^{−1} |log P(B_b)|^q) suffices to give an estimator with at most ε relative error and 1 − δ confidence, and the total computational cost is Θ( ε^{−2−q_0} δ^{−1} |log P(B_b)|^{q+q_1} ).

We shall measure computational cost in terms of function evaluations, such as a single addition, a multiplication, a comparison, the generation of a single uniform random variable on T, the generation of a single standard Gaussian random variable, and the evaluation of Φ(x) for fixed x ≥ 0, where

Φ(x) = 1 − Ψ(x) = (1/√(2π)) ∫_{−∞}^x e^{−s^2/2} ds.

All of these function evaluations are assumed to cost at most a fixed amount c of computer time. Moreover, we shall also assume that first and second order moment characteristics of the field, such as μ(t) = E f(t) and C(s,t) = Cov(f(t), f(s)), can be computed in at most c units of computer time for each s, t ∈ T. We note that similar models of computation are often used in the complexity theory of continuous problems; see [25].

The previous discussion motivates the next definition, which has its roots in the general theory of computation, in both continuous and discrete settings [17, 25]. In particular, completely analogous notions in the setting of the complexity theory of continuous problems lead to the notion of "tractability" of a computational problem [28].

Definition 2.2. A Monte Carlo procedure is said to be a fully polynomial randomized approximation scheme (FPRAS) for estimating P(B_b) if, for some q, q_1, q_2 ∈ [0,∞), it outputs an averaged estimator that is guaranteed to have at most ε > 0 relative error with confidence at least 1 − δ ∈ (0,1) in Θ( ε^{−q_1} δ^{−q_2} |log P(B_b)|^q ) function evaluations.

The terminology adopted, namely FPRAS, is borrowed from the complexity theory of randomized algorithms for counting [17]. Many counting problems can be expressed as rare event estimation problems. In such cases it typically occurs that the previous definition (expressed in terms of a rare event probability) coincides precisely with the standard counting definition of a FPRAS (in which there is no reference to any rare event to estimate). This connection is noted, for instance, in [10]. Our terminology is motivated precisely by this connection.

By letting B_b = {f* > b}, the goal in this paper is to design a class of fully polynomial randomized approximation schemes that are applicable to a general class of Gaussian random fields. In turn, since our Monte Carlo estimators will be based on importance sampling, it turns out that we shall also be able to straightforwardly construct FPRAS's to estimate quantities such as E[Γ(f) | sup_{t∈T} f(t) > b] for a suitable class of functionals Γ, for which Γ(f) can be computed with an error of at most ε at a cost that is polynomial as a function of ε^{−1}. We shall discuss this observation in Section 4, which deals with properties of importance sampling.


3. Main Results. In order to state and discuss our main results we need some notation. For each s, t ∈ T, define

μ(t) = E(f(t)),   C(s,t) = Cov(f(s), f(t)),   σ^2(t) = C(t,t) > 0,   r(s,t) = C(s,t)/(σ(s)σ(t)).

Moreover, given x ∈ R^d and β > 0, we write |x| = ∑_{j=1}^d |x_j|, where x_j is the j-th component of x. We shall assume that, for each fixed s, t ∈ T, both μ(t) and C(s,t) can be evaluated in at most c units of computing time.

Our first result shows that, under modest continuity conditions on μ, σ and r, it is possible to construct an explicit FPRAS for w(b), under the following regularity conditions.

A1. The field f is almost surely continuous on T.
A2. For some δ > 0 and |s − t| < δ, the mean and variance functions satisfy

|σ(t) − σ(s)| + |μ(t) − μ(s)| ≤ κ_H |s − t|^β.

A3. For some δ > 0 and |s − s'| < δ, |t − t'| < δ, the correlation function of f satisfies

|r(t,s) − r(t',s')| ≤ κ_H [ |t − t'|^β + |s − s'|^β ].

A4. 0 ∈ T. There exist κ_0 and ω_d such that, for any t ∈ T and ε small enough,

m(B(t,ε) ∩ T) ≥ κ_0 ε^d ω_d,

where m is Lebesgue measure, B(t,ε) = {s : |t − s| ≤ ε}, and ω_d = m(B(0,1)).

The assumption that 0 ∈ T is of no real consequence and is adopted only for notational convenience.

Theorem 3.1. Suppose that f : T → R is a Gaussian random field satisfying Conditions A1–A4 above. Then Algorithm 6.1 provides a FPRAS for w(b).

The polynomial rate of the intrinsic complexity bound inherent in this result is discussed in Section 6, along with similar rates in results to follow. The conditions of the previous theorem are weak, and hold for virtually all applied settings involving continuous Gaussian fields on compact sets.

Not surprisingly, the complexity bounds of our algorithms can be improved upon under additional assumptions on f. For example, in the case of finite fields (i.e. when T is finite) with a non-singular covariance matrix, we can show that the complexity of the algorithm is actually bounded as b → ∞. We summarize this in the next result, whose proof, which is given in Section 5, is useful for understanding the main ideas behind the general procedure.

Theorem 3.2. Suppose that T is a finite set and f has a non-singular covariance matrix over T × T. Then Algorithm 5.3 provides a FPRAS with q = 0.


As we indicated above, the strategy behind the discrete case provides the basis for the general case. In the general situation, the underlying idea is to discretize the field with an appropriate sampling (discretization) rule that depends on the level b and the continuity characteristics of the field. The number of sampling points grows as b increases, and the complexity of the algorithm is controlled by finding a good sampling rule. There is a trade off between the number of points sampled, which has a direct impact on the complexity of the algorithm, and the fidelity of the discrete approximation to the continuous field. Naturally, in the presence of enough smoothness and regularity, more information can be obtained with the same sample size. This point is illustrated in the next result, Theorem 3.3, which considers smooth homogeneous fields. Note that, in addition to controlling the error induced by discretizing the field, the variance is strongly controlled, and the discretization rule is optimal in a sense explained in Section 7. For Theorem 3.3 we require the following additional regularity conditions.

B1. f is homogeneous and almost surely twice continuously differentiable.
B2. 0 ∈ T ⊂ R^d is a d-dimensional convex set with nonempty interior. Denoting its boundary by ∂T, assume that ∂T is a (d−1)-dimensional manifold without boundary. For any t ∈ T, assume that there exists κ_0 > 0 such that

m(B(t,ε) ∩ T) ≥ κ_0 ε^d

for any ε < 1, where m is Lebesgue measure.

Theorem 3.3. Let f satisfy conditions B1–B2. Then Algorithm 7.3 provides a FPRAS. Moreover, the underlying estimator is strongly efficient, and there exists a discretization scheme for f which is optimal in the sense of Theorem 7.4.

4. Importance Sampling. Importance sampling is based on the basic identity, for fixed measurable B,

(4.1)    P(B) = ∫ 1(ω ∈ B) dP(ω) = ∫ 1(ω ∈ B) (dP/dQ)(ω) dQ(ω),

where we assume that the probability measure Q is such that Q(· ∩ B) is absolutely continuous with respect to the measure P(· ∩ B). If we use E_Q to denote expectation under Q, then (4.1) trivially yields that the random variable

L(ω) = 1(ω ∈ B) (dP/dQ)(ω)

is an unbiased estimator for P(B) > 0 under the measure Q or, symbolically, E_Q L = P(B).

An averaged importance sampling estimator based on the measure Q, which is often referred to as an importance sampling distribution or a change-of-measure, is obtained by simulating n iid copies L^{(1)}, …, L^{(n)} of L under Q, and computing the empirical average L̄_n = (L^{(1)} + ⋯ + L^{(n)})/n. A central issue is that of selecting Q in order to minimize the variance of L̄_n. It is easy to verify that, if Q*(·) = P(·|B) = P(· ∩ B)/P(B), then the corresponding estimator has zero variance. However, Q* is clearly a change of measure that is of no practical value, since P(B) – the quantity that we are attempting to evaluate in the first place – is unknown. Nevertheless, when constructing a good importance sampling distribution for a family of sets {B_b : b ≥ b_0}, for which 0 < P(B_b) → 0 as b → ∞, it is often useful to analyze the asymptotic behavior of Q* as P(B_b) → 0 in order to guide the construction of a good Q.

We now describe briefly how an efficient importance sampling estimator for P(B_b) can also be used to estimate a large class of conditional expectations given B_b. Suppose that a single replication of the corresponding importance sampling estimator

L_b ≜ 1(ω ∈ B_b) dP/dQ

can be generated in O(|log P(B_b)|^{q_0}) function evaluations for some q_0 > 0, and that

Var(L_b) = O( [P(B_b)]^2 |log P(B_b)|^{q_0} ).

These assumptions imply that, by taking the average of iid replications of L_b, we obtain a FPRAS. Fix β ∈ (0,∞) and let X(β,q) be the class of random variables X satisfying 0 ≤ X ≤ β with

(4.2)    E[X | B_b] = Ω( 1/|log P(B_b)|^q ).

Then, by noting that

(4.3)    E_Q(X L_b)/E_Q(L_b) = E[X | B_b] = E[X; B_b]/P(B_b),

it follows easily that a FPRAS can be obtained by constructing the natural estimator for E[X | B_b], i.e. the ratio of the corresponding averaged importance sampling estimators suggested by the ratio on the left of (4.3). Of course, when X is difficult to simulate exactly, one must assume that the bias in E[X; B_b] can be reduced in polynomial time. The estimator is naturally biased, but the discussion of FPRAS's based on biased estimators given in Section 2 can be applied directly.

In the context of Gaussian random fields we have that B_b = {f* > b}, and one is very often interested in random variables X of the form X = Γ(f), where Γ : C(T) → R and C(T) denotes the space of continuous functions on T. Endowing C(T) with the uniform topology, consider functions Γ that are non-negative and bounded by a positive constant. An archetypical example, given by the volume of (conditioned) high level excursion sets, with β = m(T), is known to satisfy (4.2). However, there are many other examples of X(β,q) with β = m(T) which satisfy (4.2) for a suitable q, depending on the regularity properties of the field. In fact, if the mean and covariance properties of f are Hölder continuous then, using techniques similar to those given in the arguments of Section 6, it is not difficult to see that q can be estimated. In turn, we have that a FPRAS based importance sampling algorithm for w(b) would typically also yield a polynomial time algorithm for functional characteristics of the conditional field given high level excursions. Since this is a very important and novel application, we devote the remainder of this paper to the development of efficient importance sampling algorithms for w(b).

5. The Basic Strategy: Finite Fields. In this section we develop our main ideas in the setting in which T is a finite set of the form T = {t_1, …, t_M}. To emphasize the discrete nature of our algorithms, in this section we write X_i = f(t_i) for 1 ≤ i ≤ M, and set X = (X_1, …, X_M). This section is mainly of an expository nature, since much of it has already appeared in [2]. Nevertheless, it is included here as a useful guide to the intuition behind the algorithms for the continuous case.

We have already noted that, in order to design an efficient importance sampling estimator for w(b) = P(max_{1≤i≤M} X_i > b), it is useful to study the asymptotic conditional distribution of X given that max_{1≤i≤M} X_i > b. Thus we begin with some basic large deviation results.

Proposition 5.1. For any set of random variables X_1, …, X_M,

max_{1≤i≤M} P(X_i > b) ≤ P( max_{1≤i≤M} X_i > b ) ≤ ∑_{j=1}^M P(X_j > b).

Moreover, if the X_j are mean zero Gaussian, and the correlation between X_i and X_j is strictly less than 1, then

P(X_i > b, X_j > b) = o( max[ P(X_i > b), P(X_j > b) ] ).

Thus, if the associated covariance matrix of X is non-singular,

w(b) = (1 + o(1)) ∑_{j=1}^M P(X_j > b).

Proof. The lower bound in the first display is trivial, and the upper bound follows by the union bound. The second display follows easily by working with the joint density of a bivariate Gaussian distribution (e.g. [9, 16]), and the third claim is a direct consequence of the inclusion-exclusion principle. □

As noted above, Q* corresponds to the conditional distribution of X given that X* ≜ max_{1≤i≤M} X_i > b. It follows from Proposition 5.1 that, conditional on X* > b, the probability that two or more of the X_j exceed b is negligible. Moreover, it also follows that

P(X_i = X* | X* > b) = (1 + o(1)) P(X_i > b) / ∑_{j=1}^M P(X_j > b).

The following corollary now follows as an easy consequence of these observations.

Corollary 5.2.

d_TV(Q*, Q) → 0

as b → ∞, where d_TV denotes the total variation norm, and Q is defined, for Borel B ⊂ R^M, as

Q(X ∈ B) = ∑_{i=1}^M p(i,b) P[X ∈ B | X_i > b],

where

p(i,b) = P(X_i > b) / ∑_{j=1}^M P(X_j > b).

Proof. Pick an arbitrary Borel B. Then we have that

Q*(X ∈ B) = P[X ∈ B, max_{1≤i≤M} X_i > b] / w(b)
          ≤ ∑_{i=1}^M P[X ∈ B, X_i > b] / w(b)
          = ∑_{i=1}^M P[X ∈ B | X_i > b] p(i,b) (1 + o(1)).

The above, which follows from the union bound and the last part of Proposition 5.1 combined with the definition of Q, yields that, for each ε > 0, there exists b_0 (independent of B) such that, for all b ≥ b_0,

Q*(X ∈ B) ≤ Q(X ∈ B)/(1 − ε).

The lower bound follows similarly, using the inclusion-exclusion principle and the second part of Proposition 5.1. □

Corollary 5.2 provides support for choosing Q as an importance sampling distribution. Of course, we still have to verify that the corresponding algorithm is a FPRAS. The importance sampling estimator induced by Q takes the form

(5.1)    L_b = dP/dQ = ∑_{j=1}^M P(X_j > b) / ∑_{j=1}^M 1(X_j > b).

Note that under Q we have that X* > b almost surely, so the denominator in the expression for L_b is at least 1. Therefore we have that

E_Q L_b^2 ≤ ( ∑_{j=1}^M P(X_j > b) )^2,

and, by virtue of Proposition 5.1, we conclude (using Var_Q to denote the variance under Q) that

Var_Q(L_b) / P(X* > b)^2 → 0


as b → ∞. In particular, it follows that L_b is strongly efficient. Our proposed algorithm can now be summarized as follows.

Algorithm 5.3. There are two steps in the algorithm.

Step 1. Simulate n iid copies X^{(1)}, …, X^{(n)} of X from the distribution Q.
Step 2. Compute and output

L̄_n = (1/n) ∑_{i=1}^n L_b^{(i)},   where   L_b^{(i)} = ∑_{j=1}^M P(X_j^{(i)} > b) / ∑_{j=1}^M 1(X_j^{(i)} > b).

Since the generation of each L_b^{(i)} under Q takes O(M^3) function evaluations, we conclude, based on the analysis given in Section 2, that Algorithm 5.3 is a FPRAS with q = 0. This implies Theorem 3.2, as promised.

6. A FPRAS for Hölder Continuous Gaussian Fields. In this section we shall describe the algorithm and the analysis behind Theorem 3.1. Throughout the section, unless stated otherwise, we assume conditions A1–A4 of Section 3.

There are two issues related to the complexity analysis. Firstly, since f is assumed continuous, the entire field cannot be generated in a (discrete) computer, and so the algorithm used in the discrete case needs adaptation. Once this is done, we need to carry out an appropriate variance analysis.

Developing an estimator directly applicable to the continuous field will be carried out in Section 6.1. This construction will not only be useful when studying the performance of a suitable discretization, but will also help to explain some of the features of our discrete construction. Then, in Section 6.2, we introduce a discretization approach and study the bias caused by the discretization. In addition, we provide bounds on the variance of this discrete importance sampling estimator.

6.1. A Continuous Estimator. We start with a change of measure motivated by the discrete case in Section 5. A natural approach is to consider an importance sampling strategy analogous to that of Algorithm 5.3. The continuous adaptation involves first sampling τ_b according to the probability measure

(6.1)    P(τ_b ∈ ·) = E[m(A_b ∩ ·)] / E[m(A_b)],

where A_b = {t ∈ T : f(t) > b}. The idea of introducing τ_b in the continuous setting is not necessarily to locate the point at which the maximum is achieved, as was the situation in the discrete case. Rather, τ_b will be used to find a random point which has a reasonable probability of being in the excursion set A_b. (This probability will tend to be higher if f is non-homogeneous.) This relaxation will prove useful in the analysis of the algorithm. Note that τ_b, with the distribution indicated in equation (6.1), has a density function (with respect to Lebesgue measure) given by

h_b(t) = P(f(t) > b) / E[m(A_b)],


and that we can also write

E[m(A_b)] = E ∫_T 1(f(t) > b) dt = ∫_T P(f(t) > b) dt = m(T) P(f(U) > b),

where U is uniformly distributed over T. Once τ_b is generated, the natural continuous adaptation of the strategy described by Algorithm 5.3 proceeds by sampling f conditional on f(τ_b) > b. Note that, if we use Q to denote the change-of-measure induced by such a continuous sampling strategy, then the corresponding importance sampling estimator takes the form

L_b = dP/dQ = E[m(A_b)] / m(A_b).

The second moment of the estimator then satisfies

(6.2)    E_Q[(L_b)^2] = E(L_b; A_b ≠ ∅) = E[m(A_b)] P(f* > b) E[ m(A_b)^{−1} | A_b ≠ ∅ ].

Unfortunately, it is easy to construct examples for which E_Q[(L_b)^2] is infinite. Consequently, the above construction needs to be modified slightly.

Elementary extreme value theory considerations for the maximum of finitely many Gaussian random variables, and their analogues for continuous fields, give that the overshoot of f over a given level b will be of order Θ(1/b). Thus, in order to keep τ_b reasonably close to the excursion set, we shall also consider the possibility of an undershoot of size Θ(1/b) right at τ_b. As we shall see, this relaxation will allow us to prevent the variance in (6.2) from becoming infinite. Thus, instead of (6.1), we shall consider τ_{b−a/b} with density

(6.3)    h_{b−a/b}(t) = P(f(t) > b − a/b) / E[m(A_{b−a/b})],

for some a > 0. To ease later notation, write

γ_{a,b} ≜ b − a/b,   τ_{γ_{a,b}} = τ_{b−a/b}.

Let Q' be the change of measure induced by sampling f as follows. Given τ_{γ_{a,b}}, sample f(τ_{γ_{a,b}}) conditional on f(τ_{γ_{a,b}}) > γ_{a,b}. In turn, the rest of f follows its conditional distribution (under the nominal, or original, measure) given the observed value of f(τ_{γ_{a,b}}). We then have that the corresponding Radon–Nikodym derivative is

(6.4)    dP/dQ' = E[m(A_{γ_{a,b}})] / m(A_{γ_{a,b}}),

and the importance sampling estimator L'_b is

(6.5)    L'_b = (dP/dQ') 1(A_b ≠ ∅) = ( E[m(A_{γ_{a,b}})] / m(A_{γ_{a,b}}) ) 1(m(A_b) > 0).

Note that we have used the continuity of the field in order to write {m(A_b) > 0} = {A_b ≠ ∅} almost surely. The motivation behind this choice lies in the fact that, since m(A_{γ_{a,b}}) ≥ m(A_b) > 0 on this event, the denominator may now be big enough to control the second moment of the estimator. As we shall see, introducing the undershoot of size a/b will be very useful in the technical development, both in the remainder of this section and in Section 7. In addition, its introduction also provides insight into the appropriate form of the estimator needed when discretizing the field.

the second moment of the estimator As we shall see introducing the undershootof size ab will be very useful in the technical development both in the remainderof this section and in Section 7 In addition its introduction also provides insightinto the appropriate form of the estimator needed when discretizing the field

6.2. Algorithm and Analysis. We still need to face the problem of generating f in a computer. Thus we now concentrate on a suitable discretization scheme, still having in mind the change of measure leading to (6.4). Since our interest is to ultimately design algorithms that are efficient for estimating expectations such as E[Γ(f) | f* > b], where Γ may be a functional of the whole field, we shall use a global discretization scheme.

Consider U = (U_1, …, U_M), where the U_i are iid uniform random variables taking values in T and independent of the field f. Set T_M = {U_1, …, U_M} and X_i = X_i(U_i) = f(U_i) for 1 ≤ i ≤ M. Then X = (X_1, …, X_M) (conditional on U) is a multivariate Gaussian random vector with conditional means μ(U_i) ≜ E(X_i | U_i) and covariances C(U_i, U_j) ≜ Cov(X_i, X_j | U_i, U_j). Our strategy is to approximate w(b) by

w_M(γ_{a,b}) = P( max_{t∈T_M} f(t) > γ_{a,b} ) = E[ P( max_{1≤i≤M} X_i > γ_{a,b} | U ) ].

Given the development in Section 5, it might not be surprising that, if we can ensure that M = M(ε,b) is polynomial in 1/ε and b, then we shall be in a good position to develop a FPRAS. The idea is to apply an importance sampling strategy similar to that considered in the construction of L'_b of (6.5), but this time conditional on U. In view of our earlier discussions, we propose sampling from Q'' defined via

Q''(X ∈ B | U) = ∑_{i=1}^M p_U(i) P[X ∈ B | X_i > γ_{a,b}, U],

where

p_U(i) = P(X_i > γ_{a,b} | U) / ∑_{j=1}^M P(X_j > γ_{a,b} | U).

We then obtain the (conditional) importance sampling estimator

(6.6)    L_b(U) = ∑_{i=1}^M P(X_i > γ_{a,b} | U) / ∑_{i=1}^M 1(X_i > γ_{a,b}).

It is clear that

w_M(γ_{a,b}) = E_{Q''}[L_b(U)].

Suppose, for the moment, that M, a, and the number of replications n have been chosen; our forthcoming analysis will, in particular, guide the selection of these parameters. Then the procedure is summarized by the next algorithm.


Algorithm 6.1. The algorithm has three steps.

Step 1. Simulate U^{(1)}, …, U^{(n)}, which are n iid copies of the vector U = (U_1, …, U_M) described above.
Step 2. Conditional on each U^{(i)}, for i = 1, …, n, generate L_b^{(i)}(U^{(i)}) as described by (6.6), by considering the distribution of X^{(i)}(U^{(i)}) = (X_1^{(i)}(U_1^{(i)}), …, X_M^{(i)}(U_M^{(i)})). Generate the X^{(i)}(U^{(i)}) independently, so that at the end we obtain that the L_b^{(i)}(U^{(i)}) are n iid copies of L_b(U).
Step 3. Output

L̄_n(U^{(1)}, …, U^{(n)}) = (1/n) ∑_{i=1}^n L_b^{(i)}(U^{(i)}).

6.3. Running Time of Algorithm 6.1: Bias and Variance Control. The remainder of this section is devoted to the analysis of the running time of Algorithm 6.1. The first step lies in estimating the bias and second moment of L_b(U) under the change of measure induced by the sampling strategy of the algorithm, which we denote by Q''. We start with a simple bound for the second moment.

Proposition 6.2. There exists a finite λ_0, depending on μ_T = max_{t∈T} |μ(t)| and σ_T^2 = max_{t∈T} σ^2(t), for which

E_{Q''}[L_b(U)^2] ≤ λ_0 M^2 P( max_{t∈T} f(t) > b )^2.

Proof. Observe that

E_{Q''}[L_b(U)^2] ≤ E( ∑_{i=1}^M P(X_i > γ_{a,b} | U_i) )^2
                 ≤ E( ∑_{i=1}^M sup_{t∈T} P(f(U_i) > γ_{a,b} | U_i = t) )^2
                 = M^2 max_{t∈T} P(f(t) > γ_{a,b})^2
                 ≤ λ_0 M^2 max_{t∈T} P(f(t) > b)^2
                 ≤ λ_0 M^2 P( max_{t∈T} f(t) > b )^2,

which completes the proof. □

Next we obtain a preliminary estimate of the bias.


Proposition 6.3. For each M ≥ 1, we have

|w(b) − w_M(γ_{a,b})| ≤ E[ exp( −M m(A_{γ_{a,b}})/m(T) ); A_b ∩ T ≠ ∅ ]
  + P( max_{t∈T} f(t) > γ_{a,b}, max_{t∈T} f(t) ≤ b ).

Proof. Note that

|w(b) − w_M(γ_{a,b})| ≤ P( max_{t∈T_M} f(t) ≤ γ_{a,b}, max_{t∈T} f(t) > b ) + P( max_{t∈T_M} f(t) > γ_{a,b}, max_{t∈T} f(t) ≤ b ).

The second term is easily bounded by

P( max_{t∈T_M} f(t) > γ_{a,b}, max_{t∈T} f(t) ≤ b ) ≤ P( max_{t∈T} f(t) > γ_{a,b}, max_{t∈T} f(t) ≤ b ).

The first term can be bounded as follows:

P( max_{t∈T_M} f(t) ≤ γ_{a,b}, max_{t∈T} f(t) > b )
  ≤ E[ ( P[f(U_i) ≤ γ_{a,b} | f] )^M 1(A_b ∩ T ≠ ∅) ]
  ≤ E[ ( 1 − m(A_{γ_{a,b}})/m(T) )^M ; A_b ∩ T ≠ ∅ ]
  ≤ E[ exp( −M m(A_{γ_{a,b}})/m(T) ); A_b ∩ T ≠ ∅ ].

This completes the proof. □

The previous proposition shows that controlling the relative bias of L_b(U) requires finding bounds for

(6.7)    E[ exp( −M m(A_{γ_{a,b}})/m(T) ); A_b ∩ T ≠ ∅ ]

and

(6.8)    P( max_{t∈T} f(t) > γ_{a,b}, max_{t∈T} f(t) ≤ b ),

and so we develop these. To bound (6.7), we take advantage of the importance sampling strategy based on Q', introduced earlier in (6.4). Write

(6.9)    E[ exp( −M m(A_{γ_{a,b}})/m(T) ); m(A_b) > 0 ]
         = E_{Q'}( exp( −M m(A_{γ_{a,b}})/m(T) ) / m(A_{γ_{a,b}}) ; m(A_b) > 0 ) E m(A_{γ_{a,b}}).


Furthermore, note that for each α > 0 we have

(6.10)    E_{Q'}( exp( −M m(A_{γ_{a,b}})/m(T) ) / m(A_{γ_{a,b}}) ; m(A_b) > 0 )
          ≤ α^{−1} exp( −Mα/m(T) )
          + E_{Q'}( exp( −M m(A_{γ_{a,b}})/m(T) ) / m(A_{γ_{a,b}}) ; m(A_{γ_{a,b}}) ≤ α, m(A_b) > 0 ).

The next result, whose proof is given in Section 6.4, gives a bound for the above expectation.

Proposition 6.4. Let β be as in Conditions A2 and A3. For any v > 0, there exist constants κ, λ_2 ∈ (0,∞) (independent of a ∈ (0,1) and b, but dependent on v) such that, if we select

α^{−1} ≥ κ^{d/β} (b/a)^{(2+v)2d/β},

and define W such that P(W > x) = exp(−x^{β/d}) for x ≥ 0, then

(6.11)    E_{Q'}( exp( −M m(A_{γ_{a,b}})/m(T) ) / m(A_{γ_{a,b}}) ; m(A_{γ_{a,b}}) ≤ α, m(A_b) > 0 )
          ≤ ( E(W^2) m(T) / ( λ_2^{2d/β} M ) ) (b/a)^{4d/β}.

The following result gives us a useful upper bound on (6.8). The proof is given in Section 6.5.

Proposition 6.5. Assume that Conditions A2 and A3 are in force. For any v > 0, let ρ = 2d/β + dv + 1, where d is the dimension of T. There exist constants b_0, λ ∈ (0,∞) (independent of a, but depending on μ_T = max_{t∈T} |μ(t)|, σ_T^2 = max_{t∈T} σ^2(t), v, and the Hölder parameters β and κ_H) so that, for all b ≥ b_0 ≥ 1, we have

(6.12)    P( max_{t∈T} f(t) ≤ b + a/b | max_{t∈T} f(t) > b ) ≤ λ a b^ρ.

Consequently,

P( max_{t∈T} f(t) > γ_{a,b}, max_{t∈T} f(t) ≤ b )
  = P( max_{t∈T} f(t) ≤ b | max_{t∈T} f(t) > γ_{a,b} ) P( max_{t∈T} f(t) > γ_{a,b} )
  ≤ λ a b^ρ P( max_{t∈T} f(t) > γ_{a,b} ).

Moreover,

P( max_{t∈T} f(t) > γ_{a,b} )(1 − λ a b^ρ) ≤ P( max_{t∈T} f(t) > b ).

Propositions 6.4 and 6.5 allow us to prove Theorem 3.1, which is rephrased in the form of the following theorem, containing the detailed rate of complexity, and thus constituting the main result of this section.

Theorem 6.6. Suppose f is a Gaussian random field satisfying conditions A1–A4 in Section 3. Given any v > 0, put a = ε/(4λb^ρ) (with λ and ρ as in Proposition 6.5) and α^{−1} = κ^{d/β}(b/a)^{(2+v)d/β}. Then there exist c, ε_0 > 0 such that, for all ε ≤ ε_0,

(6.13)    |w(b) − w_M(γ_{a,b})| ≤ w(b) ε

if M = ⌈ c ε^{−1} (b/a)^{(4+4v)d/β} ⌉. Consequently, by our discussion in Section 2 and the bound on the second moment given in Proposition 6.2, it follows that Algorithm 6.1 provides a FPRAS with running time O( M^3 × M^2 × ε^{−2} δ^{−1} ).

Proof. Combining (6.3), (6.9) and (6.10) with Propositions 6.3–6.5, we have that

|w(b) − w_M(b)| ≤ α^{−1} exp( −Mα/m(T) ) E[m(A_{γ_{a,b}})]
  + ( E[W^2] m(T) / ( λ_2^{2d/β} M ) ) (b/a)^{4d/β} E[m(A_{γ_{a,b}})]
  + ( λ a b^ρ / (1 − λ a b^ρ) ) w(b).

Furthermore, there exists a constant K ∈ (0,∞) such that

E m(A_{γ_{a,b}}) ≤ K max_{t∈T} P(f(t) > b) m(T) ≤ K w(b) m(T).

Therefore we have that

|w(b) − w_M(b)| / w(b) ≤ α^{−1} K m(T) exp( −Mα/m(T) )
  + ( E[W^2] K m(T)^2 / ( λ_2^{2d/β} M ) ) (b/a)^{4d/β}
  + λ a b^ρ / (1 − λ a b^ρ).

Moreover, since a = ε/(4λb^ρ), we obtain that, for ε ≤ 1/2,

|w(b) − w_M(b)| / w(b) ≤ α^{−1} K m(T) exp( −Mα/m(T) )
  + ( E[W^2] K m(T)^2 / ( λ_2^{2d/β} M ) ) (b/a)^{4d/β} + ε/2.


From the selection of α, M and a, it follows easily that the first two terms on the right hand side of the previous display can be made less than ε/2 for all ε ≤ ε_0, by taking ε_0 sufficiently small.

The complexity count given in the theorem now corresponds to the following estimates. The factor O(M^3) represents the cost of the Cholesky factorization required to generate a single replication of a finite field of dimension M. In addition, the second part of Proposition 6.2 gives us that O(M^2 ε^{−2} δ^{−1}) replications are required to control the relative variance of the estimator. □

We now proceed to prove Propositions 6.4 and 6.5.

6.4. Proof of Proposition 6.4. We concentrate on the analysis of the left hand side of (6.11). An important observation is that, conditional on the random variable τ_{γ_{a,b}}, with distribution

Q'( τ_{γ_{a,b}} ∈ · ) = E[ m(A_{γ_{a,b}} ∩ ·) ] / E[ m(A_{γ_{a,b}}) ],

and given f(τ_{γ_{a,b}}), the rest of the field, namely (f(t) : t ∈ T∖{τ_{γ_{a,b}}}), is another Gaussian field with a computable mean and covariance structure. The second term in (6.10) indicates that we must estimate the probability that m(A_{γ_{a,b}}) takes small values under Q'. For this purpose, we shall develop an upper bound for

(6.14)    P( m(A_{γ_{a,b}}) < y^{−1}, m(A_b) > 0 | f(t) = γ_{a,b} + z/γ_{a,b} ),

for y large enough. Our argument proceeds in two steps. For the first, in order to study (6.14), we shall estimate the conditional mean and covariance of {f(s) : s ∈ T} given that f(t) = γ_{a,b} + z/γ_{a,b}. Then we use the fact that the conditional field is also Gaussian, and take advantage of general results from the theory of Gaussian random fields, to obtain a bound for (6.14). For this purpose, we recall some useful results from the theory of Gaussian random fields. The first result is due to Dudley [14].

Theorem 6.7. Let U be a compact subset of R^n and let {f_0(t) : t ∈ U} be a mean zero, continuous Gaussian random field. Define the canonical metric d on U by

d(s,t) = √( E[f_0(t) − f_0(s)]^2 ),

and put diam(U) = sup_{s,t∈U} d(s,t), which is assumed to be finite. Then there exists a finite universal constant κ > 0 such that

E[ max_{t∈U} f_0(t) ] ≤ κ ∫_0^{diam(U)/2} [log N(ε)]^{1/2} dε,

where the entropy N(ε) is the smallest number of d-balls of radius ε whose union covers U.


The second general result that we shall need is the so-called B-TIS (Borell–Tsirelson–Ibragimov–Sudakov) inequality [4, 11, 26].

Theorem 6.8. Under the setting described in Theorem 6.7,

P( max_{t∈U} f_0(t) − E[max_{t∈U} f_0(t)] ≥ b ) ≤ exp( −b^2/(2σ_U^2) ),

where

σ_U^2 = max_{t∈U} E[f_0^2(t)].

We can now proceed with the main proof. We shall assume from now on that τ_{γ_{a,b}} = 0 since, as will be obvious from what follows, all estimates hold uniformly over τ_{γ_{a,b}} ∈ T. This is a consequence of the uniform Hölder assumptions A2 and A3. Define a new process f̃ via

( f̃(t) : t ∈ T ) =_d ( f(t) : t ∈ T | f(0) = γ_{a,b} + z/γ_{a,b} ).

Note that we can always write f̃(t) = μ̃(t) + g(t), where g is a mean zero Gaussian random field on T. We have that

μ̃(t) = E f̃(t) = μ(t) + σ(0)^{−2} C(0,t) ( γ_{a,b} + z/γ_{a,b} − μ(0) ),

and that the covariance function of f̃ is given by

C_g(s,t) = Cov(g(s), g(t)) = C(s,t) − σ(0)^{−2} C(0,s) C(0,t).

The following lemma describes the behavior of μ̃(t) and C_g(s,t).

Lemma 6.9. Assume that |s| and |t| are small enough. Then the following three conclusions hold.

(i) There exist constants λ_0 and λ_1 > 0 such that

| μ̃(t) − (γ_{a,b} + z/γ_{a,b}) | ≤ λ_0 |t|^β + λ_1 |t|^β (γ_{a,b} + z/γ_{a,b}),

and, for all z ∈ (0,1) and γ_{a,b} large enough,

| μ̃(s) − μ̃(t) | ≤ κ_H γ_{a,b} |s − t|^β.

(ii) C_g(s,t) ≤ 2κ_H σ(t)σ(s) { |t|^β + |s|^β + |t − s|^β }.

(iii) D_g(s,t) = √( E([g(t) − g(s)]^2) ) ≤ λ_1^{1/2} |t − s|^{β/2}.

Proof. All three conclusions follow from simple algebraic manipulations. The details are omitted. □


Proposition 6.10. For any v > 0, there exist κ and λ_2 such that, for all t ∈ T, y^{−β/d} ≤ a^{2+v}/(κ b^{2+v}), a sufficiently small, and z > 0,

P( m(A_{γ_{a,b}})^{−1} > y, m(A_b) > 0 | f(t) = γ_{a,b} + z/γ_{a,b} ) ≤ exp( −λ_2 a^2 y^{β/d} / b^2 ).

Proof. For notational simplicity, and without loss of generality, we assume that t = 0. First consider the case z ≥ 1. Then there exist c_1, c_2 such that, for all c_2 y^{−β/d} < b^{−2−v} and z > 1,

P( m(A_{γ_{a,b}} ∩ T)^{−1} > y, m(A_b) > 0 | f(0) = γ_{a,b} + z/γ_{a,b} )
  ≤ P( inf_{|t|<c_1 y^{−1/d}} f(t) ≤ γ_{a,b} | f(0) = γ_{a,b} + z/γ_{a,b} )
  = P( inf_{|t|<c_1 y^{−1/d}} { μ̃(t) + g(t) } ≤ γ_{a,b} )
  ≤ P( inf_{|t|<c_1 y^{−1/d}} g(t) ≤ −1/(2γ_{a,b}) ).

Now apply (iii) from Lemma 6.9, from which it follows that N(ε) ≤ c_3 m(T) ε^{−2d/β} for some constant c_3. By Theorem 6.7, E( sup_{|t|<c_1 y^{−1/d}} g(t) ) = O( y^{−β/(2d)} log y ). By Theorem 6.8, for some constant c_4,

P( m(A_{γ_{a,b}} ∩ T)^{−1} > y, m(A_b) > 0 | f(0) = γ_{a,b} + z/γ_{a,b} )
  ≤ P( inf_{|t|<c_1 y^{−1/d}} g(t) ≤ −1/(2γ_{a,b}) )
  ≤ exp( −1/( c_4 γ_{a,b}^2 y^{−β/d} ) ),

for c_2 y^{−β/d} < b^{−2−v} and z > 1.

Now consider the case z ∈ (0,1). Let t* be the global maximum of f(t). Then

P( m(A_{γ_{a,b}} ∩ T)^{−1} > y, m(A_b) > 0 | f(0) = γ_{a,b} + z/γ_{a,b} )
  ≤ P( inf_{|t−t*|<c_1 y^{−1/d}} f(t) < γ_{a,b}, f(t*) > b | f(0) = γ_{a,b} + z/γ_{a,b} )
  ≤ P( sup_{|s−t|<c_1 y^{−1/d}} |f(s) − f(t)| > a/b | f(0) = γ_{a,b} + z/γ_{a,b} ).

Consider the new field ξ(s,t) = g(s) − g(t), with parameter space T × T. Note that

√Var(ξ(s,t)) = D_g(s,t) ≤ λ_1^{1/2} |s − t|^{β/2}.

Via basic algebra, it is not hard to show that the entropy of ξ(s,t) is bounded by N_ξ(ε) ≤ c_3 m(T × T) ε^{−2d/β}. In addition, from (i) of Lemma 6.9, we have

| μ̃(s) − μ̃(t) | ≤ κ_H γ_{a,b} |s − t|^β.


Similarly, for some κ > 0 and all y^{−β/d} ≤ (1/κ)(a/b)^{2+v}, a < 1, there exists c_5 such that

P( m(A_{γ_{a,b}} ∩ T)^{−1} > y, m(A_b) > 0 | f(0) = γ_{a,b} + z/γ_{a,b} )
  ≤ P( sup_{|s−t|<c_1 y^{−1/d}} |f(s) − f(t)| > a/b | f(0) = γ_{a,b} + z/γ_{a,b} )
  ≤ P( sup_{|s−t|<c_1 y^{−1/d}} |ξ(s,t)| > a/(2b) )
  ≤ exp( −a^2 / ( c_5 b^2 y^{−β/d} ) ).

Combining the two cases z > 1 and z ∈ (0,1), and choosing c_5 large enough, we have

P( m(A_{γ_{a,b}} ∩ T)^{−1} > y, m(A_b) > 0 | f(0) = γ_{a,b} + z/γ_{a,b} ) ≤ exp( −a^2 / ( c_5 b^2 y^{−β/d} ) ),

for a small enough and y^{−β/d} ≤ (1/κ)(a/b)^{2+v}. Renaming the constants completes the proof. □

The final ingredient needed for the proof of Proposition 6.4 is the following lemma, involving stochastic domination. The proof follows from an elementary argument and is therefore omitted.

Lemma 6.11. Let v_1 and v_2 be finite measures on R, and define η_j(x) = ∫_x^∞ v_j(ds). Suppose that η_1(x) ≥ η_2(x) for each x ≥ x_0. Let (h(x) : x ≥ x_0) be a non-decreasing, positive and bounded function. Then

∫_{x_0}^∞ h(s) v_1(ds) ≥ ∫_{x_0}^∞ h(s) v_2(ds).

Proof of Proposition 6.4. Note that

E_{Q'}( exp( −M m(A_{γ_{a,b}})/m(T) ) / m(A_{γ_{a,b}}) ; m(A_{γ_{a,b}}) ∈ (0,α), m(A_b) > 0 )
  = ∫_T ∫_{z=0}^∞ E( exp( −M m(A_{γ_{a,b}})/m(T) ) / m(A_{γ_{a,b}}) ; m(A_{γ_{a,b}}) ∈ (0,α), m(A_b) > 0 | f(t) = γ_{a,b} + z/γ_{a,b} )
      × P( τ_{γ_{a,b}} ∈ dt ) P( γ_{a,b}[f(t) − γ_{a,b}] ∈ dz | f(t) > γ_{a,b} )
  ≤ sup_{z>0, t∈T} E( exp( −M m(A_{γ_{a,b}})/m(T) ) / m(A_{γ_{a,b}}) ; m(A_{γ_{a,b}}) ∈ (0,α), m(A_b) > 0 | f(t) = γ_{a,b} + z/γ_{a,b} ).


Now define Y(b/a) = (b/a)^{2d/β} λ_2^{−d/β} W, with λ_2 as chosen in Proposition 6.10 and W with distribution given by P(W > x) = exp(−x^{β/d}). By Lemma 6.11,

sup_{z>0, t∈T} E( exp( −M m(A_{γ_{a,b}})/m(T) ) / m(A_{γ_{a,b}}) ; m(A_{γ_{a,b}}) ∈ (0,α), m(A_b) > 0 | f(t) = γ_{a,b} + z/γ_{a,b} )
  ≤ E[ Y(b/a) exp( −M Y(b/a)^{−1}/m(T) ); Y(b/a) > α^{−1} ].

Now let Z be exponentially distributed with mean 1 and independent of Y(b/a). Then we have (using the definition of the tail distribution of Z and Chebyshev's inequality)

exp( −M Y(b/a)^{−1}/m(T) ) = P( Z > M Y(b/a)^{−1}/m(T) | Y(b/a) ) ≤ m(T) Y(b/a)/M.

Therefore,

E[ exp( −M Y(b/a)^{−1}/m(T) ) Y(b/a); Y(b/a) ≥ α^{−1} ]
  ≤ ( m(T)/M ) E[ Y(b/a)^2; Y(b/a) ≥ α^{−1} ]
  ≤ ( m(T) / ( M λ_2^{2d/β} ) ) (b/a)^{4d/β} E(W^2),

which completes the proof. □

6.5. Proof of Proposition 6.5. We start with the following result of Tsirelson [27].

Theorem 6.12. Let f be a continuous, separable Gaussian process on a compact (in the canonical metric) domain T. Suppose that the standard deviation function σ of f is continuous and that σ(t) > 0 for t ∈ T. Moreover, assume that μ = E f is also continuous and μ(t) ≥ 0 for all t ∈ T. Define

σ_T^2 ≜ max_{t∈T} Var(f(t)),

and set F(x) = P{ max_{t∈T} f(t) ≤ x }. Then F is continuously differentiable on R. Furthermore, let y be such that F(y) > 1/2 and define y* by

F(y) = Φ(y*).

Then, for all x > y,

F'(x) ≤ Ψ( x y*/y ) ( (x y*/y)(1 + 2α) + 1 ) (y*/y)(1 + α),

where

α = y^2 / ( x(x − y) y*^2 ).

We can now prove the following lemma.

Lemma 6.13. There exists a constant A ∈ (0,∞), independent of a and b ≥ 0, such that

(6.15)    P( sup_{t∈T} f(t) ≤ b + a/b | sup_{t∈T} f(t) > b ) ≤ aA · P( sup_{t∈T} f(t) ≥ b − 1/b ) / P( sup_{t∈T} f(t) ≥ b ).

Proof. By subtracting inf_{t∈T} μ(t) > −∞, and redefining the level b to be b − inf_{t∈T} μ(t), we may simply assume that E f(t) ≥ 0, so that we can apply Theorem 6.12. Adopting the notation of Theorem 6.12, we first pick b_0 large enough so that F(b_0) > 1/2, and assume that b ≥ b_0 + 1. Now let y = b − 1/b and F(y) = Φ(y*). Note that there exists δ_0 ∈ (0,∞) such that δ_0 b ≤ y* ≤ δ_0^{−1} b for all b ≥ b_0. This follows easily from the fact that

log P{ sup_{t∈T} f(t) > x } ∼ log sup_{t∈T} P{ f(t) > x } ∼ −x^2/(2σ_T^2).

On the other hand, by Theorem 6.12, F is continuously differentiable, and so

(6.16)    P{ sup_{t∈T} f(t) < b + a/b | sup_{t∈T} f(t) > b } = ∫_b^{b+a/b} F'(x) dx / P{ sup_{t∈T} f(t) > b }.

Moreover,

F'(x) ≤ ( 1 − Φ( x y*/y ) ) ( (x y*/y)(1 + 2α(x)) + 1 ) (y*/y)(1 + α(x))
      ≤ ( 1 − Φ(y*) ) ( (x y*/y)(1 + 2α(x)) + 1 ) (y*/y)(1 + α(x))
      = P( max_{t∈T} f(t) > b − 1/b ) ( (x y*/y)(1 + 2α(x)) + 1 ) (y*/y)(1 + α(x)).

Therefore,

∫_b^{b+a/b} F'(x) dx ≤ P( sup_{t∈T} f(t) > b − 1/b ) ∫_b^{b+a/b} ( (x y*/y)(1 + 2α(x)) + 1 ) (y*/y)(1 + α(x)) dx.

Recalling that α(x) = y^2/[x(x − y)y*^2], we can use the fact that y* ≥ δ_0 b to conclude that, if x ∈ [b, b + a/b], then α(x) ≤ δ_0^{−2}, and therefore

∫_b^{b+a/b} ( (x y*/y)(1 + 2α(x)) + 1 ) (y*/y)(1 + α(x)) dx ≤ 4δ_0^{−8} a.


We thus obtain that

(6.17)    ∫_b^{b+a/b} F'(x) dx / P{ sup_{t∈T} f(t) > b } ≤ 4a δ_0^{−8} · P{ sup_{t∈T} f(t) > b − 1/b } / P{ sup_{t∈T} f(t) > b },

for any b ≥ b_0. This inequality, together with the fact that F is continuously differentiable on (−∞,∞), yields the proof of the lemma for all b ≥ 0. □

The previous result translates a question involving the conditional distribution of max_{t∈T} f(t) near b into a question involving the tail distribution of max_{t∈T} f(t). The next result then provides a bound on this tail distribution.

Lemma 6.14. For each v > 0, there exists a constant C(v) ∈ (0,∞) (possibly depending on v > 0, but otherwise independent of b) such that

P( max_{t∈T} f(t) > b ) ≤ C(v) b^{2d/β + dv + 1} max_{t∈T} P(f(t) > b)

for all b ≥ 1.

Proof. The proof of this result follows along the same lines as that of Theorem 2.6.2 in [5]. Consider an open cover T = ∪_{i=1}^{N(θ)} T_i(θ), where T_i(θ) = {s : |s − t_i| < θ}. We choose the t_i carefully, so that N(θ) = O(θ^{−d}) for θ arbitrarily small. Write f(t) = g(t) + μ(t), where g(t) is a centered Gaussian random field, and note, using A2 and A3, that

P( max_{t∈T_i(θ)} f(t) > b ) ≤ P( max_{t∈T_i(θ)} g(t) > b − μ(t_i) − κ_H θ^β ).

Now we wish to apply the Borell-TIS inequality (Theorem 6.8) with U = T_i(θ), f_0 = g, and d(s,t) = E^{1/2}([g(t) − g(s)]^2), which, as a consequence of A2 and A3, is bounded above by C_0 |t − s|^{β/2} for some C_0 ∈ (0,∞). Thus, applying Theorem 6.7, we have that E max_{t∈T_i(θ)} g(t) ≤ C_1 θ^{β/2} log(1/θ) for some C_1 ∈ (0,∞). Consequently, the Borell-TIS inequality yields that there exists C_2(v) ∈ (0,∞) such that, for all b sufficiently large and θ sufficiently small, we have

P( max_{t∈T_i(θ)} g(t) > b − μ(t_i) − κ_H θ^β ) ≤ C_2(v) exp( −( b − μ(t_i) − C_1 θ^{β/(2+βv)} )^2 / (2σ_{T_i}^2) ),

where σ_{T_i} = max_{t∈T_i(θ)} σ(t). Now select v > 0 small enough and set θ^{β/(2+βv)} = b^{−1}. Straightforward calculations yield that

P( max_{t∈T_i(θ)} f(t) > b ) ≤ P( max_{t∈T_i(θ)} g(t) > b − μ(t_i) − κ_H θ^β ) ≤ C_3(v) max_{t∈T_i(θ)} exp( −(b − μ(t))^2 / (2σ(t)^2) ),


for some C_3(v) ∈ (0,∞). Now recall the well known inequality (valid for x > 0)

φ(x)( 1/x − 1/x^3 ) ≤ 1 − Φ(x) ≤ φ(x)/x,

where φ = Φ' is the standard Gaussian density. Using this inequality, it follows that C_4(v) ∈ (0,∞) can be chosen so that

max_{t∈T_i(θ)} exp( −(b − μ(t))^2 / (2σ(t)^2) ) ≤ C_4(v) b max_{t∈T_i} P(f(t) > b),

for all b ≥ 1. We then conclude that there exists C(v) ∈ (0,∞) such that

P( max_{t∈T} f(t) > b ) ≤ N(θ) C_4(v) b max_{t∈T} P(f(t) > b)
  ≤ C θ^{−d} b max_{t∈T} P(f(t) > b) = C b^{2d/β + dv + 1} max_{t∈T} P(f(t) > b),

giving the result. □

We can now complete the proof of Proposition 6.5.

Proof of Proposition 6.5. The result is a straightforward corollary of the previous two lemmas. By (6.15) in Lemma 6.13 and Lemma 6.14, there exists λ ∈ (0,∞) for which

P( max_{t∈T} f(t) ≤ b + a/b | max_{t∈T} f(t) > b )
  ≤ aA · P( max_{t∈T} f(t) ≥ b − 1/b ) / P( max_{t∈T} f(t) ≥ b )
  ≤ aCA b^{2d/β+dv+1} max_{t∈T} P(f(t) > b − 1/b) / P( max_{t∈T} f(t) ≥ b )
  ≤ aCA b^{2d/β+dv+1} max_{t∈T} P(f(t) > b − 1/b) / max_{t∈T} P(f(t) > b)
  ≤ aλ b^{2d/β+dv+1}.

The last two inequalities follow from the obvious bound

P( max_{t∈T} f(t) ≥ b ) ≥ max_{t∈T} P(f(t) > b)

and standard properties of the Gaussian distribution. This yields (6.12), from which the remainder of the proposition follows. □


7. Fine Tuning: Twice Differentiable Homogeneous Fields. In the preceding section we constructed a polynomial time algorithm based on a randomized discretization scheme. Our goal in this section is to illustrate how to take advantage of additional information to further improve the running time and the efficiency of the algorithm. In order to illustrate our techniques, we shall perform a more refined analysis in the setting of smooth and homogeneous fields, and shall establish optimality of the algorithm in a precise sense, to be described below. Our assumptions throughout this section are B1–B2 of Section 3.

Let C(s − t) = Cov(f(s), f(t)) be the covariance function of f, which we now also assume to have mean zero. Note that it is an immediate consequence of homogeneity and differentiability that ∂_i C(0) = ∂^3_{ijk} C(0) = 0.

We shall need the following definition.

Definition 7.1. We call T_M = {t_1, …, t_M} ⊂ T a θ-regular discretization of T if and only if

min_{i≠j} |t_i − t_j| ≥ θ,   sup_{t∈T} min_i |t_i − t| ≤ 2θ.

Regularity ensures that points in the grid T_M are well separated. Intuitively, since f is smooth, having tight clusters of points translates to a waste of computing resources, as a result of sampling highly correlated values of f. Also note that every region containing a ball of radius 2θ has at least one representative in T_M. Therefore T_M covers the domain T in an economical way. One technical convenience of θ-regularity is that, for subsets A ⊆ T that have positive Lebesgue measure (in particular, ellipsoids),

lim_{M→∞} #(A ∩ T_M)/M = m(A)/m(T),

where here, and throughout the remainder of the section, #(A) denotes the cardinality of the set A.

Let T_M = {t_1, …, t_M} be a θ-regular discretization of T, and consider

X = (X_1, …, X_M)^T ≜ (f(t_1), …, f(t_M))^T.

We shall concentrate on estimating w_M(b) = P(max_{1≤i≤M} X_i > b). The next result (which we prove in Section 7.1) shows that, if θ = ε/b, then the relative bias is O(√ε).

Proposition 7.2. Suppose f is a Gaussian random field satisfying conditions B1 and B2. There exist c_0, c_1, b_0 and ε_0 such that, for any finite ε/b-regular discretization T_M of T,

(7.1)    P( sup_{t∈T_M} f(t) < b | sup_{t∈T} f(t) > b ) ≤ c_0 √ε   and   #(T_M) ≤ c_1 m(T) ε^{−d} b^d,

for all ε ∈ (0, ε_0] and b > b_0.

Note that the bound on the bias obtained for twice differentiable fields is much sharper than that for the general Hölder continuous fields, given by (6.13) in Theorem 6.6. This is partly because the conditional distribution of the random field around local maxima is harder to describe in the Hölder continuous case than in the case of twice differentiable fields. In addition to the sharper description of the bias, we shall also soon show, in Theorem 7.4, that our choice of discretization is optimal in a certain sense. Finally, we point out that the bound of √ε in the first term of (7.1) is not optimal. In fact, there seems to be some room for improvement, and we believe that a more careful analysis might yield a bound of the form c_0 ε^2.

We shall estimate w_M(b) by using a slight variation of Algorithm 5.3. In particular, since the X_i's are now identically distributed, we redefine Q to be

(7.2)    Q(X ∈ B) = ∑_{i=1}^M (1/M) P[X ∈ B | X_i > b − 1/b].

Our estimator then takes the form

(7.3)    L_b = ( M · P(X_1 > b − 1/b) / ∑_{j=1}^M 1(X_j > b − 1/b) ) 1( max_{1≤i≤M} X_i > b ).

Clearly, we have that E_Q(L_b) = w_M(b). (The reason for subtracting the factor of 1/b was explained in Section 6.1.)

Algorithm 7.3. Given a number of replications n and an ε/b-regular discretization T_M, the algorithm is as follows.

Step 1. Sample X^{(1)}, …, X^{(n)}, iid copies of X, with distribution Q given by (7.2).
Step 2. Compute and output

L̄_n = (1/n) ∑_{i=1}^n L_b^{(i)},

where

L_b^{(i)} = ( M · P(X_1 > b − 1/b) / ∑_{j=1}^M 1(X_j^{(i)} > b − 1/b) ) 1( max_{1≤j≤M} X_j^{(i)} > b ).

Theorem 7.5 below guides the selection of n needed to achieve a prescribed relative error. In particular, our analysis, together with the considerations from Section 2, implies that choosing n = O(ε^{−2}δ^{−1}) suffices to achieve ε relative error with probability at least 1 − δ.

Algorithm 7.3 improves on Algorithm 6.1 for Hölder continuous fields in two important ways. The first aspect is that it is possible to obtain information on the size of the relative bias of the estimator. In Proposition 7.2 we saw that, in order to overcome the bias due to discretization, it suffices to take a discretization of size M = #(T_M) = Θ(b^d). That this selection is also asymptotically optimal, in the sense described in the next result, will be proven in Section 7.1.


Theorem 7.4. Suppose f is a Gaussian random field satisfying conditions B1 and B2. If θ ∈ (0,1), then, as b → ∞,

sup_{#(T_M) ≤ b^{θd}} P( sup_{t∈T_M} f(t) > b | sup_{t∈T} f(t) > b ) → 0.

This result implies that the relative bias goes to 100% as b → ∞ if one chooses a discretization scheme of size O(b^{θd}) with θ ∈ (0,1). Consequently, d is the smallest power of b that achieves any given bounded relative bias, and so the suggestion above, of choosing M = O(b^d) points for the discretization, is in this sense optimal.

The second aspect of improvement involves the variance. In the case of Hölder continuous fields, the ratio of the second moment of the estimator to w(b)^2 was shown to be bounded by a quantity that is of order O(M^2). In contrast, in the context of the smooth and homogeneous fields considered here, the next result shows that this ratio is bounded uniformly for b > b_0 and M = #(T_M) ≥ cb^d. That is, the variance remains strongly controlled.

Theorem 7.5. Suppose f is a Gaussian random field satisfying conditions B1 and B2. Then there exist constants c, b_0 and ε_0 such that, for any ε/b-regular discretization T_M of T, we have

sup_{b>b_0, ε∈[0,ε_0]} E_Q L_b^2 / P^2( sup_{t∈T} f(t) > b ) ≤ sup_{b>b_0, ε∈[0,ε_0]} E_Q L_b^2 / P^2( sup_{t∈T_M} f(t) > b ) ≤ c,

for some c ∈ (0,∞).

The proof of this result is given in Section 7.2. The fact that the number of replications remains bounded in b is a consequence of the strong control on the variance.

Finally, we note that the proof of Theorem 3.3 follows as a direct corollary of Theorem 7.5, together with Proposition 7.2 and our discussion in Section 2. Assuming that placing each point in T_M takes no more than c units of computer time, the total complexity is, according to the discussion in Section 2, O(nM^3 + M) = O(ε^{−2}δ^{−1}M^3 + M). The contribution of the term M^3 = O(ε^{−3d}b^{3d}) comes from the complexity of applying the Cholesky factorization, and the term M = O(ε^{−d}b^d) corresponds to the complexity of placing T_M.

Remark 7.6. Condition B2 imposes a convexity assumption on the boundary of T. This assumption, although convenient in the development of the proofs of Theorems 7.4 and 7.5, is not necessary. The results can be generalized, at the expense of increasing the length and the burden of the technical development, to the case in which T is a d-dimensional manifold satisfying the so-called Whitney conditions ([4]).

The remainder of this section is devoted to the proofs of Proposition 7.2, Theorem 7.4 and Theorem 7.5.


7.1. Bias Control: Proofs of Proposition 7.2 and Theorem 7.4. We start with some useful lemmas, for all of which we assume that Conditions B1 and B2 are satisfied. We shall also assume that the global maximum of f over T is achieved, with probability one, at a single point in T. Additional conditions under which this will happen can be found in [4], and require little more than the non-degeneracy of the joint distribution of f and its first and second order derivatives. Of these lemmas, Lemma 7.7, the proof of which we defer to Section 7.3, is of central importance.

Lemma 7.7. For any a_0 > 0, there exist c*, δ*, b_0 and δ_0 (which depend on the choice of a_0) such that, for any s ∈ T, a ∈ (0, a_0), δ ∈ (0, δ_0) and b > b_0,

P( min_{|t−s|<δab^{−1}} f(t) < b | f(s) = b + a/b, s ∈ L ) ≤ c* exp( −δ*/δ^2 ),

where L is the set of local maxima of f.

The next lemma is a direct corollary of Lemma 7.7.

Lemma 7.8. Let t* be the point in T at which the global maximum of f is attained. Then, with the same choice of constants as in Lemma 7.7, for any a ∈ (0, a_0), δ ∈ (0, δ_0) and b > b_0, we have

P( min_{|t−t*|<δab^{−1}} f(t) < b | f(t*) = b + a/b ) ≤ 2c* exp( −δ*/δ^2 ).

Proof. Note that, for any s ∈ T,

P( min_{|t−s|≤δab^{−1}} f(t) < b | f(s) = b + a/b, s = t* )
  = P( min_{|t−s|≤δab^{−1}} f(t) < b, f(s) = b + a/b, s = t* ) / P( f(s) = b + a/b, s = t* )
  ≤ P( min_{|t−s|≤δab^{−1}} f(t) < b, f(s) = b + a/b, s ∈ L ) / P( f(s) = b + a/b, s = t* )
  = P( min_{|t−s|<δab^{−1}} f(t) < b | f(s) = b + a/b, s ∈ L ) · P( f(s) = b + a/b, s ∈ L ) / P( f(s) = b + a/b, s = t* )
  ≤ c* exp( −δ*/δ^2 ) · P( f(s) = b + a/b, s ∈ L ) / P( f(s) = b + a/b, s = t* ).

Applying this bound, together with the fact that

P( f(s) = b + a/b, s ∈ L ) / P( f(s) = b + a/b, s = t* ) → 1

as b → ∞, the conclusion of the lemma holds. □

MONTE CARLO FOR RANDOM FIELDS 31

The following lemma gives a bound on the density of suptisinT f(t) which will beused to control the size of overshoot beyond level b

Lemma 79 Let pflowast(x) be the density function of suptisinT f(t) Then there existsa constant cflowast and b0 such that

pflowast(x) le cflowastxd+1P (f(0) gt x)

for all x gt b0

Proof Recalling (13) let the continuous function pE(x) x isin R be defined bythe relationship

E(χ(t isin T f(t) ge b)) =int infin

b

pE(x) dx

where the left hand side is the expected value of the Euler-Poincare characteristicof Ab Then according to Theorem 810 in [7] there exists c and δ such that

|pE(x)minus pflowast(x)| lt cP (f(0) gt (1 + δ)x)

for all x gt 0 In addition thanks to the result of [4] which providesintinfin

bpE(x)dx in

closed form there exists c0 such that for all x gt 1

pE(x) lt c0xd+1P (f(0) gt x)

Hence there exists cflowast such that

pflowast(x) le c0xd+1P (f(0) gt x) + cP (f(0) gt (1 + δ)x) le cflowastx

d+1P (f(0) gt x)

for all x gt 1 2

The last ingredients required to provide the proof of Proposition 72 and Theorem74 are stated in the following well known Theorem adapted from Lemma 61 andTheorem 72 in [19] to the twice differentiable case

Theorem 710 There exists a constant H (depending on the covariance func-tion C such that

P (suptisinT

f(t) gt b) = (1 + o (1))Hm(T )bdP (f(0) gt b)(74)

as b rarrinfinSimilarly choose δ small enough so that [0 δ]d sub T and let ∆0 = [0 bminus1]d Then

there exists a constant H1 such that

P ( suptisin∆0

f(t) gt b) = (1 + o(1))H1P (f(0) gt b)(75)

as b rarrinfin

32 ADLER BLANCHET LIU

We now are ready to provide the proof of Proposition 72 and Theorem 74

Proof of Proposition 72 The fact that there exists c1 such that

(T ) le c1m(T )εminusdbd

is immediate from assumption B2 Therefore we proceed to provide a bound forthe relative bias We choose ε0 lt δ2

0 (δ0 is given in Lemmas 77 and 78) and notethat for any ε lt ε0

P (suptisinT

f(t) lt b| suptisinT

f(t) gt b)

le P (suptisinT

f(t) lt b + 2radic

εb| suptisinT

f(t) gt b) + P (suptisinT

f(t) lt b| suptisinT

f(t) gt b + 2radic

εb)

By (74) and Lemma 79 there exists c2 such that the first term above can boundedby

P (suptisinT

f(t) lt b + 2radic

εb| suptisinT

f(t) gt b) le c2

radicε

The second term can be bounded by

P (suptisinT

f(t) lt b| suptisinT

f(t) gt b + 2radic

εb)

le P ( sup|tminustlowast|lt2εbminus1

f(t) lt b| suptisinT

f(t) gt b + 2radic

ε)

le 2clowast exp(minusδlowastεminus1

)

Hence there exists c0 such that

P (suptisinT

f(t) lt b| suptisinT

f(t) gt b) le c2

radicε + 2clowasteminusδlowastε le c0

radicε

for all ε isin (0 ε0) 2

Proof of Theorem 74 We write θ = 1minus 3δ isin (0 1) First note that by (74)

P (supT

f(t) gt b + b2δminus1| supT

f(t) gt b) rarr 0

as b rarrinfin Let tlowast be the position of the global maximum of f in T According to theexact Slepian model in Section 73 and an argument similar to the proof of Lemma78

(76) P ( sup|tminustlowast|gtb2δminus1

f(t) gt b|b lt f(tlowast) le b + b2δminus1) rarr 0

as b rarrinfin Consequently

P ( sup|tminustlowast|gtb2δminus1

f(t) gt b| supT

f(t) gt b) rarr 0

MONTE CARLO FOR RANDOM FIELDS 33

LetB(T b2δminus1) = cup

tisinTB(t b2δminus1)

We have

P (supT

f(t) gt b| supT

f(t) gt b)

le P ( sup|tminustlowast|gtb2δminus1

f(t) gt b| supT

f(t) gt b)

+ P (supT

f(t) gt b sup|tminustlowast|gtb2δminus1

f(t) le b| supT

f(t) gt b)

le o(1) + P (tlowast isin B(T b2δminus1)| supT

f(t) gt b)

le o(1) + P ( supB(T b2δminus1)

f(t) gt b| supT

f(t) gt b)

Since (T ) le b(1minus3δ)d we can find a finite set T prime = tprime1 tprimel sub T and let ∆k =tprimek + [0 bminus1] such that l = O(b(1minusδ)d) and B(T b2δminus1) sub cupl

k=1∆k The choice ofl only depends on (T ) not the particular distribution of T Therefore applying(75)

sup(Tlebθd)

P ( supB(T b2δminus1)

f(t) gt b) le O(b(1minusδ)d)P (f(0) gt b)

This together with (74) yields

sup(Tlebθd)

P ( supB(T b2δminus1)

f(t) gt b| supT

f(t) gt b) le O(bminusδd) = o(1)

for b ge b0 which clearly implies the statement of the result 2

72 Variance Control Proof of Theorem 75 We proceed directly to the proofof Theorem 75

34 ADLER BLANCHET LIU

Proof of Theorem 75 Note that

EQL2b

P 2 (suptisinT f (t) gt b)

=E(Lb)

P 2 (suptisinT f (t) gt b)

=E

(MtimesP (X1gtbminus1b)sumn

j=11(Xjgtbminus 1

b )maxj Xj gt b

)

P 2 (suptisinT f (t) gt b)

=E

(MtimesP (X1gtbminus1b)sumn

j=11(Xjgtbminus 1

b )maxj Xj gt b suptisinT f (t) gt b

)

P 2 (suptisinT f (t) gt b)

=E

(MP (X1gtbminus1b)1(maxj Xjgtb)sumn

j=11(Xjgtbminus1b)

∣∣∣∣ suptisinT f (t) gt b

)

P (suptisinT f (t) gt b)

= E

(M1 (maxj Xj gt b)sumnj=1 1 (xj gt bminus 1b)

∣∣∣∣∣ suptisinT

f (t) gt b

)P (X1 gt bminus 1b)

P (suptisinT f (t) gt b)

The remainder of the proof involves showing that the last conditional expectationhere is of order O(bd) This together with (74) will yield the result Note that forany A (b ε) such that A (b ε) = Θ(M) uniformly over b and ε we can write

E

(M times 1 (maxj Xj gt b)sumM

j=1 1(Xj gt bminus 1

b

)∣∣∣∣∣ sup

tisinTf (t) gt b

)(77)

le E

M times 1

(sumMj=1 1 (Xj gt bminus 1b) ge M

A(bε)

)

sumMj=1 1

(xj gt bminus 1

b

)∣∣∣∣∣∣suptisinT

f (t) gt b

+ E

M times 1

(1 le sumM

j=1 1 (Xj gt bminus 1b) lt MA(bε)

)

sumMj=1 1

(xj gt bminus 1

b

)∣∣∣∣∣∣suptisinT

f (t) gt b

We shall select A(b ε) appropriately in order to bound the expectations above ByLemma 78 for any 4ε le δ le δ0 there exist constants cprime and cprimeprime isin (0infin) such that

clowast exp(minusδlowast

δ2

)ge P

(min

|tminustlowast|leδbf (t) lt bminus 1b

∣∣∣∣ suptisinT

f (t) gt b

)

ge P

Msum

j=1

1 (Xj gt bminus 1b) le cprimeδdεd

∣∣∣∣∣∣suptisinT

f (t) gt b

ge P

(MsumM

j=1 1(Xj gt bminus 1b)ge bdcprimeprime

δd

∣∣∣∣∣ suptisinT

f (t) gt b

)(78)

MONTE CARLO FOR RANDOM FIELDS 35

The first inequality is an application of Lemma 78 The second inequality is dueto the fact that for any ball B of radius 4ε or larger (T cap B) ge cprimedεminusd for somecprime gt 0 Inequality (78) implies that for all x such that bdcprimeprimeδd

0 lt x lt bdcprimeprime[4dεd]there exists δlowastlowast gt 0 such that

P

(MsumM

j=1 1(Xj gt bminus 1

b

)bdge x

∣∣∣∣∣ suptisinT

f (t) gt b

)le clowast exp

(minusδlowastlowastx2d

)

Now let A (b ε) = bdcprimeprime(4dεd) and observe that by the second result in (71) wehave A (b ε) = Θ (M) and moreover that there exists c3 such that the first termon the right hand side of (77) is bounded by

(79) E

M times 1

(sumMj=1 1 (Xj gt bminus 1b) ge M

A(bε)

)

sumMj=1 1 (Xj gt bminus 1b)

∣∣∣∣∣∣supT

f (t) gt b

le c3b

d

Now we turn to the second term on the right hand side of (77) We use the factthat MA(b ε) le cprimeprimeprime isin (0infin) (uniformly as b rarrinfin and ε rarr 0) There exist c4 andc5 such that for ε le δ0c4

E

M1

(1 le sumM

j=1 1(Xj gt bminus 1

b

)lt M

A(bε)

)

sumMj=1 1

(Xj gt bminus 1

b

)∣∣∣∣∣∣supT

f (t) gt b

(710)

le MP

nsum

j=1

1(

Xj gt bminus 1b

)lt cprimeprimeprime

∣∣∣∣∣∣supT

f (t) gt b

le MP

(min

|tminustlowast|lec4εbf (t) lt bminus 1b

∣∣∣∣ supT

f (t) gt b

)

le c1m(T )bdεminusd exp(minus δlowast

c24ε

2

)le c5b

d

The second inequality holds from the fact that ifsumM

j=1 1(Xj gt bminus 1

b

)is less than

cprimeprimeprime then the minimum of f in a ball around the local maximum and of radius c4εb

must be less than bminus 1b Otherwise there are more than cprimeprimeprime elements of T insidesuch a ball The last inequality is due to Lemma 78 and Theorem 72

Putting (79) and (710) together we obtain for all εb -regular discretizationswith ε lt ε0 = min(14 1c4)δ0

EQL2b

P (suptisinT f (t) gt b)le E

(M1 (maxj Xj gt b)sumM

j=1 1(Xj gt bminus 1

b

)∣∣∣∣∣ sup

tisinTf (t) gt b

)P (X1 gt bminus 1b)

P (suptisinT f (t) gt b)

le (c3 + c5)bdP (X1 gt bminus 1b)P (suptisinT f (t) gt b)

Applying now (74) and Proposition 72 we have that

P (suptisinT

f(t) gt b) ltP (sup

tisinTf(t) gt b)

1minus c0radic

ε

36 ADLER BLANCHET LIU

we have

supbgtb0εisin[0ε0]

EQL2b

P(sup

tisinTf (t) gt b

) lt infin

as required 2

73 A final proof Lemma 77 Without loss of generality we assume that thefield has mean zero and unit variance Before getting into the details of the proofof Lemma 77 we need a few additional lemmas for which we adopt the followingnotation Let Ci and Cij be the first and second order derivatives of C and definethe vectors

micro1(t) = (minusC1(t) minusCd(t))micro2(t) = vech ((Cij(t) i = 1 d j = i d))

Let f prime(0) and f primeprime(0) be the gradient and vector of second order derivatives of f at 0where f primeprime(0) is arranged in the same order as micro2(0) Furthermore let micro02 = microgt20 bea vector of second order spectral moments and micro22 a matrix of fourth order spectralmoments The vectors micro02 and micro22 are arranged so that

1 0 micro02

0 Λ 0micro20 0 micro22

is the covariance matrix of (f(0) f prime(0) f primeprime(0)) where Λ = (minusCij(0)) It then followsthat

micro2middot0 = micro22 minus micro20micro02

be the conditional variance of f primeprime(0) given f(0) The following lemma given in [5]provides a stochastic representation of the f given that it has a local maxima atlevel u at the origin

Lemma 711 Given that f has a local maximum with height u at zero (aninterior point of T ) the conditional field is equal in distribution to

(711) fu(t) uC (t)minusWuβgt (t) + g (t)

g(t) is a centered Gaussian random field with covariance function

γ(s t) = C(sminus t)minus (C(s) micro2(s))(

1 micro02

micro20 micro22

)minus1 (C(t)microgt2 (t)

)minus micro1(s)Λminus1microgt1 (t)

and Wu is a d(d+1)2 random vector independent of g(t) with density function

(712) ψu (w) prop |det (rlowast(w)minus uΛ) | exp(minus1

2wgtmicrominus1

2middot0w)1 (rlowast(w)minus uΛ isin N )

MONTE CARLO FOR RANDOM FIELDS 37

where rlowast(w) is a dtimesd symmetric matrix whose upper triangular elements consist ofthe components of w The set of negative definite matrices is denoted by N Finallyβ(t) is defined by

(α (t) β (t)) = (C (t) micro2 (t))(

1 micro02

micro20 micro22

)minus1

The following two technical lemmas which we shall prove after completing theproof of Lemma 77 provide bounds for the last two terms of (711)

Lemma 712 Using the notation in (711) there exist δ0 ε1 c1 and b0 suchthat for any u gt b gt b0 and δ isin (0 δ0)

P

(sup

|t|leδab

∣∣Wuβgt (t)∣∣ gt

a

4b

)le c1 exp

(minusε1b

2

δ4

)

Lemma 713 There exist c δ and δ0 such that for any δ isin (0 δ0)

P

(max|t|leδab

|g(t)| gt a

4b

)le c exp

(minus δ

δ2

)

Proof of Lemma 77 Using the notation of Lemma 711 given any s isin L forwhich f(s) = b we have that the corresponding conditional distribution of s is thatof fb(middot minus s) Consequently it suffices to show that

P

(min

|t|leδabfb(t) lt bminus ab

)le clowast exp

(minusδlowast

δ2

)

We consider firstly the case for which the local maximum is in the interior of T Then by the Slepian model (711)

fb (t) = bC (t)minusWbβgt (t) + g (t)

We study the three terms of the Slepian model individually Since

C(t) = 1minus tgtΛt + o(|t|2

)

there exists a ε0 such thatbC (t) ge bminus a

4b

38 ADLER BLANCHET LIU

for all |t| lt ε0radic

ab According to Lemmas 712 and 713 for δ lt min(ε0radic

a0 δ0)

P

(min

|t|leδabfu(t) lt bminus ab

)

le P

(max|t|ltδab

|g (t)| gt a

4b

)+ P

(sup

|t|leδab

∣∣Wbβgt(t)

∣∣ gta

4b

)

le c exp

δ2

)+ c1 exp

(minusε1b

2

δ4

)

le clowast exp(minusδlowast

δ2

)

for some clowast and δlowastNow consider the case for which the local maximum is in the (dminus1)-dimensional

boundary of T Due to convexity of T and without loss of generality we assumethat the tangent space of partT is generated by partpartt2 partparttd the local maximumis located at the origin and T is a subset of the positive half-plane t1 ge 0 Thatthese arguments to not involve a loss of generality follows from the arguments onpages 291ndash192 of [4] which rely on the assumed stationarity of f (for translations)and the fact that rotations while changing the distributions will not affect theprobabilities that we are currently computing)

For the origin positioned as just described to be a local maximum it is necessaryand sufficient that the gradient of f restricted to partT is the zero vector the Hessianmatrix restricted to partT is negative definite and part1f(0) le 0 Applying a version ofLemma 711 for this case conditional on f(0) = u and 0 being a local maximumthe field is equal in distribution to

(713) uC(t)minus Wuβgt(t) + micro1(t)Λminus1 (Z 0 0)T + g(t)

where Z le 0 corresponds to part1f (0) and it follows a truncated (conditional on thenegative axis) Gaussian random variable with mean zero and a variance parameterwhich is computed as the conditional variance of part1f (0) given (part2f (0) partdf (0))The vector Wu is a (d(d + 1)2)-dimensional random vector with density function

ψu(w) prop |det(rlowast(w)minus uΛ)| exp(minus12wgtmicrominus1

2middot0w)1(wlowast minus uΛ isin N )

where Λ is the second spectral moment of f restricted to partT and rlowast(w) is the(dminus 1)times (dminus 1) symmetric matrix whose upper triangular elements consist of thecomponents of w In the representation (713) the vector Wu and Z are independentAs in the proof of Lemma 712 one can show that a similar bound holds albeitwith with different constants Thus since micro1 (t) = O (t) there exist cprimeprime and δprimeprime suchthat the third term in (713) can be bounded by

P [ max|t|leδab

∣∣∣micro1 (t) Λminus1 (Z 0 0)T∣∣∣ ge a(4b)] le cprimeprime exp

(minusδprimeprime

δ2

)

MONTE CARLO FOR RANDOM FIELDS 39

Consequently we can also find clowast and δlowast such that the conclusion holds and weare done 2

We complete the paper with the proofs of Lemmas 712 and 713

Proof of Lemma 712 It suffices to prove the Lemma for the case a = 1 Since

fu (0) = u = uminusWuβgt (0) + g (0)

and Wu and g are independent β (0) = 0 Furthermore since C prime (t) = O (t) andmicroprime2 (t) = O (t) there exists a c0 such that |β(t)| le c0|t|2 In addition Wu has densityfunction proportional to

ψu(w) prop |det(rlowast(w)minus uΛ)| exp(minus12wgtmicrominus1

2middot0w)1 (wlowast minus uΛ isin N )

Note that det(rlowast(w) minus uΛ) is expressible as a polynomial in w and u and thereexists some ε0 and c such that

∣∣∣∣det (rlowast(w)minus uΛ)

det (minusuΛ)

∣∣∣∣ le c

if |w| le ε0u Hence there exist ε2 c2 gt 0 such that

ψu (w) le ψ (w) = c2 exp(minus1

2ε2w

gtmicrominus12middot0w

)

for all u ge 1 The right hand side here is proportional to a multivariate Gaussiandensity Thus

P (|Wu| gt x) =int

|w|gtx

ψu(w)dw leint

|w|gtx

ψu (w) dw = c3P (|W | gt x)

where W is a multivariate Gaussian random variable with density function propor-tional to ψ Therefore by choosing ε1 and c1 appropriately we have

P

(sup|t|leδb

∣∣Wuβgt∣∣ gt

14b

)le P

(|Wu| gt b

c20δ

2

)le c1 exp

(minusε1b

2

δ4

)

for all u ge b 2

Proof of Lemma 713 Once again it suffices to prove the Lemma for the casea = 1 Since

fb(0) = b = bminusWbβgt(0) + g(0)

the covariance function (γ (s t) s t isin T ) of the centered field g satisfies γ(0 0) = 0It is also easy to check that

partsγ(s t) = O(|s|+ |t|) parttγ(s t) = O(|s|+ |t|)

40 ADLER BLANCHET LIU

Consequently there exists a constant cγ isin (0infin) for which

γ(s t) le cγ(|s|2 + |t|2) γ(s s) le cγ |s|2

We need to control the tail probability of sup|t|leδb |g(t)| For this it is useful tointroduce the following scaling Define

gδ (t) =b

δg

(δt

b

)

Then sup|t|leδb g (t) ge 14b if and only if sup|t|le1 gδ (t) ge 1

4δ Let

σδ (s t) = E (gδ (s) gδ (t))

ThensupsisinR

σδ(s s) le cγ

Because γ(s t) is at least twice differentiable applying a Taylor expansion we easilysee that the canonical metric dg corresponding to gδ (s) (cf Theorem 67) can bebounde as follows

d2g(s t) = E (gδ (s)minus gδ (t))2 =

b2

δ2

(δs

bδs

b

)+ γ

(δt

bδt

b

)minus 2γ

(δs

bδt

b

)]

le c |sminus t|2

for some constant c isin (0infin) Therefore the entropy of gδ evaluated at δ is boundedby Kδminusd for any δ gt 0 and with an appropriate choice of K gt 0 Therefore for allδ lt δ0

P ( sup|t|leδb

|g(t)| ge 14b

) = P ( sup|t|le1

gδ (t) ge 14δ

) le cdδminusdminusη exp(minus 1

16cγδ2)

for some constant cd and η gt 0 The last inequality is a direct application of The-orem 411 of [4] The conclusion of the lemma follows immediately by choosing c

and δ appropriately 2

REFERENCES

[1] RJ Adler K Bartz and S Kou Estimating curvatures in the Gaussian kinematic formulaIn preparation

[2] RJ Adler J Blanchet and J Liu Efficient simulation for tail probabilities of Gaussianrandom fields In WSC rsquo08 Proceedings of the 40th Conference on Winter Simulationpages 328ndash336 Winter Simulation Conference 2008

[3] RJ Adler P Muller and Rozovskii B editors Stochastic Modelling in Physical Oceanog-raphy Birkhauser Boston 1996

[4] RJ Adler and JE Taylor Random Fields and Geometry Springer New York USA 2007

[5] RJ Adler JE Taylor and K Worsley Applications of Random Fields and GeometryFoundations and Case Studies httpwebeetechnionacilpeopleadlerhrfpdf

MONTE CARLO FOR RANDOM FIELDS 41

[6] S Asmussen and P Glynn Stochastic Simulation Algorithms and Analysis Springer-VerlagNew York NY USA 2008

[7] JM Azais and M Wschebor Level Sets and Extrema of Random Processes and FieldsWiley New York NY 2009

[8] JM Bardeen JR Bond N Kaiser and AS Szalay The statistics of peaks of Gaussianrandom fields The Astrophysical Journal 30415ndash61 1986

[9] SM Berman Sojourns and Extremes of Stochastic Processes The Wadsworth ampBrooksCole StatisticsProbability Series Wadsworth amp BrooksCole Advanced Books ampSoftware Pacific Grove CA 1992

[10] J Blanchet Efficient importance sampling for binary contingency tables Ann of ApplProb 19949ndash982 2009

[11] C Borell The Brunn-Minkowski inequality in Gauss space Invent Math 30205ndash216 1975

[12] J Bucklew Introduction to Rare-Event Simulation Springer-Verlag New York NY 2004

[13] MR Dennis Nodal densities of planar Gaussian random waves Eur Phys J 145191ndash2102007

[14] RM Dudley Sample functions of the Gaussian process Ann Probab 166ndash103 1973

[15] KJ Friston KJ Worsley RSJ Frackowiak JC Mazziotta and AC Evans Assessing thesignificance of focal activations using their spatial extent Human Brain Mapping 1214ndash2201994

[16] MR Leadbetter G Lindgren and H Rootzen Extremes and Related Properties of RandomSequences and Processes Springer-Verlag New York 1983

[17] M Mitzenmacher and E Upfal Probability and Computing Randomized Algorithms andProbabilistic Analysis Cambridge University Press 2005

[18] W Niemiro and P Pokarowski Fixed precision MCMC estimation by median of productsof averages J Appl Probab 46309ndash329 2009

[19] V Piterbarg Asymptotic Methods in the Theory of Gaussian Processes American Mathe-matical Society USA 1995

[20] SF Shandarin Testing non-Gaussianity in cosmic microwave background maps by morpho-logical statistics Mon Not R Astr Soc 331865ndash874 2002

[21] SF Shandarin HA Feldman Y Xu and M Tegmark Morphological measures of non-gaussianity in cosmic microwave background maps Astrophys J Suppl 1411ndash11 2002

[22] L Shepp and H Landau On the supremum of a Gaussian process Sankhya A 32369ndash378December 1970

[23] JE Taylor A Takemura and RJ Adler Validity of the expected Euler characteristicheuristic Ann Probab 33(4)1362ndash1396 2005

[24] JE Taylor and KJ Worsley Detecting sparse signals in random fields with an applicationto brain mapping J Amer Statist Assoc 102(479)913ndash928 2007

[25] J Traub G Wasilokowski and H Wozniakowski Information-Based Complexity AcademicPress New York NY 1988

[26] BS Tsirelson IA Ibragimov and VN Sudakov Norms of Gaussian sample functionsProceedings of the Third Japan-USSR Symposium on Probability Theory (Tashkent 1975)55020ndash41 1976

[27] V S Tsirelrsquoson The density of the maximum of a gaussian process Theory of Probab Appl20847ndash856 1975

[28] H Wozniakowski Computational complexity of continuous problems 1996 Technical Re-port

Robert J AdlerElectrical EngineeringTechnion Haifa Israel 32000E-mail roberteetechnionacilURL webeetechnionacilpeopleadler

Jose H BlanchetIndustrial Engineering amp Operations ResearchColumbia Uninversity340 SW Mudd Building500 W 120 StreetNew York New York 10027E-mail joseblanchetcolumbiaeduURL wwwieorcolumbiaedufac-biosblanchetfacultyhtml

42 ADLER BLANCHET LIU

Jingchen LiuDepartment of StatisticsColumbia University1255 Amsterdam Avenue Room 1030New York NY 10027E-mail jcliustatcolumbiaeduURL statcolumbiaedu~jcliu

Page 4: EFFICIENT MONTE CARLO FOR HIGH EXCURSIONS OF … › ~jblanche › papers › GRF1.pdf · 1. Introduction. This paper centers on the design and analysis of e–cient Monte Carlo techniques

4 ADLER BLANCHET LIU

number of function evaluations in order to maintain a prescribed level of relativeaccuracy This phenomenon is unrelated to the setting of Gaussian random fields Inparticular it can be easily seen to happen even when evaluating a one dimensionalintegral with a small integrand On the other hand we believe that our estimatorscan in practice be easily combined with quasi-Monte Carlo or other numericalintegration methods The rigorous analysis of such hybrid approaches although ofgreat interest requires an extensive treatment and will be pursued in the futureAs an aside we note that quasi-Monte Carlo techniques have been used in theexcursion analysis of Gaussian random fields in [7]

The remainder of the paper is organized as follows In Section 2 we introducethe basic notions of polynomial algorithmic complexity which are borrowed fromthe general theory of computation Section 3 discusses the main results in light ofthe complexity considerations of Section 2 Section 4 provides a brief introductionto importance sampling a simulation technique that we shall use heavily in thedesign of our algorithms The analysis of finite fields which is given in Section 5is helpful to develop the basic intuition behind our procedures for the general caseSection 6 provides the construction and analysis of a polynomial time algorithm forhigh excursion probabilities of Holder continuous fields Finally in Section 7 weadd additional smoothness assumptions along with stationarity and explain how tofine tune the construction of our procedures in order to further improve efficiencyin these cases

2 Basic Notions of Computational Complexity In this section we shalldiscuss some general notions of efficiency and computational complexity related tothe approximation of the probability of the rare events Bb b ge b0 for whichP (Bb) rarr 0 as b rarrinfin In essence efficiency means that computational complexityis in some sense controllable uniformly in b A notion that is popular in theefficiency analysis of Monte Carlo methods for rare events is weak efficiency (alsoknown as asymptotic optimality) which requires that the coefficient of variation ofa given estimator Lb of P (Bb) to be dominated by 1P (Bb)

ε for any ε gt 0 Moreformally we have

Definition 21 A family of estimators Lb b ge b0 is said to be polynomiallyefficient for estimating P (Bb) if E(Lb) = P (Bb) and there exists a q isin (0infin) forwhich

(21) supbgeb0

V ar(Lb)[P (Bb)]

2 |log P (Bb)|qlt infin

Moreover if (21) holds with q = 0 then the family is said to be strongly efficient

Below we often refer to Lb as a strongly (polynomially) efficient estimator bywhich we mean that the family Lb b gt 0 is strongly (polynomially) efficientIn order to understand the nature of this definition let L(j)

b 1 le j le n be acollection of iid copies of Lb The averaged estimator

Ln (b) =1n

nsum

j=1

L(j)b

MONTE CARLO FOR RANDOM FIELDS 5

has a relative mean squared error equal to [V ar(Lb)]12[n12P (Bb)] A simpleconsequence of Chebyshevrsquos inequality is that

P(∣∣∣Ln (b) P (Bb)minus 1

∣∣∣ ge ε)le V ar (Lb)

ε2nP [(Bb)]2

Thus if Lb is polynomially efficient and we wish to compute P (Bb) with at mostε relative error and at least 1minus δ confidence it suffices to simulate

n = Θ(εminus2δminus1 |log P (Bb)|q)

iid replications of Lb In fact in the presence of polynomial efficiency the boundn = Θ

(εminus2δminus1 |log P (Bb)|q

)can be boosted to n = Θ

(εminus2 log(δminus1) |log P (Bb)

∣∣q)using the so-called median trick (see [18])

Naturally the cost per replication must also be considered in the analysis andwe shall do so but the idea is that evaluating P (Bb) via crude Monte Carlo wouldrequire given ε and δ n = Θ (1P (Bb)) replications Thus a polynomially efficientlyestimator makes the evaluation of P (Bb) exponentially faster relative to crudeMonte Carlo at least in terms of the number of replications

Note that a direct application of deterministic algorithms (such as quasi MonteCarlo or quadrature integration rules) might improve (under appropriate smooth-ness assumptions) the computational complexity relative to Monte Carlo althoughonly by a polynomial rate (ie the absolute error decreases to zero at rate nminusp forp gt 12 where n is the number of function evaluations and p depends on the di-mension of the function that one is integrating see for instance [6]) We believe thatthe procedures that we develop in this paper can guide the construction of efficientdeterministic algorithms with small relative error and with complexity that scalesat a polynomial rate in |log P (Bb)| This is an interesting research topic that weplan to explore in the future

An issue that we shall face in designing our Monte Carlo procedure is that dueto the fact that f will have to be discretized the corresponding estimator Lb willnot be unbiased In turn in order to control the relative bias with an effort that iscomparable to the bound on the number of replications discussed in the preceedingparagraph one must verify that the relative bias can be reduced to an amountless than ε with probability at least 1 minus δ at a computational cost of the formO(εminusq0 |log P (Bb)|q1) If Lb(ε) can be generated with O(εminusq0 |log P (Bb)|q1) costand satisfying

∣∣∣P (Bb)minus ELb(ε)∣∣∣ le εP (Bb) and if

supbgt0

V ar(Lb(ε))P (Bb)

2 |log P (Bb)|qlt infin

for some q isin (0infin) then Lprimen (b ε) =sumn

j=1 L(j)b (ε)n where the L

(j)b (ε)rsquos are iid

copies of Lb(ε) satisfies

P (|Lprimen (b ε) P (Bb)minus 1| ge 2ε) le V ar(Lb(ε))ε2 times ntimes P (Bb)

2

6 ADLER BLANCHET LIU

Consequently taking n = Θ(εminus2δminus1 |log P (Bb)|q

)suffices to give an estimator with

at most ε relative error and 1 minus δ confidence and the total computational cost isΘ

(εminus2minusq0δminus1 |log P (Bb)|q+q1

)

We shall measure computational cost in terms of function evaluations such as asingle addition a multiplication a comparison the generation of a single uniformrandom variable on T the generation of a single standard Gaussian random variableand the evaluation of Φ(x) for fixed x ge 0 where

Φ(x) = 1minusΨ(x) =1radic2π

int x

minusinfineminuss22 ds

All of these function evaluations are assumed to cost at most a fixed amount c ofcomputer time Moreover we shall also assume that first and second order momentcharacteristics of the field such as micro (t) = Ef (t) and C (s t) = Cov (f (t) f (s))can be computed in at most c units of computer time for each s t isin T We note thatsimilar models of computation are often used in the complexity theory of continuousproblems see [25]

The previous discussion motivates the next definition which has its roots in thegeneral theory of computation in both continuous and discrete settings [17 25]In particular completely analogous notions in the setting of complexity theory ofcontinuous problems lead to the notion of ldquotractabilityrdquo of a computational problem[28]

Definition 22 A Monte Carlo procedure is said to be a fully polynomialrandomized approximation scheme (FPRAS) for estimating P (Bb) if for someq q1 q2 isin [0infin) it outputs an averaged estimator that is guaranteed to have at mostε gt 0 relative error with confidence at least 1minusδ isin (0 1) in Θ(εminusq1δminusq2 | log P (Bb) |q)function evaluations

The terminology adopted namely FPRAS is borrowed from the complexitytheory of randomized algorithms for counting [17] Many counting problems canbe expressed as rare event estimation problems In such cases it typically occursthat the previous definition (expressed in terms of a rare event probability) coincidesprecisely with the standard counting definition of a FPRAS (in which there is noreference to any rare event to estimate) This connection is noted for instance in[10] Our terminology is motivated precisely by this connection

By letting Bb = flowast gt b the goal in this paper is to design a class of fully poly-nomial randomized approximation schemes that are applicable to a general class ofGaussian random fields In turn since our Monte Carlo estimators will be based onimportance sampling it turns out that we shall also be able to straightforwardlyconstruct FPRASrsquos to estimate quantities such as E[Γ(f)| suptisinT f(t) gt b] for asuitable class of functionals Γ for which Γ (f) can be computed with an error ofat most ε with a cost that is polynomial as function of εminus1 We shall discuss thisobservation in Section 4 which deals with properties of importance sampling

MONTE CARLO FOR RANDOM FIELDS 7

3 Main Results In order to state and discuss our main results we need somenotation For each s t isin T define

micro(t) = E(f(t)) C (s t) = Cov(f (s) f (t)) σ2(t) = C(t t) gt 0 r(s t) =C(s t)

σ(s)σ(t)

Moreover given x isin Rd and β gt 0 we write |x| =sumd

j=1 |xj | where xj is the j-thcomponent of x We shall assume that for each fixed s t isin T both micro (t) and C (s t)can be evaluated in at most c units of computing time

Our first result shows that under modest continuity conditions on micro σ and r itis possible to construct an explicit FPRAS for w(b) under the following regularityconditions

A1 The field f is almost surely continuous on T A2 For some δ gt 0 and |sminus t| lt δ the mean and variance functions satisfies

|σ (t)minus σ (s)|+ |micro (t)minus micro (s)| le κH |sminus t|β

A3 For some δ gt 0 and |s minus sprime| lt δ |t minus tprime| lt δ the correlation function of fsatisfies

|r (t s)minus r (tprime sprime)| le κH [|tminus tprime|β + |sminus sprime|β ]

A4 0 isin T There exist κ0 and ωd such that for any t isin T and ε small enough

m(B(t ε) cap T ) ge κ0εdωd

where m is the Lebesgue measure B(t ε) = s |tminus s| le ε and ωd =m (B(0 1))

The assumption that 0 isin T is of no real consequence and is adopted only fornotational convenience

Theorem 31 Suppose that f T rarr R is a Gaussian random field satisfyingConditions A1ndashA4 above Then Algorithm 61 provides a FPRAS for w (b)

The polynomial rate of the intrinsic complexity bound inherent in this result isdiscussed in Section 6 along with similar rates in results to follow The conditions ofthe previous theorem are weak and hold for virtually all applied settings involvingcontinuous Gaussian fields on compact sets

Not surprisingly the complexity bounds of our algorithms can be improved uponunder additional assumptions on f For example in the case of finite fields (ie whenT is finite) with a non-singular covariance matrix we can show that the complexityof the algorithm is actually bounded as b infin We summarize this in the nextresult whose proof which is given in Section 5 is useful for understanding themain ideas behind the general procedure

Theorem 32 Suppose that T is a finite set and f has a non-singular covari-ance matrix over T times T Then Algorithm 53 provides a FPRAS with q = 0

8 ADLER BLANCHET LIU

As we indicated above the strategy behind the discrete case provides the basisfor the general case In the general situation the underlying idea is to discretize thefield with an appropriate sampling (discretization) rule that depends on the levelb and the continuity characteristics of the field The number of sampling pointsgrows as b increases and the complexity of the algorithm is controlled by findinga good sampling rule There is a trade off between the number of points sampledwhich has a direct impact on the complexity of the algorithm and the fidelityof the discrete approximation to the continuous field Naturally in the presence ofenough smoothness and regularity more information can be obtained with the samesample size This point is illustrated in the next result Theorem 33 which considerssmooth homogeneous fields Note that in addition to controlling the error inducedby discretizing the field the variance is strongly controlled and the discretizationrule is optimal in a sense explained in Section 7 For Theorem 33 we require thefollowing additional regularity conditions

B1 f is homogeneous and almost surely twice continuously differentiableB2 0 isin T sub Rd is a d-dimensional convex set with nonempty interior Denoting

its boundary by partT assume that partT is a (dminus1)-dimensional manifold withoutboundary For any t isin T assume that there exists κ0 gt 0 such that

m(B(t ε) cap T ) ge κ0εd

for any ε lt 1 where m is Lebesgue measure

Theorem 33 Let f satisfy conditions B1ndashB2 Then Algorithm 73 provides aFPRAS Moreover the underlying estimator is strongly efficient and there exists adiscretization scheme for f which is optimal in the sense of Theorem 74

4 Importance Sampling Importance sampling is based on the basic iden-tity for fixed measurable B

(41) P (B) =int

1 (ω isin B) dP (ω) =int

1 (ω isin B)dP

dQ(ω) dQ (ω)

where we assume that the probability measure Q is such that Q (middot capB) is abso-lutely continuous with respect to the measure P (middot capB) If we use EQ to denoteexpectation under Q then (41) trivially yields that the random variable

L (ω) = 1 (ω isin B)dP

dQ(ω)

is an unbiased estimator for P (B) gt 0 under the measure Q or symbolicallyEQL = P (B)

An averaged importance sampling estimator based on the measure Q which isoften referred as an importance sampling distribution or a change-of-measure isobtained by simulating n iid copies L(1) L(n) of L under Q and computing theempirical average Ln = (L(1) + +L(n))n A central issue is that of selecting Q inorder to minimize the variance of Ln It is easy to verify that if Qlowast(middot) = P (middot|B) =

MONTE CARLO FOR RANDOM FIELDS 9

P (middot capB)P (B) then the corresponding estimator has zero variance However Qlowast isclearly a change of measure that is of no practical value since P (B) ndash the quantitythat we are attempting to evaluate in the first place ndash is unknown Neverthelesswhen constructing a good importance sampling distribution for a family of setsBb b ge b0 for which 0 lt P (Bb) rarr 0 as b rarrinfin it is often useful to analyze theasymptotic behavior of Qlowast as P (Bb) rarr 0 in order to guide the construction of agood Q

We now describe briefly how an efficient importance sampling estimator forP (Bb) can also be used to estimate a large class of conditional expectations givenBb Suppose that a single replication of the corresponding importance samplingestimator

Lb∆= 1 (ω isin Bb) dPdQ

can be generated in O (log |P (Bb)|q0) function evaluations for some q0 gt 0 andthat

V ar (Lb) = O([P (Bb)]2 log |P (Bb)|q0

)

These assumptions imply that by taking the average of iid replications of Lb weobtain a FPRAS

Fix β isin (0infin) and let X (β q) be the class of random variables X satisfying0 le X le β with

E[X|Bb] = Ω[1 log(P (Bb))q](42)

Then by noting that

(43)EQ (XLb)EQ (Lb)

= E[X|Bb] =E[X Bb]P (Bb)

it follows easily that a FPRAS can be obtained by constructing the natural estima-tor for E[X|Bb] ie the ratio of the corresponding averaged importance samplingestimators suggested by the ratio in the left of (43) Of course when X is difficultto simulate exactly one must assume the bias E[X Bb] can be reduced in poly-nomial time The estimator is naturally biased but the discussion on FPRAS onbiased estimators given in Section 2 can be directly applied

In the context of Gaussian random fields we have that Bb = flowast gt b andone is very often interested in random variables X of the form X = Γ (f) whereΓ C (T ) rarr R and C(T ) denotes the space of continuous functions on T EndowingC(T ) with the uniform topology consider functions Γ that are non-negative andbounded by a positive constant An archetypical example is given by the volumeof (conditioned) high level excursion sets with β = m(T ) is known to satisfy (42)However there are many other examples of X (β q) with β = m(T ) which satisfy(42) for a suitable q depending on the regularity properties of the field In fact ifthe mean and covariance properties of f are Holder continuous then using similartechniques as those given in the arguements of Section 6 it is not difficult to see

10 ADLER BLANCHET LIU

that q can be estimated In turn we have that a FPRAS based importance sam-pling algorithm for w (b) would typically also yield a polynomial time algorithm forfunctional characteristics of the conditional field given high level excursions Sincethis is a very important and novel application we devote the remainder of thispaper to the development of efficient importance sampling algorithms for w(b)

5 The Basic Strategy Finite Fields In this section we develop our mainideas in the setting in which T is a finite set of the form T = t1 tM Toemphasize the discrete nature of our algorithms in this section we write Xi = f (ti)for 1 M and set X = (X1 XM ) This section is mainly of an expositorynature since much of it has already appeared in [2] Nevertheless it is includedhere as a useful guide to the intuition behind the algorithms for the continuouscase

We have already noted that in order to design an efficient importance samplingestimator for w (b) = P (max1leileM Xi gt b) it is useful to study the asymptoticconditional distribution of X given that max1leileM Xi gt b Thus we begin withsome basic large deviation results

Proposition 51 For any set of random variables X1 XM

max1leileM

P (Xi gt b) le P

(max

1leileMXi gt b

)le

Msum

j=1

P (Xj gt b)

Moreover if the Xj are mean zero Gaussian and the correlation between Xi andXj is strictly less than 1 then

P (Xi gt bXj gt b) = o (max[P (Xi gt b) P (Xj gt b)])

Thus if the associated covariance matrix of X is non-singular

w (b) = (1 + o (1))Msum

j=1

P (Xj gt b)

Proof The lower bound in the first display is trivial and the upper bound followsby the union bound The second display follows easily by working with the jointdensity of a bivariate Gaussian distribution (eg [9 16]) and the third claim is adirect consequence of the inclusion-exclusion principle 2

As noted above Qlowast corresponds to the conditional distribution of X given thatXlowast max1leileM Xi gt b It follows from Proposition 51 that conditional on Xlowast gtb the probability that two or more Xj exceed b is negligible Moreover it alsofollows that

P (Xi = Xlowast|Xlowast gt b) = (1 + o (1))P (Xi gt b)sumM

j=1 P (Xj gt b)

The following corollary now follows as an easy consequence of these observations

MONTE CARLO FOR RANDOM FIELDS 11

Corollary 52dTV (QlowastQ) rarr 0

as b rarrinfin where dTV denotes the total variation norm and Q is defined for BorelB sub RM as

Q (X isin B) =Msum

i=1

p (i b)P [X isin B|Xi gt b]

where

p (i b) =P (Xi gt b)sumM

j=1 P (Xj gt b)

Proof Pick an arbitrary Borel B Then we have that

Qlowast (X isin B) =P [X isin B max1leileM Xi gt b]

w (b)le

Msum

i=1

P [X isin B Xi gt b]w (b)

=Msum

i=1

P [X isin B|Xi gt b]p (i b)

(1 + o (1))

The above which follows from the union bound and the last part of Proposition51 combined with the definition of Q yields that for each ε gt 0 there exists b0

(independent of B) such that for all b ge b0

Qlowast (X isin B) le Q (X isin B) (1minus ε)

The lower bound follows similarly using the inclusion-exclusion principle and thesecond part of Proposition 51 2

Corollary 52 provides support for choosing Q as an importance sampling dis-tribution Of course we still have to verify that the corresponding algorithm is aFPRAS The importance sampling estimator induced by Q takes the form

(51) Lb =dP

dQ=

sumMj=1 P (Xj gt b)

sumMj=1 1 (Xj gt b)

Note that under Q we have that Xlowast gt b almost surely so the denominator in theexpression for Lb is at least 1 Therefore we have that

EQL2b le

Msum

j=1

P (Xj gt b)

2

and by virtue of Proposition 51 we conclude (using V arQ to denote the varianceunder Q) that

V arQ (Lb)P (Xlowast gt b)2

rarr 0

12 ADLER BLANCHET LIU

as b rarrinfin In particular it follows that Lb is strongly efficientOur proposed algorithm can now be summarized as follows

Algorithm 53 There are two steps in the algorithm

Step 1 Simulate n iid copies X(1) X(n) of X from the distribution QStep 2 Compute and output

Ln =1n

nsum

i=1

L(i)b

where L(i)b =

sumMj=1 P (X(i)

j gt b)sumM

j=1 1(X(i)j gt b)

Since the generation of Li under Q takes O(M3

)function evaluations we con-

clude based on the analysis given in Section 2 that Algorithm 53 is a FPRASwith q = 0 This implies Theorem 32 as promised

6 A FPRAS for Holder Continuous Gaussian Fields In this sectionwe shall describe the algorithm and the analysis behind Theorem 31 Throughoutthe section unless stated otherwise we assume conditions A1ndashA4 of Section 3

There are two issues related to the complexity analysis Firstly since f is assumedcontinuous the entire field cannot be generated in a (discrete) computer and so thealgorithm used in the discrete case needs adaptation Once this is done we need tocarry out an appropriate variance analysis

Developing an estimator directly applicable to the continuous field will be car-ried out in Section 61 This construction will not only be useful when studyingthe performance of a suitable discretization but will also help to explain some ofthe features of our discrete construction Then in Section 62 we introduce a dis-cretization approach and study the bias caused by the discretization In additionwe provide bounds on the variance of this discrete importance sampling estimator

61 A Continuous Estimator We start with a change of measure motivatedby the discrete case in Section 5 A natural approach is to consider an importancesampling strategy analogous to that of Algorithm 53 The continuous adaptationinvolves first sampling τb according to the probability measure

(61) P (τb isin middot) =E[m (Ab cap middot)]

E[m (Ab)]

where Ab = t isin T f (t) gt b The idea of introducing τb in the continuous settingis not necessarily to locate the point at which the maximum is achieved as was thesituation in the discrete case Rather τb will be used to find a random point whichhas a reasonable probability of being in the excursion set Ab (This probability willtend to be higher if f is non-homogenous) This relaxation will prove useful in theanalysis of the algorithm Note that τb with the distribution indicated in equation(61) has a density function (with respect to Lebesgue measure) given by

hb(t) =P (f(t) gt b)E[m(Ab)]

MONTE CARLO FOR RANDOM FIELDS 13

and that we also can write

E[m (Ab)] = E

int

T

1 (f (t) gt b) dt =int

T

P (f (t) gt b) dt = m (T ) P (f (U) gt b)

where U is uniformly distributed over T Once τb is generated the natural continuous adaptation corresponding to the

strategy described by Algorithm 53 proceeds by sampling f conditional on f(τb) gtb Note that if we use Q to denote the change-of-measure induced by such a con-tinuous sampling strategy then the corresponding importance sampling estimatortakes the form

Lb =dP

dQ=

E[m (Ab)]m (Ab)

The second moment of the estimator then satisfies

EQ[(Lb)2] = E(LbAb 6= empty

)= E[m(Ab)]P (flowast gt b)E[m(Ab)minus1|Ab 6= empty](62)

Unfortunately it is easy to construct examples for which EQ[(Lb

)2] is infiniteConsequently the above construction needs to be modified slightly

Elementary extreme value theory considerations for the maximum of finitelymany Gaussian random variables and their analogues for continuous fields givethat the overshoot of f over a given level b will be of order Θ(1b) Thus in orderto keep τb reasonably close to the excursion set we shall also consider the possibilityof an undershoot of size Θ (1b) right at τb As we shall see this relaxation will allowus to prevent the variance in (62) becoming infinite Thus instead of (61) we shallconsider τbminusab with density

(63) hbminusab(t) =P (f(t) gt bminus ab)

E[m(Abminusab)]

for some a gt 0 To ease on later notation write

γab bminus ab τγab= τbminusab

Let Qprime be the change of measure induced by sampling f as follows Given τγab

sample f(τγab) conditional on f(τγab

) gt γab In turn the rest of f follows its con-ditional distribution (under the nominal or original measure) given the observedvalue f(τγab

) We then have that the corresponding Radon-Nikodym derivative is

(64)dP

dQprime =E[m(Aγab

)]m(Aγab

)

and the importance sampling estimator Lprimeb is

(65) Lprimeb =dP

dQprime 1 (Ab 6= empty) =E[m(Aγab

)]m(Aγab

)1 (m(Ab) gt 0)

Note that we have used the continuity of the field in order to write m (Ab) gt 0 =Ab 6= empty almost surely The motivation behind this choice lies in the fact that

14 ADLER BLANCHET LIU

since m(Aγab) gt m (Ab) gt 0 the denominator may now be big enough to control

the second moment of the estimator As we shall see introducing the undershootof size ab will be very useful in the technical development both in the remainderof this section and in Section 7 In addition its introduction also provides insightinto the appropriate form of the estimator needed when discretizing the field

62 Algorithm and Analysis We still need to face the problem of generatingf in a computer Thus we now concentrate on a suitable discretization schemestill having in mind the change of measure leading to (64) Since our interest isto ultimately design algorithms that are efficient for estimating expectations suchas E[Γ (f) |flowast gt b] where Γ may be a functional of the whole field we shall use aglobal discretization scheme

Consider U = (U1 UM ) where Ui are iid uniform random variables takingvalues in T and independent of the field f Set TM = U1 UM and Xi =Xi (Ui) = f (Ui) for 1 le i le M Then X = (X1 Xm) (conditional on U) isa multivariate Gaussian random vector with conditional means micro(Ui)

∆= E(Xi|Ui)and covariances C (Ui Uj)

∆= Cov (Xi Xj |Ui Uj) Our strategy is to approximatew(b) by

wM (γab) = P (maxtisinTM

f(t) gt γab) = E[P ( max1leileM

Xi gt γab|U)]

Given the development in Section 5 it might not be surprising that if we can ensurethat M = M (ε b) is polynomial in 1ε and b then we shall be in a good positionto develop a FPRAS The idea is to apply an importance sampling strategy similarto that we considered in the construction of Lprimeb of (65) but this time it will beconditional on U In view of our earlier discussions we propose sampling from Qprimeprime

defined via

Qprimeprime (X isin B|U) =Msum

i=1

pU (i)P [X isin B|Xi gt γab U ]

where

pU (i) =P (Xi gt γab|U)sumM

j=1 P (Xj gt γab|U)

We then obtain the (conditional) importance sampling estimator

(66) Lb(U) =sumM

i=1 P (Xi gt γab|U)sumMi=1 1(Xi gt γab)

It is clear thatwM (γab) = EQprimeprime [Lb(U)]

Suppose for the moment that M a and the number of replications n have beenchosen Our future analysis will in particular guide the selection of these parametersThen the procedure is summarized by the next algorithm

MONTE CARLO FOR RANDOM FIELDS 15

Algorithm 61 The algorithm has three steps

Step 1 Simulate U (1) U (n) which are n iid copies of the vector U = (U1 UM )described aboveStep 2 Conditional on each U (i) for i = 1 n generate L

(i)b (U (i)) as described

by (66) by considering the distribution of X(i)(U (i)) = (X(i)1 (U (i)

1 ) X(i)M (U (i)

M ))Generate the X(i)(U (i)) independently so that at the end we obtain that the L

(i)b (U (i))

are n iid copies of Lb(U)Step 3 Output

Ln(U (1) U (n)) =1n

nsum

i=1

L(i)b (U (i))

63 Running Time of Algorithm 61 Bias and Variance Control The remain-der of this section is devoted to the analysis of the running time of the Algorithm61 The first step lies in estimating the bias and second moment of Lb(U) underthe change of measure induced by the sampling strategy of the algorithm whichwe denote by Qprimeprime We start with a simple bound for the second moment

Proposition 62 There exists a finite λ0 depending on microT = maxtisinT |micro (t)|and σ2

T = maxtisinT σ2 (t) for which

EQprimeprime [Lb(U)2] le λ0M2P

(maxtisinT

f (t) gt b

)2

Proof Observe that

EQprimeprime [Lb(U)2] le E

(Msum

i=1

P (Xi gt γab|Ui)

)2

le E

(Msum

i=1

suptisinT

P (f (Ui) gt γab|Ui = t)

)2

= M2 maxtisinT

P (f (t) gt γab)2

le λ0M2 max

tisinTP (f (t) gt b)2

le λ0M2P

(maxtisinT

f (t) gt b

)2

which completes the proof 2

Next we obtain a preliminary estimate of the bias

16 ADLER BLANCHET LIU

Proposition 63 For each M ge 1 we have

|w (b)minus wM (b)| le E[exp

(minusMm(Aγab

)m(T )

) Ab cap T 6= empty]

+ P

(maxtisinT

f (t) gt γab maxtisinT

f (t) le b

)

Proof Note that

|w (b)minus wM (γab)|le P (max

tisinTM

f (t) le γab maxtisinT

f (t) gt b) + P (maxtisinTM

f (t) gt γab maxtisinT

f (t) le b)

The second term is easily bounded by

P (maxtisinTM

f (t) gt γabmaxtisinT

f (t) le b) le P (maxtisinT

f (t) gt γab maxtisinT

f (t) le b)

The first term can be bounded as follows

P

(maxtisinTM

f (t) le γabmaxtisinT

f (t) gt b

)

le E[(P [f (Ui) le γab| f ])M 1(Ab cap T 6= empty)

]

le E[(

1minusm(Aγab)m (T )

)M Ab cap T 6= empty)]

le E[exp

(minusMm(Aγab)m (T )

) Ab cap T 6= empty)

]

This completes the proof 2

The previous proposition shows that controlling the relative bias of Lb(U) re-quires finding bounds for

(67) E[exp(minusMm(Aγab

)m(T )) Ab cap T 6= empty]

and

(68) P (maxtisinT

f (t) gt γabmaxtisinT

f (t) le b)

and so we develop these To bound (67) we take advantage of the importancesampling strategy based on Qprime introduced earlier in (64) Write

E[exp

(minusMm(Aγab)m (T )

) m(Ab) gt 0

](69)

= EQprime(

exp(minusMm(Aγab

)m (T ))

m(Aγab)

m(Ab) gt 0

)Em

(Aγab

)

MONTE CARLO FOR RANDOM FIELDS 17

Furthermore note that for each α gt 0 we have

EQprime(

exp(minusMm

(Aγab

)m (T )

)

m(Aγab

) m(Ab) gt 0

)(610)

le αminus1 exp (minusMαm(T ))

+ EQprime(

exp(minusMm(Aγab

)m(T ))

m(Aγab

) m(Aγab) le αm(Ab) gt 0

)

The next result whose proof is given in Section 64 gives a bound for the aboveexpectation

Proposition 64 Let β be as in Conditions A2 and A3 For any v gt 0 thereexist constants κ λ2 isin (0infin) (independent of a isin (0 1) and b but dependent on v)such that if we select

αminus1 ge κdβ(ba)(2+v)2dβ

and define W such that P (W gt x) = exp(minusxβd

)for x ge 0 then

EQprime(

exp(minusMm(Aγab

)m (T ))

m(Aγab)

m(Aγab) le αm(Ab) gt 0

)(611)

le EW 2 m (T )

λ2dβ2 M

(b

a

)4dβ

The following result gives us a useful upper bound on (68) The proof is givenin Section 65

Proposition 65 Assume that Conditions A2 and A3 are in force For anyv gt 0 let ρ = 2dβ + dv + 1 where d is the dimension of T There exist constantsb0 λ isin (0infin) (independent of a but depending on microT = maxtisinT |micro (t)| σ2

T =maxtisinT σ2 (t) v the Holder parameters β and κH) so that for all b ge b0 ge 1 wehave

(612) P

(maxtisinT

f(t) le b + ab | maxtisinT

f(t) gt b

)le λabρ

Consequently

P

(maxtisinT

f (t) gt γab maxtisinT

f (t) le b

)

= P

(maxtisinT

f (t) le b|maxtisinT

f (t) gt γab

)P

(maxtisinT

f(t) gt γab

)

le λabρP

(maxtisinT

f(t) gt γab

)

18 ADLER BLANCHET LIU

Moreover

P

(maxtisinT

f(t) gt γab

)(1minus λabρ) le P

(maxtisinT

f(t) gt b

)

Propositions 64 and 65 allow us to prove Theorem 31 which is rephrased inthe form of the following theorem which contains the detailed rate of complexityand so the main result of this section

Theorem 66 Suppose f is a Gaussian random field satisfying conditions A1ndashA4 in Section 3 Given any v gt 0 put a = ε(4λbρ) (where λ and ρ as in Proposition65) and αminus1 = κdβ(ba)(2+v)dβ Then there exist c ε0 gt 0 such that for allε le ε0

(613) |w (b)minus wM (γab)| le w (b) ε

if M =lceilcεminus1(ba)(4+4v)dβ

rceil Consequently by our discussion in Section 2 and the

bound on the second moment given in Proposition 62 it follows that Algorithm 61provides a FPRAS with running time O

((M)3 times (M)2 times εminus2δminus1

)

Proof Combining (63) (69) and (610) with Propositions 63ndash65 we have that

|w (b)minus wM (b) |le αminus1 exp (minusMαm (T )) E

[m

(Aγab

)]

+E[W 2]m (T )

λ2dβ2 M

(b

a

)4dβ

E[m

(Aγab

)]+

(λabρ

1minus λabρ

)w (b)

Furthermore there exists a constant K isin (0infin) such that

Em(Aγab

) le K maxtisinT

P (f (t) gt b)m (T ) le Kw (b) m (T )

Therefore we have that

|w (b)minus wM (b) |w (b)

le αminus1Km(T ) exp (minusMαm (T ))

+E[W 2]Km (T )2

λ2dβ2 M

(b

a

)4dβ

+(

λabρ

1minus λabρ

)

Moreover since a = ε(4λbρ) we obtain that for ε le 12

|w (b)minus wN (b) |w (b)

le αminus1Km(T ) exp (minusMαm (T ))

+[EW 2]Km (T )2

λ2dβ2 M

(b

a

)4dβ

+ ε2

MONTE CARLO FOR RANDOM FIELDS 19

From the selection of α M and θ it follows easily that the first two terms on theright hand side of the previous display can be made less than ε2 for all ε le ε0 bytaking ε0 sufficiently small

The complexity count given in the theorem now corresponds to the followingestimates The factor O((M)3) represents the cost of a Cholesky factorization re-quired to generate a single replication of a finite field of dimension M In additionthe second part of Proposition 62 gives us that O(M2εminus2δminus1) replications are re-quired to control the relative variance of the estimator 2

We now proceed to prove Propositions 64 and 65

64 Proof of Proposition 64 We concentrate on the analysis of the left handside of (611) An important observation is that conditional on the random variableτγab

with distribution

Qprime(τγabisin middot) =

E[m(Aγab

cap middot)]

E[m(Aγab

)]

and given f(τγab) the rest of the field namely (f (t) t isin Tτγab

) is anotherGaussian field with a computable mean and covariance structure The second termin (610) indicates that we must estimate the probability that m(Aγab

) takes smallvalues under Qprime For this purpose we shall develop an upper bound for

(614) P(m(Aγab

) lt yminus1 m(Ab) gt 0∣∣ f (t) = γab + zγab

)

for y large enough Our arguments proceeds in two steps For the first in order tostudy (614) we shall estimate the conditional mean covariance of f (s) s isin Tgiven that f (t) = γab + zγab Then we use the fact that the conditional field isalso Gaussian and take advantage of general results from the theory of Gaussianrandom fields to obtain a bound for (614) For this purpose we recall some usefulresults from the theory of Gaussian random fields The first result is due to Dudley[14]

Theorem 67 Let U be a compact subset of Rn and let f0(t) t isin U be amean zero continuous Gaussian random field Define the canonical metric d on Uas

d (s t) =radic

E[f0 (t)minus f0 (s)]2

and put diam (U) = supstisinU d (s t) which is assumed to be finite Then there existsa finite universal constant κ gt 0 such that

E[maxtisinU

f0 (t)] le κ

int diam(U)2

0

[log (N (ε))]12dε

where the entropy N (ε) is the smallest number of dminusballs of radius ε whose unioncovers U

20 ADLER BLANCHET LIU

The second general result that we shall need is the so-called B-TIS (Borel-Tsirelson-Ibragimov-Sudakov) inequality [4 11 26]

Theorem 68 Under the setting described in Theorem 67

P

(maxtisinU

f0 (t)minus E[maxtisinU

f0 (t)] ge b

)le exp

(minusb2(2σ2U )

)

whereσ2U = max

tisinUE[f2

0 (t)]

We can now proceed with the main proof We shall assume from now on thatτγab

= 0 since as will be obvious from what follows all estimates hold uniformlyover τγab

isin T This is a consequence of the uniform Holder assumptions A2 andA3 Define a new process f

(f (t) t isin T ) L= (f (t) t isin T | f (0) = γab + zγab)

Note that we can always write f (t) = micro (t)+g (t) where g is a mean zero Gaussianrandom field on T We have that

micro (t) = Ef (t) = micro (t) + σ (0)minus2C (0 t) (γab + zγab minus micro (0))

and that the covariance function of f is given by

Cg (s t) = Cov (g (s) g (t)) = C (s t)minus σ (0)minus2C (0 s)C (0 t)

The following lemma describes the behavior of micro (t) and Cg (t s)

Lemma 69 Assume that |s| and |t| small enough Then the following threeconclusions hold

(i) There exist constants λ0 and λ1 gt 0 such that

|micro (t)minus (γab + zγab)| le λ0 |t|β + λ1 |t|β (γab + zγab)

and for all z isin (0 1) and γab large enough

|micro (s)minus micro (t) | le κHγab|sminus t|β

(ii)Cg (s t) le 2κHσ (t)σ (s) |t|β + |s|β + |tminus s|β

(iii)

Dg (s t) =radic

E([g (t)minus g (s)]2) le λ121 |tminus s|β2

Proof All three consequences follow from simple algebraic manipulations Thedetails are omitted 2

MONTE CARLO FOR RANDOM FIELDS 21

Proposition 610 For any v gt 0 there exist κ and λ2 such that for all t isin T yminusβd le a2+v

κb(2+v) a sufficiently small and z gt 0

P(m(Aγab

)minus1 gt y m(Ab) gt 0∣∣ f (t) = γab + zγab

) le exp(minusλ2a

2yβdb2)

Proof For notational simplicity and without loss of generality we assume thatt = 0 First consider the case that z ge 1 Then there exist c1 c2 such that for allc2y

minusβd lt bminus2minusv and z gt 1

P(m

(Aγab

cap T)minus1

gt ym(Ab) gt 0| f (0) = γab + zγab

)

le P

(inf

|t|ltc1yminus1df(t) le γab|f (0) = γab + zγab

)

= P

(inf

|t|ltc1yminus1dmicro (t) + g(t) le γab

)

le P

(inf

|t|ltc1yminus1dg(t) le minus 1

2γab

)

Now apply (iii) from Lemma 69 from which it follows that N (ε) le c3m(T )ε2dβ

for some constant c3 By Theorem 67 E(sup|t|ltc1yminus1d f(t)) = O(yminusβ(2d) log y)By Theorem 68 for some constant c4

P(m

(Aγab

cap T)minus1

gt y m(Ab) gt 0| f (0) = γab + zγab

)

le P

(inf

|t|ltc1yminus1dg(t) le minus 1

2γab

)

le exp

(minus 1

c4γ2aby

minusβd

)

for c2yminusβd lt bminus2minusv and z gt 1

Now consider the case z isin (0 1) Let tlowast be the global maximum of f(t) Then

P(m

(Aγab

cap T)minus1

gt y m(Ab) gt 0|f (0) = γab + zγab

)

le P

(inf

|tminustlowast|ltc1yminus1df(t) lt γab f(tlowast) gt b|f (0) = γab + zγab

)

le P

(sup

|sminust|ltc1yminus1d

|f(s)minus f(t)| gt ab|f (0) = γab + zγab

)

Consider the new field ξ(s t) = g(s)minus g(t) with parameter space T times T Note thatradic

V ar(ξ(s t)) = Dg(s t) le λ1|sminus t|β2

Via basic algebra it is not hard to show that the entropy of ξ(s t) is bounded byNξ (ε) le c3m(T times T )ε2dβ In addition from (i) of Lemma 69 we have

|micro (s)minus micro (t) | le κHγab|sminus t|β

22 ADLER BLANCHET LIU

Similarly for some κ gt 0 and all yminusβd le 1κ

(ab

)2+v a lt 1 there exists c5 such that

P(m

(Aγab

cap T)minus1

gt y m(Ab) gt 0|f (0) = γab + zγab

)

le P

(sup

|sminust|ltc1yminus1d

|f(s)minus f(t)| gt ab|f (0) = γab + zγab

)

= P

(sup

|sminust|ltc1yminus1d

|ξ(s t)| gt a

2b

)

le exp(minus a2

c5b2yminusβd

)

Combining the two cases z gt 1 and z isin (0 1) and choosing c5 large enough we have

P(m

(Aγab

cap T)minus1

gt y m(Ab) gt 0∣∣ f (0) = γab + zγab

)

le exp(minus a2

c5b2yminusβd

)

for a small enough and yminusβd le 1κ

(ab

)2+v Renaming the constants completes theproof 2

The final ingredient needed for the proof of Proposition 64 is the following lemmainvolving stochastic domination The proof follows an elementary argument and istherefore omitted

Lemma 611 Let v1 and v2 be finite measures on R and define ηj(x) =intinfin

xvj(ds)

Suppose that η1 (x) ge η2 (x) for each x ge x0 Let (h (x) x ge x0) be a non-decreasing positive and bounded function Then

int infin

x0

h (s) v1(dx) geint infin

x0

h (s) v2 (ds)

Proof of Proposition 64 Note that

EQprime(exp

(minusMm(Aγab

)m (T )

)

m(Aγab

) m(Aγab

) isin (0 α) m(Ab) gt 0)

=int

T

int infin

z=0

E(exp

(minusMm(Aγab

)m (T )

)

m (Aγab)

m(Aγab

) isin (0 α) m(Ab) gt 0∣∣∣

f (t) = γab +z

γab

)P

(τγab

isin dt)P (γab[f (t)minus γab] isin dz| f (t) gt γab)

le supzgt0tisinT

E(exp

(minusMm(Aγab

)m (T )

)

m (Aγab)

m(Aγab

) isin (0 α) m(Ab) gt 0∣∣∣

f (t) = γab +z

γab

)

MONTE CARLO FOR RANDOM FIELDS 23

Now define Y (ba) = (ba)2dβλminusdβ2 W with λ2 as chosen in Proposition 610

and W with distribution given by P (W gt x) = exp(minusxβd) By Lemma 611

suptisinT

E(exp

(minusMm(Aγab

)m (T )

)

m (Aγab)

m(Aγab

) isin (0 α) m(Ab) gt 0∣∣∣

f (t) = γab +z

γab

)

le E[Y (ba) exp(minusMY (ba)minus1m(T ))Y (ba) gt αminus1

]

Now let Z be exponentially distributed with mean 1 and independent of Y (ba)Then we have (using the definition of the tail distribution of Z and Chebyshevrsquosinequality)

exp(minusMY (ba)minus1m (T )

)= P

(Z gt MY (ba)minus1m (T )

∣∣ Y (ba))

le m (T )Y (ba)M

Therefore

E[exp(minusMY (ba)minus1m (T )

)Y (ba) Y (ba) ge αminus1]

le m (T )M

E[Y (ba)2 Y (ba) ge αminus1]

le m (T )

Mλ2dβ2

(b

a

)4dβ

E(W 2)

which completes the proof 2

65 Proof of Proposition 65 We start with the following result of Tsirelson[27]

Theorem 612 Let f be a continuous separable Gaussian process on a compact(in the canonical metric) domain T Suppose that V ar (f) = σ is continuous andthat σ (t) gt 0 for t isin T Moreover assume that micro = Ef is also continuous andmicro(t) ge 0 for all t isin T Define

σ2T

∆= maxtisinT

Var(f(t))

and set F (x) = PmaxtisinT f(t) le x Then F is continuously differentiable on RFurthermore let y be such that F (y) gt 12 and define ylowast by

F (y) = Φ(ylowast)

Then for all x gt y

F prime(x) le Ψ(

xylowasty

)(xylowasty

(1 + 2α) + 1)

(1 + α)

24 ADLER BLANCHET LIU

where

α =y2

x(xminus y)y2lowast

We can now prove the following lemma

Lemma 613 There exists a constant A isin (0infin) independent of a and b ge 0such that

(615) P

(suptisinT

f(t) le b + ab | suptisinT

f(t) gt b

)le aA

P (suptisinT f (t) ge bminus 1b)P (suptisinT f (t) ge b)

Proof By subtracting inftisinT micro (t) gt minusinfin and redefining the level b to be b minusinftisinT micro (t) we may simply assume that Ef (t) ge 0 so that we can apply Theorem612 Adopting the notation of Theorem 612 first we pick b0 large enough so thatF (b0) gt 12 and assume that b ge b0 + 1 Now let y = bminus 1b and F (y) = Φ(ylowast)Note that there exists δ0 isin (0infin) such that δ0b le ylowast le δminus1

0 b for all b ge b0 Thisfollows easily from the fact that

logP

suptisinT

f(t) gt x

sim log sup

tisinTP f(t) gt x sim minus x2

2σ2T

On the other hand by Theorem 612 F is continuously differentiable and so

(616) P

suptisinT

f(t) lt b + ab | suptisinT

f(t) gt b

=

int b+ab

bF prime(x) dx

P suptisinT f(t) gt b

Moreover

F prime(x) le(

1minus Φ(

xylowasty

)) (xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x))

le (1minus Φ(ylowast))(

xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x))

= P

(maxtisinT

f (t) gt bminus 1b

)(xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x))

Thereforeint b+ab

b

F prime(x) dx

le P

(suptisinT

f (t) gt bminus 1b

) int b+ab

b

(xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x)) dx

Recall that α (x) = y2[x(xminus y)y2lowast] we can use the fact that ylowast ge δ0b to conclude

that if x isin [b b + ab] then α (x) le δminus20 and therefore

int b+ab

b

(xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x)) dx le 4δminus80 a

MONTE CARLO FOR RANDOM FIELDS 25

We thus obtain that

(617)

int b+ab

bF prime(x) dx

P suptisinT f(t) gt b le 4aδminus80

P suptisinT f(t) gt bminus 1bP suptisinT f(t) gt b

for any b ge b0 This inequality together with the fact that F is continuously differ-entiable on (minusinfininfin) yields the proof of the lemma for b ge 0 2

The previous result translates a question that involves the conditional distri-bution of maxtisinT f (t) near b into a question involving the tail distribution ofmaxtisinT f (t) The next result then provides a bound on this tail distribution

Lemma 614 For each v gt 0 there exists a constant C(v) isin (0infin) (possiblydepending on v gt 0 but otherwise independent of b) so that such that

P

(maxtisinT

f (t) gt b

)le C(v)b2dβ+dv+1 max

tisinTP (f (t) gt b)

for all b ge 1

Proof The proof of this result follows along the same lines of Theorem 262 in[5] Consider an open cover of T = cupM

i=1Ti(θ) where Ti(θ) = s |s minus ti| lt θWe choose ti carefully such that N(θ) = O(θminusd) for θ arbitrarily small Writef (t) = g (t)+micro (t) where g (t) is a centered Gaussian random field and note usingA2 and A3 that

P

(max

tisinTi(θ)f (t) gt b

)le P

(max

tisinTi(θ)g (t) gt bminus micro (ti)minus κHθβ

)

Now we wish to apply the Borel-TIS inequality (Theorem 68) with U = Ti (θ) f0 =g d (s t) = E12([g (t)minus g (s)]2) which as a consequence of A2 and A3 is boundedabove by C0 |tminus s|β2 for some C0 isin (0infin) Thus applying Theorem 67 we havethat E maxtisinTi(θ) g (t) le C1θ

β2 log(1θ) for some C1 isin (0infin) Consequently theBorel-TIS inequality yields that there exists C2 (v) isin (0infin) such that for all bsufficiently large and θ sufficiently small we have

P

(max

tisinTi(θ)g (t) gt bminus micro (ti)minus κHθβ

)le C2 (v) exp

(minus

(bminus micro (ti)minus C1θ

β(2+βv))2

2σ2Ti

)

where σTi = maxtisinTi(θ) σ (t) Now select v gt 0 small enough and set θβ(2+βv) =bminus1 Straightforward calculations yield that

P

(max

tisinTi(θ)f (t) gt b

)le P

(max

tisinTi(θ)g (t) gt bminus micro (ti)minus κHθβ

)

le C3 (v) maxtisinTi(θ)

exp

(minus (bminus micro (t))2

2σ (t)2

)

26 ADLER BLANCHET LIU

for some C3(v) isin (0infin) Now recall the well known inequality (valid for x gt 0)that

φ (x)(

1xminus 1

x3

)le 1minus Φ(x) le φ (x)

x

where φ = Φprime is the standard Gaussian density Using this inequality it follows thatC4 (v) isin (0infin) can be chosen so that

maxtisinTi(θ)

exp

(minus (bminus micro (t))2

2σ (t)2

)le C4 (v) bmax

tisinTi

P (f (t) gt b)

for all b ge 1 We then conclude that there exists C (v) isin (0infin) such that

P

(maxtisinT

f (t) gt b

)le N (θ)C4 (v) bmax

tisinTP (f (t) gt b)

le CθminusdbmaxtisinT

P (f (t) gt b) = Cb2dβ+dv+1 maxtisinT

P (f (t) gt b)

giving the result 2

We can now complete the proof of Proposition 65

Proof of Proposition 65 The result is a straightforward corollary of the previoustwo lemmas By (615) in Lemma 613 and Lemma 614 there exists λ isin (0infin) forwhich

P

(maxtisinT

f(t) le b + ab | maxtisinT

f(t) gt b

)

le aAP (maxtisinT f (t) ge bminus 1b)

P (maxtisinT f (t) ge b)

le aCAb2dβ+dv+1 maxtisinT P (f (t) gt bminus 1b)P (maxtisinT f (t) ge b)

le aCAb2dβ+dv+1 maxtisinT P (f (t) gt bminus 1b)maxtisinT P (f (t) gt b)

le aλb2dβ+dv+1

The last two inequalities follow from the obvious bound

P

(maxtisinT

f (t) ge b

)ge max

tisinTP (f (t) gt b)

and standard properties of the Gaussian distribution This yields (612) from whichthe remainder of the proposition follows 2

MONTE CARLO FOR RANDOM FIELDS 27

7 Fine Tuning Twice Differentiable Homogeneous Fields In the pre-ceeding section we constructed a polynomial time algorithm based on a randomizeddiscretization scheme Our goal in this section is to illustrate how to take advantageof additional information to further improve the running time and the efficiency ofthe algorithm In order to illustrate our techniques we shall perform a more refinedanalysis in the setting of smooth and homogeneous fields and shall establish op-timality of the algorithm in a precise sense to described below Our assumptionsthroughout this section are B1ndashB2 of Section 3

Let C(sminus t) = Cov(f(s) f(t)) be the covariance function of f which we assumealso has mean zero Note that it is an immediate consequence of homogeneity anddifferentiability that partiC(0) = part3

ijkC(0) = 0We shall need the following definition

Definition 71 We call T = t1 tM sub T a θ-regular discretization of Tif and only if

mini 6=j

|ti minus tj | ge θ suptisinT

mini|ti minus t| le 2θ

Regularity ensures that points in the grid T are well separated Intuitively sincef is smooth having tight clusters of points translates to a waste of computingresources as a result of sampling highly correlated values of f Also note thatevery region containing a ball of radius 2θ has at least one representative in T Therefore T covers the domain T in an economical way One technical convenienceof θ-regularity is that for subsets A sube T that have positive Lebesgue measure (inparticular ellipsoids)

limMrarrinfin

(A cap T )M

=m(A)m(T )

where here and throughout the remainder of the section (A) denotes the cardi-nality of the set A

Let T = t1 tM be a θ-regular discretization of T and consider

X = (X1 XM )T ∆= (f (t1) f (tM ))T

We shall concentrate on estimating wM (b) = P (max1leileM Xi gt b) The next result(which we prove in Section 71) shows that if θ = εb then the relative bias is O (

radicε)

Proposition 72 Suppose f is a Gaussian random field satisfying conditionsB1 and B2 There exist c0 c1 b0 and ε0 such that for any finite εb-regulardiscretization T of T

(71) P (suptisinT

f(t) lt b| suptisinT

f(t) gt b) le c0

radicε and (T ) le c1m(T )εminusdbd

for all ε isin (0 ε0] and b gt b0

Note that the bound on the bias obtained for twice differentiable fields is muchsharper than that of the general Holder continuous fields given by (613) in Theorem

28 ADLER BLANCHET LIU

66 This is partly because the conditional distribution of the random field aroundlocal maxima is harder to describe in the Holder continuous than in the case oftwice differentiable fields In addition to the sharper description of the bias weshall also soon show in Theorem 74 that our choice of discretization is optimal in acetain sense Finally we point out that the bound of

radicε in the first term of (71) is

not optimal In fact there seems to be some room of improvement and we believethat a more careful analysis might yield a bound of the form c0ε

2We shall estimate wM (b) by using a slight variation of Algorithm 53 In partic-

ular since the Xirsquos are now identically distributed we redefine Q to be

(72) Q (X isin B) =Msum

i=1

1M

P [X isin B|Xi gt bminus 1b]

Our estimator then takes the form

(73) Lb =M times P (X1 gt bminus 1b)sumM

j=1 1 (Xj gt bminus 1b)1( max

1leileMXi gt b)

Clearly we have that EQ(Lb) = wM (b) (The reason for for subtracting the fcatorof 1b was explained in Section 61)

Algorithm 73 Given a number of replications n and an εb-regular dis-cretization T the algorithm is as follows

Step 1 Sample X(1) X(n) iid copies of X with distribution Q given by (72)Step 2 Compute and output

Ln =1n

nsum

i=1

L(i)b

where

L(i)b =

M times P (X1 gt bminus 1b)sumM

j=1 1(X(i)j gt bminus 1b)

1( max1leileM

X(i)j gt b)

Theorem 75 later guides the selection of n in order to achieve a prescribedrelative error In particular our analysis together with considerations from Section2 implies that choosing n = O

(εminus2δminus1

)suffices to achieve ε relative error with

probability at least 1minus δAlgorithm 73 improves on Algorithm 61 for Holder continuous fields in two

important ways The first aspect is that it is possible to obtain information on thesize of the relative bias of the estimator In Proposition 72 we saw that in orderto overcome bias due to discretization it suffices to take a discretization of sizeM = (T ) = Θ(bd) That this selection is also asymptotically optimal in the sensedescribed in the next result will be proven in Section 71

MONTE CARLO FOR RANDOM FIELDS 29

Theorem 74 Suppose f is a Gaussian random field satisfying conditions B1and B2 If θ isin (0 1) then as b rarrinfin

sup(T )lebθd

P (suptisinT

f(t) gt b| suptisinT

f(t) gt b) rarr 0

This result implies that the relative bias goes to 100 as b rarrinfin if one chooses adiscretization scheme of size O

(bθd

)with θ isin (0 1) Consequently d is the smallest

power of b that achieves any given bounded relative bias and so the suggestionabove of choosing M = O

(bd

)points for the discretization is in this sense optimal

The second aspect of improvement involves the variance In the case of Holdercontinuous fields the ratio of the second moment of the estimator and w (b)2 wasshown to be bounded by a quantity that is of order O(M2) In contrast in thecontext of smooth and homogeneous fields considered here the next result showsthat this ratio is bounded uniformly for b gt b0 and M = (T ) ge cbd That is thevariance remains strongly controlled

Theorem 75 Suppose f is a Gaussian random field satisfying conditions B1and B2 Then there exist constants c b0 and ε0 such that for any εb-regulardiscretization T of T we have

supbgtb0εisin[0ε0]

EQL2b

P 2 (suptisinT f (t) gt b)le sup

bgtb0εisin[0ε0]

EQL2b

P 2(sup

tisinTf (t) gt b

) le c

for some c isin (0infin)

The proof of this result is given in Section 72 The fact that the number ofreplications remains bounded in b is a consequence of the strong control on thevariance

Finally we note that the proof of Theorem 33 follows as a direct corollary of The-orem 75 together with Proposition 72 and our discussion in Section 2 Assumingthat placing each point in T takes no more than c units of computer time the totalcomplexity is according to the discussion in section Section 2 O

(nM3 + M

)=

O(εminus2δminus1M3 + M

) The contribution of the term M3 = O

(εminus6b3

)comes from

the complexity of applying Cholesky factorization and the term M = O(εminus2b)corresponds to the complexity of placing T

Remark 76 Condition B2 imposes a convexity assumption on the boundaryof T This assumption although convenient in the development of the proofs of The-orems 74 and 75 is not necessary The results can be generalized at the expenseof increasing the length and the burden in the technical development to the casein which T is a d-dimensional manifold satisfying the so-called Whitney conditions([4])

The remainder of this section is devoted to the proof of Proposition 72 Theorem74 and Theorem 75

30 ADLER BLANCHET LIU

71 Bias Control Proofs of Proposition 72 and Theorem 74 We start withsome useful lemmas for all of which we assume that Conditions B1 and B2 are sat-isfied We shall also assume that the global maximum of f over T is achieved withprobability one at a single point in T Additional conditions under which this willhappen can be found in [4] and require little more than the non-degeneracy of thejoint distribution of f and its first and second order derivatives Of these lemmasLemma 77 the proof of which we defer to Section 73 is of central importance

Lemma 77 For any a0 gt 0 there exists clowast δlowast b0 and δ0 (which depend onthe choice of a0) such that for any s isin T a isin (0 a0) δ isin (0 δ0) and b gt b0

P ( min|tminuss|ltδabminus1

f(t) lt b|f(s) = b + ab s isin L) le clowast exp(minusδlowast

δ2

)

where L is the set of local maxima of f

The next lemma is a direct corollary of Lemma 77

Lemma 78 Let tlowast be the point in T at which the global maximum of f isattained Then with the same choice of constants as in Lemma 77 for any a isin(0 a0) δ isin (0 δ0) and b gt b0 we have

P ( min|tminustlowast|ltδabminus1

f(t) lt b|f(tlowast) = b + ab) le 2clowast exp(minusδlowast

δ2

)

Proof Note that for any s isin T

P

(min

|t|leδabf(t) lt b|f(s) = b + ab s = tlowast

)

=P

(min|t|leδab f(t) lt b f(s) = b + ab s = tlowast

)

P (f(s) = b + ab s = tlowast)

le P(min|t|leδab f(t) lt b f(s) = b + ab s isin L)

P (f(s) = b + ab s = tlowast)

= P ( min|tminuss|ltδabminus1

f(t) lt b|f(s) = b + ab s isin L)P (f(s) = b + ab s isin L)P (f(s) = b + ab s = tlowast)

le clowast exp(minusδlowast

δ2

)P (f(s) = b + ab s isin L)P (f(s) = b + ab s = tlowast)

Appling this bound and the fact that

P (f(s) = b + ab s isin L)P (f(s) = b + ab s = tlowast)

rarr 1

as b rarrinfin the conclusion of the Lemma holds 2

MONTE CARLO FOR RANDOM FIELDS 31

The following lemma gives a bound on the density of suptisinT f(t) which will beused to control the size of overshoot beyond level b

Lemma 79 Let pflowast(x) be the density function of suptisinT f(t) Then there existsa constant cflowast and b0 such that

pflowast(x) le cflowastxd+1P (f(0) gt x)

for all x gt b0

Proof Recalling (13) let the continuous function pE(x) x isin R be defined bythe relationship

E(χ(t isin T f(t) ge b)) =int infin

b

pE(x) dx

where the left hand side is the expected value of the Euler-Poincare characteristicof Ab Then according to Theorem 810 in [7] there exists c and δ such that

|pE(x)minus pflowast(x)| lt cP (f(0) gt (1 + δ)x)

for all x gt 0 In addition thanks to the result of [4] which providesintinfin

bpE(x)dx in

closed form there exists c0 such that for all x gt 1

pE(x) lt c0xd+1P (f(0) gt x)

Hence there exists cflowast such that

pflowast(x) le c0xd+1P (f(0) gt x) + cP (f(0) gt (1 + δ)x) le cflowastx

d+1P (f(0) gt x)

for all x gt 1 2

The last ingredients required to provide the proof of Proposition 72 and Theorem74 are stated in the following well known Theorem adapted from Lemma 61 andTheorem 72 in [19] to the twice differentiable case

Theorem 710 There exists a constant H (depending on the covariance func-tion C such that

P (suptisinT

f(t) gt b) = (1 + o (1))Hm(T )bdP (f(0) gt b)(74)

as b rarrinfinSimilarly choose δ small enough so that [0 δ]d sub T and let ∆0 = [0 bminus1]d Then

there exists a constant H1 such that

P ( suptisin∆0

f(t) gt b) = (1 + o(1))H1P (f(0) gt b)(75)

as b rarrinfin

32 ADLER BLANCHET LIU

We now are ready to provide the proof of Proposition 72 and Theorem 74

Proof of Proposition 72 The fact that there exists c1 such that

(T ) le c1m(T )εminusdbd

is immediate from assumption B2 Therefore we proceed to provide a bound forthe relative bias We choose ε0 lt δ2

0 (δ0 is given in Lemmas 77 and 78) and notethat for any ε lt ε0

P (suptisinT

f(t) lt b| suptisinT

f(t) gt b)

le P (suptisinT

f(t) lt b + 2radic

εb| suptisinT

f(t) gt b) + P (suptisinT

f(t) lt b| suptisinT

f(t) gt b + 2radic

εb)

By (74) and Lemma 79 there exists c2 such that the first term above can boundedby

P (suptisinT

f(t) lt b + 2radic

εb| suptisinT

f(t) gt b) le c2

radicε

The second term can be bounded by

P (suptisinT

f(t) lt b| suptisinT

f(t) gt b + 2radic

εb)

le P ( sup|tminustlowast|lt2εbminus1

f(t) lt b| suptisinT

f(t) gt b + 2radic

ε)

le 2clowast exp(minusδlowastεminus1

)

Hence there exists c0 such that

P (suptisinT

f(t) lt b| suptisinT

f(t) gt b) le c2

radicε + 2clowasteminusδlowastε le c0

radicε

for all ε isin (0 ε0) 2

Proof of Theorem 74 We write θ = 1minus 3δ isin (0 1) First note that by (74)

P (supT

f(t) gt b + b2δminus1| supT

f(t) gt b) rarr 0

as b rarrinfin Let tlowast be the position of the global maximum of f in T According to theexact Slepian model in Section 73 and an argument similar to the proof of Lemma78

(76) P ( sup|tminustlowast|gtb2δminus1

f(t) gt b|b lt f(tlowast) le b + b2δminus1) rarr 0

as b rarrinfin Consequently

P ( sup|tminustlowast|gtb2δminus1

f(t) gt b| supT

f(t) gt b) rarr 0

MONTE CARLO FOR RANDOM FIELDS 33

LetB(T b2δminus1) = cup

tisinTB(t b2δminus1)

We have

P (supT

f(t) gt b| supT

f(t) gt b)

le P ( sup|tminustlowast|gtb2δminus1

f(t) gt b| supT

f(t) gt b)

+ P (supT

f(t) gt b sup|tminustlowast|gtb2δminus1

f(t) le b| supT

f(t) gt b)

le o(1) + P (tlowast isin B(T b2δminus1)| supT

f(t) gt b)

le o(1) + P ( supB(T b2δminus1)

f(t) gt b| supT

f(t) gt b)

Since (T ) le b(1minus3δ)d we can find a finite set T prime = tprime1 tprimel sub T and let ∆k =tprimek + [0 bminus1] such that l = O(b(1minusδ)d) and B(T b2δminus1) sub cupl

k=1∆k The choice ofl only depends on (T ) not the particular distribution of T Therefore applying(75)

sup(Tlebθd)

P ( supB(T b2δminus1)

f(t) gt b) le O(b(1minusδ)d)P (f(0) gt b)

This together with (74) yields

sup(Tlebθd)

P ( supB(T b2δminus1)

f(t) gt b| supT

f(t) gt b) le O(bminusδd) = o(1)

for b ge b0 which clearly implies the statement of the result 2

72 Variance Control Proof of Theorem 75 We proceed directly to the proofof Theorem 75

34 ADLER BLANCHET LIU

Proof of Theorem 75 Note that

EQL2b

P 2 (suptisinT f (t) gt b)

=E(Lb)

P 2 (suptisinT f (t) gt b)

=E

(MtimesP (X1gtbminus1b)sumn

j=11(Xjgtbminus 1

b )maxj Xj gt b

)

P 2 (suptisinT f (t) gt b)

=E

(MtimesP (X1gtbminus1b)sumn

j=11(Xjgtbminus 1

b )maxj Xj gt b suptisinT f (t) gt b

)

P 2 (suptisinT f (t) gt b)

=E

(MP (X1gtbminus1b)1(maxj Xjgtb)sumn

j=11(Xjgtbminus1b)

∣∣∣∣ suptisinT f (t) gt b

)

P (suptisinT f (t) gt b)

= E

(M1 (maxj Xj gt b)sumnj=1 1 (xj gt bminus 1b)

∣∣∣∣∣ suptisinT

f (t) gt b

)P (X1 gt bminus 1b)

P (suptisinT f (t) gt b)

The remainder of the proof involves showing that the last conditional expectationhere is of order O(bd) This together with (74) will yield the result Note that forany A (b ε) such that A (b ε) = Θ(M) uniformly over b and ε we can write

E

(M times 1 (maxj Xj gt b)sumM

j=1 1(Xj gt bminus 1

b

)∣∣∣∣∣ sup

tisinTf (t) gt b

)(77)

le E

M times 1

(sumMj=1 1 (Xj gt bminus 1b) ge M

A(bε)

)

sumMj=1 1

(xj gt bminus 1

b

)∣∣∣∣∣∣suptisinT

f (t) gt b

+ E

M times 1

(1 le sumM

j=1 1 (Xj gt bminus 1b) lt MA(bε)

)

sumMj=1 1

(xj gt bminus 1

b

)∣∣∣∣∣∣suptisinT

f (t) gt b

We shall select A(b ε) appropriately in order to bound the expectations above ByLemma 78 for any 4ε le δ le δ0 there exist constants cprime and cprimeprime isin (0infin) such that

clowast exp(minusδlowast

δ2

)ge P

(min

|tminustlowast|leδbf (t) lt bminus 1b

∣∣∣∣ suptisinT

f (t) gt b

)

ge P

Msum

j=1

1 (Xj gt bminus 1b) le cprimeδdεd

∣∣∣∣∣∣suptisinT

f (t) gt b

ge P

(MsumM

j=1 1(Xj gt bminus 1b)ge bdcprimeprime

δd

∣∣∣∣∣ suptisinT

f (t) gt b

)(78)

MONTE CARLO FOR RANDOM FIELDS 35

The first inequality is an application of Lemma 78 The second inequality is dueto the fact that for any ball B of radius 4ε or larger (T cap B) ge cprimedεminusd for somecprime gt 0 Inequality (78) implies that for all x such that bdcprimeprimeδd

0 lt x lt bdcprimeprime[4dεd]there exists δlowastlowast gt 0 such that

P

(MsumM

j=1 1(Xj gt bminus 1

b

)bdge x

∣∣∣∣∣ suptisinT

f (t) gt b

)le clowast exp

(minusδlowastlowastx2d

)

Now let A (b ε) = bdcprimeprime(4dεd) and observe that by the second result in (71) wehave A (b ε) = Θ (M) and moreover that there exists c3 such that the first termon the right hand side of (77) is bounded by

(79) E

M times 1

(sumMj=1 1 (Xj gt bminus 1b) ge M

A(bε)

)

sumMj=1 1 (Xj gt bminus 1b)

∣∣∣∣∣∣supT

f (t) gt b

le c3b

d

Now we turn to the second term on the right hand side of (77) We use the factthat MA(b ε) le cprimeprimeprime isin (0infin) (uniformly as b rarrinfin and ε rarr 0) There exist c4 andc5 such that for ε le δ0c4

E

M1

(1 le sumM

j=1 1(Xj gt bminus 1

b

)lt M

A(bε)

)

sumMj=1 1

(Xj gt bminus 1

b

)∣∣∣∣∣∣supT

f (t) gt b

(710)

le MP

nsum

j=1

1(

Xj gt bminus 1b

)lt cprimeprimeprime

∣∣∣∣∣∣supT

f (t) gt b

le MP

(min

|tminustlowast|lec4εbf (t) lt bminus 1b

∣∣∣∣ supT

f (t) gt b

)

le c1m(T )bdεminusd exp(minus δlowast

c24ε

2

)le c5b

d

The second inequality holds from the fact that ifsumM

j=1 1(Xj gt bminus 1

b

)is less than

cprimeprimeprime then the minimum of f in a ball around the local maximum and of radius c4εb

must be less than bminus 1b Otherwise there are more than cprimeprimeprime elements of T insidesuch a ball The last inequality is due to Lemma 78 and Theorem 72

Putting (79) and (710) together we obtain for all εb -regular discretizationswith ε lt ε0 = min(14 1c4)δ0

EQL2b

P (suptisinT f (t) gt b)le E

(M1 (maxj Xj gt b)sumM

j=1 1(Xj gt bminus 1

b

)∣∣∣∣∣ sup

tisinTf (t) gt b

)P (X1 gt bminus 1b)

P (suptisinT f (t) gt b)

le (c3 + c5)bdP (X1 gt bminus 1b)P (suptisinT f (t) gt b)

Applying now (74) and Proposition 72 we have that

P (suptisinT

f(t) gt b) ltP (sup

tisinTf(t) gt b)

1minus c0radic

ε

36 ADLER BLANCHET LIU

we have

supbgtb0εisin[0ε0]

EQL2b

P(sup

tisinTf (t) gt b

) lt infin

as required 2

73 A final proof Lemma 77 Without loss of generality we assume that thefield has mean zero and unit variance Before getting into the details of the proofof Lemma 77 we need a few additional lemmas for which we adopt the followingnotation Let Ci and Cij be the first and second order derivatives of C and definethe vectors

micro1(t) = (minusC1(t) minusCd(t))micro2(t) = vech ((Cij(t) i = 1 d j = i d))

Let f prime(0) and f primeprime(0) be the gradient and vector of second order derivatives of f at 0where f primeprime(0) is arranged in the same order as micro2(0) Furthermore let micro02 = microgt20 bea vector of second order spectral moments and micro22 a matrix of fourth order spectralmoments The vectors micro02 and micro22 are arranged so that

1 0 micro02

0 Λ 0micro20 0 micro22

is the covariance matrix of (f(0) f prime(0) f primeprime(0)) where Λ = (minusCij(0)) It then followsthat

micro2middot0 = micro22 minus micro20micro02

be the conditional variance of f primeprime(0) given f(0) The following lemma given in [5]provides a stochastic representation of the f given that it has a local maxima atlevel u at the origin

Lemma 711 Given that f has a local maximum with height u at zero (aninterior point of T ) the conditional field is equal in distribution to

(711) fu(t) uC (t)minusWuβgt (t) + g (t)

g(t) is a centered Gaussian random field with covariance function

γ(s t) = C(sminus t)minus (C(s) micro2(s))(

1 micro02

micro20 micro22

)minus1 (C(t)microgt2 (t)

)minus micro1(s)Λminus1microgt1 (t)

and Wu is a d(d+1)2 random vector independent of g(t) with density function

(712) ψu (w) prop |det (rlowast(w)minus uΛ) | exp(minus1

2wgtmicrominus1

2middot0w)1 (rlowast(w)minus uΛ isin N )

MONTE CARLO FOR RANDOM FIELDS 37

where rlowast(w) is a dtimesd symmetric matrix whose upper triangular elements consist ofthe components of w The set of negative definite matrices is denoted by N Finallyβ(t) is defined by

(α (t) β (t)) = (C (t) micro2 (t))(

1 micro02

micro20 micro22

)minus1

The following two technical lemmas which we shall prove after completing theproof of Lemma 77 provide bounds for the last two terms of (711)

Lemma 712 Using the notation in (711) there exist δ0 ε1 c1 and b0 suchthat for any u gt b gt b0 and δ isin (0 δ0)

P

(sup

|t|leδab

∣∣Wuβgt (t)∣∣ gt

a

4b

)le c1 exp

(minusε1b

2

δ4

)

Lemma 713 There exist c δ and δ0 such that for any δ isin (0 δ0)

P

(max|t|leδab

|g(t)| gt a

4b

)le c exp

(minus δ

δ2

)

Proof of Lemma 77 Using the notation of Lemma 711 given any s isin L forwhich f(s) = b we have that the corresponding conditional distribution of s is thatof fb(middot minus s) Consequently it suffices to show that

P

(min

|t|leδabfb(t) lt bminus ab

)le clowast exp

(minusδlowast

δ2

)

We consider firstly the case for which the local maximum is in the interior of T Then by the Slepian model (711)

fb (t) = bC (t)minusWbβgt (t) + g (t)

We study the three terms of the Slepian model individually Since

C(t) = 1minus tgtΛt + o(|t|2

)

there exists a ε0 such thatbC (t) ge bminus a

4b

38 ADLER BLANCHET LIU

for all |t| lt ε0radic

ab According to Lemmas 712 and 713 for δ lt min(ε0radic

a0 δ0)

P

(min

|t|leδabfu(t) lt bminus ab

)

le P

(max|t|ltδab

|g (t)| gt a

4b

)+ P

(sup

|t|leδab

∣∣Wbβgt(t)

∣∣ gta

4b

)

le c exp

δ2

)+ c1 exp

(minusε1b

2

δ4

)

le clowast exp(minusδlowast

δ2

)

for some clowast and δlowastNow consider the case for which the local maximum is in the (dminus1)-dimensional

boundary of T Due to convexity of T and without loss of generality we assumethat the tangent space of partT is generated by partpartt2 partparttd the local maximumis located at the origin and T is a subset of the positive half-plane t1 ge 0 Thatthese arguments to not involve a loss of generality follows from the arguments onpages 291ndash192 of [4] which rely on the assumed stationarity of f (for translations)and the fact that rotations while changing the distributions will not affect theprobabilities that we are currently computing)

For the origin positioned as just described to be a local maximum it is necessaryand sufficient that the gradient of f restricted to partT is the zero vector the Hessianmatrix restricted to partT is negative definite and part1f(0) le 0 Applying a version ofLemma 711 for this case conditional on f(0) = u and 0 being a local maximumthe field is equal in distribution to

(713) uC(t)minus Wuβgt(t) + micro1(t)Λminus1 (Z 0 0)T + g(t)

where Z le 0 corresponds to part1f (0) and it follows a truncated (conditional on thenegative axis) Gaussian random variable with mean zero and a variance parameterwhich is computed as the conditional variance of part1f (0) given (part2f (0) partdf (0))The vector Wu is a (d(d + 1)2)-dimensional random vector with density function

ψu(w) prop |det(rlowast(w)minus uΛ)| exp(minus12wgtmicrominus1

2middot0w)1(wlowast minus uΛ isin N )

where Λ is the second spectral moment of f restricted to partT and rlowast(w) is the(dminus 1)times (dminus 1) symmetric matrix whose upper triangular elements consist of thecomponents of w In the representation (713) the vector Wu and Z are independentAs in the proof of Lemma 712 one can show that a similar bound holds albeitwith with different constants Thus since micro1 (t) = O (t) there exist cprimeprime and δprimeprime suchthat the third term in (713) can be bounded by

P [ max|t|leδab

∣∣∣micro1 (t) Λminus1 (Z 0 0)T∣∣∣ ge a(4b)] le cprimeprime exp

(minusδprimeprime

δ2

)

MONTE CARLO FOR RANDOM FIELDS 39

Consequently we can also find clowast and δlowast such that the conclusion holds and weare done 2

We complete the paper with the proofs of Lemmas 712 and 713

Proof of Lemma 712 It suffices to prove the Lemma for the case a = 1 Since

fu (0) = u = uminusWuβgt (0) + g (0)

and Wu and g are independent β (0) = 0 Furthermore since C prime (t) = O (t) andmicroprime2 (t) = O (t) there exists a c0 such that |β(t)| le c0|t|2 In addition Wu has densityfunction proportional to

ψu(w) prop |det(rlowast(w)minus uΛ)| exp(minus12wgtmicrominus1

2middot0w)1 (wlowast minus uΛ isin N )

Note that det(rlowast(w) minus uΛ) is expressible as a polynomial in w and u and thereexists some ε0 and c such that

∣∣∣∣det (rlowast(w)minus uΛ)

det (minusuΛ)

∣∣∣∣ le c

if |w| le ε0u Hence there exist ε2 c2 gt 0 such that

ψu (w) le ψ (w) = c2 exp(minus1

2ε2w

gtmicrominus12middot0w

)

for all u ge 1 The right hand side here is proportional to a multivariate Gaussiandensity Thus

P (|Wu| gt x) =int

|w|gtx

ψu(w)dw leint

|w|gtx

ψu (w) dw = c3P (|W | gt x)

where W is a multivariate Gaussian random variable with density function propor-tional to ψ Therefore by choosing ε1 and c1 appropriately we have

P

(sup|t|leδb

∣∣Wuβgt∣∣ gt

14b

)le P

(|Wu| gt b

c20δ

2

)le c1 exp

(minusε1b

2

δ4

)

for all u ge b 2

Proof of Lemma 713 Once again it suffices to prove the Lemma for the casea = 1 Since

fb(0) = b = bminusWbβgt(0) + g(0)

the covariance function (γ (s t) s t isin T ) of the centered field g satisfies γ(0 0) = 0It is also easy to check that

partsγ(s t) = O(|s|+ |t|) parttγ(s t) = O(|s|+ |t|)

40 ADLER BLANCHET LIU

Consequently there exists a constant cγ isin (0infin) for which

γ(s t) le cγ(|s|2 + |t|2) γ(s s) le cγ |s|2

We need to control the tail probability of sup|t|leδb |g(t)| For this it is useful tointroduce the following scaling Define

gδ (t) =b

δg

(δt

b

)

Then sup|t|leδb g (t) ge 14b if and only if sup|t|le1 gδ (t) ge 1

4δ Let

σδ (s t) = E (gδ (s) gδ (t))

ThensupsisinR

σδ(s s) le cγ

Because γ(s t) is at least twice differentiable applying a Taylor expansion we easilysee that the canonical metric dg corresponding to gδ (s) (cf Theorem 67) can bebounde as follows

d2g(s t) = E (gδ (s)minus gδ (t))2 =

b2

δ2

(δs

bδs

b

)+ γ

(δt

bδt

b

)minus 2γ

(δs

bδt

b

)]

le c |sminus t|2

for some constant c isin (0infin) Therefore the entropy of gδ evaluated at δ is boundedby Kδminusd for any δ gt 0 and with an appropriate choice of K gt 0 Therefore for allδ lt δ0

P ( sup|t|leδb

|g(t)| ge 14b

) = P ( sup|t|le1

gδ (t) ge 14δ

) le cdδminusdminusη exp(minus 1

16cγδ2)

for some constant cd and η gt 0 The last inequality is a direct application of The-orem 411 of [4] The conclusion of the lemma follows immediately by choosing c

and δ appropriately 2

REFERENCES

[1] RJ Adler K Bartz and S Kou Estimating curvatures in the Gaussian kinematic formulaIn preparation

[2] RJ Adler J Blanchet and J Liu Efficient simulation for tail probabilities of Gaussianrandom fields In WSC rsquo08 Proceedings of the 40th Conference on Winter Simulationpages 328ndash336 Winter Simulation Conference 2008

[3] RJ Adler P Muller and Rozovskii B editors Stochastic Modelling in Physical Oceanog-raphy Birkhauser Boston 1996

[4] RJ Adler and JE Taylor Random Fields and Geometry Springer New York USA 2007

[5] RJ Adler JE Taylor and K Worsley Applications of Random Fields and GeometryFoundations and Case Studies httpwebeetechnionacilpeopleadlerhrfpdf

MONTE CARLO FOR RANDOM FIELDS 41

[6] S Asmussen and P Glynn Stochastic Simulation Algorithms and Analysis Springer-VerlagNew York NY USA 2008

[7] JM Azais and M Wschebor Level Sets and Extrema of Random Processes and FieldsWiley New York NY 2009

[8] JM Bardeen JR Bond N Kaiser and AS Szalay The statistics of peaks of Gaussianrandom fields The Astrophysical Journal 30415ndash61 1986

[9] SM Berman Sojourns and Extremes of Stochastic Processes The Wadsworth ampBrooksCole StatisticsProbability Series Wadsworth amp BrooksCole Advanced Books ampSoftware Pacific Grove CA 1992

[10] J Blanchet Efficient importance sampling for binary contingency tables Ann of ApplProb 19949ndash982 2009

[11] C Borell The Brunn-Minkowski inequality in Gauss space Invent Math 30205ndash216 1975

[12] J Bucklew Introduction to Rare-Event Simulation Springer-Verlag New York NY 2004

[13] MR Dennis Nodal densities of planar Gaussian random waves Eur Phys J 145191ndash2102007

[14] RM Dudley Sample functions of the Gaussian process Ann Probab 166ndash103 1973

[15] KJ Friston KJ Worsley RSJ Frackowiak JC Mazziotta and AC Evans Assessing thesignificance of focal activations using their spatial extent Human Brain Mapping 1214ndash2201994

[16] MR Leadbetter G Lindgren and H Rootzen Extremes and Related Properties of RandomSequences and Processes Springer-Verlag New York 1983

[17] M Mitzenmacher and E Upfal Probability and Computing Randomized Algorithms andProbabilistic Analysis Cambridge University Press 2005

[18] W Niemiro and P Pokarowski Fixed precision MCMC estimation by median of productsof averages J Appl Probab 46309ndash329 2009

[19] V Piterbarg Asymptotic Methods in the Theory of Gaussian Processes American Mathe-matical Society USA 1995

[20] SF Shandarin Testing non-Gaussianity in cosmic microwave background maps by morpho-logical statistics Mon Not R Astr Soc 331865ndash874 2002

[21] SF Shandarin HA Feldman Y Xu and M Tegmark Morphological measures of non-gaussianity in cosmic microwave background maps Astrophys J Suppl 1411ndash11 2002

[22] L Shepp and H Landau On the supremum of a Gaussian process Sankhya A 32369ndash378December 1970

[23] JE Taylor A Takemura and RJ Adler Validity of the expected Euler characteristicheuristic Ann Probab 33(4)1362ndash1396 2005

[24] JE Taylor and KJ Worsley Detecting sparse signals in random fields with an applicationto brain mapping J Amer Statist Assoc 102(479)913ndash928 2007

[25] J Traub G Wasilokowski and H Wozniakowski Information-Based Complexity AcademicPress New York NY 1988

[26] BS Tsirelson IA Ibragimov and VN Sudakov Norms of Gaussian sample functionsProceedings of the Third Japan-USSR Symposium on Probability Theory (Tashkent 1975)55020ndash41 1976

[27] V S Tsirelrsquoson The density of the maximum of a gaussian process Theory of Probab Appl20847ndash856 1975

[28] H Wozniakowski Computational complexity of continuous problems 1996 Technical Re-port

Robert J AdlerElectrical EngineeringTechnion Haifa Israel 32000E-mail roberteetechnionacilURL webeetechnionacilpeopleadler

Jose H BlanchetIndustrial Engineering amp Operations ResearchColumbia Uninversity340 SW Mudd Building500 W 120 StreetNew York New York 10027E-mail joseblanchetcolumbiaeduURL wwwieorcolumbiaedufac-biosblanchetfacultyhtml

42 ADLER BLANCHET LIU

Jingchen LiuDepartment of StatisticsColumbia University1255 Amsterdam Avenue Room 1030New York NY 10027E-mail jcliustatcolumbiaeduURL statcolumbiaedu~jcliu

Page 5: EFFICIENT MONTE CARLO FOR HIGH EXCURSIONS OF … › ~jblanche › papers › GRF1.pdf · 1. Introduction. This paper centers on the design and analysis of e–cient Monte Carlo techniques

MONTE CARLO FOR RANDOM FIELDS 5

has a relative mean squared error equal to [V ar(Lb)]12[n12P (Bb)] A simpleconsequence of Chebyshevrsquos inequality is that

P(∣∣∣Ln (b) P (Bb)minus 1

∣∣∣ ge ε)le V ar (Lb)

ε2nP [(Bb)]2

Thus if Lb is polynomially efficient and we wish to compute P (Bb) with at mostε relative error and at least 1minus δ confidence it suffices to simulate

n = Θ(εminus2δminus1 |log P (Bb)|q)

iid replications of Lb In fact in the presence of polynomial efficiency the boundn = Θ

(εminus2δminus1 |log P (Bb)|q

)can be boosted to n = Θ

(εminus2 log(δminus1) |log P (Bb)

∣∣q)using the so-called median trick (see [18])

Naturally the cost per replication must also be considered in the analysis andwe shall do so but the idea is that evaluating P (Bb) via crude Monte Carlo wouldrequire given ε and δ n = Θ (1P (Bb)) replications Thus a polynomially efficientlyestimator makes the evaluation of P (Bb) exponentially faster relative to crudeMonte Carlo at least in terms of the number of replications

Note that a direct application of deterministic algorithms (such as quasi MonteCarlo or quadrature integration rules) might improve (under appropriate smooth-ness assumptions) the computational complexity relative to Monte Carlo althoughonly by a polynomial rate (ie the absolute error decreases to zero at rate nminusp forp gt 12 where n is the number of function evaluations and p depends on the di-mension of the function that one is integrating see for instance [6]) We believe thatthe procedures that we develop in this paper can guide the construction of efficientdeterministic algorithms with small relative error and with complexity that scalesat a polynomial rate in |log P (Bb)| This is an interesting research topic that weplan to explore in the future

An issue that we shall face in designing our Monte Carlo procedure is that dueto the fact that f will have to be discretized the corresponding estimator Lb willnot be unbiased In turn in order to control the relative bias with an effort that iscomparable to the bound on the number of replications discussed in the preceedingparagraph one must verify that the relative bias can be reduced to an amountless than ε with probability at least 1 minus δ at a computational cost of the formO(εminusq0 |log P (Bb)|q1) If Lb(ε) can be generated with O(εminusq0 |log P (Bb)|q1) costand satisfying

∣∣∣P (Bb)minus ELb(ε)∣∣∣ le εP (Bb) and if

supbgt0

V ar(Lb(ε))P (Bb)

2 |log P (Bb)|qlt infin

for some q isin (0infin) then Lprimen (b ε) =sumn

j=1 L(j)b (ε)n where the L

(j)b (ε)rsquos are iid

copies of Lb(ε) satisfies

P (|Lprimen (b ε) P (Bb)minus 1| ge 2ε) le V ar(Lb(ε))ε2 times ntimes P (Bb)

2

6 ADLER BLANCHET LIU

Consequently taking n = Θ(εminus2δminus1 |log P (Bb)|q

)suffices to give an estimator with

at most ε relative error and 1 minus δ confidence and the total computational cost isΘ

(εminus2minusq0δminus1 |log P (Bb)|q+q1

)

We shall measure computational cost in terms of function evaluations such as asingle addition a multiplication a comparison the generation of a single uniformrandom variable on T the generation of a single standard Gaussian random variableand the evaluation of Φ(x) for fixed x ge 0 where

Φ(x) = 1minusΨ(x) =1radic2π

int x

minusinfineminuss22 ds

All of these function evaluations are assumed to cost at most a fixed amount c ofcomputer time Moreover we shall also assume that first and second order momentcharacteristics of the field such as micro (t) = Ef (t) and C (s t) = Cov (f (t) f (s))can be computed in at most c units of computer time for each s t isin T We note thatsimilar models of computation are often used in the complexity theory of continuousproblems see [25]

The previous discussion motivates the next definition which has its roots in thegeneral theory of computation in both continuous and discrete settings [17 25]In particular completely analogous notions in the setting of complexity theory ofcontinuous problems lead to the notion of ldquotractabilityrdquo of a computational problem[28]

Definition 22 A Monte Carlo procedure is said to be a fully polynomialrandomized approximation scheme (FPRAS) for estimating P (Bb) if for someq q1 q2 isin [0infin) it outputs an averaged estimator that is guaranteed to have at mostε gt 0 relative error with confidence at least 1minusδ isin (0 1) in Θ(εminusq1δminusq2 | log P (Bb) |q)function evaluations

The terminology adopted namely FPRAS is borrowed from the complexitytheory of randomized algorithms for counting [17] Many counting problems canbe expressed as rare event estimation problems In such cases it typically occursthat the previous definition (expressed in terms of a rare event probability) coincidesprecisely with the standard counting definition of a FPRAS (in which there is noreference to any rare event to estimate) This connection is noted for instance in[10] Our terminology is motivated precisely by this connection

By letting Bb = flowast gt b the goal in this paper is to design a class of fully poly-nomial randomized approximation schemes that are applicable to a general class ofGaussian random fields In turn since our Monte Carlo estimators will be based onimportance sampling it turns out that we shall also be able to straightforwardlyconstruct FPRASrsquos to estimate quantities such as E[Γ(f)| suptisinT f(t) gt b] for asuitable class of functionals Γ for which Γ (f) can be computed with an error ofat most ε with a cost that is polynomial as function of εminus1 We shall discuss thisobservation in Section 4 which deals with properties of importance sampling

MONTE CARLO FOR RANDOM FIELDS 7

3 Main Results In order to state and discuss our main results we need somenotation For each s t isin T define

micro(t) = E(f(t)) C (s t) = Cov(f (s) f (t)) σ2(t) = C(t t) gt 0 r(s t) =C(s t)

σ(s)σ(t)

Moreover given x isin Rd and β gt 0 we write |x| =sumd

j=1 |xj | where xj is the j-thcomponent of x We shall assume that for each fixed s t isin T both micro (t) and C (s t)can be evaluated in at most c units of computing time

Our first result shows that under modest continuity conditions on micro σ and r itis possible to construct an explicit FPRAS for w(b) under the following regularityconditions

A1 The field f is almost surely continuous on T A2 For some δ gt 0 and |sminus t| lt δ the mean and variance functions satisfies

|σ (t)minus σ (s)|+ |micro (t)minus micro (s)| le κH |sminus t|β

A3 For some δ gt 0 and |s minus sprime| lt δ |t minus tprime| lt δ the correlation function of fsatisfies

|r (t s)minus r (tprime sprime)| le κH [|tminus tprime|β + |sminus sprime|β ]

A4 0 isin T There exist κ0 and ωd such that for any t isin T and ε small enough

m(B(t ε) cap T ) ge κ0εdωd

where m is the Lebesgue measure B(t ε) = s |tminus s| le ε and ωd =m (B(0 1))

The assumption that 0 isin T is of no real consequence and is adopted only fornotational convenience

Theorem 31 Suppose that f T rarr R is a Gaussian random field satisfyingConditions A1ndashA4 above Then Algorithm 61 provides a FPRAS for w (b)

The polynomial rate of the intrinsic complexity bound inherent in this result isdiscussed in Section 6 along with similar rates in results to follow The conditions ofthe previous theorem are weak and hold for virtually all applied settings involvingcontinuous Gaussian fields on compact sets

Not surprisingly the complexity bounds of our algorithms can be improved uponunder additional assumptions on f For example in the case of finite fields (ie whenT is finite) with a non-singular covariance matrix we can show that the complexityof the algorithm is actually bounded as b infin We summarize this in the nextresult whose proof which is given in Section 5 is useful for understanding themain ideas behind the general procedure

Theorem 32 Suppose that T is a finite set and f has a non-singular covari-ance matrix over T times T Then Algorithm 53 provides a FPRAS with q = 0

8 ADLER BLANCHET LIU

As we indicated above the strategy behind the discrete case provides the basisfor the general case In the general situation the underlying idea is to discretize thefield with an appropriate sampling (discretization) rule that depends on the levelb and the continuity characteristics of the field The number of sampling pointsgrows as b increases and the complexity of the algorithm is controlled by findinga good sampling rule There is a trade off between the number of points sampledwhich has a direct impact on the complexity of the algorithm and the fidelityof the discrete approximation to the continuous field Naturally in the presence ofenough smoothness and regularity more information can be obtained with the samesample size This point is illustrated in the next result Theorem 33 which considerssmooth homogeneous fields Note that in addition to controlling the error inducedby discretizing the field the variance is strongly controlled and the discretizationrule is optimal in a sense explained in Section 7 For Theorem 33 we require thefollowing additional regularity conditions

B1 f is homogeneous and almost surely twice continuously differentiableB2 0 isin T sub Rd is a d-dimensional convex set with nonempty interior Denoting

its boundary by partT assume that partT is a (dminus1)-dimensional manifold withoutboundary For any t isin T assume that there exists κ0 gt 0 such that

m(B(t ε) cap T ) ge κ0εd

for any ε lt 1 where m is Lebesgue measure

Theorem 33 Let f satisfy conditions B1ndashB2 Then Algorithm 73 provides aFPRAS Moreover the underlying estimator is strongly efficient and there exists adiscretization scheme for f which is optimal in the sense of Theorem 74

4 Importance Sampling Importance sampling is based on the basic iden-tity for fixed measurable B

(41) P (B) =int

1 (ω isin B) dP (ω) =int

1 (ω isin B)dP

dQ(ω) dQ (ω)

where we assume that the probability measure Q is such that Q (middot capB) is abso-lutely continuous with respect to the measure P (middot capB) If we use EQ to denoteexpectation under Q then (41) trivially yields that the random variable

L (ω) = 1 (ω isin B)dP

dQ(ω)

is an unbiased estimator for P (B) gt 0 under the measure Q or symbolicallyEQL = P (B)

An averaged importance sampling estimator based on the measure Q which isoften referred as an importance sampling distribution or a change-of-measure isobtained by simulating n iid copies L(1) L(n) of L under Q and computing theempirical average Ln = (L(1) + +L(n))n A central issue is that of selecting Q inorder to minimize the variance of Ln It is easy to verify that if Qlowast(middot) = P (middot|B) =

MONTE CARLO FOR RANDOM FIELDS 9

P (middot capB)P (B) then the corresponding estimator has zero variance However Qlowast isclearly a change of measure that is of no practical value since P (B) ndash the quantitythat we are attempting to evaluate in the first place ndash is unknown Neverthelesswhen constructing a good importance sampling distribution for a family of setsBb b ge b0 for which 0 lt P (Bb) rarr 0 as b rarrinfin it is often useful to analyze theasymptotic behavior of Qlowast as P (Bb) rarr 0 in order to guide the construction of agood Q

We now describe briefly how an efficient importance sampling estimator forP (Bb) can also be used to estimate a large class of conditional expectations givenBb Suppose that a single replication of the corresponding importance samplingestimator

Lb∆= 1 (ω isin Bb) dPdQ

can be generated in O (log |P (Bb)|q0) function evaluations for some q0 gt 0 andthat

V ar (Lb) = O([P (Bb)]2 log |P (Bb)|q0

)

These assumptions imply that by taking the average of iid replications of Lb weobtain a FPRAS

Fix β isin (0infin) and let X (β q) be the class of random variables X satisfying0 le X le β with

E[X|Bb] = Ω[1 log(P (Bb))q](42)

Then by noting that

(43)EQ (XLb)EQ (Lb)

= E[X|Bb] =E[X Bb]P (Bb)

it follows easily that a FPRAS can be obtained by constructing the natural estima-tor for E[X|Bb] ie the ratio of the corresponding averaged importance samplingestimators suggested by the ratio in the left of (43) Of course when X is difficultto simulate exactly one must assume the bias E[X Bb] can be reduced in poly-nomial time The estimator is naturally biased but the discussion on FPRAS onbiased estimators given in Section 2 can be directly applied

In the context of Gaussian random fields we have that Bb = flowast gt b andone is very often interested in random variables X of the form X = Γ (f) whereΓ C (T ) rarr R and C(T ) denotes the space of continuous functions on T EndowingC(T ) with the uniform topology consider functions Γ that are non-negative andbounded by a positive constant An archetypical example is given by the volumeof (conditioned) high level excursion sets with β = m(T ) is known to satisfy (42)However there are many other examples of X (β q) with β = m(T ) which satisfy(42) for a suitable q depending on the regularity properties of the field In fact ifthe mean and covariance properties of f are Holder continuous then using similartechniques as those given in the arguements of Section 6 it is not difficult to see

10 ADLER BLANCHET LIU

that q can be estimated In turn we have that a FPRAS based importance sam-pling algorithm for w (b) would typically also yield a polynomial time algorithm forfunctional characteristics of the conditional field given high level excursions Sincethis is a very important and novel application we devote the remainder of thispaper to the development of efficient importance sampling algorithms for w(b)

5 The Basic Strategy Finite Fields In this section we develop our mainideas in the setting in which T is a finite set of the form T = t1 tM Toemphasize the discrete nature of our algorithms in this section we write Xi = f (ti)for 1 M and set X = (X1 XM ) This section is mainly of an expositorynature since much of it has already appeared in [2] Nevertheless it is includedhere as a useful guide to the intuition behind the algorithms for the continuouscase

We have already noted that in order to design an efficient importance samplingestimator for w (b) = P (max1leileM Xi gt b) it is useful to study the asymptoticconditional distribution of X given that max1leileM Xi gt b Thus we begin withsome basic large deviation results

Proposition 51 For any set of random variables X1 XM

max1leileM

P (Xi gt b) le P

(max

1leileMXi gt b

)le

Msum

j=1

P (Xj gt b)

Moreover if the Xj are mean zero Gaussian and the correlation between Xi andXj is strictly less than 1 then

P (Xi gt bXj gt b) = o (max[P (Xi gt b) P (Xj gt b)])

Thus if the associated covariance matrix of X is non-singular

w (b) = (1 + o (1))Msum

j=1

P (Xj gt b)

Proof The lower bound in the first display is trivial and the upper bound followsby the union bound The second display follows easily by working with the jointdensity of a bivariate Gaussian distribution (eg [9 16]) and the third claim is adirect consequence of the inclusion-exclusion principle 2

As noted above Qlowast corresponds to the conditional distribution of X given thatXlowast max1leileM Xi gt b It follows from Proposition 51 that conditional on Xlowast gtb the probability that two or more Xj exceed b is negligible Moreover it alsofollows that

P (Xi = Xlowast|Xlowast gt b) = (1 + o (1))P (Xi gt b)sumM

j=1 P (Xj gt b)

The following corollary now follows as an easy consequence of these observations

MONTE CARLO FOR RANDOM FIELDS 11

Corollary 52dTV (QlowastQ) rarr 0

as b rarrinfin where dTV denotes the total variation norm and Q is defined for BorelB sub RM as

Q (X isin B) =Msum

i=1

p (i b)P [X isin B|Xi gt b]

where

p (i b) =P (Xi gt b)sumM

j=1 P (Xj gt b)

Proof Pick an arbitrary Borel B Then we have that

Qlowast (X isin B) =P [X isin B max1leileM Xi gt b]

w (b)le

Msum

i=1

P [X isin B Xi gt b]w (b)

=Msum

i=1

P [X isin B|Xi gt b]p (i b)

(1 + o (1))

The above which follows from the union bound and the last part of Proposition51 combined with the definition of Q yields that for each ε gt 0 there exists b0

(independent of B) such that for all b ge b0

Qlowast (X isin B) le Q (X isin B) (1minus ε)

The lower bound follows similarly using the inclusion-exclusion principle and thesecond part of Proposition 51 2

Corollary 52 provides support for choosing Q as an importance sampling dis-tribution Of course we still have to verify that the corresponding algorithm is aFPRAS The importance sampling estimator induced by Q takes the form

(51) Lb =dP

dQ=

sumMj=1 P (Xj gt b)

sumMj=1 1 (Xj gt b)

Note that under Q we have that Xlowast gt b almost surely so the denominator in theexpression for Lb is at least 1 Therefore we have that

EQL2b le

Msum

j=1

P (Xj gt b)

2

and by virtue of Proposition 51 we conclude (using V arQ to denote the varianceunder Q) that

V arQ (Lb)P (Xlowast gt b)2

rarr 0

12 ADLER BLANCHET LIU

as b rarrinfin In particular it follows that Lb is strongly efficientOur proposed algorithm can now be summarized as follows

Algorithm 53 There are two steps in the algorithm

Step 1 Simulate n iid copies X(1) X(n) of X from the distribution QStep 2 Compute and output

Ln =1n

nsum

i=1

L(i)b

where L(i)b =

sumMj=1 P (X(i)

j gt b)sumM

j=1 1(X(i)j gt b)

Since the generation of Li under Q takes O(M3

)function evaluations we con-

clude based on the analysis given in Section 2 that Algorithm 53 is a FPRASwith q = 0 This implies Theorem 32 as promised

6 A FPRAS for Holder Continuous Gaussian Fields In this sectionwe shall describe the algorithm and the analysis behind Theorem 31 Throughoutthe section unless stated otherwise we assume conditions A1ndashA4 of Section 3

There are two issues related to the complexity analysis Firstly since f is assumedcontinuous the entire field cannot be generated in a (discrete) computer and so thealgorithm used in the discrete case needs adaptation Once this is done we need tocarry out an appropriate variance analysis

Developing an estimator directly applicable to the continuous field will be car-ried out in Section 61 This construction will not only be useful when studyingthe performance of a suitable discretization but will also help to explain some ofthe features of our discrete construction Then in Section 62 we introduce a dis-cretization approach and study the bias caused by the discretization In additionwe provide bounds on the variance of this discrete importance sampling estimator

61 A Continuous Estimator We start with a change of measure motivatedby the discrete case in Section 5 A natural approach is to consider an importancesampling strategy analogous to that of Algorithm 53 The continuous adaptationinvolves first sampling τb according to the probability measure

(61) P (τb isin middot) =E[m (Ab cap middot)]

E[m (Ab)]

where Ab = t isin T f (t) gt b The idea of introducing τb in the continuous settingis not necessarily to locate the point at which the maximum is achieved as was thesituation in the discrete case Rather τb will be used to find a random point whichhas a reasonable probability of being in the excursion set Ab (This probability willtend to be higher if f is non-homogenous) This relaxation will prove useful in theanalysis of the algorithm Note that τb with the distribution indicated in equation(61) has a density function (with respect to Lebesgue measure) given by

hb(t) =P (f(t) gt b)E[m(Ab)]

MONTE CARLO FOR RANDOM FIELDS 13

and that we also can write

E[m (Ab)] = E

int

T

1 (f (t) gt b) dt =int

T

P (f (t) gt b) dt = m (T ) P (f (U) gt b)

where U is uniformly distributed over T Once τb is generated the natural continuous adaptation corresponding to the

strategy described by Algorithm 53 proceeds by sampling f conditional on f(τb) gtb Note that if we use Q to denote the change-of-measure induced by such a con-tinuous sampling strategy then the corresponding importance sampling estimatortakes the form

Lb =dP

dQ=

E[m (Ab)]m (Ab)

The second moment of the estimator then satisfies

EQ[(Lb)2] = E(LbAb 6= empty

)= E[m(Ab)]P (flowast gt b)E[m(Ab)minus1|Ab 6= empty](62)

Unfortunately it is easy to construct examples for which EQ[(Lb

)2] is infiniteConsequently the above construction needs to be modified slightly

Elementary extreme value theory considerations for the maximum of finitelymany Gaussian random variables and their analogues for continuous fields givethat the overshoot of f over a given level b will be of order Θ(1b) Thus in orderto keep τb reasonably close to the excursion set we shall also consider the possibilityof an undershoot of size Θ (1b) right at τb As we shall see this relaxation will allowus to prevent the variance in (62) becoming infinite Thus instead of (61) we shallconsider τbminusab with density

(63) hbminusab(t) =P (f(t) gt bminus ab)

E[m(Abminusab)]

for some a gt 0 To ease on later notation write

γab bminus ab τγab= τbminusab

Let Qprime be the change of measure induced by sampling f as follows Given τγab

sample f(τγab) conditional on f(τγab

) gt γab In turn the rest of f follows its con-ditional distribution (under the nominal or original measure) given the observedvalue f(τγab

) We then have that the corresponding Radon-Nikodym derivative is

(64)dP

dQprime =E[m(Aγab

)]m(Aγab

)

and the importance sampling estimator Lprimeb is

(65) Lprimeb =dP

dQprime 1 (Ab 6= empty) =E[m(Aγab

)]m(Aγab

)1 (m(Ab) gt 0)

Note that we have used the continuity of the field in order to write m (Ab) gt 0 =Ab 6= empty almost surely The motivation behind this choice lies in the fact that

14 ADLER BLANCHET LIU

since m(Aγab) gt m (Ab) gt 0 the denominator may now be big enough to control

the second moment of the estimator As we shall see introducing the undershootof size ab will be very useful in the technical development both in the remainderof this section and in Section 7 In addition its introduction also provides insightinto the appropriate form of the estimator needed when discretizing the field

62 Algorithm and Analysis We still need to face the problem of generatingf in a computer Thus we now concentrate on a suitable discretization schemestill having in mind the change of measure leading to (64) Since our interest isto ultimately design algorithms that are efficient for estimating expectations suchas E[Γ (f) |flowast gt b] where Γ may be a functional of the whole field we shall use aglobal discretization scheme

Consider U = (U1 UM ) where Ui are iid uniform random variables takingvalues in T and independent of the field f Set TM = U1 UM and Xi =Xi (Ui) = f (Ui) for 1 le i le M Then X = (X1 Xm) (conditional on U) isa multivariate Gaussian random vector with conditional means micro(Ui)

∆= E(Xi|Ui)and covariances C (Ui Uj)

∆= Cov (Xi Xj |Ui Uj) Our strategy is to approximatew(b) by

wM (γab) = P (maxtisinTM

f(t) gt γab) = E[P ( max1leileM

Xi gt γab|U)]

Given the development in Section 5 it might not be surprising that if we can ensurethat M = M (ε b) is polynomial in 1ε and b then we shall be in a good positionto develop a FPRAS The idea is to apply an importance sampling strategy similarto that we considered in the construction of Lprimeb of (65) but this time it will beconditional on U In view of our earlier discussions we propose sampling from Qprimeprime

defined via

Qprimeprime (X isin B|U) =Msum

i=1

pU (i)P [X isin B|Xi gt γab U ]

where

pU (i) =P (Xi gt γab|U)sumM

j=1 P (Xj gt γab|U)

We then obtain the (conditional) importance sampling estimator

(66) Lb(U) =sumM

i=1 P (Xi gt γab|U)sumMi=1 1(Xi gt γab)

It is clear thatwM (γab) = EQprimeprime [Lb(U)]

Suppose for the moment that M a and the number of replications n have beenchosen Our future analysis will in particular guide the selection of these parametersThen the procedure is summarized by the next algorithm

MONTE CARLO FOR RANDOM FIELDS 15

Algorithm 61 The algorithm has three steps

Step 1 Simulate U (1) U (n) which are n iid copies of the vector U = (U1 UM )described aboveStep 2 Conditional on each U (i) for i = 1 n generate L

(i)b (U (i)) as described

by (66) by considering the distribution of X(i)(U (i)) = (X(i)1 (U (i)

1 ) X(i)M (U (i)

M ))Generate the X(i)(U (i)) independently so that at the end we obtain that the L

(i)b (U (i))

are n iid copies of Lb(U)Step 3 Output

Ln(U (1) U (n)) =1n

nsum

i=1

L(i)b (U (i))

63 Running Time of Algorithm 61 Bias and Variance Control The remain-der of this section is devoted to the analysis of the running time of the Algorithm61 The first step lies in estimating the bias and second moment of Lb(U) underthe change of measure induced by the sampling strategy of the algorithm whichwe denote by Qprimeprime We start with a simple bound for the second moment

Proposition 62 There exists a finite λ0 depending on microT = maxtisinT |micro (t)|and σ2

T = maxtisinT σ2 (t) for which

EQprimeprime [Lb(U)2] le λ0M2P

(maxtisinT

f (t) gt b

)2

Proof Observe that

EQprimeprime [Lb(U)2] le E

(Msum

i=1

P (Xi gt γab|Ui)

)2

le E

(Msum

i=1

suptisinT

P (f (Ui) gt γab|Ui = t)

)2

= M2 maxtisinT

P (f (t) gt γab)2

le λ0M2 max

tisinTP (f (t) gt b)2

le λ0M2P

(maxtisinT

f (t) gt b

)2

which completes the proof 2

Next we obtain a preliminary estimate of the bias

16 ADLER BLANCHET LIU

Proposition 63 For each M ge 1 we have

|w (b)minus wM (b)| le E[exp

(minusMm(Aγab

)m(T )

) Ab cap T 6= empty]

+ P

(maxtisinT

f (t) gt γab maxtisinT

f (t) le b

)

Proof Note that

|w (b)minus wM (γab)|le P (max

tisinTM

f (t) le γab maxtisinT

f (t) gt b) + P (maxtisinTM

f (t) gt γab maxtisinT

f (t) le b)

The second term is easily bounded by

P (maxtisinTM

f (t) gt γabmaxtisinT

f (t) le b) le P (maxtisinT

f (t) gt γab maxtisinT

f (t) le b)

The first term can be bounded as follows

P

(maxtisinTM

f (t) le γabmaxtisinT

f (t) gt b

)

le E[(P [f (Ui) le γab| f ])M 1(Ab cap T 6= empty)

]

le E[(

1minusm(Aγab)m (T )

)M Ab cap T 6= empty)]

le E[exp

(minusMm(Aγab)m (T )

) Ab cap T 6= empty)

]

This completes the proof 2

The previous proposition shows that controlling the relative bias of Lb(U) re-quires finding bounds for

(67) E[exp(minusMm(Aγab

)m(T )) Ab cap T 6= empty]

and

(68) P (maxtisinT

f (t) gt γabmaxtisinT

f (t) le b)

and so we develop these To bound (67) we take advantage of the importancesampling strategy based on Qprime introduced earlier in (64) Write

E[exp

(minusMm(Aγab)m (T )

) m(Ab) gt 0

](69)

= EQprime(

exp(minusMm(Aγab

)m (T ))

m(Aγab)

m(Ab) gt 0

)Em

(Aγab

)

MONTE CARLO FOR RANDOM FIELDS 17

Furthermore note that for each α gt 0 we have

EQprime(

exp(minusMm

(Aγab

)m (T )

)

m(Aγab

) m(Ab) gt 0

)(610)

le αminus1 exp (minusMαm(T ))

+ EQprime(

exp(minusMm(Aγab

)m(T ))

m(Aγab

) m(Aγab) le αm(Ab) gt 0

)

The next result whose proof is given in Section 64 gives a bound for the aboveexpectation

Proposition 64 Let β be as in Conditions A2 and A3 For any v gt 0 thereexist constants κ λ2 isin (0infin) (independent of a isin (0 1) and b but dependent on v)such that if we select

αminus1 ge κdβ(ba)(2+v)2dβ

and define W such that P (W gt x) = exp(minusxβd

)for x ge 0 then

EQprime(

exp(minusMm(Aγab

)m (T ))

m(Aγab)

m(Aγab) le αm(Ab) gt 0

)(611)

le EW 2 m (T )

λ2dβ2 M

(b

a

)4dβ

The following result gives us a useful upper bound on (68) The proof is givenin Section 65

Proposition 65 Assume that Conditions A2 and A3 are in force For anyv gt 0 let ρ = 2dβ + dv + 1 where d is the dimension of T There exist constantsb0 λ isin (0infin) (independent of a but depending on microT = maxtisinT |micro (t)| σ2

T =maxtisinT σ2 (t) v the Holder parameters β and κH) so that for all b ge b0 ge 1 wehave

(612) P

(maxtisinT

f(t) le b + ab | maxtisinT

f(t) gt b

)le λabρ

Consequently

P

(maxtisinT

f (t) gt γab maxtisinT

f (t) le b

)

= P

(maxtisinT

f (t) le b|maxtisinT

f (t) gt γab

)P

(maxtisinT

f(t) gt γab

)

le λabρP

(maxtisinT

f(t) gt γab

)

18 ADLER BLANCHET LIU

Moreover

P

(maxtisinT

f(t) gt γab

)(1minus λabρ) le P

(maxtisinT

f(t) gt b

)

Propositions 64 and 65 allow us to prove Theorem 31 which is rephrased inthe form of the following theorem which contains the detailed rate of complexityand so the main result of this section

Theorem 66 Suppose f is a Gaussian random field satisfying conditions A1ndashA4 in Section 3 Given any v gt 0 put a = ε(4λbρ) (where λ and ρ as in Proposition65) and αminus1 = κdβ(ba)(2+v)dβ Then there exist c ε0 gt 0 such that for allε le ε0

(613) |w (b)minus wM (γab)| le w (b) ε

if M =lceilcεminus1(ba)(4+4v)dβ

rceil Consequently by our discussion in Section 2 and the

bound on the second moment given in Proposition 62 it follows that Algorithm 61provides a FPRAS with running time O

((M)3 times (M)2 times εminus2δminus1

)

Proof Combining (63) (69) and (610) with Propositions 63ndash65 we have that

|w (b)minus wM (b) |le αminus1 exp (minusMαm (T )) E

[m

(Aγab

)]

+E[W 2]m (T )

λ2dβ2 M

(b

a

)4dβ

E[m

(Aγab

)]+

(λabρ

1minus λabρ

)w (b)

Furthermore there exists a constant K isin (0infin) such that

Em(Aγab

) le K maxtisinT

P (f (t) gt b)m (T ) le Kw (b) m (T )

Therefore we have that

|w (b)minus wM (b) |w (b)

le αminus1Km(T ) exp (minusMαm (T ))

+E[W 2]Km (T )2

λ2dβ2 M

(b

a

)4dβ

+(

λabρ

1minus λabρ

)

Moreover since a = ε(4λbρ) we obtain that for ε le 12

|w (b)minus wN (b) |w (b)

le αminus1Km(T ) exp (minusMαm (T ))

+[EW 2]Km (T )2

λ2dβ2 M

(b

a

)4dβ

+ ε2

MONTE CARLO FOR RANDOM FIELDS 19

From the selection of α M and θ it follows easily that the first two terms on theright hand side of the previous display can be made less than ε2 for all ε le ε0 bytaking ε0 sufficiently small

The complexity count given in the theorem now corresponds to the followingestimates The factor O((M)3) represents the cost of a Cholesky factorization re-quired to generate a single replication of a finite field of dimension M In additionthe second part of Proposition 62 gives us that O(M2εminus2δminus1) replications are re-quired to control the relative variance of the estimator 2

We now proceed to prove Propositions 64 and 65

64 Proof of Proposition 64 We concentrate on the analysis of the left handside of (611) An important observation is that conditional on the random variableτγab

with distribution

Qprime(τγabisin middot) =

E[m(Aγab

cap middot)]

E[m(Aγab

)]

and given f(τγab) the rest of the field namely (f (t) t isin Tτγab

) is anotherGaussian field with a computable mean and covariance structure The second termin (610) indicates that we must estimate the probability that m(Aγab

) takes smallvalues under Qprime For this purpose we shall develop an upper bound for

(614) P(m(Aγab

) lt yminus1 m(Ab) gt 0∣∣ f (t) = γab + zγab

)

for y large enough Our arguments proceeds in two steps For the first in order tostudy (614) we shall estimate the conditional mean covariance of f (s) s isin Tgiven that f (t) = γab + zγab Then we use the fact that the conditional field isalso Gaussian and take advantage of general results from the theory of Gaussianrandom fields to obtain a bound for (614) For this purpose we recall some usefulresults from the theory of Gaussian random fields The first result is due to Dudley[14]

Theorem 67 Let U be a compact subset of Rn and let f0(t) t isin U be amean zero continuous Gaussian random field Define the canonical metric d on Uas

d (s t) =radic

E[f0 (t)minus f0 (s)]2

and put diam (U) = supstisinU d (s t) which is assumed to be finite Then there existsa finite universal constant κ gt 0 such that

E[maxtisinU

f0 (t)] le κ

int diam(U)2

0

[log (N (ε))]12dε

where the entropy N (ε) is the smallest number of dminusballs of radius ε whose unioncovers U

20 ADLER BLANCHET LIU

The second general result that we shall need is the so-called B-TIS (Borel-Tsirelson-Ibragimov-Sudakov) inequality [4 11 26]

Theorem 68 Under the setting described in Theorem 67

P

(maxtisinU

f0 (t)minus E[maxtisinU

f0 (t)] ge b

)le exp

(minusb2(2σ2U )

)

whereσ2U = max

tisinUE[f2

0 (t)]

We can now proceed with the main proof We shall assume from now on thatτγab

= 0 since as will be obvious from what follows all estimates hold uniformlyover τγab

isin T This is a consequence of the uniform Holder assumptions A2 andA3 Define a new process f

(f (t) t isin T ) L= (f (t) t isin T | f (0) = γab + zγab)

Note that we can always write f (t) = micro (t)+g (t) where g is a mean zero Gaussianrandom field on T We have that

micro (t) = Ef (t) = micro (t) + σ (0)minus2C (0 t) (γab + zγab minus micro (0))

and that the covariance function of f is given by

Cg (s t) = Cov (g (s) g (t)) = C (s t)minus σ (0)minus2C (0 s)C (0 t)

The following lemma describes the behavior of micro (t) and Cg (t s)

Lemma 69 Assume that |s| and |t| small enough Then the following threeconclusions hold

(i) There exist constants λ0 and λ1 gt 0 such that

|micro (t)minus (γab + zγab)| le λ0 |t|β + λ1 |t|β (γab + zγab)

and for all z isin (0 1) and γab large enough

|micro (s)minus micro (t) | le κHγab|sminus t|β

(ii)Cg (s t) le 2κHσ (t)σ (s) |t|β + |s|β + |tminus s|β

(iii)

Dg (s t) =radic

E([g (t)minus g (s)]2) le λ121 |tminus s|β2

Proof All three consequences follow from simple algebraic manipulations Thedetails are omitted 2

MONTE CARLO FOR RANDOM FIELDS 21

Proposition 610 For any v gt 0 there exist κ and λ2 such that for all t isin T yminusβd le a2+v

κb(2+v) a sufficiently small and z gt 0

P(m(Aγab

)minus1 gt y m(Ab) gt 0∣∣ f (t) = γab + zγab

) le exp(minusλ2a

2yβdb2)

Proof For notational simplicity and without loss of generality we assume thatt = 0 First consider the case that z ge 1 Then there exist c1 c2 such that for allc2y

minusβd lt bminus2minusv and z gt 1

P(m

(Aγab

cap T)minus1

gt ym(Ab) gt 0| f (0) = γab + zγab

)

le P

(inf

|t|ltc1yminus1df(t) le γab|f (0) = γab + zγab

)

= P

(inf

|t|ltc1yminus1dmicro (t) + g(t) le γab

)

le P

(inf

|t|ltc1yminus1dg(t) le minus 1

2γab

)

Now apply (iii) from Lemma 69 from which it follows that N (ε) le c3m(T )ε2dβ

for some constant c3 By Theorem 67 E(sup|t|ltc1yminus1d f(t)) = O(yminusβ(2d) log y)By Theorem 68 for some constant c4

P(m

(Aγab

cap T)minus1

gt y m(Ab) gt 0| f (0) = γab + zγab

)

le P

(inf

|t|ltc1yminus1dg(t) le minus 1

2γab

)

le exp

(minus 1

c4γ2aby

minusβd

)

for c2yminusβd lt bminus2minusv and z gt 1

Now consider the case z isin (0 1) Let tlowast be the global maximum of f(t) Then

P(m

(Aγab

cap T)minus1

gt y m(Ab) gt 0|f (0) = γab + zγab

)

le P

(inf

|tminustlowast|ltc1yminus1df(t) lt γab f(tlowast) gt b|f (0) = γab + zγab

)

le P

(sup

|sminust|ltc1yminus1d

|f(s)minus f(t)| gt ab|f (0) = γab + zγab

)

Consider the new field ξ(s t) = g(s)minus g(t) with parameter space T times T Note thatradic

V ar(ξ(s t)) = Dg(s t) le λ1|sminus t|β2

Via basic algebra it is not hard to show that the entropy of ξ(s t) is bounded byNξ (ε) le c3m(T times T )ε2dβ In addition from (i) of Lemma 69 we have

|micro (s)minus micro (t) | le κHγab|sminus t|β

22 ADLER BLANCHET LIU

Similarly for some κ gt 0 and all yminusβd le 1κ

(ab

)2+v a lt 1 there exists c5 such that

P(m

(Aγab

cap T)minus1

gt y m(Ab) gt 0|f (0) = γab + zγab

)

le P

(sup

|sminust|ltc1yminus1d

|f(s)minus f(t)| gt ab|f (0) = γab + zγab

)

= P

(sup

|sminust|ltc1yminus1d

|ξ(s t)| gt a

2b

)

le exp(minus a2

c5b2yminusβd

)

Combining the two cases z gt 1 and z isin (0 1) and choosing c5 large enough we have

P(m

(Aγab

cap T)minus1

gt y m(Ab) gt 0∣∣ f (0) = γab + zγab

)

le exp(minus a2

c5b2yminusβd

)

for a small enough and yminusβd le 1κ

(ab

)2+v Renaming the constants completes theproof 2

The final ingredient needed for the proof of Proposition 64 is the following lemmainvolving stochastic domination The proof follows an elementary argument and istherefore omitted

Lemma 611 Let v1 and v2 be finite measures on R and define ηj(x) =intinfin

xvj(ds)

Suppose that η1 (x) ge η2 (x) for each x ge x0 Let (h (x) x ge x0) be a non-decreasing positive and bounded function Then

int infin

x0

h (s) v1(dx) geint infin

x0

h (s) v2 (ds)

Proof of Proposition 64 Note that

EQprime(exp

(minusMm(Aγab

)m (T )

)

m(Aγab

) m(Aγab

) isin (0 α) m(Ab) gt 0)

=int

T

int infin

z=0

E(exp

(minusMm(Aγab

)m (T )

)

m (Aγab)

m(Aγab

) isin (0 α) m(Ab) gt 0∣∣∣

f (t) = γab +z

γab

)P

(τγab

isin dt)P (γab[f (t)minus γab] isin dz| f (t) gt γab)

le supzgt0tisinT

E(exp

(minusMm(Aγab

)m (T )

)

m (Aγab)

m(Aγab

) isin (0 α) m(Ab) gt 0∣∣∣

f (t) = γab +z

γab

)

MONTE CARLO FOR RANDOM FIELDS 23

Now define Y (ba) = (ba)2dβλminusdβ2 W with λ2 as chosen in Proposition 610

and W with distribution given by P (W gt x) = exp(minusxβd) By Lemma 611

suptisinT

E(exp

(minusMm(Aγab

)m (T )

)

m (Aγab)

m(Aγab

) isin (0 α) m(Ab) gt 0∣∣∣

f (t) = γab +z

γab

)

le E[Y (ba) exp(minusMY (ba)minus1m(T ))Y (ba) gt αminus1

]

Now let Z be exponentially distributed with mean 1 and independent of Y (ba)Then we have (using the definition of the tail distribution of Z and Chebyshevrsquosinequality)

exp(minusMY (ba)minus1m (T )

)= P

(Z gt MY (ba)minus1m (T )

∣∣ Y (ba))

le m (T )Y (ba)M

Therefore

E[exp(minusMY (ba)minus1m (T )

)Y (ba) Y (ba) ge αminus1]

le m (T )M

E[Y (ba)2 Y (ba) ge αminus1]

le m (T )

Mλ2dβ2

(b

a

)4dβ

E(W 2)

which completes the proof 2

65 Proof of Proposition 65 We start with the following result of Tsirelson[27]

Theorem 612 Let f be a continuous separable Gaussian process on a compact(in the canonical metric) domain T Suppose that V ar (f) = σ is continuous andthat σ (t) gt 0 for t isin T Moreover assume that micro = Ef is also continuous andmicro(t) ge 0 for all t isin T Define

σ2T

∆= maxtisinT

Var(f(t))

and set F (x) = PmaxtisinT f(t) le x Then F is continuously differentiable on RFurthermore let y be such that F (y) gt 12 and define ylowast by

F (y) = Φ(ylowast)

Then for all x gt y

F prime(x) le Ψ(

xylowasty

)(xylowasty

(1 + 2α) + 1)

(1 + α)

24 ADLER BLANCHET LIU

where

α =y2

x(xminus y)y2lowast

We can now prove the following lemma

Lemma 613 There exists a constant A isin (0infin) independent of a and b ge 0such that

(615) P

(suptisinT

f(t) le b + ab | suptisinT

f(t) gt b

)le aA

P (suptisinT f (t) ge bminus 1b)P (suptisinT f (t) ge b)

Proof By subtracting inftisinT micro (t) gt minusinfin and redefining the level b to be b minusinftisinT micro (t) we may simply assume that Ef (t) ge 0 so that we can apply Theorem612 Adopting the notation of Theorem 612 first we pick b0 large enough so thatF (b0) gt 12 and assume that b ge b0 + 1 Now let y = bminus 1b and F (y) = Φ(ylowast)Note that there exists δ0 isin (0infin) such that δ0b le ylowast le δminus1

0 b for all b ge b0 Thisfollows easily from the fact that

logP

suptisinT

f(t) gt x

sim log sup

tisinTP f(t) gt x sim minus x2

2σ2T

On the other hand by Theorem 612 F is continuously differentiable and so

(616) P

suptisinT

f(t) lt b + ab | suptisinT

f(t) gt b

=

int b+ab

bF prime(x) dx

P suptisinT f(t) gt b

Moreover

F prime(x) le(

1minus Φ(

xylowasty

)) (xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x))

le (1minus Φ(ylowast))(

xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x))

= P

(maxtisinT

f (t) gt bminus 1b

)(xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x))

Thereforeint b+ab

b

F prime(x) dx

le P

(suptisinT

f (t) gt bminus 1b

) int b+ab

b

(xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x)) dx

Recall that α (x) = y2[x(xminus y)y2lowast] we can use the fact that ylowast ge δ0b to conclude

that if x isin [b b + ab] then α (x) le δminus20 and therefore

int b+ab

b

(xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x)) dx le 4δminus80 a

MONTE CARLO FOR RANDOM FIELDS 25

We thus obtain that

(617)

int b+ab

bF prime(x) dx

P suptisinT f(t) gt b le 4aδminus80

P suptisinT f(t) gt bminus 1bP suptisinT f(t) gt b

for any b ge b0 This inequality together with the fact that F is continuously differ-entiable on (minusinfininfin) yields the proof of the lemma for b ge 0 2

The previous result translates a question that involves the conditional distri-bution of maxtisinT f (t) near b into a question involving the tail distribution ofmaxtisinT f (t) The next result then provides a bound on this tail distribution

Lemma 614 For each v gt 0 there exists a constant C(v) isin (0infin) (possiblydepending on v gt 0 but otherwise independent of b) so that such that

P

(maxtisinT

f (t) gt b

)le C(v)b2dβ+dv+1 max

tisinTP (f (t) gt b)

for all b ge 1

Proof The proof of this result follows along the same lines of Theorem 262 in[5] Consider an open cover of T = cupM

i=1Ti(θ) where Ti(θ) = s |s minus ti| lt θWe choose ti carefully such that N(θ) = O(θminusd) for θ arbitrarily small Writef (t) = g (t)+micro (t) where g (t) is a centered Gaussian random field and note usingA2 and A3 that

P

(max

tisinTi(θ)f (t) gt b

)le P

(max

tisinTi(θ)g (t) gt bminus micro (ti)minus κHθβ

)

Now we wish to apply the Borel-TIS inequality (Theorem 68) with U = Ti (θ) f0 =g d (s t) = E12([g (t)minus g (s)]2) which as a consequence of A2 and A3 is boundedabove by C0 |tminus s|β2 for some C0 isin (0infin) Thus applying Theorem 67 we havethat E maxtisinTi(θ) g (t) le C1θ

β2 log(1θ) for some C1 isin (0infin) Consequently theBorel-TIS inequality yields that there exists C2 (v) isin (0infin) such that for all bsufficiently large and θ sufficiently small we have

P

(max

tisinTi(θ)g (t) gt bminus micro (ti)minus κHθβ

)le C2 (v) exp

(minus

(bminus micro (ti)minus C1θ

β(2+βv))2

2σ2Ti

)

where σTi = maxtisinTi(θ) σ (t) Now select v gt 0 small enough and set θβ(2+βv) =bminus1 Straightforward calculations yield that

P

(max

tisinTi(θ)f (t) gt b

)le P

(max

tisinTi(θ)g (t) gt bminus micro (ti)minus κHθβ

)

le C3 (v) maxtisinTi(θ)

exp

(minus (bminus micro (t))2

2σ (t)2

)

26 ADLER BLANCHET LIU

for some C3(v) isin (0infin) Now recall the well known inequality (valid for x gt 0)that

φ (x)(

1xminus 1

x3

)le 1minus Φ(x) le φ (x)

x

where φ = Φprime is the standard Gaussian density Using this inequality it follows thatC4 (v) isin (0infin) can be chosen so that

maxtisinTi(θ)

exp

(minus (bminus micro (t))2

2σ (t)2

)le C4 (v) bmax

tisinTi

P (f (t) gt b)

for all b ge 1 We then conclude that there exists C (v) isin (0infin) such that

P

(maxtisinT

f (t) gt b

)le N (θ)C4 (v) bmax

tisinTP (f (t) gt b)

le CθminusdbmaxtisinT

P (f (t) gt b) = Cb2dβ+dv+1 maxtisinT

P (f (t) gt b)

giving the result 2

We can now complete the proof of Proposition 65

Proof of Proposition 65 The result is a straightforward corollary of the previoustwo lemmas By (615) in Lemma 613 and Lemma 614 there exists λ isin (0infin) forwhich

P

(maxtisinT

f(t) le b + ab | maxtisinT

f(t) gt b

)

le aAP (maxtisinT f (t) ge bminus 1b)

P (maxtisinT f (t) ge b)

le aCAb2dβ+dv+1 maxtisinT P (f (t) gt bminus 1b)P (maxtisinT f (t) ge b)

le aCAb2dβ+dv+1 maxtisinT P (f (t) gt bminus 1b)maxtisinT P (f (t) gt b)

le aλb2dβ+dv+1

The last two inequalities follow from the obvious bound

P

(maxtisinT

f (t) ge b

)ge max

tisinTP (f (t) gt b)

and standard properties of the Gaussian distribution This yields (612) from whichthe remainder of the proposition follows 2

MONTE CARLO FOR RANDOM FIELDS 27

7 Fine Tuning Twice Differentiable Homogeneous Fields In the pre-ceeding section we constructed a polynomial time algorithm based on a randomizeddiscretization scheme Our goal in this section is to illustrate how to take advantageof additional information to further improve the running time and the efficiency ofthe algorithm In order to illustrate our techniques we shall perform a more refinedanalysis in the setting of smooth and homogeneous fields and shall establish op-timality of the algorithm in a precise sense to described below Our assumptionsthroughout this section are B1ndashB2 of Section 3

Let C(sminus t) = Cov(f(s) f(t)) be the covariance function of f which we assumealso has mean zero Note that it is an immediate consequence of homogeneity anddifferentiability that partiC(0) = part3

ijkC(0) = 0We shall need the following definition

Definition 71 We call T = t1 tM sub T a θ-regular discretization of Tif and only if

mini 6=j

|ti minus tj | ge θ suptisinT

mini|ti minus t| le 2θ

Regularity ensures that points in the grid T are well separated Intuitively sincef is smooth having tight clusters of points translates to a waste of computingresources as a result of sampling highly correlated values of f Also note thatevery region containing a ball of radius 2θ has at least one representative in T Therefore T covers the domain T in an economical way One technical convenienceof θ-regularity is that for subsets A sube T that have positive Lebesgue measure (inparticular ellipsoids)

limMrarrinfin

(A cap T )M

=m(A)m(T )

where here and throughout the remainder of the section (A) denotes the cardi-nality of the set A

Let T = t1 tM be a θ-regular discretization of T and consider

X = (X1 XM )T ∆= (f (t1) f (tM ))T

We shall concentrate on estimating wM (b) = P (max1leileM Xi gt b) The next result(which we prove in Section 71) shows that if θ = εb then the relative bias is O (

radicε)

Proposition 72 Suppose f is a Gaussian random field satisfying conditionsB1 and B2 There exist c0 c1 b0 and ε0 such that for any finite εb-regulardiscretization T of T

(71) P (suptisinT

f(t) lt b| suptisinT

f(t) gt b) le c0

radicε and (T ) le c1m(T )εminusdbd

for all ε isin (0 ε0] and b gt b0

Note that the bound on the bias obtained for twice differentiable fields is muchsharper than that of the general Holder continuous fields given by (613) in Theorem

28 ADLER BLANCHET LIU

66 This is partly because the conditional distribution of the random field aroundlocal maxima is harder to describe in the Holder continuous than in the case oftwice differentiable fields In addition to the sharper description of the bias weshall also soon show in Theorem 74 that our choice of discretization is optimal in acetain sense Finally we point out that the bound of

radicε in the first term of (71) is

not optimal In fact there seems to be some room of improvement and we believethat a more careful analysis might yield a bound of the form c0ε

2We shall estimate wM (b) by using a slight variation of Algorithm 53 In partic-

ular since the Xirsquos are now identically distributed we redefine Q to be

(72) Q (X isin B) =Msum

i=1

1M

P [X isin B|Xi gt bminus 1b]

Our estimator then takes the form

(73) Lb =M times P (X1 gt bminus 1b)sumM

j=1 1 (Xj gt bminus 1b)1( max

1leileMXi gt b)

Clearly we have that EQ(Lb) = wM (b) (The reason for for subtracting the fcatorof 1b was explained in Section 61)

Algorithm 73 Given a number of replications n and an εb-regular dis-cretization T the algorithm is as follows

Step 1 Sample X(1) X(n) iid copies of X with distribution Q given by (72)Step 2 Compute and output

Ln =1n

nsum

i=1

L(i)b

where

L(i)b =

M times P (X1 gt bminus 1b)sumM

j=1 1(X(i)j gt bminus 1b)

1( max1leileM

X(i)j gt b)

Theorem 75 later guides the selection of n in order to achieve a prescribedrelative error In particular our analysis together with considerations from Section2 implies that choosing n = O

(εminus2δminus1

)suffices to achieve ε relative error with

probability at least 1minus δAlgorithm 73 improves on Algorithm 61 for Holder continuous fields in two

important ways The first aspect is that it is possible to obtain information on thesize of the relative bias of the estimator In Proposition 72 we saw that in orderto overcome bias due to discretization it suffices to take a discretization of sizeM = (T ) = Θ(bd) That this selection is also asymptotically optimal in the sensedescribed in the next result will be proven in Section 71

MONTE CARLO FOR RANDOM FIELDS 29

Theorem 74 Suppose f is a Gaussian random field satisfying conditions B1and B2 If θ isin (0 1) then as b rarrinfin

sup(T )lebθd

P (suptisinT

f(t) gt b| suptisinT

f(t) gt b) rarr 0

This result implies that the relative bias goes to 100 as b rarrinfin if one chooses adiscretization scheme of size O

(bθd

)with θ isin (0 1) Consequently d is the smallest

power of b that achieves any given bounded relative bias and so the suggestionabove of choosing M = O

(bd

)points for the discretization is in this sense optimal

The second aspect of improvement involves the variance In the case of Holdercontinuous fields the ratio of the second moment of the estimator and w (b)2 wasshown to be bounded by a quantity that is of order O(M2) In contrast in thecontext of smooth and homogeneous fields considered here the next result showsthat this ratio is bounded uniformly for b gt b0 and M = (T ) ge cbd That is thevariance remains strongly controlled

Theorem 75 Suppose f is a Gaussian random field satisfying conditions B1and B2 Then there exist constants c b0 and ε0 such that for any εb-regulardiscretization T of T we have

supbgtb0εisin[0ε0]

EQL2b

P 2 (suptisinT f (t) gt b)le sup

bgtb0εisin[0ε0]

EQL2b

P 2(sup

tisinTf (t) gt b

) le c

for some c isin (0infin)

The proof of this result is given in Section 72 The fact that the number ofreplications remains bounded in b is a consequence of the strong control on thevariance

Finally we note that the proof of Theorem 33 follows as a direct corollary of The-orem 75 together with Proposition 72 and our discussion in Section 2 Assumingthat placing each point in T takes no more than c units of computer time the totalcomplexity is according to the discussion in section Section 2 O

(nM3 + M

)=

O(εminus2δminus1M3 + M

) The contribution of the term M3 = O

(εminus6b3

)comes from

the complexity of applying Cholesky factorization and the term M = O(εminus2b)corresponds to the complexity of placing T

Remark 76 Condition B2 imposes a convexity assumption on the boundaryof T This assumption although convenient in the development of the proofs of The-orems 74 and 75 is not necessary The results can be generalized at the expenseof increasing the length and the burden in the technical development to the casein which T is a d-dimensional manifold satisfying the so-called Whitney conditions([4])

The remainder of this section is devoted to the proof of Proposition 72 Theorem74 and Theorem 75

30 ADLER BLANCHET LIU

71 Bias Control Proofs of Proposition 72 and Theorem 74 We start withsome useful lemmas for all of which we assume that Conditions B1 and B2 are sat-isfied We shall also assume that the global maximum of f over T is achieved withprobability one at a single point in T Additional conditions under which this willhappen can be found in [4] and require little more than the non-degeneracy of thejoint distribution of f and its first and second order derivatives Of these lemmasLemma 77 the proof of which we defer to Section 73 is of central importance

Lemma 77 For any a0 gt 0 there exists clowast δlowast b0 and δ0 (which depend onthe choice of a0) such that for any s isin T a isin (0 a0) δ isin (0 δ0) and b gt b0

P ( min|tminuss|ltδabminus1

f(t) lt b|f(s) = b + ab s isin L) le clowast exp(minusδlowast

δ2

)

where L is the set of local maxima of f

The next lemma is a direct corollary of Lemma 77

Lemma 78 Let tlowast be the point in T at which the global maximum of f isattained Then with the same choice of constants as in Lemma 77 for any a isin(0 a0) δ isin (0 δ0) and b gt b0 we have

P ( min|tminustlowast|ltδabminus1

f(t) lt b|f(tlowast) = b + ab) le 2clowast exp(minusδlowast

δ2

)

Proof Note that for any s isin T

P

(min

|t|leδabf(t) lt b|f(s) = b + ab s = tlowast

)

=P

(min|t|leδab f(t) lt b f(s) = b + ab s = tlowast

)

P (f(s) = b + ab s = tlowast)

le P(min|t|leδab f(t) lt b f(s) = b + ab s isin L)

P (f(s) = b + ab s = tlowast)

= P ( min|tminuss|ltδabminus1

f(t) lt b|f(s) = b + ab s isin L)P (f(s) = b + ab s isin L)P (f(s) = b + ab s = tlowast)

le clowast exp(minusδlowast

δ2

)P (f(s) = b + ab s isin L)P (f(s) = b + ab s = tlowast)

Appling this bound and the fact that

P (f(s) = b + ab s isin L)P (f(s) = b + ab s = tlowast)

rarr 1

as b rarrinfin the conclusion of the Lemma holds 2

MONTE CARLO FOR RANDOM FIELDS 31

The following lemma gives a bound on the density of suptisinT f(t) which will beused to control the size of overshoot beyond level b

Lemma 79 Let pflowast(x) be the density function of suptisinT f(t) Then there existsa constant cflowast and b0 such that

pflowast(x) le cflowastxd+1P (f(0) gt x)

for all x gt b0

Proof Recalling (13) let the continuous function pE(x) x isin R be defined bythe relationship

E(χ(t isin T f(t) ge b)) =int infin

b

pE(x) dx

where the left hand side is the expected value of the Euler-Poincare characteristicof Ab Then according to Theorem 810 in [7] there exists c and δ such that

|pE(x)minus pflowast(x)| lt cP (f(0) gt (1 + δ)x)

for all x gt 0 In addition thanks to the result of [4] which providesintinfin

bpE(x)dx in

closed form there exists c0 such that for all x gt 1

pE(x) lt c0xd+1P (f(0) gt x)

Hence there exists cflowast such that

pflowast(x) le c0xd+1P (f(0) gt x) + cP (f(0) gt (1 + δ)x) le cflowastx

d+1P (f(0) gt x)

for all x gt 1 2

The last ingredients required to provide the proof of Proposition 72 and Theorem74 are stated in the following well known Theorem adapted from Lemma 61 andTheorem 72 in [19] to the twice differentiable case

Theorem 710 There exists a constant H (depending on the covariance func-tion C such that

P (suptisinT

f(t) gt b) = (1 + o (1))Hm(T )bdP (f(0) gt b)(74)

as b rarrinfinSimilarly choose δ small enough so that [0 δ]d sub T and let ∆0 = [0 bminus1]d Then

there exists a constant H1 such that

P ( suptisin∆0

f(t) gt b) = (1 + o(1))H1P (f(0) gt b)(75)

as b rarrinfin

32 ADLER BLANCHET LIU

We now are ready to provide the proof of Proposition 72 and Theorem 74

Proof of Proposition 72 The fact that there exists c1 such that

(T ) le c1m(T )εminusdbd

is immediate from assumption B2 Therefore we proceed to provide a bound forthe relative bias We choose ε0 lt δ2

0 (δ0 is given in Lemmas 77 and 78) and notethat for any ε lt ε0

P (suptisinT

f(t) lt b| suptisinT

f(t) gt b)

le P (suptisinT

f(t) lt b + 2radic

εb| suptisinT

f(t) gt b) + P (suptisinT

f(t) lt b| suptisinT

f(t) gt b + 2radic

εb)

By (74) and Lemma 79 there exists c2 such that the first term above can boundedby

P (suptisinT

f(t) lt b + 2radic

εb| suptisinT

f(t) gt b) le c2

radicε

The second term can be bounded by

P (suptisinT

f(t) lt b| suptisinT

f(t) gt b + 2radic

εb)

le P ( sup|tminustlowast|lt2εbminus1

f(t) lt b| suptisinT

f(t) gt b + 2radic

ε)

le 2clowast exp(minusδlowastεminus1

)

Hence there exists c0 such that

P (suptisinT

f(t) lt b| suptisinT

f(t) gt b) le c2

radicε + 2clowasteminusδlowastε le c0

radicε

for all ε isin (0 ε0) 2

Proof of Theorem 74 We write θ = 1minus 3δ isin (0 1) First note that by (74)

P (supT

f(t) gt b + b2δminus1| supT

f(t) gt b) rarr 0

as b rarrinfin Let tlowast be the position of the global maximum of f in T According to theexact Slepian model in Section 73 and an argument similar to the proof of Lemma78

(76) P ( sup|tminustlowast|gtb2δminus1

f(t) gt b|b lt f(tlowast) le b + b2δminus1) rarr 0

as b rarrinfin Consequently

P ( sup|tminustlowast|gtb2δminus1

f(t) gt b| supT

f(t) gt b) rarr 0

MONTE CARLO FOR RANDOM FIELDS 33

LetB(T b2δminus1) = cup

tisinTB(t b2δminus1)

We have

P (supT

f(t) gt b| supT

f(t) gt b)

le P ( sup|tminustlowast|gtb2δminus1

f(t) gt b| supT

f(t) gt b)

+ P (supT

f(t) gt b sup|tminustlowast|gtb2δminus1

f(t) le b| supT

f(t) gt b)

le o(1) + P (tlowast isin B(T b2δminus1)| supT

f(t) gt b)

le o(1) + P ( supB(T b2δminus1)

f(t) gt b| supT

f(t) gt b)

Since (T ) le b(1minus3δ)d we can find a finite set T prime = tprime1 tprimel sub T and let ∆k =tprimek + [0 bminus1] such that l = O(b(1minusδ)d) and B(T b2δminus1) sub cupl

k=1∆k The choice ofl only depends on (T ) not the particular distribution of T Therefore applying(75)

sup(Tlebθd)

P ( supB(T b2δminus1)

f(t) gt b) le O(b(1minusδ)d)P (f(0) gt b)

This together with (74) yields

sup(Tlebθd)

P ( supB(T b2δminus1)

f(t) gt b| supT

f(t) gt b) le O(bminusδd) = o(1)

for b ge b0 which clearly implies the statement of the result 2

72 Variance Control Proof of Theorem 75 We proceed directly to the proofof Theorem 75

34 ADLER BLANCHET LIU

Proof of Theorem 75 Note that

EQL2b

P 2 (suptisinT f (t) gt b)

=E(Lb)

P 2 (suptisinT f (t) gt b)

=E

(MtimesP (X1gtbminus1b)sumn

j=11(Xjgtbminus 1

b )maxj Xj gt b

)

P 2 (suptisinT f (t) gt b)

=E

(MtimesP (X1gtbminus1b)sumn

j=11(Xjgtbminus 1

b )maxj Xj gt b suptisinT f (t) gt b

)

P 2 (suptisinT f (t) gt b)

=E

(MP (X1gtbminus1b)1(maxj Xjgtb)sumn

j=11(Xjgtbminus1b)

∣∣∣∣ suptisinT f (t) gt b

)

P (suptisinT f (t) gt b)

= E

(M1 (maxj Xj gt b)sumnj=1 1 (xj gt bminus 1b)

∣∣∣∣∣ suptisinT

f (t) gt b

)P (X1 gt bminus 1b)

P (suptisinT f (t) gt b)

The remainder of the proof involves showing that the last conditional expectationhere is of order O(bd) This together with (74) will yield the result Note that forany A (b ε) such that A (b ε) = Θ(M) uniformly over b and ε we can write

E

(M times 1 (maxj Xj gt b)sumM

j=1 1(Xj gt bminus 1

b

)∣∣∣∣∣ sup

tisinTf (t) gt b

)(77)

le E

M times 1

(sumMj=1 1 (Xj gt bminus 1b) ge M

A(bε)

)

sumMj=1 1

(xj gt bminus 1

b

)∣∣∣∣∣∣suptisinT

f (t) gt b

+ E

M times 1

(1 le sumM

j=1 1 (Xj gt bminus 1b) lt MA(bε)

)

sumMj=1 1

(xj gt bminus 1

b

)∣∣∣∣∣∣suptisinT

f (t) gt b

We shall select A(b ε) appropriately in order to bound the expectations above ByLemma 78 for any 4ε le δ le δ0 there exist constants cprime and cprimeprime isin (0infin) such that

clowast exp(minusδlowast

δ2

)ge P

(min

|tminustlowast|leδbf (t) lt bminus 1b

∣∣∣∣ suptisinT

f (t) gt b

)

ge P

Msum

j=1

1 (Xj gt bminus 1b) le cprimeδdεd

∣∣∣∣∣∣suptisinT

f (t) gt b

ge P

(MsumM

j=1 1(Xj gt bminus 1b)ge bdcprimeprime

δd

∣∣∣∣∣ suptisinT

f (t) gt b

)(78)

MONTE CARLO FOR RANDOM FIELDS 35

The first inequality is an application of Lemma 78 The second inequality is dueto the fact that for any ball B of radius 4ε or larger (T cap B) ge cprimedεminusd for somecprime gt 0 Inequality (78) implies that for all x such that bdcprimeprimeδd

0 lt x lt bdcprimeprime[4dεd]there exists δlowastlowast gt 0 such that

P

(MsumM

j=1 1(Xj gt bminus 1

b

)bdge x

∣∣∣∣∣ suptisinT

f (t) gt b

)le clowast exp

(minusδlowastlowastx2d

)

Now let A (b ε) = bdcprimeprime(4dεd) and observe that by the second result in (71) wehave A (b ε) = Θ (M) and moreover that there exists c3 such that the first termon the right hand side of (77) is bounded by

(79) E

M times 1

(sumMj=1 1 (Xj gt bminus 1b) ge M

A(bε)

)

sumMj=1 1 (Xj gt bminus 1b)

∣∣∣∣∣∣supT

f (t) gt b

le c3b

d

Now we turn to the second term on the right hand side of (77) We use the factthat MA(b ε) le cprimeprimeprime isin (0infin) (uniformly as b rarrinfin and ε rarr 0) There exist c4 andc5 such that for ε le δ0c4

E

M1

(1 le sumM

j=1 1(Xj gt bminus 1

b

)lt M

A(bε)

)

sumMj=1 1

(Xj gt bminus 1

b

)∣∣∣∣∣∣supT

f (t) gt b

(710)

le MP

nsum

j=1

1(

Xj gt bminus 1b

)lt cprimeprimeprime

∣∣∣∣∣∣supT

f (t) gt b

le MP

(min

|tminustlowast|lec4εbf (t) lt bminus 1b

∣∣∣∣ supT

f (t) gt b

)

le c1m(T )bdεminusd exp(minus δlowast

c24ε

2

)le c5b

d

The second inequality holds from the fact that ifsumM

j=1 1(Xj gt bminus 1

b

)is less than

cprimeprimeprime then the minimum of f in a ball around the local maximum and of radius c4εb

must be less than bminus 1b Otherwise there are more than cprimeprimeprime elements of T insidesuch a ball The last inequality is due to Lemma 78 and Theorem 72

Putting (79) and (710) together we obtain for all εb -regular discretizationswith ε lt ε0 = min(14 1c4)δ0

EQL2b

P (suptisinT f (t) gt b)le E

(M1 (maxj Xj gt b)sumM

j=1 1(Xj gt bminus 1

b

)∣∣∣∣∣ sup

tisinTf (t) gt b

)P (X1 gt bminus 1b)

P (suptisinT f (t) gt b)

le (c3 + c5)bdP (X1 gt bminus 1b)P (suptisinT f (t) gt b)

Applying now (74) and Proposition 72 we have that

P (suptisinT

f(t) gt b) ltP (sup

tisinTf(t) gt b)

1minus c0radic

ε

36 ADLER BLANCHET LIU

we have

supbgtb0εisin[0ε0]

EQL2b

P(sup

tisinTf (t) gt b

) lt infin

as required 2

73 A final proof Lemma 77 Without loss of generality we assume that thefield has mean zero and unit variance Before getting into the details of the proofof Lemma 77 we need a few additional lemmas for which we adopt the followingnotation Let Ci and Cij be the first and second order derivatives of C and definethe vectors

micro1(t) = (minusC1(t) minusCd(t))micro2(t) = vech ((Cij(t) i = 1 d j = i d))

Let f prime(0) and f primeprime(0) be the gradient and vector of second order derivatives of f at 0where f primeprime(0) is arranged in the same order as micro2(0) Furthermore let micro02 = microgt20 bea vector of second order spectral moments and micro22 a matrix of fourth order spectralmoments The vectors micro02 and micro22 are arranged so that

1 0 micro02

0 Λ 0micro20 0 micro22

is the covariance matrix of (f(0) f prime(0) f primeprime(0)) where Λ = (minusCij(0)) It then followsthat

micro2middot0 = micro22 minus micro20micro02

be the conditional variance of f primeprime(0) given f(0) The following lemma given in [5]provides a stochastic representation of the f given that it has a local maxima atlevel u at the origin

Lemma 711 Given that f has a local maximum with height u at zero (aninterior point of T ) the conditional field is equal in distribution to

(711) fu(t) uC (t)minusWuβgt (t) + g (t)

g(t) is a centered Gaussian random field with covariance function

γ(s t) = C(sminus t)minus (C(s) micro2(s))(

1 micro02

micro20 micro22

)minus1 (C(t)microgt2 (t)

)minus micro1(s)Λminus1microgt1 (t)

and Wu is a d(d+1)2 random vector independent of g(t) with density function

(712) ψu (w) prop |det (rlowast(w)minus uΛ) | exp(minus1

2wgtmicrominus1

2middot0w)1 (rlowast(w)minus uΛ isin N )

MONTE CARLO FOR RANDOM FIELDS 37

where rlowast(w) is a dtimesd symmetric matrix whose upper triangular elements consist ofthe components of w The set of negative definite matrices is denoted by N Finallyβ(t) is defined by

(α (t) β (t)) = (C (t) micro2 (t))(

1 micro02

micro20 micro22

)minus1

The following two technical lemmas which we shall prove after completing theproof of Lemma 77 provide bounds for the last two terms of (711)

Lemma 712 Using the notation in (711) there exist δ0 ε1 c1 and b0 suchthat for any u gt b gt b0 and δ isin (0 δ0)

P

(sup

|t|leδab

∣∣Wuβgt (t)∣∣ gt

a

4b

)le c1 exp

(minusε1b

2

δ4

)

Lemma 713 There exist c δ and δ0 such that for any δ isin (0 δ0)

P

(max|t|leδab

|g(t)| gt a

4b

)le c exp

(minus δ

δ2

)

Proof of Lemma 77 Using the notation of Lemma 711 given any s isin L forwhich f(s) = b we have that the corresponding conditional distribution of s is thatof fb(middot minus s) Consequently it suffices to show that

P

(min

|t|leδabfb(t) lt bminus ab

)le clowast exp

(minusδlowast

δ2

)

We consider firstly the case for which the local maximum is in the interior of T Then by the Slepian model (711)

fb (t) = bC (t)minusWbβgt (t) + g (t)

We study the three terms of the Slepian model individually Since

C(t) = 1minus tgtΛt + o(|t|2

)

there exists a ε0 such thatbC (t) ge bminus a

4b

38 ADLER BLANCHET LIU

for all |t| lt ε0radic

ab According to Lemmas 712 and 713 for δ lt min(ε0radic

a0 δ0)

P

(min

|t|leδabfu(t) lt bminus ab

)

le P

(max|t|ltδab

|g (t)| gt a

4b

)+ P

(sup

|t|leδab

∣∣Wbβgt(t)

∣∣ gta

4b

)

le c exp

δ2

)+ c1 exp

(minusε1b

2

δ4

)

le clowast exp(minusδlowast

δ2

)

for some clowast and δlowastNow consider the case for which the local maximum is in the (dminus1)-dimensional

boundary of T Due to convexity of T and without loss of generality we assumethat the tangent space of partT is generated by partpartt2 partparttd the local maximumis located at the origin and T is a subset of the positive half-plane t1 ge 0 Thatthese arguments to not involve a loss of generality follows from the arguments onpages 291ndash192 of [4] which rely on the assumed stationarity of f (for translations)and the fact that rotations while changing the distributions will not affect theprobabilities that we are currently computing)

For the origin positioned as just described to be a local maximum it is necessaryand sufficient that the gradient of f restricted to partT is the zero vector the Hessianmatrix restricted to partT is negative definite and part1f(0) le 0 Applying a version ofLemma 711 for this case conditional on f(0) = u and 0 being a local maximumthe field is equal in distribution to

(713) uC(t)minus Wuβgt(t) + micro1(t)Λminus1 (Z 0 0)T + g(t)

where Z le 0 corresponds to part1f (0) and it follows a truncated (conditional on thenegative axis) Gaussian random variable with mean zero and a variance parameterwhich is computed as the conditional variance of part1f (0) given (part2f (0) partdf (0))The vector Wu is a (d(d + 1)2)-dimensional random vector with density function

ψu(w) prop |det(rlowast(w)minus uΛ)| exp(minus12wgtmicrominus1

2middot0w)1(wlowast minus uΛ isin N )

where Λ is the second spectral moment of f restricted to partT and rlowast(w) is the(dminus 1)times (dminus 1) symmetric matrix whose upper triangular elements consist of thecomponents of w In the representation (713) the vector Wu and Z are independentAs in the proof of Lemma 712 one can show that a similar bound holds albeitwith with different constants Thus since micro1 (t) = O (t) there exist cprimeprime and δprimeprime suchthat the third term in (713) can be bounded by

P [ max|t|leδab

∣∣∣micro1 (t) Λminus1 (Z 0 0)T∣∣∣ ge a(4b)] le cprimeprime exp

(minusδprimeprime

δ2

)

MONTE CARLO FOR RANDOM FIELDS 39

Consequently we can also find clowast and δlowast such that the conclusion holds and weare done 2

We complete the paper with the proofs of Lemmas 712 and 713

Proof of Lemma 712 It suffices to prove the Lemma for the case a = 1 Since

fu (0) = u = uminusWuβgt (0) + g (0)

and Wu and g are independent β (0) = 0 Furthermore since C prime (t) = O (t) andmicroprime2 (t) = O (t) there exists a c0 such that |β(t)| le c0|t|2 In addition Wu has densityfunction proportional to

ψu(w) prop |det(rlowast(w)minus uΛ)| exp(minus12wgtmicrominus1

2middot0w)1 (wlowast minus uΛ isin N )

Note that det(rlowast(w) minus uΛ) is expressible as a polynomial in w and u and thereexists some ε0 and c such that

∣∣∣∣det (rlowast(w)minus uΛ)

det (minusuΛ)

∣∣∣∣ le c

if |w| le ε0u Hence there exist ε2 c2 gt 0 such that

ψu (w) le ψ (w) = c2 exp(minus1

2ε2w

gtmicrominus12middot0w

)

for all u ge 1 The right hand side here is proportional to a multivariate Gaussiandensity Thus

P (|Wu| gt x) =int

|w|gtx

ψu(w)dw leint

|w|gtx

ψu (w) dw = c3P (|W | gt x)

where W is a multivariate Gaussian random variable with density function propor-tional to ψ Therefore by choosing ε1 and c1 appropriately we have

P

(sup|t|leδb

∣∣Wuβgt∣∣ gt

14b

)le P

(|Wu| gt b

c20δ

2

)le c1 exp

(minusε1b

2

δ4

)

for all u ge b 2

Proof of Lemma 713 Once again it suffices to prove the Lemma for the casea = 1 Since

fb(0) = b = bminusWbβgt(0) + g(0)

the covariance function (γ (s t) s t isin T ) of the centered field g satisfies γ(0 0) = 0It is also easy to check that

partsγ(s t) = O(|s|+ |t|) parttγ(s t) = O(|s|+ |t|)

40 ADLER BLANCHET LIU

Consequently there exists a constant cγ isin (0infin) for which

γ(s t) le cγ(|s|2 + |t|2) γ(s s) le cγ |s|2

We need to control the tail probability of sup|t|leδb |g(t)| For this it is useful tointroduce the following scaling Define

gδ (t) =b

δg

(δt

b

)

Then sup|t|leδb g (t) ge 14b if and only if sup|t|le1 gδ (t) ge 1

4δ Let

σδ (s t) = E (gδ (s) gδ (t))

ThensupsisinR

σδ(s s) le cγ

Because γ(s t) is at least twice differentiable applying a Taylor expansion we easilysee that the canonical metric dg corresponding to gδ (s) (cf Theorem 67) can bebounde as follows

d2g(s t) = E (gδ (s)minus gδ (t))2 =

b2

δ2

(δs

bδs

b

)+ γ

(δt

bδt

b

)minus 2γ

(δs

bδt

b

)]

le c |sminus t|2

for some constant c isin (0infin) Therefore the entropy of gδ evaluated at δ is boundedby Kδminusd for any δ gt 0 and with an appropriate choice of K gt 0 Therefore for allδ lt δ0

P ( sup|t|leδb

|g(t)| ge 14b

) = P ( sup|t|le1

gδ (t) ge 14δ

) le cdδminusdminusη exp(minus 1

16cγδ2)

for some constant cd and η gt 0 The last inequality is a direct application of The-orem 411 of [4] The conclusion of the lemma follows immediately by choosing c

and δ appropriately 2

REFERENCES

[1] RJ Adler K Bartz and S Kou Estimating curvatures in the Gaussian kinematic formulaIn preparation

[2] RJ Adler J Blanchet and J Liu Efficient simulation for tail probabilities of Gaussianrandom fields In WSC rsquo08 Proceedings of the 40th Conference on Winter Simulationpages 328ndash336 Winter Simulation Conference 2008

[3] RJ Adler P Muller and Rozovskii B editors Stochastic Modelling in Physical Oceanog-raphy Birkhauser Boston 1996

[4] RJ Adler and JE Taylor Random Fields and Geometry Springer New York USA 2007

[5] RJ Adler JE Taylor and K Worsley Applications of Random Fields and GeometryFoundations and Case Studies httpwebeetechnionacilpeopleadlerhrfpdf

MONTE CARLO FOR RANDOM FIELDS 41

[6] S Asmussen and P Glynn Stochastic Simulation Algorithms and Analysis Springer-VerlagNew York NY USA 2008

[7] JM Azais and M Wschebor Level Sets and Extrema of Random Processes and FieldsWiley New York NY 2009

[8] JM Bardeen JR Bond N Kaiser and AS Szalay The statistics of peaks of Gaussianrandom fields The Astrophysical Journal 30415ndash61 1986

[9] SM Berman Sojourns and Extremes of Stochastic Processes The Wadsworth ampBrooksCole StatisticsProbability Series Wadsworth amp BrooksCole Advanced Books ampSoftware Pacific Grove CA 1992

[10] J Blanchet Efficient importance sampling for binary contingency tables Ann of ApplProb 19949ndash982 2009

[11] C Borell The Brunn-Minkowski inequality in Gauss space Invent Math 30205ndash216 1975

[12] J Bucklew Introduction to Rare-Event Simulation Springer-Verlag New York NY 2004

[13] MR Dennis Nodal densities of planar Gaussian random waves Eur Phys J 145191ndash2102007

[14] RM Dudley Sample functions of the Gaussian process Ann Probab 166ndash103 1973

[15] KJ Friston KJ Worsley RSJ Frackowiak JC Mazziotta and AC Evans Assessing thesignificance of focal activations using their spatial extent Human Brain Mapping 1214ndash2201994

[16] MR Leadbetter G Lindgren and H Rootzen Extremes and Related Properties of RandomSequences and Processes Springer-Verlag New York 1983

[17] M Mitzenmacher and E Upfal Probability and Computing Randomized Algorithms andProbabilistic Analysis Cambridge University Press 2005

[18] W Niemiro and P Pokarowski Fixed precision MCMC estimation by median of productsof averages J Appl Probab 46309ndash329 2009

[19] V Piterbarg Asymptotic Methods in the Theory of Gaussian Processes American Mathe-matical Society USA 1995

[20] SF Shandarin Testing non-Gaussianity in cosmic microwave background maps by morpho-logical statistics Mon Not R Astr Soc 331865ndash874 2002

[21] SF Shandarin HA Feldman Y Xu and M Tegmark Morphological measures of non-gaussianity in cosmic microwave background maps Astrophys J Suppl 1411ndash11 2002

[22] L Shepp and H Landau On the supremum of a Gaussian process Sankhya A 32369ndash378December 1970

[23] JE Taylor A Takemura and RJ Adler Validity of the expected Euler characteristicheuristic Ann Probab 33(4)1362ndash1396 2005

[24] JE Taylor and KJ Worsley Detecting sparse signals in random fields with an applicationto brain mapping J Amer Statist Assoc 102(479)913ndash928 2007

[25] J Traub G Wasilokowski and H Wozniakowski Information-Based Complexity AcademicPress New York NY 1988

[26] BS Tsirelson IA Ibragimov and VN Sudakov Norms of Gaussian sample functionsProceedings of the Third Japan-USSR Symposium on Probability Theory (Tashkent 1975)55020ndash41 1976

[27] V S Tsirelrsquoson The density of the maximum of a gaussian process Theory of Probab Appl20847ndash856 1975

[28] H Wozniakowski Computational complexity of continuous problems 1996 Technical Re-port

Robert J AdlerElectrical EngineeringTechnion Haifa Israel 32000E-mail roberteetechnionacilURL webeetechnionacilpeopleadler

Jose H BlanchetIndustrial Engineering amp Operations ResearchColumbia Uninversity340 SW Mudd Building500 W 120 StreetNew York New York 10027E-mail joseblanchetcolumbiaeduURL wwwieorcolumbiaedufac-biosblanchetfacultyhtml

42 ADLER BLANCHET LIU

Jingchen LiuDepartment of StatisticsColumbia University1255 Amsterdam Avenue Room 1030New York NY 10027E-mail jcliustatcolumbiaeduURL statcolumbiaedu~jcliu

Page 6: EFFICIENT MONTE CARLO FOR HIGH EXCURSIONS OF … › ~jblanche › papers › GRF1.pdf · 1. Introduction. This paper centers on the design and analysis of e–cient Monte Carlo techniques

6 ADLER BLANCHET LIU

Consequently taking n = Θ(εminus2δminus1 |log P (Bb)|q

)suffices to give an estimator with

at most ε relative error and 1 minus δ confidence and the total computational cost isΘ

(εminus2minusq0δminus1 |log P (Bb)|q+q1

)

We shall measure computational cost in terms of function evaluations such as asingle addition a multiplication a comparison the generation of a single uniformrandom variable on T the generation of a single standard Gaussian random variableand the evaluation of Φ(x) for fixed x ge 0 where

Φ(x) = 1minusΨ(x) =1radic2π

int x

minusinfineminuss22 ds

All of these function evaluations are assumed to cost at most a fixed amount c ofcomputer time Moreover we shall also assume that first and second order momentcharacteristics of the field such as micro (t) = Ef (t) and C (s t) = Cov (f (t) f (s))can be computed in at most c units of computer time for each s t isin T We note thatsimilar models of computation are often used in the complexity theory of continuousproblems see [25]

The previous discussion motivates the next definition which has its roots in thegeneral theory of computation in both continuous and discrete settings [17 25]In particular completely analogous notions in the setting of complexity theory ofcontinuous problems lead to the notion of ldquotractabilityrdquo of a computational problem[28]

Definition 22 A Monte Carlo procedure is said to be a fully polynomialrandomized approximation scheme (FPRAS) for estimating P (Bb) if for someq q1 q2 isin [0infin) it outputs an averaged estimator that is guaranteed to have at mostε gt 0 relative error with confidence at least 1minusδ isin (0 1) in Θ(εminusq1δminusq2 | log P (Bb) |q)function evaluations

The terminology adopted namely FPRAS is borrowed from the complexitytheory of randomized algorithms for counting [17] Many counting problems canbe expressed as rare event estimation problems In such cases it typically occursthat the previous definition (expressed in terms of a rare event probability) coincidesprecisely with the standard counting definition of a FPRAS (in which there is noreference to any rare event to estimate) This connection is noted for instance in[10] Our terminology is motivated precisely by this connection

By letting Bb = flowast gt b the goal in this paper is to design a class of fully poly-nomial randomized approximation schemes that are applicable to a general class ofGaussian random fields In turn since our Monte Carlo estimators will be based onimportance sampling it turns out that we shall also be able to straightforwardlyconstruct FPRASrsquos to estimate quantities such as E[Γ(f)| suptisinT f(t) gt b] for asuitable class of functionals Γ for which Γ (f) can be computed with an error ofat most ε with a cost that is polynomial as function of εminus1 We shall discuss thisobservation in Section 4 which deals with properties of importance sampling

MONTE CARLO FOR RANDOM FIELDS 7

3 Main Results In order to state and discuss our main results we need somenotation For each s t isin T define

micro(t) = E(f(t)) C (s t) = Cov(f (s) f (t)) σ2(t) = C(t t) gt 0 r(s t) =C(s t)

σ(s)σ(t)

Moreover given x isin Rd and β gt 0 we write |x| =sumd

j=1 |xj | where xj is the j-thcomponent of x We shall assume that for each fixed s t isin T both micro (t) and C (s t)can be evaluated in at most c units of computing time

Our first result shows that under modest continuity conditions on micro σ and r itis possible to construct an explicit FPRAS for w(b) under the following regularityconditions

A1 The field f is almost surely continuous on T A2 For some δ gt 0 and |sminus t| lt δ the mean and variance functions satisfies

|σ (t)minus σ (s)|+ |micro (t)minus micro (s)| le κH |sminus t|β

A3 For some δ gt 0 and |s minus sprime| lt δ |t minus tprime| lt δ the correlation function of fsatisfies

|r (t s)minus r (tprime sprime)| le κH [|tminus tprime|β + |sminus sprime|β ]

A4 0 isin T There exist κ0 and ωd such that for any t isin T and ε small enough

m(B(t ε) cap T ) ge κ0εdωd

where m is the Lebesgue measure B(t ε) = s |tminus s| le ε and ωd =m (B(0 1))

The assumption that 0 isin T is of no real consequence and is adopted only fornotational convenience

Theorem 31 Suppose that f T rarr R is a Gaussian random field satisfyingConditions A1ndashA4 above Then Algorithm 61 provides a FPRAS for w (b)

The polynomial rate of the intrinsic complexity bound inherent in this result isdiscussed in Section 6 along with similar rates in results to follow The conditions ofthe previous theorem are weak and hold for virtually all applied settings involvingcontinuous Gaussian fields on compact sets

Not surprisingly the complexity bounds of our algorithms can be improved uponunder additional assumptions on f For example in the case of finite fields (ie whenT is finite) with a non-singular covariance matrix we can show that the complexityof the algorithm is actually bounded as b infin We summarize this in the nextresult whose proof which is given in Section 5 is useful for understanding themain ideas behind the general procedure

Theorem 32 Suppose that T is a finite set and f has a non-singular covari-ance matrix over T times T Then Algorithm 53 provides a FPRAS with q = 0

8 ADLER BLANCHET LIU

As we indicated above the strategy behind the discrete case provides the basisfor the general case In the general situation the underlying idea is to discretize thefield with an appropriate sampling (discretization) rule that depends on the levelb and the continuity characteristics of the field The number of sampling pointsgrows as b increases and the complexity of the algorithm is controlled by findinga good sampling rule There is a trade off between the number of points sampledwhich has a direct impact on the complexity of the algorithm and the fidelityof the discrete approximation to the continuous field Naturally in the presence ofenough smoothness and regularity more information can be obtained with the samesample size This point is illustrated in the next result Theorem 33 which considerssmooth homogeneous fields Note that in addition to controlling the error inducedby discretizing the field the variance is strongly controlled and the discretizationrule is optimal in a sense explained in Section 7 For Theorem 33 we require thefollowing additional regularity conditions

B1 f is homogeneous and almost surely twice continuously differentiableB2 0 isin T sub Rd is a d-dimensional convex set with nonempty interior Denoting

its boundary by partT assume that partT is a (dminus1)-dimensional manifold withoutboundary For any t isin T assume that there exists κ0 gt 0 such that

m(B(t ε) cap T ) ge κ0εd

for any ε lt 1 where m is Lebesgue measure

Theorem 33 Let f satisfy conditions B1ndashB2 Then Algorithm 73 provides aFPRAS Moreover the underlying estimator is strongly efficient and there exists adiscretization scheme for f which is optimal in the sense of Theorem 74

4 Importance Sampling Importance sampling is based on the basic iden-tity for fixed measurable B

(41) P (B) =int

1 (ω isin B) dP (ω) =int

1 (ω isin B)dP

dQ(ω) dQ (ω)

where we assume that the probability measure Q is such that Q (middot capB) is abso-lutely continuous with respect to the measure P (middot capB) If we use EQ to denoteexpectation under Q then (41) trivially yields that the random variable

L (ω) = 1 (ω isin B)dP

dQ(ω)

is an unbiased estimator for P (B) gt 0 under the measure Q or symbolicallyEQL = P (B)

An averaged importance sampling estimator based on the measure Q which isoften referred as an importance sampling distribution or a change-of-measure isobtained by simulating n iid copies L(1) L(n) of L under Q and computing theempirical average Ln = (L(1) + +L(n))n A central issue is that of selecting Q inorder to minimize the variance of Ln It is easy to verify that if Qlowast(middot) = P (middot|B) =

MONTE CARLO FOR RANDOM FIELDS 9

P (middot capB)P (B) then the corresponding estimator has zero variance However Qlowast isclearly a change of measure that is of no practical value since P (B) ndash the quantitythat we are attempting to evaluate in the first place ndash is unknown Neverthelesswhen constructing a good importance sampling distribution for a family of setsBb b ge b0 for which 0 lt P (Bb) rarr 0 as b rarrinfin it is often useful to analyze theasymptotic behavior of Qlowast as P (Bb) rarr 0 in order to guide the construction of agood Q

We now describe briefly how an efficient importance sampling estimator forP (Bb) can also be used to estimate a large class of conditional expectations givenBb Suppose that a single replication of the corresponding importance samplingestimator

Lb∆= 1 (ω isin Bb) dPdQ

can be generated in O (log |P (Bb)|q0) function evaluations for some q0 gt 0 andthat

V ar (Lb) = O([P (Bb)]2 log |P (Bb)|q0

)

These assumptions imply that by taking the average of iid replications of Lb weobtain a FPRAS

Fix β isin (0infin) and let X (β q) be the class of random variables X satisfying0 le X le β with

E[X|Bb] = Ω[1 log(P (Bb))q](42)

Then by noting that

(43)EQ (XLb)EQ (Lb)

= E[X|Bb] =E[X Bb]P (Bb)

it follows easily that a FPRAS can be obtained by constructing the natural estima-tor for E[X|Bb] ie the ratio of the corresponding averaged importance samplingestimators suggested by the ratio in the left of (43) Of course when X is difficultto simulate exactly one must assume the bias E[X Bb] can be reduced in poly-nomial time The estimator is naturally biased but the discussion on FPRAS onbiased estimators given in Section 2 can be directly applied

In the context of Gaussian random fields we have that Bb = flowast gt b andone is very often interested in random variables X of the form X = Γ (f) whereΓ C (T ) rarr R and C(T ) denotes the space of continuous functions on T EndowingC(T ) with the uniform topology consider functions Γ that are non-negative andbounded by a positive constant An archetypical example is given by the volumeof (conditioned) high level excursion sets with β = m(T ) is known to satisfy (42)However there are many other examples of X (β q) with β = m(T ) which satisfy(42) for a suitable q depending on the regularity properties of the field In fact ifthe mean and covariance properties of f are Holder continuous then using similartechniques as those given in the arguements of Section 6 it is not difficult to see

10 ADLER BLANCHET LIU

that q can be estimated In turn we have that a FPRAS based importance sam-pling algorithm for w (b) would typically also yield a polynomial time algorithm forfunctional characteristics of the conditional field given high level excursions Sincethis is a very important and novel application we devote the remainder of thispaper to the development of efficient importance sampling algorithms for w(b)

5 The Basic Strategy Finite Fields In this section we develop our mainideas in the setting in which T is a finite set of the form T = t1 tM Toemphasize the discrete nature of our algorithms in this section we write Xi = f (ti)for 1 M and set X = (X1 XM ) This section is mainly of an expositorynature since much of it has already appeared in [2] Nevertheless it is includedhere as a useful guide to the intuition behind the algorithms for the continuouscase

We have already noted that in order to design an efficient importance samplingestimator for w (b) = P (max1leileM Xi gt b) it is useful to study the asymptoticconditional distribution of X given that max1leileM Xi gt b Thus we begin withsome basic large deviation results

Proposition 51 For any set of random variables X1 XM

max1leileM

P (Xi gt b) le P

(max

1leileMXi gt b

)le

Msum

j=1

P (Xj gt b)

Moreover if the Xj are mean zero Gaussian and the correlation between Xi andXj is strictly less than 1 then

P (Xi gt bXj gt b) = o (max[P (Xi gt b) P (Xj gt b)])

Thus if the associated covariance matrix of X is non-singular

w (b) = (1 + o (1))Msum

j=1

P (Xj gt b)

Proof The lower bound in the first display is trivial and the upper bound followsby the union bound The second display follows easily by working with the jointdensity of a bivariate Gaussian distribution (eg [9 16]) and the third claim is adirect consequence of the inclusion-exclusion principle 2

As noted above Qlowast corresponds to the conditional distribution of X given thatXlowast max1leileM Xi gt b It follows from Proposition 51 that conditional on Xlowast gtb the probability that two or more Xj exceed b is negligible Moreover it alsofollows that

P (Xi = Xlowast|Xlowast gt b) = (1 + o (1))P (Xi gt b)sumM

j=1 P (Xj gt b)

The following corollary now follows as an easy consequence of these observations

MONTE CARLO FOR RANDOM FIELDS 11

Corollary 52dTV (QlowastQ) rarr 0

as b rarrinfin where dTV denotes the total variation norm and Q is defined for BorelB sub RM as

Q (X isin B) =Msum

i=1

p (i b)P [X isin B|Xi gt b]

where

p (i b) =P (Xi gt b)sumM

j=1 P (Xj gt b)

Proof Pick an arbitrary Borel B Then we have that

Qlowast (X isin B) =P [X isin B max1leileM Xi gt b]

w (b)le

Msum

i=1

P [X isin B Xi gt b]w (b)

=Msum

i=1

P [X isin B|Xi gt b]p (i b)

(1 + o (1))

The above which follows from the union bound and the last part of Proposition51 combined with the definition of Q yields that for each ε gt 0 there exists b0

(independent of B) such that for all b ge b0

Qlowast (X isin B) le Q (X isin B) (1minus ε)

The lower bound follows similarly using the inclusion-exclusion principle and thesecond part of Proposition 51 2

Corollary 52 provides support for choosing Q as an importance sampling dis-tribution Of course we still have to verify that the corresponding algorithm is aFPRAS The importance sampling estimator induced by Q takes the form

(51) Lb =dP

dQ=

sumMj=1 P (Xj gt b)

sumMj=1 1 (Xj gt b)

Note that under Q we have that Xlowast gt b almost surely so the denominator in theexpression for Lb is at least 1 Therefore we have that

EQL2b le

Msum

j=1

P (Xj gt b)

2

and by virtue of Proposition 51 we conclude (using V arQ to denote the varianceunder Q) that

V arQ (Lb)P (Xlowast gt b)2

rarr 0

12 ADLER BLANCHET LIU

as b rarrinfin In particular it follows that Lb is strongly efficientOur proposed algorithm can now be summarized as follows

Algorithm 53 There are two steps in the algorithm

Step 1 Simulate n iid copies X(1) X(n) of X from the distribution QStep 2 Compute and output

Ln =1n

nsum

i=1

L(i)b

where L(i)b =

sumMj=1 P (X(i)

j gt b)sumM

j=1 1(X(i)j gt b)

Since the generation of Li under Q takes O(M3

)function evaluations we con-

clude based on the analysis given in Section 2 that Algorithm 53 is a FPRASwith q = 0 This implies Theorem 32 as promised

6 A FPRAS for Holder Continuous Gaussian Fields In this sectionwe shall describe the algorithm and the analysis behind Theorem 31 Throughoutthe section unless stated otherwise we assume conditions A1ndashA4 of Section 3

There are two issues related to the complexity analysis Firstly since f is assumedcontinuous the entire field cannot be generated in a (discrete) computer and so thealgorithm used in the discrete case needs adaptation Once this is done we need tocarry out an appropriate variance analysis

Developing an estimator directly applicable to the continuous field will be car-ried out in Section 61 This construction will not only be useful when studyingthe performance of a suitable discretization but will also help to explain some ofthe features of our discrete construction Then in Section 62 we introduce a dis-cretization approach and study the bias caused by the discretization In additionwe provide bounds on the variance of this discrete importance sampling estimator

61 A Continuous Estimator We start with a change of measure motivatedby the discrete case in Section 5 A natural approach is to consider an importancesampling strategy analogous to that of Algorithm 53 The continuous adaptationinvolves first sampling τb according to the probability measure

(61) P (τb isin middot) =E[m (Ab cap middot)]

E[m (Ab)]

where Ab = t isin T f (t) gt b The idea of introducing τb in the continuous settingis not necessarily to locate the point at which the maximum is achieved as was thesituation in the discrete case Rather τb will be used to find a random point whichhas a reasonable probability of being in the excursion set Ab (This probability willtend to be higher if f is non-homogenous) This relaxation will prove useful in theanalysis of the algorithm Note that τb with the distribution indicated in equation(61) has a density function (with respect to Lebesgue measure) given by

hb(t) =P (f(t) gt b)E[m(Ab)]

MONTE CARLO FOR RANDOM FIELDS 13

and that we also can write

E[m (Ab)] = E

int

T

1 (f (t) gt b) dt =int

T

P (f (t) gt b) dt = m (T ) P (f (U) gt b)

where U is uniformly distributed over T Once τb is generated the natural continuous adaptation corresponding to the

strategy described by Algorithm 53 proceeds by sampling f conditional on f(τb) gtb Note that if we use Q to denote the change-of-measure induced by such a con-tinuous sampling strategy then the corresponding importance sampling estimatortakes the form

Lb =dP

dQ=

E[m (Ab)]m (Ab)

The second moment of the estimator then satisfies

EQ[(Lb)2] = E(LbAb 6= empty

)= E[m(Ab)]P (flowast gt b)E[m(Ab)minus1|Ab 6= empty](62)

Unfortunately it is easy to construct examples for which EQ[(Lb

)2] is infiniteConsequently the above construction needs to be modified slightly

Elementary extreme value theory considerations for the maximum of finitelymany Gaussian random variables and their analogues for continuous fields givethat the overshoot of f over a given level b will be of order Θ(1b) Thus in orderto keep τb reasonably close to the excursion set we shall also consider the possibilityof an undershoot of size Θ (1b) right at τb As we shall see this relaxation will allowus to prevent the variance in (62) becoming infinite Thus instead of (61) we shallconsider τbminusab with density

(63) hbminusab(t) =P (f(t) gt bminus ab)

E[m(Abminusab)]

for some a gt 0 To ease on later notation write

γab bminus ab τγab= τbminusab

Let Qprime be the change of measure induced by sampling f as follows Given τγab

sample f(τγab) conditional on f(τγab

) gt γab In turn the rest of f follows its con-ditional distribution (under the nominal or original measure) given the observedvalue f(τγab

) We then have that the corresponding Radon-Nikodym derivative is

(64)dP

dQprime =E[m(Aγab

)]m(Aγab

)

and the importance sampling estimator Lprimeb is

(65) Lprimeb =dP

dQprime 1 (Ab 6= empty) =E[m(Aγab

)]m(Aγab

)1 (m(Ab) gt 0)

Note that we have used the continuity of the field in order to write m (Ab) gt 0 =Ab 6= empty almost surely The motivation behind this choice lies in the fact that

14 ADLER BLANCHET LIU

since m(Aγab) gt m (Ab) gt 0 the denominator may now be big enough to control

the second moment of the estimator As we shall see introducing the undershootof size ab will be very useful in the technical development both in the remainderof this section and in Section 7 In addition its introduction also provides insightinto the appropriate form of the estimator needed when discretizing the field

62 Algorithm and Analysis We still need to face the problem of generatingf in a computer Thus we now concentrate on a suitable discretization schemestill having in mind the change of measure leading to (64) Since our interest isto ultimately design algorithms that are efficient for estimating expectations suchas E[Γ (f) |flowast gt b] where Γ may be a functional of the whole field we shall use aglobal discretization scheme

Consider U = (U1 UM ) where Ui are iid uniform random variables takingvalues in T and independent of the field f Set TM = U1 UM and Xi =Xi (Ui) = f (Ui) for 1 le i le M Then X = (X1 Xm) (conditional on U) isa multivariate Gaussian random vector with conditional means micro(Ui)

∆= E(Xi|Ui)and covariances C (Ui Uj)

∆= Cov (Xi Xj |Ui Uj) Our strategy is to approximatew(b) by

wM (γab) = P (maxtisinTM

f(t) gt γab) = E[P ( max1leileM

Xi gt γab|U)]

Given the development in Section 5 it might not be surprising that if we can ensurethat M = M (ε b) is polynomial in 1ε and b then we shall be in a good positionto develop a FPRAS The idea is to apply an importance sampling strategy similarto that we considered in the construction of Lprimeb of (65) but this time it will beconditional on U In view of our earlier discussions we propose sampling from Qprimeprime

defined via

Qprimeprime (X isin B|U) =Msum

i=1

pU (i)P [X isin B|Xi gt γab U ]

where

pU (i) =P (Xi gt γab|U)sumM

j=1 P (Xj gt γab|U)

We then obtain the (conditional) importance sampling estimator

(66) Lb(U) =sumM

i=1 P (Xi gt γab|U)sumMi=1 1(Xi gt γab)

It is clear thatwM (γab) = EQprimeprime [Lb(U)]

Suppose for the moment that M a and the number of replications n have beenchosen Our future analysis will in particular guide the selection of these parametersThen the procedure is summarized by the next algorithm

MONTE CARLO FOR RANDOM FIELDS 15

Algorithm 61 The algorithm has three steps

Step 1 Simulate U (1) U (n) which are n iid copies of the vector U = (U1 UM )described aboveStep 2 Conditional on each U (i) for i = 1 n generate L

(i)b (U (i)) as described

by (66) by considering the distribution of X(i)(U (i)) = (X(i)1 (U (i)

1 ) X(i)M (U (i)

M ))Generate the X(i)(U (i)) independently so that at the end we obtain that the L

(i)b (U (i))

are n iid copies of Lb(U)Step 3 Output

Ln(U (1) U (n)) =1n

nsum

i=1

L(i)b (U (i))

63 Running Time of Algorithm 61 Bias and Variance Control The remain-der of this section is devoted to the analysis of the running time of the Algorithm61 The first step lies in estimating the bias and second moment of Lb(U) underthe change of measure induced by the sampling strategy of the algorithm whichwe denote by Qprimeprime We start with a simple bound for the second moment

Proposition 62 There exists a finite λ0 depending on microT = maxtisinT |micro (t)|and σ2

T = maxtisinT σ2 (t) for which

EQprimeprime [Lb(U)2] le λ0M2P

(maxtisinT

f (t) gt b

)2

Proof Observe that

EQprimeprime [Lb(U)2] le E

(Msum

i=1

P (Xi gt γab|Ui)

)2

le E

(Msum

i=1

suptisinT

P (f (Ui) gt γab|Ui = t)

)2

= M2 maxtisinT

P (f (t) gt γab)2

le λ0M2 max

tisinTP (f (t) gt b)2

le λ0M2P

(maxtisinT

f (t) gt b

)2

which completes the proof 2

Next we obtain a preliminary estimate of the bias

16 ADLER BLANCHET LIU

Proposition 63 For each M ge 1 we have

|w (b)minus wM (b)| le E[exp

(minusMm(Aγab

)m(T )

) Ab cap T 6= empty]

+ P

(maxtisinT

f (t) gt γab maxtisinT

f (t) le b

)

Proof Note that

|w (b)minus wM (γab)|le P (max

tisinTM

f (t) le γab maxtisinT

f (t) gt b) + P (maxtisinTM

f (t) gt γab maxtisinT

f (t) le b)

The second term is easily bounded by

P (maxtisinTM

f (t) gt γabmaxtisinT

f (t) le b) le P (maxtisinT

f (t) gt γab maxtisinT

f (t) le b)

The first term can be bounded as follows

P

(maxtisinTM

f (t) le γabmaxtisinT

f (t) gt b

)

le E[(P [f (Ui) le γab| f ])M 1(Ab cap T 6= empty)

]

le E[(

1minusm(Aγab)m (T )

)M Ab cap T 6= empty)]

le E[exp

(minusMm(Aγab)m (T )

) Ab cap T 6= empty)

]

This completes the proof 2

The previous proposition shows that controlling the relative bias of Lb(U) re-quires finding bounds for

(67) E[exp(minusMm(Aγab

)m(T )) Ab cap T 6= empty]

and

(68) P (maxtisinT

f (t) gt γabmaxtisinT

f (t) le b)

and so we develop these To bound (67) we take advantage of the importancesampling strategy based on Qprime introduced earlier in (64) Write

E[exp

(minusMm(Aγab)m (T )

) m(Ab) gt 0

](69)

= EQprime(

exp(minusMm(Aγab

)m (T ))

m(Aγab)

m(Ab) gt 0

)Em

(Aγab

)

MONTE CARLO FOR RANDOM FIELDS 17

Furthermore note that for each α gt 0 we have

EQprime(

exp(minusMm

(Aγab

)m (T )

)

m(Aγab

) m(Ab) gt 0

)(610)

le αminus1 exp (minusMαm(T ))

+ EQprime(

exp(minusMm(Aγab

)m(T ))

m(Aγab

) m(Aγab) le αm(Ab) gt 0

)

The next result whose proof is given in Section 64 gives a bound for the aboveexpectation

Proposition 64 Let β be as in Conditions A2 and A3 For any v gt 0 thereexist constants κ λ2 isin (0infin) (independent of a isin (0 1) and b but dependent on v)such that if we select

αminus1 ge κdβ(ba)(2+v)2dβ

and define W such that P (W gt x) = exp(minusxβd

)for x ge 0 then

EQprime(

exp(minusMm(Aγab

)m (T ))

m(Aγab)

m(Aγab) le αm(Ab) gt 0

)(611)

le EW 2 m (T )

λ2dβ2 M

(b

a

)4dβ

The following result gives us a useful upper bound on (68) The proof is givenin Section 65

Proposition 65 Assume that Conditions A2 and A3 are in force For anyv gt 0 let ρ = 2dβ + dv + 1 where d is the dimension of T There exist constantsb0 λ isin (0infin) (independent of a but depending on microT = maxtisinT |micro (t)| σ2

T =maxtisinT σ2 (t) v the Holder parameters β and κH) so that for all b ge b0 ge 1 wehave

(612) P

(maxtisinT

f(t) le b + ab | maxtisinT

f(t) gt b

)le λabρ

Consequently

P

(maxtisinT

f (t) gt γab maxtisinT

f (t) le b

)

= P

(maxtisinT

f (t) le b|maxtisinT

f (t) gt γab

)P

(maxtisinT

f(t) gt γab

)

le λabρP

(maxtisinT

f(t) gt γab

)

18 ADLER BLANCHET LIU

Moreover

P

(maxtisinT

f(t) gt γab

)(1minus λabρ) le P

(maxtisinT

f(t) gt b

)

Propositions 64 and 65 allow us to prove Theorem 31 which is rephrased inthe form of the following theorem which contains the detailed rate of complexityand so the main result of this section

Theorem 66 Suppose f is a Gaussian random field satisfying conditions A1ndashA4 in Section 3 Given any v gt 0 put a = ε(4λbρ) (where λ and ρ as in Proposition65) and αminus1 = κdβ(ba)(2+v)dβ Then there exist c ε0 gt 0 such that for allε le ε0

(613) |w (b)minus wM (γab)| le w (b) ε

if M =lceilcεminus1(ba)(4+4v)dβ

rceil Consequently by our discussion in Section 2 and the

bound on the second moment given in Proposition 62 it follows that Algorithm 61provides a FPRAS with running time O

((M)3 times (M)2 times εminus2δminus1

)

Proof Combining (63) (69) and (610) with Propositions 63ndash65 we have that

|w (b)minus wM (b) |le αminus1 exp (minusMαm (T )) E

[m

(Aγab

)]

+E[W 2]m (T )

λ2dβ2 M

(b

a

)4dβ

E[m

(Aγab

)]+

(λabρ

1minus λabρ

)w (b)

Furthermore there exists a constant K isin (0infin) such that

Em(Aγab

) le K maxtisinT

P (f (t) gt b)m (T ) le Kw (b) m (T )

Therefore we have that

|w (b)minus wM (b) |w (b)

le αminus1Km(T ) exp (minusMαm (T ))

+E[W 2]Km (T )2

λ2dβ2 M

(b

a

)4dβ

+(

λabρ

1minus λabρ

)

Moreover since a = ε(4λbρ) we obtain that for ε le 12

|w (b)minus wN (b) |w (b)

le αminus1Km(T ) exp (minusMαm (T ))

+[EW 2]Km (T )2

λ2dβ2 M

(b

a

)4dβ

+ ε2

MONTE CARLO FOR RANDOM FIELDS 19

From the selection of α M and θ it follows easily that the first two terms on theright hand side of the previous display can be made less than ε2 for all ε le ε0 bytaking ε0 sufficiently small

The complexity count given in the theorem now corresponds to the followingestimates The factor O((M)3) represents the cost of a Cholesky factorization re-quired to generate a single replication of a finite field of dimension M In additionthe second part of Proposition 62 gives us that O(M2εminus2δminus1) replications are re-quired to control the relative variance of the estimator 2

We now proceed to prove Propositions 64 and 65

64 Proof of Proposition 64 We concentrate on the analysis of the left handside of (611) An important observation is that conditional on the random variableτγab

with distribution

Qprime(τγabisin middot) =

E[m(Aγab

cap middot)]

E[m(Aγab

)]

and given f(τγab) the rest of the field namely (f (t) t isin Tτγab

) is anotherGaussian field with a computable mean and covariance structure The second termin (610) indicates that we must estimate the probability that m(Aγab

) takes smallvalues under Qprime For this purpose we shall develop an upper bound for

(614) P(m(Aγab

) lt yminus1 m(Ab) gt 0∣∣ f (t) = γab + zγab

)

for y large enough Our arguments proceeds in two steps For the first in order tostudy (614) we shall estimate the conditional mean covariance of f (s) s isin Tgiven that f (t) = γab + zγab Then we use the fact that the conditional field isalso Gaussian and take advantage of general results from the theory of Gaussianrandom fields to obtain a bound for (614) For this purpose we recall some usefulresults from the theory of Gaussian random fields The first result is due to Dudley[14]

Theorem 67 Let U be a compact subset of Rn and let f0(t) t isin U be amean zero continuous Gaussian random field Define the canonical metric d on Uas

d (s t) =radic

E[f0 (t)minus f0 (s)]2

and put diam (U) = supstisinU d (s t) which is assumed to be finite Then there existsa finite universal constant κ gt 0 such that

E[maxtisinU

f0 (t)] le κ

int diam(U)2

0

[log (N (ε))]12dε

where the entropy N (ε) is the smallest number of dminusballs of radius ε whose unioncovers U

20 ADLER BLANCHET LIU

The second general result that we shall need is the so-called B-TIS (Borel-Tsirelson-Ibragimov-Sudakov) inequality [4 11 26]

Theorem 68 Under the setting described in Theorem 67

P

(maxtisinU

f0 (t)minus E[maxtisinU

f0 (t)] ge b

)le exp

(minusb2(2σ2U )

)

whereσ2U = max

tisinUE[f2

0 (t)]

We can now proceed with the main proof We shall assume from now on thatτγab

= 0 since as will be obvious from what follows all estimates hold uniformlyover τγab

isin T This is a consequence of the uniform Holder assumptions A2 andA3 Define a new process f

(f (t) t isin T ) L= (f (t) t isin T | f (0) = γab + zγab)

Note that we can always write f (t) = micro (t)+g (t) where g is a mean zero Gaussianrandom field on T We have that

micro (t) = Ef (t) = micro (t) + σ (0)minus2C (0 t) (γab + zγab minus micro (0))

and that the covariance function of f is given by

Cg (s t) = Cov (g (s) g (t)) = C (s t)minus σ (0)minus2C (0 s)C (0 t)

The following lemma describes the behavior of micro (t) and Cg (t s)

Lemma 69 Assume that |s| and |t| small enough Then the following threeconclusions hold

(i) There exist constants λ0 and λ1 gt 0 such that

|micro (t)minus (γab + zγab)| le λ0 |t|β + λ1 |t|β (γab + zγab)

and for all z isin (0 1) and γab large enough

|micro (s)minus micro (t) | le κHγab|sminus t|β

(ii)Cg (s t) le 2κHσ (t)σ (s) |t|β + |s|β + |tminus s|β

(iii)

Dg (s t) =radic

E([g (t)minus g (s)]2) le λ121 |tminus s|β2

Proof All three consequences follow from simple algebraic manipulations Thedetails are omitted 2

MONTE CARLO FOR RANDOM FIELDS 21

Proposition 610 For any v gt 0 there exist κ and λ2 such that for all t isin T yminusβd le a2+v

κb(2+v) a sufficiently small and z gt 0

P(m(Aγab

)minus1 gt y m(Ab) gt 0∣∣ f (t) = γab + zγab

) le exp(minusλ2a

2yβdb2)

Proof For notational simplicity and without loss of generality we assume thatt = 0 First consider the case that z ge 1 Then there exist c1 c2 such that for allc2y

minusβd lt bminus2minusv and z gt 1

P(m

(Aγab

cap T)minus1

gt ym(Ab) gt 0| f (0) = γab + zγab

)

le P

(inf

|t|ltc1yminus1df(t) le γab|f (0) = γab + zγab

)

= P

(inf

|t|ltc1yminus1dmicro (t) + g(t) le γab

)

le P

(inf

|t|ltc1yminus1dg(t) le minus 1

2γab

)

Now apply (iii) from Lemma 69 from which it follows that N (ε) le c3m(T )ε2dβ

for some constant c3 By Theorem 67 E(sup|t|ltc1yminus1d f(t)) = O(yminusβ(2d) log y)By Theorem 68 for some constant c4

P(m

(Aγab

cap T)minus1

gt y m(Ab) gt 0| f (0) = γab + zγab

)

le P

(inf

|t|ltc1yminus1dg(t) le minus 1

2γab

)

le exp

(minus 1

c4γ2aby

minusβd

)

for c2yminusβd lt bminus2minusv and z gt 1

Now consider the case z isin (0 1) Let tlowast be the global maximum of f(t) Then

P(m

(Aγab

cap T)minus1

gt y m(Ab) gt 0|f (0) = γab + zγab

)

le P

(inf

|tminustlowast|ltc1yminus1df(t) lt γab f(tlowast) gt b|f (0) = γab + zγab

)

le P

(sup

|sminust|ltc1yminus1d

|f(s)minus f(t)| gt ab|f (0) = γab + zγab

)

Consider the new field ξ(s t) = g(s)minus g(t) with parameter space T times T Note thatradic

V ar(ξ(s t)) = Dg(s t) le λ1|sminus t|β2

Via basic algebra it is not hard to show that the entropy of ξ(s t) is bounded byNξ (ε) le c3m(T times T )ε2dβ In addition from (i) of Lemma 69 we have

|micro (s)minus micro (t) | le κHγab|sminus t|β

22 ADLER BLANCHET LIU

Similarly for some κ gt 0 and all yminusβd le 1κ

(ab

)2+v a lt 1 there exists c5 such that

P(m

(Aγab

cap T)minus1

gt y m(Ab) gt 0|f (0) = γab + zγab

)

le P

(sup

|sminust|ltc1yminus1d

|f(s)minus f(t)| gt ab|f (0) = γab + zγab

)

= P

(sup

|sminust|ltc1yminus1d

|ξ(s t)| gt a

2b

)

le exp(minus a2

c5b2yminusβd

)

Combining the two cases z gt 1 and z isin (0 1) and choosing c5 large enough we have

P(m

(Aγab

cap T)minus1

gt y m(Ab) gt 0∣∣ f (0) = γab + zγab

)

le exp(minus a2

c5b2yminusβd

)

for a small enough and yminusβd le 1κ

(ab

)2+v Renaming the constants completes theproof 2

The final ingredient needed for the proof of Proposition 64 is the following lemmainvolving stochastic domination The proof follows an elementary argument and istherefore omitted

Lemma 611 Let v1 and v2 be finite measures on R and define ηj(x) =intinfin

xvj(ds)

Suppose that η1 (x) ge η2 (x) for each x ge x0 Let (h (x) x ge x0) be a non-decreasing positive and bounded function Then

int infin

x0

h (s) v1(dx) geint infin

x0

h (s) v2 (ds)

Proof of Proposition 64 Note that

EQprime(exp

(minusMm(Aγab

)m (T )

)

m(Aγab

) m(Aγab

) isin (0 α) m(Ab) gt 0)

=int

T

int infin

z=0

E(exp

(minusMm(Aγab

)m (T )

)

m (Aγab)

m(Aγab

) isin (0 α) m(Ab) gt 0∣∣∣

f (t) = γab +z

γab

)P

(τγab

isin dt)P (γab[f (t)minus γab] isin dz| f (t) gt γab)

le supzgt0tisinT

E(exp

(minusMm(Aγab

)m (T )

)

m (Aγab)

m(Aγab

) isin (0 α) m(Ab) gt 0∣∣∣

f (t) = γab +z

γab

)

MONTE CARLO FOR RANDOM FIELDS 23

Now define Y (ba) = (ba)2dβλminusdβ2 W with λ2 as chosen in Proposition 610

and W with distribution given by P (W gt x) = exp(minusxβd) By Lemma 611

suptisinT

E(exp

(minusMm(Aγab

)m (T )

)

m (Aγab)

m(Aγab

) isin (0 α) m(Ab) gt 0∣∣∣

f (t) = γab +z

γab

)

le E[Y (ba) exp(minusMY (ba)minus1m(T ))Y (ba) gt αminus1

]

Now let Z be exponentially distributed with mean 1 and independent of Y (ba)Then we have (using the definition of the tail distribution of Z and Chebyshevrsquosinequality)

exp(minusMY (ba)minus1m (T )

)= P

(Z gt MY (ba)minus1m (T )

∣∣ Y (ba))

le m (T )Y (ba)M

Therefore

E[exp(minusMY (ba)minus1m (T )

)Y (ba) Y (ba) ge αminus1]

le m (T )M

E[Y (ba)2 Y (ba) ge αminus1]

le m (T )

Mλ2dβ2

(b

a

)4dβ

E(W 2)

which completes the proof 2

65 Proof of Proposition 65 We start with the following result of Tsirelson[27]

Theorem 612 Let f be a continuous separable Gaussian process on a compact(in the canonical metric) domain T Suppose that V ar (f) = σ is continuous andthat σ (t) gt 0 for t isin T Moreover assume that micro = Ef is also continuous andmicro(t) ge 0 for all t isin T Define

σ2T

∆= maxtisinT

Var(f(t))

and set F (x) = PmaxtisinT f(t) le x Then F is continuously differentiable on RFurthermore let y be such that F (y) gt 12 and define ylowast by

F (y) = Φ(ylowast)

Then for all x gt y

F prime(x) le Ψ(

xylowasty

)(xylowasty

(1 + 2α) + 1)

(1 + α)

24 ADLER BLANCHET LIU

where

α =y2

x(xminus y)y2lowast

We can now prove the following lemma

Lemma 613 There exists a constant A isin (0infin) independent of a and b ge 0such that

(615) P

(suptisinT

f(t) le b + ab | suptisinT

f(t) gt b

)le aA

P (suptisinT f (t) ge bminus 1b)P (suptisinT f (t) ge b)

Proof By subtracting inftisinT micro (t) gt minusinfin and redefining the level b to be b minusinftisinT micro (t) we may simply assume that Ef (t) ge 0 so that we can apply Theorem612 Adopting the notation of Theorem 612 first we pick b0 large enough so thatF (b0) gt 12 and assume that b ge b0 + 1 Now let y = bminus 1b and F (y) = Φ(ylowast)Note that there exists δ0 isin (0infin) such that δ0b le ylowast le δminus1

0 b for all b ge b0 Thisfollows easily from the fact that

logP

suptisinT

f(t) gt x

sim log sup

tisinTP f(t) gt x sim minus x2

2σ2T

On the other hand by Theorem 612 F is continuously differentiable and so

(616) P

suptisinT

f(t) lt b + ab | suptisinT

f(t) gt b

=

int b+ab

bF prime(x) dx

P suptisinT f(t) gt b

Moreover

F prime(x) le(

1minus Φ(

xylowasty

)) (xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x))

le (1minus Φ(ylowast))(

xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x))

= P

(maxtisinT

f (t) gt bminus 1b

)(xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x))

Thereforeint b+ab

b

F prime(x) dx

le P

(suptisinT

f (t) gt bminus 1b

) int b+ab

b

(xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x)) dx

Recall that α (x) = y2[x(xminus y)y2lowast] we can use the fact that ylowast ge δ0b to conclude

that if x isin [b b + ab] then α (x) le δminus20 and therefore

int b+ab

b

(xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x)) dx le 4δminus80 a

MONTE CARLO FOR RANDOM FIELDS 25

We thus obtain that

(617)

int b+ab

bF prime(x) dx

P suptisinT f(t) gt b le 4aδminus80

P suptisinT f(t) gt bminus 1bP suptisinT f(t) gt b

for any b ge b0 This inequality together with the fact that F is continuously differ-entiable on (minusinfininfin) yields the proof of the lemma for b ge 0 2

The previous result translates a question that involves the conditional distri-bution of maxtisinT f (t) near b into a question involving the tail distribution ofmaxtisinT f (t) The next result then provides a bound on this tail distribution

Lemma 614 For each v gt 0 there exists a constant C(v) isin (0infin) (possiblydepending on v gt 0 but otherwise independent of b) so that such that

P

(maxtisinT

f (t) gt b

)le C(v)b2dβ+dv+1 max

tisinTP (f (t) gt b)

for all b ge 1

Proof The proof of this result follows along the same lines of Theorem 262 in[5] Consider an open cover of T = cupM

i=1Ti(θ) where Ti(θ) = s |s minus ti| lt θWe choose ti carefully such that N(θ) = O(θminusd) for θ arbitrarily small Writef (t) = g (t)+micro (t) where g (t) is a centered Gaussian random field and note usingA2 and A3 that

P

(max

tisinTi(θ)f (t) gt b

)le P

(max

tisinTi(θ)g (t) gt bminus micro (ti)minus κHθβ

)

Now we wish to apply the Borel-TIS inequality (Theorem 68) with U = Ti (θ) f0 =g d (s t) = E12([g (t)minus g (s)]2) which as a consequence of A2 and A3 is boundedabove by C0 |tminus s|β2 for some C0 isin (0infin) Thus applying Theorem 67 we havethat E maxtisinTi(θ) g (t) le C1θ

β2 log(1θ) for some C1 isin (0infin) Consequently theBorel-TIS inequality yields that there exists C2 (v) isin (0infin) such that for all bsufficiently large and θ sufficiently small we have

P

(max

tisinTi(θ)g (t) gt bminus micro (ti)minus κHθβ

)le C2 (v) exp

(minus

(bminus micro (ti)minus C1θ

β(2+βv))2

2σ2Ti

)

where σTi = maxtisinTi(θ) σ (t) Now select v gt 0 small enough and set θβ(2+βv) =bminus1 Straightforward calculations yield that

P

(max

tisinTi(θ)f (t) gt b

)le P

(max

tisinTi(θ)g (t) gt bminus micro (ti)minus κHθβ

)

le C3 (v) maxtisinTi(θ)

exp

(minus (bminus micro (t))2

2σ (t)2

)

26 ADLER BLANCHET LIU

for some C3(v) isin (0infin) Now recall the well known inequality (valid for x gt 0)that

φ (x)(

1xminus 1

x3

)le 1minus Φ(x) le φ (x)

x

where φ = Φprime is the standard Gaussian density Using this inequality it follows thatC4 (v) isin (0infin) can be chosen so that

maxtisinTi(θ)

exp

(minus (bminus micro (t))2

2σ (t)2

)le C4 (v) bmax

tisinTi

P (f (t) gt b)

for all b ge 1 We then conclude that there exists C (v) isin (0infin) such that

P

(maxtisinT

f (t) gt b

)le N (θ)C4 (v) bmax

tisinTP (f (t) gt b)

le CθminusdbmaxtisinT

P (f (t) gt b) = Cb2dβ+dv+1 maxtisinT

P (f (t) gt b)

giving the result 2

We can now complete the proof of Proposition 65

Proof of Proposition 65 The result is a straightforward corollary of the previoustwo lemmas By (615) in Lemma 613 and Lemma 614 there exists λ isin (0infin) forwhich

P

(maxtisinT

f(t) le b + ab | maxtisinT

f(t) gt b

)

le aAP (maxtisinT f (t) ge bminus 1b)

P (maxtisinT f (t) ge b)

le aCAb2dβ+dv+1 maxtisinT P (f (t) gt bminus 1b)P (maxtisinT f (t) ge b)

le aCAb2dβ+dv+1 maxtisinT P (f (t) gt bminus 1b)maxtisinT P (f (t) gt b)

le aλb2dβ+dv+1

The last two inequalities follow from the obvious bound

P

(maxtisinT

f (t) ge b

)ge max

tisinTP (f (t) gt b)

and standard properties of the Gaussian distribution This yields (612) from whichthe remainder of the proposition follows 2

MONTE CARLO FOR RANDOM FIELDS 27

7 Fine Tuning Twice Differentiable Homogeneous Fields In the pre-ceeding section we constructed a polynomial time algorithm based on a randomizeddiscretization scheme Our goal in this section is to illustrate how to take advantageof additional information to further improve the running time and the efficiency ofthe algorithm In order to illustrate our techniques we shall perform a more refinedanalysis in the setting of smooth and homogeneous fields and shall establish op-timality of the algorithm in a precise sense to described below Our assumptionsthroughout this section are B1ndashB2 of Section 3

Let C(sminus t) = Cov(f(s) f(t)) be the covariance function of f which we assumealso has mean zero Note that it is an immediate consequence of homogeneity anddifferentiability that partiC(0) = part3

ijkC(0) = 0We shall need the following definition

Definition 71 We call T = t1 tM sub T a θ-regular discretization of Tif and only if

mini 6=j

|ti minus tj | ge θ suptisinT

mini|ti minus t| le 2θ

Regularity ensures that points in the grid T are well separated Intuitively sincef is smooth having tight clusters of points translates to a waste of computingresources as a result of sampling highly correlated values of f Also note thatevery region containing a ball of radius 2θ has at least one representative in T Therefore T covers the domain T in an economical way One technical convenienceof θ-regularity is that for subsets A sube T that have positive Lebesgue measure (inparticular ellipsoids)

limMrarrinfin

(A cap T )M

=m(A)m(T )

where here and throughout the remainder of the section (A) denotes the cardi-nality of the set A

Let T = t1 tM be a θ-regular discretization of T and consider

X = (X1 XM )T ∆= (f (t1) f (tM ))T

We shall concentrate on estimating wM (b) = P (max1leileM Xi gt b) The next result(which we prove in Section 71) shows that if θ = εb then the relative bias is O (

radicε)

Proposition 72 Suppose f is a Gaussian random field satisfying conditionsB1 and B2 There exist c0 c1 b0 and ε0 such that for any finite εb-regulardiscretization T of T

(71) P (suptisinT

f(t) lt b| suptisinT

f(t) gt b) le c0

radicε and (T ) le c1m(T )εminusdbd

for all ε isin (0 ε0] and b gt b0

Note that the bound on the bias obtained for twice differentiable fields is muchsharper than that of the general Holder continuous fields given by (613) in Theorem

28 ADLER BLANCHET LIU

66 This is partly because the conditional distribution of the random field aroundlocal maxima is harder to describe in the Holder continuous than in the case oftwice differentiable fields In addition to the sharper description of the bias weshall also soon show in Theorem 74 that our choice of discretization is optimal in acetain sense Finally we point out that the bound of

radicε in the first term of (71) is

not optimal In fact there seems to be some room of improvement and we believethat a more careful analysis might yield a bound of the form c0ε

2We shall estimate wM (b) by using a slight variation of Algorithm 53 In partic-

ular since the Xirsquos are now identically distributed we redefine Q to be

(72) Q (X isin B) =Msum

i=1

1M

P [X isin B|Xi gt bminus 1b]

Our estimator then takes the form

(73) Lb =M times P (X1 gt bminus 1b)sumM

j=1 1 (Xj gt bminus 1b)1( max

1leileMXi gt b)

Clearly we have that EQ(Lb) = wM (b) (The reason for for subtracting the fcatorof 1b was explained in Section 61)

Algorithm 73 Given a number of replications n and an εb-regular dis-cretization T the algorithm is as follows

Step 1 Sample X(1) X(n) iid copies of X with distribution Q given by (72)Step 2 Compute and output

Ln =1n

nsum

i=1

L(i)b

where

L(i)b =

M times P (X1 gt bminus 1b)sumM

j=1 1(X(i)j gt bminus 1b)

1( max1leileM

X(i)j gt b)

Theorem 75 later guides the selection of n in order to achieve a prescribedrelative error In particular our analysis together with considerations from Section2 implies that choosing n = O

(εminus2δminus1

)suffices to achieve ε relative error with

probability at least 1minus δAlgorithm 73 improves on Algorithm 61 for Holder continuous fields in two

important ways The first aspect is that it is possible to obtain information on thesize of the relative bias of the estimator In Proposition 72 we saw that in orderto overcome bias due to discretization it suffices to take a discretization of sizeM = (T ) = Θ(bd) That this selection is also asymptotically optimal in the sensedescribed in the next result will be proven in Section 71

MONTE CARLO FOR RANDOM FIELDS 29

Theorem 74 Suppose f is a Gaussian random field satisfying conditions B1and B2 If θ isin (0 1) then as b rarrinfin

sup(T )lebθd

P (suptisinT

f(t) gt b| suptisinT

f(t) gt b) rarr 0

This result implies that the relative bias goes to 100 as b rarrinfin if one chooses adiscretization scheme of size O

(bθd

)with θ isin (0 1) Consequently d is the smallest

power of b that achieves any given bounded relative bias and so the suggestionabove of choosing M = O

(bd

)points for the discretization is in this sense optimal

The second aspect of improvement involves the variance In the case of Holdercontinuous fields the ratio of the second moment of the estimator and w (b)2 wasshown to be bounded by a quantity that is of order O(M2) In contrast in thecontext of smooth and homogeneous fields considered here the next result showsthat this ratio is bounded uniformly for b gt b0 and M = (T ) ge cbd That is thevariance remains strongly controlled

Theorem 75 Suppose f is a Gaussian random field satisfying conditions B1and B2 Then there exist constants c b0 and ε0 such that for any εb-regulardiscretization T of T we have

supbgtb0εisin[0ε0]

EQL2b

P 2 (suptisinT f (t) gt b)le sup

bgtb0εisin[0ε0]

EQL2b

P 2(sup

tisinTf (t) gt b

) le c

for some c isin (0infin)

The proof of this result is given in Section 72 The fact that the number ofreplications remains bounded in b is a consequence of the strong control on thevariance

Finally we note that the proof of Theorem 33 follows as a direct corollary of The-orem 75 together with Proposition 72 and our discussion in Section 2 Assumingthat placing each point in T takes no more than c units of computer time the totalcomplexity is according to the discussion in section Section 2 O

(nM3 + M

)=

O(εminus2δminus1M3 + M

) The contribution of the term M3 = O

(εminus6b3

)comes from

the complexity of applying Cholesky factorization and the term M = O(εminus2b)corresponds to the complexity of placing T

Remark 76 Condition B2 imposes a convexity assumption on the boundaryof T This assumption although convenient in the development of the proofs of The-orems 74 and 75 is not necessary The results can be generalized at the expenseof increasing the length and the burden in the technical development to the casein which T is a d-dimensional manifold satisfying the so-called Whitney conditions([4])

The remainder of this section is devoted to the proof of Proposition 72 Theorem74 and Theorem 75

30 ADLER BLANCHET LIU

71 Bias Control Proofs of Proposition 72 and Theorem 74 We start withsome useful lemmas for all of which we assume that Conditions B1 and B2 are sat-isfied We shall also assume that the global maximum of f over T is achieved withprobability one at a single point in T Additional conditions under which this willhappen can be found in [4] and require little more than the non-degeneracy of thejoint distribution of f and its first and second order derivatives Of these lemmasLemma 77 the proof of which we defer to Section 73 is of central importance

Lemma 77 For any a0 gt 0 there exists clowast δlowast b0 and δ0 (which depend onthe choice of a0) such that for any s isin T a isin (0 a0) δ isin (0 δ0) and b gt b0

P ( min|tminuss|ltδabminus1

f(t) lt b|f(s) = b + ab s isin L) le clowast exp(minusδlowast

δ2

)

where L is the set of local maxima of f

The next lemma is a direct corollary of Lemma 77

Lemma 78 Let tlowast be the point in T at which the global maximum of f isattained Then with the same choice of constants as in Lemma 77 for any a isin(0 a0) δ isin (0 δ0) and b gt b0 we have

P ( min|tminustlowast|ltδabminus1

f(t) lt b|f(tlowast) = b + ab) le 2clowast exp(minusδlowast

δ2

)

Proof Note that for any s isin T

P

(min

|t|leδabf(t) lt b|f(s) = b + ab s = tlowast

)

=P

(min|t|leδab f(t) lt b f(s) = b + ab s = tlowast

)

P (f(s) = b + ab s = tlowast)

le P(min|t|leδab f(t) lt b f(s) = b + ab s isin L)

P (f(s) = b + ab s = tlowast)

= P ( min|tminuss|ltδabminus1

f(t) lt b|f(s) = b + ab s isin L)P (f(s) = b + ab s isin L)P (f(s) = b + ab s = tlowast)

le clowast exp(minusδlowast

δ2

)P (f(s) = b + ab s isin L)P (f(s) = b + ab s = tlowast)

Appling this bound and the fact that

P (f(s) = b + ab s isin L)P (f(s) = b + ab s = tlowast)

rarr 1

as b rarrinfin the conclusion of the Lemma holds 2

MONTE CARLO FOR RANDOM FIELDS 31

The following lemma gives a bound on the density of suptisinT f(t) which will beused to control the size of overshoot beyond level b

Lemma 79 Let pflowast(x) be the density function of suptisinT f(t) Then there existsa constant cflowast and b0 such that

pflowast(x) le cflowastxd+1P (f(0) gt x)

for all x gt b0

Proof Recalling (13) let the continuous function pE(x) x isin R be defined bythe relationship

E(χ(t isin T f(t) ge b)) =int infin

b

pE(x) dx

where the left hand side is the expected value of the Euler-Poincare characteristicof Ab Then according to Theorem 810 in [7] there exists c and δ such that

|pE(x)minus pflowast(x)| lt cP (f(0) gt (1 + δ)x)

for all x gt 0 In addition thanks to the result of [4] which providesintinfin

bpE(x)dx in

closed form there exists c0 such that for all x gt 1

pE(x) lt c0xd+1P (f(0) gt x)

Hence there exists cflowast such that

pflowast(x) le c0xd+1P (f(0) gt x) + cP (f(0) gt (1 + δ)x) le cflowastx

d+1P (f(0) gt x)

for all x gt 1 2

The last ingredients required to provide the proof of Proposition 72 and Theorem74 are stated in the following well known Theorem adapted from Lemma 61 andTheorem 72 in [19] to the twice differentiable case

Theorem 710 There exists a constant H (depending on the covariance func-tion C such that

P (suptisinT

f(t) gt b) = (1 + o (1))Hm(T )bdP (f(0) gt b)(74)

as b rarrinfinSimilarly choose δ small enough so that [0 δ]d sub T and let ∆0 = [0 bminus1]d Then

there exists a constant H1 such that

P ( suptisin∆0

f(t) gt b) = (1 + o(1))H1P (f(0) gt b)(75)

as b rarrinfin

32 ADLER BLANCHET LIU

We now are ready to provide the proof of Proposition 72 and Theorem 74

Proof of Proposition 72 The fact that there exists c1 such that

(T ) le c1m(T )εminusdbd

is immediate from assumption B2 Therefore we proceed to provide a bound forthe relative bias We choose ε0 lt δ2

0 (δ0 is given in Lemmas 77 and 78) and notethat for any ε lt ε0

P (suptisinT

f(t) lt b| suptisinT

f(t) gt b)

le P (suptisinT

f(t) lt b + 2radic

εb| suptisinT

f(t) gt b) + P (suptisinT

f(t) lt b| suptisinT

f(t) gt b + 2radic

εb)

By (74) and Lemma 79 there exists c2 such that the first term above can boundedby

P (suptisinT

f(t) lt b + 2radic

εb| suptisinT

f(t) gt b) le c2

radicε

The second term can be bounded by

P (suptisinT

f(t) lt b| suptisinT

f(t) gt b + 2radic

εb)

le P ( sup|tminustlowast|lt2εbminus1

f(t) lt b| suptisinT

f(t) gt b + 2radic

ε)

le 2clowast exp(minusδlowastεminus1

)

Hence there exists c0 such that

P (suptisinT

f(t) lt b| suptisinT

f(t) gt b) le c2

radicε + 2clowasteminusδlowastε le c0

radicε

for all ε isin (0 ε0) 2

Proof of Theorem 74 We write θ = 1minus 3δ isin (0 1) First note that by (74)

P (supT

f(t) gt b + b2δminus1| supT

f(t) gt b) rarr 0

as b rarrinfin Let tlowast be the position of the global maximum of f in T According to theexact Slepian model in Section 73 and an argument similar to the proof of Lemma78

(76) P ( sup|tminustlowast|gtb2δminus1

f(t) gt b|b lt f(tlowast) le b + b2δminus1) rarr 0

as b rarrinfin Consequently

P ( sup|tminustlowast|gtb2δminus1

f(t) gt b| supT

f(t) gt b) rarr 0

MONTE CARLO FOR RANDOM FIELDS 33

LetB(T b2δminus1) = cup

tisinTB(t b2δminus1)

We have

P (supT

f(t) gt b| supT

f(t) gt b)

le P ( sup|tminustlowast|gtb2δminus1

f(t) gt b| supT

f(t) gt b)

+ P (supT

f(t) gt b sup|tminustlowast|gtb2δminus1

f(t) le b| supT

f(t) gt b)

le o(1) + P (tlowast isin B(T b2δminus1)| supT

f(t) gt b)

le o(1) + P ( supB(T b2δminus1)

f(t) gt b| supT

f(t) gt b)

Since (T ) le b(1minus3δ)d we can find a finite set T prime = tprime1 tprimel sub T and let ∆k =tprimek + [0 bminus1] such that l = O(b(1minusδ)d) and B(T b2δminus1) sub cupl

k=1∆k The choice ofl only depends on (T ) not the particular distribution of T Therefore applying(75)

sup(Tlebθd)

P ( supB(T b2δminus1)

f(t) gt b) le O(b(1minusδ)d)P (f(0) gt b)

This together with (74) yields

sup(Tlebθd)

P ( supB(T b2δminus1)

f(t) gt b| supT

f(t) gt b) le O(bminusδd) = o(1)

for b ge b0 which clearly implies the statement of the result 2

72 Variance Control Proof of Theorem 75 We proceed directly to the proofof Theorem 75

34 ADLER BLANCHET LIU

Proof of Theorem 75 Note that

EQL2b

P 2 (suptisinT f (t) gt b)

=E(Lb)

P 2 (suptisinT f (t) gt b)

=E

(MtimesP (X1gtbminus1b)sumn

j=11(Xjgtbminus 1

b )maxj Xj gt b

)

P 2 (suptisinT f (t) gt b)

=E

(MtimesP (X1gtbminus1b)sumn

j=11(Xjgtbminus 1

b )maxj Xj gt b suptisinT f (t) gt b

)

P 2 (suptisinT f (t) gt b)

=E

(MP (X1gtbminus1b)1(maxj Xjgtb)sumn

j=11(Xjgtbminus1b)

∣∣∣∣ suptisinT f (t) gt b

)

P (suptisinT f (t) gt b)

= E

(M1 (maxj Xj gt b)sumnj=1 1 (xj gt bminus 1b)

∣∣∣∣∣ suptisinT

f (t) gt b

)P (X1 gt bminus 1b)

P (suptisinT f (t) gt b)

The remainder of the proof involves showing that the last conditional expectationhere is of order O(bd) This together with (74) will yield the result Note that forany A (b ε) such that A (b ε) = Θ(M) uniformly over b and ε we can write

E

(M times 1 (maxj Xj gt b)sumM

j=1 1(Xj gt bminus 1

b

)∣∣∣∣∣ sup

tisinTf (t) gt b

)(77)

le E

M times 1

(sumMj=1 1 (Xj gt bminus 1b) ge M

A(bε)

)

sumMj=1 1

(xj gt bminus 1

b

)∣∣∣∣∣∣suptisinT

f (t) gt b

+ E

M times 1

(1 le sumM

j=1 1 (Xj gt bminus 1b) lt MA(bε)

)

sumMj=1 1

(xj gt bminus 1

b

)∣∣∣∣∣∣suptisinT

f (t) gt b

We shall select A(b ε) appropriately in order to bound the expectations above ByLemma 78 for any 4ε le δ le δ0 there exist constants cprime and cprimeprime isin (0infin) such that

clowast exp(minusδlowast

δ2

)ge P

(min

|tminustlowast|leδbf (t) lt bminus 1b

∣∣∣∣ suptisinT

f (t) gt b

)

ge P

Msum

j=1

1 (Xj gt bminus 1b) le cprimeδdεd

∣∣∣∣∣∣suptisinT

f (t) gt b

ge P

(MsumM

j=1 1(Xj gt bminus 1b)ge bdcprimeprime

δd

∣∣∣∣∣ suptisinT

f (t) gt b

)(78)

MONTE CARLO FOR RANDOM FIELDS 35

The first inequality is an application of Lemma 78 The second inequality is dueto the fact that for any ball B of radius 4ε or larger (T cap B) ge cprimedεminusd for somecprime gt 0 Inequality (78) implies that for all x such that bdcprimeprimeδd

0 lt x lt bdcprimeprime[4dεd]there exists δlowastlowast gt 0 such that

P

(MsumM

j=1 1(Xj gt bminus 1

b

)bdge x

∣∣∣∣∣ suptisinT

f (t) gt b

)le clowast exp

(minusδlowastlowastx2d

)

Now let A (b ε) = bdcprimeprime(4dεd) and observe that by the second result in (71) wehave A (b ε) = Θ (M) and moreover that there exists c3 such that the first termon the right hand side of (77) is bounded by

(79) E

M times 1

(sumMj=1 1 (Xj gt bminus 1b) ge M

A(bε)

)

sumMj=1 1 (Xj gt bminus 1b)

∣∣∣∣∣∣supT

f (t) gt b

le c3b

d

Now we turn to the second term on the right hand side of (77) We use the factthat MA(b ε) le cprimeprimeprime isin (0infin) (uniformly as b rarrinfin and ε rarr 0) There exist c4 andc5 such that for ε le δ0c4

E

M1

(1 le sumM

j=1 1(Xj gt bminus 1

b

)lt M

A(bε)

)

sumMj=1 1

(Xj gt bminus 1

b

)∣∣∣∣∣∣supT

f (t) gt b

(710)

le MP

nsum

j=1

1(

Xj gt bminus 1b

)lt cprimeprimeprime

∣∣∣∣∣∣supT

f (t) gt b

le MP

(min

|tminustlowast|lec4εbf (t) lt bminus 1b

∣∣∣∣ supT

f (t) gt b

)

le c1m(T )bdεminusd exp(minus δlowast

c24ε

2

)le c5b

d

The second inequality holds from the fact that ifsumM

j=1 1(Xj gt bminus 1

b

)is less than

cprimeprimeprime then the minimum of f in a ball around the local maximum and of radius c4εb

must be less than bminus 1b Otherwise there are more than cprimeprimeprime elements of T insidesuch a ball The last inequality is due to Lemma 78 and Theorem 72

Putting (79) and (710) together we obtain for all εb -regular discretizationswith ε lt ε0 = min(14 1c4)δ0

EQL2b

P (suptisinT f (t) gt b)le E

(M1 (maxj Xj gt b)sumM

j=1 1(Xj gt bminus 1

b

)∣∣∣∣∣ sup

tisinTf (t) gt b

)P (X1 gt bminus 1b)

P (suptisinT f (t) gt b)

le (c3 + c5)bdP (X1 gt bminus 1b)P (suptisinT f (t) gt b)

Applying now (74) and Proposition 72 we have that

P (suptisinT

f(t) gt b) ltP (sup

tisinTf(t) gt b)

1minus c0radic

ε

36 ADLER BLANCHET LIU

we have

supbgtb0εisin[0ε0]

EQL2b

P(sup

tisinTf (t) gt b

) lt infin

as required 2

73 A final proof Lemma 77 Without loss of generality we assume that thefield has mean zero and unit variance Before getting into the details of the proofof Lemma 77 we need a few additional lemmas for which we adopt the followingnotation Let Ci and Cij be the first and second order derivatives of C and definethe vectors

micro1(t) = (minusC1(t) minusCd(t))micro2(t) = vech ((Cij(t) i = 1 d j = i d))

Let f prime(0) and f primeprime(0) be the gradient and vector of second order derivatives of f at 0where f primeprime(0) is arranged in the same order as micro2(0) Furthermore let micro02 = microgt20 bea vector of second order spectral moments and micro22 a matrix of fourth order spectralmoments The vectors micro02 and micro22 are arranged so that

1 0 micro02

0 Λ 0micro20 0 micro22

is the covariance matrix of (f(0) f prime(0) f primeprime(0)) where Λ = (minusCij(0)) It then followsthat

micro2middot0 = micro22 minus micro20micro02

be the conditional variance of f primeprime(0) given f(0) The following lemma given in [5]provides a stochastic representation of the f given that it has a local maxima atlevel u at the origin

Lemma 711 Given that f has a local maximum with height u at zero (aninterior point of T ) the conditional field is equal in distribution to

(711) fu(t) uC (t)minusWuβgt (t) + g (t)

g(t) is a centered Gaussian random field with covariance function

γ(s t) = C(sminus t)minus (C(s) micro2(s))(

1 micro02

micro20 micro22

)minus1 (C(t)microgt2 (t)

)minus micro1(s)Λminus1microgt1 (t)

and Wu is a d(d+1)2 random vector independent of g(t) with density function

(712) ψu (w) prop |det (rlowast(w)minus uΛ) | exp(minus1

2wgtmicrominus1

2middot0w)1 (rlowast(w)minus uΛ isin N )

MONTE CARLO FOR RANDOM FIELDS 37

where rlowast(w) is a dtimesd symmetric matrix whose upper triangular elements consist ofthe components of w The set of negative definite matrices is denoted by N Finallyβ(t) is defined by

(α (t) β (t)) = (C (t) micro2 (t))(

1 micro02

micro20 micro22

)minus1

The following two technical lemmas which we shall prove after completing theproof of Lemma 77 provide bounds for the last two terms of (711)

Lemma 712 Using the notation in (711) there exist δ0 ε1 c1 and b0 suchthat for any u gt b gt b0 and δ isin (0 δ0)

P

(sup

|t|leδab

∣∣Wuβgt (t)∣∣ gt

a

4b

)le c1 exp

(minusε1b

2

δ4

)

Lemma 713 There exist c δ and δ0 such that for any δ isin (0 δ0)

P

(max|t|leδab

|g(t)| gt a

4b

)le c exp

(minus δ

δ2

)

Proof of Lemma 77 Using the notation of Lemma 711 given any s isin L forwhich f(s) = b we have that the corresponding conditional distribution of s is thatof fb(middot minus s) Consequently it suffices to show that

P

(min

|t|leδabfb(t) lt bminus ab

)le clowast exp

(minusδlowast

δ2

)

We consider firstly the case for which the local maximum is in the interior of T Then by the Slepian model (711)

fb (t) = bC (t)minusWbβgt (t) + g (t)

We study the three terms of the Slepian model individually Since

C(t) = 1minus tgtΛt + o(|t|2

)

there exists a ε0 such thatbC (t) ge bminus a

4b

38 ADLER BLANCHET LIU

for all |t| lt ε0radic

ab According to Lemmas 712 and 713 for δ lt min(ε0radic

a0 δ0)

P

(min

|t|leδabfu(t) lt bminus ab

)

le P

(max|t|ltδab

|g (t)| gt a

4b

)+ P

(sup

|t|leδab

∣∣Wbβgt(t)

∣∣ gta

4b

)

le c exp

δ2

)+ c1 exp

(minusε1b

2

δ4

)

le clowast exp(minusδlowast

δ2

)

for some clowast and δlowastNow consider the case for which the local maximum is in the (dminus1)-dimensional

boundary of T Due to convexity of T and without loss of generality we assumethat the tangent space of partT is generated by partpartt2 partparttd the local maximumis located at the origin and T is a subset of the positive half-plane t1 ge 0 Thatthese arguments to not involve a loss of generality follows from the arguments onpages 291ndash192 of [4] which rely on the assumed stationarity of f (for translations)and the fact that rotations while changing the distributions will not affect theprobabilities that we are currently computing)

For the origin positioned as just described to be a local maximum it is necessaryand sufficient that the gradient of f restricted to partT is the zero vector the Hessianmatrix restricted to partT is negative definite and part1f(0) le 0 Applying a version ofLemma 711 for this case conditional on f(0) = u and 0 being a local maximumthe field is equal in distribution to

(713) uC(t)minus Wuβgt(t) + micro1(t)Λminus1 (Z 0 0)T + g(t)

where Z le 0 corresponds to part1f (0) and it follows a truncated (conditional on thenegative axis) Gaussian random variable with mean zero and a variance parameterwhich is computed as the conditional variance of part1f (0) given (part2f (0) partdf (0))The vector Wu is a (d(d + 1)2)-dimensional random vector with density function

ψu(w) prop |det(rlowast(w)minus uΛ)| exp(minus12wgtmicrominus1

2middot0w)1(wlowast minus uΛ isin N )

where Λ is the second spectral moment of f restricted to partT and rlowast(w) is the(dminus 1)times (dminus 1) symmetric matrix whose upper triangular elements consist of thecomponents of w In the representation (713) the vector Wu and Z are independentAs in the proof of Lemma 712 one can show that a similar bound holds albeitwith with different constants Thus since micro1 (t) = O (t) there exist cprimeprime and δprimeprime suchthat the third term in (713) can be bounded by

P [ max|t|leδab

∣∣∣micro1 (t) Λminus1 (Z 0 0)T∣∣∣ ge a(4b)] le cprimeprime exp

(minusδprimeprime

δ2

)

MONTE CARLO FOR RANDOM FIELDS 39

Consequently we can also find clowast and δlowast such that the conclusion holds and weare done 2

We complete the paper with the proofs of Lemmas 712 and 713

Proof of Lemma 712 It suffices to prove the Lemma for the case a = 1 Since

fu (0) = u = uminusWuβgt (0) + g (0)

and Wu and g are independent β (0) = 0 Furthermore since C prime (t) = O (t) andmicroprime2 (t) = O (t) there exists a c0 such that |β(t)| le c0|t|2 In addition Wu has densityfunction proportional to

ψu(w) prop |det(rlowast(w)minus uΛ)| exp(minus12wgtmicrominus1

2middot0w)1 (wlowast minus uΛ isin N )

Note that det(rlowast(w) minus uΛ) is expressible as a polynomial in w and u and thereexists some ε0 and c such that

∣∣∣∣det (rlowast(w)minus uΛ)

det (minusuΛ)

∣∣∣∣ le c

if |w| le ε0u Hence there exist ε2 c2 gt 0 such that

ψu (w) le ψ (w) = c2 exp(minus1

2ε2w

gtmicrominus12middot0w

)

for all u ge 1 The right hand side here is proportional to a multivariate Gaussiandensity Thus

P (|Wu| gt x) =int

|w|gtx

ψu(w)dw leint

|w|gtx

ψu (w) dw = c3P (|W | gt x)

where W is a multivariate Gaussian random variable with density function propor-tional to ψ Therefore by choosing ε1 and c1 appropriately we have

P

(sup|t|leδb

∣∣Wuβgt∣∣ gt

14b

)le P

(|Wu| gt b

c20δ

2

)le c1 exp

(minusε1b

2

δ4

)

for all u ge b 2

Proof of Lemma 713 Once again it suffices to prove the Lemma for the casea = 1 Since

fb(0) = b = bminusWbβgt(0) + g(0)

the covariance function (γ (s t) s t isin T ) of the centered field g satisfies γ(0 0) = 0It is also easy to check that

partsγ(s t) = O(|s|+ |t|) parttγ(s t) = O(|s|+ |t|)

40 ADLER BLANCHET LIU

Consequently there exists a constant cγ isin (0infin) for which

γ(s t) le cγ(|s|2 + |t|2) γ(s s) le cγ |s|2

We need to control the tail probability of sup|t|leδb |g(t)| For this it is useful tointroduce the following scaling Define

gδ (t) =b

δg

(δt

b

)

Then sup|t|leδb g (t) ge 14b if and only if sup|t|le1 gδ (t) ge 1

4δ Let

σδ (s t) = E (gδ (s) gδ (t))

ThensupsisinR

σδ(s s) le cγ

Because γ(s t) is at least twice differentiable applying a Taylor expansion we easilysee that the canonical metric dg corresponding to gδ (s) (cf Theorem 67) can bebounde as follows

d2g(s t) = E (gδ (s)minus gδ (t))2 =

b2

δ2

(δs

bδs

b

)+ γ

(δt

bδt

b

)minus 2γ

(δs

bδt

b

)]

le c |sminus t|2

for some constant c isin (0infin) Therefore the entropy of gδ evaluated at δ is boundedby Kδminusd for any δ gt 0 and with an appropriate choice of K gt 0 Therefore for allδ lt δ0

P ( sup|t|leδb

|g(t)| ge 14b

) = P ( sup|t|le1

gδ (t) ge 14δ

) le cdδminusdminusη exp(minus 1

16cγδ2)

for some constant cd and η gt 0 The last inequality is a direct application of The-orem 411 of [4] The conclusion of the lemma follows immediately by choosing c

and δ appropriately 2

REFERENCES

[1] RJ Adler K Bartz and S Kou Estimating curvatures in the Gaussian kinematic formulaIn preparation

[2] RJ Adler J Blanchet and J Liu Efficient simulation for tail probabilities of Gaussianrandom fields In WSC rsquo08 Proceedings of the 40th Conference on Winter Simulationpages 328ndash336 Winter Simulation Conference 2008

[3] RJ Adler P Muller and Rozovskii B editors Stochastic Modelling in Physical Oceanog-raphy Birkhauser Boston 1996

[4] RJ Adler and JE Taylor Random Fields and Geometry Springer New York USA 2007

[5] RJ Adler JE Taylor and K Worsley Applications of Random Fields and GeometryFoundations and Case Studies httpwebeetechnionacilpeopleadlerhrfpdf

MONTE CARLO FOR RANDOM FIELDS 41

[6] S Asmussen and P Glynn Stochastic Simulation Algorithms and Analysis Springer-VerlagNew York NY USA 2008

[7] JM Azais and M Wschebor Level Sets and Extrema of Random Processes and FieldsWiley New York NY 2009

[8] JM Bardeen JR Bond N Kaiser and AS Szalay The statistics of peaks of Gaussianrandom fields The Astrophysical Journal 30415ndash61 1986

[9] SM Berman Sojourns and Extremes of Stochastic Processes The Wadsworth ampBrooksCole StatisticsProbability Series Wadsworth amp BrooksCole Advanced Books ampSoftware Pacific Grove CA 1992

[10] J Blanchet Efficient importance sampling for binary contingency tables Ann of ApplProb 19949ndash982 2009

[11] C Borell The Brunn-Minkowski inequality in Gauss space Invent Math 30205ndash216 1975

[12] J Bucklew Introduction to Rare-Event Simulation Springer-Verlag New York NY 2004

[13] MR Dennis Nodal densities of planar Gaussian random waves Eur Phys J 145191ndash2102007

[14] RM Dudley Sample functions of the Gaussian process Ann Probab 166ndash103 1973

[15] KJ Friston KJ Worsley RSJ Frackowiak JC Mazziotta and AC Evans Assessing thesignificance of focal activations using their spatial extent Human Brain Mapping 1214ndash2201994

[16] MR Leadbetter G Lindgren and H Rootzen Extremes and Related Properties of RandomSequences and Processes Springer-Verlag New York 1983

[17] M Mitzenmacher and E Upfal Probability and Computing Randomized Algorithms andProbabilistic Analysis Cambridge University Press 2005

[18] W Niemiro and P Pokarowski Fixed precision MCMC estimation by median of productsof averages J Appl Probab 46309ndash329 2009

[19] V Piterbarg Asymptotic Methods in the Theory of Gaussian Processes American Mathe-matical Society USA 1995

[20] SF Shandarin Testing non-Gaussianity in cosmic microwave background maps by morpho-logical statistics Mon Not R Astr Soc 331865ndash874 2002

[21] SF Shandarin HA Feldman Y Xu and M Tegmark Morphological measures of non-gaussianity in cosmic microwave background maps Astrophys J Suppl 1411ndash11 2002

[22] L Shepp and H Landau On the supremum of a Gaussian process Sankhya A 32369ndash378December 1970

[23] JE Taylor A Takemura and RJ Adler Validity of the expected Euler characteristicheuristic Ann Probab 33(4)1362ndash1396 2005

[24] JE Taylor and KJ Worsley Detecting sparse signals in random fields with an applicationto brain mapping J Amer Statist Assoc 102(479)913ndash928 2007

[25] J Traub G Wasilokowski and H Wozniakowski Information-Based Complexity AcademicPress New York NY 1988

[26] BS Tsirelson IA Ibragimov and VN Sudakov Norms of Gaussian sample functionsProceedings of the Third Japan-USSR Symposium on Probability Theory (Tashkent 1975)55020ndash41 1976

[27] V S Tsirelrsquoson The density of the maximum of a gaussian process Theory of Probab Appl20847ndash856 1975

[28] H Wozniakowski Computational complexity of continuous problems 1996 Technical Re-port

Robert J AdlerElectrical EngineeringTechnion Haifa Israel 32000E-mail roberteetechnionacilURL webeetechnionacilpeopleadler

Jose H BlanchetIndustrial Engineering amp Operations ResearchColumbia Uninversity340 SW Mudd Building500 W 120 StreetNew York New York 10027E-mail joseblanchetcolumbiaeduURL wwwieorcolumbiaedufac-biosblanchetfacultyhtml

42 ADLER BLANCHET LIU

Jingchen LiuDepartment of StatisticsColumbia University1255 Amsterdam Avenue Room 1030New York NY 10027E-mail jcliustatcolumbiaeduURL statcolumbiaedu~jcliu

Page 7: EFFICIENT MONTE CARLO FOR HIGH EXCURSIONS OF … › ~jblanche › papers › GRF1.pdf · 1. Introduction. This paper centers on the design and analysis of e–cient Monte Carlo techniques

MONTE CARLO FOR RANDOM FIELDS 7

3 Main Results In order to state and discuss our main results we need somenotation For each s t isin T define

micro(t) = E(f(t)) C (s t) = Cov(f (s) f (t)) σ2(t) = C(t t) gt 0 r(s t) =C(s t)

σ(s)σ(t)

Moreover given x isin Rd and β gt 0 we write |x| =sumd

j=1 |xj | where xj is the j-thcomponent of x We shall assume that for each fixed s t isin T both micro (t) and C (s t)can be evaluated in at most c units of computing time

Our first result shows that under modest continuity conditions on micro σ and r itis possible to construct an explicit FPRAS for w(b) under the following regularityconditions

A1 The field f is almost surely continuous on T A2 For some δ gt 0 and |sminus t| lt δ the mean and variance functions satisfies

|σ (t)minus σ (s)|+ |micro (t)minus micro (s)| le κH |sminus t|β

A3 For some δ gt 0 and |s minus sprime| lt δ |t minus tprime| lt δ the correlation function of fsatisfies

|r (t s)minus r (tprime sprime)| le κH [|tminus tprime|β + |sminus sprime|β ]

A4 0 isin T There exist κ0 and ωd such that for any t isin T and ε small enough

m(B(t ε) cap T ) ge κ0εdωd

where m is the Lebesgue measure B(t ε) = s |tminus s| le ε and ωd =m (B(0 1))

The assumption that 0 isin T is of no real consequence and is adopted only fornotational convenience

Theorem 31 Suppose that f T rarr R is a Gaussian random field satisfyingConditions A1ndashA4 above Then Algorithm 61 provides a FPRAS for w (b)

The polynomial rate of the intrinsic complexity bound inherent in this result isdiscussed in Section 6 along with similar rates in results to follow The conditions ofthe previous theorem are weak and hold for virtually all applied settings involvingcontinuous Gaussian fields on compact sets

Not surprisingly the complexity bounds of our algorithms can be improved uponunder additional assumptions on f For example in the case of finite fields (ie whenT is finite) with a non-singular covariance matrix we can show that the complexityof the algorithm is actually bounded as b infin We summarize this in the nextresult whose proof which is given in Section 5 is useful for understanding themain ideas behind the general procedure

Theorem 32 Suppose that T is a finite set and f has a non-singular covari-ance matrix over T times T Then Algorithm 53 provides a FPRAS with q = 0

8 ADLER BLANCHET LIU

As we indicated above the strategy behind the discrete case provides the basisfor the general case In the general situation the underlying idea is to discretize thefield with an appropriate sampling (discretization) rule that depends on the levelb and the continuity characteristics of the field The number of sampling pointsgrows as b increases and the complexity of the algorithm is controlled by findinga good sampling rule There is a trade off between the number of points sampledwhich has a direct impact on the complexity of the algorithm and the fidelityof the discrete approximation to the continuous field Naturally in the presence ofenough smoothness and regularity more information can be obtained with the samesample size This point is illustrated in the next result Theorem 33 which considerssmooth homogeneous fields Note that in addition to controlling the error inducedby discretizing the field the variance is strongly controlled and the discretizationrule is optimal in a sense explained in Section 7 For Theorem 33 we require thefollowing additional regularity conditions

B1 f is homogeneous and almost surely twice continuously differentiableB2 0 isin T sub Rd is a d-dimensional convex set with nonempty interior Denoting

its boundary by partT assume that partT is a (dminus1)-dimensional manifold withoutboundary For any t isin T assume that there exists κ0 gt 0 such that

m(B(t ε) cap T ) ge κ0εd

for any ε lt 1 where m is Lebesgue measure

Theorem 33 Let f satisfy conditions B1ndashB2 Then Algorithm 73 provides aFPRAS Moreover the underlying estimator is strongly efficient and there exists adiscretization scheme for f which is optimal in the sense of Theorem 74

4 Importance Sampling Importance sampling is based on the basic iden-tity for fixed measurable B

(41) P (B) =int

1 (ω isin B) dP (ω) =int

1 (ω isin B)dP

dQ(ω) dQ (ω)

where we assume that the probability measure Q is such that Q (middot capB) is abso-lutely continuous with respect to the measure P (middot capB) If we use EQ to denoteexpectation under Q then (41) trivially yields that the random variable

L (ω) = 1 (ω isin B)dP

dQ(ω)

is an unbiased estimator for P (B) gt 0 under the measure Q or symbolicallyEQL = P (B)

An averaged importance sampling estimator based on the measure Q which isoften referred as an importance sampling distribution or a change-of-measure isobtained by simulating n iid copies L(1) L(n) of L under Q and computing theempirical average Ln = (L(1) + +L(n))n A central issue is that of selecting Q inorder to minimize the variance of Ln It is easy to verify that if Qlowast(middot) = P (middot|B) =

MONTE CARLO FOR RANDOM FIELDS 9

P (middot capB)P (B) then the corresponding estimator has zero variance However Qlowast isclearly a change of measure that is of no practical value since P (B) ndash the quantitythat we are attempting to evaluate in the first place ndash is unknown Neverthelesswhen constructing a good importance sampling distribution for a family of setsBb b ge b0 for which 0 lt P (Bb) rarr 0 as b rarrinfin it is often useful to analyze theasymptotic behavior of Qlowast as P (Bb) rarr 0 in order to guide the construction of agood Q

We now describe briefly how an efficient importance sampling estimator forP (Bb) can also be used to estimate a large class of conditional expectations givenBb Suppose that a single replication of the corresponding importance samplingestimator

Lb∆= 1 (ω isin Bb) dPdQ

can be generated in O (log |P (Bb)|q0) function evaluations for some q0 gt 0 andthat

V ar (Lb) = O([P (Bb)]2 log |P (Bb)|q0

)

These assumptions imply that by taking the average of iid replications of Lb weobtain a FPRAS

Fix β isin (0infin) and let X (β q) be the class of random variables X satisfying0 le X le β with

E[X|Bb] = Ω[1 log(P (Bb))q](42)

Then by noting that

(43)EQ (XLb)EQ (Lb)

= E[X|Bb] =E[X Bb]P (Bb)

it follows easily that a FPRAS can be obtained by constructing the natural estima-tor for E[X|Bb] ie the ratio of the corresponding averaged importance samplingestimators suggested by the ratio in the left of (43) Of course when X is difficultto simulate exactly one must assume the bias E[X Bb] can be reduced in poly-nomial time The estimator is naturally biased but the discussion on FPRAS onbiased estimators given in Section 2 can be directly applied

In the context of Gaussian random fields we have that Bb = flowast gt b andone is very often interested in random variables X of the form X = Γ (f) whereΓ C (T ) rarr R and C(T ) denotes the space of continuous functions on T EndowingC(T ) with the uniform topology consider functions Γ that are non-negative andbounded by a positive constant An archetypical example is given by the volumeof (conditioned) high level excursion sets with β = m(T ) is known to satisfy (42)However there are many other examples of X (β q) with β = m(T ) which satisfy(42) for a suitable q depending on the regularity properties of the field In fact ifthe mean and covariance properties of f are Holder continuous then using similartechniques as those given in the arguements of Section 6 it is not difficult to see

10 ADLER BLANCHET LIU

that q can be estimated In turn we have that a FPRAS based importance sam-pling algorithm for w (b) would typically also yield a polynomial time algorithm forfunctional characteristics of the conditional field given high level excursions Sincethis is a very important and novel application we devote the remainder of thispaper to the development of efficient importance sampling algorithms for w(b)

5 The Basic Strategy Finite Fields In this section we develop our mainideas in the setting in which T is a finite set of the form T = t1 tM Toemphasize the discrete nature of our algorithms in this section we write Xi = f (ti)for 1 M and set X = (X1 XM ) This section is mainly of an expositorynature since much of it has already appeared in [2] Nevertheless it is includedhere as a useful guide to the intuition behind the algorithms for the continuouscase

We have already noted that in order to design an efficient importance samplingestimator for w (b) = P (max1leileM Xi gt b) it is useful to study the asymptoticconditional distribution of X given that max1leileM Xi gt b Thus we begin withsome basic large deviation results

Proposition 51 For any set of random variables X1 XM

max1leileM

P (Xi gt b) le P

(max

1leileMXi gt b

)le

Msum

j=1

P (Xj gt b)

Moreover if the Xj are mean zero Gaussian and the correlation between Xi andXj is strictly less than 1 then

P (Xi gt bXj gt b) = o (max[P (Xi gt b) P (Xj gt b)])

Thus if the associated covariance matrix of X is non-singular

w (b) = (1 + o (1))Msum

j=1

P (Xj gt b)

Proof The lower bound in the first display is trivial and the upper bound followsby the union bound The second display follows easily by working with the jointdensity of a bivariate Gaussian distribution (eg [9 16]) and the third claim is adirect consequence of the inclusion-exclusion principle 2

As noted above Qlowast corresponds to the conditional distribution of X given thatXlowast max1leileM Xi gt b It follows from Proposition 51 that conditional on Xlowast gtb the probability that two or more Xj exceed b is negligible Moreover it alsofollows that

P (Xi = Xlowast|Xlowast gt b) = (1 + o (1))P (Xi gt b)sumM

j=1 P (Xj gt b)

The following corollary now follows as an easy consequence of these observations

MONTE CARLO FOR RANDOM FIELDS 11

Corollary 52dTV (QlowastQ) rarr 0

as b rarrinfin where dTV denotes the total variation norm and Q is defined for BorelB sub RM as

Q (X isin B) =Msum

i=1

p (i b)P [X isin B|Xi gt b]

where

p (i b) =P (Xi gt b)sumM

j=1 P (Xj gt b)

Proof Pick an arbitrary Borel B Then we have that

Qlowast (X isin B) =P [X isin B max1leileM Xi gt b]

w (b)le

Msum

i=1

P [X isin B Xi gt b]w (b)

=Msum

i=1

P [X isin B|Xi gt b]p (i b)

(1 + o (1))

The above which follows from the union bound and the last part of Proposition51 combined with the definition of Q yields that for each ε gt 0 there exists b0

(independent of B) such that for all b ge b0

Qlowast (X isin B) le Q (X isin B) (1minus ε)

The lower bound follows similarly using the inclusion-exclusion principle and thesecond part of Proposition 51 2

Corollary 52 provides support for choosing Q as an importance sampling dis-tribution Of course we still have to verify that the corresponding algorithm is aFPRAS The importance sampling estimator induced by Q takes the form

(51) Lb =dP

dQ=

sumMj=1 P (Xj gt b)

sumMj=1 1 (Xj gt b)

Note that under Q we have that Xlowast gt b almost surely so the denominator in theexpression for Lb is at least 1 Therefore we have that

EQL2b le

Msum

j=1

P (Xj gt b)

2

and by virtue of Proposition 51 we conclude (using V arQ to denote the varianceunder Q) that

V arQ (Lb)P (Xlowast gt b)2

rarr 0

12 ADLER BLANCHET LIU

as b rarrinfin In particular it follows that Lb is strongly efficientOur proposed algorithm can now be summarized as follows

Algorithm 53 There are two steps in the algorithm

Step 1 Simulate n iid copies X(1) X(n) of X from the distribution QStep 2 Compute and output

Ln =1n

nsum

i=1

L(i)b

where L(i)b =

sumMj=1 P (X(i)

j gt b)sumM

j=1 1(X(i)j gt b)

Since the generation of Li under Q takes O(M3

)function evaluations we con-

clude based on the analysis given in Section 2 that Algorithm 53 is a FPRASwith q = 0 This implies Theorem 32 as promised

6 A FPRAS for Holder Continuous Gaussian Fields In this sectionwe shall describe the algorithm and the analysis behind Theorem 31 Throughoutthe section unless stated otherwise we assume conditions A1ndashA4 of Section 3

There are two issues related to the complexity analysis Firstly since f is assumedcontinuous the entire field cannot be generated in a (discrete) computer and so thealgorithm used in the discrete case needs adaptation Once this is done we need tocarry out an appropriate variance analysis

Developing an estimator directly applicable to the continuous field will be car-ried out in Section 61 This construction will not only be useful when studyingthe performance of a suitable discretization but will also help to explain some ofthe features of our discrete construction Then in Section 62 we introduce a dis-cretization approach and study the bias caused by the discretization In additionwe provide bounds on the variance of this discrete importance sampling estimator

61 A Continuous Estimator We start with a change of measure motivatedby the discrete case in Section 5 A natural approach is to consider an importancesampling strategy analogous to that of Algorithm 53 The continuous adaptationinvolves first sampling τb according to the probability measure

(61) P (τb isin middot) =E[m (Ab cap middot)]

E[m (Ab)]

where Ab = t isin T f (t) gt b The idea of introducing τb in the continuous settingis not necessarily to locate the point at which the maximum is achieved as was thesituation in the discrete case Rather τb will be used to find a random point whichhas a reasonable probability of being in the excursion set Ab (This probability willtend to be higher if f is non-homogenous) This relaxation will prove useful in theanalysis of the algorithm Note that τb with the distribution indicated in equation(61) has a density function (with respect to Lebesgue measure) given by

hb(t) =P (f(t) gt b)E[m(Ab)]

MONTE CARLO FOR RANDOM FIELDS 13

and that we also can write

E[m (Ab)] = E

int

T

1 (f (t) gt b) dt =int

T

P (f (t) gt b) dt = m (T ) P (f (U) gt b)

where U is uniformly distributed over T Once τb is generated the natural continuous adaptation corresponding to the

strategy described by Algorithm 53 proceeds by sampling f conditional on f(τb) gtb Note that if we use Q to denote the change-of-measure induced by such a con-tinuous sampling strategy then the corresponding importance sampling estimatortakes the form

Lb =dP

dQ=

E[m (Ab)]m (Ab)

The second moment of the estimator then satisfies

EQ[(Lb)2] = E(LbAb 6= empty

)= E[m(Ab)]P (flowast gt b)E[m(Ab)minus1|Ab 6= empty](62)

Unfortunately it is easy to construct examples for which EQ[(Lb

)2] is infiniteConsequently the above construction needs to be modified slightly

Elementary extreme value theory considerations for the maximum of finitelymany Gaussian random variables and their analogues for continuous fields givethat the overshoot of f over a given level b will be of order Θ(1b) Thus in orderto keep τb reasonably close to the excursion set we shall also consider the possibilityof an undershoot of size Θ (1b) right at τb As we shall see this relaxation will allowus to prevent the variance in (62) becoming infinite Thus instead of (61) we shallconsider τbminusab with density

(63) hbminusab(t) =P (f(t) gt bminus ab)

E[m(Abminusab)]

for some a gt 0 To ease on later notation write

γab bminus ab τγab= τbminusab

Let Qprime be the change of measure induced by sampling f as follows Given τγab

sample f(τγab) conditional on f(τγab

) gt γab In turn the rest of f follows its con-ditional distribution (under the nominal or original measure) given the observedvalue f(τγab

) We then have that the corresponding Radon-Nikodym derivative is

(64)dP

dQprime =E[m(Aγab

)]m(Aγab

)

and the importance sampling estimator Lprimeb is

(65) Lprimeb =dP

dQprime 1 (Ab 6= empty) =E[m(Aγab

)]m(Aγab

)1 (m(Ab) gt 0)

Note that we have used the continuity of the field in order to write m (Ab) gt 0 =Ab 6= empty almost surely The motivation behind this choice lies in the fact that

14 ADLER BLANCHET LIU

since m(Aγab) gt m (Ab) gt 0 the denominator may now be big enough to control

the second moment of the estimator As we shall see introducing the undershootof size ab will be very useful in the technical development both in the remainderof this section and in Section 7 In addition its introduction also provides insightinto the appropriate form of the estimator needed when discretizing the field

62 Algorithm and Analysis We still need to face the problem of generatingf in a computer Thus we now concentrate on a suitable discretization schemestill having in mind the change of measure leading to (64) Since our interest isto ultimately design algorithms that are efficient for estimating expectations suchas E[Γ (f) |flowast gt b] where Γ may be a functional of the whole field we shall use aglobal discretization scheme

Consider U = (U1 UM ) where Ui are iid uniform random variables takingvalues in T and independent of the field f Set TM = U1 UM and Xi =Xi (Ui) = f (Ui) for 1 le i le M Then X = (X1 Xm) (conditional on U) isa multivariate Gaussian random vector with conditional means micro(Ui)

∆= E(Xi|Ui)and covariances C (Ui Uj)

∆= Cov (Xi Xj |Ui Uj) Our strategy is to approximatew(b) by

wM (γab) = P (maxtisinTM

f(t) gt γab) = E[P ( max1leileM

Xi gt γab|U)]

Given the development in Section 5 it might not be surprising that if we can ensurethat M = M (ε b) is polynomial in 1ε and b then we shall be in a good positionto develop a FPRAS The idea is to apply an importance sampling strategy similarto that we considered in the construction of Lprimeb of (65) but this time it will beconditional on U In view of our earlier discussions we propose sampling from Qprimeprime

defined via

Qprimeprime (X isin B|U) =Msum

i=1

pU (i)P [X isin B|Xi gt γab U ]

where

pU (i) =P (Xi gt γab|U)sumM

j=1 P (Xj gt γab|U)

We then obtain the (conditional) importance sampling estimator

(66) Lb(U) =sumM

i=1 P (Xi gt γab|U)sumMi=1 1(Xi gt γab)

It is clear thatwM (γab) = EQprimeprime [Lb(U)]

Suppose for the moment that M a and the number of replications n have beenchosen Our future analysis will in particular guide the selection of these parametersThen the procedure is summarized by the next algorithm

MONTE CARLO FOR RANDOM FIELDS 15

Algorithm 61 The algorithm has three steps

Step 1 Simulate U (1) U (n) which are n iid copies of the vector U = (U1 UM )described aboveStep 2 Conditional on each U (i) for i = 1 n generate L

(i)b (U (i)) as described

by (66) by considering the distribution of X(i)(U (i)) = (X(i)1 (U (i)

1 ) X(i)M (U (i)

M ))Generate the X(i)(U (i)) independently so that at the end we obtain that the L

(i)b (U (i))

are n iid copies of Lb(U)Step 3 Output

Ln(U (1) U (n)) =1n

nsum

i=1

L(i)b (U (i))

63 Running Time of Algorithm 61 Bias and Variance Control The remain-der of this section is devoted to the analysis of the running time of the Algorithm61 The first step lies in estimating the bias and second moment of Lb(U) underthe change of measure induced by the sampling strategy of the algorithm whichwe denote by Qprimeprime We start with a simple bound for the second moment

Proposition 62 There exists a finite λ0 depending on microT = maxtisinT |micro (t)|and σ2

T = maxtisinT σ2 (t) for which

EQprimeprime [Lb(U)2] le λ0M2P

(maxtisinT

f (t) gt b

)2

Proof Observe that

EQprimeprime [Lb(U)2] le E

(Msum

i=1

P (Xi gt γab|Ui)

)2

le E

(Msum

i=1

suptisinT

P (f (Ui) gt γab|Ui = t)

)2

= M2 maxtisinT

P (f (t) gt γab)2

le λ0M2 max

tisinTP (f (t) gt b)2

le λ0M2P

(maxtisinT

f (t) gt b

)2

which completes the proof 2

Next we obtain a preliminary estimate of the bias

16 ADLER BLANCHET LIU

Proposition 63 For each M ge 1 we have

|w (b)minus wM (b)| le E[exp

(minusMm(Aγab

)m(T )

) Ab cap T 6= empty]

+ P

(maxtisinT

f (t) gt γab maxtisinT

f (t) le b

)

Proof Note that

|w (b)minus wM (γab)|le P (max

tisinTM

f (t) le γab maxtisinT

f (t) gt b) + P (maxtisinTM

f (t) gt γab maxtisinT

f (t) le b)

The second term is easily bounded by

P (maxtisinTM

f (t) gt γabmaxtisinT

f (t) le b) le P (maxtisinT

f (t) gt γab maxtisinT

f (t) le b)

The first term can be bounded as follows

P

(maxtisinTM

f (t) le γabmaxtisinT

f (t) gt b

)

le E[(P [f (Ui) le γab| f ])M 1(Ab cap T 6= empty)

]

le E[(

1minusm(Aγab)m (T )

)M Ab cap T 6= empty)]

le E[exp

(minusMm(Aγab)m (T )

) Ab cap T 6= empty)

]

This completes the proof 2

The previous proposition shows that controlling the relative bias of Lb(U) re-quires finding bounds for

(67) E[exp(minusMm(Aγab

)m(T )) Ab cap T 6= empty]

and

(68) P (maxtisinT

f (t) gt γabmaxtisinT

f (t) le b)

and so we develop these To bound (67) we take advantage of the importancesampling strategy based on Qprime introduced earlier in (64) Write

E[exp

(minusMm(Aγab)m (T )

) m(Ab) gt 0

](69)

= EQprime(

exp(minusMm(Aγab

)m (T ))

m(Aγab)

m(Ab) gt 0

)Em

(Aγab

)

MONTE CARLO FOR RANDOM FIELDS 17

Furthermore note that for each α gt 0 we have

EQprime(

exp(minusMm

(Aγab

)m (T )

)

m(Aγab

) m(Ab) gt 0

)(610)

le αminus1 exp (minusMαm(T ))

+ EQprime(

exp(minusMm(Aγab

)m(T ))

m(Aγab

) m(Aγab) le αm(Ab) gt 0

)

The next result whose proof is given in Section 64 gives a bound for the aboveexpectation

Proposition 64 Let β be as in Conditions A2 and A3 For any v gt 0 thereexist constants κ λ2 isin (0infin) (independent of a isin (0 1) and b but dependent on v)such that if we select

αminus1 ge κdβ(ba)(2+v)2dβ

and define W such that P (W gt x) = exp(minusxβd

)for x ge 0 then

EQprime(

exp(minusMm(Aγab

)m (T ))

m(Aγab)

m(Aγab) le αm(Ab) gt 0

)(611)

le EW 2 m (T )

λ2dβ2 M

(b

a

)4dβ

The following result gives us a useful upper bound on (68) The proof is givenin Section 65

Proposition 65 Assume that Conditions A2 and A3 are in force For anyv gt 0 let ρ = 2dβ + dv + 1 where d is the dimension of T There exist constantsb0 λ isin (0infin) (independent of a but depending on microT = maxtisinT |micro (t)| σ2

T =maxtisinT σ2 (t) v the Holder parameters β and κH) so that for all b ge b0 ge 1 wehave

(612) P

(maxtisinT

f(t) le b + ab | maxtisinT

f(t) gt b

)le λabρ

Consequently

P

(maxtisinT

f (t) gt γab maxtisinT

f (t) le b

)

= P

(maxtisinT

f (t) le b|maxtisinT

f (t) gt γab

)P

(maxtisinT

f(t) gt γab

)

le λabρP

(maxtisinT

f(t) gt γab

)

18 ADLER BLANCHET LIU

Moreover

P

(maxtisinT

f(t) gt γab

)(1minus λabρ) le P

(maxtisinT

f(t) gt b

)

Propositions 64 and 65 allow us to prove Theorem 31 which is rephrased inthe form of the following theorem which contains the detailed rate of complexityand so the main result of this section

Theorem 66 Suppose f is a Gaussian random field satisfying conditions A1ndashA4 in Section 3 Given any v gt 0 put a = ε(4λbρ) (where λ and ρ as in Proposition65) and αminus1 = κdβ(ba)(2+v)dβ Then there exist c ε0 gt 0 such that for allε le ε0

(613) |w (b)minus wM (γab)| le w (b) ε

if M =lceilcεminus1(ba)(4+4v)dβ

rceil Consequently by our discussion in Section 2 and the

bound on the second moment given in Proposition 62 it follows that Algorithm 61provides a FPRAS with running time O

((M)3 times (M)2 times εminus2δminus1

)

Proof Combining (63) (69) and (610) with Propositions 63ndash65 we have that

|w (b)minus wM (b) |le αminus1 exp (minusMαm (T )) E

[m

(Aγab

)]

+E[W 2]m (T )

λ2dβ2 M

(b

a

)4dβ

E[m

(Aγab

)]+

(λabρ

1minus λabρ

)w (b)

Furthermore there exists a constant K isin (0infin) such that

Em(Aγab

) le K maxtisinT

P (f (t) gt b)m (T ) le Kw (b) m (T )

Therefore we have that

|w (b)minus wM (b) |w (b)

le αminus1Km(T ) exp (minusMαm (T ))

+E[W 2]Km (T )2

λ2dβ2 M

(b

a

)4dβ

+(

λabρ

1minus λabρ

)

Moreover since a = ε(4λbρ) we obtain that for ε le 12

|w (b)minus wN (b) |w (b)

le αminus1Km(T ) exp (minusMαm (T ))

+[EW 2]Km (T )2

λ2dβ2 M

(b

a

)4dβ

+ ε2

MONTE CARLO FOR RANDOM FIELDS 19

From the selection of α M and θ it follows easily that the first two terms on theright hand side of the previous display can be made less than ε2 for all ε le ε0 bytaking ε0 sufficiently small

The complexity count given in the theorem now corresponds to the followingestimates The factor O((M)3) represents the cost of a Cholesky factorization re-quired to generate a single replication of a finite field of dimension M In additionthe second part of Proposition 62 gives us that O(M2εminus2δminus1) replications are re-quired to control the relative variance of the estimator 2

We now proceed to prove Propositions 64 and 65

64 Proof of Proposition 64 We concentrate on the analysis of the left handside of (611) An important observation is that conditional on the random variableτγab

with distribution

Qprime(τγabisin middot) =

E[m(Aγab

cap middot)]

E[m(Aγab

)]

and given f(τγab) the rest of the field namely (f (t) t isin Tτγab

) is anotherGaussian field with a computable mean and covariance structure The second termin (610) indicates that we must estimate the probability that m(Aγab

) takes smallvalues under Qprime For this purpose we shall develop an upper bound for

(614) P(m(Aγab

) lt yminus1 m(Ab) gt 0∣∣ f (t) = γab + zγab

)

for y large enough Our arguments proceeds in two steps For the first in order tostudy (614) we shall estimate the conditional mean covariance of f (s) s isin Tgiven that f (t) = γab + zγab Then we use the fact that the conditional field isalso Gaussian and take advantage of general results from the theory of Gaussianrandom fields to obtain a bound for (614) For this purpose we recall some usefulresults from the theory of Gaussian random fields The first result is due to Dudley[14]

Theorem 67 Let U be a compact subset of Rn and let f0(t) t isin U be amean zero continuous Gaussian random field Define the canonical metric d on Uas

d (s t) =radic

E[f0 (t)minus f0 (s)]2

and put diam (U) = supstisinU d (s t) which is assumed to be finite Then there existsa finite universal constant κ gt 0 such that

E[maxtisinU

f0 (t)] le κ

int diam(U)2

0

[log (N (ε))]12dε

where the entropy N (ε) is the smallest number of dminusballs of radius ε whose unioncovers U

20 ADLER BLANCHET LIU

The second general result that we shall need is the so-called B-TIS (Borel-Tsirelson-Ibragimov-Sudakov) inequality [4 11 26]

Theorem 68 Under the setting described in Theorem 67

P

(maxtisinU

f0 (t)minus E[maxtisinU

f0 (t)] ge b

)le exp

(minusb2(2σ2U )

)

whereσ2U = max

tisinUE[f2

0 (t)]

We can now proceed with the main proof We shall assume from now on thatτγab

= 0 since as will be obvious from what follows all estimates hold uniformlyover τγab

isin T This is a consequence of the uniform Holder assumptions A2 andA3 Define a new process f

(f (t) t isin T ) L= (f (t) t isin T | f (0) = γab + zγab)

Note that we can always write f (t) = micro (t)+g (t) where g is a mean zero Gaussianrandom field on T We have that

micro (t) = Ef (t) = micro (t) + σ (0)minus2C (0 t) (γab + zγab minus micro (0))

and that the covariance function of f is given by

Cg (s t) = Cov (g (s) g (t)) = C (s t)minus σ (0)minus2C (0 s)C (0 t)

The following lemma describes the behavior of micro (t) and Cg (t s)

Lemma 69 Assume that |s| and |t| small enough Then the following threeconclusions hold

(i) There exist constants λ0 and λ1 gt 0 such that

|micro (t)minus (γab + zγab)| le λ0 |t|β + λ1 |t|β (γab + zγab)

and for all z isin (0 1) and γab large enough

|micro (s)minus micro (t) | le κHγab|sminus t|β

(ii)Cg (s t) le 2κHσ (t)σ (s) |t|β + |s|β + |tminus s|β

(iii)

Dg (s t) =radic

E([g (t)minus g (s)]2) le λ121 |tminus s|β2

Proof All three consequences follow from simple algebraic manipulations Thedetails are omitted 2

MONTE CARLO FOR RANDOM FIELDS 21

Proposition 610 For any v gt 0 there exist κ and λ2 such that for all t isin T yminusβd le a2+v

κb(2+v) a sufficiently small and z gt 0

P(m(Aγab

)minus1 gt y m(Ab) gt 0∣∣ f (t) = γab + zγab

) le exp(minusλ2a

2yβdb2)

Proof For notational simplicity and without loss of generality we assume thatt = 0 First consider the case that z ge 1 Then there exist c1 c2 such that for allc2y

minusβd lt bminus2minusv and z gt 1

P(m

(Aγab

cap T)minus1

gt ym(Ab) gt 0| f (0) = γab + zγab

)

le P

(inf

|t|ltc1yminus1df(t) le γab|f (0) = γab + zγab

)

= P

(inf

|t|ltc1yminus1dmicro (t) + g(t) le γab

)

le P

(inf

|t|ltc1yminus1dg(t) le minus 1

2γab

)

Now apply (iii) from Lemma 69 from which it follows that N (ε) le c3m(T )ε2dβ

for some constant c3 By Theorem 67 E(sup|t|ltc1yminus1d f(t)) = O(yminusβ(2d) log y)By Theorem 68 for some constant c4

P(m

(Aγab

cap T)minus1

gt y m(Ab) gt 0| f (0) = γab + zγab

)

le P

(inf

|t|ltc1yminus1dg(t) le minus 1

2γab

)

le exp

(minus 1

c4γ2aby

minusβd

)

for c2yminusβd lt bminus2minusv and z gt 1

Now consider the case z isin (0 1) Let tlowast be the global maximum of f(t) Then

P(m

(Aγab

cap T)minus1

gt y m(Ab) gt 0|f (0) = γab + zγab

)

le P

(inf

|tminustlowast|ltc1yminus1df(t) lt γab f(tlowast) gt b|f (0) = γab + zγab

)

le P

(sup

|sminust|ltc1yminus1d

|f(s)minus f(t)| gt ab|f (0) = γab + zγab

)

Consider the new field ξ(s t) = g(s)minus g(t) with parameter space T times T Note thatradic

V ar(ξ(s t)) = Dg(s t) le λ1|sminus t|β2

Via basic algebra it is not hard to show that the entropy of ξ(s t) is bounded byNξ (ε) le c3m(T times T )ε2dβ In addition from (i) of Lemma 69 we have

|micro (s)minus micro (t) | le κHγab|sminus t|β

22 ADLER BLANCHET LIU

Similarly for some κ gt 0 and all yminusβd le 1κ

(ab

)2+v a lt 1 there exists c5 such that

P(m

(Aγab

cap T)minus1

gt y m(Ab) gt 0|f (0) = γab + zγab

)

le P

(sup

|sminust|ltc1yminus1d

|f(s)minus f(t)| gt ab|f (0) = γab + zγab

)

= P

(sup

|sminust|ltc1yminus1d

|ξ(s t)| gt a

2b

)

le exp(minus a2

c5b2yminusβd

)

Combining the two cases z gt 1 and z isin (0 1) and choosing c5 large enough we have

P(m

(Aγab

cap T)minus1

gt y m(Ab) gt 0∣∣ f (0) = γab + zγab

)

le exp(minus a2

c5b2yminusβd

)

for a small enough and yminusβd le 1κ

(ab

)2+v Renaming the constants completes theproof 2

The final ingredient needed for the proof of Proposition 64 is the following lemmainvolving stochastic domination The proof follows an elementary argument and istherefore omitted

Lemma 611 Let v1 and v2 be finite measures on R and define ηj(x) =intinfin

xvj(ds)

Suppose that η1 (x) ge η2 (x) for each x ge x0 Let (h (x) x ge x0) be a non-decreasing positive and bounded function Then

int infin

x0

h (s) v1(dx) geint infin

x0

h (s) v2 (ds)

Proof of Proposition 64 Note that

EQprime(exp

(minusMm(Aγab

)m (T )

)

m(Aγab

) m(Aγab

) isin (0 α) m(Ab) gt 0)

=int

T

int infin

z=0

E(exp

(minusMm(Aγab

)m (T )

)

m (Aγab)

m(Aγab

) isin (0 α) m(Ab) gt 0∣∣∣

f (t) = γab +z

γab

)P

(τγab

isin dt)P (γab[f (t)minus γab] isin dz| f (t) gt γab)

le supzgt0tisinT

E(exp

(minusMm(Aγab

)m (T )

)

m (Aγab)

m(Aγab

) isin (0 α) m(Ab) gt 0∣∣∣

f (t) = γab +z

γab

)

MONTE CARLO FOR RANDOM FIELDS 23

Now define Y (ba) = (ba)2dβλminusdβ2 W with λ2 as chosen in Proposition 610

and W with distribution given by P (W gt x) = exp(minusxβd) By Lemma 611

suptisinT

E(exp

(minusMm(Aγab

)m (T )

)

m (Aγab)

m(Aγab

) isin (0 α) m(Ab) gt 0∣∣∣

f (t) = γab +z

γab

)

le E[Y (ba) exp(minusMY (ba)minus1m(T ))Y (ba) gt αminus1

]

Now let Z be exponentially distributed with mean 1 and independent of Y (ba)Then we have (using the definition of the tail distribution of Z and Chebyshevrsquosinequality)

exp(minusMY (ba)minus1m (T )

)= P

(Z gt MY (ba)minus1m (T )

∣∣ Y (ba))

le m (T )Y (ba)M

Therefore

E[exp(minusMY (ba)minus1m (T )

)Y (ba) Y (ba) ge αminus1]

le m (T )M

E[Y (ba)2 Y (ba) ge αminus1]

le m (T )

Mλ2dβ2

(b

a

)4dβ

E(W 2)

which completes the proof 2

65 Proof of Proposition 65 We start with the following result of Tsirelson[27]

Theorem 612 Let f be a continuous separable Gaussian process on a compact(in the canonical metric) domain T Suppose that V ar (f) = σ is continuous andthat σ (t) gt 0 for t isin T Moreover assume that micro = Ef is also continuous andmicro(t) ge 0 for all t isin T Define

σ2T

∆= maxtisinT

Var(f(t))

and set F (x) = PmaxtisinT f(t) le x Then F is continuously differentiable on RFurthermore let y be such that F (y) gt 12 and define ylowast by

F (y) = Φ(ylowast)

Then for all x gt y

F prime(x) le Ψ(

xylowasty

)(xylowasty

(1 + 2α) + 1)

(1 + α)

24 ADLER BLANCHET LIU

where

α =y2

x(xminus y)y2lowast

We can now prove the following lemma

Lemma 613 There exists a constant A isin (0infin) independent of a and b ge 0such that

(615) P

(suptisinT

f(t) le b + ab | suptisinT

f(t) gt b

)le aA

P (suptisinT f (t) ge bminus 1b)P (suptisinT f (t) ge b)

Proof By subtracting inftisinT micro (t) gt minusinfin and redefining the level b to be b minusinftisinT micro (t) we may simply assume that Ef (t) ge 0 so that we can apply Theorem612 Adopting the notation of Theorem 612 first we pick b0 large enough so thatF (b0) gt 12 and assume that b ge b0 + 1 Now let y = bminus 1b and F (y) = Φ(ylowast)Note that there exists δ0 isin (0infin) such that δ0b le ylowast le δminus1

0 b for all b ge b0 Thisfollows easily from the fact that

logP

suptisinT

f(t) gt x

sim log sup

tisinTP f(t) gt x sim minus x2

2σ2T

On the other hand by Theorem 612 F is continuously differentiable and so

(616) P

suptisinT

f(t) lt b + ab | suptisinT

f(t) gt b

=

int b+ab

bF prime(x) dx

P suptisinT f(t) gt b

Moreover

F prime(x) le(

1minus Φ(

xylowasty

)) (xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x))

le (1minus Φ(ylowast))(

xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x))

= P

(maxtisinT

f (t) gt bminus 1b

)(xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x))

Thereforeint b+ab

b

F prime(x) dx

le P

(suptisinT

f (t) gt bminus 1b

) int b+ab

b

(xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x)) dx

Recall that α (x) = y2[x(xminus y)y2lowast] we can use the fact that ylowast ge δ0b to conclude

that if x isin [b b + ab] then α (x) le δminus20 and therefore

int b+ab

b

(xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x)) dx le 4δminus80 a

MONTE CARLO FOR RANDOM FIELDS 25

We thus obtain that

(617)

int b+ab

bF prime(x) dx

P suptisinT f(t) gt b le 4aδminus80

P suptisinT f(t) gt bminus 1bP suptisinT f(t) gt b

for any b ge b0 This inequality together with the fact that F is continuously differ-entiable on (minusinfininfin) yields the proof of the lemma for b ge 0 2

The previous result translates a question that involves the conditional distri-bution of maxtisinT f (t) near b into a question involving the tail distribution ofmaxtisinT f (t) The next result then provides a bound on this tail distribution

Lemma 614 For each v gt 0 there exists a constant C(v) isin (0infin) (possiblydepending on v gt 0 but otherwise independent of b) so that such that

P

(maxtisinT

f (t) gt b

)le C(v)b2dβ+dv+1 max

tisinTP (f (t) gt b)

for all b ge 1

Proof The proof of this result follows along the same lines of Theorem 262 in[5] Consider an open cover of T = cupM

i=1Ti(θ) where Ti(θ) = s |s minus ti| lt θWe choose ti carefully such that N(θ) = O(θminusd) for θ arbitrarily small Writef (t) = g (t)+micro (t) where g (t) is a centered Gaussian random field and note usingA2 and A3 that

P

(max

tisinTi(θ)f (t) gt b

)le P

(max

tisinTi(θ)g (t) gt bminus micro (ti)minus κHθβ

)

Now we wish to apply the Borel-TIS inequality (Theorem 68) with U = Ti (θ) f0 =g d (s t) = E12([g (t)minus g (s)]2) which as a consequence of A2 and A3 is boundedabove by C0 |tminus s|β2 for some C0 isin (0infin) Thus applying Theorem 67 we havethat E maxtisinTi(θ) g (t) le C1θ

β2 log(1θ) for some C1 isin (0infin) Consequently theBorel-TIS inequality yields that there exists C2 (v) isin (0infin) such that for all bsufficiently large and θ sufficiently small we have

P

(max

tisinTi(θ)g (t) gt bminus micro (ti)minus κHθβ

)le C2 (v) exp

(minus

(bminus micro (ti)minus C1θ

β(2+βv))2

2σ2Ti

)

where σTi = maxtisinTi(θ) σ (t) Now select v gt 0 small enough and set θβ(2+βv) =bminus1 Straightforward calculations yield that

P

(max

tisinTi(θ)f (t) gt b

)le P

(max

tisinTi(θ)g (t) gt bminus micro (ti)minus κHθβ

)

le C3 (v) maxtisinTi(θ)

exp

(minus (bminus micro (t))2

2σ (t)2

)

26 ADLER BLANCHET LIU

for some C3(v) isin (0infin) Now recall the well known inequality (valid for x gt 0)that

φ (x)(

1xminus 1

x3

)le 1minus Φ(x) le φ (x)

x

where φ = Φprime is the standard Gaussian density Using this inequality it follows thatC4 (v) isin (0infin) can be chosen so that

maxtisinTi(θ)

exp

(minus (bminus micro (t))2

2σ (t)2

)le C4 (v) bmax

tisinTi

P (f (t) gt b)

for all b ge 1 We then conclude that there exists C (v) isin (0infin) such that

P

(maxtisinT

f (t) gt b

)le N (θ)C4 (v) bmax

tisinTP (f (t) gt b)

le CθminusdbmaxtisinT

P (f (t) gt b) = Cb2dβ+dv+1 maxtisinT

P (f (t) gt b)

giving the result 2

We can now complete the proof of Proposition 65

Proof of Proposition 65 The result is a straightforward corollary of the previoustwo lemmas By (615) in Lemma 613 and Lemma 614 there exists λ isin (0infin) forwhich

P

(maxtisinT

f(t) le b + ab | maxtisinT

f(t) gt b

)

le aAP (maxtisinT f (t) ge bminus 1b)

P (maxtisinT f (t) ge b)

le aCAb2dβ+dv+1 maxtisinT P (f (t) gt bminus 1b)P (maxtisinT f (t) ge b)

le aCAb2dβ+dv+1 maxtisinT P (f (t) gt bminus 1b)maxtisinT P (f (t) gt b)

le aλb2dβ+dv+1

The last two inequalities follow from the obvious bound

P

(maxtisinT

f (t) ge b

)ge max

tisinTP (f (t) gt b)

and standard properties of the Gaussian distribution This yields (612) from whichthe remainder of the proposition follows 2

MONTE CARLO FOR RANDOM FIELDS 27

7 Fine Tuning Twice Differentiable Homogeneous Fields In the pre-ceeding section we constructed a polynomial time algorithm based on a randomizeddiscretization scheme Our goal in this section is to illustrate how to take advantageof additional information to further improve the running time and the efficiency ofthe algorithm In order to illustrate our techniques we shall perform a more refinedanalysis in the setting of smooth and homogeneous fields and shall establish op-timality of the algorithm in a precise sense to described below Our assumptionsthroughout this section are B1ndashB2 of Section 3

Let C(sminus t) = Cov(f(s) f(t)) be the covariance function of f which we assumealso has mean zero Note that it is an immediate consequence of homogeneity anddifferentiability that partiC(0) = part3

ijkC(0) = 0We shall need the following definition

Definition 71 We call T = t1 tM sub T a θ-regular discretization of Tif and only if

mini 6=j

|ti minus tj | ge θ suptisinT

mini|ti minus t| le 2θ

Regularity ensures that points in the grid T are well separated Intuitively sincef is smooth having tight clusters of points translates to a waste of computingresources as a result of sampling highly correlated values of f Also note thatevery region containing a ball of radius 2θ has at least one representative in T Therefore T covers the domain T in an economical way One technical convenienceof θ-regularity is that for subsets A sube T that have positive Lebesgue measure (inparticular ellipsoids)

limMrarrinfin

(A cap T )M

=m(A)m(T )

where here and throughout the remainder of the section (A) denotes the cardi-nality of the set A

Let T = t1 tM be a θ-regular discretization of T and consider

X = (X1 XM )T ∆= (f (t1) f (tM ))T

We shall concentrate on estimating wM (b) = P (max1leileM Xi gt b) The next result(which we prove in Section 71) shows that if θ = εb then the relative bias is O (

radicε)

Proposition 72 Suppose f is a Gaussian random field satisfying conditionsB1 and B2 There exist c0 c1 b0 and ε0 such that for any finite εb-regulardiscretization T of T

(71) P (suptisinT

f(t) lt b| suptisinT

f(t) gt b) le c0

radicε and (T ) le c1m(T )εminusdbd

for all ε isin (0 ε0] and b gt b0

Note that the bound on the bias obtained for twice differentiable fields is muchsharper than that of the general Holder continuous fields given by (613) in Theorem

28 ADLER BLANCHET LIU

66 This is partly because the conditional distribution of the random field aroundlocal maxima is harder to describe in the Holder continuous than in the case oftwice differentiable fields In addition to the sharper description of the bias weshall also soon show in Theorem 74 that our choice of discretization is optimal in acetain sense Finally we point out that the bound of

radicε in the first term of (71) is

not optimal In fact there seems to be some room of improvement and we believethat a more careful analysis might yield a bound of the form c0ε

2We shall estimate wM (b) by using a slight variation of Algorithm 53 In partic-

ular since the Xirsquos are now identically distributed we redefine Q to be

(72) Q (X isin B) =Msum

i=1

1M

P [X isin B|Xi gt bminus 1b]

Our estimator then takes the form

(73) Lb =M times P (X1 gt bminus 1b)sumM

j=1 1 (Xj gt bminus 1b)1( max

1leileMXi gt b)

Clearly we have that EQ(Lb) = wM (b) (The reason for for subtracting the fcatorof 1b was explained in Section 61)

Algorithm 73 Given a number of replications n and an εb-regular dis-cretization T the algorithm is as follows

Step 1 Sample X(1) X(n) iid copies of X with distribution Q given by (72)Step 2 Compute and output

Ln =1n

nsum

i=1

L(i)b

where

L(i)b =

M times P (X1 gt bminus 1b)sumM

j=1 1(X(i)j gt bminus 1b)

1( max1leileM

X(i)j gt b)

Theorem 75 later guides the selection of n in order to achieve a prescribedrelative error In particular our analysis together with considerations from Section2 implies that choosing n = O

(εminus2δminus1

)suffices to achieve ε relative error with

probability at least 1minus δAlgorithm 73 improves on Algorithm 61 for Holder continuous fields in two

important ways The first aspect is that it is possible to obtain information on thesize of the relative bias of the estimator In Proposition 72 we saw that in orderto overcome bias due to discretization it suffices to take a discretization of sizeM = (T ) = Θ(bd) That this selection is also asymptotically optimal in the sensedescribed in the next result will be proven in Section 71

MONTE CARLO FOR RANDOM FIELDS 29

Theorem 74 Suppose f is a Gaussian random field satisfying conditions B1and B2 If θ isin (0 1) then as b rarrinfin

sup(T )lebθd

P (suptisinT

f(t) gt b| suptisinT

f(t) gt b) rarr 0

This result implies that the relative bias goes to 100 as b rarrinfin if one chooses adiscretization scheme of size O

(bθd

)with θ isin (0 1) Consequently d is the smallest

power of b that achieves any given bounded relative bias and so the suggestionabove of choosing M = O

(bd

)points for the discretization is in this sense optimal

The second aspect of improvement involves the variance In the case of Holdercontinuous fields the ratio of the second moment of the estimator and w (b)2 wasshown to be bounded by a quantity that is of order O(M2) In contrast in thecontext of smooth and homogeneous fields considered here the next result showsthat this ratio is bounded uniformly for b gt b0 and M = (T ) ge cbd That is thevariance remains strongly controlled

Theorem 75 Suppose f is a Gaussian random field satisfying conditions B1and B2 Then there exist constants c b0 and ε0 such that for any εb-regulardiscretization T of T we have

supbgtb0εisin[0ε0]

EQL2b

P 2 (suptisinT f (t) gt b)le sup

bgtb0εisin[0ε0]

EQL2b

P 2(sup

tisinTf (t) gt b

) le c

for some c isin (0infin)

The proof of this result is given in Section 72 The fact that the number ofreplications remains bounded in b is a consequence of the strong control on thevariance

Finally we note that the proof of Theorem 33 follows as a direct corollary of The-orem 75 together with Proposition 72 and our discussion in Section 2 Assumingthat placing each point in T takes no more than c units of computer time the totalcomplexity is according to the discussion in section Section 2 O

(nM3 + M

)=

O(εminus2δminus1M3 + M

) The contribution of the term M3 = O

(εminus6b3

)comes from

the complexity of applying Cholesky factorization and the term M = O(εminus2b)corresponds to the complexity of placing T

Remark 76 Condition B2 imposes a convexity assumption on the boundaryof T This assumption although convenient in the development of the proofs of The-orems 74 and 75 is not necessary The results can be generalized at the expenseof increasing the length and the burden in the technical development to the casein which T is a d-dimensional manifold satisfying the so-called Whitney conditions([4])

The remainder of this section is devoted to the proof of Proposition 72 Theorem74 and Theorem 75

30 ADLER BLANCHET LIU

71 Bias Control Proofs of Proposition 72 and Theorem 74 We start withsome useful lemmas for all of which we assume that Conditions B1 and B2 are sat-isfied We shall also assume that the global maximum of f over T is achieved withprobability one at a single point in T Additional conditions under which this willhappen can be found in [4] and require little more than the non-degeneracy of thejoint distribution of f and its first and second order derivatives Of these lemmasLemma 77 the proof of which we defer to Section 73 is of central importance

Lemma 77 For any a0 gt 0 there exists clowast δlowast b0 and δ0 (which depend onthe choice of a0) such that for any s isin T a isin (0 a0) δ isin (0 δ0) and b gt b0

P ( min|tminuss|ltδabminus1

f(t) lt b|f(s) = b + ab s isin L) le clowast exp(minusδlowast

δ2

)

where L is the set of local maxima of f

The next lemma is a direct corollary of Lemma 77

Lemma 78 Let tlowast be the point in T at which the global maximum of f isattained Then with the same choice of constants as in Lemma 77 for any a isin(0 a0) δ isin (0 δ0) and b gt b0 we have

P ( min|tminustlowast|ltδabminus1

f(t) lt b|f(tlowast) = b + ab) le 2clowast exp(minusδlowast

δ2

)

Proof Note that for any s isin T

P

(min

|t|leδabf(t) lt b|f(s) = b + ab s = tlowast

)

=P

(min|t|leδab f(t) lt b f(s) = b + ab s = tlowast

)

P (f(s) = b + ab s = tlowast)

le P(min|t|leδab f(t) lt b f(s) = b + ab s isin L)

P (f(s) = b + ab s = tlowast)

= P ( min|tminuss|ltδabminus1

f(t) lt b|f(s) = b + ab s isin L)P (f(s) = b + ab s isin L)P (f(s) = b + ab s = tlowast)

le clowast exp(minusδlowast

δ2

)P (f(s) = b + ab s isin L)P (f(s) = b + ab s = tlowast)

Appling this bound and the fact that

P (f(s) = b + ab s isin L)P (f(s) = b + ab s = tlowast)

rarr 1

as b rarrinfin the conclusion of the Lemma holds 2

MONTE CARLO FOR RANDOM FIELDS 31

The following lemma gives a bound on the density of suptisinT f(t) which will beused to control the size of overshoot beyond level b

Lemma 79 Let pflowast(x) be the density function of suptisinT f(t) Then there existsa constant cflowast and b0 such that

pflowast(x) le cflowastxd+1P (f(0) gt x)

for all x gt b0

Proof Recalling (13) let the continuous function pE(x) x isin R be defined bythe relationship

E(χ(t isin T f(t) ge b)) =int infin

b

pE(x) dx

where the left hand side is the expected value of the Euler-Poincare characteristicof Ab Then according to Theorem 810 in [7] there exists c and δ such that

|pE(x)minus pflowast(x)| lt cP (f(0) gt (1 + δ)x)

for all x gt 0 In addition thanks to the result of [4] which providesintinfin

bpE(x)dx in

closed form there exists c0 such that for all x gt 1

pE(x) lt c0xd+1P (f(0) gt x)

Hence there exists cflowast such that

pflowast(x) le c0xd+1P (f(0) gt x) + cP (f(0) gt (1 + δ)x) le cflowastx

d+1P (f(0) gt x)

for all x gt 1 2

The last ingredients required to provide the proof of Proposition 72 and Theorem74 are stated in the following well known Theorem adapted from Lemma 61 andTheorem 72 in [19] to the twice differentiable case

Theorem 710 There exists a constant H (depending on the covariance func-tion C such that

P (suptisinT

f(t) gt b) = (1 + o (1))Hm(T )bdP (f(0) gt b)(74)

as b rarrinfinSimilarly choose δ small enough so that [0 δ]d sub T and let ∆0 = [0 bminus1]d Then

there exists a constant H1 such that

P ( suptisin∆0

f(t) gt b) = (1 + o(1))H1P (f(0) gt b)(75)

as b rarrinfin

32 ADLER BLANCHET LIU

We now are ready to provide the proof of Proposition 72 and Theorem 74

Proof of Proposition 72 The fact that there exists c1 such that

(T ) le c1m(T )εminusdbd

is immediate from assumption B2 Therefore we proceed to provide a bound forthe relative bias We choose ε0 lt δ2

0 (δ0 is given in Lemmas 77 and 78) and notethat for any ε lt ε0

P (suptisinT

f(t) lt b| suptisinT

f(t) gt b)

le P (suptisinT

f(t) lt b + 2radic

εb| suptisinT

f(t) gt b) + P (suptisinT

f(t) lt b| suptisinT

f(t) gt b + 2radic

εb)

By (74) and Lemma 79 there exists c2 such that the first term above can boundedby

P (suptisinT

f(t) lt b + 2radic

εb| suptisinT

f(t) gt b) le c2

radicε

The second term can be bounded by

P (suptisinT

f(t) lt b| suptisinT

f(t) gt b + 2radic

εb)

le P ( sup|tminustlowast|lt2εbminus1

f(t) lt b| suptisinT

f(t) gt b + 2radic

ε)

le 2clowast exp(minusδlowastεminus1

)

Hence there exists c0 such that

P (suptisinT

f(t) lt b| suptisinT

f(t) gt b) le c2

radicε + 2clowasteminusδlowastε le c0

radicε

for all ε isin (0 ε0) 2

Proof of Theorem 74 We write θ = 1minus 3δ isin (0 1) First note that by (74)

P (supT

f(t) gt b + b2δminus1| supT

f(t) gt b) rarr 0

as b rarrinfin Let tlowast be the position of the global maximum of f in T According to theexact Slepian model in Section 73 and an argument similar to the proof of Lemma78

(76) P ( sup|tminustlowast|gtb2δminus1

f(t) gt b|b lt f(tlowast) le b + b2δminus1) rarr 0

as b rarrinfin Consequently

P ( sup|tminustlowast|gtb2δminus1

f(t) gt b| supT

f(t) gt b) rarr 0

MONTE CARLO FOR RANDOM FIELDS 33

LetB(T b2δminus1) = cup

tisinTB(t b2δminus1)

We have

P (supT

f(t) gt b| supT

f(t) gt b)

le P ( sup|tminustlowast|gtb2δminus1

f(t) gt b| supT

f(t) gt b)

+ P (supT

f(t) gt b sup|tminustlowast|gtb2δminus1

f(t) le b| supT

f(t) gt b)

le o(1) + P (tlowast isin B(T b2δminus1)| supT

f(t) gt b)

le o(1) + P ( supB(T b2δminus1)

f(t) gt b| supT

f(t) gt b)

Since (T ) le b(1minus3δ)d we can find a finite set T prime = tprime1 tprimel sub T and let ∆k =tprimek + [0 bminus1] such that l = O(b(1minusδ)d) and B(T b2δminus1) sub cupl

k=1∆k The choice ofl only depends on (T ) not the particular distribution of T Therefore applying(75)

sup(Tlebθd)

P ( supB(T b2δminus1)

f(t) gt b) le O(b(1minusδ)d)P (f(0) gt b)

This together with (74) yields

sup(Tlebθd)

P ( supB(T b2δminus1)

f(t) gt b| supT

f(t) gt b) le O(bminusδd) = o(1)

for b ge b0 which clearly implies the statement of the result 2

72 Variance Control Proof of Theorem 75 We proceed directly to the proofof Theorem 75

34 ADLER BLANCHET LIU

Proof of Theorem 75 Note that

EQL2b

P 2 (suptisinT f (t) gt b)

=E(Lb)

P 2 (suptisinT f (t) gt b)

=E

(MtimesP (X1gtbminus1b)sumn

j=11(Xjgtbminus 1

b )maxj Xj gt b

)

P 2 (suptisinT f (t) gt b)

=E

(MtimesP (X1gtbminus1b)sumn

j=11(Xjgtbminus 1

b )maxj Xj gt b suptisinT f (t) gt b

)

P 2 (suptisinT f (t) gt b)

=E

(MP (X1gtbminus1b)1(maxj Xjgtb)sumn

j=11(Xjgtbminus1b)

∣∣∣∣ suptisinT f (t) gt b

)

P (suptisinT f (t) gt b)

= E

(M1 (maxj Xj gt b)sumnj=1 1 (xj gt bminus 1b)

∣∣∣∣∣ suptisinT

f (t) gt b

)P (X1 gt bminus 1b)

P (suptisinT f (t) gt b)

The remainder of the proof involves showing that the last conditional expectationhere is of order O(bd) This together with (74) will yield the result Note that forany A (b ε) such that A (b ε) = Θ(M) uniformly over b and ε we can write

E

(M times 1 (maxj Xj gt b)sumM

j=1 1(Xj gt bminus 1

b

)∣∣∣∣∣ sup

tisinTf (t) gt b

)(77)

le E

M times 1

(sumMj=1 1 (Xj gt bminus 1b) ge M

A(bε)

)

sumMj=1 1

(xj gt bminus 1

b

)∣∣∣∣∣∣suptisinT

f (t) gt b

+ E

M times 1

(1 le sumM

j=1 1 (Xj gt bminus 1b) lt MA(bε)

)

sumMj=1 1

(xj gt bminus 1

b

)∣∣∣∣∣∣suptisinT

f (t) gt b

We shall select A(b ε) appropriately in order to bound the expectations above ByLemma 78 for any 4ε le δ le δ0 there exist constants cprime and cprimeprime isin (0infin) such that

clowast exp(minusδlowast

δ2

)ge P

(min

|tminustlowast|leδbf (t) lt bminus 1b

∣∣∣∣ suptisinT

f (t) gt b

)

ge P

Msum

j=1

1 (Xj gt bminus 1b) le cprimeδdεd

∣∣∣∣∣∣suptisinT

f (t) gt b

ge P

(MsumM

j=1 1(Xj gt bminus 1b)ge bdcprimeprime

δd

∣∣∣∣∣ suptisinT

f (t) gt b

)(78)

MONTE CARLO FOR RANDOM FIELDS 35

The first inequality is an application of Lemma 78 The second inequality is dueto the fact that for any ball B of radius 4ε or larger (T cap B) ge cprimedεminusd for somecprime gt 0 Inequality (78) implies that for all x such that bdcprimeprimeδd

0 lt x lt bdcprimeprime[4dεd]there exists δlowastlowast gt 0 such that

P

(MsumM

j=1 1(Xj gt bminus 1

b

)bdge x

∣∣∣∣∣ suptisinT

f (t) gt b

)le clowast exp

(minusδlowastlowastx2d

)

Now let A (b ε) = bdcprimeprime(4dεd) and observe that by the second result in (71) wehave A (b ε) = Θ (M) and moreover that there exists c3 such that the first termon the right hand side of (77) is bounded by

(79) E

M times 1

(sumMj=1 1 (Xj gt bminus 1b) ge M

A(bε)

)

sumMj=1 1 (Xj gt bminus 1b)

∣∣∣∣∣∣supT

f (t) gt b

le c3b

d

Now we turn to the second term on the right hand side of (77) We use the factthat MA(b ε) le cprimeprimeprime isin (0infin) (uniformly as b rarrinfin and ε rarr 0) There exist c4 andc5 such that for ε le δ0c4

E

M1

(1 le sumM

j=1 1(Xj gt bminus 1

b

)lt M

A(bε)

)

sumMj=1 1

(Xj gt bminus 1

b

)∣∣∣∣∣∣supT

f (t) gt b

(710)

le MP

nsum

j=1

1(

Xj gt bminus 1b

)lt cprimeprimeprime

∣∣∣∣∣∣supT

f (t) gt b

le MP

(min

|tminustlowast|lec4εbf (t) lt bminus 1b

∣∣∣∣ supT

f (t) gt b

)

le c1m(T )bdεminusd exp(minus δlowast

c24ε

2

)le c5b

d

The second inequality holds from the fact that ifsumM

j=1 1(Xj gt bminus 1

b

)is less than

cprimeprimeprime then the minimum of f in a ball around the local maximum and of radius c4εb

must be less than bminus 1b Otherwise there are more than cprimeprimeprime elements of T insidesuch a ball The last inequality is due to Lemma 78 and Theorem 72

Putting (79) and (710) together we obtain for all εb -regular discretizationswith ε lt ε0 = min(14 1c4)δ0

EQL2b

P (suptisinT f (t) gt b)le E

(M1 (maxj Xj gt b)sumM

j=1 1(Xj gt bminus 1

b

)∣∣∣∣∣ sup

tisinTf (t) gt b

)P (X1 gt bminus 1b)

P (suptisinT f (t) gt b)

le (c3 + c5)bdP (X1 gt bminus 1b)P (suptisinT f (t) gt b)

Applying now (74) and Proposition 72 we have that

P (suptisinT

f(t) gt b) ltP (sup

tisinTf(t) gt b)

1minus c0radic

ε

36 ADLER BLANCHET LIU

we have

supbgtb0εisin[0ε0]

EQL2b

P(sup

tisinTf (t) gt b

) lt infin

as required 2

73 A final proof Lemma 77 Without loss of generality we assume that thefield has mean zero and unit variance Before getting into the details of the proofof Lemma 77 we need a few additional lemmas for which we adopt the followingnotation Let Ci and Cij be the first and second order derivatives of C and definethe vectors

micro1(t) = (minusC1(t) minusCd(t))micro2(t) = vech ((Cij(t) i = 1 d j = i d))

Let f prime(0) and f primeprime(0) be the gradient and vector of second order derivatives of f at 0where f primeprime(0) is arranged in the same order as micro2(0) Furthermore let micro02 = microgt20 bea vector of second order spectral moments and micro22 a matrix of fourth order spectralmoments The vectors micro02 and micro22 are arranged so that

1 0 micro02

0 Λ 0micro20 0 micro22

is the covariance matrix of (f(0) f prime(0) f primeprime(0)) where Λ = (minusCij(0)) It then followsthat

micro2middot0 = micro22 minus micro20micro02

be the conditional variance of f primeprime(0) given f(0) The following lemma given in [5]provides a stochastic representation of the f given that it has a local maxima atlevel u at the origin

Lemma 711 Given that f has a local maximum with height u at zero (aninterior point of T ) the conditional field is equal in distribution to

(711) fu(t) uC (t)minusWuβgt (t) + g (t)

g(t) is a centered Gaussian random field with covariance function

γ(s t) = C(sminus t)minus (C(s) micro2(s))(

1 micro02

micro20 micro22

)minus1 (C(t)microgt2 (t)

)minus micro1(s)Λminus1microgt1 (t)

and Wu is a d(d+1)2 random vector independent of g(t) with density function

(712) ψu (w) prop |det (rlowast(w)minus uΛ) | exp(minus1

2wgtmicrominus1

2middot0w)1 (rlowast(w)minus uΛ isin N )

MONTE CARLO FOR RANDOM FIELDS 37

where rlowast(w) is a dtimesd symmetric matrix whose upper triangular elements consist ofthe components of w The set of negative definite matrices is denoted by N Finallyβ(t) is defined by

(α (t) β (t)) = (C (t) micro2 (t))(

1 micro02

micro20 micro22

)minus1

The following two technical lemmas which we shall prove after completing theproof of Lemma 77 provide bounds for the last two terms of (711)

Lemma 712 Using the notation in (711) there exist δ0 ε1 c1 and b0 suchthat for any u gt b gt b0 and δ isin (0 δ0)

P

(sup

|t|leδab

∣∣Wuβgt (t)∣∣ gt

a

4b

)le c1 exp

(minusε1b

2

δ4

)

Lemma 713 There exist c δ and δ0 such that for any δ isin (0 δ0)

P

(max|t|leδab

|g(t)| gt a

4b

)le c exp

(minus δ

δ2

)

Proof of Lemma 77 Using the notation of Lemma 711 given any s isin L forwhich f(s) = b we have that the corresponding conditional distribution of s is thatof fb(middot minus s) Consequently it suffices to show that

P

(min

|t|leδabfb(t) lt bminus ab

)le clowast exp

(minusδlowast

δ2

)

We consider firstly the case for which the local maximum is in the interior of T Then by the Slepian model (711)

fb (t) = bC (t)minusWbβgt (t) + g (t)

We study the three terms of the Slepian model individually Since

C(t) = 1minus tgtΛt + o(|t|2

)

there exists a ε0 such thatbC (t) ge bminus a

4b

38 ADLER BLANCHET LIU

for all |t| lt ε0radic

ab According to Lemmas 712 and 713 for δ lt min(ε0radic

a0 δ0)

P

(min

|t|leδabfu(t) lt bminus ab

)

le P

(max|t|ltδab

|g (t)| gt a

4b

)+ P

(sup

|t|leδab

∣∣Wbβgt(t)

∣∣ gta

4b

)

le c exp

δ2

)+ c1 exp

(minusε1b

2

δ4

)

le clowast exp(minusδlowast

δ2

)

for some clowast and δlowastNow consider the case for which the local maximum is in the (dminus1)-dimensional

boundary of T Due to convexity of T and without loss of generality we assumethat the tangent space of partT is generated by partpartt2 partparttd the local maximumis located at the origin and T is a subset of the positive half-plane t1 ge 0 Thatthese arguments to not involve a loss of generality follows from the arguments onpages 291ndash192 of [4] which rely on the assumed stationarity of f (for translations)and the fact that rotations while changing the distributions will not affect theprobabilities that we are currently computing)

For the origin positioned as just described to be a local maximum it is necessaryand sufficient that the gradient of f restricted to partT is the zero vector the Hessianmatrix restricted to partT is negative definite and part1f(0) le 0 Applying a version ofLemma 711 for this case conditional on f(0) = u and 0 being a local maximumthe field is equal in distribution to

(713) uC(t)minus Wuβgt(t) + micro1(t)Λminus1 (Z 0 0)T + g(t)

where Z le 0 corresponds to part1f (0) and it follows a truncated (conditional on thenegative axis) Gaussian random variable with mean zero and a variance parameterwhich is computed as the conditional variance of part1f (0) given (part2f (0) partdf (0))The vector Wu is a (d(d + 1)2)-dimensional random vector with density function

ψu(w) prop |det(rlowast(w)minus uΛ)| exp(minus12wgtmicrominus1

2middot0w)1(wlowast minus uΛ isin N )

where Λ is the second spectral moment of f restricted to partT and rlowast(w) is the(dminus 1)times (dminus 1) symmetric matrix whose upper triangular elements consist of thecomponents of w In the representation (713) the vector Wu and Z are independentAs in the proof of Lemma 712 one can show that a similar bound holds albeitwith with different constants Thus since micro1 (t) = O (t) there exist cprimeprime and δprimeprime suchthat the third term in (713) can be bounded by

P [ max|t|leδab

∣∣∣micro1 (t) Λminus1 (Z 0 0)T∣∣∣ ge a(4b)] le cprimeprime exp

(minusδprimeprime

δ2

)

MONTE CARLO FOR RANDOM FIELDS 39

Consequently we can also find clowast and δlowast such that the conclusion holds and weare done 2

We complete the paper with the proofs of Lemmas 712 and 713

Proof of Lemma 712 It suffices to prove the Lemma for the case a = 1 Since

fu (0) = u = uminusWuβgt (0) + g (0)

and Wu and g are independent β (0) = 0 Furthermore since C prime (t) = O (t) andmicroprime2 (t) = O (t) there exists a c0 such that |β(t)| le c0|t|2 In addition Wu has densityfunction proportional to

ψu(w) prop |det(rlowast(w)minus uΛ)| exp(minus12wgtmicrominus1

2middot0w)1 (wlowast minus uΛ isin N )

Note that det(rlowast(w) minus uΛ) is expressible as a polynomial in w and u and thereexists some ε0 and c such that

∣∣∣∣det (rlowast(w)minus uΛ)

det (minusuΛ)

∣∣∣∣ le c

if |w| le ε0u Hence there exist ε2 c2 gt 0 such that

ψu (w) le ψ (w) = c2 exp(minus1

2ε2w

gtmicrominus12middot0w

)

for all u ge 1 The right hand side here is proportional to a multivariate Gaussiandensity Thus

P (|Wu| gt x) =int

|w|gtx

ψu(w)dw leint

|w|gtx

ψu (w) dw = c3P (|W | gt x)

where W is a multivariate Gaussian random variable with density function propor-tional to ψ Therefore by choosing ε1 and c1 appropriately we have

P

(sup|t|leδb

∣∣Wuβgt∣∣ gt

14b

)le P

(|Wu| gt b

c20δ

2

)le c1 exp

(minusε1b

2

δ4

)

for all u ge b 2

Proof of Lemma 713 Once again it suffices to prove the Lemma for the casea = 1 Since

fb(0) = b = bminusWbβgt(0) + g(0)

the covariance function (γ (s t) s t isin T ) of the centered field g satisfies γ(0 0) = 0It is also easy to check that

partsγ(s t) = O(|s|+ |t|) parttγ(s t) = O(|s|+ |t|)

40 ADLER BLANCHET LIU

Consequently there exists a constant cγ isin (0infin) for which

γ(s t) le cγ(|s|2 + |t|2) γ(s s) le cγ |s|2

We need to control the tail probability of sup|t|leδb |g(t)| For this it is useful tointroduce the following scaling Define

gδ (t) =b

δg

(δt

b

)

Then sup|t|leδb g (t) ge 14b if and only if sup|t|le1 gδ (t) ge 1

4δ Let

σδ (s t) = E (gδ (s) gδ (t))

ThensupsisinR

σδ(s s) le cγ

Because γ(s t) is at least twice differentiable applying a Taylor expansion we easilysee that the canonical metric dg corresponding to gδ (s) (cf Theorem 67) can bebounde as follows

d2g(s t) = E (gδ (s)minus gδ (t))2 =

b2

δ2

(δs

bδs

b

)+ γ

(δt

bδt

b

)minus 2γ

(δs

bδt

b

)]

le c |sminus t|2

for some constant c isin (0infin) Therefore the entropy of gδ evaluated at δ is boundedby Kδminusd for any δ gt 0 and with an appropriate choice of K gt 0 Therefore for allδ lt δ0

P ( sup|t|leδb

|g(t)| ge 14b

) = P ( sup|t|le1

gδ (t) ge 14δ

) le cdδminusdminusη exp(minus 1

16cγδ2)

for some constant cd and η gt 0 The last inequality is a direct application of The-orem 411 of [4] The conclusion of the lemma follows immediately by choosing c

and δ appropriately 2

REFERENCES

[1] RJ Adler K Bartz and S Kou Estimating curvatures in the Gaussian kinematic formulaIn preparation

[2] RJ Adler J Blanchet and J Liu Efficient simulation for tail probabilities of Gaussianrandom fields In WSC rsquo08 Proceedings of the 40th Conference on Winter Simulationpages 328ndash336 Winter Simulation Conference 2008

[3] RJ Adler P Muller and Rozovskii B editors Stochastic Modelling in Physical Oceanog-raphy Birkhauser Boston 1996

[4] RJ Adler and JE Taylor Random Fields and Geometry Springer New York USA 2007

[5] RJ Adler JE Taylor and K Worsley Applications of Random Fields and GeometryFoundations and Case Studies httpwebeetechnionacilpeopleadlerhrfpdf

MONTE CARLO FOR RANDOM FIELDS 41

[6] S Asmussen and P Glynn Stochastic Simulation Algorithms and Analysis Springer-VerlagNew York NY USA 2008

[7] JM Azais and M Wschebor Level Sets and Extrema of Random Processes and FieldsWiley New York NY 2009

[8] JM Bardeen JR Bond N Kaiser and AS Szalay The statistics of peaks of Gaussianrandom fields The Astrophysical Journal 30415ndash61 1986

[9] SM Berman Sojourns and Extremes of Stochastic Processes The Wadsworth ampBrooksCole StatisticsProbability Series Wadsworth amp BrooksCole Advanced Books ampSoftware Pacific Grove CA 1992

[10] J Blanchet Efficient importance sampling for binary contingency tables Ann of ApplProb 19949ndash982 2009

[11] C Borell The Brunn-Minkowski inequality in Gauss space Invent Math 30205ndash216 1975

[12] J Bucklew Introduction to Rare-Event Simulation Springer-Verlag New York NY 2004

[13] MR Dennis Nodal densities of planar Gaussian random waves Eur Phys J 145191ndash2102007

[14] RM Dudley Sample functions of the Gaussian process Ann Probab 166ndash103 1973

[15] KJ Friston KJ Worsley RSJ Frackowiak JC Mazziotta and AC Evans Assessing thesignificance of focal activations using their spatial extent Human Brain Mapping 1214ndash2201994

[16] MR Leadbetter G Lindgren and H Rootzen Extremes and Related Properties of RandomSequences and Processes Springer-Verlag New York 1983

[17] M Mitzenmacher and E Upfal Probability and Computing Randomized Algorithms andProbabilistic Analysis Cambridge University Press 2005

[18] W Niemiro and P Pokarowski Fixed precision MCMC estimation by median of productsof averages J Appl Probab 46309ndash329 2009

[19] V Piterbarg Asymptotic Methods in the Theory of Gaussian Processes American Mathe-matical Society USA 1995

[20] SF Shandarin Testing non-Gaussianity in cosmic microwave background maps by morpho-logical statistics Mon Not R Astr Soc 331865ndash874 2002

[21] SF Shandarin HA Feldman Y Xu and M Tegmark Morphological measures of non-gaussianity in cosmic microwave background maps Astrophys J Suppl 1411ndash11 2002

[22] L Shepp and H Landau On the supremum of a Gaussian process Sankhya A 32369ndash378December 1970

[23] JE Taylor A Takemura and RJ Adler Validity of the expected Euler characteristicheuristic Ann Probab 33(4)1362ndash1396 2005

[24] JE Taylor and KJ Worsley Detecting sparse signals in random fields with an applicationto brain mapping J Amer Statist Assoc 102(479)913ndash928 2007

[25] J Traub G Wasilokowski and H Wozniakowski Information-Based Complexity AcademicPress New York NY 1988

[26] BS Tsirelson IA Ibragimov and VN Sudakov Norms of Gaussian sample functionsProceedings of the Third Japan-USSR Symposium on Probability Theory (Tashkent 1975)55020ndash41 1976

[27] V S Tsirelrsquoson The density of the maximum of a gaussian process Theory of Probab Appl20847ndash856 1975

[28] H Wozniakowski Computational complexity of continuous problems 1996 Technical Re-port

Robert J AdlerElectrical EngineeringTechnion Haifa Israel 32000E-mail roberteetechnionacilURL webeetechnionacilpeopleadler

Jose H BlanchetIndustrial Engineering amp Operations ResearchColumbia Uninversity340 SW Mudd Building500 W 120 StreetNew York New York 10027E-mail joseblanchetcolumbiaeduURL wwwieorcolumbiaedufac-biosblanchetfacultyhtml

42 ADLER BLANCHET LIU

Jingchen LiuDepartment of StatisticsColumbia University1255 Amsterdam Avenue Room 1030New York NY 10027E-mail jcliustatcolumbiaeduURL statcolumbiaedu~jcliu

Page 8: EFFICIENT MONTE CARLO FOR HIGH EXCURSIONS OF … › ~jblanche › papers › GRF1.pdf · 1. Introduction. This paper centers on the design and analysis of e–cient Monte Carlo techniques

8 ADLER BLANCHET LIU

As we indicated above the strategy behind the discrete case provides the basisfor the general case In the general situation the underlying idea is to discretize thefield with an appropriate sampling (discretization) rule that depends on the levelb and the continuity characteristics of the field The number of sampling pointsgrows as b increases and the complexity of the algorithm is controlled by findinga good sampling rule There is a trade off between the number of points sampledwhich has a direct impact on the complexity of the algorithm and the fidelityof the discrete approximation to the continuous field Naturally in the presence ofenough smoothness and regularity more information can be obtained with the samesample size This point is illustrated in the next result Theorem 33 which considerssmooth homogeneous fields Note that in addition to controlling the error inducedby discretizing the field the variance is strongly controlled and the discretizationrule is optimal in a sense explained in Section 7 For Theorem 33 we require thefollowing additional regularity conditions

B1 f is homogeneous and almost surely twice continuously differentiableB2 0 isin T sub Rd is a d-dimensional convex set with nonempty interior Denoting

its boundary by partT assume that partT is a (dminus1)-dimensional manifold withoutboundary For any t isin T assume that there exists κ0 gt 0 such that

m(B(t ε) cap T ) ge κ0εd

for any ε lt 1 where m is Lebesgue measure

Theorem 33 Let f satisfy conditions B1ndashB2 Then Algorithm 73 provides aFPRAS Moreover the underlying estimator is strongly efficient and there exists adiscretization scheme for f which is optimal in the sense of Theorem 74

4 Importance Sampling Importance sampling is based on the basic iden-tity for fixed measurable B

(41) P (B) =int

1 (ω isin B) dP (ω) =int

1 (ω isin B)dP

dQ(ω) dQ (ω)

where we assume that the probability measure Q is such that Q (middot capB) is abso-lutely continuous with respect to the measure P (middot capB) If we use EQ to denoteexpectation under Q then (41) trivially yields that the random variable

L (ω) = 1 (ω isin B)dP

dQ(ω)

is an unbiased estimator for P (B) gt 0 under the measure Q or symbolicallyEQL = P (B)

An averaged importance sampling estimator based on the measure Q which isoften referred as an importance sampling distribution or a change-of-measure isobtained by simulating n iid copies L(1) L(n) of L under Q and computing theempirical average Ln = (L(1) + +L(n))n A central issue is that of selecting Q inorder to minimize the variance of Ln It is easy to verify that if Qlowast(middot) = P (middot|B) =

MONTE CARLO FOR RANDOM FIELDS 9

P (middot capB)P (B) then the corresponding estimator has zero variance However Qlowast isclearly a change of measure that is of no practical value since P (B) ndash the quantitythat we are attempting to evaluate in the first place ndash is unknown Neverthelesswhen constructing a good importance sampling distribution for a family of setsBb b ge b0 for which 0 lt P (Bb) rarr 0 as b rarrinfin it is often useful to analyze theasymptotic behavior of Qlowast as P (Bb) rarr 0 in order to guide the construction of agood Q

We now describe briefly how an efficient importance sampling estimator forP (Bb) can also be used to estimate a large class of conditional expectations givenBb Suppose that a single replication of the corresponding importance samplingestimator

Lb∆= 1 (ω isin Bb) dPdQ

can be generated in O (log |P (Bb)|q0) function evaluations for some q0 gt 0 andthat

V ar (Lb) = O([P (Bb)]2 log |P (Bb)|q0

)

These assumptions imply that by taking the average of iid replications of Lb weobtain a FPRAS

Fix β isin (0infin) and let X (β q) be the class of random variables X satisfying0 le X le β with

E[X|Bb] = Ω[1 log(P (Bb))q](42)

Then by noting that

(43)EQ (XLb)EQ (Lb)

= E[X|Bb] =E[X Bb]P (Bb)

it follows easily that a FPRAS can be obtained by constructing the natural estima-tor for E[X|Bb] ie the ratio of the corresponding averaged importance samplingestimators suggested by the ratio in the left of (43) Of course when X is difficultto simulate exactly one must assume the bias E[X Bb] can be reduced in poly-nomial time The estimator is naturally biased but the discussion on FPRAS onbiased estimators given in Section 2 can be directly applied

In the context of Gaussian random fields we have that Bb = flowast gt b andone is very often interested in random variables X of the form X = Γ (f) whereΓ C (T ) rarr R and C(T ) denotes the space of continuous functions on T EndowingC(T ) with the uniform topology consider functions Γ that are non-negative andbounded by a positive constant An archetypical example is given by the volumeof (conditioned) high level excursion sets with β = m(T ) is known to satisfy (42)However there are many other examples of X (β q) with β = m(T ) which satisfy(42) for a suitable q depending on the regularity properties of the field In fact ifthe mean and covariance properties of f are Holder continuous then using similartechniques as those given in the arguements of Section 6 it is not difficult to see

10 ADLER BLANCHET LIU

that q can be estimated In turn we have that a FPRAS based importance sam-pling algorithm for w (b) would typically also yield a polynomial time algorithm forfunctional characteristics of the conditional field given high level excursions Sincethis is a very important and novel application we devote the remainder of thispaper to the development of efficient importance sampling algorithms for w(b)

5 The Basic Strategy Finite Fields In this section we develop our mainideas in the setting in which T is a finite set of the form T = t1 tM Toemphasize the discrete nature of our algorithms in this section we write Xi = f (ti)for 1 M and set X = (X1 XM ) This section is mainly of an expositorynature since much of it has already appeared in [2] Nevertheless it is includedhere as a useful guide to the intuition behind the algorithms for the continuouscase

We have already noted that in order to design an efficient importance samplingestimator for w (b) = P (max1leileM Xi gt b) it is useful to study the asymptoticconditional distribution of X given that max1leileM Xi gt b Thus we begin withsome basic large deviation results

Proposition 51 For any set of random variables X1 XM

max1leileM

P (Xi gt b) le P

(max

1leileMXi gt b

)le

Msum

j=1

P (Xj gt b)

Moreover if the Xj are mean zero Gaussian and the correlation between Xi andXj is strictly less than 1 then

P (Xi gt bXj gt b) = o (max[P (Xi gt b) P (Xj gt b)])

Thus if the associated covariance matrix of X is non-singular

w (b) = (1 + o (1))Msum

j=1

P (Xj gt b)

Proof The lower bound in the first display is trivial and the upper bound followsby the union bound The second display follows easily by working with the jointdensity of a bivariate Gaussian distribution (eg [9 16]) and the third claim is adirect consequence of the inclusion-exclusion principle 2

As noted above Qlowast corresponds to the conditional distribution of X given thatXlowast max1leileM Xi gt b It follows from Proposition 51 that conditional on Xlowast gtb the probability that two or more Xj exceed b is negligible Moreover it alsofollows that

P (Xi = Xlowast|Xlowast gt b) = (1 + o (1))P (Xi gt b)sumM

j=1 P (Xj gt b)

The following corollary now follows as an easy consequence of these observations

MONTE CARLO FOR RANDOM FIELDS 11

Corollary 52dTV (QlowastQ) rarr 0

as b rarrinfin where dTV denotes the total variation norm and Q is defined for BorelB sub RM as

Q (X isin B) =Msum

i=1

p (i b)P [X isin B|Xi gt b]

where

p (i b) =P (Xi gt b)sumM

j=1 P (Xj gt b)

Proof Pick an arbitrary Borel B Then we have that

Qlowast (X isin B) =P [X isin B max1leileM Xi gt b]

w (b)le

Msum

i=1

P [X isin B Xi gt b]w (b)

=Msum

i=1

P [X isin B|Xi gt b]p (i b)

(1 + o (1))

The above which follows from the union bound and the last part of Proposition51 combined with the definition of Q yields that for each ε gt 0 there exists b0

(independent of B) such that for all b ge b0

Qlowast (X isin B) le Q (X isin B) (1minus ε)

The lower bound follows similarly using the inclusion-exclusion principle and thesecond part of Proposition 51 2

Corollary 52 provides support for choosing Q as an importance sampling dis-tribution Of course we still have to verify that the corresponding algorithm is aFPRAS The importance sampling estimator induced by Q takes the form

(51) Lb =dP

dQ=

sumMj=1 P (Xj gt b)

sumMj=1 1 (Xj gt b)

Note that under Q we have that Xlowast gt b almost surely so the denominator in theexpression for Lb is at least 1 Therefore we have that

EQL2b le

Msum

j=1

P (Xj gt b)

2

and by virtue of Proposition 51 we conclude (using V arQ to denote the varianceunder Q) that

V arQ (Lb)P (Xlowast gt b)2

rarr 0

12 ADLER BLANCHET LIU

as b rarrinfin In particular it follows that Lb is strongly efficientOur proposed algorithm can now be summarized as follows

Algorithm 53 There are two steps in the algorithm

Step 1 Simulate n iid copies X(1) X(n) of X from the distribution QStep 2 Compute and output

Ln =1n

nsum

i=1

L(i)b

where L(i)b =

sumMj=1 P (X(i)

j gt b)sumM

j=1 1(X(i)j gt b)

Since the generation of Li under Q takes O(M3

)function evaluations we con-

clude based on the analysis given in Section 2 that Algorithm 53 is a FPRASwith q = 0 This implies Theorem 32 as promised

6 A FPRAS for Holder Continuous Gaussian Fields In this sectionwe shall describe the algorithm and the analysis behind Theorem 31 Throughoutthe section unless stated otherwise we assume conditions A1ndashA4 of Section 3

There are two issues related to the complexity analysis Firstly since f is assumedcontinuous the entire field cannot be generated in a (discrete) computer and so thealgorithm used in the discrete case needs adaptation Once this is done we need tocarry out an appropriate variance analysis

Developing an estimator directly applicable to the continuous field will be car-ried out in Section 61 This construction will not only be useful when studyingthe performance of a suitable discretization but will also help to explain some ofthe features of our discrete construction Then in Section 62 we introduce a dis-cretization approach and study the bias caused by the discretization In additionwe provide bounds on the variance of this discrete importance sampling estimator

61 A Continuous Estimator We start with a change of measure motivatedby the discrete case in Section 5 A natural approach is to consider an importancesampling strategy analogous to that of Algorithm 53 The continuous adaptationinvolves first sampling τb according to the probability measure

(61) P (τb isin middot) =E[m (Ab cap middot)]

E[m (Ab)]

where Ab = t isin T f (t) gt b The idea of introducing τb in the continuous settingis not necessarily to locate the point at which the maximum is achieved as was thesituation in the discrete case Rather τb will be used to find a random point whichhas a reasonable probability of being in the excursion set Ab (This probability willtend to be higher if f is non-homogenous) This relaxation will prove useful in theanalysis of the algorithm Note that τb with the distribution indicated in equation(61) has a density function (with respect to Lebesgue measure) given by

hb(t) =P (f(t) gt b)E[m(Ab)]

MONTE CARLO FOR RANDOM FIELDS 13

and that we also can write

E[m (Ab)] = E

int

T

1 (f (t) gt b) dt =int

T

P (f (t) gt b) dt = m (T ) P (f (U) gt b)

where U is uniformly distributed over T Once τb is generated the natural continuous adaptation corresponding to the

strategy described by Algorithm 53 proceeds by sampling f conditional on f(τb) gtb Note that if we use Q to denote the change-of-measure induced by such a con-tinuous sampling strategy then the corresponding importance sampling estimatortakes the form

Lb =dP

dQ=

E[m (Ab)]m (Ab)

The second moment of the estimator then satisfies

EQ[(Lb)2] = E(LbAb 6= empty

)= E[m(Ab)]P (flowast gt b)E[m(Ab)minus1|Ab 6= empty](62)

Unfortunately it is easy to construct examples for which EQ[(Lb

)2] is infiniteConsequently the above construction needs to be modified slightly

Elementary extreme value theory considerations for the maximum of finitelymany Gaussian random variables and their analogues for continuous fields givethat the overshoot of f over a given level b will be of order Θ(1b) Thus in orderto keep τb reasonably close to the excursion set we shall also consider the possibilityof an undershoot of size Θ (1b) right at τb As we shall see this relaxation will allowus to prevent the variance in (62) becoming infinite Thus instead of (61) we shallconsider τbminusab with density

(63) hbminusab(t) =P (f(t) gt bminus ab)

E[m(Abminusab)]

for some a gt 0 To ease on later notation write

γab bminus ab τγab= τbminusab

Let Qprime be the change of measure induced by sampling f as follows Given τγab

sample f(τγab) conditional on f(τγab

) gt γab In turn the rest of f follows its con-ditional distribution (under the nominal or original measure) given the observedvalue f(τγab

) We then have that the corresponding Radon-Nikodym derivative is

(64)dP

dQprime =E[m(Aγab

)]m(Aγab

)

and the importance sampling estimator Lprimeb is

(65) Lprimeb =dP

dQprime 1 (Ab 6= empty) =E[m(Aγab

)]m(Aγab

)1 (m(Ab) gt 0)

Note that we have used the continuity of the field in order to write m (Ab) gt 0 =Ab 6= empty almost surely The motivation behind this choice lies in the fact that

14 ADLER BLANCHET LIU

since m(Aγab) gt m (Ab) gt 0 the denominator may now be big enough to control

the second moment of the estimator As we shall see introducing the undershootof size ab will be very useful in the technical development both in the remainderof this section and in Section 7 In addition its introduction also provides insightinto the appropriate form of the estimator needed when discretizing the field

62 Algorithm and Analysis We still need to face the problem of generatingf in a computer Thus we now concentrate on a suitable discretization schemestill having in mind the change of measure leading to (64) Since our interest isto ultimately design algorithms that are efficient for estimating expectations suchas E[Γ (f) |flowast gt b] where Γ may be a functional of the whole field we shall use aglobal discretization scheme

Consider U = (U1 UM ) where Ui are iid uniform random variables takingvalues in T and independent of the field f Set TM = U1 UM and Xi =Xi (Ui) = f (Ui) for 1 le i le M Then X = (X1 Xm) (conditional on U) isa multivariate Gaussian random vector with conditional means micro(Ui)

∆= E(Xi|Ui)and covariances C (Ui Uj)

∆= Cov (Xi Xj |Ui Uj) Our strategy is to approximatew(b) by

wM (γab) = P (maxtisinTM

f(t) gt γab) = E[P ( max1leileM

Xi gt γab|U)]

Given the development in Section 5 it might not be surprising that if we can ensurethat M = M (ε b) is polynomial in 1ε and b then we shall be in a good positionto develop a FPRAS The idea is to apply an importance sampling strategy similarto that we considered in the construction of Lprimeb of (65) but this time it will beconditional on U In view of our earlier discussions we propose sampling from Qprimeprime

defined via

Qprimeprime (X isin B|U) =Msum

i=1

pU (i)P [X isin B|Xi gt γab U ]

where

pU (i) =P (Xi gt γab|U)sumM

j=1 P (Xj gt γab|U)

We then obtain the (conditional) importance sampling estimator

(66) Lb(U) =sumM

i=1 P (Xi gt γab|U)sumMi=1 1(Xi gt γab)

It is clear thatwM (γab) = EQprimeprime [Lb(U)]

Suppose for the moment that M a and the number of replications n have beenchosen Our future analysis will in particular guide the selection of these parametersThen the procedure is summarized by the next algorithm

MONTE CARLO FOR RANDOM FIELDS 15

Algorithm 61 The algorithm has three steps

Step 1 Simulate U (1) U (n) which are n iid copies of the vector U = (U1 UM )described aboveStep 2 Conditional on each U (i) for i = 1 n generate L

(i)b (U (i)) as described

by (66) by considering the distribution of X(i)(U (i)) = (X(i)1 (U (i)

1 ) X(i)M (U (i)

M ))Generate the X(i)(U (i)) independently so that at the end we obtain that the L

(i)b (U (i))

are n iid copies of Lb(U)Step 3 Output

Ln(U (1) U (n)) =1n

nsum

i=1

L(i)b (U (i))

63 Running Time of Algorithm 61 Bias and Variance Control The remain-der of this section is devoted to the analysis of the running time of the Algorithm61 The first step lies in estimating the bias and second moment of Lb(U) underthe change of measure induced by the sampling strategy of the algorithm whichwe denote by Qprimeprime We start with a simple bound for the second moment

Proposition 62 There exists a finite λ0 depending on microT = maxtisinT |micro (t)|and σ2

T = maxtisinT σ2 (t) for which

EQprimeprime [Lb(U)2] le λ0M2P

(maxtisinT

f (t) gt b

)2

Proof Observe that

EQprimeprime [Lb(U)2] le E

(Msum

i=1

P (Xi gt γab|Ui)

)2

le E

(Msum

i=1

suptisinT

P (f (Ui) gt γab|Ui = t)

)2

= M2 maxtisinT

P (f (t) gt γab)2

le λ0M2 max

tisinTP (f (t) gt b)2

le λ0M2P

(maxtisinT

f (t) gt b

)2

which completes the proof 2

Next we obtain a preliminary estimate of the bias

16 ADLER BLANCHET LIU

Proposition 63 For each M ge 1 we have

|w (b)minus wM (b)| le E[exp

(minusMm(Aγab

)m(T )

) Ab cap T 6= empty]

+ P

(maxtisinT

f (t) gt γab maxtisinT

f (t) le b

)

Proof Note that

|w (b)minus wM (γab)|le P (max

tisinTM

f (t) le γab maxtisinT

f (t) gt b) + P (maxtisinTM

f (t) gt γab maxtisinT

f (t) le b)

The second term is easily bounded by

P (maxtisinTM

f (t) gt γabmaxtisinT

f (t) le b) le P (maxtisinT

f (t) gt γab maxtisinT

f (t) le b)

The first term can be bounded as follows

P

(maxtisinTM

f (t) le γabmaxtisinT

f (t) gt b

)

le E[(P [f (Ui) le γab| f ])M 1(Ab cap T 6= empty)

]

le E[(

1minusm(Aγab)m (T )

)M Ab cap T 6= empty)]

le E[exp

(minusMm(Aγab)m (T )

) Ab cap T 6= empty)

]

This completes the proof 2

The previous proposition shows that controlling the relative bias of Lb(U) re-quires finding bounds for

(67) E[exp(minusMm(Aγab

)m(T )) Ab cap T 6= empty]

and

(68) P (maxtisinT

f (t) gt γabmaxtisinT

f (t) le b)

and so we develop these To bound (67) we take advantage of the importancesampling strategy based on Qprime introduced earlier in (64) Write

E[exp

(minusMm(Aγab)m (T )

) m(Ab) gt 0

](69)

= EQprime(

exp(minusMm(Aγab

)m (T ))

m(Aγab)

m(Ab) gt 0

)Em

(Aγab

)

MONTE CARLO FOR RANDOM FIELDS 17

Furthermore note that for each α gt 0 we have

EQprime(

exp(minusMm

(Aγab

)m (T )

)

m(Aγab

) m(Ab) gt 0

)(610)

le αminus1 exp (minusMαm(T ))

+ EQprime(

exp(minusMm(Aγab

)m(T ))

m(Aγab

) m(Aγab) le αm(Ab) gt 0

)

The next result whose proof is given in Section 64 gives a bound for the aboveexpectation

Proposition 64 Let β be as in Conditions A2 and A3 For any v gt 0 thereexist constants κ λ2 isin (0infin) (independent of a isin (0 1) and b but dependent on v)such that if we select

αminus1 ge κdβ(ba)(2+v)2dβ

and define W such that P (W gt x) = exp(minusxβd

)for x ge 0 then

EQprime(

exp(minusMm(Aγab

)m (T ))

m(Aγab)

m(Aγab) le αm(Ab) gt 0

)(611)

le EW 2 m (T )

λ2dβ2 M

(b

a

)4dβ

The following result gives us a useful upper bound on (68) The proof is givenin Section 65

Proposition 65 Assume that Conditions A2 and A3 are in force For anyv gt 0 let ρ = 2dβ + dv + 1 where d is the dimension of T There exist constantsb0 λ isin (0infin) (independent of a but depending on microT = maxtisinT |micro (t)| σ2

T =maxtisinT σ2 (t) v the Holder parameters β and κH) so that for all b ge b0 ge 1 wehave

(612) P

(maxtisinT

f(t) le b + ab | maxtisinT

f(t) gt b

)le λabρ

Consequently

P

(maxtisinT

f (t) gt γab maxtisinT

f (t) le b

)

= P

(maxtisinT

f (t) le b|maxtisinT

f (t) gt γab

)P

(maxtisinT

f(t) gt γab

)

le λabρP

(maxtisinT

f(t) gt γab

)

18 ADLER BLANCHET LIU

Moreover

P

(maxtisinT

f(t) gt γab

)(1minus λabρ) le P

(maxtisinT

f(t) gt b

)

Propositions 64 and 65 allow us to prove Theorem 31 which is rephrased inthe form of the following theorem which contains the detailed rate of complexityand so the main result of this section

Theorem 66 Suppose f is a Gaussian random field satisfying conditions A1ndashA4 in Section 3 Given any v gt 0 put a = ε(4λbρ) (where λ and ρ as in Proposition65) and αminus1 = κdβ(ba)(2+v)dβ Then there exist c ε0 gt 0 such that for allε le ε0

(613) |w (b)minus wM (γab)| le w (b) ε

if M =lceilcεminus1(ba)(4+4v)dβ

rceil Consequently by our discussion in Section 2 and the

bound on the second moment given in Proposition 62 it follows that Algorithm 61provides a FPRAS with running time O

((M)3 times (M)2 times εminus2δminus1

)

Proof Combining (63) (69) and (610) with Propositions 63ndash65 we have that

|w (b)minus wM (b) |le αminus1 exp (minusMαm (T )) E

[m

(Aγab

)]

+E[W 2]m (T )

λ2dβ2 M

(b

a

)4dβ

E[m

(Aγab

)]+

(λabρ

1minus λabρ

)w (b)

Furthermore there exists a constant K isin (0infin) such that

Em(Aγab

) le K maxtisinT

P (f (t) gt b)m (T ) le Kw (b) m (T )

Therefore we have that

|w (b)minus wM (b) |w (b)

le αminus1Km(T ) exp (minusMαm (T ))

+E[W 2]Km (T )2

λ2dβ2 M

(b

a

)4dβ

+(

λabρ

1minus λabρ

)

Moreover since a = ε(4λbρ) we obtain that for ε le 12

|w (b)minus wN (b) |w (b)

le αminus1Km(T ) exp (minusMαm (T ))

+[EW 2]Km (T )2

λ2dβ2 M

(b

a

)4dβ

+ ε2

MONTE CARLO FOR RANDOM FIELDS 19

From the selection of α M and θ it follows easily that the first two terms on theright hand side of the previous display can be made less than ε2 for all ε le ε0 bytaking ε0 sufficiently small

The complexity count given in the theorem now corresponds to the followingestimates The factor O((M)3) represents the cost of a Cholesky factorization re-quired to generate a single replication of a finite field of dimension M In additionthe second part of Proposition 62 gives us that O(M2εminus2δminus1) replications are re-quired to control the relative variance of the estimator 2

We now proceed to prove Propositions 64 and 65

64 Proof of Proposition 64 We concentrate on the analysis of the left handside of (611) An important observation is that conditional on the random variableτγab

with distribution

Qprime(τγabisin middot) =

E[m(Aγab

cap middot)]

E[m(Aγab

)]

and given f(τγab) the rest of the field namely (f (t) t isin Tτγab

) is anotherGaussian field with a computable mean and covariance structure The second termin (610) indicates that we must estimate the probability that m(Aγab

) takes smallvalues under Qprime For this purpose we shall develop an upper bound for

(614) P(m(Aγab

) lt yminus1 m(Ab) gt 0∣∣ f (t) = γab + zγab

)

for y large enough Our arguments proceeds in two steps For the first in order tostudy (614) we shall estimate the conditional mean covariance of f (s) s isin Tgiven that f (t) = γab + zγab Then we use the fact that the conditional field isalso Gaussian and take advantage of general results from the theory of Gaussianrandom fields to obtain a bound for (614) For this purpose we recall some usefulresults from the theory of Gaussian random fields The first result is due to Dudley[14]

Theorem 67 Let U be a compact subset of Rn and let f0(t) t isin U be amean zero continuous Gaussian random field Define the canonical metric d on Uas

d (s t) =radic

E[f0 (t)minus f0 (s)]2

and put diam (U) = supstisinU d (s t) which is assumed to be finite Then there existsa finite universal constant κ gt 0 such that

E[maxtisinU

f0 (t)] le κ

int diam(U)2

0

[log (N (ε))]12dε

where the entropy N (ε) is the smallest number of dminusballs of radius ε whose unioncovers U

20 ADLER BLANCHET LIU

The second general result that we shall need is the so-called B-TIS (Borel-Tsirelson-Ibragimov-Sudakov) inequality [4 11 26]

Theorem 68 Under the setting described in Theorem 67

P

(maxtisinU

f0 (t)minus E[maxtisinU

f0 (t)] ge b

)le exp

(minusb2(2σ2U )

)

whereσ2U = max

tisinUE[f2

0 (t)]

We can now proceed with the main proof We shall assume from now on thatτγab

= 0 since as will be obvious from what follows all estimates hold uniformlyover τγab

isin T This is a consequence of the uniform Holder assumptions A2 andA3 Define a new process f

(f (t) t isin T ) L= (f (t) t isin T | f (0) = γab + zγab)

Note that we can always write f (t) = micro (t)+g (t) where g is a mean zero Gaussianrandom field on T We have that

micro (t) = Ef (t) = micro (t) + σ (0)minus2C (0 t) (γab + zγab minus micro (0))

and that the covariance function of f is given by

Cg (s t) = Cov (g (s) g (t)) = C (s t)minus σ (0)minus2C (0 s)C (0 t)

The following lemma describes the behavior of micro (t) and Cg (t s)

Lemma 69 Assume that |s| and |t| small enough Then the following threeconclusions hold

(i) There exist constants λ0 and λ1 gt 0 such that

|micro (t)minus (γab + zγab)| le λ0 |t|β + λ1 |t|β (γab + zγab)

and for all z isin (0 1) and γab large enough

|micro (s)minus micro (t) | le κHγab|sminus t|β

(ii)Cg (s t) le 2κHσ (t)σ (s) |t|β + |s|β + |tminus s|β

(iii)

Dg (s t) =radic

E([g (t)minus g (s)]2) le λ121 |tminus s|β2

Proof All three consequences follow from simple algebraic manipulations Thedetails are omitted 2

MONTE CARLO FOR RANDOM FIELDS 21

Proposition 610 For any v gt 0 there exist κ and λ2 such that for all t isin T yminusβd le a2+v

κb(2+v) a sufficiently small and z gt 0

P(m(Aγab

)minus1 gt y m(Ab) gt 0∣∣ f (t) = γab + zγab

) le exp(minusλ2a

2yβdb2)

Proof For notational simplicity and without loss of generality we assume thatt = 0 First consider the case that z ge 1 Then there exist c1 c2 such that for allc2y

minusβd lt bminus2minusv and z gt 1

P(m

(Aγab

cap T)minus1

gt ym(Ab) gt 0| f (0) = γab + zγab

)

le P

(inf

|t|ltc1yminus1df(t) le γab|f (0) = γab + zγab

)

= P

(inf

|t|ltc1yminus1dmicro (t) + g(t) le γab

)

le P

(inf

|t|ltc1yminus1dg(t) le minus 1

2γab

)

Now apply (iii) from Lemma 69 from which it follows that N (ε) le c3m(T )ε2dβ

for some constant c3 By Theorem 67 E(sup|t|ltc1yminus1d f(t)) = O(yminusβ(2d) log y)By Theorem 68 for some constant c4

P(m

(Aγab

cap T)minus1

gt y m(Ab) gt 0| f (0) = γab + zγab

)

le P

(inf

|t|ltc1yminus1dg(t) le minus 1

2γab

)

le exp

(minus 1

c4γ2aby

minusβd

)

for c2yminusβd lt bminus2minusv and z gt 1

Now consider the case z isin (0 1) Let tlowast be the global maximum of f(t) Then

P(m

(Aγab

cap T)minus1

gt y m(Ab) gt 0|f (0) = γab + zγab

)

le P

(inf

|tminustlowast|ltc1yminus1df(t) lt γab f(tlowast) gt b|f (0) = γab + zγab

)

le P

(sup

|sminust|ltc1yminus1d

|f(s)minus f(t)| gt ab|f (0) = γab + zγab

)

Consider the new field ξ(s t) = g(s)minus g(t) with parameter space T times T Note thatradic

V ar(ξ(s t)) = Dg(s t) le λ1|sminus t|β2

Via basic algebra it is not hard to show that the entropy of ξ(s t) is bounded byNξ (ε) le c3m(T times T )ε2dβ In addition from (i) of Lemma 69 we have

|micro (s)minus micro (t) | le κHγab|sminus t|β

22 ADLER BLANCHET LIU

Similarly for some κ gt 0 and all yminusβd le 1κ

(ab

)2+v a lt 1 there exists c5 such that

P(m

(Aγab

cap T)minus1

gt y m(Ab) gt 0|f (0) = γab + zγab

)

le P

(sup

|sminust|ltc1yminus1d

|f(s)minus f(t)| gt ab|f (0) = γab + zγab

)

= P

(sup

|sminust|ltc1yminus1d

|ξ(s t)| gt a

2b

)

le exp(minus a2

c5b2yminusβd

)

Combining the two cases z gt 1 and z isin (0 1) and choosing c5 large enough we have

P(m

(Aγab

cap T)minus1

gt y m(Ab) gt 0∣∣ f (0) = γab + zγab

)

le exp(minus a2

c5b2yminusβd

)

for a small enough and yminusβd le 1κ

(ab

)2+v Renaming the constants completes theproof 2

The final ingredient needed for the proof of Proposition 64 is the following lemmainvolving stochastic domination The proof follows an elementary argument and istherefore omitted

Lemma 611 Let v1 and v2 be finite measures on R and define ηj(x) =intinfin

xvj(ds)

Suppose that η1 (x) ge η2 (x) for each x ge x0 Let (h (x) x ge x0) be a non-decreasing positive and bounded function Then

int infin

x0

h (s) v1(dx) geint infin

x0

h (s) v2 (ds)

Proof of Proposition 64 Note that

EQprime(exp

(minusMm(Aγab

)m (T )

)

m(Aγab

) m(Aγab

) isin (0 α) m(Ab) gt 0)

=int

T

int infin

z=0

E(exp

(minusMm(Aγab

)m (T )

)

m (Aγab)

m(Aγab

) isin (0 α) m(Ab) gt 0∣∣∣

f (t) = γab +z

γab

)P

(τγab

isin dt)P (γab[f (t)minus γab] isin dz| f (t) gt γab)

le supzgt0tisinT

E(exp

(minusMm(Aγab

)m (T )

)

m (Aγab)

m(Aγab

) isin (0 α) m(Ab) gt 0∣∣∣

f (t) = γab +z

γab

)

MONTE CARLO FOR RANDOM FIELDS 23

Now define Y (ba) = (ba)2dβλminusdβ2 W with λ2 as chosen in Proposition 610

and W with distribution given by P (W gt x) = exp(minusxβd) By Lemma 611

suptisinT

E(exp

(minusMm(Aγab

)m (T )

)

m (Aγab)

m(Aγab

) isin (0 α) m(Ab) gt 0∣∣∣

f (t) = γab +z

γab

)

le E[Y (ba) exp(minusMY (ba)minus1m(T ))Y (ba) gt αminus1

]

Now let Z be exponentially distributed with mean 1 and independent of Y (ba)Then we have (using the definition of the tail distribution of Z and Chebyshevrsquosinequality)

exp(minusMY (ba)minus1m (T )

)= P

(Z gt MY (ba)minus1m (T )

∣∣ Y (ba))

le m (T )Y (ba)M

Therefore

E[exp(minusMY (ba)minus1m (T )

)Y (ba) Y (ba) ge αminus1]

le m (T )M

E[Y (ba)2 Y (ba) ge αminus1]

le m (T )

Mλ2dβ2

(b

a

)4dβ

E(W 2)

which completes the proof 2

65 Proof of Proposition 65 We start with the following result of Tsirelson[27]

Theorem 612 Let f be a continuous separable Gaussian process on a compact(in the canonical metric) domain T Suppose that V ar (f) = σ is continuous andthat σ (t) gt 0 for t isin T Moreover assume that micro = Ef is also continuous andmicro(t) ge 0 for all t isin T Define

σ2T

∆= maxtisinT

Var(f(t))

and set F (x) = PmaxtisinT f(t) le x Then F is continuously differentiable on RFurthermore let y be such that F (y) gt 12 and define ylowast by

F (y) = Φ(ylowast)

Then for all x gt y

F prime(x) le Ψ(

xylowasty

)(xylowasty

(1 + 2α) + 1)

(1 + α)

24 ADLER BLANCHET LIU

where

α =y2

x(xminus y)y2lowast

We can now prove the following lemma

Lemma 613 There exists a constant A isin (0infin) independent of a and b ge 0such that

(615) P

(suptisinT

f(t) le b + ab | suptisinT

f(t) gt b

)le aA

P (suptisinT f (t) ge bminus 1b)P (suptisinT f (t) ge b)

Proof By subtracting inftisinT micro (t) gt minusinfin and redefining the level b to be b minusinftisinT micro (t) we may simply assume that Ef (t) ge 0 so that we can apply Theorem612 Adopting the notation of Theorem 612 first we pick b0 large enough so thatF (b0) gt 12 and assume that b ge b0 + 1 Now let y = bminus 1b and F (y) = Φ(ylowast)Note that there exists δ0 isin (0infin) such that δ0b le ylowast le δminus1

0 b for all b ge b0 Thisfollows easily from the fact that

logP

suptisinT

f(t) gt x

sim log sup

tisinTP f(t) gt x sim minus x2

2σ2T

On the other hand by Theorem 612 F is continuously differentiable and so

(616) P

suptisinT

f(t) lt b + ab | suptisinT

f(t) gt b

=

int b+ab

bF prime(x) dx

P suptisinT f(t) gt b

Moreover

F prime(x) le(

1minus Φ(

xylowasty

)) (xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x))

le (1minus Φ(ylowast))(

xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x))

= P

(maxtisinT

f (t) gt bminus 1b

)(xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x))

Thereforeint b+ab

b

F prime(x) dx

le P

(suptisinT

f (t) gt bminus 1b

) int b+ab

b

(xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x)) dx

Recall that α (x) = y2[x(xminus y)y2lowast] we can use the fact that ylowast ge δ0b to conclude

that if x isin [b b + ab] then α (x) le δminus20 and therefore

int b+ab

b

(xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x)) dx le 4δminus80 a

MONTE CARLO FOR RANDOM FIELDS 25

We thus obtain that

(617)

int b+ab

bF prime(x) dx

P suptisinT f(t) gt b le 4aδminus80

P suptisinT f(t) gt bminus 1bP suptisinT f(t) gt b

for any b ge b0 This inequality together with the fact that F is continuously differ-entiable on (minusinfininfin) yields the proof of the lemma for b ge 0 2

The previous result translates a question that involves the conditional distri-bution of maxtisinT f (t) near b into a question involving the tail distribution ofmaxtisinT f (t) The next result then provides a bound on this tail distribution

Lemma 614 For each v gt 0 there exists a constant C(v) isin (0infin) (possiblydepending on v gt 0 but otherwise independent of b) so that such that

P

(maxtisinT

f (t) gt b

)le C(v)b2dβ+dv+1 max

tisinTP (f (t) gt b)

for all b ge 1

Proof The proof of this result follows along the same lines of Theorem 262 in[5] Consider an open cover of T = cupM

i=1Ti(θ) where Ti(θ) = s |s minus ti| lt θWe choose ti carefully such that N(θ) = O(θminusd) for θ arbitrarily small Writef (t) = g (t)+micro (t) where g (t) is a centered Gaussian random field and note usingA2 and A3 that

P

(max

tisinTi(θ)f (t) gt b

)le P

(max

tisinTi(θ)g (t) gt bminus micro (ti)minus κHθβ

)

Now we wish to apply the Borel-TIS inequality (Theorem 68) with U = Ti (θ) f0 =g d (s t) = E12([g (t)minus g (s)]2) which as a consequence of A2 and A3 is boundedabove by C0 |tminus s|β2 for some C0 isin (0infin) Thus applying Theorem 67 we havethat E maxtisinTi(θ) g (t) le C1θ

β2 log(1θ) for some C1 isin (0infin) Consequently theBorel-TIS inequality yields that there exists C2 (v) isin (0infin) such that for all bsufficiently large and θ sufficiently small we have

P

(max

tisinTi(θ)g (t) gt bminus micro (ti)minus κHθβ

)le C2 (v) exp

(minus

(bminus micro (ti)minus C1θ

β(2+βv))2

2σ2Ti

)

where σTi = maxtisinTi(θ) σ (t) Now select v gt 0 small enough and set θβ(2+βv) =bminus1 Straightforward calculations yield that

P

(max

tisinTi(θ)f (t) gt b

)le P

(max

tisinTi(θ)g (t) gt bminus micro (ti)minus κHθβ

)

le C3 (v) maxtisinTi(θ)

exp

(minus (bminus micro (t))2

2σ (t)2

)

26 ADLER BLANCHET LIU

for some C3(v) isin (0infin) Now recall the well known inequality (valid for x gt 0)that

φ (x)(

1xminus 1

x3

)le 1minus Φ(x) le φ (x)

x

where φ = Φprime is the standard Gaussian density Using this inequality it follows thatC4 (v) isin (0infin) can be chosen so that

maxtisinTi(θ)

exp

(minus (bminus micro (t))2

2σ (t)2

)le C4 (v) bmax

tisinTi

P (f (t) gt b)

for all b ge 1 We then conclude that there exists C (v) isin (0infin) such that

P

(maxtisinT

f (t) gt b

)le N (θ)C4 (v) bmax

tisinTP (f (t) gt b)

le CθminusdbmaxtisinT

P (f (t) gt b) = Cb2dβ+dv+1 maxtisinT

P (f (t) gt b)

giving the result 2

We can now complete the proof of Proposition 65

Proof of Proposition 6.5. The result is a straightforward corollary of the previous two lemmas. By (6.15) in Lemma 6.13 and by Lemma 6.14, there exists λ ∈ (0,∞) for which
\begin{align*}
P\Big(\max_{t\in T} f(t) \le b + a/b\ \Big|\ \max_{t\in T} f(t) > b\Big)
&\le aA\,\frac{P(\max_{t\in T} f(t) \ge b - 1/b)}{P(\max_{t\in T} f(t) \ge b)} \\
&\le aCA\, b^{2d/\beta+dv+1}\,\frac{\max_{t\in T} P(f(t) > b - 1/b)}{P(\max_{t\in T} f(t) \ge b)} \\
&\le aCA\, b^{2d/\beta+dv+1}\,\frac{\max_{t\in T} P(f(t) > b - 1/b)}{\max_{t\in T} P(f(t) > b)} \\
&\le a\lambda\, b^{2d/\beta+dv+1}.
\end{align*}
The last two inequalities follow from the obvious bound
\[
P\Big(\max_{t\in T} f(t) \ge b\Big) \;\ge\; \max_{t\in T} P(f(t) > b)
\]
and standard properties of the Gaussian distribution. This yields (6.12), from which the remainder of the proposition follows. \Box


7. Fine Tuning: Twice Differentiable Homogeneous Fields. In the preceding section we constructed a polynomial time algorithm based on a randomized discretization scheme. Our goal in this section is to illustrate how to take advantage of additional information in order to further improve the running time and the efficiency of the algorithm. To illustrate our techniques, we shall perform a more refined analysis in the setting of smooth and homogeneous fields, and shall establish optimality of the algorithm in a precise sense to be described below. Our assumptions throughout this section are B1–B2 of Section 3.

Let C(s - t) = Cov(f(s), f(t)) be the covariance function of f, which we now also assume to have mean zero. Note that it is an immediate consequence of homogeneity and differentiability that \partial_i C(0) = \partial^3_{ijk} C(0) = 0.

We shall need the following definition.

Definition 7.1. We call \mathcal{T} = \{t_1,\dots,t_M\} \subset T a θ-regular discretization of T if and only if
\[
\min_{i\neq j}|t_i - t_j| \ge \theta, \qquad \sup_{t\in T}\min_i |t_i - t| \le 2\theta.
\]

Regularity ensures that the points in the grid \mathcal{T} are well separated. Intuitively, since f is smooth, tight clusters of points translate into a waste of computing resources, as a result of sampling highly correlated values of f. Note also that every region containing a ball of radius 2θ has at least one representative in \mathcal{T}; thus \mathcal{T} covers the domain T in an economical way. One technical convenience of θ-regularity is that, for subsets A ⊆ T of positive Lebesgue measure (in particular, ellipsoids),
\[
\lim_{M\to\infty}\frac{\sharp(A\cap\mathcal{T})}{M} = \frac{m(A)}{m(T)},
\]
where, here and throughout the remainder of the section, \sharp(A) denotes the cardinality of the set A.
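To make the definition concrete, the following is a minimal sketch (ours, not part of the paper) of a θ-regular discretization of a box T = [0, side]^d, built from a cubic lattice of spacing θ. The helper name and the box domain are our illustrative assumptions.

import numpy as np

def theta_regular_grid(theta, d, side=1.0):
    # Cubic lattice of spacing theta on T = [0, side]^d.
    # Distinct lattice points are at least theta apart, and every t in T
    # lies within theta * sqrt(d) of a lattice point, which is <= 2 * theta
    # whenever d <= 4; for larger d the spacing constant must be shrunk.
    n = int(np.floor(side / theta)) + 1
    axes = [np.linspace(0.0, (n - 1) * theta, n) for _ in range(d)]
    mesh = np.meshgrid(*axes, indexing="ij")
    return np.stack([m.ravel() for m in mesh], axis=1)  # shape (n**d, d)

# With theta = eps / b, the lattice has Theta((b / eps)**d) points,
# matching the bound #(T) <= c1 m(T) eps**(-d) b**d in Proposition 7.2 below.
grid = theta_regular_grid(theta=0.1 / 3.0, d=2)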

Let \mathcal{T} = \{t_1,\dots,t_M\} be a θ-regular discretization of T and consider
\[
X = (X_1,\dots,X_M)^\top \overset{\Delta}{=} (f(t_1),\dots,f(t_M))^\top.
\]
We shall concentrate on estimating w_M(b) = P(\max_{1\le i\le M} X_i > b). The next result (which we prove in Section 7.1) shows that, if θ = ε/b, then the relative bias is O(√ε).

Proposition 7.2. Suppose f is a Gaussian random field satisfying conditions B1 and B2. There exist c_0, c_1, b_0 and ε_0 such that, for any finite ε/b-regular discretization \mathcal{T} of T,

(7.1) \[
P\Big(\sup_{t\in\mathcal{T}} f(t) < b\ \Big|\ \sup_{t\in T} f(t) > b\Big) \le c_0\sqrt{\varepsilon} \quad\text{and}\quad \sharp(\mathcal{T}) \le c_1 m(T)\varepsilon^{-d}b^d,
\]

for all ε ∈ (0, ε_0] and b > b_0.

Note that the bound on the bias obtained for twice differentiable fields is much sharper than that for the general Hölder continuous fields given by (6.13) in Theorem 6.6. This is partly because the conditional distribution of the random field around local maxima is harder to describe in the Hölder continuous case than in the twice differentiable one. In addition to the sharper description of the bias, we shall also soon show, in Theorem 7.4, that our choice of discretization is optimal in a certain sense. Finally, we point out that the bound of √ε on the first term of (7.1) is not optimal. In fact, there seems to be some room for improvement, and we believe that a more careful analysis might yield a bound of the form c_0ε^2.

We shall estimate w_M(b) by using a slight variation of Algorithm 5.3. In particular, since the X_i are now identically distributed, we redefine Q to be

(7.2) \[
Q(X \in B) = \sum_{i=1}^M \frac{1}{M}\, P[X \in B \mid X_i > b - 1/b].
\]

Our estimator then takes the form

(7.3) \[
L_b = \frac{M\times P(X_1 > b - 1/b)}{\sum_{j=1}^M 1(X_j > b - 1/b)}\,1\Big(\max_{1\le i\le M} X_i > b\Big).
\]

Clearly, we have that E_Q(L_b) = w_M(b). (The reason for subtracting the factor of 1/b was explained in Section 6.1.)

Algorithm 7.3. Given a number of replications n and an ε/b-regular discretization \mathcal{T}, the algorithm is as follows:

Step 1. Sample X^{(1)},\dots,X^{(n)}, i.i.d. copies of X, with distribution Q given by (7.2).
Step 2. Compute and output
\[
L_n = \frac{1}{n}\sum_{i=1}^n L_b^{(i)}, \qquad\text{where}\qquad L_b^{(i)} = \frac{M\times P(X_1 > b - 1/b)}{\sum_{j=1}^M 1\big(X_j^{(i)} > b - 1/b\big)}\,1\Big(\max_{1\le j\le M} X_j^{(i)} > b\Big).
\]
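The following is a minimal NumPy/SciPy sketch (ours, not from the paper) of Algorithm 7.3 for a mean zero, unit variance field observed on the grid, with Sigma[i, j] = C(t_i - t_j). The function names and the numerical jitter are our illustrative choices; a production version would compute the Cholesky factor of Σ once and reuse it across replications, which is precisely the O(M^3) cost in the complexity count below.

import numpy as np
from scipy.stats import norm

def sample_Q(Sigma, b, rng):
    # One draw of X from the mixture measure Q of (7.2): pick a coordinate
    # i uniformly (the mixture weights are 1/M since the X_i are identically
    # distributed here), force X_i > b - 1/b via inverse-cdf sampling of a
    # truncated normal, then draw the remaining coordinates from their
    # nominal conditional law given X_i.
    M = Sigma.shape[0]
    i = rng.integers(M)
    gamma = b - 1.0 / b
    xi = norm.ppf(rng.uniform(norm.cdf(gamma), 1.0))
    s = Sigma[:, i]                       # Cov(X, X_i); Var(X_i) = 1
    cond_mean = s * xi
    cond_cov = Sigma - np.outer(s, s)     # singular in row/column i
    chol = np.linalg.cholesky(cond_cov + 1e-10 * np.eye(M))  # tiny jitter
    x = cond_mean + chol @ rng.standard_normal(M)
    x[i] = xi
    return x

def algorithm_7_3(Sigma, b, n, rng=None):
    # Average of n i.i.d. copies of L_b from (7.3); the denominator is
    # always >= 1 because the tilted coordinate exceeds b - 1/b.
    rng = rng or np.random.default_rng()
    M = Sigma.shape[0]
    tail = norm.sf(b - 1.0 / b)           # P(X_1 > b - 1/b)
    draws = [sample_Q(Sigma, b, rng) for _ in range(n)]
    return np.mean([M * tail * (x.max() > b) / np.sum(x > b - 1.0 / b)
                    for x in draws])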

Theorem 7.5 below guides the selection of n needed to achieve a prescribed relative error. In particular, our analysis, together with the considerations of Section 2, implies that choosing n = O(ε^{-2}δ^{-1}) suffices to achieve ε relative error with probability at least 1 - δ.

Algorithm 7.3 improves on Algorithm 6.1 for Hölder continuous fields in two important ways. The first is that it is possible to obtain information on the size of the relative bias of the estimator. In Proposition 7.2 we saw that, in order to overcome the bias due to discretization, it suffices to take a discretization of size M = \sharp(\mathcal{T}) = \Theta(b^d). That this selection is also asymptotically optimal, in the sense described in the next result, will be proven in Section 7.1.


Theorem 7.4. Suppose f is a Gaussian random field satisfying conditions B1 and B2. If θ ∈ (0,1), then, as b → ∞,
\[
\sup_{\sharp(\mathcal{T})\le b^{\theta d}} P\Big(\sup_{t\in\mathcal{T}} f(t) > b\ \Big|\ \sup_{t\in T} f(t) > b\Big) \to 0.
\]

This result implies that the relative bias goes to 100% as b → ∞ if one chooses a discretization scheme of size O(b^{θd}) with θ ∈ (0,1). Consequently, d is the smallest power of b that achieves any given bounded relative bias, and so the suggestion above of choosing M = O(b^d) points for the discretization is, in this sense, optimal.

The second aspect of improvement involves the variance. In the case of Hölder continuous fields, the ratio of the second moment of the estimator to w(b)^2 was shown to be bounded by a quantity of order O(M^2). In contrast, in the context of the smooth and homogeneous fields considered here, the next result shows that this ratio is bounded uniformly for b > b_0 and M = \sharp(\mathcal{T}) \ge cb^d. That is, the variance remains strongly controlled.

Theorem 7.5. Suppose f is a Gaussian random field satisfying conditions B1 and B2. Then there exist constants c, b_0 and ε_0 such that, for any ε/b-regular discretization \mathcal{T} of T, we have
\[
\sup_{b>b_0,\ \varepsilon\in[0,\varepsilon_0]} \frac{E_Q L_b^2}{P^2\big(\sup_{t\in T} f(t) > b\big)} \;\le\; \sup_{b>b_0,\ \varepsilon\in[0,\varepsilon_0]} \frac{E_Q L_b^2}{P^2\big(\sup_{t\in\mathcal{T}} f(t) > b\big)} \;\le\; c.
\]

The proof of this result is given in Section 7.2. The fact that the number of replications remains bounded in b is a consequence of the strong control on the variance.

Finally, we note that the proof of Theorem 3.3 follows as a direct corollary of Theorem 7.5, together with Proposition 7.2 and our discussion in Section 2. Assuming that placing each point in \mathcal{T} takes no more than c units of computer time, the total complexity is, according to the discussion in Section 2, O(nM^3 + M) = O(ε^{-2}δ^{-1}M^3 + M). The contribution of the term M^3 = O(ε^{-3d}b^{3d}) comes from the complexity of applying a Cholesky factorization, and the term M = O(ε^{-d}b^d) corresponds to the complexity of placing \mathcal{T}.
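As a quick sanity check on these orders of magnitude, the following arithmetic sketch (ours; all constants set to one, following the conservative count above in which the Cholesky factorization is charged to every replication) evaluates the cost for a small example:

# Cost model from the discussion above, constants suppressed:
# M ~ eps**(-d) * b**d grid points, n ~ eps**(-2) * delta**(-1) replications,
# total work ~ n * M**3 + M.
d, b, eps, delta = 2, 5.0, 0.1, 0.05
M = int(eps ** (-d) * b ** d)          # 2500 points
n = int(eps ** (-2) * delta ** (-1))   # 2000 replications
total = n * M ** 3 + M                 # polynomial in b and 1/eps, in contrast
print(M, n, total)                     # to the exp(b**2/2)-type cost of naive MC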

Remark 7.6. Condition B2 imposes a convexity assumption on the boundary of T. This assumption, although convenient in the development of the proofs of Theorems 7.4 and 7.5, is not necessary. The results can be generalized, at the expense of increasing the length and burden of the technical development, to the case in which T is a d-dimensional manifold satisfying the so-called Whitney conditions ([4]).

The remainder of this section is devoted to the proofs of Proposition 7.2, Theorem 7.4 and Theorem 7.5.


7.1. Bias Control: Proofs of Proposition 7.2 and Theorem 7.4. We start with some useful lemmas, for all of which we assume that conditions B1 and B2 are satisfied. We shall also assume that the global maximum of f over T is achieved, with probability one, at a single point in T. Additional conditions under which this happens can be found in [4], and require little more than the non-degeneracy of the joint distribution of f and its first and second order derivatives. Of these lemmas, Lemma 7.7, the proof of which we defer to Section 7.3, is of central importance.

Lemma 7.7. For any a_0 > 0 there exist c_*, δ_*, b_0 and δ_0 (which depend on the choice of a_0) such that, for any s ∈ T, a ∈ (0, a_0), δ ∈ (0, δ_0) and b > b_0,
\[
P\Big(\min_{|t-s|<\delta a b^{-1}} f(t) < b\ \Big|\ f(s) = b + a/b,\ s\in L\Big) \le c_*\exp\Big(-\frac{\delta_*}{\delta^2}\Big),
\]
where L is the set of local maxima of f.

The next lemma is a direct corollary of Lemma 7.7.

Lemma 7.8. Let t^* be the point in T at which the global maximum of f is attained. Then, with the same choice of constants as in Lemma 7.7, for any a ∈ (0, a_0), δ ∈ (0, δ_0) and b > b_0, we have
\[
P\Big(\min_{|t-t^*|<\delta a b^{-1}} f(t) < b\ \Big|\ f(t^*) = b + a/b\Big) \le 2c_*\exp\Big(-\frac{\delta_*}{\delta^2}\Big).
\]

Proof. Note that, for any s ∈ T,
\begin{align*}
P\Big(\min_{|t-s|<\delta a/b} f(t) < b\ \Big|\ f(s) = b + a/b,\ s = t^*\Big)
&= \frac{P\big(\min_{|t-s|<\delta a/b} f(t) < b,\ f(s) = b + a/b,\ s = t^*\big)}{P(f(s) = b + a/b,\ s = t^*)} \\
&\le \frac{P\big(\min_{|t-s|<\delta a/b} f(t) < b,\ f(s) = b + a/b,\ s\in L\big)}{P(f(s) = b + a/b,\ s = t^*)} \\
&= P\Big(\min_{|t-s|<\delta a/b} f(t) < b\ \Big|\ f(s) = b + a/b,\ s\in L\Big)\,\frac{P(f(s) = b + a/b,\ s\in L)}{P(f(s) = b + a/b,\ s = t^*)} \\
&\le c_*\exp\Big(-\frac{\delta_*}{\delta^2}\Big)\,\frac{P(f(s) = b + a/b,\ s\in L)}{P(f(s) = b + a/b,\ s = t^*)}.
\end{align*}
Applying this bound and the fact that
\[
\frac{P(f(s) = b + a/b,\ s\in L)}{P(f(s) = b + a/b,\ s = t^*)} \to 1
\]
as b → ∞, the conclusion of the lemma holds. \Box


The following lemma gives a bound on the density of sup_{t∈T} f(t), which will be used to control the size of the overshoot above the level b.

Lemma 7.9. Let p_{f^*}(x) be the density function of sup_{t∈T} f(t). Then there exist constants c_{f^*} and b_0 such that
\[
p_{f^*}(x) \le c_{f^*}\, x^{d+1} P(f(0) > x)
\]
for all x > b_0.

Proof. Recalling (1.3), let the continuous function p_E(x), x ∈ R, be defined by the relationship
\[
E\big(\chi(\{t\in T : f(t)\ge b\})\big) = \int_b^\infty p_E(x)\,dx,
\]
where the left hand side is the expected value of the Euler–Poincaré characteristic of A_b. Then, according to Theorem 8.10 in [7], there exist c and δ such that
\[
|p_E(x) - p_{f^*}(x)| < c\,P(f(0) > (1+\delta)x)
\]
for all x > 0. In addition, thanks to the result of [4], which provides \int_b^\infty p_E(x)\,dx in closed form, there exists c_0 such that, for all x > 1,
\[
p_E(x) < c_0 x^{d+1} P(f(0) > x).
\]
Hence there exists c_{f^*} such that
\[
p_{f^*}(x) \le c_0 x^{d+1}P(f(0) > x) + c\,P(f(0) > (1+\delta)x) \le c_{f^*} x^{d+1} P(f(0) > x)
\]
for all x > 1. \Box

The last ingredients required for the proofs of Proposition 7.2 and Theorem 7.4 are stated in the following well-known theorem, adapted from Lemma 6.1 and Theorem 7.2 of [19] to the twice differentiable case.

Theorem 7.10. There exists a constant H (depending on the covariance function C) such that

(7.4) \[
P\Big(\sup_{t\in T} f(t) > b\Big) = (1 + o(1))\,H\,m(T)\,b^d\,P(f(0) > b)
\]

as b → ∞. Similarly, choose δ small enough so that [0,δ]^d ⊂ T, and let Δ_0 = [0, b^{-1}]^d. Then there exists a constant H_1 such that

(7.5) \[
P\Big(\sup_{t\in\Delta_0} f(t) > b\Big) = (1 + o(1))\,H_1\,P(f(0) > b)
\]

as b → ∞.
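To see the scale of (7.4) explicitly, note that, for a mean zero, unit variance homogeneous field, P(f(0) > b) = 1 - Φ(b), so the Mills ratio bounds quoted in the proof of Lemma 6.14 give the following rewriting (ours, with H the same constant as in the theorem):
\[
P\Big(\sup_{t\in T} f(t) > b\Big) = (1+o(1))\,H\,m(T)\,b^d\,\frac{\phi(b)}{b} = (1+o(1))\,\frac{H\,m(T)}{\sqrt{2\pi}}\,b^{d-1}e^{-b^2/2}.
\]
In particular, only a polynomial factor b^{d-1} separates the tail of the supremum from the tail of a single Gaussian coordinate, which is what makes a polynomial (in b) simulation effort plausible.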


We are now ready to give the proofs of Proposition 7.2 and Theorem 7.4.

Proof of Proposition 7.2. The fact that there exists c_1 such that
\[
\sharp(\mathcal{T}) \le c_1 m(T)\varepsilon^{-d}b^d
\]
is immediate from assumption B2. Therefore we proceed to bound the relative bias. We choose ε_0 < δ_0^2 (with δ_0 as given in Lemmas 7.7 and 7.8) and note that, for any ε < ε_0,
\begin{align*}
P\Big(\sup_{t\in\mathcal{T}} f(t) < b\ \Big|\ \sup_{t\in T} f(t) > b\Big)
&\le P\Big(\sup_{t\in T} f(t) < b + 2\sqrt{\varepsilon}/b\ \Big|\ \sup_{t\in T} f(t) > b\Big) \\
&\quad + P\Big(\sup_{t\in\mathcal{T}} f(t) < b\ \Big|\ \sup_{t\in T} f(t) > b + 2\sqrt{\varepsilon}/b\Big).
\end{align*}
By (7.4) and Lemma 7.9, there exists c_2 such that the first term above can be bounded as
\[
P\Big(\sup_{t\in T} f(t) < b + 2\sqrt{\varepsilon}/b\ \Big|\ \sup_{t\in T} f(t) > b\Big) \le c_2\sqrt{\varepsilon}.
\]
The second term can be bounded by
\begin{align*}
P\Big(\sup_{t\in\mathcal{T}} f(t) < b\ \Big|\ \sup_{t\in T} f(t) > b + 2\sqrt{\varepsilon}/b\Big)
&\le P\Big(\sup_{|t-t^*|<2\varepsilon/b} f(t) < b\ \Big|\ \sup_{t\in T} f(t) > b + 2\sqrt{\varepsilon}/b\Big) \\
&\le 2c_*\exp\big(-\delta_*\varepsilon^{-1}\big),
\end{align*}
where the first inequality uses the ε/b-regularity of \mathcal{T} (every ball of radius 2ε/b contains a point of \mathcal{T}) and the second is Lemma 7.8 with a = 2√ε and δ = √ε. Hence there exists c_0 such that
\[
P\Big(\sup_{t\in\mathcal{T}} f(t) < b\ \Big|\ \sup_{t\in T} f(t) > b\Big) \le c_2\sqrt{\varepsilon} + 2c_*e^{-\delta_*/\varepsilon} \le c_0\sqrt{\varepsilon}
\]
for all ε ∈ (0, ε_0). \Box

Proof of Theorem 7.4. Write θ = 1 - 3δ ∈ (0,1); that is, δ = (1-θ)/3 > 0. First note that, by (7.4),
\[
P\Big(\sup_T f(t) > b + b^{2\delta-1}\ \Big|\ \sup_T f(t) > b\Big) \to 0
\]
as b → ∞. Let t^* be the position of the global maximum of f in T. According to the exact Slepian model of Section 7.3, and by an argument similar to that in the proof of Lemma 7.8,

(7.6) \[
P\Big(\sup_{|t-t^*|>b^{2\delta-1}} f(t) > b\ \Big|\ b < f(t^*) \le b + b^{2\delta-1}\Big) \to 0
\]

as b → ∞. Consequently,
\[
P\Big(\sup_{|t-t^*|>b^{2\delta-1}} f(t) > b\ \Big|\ \sup_T f(t) > b\Big) \to 0.
\]
Let
\[
B(\mathcal{T}, b^{2\delta-1}) = \bigcup_{t\in\mathcal{T}} B(t, b^{2\delta-1}).
\]
We have
\begin{align*}
P\Big(\sup_{\mathcal{T}} f(t) > b\ \Big|\ \sup_T f(t) > b\Big)
&\le P\Big(\sup_{|t-t^*|>b^{2\delta-1}} f(t) > b\ \Big|\ \sup_T f(t) > b\Big) \\
&\quad + P\Big(\sup_{\mathcal{T}} f(t) > b,\ \sup_{|t-t^*|>b^{2\delta-1}} f(t) \le b\ \Big|\ \sup_T f(t) > b\Big) \\
&\le o(1) + P\big(t^*\in B(\mathcal{T}, b^{2\delta-1})\ \big|\ \sup_T f(t) > b\big) \\
&\le o(1) + P\Big(\sup_{B(\mathcal{T}, b^{2\delta-1})} f(t) > b\ \Big|\ \sup_T f(t) > b\Big).
\end{align*}
Since \sharp(\mathcal{T}) \le b^{(1-3\delta)d}, we can find a finite set \mathcal{T}' = \{t'_1,\dots,t'_l\} \subset T, with l = O(b^{(1-\delta)d}), and cubes Δ_k = t'_k + [0, b^{-1}]^d, such that B(\mathcal{T}, b^{2\delta-1}) \subset \cup_{k=1}^l \Delta_k. The choice of l depends only on \sharp(\mathcal{T}), not on the particular positions of the points of \mathcal{T}. Therefore, applying (7.5),
\[
\sup_{\sharp(\mathcal{T})\le b^{\theta d}} P\Big(\sup_{B(\mathcal{T}, b^{2\delta-1})} f(t) > b\Big) \le O\big(b^{(1-\delta)d}\big)\,P(f(0) > b).
\]
This, together with (7.4), yields
\[
\sup_{\sharp(\mathcal{T})\le b^{\theta d}} P\Big(\sup_{B(\mathcal{T}, b^{2\delta-1})} f(t) > b\ \Big|\ \sup_T f(t) > b\Big) \le O(b^{-\delta d}) = o(1)
\]
for b \ge b_0, which clearly implies the statement of the result. \Box

7.2. Variance Control: Proof of Theorem 7.5. We proceed directly to the proof of Theorem 7.5.


Proof of Theorem 7.5. Note that
\begin{align*}
\frac{E_Q L_b^2}{P^2(\sup_{t\in T} f(t) > b)}
&= \frac{E(L_b)}{P^2(\sup_{t\in T} f(t) > b)}
= \frac{E\Big(\frac{M\,P(X_1 > b-1/b)}{\sum_{j=1}^M 1(X_j > b-1/b)};\ \max_j X_j > b\Big)}{P^2(\sup_{t\in T} f(t) > b)} \\
&= \frac{E\Big(\frac{M\,P(X_1 > b-1/b)}{\sum_{j=1}^M 1(X_j > b-1/b)};\ \max_j X_j > b,\ \sup_{t\in T} f(t) > b\Big)}{P^2(\sup_{t\in T} f(t) > b)} \\
&= \frac{E\Big(\frac{M\,P(X_1 > b-1/b)\,1(\max_j X_j > b)}{\sum_{j=1}^M 1(X_j > b-1/b)}\ \Big|\ \sup_{t\in T} f(t) > b\Big)}{P(\sup_{t\in T} f(t) > b)} \\
&= E\Big(\frac{M\,1(\max_j X_j > b)}{\sum_{j=1}^M 1(X_j > b-1/b)}\ \Big|\ \sup_{t\in T} f(t) > b\Big)\,\frac{P(X_1 > b-1/b)}{P(\sup_{t\in T} f(t) > b)}.
\end{align*}

The remainder of the proof involves showing that the last conditional expectation here is of order O(b^d). This, together with (7.4), will yield the result. Note that, for any A(b,ε) such that A(b,ε) = Θ(M) uniformly over b and ε, we can write
\begin{align}
&E\Big(\frac{M\times 1(\max_j X_j > b)}{\sum_{j=1}^M 1(X_j > b - 1/b)}\ \Big|\ \sup_{t\in T} f(t) > b\Big) \tag{7.7}\\
&\quad\le E\Bigg(\frac{M\times 1\big(\sum_{j=1}^M 1(X_j > b-1/b) \ge M/A(b,\varepsilon)\big)}{\sum_{j=1}^M 1(X_j > b-1/b)}\ \Bigg|\ \sup_{t\in T} f(t) > b\Bigg) \notag\\
&\qquad + E\Bigg(\frac{M\times 1\big(1 \le \sum_{j=1}^M 1(X_j > b-1/b) < M/A(b,\varepsilon)\big)}{\sum_{j=1}^M 1(X_j > b-1/b)}\ \Bigg|\ \sup_{t\in T} f(t) > b\Bigg). \notag
\end{align}

We shall select A(b,ε) appropriately in order to bound the two expectations above. By Lemma 7.8, for any 4ε ≤ δ ≤ δ_0, there exist constants c' and c'' ∈ (0,∞) such that
\begin{align}
c_*\exp\Big(-\frac{\delta_*}{\delta^2}\Big)
&\ge P\Big(\min_{|t-t^*|\le\delta/b} f(t) < b - 1/b\ \Big|\ \sup_{t\in T} f(t) > b\Big) \notag\\
&\ge P\Bigg(\sum_{j=1}^M 1(X_j > b - 1/b) \le c'\delta^d/\varepsilon^d\ \Bigg|\ \sup_{t\in T} f(t) > b\Bigg) \notag\\
&\ge P\Bigg(\frac{M}{\sum_{j=1}^M 1(X_j > b - 1/b)} \ge \frac{b^d c''}{\delta^d}\ \Bigg|\ \sup_{t\in T} f(t) > b\Bigg). \tag{7.8}
\end{align}


The first inequality is an application of Lemma 7.8. The second inequality is due to the fact that, for any ball B ⊂ T of radius δ/b with δ ≥ 4ε, \sharp(\mathcal{T}\cap B) \ge c'\delta^d\varepsilon^{-d} for some c' > 0. Inequality (7.8) implies that, for all x such that c''/\delta_0^d < x < c''/(4^d\varepsilon^d), there exists δ_{**} > 0 such that
\[
P\Bigg(\frac{M}{b^d\sum_{j=1}^M 1(X_j > b - 1/b)} \ge x\ \Bigg|\ \sup_{t\in T} f(t) > b\Bigg) \le c_*\exp\big(-\delta_{**}x^{2/d}\big).
\]

Now let A(b,ε) = b^dc''/(4^dε^d), and observe that, by the second result in (7.1), we have A(b,ε) = Θ(M); moreover, there exists c_3 such that the first term on the right hand side of (7.7) is bounded by

(7.9) \[
E\Bigg(\frac{M\times 1\big(\sum_{j=1}^M 1(X_j > b-1/b) \ge M/A(b,\varepsilon)\big)}{\sum_{j=1}^M 1(X_j > b-1/b)}\ \Bigg|\ \sup_T f(t) > b\Bigg) \le c_3 b^d.
\]

Now we turn to the second term on the right hand side of (7.7). We use the fact that M/A(b,ε) ≤ c''' ∈ (0,∞) (uniformly as b → ∞ and ε → 0). There exist c_4 and c_5 such that, for ε ≤ δ_0/c_4,
\begin{align}
E\Bigg(\frac{M\times 1\big(1 \le \sum_{j=1}^M 1(X_j > b-1/b) < M/A(b,\varepsilon)\big)}{\sum_{j=1}^M 1(X_j > b-1/b)}\ \Bigg|\ \sup_T f(t) > b\Bigg)
&\le M\,P\Bigg(\sum_{j=1}^M 1\Big(X_j > b - \frac1b\Big) < c'''\ \Bigg|\ \sup_T f(t) > b\Bigg) \tag{7.10}\\
&\le M\,P\Big(\min_{|t-t^*|\le c_4\varepsilon/b} f(t) < b - 1/b\ \Big|\ \sup_T f(t) > b\Big) \notag\\
&\le c_1 m(T)\,b^d\varepsilon^{-d}\exp\Big(-\frac{\delta_*}{c_4^2\varepsilon^2}\Big) \le c_5 b^d. \notag
\end{align}
The second inequality holds because, if \sum_{j=1}^M 1(X_j > b - 1/b) is less than c''', then the minimum of f over a ball of radius c_4ε/b around the global maximum must be less than b - 1/b; otherwise there would be more than c''' elements of \mathcal{T} inside such a ball. The last inequality uses Lemma 7.8 together with the bound \sharp(\mathcal{T}) \le c_1 m(T)\varepsilon^{-d}b^d of Proposition 7.2.

Putting (7.9) and (7.10) together, we obtain, for all ε/b-regular discretizations with ε < ε_0 = min(1/4, 1/c_4)δ_0,
\begin{align*}
\frac{E_Q L_b^2}{P(\sup_{t\in T} f(t) > b)}
&\le E\Big(\frac{M\,1(\max_j X_j > b)}{\sum_{j=1}^M 1(X_j > b-1/b)}\ \Big|\ \sup_{t\in T} f(t) > b\Big)\,P(X_1 > b - 1/b) \\
&\le (c_3 + c_5)\,b^d\,P(X_1 > b - 1/b).
\end{align*}
Applying now (7.4), which shows that b^d P(X_1 > b - 1/b) = O(P(\sup_{t\in T} f(t) > b)), and Proposition 7.2, which gives
\[
P\Big(\sup_{t\in T} f(t) > b\Big) < \frac{P(\sup_{t\in\mathcal{T}} f(t) > b)}{1 - c_0\sqrt{\varepsilon}},
\]
we have
\[
\sup_{b>b_0,\ \varepsilon\in[0,\varepsilon_0]} \frac{E_Q L_b^2}{P^2\big(\sup_{t\in\mathcal{T}} f(t) > b\big)} < \infty,
\]
as required. \Box

7.3. A Final Proof: Lemma 7.7. Without loss of generality we assume that the field has mean zero and unit variance. Before getting into the details of the proof of Lemma 7.7, we need a few additional lemmas, for which we adopt the following notation. Let C_i and C_{ij} denote the first and second order derivatives of C, and define the vectors
\[
\mu_1(t) = (-C_1(t),\dots,-C_d(t)), \qquad \mu_2(t) = \mathrm{vech}\big((C_{ij}(t),\ i = 1,\dots,d,\ j = i,\dots,d)\big).
\]
Let f'(0) and f''(0) be the gradient and the vector of second order derivatives of f at 0, where f''(0) is arranged in the same order as \mu_2(0). Furthermore, let \mu_{02} = \mu_{20}^\top be a vector of second order spectral moments and \mu_{22} a matrix of fourth order spectral moments. The vectors \mu_{02} and \mu_{22} are arranged so that
\[
\begin{pmatrix} 1 & 0 & \mu_{02} \\ 0 & \Lambda & 0 \\ \mu_{20} & 0 & \mu_{22} \end{pmatrix}
\]
is the covariance matrix of (f(0), f'(0), f''(0)), where \Lambda = (-C_{ij}(0)). It then follows that
\[
\mu_{2\cdot 0} = \mu_{22} - \mu_{20}\mu_{02}
\]
is the conditional variance of f''(0) given f(0). The following lemma, given in [5], provides a stochastic representation of f given that it has a local maximum at level u at the origin.

Lemma 7.11. Given that f has a local maximum with height u at zero (an interior point of T), the conditional field is equal in distribution to

(7.11) \[
f_u(t) = uC(t) - W_u\beta^\top(t) + g(t),
\]

where g(t) is a centered Gaussian random field with covariance function
\[
\gamma(s,t) = C(s-t) - (C(s),\ \mu_2(s))\begin{pmatrix} 1 & \mu_{02} \\ \mu_{20} & \mu_{22} \end{pmatrix}^{-1}\begin{pmatrix} C(t) \\ \mu_2^\top(t) \end{pmatrix} - \mu_1(s)\Lambda^{-1}\mu_1^\top(t),
\]
and W_u is a d(d+1)/2-dimensional random vector, independent of g(t), with density function

(7.12) \[
\psi_u(w) \propto |\det(r_*(w) - u\Lambda)|\,\exp\Big(-\frac12 w^\top\mu_{2\cdot0}^{-1}w\Big)\,1\big(r_*(w) - u\Lambda \in \mathcal N\big),
\]

where r_*(w) is the d×d symmetric matrix whose upper triangular elements consist of the components of w, and \mathcal N denotes the set of negative definite matrices. Finally, β(t) is defined by
\[
(\alpha(t),\ \beta(t)) = (C(t),\ \mu_2(t))\begin{pmatrix} 1 & \mu_{02} \\ \mu_{20} & \mu_{22} \end{pmatrix}^{-1}.
\]
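For orientation, it may help to record the simplest instance of (7.12). The following specialization is ours (not from the paper): take d = 1 and a stationary, unit variance field, and write λ_2 = -C''(0), λ_4 = C''''(0) for the second and fourth spectral moments, so that Λ = λ_2, \mu_{2\cdot0} = λ_4 - λ_2^2, r_*(w) = w, and the constraint r_*(w) - uΛ ∈ \mathcal N becomes w < uλ_2. Then (7.12) reduces to
\[
\psi_u(w) \propto |w - u\lambda_2|\,\exp\Big(-\frac{w^2}{2(\lambda_4 - \lambda_2^2)}\Big)\,1(w < u\lambda_2),
\]
a tilted, truncated Gaussian density, echoing the classical one-dimensional Slepian model for the behavior of a smooth stationary process near a high local maximum.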

The following two technical lemmas, which we shall prove after completing the proof of Lemma 7.7, provide bounds for the last two terms of (7.11).

Lemma 7.12. Using the notation in (7.11), there exist δ_0, ε_1, c_1 and b_0 such that, for any u > b > b_0 and δ ∈ (0, δ_0),
\[
P\Big(\sup_{|t|\le\delta a/b}\big|W_u\beta^\top(t)\big| > \frac{a}{4b}\Big) \le c_1\exp\Big(-\frac{\varepsilon_1 b^2}{\delta^4}\Big).
\]

Lemma 7.13. There exist c, \tilde\delta and δ_0 such that, for any δ ∈ (0, δ_0),
\[
P\Big(\max_{|t|\le\delta a/b}|g(t)| > \frac{a}{4b}\Big) \le c\exp\Big(-\frac{\tilde\delta}{\delta^2}\Big).
\]

Proof of Lemma 7.7. Using the notation of Lemma 7.11, given any s ∈ L for which f(s) = b, the corresponding conditional distribution of f is that of f_b(\cdot - s). Consequently, it suffices to show that
\[
P\Big(\min_{|t|\le\delta a/b} f_b(t) < b - a/b\Big) \le c_*\exp\Big(-\frac{\delta_*}{\delta^2}\Big).
\]

We consider first the case in which the local maximum is in the interior of T. Then, by the Slepian model (7.11),
\[
f_b(t) = bC(t) - W_b\beta^\top(t) + g(t).
\]
We study the three terms of the Slepian model individually. Since
\[
C(t) = 1 - t^\top\Lambda t + o(|t|^2),
\]
there exists an ε_0 such that
\[
bC(t) \ge b - \frac{a}{4b}
\]
for all |t| < ε_0√a/b. According to Lemmas 7.12 and 7.13, for δ < min(ε_0/√a_0, δ_0),
\begin{align*}
P\Big(\min_{|t|\le\delta a/b} f_b(t) < b - a/b\Big)
&\le P\Big(\max_{|t|<\delta a/b}|g(t)| > \frac{a}{4b}\Big) + P\Big(\sup_{|t|\le\delta a/b}\big|W_b\beta^\top(t)\big| > \frac{a}{4b}\Big) \\
&\le c\exp\Big(-\frac{\tilde\delta}{\delta^2}\Big) + c_1\exp\Big(-\frac{\varepsilon_1 b^2}{\delta^4}\Big)
\le c_*\exp\Big(-\frac{\delta_*}{\delta^2}\Big)
\end{align*}

for some c_* and δ_*.

Now consider the case in which the local maximum is in the (d-1)-dimensional boundary of T. By the convexity of T, and without loss of generality, we may assume that the tangent space of ∂T is generated by ∂/∂t_2,\dots,∂/∂t_d, that the local maximum is located at the origin, and that T is a subset of the half-space {t_1 ≥ 0}. That these assumptions do not involve a loss of generality follows from the arguments on pages 291–292 of [4], which rely on the assumed stationarity of f (for translations) and on the fact that rotations, while changing the distributions, will not affect the probabilities that we are currently computing.

For the origin, positioned as just described, to be a local maximum, it is necessary and sufficient that the gradient of f restricted to ∂T be the zero vector, that the Hessian matrix restricted to ∂T be negative definite, and that ∂_1 f(0) ≤ 0. Applying a version of Lemma 7.11 for this case, conditional on f(0) = u and 0 being a local maximum, the field is equal in distribution to

(7.13) \[
uC(t) - W_u\beta^\top(t) + \mu_1(t)\Lambda^{-1}(Z, 0, \dots, 0)^\top + g(t),
\]

where Z ≤ 0 corresponds to ∂_1 f(0), and follows a truncated (conditioned on the negative axis) Gaussian distribution with mean zero and a variance parameter computed as the conditional variance of ∂_1 f(0) given (∂_2 f(0),\dots,∂_d f(0)). The vector W_u is now a ((d-1)d/2)-dimensional random vector with density function
\[
\psi_u(w) \propto |\det(r_*(w) - u\Lambda)|\,\exp\Big(-\frac12 w^\top\mu_{2\cdot0}^{-1}w\Big)\,1\big(r_*(w) - u\Lambda \in \mathcal N\big),
\]
where Λ is the second spectral moment matrix of f restricted to ∂T and r_*(w) is the (d-1)×(d-1) symmetric matrix whose upper triangular elements consist of the components of w. In the representation (7.13), the vector W_u and Z are independent. As in the proof of Lemma 7.12, one can show that a similar bound holds for the second term, albeit with different constants. Thus, since \mu_1(t) = O(t), there exist c'' and δ'' such that the third term in (7.13) can be bounded by
\[
P\Big(\max_{|t|\le\delta a/b}\big|\mu_1(t)\Lambda^{-1}(Z, 0, \dots, 0)^\top\big| \ge \frac{a}{4b}\Big) \le c''\exp\Big(-\frac{\delta''}{\delta^2}\Big).
\]
Consequently, we can also find c_* and δ_* such that the conclusion holds, and we are done. \Box

We complete the paper with the proofs of Lemmas 7.12 and 7.13.

Proof of Lemma 7.12. It suffices to prove the lemma for the case a = 1. Since
\[
f_u(0) = u = u - W_u\beta^\top(0) + g(0),
\]
and W_u and g are independent, it follows that β(0) = 0 and g(0) = 0. Furthermore, since C'(t) = O(t) and \mu_2'(t) = O(t), there exists a c_0 such that |β(t)| ≤ c_0|t|^2. In addition, W_u has density function
\[
\psi_u(w) \propto |\det(r_*(w) - u\Lambda)|\,\exp\Big(-\frac12 w^\top\mu_{2\cdot0}^{-1}w\Big)\,1\big(r_*(w) - u\Lambda \in \mathcal N\big).
\]
Note that \det(r_*(w) - u\Lambda) is expressible as a polynomial in w and u, and that there exist ε_0 and c such that
\[
\Big|\frac{\det(r_*(w) - u\Lambda)}{\det(-u\Lambda)}\Big| \le c
\]
if |w| ≤ ε_0 u. Hence there exist ε_2, c_2 > 0 such that
\[
\psi_u(w) \le \bar\psi(w) := c_2\exp\Big(-\frac12\varepsilon_2\, w^\top\mu_{2\cdot0}^{-1}w\Big)
\]
for all u ≥ 1. The right hand side here is proportional to a multivariate Gaussian density. Thus
\[
P(|W_u| > x) = \int_{|w|>x}\psi_u(w)\,dw \le \int_{|w|>x}\bar\psi(w)\,dw = c_3 P(|\bar W| > x),
\]
where \bar W is a multivariate Gaussian random vector with density function proportional to \bar\psi. Therefore, by choosing ε_1 and c_1 appropriately, we have
\[
P\Big(\sup_{|t|\le\delta/b}\big|W_u\beta^\top(t)\big| > \frac{1}{4b}\Big) \le P\Big(|W_u| > \frac{b}{4c_0\delta^2}\Big) \le c_1\exp\Big(-\frac{\varepsilon_1 b^2}{\delta^4}\Big)
\]
for all u ≥ b. \Box

Proof of Lemma 7.13. Once again, it suffices to prove the lemma for the case a = 1. Since
\[
f_b(0) = b = b - W_b\beta^\top(0) + g(0),
\]
the covariance function (γ(s,t), s,t ∈ T) of the centered field g satisfies γ(0,0) = 0. It is also easy to check that
\[
\partial_s\gamma(s,t) = O(|s| + |t|), \qquad \partial_t\gamma(s,t) = O(|s| + |t|).
\]
Consequently, there exists a constant c_\gamma ∈ (0,∞) for which
\[
\gamma(s,t) \le c_\gamma(|s|^2 + |t|^2), \qquad \gamma(s,s) \le c_\gamma|s|^2.
\]
We need to control the tail probability of \sup_{|t|\le\delta/b}|g(t)|. For this it is useful to introduce the following scaling: define
\[
g_\delta(t) = \frac{b}{\delta}\,g\Big(\frac{\delta t}{b}\Big).
\]
Then \sup_{|t|\le\delta/b} g(t) \ge 1/(4b) if and only if \sup_{|t|\le 1} g_\delta(t) \ge 1/(4\delta). Let
\[
\sigma_\delta(s,t) = E(g_\delta(s)g_\delta(t)).
\]
Then
\[
\sup_s \sigma_\delta(s,s) \le c_\gamma.
\]
Because γ(s,t) is at least twice differentiable, applying a Taylor expansion we easily see that the canonical metric d_g corresponding to g_\delta (cf. Theorem 6.7) can be bounded as follows:
\[
d_g^2(s,t) = E(g_\delta(s) - g_\delta(t))^2 = \frac{b^2}{\delta^2}\Big[\gamma\Big(\frac{\delta s}{b}, \frac{\delta s}{b}\Big) + \gamma\Big(\frac{\delta t}{b}, \frac{\delta t}{b}\Big) - 2\gamma\Big(\frac{\delta s}{b}, \frac{\delta t}{b}\Big)\Big] \le c|s - t|^2
\]
for some constant c ∈ (0,∞). Therefore the entropy of g_\delta evaluated at ε is bounded by Kε^{-d} for any ε > 0, with an appropriate choice of K > 0. Therefore, for all δ < δ_0,
\[
P\Big(\sup_{|t|\le\delta/b}|g(t)| \ge \frac{1}{4b}\Big) = P\Big(\sup_{|t|\le 1}|g_\delta(t)| \ge \frac{1}{4\delta}\Big) \le c_d\,\delta^{-d-\eta}\exp\Big(-\frac{1}{16c_\gamma\delta^2}\Big)
\]
for some constants c_d and η > 0. The last inequality is a direct application of Theorem 4.1.1 of [4]. The conclusion of the lemma follows immediately by choosing c and \tilde\delta appropriately. \Box

REFERENCES

[1] R.J. Adler, K. Bartz, and S. Kou. Estimating curvatures in the Gaussian kinematic formula. In preparation.
[2] R.J. Adler, J. Blanchet, and J. Liu. Efficient simulation for tail probabilities of Gaussian random fields. In WSC '08: Proceedings of the 40th Conference on Winter Simulation, pages 328-336. Winter Simulation Conference, 2008.
[3] R.J. Adler, P. Muller, and B. Rozovskii, editors. Stochastic Modelling in Physical Oceanography. Birkhauser, Boston, 1996.
[4] R.J. Adler and J.E. Taylor. Random Fields and Geometry. Springer, New York, USA, 2007.
[5] R.J. Adler, J.E. Taylor, and K. Worsley. Applications of Random Fields and Geometry: Foundations and Case Studies. http://webee.technion.ac.il/people/adler/hrf.pdf.
[6] S. Asmussen and P. Glynn. Stochastic Simulation: Algorithms and Analysis. Springer-Verlag, New York, NY, USA, 2008.
[7] J.M. Azais and M. Wschebor. Level Sets and Extrema of Random Processes and Fields. Wiley, New York, NY, 2009.
[8] J.M. Bardeen, J.R. Bond, N. Kaiser, and A.S. Szalay. The statistics of peaks of Gaussian random fields. The Astrophysical Journal, 304:15-61, 1986.
[9] S.M. Berman. Sojourns and Extremes of Stochastic Processes. The Wadsworth & Brooks/Cole Statistics/Probability Series. Wadsworth & Brooks/Cole Advanced Books & Software, Pacific Grove, CA, 1992.
[10] J. Blanchet. Efficient importance sampling for binary contingency tables. Ann. of Appl. Prob., 19:949-982, 2009.
[11] C. Borell. The Brunn-Minkowski inequality in Gauss space. Invent. Math., 30:205-216, 1975.
[12] J. Bucklew. Introduction to Rare-Event Simulation. Springer-Verlag, New York, NY, 2004.
[13] M.R. Dennis. Nodal densities of planar Gaussian random waves. Eur. Phys. J., 145:191-210, 2007.
[14] R.M. Dudley. Sample functions of the Gaussian process. Ann. Probab., 1:66-103, 1973.
[15] K.J. Friston, K.J. Worsley, R.S.J. Frackowiak, J.C. Mazziotta, and A.C. Evans. Assessing the significance of focal activations using their spatial extent. Human Brain Mapping, 1:214-220, 1994.
[16] M.R. Leadbetter, G. Lindgren, and H. Rootzen. Extremes and Related Properties of Random Sequences and Processes. Springer-Verlag, New York, 1983.
[17] M. Mitzenmacher and E. Upfal. Probability and Computing: Randomized Algorithms and Probabilistic Analysis. Cambridge University Press, 2005.
[18] W. Niemiro and P. Pokarowski. Fixed precision MCMC estimation by median of products of averages. J. Appl. Probab., 46:309-329, 2009.
[19] V. Piterbarg. Asymptotic Methods in the Theory of Gaussian Processes. American Mathematical Society, USA, 1995.
[20] S.F. Shandarin. Testing non-Gaussianity in cosmic microwave background maps by morphological statistics. Mon. Not. R. Astr. Soc., 331:865-874, 2002.
[21] S.F. Shandarin, H.A. Feldman, Y. Xu, and M. Tegmark. Morphological measures of non-gaussianity in cosmic microwave background maps. Astrophys. J. Suppl., 141:1-11, 2002.
[22] L. Shepp and H. Landau. On the supremum of a Gaussian process. Sankhya A, 32:369-378, December 1970.
[23] J.E. Taylor, A. Takemura, and R.J. Adler. Validity of the expected Euler characteristic heuristic. Ann. Probab., 33(4):1362-1396, 2005.
[24] J.E. Taylor and K.J. Worsley. Detecting sparse signals in random fields, with an application to brain mapping. J. Amer. Statist. Assoc., 102(479):913-928, 2007.
[25] J. Traub, G. Wasilokowski, and H. Wozniakowski. Information-Based Complexity. Academic Press, New York, NY, 1988.
[26] B.S. Tsirelson, I.A. Ibragimov, and V.N. Sudakov. Norms of Gaussian sample functions. Proceedings of the Third Japan-USSR Symposium on Probability Theory (Tashkent, 1975), 550:20-41, 1976.
[27] V.S. Tsirel'son. The density of the maximum of a Gaussian process. Theory of Probab. Appl., 20:847-856, 1975.
[28] H. Wozniakowski. Computational complexity of continuous problems. Technical Report, 1996.

Robert J. Adler
Electrical Engineering
Technion, Haifa, Israel 32000
E-mail: robert@ee.technion.ac.il
URL: webee.technion.ac.il/people/adler

Jose H. Blanchet
Industrial Engineering & Operations Research
Columbia University
340 S.W. Mudd Building, 500 W. 120 Street
New York, New York 10027
E-mail: jose.blanchet@columbia.edu
URL: www.ieor.columbia.edu/fac-bios/blanchet/faculty.html

Jingchen Liu
Department of Statistics
Columbia University
1255 Amsterdam Avenue, Room 1030
New York, NY 10027
E-mail: jcliu@stat.columbia.edu
URL: stat.columbia.edu/~jcliu

Page 9: EFFICIENT MONTE CARLO FOR HIGH EXCURSIONS OF … › ~jblanche › papers › GRF1.pdf · 1. Introduction. This paper centers on the design and analysis of e–cient Monte Carlo techniques

MONTE CARLO FOR RANDOM FIELDS 9

P (middot capB)P (B) then the corresponding estimator has zero variance However Qlowast isclearly a change of measure that is of no practical value since P (B) ndash the quantitythat we are attempting to evaluate in the first place ndash is unknown Neverthelesswhen constructing a good importance sampling distribution for a family of setsBb b ge b0 for which 0 lt P (Bb) rarr 0 as b rarrinfin it is often useful to analyze theasymptotic behavior of Qlowast as P (Bb) rarr 0 in order to guide the construction of agood Q

We now describe briefly how an efficient importance sampling estimator forP (Bb) can also be used to estimate a large class of conditional expectations givenBb Suppose that a single replication of the corresponding importance samplingestimator

Lb∆= 1 (ω isin Bb) dPdQ

can be generated in O (log |P (Bb)|q0) function evaluations for some q0 gt 0 andthat

V ar (Lb) = O([P (Bb)]2 log |P (Bb)|q0

)

These assumptions imply that by taking the average of iid replications of Lb weobtain a FPRAS

Fix β isin (0infin) and let X (β q) be the class of random variables X satisfying0 le X le β with

E[X|Bb] = Ω[1 log(P (Bb))q](42)

Then by noting that

(43)EQ (XLb)EQ (Lb)

= E[X|Bb] =E[X Bb]P (Bb)

it follows easily that a FPRAS can be obtained by constructing the natural estima-tor for E[X|Bb] ie the ratio of the corresponding averaged importance samplingestimators suggested by the ratio in the left of (43) Of course when X is difficultto simulate exactly one must assume the bias E[X Bb] can be reduced in poly-nomial time The estimator is naturally biased but the discussion on FPRAS onbiased estimators given in Section 2 can be directly applied

In the context of Gaussian random fields we have that Bb = flowast gt b andone is very often interested in random variables X of the form X = Γ (f) whereΓ C (T ) rarr R and C(T ) denotes the space of continuous functions on T EndowingC(T ) with the uniform topology consider functions Γ that are non-negative andbounded by a positive constant An archetypical example is given by the volumeof (conditioned) high level excursion sets with β = m(T ) is known to satisfy (42)However there are many other examples of X (β q) with β = m(T ) which satisfy(42) for a suitable q depending on the regularity properties of the field In fact ifthe mean and covariance properties of f are Holder continuous then using similartechniques as those given in the arguements of Section 6 it is not difficult to see

10 ADLER BLANCHET LIU

that q can be estimated In turn we have that a FPRAS based importance sam-pling algorithm for w (b) would typically also yield a polynomial time algorithm forfunctional characteristics of the conditional field given high level excursions Sincethis is a very important and novel application we devote the remainder of thispaper to the development of efficient importance sampling algorithms for w(b)

5 The Basic Strategy Finite Fields In this section we develop our mainideas in the setting in which T is a finite set of the form T = t1 tM Toemphasize the discrete nature of our algorithms in this section we write Xi = f (ti)for 1 M and set X = (X1 XM ) This section is mainly of an expositorynature since much of it has already appeared in [2] Nevertheless it is includedhere as a useful guide to the intuition behind the algorithms for the continuouscase

We have already noted that in order to design an efficient importance samplingestimator for w (b) = P (max1leileM Xi gt b) it is useful to study the asymptoticconditional distribution of X given that max1leileM Xi gt b Thus we begin withsome basic large deviation results

Proposition 51 For any set of random variables X1 XM

max1leileM

P (Xi gt b) le P

(max

1leileMXi gt b

)le

Msum

j=1

P (Xj gt b)

Moreover if the Xj are mean zero Gaussian and the correlation between Xi andXj is strictly less than 1 then

P (Xi gt bXj gt b) = o (max[P (Xi gt b) P (Xj gt b)])

Thus if the associated covariance matrix of X is non-singular

w (b) = (1 + o (1))Msum

j=1

P (Xj gt b)

Proof The lower bound in the first display is trivial and the upper bound followsby the union bound The second display follows easily by working with the jointdensity of a bivariate Gaussian distribution (eg [9 16]) and the third claim is adirect consequence of the inclusion-exclusion principle 2

As noted above Qlowast corresponds to the conditional distribution of X given thatXlowast max1leileM Xi gt b It follows from Proposition 51 that conditional on Xlowast gtb the probability that two or more Xj exceed b is negligible Moreover it alsofollows that

P (Xi = Xlowast|Xlowast gt b) = (1 + o (1))P (Xi gt b)sumM

j=1 P (Xj gt b)

The following corollary now follows as an easy consequence of these observations

MONTE CARLO FOR RANDOM FIELDS 11

Corollary 52dTV (QlowastQ) rarr 0

as b rarrinfin where dTV denotes the total variation norm and Q is defined for BorelB sub RM as

Q (X isin B) =Msum

i=1

p (i b)P [X isin B|Xi gt b]

where

p (i b) =P (Xi gt b)sumM

j=1 P (Xj gt b)

Proof Pick an arbitrary Borel B Then we have that

Qlowast (X isin B) =P [X isin B max1leileM Xi gt b]

w (b)le

Msum

i=1

P [X isin B Xi gt b]w (b)

=Msum

i=1

P [X isin B|Xi gt b]p (i b)

(1 + o (1))

The above which follows from the union bound and the last part of Proposition51 combined with the definition of Q yields that for each ε gt 0 there exists b0

(independent of B) such that for all b ge b0

Qlowast (X isin B) le Q (X isin B) (1minus ε)

The lower bound follows similarly using the inclusion-exclusion principle and thesecond part of Proposition 51 2

Corollary 52 provides support for choosing Q as an importance sampling dis-tribution Of course we still have to verify that the corresponding algorithm is aFPRAS The importance sampling estimator induced by Q takes the form

(51) Lb =dP

dQ=

sumMj=1 P (Xj gt b)

sumMj=1 1 (Xj gt b)

Note that under Q we have that Xlowast gt b almost surely so the denominator in theexpression for Lb is at least 1 Therefore we have that

EQL2b le

Msum

j=1

P (Xj gt b)

2

and by virtue of Proposition 51 we conclude (using V arQ to denote the varianceunder Q) that

V arQ (Lb)P (Xlowast gt b)2

rarr 0

12 ADLER BLANCHET LIU

as b rarrinfin In particular it follows that Lb is strongly efficientOur proposed algorithm can now be summarized as follows

Algorithm 53 There are two steps in the algorithm

Step 1 Simulate n iid copies X(1) X(n) of X from the distribution QStep 2 Compute and output

Ln =1n

nsum

i=1

L(i)b

where L(i)b =

sumMj=1 P (X(i)

j gt b)sumM

j=1 1(X(i)j gt b)

Since the generation of Li under Q takes O(M3

)function evaluations we con-

clude based on the analysis given in Section 2 that Algorithm 53 is a FPRASwith q = 0 This implies Theorem 32 as promised

6 A FPRAS for Holder Continuous Gaussian Fields In this sectionwe shall describe the algorithm and the analysis behind Theorem 31 Throughoutthe section unless stated otherwise we assume conditions A1ndashA4 of Section 3

There are two issues related to the complexity analysis Firstly since f is assumedcontinuous the entire field cannot be generated in a (discrete) computer and so thealgorithm used in the discrete case needs adaptation Once this is done we need tocarry out an appropriate variance analysis

Developing an estimator directly applicable to the continuous field will be car-ried out in Section 61 This construction will not only be useful when studyingthe performance of a suitable discretization but will also help to explain some ofthe features of our discrete construction Then in Section 62 we introduce a dis-cretization approach and study the bias caused by the discretization In additionwe provide bounds on the variance of this discrete importance sampling estimator

61 A Continuous Estimator We start with a change of measure motivatedby the discrete case in Section 5 A natural approach is to consider an importancesampling strategy analogous to that of Algorithm 53 The continuous adaptationinvolves first sampling τb according to the probability measure

(61) P (τb isin middot) =E[m (Ab cap middot)]

E[m (Ab)]

where Ab = t isin T f (t) gt b The idea of introducing τb in the continuous settingis not necessarily to locate the point at which the maximum is achieved as was thesituation in the discrete case Rather τb will be used to find a random point whichhas a reasonable probability of being in the excursion set Ab (This probability willtend to be higher if f is non-homogenous) This relaxation will prove useful in theanalysis of the algorithm Note that τb with the distribution indicated in equation(61) has a density function (with respect to Lebesgue measure) given by

hb(t) =P (f(t) gt b)E[m(Ab)]

MONTE CARLO FOR RANDOM FIELDS 13

and that we also can write

E[m (Ab)] = E

int

T

1 (f (t) gt b) dt =int

T

P (f (t) gt b) dt = m (T ) P (f (U) gt b)

where U is uniformly distributed over T Once τb is generated the natural continuous adaptation corresponding to the

strategy described by Algorithm 53 proceeds by sampling f conditional on f(τb) gtb Note that if we use Q to denote the change-of-measure induced by such a con-tinuous sampling strategy then the corresponding importance sampling estimatortakes the form

Lb =dP

dQ=

E[m (Ab)]m (Ab)

The second moment of the estimator then satisfies

EQ[(Lb)2] = E(LbAb 6= empty

)= E[m(Ab)]P (flowast gt b)E[m(Ab)minus1|Ab 6= empty](62)

Unfortunately it is easy to construct examples for which EQ[(Lb

)2] is infiniteConsequently the above construction needs to be modified slightly

Elementary extreme value theory considerations for the maximum of finitelymany Gaussian random variables and their analogues for continuous fields givethat the overshoot of f over a given level b will be of order Θ(1b) Thus in orderto keep τb reasonably close to the excursion set we shall also consider the possibilityof an undershoot of size Θ (1b) right at τb As we shall see this relaxation will allowus to prevent the variance in (62) becoming infinite Thus instead of (61) we shallconsider τbminusab with density

(63) hbminusab(t) =P (f(t) gt bminus ab)

E[m(Abminusab)]

for some a gt 0 To ease on later notation write

γab bminus ab τγab= τbminusab

Let Qprime be the change of measure induced by sampling f as follows Given τγab

sample f(τγab) conditional on f(τγab

) gt γab In turn the rest of f follows its con-ditional distribution (under the nominal or original measure) given the observedvalue f(τγab

) We then have that the corresponding Radon-Nikodym derivative is

(64)dP

dQprime =E[m(Aγab

)]m(Aγab

)

and the importance sampling estimator Lprimeb is

(65) Lprimeb =dP

dQprime 1 (Ab 6= empty) =E[m(Aγab

)]m(Aγab

)1 (m(Ab) gt 0)

Note that we have used the continuity of the field in order to write m (Ab) gt 0 =Ab 6= empty almost surely The motivation behind this choice lies in the fact that

14 ADLER BLANCHET LIU

since m(Aγab) gt m (Ab) gt 0 the denominator may now be big enough to control

the second moment of the estimator As we shall see introducing the undershootof size ab will be very useful in the technical development both in the remainderof this section and in Section 7 In addition its introduction also provides insightinto the appropriate form of the estimator needed when discretizing the field

62 Algorithm and Analysis We still need to face the problem of generatingf in a computer Thus we now concentrate on a suitable discretization schemestill having in mind the change of measure leading to (64) Since our interest isto ultimately design algorithms that are efficient for estimating expectations suchas E[Γ (f) |flowast gt b] where Γ may be a functional of the whole field we shall use aglobal discretization scheme

Consider U = (U1 UM ) where Ui are iid uniform random variables takingvalues in T and independent of the field f Set TM = U1 UM and Xi =Xi (Ui) = f (Ui) for 1 le i le M Then X = (X1 Xm) (conditional on U) isa multivariate Gaussian random vector with conditional means micro(Ui)

∆= E(Xi|Ui)and covariances C (Ui Uj)

∆= Cov (Xi Xj |Ui Uj) Our strategy is to approximatew(b) by

wM (γab) = P (maxtisinTM

f(t) gt γab) = E[P ( max1leileM

Xi gt γab|U)]

Given the development in Section 5 it might not be surprising that if we can ensurethat M = M (ε b) is polynomial in 1ε and b then we shall be in a good positionto develop a FPRAS The idea is to apply an importance sampling strategy similarto that we considered in the construction of Lprimeb of (65) but this time it will beconditional on U In view of our earlier discussions we propose sampling from Qprimeprime

defined via

Qprimeprime (X isin B|U) =Msum

i=1

pU (i)P [X isin B|Xi gt γab U ]

where

pU (i) =P (Xi gt γab|U)sumM

j=1 P (Xj gt γab|U)

We then obtain the (conditional) importance sampling estimator

(66) Lb(U) =sumM

i=1 P (Xi gt γab|U)sumMi=1 1(Xi gt γab)

It is clear thatwM (γab) = EQprimeprime [Lb(U)]

Suppose for the moment that M a and the number of replications n have beenchosen Our future analysis will in particular guide the selection of these parametersThen the procedure is summarized by the next algorithm

MONTE CARLO FOR RANDOM FIELDS 15

Algorithm 61 The algorithm has three steps

Step 1 Simulate U (1) U (n) which are n iid copies of the vector U = (U1 UM )described aboveStep 2 Conditional on each U (i) for i = 1 n generate L

(i)b (U (i)) as described

by (66) by considering the distribution of X(i)(U (i)) = (X(i)1 (U (i)

1 ) X(i)M (U (i)

M ))Generate the X(i)(U (i)) independently so that at the end we obtain that the L

(i)b (U (i))

are n iid copies of Lb(U)Step 3 Output

Ln(U (1) U (n)) =1n

nsum

i=1

L(i)b (U (i))

63 Running Time of Algorithm 61 Bias and Variance Control The remain-der of this section is devoted to the analysis of the running time of the Algorithm61 The first step lies in estimating the bias and second moment of Lb(U) underthe change of measure induced by the sampling strategy of the algorithm whichwe denote by Qprimeprime We start with a simple bound for the second moment

Proposition 62 There exists a finite λ0 depending on microT = maxtisinT |micro (t)|and σ2

T = maxtisinT σ2 (t) for which

EQprimeprime [Lb(U)2] le λ0M2P

(maxtisinT

f (t) gt b

)2

Proof Observe that

EQprimeprime [Lb(U)2] le E

(Msum

i=1

P (Xi gt γab|Ui)

)2

le E

(Msum

i=1

suptisinT

P (f (Ui) gt γab|Ui = t)

)2

= M2 maxtisinT

P (f (t) gt γab)2

le λ0M2 max

tisinTP (f (t) gt b)2

le λ0M2P

(maxtisinT

f (t) gt b

)2

which completes the proof 2

Next we obtain a preliminary estimate of the bias

16 ADLER BLANCHET LIU

Proposition 63 For each M ge 1 we have

|w (b)minus wM (b)| le E[exp

(minusMm(Aγab

)m(T )

) Ab cap T 6= empty]

+ P

(maxtisinT

f (t) gt γab maxtisinT

f (t) le b

)

Proof Note that

|w (b)minus wM (γab)|le P (max

tisinTM

f (t) le γab maxtisinT

f (t) gt b) + P (maxtisinTM

f (t) gt γab maxtisinT

f (t) le b)

The second term is easily bounded by

P (maxtisinTM

f (t) gt γabmaxtisinT

f (t) le b) le P (maxtisinT

f (t) gt γab maxtisinT

f (t) le b)

The first term can be bounded as follows

P

(maxtisinTM

f (t) le γabmaxtisinT

f (t) gt b

)

le E[(P [f (Ui) le γab| f ])M 1(Ab cap T 6= empty)

]

le E[(

1minusm(Aγab)m (T )

)M Ab cap T 6= empty)]

le E[exp

(minusMm(Aγab)m (T )

) Ab cap T 6= empty)

]

This completes the proof 2

The previous proposition shows that controlling the relative bias of Lb(U) re-quires finding bounds for

(67) E[exp(minusMm(Aγab

)m(T )) Ab cap T 6= empty]

and

(68) P (maxtisinT

f (t) gt γabmaxtisinT

f (t) le b)

and so we develop these To bound (67) we take advantage of the importancesampling strategy based on Qprime introduced earlier in (64) Write

E[exp

(minusMm(Aγab)m (T )

) m(Ab) gt 0

](69)

= EQprime(

exp(minusMm(Aγab

)m (T ))

m(Aγab)

m(Ab) gt 0

)Em

(Aγab

)

MONTE CARLO FOR RANDOM FIELDS 17

Furthermore note that for each α gt 0 we have

EQprime(

exp(minusMm

(Aγab

)m (T )

)

m(Aγab

) m(Ab) gt 0

)(610)

le αminus1 exp (minusMαm(T ))

+ EQprime(

exp(minusMm(Aγab

)m(T ))

m(Aγab

) m(Aγab) le αm(Ab) gt 0

)

The next result whose proof is given in Section 64 gives a bound for the aboveexpectation

Proposition 64 Let β be as in Conditions A2 and A3 For any v gt 0 thereexist constants κ λ2 isin (0infin) (independent of a isin (0 1) and b but dependent on v)such that if we select

αminus1 ge κdβ(ba)(2+v)2dβ

and define W such that P (W gt x) = exp(minusxβd

)for x ge 0 then

EQprime(

exp(minusMm(Aγab

)m (T ))

m(Aγab)

m(Aγab) le αm(Ab) gt 0

)(611)

le EW 2 m (T )

λ2dβ2 M

(b

a

)4dβ

The following result gives us a useful upper bound on (68) The proof is givenin Section 65

Proposition 65 Assume that Conditions A2 and A3 are in force For anyv gt 0 let ρ = 2dβ + dv + 1 where d is the dimension of T There exist constantsb0 λ isin (0infin) (independent of a but depending on microT = maxtisinT |micro (t)| σ2

T =maxtisinT σ2 (t) v the Holder parameters β and κH) so that for all b ge b0 ge 1 wehave

(612) P

(maxtisinT

f(t) le b + ab | maxtisinT

f(t) gt b

)le λabρ

Consequently

P

(maxtisinT

f (t) gt γab maxtisinT

f (t) le b

)

= P

(maxtisinT

f (t) le b|maxtisinT

f (t) gt γab

)P

(maxtisinT

f(t) gt γab

)

le λabρP

(maxtisinT

f(t) gt γab

)

18 ADLER BLANCHET LIU

Moreover

P

(maxtisinT

f(t) gt γab

)(1minus λabρ) le P

(maxtisinT

f(t) gt b

)

Propositions 64 and 65 allow us to prove Theorem 31 which is rephrased inthe form of the following theorem which contains the detailed rate of complexityand so the main result of this section

Theorem 66 Suppose f is a Gaussian random field satisfying conditions A1ndashA4 in Section 3 Given any v gt 0 put a = ε(4λbρ) (where λ and ρ as in Proposition65) and αminus1 = κdβ(ba)(2+v)dβ Then there exist c ε0 gt 0 such that for allε le ε0

(613) |w (b)minus wM (γab)| le w (b) ε

if M =lceilcεminus1(ba)(4+4v)dβ

rceil Consequently by our discussion in Section 2 and the

bound on the second moment given in Proposition 62 it follows that Algorithm 61provides a FPRAS with running time O

((M)3 times (M)2 times εminus2δminus1

)

Proof Combining (63) (69) and (610) with Propositions 63ndash65 we have that

|w (b)minus wM (b) |le αminus1 exp (minusMαm (T )) E

[m

(Aγab

)]

+E[W 2]m (T )

λ2dβ2 M

(b

a

)4dβ

E[m

(Aγab

)]+

(λabρ

1minus λabρ

)w (b)

Furthermore there exists a constant K isin (0infin) such that

Em(Aγab

) le K maxtisinT

P (f (t) gt b)m (T ) le Kw (b) m (T )

Therefore we have that

|w (b)minus wM (b) |w (b)

le αminus1Km(T ) exp (minusMαm (T ))

+E[W 2]Km (T )2

λ2dβ2 M

(b

a

)4dβ

+(

λabρ

1minus λabρ

)

Moreover since a = ε(4λbρ) we obtain that for ε le 12

|w (b)minus wN (b) |w (b)

le αminus1Km(T ) exp (minusMαm (T ))

+[EW 2]Km (T )2

λ2dβ2 M

(b

a

)4dβ

+ ε2

MONTE CARLO FOR RANDOM FIELDS 19

From the selection of α M and θ it follows easily that the first two terms on theright hand side of the previous display can be made less than ε2 for all ε le ε0 bytaking ε0 sufficiently small

The complexity count given in the theorem now corresponds to the followingestimates The factor O((M)3) represents the cost of a Cholesky factorization re-quired to generate a single replication of a finite field of dimension M In additionthe second part of Proposition 62 gives us that O(M2εminus2δminus1) replications are re-quired to control the relative variance of the estimator 2

We now proceed to prove Propositions 64 and 65

64 Proof of Proposition 64 We concentrate on the analysis of the left handside of (611) An important observation is that conditional on the random variableτγab

with distribution

Qprime(τγabisin middot) =

E[m(Aγab

cap middot)]

E[m(Aγab

)]

and given f(τγab) the rest of the field namely (f (t) t isin Tτγab

) is anotherGaussian field with a computable mean and covariance structure The second termin (610) indicates that we must estimate the probability that m(Aγab

) takes smallvalues under Qprime For this purpose we shall develop an upper bound for

(614) P(m(Aγab

) lt yminus1 m(Ab) gt 0∣∣ f (t) = γab + zγab

)

for y large enough Our arguments proceeds in two steps For the first in order tostudy (614) we shall estimate the conditional mean covariance of f (s) s isin Tgiven that f (t) = γab + zγab Then we use the fact that the conditional field isalso Gaussian and take advantage of general results from the theory of Gaussianrandom fields to obtain a bound for (614) For this purpose we recall some usefulresults from the theory of Gaussian random fields The first result is due to Dudley[14]

Theorem 67 Let U be a compact subset of Rn and let f0(t) t isin U be amean zero continuous Gaussian random field Define the canonical metric d on Uas

d (s t) =radic

E[f0 (t)minus f0 (s)]2

and put diam (U) = supstisinU d (s t) which is assumed to be finite Then there existsa finite universal constant κ gt 0 such that

E[maxtisinU

f0 (t)] le κ

int diam(U)2

0

[log (N (ε))]12dε

where the entropy N (ε) is the smallest number of dminusballs of radius ε whose unioncovers U

20 ADLER BLANCHET LIU

The second general result that we shall need is the so-called B-TIS (Borel-Tsirelson-Ibragimov-Sudakov) inequality [4 11 26]

Theorem 68 Under the setting described in Theorem 67

P

(maxtisinU

f0 (t)minus E[maxtisinU

f0 (t)] ge b

)le exp

(minusb2(2σ2U )

)

whereσ2U = max

tisinUE[f2

0 (t)]

We can now proceed with the main proof We shall assume from now on thatτγab

= 0 since as will be obvious from what follows all estimates hold uniformlyover τγab

isin T This is a consequence of the uniform Holder assumptions A2 andA3 Define a new process f

(f (t) t isin T ) L= (f (t) t isin T | f (0) = γab + zγab)

Note that we can always write f (t) = micro (t)+g (t) where g is a mean zero Gaussianrandom field on T We have that

micro (t) = Ef (t) = micro (t) + σ (0)minus2C (0 t) (γab + zγab minus micro (0))

and that the covariance function of f is given by

Cg (s t) = Cov (g (s) g (t)) = C (s t)minus σ (0)minus2C (0 s)C (0 t)

The following lemma describes the behavior of micro (t) and Cg (t s)

Lemma 69 Assume that |s| and |t| small enough Then the following threeconclusions hold

(i) There exist constants λ0 and λ1 gt 0 such that

|micro (t)minus (γab + zγab)| le λ0 |t|β + λ1 |t|β (γab + zγab)

and for all z isin (0 1) and γab large enough

|micro (s)minus micro (t) | le κHγab|sminus t|β

(ii)Cg (s t) le 2κHσ (t)σ (s) |t|β + |s|β + |tminus s|β

(iii)

Dg (s t) =radic

E([g (t)minus g (s)]2) le λ121 |tminus s|β2

Proof All three consequences follow from simple algebraic manipulations Thedetails are omitted 2

MONTE CARLO FOR RANDOM FIELDS 21

Proposition 610 For any v gt 0 there exist κ and λ2 such that for all t isin T yminusβd le a2+v

κb(2+v) a sufficiently small and z gt 0

P(m(Aγab

)minus1 gt y m(Ab) gt 0∣∣ f (t) = γab + zγab

) le exp(minusλ2a

2yβdb2)

Proof For notational simplicity and without loss of generality we assume thatt = 0 First consider the case that z ge 1 Then there exist c1 c2 such that for allc2y

minusβd lt bminus2minusv and z gt 1

P(m

(Aγab

cap T)minus1

gt ym(Ab) gt 0| f (0) = γab + zγab

)

le P

(inf

|t|ltc1yminus1df(t) le γab|f (0) = γab + zγab

)

= P

(inf

|t|ltc1yminus1dmicro (t) + g(t) le γab

)

le P

(inf

|t|ltc1yminus1dg(t) le minus 1

2γab

)

Now apply (iii) from Lemma 69 from which it follows that N (ε) le c3m(T )ε2dβ

for some constant c3 By Theorem 67 E(sup|t|ltc1yminus1d f(t)) = O(yminusβ(2d) log y)By Theorem 68 for some constant c4

P(m

(Aγab

cap T)minus1

gt y m(Ab) gt 0| f (0) = γab + zγab

)

le P

(inf

|t|ltc1yminus1dg(t) le minus 1

2γab

)

le exp

(minus 1

c4γ2aby

minusβd

)

for c2yminusβd lt bminus2minusv and z gt 1

Now consider the case z isin (0 1) Let tlowast be the global maximum of f(t) Then

P(m

(Aγab

cap T)minus1

gt y m(Ab) gt 0|f (0) = γab + zγab

)

le P

(inf

|tminustlowast|ltc1yminus1df(t) lt γab f(tlowast) gt b|f (0) = γab + zγab

)

le P

(sup

|sminust|ltc1yminus1d

|f(s)minus f(t)| gt ab|f (0) = γab + zγab

)

Consider the new field ξ(s t) = g(s)minus g(t) with parameter space T times T Note thatradic

V ar(ξ(s t)) = Dg(s t) le λ1|sminus t|β2

Via basic algebra it is not hard to show that the entropy of ξ(s t) is bounded byNξ (ε) le c3m(T times T )ε2dβ In addition from (i) of Lemma 69 we have

|micro (s)minus micro (t) | le κHγab|sminus t|β

22 ADLER BLANCHET LIU

Similarly for some κ gt 0 and all yminusβd le 1κ

(ab

)2+v a lt 1 there exists c5 such that

P(m

(Aγab

cap T)minus1

gt y m(Ab) gt 0|f (0) = γab + zγab

)

le P

(sup

|sminust|ltc1yminus1d

|f(s)minus f(t)| gt ab|f (0) = γab + zγab

)

= P

(sup

|sminust|ltc1yminus1d

|ξ(s t)| gt a

2b

)

le exp(minus a2

c5b2yminusβd

)

Combining the two cases z gt 1 and z isin (0 1) and choosing c5 large enough we have

P(m

(Aγab

cap T)minus1

gt y m(Ab) gt 0∣∣ f (0) = γab + zγab

)

le exp(minus a2

c5b2yminusβd

)

for a small enough and yminusβd le 1κ

(ab

)2+v Renaming the constants completes theproof 2

The final ingredient needed for the proof of Proposition 64 is the following lemmainvolving stochastic domination The proof follows an elementary argument and istherefore omitted

Lemma 611 Let v1 and v2 be finite measures on R and define ηj(x) =intinfin

xvj(ds)

Suppose that η1 (x) ge η2 (x) for each x ge x0 Let (h (x) x ge x0) be a non-decreasing positive and bounded function Then

int infin

x0

h (s) v1(dx) geint infin

x0

h (s) v2 (ds)

Proof of Proposition 64 Note that

EQprime(exp

(minusMm(Aγab

)m (T )

)

m(Aγab

) m(Aγab

) isin (0 α) m(Ab) gt 0)

=int

T

int infin

z=0

E(exp

(minusMm(Aγab

)m (T )

)

m (Aγab)

m(Aγab

) isin (0 α) m(Ab) gt 0∣∣∣

f (t) = γab +z

γab

)P

(τγab

isin dt)P (γab[f (t)minus γab] isin dz| f (t) gt γab)

le supzgt0tisinT

E(exp

(minusMm(Aγab

)m (T )

)

m (Aγab)

m(Aγab

) isin (0 α) m(Ab) gt 0∣∣∣

f (t) = γab +z

γab

)

MONTE CARLO FOR RANDOM FIELDS 23

Now define Y (ba) = (ba)2dβλminusdβ2 W with λ2 as chosen in Proposition 610

and W with distribution given by P (W gt x) = exp(minusxβd) By Lemma 611

suptisinT

E(exp

(minusMm(Aγab

)m (T )

)

m (Aγab)

m(Aγab

) isin (0 α) m(Ab) gt 0∣∣∣

f (t) = γab +z

γab

)

le E[Y (ba) exp(minusMY (ba)minus1m(T ))Y (ba) gt αminus1

]

Now let Z be exponentially distributed with mean 1 and independent of Y (ba)Then we have (using the definition of the tail distribution of Z and Chebyshevrsquosinequality)

exp(minusMY (ba)minus1m (T )

)= P

(Z gt MY (ba)minus1m (T )

∣∣ Y (ba))

le m (T )Y (ba)M

Therefore

E[exp(minusMY (ba)minus1m (T )

)Y (ba) Y (ba) ge αminus1]

le m (T )M

E[Y (ba)2 Y (ba) ge αminus1]

le m (T )

Mλ2dβ2

(b

a

)4dβ

E(W 2)

which completes the proof 2

6.5. Proof of Proposition 6.5. We start with the following result of Tsirel'son [27].

Theorem 6.12. Let $f$ be a continuous, separable Gaussian process on a compact (in the canonical metric) domain $T$. Suppose that the variance function $\sigma^2(t)=\operatorname{Var}(f(t))$ is continuous and that $\sigma(t)>0$ for $t\in T$. Moreover, assume that $\mu=Ef$ is also continuous and $\mu(t)\ge 0$ for all $t\in T$. Define
$$\sigma_T^2\ \triangleq\ \max_{t\in T}\operatorname{Var}(f(t)),$$
and set $F(x)=P\{\max_{t\in T}f(t)\le x\}$. Then $F$ is continuously differentiable on $\mathbb{R}$. Furthermore, let $y$ be such that $F(y)>1/2$ and define $y_*$ by
$$F(y)=\Phi(y_*).$$
Then, for all $x>y$,
$$F'(x)\le \Psi\Big(\frac{xy_*}{y}\Big)\Big(\frac{xy_*}{y}(1+2\alpha)+1\Big)\frac{y_*}{y}(1+\alpha),$$
where $\Psi=1-\Phi$ and
$$\alpha=\frac{y^2}{x(x-y)y_*^2}.$$

We can now prove the following lemma.

Lemma 6.13. There exists a constant $A\in(0,\infty)$, independent of $a$ and $b\ge 0$, such that
$$\text{(6.15)}\qquad P\Big(\sup_{t\in T}f(t)\le b+a/b\,\Big|\,\sup_{t\in T}f(t)>b\Big)\le aA\,\frac{P(\sup_{t\in T}f(t)\ge b-1/b)}{P(\sup_{t\in T}f(t)\ge b)}.$$

Proof. By subtracting $\inf_{t\in T}\mu(t)>-\infty$, and redefining the level $b$ to be $b-\inf_{t\in T}\mu(t)$, we may simply assume that $Ef(t)\ge 0$, so that we can apply Theorem 6.12. Adopting the notation of Theorem 6.12, first we pick $b_0$ large enough so that $F(b_0)>1/2$, and assume that $b\ge b_0+1$. Now let $y=b-1/b$ and $F(y)=\Phi(y_*)$. Note that there exists $\delta_0\in(0,\infty)$ such that $\delta_0b\le y_*\le \delta_0^{-1}b$ for all $b\ge b_0$. This follows easily from the fact that
$$\log P\Big\{\sup_{t\in T}f(t)>x\Big\}\sim \log\sup_{t\in T}P\{f(t)>x\}\sim -\frac{x^2}{2\sigma_T^2}.$$
On the other hand, by Theorem 6.12, $F$ is continuously differentiable, and so
$$\text{(6.16)}\qquad P\Big\{\sup_{t\in T}f(t)<b+a/b\,\Big|\,\sup_{t\in T}f(t)>b\Big\}=\frac{\int_b^{b+a/b}F'(x)\,dx}{P\{\sup_{t\in T}f(t)>b\}}.$$
Moreover,
$$F'(x)\le \Big(1-\Phi\Big(\frac{xy_*}{y}\Big)\Big)\Big(\frac{xy_*}{y}(1+2\alpha(x))+1\Big)\frac{y_*}{y}(1+\alpha(x)) \le (1-\Phi(y_*))\Big(\frac{xy_*}{y}(1+2\alpha(x))+1\Big)\frac{y_*}{y}(1+\alpha(x))$$
$$=P\Big(\max_{t\in T}f(t)>b-1/b\Big)\Big(\frac{xy_*}{y}(1+2\alpha(x))+1\Big)\frac{y_*}{y}(1+\alpha(x)).$$
Therefore,
$$\int_b^{b+a/b}F'(x)\,dx \le P\Big(\sup_{t\in T}f(t)>b-1/b\Big)\int_b^{b+a/b}\Big(\frac{xy_*}{y}(1+2\alpha(x))+1\Big)\frac{y_*}{y}(1+\alpha(x))\,dx.$$
Recalling that $\alpha(x)=y^2/[x(x-y)y_*^2]$, we can use the fact that $y_*\ge\delta_0b$ to conclude that if $x\in[b,b+a/b]$ then $\alpha(x)\le\delta_0^{-2}$, and therefore
$$\int_b^{b+a/b}\Big(\frac{xy_*}{y}(1+2\alpha(x))+1\Big)\frac{y_*}{y}(1+\alpha(x))\,dx \le 4\delta_0^{-8}a.$$
We thus obtain that
$$\text{(6.17)}\qquad \frac{\int_b^{b+a/b}F'(x)\,dx}{P\{\sup_{t\in T}f(t)>b\}}\le 4a\delta_0^{-8}\,\frac{P\{\sup_{t\in T}f(t)>b-1/b\}}{P\{\sup_{t\in T}f(t)>b\}}$$
for any $b\ge b_0$. This inequality, together with the fact that $F$ is continuously differentiable on $(-\infty,\infty)$, yields the proof of the lemma for $b\ge 0$. □

The previous result translates a question that involves the conditional distribution of $\max_{t\in T}f(t)$ near $b$ into a question involving the tail distribution of $\max_{t\in T}f(t)$. The next result then provides a bound on this tail distribution.

Lemma 6.14. For each $v>0$ there exists a constant $C(v)\in(0,\infty)$ (possibly depending on $v>0$, but otherwise independent of $b$) such that
$$P\Big(\max_{t\in T}f(t)>b\Big)\le C(v)\,b^{2d/\beta+dv+1}\max_{t\in T}P(f(t)>b)$$
for all $b\ge 1$.

Proof. The proof of this result follows along the same lines as that of Theorem 2.6.2 in [5]. Consider an open cover $T=\cup_{i=1}^{N(\theta)}T_i(\theta)$, where $T_i(\theta)=\{s:|s-t_i|<\theta\}$. We choose the $t_i$ carefully, such that $N(\theta)=O(\theta^{-d})$ for $\theta$ arbitrarily small. Write $f(t)=g(t)+\mu(t)$, where $g(t)$ is a centered Gaussian random field, and note, using A2 and A3, that
$$P\Big(\max_{t\in T_i(\theta)}f(t)>b\Big)\le P\Big(\max_{t\in T_i(\theta)}g(t)>b-\mu(t_i)-\kappa_H\theta^\beta\Big).$$
Now we wish to apply the Borell–TIS inequality (Theorem 6.8) with $U=T_i(\theta)$, $f_0=g$, $d(s,t)=E^{1/2}([g(t)-g(s)]^2)$, which, as a consequence of A2 and A3, is bounded above by $C_0|t-s|^{\beta/2}$ for some $C_0\in(0,\infty)$. Thus, applying Theorem 6.7, we have that $E\max_{t\in T_i(\theta)}g(t)\le C_1\theta^{\beta/2}\log(1/\theta)$ for some $C_1\in(0,\infty)$. Consequently, the Borell–TIS inequality yields that there exists $C_2(v)\in(0,\infty)$ such that, for all $b$ sufficiently large and $\theta$ sufficiently small, we have
$$P\Big(\max_{t\in T_i(\theta)}g(t)>b-\mu(t_i)-\kappa_H\theta^\beta\Big)\le C_2(v)\exp\Big(-\frac{(b-\mu(t_i)-C_1\theta^{\beta/(2+\beta v)})^2}{2\sigma_{T_i}^2}\Big),$$
where $\sigma_{T_i}=\max_{t\in T_i(\theta)}\sigma(t)$. Now select $v>0$ small enough and set $\theta^{\beta/(2+\beta v)}=b^{-1}$. Straightforward calculations yield that
$$P\Big(\max_{t\in T_i(\theta)}f(t)>b\Big)\le P\Big(\max_{t\in T_i(\theta)}g(t)>b-\mu(t_i)-\kappa_H\theta^\beta\Big)\le C_3(v)\max_{t\in T_i(\theta)}\exp\Big(-\frac{(b-\mu(t))^2}{2\sigma(t)^2}\Big)$$
for some $C_3(v)\in(0,\infty)$. Now recall the well known inequality (valid for $x>0$)
$$\phi(x)\Big(\frac{1}{x}-\frac{1}{x^3}\Big)\le 1-\Phi(x)\le \frac{\phi(x)}{x},$$
where $\phi=\Phi'$ is the standard Gaussian density. Using this inequality, it follows that $C_4(v)\in(0,\infty)$ can be chosen so that
$$\max_{t\in T_i(\theta)}\exp\Big(-\frac{(b-\mu(t))^2}{2\sigma(t)^2}\Big)\le C_4(v)\,b\max_{t\in T_i}P(f(t)>b)$$
for all $b\ge 1$. We then conclude that there exists $C(v)\in(0,\infty)$ such that
$$P\Big(\max_{t\in T}f(t)>b\Big)\le N(\theta)\,C_4(v)\,b\max_{t\in T}P(f(t)>b)\le C\theta^{-d}b\max_{t\in T}P(f(t)>b)=Cb^{2d/\beta+dv+1}\max_{t\in T}P(f(t)>b),$$
giving the result. □
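As an aside, the Gaussian tail sandwich used in the proof above is easily confirmed numerically; here is a quick check (the grid of $x$ values is an arbitrary choice of ours):

```python
import numpy as np
from scipy.stats import norm

x = np.linspace(0.5, 8.0, 50)
phi, tail = norm.pdf(x), norm.sf(x)       # density phi and tail 1 - Phi

lower = phi * (1.0 / x - 1.0 / x**3)      # lower Mills-ratio bound (can be negative)
upper = phi / x                           # upper Mills-ratio bound
print(np.all(lower <= tail) and np.all(tail <= upper))   # True
```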

We can now complete the proof of Proposition 6.5.

Proof of Proposition 6.5. The result is a straightforward corollary of the previous two lemmas. By (6.15) in Lemma 6.13, and Lemma 6.14, there exists $\lambda\in(0,\infty)$ for which
$$P\Big(\max_{t\in T}f(t)\le b+a/b\,\Big|\,\max_{t\in T}f(t)>b\Big) \le aA\,\frac{P(\max_{t\in T}f(t)\ge b-1/b)}{P(\max_{t\in T}f(t)\ge b)}$$
$$\le aCA\,b^{2d/\beta+dv+1}\,\frac{\max_{t\in T}P(f(t)>b-1/b)}{P(\max_{t\in T}f(t)\ge b)} \le aCA\,b^{2d/\beta+dv+1}\,\frac{\max_{t\in T}P(f(t)>b-1/b)}{\max_{t\in T}P(f(t)>b)} \le a\lambda\,b^{2d/\beta+dv+1}.$$
The last two inequalities follow from the obvious bound
$$P\Big(\max_{t\in T}f(t)\ge b\Big)\ge \max_{t\in T}P(f(t)>b)$$
and standard properties of the Gaussian distribution. This yields (6.12), from which the remainder of the proposition follows. □


7. Fine Tuning: Twice Differentiable, Homogeneous Fields. In the preceding section we constructed a polynomial time algorithm based on a randomized discretization scheme. Our goal in this section is to illustrate how to take advantage of additional information to further improve the running time and the efficiency of the algorithm. In order to illustrate our techniques, we shall perform a more refined analysis in the setting of smooth and homogeneous fields, and shall establish optimality of the algorithm in a precise sense, to be described below. Our assumptions throughout this section are B1–B2 of Section 3.

Let $C(s-t)=\operatorname{Cov}(f(s),f(t))$ be the covariance function of $f$, which we assume also has mean zero. Note that it is an immediate consequence of homogeneity and differentiability that $\partial_iC(0)=\partial^3_{ijk}C(0)=0$.
We shall need the following definition.

Definition 7.1. We call $\mathcal{T}=\{t_1,\dots,t_M\}\subset T$ a $\theta$-regular discretization of $T$ if and only if
$$\min_{i\ne j}|t_i-t_j|\ge\theta,\qquad \sup_{t\in T}\min_i|t_i-t|\le 2\theta.$$

Regularity ensures that points in the grid $\mathcal{T}$ are well separated. Intuitively, since $f$ is smooth, having tight clusters of points translates into a waste of computing resources, as a result of sampling highly correlated values of $f$. Also note that every region containing a ball of radius $2\theta$ has at least one representative in $\mathcal{T}$. Therefore $\mathcal{T}$ covers the domain $T$ in an economical way. One technical convenience of $\theta$-regularity is that, for subsets $A\subseteq T$ that have positive Lebesgue measure (in particular, ellipsoids),
$$\lim_{M\to\infty}\frac{\#(A\cap\mathcal{T})}{M}=\frac{m(A)}{m(T)},$$
where, here and throughout the remainder of the section, $\#(A)$ denotes the cardinality of the set $A$.
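For a rectangular domain, a lattice with spacing $\theta$ already satisfies Definition 7.1 in low dimensions: its worst-case covering radius is below $\theta\sqrt{d}$, which is at most $2\theta$ for $d\le 4$. The following Python sketch, under that assumption, builds such a discretization; the function name and interface are ours, not the paper's:

```python
import itertools
import numpy as np

def theta_regular_grid(lower, upper, theta):
    """Lattice discretization of the box T = prod_k [lower_k, upper_k].

    Distinct lattice points are at distance >= theta, and (for d <= 4)
    every point of T lies within 2*theta of the grid, so both conditions
    of Definition 7.1 hold.
    """
    axes = [np.arange(lo + theta / 2.0, hi, theta)
            for lo, hi in zip(lower, upper)]
    return np.array(list(itertools.product(*axes)))

# Example: an (eps/b)-regular discretization of [0,1]^2, as in Proposition 7.2;
# the number of points M grows like (b/eps)^d.
eps, b = 0.1, 3.0
grid = theta_regular_grid([0.0, 0.0], [1.0, 1.0], eps / b)
print(len(grid))
```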

Let $\mathcal{T}=\{t_1,\dots,t_M\}$ be a $\theta$-regular discretization of $T$ and consider
$$X=(X_1,\dots,X_M)^\top\ \triangleq\ (f(t_1),\dots,f(t_M))^\top.$$
We shall concentrate on estimating $w_M(b)=P(\max_{1\le i\le M}X_i>b)$. The next result (which we prove in Section 7.1) shows that if $\theta=\varepsilon/b$ then the relative bias is $O(\sqrt{\varepsilon})$.

Proposition 7.2. Suppose $f$ is a Gaussian random field satisfying conditions B1 and B2. There exist $c_0$, $c_1$, $b_0$ and $\varepsilon_0$ such that, for any finite $\varepsilon/b$-regular discretization $\mathcal{T}$ of $T$,
$$\text{(7.1)}\qquad P\Big(\sup_{t\in\mathcal{T}}f(t)<b\,\Big|\,\sup_{t\in T}f(t)>b\Big)\le c_0\sqrt{\varepsilon} \quad\text{and}\quad \#(\mathcal{T})\le c_1m(T)\varepsilon^{-d}b^d,$$
for all $\varepsilon\in(0,\varepsilon_0]$ and $b>b_0$.

Note that the bound on the bias obtained for twice differentiable fields is much sharper than that for the general Hölder continuous fields, given by (6.13) in Theorem 6.6. This is partly because the conditional distribution of the random field around local maxima is harder to describe in the Hölder continuous case than in the case of twice differentiable fields. In addition to the sharper description of the bias, we shall also soon show, in Theorem 7.4, that our choice of discretization is optimal in a certain sense. Finally, we point out that the bound of $\sqrt{\varepsilon}$ in the first term of (7.1) is not optimal. In fact, there seems to be some room for improvement, and we believe that a more careful analysis might yield a bound of the form $c_0\varepsilon^2$.

We shall estimate $w_M(b)$ by using a slight variation of Algorithm 5.3. In particular, since the $X_i$'s are now identically distributed, we redefine $Q$ to be
$$\text{(7.2)}\qquad Q(X\in B)=\sum_{i=1}^M\frac{1}{M}\,P[X\in B\,|\,X_i>b-1/b].$$

Our estimator then takes the form
$$\text{(7.3)}\qquad L_b=\frac{M\times P(X_1>b-1/b)}{\sum_{j=1}^M 1(X_j>b-1/b)}\,1\Big(\max_{1\le i\le M}X_i>b\Big).$$
Clearly, we have that $E_Q(L_b)=w_M(b)$. (The reason for subtracting the factor of $1/b$ was explained in Section 6.1.)

Algorithm 7.3. Given a number of replications $n$ and an $\varepsilon/b$-regular discretization $\mathcal{T}$, the algorithm is as follows:

Step 1. Sample $X^{(1)},\dots,X^{(n)}$, i.i.d. copies of $X$, with distribution $Q$ given by (7.2).
Step 2. Compute and output
$$L_n=\frac{1}{n}\sum_{i=1}^n L_b^{(i)},\qquad\text{where}\qquad L_b^{(i)}=\frac{M\times P(X_1>b-1/b)}{\sum_{j=1}^M 1(X_j^{(i)}>b-1/b)}\,1\Big(\max_{1\le j\le M}X_j^{(i)}>b\Big).$$
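To make the procedure concrete, here is a minimal Python sketch of Algorithm 7.3 for a homogeneous field with mean zero and unit variance, assuming the covariance matrix `Sigma` of $X$ over the grid has been precomputed (e.g., `Sigma[i, k] = C(t_i - t_k)`). Sampling from $Q$ in (7.2) picks a mixture index uniformly, draws that coordinate from the Gaussian tail above $b-1/b$, and then draws the remaining coordinates from their conditional Gaussian law via a Cholesky factorization. All helper names are ours, not the paper's.

```python
import numpy as np
from scipy.stats import norm

def sample_Q(Sigma, b, rng):
    """One draw of X from the mixture Q of (7.2) (unit-variance field)."""
    M = len(Sigma)
    lvl = b - 1.0 / b
    j = rng.integers(M)                                  # uniform mixture component
    xj = norm.isf((1.0 - rng.random()) * norm.sf(lvl))   # X_j | X_j > b - 1/b
    idx = np.r_[0:j, j + 1:M]
    # Conditional mean and covariance of the other coordinates given X_j = xj.
    mu_c = Sigma[idx, j] * (xj / Sigma[j, j])
    Sig_c = Sigma[np.ix_(idx, idx)] - np.outer(Sigma[idx, j], Sigma[j, idx]) / Sigma[j, j]
    L = np.linalg.cholesky(Sig_c + 1e-10 * np.eye(M - 1))   # the O(M^3) step
    x = np.empty(M)
    x[j] = xj
    x[idx] = mu_c + L @ rng.standard_normal(M - 1)
    return x

def algorithm_7_3(Sigma, b, n, seed=0):
    """Average n copies of the estimator (7.3) for w_M(b)."""
    rng = np.random.default_rng(seed)
    M, lvl = len(Sigma), b - 1.0 / b
    p1 = norm.sf(lvl)                 # P(X_1 > b - 1/b) for a standardized field
    reps = []
    for _ in range(n):
        x = sample_Q(Sigma, b, rng)
        # The denominator is at least 1, since x[j] > lvl almost surely.
        reps.append(M * p1 / np.sum(x > lvl) * float(x.max() > b))
    return np.mean(reps)
```

Since the conditioning index changes from replication to replication, the Cholesky factorization is recomputed on each draw, which is what produces the $O(nM^3)$ term in the complexity count discussed after Theorem 7.5.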

Theorem 7.5 below guides the selection of $n$ in order to achieve a prescribed relative error. In particular, our analysis, together with considerations from Section 2, implies that choosing $n=O(\varepsilon^{-2}\delta^{-1})$ suffices to achieve $\varepsilon$ relative error with probability at least $1-\delta$.

Algorithm 7.3 improves on Algorithm 6.1 for Hölder continuous fields in two important ways. The first aspect is that it is possible to obtain information on the size of the relative bias of the estimator. In Proposition 7.2 we saw that, in order to overcome bias due to discretization, it suffices to take a discretization of size $M=\#(\mathcal{T})=\Theta(b^d)$. That this selection is also asymptotically optimal, in the sense described in the next result, will be proven in Section 7.1.


Theorem 7.4. Suppose $f$ is a Gaussian random field satisfying conditions B1 and B2. If $\theta\in(0,1)$ then, as $b\to\infty$,
$$\sup_{\#(\mathcal{T})\le b^{\theta d}}P\Big(\sup_{t\in\mathcal{T}}f(t)>b\,\Big|\,\sup_{t\in T}f(t)>b\Big)\to 0.$$

This result implies that the relative bias goes to 100% as $b\to\infty$ if one chooses a discretization scheme of size $O(b^{\theta d})$ with $\theta\in(0,1)$. Consequently, $d$ is the smallest power of $b$ that achieves any given bounded relative bias, and so the suggestion above of choosing $M=O(b^d)$ points for the discretization is, in this sense, optimal.

The second aspect of improvement involves the variance. In the case of Hölder continuous fields, the ratio of the second moment of the estimator to $w(b)^2$ was shown to be bounded by a quantity that is of order $O(M^2)$. In contrast, in the context of the smooth and homogeneous fields considered here, the next result shows that this ratio is bounded uniformly for $b>b_0$ and $M=\#(\mathcal{T})\ge cb^d$. That is, the variance remains strongly controlled.

Theorem 7.5. Suppose $f$ is a Gaussian random field satisfying conditions B1 and B2. Then there exist constants $c$, $b_0$ and $\varepsilon_0$ such that, for any $\varepsilon/b$-regular discretization $\mathcal{T}$ of $T$, we have
$$\sup_{b>b_0,\,\varepsilon\in[0,\varepsilon_0]}\frac{E_QL_b^2}{P^2(\sup_{t\in T}f(t)>b)} \ \le\ \sup_{b>b_0,\,\varepsilon\in[0,\varepsilon_0]}\frac{E_QL_b^2}{P^2(\sup_{t\in\mathcal{T}}f(t)>b)} \ \le\ c$$
for some $c\in(0,\infty)$.

for some c isin (0infin)

The proof of this result is given in Section 72 The fact that the number ofreplications remains bounded in b is a consequence of the strong control on thevariance

Finally, we note that the proof of Theorem 3.3 follows as a direct corollary of Theorem 7.5, together with Proposition 7.2 and our discussion in Section 2. Assuming that placing each point in $\mathcal{T}$ takes no more than $c$ units of computer time, the total complexity is, according to the discussion in Section 2, $O(nM^3+M)=O(\varepsilon^{-2}\delta^{-1}M^3+M)$. The contribution of the term $M^3=O(\varepsilon^{-6d}b^{3d})$ comes from the complexity of applying a Cholesky factorization, and the term $M=O(\varepsilon^{-2d}b^d)$ corresponds to the complexity of placing $\mathcal{T}$.

Remark 7.6. Condition B2 imposes a convexity assumption on the boundary of $T$. This assumption, although convenient in the development of the proofs of Theorems 7.4 and 7.5, is not necessary. The results can be generalized, at the expense of increasing the length and the burden of the technical development, to the case in which $T$ is a $d$-dimensional manifold satisfying the so-called Whitney conditions ([4]).

The remainder of this section is devoted to the proofs of Proposition 7.2, Theorem 7.4 and Theorem 7.5.


7.1. Bias Control: Proofs of Proposition 7.2 and Theorem 7.4. We start with some useful lemmas, for all of which we assume that Conditions B1 and B2 are satisfied. We shall also assume that the global maximum of $f$ over $T$ is achieved, with probability one, at a single point in $T$. Additional conditions under which this will happen can be found in [4], and require little more than the non-degeneracy of the joint distribution of $f$ and its first and second order derivatives. Of these lemmas, Lemma 7.7, the proof of which we defer to Section 7.3, is of central importance.

Lemma 7.7. For any $a_0>0$ there exist $c_*$, $\delta_*$, $b_0$ and $\delta_0$ (which depend on the choice of $a_0$) such that, for any $s\in T$, $a\in(0,a_0)$, $\delta\in(0,\delta_0)$ and $b>b_0$,
$$P\Big(\min_{|t-s|<\delta\sqrt{a}/b}f(t)<b\,\Big|\,f(s)=b+a/b,\ s\in\mathcal{L}\Big)\le c_*\exp\Big(-\frac{\delta_*}{\delta^2}\Big),$$
where $\mathcal{L}$ is the set of local maxima of $f$.

The next lemma is a direct corollary of Lemma 7.7.

Lemma 7.8. Let $t^*$ be the point in $T$ at which the global maximum of $f$ is attained. Then, with the same choice of constants as in Lemma 7.7, for any $a\in(0,a_0)$, $\delta\in(0,\delta_0)$ and $b>b_0$, we have
$$P\Big(\min_{|t-t^*|<\delta\sqrt{a}/b}f(t)<b\,\Big|\,f(t^*)=b+a/b\Big)\le 2c_*\exp\Big(-\frac{\delta_*}{\delta^2}\Big).$$

Proof. Note that, for any $s\in T$,
$$P\Big(\min_{|t-s|<\delta\sqrt{a}/b}f(t)<b\,\Big|\,f(s)=b+a/b,\ s=t^*\Big) = \frac{P\big(\min_{|t-s|<\delta\sqrt{a}/b}f(t)<b,\ f(s)=b+a/b,\ s=t^*\big)}{P(f(s)=b+a/b,\ s=t^*)}$$
$$\le \frac{P\big(\min_{|t-s|<\delta\sqrt{a}/b}f(t)<b,\ f(s)=b+a/b,\ s\in\mathcal{L}\big)}{P(f(s)=b+a/b,\ s=t^*)}$$
$$= P\Big(\min_{|t-s|<\delta\sqrt{a}/b}f(t)<b\,\Big|\,f(s)=b+a/b,\ s\in\mathcal{L}\Big)\,\frac{P(f(s)=b+a/b,\ s\in\mathcal{L})}{P(f(s)=b+a/b,\ s=t^*)}$$
$$\le c_*\exp\Big(-\frac{\delta_*}{\delta^2}\Big)\,\frac{P(f(s)=b+a/b,\ s\in\mathcal{L})}{P(f(s)=b+a/b,\ s=t^*)}.$$
Applying this bound, together with the fact that
$$\frac{P(f(s)=b+a/b,\ s\in\mathcal{L})}{P(f(s)=b+a/b,\ s=t^*)}\to 1$$
as $b\to\infty$, the conclusion of the lemma holds. □


The following lemma gives a bound on the density of $\sup_{t\in T}f(t)$, which will be used to control the size of the overshoot beyond level $b$.

Lemma 7.9. Let $p_{f^*}(x)$ be the density function of $\sup_{t\in T}f(t)$. Then there exist a constant $c_{f^*}$ and $b_0$ such that
$$p_{f^*}(x)\le c_{f^*}x^{d+1}P(f(0)>x)$$
for all $x>b_0$.

Proof. Recalling (1.3), let the continuous function $p_E(x)$, $x\in\mathbb{R}$, be defined by the relationship
$$E\big(\chi(\{t\in T:f(t)\ge b\})\big)=\int_b^\infty p_E(x)\,dx,$$
where the left hand side is the expected value of the Euler–Poincaré characteristic of $A_b$. Then, according to Theorem 8.10 in [7], there exist $c$ and $\delta$ such that
$$|p_E(x)-p_{f^*}(x)|<cP(f(0)>(1+\delta)x)$$
for all $x>0$. In addition, thanks to the result of [4], which provides $\int_b^\infty p_E(x)\,dx$ in closed form, there exists $c_0$ such that, for all $x>1$,
$$p_E(x)<c_0x^{d+1}P(f(0)>x).$$
Hence there exists $c_{f^*}$ such that
$$p_{f^*}(x)\le c_0x^{d+1}P(f(0)>x)+cP(f(0)>(1+\delta)x)\le c_{f^*}x^{d+1}P(f(0)>x)$$
for all $x>1$. □

The last ingredients required to provide the proofs of Proposition 7.2 and Theorem 7.4 are stated in the following well known theorem, adapted from Lemma 6.1 and Theorem 7.2 of [19] to the twice differentiable case.

Theorem 7.10. There exists a constant $H$ (depending on the covariance function $C$) such that
$$\text{(7.4)}\qquad P\Big(\sup_{t\in T}f(t)>b\Big)=(1+o(1))\,Hm(T)\,b^dP(f(0)>b)$$
as $b\to\infty$.

Similarly, choose $\delta$ small enough so that $[0,\delta]^d\subset T$, and let $\Delta_0=[0,b^{-1}]^d$. Then there exists a constant $H_1$ such that
$$\text{(7.5)}\qquad P\Big(\sup_{t\in\Delta_0}f(t)>b\Big)=(1+o(1))\,H_1P(f(0)>b)$$
as $b\to\infty$.

We are now ready to provide the proofs of Proposition 7.2 and Theorem 7.4.

Proof of Proposition 7.2. The fact that there exists $c_1$ such that $\#(\mathcal{T})\le c_1m(T)\varepsilon^{-d}b^d$ is immediate from assumption B2. Therefore we proceed to provide a bound for the relative bias. We choose $\varepsilon_0<\delta_0^2$ ($\delta_0$ as given in Lemmas 7.7 and 7.8) and note that, for any $\varepsilon<\varepsilon_0$,
$$P\Big(\sup_{t\in\mathcal{T}}f(t)<b\,\Big|\,\sup_{t\in T}f(t)>b\Big) \le P\Big(\sup_{t\in T}f(t)<b+2\sqrt{\varepsilon}/b\,\Big|\,\sup_{t\in T}f(t)>b\Big)+P\Big(\sup_{t\in\mathcal{T}}f(t)<b\,\Big|\,\sup_{t\in T}f(t)>b+2\sqrt{\varepsilon}/b\Big).$$
By (7.4) and Lemma 7.9, there exists $c_2$ such that the first term above can be bounded by
$$P\Big(\sup_{t\in T}f(t)<b+2\sqrt{\varepsilon}/b\,\Big|\,\sup_{t\in T}f(t)>b\Big)\le c_2\sqrt{\varepsilon}.$$
The second term can be bounded by
$$P\Big(\sup_{t\in\mathcal{T}}f(t)<b\,\Big|\,\sup_{t\in T}f(t)>b+2\sqrt{\varepsilon}/b\Big) \le P\Big(\min_{|t-t^*|<2\varepsilon/b}f(t)<b\,\Big|\,\sup_{t\in T}f(t)>b+2\sqrt{\varepsilon}/b\Big) \le 2c_*\exp(-\delta_*\varepsilon^{-1}).$$
Hence there exists $c_0$ such that
$$P\Big(\sup_{t\in\mathcal{T}}f(t)<b\,\Big|\,\sup_{t\in T}f(t)>b\Big)\le c_2\sqrt{\varepsilon}+2c_*e^{-\delta_*/\varepsilon}\le c_0\sqrt{\varepsilon}$$
for all $\varepsilon\in(0,\varepsilon_0)$. □

Proof of Theorem 7.4. We write $\theta=1-3\delta\in(0,1)$. First note that, by (7.4),
$$P\Big(\sup_Tf(t)>b+b^{2\delta-1}\,\Big|\,\sup_Tf(t)>b\Big)\to 0$$
as $b\to\infty$. Let $t^*$ be the position of the global maximum of $f$ in $T$. According to the exact Slepian model in Section 7.3, and an argument similar to the proof of Lemma 7.8,
$$\text{(7.6)}\qquad P\Big(\sup_{|t-t^*|>b^{2\delta-1}}f(t)>b\,\Big|\,b<f(t^*)\le b+b^{2\delta-1}\Big)\to 0$$
as $b\to\infty$. Consequently,
$$P\Big(\sup_{|t-t^*|>b^{2\delta-1}}f(t)>b\,\Big|\,\sup_Tf(t)>b\Big)\to 0.$$
Let
$$B(\mathcal{T},b^{2\delta-1})=\bigcup_{t\in\mathcal{T}}B(t,b^{2\delta-1}).$$
We have
$$P\Big(\sup_{t\in\mathcal{T}}f(t)>b\,\Big|\,\sup_Tf(t)>b\Big) \le P\Big(\sup_{|t-t^*|>b^{2\delta-1}}f(t)>b\,\Big|\,\sup_Tf(t)>b\Big) + P\Big(\sup_{t\in\mathcal{T}}f(t)>b,\ \sup_{|t-t^*|>b^{2\delta-1}}f(t)\le b\,\Big|\,\sup_Tf(t)>b\Big)$$
$$\le o(1)+P\Big(t^*\in B(\mathcal{T},b^{2\delta-1})\,\Big|\,\sup_Tf(t)>b\Big) \le o(1)+P\Big(\sup_{t\in B(\mathcal{T},b^{2\delta-1})}f(t)>b\,\Big|\,\sup_Tf(t)>b\Big).$$
Since $\#(\mathcal{T})\le b^{(1-3\delta)d}$, we can find a finite set $\mathcal{T}'=\{t_1',\dots,t_l'\}\subset T$, with $\Delta_k=t_k'+[0,b^{-1}]^d$, such that $l=O(b^{(1-\delta)d})$ and $B(\mathcal{T},b^{2\delta-1})\subset\cup_{k=1}^l\Delta_k$. The choice of $l$ depends only on $\#(\mathcal{T})$, not on the particular distribution of $\mathcal{T}$. Therefore, applying (7.5),
$$\sup_{\#(\mathcal{T})\le b^{\theta d}}P\Big(\sup_{t\in B(\mathcal{T},b^{2\delta-1})}f(t)>b\Big)\le O(b^{(1-\delta)d})P(f(0)>b).$$
This, together with (7.4), yields
$$\sup_{\#(\mathcal{T})\le b^{\theta d}}P\Big(\sup_{t\in B(\mathcal{T},b^{2\delta-1})}f(t)>b\,\Big|\,\sup_Tf(t)>b\Big)\le O(b^{-\delta d})=o(1)$$
for $b\ge b_0$, which clearly implies the statement of the result. □

7.2. Variance Control: Proof of Theorem 7.5. We proceed directly to the proof of Theorem 7.5.


Proof of Theorem 7.5. Note that (since $L_b$ is the likelihood ratio on $\{\max_jX_j>b\}$, $E_QL_b^2=E(L_b)$, the expectation on the right hand side being under the original measure $P$)
$$\frac{E_QL_b^2}{P^2(\sup_{t\in T}f(t)>b)} = \frac{E(L_b)}{P^2(\sup_{t\in T}f(t)>b)} = \frac{E\Big(\frac{M\times P(X_1>b-1/b)}{\sum_{j=1}^M1(X_j>b-1/b)};\ \max_jX_j>b\Big)}{P^2(\sup_{t\in T}f(t)>b)}$$
$$= \frac{E\Big(\frac{M\times P(X_1>b-1/b)}{\sum_{j=1}^M1(X_j>b-1/b)};\ \max_jX_j>b,\ \sup_{t\in T}f(t)>b\Big)}{P^2(\sup_{t\in T}f(t)>b)} = \frac{E\Big(\frac{M\,P(X_1>b-1/b)\,1(\max_jX_j>b)}{\sum_{j=1}^M1(X_j>b-1/b)}\,\Big|\,\sup_{t\in T}f(t)>b\Big)}{P(\sup_{t\in T}f(t)>b)}$$
$$= E\Big(\frac{M\,1(\max_jX_j>b)}{\sum_{j=1}^M1(X_j>b-1/b)}\,\Big|\,\sup_{t\in T}f(t)>b\Big)\frac{P(X_1>b-1/b)}{P(\sup_{t\in T}f(t)>b)}.$$

The remainder of the proof involves showing that the last conditional expectation here is of order $O(b^d)$. This, together with (7.4), will yield the result. Note that, for any $A(b,\varepsilon)$ such that $A(b,\varepsilon)=\Theta(M)$ uniformly over $b$ and $\varepsilon$, we can write
$$\text{(7.7)}\qquad E\Big(\frac{M\times 1(\max_jX_j>b)}{\sum_{j=1}^M1(X_j>b-1/b)}\,\Big|\,\sup_{t\in T}f(t)>b\Big) \le E\Bigg(\frac{M\times 1\Big(\sum_{j=1}^M1(X_j>b-1/b)\ge\frac{M}{A(b,\varepsilon)}\Big)}{\sum_{j=1}^M1(X_j>b-1/b)}\,\Bigg|\,\sup_{t\in T}f(t)>b\Bigg)$$
$$+\ E\Bigg(\frac{M\times 1\Big(1\le\sum_{j=1}^M1(X_j>b-1/b)<\frac{M}{A(b,\varepsilon)}\Big)}{\sum_{j=1}^M1(X_j>b-1/b)}\,\Bigg|\,\sup_{t\in T}f(t)>b\Bigg).$$

We shall select $A(b,\varepsilon)$ appropriately in order to bound the expectations above. By Lemma 7.8, for any $4\varepsilon\le\delta\le\delta_0$, there exist constants $c'$ and $c''\in(0,\infty)$ such that
$$c_*\exp\Big(-\frac{\delta_*}{\delta^2}\Big) \ge P\Big(\min_{|t-t^*|\le\delta/b}f(t)<b-1/b\,\Big|\,\sup_{t\in T}f(t)>b\Big) \ge P\Bigg(\sum_{j=1}^M1(X_j>b-1/b)\le c'\delta^d\varepsilon^{-d}\,\Bigg|\,\sup_{t\in T}f(t)>b\Bigg)$$
$$\text{(7.8)}\qquad \ge P\Bigg(\frac{M}{\sum_{j=1}^M1(X_j>b-1/b)}\ge\frac{b^dc''}{\delta^d}\,\Bigg|\,\sup_{t\in T}f(t)>b\Bigg).$$
The first inequality is an application of Lemma 7.8. The second inequality is due to the fact that, for any ball $B$ of radius $\delta/b\ge 4\varepsilon/b$, $\#(\mathcal{T}\cap B)\ge c'\delta^d\varepsilon^{-d}$ for some $c'>0$. Inequality (7.8) implies that, for all $x$ such that $c''/\delta_0^d<x<c''/[4^d\varepsilon^d]$, there exists $\delta_{**}>0$ such that
$$P\Bigg(\frac{M}{b^d\sum_{j=1}^M1(X_j>b-1/b)}\ge x\,\Bigg|\,\sup_{t\in T}f(t)>b\Bigg)\le c_*\exp\big(-\delta_{**}x^{2/d}\big).$$

Now let $A(b,\varepsilon)=b^dc''/(4^d\varepsilon^d)$, and observe that, by the second result in (7.1), we have $A(b,\varepsilon)=\Theta(M)$ and, moreover, that there exists $c_3$ such that the first term on the right hand side of (7.7) is bounded by
$$\text{(7.9)}\qquad E\Bigg(\frac{M\times 1\Big(\sum_{j=1}^M1(X_j>b-1/b)\ge\frac{M}{A(b,\varepsilon)}\Big)}{\sum_{j=1}^M1(X_j>b-1/b)}\,\Bigg|\,\sup_Tf(t)>b\Bigg)\le c_3b^d.$$

Now we turn to the second term on the right hand side of (7.7). We use the fact that $M/A(b,\varepsilon)\le c'''\in(0,\infty)$ (uniformly as $b\to\infty$ and $\varepsilon\to 0$). There exist $c_4$ and $c_5$ such that, for $\varepsilon\le\delta_0/c_4$,
$$\text{(7.10)}\qquad E\Bigg(\frac{M\,1\Big(1\le\sum_{j=1}^M1(X_j>b-1/b)<\frac{M}{A(b,\varepsilon)}\Big)}{\sum_{j=1}^M1(X_j>b-1/b)}\,\Bigg|\,\sup_Tf(t)>b\Bigg) \le M\,P\Bigg(\sum_{j=1}^M1\Big(X_j>b-\frac{1}{b}\Big)<c'''\,\Bigg|\,\sup_Tf(t)>b\Bigg)$$
$$\le M\,P\Big(\min_{|t-t^*|\le c_4\varepsilon/b}f(t)<b-1/b\,\Big|\,\sup_Tf(t)>b\Big) \le c_1m(T)\,b^d\varepsilon^{-d}\exp\Big(-\frac{\delta_*}{c_4^2\varepsilon^2}\Big)\le c_5b^d.$$
The second inequality holds from the fact that, if $\sum_{j=1}^M1(X_j>b-1/b)$ is less than $c'''$, then the minimum of $f$ in a ball of radius $c_4\varepsilon/b$ around the global maximum must be less than $b-1/b$; otherwise there would be more than $c'''$ elements of $\mathcal{T}$ inside such a ball. The last inequality is due to Lemma 7.8 and Proposition 7.2.

Putting (7.9) and (7.10) together, we obtain, for all $\varepsilon/b$-regular discretizations with $\varepsilon<\varepsilon_0=\min(1/4,1/c_4)\delta_0$,
$$\frac{E_QL_b^2}{P^2(\sup_{t\in T}f(t)>b)} \le E\Bigg(\frac{M\,1(\max_jX_j>b)}{\sum_{j=1}^M1(X_j>b-1/b)}\,\Bigg|\,\sup_{t\in T}f(t)>b\Bigg)\frac{P(X_1>b-1/b)}{P(\sup_{t\in T}f(t)>b)} \le (c_3+c_5)\,b^d\,\frac{P(X_1>b-1/b)}{P(\sup_{t\in T}f(t)>b)}.$$
Applying now (7.4), together with Proposition 7.2, which gives
$$P\Big(\sup_{t\in T}f(t)>b\Big)<\frac{P(\sup_{t\in\mathcal{T}}f(t)>b)}{1-c_0\sqrt{\varepsilon}},$$
we have
$$\sup_{b>b_0,\,\varepsilon\in[0,\varepsilon_0]}\frac{E_QL_b^2}{P^2\big(\sup_{t\in\mathcal{T}}f(t)>b\big)}<\infty,$$
as required. □

7.3. A Final Proof: Lemma 7.7. Without loss of generality, we assume that the field has mean zero and unit variance. Before getting into the details of the proof of Lemma 7.7 we need a few additional lemmas, for which we adopt the following notation. Let $C_i$ and $C_{ij}$ be the first and second order derivatives of $C$, and define the vectors
$$\mu_1(t)=(-C_1(t),\dots,-C_d(t)),\qquad \mu_2(t)=\operatorname{vech}\big((C_{ij}(t),\ i=1,\dots,d,\ j=i,\dots,d)\big).$$
Let $f'(0)$ and $f''(0)$ be the gradient and the vector of second order derivatives of $f$ at $0$, where $f''(0)$ is arranged in the same order as $\mu_2(0)$. Furthermore, let $\mu_{02}=\mu_{20}^\top$ be a vector of second order spectral moments and $\mu_{22}$ a matrix of fourth order spectral moments. The moments $\mu_{02}$ and $\mu_{22}$ are arranged so that
$$\begin{pmatrix} 1 & 0 & \mu_{02}\\ 0 & \Lambda & 0\\ \mu_{20} & 0 & \mu_{22} \end{pmatrix}$$
is the covariance matrix of $(f(0),f'(0),f''(0))$, where $\Lambda=(-C_{ij}(0))$. It then follows that
$$\mu_{2\cdot 0}=\mu_{22}-\mu_{20}\mu_{02}$$
is the conditional variance of $f''(0)$ given $f(0)$.
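For concreteness, in $d=1$ these spectral moments are just derivatives of $C$ at zero. The following sketch is ours; the covariance $C(t)=e^{-t^2/2}$ is an arbitrary illustrative choice:

```python
import sympy as sp

t = sp.symbols('t')
C = sp.exp(-t**2 / 2)                 # illustrative stationary covariance, d = 1

Lam  = -sp.diff(C, t, 2).subs(t, 0)   # Lambda = -C''(0) = Var(f'(0)) = 1
mu02 =  sp.diff(C, t, 2).subs(t, 0)   # Cov(f(0), f''(0)) = C''(0)    = -1
mu22 =  sp.diff(C, t, 4).subs(t, 0)   # Var(f''(0)) = C''''(0)        = 3

mu2_0 = mu22 - mu02 * mu02            # conditional variance mu_{2.0} = 3 - 1 = 2
print(Lam, mu02, mu22, mu2_0)
```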

The following lemma, given in [5], provides a stochastic representation of $f$ given that it has a local maximum at level $u$ at the origin.

Lemma 7.11. Given that $f$ has a local maximum with height $u$ at zero (an interior point of $T$), the conditional field is equal in distribution to
$$\text{(7.11)}\qquad f_u(t)\ \stackrel{d}{=}\ uC(t)-W_u\beta^\top(t)+g(t),$$
where $g(t)$ is a centered Gaussian random field with covariance function
$$\gamma(s,t)=C(s-t)-(C(s),\mu_2(s))\begin{pmatrix}1 & \mu_{02}\\ \mu_{20} & \mu_{22}\end{pmatrix}^{-1}\begin{pmatrix}C(t)\\ \mu_2^\top(t)\end{pmatrix}-\mu_1(s)\Lambda^{-1}\mu_1^\top(t),$$
and $W_u$ is a $d(d+1)/2$-dimensional random vector, independent of $g(t)$, with density function
$$\text{(7.12)}\qquad \psi_u(w)\propto|\det(r^*(w)-u\Lambda)|\exp\Big(-\frac{1}{2}w^\top\mu_{2\cdot 0}^{-1}w\Big)1(r^*(w)-u\Lambda\in\mathcal{N}),$$
where $r^*(w)$ is a $d\times d$ symmetric matrix whose upper triangular elements consist of the components of $w$. The set of negative definite matrices is denoted by $\mathcal{N}$. Finally, $\beta(t)$ is defined by
$$(\alpha(t),\beta(t))=(C(t),\mu_2(t))\begin{pmatrix}1 & \mu_{02}\\ \mu_{20} & \mu_{22}\end{pmatrix}^{-1}.$$

The following two technical lemmas, which we shall prove after completing the proof of Lemma 7.7, provide bounds for the last two terms of (7.11).

Lemma 7.12. Using the notation of (7.11), there exist $\delta_0$, $\varepsilon_1$, $c_1$ and $b_0$ such that, for any $u>b>b_0$ and $\delta\in(0,\delta_0)$,
$$P\Big(\sup_{|t|\le\delta\sqrt{a}/b}\big|W_u\beta^\top(t)\big|>\frac{a}{4b}\Big)\le c_1\exp\Big(-\frac{\varepsilon_1b^2}{\delta^4}\Big).$$

Lemma 7.13. There exist $c$, $\bar\delta$ and $\delta_0$ such that, for any $\delta\in(0,\delta_0)$,
$$P\Big(\max_{|t|\le\delta\sqrt{a}/b}|g(t)|>\frac{a}{4b}\Big)\le c\exp\Big(-\frac{\bar\delta}{\delta^2}\Big).$$

Proof of Lemma 7.7. Using the notation of Lemma 7.11, given any $s\in\mathcal{L}$ for which $f(s)=b$, the corresponding conditional distribution of $f$ is that of $f_b(\cdot-s)$. Consequently, it suffices to show that
$$P\Big(\min_{|t|\le\delta\sqrt{a}/b}f_b(t)<b-a/b\Big)\le c_*\exp\Big(-\frac{\delta_*}{\delta^2}\Big).$$
We consider firstly the case for which the local maximum is in the interior of $T$. Then, by the Slepian model (7.11),
$$f_b(t)=bC(t)-W_b\beta^\top(t)+g(t).$$
We study the three terms of the Slepian model individually. Since
$$C(t)=1-t^\top\Lambda t+o(|t|^2),$$
there exists an $\varepsilon_0$ such that
$$bC(t)\ge b-\frac{a}{4b}$$
for all $|t|<\varepsilon_0\sqrt{a}/b$. According to Lemmas 7.12 and 7.13, for $\delta<\min(\varepsilon_0\sqrt{a_0},\delta_0)$,
$$P\Big(\min_{|t|\le\delta\sqrt{a}/b}f_b(t)<b-a/b\Big) \le P\Big(\max_{|t|<\delta\sqrt{a}/b}|g(t)|>\frac{a}{4b}\Big)+P\Big(\sup_{|t|\le\delta\sqrt{a}/b}\big|W_b\beta^\top(t)\big|>\frac{a}{4b}\Big)$$
$$\le c\exp\Big(-\frac{\bar\delta}{\delta^2}\Big)+c_1\exp\Big(-\frac{\varepsilon_1b^2}{\delta^4}\Big) \le c_*\exp\Big(-\frac{\delta_*}{\delta^2}\Big)$$

for some $c_*$ and $\delta_*$.

Now consider the case for which the local maximum is in the $(d-1)$-dimensional boundary of $T$. Due to the convexity of $T$, and without loss of generality, we assume that the tangent space of $\partial T$ is generated by $\partial/\partial t_2,\dots,\partial/\partial t_d$, that the local maximum is located at the origin, and that $T$ is a subset of the positive half-space $\{t_1\ge 0\}$. That these assumptions do not involve a loss of generality follows from the arguments on pages 291–292 of [4], which rely on the assumed stationarity of $f$ (for translations) and the fact that rotations, while changing the distributions, will not affect the probabilities that we are currently computing.

For the origin, positioned as just described, to be a local maximum, it is necessary and sufficient that the gradient of $f$ restricted to $\partial T$ is the zero vector, that the Hessian matrix restricted to $\partial T$ is negative definite, and that $\partial_1f(0)\le 0$. Applying a version of Lemma 7.11 for this case, conditional on $f(0)=u$ and $0$ being a local maximum, the field is equal in distribution to
$$\text{(7.13)}\qquad uC(t)-W_u\beta^\top(t)+\mu_1(t)\Lambda^{-1}(Z,0,\dots,0)^\top+g(t),$$
where $Z\le 0$ corresponds to $\partial_1f(0)$, and follows a truncated (conditioned on the negative axis) Gaussian distribution with mean zero and a variance parameter which is computed as the conditional variance of $\partial_1f(0)$ given $(\partial_2f(0),\dots,\partial_df(0))$. The vector $W_u$ is now a $(d(d-1)/2)$-dimensional random vector with density function
$$\psi_u(w)\propto|\det(r^*(w)-u\Lambda)|\exp\Big(-\frac{1}{2}w^\top\mu_{2\cdot 0}^{-1}w\Big)1(r^*(w)-u\Lambda\in\mathcal{N}),$$
where $\Lambda$ is the second spectral moment matrix of $f$ restricted to $\partial T$ and $r^*(w)$ is the $(d-1)\times(d-1)$ symmetric matrix whose upper triangular elements consist of the components of $w$. In the representation (7.13), the vector $W_u$ and $Z$ are independent. As in the proof of Lemma 7.12, one can show that a similar bound holds, albeit with different constants. Thus, since $\mu_1(t)=O(t)$, there exist $c''$ and $\delta''$ such that the third term in (7.13) can be bounded by
$$P\Big(\max_{|t|\le\delta\sqrt{a}/b}\Big|\mu_1(t)\Lambda^{-1}(Z,0,\dots,0)^\top\Big|\ge\frac{a}{4b}\Big)\le c''\exp\Big(-\frac{\delta''}{\delta^2}\Big).$$
Consequently, we can also find $c_*$ and $\delta_*$ such that the conclusion holds, and we are done. □

We complete the paper with the proofs of Lemmas 7.12 and 7.13.

Proof of Lemma 7.12. It suffices to prove the lemma for the case $a=1$. Since
$$f_u(0)=u=u-W_u\beta^\top(0)+g(0),$$
and $W_u$ and $g$ are independent, $\beta(0)=0$. Furthermore, since $C'(t)=O(t)$ and $\mu_2'(t)=O(t)$, there exists a $c_0$ such that $|\beta(t)|\le c_0|t|^2$. In addition, $W_u$ has density function
$$\psi_u(w)\propto|\det(r^*(w)-u\Lambda)|\exp\Big(-\frac{1}{2}w^\top\mu_{2\cdot 0}^{-1}w\Big)1(r^*(w)-u\Lambda\in\mathcal{N}).$$
Note that $\det(r^*(w)-u\Lambda)$ is expressible as a polynomial in $w$ and $u$, and that there exist some $\varepsilon_0$ and $c$ such that
$$\Big|\frac{\det(r^*(w)-u\Lambda)}{\det(-u\Lambda)}\Big|\le c$$
if $|w|\le\varepsilon_0u$. Hence there exist $\varepsilon_2,c_2>0$ such that
$$\psi_u(w)\le\bar\psi(w)=c_2\exp\Big(-\frac{1}{2}\varepsilon_2w^\top\mu_{2\cdot 0}^{-1}w\Big)$$
for all $u\ge 1$. The right hand side here is proportional to a multivariate Gaussian density. Thus
$$P(|W_u|>x)=\int_{|w|>x}\psi_u(w)\,dw\le\int_{|w|>x}\bar\psi(w)\,dw=c_3P(|W|>x),$$
where $W$ is a multivariate Gaussian random variable with density function proportional to $\bar\psi$. Therefore, by choosing $\varepsilon_1$ and $c_1$ appropriately, we have
$$P\Big(\sup_{|t|\le\delta/b}\big|W_u\beta^\top(t)\big|>\frac{1}{4b}\Big)\le P\Big(|W_u|>\frac{b}{c_0^2\delta^2}\Big)\le c_1\exp\Big(-\frac{\varepsilon_1b^2}{\delta^4}\Big)$$
for all $u\ge b$. □

Proof of Lemma 7.13. Once again, it suffices to prove the lemma for the case $a=1$. Since
$$f_b(0)=b=b-W_b\beta^\top(0)+g(0),$$
the covariance function $(\gamma(s,t):s,t\in T)$ of the centered field $g$ satisfies $\gamma(0,0)=0$. It is also easy to check that
$$\partial_s\gamma(s,t)=O(|s|+|t|),\qquad \partial_t\gamma(s,t)=O(|s|+|t|).$$
Consequently, there exists a constant $c_\gamma\in(0,\infty)$ for which
$$\gamma(s,t)\le c_\gamma(|s|^2+|t|^2),\qquad \gamma(s,s)\le c_\gamma|s|^2.$$
We need to control the tail probability of $\sup_{|t|\le\delta/b}|g(t)|$. For this it is useful to introduce the following scaling. Define
$$g_\delta(t)=\frac{b}{\delta}\,g\Big(\frac{\delta t}{b}\Big).$$
Then $\sup_{|t|\le\delta/b}g(t)\ge\frac{1}{4b}$ if and only if $\sup_{|t|\le 1}g_\delta(t)\ge\frac{1}{4\delta}$. Let
$$\sigma_\delta(s,t)=E(g_\delta(s)g_\delta(t)).$$
Then
$$\sup_{|s|\le 1}\sigma_\delta(s,s)\le c_\gamma.$$
Because $\gamma(s,t)$ is at least twice differentiable, applying a Taylor expansion we easily see that the canonical metric $d_g$ corresponding to $g_\delta$ (cf. Theorem 6.7) can be bounded as follows:
$$d_g^2(s,t)=E(g_\delta(s)-g_\delta(t))^2=\frac{b^2}{\delta^2}\Big[\gamma\Big(\frac{\delta s}{b},\frac{\delta s}{b}\Big)+\gamma\Big(\frac{\delta t}{b},\frac{\delta t}{b}\Big)-2\gamma\Big(\frac{\delta s}{b},\frac{\delta t}{b}\Big)\Big]\le c|s-t|^2$$
for some constant $c\in(0,\infty)$. Therefore the entropy of $g_\delta$, evaluated at $\delta$, is bounded by $K\delta^{-d}$, for any $\delta>0$ and with an appropriate choice of $K>0$. Therefore, for all $\delta<\delta_0$,
$$P\Big(\sup_{|t|\le\delta/b}|g(t)|\ge\frac{1}{4b}\Big)=P\Big(\sup_{|t|\le 1}|g_\delta(t)|\ge\frac{1}{4\delta}\Big)\le c_d\delta^{-d-\eta}\exp\Big(-\frac{1}{16c_\gamma\delta^2}\Big)$$
for some constant $c_d$ and $\eta>0$. The last inequality is a direct application of Theorem 4.1.1 of [4]. The conclusion of the lemma follows immediately by choosing $c$ and $\bar\delta$ appropriately. □

REFERENCES

[1] R.J. Adler, K. Bartz, and S. Kou. Estimating curvatures in the Gaussian kinematic formula. In preparation.
[2] R.J. Adler, J. Blanchet, and J. Liu. Efficient simulation for tail probabilities of Gaussian random fields. In WSC '08: Proceedings of the 40th Conference on Winter Simulation, pages 328–336. Winter Simulation Conference, 2008.
[3] R.J. Adler, P. Müller, and B. Rozovskii, editors. Stochastic Modelling in Physical Oceanography. Birkhäuser, Boston, 1996.
[4] R.J. Adler and J.E. Taylor. Random Fields and Geometry. Springer, New York, 2007.
[5] R.J. Adler, J.E. Taylor, and K. Worsley. Applications of Random Fields and Geometry: Foundations and Case Studies. http://webee.technion.ac.il/people/adler/hrf.pdf.
[6] S. Asmussen and P. Glynn. Stochastic Simulation: Algorithms and Analysis. Springer-Verlag, New York, 2008.
[7] J.M. Azaïs and M. Wschebor. Level Sets and Extrema of Random Processes and Fields. Wiley, New York, 2009.
[8] J.M. Bardeen, J.R. Bond, N. Kaiser, and A.S. Szalay. The statistics of peaks of Gaussian random fields. The Astrophysical Journal, 304:15–61, 1986.
[9] S.M. Berman. Sojourns and Extremes of Stochastic Processes. The Wadsworth & Brooks/Cole Statistics/Probability Series. Wadsworth & Brooks/Cole Advanced Books & Software, Pacific Grove, CA, 1992.
[10] J. Blanchet. Efficient importance sampling for binary contingency tables. Ann. Appl. Probab., 19:949–982, 2009.
[11] C. Borell. The Brunn–Minkowski inequality in Gauss space. Invent. Math., 30:205–216, 1975.
[12] J. Bucklew. Introduction to Rare-Event Simulation. Springer-Verlag, New York, 2004.
[13] M.R. Dennis. Nodal densities of planar Gaussian random waves. Eur. Phys. J., 145:191–210, 2007.
[14] R.M. Dudley. Sample functions of the Gaussian process. Ann. Probab., 1:66–103, 1973.
[15] K.J. Friston, K.J. Worsley, R.S.J. Frackowiak, J.C. Mazziotta, and A.C. Evans. Assessing the significance of focal activations using their spatial extent. Human Brain Mapping, 1:214–220, 1994.
[16] M.R. Leadbetter, G. Lindgren, and H. Rootzén. Extremes and Related Properties of Random Sequences and Processes. Springer-Verlag, New York, 1983.
[17] M. Mitzenmacher and E. Upfal. Probability and Computing: Randomized Algorithms and Probabilistic Analysis. Cambridge University Press, 2005.
[18] W. Niemiro and P. Pokarowski. Fixed precision MCMC estimation by median of products of averages. J. Appl. Probab., 46:309–329, 2009.
[19] V. Piterbarg. Asymptotic Methods in the Theory of Gaussian Processes. American Mathematical Society, USA, 1995.
[20] S.F. Shandarin. Testing non-Gaussianity in cosmic microwave background maps by morphological statistics. Mon. Not. R. Astr. Soc., 331:865–874, 2002.
[21] S.F. Shandarin, H.A. Feldman, Y. Xu, and M. Tegmark. Morphological measures of non-Gaussianity in cosmic microwave background maps. Astrophys. J. Suppl., 141:1–11, 2002.
[22] L. Shepp and H. Landau. On the supremum of a Gaussian process. Sankhyā A, 32:369–378, 1970.
[23] J.E. Taylor, A. Takemura, and R.J. Adler. Validity of the expected Euler characteristic heuristic. Ann. Probab., 33(4):1362–1396, 2005.
[24] J.E. Taylor and K.J. Worsley. Detecting sparse signals in random fields, with an application to brain mapping. J. Amer. Statist. Assoc., 102(479):913–928, 2007.
[25] J. Traub, G. Wasilokowski, and H. Woźniakowski. Information-Based Complexity. Academic Press, New York, 1988.
[26] B.S. Tsirelson, I.A. Ibragimov, and V.N. Sudakov. Norms of Gaussian sample functions. In Proceedings of the Third Japan–USSR Symposium on Probability Theory (Tashkent, 1975), volume 550, pages 20–41, 1976.
[27] V.S. Tsirel'son. The density of the maximum of a Gaussian process. Theory Probab. Appl., 20:847–856, 1975.
[28] H. Woźniakowski. Computational complexity of continuous problems. Technical report, 1996.

Robert J. Adler
Electrical Engineering
Technion, Haifa, Israel 32000
E-mail: robert@ee.technion.ac.il
URL: webee.technion.ac.il/people/adler

Jose H. Blanchet
Industrial Engineering & Operations Research
Columbia University
340 S.W. Mudd Building, 500 W. 120 Street
New York, New York 10027
E-mail: jose.blanchet@columbia.edu
URL: www.ieor.columbia.edu/fac-bios/blanchet/faculty.html

Jingchen Liu
Department of Statistics
Columbia University
1255 Amsterdam Avenue, Room 1030
New York, NY 10027
E-mail: jcliu@stat.columbia.edu
URL: stat.columbia.edu/~jcliu

Page 10: EFFICIENT MONTE CARLO FOR HIGH EXCURSIONS OF … › ~jblanche › papers › GRF1.pdf · 1. Introduction. This paper centers on the design and analysis of e–cient Monte Carlo techniques

10 ADLER BLANCHET LIU

that q can be estimated In turn we have that a FPRAS based importance sam-pling algorithm for w (b) would typically also yield a polynomial time algorithm forfunctional characteristics of the conditional field given high level excursions Sincethis is a very important and novel application we devote the remainder of thispaper to the development of efficient importance sampling algorithms for w(b)

5 The Basic Strategy Finite Fields In this section we develop our mainideas in the setting in which T is a finite set of the form T = t1 tM Toemphasize the discrete nature of our algorithms in this section we write Xi = f (ti)for 1 M and set X = (X1 XM ) This section is mainly of an expositorynature since much of it has already appeared in [2] Nevertheless it is includedhere as a useful guide to the intuition behind the algorithms for the continuouscase

We have already noted that in order to design an efficient importance samplingestimator for w (b) = P (max1leileM Xi gt b) it is useful to study the asymptoticconditional distribution of X given that max1leileM Xi gt b Thus we begin withsome basic large deviation results

Proposition 51 For any set of random variables X1 XM

max1leileM

P (Xi gt b) le P

(max

1leileMXi gt b

)le

Msum

j=1

P (Xj gt b)

Moreover if the Xj are mean zero Gaussian and the correlation between Xi andXj is strictly less than 1 then

P (Xi gt bXj gt b) = o (max[P (Xi gt b) P (Xj gt b)])

Thus if the associated covariance matrix of X is non-singular

w (b) = (1 + o (1))Msum

j=1

P (Xj gt b)

Proof The lower bound in the first display is trivial and the upper bound followsby the union bound The second display follows easily by working with the jointdensity of a bivariate Gaussian distribution (eg [9 16]) and the third claim is adirect consequence of the inclusion-exclusion principle 2

As noted above Qlowast corresponds to the conditional distribution of X given thatXlowast max1leileM Xi gt b It follows from Proposition 51 that conditional on Xlowast gtb the probability that two or more Xj exceed b is negligible Moreover it alsofollows that

P (Xi = Xlowast|Xlowast gt b) = (1 + o (1))P (Xi gt b)sumM

j=1 P (Xj gt b)

The following corollary now follows as an easy consequence of these observations

MONTE CARLO FOR RANDOM FIELDS 11

Corollary 52dTV (QlowastQ) rarr 0

as b rarrinfin where dTV denotes the total variation norm and Q is defined for BorelB sub RM as

Q (X isin B) =Msum

i=1

p (i b)P [X isin B|Xi gt b]

where

p (i b) =P (Xi gt b)sumM

j=1 P (Xj gt b)

Proof Pick an arbitrary Borel B Then we have that

Qlowast (X isin B) =P [X isin B max1leileM Xi gt b]

w (b)le

Msum

i=1

P [X isin B Xi gt b]w (b)

=Msum

i=1

P [X isin B|Xi gt b]p (i b)

(1 + o (1))

The above which follows from the union bound and the last part of Proposition51 combined with the definition of Q yields that for each ε gt 0 there exists b0

(independent of B) such that for all b ge b0

Qlowast (X isin B) le Q (X isin B) (1minus ε)

The lower bound follows similarly using the inclusion-exclusion principle and thesecond part of Proposition 51 2

Corollary 52 provides support for choosing Q as an importance sampling dis-tribution Of course we still have to verify that the corresponding algorithm is aFPRAS The importance sampling estimator induced by Q takes the form

(51) Lb =dP

dQ=

sumMj=1 P (Xj gt b)

sumMj=1 1 (Xj gt b)

Note that under Q we have that Xlowast gt b almost surely so the denominator in theexpression for Lb is at least 1 Therefore we have that

EQL2b le

Msum

j=1

P (Xj gt b)

2

and by virtue of Proposition 51 we conclude (using V arQ to denote the varianceunder Q) that

V arQ (Lb)P (Xlowast gt b)2

rarr 0

12 ADLER BLANCHET LIU

as b rarrinfin In particular it follows that Lb is strongly efficientOur proposed algorithm can now be summarized as follows

Algorithm 53 There are two steps in the algorithm

Step 1 Simulate n iid copies X(1) X(n) of X from the distribution QStep 2 Compute and output

Ln =1n

nsum

i=1

L(i)b

where L(i)b =

sumMj=1 P (X(i)

j gt b)sumM

j=1 1(X(i)j gt b)

Since the generation of Li under Q takes O(M3

)function evaluations we con-

clude based on the analysis given in Section 2 that Algorithm 53 is a FPRASwith q = 0 This implies Theorem 32 as promised

6 A FPRAS for Holder Continuous Gaussian Fields In this sectionwe shall describe the algorithm and the analysis behind Theorem 31 Throughoutthe section unless stated otherwise we assume conditions A1ndashA4 of Section 3

There are two issues related to the complexity analysis Firstly since f is assumedcontinuous the entire field cannot be generated in a (discrete) computer and so thealgorithm used in the discrete case needs adaptation Once this is done we need tocarry out an appropriate variance analysis

Developing an estimator directly applicable to the continuous field will be car-ried out in Section 61 This construction will not only be useful when studyingthe performance of a suitable discretization but will also help to explain some ofthe features of our discrete construction Then in Section 62 we introduce a dis-cretization approach and study the bias caused by the discretization In additionwe provide bounds on the variance of this discrete importance sampling estimator

61 A Continuous Estimator We start with a change of measure motivatedby the discrete case in Section 5 A natural approach is to consider an importancesampling strategy analogous to that of Algorithm 53 The continuous adaptationinvolves first sampling τb according to the probability measure

(61) P (τb isin middot) =E[m (Ab cap middot)]

E[m (Ab)]

where Ab = t isin T f (t) gt b The idea of introducing τb in the continuous settingis not necessarily to locate the point at which the maximum is achieved as was thesituation in the discrete case Rather τb will be used to find a random point whichhas a reasonable probability of being in the excursion set Ab (This probability willtend to be higher if f is non-homogenous) This relaxation will prove useful in theanalysis of the algorithm Note that τb with the distribution indicated in equation(61) has a density function (with respect to Lebesgue measure) given by

hb(t) =P (f(t) gt b)E[m(Ab)]

MONTE CARLO FOR RANDOM FIELDS 13

and that we also can write

E[m (Ab)] = E

int

T

1 (f (t) gt b) dt =int

T

P (f (t) gt b) dt = m (T ) P (f (U) gt b)

where U is uniformly distributed over T Once τb is generated the natural continuous adaptation corresponding to the

strategy described by Algorithm 53 proceeds by sampling f conditional on f(τb) gtb Note that if we use Q to denote the change-of-measure induced by such a con-tinuous sampling strategy then the corresponding importance sampling estimatortakes the form

Lb =dP

dQ=

E[m (Ab)]m (Ab)

The second moment of the estimator then satisfies

EQ[(Lb)2] = E(LbAb 6= empty

)= E[m(Ab)]P (flowast gt b)E[m(Ab)minus1|Ab 6= empty](62)

Unfortunately it is easy to construct examples for which EQ[(Lb

)2] is infiniteConsequently the above construction needs to be modified slightly

Elementary extreme value theory considerations for the maximum of finitelymany Gaussian random variables and their analogues for continuous fields givethat the overshoot of f over a given level b will be of order Θ(1b) Thus in orderto keep τb reasonably close to the excursion set we shall also consider the possibilityof an undershoot of size Θ (1b) right at τb As we shall see this relaxation will allowus to prevent the variance in (62) becoming infinite Thus instead of (61) we shallconsider τbminusab with density

(63) hbminusab(t) =P (f(t) gt bminus ab)

E[m(Abminusab)]

for some a gt 0 To ease on later notation write

γab bminus ab τγab= τbminusab

Let Qprime be the change of measure induced by sampling f as follows Given τγab

sample f(τγab) conditional on f(τγab

) gt γab In turn the rest of f follows its con-ditional distribution (under the nominal or original measure) given the observedvalue f(τγab

) We then have that the corresponding Radon-Nikodym derivative is

(64)dP

dQprime =E[m(Aγab

)]m(Aγab

)

and the importance sampling estimator Lprimeb is

(65) Lprimeb =dP

dQprime 1 (Ab 6= empty) =E[m(Aγab

)]m(Aγab

)1 (m(Ab) gt 0)

Note that we have used the continuity of the field in order to write m (Ab) gt 0 =Ab 6= empty almost surely The motivation behind this choice lies in the fact that

14 ADLER BLANCHET LIU

since m(Aγab) gt m (Ab) gt 0 the denominator may now be big enough to control

the second moment of the estimator As we shall see introducing the undershootof size ab will be very useful in the technical development both in the remainderof this section and in Section 7 In addition its introduction also provides insightinto the appropriate form of the estimator needed when discretizing the field

62 Algorithm and Analysis We still need to face the problem of generatingf in a computer Thus we now concentrate on a suitable discretization schemestill having in mind the change of measure leading to (64) Since our interest isto ultimately design algorithms that are efficient for estimating expectations suchas E[Γ (f) |flowast gt b] where Γ may be a functional of the whole field we shall use aglobal discretization scheme

Consider U = (U1 UM ) where Ui are iid uniform random variables takingvalues in T and independent of the field f Set TM = U1 UM and Xi =Xi (Ui) = f (Ui) for 1 le i le M Then X = (X1 Xm) (conditional on U) isa multivariate Gaussian random vector with conditional means micro(Ui)

∆= E(Xi|Ui)and covariances C (Ui Uj)

∆= Cov (Xi Xj |Ui Uj) Our strategy is to approximatew(b) by

wM (γab) = P (maxtisinTM

f(t) gt γab) = E[P ( max1leileM

Xi gt γab|U)]

Given the development in Section 5 it might not be surprising that if we can ensurethat M = M (ε b) is polynomial in 1ε and b then we shall be in a good positionto develop a FPRAS The idea is to apply an importance sampling strategy similarto that we considered in the construction of Lprimeb of (65) but this time it will beconditional on U In view of our earlier discussions we propose sampling from Qprimeprime

defined via

Qprimeprime (X isin B|U) =Msum

i=1

pU (i)P [X isin B|Xi gt γab U ]

where

pU (i) =P (Xi gt γab|U)sumM

j=1 P (Xj gt γab|U)

We then obtain the (conditional) importance sampling estimator

(66) Lb(U) =sumM

i=1 P (Xi gt γab|U)sumMi=1 1(Xi gt γab)

It is clear thatwM (γab) = EQprimeprime [Lb(U)]

Suppose for the moment that M a and the number of replications n have beenchosen Our future analysis will in particular guide the selection of these parametersThen the procedure is summarized by the next algorithm

MONTE CARLO FOR RANDOM FIELDS 15

Algorithm 61 The algorithm has three steps

Step 1 Simulate U (1) U (n) which are n iid copies of the vector U = (U1 UM )described aboveStep 2 Conditional on each U (i) for i = 1 n generate L

(i)b (U (i)) as described

by (66) by considering the distribution of X(i)(U (i)) = (X(i)1 (U (i)

1 ) X(i)M (U (i)

M ))Generate the X(i)(U (i)) independently so that at the end we obtain that the L

(i)b (U (i))

are n iid copies of Lb(U)Step 3 Output

Ln(U (1) U (n)) =1n

nsum

i=1

L(i)b (U (i))

63 Running Time of Algorithm 61 Bias and Variance Control The remain-der of this section is devoted to the analysis of the running time of the Algorithm61 The first step lies in estimating the bias and second moment of Lb(U) underthe change of measure induced by the sampling strategy of the algorithm whichwe denote by Qprimeprime We start with a simple bound for the second moment

Proposition 62 There exists a finite λ0 depending on microT = maxtisinT |micro (t)|and σ2

T = maxtisinT σ2 (t) for which

EQprimeprime [Lb(U)2] le λ0M2P

(maxtisinT

f (t) gt b

)2

Proof Observe that

EQprimeprime [Lb(U)2] le E

(Msum

i=1

P (Xi gt γab|Ui)

)2

le E

(Msum

i=1

suptisinT

P (f (Ui) gt γab|Ui = t)

)2

= M2 maxtisinT

P (f (t) gt γab)2

le λ0M2 max

tisinTP (f (t) gt b)2

le λ0M2P

(maxtisinT

f (t) gt b

)2

which completes the proof 2

Next we obtain a preliminary estimate of the bias

16 ADLER BLANCHET LIU

Proposition 63 For each M ge 1 we have

|w (b)minus wM (b)| le E[exp

(minusMm(Aγab

)m(T )

) Ab cap T 6= empty]

+ P

(maxtisinT

f (t) gt γab maxtisinT

f (t) le b

)

Proof Note that

|w (b)minus wM (γab)|le P (max

tisinTM

f (t) le γab maxtisinT

f (t) gt b) + P (maxtisinTM

f (t) gt γab maxtisinT

f (t) le b)

The second term is easily bounded by

P (maxtisinTM

f (t) gt γabmaxtisinT

f (t) le b) le P (maxtisinT

f (t) gt γab maxtisinT

f (t) le b)

The first term can be bounded as follows

P

(maxtisinTM

f (t) le γabmaxtisinT

f (t) gt b

)

le E[(P [f (Ui) le γab| f ])M 1(Ab cap T 6= empty)

]

le E[(

1minusm(Aγab)m (T )

)M Ab cap T 6= empty)]

le E[exp

(minusMm(Aγab)m (T )

) Ab cap T 6= empty)

]

This completes the proof 2

The previous proposition shows that controlling the relative bias of Lb(U) re-quires finding bounds for

(67) E[exp(minusMm(Aγab

)m(T )) Ab cap T 6= empty]

and

(68) P (maxtisinT

f (t) gt γabmaxtisinT

f (t) le b)

and so we develop these To bound (67) we take advantage of the importancesampling strategy based on Qprime introduced earlier in (64) Write

E[exp

(minusMm(Aγab)m (T )

) m(Ab) gt 0

](69)

= EQprime(

exp(minusMm(Aγab

)m (T ))

m(Aγab)

m(Ab) gt 0

)Em

(Aγab

)

MONTE CARLO FOR RANDOM FIELDS 17

Furthermore note that for each α gt 0 we have

EQprime(

exp(minusMm

(Aγab

)m (T )

)

m(Aγab

) m(Ab) gt 0

)(610)

le αminus1 exp (minusMαm(T ))

+ EQprime(

exp(minusMm(Aγab

)m(T ))

m(Aγab

) m(Aγab) le αm(Ab) gt 0

)

The next result whose proof is given in Section 64 gives a bound for the aboveexpectation

Proposition 64 Let β be as in Conditions A2 and A3 For any v gt 0 thereexist constants κ λ2 isin (0infin) (independent of a isin (0 1) and b but dependent on v)such that if we select

αminus1 ge κdβ(ba)(2+v)2dβ

and define W such that P (W gt x) = exp(minusxβd

)for x ge 0 then

EQprime(

exp(minusMm(Aγab

)m (T ))

m(Aγab)

m(Aγab) le αm(Ab) gt 0

)(611)

le EW 2 m (T )

λ2dβ2 M

(b

a

)4dβ

The following result gives us a useful upper bound on (68) The proof is givenin Section 65

Proposition 65 Assume that Conditions A2 and A3 are in force For anyv gt 0 let ρ = 2dβ + dv + 1 where d is the dimension of T There exist constantsb0 λ isin (0infin) (independent of a but depending on microT = maxtisinT |micro (t)| σ2

T =maxtisinT σ2 (t) v the Holder parameters β and κH) so that for all b ge b0 ge 1 wehave

(612) P

(maxtisinT

f(t) le b + ab | maxtisinT

f(t) gt b

)le λabρ

Consequently

P

(maxtisinT

f (t) gt γab maxtisinT

f (t) le b

)

= P

(maxtisinT

f (t) le b|maxtisinT

f (t) gt γab

)P

(maxtisinT

f(t) gt γab

)

le λabρP

(maxtisinT

f(t) gt γab

)

18 ADLER BLANCHET LIU

Moreover

P

(maxtisinT

f(t) gt γab

)(1minus λabρ) le P

(maxtisinT

f(t) gt b

)

Propositions 64 and 65 allow us to prove Theorem 31 which is rephrased inthe form of the following theorem which contains the detailed rate of complexityand so the main result of this section

Theorem 66 Suppose f is a Gaussian random field satisfying conditions A1ndashA4 in Section 3 Given any v gt 0 put a = ε(4λbρ) (where λ and ρ as in Proposition65) and αminus1 = κdβ(ba)(2+v)dβ Then there exist c ε0 gt 0 such that for allε le ε0

(613) |w (b)minus wM (γab)| le w (b) ε

if M =lceilcεminus1(ba)(4+4v)dβ

rceil Consequently by our discussion in Section 2 and the

bound on the second moment given in Proposition 62 it follows that Algorithm 61provides a FPRAS with running time O

((M)3 times (M)2 times εminus2δminus1

)

Proof Combining (63) (69) and (610) with Propositions 63ndash65 we have that

|w (b)minus wM (b) |le αminus1 exp (minusMαm (T )) E

[m

(Aγab

)]

+E[W 2]m (T )

λ2dβ2 M

(b

a

)4dβ

E[m

(Aγab

)]+

(λabρ

1minus λabρ

)w (b)

Furthermore there exists a constant K isin (0infin) such that

Em(Aγab

) le K maxtisinT

P (f (t) gt b)m (T ) le Kw (b) m (T )

Therefore we have that

|w (b)minus wM (b) |w (b)

le αminus1Km(T ) exp (minusMαm (T ))

+E[W 2]Km (T )2

λ2dβ2 M

(b

a

)4dβ

+(

λabρ

1minus λabρ

)

Moreover since a = ε(4λbρ) we obtain that for ε le 12

|w (b)minus wN (b) |w (b)

le αminus1Km(T ) exp (minusMαm (T ))

+[EW 2]Km (T )2

λ2dβ2 M

(b

a

)4dβ

+ ε2

MONTE CARLO FOR RANDOM FIELDS 19

From the selection of α M and θ it follows easily that the first two terms on theright hand side of the previous display can be made less than ε2 for all ε le ε0 bytaking ε0 sufficiently small

The complexity count given in the theorem now corresponds to the followingestimates The factor O((M)3) represents the cost of a Cholesky factorization re-quired to generate a single replication of a finite field of dimension M In additionthe second part of Proposition 62 gives us that O(M2εminus2δminus1) replications are re-quired to control the relative variance of the estimator 2

We now proceed to prove Propositions 64 and 65

64 Proof of Proposition 64 We concentrate on the analysis of the left handside of (611) An important observation is that conditional on the random variableτγab

with distribution

Qprime(τγabisin middot) =

E[m(Aγab

cap middot)]

E[m(Aγab

)]

and given f(τγab) the rest of the field namely (f (t) t isin Tτγab

) is anotherGaussian field with a computable mean and covariance structure The second termin (610) indicates that we must estimate the probability that m(Aγab

) takes smallvalues under Qprime For this purpose we shall develop an upper bound for

(614) P(m(Aγab

) lt yminus1 m(Ab) gt 0∣∣ f (t) = γab + zγab

)

for y large enough Our arguments proceeds in two steps For the first in order tostudy (614) we shall estimate the conditional mean covariance of f (s) s isin Tgiven that f (t) = γab + zγab Then we use the fact that the conditional field isalso Gaussian and take advantage of general results from the theory of Gaussianrandom fields to obtain a bound for (614) For this purpose we recall some usefulresults from the theory of Gaussian random fields The first result is due to Dudley[14]

Theorem 67 Let U be a compact subset of Rn and let f0(t) t isin U be amean zero continuous Gaussian random field Define the canonical metric d on Uas

d (s t) =radic

E[f0 (t)minus f0 (s)]2

and put diam (U) = supstisinU d (s t) which is assumed to be finite Then there existsa finite universal constant κ gt 0 such that

E[maxtisinU

f0 (t)] le κ

int diam(U)2

0

[log (N (ε))]12dε

where the entropy N (ε) is the smallest number of dminusballs of radius ε whose unioncovers U

20 ADLER BLANCHET LIU

The second general result that we shall need is the so-called B-TIS (Borel-Tsirelson-Ibragimov-Sudakov) inequality [4 11 26]

Theorem 68 Under the setting described in Theorem 67

P

(maxtisinU

f0 (t)minus E[maxtisinU

f0 (t)] ge b

)le exp

(minusb2(2σ2U )

)

whereσ2U = max

tisinUE[f2

0 (t)]

We can now proceed with the main proof We shall assume from now on thatτγab

= 0 since as will be obvious from what follows all estimates hold uniformlyover τγab

isin T This is a consequence of the uniform Holder assumptions A2 andA3 Define a new process f

(f (t) t isin T ) L= (f (t) t isin T | f (0) = γab + zγab)

Note that we can always write f (t) = micro (t)+g (t) where g is a mean zero Gaussianrandom field on T We have that

micro (t) = Ef (t) = micro (t) + σ (0)minus2C (0 t) (γab + zγab minus micro (0))

and that the covariance function of f is given by

Cg (s t) = Cov (g (s) g (t)) = C (s t)minus σ (0)minus2C (0 s)C (0 t)

The following lemma describes the behavior of micro (t) and Cg (t s)

Lemma 69 Assume that |s| and |t| small enough Then the following threeconclusions hold

(i) There exist constants λ0 and λ1 gt 0 such that

|micro (t)minus (γab + zγab)| le λ0 |t|β + λ1 |t|β (γab + zγab)

and for all z isin (0 1) and γab large enough

|micro (s)minus micro (t) | le κHγab|sminus t|β

(ii)Cg (s t) le 2κHσ (t)σ (s) |t|β + |s|β + |tminus s|β

(iii)

Dg (s t) =radic

E([g (t)minus g (s)]2) le λ121 |tminus s|β2

Proof All three consequences follow from simple algebraic manipulations Thedetails are omitted 2

MONTE CARLO FOR RANDOM FIELDS 21

Proposition 610 For any v gt 0 there exist κ and λ2 such that for all t isin T yminusβd le a2+v

κb(2+v) a sufficiently small and z gt 0

P(m(Aγab

)minus1 gt y m(Ab) gt 0∣∣ f (t) = γab + zγab

) le exp(minusλ2a

2yβdb2)

Proof. For notational simplicity, and without loss of generality, we assume that $t = 0$. First consider the case $z \ge 1$. Then there exist $c_1, c_2$ such that, for all $c_2 y^{-\beta/d} < b^{-2-v}$ and $z > 1$,
\begin{align*}
P\big(m(A_{\gamma_{a,b}} \cap T)^{-1} > y,\ m(A_b) > 0 \,\big|\, f(0) = \gamma_{a,b} + z/\gamma_{a,b}\big)
&\le P\Big(\inf_{|t| < c_1 y^{-1/d}} f(t) \le \gamma_{a,b} \,\Big|\, f(0) = \gamma_{a,b} + z/\gamma_{a,b}\Big) \\
&= P\Big(\inf_{|t| < c_1 y^{-1/d}} \tilde\mu(t) + g(t) \le \gamma_{a,b}\Big) \\
&\le P\Big(\inf_{|t| < c_1 y^{-1/d}} g(t) \le -\frac{1}{2\gamma_{a,b}}\Big).
\end{align*}
Now apply (iii) from Lemma 6.9, from which it follows that $N(\varepsilon) \le c_3 m(T) \varepsilon^{-2d/\beta}$ for some constant $c_3$. By Theorem 6.7, $E(\sup_{|t| < c_1 y^{-1/d}} g(t)) = O(y^{-\beta/(2d)} \log y)$. By Theorem 6.8, for some constant $c_4$,
\begin{align*}
P\big(m(A_{\gamma_{a,b}} \cap T)^{-1} > y,\ m(A_b) > 0 \,\big|\, f(0) = \gamma_{a,b} + z/\gamma_{a,b}\big)
&\le P\Big(\inf_{|t| < c_1 y^{-1/d}} g(t) \le -\frac{1}{2\gamma_{a,b}}\Big) \\
&\le \exp\Big(-\frac{1}{c_4 \gamma_{a,b}^2 y^{-\beta/d}}\Big)
\end{align*}
for $c_2 y^{-\beta/d} < b^{-2-v}$ and $z > 1$.

Now consider the case $z \in (0,1)$. Let $t^*$ be the global maximum of $f(t)$. Then
\begin{align*}
P\big(m(A_{\gamma_{a,b}} \cap T)^{-1} > y,\ m(A_b) > 0 \,\big|\, f(0) = \gamma_{a,b} + z/\gamma_{a,b}\big)
&\le P\Big(\inf_{|t - t^*| < c_1 y^{-1/d}} f(t) < \gamma_{a,b},\ f(t^*) > b \,\Big|\, f(0) = \gamma_{a,b} + z/\gamma_{a,b}\Big) \\
&\le P\Big(\sup_{|s - t| < c_1 y^{-1/d}} |f(s) - f(t)| > a/b \,\Big|\, f(0) = \gamma_{a,b} + z/\gamma_{a,b}\Big).
\end{align*}
Consider the new field $\xi(s,t) = g(s) - g(t)$, with parameter space $T \times T$. Note that
\[
\sqrt{\mathrm{Var}(\xi(s,t))} = D_g(s,t) \le \lambda_1^{1/2} |s-t|^{\beta/2}.
\]
Via basic algebra it is not hard to show that the entropy of $\xi(s,t)$ is bounded by $N_\xi(\varepsilon) \le c_3 m(T \times T) \varepsilon^{-2d/\beta}$. In addition, from (i) of Lemma 6.9, we have
\[
|\tilde\mu(s) - \tilde\mu(t)| \le \kappa_H \gamma_{a,b} |s-t|^\beta.
\]


Similarly, for some $\kappa > 0$ and all $y^{-\beta/d} \le \frac{1}{\kappa}\big(\frac{a}{b}\big)^{2+v}$, $a < 1$, there exists $c_5$ such that
\begin{align*}
P\big(m(A_{\gamma_{a,b}} \cap T)^{-1} > y,\ m(A_b) > 0 \,\big|\, f(0) = \gamma_{a,b} + z/\gamma_{a,b}\big)
&\le P\Big(\sup_{|s-t| < c_1 y^{-1/d}} |f(s) - f(t)| > a/b \,\Big|\, f(0) = \gamma_{a,b} + z/\gamma_{a,b}\Big) \\
&\le P\Big(\sup_{|s-t| < c_1 y^{-1/d}} |\xi(s,t)| > \frac{a}{2b}\Big) \\
&\le \exp\Big(-\frac{a^2}{c_5 b^2 y^{-\beta/d}}\Big).
\end{align*}
Combining the two cases $z > 1$ and $z \in (0,1)$, and choosing $c_5$ large enough, we have
\[
P\big(m(A_{\gamma_{a,b}} \cap T)^{-1} > y,\ m(A_b) > 0 \,\big|\, f(0) = \gamma_{a,b} + z/\gamma_{a,b}\big) \le \exp\Big(-\frac{a^2}{c_5 b^2 y^{-\beta/d}}\Big)
\]
for $a$ small enough and $y^{-\beta/d} \le \frac{1}{\kappa}\big(\frac{a}{b}\big)^{2+v}$. Renaming the constants completes the proof. $\Box$

The final ingredient needed for the proof of Proposition 6.4 is the following lemma involving stochastic domination. The proof follows an elementary argument and is therefore omitted.

Lemma 6.11. Let $\nu_1$ and $\nu_2$ be finite measures on $\mathbb{R}$ and define $\eta_j(x) = \int_x^\infty \nu_j(ds)$. Suppose that $\eta_1(x) \ge \eta_2(x)$ for each $x \ge x_0$. Let $(h(x) : x \ge x_0)$ be a non-decreasing, positive and bounded function. Then
\[
\int_{x_0}^\infty h(s)\, \nu_1(ds) \ge \int_{x_0}^\infty h(s)\, \nu_2(ds).
\]

Proof of Proposition 6.4. Note that
\begin{align*}
& E^{Q'}\Big(\frac{\exp(-M m(A_{\gamma_{a,b}})/m(T))}{m(A_{\gamma_{a,b}})};\ m(A_{\gamma_{a,b}}) \in (0,\alpha),\ m(A_b) > 0\Big) \\
&= \int_T \int_{z=0}^\infty E\Big(\frac{\exp(-M m(A_{\gamma_{a,b}})/m(T))}{m(A_{\gamma_{a,b}})};\ m(A_{\gamma_{a,b}}) \in (0,\alpha),\ m(A_b) > 0 \,\Big|\, f(t) = \gamma_{a,b} + \frac{z}{\gamma_{a,b}}\Big) \\
&\qquad \times P(\tau_{\gamma_{a,b}} \in dt)\, P(\gamma_{a,b}[f(t) - \gamma_{a,b}] \in dz \,|\, f(t) > \gamma_{a,b}) \\
&\le \sup_{z>0,\, t \in T} E\Big(\frac{\exp(-M m(A_{\gamma_{a,b}})/m(T))}{m(A_{\gamma_{a,b}})};\ m(A_{\gamma_{a,b}}) \in (0,\alpha),\ m(A_b) > 0 \,\Big|\, f(t) = \gamma_{a,b} + \frac{z}{\gamma_{a,b}}\Big).
\end{align*}


Now define $Y(b/a) = (b/a)^{2d/\beta} \lambda_2^{-d/\beta} W$, with $\lambda_2$ as chosen in Proposition 6.10, and $W$ with distribution given by $P(W > x) = \exp(-x^{\beta/d})$. By Lemma 6.11,
\begin{align*}
& \sup_{t \in T} E\Big(\frac{\exp(-M m(A_{\gamma_{a,b}})/m(T))}{m(A_{\gamma_{a,b}})};\ m(A_{\gamma_{a,b}}) \in (0,\alpha),\ m(A_b) > 0 \,\Big|\, f(t) = \gamma_{a,b} + \frac{z}{\gamma_{a,b}}\Big) \\
&\qquad \le E\big[Y(b/a) \exp\big(-M Y(b/a)^{-1}/m(T)\big);\ Y(b/a) > \alpha^{-1}\big].
\end{align*}
Now let $Z$ be exponentially distributed with mean 1 and independent of $Y(b/a)$. Then we have (using the definition of the tail distribution of $Z$ and Chebyshev's inequality)
\[
\exp\big(-M Y(b/a)^{-1}/m(T)\big) = P\big(Z > M Y(b/a)^{-1}/m(T) \,\big|\, Y(b/a)\big) \le \frac{m(T)\, Y(b/a)}{M}.
\]
Therefore,
\begin{align*}
E\big[\exp\big(-M Y(b/a)^{-1}/m(T)\big)\, Y(b/a);\ Y(b/a) \ge \alpha^{-1}\big]
&\le \frac{m(T)}{M}\, E\big[Y(b/a)^2;\ Y(b/a) \ge \alpha^{-1}\big] \\
&\le \frac{m(T)}{M \lambda_2^{2d/\beta}} \Big(\frac{b}{a}\Big)^{4d/\beta} E(W^2),
\end{align*}
which completes the proof. $\Box$

6.5. Proof of Proposition 6.5. We start with the following result of Tsirelson [27].

Theorem 6.12. Let $f$ be a continuous separable Gaussian process on a compact (in the canonical metric) domain $T$. Suppose that $\mathrm{Var}(f) = \sigma^2$ is continuous and that $\sigma(t) > 0$ for $t \in T$. Moreover, assume that $\mu = Ef$ is also continuous and $\mu(t) \ge 0$ for all $t \in T$. Define
\[
\sigma_T^2 \triangleq \max_{t \in T} \mathrm{Var}(f(t)),
\]
and set $F(x) = P\{\max_{t \in T} f(t) \le x\}$. Then $F$ is continuously differentiable on $\mathbb{R}$. Furthermore, let $y$ be such that $F(y) > 1/2$ and define $y_*$ by
\[
F(y) = \Phi(y_*).
\]
Then, for all $x > y$,
\[
F'(x) \le \Psi\Big(\frac{x y_*}{y}\Big)\Big(\frac{x y_*}{y}(1 + 2\alpha) + 1\Big)\frac{y_*}{y}(1 + \alpha),
\]
where $\Psi = 1 - \Phi$ denotes the standard Gaussian tail probability and
\[
\alpha = \frac{y^2}{x(x-y)y_*^2}.
\]

We can now prove the following lemma.

Lemma 6.13. There exists a constant $A \in (0,\infty)$, independent of $a$ and $b \ge 0$, such that
\[
P\Big(\sup_{t \in T} f(t) \le b + a/b \,\Big|\, \sup_{t \in T} f(t) > b\Big) \le aA\, \frac{P(\sup_{t \in T} f(t) \ge b - 1/b)}{P(\sup_{t \in T} f(t) \ge b)}. \tag{6.15}
\]

Proof. By subtracting $\inf_{t \in T} \mu(t) > -\infty$ and redefining the level $b$ to be $b - \inf_{t \in T} \mu(t)$, we may simply assume that $Ef(t) \ge 0$, so that we can apply Theorem 6.12. Adopting the notation of Theorem 6.12, first we pick $b_0$ large enough so that $F(b_0) > 1/2$, and assume that $b \ge b_0 + 1$. Now let $y = b - 1/b$ and $F(y) = \Phi(y_*)$. Note that there exists $\delta_0 \in (0,\infty)$ such that $\delta_0 b \le y_* \le \delta_0^{-1} b$ for all $b \ge b_0$. This follows easily from the fact that
\[
\log P\Big\{\sup_{t \in T} f(t) > x\Big\} \sim \log \sup_{t \in T} P\{f(t) > x\} \sim -\frac{x^2}{2\sigma_T^2}.
\]
On the other hand, by Theorem 6.12, $F$ is continuously differentiable, and so
\[
P\Big\{\sup_{t \in T} f(t) < b + a/b \,\Big|\, \sup_{t \in T} f(t) > b\Big\} = \frac{\int_b^{b+a/b} F'(x)\, dx}{P\{\sup_{t \in T} f(t) > b\}}. \tag{6.16}
\]
Moreover,
\begin{align*}
F'(x) &\le \Big(1 - \Phi\Big(\frac{x y_*}{y}\Big)\Big)\Big(\frac{x y_*}{y}(1 + 2\alpha(x)) + 1\Big)\frac{y_*}{y}(1 + \alpha(x)) \\
&\le (1 - \Phi(y_*))\Big(\frac{x y_*}{y}(1 + 2\alpha(x)) + 1\Big)\frac{y_*}{y}(1 + \alpha(x)) \\
&= P\Big(\max_{t \in T} f(t) > b - 1/b\Big)\Big(\frac{x y_*}{y}(1 + 2\alpha(x)) + 1\Big)\frac{y_*}{y}(1 + \alpha(x)).
\end{align*}
Therefore,
\[
\int_b^{b+a/b} F'(x)\, dx \le P\Big(\sup_{t \in T} f(t) > b - 1/b\Big) \int_b^{b+a/b} \Big(\frac{x y_*}{y}(1 + 2\alpha(x)) + 1\Big)\frac{y_*}{y}(1 + \alpha(x))\, dx.
\]
Recalling that $\alpha(x) = y^2/[x(x-y)y_*^2]$, we can use the fact that $y_* \ge \delta_0 b$ to conclude that, if $x \in [b, b+a/b]$, then $\alpha(x) \le \delta_0^{-2}$, and therefore
\[
\int_b^{b+a/b} \Big(\frac{x y_*}{y}(1 + 2\alpha(x)) + 1\Big)\frac{y_*}{y}(1 + \alpha(x))\, dx \le 4\delta_0^{-8}\, a.
\]
We thus obtain that
\[
\frac{\int_b^{b+a/b} F'(x)\, dx}{P\{\sup_{t \in T} f(t) > b\}} \le 4a\delta_0^{-8}\, \frac{P\{\sup_{t \in T} f(t) > b - 1/b\}}{P\{\sup_{t \in T} f(t) > b\}} \tag{6.17}
\]
for any $b \ge b_0$. This inequality, together with the fact that $F$ is continuously differentiable on $(-\infty, \infty)$, yields the proof of the lemma for $b \ge 0$. $\Box$

The previous result translates a question that involves the conditional distribution of $\max_{t \in T} f(t)$ near $b$ into a question involving the tail distribution of $\max_{t \in T} f(t)$. The next result then provides a bound on this tail distribution.

Lemma 6.14. For each $v > 0$ there exists a constant $C(v) \in (0,\infty)$ (possibly depending on $v > 0$, but otherwise independent of $b$) such that
\[
P\Big(\max_{t \in T} f(t) > b\Big) \le C(v)\, b^{2d/\beta + dv + 1} \max_{t \in T} P(f(t) > b)
\]
for all $b \ge 1$.

Proof. The proof of this result follows along the same lines as that of Theorem 2.6.2 in [5]. Consider an open cover $T = \cup_{i=1}^M T_i(\theta)$, where $T_i(\theta) = \{s : |s - t_i| < \theta\}$. We choose the $t_i$ carefully, so that $N(\theta) = O(\theta^{-d})$ for $\theta$ arbitrarily small. Write $f(t) = g(t) + \mu(t)$, where $g(t)$ is a centered Gaussian random field, and note, using A2 and A3, that
\[
P\Big(\max_{t \in T_i(\theta)} f(t) > b\Big) \le P\Big(\max_{t \in T_i(\theta)} g(t) > b - \mu(t_i) - \kappa_H \theta^\beta\Big).
\]
Now we wish to apply the Borell-TIS inequality (Theorem 6.8) with $U = T_i(\theta)$, $f_0 = g$, $d(s,t) = E^{1/2}([g(t) - g(s)]^2)$, which, as a consequence of A2 and A3, is bounded above by $C_0 |t-s|^{\beta/2}$ for some $C_0 \in (0,\infty)$. Thus, applying Theorem 6.7, we have that $E \max_{t \in T_i(\theta)} g(t) \le C_1 \theta^{\beta/2} \log(1/\theta)$ for some $C_1 \in (0,\infty)$. Consequently, the Borell-TIS inequality yields that there exists $C_2(v) \in (0,\infty)$ such that, for all $b$ sufficiently large and $\theta$ sufficiently small, we have
\[
P\Big(\max_{t \in T_i(\theta)} g(t) > b - \mu(t_i) - \kappa_H \theta^\beta\Big) \le C_2(v) \exp\Big(-\frac{(b - \mu(t_i) - C_1 \theta^{\beta/(2+\beta v)})^2}{2\sigma_{T_i}^2}\Big),
\]
where $\sigma_{T_i} = \max_{t \in T_i(\theta)} \sigma(t)$. Now select $v > 0$ small enough and set $\theta^{\beta/(2+\beta v)} = b^{-1}$. Straightforward calculations yield that
\[
P\Big(\max_{t \in T_i(\theta)} f(t) > b\Big) \le P\Big(\max_{t \in T_i(\theta)} g(t) > b - \mu(t_i) - \kappa_H \theta^\beta\Big) \le C_3(v) \max_{t \in T_i(\theta)} \exp\Big(-\frac{(b - \mu(t))^2}{2\sigma(t)^2}\Big)
\]
for some $C_3(v) \in (0,\infty)$. Now recall the well known inequality (valid for $x > 0$)
\[
\phi(x)\Big(\frac{1}{x} - \frac{1}{x^3}\Big) \le 1 - \Phi(x) \le \frac{\phi(x)}{x},
\]
where $\phi = \Phi'$ is the standard Gaussian density. Using this inequality, it follows that $C_4(v) \in (0,\infty)$ can be chosen so that
\[
\max_{t \in T_i(\theta)} \exp\Big(-\frac{(b - \mu(t))^2}{2\sigma(t)^2}\Big) \le C_4(v)\, b \max_{t \in T_i} P(f(t) > b)
\]
for all $b \ge 1$. We then conclude that there exists $C(v) \in (0,\infty)$ such that
\[
P\Big(\max_{t \in T} f(t) > b\Big) \le N(\theta)\, C_4(v)\, b \max_{t \in T} P(f(t) > b) \le C\theta^{-d} b \max_{t \in T} P(f(t) > b) = C b^{2d/\beta + dv + 1} \max_{t \in T} P(f(t) > b),
\]
giving the result. $\Box$

We can now complete the proof of Proposition 6.5.

Proof of Proposition 6.5. The result is a straightforward corollary of the previous two lemmas. By (6.15) in Lemma 6.13 and Lemma 6.14, there exists $\lambda \in (0,\infty)$ for which
\begin{align*}
P\Big(\max_{t \in T} f(t) \le b + a/b \,\Big|\, \max_{t \in T} f(t) > b\Big)
&\le aA\, \frac{P(\max_{t \in T} f(t) \ge b - 1/b)}{P(\max_{t \in T} f(t) \ge b)} \\
&\le aCA\, b^{2d/\beta + dv + 1}\, \frac{\max_{t \in T} P(f(t) > b - 1/b)}{P(\max_{t \in T} f(t) \ge b)} \\
&\le aCA\, b^{2d/\beta + dv + 1}\, \frac{\max_{t \in T} P(f(t) > b - 1/b)}{\max_{t \in T} P(f(t) > b)} \\
&\le a\lambda\, b^{2d/\beta + dv + 1}.
\end{align*}
The last two inequalities follow from the obvious bound
\[
P\Big(\max_{t \in T} f(t) \ge b\Big) \ge \max_{t \in T} P(f(t) > b)
\]
and standard properties of the Gaussian distribution. This yields (6.12), from which the remainder of the proposition follows. $\Box$


7. Fine Tuning: Twice Differentiable Homogeneous Fields. In the preceding section we constructed a polynomial time algorithm based on a randomized discretization scheme. Our goal in this section is to illustrate how to take advantage of additional information to further improve the running time and the efficiency of the algorithm. In order to illustrate our techniques, we shall perform a more refined analysis in the setting of smooth and homogeneous fields, and shall establish optimality of the algorithm in a precise sense, to be described below. Our assumptions throughout this section are B1-B2 of Section 3.

Let $C(s - t) = \mathrm{Cov}(f(s), f(t))$ be the covariance function of $f$, which we assume also has mean zero. Note that it is an immediate consequence of homogeneity and differentiability that $\partial_i C(0) = \partial^3_{ijk} C(0) = 0$.

We shall need the following definition.

Definition 7.1. We call $\mathcal{T} = \{t_1, \ldots, t_M\} \subset T$ a $\theta$-regular discretization of $T$ if and only if
\[
\min_{i \ne j} |t_i - t_j| \ge \theta, \qquad \sup_{t \in T} \min_i |t_i - t| \le 2\theta.
\]

Regularity ensures that points in the grid $\mathcal{T}$ are well separated. Intuitively, since $f$ is smooth, having tight clusters of points translates to a waste of computing resources, as a result of sampling highly correlated values of $f$. Also note that every region containing a ball of radius $2\theta$ has at least one representative in $\mathcal{T}$. Therefore, $\mathcal{T}$ covers the domain $T$ in an economical way. One technical convenience of $\theta$-regularity is that, for subsets $A \subseteq T$ that have positive Lebesgue measure (in particular, ellipsoids),
\[
\lim_{M \to \infty} \frac{\#(A \cap \mathcal{T})}{M} = \frac{m(A)}{m(T)},
\]
where here, and throughout the remainder of the section, $\#(A)$ denotes the cardinality of the set $A$. A lattice construction of such a discretization is sketched below.
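The following sketch (our own illustration, assuming $T = [0,1]^d$) builds such a grid: a cubic lattice of spacing $\theta$ is $\theta$-regular whenever $\sqrt{d} \le 4$, since distinct lattice points are at least $\theta$ apart and every point of $T$ lies within $\theta\sqrt{d}/2 \le 2\theta$ of the lattice.

```python
import itertools
import numpy as np

# A theta-regular discretization of T = [0, 1]^d via a cubic lattice.
# Regularity in the sense of Definition 7.1 holds for d <= 16, because the
# covering radius of the lattice is theta * sqrt(d) / 2 <= 2 * theta.
def theta_regular_grid(theta, d):
    axis = np.arange(0.0, 1.0 + 1e-12, theta)
    return np.array(list(itertools.product(axis, repeat=d)))

grid = theta_regular_grid(theta=0.1, d=2)
print(grid.shape)  # (121, 2): M = 11^2 points, consistent with M = O(theta^{-d})
```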

Let $\mathcal{T} = \{t_1, \ldots, t_M\}$ be a $\theta$-regular discretization of $T$, and consider
\[
X = (X_1, \ldots, X_M)^\top \triangleq (f(t_1), \ldots, f(t_M))^\top.
\]
We shall concentrate on estimating $w_M(b) = P(\max_{1 \le i \le M} X_i > b)$. The next result (which we prove in Section 7.1) shows that if $\theta = \varepsilon/b$, then the relative bias is $O(\sqrt{\varepsilon})$.

Proposition 7.2. Suppose $f$ is a Gaussian random field satisfying conditions B1 and B2. There exist $c_0$, $c_1$, $b_0$ and $\varepsilon_0$ such that, for any finite $\varepsilon/b$-regular discretization $\mathcal{T}$ of $T$,
\[
P\Big(\sup_{t \in \mathcal{T}} f(t) < b \,\Big|\, \sup_{t \in T} f(t) > b\Big) \le c_0 \sqrt{\varepsilon} \qquad \text{and} \qquad \#(\mathcal{T}) \le c_1 m(T) \varepsilon^{-d} b^d, \tag{7.1}
\]
for all $\varepsilon \in (0, \varepsilon_0]$ and $b > b_0$.

Note that the bound on the bias obtained for twice differentiable fields is much sharper than that for the general Hölder continuous fields, given by (6.13) in Theorem 6.6. This is partly because the conditional distribution of the random field around local maxima is harder to describe in the Hölder continuous case than in the case of twice differentiable fields. In addition to the sharper description of the bias, we shall also soon show, in Theorem 7.4, that our choice of discretization is optimal in a certain sense. Finally, we point out that the bound of $\sqrt{\varepsilon}$ in the first term of (7.1) is not optimal. In fact, there seems to be some room for improvement, and we believe that a more careful analysis might yield a bound of the form $c_0 \varepsilon^2$.

We shall estimate $w_M(b)$ by using a slight variation of Algorithm 5.3. In particular, since the $X_i$'s are now identically distributed, we redefine $Q$ to be
\[
Q(X \in B) = \sum_{i=1}^M \frac{1}{M}\, P[X \in B \,|\, X_i > b - 1/b]. \tag{7.2}
\]
Our estimator then takes the form
\[
L_b = \frac{M \times P(X_1 > b - 1/b)}{\sum_{j=1}^M 1(X_j > b - 1/b)}\, 1\Big(\max_{1 \le i \le M} X_i > b\Big). \tag{7.3}
\]
Clearly we have that $E^Q(L_b) = w_M(b)$. (The reason for subtracting the factor of $1/b$ was explained in Section 6.1.)

Algorithm 7.3. Given a number of replications $n$ and an $\varepsilon/b$-regular discretization $\mathcal{T}$, the algorithm is as follows:

Step 1. Sample $X^{(1)}, \ldots, X^{(n)}$, i.i.d. copies of $X$, with distribution $Q$ given by (7.2).
Step 2. Compute and output
\[
L_n = \frac{1}{n} \sum_{i=1}^n L_b^{(i)},
\]
where
\[
L_b^{(i)} = \frac{M \times P(X_1 > b - 1/b)}{\sum_{j=1}^M 1(X_j^{(i)} > b - 1/b)}\, 1\Big(\max_{1 \le j \le M} X_j^{(i)} > b\Big).
\]

A sketch of this procedure in code is given below.
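The following is a minimal sketch of Algorithm 7.3 for a toy stationary field on a grid. It is our own illustration, not the paper's implementation: the squared-exponential covariance, the helper names `algorithm_7_3` and `sample_given_coordinate`, and all numerical parameters are hypothetical; the algorithm itself only requires the covariance matrix of the discretized field and the level $b$.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)

def sample_given_coordinate(cov, i, xi):
    # Draw a centered Gaussian vector conditional on coordinate i equal to xi
    # (standard Gaussian regression formulas).
    c = cov[:, i]
    mean = c * xi / cov[i, i]
    cond_cov = cov - np.outer(c, c) / cov[i, i]
    root = np.linalg.cholesky(cond_cov + 1e-10 * np.eye(len(c)))
    x = mean + root @ rng.standard_normal(len(c))
    x[i] = xi  # enforce the conditioned coordinate exactly
    return x

def algorithm_7_3(cov, b, n_rep=500):
    M = len(cov)
    gamma = b - 1.0 / b                  # shifted level b - 1/b
    p_one = norm.sf(gamma)               # P(X_1 > b - 1/b), unit variance
    L = np.empty(n_rep)
    for r in range(n_rep):
        i = rng.integers(M)              # mixture component of Q, weight 1/M
        xi = norm.isf(rng.uniform() * p_one)  # X_i given X_i > gamma
        x = sample_given_coordinate(cov, i, xi)
        L[r] = M * p_one / np.sum(x > gamma) * (x.max() > b)
    return L.mean(), L.std() / np.sqrt(n_rep)

# Toy example: stationary field on an 8 x 8 grid in [0, 1]^2 with
# (hypothetical) covariance C(s - t) = exp(-|s - t|^2).
pts = np.array([(u, v) for u in np.linspace(0, 1, 8) for v in np.linspace(0, 1, 8)])
cov = np.exp(-((pts[:, None, :] - pts[None, :, :]) ** 2).sum(-1))
print(algorithm_7_3(cov, b=3.0))
```

Note that the denominator $\sum_j 1(X_j > b - 1/b)$ is at least 1 by construction, since the conditioned coordinate itself exceeds $b - 1/b$; this is what keeps each replication of the estimator bounded.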

Theorem 7.5 below guides the selection of $n$ in order to achieve a prescribed relative error. In particular, our analysis, together with considerations from Section 2, implies that choosing $n = O(\varepsilon^{-2}\delta^{-1})$ suffices to achieve $\varepsilon$ relative error with probability at least $1 - \delta$.

Algorithm 7.3 improves on Algorithm 6.1 for Hölder continuous fields in two important ways. The first aspect is that it is possible to obtain information on the size of the relative bias of the estimator. In Proposition 7.2 we saw that, in order to overcome the bias due to discretization, it suffices to take a discretization of size $M = \#(\mathcal{T}) = \Theta(b^d)$. That this selection is also asymptotically optimal, in the sense described in the next result, will be proven in Section 7.1.


Theorem 7.4. Suppose $f$ is a Gaussian random field satisfying conditions B1 and B2. If $\theta \in (0,1)$, then, as $b \to \infty$,
\[
\sup_{\#(\mathcal{T}) \le b^{\theta d}} P\Big(\sup_{t \in \mathcal{T}} f(t) > b \,\Big|\, \sup_{t \in T} f(t) > b\Big) \to 0.
\]

This result implies that the relative bias goes to 100% as $b \to \infty$ if one chooses a discretization scheme of size $O(b^{\theta d})$ with $\theta \in (0,1)$. Consequently, $d$ is the smallest power of $b$ that achieves any given bounded relative bias, and so the suggestion above of choosing $M = O(b^d)$ points for the discretization is, in this sense, optimal.

The second aspect of improvement involves the variance. In the case of Hölder continuous fields, the ratio of the second moment of the estimator to $w(b)^2$ was shown to be bounded by a quantity that is of order $O(M^2)$. In contrast, in the context of the smooth and homogeneous fields considered here, the next result shows that this ratio is bounded uniformly for $b > b_0$ and $M = \#(\mathcal{T}) \ge cb^d$. That is, the variance remains strongly controlled.

Theorem 7.5. Suppose $f$ is a Gaussian random field satisfying conditions B1 and B2. Then there exist constants $c$, $b_0$ and $\varepsilon_0$ such that, for any $\varepsilon/b$-regular discretization $\mathcal{T}$ of $T$, we have
\[
\sup_{b > b_0,\, \varepsilon \in [0,\varepsilon_0]} \frac{E^Q L_b^2}{P^2(\sup_{t \in T} f(t) > b)} \le \sup_{b > b_0,\, \varepsilon \in [0,\varepsilon_0]} \frac{E^Q L_b^2}{P^2\big(\sup_{t \in \mathcal{T}} f(t) > b\big)} \le c
\]
for some $c \in (0,\infty)$.

The proof of this result is given in Section 7.2. The fact that the number of replications remains bounded in $b$ is a consequence of the strong control on the variance.

Finally, we note that the proof of Theorem 3.3 follows as a direct corollary of Theorem 7.5, together with Proposition 7.2 and our discussion in Section 2. Assuming that placing each point in $\mathcal{T}$ takes no more than $c$ units of computer time, the total complexity is, according to the discussion in Section 2, $O(nM^3 + M) = O(\varepsilon^{-2}\delta^{-1}M^3 + M)$. The contribution of the term $M^3 = O(\varepsilon^{-6d} b^{3d})$ comes from the complexity of applying a Cholesky factorization, and the term $M = O(\varepsilon^{-2d} b^d)$ corresponds to the complexity of placing $\mathcal{T}$.

Remark 7.6. Condition B2 imposes a convexity assumption on the boundary of $T$. This assumption, although convenient in the development of the proofs of Theorems 7.4 and 7.5, is not necessary. The results can be generalized, at the expense of increasing the length and the burden of the technical development, to the case in which $T$ is a $d$-dimensional manifold satisfying the so-called Whitney conditions ([4]).

The remainder of this section is devoted to the proofs of Proposition 7.2, Theorem 7.4 and Theorem 7.5.


7.1. Bias Control: Proofs of Proposition 7.2 and Theorem 7.4. We start with some useful lemmas, for all of which we assume that Conditions B1 and B2 are satisfied. We shall also assume that the global maximum of $f$ over $T$ is achieved, with probability one, at a single point in $T$. Additional conditions under which this will happen can be found in [4], and require little more than the non-degeneracy of the joint distribution of $f$ and its first and second order derivatives. Of these lemmas, Lemma 7.7, the proof of which we defer to Section 7.3, is of central importance.

Lemma 7.7. For any $a_0 > 0$ there exist $c_*$, $\delta_*$, $b_0$ and $\delta_0$ (which depend on the choice of $a_0$) such that, for any $s \in T$, $a \in (0, a_0)$, $\delta \in (0, \delta_0)$ and $b > b_0$,
\[
P\Big(\min_{|t-s| < \delta a/b} f(t) < b \,\Big|\, f(s) = b + a/b,\ s \in \mathcal{L}\Big) \le c_* \exp\Big(-\frac{\delta_*}{\delta^2}\Big),
\]
where $\mathcal{L}$ is the set of local maxima of $f$.

The next lemma is a direct corollary of Lemma 7.7.

Lemma 7.8. Let $t^*$ be the point in $T$ at which the global maximum of $f$ is attained. Then, with the same choice of constants as in Lemma 7.7, for any $a \in (0, a_0)$, $\delta \in (0, \delta_0)$ and $b > b_0$, we have
\[
P\Big(\min_{|t-t^*| < \delta a/b} f(t) < b \,\Big|\, f(t^*) = b + a/b\Big) \le 2c_* \exp\Big(-\frac{\delta_*}{\delta^2}\Big).
\]

Proof. Note that, for any $s \in T$,
\begin{align*}
P\Big(\min_{|t-s| \le \delta a/b} f(t) < b \,\Big|\, f(s) = b + a/b,\ s = t^*\Big)
&= \frac{P\big(\min_{|t-s| \le \delta a/b} f(t) < b,\ f(s) = b + a/b,\ s = t^*\big)}{P(f(s) = b + a/b,\ s = t^*)} \\
&\le \frac{P\big(\min_{|t-s| \le \delta a/b} f(t) < b,\ f(s) = b + a/b,\ s \in \mathcal{L}\big)}{P(f(s) = b + a/b,\ s = t^*)} \\
&= P\Big(\min_{|t-s| < \delta a/b} f(t) < b \,\Big|\, f(s) = b + a/b,\ s \in \mathcal{L}\Big)\, \frac{P(f(s) = b + a/b,\ s \in \mathcal{L})}{P(f(s) = b + a/b,\ s = t^*)} \\
&\le c_* \exp\Big(-\frac{\delta_*}{\delta^2}\Big)\, \frac{P(f(s) = b + a/b,\ s \in \mathcal{L})}{P(f(s) = b + a/b,\ s = t^*)}.
\end{align*}
Applying this bound, and the fact that
\[
\frac{P(f(s) = b + a/b,\ s \in \mathcal{L})}{P(f(s) = b + a/b,\ s = t^*)} \to 1
\]
as $b \to \infty$, the conclusion of the lemma holds. $\Box$

MONTE CARLO FOR RANDOM FIELDS 31

The following lemma gives a bound on the density of suptisinT f(t) which will beused to control the size of overshoot beyond level b

Lemma 79 Let pflowast(x) be the density function of suptisinT f(t) Then there existsa constant cflowast and b0 such that

pflowast(x) le cflowastxd+1P (f(0) gt x)

for all x gt b0

Proof Recalling (13) let the continuous function pE(x) x isin R be defined bythe relationship

E(χ(t isin T f(t) ge b)) =int infin

b

pE(x) dx

where the left hand side is the expected value of the Euler-Poincare characteristicof Ab Then according to Theorem 810 in [7] there exists c and δ such that

|pE(x)minus pflowast(x)| lt cP (f(0) gt (1 + δ)x)

for all x gt 0 In addition thanks to the result of [4] which providesintinfin

bpE(x)dx in

closed form there exists c0 such that for all x gt 1

pE(x) lt c0xd+1P (f(0) gt x)

Hence there exists cflowast such that

pflowast(x) le c0xd+1P (f(0) gt x) + cP (f(0) gt (1 + δ)x) le cflowastx

d+1P (f(0) gt x)

for all x gt 1 2

The last ingredients required for the proofs of Proposition 7.2 and Theorem 7.4 are stated in the following well known theorem, adapted from Lemma 6.1 and Theorem 7.2 in [19] to the twice differentiable case.

Theorem 7.10. There exists a constant $H$ (depending on the covariance function $C$) such that
\[
P\Big(\sup_{t \in T} f(t) > b\Big) = (1 + o(1))\, H\, m(T)\, b^d\, P(f(0) > b) \tag{7.4}
\]
as $b \to \infty$.

Similarly, choose $\delta$ small enough so that $[0,\delta]^d \subset T$, and let $\Delta_0 = [0, b^{-1}]^d$. Then there exists a constant $H_1$ such that
\[
P\Big(\sup_{t \in \Delta_0} f(t) > b\Big) = (1 + o(1))\, H_1\, P(f(0) > b) \tag{7.5}
\]
as $b \to \infty$.
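To calibrate (7.4) in the simplest case: for a stationary, unit-variance, smooth process on $[0,L]$ with second spectral moment $\lambda_2$, the classical Rice-formula asymptotics (a standard fact recalled here for illustration; it is not used in the proofs) give
\[
P\Big(\sup_{t \in [0,L]} f(t) > b\Big) = (1 + o(1))\, \frac{L\sqrt{\lambda_2}}{2\pi}\, e^{-b^2/2} = (1 + o(1))\, \sqrt{\frac{\lambda_2}{2\pi}}\; L\, b\, P(f(0) > b),
\]
since $P(f(0) > b) = (1 + o(1))\,\phi(b)/b$. This identifies $H = \sqrt{\lambda_2/(2\pi)}$ in (7.4) when $d = 1$.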


We are now ready to provide the proofs of Proposition 7.2 and Theorem 7.4.

Proof of Proposition 7.2. The fact that there exists $c_1$ such that
\[
\#(\mathcal{T}) \le c_1 m(T) \varepsilon^{-d} b^d
\]
is immediate from assumption B2. Therefore we proceed to provide a bound for the relative bias. We choose $\varepsilon_0 < \delta_0^2$ ($\delta_0$ as given in Lemmas 7.7 and 7.8) and note that, for any $\varepsilon < \varepsilon_0$,
\begin{align*}
P\Big(\sup_{t \in \mathcal{T}} f(t) < b \,\Big|\, \sup_{t \in T} f(t) > b\Big)
&\le P\Big(\sup_{t \in T} f(t) < b + 2\sqrt{\varepsilon}/b \,\Big|\, \sup_{t \in T} f(t) > b\Big) \\
&\quad + P\Big(\sup_{t \in \mathcal{T}} f(t) < b \,\Big|\, \sup_{t \in T} f(t) > b + 2\sqrt{\varepsilon}/b\Big).
\end{align*}
By (7.4) and Lemma 7.9, there exists $c_2$ such that the first term above can be bounded by
\[
P\Big(\sup_{t \in T} f(t) < b + 2\sqrt{\varepsilon}/b \,\Big|\, \sup_{t \in T} f(t) > b\Big) \le c_2 \sqrt{\varepsilon}.
\]
The second term can be bounded by
\begin{align*}
P\Big(\sup_{t \in \mathcal{T}} f(t) < b \,\Big|\, \sup_{t \in T} f(t) > b + 2\sqrt{\varepsilon}/b\Big)
&\le P\Big(\inf_{|t - t^*| < 2\varepsilon/b} f(t) < b \,\Big|\, \sup_{t \in T} f(t) > b + 2\sqrt{\varepsilon}/b\Big) \\
&\le 2c_* \exp(-\delta_* \varepsilon^{-1}).
\end{align*}
Hence there exists $c_0$ such that
\[
P\Big(\sup_{t \in \mathcal{T}} f(t) < b \,\Big|\, \sup_{t \in T} f(t) > b\Big) \le c_2\sqrt{\varepsilon} + 2c_* e^{-\delta_*/\varepsilon} \le c_0 \sqrt{\varepsilon}
\]
for all $\varepsilon \in (0, \varepsilon_0)$. $\Box$

Proof of Theorem 7.4. We write $\theta = 1 - 3\delta \in (0,1)$. First note that, by (7.4),
\[
P\Big(\sup_T f(t) > b + b^{2\delta-1} \,\Big|\, \sup_T f(t) > b\Big) \to 0
\]
as $b \to \infty$. Let $t^*$ be the position of the global maximum of $f$ in $T$. According to the exact Slepian model in Section 7.3, and an argument similar to the proof of Lemma 7.8,
\[
P\Big(\sup_{|t-t^*| > b^{2\delta-1}} f(t) > b \,\Big|\, b < f(t^*) \le b + b^{2\delta-1}\Big) \to 0 \tag{7.6}
\]
as $b \to \infty$. Consequently,
\[
P\Big(\sup_{|t-t^*| > b^{2\delta-1}} f(t) > b \,\Big|\, \sup_T f(t) > b\Big) \to 0.
\]
Let
\[
B(\mathcal{T}, b^{2\delta-1}) = \cup_{t \in \mathcal{T}} B(t, b^{2\delta-1}).
\]
We have
\begin{align*}
P\Big(\sup_{\mathcal{T}} f(t) > b \,\Big|\, \sup_T f(t) > b\Big)
&\le P\Big(\sup_{|t-t^*| > b^{2\delta-1}} f(t) > b \,\Big|\, \sup_T f(t) > b\Big) \\
&\quad + P\Big(\sup_{\mathcal{T}} f(t) > b,\ \sup_{|t-t^*| > b^{2\delta-1}} f(t) \le b \,\Big|\, \sup_T f(t) > b\Big) \\
&\le o(1) + P\big(t^* \in B(\mathcal{T}, b^{2\delta-1}) \,\big|\, \sup_T f(t) > b\big) \\
&\le o(1) + P\Big(\sup_{B(\mathcal{T}, b^{2\delta-1})} f(t) > b \,\Big|\, \sup_T f(t) > b\Big).
\end{align*}
Since $\#(\mathcal{T}) \le b^{(1-3\delta)d}$, we can find a finite set $T' = \{t'_1, \ldots, t'_l\} \subset T$, with $\Delta_k = t'_k + [0, b^{-1}]^d$, such that $l = O(b^{(1-\delta)d})$ and $B(\mathcal{T}, b^{2\delta-1}) \subset \cup_{k=1}^l \Delta_k$. The choice of $l$ only depends on $\#(\mathcal{T})$, not on the particular distribution of $\mathcal{T}$. Therefore, applying (7.5),
\[
\sup_{\#(\mathcal{T}) \le b^{\theta d}} P\Big(\sup_{B(\mathcal{T}, b^{2\delta-1})} f(t) > b\Big) \le O(b^{(1-\delta)d})\, P(f(0) > b).
\]
This, together with (7.4), yields
\[
\sup_{\#(\mathcal{T}) \le b^{\theta d}} P\Big(\sup_{B(\mathcal{T}, b^{2\delta-1})} f(t) > b \,\Big|\, \sup_T f(t) > b\Big) \le O(b^{-\delta d}) = o(1)
\]
for $b \ge b_0$, which clearly implies the statement of the result. $\Box$

7.2. Variance Control: Proof of Theorem 7.5. We proceed directly to the proof of Theorem 7.5.

Proof of Theorem 7.5. Note that
\begin{align*}
\frac{E^Q L_b^2}{P^2(\sup_{t \in T} f(t) > b)}
&= \frac{E(L_b;\ \max_j X_j > b)}{P^2(\sup_{t \in T} f(t) > b)} \\
&= \frac{E\Big(\frac{M \times P(X_1 > b - 1/b)}{\sum_{j=1}^M 1(X_j > b - 1/b)};\ \max_j X_j > b\Big)}{P^2(\sup_{t \in T} f(t) > b)} \\
&= \frac{E\Big(\frac{M \times P(X_1 > b - 1/b)}{\sum_{j=1}^M 1(X_j > b - 1/b)};\ \max_j X_j > b,\ \sup_{t \in T} f(t) > b\Big)}{P^2(\sup_{t \in T} f(t) > b)} \\
&= \frac{E\Big(\frac{M P(X_1 > b - 1/b)\, 1(\max_j X_j > b)}{\sum_{j=1}^M 1(X_j > b - 1/b)} \,\Big|\, \sup_{t \in T} f(t) > b\Big)}{P(\sup_{t \in T} f(t) > b)} \\
&= E\Bigg(\frac{M\, 1(\max_j X_j > b)}{\sum_{j=1}^M 1(X_j > b - 1/b)} \,\Bigg|\, \sup_{t \in T} f(t) > b\Bigg)\, \frac{P(X_1 > b - 1/b)}{P(\sup_{t \in T} f(t) > b)}.
\end{align*}

The remainder of the proof involves showing that the last conditional expectation here is of order $O(b^d)$. This, together with (7.4), will yield the result. Note that, for any $A(b,\varepsilon)$ such that $A(b,\varepsilon) = \Theta(M)$ uniformly over $b$ and $\varepsilon$, we can write
\begin{align*}
& E\Bigg(\frac{M \times 1(\max_j X_j > b)}{\sum_{j=1}^M 1(X_j > b - 1/b)} \,\Bigg|\, \sup_{t \in T} f(t) > b\Bigg) \tag{7.7} \\
&\le E\Bigg(\frac{M \times 1\big(\sum_{j=1}^M 1(X_j > b - 1/b) \ge \frac{M}{A(b,\varepsilon)}\big)}{\sum_{j=1}^M 1(X_j > b - 1/b)} \,\Bigg|\, \sup_{t \in T} f(t) > b\Bigg) \\
&\quad + E\Bigg(\frac{M \times 1\big(1 \le \sum_{j=1}^M 1(X_j > b - 1/b) < \frac{M}{A(b,\varepsilon)}\big)}{\sum_{j=1}^M 1(X_j > b - 1/b)} \,\Bigg|\, \sup_{t \in T} f(t) > b\Bigg).
\end{align*}

We shall select $A(b,\varepsilon)$ appropriately in order to bound the expectations above. By Lemma 7.8, for any $4\varepsilon \le \delta \le \delta_0$, there exist constants $c'$ and $c'' \in (0,\infty)$ such that
\begin{align*}
c_* \exp\Big(-\frac{\delta_*}{\delta^2}\Big)
&\ge P\Big(\min_{|t-t^*| \le \delta/b} f(t) < b - 1/b \,\Big|\, \sup_{t \in T} f(t) > b\Big) \\
&\ge P\Bigg(\sum_{j=1}^M 1(X_j > b - 1/b) \le c' \delta^d \varepsilon^{-d} \,\Bigg|\, \sup_{t \in T} f(t) > b\Bigg) \\
&\ge P\Bigg(\frac{M}{\sum_{j=1}^M 1(X_j > b - 1/b)} \ge \frac{b^d c''}{\delta^d} \,\Bigg|\, \sup_{t \in T} f(t) > b\Bigg). \tag{7.8}
\end{align*}
The first inequality is an application of Lemma 7.8. The second inequality is due to the fact that, for any ball $B$ of radius $\delta/b$ with $\delta \ge 4\varepsilon$, we have $\#(\mathcal{T} \cap B) \ge c' \delta^d \varepsilon^{-d}$ for some $c' > 0$. Inequality (7.8) implies that, for all $x$ such that $c''/\delta_0^d < x < c''/(4^d \varepsilon^d)$, there exists $\delta_{**} > 0$ such that
\[
P\Bigg(\frac{M}{b^d \sum_{j=1}^M 1(X_j > b - 1/b)} \ge x \,\Bigg|\, \sup_{t \in T} f(t) > b\Bigg) \le c_* \exp\big(-\delta_{**}\, x^{2/d}\big).
\]

Now let $A(b,\varepsilon) = b^d c''/(4^d \varepsilon^d)$ and observe that, by the second result in (7.1), we have $A(b,\varepsilon) = \Theta(M)$ and, moreover, that there exists $c_3$ such that the first term on the right hand side of (7.7) is bounded by
\[
E\Bigg(\frac{M \times 1\big(\sum_{j=1}^M 1(X_j > b - 1/b) \ge \frac{M}{A(b,\varepsilon)}\big)}{\sum_{j=1}^M 1(X_j > b - 1/b)} \,\Bigg|\, \sup_T f(t) > b\Bigg) \le c_3 b^d. \tag{7.9}
\]

Now we turn to the second term on the right hand side of (7.7). We use the fact that $M/A(b,\varepsilon) \le c''' \in (0,\infty)$ (uniformly as $b \to \infty$ and $\varepsilon \to 0$). There exist $c_4$ and $c_5$ such that, for $\varepsilon \le \delta_0/c_4$,
\begin{align*}
& E\Bigg(\frac{M\, 1\big(1 \le \sum_{j=1}^M 1(X_j > b - 1/b) < \frac{M}{A(b,\varepsilon)}\big)}{\sum_{j=1}^M 1(X_j > b - 1/b)} \,\Bigg|\, \sup_T f(t) > b\Bigg) \tag{7.10} \\
&\le M\, P\Bigg(\sum_{j=1}^M 1(X_j > b - 1/b) < c''' \,\Bigg|\, \sup_T f(t) > b\Bigg) \\
&\le M\, P\Big(\min_{|t-t^*| \le c_4 \varepsilon/b} f(t) < b - 1/b \,\Big|\, \sup_T f(t) > b\Big) \\
&\le c_1 m(T)\, b^d \varepsilon^{-d} \exp\Big(-\frac{\delta_*}{c_4^2 \varepsilon^2}\Big) \le c_5 b^d.
\end{align*}
The second inequality holds because, if $\sum_{j=1}^M 1(X_j > b - 1/b)$ is less than $c'''$, then the minimum of $f$ in a ball of radius $c_4 \varepsilon/b$ around the global maximum $t^*$ must be less than $b - 1/b$; otherwise there would be more than $c'''$ elements of $\mathcal{T}$ inside such a ball. The last inequality is due to Lemma 7.8 and Proposition 7.2.

Putting (7.9) and (7.10) together, we obtain, for all $\varepsilon/b$-regular discretizations with $\varepsilon < \varepsilon_0 = \min(1/4, 1/c_4)\delta_0$,
\begin{align*}
\frac{E^Q L_b^2}{P^2(\sup_{t \in T} f(t) > b)}
&\le E\Bigg(\frac{M\, 1(\max_j X_j > b)}{\sum_{j=1}^M 1(X_j > b - 1/b)} \,\Bigg|\, \sup_{t \in T} f(t) > b\Bigg)\, \frac{P(X_1 > b - 1/b)}{P(\sup_{t \in T} f(t) > b)} \\
&\le (c_3 + c_5)\, b^d\, \frac{P(X_1 > b - 1/b)}{P(\sup_{t \in T} f(t) > b)}.
\end{align*}
Applying now (7.4), together with the bound
\[
P\Big(\sup_{t \in T} f(t) > b\Big) < \frac{P(\sup_{t \in \mathcal{T}} f(t) > b)}{1 - c_0\sqrt{\varepsilon}}
\]
from Proposition 7.2, we conclude that
\[
\sup_{b > b_0,\, \varepsilon \in [0,\varepsilon_0]} \frac{E^Q L_b^2}{P^2\big(\sup_{t \in \mathcal{T}} f(t) > b\big)} < \infty,
\]
as required. $\Box$

7.3. A Final Proof: Lemma 7.7. Without loss of generality, we assume that the field has mean zero and unit variance. Before getting into the details of the proof of Lemma 7.7, we need a few additional lemmas, for which we adopt the following notation. Let $C_i$ and $C_{ij}$ be the first and second order derivatives of $C$, and define the vectors
\begin{align*}
\mu_1(t) &= (-C_1(t), \ldots, -C_d(t)), \\
\mu_2(t) &= \mathrm{vech}\big((C_{ij}(t) : i = 1, \ldots, d,\ j = i, \ldots, d)\big).
\end{align*}
Let $f'(0)$ and $f''(0)$ be the gradient and vector of second order derivatives of $f$ at 0, where $f''(0)$ is arranged in the same order as $\mu_2(0)$. Furthermore, let $\mu_{02} = \mu_{20}^\top$ be a vector of second order spectral moments and $\mu_{22}$ a matrix of fourth order spectral moments. The vectors $\mu_{02}$ and $\mu_{22}$ are arranged so that
\[
\begin{pmatrix}
1 & 0 & \mu_{02} \\
0 & \Lambda & 0 \\
\mu_{20} & 0 & \mu_{22}
\end{pmatrix}
\]
is the covariance matrix of $(f(0), f'(0), f''(0))$, where $\Lambda = (-C_{ij}(0))$. It then follows that
\[
\mu_{2 \cdot 0} = \mu_{22} - \mu_{20}\mu_{02}
\]
is the conditional variance of $f''(0)$ given $f(0)$. The following lemma, given in [5], provides a stochastic representation of $f$ given that it has a local maximum at level $u$ at the origin.

Lemma 7.11. Given that $f$ has a local maximum with height $u$ at zero (an interior point of $T$), the conditional field is equal in distribution to
\[
f_u(t) \triangleq uC(t) - W_u \beta^\top(t) + g(t), \tag{7.11}
\]
where $g(t)$ is a centered Gaussian random field with covariance function
\[
\gamma(s,t) = C(s-t) - (C(s), \mu_2(s))
\begin{pmatrix}
1 & \mu_{02} \\
\mu_{20} & \mu_{22}
\end{pmatrix}^{-1}
\begin{pmatrix}
C(t) \\ \mu_2^\top(t)
\end{pmatrix}
- \mu_1(s)\Lambda^{-1}\mu_1^\top(t),
\]
and $W_u$ is a $d(d+1)/2$-dimensional random vector, independent of $g(t)$, with density function
\[
\psi_u(w) \propto |\det(r^*(w) - u\Lambda)| \exp\Big(-\frac{1}{2} w^\top \mu_{2\cdot0}^{-1} w\Big)\, 1(r^*(w) - u\Lambda \in \mathcal{N}), \tag{7.12}
\]
where $r^*(w)$ is a $d \times d$ symmetric matrix whose upper triangular elements consist of the components of $w$. The set of negative definite matrices is denoted by $\mathcal{N}$. Finally, $\beta(t)$ is defined by
\[
(\alpha(t), \beta(t)) = (C(t), \mu_2(t))
\begin{pmatrix}
1 & \mu_{02} \\
\mu_{20} & \mu_{22}
\end{pmatrix}^{-1}.
\]

The following two technical lemmas, which we shall prove after completing the proof of Lemma 7.7, provide bounds for the last two terms of (7.11).

Lemma 7.12. Using the notation in (7.11), there exist $\delta_0$, $\varepsilon_1$, $c_1$ and $b_0$ such that, for any $u > b > b_0$ and $\delta \in (0, \delta_0)$,
\[
P\Big(\sup_{|t| \le \delta a/b} |W_u \beta^\top(t)| > \frac{a}{4b}\Big) \le c_1 \exp\Big(-\frac{\varepsilon_1 b^2}{\delta^4}\Big).
\]

Lemma 7.13. There exist $c$, $\tilde\delta$ and $\delta_0$ such that, for any $\delta \in (0, \delta_0)$,
\[
P\Big(\max_{|t| \le \delta a/b} |g(t)| > \frac{a}{4b}\Big) \le c \exp\Big(-\frac{\tilde\delta}{\delta^2}\Big).
\]

Proof of Lemma 7.7. Using the notation of Lemma 7.11, given any $s \in \mathcal{L}$ for which $f(s) = b$, the corresponding conditional distribution of $f$ is that of $f_b(\cdot - s)$. Consequently, it suffices to show that
\[
P\Big(\min_{|t| \le \delta a/b} f_b(t) < b - a/b\Big) \le c_* \exp\Big(-\frac{\delta_*}{\delta^2}\Big).
\]

We consider firstly the case for which the local maximum is in the interior of $T$. Then, by the Slepian model (7.11),
\[
f_b(t) = bC(t) - W_b \beta^\top(t) + g(t).
\]
We study the three terms of the Slepian model individually. Since
\[
C(t) = 1 - t^\top \Lambda t + o(|t|^2),
\]
there exists an $\varepsilon_0$ such that
\[
bC(t) \ge b - \frac{a}{4b}
\]
for all $|t| < \varepsilon_0 \sqrt{a}/b$. According to Lemmas 7.12 and 7.13, for $\delta < \min(\varepsilon_0/\sqrt{a_0},\, \delta_0)$,
\begin{align*}
P\Big(\min_{|t| \le \delta a/b} f_b(t) < b - a/b\Big)
&\le P\Big(\max_{|t| < \delta a/b} |g(t)| > \frac{a}{4b}\Big) + P\Big(\sup_{|t| \le \delta a/b} |W_b \beta^\top(t)| > \frac{a}{4b}\Big) \\
&\le c \exp\Big(-\frac{\tilde\delta}{\delta^2}\Big) + c_1 \exp\Big(-\frac{\varepsilon_1 b^2}{\delta^4}\Big) \\
&\le c_* \exp\Big(-\frac{\delta_*}{\delta^2}\Big)
\end{align*}

for some $c_*$ and $\delta_*$.

Now consider the case for which the local maximum is in the $(d-1)$-dimensional boundary of $T$. Due to the convexity of $T$, and without loss of generality, we assume that the tangent space of $\partial T$ is generated by $\partial/\partial t_2, \ldots, \partial/\partial t_d$, that the local maximum is located at the origin, and that $T$ is a subset of the half-space $\{t_1 \ge 0\}$. That these assumptions do not involve a loss of generality follows from the arguments on pages 291-292 of [4], which rely on the assumed stationarity of $f$ (for translations) and the fact that rotations, while changing the distributions, will not affect the probabilities that we are currently computing.

For the origin, positioned as just described, to be a local maximum, it is necessary and sufficient that the gradient of $f$ restricted to $\partial T$ is the zero vector, that the Hessian matrix restricted to $\partial T$ is negative definite, and that $\partial_1 f(0) \le 0$. Applying a version of Lemma 7.11 for this case, conditional on $f(0) = u$ and 0 being a local maximum, the field is equal in distribution to
\[
uC(t) - W_u \beta^\top(t) + \mu_1(t)\Lambda^{-1}(Z, 0, \ldots, 0)^\top + g(t), \tag{7.13}
\]
where $Z \le 0$ corresponds to $\partial_1 f(0)$ and follows a truncated Gaussian distribution (conditioned to the negative half-axis), with mean zero and a variance parameter given by the conditional variance of $\partial_1 f(0)$ given $(\partial_2 f(0), \ldots, \partial_d f(0))$. The vector $W_u$ is now a $(d(d-1)/2)$-dimensional random vector with density function
\[
\psi_u(w) \propto |\det(r^*(w) - u\Lambda)| \exp\Big(-\frac{1}{2} w^\top \mu_{2\cdot0}^{-1} w\Big)\, 1(r^*(w) - u\Lambda \in \mathcal{N}),
\]
where $\Lambda$ is now the second spectral moment matrix of $f$ restricted to $\partial T$ and $r^*(w)$ is the $(d-1) \times (d-1)$ symmetric matrix whose upper triangular elements consist of the components of $w$. In the representation (7.13), the vector $W_u$ and $Z$ are independent. As in the proof of Lemma 7.12, one can show that a similar bound holds for $W_u\beta^\top(t)$, albeit with different constants. Thus, since $\mu_1(t) = O(t)$, there exist $c''$ and $\delta''$ such that the third term in (7.13) can be bounded by
\[
P\Big(\max_{|t| \le \delta a/b} \big|\mu_1(t)\Lambda^{-1}(Z, 0, \ldots, 0)^\top\big| \ge \frac{a}{4b}\Big) \le c'' \exp\Big(-\frac{\delta''}{\delta^2}\Big).
\]
Consequently, we can also find $c_*$ and $\delta_*$ such that the conclusion holds, and we are done. $\Box$

We complete the paper with the proofs of Lemmas 7.12 and 7.13.

Proof of Lemma 7.12. It suffices to prove the lemma for the case $a = 1$. Since
\[
f_u(0) = u = u - W_u\beta^\top(0) + g(0),
\]
and $W_u$ and $g$ are independent, $\beta(0) = 0$. Furthermore, since $C'(t) = O(t)$ and $\mu_2'(t) = O(t)$, there exists a $c_0$ such that $|\beta(t)| \le c_0|t|^2$. In addition, $W_u$ has density function
\[
\psi_u(w) \propto |\det(r^*(w) - u\Lambda)| \exp\Big(-\frac{1}{2} w^\top \mu_{2\cdot0}^{-1} w\Big)\, 1(r^*(w) - u\Lambda \in \mathcal{N}).
\]
Note that $\det(r^*(w) - u\Lambda)$ is expressible as a polynomial in $w$ and $u$, and there exist some $\varepsilon_0$ and $c$ such that
\[
\Big|\frac{\det(r^*(w) - u\Lambda)}{\det(-u\Lambda)}\Big| \le c
\]
if $|w| \le \varepsilon_0 u$. Hence there exist $\varepsilon_2, c_2 > 0$ such that
\[
\psi_u(w) \le \bar\psi(w) \triangleq c_2 \exp\Big(-\frac{1}{2}\varepsilon_2\, w^\top \mu_{2\cdot0}^{-1} w\Big)
\]
for all $u \ge 1$. The right hand side here is proportional to a multivariate Gaussian density. Thus
\[
P(|W_u| > x) = \int_{|w| > x} \psi_u(w)\, dw \le \int_{|w| > x} \bar\psi(w)\, dw = c_3\, P(|\bar W| > x),
\]
where $\bar W$ is a multivariate Gaussian random variable with density function proportional to $\bar\psi$. Therefore, by choosing $\varepsilon_1$ and $c_1$ appropriately, we have
\[
P\Big(\sup_{|t| \le \delta/b} |W_u \beta^\top(t)| > \frac{1}{4b}\Big) \le P\Big(|W_u| > \frac{b}{4c_0\delta^2}\Big) \le c_1 \exp\Big(-\frac{\varepsilon_1 b^2}{\delta^4}\Big)
\]
for all $u \ge b$. $\Box$

Proof of Lemma 7.13. Once again, it suffices to prove the lemma for the case $a = 1$. Since
\[
f_b(0) = b = b - W_b\beta^\top(0) + g(0),
\]
the covariance function $(\gamma(s,t) : s,t \in T)$ of the centered field $g$ satisfies $\gamma(0,0) = 0$. It is also easy to check that
\[
\partial_s \gamma(s,t) = O(|s| + |t|), \qquad \partial_t \gamma(s,t) = O(|s| + |t|).
\]
Consequently, there exists a constant $c_\gamma \in (0,\infty)$ for which
\[
\gamma(s,t) \le c_\gamma(|s|^2 + |t|^2), \qquad \gamma(s,s) \le c_\gamma |s|^2.
\]
We need to control the tail probability of $\sup_{|t| \le \delta/b} |g(t)|$. For this it is useful to introduce the following scaling. Define
\[
g_\delta(t) = \frac{b}{\delta}\, g\Big(\frac{\delta t}{b}\Big).
\]
Then $\sup_{|t| \le \delta/b} g(t) \ge 1/(4b)$ if and only if $\sup_{|t| \le 1} g_\delta(t) \ge 1/(4\delta)$. Let
\[
\sigma_\delta(s,t) = E(g_\delta(s)\, g_\delta(t)).
\]
Then
\[
\sup_{|s| \le 1} \sigma_\delta(s,s) \le c_\gamma.
\]
Because $\gamma(s,t)$ is at least twice differentiable, applying a Taylor expansion we easily see that the canonical metric $d_g$ corresponding to $g_\delta$ (cf. Theorem 6.7) can be bounded as follows:
\[
d_g^2(s,t) = E(g_\delta(s) - g_\delta(t))^2 = \frac{b^2}{\delta^2}\Big[\gamma\Big(\frac{\delta s}{b}, \frac{\delta s}{b}\Big) + \gamma\Big(\frac{\delta t}{b}, \frac{\delta t}{b}\Big) - 2\gamma\Big(\frac{\delta s}{b}, \frac{\delta t}{b}\Big)\Big] \le c|s-t|^2
\]
for some constant $c \in (0,\infty)$. Therefore the entropy of $g_\delta$ evaluated at $\varepsilon$ is bounded by $K\varepsilon^{-d}$, for any $\varepsilon > 0$ and an appropriate choice of $K > 0$. Therefore, for all $\delta < \delta_0$,
\[
P\Big(\sup_{|t| \le \delta/b} |g(t)| \ge \frac{1}{4b}\Big) = P\Big(\sup_{|t| \le 1} |g_\delta(t)| \ge \frac{1}{4\delta}\Big) \le c_d\, \delta^{-d-\eta} \exp\Big(-\frac{1}{16 c_\gamma \delta^2}\Big)
\]
for some constant $c_d$ and $\eta > 0$. The last inequality is a direct application of Theorem 4.1.1 of [4]. The conclusion of the lemma follows immediately by choosing $c$ and $\tilde\delta$ appropriately. $\Box$

REFERENCES

[1] R.J. Adler, K. Bartz, and S. Kou. Estimating curvatures in the Gaussian kinematic formula. In preparation.
[2] R.J. Adler, J. Blanchet, and J. Liu. Efficient simulation for tail probabilities of Gaussian random fields. In WSC '08: Proceedings of the 40th Conference on Winter Simulation, pages 328-336. Winter Simulation Conference, 2008.
[3] R.J. Adler, P. Müller, and B. Rozovskii, editors. Stochastic Modelling in Physical Oceanography. Birkhäuser, Boston, 1996.
[4] R.J. Adler and J.E. Taylor. Random Fields and Geometry. Springer, New York, USA, 2007.
[5] R.J. Adler, J.E. Taylor, and K. Worsley. Applications of Random Fields and Geometry: Foundations and Case Studies. http://webee.technion.ac.il/people/adler/hrf.pdf.
[6] S. Asmussen and P. Glynn. Stochastic Simulation: Algorithms and Analysis. Springer-Verlag, New York, NY, USA, 2008.
[7] J.M. Azais and M. Wschebor. Level Sets and Extrema of Random Processes and Fields. Wiley, New York, NY, 2009.
[8] J.M. Bardeen, J.R. Bond, N. Kaiser, and A.S. Szalay. The statistics of peaks of Gaussian random fields. The Astrophysical Journal, 304:15-61, 1986.
[9] S.M. Berman. Sojourns and Extremes of Stochastic Processes. The Wadsworth & Brooks/Cole Statistics/Probability Series. Wadsworth & Brooks/Cole Advanced Books & Software, Pacific Grove, CA, 1992.
[10] J. Blanchet. Efficient importance sampling for binary contingency tables. Ann. of Appl. Prob., 19:949-982, 2009.
[11] C. Borell. The Brunn-Minkowski inequality in Gauss space. Invent. Math., 30:205-216, 1975.
[12] J. Bucklew. Introduction to Rare-Event Simulation. Springer-Verlag, New York, NY, 2004.
[13] M.R. Dennis. Nodal densities of planar Gaussian random waves. Eur. Phys. J., 145:191-210, 2007.
[14] R.M. Dudley. Sample functions of the Gaussian process. Ann. Probab., 1:66-103, 1973.
[15] K.J. Friston, K.J. Worsley, R.S.J. Frackowiak, J.C. Mazziotta, and A.C. Evans. Assessing the significance of focal activations using their spatial extent. Human Brain Mapping, 1:214-220, 1994.
[16] M.R. Leadbetter, G. Lindgren, and H. Rootzen. Extremes and Related Properties of Random Sequences and Processes. Springer-Verlag, New York, 1983.
[17] M. Mitzenmacher and E. Upfal. Probability and Computing: Randomized Algorithms and Probabilistic Analysis. Cambridge University Press, 2005.
[18] W. Niemiro and P. Pokarowski. Fixed precision MCMC estimation by median of products of averages. J. Appl. Probab., 46:309-329, 2009.
[19] V. Piterbarg. Asymptotic Methods in the Theory of Gaussian Processes. American Mathematical Society, USA, 1995.
[20] S.F. Shandarin. Testing non-Gaussianity in cosmic microwave background maps by morphological statistics. Mon. Not. R. Astr. Soc., 331:865-874, 2002.
[21] S.F. Shandarin, H.A. Feldman, Y. Xu, and M. Tegmark. Morphological measures of non-gaussianity in cosmic microwave background maps. Astrophys. J. Suppl., 141:1-11, 2002.
[22] L. Shepp and H. Landau. On the supremum of a Gaussian process. Sankhya A, 32:369-378, December 1970.
[23] J.E. Taylor, A. Takemura, and R.J. Adler. Validity of the expected Euler characteristic heuristic. Ann. Probab., 33(4):1362-1396, 2005.
[24] J.E. Taylor and K.J. Worsley. Detecting sparse signals in random fields, with an application to brain mapping. J. Amer. Statist. Assoc., 102(479):913-928, 2007.
[25] J. Traub, G. Wasilokowski, and H. Wozniakowski. Information-Based Complexity. Academic Press, New York, NY, 1988.
[26] B.S. Tsirelson, I.A. Ibragimov, and V.N. Sudakov. Norms of Gaussian sample functions. Proceedings of the Third Japan-USSR Symposium on Probability Theory (Tashkent, 1975), 550:20-41, 1976.
[27] V.S. Tsirel'son. The density of the maximum of a Gaussian process. Theory of Probab. Appl., 20:847-856, 1975.
[28] H. Wozniakowski. Computational complexity of continuous problems, 1996. Technical Report.

Robert J. Adler
Electrical Engineering
Technion, Haifa, Israel 32000
E-mail: robert@ee.technion.ac.il
URL: webee.technion.ac.il/people/adler

Jose H. Blanchet
Industrial Engineering & Operations Research
Columbia University
340 S.W. Mudd Building, 500 W. 120 Street
New York, New York 10027
E-mail: jose.blanchet@columbia.edu
URL: www.ieor.columbia.edu/fac-bios/blanchet/faculty.html

Jingchen Liu
Department of Statistics
Columbia University
1255 Amsterdam Avenue, Room 1030
New York, NY 10027
E-mail: jcliu@stat.columbia.edu
URL: stat.columbia.edu/~jcliu

Page 11: EFFICIENT MONTE CARLO FOR HIGH EXCURSIONS OF … › ~jblanche › papers › GRF1.pdf · 1. Introduction. This paper centers on the design and analysis of e–cient Monte Carlo techniques

MONTE CARLO FOR RANDOM FIELDS 11

Corollary 52dTV (QlowastQ) rarr 0

as b rarrinfin where dTV denotes the total variation norm and Q is defined for BorelB sub RM as

Q (X isin B) =Msum

i=1

p (i b)P [X isin B|Xi gt b]

where

p (i b) =P (Xi gt b)sumM

j=1 P (Xj gt b)

Proof Pick an arbitrary Borel B Then we have that

Qlowast (X isin B) =P [X isin B max1leileM Xi gt b]

w (b)le

Msum

i=1

P [X isin B Xi gt b]w (b)

=Msum

i=1

P [X isin B|Xi gt b]p (i b)

(1 + o (1))

The above which follows from the union bound and the last part of Proposition51 combined with the definition of Q yields that for each ε gt 0 there exists b0

(independent of B) such that for all b ge b0

Qlowast (X isin B) le Q (X isin B) (1minus ε)

The lower bound follows similarly using the inclusion-exclusion principle and thesecond part of Proposition 51 2

Corollary 52 provides support for choosing Q as an importance sampling dis-tribution Of course we still have to verify that the corresponding algorithm is aFPRAS The importance sampling estimator induced by Q takes the form

(51) Lb =dP

dQ=

sumMj=1 P (Xj gt b)

sumMj=1 1 (Xj gt b)

Note that under Q we have that Xlowast gt b almost surely so the denominator in theexpression for Lb is at least 1 Therefore we have that

EQL2b le

Msum

j=1

P (Xj gt b)

2

and by virtue of Proposition 51 we conclude (using V arQ to denote the varianceunder Q) that

V arQ (Lb)P (Xlowast gt b)2

rarr 0

12 ADLER BLANCHET LIU

as b rarrinfin In particular it follows that Lb is strongly efficientOur proposed algorithm can now be summarized as follows

Algorithm 53 There are two steps in the algorithm

Step 1 Simulate n iid copies X(1) X(n) of X from the distribution QStep 2 Compute and output

Ln =1n

nsum

i=1

L(i)b

where L(i)b =

sumMj=1 P (X(i)

j gt b)sumM

j=1 1(X(i)j gt b)

Since the generation of Li under Q takes O(M3

)function evaluations we con-

clude based on the analysis given in Section 2 that Algorithm 53 is a FPRASwith q = 0 This implies Theorem 32 as promised

6 A FPRAS for Holder Continuous Gaussian Fields In this sectionwe shall describe the algorithm and the analysis behind Theorem 31 Throughoutthe section unless stated otherwise we assume conditions A1ndashA4 of Section 3

There are two issues related to the complexity analysis Firstly since f is assumedcontinuous the entire field cannot be generated in a (discrete) computer and so thealgorithm used in the discrete case needs adaptation Once this is done we need tocarry out an appropriate variance analysis

Developing an estimator directly applicable to the continuous field will be car-ried out in Section 61 This construction will not only be useful when studyingthe performance of a suitable discretization but will also help to explain some ofthe features of our discrete construction Then in Section 62 we introduce a dis-cretization approach and study the bias caused by the discretization In additionwe provide bounds on the variance of this discrete importance sampling estimator

61 A Continuous Estimator We start with a change of measure motivatedby the discrete case in Section 5 A natural approach is to consider an importancesampling strategy analogous to that of Algorithm 53 The continuous adaptationinvolves first sampling τb according to the probability measure

(61) P (τb isin middot) =E[m (Ab cap middot)]

E[m (Ab)]

where Ab = t isin T f (t) gt b The idea of introducing τb in the continuous settingis not necessarily to locate the point at which the maximum is achieved as was thesituation in the discrete case Rather τb will be used to find a random point whichhas a reasonable probability of being in the excursion set Ab (This probability willtend to be higher if f is non-homogenous) This relaxation will prove useful in theanalysis of the algorithm Note that τb with the distribution indicated in equation(61) has a density function (with respect to Lebesgue measure) given by

hb(t) =P (f(t) gt b)E[m(Ab)]

MONTE CARLO FOR RANDOM FIELDS 13

and that we also can write

E[m (Ab)] = E

int

T

1 (f (t) gt b) dt =int

T

P (f (t) gt b) dt = m (T ) P (f (U) gt b)

where U is uniformly distributed over T Once τb is generated the natural continuous adaptation corresponding to the

strategy described by Algorithm 53 proceeds by sampling f conditional on f(τb) gtb Note that if we use Q to denote the change-of-measure induced by such a con-tinuous sampling strategy then the corresponding importance sampling estimatortakes the form

Lb =dP

dQ=

E[m (Ab)]m (Ab)

The second moment of the estimator then satisfies

EQ[(Lb)2] = E(LbAb 6= empty

)= E[m(Ab)]P (flowast gt b)E[m(Ab)minus1|Ab 6= empty](62)

Unfortunately it is easy to construct examples for which EQ[(Lb

)2] is infiniteConsequently the above construction needs to be modified slightly

Elementary extreme value theory considerations for the maximum of finitelymany Gaussian random variables and their analogues for continuous fields givethat the overshoot of f over a given level b will be of order Θ(1b) Thus in orderto keep τb reasonably close to the excursion set we shall also consider the possibilityof an undershoot of size Θ (1b) right at τb As we shall see this relaxation will allowus to prevent the variance in (62) becoming infinite Thus instead of (61) we shallconsider τbminusab with density

(63) hbminusab(t) =P (f(t) gt bminus ab)

E[m(Abminusab)]

for some a gt 0 To ease on later notation write

γab bminus ab τγab= τbminusab

Let Qprime be the change of measure induced by sampling f as follows Given τγab

sample f(τγab) conditional on f(τγab

) gt γab In turn the rest of f follows its con-ditional distribution (under the nominal or original measure) given the observedvalue f(τγab

) We then have that the corresponding Radon-Nikodym derivative is

(64)dP

dQprime =E[m(Aγab

)]m(Aγab

)

and the importance sampling estimator Lprimeb is

(65) Lprimeb =dP

dQprime 1 (Ab 6= empty) =E[m(Aγab

)]m(Aγab

)1 (m(Ab) gt 0)

Note that we have used the continuity of the field in order to write m (Ab) gt 0 =Ab 6= empty almost surely The motivation behind this choice lies in the fact that

14 ADLER BLANCHET LIU

since m(Aγab) gt m (Ab) gt 0 the denominator may now be big enough to control

the second moment of the estimator As we shall see introducing the undershootof size ab will be very useful in the technical development both in the remainderof this section and in Section 7 In addition its introduction also provides insightinto the appropriate form of the estimator needed when discretizing the field

62 Algorithm and Analysis We still need to face the problem of generatingf in a computer Thus we now concentrate on a suitable discretization schemestill having in mind the change of measure leading to (64) Since our interest isto ultimately design algorithms that are efficient for estimating expectations suchas E[Γ (f) |flowast gt b] where Γ may be a functional of the whole field we shall use aglobal discretization scheme

Consider U = (U1 UM ) where Ui are iid uniform random variables takingvalues in T and independent of the field f Set TM = U1 UM and Xi =Xi (Ui) = f (Ui) for 1 le i le M Then X = (X1 Xm) (conditional on U) isa multivariate Gaussian random vector with conditional means micro(Ui)

∆= E(Xi|Ui)and covariances C (Ui Uj)

∆= Cov (Xi Xj |Ui Uj) Our strategy is to approximatew(b) by

wM (γab) = P (maxtisinTM

f(t) gt γab) = E[P ( max1leileM

Xi gt γab|U)]

Given the development in Section 5 it might not be surprising that if we can ensurethat M = M (ε b) is polynomial in 1ε and b then we shall be in a good positionto develop a FPRAS The idea is to apply an importance sampling strategy similarto that we considered in the construction of Lprimeb of (65) but this time it will beconditional on U In view of our earlier discussions we propose sampling from Qprimeprime

defined via

Qprimeprime (X isin B|U) =Msum

i=1

pU (i)P [X isin B|Xi gt γab U ]

where

pU (i) =P (Xi gt γab|U)sumM

j=1 P (Xj gt γab|U)

We then obtain the (conditional) importance sampling estimator

(66) Lb(U) =sumM

i=1 P (Xi gt γab|U)sumMi=1 1(Xi gt γab)

It is clear thatwM (γab) = EQprimeprime [Lb(U)]

Suppose for the moment that M a and the number of replications n have beenchosen Our future analysis will in particular guide the selection of these parametersThen the procedure is summarized by the next algorithm

MONTE CARLO FOR RANDOM FIELDS 15

Algorithm 61 The algorithm has three steps

Step 1 Simulate U (1) U (n) which are n iid copies of the vector U = (U1 UM )described aboveStep 2 Conditional on each U (i) for i = 1 n generate L

(i)b (U (i)) as described

by (66) by considering the distribution of X(i)(U (i)) = (X(i)1 (U (i)

1 ) X(i)M (U (i)

M ))Generate the X(i)(U (i)) independently so that at the end we obtain that the L

(i)b (U (i))

are n iid copies of Lb(U)Step 3 Output

Ln(U (1) U (n)) =1n

nsum

i=1

L(i)b (U (i))

63 Running Time of Algorithm 61 Bias and Variance Control The remain-der of this section is devoted to the analysis of the running time of the Algorithm61 The first step lies in estimating the bias and second moment of Lb(U) underthe change of measure induced by the sampling strategy of the algorithm whichwe denote by Qprimeprime We start with a simple bound for the second moment

Proposition 62 There exists a finite λ0 depending on microT = maxtisinT |micro (t)|and σ2

T = maxtisinT σ2 (t) for which

EQprimeprime [Lb(U)2] le λ0M2P

(maxtisinT

f (t) gt b

)2

Proof Observe that

EQprimeprime [Lb(U)2] le E

(Msum

i=1

P (Xi gt γab|Ui)

)2

le E

(Msum

i=1

suptisinT

P (f (Ui) gt γab|Ui = t)

)2

= M2 maxtisinT

P (f (t) gt γab)2

le λ0M2 max

tisinTP (f (t) gt b)2

le λ0M2P

(maxtisinT

f (t) gt b

)2

which completes the proof 2

Next we obtain a preliminary estimate of the bias

16 ADLER BLANCHET LIU

Proposition 63 For each M ge 1 we have

|w (b)minus wM (b)| le E[exp

(minusMm(Aγab

)m(T )

) Ab cap T 6= empty]

+ P

(maxtisinT

f (t) gt γab maxtisinT

f (t) le b

)

Proof Note that

|w (b)minus wM (γab)|le P (max

tisinTM

f (t) le γab maxtisinT

f (t) gt b) + P (maxtisinTM

f (t) gt γab maxtisinT

f (t) le b)

The second term is easily bounded by

P (maxtisinTM

f (t) gt γabmaxtisinT

f (t) le b) le P (maxtisinT

f (t) gt γab maxtisinT

f (t) le b)

The first term can be bounded as follows

P

(maxtisinTM

f (t) le γabmaxtisinT

f (t) gt b

)

le E[(P [f (Ui) le γab| f ])M 1(Ab cap T 6= empty)

]

le E[(

1minusm(Aγab)m (T )

)M Ab cap T 6= empty)]

le E[exp

(minusMm(Aγab)m (T )

) Ab cap T 6= empty)

]

This completes the proof 2

The previous proposition shows that controlling the relative bias of Lb(U) re-quires finding bounds for

(67) E[exp(minusMm(Aγab

)m(T )) Ab cap T 6= empty]

and

(68) P (maxtisinT

f (t) gt γabmaxtisinT

f (t) le b)

and so we develop these To bound (67) we take advantage of the importancesampling strategy based on Qprime introduced earlier in (64) Write

E[exp

(minusMm(Aγab)m (T )

) m(Ab) gt 0

](69)

= EQprime(

exp(minusMm(Aγab

)m (T ))

m(Aγab)

m(Ab) gt 0

)Em

(Aγab

)

MONTE CARLO FOR RANDOM FIELDS 17

Furthermore note that for each α gt 0 we have

EQprime(

exp(minusMm

(Aγab

)m (T )

)

m(Aγab

) m(Ab) gt 0

)(610)

le αminus1 exp (minusMαm(T ))

+ EQprime(

exp(minusMm(Aγab

)m(T ))

m(Aγab

) m(Aγab) le αm(Ab) gt 0

)

The next result whose proof is given in Section 64 gives a bound for the aboveexpectation

Proposition 64 Let β be as in Conditions A2 and A3 For any v gt 0 thereexist constants κ λ2 isin (0infin) (independent of a isin (0 1) and b but dependent on v)such that if we select

αminus1 ge κdβ(ba)(2+v)2dβ

and define W such that P (W gt x) = exp(minusxβd

)for x ge 0 then

EQprime(

exp(minusMm(Aγab

)m (T ))

m(Aγab)

m(Aγab) le αm(Ab) gt 0

)(611)

le EW 2 m (T )

λ2dβ2 M

(b

a

)4dβ

The following result gives us a useful upper bound on (68) The proof is givenin Section 65

Proposition 65 Assume that Conditions A2 and A3 are in force For anyv gt 0 let ρ = 2dβ + dv + 1 where d is the dimension of T There exist constantsb0 λ isin (0infin) (independent of a but depending on microT = maxtisinT |micro (t)| σ2

T =maxtisinT σ2 (t) v the Holder parameters β and κH) so that for all b ge b0 ge 1 wehave

(612) P

(maxtisinT

f(t) le b + ab | maxtisinT

f(t) gt b

)le λabρ

Consequently

P

(maxtisinT

f (t) gt γab maxtisinT

f (t) le b

)

= P

(maxtisinT

f (t) le b|maxtisinT

f (t) gt γab

)P

(maxtisinT

f(t) gt γab

)

le λabρP

(maxtisinT

f(t) gt γab

)

18 ADLER BLANCHET LIU

Moreover

P

(maxtisinT

f(t) gt γab

)(1minus λabρ) le P

(maxtisinT

f(t) gt b

)

Propositions 64 and 65 allow us to prove Theorem 31 which is rephrased inthe form of the following theorem which contains the detailed rate of complexityand so the main result of this section

Theorem 66 Suppose f is a Gaussian random field satisfying conditions A1ndashA4 in Section 3 Given any v gt 0 put a = ε(4λbρ) (where λ and ρ as in Proposition65) and αminus1 = κdβ(ba)(2+v)dβ Then there exist c ε0 gt 0 such that for allε le ε0

(613) |w (b)minus wM (γab)| le w (b) ε

if M =lceilcεminus1(ba)(4+4v)dβ

rceil Consequently by our discussion in Section 2 and the

bound on the second moment given in Proposition 62 it follows that Algorithm 61provides a FPRAS with running time O

((M)3 times (M)2 times εminus2δminus1

)

Proof Combining (63) (69) and (610) with Propositions 63ndash65 we have that

|w (b)minus wM (b) |le αminus1 exp (minusMαm (T )) E

[m

(Aγab

)]

+E[W 2]m (T )

λ2dβ2 M

(b

a

)4dβ

E[m

(Aγab

)]+

(λabρ

1minus λabρ

)w (b)

Furthermore there exists a constant K isin (0infin) such that

Em(Aγab

) le K maxtisinT

P (f (t) gt b)m (T ) le Kw (b) m (T )

Therefore we have that

|w (b)minus wM (b) |w (b)

le αminus1Km(T ) exp (minusMαm (T ))

+E[W 2]Km (T )2

λ2dβ2 M

(b

a

)4dβ

+(

λabρ

1minus λabρ

)

Moreover since a = ε(4λbρ) we obtain that for ε le 12

|w (b)minus wN (b) |w (b)

le αminus1Km(T ) exp (minusMαm (T ))

+[EW 2]Km (T )2

λ2dβ2 M

(b

a

)4dβ

+ ε2

MONTE CARLO FOR RANDOM FIELDS 19

From the selection of α M and θ it follows easily that the first two terms on theright hand side of the previous display can be made less than ε2 for all ε le ε0 bytaking ε0 sufficiently small

The complexity count given in the theorem now corresponds to the followingestimates The factor O((M)3) represents the cost of a Cholesky factorization re-quired to generate a single replication of a finite field of dimension M In additionthe second part of Proposition 62 gives us that O(M2εminus2δminus1) replications are re-quired to control the relative variance of the estimator 2

We now proceed to prove Propositions 64 and 65

64 Proof of Proposition 64 We concentrate on the analysis of the left handside of (611) An important observation is that conditional on the random variableτγab

with distribution

Qprime(τγabisin middot) =

E[m(Aγab

cap middot)]

E[m(Aγab

)]

and given f(τγab) the rest of the field namely (f (t) t isin Tτγab

) is anotherGaussian field with a computable mean and covariance structure The second termin (610) indicates that we must estimate the probability that m(Aγab

) takes smallvalues under Qprime For this purpose we shall develop an upper bound for

(614) P(m(Aγab

) lt yminus1 m(Ab) gt 0∣∣ f (t) = γab + zγab

)

for y large enough Our arguments proceeds in two steps For the first in order tostudy (614) we shall estimate the conditional mean covariance of f (s) s isin Tgiven that f (t) = γab + zγab Then we use the fact that the conditional field isalso Gaussian and take advantage of general results from the theory of Gaussianrandom fields to obtain a bound for (614) For this purpose we recall some usefulresults from the theory of Gaussian random fields The first result is due to Dudley[14]

Theorem 67 Let U be a compact subset of Rn and let f0(t) t isin U be amean zero continuous Gaussian random field Define the canonical metric d on Uas

d (s t) =radic

E[f0 (t)minus f0 (s)]2

and put diam (U) = supstisinU d (s t) which is assumed to be finite Then there existsa finite universal constant κ gt 0 such that

E[maxtisinU

f0 (t)] le κ

int diam(U)2

0

[log (N (ε))]12dε

where the entropy N (ε) is the smallest number of dminusballs of radius ε whose unioncovers U

20 ADLER BLANCHET LIU

The second general result that we shall need is the so-called B-TIS (Borel-Tsirelson-Ibragimov-Sudakov) inequality [4 11 26]

Theorem 68 Under the setting described in Theorem 67

P

(maxtisinU

f0 (t)minus E[maxtisinU

f0 (t)] ge b

)le exp

(minusb2(2σ2U )

)

whereσ2U = max

tisinUE[f2

0 (t)]

We can now proceed with the main proof We shall assume from now on thatτγab

= 0 since as will be obvious from what follows all estimates hold uniformlyover τγab

isin T This is a consequence of the uniform Holder assumptions A2 andA3 Define a new process f

(f (t) t isin T ) L= (f (t) t isin T | f (0) = γab + zγab)

Note that we can always write f (t) = micro (t)+g (t) where g is a mean zero Gaussianrandom field on T We have that

micro (t) = Ef (t) = micro (t) + σ (0)minus2C (0 t) (γab + zγab minus micro (0))

and that the covariance function of f is given by

Cg (s t) = Cov (g (s) g (t)) = C (s t)minus σ (0)minus2C (0 s)C (0 t)

The following lemma describes the behavior of micro (t) and Cg (t s)

Lemma 69 Assume that |s| and |t| small enough Then the following threeconclusions hold

(i) There exist constants λ0 and λ1 gt 0 such that

|micro (t)minus (γab + zγab)| le λ0 |t|β + λ1 |t|β (γab + zγab)

and for all z isin (0 1) and γab large enough

|micro (s)minus micro (t) | le κHγab|sminus t|β

(ii)Cg (s t) le 2κHσ (t)σ (s) |t|β + |s|β + |tminus s|β

(iii)

Dg (s t) =radic

E([g (t)minus g (s)]2) le λ121 |tminus s|β2

Proof All three consequences follow from simple algebraic manipulations Thedetails are omitted 2

MONTE CARLO FOR RANDOM FIELDS 21

Proposition 610 For any v gt 0 there exist κ and λ2 such that for all t isin T yminusβd le a2+v

κb(2+v) a sufficiently small and z gt 0

P(m(Aγab

)minus1 gt y m(Ab) gt 0∣∣ f (t) = γab + zγab

) le exp(minusλ2a

2yβdb2)

Proof. For notational simplicity, and without loss of generality, we assume that $t = 0$. First consider the case $z \ge 1$. Then there exist $c_1, c_2$ such that, for all $c_2 y^{-\beta/d} < b^{-2-v}$ and $z > 1$,
\begin{align*}
P\big(m(A_{\gamma_{a,b}} \cap T)^{-1} > y,\; m(A_b) > 0 \,\big|\, f(0) = \gamma_{a,b} + z/\gamma_{a,b}\big)
&\le P\Big(\inf_{|t| < c_1 y^{-1/d}} f(t) \le \gamma_{a,b} \,\Big|\, f(0) = \gamma_{a,b} + z/\gamma_{a,b}\Big) \\
&= P\Big(\inf_{|t| < c_1 y^{-1/d}} \tilde\mu(t) + g(t) \le \gamma_{a,b}\Big) \\
&\le P\Big(\inf_{|t| < c_1 y^{-1/d}} g(t) \le -\frac{1}{2\gamma_{a,b}}\Big).
\end{align*}
Now apply (iii) from Lemma 6.9, from which it follows that $N(\varepsilon) \le c_3\, m(T)/\varepsilon^{2d/\beta}$ for some constant $c_3$. By Theorem 6.7, $E(\sup_{|t| < c_1 y^{-1/d}} f(t)) = O(y^{-\beta/(2d)}\log y)$. By Theorem 6.8, for some constant $c_4$,
\begin{align*}
P\big(m(A_{\gamma_{a,b}} \cap T)^{-1} > y,\; m(A_b) > 0 \,\big|\, f(0) = \gamma_{a,b} + z/\gamma_{a,b}\big)
&\le P\Big(\inf_{|t| < c_1 y^{-1/d}} g(t) \le -\frac{1}{2\gamma_{a,b}}\Big) \\
&\le \exp\Big(-\frac{1}{c_4\,\gamma_{a,b}^2\, y^{-\beta/d}}\Big)
\end{align*}
for $c_2 y^{-\beta/d} < b^{-2-v}$ and $z > 1$.

Now consider the case $z \in (0,1)$. Let $t^*$ be the global maximum of $f(t)$. Then
\begin{align*}
P\big(m(A_{\gamma_{a,b}} \cap T)^{-1} > y,\; m(A_b) > 0 \,\big|\, f(0) = \gamma_{a,b} + z/\gamma_{a,b}\big)
&\le P\Big(\inf_{|t-t^*| < c_1 y^{-1/d}} f(t) < \gamma_{a,b},\; f(t^*) > b \,\Big|\, f(0) = \gamma_{a,b} + z/\gamma_{a,b}\Big) \\
&\le P\Big(\sup_{|s-t| < c_1 y^{-1/d}} |f(s) - f(t)| > a/b \,\Big|\, f(0) = \gamma_{a,b} + z/\gamma_{a,b}\Big).
\end{align*}
Consider the new field $\xi(s,t) = g(s) - g(t)$, with parameter space $T \times T$. Note that
\[ \sqrt{\mathrm{Var}(\xi(s,t))} = D_g(s,t) \le \lambda_1^{1/2}|s-t|^{\beta/2}. \]
Via basic algebra, it is not hard to show that the entropy of $\xi(s,t)$ is bounded by $N_\xi(\varepsilon) \le c_3\, m(T\times T)/\varepsilon^{2d/\beta}$. In addition, from (i) of Lemma 6.9 we have
\[ |\tilde\mu(s) - \tilde\mu(t)| \le \kappa_H\, \gamma_{a,b}\, |s-t|^\beta. \]

Similarly, for some $\kappa > 0$ and all $y^{-\beta/d} \le \frac{1}{\kappa}\big(\frac{a}{b}\big)^{2+v}$, $a < 1$, there exists $c_5$ such that
\begin{align*}
P\big(m(A_{\gamma_{a,b}} \cap T)^{-1} > y,\; m(A_b) > 0 \,\big|\, f(0) = \gamma_{a,b} + z/\gamma_{a,b}\big)
&\le P\Big(\sup_{|s-t| < c_1 y^{-1/d}} |f(s) - f(t)| > a/b \,\Big|\, f(0) = \gamma_{a,b} + z/\gamma_{a,b}\Big) \\
&\le P\Big(\sup_{|s-t| < c_1 y^{-1/d}} |\xi(s,t)| > \frac{a}{2b}\Big) \\
&\le \exp\Big(-\frac{a^2}{c_5\, b^2\, y^{-\beta/d}}\Big).
\end{align*}

Combining the two cases $z > 1$ and $z \in (0,1)$, and choosing $c_5$ large enough, we have
\[ P\big(m(A_{\gamma_{a,b}} \cap T)^{-1} > y,\; m(A_b) > 0 \,\big|\, f(0) = \gamma_{a,b} + z/\gamma_{a,b}\big) \le \exp\Big(-\frac{a^2}{c_5\, b^2\, y^{-\beta/d}}\Big) \]
for $a$ small enough and $y^{-\beta/d} \le \frac{1}{\kappa}\big(\frac{a}{b}\big)^{2+v}$. Renaming the constants completes the proof. □

The final ingredient needed for the proof of Proposition 6.4 is the following lemma involving stochastic domination. The proof follows an elementary argument and is therefore omitted.

Lemma 6.11. Let $v_1$ and $v_2$ be finite measures on $\mathbb{R}$, and define $\eta_j(x) = \int_x^\infty v_j(ds)$. Suppose that $\eta_1(x) \ge \eta_2(x)$ for each $x \ge x_0$. Let $(h(x),\, x \ge x_0)$ be a non-decreasing, positive, and bounded function. Then
\[ \int_{x_0}^\infty h(s)\, v_1(ds) \ge \int_{x_0}^\infty h(s)\, v_2(ds). \]
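Although the proof is omitted in the original, the elementary argument can be sketched in one line (our addition, modulo boundary conventions): writing $h(s) = h(x_0) + \int_{(x_0,s]} dh(u)$ and applying Fubini,
\[ \int_{x_0}^\infty h(s)\, v_j(ds) = h(x_0)\,\eta_j(x_0) + \int_{x_0}^\infty \eta_j(u)\, dh(u), \]
and the claim follows since $\eta_1 \ge \eta_2$ pointwise, $h(x_0) \ge 0$, and $dh \ge 0$.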

Proof of Proposition 6.4. Note that
\begin{align*}
& E^{Q'}\bigg(\frac{\exp\big(-M m(A_{\gamma_{a,b}})/m(T)\big)}{m(A_{\gamma_{a,b}})};\; m(A_{\gamma_{a,b}}) \in (0,\alpha),\; m(A_b) > 0\bigg) \\
&\quad = \int_T \int_{z=0}^\infty E\bigg(\frac{\exp\big(-M m(A_{\gamma_{a,b}})/m(T)\big)}{m(A_{\gamma_{a,b}})};\; m(A_{\gamma_{a,b}}) \in (0,\alpha),\; m(A_b) > 0 \,\Big|\, f(t) = \gamma_{a,b} + \frac{z}{\gamma_{a,b}}\bigg) \\
&\qquad\qquad \times P(\tau_{\gamma_{a,b}} \in dt)\, P\big(\gamma_{a,b}[f(t) - \gamma_{a,b}] \in dz \,\big|\, f(t) > \gamma_{a,b}\big) \\
&\quad \le \sup_{z>0,\, t\in T} E\bigg(\frac{\exp\big(-M m(A_{\gamma_{a,b}})/m(T)\big)}{m(A_{\gamma_{a,b}})};\; m(A_{\gamma_{a,b}}) \in (0,\alpha),\; m(A_b) > 0 \,\Big|\, f(t) = \gamma_{a,b} + \frac{z}{\gamma_{a,b}}\bigg).
\end{align*}

Now define $Y(b/a) = (b/a)^{2d/\beta}\lambda_2^{-d/\beta} W$, with $\lambda_2$ as chosen in Proposition 6.10 and $W$ with distribution given by $P(W > x) = \exp(-x^{\beta/d})$. By Lemma 6.11,
\[ \sup_{t\in T} E\bigg(\frac{\exp\big(-M m(A_{\gamma_{a,b}})/m(T)\big)}{m(A_{\gamma_{a,b}})};\; m(A_{\gamma_{a,b}}) \in (0,\alpha),\; m(A_b) > 0 \,\Big|\, f(t) = \gamma_{a,b} + \frac{z}{\gamma_{a,b}}\bigg) \le E\Big[Y(b/a)\exp\big(-M Y(b/a)^{-1}/m(T)\big);\; Y(b/a) > \alpha^{-1}\Big]. \]

Now let $Z$ be exponentially distributed with mean 1 and independent of $Y(b/a)$. Then we have (using the definition of the tail distribution of $Z$ and Chebyshev's inequality)
\[ \exp\big(-M Y(b/a)^{-1}/m(T)\big) = P\big(Z > M Y(b/a)^{-1}/m(T) \,\big|\, Y(b/a)\big) \le \frac{m(T)\, Y(b/a)}{M}. \]
Therefore,
\[ E\Big[\exp\big(-M Y(b/a)^{-1}/m(T)\big)\, Y(b/a);\; Y(b/a) \ge \alpha^{-1}\Big] \le \frac{m(T)}{M}\, E\big[Y(b/a)^2;\; Y(b/a) \ge \alpha^{-1}\big] \le \frac{m(T)}{M\,\lambda_2^{2d/\beta}}\Big(\frac{b}{a}\Big)^{4d/\beta} E(W^2), \]
which completes the proof. □

6.5. Proof of Proposition 6.5. We start with the following result of Tsirelson [27].

Theorem 6.12. Let $f$ be a continuous, separable Gaussian process on a compact (in the canonical metric) domain $T$. Suppose that $\mathrm{Var}(f(t)) = \sigma^2(t)$ is continuous and that $\sigma(t) > 0$ for $t \in T$. Moreover, assume that $\mu = Ef$ is also continuous and $\mu(t) \ge 0$ for all $t \in T$. Define
\[ \sigma_T^2 \stackrel{\Delta}{=} \max_{t\in T} \mathrm{Var}(f(t)) \]
and set $F(x) = P\{\max_{t\in T} f(t) \le x\}$. Then $F$ is continuously differentiable on $\mathbb{R}$. Furthermore, let $y$ be such that $F(y) > 1/2$, and define $y_*$ by
\[ F(y) = \Phi(y_*). \]
Then, for all $x > y$,
\[ F'(x) \le \Psi\Big(\frac{x y_*}{y}\Big)\Big(\frac{x y_*}{y}(1 + 2\alpha) + 1\Big)\frac{y_*}{y}(1 + \alpha), \]
where $\Psi = 1 - \Phi$ and
\[ \alpha = \frac{y^2}{x(x-y)\, y_*^2}. \]

We can now prove the following lemma.

Lemma 6.13. There exists a constant $A \in (0,\infty)$, independent of $a$ and $b \ge 0$, such that

(6.15)
\[ P\Big(\sup_{t\in T} f(t) \le b + a/b \,\Big|\, \sup_{t\in T} f(t) > b\Big) \le aA\, \frac{P(\sup_{t\in T} f(t) \ge b - 1/b)}{P(\sup_{t\in T} f(t) \ge b)}. \]

Proof. By subtracting $\inf_{t\in T}\mu(t) > -\infty$ and redefining the level $b$ to be $b - \inf_{t\in T}\mu(t)$, we may simply assume that $Ef(t) \ge 0$, so that we can apply Theorem 6.12. Adopting the notation of Theorem 6.12, first we pick $b_0$ large enough so that $F(b_0) > 1/2$, and assume that $b \ge b_0 + 1$. Now let $y = b - 1/b$ and $F(y) = \Phi(y_*)$. Note that there exists $\delta_0 \in (0,\infty)$ such that $\delta_0 b \le y_* \le \delta_0^{-1} b$ for all $b \ge b_0$. This follows easily from the fact that
\[ \log P\Big\{\sup_{t\in T} f(t) > x\Big\} \sim \log\sup_{t\in T} P\{f(t) > x\} \sim -\frac{x^2}{2\sigma_T^2}. \]

On the other hand, by Theorem 6.12, $F$ is continuously differentiable, and so

(6.16)
\[ P\Big\{\sup_{t\in T} f(t) < b + a/b \,\Big|\, \sup_{t\in T} f(t) > b\Big\} = \frac{\int_b^{b+a/b} F'(x)\, dx}{P\{\sup_{t\in T} f(t) > b\}}. \]

Moreover,
\begin{align*}
F'(x) &\le \Big(1 - \Phi\Big(\frac{x y_*}{y}\Big)\Big)\Big(\frac{x y_*}{y}(1 + 2\alpha(x)) + 1\Big)\frac{y_*}{y}(1 + \alpha(x)) \\
&\le (1 - \Phi(y_*))\Big(\frac{x y_*}{y}(1 + 2\alpha(x)) + 1\Big)\frac{y_*}{y}(1 + \alpha(x)) \\
&= P\Big(\max_{t\in T} f(t) > b - 1/b\Big)\Big(\frac{x y_*}{y}(1 + 2\alpha(x)) + 1\Big)\frac{y_*}{y}(1 + \alpha(x)).
\end{align*}

Therefore,
\[ \int_b^{b+a/b} F'(x)\, dx \le P\Big(\sup_{t\in T} f(t) > b - 1/b\Big)\int_b^{b+a/b}\Big(\frac{x y_*}{y}(1 + 2\alpha(x)) + 1\Big)\frac{y_*}{y}(1 + \alpha(x))\, dx. \]

Recalling that $\alpha(x) = y^2/[x(x-y)y_*^2]$, we can use the fact that $y_* \ge \delta_0 b$ to conclude that if $x \in [b, b + a/b]$ then $\alpha(x) \le \delta_0^{-2}$, and therefore
\[ \int_b^{b+a/b}\Big(\frac{x y_*}{y}(1 + 2\alpha(x)) + 1\Big)\frac{y_*}{y}(1 + \alpha(x))\, dx \le 4\delta_0^{-8} a. \]

We thus obtain that

(6.17)
\[ \frac{\int_b^{b+a/b} F'(x)\, dx}{P\{\sup_{t\in T} f(t) > b\}} \le 4a\delta_0^{-8}\, \frac{P\{\sup_{t\in T} f(t) > b - 1/b\}}{P\{\sup_{t\in T} f(t) > b\}} \]

for any $b \ge b_0$. This inequality, together with the fact that $F$ is continuously differentiable on $(-\infty,\infty)$, yields the proof of the lemma for $b \ge 0$. □

The previous result translates a question that involves the conditional distribution of $\max_{t\in T} f(t)$ near $b$ into a question involving the tail distribution of $\max_{t\in T} f(t)$. The next result then provides a bound on this tail distribution.

Lemma 6.14. For each $v > 0$ there exists a constant $C(v) \in (0,\infty)$ (possibly depending on $v > 0$, but otherwise independent of $b$) such that
\[ P\Big(\max_{t\in T} f(t) > b\Big) \le C(v)\, b^{2d/\beta + dv + 1}\max_{t\in T} P(f(t) > b) \]
for all $b \ge 1$.

Proof. The proof of this result follows along the same lines as that of Theorem 2.6.2 in [5]. Consider an open cover $T = \cup_{i=1}^M T_i(\theta)$, where $T_i(\theta) = \{s : |s - t_i| < \theta\}$. We choose the $t_i$ carefully, so that $N(\theta) = O(\theta^{-d})$ for $\theta$ arbitrarily small. Write $f(t) = g(t) + \mu(t)$, where $g(t)$ is a centered Gaussian random field, and note, using A2 and A3, that
\[ P\Big(\max_{t\in T_i(\theta)} f(t) > b\Big) \le P\Big(\max_{t\in T_i(\theta)} g(t) > b - \mu(t_i) - \kappa_H\theta^\beta\Big). \]

Now we wish to apply the Borell–TIS inequality (Theorem 6.8) with $U = T_i(\theta)$, $f_0 = g$, and $d(s,t) = E^{1/2}([g(t) - g(s)]^2)$, which, as a consequence of A2 and A3, is bounded above by $C_0|t-s|^{\beta/2}$ for some $C_0 \in (0,\infty)$. Thus, applying Theorem 6.7, we have that $E\max_{t\in T_i(\theta)} g(t) \le C_1\theta^{\beta/2}\log(1/\theta)$ for some $C_1 \in (0,\infty)$. Consequently, the Borell–TIS inequality yields that there exists $C_2(v) \in (0,\infty)$ such that, for all $b$ sufficiently large and $\theta$ sufficiently small,
\[ P\Big(\max_{t\in T_i(\theta)} g(t) > b - \mu(t_i) - \kappa_H\theta^\beta\Big) \le C_2(v)\exp\bigg(-\frac{\big(b - \mu(t_i) - C_1\theta^{\beta/(2+\beta v)}\big)^2}{2\sigma_{T_i}^2}\bigg), \]
where $\sigma_{T_i} = \max_{t\in T_i(\theta)}\sigma(t)$.

Now select $v > 0$ small enough and set $\theta^{\beta/(2+\beta v)} = b^{-1}$. Straightforward calculations yield that
\[ P\Big(\max_{t\in T_i(\theta)} f(t) > b\Big) \le P\Big(\max_{t\in T_i(\theta)} g(t) > b - \mu(t_i) - \kappa_H\theta^\beta\Big) \le C_3(v)\max_{t\in T_i(\theta)}\exp\bigg(-\frac{(b - \mu(t))^2}{2\sigma(t)^2}\bigg)
\]

for some $C_3(v) \in (0,\infty)$. Now recall the well known inequality (valid for $x > 0$)
\[ \phi(x)\Big(\frac{1}{x} - \frac{1}{x^3}\Big) \le 1 - \Phi(x) \le \frac{\phi(x)}{x}, \]
where $\phi = \Phi'$ is the standard Gaussian density. Using this inequality, it follows that $C_4(v) \in (0,\infty)$ can be chosen so that
\[ \max_{t\in T_i(\theta)}\exp\bigg(-\frac{(b - \mu(t))^2}{2\sigma(t)^2}\bigg) \le C_4(v)\, b\max_{t\in T_i} P(f(t) > b)
\]

for all $b \ge 1$. We then conclude that there exists $C(v) \in (0,\infty)$ such that
\[ P\Big(\max_{t\in T} f(t) > b\Big) \le N(\theta)\, C_4(v)\, b\max_{t\in T} P(f(t) > b) \le C\theta^{-d} b\max_{t\in T} P(f(t) > b) = C b^{2d/\beta + dv + 1}\max_{t\in T} P(f(t) > b), \]
giving the result. □

We can now complete the proof of Proposition 6.5.

Proof of Proposition 6.5. The result is a straightforward corollary of the previous two lemmas. By (6.15) in Lemma 6.13 and Lemma 6.14, there exists $\lambda \in (0,\infty)$ for which
\begin{align*}
P\Big(\max_{t\in T} f(t) \le b + a/b \,\Big|\, \max_{t\in T} f(t) > b\Big)
&\le aA\, \frac{P(\max_{t\in T} f(t) \ge b - 1/b)}{P(\max_{t\in T} f(t) \ge b)} \\
&\le aCA\, b^{2d/\beta + dv + 1}\, \frac{\max_{t\in T} P(f(t) > b - 1/b)}{P(\max_{t\in T} f(t) \ge b)} \\
&\le aCA\, b^{2d/\beta + dv + 1}\, \frac{\max_{t\in T} P(f(t) > b - 1/b)}{\max_{t\in T} P(f(t) > b)} \\
&\le a\lambda\, b^{2d/\beta + dv + 1}.
\end{align*}
The last two inequalities follow from the obvious bound
\[ P\Big(\max_{t\in T} f(t) \ge b\Big) \ge \max_{t\in T} P(f(t) > b) \]
and standard properties of the Gaussian distribution. This yields (6.12), from which the remainder of the proposition follows. □

7. Fine Tuning: Twice Differentiable Homogeneous Fields. In the preceding section we constructed a polynomial time algorithm based on a randomized discretization scheme. Our goal in this section is to illustrate how to take advantage of additional information to further improve the running time and the efficiency of the algorithm. In order to illustrate our techniques, we shall perform a more refined analysis in the setting of smooth and homogeneous fields, and shall establish optimality of the algorithm in a precise sense to be described below. Our assumptions throughout this section are B1–B2 of Section 3.

Let $C(s-t) = \mathrm{Cov}(f(s), f(t))$ be the covariance function of $f$, which we assume also has mean zero. Note that it is an immediate consequence of homogeneity and differentiability that $\partial_i C(0) = \partial^3_{ijk} C(0) = 0$.

We shall need the following definition.

Definition 7.1. We call $\mathcal{T} = \{t_1, \ldots, t_M\} \subset T$ a $\theta$-regular discretization of $T$ if and only if
\[ \min_{i\ne j} |t_i - t_j| \ge \theta, \qquad \sup_{t\in T}\min_i |t_i - t| \le 2\theta. \]

Regularity ensures that points in the grid $\mathcal{T}$ are well separated. Intuitively, since $f$ is smooth, having tight clusters of points translates to a waste of computing resources, as a result of sampling highly correlated values of $f$. Also note that every region containing a ball of radius $2\theta$ has at least one representative in $\mathcal{T}$. Therefore $\mathcal{T}$ covers the domain $T$ in an economical way. One technical convenience of $\theta$-regularity is that, for subsets $A \subseteq T$ that have positive Lebesgue measure (in particular, ellipsoids),
\[ \lim_{M\to\infty}\frac{\#(A\cap\mathcal{T})}{M} = \frac{m(A)}{m(T)}, \]
where here, and throughout the remainder of the section, $\#(A)$ denotes the cardinality of the set $A$.
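For concreteness, the following sketch (our illustration; the paper does not prescribe a specific construction) builds a $\theta$-regular discretization of $T = [0,1]^d$ from a centered cubic lattice. The spacing guarantees the separation condition, and the covering condition holds for moderate dimensions, as noted in the comments.

```python
import itertools
import numpy as np

def theta_regular_lattice(theta: float, d: int) -> np.ndarray:
    """Centered cubic lattice on T = [0,1]^d with spacing h = 1/K >= theta.

    Assumes theta <= 1/2.  Separation: distinct points are at distance
    >= h >= theta.  Covering: every t in T is within h*sqrt(d)/2 of the
    grid; since h <= 2*theta this is <= theta*sqrt(d) <= 2*theta for
    d <= 4, and ~ theta*sqrt(d)/2 <= 2*theta for d <= 16 once theta is
    small, so the grid is theta-regular in the sense of Definition 7.1
    for moderate d.
    """
    K = int(np.floor(1.0 / theta))
    h = 1.0 / K
    axis = (np.arange(K) + 0.5) * h
    return np.array(list(itertools.product(axis, repeat=d)))

# Example: theta = eps/b with eps = 0.1 and b = 5 in dimension d = 2,
# giving M = Theta((b/eps)^d) points, as in Proposition 7.2 below.
grid = theta_regular_lattice(theta=0.1 / 5.0, d=2)
print(grid.shape)  # (2500, 2)
```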

Let $\mathcal{T} = \{t_1, \ldots, t_M\}$ be a $\theta$-regular discretization of $T$, and consider
\[ X = (X_1, \ldots, X_M)^\top \stackrel{\Delta}{=} (f(t_1), \ldots, f(t_M))^\top. \]
We shall concentrate on estimating $w_M(b) = P(\max_{1\le i\le M} X_i > b)$. The next result (which we prove in Section 7.1) shows that if $\theta = \varepsilon/b$ then the relative bias is $O(\sqrt{\varepsilon})$.

Proposition 7.2. Suppose $f$ is a Gaussian random field satisfying conditions B1 and B2. There exist $c_0$, $c_1$, $b_0$, and $\varepsilon_0$ such that, for any finite $\varepsilon/b$-regular discretization $\mathcal{T}$ of $T$,

(7.1)
\[ P\Big(\sup_{t\in\mathcal{T}} f(t) < b \,\Big|\, \sup_{t\in T} f(t) > b\Big) \le c_0\sqrt{\varepsilon} \qquad \text{and} \qquad \#(\mathcal{T}) \le c_1\, m(T)\,\varepsilon^{-d} b^d \]

for all $\varepsilon \in (0, \varepsilon_0]$ and $b > b_0$.

Note that the bound on the bias obtained for twice differentiable fields is much sharper than that for the general Hölder continuous fields given by (6.13) in Theorem 6.6. This is partly because the conditional distribution of the random field around local maxima is harder to describe in the Hölder continuous case than in the twice differentiable case. In addition to the sharper description of the bias, we shall also soon show, in Theorem 7.4, that our choice of discretization is optimal in a certain sense. Finally, we point out that the bound of $\sqrt{\varepsilon}$ in the first term of (7.1) is not optimal. In fact, there seems to be some room for improvement, and we believe that a more careful analysis might yield a bound of the form $c_0\varepsilon^2$.

We shall estimate $w_M(b)$ by using a slight variation of Algorithm 5.3. In particular, since the $X_i$'s are now identically distributed, we redefine $Q$ to be

ular since the Xirsquos are now identically distributed we redefine Q to be

(72) Q (X isin B) =Msum

i=1

1M

P [X isin B|Xi gt bminus 1b]

Our estimator then takes the form

(73) Lb =M times P (X1 gt bminus 1b)sumM

j=1 1 (Xj gt bminus 1b)1( max

1leileMXi gt b)

Clearly we have that EQ(Lb) = wM (b) (The reason for for subtracting the fcatorof 1b was explained in Section 61)
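For completeness, here is a one-line verification of this identity (our addition). Since the coordinates are identically distributed, the likelihood ratio of $P$ against the mixture (7.2) is
\[ \frac{dP}{dQ}(X) = \bigg(\frac{1}{M}\sum_{i=1}^M \frac{1(X_i > b - 1/b)}{P(X_i > b - 1/b)}\bigg)^{-1} = \frac{M\, P(X_1 > b - 1/b)}{\sum_{j=1}^M 1(X_j > b - 1/b)} \]
on the event $\{\sum_j 1(X_j > b - 1/b) \ge 1\}$, which contains $\{\max_{1\le i\le M} X_i > b\}$. Multiplying by $1(\max_i X_i > b)$ and taking expectations under $Q$ gives $E_Q(L_b) = P(\max_{1\le i\le M} X_i > b) = w_M(b)$.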

Algorithm 7.3. Given a number of replications $n$ and an $\varepsilon/b$-regular discretization $\mathcal{T}$, the algorithm is as follows:

Step 1. Sample $X^{(1)}, \ldots, X^{(n)}$, i.i.d. copies of $X$, with distribution $Q$ given by (7.2).
Step 2. Compute and output
\[ L_n = \frac{1}{n}\sum_{i=1}^n L_b^{(i)}, \qquad \text{where} \qquad L_b^{(i)} = \frac{M \times P(X_1 > b - 1/b)}{\sum_{j=1}^M 1(X_j^{(i)} > b - 1/b)}\, 1\Big(\max_{1\le j\le M} X_j^{(i)} > b\Big). \]
(A schematic implementation of both steps is sketched below.)
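The following Python sketch is our illustration of Algorithm 7.3, not code from the paper. It assumes a centered, unit-variance, homogeneous field with a squared-exponential covariance (an arbitrary choice), and samples from the mixture (7.2) exactly: first the distinguished index, then the truncated coordinate, then the remaining coordinates from their conditional Gaussian law. The function `theta_regular_lattice` from the earlier sketch, `corr_len`, and the jitter constant are illustrative assumptions.

```python
import numpy as np
from scipy.stats import norm, truncnorm

rng = np.random.default_rng(1)

def estimate_w(points: np.ndarray, b: float, n: int,
               corr_len: float = 0.3) -> float:
    """Monte Carlo sketch of Algorithm 7.3 for a centered, unit-variance,
    homogeneous field with (assumed) squared-exponential covariance."""
    M = len(points)
    sq = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    Sigma = np.exp(-sq / corr_len ** 2)      # covariance matrix of X
    c = b - 1.0 / b                          # shifted level b - 1/b
    p1 = norm.sf(c)                          # P(X_1 > b - 1/b)

    estimates = np.empty(n)
    for k in range(n):
        i = rng.integers(M)                  # mixture component of Q, eq. (7.2)
        xi = truncnorm.rvs(c, np.inf, random_state=rng)  # X_i | X_i > c
        # Remaining coordinates from the conditional Gaussian law given X_i
        # (unit variances, so the conditional mean is Sigma[:, i] * xi).
        mean = Sigma[:, i] * xi
        cov = Sigma - np.outer(Sigma[:, i], Sigma[i, :])
        L = np.linalg.cholesky(cov + 1e-8 * np.eye(M))   # O(M^3) per replication
        x = mean + L @ rng.standard_normal(M)
        x[i] = xi
        # Importance sampling weight times the indicator, eq. (7.3);
        # the denominator is >= 1 since x[i] > c by construction.
        estimates[k] = (M * p1 / (x > c).sum()) * (x.max() > b)
    return estimates.mean()

# Usage with the hypothetical lattice from the previous sketch (keep M
# modest here, since each replication costs O(M^3)):
# grid = theta_regular_lattice(theta=0.1, d=2)
# print(estimate_w(grid, b=3.0, n=500))
```

The one Cholesky factorization per replication is exactly the $O(M^3)$ term in the complexity count discussed after Theorem 7.5 below.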

Theorem 7.5 below guides the selection of $n$ in order to achieve a prescribed relative error. In particular, our analysis, together with considerations from Section 2, implies that choosing $n = O(\varepsilon^{-2}\delta^{-1})$ suffices to achieve $\varepsilon$ relative error with probability at least $1 - \delta$.

Algorithm 7.3 improves on Algorithm 6.1 for Hölder continuous fields in two important ways. The first aspect is that it is possible to obtain information on the size of the relative bias of the estimator. In Proposition 7.2 we saw that, in order to overcome bias due to discretization, it suffices to take a discretization of size $M = \#(\mathcal{T}) = \Theta(b^d)$. That this selection is also asymptotically optimal, in the sense described in the next result, will be proven in Section 7.1.

Theorem 7.4. Suppose $f$ is a Gaussian random field satisfying conditions B1 and B2. If $\theta \in (0,1)$, then, as $b \to \infty$,
\[ \sup_{\#(\mathcal{T}) \le b^{\theta d}} P\Big(\sup_{t\in\mathcal{T}} f(t) > b \,\Big|\, \sup_{t\in T} f(t) > b\Big) \to 0. \]

This result implies that the relative bias goes to 100% as $b \to \infty$ if one chooses a discretization scheme of size $O(b^{\theta d})$ with $\theta \in (0,1)$. Consequently, $d$ is the smallest power of $b$ that achieves any given bounded relative bias, and so the suggestion above of choosing $M = O(b^d)$ points for the discretization is, in this sense, optimal.

The second aspect of improvement involves the variance. In the case of Hölder continuous fields, the ratio of the second moment of the estimator to $w(b)^2$ was shown to be bounded by a quantity of order $O(M^2)$. In contrast, in the context of the smooth and homogeneous fields considered here, the next result shows that this ratio is bounded uniformly for $b > b_0$ and $M = \#(\mathcal{T}) \ge cb^d$. That is, the variance remains strongly controlled.

Theorem 7.5. Suppose $f$ is a Gaussian random field satisfying conditions B1 and B2. Then there exist constants $c$, $b_0$, and $\varepsilon_0$ such that, for any $\varepsilon/b$-regular discretization $\mathcal{T}$ of $T$, we have
\[ \sup_{b>b_0,\, \varepsilon\in[0,\varepsilon_0]} \frac{E_Q L_b^2}{P^2\big(\sup_{t\in T} f(t) > b\big)} \le \sup_{b>b_0,\, \varepsilon\in[0,\varepsilon_0]} \frac{E_Q L_b^2}{P^2\big(\sup_{t\in\mathcal{T}} f(t) > b\big)} \le c \]
for some $c \in (0,\infty)$.

The proof of this result is given in Section 7.2. The fact that the number of replications remains bounded in $b$ is a consequence of the strong control on the variance.

Finally, we note that the proof of Theorem 3.3 follows as a direct corollary of Theorem 7.5, together with Proposition 7.2 and our discussion in Section 2. Assuming that placing each point in $\mathcal{T}$ takes no more than $c$ units of computer time, the total complexity is, according to the discussion in Section 2, $O(nM^3 + M) = O(\varepsilon^{-2}\delta^{-1}M^3 + M)$. The contribution of the term $M^3 = O(\varepsilon^{-3d}b^{3d})$ comes from the complexity of applying Cholesky factorization, and the term $M = O(\varepsilon^{-d}b^{d})$ corresponds to the complexity of placing $\mathcal{T}$.

Remark 7.6. Condition B2 imposes a convexity assumption on the boundary of $T$. This assumption, although convenient in the development of the proofs of Theorems 7.4 and 7.5, is not necessary. The results can be generalized, at the expense of increasing the length and the burden of the technical development, to the case in which $T$ is a $d$-dimensional manifold satisfying the so-called Whitney conditions ([4]).

The remainder of this section is devoted to the proofs of Proposition 7.2, Theorem 7.4, and Theorem 7.5.

7.1. Bias Control: Proofs of Proposition 7.2 and Theorem 7.4. We start with some useful lemmas, for all of which we assume that Conditions B1 and B2 are satisfied. We shall also assume that the global maximum of $f$ over $T$ is achieved, with probability one, at a single point in $T$. Additional conditions under which this will happen can be found in [4], and require little more than the non-degeneracy of the joint distribution of $f$ and its first and second order derivatives. Of these lemmas, Lemma 7.7, the proof of which we defer to Section 7.3, is of central importance.

Lemma 7.7. For any $a_0 > 0$ there exist $c_*$, $\delta_*$, $b_0$, and $\delta_0$ (which depend on the choice of $a_0$) such that, for any $s \in T$, $a \in (0, a_0)$, $\delta \in (0, \delta_0)$, and $b > b_0$,
\[ P\Big(\min_{|t-s| < \delta\sqrt{a}/b} f(t) < b \,\Big|\, f(s) = b + a/b,\; s \in \mathcal{L}\Big) \le c_*\exp\Big(-\frac{\delta_*}{\delta^2}\Big), \]
where $\mathcal{L}$ is the set of local maxima of $f$.

The next lemma is a direct corollary of Lemma 7.7.

Lemma 7.8. Let $t^*$ be the point in $T$ at which the global maximum of $f$ is attained. Then, with the same choice of constants as in Lemma 7.7, for any $a \in (0, a_0)$, $\delta \in (0, \delta_0)$, and $b > b_0$, we have
\[ P\Big(\min_{|t-t^*| < \delta\sqrt{a}/b} f(t) < b \,\Big|\, f(t^*) = b + a/b\Big) \le 2c_*\exp\Big(-\frac{\delta_*}{\delta^2}\Big). \]

Proof. Note that, for any $s \in T$,
\begin{align*}
P\Big(\min_{|t-s| \le \delta\sqrt{a}/b} f(t) < b \,\Big|\, f(s) = b + a/b,\; s = t^*\Big)
&= \frac{P\big(\min_{|t-s| \le \delta\sqrt{a}/b} f(t) < b,\; f(s) = b + a/b,\; s = t^*\big)}{P(f(s) = b + a/b,\; s = t^*)} \\
&\le \frac{P\big(\min_{|t-s| \le \delta\sqrt{a}/b} f(t) < b,\; f(s) = b + a/b,\; s \in \mathcal{L}\big)}{P(f(s) = b + a/b,\; s = t^*)} \\
&= P\Big(\min_{|t-s| < \delta\sqrt{a}/b} f(t) < b \,\Big|\, f(s) = b + a/b,\; s \in \mathcal{L}\Big)\, \frac{P(f(s) = b + a/b,\; s \in \mathcal{L})}{P(f(s) = b + a/b,\; s = t^*)} \\
&\le c_*\exp\Big(-\frac{\delta_*}{\delta^2}\Big)\, \frac{P(f(s) = b + a/b,\; s \in \mathcal{L})}{P(f(s) = b + a/b,\; s = t^*)}.
\end{align*}
Applying this bound, and the fact that
\[ \frac{P(f(s) = b + a/b,\; s \in \mathcal{L})}{P(f(s) = b + a/b,\; s = t^*)} \to 1 \]
as $b \to \infty$, the conclusion of the lemma holds. □

The following lemma gives a bound on the density of $\sup_{t\in T} f(t)$, which will be used to control the size of the overshoot beyond level $b$.

Lemma 7.9. Let $p_{f^*}(x)$ be the density function of $\sup_{t\in T} f(t)$. Then there exist constants $c_{f^*}$ and $b_0$ such that
\[ p_{f^*}(x) \le c_{f^*}\, x^{d+1} P(f(0) > x) \]
for all $x > b_0$.

Proof. Recalling (1.3), let the continuous function $p_E(x)$, $x \in \mathbb{R}$, be defined by the relationship
\[ E\big(\chi(\{t \in T : f(t) \ge b\})\big) = \int_b^\infty p_E(x)\, dx, \]
where the left hand side is the expected value of the Euler–Poincaré characteristic of $A_b$. Then, according to Theorem 8.10 in [7], there exist $c$ and $\delta$ such that
\[ |p_E(x) - p_{f^*}(x)| < c\, P(f(0) > (1+\delta)x) \]
for all $x > 0$. In addition, thanks to the result of [4], which provides $\int_b^\infty p_E(x)\, dx$ in closed form, there exists $c_0$ such that, for all $x > 1$,
\[ p_E(x) < c_0\, x^{d+1} P(f(0) > x). \]
Hence there exists $c_{f^*}$ such that
\[ p_{f^*}(x) \le c_0\, x^{d+1} P(f(0) > x) + c\, P(f(0) > (1+\delta)x) \le c_{f^*}\, x^{d+1} P(f(0) > x) \]
for all $x > 1$. □

The last ingredients required for the proofs of Proposition 7.2 and Theorem 7.4 are stated in the following well known theorem, adapted from Lemma 6.1 and Theorem 7.2 of [19] to the twice differentiable case.

Theorem 7.10. There exists a constant $H$ (depending on the covariance function $C$) such that

(7.4)
\[ P\Big(\sup_{t\in T} f(t) > b\Big) = (1 + o(1))\, H\, m(T)\, b^d\, P(f(0) > b) \]

as $b \to \infty$.

Similarly, choose $\delta$ small enough so that $[0,\delta]^d \subset T$, and let $\Delta_0 = [0, b^{-1}]^d$. Then there exists a constant $H_1$ such that

(7.5)
\[ P\Big(\sup_{t\in\Delta_0} f(t) > b\Big) = (1 + o(1))\, H_1\, P(f(0) > b) \]

as $b \to \infty$.

We are now ready to provide the proofs of Proposition 7.2 and Theorem 7.4.

Proof of Proposition 7.2. The fact that there exists $c_1$ such that
\[ \#(\mathcal{T}) \le c_1\, m(T)\,\varepsilon^{-d} b^d \]
is immediate from assumption B2. Therefore, we proceed to provide a bound for the relative bias. We choose $\varepsilon_0 < \delta_0^2$ ($\delta_0$ as given in Lemmas 7.7 and 7.8) and note that, for any $\varepsilon < \varepsilon_0$,

\begin{align*}
P\Big(\sup_{t\in\mathcal{T}} f(t) < b \,\Big|\, \sup_{t\in T} f(t) > b\Big)
&\le P\Big(\sup_{t\in T} f(t) < b + 2\sqrt{\varepsilon}/b \,\Big|\, \sup_{t\in T} f(t) > b\Big) \\
&\quad + P\Big(\sup_{t\in\mathcal{T}} f(t) < b \,\Big|\, \sup_{t\in T} f(t) > b + 2\sqrt{\varepsilon}/b\Big).
\end{align*}

By (7.4) and Lemma 7.9, there exists $c_2$ such that the first term above can be bounded by
\[ P\Big(\sup_{t\in T} f(t) < b + 2\sqrt{\varepsilon}/b \,\Big|\, \sup_{t\in T} f(t) > b\Big) \le c_2\sqrt{\varepsilon}. \]

The second term can be bounded by
\begin{align*}
P\Big(\sup_{t\in\mathcal{T}} f(t) < b \,\Big|\, \sup_{t\in T} f(t) > b + 2\sqrt{\varepsilon}/b\Big)
&\le P\Big(\sup_{|t-t^*| < 2\varepsilon/b} f(t) < b \,\Big|\, \sup_{t\in T} f(t) > b + 2\sqrt{\varepsilon}/b\Big) \\
&\le 2c_*\exp\big(-\delta_*\varepsilon^{-1}\big).
\end{align*}

Hence there exists $c_0$ such that
\[ P\Big(\sup_{t\in\mathcal{T}} f(t) < b \,\Big|\, \sup_{t\in T} f(t) > b\Big) \le c_2\sqrt{\varepsilon} + 2c_* e^{-\delta_*/\varepsilon} \le c_0\sqrt{\varepsilon} \]
for all $\varepsilon \in (0, \varepsilon_0)$. □

Proof of Theorem 7.4. We write $\theta = 1 - 3\delta \in (0,1)$. First note that, by (7.4),
\[ P\Big(\sup_T f(t) > b + b^{2\delta-1} \,\Big|\, \sup_T f(t) > b\Big) \to 0 \]
as $b \to \infty$. Let $t^*$ be the position of the global maximum of $f$ in $T$. According to the exact Slepian model in Section 7.3, and by an argument similar to the proof of Lemma 7.8,

(7.6)
\[ P\Big(\sup_{|t-t^*| > b^{2\delta-1}} f(t) > b \,\Big|\, b < f(t^*) \le b + b^{2\delta-1}\Big) \to 0 \]

as $b \to \infty$. Consequently,
\[ P\Big(\sup_{|t-t^*| > b^{2\delta-1}} f(t) > b \,\Big|\, \sup_T f(t) > b\Big) \to 0. \]

Let
\[ B(\mathcal{T}, b^{2\delta-1}) = \cup_{t\in\mathcal{T}} B(t, b^{2\delta-1}). \]
We have
\begin{align*}
P\Big(\sup_{\mathcal{T}} f(t) > b \,\Big|\, \sup_T f(t) > b\Big)
&\le P\Big(\sup_{|t-t^*| > b^{2\delta-1}} f(t) > b \,\Big|\, \sup_T f(t) > b\Big) \\
&\quad + P\Big(\sup_{\mathcal{T}} f(t) > b,\; \sup_{|t-t^*| > b^{2\delta-1}} f(t) \le b \,\Big|\, \sup_T f(t) > b\Big) \\
&\le o(1) + P\Big(t^* \in B(\mathcal{T}, b^{2\delta-1}) \,\Big|\, \sup_T f(t) > b\Big) \\
&\le o(1) + P\Big(\sup_{B(\mathcal{T}, b^{2\delta-1})} f(t) > b \,\Big|\, \sup_T f(t) > b\Big).
\end{align*}

Since $\#(\mathcal{T}) \le b^{(1-3\delta)d}$, we can find a finite set $\mathcal{T}' = \{t'_1, \ldots, t'_l\} \subset T$, with $\Delta_k = t'_k + [0, b^{-1}]^d$, such that $l = O(b^{(1-\delta)d})$ and $B(\mathcal{T}, b^{2\delta-1}) \subset \cup_{k=1}^l \Delta_k$. The choice of $l$ depends only on $\#(\mathcal{T})$, not on the particular distribution of $\mathcal{T}$. Therefore, applying (7.5),
\[ \sup_{\#(\mathcal{T}) \le b^{\theta d}} P\Big(\sup_{B(\mathcal{T}, b^{2\delta-1})} f(t) > b\Big) \le O\big(b^{(1-\delta)d}\big)\, P(f(0) > b). \]
This, together with (7.4), yields
\[ \sup_{\#(\mathcal{T}) \le b^{\theta d}} P\Big(\sup_{B(\mathcal{T}, b^{2\delta-1})} f(t) > b \,\Big|\, \sup_T f(t) > b\Big) \le O(b^{-\delta d}) = o(1) \]
for $b \ge b_0$, which clearly implies the statement of the result. □

7.2. Variance Control: Proof of Theorem 7.5. We proceed directly to the proof of Theorem 7.5.

Proof of Theorem 7.5. Note that
\begin{align*}
\frac{E_Q L_b^2}{P^2\big(\sup_{t\in T} f(t) > b\big)}
&= \frac{E(L_b)}{P^2\big(\sup_{t\in T} f(t) > b\big)} = \frac{E\Big(\frac{M \times P(X_1 > b - 1/b)}{\sum_{j=1}^M 1(X_j > b - 1/b)};\; \max_j X_j > b\Big)}{P^2\big(\sup_{t\in T} f(t) > b\big)} \\
&= \frac{E\Big(\frac{M \times P(X_1 > b - 1/b)}{\sum_{j=1}^M 1(X_j > b - 1/b)};\; \max_j X_j > b,\; \sup_{t\in T} f(t) > b\Big)}{P^2\big(\sup_{t\in T} f(t) > b\big)} \\
&= \frac{E\Big(\frac{M\, P(X_1 > b - 1/b)\, 1(\max_j X_j > b)}{\sum_{j=1}^M 1(X_j > b - 1/b)} \,\Big|\, \sup_{t\in T} f(t) > b\Big)}{P\big(\sup_{t\in T} f(t) > b\big)} \\
&= E\bigg(\frac{M\, 1(\max_j X_j > b)}{\sum_{j=1}^M 1(X_j > b - 1/b)} \,\bigg|\, \sup_{t\in T} f(t) > b\bigg)\, \frac{P(X_1 > b - 1/b)}{P\big(\sup_{t\in T} f(t) > b\big)}.
\end{align*}

The remainder of the proof involves showing that the last conditional expectation here is of order $O(b^d)$; this, together with (7.4), will yield the result. Note that, for any $A(b,\varepsilon)$ such that $A(b,\varepsilon) = \Theta(M)$ uniformly over $b$ and $\varepsilon$, we can write

(7.7)
\begin{align*}
E\bigg(\frac{M \times 1(\max_j X_j > b)}{\sum_{j=1}^M 1(X_j > b - 1/b)} \,\bigg|\, \sup_{t\in T} f(t) > b\bigg)
&\le E\Bigg(\frac{M \times 1\big(\sum_{j=1}^M 1(X_j > b - 1/b) \ge M/A(b,\varepsilon)\big)}{\sum_{j=1}^M 1(X_j > b - 1/b)} \,\Bigg|\, \sup_{t\in T} f(t) > b\Bigg) \\
&\quad + E\Bigg(\frac{M \times 1\big(1 \le \sum_{j=1}^M 1(X_j > b - 1/b) < M/A(b,\varepsilon)\big)}{\sum_{j=1}^M 1(X_j > b - 1/b)} \,\Bigg|\, \sup_{t\in T} f(t) > b\Bigg).
\end{align*}

We shall select $A(b,\varepsilon)$ appropriately in order to bound the expectations above. By Lemma 7.8, for any $4\varepsilon \le \delta \le \delta_0$, there exist constants $c', c'' \in (0,\infty)$ such that

(7.8)
\begin{align*}
c_*\exp\Big(-\frac{\delta_*}{\delta^2}\Big)
&\ge P\Big(\min_{|t-t^*| \le \delta/b} f(t) < b - 1/b \,\Big|\, \sup_{t\in T} f(t) > b\Big) \\
&\ge P\bigg(\sum_{j=1}^M 1(X_j > b - 1/b) \le c'\,\frac{\delta^d}{\varepsilon^d} \,\bigg|\, \sup_{t\in T} f(t) > b\bigg) \\
&\ge P\bigg(\frac{M}{b^d\sum_{j=1}^M 1(X_j > b - 1/b)} \ge \frac{c''}{\delta^d} \,\bigg|\, \sup_{t\in T} f(t) > b\bigg).
\end{align*}

The first inequality is an application of Lemma 7.8. The second inequality is due to the fact that, for any ball $B \subseteq T$ of radius $\delta/b$ with $\delta \ge 4\varepsilon$, $\#(\mathcal{T}\cap B) \ge c'\delta^d/\varepsilon^d$ for some $c' > 0$. Inequality (7.8) implies that, for all $x$ with $c''/\delta_0^d < x < c''/(4^d\varepsilon^d)$, there exists $\delta_{**} > 0$ such that
\[ P\bigg(\frac{M}{b^d\sum_{j=1}^M 1(X_j > b - 1/b)} \ge x \,\bigg|\, \sup_{t\in T} f(t) > b\bigg) \le c_*\exp\big(-\delta_{**}\, x^{2/d}\big). \]

Now let $A(b,\varepsilon) = c''\, b^d/(4^d\varepsilon^d)$, and observe that, by the second result in (7.1), we have $A(b,\varepsilon) = \Theta(M)$; moreover, there exists $c_3$ such that the first term on the right hand side of (7.7) is bounded by

(7.9)
\[ E\Bigg(\frac{M \times 1\big(\sum_{j=1}^M 1(X_j > b - 1/b) \ge M/A(b,\varepsilon)\big)}{\sum_{j=1}^M 1(X_j > b - 1/b)} \,\Bigg|\, \sup_T f(t) > b\Bigg) \le c_3 b^d. \]

Now we turn to the second term on the right hand side of (7.7). We use the fact that $M/A(b,\varepsilon) \le c''' \in (0,\infty)$ (uniformly as $b \to \infty$ and $\varepsilon \to 0$). There exist $c_4$ and $c_5$ such that, for $\varepsilon \le \delta_0/c_4$,

(7.10)
\begin{align*}
E\Bigg(\frac{M\, 1\big(1 \le \sum_{j=1}^M 1(X_j > b - 1/b) < M/A(b,\varepsilon)\big)}{\sum_{j=1}^M 1(X_j > b - 1/b)} \,\Bigg|\, \sup_T f(t) > b\Bigg)
&\le M\, P\bigg(\sum_{j=1}^M 1(X_j > b - 1/b) < c''' \,\bigg|\, \sup_T f(t) > b\bigg) \\
&\le M\, P\Big(\min_{|t-t^*| \le c_4\varepsilon/b} f(t) < b - 1/b \,\Big|\, \sup_T f(t) > b\Big) \\
&\le c_1 m(T)\, b^d\varepsilon^{-d}\exp\Big(-\frac{\delta_*}{c_4^2\varepsilon^2}\Big) \le c_5 b^d.
\end{align*}
The second inequality holds because, if $\sum_{j=1}^M 1(X_j > b - 1/b)$ is less than $c'''$, then the minimum of $f$ in a ball of radius $c_4\varepsilon/b$ around the global maximum must be less than $b - 1/b$; otherwise there would be more than $c'''$ elements of $\mathcal{T}$ inside such a ball. The last inequality is due to Lemma 7.8 and Proposition 7.2.

Putting (7.9) and (7.10) together, we obtain, for all $\varepsilon/b$-regular discretizations with $\varepsilon < \varepsilon_0 = \min(1/4, 1/c_4)\delta_0$,
\[ \frac{E_Q L_b^2}{P^2\big(\sup_{t\in T} f(t) > b\big)} \le E\bigg(\frac{M\, 1(\max_j X_j > b)}{\sum_{j=1}^M 1(X_j > b - 1/b)} \,\bigg|\, \sup_{t\in T} f(t) > b\bigg)\, \frac{P(X_1 > b - 1/b)}{P\big(\sup_{t\in T} f(t) > b\big)} \le (c_3 + c_5)\, b^d\, \frac{P(X_1 > b - 1/b)}{P\big(\sup_{t\in T} f(t) > b\big)}. \]
Applying now (7.4) and Proposition 7.2, which give
\[ P\Big(\sup_{t\in T} f(t) > b\Big) < \frac{P\big(\sup_{t\in\mathcal{T}} f(t) > b\big)}{1 - c_0\sqrt{\varepsilon}}, \]
we have
\[ \sup_{b>b_0,\, \varepsilon\in[0,\varepsilon_0]} \frac{E_Q L_b^2}{P^2\big(\sup_{t\in\mathcal{T}} f(t) > b\big)} < \infty, \]
as required. □

7.3. A Final Proof: Lemma 7.7. Without loss of generality, we assume that the field has mean zero and unit variance. Before getting into the details of the proof of Lemma 7.7, we need a few additional lemmas, for which we adopt the following notation. Let $C_i$ and $C_{ij}$ be the first and second order derivatives of $C$, and define the vectors
\begin{align*}
\mu_1(t) &= (-C_1(t), \ldots, -C_d(t)), \\
\mu_2(t) &= \mathrm{vech}\big((C_{ij}(t),\; i = 1, \ldots, d,\; j = i, \ldots, d)\big).
\end{align*}
Let $f'(0)$ and $f''(0)$ be the gradient and the vector of second order derivatives of $f$ at 0, where $f''(0)$ is arranged in the same order as $\mu_2(0)$. Furthermore, let $\mu_{02} = \mu_{20}^\top$ be a vector of second order spectral moments and $\mu_{22}$ a matrix of fourth order spectral moments. The vectors $\mu_{02}$ and $\mu_{22}$ are arranged so that
\[ \begin{pmatrix} 1 & 0 & \mu_{02} \\ 0 & \Lambda & 0 \\ \mu_{20} & 0 & \mu_{22} \end{pmatrix} \]
is the covariance matrix of $(f(0), f'(0), f''(0))$, where $\Lambda = (-C_{ij}(0))$. It then follows that
\[ \mu_{2\cdot0} = \mu_{22} - \mu_{20}\mu_{02} \]
is the conditional variance of $f''(0)$ given $f(0)$. The following lemma, given in [5], provides a stochastic representation of $f$ given that it has a local maximum at level $u$ at the origin.

Lemma 7.11. Given that $f$ has a local maximum with height $u$ at zero (an interior point of $T$), the conditional field is equal in distribution to

(7.11)
\[ f_u(t) = uC(t) - W_u\beta^\top(t) + g(t), \]

where $g(t)$ is a centered Gaussian random field with covariance function
\[ \gamma(s,t) = C(s-t) - (C(s),\, \mu_2(s))\begin{pmatrix} 1 & \mu_{02} \\ \mu_{20} & \mu_{22} \end{pmatrix}^{-1}\begin{pmatrix} C(t) \\ \mu_2^\top(t) \end{pmatrix} - \mu_1(s)\Lambda^{-1}\mu_1^\top(t), \]
and $W_u$ is a $d(d+1)/2$-dimensional random vector, independent of $g(t)$, with density function

(7.12)
\[ \psi_u(w) \propto \big|\det\big(r^*(w) - u\Lambda\big)\big|\exp\Big(-\frac{1}{2}w^\top\mu_{2\cdot0}^{-1}w\Big)\, 1\big(r^*(w) - u\Lambda \in \mathcal{N}\big), \]

where $r^*(w)$ is a $d\times d$ symmetric matrix whose upper triangular elements consist of the components of $w$. The set of negative definite matrices is denoted by $\mathcal{N}$. Finally, $\beta(t)$ is defined by
\[ (\alpha(t),\, \beta(t)) = (C(t),\, \mu_2(t))\begin{pmatrix} 1 & \mu_{02} \\ \mu_{20} & \mu_{22} \end{pmatrix}^{-1}. \]

The following two technical lemmas, which we shall prove after completing the proof of Lemma 7.7, provide bounds for the last two terms of (7.11).

Lemma 7.12. Using the notation of (7.11), there exist $\delta_0$, $\varepsilon_1$, $c_1$, and $b_0$ such that, for any $u > b > b_0$ and $\delta \in (0, \delta_0)$,
\[ P\bigg(\sup_{|t| \le \delta\sqrt{a}/b}\big|W_u\beta^\top(t)\big| > \frac{a}{4b}\bigg) \le c_1\exp\Big(-\frac{\varepsilon_1 b^2}{\delta^4}\Big). \]

Lemma 7.13. There exist $c$, $\tilde\delta$, and $\delta_0$ such that, for any $\delta \in (0, \delta_0)$,
\[ P\Big(\max_{|t| \le \delta\sqrt{a}/b}|g(t)| > \frac{a}{4b}\Big) \le c\exp\Big(-\frac{\tilde\delta}{\delta^2}\Big). \]

Proof of Lemma 7.7. Using the notation of Lemma 7.11, given any $s \in \mathcal{L}$ for which $f(s) = b$, the corresponding conditional distribution of $f$ is that of $f_b(\cdot - s)$. Consequently, it suffices to show that
\[ P\Big(\min_{|t| \le \delta\sqrt{a}/b} f_b(t) < b - a/b\Big) \le c_*\exp\Big(-\frac{\delta_*}{\delta^2}\Big). \]
We consider first the case in which the local maximum is in the interior of $T$. Then, by the Slepian model (7.11),
\[ f_b(t) = bC(t) - W_b\beta^\top(t) + g(t). \]
We study the three terms of the Slepian model individually. Since
\[ C(t) = 1 - t^\top\Lambda t + o(|t|^2), \]
there exists an $\varepsilon_0$ such that
\[ bC(t) \ge b - \frac{a}{4b} \]

for all $|t| < \varepsilon_0\sqrt{a}/b$. According to Lemmas 7.12 and 7.13, for $\delta < \min(\varepsilon_0, \delta_0)$,
\begin{align*}
P\Big(\min_{|t| \le \delta\sqrt{a}/b} f_b(t) < b - a/b\Big)
&\le P\Big(\max_{|t| < \delta\sqrt{a}/b}|g(t)| > \frac{a}{4b}\Big) + P\bigg(\sup_{|t| \le \delta\sqrt{a}/b}\big|W_b\beta^\top(t)\big| > \frac{a}{4b}\bigg) \\
&\le c\exp\Big(-\frac{\tilde\delta}{\delta^2}\Big) + c_1\exp\Big(-\frac{\varepsilon_1 b^2}{\delta^4}\Big) \\
&\le c_*\exp\Big(-\frac{\delta_*}{\delta^2}\Big)
\end{align*}
for some $c_*$ and $\delta_*$.

for some clowast and δlowastNow consider the case for which the local maximum is in the (dminus1)-dimensional

boundary of T Due to convexity of T and without loss of generality we assumethat the tangent space of partT is generated by partpartt2 partparttd the local maximumis located at the origin and T is a subset of the positive half-plane t1 ge 0 Thatthese arguments to not involve a loss of generality follows from the arguments onpages 291ndash192 of [4] which rely on the assumed stationarity of f (for translations)and the fact that rotations while changing the distributions will not affect theprobabilities that we are currently computing)

For the origin positioned as just described to be a local maximum it is necessaryand sufficient that the gradient of f restricted to partT is the zero vector the Hessianmatrix restricted to partT is negative definite and part1f(0) le 0 Applying a version ofLemma 711 for this case conditional on f(0) = u and 0 being a local maximumthe field is equal in distribution to

(713) uC(t)minus Wuβgt(t) + micro1(t)Λminus1 (Z 0 0)T + g(t)

where Z le 0 corresponds to part1f (0) and it follows a truncated (conditional on thenegative axis) Gaussian random variable with mean zero and a variance parameterwhich is computed as the conditional variance of part1f (0) given (part2f (0) partdf (0))The vector Wu is a (d(d + 1)2)-dimensional random vector with density function

ψu(w) prop |det(rlowast(w)minus uΛ)| exp(minus12wgtmicrominus1

2middot0w)1(wlowast minus uΛ isin N )

where Λ is the second spectral moment of f restricted to partT and rlowast(w) is the(dminus 1)times (dminus 1) symmetric matrix whose upper triangular elements consist of thecomponents of w In the representation (713) the vector Wu and Z are independentAs in the proof of Lemma 712 one can show that a similar bound holds albeitwith with different constants Thus since micro1 (t) = O (t) there exist cprimeprime and δprimeprime suchthat the third term in (713) can be bounded by

\[ P\Big(\max_{|t| \le \delta\sqrt{a}/b}\big|\mu_1(t)\Lambda^{-1}(Z, 0, \ldots, 0)^\top\big| \ge \frac{a}{4b}\Big) \le c''\exp\Big(-\frac{\delta''}{\delta^2}\Big). \]

Consequently, we can also find $c_*$ and $\delta_*$ such that the conclusion holds, and we are done. □

We complete the paper with the proofs of Lemmas 7.12 and 7.13.

Proof of Lemma 7.12. It suffices to prove the lemma for the case $a = 1$. Since
\[ f_u(0) = u = u - W_u\beta^\top(0) + g(0), \]
and $W_u$ and $g$ are independent, $\beta(0) = 0$. Furthermore, since $C'(t) = O(t)$ and $\mu_2'(t) = O(t)$, there exists a $c_0$ such that $|\beta(t)| \le c_0|t|^2$. In addition, $W_u$ has density function
\[ \psi_u(w) \propto \big|\det\big(r^*(w) - u\Lambda\big)\big|\exp\Big(-\frac{1}{2}w^\top\mu_{2\cdot0}^{-1}w\Big)\, 1\big(r^*(w) - u\Lambda \in \mathcal{N}\big). \]
Note that $\det(r^*(w) - u\Lambda)$ is expressible as a polynomial in $w$ and $u$, and there exist some $\varepsilon_0$ and $c$ such that
\[ \bigg|\frac{\det\big(r^*(w) - u\Lambda\big)}{\det(-u\Lambda)}\bigg| \le c \]

if $|w| \le \varepsilon_0 u$. Hence there exist $\varepsilon_2, c_2 > 0$ such that
\[ \psi_u(w) \le \bar\psi(w) = c_2\exp\Big(-\frac{1}{2}\varepsilon_2\, w^\top\mu_{2\cdot0}^{-1}w\Big) \]
for all $u \ge 1$. The right hand side here is proportional to a multivariate Gaussian density. Thus
\[ P(|W_u| > x) = \int_{|w|>x}\psi_u(w)\, dw \le \int_{|w|>x}\bar\psi(w)\, dw = c_3\, P(|W| > x), \]
where $W$ is a multivariate Gaussian random variable with density proportional to $\bar\psi$. Therefore, by choosing $\varepsilon_1$ and $c_1$ appropriately, we have
\[ P\bigg(\sup_{|t| \le \delta/b}\big|W_u\beta^\top(t)\big| > \frac{1}{4b}\bigg) \le P\bigg(|W_u| > \frac{b}{4c_0\delta^2}\bigg) \le c_1\exp\Big(-\frac{\varepsilon_1 b^2}{\delta^4}\Big) \]
for all $u \ge b$. □

Proof of Lemma 7.13. Once again, it suffices to prove the lemma for the case $a = 1$. Since
\[ f_b(0) = b = b - W_b\beta^\top(0) + g(0), \]
the covariance function $(\gamma(s,t),\; s,t \in T)$ of the centered field $g$ satisfies $\gamma(0,0) = 0$. It is also easy to check that
\[ \partial_s\gamma(s,t) = O(|s| + |t|), \qquad \partial_t\gamma(s,t) = O(|s| + |t|). \]
Consequently, there exists a constant $c_\gamma \in (0,\infty)$ for which
\[ \gamma(s,t) \le c_\gamma(|s|^2 + |t|^2), \qquad \gamma(s,s) \le c_\gamma|s|^2. \]
We need to control the tail probability of $\sup_{|t| \le \delta/b}|g(t)|$. For this it is useful to introduce the following scaling. Define
\[ g_\delta(t) = \frac{b}{\delta}\, g\Big(\frac{\delta t}{b}\Big). \]
Then $\sup_{|t| \le \delta/b} g(t) \ge \frac{1}{4b}$ if and only if $\sup_{|t| \le 1} g_\delta(t) \ge \frac{1}{4\delta}$. Let
\[ \sigma_\delta(s,t) = E\big(g_\delta(s)\, g_\delta(t)\big). \]
Then
\[ \sup_{|s|\le 1}\sigma_\delta(s,s) \le c_\gamma. \]

Because $\gamma(s,t)$ is at least twice differentiable, applying a Taylor expansion we easily see that the canonical metric $d_g$ corresponding to $g_\delta$ (cf. Theorem 6.7) can be bounded as follows:
\[ d_g^2(s,t) = E\big(g_\delta(s) - g_\delta(t)\big)^2 = \frac{b^2}{\delta^2}\bigg[\gamma\Big(\frac{\delta s}{b}, \frac{\delta s}{b}\Big) + \gamma\Big(\frac{\delta t}{b}, \frac{\delta t}{b}\Big) - 2\gamma\Big(\frac{\delta s}{b}, \frac{\delta t}{b}\Big)\bigg] \le c|s-t|^2 \]
for some constant $c \in (0,\infty)$. Therefore the entropy of $g_\delta$, evaluated at $\varepsilon$, is bounded by $K\varepsilon^{-d}$ for any $\varepsilon > 0$, with an appropriate choice of $K > 0$. Therefore, for all $\delta < \delta_0$,
\[ P\Big(\sup_{|t| \le \delta/b}|g(t)| \ge \frac{1}{4b}\Big) = P\Big(\sup_{|t| \le 1}|g_\delta(t)| \ge \frac{1}{4\delta}\Big) \le c_d\,\delta^{-d-\eta}\exp\Big(-\frac{1}{16c_\gamma\delta^2}\Big) \]
for some constant $c_d$ and $\eta > 0$. The last inequality is a direct application of Theorem 4.1.1 of [4]. The conclusion of the lemma follows immediately by choosing $c$ and $\tilde\delta$ appropriately. □

REFERENCES

[1] R.J. Adler, K. Bartz, and S. Kou. Estimating curvatures in the Gaussian kinematic formula. In preparation.
[2] R.J. Adler, J. Blanchet, and J. Liu. Efficient simulation for tail probabilities of Gaussian random fields. In WSC '08: Proceedings of the 40th Conference on Winter Simulation, pages 328–336. Winter Simulation Conference, 2008.
[3] R.J. Adler, P. Muller, and B. Rozovskii, editors. Stochastic Modelling in Physical Oceanography. Birkhauser, Boston, 1996.
[4] R.J. Adler and J.E. Taylor. Random Fields and Geometry. Springer, New York, USA, 2007.
[5] R.J. Adler, J.E. Taylor, and K. Worsley. Applications of Random Fields and Geometry: Foundations and Case Studies. http://webee.technion.ac.il/people/adler/hrf.pdf.
[6] S. Asmussen and P. Glynn. Stochastic Simulation: Algorithms and Analysis. Springer-Verlag, New York, NY, USA, 2008.
[7] J.M. Azais and M. Wschebor. Level Sets and Extrema of Random Processes and Fields. Wiley, New York, NY, 2009.
[8] J.M. Bardeen, J.R. Bond, N. Kaiser, and A.S. Szalay. The statistics of peaks of Gaussian random fields. The Astrophysical Journal, 304:15–61, 1986.
[9] S.M. Berman. Sojourns and Extremes of Stochastic Processes. The Wadsworth & Brooks/Cole Statistics/Probability Series. Wadsworth & Brooks/Cole Advanced Books & Software, Pacific Grove, CA, 1992.
[10] J. Blanchet. Efficient importance sampling for binary contingency tables. Ann. of Appl. Prob., 19:949–982, 2009.
[11] C. Borell. The Brunn-Minkowski inequality in Gauss space. Invent. Math., 30:205–216, 1975.
[12] J. Bucklew. Introduction to Rare-Event Simulation. Springer-Verlag, New York, NY, 2004.
[13] M.R. Dennis. Nodal densities of planar Gaussian random waves. Eur. Phys. J., 145:191–210, 2007.
[14] R.M. Dudley. Sample functions of the Gaussian process. Ann. Probab., 1:66–103, 1973.
[15] K.J. Friston, K.J. Worsley, R.S.J. Frackowiak, J.C. Mazziotta, and A.C. Evans. Assessing the significance of focal activations using their spatial extent. Human Brain Mapping, 1:214–220, 1994.
[16] M.R. Leadbetter, G. Lindgren, and H. Rootzen. Extremes and Related Properties of Random Sequences and Processes. Springer-Verlag, New York, 1983.
[17] M. Mitzenmacher and E. Upfal. Probability and Computing: Randomized Algorithms and Probabilistic Analysis. Cambridge University Press, 2005.
[18] W. Niemiro and P. Pokarowski. Fixed precision MCMC estimation by median of products of averages. J. Appl. Probab., 46:309–329, 2009.
[19] V. Piterbarg. Asymptotic Methods in the Theory of Gaussian Processes. American Mathematical Society, USA, 1995.
[20] S.F. Shandarin. Testing non-Gaussianity in cosmic microwave background maps by morphological statistics. Mon. Not. R. Astr. Soc., 331:865–874, 2002.
[21] S.F. Shandarin, H.A. Feldman, Y. Xu, and M. Tegmark. Morphological measures of non-gaussianity in cosmic microwave background maps. Astrophys. J. Suppl., 141:1–11, 2002.
[22] L. Shepp and H. Landau. On the supremum of a Gaussian process. Sankhya A, 32:369–378, December 1970.
[23] J.E. Taylor, A. Takemura, and R.J. Adler. Validity of the expected Euler characteristic heuristic. Ann. Probab., 33(4):1362–1396, 2005.
[24] J.E. Taylor and K.J. Worsley. Detecting sparse signals in random fields, with an application to brain mapping. J. Amer. Statist. Assoc., 102(479):913–928, 2007.
[25] J. Traub, G. Wasilokowski, and H. Wozniakowski. Information-Based Complexity. Academic Press, New York, NY, 1988.
[26] B.S. Tsirelson, I.A. Ibragimov, and V.N. Sudakov. Norms of Gaussian sample functions. In Proceedings of the Third Japan-USSR Symposium on Probability Theory (Tashkent, 1975), Lecture Notes in Mathematics 550, pages 20–41, 1976.
[27] V.S. Tsirel'son. The density of the maximum of a Gaussian process. Theory of Probab. Appl., 20:847–856, 1975.
[28] H. Wozniakowski. Computational complexity of continuous problems. Technical Report, 1996.

Robert J. Adler
Electrical Engineering
Technion, Haifa, Israel 32000
E-mail: robert@ee.technion.ac.il
URL: webee.technion.ac.il/people/adler

Jose H. Blanchet
Industrial Engineering & Operations Research
Columbia University
340 S.W. Mudd Building
500 W. 120 Street
New York, New York 10027
E-mail: joseblanchet@columbia.edu
URL: www.ieor.columbia.edu/fac-bios/blanchet/faculty.html

Jingchen Liu
Department of Statistics
Columbia University
1255 Amsterdam Avenue, Room 1030
New York, NY 10027
E-mail: jcliu@stat.columbia.edu
URL: stat.columbia.edu/~jcliu


0 b for all b ge b0 Thisfollows easily from the fact that

logP

suptisinT

f(t) gt x

sim log sup

tisinTP f(t) gt x sim minus x2

2σ2T

On the other hand by Theorem 612 F is continuously differentiable and so

(616) P

suptisinT

f(t) lt b + ab | suptisinT

f(t) gt b

=

int b+ab

bF prime(x) dx

P suptisinT f(t) gt b

Moreover

F prime(x) le(

1minus Φ(

xylowasty

)) (xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x))

le (1minus Φ(ylowast))(

xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x))

= P

(maxtisinT

f (t) gt bminus 1b

)(xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x))

Thereforeint b+ab

b

F prime(x) dx

le P

(suptisinT

f (t) gt bminus 1b

) int b+ab

b

(xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x)) dx

Recall that α (x) = y2[x(xminus y)y2lowast] we can use the fact that ylowast ge δ0b to conclude

that if x isin [b b + ab] then α (x) le δminus20 and therefore

int b+ab

b

(xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x)) dx le 4δminus80 a

MONTE CARLO FOR RANDOM FIELDS 25

We thus obtain that

(617)

int b+ab

bF prime(x) dx

P suptisinT f(t) gt b le 4aδminus80

P suptisinT f(t) gt bminus 1bP suptisinT f(t) gt b

for any b ge b0 This inequality together with the fact that F is continuously differ-entiable on (minusinfininfin) yields the proof of the lemma for b ge 0 2

The previous result translates a question that involves the conditional distri-bution of maxtisinT f (t) near b into a question involving the tail distribution ofmaxtisinT f (t) The next result then provides a bound on this tail distribution

Lemma 614 For each v gt 0 there exists a constant C(v) isin (0infin) (possiblydepending on v gt 0 but otherwise independent of b) so that such that

P

(maxtisinT

f (t) gt b

)le C(v)b2dβ+dv+1 max

tisinTP (f (t) gt b)

for all b ge 1

Proof The proof of this result follows along the same lines of Theorem 262 in[5] Consider an open cover of T = cupM

i=1Ti(θ) where Ti(θ) = s |s minus ti| lt θWe choose ti carefully such that N(θ) = O(θminusd) for θ arbitrarily small Writef (t) = g (t)+micro (t) where g (t) is a centered Gaussian random field and note usingA2 and A3 that

P

(max

tisinTi(θ)f (t) gt b

)le P

(max

tisinTi(θ)g (t) gt bminus micro (ti)minus κHθβ

)

Now we wish to apply the Borel-TIS inequality (Theorem 68) with U = Ti (θ) f0 =g d (s t) = E12([g (t)minus g (s)]2) which as a consequence of A2 and A3 is boundedabove by C0 |tminus s|β2 for some C0 isin (0infin) Thus applying Theorem 67 we havethat E maxtisinTi(θ) g (t) le C1θ

β2 log(1θ) for some C1 isin (0infin) Consequently theBorel-TIS inequality yields that there exists C2 (v) isin (0infin) such that for all bsufficiently large and θ sufficiently small we have

P

(max

tisinTi(θ)g (t) gt bminus micro (ti)minus κHθβ

)le C2 (v) exp

(minus

(bminus micro (ti)minus C1θ

β(2+βv))2

2σ2Ti

)

where σTi = maxtisinTi(θ) σ (t) Now select v gt 0 small enough and set θβ(2+βv) =bminus1 Straightforward calculations yield that

P

(max

tisinTi(θ)f (t) gt b

)le P

(max

tisinTi(θ)g (t) gt bminus micro (ti)minus κHθβ

)

le C3 (v) maxtisinTi(θ)

exp

(minus (bminus micro (t))2

2σ (t)2

)

26 ADLER BLANCHET LIU

for some C3(v) isin (0infin) Now recall the well known inequality (valid for x gt 0)that

φ (x)(

1xminus 1

x3

)le 1minus Φ(x) le φ (x)

x

where φ = Φprime is the standard Gaussian density Using this inequality it follows thatC4 (v) isin (0infin) can be chosen so that

maxtisinTi(θ)

exp

(minus (bminus micro (t))2

2σ (t)2

)le C4 (v) bmax

tisinTi

P (f (t) gt b)

for all b ge 1 We then conclude that there exists C (v) isin (0infin) such that

P

(maxtisinT

f (t) gt b

)le N (θ)C4 (v) bmax

tisinTP (f (t) gt b)

le CθminusdbmaxtisinT

P (f (t) gt b) = Cb2dβ+dv+1 maxtisinT

P (f (t) gt b)

giving the result 2

We can now complete the proof of Proposition 65

Proof of Proposition 65 The result is a straightforward corollary of the previoustwo lemmas By (615) in Lemma 613 and Lemma 614 there exists λ isin (0infin) forwhich

P

(maxtisinT

f(t) le b + ab | maxtisinT

f(t) gt b

)

le aAP (maxtisinT f (t) ge bminus 1b)

P (maxtisinT f (t) ge b)

le aCAb2dβ+dv+1 maxtisinT P (f (t) gt bminus 1b)P (maxtisinT f (t) ge b)

le aCAb2dβ+dv+1 maxtisinT P (f (t) gt bminus 1b)maxtisinT P (f (t) gt b)

le aλb2dβ+dv+1

The last two inequalities follow from the obvious bound

P

(maxtisinT

f (t) ge b

)ge max

tisinTP (f (t) gt b)

and standard properties of the Gaussian distribution This yields (612) from whichthe remainder of the proposition follows 2

MONTE CARLO FOR RANDOM FIELDS 27

7 Fine Tuning Twice Differentiable Homogeneous Fields In the pre-ceeding section we constructed a polynomial time algorithm based on a randomizeddiscretization scheme Our goal in this section is to illustrate how to take advantageof additional information to further improve the running time and the efficiency ofthe algorithm In order to illustrate our techniques we shall perform a more refinedanalysis in the setting of smooth and homogeneous fields and shall establish op-timality of the algorithm in a precise sense to described below Our assumptionsthroughout this section are B1ndashB2 of Section 3

Let C(sminus t) = Cov(f(s) f(t)) be the covariance function of f which we assumealso has mean zero Note that it is an immediate consequence of homogeneity anddifferentiability that partiC(0) = part3

ijkC(0) = 0We shall need the following definition

Definition 71 We call T = t1 tM sub T a θ-regular discretization of Tif and only if

mini 6=j

|ti minus tj | ge θ suptisinT

mini|ti minus t| le 2θ

Regularity ensures that points in the grid T are well separated Intuitively sincef is smooth having tight clusters of points translates to a waste of computingresources as a result of sampling highly correlated values of f Also note thatevery region containing a ball of radius 2θ has at least one representative in T Therefore T covers the domain T in an economical way One technical convenienceof θ-regularity is that for subsets A sube T that have positive Lebesgue measure (inparticular ellipsoids)

limMrarrinfin

(A cap T )M

=m(A)m(T )

where here and throughout the remainder of the section (A) denotes the cardi-nality of the set A

Let T = t1 tM be a θ-regular discretization of T and consider

X = (X1 XM )T ∆= (f (t1) f (tM ))T

We shall concentrate on estimating wM (b) = P (max1leileM Xi gt b) The next result(which we prove in Section 71) shows that if θ = εb then the relative bias is O (

radicε)

Proposition 72 Suppose f is a Gaussian random field satisfying conditionsB1 and B2 There exist c0 c1 b0 and ε0 such that for any finite εb-regulardiscretization T of T

(71) P (suptisinT

f(t) lt b| suptisinT

f(t) gt b) le c0

radicε and (T ) le c1m(T )εminusdbd

for all ε isin (0 ε0] and b gt b0

Note that the bound on the bias obtained for twice differentiable fields is muchsharper than that of the general Holder continuous fields given by (613) in Theorem

28 ADLER BLANCHET LIU

66 This is partly because the conditional distribution of the random field aroundlocal maxima is harder to describe in the Holder continuous than in the case oftwice differentiable fields In addition to the sharper description of the bias weshall also soon show in Theorem 74 that our choice of discretization is optimal in acetain sense Finally we point out that the bound of

radicε in the first term of (71) is

not optimal In fact there seems to be some room of improvement and we believethat a more careful analysis might yield a bound of the form c0ε

2We shall estimate wM (b) by using a slight variation of Algorithm 53 In partic-

ular since the Xirsquos are now identically distributed we redefine Q to be

(72) Q (X isin B) =Msum

i=1

1M

P [X isin B|Xi gt bminus 1b]

Our estimator then takes the form

(73) Lb =M times P (X1 gt bminus 1b)sumM

j=1 1 (Xj gt bminus 1b)1( max

1leileMXi gt b)

Clearly we have that EQ(Lb) = wM (b) (The reason for for subtracting the fcatorof 1b was explained in Section 61)

Algorithm 73 Given a number of replications n and an εb-regular dis-cretization T the algorithm is as follows

Step 1 Sample X(1) X(n) iid copies of X with distribution Q given by (72)Step 2 Compute and output

Ln =1n

nsum

i=1

L(i)b

where

L(i)b =

M times P (X1 gt bminus 1b)sumM

j=1 1(X(i)j gt bminus 1b)

1( max1leileM

X(i)j gt b)

Theorem 75 later guides the selection of n in order to achieve a prescribedrelative error In particular our analysis together with considerations from Section2 implies that choosing n = O

(εminus2δminus1

)suffices to achieve ε relative error with

probability at least 1minus δAlgorithm 73 improves on Algorithm 61 for Holder continuous fields in two

important ways The first aspect is that it is possible to obtain information on thesize of the relative bias of the estimator In Proposition 72 we saw that in orderto overcome bias due to discretization it suffices to take a discretization of sizeM = (T ) = Θ(bd) That this selection is also asymptotically optimal in the sensedescribed in the next result will be proven in Section 71

MONTE CARLO FOR RANDOM FIELDS 29

Theorem 74 Suppose f is a Gaussian random field satisfying conditions B1and B2 If θ isin (0 1) then as b rarrinfin

sup(T )lebθd

P (suptisinT

f(t) gt b| suptisinT

f(t) gt b) rarr 0

This result implies that the relative bias goes to 100 as b rarrinfin if one chooses adiscretization scheme of size O

(bθd

)with θ isin (0 1) Consequently d is the smallest

power of b that achieves any given bounded relative bias and so the suggestionabove of choosing M = O

(bd

)points for the discretization is in this sense optimal

The second aspect of improvement involves the variance In the case of Holdercontinuous fields the ratio of the second moment of the estimator and w (b)2 wasshown to be bounded by a quantity that is of order O(M2) In contrast in thecontext of smooth and homogeneous fields considered here the next result showsthat this ratio is bounded uniformly for b gt b0 and M = (T ) ge cbd That is thevariance remains strongly controlled

Theorem 75 Suppose f is a Gaussian random field satisfying conditions B1and B2 Then there exist constants c b0 and ε0 such that for any εb-regulardiscretization T of T we have

supbgtb0εisin[0ε0]

EQL2b

P 2 (suptisinT f (t) gt b)le sup

bgtb0εisin[0ε0]

EQL2b

P 2(sup

tisinTf (t) gt b

) le c

for some c isin (0infin)

The proof of this result is given in Section 72 The fact that the number ofreplications remains bounded in b is a consequence of the strong control on thevariance

Finally we note that the proof of Theorem 33 follows as a direct corollary of The-orem 75 together with Proposition 72 and our discussion in Section 2 Assumingthat placing each point in T takes no more than c units of computer time the totalcomplexity is according to the discussion in section Section 2 O

(nM3 + M

)=

O(εminus2δminus1M3 + M

) The contribution of the term M3 = O

(εminus6b3

)comes from

the complexity of applying Cholesky factorization and the term M = O(εminus2b)corresponds to the complexity of placing T

Remark 76 Condition B2 imposes a convexity assumption on the boundaryof T This assumption although convenient in the development of the proofs of The-orems 74 and 75 is not necessary The results can be generalized at the expenseof increasing the length and the burden in the technical development to the casein which T is a d-dimensional manifold satisfying the so-called Whitney conditions([4])

The remainder of this section is devoted to the proof of Proposition 72 Theorem74 and Theorem 75

30 ADLER BLANCHET LIU

71 Bias Control Proofs of Proposition 72 and Theorem 74 We start withsome useful lemmas for all of which we assume that Conditions B1 and B2 are sat-isfied We shall also assume that the global maximum of f over T is achieved withprobability one at a single point in T Additional conditions under which this willhappen can be found in [4] and require little more than the non-degeneracy of thejoint distribution of f and its first and second order derivatives Of these lemmasLemma 77 the proof of which we defer to Section 73 is of central importance

Lemma 77 For any a0 gt 0 there exists clowast δlowast b0 and δ0 (which depend onthe choice of a0) such that for any s isin T a isin (0 a0) δ isin (0 δ0) and b gt b0

P ( min|tminuss|ltδabminus1

f(t) lt b|f(s) = b + ab s isin L) le clowast exp(minusδlowast

δ2

)

where L is the set of local maxima of f

The next lemma is a direct corollary of Lemma 77

Lemma 78 Let tlowast be the point in T at which the global maximum of f isattained Then with the same choice of constants as in Lemma 77 for any a isin(0 a0) δ isin (0 δ0) and b gt b0 we have

P ( min|tminustlowast|ltδabminus1

f(t) lt b|f(tlowast) = b + ab) le 2clowast exp(minusδlowast

δ2

)

Proof Note that for any s isin T

P

(min

|t|leδabf(t) lt b|f(s) = b + ab s = tlowast

)

=P

(min|t|leδab f(t) lt b f(s) = b + ab s = tlowast

)

P (f(s) = b + ab s = tlowast)

le P(min|t|leδab f(t) lt b f(s) = b + ab s isin L)

P (f(s) = b + ab s = tlowast)

= P ( min|tminuss|ltδabminus1

f(t) lt b|f(s) = b + ab s isin L)P (f(s) = b + ab s isin L)P (f(s) = b + ab s = tlowast)

le clowast exp(minusδlowast

δ2

)P (f(s) = b + ab s isin L)P (f(s) = b + ab s = tlowast)

Appling this bound and the fact that

P (f(s) = b + ab s isin L)P (f(s) = b + ab s = tlowast)

rarr 1

as b rarrinfin the conclusion of the Lemma holds 2

MONTE CARLO FOR RANDOM FIELDS 31

The following lemma gives a bound on the density of suptisinT f(t) which will beused to control the size of overshoot beyond level b

Lemma 79 Let pflowast(x) be the density function of suptisinT f(t) Then there existsa constant cflowast and b0 such that

pflowast(x) le cflowastxd+1P (f(0) gt x)

for all x gt b0

Proof Recalling (13) let the continuous function pE(x) x isin R be defined bythe relationship

E(χ(t isin T f(t) ge b)) =int infin

b

pE(x) dx

where the left hand side is the expected value of the Euler-Poincare characteristicof Ab Then according to Theorem 810 in [7] there exists c and δ such that

|pE(x)minus pflowast(x)| lt cP (f(0) gt (1 + δ)x)

for all x gt 0 In addition thanks to the result of [4] which providesintinfin

bpE(x)dx in

closed form there exists c0 such that for all x gt 1

pE(x) lt c0xd+1P (f(0) gt x)

Hence there exists cflowast such that

pflowast(x) le c0xd+1P (f(0) gt x) + cP (f(0) gt (1 + δ)x) le cflowastx

d+1P (f(0) gt x)

for all x gt 1 2

The last ingredients required to provide the proof of Proposition 72 and Theorem74 are stated in the following well known Theorem adapted from Lemma 61 andTheorem 72 in [19] to the twice differentiable case

Theorem 710 There exists a constant H (depending on the covariance func-tion C such that

P (suptisinT

f(t) gt b) = (1 + o (1))Hm(T )bdP (f(0) gt b)(74)

as b rarrinfinSimilarly choose δ small enough so that [0 δ]d sub T and let ∆0 = [0 bminus1]d Then

there exists a constant H1 such that

P ( suptisin∆0

f(t) gt b) = (1 + o(1))H1P (f(0) gt b)(75)

as b rarrinfin

32 ADLER BLANCHET LIU

We now are ready to provide the proof of Proposition 72 and Theorem 74

Proof of Proposition 72 The fact that there exists c1 such that

(T ) le c1m(T )εminusdbd

is immediate from assumption B2 Therefore we proceed to provide a bound forthe relative bias We choose ε0 lt δ2

0 (δ0 is given in Lemmas 77 and 78) and notethat for any ε lt ε0

P (suptisinT

f(t) lt b| suptisinT

f(t) gt b)

le P (suptisinT

f(t) lt b + 2radic

εb| suptisinT

f(t) gt b) + P (suptisinT

f(t) lt b| suptisinT

f(t) gt b + 2radic

εb)

By (74) and Lemma 79 there exists c2 such that the first term above can boundedby

P (suptisinT

f(t) lt b + 2radic

εb| suptisinT

f(t) gt b) le c2

radicε

The second term can be bounded by

P (suptisinT

f(t) lt b| suptisinT

f(t) gt b + 2radic

εb)

le P ( sup|tminustlowast|lt2εbminus1

f(t) lt b| suptisinT

f(t) gt b + 2radic

ε)

le 2clowast exp(minusδlowastεminus1

)

Hence there exists c0 such that

P (suptisinT

f(t) lt b| suptisinT

f(t) gt b) le c2

radicε + 2clowasteminusδlowastε le c0

radicε

for all ε isin (0 ε0) 2

Proof of Theorem 74 We write θ = 1minus 3δ isin (0 1) First note that by (74)

P (supT

f(t) gt b + b2δminus1| supT

f(t) gt b) rarr 0

as b rarrinfin Let tlowast be the position of the global maximum of f in T According to theexact Slepian model in Section 73 and an argument similar to the proof of Lemma78

(76) P ( sup|tminustlowast|gtb2δminus1

f(t) gt b|b lt f(tlowast) le b + b2δminus1) rarr 0

as b rarrinfin Consequently

P ( sup|tminustlowast|gtb2δminus1

f(t) gt b| supT

f(t) gt b) rarr 0

MONTE CARLO FOR RANDOM FIELDS 33

LetB(T b2δminus1) = cup

tisinTB(t b2δminus1)

We have

P (supT

f(t) gt b| supT

f(t) gt b)

le P ( sup|tminustlowast|gtb2δminus1

f(t) gt b| supT

f(t) gt b)

+ P (supT

f(t) gt b sup|tminustlowast|gtb2δminus1

f(t) le b| supT

f(t) gt b)

le o(1) + P (tlowast isin B(T b2δminus1)| supT

f(t) gt b)

le o(1) + P ( supB(T b2δminus1)

f(t) gt b| supT

f(t) gt b)

Since (T ) le b(1minus3δ)d we can find a finite set T prime = tprime1 tprimel sub T and let ∆k =tprimek + [0 bminus1] such that l = O(b(1minusδ)d) and B(T b2δminus1) sub cupl

k=1∆k The choice ofl only depends on (T ) not the particular distribution of T Therefore applying(75)

sup(Tlebθd)

P ( supB(T b2δminus1)

f(t) gt b) le O(b(1minusδ)d)P (f(0) gt b)

This together with (74) yields

sup(Tlebθd)

P ( supB(T b2δminus1)

f(t) gt b| supT

f(t) gt b) le O(bminusδd) = o(1)

for b ge b0 which clearly implies the statement of the result 2

72 Variance Control Proof of Theorem 75 We proceed directly to the proofof Theorem 75

34 ADLER BLANCHET LIU

Proof of Theorem 75 Note that

EQL2b

P 2 (suptisinT f (t) gt b)

=E(Lb)

P 2 (suptisinT f (t) gt b)

=E

(MtimesP (X1gtbminus1b)sumn

j=11(Xjgtbminus 1

b )maxj Xj gt b

)

P 2 (suptisinT f (t) gt b)

=E

(MtimesP (X1gtbminus1b)sumn

j=11(Xjgtbminus 1

b )maxj Xj gt b suptisinT f (t) gt b

)

P 2 (suptisinT f (t) gt b)

=E

(MP (X1gtbminus1b)1(maxj Xjgtb)sumn

j=11(Xjgtbminus1b)

∣∣∣∣ suptisinT f (t) gt b

)

P (suptisinT f (t) gt b)

= E

(M1 (maxj Xj gt b)sumnj=1 1 (xj gt bminus 1b)

∣∣∣∣∣ suptisinT

f (t) gt b

)P (X1 gt bminus 1b)

P (suptisinT f (t) gt b)

The remainder of the proof involves showing that the last conditional expectationhere is of order O(bd) This together with (74) will yield the result Note that forany A (b ε) such that A (b ε) = Θ(M) uniformly over b and ε we can write

E

(M times 1 (maxj Xj gt b)sumM

j=1 1(Xj gt bminus 1

b

)∣∣∣∣∣ sup

tisinTf (t) gt b

)(77)

le E

M times 1

(sumMj=1 1 (Xj gt bminus 1b) ge M

A(bε)

)

sumMj=1 1

(xj gt bminus 1

b

)∣∣∣∣∣∣suptisinT

f (t) gt b

+ E

M times 1

(1 le sumM

j=1 1 (Xj gt bminus 1b) lt MA(bε)

)

sumMj=1 1

(xj gt bminus 1

b

)∣∣∣∣∣∣suptisinT

f (t) gt b

We shall select A(b ε) appropriately in order to bound the expectations above ByLemma 78 for any 4ε le δ le δ0 there exist constants cprime and cprimeprime isin (0infin) such that

clowast exp(minusδlowast

δ2

)ge P

(min

|tminustlowast|leδbf (t) lt bminus 1b

∣∣∣∣ suptisinT

f (t) gt b

)

ge P

Msum

j=1

1 (Xj gt bminus 1b) le cprimeδdεd

∣∣∣∣∣∣suptisinT

f (t) gt b

ge P

(MsumM

j=1 1(Xj gt bminus 1b)ge bdcprimeprime

δd

∣∣∣∣∣ suptisinT

f (t) gt b

)(78)

MONTE CARLO FOR RANDOM FIELDS 35

The first inequality is an application of Lemma 78 The second inequality is dueto the fact that for any ball B of radius 4ε or larger (T cap B) ge cprimedεminusd for somecprime gt 0 Inequality (78) implies that for all x such that bdcprimeprimeδd

0 lt x lt bdcprimeprime[4dεd]there exists δlowastlowast gt 0 such that

P

(MsumM

j=1 1(Xj gt bminus 1

b

)bdge x

∣∣∣∣∣ suptisinT

f (t) gt b

)le clowast exp

(minusδlowastlowastx2d

)

Now let A (b ε) = bdcprimeprime(4dεd) and observe that by the second result in (71) wehave A (b ε) = Θ (M) and moreover that there exists c3 such that the first termon the right hand side of (77) is bounded by

(79) E

M times 1

(sumMj=1 1 (Xj gt bminus 1b) ge M

A(bε)

)

sumMj=1 1 (Xj gt bminus 1b)

∣∣∣∣∣∣supT

f (t) gt b

le c3b

d

Now we turn to the second term on the right hand side of (77) We use the factthat MA(b ε) le cprimeprimeprime isin (0infin) (uniformly as b rarrinfin and ε rarr 0) There exist c4 andc5 such that for ε le δ0c4

E

M1

(1 le sumM

j=1 1(Xj gt bminus 1

b

)lt M

A(bε)

)

sumMj=1 1

(Xj gt bminus 1

b

)∣∣∣∣∣∣supT

f (t) gt b

(710)

le MP

nsum

j=1

1(

Xj gt bminus 1b

)lt cprimeprimeprime

∣∣∣∣∣∣supT

f (t) gt b

le MP

(min

|tminustlowast|lec4εbf (t) lt bminus 1b

∣∣∣∣ supT

f (t) gt b

)

le c1m(T )bdεminusd exp(minus δlowast

c24ε

2

)le c5b

d

The second inequality holds from the fact that ifsumM

j=1 1(Xj gt bminus 1

b

)is less than

cprimeprimeprime then the minimum of f in a ball around the local maximum and of radius c4εb

must be less than bminus 1b Otherwise there are more than cprimeprimeprime elements of T insidesuch a ball The last inequality is due to Lemma 78 and Theorem 72

Putting (79) and (710) together we obtain for all εb -regular discretizationswith ε lt ε0 = min(14 1c4)δ0

EQL2b

P (suptisinT f (t) gt b)le E

(M1 (maxj Xj gt b)sumM

j=1 1(Xj gt bminus 1

b

)∣∣∣∣∣ sup

tisinTf (t) gt b

)P (X1 gt bminus 1b)

P (suptisinT f (t) gt b)

le (c3 + c5)bdP (X1 gt bminus 1b)P (suptisinT f (t) gt b)

Applying now (74) and Proposition 72 we have that

P (suptisinT

f(t) gt b) ltP (sup

tisinTf(t) gt b)

1minus c0radic

ε

36 ADLER BLANCHET LIU

we have

supbgtb0εisin[0ε0]

EQL2b

P(sup

tisinTf (t) gt b

) lt infin

as required 2

73 A final proof Lemma 77 Without loss of generality we assume that thefield has mean zero and unit variance Before getting into the details of the proofof Lemma 77 we need a few additional lemmas for which we adopt the followingnotation Let Ci and Cij be the first and second order derivatives of C and definethe vectors

micro1(t) = (minusC1(t) minusCd(t))micro2(t) = vech ((Cij(t) i = 1 d j = i d))

Let f prime(0) and f primeprime(0) be the gradient and vector of second order derivatives of f at 0where f primeprime(0) is arranged in the same order as micro2(0) Furthermore let micro02 = microgt20 bea vector of second order spectral moments and micro22 a matrix of fourth order spectralmoments The vectors micro02 and micro22 are arranged so that

1 0 micro02

0 Λ 0micro20 0 micro22

is the covariance matrix of (f(0) f prime(0) f primeprime(0)) where Λ = (minusCij(0)) It then followsthat

micro2middot0 = micro22 minus micro20micro02

be the conditional variance of f primeprime(0) given f(0) The following lemma given in [5]provides a stochastic representation of the f given that it has a local maxima atlevel u at the origin

Lemma 711 Given that f has a local maximum with height u at zero (aninterior point of T ) the conditional field is equal in distribution to

(711) fu(t) uC (t)minusWuβgt (t) + g (t)

g(t) is a centered Gaussian random field with covariance function

γ(s t) = C(sminus t)minus (C(s) micro2(s))(

1 micro02

micro20 micro22

)minus1 (C(t)microgt2 (t)

)minus micro1(s)Λminus1microgt1 (t)

and Wu is a d(d+1)2 random vector independent of g(t) with density function

(712) ψu (w) prop |det (rlowast(w)minus uΛ) | exp(minus1

2wgtmicrominus1

2middot0w)1 (rlowast(w)minus uΛ isin N )

MONTE CARLO FOR RANDOM FIELDS 37

where rlowast(w) is a dtimesd symmetric matrix whose upper triangular elements consist ofthe components of w The set of negative definite matrices is denoted by N Finallyβ(t) is defined by

(α (t) β (t)) = (C (t) micro2 (t))(

1 micro02

micro20 micro22

)minus1

The following two technical lemmas which we shall prove after completing theproof of Lemma 77 provide bounds for the last two terms of (711)

Lemma 712 Using the notation in (711) there exist δ0 ε1 c1 and b0 suchthat for any u gt b gt b0 and δ isin (0 δ0)

P

(sup

|t|leδab

∣∣Wuβgt (t)∣∣ gt

a

4b

)le c1 exp

(minusε1b

2

δ4

)

Lemma 713 There exist c δ and δ0 such that for any δ isin (0 δ0)

P

(max|t|leδab

|g(t)| gt a

4b

)le c exp

(minus δ

δ2

)

Proof of Lemma 77 Using the notation of Lemma 711 given any s isin L forwhich f(s) = b we have that the corresponding conditional distribution of s is thatof fb(middot minus s) Consequently it suffices to show that

P

(min

|t|leδabfb(t) lt bminus ab

)le clowast exp

(minusδlowast

δ2

)

We consider firstly the case for which the local maximum is in the interior of T Then by the Slepian model (711)

fb (t) = bC (t)minusWbβgt (t) + g (t)

We study the three terms of the Slepian model individually Since

C(t) = 1minus tgtΛt + o(|t|2

)

there exists a ε0 such thatbC (t) ge bminus a

4b

38 ADLER BLANCHET LIU

for all |t| lt ε0radic

ab According to Lemmas 712 and 713 for δ lt min(ε0radic

a0 δ0)

P

(min

|t|leδabfu(t) lt bminus ab

)

le P

(max|t|ltδab

|g (t)| gt a

4b

)+ P

(sup

|t|leδab

∣∣Wbβgt(t)

∣∣ gta

4b

)

le c exp

δ2

)+ c1 exp

(minusε1b

2

δ4

)

le clowast exp(minusδlowast

δ2

)

for some clowast and δlowastNow consider the case for which the local maximum is in the (dminus1)-dimensional

boundary of T Due to convexity of T and without loss of generality we assumethat the tangent space of partT is generated by partpartt2 partparttd the local maximumis located at the origin and T is a subset of the positive half-plane t1 ge 0 Thatthese arguments to not involve a loss of generality follows from the arguments onpages 291ndash192 of [4] which rely on the assumed stationarity of f (for translations)and the fact that rotations while changing the distributions will not affect theprobabilities that we are currently computing)

For the origin positioned as just described to be a local maximum it is necessaryand sufficient that the gradient of f restricted to partT is the zero vector the Hessianmatrix restricted to partT is negative definite and part1f(0) le 0 Applying a version ofLemma 711 for this case conditional on f(0) = u and 0 being a local maximumthe field is equal in distribution to

(713) uC(t)minus Wuβgt(t) + micro1(t)Λminus1 (Z 0 0)T + g(t)

where Z le 0 corresponds to part1f (0) and it follows a truncated (conditional on thenegative axis) Gaussian random variable with mean zero and a variance parameterwhich is computed as the conditional variance of part1f (0) given (part2f (0) partdf (0))The vector Wu is a (d(d + 1)2)-dimensional random vector with density function

ψu(w) prop |det(rlowast(w)minus uΛ)| exp(minus12wgtmicrominus1

2middot0w)1(wlowast minus uΛ isin N )

where Λ is the second spectral moment of f restricted to partT and rlowast(w) is the(dminus 1)times (dminus 1) symmetric matrix whose upper triangular elements consist of thecomponents of w In the representation (713) the vector Wu and Z are independentAs in the proof of Lemma 712 one can show that a similar bound holds albeitwith with different constants Thus since micro1 (t) = O (t) there exist cprimeprime and δprimeprime suchthat the third term in (713) can be bounded by

P [ max|t|leδab

∣∣∣micro1 (t) Λminus1 (Z 0 0)T∣∣∣ ge a(4b)] le cprimeprime exp

(minusδprimeprime

δ2

)

MONTE CARLO FOR RANDOM FIELDS 39

Consequently we can also find clowast and δlowast such that the conclusion holds and weare done 2

We complete the paper with the proofs of Lemmas 712 and 713

Proof of Lemma 712 It suffices to prove the Lemma for the case a = 1 Since

fu (0) = u = uminusWuβgt (0) + g (0)

and Wu and g are independent β (0) = 0 Furthermore since C prime (t) = O (t) andmicroprime2 (t) = O (t) there exists a c0 such that |β(t)| le c0|t|2 In addition Wu has densityfunction proportional to

ψu(w) prop |det(rlowast(w)minus uΛ)| exp(minus12wgtmicrominus1

2middot0w)1 (wlowast minus uΛ isin N )

Note that det(rlowast(w) minus uΛ) is expressible as a polynomial in w and u and thereexists some ε0 and c such that

∣∣∣∣det (rlowast(w)minus uΛ)

det (minusuΛ)

∣∣∣∣ le c

if |w| le ε0u Hence there exist ε2 c2 gt 0 such that

ψu (w) le ψ (w) = c2 exp(minus1

2ε2w

gtmicrominus12middot0w

)

for all u ge 1 The right hand side here is proportional to a multivariate Gaussiandensity Thus

P (|Wu| gt x) =int

|w|gtx

ψu(w)dw leint

|w|gtx

ψu (w) dw = c3P (|W | gt x)

where W is a multivariate Gaussian random variable with density function propor-tional to ψ Therefore by choosing ε1 and c1 appropriately we have

P

(sup|t|leδb

∣∣Wuβgt∣∣ gt

14b

)le P

(|Wu| gt b

c20δ

2

)le c1 exp

(minusε1b

2

δ4

)

for all u ge b 2

Proof of Lemma 713 Once again it suffices to prove the Lemma for the casea = 1 Since

fb(0) = b = bminusWbβgt(0) + g(0)

the covariance function (γ (s t) s t isin T ) of the centered field g satisfies γ(0 0) = 0It is also easy to check that

partsγ(s t) = O(|s|+ |t|) parttγ(s t) = O(|s|+ |t|)

40 ADLER BLANCHET LIU

Consequently there exists a constant cγ isin (0infin) for which

γ(s t) le cγ(|s|2 + |t|2) γ(s s) le cγ |s|2

We need to control the tail probability of sup|t|leδb |g(t)| For this it is useful tointroduce the following scaling Define

gδ (t) =b

δg

(δt

b

)

Then sup|t|leδb g (t) ge 14b if and only if sup|t|le1 gδ (t) ge 1

4δ Let

σδ (s t) = E (gδ (s) gδ (t))

ThensupsisinR

σδ(s s) le cγ

Because γ(s t) is at least twice differentiable applying a Taylor expansion we easilysee that the canonical metric dg corresponding to gδ (s) (cf Theorem 67) can bebounde as follows

d2g(s t) = E (gδ (s)minus gδ (t))2 =

b2

δ2

(δs

bδs

b

)+ γ

(δt

bδt

b

)minus 2γ

(δs

bδt

b

)]

le c |sminus t|2

for some constant c isin (0infin) Therefore the entropy of gδ evaluated at δ is boundedby Kδminusd for any δ gt 0 and with an appropriate choice of K gt 0 Therefore for allδ lt δ0

P ( sup|t|leδb

|g(t)| ge 14b

) = P ( sup|t|le1

gδ (t) ge 14δ

) le cdδminusdminusη exp(minus 1

16cγδ2)

for some constant cd and η gt 0 The last inequality is a direct application of The-orem 411 of [4] The conclusion of the lemma follows immediately by choosing c

and δ appropriately 2

REFERENCES

[1] RJ Adler K Bartz and S Kou Estimating curvatures in the Gaussian kinematic formulaIn preparation

[2] RJ Adler J Blanchet and J Liu Efficient simulation for tail probabilities of Gaussianrandom fields In WSC rsquo08 Proceedings of the 40th Conference on Winter Simulationpages 328ndash336 Winter Simulation Conference 2008

[3] RJ Adler P Muller and Rozovskii B editors Stochastic Modelling in Physical Oceanog-raphy Birkhauser Boston 1996

[4] RJ Adler and JE Taylor Random Fields and Geometry Springer New York USA 2007

[5] RJ Adler JE Taylor and K Worsley Applications of Random Fields and GeometryFoundations and Case Studies httpwebeetechnionacilpeopleadlerhrfpdf

MONTE CARLO FOR RANDOM FIELDS 41

[6] S Asmussen and P Glynn Stochastic Simulation Algorithms and Analysis Springer-VerlagNew York NY USA 2008

[7] JM Azais and M Wschebor Level Sets and Extrema of Random Processes and FieldsWiley New York NY 2009

[8] JM Bardeen JR Bond N Kaiser and AS Szalay The statistics of peaks of Gaussianrandom fields The Astrophysical Journal 30415ndash61 1986

[9] SM Berman Sojourns and Extremes of Stochastic Processes The Wadsworth ampBrooksCole StatisticsProbability Series Wadsworth amp BrooksCole Advanced Books ampSoftware Pacific Grove CA 1992

[10] J Blanchet Efficient importance sampling for binary contingency tables Ann of ApplProb 19949ndash982 2009

[11] C Borell The Brunn-Minkowski inequality in Gauss space Invent Math 30205ndash216 1975

[12] J Bucklew Introduction to Rare-Event Simulation Springer-Verlag New York NY 2004

[13] MR Dennis Nodal densities of planar Gaussian random waves Eur Phys J 145191ndash2102007

[14] RM Dudley Sample functions of the Gaussian process Ann Probab 166ndash103 1973

[15] KJ Friston KJ Worsley RSJ Frackowiak JC Mazziotta and AC Evans Assessing thesignificance of focal activations using their spatial extent Human Brain Mapping 1214ndash2201994

[16] MR Leadbetter G Lindgren and H Rootzen Extremes and Related Properties of RandomSequences and Processes Springer-Verlag New York 1983

[17] M Mitzenmacher and E Upfal Probability and Computing Randomized Algorithms andProbabilistic Analysis Cambridge University Press 2005

[18] W Niemiro and P Pokarowski Fixed precision MCMC estimation by median of productsof averages J Appl Probab 46309ndash329 2009

[19] V Piterbarg Asymptotic Methods in the Theory of Gaussian Processes American Mathe-matical Society USA 1995

[20] SF Shandarin Testing non-Gaussianity in cosmic microwave background maps by morpho-logical statistics Mon Not R Astr Soc 331865ndash874 2002

[21] SF Shandarin HA Feldman Y Xu and M Tegmark Morphological measures of non-gaussianity in cosmic microwave background maps Astrophys J Suppl 1411ndash11 2002

[22] L Shepp and H Landau On the supremum of a Gaussian process Sankhya A 32369ndash378December 1970

[23] JE Taylor A Takemura and RJ Adler Validity of the expected Euler characteristicheuristic Ann Probab 33(4)1362ndash1396 2005

[24] JE Taylor and KJ Worsley Detecting sparse signals in random fields with an applicationto brain mapping J Amer Statist Assoc 102(479)913ndash928 2007

[25] J Traub G Wasilokowski and H Wozniakowski Information-Based Complexity AcademicPress New York NY 1988

[26] BS Tsirelson IA Ibragimov and VN Sudakov Norms of Gaussian sample functionsProceedings of the Third Japan-USSR Symposium on Probability Theory (Tashkent 1975)55020ndash41 1976

[27] V S Tsirelrsquoson The density of the maximum of a gaussian process Theory of Probab Appl20847ndash856 1975

[28] H Wozniakowski Computational complexity of continuous problems 1996 Technical Re-port

Robert J AdlerElectrical EngineeringTechnion Haifa Israel 32000E-mail roberteetechnionacilURL webeetechnionacilpeopleadler

Jose H BlanchetIndustrial Engineering amp Operations ResearchColumbia Uninversity340 SW Mudd Building500 W 120 StreetNew York New York 10027E-mail joseblanchetcolumbiaeduURL wwwieorcolumbiaedufac-biosblanchetfacultyhtml

42 ADLER BLANCHET LIU

Jingchen LiuDepartment of StatisticsColumbia University1255 Amsterdam Avenue Room 1030New York NY 10027E-mail jcliustatcolumbiaeduURL statcolumbiaedu~jcliu

Page 13: EFFICIENT MONTE CARLO FOR HIGH EXCURSIONS OF … › ~jblanche › papers › GRF1.pdf · 1. Introduction. This paper centers on the design and analysis of e–cient Monte Carlo techniques

MONTE CARLO FOR RANDOM FIELDS 13

and that we also can write

E[m (Ab)] = E

int

T

1 (f (t) gt b) dt =int

T

P (f (t) gt b) dt = m (T ) P (f (U) gt b)

where U is uniformly distributed over T Once τb is generated the natural continuous adaptation corresponding to the

strategy described by Algorithm 53 proceeds by sampling f conditional on f(τb) gtb Note that if we use Q to denote the change-of-measure induced by such a con-tinuous sampling strategy then the corresponding importance sampling estimatortakes the form

Lb =dP

dQ=

E[m (Ab)]m (Ab)

The second moment of the estimator then satisfies

EQ[(Lb)2] = E(LbAb 6= empty

)= E[m(Ab)]P (flowast gt b)E[m(Ab)minus1|Ab 6= empty](62)

Unfortunately it is easy to construct examples for which EQ[(Lb

)2] is infiniteConsequently the above construction needs to be modified slightly

Elementary extreme value theory considerations for the maximum of finitelymany Gaussian random variables and their analogues for continuous fields givethat the overshoot of f over a given level b will be of order Θ(1b) Thus in orderto keep τb reasonably close to the excursion set we shall also consider the possibilityof an undershoot of size Θ (1b) right at τb As we shall see this relaxation will allowus to prevent the variance in (62) becoming infinite Thus instead of (61) we shallconsider τbminusab with density

(63) hbminusab(t) =P (f(t) gt bminus ab)

E[m(Abminusab)]

for some a gt 0 To ease on later notation write

γab bminus ab τγab= τbminusab

Let Qprime be the change of measure induced by sampling f as follows Given τγab

sample f(τγab) conditional on f(τγab

) gt γab In turn the rest of f follows its con-ditional distribution (under the nominal or original measure) given the observedvalue f(τγab

) We then have that the corresponding Radon-Nikodym derivative is

(64)dP

dQprime =E[m(Aγab

)]m(Aγab

)

and the importance sampling estimator Lprimeb is

(65) Lprimeb =dP

dQprime 1 (Ab 6= empty) =E[m(Aγab

)]m(Aγab

)1 (m(Ab) gt 0)

Note that we have used the continuity of the field in order to write m (Ab) gt 0 =Ab 6= empty almost surely The motivation behind this choice lies in the fact that

14 ADLER BLANCHET LIU

since m(Aγab) gt m (Ab) gt 0 the denominator may now be big enough to control

the second moment of the estimator As we shall see introducing the undershootof size ab will be very useful in the technical development both in the remainderof this section and in Section 7 In addition its introduction also provides insightinto the appropriate form of the estimator needed when discretizing the field

62 Algorithm and Analysis We still need to face the problem of generatingf in a computer Thus we now concentrate on a suitable discretization schemestill having in mind the change of measure leading to (64) Since our interest isto ultimately design algorithms that are efficient for estimating expectations suchas E[Γ (f) |flowast gt b] where Γ may be a functional of the whole field we shall use aglobal discretization scheme

Consider U = (U1 UM ) where Ui are iid uniform random variables takingvalues in T and independent of the field f Set TM = U1 UM and Xi =Xi (Ui) = f (Ui) for 1 le i le M Then X = (X1 Xm) (conditional on U) isa multivariate Gaussian random vector with conditional means micro(Ui)

∆= E(Xi|Ui)and covariances C (Ui Uj)

∆= Cov (Xi Xj |Ui Uj) Our strategy is to approximatew(b) by

wM (γab) = P (maxtisinTM

f(t) gt γab) = E[P ( max1leileM

Xi gt γab|U)]

Given the development in Section 5 it might not be surprising that if we can ensurethat M = M (ε b) is polynomial in 1ε and b then we shall be in a good positionto develop a FPRAS The idea is to apply an importance sampling strategy similarto that we considered in the construction of Lprimeb of (65) but this time it will beconditional on U In view of our earlier discussions we propose sampling from Qprimeprime

defined via

Qprimeprime (X isin B|U) =Msum

i=1

pU (i)P [X isin B|Xi gt γab U ]

where

pU (i) =P (Xi gt γab|U)sumM

j=1 P (Xj gt γab|U)

We then obtain the (conditional) importance sampling estimator

(66) Lb(U) =sumM

i=1 P (Xi gt γab|U)sumMi=1 1(Xi gt γab)

It is clear thatwM (γab) = EQprimeprime [Lb(U)]

Suppose for the moment that M a and the number of replications n have beenchosen Our future analysis will in particular guide the selection of these parametersThen the procedure is summarized by the next algorithm

MONTE CARLO FOR RANDOM FIELDS 15

Algorithm 61 The algorithm has three steps

Step 1 Simulate U (1) U (n) which are n iid copies of the vector U = (U1 UM )described aboveStep 2 Conditional on each U (i) for i = 1 n generate L

(i)b (U (i)) as described

by (66) by considering the distribution of X(i)(U (i)) = (X(i)1 (U (i)

1 ) X(i)M (U (i)

M ))Generate the X(i)(U (i)) independently so that at the end we obtain that the L

(i)b (U (i))

are n iid copies of Lb(U)Step 3 Output

Ln(U (1) U (n)) =1n

nsum

i=1

L(i)b (U (i))

63 Running Time of Algorithm 61 Bias and Variance Control The remain-der of this section is devoted to the analysis of the running time of the Algorithm61 The first step lies in estimating the bias and second moment of Lb(U) underthe change of measure induced by the sampling strategy of the algorithm whichwe denote by Qprimeprime We start with a simple bound for the second moment

Proposition 62 There exists a finite λ0 depending on microT = maxtisinT |micro (t)|and σ2

T = maxtisinT σ2 (t) for which

EQprimeprime [Lb(U)2] le λ0M2P

(maxtisinT

f (t) gt b

)2

Proof Observe that

EQprimeprime [Lb(U)2] le E

(Msum

i=1

P (Xi gt γab|Ui)

)2

le E

(Msum

i=1

suptisinT

P (f (Ui) gt γab|Ui = t)

)2

= M2 maxtisinT

P (f (t) gt γab)2

le λ0M2 max

tisinTP (f (t) gt b)2

le λ0M2P

(maxtisinT

f (t) gt b

)2

which completes the proof 2

Next we obtain a preliminary estimate of the bias

16 ADLER BLANCHET LIU

Proposition 63 For each M ge 1 we have

|w (b)minus wM (b)| le E[exp

(minusMm(Aγab

)m(T )

) Ab cap T 6= empty]

+ P

(maxtisinT

f (t) gt γab maxtisinT

f (t) le b

)

Proof Note that

|w (b)minus wM (γab)|le P (max

tisinTM

f (t) le γab maxtisinT

f (t) gt b) + P (maxtisinTM

f (t) gt γab maxtisinT

f (t) le b)

The second term is easily bounded by

P (maxtisinTM

f (t) gt γabmaxtisinT

f (t) le b) le P (maxtisinT

f (t) gt γab maxtisinT

f (t) le b)

The first term can be bounded as follows

P

(maxtisinTM

f (t) le γabmaxtisinT

f (t) gt b

)

le E[(P [f (Ui) le γab| f ])M 1(Ab cap T 6= empty)

]

le E[(

1minusm(Aγab)m (T )

)M Ab cap T 6= empty)]

le E[exp

(minusMm(Aγab)m (T )

) Ab cap T 6= empty)

]

This completes the proof 2

The previous proposition shows that controlling the relative bias of Lb(U) re-quires finding bounds for

(67) E[exp(minusMm(Aγab

)m(T )) Ab cap T 6= empty]

and

(68) P (maxtisinT

f (t) gt γabmaxtisinT

f (t) le b)

and so we develop these To bound (67) we take advantage of the importancesampling strategy based on Qprime introduced earlier in (64) Write

E[exp

(minusMm(Aγab)m (T )

) m(Ab) gt 0

](69)

= EQprime(

exp(minusMm(Aγab

)m (T ))

m(Aγab)

m(Ab) gt 0

)Em

(Aγab

)

MONTE CARLO FOR RANDOM FIELDS 17

Furthermore note that for each α gt 0 we have

EQprime(

exp(minusMm

(Aγab

)m (T )

)

m(Aγab

) m(Ab) gt 0

)(610)

le αminus1 exp (minusMαm(T ))

+ EQprime(

exp(minusMm(Aγab

)m(T ))

m(Aγab

) m(Aγab) le αm(Ab) gt 0

)

The next result whose proof is given in Section 64 gives a bound for the aboveexpectation

Proposition 64 Let β be as in Conditions A2 and A3 For any v gt 0 thereexist constants κ λ2 isin (0infin) (independent of a isin (0 1) and b but dependent on v)such that if we select

αminus1 ge κdβ(ba)(2+v)2dβ

and define W such that P (W gt x) = exp(minusxβd

)for x ge 0 then

EQprime(

exp(minusMm(Aγab

)m (T ))

m(Aγab)

m(Aγab) le αm(Ab) gt 0

)(611)

le EW 2 m (T )

λ2dβ2 M

(b

a

)4dβ

The following result gives us a useful upper bound on (68) The proof is givenin Section 65

Proposition 65 Assume that Conditions A2 and A3 are in force For anyv gt 0 let ρ = 2dβ + dv + 1 where d is the dimension of T There exist constantsb0 λ isin (0infin) (independent of a but depending on microT = maxtisinT |micro (t)| σ2

T =maxtisinT σ2 (t) v the Holder parameters β and κH) so that for all b ge b0 ge 1 wehave

(612) P

(maxtisinT

f(t) le b + ab | maxtisinT

f(t) gt b

)le λabρ

Consequently

P

(maxtisinT

f (t) gt γab maxtisinT

f (t) le b

)

= P

(maxtisinT

f (t) le b|maxtisinT

f (t) gt γab

)P

(maxtisinT

f(t) gt γab

)

le λabρP

(maxtisinT

f(t) gt γab

)

18 ADLER BLANCHET LIU

Moreover

P

(maxtisinT

f(t) gt γab

)(1minus λabρ) le P

(maxtisinT

f(t) gt b

)

Propositions 64 and 65 allow us to prove Theorem 31 which is rephrased inthe form of the following theorem which contains the detailed rate of complexityand so the main result of this section

Theorem 66 Suppose f is a Gaussian random field satisfying conditions A1ndashA4 in Section 3 Given any v gt 0 put a = ε(4λbρ) (where λ and ρ as in Proposition65) and αminus1 = κdβ(ba)(2+v)dβ Then there exist c ε0 gt 0 such that for allε le ε0

(613) |w (b)minus wM (γab)| le w (b) ε

if M =lceilcεminus1(ba)(4+4v)dβ

rceil Consequently by our discussion in Section 2 and the

bound on the second moment given in Proposition 62 it follows that Algorithm 61provides a FPRAS with running time O

((M)3 times (M)2 times εminus2δminus1

)

Proof Combining (63) (69) and (610) with Propositions 63ndash65 we have that

|w (b)minus wM (b) |le αminus1 exp (minusMαm (T )) E

[m

(Aγab

)]

+E[W 2]m (T )

λ2dβ2 M

(b

a

)4dβ

E[m

(Aγab

)]+

(λabρ

1minus λabρ

)w (b)

Furthermore there exists a constant K isin (0infin) such that

Em(Aγab

) le K maxtisinT

P (f (t) gt b)m (T ) le Kw (b) m (T )

Therefore we have that

|w (b)minus wM (b) |w (b)

le αminus1Km(T ) exp (minusMαm (T ))

+E[W 2]Km (T )2

λ2dβ2 M

(b

a

)4dβ

+(

λabρ

1minus λabρ

)

Moreover since a = ε(4λbρ) we obtain that for ε le 12

|w (b)minus wN (b) |w (b)

le αminus1Km(T ) exp (minusMαm (T ))

+[EW 2]Km (T )2

λ2dβ2 M

(b

a

)4dβ

+ ε2

MONTE CARLO FOR RANDOM FIELDS 19

From the selection of α M and θ it follows easily that the first two terms on theright hand side of the previous display can be made less than ε2 for all ε le ε0 bytaking ε0 sufficiently small

The complexity count given in the theorem now corresponds to the followingestimates The factor O((M)3) represents the cost of a Cholesky factorization re-quired to generate a single replication of a finite field of dimension M In additionthe second part of Proposition 62 gives us that O(M2εminus2δminus1) replications are re-quired to control the relative variance of the estimator 2

We now proceed to prove Propositions 64 and 65

64 Proof of Proposition 64 We concentrate on the analysis of the left handside of (611) An important observation is that conditional on the random variableτγab

with distribution

Qprime(τγabisin middot) =

E[m(Aγab

cap middot)]

E[m(Aγab

)]

and given f(τγab) the rest of the field namely (f (t) t isin Tτγab

) is anotherGaussian field with a computable mean and covariance structure The second termin (610) indicates that we must estimate the probability that m(Aγab

) takes smallvalues under Qprime For this purpose we shall develop an upper bound for

(614) P(m(Aγab

) lt yminus1 m(Ab) gt 0∣∣ f (t) = γab + zγab

)

for y large enough Our arguments proceeds in two steps For the first in order tostudy (614) we shall estimate the conditional mean covariance of f (s) s isin Tgiven that f (t) = γab + zγab Then we use the fact that the conditional field isalso Gaussian and take advantage of general results from the theory of Gaussianrandom fields to obtain a bound for (614) For this purpose we recall some usefulresults from the theory of Gaussian random fields The first result is due to Dudley[14]

Theorem 67 Let U be a compact subset of Rn and let f0(t) t isin U be amean zero continuous Gaussian random field Define the canonical metric d on Uas

d (s t) =radic

E[f0 (t)minus f0 (s)]2

and put diam (U) = supstisinU d (s t) which is assumed to be finite Then there existsa finite universal constant κ gt 0 such that

E[maxtisinU

f0 (t)] le κ

int diam(U)2

0

[log (N (ε))]12dε

where the entropy N (ε) is the smallest number of dminusballs of radius ε whose unioncovers U

20 ADLER BLANCHET LIU

The second general result that we shall need is the so-called B-TIS (Borel-Tsirelson-Ibragimov-Sudakov) inequality [4 11 26]

Theorem 68 Under the setting described in Theorem 67

P

(maxtisinU

f0 (t)minus E[maxtisinU

f0 (t)] ge b

)le exp

(minusb2(2σ2U )

)

whereσ2U = max

tisinUE[f2

0 (t)]

We can now proceed with the main proof We shall assume from now on thatτγab

= 0 since as will be obvious from what follows all estimates hold uniformlyover τγab

isin T This is a consequence of the uniform Holder assumptions A2 andA3 Define a new process f

(f (t) t isin T ) L= (f (t) t isin T | f (0) = γab + zγab)

Note that we can always write f (t) = micro (t)+g (t) where g is a mean zero Gaussianrandom field on T We have that

micro (t) = Ef (t) = micro (t) + σ (0)minus2C (0 t) (γab + zγab minus micro (0))

and that the covariance function of f is given by

Cg (s t) = Cov (g (s) g (t)) = C (s t)minus σ (0)minus2C (0 s)C (0 t)

The following lemma describes the behavior of micro (t) and Cg (t s)

Lemma 69 Assume that |s| and |t| small enough Then the following threeconclusions hold

(i) There exist constants λ0 and λ1 gt 0 such that

|micro (t)minus (γab + zγab)| le λ0 |t|β + λ1 |t|β (γab + zγab)

and for all z isin (0 1) and γab large enough

|micro (s)minus micro (t) | le κHγab|sminus t|β

(ii)Cg (s t) le 2κHσ (t)σ (s) |t|β + |s|β + |tminus s|β

(iii)

Dg (s t) =radic

E([g (t)minus g (s)]2) le λ121 |tminus s|β2

Proof All three consequences follow from simple algebraic manipulations Thedetails are omitted 2

MONTE CARLO FOR RANDOM FIELDS 21

Proposition 610 For any v gt 0 there exist κ and λ2 such that for all t isin T yminusβd le a2+v

κb(2+v) a sufficiently small and z gt 0

P(m(Aγab

)minus1 gt y m(Ab) gt 0∣∣ f (t) = γab + zγab

) le exp(minusλ2a

2yβdb2)

Proof For notational simplicity and without loss of generality we assume thatt = 0 First consider the case that z ge 1 Then there exist c1 c2 such that for allc2y

minusβd lt bminus2minusv and z gt 1

P(m

(Aγab

cap T)minus1

gt ym(Ab) gt 0| f (0) = γab + zγab

)

le P

(inf

|t|ltc1yminus1df(t) le γab|f (0) = γab + zγab

)

= P

(inf

|t|ltc1yminus1dmicro (t) + g(t) le γab

)

le P

(inf

|t|ltc1yminus1dg(t) le minus 1

2γab

)

Now apply (iii) from Lemma 69 from which it follows that N (ε) le c3m(T )ε2dβ

for some constant c3 By Theorem 67 E(sup|t|ltc1yminus1d f(t)) = O(yminusβ(2d) log y)By Theorem 68 for some constant c4

P(m

(Aγab

cap T)minus1

gt y m(Ab) gt 0| f (0) = γab + zγab

)

le P

(inf

|t|ltc1yminus1dg(t) le minus 1

2γab

)

le exp

(minus 1

c4γ2aby

minusβd

)

for c2yminusβd lt bminus2minusv and z gt 1

Now consider the case z isin (0 1) Let tlowast be the global maximum of f(t) Then

P(m

(Aγab

cap T)minus1

gt y m(Ab) gt 0|f (0) = γab + zγab

)

le P

(inf

|tminustlowast|ltc1yminus1df(t) lt γab f(tlowast) gt b|f (0) = γab + zγab

)

le P

(sup

|sminust|ltc1yminus1d

|f(s)minus f(t)| gt ab|f (0) = γab + zγab

)

Consider the new field ξ(s t) = g(s)minus g(t) with parameter space T times T Note thatradic

V ar(ξ(s t)) = Dg(s t) le λ1|sminus t|β2

Via basic algebra it is not hard to show that the entropy of ξ(s t) is bounded byNξ (ε) le c3m(T times T )ε2dβ In addition from (i) of Lemma 69 we have

|micro (s)minus micro (t) | le κHγab|sminus t|β

22 ADLER BLANCHET LIU

Similarly for some κ gt 0 and all yminusβd le 1κ

(ab

)2+v a lt 1 there exists c5 such that

P(m

(Aγab

cap T)minus1

gt y m(Ab) gt 0|f (0) = γab + zγab

)

le P

(sup

|sminust|ltc1yminus1d

|f(s)minus f(t)| gt ab|f (0) = γab + zγab

)

= P

(sup

|sminust|ltc1yminus1d

|ξ(s t)| gt a

2b

)

le exp(minus a2

c5b2yminusβd

)

Combining the two cases z gt 1 and z isin (0 1) and choosing c5 large enough we have

P(m

(Aγab

cap T)minus1

gt y m(Ab) gt 0∣∣ f (0) = γab + zγab

)

le exp(minus a2

c5b2yminusβd

)

for a small enough and yminusβd le 1κ

(ab

)2+v Renaming the constants completes theproof 2

The final ingredient needed for the proof of Proposition 64 is the following lemmainvolving stochastic domination The proof follows an elementary argument and istherefore omitted

Lemma 611 Let v1 and v2 be finite measures on R and define ηj(x) =intinfin

xvj(ds)

Suppose that η1 (x) ge η2 (x) for each x ge x0 Let (h (x) x ge x0) be a non-decreasing positive and bounded function Then

int infin

x0

h (s) v1(dx) geint infin

x0

h (s) v2 (ds)

Proof of Proposition 64 Note that

EQprime(exp

(minusMm(Aγab

)m (T )

)

m(Aγab

) m(Aγab

) isin (0 α) m(Ab) gt 0)

=int

T

int infin

z=0

E(exp

(minusMm(Aγab

)m (T )

)

m (Aγab)

m(Aγab

) isin (0 α) m(Ab) gt 0∣∣∣

f (t) = γab +z

γab

)P

(τγab

isin dt)P (γab[f (t)minus γab] isin dz| f (t) gt γab)

le supzgt0tisinT

E(exp

(minusMm(Aγab

)m (T )

)

m (Aγab)

m(Aγab

) isin (0 α) m(Ab) gt 0∣∣∣

f (t) = γab +z

γab

)

MONTE CARLO FOR RANDOM FIELDS 23

Now define Y (ba) = (ba)2dβλminusdβ2 W with λ2 as chosen in Proposition 610

and W with distribution given by P (W gt x) = exp(minusxβd) By Lemma 611

suptisinT

E(exp

(minusMm(Aγab

)m (T )

)

m (Aγab)

m(Aγab

) isin (0 α) m(Ab) gt 0∣∣∣

f (t) = γab +z

γab

)

le E[Y (ba) exp(minusMY (ba)minus1m(T ))Y (ba) gt αminus1

]

Now let Z be exponentially distributed with mean 1 and independent of Y (ba)Then we have (using the definition of the tail distribution of Z and Chebyshevrsquosinequality)

exp(minusMY (ba)minus1m (T )

)= P

(Z gt MY (ba)minus1m (T )

∣∣ Y (ba))

le m (T )Y (ba)M

Therefore

E[exp(minusMY (ba)minus1m (T )

)Y (ba) Y (ba) ge αminus1]

le m (T )M

E[Y (ba)2 Y (ba) ge αminus1]

le m (T )

Mλ2dβ2

(b

a

)4dβ

E(W 2)

which completes the proof 2

65 Proof of Proposition 65 We start with the following result of Tsirelson[27]

Theorem 612 Let f be a continuous separable Gaussian process on a compact(in the canonical metric) domain T Suppose that V ar (f) = σ is continuous andthat σ (t) gt 0 for t isin T Moreover assume that micro = Ef is also continuous andmicro(t) ge 0 for all t isin T Define

σ2T

∆= maxtisinT

Var(f(t))

and set F (x) = PmaxtisinT f(t) le x Then F is continuously differentiable on RFurthermore let y be such that F (y) gt 12 and define ylowast by

F (y) = Φ(ylowast)

Then for all x gt y

F prime(x) le Ψ(

xylowasty

)(xylowasty

(1 + 2α) + 1)

(1 + α)

24 ADLER BLANCHET LIU

where

α =y2

x(xminus y)y2lowast

We can now prove the following lemma

Lemma 613 There exists a constant A isin (0infin) independent of a and b ge 0such that

(615) P

(suptisinT

f(t) le b + ab | suptisinT

f(t) gt b

)le aA

P (suptisinT f (t) ge bminus 1b)P (suptisinT f (t) ge b)

Proof By subtracting inftisinT micro (t) gt minusinfin and redefining the level b to be b minusinftisinT micro (t) we may simply assume that Ef (t) ge 0 so that we can apply Theorem612 Adopting the notation of Theorem 612 first we pick b0 large enough so thatF (b0) gt 12 and assume that b ge b0 + 1 Now let y = bminus 1b and F (y) = Φ(ylowast)Note that there exists δ0 isin (0infin) such that δ0b le ylowast le δminus1

0 b for all b ge b0 Thisfollows easily from the fact that

logP

suptisinT

f(t) gt x

sim log sup

tisinTP f(t) gt x sim minus x2

2σ2T

On the other hand by Theorem 612 F is continuously differentiable and so

(616) P

suptisinT

f(t) lt b + ab | suptisinT

f(t) gt b

=

int b+ab

bF prime(x) dx

P suptisinT f(t) gt b

Moreover

F prime(x) le(

1minus Φ(

xylowasty

)) (xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x))

le (1minus Φ(ylowast))(

xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x))

= P

(maxtisinT

f (t) gt bminus 1b

)(xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x))

Thereforeint b+ab

b

F prime(x) dx

le P

(suptisinT

f (t) gt bminus 1b

) int b+ab

b

(xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x)) dx

Recall that α (x) = y2[x(xminus y)y2lowast] we can use the fact that ylowast ge δ0b to conclude

that if x isin [b b + ab] then α (x) le δminus20 and therefore

int b+ab

b

(xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x)) dx le 4δminus80 a

MONTE CARLO FOR RANDOM FIELDS 25

We thus obtain that

(617)

int b+ab

bF prime(x) dx

P suptisinT f(t) gt b le 4aδminus80

P suptisinT f(t) gt bminus 1bP suptisinT f(t) gt b

for any b ge b0 This inequality together with the fact that F is continuously differ-entiable on (minusinfininfin) yields the proof of the lemma for b ge 0 2

The previous result translates a question that involves the conditional distri-bution of maxtisinT f (t) near b into a question involving the tail distribution ofmaxtisinT f (t) The next result then provides a bound on this tail distribution

Lemma 614 For each v gt 0 there exists a constant C(v) isin (0infin) (possiblydepending on v gt 0 but otherwise independent of b) so that such that

P

(maxtisinT

f (t) gt b

)le C(v)b2dβ+dv+1 max

tisinTP (f (t) gt b)

for all b ge 1

Proof The proof of this result follows along the same lines of Theorem 262 in[5] Consider an open cover of T = cupM

i=1Ti(θ) where Ti(θ) = s |s minus ti| lt θWe choose ti carefully such that N(θ) = O(θminusd) for θ arbitrarily small Writef (t) = g (t)+micro (t) where g (t) is a centered Gaussian random field and note usingA2 and A3 that

P

(max

tisinTi(θ)f (t) gt b

)le P

(max

tisinTi(θ)g (t) gt bminus micro (ti)minus κHθβ

)

Now we wish to apply the Borel-TIS inequality (Theorem 68) with U = Ti (θ) f0 =g d (s t) = E12([g (t)minus g (s)]2) which as a consequence of A2 and A3 is boundedabove by C0 |tminus s|β2 for some C0 isin (0infin) Thus applying Theorem 67 we havethat E maxtisinTi(θ) g (t) le C1θ

β2 log(1θ) for some C1 isin (0infin) Consequently theBorel-TIS inequality yields that there exists C2 (v) isin (0infin) such that for all bsufficiently large and θ sufficiently small we have

P

(max

tisinTi(θ)g (t) gt bminus micro (ti)minus κHθβ

)le C2 (v) exp

(minus

(bminus micro (ti)minus C1θ

β(2+βv))2

2σ2Ti

)

where σTi = maxtisinTi(θ) σ (t) Now select v gt 0 small enough and set θβ(2+βv) =bminus1 Straightforward calculations yield that

P

(max

tisinTi(θ)f (t) gt b

)le P

(max

tisinTi(θ)g (t) gt bminus micro (ti)minus κHθβ

)

le C3 (v) maxtisinTi(θ)

exp

(minus (bminus micro (t))2

2σ (t)2

)

26 ADLER BLANCHET LIU

for some C3(v) isin (0infin) Now recall the well known inequality (valid for x gt 0)that

φ (x)(

1xminus 1

x3

)le 1minus Φ(x) le φ (x)

x

where φ = Φprime is the standard Gaussian density Using this inequality it follows thatC4 (v) isin (0infin) can be chosen so that

maxtisinTi(θ)

exp

(minus (bminus micro (t))2

2σ (t)2

)le C4 (v) bmax

tisinTi

P (f (t) gt b)

for all b ge 1 We then conclude that there exists C (v) isin (0infin) such that

P

(maxtisinT

f (t) gt b

)le N (θ)C4 (v) bmax

tisinTP (f (t) gt b)

le CθminusdbmaxtisinT

P (f (t) gt b) = Cb2dβ+dv+1 maxtisinT

P (f (t) gt b)

giving the result 2

We can now complete the proof of Proposition 65

Proof of Proposition 65 The result is a straightforward corollary of the previoustwo lemmas By (615) in Lemma 613 and Lemma 614 there exists λ isin (0infin) forwhich

P

(maxtisinT

f(t) le b + ab | maxtisinT

f(t) gt b

)

le aAP (maxtisinT f (t) ge bminus 1b)

P (maxtisinT f (t) ge b)

le aCAb2dβ+dv+1 maxtisinT P (f (t) gt bminus 1b)P (maxtisinT f (t) ge b)

le aCAb2dβ+dv+1 maxtisinT P (f (t) gt bminus 1b)maxtisinT P (f (t) gt b)

le aλb2dβ+dv+1

The last two inequalities follow from the obvious bound

P

(maxtisinT

f (t) ge b

)ge max

tisinTP (f (t) gt b)

and standard properties of the Gaussian distribution This yields (612) from whichthe remainder of the proposition follows 2

MONTE CARLO FOR RANDOM FIELDS 27

7 Fine Tuning Twice Differentiable Homogeneous Fields In the pre-ceeding section we constructed a polynomial time algorithm based on a randomizeddiscretization scheme Our goal in this section is to illustrate how to take advantageof additional information to further improve the running time and the efficiency ofthe algorithm In order to illustrate our techniques we shall perform a more refinedanalysis in the setting of smooth and homogeneous fields and shall establish op-timality of the algorithm in a precise sense to described below Our assumptionsthroughout this section are B1ndashB2 of Section 3

Let C(sminus t) = Cov(f(s) f(t)) be the covariance function of f which we assumealso has mean zero Note that it is an immediate consequence of homogeneity anddifferentiability that partiC(0) = part3

ijkC(0) = 0We shall need the following definition

Definition 71 We call T = t1 tM sub T a θ-regular discretization of Tif and only if

mini 6=j

|ti minus tj | ge θ suptisinT

mini|ti minus t| le 2θ

Regularity ensures that points in the grid T are well separated Intuitively sincef is smooth having tight clusters of points translates to a waste of computingresources as a result of sampling highly correlated values of f Also note thatevery region containing a ball of radius 2θ has at least one representative in T Therefore T covers the domain T in an economical way One technical convenienceof θ-regularity is that for subsets A sube T that have positive Lebesgue measure (inparticular ellipsoids)

limMrarrinfin

(A cap T )M

=m(A)m(T )

where here and throughout the remainder of the section (A) denotes the cardi-nality of the set A

Let T = t1 tM be a θ-regular discretization of T and consider

X = (X1 XM )T ∆= (f (t1) f (tM ))T

We shall concentrate on estimating wM (b) = P (max1leileM Xi gt b) The next result(which we prove in Section 71) shows that if θ = εb then the relative bias is O (

radicε)

Proposition 72 Suppose f is a Gaussian random field satisfying conditionsB1 and B2 There exist c0 c1 b0 and ε0 such that for any finite εb-regulardiscretization T of T

(71) P (suptisinT

f(t) lt b| suptisinT

f(t) gt b) le c0

radicε and (T ) le c1m(T )εminusdbd

for all ε isin (0 ε0] and b gt b0

Note that the bound on the bias obtained for twice differentiable fields is muchsharper than that of the general Holder continuous fields given by (613) in Theorem

28 ADLER BLANCHET LIU

66 This is partly because the conditional distribution of the random field aroundlocal maxima is harder to describe in the Holder continuous than in the case oftwice differentiable fields In addition to the sharper description of the bias weshall also soon show in Theorem 74 that our choice of discretization is optimal in acetain sense Finally we point out that the bound of

radicε in the first term of (71) is

not optimal In fact there seems to be some room of improvement and we believethat a more careful analysis might yield a bound of the form c0ε

2We shall estimate wM (b) by using a slight variation of Algorithm 53 In partic-

ular since the Xirsquos are now identically distributed we redefine Q to be

(72) Q (X isin B) =Msum

i=1

1M

P [X isin B|Xi gt bminus 1b]

Our estimator then takes the form

(73) Lb =M times P (X1 gt bminus 1b)sumM

j=1 1 (Xj gt bminus 1b)1( max

1leileMXi gt b)

Clearly we have that EQ(Lb) = wM (b) (The reason for for subtracting the fcatorof 1b was explained in Section 61)

Algorithm 73 Given a number of replications n and an εb-regular dis-cretization T the algorithm is as follows

Step 1 Sample X(1) X(n) iid copies of X with distribution Q given by (72)Step 2 Compute and output

Ln =1n

nsum

i=1

L(i)b

where

L(i)b =

M times P (X1 gt bminus 1b)sumM

j=1 1(X(i)j gt bminus 1b)

1( max1leileM

X(i)j gt b)

Theorem 75 later guides the selection of n in order to achieve a prescribedrelative error In particular our analysis together with considerations from Section2 implies that choosing n = O

(εminus2δminus1

)suffices to achieve ε relative error with

probability at least 1minus δAlgorithm 73 improves on Algorithm 61 for Holder continuous fields in two

important ways The first aspect is that it is possible to obtain information on thesize of the relative bias of the estimator In Proposition 72 we saw that in orderto overcome bias due to discretization it suffices to take a discretization of sizeM = (T ) = Θ(bd) That this selection is also asymptotically optimal in the sensedescribed in the next result will be proven in Section 71

MONTE CARLO FOR RANDOM FIELDS 29

Theorem 74 Suppose f is a Gaussian random field satisfying conditions B1and B2 If θ isin (0 1) then as b rarrinfin

sup(T )lebθd

P (suptisinT

f(t) gt b| suptisinT

f(t) gt b) rarr 0

This result implies that the relative bias goes to 100 as b rarrinfin if one chooses adiscretization scheme of size O

(bθd

)with θ isin (0 1) Consequently d is the smallest

power of b that achieves any given bounded relative bias and so the suggestionabove of choosing M = O

(bd

)points for the discretization is in this sense optimal

The second aspect of improvement involves the variance In the case of Holdercontinuous fields the ratio of the second moment of the estimator and w (b)2 wasshown to be bounded by a quantity that is of order O(M2) In contrast in thecontext of smooth and homogeneous fields considered here the next result showsthat this ratio is bounded uniformly for b gt b0 and M = (T ) ge cbd That is thevariance remains strongly controlled

Theorem 75 Suppose f is a Gaussian random field satisfying conditions B1and B2 Then there exist constants c b0 and ε0 such that for any εb-regulardiscretization T of T we have

supbgtb0εisin[0ε0]

EQL2b

P 2 (suptisinT f (t) gt b)le sup

bgtb0εisin[0ε0]

EQL2b

P 2(sup

tisinTf (t) gt b

) le c

for some c isin (0infin)

The proof of this result is given in Section 72 The fact that the number ofreplications remains bounded in b is a consequence of the strong control on thevariance

Finally we note that the proof of Theorem 33 follows as a direct corollary of The-orem 75 together with Proposition 72 and our discussion in Section 2 Assumingthat placing each point in T takes no more than c units of computer time the totalcomplexity is according to the discussion in section Section 2 O

(nM3 + M

)=

O(εminus2δminus1M3 + M

) The contribution of the term M3 = O

(εminus6b3

)comes from

the complexity of applying Cholesky factorization and the term M = O(εminus2b)corresponds to the complexity of placing T

Remark 76 Condition B2 imposes a convexity assumption on the boundaryof T This assumption although convenient in the development of the proofs of The-orems 74 and 75 is not necessary The results can be generalized at the expenseof increasing the length and the burden in the technical development to the casein which T is a d-dimensional manifold satisfying the so-called Whitney conditions([4])

The remainder of this section is devoted to the proof of Proposition 72 Theorem74 and Theorem 75

30 ADLER BLANCHET LIU

71 Bias Control Proofs of Proposition 72 and Theorem 74 We start withsome useful lemmas for all of which we assume that Conditions B1 and B2 are sat-isfied We shall also assume that the global maximum of f over T is achieved withprobability one at a single point in T Additional conditions under which this willhappen can be found in [4] and require little more than the non-degeneracy of thejoint distribution of f and its first and second order derivatives Of these lemmasLemma 77 the proof of which we defer to Section 73 is of central importance

Lemma 77 For any a0 gt 0 there exists clowast δlowast b0 and δ0 (which depend onthe choice of a0) such that for any s isin T a isin (0 a0) δ isin (0 δ0) and b gt b0

P ( min|tminuss|ltδabminus1

f(t) lt b|f(s) = b + ab s isin L) le clowast exp(minusδlowast

δ2

)

where L is the set of local maxima of f

The next lemma is a direct corollary of Lemma 77

Lemma 78 Let tlowast be the point in T at which the global maximum of f isattained Then with the same choice of constants as in Lemma 77 for any a isin(0 a0) δ isin (0 δ0) and b gt b0 we have

P ( min|tminustlowast|ltδabminus1

f(t) lt b|f(tlowast) = b + ab) le 2clowast exp(minusδlowast

δ2

)

Proof Note that for any s isin T

P

(min

|t|leδabf(t) lt b|f(s) = b + ab s = tlowast

)

=P

(min|t|leδab f(t) lt b f(s) = b + ab s = tlowast

)

P (f(s) = b + ab s = tlowast)

le P(min|t|leδab f(t) lt b f(s) = b + ab s isin L)

P (f(s) = b + ab s = tlowast)

= P ( min|tminuss|ltδabminus1

f(t) lt b|f(s) = b + ab s isin L)P (f(s) = b + ab s isin L)P (f(s) = b + ab s = tlowast)

le clowast exp(minusδlowast

δ2

)P (f(s) = b + ab s isin L)P (f(s) = b + ab s = tlowast)

Appling this bound and the fact that

P (f(s) = b + ab s isin L)P (f(s) = b + ab s = tlowast)

rarr 1

as b rarrinfin the conclusion of the Lemma holds 2

MONTE CARLO FOR RANDOM FIELDS 31

The following lemma gives a bound on the density of suptisinT f(t) which will beused to control the size of overshoot beyond level b

Lemma 79 Let pflowast(x) be the density function of suptisinT f(t) Then there existsa constant cflowast and b0 such that

pflowast(x) le cflowastxd+1P (f(0) gt x)

for all x gt b0

Proof Recalling (13) let the continuous function pE(x) x isin R be defined bythe relationship

E(χ(t isin T f(t) ge b)) =int infin

b

pE(x) dx

where the left hand side is the expected value of the Euler-Poincare characteristicof Ab Then according to Theorem 810 in [7] there exists c and δ such that

|pE(x)minus pflowast(x)| lt cP (f(0) gt (1 + δ)x)

for all x gt 0 In addition thanks to the result of [4] which providesintinfin

bpE(x)dx in

closed form there exists c0 such that for all x gt 1

pE(x) lt c0xd+1P (f(0) gt x)

Hence there exists cflowast such that

pflowast(x) le c0xd+1P (f(0) gt x) + cP (f(0) gt (1 + δ)x) le cflowastx

d+1P (f(0) gt x)

for all x gt 1 2

The last ingredients required to provide the proof of Proposition 72 and Theorem74 are stated in the following well known Theorem adapted from Lemma 61 andTheorem 72 in [19] to the twice differentiable case

Theorem 710 There exists a constant H (depending on the covariance func-tion C such that

P (suptisinT

f(t) gt b) = (1 + o (1))Hm(T )bdP (f(0) gt b)(74)

as b rarrinfinSimilarly choose δ small enough so that [0 δ]d sub T and let ∆0 = [0 bminus1]d Then

there exists a constant H1 such that

P ( suptisin∆0

f(t) gt b) = (1 + o(1))H1P (f(0) gt b)(75)

as b rarrinfin

32 ADLER BLANCHET LIU

We now are ready to provide the proof of Proposition 72 and Theorem 74

Proof of Proposition 72 The fact that there exists c1 such that

(T ) le c1m(T )εminusdbd

is immediate from assumption B2 Therefore we proceed to provide a bound forthe relative bias We choose ε0 lt δ2

0 (δ0 is given in Lemmas 77 and 78) and notethat for any ε lt ε0

P (suptisinT

f(t) lt b| suptisinT

f(t) gt b)

le P (suptisinT

f(t) lt b + 2radic

εb| suptisinT

f(t) gt b) + P (suptisinT

f(t) lt b| suptisinT

f(t) gt b + 2radic

εb)

By (74) and Lemma 79 there exists c2 such that the first term above can boundedby

P (suptisinT

f(t) lt b + 2radic

εb| suptisinT

f(t) gt b) le c2

radicε

The second term can be bounded by

P (suptisinT

f(t) lt b| suptisinT

f(t) gt b + 2radic

εb)

le P ( sup|tminustlowast|lt2εbminus1

f(t) lt b| suptisinT

f(t) gt b + 2radic

ε)

le 2clowast exp(minusδlowastεminus1

)

Hence there exists c0 such that

P (suptisinT

f(t) lt b| suptisinT

f(t) gt b) le c2

radicε + 2clowasteminusδlowastε le c0

radicε

for all ε isin (0 ε0) 2

Proof of Theorem 74 We write θ = 1minus 3δ isin (0 1) First note that by (74)

P (supT

f(t) gt b + b2δminus1| supT

f(t) gt b) rarr 0

as b rarrinfin Let tlowast be the position of the global maximum of f in T According to theexact Slepian model in Section 73 and an argument similar to the proof of Lemma78

(76) P ( sup|tminustlowast|gtb2δminus1

f(t) gt b|b lt f(tlowast) le b + b2δminus1) rarr 0

as b rarrinfin Consequently

P ( sup|tminustlowast|gtb2δminus1

f(t) gt b| supT

f(t) gt b) rarr 0

MONTE CARLO FOR RANDOM FIELDS 33

LetB(T b2δminus1) = cup

tisinTB(t b2δminus1)

We have

P (supT

f(t) gt b| supT

f(t) gt b)

le P ( sup|tminustlowast|gtb2δminus1

f(t) gt b| supT

f(t) gt b)

+ P (supT

f(t) gt b sup|tminustlowast|gtb2δminus1

f(t) le b| supT

f(t) gt b)

le o(1) + P (tlowast isin B(T b2δminus1)| supT

f(t) gt b)

le o(1) + P ( supB(T b2δminus1)

f(t) gt b| supT

f(t) gt b)

Since (T ) le b(1minus3δ)d we can find a finite set T prime = tprime1 tprimel sub T and let ∆k =tprimek + [0 bminus1] such that l = O(b(1minusδ)d) and B(T b2δminus1) sub cupl

k=1∆k The choice ofl only depends on (T ) not the particular distribution of T Therefore applying(75)

sup(Tlebθd)

P ( supB(T b2δminus1)

f(t) gt b) le O(b(1minusδ)d)P (f(0) gt b)

This together with (74) yields

sup(Tlebθd)

P ( supB(T b2δminus1)

f(t) gt b| supT

f(t) gt b) le O(bminusδd) = o(1)

for b ge b0 which clearly implies the statement of the result 2

72 Variance Control Proof of Theorem 75 We proceed directly to the proofof Theorem 75

34 ADLER BLANCHET LIU

Proof of Theorem 75 Note that

EQL2b

P 2 (suptisinT f (t) gt b)

=E(Lb)

P 2 (suptisinT f (t) gt b)

=E

(MtimesP (X1gtbminus1b)sumn

j=11(Xjgtbminus 1

b )maxj Xj gt b

)

P 2 (suptisinT f (t) gt b)

=E

(MtimesP (X1gtbminus1b)sumn

j=11(Xjgtbminus 1

b )maxj Xj gt b suptisinT f (t) gt b

)

P 2 (suptisinT f (t) gt b)

=E

(MP (X1gtbminus1b)1(maxj Xjgtb)sumn

j=11(Xjgtbminus1b)

∣∣∣∣ suptisinT f (t) gt b

)

P (suptisinT f (t) gt b)

= E

(M1 (maxj Xj gt b)sumnj=1 1 (xj gt bminus 1b)

∣∣∣∣∣ suptisinT

f (t) gt b

)P (X1 gt bminus 1b)

P (suptisinT f (t) gt b)

The remainder of the proof involves showing that the last conditional expectationhere is of order O(bd) This together with (74) will yield the result Note that forany A (b ε) such that A (b ε) = Θ(M) uniformly over b and ε we can write

E

(M times 1 (maxj Xj gt b)sumM

j=1 1(Xj gt bminus 1

b

)∣∣∣∣∣ sup

tisinTf (t) gt b

)(77)

le E

M times 1

(sumMj=1 1 (Xj gt bminus 1b) ge M

A(bε)

)

sumMj=1 1

(xj gt bminus 1

b

)∣∣∣∣∣∣suptisinT

f (t) gt b

+ E

M times 1

(1 le sumM

j=1 1 (Xj gt bminus 1b) lt MA(bε)

)

sumMj=1 1

(xj gt bminus 1

b

)∣∣∣∣∣∣suptisinT

f (t) gt b

We shall select A(b ε) appropriately in order to bound the expectations above ByLemma 78 for any 4ε le δ le δ0 there exist constants cprime and cprimeprime isin (0infin) such that

clowast exp(minusδlowast

δ2

)ge P

(min

|tminustlowast|leδbf (t) lt bminus 1b

∣∣∣∣ suptisinT

f (t) gt b

)

ge P

Msum

j=1

1 (Xj gt bminus 1b) le cprimeδdεd

∣∣∣∣∣∣suptisinT

f (t) gt b

ge P

(MsumM

j=1 1(Xj gt bminus 1b)ge bdcprimeprime

δd

∣∣∣∣∣ suptisinT

f (t) gt b

)(78)

MONTE CARLO FOR RANDOM FIELDS 35

The first inequality is an application of Lemma 78 The second inequality is dueto the fact that for any ball B of radius 4ε or larger (T cap B) ge cprimedεminusd for somecprime gt 0 Inequality (78) implies that for all x such that bdcprimeprimeδd

0 lt x lt bdcprimeprime[4dεd]there exists δlowastlowast gt 0 such that

P

(MsumM

j=1 1(Xj gt bminus 1

b

)bdge x

∣∣∣∣∣ suptisinT

f (t) gt b

)le clowast exp

(minusδlowastlowastx2d

)

Now let A (b ε) = bdcprimeprime(4dεd) and observe that by the second result in (71) wehave A (b ε) = Θ (M) and moreover that there exists c3 such that the first termon the right hand side of (77) is bounded by

(79) E

M times 1

(sumMj=1 1 (Xj gt bminus 1b) ge M

A(bε)

)

sumMj=1 1 (Xj gt bminus 1b)

∣∣∣∣∣∣supT

f (t) gt b

le c3b

d

Now we turn to the second term on the right hand side of (77) We use the factthat MA(b ε) le cprimeprimeprime isin (0infin) (uniformly as b rarrinfin and ε rarr 0) There exist c4 andc5 such that for ε le δ0c4

E

M1

(1 le sumM

j=1 1(Xj gt bminus 1

b

)lt M

A(bε)

)

sumMj=1 1

(Xj gt bminus 1

b

)∣∣∣∣∣∣supT

f (t) gt b

(710)

le MP

nsum

j=1

1(

Xj gt bminus 1b

)lt cprimeprimeprime

∣∣∣∣∣∣supT

f (t) gt b

le MP

(min

|tminustlowast|lec4εbf (t) lt bminus 1b

∣∣∣∣ supT

f (t) gt b

)

le c1m(T )bdεminusd exp(minus δlowast

c24ε

2

)le c5b

d

The second inequality holds from the fact that ifsumM

j=1 1(Xj gt bminus 1

b

)is less than

cprimeprimeprime then the minimum of f in a ball around the local maximum and of radius c4εb

must be less than bminus 1b Otherwise there are more than cprimeprimeprime elements of T insidesuch a ball The last inequality is due to Lemma 78 and Theorem 72

Putting (79) and (710) together we obtain for all εb -regular discretizationswith ε lt ε0 = min(14 1c4)δ0

EQL2b

P (suptisinT f (t) gt b)le E

(M1 (maxj Xj gt b)sumM

j=1 1(Xj gt bminus 1

b

)∣∣∣∣∣ sup

tisinTf (t) gt b

)P (X1 gt bminus 1b)

P (suptisinT f (t) gt b)

le (c3 + c5)bdP (X1 gt bminus 1b)P (suptisinT f (t) gt b)

Applying now (74) and Proposition 72 we have that

P (suptisinT

f(t) gt b) ltP (sup

tisinTf(t) gt b)

1minus c0radic

ε

36 ADLER BLANCHET LIU

we have

supbgtb0εisin[0ε0]

EQL2b

P(sup

tisinTf (t) gt b

) lt infin

as required 2

73 A final proof Lemma 77 Without loss of generality we assume that thefield has mean zero and unit variance Before getting into the details of the proofof Lemma 77 we need a few additional lemmas for which we adopt the followingnotation Let Ci and Cij be the first and second order derivatives of C and definethe vectors

micro1(t) = (minusC1(t) minusCd(t))micro2(t) = vech ((Cij(t) i = 1 d j = i d))

Let f prime(0) and f primeprime(0) be the gradient and vector of second order derivatives of f at 0where f primeprime(0) is arranged in the same order as micro2(0) Furthermore let micro02 = microgt20 bea vector of second order spectral moments and micro22 a matrix of fourth order spectralmoments The vectors micro02 and micro22 are arranged so that

1 0 micro02

0 Λ 0micro20 0 micro22

is the covariance matrix of (f(0) f prime(0) f primeprime(0)) where Λ = (minusCij(0)) It then followsthat

micro2middot0 = micro22 minus micro20micro02

be the conditional variance of f primeprime(0) given f(0) The following lemma given in [5]provides a stochastic representation of the f given that it has a local maxima atlevel u at the origin

Lemma 711 Given that f has a local maximum with height u at zero (aninterior point of T ) the conditional field is equal in distribution to

(711) fu(t) uC (t)minusWuβgt (t) + g (t)

g(t) is a centered Gaussian random field with covariance function

γ(s t) = C(sminus t)minus (C(s) micro2(s))(

1 micro02

micro20 micro22

)minus1 (C(t)microgt2 (t)

)minus micro1(s)Λminus1microgt1 (t)

and Wu is a d(d+1)2 random vector independent of g(t) with density function

(712) ψu (w) prop |det (rlowast(w)minus uΛ) | exp(minus1

2wgtmicrominus1

2middot0w)1 (rlowast(w)minus uΛ isin N )

MONTE CARLO FOR RANDOM FIELDS 37

where rlowast(w) is a dtimesd symmetric matrix whose upper triangular elements consist ofthe components of w The set of negative definite matrices is denoted by N Finallyβ(t) is defined by

(α (t) β (t)) = (C (t) micro2 (t))(

1 micro02

micro20 micro22

)minus1

The following two technical lemmas which we shall prove after completing theproof of Lemma 77 provide bounds for the last two terms of (711)

Lemma 712 Using the notation in (711) there exist δ0 ε1 c1 and b0 suchthat for any u gt b gt b0 and δ isin (0 δ0)

P

(sup

|t|leδab

∣∣Wuβgt (t)∣∣ gt

a

4b

)le c1 exp

(minusε1b

2

δ4

)

Lemma 713 There exist c δ and δ0 such that for any δ isin (0 δ0)

P

(max|t|leδab

|g(t)| gt a

4b

)le c exp

(minus δ

δ2

)

Proof of Lemma 77 Using the notation of Lemma 711 given any s isin L forwhich f(s) = b we have that the corresponding conditional distribution of s is thatof fb(middot minus s) Consequently it suffices to show that

P

(min

|t|leδabfb(t) lt bminus ab

)le clowast exp

(minusδlowast

δ2

)

We consider firstly the case for which the local maximum is in the interior of T Then by the Slepian model (711)

fb (t) = bC (t)minusWbβgt (t) + g (t)

We study the three terms of the Slepian model individually Since

C(t) = 1minus tgtΛt + o(|t|2

)

there exists a ε0 such thatbC (t) ge bminus a

4b

38 ADLER BLANCHET LIU

for all |t| lt ε0radic

ab According to Lemmas 712 and 713 for δ lt min(ε0radic

a0 δ0)

P

(min

|t|leδabfu(t) lt bminus ab

)

le P

(max|t|ltδab

|g (t)| gt a

4b

)+ P

(sup

|t|leδab

∣∣Wbβgt(t)

∣∣ gta

4b

)

le c exp

δ2

)+ c1 exp

(minusε1b

2

δ4

)

le clowast exp(minusδlowast

δ2

)

for some clowast and δlowastNow consider the case for which the local maximum is in the (dminus1)-dimensional

boundary of T Due to convexity of T and without loss of generality we assumethat the tangent space of partT is generated by partpartt2 partparttd the local maximumis located at the origin and T is a subset of the positive half-plane t1 ge 0 Thatthese arguments to not involve a loss of generality follows from the arguments onpages 291ndash192 of [4] which rely on the assumed stationarity of f (for translations)and the fact that rotations while changing the distributions will not affect theprobabilities that we are currently computing)

For the origin positioned as just described to be a local maximum it is necessaryand sufficient that the gradient of f restricted to partT is the zero vector the Hessianmatrix restricted to partT is negative definite and part1f(0) le 0 Applying a version ofLemma 711 for this case conditional on f(0) = u and 0 being a local maximumthe field is equal in distribution to

(713) uC(t)minus Wuβgt(t) + micro1(t)Λminus1 (Z 0 0)T + g(t)

where Z le 0 corresponds to part1f (0) and it follows a truncated (conditional on thenegative axis) Gaussian random variable with mean zero and a variance parameterwhich is computed as the conditional variance of part1f (0) given (part2f (0) partdf (0))The vector Wu is a (d(d + 1)2)-dimensional random vector with density function

ψu(w) prop |det(rlowast(w)minus uΛ)| exp(minus12wgtmicrominus1

2middot0w)1(wlowast minus uΛ isin N )

where Λ is the second spectral moment of f restricted to partT and rlowast(w) is the(dminus 1)times (dminus 1) symmetric matrix whose upper triangular elements consist of thecomponents of w In the representation (713) the vector Wu and Z are independentAs in the proof of Lemma 712 one can show that a similar bound holds albeitwith with different constants Thus since micro1 (t) = O (t) there exist cprimeprime and δprimeprime suchthat the third term in (713) can be bounded by

P [ max|t|leδab

∣∣∣micro1 (t) Λminus1 (Z 0 0)T∣∣∣ ge a(4b)] le cprimeprime exp

(minusδprimeprime

δ2

)

MONTE CARLO FOR RANDOM FIELDS 39

Consequently we can also find clowast and δlowast such that the conclusion holds and weare done 2

We complete the paper with the proofs of Lemmas 712 and 713

Proof of Lemma 712 It suffices to prove the Lemma for the case a = 1 Since

fu (0) = u = uminusWuβgt (0) + g (0)

and Wu and g are independent β (0) = 0 Furthermore since C prime (t) = O (t) andmicroprime2 (t) = O (t) there exists a c0 such that |β(t)| le c0|t|2 In addition Wu has densityfunction proportional to

ψu(w) prop |det(rlowast(w)minus uΛ)| exp(minus12wgtmicrominus1

2middot0w)1 (wlowast minus uΛ isin N )

Note that det(rlowast(w) minus uΛ) is expressible as a polynomial in w and u and thereexists some ε0 and c such that

∣∣∣∣det (rlowast(w)minus uΛ)

det (minusuΛ)

∣∣∣∣ le c

if |w| le ε0u Hence there exist ε2 c2 gt 0 such that

ψu (w) le ψ (w) = c2 exp(minus1

2ε2w

gtmicrominus12middot0w

)

for all u ge 1 The right hand side here is proportional to a multivariate Gaussiandensity Thus

P (|Wu| gt x) =int

|w|gtx

ψu(w)dw leint

|w|gtx

ψu (w) dw = c3P (|W | gt x)

where W is a multivariate Gaussian random variable with density function propor-tional to ψ Therefore by choosing ε1 and c1 appropriately we have

P

(sup|t|leδb

∣∣Wuβgt∣∣ gt

14b

)le P

(|Wu| gt b

c20δ

2

)le c1 exp

(minusε1b

2

δ4

)

for all u ge b 2

Proof of Lemma 713 Once again it suffices to prove the Lemma for the casea = 1 Since

fb(0) = b = bminusWbβgt(0) + g(0)

the covariance function (γ (s t) s t isin T ) of the centered field g satisfies γ(0 0) = 0It is also easy to check that

partsγ(s t) = O(|s|+ |t|) parttγ(s t) = O(|s|+ |t|)

40 ADLER BLANCHET LIU

Consequently there exists a constant cγ isin (0infin) for which

γ(s t) le cγ(|s|2 + |t|2) γ(s s) le cγ |s|2

We need to control the tail probability of sup|t|leδb |g(t)| For this it is useful tointroduce the following scaling Define

gδ (t) =b

δg

(δt

b

)

Then sup|t|leδb g (t) ge 14b if and only if sup|t|le1 gδ (t) ge 1

4δ Let

σδ (s t) = E (gδ (s) gδ (t))

ThensupsisinR

σδ(s s) le cγ

Because γ(s t) is at least twice differentiable applying a Taylor expansion we easilysee that the canonical metric dg corresponding to gδ (s) (cf Theorem 67) can bebounde as follows

d2g(s t) = E (gδ (s)minus gδ (t))2 =

b2

δ2

(δs

bδs

b

)+ γ

(δt

bδt

b

)minus 2γ

(δs

bδt

b

)]

le c |sminus t|2

for some constant c isin (0infin) Therefore the entropy of gδ evaluated at δ is boundedby Kδminusd for any δ gt 0 and with an appropriate choice of K gt 0 Therefore for allδ lt δ0

P ( sup|t|leδb

|g(t)| ge 14b

) = P ( sup|t|le1

gδ (t) ge 14δ

) le cdδminusdminusη exp(minus 1

16cγδ2)

for some constant cd and η gt 0 The last inequality is a direct application of The-orem 411 of [4] The conclusion of the lemma follows immediately by choosing c

and δ appropriately 2

REFERENCES

[1] RJ Adler K Bartz and S Kou Estimating curvatures in the Gaussian kinematic formulaIn preparation

[2] RJ Adler J Blanchet and J Liu Efficient simulation for tail probabilities of Gaussianrandom fields In WSC rsquo08 Proceedings of the 40th Conference on Winter Simulationpages 328ndash336 Winter Simulation Conference 2008

[3] RJ Adler P Muller and Rozovskii B editors Stochastic Modelling in Physical Oceanog-raphy Birkhauser Boston 1996

[4] RJ Adler and JE Taylor Random Fields and Geometry Springer New York USA 2007

[5] RJ Adler JE Taylor and K Worsley Applications of Random Fields and GeometryFoundations and Case Studies httpwebeetechnionacilpeopleadlerhrfpdf

MONTE CARLO FOR RANDOM FIELDS 41

[6] S Asmussen and P Glynn Stochastic Simulation Algorithms and Analysis Springer-VerlagNew York NY USA 2008

[7] JM Azais and M Wschebor Level Sets and Extrema of Random Processes and FieldsWiley New York NY 2009

[8] JM Bardeen JR Bond N Kaiser and AS Szalay The statistics of peaks of Gaussianrandom fields The Astrophysical Journal 30415ndash61 1986

[9] SM Berman Sojourns and Extremes of Stochastic Processes The Wadsworth ampBrooksCole StatisticsProbability Series Wadsworth amp BrooksCole Advanced Books ampSoftware Pacific Grove CA 1992

[10] J Blanchet Efficient importance sampling for binary contingency tables Ann of ApplProb 19949ndash982 2009

[11] C Borell The Brunn-Minkowski inequality in Gauss space Invent Math 30205ndash216 1975

[12] J Bucklew Introduction to Rare-Event Simulation Springer-Verlag New York NY 2004

[13] MR Dennis Nodal densities of planar Gaussian random waves Eur Phys J 145191ndash2102007

[14] RM Dudley Sample functions of the Gaussian process Ann Probab 166ndash103 1973

[15] KJ Friston KJ Worsley RSJ Frackowiak JC Mazziotta and AC Evans Assessing thesignificance of focal activations using their spatial extent Human Brain Mapping 1214ndash2201994

[16] MR Leadbetter G Lindgren and H Rootzen Extremes and Related Properties of RandomSequences and Processes Springer-Verlag New York 1983

[17] M Mitzenmacher and E Upfal Probability and Computing Randomized Algorithms andProbabilistic Analysis Cambridge University Press 2005

[18] W Niemiro and P Pokarowski Fixed precision MCMC estimation by median of productsof averages J Appl Probab 46309ndash329 2009

[19] V Piterbarg Asymptotic Methods in the Theory of Gaussian Processes American Mathe-matical Society USA 1995

[20] SF Shandarin Testing non-Gaussianity in cosmic microwave background maps by morpho-logical statistics Mon Not R Astr Soc 331865ndash874 2002

[21] SF Shandarin HA Feldman Y Xu and M Tegmark Morphological measures of non-gaussianity in cosmic microwave background maps Astrophys J Suppl 1411ndash11 2002

[22] L Shepp and H Landau On the supremum of a Gaussian process Sankhya A 32369ndash378December 1970

[23] JE Taylor A Takemura and RJ Adler Validity of the expected Euler characteristicheuristic Ann Probab 33(4)1362ndash1396 2005

[24] JE Taylor and KJ Worsley Detecting sparse signals in random fields with an applicationto brain mapping J Amer Statist Assoc 102(479)913ndash928 2007

[25] J Traub G Wasilokowski and H Wozniakowski Information-Based Complexity AcademicPress New York NY 1988

[26] BS Tsirelson IA Ibragimov and VN Sudakov Norms of Gaussian sample functionsProceedings of the Third Japan-USSR Symposium on Probability Theory (Tashkent 1975)55020ndash41 1976

[27] V S Tsirelrsquoson The density of the maximum of a gaussian process Theory of Probab Appl20847ndash856 1975

[28] H Wozniakowski Computational complexity of continuous problems 1996 Technical Re-port

Robert J AdlerElectrical EngineeringTechnion Haifa Israel 32000E-mail roberteetechnionacilURL webeetechnionacilpeopleadler

Jose H BlanchetIndustrial Engineering amp Operations ResearchColumbia Uninversity340 SW Mudd Building500 W 120 StreetNew York New York 10027E-mail joseblanchetcolumbiaeduURL wwwieorcolumbiaedufac-biosblanchetfacultyhtml

42 ADLER BLANCHET LIU

Jingchen LiuDepartment of StatisticsColumbia University1255 Amsterdam Avenue Room 1030New York NY 10027E-mail jcliustatcolumbiaeduURL statcolumbiaedu~jcliu

Page 14: EFFICIENT MONTE CARLO FOR HIGH EXCURSIONS OF … › ~jblanche › papers › GRF1.pdf · 1. Introduction. This paper centers on the design and analysis of e–cient Monte Carlo techniques

14 ADLER BLANCHET LIU

since m(Aγab) gt m (Ab) gt 0 the denominator may now be big enough to control

the second moment of the estimator As we shall see introducing the undershootof size ab will be very useful in the technical development both in the remainderof this section and in Section 7 In addition its introduction also provides insightinto the appropriate form of the estimator needed when discretizing the field

62 Algorithm and Analysis We still need to face the problem of generatingf in a computer Thus we now concentrate on a suitable discretization schemestill having in mind the change of measure leading to (64) Since our interest isto ultimately design algorithms that are efficient for estimating expectations suchas E[Γ (f) |flowast gt b] where Γ may be a functional of the whole field we shall use aglobal discretization scheme

Consider U = (U1 UM ) where Ui are iid uniform random variables takingvalues in T and independent of the field f Set TM = U1 UM and Xi =Xi (Ui) = f (Ui) for 1 le i le M Then X = (X1 Xm) (conditional on U) isa multivariate Gaussian random vector with conditional means micro(Ui)

∆= E(Xi|Ui)and covariances C (Ui Uj)

∆= Cov (Xi Xj |Ui Uj) Our strategy is to approximatew(b) by

wM (γab) = P (maxtisinTM

f(t) gt γab) = E[P ( max1leileM

Xi gt γab|U)]

Given the development in Section 5 it might not be surprising that if we can ensurethat M = M (ε b) is polynomial in 1ε and b then we shall be in a good positionto develop a FPRAS The idea is to apply an importance sampling strategy similarto that we considered in the construction of Lprimeb of (65) but this time it will beconditional on U In view of our earlier discussions we propose sampling from Qprimeprime

defined via

Qprimeprime (X isin B|U) =Msum

i=1

pU (i)P [X isin B|Xi gt γab U ]

where

pU (i) =P (Xi gt γab|U)sumM

j=1 P (Xj gt γab|U)

We then obtain the (conditional) importance sampling estimator

(66) Lb(U) =sumM

i=1 P (Xi gt γab|U)sumMi=1 1(Xi gt γab)

It is clear thatwM (γab) = EQprimeprime [Lb(U)]

Suppose for the moment that M a and the number of replications n have beenchosen Our future analysis will in particular guide the selection of these parametersThen the procedure is summarized by the next algorithm

MONTE CARLO FOR RANDOM FIELDS 15

Algorithm 61 The algorithm has three steps

Step 1 Simulate U (1) U (n) which are n iid copies of the vector U = (U1 UM )described aboveStep 2 Conditional on each U (i) for i = 1 n generate L

(i)b (U (i)) as described

by (66) by considering the distribution of X(i)(U (i)) = (X(i)1 (U (i)

1 ) X(i)M (U (i)

M ))Generate the X(i)(U (i)) independently so that at the end we obtain that the L

(i)b (U (i))

are n iid copies of Lb(U)Step 3 Output

Ln(U (1) U (n)) =1n

nsum

i=1

L(i)b (U (i))

63 Running Time of Algorithm 61 Bias and Variance Control The remain-der of this section is devoted to the analysis of the running time of the Algorithm61 The first step lies in estimating the bias and second moment of Lb(U) underthe change of measure induced by the sampling strategy of the algorithm whichwe denote by Qprimeprime We start with a simple bound for the second moment

Proposition 62 There exists a finite λ0 depending on microT = maxtisinT |micro (t)|and σ2

T = maxtisinT σ2 (t) for which

EQprimeprime [Lb(U)2] le λ0M2P

(maxtisinT

f (t) gt b

)2

Proof Observe that

EQprimeprime [Lb(U)2] le E

(Msum

i=1

P (Xi gt γab|Ui)

)2

le E

(Msum

i=1

suptisinT

P (f (Ui) gt γab|Ui = t)

)2

= M2 maxtisinT

P (f (t) gt γab)2

le λ0M2 max

tisinTP (f (t) gt b)2

le λ0M2P

(maxtisinT

f (t) gt b

)2

which completes the proof 2

Next we obtain a preliminary estimate of the bias

16 ADLER BLANCHET LIU

Proposition 63 For each M ge 1 we have

|w (b)minus wM (b)| le E[exp

(minusMm(Aγab

)m(T )

) Ab cap T 6= empty]

+ P

(maxtisinT

f (t) gt γab maxtisinT

f (t) le b

)

Proof Note that

|w (b)minus wM (γab)|le P (max

tisinTM

f (t) le γab maxtisinT

f (t) gt b) + P (maxtisinTM

f (t) gt γab maxtisinT

f (t) le b)

The second term is easily bounded by

P (maxtisinTM

f (t) gt γabmaxtisinT

f (t) le b) le P (maxtisinT

f (t) gt γab maxtisinT

f (t) le b)

The first term can be bounded as follows

P

(maxtisinTM

f (t) le γabmaxtisinT

f (t) gt b

)

le E[(P [f (Ui) le γab| f ])M 1(Ab cap T 6= empty)

]

le E[(

1minusm(Aγab)m (T )

)M Ab cap T 6= empty)]

le E[exp

(minusMm(Aγab)m (T )

) Ab cap T 6= empty)

]

This completes the proof 2

The previous proposition shows that controlling the relative bias of Lb(U) re-quires finding bounds for

(67) E[exp(minusMm(Aγab

)m(T )) Ab cap T 6= empty]

and

(68) P (maxtisinT

f (t) gt γabmaxtisinT

f (t) le b)

and so we develop these To bound (67) we take advantage of the importancesampling strategy based on Qprime introduced earlier in (64) Write

E[exp

(minusMm(Aγab)m (T )

) m(Ab) gt 0

](69)

= EQprime(

exp(minusMm(Aγab

)m (T ))

m(Aγab)

m(Ab) gt 0

)Em

(Aγab

)

MONTE CARLO FOR RANDOM FIELDS 17

Furthermore note that for each α gt 0 we have

EQprime(

exp(minusMm

(Aγab

)m (T )

)

m(Aγab

) m(Ab) gt 0

)(610)

le αminus1 exp (minusMαm(T ))

+ EQprime(

exp(minusMm(Aγab

)m(T ))

m(Aγab

) m(Aγab) le αm(Ab) gt 0

)

The next result whose proof is given in Section 64 gives a bound for the aboveexpectation

Proposition 64 Let β be as in Conditions A2 and A3 For any v gt 0 thereexist constants κ λ2 isin (0infin) (independent of a isin (0 1) and b but dependent on v)such that if we select

αminus1 ge κdβ(ba)(2+v)2dβ

and define W such that P (W gt x) = exp(minusxβd

)for x ge 0 then

EQprime(

exp(minusMm(Aγab

)m (T ))

m(Aγab)

m(Aγab) le αm(Ab) gt 0

)(611)

le EW 2 m (T )

λ2dβ2 M

(b

a

)4dβ

The following result gives us a useful upper bound on (68) The proof is givenin Section 65

Proposition 65 Assume that Conditions A2 and A3 are in force For anyv gt 0 let ρ = 2dβ + dv + 1 where d is the dimension of T There exist constantsb0 λ isin (0infin) (independent of a but depending on microT = maxtisinT |micro (t)| σ2

T =maxtisinT σ2 (t) v the Holder parameters β and κH) so that for all b ge b0 ge 1 wehave

(612) P

(maxtisinT

f(t) le b + ab | maxtisinT

f(t) gt b

)le λabρ

Consequently

P

(maxtisinT

f (t) gt γab maxtisinT

f (t) le b

)

= P

(maxtisinT

f (t) le b|maxtisinT

f (t) gt γab

)P

(maxtisinT

f(t) gt γab

)

le λabρP

(maxtisinT

f(t) gt γab

)

18 ADLER BLANCHET LIU

Moreover

P

(maxtisinT

f(t) gt γab

)(1minus λabρ) le P

(maxtisinT

f(t) gt b

)

Propositions 64 and 65 allow us to prove Theorem 31 which is rephrased inthe form of the following theorem which contains the detailed rate of complexityand so the main result of this section

Theorem 66 Suppose f is a Gaussian random field satisfying conditions A1ndashA4 in Section 3 Given any v gt 0 put a = ε(4λbρ) (where λ and ρ as in Proposition65) and αminus1 = κdβ(ba)(2+v)dβ Then there exist c ε0 gt 0 such that for allε le ε0

(613) |w (b)minus wM (γab)| le w (b) ε

if M =lceilcεminus1(ba)(4+4v)dβ

rceil Consequently by our discussion in Section 2 and the

bound on the second moment given in Proposition 62 it follows that Algorithm 61provides a FPRAS with running time O

((M)3 times (M)2 times εminus2δminus1

)

Proof Combining (63) (69) and (610) with Propositions 63ndash65 we have that

|w (b)minus wM (b) |le αminus1 exp (minusMαm (T )) E

[m

(Aγab

)]

+E[W 2]m (T )

λ2dβ2 M

(b

a

)4dβ

E[m

(Aγab

)]+

(λabρ

1minus λabρ

)w (b)

Furthermore there exists a constant K isin (0infin) such that

Em(Aγab

) le K maxtisinT

P (f (t) gt b)m (T ) le Kw (b) m (T )

Therefore we have that

|w (b)minus wM (b) |w (b)

le αminus1Km(T ) exp (minusMαm (T ))

+E[W 2]Km (T )2

λ2dβ2 M

(b

a

)4dβ

+(

λabρ

1minus λabρ

)

Moreover since a = ε(4λbρ) we obtain that for ε le 12

|w (b)minus wN (b) |w (b)

le αminus1Km(T ) exp (minusMαm (T ))

+[EW 2]Km (T )2

λ2dβ2 M

(b

a

)4dβ

+ ε2

MONTE CARLO FOR RANDOM FIELDS 19

From the selection of α M and θ it follows easily that the first two terms on theright hand side of the previous display can be made less than ε2 for all ε le ε0 bytaking ε0 sufficiently small

The complexity count given in the theorem now corresponds to the followingestimates The factor O((M)3) represents the cost of a Cholesky factorization re-quired to generate a single replication of a finite field of dimension M In additionthe second part of Proposition 62 gives us that O(M2εminus2δminus1) replications are re-quired to control the relative variance of the estimator 2

We now proceed to prove Propositions 64 and 65

64 Proof of Proposition 64 We concentrate on the analysis of the left handside of (611) An important observation is that conditional on the random variableτγab

with distribution

Qprime(τγabisin middot) =

E[m(Aγab

cap middot)]

E[m(Aγab

)]

and given f(τγab) the rest of the field namely (f (t) t isin Tτγab

) is anotherGaussian field with a computable mean and covariance structure The second termin (610) indicates that we must estimate the probability that m(Aγab

) takes smallvalues under Qprime For this purpose we shall develop an upper bound for

(614) P(m(Aγab

) lt yminus1 m(Ab) gt 0∣∣ f (t) = γab + zγab

)

for y large enough Our arguments proceeds in two steps For the first in order tostudy (614) we shall estimate the conditional mean covariance of f (s) s isin Tgiven that f (t) = γab + zγab Then we use the fact that the conditional field isalso Gaussian and take advantage of general results from the theory of Gaussianrandom fields to obtain a bound for (614) For this purpose we recall some usefulresults from the theory of Gaussian random fields The first result is due to Dudley[14]

Theorem 67 Let U be a compact subset of Rn and let f0(t) t isin U be amean zero continuous Gaussian random field Define the canonical metric d on Uas

d (s t) =radic

E[f0 (t)minus f0 (s)]2

and put diam (U) = supstisinU d (s t) which is assumed to be finite Then there existsa finite universal constant κ gt 0 such that

E[maxtisinU

f0 (t)] le κ

int diam(U)2

0

[log (N (ε))]12dε

where the entropy N (ε) is the smallest number of dminusballs of radius ε whose unioncovers U

20 ADLER BLANCHET LIU

The second general result that we shall need is the so-called B-TIS (Borel-Tsirelson-Ibragimov-Sudakov) inequality [4 11 26]

Theorem 68 Under the setting described in Theorem 67

P

(maxtisinU

f0 (t)minus E[maxtisinU

f0 (t)] ge b

)le exp

(minusb2(2σ2U )

)

whereσ2U = max

tisinUE[f2

0 (t)]

We can now proceed with the main proof We shall assume from now on thatτγab

= 0 since as will be obvious from what follows all estimates hold uniformlyover τγab

isin T This is a consequence of the uniform Holder assumptions A2 andA3 Define a new process f

(f (t) t isin T ) L= (f (t) t isin T | f (0) = γab + zγab)

Note that we can always write f (t) = micro (t)+g (t) where g is a mean zero Gaussianrandom field on T We have that

micro (t) = Ef (t) = micro (t) + σ (0)minus2C (0 t) (γab + zγab minus micro (0))

and that the covariance function of f is given by

Cg (s t) = Cov (g (s) g (t)) = C (s t)minus σ (0)minus2C (0 s)C (0 t)

The following lemma describes the behavior of micro (t) and Cg (t s)

Lemma 69 Assume that |s| and |t| small enough Then the following threeconclusions hold

(i) There exist constants λ0 and λ1 gt 0 such that

|micro (t)minus (γab + zγab)| le λ0 |t|β + λ1 |t|β (γab + zγab)

and for all z isin (0 1) and γab large enough

|micro (s)minus micro (t) | le κHγab|sminus t|β

(ii)Cg (s t) le 2κHσ (t)σ (s) |t|β + |s|β + |tminus s|β

(iii)

Dg (s t) =radic

E([g (t)minus g (s)]2) le λ121 |tminus s|β2

Proof All three consequences follow from simple algebraic manipulations Thedetails are omitted 2

MONTE CARLO FOR RANDOM FIELDS 21

Proposition 610 For any v gt 0 there exist κ and λ2 such that for all t isin T yminusβd le a2+v

κb(2+v) a sufficiently small and z gt 0

P(m(Aγab

)minus1 gt y m(Ab) gt 0∣∣ f (t) = γab + zγab

) le exp(minusλ2a

2yβdb2)

Proof For notational simplicity and without loss of generality we assume thatt = 0 First consider the case that z ge 1 Then there exist c1 c2 such that for allc2y

minusβd lt bminus2minusv and z gt 1

P(m

(Aγab

cap T)minus1

gt ym(Ab) gt 0| f (0) = γab + zγab

)

le P

(inf

|t|ltc1yminus1df(t) le γab|f (0) = γab + zγab

)

= P

(inf

|t|ltc1yminus1dmicro (t) + g(t) le γab

)

le P

(inf

|t|ltc1yminus1dg(t) le minus 1

2γab

)

Now apply (iii) from Lemma 69 from which it follows that N (ε) le c3m(T )ε2dβ

for some constant c3 By Theorem 67 E(sup|t|ltc1yminus1d f(t)) = O(yminusβ(2d) log y)By Theorem 68 for some constant c4

P(m

(Aγab

cap T)minus1

gt y m(Ab) gt 0| f (0) = γab + zγab

)

le P

(inf

|t|ltc1yminus1dg(t) le minus 1

2γab

)

le exp

(minus 1

c4γ2aby

minusβd

)

for c2yminusβd lt bminus2minusv and z gt 1

Now consider the case z isin (0 1) Let tlowast be the global maximum of f(t) Then

P(m

(Aγab

cap T)minus1

gt y m(Ab) gt 0|f (0) = γab + zγab

)

le P

(inf

|tminustlowast|ltc1yminus1df(t) lt γab f(tlowast) gt b|f (0) = γab + zγab

)

le P

(sup

|sminust|ltc1yminus1d

|f(s)minus f(t)| gt ab|f (0) = γab + zγab

)

Consider the new field ξ(s t) = g(s)minus g(t) with parameter space T times T Note thatradic

V ar(ξ(s t)) = Dg(s t) le λ1|sminus t|β2

Via basic algebra it is not hard to show that the entropy of ξ(s t) is bounded byNξ (ε) le c3m(T times T )ε2dβ In addition from (i) of Lemma 69 we have

|micro (s)minus micro (t) | le κHγab|sminus t|β

22 ADLER BLANCHET LIU

Similarly for some κ gt 0 and all yminusβd le 1κ

(ab

)2+v a lt 1 there exists c5 such that

P(m

(Aγab

cap T)minus1

gt y m(Ab) gt 0|f (0) = γab + zγab

)

le P

(sup

|sminust|ltc1yminus1d

|f(s)minus f(t)| gt ab|f (0) = γab + zγab

)

= P

(sup

|sminust|ltc1yminus1d

|ξ(s t)| gt a

2b

)

le exp(minus a2

c5b2yminusβd

)

Combining the two cases z gt 1 and z isin (0 1) and choosing c5 large enough we have

P(m

(Aγab

cap T)minus1

gt y m(Ab) gt 0∣∣ f (0) = γab + zγab

)

le exp(minus a2

c5b2yminusβd

)

for a small enough and yminusβd le 1κ

(ab

)2+v Renaming the constants completes theproof 2

The final ingredient needed for the proof of Proposition 64 is the following lemmainvolving stochastic domination The proof follows an elementary argument and istherefore omitted

Lemma 611 Let v1 and v2 be finite measures on R and define ηj(x) =intinfin

xvj(ds)

Suppose that η1 (x) ge η2 (x) for each x ge x0 Let (h (x) x ge x0) be a non-decreasing positive and bounded function Then

int infin

x0

h (s) v1(dx) geint infin

x0

h (s) v2 (ds)

Proof of Proposition 64 Note that

EQprime(exp

(minusMm(Aγab

)m (T )

)

m(Aγab

) m(Aγab

) isin (0 α) m(Ab) gt 0)

=int

T

int infin

z=0

E(exp

(minusMm(Aγab

)m (T )

)

m (Aγab)

m(Aγab

) isin (0 α) m(Ab) gt 0∣∣∣

f (t) = γab +z

γab

)P

(τγab

isin dt)P (γab[f (t)minus γab] isin dz| f (t) gt γab)

le supzgt0tisinT

E(exp

(minusMm(Aγab

)m (T )

)

m (Aγab)

m(Aγab

) isin (0 α) m(Ab) gt 0∣∣∣

f (t) = γab +z

γab

)

MONTE CARLO FOR RANDOM FIELDS 23

Now define Y (ba) = (ba)2dβλminusdβ2 W with λ2 as chosen in Proposition 610

and W with distribution given by P (W gt x) = exp(minusxβd) By Lemma 611

suptisinT

E(exp

(minusMm(Aγab

)m (T )

)

m (Aγab)

m(Aγab

) isin (0 α) m(Ab) gt 0∣∣∣

f (t) = γab +z

γab

)

le E[Y (ba) exp(minusMY (ba)minus1m(T ))Y (ba) gt αminus1

]

Now let Z be exponentially distributed with mean 1 and independent of Y (ba)Then we have (using the definition of the tail distribution of Z and Chebyshevrsquosinequality)

exp(minusMY (ba)minus1m (T )

)= P

(Z gt MY (ba)minus1m (T )

∣∣ Y (ba))

le m (T )Y (ba)M

Therefore

E[exp(minusMY (ba)minus1m (T )

)Y (ba) Y (ba) ge αminus1]

le m (T )M

E[Y (ba)2 Y (ba) ge αminus1]

le m (T )

Mλ2dβ2

(b

a

)4dβ

E(W 2)

which completes the proof 2

65 Proof of Proposition 65 We start with the following result of Tsirelson[27]

Theorem 612 Let f be a continuous separable Gaussian process on a compact(in the canonical metric) domain T Suppose that V ar (f) = σ is continuous andthat σ (t) gt 0 for t isin T Moreover assume that micro = Ef is also continuous andmicro(t) ge 0 for all t isin T Define

σ2T

∆= maxtisinT

Var(f(t))

and set F (x) = PmaxtisinT f(t) le x Then F is continuously differentiable on RFurthermore let y be such that F (y) gt 12 and define ylowast by

F (y) = Φ(ylowast)

Then for all x gt y

F prime(x) le Ψ(

xylowasty

)(xylowasty

(1 + 2α) + 1)

(1 + α)

24 ADLER BLANCHET LIU

where

α =y2

x(xminus y)y2lowast

We can now prove the following lemma

Lemma 613 There exists a constant A isin (0infin) independent of a and b ge 0such that

(615) P

(suptisinT

f(t) le b + ab | suptisinT

f(t) gt b

)le aA

P (suptisinT f (t) ge bminus 1b)P (suptisinT f (t) ge b)

Proof By subtracting inftisinT micro (t) gt minusinfin and redefining the level b to be b minusinftisinT micro (t) we may simply assume that Ef (t) ge 0 so that we can apply Theorem612 Adopting the notation of Theorem 612 first we pick b0 large enough so thatF (b0) gt 12 and assume that b ge b0 + 1 Now let y = bminus 1b and F (y) = Φ(ylowast)Note that there exists δ0 isin (0infin) such that δ0b le ylowast le δminus1

0 b for all b ge b0 Thisfollows easily from the fact that

logP

suptisinT

f(t) gt x

sim log sup

tisinTP f(t) gt x sim minus x2

2σ2T

On the other hand by Theorem 612 F is continuously differentiable and so

(616) P

suptisinT

f(t) lt b + ab | suptisinT

f(t) gt b

=

int b+ab

bF prime(x) dx

P suptisinT f(t) gt b

Moreover

F prime(x) le(

1minus Φ(

xylowasty

)) (xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x))

le (1minus Φ(ylowast))(

xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x))

= P

(maxtisinT

f (t) gt bminus 1b

)(xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x))

Thereforeint b+ab

b

F prime(x) dx

le P

(suptisinT

f (t) gt bminus 1b

) int b+ab

b

(xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x)) dx

Recall that α (x) = y2[x(xminus y)y2lowast] we can use the fact that ylowast ge δ0b to conclude

that if x isin [b b + ab] then α (x) le δminus20 and therefore

int b+ab

b

(xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x)) dx le 4δminus80 a

MONTE CARLO FOR RANDOM FIELDS 25

We thus obtain that

(617)

int b+ab

bF prime(x) dx

P suptisinT f(t) gt b le 4aδminus80

P suptisinT f(t) gt bminus 1bP suptisinT f(t) gt b

for any b ge b0 This inequality together with the fact that F is continuously differ-entiable on (minusinfininfin) yields the proof of the lemma for b ge 0 2

The previous result translates a question that involves the conditional distri-bution of maxtisinT f (t) near b into a question involving the tail distribution ofmaxtisinT f (t) The next result then provides a bound on this tail distribution

Lemma 614 For each v gt 0 there exists a constant C(v) isin (0infin) (possiblydepending on v gt 0 but otherwise independent of b) so that such that

P

(maxtisinT

f (t) gt b

)le C(v)b2dβ+dv+1 max

tisinTP (f (t) gt b)

for all b ge 1

Proof The proof of this result follows along the same lines of Theorem 262 in[5] Consider an open cover of T = cupM

i=1Ti(θ) where Ti(θ) = s |s minus ti| lt θWe choose ti carefully such that N(θ) = O(θminusd) for θ arbitrarily small Writef (t) = g (t)+micro (t) where g (t) is a centered Gaussian random field and note usingA2 and A3 that

P

(max

tisinTi(θ)f (t) gt b

)le P

(max

tisinTi(θ)g (t) gt bminus micro (ti)minus κHθβ

)

Now we wish to apply the Borel-TIS inequality (Theorem 68) with U = Ti (θ) f0 =g d (s t) = E12([g (t)minus g (s)]2) which as a consequence of A2 and A3 is boundedabove by C0 |tminus s|β2 for some C0 isin (0infin) Thus applying Theorem 67 we havethat E maxtisinTi(θ) g (t) le C1θ

β2 log(1θ) for some C1 isin (0infin) Consequently theBorel-TIS inequality yields that there exists C2 (v) isin (0infin) such that for all bsufficiently large and θ sufficiently small we have

P

(max

tisinTi(θ)g (t) gt bminus micro (ti)minus κHθβ

)le C2 (v) exp

(minus

(bminus micro (ti)minus C1θ

β(2+βv))2

2σ2Ti

)

where σTi = maxtisinTi(θ) σ (t) Now select v gt 0 small enough and set θβ(2+βv) =bminus1 Straightforward calculations yield that

P

(max

tisinTi(θ)f (t) gt b

)le P

(max

tisinTi(θ)g (t) gt bminus micro (ti)minus κHθβ

)

le C3 (v) maxtisinTi(θ)

exp

(minus (bminus micro (t))2

2σ (t)2

)

26 ADLER BLANCHET LIU

for some C3(v) isin (0infin) Now recall the well known inequality (valid for x gt 0)that

φ (x)(

1xminus 1

x3

)le 1minus Φ(x) le φ (x)

x

where φ = Φprime is the standard Gaussian density Using this inequality it follows thatC4 (v) isin (0infin) can be chosen so that

maxtisinTi(θ)

exp

(minus (bminus micro (t))2

2σ (t)2

)le C4 (v) bmax

tisinTi

P (f (t) gt b)

for all b ge 1 We then conclude that there exists C (v) isin (0infin) such that

P

(maxtisinT

f (t) gt b

)le N (θ)C4 (v) bmax

tisinTP (f (t) gt b)

le CθminusdbmaxtisinT

P (f (t) gt b) = Cb2dβ+dv+1 maxtisinT

P (f (t) gt b)

giving the result 2

We can now complete the proof of Proposition 65

Proof of Proposition 65 The result is a straightforward corollary of the previoustwo lemmas By (615) in Lemma 613 and Lemma 614 there exists λ isin (0infin) forwhich

P

(maxtisinT

f(t) le b + ab | maxtisinT

f(t) gt b

)

le aAP (maxtisinT f (t) ge bminus 1b)

P (maxtisinT f (t) ge b)

le aCAb2dβ+dv+1 maxtisinT P (f (t) gt bminus 1b)P (maxtisinT f (t) ge b)

le aCAb2dβ+dv+1 maxtisinT P (f (t) gt bminus 1b)maxtisinT P (f (t) gt b)

le aλb2dβ+dv+1

The last two inequalities follow from the obvious bound

P

(maxtisinT

f (t) ge b

)ge max

tisinTP (f (t) gt b)

and standard properties of the Gaussian distribution This yields (612) from whichthe remainder of the proposition follows 2

MONTE CARLO FOR RANDOM FIELDS 27

7 Fine Tuning Twice Differentiable Homogeneous Fields In the pre-ceeding section we constructed a polynomial time algorithm based on a randomizeddiscretization scheme Our goal in this section is to illustrate how to take advantageof additional information to further improve the running time and the efficiency ofthe algorithm In order to illustrate our techniques we shall perform a more refinedanalysis in the setting of smooth and homogeneous fields and shall establish op-timality of the algorithm in a precise sense to described below Our assumptionsthroughout this section are B1ndashB2 of Section 3

Let C(sminus t) = Cov(f(s) f(t)) be the covariance function of f which we assumealso has mean zero Note that it is an immediate consequence of homogeneity anddifferentiability that partiC(0) = part3

ijkC(0) = 0We shall need the following definition

Definition 7.1. We call $\mathcal T = \{t_1,\dots,t_M\}\subset T$ a $\theta$-regular discretization of $T$ if and only if
\[
\min_{i\ne j}|t_i - t_j| \ge \theta, \qquad \sup_{t\in T}\min_i |t_i - t| \le 2\theta.
\]

Regularity ensures that points in the grid $\mathcal T$ are well separated. Intuitively, since $f$ is smooth, having tight clusters of points translates to a waste of computing resources, as a result of sampling highly correlated values of $f$. Also note that every region containing a ball of radius $2\theta$ has at least one representative in $\mathcal T$. Therefore, $\mathcal T$ covers the domain $T$ in an economical way. One technical convenience of $\theta$-regularity is that, for subsets $A\subseteq T$ that have positive Lebesgue measure (in particular, ellipsoids),
\[
\lim_{M\to\infty}\frac{\#(A\cap\mathcal T)}{M} = \frac{m(A)}{m(T)},
\]
where here, and throughout the remainder of the section, $\#(A)$ denotes the cardinality of the set $A$.
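For instance (an illustration of ours, not part of the paper's formal development), on a rectangular domain a $\theta$-regular discretization can be taken to be an ordinary uniform grid of spacing $2\theta$; the helper name below is hypothetical.

```python
import itertools
import numpy as np

def theta_regular_grid(lower, upper, theta):
    """A simple theta-regular discretization of the box [lower, upper]:
    a uniform grid of spacing 2*theta, so that distinct points are at
    least theta apart, and every interior point of the box lies within
    sqrt(d)*theta of a grid point (shrinking the spacing slightly
    enforces the 2*theta covering condition exactly near the boundary)."""
    axes = [np.arange(lo + theta, hi, 2 * theta) for lo, hi in zip(lower, upper)]
    return np.array(list(itertools.product(*axes)))

# Example: T = [0,1]^2 discretized at resolution theta = eps/b.
eps, b = 0.1, 4.0
grid = theta_regular_grid([0.0, 0.0], [1.0, 1.0], eps / b)
print(grid.shape)  # about (b/eps)^d points, matching #(T) = O(eps^{-d} b^d)
```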

Let $\mathcal T = \{t_1,\dots,t_M\}$ be a $\theta$-regular discretization of $T$, and consider
\[
X = (X_1,\dots,X_M)^\top \triangleq (f(t_1),\dots,f(t_M))^\top.
\]
We shall concentrate on estimating $w_M(b) = P(\max_{1\le i\le M} X_i > b)$. The next result (which we prove in Section 7.1) shows that if $\theta = \varepsilon/b$, then the relative bias is $O(\sqrt\varepsilon)$.

Proposition 7.2. Suppose $f$ is a Gaussian random field satisfying conditions B1 and B2. There exist $c_0$, $c_1$, $b_0$ and $\varepsilon_0$ such that, for any finite $\varepsilon/b$-regular discretization $\mathcal T$ of $T$,
\[
(7.1)\qquad P\Big(\sup_{t\in\mathcal T} f(t) < b \,\Big|\, \sup_{t\in T} f(t) > b\Big) \le c_0\sqrt\varepsilon \quad\text{and}\quad \#(\mathcal T) \le c_1 m(T)\varepsilon^{-d}b^d,
\]
for all $\varepsilon\in(0,\varepsilon_0]$ and $b > b_0$.

Note that the bound on the bias obtained for twice differentiable fields is much sharper than that for the general Hölder continuous fields given by (6.13) in Theorem 6.6. This is partly because the conditional distribution of the random field around local maxima is harder to describe in the Hölder continuous case than in the case of twice differentiable fields. In addition to the sharper description of the bias, we shall also soon show, in Theorem 7.4, that our choice of discretization is optimal in a certain sense. Finally, we point out that the bound of $\sqrt\varepsilon$ in the first term of (7.1) is not optimal. In fact, there seems to be some room for improvement, and we believe that a more careful analysis might yield a bound of the form $c_0\varepsilon^2$.

We shall estimate $w_M(b)$ by using a slight variation of Algorithm 5.3. In particular, since the $X_i$'s are now identically distributed, we redefine $Q$ to be
\[
(7.2)\qquad Q(X\in B) = \sum_{i=1}^M \frac1M\, P[X\in B \mid X_i > b - 1/b].
\]
Our estimator then takes the form
\[
(7.3)\qquad L_b = \frac{M\times P(X_1 > b - 1/b)}{\sum_{j=1}^M 1(X_j > b - 1/b)}\, 1\Big(\max_{1\le i\le M} X_i > b\Big).
\]
Clearly, we have that $E^Q(L_b) = w_M(b)$. (The reason for subtracting the factor of $1/b$ was explained in Section 6.1.)

Algorithm 7.3. Given a number of replications $n$ and an $\varepsilon/b$-regular discretization $\mathcal T$, the algorithm is as follows:

Step 1. Sample $X^{(1)},\dots,X^{(n)}$, i.i.d. copies of $X$, with distribution $Q$ given by (7.2).
Step 2. Compute and output
\[
L_n = \frac1n\sum_{i=1}^n L_b^{(i)},
\]
where
\[
L_b^{(i)} = \frac{M\times P(X_1 > b - 1/b)}{\sum_{j=1}^M 1\big(X_j^{(i)} > b - 1/b\big)}\, 1\Big(\max_{1\le j\le M} X_j^{(i)} > b\Big).
\]
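To fix ideas, the following Python sketch (ours, not from the paper; function and variable names are illustrative) implements Algorithm 7.3 for a mean zero, unit variance, homogeneous field observed on a regular grid, with a user-supplied covariance function. It samples $X$ under the mixture measure $Q$ of (7.2) by drawing a uniform index $i$, drawing $X_i$ from its conditioned tail, and then drawing the remaining coordinates from the Gaussian conditional law.

```python
import numpy as np
from scipy.stats import norm

def algorithm_7_3(ts, C, b, n, rng=np.random.default_rng(0)):
    """Sketch of Algorithm 7.3 for a mean zero, unit variance, homogeneous
    field sampled at the grid points ts (an (M, d) array), with covariance
    function C acting on lag vectors.

    Returns L_n, the average of n i.i.d. copies of the estimator (7.3),
    drawn under the mixture measure Q of (7.2)."""
    M = len(ts)
    Sigma = np.array([[C(s - t) for t in ts] for s in ts])
    Lchol = np.linalg.cholesky(Sigma + 1e-12 * np.eye(M))  # factor once
    a = b - 1.0 / b                       # the shifted level b - 1/b
    p_tail = norm.sf(a)                   # P(X_1 > b - 1/b)
    L = np.empty(n)
    for k in range(n):
        i = rng.integers(M)               # uniform mixture component of (7.2)
        # X_i given X_i > b - 1/b, by inverting the Gaussian tail.
        xi = norm.isf(rng.uniform() * p_tail)
        # Unconditional draw Z ~ N(0, Sigma), then condition on X_i = xi
        # (unit variance, so the conditional mean is Sigma[:, i] * xi).
        Z = Lchol @ rng.standard_normal(M)
        X = Z + Sigma[:, i] * (xi - Z[i])
        hits = np.sum(X > a)              # >= 1, since X[i] = xi > a
        L[k] = M * p_tail / hits * (X.max() > b)
    return L.mean()
```

The single Cholesky factorization outside the loop, combined with the rank-one conditioning shift, avoids refactoring the conditional covariance for each mixture component; the factorization is what drives the $M^3$ term in the running-time count below.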

Theorem 7.5 later guides the selection of $n$ in order to achieve a prescribed relative error. In particular, our analysis, together with considerations from Section 2, implies that choosing $n = O(\varepsilon^{-2}\delta^{-1})$ suffices to achieve $\varepsilon$ relative error with probability at least $1-\delta$.

Algorithm 7.3 improves on Algorithm 6.1 for Hölder continuous fields in two important ways. The first aspect is that it is possible to obtain information on the size of the relative bias of the estimator. In Proposition 7.2 we saw that, in order to overcome bias due to discretization, it suffices to take a discretization of size $M = \#(\mathcal T) = \Theta(b^d)$. That this selection is also asymptotically optimal, in the sense described in the next result, will be proven in Section 7.1.


Theorem 7.4. Suppose $f$ is a Gaussian random field satisfying conditions B1 and B2. If $\theta\in(0,1)$, then, as $b\to\infty$,
\[
\sup_{\#(\mathcal T)\le b^{\theta d}} P\Big(\sup_{t\in\mathcal T} f(t) > b \,\Big|\, \sup_{t\in T} f(t) > b\Big) \to 0.
\]

This result implies that the relative bias goes to 100% as $b\to\infty$ if one chooses a discretization scheme of size $O(b^{\theta d})$ with $\theta\in(0,1)$. Consequently, $d$ is the smallest power of $b$ that achieves any given bounded relative bias, and so the suggestion above of choosing $M = O(b^d)$ points for the discretization is, in this sense, optimal.

The second aspect of improvement involves the variance. In the case of Hölder continuous fields, the ratio of the second moment of the estimator and $w(b)^2$ was shown to be bounded by a quantity that is of order $O(M^2)$. In contrast, in the context of smooth and homogeneous fields considered here, the next result shows that this ratio is bounded uniformly for $b > b_0$ and $M = \#(\mathcal T)\ge cb^d$. That is, the variance remains strongly controlled.

Theorem 7.5. Suppose $f$ is a Gaussian random field satisfying conditions B1 and B2. Then there exist constants $c$, $b_0$ and $\varepsilon_0$ such that, for any $\varepsilon/b$-regular discretization $\mathcal T$ of $T$, we have
\[
\sup_{b>b_0,\,\varepsilon\in[0,\varepsilon_0]} \frac{E^Q L_b^2}{P^2\big(\sup_{t\in T} f(t) > b\big)} \le \sup_{b>b_0,\,\varepsilon\in[0,\varepsilon_0]} \frac{E^Q L_b^2}{P^2\big(\sup_{t\in\mathcal T} f(t) > b\big)} \le c
\]
for some $c\in(0,\infty)$.

The proof of this result is given in Section 7.2. The fact that the number of replications remains bounded in $b$ is a consequence of the strong control on the variance.

Finally, we note that the proof of Theorem 3.3 follows as a direct corollary of Theorem 7.5, together with Proposition 7.2 and our discussion in Section 2. Assuming that placing each point in $\mathcal T$ takes no more than $c$ units of computer time, the total complexity is, according to the discussion in Section 2, $O(nM^3 + M) = O(\varepsilon^{-2}\delta^{-1}M^3 + M)$. (To reach overall relative error $\varepsilon$, Proposition 7.2 calls for an $\varepsilon^2/b$-regular discretization, so that $M = O(\varepsilon^{-2d}b^d)$.) The contribution of the term $M^3 = O(\varepsilon^{-6d}b^{3d})$ comes from the complexity of applying Cholesky factorization, and the term $M = O(\varepsilon^{-2d}b^d)$ corresponds to the complexity of placing $\mathcal T$.
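For concreteness (our arithmetic, using the orders just stated), the total work is
\[
O\big(\varepsilon^{-2}\delta^{-1}M^3 + M\big) = O\big(\varepsilon^{-2}\delta^{-1}\varepsilon^{-6d}b^{3d} + \varepsilon^{-2d}b^d\big) = O\big(\varepsilon^{-2-6d}\,\delta^{-1}\,b^{3d}\big),
\]
polynomial in $b$ for fixed $\varepsilon$ and $\delta$, in contrast with the exponential growth in $b$ of naive Monte Carlo.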

Remark 7.6. Condition B2 imposes a convexity assumption on the boundary of $T$. This assumption, although convenient in the development of the proofs of Theorems 7.4 and 7.5, is not necessary. The results can be generalized, at the expense of increasing the length and the burden of the technical development, to the case in which $T$ is a $d$-dimensional manifold satisfying the so-called Whitney conditions ([4]).

The remainder of this section is devoted to the proofs of Proposition 7.2, Theorem 7.4 and Theorem 7.5.


7.1. Bias Control: Proofs of Proposition 7.2 and Theorem 7.4. We start with some useful lemmas, for all of which we assume that Conditions B1 and B2 are satisfied. We shall also assume that the global maximum of $f$ over $T$ is achieved, with probability one, at a single point in $T$. Additional conditions under which this will happen can be found in [4], and require little more than the non-degeneracy of the joint distribution of $f$ and its first and second order derivatives. Of these lemmas, Lemma 7.7, the proof of which we defer to Section 7.3, is of central importance.

Lemma 7.7. For any $a_0>0$, there exist $c^*$, $\delta^*$, $b_0$ and $\delta_0$ (which depend on the choice of $a_0$) such that, for any $s\in T$, $a\in(0,a_0)$, $\delta\in(0,\delta_0)$ and $b>b_0$,
\[
P\Big(\min_{|t-s|<\delta\sqrt a/b} f(t) < b \,\Big|\, f(s) = b + a/b,\ s\in\mathcal L\Big) \le c^*\exp\Big(-\frac{\delta^*}{\delta^2}\Big),
\]
where $\mathcal L$ is the set of local maxima of $f$.

The next lemma is a direct corollary of Lemma 7.7.

Lemma 7.8. Let $t^*$ be the point in $T$ at which the global maximum of $f$ is attained. Then, with the same choice of constants as in Lemma 7.7, for any $a\in(0,a_0)$, $\delta\in(0,\delta_0)$ and $b>b_0$, we have
\[
P\Big(\min_{|t-t^*|<\delta\sqrt a/b} f(t) < b \,\Big|\, f(t^*) = b + a/b\Big) \le 2c^*\exp\Big(-\frac{\delta^*}{\delta^2}\Big).
\]

Proof. Note that, for any $s\in T$,
\begin{align*}
P\Big(\min_{|t-s|\le\delta\sqrt a/b} f(t) < b \,\Big|\, f(s) = b+a/b,\ s = t^*\Big)
&= \frac{P\big(\min_{|t-s|\le\delta\sqrt a/b} f(t) < b,\ f(s) = b+a/b,\ s = t^*\big)}{P(f(s) = b+a/b,\ s = t^*)}\\
&\le \frac{P\big(\min_{|t-s|\le\delta\sqrt a/b} f(t) < b,\ f(s) = b+a/b,\ s\in\mathcal L\big)}{P(f(s) = b+a/b,\ s = t^*)}\\
&= P\Big(\min_{|t-s|<\delta\sqrt a/b} f(t) < b \,\Big|\, f(s) = b+a/b,\ s\in\mathcal L\Big)\,\frac{P(f(s) = b+a/b,\ s\in\mathcal L)}{P(f(s) = b+a/b,\ s = t^*)}\\
&\le c^*\exp\Big(-\frac{\delta^*}{\delta^2}\Big)\,\frac{P(f(s) = b+a/b,\ s\in\mathcal L)}{P(f(s) = b+a/b,\ s = t^*)}.
\end{align*}
Applying this bound, and the fact that
\[
\frac{P(f(s) = b+a/b,\ s\in\mathcal L)}{P(f(s) = b+a/b,\ s = t^*)} \to 1
\]
as $b\to\infty$, the conclusion of the lemma holds. 2


The following lemma gives a bound on the density of $\sup_{t\in T} f(t)$, which will be used to control the size of the overshoot beyond level $b$.

Lemma 7.9. Let $p_{f^*}(x)$ be the density function of $\sup_{t\in T} f(t)$. Then there exist a constant $c_{f^*}$ and $b_0$ such that
\[
p_{f^*}(x) \le c_{f^*}x^{d+1}P(f(0)>x)
\]
for all $x > b_0$.

Proof. Recalling (1.3), let the continuous function $p_E(x)$, $x\in\mathbb R$, be defined by the relationship
\[
E\big(\chi(\{t\in T : f(t)\ge b\})\big) = \int_b^\infty p_E(x)\,dx,
\]
where the left hand side is the expected value of the Euler-Poincaré characteristic of $A_b$. Then, according to Theorem 8.10 in [7], there exist $c$ and $\delta$ such that
\[
|p_E(x) - p_{f^*}(x)| < cP(f(0) > (1+\delta)x)
\]
for all $x>0$. In addition, thanks to the result of [4], which provides $\int_b^\infty p_E(x)dx$ in closed form, there exists $c_0$ such that, for all $x>1$,
\[
p_E(x) < c_0x^{d+1}P(f(0)>x).
\]
Hence there exists $c_{f^*}$ such that
\[
p_{f^*}(x) \le c_0x^{d+1}P(f(0)>x) + cP(f(0)>(1+\delta)x) \le c_{f^*}x^{d+1}P(f(0)>x)
\]
for all $x>1$. 2

The last ingredients required to provide the proofs of Proposition 7.2 and Theorem 7.4 are stated in the following well known theorem, adapted from Lemma 6.1 and Theorem 7.2 in [19] to the twice differentiable case.

Theorem 7.10. There exists a constant $H$ (depending on the covariance function $C$) such that
\[
(7.4)\qquad P\Big(\sup_{t\in T} f(t) > b\Big) = (1+o(1))\,H\,m(T)\,b^d\,P(f(0)>b)
\]
as $b\to\infty$.

Similarly, choose $\delta$ small enough so that $[0,\delta]^d\subset T$, and let $\Delta_0 = [0,b^{-1}]^d$. Then there exists a constant $H_1$ such that
\[
(7.5)\qquad P\Big(\sup_{t\in\Delta_0} f(t) > b\Big) = (1+o(1))\,H_1\,P(f(0)>b)
\]
as $b\to\infty$.


We are now ready to provide the proofs of Proposition 7.2 and Theorem 7.4.

Proof of Proposition 7.2. The fact that there exists $c_1$ such that
\[
\#(\mathcal T) \le c_1m(T)\varepsilon^{-d}b^d
\]
is immediate from assumption B2. Therefore, we proceed to provide a bound for the relative bias. We choose $\varepsilon_0 < \delta_0^2$ ($\delta_0$ is given in Lemmas 7.7 and 7.8), and note that, for any $\varepsilon<\varepsilon_0$,
\begin{align*}
P\Big(\sup_{t\in\mathcal T} f(t) < b \,\Big|\, \sup_{t\in T} f(t) > b\Big)
\le\ & P\Big(\sup_{t\in T} f(t) < b + 2\sqrt\varepsilon/b \,\Big|\, \sup_{t\in T} f(t) > b\Big)\\
&+ P\Big(\sup_{t\in\mathcal T} f(t) < b \,\Big|\, \sup_{t\in T} f(t) > b + 2\sqrt\varepsilon/b\Big).
\end{align*}
By (7.4) and Lemma 7.9, there exists $c_2$ such that the first term above can be bounded by
\[
P\Big(\sup_{t\in T} f(t) < b + 2\sqrt\varepsilon/b \,\Big|\, \sup_{t\in T} f(t) > b\Big) \le c_2\sqrt\varepsilon.
\]
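To make the application of Lemma 7.9 and (7.4) explicit (a step we spell out here; the original argument leaves it implicit), note that the window $[b, b+2\sqrt\varepsilon/b]$ has width $2\sqrt\varepsilon/b$, the density bound of Lemma 7.9 and the monotonicity of $P(f(0)>x)$ apply throughout it, and the denominator is given by (7.4):
\[
\frac{\int_b^{\,b+2\sqrt\varepsilon/b} p_{f^*}(x)\,dx}{P\big(\sup_{t\in T} f(t) > b\big)}
\le \frac{\frac{2\sqrt\varepsilon}{b}\,c_{f^*}\,(b+1)^{d+1}\,P(f(0)>b)}{(1+o(1))\,H\,m(T)\,b^d\,P(f(0)>b)} = O(\sqrt\varepsilon\,).
\]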

The second term can be bounded by
\begin{align*}
P\Big(\sup_{t\in\mathcal T} f(t) < b \,\Big|\, \sup_{t\in T} f(t) > b + 2\sqrt\varepsilon/b\Big)
&\le P\Big(\min_{|t-t^*|<2\varepsilon/b} f(t) < b \,\Big|\, \sup_{t\in T} f(t) > b + 2\sqrt\varepsilon/b\Big)\\
&\le 2c^*\exp\big(-\delta^*\varepsilon^{-1}\big).
\end{align*}
Hence there exists $c_0$ such that
\[
P\Big(\sup_{t\in\mathcal T} f(t) < b \,\Big|\, \sup_{t\in T} f(t) > b\Big) \le c_2\sqrt\varepsilon + 2c^*e^{-\delta^*/\varepsilon} \le c_0\sqrt\varepsilon
\]
for all $\varepsilon\in(0,\varepsilon_0)$. 2

Proof of Theorem 7.4. We write $\theta = 1-3\delta\in(0,1)$. First note that, by (7.4),
\[
P\Big(\sup_T f(t) > b + b^{2\delta-1} \,\Big|\, \sup_T f(t) > b\Big) \to 0
\]
as $b\to\infty$. Let $t^*$ be the position of the global maximum of $f$ in $T$. According to the exact Slepian model in Section 7.3, and an argument similar to the proof of Lemma 7.8,
\[
(7.6)\qquad P\Big(\sup_{|t-t^*|>b^{2\delta-1}} f(t) > b \,\Big|\, b < f(t^*) \le b + b^{2\delta-1}\Big) \to 0
\]
as $b\to\infty$. Consequently,
\[
P\Big(\sup_{|t-t^*|>b^{2\delta-1}} f(t) > b \,\Big|\, \sup_T f(t) > b\Big) \to 0.
\]

Let
\[
B(\mathcal T, b^{2\delta-1}) = \bigcup_{t\in\mathcal T} B(t, b^{2\delta-1}).
\]
We have
\begin{align*}
P\Big(\sup_{\mathcal T} f(t) > b \,\Big|\, \sup_T f(t) > b\Big)
\le\ & P\Big(\sup_{|t-t^*|>b^{2\delta-1}} f(t) > b \,\Big|\, \sup_T f(t) > b\Big)\\
&+ P\Big(\sup_{\mathcal T} f(t) > b,\ \sup_{|t-t^*|>b^{2\delta-1}} f(t) \le b \,\Big|\, \sup_T f(t) > b\Big)\\
\le\ & o(1) + P\Big(t^*\in B(\mathcal T, b^{2\delta-1}) \,\Big|\, \sup_T f(t) > b\Big)\\
\le\ & o(1) + P\Big(\sup_{B(\mathcal T, b^{2\delta-1})} f(t) > b \,\Big|\, \sup_T f(t) > b\Big).
\end{align*}
Since $\#(\mathcal T)\le b^{(1-3\delta)d}$, we can find a finite set $\mathcal T' = \{t_1',\dots,t_l'\}\subset T$, and let $\Delta_k = t_k' + [0,b^{-1}]^d$, such that $l = O(b^{(1-\delta)d})$ and $B(\mathcal T, b^{2\delta-1})\subset\cup_{k=1}^l\Delta_k$. The choice of $l$ depends only on $\#(\mathcal T)$, not on the particular distribution of $\mathcal T$. Therefore, applying (7.5),
\[
\sup_{\#(\mathcal T)\le b^{\theta d}} P\Big(\sup_{B(\mathcal T, b^{2\delta-1})} f(t) > b\Big) \le O(b^{(1-\delta)d})\,P(f(0)>b).
\]
This, together with (7.4), yields
\[
\sup_{\#(\mathcal T)\le b^{\theta d}} P\Big(\sup_{B(\mathcal T, b^{2\delta-1})} f(t) > b \,\Big|\, \sup_T f(t) > b\Big) \le O(b^{-\delta d}) = o(1)
\]
for $b\ge b_0$, which clearly implies the statement of the result. 2

7.2. Variance Control: Proof of Theorem 7.5. We proceed directly to the proof of Theorem 7.5.


Proof of Theorem 7.5. Note that
\begin{align*}
\frac{E^QL_b^2}{P^2\big(\sup_{t\in T} f(t)>b\big)}
&= \frac{E(L_b)}{P^2\big(\sup_{t\in T} f(t)>b\big)}\\
&= \frac{E\Big(\frac{M\times P(X_1>b-1/b)}{\sum_{j=1}^M 1(X_j>b-1/b)};\ \max_j X_j > b\Big)}{P^2\big(\sup_{t\in T} f(t)>b\big)}\\
&= \frac{E\Big(\frac{M\times P(X_1>b-1/b)}{\sum_{j=1}^M 1(X_j>b-1/b)};\ \max_j X_j > b,\ \sup_{t\in T} f(t)>b\Big)}{P^2\big(\sup_{t\in T} f(t)>b\big)}\\
&= \frac{E\Big(\frac{M\,P(X_1>b-1/b)\,1(\max_j X_j>b)}{\sum_{j=1}^M 1(X_j>b-1/b)}\ \Big|\ \sup_{t\in T} f(t)>b\Big)}{P\big(\sup_{t\in T} f(t)>b\big)}\\
&= E\Big(\frac{M\,1(\max_j X_j>b)}{\sum_{j=1}^M 1(X_j>b-1/b)}\ \Big|\ \sup_{t\in T} f(t)>b\Big)\,\frac{P(X_1>b-1/b)}{P\big(\sup_{t\in T} f(t)>b\big)}.
\end{align*}

The remainder of the proof involves showing that the last conditional expectation here is of order $O(b^d)$. This, together with (7.4), will yield the result. Note that, for any $A(b,\varepsilon)$ such that $A(b,\varepsilon) = \Theta(M)$ uniformly over $b$ and $\varepsilon$, we can write
\begin{align*}
(7.7)\qquad
&E\Big(\frac{M\times 1(\max_j X_j>b)}{\sum_{j=1}^M 1(X_j>b-1/b)}\ \Big|\ \sup_{t\in T} f(t)>b\Big)\\
&\quad\le E\Bigg(\frac{M\times 1\Big(\sum_{j=1}^M 1(X_j>b-1/b)\ge M/A(b,\varepsilon)\Big)}{\sum_{j=1}^M 1(X_j>b-1/b)}\ \Bigg|\ \sup_{t\in T} f(t)>b\Bigg)\\
&\qquad + E\Bigg(\frac{M\times 1\Big(1\le\sum_{j=1}^M 1(X_j>b-1/b) < M/A(b,\varepsilon)\Big)}{\sum_{j=1}^M 1(X_j>b-1/b)}\ \Bigg|\ \sup_{t\in T} f(t)>b\Bigg).
\end{align*}

We shall select $A(b,\varepsilon)$ appropriately in order to bound the expectations above. By Lemma 7.8, for any $4\varepsilon\le\delta\le\delta_0$, there exist constants $c'$ and $c''\in(0,\infty)$ such that
\begin{align*}
c^*\exp\Big(-\frac{\delta^*}{\delta^2}\Big)
&\ge P\Big(\min_{|t-t^*|\le\delta/b} f(t) < b-1/b\ \Big|\ \sup_{t\in T} f(t)>b\Big)\\
&\ge P\Big(\sum_{j=1}^M 1(X_j>b-1/b)\le c'\delta^d\varepsilon^{-d}\ \Big|\ \sup_{t\in T} f(t)>b\Big)\\
(7.8)\qquad
&\ge P\Big(\frac{M}{\sum_{j=1}^M 1(X_j>b-1/b)}\ge\frac{b^dc''}{\delta^d}\ \Big|\ \sup_{t\in T} f(t)>b\Big).
\end{align*}

The first inequality is an application of Lemma 7.8. The second inequality is due to the fact that, for any ball $B$ of radius $\delta/b\ge4\varepsilon/b$, $\#(\mathcal T\cap B)\ge c'\delta^d\varepsilon^{-d}$ for some $c'>0$. Inequality (7.8) implies that, for all $x$ such that $c''/\delta_0^d < x < c''/(4^d\varepsilon^d)$, there exists $\delta^{**}>0$ such that
\[
P\Big(\frac{M}{b^d\sum_{j=1}^M 1(X_j>b-1/b)}\ge x\ \Big|\ \sup_{t\in T} f(t)>b\Big) \le c^*\exp\big(-\delta^{**}x^{2/d}\big).
\]
Now let $A(b,\varepsilon) = b^dc''/(4^d\varepsilon^d)$, and observe that, by the second result in (7.1), we have $A(b,\varepsilon) = \Theta(M)$; moreover, integrating the tail bound above shows that there exists $c_3$ such that the first term on the right hand side of (7.7) is bounded by
\[
(7.9)\qquad E\Bigg(\frac{M\times 1\Big(\sum_{j=1}^M 1(X_j>b-1/b)\ge M/A(b,\varepsilon)\Big)}{\sum_{j=1}^M 1(X_j>b-1/b)}\ \Bigg|\ \sup_T f(t)>b\Bigg) \le c_3b^d.
\]

Now we turn to the second term on the right hand side of (7.7). We use the fact that $M/A(b,\varepsilon)\le c'''\in(0,\infty)$ (uniformly as $b\to\infty$ and $\varepsilon\to0$). There exist $c_4$ and $c_5$ such that, for $\varepsilon\le\delta_0/c_4$,
\begin{align*}
&E\Bigg(\frac{M\times 1\Big(1\le\sum_{j=1}^M 1(X_j>b-1/b) < M/A(b,\varepsilon)\Big)}{\sum_{j=1}^M 1(X_j>b-1/b)}\ \Bigg|\ \sup_T f(t)>b\Bigg)\\
(7.10)\qquad
&\quad\le M\,P\Big(\sum_{j=1}^M 1(X_j>b-1/b) < c'''\ \Big|\ \sup_T f(t)>b\Big)\\
&\quad\le M\,P\Big(\min_{|t-t^*|\le c_4\varepsilon/b} f(t) < b-1/b\ \Big|\ \sup_T f(t)>b\Big)\\
&\quad\le c_1m(T)b^d\varepsilon^{-d}\exp\Big(-\frac{\delta^*}{c_4^2\varepsilon^2}\Big) \le c_5b^d.
\end{align*}
The second inequality holds from the fact that, if $\sum_{j=1}^M 1(X_j>b-1/b)$ is less than $c'''$, then the minimum of $f$ in a ball around the global maximum of radius $c_4\varepsilon/b$ must be less than $b-1/b$; otherwise, there would be more than $c'''$ elements of $\mathcal T$ inside such a ball. The last inequality is due to Lemma 7.8 and Proposition 7.2.

Putting (7.9) and (7.10) together, we obtain, for all $\varepsilon/b$-regular discretizations with $\varepsilon<\varepsilon_0 = \min(1/4, 1/c_4)\delta_0$,
\[
\frac{E^QL_b^2}{P^2\big(\sup_{t\in T} f(t)>b\big)} \le E\Big(\frac{M\,1(\max_j X_j>b)}{\sum_{j=1}^M 1(X_j>b-1/b)}\ \Big|\ \sup_{t\in T} f(t)>b\Big)\,\frac{P(X_1>b-1/b)}{P\big(\sup_{t\in T} f(t)>b\big)} \le \frac{(c_3+c_5)\,b^d\,P(X_1>b-1/b)}{P\big(\sup_{t\in T} f(t)>b\big)},
\]
and the right hand side is bounded uniformly, by (7.4). Applying now (7.4) and Proposition 7.2, which give
\[
P\Big(\sup_{t\in T} f(t)>b\Big) < \frac{P\big(\sup_{t\in\mathcal T} f(t)>b\big)}{1 - c_0\sqrt\varepsilon},
\]
we conclude that
\[
\sup_{b>b_0,\,\varepsilon\in[0,\varepsilon_0]}\frac{E^QL_b^2}{P^2\big(\sup_{t\in\mathcal T} f(t)>b\big)} < \infty,
\]
as required. 2

7.3. A Final Proof: Lemma 7.7. Without loss of generality, we assume that the field has mean zero and unit variance. Before getting into the details of the proof of Lemma 7.7, we need a few additional lemmas, for which we adopt the following notation. Let $C_i$ and $C_{ij}$ be the first and second order derivatives of $C$, and define the vectors
\begin{align*}
\mu_1(t) &= (-C_1(t),\dots,-C_d(t)),\\
\mu_2(t) &= \mathrm{vech}\big((C_{ij}(t):\ i=1,\dots,d,\ j=i,\dots,d)\big).
\end{align*}
Let $f'(0)$ and $f''(0)$ be the gradient and vector of second order derivatives of $f$ at $0$, where $f''(0)$ is arranged in the same order as $\mu_2(0)$. Furthermore, let $\mu_{02} = \mu_{20}^\top$ be a vector of second order spectral moments and $\mu_{22}$ a matrix of fourth order spectral moments. The vectors $\mu_{02}$ and $\mu_{22}$ are arranged so that
\[
\begin{pmatrix} 1 & 0 & \mu_{02}\\ 0 & \Lambda & 0\\ \mu_{20} & 0 & \mu_{22}\end{pmatrix}
\]
is the covariance matrix of $(f(0), f'(0), f''(0))$, where $\Lambda = (-C_{ij}(0))$. It then follows that
\[
\mu_{2\cdot0} = \mu_{22} - \mu_{20}\mu_{02}
\]
is the conditional variance of $f''(0)$ given $f(0)$. The following lemma, given in [5], provides a stochastic representation of $f$ given that it has a local maximum at level $u$ at the origin.

Lemma 7.11. Given that $f$ has a local maximum with height $u$ at zero (an interior point of $T$), the conditional field is equal in distribution to
\[
(7.11)\qquad f_u(t) = uC(t) - W_u\beta^\top(t) + g(t),
\]
where $g(t)$ is a centered Gaussian random field with covariance function
\[
\gamma(s,t) = C(s-t) - (C(s),\mu_2(s))\begin{pmatrix}1 & \mu_{02}\\ \mu_{20} & \mu_{22}\end{pmatrix}^{-1}\begin{pmatrix}C(t)\\ \mu_2^\top(t)\end{pmatrix} - \mu_1(s)\Lambda^{-1}\mu_1^\top(t),
\]
and $W_u$ is a $d(d+1)/2$-dimensional random vector, independent of $g(t)$, with density function
\[
(7.12)\qquad \psi_u(w) \propto |\det(r^*(w) - u\Lambda)|\exp\Big(-\frac12 w^\top\mu_{2\cdot0}^{-1}w\Big)1\big(r^*(w) - u\Lambda\in\mathcal N\big),
\]
where $r^*(w)$ is a $d\times d$ symmetric matrix whose upper triangular elements consist of the components of $w$. The set of negative definite matrices is denoted by $\mathcal N$. Finally, $\beta(t)$ is defined by
\[
(\alpha(t),\beta(t)) = (C(t),\mu_2(t))\begin{pmatrix}1 & \mu_{02}\\ \mu_{20} & \mu_{22}\end{pmatrix}^{-1}.
\]
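To illustrate (7.12) concretely (a numerical sketch of ours, under the unit-variance normalization of this section): when $d=1$, $r^*(w) = w$, $\Lambda = \lambda_2 = -C''(0)$ and $\mu_{2\cdot0} = \lambda_4 - \lambda_2^2$, where $\lambda_4 = C''''(0)$ is the fourth spectral moment, so $W_u$ is a scalar with density proportional to $|w - u\lambda_2|\,e^{-w^2/(2(\lambda_4-\lambda_2^2))}$ on $\{w < u\lambda_2\}$, which can be sampled by tabulated CDF inversion.

```python
import numpy as np

def sample_Wu_1d(u, lam2, lam4, n, grid_pts=20001, rng=np.random.default_rng(1)):
    """Sample W_u for d = 1, where (7.12) reduces to
    psi_u(w) ∝ |w - u*lam2| * exp(-w^2 / (2*(lam4 - lam2**2))) * 1{w < u*lam2}.
    Inverse-CDF sampling on a grid; requires lam4 > lam2**2."""
    s2 = lam4 - lam2 ** 2                       # mu_{2.0} in dimension one
    w = np.linspace(-10.0 * np.sqrt(s2), u * lam2, grid_pts)
    dens = np.abs(w - u * lam2) * np.exp(-w ** 2 / (2.0 * s2))
    cdf = np.cumsum(dens)
    cdf /= cdf[-1]                              # normalize the tabulated CDF
    return np.interp(rng.uniform(size=n), cdf, w)

# e.g. draws = sample_Wu_1d(u=5.0, lam2=1.0, lam4=3.0, n=10)
```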

The following two technical lemmas, which we shall prove after completing the proof of Lemma 7.7, provide bounds for the last two terms of (7.11).

Lemma 7.12. Using the notation in (7.11), there exist $\delta_0$, $\varepsilon_1$, $c_1$ and $b_0$ such that, for any $u>b>b_0$ and $\delta\in(0,\delta_0)$,
\[
P\Big(\sup_{|t|\le\delta\sqrt a/b}\big|W_u\beta^\top(t)\big| > \frac a{4b}\Big) \le c_1\exp\Big(-\frac{\varepsilon_1b^2}{\delta^4}\Big).
\]

Lemma 7.13. There exist $c$, $\bar\delta$ and $\delta_0$ such that, for any $\delta\in(0,\delta_0)$,
\[
P\Big(\max_{|t|\le\delta\sqrt a/b}|g(t)| > \frac a{4b}\Big) \le c\exp\Big(-\frac{\bar\delta}{\delta^2}\Big).
\]

Proof of Lemma 7.7. Using the notation of Lemma 7.11, given any $s\in\mathcal L$ for which $f(s) = b + a/b$, the corresponding conditional distribution of $f$ is that of $f_b(\cdot - s)$, up to the overshoot $a/b$. Consequently, it suffices to show that
\[
P\Big(\min_{|t|\le\delta\sqrt a/b} f_b(t) < b - a/b\Big) \le c^*\exp\Big(-\frac{\delta^*}{\delta^2}\Big).
\]
We consider firstly the case for which the local maximum is in the interior of $T$. Then, by the Slepian model (7.11),
\[
f_b(t) = bC(t) - W_b\beta^\top(t) + g(t).
\]
We study the three terms of the Slepian model individually. Since
\[
C(t) = 1 - t^\top\Lambda t + o\big(|t|^2\big),
\]
there exists an $\varepsilon_0$ such that
\[
bC(t) \ge b - \frac a{4b}
\]
for all $|t| < \varepsilon_0\sqrt a/b$. According to Lemmas 7.12 and 7.13, for $\delta < \min(\varepsilon_0, \delta_0)$,
\begin{align*}
P\Big(\min_{|t|\le\delta\sqrt a/b} f_b(t) < b - a/b\Big)
&\le P\Big(\max_{|t|<\delta\sqrt a/b}|g(t)| > \frac a{4b}\Big) + P\Big(\sup_{|t|\le\delta\sqrt a/b}\big|W_b\beta^\top(t)\big| > \frac a{4b}\Big)\\
&\le c\exp\Big(-\frac{\bar\delta}{\delta^2}\Big) + c_1\exp\Big(-\frac{\varepsilon_1b^2}{\delta^4}\Big)\\
&\le c^*\exp\Big(-\frac{\delta^*}{\delta^2}\Big)
\end{align*}

for some $c^*$ and $\delta^*$.

Now consider the case for which the local maximum is on the $(d-1)$-dimensional boundary of $T$. Due to convexity of $T$, and without loss of generality, we assume that the tangent space of $\partial T$ is generated by $\partial/\partial t_2,\dots,\partial/\partial t_d$, the local maximum is located at the origin, and $T$ is a subset of the positive half-space $\{t_1\ge0\}$. That these arguments do not involve a loss of generality follows from the arguments on pages 291–292 of [4], which rely on the assumed stationarity of $f$ (for translations) and the fact that rotations, while changing the distributions, will not affect the probabilities that we are currently computing.

For the origin, positioned as just described, to be a local maximum, it is necessary and sufficient that the gradient of $f$ restricted to $\partial T$ is the zero vector, the Hessian matrix restricted to $\partial T$ is negative definite, and $\partial_1 f(0)\le0$. Applying a version of Lemma 7.11 for this case, conditional on $f(0) = u$ and $0$ being a local maximum, the field is equal in distribution to
\[
(7.13)\qquad uC(t) - W_u\beta^\top(t) + \mu_1(t)\Lambda^{-1}(Z,0,\dots,0)^\top + g(t),
\]
where $Z\le0$ corresponds to $\partial_1 f(0)$, and it follows a truncated (conditioned on the negative axis) Gaussian distribution with mean zero and a variance parameter which is computed as the conditional variance of $\partial_1 f(0)$ given $(\partial_2 f(0),\dots,\partial_d f(0))$. The vector $W_u$ is a $(d-1)d/2$-dimensional random vector with density function
\[
\psi_u(w) \propto |\det(r^*(w) - u\Lambda)|\exp\Big(-\frac12w^\top\mu_{2\cdot0}^{-1}w\Big)1\big(r^*(w) - u\Lambda\in\mathcal N\big),
\]
where $\Lambda$ is here the second spectral moment matrix of $f$ restricted to $\partial T$, and $r^*(w)$ is the $(d-1)\times(d-1)$ symmetric matrix whose upper triangular elements consist of the components of $w$. In the representation (7.13), the vector $W_u$ and $Z$ are independent. As in the proof of Lemma 7.12, one can show that a similar bound holds, albeit with different constants. Thus, since $\mu_1(t) = O(t)$, there exist $c''$ and $\delta''$ such that the third term in (7.13) can be bounded by
\[
P\Big(\max_{|t|\le\delta\sqrt a/b}\big|\mu_1(t)\Lambda^{-1}(Z,0,\dots,0)^\top\big| \ge \frac a{4b}\Big) \le c''\exp\Big(-\frac{\delta''}{\delta^2}\Big).
\]
Consequently, we can also find $c^*$ and $\delta^*$ such that the conclusion holds, and we are done. 2

We complete the paper with the proofs of Lemmas 7.12 and 7.13.

Proof of Lemma 7.12. It suffices to prove the lemma for the case $a = 1$. Since
\[
f_u(0) = u = u - W_u\beta^\top(0) + g(0),
\]
and $W_u$ and $g$ are independent, $\beta(0) = 0$. Furthermore, since $C'(t) = O(t)$ and $\mu_2'(t) = O(t)$, there exists a $c_0$ such that $|\beta(t)|\le c_0|t|^2$. In addition, $W_u$ has density function proportional to
\[
\psi_u(w) \propto |\det(r^*(w) - u\Lambda)|\exp\Big(-\frac12w^\top\mu_{2\cdot0}^{-1}w\Big)1\big(r^*(w) - u\Lambda\in\mathcal N\big).
\]
Note that $\det(r^*(w) - u\Lambda)$ is expressible as a polynomial in $w$ and $u$, and there exist some $\varepsilon_0$ and $c$ such that
\[
\Big|\frac{\det(r^*(w) - u\Lambda)}{\det(-u\Lambda)}\Big| \le c
\]
if $|w|\le\varepsilon_0u$. Hence there exist $\varepsilon_2, c_2 > 0$ such that
\[
\psi_u(w) \le \bar\psi(w) = c_2\exp\Big(-\frac12\varepsilon_2w^\top\mu_{2\cdot0}^{-1}w\Big)
\]
for all $u\ge1$. The right hand side here is proportional to a multivariate Gaussian density. Thus,
\[
P(|W_u|>x) = \int_{|w|>x}\psi_u(w)\,dw \le \int_{|w|>x}\bar\psi(w)\,dw = c_3P(|W|>x),
\]
where $W$ is a multivariate Gaussian random vector with density function proportional to $\bar\psi$. Therefore, by choosing $\varepsilon_1$ and $c_1$ appropriately, we have
\[
P\Big(\sup_{|t|\le\delta/b}\big|W_u\beta^\top(t)\big| > \frac1{4b}\Big) \le P\Big(|W_u| > \frac b{4c_0\delta^2}\Big) \le c_1\exp\Big(-\frac{\varepsilon_1b^2}{\delta^4}\Big)
\]
for all $u\ge b$. 2

Proof of Lemma 7.13. Once again, it suffices to prove the lemma for the case $a = 1$. Since
\[
f_b(0) = b = b - W_b\beta^\top(0) + g(0),
\]
the covariance function $(\gamma(s,t):\ s,t\in T)$ of the centered field $g$ satisfies $\gamma(0,0) = 0$. It is also easy to check that
\[
\partial_s\gamma(s,t) = O(|s|+|t|),\qquad \partial_t\gamma(s,t) = O(|s|+|t|).
\]
Consequently, there exists a constant $c_\gamma\in(0,\infty)$ for which
\[
\gamma(s,t) \le c_\gamma\big(|s|^2+|t|^2\big),\qquad \gamma(s,s) \le c_\gamma|s|^2.
\]
We need to control the tail probability of $\sup_{|t|\le\delta/b}|g(t)|$, and for this it is useful to introduce the following scaling. Define
\[
g_\delta(t) = \frac b\delta\, g\Big(\frac{\delta t}b\Big).
\]
Then $\sup_{|t|\le\delta/b} g(t) \ge 1/(4b)$ if and only if $\sup_{|t|\le1} g_\delta(t) \ge 1/(4\delta)$. Let
\[
\sigma_\delta(s,t) = E\big(g_\delta(s)g_\delta(t)\big).
\]
Then
\[
\sup_{|s|\le1}\sigma_\delta(s,s) \le c_\gamma.
\]
Because $\gamma(s,t)$ is at least twice differentiable, applying a Taylor expansion we easily see that the canonical metric $d_g$ corresponding to $g_\delta$ (cf. Theorem 6.7) can be bounded as follows:
\[
d_g^2(s,t) = E\big(g_\delta(s) - g_\delta(t)\big)^2 = \frac{b^2}{\delta^2}\Big[\gamma\Big(\frac{\delta s}b,\frac{\delta s}b\Big) + \gamma\Big(\frac{\delta t}b,\frac{\delta t}b\Big) - 2\gamma\Big(\frac{\delta s}b,\frac{\delta t}b\Big)\Big] \le c\,|s-t|^2
\]
for some constant $c\in(0,\infty)$. Therefore the entropy $N(\epsilon)$ of $g_\delta$ is bounded by $K\epsilon^{-d}$ for any $\epsilon>0$, with an appropriate choice of $K>0$. Therefore, for all $\delta<\delta_0$,
\[
P\Big(\sup_{|t|\le\delta/b}|g(t)| \ge \frac1{4b}\Big) = P\Big(\sup_{|t|\le1}|g_\delta(t)| \ge \frac1{4\delta}\Big) \le c_d\delta^{-d-\eta}\exp\Big(-\frac1{16c_\gamma\delta^2}\Big)
\]
for some constant $c_d$ and $\eta>0$. The last inequality is a direct application of Theorem 4.1.1 of [4]. The conclusion of the lemma follows immediately, by choosing $c$ and $\bar\delta$ appropriately. 2

REFERENCES

[1] R.J. Adler, K. Bartz, and S. Kou. Estimating curvatures in the Gaussian kinematic formula. In preparation.
[2] R.J. Adler, J. Blanchet, and J. Liu. Efficient simulation for tail probabilities of Gaussian random fields. In WSC '08: Proceedings of the 40th Conference on Winter Simulation, pages 328–336. Winter Simulation Conference, 2008.
[3] R.J. Adler, P. Müller, and B. Rozovskii, editors. Stochastic Modelling in Physical Oceanography. Birkhäuser, Boston, 1996.
[4] R.J. Adler and J.E. Taylor. Random Fields and Geometry. Springer, New York, USA, 2007.
[5] R.J. Adler, J.E. Taylor, and K. Worsley. Applications of Random Fields and Geometry: Foundations and Case Studies. http://webee.technion.ac.il/people/adler/hrf.pdf.
[6] S. Asmussen and P. Glynn. Stochastic Simulation: Algorithms and Analysis. Springer-Verlag, New York, NY, USA, 2008.
[7] J.M. Azais and M. Wschebor. Level Sets and Extrema of Random Processes and Fields. Wiley, New York, NY, 2009.
[8] J.M. Bardeen, J.R. Bond, N. Kaiser, and A.S. Szalay. The statistics of peaks of Gaussian random fields. The Astrophysical Journal, 304:15–61, 1986.
[9] S.M. Berman. Sojourns and Extremes of Stochastic Processes. The Wadsworth & Brooks/Cole Statistics/Probability Series. Wadsworth & Brooks/Cole Advanced Books & Software, Pacific Grove, CA, 1992.
[10] J. Blanchet. Efficient importance sampling for binary contingency tables. Ann. of Appl. Prob., 19:949–982, 2009.
[11] C. Borell. The Brunn-Minkowski inequality in Gauss space. Invent. Math., 30:205–216, 1975.
[12] J. Bucklew. Introduction to Rare-Event Simulation. Springer-Verlag, New York, NY, 2004.
[13] M.R. Dennis. Nodal densities of planar Gaussian random waves. Eur. Phys. J., 145:191–210, 2007.
[14] R.M. Dudley. Sample functions of the Gaussian process. Ann. Probab., 1:66–103, 1973.
[15] K.J. Friston, K.J. Worsley, R.S.J. Frackowiak, J.C. Mazziotta, and A.C. Evans. Assessing the significance of focal activations using their spatial extent. Human Brain Mapping, 1:214–220, 1994.
[16] M.R. Leadbetter, G. Lindgren, and H. Rootzén. Extremes and Related Properties of Random Sequences and Processes. Springer-Verlag, New York, 1983.
[17] M. Mitzenmacher and E. Upfal. Probability and Computing: Randomized Algorithms and Probabilistic Analysis. Cambridge University Press, 2005.
[18] W. Niemiro and P. Pokarowski. Fixed precision MCMC estimation by median of products of averages. J. Appl. Probab., 46:309–329, 2009.
[19] V. Piterbarg. Asymptotic Methods in the Theory of Gaussian Processes. American Mathematical Society, USA, 1995.
[20] S.F. Shandarin. Testing non-Gaussianity in cosmic microwave background maps by morphological statistics. Mon. Not. R. Astr. Soc., 331:865–874, 2002.
[21] S.F. Shandarin, H.A. Feldman, Y. Xu, and M. Tegmark. Morphological measures of non-gaussianity in cosmic microwave background maps. Astrophys. J. Suppl., 141:1–11, 2002.
[22] L. Shepp and H. Landau. On the supremum of a Gaussian process. Sankhyā A, 32:369–378, December 1970.
[23] J.E. Taylor, A. Takemura, and R.J. Adler. Validity of the expected Euler characteristic heuristic. Ann. Probab., 33(4):1362–1396, 2005.
[24] J.E. Taylor and K.J. Worsley. Detecting sparse signals in random fields, with an application to brain mapping. J. Amer. Statist. Assoc., 102(479):913–928, 2007.
[25] J. Traub, G. Wasilokowski, and H. Wozniakowski. Information-Based Complexity. Academic Press, New York, NY, 1988.
[26] B.S. Tsirelson, I.A. Ibragimov, and V.N. Sudakov. Norms of Gaussian sample functions. In Proceedings of the Third Japan-USSR Symposium on Probability Theory (Tashkent, 1975), volume 550, pages 20–41, 1976.
[27] V.S. Tsirel'son. The density of the maximum of a Gaussian process. Theory of Probab. Appl., 20:847–856, 1975.
[28] H. Wozniakowski. Computational complexity of continuous problems. Technical report, 1996.

Robert J. Adler
Electrical Engineering
Technion, Haifa, Israel 32000
E-mail: robert@ee.technion.ac.il
URL: webee.technion.ac.il/people/adler

Jose H. Blanchet
Industrial Engineering & Operations Research
Columbia University
340 S.W. Mudd Building, 500 W. 120 Street
New York, New York 10027
E-mail: jose.blanchet@columbia.edu
URL: www.ieor.columbia.edu/fac-bios/blanchet/faculty.html

Jingchen Liu
Department of Statistics
Columbia University
1255 Amsterdam Avenue, Room 1030
New York, NY 10027
E-mail: jcliu@stat.columbia.edu
URL: stat.columbia.edu/~jcliu

Page 15: EFFICIENT MONTE CARLO FOR HIGH EXCURSIONS OF … › ~jblanche › papers › GRF1.pdf · 1. Introduction. This paper centers on the design and analysis of e–cient Monte Carlo techniques

MONTE CARLO FOR RANDOM FIELDS 15

Algorithm 61 The algorithm has three steps

Step 1 Simulate U (1) U (n) which are n iid copies of the vector U = (U1 UM )described aboveStep 2 Conditional on each U (i) for i = 1 n generate L

(i)b (U (i)) as described

by (66) by considering the distribution of X(i)(U (i)) = (X(i)1 (U (i)

1 ) X(i)M (U (i)

M ))Generate the X(i)(U (i)) independently so that at the end we obtain that the L

(i)b (U (i))

are n iid copies of Lb(U)Step 3 Output

Ln(U (1) U (n)) =1n

nsum

i=1

L(i)b (U (i))

63 Running Time of Algorithm 61 Bias and Variance Control The remain-der of this section is devoted to the analysis of the running time of the Algorithm61 The first step lies in estimating the bias and second moment of Lb(U) underthe change of measure induced by the sampling strategy of the algorithm whichwe denote by Qprimeprime We start with a simple bound for the second moment

Proposition 62 There exists a finite λ0 depending on microT = maxtisinT |micro (t)|and σ2

T = maxtisinT σ2 (t) for which

EQprimeprime [Lb(U)2] le λ0M2P

(maxtisinT

f (t) gt b

)2

Proof Observe that

EQprimeprime [Lb(U)2] le E

(Msum

i=1

P (Xi gt γab|Ui)

)2

le E

(Msum

i=1

suptisinT

P (f (Ui) gt γab|Ui = t)

)2

= M2 maxtisinT

P (f (t) gt γab)2

le λ0M2 max

tisinTP (f (t) gt b)2

le λ0M2P

(maxtisinT

f (t) gt b

)2

which completes the proof 2

Next we obtain a preliminary estimate of the bias

16 ADLER BLANCHET LIU

Proposition 63 For each M ge 1 we have

|w (b)minus wM (b)| le E[exp

(minusMm(Aγab

)m(T )

) Ab cap T 6= empty]

+ P

(maxtisinT

f (t) gt γab maxtisinT

f (t) le b

)

Proof Note that

|w (b)minus wM (γab)|le P (max

tisinTM

f (t) le γab maxtisinT

f (t) gt b) + P (maxtisinTM

f (t) gt γab maxtisinT

f (t) le b)

The second term is easily bounded by

P (maxtisinTM

f (t) gt γabmaxtisinT

f (t) le b) le P (maxtisinT

f (t) gt γab maxtisinT

f (t) le b)

The first term can be bounded as follows

P

(maxtisinTM

f (t) le γabmaxtisinT

f (t) gt b

)

le E[(P [f (Ui) le γab| f ])M 1(Ab cap T 6= empty)

]

le E[(

1minusm(Aγab)m (T )

)M Ab cap T 6= empty)]

le E[exp

(minusMm(Aγab)m (T )

) Ab cap T 6= empty)

]

This completes the proof 2

The previous proposition shows that controlling the relative bias of Lb(U) re-quires finding bounds for

(67) E[exp(minusMm(Aγab

)m(T )) Ab cap T 6= empty]

and

(68) P (maxtisinT

f (t) gt γabmaxtisinT

f (t) le b)

and so we develop these To bound (67) we take advantage of the importancesampling strategy based on Qprime introduced earlier in (64) Write

E[exp

(minusMm(Aγab)m (T )

) m(Ab) gt 0

](69)

= EQprime(

exp(minusMm(Aγab

)m (T ))

m(Aγab)

m(Ab) gt 0

)Em

(Aγab

)

MONTE CARLO FOR RANDOM FIELDS 17

Furthermore note that for each α gt 0 we have

EQprime(

exp(minusMm

(Aγab

)m (T )

)

m(Aγab

) m(Ab) gt 0

)(610)

le αminus1 exp (minusMαm(T ))

+ EQprime(

exp(minusMm(Aγab

)m(T ))

m(Aγab

) m(Aγab) le αm(Ab) gt 0

)

The next result whose proof is given in Section 64 gives a bound for the aboveexpectation

Proposition 64 Let β be as in Conditions A2 and A3 For any v gt 0 thereexist constants κ λ2 isin (0infin) (independent of a isin (0 1) and b but dependent on v)such that if we select

αminus1 ge κdβ(ba)(2+v)2dβ

and define W such that P (W gt x) = exp(minusxβd

)for x ge 0 then

EQprime(

exp(minusMm(Aγab

)m (T ))

m(Aγab)

m(Aγab) le αm(Ab) gt 0

)(611)

le EW 2 m (T )

λ2dβ2 M

(b

a

)4dβ

The following result gives us a useful upper bound on (68) The proof is givenin Section 65

Proposition 65 Assume that Conditions A2 and A3 are in force For anyv gt 0 let ρ = 2dβ + dv + 1 where d is the dimension of T There exist constantsb0 λ isin (0infin) (independent of a but depending on microT = maxtisinT |micro (t)| σ2

T =maxtisinT σ2 (t) v the Holder parameters β and κH) so that for all b ge b0 ge 1 wehave

(612) P

(maxtisinT

f(t) le b + ab | maxtisinT

f(t) gt b

)le λabρ

Consequently

P

(maxtisinT

f (t) gt γab maxtisinT

f (t) le b

)

= P

(maxtisinT

f (t) le b|maxtisinT

f (t) gt γab

)P

(maxtisinT

f(t) gt γab

)

le λabρP

(maxtisinT

f(t) gt γab

)

18 ADLER BLANCHET LIU

Moreover

P

(maxtisinT

f(t) gt γab

)(1minus λabρ) le P

(maxtisinT

f(t) gt b

)

Propositions 64 and 65 allow us to prove Theorem 31 which is rephrased inthe form of the following theorem which contains the detailed rate of complexityand so the main result of this section

Theorem 66 Suppose f is a Gaussian random field satisfying conditions A1ndashA4 in Section 3 Given any v gt 0 put a = ε(4λbρ) (where λ and ρ as in Proposition65) and αminus1 = κdβ(ba)(2+v)dβ Then there exist c ε0 gt 0 such that for allε le ε0

(613) |w (b)minus wM (γab)| le w (b) ε

if M =lceilcεminus1(ba)(4+4v)dβ

rceil Consequently by our discussion in Section 2 and the

bound on the second moment given in Proposition 62 it follows that Algorithm 61provides a FPRAS with running time O

((M)3 times (M)2 times εminus2δminus1

)

Proof Combining (63) (69) and (610) with Propositions 63ndash65 we have that

|w (b)minus wM (b) |le αminus1 exp (minusMαm (T )) E

[m

(Aγab

)]

+E[W 2]m (T )

λ2dβ2 M

(b

a

)4dβ

E[m

(Aγab

)]+

(λabρ

1minus λabρ

)w (b)

Furthermore there exists a constant K isin (0infin) such that

Em(Aγab

) le K maxtisinT

P (f (t) gt b)m (T ) le Kw (b) m (T )

Therefore we have that

|w (b)minus wM (b) |w (b)

le αminus1Km(T ) exp (minusMαm (T ))

+E[W 2]Km (T )2

λ2dβ2 M

(b

a

)4dβ

+(

λabρ

1minus λabρ

)

Moreover since a = ε(4λbρ) we obtain that for ε le 12

|w (b)minus wN (b) |w (b)

le αminus1Km(T ) exp (minusMαm (T ))

+[EW 2]Km (T )2

λ2dβ2 M

(b

a

)4dβ

+ ε2

MONTE CARLO FOR RANDOM FIELDS 19

From the selection of α M and θ it follows easily that the first two terms on theright hand side of the previous display can be made less than ε2 for all ε le ε0 bytaking ε0 sufficiently small

The complexity count given in the theorem now corresponds to the followingestimates The factor O((M)3) represents the cost of a Cholesky factorization re-quired to generate a single replication of a finite field of dimension M In additionthe second part of Proposition 62 gives us that O(M2εminus2δminus1) replications are re-quired to control the relative variance of the estimator 2

We now proceed to prove Propositions 64 and 65

64 Proof of Proposition 64 We concentrate on the analysis of the left handside of (611) An important observation is that conditional on the random variableτγab

with distribution

Qprime(τγabisin middot) =

E[m(Aγab

cap middot)]

E[m(Aγab

)]

and given f(τγab) the rest of the field namely (f (t) t isin Tτγab

) is anotherGaussian field with a computable mean and covariance structure The second termin (610) indicates that we must estimate the probability that m(Aγab

) takes smallvalues under Qprime For this purpose we shall develop an upper bound for

(614) P(m(Aγab

) lt yminus1 m(Ab) gt 0∣∣ f (t) = γab + zγab

)

for y large enough Our arguments proceeds in two steps For the first in order tostudy (614) we shall estimate the conditional mean covariance of f (s) s isin Tgiven that f (t) = γab + zγab Then we use the fact that the conditional field isalso Gaussian and take advantage of general results from the theory of Gaussianrandom fields to obtain a bound for (614) For this purpose we recall some usefulresults from the theory of Gaussian random fields The first result is due to Dudley[14]

Theorem 67 Let U be a compact subset of Rn and let f0(t) t isin U be amean zero continuous Gaussian random field Define the canonical metric d on Uas

d (s t) =radic

E[f0 (t)minus f0 (s)]2

and put diam (U) = supstisinU d (s t) which is assumed to be finite Then there existsa finite universal constant κ gt 0 such that

E[maxtisinU

f0 (t)] le κ

int diam(U)2

0

[log (N (ε))]12dε

where the entropy N (ε) is the smallest number of dminusballs of radius ε whose unioncovers U

20 ADLER BLANCHET LIU

The second general result that we shall need is the so-called B-TIS (Borel-Tsirelson-Ibragimov-Sudakov) inequality [4 11 26]

Theorem 68 Under the setting described in Theorem 67

P

(maxtisinU

f0 (t)minus E[maxtisinU

f0 (t)] ge b

)le exp

(minusb2(2σ2U )

)

whereσ2U = max

tisinUE[f2

0 (t)]

We can now proceed with the main proof We shall assume from now on thatτγab

= 0 since as will be obvious from what follows all estimates hold uniformlyover τγab

isin T This is a consequence of the uniform Holder assumptions A2 andA3 Define a new process f

(f (t) t isin T ) L= (f (t) t isin T | f (0) = γab + zγab)

Note that we can always write f (t) = micro (t)+g (t) where g is a mean zero Gaussianrandom field on T We have that

micro (t) = Ef (t) = micro (t) + σ (0)minus2C (0 t) (γab + zγab minus micro (0))

and that the covariance function of f is given by

Cg (s t) = Cov (g (s) g (t)) = C (s t)minus σ (0)minus2C (0 s)C (0 t)

The following lemma describes the behavior of micro (t) and Cg (t s)

Lemma 69 Assume that |s| and |t| small enough Then the following threeconclusions hold

(i) There exist constants λ0 and λ1 gt 0 such that

|micro (t)minus (γab + zγab)| le λ0 |t|β + λ1 |t|β (γab + zγab)

and for all z isin (0 1) and γab large enough

|micro (s)minus micro (t) | le κHγab|sminus t|β

(ii)Cg (s t) le 2κHσ (t)σ (s) |t|β + |s|β + |tminus s|β

(iii)

Dg (s t) =radic

E([g (t)minus g (s)]2) le λ121 |tminus s|β2

Proof All three consequences follow from simple algebraic manipulations Thedetails are omitted 2

MONTE CARLO FOR RANDOM FIELDS 21

Proposition 610 For any v gt 0 there exist κ and λ2 such that for all t isin T yminusβd le a2+v

κb(2+v) a sufficiently small and z gt 0

P(m(Aγab

)minus1 gt y m(Ab) gt 0∣∣ f (t) = γab + zγab

) le exp(minusλ2a

2yβdb2)

Proof For notational simplicity and without loss of generality we assume thatt = 0 First consider the case that z ge 1 Then there exist c1 c2 such that for allc2y

minusβd lt bminus2minusv and z gt 1

P(m

(Aγab

cap T)minus1

gt ym(Ab) gt 0| f (0) = γab + zγab

)

le P

(inf

|t|ltc1yminus1df(t) le γab|f (0) = γab + zγab

)

= P

(inf

|t|ltc1yminus1dmicro (t) + g(t) le γab

)

le P

(inf

|t|ltc1yminus1dg(t) le minus 1

2γab

)

Now apply (iii) from Lemma 69 from which it follows that N (ε) le c3m(T )ε2dβ

for some constant c3 By Theorem 67 E(sup|t|ltc1yminus1d f(t)) = O(yminusβ(2d) log y)By Theorem 68 for some constant c4

P(m

(Aγab

cap T)minus1

gt y m(Ab) gt 0| f (0) = γab + zγab

)

le P

(inf

|t|ltc1yminus1dg(t) le minus 1

2γab

)

le exp

(minus 1

c4γ2aby

minusβd

)

for c2yminusβd lt bminus2minusv and z gt 1

Now consider the case z isin (0 1) Let tlowast be the global maximum of f(t) Then

P(m

(Aγab

cap T)minus1

gt y m(Ab) gt 0|f (0) = γab + zγab

)

le P

(inf

|tminustlowast|ltc1yminus1df(t) lt γab f(tlowast) gt b|f (0) = γab + zγab

)

le P

(sup

|sminust|ltc1yminus1d

|f(s)minus f(t)| gt ab|f (0) = γab + zγab

)

Consider the new field ξ(s t) = g(s)minus g(t) with parameter space T times T Note thatradic

V ar(ξ(s t)) = Dg(s t) le λ1|sminus t|β2

Via basic algebra it is not hard to show that the entropy of ξ(s t) is bounded byNξ (ε) le c3m(T times T )ε2dβ In addition from (i) of Lemma 69 we have

|micro (s)minus micro (t) | le κHγab|sminus t|β

22 ADLER BLANCHET LIU

Similarly for some κ gt 0 and all yminusβd le 1κ

(ab

)2+v a lt 1 there exists c5 such that

P(m

(Aγab

cap T)minus1

gt y m(Ab) gt 0|f (0) = γab + zγab

)

le P

(sup

|sminust|ltc1yminus1d

|f(s)minus f(t)| gt ab|f (0) = γab + zγab

)

= P

(sup

|sminust|ltc1yminus1d

|ξ(s t)| gt a

2b

)

le exp(minus a2

c5b2yminusβd

)

Combining the two cases z gt 1 and z isin (0 1) and choosing c5 large enough we have

P(m

(Aγab

cap T)minus1

gt y m(Ab) gt 0∣∣ f (0) = γab + zγab

)

le exp(minus a2

c5b2yminusβd

)

for a small enough and yminusβd le 1κ

(ab

)2+v Renaming the constants completes theproof 2

The final ingredient needed for the proof of Proposition 64 is the following lemmainvolving stochastic domination The proof follows an elementary argument and istherefore omitted

Lemma 611 Let v1 and v2 be finite measures on R and define ηj(x) =intinfin

xvj(ds)

Suppose that η1 (x) ge η2 (x) for each x ge x0 Let (h (x) x ge x0) be a non-decreasing positive and bounded function Then

int infin

x0

h (s) v1(dx) geint infin

x0

h (s) v2 (ds)

Proof of Proposition 64 Note that

EQprime(exp

(minusMm(Aγab

)m (T )

)

m(Aγab

) m(Aγab

) isin (0 α) m(Ab) gt 0)

=int

T

int infin

z=0

E(exp

(minusMm(Aγab

)m (T )

)

m (Aγab)

m(Aγab

) isin (0 α) m(Ab) gt 0∣∣∣

f (t) = γab +z

γab

)P

(τγab

isin dt)P (γab[f (t)minus γab] isin dz| f (t) gt γab)

le supzgt0tisinT

E(exp

(minusMm(Aγab

)m (T )

)

m (Aγab)

m(Aγab

) isin (0 α) m(Ab) gt 0∣∣∣

f (t) = γab +z

γab

)

MONTE CARLO FOR RANDOM FIELDS 23

Now define Y (ba) = (ba)2dβλminusdβ2 W with λ2 as chosen in Proposition 610

and W with distribution given by P (W gt x) = exp(minusxβd) By Lemma 611

suptisinT

E(exp

(minusMm(Aγab

)m (T )

)

m (Aγab)

m(Aγab

) isin (0 α) m(Ab) gt 0∣∣∣

f (t) = γab +z

γab

)

le E[Y (ba) exp(minusMY (ba)minus1m(T ))Y (ba) gt αminus1

]

Now let Z be exponentially distributed with mean 1 and independent of Y (ba)Then we have (using the definition of the tail distribution of Z and Chebyshevrsquosinequality)

exp(minusMY (ba)minus1m (T )

)= P

(Z gt MY (ba)minus1m (T )

∣∣ Y (ba))

le m (T )Y (ba)M

Therefore

E[exp(minusMY (ba)minus1m (T )

)Y (ba) Y (ba) ge αminus1]

le m (T )M

E[Y (ba)2 Y (ba) ge αminus1]

le m (T )

Mλ2dβ2

(b

a

)4dβ

E(W 2)

which completes the proof 2

65 Proof of Proposition 65 We start with the following result of Tsirelson[27]

Theorem 612 Let f be a continuous separable Gaussian process on a compact(in the canonical metric) domain T Suppose that V ar (f) = σ is continuous andthat σ (t) gt 0 for t isin T Moreover assume that micro = Ef is also continuous andmicro(t) ge 0 for all t isin T Define

σ2T

∆= maxtisinT

Var(f(t))

and set F (x) = PmaxtisinT f(t) le x Then F is continuously differentiable on RFurthermore let y be such that F (y) gt 12 and define ylowast by

F (y) = Φ(ylowast)

Then for all x gt y

F prime(x) le Ψ(

xylowasty

)(xylowasty

(1 + 2α) + 1)

(1 + α)

24 ADLER BLANCHET LIU

where

α =y2

x(xminus y)y2lowast

We can now prove the following lemma

Lemma 613 There exists a constant A isin (0infin) independent of a and b ge 0such that

(615) P

(suptisinT

f(t) le b + ab | suptisinT

f(t) gt b

)le aA

P (suptisinT f (t) ge bminus 1b)P (suptisinT f (t) ge b)

Proof By subtracting inftisinT micro (t) gt minusinfin and redefining the level b to be b minusinftisinT micro (t) we may simply assume that Ef (t) ge 0 so that we can apply Theorem612 Adopting the notation of Theorem 612 first we pick b0 large enough so thatF (b0) gt 12 and assume that b ge b0 + 1 Now let y = bminus 1b and F (y) = Φ(ylowast)Note that there exists δ0 isin (0infin) such that δ0b le ylowast le δminus1

0 b for all b ge b0 Thisfollows easily from the fact that

logP

suptisinT

f(t) gt x

sim log sup

tisinTP f(t) gt x sim minus x2

2σ2T

On the other hand by Theorem 612 F is continuously differentiable and so

(616) P

suptisinT

f(t) lt b + ab | suptisinT

f(t) gt b

=

int b+ab

bF prime(x) dx

P suptisinT f(t) gt b

Moreover

F prime(x) le(

1minus Φ(

xylowasty

)) (xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x))

le (1minus Φ(ylowast))(

xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x))

= P

(maxtisinT

f (t) gt bminus 1b

)(xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x))

Thereforeint b+ab

b

F prime(x) dx

le P

(suptisinT

f (t) gt bminus 1b

) int b+ab

b

(xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x)) dx

Recall that α (x) = y2[x(xminus y)y2lowast] we can use the fact that ylowast ge δ0b to conclude

that if x isin [b b + ab] then α (x) le δminus20 and therefore

int b+ab

b

(xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x)) dx le 4δminus80 a

MONTE CARLO FOR RANDOM FIELDS 25

We thus obtain that

(617)

int b+ab

bF prime(x) dx

P suptisinT f(t) gt b le 4aδminus80

P suptisinT f(t) gt bminus 1bP suptisinT f(t) gt b

for any b ge b0 This inequality together with the fact that F is continuously differ-entiable on (minusinfininfin) yields the proof of the lemma for b ge 0 2

The previous result translates a question that involves the conditional distri-bution of maxtisinT f (t) near b into a question involving the tail distribution ofmaxtisinT f (t) The next result then provides a bound on this tail distribution

Lemma 614 For each v gt 0 there exists a constant C(v) isin (0infin) (possiblydepending on v gt 0 but otherwise independent of b) so that such that

P

(maxtisinT

f (t) gt b

)le C(v)b2dβ+dv+1 max

tisinTP (f (t) gt b)

for all b ge 1

Proof The proof of this result follows along the same lines of Theorem 262 in[5] Consider an open cover of T = cupM

i=1Ti(θ) where Ti(θ) = s |s minus ti| lt θWe choose ti carefully such that N(θ) = O(θminusd) for θ arbitrarily small Writef (t) = g (t)+micro (t) where g (t) is a centered Gaussian random field and note usingA2 and A3 that

P

(max

tisinTi(θ)f (t) gt b

)le P

(max

tisinTi(θ)g (t) gt bminus micro (ti)minus κHθβ

)

Now we wish to apply the Borel-TIS inequality (Theorem 68) with U = Ti (θ) f0 =g d (s t) = E12([g (t)minus g (s)]2) which as a consequence of A2 and A3 is boundedabove by C0 |tminus s|β2 for some C0 isin (0infin) Thus applying Theorem 67 we havethat E maxtisinTi(θ) g (t) le C1θ

β2 log(1θ) for some C1 isin (0infin) Consequently theBorel-TIS inequality yields that there exists C2 (v) isin (0infin) such that for all bsufficiently large and θ sufficiently small we have

P

(max

tisinTi(θ)g (t) gt bminus micro (ti)minus κHθβ

)le C2 (v) exp

(minus

(bminus micro (ti)minus C1θ

β(2+βv))2

2σ2Ti

)

where σTi = maxtisinTi(θ) σ (t) Now select v gt 0 small enough and set θβ(2+βv) =bminus1 Straightforward calculations yield that

P

(max

tisinTi(θ)f (t) gt b

)le P

(max

tisinTi(θ)g (t) gt bminus micro (ti)minus κHθβ

)

le C3 (v) maxtisinTi(θ)

exp

(minus (bminus micro (t))2

2σ (t)2

)

26 ADLER BLANCHET LIU

for some C3(v) isin (0infin) Now recall the well known inequality (valid for x gt 0)that

φ (x)(

1xminus 1

x3

)le 1minus Φ(x) le φ (x)

x

where φ = Φprime is the standard Gaussian density Using this inequality it follows thatC4 (v) isin (0infin) can be chosen so that

maxtisinTi(θ)

exp

(minus (bminus micro (t))2

2σ (t)2

)le C4 (v) bmax

tisinTi

P (f (t) gt b)

for all b ge 1 We then conclude that there exists C (v) isin (0infin) such that

P

(maxtisinT

f (t) gt b

)le N (θ)C4 (v) bmax

tisinTP (f (t) gt b)

le CθminusdbmaxtisinT

P (f (t) gt b) = Cb2dβ+dv+1 maxtisinT

P (f (t) gt b)

giving the result 2

We can now complete the proof of Proposition 65

Proof of Proposition 65 The result is a straightforward corollary of the previoustwo lemmas By (615) in Lemma 613 and Lemma 614 there exists λ isin (0infin) forwhich

P

(maxtisinT

f(t) le b + ab | maxtisinT

f(t) gt b

)

le aAP (maxtisinT f (t) ge bminus 1b)

P (maxtisinT f (t) ge b)

le aCAb2dβ+dv+1 maxtisinT P (f (t) gt bminus 1b)P (maxtisinT f (t) ge b)

le aCAb2dβ+dv+1 maxtisinT P (f (t) gt bminus 1b)maxtisinT P (f (t) gt b)

le aλb2dβ+dv+1

The last two inequalities follow from the obvious bound

P

(maxtisinT

f (t) ge b

)ge max

tisinTP (f (t) gt b)

and standard properties of the Gaussian distribution This yields (612) from whichthe remainder of the proposition follows 2

MONTE CARLO FOR RANDOM FIELDS 27

7 Fine Tuning Twice Differentiable Homogeneous Fields In the pre-ceeding section we constructed a polynomial time algorithm based on a randomizeddiscretization scheme Our goal in this section is to illustrate how to take advantageof additional information to further improve the running time and the efficiency ofthe algorithm In order to illustrate our techniques we shall perform a more refinedanalysis in the setting of smooth and homogeneous fields and shall establish op-timality of the algorithm in a precise sense to described below Our assumptionsthroughout this section are B1ndashB2 of Section 3

Let C(sminus t) = Cov(f(s) f(t)) be the covariance function of f which we assumealso has mean zero Note that it is an immediate consequence of homogeneity anddifferentiability that partiC(0) = part3

ijkC(0) = 0We shall need the following definition

Definition 71 We call T = t1 tM sub T a θ-regular discretization of Tif and only if

mini 6=j

|ti minus tj | ge θ suptisinT

mini|ti minus t| le 2θ

Regularity ensures that points in the grid T are well separated Intuitively sincef is smooth having tight clusters of points translates to a waste of computingresources as a result of sampling highly correlated values of f Also note thatevery region containing a ball of radius 2θ has at least one representative in T Therefore T covers the domain T in an economical way One technical convenienceof θ-regularity is that for subsets A sube T that have positive Lebesgue measure (inparticular ellipsoids)

limMrarrinfin

(A cap T )M

=m(A)m(T )

where here and throughout the remainder of the section (A) denotes the cardi-nality of the set A

Let T = t1 tM be a θ-regular discretization of T and consider

X = (X1 XM )T ∆= (f (t1) f (tM ))T

We shall concentrate on estimating wM (b) = P (max1leileM Xi gt b) The next result(which we prove in Section 71) shows that if θ = εb then the relative bias is O (

radicε)

Proposition 72 Suppose f is a Gaussian random field satisfying conditionsB1 and B2 There exist c0 c1 b0 and ε0 such that for any finite εb-regulardiscretization T of T

(71) P (suptisinT

f(t) lt b| suptisinT

f(t) gt b) le c0

radicε and (T ) le c1m(T )εminusdbd

for all ε isin (0 ε0] and b gt b0

Note that the bound on the bias obtained for twice differentiable fields is muchsharper than that of the general Holder continuous fields given by (613) in Theorem

28 ADLER BLANCHET LIU

66 This is partly because the conditional distribution of the random field aroundlocal maxima is harder to describe in the Holder continuous than in the case oftwice differentiable fields In addition to the sharper description of the bias weshall also soon show in Theorem 74 that our choice of discretization is optimal in acetain sense Finally we point out that the bound of

radicε in the first term of (71) is

not optimal In fact there seems to be some room of improvement and we believethat a more careful analysis might yield a bound of the form c0ε

2We shall estimate wM (b) by using a slight variation of Algorithm 53 In partic-

ular since the Xirsquos are now identically distributed we redefine Q to be

(72) Q (X isin B) =Msum

i=1

1M

P [X isin B|Xi gt bminus 1b]

Our estimator then takes the form

(73) Lb =M times P (X1 gt bminus 1b)sumM

j=1 1 (Xj gt bminus 1b)1( max

1leileMXi gt b)

Clearly we have that EQ(Lb) = wM (b) (The reason for for subtracting the fcatorof 1b was explained in Section 61)

Algorithm 73 Given a number of replications n and an εb-regular dis-cretization T the algorithm is as follows

Step 1 Sample X(1) X(n) iid copies of X with distribution Q given by (72)Step 2 Compute and output

Ln =1n

nsum

i=1

L(i)b

where

L(i)b =

M times P (X1 gt bminus 1b)sumM

j=1 1(X(i)j gt bminus 1b)

1( max1leileM

X(i)j gt b)

Theorem 75 later guides the selection of n in order to achieve a prescribedrelative error In particular our analysis together with considerations from Section2 implies that choosing n = O

(εminus2δminus1

)suffices to achieve ε relative error with

probability at least 1minus δAlgorithm 73 improves on Algorithm 61 for Holder continuous fields in two

important ways The first aspect is that it is possible to obtain information on thesize of the relative bias of the estimator In Proposition 72 we saw that in orderto overcome bias due to discretization it suffices to take a discretization of sizeM = (T ) = Θ(bd) That this selection is also asymptotically optimal in the sensedescribed in the next result will be proven in Section 71

MONTE CARLO FOR RANDOM FIELDS 29

Theorem 74 Suppose f is a Gaussian random field satisfying conditions B1and B2 If θ isin (0 1) then as b rarrinfin

sup(T )lebθd

P (suptisinT

f(t) gt b| suptisinT

f(t) gt b) rarr 0

This result implies that the relative bias goes to 100 as b rarrinfin if one chooses adiscretization scheme of size O

(bθd

)with θ isin (0 1) Consequently d is the smallest

power of b that achieves any given bounded relative bias and so the suggestionabove of choosing M = O

(bd

)points for the discretization is in this sense optimal

The second aspect of improvement involves the variance In the case of Holdercontinuous fields the ratio of the second moment of the estimator and w (b)2 wasshown to be bounded by a quantity that is of order O(M2) In contrast in thecontext of smooth and homogeneous fields considered here the next result showsthat this ratio is bounded uniformly for b gt b0 and M = (T ) ge cbd That is thevariance remains strongly controlled

Theorem 75 Suppose f is a Gaussian random field satisfying conditions B1and B2 Then there exist constants c b0 and ε0 such that for any εb-regulardiscretization T of T we have

supbgtb0εisin[0ε0]

EQL2b

P 2 (suptisinT f (t) gt b)le sup

bgtb0εisin[0ε0]

EQL2b

P 2(sup

tisinTf (t) gt b

) le c

for some c isin (0infin)

The proof of this result is given in Section 72 The fact that the number ofreplications remains bounded in b is a consequence of the strong control on thevariance

Finally we note that the proof of Theorem 33 follows as a direct corollary of The-orem 75 together with Proposition 72 and our discussion in Section 2 Assumingthat placing each point in T takes no more than c units of computer time the totalcomplexity is according to the discussion in section Section 2 O

(nM3 + M

)=

O(εminus2δminus1M3 + M

) The contribution of the term M3 = O

(εminus6b3

)comes from

the complexity of applying Cholesky factorization and the term M = O(εminus2b)corresponds to the complexity of placing T

Remark 76 Condition B2 imposes a convexity assumption on the boundaryof T This assumption although convenient in the development of the proofs of The-orems 74 and 75 is not necessary The results can be generalized at the expenseof increasing the length and the burden in the technical development to the casein which T is a d-dimensional manifold satisfying the so-called Whitney conditions([4])

The remainder of this section is devoted to the proof of Proposition 72 Theorem74 and Theorem 75

30 ADLER BLANCHET LIU

71 Bias Control Proofs of Proposition 72 and Theorem 74 We start withsome useful lemmas for all of which we assume that Conditions B1 and B2 are sat-isfied We shall also assume that the global maximum of f over T is achieved withprobability one at a single point in T Additional conditions under which this willhappen can be found in [4] and require little more than the non-degeneracy of thejoint distribution of f and its first and second order derivatives Of these lemmasLemma 77 the proof of which we defer to Section 73 is of central importance

Lemma 77 For any a0 gt 0 there exists clowast δlowast b0 and δ0 (which depend onthe choice of a0) such that for any s isin T a isin (0 a0) δ isin (0 δ0) and b gt b0

P ( min|tminuss|ltδabminus1

f(t) lt b|f(s) = b + ab s isin L) le clowast exp(minusδlowast

δ2

)

where L is the set of local maxima of f

The next lemma is a direct corollary of Lemma 77

Lemma 78 Let tlowast be the point in T at which the global maximum of f isattained Then with the same choice of constants as in Lemma 77 for any a isin(0 a0) δ isin (0 δ0) and b gt b0 we have

P ( min|tminustlowast|ltδabminus1

f(t) lt b|f(tlowast) = b + ab) le 2clowast exp(minusδlowast

δ2

)

Proof. Note that, for any s ∈ T,

P( min_{|t−s|≤δ√a/b} f(t) < b | f(s) = b + a/b, s = t* )
  = P( min_{|t−s|≤δ√a/b} f(t) < b, f(s) = b + a/b, s = t* ) / P( f(s) = b + a/b, s = t* )
  ≤ P( min_{|t−s|≤δ√a/b} f(t) < b, f(s) = b + a/b, s ∈ L ) / P( f(s) = b + a/b, s = t* )
  = P( min_{|t−s|<δ√a/b} f(t) < b | f(s) = b + a/b, s ∈ L ) · P( f(s) = b + a/b, s ∈ L ) / P( f(s) = b + a/b, s = t* )
  ≤ c* exp(−δ*/δ²) · P( f(s) = b + a/b, s ∈ L ) / P( f(s) = b + a/b, s = t* ).

Applying this bound and the fact that

P( f(s) = b + a/b, s ∈ L ) / P( f(s) = b + a/b, s = t* ) → 1

as b → ∞, the conclusion of the lemma holds. □


The following lemma gives a bound on the density of sup_{t∈T} f(t), which will be used to control the size of the overshoot beyond the level b.

Lemma 7.9. Let p_{f*}(x) be the density function of sup_{t∈T} f(t). Then there exist constants c_{f*} and b₀ such that

p_{f*}(x) ≤ c_{f*} x^{d+1} P( f(0) > x )

for all x > b₀.

Proof. Recalling (1.3), let the continuous function p_E(x), x ∈ ℝ, be defined by the relationship

E( χ({t ∈ T : f(t) ≥ b}) ) = ∫_b^∞ p_E(x) dx,

where the left hand side is the expected value of the Euler–Poincaré characteristic of A_b. Then, according to Theorem 8.10 in [7], there exist c and δ such that

|p_E(x) − p_{f*}(x)| < c P( f(0) > (1+δ)x )

for all x > 0. In addition, thanks to the result of [4], which provides ∫_b^∞ p_E(x) dx in closed form, there exists c₀ such that, for all x > 1,

p_E(x) < c₀ x^{d+1} P( f(0) > x ).

Hence there exists c_{f*} such that

p_{f*}(x) ≤ c₀ x^{d+1} P( f(0) > x ) + c P( f(0) > (1+δ)x ) ≤ c_{f*} x^{d+1} P( f(0) > x )

for all x > 1. □

The last ingredients required for the proofs of Proposition 7.2 and Theorem 7.4 are stated in the following well known theorem, adapted from Lemma 6.1 and Theorem 7.2 in [19] to the twice differentiable case.

Theorem 7.10. There exists a constant H (depending on the covariance function C) such that

(7.4)  P( sup_{t∈T} f(t) > b ) = (1 + o(1)) H m(T) b^d P( f(0) > b )

as b → ∞. Similarly, choose δ small enough so that [0, δ]^d ⊂ T, and let Δ₀ = [0, b⁻¹]^d. Then there exists a constant H₁ such that

(7.5)  P( sup_{t∈Δ₀} f(t) > b ) = (1 + o(1)) H₁ P( f(0) > b )

as b → ∞.


We are now ready to provide the proofs of Proposition 7.2 and Theorem 7.4.

Proof of Proposition 7.2. The fact that there exists c₁ such that

#(T̂) ≤ c₁ m(T) ε^{−d} b^d

is immediate from assumption B2. Therefore we proceed to provide a bound for the relative bias. We choose ε₀ < δ₀² (with δ₀ as given in Lemmas 7.7 and 7.8) and note that, for any ε < ε₀,

P( sup_{t∈T̂} f(t) < b | sup_{t∈T} f(t) > b )
  ≤ P( sup_{t∈T} f(t) < b + 2√ε/b | sup_{t∈T} f(t) > b ) + P( sup_{t∈T̂} f(t) < b | sup_{t∈T} f(t) > b + 2√ε/b ).

By (7.4) and Lemma 7.9, there exists c₂ such that the first term above can be bounded by

P( sup_{t∈T} f(t) < b + 2√ε/b | sup_{t∈T} f(t) > b ) ≤ c₂ √ε.

The second term can be bounded by

P( sup_{t∈T̂} f(t) < b | sup_{t∈T} f(t) > b + 2√ε/b )
  ≤ P( min_{|t−t*|<2ε/b} f(t) < b | sup_{t∈T} f(t) > b + 2√ε/b )
  ≤ 2c* exp( −δ* ε^{−1} ),

where t* is the location of the global maximum; the first inequality holds because an εb-regular discretization places a point of T̂ within 2ε/b of t*, and the second follows from Lemma 7.8. Hence there exists c₀ such that

P( sup_{t∈T̂} f(t) < b | sup_{t∈T} f(t) > b ) ≤ c₂√ε + 2c* e^{−δ*/ε} ≤ c₀ √ε

for all ε ∈ (0, ε₀). □
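Before turning to Theorem 7.4, we note that an (ε/b)-spaced lattice restricted to a box is one simple way to produce an εb-regular discretization with the cardinality bound above (a sketch of our own; the box representation of T and the helper name are assumptions for illustration):

```python
import numpy as np

def eps_b_regular_grid(bounds, eps, b):
    """Lattice with spacing eps/b inside T = prod_k [l_k, u_k].  Points are
    (eps/b)-separated and every point of T lies within sqrt(d)/2 * (eps/b)
    of the grid, so for moderate d this is eps*b-regular in the sense used
    by Proposition 7.2."""
    axes = [np.arange(lo, hi + 1e-12, eps / b) for lo, hi in bounds]
    mesh = np.meshgrid(*axes, indexing="ij")
    return np.stack([m.ravel() for m in mesh], axis=-1)   # shape (M, d)

T_hat = eps_b_regular_grid([(0.0, 1.0), (0.0, 1.0)], eps=0.25, b=3.0)
print(len(T_hat))   # M <= c1 m(T) eps^{-d} b^d, as in Proposition 7.2
```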

Proof of Theorem 7.4. We write θ = 1 − 3δ ∈ (0, 1). First note that, by (7.4),

P( sup_T f(t) > b + b^{2δ−1} | sup_T f(t) > b ) → 0

as b → ∞. Let t* be the position of the global maximum of f in T. According to the exact Slepian model in Section 7.3, and an argument similar to the proof of Lemma 7.8,

(7.6)  P( sup_{|t−t*|>b^{2δ−1}} f(t) > b | b < f(t*) ≤ b + b^{2δ−1} ) → 0

as b → ∞. Consequently,

P( sup_{|t−t*|>b^{2δ−1}} f(t) > b | sup_T f(t) > b ) → 0.


Let

B(T̂, b^{2δ−1}) = ∪_{t∈T̂} B(t, b^{2δ−1}).

We have

P( sup_{T̂} f(t) > b | sup_T f(t) > b )
  ≤ P( sup_{|t−t*|>b^{2δ−1}} f(t) > b | sup_T f(t) > b )
   + P( sup_{T̂} f(t) > b, sup_{|t−t*|>b^{2δ−1}} f(t) ≤ b | sup_T f(t) > b )
  ≤ o(1) + P( t* ∈ B(T̂, b^{2δ−1}) | sup_T f(t) > b )
  ≤ o(1) + P( sup_{B(T̂, b^{2δ−1})} f(t) > b | sup_T f(t) > b ).

Since #(T̂) ≤ b^{(1−3δ)d}, we can find a finite set T′ = {t′₁, ..., t′_l} ⊂ T, with Δ_k = t′_k + [0, b⁻¹]^d, such that l = O(b^{(1−δ)d}) and B(T̂, b^{2δ−1}) ⊂ ∪_{k=1}^l Δ_k. The choice of l depends only on #(T̂), not on the particular configuration of T̂. Therefore, applying (7.5),

sup_{#(T̂)≤b^{θd}} P( sup_{B(T̂, b^{2δ−1})} f(t) > b ) ≤ O(b^{(1−δ)d}) P( f(0) > b ).

This, together with (7.4), yields

sup_{#(T̂)≤b^{θd}} P( sup_{B(T̂, b^{2δ−1})} f(t) > b | sup_T f(t) > b ) ≤ O(b^{−δd}) = o(1)

for b ≥ b₀, which clearly implies the statement of the result. □

7.2. Variance Control: Proof of Theorem 7.5. We proceed directly to the proof of Theorem 7.5.


Proof of Theorem 7.5. Note that

E^Q[L_b²] / P²( sup_{t∈T} f(t) > b )
  = E[L_b] / P²( sup_{t∈T} f(t) > b )
  = E[ M P(X₁ > b − 1/b) / Σ_{j=1}^M 1(X_j > b − 1/b) ; max_j X_j > b ] / P²( sup_{t∈T} f(t) > b )
  = E[ M P(X₁ > b − 1/b) / Σ_{j=1}^M 1(X_j > b − 1/b) ; max_j X_j > b, sup_{t∈T} f(t) > b ] / P²( sup_{t∈T} f(t) > b )
  = E[ M P(X₁ > b − 1/b) 1(max_j X_j > b) / Σ_{j=1}^M 1(X_j > b − 1/b) | sup_{t∈T} f(t) > b ] / P( sup_{t∈T} f(t) > b )
  = E[ M 1(max_j X_j > b) / Σ_{j=1}^M 1(X_j > b − 1/b) | sup_{t∈T} f(t) > b ] · P(X₁ > b − 1/b) / P( sup_{t∈T} f(t) > b ).

The remainder of the proof involves showing that the last conditional expectation here is of order O(b^d). This, together with (7.4), will yield the result. Note that, for any A(b, ε) such that A(b, ε) = Θ(M) uniformly over b and ε, we can write

(7.7)  E[ M 1(max_j X_j > b) / Σ_{j=1}^M 1(X_j > b − 1/b) | sup_{t∈T} f(t) > b ]
  ≤ E[ M 1( Σ_{j=1}^M 1(X_j > b − 1/b) ≥ M/A(b,ε) ) / Σ_{j=1}^M 1(X_j > b − 1/b) | sup_{t∈T} f(t) > b ]
   + E[ M 1( 1 ≤ Σ_{j=1}^M 1(X_j > b − 1/b) < M/A(b,ε) ) / Σ_{j=1}^M 1(X_j > b − 1/b) | sup_{t∈T} f(t) > b ].

We shall select A(b, ε) appropriately in order to bound the two expectations above. By Lemma 7.8, for any 4ε ≤ δ ≤ δ₀, there exist constants c′ and c″ ∈ (0,∞) such that

c* exp(−δ*/δ²) ≥ P( min_{|t−t*|≤δ/b} f(t) < b − 1/b | sup_{t∈T} f(t) > b )
  ≥ P( Σ_{j=1}^M 1(X_j > b − 1/b) ≤ c′ δ^d / ε^d | sup_{t∈T} f(t) > b )
  ≥ P( M / Σ_{j=1}^M 1(X_j > b − 1/b) ≥ b^d c″ / δ^d | sup_{t∈T} f(t) > b ).    (7.8)


The first inequality is an application of Lemma 7.8. The second inequality is due to the fact that, for any ball B of radius δ/b with δ ≥ 4ε, #(T̂ ∩ B) ≥ c′δ^d/ε^d for some c′ > 0. Inequality (7.8) implies that, for all x such that b^d c″/δ₀^d < x < b^d c″/(4^d ε^d), there exists δ** > 0 such that

P( M / ( b^d Σ_{j=1}^M 1(X_j > b − 1/b) ) ≥ x | sup_{t∈T} f(t) > b ) ≤ c* exp( −δ** x^{2/d} ).

Now let A(b, ε) = b^d c″/(4^d ε^d), and observe that, by the second result in (7.1), we have A(b, ε) = Θ(M), and moreover that there exists c₃ such that the first term on the right hand side of (7.7) is bounded by

(7.9)  E[ M 1( Σ_{j=1}^M 1(X_j > b − 1/b) ≥ M/A(b,ε) ) / Σ_{j=1}^M 1(X_j > b − 1/b) | sup_T f(t) > b ] ≤ c₃ b^d.

We now turn to the second term on the right hand side of (7.7). We use the fact that M/A(b, ε) ≤ c‴ ∈ (0,∞) (uniformly as b → ∞ and ε → 0). There exist c₄ and c₅ such that, for ε ≤ δ₀/c₄,

(7.10)  E[ M 1( 1 ≤ Σ_{j=1}^M 1(X_j > b − 1/b) < M/A(b,ε) ) / Σ_{j=1}^M 1(X_j > b − 1/b) | sup_T f(t) > b ]
  ≤ M P( Σ_{j=1}^M 1(X_j > b − 1/b) < c‴ | sup_T f(t) > b )
  ≤ M P( min_{|t−t*|≤c₄ε/b} f(t) < b − 1/b | sup_T f(t) > b )
  ≤ c₁ m(T) b^d ε^{−d} exp( −δ*/(c₄²ε²) ) ≤ c₅ b^d.

The second inequality follows from the fact that, if Σ_{j=1}^M 1(X_j > b − 1/b) is less than c‴, then the minimum of f over the ball of radius c₄ε/b around the global maximum t* must be less than b − 1/b; otherwise there would be more than c‴ elements of T̂ inside such a ball. The last inequality is due to Lemma 7.8 and Proposition 7.2.

Putting (7.9) and (7.10) together, we obtain, for all εb-regular discretizations with ε < ε₀ = min(1/4, 1/c₄)δ₀,

E^Q[L_b²] / P²( sup_{t∈T} f(t) > b )
  ≤ E[ M 1(max_j X_j > b) / Σ_{j=1}^M 1(X_j > b − 1/b) | sup_{t∈T} f(t) > b ] · P(X₁ > b − 1/b) / P( sup_{t∈T} f(t) > b )
  ≤ (c₃ + c₅) b^d P(X₁ > b − 1/b) / P( sup_{t∈T} f(t) > b ).

By (7.4), the last quantity remains bounded as b → ∞, since P(X₁ > b − 1/b)/P(f(0) > b) is bounded for the standardized field. Applying now Proposition 7.2, which gives

P( sup_{t∈T} f(t) > b ) < P( sup_{t∈T̂} f(t) > b ) / (1 − c₀√ε),

we obtain

sup_{b>b₀, ε∈[0,ε₀]} E^Q[L_b²] / P²( sup_{t∈T̂} f(t) > b ) < ∞,

as required. □
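Since E^Q[L_b²] = E[L_b; max_j X_j > b] (the identity used at the start of the proof), the boundedness of the relative second moment can also be checked empirically by plain Monte Carlo under P at moderate levels b. The sketch below is ours, with the same toy covariance and grid conventions as the earlier sketch; it prints an estimate of E^Q[L_b²]/P²(max_j X_j > b) for a few levels, which should stay of the same order as b grows.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
eps, reps = 0.25, 200_000
for b in (1.5, 2.0, 2.5):
    grid = np.arange(0.0, 1.0, eps / b)
    M, c = len(grid), b - 1.0 / b
    Sigma = np.exp(-(grid[:, None] - grid[None, :]) ** 2)
    A = np.linalg.cholesky(Sigma + 1e-10 * np.eye(M))
    X = (A @ rng.standard_normal((M, reps))).T        # i.i.d. draws under P
    hit = X.max(axis=1) > b
    L = M * norm.sf(c) * hit / np.maximum((X > c).sum(axis=1), 1)
    # E^Q[L^2] = E_P[L], because L = (dP/dQ) 1(max_j X_j > b).
    print(b, L.mean() / hit.mean() ** 2)
```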

7.3. A final proof: Lemma 7.7. Without loss of generality we assume that the field has mean zero and unit variance. Before getting into the details of the proof of Lemma 7.7 we need a few additional lemmas, for which we adopt the following notation. Let C_i and C_ij be the first and second order derivatives of C, and define the vectors

μ₁(t) = (−C₁(t), ..., −C_d(t)),
μ₂(t) = vech( (C_ij(t), i = 1,...,d, j = i,...,d) ).

Let f′(0) and f″(0) be the gradient and vector of second order derivatives of f at 0, where f″(0) is arranged in the same order as μ₂(0). Furthermore, let μ₀₂ = μ₂₀ᵀ be a vector of second order spectral moments and μ₂₂ a matrix of fourth order spectral moments. The vectors μ₀₂, μ₂₀ and the matrix μ₂₂ are arranged so that

( 1    0   μ₀₂ )
( 0    Λ    0  )
( μ₂₀  0   μ₂₂ )

is the covariance matrix of (f(0), f′(0), f″(0)), where Λ = (−C_ij(0)). It then follows that

μ_{2·0} = μ₂₂ − μ₂₀μ₀₂

is the conditional variance of f″(0) given f(0). The following lemma, given in [5], provides a stochastic representation of f given that it has a local maximum at level u at the origin.

Lemma 7.11. Given that f has a local maximum with height u at zero (an interior point of T), the conditional field is equal in distribution to

(7.11)  f_u(t) = uC(t) − W_u βᵀ(t) + g(t),

where g(t) is a centered Gaussian random field with covariance function

γ(s, t) = C(s − t) − (C(s), μ₂(s)) ( 1, μ₀₂ ; μ₂₀, μ₂₂ )⁻¹ (C(t), μ₂(t))ᵀ − μ₁(s) Λ⁻¹ μ₁ᵀ(t),

and W_u is a d(d+1)/2-dimensional random vector, independent of g(t), with density function

(7.12)  ψ_u(w) ∝ |det( r*(w) − uΛ )| exp( −½ wᵀ μ_{2·0}⁻¹ w ) 1( r*(w) − uΛ ∈ N ),

where r*(w) is a d×d symmetric matrix whose upper triangular elements consist of the components of w. The set of negative definite matrices is denoted by N. Finally, β(t) is defined by

(α(t), β(t)) = (C(t), μ₂(t)) ( 1, μ₀₂ ; μ₂₀, μ₂₂ )⁻¹.
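To fix ideas, we record our own d = 1 specialization of Lemma 7.11 (a worked example, not taken from the source): with μ₁(t) = −C′(t), μ₂(t) = C″(t), Λ = λ₂ = −C″(0), μ₀₂ = μ₂₀ = −λ₂ and μ₂₂ = λ₄ = C⁗(0), the vector W_u is scalar, r*(w) = w, and μ_{2·0} = λ₄ − λ₂²:

```latex
% d = 1 case of Lemma 7.11, under the identifications stated in the text.
\[
f_u(t) \stackrel{d}{=} u\,C(t) - W_u\,\beta(t) + g(t),
\qquad
(\alpha(t),\beta(t)) = \bigl(C(t),\,C''(t)\bigr)
\begin{pmatrix} 1 & -\lambda_2 \\ -\lambda_2 & \lambda_4 \end{pmatrix}^{-1},
\]
\[
\gamma(s,t) = C(s-t)
- \bigl(C(s),\,C''(s)\bigr)
\begin{pmatrix} 1 & -\lambda_2 \\ -\lambda_2 & \lambda_4 \end{pmatrix}^{-1}
\begin{pmatrix} C(t) \\ C''(t) \end{pmatrix}
- \frac{C'(s)\,C'(t)}{\lambda_2},
\]
\[
\psi_u(w) \propto \lvert w - u\lambda_2 \rvert\,
\exp\!\Bigl(-\tfrac{w^2}{2(\lambda_4 - \lambda_2^2)}\Bigr)\,
\mathbf{1}\{w < u\lambda_2\}.
\]
```

Here the constraint w < uλ₂ is the one-dimensional form of the negative definiteness condition r*(w) − uΛ ∈ N.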

The following two technical lemmas, which we shall prove after completing the proof of Lemma 7.7, provide bounds for the last two terms of (7.11).

Lemma 7.12. Using the notation in (7.11), there exist δ₀, ε₁, c₁ and b₀ such that, for any a ∈ (0, a₀), u > b > b₀ and δ ∈ (0, δ₀),

P( sup_{|t|≤δ√a/b} |W_u βᵀ(t)| > a/(4b) ) ≤ c₁ exp( −ε₁ b²/δ⁴ ).

Lemma 7.13. There exist c, δ̄ and δ₀ such that, for any a ∈ (0, a₀) and δ ∈ (0, δ₀),

P( max_{|t|≤δ√a/b} |g(t)| > a/(4b) ) ≤ c exp( −δ̄/δ² ).

Proof of Lemma 7.7. Using the notation of Lemma 7.11, given any s ∈ L for which f(s) = b, the corresponding conditional distribution of f is that of f_b(· − s). Consequently, it suffices to show that

P( min_{|t|≤δ√a/b} f_b(t) < b − a/b ) ≤ c* exp( −δ*/δ² ).

We consider first the case for which the local maximum is in the interior of T. Then, by the Slepian model (7.11),

f_b(t) = bC(t) − W_b βᵀ(t) + g(t).

We study the three terms of the Slepian model individually. Since

C(t) = 1 − tᵀΛt + o(|t|²),

there exists ε₀ such that

bC(t) ≥ b − a/(4b)

for all |t| < ε₀√a/b. According to Lemmas 7.12 and 7.13, for δ < min(ε₀√a₀, δ₀),

P( min_{|t|≤δ√a/b} f_b(t) < b − a/b )
  ≤ P( max_{|t|<δ√a/b} |g(t)| > a/(4b) ) + P( sup_{|t|≤δ√a/b} |W_b βᵀ(t)| > a/(4b) )
  ≤ c exp( −δ̄/δ² ) + c₁ exp( −ε₁ b²/δ⁴ )
  ≤ c* exp( −δ*/δ² )

for some c* and δ*.

Now consider the case for which the local maximum is in the (d−1)-dimensional boundary of T. Due to convexity of T, and without loss of generality, we assume that the tangent space of ∂T is generated by ∂/∂t₂, ..., ∂/∂t_d, that the local maximum is located at the origin, and that T is a subset of the half-space {t₁ ≥ 0}. That these assumptions do not involve a loss of generality follows from the arguments on pages 291–292 of [4], which rely on the assumed stationarity of f (for translations) and the fact that rotations, while changing the distributions, will not affect the probabilities that we are currently computing.

For the origin, positioned as just described, to be a local maximum it is necessary and sufficient that the gradient of f restricted to ∂T is the zero vector, the Hessian matrix restricted to ∂T is negative definite, and ∂₁f(0) ≤ 0. Applying a version of Lemma 7.11 for this case, conditional on f(0) = u and 0 being a local maximum, the field is equal in distribution to

(7.13)  uC(t) − W_u βᵀ(t) + μ₁(t) Λ⁻¹ (Z, 0, ..., 0)ᵀ + g(t),

where Z ≤ 0 corresponds to ∂₁f(0); it follows a truncated (conditioned on the negative axis) Gaussian distribution with mean zero and a variance parameter computed as the conditional variance of ∂₁f(0) given (∂₂f(0), ..., ∂_d f(0)). The vector W_u is now a ((d−1)d/2)-dimensional random vector with density function

ψ_u(w) ∝ |det( r*(w) − uΛ )| exp( −½ wᵀ μ_{2·0}⁻¹ w ) 1( r*(w) − uΛ ∈ N ),

where now Λ denotes the second spectral moment matrix of f restricted to ∂T, and r*(w) is the (d−1)×(d−1) symmetric matrix whose upper triangular elements consist of the components of w. In the representation (7.13) the vector W_u and Z are independent. As in the proof of Lemma 7.12, one can show that a similar bound holds for the second term, albeit with different constants. Thus, since μ₁(t) = O(|t|), there exist c″ and δ″ such that the third term in (7.13) can be bounded by

P( max_{|t|≤δ√a/b} |μ₁(t) Λ⁻¹ (Z, 0, ..., 0)ᵀ| ≥ a/(4b) ) ≤ c″ exp( −δ″/δ² ).

Consequently, we can also find c* and δ* such that the conclusion holds, and we are done. □

We complete the paper with the proofs of Lemmas 7.12 and 7.13.

Proof of Lemma 7.12. It suffices to prove the lemma for the case a = 1. Since

f_u(0) = u = u − W_u βᵀ(0) + g(0),

and W_u and g are independent, β(0) = 0 and g(0) = 0. Furthermore, since C′(t) = O(|t|) and μ₂′(t) = O(|t|), there exists c₀ such that |β(t)| ≤ c₀|t|². In addition, W_u has density function

ψ_u(w) ∝ |det( r*(w) − uΛ )| exp( −½ wᵀ μ_{2·0}⁻¹ w ) 1( r*(w) − uΛ ∈ N ).

Note that det( r*(w) − uΛ ) is expressible as a polynomial in w and u, and there exist ε₀ and c such that

| det( r*(w) − uΛ ) / det( −uΛ ) | ≤ c

if |w| ≤ ε₀u. Hence there exist ε₂, c₂ > 0 such that

ψ_u(w) ≤ ψ̄(w) := c₂ exp( −½ ε₂ wᵀ μ_{2·0}⁻¹ w )

for all u ≥ 1. The right hand side here is proportional to a multivariate Gaussian density. Thus

P( |W_u| > x ) = ∫_{|w|>x} ψ_u(w) dw ≤ ∫_{|w|>x} ψ̄(w) dw = c₃ P( |W̄| > x ),

where W̄ is a multivariate Gaussian random vector with density function proportional to ψ̄. Therefore, by choosing ε₁ and c₁ appropriately, we have

P( sup_{|t|≤δ/b} |W_u βᵀ(t)| > 1/(4b) ) ≤ P( |W_u| > b/(4c₀δ²) ) ≤ c₁ exp( −ε₁ b²/δ⁴ )

for all u ≥ b. □

Proof of Lemma 7.13. Once again, it suffices to prove the lemma for the case a = 1. Since

f_b(0) = b = b − W_b βᵀ(0) + g(0),

the covariance function (γ(s, t), s, t ∈ T) of the centered field g satisfies γ(0, 0) = 0. It is also easy to check that

∂_s γ(s, t) = O(|s| + |t|),  ∂_t γ(s, t) = O(|s| + |t|).

Consequently, there exists a constant c_γ ∈ (0,∞) for which

γ(s, t) ≤ c_γ (|s|² + |t|²),  γ(s, s) ≤ c_γ |s|².

We need to control the tail probability of sup_{|t|≤δ/b} |g(t)|. For this it is useful to introduce the following scaling. Define

g_δ(t) = (b/δ) g(δt/b).

Then sup_{|t|≤δ/b} g(t) ≥ 1/(4b) if and only if sup_{|t|≤1} g_δ(t) ≥ 1/(4δ). Let

σ_δ(s, t) = E( g_δ(s) g_δ(t) ).

Then

sup_{|s|≤1} σ_δ(s, s) ≤ c_γ.

Because γ(s, t) is at least twice differentiable, applying a Taylor expansion we easily see that the canonical metric d_g corresponding to g_δ (cf. Theorem 6.7) can be bounded as follows:

d_g²(s, t) = E( g_δ(s) − g_δ(t) )² = (b²/δ²) [ γ(δs/b, δs/b) + γ(δt/b, δt/b) − 2γ(δs/b, δt/b) ] ≤ c |s − t|²

for some constant c ∈ (0,∞). Therefore the entropy of g_δ satisfies N(ε) ≤ Kε^{−d} for any ε > 0, with an appropriate choice of K > 0. Therefore, for all δ < δ₀,

P( sup_{|t|≤δ/b} |g(t)| ≥ 1/(4b) ) = P( sup_{|t|≤1} |g_δ(t)| ≥ 1/(4δ) ) ≤ c_d δ^{−d−η} exp( −1/(16 c_γ δ²) )

for some constants c_d and η > 0. The last inequality is a direct application of Theorem 4.1.1 of [4]. The conclusion of the lemma follows immediately by choosing c and δ̄ appropriately. □

REFERENCES

[1] R.J. Adler, K. Bartz, and S. Kou. Estimating curvatures in the Gaussian kinematic formula. In preparation.
[2] R.J. Adler, J. Blanchet, and J. Liu. Efficient simulation for tail probabilities of Gaussian random fields. In WSC '08: Proceedings of the 40th Conference on Winter Simulation, pages 328–336. Winter Simulation Conference, 2008.
[3] R.J. Adler, P. Müller, and B. Rozovskii, editors. Stochastic Modelling in Physical Oceanography. Birkhäuser, Boston, 1996.
[4] R.J. Adler and J.E. Taylor. Random Fields and Geometry. Springer, New York, USA, 2007.
[5] R.J. Adler, J.E. Taylor, and K. Worsley. Applications of Random Fields and Geometry: Foundations and Case Studies. http://webee.technion.ac.il/people/adler/hrf.pdf.
[6] S. Asmussen and P. Glynn. Stochastic Simulation: Algorithms and Analysis. Springer-Verlag, New York, NY, USA, 2008.
[7] J.M. Azaïs and M. Wschebor. Level Sets and Extrema of Random Processes and Fields. Wiley, New York, NY, 2009.
[8] J.M. Bardeen, J.R. Bond, N. Kaiser, and A.S. Szalay. The statistics of peaks of Gaussian random fields. The Astrophysical Journal, 304:15–61, 1986.
[9] S.M. Berman. Sojourns and Extremes of Stochastic Processes. The Wadsworth & Brooks/Cole Statistics/Probability Series. Wadsworth & Brooks/Cole Advanced Books & Software, Pacific Grove, CA, 1992.
[10] J. Blanchet. Efficient importance sampling for binary contingency tables. Ann. of Appl. Prob., 19:949–982, 2009.
[11] C. Borell. The Brunn-Minkowski inequality in Gauss space. Invent. Math., 30:205–216, 1975.
[12] J. Bucklew. Introduction to Rare-Event Simulation. Springer-Verlag, New York, NY, 2004.
[13] M.R. Dennis. Nodal densities of planar Gaussian random waves. Eur. Phys. J., 145:191–210, 2007.
[14] R.M. Dudley. Sample functions of the Gaussian process. Ann. Probab., 1:66–103, 1973.
[15] K.J. Friston, K.J. Worsley, R.S.J. Frackowiak, J.C. Mazziotta, and A.C. Evans. Assessing the significance of focal activations using their spatial extent. Human Brain Mapping, 1:214–220, 1994.
[16] M.R. Leadbetter, G. Lindgren, and H. Rootzén. Extremes and Related Properties of Random Sequences and Processes. Springer-Verlag, New York, 1983.
[17] M. Mitzenmacher and E. Upfal. Probability and Computing: Randomized Algorithms and Probabilistic Analysis. Cambridge University Press, 2005.
[18] W. Niemiro and P. Pokarowski. Fixed precision MCMC estimation by median of products of averages. J. Appl. Probab., 46:309–329, 2009.
[19] V. Piterbarg. Asymptotic Methods in the Theory of Gaussian Processes. American Mathematical Society, USA, 1995.
[20] S.F. Shandarin. Testing non-Gaussianity in cosmic microwave background maps by morphological statistics. Mon. Not. R. Astr. Soc., 331:865–874, 2002.
[21] S.F. Shandarin, H.A. Feldman, Y. Xu, and M. Tegmark. Morphological measures of non-gaussianity in cosmic microwave background maps. Astrophys. J. Suppl., 141:1–11, 2002.
[22] L. Shepp and H. Landau. On the supremum of a Gaussian process. Sankhya A, 32:369–378, December 1970.
[23] J.E. Taylor, A. Takemura, and R.J. Adler. Validity of the expected Euler characteristic heuristic. Ann. Probab., 33(4):1362–1396, 2005.
[24] J.E. Taylor and K.J. Worsley. Detecting sparse signals in random fields, with an application to brain mapping. J. Amer. Statist. Assoc., 102(479):913–928, 2007.
[25] J. Traub, G. Wasilokowski, and H. Wozniakowski. Information-Based Complexity. Academic Press, New York, NY, 1988.
[26] B.S. Tsirelson, I.A. Ibragimov, and V.N. Sudakov. Norms of Gaussian sample functions. In Proceedings of the Third Japan-USSR Symposium on Probability Theory (Tashkent, 1975), Lecture Notes in Math. 550, pages 20–41, 1976.
[27] V.S. Tsirel'son. The density of the maximum of a Gaussian process. Theory of Probab. Appl., 20:847–856, 1975.
[28] H. Wozniakowski. Computational complexity of continuous problems. Technical Report, 1996.

Robert J. Adler
Electrical Engineering
Technion, Haifa, Israel 32000
E-mail: robert@ee.technion.ac.il
URL: webee.technion.ac.il/people/adler

Jose H. Blanchet
Industrial Engineering & Operations Research
Columbia University
340 S.W. Mudd Building, 500 W. 120 Street
New York, New York 10027
E-mail: jose.blanchet@columbia.edu
URL: www.ieor.columbia.edu/fac-bios/blanchet/faculty.html

Jingchen Liu
Department of Statistics
Columbia University
1255 Amsterdam Avenue, Room 1030
New York, NY 10027
E-mail: jcliu@stat.columbia.edu
URL: stat.columbia.edu/~jcliu



The second aspect of improvement involves the variance In the case of Holdercontinuous fields the ratio of the second moment of the estimator and w (b)2 wasshown to be bounded by a quantity that is of order O(M2) In contrast in thecontext of smooth and homogeneous fields considered here the next result showsthat this ratio is bounded uniformly for b gt b0 and M = (T ) ge cbd That is thevariance remains strongly controlled

Theorem 75 Suppose f is a Gaussian random field satisfying conditions B1and B2 Then there exist constants c b0 and ε0 such that for any εb-regulardiscretization T of T we have

supbgtb0εisin[0ε0]

EQL2b

P 2 (suptisinT f (t) gt b)le sup

bgtb0εisin[0ε0]

EQL2b

P 2(sup

tisinTf (t) gt b

) le c

for some c isin (0infin)

The proof of this result is given in Section 72 The fact that the number ofreplications remains bounded in b is a consequence of the strong control on thevariance

Finally we note that the proof of Theorem 33 follows as a direct corollary of The-orem 75 together with Proposition 72 and our discussion in Section 2 Assumingthat placing each point in T takes no more than c units of computer time the totalcomplexity is according to the discussion in section Section 2 O

(nM3 + M

)=

O(εminus2δminus1M3 + M

) The contribution of the term M3 = O

(εminus6b3

)comes from

the complexity of applying Cholesky factorization and the term M = O(εminus2b)corresponds to the complexity of placing T

Remark 76 Condition B2 imposes a convexity assumption on the boundaryof T This assumption although convenient in the development of the proofs of The-orems 74 and 75 is not necessary The results can be generalized at the expenseof increasing the length and the burden in the technical development to the casein which T is a d-dimensional manifold satisfying the so-called Whitney conditions([4])

The remainder of this section is devoted to the proof of Proposition 72 Theorem74 and Theorem 75

30 ADLER BLANCHET LIU

71 Bias Control Proofs of Proposition 72 and Theorem 74 We start withsome useful lemmas for all of which we assume that Conditions B1 and B2 are sat-isfied We shall also assume that the global maximum of f over T is achieved withprobability one at a single point in T Additional conditions under which this willhappen can be found in [4] and require little more than the non-degeneracy of thejoint distribution of f and its first and second order derivatives Of these lemmasLemma 77 the proof of which we defer to Section 73 is of central importance

Lemma 77 For any a0 gt 0 there exists clowast δlowast b0 and δ0 (which depend onthe choice of a0) such that for any s isin T a isin (0 a0) δ isin (0 δ0) and b gt b0

P ( min|tminuss|ltδabminus1

f(t) lt b|f(s) = b + ab s isin L) le clowast exp(minusδlowast

δ2

)

where L is the set of local maxima of f

The next lemma is a direct corollary of Lemma 77

Lemma 78 Let tlowast be the point in T at which the global maximum of f isattained Then with the same choice of constants as in Lemma 77 for any a isin(0 a0) δ isin (0 δ0) and b gt b0 we have

P ( min|tminustlowast|ltδabminus1

f(t) lt b|f(tlowast) = b + ab) le 2clowast exp(minusδlowast

δ2

)

Proof Note that for any s isin T

P

(min

|t|leδabf(t) lt b|f(s) = b + ab s = tlowast

)

=P

(min|t|leδab f(t) lt b f(s) = b + ab s = tlowast

)

P (f(s) = b + ab s = tlowast)

le P(min|t|leδab f(t) lt b f(s) = b + ab s isin L)

P (f(s) = b + ab s = tlowast)

= P ( min|tminuss|ltδabminus1

f(t) lt b|f(s) = b + ab s isin L)P (f(s) = b + ab s isin L)P (f(s) = b + ab s = tlowast)

le clowast exp(minusδlowast

δ2

)P (f(s) = b + ab s isin L)P (f(s) = b + ab s = tlowast)

Appling this bound and the fact that

P (f(s) = b + ab s isin L)P (f(s) = b + ab s = tlowast)

rarr 1

as b rarrinfin the conclusion of the Lemma holds 2

MONTE CARLO FOR RANDOM FIELDS 31

The following lemma gives a bound on the density of suptisinT f(t) which will beused to control the size of overshoot beyond level b

Lemma 79 Let pflowast(x) be the density function of suptisinT f(t) Then there existsa constant cflowast and b0 such that

pflowast(x) le cflowastxd+1P (f(0) gt x)

for all x gt b0

Proof Recalling (13) let the continuous function pE(x) x isin R be defined bythe relationship

E(χ(t isin T f(t) ge b)) =int infin

b

pE(x) dx

where the left hand side is the expected value of the Euler-Poincare characteristicof Ab Then according to Theorem 810 in [7] there exists c and δ such that

|pE(x)minus pflowast(x)| lt cP (f(0) gt (1 + δ)x)

for all x gt 0 In addition thanks to the result of [4] which providesintinfin

bpE(x)dx in

closed form there exists c0 such that for all x gt 1

pE(x) lt c0xd+1P (f(0) gt x)

Hence there exists cflowast such that

pflowast(x) le c0xd+1P (f(0) gt x) + cP (f(0) gt (1 + δ)x) le cflowastx

d+1P (f(0) gt x)

for all x gt 1 2

The last ingredients required to provide the proof of Proposition 72 and Theorem74 are stated in the following well known Theorem adapted from Lemma 61 andTheorem 72 in [19] to the twice differentiable case

Theorem 710 There exists a constant H (depending on the covariance func-tion C such that

P (suptisinT

f(t) gt b) = (1 + o (1))Hm(T )bdP (f(0) gt b)(74)

as b rarrinfinSimilarly choose δ small enough so that [0 δ]d sub T and let ∆0 = [0 bminus1]d Then

there exists a constant H1 such that

P ( suptisin∆0

f(t) gt b) = (1 + o(1))H1P (f(0) gt b)(75)

as b rarrinfin

32 ADLER BLANCHET LIU

We now are ready to provide the proof of Proposition 72 and Theorem 74

Proof of Proposition 72 The fact that there exists c1 such that

(T ) le c1m(T )εminusdbd

is immediate from assumption B2 Therefore we proceed to provide a bound forthe relative bias We choose ε0 lt δ2

0 (δ0 is given in Lemmas 77 and 78) and notethat for any ε lt ε0

P (suptisinT

f(t) lt b| suptisinT

f(t) gt b)

le P (suptisinT

f(t) lt b + 2radic

εb| suptisinT

f(t) gt b) + P (suptisinT

f(t) lt b| suptisinT

f(t) gt b + 2radic

εb)

By (74) and Lemma 79 there exists c2 such that the first term above can boundedby

P (suptisinT

f(t) lt b + 2radic

εb| suptisinT

f(t) gt b) le c2

radicε

The second term can be bounded by

P (suptisinT

f(t) lt b| suptisinT

f(t) gt b + 2radic

εb)

le P ( sup|tminustlowast|lt2εbminus1

f(t) lt b| suptisinT

f(t) gt b + 2radic

ε)

le 2clowast exp(minusδlowastεminus1

)

Hence there exists c0 such that

P (suptisinT

f(t) lt b| suptisinT

f(t) gt b) le c2

radicε + 2clowasteminusδlowastε le c0

radicε

for all ε isin (0 ε0) 2

Proof of Theorem 74 We write θ = 1minus 3δ isin (0 1) First note that by (74)

P (supT

f(t) gt b + b2δminus1| supT

f(t) gt b) rarr 0

as b rarrinfin Let tlowast be the position of the global maximum of f in T According to theexact Slepian model in Section 73 and an argument similar to the proof of Lemma78

(76) P ( sup|tminustlowast|gtb2δminus1

f(t) gt b|b lt f(tlowast) le b + b2δminus1) rarr 0

as b rarrinfin Consequently

P ( sup|tminustlowast|gtb2δminus1

f(t) gt b| supT

f(t) gt b) rarr 0

MONTE CARLO FOR RANDOM FIELDS 33

LetB(T b2δminus1) = cup

tisinTB(t b2δminus1)

We have

P (supT

f(t) gt b| supT

f(t) gt b)

le P ( sup|tminustlowast|gtb2δminus1

f(t) gt b| supT

f(t) gt b)

+ P (supT

f(t) gt b sup|tminustlowast|gtb2δminus1

f(t) le b| supT

f(t) gt b)

le o(1) + P (tlowast isin B(T b2δminus1)| supT

f(t) gt b)

le o(1) + P ( supB(T b2δminus1)

f(t) gt b| supT

f(t) gt b)

Since (T ) le b(1minus3δ)d we can find a finite set T prime = tprime1 tprimel sub T and let ∆k =tprimek + [0 bminus1] such that l = O(b(1minusδ)d) and B(T b2δminus1) sub cupl

k=1∆k The choice ofl only depends on (T ) not the particular distribution of T Therefore applying(75)

sup(Tlebθd)

P ( supB(T b2δminus1)

f(t) gt b) le O(b(1minusδ)d)P (f(0) gt b)

This together with (74) yields

sup(Tlebθd)

P ( supB(T b2δminus1)

f(t) gt b| supT

f(t) gt b) le O(bminusδd) = o(1)

for b ge b0 which clearly implies the statement of the result 2

72 Variance Control Proof of Theorem 75 We proceed directly to the proofof Theorem 75

34 ADLER BLANCHET LIU

Proof of Theorem 75 Note that

EQL2b

P 2 (suptisinT f (t) gt b)

=E(Lb)

P 2 (suptisinT f (t) gt b)

=E

(MtimesP (X1gtbminus1b)sumn

j=11(Xjgtbminus 1

b )maxj Xj gt b

)

P 2 (suptisinT f (t) gt b)

=E

(MtimesP (X1gtbminus1b)sumn

j=11(Xjgtbminus 1

b )maxj Xj gt b suptisinT f (t) gt b

)

P 2 (suptisinT f (t) gt b)

=E

(MP (X1gtbminus1b)1(maxj Xjgtb)sumn

j=11(Xjgtbminus1b)

∣∣∣∣ suptisinT f (t) gt b

)

P (suptisinT f (t) gt b)

= E

(M1 (maxj Xj gt b)sumnj=1 1 (xj gt bminus 1b)

∣∣∣∣∣ suptisinT

f (t) gt b

)P (X1 gt bminus 1b)

P (suptisinT f (t) gt b)

The remainder of the proof involves showing that the last conditional expectationhere is of order O(bd) This together with (74) will yield the result Note that forany A (b ε) such that A (b ε) = Θ(M) uniformly over b and ε we can write

E

(M times 1 (maxj Xj gt b)sumM

j=1 1(Xj gt bminus 1

b

)∣∣∣∣∣ sup

tisinTf (t) gt b

)(77)

le E

M times 1

(sumMj=1 1 (Xj gt bminus 1b) ge M

A(bε)

)

sumMj=1 1

(xj gt bminus 1

b

)∣∣∣∣∣∣suptisinT

f (t) gt b

+ E

M times 1

(1 le sumM

j=1 1 (Xj gt bminus 1b) lt MA(bε)

)

sumMj=1 1

(xj gt bminus 1

b

)∣∣∣∣∣∣suptisinT

f (t) gt b

We shall select A(b ε) appropriately in order to bound the expectations above ByLemma 78 for any 4ε le δ le δ0 there exist constants cprime and cprimeprime isin (0infin) such that

clowast exp(minusδlowast

δ2

)ge P

(min

|tminustlowast|leδbf (t) lt bminus 1b

∣∣∣∣ suptisinT

f (t) gt b

)

ge P

Msum

j=1

1 (Xj gt bminus 1b) le cprimeδdεd

∣∣∣∣∣∣suptisinT

f (t) gt b

ge P

(MsumM

j=1 1(Xj gt bminus 1b)ge bdcprimeprime

δd

∣∣∣∣∣ suptisinT

f (t) gt b

)(78)

MONTE CARLO FOR RANDOM FIELDS 35

The first inequality is an application of Lemma 78 The second inequality is dueto the fact that for any ball B of radius 4ε or larger (T cap B) ge cprimedεminusd for somecprime gt 0 Inequality (78) implies that for all x such that bdcprimeprimeδd

0 lt x lt bdcprimeprime[4dεd]there exists δlowastlowast gt 0 such that

P

(MsumM

j=1 1(Xj gt bminus 1

b

)bdge x

∣∣∣∣∣ suptisinT

f (t) gt b

)le clowast exp

(minusδlowastlowastx2d

)

Now let A (b ε) = bdcprimeprime(4dεd) and observe that by the second result in (71) wehave A (b ε) = Θ (M) and moreover that there exists c3 such that the first termon the right hand side of (77) is bounded by

(79) E

M times 1

(sumMj=1 1 (Xj gt bminus 1b) ge M

A(bε)

)

sumMj=1 1 (Xj gt bminus 1b)

∣∣∣∣∣∣supT

f (t) gt b

le c3b

d

Now we turn to the second term on the right hand side of (77) We use the factthat MA(b ε) le cprimeprimeprime isin (0infin) (uniformly as b rarrinfin and ε rarr 0) There exist c4 andc5 such that for ε le δ0c4

E

M1

(1 le sumM

j=1 1(Xj gt bminus 1

b

)lt M

A(bε)

)

sumMj=1 1

(Xj gt bminus 1

b

)∣∣∣∣∣∣supT

f (t) gt b

(710)

le MP

nsum

j=1

1(

Xj gt bminus 1b

)lt cprimeprimeprime

∣∣∣∣∣∣supT

f (t) gt b

le MP

(min

|tminustlowast|lec4εbf (t) lt bminus 1b

∣∣∣∣ supT

f (t) gt b

)

le c1m(T )bdεminusd exp(minus δlowast

c24ε

2

)le c5b

d

The second inequality holds from the fact that ifsumM

j=1 1(Xj gt bminus 1

b

)is less than

cprimeprimeprime then the minimum of f in a ball around the local maximum and of radius c4εb

must be less than bminus 1b Otherwise there are more than cprimeprimeprime elements of T insidesuch a ball The last inequality is due to Lemma 78 and Theorem 72

Putting (79) and (710) together we obtain for all εb -regular discretizationswith ε lt ε0 = min(14 1c4)δ0

EQL2b

P (suptisinT f (t) gt b)le E

(M1 (maxj Xj gt b)sumM

j=1 1(Xj gt bminus 1

b

)∣∣∣∣∣ sup

tisinTf (t) gt b

)P (X1 gt bminus 1b)

P (suptisinT f (t) gt b)

le (c3 + c5)bdP (X1 gt bminus 1b)P (suptisinT f (t) gt b)

Applying now (74) and Proposition 72 we have that

P (suptisinT

f(t) gt b) ltP (sup

tisinTf(t) gt b)

1minus c0radic

ε

36 ADLER BLANCHET LIU

we have

supbgtb0εisin[0ε0]

EQL2b

P(sup

tisinTf (t) gt b

) lt infin

as required 2

7.3. A final proof: Lemma 7.7. Without loss of generality, we assume that the field has mean zero and unit variance. Before getting into the details of the proof of Lemma 7.7, we need a few additional lemmas, for which we adopt the following notation. Let $C_i$ and $C_{ij}$ be the first and second order derivatives of $C$, and define the vectors
\begin{align*}
\mu_1(t) &= \left(-C_1(t),\dots,-C_d(t)\right),\\
\mu_2(t) &= \mathrm{vech}\left(\left(C_{ij}(t),\ i=1,\dots,d,\ j=i,\dots,d\right)\right).
\end{align*}
Let $f'(0)$ and $f''(0)$ be the gradient and vector of second order derivatives of $f$ at $0$, where $f''(0)$ is arranged in the same order as $\mu_2(0)$. Furthermore, let $\mu_{02} = \mu_{20}^\top$ be a vector of second order spectral moments and $\mu_{22}$ a matrix of fourth order spectral moments. The vectors $\mu_{02}$ and $\mu_{22}$ are arranged so that
\[
\begin{pmatrix} 1 & 0 & \mu_{02}\\ 0 & \Lambda & 0\\ \mu_{20} & 0 & \mu_{22} \end{pmatrix}
\]
is the covariance matrix of $(f(0), f'(0), f''(0))$, where $\Lambda = (-C_{ij}(0))$. It then follows that
\[
\mu_{2\cdot 0} = \mu_{22} - \mu_{20}\mu_{02}
\]
is the conditional variance of $f''(0)$ given $f(0)$. The following lemma, given in [5], provides a stochastic representation of $f$ given that it has a local maximum at level $u$ at the origin.

Lemma 7.11. Given that $f$ has a local maximum with height $u$ at zero (an interior point of $T$), the conditional field is equal in distribution to
\[
f_u(t) = uC(t) - W_u\beta^\top(t) + g(t), \tag{7.11}
\]
where $g(t)$ is a centered Gaussian random field with covariance function
\[
\gamma(s,t) = C(s-t) - \left(C(s),\mu_2(s)\right)\begin{pmatrix} 1 & \mu_{02}\\ \mu_{20} & \mu_{22}\end{pmatrix}^{-1}\begin{pmatrix} C(t)\\ \mu_2^\top(t)\end{pmatrix} - \mu_1(s)\Lambda^{-1}\mu_1^\top(t),
\]
and $W_u$ is a $d(d+1)/2$-dimensional random vector, independent of $g(t)$, with density function
\[
\psi_u(w) \propto \left|\det\left(r^*(w) - u\Lambda\right)\right|\exp\left(-\tfrac{1}{2}w^\top\mu_{2\cdot 0}^{-1}w\right)1\left(r^*(w) - u\Lambda \in \mathcal{N}\right), \tag{7.12}
\]
where $r^*(w)$ is a $d\times d$ symmetric matrix whose upper triangular elements consist of the components of $w$. The set of negative definite matrices is denoted by $\mathcal{N}$. Finally, $\beta(t)$ is defined by
\[
\left(\alpha(t), \beta(t)\right) = \left(C(t), \mu_2(t)\right)\begin{pmatrix} 1 & \mu_{02}\\ \mu_{20} & \mu_{22}\end{pmatrix}^{-1}.
\]
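To fix ideas, it may help to record what this representation reduces to when $d = 1$; the following display is our own specialization of (7.11) and (7.12), writing $\lambda_2 = -C''(0)$ and $\lambda_4 = C''''(0)$ for the second and fourth spectral moments, so that $\Lambda = \lambda_2$, $\mu_2(t) = C''(t)$ and $\mu_{2\cdot 0} = \lambda_4 - \lambda_2^2$:
\[
f_u(t) = uC(t) - W_u\beta(t) + g(t), \qquad
\beta(t) = \frac{\lambda_2 C(t) + C''(t)}{\lambda_4-\lambda_2^2},
\]
where $W_u$ is now a scalar with density
\[
\psi_u(w) \propto \left(u\lambda_2 - w\right)\exp\left(-\frac{w^2}{2(\lambda_4-\lambda_2^2)}\right)1\left(w < u\lambda_2\right).
\]
In particular, $\beta(0) = 0$, so that $f_u(0) = u + g(0)$ with $\mathrm{Var}(g(0)) = \gamma(0,0) = 0$; both of these facts reappear in the proofs of Lemmas 7.12 and 7.13 below.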

The following two technical lemmas, which we shall prove after completing the proof of Lemma 7.7, provide bounds for the last two terms of (7.11).

Lemma 7.12. Using the notation in (7.11), there exist $\delta_0$, $\varepsilon_1$, $c_1$ and $b_0$ such that, for any $u > b > b_0$ and $\delta\in(0,\delta_0)$,
\[
P\left(\sup_{|t|\le \delta a/b}\left|W_u\beta^\top(t)\right| > \frac{a}{4b}\right) \le c_1\exp\left(-\varepsilon_1\frac{b^2}{\delta^4}\right).
\]

Lemma 7.13. There exist $c$, $\tilde{\delta}$ and $\delta_0$ such that, for any $\delta\in(0,\delta_0)$,
\[
P\left(\max_{|t|\le \delta a/b}|g(t)| > \frac{a}{4b}\right) \le c\exp\left(-\frac{\tilde{\delta}}{\delta^2}\right).
\]

Proof of Lemma 7.7. Using the notation of Lemma 7.11, given any $s\in\mathcal{L}$ for which $f(s) = b$, the corresponding conditional distribution of the field is that of $f_b(\cdot - s)$. Consequently, it suffices to show that
\[
P\left(\min_{|t|\le \delta a/b} f_b(t) < b - a/b\right) \le c^*\exp\left(-\frac{\delta^*}{\delta^2}\right);
\]
this is the statement of the lemma with $b+a/b$ relabeled as $b$, which affects only the constants. We consider first the case in which the local maximum is in the interior of $T$. Then, by the Slepian model (7.11),
\[
f_b(t) = bC(t) - W_b\beta^\top(t) + g(t).
\]
We study the three terms of the Slepian model individually. Since
\[
C(t) = 1 - t^\top\Lambda t + o\left(|t|^2\right),
\]
there exists an $\varepsilon_0$ such that
\[
bC(t) \ge b - \frac{a}{4b}
\]

for all $|t| < \varepsilon_0\sqrt{a}/b$. According to Lemmas 7.12 and 7.13, for $\delta < \min\left(\varepsilon_0/\sqrt{a_0},\ \delta_0\right)$,
\begin{align*}
P\left(\min_{|t|\le \delta a/b} f_b(t) < b - a/b\right)
&\le P\left(\max_{|t| < \delta a/b}|g(t)| > \frac{a}{4b}\right) + P\left(\sup_{|t|\le \delta a/b}\left|W_b\beta^\top(t)\right| > \frac{a}{4b}\right)\\
&\le c\exp\left(-\frac{\tilde{\delta}}{\delta^2}\right) + c_1\exp\left(-\varepsilon_1\frac{b^2}{\delta^4}\right)\\
&\le c^*\exp\left(-\frac{\delta^*}{\delta^2}\right)
\end{align*}

for some $c^*$ and $\delta^*$.

Now consider the case for which the local maximum lies on the $(d-1)$-dimensional boundary of $T$. Due to the convexity of $T$, and without loss of generality, we assume that the tangent space of $\partial T$ is generated by $\partial/\partial t_2,\dots,\partial/\partial t_d$, that the local maximum is located at the origin, and that $T$ is a subset of the half-space $t_1\ge 0$. That these assumptions do not involve a loss of generality follows from the arguments on pages 291--292 of [4], which rely on the assumed stationarity of $f$ (for translations) and the fact that rotations, while changing the distributions, will not affect the probabilities that we are currently computing.

For the origin, positioned as just described, to be a local maximum, it is necessary and sufficient that the gradient of $f$ restricted to $\partial T$ is the zero vector, that the Hessian matrix restricted to $\partial T$ is negative definite, and that $\partial_1 f(0)\le 0$. Applying a version of Lemma 7.11 for this case, conditional on $f(0) = u$ and $0$ being a local maximum, the field is equal in distribution to
\[
uC(t) - \widetilde{W}_u\widetilde{\beta}^\top(t) + \mu_1(t)\Lambda^{-1}(Z,0,\dots,0)^\top + g(t), \tag{7.13}
\]
where $Z\le 0$ corresponds to $\partial_1 f(0)$ and follows a truncated (conditioned on the negative axis) Gaussian distribution with mean zero and a variance parameter which is computed as the conditional variance of $\partial_1 f(0)$ given $(\partial_2 f(0),\dots,\partial_d f(0))$. The vector $\widetilde{W}_u$ is a $d(d-1)/2$-dimensional random vector with density function
\[
\widetilde{\psi}_u(w) \propto \left|\det\left(r^*(w) - u\widetilde{\Lambda}\right)\right|\exp\left(-\tfrac{1}{2}w^\top\widetilde{\mu}_{2\cdot 0}^{-1}w\right)1\left(r^*(w) - u\widetilde{\Lambda}\in\mathcal{N}\right),
\]
where $\widetilde{\Lambda}$ is the second spectral moment matrix of $f$ restricted to $\partial T$ and $r^*(w)$ is the $(d-1)\times(d-1)$ symmetric matrix whose upper triangular elements consist of the components of $w$. In the representation (7.13), the vector $\widetilde{W}_u$ and $Z$ are independent. As in the proof of Lemma 7.12, one can show that a similar bound holds for the term $\widetilde{W}_u\widetilde{\beta}^\top(t)$, albeit with different constants. Thus, since $\mu_1(t) = O(t)$, there exist $c''$ and $\delta''$ such that the third term in (7.13) can be bounded by
\[
P\left[\max_{|t|\le \delta a/b}\left|\mu_1(t)\Lambda^{-1}(Z,0,\dots,0)^\top\right| \ge a/(4b)\right] \le c''\exp\left(-\frac{\delta''}{\delta^2}\right).
\]
Consequently, we can also find $c^*$ and $\delta^*$ such that the conclusion holds, and we are done. $\Box$

We complete the paper with the proofs of Lemmas 7.12 and 7.13.

Proof of Lemma 7.12. It suffices to prove the lemma for the case $a = 1$. Since
\[
f_u(0) = u = u - W_u\beta^\top(0) + g(0),
\]
and $W_u$ and $g$ are independent, $\beta(0) = 0$. Furthermore, since $C'(t) = O(t)$ and $\mu_2'(t) = O(t)$, there exists a $c_0$ such that $|\beta(t)|\le c_0|t|^2$. In addition, $W_u$ has density function
\[
\psi_u(w) \propto \left|\det\left(r^*(w) - u\Lambda\right)\right|\exp\left(-\tfrac{1}{2}w^\top\mu_{2\cdot 0}^{-1}w\right)1\left(r^*(w) - u\Lambda\in\mathcal{N}\right).
\]
Note that $\det(r^*(w) - u\Lambda)$ is expressible as a polynomial in $w$ and $u$, and there exist some $\varepsilon_0$ and $c$ such that
\[
\left|\frac{\det\left(r^*(w) - u\Lambda\right)}{\det\left(-u\Lambda\right)}\right| \le c
\]
if $|w|\le\varepsilon_0 u$. Hence there exist $\varepsilon_2, c_2 > 0$ such that
\[
\psi_u(w) \le \bar{\psi}(w) = c_2\exp\left(-\tfrac{1}{2}\varepsilon_2 w^\top\mu_{2\cdot 0}^{-1}w\right)
\]
for all $u\ge 1$. The right-hand side here is proportional to a multivariate Gaussian density. Thus,
\[
P(|W_u| > x) = \int_{|w|>x}\psi_u(w)\,dw \le \int_{|w|>x}\bar{\psi}(w)\,dw = c_3 P(|\bar{W}| > x),
\]
where $\bar{W}$ is a multivariate Gaussian random variable with density function proportional to $\bar{\psi}$. Therefore, by choosing $\varepsilon_1$ and $c_1$ appropriately, we have
\[
P\left(\sup_{|t|\le\delta/b}\left|W_u\beta^\top(t)\right| > \frac{1}{4b}\right) \le P\left(|W_u| > \frac{b}{4c_0\delta^2}\right) \le c_1\exp\left(-\varepsilon_1\frac{b^2}{\delta^4}\right)
\]
for all $u\ge b$. $\Box$

Proof of Lemma 7.13. Once again, it suffices to prove the lemma for the case $a = 1$. Since
\[
f_b(0) = b = b - W_b\beta^\top(0) + g(0),
\]
the covariance function $(\gamma(s,t),\ s,t\in T)$ of the centered field $g$ satisfies $\gamma(0,0) = 0$. It is also easy to check that
\[
\partial_s\gamma(s,t) = O\left(|s| + |t|\right), \qquad \partial_t\gamma(s,t) = O\left(|s| + |t|\right).
\]
Consequently, there exists a constant $c_\gamma\in(0,\infty)$ for which
\[
\gamma(s,t) \le c_\gamma\left(|s|^2 + |t|^2\right), \qquad \gamma(s,s)\le c_\gamma|s|^2.
\]
We need to control the tail probability of $\sup_{|t|\le\delta/b}|g(t)|$. For this it is useful to introduce the following scaling. Define
\[
g_\delta(t) = \frac{b}{\delta}\,g\!\left(\frac{\delta t}{b}\right).
\]
Then $\sup_{|t|\le\delta/b}|g(t)|\ge \frac{1}{4b}$ if and only if $\sup_{|t|\le 1}|g_\delta(t)|\ge\frac{1}{4\delta}$.
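This equivalence is just a change of variables; spelled out (our addition, for completeness):
\[
\sup_{|t|\le\delta/b}|g(t)| = \sup_{|s|\le 1}\left|g\!\left(\frac{\delta s}{b}\right)\right| = \frac{\delta}{b}\sup_{|s|\le 1}|g_\delta(s)|,
\]
and $\frac{\delta}{b}\,x\ge\frac{1}{4b}$ precisely when $x\ge\frac{1}{4\delta}$.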

Let
\[
\sigma_\delta(s,t) = E\left(g_\delta(s)g_\delta(t)\right).
\]
Then
\[
\sup_{|s|\le 1}\sigma_\delta(s,s) \le c_\gamma.
\]
Because $\gamma(s,t)$ is at least twice differentiable, applying a Taylor expansion we easily see that the canonical metric $d_g$ corresponding to $g_\delta$ (cf. Theorem 6.7) can be bounded as follows:
\[
d_g^2(s,t) = E\left(g_\delta(s) - g_\delta(t)\right)^2 = \frac{b^2}{\delta^2}\left[\gamma\!\left(\frac{\delta s}{b},\frac{\delta s}{b}\right) + \gamma\!\left(\frac{\delta t}{b},\frac{\delta t}{b}\right) - 2\gamma\!\left(\frac{\delta s}{b},\frac{\delta t}{b}\right)\right] \le c|s-t|^2
\]
for some constant $c\in(0,\infty)$. Therefore the entropy $N(\cdot)$ of $g_\delta$ satisfies $N(r)\le Kr^{-d}$ for every $r > 0$, with an appropriate choice of $K > 0$. Therefore, for all $\delta < \delta_0$,
\[
P\left(\sup_{|t|\le\delta/b}|g(t)|\ge\frac{1}{4b}\right) = P\left(\sup_{|t|\le 1}|g_\delta(t)|\ge\frac{1}{4\delta}\right) \le c_d\,\delta^{-d-\eta}\exp\left(-\frac{1}{16c_\gamma\delta^2}\right)
\]
for some constants $c_d$ and $\eta > 0$. The last inequality is a direct application of Theorem 4.1.1 of [4]. The conclusion of the lemma follows immediately by choosing $c$ and $\tilde{\delta}$ appropriately. $\Box$

REFERENCES

[1] R.J. Adler, K. Bartz, and S. Kou. Estimating curvatures in the Gaussian kinematic formula. In preparation.
[2] R.J. Adler, J. Blanchet, and J. Liu. Efficient simulation for tail probabilities of Gaussian random fields. In WSC '08: Proceedings of the 40th Conference on Winter Simulation, pages 328–336. Winter Simulation Conference, 2008.
[3] R.J. Adler, P. Müller, and B. Rozovskii, editors. Stochastic Modelling in Physical Oceanography. Birkhäuser, Boston, 1996.
[4] R.J. Adler and J.E. Taylor. Random Fields and Geometry. Springer, New York, USA, 2007.
[5] R.J. Adler, J.E. Taylor, and K. Worsley. Applications of Random Fields and Geometry: Foundations and Case Studies. http://webee.technion.ac.il/people/adler/hrf.pdf.
[6] S. Asmussen and P. Glynn. Stochastic Simulation: Algorithms and Analysis. Springer-Verlag, New York, NY, USA, 2008.
[7] J.M. Azaïs and M. Wschebor. Level Sets and Extrema of Random Processes and Fields. Wiley, New York, NY, 2009.
[8] J.M. Bardeen, J.R. Bond, N. Kaiser, and A.S. Szalay. The statistics of peaks of Gaussian random fields. The Astrophysical Journal, 304:15–61, 1986.
[9] S.M. Berman. Sojourns and Extremes of Stochastic Processes. The Wadsworth & Brooks/Cole Statistics/Probability Series. Wadsworth & Brooks/Cole Advanced Books & Software, Pacific Grove, CA, 1992.
[10] J. Blanchet. Efficient importance sampling for binary contingency tables. Ann. of Appl. Prob., 19:949–982, 2009.
[11] C. Borell. The Brunn-Minkowski inequality in Gauss space. Invent. Math., 30:205–216, 1975.
[12] J. Bucklew. Introduction to Rare-Event Simulation. Springer-Verlag, New York, NY, 2004.
[13] M.R. Dennis. Nodal densities of planar Gaussian random waves. Eur. Phys. J., 145:191–210, 2007.
[14] R.M. Dudley. Sample functions of the Gaussian process. Ann. Probab., 1:66–103, 1973.
[15] K.J. Friston, K.J. Worsley, R.S.J. Frackowiak, J.C. Mazziotta, and A.C. Evans. Assessing the significance of focal activations using their spatial extent. Human Brain Mapping, 1:214–220, 1994.
[16] M.R. Leadbetter, G. Lindgren, and H. Rootzén. Extremes and Related Properties of Random Sequences and Processes. Springer-Verlag, New York, 1983.
[17] M. Mitzenmacher and E. Upfal. Probability and Computing: Randomized Algorithms and Probabilistic Analysis. Cambridge University Press, 2005.
[18] W. Niemiro and P. Pokarowski. Fixed precision MCMC estimation by median of products of averages. J. Appl. Probab., 46:309–329, 2009.
[19] V. Piterbarg. Asymptotic Methods in the Theory of Gaussian Processes. American Mathematical Society, USA, 1995.
[20] S.F. Shandarin. Testing non-Gaussianity in cosmic microwave background maps by morphological statistics. Mon. Not. R. Astr. Soc., 331:865–874, 2002.
[21] S.F. Shandarin, H.A. Feldman, Y. Xu, and M. Tegmark. Morphological measures of non-Gaussianity in cosmic microwave background maps. Astrophys. J. Suppl., 141:1–11, 2002.
[22] L. Shepp and H. Landau. On the supremum of a Gaussian process. Sankhyā A, 32:369–378, December 1970.
[23] J.E. Taylor, A. Takemura, and R.J. Adler. Validity of the expected Euler characteristic heuristic. Ann. Probab., 33(4):1362–1396, 2005.
[24] J.E. Taylor and K.J. Worsley. Detecting sparse signals in random fields, with an application to brain mapping. J. Amer. Statist. Assoc., 102(479):913–928, 2007.
[25] J. Traub, G. Wasilokowski, and H. Wozniakowski. Information-Based Complexity. Academic Press, New York, NY, 1988.
[26] B.S. Tsirelson, I.A. Ibragimov, and V.N. Sudakov. Norms of Gaussian sample functions. In Proceedings of the Third Japan-USSR Symposium on Probability Theory (Tashkent, 1975), 550:20–41, 1976.
[27] V.S. Tsirel'son. The density of the maximum of a Gaussian process. Theory of Probab. Appl., 20:847–856, 1975.
[28] H. Wozniakowski. Computational complexity of continuous problems. Technical Report, 1996.

Robert J. Adler
Electrical Engineering
Technion, Haifa, Israel 32000
E-mail: robert@ee.technion.ac.il
URL: webee.technion.ac.il/people/adler

Jose H. Blanchet
Industrial Engineering & Operations Research
Columbia University
340 S.W. Mudd Building, 500 W. 120 Street
New York, New York 10027
E-mail: jose.blanchet@columbia.edu
URL: www.ieor.columbia.edu/fac-bios/blanchet/faculty.html

Jingchen Liu
Department of Statistics
Columbia University
1255 Amsterdam Avenue, Room 1030
New York, NY 10027
E-mail: jcliu@stat.columbia.edu
URL: stat.columbia.edu/~jcliu


We shall select A(b ε) appropriately in order to bound the expectations above ByLemma 78 for any 4ε le δ le δ0 there exist constants cprime and cprimeprime isin (0infin) such that

clowast exp(minusδlowast

δ2

)ge P

(min

|tminustlowast|leδbf (t) lt bminus 1b

∣∣∣∣ suptisinT

f (t) gt b

)

ge P

Msum

j=1

1 (Xj gt bminus 1b) le cprimeδdεd

∣∣∣∣∣∣suptisinT

f (t) gt b

ge P

(MsumM

j=1 1(Xj gt bminus 1b)ge bdcprimeprime

δd

∣∣∣∣∣ suptisinT

f (t) gt b

)(78)

MONTE CARLO FOR RANDOM FIELDS 35

The first inequality is an application of Lemma 78 The second inequality is dueto the fact that for any ball B of radius 4ε or larger (T cap B) ge cprimedεminusd for somecprime gt 0 Inequality (78) implies that for all x such that bdcprimeprimeδd

0 lt x lt bdcprimeprime[4dεd]there exists δlowastlowast gt 0 such that

P

(MsumM

j=1 1(Xj gt bminus 1

b

)bdge x

∣∣∣∣∣ suptisinT

f (t) gt b

)le clowast exp

(minusδlowastlowastx2d

)

Now let A (b ε) = bdcprimeprime(4dεd) and observe that by the second result in (71) wehave A (b ε) = Θ (M) and moreover that there exists c3 such that the first termon the right hand side of (77) is bounded by

(79) E

M times 1

(sumMj=1 1 (Xj gt bminus 1b) ge M

A(bε)

)

sumMj=1 1 (Xj gt bminus 1b)

∣∣∣∣∣∣supT

f (t) gt b

le c3b

d

Now we turn to the second term on the right hand side of (77) We use the factthat MA(b ε) le cprimeprimeprime isin (0infin) (uniformly as b rarrinfin and ε rarr 0) There exist c4 andc5 such that for ε le δ0c4

E

M1

(1 le sumM

j=1 1(Xj gt bminus 1

b

)lt M

A(bε)

)

sumMj=1 1

(Xj gt bminus 1

b

)∣∣∣∣∣∣supT

f (t) gt b

(710)

le MP

nsum

j=1

1(

Xj gt bminus 1b

)lt cprimeprimeprime

∣∣∣∣∣∣supT

f (t) gt b

le MP

(min

|tminustlowast|lec4εbf (t) lt bminus 1b

∣∣∣∣ supT

f (t) gt b

)

le c1m(T )bdεminusd exp(minus δlowast

c24ε

2

)le c5b

d

The second inequality holds from the fact that ifsumM

j=1 1(Xj gt bminus 1

b

)is less than

cprimeprimeprime then the minimum of f in a ball around the local maximum and of radius c4εb

must be less than bminus 1b Otherwise there are more than cprimeprimeprime elements of T insidesuch a ball The last inequality is due to Lemma 78 and Theorem 72

Putting (79) and (710) together we obtain for all εb -regular discretizationswith ε lt ε0 = min(14 1c4)δ0

EQL2b

P (suptisinT f (t) gt b)le E

(M1 (maxj Xj gt b)sumM

j=1 1(Xj gt bminus 1

b

)∣∣∣∣∣ sup

tisinTf (t) gt b

)P (X1 gt bminus 1b)

P (suptisinT f (t) gt b)

le (c3 + c5)bdP (X1 gt bminus 1b)P (suptisinT f (t) gt b)

Applying now (74) and Proposition 72 we have that

P (suptisinT

f(t) gt b) ltP (sup

tisinTf(t) gt b)

1minus c0radic

ε

36 ADLER BLANCHET LIU

we have

supbgtb0εisin[0ε0]

EQL2b

P(sup

tisinTf (t) gt b

) lt infin

as required 2

73 A final proof Lemma 77 Without loss of generality we assume that thefield has mean zero and unit variance Before getting into the details of the proofof Lemma 77 we need a few additional lemmas for which we adopt the followingnotation Let Ci and Cij be the first and second order derivatives of C and definethe vectors

micro1(t) = (minusC1(t) minusCd(t))micro2(t) = vech ((Cij(t) i = 1 d j = i d))

Let f prime(0) and f primeprime(0) be the gradient and vector of second order derivatives of f at 0where f primeprime(0) is arranged in the same order as micro2(0) Furthermore let micro02 = microgt20 bea vector of second order spectral moments and micro22 a matrix of fourth order spectralmoments The vectors micro02 and micro22 are arranged so that

1 0 micro02

0 Λ 0micro20 0 micro22

is the covariance matrix of (f(0) f prime(0) f primeprime(0)) where Λ = (minusCij(0)) It then followsthat

micro2middot0 = micro22 minus micro20micro02

be the conditional variance of f primeprime(0) given f(0) The following lemma given in [5]provides a stochastic representation of the f given that it has a local maxima atlevel u at the origin

Lemma 711 Given that f has a local maximum with height u at zero (aninterior point of T ) the conditional field is equal in distribution to

(711) fu(t) uC (t)minusWuβgt (t) + g (t)

g(t) is a centered Gaussian random field with covariance function

γ(s t) = C(sminus t)minus (C(s) micro2(s))(

1 micro02

micro20 micro22

)minus1 (C(t)microgt2 (t)

)minus micro1(s)Λminus1microgt1 (t)

and Wu is a d(d+1)2 random vector independent of g(t) with density function

(712) ψu (w) prop |det (rlowast(w)minus uΛ) | exp(minus1

2wgtmicrominus1

2middot0w)1 (rlowast(w)minus uΛ isin N )

MONTE CARLO FOR RANDOM FIELDS 37

where rlowast(w) is a dtimesd symmetric matrix whose upper triangular elements consist ofthe components of w The set of negative definite matrices is denoted by N Finallyβ(t) is defined by

(α (t) β (t)) = (C (t) micro2 (t))(

1 micro02

micro20 micro22

)minus1

The following two technical lemmas which we shall prove after completing theproof of Lemma 77 provide bounds for the last two terms of (711)

Lemma 712 Using the notation in (711) there exist δ0 ε1 c1 and b0 suchthat for any u gt b gt b0 and δ isin (0 δ0)

P

(sup

|t|leδab

∣∣Wuβgt (t)∣∣ gt

a

4b

)le c1 exp

(minusε1b

2

δ4

)

Lemma 713 There exist c δ and δ0 such that for any δ isin (0 δ0)

P

(max|t|leδab

|g(t)| gt a

4b

)le c exp

(minus δ

δ2

)

Proof of Lemma 77 Using the notation of Lemma 711 given any s isin L forwhich f(s) = b we have that the corresponding conditional distribution of s is thatof fb(middot minus s) Consequently it suffices to show that

P

(min

|t|leδabfb(t) lt bminus ab

)le clowast exp

(minusδlowast

δ2

)

We consider firstly the case for which the local maximum is in the interior of T Then by the Slepian model (711)

fb (t) = bC (t)minusWbβgt (t) + g (t)

We study the three terms of the Slepian model individually Since

C(t) = 1minus tgtΛt + o(|t|2

)

there exists a ε0 such thatbC (t) ge bminus a

4b

38 ADLER BLANCHET LIU

for all |t| lt ε0radic

ab According to Lemmas 712 and 713 for δ lt min(ε0radic

a0 δ0)

P

(min

|t|leδabfu(t) lt bminus ab

)

le P

(max|t|ltδab

|g (t)| gt a

4b

)+ P

(sup

|t|leδab

∣∣Wbβgt(t)

∣∣ gta

4b

)

le c exp

δ2

)+ c1 exp

(minusε1b

2

δ4

)

le clowast exp(minusδlowast

δ2

)

for some clowast and δlowastNow consider the case for which the local maximum is in the (dminus1)-dimensional

boundary of T Due to convexity of T and without loss of generality we assumethat the tangent space of partT is generated by partpartt2 partparttd the local maximumis located at the origin and T is a subset of the positive half-plane t1 ge 0 Thatthese arguments to not involve a loss of generality follows from the arguments onpages 291ndash192 of [4] which rely on the assumed stationarity of f (for translations)and the fact that rotations while changing the distributions will not affect theprobabilities that we are currently computing)

For the origin positioned as just described to be a local maximum it is necessaryand sufficient that the gradient of f restricted to partT is the zero vector the Hessianmatrix restricted to partT is negative definite and part1f(0) le 0 Applying a version ofLemma 711 for this case conditional on f(0) = u and 0 being a local maximumthe field is equal in distribution to

(713) uC(t)minus Wuβgt(t) + micro1(t)Λminus1 (Z 0 0)T + g(t)

where Z le 0 corresponds to part1f (0) and it follows a truncated (conditional on thenegative axis) Gaussian random variable with mean zero and a variance parameterwhich is computed as the conditional variance of part1f (0) given (part2f (0) partdf (0))The vector Wu is a (d(d + 1)2)-dimensional random vector with density function

ψu(w) prop |det(rlowast(w)minus uΛ)| exp(minus12wgtmicrominus1

2middot0w)1(wlowast minus uΛ isin N )

where Λ is the second spectral moment of f restricted to partT and rlowast(w) is the(dminus 1)times (dminus 1) symmetric matrix whose upper triangular elements consist of thecomponents of w In the representation (713) the vector Wu and Z are independentAs in the proof of Lemma 712 one can show that a similar bound holds albeitwith with different constants Thus since micro1 (t) = O (t) there exist cprimeprime and δprimeprime suchthat the third term in (713) can be bounded by

P [ max|t|leδab

∣∣∣micro1 (t) Λminus1 (Z 0 0)T∣∣∣ ge a(4b)] le cprimeprime exp

(minusδprimeprime

δ2

)

MONTE CARLO FOR RANDOM FIELDS 39

Consequently we can also find clowast and δlowast such that the conclusion holds and weare done 2

We complete the paper with the proofs of Lemmas 712 and 713

Proof of Lemma 712 It suffices to prove the Lemma for the case a = 1 Since

fu (0) = u = uminusWuβgt (0) + g (0)

and Wu and g are independent β (0) = 0 Furthermore since C prime (t) = O (t) andmicroprime2 (t) = O (t) there exists a c0 such that |β(t)| le c0|t|2 In addition Wu has densityfunction proportional to

ψu(w) prop |det(rlowast(w)minus uΛ)| exp(minus12wgtmicrominus1

2middot0w)1 (wlowast minus uΛ isin N )

Note that det(rlowast(w) minus uΛ) is expressible as a polynomial in w and u and thereexists some ε0 and c such that

∣∣∣∣det (rlowast(w)minus uΛ)

det (minusuΛ)

∣∣∣∣ le c

if |w| le ε0u Hence there exist ε2 c2 gt 0 such that

ψu (w) le ψ (w) = c2 exp(minus1

2ε2w

gtmicrominus12middot0w

)

for all u ge 1 The right hand side here is proportional to a multivariate Gaussiandensity Thus

P (|Wu| gt x) =int

|w|gtx

ψu(w)dw leint

|w|gtx

ψu (w) dw = c3P (|W | gt x)

where W is a multivariate Gaussian random variable with density function propor-tional to ψ Therefore by choosing ε1 and c1 appropriately we have

P

(sup|t|leδb

∣∣Wuβgt∣∣ gt

14b

)le P

(|Wu| gt b

c20δ

2

)le c1 exp

(minusε1b

2

δ4

)

for all u ge b 2

Proof of Lemma 713 Once again it suffices to prove the Lemma for the casea = 1 Since

fb(0) = b = bminusWbβgt(0) + g(0)

the covariance function (γ (s t) s t isin T ) of the centered field g satisfies γ(0 0) = 0It is also easy to check that

partsγ(s t) = O(|s|+ |t|) parttγ(s t) = O(|s|+ |t|)

40 ADLER BLANCHET LIU

Consequently there exists a constant cγ isin (0infin) for which

γ(s t) le cγ(|s|2 + |t|2) γ(s s) le cγ |s|2

We need to control the tail probability of sup|t|leδb |g(t)| For this it is useful tointroduce the following scaling Define

gδ (t) =b

δg

(δt

b

)

Then sup|t|leδb g (t) ge 14b if and only if sup|t|le1 gδ (t) ge 1

4δ Let

σδ (s t) = E (gδ (s) gδ (t))

ThensupsisinR

σδ(s s) le cγ

Because γ(s t) is at least twice differentiable applying a Taylor expansion we easilysee that the canonical metric dg corresponding to gδ (s) (cf Theorem 67) can bebounde as follows

d2g(s t) = E (gδ (s)minus gδ (t))2 =

b2

δ2

(δs

bδs

b

)+ γ

(δt

bδt

b

)minus 2γ

(δs

bδt

b

)]

le c |sminus t|2

for some constant c isin (0infin) Therefore the entropy of gδ evaluated at δ is boundedby Kδminusd for any δ gt 0 and with an appropriate choice of K gt 0 Therefore for allδ lt δ0

P ( sup|t|leδb

|g(t)| ge 14b

) = P ( sup|t|le1

gδ (t) ge 14δ

) le cdδminusdminusη exp(minus 1

16cγδ2)

for some constant cd and η gt 0 The last inequality is a direct application of The-orem 411 of [4] The conclusion of the lemma follows immediately by choosing c

and δ appropriately 2

REFERENCES

[1] R.J. Adler, K. Bartz, and S. Kou. Estimating curvatures in the Gaussian kinematic formula. In preparation.

[2] R.J. Adler, J. Blanchet, and J. Liu. Efficient simulation for tail probabilities of Gaussian random fields. In WSC '08: Proceedings of the 40th Conference on Winter Simulation, pages 328–336. Winter Simulation Conference, 2008.

[3] R.J. Adler, P. Müller, and B. Rozovskii, editors. Stochastic Modelling in Physical Oceanography. Birkhäuser, Boston, 1996.

[4] R.J. Adler and J.E. Taylor. Random Fields and Geometry. Springer, New York, USA, 2007.

[5] R.J. Adler, J.E. Taylor, and K. Worsley. Applications of Random Fields and Geometry: Foundations and Case Studies. http://webee.technion.ac.il/people/adler/hrf.pdf.

[6] S. Asmussen and P. Glynn. Stochastic Simulation: Algorithms and Analysis. Springer-Verlag, New York, NY, USA, 2008.

[7] J.M. Azaïs and M. Wschebor. Level Sets and Extrema of Random Processes and Fields. Wiley, New York, NY, 2009.

[8] J.M. Bardeen, J.R. Bond, N. Kaiser, and A.S. Szalay. The statistics of peaks of Gaussian random fields. The Astrophysical Journal, 304:15–61, 1986.

[9] S.M. Berman. Sojourns and Extremes of Stochastic Processes. The Wadsworth & Brooks/Cole Statistics/Probability Series. Wadsworth & Brooks/Cole Advanced Books & Software, Pacific Grove, CA, 1992.

[10] J. Blanchet. Efficient importance sampling for binary contingency tables. Ann. of Appl. Prob., 19:949–982, 2009.

[11] C. Borell. The Brunn-Minkowski inequality in Gauss space. Invent. Math., 30:205–216, 1975.

[12] J. Bucklew. Introduction to Rare-Event Simulation. Springer-Verlag, New York, NY, 2004.

[13] M.R. Dennis. Nodal densities of planar Gaussian random waves. Eur. Phys. J., 145:191–210, 2007.

[14] R.M. Dudley. Sample functions of the Gaussian process. Ann. Probab., 1:66–103, 1973.

[15] K.J. Friston, K.J. Worsley, R.S.J. Frackowiak, J.C. Mazziotta, and A.C. Evans. Assessing the significance of focal activations using their spatial extent. Human Brain Mapping, 1:214–220, 1994.

[16] M.R. Leadbetter, G. Lindgren, and H. Rootzén. Extremes and Related Properties of Random Sequences and Processes. Springer-Verlag, New York, 1983.

[17] M. Mitzenmacher and E. Upfal. Probability and Computing: Randomized Algorithms and Probabilistic Analysis. Cambridge University Press, 2005.

[18] W. Niemiro and P. Pokarowski. Fixed precision MCMC estimation by median of products of averages. J. Appl. Probab., 46:309–329, 2009.

[19] V. Piterbarg. Asymptotic Methods in the Theory of Gaussian Processes. American Mathematical Society, USA, 1995.

[20] S.F. Shandarin. Testing non-Gaussianity in cosmic microwave background maps by morphological statistics. Mon. Not. R. Astr. Soc., 331:865–874, 2002.

[21] S.F. Shandarin, H.A. Feldman, Y. Xu, and M. Tegmark. Morphological measures of non-Gaussianity in cosmic microwave background maps. Astrophys. J. Suppl., 141:1–11, 2002.

[22] L. Shepp and H. Landau. On the supremum of a Gaussian process. Sankhyā A, 32:369–378, December 1970.

[23] J.E. Taylor, A. Takemura, and R.J. Adler. Validity of the expected Euler characteristic heuristic. Ann. Probab., 33(4):1362–1396, 2005.

[24] J.E. Taylor and K.J. Worsley. Detecting sparse signals in random fields, with an application to brain mapping. J. Amer. Statist. Assoc., 102(479):913–928, 2007.

[25] J. Traub, G. Wasilokowski, and H. Wozniakowski. Information-Based Complexity. Academic Press, New York, NY, 1988.

[26] B.S. Tsirelson, I.A. Ibragimov, and V.N. Sudakov. Norms of Gaussian sample functions. Proceedings of the Third Japan-USSR Symposium on Probability Theory (Tashkent, 1975), 550:20–41, 1976.

[27] V.S. Tsirel'son. The density of the maximum of a Gaussian process. Theory of Probab. Appl., 20:847–856, 1975.

[28] H. Wozniakowski. Computational complexity of continuous problems. Technical Report, 1996.

Robert J. Adler
Electrical Engineering
Technion, Haifa, Israel 32000
E-mail: robert@ee.technion.ac.il
URL: webee.technion.ac.il/people/adler

Jose H. Blanchet
Industrial Engineering & Operations Research
Columbia University
340 S.W. Mudd Building, 500 W. 120 Street
New York, New York 10027
E-mail: jose.blanchet@columbia.edu
URL: www.ieor.columbia.edu/fac-bios/blanchet/faculty.html

Jingchen Liu
Department of Statistics
Columbia University
1255 Amsterdam Avenue, Room 1030
New York, NY 10027
E-mail: jcliu@stat.columbia.edu
URL: stat.columbia.edu/~jcliu

MONTE CARLO FOR RANDOM FIELDS 19

From the selection of α M and θ it follows easily that the first two terms on theright hand side of the previous display can be made less than ε2 for all ε le ε0 bytaking ε0 sufficiently small

The complexity count given in the theorem now corresponds to the followingestimates The factor O((M)3) represents the cost of a Cholesky factorization re-quired to generate a single replication of a finite field of dimension M In additionthe second part of Proposition 62 gives us that O(M2εminus2δminus1) replications are re-quired to control the relative variance of the estimator 2

We now proceed to prove Propositions 64 and 65

64 Proof of Proposition 64 We concentrate on the analysis of the left handside of (611) An important observation is that conditional on the random variableτγab

with distribution

Qprime(τγabisin middot) =

E[m(Aγab

cap middot)]

E[m(Aγab

)]

and given f(τγab) the rest of the field namely (f (t) t isin Tτγab

) is anotherGaussian field with a computable mean and covariance structure The second termin (610) indicates that we must estimate the probability that m(Aγab

) takes smallvalues under Qprime For this purpose we shall develop an upper bound for

(614) P(m(Aγab

) lt yminus1 m(Ab) gt 0∣∣ f (t) = γab + zγab

)

for y large enough Our arguments proceeds in two steps For the first in order tostudy (614) we shall estimate the conditional mean covariance of f (s) s isin Tgiven that f (t) = γab + zγab Then we use the fact that the conditional field isalso Gaussian and take advantage of general results from the theory of Gaussianrandom fields to obtain a bound for (614) For this purpose we recall some usefulresults from the theory of Gaussian random fields The first result is due to Dudley[14]

Theorem 67 Let U be a compact subset of Rn and let f0(t) t isin U be amean zero continuous Gaussian random field Define the canonical metric d on Uas

d (s t) =radic

E[f0 (t)minus f0 (s)]2

and put diam (U) = supstisinU d (s t) which is assumed to be finite Then there existsa finite universal constant κ gt 0 such that

E[maxtisinU

f0 (t)] le κ

int diam(U)2

0

[log (N (ε))]12dε

where the entropy N (ε) is the smallest number of dminusballs of radius ε whose unioncovers U

20 ADLER BLANCHET LIU

The second general result that we shall need is the so-called B-TIS (Borel-Tsirelson-Ibragimov-Sudakov) inequality [4 11 26]

Theorem 68 Under the setting described in Theorem 67

P

(maxtisinU

f0 (t)minus E[maxtisinU

f0 (t)] ge b

)le exp

(minusb2(2σ2U )

)

whereσ2U = max

tisinUE[f2

0 (t)]

We can now proceed with the main proof We shall assume from now on thatτγab

= 0 since as will be obvious from what follows all estimates hold uniformlyover τγab

isin T This is a consequence of the uniform Holder assumptions A2 andA3 Define a new process f

(f (t) t isin T ) L= (f (t) t isin T | f (0) = γab + zγab)

Note that we can always write f (t) = micro (t)+g (t) where g is a mean zero Gaussianrandom field on T We have that

micro (t) = Ef (t) = micro (t) + σ (0)minus2C (0 t) (γab + zγab minus micro (0))

and that the covariance function of f is given by

Cg (s t) = Cov (g (s) g (t)) = C (s t)minus σ (0)minus2C (0 s)C (0 t)

The following lemma describes the behavior of micro (t) and Cg (t s)

Lemma 69 Assume that |s| and |t| small enough Then the following threeconclusions hold

(i) There exist constants λ0 and λ1 gt 0 such that

|micro (t)minus (γab + zγab)| le λ0 |t|β + λ1 |t|β (γab + zγab)

and for all z isin (0 1) and γab large enough

|micro (s)minus micro (t) | le κHγab|sminus t|β

(ii)Cg (s t) le 2κHσ (t)σ (s) |t|β + |s|β + |tminus s|β

(iii)

Dg (s t) =radic

E([g (t)minus g (s)]2) le λ121 |tminus s|β2

Proof All three consequences follow from simple algebraic manipulations Thedetails are omitted 2

MONTE CARLO FOR RANDOM FIELDS 21

Proposition 610 For any v gt 0 there exist κ and λ2 such that for all t isin T yminusβd le a2+v

κb(2+v) a sufficiently small and z gt 0

P(m(Aγab

)minus1 gt y m(Ab) gt 0∣∣ f (t) = γab + zγab

) le exp(minusλ2a

2yβdb2)

Proof For notational simplicity and without loss of generality we assume thatt = 0 First consider the case that z ge 1 Then there exist c1 c2 such that for allc2y

minusβd lt bminus2minusv and z gt 1

P(m

(Aγab

cap T)minus1

gt ym(Ab) gt 0| f (0) = γab + zγab

)

le P

(inf

|t|ltc1yminus1df(t) le γab|f (0) = γab + zγab

)

= P

(inf

|t|ltc1yminus1dmicro (t) + g(t) le γab

)

le P

(inf

|t|ltc1yminus1dg(t) le minus 1

2γab

)

Now apply (iii) from Lemma 69 from which it follows that N (ε) le c3m(T )ε2dβ

for some constant c3 By Theorem 67 E(sup|t|ltc1yminus1d f(t)) = O(yminusβ(2d) log y)By Theorem 68 for some constant c4

P(m

(Aγab

cap T)minus1

gt y m(Ab) gt 0| f (0) = γab + zγab

)

le P

(inf

|t|ltc1yminus1dg(t) le minus 1

2γab

)

le exp

(minus 1

c4γ2aby

minusβd

)

for c2yminusβd lt bminus2minusv and z gt 1

Now consider the case z isin (0 1) Let tlowast be the global maximum of f(t) Then

P(m

(Aγab

cap T)minus1

gt y m(Ab) gt 0|f (0) = γab + zγab

)

le P

(inf

|tminustlowast|ltc1yminus1df(t) lt γab f(tlowast) gt b|f (0) = γab + zγab

)

le P

(sup

|sminust|ltc1yminus1d

|f(s)minus f(t)| gt ab|f (0) = γab + zγab

)

Consider the new field ξ(s t) = g(s)minus g(t) with parameter space T times T Note thatradic

V ar(ξ(s t)) = Dg(s t) le λ1|sminus t|β2

Via basic algebra it is not hard to show that the entropy of ξ(s t) is bounded byNξ (ε) le c3m(T times T )ε2dβ In addition from (i) of Lemma 69 we have

|micro (s)minus micro (t) | le κHγab|sminus t|β

22 ADLER BLANCHET LIU

Similarly for some κ gt 0 and all yminusβd le 1κ

(ab

)2+v a lt 1 there exists c5 such that

P(m

(Aγab

cap T)minus1

gt y m(Ab) gt 0|f (0) = γab + zγab

)

le P

(sup

|sminust|ltc1yminus1d

|f(s)minus f(t)| gt ab|f (0) = γab + zγab

)

= P

(sup

|sminust|ltc1yminus1d

|ξ(s t)| gt a

2b

)

le exp(minus a2

c5b2yminusβd

)

Combining the two cases z gt 1 and z isin (0 1) and choosing c5 large enough we have

P(m

(Aγab

cap T)minus1

gt y m(Ab) gt 0∣∣ f (0) = γab + zγab

)

le exp(minus a2

c5b2yminusβd

)

for a small enough and yminusβd le 1κ

(ab

)2+v Renaming the constants completes theproof 2

The final ingredient needed for the proof of Proposition 64 is the following lemmainvolving stochastic domination The proof follows an elementary argument and istherefore omitted

Lemma 611 Let v1 and v2 be finite measures on R and define ηj(x) =intinfin

xvj(ds)

Suppose that η1 (x) ge η2 (x) for each x ge x0 Let (h (x) x ge x0) be a non-decreasing positive and bounded function Then

int infin

x0

h (s) v1(dx) geint infin

x0

h (s) v2 (ds)

Proof of Proposition 64 Note that

EQprime(exp

(minusMm(Aγab

)m (T )

)

m(Aγab

) m(Aγab

) isin (0 α) m(Ab) gt 0)

=int

T

int infin

z=0

E(exp

(minusMm(Aγab

)m (T )

)

m (Aγab)

m(Aγab

) isin (0 α) m(Ab) gt 0∣∣∣

f (t) = γab +z

γab

)P

(τγab

isin dt)P (γab[f (t)minus γab] isin dz| f (t) gt γab)

le supzgt0tisinT

E(exp

(minusMm(Aγab

)m (T )

)

m (Aγab)

m(Aγab

) isin (0 α) m(Ab) gt 0∣∣∣

f (t) = γab +z

γab

)

MONTE CARLO FOR RANDOM FIELDS 23

Now define Y (ba) = (ba)2dβλminusdβ2 W with λ2 as chosen in Proposition 610

and W with distribution given by P (W gt x) = exp(minusxβd) By Lemma 611

suptisinT

E(exp

(minusMm(Aγab

)m (T )

)

m (Aγab)

m(Aγab

) isin (0 α) m(Ab) gt 0∣∣∣

f (t) = γab +z

γab

)

le E[Y (ba) exp(minusMY (ba)minus1m(T ))Y (ba) gt αminus1

]

Now let Z be exponentially distributed with mean 1 and independent of Y (ba)Then we have (using the definition of the tail distribution of Z and Chebyshevrsquosinequality)

exp(minusMY (ba)minus1m (T )

)= P

(Z gt MY (ba)minus1m (T )

∣∣ Y (ba))

le m (T )Y (ba)M

Therefore

E[exp(minusMY (ba)minus1m (T )

)Y (ba) Y (ba) ge αminus1]

le m (T )M

E[Y (ba)2 Y (ba) ge αminus1]

le m (T )

Mλ2dβ2

(b

a

)4dβ

E(W 2)

which completes the proof 2

65 Proof of Proposition 65 We start with the following result of Tsirelson[27]

Theorem 612 Let f be a continuous separable Gaussian process on a compact(in the canonical metric) domain T Suppose that V ar (f) = σ is continuous andthat σ (t) gt 0 for t isin T Moreover assume that micro = Ef is also continuous andmicro(t) ge 0 for all t isin T Define

σ2T

∆= maxtisinT

Var(f(t))

and set F (x) = PmaxtisinT f(t) le x Then F is continuously differentiable on RFurthermore let y be such that F (y) gt 12 and define ylowast by

F (y) = Φ(ylowast)

Then for all x gt y

F prime(x) le Ψ(

xylowasty

)(xylowasty

(1 + 2α) + 1)

(1 + α)

24 ADLER BLANCHET LIU

where

α =y2

x(xminus y)y2lowast

We can now prove the following lemma

Lemma 613 There exists a constant A isin (0infin) independent of a and b ge 0such that

(615) P

(suptisinT

f(t) le b + ab | suptisinT

f(t) gt b

)le aA

P (suptisinT f (t) ge bminus 1b)P (suptisinT f (t) ge b)

Proof By subtracting inftisinT micro (t) gt minusinfin and redefining the level b to be b minusinftisinT micro (t) we may simply assume that Ef (t) ge 0 so that we can apply Theorem612 Adopting the notation of Theorem 612 first we pick b0 large enough so thatF (b0) gt 12 and assume that b ge b0 + 1 Now let y = bminus 1b and F (y) = Φ(ylowast)Note that there exists δ0 isin (0infin) such that δ0b le ylowast le δminus1

0 b for all b ge b0 Thisfollows easily from the fact that

logP

suptisinT

f(t) gt x

sim log sup

tisinTP f(t) gt x sim minus x2

2σ2T

On the other hand by Theorem 612 F is continuously differentiable and so

(616) P

suptisinT

f(t) lt b + ab | suptisinT

f(t) gt b

=

int b+ab

bF prime(x) dx

P suptisinT f(t) gt b

Moreover

F prime(x) le(

1minus Φ(

xylowasty

)) (xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x))

le (1minus Φ(ylowast))(

xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x))

= P

(maxtisinT

f (t) gt bminus 1b

)(xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x))

Thereforeint b+ab

b

F prime(x) dx

le P

(suptisinT

f (t) gt bminus 1b

) int b+ab

b

(xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x)) dx

Recall that α (x) = y2[x(xminus y)y2lowast] we can use the fact that ylowast ge δ0b to conclude

that if x isin [b b + ab] then α (x) le δminus20 and therefore

int b+ab

b

(xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x)) dx le 4δminus80 a

MONTE CARLO FOR RANDOM FIELDS 25

We thus obtain that

(617)

int b+ab

bF prime(x) dx

P suptisinT f(t) gt b le 4aδminus80

P suptisinT f(t) gt bminus 1bP suptisinT f(t) gt b

for any b ge b0 This inequality together with the fact that F is continuously differ-entiable on (minusinfininfin) yields the proof of the lemma for b ge 0 2

The previous result translates a question that involves the conditional distri-bution of maxtisinT f (t) near b into a question involving the tail distribution ofmaxtisinT f (t) The next result then provides a bound on this tail distribution

Lemma 6.14. For each v > 0, there exists a constant C(v) \in (0,\infty) (possibly depending on v > 0, but otherwise independent of b) such that
\[
P\left(\max_{t\in T} f(t) > b\right) \le C(v)\, b^{2d/\beta + dv + 1} \max_{t\in T} P(f(t) > b)
\]
for all b \ge 1.

Proof. The proof of this result follows along the same lines as that of Theorem 2.6.2 in [5]. Consider an open cover T = \cup_{i=1}^{N(\theta)} T_i(\theta), where T_i(\theta) = \{s : |s - t_i| < \theta\}. We choose the t_i carefully, such that N(\theta) = O(\theta^{-d}) for \theta arbitrarily small. Write f(t) = g(t) + \mu(t), where g(t) is a centered Gaussian random field, and note, using A2 and A3, that
\[
P\left(\max_{t\in T_i(\theta)} f(t) > b\right) \le P\left(\max_{t\in T_i(\theta)} g(t) > b - \mu(t_i) - \kappa_H\theta^\beta\right).
\]
Now we wish to apply the Borel-TIS inequality (Theorem 6.8) with U = T_i(\theta), f_0 = g, and d(s,t) = E^{1/2}([g(t)-g(s)]^2), which, as a consequence of A2 and A3, is bounded above by C_0|t-s|^{\beta/2} for some C_0 \in (0,\infty). Thus, applying Theorem 6.7, we have that E\max_{t\in T_i(\theta)} g(t) \le C_1\theta^{\beta/2}\log(1/\theta) for some C_1 \in (0,\infty). Consequently, the Borel-TIS inequality yields that there exists C_2(v) \in (0,\infty) such that, for all b sufficiently large and \theta sufficiently small, we have
\[
P\left(\max_{t\in T_i(\theta)} g(t) > b - \mu(t_i) - \kappa_H\theta^\beta\right) \le C_2(v)\exp\left(-\frac{\left(b - \mu(t_i) - C_1\theta^{\beta/(2+\beta v)}\right)^2}{2\sigma_{T_i}^2}\right),
\]
where \sigma_{T_i} = \max_{t\in T_i(\theta)}\sigma(t). Now select v > 0 small enough, and set \theta^{\beta/(2+\beta v)} = b^{-1}. Straightforward calculations yield that
\[
P\left(\max_{t\in T_i(\theta)} f(t) > b\right) \le P\left(\max_{t\in T_i(\theta)} g(t) > b - \mu(t_i) - \kappa_H\theta^\beta\right) \le C_3(v)\max_{t\in T_i(\theta)}\exp\left(-\frac{(b-\mu(t))^2}{2\sigma(t)^2}\right)
\]

for some C_3(v) \in (0,\infty). Now recall the well-known inequality (valid for x > 0)
\[
\phi(x)\left(\frac{1}{x}-\frac{1}{x^3}\right) \le 1 - \Phi(x) \le \frac{\phi(x)}{x},
\]

where \phi = \Phi' is the standard Gaussian density. Using this inequality, it follows that C_4(v) \in (0,\infty) can be chosen so that
\[
\max_{t\in T_i(\theta)}\exp\left(-\frac{(b-\mu(t))^2}{2\sigma(t)^2}\right) \le C_4(v)\, b\max_{t\in T_i(\theta)} P(f(t) > b)
\]
for all b \ge 1. We then conclude that there exists C(v) \in (0,\infty) such that
\[
P\left(\max_{t\in T} f(t) > b\right) \le N(\theta)\, C_4(v)\, b\max_{t\in T} P(f(t) > b) \le C\theta^{-d} b\max_{t\in T} P(f(t) > b) = C b^{2d/\beta + dv + 1}\max_{t\in T} P(f(t) > b),
\]
giving the result. □
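As a quick numerical sanity check of the two-sided Gaussian tail bound invoked in the proof (a sketch; SciPy's norm.sf evaluates 1 - \Phi stably far into the tail):

\begin{verbatim}
import numpy as np
from scipy.stats import norm

# Check phi(x)(1/x - 1/x^3) <= 1 - Phi(x) <= phi(x)/x on a grid of levels.
x = np.linspace(0.5, 10.0, 200)
lower = norm.pdf(x) * (1.0 / x - 1.0 / x ** 3)
upper = norm.pdf(x) / x
tail = norm.sf(x)  # 1 - Phi(x)
assert np.all(lower <= tail) and np.all(tail <= upper)
print(f"relative slack at x = 10: {upper[-1] / tail[-1] - 1:.4f}")  # ~ 1/x^2
\end{verbatim}

The relative slack between the two bounds shrinks like 1/x^2, which is why the inequality pins down the tail so tightly at the high levels relevant here.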

We can now complete the proof of Proposition 6.5.

Proof of Proposition 6.5. The result is a straightforward corollary of the previous two lemmas. By (6.15) in Lemma 6.13, and Lemma 6.14, there exists \lambda \in (0,\infty) for which
\begin{align*}
P\left(\max_{t\in T} f(t) \le b + a/b \,\Big|\, \max_{t\in T} f(t) > b\right)
&\le aA\,\frac{P(\max_{t\in T} f(t) \ge b - 1/b)}{P(\max_{t\in T} f(t) \ge b)} \\
&\le aCA\, b^{2d/\beta+dv+1}\,\frac{\max_{t\in T} P(f(t) > b - 1/b)}{P(\max_{t\in T} f(t) \ge b)} \\
&\le aCA\, b^{2d/\beta+dv+1}\,\frac{\max_{t\in T} P(f(t) > b - 1/b)}{\max_{t\in T} P(f(t) > b)} \\
&\le a\lambda\, b^{2d/\beta+dv+1}.
\end{align*}
The last two inequalities follow from the obvious bound
\[
P\left(\max_{t\in T} f(t) \ge b\right) \ge \max_{t\in T} P(f(t) > b)
\]
and standard properties of the Gaussian distribution. This yields (6.12), from which the remainder of the proposition follows. □

7. Fine Tuning: Twice Differentiable, Homogeneous Fields. In the preceding section we constructed a polynomial time algorithm based on a randomized discretization scheme. Our goal in this section is to illustrate how to take advantage of additional information to further improve the running time and the efficiency of the algorithm. In order to illustrate our techniques, we shall perform a more refined analysis in the setting of smooth and homogeneous fields, and shall establish optimality of the algorithm in a precise sense, to be described below. Our assumptions throughout this section are B1-B2 of Section 3.

Let C(s-t) = \mathrm{Cov}(f(s), f(t)) be the covariance function of f, which we also assume to have mean zero. Note that it is an immediate consequence of homogeneity and differentiability that \partial_i C(0) = \partial^3_{ijk} C(0) = 0.

We shall need the following definition.

Definition 7.1. We call \mathcal{T} = \{t_1,\dots,t_M\} \subset T a \theta-regular discretization of T if and only if
\[
\min_{i\ne j} |t_i - t_j| \ge \theta, \qquad \sup_{t\in T}\min_i |t_i - t| \le 2\theta.
\]

Regularity ensures that the points in the grid \mathcal{T} are well separated. Intuitively, since f is smooth, having tight clusters of points translates to a waste of computing resources, as a result of sampling highly correlated values of f. Also, note that every region containing a ball of radius 2\theta has at least one representative in \mathcal{T}. Therefore \mathcal{T} covers the domain T in an economical way. One technical convenience of \theta-regularity is that, for subsets A \subseteq T that have positive Lebesgue measure (in particular, ellipsoids),
\[
\lim_{M\to\infty} \frac{\#(A\cap\mathcal{T})}{M} = \frac{m(A)}{m(T)},
\]
where, here and throughout the remainder of the section, \#(A) denotes the cardinality of the set A. A concrete construction is sketched below.
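One simple way to produce such a set on the hypothetical domain T = [0,1]^d is a cubic lattice with spacing \theta; the following Python sketch (an illustration, not part of the original algorithm) also verifies both conditions of Definition 7.1 empirically for d = 2. For large d a finer lattice is needed, since the covering radius of a spacing-\theta lattice is \theta\sqrt{d}/2.

\begin{verbatim}
import numpy as np
from scipy.spatial import cKDTree

def theta_regular_grid(theta, d):
    # Cubic grid on T = [0,1]^d with spacing theta: pairwise separation
    # >= theta by construction, covering radius theta*sqrt(d)/2, which
    # is <= 2*theta whenever d <= 16.
    axis = np.arange(0.0, 1.0 + 1e-12, theta)
    mesh = np.meshgrid(*([axis] * d), indexing="ij")
    return np.stack(mesh, axis=-1).reshape(-1, d)

pts = theta_regular_grid(theta=0.05, d=2)
tree = cKDTree(pts)
sep = tree.query(pts, k=2)[0][:, 1].min()            # min_{i != j} |t_i - t_j|
probe = np.random.default_rng(1).random((20000, 2))  # dense random probe of T
cover = tree.query(probe, k=1)[0].max()              # approx sup_t min_i |t_i - t|
assert sep >= 0.05 - 1e-9 and cover <= 2 * 0.05
\end{verbatim}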

Let \mathcal{T} = \{t_1,\dots,t_M\} be a \theta-regular discretization of T, and consider
\[
X = (X_1,\dots,X_M)^\top \triangleq (f(t_1),\dots,f(t_M))^\top.
\]
We shall concentrate on estimating w_M(b) = P(\max_{1\le i\le M} X_i > b). The next result (which we prove in Section 7.1) shows that, if \theta = \varepsilon/b, then the relative bias is O(\sqrt{\varepsilon}).

Proposition 7.2. Suppose f is a Gaussian random field satisfying conditions B1 and B2. There exist c_0, c_1, b_0 and \varepsilon_0 such that, for any finite \varepsilon/b-regular discretization \mathcal{T} of T,
\[
(7.1)\qquad P\left(\sup_{t\in\mathcal{T}} f(t) < b \,\Big|\, \sup_{t\in T} f(t) > b\right) \le c_0\sqrt{\varepsilon} \qquad\text{and}\qquad \#(\mathcal{T}) \le c_1 m(T)\varepsilon^{-d} b^d,
\]
for all \varepsilon \in (0,\varepsilon_0] and b > b_0.

Note that the bound on the bias obtained for twice differentiable fields is much sharper than that for the general Hölder continuous fields, given by (6.13) in Theorem 6.6. This is partly because the conditional distribution of the random field around local maxima is harder to describe in the Hölder continuous case than in the case of twice differentiable fields. In addition to the sharper description of the bias, we shall also soon show, in Theorem 7.4, that our choice of discretization is optimal in a certain sense. Finally, we point out that the bound of \sqrt{\varepsilon} in the first term of (7.1) is not optimal. In fact, there seems to be some room for improvement, and we believe that a more careful analysis might yield a bound of the form c_0\varepsilon^2.

We shall estimate w_M(b) by using a slight variation of Algorithm 5.3. In particular, since the X_i's are now identically distributed, we redefine Q to be
\[
(7.2)\qquad Q(X \in B) = \sum_{i=1}^M \frac{1}{M}\, P[X \in B \mid X_i > b - 1/b].
\]
Our estimator then takes the form
\[
(7.3)\qquad L_b = \frac{M \times P(X_1 > b - 1/b)}{\sum_{j=1}^M 1(X_j > b - 1/b)}\, 1\left(\max_{1\le i\le M} X_i > b\right).
\]
Clearly, we have that E_Q(L_b) = w_M(b). (The reason for subtracting the factor of 1/b was explained in Section 6.1.)
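To see why E_Q(L_b) = w_M(b), note that, since the X_i are identically distributed, (7.2) amounts to a simple likelihood ratio (a short verification, spelled out here for convenience):
\[
\frac{dQ}{dP}(X) = \frac{1}{M}\sum_{i=1}^M \frac{1(X_i > b - 1/b)}{P(X_1 > b - 1/b)} = \frac{\sum_{j=1}^M 1(X_j > b - 1/b)}{M\, P(X_1 > b - 1/b)},
\]
so that L_b = \frac{dP}{dQ}(X)\, 1(\max_{1\le i\le M} X_i > b), and hence
\[
E_Q(L_b) = P\left(\max_{1\le i\le M} X_i > b\right) = w_M(b),
\]
since \{\max_i X_i > b\} is contained in \{\sum_j 1(X_j > b - 1/b) \ge 1\}, where dP/dQ is well defined.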

Algorithm 7.3. Given a number of replications n and an \varepsilon/b-regular discretization \mathcal{T}, the algorithm is as follows:

Step 1. Sample X^{(1)},\dots,X^{(n)}, i.i.d. copies of X, with distribution Q given by (7.2).
Step 2. Compute and output
\[
L_n = \frac{1}{n}\sum_{i=1}^n L_b^{(i)},
\]
where
\[
L_b^{(i)} = \frac{M \times P(X_1 > b - 1/b)}{\sum_{j=1}^M 1(X_j^{(i)} > b - 1/b)}\, 1\left(\max_{1\le j\le M} X_j^{(i)} > b\right).
\]
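The following Python sketch (an illustration, not the authors' implementation) runs Algorithm 7.3 on a toy example; the squared-exponential covariance, its scale, and the grid are hypothetical stand-ins for a field satisfying B1-B2. Sampling from the mixture Q of (7.2) uses the Gaussian conditioning identity X = Z - \beta Z_i + \beta X_i, where Z \sim N(0,\Sigma), \beta = \Sigma_{\cdot i}/\Sigma_{ii}, and X_i is drawn from the standard normal tail beyond b - 1/b; a single Cholesky factorization is reused across all replications.

\begin{verbatim}
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

def sq_exp_cov(points, scale=0.3):
    # Hypothetical smooth homogeneous covariance C(s-t) = exp(-|s-t|^2/(2 scale^2)).
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * scale ** 2))

def algorithm_7_3(points, b, n):
    # Estimates w_M(b) = P(max_i f(t_i) > b) with the estimator (7.3) under Q.
    M = len(points)
    Sigma = sq_exp_cov(points)
    L = np.linalg.cholesky(Sigma + 1e-10 * np.eye(M))  # one O(M^3) factorization
    c = b - 1.0 / b
    p1 = norm.sf(c)                          # P(X_1 > b - 1/b); unit variances
    vals = np.empty(n)
    for k in range(n):
        i = rng.integers(M)                  # uniform mixture component in (7.2)
        Z = L @ rng.standard_normal(M)       # unconditioned draw N(0, Sigma)
        beta = Sigma[:, i]                   # Cov(X, X_i); Sigma[i, i] = 1
        xi = norm.isf(rng.uniform(1e-12, 1.0) * p1)  # X_i | X_i > b - 1/b
        X = Z - beta * Z[i] + beta * xi      # conditional law of X given X_i = xi
        vals[k] = M * p1 / np.count_nonzero(X > c) * (X.max() > b)
    return vals.mean(), vals.std(ddof=1) / np.sqrt(n)

# epsilon/b-regular grid on [0,1]^2 and a moderate level b:
axis = np.arange(0.0, 1.0 + 1e-12, 0.1)
grid = np.stack(np.meshgrid(axis, axis, indexing="ij"), -1).reshape(-1, 2)
print(algorithm_7_3(grid, b=3.0, n=2000))
\end{verbatim}

Since X_i = \xi > b - 1/b by construction, the denominator in (7.3) is never zero under Q.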

Theorem 7.5 below guides the selection of n in order to achieve a prescribed relative error. In particular, our analysis, together with considerations from Section 2, implies that choosing n = O(\varepsilon^{-2}\delta^{-1}) suffices to achieve \varepsilon relative error with probability at least 1-\delta.
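In concrete terms (a sketch of the standard Chebyshev argument, writing \varepsilon' for the prescribed relative error so as not to clash with the discretization parameter \varepsilon, and c' for a constant absorbing the bound c of Theorem 7.5 together with the bias factor of Proposition 7.2): since \mathrm{Var}_Q(L_n) = \mathrm{Var}_Q(L_b)/n,
\[
P\left(|L_n - w_M(b)| > \varepsilon' w_M(b)\right) \le \frac{\mathrm{Var}_Q(L_b)}{n\,\varepsilon'^2\, w_M(b)^2} \le \frac{c'}{n\,\varepsilon'^2} \le \delta
\qquad\text{whenever}\qquad n \ge c'\,\varepsilon'^{-2}\,\delta^{-1}.
\]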

Algorithm 7.3 improves on Algorithm 6.1 for Hölder continuous fields in two important ways. The first aspect is that it is possible to obtain information on the size of the relative bias of the estimator. In Proposition 7.2 we saw that, in order to overcome the bias due to discretization, it suffices to take a discretization of size M = \#(\mathcal{T}) = \Theta(b^d). That this selection is also asymptotically optimal, in the sense described in the next result, will be proven in Section 7.1.

Theorem 7.4. Suppose f is a Gaussian random field satisfying conditions B1 and B2. If \theta \in (0,1), then, as b \to \infty,
\[
\sup_{\#(\mathcal{T})\le b^{\theta d}} P\left(\sup_{t\in\mathcal{T}} f(t) > b \,\Big|\, \sup_{t\in T} f(t) > b\right) \to 0.
\]

This result implies that the relative bias goes to 100% as b \to \infty if one chooses a discretization scheme of size O(b^{\theta d}) with \theta \in (0,1). Consequently, d is the smallest power of b that achieves any given bounded relative bias, and so the suggestion above of choosing M = O(b^d) points for the discretization is, in this sense, optimal.

The second aspect of improvement involves the variance. In the case of Hölder continuous fields, the ratio of the second moment of the estimator to w(b)^2 was shown to be bounded by a quantity of order O(M^2). In contrast, in the context of the smooth and homogeneous fields considered here, the next result shows that this ratio is bounded uniformly for b > b_0 and M = \#(\mathcal{T}) \ge cb^d. That is, the variance remains strongly controlled.

Theorem 7.5. Suppose f is a Gaussian random field satisfying conditions B1 and B2. Then there exist constants b_0 and \varepsilon_0 such that, for any \varepsilon/b-regular discretization \mathcal{T} of T, we have
\[
\sup_{b>b_0,\,\varepsilon\in[0,\varepsilon_0]} \frac{E_Q L_b^2}{P^2\left(\sup_{t\in T} f(t) > b\right)} \le \sup_{b>b_0,\,\varepsilon\in[0,\varepsilon_0]} \frac{E_Q L_b^2}{P^2\left(\sup_{t\in\mathcal{T}} f(t) > b\right)} \le c
\]
for some c \in (0,\infty).

The proof of this result is given in Section 7.2. The fact that the number of replications remains bounded in b is a consequence of this strong control on the variance.

Finally, we note that the proof of Theorem 3.3 follows as a direct corollary of Theorem 7.5, together with Proposition 7.2 and our discussion in Section 2. Assuming that placing each point in \mathcal{T} takes no more than c units of computer time, the total complexity is, according to the discussion in Section 2, O(nM^3 + M) = O(\varepsilon^{-2}\delta^{-1}M^3 + M). The contribution of the term M^3 = O(\varepsilon^{-3d}b^{3d}) comes from the complexity of applying a Cholesky factorization, and the term M = O(\varepsilon^{-d}b^d) corresponds to the complexity of placing \mathcal{T}. A back-of-the-envelope tally of these counts is sketched below.

Remark 7.6. Condition B2 imposes a convexity assumption on the boundary of T. This assumption, although convenient in the development of the proofs of Theorems 7.4 and 7.5, is not necessary. The results can be generalized, at the expense of increasing the length and the burden of the technical development, to the case in which T is a d-dimensional manifold satisfying the so-called Whitney conditions ([4]).

The remainder of this section is devoted to the proofs of Proposition 7.2, Theorem 7.4 and Theorem 7.5.

7.1. Bias Control: Proofs of Proposition 7.2 and Theorem 7.4. We start with some useful lemmas, for all of which we assume that conditions B1 and B2 are satisfied. We shall also assume that the global maximum of f over T is achieved, with probability one, at a single point in T. Additional conditions under which this will happen can be found in [4], and require little more than the non-degeneracy of the joint distribution of f and its first and second order derivatives. Of these lemmas, Lemma 7.7, the proof of which we defer to Section 7.3, is of central importance.

Lemma 7.7. For any a_0 > 0, there exist c_*, \delta_*, b_0 and \delta_0 (which depend on the choice of a_0) such that, for any s \in T, a \in (0,a_0), \delta \in (0,\delta_0) and b > b_0,
\[
P\left(\min_{|t-s|<\delta\sqrt{a}/b} f(t) < b \,\Big|\, f(s) = b + a/b,\ s \in L\right) \le c_*\exp\left(-\frac{\delta_*}{\delta^2}\right),
\]
where L is the set of local maxima of f.

The next lemma is a direct corollary of Lemma 7.7.

Lemma 7.8. Let t_* be the point in T at which the global maximum of f is attained. Then, with the same choice of constants as in Lemma 7.7, for any a \in (0,a_0), \delta \in (0,\delta_0) and b > b_0, we have
\[
P\left(\min_{|t-t_*|<\delta\sqrt{a}/b} f(t) < b \,\Big|\, f(t_*) = b + a/b\right) \le 2c_*\exp\left(-\frac{\delta_*}{\delta^2}\right).
\]

Proof. Note that, for any s \in T,
\begin{align*}
P&\left(\min_{|t-s|\le\delta\sqrt{a}/b} f(t) < b \,\Big|\, f(s) = b + a/b,\ s = t_*\right) \\
&= \frac{P\left(\min_{|t-s|\le\delta\sqrt{a}/b} f(t) < b,\ f(s) = b + a/b,\ s = t_*\right)}{P(f(s) = b + a/b,\ s = t_*)} \\
&\le \frac{P\left(\min_{|t-s|\le\delta\sqrt{a}/b} f(t) < b,\ f(s) = b + a/b,\ s \in L\right)}{P(f(s) = b + a/b,\ s = t_*)} \\
&= P\left(\min_{|t-s|<\delta\sqrt{a}/b} f(t) < b \,\Big|\, f(s) = b + a/b,\ s \in L\right)\frac{P(f(s) = b + a/b,\ s \in L)}{P(f(s) = b + a/b,\ s = t_*)} \\
&\le c_*\exp\left(-\frac{\delta_*}{\delta^2}\right)\frac{P(f(s) = b + a/b,\ s \in L)}{P(f(s) = b + a/b,\ s = t_*)}.
\end{align*}
Applying this bound, and the fact that
\[
\frac{P(f(s) = b + a/b,\ s \in L)}{P(f(s) = b + a/b,\ s = t_*)} \to 1
\]
as b \to \infty, the conclusion of the lemma holds. □

The following lemma gives a bound on the density of \sup_{t\in T} f(t), which will be used to control the size of the overshoot beyond level b.

Lemma 7.9. Let p_{f^*}(x) be the density function of \sup_{t\in T} f(t). Then there exist constants c_{f^*} and b_0 such that
\[
p_{f^*}(x) \le c_{f^*}\, x^{d+1}\, P(f(0) > x)
\]
for all x > b_0.

Proof. Recalling (1.3), let the continuous function p_E(x), x \in \mathbb{R}, be defined by the relationship
\[
E\left(\chi\left(\{t\in T : f(t) \ge b\}\right)\right) = \int_b^\infty p_E(x)\,dx,
\]
where the left hand side is the expected value of the Euler-Poincaré characteristic of A_b. Then, according to Theorem 8.10 in [7], there exist c and \delta such that
\[
|p_E(x) - p_{f^*}(x)| < c\,P(f(0) > (1+\delta)x)
\]
for all x > 0. In addition, thanks to the result of [4], which provides \int_b^\infty p_E(x)\,dx in closed form, there exists c_0 such that, for all x > 1,
\[
p_E(x) < c_0\, x^{d+1}\, P(f(0) > x).
\]
Hence there exists c_{f^*} such that
\[
p_{f^*}(x) \le c_0 x^{d+1}P(f(0) > x) + cP(f(0) > (1+\delta)x) \le c_{f^*}\, x^{d+1}\, P(f(0) > x)
\]
for all x > 1. □

The last ingredients required to provide the proofs of Proposition 7.2 and Theorem 7.4 are stated in the following well-known theorem, adapted from Lemma 6.1 and Theorem 7.2 in [19] to the twice differentiable case.

Theorem 7.10. There exists a constant H (depending on the covariance function C) such that
\[
(7.4)\qquad P\left(\sup_{t\in T} f(t) > b\right) = (1 + o(1))\, H\, m(T)\, b^d\, P(f(0) > b)
\]
as b \to \infty. Similarly, choose \delta small enough so that [0,\delta]^d \subset T, and let \Delta_0 = [0, b^{-1}]^d. Then there exists a constant H_1 such that
\[
(7.5)\qquad P\left(\sup_{t\in\Delta_0} f(t) > b\right) = (1 + o(1))\, H_1\, P(f(0) > b)
\]
as b \to \infty.

We are now ready to provide the proofs of Proposition 7.2 and Theorem 7.4.

Proof of Proposition 7.2. The fact that there exists c_1 such that
\[
\#(\mathcal{T}) \le c_1 m(T)\varepsilon^{-d} b^d
\]
is immediate from assumption B2. Therefore we proceed to provide a bound for the relative bias. We choose \varepsilon_0 < \delta_0^2 (\delta_0 is given in Lemmas 7.7 and 7.8) and note that, for any \varepsilon < \varepsilon_0,
\begin{align*}
P&\left(\sup_{t\in\mathcal{T}} f(t) < b \,\Big|\, \sup_{t\in T} f(t) > b\right) \\
&\le P\left(\sup_{t\in T} f(t) < b + 2\sqrt{\varepsilon}/b \,\Big|\, \sup_{t\in T} f(t) > b\right) + P\left(\sup_{t\in\mathcal{T}} f(t) < b \,\Big|\, \sup_{t\in T} f(t) > b + 2\sqrt{\varepsilon}/b\right).
\end{align*}
By (7.4) and Lemma 7.9, there exists c_2 such that the first term above can be bounded by
\[
P\left(\sup_{t\in T} f(t) < b + 2\sqrt{\varepsilon}/b \,\Big|\, \sup_{t\in T} f(t) > b\right) \le c_2\sqrt{\varepsilon}.
\]
The second term can be bounded by
\begin{align*}
P\left(\sup_{t\in\mathcal{T}} f(t) < b \,\Big|\, \sup_{t\in T} f(t) > b + 2\sqrt{\varepsilon}/b\right)
&\le P\left(\min_{|t-t_*|<2\varepsilon/b} f(t) < b \,\Big|\, \sup_{t\in T} f(t) > b + 2\sqrt{\varepsilon}/b\right) \\
&\le 2c_*\exp\left(-\delta_*\varepsilon^{-1}\right).
\end{align*}
Hence there exists c_0 such that
\[
P\left(\sup_{t\in\mathcal{T}} f(t) < b \,\Big|\, \sup_{t\in T} f(t) > b\right) \le c_2\sqrt{\varepsilon} + 2c_* e^{-\delta_*/\varepsilon} \le c_0\sqrt{\varepsilon}
\]
for all \varepsilon \in (0,\varepsilon_0). □

Proof of Theorem 7.4. Write \theta = 1 - 3\delta \in (0,1). First note that, by (7.4),
\[
P\left(\sup_T f(t) > b + b^{2\delta-1} \,\Big|\, \sup_T f(t) > b\right) \to 0
\]
as b \to \infty. Let t_* be the position of the global maximum of f in T. According to the exact Slepian model in Section 7.3, and an argument similar to the proof of Lemma 7.8,
\[
(7.6)\qquad P\left(\sup_{|t-t_*|>b^{2\delta-1}} f(t) > b \,\Big|\, b < f(t_*) \le b + b^{2\delta-1}\right) \to 0
\]
as b \to \infty. Consequently,
\[
P\left(\sup_{|t-t_*|>b^{2\delta-1}} f(t) > b \,\Big|\, \sup_T f(t) > b\right) \to 0.
\]

Let
\[
B(\mathcal{T}, b^{2\delta-1}) = \bigcup_{t\in\mathcal{T}} B(t, b^{2\delta-1}).
\]
We have
\begin{align*}
P\left(\sup_{\mathcal{T}} f(t) > b \,\Big|\, \sup_T f(t) > b\right)
&\le P\left(\sup_{|t-t_*|>b^{2\delta-1}} f(t) > b \,\Big|\, \sup_T f(t) > b\right) \\
&\quad + P\left(\sup_{\mathcal{T}} f(t) > b,\ \sup_{|t-t_*|>b^{2\delta-1}} f(t) \le b \,\Big|\, \sup_T f(t) > b\right) \\
&\le o(1) + P\left(t_* \in B(\mathcal{T}, b^{2\delta-1}) \,\Big|\, \sup_T f(t) > b\right) \\
&\le o(1) + P\left(\sup_{B(\mathcal{T}, b^{2\delta-1})} f(t) > b \,\Big|\, \sup_T f(t) > b\right).
\end{align*}
Since \#(\mathcal{T}) \le b^{(1-3\delta)d}, we can find a finite set T' = \{t'_1,\dots,t'_l\} \subset T, with \Delta_k = t'_k + [0, b^{-1}]^d, such that l = O(b^{(1-\delta)d}) and B(\mathcal{T}, b^{2\delta-1}) \subset \cup_{k=1}^l \Delta_k. The choice of l depends only on \#(\mathcal{T}), not on the particular distribution of \mathcal{T}. Therefore, applying (7.5),
\[
\sup_{\#(\mathcal{T})\le b^{\theta d}} P\left(\sup_{B(\mathcal{T}, b^{2\delta-1})} f(t) > b\right) \le O(b^{(1-\delta)d})\, P(f(0) > b).
\]
This, together with (7.4), yields
\[
\sup_{\#(\mathcal{T})\le b^{\theta d}} P\left(\sup_{B(\mathcal{T}, b^{2\delta-1})} f(t) > b \,\Big|\, \sup_T f(t) > b\right) \le O(b^{-\delta d}) = o(1)
\]
for b \ge b_0, which clearly implies the statement of the result. □

7.2. Variance Control: Proof of Theorem 7.5. We proceed directly to the proof of Theorem 7.5.

Proof of Theorem 7.5. Note that
\begin{align*}
\frac{E_Q L_b^2}{P^2\left(\sup_{t\in T} f(t) > b\right)}
&= \frac{E(L_b)}{P^2\left(\sup_{t\in T} f(t) > b\right)} \\
&= \frac{E\left(\frac{M\times P(X_1 > b-1/b)}{\sum_{j=1}^M 1(X_j > b-1/b)};\ \max_j X_j > b\right)}{P^2\left(\sup_{t\in T} f(t) > b\right)} \\
&= \frac{E\left(\frac{M\times P(X_1 > b-1/b)}{\sum_{j=1}^M 1(X_j > b-1/b)};\ \max_j X_j > b,\ \sup_{t\in T} f(t) > b\right)}{P^2\left(\sup_{t\in T} f(t) > b\right)} \\
&= \frac{E\left(\frac{M\, P(X_1 > b-1/b)\, 1(\max_j X_j > b)}{\sum_{j=1}^M 1(X_j > b-1/b)} \,\Big|\, \sup_{t\in T} f(t) > b\right)}{P\left(\sup_{t\in T} f(t) > b\right)} \\
&= E\left(\frac{M\, 1(\max_j X_j > b)}{\sum_{j=1}^M 1(X_j > b-1/b)} \,\Big|\, \sup_{t\in T} f(t) > b\right)\frac{P(X_1 > b-1/b)}{P\left(\sup_{t\in T} f(t) > b\right)}.
\end{align*}

The remainder of the proof involves showing that the last conditional expectation here is of order O(b^d). This, together with (7.4), will yield the result. Note that, for any A(b,\varepsilon) such that A(b,\varepsilon) = \Theta(M) uniformly over b and \varepsilon, we can write
\begin{align*}
(7.7)\qquad E&\left(\frac{M\times 1(\max_j X_j > b)}{\sum_{j=1}^M 1(X_j > b - 1/b)} \,\Big|\, \sup_{t\in T} f(t) > b\right) \\
&\le E\left(\frac{M\times 1\left(\sum_{j=1}^M 1(X_j > b-1/b) \ge M/A(b,\varepsilon)\right)}{\sum_{j=1}^M 1(X_j > b-1/b)} \,\Big|\, \sup_{t\in T} f(t) > b\right) \\
&\quad + E\left(\frac{M\times 1\left(1 \le \sum_{j=1}^M 1(X_j > b-1/b) < M/A(b,\varepsilon)\right)}{\sum_{j=1}^M 1(X_j > b-1/b)} \,\Big|\, \sup_{t\in T} f(t) > b\right).
\end{align*}

ge P

Msum

j=1

1 (Xj gt bminus 1b) le cprimeδdεd

∣∣∣∣∣∣suptisinT

f (t) gt b

ge P

(MsumM

j=1 1(Xj gt bminus 1b)ge bdcprimeprime

δd

∣∣∣∣∣ suptisinT

f (t) gt b

)(78)

The first inequality is an application of Lemma 7.8. The second inequality is due to the fact that, for any ball B of radius \delta/b with \delta \ge 4\varepsilon, we have \#(\mathcal{T}\cap B) \ge c'\delta^d\varepsilon^{-d} for some c' > 0. Inequality (7.8) implies that there exists \delta_{**} > 0 such that, for all x with c''/\delta_0^d < x < c''/(4^d\varepsilon^d),
\[
P\left(\frac{M}{b^d\,\sum_{j=1}^M 1(X_j > b-1/b)} \ge x \,\Big|\, \sup_{t\in T} f(t) > b\right) \le c_*\exp\left(-\delta_{**}\, x^{2/d}\right).
\]

Now let A(b,\varepsilon) = b^d c''/(4^d\varepsilon^d), and observe that, by the second result in (7.1), we have A(b,\varepsilon) = \Theta(M), and moreover that there exists c_3 such that the first term on the right-hand side of (7.7) is bounded by
\[
(7.9)\qquad E\left(\frac{M\times 1\left(\sum_{j=1}^M 1(X_j > b-1/b) \ge M/A(b,\varepsilon)\right)}{\sum_{j=1}^M 1(X_j > b-1/b)} \,\Big|\, \sup_T f(t) > b\right) \le c_3\, b^d.
\]

Now we turn to the second term on the right-hand side of (7.7). We use the fact that M/A(b,\varepsilon) \le c''' \in (0,\infty) (uniformly as b \to \infty and \varepsilon \to 0). There exist c_4 and c_5 such that, for \varepsilon \le \delta_0/c_4,
\begin{align*}
(7.10)\qquad E&\left(\frac{M\, 1\left(1 \le \sum_{j=1}^M 1(X_j > b-1/b) < M/A(b,\varepsilon)\right)}{\sum_{j=1}^M 1(X_j > b-1/b)} \,\Big|\, \sup_T f(t) > b\right) \\
&\le M\, P\left(\sum_{j=1}^M 1(X_j > b-1/b) < c''' \,\Big|\, \sup_T f(t) > b\right) \\
&\le M\, P\left(\min_{|t-t_*|\le c_4\varepsilon/b} f(t) < b - 1/b \,\Big|\, \sup_T f(t) > b\right) \\
&\le c_1 m(T)\, b^d\varepsilon^{-d}\exp\left(-\frac{\delta_*}{c_4^2\varepsilon^2}\right) \le c_5\, b^d.
\end{align*}

The second inequality follows from the fact that, if \sum_{j=1}^M 1(X_j > b-1/b) is less than c''', then the minimum of f in the ball of radius c_4\varepsilon/b around the global maximum t_* must be less than b - 1/b; otherwise there would be more than c''' elements of \mathcal{T} inside such a ball. The last inequality is due to Lemma 7.8 and Proposition 7.2.

Putting (7.9) and (7.10) together, we obtain, for all \varepsilon/b-regular discretizations with \varepsilon < \varepsilon_0 = \min(1/4, 1/c_4)\,\delta_0,
\begin{align*}
\frac{E_Q L_b^2}{P^2\left(\sup_{t\in T} f(t) > b\right)}
&\le E\left(\frac{M\, 1(\max_j X_j > b)}{\sum_{j=1}^M 1(X_j > b - 1/b)} \,\Big|\, \sup_{t\in T} f(t) > b\right)\frac{P(X_1 > b - 1/b)}{P\left(\sup_{t\in T} f(t) > b\right)} \\
&\le (c_3 + c_5)\, b^d\, \frac{P(X_1 > b - 1/b)}{P\left(\sup_{t\in T} f(t) > b\right)}.
\end{align*}
By (7.4), the right-hand side remains bounded as b \to \infty, while Proposition 7.2 gives
\[
P\left(\sup_{t\in T} f(t) > b\right) < \frac{P\left(\sup_{t\in\mathcal{T}} f(t) > b\right)}{1 - c_0\sqrt{\varepsilon}}.
\]

Hence
\[
\sup_{b>b_0,\,\varepsilon\in[0,\varepsilon_0]} \frac{E_Q L_b^2}{P^2\left(\sup_{t\in\mathcal{T}} f(t) > b\right)} < \infty,
\]
as required. □

7.3. A Final Proof: Lemma 7.7. Without loss of generality, we assume that the field has mean zero and unit variance. Before getting into the details of the proof of Lemma 7.7, we need a few additional lemmas, for which we adopt the following notation. Let C_i and C_{ij} be the first and second order derivatives of C, and define the vectors
\begin{align*}
\mu_1(t) &= (-C_1(t),\dots,-C_d(t)), \\
\mu_2(t) &= \mathrm{vech}\left((C_{ij}(t),\ i = 1,\dots,d,\ j = i,\dots,d)\right).
\end{align*}
Let f'(0) and f''(0) be the gradient and the vector of second order derivatives of f at 0, where f''(0) is arranged in the same order as \mu_2(0). Furthermore, let \mu_{02} = \mu_{20}^\top be a vector of second order spectral moments and \mu_{22} a matrix of fourth order spectral moments. The vectors \mu_{02} and \mu_{22} are arranged so that
\[
\begin{pmatrix} 1 & 0 & \mu_{02} \\ 0 & \Lambda & 0 \\ \mu_{20} & 0 & \mu_{22} \end{pmatrix}
\]
is the covariance matrix of (f(0), f'(0), f''(0)), where \Lambda = (-C_{ij}(0)). It then follows that
\[
\mu_{2\cdot 0} = \mu_{22} - \mu_{20}\mu_{02}
\]
is the conditional variance of f''(0) given f(0). The following lemma, given in [5], provides a stochastic representation of f given that it has a local maximum at level u at the origin.

Lemma 7.11. Given that f has a local maximum with height u at zero (an interior point of T), the conditional field is equal in distribution to
\[
(7.11)\qquad f_u(t) = uC(t) - W_u\beta^\top(t) + g(t),
\]
where g(t) is a centered Gaussian random field with covariance function
\[
\gamma(s,t) = C(s-t) - (C(s), \mu_2(s))\begin{pmatrix} 1 & \mu_{02} \\ \mu_{20} & \mu_{22} \end{pmatrix}^{-1}\begin{pmatrix} C(t) \\ \mu_2^\top(t) \end{pmatrix} - \mu_1(s)\Lambda^{-1}\mu_1^\top(t),
\]
and W_u is a d(d+1)/2-dimensional random vector, independent of g(t), with density function
\[
(7.12)\qquad \psi_u(w) \propto |\det(r^*(w) - u\Lambda)|\exp\left(-\tfrac{1}{2}w^\top\mu_{2\cdot 0}^{-1}w\right)1\left(r^*(w) - u\Lambda \in \mathcal{N}\right),
\]
where r^*(w) is a d\times d symmetric matrix whose upper triangular elements consist of the components of w, and \mathcal{N} denotes the set of negative definite matrices. Finally, \beta(t) is defined by
\[
(\alpha(t), \beta(t)) = (C(t), \mu_2(t))\begin{pmatrix} 1 & \mu_{02} \\ \mu_{20} & \mu_{22} \end{pmatrix}^{-1}.
\]
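To fix ideas, here is the d = 1, unit-variance specialization of these definitions (a worked example assembled from the formulas above, with \lambda_2 = -C''(0) and \lambda_4 = C''''(0) denoting the second and fourth spectral moments): \mu_1(t) = -C'(t), \mu_2(t) = C''(t), \Lambda = \lambda_2, \mu_{02} = \mu_{20} = -\lambda_2, \mu_{22} = \lambda_4, and \mu_{2\cdot 0} = \lambda_4 - \lambda_2^2, so that
\[
(\alpha(t), \beta(t)) = (C(t), C''(t))\begin{pmatrix} 1 & -\lambda_2 \\ -\lambda_2 & \lambda_4 \end{pmatrix}^{-1},
\qquad
\beta(t) = \frac{\lambda_2 C(t) + C''(t)}{\lambda_4 - \lambda_2^2}.
\]
In particular \beta(0) = 0, consistent with f_u(0) = u in (7.11).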

The following two technical lemmas, which we shall prove after completing the proof of Lemma 7.7, provide bounds for the last two terms of (7.11).

Lemma 7.12. Using the notation in (7.11), there exist \delta_0, \varepsilon_1, c_1 and b_0 such that, for any u > b > b_0 and \delta \in (0,\delta_0),
\[
P\left(\sup_{|t|\le\delta\sqrt{a}/b} \left|W_u\beta^\top(t)\right| > \frac{a}{4b}\right) \le c_1\exp\left(-\frac{\varepsilon_1 b^2}{\delta^4}\right).
\]

Lemma 7.13. There exist c, \bar\delta and \delta_0 such that, for any \delta \in (0,\delta_0),
\[
P\left(\max_{|t|\le\delta\sqrt{a}/b} |g(t)| > \frac{a}{4b}\right) \le c\exp\left(-\frac{\bar\delta}{\delta^2}\right).
\]

Proof of Lemma 7.7. Using the notation of Lemma 7.11, given any s \in L for which f(s) = b, the corresponding conditional distribution of the field is that of f_b(\cdot - s). Consequently, it suffices to show that
\[
P\left(\min_{|t|\le\delta\sqrt{a}/b} f_b(t) < b - a/b\right) \le c_*\exp\left(-\frac{\delta_*}{\delta^2}\right).
\]
We consider first the case in which the local maximum lies in the interior of T. Then, by the Slepian model (7.11),
\[
f_b(t) = bC(t) - W_b\beta^\top(t) + g(t).
\]

We study the three terms of the Slepian model individually. Since
\[
C(t) = 1 - t^\top\Lambda t + o(|t|^2),
\]
there exists an \varepsilon_0 such that
\[
bC(t) \ge b - \frac{a}{4b}
\]
for all |t| < \varepsilon_0\sqrt{a}/b. According to Lemmas 7.12 and 7.13, for \delta < \min(\varepsilon_0, \delta_0),
\begin{align*}
P\left(\min_{|t|\le\delta\sqrt{a}/b} f_b(t) < b - a/b\right)
&\le P\left(\max_{|t|<\delta\sqrt{a}/b} |g(t)| > \frac{a}{4b}\right) + P\left(\sup_{|t|\le\delta\sqrt{a}/b} \left|W_b\beta^\top(t)\right| > \frac{a}{4b}\right) \\
&\le c\exp\left(-\frac{\bar\delta}{\delta^2}\right) + c_1\exp\left(-\frac{\varepsilon_1 b^2}{\delta^4}\right) \\
&\le c_*\exp\left(-\frac{\delta_*}{\delta^2}\right)
\end{align*}

for some c_* and \delta_*.

Now consider the case in which the local maximum lies on the (d-1)-dimensional boundary of T. By the convexity of T, and without loss of generality, we may assume that the tangent space of \partial T is generated by \partial/\partial t_2,\dots,\partial/\partial t_d, that the local maximum is located at the origin, and that T is a subset of the half-space \{t_1 \ge 0\}. That these assumptions do not involve a loss of generality follows from the arguments on pages 291-292 of [4], which rely on the assumed stationarity of f (for translations) and on the fact that rotations, while changing the distributions, do not affect the probabilities that we are currently computing.

For the origin, positioned as just described, to be a local maximum, it is necessary and sufficient that the gradient of f restricted to \partial T is the zero vector, that the Hessian matrix restricted to \partial T is negative definite, and that \partial_1 f(0) \le 0. Applying a version of Lemma 7.11 for this case, conditional on f(0) = u and 0 being a local maximum, the field is equal in distribution to
\[
(7.13)\qquad uC(t) - \widetilde W_u\widetilde\beta^\top(t) + \mu_1(t)\Lambda^{-1}(Z, 0,\dots,0)^\top + \widetilde g(t),
\]
where Z \le 0 corresponds to \partial_1 f(0); it follows a truncated (conditioned on the negative axis) Gaussian distribution with mean zero and a variance parameter computed as the conditional variance of \partial_1 f(0) given (\partial_2 f(0),\dots,\partial_d f(0)). The vector \widetilde W_u is a d(d-1)/2-dimensional random vector with density function
\[
\widetilde\psi_u(w) \propto |\det(r^*(w) - u\widetilde\Lambda)|\exp\left(-\tfrac{1}{2}w^\top\widetilde\mu_{2\cdot 0}^{-1}w\right)1\left(r^*(w) - u\widetilde\Lambda \in \mathcal{N}\right),
\]
where \widetilde\Lambda is the second spectral moment matrix of f restricted to \partial T, and r^*(w) is the (d-1)\times(d-1) symmetric matrix whose upper triangular elements consist of the components of w. In the representation (7.13), the vector \widetilde W_u and Z are independent. As in the proof of Lemma 7.12, one can show that a similar bound holds for \widetilde W_u\widetilde\beta^\top(t), albeit with different constants. Thus, since \mu_1(t) = O(|t|), there exist c'' and \delta'' such that the third term in (7.13) can be bounded by
\[
P\left(\max_{|t|\le\delta\sqrt{a}/b} \left|\mu_1(t)\Lambda^{-1}(Z,0,\dots,0)^\top\right| \ge \frac{a}{4b}\right) \le c''\exp\left(-\frac{\delta''}{\delta^2}\right).
\]
Consequently, we can also find c_* and \delta_* such that the conclusion holds, and we are done. □

We complete the paper with the proofs of Lemmas 7.12 and 7.13.

Proof of Lemma 7.12. It suffices to prove the lemma for the case a = 1. Since
\[
f_u(0) = u = u - W_u\beta^\top(0) + g(0),
\]
and W_u and g are independent, \beta(0) = 0. Furthermore, since C'(t) = O(t) and \mu_2'(t) = O(t), there exists a c_0 such that |\beta(t)| \le c_0|t|^2. In addition, W_u has density function
\[
\psi_u(w) \propto |\det(r^*(w) - u\Lambda)|\exp\left(-\tfrac{1}{2}w^\top\mu_{2\cdot 0}^{-1}w\right)1\left(r^*(w) - u\Lambda \in \mathcal{N}\right).
\]
Note that \det(r^*(w) - u\Lambda) is expressible as a polynomial in w and u, and that there exist some \varepsilon_0 and c such that
\[
\left|\frac{\det(r^*(w) - u\Lambda)}{\det(-u\Lambda)}\right| \le c
\]
if |w| \le \varepsilon_0 u. Hence there exist \varepsilon_2, c_2 > 0 such that
\[
\psi_u(w) \le \bar\psi(w) \triangleq c_2\exp\left(-\tfrac{1}{2}\varepsilon_2\, w^\top\mu_{2\cdot 0}^{-1}w\right)
\]
for all u \ge 1. The right-hand side here is proportional to a multivariate Gaussian density. Thus,
\[
P(|W_u| > x) = \int_{|w|>x}\psi_u(w)\,dw \le \int_{|w|>x}\bar\psi(w)\,dw = c_3\, P(|\bar W| > x),
\]
where \bar W is a multivariate Gaussian random vector with density function proportional to \bar\psi. Therefore, by choosing \varepsilon_1 and c_1 appropriately, we have
\[
P\left(\sup_{|t|\le\delta/b}\left|W_u\beta^\top(t)\right| > \frac{1}{4b}\right) \le P\left(|W_u| > \frac{b}{4c_0\delta^2}\right) \le c_1\exp\left(-\frac{\varepsilon_1 b^2}{\delta^4}\right)
\]
for all u \ge b. □

Proof of Lemma 7.13. Once again, it suffices to prove the lemma for the case a = 1. Since
\[
f_b(0) = b = b - W_b\beta^\top(0) + g(0),
\]
the covariance function (\gamma(s,t),\ s,t \in T) of the centered field g satisfies \gamma(0,0) = 0. It is also easy to check that
\[
\partial_s\gamma(s,t) = O(|s| + |t|), \qquad \partial_t\gamma(s,t) = O(|s| + |t|).
\]
Consequently, there exists a constant c_\gamma \in (0,\infty) for which
\[
\gamma(s,t) \le c_\gamma(|s|^2 + |t|^2), \qquad \gamma(s,s) \le c_\gamma|s|^2.
\]
We need to control the tail probability of \sup_{|t|\le\delta/b}|g(t)|. For this, it is useful to introduce the following scaling. Define
\[
g_\delta(t) = \frac{b}{\delta}\, g\left(\frac{\delta t}{b}\right).
\]
Then \sup_{|t|\le\delta/b} g(t) \ge 1/(4b) if and only if \sup_{|t|\le 1} g_\delta(t) \ge 1/(4\delta). Let
\[
\sigma_\delta(s,t) = E(g_\delta(s)g_\delta(t)).
\]
Then
\[
\sup_{|s|\le 1}\sigma_\delta(s,s) \le c_\gamma.
\]
Because \gamma(s,t) is at least twice differentiable, applying a Taylor expansion we easily see that the canonical metric d_g corresponding to g_\delta (cf. Theorem 6.7) can be bounded as follows:
\[
d_g^2(s,t) = E(g_\delta(s) - g_\delta(t))^2 = \frac{b^2}{\delta^2}\left[\gamma\left(\frac{\delta s}{b},\frac{\delta s}{b}\right) + \gamma\left(\frac{\delta t}{b},\frac{\delta t}{b}\right) - 2\gamma\left(\frac{\delta s}{b},\frac{\delta t}{b}\right)\right] \le c\,|s-t|^2
\]
for some constant c \in (0,\infty). Therefore the entropy of g_\delta evaluated at \varepsilon is bounded by K\varepsilon^{-d} for any \varepsilon > 0, with an appropriate choice of K > 0. Therefore, for all \delta < \delta_0,
\[
P\left(\sup_{|t|\le\delta/b}|g(t)| \ge \frac{1}{4b}\right) = P\left(\sup_{|t|\le 1}|g_\delta(t)| \ge \frac{1}{4\delta}\right) \le c_d\,\delta^{-d-\eta}\exp\left(-\frac{1}{16c_\gamma\delta^2}\right)
\]
for some constants c_d and \eta > 0. The last inequality is a direct application of Theorem 4.1.1 of [4]. The conclusion of the lemma follows immediately, by choosing c and \bar\delta appropriately. □

REFERENCES

[1] R.J. Adler, K. Bartz, and S. Kou. Estimating curvatures in the Gaussian kinematic formula. In preparation.
[2] R.J. Adler, J. Blanchet, and J. Liu. Efficient simulation for tail probabilities of Gaussian random fields. In WSC '08: Proceedings of the 40th Conference on Winter Simulation, pages 328-336. Winter Simulation Conference, 2008.
[3] R.J. Adler, P. Muller, and B. Rozovskii, editors. Stochastic Modelling in Physical Oceanography. Birkhauser, Boston, 1996.
[4] R.J. Adler and J.E. Taylor. Random Fields and Geometry. Springer, New York, USA, 2007.
[5] R.J. Adler, J.E. Taylor, and K. Worsley. Applications of Random Fields and Geometry: Foundations and Case Studies. http://webee.technion.ac.il/people/adler/hrf.pdf.
[6] S. Asmussen and P. Glynn. Stochastic Simulation: Algorithms and Analysis. Springer-Verlag, New York, NY, USA, 2008.
[7] J.M. Azais and M. Wschebor. Level Sets and Extrema of Random Processes and Fields. Wiley, New York, NY, 2009.
[8] J.M. Bardeen, J.R. Bond, N. Kaiser, and A.S. Szalay. The statistics of peaks of Gaussian random fields. The Astrophysical Journal, 304:15-61, 1986.
[9] S.M. Berman. Sojourns and Extremes of Stochastic Processes. The Wadsworth & Brooks/Cole Statistics/Probability Series. Wadsworth & Brooks/Cole Advanced Books & Software, Pacific Grove, CA, 1992.
[10] J. Blanchet. Efficient importance sampling for binary contingency tables. Ann. of Appl. Prob., 19:949-982, 2009.
[11] C. Borell. The Brunn-Minkowski inequality in Gauss space. Invent. Math., 30:205-216, 1975.
[12] J. Bucklew. Introduction to Rare-Event Simulation. Springer-Verlag, New York, NY, 2004.
[13] M.R. Dennis. Nodal densities of planar Gaussian random waves. Eur. Phys. J., 145:191-210, 2007.
[14] R.M. Dudley. Sample functions of the Gaussian process. Ann. Probab., 1:66-103, 1973.
[15] K.J. Friston, K.J. Worsley, R.S.J. Frackowiak, J.C. Mazziotta, and A.C. Evans. Assessing the significance of focal activations using their spatial extent. Human Brain Mapping, 1:214-220, 1994.
[16] M.R. Leadbetter, G. Lindgren, and H. Rootzen. Extremes and Related Properties of Random Sequences and Processes. Springer-Verlag, New York, 1983.
[17] M. Mitzenmacher and E. Upfal. Probability and Computing: Randomized Algorithms and Probabilistic Analysis. Cambridge University Press, 2005.
[18] W. Niemiro and P. Pokarowski. Fixed precision MCMC estimation by median of products of averages. J. Appl. Probab., 46:309-329, 2009.
[19] V. Piterbarg. Asymptotic Methods in the Theory of Gaussian Processes. American Mathematical Society, USA, 1995.
[20] S.F. Shandarin. Testing non-Gaussianity in cosmic microwave background maps by morphological statistics. Mon. Not. R. Astr. Soc., 331:865-874, 2002.
[21] S.F. Shandarin, H.A. Feldman, Y. Xu, and M. Tegmark. Morphological measures of non-gaussianity in cosmic microwave background maps. Astrophys. J. Suppl., 141:1-11, 2002.
[22] L. Shepp and H. Landau. On the supremum of a Gaussian process. Sankhya A, 32:369-378, December 1970.
[23] J.E. Taylor, A. Takemura, and R.J. Adler. Validity of the expected Euler characteristic heuristic. Ann. Probab., 33(4):1362-1396, 2005.
[24] J.E. Taylor and K.J. Worsley. Detecting sparse signals in random fields, with an application to brain mapping. J. Amer. Statist. Assoc., 102(479):913-928, 2007.
[25] J. Traub, G. Wasilokowski, and H. Wozniakowski. Information-Based Complexity. Academic Press, New York, NY, 1988.
[26] B.S. Tsirelson, I.A. Ibragimov, and V.N. Sudakov. Norms of Gaussian sample functions. In Proceedings of the Third Japan-USSR Symposium on Probability Theory (Tashkent, 1975), 550:20-41, 1976.
[27] V.S. Tsirel'son. The density of the maximum of a Gaussian process. Theory of Probab. Appl., 20:847-856, 1975.
[28] H. Wozniakowski. Computational complexity of continuous problems. Technical Report, 1996.

Robert J. Adler
Electrical Engineering
Technion, Haifa, Israel 32000
E-mail: robert@ee.technion.ac.il
URL: webee.technion.ac.il/people/adler

Jose H. Blanchet
Industrial Engineering & Operations Research
Columbia University
340 S.W. Mudd Building
500 W. 120 Street
New York, New York 10027
E-mail: jose.blanchet@columbia.edu
URL: www.ieor.columbia.edu/fac-bios/blanchet/faculty.html

Jingchen Liu
Department of Statistics
Columbia University
1255 Amsterdam Avenue, Room 1030
New York, NY 10027
E-mail: jcliu@stat.columbia.edu
URL: stat.columbia.edu/~jcliu

Page 20: EFFICIENT MONTE CARLO FOR HIGH EXCURSIONS OF … › ~jblanche › papers › GRF1.pdf · 1. Introduction. This paper centers on the design and analysis of e–cient Monte Carlo techniques

20 ADLER BLANCHET LIU

The second general result that we shall need is the so-called B-TIS (Borel-Tsirelson-Ibragimov-Sudakov) inequality [4 11 26]

Theorem 68 Under the setting described in Theorem 67

P

(maxtisinU

f0 (t)minus E[maxtisinU

f0 (t)] ge b

)le exp

(minusb2(2σ2U )

)

whereσ2U = max

tisinUE[f2

0 (t)]

We can now proceed with the main proof We shall assume from now on thatτγab

= 0 since as will be obvious from what follows all estimates hold uniformlyover τγab

isin T This is a consequence of the uniform Holder assumptions A2 andA3 Define a new process f

(f (t) t isin T ) L= (f (t) t isin T | f (0) = γab + zγab)

Note that we can always write f (t) = micro (t)+g (t) where g is a mean zero Gaussianrandom field on T We have that

micro (t) = Ef (t) = micro (t) + σ (0)minus2C (0 t) (γab + zγab minus micro (0))

and that the covariance function of f is given by

Cg (s t) = Cov (g (s) g (t)) = C (s t)minus σ (0)minus2C (0 s)C (0 t)

The following lemma describes the behavior of micro (t) and Cg (t s)

Lemma 69 Assume that |s| and |t| small enough Then the following threeconclusions hold

(i) There exist constants λ0 and λ1 gt 0 such that

|micro (t)minus (γab + zγab)| le λ0 |t|β + λ1 |t|β (γab + zγab)

and for all z isin (0 1) and γab large enough

|micro (s)minus micro (t) | le κHγab|sminus t|β

(ii)Cg (s t) le 2κHσ (t)σ (s) |t|β + |s|β + |tminus s|β

(iii)

Dg (s t) =radic

E([g (t)minus g (s)]2) le λ121 |tminus s|β2

Proof All three consequences follow from simple algebraic manipulations Thedetails are omitted 2

MONTE CARLO FOR RANDOM FIELDS 21

Proposition 610 For any v gt 0 there exist κ and λ2 such that for all t isin T yminusβd le a2+v

κb(2+v) a sufficiently small and z gt 0

P(m(Aγab

)minus1 gt y m(Ab) gt 0∣∣ f (t) = γab + zγab

) le exp(minusλ2a

2yβdb2)

Proof For notational simplicity and without loss of generality we assume thatt = 0 First consider the case that z ge 1 Then there exist c1 c2 such that for allc2y

minusβd lt bminus2minusv and z gt 1

P(m

(Aγab

cap T)minus1

gt ym(Ab) gt 0| f (0) = γab + zγab

)

le P

(inf

|t|ltc1yminus1df(t) le γab|f (0) = γab + zγab

)

= P

(inf

|t|ltc1yminus1dmicro (t) + g(t) le γab

)

le P

(inf

|t|ltc1yminus1dg(t) le minus 1

2γab

)

Now apply (iii) from Lemma 69 from which it follows that N (ε) le c3m(T )ε2dβ

for some constant c3 By Theorem 67 E(sup|t|ltc1yminus1d f(t)) = O(yminusβ(2d) log y)By Theorem 68 for some constant c4

P(m

(Aγab

cap T)minus1

gt y m(Ab) gt 0| f (0) = γab + zγab

)

le P

(inf

|t|ltc1yminus1dg(t) le minus 1

2γab

)

le exp

(minus 1

c4γ2aby

minusβd

)

for c2yminusβd lt bminus2minusv and z gt 1

Now consider the case z isin (0 1) Let tlowast be the global maximum of f(t) Then

P(m

(Aγab

cap T)minus1

gt y m(Ab) gt 0|f (0) = γab + zγab

)

le P

(inf

|tminustlowast|ltc1yminus1df(t) lt γab f(tlowast) gt b|f (0) = γab + zγab

)

le P

(sup

|sminust|ltc1yminus1d

|f(s)minus f(t)| gt ab|f (0) = γab + zγab

)

Consider the new field ξ(s t) = g(s)minus g(t) with parameter space T times T Note thatradic

V ar(ξ(s t)) = Dg(s t) le λ1|sminus t|β2

Via basic algebra it is not hard to show that the entropy of ξ(s t) is bounded byNξ (ε) le c3m(T times T )ε2dβ In addition from (i) of Lemma 69 we have

|micro (s)minus micro (t) | le κHγab|sminus t|β

22 ADLER BLANCHET LIU

Similarly for some κ gt 0 and all yminusβd le 1κ

(ab

)2+v a lt 1 there exists c5 such that

P(m

(Aγab

cap T)minus1

gt y m(Ab) gt 0|f (0) = γab + zγab

)

le P

(sup

|sminust|ltc1yminus1d

|f(s)minus f(t)| gt ab|f (0) = γab + zγab

)

= P

(sup

|sminust|ltc1yminus1d

|ξ(s t)| gt a

2b

)

le exp(minus a2

c5b2yminusβd

)

Combining the two cases z gt 1 and z isin (0 1) and choosing c5 large enough we have

P(m

(Aγab

cap T)minus1

gt y m(Ab) gt 0∣∣ f (0) = γab + zγab

)

le exp(minus a2

c5b2yminusβd

)

for a small enough and yminusβd le 1κ

(ab

)2+v Renaming the constants completes theproof 2

The final ingredient needed for the proof of Proposition 64 is the following lemmainvolving stochastic domination The proof follows an elementary argument and istherefore omitted

Lemma 611 Let v1 and v2 be finite measures on R and define ηj(x) =intinfin

xvj(ds)

Suppose that η1 (x) ge η2 (x) for each x ge x0 Let (h (x) x ge x0) be a non-decreasing positive and bounded function Then

int infin

x0

h (s) v1(dx) geint infin

x0

h (s) v2 (ds)

Proof of Proposition 64 Note that

EQprime(exp

(minusMm(Aγab

)m (T )

)

m(Aγab

) m(Aγab

) isin (0 α) m(Ab) gt 0)

=int

T

int infin

z=0

E(exp

(minusMm(Aγab

)m (T )

)

m (Aγab)

m(Aγab

) isin (0 α) m(Ab) gt 0∣∣∣

f (t) = γab +z

γab

)P

(τγab

isin dt)P (γab[f (t)minus γab] isin dz| f (t) gt γab)

le supzgt0tisinT

E(exp

(minusMm(Aγab

)m (T )

)

m (Aγab)

m(Aγab

) isin (0 α) m(Ab) gt 0∣∣∣

f (t) = γab +z

γab

)

MONTE CARLO FOR RANDOM FIELDS 23

Now define Y (ba) = (ba)2dβλminusdβ2 W with λ2 as chosen in Proposition 610

and W with distribution given by P (W gt x) = exp(minusxβd) By Lemma 611

suptisinT

E(exp

(minusMm(Aγab

)m (T )

)

m (Aγab)

m(Aγab

) isin (0 α) m(Ab) gt 0∣∣∣

f (t) = γab +z

γab

)

le E[Y (ba) exp(minusMY (ba)minus1m(T ))Y (ba) gt αminus1

]

Now let Z be exponentially distributed with mean 1 and independent of Y (ba)Then we have (using the definition of the tail distribution of Z and Chebyshevrsquosinequality)

exp(minusMY (ba)minus1m (T )

)= P

(Z gt MY (ba)minus1m (T )

∣∣ Y (ba))

le m (T )Y (ba)M

Therefore

E[exp(minusMY (ba)minus1m (T )

)Y (ba) Y (ba) ge αminus1]

le m (T )M

E[Y (ba)2 Y (ba) ge αminus1]

le m (T )

Mλ2dβ2

(b

a

)4dβ

E(W 2)

which completes the proof 2

65 Proof of Proposition 65 We start with the following result of Tsirelson[27]

Theorem 612 Let f be a continuous separable Gaussian process on a compact(in the canonical metric) domain T Suppose that V ar (f) = σ is continuous andthat σ (t) gt 0 for t isin T Moreover assume that micro = Ef is also continuous andmicro(t) ge 0 for all t isin T Define

σ2T

∆= maxtisinT

Var(f(t))

and set F (x) = PmaxtisinT f(t) le x Then F is continuously differentiable on RFurthermore let y be such that F (y) gt 12 and define ylowast by

F (y) = Φ(ylowast)

Then for all x gt y

F prime(x) le Ψ(

xylowasty

)(xylowasty

(1 + 2α) + 1)

(1 + α)

24 ADLER BLANCHET LIU

where

α =y2

x(xminus y)y2lowast

We can now prove the following lemma

Lemma 613 There exists a constant A isin (0infin) independent of a and b ge 0such that

(615) P

(suptisinT

f(t) le b + ab | suptisinT

f(t) gt b

)le aA

P (suptisinT f (t) ge bminus 1b)P (suptisinT f (t) ge b)

Proof By subtracting inftisinT micro (t) gt minusinfin and redefining the level b to be b minusinftisinT micro (t) we may simply assume that Ef (t) ge 0 so that we can apply Theorem612 Adopting the notation of Theorem 612 first we pick b0 large enough so thatF (b0) gt 12 and assume that b ge b0 + 1 Now let y = bminus 1b and F (y) = Φ(ylowast)Note that there exists δ0 isin (0infin) such that δ0b le ylowast le δminus1

0 b for all b ge b0 Thisfollows easily from the fact that

logP

suptisinT

f(t) gt x

sim log sup

tisinTP f(t) gt x sim minus x2

2σ2T

On the other hand by Theorem 612 F is continuously differentiable and so

(616) P

suptisinT

f(t) lt b + ab | suptisinT

f(t) gt b

=

int b+ab

bF prime(x) dx

P suptisinT f(t) gt b

Moreover

F prime(x) le(

1minus Φ(

xylowasty

)) (xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x))

le (1minus Φ(ylowast))(

xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x))

= P

(maxtisinT

f (t) gt bminus 1b

)(xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x))

Thereforeint b+ab

b

F prime(x) dx

le P

(suptisinT

f (t) gt bminus 1b

) int b+ab

b

(xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x)) dx

Recall that α (x) = y2[x(xminus y)y2lowast] we can use the fact that ylowast ge δ0b to conclude

that if x isin [b b + ab] then α (x) le δminus20 and therefore

int b+ab

b

(xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x)) dx le 4δminus80 a

MONTE CARLO FOR RANDOM FIELDS 25

We thus obtain that

(617)

int b+ab

bF prime(x) dx

P suptisinT f(t) gt b le 4aδminus80

P suptisinT f(t) gt bminus 1bP suptisinT f(t) gt b

for any b ge b0 This inequality together with the fact that F is continuously differ-entiable on (minusinfininfin) yields the proof of the lemma for b ge 0 2

The previous result translates a question that involves the conditional distri-bution of maxtisinT f (t) near b into a question involving the tail distribution ofmaxtisinT f (t) The next result then provides a bound on this tail distribution

Lemma 614 For each v gt 0 there exists a constant C(v) isin (0infin) (possiblydepending on v gt 0 but otherwise independent of b) so that such that

P

(maxtisinT

f (t) gt b

)le C(v)b2dβ+dv+1 max

tisinTP (f (t) gt b)

for all b ge 1

Proof The proof of this result follows along the same lines of Theorem 262 in[5] Consider an open cover of T = cupM

i=1Ti(θ) where Ti(θ) = s |s minus ti| lt θWe choose ti carefully such that N(θ) = O(θminusd) for θ arbitrarily small Writef (t) = g (t)+micro (t) where g (t) is a centered Gaussian random field and note usingA2 and A3 that

P

(max

tisinTi(θ)f (t) gt b

)le P

(max

tisinTi(θ)g (t) gt bminus micro (ti)minus κHθβ

)

Now we wish to apply the Borel-TIS inequality (Theorem 68) with U = Ti (θ) f0 =g d (s t) = E12([g (t)minus g (s)]2) which as a consequence of A2 and A3 is boundedabove by C0 |tminus s|β2 for some C0 isin (0infin) Thus applying Theorem 67 we havethat E maxtisinTi(θ) g (t) le C1θ

β2 log(1θ) for some C1 isin (0infin) Consequently theBorel-TIS inequality yields that there exists C2 (v) isin (0infin) such that for all bsufficiently large and θ sufficiently small we have

P

(max

tisinTi(θ)g (t) gt bminus micro (ti)minus κHθβ

)le C2 (v) exp

(minus

(bminus micro (ti)minus C1θ

β(2+βv))2

2σ2Ti

)

where σTi = maxtisinTi(θ) σ (t) Now select v gt 0 small enough and set θβ(2+βv) =bminus1 Straightforward calculations yield that

P

(max

tisinTi(θ)f (t) gt b

)le P

(max

tisinTi(θ)g (t) gt bminus micro (ti)minus κHθβ

)

le C3 (v) maxtisinTi(θ)

exp

(minus (bminus micro (t))2

2σ (t)2

)

26 ADLER BLANCHET LIU

for some C3(v) isin (0infin) Now recall the well known inequality (valid for x gt 0)that

φ (x)(

1xminus 1

x3

)le 1minus Φ(x) le φ (x)

x

where φ = Φprime is the standard Gaussian density Using this inequality it follows thatC4 (v) isin (0infin) can be chosen so that

maxtisinTi(θ)

exp

(minus (bminus micro (t))2

2σ (t)2

)le C4 (v) bmax

tisinTi

P (f (t) gt b)

for all b ge 1 We then conclude that there exists C (v) isin (0infin) such that

P

(maxtisinT

f (t) gt b

)le N (θ)C4 (v) bmax

tisinTP (f (t) gt b)

le CθminusdbmaxtisinT

P (f (t) gt b) = Cb2dβ+dv+1 maxtisinT

P (f (t) gt b)

giving the result 2

We can now complete the proof of Proposition 65

Proof of Proposition 65 The result is a straightforward corollary of the previoustwo lemmas By (615) in Lemma 613 and Lemma 614 there exists λ isin (0infin) forwhich

P

(maxtisinT

f(t) le b + ab | maxtisinT

f(t) gt b

)

le aAP (maxtisinT f (t) ge bminus 1b)

P (maxtisinT f (t) ge b)

le aCAb2dβ+dv+1 maxtisinT P (f (t) gt bminus 1b)P (maxtisinT f (t) ge b)

le aCAb2dβ+dv+1 maxtisinT P (f (t) gt bminus 1b)maxtisinT P (f (t) gt b)

le aλb2dβ+dv+1

The last two inequalities follow from the obvious bound

P

(maxtisinT

f (t) ge b

)ge max

tisinTP (f (t) gt b)

and standard properties of the Gaussian distribution This yields (612) from whichthe remainder of the proposition follows 2

MONTE CARLO FOR RANDOM FIELDS 27

7 Fine Tuning Twice Differentiable Homogeneous Fields In the pre-ceeding section we constructed a polynomial time algorithm based on a randomizeddiscretization scheme Our goal in this section is to illustrate how to take advantageof additional information to further improve the running time and the efficiency ofthe algorithm In order to illustrate our techniques we shall perform a more refinedanalysis in the setting of smooth and homogeneous fields and shall establish op-timality of the algorithm in a precise sense to described below Our assumptionsthroughout this section are B1ndashB2 of Section 3

Let C(sminus t) = Cov(f(s) f(t)) be the covariance function of f which we assumealso has mean zero Note that it is an immediate consequence of homogeneity anddifferentiability that partiC(0) = part3

ijkC(0) = 0We shall need the following definition

Definition 71 We call T = t1 tM sub T a θ-regular discretization of Tif and only if

mini 6=j

|ti minus tj | ge θ suptisinT

mini|ti minus t| le 2θ

Regularity ensures that points in the grid T are well separated Intuitively sincef is smooth having tight clusters of points translates to a waste of computingresources as a result of sampling highly correlated values of f Also note thatevery region containing a ball of radius 2θ has at least one representative in T Therefore T covers the domain T in an economical way One technical convenienceof θ-regularity is that for subsets A sube T that have positive Lebesgue measure (inparticular ellipsoids)

limMrarrinfin

(A cap T )M

=m(A)m(T )

where here and throughout the remainder of the section (A) denotes the cardi-nality of the set A

Let T = t1 tM be a θ-regular discretization of T and consider

X = (X1 XM )T ∆= (f (t1) f (tM ))T

We shall concentrate on estimating wM (b) = P (max1leileM Xi gt b) The next result(which we prove in Section 71) shows that if θ = εb then the relative bias is O (

radicε)

Proposition 72 Suppose f is a Gaussian random field satisfying conditionsB1 and B2 There exist c0 c1 b0 and ε0 such that for any finite εb-regulardiscretization T of T

(71) P (suptisinT

f(t) lt b| suptisinT

f(t) gt b) le c0

radicε and (T ) le c1m(T )εminusdbd

for all ε isin (0 ε0] and b gt b0

Note that the bound on the bias obtained for twice differentiable fields is muchsharper than that of the general Holder continuous fields given by (613) in Theorem

28 ADLER BLANCHET LIU

66 This is partly because the conditional distribution of the random field aroundlocal maxima is harder to describe in the Holder continuous than in the case oftwice differentiable fields In addition to the sharper description of the bias weshall also soon show in Theorem 74 that our choice of discretization is optimal in acetain sense Finally we point out that the bound of

radicε in the first term of (71) is

not optimal In fact there seems to be some room of improvement and we believethat a more careful analysis might yield a bound of the form c0ε

2We shall estimate wM (b) by using a slight variation of Algorithm 53 In partic-

ular since the Xirsquos are now identically distributed we redefine Q to be

(72) Q (X isin B) =Msum

i=1

1M

P [X isin B|Xi gt bminus 1b]

Our estimator then takes the form

(73) Lb =M times P (X1 gt bminus 1b)sumM

j=1 1 (Xj gt bminus 1b)1( max

1leileMXi gt b)

Clearly we have that EQ(Lb) = wM (b) (The reason for for subtracting the fcatorof 1b was explained in Section 61)

Algorithm 73 Given a number of replications n and an εb-regular dis-cretization T the algorithm is as follows

Step 1 Sample X(1) X(n) iid copies of X with distribution Q given by (72)Step 2 Compute and output

Ln =1n

nsum

i=1

L(i)b

where

L(i)b =

M times P (X1 gt bminus 1b)sumM

j=1 1(X(i)j gt bminus 1b)

1( max1leileM

X(i)j gt b)

Theorem 75 later guides the selection of n in order to achieve a prescribedrelative error In particular our analysis together with considerations from Section2 implies that choosing n = O

(εminus2δminus1

)suffices to achieve ε relative error with

probability at least 1minus δAlgorithm 73 improves on Algorithm 61 for Holder continuous fields in two

important ways The first aspect is that it is possible to obtain information on thesize of the relative bias of the estimator In Proposition 72 we saw that in orderto overcome bias due to discretization it suffices to take a discretization of sizeM = (T ) = Θ(bd) That this selection is also asymptotically optimal in the sensedescribed in the next result will be proven in Section 71

MONTE CARLO FOR RANDOM FIELDS 29

Theorem 74 Suppose f is a Gaussian random field satisfying conditions B1and B2 If θ isin (0 1) then as b rarrinfin

sup(T )lebθd

P (suptisinT

f(t) gt b| suptisinT

f(t) gt b) rarr 0

This result implies that the relative bias goes to 100 as b rarrinfin if one chooses adiscretization scheme of size O

(bθd

)with θ isin (0 1) Consequently d is the smallest

power of b that achieves any given bounded relative bias and so the suggestionabove of choosing M = O

(bd

)points for the discretization is in this sense optimal

The second aspect of improvement involves the variance In the case of Holdercontinuous fields the ratio of the second moment of the estimator and w (b)2 wasshown to be bounded by a quantity that is of order O(M2) In contrast in thecontext of smooth and homogeneous fields considered here the next result showsthat this ratio is bounded uniformly for b gt b0 and M = (T ) ge cbd That is thevariance remains strongly controlled

Theorem 75 Suppose f is a Gaussian random field satisfying conditions B1and B2 Then there exist constants c b0 and ε0 such that for any εb-regulardiscretization T of T we have

supbgtb0εisin[0ε0]

EQL2b

P 2 (suptisinT f (t) gt b)le sup

bgtb0εisin[0ε0]

EQL2b

P 2(sup

tisinTf (t) gt b

) le c

for some c isin (0infin)

The proof of this result is given in Section 72 The fact that the number ofreplications remains bounded in b is a consequence of the strong control on thevariance

Finally we note that the proof of Theorem 33 follows as a direct corollary of The-orem 75 together with Proposition 72 and our discussion in Section 2 Assumingthat placing each point in T takes no more than c units of computer time the totalcomplexity is according to the discussion in section Section 2 O

(nM3 + M

)=

O(εminus2δminus1M3 + M

) The contribution of the term M3 = O

(εminus6b3

)comes from

the complexity of applying Cholesky factorization and the term M = O(εminus2b)corresponds to the complexity of placing T

Remark 76 Condition B2 imposes a convexity assumption on the boundaryof T This assumption although convenient in the development of the proofs of The-orems 74 and 75 is not necessary The results can be generalized at the expenseof increasing the length and the burden in the technical development to the casein which T is a d-dimensional manifold satisfying the so-called Whitney conditions([4])

The remainder of this section is devoted to the proof of Proposition 72 Theorem74 and Theorem 75

30 ADLER BLANCHET LIU

71 Bias Control Proofs of Proposition 72 and Theorem 74 We start withsome useful lemmas for all of which we assume that Conditions B1 and B2 are sat-isfied We shall also assume that the global maximum of f over T is achieved withprobability one at a single point in T Additional conditions under which this willhappen can be found in [4] and require little more than the non-degeneracy of thejoint distribution of f and its first and second order derivatives Of these lemmasLemma 77 the proof of which we defer to Section 73 is of central importance

Lemma 77 For any a0 gt 0 there exists clowast δlowast b0 and δ0 (which depend onthe choice of a0) such that for any s isin T a isin (0 a0) δ isin (0 δ0) and b gt b0

P ( min|tminuss|ltδabminus1

f(t) lt b|f(s) = b + ab s isin L) le clowast exp(minusδlowast

δ2

)

where L is the set of local maxima of f

The next lemma is a direct corollary of Lemma 77

Lemma 78 Let tlowast be the point in T at which the global maximum of f isattained Then with the same choice of constants as in Lemma 77 for any a isin(0 a0) δ isin (0 δ0) and b gt b0 we have

P ( min|tminustlowast|ltδabminus1

f(t) lt b|f(tlowast) = b + ab) le 2clowast exp(minusδlowast

δ2

)

Proof Note that for any s isin T

P

(min

|t|leδabf(t) lt b|f(s) = b + ab s = tlowast

)

=P

(min|t|leδab f(t) lt b f(s) = b + ab s = tlowast

)

P (f(s) = b + ab s = tlowast)

le P(min|t|leδab f(t) lt b f(s) = b + ab s isin L)

P (f(s) = b + ab s = tlowast)

= P ( min|tminuss|ltδabminus1

f(t) lt b|f(s) = b + ab s isin L)P (f(s) = b + ab s isin L)P (f(s) = b + ab s = tlowast)

le clowast exp(minusδlowast

δ2

)P (f(s) = b + ab s isin L)P (f(s) = b + ab s = tlowast)

Appling this bound and the fact that

P (f(s) = b + ab s isin L)P (f(s) = b + ab s = tlowast)

rarr 1

as b rarrinfin the conclusion of the Lemma holds 2

MONTE CARLO FOR RANDOM FIELDS 31

The following lemma gives a bound on the density of suptisinT f(t) which will beused to control the size of overshoot beyond level b

Lemma 79 Let pflowast(x) be the density function of suptisinT f(t) Then there existsa constant cflowast and b0 such that

pflowast(x) le cflowastxd+1P (f(0) gt x)

for all x gt b0

Proof Recalling (13) let the continuous function pE(x) x isin R be defined bythe relationship

E(χ(t isin T f(t) ge b)) =int infin

b

pE(x) dx

where the left hand side is the expected value of the Euler-Poincare characteristicof Ab Then according to Theorem 810 in [7] there exists c and δ such that

|pE(x)minus pflowast(x)| lt cP (f(0) gt (1 + δ)x)

for all x gt 0 In addition thanks to the result of [4] which providesintinfin

bpE(x)dx in

closed form there exists c0 such that for all x gt 1

pE(x) lt c0xd+1P (f(0) gt x)

Hence there exists cflowast such that

pflowast(x) le c0xd+1P (f(0) gt x) + cP (f(0) gt (1 + δ)x) le cflowastx

d+1P (f(0) gt x)

for all x gt 1 2

The last ingredients required to provide the proof of Proposition 72 and Theorem74 are stated in the following well known Theorem adapted from Lemma 61 andTheorem 72 in [19] to the twice differentiable case

Theorem 710 There exists a constant H (depending on the covariance func-tion C such that

P (suptisinT

f(t) gt b) = (1 + o (1))Hm(T )bdP (f(0) gt b)(74)

as b rarrinfinSimilarly choose δ small enough so that [0 δ]d sub T and let ∆0 = [0 bminus1]d Then

there exists a constant H1 such that

P ( suptisin∆0

f(t) gt b) = (1 + o(1))H1P (f(0) gt b)(75)

as b rarrinfin

32 ADLER BLANCHET LIU

We now are ready to provide the proof of Proposition 72 and Theorem 74

Proof of Proposition 72 The fact that there exists c1 such that

(T ) le c1m(T )εminusdbd

is immediate from assumption B2 Therefore we proceed to provide a bound forthe relative bias We choose ε0 lt δ2

0 (δ0 is given in Lemmas 77 and 78) and notethat for any ε lt ε0

P (suptisinT

f(t) lt b| suptisinT

f(t) gt b)

le P (suptisinT

f(t) lt b + 2radic

εb| suptisinT

f(t) gt b) + P (suptisinT

f(t) lt b| suptisinT

f(t) gt b + 2radic

εb)

By (74) and Lemma 79 there exists c2 such that the first term above can boundedby

P (suptisinT

f(t) lt b + 2radic

εb| suptisinT

f(t) gt b) le c2

radicε

The second term can be bounded by

P (suptisinT

f(t) lt b| suptisinT

f(t) gt b + 2radic

εb)

le P ( sup|tminustlowast|lt2εbminus1

f(t) lt b| suptisinT

f(t) gt b + 2radic

ε)

le 2clowast exp(minusδlowastεminus1

)

Hence there exists c0 such that

P (suptisinT

f(t) lt b| suptisinT

f(t) gt b) le c2

radicε + 2clowasteminusδlowastε le c0

radicε

for all ε isin (0 ε0) 2

Proof of Theorem 7.4. We write $\theta = 1 - 3\delta \in (0,1)$. First note that, by (7.4),
$$P\Big(\sup_T f(t) > b + b^{2\delta-1} \,\Big|\, \sup_T f(t) > b\Big) \to 0$$
as $b \to \infty$. Let $t^*$ be the position of the global maximum of $f$ in $T$. According to the exact Slepian model in Section 7.3 and an argument similar to the proof of Lemma 7.8,
$$P\Big(\sup_{|t-t^*|>b^{2\delta-1}} f(t) > b \,\Big|\, b < f(t^*) \le b + b^{2\delta-1}\Big) \to 0 \tag{7.6}$$
as $b \to \infty$. Consequently,
$$P\Big(\sup_{|t-t^*|>b^{2\delta-1}} f(t) > b \,\Big|\, \sup_T f(t) > b\Big) \to 0.$$
Let
$$B(\mathcal{T}, b^{2\delta-1}) = \bigcup_{t\in\mathcal{T}} B(t, b^{2\delta-1}).$$
We have
$$P\Big(\sup_{\mathcal{T}} f(t) > b \,\Big|\, \sup_T f(t) > b\Big) \le P\Big(\sup_{|t-t^*|>b^{2\delta-1}} f(t) > b \,\Big|\, \sup_T f(t) > b\Big) + P\Big(\sup_{\mathcal{T}} f(t) > b,\ \sup_{|t-t^*|>b^{2\delta-1}} f(t) \le b \,\Big|\, \sup_T f(t) > b\Big)$$
$$\le o(1) + P\Big(t^* \in B(\mathcal{T}, b^{2\delta-1}) \,\Big|\, \sup_T f(t) > b\Big) \le o(1) + P\Big(\sup_{B(\mathcal{T}, b^{2\delta-1})} f(t) > b \,\Big|\, \sup_T f(t) > b\Big).$$
Since $\#(\mathcal{T}) \le b^{(1-3\delta)d}$, we can find a finite set $\mathcal{T}' = \{t_1', \ldots, t_l'\} \subset T$, with $\Delta_k = t_k' + [0, b^{-1}]^d$, such that $l = O(b^{(1-\delta)d})$ and $B(\mathcal{T}, b^{2\delta-1}) \subset \bigcup_{k=1}^l \Delta_k$. The choice of $l$ depends only on $\#(\mathcal{T})$, not on the particular positions of the points of $\mathcal{T}$. Therefore, applying (7.5),
$$\sup_{\#(\mathcal{T})\le b^{\theta d}} P\Big(\sup_{B(\mathcal{T}, b^{2\delta-1})} f(t) > b\Big) \le O\big(b^{(1-\delta)d}\big)\, P(f(0) > b).$$
This, together with (7.4), yields
$$\sup_{\#(\mathcal{T})\le b^{\theta d}} P\Big(\sup_{B(\mathcal{T}, b^{2\delta-1})} f(t) > b \,\Big|\, \sup_T f(t) > b\Big) \le O\big(b^{-\delta d}\big) = o(1)$$
for $b \ge b_0$, which clearly implies the statement of the result. $\Box$
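The optimality assertion of Theorem 7.4 can also be seen empirically: a grid with $O(b^{\theta d})$ points, $\theta < 1$, conditionally misses the excursion with probability tending to one. The sketch below (ours; the covariance $C(t) = e^{-t^2}$, the level $b = 2.5$, and the grid sizes are arbitrary illustrative choices at a moderate level, where naive Monte Carlo is still feasible) estimates $P(\max_{t\in\mathcal{T}} f(t) > b \mid \sup_{t\in T} f(t) > b)$ in $d = 1$ for a very coarse grid and for a $\Theta(b)$-sized grid:

    import numpy as np

    rng = np.random.default_rng(1)

    b = 2.5
    n_dense = 250
    t = np.linspace(0.0, 1.0, n_dense)              # dense proxy for the continuum T
    Sigma = np.exp(-np.subtract.outer(t, t) ** 2)
    L = np.linalg.cholesky(Sigma + 1e-10 * np.eye(n_dense))

    grids = {"2 points (very coarse)": 2, "25 points (~10*b)": 25}
    hits = {name: 0 for name in grids}
    n_cond = 0

    for _ in range(10):                             # 10 batches of 10,000 paths
        X = L @ rng.standard_normal((n_dense, 10_000))
        cond = X.max(axis=0) > b                    # the event {sup_T f > b}
        n_cond += cond.sum()
        for name, m in grids.items():
            idx = np.linspace(0, n_dense - 1, m).astype(int)
            hits[name] += (X[idx][:, cond].max(axis=0) > b).sum()

    for name in grids:
        print(f"{name}: P(max over grid > b | sup_T f > b) ~ {hits[name] / n_cond:.3f}")

The coarse grid exhibits substantial conditional bias, while the $\Theta(b)$-sized grid captures most of the conditional excursion mass, consistent with Proposition 7.2 and Theorem 7.4.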

7.2. Variance Control: Proof of Theorem 7.5. We proceed directly to the proof of Theorem 7.5.


Proof of Theorem 7.5. Note that
$$\frac{E_Q L_b^2}{P^2\big(\sup_{t\in T} f(t) > b\big)} = \frac{E(L_b)}{P^2\big(\sup_{t\in T} f(t) > b\big)} = \frac{E\left(\frac{M\, P(X_1 > b - 1/b)}{\sum_{j=1}^M 1(X_j > b - \frac1b)};\ \max_j X_j > b\right)}{P^2\big(\sup_{t\in T} f(t) > b\big)}$$
$$= \frac{E\left(\frac{M\, P(X_1 > b - 1/b)}{\sum_{j=1}^M 1(X_j > b - \frac1b)};\ \max_j X_j > b,\ \sup_{t\in T} f(t) > b\right)}{P^2\big(\sup_{t\in T} f(t) > b\big)},$$
where the last step uses $\{\max_j X_j > b\} \subset \{\sup_{t\in T} f(t) > b\}$. Continuing,
$$= \frac{E\left(\frac{M\, P(X_1 > b - 1/b)\, 1(\max_j X_j > b)}{\sum_{j=1}^M 1(X_j > b - 1/b)} \,\middle|\, \sup_{t\in T} f(t) > b\right)}{P\big(\sup_{t\in T} f(t) > b\big)} = E\left(\frac{M\, 1(\max_j X_j > b)}{\sum_{j=1}^M 1(X_j > b - 1/b)} \,\middle|\, \sup_{t\in T} f(t) > b\right) \frac{P(X_1 > b - 1/b)}{P\big(\sup_{t\in T} f(t) > b\big)}.$$

The remainder of the proof involves showing that the last conditional expectation here is of order $O(b^d)$. This, together with (7.4), will yield the result. Note that, for any $A(b,\varepsilon)$ such that $A(b,\varepsilon) = \Theta(M)$ uniformly over $b$ and $\varepsilon$, we can write

$$E\left(\frac{M \times 1(\max_j X_j > b)}{\sum_{j=1}^M 1\big(X_j > b - \frac1b\big)} \,\middle|\, \sup_{t\in T} f(t) > b\right) \tag{7.7}$$
$$\le E\left(\frac{M \times 1\Big(\sum_{j=1}^M 1(X_j > b - \frac1b) \ge \frac{M}{A(b,\varepsilon)}\Big)}{\sum_{j=1}^M 1\big(X_j > b - \frac1b\big)} \,\middle|\, \sup_{t\in T} f(t) > b\right) + E\left(\frac{M \times 1\Big(1 \le \sum_{j=1}^M 1(X_j > b - \frac1b) < \frac{M}{A(b,\varepsilon)}\Big)}{\sum_{j=1}^M 1\big(X_j > b - \frac1b\big)} \,\middle|\, \sup_{t\in T} f(t) > b\right).$$

We shall select $A(b,\varepsilon)$ appropriately in order to bound the expectations above. By Lemma 7.8, for any $4\varepsilon \le \delta \le \delta_0$, there exist constants $c'$ and $c'' \in (0,\infty)$ such that

$$c^*\exp\Big(-\frac{\delta^*}{\delta^2}\Big) \ge P\Big(\min_{|t-t^*|\le\delta/b} f(t) < b - 1/b \,\Big|\, \sup_{t\in T} f(t) > b\Big) \ge P\left(\sum_{j=1}^M 1(X_j > b - 1/b) \le c'\delta^d\varepsilon^{-d} \,\middle|\, \sup_{t\in T} f(t) > b\right)$$
$$\ge P\left(\frac{M}{\sum_{j=1}^M 1(X_j > b - 1/b)} \ge \frac{b^d c''}{\delta^d} \,\middle|\, \sup_{t\in T} f(t) > b\right). \tag{7.8}$$


The first inequality is an application of Lemma 7.8. The second inequality is due to the fact that, for any ball $B$ of radius $\delta/b$ with $\delta \ge 4\varepsilon$, $\#(\mathcal{T} \cap B) \ge c'\delta^d\varepsilon^{-d}$ for some $c' > 0$. Inequality (7.8) implies that, for all $x$ such that $c''/\delta_0^d < x < c''/(4^d\varepsilon^d)$, there exists $\delta^{**} > 0$ such that
$$P\left(\frac{M}{b^d\sum_{j=1}^M 1\big(X_j > b - \frac1b\big)} \ge x \,\middle|\, \sup_{t\in T} f(t) > b\right) \le c^*\exp\big(-\delta^{**}x^{2/d}\big).$$

Now let $A(b,\varepsilon) = b^d c''/(4^d\varepsilon^d)$ and observe that, by the second result in (7.1), we have $A(b,\varepsilon) = \Theta(M)$ and, moreover, that there exists $c_3$ such that the first term on the right hand side of (7.7) is bounded by

$$E\left(\frac{M \times 1\Big(\sum_{j=1}^M 1(X_j > b - 1/b) \ge \frac{M}{A(b,\varepsilon)}\Big)}{\sum_{j=1}^M 1(X_j > b - 1/b)} \,\middle|\, \sup_T f(t) > b\right) \le c_3 b^d, \tag{7.9}$$
which follows by integrating the preceding tail bound.

Now we turn to the second term on the right hand side of (7.7). We use the fact that $M/A(b,\varepsilon) \le c''' \in (0,\infty)$ (uniformly as $b \to \infty$ and $\varepsilon \to 0$). There exist $c_4$ and $c_5$ such that, for $\varepsilon \le \delta_0/c_4$,

$$E\left(\frac{M\, 1\Big(1 \le \sum_{j=1}^M 1(X_j > b - \frac1b) < \frac{M}{A(b,\varepsilon)}\Big)}{\sum_{j=1}^M 1\big(X_j > b - \frac1b\big)} \,\middle|\, \sup_T f(t) > b\right) \tag{7.10}$$
$$\le M\, P\left(\sum_{j=1}^M 1\Big(X_j > b - \frac1b\Big) < c''' \,\middle|\, \sup_T f(t) > b\right) \le M\, P\Big(\min_{|t-t^*|\le c_4\varepsilon/b} f(t) < b - 1/b \,\Big|\, \sup_T f(t) > b\Big)$$
$$\le c_1 m(T)\, b^d \varepsilon^{-d} \exp\Big(-\frac{\delta^*}{c_4^2\varepsilon^2}\Big) \le c_5 b^d.$$

The second inequality follows from the fact that, if $\sum_{j=1}^M 1\big(X_j > b - \frac1b\big)$ is less than $c'''$, then the minimum of $f$ in a ball around the location of the global maximum and of radius $c_4\varepsilon/b$ must be less than $b - 1/b$; otherwise, there would be more than $c'''$ elements of $\mathcal{T}$ inside such a ball. The last inequality is due to Lemma 7.8 and Proposition 7.2.

Putting (7.9) and (7.10) together, we obtain, for all $\varepsilon/b$-regular discretizations with $\varepsilon < \varepsilon_0 = \min(1/4,\, 1/c_4)\,\delta_0$,
$$\frac{E_Q L_b^2}{P^2\big(\sup_{t\in T} f(t) > b\big)} \le E\left(\frac{M\, 1(\max_j X_j > b)}{\sum_{j=1}^M 1\big(X_j > b - \frac1b\big)} \,\middle|\, \sup_{t\in T} f(t) > b\right) \frac{P(X_1 > b - 1/b)}{P\big(\sup_{t\in T} f(t) > b\big)} \le (c_3 + c_5)\, b^d\, \frac{P(X_1 > b - 1/b)}{P\big(\sup_{t\in T} f(t) > b\big)}.$$
Applying now (7.4), which shows that $b^d P(X_1 > b - 1/b)/P(\sup_{t\in T} f(t) > b)$ remains bounded, and Proposition 7.2, which gives
$$P\Big(\sup_{t\in T} f(t) > b\Big) < \frac{P\big(\sup_{t\in\mathcal{T}} f(t) > b\big)}{1 - c_0\sqrt{\varepsilon}},$$
we have
$$\sup_{b > b_0,\ \varepsilon\in(0,\varepsilon_0]} \frac{E_Q L_b^2}{P^2\big(\sup_{t\in T} f(t) > b\big)} < \infty,$$
as required. $\Box$
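To make the object under analysis concrete, here is a minimal sketch (our code, not the authors'; the covariance $C(t) = e^{-t^2}$, the level, and the mesh are illustrative assumptions) of Algorithm 7.3: it samples $X = (f(t_1),\ldots,f(t_M))$ under the mixture measure $Q$ of (7.2), by choosing an index $J$ uniformly, forcing $X_J$ above $b - 1/b$, and drawing the remaining coordinates from the Gaussian conditional law; it then averages the estimator $L_b$ of (7.3).

    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(2)

    b, eps = 3.0, 0.25
    t = np.arange(0.0, 1.0 + 1e-12, eps / b)        # eps/b-regular grid on [0,1]
    M = len(t)
    Sigma = np.exp(-np.subtract.outer(t, t) ** 2)   # C(s-t) = exp(-(s-t)^2), unit variance
    c = b - 1.0 / b                                 # shifted threshold b - 1/b
    p_tail = norm.sf(c)                             # P(X_1 > b - 1/b)

    def sample_under_Q():
        J = rng.integers(M)                         # uniform mixture component
        xJ = norm.isf(rng.uniform() * p_tail)       # X_J | X_J > b - 1/b (inverse-CDF)
        # Remaining coordinates given X_J = xJ: Gaussian conditional law.
        mean = Sigma[:, J] * xJ
        cov = Sigma - np.outer(Sigma[:, J], Sigma[J, :])
        X = mean + np.linalg.cholesky(cov + 1e-10 * np.eye(M)) @ rng.standard_normal(M)
        X[J] = xJ
        return X

    n = 5_000
    est = np.empty(n)
    for i in range(n):
        X = sample_under_Q()
        K = (X > c).sum()                           # never 0 under Q: X_J > c by construction
        est[i] = (M * p_tail / K) * (X.max() > b)   # the estimator L_b of (7.3)

    print(f"w_M(b) estimate: {est.mean():.3e}  "
          f"(rel. std. err. {est.std() / est.mean() / np.sqrt(n):.3f})")

The denominator $\sum_j 1(X_j > b - 1/b)$ is at least one by construction, which is exactly the role of the $1/b$ shift discussed in Section 6.1, and the bounded relative standard error reflects Theorem 7.5.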

7.3. A final proof: Lemma 7.7. Without loss of generality, we assume that the field has mean zero and unit variance. Before getting into the details of the proof of Lemma 7.7, we need a few additional lemmas, for which we adopt the following notation. Let $C_i$ and $C_{ij}$ be the first and second order derivatives of $C$, and define the vectors
$$\mu_1(t) = (-C_1(t), \ldots, -C_d(t)), \qquad \mu_2(t) = \mathrm{vech}\big((C_{ij}(t) : i = 1,\ldots,d;\ j = i,\ldots,d)\big).$$
Let $f'(0)$ and $f''(0)$ be the gradient and vector of second order derivatives of $f$ at $0$, where $f''(0)$ is arranged in the same order as $\mu_2(0)$. Furthermore, let $\mu_{02} = \mu_{20}^\top$ be a vector of second order spectral moments and $\mu_{22}$ a matrix of fourth order spectral moments. The vectors $\mu_{02}$ and $\mu_{22}$ are arranged so that
$$\begin{pmatrix} 1 & 0 & \mu_{02} \\ 0 & \Lambda & 0 \\ \mu_{20} & 0 & \mu_{22} \end{pmatrix}$$
is the covariance matrix of $(f(0), f'(0), f''(0))$, where $\Lambda = (-C_{ij}(0))$. It then follows that
$$\mu_{2\cdot 0} = \mu_{22} - \mu_{20}\,\mu_{02}$$
is the conditional variance of $f''(0)$ given $f(0)$. The following lemma, given in [5], provides a stochastic representation of $f$ given that it has a local maximum at level $u$ at the origin.

Lemma 7.11. Given that $f$ has a local maximum with height $u$ at zero (an interior point of $T$), the conditional field is equal in distribution to
$$f_u(t) \overset{d}{=} u\,C(t) - W_u\beta^\top(t) + g(t), \tag{7.11}$$
where $g(t)$ is a centered Gaussian random field with covariance function
$$\gamma(s,t) = C(s-t) - (C(s),\, \mu_2(s)) \begin{pmatrix} 1 & \mu_{02} \\ \mu_{20} & \mu_{22} \end{pmatrix}^{-1} \begin{pmatrix} C(t) \\ \mu_2^\top(t) \end{pmatrix} - \mu_1(s)\,\Lambda^{-1}\mu_1^\top(t),$$
and $W_u$ is a $d(d+1)/2$-dimensional random vector, independent of $g(t)$, with density function
$$\psi_u(w) \propto |\det(r^*(w) - u\Lambda)|\, \exp\Big(-\tfrac12 w^\top\mu_{2\cdot 0}^{-1}w\Big)\, 1\big(r^*(w) - u\Lambda \in \mathcal{N}\big), \tag{7.12}$$
where $r^*(w)$ is a $d\times d$ symmetric matrix whose upper triangular elements consist of the components of $w$. The set of negative definite matrices is denoted by $\mathcal{N}$. Finally, $\beta(t)$ is defined by
$$(\alpha(t),\, \beta(t)) = (C(t),\, \mu_2(t)) \begin{pmatrix} 1 & \mu_{02} \\ \mu_{20} & \mu_{22} \end{pmatrix}^{-1}.$$
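Lemma 7.11 is constructive, and sampling from the representation (7.11) is straightforward. The sketch below (ours; it takes $d = 1$ and $C(t) = e^{-t^2}$, for which $\Lambda = 2$, $\mu_{02} = -2$, $\mu_{22} = 12$ and $\mu_{2\cdot 0} = 8$ can be computed by hand) draws $W_u$ from the density (7.12) by inverse-CDF sampling on a grid and assembles $f_u(t) = uC(t) - W_u\beta(t) + g(t)$:

    import numpy as np

    rng = np.random.default_rng(3)

    # d = 1, C(t) = exp(-t^2):  C'(t) = -2t e^{-t^2},  C''(t) = (4t^2 - 2) e^{-t^2},
    # Lambda = -C''(0) = 2, mu_02 = C''(0) = -2, mu_22 = C''''(0) = 12, mu_{2.0} = 8.
    C   = lambda t: np.exp(-t ** 2)
    C1  = lambda t: -2 * t * np.exp(-t ** 2)
    C2  = lambda t: (4 * t ** 2 - 2) * np.exp(-t ** 2)
    M2inv = np.linalg.inv(np.array([[1.0, -2.0], [-2.0, 12.0]]))

    def beta(t):                      # second component of (C(t), mu_2(t)) M2^{-1}
        return (np.stack([C(t), C2(t)], axis=-1) @ M2inv)[..., 1]

    def sample_Wu(u, rng):
        # psi_u(w) ~ |w - 2u| exp(-w^2/16) 1(w < 2u); inverse-CDF on a grid.
        w = np.linspace(-40.0, 2 * u, 20_000)
        dens = np.abs(w - 2 * u) * np.exp(-w ** 2 / 16.0)
        cdf = np.cumsum(dens); cdf /= cdf[-1]
        return np.interp(rng.uniform(), cdf, w)

    u = 4.0
    t = np.linspace(-1.0, 1.0, 201)

    # Covariance of the residual field g from Lemma 7.11.
    A = np.stack([C(t), C2(t)])                      # 2 x n
    gamma = (C(np.subtract.outer(t, t)) - A.T @ M2inv @ A
             - np.outer(C1(t), C1(t)) / 2.0)
    Lg = np.linalg.cholesky(gamma + 1e-9 * np.eye(len(t)))

    f_u = u * C(t) - sample_Wu(u, rng) * beta(t) + Lg @ rng.standard_normal(len(t))
    print("peak value:", f_u[100], "(equals u up to numerics)")

By construction $\beta(0) = 0$ and $\gamma(0,0) = 0$, so the sampled path equals $u$ at the origin, as the conditioning requires.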

The following two technical lemmas, which we shall prove after completing the proof of Lemma 7.7, provide bounds for the last two terms of (7.11).

Lemma 7.12. Using the notation in (7.11), there exist $\delta_0$, $\varepsilon_1$, $c_1$ and $b_0$ such that, for any $u > b > b_0$ and $\delta \in (0, \delta_0)$,
$$P\Big(\sup_{|t|\le\delta a/b} \big|W_u\beta^\top(t)\big| > \frac{a}{4b}\Big) \le c_1\exp\Big(-\frac{\varepsilon_1 b^2}{\delta^4}\Big).$$

Lemma 7.13. There exist $\tilde c$, $\tilde\delta$ and $\delta_0$ such that, for any $\delta \in (0, \delta_0)$,
$$P\Big(\max_{|t|\le\delta a/b} |g(t)| > \frac{a}{4b}\Big) \le \tilde c\exp\Big(-\frac{\tilde\delta}{\delta^2}\Big).$$

Proof of Lemma 7.7. Using the notation of Lemma 7.11, given any $s \in L$ for which $f(s) = b$, the corresponding conditional distribution of $f$ is that of $f_b(\cdot - s)$. Consequently, it suffices to show that
$$P\Big(\min_{|t|\le\delta a/b} f_b(t) < b - a/b\Big) \le c^*\exp\Big(-\frac{\delta^*}{\delta^2}\Big).$$
We first consider the case in which the local maximum is in the interior of $T$. Then, by the Slepian model (7.11),
$$f_b(t) = b\,C(t) - W_b\beta^\top(t) + g(t).$$
We study the three terms of the Slepian model individually. Since
$$C(t) = 1 - t^\top\Lambda t + o\big(|t|^2\big),$$
there exists an $\varepsilon_0$ such that
$$b\,C(t) \ge b - \frac{a}{4b}$$

for all $|t| < \varepsilon_0\sqrt{a}/b$. According to Lemmas 7.12 and 7.13, for $\delta < \min(\varepsilon_0/\sqrt{a_0},\, \delta_0)$,
$$P\Big(\min_{|t|\le\delta a/b} f_b(t) < b - a/b\Big) \le P\Big(\max_{|t|<\delta a/b} |g(t)| > \frac{a}{4b}\Big) + P\Big(\sup_{|t|\le\delta a/b} \big|W_b\beta^\top(t)\big| > \frac{a}{4b}\Big)$$
$$\le \tilde c\exp\Big(-\frac{\tilde\delta}{\delta^2}\Big) + c_1\exp\Big(-\frac{\varepsilon_1 b^2}{\delta^4}\Big) \le c^*\exp\Big(-\frac{\delta^*}{\delta^2}\Big)$$
for some $c^*$ and $\delta^*$.

Now consider the case in which the local maximum lies on the $(d-1)$-dimensional boundary of $T$. Due to the convexity of $T$, and without loss of generality, we assume that the tangent space of $\partial T$ is generated by $\partial/\partial t_2, \ldots, \partial/\partial t_d$, that the local maximum is located at the origin, and that $T$ is a subset of the half-space $\{t_1 \ge 0\}$. That these assumptions do not involve a loss of generality follows from the arguments on pages 291--292 of [4], which rely on the assumed stationarity of $f$ (for translations) and the fact that rotations, while changing the distributions, will not affect the probabilities that we are currently computing.

For the origin, positioned as just described, to be a local maximum, it is necessary and sufficient that the gradient of $f$ restricted to $\partial T$ is the zero vector, that the Hessian matrix restricted to $\partial T$ is negative definite, and that $\partial_1 f(0) \le 0$. Applying a version of Lemma 7.11 for this case, conditional on $f(0) = u$ and $0$ being a local maximum, the field is equal in distribution to
$$u\,C(t) - \widetilde W_u\tilde\beta^\top(t) + \mu_1(t)\,\Lambda^{-1}(Z, 0, \ldots, 0)^\top + g(t), \tag{7.13}$$
where $Z \le 0$ corresponds to $\partial_1 f(0)$ and follows a truncated (conditioned on the negative axis) Gaussian distribution, with mean zero and a variance parameter computed as the conditional variance of $\partial_1 f(0)$ given $(\partial_2 f(0), \ldots, \partial_d f(0))$. The vector $\widetilde W_u$ is a $((d-1)d/2)$-dimensional random vector with density function
$$\tilde\psi_u(w) \propto |\det(\tilde r^*(w) - u\widetilde\Lambda)|\,\exp\Big(-\tfrac12 w^\top\tilde\mu_{2\cdot 0}^{-1}w\Big)\,1\big(\tilde r^*(w) - u\widetilde\Lambda \in \mathcal{N}\big),$$
where $\widetilde\Lambda$ is the second spectral moment matrix of $f$ restricted to $\partial T$ and $\tilde r^*(w)$ is the $(d-1)\times(d-1)$ symmetric matrix whose upper triangular elements consist of the components of $w$. In the representation (7.13), the vector $\widetilde W_u$ and $Z$ are independent. As in the proof of Lemma 7.12, one can show that a similar bound holds for the second term of (7.13), albeit with different constants. Thus, since $\mu_1(t) = O(|t|)$, there exist $c''$ and $\delta''$ such that the third term in (7.13) can be bounded via
$$P\Big(\max_{|t|\le\delta a/b} \big|\mu_1(t)\Lambda^{-1}(Z, 0, \ldots, 0)^\top\big| \ge \frac{a}{4b}\Big) \le c''\exp\Big(-\frac{\delta''}{\delta^2}\Big).$$


Consequently, we can also find $c^*$ and $\delta^*$ such that the conclusion of the lemma holds, and we are done. $\Box$

We complete the paper with the proofs of Lemmas 7.12 and 7.13.

Proof of Lemma 7.12. It suffices to prove the lemma for the case $a = 1$. Since
$$f_u(0) = u = u - W_u\beta^\top(0) + g(0),$$
and $W_u$ and $g$ are independent, $\beta(0) = 0$. Furthermore, since $C'(t) = O(|t|)$ and $\mu_2'(t) = O(|t|)$, there exists a $c_0$ such that $|\beta(t)| \le c_0|t|^2$. In addition, $W_u$ has density function
$$\psi_u(w) \propto |\det(r^*(w) - u\Lambda)|\,\exp\Big(-\tfrac12 w^\top\mu_{2\cdot 0}^{-1}w\Big)\,1\big(r^*(w) - u\Lambda \in \mathcal{N}\big).$$
Note that $\det(r^*(w) - u\Lambda)$ is expressible as a polynomial in $w$ and $u$, and there exist some $\varepsilon_0$ and $c$ such that
$$\bigg|\frac{\det(r^*(w) - u\Lambda)}{\det(-u\Lambda)}\bigg| \le c$$
if $|w| \le \varepsilon_0 u$. Hence there exist $\varepsilon_2, c_2 > 0$ such that
$$\psi_u(w) \le \bar\psi(w) = c_2\exp\Big(-\tfrac12\varepsilon_2\, w^\top\mu_{2\cdot 0}^{-1}w\Big)$$

for all $u \ge 1$. The right hand side here is proportional to a multivariate Gaussian density. Thus,
$$P(|W_u| > x) = \int_{|w|>x} \psi_u(w)\,dw \le \int_{|w|>x} \bar\psi(w)\,dw = c_3\, P(|\bar W| > x),$$
where $\bar W$ is a multivariate Gaussian random vector with density function proportional to $\bar\psi$. Therefore, by choosing $\varepsilon_1$ and $c_1$ appropriately, we have
$$P\Big(\sup_{|t|\le\delta/b} \big|W_u\beta^\top(t)\big| > \frac{1}{4b}\Big) \le P\Big(|W_u| > \frac{b}{c_0^2\delta^2}\Big) \le c_1\exp\Big(-\frac{\varepsilon_1 b^2}{\delta^4}\Big)$$
for all $u \ge b$. $\Box$

Proof of Lemma 7.13. Once again, it suffices to prove the lemma for the case $a = 1$. Since
$$f_b(0) = b = b - W_b\beta^\top(0) + g(0),$$
the covariance function $(\gamma(s,t),\ s,t \in T)$ of the centered field $g$ satisfies $\gamma(0,0) = 0$. It is also easy to check that
$$\partial_s\gamma(s,t) = O(|s| + |t|), \qquad \partial_t\gamma(s,t) = O(|s| + |t|).$$
Consequently, there exists a constant $c_\gamma \in (0,\infty)$ for which
$$\gamma(s,t) \le c_\gamma\big(|s|^2 + |t|^2\big), \qquad \gamma(s,s) \le c_\gamma|s|^2.$$
We need to control the tail probability of $\sup_{|t|\le\delta/b}|g(t)|$, and for this it is useful to introduce the following scaling. Define
$$g_\delta(t) = \frac{b}{\delta}\, g\Big(\frac{\delta t}{b}\Big).$$
Then $\sup_{|t|\le\delta/b} g(t) \ge 1/(4b)$ if and only if $\sup_{|t|\le 1} g_\delta(t) \ge 1/(4\delta)$. Let
$$\sigma_\delta(s,t) = E\big(g_\delta(s)\, g_\delta(t)\big).$$
Then
$$\sup_{|s|\le 1} \sigma_\delta(s,s) \le c_\gamma.$$

Because $\gamma(s,t)$ is at least twice differentiable, applying a Taylor expansion we easily see that the canonical metric $d_g$ corresponding to $g_\delta$ (cf. Theorem 6.7) can be bounded as follows:
$$d_g^2(s,t) = E\big(g_\delta(s) - g_\delta(t)\big)^2 = \frac{b^2}{\delta^2}\bigg[\gamma\Big(\frac{\delta s}{b}, \frac{\delta s}{b}\Big) + \gamma\Big(\frac{\delta t}{b}, \frac{\delta t}{b}\Big) - 2\gamma\Big(\frac{\delta s}{b}, \frac{\delta t}{b}\Big)\bigg] \le c\,|s-t|^2$$
for some constant $c \in (0,\infty)$. Therefore the entropy of $g_\delta$ evaluated at $\varepsilon$ is bounded by $K\varepsilon^{-d}$, for any $\varepsilon > 0$ and an appropriate choice of $K > 0$. Therefore, for all $\delta < \delta_0$,
$$P\Big(\sup_{|t|\le\delta/b} |g(t)| \ge \frac{1}{4b}\Big) = P\Big(\sup_{|t|\le 1} |g_\delta(t)| \ge \frac{1}{4\delta}\Big) \le c_d\,\delta^{-d-\eta}\exp\Big(-\frac{1}{16 c_\gamma\delta^2}\Big)$$
for some constants $c_d$ and $\eta > 0$. The last inequality is a direct application of Theorem 4.1.1 of [4]. The conclusion of the lemma follows immediately by choosing $\tilde c$ and $\tilde\delta$ appropriately. $\Box$

REFERENCES

[1] R.J. Adler, K. Bartz, and S. Kou. Estimating curvatures in the Gaussian kinematic formula. In preparation.
[2] R.J. Adler, J. Blanchet, and J. Liu. Efficient simulation for tail probabilities of Gaussian random fields. In WSC '08: Proceedings of the 40th Conference on Winter Simulation, pages 328-336. Winter Simulation Conference, 2008.
[3] R.J. Adler, P. Müller, and B. Rozovskii, editors. Stochastic Modelling in Physical Oceanography. Birkhäuser, Boston, 1996.
[4] R.J. Adler and J.E. Taylor. Random Fields and Geometry. Springer, New York, USA, 2007.
[5] R.J. Adler, J.E. Taylor, and K. Worsley. Applications of Random Fields and Geometry: Foundations and Case Studies. http://webee.technion.ac.il/people/adler/hrf.pdf.
[6] S. Asmussen and P. Glynn. Stochastic Simulation: Algorithms and Analysis. Springer-Verlag, New York, NY, USA, 2008.
[7] J.-M. Azaïs and M. Wschebor. Level Sets and Extrema of Random Processes and Fields. Wiley, New York, NY, 2009.
[8] J.M. Bardeen, J.R. Bond, N. Kaiser, and A.S. Szalay. The statistics of peaks of Gaussian random fields. The Astrophysical Journal, 304:15-61, 1986.
[9] S.M. Berman. Sojourns and Extremes of Stochastic Processes. The Wadsworth & Brooks/Cole Statistics/Probability Series. Wadsworth & Brooks/Cole Advanced Books & Software, Pacific Grove, CA, 1992.
[10] J. Blanchet. Efficient importance sampling for binary contingency tables. Ann. of Appl. Prob., 19:949-982, 2009.
[11] C. Borell. The Brunn-Minkowski inequality in Gauss space. Invent. Math., 30:205-216, 1975.
[12] J. Bucklew. Introduction to Rare-Event Simulation. Springer-Verlag, New York, NY, 2004.
[13] M.R. Dennis. Nodal densities of planar Gaussian random waves. Eur. Phys. J., 145:191-210, 2007.
[14] R.M. Dudley. Sample functions of the Gaussian process. Ann. Probab., 1:66-103, 1973.
[15] K.J. Friston, K.J. Worsley, R.S.J. Frackowiak, J.C. Mazziotta, and A.C. Evans. Assessing the significance of focal activations using their spatial extent. Human Brain Mapping, 1:214-220, 1994.
[16] M.R. Leadbetter, G. Lindgren, and H. Rootzén. Extremes and Related Properties of Random Sequences and Processes. Springer-Verlag, New York, 1983.
[17] M. Mitzenmacher and E. Upfal. Probability and Computing: Randomized Algorithms and Probabilistic Analysis. Cambridge University Press, 2005.
[18] W. Niemiro and P. Pokarowski. Fixed precision MCMC estimation by median of products of averages. J. Appl. Probab., 46:309-329, 2009.
[19] V. Piterbarg. Asymptotic Methods in the Theory of Gaussian Processes. American Mathematical Society, USA, 1995.
[20] S.F. Shandarin. Testing non-Gaussianity in cosmic microwave background maps by morphological statistics. Mon. Not. R. Astr. Soc., 331:865-874, 2002.
[21] S.F. Shandarin, H.A. Feldman, Y. Xu, and M. Tegmark. Morphological measures of non-Gaussianity in cosmic microwave background maps. Astrophys. J. Suppl., 141:1-11, 2002.
[22] L. Shepp and H. Landau. On the supremum of a Gaussian process. Sankhyā A, 32:369-378, December 1970.
[23] J.E. Taylor, A. Takemura, and R.J. Adler. Validity of the expected Euler characteristic heuristic. Ann. Probab., 33(4):1362-1396, 2005.
[24] J.E. Taylor and K.J. Worsley. Detecting sparse signals in random fields, with an application to brain mapping. J. Amer. Statist. Assoc., 102(479):913-928, 2007.
[25] J. Traub, G. Wasilkowski, and H. Wozniakowski. Information-Based Complexity. Academic Press, New York, NY, 1988.
[26] B.S. Tsirelson, I.A. Ibragimov, and V.N. Sudakov. Norms of Gaussian sample functions. In Proceedings of the Third Japan-USSR Symposium on Probability Theory (Tashkent, 1975), 550:20-41, 1976.
[27] V.S. Tsirel'son. The density of the maximum of a Gaussian process. Theory of Probab. Appl., 20:847-856, 1975.
[28] H. Wozniakowski. Computational complexity of continuous problems. Technical report, 1996.

Robert J. Adler
Electrical Engineering
Technion, Haifa, Israel 32000
E-mail: robert@ee.technion.ac.il
URL: webee.technion.ac.il/people/adler

Jose H. Blanchet
Industrial Engineering & Operations Research
Columbia University
340 S.W. Mudd Building, 500 W. 120 Street
New York, New York 10027
E-mail: jose.blanchet@columbia.edu
URL: www.ieor.columbia.edu/fac-bios/blanchet/faculty.html

Jingchen Liu
Department of Statistics
Columbia University
1255 Amsterdam Avenue, Room 1030
New York, NY 10027
E-mail: jcliu@stat.columbia.edu
URL: stat.columbia.edu/~jcliu

Page 21: EFFICIENT MONTE CARLO FOR HIGH EXCURSIONS OF … › ~jblanche › papers › GRF1.pdf · 1. Introduction. This paper centers on the design and analysis of e–cient Monte Carlo techniques

MONTE CARLO FOR RANDOM FIELDS 21

Proposition 610 For any v gt 0 there exist κ and λ2 such that for all t isin T yminusβd le a2+v

κb(2+v) a sufficiently small and z gt 0

P(m(Aγab

)minus1 gt y m(Ab) gt 0∣∣ f (t) = γab + zγab

) le exp(minusλ2a

2yβdb2)

Proof For notational simplicity and without loss of generality we assume thatt = 0 First consider the case that z ge 1 Then there exist c1 c2 such that for allc2y

minusβd lt bminus2minusv and z gt 1

P(m

(Aγab

cap T)minus1

gt ym(Ab) gt 0| f (0) = γab + zγab

)

le P

(inf

|t|ltc1yminus1df(t) le γab|f (0) = γab + zγab

)

= P

(inf

|t|ltc1yminus1dmicro (t) + g(t) le γab

)

le P

(inf

|t|ltc1yminus1dg(t) le minus 1

2γab

)

Now apply (iii) from Lemma 69 from which it follows that N (ε) le c3m(T )ε2dβ

for some constant c3 By Theorem 67 E(sup|t|ltc1yminus1d f(t)) = O(yminusβ(2d) log y)By Theorem 68 for some constant c4

P(m

(Aγab

cap T)minus1

gt y m(Ab) gt 0| f (0) = γab + zγab

)

le P

(inf

|t|ltc1yminus1dg(t) le minus 1

2γab

)

le exp

(minus 1

c4γ2aby

minusβd

)

for c2yminusβd lt bminus2minusv and z gt 1

Now consider the case z isin (0 1) Let tlowast be the global maximum of f(t) Then

P(m

(Aγab

cap T)minus1

gt y m(Ab) gt 0|f (0) = γab + zγab

)

le P

(inf

|tminustlowast|ltc1yminus1df(t) lt γab f(tlowast) gt b|f (0) = γab + zγab

)

le P

(sup

|sminust|ltc1yminus1d

|f(s)minus f(t)| gt ab|f (0) = γab + zγab

)

Consider the new field ξ(s t) = g(s)minus g(t) with parameter space T times T Note thatradic

V ar(ξ(s t)) = Dg(s t) le λ1|sminus t|β2

Via basic algebra it is not hard to show that the entropy of ξ(s t) is bounded byNξ (ε) le c3m(T times T )ε2dβ In addition from (i) of Lemma 69 we have

|micro (s)minus micro (t) | le κHγab|sminus t|β

22 ADLER BLANCHET LIU

Similarly for some κ gt 0 and all yminusβd le 1κ

(ab

)2+v a lt 1 there exists c5 such that

P(m

(Aγab

cap T)minus1

gt y m(Ab) gt 0|f (0) = γab + zγab

)

le P

(sup

|sminust|ltc1yminus1d

|f(s)minus f(t)| gt ab|f (0) = γab + zγab

)

= P

(sup

|sminust|ltc1yminus1d

|ξ(s t)| gt a

2b

)

le exp(minus a2

c5b2yminusβd

)

Combining the two cases z gt 1 and z isin (0 1) and choosing c5 large enough we have

P(m

(Aγab

cap T)minus1

gt y m(Ab) gt 0∣∣ f (0) = γab + zγab

)

le exp(minus a2

c5b2yminusβd

)

for a small enough and yminusβd le 1κ

(ab

)2+v Renaming the constants completes theproof 2

The final ingredient needed for the proof of Proposition 64 is the following lemmainvolving stochastic domination The proof follows an elementary argument and istherefore omitted

Lemma 611 Let v1 and v2 be finite measures on R and define ηj(x) =intinfin

xvj(ds)

Suppose that η1 (x) ge η2 (x) for each x ge x0 Let (h (x) x ge x0) be a non-decreasing positive and bounded function Then

int infin

x0

h (s) v1(dx) geint infin

x0

h (s) v2 (ds)

Proof of Proposition 64 Note that

EQprime(exp

(minusMm(Aγab

)m (T )

)

m(Aγab

) m(Aγab

) isin (0 α) m(Ab) gt 0)

=int

T

int infin

z=0

E(exp

(minusMm(Aγab

)m (T )

)

m (Aγab)

m(Aγab

) isin (0 α) m(Ab) gt 0∣∣∣

f (t) = γab +z

γab

)P

(τγab

isin dt)P (γab[f (t)minus γab] isin dz| f (t) gt γab)

le supzgt0tisinT

E(exp

(minusMm(Aγab

)m (T )

)

m (Aγab)

m(Aγab

) isin (0 α) m(Ab) gt 0∣∣∣

f (t) = γab +z

γab

)

MONTE CARLO FOR RANDOM FIELDS 23

Now define Y (ba) = (ba)2dβλminusdβ2 W with λ2 as chosen in Proposition 610

and W with distribution given by P (W gt x) = exp(minusxβd) By Lemma 611

suptisinT

E(exp

(minusMm(Aγab

)m (T )

)

m (Aγab)

m(Aγab

) isin (0 α) m(Ab) gt 0∣∣∣

f (t) = γab +z

γab

)

le E[Y (ba) exp(minusMY (ba)minus1m(T ))Y (ba) gt αminus1

]

Now let Z be exponentially distributed with mean 1 and independent of Y (ba)Then we have (using the definition of the tail distribution of Z and Chebyshevrsquosinequality)

exp(minusMY (ba)minus1m (T )

)= P

(Z gt MY (ba)minus1m (T )

∣∣ Y (ba))

le m (T )Y (ba)M

Therefore

E[exp(minusMY (ba)minus1m (T )

)Y (ba) Y (ba) ge αminus1]

le m (T )M

E[Y (ba)2 Y (ba) ge αminus1]

le m (T )

Mλ2dβ2

(b

a

)4dβ

E(W 2)

which completes the proof 2

65 Proof of Proposition 65 We start with the following result of Tsirelson[27]

Theorem 612 Let f be a continuous separable Gaussian process on a compact(in the canonical metric) domain T Suppose that V ar (f) = σ is continuous andthat σ (t) gt 0 for t isin T Moreover assume that micro = Ef is also continuous andmicro(t) ge 0 for all t isin T Define

σ2T

∆= maxtisinT

Var(f(t))

and set F (x) = PmaxtisinT f(t) le x Then F is continuously differentiable on RFurthermore let y be such that F (y) gt 12 and define ylowast by

F (y) = Φ(ylowast)

Then for all x gt y

F prime(x) le Ψ(

xylowasty

)(xylowasty

(1 + 2α) + 1)

(1 + α)

24 ADLER BLANCHET LIU

where

α =y2

x(xminus y)y2lowast

We can now prove the following lemma

Lemma 613 There exists a constant A isin (0infin) independent of a and b ge 0such that

(615) P

(suptisinT

f(t) le b + ab | suptisinT

f(t) gt b

)le aA

P (suptisinT f (t) ge bminus 1b)P (suptisinT f (t) ge b)

Proof By subtracting inftisinT micro (t) gt minusinfin and redefining the level b to be b minusinftisinT micro (t) we may simply assume that Ef (t) ge 0 so that we can apply Theorem612 Adopting the notation of Theorem 612 first we pick b0 large enough so thatF (b0) gt 12 and assume that b ge b0 + 1 Now let y = bminus 1b and F (y) = Φ(ylowast)Note that there exists δ0 isin (0infin) such that δ0b le ylowast le δminus1

0 b for all b ge b0 Thisfollows easily from the fact that

logP

suptisinT

f(t) gt x

sim log sup

tisinTP f(t) gt x sim minus x2

2σ2T

On the other hand by Theorem 612 F is continuously differentiable and so

(616) P

suptisinT

f(t) lt b + ab | suptisinT

f(t) gt b

=

int b+ab

bF prime(x) dx

P suptisinT f(t) gt b

Moreover

F prime(x) le(

1minus Φ(

xylowasty

)) (xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x))

le (1minus Φ(ylowast))(

xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x))

= P

(maxtisinT

f (t) gt bminus 1b

)(xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x))

Thereforeint b+ab

b

F prime(x) dx

le P

(suptisinT

f (t) gt bminus 1b

) int b+ab

b

(xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x)) dx

Recall that α (x) = y2[x(xminus y)y2lowast] we can use the fact that ylowast ge δ0b to conclude

that if x isin [b b + ab] then α (x) le δminus20 and therefore

int b+ab

b

(xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x)) dx le 4δminus80 a

MONTE CARLO FOR RANDOM FIELDS 25

We thus obtain that

(617)

int b+ab

bF prime(x) dx

P suptisinT f(t) gt b le 4aδminus80

P suptisinT f(t) gt bminus 1bP suptisinT f(t) gt b

for any b ge b0 This inequality together with the fact that F is continuously differ-entiable on (minusinfininfin) yields the proof of the lemma for b ge 0 2

The previous result translates a question that involves the conditional distri-bution of maxtisinT f (t) near b into a question involving the tail distribution ofmaxtisinT f (t) The next result then provides a bound on this tail distribution

Lemma 614 For each v gt 0 there exists a constant C(v) isin (0infin) (possiblydepending on v gt 0 but otherwise independent of b) so that such that

P

(maxtisinT

f (t) gt b

)le C(v)b2dβ+dv+1 max

tisinTP (f (t) gt b)

for all b ge 1

Proof The proof of this result follows along the same lines of Theorem 262 in[5] Consider an open cover of T = cupM

i=1Ti(θ) where Ti(θ) = s |s minus ti| lt θWe choose ti carefully such that N(θ) = O(θminusd) for θ arbitrarily small Writef (t) = g (t)+micro (t) where g (t) is a centered Gaussian random field and note usingA2 and A3 that

P

(max

tisinTi(θ)f (t) gt b

)le P

(max

tisinTi(θ)g (t) gt bminus micro (ti)minus κHθβ

)

Now we wish to apply the Borel-TIS inequality (Theorem 68) with U = Ti (θ) f0 =g d (s t) = E12([g (t)minus g (s)]2) which as a consequence of A2 and A3 is boundedabove by C0 |tminus s|β2 for some C0 isin (0infin) Thus applying Theorem 67 we havethat E maxtisinTi(θ) g (t) le C1θ

β2 log(1θ) for some C1 isin (0infin) Consequently theBorel-TIS inequality yields that there exists C2 (v) isin (0infin) such that for all bsufficiently large and θ sufficiently small we have

P

(max

tisinTi(θ)g (t) gt bminus micro (ti)minus κHθβ

)le C2 (v) exp

(minus

(bminus micro (ti)minus C1θ

β(2+βv))2

2σ2Ti

)

where σTi = maxtisinTi(θ) σ (t) Now select v gt 0 small enough and set θβ(2+βv) =bminus1 Straightforward calculations yield that

P

(max

tisinTi(θ)f (t) gt b

)le P

(max

tisinTi(θ)g (t) gt bminus micro (ti)minus κHθβ

)

le C3 (v) maxtisinTi(θ)

exp

(minus (bminus micro (t))2

2σ (t)2

)

26 ADLER BLANCHET LIU

for some C3(v) isin (0infin) Now recall the well known inequality (valid for x gt 0)that

φ (x)(

1xminus 1

x3

)le 1minus Φ(x) le φ (x)

x

where φ = Φprime is the standard Gaussian density Using this inequality it follows thatC4 (v) isin (0infin) can be chosen so that

maxtisinTi(θ)

exp

(minus (bminus micro (t))2

2σ (t)2

)le C4 (v) bmax

tisinTi

P (f (t) gt b)

for all b ge 1 We then conclude that there exists C (v) isin (0infin) such that

P

(maxtisinT

f (t) gt b

)le N (θ)C4 (v) bmax

tisinTP (f (t) gt b)

le CθminusdbmaxtisinT

P (f (t) gt b) = Cb2dβ+dv+1 maxtisinT

P (f (t) gt b)

giving the result 2

We can now complete the proof of Proposition 65

Proof of Proposition 65 The result is a straightforward corollary of the previoustwo lemmas By (615) in Lemma 613 and Lemma 614 there exists λ isin (0infin) forwhich

P

(maxtisinT

f(t) le b + ab | maxtisinT

f(t) gt b

)

le aAP (maxtisinT f (t) ge bminus 1b)

P (maxtisinT f (t) ge b)

le aCAb2dβ+dv+1 maxtisinT P (f (t) gt bminus 1b)P (maxtisinT f (t) ge b)

le aCAb2dβ+dv+1 maxtisinT P (f (t) gt bminus 1b)maxtisinT P (f (t) gt b)

le aλb2dβ+dv+1

The last two inequalities follow from the obvious bound

P

(maxtisinT

f (t) ge b

)ge max

tisinTP (f (t) gt b)

and standard properties of the Gaussian distribution This yields (612) from whichthe remainder of the proposition follows 2

MONTE CARLO FOR RANDOM FIELDS 27

7 Fine Tuning Twice Differentiable Homogeneous Fields In the pre-ceeding section we constructed a polynomial time algorithm based on a randomizeddiscretization scheme Our goal in this section is to illustrate how to take advantageof additional information to further improve the running time and the efficiency ofthe algorithm In order to illustrate our techniques we shall perform a more refinedanalysis in the setting of smooth and homogeneous fields and shall establish op-timality of the algorithm in a precise sense to described below Our assumptionsthroughout this section are B1ndashB2 of Section 3

Let C(sminus t) = Cov(f(s) f(t)) be the covariance function of f which we assumealso has mean zero Note that it is an immediate consequence of homogeneity anddifferentiability that partiC(0) = part3

ijkC(0) = 0We shall need the following definition

Definition 71 We call T = t1 tM sub T a θ-regular discretization of Tif and only if

mini 6=j

|ti minus tj | ge θ suptisinT

mini|ti minus t| le 2θ

Regularity ensures that points in the grid T are well separated Intuitively sincef is smooth having tight clusters of points translates to a waste of computingresources as a result of sampling highly correlated values of f Also note thatevery region containing a ball of radius 2θ has at least one representative in T Therefore T covers the domain T in an economical way One technical convenienceof θ-regularity is that for subsets A sube T that have positive Lebesgue measure (inparticular ellipsoids)

limMrarrinfin

(A cap T )M

=m(A)m(T )

where here and throughout the remainder of the section (A) denotes the cardi-nality of the set A

Let T = t1 tM be a θ-regular discretization of T and consider

X = (X1 XM )T ∆= (f (t1) f (tM ))T

We shall concentrate on estimating wM (b) = P (max1leileM Xi gt b) The next result(which we prove in Section 71) shows that if θ = εb then the relative bias is O (

radicε)

Proposition 72 Suppose f is a Gaussian random field satisfying conditionsB1 and B2 There exist c0 c1 b0 and ε0 such that for any finite εb-regulardiscretization T of T

(71) P (suptisinT

f(t) lt b| suptisinT

f(t) gt b) le c0

radicε and (T ) le c1m(T )εminusdbd

for all ε isin (0 ε0] and b gt b0

Note that the bound on the bias obtained for twice differentiable fields is muchsharper than that of the general Holder continuous fields given by (613) in Theorem

28 ADLER BLANCHET LIU

66 This is partly because the conditional distribution of the random field aroundlocal maxima is harder to describe in the Holder continuous than in the case oftwice differentiable fields In addition to the sharper description of the bias weshall also soon show in Theorem 74 that our choice of discretization is optimal in acetain sense Finally we point out that the bound of

radicε in the first term of (71) is

not optimal In fact there seems to be some room of improvement and we believethat a more careful analysis might yield a bound of the form c0ε

2We shall estimate wM (b) by using a slight variation of Algorithm 53 In partic-

ular since the Xirsquos are now identically distributed we redefine Q to be

(72) Q (X isin B) =Msum

i=1

1M

P [X isin B|Xi gt bminus 1b]

Our estimator then takes the form

(73) Lb =M times P (X1 gt bminus 1b)sumM

j=1 1 (Xj gt bminus 1b)1( max

1leileMXi gt b)

Clearly we have that EQ(Lb) = wM (b) (The reason for for subtracting the fcatorof 1b was explained in Section 61)

Algorithm 73 Given a number of replications n and an εb-regular dis-cretization T the algorithm is as follows

Step 1 Sample X(1) X(n) iid copies of X with distribution Q given by (72)Step 2 Compute and output

Ln =1n

nsum

i=1

L(i)b

where

L(i)b =

M times P (X1 gt bminus 1b)sumM

j=1 1(X(i)j gt bminus 1b)

1( max1leileM

X(i)j gt b)

Theorem 75 later guides the selection of n in order to achieve a prescribedrelative error In particular our analysis together with considerations from Section2 implies that choosing n = O

(εminus2δminus1

)suffices to achieve ε relative error with

probability at least 1minus δAlgorithm 73 improves on Algorithm 61 for Holder continuous fields in two

important ways The first aspect is that it is possible to obtain information on thesize of the relative bias of the estimator In Proposition 72 we saw that in orderto overcome bias due to discretization it suffices to take a discretization of sizeM = (T ) = Θ(bd) That this selection is also asymptotically optimal in the sensedescribed in the next result will be proven in Section 71

MONTE CARLO FOR RANDOM FIELDS 29

Theorem 74 Suppose f is a Gaussian random field satisfying conditions B1and B2 If θ isin (0 1) then as b rarrinfin

sup(T )lebθd

P (suptisinT

f(t) gt b| suptisinT

f(t) gt b) rarr 0

This result implies that the relative bias goes to 100 as b rarrinfin if one chooses adiscretization scheme of size O

(bθd

)with θ isin (0 1) Consequently d is the smallest

power of b that achieves any given bounded relative bias and so the suggestionabove of choosing M = O

(bd

)points for the discretization is in this sense optimal

The second aspect of improvement involves the variance In the case of Holdercontinuous fields the ratio of the second moment of the estimator and w (b)2 wasshown to be bounded by a quantity that is of order O(M2) In contrast in thecontext of smooth and homogeneous fields considered here the next result showsthat this ratio is bounded uniformly for b gt b0 and M = (T ) ge cbd That is thevariance remains strongly controlled

Theorem 75 Suppose f is a Gaussian random field satisfying conditions B1and B2 Then there exist constants c b0 and ε0 such that for any εb-regulardiscretization T of T we have

supbgtb0εisin[0ε0]

EQL2b

P 2 (suptisinT f (t) gt b)le sup

bgtb0εisin[0ε0]

EQL2b

P 2(sup

tisinTf (t) gt b

) le c

for some c isin (0infin)

The proof of this result is given in Section 72 The fact that the number ofreplications remains bounded in b is a consequence of the strong control on thevariance

Finally we note that the proof of Theorem 33 follows as a direct corollary of The-orem 75 together with Proposition 72 and our discussion in Section 2 Assumingthat placing each point in T takes no more than c units of computer time the totalcomplexity is according to the discussion in section Section 2 O

(nM3 + M

)=

O(εminus2δminus1M3 + M

) The contribution of the term M3 = O

(εminus6b3

)comes from

the complexity of applying Cholesky factorization and the term M = O(εminus2b)corresponds to the complexity of placing T

Remark 76 Condition B2 imposes a convexity assumption on the boundaryof T This assumption although convenient in the development of the proofs of The-orems 74 and 75 is not necessary The results can be generalized at the expenseof increasing the length and the burden in the technical development to the casein which T is a d-dimensional manifold satisfying the so-called Whitney conditions([4])

The remainder of this section is devoted to the proof of Proposition 72 Theorem74 and Theorem 75

30 ADLER BLANCHET LIU

71 Bias Control Proofs of Proposition 72 and Theorem 74 We start withsome useful lemmas for all of which we assume that Conditions B1 and B2 are sat-isfied We shall also assume that the global maximum of f over T is achieved withprobability one at a single point in T Additional conditions under which this willhappen can be found in [4] and require little more than the non-degeneracy of thejoint distribution of f and its first and second order derivatives Of these lemmasLemma 77 the proof of which we defer to Section 73 is of central importance

Lemma 77 For any a0 gt 0 there exists clowast δlowast b0 and δ0 (which depend onthe choice of a0) such that for any s isin T a isin (0 a0) δ isin (0 δ0) and b gt b0

P ( min|tminuss|ltδabminus1

f(t) lt b|f(s) = b + ab s isin L) le clowast exp(minusδlowast

δ2

)

where L is the set of local maxima of f

The next lemma is a direct corollary of Lemma 77

Lemma 78 Let tlowast be the point in T at which the global maximum of f isattained Then with the same choice of constants as in Lemma 77 for any a isin(0 a0) δ isin (0 δ0) and b gt b0 we have

P ( min|tminustlowast|ltδabminus1

f(t) lt b|f(tlowast) = b + ab) le 2clowast exp(minusδlowast

δ2

)

Proof Note that for any s isin T

P

(min

|t|leδabf(t) lt b|f(s) = b + ab s = tlowast

)

=P

(min|t|leδab f(t) lt b f(s) = b + ab s = tlowast

)

P (f(s) = b + ab s = tlowast)

le P(min|t|leδab f(t) lt b f(s) = b + ab s isin L)

P (f(s) = b + ab s = tlowast)

= P ( min|tminuss|ltδabminus1

f(t) lt b|f(s) = b + ab s isin L)P (f(s) = b + ab s isin L)P (f(s) = b + ab s = tlowast)

le clowast exp(minusδlowast

δ2

)P (f(s) = b + ab s isin L)P (f(s) = b + ab s = tlowast)

Appling this bound and the fact that

P (f(s) = b + ab s isin L)P (f(s) = b + ab s = tlowast)

rarr 1

as b rarrinfin the conclusion of the Lemma holds 2

MONTE CARLO FOR RANDOM FIELDS 31

The following lemma gives a bound on the density of suptisinT f(t) which will beused to control the size of overshoot beyond level b

Lemma 79 Let pflowast(x) be the density function of suptisinT f(t) Then there existsa constant cflowast and b0 such that

pflowast(x) le cflowastxd+1P (f(0) gt x)

for all x gt b0

Proof Recalling (13) let the continuous function pE(x) x isin R be defined bythe relationship

E(χ(t isin T f(t) ge b)) =int infin

b

pE(x) dx

where the left hand side is the expected value of the Euler-Poincare characteristicof Ab Then according to Theorem 810 in [7] there exists c and δ such that

|pE(x)minus pflowast(x)| lt cP (f(0) gt (1 + δ)x)

for all x gt 0 In addition thanks to the result of [4] which providesintinfin

bpE(x)dx in

closed form there exists c0 such that for all x gt 1

pE(x) lt c0xd+1P (f(0) gt x)

Hence there exists cflowast such that

pflowast(x) le c0xd+1P (f(0) gt x) + cP (f(0) gt (1 + δ)x) le cflowastx

d+1P (f(0) gt x)

for all x gt 1 2

The last ingredients required to provide the proof of Proposition 72 and Theorem74 are stated in the following well known Theorem adapted from Lemma 61 andTheorem 72 in [19] to the twice differentiable case

Theorem 710 There exists a constant H (depending on the covariance func-tion C such that

P (suptisinT

f(t) gt b) = (1 + o (1))Hm(T )bdP (f(0) gt b)(74)

as b rarrinfinSimilarly choose δ small enough so that [0 δ]d sub T and let ∆0 = [0 bminus1]d Then

there exists a constant H1 such that

P ( suptisin∆0

f(t) gt b) = (1 + o(1))H1P (f(0) gt b)(75)

as b rarrinfin

32 ADLER BLANCHET LIU

We now are ready to provide the proof of Proposition 72 and Theorem 74

Proof of Proposition 72 The fact that there exists c1 such that

(T ) le c1m(T )εminusdbd

is immediate from assumption B2 Therefore we proceed to provide a bound forthe relative bias We choose ε0 lt δ2

0 (δ0 is given in Lemmas 77 and 78) and notethat for any ε lt ε0

P (suptisinT

f(t) lt b| suptisinT

f(t) gt b)

le P (suptisinT

f(t) lt b + 2radic

εb| suptisinT

f(t) gt b) + P (suptisinT

f(t) lt b| suptisinT

f(t) gt b + 2radic

εb)

By (74) and Lemma 79 there exists c2 such that the first term above can boundedby

P (suptisinT

f(t) lt b + 2radic

εb| suptisinT

f(t) gt b) le c2

radicε

The second term can be bounded by

P (suptisinT

f(t) lt b| suptisinT

f(t) gt b + 2radic

εb)

le P ( sup|tminustlowast|lt2εbminus1

f(t) lt b| suptisinT

f(t) gt b + 2radic

ε)

le 2clowast exp(minusδlowastεminus1

)

Hence there exists c0 such that

P (suptisinT

f(t) lt b| suptisinT

f(t) gt b) le c2

radicε + 2clowasteminusδlowastε le c0

radicε

for all ε isin (0 ε0) 2

Proof of Theorem 74 We write θ = 1minus 3δ isin (0 1) First note that by (74)

P (supT

f(t) gt b + b2δminus1| supT

f(t) gt b) rarr 0

as b rarrinfin Let tlowast be the position of the global maximum of f in T According to theexact Slepian model in Section 73 and an argument similar to the proof of Lemma78

(76) P ( sup|tminustlowast|gtb2δminus1

f(t) gt b|b lt f(tlowast) le b + b2δminus1) rarr 0

as b rarrinfin Consequently

P ( sup|tminustlowast|gtb2δminus1

f(t) gt b| supT

f(t) gt b) rarr 0

MONTE CARLO FOR RANDOM FIELDS 33

LetB(T b2δminus1) = cup

tisinTB(t b2δminus1)

We have

P (supT

f(t) gt b| supT

f(t) gt b)

le P ( sup|tminustlowast|gtb2δminus1

f(t) gt b| supT

f(t) gt b)

+ P (supT

f(t) gt b sup|tminustlowast|gtb2δminus1

f(t) le b| supT

f(t) gt b)

le o(1) + P (tlowast isin B(T b2δminus1)| supT

f(t) gt b)

le o(1) + P ( supB(T b2δminus1)

f(t) gt b| supT

f(t) gt b)

Since (T ) le b(1minus3δ)d we can find a finite set T prime = tprime1 tprimel sub T and let ∆k =tprimek + [0 bminus1] such that l = O(b(1minusδ)d) and B(T b2δminus1) sub cupl

k=1∆k The choice ofl only depends on (T ) not the particular distribution of T Therefore applying(75)

sup(Tlebθd)

P ( supB(T b2δminus1)

f(t) gt b) le O(b(1minusδ)d)P (f(0) gt b)

This together with (74) yields

sup(Tlebθd)

P ( supB(T b2δminus1)

f(t) gt b| supT

f(t) gt b) le O(bminusδd) = o(1)

for b ge b0 which clearly implies the statement of the result 2

72 Variance Control Proof of Theorem 75 We proceed directly to the proofof Theorem 75

34 ADLER BLANCHET LIU

Proof of Theorem 75 Note that

EQL2b

P 2 (suptisinT f (t) gt b)

=E(Lb)

P 2 (suptisinT f (t) gt b)

=E

(MtimesP (X1gtbminus1b)sumn

j=11(Xjgtbminus 1

b )maxj Xj gt b

)

P 2 (suptisinT f (t) gt b)

=E

(MtimesP (X1gtbminus1b)sumn

j=11(Xjgtbminus 1

b )maxj Xj gt b suptisinT f (t) gt b

)

P 2 (suptisinT f (t) gt b)

=E

(MP (X1gtbminus1b)1(maxj Xjgtb)sumn

j=11(Xjgtbminus1b)

∣∣∣∣ suptisinT f (t) gt b

)

P (suptisinT f (t) gt b)

= E

(M1 (maxj Xj gt b)sumnj=1 1 (xj gt bminus 1b)

∣∣∣∣∣ suptisinT

f (t) gt b

)P (X1 gt bminus 1b)

P (suptisinT f (t) gt b)

The remainder of the proof involves showing that the last conditional expectationhere is of order O(bd) This together with (74) will yield the result Note that forany A (b ε) such that A (b ε) = Θ(M) uniformly over b and ε we can write

E

(M times 1 (maxj Xj gt b)sumM

j=1 1(Xj gt bminus 1

b

)∣∣∣∣∣ sup

tisinTf (t) gt b

)(77)

le E

M times 1

(sumMj=1 1 (Xj gt bminus 1b) ge M

A(bε)

)

sumMj=1 1

(xj gt bminus 1

b

)∣∣∣∣∣∣suptisinT

f (t) gt b

+ E

M times 1

(1 le sumM

j=1 1 (Xj gt bminus 1b) lt MA(bε)

)

sumMj=1 1

(xj gt bminus 1

b

)∣∣∣∣∣∣suptisinT

f (t) gt b

We shall select A(b ε) appropriately in order to bound the expectations above ByLemma 78 for any 4ε le δ le δ0 there exist constants cprime and cprimeprime isin (0infin) such that

clowast exp(minusδlowast

δ2

)ge P

(min

|tminustlowast|leδbf (t) lt bminus 1b

∣∣∣∣ suptisinT

f (t) gt b

)

ge P

Msum

j=1

1 (Xj gt bminus 1b) le cprimeδdεd

∣∣∣∣∣∣suptisinT

f (t) gt b

ge P

(MsumM

j=1 1(Xj gt bminus 1b)ge bdcprimeprime

δd

∣∣∣∣∣ suptisinT

f (t) gt b

)(78)

MONTE CARLO FOR RANDOM FIELDS 35

The first inequality is an application of Lemma 78 The second inequality is dueto the fact that for any ball B of radius 4ε or larger (T cap B) ge cprimedεminusd for somecprime gt 0 Inequality (78) implies that for all x such that bdcprimeprimeδd

0 lt x lt bdcprimeprime[4dεd]there exists δlowastlowast gt 0 such that

P

(MsumM

j=1 1(Xj gt bminus 1

b

)bdge x

∣∣∣∣∣ suptisinT

f (t) gt b

)le clowast exp

(minusδlowastlowastx2d

)

Now let A (b ε) = bdcprimeprime(4dεd) and observe that by the second result in (71) wehave A (b ε) = Θ (M) and moreover that there exists c3 such that the first termon the right hand side of (77) is bounded by

(79) E

M times 1

(sumMj=1 1 (Xj gt bminus 1b) ge M

A(bε)

)

sumMj=1 1 (Xj gt bminus 1b)

∣∣∣∣∣∣supT

f (t) gt b

le c3b

d

Now we turn to the second term on the right hand side of (77) We use the factthat MA(b ε) le cprimeprimeprime isin (0infin) (uniformly as b rarrinfin and ε rarr 0) There exist c4 andc5 such that for ε le δ0c4

E

M1

(1 le sumM

j=1 1(Xj gt bminus 1

b

)lt M

A(bε)

)

sumMj=1 1

(Xj gt bminus 1

b

)∣∣∣∣∣∣supT

f (t) gt b

(710)

le MP

nsum

j=1

1(

Xj gt bminus 1b

)lt cprimeprimeprime

∣∣∣∣∣∣supT

f (t) gt b

le MP

(min

|tminustlowast|lec4εbf (t) lt bminus 1b

∣∣∣∣ supT

f (t) gt b

)

le c1m(T )bdεminusd exp(minus δlowast

c24ε

2

)le c5b

d

The second inequality holds from the fact that ifsumM

j=1 1(Xj gt bminus 1

b

)is less than

cprimeprimeprime then the minimum of f in a ball around the local maximum and of radius c4εb

must be less than bminus 1b Otherwise there are more than cprimeprimeprime elements of T insidesuch a ball The last inequality is due to Lemma 78 and Theorem 72

Putting (79) and (710) together we obtain for all εb -regular discretizationswith ε lt ε0 = min(14 1c4)δ0

EQL2b

P (suptisinT f (t) gt b)le E

(M1 (maxj Xj gt b)sumM

j=1 1(Xj gt bminus 1

b

)∣∣∣∣∣ sup

tisinTf (t) gt b

)P (X1 gt bminus 1b)

P (suptisinT f (t) gt b)

le (c3 + c5)bdP (X1 gt bminus 1b)P (suptisinT f (t) gt b)

Applying now (74) and Proposition 72 we have that

P (suptisinT

f(t) gt b) ltP (sup

tisinTf(t) gt b)

1minus c0radic

ε

36 ADLER BLANCHET LIU

we have

supbgtb0εisin[0ε0]

EQL2b

P(sup

tisinTf (t) gt b

) lt infin

as required 2

73 A final proof Lemma 77 Without loss of generality we assume that thefield has mean zero and unit variance Before getting into the details of the proofof Lemma 77 we need a few additional lemmas for which we adopt the followingnotation Let Ci and Cij be the first and second order derivatives of C and definethe vectors

micro1(t) = (minusC1(t) minusCd(t))micro2(t) = vech ((Cij(t) i = 1 d j = i d))

Let f prime(0) and f primeprime(0) be the gradient and vector of second order derivatives of f at 0where f primeprime(0) is arranged in the same order as micro2(0) Furthermore let micro02 = microgt20 bea vector of second order spectral moments and micro22 a matrix of fourth order spectralmoments The vectors micro02 and micro22 are arranged so that

1 0 micro02

0 Λ 0micro20 0 micro22

is the covariance matrix of (f(0) f prime(0) f primeprime(0)) where Λ = (minusCij(0)) It then followsthat

micro2middot0 = micro22 minus micro20micro02

be the conditional variance of f primeprime(0) given f(0) The following lemma given in [5]provides a stochastic representation of the f given that it has a local maxima atlevel u at the origin

Lemma 711 Given that f has a local maximum with height u at zero (aninterior point of T ) the conditional field is equal in distribution to

(711) fu(t) uC (t)minusWuβgt (t) + g (t)

g(t) is a centered Gaussian random field with covariance function

γ(s t) = C(sminus t)minus (C(s) micro2(s))(

1 micro02

micro20 micro22

)minus1 (C(t)microgt2 (t)

)minus micro1(s)Λminus1microgt1 (t)

and Wu is a d(d+1)2 random vector independent of g(t) with density function

(712) ψu (w) prop |det (rlowast(w)minus uΛ) | exp(minus1

2wgtmicrominus1

2middot0w)1 (rlowast(w)minus uΛ isin N )

MONTE CARLO FOR RANDOM FIELDS 37

where rlowast(w) is a dtimesd symmetric matrix whose upper triangular elements consist ofthe components of w The set of negative definite matrices is denoted by N Finallyβ(t) is defined by

(α (t) β (t)) = (C (t) micro2 (t))(

1 micro02

micro20 micro22

)minus1

The following two technical lemmas which we shall prove after completing theproof of Lemma 77 provide bounds for the last two terms of (711)

Lemma 712 Using the notation in (711) there exist δ0 ε1 c1 and b0 suchthat for any u gt b gt b0 and δ isin (0 δ0)

P

(sup

|t|leδab

∣∣Wuβgt (t)∣∣ gt

a

4b

)le c1 exp

(minusε1b

2

δ4

)

Lemma 713 There exist c δ and δ0 such that for any δ isin (0 δ0)

P

(max|t|leδab

|g(t)| gt a

4b

)le c exp

(minus δ

δ2

)

Proof of Lemma 77 Using the notation of Lemma 711 given any s isin L forwhich f(s) = b we have that the corresponding conditional distribution of s is thatof fb(middot minus s) Consequently it suffices to show that

P

(min

|t|leδabfb(t) lt bminus ab

)le clowast exp

(minusδlowast

δ2

)

We consider firstly the case for which the local maximum is in the interior of T Then by the Slepian model (711)

fb (t) = bC (t)minusWbβgt (t) + g (t)

We study the three terms of the Slepian model individually Since

C(t) = 1minus tgtΛt + o(|t|2

)

there exists a ε0 such thatbC (t) ge bminus a

4b

38 ADLER BLANCHET LIU

for all |t| lt ε0radic

ab According to Lemmas 712 and 713 for δ lt min(ε0radic

a0 δ0)

P

(min

|t|leδabfu(t) lt bminus ab

)

le P

(max|t|ltδab

|g (t)| gt a

4b

)+ P

(sup

|t|leδab

∣∣Wbβgt(t)

∣∣ gta

4b

)

le c exp

δ2

)+ c1 exp

(minusε1b

2

δ4

)

le clowast exp(minusδlowast

δ2

)

for some clowast and δlowastNow consider the case for which the local maximum is in the (dminus1)-dimensional

boundary of T Due to convexity of T and without loss of generality we assumethat the tangent space of partT is generated by partpartt2 partparttd the local maximumis located at the origin and T is a subset of the positive half-plane t1 ge 0 Thatthese arguments to not involve a loss of generality follows from the arguments onpages 291ndash192 of [4] which rely on the assumed stationarity of f (for translations)and the fact that rotations while changing the distributions will not affect theprobabilities that we are currently computing)

For the origin positioned as just described to be a local maximum it is necessaryand sufficient that the gradient of f restricted to partT is the zero vector the Hessianmatrix restricted to partT is negative definite and part1f(0) le 0 Applying a version ofLemma 711 for this case conditional on f(0) = u and 0 being a local maximumthe field is equal in distribution to

(713) uC(t)minus Wuβgt(t) + micro1(t)Λminus1 (Z 0 0)T + g(t)

where Z le 0 corresponds to part1f (0) and it follows a truncated (conditional on thenegative axis) Gaussian random variable with mean zero and a variance parameterwhich is computed as the conditional variance of part1f (0) given (part2f (0) partdf (0))The vector Wu is a (d(d + 1)2)-dimensional random vector with density function

ψu(w) prop |det(rlowast(w)minus uΛ)| exp(minus12wgtmicrominus1

2middot0w)1(wlowast minus uΛ isin N )

where Λ is the second spectral moment of f restricted to partT and rlowast(w) is the(dminus 1)times (dminus 1) symmetric matrix whose upper triangular elements consist of thecomponents of w In the representation (713) the vector Wu and Z are independentAs in the proof of Lemma 712 one can show that a similar bound holds albeitwith with different constants Thus since micro1 (t) = O (t) there exist cprimeprime and δprimeprime suchthat the third term in (713) can be bounded by

P [ max|t|leδab

∣∣∣micro1 (t) Λminus1 (Z 0 0)T∣∣∣ ge a(4b)] le cprimeprime exp

(minusδprimeprime

δ2

)

MONTE CARLO FOR RANDOM FIELDS 39

Consequently we can also find clowast and δlowast such that the conclusion holds and weare done 2

We complete the paper with the proofs of Lemmas 712 and 713

Proof of Lemma 712 It suffices to prove the Lemma for the case a = 1 Since

fu (0) = u = uminusWuβgt (0) + g (0)

and Wu and g are independent β (0) = 0 Furthermore since C prime (t) = O (t) andmicroprime2 (t) = O (t) there exists a c0 such that |β(t)| le c0|t|2 In addition Wu has densityfunction proportional to

ψu(w) prop |det(rlowast(w)minus uΛ)| exp(minus12wgtmicrominus1

2middot0w)1 (wlowast minus uΛ isin N )

Note that det(rlowast(w) minus uΛ) is expressible as a polynomial in w and u and thereexists some ε0 and c such that

∣∣∣∣det (rlowast(w)minus uΛ)

det (minusuΛ)

∣∣∣∣ le c

if |w| le ε0u Hence there exist ε2 c2 gt 0 such that

ψu (w) le ψ (w) = c2 exp(minus1

2ε2w

gtmicrominus12middot0w

)

for all u ge 1 The right hand side here is proportional to a multivariate Gaussiandensity Thus

P (|Wu| gt x) =int

|w|gtx

ψu(w)dw leint

|w|gtx

ψu (w) dw = c3P (|W | gt x)

where W is a multivariate Gaussian random variable with density function propor-tional to ψ Therefore by choosing ε1 and c1 appropriately we have

P

(sup|t|leδb

∣∣Wuβgt∣∣ gt

14b

)le P

(|Wu| gt b

c20δ

2

)le c1 exp

(minusε1b

2

δ4

)

for all u ge b 2

Proof of Lemma 713 Once again it suffices to prove the Lemma for the casea = 1 Since

fb(0) = b = bminusWbβgt(0) + g(0)

the covariance function (γ (s t) s t isin T ) of the centered field g satisfies γ(0 0) = 0It is also easy to check that

partsγ(s t) = O(|s|+ |t|) parttγ(s t) = O(|s|+ |t|)

40 ADLER BLANCHET LIU

Consequently there exists a constant cγ isin (0infin) for which

γ(s t) le cγ(|s|2 + |t|2) γ(s s) le cγ |s|2

We need to control the tail probability of sup|t|leδb |g(t)| For this it is useful tointroduce the following scaling Define

gδ (t) =b

δg

(δt

b

)

Then sup|t|leδb g (t) ge 14b if and only if sup|t|le1 gδ (t) ge 1

4δ Let

σδ (s t) = E (gδ (s) gδ (t))

ThensupsisinR

σδ(s s) le cγ

Because γ(s t) is at least twice differentiable applying a Taylor expansion we easilysee that the canonical metric dg corresponding to gδ (s) (cf Theorem 67) can bebounde as follows

d2g(s t) = E (gδ (s)minus gδ (t))2 =

b2

δ2

(δs

bδs

b

)+ γ

(δt

bδt

b

)minus 2γ

(δs

bδt

b

)]

le c |sminus t|2

for some constant c isin (0infin) Therefore the entropy of gδ evaluated at δ is boundedby Kδminusd for any δ gt 0 and with an appropriate choice of K gt 0 Therefore for allδ lt δ0

P ( sup|t|leδb

|g(t)| ge 14b

) = P ( sup|t|le1

gδ (t) ge 14δ

) le cdδminusdminusη exp(minus 1

16cγδ2)

for some constant cd and η gt 0 The last inequality is a direct application of The-orem 411 of [4] The conclusion of the lemma follows immediately by choosing c

and δ appropriately 2

REFERENCES

[1] RJ Adler K Bartz and S Kou Estimating curvatures in the Gaussian kinematic formulaIn preparation

[2] RJ Adler J Blanchet and J Liu Efficient simulation for tail probabilities of Gaussianrandom fields In WSC rsquo08 Proceedings of the 40th Conference on Winter Simulationpages 328ndash336 Winter Simulation Conference 2008

[3] RJ Adler P Muller and Rozovskii B editors Stochastic Modelling in Physical Oceanog-raphy Birkhauser Boston 1996

[4] RJ Adler and JE Taylor Random Fields and Geometry Springer New York USA 2007

[5] RJ Adler JE Taylor and K Worsley Applications of Random Fields and GeometryFoundations and Case Studies httpwebeetechnionacilpeopleadlerhrfpdf

MONTE CARLO FOR RANDOM FIELDS 41

[6] S Asmussen and P Glynn Stochastic Simulation Algorithms and Analysis Springer-VerlagNew York NY USA 2008

[7] JM Azais and M Wschebor Level Sets and Extrema of Random Processes and FieldsWiley New York NY 2009

[8] JM Bardeen JR Bond N Kaiser and AS Szalay The statistics of peaks of Gaussianrandom fields The Astrophysical Journal 30415ndash61 1986

[9] SM Berman Sojourns and Extremes of Stochastic Processes The Wadsworth ampBrooksCole StatisticsProbability Series Wadsworth amp BrooksCole Advanced Books ampSoftware Pacific Grove CA 1992

[10] J Blanchet Efficient importance sampling for binary contingency tables Ann of ApplProb 19949ndash982 2009

[11] C Borell The Brunn-Minkowski inequality in Gauss space Invent Math 30205ndash216 1975

[12] J Bucklew Introduction to Rare-Event Simulation Springer-Verlag New York NY 2004

[13] MR Dennis Nodal densities of planar Gaussian random waves Eur Phys J 145191ndash2102007

[14] RM Dudley Sample functions of the Gaussian process Ann Probab 166ndash103 1973

[15] KJ Friston KJ Worsley RSJ Frackowiak JC Mazziotta and AC Evans Assessing thesignificance of focal activations using their spatial extent Human Brain Mapping 1214ndash2201994

[16] MR Leadbetter G Lindgren and H Rootzen Extremes and Related Properties of RandomSequences and Processes Springer-Verlag New York 1983

[17] M Mitzenmacher and E Upfal Probability and Computing Randomized Algorithms andProbabilistic Analysis Cambridge University Press 2005

[18] W Niemiro and P Pokarowski Fixed precision MCMC estimation by median of productsof averages J Appl Probab 46309ndash329 2009

[19] V Piterbarg Asymptotic Methods in the Theory of Gaussian Processes American Mathe-matical Society USA 1995

[20] SF Shandarin Testing non-Gaussianity in cosmic microwave background maps by morpho-logical statistics Mon Not R Astr Soc 331865ndash874 2002

[21] SF Shandarin HA Feldman Y Xu and M Tegmark Morphological measures of non-gaussianity in cosmic microwave background maps Astrophys J Suppl 1411ndash11 2002

[22] L Shepp and H Landau On the supremum of a Gaussian process Sankhya A 32369ndash378December 1970

[23] JE Taylor A Takemura and RJ Adler Validity of the expected Euler characteristicheuristic Ann Probab 33(4)1362ndash1396 2005

[24] JE Taylor and KJ Worsley Detecting sparse signals in random fields with an applicationto brain mapping J Amer Statist Assoc 102(479)913ndash928 2007

[25] J Traub G Wasilokowski and H Wozniakowski Information-Based Complexity AcademicPress New York NY 1988

[26] BS Tsirelson IA Ibragimov and VN Sudakov Norms of Gaussian sample functionsProceedings of the Third Japan-USSR Symposium on Probability Theory (Tashkent 1975)55020ndash41 1976

[27] V S Tsirelrsquoson The density of the maximum of a gaussian process Theory of Probab Appl20847ndash856 1975

[28] H Wozniakowski Computational complexity of continuous problems 1996 Technical Re-port

Robert J AdlerElectrical EngineeringTechnion Haifa Israel 32000E-mail roberteetechnionacilURL webeetechnionacilpeopleadler

Jose H BlanchetIndustrial Engineering amp Operations ResearchColumbia Uninversity340 SW Mudd Building500 W 120 StreetNew York New York 10027E-mail joseblanchetcolumbiaeduURL wwwieorcolumbiaedufac-biosblanchetfacultyhtml

42 ADLER BLANCHET LIU

Jingchen LiuDepartment of StatisticsColumbia University1255 Amsterdam Avenue Room 1030New York NY 10027E-mail jcliustatcolumbiaeduURL statcolumbiaedu~jcliu

Page 22: EFFICIENT MONTE CARLO FOR HIGH EXCURSIONS OF … › ~jblanche › papers › GRF1.pdf · 1. Introduction. This paper centers on the design and analysis of e–cient Monte Carlo techniques

22 ADLER BLANCHET LIU

Similarly, for some κ > 0 and all y^{−β/d} ≤ (1/κ)(a/b)^{2+v}, a < 1, there exists c_5 such that

P( m(A_{γ_{a,b}} ∩ T)^{−1} > y, m(A_b) > 0 | f(0) = γ_{a,b} + z/γ_{a,b} )
  ≤ P( sup_{|s−t| < c_1 y^{−1/d}} |f(s) − f(t)| > a/b | f(0) = γ_{a,b} + z/γ_{a,b} )
  = P( sup_{|s−t| < c_1 y^{−1/d}} |ξ(s, t)| > a/(2b) )
  ≤ exp( −a² / (c_5 b² y^{−β/d}) ).

Combining the two cases z > 1 and z ∈ (0, 1), and choosing c_5 large enough, we have

P( m(A_{γ_{a,b}} ∩ T)^{−1} > y, m(A_b) > 0 | f(0) = γ_{a,b} + z/γ_{a,b} ) ≤ exp( −a² / (c_5 b² y^{−β/d}) )

for a small enough and y^{−β/d} ≤ (1/κ)(a/b)^{2+v}. Renaming the constants completes the proof. □

The final ingredient needed for the proof of Proposition 6.4 is the following lemma, involving stochastic domination. The proof follows an elementary argument and is therefore omitted.

Lemma 6.11. Let v_1 and v_2 be finite measures on R, and define η_j(x) = ∫_x^∞ v_j(ds). Suppose that η_1(x) ≥ η_2(x) for each x ≥ x_0. Let (h(x) : x ≥ x_0) be a non-decreasing, positive, and bounded function. Then

∫_{x_0}^∞ h(s) v_1(ds) ≥ ∫_{x_0}^∞ h(s) v_2(ds).
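Though elementary, the domination property is easy to check numerically. The following minimal sketch (in Python; the measures and the choice of h are ours, purely for illustration) builds v_1 from v_2 by shifting atoms to the right, so that η_1(x) ≥ η_2(x) for every x, and verifies the asserted inequality:

    import numpy as np

    rng = np.random.default_rng(1)

    # Two discrete finite measures: v2 has atoms `pts` with weights `wts`;
    # v1 carries the same weights shifted right, so eta_1(x) >= eta_2(x) for all x.
    pts = np.sort(rng.uniform(0.0, 10.0, 50))
    wts = rng.uniform(0.0, 1.0, 50)
    pts1 = pts + 1.0

    h = lambda x: 1.0 / (1.0 + np.exp(-x))   # non-decreasing, positive, bounded

    x0 = 0.0
    int1 = np.sum(wts[pts1 >= x0] * h(pts1[pts1 >= x0]))
    int2 = np.sum(wts[pts >= x0] * h(pts[pts >= x0]))
    assert int1 >= int2   # integral against v1 dominates, as the lemma asserts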

Proof of Proposition 6.4. Note that

E_{Q′}[ exp( −M m(A_{γ_{a,b}})/m(T) ) / m(A_{γ_{a,b}}) ; m(A_{γ_{a,b}}) ∈ (0, α), m(A_b) > 0 ]
  = ∫_T ∫_{z=0}^∞ E[ exp( −M m(A_{γ_{a,b}})/m(T) ) / m(A_{γ_{a,b}}) ; m(A_{γ_{a,b}}) ∈ (0, α), m(A_b) > 0 | f(t) = γ_{a,b} + z/γ_{a,b} ] P( τ_{γ_{a,b}} ∈ dt ) P( γ_{a,b}[f(t) − γ_{a,b}] ∈ dz | f(t) > γ_{a,b} )
  ≤ sup_{z>0, t∈T} E[ exp( −M m(A_{γ_{a,b}})/m(T) ) / m(A_{γ_{a,b}}) ; m(A_{γ_{a,b}}) ∈ (0, α), m(A_b) > 0 | f(t) = γ_{a,b} + z/γ_{a,b} ].


Now define Y(b/a) = (b/a)^{2d/β} λ_2^{−d/β} W, with λ_2 as chosen in Proposition 6.10, and W with distribution given by P(W > x) = exp(−x^{β/d}). By Lemma 6.11,

sup_{z>0, t∈T} E[ exp( −M m(A_{γ_{a,b}})/m(T) ) / m(A_{γ_{a,b}}) ; m(A_{γ_{a,b}}) ∈ (0, α), m(A_b) > 0 | f(t) = γ_{a,b} + z/γ_{a,b} ]
  ≤ E[ Y(b/a) exp( −M Y(b/a)^{−1} m(T) ) ; Y(b/a) > α^{−1} ].

Now let Z be exponentially distributed with mean 1 and independent of Y(b/a). Then we have (using the definition of the tail distribution of Z and Chebyshev's inequality)

exp( −M Y(b/a)^{−1} m(T) ) = P( Z > M Y(b/a)^{−1} m(T) | Y(b/a) ) ≤ m(T) Y(b/a) / M.

Therefore,

E[ exp( −M Y(b/a)^{−1} m(T) ) Y(b/a) ; Y(b/a) ≥ α^{−1} ]
  ≤ ( m(T)/M ) E[ Y(b/a)² ; Y(b/a) ≥ α^{−1} ]
  ≤ ( m(T) / (M λ_2^{2d/β}) ) (b/a)^{4d/β} E(W²),

which completes the proof. □

6.5. Proof of Proposition 6.5. We start with the following result of Tsirelson [27].

Theorem 6.12. Let f be a continuous separable Gaussian process on a compact (in the canonical metric) domain T. Suppose that the variance function σ²(t) = Var(f(t)) is continuous and that σ(t) > 0 for t ∈ T. Moreover, assume that µ = Ef is also continuous, and that µ(t) ≥ 0 for all t ∈ T. Define

σ²_T := max_{t∈T} Var(f(t)),

and set F(x) = P{max_{t∈T} f(t) ≤ x}. Then F is continuously differentiable on R. Furthermore, let y be such that F(y) > 1/2, and define y* by

F(y) = Φ(y*).

Then, for all x > y,

F′(x) ≤ Ψ( xy*/y ) [ (xy*/y)(1 + 2α) + 1 ] (y*/y)(1 + α),

where

α = y² / ( x(x − y) y*² ).

We can now prove the following lemma.

Lemma 6.13. There exists a constant A ∈ (0, ∞), independent of a and b ≥ 0, such that

(6.15) P( sup_{t∈T} f(t) ≤ b + a/b | sup_{t∈T} f(t) > b ) ≤ aA · P( sup_{t∈T} f(t) ≥ b − 1/b ) / P( sup_{t∈T} f(t) ≥ b ).

Proof. By subtracting inf_{t∈T} µ(t) > −∞, and redefining the level b to be b − inf_{t∈T} µ(t), we may simply assume that Ef(t) ≥ 0, so that we can apply Theorem 6.12. Adopting the notation of Theorem 6.12, first we pick b_0 large enough so that F(b_0) > 1/2, and assume that b ≥ b_0 + 1. Now let y = b − 1/b and F(y) = Φ(y*). Note that there exists δ_0 ∈ (0, ∞) such that δ_0 b ≤ y* ≤ δ_0^{−1} b for all b ≥ b_0. This follows easily from the fact that

log P{ sup_{t∈T} f(t) > x } ∼ log sup_{t∈T} P{ f(t) > x } ∼ −x²/(2σ²_T).

On the other hand, by Theorem 6.12, F is continuously differentiable, and so

(6.16) P{ sup_{t∈T} f(t) < b + a/b | sup_{t∈T} f(t) > b } = ∫_b^{b+a/b} F′(x) dx / P{ sup_{t∈T} f(t) > b }.

Moreover,

F′(x) ≤ ( 1 − Φ( xy*/y ) ) [ (xy*/y)(1 + 2α(x)) + 1 ] (y*/y)(1 + α(x))
  ≤ ( 1 − Φ(y*) ) [ (xy*/y)(1 + 2α(x)) + 1 ] (y*/y)(1 + α(x))
  = P( max_{t∈T} f(t) > b − 1/b ) [ (xy*/y)(1 + 2α(x)) + 1 ] (y*/y)(1 + α(x)).

Therefore,

∫_b^{b+a/b} F′(x) dx ≤ P( sup_{t∈T} f(t) > b − 1/b ) ∫_b^{b+a/b} [ (xy*/y)(1 + 2α(x)) + 1 ] (y*/y)(1 + α(x)) dx.

Recalling that α(x) = y²/[x(x − y)y*²], we can use the fact that y* ≥ δ_0 b to conclude that, if x ∈ [b, b + a/b], then α(x) ≤ δ_0^{−2}, and therefore

∫_b^{b+a/b} [ (xy*/y)(1 + 2α(x)) + 1 ] (y*/y)(1 + α(x)) dx ≤ 4δ_0^{−8} a.

We thus obtain that

(6.17) ∫_b^{b+a/b} F′(x) dx / P{ sup_{t∈T} f(t) > b } ≤ 4aδ_0^{−8} · P{ sup_{t∈T} f(t) > b − 1/b } / P{ sup_{t∈T} f(t) > b }

for any b ≥ b_0. This inequality, together with the fact that F is continuously differentiable on (−∞, ∞), yields the proof of the lemma for b ≥ 0. □

The previous result translates a question that involves the conditional distribution of max_{t∈T} f(t) near b into a question involving the tail distribution of max_{t∈T} f(t). The next result then provides a bound on this tail distribution.

Lemma 6.14. For each v > 0, there exists a constant C(v) ∈ (0, ∞) (possibly depending on v > 0, but otherwise independent of b) such that

P( max_{t∈T} f(t) > b ) ≤ C(v) b^{2d/β + dv + 1} max_{t∈T} P( f(t) > b )

for all b ≥ 1.

Proof. The proof of this result follows along the same lines as that of Theorem 2.6.2 in [5]. Consider an open cover T = ∪_{i=1}^{N(θ)} T_i(θ), where T_i(θ) = {s : |s − t_i| < θ}, with the t_i chosen so that N(θ) = O(θ^{−d}) for θ arbitrarily small. Write f(t) = g(t) + µ(t), where g(t) is a centered Gaussian random field, and note, using A2 and A3, that

P( max_{t∈T_i(θ)} f(t) > b ) ≤ P( max_{t∈T_i(θ)} g(t) > b − µ(t_i) − κ_H θ^β ).

Now we wish to apply the Borell-TIS inequality (Theorem 6.8) with U = T_i(θ), f_0 = g, d(s, t) = E^{1/2}([g(t) − g(s)]²), which, as a consequence of A2 and A3, is bounded above by C_0|t − s|^{β/2} for some C_0 ∈ (0, ∞). Thus, applying Theorem 6.7, we have that E max_{t∈T_i(θ)} g(t) ≤ C_1 θ^{β/2} log(1/θ) for some C_1 ∈ (0, ∞). Consequently, the Borell-TIS inequality yields that there exists C_2(v) ∈ (0, ∞) such that, for all b sufficiently large and θ sufficiently small, we have

P( max_{t∈T_i(θ)} g(t) > b − µ(t_i) − κ_H θ^β ) ≤ C_2(v) exp( −( b − µ(t_i) − C_1 θ^{β/(2+βv)} )² / (2σ²_{T_i}) ),

where σ_{T_i} = max_{t∈T_i(θ)} σ(t). Now select v > 0 small enough, and set θ^{β/(2+βv)} = b^{−1}. Straightforward calculations yield that

P( max_{t∈T_i(θ)} f(t) > b ) ≤ P( max_{t∈T_i(θ)} g(t) > b − µ(t_i) − κ_H θ^β ) ≤ C_3(v) max_{t∈T_i(θ)} exp( −(b − µ(t))² / (2σ(t)²) )

for some C_3(v) ∈ (0, ∞). Now recall the well known inequality (valid for x > 0)

φ(x)( 1/x − 1/x³ ) ≤ 1 − Φ(x) ≤ φ(x)/x,

where φ = Φ′ is the standard Gaussian density. Using this inequality, it follows that C_4(v) ∈ (0, ∞) can be chosen so that

max_{t∈T_i(θ)} exp( −(b − µ(t))² / (2σ(t)²) ) ≤ C_4(v) b max_{t∈T_i} P( f(t) > b )

for all b ≥ 1. We then conclude that there exists C(v) ∈ (0, ∞) such that

P( max_{t∈T} f(t) > b ) ≤ N(θ) C_4(v) b max_{t∈T} P( f(t) > b ) ≤ C θ^{−d} b max_{t∈T} P( f(t) > b ) = C b^{2d/β + dv + 1} max_{t∈T} P( f(t) > b ),

giving the result. □
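As a quick numerical sanity check of the two-sided Gaussian tail bound used in the last step (a sketch relying on SciPy, not part of the argument):

    import numpy as np
    from scipy.stats import norm

    # Check phi(x)(1/x - 1/x^3) <= 1 - Phi(x) <= phi(x)/x for a few x > 0.
    for x in [0.5, 1.0, 2.0, 4.0, 8.0]:
        lo = norm.pdf(x) * (1.0 / x - 1.0 / x**3)
        tail = norm.sf(x)                  # survival function, 1 - Phi(x)
        hi = norm.pdf(x) / x
        assert lo <= tail <= hi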

We can now complete the proof of Proposition 6.5.

Proof of Proposition 6.5. The result is a straightforward corollary of the previous two lemmas. By (6.15) in Lemma 6.13 and Lemma 6.14, there exists λ ∈ (0, ∞) for which

P( max_{t∈T} f(t) ≤ b + a/b | max_{t∈T} f(t) > b )
  ≤ aA · P( max_{t∈T} f(t) ≥ b − 1/b ) / P( max_{t∈T} f(t) ≥ b )
  ≤ aCA b^{2d/β + dv + 1} · max_{t∈T} P( f(t) > b − 1/b ) / P( max_{t∈T} f(t) ≥ b )
  ≤ aCA b^{2d/β + dv + 1} · max_{t∈T} P( f(t) > b − 1/b ) / max_{t∈T} P( f(t) > b )
  ≤ aλ b^{2d/β + dv + 1}.

The last two inequalities follow from the obvious bound

P( max_{t∈T} f(t) ≥ b ) ≥ max_{t∈T} P( f(t) > b )

and standard properties of the Gaussian distribution. This yields (6.12), from which the remainder of the proposition follows. □


7. Fine Tuning: Twice Differentiable Homogeneous Fields. In the preceding section we constructed a polynomial time algorithm based on a randomized discretization scheme. Our goal in this section is to illustrate how to take advantage of additional information to further improve the running time and the efficiency of the algorithm. In order to illustrate our techniques, we shall perform a more refined analysis in the setting of smooth and homogeneous fields, and shall establish optimality of the algorithm in a precise sense to be described below. Our assumptions throughout this section are B1–B2 of Section 3.

Let C(s − t) = Cov(f(s), f(t)) be the covariance function of f, which we assume to have mean zero. Note that it is an immediate consequence of homogeneity and differentiability that ∂_i C(0) = ∂³_{ijk} C(0) = 0.

We shall need the following definition.

Definition 7.1. We call T_M = {t_1, ..., t_M} ⊂ T a θ-regular discretization of T if and only if

min_{i≠j} |t_i − t_j| ≥ θ,   sup_{t∈T} min_i |t_i − t| ≤ 2θ.

Regularity ensures that points in the grid T_M are well separated. Intuitively, since f is smooth, having tight clusters of points translates to a waste of computing resources, as a result of sampling highly correlated values of f. Also, note that every region containing a ball of radius 2θ has at least one representative in T_M. Therefore, T_M covers the domain T in an economical way. One technical convenience of θ-regularity is that, for subsets A ⊆ T that have positive Lebesgue measure (in particular, ellipsoids),

lim_{M→∞} #(A ∩ T_M)/M = m(A)/m(T),

where, here and throughout the remainder of the section, #(A) denotes the cardinality of the set A.
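On a concrete domain a θ-regular discretization is straightforward to construct. The sketch below assumes, purely for illustration, that T = [0,1]^d with d ≤ 16, and uses a cubic lattice of spacing θ:

    import numpy as np
    from itertools import product

    def theta_regular_grid(theta, d):
        """Cubic lattice on T = [0,1]^d with spacing theta.

        Distinct lattice points are at least theta apart, and every t in T
        lies within theta*sqrt(d)/2 of a lattice point, which is <= 2*theta
        once d <= 16, so the grid is theta-regular per Definition 7.1.
        """
        axis = np.arange(0.0, 1.0 + 1e-12, theta)
        return np.array(list(product(axis, repeat=d)))

    # With theta = eps/b, as below, M = #(T_M) grows like (b/eps)^d.
    grid = theta_regular_grid(0.5 / 10.0, 2)   # eps = 0.5, b = 10, d = 2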

Let T_M = {t_1, ..., t_M} be a θ-regular discretization of T, and consider

X = (X_1, ..., X_M)^T := ( f(t_1), ..., f(t_M) )^T.

We shall concentrate on estimating w_M(b) = P( max_{1≤i≤M} X_i > b ). The next result (which we prove in Section 7.1) shows that, if θ = ε/b, then the relative bias is O(√ε).

Proposition 7.2. Suppose f is a Gaussian random field satisfying conditions B1 and B2. There exist c_0, c_1, b_0, and ε_0 such that, for any finite ε/b-regular discretization T_M of T,

(7.1) P( sup_{t∈T_M} f(t) < b | sup_{t∈T} f(t) > b ) ≤ c_0 √ε   and   #(T_M) ≤ c_1 m(T) ε^{−d} b^d,

for all ε ∈ (0, ε_0] and b > b_0.

Note that the bound on the bias obtained for twice differentiable fields is much sharper than that for general Hölder continuous fields, given by (6.13) in Theorem 6.6. This is partly because the conditional distribution of the random field around local maxima is harder to describe in the Hölder continuous case than in the case of twice differentiable fields. In addition to the sharper description of the bias, we shall also soon show, in Theorem 7.4, that our choice of discretization is optimal in a certain sense. Finally, we point out that the bound of √ε in the first term of (7.1) is not optimal. In fact, there seems to be some room for improvement, and we believe that a more careful analysis might yield a bound of the form c_0 ε².

We shall estimate w_M(b) by using a slight variation of Algorithm 5.3. In particular, since the X_i's are now identically distributed, we redefine Q to be

(7.2) Q( X ∈ B ) = Σ_{i=1}^M (1/M) P[ X ∈ B | X_i > b − 1/b ].

Our estimator then takes the form

(7.3) L_b = [ M P(X_1 > b − 1/b) / Σ_{j=1}^M 1( X_j > b − 1/b ) ] 1( max_{1≤i≤M} X_i > b ).

Clearly, we have that E_Q(L_b) = w_M(b). (The reason for subtracting the factor of 1/b was explained in Section 6.1.)
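Sampling X from the mixture Q in (7.2) requires one truncated Gaussian draw and one conditional Gaussian draw. The following is a minimal sketch for a centered field, given the covariance matrix Σ of X; the function name and the dense O(M³) conditional step are our illustrative choices, not the implementation analyzed here:

    import numpy as np
    from scipy.stats import norm

    def sample_Q(Sigma, b, rng):
        """One draw of X from Q in (7.2): pick i uniformly, force
        X_i > b - 1/b, then draw the rest from the Gaussian conditional law."""
        M = Sigma.shape[0]
        i = rng.integers(M)                          # uniform mixture component
        s2 = Sigma[i, i]
        c = (b - 1.0 / b) / np.sqrt(s2)
        u = rng.uniform()
        xi = np.sqrt(s2) * norm.isf(u * norm.sf(c))  # X_i | X_i > b - 1/b, tail inversion

        rest = np.delete(np.arange(M), i)
        k = Sigma[rest, i] / s2                      # regression of X_rest on X_i
        cond_mean = k * xi
        cond_cov = Sigma[np.ix_(rest, rest)] - np.outer(k, Sigma[rest, i])
        x = np.empty(M)
        x[i] = xi
        x[rest] = rng.multivariate_normal(cond_mean, cond_cov)
        return x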

Algorithm 7.3. Given a number of replications n and an ε/b-regular discretization T_M, the algorithm is as follows:

Step 1. Sample X^{(1)}, ..., X^{(n)}, i.i.d. copies of X with distribution Q given by (7.2).

Step 2. Compute and output

L_n = (1/n) Σ_{i=1}^n L_b^{(i)},

where

L_b^{(i)} = [ M P(X_1 > b − 1/b) / Σ_{j=1}^M 1( X_j^{(i)} > b − 1/b ) ] 1( max_{1≤j≤M} X_j^{(i)} > b ).
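With such a sampler, Algorithm 7.3 reduces to a short loop. A sketch, reusing sample_Q above (the covariance function cov_fn and its vectorized evaluation are placeholder assumptions):

    def algorithm_7_3(cov_fn, grid, b, n, seed=0):
        """Estimate w_M(b) = P(max_i f(t_i) > b) via the estimator (7.3)."""
        rng = np.random.default_rng(seed)
        M = grid.shape[0]
        # Covariance matrix of X = (f(t_1), ..., f(t_M)) for a centered,
        # homogeneous field with covariance function C(.) = cov_fn(.).
        Sigma = cov_fn(grid[:, None, :] - grid[None, :, :])
        thr = b - 1.0 / b
        p1 = norm.sf(thr / np.sqrt(Sigma[0, 0]))     # P(X_1 > b - 1/b)

        total = 0.0
        for _ in range(n):
            x = sample_Q(Sigma, b, rng)
            total += (M * p1 / np.sum(x > thr)) * float(x.max() > b)
        return total / n

    # Example (illustrative): squared-exponential covariance on [0,1]^2.
    # w_hat = algorithm_7_3(lambda d: np.exp(-np.sum(d**2, axis=-1)),
    #                       grid, b=3.0, n=10**3)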

Theorem 7.5 later guides the selection of n in order to achieve a prescribed relative error. In particular, our analysis, together with considerations from Section 2, implies that choosing n = O( ε^{−2} δ^{−1} ) suffices to achieve ε relative error with probability at least 1 − δ.

Algorithm 7.3 improves on Algorithm 6.1 for Hölder continuous fields in two important ways. The first aspect is that it is possible to obtain information on the size of the relative bias of the estimator. In Proposition 7.2 we saw that, in order to overcome bias due to discretization, it suffices to take a discretization of size M = #(T_M) = Θ(b^d). That this selection is also asymptotically optimal, in the sense described in the next result, will be proven in Section 7.1.

Theorem 7.4. Suppose f is a Gaussian random field satisfying conditions B1 and B2. If θ ∈ (0, 1), then, as b → ∞,

sup_{#(T_M) ≤ b^{θd}} P( sup_{t∈T_M} f(t) > b | sup_{t∈T} f(t) > b ) → 0.

This result implies that the relative bias goes to 100% as b → ∞ if one chooses a discretization scheme of size O( b^{θd} ) with θ ∈ (0, 1). Consequently, d is the smallest power of b that achieves any given bounded relative bias, and so the suggestion above of choosing M = O( b^d ) points for the discretization is, in this sense, optimal.

The second aspect of improvement involves the variance. In the case of Hölder continuous fields, the ratio of the second moment of the estimator and w(b)² was shown to be bounded by a quantity that is of order O(M²). In contrast, in the context of the smooth and homogeneous fields considered here, the next result shows that this ratio is bounded uniformly for b > b_0 and M = #(T_M) ≥ cb^d. That is, the variance remains strongly controlled.

Theorem 7.5. Suppose f is a Gaussian random field satisfying conditions B1 and B2. Then there exist constants c, b_0, and ε_0 such that, for any ε/b-regular discretization T_M of T, we have

sup_{b>b_0, ε∈[0,ε_0]} E_Q L_b² / P²( sup_{t∈T} f(t) > b ) ≤ sup_{b>b_0, ε∈[0,ε_0]} E_Q L_b² / P²( sup_{t∈T_M} f(t) > b ) ≤ c

for some c ∈ (0, ∞).

The proof of this result is given in Section 7.2. The fact that the number of replications remains bounded in b is a consequence of the strong control on the variance.

Finally, we note that the proof of Theorem 3.3 follows as a direct corollary of Theorem 7.5, together with Proposition 7.2 and our discussion in Section 2. Assuming that placing each point in T_M takes no more than c units of computer time, the total complexity is, according to the discussion in Section 2, O( nM³ + M ) = O( ε^{−2} δ^{−1} M³ + M ). The contribution of the term M³ = O( ε^{−3d} b^{3d} ) comes from the complexity of applying Cholesky factorization, and the term M = O( ε^{−d} b^d ) corresponds to the complexity of placing T_M.

Remark 7.6. Condition B2 imposes a convexity assumption on the boundary of T. This assumption, although convenient in the development of the proofs of Theorems 7.4 and 7.5, is not necessary. The results can be generalized, at the expense of increasing the length and the burden of the technical development, to the case in which T is a d-dimensional manifold satisfying the so-called Whitney conditions ([4]).

The remainder of this section is devoted to the proofs of Proposition 7.2, Theorem 7.4, and Theorem 7.5.

7.1. Bias Control: Proofs of Proposition 7.2 and Theorem 7.4. We start with some useful lemmas, for all of which we assume that Conditions B1 and B2 are satisfied. We shall also assume that the global maximum of f over T is achieved, with probability one, at a single point in T. Additional conditions under which this will happen can be found in [4], and require little more than the non-degeneracy of the joint distribution of f and its first and second order derivatives. Of these lemmas, Lemma 7.7, the proof of which we defer to Section 7.3, is of central importance.

Lemma 7.7. For any a_0 > 0, there exist c*, δ*, b_0, and δ_0 (which depend on the choice of a_0) such that, for any s ∈ T, a ∈ (0, a_0), δ ∈ (0, δ_0), and b > b_0,

P( min_{|t−s| < δa/b} f(t) < b | f(s) = b + a/b, s ∈ L ) ≤ c* exp( −δ*/δ² ),

where L is the set of local maxima of f.

The next lemma is a direct corollary of Lemma 7.7.

Lemma 7.8. Let t* be the point in T at which the global maximum of f is attained. Then, with the same choice of constants as in Lemma 7.7, for any a ∈ (0, a_0), δ ∈ (0, δ_0), and b > b_0, we have

P( min_{|t−t*| < δa/b} f(t) < b | f(t*) = b + a/b ) ≤ 2c* exp( −δ*/δ² ).

Proof. Note that, for any s ∈ T,

P( min_{|t−s| < δa/b} f(t) < b | f(s) = b + a/b, s = t* )
  = P( min_{|t−s| < δa/b} f(t) < b, f(s) = b + a/b, s = t* ) / P( f(s) = b + a/b, s = t* )
  ≤ P( min_{|t−s| < δa/b} f(t) < b, f(s) = b + a/b, s ∈ L ) / P( f(s) = b + a/b, s = t* )
  = P( min_{|t−s| < δa/b} f(t) < b | f(s) = b + a/b, s ∈ L ) · P( f(s) = b + a/b, s ∈ L ) / P( f(s) = b + a/b, s = t* )
  ≤ c* exp( −δ*/δ² ) · P( f(s) = b + a/b, s ∈ L ) / P( f(s) = b + a/b, s = t* ).

Applying this bound, and the fact that

P( f(s) = b + a/b, s ∈ L ) / P( f(s) = b + a/b, s = t* ) → 1

as b → ∞, the conclusion of the lemma holds. □

The following lemma gives a bound on the density of sup_{t∈T} f(t), which will be used to control the size of the overshoot beyond level b.

Lemma 7.9. Let p_{f*}(x) be the density function of sup_{t∈T} f(t). Then there exist a constant c_{f*} and b_0 such that

p_{f*}(x) ≤ c_{f*} x^{d+1} P( f(0) > x )

for all x > b_0.

Proof. Recalling (1.3), let the continuous function p_E(x), x ∈ R, be defined by the relationship

E( χ({t ∈ T : f(t) ≥ b}) ) = ∫_b^∞ p_E(x) dx,

where the left hand side is the expected value of the Euler-Poincaré characteristic of A_b. Then, according to Theorem 8.10 in [7], there exist c and δ such that

| p_E(x) − p_{f*}(x) | < c P( f(0) > (1 + δ)x )

for all x > 0. In addition, thanks to the result of [4], which provides ∫_b^∞ p_E(x) dx in closed form, there exists c_0 such that, for all x > 1,

p_E(x) < c_0 x^{d+1} P( f(0) > x ).

Hence, there exists c_{f*} such that

p_{f*}(x) ≤ c_0 x^{d+1} P( f(0) > x ) + c P( f(0) > (1 + δ)x ) ≤ c_{f*} x^{d+1} P( f(0) > x )

for all x > 1. □

The last ingredients required for the proofs of Proposition 7.2 and Theorem 7.4 are stated in the following well known theorem, adapted from Lemma 6.1 and Theorem 7.2 in [19] to the twice differentiable case.

Theorem 7.10. There exists a constant H (depending on the covariance function C) such that

(7.4) P( sup_{t∈T} f(t) > b ) = (1 + o(1)) H m(T) b^d P( f(0) > b )

as b → ∞. Similarly, choose δ small enough so that [0, δ]^d ⊂ T, and let ∆_0 = [0, b^{−1}]^d. Then there exists a constant H_1 such that

(7.5) P( sup_{t∈∆_0} f(t) > b ) = (1 + o(1)) H_1 P( f(0) > b )

as b → ∞.

We are now ready to provide the proofs of Proposition 7.2 and Theorem 7.4.

Proof of Proposition 7.2. The fact that there exists c_1 such that

#(T_M) ≤ c_1 m(T) ε^{−d} b^d

is immediate from assumption B2. Therefore, we proceed to provide a bound for the relative bias. We choose ε_0 < δ_0² (δ_0 is given in Lemmas 7.7 and 7.8), and note that, for any ε < ε_0,

P( sup_{t∈T_M} f(t) < b | sup_{t∈T} f(t) > b )
  ≤ P( sup_{t∈T} f(t) < b + 2√ε/b | sup_{t∈T} f(t) > b ) + P( sup_{t∈T_M} f(t) < b | sup_{t∈T} f(t) > b + 2√ε/b ).

By (7.4) and Lemma 7.9, there exists c_2 such that the first term above can be bounded by

P( sup_{t∈T} f(t) < b + 2√ε/b | sup_{t∈T} f(t) > b ) ≤ c_2 √ε.

The second term can be bounded by

P( sup_{t∈T_M} f(t) < b | sup_{t∈T} f(t) > b + 2√ε/b )
  ≤ P( min_{|t−t*| < 2ε/b} f(t) < b | sup_{t∈T} f(t) > b + 2√ε/b )
  ≤ 2c* exp( −δ* ε^{−1} ).

Hence, there exists c_0 such that

P( sup_{t∈T_M} f(t) < b | sup_{t∈T} f(t) > b ) ≤ c_2 √ε + 2c* e^{−δ*/ε} ≤ c_0 √ε

for all ε ∈ (0, ε_0). □

Proof of Theorem 7.4. We write θ = 1 − 3δ ∈ (0, 1). First, note that, by (7.4),

P( sup_T f(t) > b + b^{2δ−1} | sup_T f(t) > b ) → 0

as b → ∞. Let t* be the position of the global maximum of f in T. According to the exact Slepian model in Section 7.3, and an argument similar to the proof of Lemma 7.8,

(7.6) P( sup_{|t−t*| > b^{2δ−1}} f(t) > b | b < f(t*) ≤ b + b^{2δ−1} ) → 0

as b → ∞. Consequently,

P( sup_{|t−t*| > b^{2δ−1}} f(t) > b | sup_T f(t) > b ) → 0.

Let

B(T_M, b^{2δ−1}) = ∪_{t∈T_M} B(t, b^{2δ−1}).

We have

P( sup_{T_M} f(t) > b | sup_T f(t) > b )
  ≤ P( sup_{|t−t*| > b^{2δ−1}} f(t) > b | sup_T f(t) > b )
    + P( sup_{T_M} f(t) > b, sup_{|t−t*| > b^{2δ−1}} f(t) ≤ b | sup_T f(t) > b )
  ≤ o(1) + P( t* ∈ B(T_M, b^{2δ−1}) | sup_T f(t) > b )
  ≤ o(1) + P( sup_{B(T_M, b^{2δ−1})} f(t) > b | sup_T f(t) > b ).

Since #(T_M) ≤ b^{(1−3δ)d}, we can find a finite set T′ = {t′_1, ..., t′_l} ⊂ T, with ∆_k = t′_k + [0, b^{−1}]^d, such that l = O( b^{(1−δ)d} ) and B(T_M, b^{2δ−1}) ⊂ ∪_{k=1}^l ∆_k. The choice of l only depends on #(T_M), not on the particular distribution of T_M. Therefore, applying (7.5),

sup_{#(T_M) ≤ b^{θd}} P( sup_{B(T_M, b^{2δ−1})} f(t) > b ) ≤ O( b^{(1−δ)d} ) P( f(0) > b ).

This, together with (7.4), yields

sup_{#(T_M) ≤ b^{θd}} P( sup_{B(T_M, b^{2δ−1})} f(t) > b | sup_T f(t) > b ) ≤ O( b^{−δd} ) = o(1)

for b ≥ b_0, which clearly implies the statement of the result. □

7.2. Variance Control: Proof of Theorem 7.5. We proceed directly to the proof of Theorem 7.5.

Proof of Theorem 7.5. Note that

E_Q L_b² / P²( sup_{t∈T} f(t) > b )
  = E(L_b) / P²( sup_{t∈T} f(t) > b )
  = E[ M P(X_1 > b − 1/b) / Σ_{j=1}^M 1( X_j > b − 1/b ) ; max_j X_j > b ] / P²( sup_{t∈T} f(t) > b )
  = E[ M P(X_1 > b − 1/b) / Σ_{j=1}^M 1( X_j > b − 1/b ) ; max_j X_j > b, sup_{t∈T} f(t) > b ] / P²( sup_{t∈T} f(t) > b )
  = E[ M P(X_1 > b − 1/b) 1( max_j X_j > b ) / Σ_{j=1}^M 1( X_j > b − 1/b ) | sup_{t∈T} f(t) > b ] / P( sup_{t∈T} f(t) > b )
  = E[ M 1( max_j X_j > b ) / Σ_{j=1}^M 1( X_j > b − 1/b ) | sup_{t∈T} f(t) > b ] · P( X_1 > b − 1/b ) / P( sup_{t∈T} f(t) > b ).

The remainder of the proof involves showing that the last conditional expectation here is of order O(b^d). This, together with (7.4), will yield the result. Note that, for any A(b, ε) such that A(b, ε) = Θ(M) uniformly over b and ε, we can write

(7.7) E[ M 1( max_j X_j > b ) / Σ_{j=1}^M 1( X_j > b − 1/b ) | sup_{t∈T} f(t) > b ]
  ≤ E[ M 1( Σ_{j=1}^M 1( X_j > b − 1/b ) ≥ M/A(b,ε) ) / Σ_{j=1}^M 1( X_j > b − 1/b ) | sup_{t∈T} f(t) > b ]
    + E[ M 1( 1 ≤ Σ_{j=1}^M 1( X_j > b − 1/b ) < M/A(b,ε) ) / Σ_{j=1}^M 1( X_j > b − 1/b ) | sup_{t∈T} f(t) > b ].

We shall select A(b, ε) appropriately in order to bound the expectations above. By Lemma 7.8, for any 4ε ≤ δ ≤ δ_0, there exist constants c′ and c′′ ∈ (0, ∞) such that

c* exp( −δ*/δ² ) ≥ P( min_{|t−t*| ≤ δ/b} f(t) < b − 1/b | sup_{t∈T} f(t) > b )
  ≥ P( Σ_{j=1}^M 1( X_j > b − 1/b ) ≤ c′ δ^d ε^{−d} | sup_{t∈T} f(t) > b )
  ≥ P( M / Σ_{j=1}^M 1( X_j > b − 1/b ) ≥ b^d c′′ / δ^d | sup_{t∈T} f(t) > b ).   (7.8)

The first inequality is an application of Lemma 7.8. The second inequality is due to the fact that, for any ball B ⊆ T of radius δ/b with δ ≥ 4ε, #(T_M ∩ B) ≥ c′ δ^d ε^{−d} for some c′ > 0. Inequality (7.8) implies that, for all x such that c′′/δ_0^d < x < c′′/[4^d ε^d], there exists δ** > 0 such that

P( M / ( b^d Σ_{j=1}^M 1( X_j > b − 1/b ) ) ≥ x | sup_{t∈T} f(t) > b ) ≤ c* exp( −δ** x^{2/d} ).

Now let A(b, ε) = b^d c′′/(4^d ε^d), and observe that, by the second result in (7.1), we have A(b, ε) = Θ(M), and, moreover, that there exists c_3 such that the first term on the right hand side of (7.7) is bounded by

(7.9) E[ M 1( Σ_{j=1}^M 1( X_j > b − 1/b ) ≥ M/A(b,ε) ) / Σ_{j=1}^M 1( X_j > b − 1/b ) | sup_T f(t) > b ] ≤ c_3 b^d.

Now we turn to the second term on the right hand side of (7.7). We use the fact that M/A(b, ε) ≤ c′′′ ∈ (0, ∞) (uniformly as b → ∞ and ε → 0). There exist c_4 and c_5 such that, for ε ≤ δ_0/c_4,

(7.10) E[ M 1( 1 ≤ Σ_{j=1}^M 1( X_j > b − 1/b ) < M/A(b,ε) ) / Σ_{j=1}^M 1( X_j > b − 1/b ) | sup_T f(t) > b ]
  ≤ M P( Σ_{j=1}^M 1( X_j > b − 1/b ) < c′′′ | sup_T f(t) > b )
  ≤ M P( min_{|t−t*| ≤ c_4 ε/b} f(t) < b − 1/b | sup_T f(t) > b )
  ≤ c_1 m(T) b^d ε^{−d} exp( −δ*/(c_4² ε²) ) ≤ c_5 b^d.

The second inequality holds from the fact that, if Σ_{j=1}^M 1( X_j > b − 1/b ) is less than c′′′, then the minimum of f in a ball around t* of radius c_4 ε/b must be less than b − 1/b; otherwise, there would be more than c′′′ elements of T_M inside such a ball. The last inequality is due to Lemma 7.8 and Proposition 7.2.

Putting (7.9) and (7.10) together, we obtain, for all ε/b-regular discretizations with ε < ε_0 = min(1/4, 1/c_4)δ_0,

E_Q L_b² / P²( sup_{t∈T} f(t) > b )
  ≤ E[ M 1( max_j X_j > b ) / Σ_{j=1}^M 1( X_j > b − 1/b ) | sup_{t∈T} f(t) > b ] · P( X_1 > b − 1/b ) / P( sup_{t∈T} f(t) > b )
  ≤ (c_3 + c_5) b^d P( X_1 > b − 1/b ) / P( sup_{t∈T} f(t) > b ).

Applying now (7.4), together with the bound

P( sup_{t∈T} f(t) > b ) < P( sup_{t∈T_M} f(t) > b ) / ( 1 − c_0 √ε )

from Proposition 7.2, we have

sup_{b>b_0, ε∈[0,ε_0]} E_Q L_b² / P²( sup_{t∈T_M} f(t) > b ) < ∞,

as required. □

7.3. A final proof: Lemma 7.7. Without loss of generality, we assume that the field has mean zero and unit variance. Before getting into the details of the proof of Lemma 7.7, we need a few additional lemmas, for which we adopt the following notation. Let C_i and C_ij be the first and second order derivatives of C, and define the vectors

µ_1(t) = ( −C_1(t), ..., −C_d(t) ),
µ_2(t) = vech( (C_ij(t), i = 1, ..., d, j = i, ..., d) ).

Let f′(0) and f″(0) be the gradient and the vector of second order derivatives of f at 0, where f″(0) is arranged in the same order as µ_2(0). Furthermore, let µ_02 = µ_20^T be a vector of second order spectral moments, and µ_22 a matrix of fourth order spectral moments. The vectors µ_02 and µ_22 are arranged so that

( 1      0    µ_02 )
( 0      Λ    0    )
( µ_20   0    µ_22 )

is the covariance matrix of ( f(0), f′(0), f″(0) ), where Λ = ( −C_ij(0) ). It then follows that

µ_{2·0} = µ_22 − µ_20 µ_02

is the conditional variance of f″(0) given f(0). The following lemma, given in [5], provides a stochastic representation of f given that it has a local maximum at level u at the origin.

Lemma 7.11. Given that f has a local maximum with height u at zero (an interior point of T), the conditional field is equal in distribution to

(7.11) f_u(t) := uC(t) − W_u β^T(t) + g(t),

where g(t) is a centered Gaussian random field with covariance function

γ(s, t) = C(s − t) − ( C(s), µ_2(s) ) ( 1, µ_02 ; µ_20, µ_22 )^{−1} ( C(t), µ_2^T(t) )^T − µ_1(s) Λ^{−1} µ_1^T(t),

and W_u is a d(d+1)/2-dimensional random vector, independent of g(t), with density function

(7.12) ψ_u(w) ∝ | det( r*(w) − uΛ ) | exp( −(1/2) w^T µ_{2·0}^{−1} w ) 1( r*(w) − uΛ ∈ N ),

where r*(w) is a d×d symmetric matrix whose upper triangular elements consist of the components of w. The set of negative definite matrices is denoted by N. Finally, β(t) is defined by

( α(t), β(t) ) = ( C(t), µ_2(t) ) ( 1, µ_02 ; µ_20, µ_22 )^{−1}.

The following two technical lemmas, which we shall prove after completing the proof of Lemma 7.7, provide bounds for the last two terms of (7.11).

Lemma 7.12. Using the notation in (7.11), there exist δ_0, ε_1, c_1, and b_0 such that, for any u > b > b_0 and δ ∈ (0, δ_0),

P( sup_{|t| ≤ δa/b} | W_u β^T(t) | > a/(4b) ) ≤ c_1 exp( −ε_1 b²/δ⁴ ).

Lemma 7.13. There exist c, δ̄, and δ_0 such that, for any δ ∈ (0, δ_0),

P( max_{|t| ≤ δa/b} | g(t) | > a/(4b) ) ≤ c exp( −δ̄/δ² ).

Proof of Lemma 7.7. Using the notation of Lemma 7.11, given any s ∈ L for which f(s) = b, the corresponding conditional distribution of f is that of f_b(· − s). Consequently, it suffices to show that

P( min_{|t| ≤ δa/b} f_b(t) < b − a/b ) ≤ c* exp( −δ*/δ² ).

We consider firstly the case for which the local maximum is in the interior of T. Then, by the Slepian model (7.11),

f_b(t) = bC(t) − W_b β^T(t) + g(t).

We study the three terms of the Slepian model individually. Since

C(t) = 1 − t^T Λ t + o( |t|² ),

there exists an ε_0 such that

bC(t) ≥ b − a/(4b)

for all |t| < ε_0 √a / b. According to Lemmas 7.12 and 7.13, for δ < min( ε_0/√a_0, δ_0 ),

P( min_{|t| ≤ δa/b} f_b(t) < b − a/b )
  ≤ P( max_{|t| < δa/b} | g(t) | > a/(4b) ) + P( sup_{|t| ≤ δa/b} | W_b β^T(t) | > a/(4b) )
  ≤ c exp( −δ̄/δ² ) + c_1 exp( −ε_1 b²/δ⁴ )
  ≤ c* exp( −δ*/δ² )

for some c* and δ*.

Now consider the case for which the local maximum is on the (d−1)-dimensional boundary of T. Due to convexity of T, and without loss of generality, we assume that the tangent space of ∂T is generated by ∂/∂t_2, ..., ∂/∂t_d, that the local maximum is located at the origin, and that T is a subset of the half-space {t_1 ≥ 0}. That these assumptions do not involve a loss of generality follows from the arguments on pages 291–292 of [4], which rely on the assumed stationarity of f (for translations) and the fact that rotations, while changing the distributions, will not affect the probabilities that we are currently computing.

For the origin, positioned as just described, to be a local maximum, it is necessary and sufficient that the gradient of f restricted to ∂T is the zero vector, the Hessian matrix restricted to ∂T is negative definite, and ∂_1 f(0) ≤ 0. Applying a version of Lemma 7.11 for this case, conditional on f(0) = u and 0 being a local maximum, the field is equal in distribution to

(7.13) uC(t) − W_u β^T(t) + µ_1(t) Λ^{−1} ( Z, 0, ..., 0 )^T + g(t),

where Z ≤ 0 corresponds to ∂_1 f(0), and follows a truncated (conditioned on the negative axis) Gaussian distribution with mean zero and a variance parameter which is computed as the conditional variance of ∂_1 f(0) given ( ∂_2 f(0), ..., ∂_d f(0) ). The vector W_u is a ((d−1)d/2)-dimensional random vector with density function

ψ_u(w) ∝ | det( r*(w) − uΛ ) | exp( −(1/2) w^T µ_{2·0}^{−1} w ) 1( r*(w) − uΛ ∈ N ),

where now Λ denotes the second spectral moment matrix of f restricted to ∂T, and r*(w) is the (d−1)×(d−1) symmetric matrix whose upper triangular elements consist of the components of w. In the representation (7.13), the vector W_u and Z are independent. As in the proof of Lemma 7.12, one can show that a similar bound holds, albeit with different constants. Thus, since µ_1(t) = O(t), there exist c″ and δ″ such that the third term in (7.13) can be bounded by

P[ max_{|t| ≤ δa/b} | µ_1(t) Λ^{−1} ( Z, 0, ..., 0 )^T | ≥ a/(4b) ] ≤ c″ exp( −δ″/δ² ).

Consequently, we can also find c* and δ* such that the conclusion holds, and we are done. □

We complete the paper with the proofs of Lemmas 7.12 and 7.13.

Proof of Lemma 7.12. It suffices to prove the lemma for the case a = 1. Since

f_u(0) = u = u − W_u β^T(0) + g(0),

and W_u and g are independent, β(0) = 0. Furthermore, since C′(t) = O(t) and µ′_2(t) = O(t), there exists a c_0 such that |β(t)| ≤ c_0 |t|². In addition, W_u has density function

ψ_u(w) ∝ | det( r*(w) − uΛ ) | exp( −(1/2) w^T µ_{2·0}^{−1} w ) 1( r*(w) − uΛ ∈ N ).

Note that det( r*(w) − uΛ ) is expressible as a polynomial in w and u, and there exist some ε_0 and c such that

| det( r*(w) − uΛ ) / det( −uΛ ) | ≤ c

if |w| ≤ ε_0 u. Hence, there exist ε_2, c_2 > 0 such that

ψ_u(w) ≤ ψ̄(w) := c_2 exp( −(1/2) ε_2 w^T µ_{2·0}^{−1} w )

for all u ≥ 1. The right hand side here is proportional to a multivariate Gaussian density. Thus,

P( |W_u| > x ) = ∫_{|w|>x} ψ_u(w) dw ≤ ∫_{|w|>x} ψ̄(w) dw = c_3 P( |W̄| > x ),

where W̄ is a multivariate Gaussian random vector with density function proportional to ψ̄. Therefore, by choosing ε_1 and c_1 appropriately, we have

P( sup_{|t| ≤ δ/b} | W_u β^T(t) | > 1/(4b) ) ≤ P( |W_u| > b/(4c_0 δ²) ) ≤ c_1 exp( −ε_1 b²/δ⁴ )

for all u ≥ b. □

Proof of Lemma 7.13. Once again, it suffices to prove the lemma for the case a = 1. Since

f_b(0) = b = b − W_b β^T(0) + g(0),

the covariance function ( γ(s, t) : s, t ∈ T ) of the centered field g satisfies γ(0, 0) = 0. It is also easy to check that

∂_s γ(s, t) = O( |s| + |t| ),   ∂_t γ(s, t) = O( |s| + |t| ).

Consequently, there exists a constant c_γ ∈ (0, ∞) for which

γ(s, t) ≤ c_γ( |s|² + |t|² ),   γ(s, s) ≤ c_γ |s|².

We need to control the tail probability of sup_{|t| ≤ δ/b} |g(t)|. For this it is useful to introduce the following scaling. Define

g_δ(t) = (b/δ) g( δt/b ).

Then sup_{|t| ≤ δ/b} g(t) ≥ 1/(4b) if and only if sup_{|t| ≤ 1} g_δ(t) ≥ 1/(4δ). Let

σ_δ(s, t) = E( g_δ(s) g_δ(t) ).

Then

sup_{|s| ≤ 1} σ_δ(s, s) ≤ c_γ.

Because γ(s, t) is at least twice differentiable, applying a Taylor expansion we easily see that the canonical metric d_g corresponding to g_δ (cf. Theorem 6.7) can be bounded as follows:

d²_g(s, t) = E( g_δ(s) − g_δ(t) )² = (b²/δ²)[ γ( δs/b, δs/b ) + γ( δt/b, δt/b ) − 2γ( δs/b, δt/b ) ] ≤ c |s − t|²

for some constant c ∈ (0, ∞). Therefore, the entropy of g_δ evaluated at r is bounded by Kr^{−d} for any r > 0, with an appropriate choice of K > 0. Therefore, for all δ < δ_0,

P( sup_{|t| ≤ δ/b} |g(t)| ≥ 1/(4b) ) = P( sup_{|t| ≤ 1} |g_δ(t)| ≥ 1/(4δ) ) ≤ c_d δ^{−d−η} exp( −1/(16 c_γ δ²) )

for some constant c_d and η > 0. The last inequality is a direct application of Theorem 4.1.1 of [4]. The conclusion of the lemma follows immediately by choosing c and δ̄ appropriately. □

REFERENCES

[1] R.J. Adler, K. Bartz, and S. Kou. Estimating curvatures in the Gaussian kinematic formula. In preparation.

[2] R.J. Adler, J. Blanchet, and J. Liu. Efficient simulation for tail probabilities of Gaussian random fields. In WSC '08: Proceedings of the 40th Conference on Winter Simulation, pages 328-336. Winter Simulation Conference, 2008.

[3] R.J. Adler, P. Muller, and B. Rozovskii, editors. Stochastic Modelling in Physical Oceanography. Birkhauser, Boston, 1996.

[4] R.J. Adler and J.E. Taylor. Random Fields and Geometry. Springer, New York, USA, 2007.

[5] R.J. Adler, J.E. Taylor, and K. Worsley. Applications of Random Fields and Geometry: Foundations and Case Studies. http://webee.technion.ac.il/people/adler/hrf.pdf.

[6] S. Asmussen and P. Glynn. Stochastic Simulation: Algorithms and Analysis. Springer-Verlag, New York, NY, USA, 2008.

[7] J.M. Azais and M. Wschebor. Level Sets and Extrema of Random Processes and Fields. Wiley, New York, NY, 2009.

[8] J.M. Bardeen, J.R. Bond, N. Kaiser, and A.S. Szalay. The statistics of peaks of Gaussian random fields. The Astrophysical Journal, 304:15-61, 1986.

[9] S.M. Berman. Sojourns and Extremes of Stochastic Processes. The Wadsworth & Brooks/Cole Statistics/Probability Series. Wadsworth & Brooks/Cole Advanced Books & Software, Pacific Grove, CA, 1992.

[10] J. Blanchet. Efficient importance sampling for binary contingency tables. Ann. of Appl. Prob., 19:949-982, 2009.

[11] C. Borell. The Brunn-Minkowski inequality in Gauss space. Invent. Math., 30:205-216, 1975.

[12] J. Bucklew. Introduction to Rare-Event Simulation. Springer-Verlag, New York, NY, 2004.

[13] M.R. Dennis. Nodal densities of planar Gaussian random waves. Eur. Phys. J., 145:191-210, 2007.

[14] R.M. Dudley. Sample functions of the Gaussian process. Ann. Probab., 1:66-103, 1973.

[15] K.J. Friston, K.J. Worsley, R.S.J. Frackowiak, J.C. Mazziotta, and A.C. Evans. Assessing the significance of focal activations using their spatial extent. Human Brain Mapping, 1:214-220, 1994.

[16] M.R. Leadbetter, G. Lindgren, and H. Rootzen. Extremes and Related Properties of Random Sequences and Processes. Springer-Verlag, New York, 1983.

[17] M. Mitzenmacher and E. Upfal. Probability and Computing: Randomized Algorithms and Probabilistic Analysis. Cambridge University Press, 2005.

[18] W. Niemiro and P. Pokarowski. Fixed precision MCMC estimation by median of products of averages. J. Appl. Probab., 46:309-329, 2009.

[19] V. Piterbarg. Asymptotic Methods in the Theory of Gaussian Processes. American Mathematical Society, USA, 1995.

[20] S.F. Shandarin. Testing non-Gaussianity in cosmic microwave background maps by morphological statistics. Mon. Not. R. Astr. Soc., 331:865-874, 2002.

[21] S.F. Shandarin, H.A. Feldman, Y. Xu, and M. Tegmark. Morphological measures of non-gaussianity in cosmic microwave background maps. Astrophys. J. Suppl., 141:1-11, 2002.

[22] L. Shepp and H. Landau. On the supremum of a Gaussian process. Sankhya A, 32:369-378, December 1970.

[23] J.E. Taylor, A. Takemura, and R.J. Adler. Validity of the expected Euler characteristic heuristic. Ann. Probab., 33(4):1362-1396, 2005.

[24] J.E. Taylor and K.J. Worsley. Detecting sparse signals in random fields, with an application to brain mapping. J. Amer. Statist. Assoc., 102(479):913-928, 2007.

[25] J. Traub, G. Wasilkowski, and H. Wozniakowski. Information-Based Complexity. Academic Press, New York, NY, 1988.

[26] B.S. Tsirelson, I.A. Ibragimov, and V.N. Sudakov. Norms of Gaussian sample functions. Proceedings of the Third Japan-USSR Symposium on Probability Theory (Tashkent, 1975), 550:20-41, 1976.

[27] V.S. Tsirel'son. The density of the maximum of a Gaussian process. Theory of Probab. Appl., 20:847-856, 1975.

[28] H. Wozniakowski. Computational complexity of continuous problems. Technical Report, 1996.

Robert J. Adler
Electrical Engineering
Technion, Haifa, Israel 32000
E-mail: robert@ee.technion.ac.il
URL: webee.technion.ac.il/people/adler

Jose H. Blanchet
Industrial Engineering & Operations Research
Columbia University
340 S.W. Mudd Building, 500 W. 120 Street
New York, New York 10027
E-mail: jose.blanchet@columbia.edu
URL: www.ieor.columbia.edu/fac-bios/blanchet/faculty.html

Jingchen Liu
Department of Statistics
Columbia University
1255 Amsterdam Avenue, Room 1030
New York, NY 10027
E-mail: jcliu@stat.columbia.edu
URL: stat.columbia.edu/~jcliu

Page 23: EFFICIENT MONTE CARLO FOR HIGH EXCURSIONS OF … › ~jblanche › papers › GRF1.pdf · 1. Introduction. This paper centers on the design and analysis of e–cient Monte Carlo techniques

MONTE CARLO FOR RANDOM FIELDS 23

Now define Y (ba) = (ba)2dβλminusdβ2 W with λ2 as chosen in Proposition 610

and W with distribution given by P (W gt x) = exp(minusxβd) By Lemma 611

suptisinT

E(exp

(minusMm(Aγab

)m (T )

)

m (Aγab)

m(Aγab

) isin (0 α) m(Ab) gt 0∣∣∣

f (t) = γab +z

γab

)

le E[Y (ba) exp(minusMY (ba)minus1m(T ))Y (ba) gt αminus1

]

Now let Z be exponentially distributed with mean 1 and independent of Y (ba)Then we have (using the definition of the tail distribution of Z and Chebyshevrsquosinequality)

exp(minusMY (ba)minus1m (T )

)= P

(Z gt MY (ba)minus1m (T )

∣∣ Y (ba))

le m (T )Y (ba)M

Therefore

E[exp(minusMY (ba)minus1m (T )

)Y (ba) Y (ba) ge αminus1]

le m (T )M

E[Y (ba)2 Y (ba) ge αminus1]

le m (T )

Mλ2dβ2

(b

a

)4dβ

E(W 2)

which completes the proof 2

65 Proof of Proposition 65 We start with the following result of Tsirelson[27]

Theorem 612 Let f be a continuous separable Gaussian process on a compact(in the canonical metric) domain T Suppose that V ar (f) = σ is continuous andthat σ (t) gt 0 for t isin T Moreover assume that micro = Ef is also continuous andmicro(t) ge 0 for all t isin T Define

σ2T

∆= maxtisinT

Var(f(t))

and set F (x) = PmaxtisinT f(t) le x Then F is continuously differentiable on RFurthermore let y be such that F (y) gt 12 and define ylowast by

F (y) = Φ(ylowast)

Then for all x gt y

F prime(x) le Ψ(

xylowasty

)(xylowasty

(1 + 2α) + 1)

(1 + α)

24 ADLER BLANCHET LIU

where

α =y2

x(xminus y)y2lowast

We can now prove the following lemma

Lemma 613 There exists a constant A isin (0infin) independent of a and b ge 0such that

(615) P

(suptisinT

f(t) le b + ab | suptisinT

f(t) gt b

)le aA

P (suptisinT f (t) ge bminus 1b)P (suptisinT f (t) ge b)

Proof By subtracting inftisinT micro (t) gt minusinfin and redefining the level b to be b minusinftisinT micro (t) we may simply assume that Ef (t) ge 0 so that we can apply Theorem612 Adopting the notation of Theorem 612 first we pick b0 large enough so thatF (b0) gt 12 and assume that b ge b0 + 1 Now let y = bminus 1b and F (y) = Φ(ylowast)Note that there exists δ0 isin (0infin) such that δ0b le ylowast le δminus1

0 b for all b ge b0 Thisfollows easily from the fact that

logP

suptisinT

f(t) gt x

sim log sup

tisinTP f(t) gt x sim minus x2

2σ2T

On the other hand by Theorem 612 F is continuously differentiable and so

(616) P

suptisinT

f(t) lt b + ab | suptisinT

f(t) gt b

=

int b+ab

bF prime(x) dx

P suptisinT f(t) gt b

Moreover

F prime(x) le(

1minus Φ(

xylowasty

)) (xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x))

le (1minus Φ(ylowast))(

xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x))

= P

(maxtisinT

f (t) gt bminus 1b

)(xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x))

Thereforeint b+ab

b

F prime(x) dx

le P

(suptisinT

f (t) gt bminus 1b

) int b+ab

b

(xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x)) dx

Recall that α (x) = y2[x(xminus y)y2lowast] we can use the fact that ylowast ge δ0b to conclude

that if x isin [b b + ab] then α (x) le δminus20 and therefore

int b+ab

b

(xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x)) dx le 4δminus80 a

MONTE CARLO FOR RANDOM FIELDS 25

We thus obtain that

(617)

int b+ab

bF prime(x) dx

P suptisinT f(t) gt b le 4aδminus80

P suptisinT f(t) gt bminus 1bP suptisinT f(t) gt b

for any b ge b0 This inequality together with the fact that F is continuously differ-entiable on (minusinfininfin) yields the proof of the lemma for b ge 0 2

The previous result translates a question that involves the conditional distri-bution of maxtisinT f (t) near b into a question involving the tail distribution ofmaxtisinT f (t) The next result then provides a bound on this tail distribution

Lemma 614 For each v gt 0 there exists a constant C(v) isin (0infin) (possiblydepending on v gt 0 but otherwise independent of b) so that such that

P

(maxtisinT

f (t) gt b

)le C(v)b2dβ+dv+1 max

tisinTP (f (t) gt b)

for all b ge 1

Proof The proof of this result follows along the same lines of Theorem 262 in[5] Consider an open cover of T = cupM

i=1Ti(θ) where Ti(θ) = s |s minus ti| lt θWe choose ti carefully such that N(θ) = O(θminusd) for θ arbitrarily small Writef (t) = g (t)+micro (t) where g (t) is a centered Gaussian random field and note usingA2 and A3 that

P

(max

tisinTi(θ)f (t) gt b

)le P

(max

tisinTi(θ)g (t) gt bminus micro (ti)minus κHθβ

)

Now we wish to apply the Borel-TIS inequality (Theorem 68) with U = Ti (θ) f0 =g d (s t) = E12([g (t)minus g (s)]2) which as a consequence of A2 and A3 is boundedabove by C0 |tminus s|β2 for some C0 isin (0infin) Thus applying Theorem 67 we havethat E maxtisinTi(θ) g (t) le C1θ

β2 log(1θ) for some C1 isin (0infin) Consequently theBorel-TIS inequality yields that there exists C2 (v) isin (0infin) such that for all bsufficiently large and θ sufficiently small we have

P

(max

tisinTi(θ)g (t) gt bminus micro (ti)minus κHθβ

)le C2 (v) exp

(minus

(bminus micro (ti)minus C1θ

β(2+βv))2

2σ2Ti

)

where σTi = maxtisinTi(θ) σ (t) Now select v gt 0 small enough and set θβ(2+βv) =bminus1 Straightforward calculations yield that

P

(max

tisinTi(θ)f (t) gt b

)le P

(max

tisinTi(θ)g (t) gt bminus micro (ti)minus κHθβ

)

le C3 (v) maxtisinTi(θ)

exp

(minus (bminus micro (t))2

2σ (t)2

)

26 ADLER BLANCHET LIU

for some C3(v) isin (0infin) Now recall the well known inequality (valid for x gt 0)that

φ (x)(

1xminus 1

x3

)le 1minus Φ(x) le φ (x)

x

where φ = Φprime is the standard Gaussian density Using this inequality it follows thatC4 (v) isin (0infin) can be chosen so that

maxtisinTi(θ)

exp

(minus (bminus micro (t))2

2σ (t)2

)le C4 (v) bmax

tisinTi

P (f (t) gt b)

for all b ge 1 We then conclude that there exists C (v) isin (0infin) such that

P

(maxtisinT

f (t) gt b

)le N (θ)C4 (v) bmax

tisinTP (f (t) gt b)

le CθminusdbmaxtisinT

P (f (t) gt b) = Cb2dβ+dv+1 maxtisinT

P (f (t) gt b)

giving the result 2

We can now complete the proof of Proposition 65

Proof of Proposition 65 The result is a straightforward corollary of the previoustwo lemmas By (615) in Lemma 613 and Lemma 614 there exists λ isin (0infin) forwhich

P

(maxtisinT

f(t) le b + ab | maxtisinT

f(t) gt b

)

le aAP (maxtisinT f (t) ge bminus 1b)

P (maxtisinT f (t) ge b)

le aCAb2dβ+dv+1 maxtisinT P (f (t) gt bminus 1b)P (maxtisinT f (t) ge b)

le aCAb2dβ+dv+1 maxtisinT P (f (t) gt bminus 1b)maxtisinT P (f (t) gt b)

le aλb2dβ+dv+1

The last two inequalities follow from the obvious bound

P

(maxtisinT

f (t) ge b

)ge max

tisinTP (f (t) gt b)

and standard properties of the Gaussian distribution This yields (612) from whichthe remainder of the proposition follows 2

MONTE CARLO FOR RANDOM FIELDS 27

7 Fine Tuning Twice Differentiable Homogeneous Fields In the pre-ceeding section we constructed a polynomial time algorithm based on a randomizeddiscretization scheme Our goal in this section is to illustrate how to take advantageof additional information to further improve the running time and the efficiency ofthe algorithm In order to illustrate our techniques we shall perform a more refinedanalysis in the setting of smooth and homogeneous fields and shall establish op-timality of the algorithm in a precise sense to described below Our assumptionsthroughout this section are B1ndashB2 of Section 3

Let C(sminus t) = Cov(f(s) f(t)) be the covariance function of f which we assumealso has mean zero Note that it is an immediate consequence of homogeneity anddifferentiability that partiC(0) = part3

ijkC(0) = 0We shall need the following definition

Definition 71 We call T = t1 tM sub T a θ-regular discretization of Tif and only if

mini 6=j

|ti minus tj | ge θ suptisinT

mini|ti minus t| le 2θ

Regularity ensures that points in the grid T are well separated Intuitively sincef is smooth having tight clusters of points translates to a waste of computingresources as a result of sampling highly correlated values of f Also note thatevery region containing a ball of radius 2θ has at least one representative in T Therefore T covers the domain T in an economical way One technical convenienceof θ-regularity is that for subsets A sube T that have positive Lebesgue measure (inparticular ellipsoids)

limMrarrinfin

(A cap T )M

=m(A)m(T )

where here and throughout the remainder of the section (A) denotes the cardi-nality of the set A

Let T = t1 tM be a θ-regular discretization of T and consider

X = (X1 XM )T ∆= (f (t1) f (tM ))T

We shall concentrate on estimating wM (b) = P (max1leileM Xi gt b) The next result(which we prove in Section 71) shows that if θ = εb then the relative bias is O (

radicε)

Proposition 72 Suppose f is a Gaussian random field satisfying conditionsB1 and B2 There exist c0 c1 b0 and ε0 such that for any finite εb-regulardiscretization T of T

(71) P (suptisinT

f(t) lt b| suptisinT

f(t) gt b) le c0

radicε and (T ) le c1m(T )εminusdbd

for all ε isin (0 ε0] and b gt b0

Note that the bound on the bias obtained for twice differentiable fields is muchsharper than that of the general Holder continuous fields given by (613) in Theorem

28 ADLER BLANCHET LIU

66 This is partly because the conditional distribution of the random field aroundlocal maxima is harder to describe in the Holder continuous than in the case oftwice differentiable fields In addition to the sharper description of the bias weshall also soon show in Theorem 74 that our choice of discretization is optimal in acetain sense Finally we point out that the bound of

radicε in the first term of (71) is

not optimal In fact there seems to be some room of improvement and we believethat a more careful analysis might yield a bound of the form c0ε

2We shall estimate wM (b) by using a slight variation of Algorithm 53 In partic-

ular since the Xirsquos are now identically distributed we redefine Q to be

(72) Q (X isin B) =Msum

i=1

1M

P [X isin B|Xi gt bminus 1b]

Our estimator then takes the form

(73) Lb =M times P (X1 gt bminus 1b)sumM

j=1 1 (Xj gt bminus 1b)1( max

1leileMXi gt b)

Clearly we have that EQ(Lb) = wM (b) (The reason for for subtracting the fcatorof 1b was explained in Section 61)

Algorithm 73 Given a number of replications n and an εb-regular dis-cretization T the algorithm is as follows

Step 1 Sample X(1) X(n) iid copies of X with distribution Q given by (72)Step 2 Compute and output

Ln =1n

nsum

i=1

L(i)b

where

L(i)b =

M times P (X1 gt bminus 1b)sumM

j=1 1(X(i)j gt bminus 1b)

1( max1leileM

X(i)j gt b)

Theorem 75 later guides the selection of n in order to achieve a prescribedrelative error In particular our analysis together with considerations from Section2 implies that choosing n = O

(εminus2δminus1

)suffices to achieve ε relative error with

probability at least 1minus δAlgorithm 73 improves on Algorithm 61 for Holder continuous fields in two

important ways The first aspect is that it is possible to obtain information on thesize of the relative bias of the estimator In Proposition 72 we saw that in orderto overcome bias due to discretization it suffices to take a discretization of sizeM = (T ) = Θ(bd) That this selection is also asymptotically optimal in the sensedescribed in the next result will be proven in Section 71

MONTE CARLO FOR RANDOM FIELDS 29

Theorem 74 Suppose f is a Gaussian random field satisfying conditions B1and B2 If θ isin (0 1) then as b rarrinfin

sup(T )lebθd

P (suptisinT

f(t) gt b| suptisinT

f(t) gt b) rarr 0

This result implies that the relative bias goes to 100 as b rarrinfin if one chooses adiscretization scheme of size O

(bθd

)with θ isin (0 1) Consequently d is the smallest

power of b that achieves any given bounded relative bias and so the suggestionabove of choosing M = O

(bd

)points for the discretization is in this sense optimal

The second aspect of improvement involves the variance In the case of Holdercontinuous fields the ratio of the second moment of the estimator and w (b)2 wasshown to be bounded by a quantity that is of order O(M2) In contrast in thecontext of smooth and homogeneous fields considered here the next result showsthat this ratio is bounded uniformly for b gt b0 and M = (T ) ge cbd That is thevariance remains strongly controlled

Theorem 75 Suppose f is a Gaussian random field satisfying conditions B1and B2 Then there exist constants c b0 and ε0 such that for any εb-regulardiscretization T of T we have

supbgtb0εisin[0ε0]

EQL2b

P 2 (suptisinT f (t) gt b)le sup

bgtb0εisin[0ε0]

EQL2b

P 2(sup

tisinTf (t) gt b

) le c

for some c isin (0infin)

The proof of this result is given in Section 72 The fact that the number ofreplications remains bounded in b is a consequence of the strong control on thevariance

Finally we note that the proof of Theorem 33 follows as a direct corollary of The-orem 75 together with Proposition 72 and our discussion in Section 2 Assumingthat placing each point in T takes no more than c units of computer time the totalcomplexity is according to the discussion in section Section 2 O

(nM3 + M

)=

O(εminus2δminus1M3 + M

) The contribution of the term M3 = O

(εminus6b3

)comes from

the complexity of applying Cholesky factorization and the term M = O(εminus2b)corresponds to the complexity of placing T

Remark 76 Condition B2 imposes a convexity assumption on the boundaryof T This assumption although convenient in the development of the proofs of The-orems 74 and 75 is not necessary The results can be generalized at the expenseof increasing the length and the burden in the technical development to the casein which T is a d-dimensional manifold satisfying the so-called Whitney conditions([4])

The remainder of this section is devoted to the proof of Proposition 72 Theorem74 and Theorem 75

30 ADLER BLANCHET LIU

71 Bias Control Proofs of Proposition 72 and Theorem 74 We start withsome useful lemmas for all of which we assume that Conditions B1 and B2 are sat-isfied We shall also assume that the global maximum of f over T is achieved withprobability one at a single point in T Additional conditions under which this willhappen can be found in [4] and require little more than the non-degeneracy of thejoint distribution of f and its first and second order derivatives Of these lemmasLemma 77 the proof of which we defer to Section 73 is of central importance

Lemma 77 For any a0 gt 0 there exists clowast δlowast b0 and δ0 (which depend onthe choice of a0) such that for any s isin T a isin (0 a0) δ isin (0 δ0) and b gt b0

P ( min|tminuss|ltδabminus1

f(t) lt b|f(s) = b + ab s isin L) le clowast exp(minusδlowast

δ2

)

where L is the set of local maxima of f

The next lemma is a direct corollary of Lemma 77

Lemma 78 Let tlowast be the point in T at which the global maximum of f isattained Then with the same choice of constants as in Lemma 77 for any a isin(0 a0) δ isin (0 δ0) and b gt b0 we have

P ( min|tminustlowast|ltδabminus1

f(t) lt b|f(tlowast) = b + ab) le 2clowast exp(minusδlowast

δ2

)

Proof Note that for any s isin T

P

(min

|t|leδabf(t) lt b|f(s) = b + ab s = tlowast

)

=P

(min|t|leδab f(t) lt b f(s) = b + ab s = tlowast

)

P (f(s) = b + ab s = tlowast)

le P(min|t|leδab f(t) lt b f(s) = b + ab s isin L)

P (f(s) = b + ab s = tlowast)

= P ( min|tminuss|ltδabminus1

f(t) lt b|f(s) = b + ab s isin L)P (f(s) = b + ab s isin L)P (f(s) = b + ab s = tlowast)

le clowast exp(minusδlowast

δ2

)P (f(s) = b + ab s isin L)P (f(s) = b + ab s = tlowast)

Appling this bound and the fact that

P (f(s) = b + ab s isin L)P (f(s) = b + ab s = tlowast)

rarr 1

as b rarrinfin the conclusion of the Lemma holds 2

MONTE CARLO FOR RANDOM FIELDS 31

The following lemma gives a bound on the density of suptisinT f(t) which will beused to control the size of overshoot beyond level b

Lemma 79 Let pflowast(x) be the density function of suptisinT f(t) Then there existsa constant cflowast and b0 such that

pflowast(x) le cflowastxd+1P (f(0) gt x)

for all x gt b0

Proof Recalling (13) let the continuous function pE(x) x isin R be defined bythe relationship

E(χ(t isin T f(t) ge b)) =int infin

b

pE(x) dx

where the left hand side is the expected value of the Euler-Poincare characteristicof Ab Then according to Theorem 810 in [7] there exists c and δ such that

|pE(x)minus pflowast(x)| lt cP (f(0) gt (1 + δ)x)

for all x gt 0 In addition thanks to the result of [4] which providesintinfin

bpE(x)dx in

closed form there exists c0 such that for all x gt 1

pE(x) lt c0xd+1P (f(0) gt x)

Hence there exists cflowast such that

pflowast(x) le c0xd+1P (f(0) gt x) + cP (f(0) gt (1 + δ)x) le cflowastx

d+1P (f(0) gt x)

for all x gt 1 2

The last ingredients required to provide the proof of Proposition 72 and Theorem74 are stated in the following well known Theorem adapted from Lemma 61 andTheorem 72 in [19] to the twice differentiable case

Theorem 710 There exists a constant H (depending on the covariance func-tion C such that

P (suptisinT

f(t) gt b) = (1 + o (1))Hm(T )bdP (f(0) gt b)(74)

as b rarrinfinSimilarly choose δ small enough so that [0 δ]d sub T and let ∆0 = [0 bminus1]d Then

there exists a constant H1 such that

P ( suptisin∆0

f(t) gt b) = (1 + o(1))H1P (f(0) gt b)(75)

as b rarrinfin

32 ADLER BLANCHET LIU

We now are ready to provide the proof of Proposition 72 and Theorem 74

Proof of Proposition 72 The fact that there exists c1 such that

(T ) le c1m(T )εminusdbd

is immediate from assumption B2 Therefore we proceed to provide a bound forthe relative bias We choose ε0 lt δ2

0 (δ0 is given in Lemmas 77 and 78) and notethat for any ε lt ε0

P (suptisinT

f(t) lt b| suptisinT

f(t) gt b)

le P (suptisinT

f(t) lt b + 2radic

εb| suptisinT

f(t) gt b) + P (suptisinT

f(t) lt b| suptisinT

f(t) gt b + 2radic

εb)

By (74) and Lemma 79 there exists c2 such that the first term above can boundedby

P (suptisinT

f(t) lt b + 2radic

εb| suptisinT

f(t) gt b) le c2

radicε

The second term can be bounded by

P (suptisinT

f(t) lt b| suptisinT

f(t) gt b + 2radic

εb)

le P ( sup|tminustlowast|lt2εbminus1

f(t) lt b| suptisinT

f(t) gt b + 2radic

ε)

le 2clowast exp(minusδlowastεminus1

)

Hence there exists c0 such that

P (suptisinT

f(t) lt b| suptisinT

f(t) gt b) le c2

radicε + 2clowasteminusδlowastε le c0

radicε

for all ε isin (0 ε0) 2

Proof of Theorem 74 We write θ = 1minus 3δ isin (0 1) First note that by (74)

P (supT

f(t) gt b + b2δminus1| supT

f(t) gt b) rarr 0

as b rarrinfin Let tlowast be the position of the global maximum of f in T According to theexact Slepian model in Section 73 and an argument similar to the proof of Lemma78

(76) P ( sup|tminustlowast|gtb2δminus1

f(t) gt b|b lt f(tlowast) le b + b2δminus1) rarr 0

as b rarrinfin Consequently

P ( sup|tminustlowast|gtb2δminus1

f(t) gt b| supT

f(t) gt b) rarr 0

MONTE CARLO FOR RANDOM FIELDS 33

LetB(T b2δminus1) = cup

tisinTB(t b2δminus1)

We have

P (supT

f(t) gt b| supT

f(t) gt b)

le P ( sup|tminustlowast|gtb2δminus1

f(t) gt b| supT

f(t) gt b)

+ P (supT

f(t) gt b sup|tminustlowast|gtb2δminus1

f(t) le b| supT

f(t) gt b)

le o(1) + P (tlowast isin B(T b2δminus1)| supT

f(t) gt b)

le o(1) + P ( supB(T b2δminus1)

f(t) gt b| supT

f(t) gt b)

Since $\#(\mathcal{T}) \le b^{(1 - 3\delta)d}$, we can find a finite set $T' = \{t'_1, \dots, t'_l\} \subset T$, with $\Delta_k = t'_k + [0, b^{-1}]^d$, such that $l = O(b^{(1 - \delta)d})$ and $B(\mathcal{T}, b^{2\delta - 1}) \subset \cup_{k=1}^l \Delta_k$. The choice of $l$ depends only on $\#(\mathcal{T})$, not on the particular positions of the points of $\mathcal{T}$. Therefore, applying (7.5),
\[
\sup_{\#(\mathcal{T}) \le b^{\theta d}} P\Big(\sup_{B(\mathcal{T}, b^{2\delta - 1})} f(t) > b\Big) \le O\big(b^{(1 - \delta)d}\big)\, P(f(0) > b).
\]

This, together with (7.4), yields
\[
\sup_{\#(\mathcal{T}) \le b^{\theta d}} P\Big(\sup_{B(\mathcal{T}, b^{2\delta - 1})} f(t) > b \,\Big|\, \sup_T f(t) > b\Big) \le O(b^{-\delta d}) = o(1)
\]
for $b \ge b_0$, which clearly implies the statement of the result. □

7.2. Variance Control: Proof of Theorem 7.5. We proceed directly to the proof of Theorem 7.5.
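Before turning to the proof, it may help to have the objects $Q$ and $L_b$ from (7.2)–(7.3) in executable form. The following sketch is our illustration only (the covariance, grid size, level, jitter, and sample size are assumptions made for the example, not prescriptions from the paper); it samples $X$ from the mixture $Q$ by first drawing a uniform index $i$, then $X_i$ conditioned to exceed $b - 1/b$, and then the remaining coordinates from their conditional Gaussian law:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)

# Discretization {t_1,...,t_M} of T = [0,1]; X = (f(t_1),...,f(t_M)).
M, b = 50, 3.0
t = np.linspace(0.0, 1.0, M)
Sigma = np.exp(-0.5 * (t[:, None] - t[None, :])**2)    # C(s-t) = exp(-(s-t)^2/2)
a = b - 1.0 / b
p = 1.0 - norm.cdf(a)                                  # P(X_1 > b - 1/b)

# For each mixture component i: X | X_i = x is Gaussian with mean x * Sigma[:, i]
# (unit variances) and covariance Sigma - Sigma[:, i] Sigma[i, :].
chols = [np.linalg.cholesky(Sigma - np.outer(Sigma[:, i], Sigma[:, i])
                            + 1e-10 * np.eye(M)) for i in range(M)]

def sample_Q():
    """One draw of X under Q(.) = (1/M) sum_i P(. | X_i > b - 1/b), as in (7.2)."""
    i = rng.integers(M)
    xi = norm.ppf(norm.cdf(a) + rng.random() * p)      # X_i | X_i > b - 1/b
    x = xi * Sigma[:, i] + chols[i] @ rng.standard_normal(M)
    x[i] = xi                                          # i-th coordinate is exact
    return x

n, acc = 10_000, 0.0
for _ in range(n):
    x = sample_Q()
    k = np.count_nonzero(x > a)                        # >= 1 by construction under Q
    acc += M * p / k * (x.max() > b)                   # L_b of (7.3)
print(f"importance sampling estimate of w_M(b): {acc / n:.3e}")
```

Averaging $n$ independent copies of $L_b$ gives the estimator $L_n$ of Algorithm 7.3; the point of Theorem 7.5 is that the second moment of $L_b$ under $Q$ stays comparable to $w(b)^2$, so the number of replications needed for a fixed relative error remains bounded as $b$ grows.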


Proof of Theorem 7.5. Note that
\begin{align*}
\frac{E_Q L_b^2}{P^2\big(\sup_{t\in T} f(t) > b\big)}
&= \frac{E(L_b)}{P^2\big(\sup_{t\in T} f(t) > b\big)} \\
&= \frac{E\Big(\frac{M\, P(X_1 > b - 1/b)}{\sum_{j=1}^M \mathbf{1}(X_j > b - 1/b)};\ \max_j X_j > b\Big)}{P^2\big(\sup_{t\in T} f(t) > b\big)} \\
&= \frac{E\Big(\frac{M\, P(X_1 > b - 1/b)}{\sum_{j=1}^M \mathbf{1}(X_j > b - 1/b)};\ \max_j X_j > b,\ \sup_{t\in T} f(t) > b\Big)}{P^2\big(\sup_{t\in T} f(t) > b\big)} \\
&= \frac{E\Big(\frac{M\, P(X_1 > b - 1/b)\, \mathbf{1}(\max_j X_j > b)}{\sum_{j=1}^M \mathbf{1}(X_j > b - 1/b)} \,\Big|\, \sup_{t\in T} f(t) > b\Big)}{P\big(\sup_{t\in T} f(t) > b\big)} \\
&= E\Big(\frac{M\, \mathbf{1}(\max_j X_j > b)}{\sum_{j=1}^M \mathbf{1}(X_j > b - 1/b)} \,\Big|\, \sup_{t\in T} f(t) > b\Big)\, \frac{P(X_1 > b - 1/b)}{P\big(\sup_{t\in T} f(t) > b\big)},
\end{align*}
where $E(Y; A)$ denotes $E(Y\,\mathbf{1}_A)$. (The first equality holds because $L_b = (dP/dQ)\,\mathbf{1}(\max_j X_j > b)$, so that $L_b^2\, dQ = L_b\,\mathbf{1}(\max_j X_j > b)\, dP$; the event $\{\sup_{t\in T} f(t) > b\}$ may be inserted in the third line since it is implied by $\max_j X_j > b$.)

The remainder of the proof involves showing that the last conditional expectation here is of order $O(b^d)$. This, together with (7.4), will yield the result. Note that, for any $A(b,\varepsilon)$ such that $A(b,\varepsilon) = \Theta(M)$ uniformly over $b$ and $\varepsilon$, we can write

\begin{align}
&E\Big(\frac{M\, \mathbf{1}(\max_j X_j > b)}{\sum_{j=1}^M \mathbf{1}(X_j > b - 1/b)} \,\Big|\, \sup_{t\in T} f(t) > b\Big) \tag{7.7} \\
&\quad\le E\Bigg(\frac{M\, \mathbf{1}\big(\sum_{j=1}^M \mathbf{1}(X_j > b - 1/b) \ge M/A(b,\varepsilon)\big)}{\sum_{j=1}^M \mathbf{1}(X_j > b - 1/b)} \,\Bigg|\, \sup_{t\in T} f(t) > b\Bigg) \notag \\
&\qquad + E\Bigg(\frac{M\, \mathbf{1}\big(1 \le \sum_{j=1}^M \mathbf{1}(X_j > b - 1/b) < M/A(b,\varepsilon)\big)}{\sum_{j=1}^M \mathbf{1}(X_j > b - 1/b)} \,\Bigg|\, \sup_{t\in T} f(t) > b\Bigg). \notag
\end{align}

We shall select $A(b,\varepsilon)$ appropriately in order to bound the expectations above. By Lemma 7.8, for any $4\varepsilon \le \delta \le \delta_0$, there exist constants $c'$ and $c'' \in (0,\infty)$ such that
\begin{align}
c^* \exp\Big(-\frac{\delta^*}{\delta^2}\Big)
&\ge P\Big(\min_{|t - t^*| \le \delta/b} f(t) < b - 1/b \,\Big|\, \sup_{t\in T} f(t) > b\Big) \notag \\
&\ge P\Big(\sum_{j=1}^M \mathbf{1}(X_j > b - 1/b) \le c'\delta^d\varepsilon^{-d} \,\Big|\, \sup_{t\in T} f(t) > b\Big) \notag \\
&\ge P\Bigg(\frac{M}{\sum_{j=1}^M \mathbf{1}(X_j > b - 1/b)} \ge \frac{b^d c''}{\delta^d} \,\Bigg|\, \sup_{t\in T} f(t) > b\Bigg). \tag{7.8}
\end{align}


The first inequality is an application of Lemma 7.8. The second inequality is due to the fact that, for any ball $B$ of radius $\delta/b$ with $\delta \ge 4\varepsilon$, $\#(\mathcal{T} \cap B) \ge c'\delta^d\varepsilon^{-d}$ for some $c' > 0$. Inequality (7.8) implies that, for all $x$ such that $c''/\delta_0^d \le x \le c''/(4^d\varepsilon^d)$, there exists $\delta^{**} > 0$ such that
\[
P\Bigg(\frac{M}{b^d \sum_{j=1}^M \mathbf{1}(X_j > b - 1/b)} \ge x \,\Bigg|\, \sup_{t\in T} f(t) > b\Bigg) \le c^* \exp\big(-\delta^{**} x^{2/d}\big);
\]
indeed, taking $\delta = (c''/x)^{1/d}$ in (7.8) gives the bound $c^*\exp(-\delta^*(x/c'')^{2/d})$.

Now let $A(b,\varepsilon) = b^d c''/(4^d\varepsilon^d)$ and observe that, by the second result in (7.1), we have $A(b,\varepsilon) = \Theta(M)$, and moreover that there exists $c_3$ such that the first term on the right hand side of (7.7) is bounded by
\[
E\Bigg(\frac{M\, \mathbf{1}\big(\sum_{j=1}^M \mathbf{1}(X_j > b - 1/b) \ge M/A(b,\varepsilon)\big)}{\sum_{j=1}^M \mathbf{1}(X_j > b - 1/b)} \,\Bigg|\, \sup_T f(t) > b\Bigg) \le c_3 b^d. \tag{7.9}
\]

Now we turn to the second term on the right hand side of (7.7). We use the fact that $M/A(b,\varepsilon) \le c''' \in (0,\infty)$ (uniformly as $b\to\infty$ and $\varepsilon\to 0$). There exist $c_4$ and $c_5$ such that, for $\varepsilon \le \delta_0/c_4$,

\begin{align}
&E\Bigg(\frac{M\, \mathbf{1}\big(1 \le \sum_{j=1}^M \mathbf{1}(X_j > b - 1/b) < M/A(b,\varepsilon)\big)}{\sum_{j=1}^M \mathbf{1}(X_j > b - 1/b)} \,\Bigg|\, \sup_T f(t) > b\Bigg) \tag{7.10} \\
&\quad\le M\, P\Bigg(\sum_{j=1}^M \mathbf{1}(X_j > b - 1/b) < c''' \,\Bigg|\, \sup_T f(t) > b\Bigg) \notag \\
&\quad\le M\, P\Big(\min_{|t - t^*| \le c_4\varepsilon/b} f(t) < b - 1/b \,\Big|\, \sup_T f(t) > b\Big) \notag \\
&\quad\le c_1\, m(T)\, b^d \varepsilon^{-d} \exp\Big(-\frac{\delta^*}{c_4^2\varepsilon^2}\Big) \le c_5 b^d. \notag
\end{align}

The second inequality holds because, if $\sum_{j=1}^M \mathbf{1}(X_j > b - 1/b)$ is less than $c'''$, then the minimum of $f$ over a ball centered at the global maximum $t^*$ and of radius $c_4\varepsilon/b$ must be less than $b - 1/b$; otherwise there would be more than $c'''$ elements of $\mathcal{T}$ inside such a ball, all exceeding $b - 1/b$. The last inequality is due to Lemma 7.8 and Proposition 7.2.

Putting (7.9) and (7.10) together, we obtain, for all $\varepsilon/b$-regular discretizations with $\varepsilon < \varepsilon_0 = \min(1/4, 1/c_4)\,\delta_0$,
\[
\frac{E_Q L_b^2}{P^2\big(\sup_{t\in T} f(t) > b\big)}
\le E\Big(\frac{M\, \mathbf{1}(\max_j X_j > b)}{\sum_{j=1}^M \mathbf{1}(X_j > b - 1/b)} \,\Big|\, \sup_{t\in T} f(t) > b\Big)\, \frac{P(X_1 > b - 1/b)}{P\big(\sup_{t\in T} f(t) > b\big)}
\le (c_3 + c_5)\, b^d\, \frac{P(X_1 > b - 1/b)}{P\big(\sup_{t\in T} f(t) > b\big)},
\]
and, by (7.4) (together with $P(f(0) > b - 1/b) = O(P(f(0) > b))$), the right hand side is bounded uniformly in $b > b_0$. Applying now Proposition 7.2, which yields
\[
P\Big(\sup_{t\in T} f(t) > b\Big) < \frac{P\big(\sup_{t\in\mathcal{T}} f(t) > b\big)}{1 - c_0\sqrt{\varepsilon}},
\]


we conclude that
\[
\sup_{b > b_0,\ \varepsilon\in(0,\varepsilon_0]} \frac{E_Q L_b^2}{P^2\big(\sup_{t\in\mathcal{T}} f(t) > b\big)} < \infty,
\]
as required. □

7.3. A final proof: Lemma 7.7. Without loss of generality, we assume that the field has mean zero and unit variance. Before getting into the details of the proof of Lemma 7.7, we need a few additional lemmas, for which we adopt the following notation. Let $C_i$ and $C_{ij}$ be the first and second order derivatives of $C$, and define the vectors
\begin{align*}
\mu_1(t) &= (-C_1(t), \dots, -C_d(t)), \\
\mu_2(t) &= \mathrm{vech}\big((C_{ij}(t),\ i = 1,\dots,d,\ j = i,\dots,d)\big).
\end{align*}

Let $f'(0)$ and $f''(0)$ be the gradient and the vector of second order derivatives of $f$ at $0$, where $f''(0)$ is arranged in the same order as $\mu_2(0)$. Furthermore, let $\mu_{02} = \mu_{20}^\top$ be a vector of second order spectral moments and $\mu_{22}$ a matrix of fourth order spectral moments. The vectors $\mu_{02}$ and $\mu_{22}$ are arranged so that
\[
\begin{pmatrix}
1 & 0 & \mu_{02} \\
0 & \Lambda & 0 \\
\mu_{20} & 0 & \mu_{22}
\end{pmatrix}
\]
is the covariance matrix of $(f(0), f'(0), f''(0))$, where $\Lambda = (-C_{ij}(0))$. It then follows that
\[
\mu_{2\cdot 0} = \mu_{22} - \mu_{20}\mu_{02}
\]
is the conditional variance of $f''(0)$ given $f(0)$. The following lemma, given in [5], provides a stochastic representation of the field $f$ given that it has a local maximum at level $u$ at the origin.

Lemma 7.11. Given that $f$ has a local maximum with height $u$ at zero (an interior point of $T$), the conditional field is equal in distribution to
\[
f_u(t) = uC(t) - W_u\beta^\top(t) + g(t), \tag{7.11}
\]

where $g(t)$ is a centered Gaussian random field with covariance function
\[
\gamma(s,t) = C(s - t) - (C(s), \mu_2(s))
\begin{pmatrix} 1 & \mu_{02} \\ \mu_{20} & \mu_{22} \end{pmatrix}^{-1}
\begin{pmatrix} C(t) \\ \mu_2^\top(t) \end{pmatrix}
- \mu_1(s)\Lambda^{-1}\mu_1^\top(t),
\]

and $W_u$ is a $d(d+1)/2$-dimensional random vector, independent of $g(t)$, with density function
\[
\psi_u(w) \propto |\det(r^*(w) - u\Lambda)| \exp\Big(-\tfrac{1}{2}\, w^\top \mu_{2\cdot 0}^{-1} w\Big)\, \mathbf{1}\big(r^*(w) - u\Lambda \in \mathcal{N}\big), \tag{7.12}
\]


where $r^*(w)$ is a $d\times d$ symmetric matrix whose upper triangular elements consist of the components of $w$. The set of negative definite matrices is denoted by $\mathcal{N}$. Finally, $\beta(t)$ is defined by
\[
(\alpha(t), \beta(t)) = (C(t), \mu_2(t))
\begin{pmatrix} 1 & \mu_{02} \\ \mu_{20} & \mu_{22} \end{pmatrix}^{-1}.
\]
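To make the representation in Lemma 7.11 concrete, here is a simulation sketch for $d = 1$ (our illustration; all numerical values are assumptions). For $C(t) = e^{-t^2/2}$ one has $\Lambda = \lambda_2 = -C''(0) = 1$, $\mu_2(t) = C''(t)$, $\mu_{02} = \mu_{20} = C''(0) = -\lambda_2$, $\mu_{22} = C^{(4)}(0) = \lambda_4 = 3$, and the constraint $r^*(w) - u\Lambda \in \mathcal{N}$ reduces to $w < u\lambda_2$. The sketch samples $W_u$ from (7.12) by normalizing the density numerically on a grid, and $g$ via a Cholesky factorization of $\gamma$:

```python
import numpy as np

rng = np.random.default_rng(2)
u = 3.0                                  # height of the conditioned local maximum
lam2, lam4 = 1.0, 3.0                    # lambda_2 = -C''(0), lambda_4 = C''''(0)
mu2dot0 = lam4 - lam2**2                 # conditional variance mu_{2.0}

C  = lambda t: np.exp(-0.5 * t**2)
C1 = lambda t: -t * np.exp(-0.5 * t**2)            # C'
C2 = lambda t: (t**2 - 1.0) * np.exp(-0.5 * t**2)  # C''

# M = [[1, mu_02], [mu_20, mu_22]]; (alpha(t), beta(t)) = (C(t), C''(t)) M^{-1}.
Mmat = np.array([[1.0, -lam2], [-lam2, lam4]])
def beta(t):
    return np.linalg.solve(Mmat, np.vstack([C(t), C2(t)]))[1]

# W_u from (7.12): psi_u(w) ~ |w - u*lam2| exp(-w^2/(2*mu2dot0)) on {w < u*lam2};
# sampled by normalizing the unnormalized density on a fine grid.
w = np.linspace(u * lam2 - 12.0 * np.sqrt(mu2dot0), u * lam2, 4000, endpoint=False)
dens = np.abs(w - u * lam2) * np.exp(-w**2 / (2.0 * mu2dot0))
W = rng.choice(w, p=dens / dens.sum())

# Remainder field g, with covariance gamma(s,t) as in Lemma 7.11 (d = 1).
t = np.linspace(-2.0, 2.0, 201)
S, T = np.meshgrid(t, t, indexing="ij")
V_S = np.stack([C(S), C2(S)])
V_T = np.linalg.solve(Mmat, np.stack([C(T), C2(T)]).reshape(2, -1)).reshape(2, 201, 201)
gamma = C(S - T) - np.einsum("kij,kij->ij", V_S, V_T) - C1(S) * C1(T) / lam2
g = np.linalg.cholesky(gamma + 1e-8 * np.eye(201)) @ rng.standard_normal(201)

f_u = u * C(t) - W * beta(t) + g         # one draw of (7.11)
print(f_u[100])                          # ~ u at t = 0, up to the jitter
```

At $t = 0$ the three terms give exactly $u$ (since $\beta(0) = 0$ and $\gamma(0,0) = 0$), and away from the peak the field decays like $uC(t)$; this is the picture behind Lemmas 7.12 and 7.13.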

The following two technical lemmas, which we shall prove after completing the proof of Lemma 7.7, provide bounds for the last two terms of (7.11).

Lemma 7.12. Using the notation in (7.11), there exist $\delta_0$, $\varepsilon_1$, $c_1$ and $b_0$ such that, for any $u > b > b_0$ and $\delta\in(0,\delta_0)$,
\[
P\Big(\sup_{|t| \le \delta\sqrt{a}/b} \big|W_u\beta^\top(t)\big| > \frac{a}{4b}\Big) \le c_1 \exp\Big(-\frac{\varepsilon_1 b^2}{\delta^4}\Big).
\]

Lemma 7.13. There exist $c$, $\bar\delta$ and $\delta_0$ such that, for any $\delta\in(0,\delta_0)$,
\[
P\Big(\max_{|t| \le \delta\sqrt{a}/b} |g(t)| > \frac{a}{4b}\Big) \le c \exp\Big(-\frac{\bar\delta}{\delta^2}\Big).
\]

Proof of Lemma 7.7. Using the notation of Lemma 7.11, given any $s\in\mathcal{L}$ (the set of local maxima of $f$) for which $f(s) = b$, the corresponding conditional distribution of $f$ around $s$ is that of $f_b(\cdot - s)$. Consequently, it suffices to show that
\[
P\Big(\min_{|t| \le \delta\sqrt{a}/b} f_b(t) < b - \frac{a}{b}\Big) \le c^* \exp\Big(-\frac{\delta^*}{\delta^2}\Big).
\]

We consider first the case in which the local maximum is in the interior of $T$. Then, by the Slepian model (7.11),
\[
f_b(t) = bC(t) - W_b\beta^\top(t) + g(t).
\]
We study the three terms of the Slepian model individually. Since
\[
C(t) = 1 - t^\top\Lambda t + o(|t|^2),
\]
there exists an $\varepsilon_0$ such that
\[
bC(t) \ge b - \frac{a}{4b}
\]


for all $|t| < \varepsilon_0\sqrt{a}/b$. According to Lemmas 7.12 and 7.13, for $\delta < \min(\varepsilon_0, \delta_0)$,
\begin{align*}
P\Big(\min_{|t| \le \delta\sqrt{a}/b} f_b(t) < b - \frac{a}{b}\Big)
&\le P\Big(\max_{|t| < \delta\sqrt{a}/b} |g(t)| > \frac{a}{4b}\Big) + P\Big(\sup_{|t| \le \delta\sqrt{a}/b} \big|W_b\beta^\top(t)\big| > \frac{a}{4b}\Big) \\
&\le c \exp\Big(-\frac{\bar\delta}{\delta^2}\Big) + c_1 \exp\Big(-\frac{\varepsilon_1 b^2}{\delta^4}\Big) \\
&\le c^* \exp\Big(-\frac{\delta^*}{\delta^2}\Big)
\end{align*}

for some $c^*$ and $\delta^*$.

Now consider the case in which the local maximum lies on the $(d-1)$-dimensional boundary of $T$. Due to the convexity of $T$, and without loss of generality, we assume that the tangent space of $\partial T$ is generated by $\partial/\partial t_2, \dots, \partial/\partial t_d$, that the local maximum is located at the origin, and that $T$ is a subset of the half-space $\{t_1 \ge 0\}$. That these assumptions do not involve a loss of generality follows from the arguments on pages 291–292 of [4], which rely on the assumed stationarity of $f$ (for translations) and the fact that rotations, while changing the distributions, will not affect the probabilities that we are currently computing.

For the origin, positioned as just described, to be a local maximum, it is necessary and sufficient that the gradient of $f$ restricted to $\partial T$ is the zero vector, the Hessian matrix restricted to $\partial T$ is negative definite, and $\partial_1 f(0) \le 0$. Applying a version of Lemma 7.11 for this case, conditional on $f(0) = u$ and $0$ being a local maximum, the field is equal in distribution to

\[
uC(t) - \widetilde{W}_u\widetilde\beta^\top(t) + \mu_1(t)\Lambda^{-1}(Z, 0, \dots, 0)^\top + \widetilde g(t), \tag{7.13}
\]
where $Z \le 0$ corresponds to $\partial_1 f(0)$ and follows a truncated (conditioned on the negative axis) Gaussian distribution with mean zero and a variance parameter computed as the conditional variance of $\partial_1 f(0)$ given $(\partial_2 f(0), \dots, \partial_d f(0))$. The vector $\widetilde{W}_u$ is a $(d(d-1)/2)$-dimensional random vector with density function
\[
\widetilde\psi_u(w) \propto |\det(\widetilde r^*(w) - u\widetilde\Lambda)| \exp\Big(-\tfrac{1}{2}\, w^\top \mu_{2\cdot 0}^{-1} w\Big)\, \mathbf{1}\big(\widetilde r^*(w) - u\widetilde\Lambda \in \mathcal{N}\big),
\]
where $\widetilde\Lambda$ is the second spectral moment matrix of $f$ restricted to $\partial T$, and $\widetilde r^*(w)$ is the $(d-1)\times(d-1)$ symmetric matrix whose upper triangular elements consist of the components of $w$. In the representation (7.13), the vector $\widetilde{W}_u$ and $Z$ are independent. As in the proof of Lemma 7.12, one can show that a similar bound holds, albeit with different constants. Thus, since $\mu_1(t) = O(t)$, there exist $c''$ and $\delta''$ such that the third term in (7.13) can be bounded by

\[
P\Big[\max_{|t| \le \delta\sqrt{a}/b} \big|\mu_1(t)\Lambda^{-1}(Z, 0, \dots, 0)^\top\big| \ge \frac{a}{4b}\Big] \le c'' \exp\Big(-\frac{\delta''}{\delta^2}\Big).
\]
Consequently, we can also find $c^*$ and $\delta^*$ such that the conclusion holds, and we are done. □

We complete the paper with the proofs of Lemmas 7.12 and 7.13.

Proof of Lemma 7.12. It suffices to prove the lemma for the case $a = 1$. Since
\[
f_u(0) = u = u - W_u\beta^\top(0) + g(0),
\]
and $W_u$ and $g$ are independent, $\beta(0) = 0$. Furthermore, since $C'(t) = O(t)$ and $\mu_2'(t) = O(t)$, there exists a $c_0$ such that $|\beta(t)| \le c_0|t|^2$. In addition, $W_u$ has density function proportional to

\[
\psi_u(w) \propto |\det(r^*(w) - u\Lambda)| \exp\Big(-\tfrac{1}{2}\, w^\top \mu_{2\cdot 0}^{-1} w\Big)\, \mathbf{1}\big(r^*(w) - u\Lambda \in \mathcal{N}\big).
\]
Note that $\det(r^*(w) - u\Lambda)$ is expressible as a polynomial in $w$ and $u$, and there exist some $\varepsilon_0$ and $c$ such that
\[
\left|\frac{\det(r^*(w) - u\Lambda)}{\det(-u\Lambda)}\right| \le c
\]
if $|w| \le \varepsilon_0 u$. Hence there exist $\varepsilon_2, c_2 > 0$ such that

\[
\psi_u(w) \le \bar\psi(w) = c_2 \exp\Big(-\tfrac{1}{2}\varepsilon_2\, w^\top \mu_{2\cdot 0}^{-1} w\Big)
\]
for all $u \ge 1$. The right hand side here is proportional to a multivariate Gaussian density. Thus,

\[
P(|W_u| > x) = \int_{|w| > x} \psi_u(w)\,dw \le \int_{|w| > x} \bar\psi(w)\,dw = c_3\, P(|\bar W| > x),
\]
where $\bar W$ is a multivariate Gaussian random variable with density function proportional to $\bar\psi$. Therefore, by choosing $\varepsilon_1$ and $c_1$ appropriately, we have

\[
P\Big(\sup_{|t| \le \delta/b} \big|W_u\beta^\top(t)\big| > \frac{1}{4b}\Big)
\le P\Big(|W_u| > \frac{b}{4c_0\delta^2}\Big)
\le c_1 \exp\Big(-\frac{\varepsilon_1 b^2}{\delta^4}\Big)
\]
for all $u \ge b$. □

Proof of Lemma 7.13. Once again, it suffices to prove the lemma for the case $a = 1$. Since
\[
f_b(0) = b = b - W_b\beta^\top(0) + g(0),
\]
the covariance function $(\gamma(s,t),\ s,t\in T)$ of the centered field $g$ satisfies $\gamma(0,0) = 0$. It is also easy to check that
\[
\partial_s\gamma(s,t) = O(|s| + |t|), \qquad \partial_t\gamma(s,t) = O(|s| + |t|).
\]


Consequently, there exists a constant $c_\gamma\in(0,\infty)$ for which
\[
\gamma(s,t) \le c_\gamma(|s|^2 + |t|^2), \qquad \gamma(s,s) \le c_\gamma|s|^2.
\]

We need to control the tail probability of $\sup_{|t| \le \delta/b} |g(t)|$. For this, it is useful to introduce the following scaling. Define
\[
g_\delta(t) = \frac{b}{\delta}\, g\Big(\frac{\delta t}{b}\Big).
\]

Then $\sup_{|t| \le \delta/b} g(t) \ge 1/(4b)$ if and only if $\sup_{|t| \le 1} g_\delta(t) \ge 1/(4\delta)$. Let
\[
\sigma_\delta(s,t) = E\big(g_\delta(s)\, g_\delta(t)\big).
\]
Then
\[
\sup_{|s| \le 1} \sigma_\delta(s,s) \le c_\gamma,
\]
since $\sigma_\delta(s,s) = (b^2/\delta^2)\,\gamma(\delta s/b, \delta s/b) \le (b^2/\delta^2)\, c_\gamma\, |\delta s/b|^2 = c_\gamma|s|^2$.

Because $\gamma(s,t)$ is at least twice differentiable, applying a Taylor expansion we easily see that the canonical metric $d_g$ corresponding to $g_\delta(s)$ (cf. Theorem 6.7) can be bounded as follows:
\[
d_g^2(s,t) = E\big(g_\delta(s) - g_\delta(t)\big)^2
= \frac{b^2}{\delta^2}\Big[\gamma\Big(\frac{\delta s}{b}, \frac{\delta s}{b}\Big) + \gamma\Big(\frac{\delta t}{b}, \frac{\delta t}{b}\Big) - 2\gamma\Big(\frac{\delta s}{b}, \frac{\delta t}{b}\Big)\Big]
\le c|s - t|^2
\]

for some constant $c\in(0,\infty)$. Therefore the entropy of $g_\delta$ satisfies $N(r) \le Kr^{-d}$ for any $r > 0$, with an appropriate choice of $K > 0$. Therefore, for all $\delta < \delta_0$,
\[
P\Big(\sup_{|t| \le \delta/b} |g(t)| \ge \frac{1}{4b}\Big) = P\Big(\sup_{|t| \le 1} |g_\delta(t)| \ge \frac{1}{4\delta}\Big) \le c_d\, \delta^{-d-\eta} \exp\Big(-\frac{1}{16 c_\gamma \delta^2}\Big)
\]
for some constants $c_d$ and $\eta > 0$. The last inequality is a direct application of Theorem 4.1.1 of [4]. The conclusion of the lemma follows immediately by choosing $c$ and $\bar\delta$ appropriately. □

REFERENCES

[1] R.J. Adler, K. Bartz, and S. Kou. Estimating curvatures in the Gaussian kinematic formula. In preparation.

[2] R.J. Adler, J. Blanchet, and J. Liu. Efficient simulation for tail probabilities of Gaussian random fields. In WSC '08: Proceedings of the 40th Conference on Winter Simulation, pages 328–336. Winter Simulation Conference, 2008.

[3] R.J. Adler, P. Müller, and B. Rozovskii, editors. Stochastic Modelling in Physical Oceanography. Birkhäuser, Boston, 1996.

[4] R.J. Adler and J.E. Taylor. Random Fields and Geometry. Springer, New York, 2007.

[5] R.J. Adler, J.E. Taylor, and K. Worsley. Applications of Random Fields and Geometry: Foundations and Case Studies. http://webee.technion.ac.il/people/adler/hrf.pdf.

[6] S. Asmussen and P. Glynn. Stochastic Simulation: Algorithms and Analysis. Springer-Verlag, New York, 2008.

[7] J.M. Azaïs and M. Wschebor. Level Sets and Extrema of Random Processes and Fields. Wiley, New York, 2009.

[8] J.M. Bardeen, J.R. Bond, N. Kaiser, and A.S. Szalay. The statistics of peaks of Gaussian random fields. The Astrophysical Journal, 304:15–61, 1986.

[9] S.M. Berman. Sojourns and Extremes of Stochastic Processes. The Wadsworth & Brooks/Cole Statistics/Probability Series. Wadsworth & Brooks/Cole Advanced Books & Software, Pacific Grove, CA, 1992.

[10] J. Blanchet. Efficient importance sampling for binary contingency tables. Ann. Appl. Probab., 19:949–982, 2009.

[11] C. Borell. The Brunn–Minkowski inequality in Gauss space. Invent. Math., 30:205–216, 1975.

[12] J. Bucklew. Introduction to Rare-Event Simulation. Springer-Verlag, New York, 2004.

[13] M.R. Dennis. Nodal densities of planar Gaussian random waves. Eur. Phys. J., 145:191–210, 2007.

[14] R.M. Dudley. Sample functions of the Gaussian process. Ann. Probab., 1:66–103, 1973.

[15] K.J. Friston, K.J. Worsley, R.S.J. Frackowiak, J.C. Mazziotta, and A.C. Evans. Assessing the significance of focal activations using their spatial extent. Human Brain Mapping, 1:214–220, 1994.

[16] M.R. Leadbetter, G. Lindgren, and H. Rootzén. Extremes and Related Properties of Random Sequences and Processes. Springer-Verlag, New York, 1983.

[17] M. Mitzenmacher and E. Upfal. Probability and Computing: Randomized Algorithms and Probabilistic Analysis. Cambridge University Press, 2005.

[18] W. Niemiro and P. Pokarowski. Fixed precision MCMC estimation by median of products of averages. J. Appl. Probab., 46:309–329, 2009.

[19] V. Piterbarg. Asymptotic Methods in the Theory of Gaussian Processes and Fields. American Mathematical Society, 1995.

[20] S.F. Shandarin. Testing non-Gaussianity in cosmic microwave background maps by morphological statistics. Mon. Not. R. Astr. Soc., 331:865–874, 2002.

[21] S.F. Shandarin, H.A. Feldman, Y. Xu, and M. Tegmark. Morphological measures of non-Gaussianity in cosmic microwave background maps. Astrophys. J. Suppl., 141:1–11, 2002.

[22] L. Shepp and H. Landau. On the supremum of a Gaussian process. Sankhyā A, 32:369–378, 1970.

[23] J.E. Taylor, A. Takemura, and R.J. Adler. Validity of the expected Euler characteristic heuristic. Ann. Probab., 33(4):1362–1396, 2005.

[24] J.E. Taylor and K.J. Worsley. Detecting sparse signals in random fields, with an application to brain mapping. J. Amer. Statist. Assoc., 102(479):913–928, 2007.

[25] J. Traub, G. Wasilkowski, and H. Wozniakowski. Information-Based Complexity. Academic Press, New York, 1988.

[26] B.S. Tsirelson, I.A. Ibragimov, and V.N. Sudakov. Norms of Gaussian sample functions. In Proceedings of the Third Japan–USSR Symposium on Probability Theory (Tashkent, 1975), volume 550, pages 20–41, 1976.

[27] V.S. Tsirel'son. The density of the maximum of a Gaussian process. Theory Probab. Appl., 20:847–856, 1975.

[28] H. Wozniakowski. Computational complexity of continuous problems. Technical report, 1996.

Robert J. Adler
Electrical Engineering
Technion, Haifa, Israel 32000
E-mail: robert@ee.technion.ac.il
URL: webee.technion.ac.il/people/adler

Jose H. Blanchet
Industrial Engineering & Operations Research
Columbia University
340 S.W. Mudd Building, 500 W. 120 Street
New York, New York 10027
E-mail: jose.blanchet@columbia.edu
URL: www.ieor.columbia.edu/fac-bios/blanchet/faculty.html

Jingchen Liu
Department of Statistics
Columbia University
1255 Amsterdam Avenue, Room 1030
New York, NY 10027
E-mail: jcliu@stat.columbia.edu
URL: stat.columbia.edu/~jcliu

Page 24: EFFICIENT MONTE CARLO FOR HIGH EXCURSIONS OF … › ~jblanche › papers › GRF1.pdf · 1. Introduction. This paper centers on the design and analysis of e–cient Monte Carlo techniques

24 ADLER BLANCHET LIU

where

α =y2

x(xminus y)y2lowast

We can now prove the following lemma

Lemma 613 There exists a constant A isin (0infin) independent of a and b ge 0such that

(615) P

(suptisinT

f(t) le b + ab | suptisinT

f(t) gt b

)le aA

P (suptisinT f (t) ge bminus 1b)P (suptisinT f (t) ge b)

Proof By subtracting inftisinT micro (t) gt minusinfin and redefining the level b to be b minusinftisinT micro (t) we may simply assume that Ef (t) ge 0 so that we can apply Theorem612 Adopting the notation of Theorem 612 first we pick b0 large enough so thatF (b0) gt 12 and assume that b ge b0 + 1 Now let y = bminus 1b and F (y) = Φ(ylowast)Note that there exists δ0 isin (0infin) such that δ0b le ylowast le δminus1

0 b for all b ge b0 Thisfollows easily from the fact that

logP

suptisinT

f(t) gt x

sim log sup

tisinTP f(t) gt x sim minus x2

2σ2T

On the other hand by Theorem 612 F is continuously differentiable and so

(616) P

suptisinT

f(t) lt b + ab | suptisinT

f(t) gt b

=

int b+ab

bF prime(x) dx

P suptisinT f(t) gt b

Moreover

F prime(x) le(

1minus Φ(

xylowasty

)) (xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x))

le (1minus Φ(ylowast))(

xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x))

= P

(maxtisinT

f (t) gt bminus 1b

)(xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x))

Thereforeint b+ab

b

F prime(x) dx

le P

(suptisinT

f (t) gt bminus 1b

) int b+ab

b

(xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x)) dx

Recall that α (x) = y2[x(xminus y)y2lowast] we can use the fact that ylowast ge δ0b to conclude

that if x isin [b b + ab] then α (x) le δminus20 and therefore

int b+ab

b

(xylowasty

(1 + 2α (x)) + 1)

ylowasty

(1 + α (x)) dx le 4δminus80 a

MONTE CARLO FOR RANDOM FIELDS 25

We thus obtain that

(617)

int b+ab

bF prime(x) dx

P suptisinT f(t) gt b le 4aδminus80

P suptisinT f(t) gt bminus 1bP suptisinT f(t) gt b

for any b ge b0 This inequality together with the fact that F is continuously differ-entiable on (minusinfininfin) yields the proof of the lemma for b ge 0 2

The previous result translates a question that involves the conditional distri-bution of maxtisinT f (t) near b into a question involving the tail distribution ofmaxtisinT f (t) The next result then provides a bound on this tail distribution

Lemma 614 For each v gt 0 there exists a constant C(v) isin (0infin) (possiblydepending on v gt 0 but otherwise independent of b) so that such that

P

(maxtisinT

f (t) gt b

)le C(v)b2dβ+dv+1 max

tisinTP (f (t) gt b)

for all b ge 1

Proof The proof of this result follows along the same lines of Theorem 262 in[5] Consider an open cover of T = cupM

i=1Ti(θ) where Ti(θ) = s |s minus ti| lt θWe choose ti carefully such that N(θ) = O(θminusd) for θ arbitrarily small Writef (t) = g (t)+micro (t) where g (t) is a centered Gaussian random field and note usingA2 and A3 that

P

(max

tisinTi(θ)f (t) gt b

)le P

(max

tisinTi(θ)g (t) gt bminus micro (ti)minus κHθβ

)

Now we wish to apply the Borel-TIS inequality (Theorem 68) with U = Ti (θ) f0 =g d (s t) = E12([g (t)minus g (s)]2) which as a consequence of A2 and A3 is boundedabove by C0 |tminus s|β2 for some C0 isin (0infin) Thus applying Theorem 67 we havethat E maxtisinTi(θ) g (t) le C1θ

β2 log(1θ) for some C1 isin (0infin) Consequently theBorel-TIS inequality yields that there exists C2 (v) isin (0infin) such that for all bsufficiently large and θ sufficiently small we have

P

(max

tisinTi(θ)g (t) gt bminus micro (ti)minus κHθβ

)le C2 (v) exp

(minus

(bminus micro (ti)minus C1θ

β(2+βv))2

2σ2Ti

)

where σTi = maxtisinTi(θ) σ (t) Now select v gt 0 small enough and set θβ(2+βv) =bminus1 Straightforward calculations yield that

P

(max

tisinTi(θ)f (t) gt b

)le P

(max

tisinTi(θ)g (t) gt bminus micro (ti)minus κHθβ

)

le C3 (v) maxtisinTi(θ)

exp

(minus (bminus micro (t))2

2σ (t)2

)

26 ADLER BLANCHET LIU

for some C3(v) isin (0infin) Now recall the well known inequality (valid for x gt 0)that

φ (x)(

1xminus 1

x3

)le 1minus Φ(x) le φ (x)

x

where φ = Φprime is the standard Gaussian density Using this inequality it follows thatC4 (v) isin (0infin) can be chosen so that

maxtisinTi(θ)

exp

(minus (bminus micro (t))2

2σ (t)2

)le C4 (v) bmax

tisinTi

P (f (t) gt b)

for all b ge 1 We then conclude that there exists C (v) isin (0infin) such that

P

(maxtisinT

f (t) gt b

)le N (θ)C4 (v) bmax

tisinTP (f (t) gt b)

le CθminusdbmaxtisinT

P (f (t) gt b) = Cb2dβ+dv+1 maxtisinT

P (f (t) gt b)

giving the result 2

We can now complete the proof of Proposition 65

Proof of Proposition 65 The result is a straightforward corollary of the previoustwo lemmas By (615) in Lemma 613 and Lemma 614 there exists λ isin (0infin) forwhich

P

(maxtisinT

f(t) le b + ab | maxtisinT

f(t) gt b

)

le aAP (maxtisinT f (t) ge bminus 1b)

P (maxtisinT f (t) ge b)

le aCAb2dβ+dv+1 maxtisinT P (f (t) gt bminus 1b)P (maxtisinT f (t) ge b)

le aCAb2dβ+dv+1 maxtisinT P (f (t) gt bminus 1b)maxtisinT P (f (t) gt b)

le aλb2dβ+dv+1

The last two inequalities follow from the obvious bound

P

(maxtisinT

f (t) ge b

)ge max

tisinTP (f (t) gt b)

and standard properties of the Gaussian distribution This yields (612) from whichthe remainder of the proposition follows 2

MONTE CARLO FOR RANDOM FIELDS 27

7 Fine Tuning Twice Differentiable Homogeneous Fields In the pre-ceeding section we constructed a polynomial time algorithm based on a randomizeddiscretization scheme Our goal in this section is to illustrate how to take advantageof additional information to further improve the running time and the efficiency ofthe algorithm In order to illustrate our techniques we shall perform a more refinedanalysis in the setting of smooth and homogeneous fields and shall establish op-timality of the algorithm in a precise sense to described below Our assumptionsthroughout this section are B1ndashB2 of Section 3

Let C(sminus t) = Cov(f(s) f(t)) be the covariance function of f which we assumealso has mean zero Note that it is an immediate consequence of homogeneity anddifferentiability that partiC(0) = part3

ijkC(0) = 0We shall need the following definition

Definition 71 We call T = t1 tM sub T a θ-regular discretization of Tif and only if

mini 6=j

|ti minus tj | ge θ suptisinT

mini|ti minus t| le 2θ

Regularity ensures that points in the grid T are well separated Intuitively sincef is smooth having tight clusters of points translates to a waste of computingresources as a result of sampling highly correlated values of f Also note thatevery region containing a ball of radius 2θ has at least one representative in T Therefore T covers the domain T in an economical way One technical convenienceof θ-regularity is that for subsets A sube T that have positive Lebesgue measure (inparticular ellipsoids)

limMrarrinfin

(A cap T )M

=m(A)m(T )

where here and throughout the remainder of the section (A) denotes the cardi-nality of the set A

Let T = t1 tM be a θ-regular discretization of T and consider

X = (X1 XM )T ∆= (f (t1) f (tM ))T

We shall concentrate on estimating wM (b) = P (max1leileM Xi gt b) The next result(which we prove in Section 71) shows that if θ = εb then the relative bias is O (

radicε)

Proposition 72 Suppose f is a Gaussian random field satisfying conditionsB1 and B2 There exist c0 c1 b0 and ε0 such that for any finite εb-regulardiscretization T of T

(71) P (suptisinT

f(t) lt b| suptisinT

f(t) gt b) le c0

radicε and (T ) le c1m(T )εminusdbd

for all ε isin (0 ε0] and b gt b0

Note that the bound on the bias obtained for twice differentiable fields is muchsharper than that of the general Holder continuous fields given by (613) in Theorem

28 ADLER BLANCHET LIU

66 This is partly because the conditional distribution of the random field aroundlocal maxima is harder to describe in the Holder continuous than in the case oftwice differentiable fields In addition to the sharper description of the bias weshall also soon show in Theorem 74 that our choice of discretization is optimal in acetain sense Finally we point out that the bound of

radicε in the first term of (71) is

not optimal In fact there seems to be some room of improvement and we believethat a more careful analysis might yield a bound of the form c0ε

2We shall estimate wM (b) by using a slight variation of Algorithm 53 In partic-

ular since the Xirsquos are now identically distributed we redefine Q to be

(72) Q (X isin B) =Msum

i=1

1M

P [X isin B|Xi gt bminus 1b]

Our estimator then takes the form

(73) Lb =M times P (X1 gt bminus 1b)sumM

j=1 1 (Xj gt bminus 1b)1( max

1leileMXi gt b)

Clearly we have that EQ(Lb) = wM (b) (The reason for for subtracting the fcatorof 1b was explained in Section 61)

Algorithm 73 Given a number of replications n and an εb-regular dis-cretization T the algorithm is as follows

Step 1 Sample X(1) X(n) iid copies of X with distribution Q given by (72)Step 2 Compute and output

Ln =1n

nsum

i=1

L(i)b

where

L(i)b =

M times P (X1 gt bminus 1b)sumM

j=1 1(X(i)j gt bminus 1b)

1( max1leileM

X(i)j gt b)

Theorem 75 later guides the selection of n in order to achieve a prescribedrelative error In particular our analysis together with considerations from Section2 implies that choosing n = O

(εminus2δminus1

)suffices to achieve ε relative error with

probability at least 1minus δAlgorithm 73 improves on Algorithm 61 for Holder continuous fields in two

important ways The first aspect is that it is possible to obtain information on thesize of the relative bias of the estimator In Proposition 72 we saw that in orderto overcome bias due to discretization it suffices to take a discretization of sizeM = (T ) = Θ(bd) That this selection is also asymptotically optimal in the sensedescribed in the next result will be proven in Section 71

MONTE CARLO FOR RANDOM FIELDS 29

Theorem 74 Suppose f is a Gaussian random field satisfying conditions B1and B2 If θ isin (0 1) then as b rarrinfin

sup(T )lebθd

P (suptisinT

f(t) gt b| suptisinT

f(t) gt b) rarr 0

This result implies that the relative bias goes to 100 as b rarrinfin if one chooses adiscretization scheme of size O

(bθd

)with θ isin (0 1) Consequently d is the smallest

power of b that achieves any given bounded relative bias and so the suggestionabove of choosing M = O

(bd

)points for the discretization is in this sense optimal

The second aspect of improvement involves the variance In the case of Holdercontinuous fields the ratio of the second moment of the estimator and w (b)2 wasshown to be bounded by a quantity that is of order O(M2) In contrast in thecontext of smooth and homogeneous fields considered here the next result showsthat this ratio is bounded uniformly for b gt b0 and M = (T ) ge cbd That is thevariance remains strongly controlled

Theorem 75 Suppose f is a Gaussian random field satisfying conditions B1and B2 Then there exist constants c b0 and ε0 such that for any εb-regulardiscretization T of T we have

supbgtb0εisin[0ε0]

EQL2b

P 2 (suptisinT f (t) gt b)le sup

bgtb0εisin[0ε0]

EQL2b

P 2(sup

tisinTf (t) gt b

) le c

for some c isin (0infin)

The proof of this result is given in Section 72 The fact that the number ofreplications remains bounded in b is a consequence of the strong control on thevariance

Finally we note that the proof of Theorem 33 follows as a direct corollary of The-orem 75 together with Proposition 72 and our discussion in Section 2 Assumingthat placing each point in T takes no more than c units of computer time the totalcomplexity is according to the discussion in section Section 2 O

(nM3 + M

)=

O(εminus2δminus1M3 + M

) The contribution of the term M3 = O

(εminus6b3

)comes from

the complexity of applying Cholesky factorization and the term M = O(εminus2b)corresponds to the complexity of placing T

Remark 76 Condition B2 imposes a convexity assumption on the boundaryof T This assumption although convenient in the development of the proofs of The-orems 74 and 75 is not necessary The results can be generalized at the expenseof increasing the length and the burden in the technical development to the casein which T is a d-dimensional manifold satisfying the so-called Whitney conditions([4])

The remainder of this section is devoted to the proof of Proposition 72 Theorem74 and Theorem 75

30 ADLER BLANCHET LIU

71 Bias Control Proofs of Proposition 72 and Theorem 74 We start withsome useful lemmas for all of which we assume that Conditions B1 and B2 are sat-isfied We shall also assume that the global maximum of f over T is achieved withprobability one at a single point in T Additional conditions under which this willhappen can be found in [4] and require little more than the non-degeneracy of thejoint distribution of f and its first and second order derivatives Of these lemmasLemma 77 the proof of which we defer to Section 73 is of central importance

Lemma 77 For any a0 gt 0 there exists clowast δlowast b0 and δ0 (which depend onthe choice of a0) such that for any s isin T a isin (0 a0) δ isin (0 δ0) and b gt b0

P ( min|tminuss|ltδabminus1

f(t) lt b|f(s) = b + ab s isin L) le clowast exp(minusδlowast

δ2

)

where L is the set of local maxima of f

The next lemma is a direct corollary of Lemma 77

Lemma 78 Let tlowast be the point in T at which the global maximum of f isattained Then with the same choice of constants as in Lemma 77 for any a isin(0 a0) δ isin (0 δ0) and b gt b0 we have

P ( min|tminustlowast|ltδabminus1

f(t) lt b|f(tlowast) = b + ab) le 2clowast exp(minusδlowast

δ2

)

Proof Note that for any s isin T

P

(min

|t|leδabf(t) lt b|f(s) = b + ab s = tlowast

)

=P

(min|t|leδab f(t) lt b f(s) = b + ab s = tlowast

)

P (f(s) = b + ab s = tlowast)

le P(min|t|leδab f(t) lt b f(s) = b + ab s isin L)

P (f(s) = b + ab s = tlowast)

= P ( min|tminuss|ltδabminus1

f(t) lt b|f(s) = b + ab s isin L)P (f(s) = b + ab s isin L)P (f(s) = b + ab s = tlowast)

le clowast exp(minusδlowast

δ2

)P (f(s) = b + ab s isin L)P (f(s) = b + ab s = tlowast)

Appling this bound and the fact that

P (f(s) = b + ab s isin L)P (f(s) = b + ab s = tlowast)

rarr 1

as b rarrinfin the conclusion of the Lemma holds 2

MONTE CARLO FOR RANDOM FIELDS 31

The following lemma gives a bound on the density of suptisinT f(t) which will beused to control the size of overshoot beyond level b

Lemma 79 Let pflowast(x) be the density function of suptisinT f(t) Then there existsa constant cflowast and b0 such that

pflowast(x) le cflowastxd+1P (f(0) gt x)

for all x gt b0

Proof Recalling (13) let the continuous function pE(x) x isin R be defined bythe relationship

E(χ(t isin T f(t) ge b)) =int infin

b

pE(x) dx

where the left hand side is the expected value of the Euler-Poincare characteristicof Ab Then according to Theorem 810 in [7] there exists c and δ such that

|pE(x)minus pflowast(x)| lt cP (f(0) gt (1 + δ)x)

for all x gt 0 In addition thanks to the result of [4] which providesintinfin

bpE(x)dx in

closed form there exists c0 such that for all x gt 1

pE(x) lt c0xd+1P (f(0) gt x)

Hence there exists cflowast such that

pflowast(x) le c0xd+1P (f(0) gt x) + cP (f(0) gt (1 + δ)x) le cflowastx

d+1P (f(0) gt x)

for all x gt 1 2

The last ingredients required to provide the proof of Proposition 72 and Theorem74 are stated in the following well known Theorem adapted from Lemma 61 andTheorem 72 in [19] to the twice differentiable case

Theorem 710 There exists a constant H (depending on the covariance func-tion C such that

P (suptisinT

f(t) gt b) = (1 + o (1))Hm(T )bdP (f(0) gt b)(74)

as b rarrinfinSimilarly choose δ small enough so that [0 δ]d sub T and let ∆0 = [0 bminus1]d Then

there exists a constant H1 such that

P ( suptisin∆0

f(t) gt b) = (1 + o(1))H1P (f(0) gt b)(75)

as b rarrinfin

32 ADLER BLANCHET LIU

We now are ready to provide the proof of Proposition 72 and Theorem 74

Proof of Proposition 72 The fact that there exists c1 such that

(T ) le c1m(T )εminusdbd

is immediate from assumption B2 Therefore we proceed to provide a bound forthe relative bias We choose ε0 lt δ2

0 (δ0 is given in Lemmas 77 and 78) and notethat for any ε lt ε0

P (suptisinT

f(t) lt b| suptisinT

f(t) gt b)

le P (suptisinT

f(t) lt b + 2radic

εb| suptisinT

f(t) gt b) + P (suptisinT

f(t) lt b| suptisinT

f(t) gt b + 2radic

εb)

By (74) and Lemma 79 there exists c2 such that the first term above can boundedby

P (suptisinT

f(t) lt b + 2radic

εb| suptisinT

f(t) gt b) le c2

radicε

The second term can be bounded by

P (suptisinT

f(t) lt b| suptisinT

f(t) gt b + 2radic

εb)

le P ( sup|tminustlowast|lt2εbminus1

f(t) lt b| suptisinT

f(t) gt b + 2radic

ε)

le 2clowast exp(minusδlowastεminus1

)

Hence there exists c0 such that

P (suptisinT

f(t) lt b| suptisinT

f(t) gt b) le c2

radicε + 2clowasteminusδlowastε le c0

radicε

for all ε isin (0 ε0) 2

Proof of Theorem 74 We write θ = 1minus 3δ isin (0 1) First note that by (74)

P (supT

f(t) gt b + b2δminus1| supT

f(t) gt b) rarr 0

as b rarrinfin Let tlowast be the position of the global maximum of f in T According to theexact Slepian model in Section 73 and an argument similar to the proof of Lemma78

(76) P ( sup|tminustlowast|gtb2δminus1

f(t) gt b|b lt f(tlowast) le b + b2δminus1) rarr 0

as b rarrinfin Consequently

P ( sup|tminustlowast|gtb2δminus1

f(t) gt b| supT

f(t) gt b) rarr 0

MONTE CARLO FOR RANDOM FIELDS 33

LetB(T b2δminus1) = cup

tisinTB(t b2δminus1)

We have

P (supT

f(t) gt b| supT

f(t) gt b)

le P ( sup|tminustlowast|gtb2δminus1

f(t) gt b| supT

f(t) gt b)

+ P (supT

f(t) gt b sup|tminustlowast|gtb2δminus1

f(t) le b| supT

f(t) gt b)

le o(1) + P (tlowast isin B(T b2δminus1)| supT

f(t) gt b)

le o(1) + P ( supB(T b2δminus1)

f(t) gt b| supT

f(t) gt b)

Since (T ) le b(1minus3δ)d we can find a finite set T prime = tprime1 tprimel sub T and let ∆k =tprimek + [0 bminus1] such that l = O(b(1minusδ)d) and B(T b2δminus1) sub cupl

k=1∆k The choice ofl only depends on (T ) not the particular distribution of T Therefore applying(75)

sup(Tlebθd)

P ( supB(T b2δminus1)

f(t) gt b) le O(b(1minusδ)d)P (f(0) gt b)

This together with (74) yields

sup(Tlebθd)

P ( supB(T b2δminus1)

f(t) gt b| supT

f(t) gt b) le O(bminusδd) = o(1)

for b ge b0 which clearly implies the statement of the result 2

72 Variance Control Proof of Theorem 75 We proceed directly to the proofof Theorem 75

34 ADLER BLANCHET LIU

Proof of Theorem 75 Note that

EQL2b

P 2 (suptisinT f (t) gt b)

=E(Lb)

P 2 (suptisinT f (t) gt b)

=E

(MtimesP (X1gtbminus1b)sumn

j=11(Xjgtbminus 1

b )maxj Xj gt b

)

P 2 (suptisinT f (t) gt b)

=E

(MtimesP (X1gtbminus1b)sumn

j=11(Xjgtbminus 1

b )maxj Xj gt b suptisinT f (t) gt b

)

P 2 (suptisinT f (t) gt b)

=E

(MP (X1gtbminus1b)1(maxj Xjgtb)sumn

j=11(Xjgtbminus1b)

∣∣∣∣ suptisinT f (t) gt b

)

P (suptisinT f (t) gt b)

= E

(M1 (maxj Xj gt b)sumnj=1 1 (xj gt bminus 1b)

∣∣∣∣∣ suptisinT

f (t) gt b

)P (X1 gt bminus 1b)

P (suptisinT f (t) gt b)

The remainder of the proof involves showing that the last conditional expectationhere is of order O(bd) This together with (74) will yield the result Note that forany A (b ε) such that A (b ε) = Θ(M) uniformly over b and ε we can write

E

(M times 1 (maxj Xj gt b)sumM

j=1 1(Xj gt bminus 1

b

)∣∣∣∣∣ sup

tisinTf (t) gt b

)(77)

le E

M times 1

(sumMj=1 1 (Xj gt bminus 1b) ge M

A(bε)

)

sumMj=1 1

(xj gt bminus 1

b

)∣∣∣∣∣∣suptisinT

f (t) gt b

+ E

M times 1

(1 le sumM

j=1 1 (Xj gt bminus 1b) lt MA(bε)

)

sumMj=1 1

(xj gt bminus 1

b

)∣∣∣∣∣∣suptisinT

f (t) gt b

We shall select A(b ε) appropriately in order to bound the expectations above ByLemma 78 for any 4ε le δ le δ0 there exist constants cprime and cprimeprime isin (0infin) such that

clowast exp(minusδlowast

δ2

)ge P

(min

|tminustlowast|leδbf (t) lt bminus 1b

∣∣∣∣ suptisinT

f (t) gt b

)

ge P

Msum

j=1

1 (Xj gt bminus 1b) le cprimeδdεd

∣∣∣∣∣∣suptisinT

f (t) gt b

ge P

(MsumM

j=1 1(Xj gt bminus 1b)ge bdcprimeprime

δd

∣∣∣∣∣ suptisinT

f (t) gt b

)(78)

MONTE CARLO FOR RANDOM FIELDS 35

The first inequality is an application of Lemma 78 The second inequality is dueto the fact that for any ball B of radius 4ε or larger (T cap B) ge cprimedεminusd for somecprime gt 0 Inequality (78) implies that for all x such that bdcprimeprimeδd

0 lt x lt bdcprimeprime[4dεd]there exists δlowastlowast gt 0 such that

P

(MsumM

j=1 1(Xj gt bminus 1

b

)bdge x

∣∣∣∣∣ suptisinT

f (t) gt b

)le clowast exp

(minusδlowastlowastx2d

)

Now let A (b ε) = bdcprimeprime(4dεd) and observe that by the second result in (71) wehave A (b ε) = Θ (M) and moreover that there exists c3 such that the first termon the right hand side of (77) is bounded by

(79) E

M times 1

(sumMj=1 1 (Xj gt bminus 1b) ge M

A(bε)

)

sumMj=1 1 (Xj gt bminus 1b)

∣∣∣∣∣∣supT

f (t) gt b

le c3b

d

Now we turn to the second term on the right hand side of (77) We use the factthat MA(b ε) le cprimeprimeprime isin (0infin) (uniformly as b rarrinfin and ε rarr 0) There exist c4 andc5 such that for ε le δ0c4

E

M1

(1 le sumM

j=1 1(Xj gt bminus 1

b

)lt M

A(bε)

)

sumMj=1 1

(Xj gt bminus 1

b

)∣∣∣∣∣∣supT

f (t) gt b

(710)

le MP

nsum

j=1

1(

Xj gt bminus 1b

)lt cprimeprimeprime

∣∣∣∣∣∣supT

f (t) gt b

le MP

(min

|tminustlowast|lec4εbf (t) lt bminus 1b

∣∣∣∣ supT

f (t) gt b

)

le c1m(T )bdεminusd exp(minus δlowast

c24ε

2

)le c5b

d

The second inequality holds from the fact that ifsumM

j=1 1(Xj gt bminus 1

b

)is less than

cprimeprimeprime then the minimum of f in a ball around the local maximum and of radius c4εb

must be less than bminus 1b Otherwise there are more than cprimeprimeprime elements of T insidesuch a ball The last inequality is due to Lemma 78 and Theorem 72

Putting (79) and (710) together we obtain for all εb -regular discretizationswith ε lt ε0 = min(14 1c4)δ0

EQL2b

P (suptisinT f (t) gt b)le E

(M1 (maxj Xj gt b)sumM

j=1 1(Xj gt bminus 1

b

)∣∣∣∣∣ sup

tisinTf (t) gt b

)P (X1 gt bminus 1b)

P (suptisinT f (t) gt b)

le (c3 + c5)bdP (X1 gt bminus 1b)P (suptisinT f (t) gt b)

Applying now (74) and Proposition 72 we have that

P (suptisinT

f(t) gt b) ltP (sup

tisinTf(t) gt b)

1minus c0radic

ε

36 ADLER BLANCHET LIU

we have

supbgtb0εisin[0ε0]

EQL2b

P(sup

tisinTf (t) gt b

) lt infin

as required 2

73 A final proof Lemma 77 Without loss of generality we assume that thefield has mean zero and unit variance Before getting into the details of the proofof Lemma 77 we need a few additional lemmas for which we adopt the followingnotation Let Ci and Cij be the first and second order derivatives of C and definethe vectors

micro1(t) = (minusC1(t) minusCd(t))micro2(t) = vech ((Cij(t) i = 1 d j = i d))

Let f prime(0) and f primeprime(0) be the gradient and vector of second order derivatives of f at 0where f primeprime(0) is arranged in the same order as micro2(0) Furthermore let micro02 = microgt20 bea vector of second order spectral moments and micro22 a matrix of fourth order spectralmoments The vectors micro02 and micro22 are arranged so that

1 0 micro02

0 Λ 0micro20 0 micro22

is the covariance matrix of (f(0) f prime(0) f primeprime(0)) where Λ = (minusCij(0)) It then followsthat

micro2middot0 = micro22 minus micro20micro02

be the conditional variance of f primeprime(0) given f(0) The following lemma given in [5]provides a stochastic representation of the f given that it has a local maxima atlevel u at the origin

Lemma 711 Given that f has a local maximum with height u at zero (aninterior point of T ) the conditional field is equal in distribution to

(711) fu(t) uC (t)minusWuβgt (t) + g (t)

g(t) is a centered Gaussian random field with covariance function

γ(s t) = C(sminus t)minus (C(s) micro2(s))(

1 micro02

micro20 micro22

)minus1 (C(t)microgt2 (t)

)minus micro1(s)Λminus1microgt1 (t)

and Wu is a d(d+1)2 random vector independent of g(t) with density function

(712) ψu (w) prop |det (rlowast(w)minus uΛ) | exp(minus1

2wgtmicrominus1

2middot0w)1 (rlowast(w)minus uΛ isin N )

MONTE CARLO FOR RANDOM FIELDS 37

where rlowast(w) is a dtimesd symmetric matrix whose upper triangular elements consist ofthe components of w The set of negative definite matrices is denoted by N Finallyβ(t) is defined by

(α (t) β (t)) = (C (t) micro2 (t))(

1 micro02

micro20 micro22

)minus1

The following two technical lemmas which we shall prove after completing theproof of Lemma 77 provide bounds for the last two terms of (711)

Lemma 712 Using the notation in (711) there exist δ0 ε1 c1 and b0 suchthat for any u gt b gt b0 and δ isin (0 δ0)

P

(sup

|t|leδab

∣∣Wuβgt (t)∣∣ gt

a

4b

)le c1 exp

(minusε1b

2

δ4

)

Lemma 713 There exist c δ and δ0 such that for any δ isin (0 δ0)

P

(max|t|leδab

|g(t)| gt a

4b

)le c exp

(minus δ

δ2

)

Proof of Lemma 77 Using the notation of Lemma 711 given any s isin L forwhich f(s) = b we have that the corresponding conditional distribution of s is thatof fb(middot minus s) Consequently it suffices to show that

P

(min

|t|leδabfb(t) lt bminus ab

)le clowast exp

(minusδlowast

δ2

)

We consider firstly the case for which the local maximum is in the interior of T Then by the Slepian model (711)

fb (t) = bC (t)minusWbβgt (t) + g (t)

We study the three terms of the Slepian model individually Since

C(t) = 1minus tgtΛt + o(|t|2

)

there exists a ε0 such thatbC (t) ge bminus a

4b

38 ADLER BLANCHET LIU

for all |t| lt ε0radic

ab According to Lemmas 712 and 713 for δ lt min(ε0radic

a0 δ0)

P

(min

|t|leδabfu(t) lt bminus ab

)

le P

(max|t|ltδab

|g (t)| gt a

4b

)+ P

(sup

|t|leδab

∣∣Wbβgt(t)

∣∣ gta

4b

)

le c exp

δ2

)+ c1 exp

(minusε1b

2

δ4

)

le clowast exp(minusδlowast

δ2

)

for some clowast and δlowastNow consider the case for which the local maximum is in the (dminus1)-dimensional

boundary of T Due to convexity of T and without loss of generality we assumethat the tangent space of partT is generated by partpartt2 partparttd the local maximumis located at the origin and T is a subset of the positive half-plane t1 ge 0 Thatthese arguments to not involve a loss of generality follows from the arguments onpages 291ndash192 of [4] which rely on the assumed stationarity of f (for translations)and the fact that rotations while changing the distributions will not affect theprobabilities that we are currently computing)

For the origin positioned as just described to be a local maximum it is necessaryand sufficient that the gradient of f restricted to partT is the zero vector the Hessianmatrix restricted to partT is negative definite and part1f(0) le 0 Applying a version ofLemma 711 for this case conditional on f(0) = u and 0 being a local maximumthe field is equal in distribution to

(713) uC(t)minus Wuβgt(t) + micro1(t)Λminus1 (Z 0 0)T + g(t)

where Z le 0 corresponds to part1f (0) and it follows a truncated (conditional on thenegative axis) Gaussian random variable with mean zero and a variance parameterwhich is computed as the conditional variance of part1f (0) given (part2f (0) partdf (0))The vector Wu is a (d(d + 1)2)-dimensional random vector with density function

ψu(w) prop |det(rlowast(w)minus uΛ)| exp(minus12wgtmicrominus1

2middot0w)1(wlowast minus uΛ isin N )

where Λ is the second spectral moment of f restricted to partT and rlowast(w) is the(dminus 1)times (dminus 1) symmetric matrix whose upper triangular elements consist of thecomponents of w In the representation (713) the vector Wu and Z are independentAs in the proof of Lemma 712 one can show that a similar bound holds albeitwith with different constants Thus since micro1 (t) = O (t) there exist cprimeprime and δprimeprime suchthat the third term in (713) can be bounded by

P [ max|t|leδab

∣∣∣micro1 (t) Λminus1 (Z 0 0)T∣∣∣ ge a(4b)] le cprimeprime exp

(minusδprimeprime

δ2

)

MONTE CARLO FOR RANDOM FIELDS 39

Consequently we can also find clowast and δlowast such that the conclusion holds and weare done 2

We complete the paper with the proofs of Lemmas 712 and 713

Proof of Lemma 712 It suffices to prove the Lemma for the case a = 1 Since

fu (0) = u = uminusWuβgt (0) + g (0)

and Wu and g are independent β (0) = 0 Furthermore since C prime (t) = O (t) andmicroprime2 (t) = O (t) there exists a c0 such that |β(t)| le c0|t|2 In addition Wu has densityfunction proportional to

ψu(w) prop |det(rlowast(w)minus uΛ)| exp(minus12wgtmicrominus1

2middot0w)1 (wlowast minus uΛ isin N )

Note that det(rlowast(w) minus uΛ) is expressible as a polynomial in w and u and thereexists some ε0 and c such that

∣∣∣∣det (rlowast(w)minus uΛ)

det (minusuΛ)

∣∣∣∣ le c

if |w| le ε0u Hence there exist ε2 c2 gt 0 such that

ψu (w) le ψ (w) = c2 exp(minus1

2ε2w

gtmicrominus12middot0w

)

for all u ge 1 The right hand side here is proportional to a multivariate Gaussiandensity Thus

P (|Wu| gt x) =int

|w|gtx

ψu(w)dw leint

|w|gtx

ψu (w) dw = c3P (|W | gt x)

where W is a multivariate Gaussian random variable with density function propor-tional to ψ Therefore by choosing ε1 and c1 appropriately we have

P

(sup|t|leδb

∣∣Wuβgt∣∣ gt

14b

)le P

(|Wu| gt b

c20δ

2

)le c1 exp

(minusε1b

2

δ4

)

for all u ge b 2

Proof of Lemma 713 Once again it suffices to prove the Lemma for the casea = 1 Since

fb(0) = b = bminusWbβgt(0) + g(0)

the covariance function (γ (s t) s t isin T ) of the centered field g satisfies γ(0 0) = 0It is also easy to check that

partsγ(s t) = O(|s|+ |t|) parttγ(s t) = O(|s|+ |t|)

40 ADLER BLANCHET LIU

Consequently there exists a constant cγ isin (0infin) for which

γ(s t) le cγ(|s|2 + |t|2) γ(s s) le cγ |s|2

We need to control the tail probability of sup|t|leδb |g(t)| For this it is useful tointroduce the following scaling Define

gδ (t) =b

δg

(δt

b

)

Then sup|t|leδb g (t) ge 14b if and only if sup|t|le1 gδ (t) ge 1

4δ Let

σδ (s t) = E (gδ (s) gδ (t))

ThensupsisinR

σδ(s s) le cγ

Because γ(s t) is at least twice differentiable applying a Taylor expansion we easilysee that the canonical metric dg corresponding to gδ (s) (cf Theorem 67) can bebounde as follows

d2g(s t) = E (gδ (s)minus gδ (t))2 =

b2

δ2

(δs

bδs

b

)+ γ

(δt

bδt

b

)minus 2γ

(δs

bδt

b

)]

le c |sminus t|2

for some constant c isin (0infin) Therefore the entropy of gδ evaluated at δ is boundedby Kδminusd for any δ gt 0 and with an appropriate choice of K gt 0 Therefore for allδ lt δ0

P ( sup|t|leδb

|g(t)| ge 14b

) = P ( sup|t|le1

gδ (t) ge 14δ

) le cdδminusdminusη exp(minus 1

16cγδ2)

for some constant cd and η gt 0 The last inequality is a direct application of The-orem 411 of [4] The conclusion of the lemma follows immediately by choosing c

and δ appropriately 2

REFERENCES

[1] RJ Adler K Bartz and S Kou Estimating curvatures in the Gaussian kinematic formulaIn preparation

[2] RJ Adler J Blanchet and J Liu Efficient simulation for tail probabilities of Gaussianrandom fields In WSC rsquo08 Proceedings of the 40th Conference on Winter Simulationpages 328ndash336 Winter Simulation Conference 2008

[3] RJ Adler P Muller and Rozovskii B editors Stochastic Modelling in Physical Oceanog-raphy Birkhauser Boston 1996

[4] RJ Adler and JE Taylor Random Fields and Geometry Springer New York USA 2007

[5] RJ Adler JE Taylor and K Worsley Applications of Random Fields and GeometryFoundations and Case Studies httpwebeetechnionacilpeopleadlerhrfpdf

MONTE CARLO FOR RANDOM FIELDS 41

[6] S Asmussen and P Glynn Stochastic Simulation Algorithms and Analysis Springer-VerlagNew York NY USA 2008

[7] JM Azais and M Wschebor Level Sets and Extrema of Random Processes and FieldsWiley New York NY 2009

[8] JM Bardeen JR Bond N Kaiser and AS Szalay The statistics of peaks of Gaussianrandom fields The Astrophysical Journal 30415ndash61 1986

[9] SM Berman Sojourns and Extremes of Stochastic Processes The Wadsworth ampBrooksCole StatisticsProbability Series Wadsworth amp BrooksCole Advanced Books ampSoftware Pacific Grove CA 1992

[10] J Blanchet Efficient importance sampling for binary contingency tables Ann of ApplProb 19949ndash982 2009

[11] C Borell The Brunn-Minkowski inequality in Gauss space Invent Math 30205ndash216 1975

[12] J Bucklew Introduction to Rare-Event Simulation Springer-Verlag New York NY 2004

[13] MR Dennis Nodal densities of planar Gaussian random waves Eur Phys J 145191ndash2102007

[14] RM Dudley Sample functions of the Gaussian process Ann Probab 166ndash103 1973

[15] KJ Friston KJ Worsley RSJ Frackowiak JC Mazziotta and AC Evans Assessing thesignificance of focal activations using their spatial extent Human Brain Mapping 1214ndash2201994

[16] MR Leadbetter G Lindgren and H Rootzen Extremes and Related Properties of RandomSequences and Processes Springer-Verlag New York 1983

[17] M Mitzenmacher and E Upfal Probability and Computing Randomized Algorithms andProbabilistic Analysis Cambridge University Press 2005

[18] W Niemiro and P Pokarowski Fixed precision MCMC estimation by median of productsof averages J Appl Probab 46309ndash329 2009

[19] V Piterbarg Asymptotic Methods in the Theory of Gaussian Processes American Mathe-matical Society USA 1995

[20] SF Shandarin Testing non-Gaussianity in cosmic microwave background maps by morpho-logical statistics Mon Not R Astr Soc 331865ndash874 2002

[21] SF Shandarin HA Feldman Y Xu and M Tegmark Morphological measures of non-gaussianity in cosmic microwave background maps Astrophys J Suppl 1411ndash11 2002

[22] L Shepp and H Landau On the supremum of a Gaussian process Sankhya A 32369ndash378December 1970

[23] JE Taylor A Takemura and RJ Adler Validity of the expected Euler characteristicheuristic Ann Probab 33(4)1362ndash1396 2005

[24] JE Taylor and KJ Worsley Detecting sparse signals in random fields with an applicationto brain mapping J Amer Statist Assoc 102(479)913ndash928 2007

[25] J Traub G Wasilokowski and H Wozniakowski Information-Based Complexity AcademicPress New York NY 1988

[26] BS Tsirelson IA Ibragimov and VN Sudakov Norms of Gaussian sample functionsProceedings of the Third Japan-USSR Symposium on Probability Theory (Tashkent 1975)55020ndash41 1976

[27] V S Tsirelrsquoson The density of the maximum of a gaussian process Theory of Probab Appl20847ndash856 1975

[28] H Wozniakowski Computational complexity of continuous problems 1996 Technical Re-port

Robert J AdlerElectrical EngineeringTechnion Haifa Israel 32000E-mail roberteetechnionacilURL webeetechnionacilpeopleadler

Jose H BlanchetIndustrial Engineering amp Operations ResearchColumbia Uninversity340 SW Mudd Building500 W 120 StreetNew York New York 10027E-mail joseblanchetcolumbiaeduURL wwwieorcolumbiaedufac-biosblanchetfacultyhtml

42 ADLER BLANCHET LIU

Jingchen LiuDepartment of StatisticsColumbia University1255 Amsterdam Avenue Room 1030New York NY 10027E-mail jcliustatcolumbiaeduURL statcolumbiaedu~jcliu

Page 25: EFFICIENT MONTE CARLO FOR HIGH EXCURSIONS OF … › ~jblanche › papers › GRF1.pdf · 1. Introduction. This paper centers on the design and analysis of e–cient Monte Carlo techniques

MONTE CARLO FOR RANDOM FIELDS 25

We thus obtain that

(617)

int b+ab

bF prime(x) dx

P suptisinT f(t) gt b le 4aδminus80

P suptisinT f(t) gt bminus 1bP suptisinT f(t) gt b

for any b ge b0 This inequality together with the fact that F is continuously differ-entiable on (minusinfininfin) yields the proof of the lemma for b ge 0 2

The previous result translates a question that involves the conditional distri-bution of maxtisinT f (t) near b into a question involving the tail distribution ofmaxtisinT f (t) The next result then provides a bound on this tail distribution

Lemma 614 For each v gt 0 there exists a constant C(v) isin (0infin) (possiblydepending on v gt 0 but otherwise independent of b) so that such that

P

(maxtisinT

f (t) gt b

)le C(v)b2dβ+dv+1 max

tisinTP (f (t) gt b)

for all b ge 1

Proof The proof of this result follows along the same lines of Theorem 262 in[5] Consider an open cover of T = cupM

i=1Ti(θ) where Ti(θ) = s |s minus ti| lt θWe choose ti carefully such that N(θ) = O(θminusd) for θ arbitrarily small Writef (t) = g (t)+micro (t) where g (t) is a centered Gaussian random field and note usingA2 and A3 that

P

(max

tisinTi(θ)f (t) gt b

)le P

(max

tisinTi(θ)g (t) gt bminus micro (ti)minus κHθβ

)

Now we wish to apply the Borel-TIS inequality (Theorem 68) with U = Ti (θ) f0 =g d (s t) = E12([g (t)minus g (s)]2) which as a consequence of A2 and A3 is boundedabove by C0 |tminus s|β2 for some C0 isin (0infin) Thus applying Theorem 67 we havethat E maxtisinTi(θ) g (t) le C1θ

β2 log(1θ) for some C1 isin (0infin) Consequently theBorel-TIS inequality yields that there exists C2 (v) isin (0infin) such that for all bsufficiently large and θ sufficiently small we have

P

(max

tisinTi(θ)g (t) gt bminus micro (ti)minus κHθβ

)le C2 (v) exp

(minus

(bminus micro (ti)minus C1θ

β(2+βv))2

2σ2Ti

)

where σTi = maxtisinTi(θ) σ (t) Now select v gt 0 small enough and set θβ(2+βv) =bminus1 Straightforward calculations yield that

P

(max

tisinTi(θ)f (t) gt b

)le P

(max

tisinTi(θ)g (t) gt bminus micro (ti)minus κHθβ

)

le C3 (v) maxtisinTi(θ)

exp

(minus (bminus micro (t))2

2σ (t)2

)

26 ADLER BLANCHET LIU

for some C3(v) isin (0infin) Now recall the well known inequality (valid for x gt 0)that

φ (x)(

1xminus 1

x3

)le 1minus Φ(x) le φ (x)

x

where φ = Φprime is the standard Gaussian density Using this inequality it follows thatC4 (v) isin (0infin) can be chosen so that

maxtisinTi(θ)

exp

(minus (bminus micro (t))2

2σ (t)2

)le C4 (v) bmax

tisinTi

P (f (t) gt b)

for all b ge 1 We then conclude that there exists C (v) isin (0infin) such that

P

(maxtisinT

f (t) gt b

)le N (θ)C4 (v) bmax

tisinTP (f (t) gt b)

le CθminusdbmaxtisinT

P (f (t) gt b) = Cb2dβ+dv+1 maxtisinT

P (f (t) gt b)

giving the result 2

We can now complete the proof of Proposition 65

Proof of Proposition 65 The result is a straightforward corollary of the previoustwo lemmas By (615) in Lemma 613 and Lemma 614 there exists λ isin (0infin) forwhich

P

(maxtisinT

f(t) le b + ab | maxtisinT

f(t) gt b

)

le aAP (maxtisinT f (t) ge bminus 1b)

P (maxtisinT f (t) ge b)

le aCAb2dβ+dv+1 maxtisinT P (f (t) gt bminus 1b)P (maxtisinT f (t) ge b)

le aCAb2dβ+dv+1 maxtisinT P (f (t) gt bminus 1b)maxtisinT P (f (t) gt b)

le aλb2dβ+dv+1

The last two inequalities follow from the obvious bound

P

(maxtisinT

f (t) ge b

)ge max

tisinTP (f (t) gt b)

and standard properties of the Gaussian distribution This yields (612) from whichthe remainder of the proposition follows 2

MONTE CARLO FOR RANDOM FIELDS 27

7 Fine Tuning Twice Differentiable Homogeneous Fields In the pre-ceeding section we constructed a polynomial time algorithm based on a randomizeddiscretization scheme Our goal in this section is to illustrate how to take advantageof additional information to further improve the running time and the efficiency ofthe algorithm In order to illustrate our techniques we shall perform a more refinedanalysis in the setting of smooth and homogeneous fields and shall establish op-timality of the algorithm in a precise sense to described below Our assumptionsthroughout this section are B1ndashB2 of Section 3

Let C(sminus t) = Cov(f(s) f(t)) be the covariance function of f which we assumealso has mean zero Note that it is an immediate consequence of homogeneity anddifferentiability that partiC(0) = part3

ijkC(0) = 0We shall need the following definition

Definition 7.1. We call $\mathcal{T} = \{t_1, \dots, t_M\} \subset T$ a $\theta$-regular discretization of $T$ if and only if
\[ \min_{i \ne j} |t_i - t_j| \ge \theta, \qquad \sup_{t \in T}\min_i |t_i - t| \le 2\theta. \]

Regularity ensures that points in the grid $\mathcal{T}$ are well separated. Intuitively, since $f$ is smooth, having tight clusters of points translates to a waste of computing resources, as a result of sampling highly correlated values of $f$. Also note that every region containing a ball of radius $2\theta$ has at least one representative in $\mathcal{T}$. Therefore $\mathcal{T}$ covers the domain $T$ in an economical way. One technical convenience of $\theta$-regularity is that, for subsets $A \subseteq T$ that have positive Lebesgue measure (in particular, ellipsoids),
\[ \lim_{M \to \infty} \frac{\#(A \cap \mathcal{T})}{M} = \frac{m(A)}{m(T)}, \]
where, here and throughout the remainder of the section, $\#(A)$ denotes the cardinality of the set $A$.
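For concreteness, a simple way to produce such a discretization on the unit cube is a cubic grid. The sketch below is purely editorial (Python, with all names ours, not the paper's); as noted in the comments, the covering requirement of Definition 7.1 holds for this construction in low dimensions.

```python
# Editorial sketch: a theta-regular discretization of T = [0,1]^d via a cubic
# grid of spacing theta.  Separation: distinct grid points are >= theta apart.
# Covering: every t in T lies within (theta/2)*sqrt(d) of the grid, which is
# <= 2*theta whenever d <= 16 (amply covering the low-dimensional cases).
import itertools
import numpy as np

def regular_grid(theta, d):
    pts_1d = np.arange(0.0, 1.0 + 1e-12, theta)
    return np.array(list(itertools.product(pts_1d, repeat=d)))

grid = regular_grid(theta=0.1, d=2)
print(grid.shape)   # (121, 2): M = 11^2 points; #(A & grid)/M ~ m(A)/m(T)
```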

Let $\mathcal{T} = \{t_1, \dots, t_M\}$ be a $\theta$-regular discretization of $T$ and consider
\[ X = (X_1, \dots, X_M)^{\top} \triangleq (f(t_1), \dots, f(t_M))^{\top}. \]
We shall concentrate on estimating $w_M(b) = P(\max_{1 \le i \le M} X_i > b)$. The next result (which we prove in Section 7.1) shows that if $\theta = \varepsilon/b$, then the relative bias is $O(\sqrt{\varepsilon})$.

Proposition 7.2. Suppose $f$ is a Gaussian random field satisfying conditions B1 and B2. There exist $c_0$, $c_1$, $b_0$ and $\varepsilon_0$ such that, for any finite $\varepsilon/b$-regular discretization $\mathcal{T}$ of $T$,
\[ (7.1) \qquad P\Big(\sup_{t \in \mathcal{T}} f(t) < b \,\Big|\, \sup_{t \in T} f(t) > b\Big) \le c_0\sqrt{\varepsilon} \quad\text{and}\quad \#(\mathcal{T}) \le c_1 m(T)\varepsilon^{-d}b^d, \]
for all $\varepsilon \in (0, \varepsilon_0]$ and $b > b_0$.

Note that the bound on the bias obtained for twice differentiable fields is much sharper than that for the general Hölder continuous fields given by (6.13) in Theorem 6.6. This is partly because the conditional distribution of the random field around local maxima is harder to describe in the Hölder continuous case than in the twice differentiable case. In addition to the sharper description of the bias, we shall also soon show, in Theorem 7.4, that our choice of discretization is optimal in a certain sense. Finally, we point out that the bound of $\sqrt{\varepsilon}$ in the first term of (7.1) is not optimal. In fact, there seems to be some room for improvement, and we believe that a more careful analysis might yield a bound of the form $c_0\varepsilon^2$.

We shall estimate $w_M(b)$ by using a slight variation of Algorithm 5.3. In particular, since the $X_i$'s are now identically distributed, we redefine $Q$ to be
\[ (7.2) \qquad Q(X \in B) = \sum_{i=1}^{M} \frac{1}{M}\, P[X \in B \,|\, X_i > b - 1/b]. \]
Our estimator then takes the form
\[ (7.3) \qquad L_b = \frac{M \times P(X_1 > b - 1/b)}{\sum_{j=1}^{M} 1(X_j > b - 1/b)}\, 1\Big(\max_{1 \le i \le M} X_i > b\Big). \]
Clearly, we have that $E_Q(L_b) = w_M(b)$. (The reason for subtracting the factor of $1/b$ was explained in Section 6.1.)

Algorithm 7.3. Given a number of replications $n$ and an $\varepsilon/b$-regular discretization $\mathcal{T}$, the algorithm is as follows:

Step 1. Sample $X^{(1)}, \dots, X^{(n)}$, i.i.d. copies of $X$, with distribution $Q$ given by (7.2).
Step 2. Compute and output
\[ L_n = \frac{1}{n}\sum_{i=1}^{n} L_b^{(i)}, \qquad\text{where}\qquad L_b^{(i)} = \frac{M \times P(X_1 > b - 1/b)}{\sum_{j=1}^{M} 1(X_j^{(i)} > b - 1/b)}\, 1\Big(\max_{1 \le j \le M} X_j^{(i)} > b\Big). \]
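To fix ideas, here is a minimal editorial sketch of Algorithm 7.3 in Python; the paper itself prescribes no implementation, so all names (estimate_w, cov_fn, the example covariance) and the use of numpy/scipy are our assumptions. Sampling from the mixture $Q$ of (7.2) picks a uniform index $i$, draws $X_i$ from a standard normal truncated to $(b - 1/b, \infty)$, and fills in the remaining coordinates by exact Gaussian conditioning; the Cholesky factor is computed once and reused across replications.

```python
# Editorial sketch of Algorithm 7.3 (all names and tooling ours).
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

def estimate_w(pts, cov_fn, b, n):
    M = len(pts)
    Sigma = cov_fn(pts[:, None, :] - pts[None, :, :])  # M x M covariance C(s-t)
    Sigma += 1e-9 * np.eye(M)                          # jitter for stability
    L = np.linalg.cholesky(Sigma)                      # O(M^3), done once
    thr = b - 1.0 / b
    p_single = norm.sf(thr)                            # P(X_1 > b - 1/b)
    out = np.empty(n)
    for k in range(n):
        i = rng.integers(M)                            # uniform mixture index
        x_i = norm.ppf(rng.uniform(norm.cdf(thr), 1.0))   # X_i | X_i > thr
        y = L @ rng.standard_normal(M)                 # unconditional draw
        x = y + Sigma[:, i] * (x_i - y[i]) / Sigma[i, i]  # condition on X_i
        out[k] = M * p_single / np.sum(x > thr) * (x.max() > b)   # (7.3)
    return out.mean()

# Example: a smooth stationary field on a 1-d grid (hypothetical covariance).
pts = np.linspace(0.0, 1.0, 50)[:, None]
cov = lambda h: np.exp(-np.sum(h**2, axis=-1) / 0.02)
print(estimate_w(pts, cov, b=3.0, n=2000))
```

Conditioning via $x = y + \Sigma_{\cdot i}(x_i - y_i)/\Sigma_{ii}$ is the standard Gaussian-conditioning identity; it avoids refactorizing the covariance for each mixture index $i$, and the denominator in (7.3) is always at least one because the coordinate $X_i$ exceeds the threshold by construction.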

Theorem 7.5 below guides the selection of $n$ in order to achieve a prescribed relative error. In particular, our analysis, together with considerations from Section 2, implies that choosing $n = O(\varepsilon^{-2}\delta^{-1})$ suffices to achieve $\varepsilon$ relative error with probability at least $1 - \delta$.

Algorithm 7.3 improves on Algorithm 6.1 for Hölder continuous fields in two important ways. The first is that it is possible to obtain information on the size of the relative bias of the estimator. In Proposition 7.2 we saw that, in order to overcome the bias due to discretization, it suffices to take a discretization of size $M = \#(\mathcal{T}) = \Theta(b^d)$. That this selection is also asymptotically optimal, in the sense described in the next result, will be proven in Section 7.1.

Theorem 7.4. Suppose $f$ is a Gaussian random field satisfying conditions B1 and B2. If $\theta \in (0,1)$, then, as $b \to \infty$,
\[ \sup_{\#(\mathcal{T}) \le b^{\theta d}} P\Big(\sup_{t \in \mathcal{T}} f(t) > b \,\Big|\, \sup_{t \in T} f(t) > b\Big) \to 0. \]

This result implies that the relative bias goes to 100% as $b \to \infty$ if one chooses a discretization scheme of size $O(b^{\theta d})$ with $\theta \in (0,1)$. Consequently, $d$ is the smallest power of $b$ that achieves any given bounded relative bias, and so the suggestion above of choosing $M = O(b^d)$ points for the discretization is, in this sense, optimal.

The second aspect of improvement involves the variance. In the case of Hölder continuous fields, the ratio of the second moment of the estimator to $w(b)^2$ was shown to be bounded by a quantity of order $O(M^2)$. In contrast, in the context of the smooth and homogeneous fields considered here, the next result shows that this ratio is bounded uniformly for $b > b_0$ and $M = \#(\mathcal{T}) \ge cb^d$. That is, the variance remains strongly controlled.

Theorem 7.5. Suppose $f$ is a Gaussian random field satisfying conditions B1 and B2. Then there exist constants $c$, $b_0$ and $\varepsilon_0$ such that, for any $\varepsilon/b$-regular discretization $\mathcal{T}$ of $T$, we have
\[ \sup_{b > b_0,\, \varepsilon \in [0,\varepsilon_0]} \frac{E_Q L_b^2}{P^2\big(\sup_{t \in T} f(t) > b\big)} \le \sup_{b > b_0,\, \varepsilon \in [0,\varepsilon_0]} \frac{E_Q L_b^2}{P^2\big(\sup_{t \in \mathcal{T}} f(t) > b\big)} \le c \]
for some $c \in (0,\infty)$.

Finally we note that the proof of Theorem 33 follows as a direct corollary of The-orem 75 together with Proposition 72 and our discussion in Section 2 Assumingthat placing each point in T takes no more than c units of computer time the totalcomplexity is according to the discussion in section Section 2 O

(nM3 + M

)=

O(εminus2δminus1M3 + M

) The contribution of the term M3 = O

(εminus6b3

)comes from

the complexity of applying Cholesky factorization and the term M = O(εminus2b)corresponds to the complexity of placing T

Remark 7.6. Condition B2 imposes a convexity assumption on the boundary of $T$. This assumption, although convenient in the development of the proofs of Theorems 7.4 and 7.5, is not necessary. The results can be generalized, at the expense of increasing the length and the burden of the technical development, to the case in which $T$ is a $d$-dimensional manifold satisfying the so-called Whitney conditions ([4]).

The remainder of this section is devoted to the proofs of Proposition 7.2, Theorem 7.4 and Theorem 7.5.

7.1. Bias Control: Proofs of Proposition 7.2 and Theorem 7.4. We start with some useful lemmas, for all of which we assume that Conditions B1 and B2 are satisfied. We shall also assume that the global maximum of $f$ over $T$ is achieved, with probability one, at a single point in $T$. Additional conditions under which this will happen can be found in [4], and require little more than the non-degeneracy of the joint distribution of $f$ and its first and second order derivatives. Of these lemmas, Lemma 7.7, the proof of which we defer to Section 7.3, is of central importance.

Lemma 7.7. For any $a_0 > 0$, there exist $c_*$, $\delta_*$, $b_0$ and $\delta_0$ (which depend on the choice of $a_0$) such that, for any $s \in T$, $a \in (0, a_0)$, $\delta \in (0, \delta_0)$ and $b > b_0$,
\[ P\Big(\min_{|t - s| < \delta a b^{-1}} f(t) < b \,\Big|\, f(s) = b + a/b,\ s \in \mathcal{L}\Big) \le c_*\exp\Big(-\frac{\delta_*}{\delta^2}\Big), \]
where $\mathcal{L}$ is the set of local maxima of $f$.

The next lemma is a direct corollary of Lemma 7.7.

Lemma 7.8. Let $t_*$ be the point in $T$ at which the global maximum of $f$ is attained. Then, with the same choice of constants as in Lemma 7.7, for any $a \in (0, a_0)$, $\delta \in (0, \delta_0)$ and $b > b_0$, we have
\[ P\Big(\min_{|t - t_*| < \delta a b^{-1}} f(t) < b \,\Big|\, f(t_*) = b + a/b\Big) \le 2c_*\exp\Big(-\frac{\delta_*}{\delta^2}\Big). \]

Proof. Note that, for any $s \in T$,
\begin{align*}
P\Big(\min_{|t-s| < \delta a/b} f(t) < b \,\Big|\, f(s) = b + a/b,\ s = t_*\Big)
&= \frac{P\big(\min_{|t-s| < \delta a/b} f(t) < b,\ f(s) = b + a/b,\ s = t_*\big)}{P(f(s) = b + a/b,\ s = t_*)}\\
&\le \frac{P\big(\min_{|t-s| < \delta a/b} f(t) < b,\ f(s) = b + a/b,\ s \in \mathcal{L}\big)}{P(f(s) = b + a/b,\ s = t_*)}\\
&= P\Big(\min_{|t-s| < \delta a/b} f(t) < b \,\Big|\, f(s) = b + a/b,\ s \in \mathcal{L}\Big)\,\frac{P(f(s) = b + a/b,\ s \in \mathcal{L})}{P(f(s) = b + a/b,\ s = t_*)}\\
&\le c_*\exp\Big(-\frac{\delta_*}{\delta^2}\Big)\,\frac{P(f(s) = b + a/b,\ s \in \mathcal{L})}{P(f(s) = b + a/b,\ s = t_*)}.
\end{align*}
Applying this bound, together with the fact that
\[ \frac{P(f(s) = b + a/b,\ s \in \mathcal{L})}{P(f(s) = b + a/b,\ s = t_*)} \to 1 \]
as $b \to \infty$, the conclusion of the lemma holds. □

The following lemma gives a bound on the density of $\sup_{t \in T} f(t)$, which will be used to control the size of the overshoot beyond the level $b$.

Lemma 7.9. Let $p_{f^*}(x)$ be the density function of $\sup_{t \in T} f(t)$. Then there exist constants $c_{f^*}$ and $b_0$ such that
\[ p_{f^*}(x) \le c_{f^*}\, x^{d+1} P(f(0) > x) \]
for all $x > b_0$.

Proof. Recalling (1.3), let the continuous function $p_E(x)$, $x \in \mathbb{R}$, be defined by the relationship
\[ E\big(\chi(\{t \in T : f(t) \ge b\})\big) = \int_b^{\infty} p_E(x)\, dx, \]
where the left hand side is the expected value of the Euler-Poincaré characteristic of $A_b$. Then, according to Theorem 8.10 in [7], there exist $c$ and $\delta$ such that
\[ |p_E(x) - p_{f^*}(x)| < c\,P(f(0) > (1+\delta)x) \]
for all $x > 0$. In addition, thanks to the result of [4], which provides $\int_b^{\infty} p_E(x)\,dx$ in closed form, there exists $c_0$ such that, for all $x > 1$,
\[ p_E(x) < c_0 x^{d+1} P(f(0) > x). \]
Hence there exists $c_{f^*}$ such that
\[ p_{f^*}(x) \le c_0 x^{d+1} P(f(0) > x) + c\,P(f(0) > (1+\delta)x) \le c_{f^*} x^{d+1} P(f(0) > x) \]
for all $x > 1$. □

The last ingredients required for the proofs of Proposition 7.2 and Theorem 7.4 are stated in the following well known theorem, adapted from Lemma 6.1 and Theorem 7.2 of [19] to the twice differentiable case.

Theorem 7.10. There exists a constant $H$ (depending on the covariance function $C$) such that
\[ (7.4) \qquad P\Big(\sup_{t \in T} f(t) > b\Big) = (1 + o(1))\, H\, m(T)\, b^d\, P(f(0) > b) \]
as $b \to \infty$. Similarly, choose $\delta$ small enough so that $[0, \delta]^d \subset T$, and let $\Delta_0 = [0, b^{-1}]^d$. Then there exists a constant $H_1$ such that
\[ (7.5) \qquad P\Big(\sup_{t \in \Delta_0} f(t) > b\Big) = (1 + o(1))\, H_1\, P(f(0) > b) \]
as $b \to \infty$.

We are now ready to give the proofs of Proposition 7.2 and Theorem 7.4.

Proof of Proposition 7.2. The fact that there exists $c_1$ such that $\#(\mathcal{T}) \le c_1 m(T)\varepsilon^{-d}b^d$ is immediate from assumption B2. Therefore we proceed to provide a bound for the relative bias. We choose $\varepsilon_0 < \delta_0^2$ ($\delta_0$ as given in Lemmas 7.7 and 7.8) and note that, for any $\varepsilon < \varepsilon_0$,
\begin{align*}
P\Big(\sup_{t \in \mathcal{T}} f(t) < b \,\Big|\, \sup_{t \in T} f(t) > b\Big)
\le\ & P\Big(\sup_{t \in T} f(t) < b + 2\sqrt{\varepsilon}/b \,\Big|\, \sup_{t \in T} f(t) > b\Big)\\
& + P\Big(\sup_{t \in \mathcal{T}} f(t) < b \,\Big|\, \sup_{t \in T} f(t) > b + 2\sqrt{\varepsilon}/b\Big).
\end{align*}
By (7.4) and Lemma 7.9, there exists $c_2$ such that the first term above can be bounded by
\[ P\Big(\sup_{t \in T} f(t) < b + 2\sqrt{\varepsilon}/b \,\Big|\, \sup_{t \in T} f(t) > b\Big) \le c_2\sqrt{\varepsilon}. \]
The second term can be bounded by
\begin{align*}
P\Big(\sup_{t \in \mathcal{T}} f(t) < b \,\Big|\, \sup_{t \in T} f(t) > b + 2\sqrt{\varepsilon}/b\Big)
&\le P\Big(\sup_{|t - t_*| < 2\varepsilon b^{-1}} f(t) < b \,\Big|\, \sup_{t \in T} f(t) > b + 2\sqrt{\varepsilon}/b\Big)\\
&\le 2c_*\exp\big(-\delta_*\varepsilon^{-1}\big).
\end{align*}
Hence there exists $c_0$ such that
\[ P\Big(\sup_{t \in \mathcal{T}} f(t) < b \,\Big|\, \sup_{t \in T} f(t) > b\Big) \le c_2\sqrt{\varepsilon} + 2c_* e^{-\delta_*/\varepsilon} \le c_0\sqrt{\varepsilon} \]
for all $\varepsilon \in (0, \varepsilon_0)$. □

Proof of Theorem 7.4. We write $\theta = 1 - 3\delta \in (0,1)$. First note that, by (7.4),
\[ P\Big(\sup_{t \in T} f(t) > b + b^{2\delta - 1} \,\Big|\, \sup_{t \in T} f(t) > b\Big) \to 0 \]
as $b \to \infty$. Let $t_*$ be the position of the global maximum of $f$ in $T$. According to the exact Slepian model in Section 7.3, and an argument similar to the proof of Lemma 7.8,
\[ (7.6) \qquad P\Big(\sup_{|t - t_*| > b^{2\delta - 1}} f(t) > b \,\Big|\, b < f(t_*) \le b + b^{2\delta - 1}\Big) \to 0 \]
as $b \to \infty$. Consequently,
\[ P\Big(\sup_{|t - t_*| > b^{2\delta - 1}} f(t) > b \,\Big|\, \sup_{t \in T} f(t) > b\Big) \to 0. \]

Let
\[ B(\mathcal{T}, b^{2\delta - 1}) = \bigcup_{t \in \mathcal{T}} B(t, b^{2\delta - 1}). \]
We have
\begin{align*}
P\Big(\sup_{t \in \mathcal{T}} f(t) > b \,\Big|\, \sup_{t \in T} f(t) > b\Big)
\le\ & P\Big(\sup_{|t - t_*| > b^{2\delta-1}} f(t) > b \,\Big|\, \sup_{t \in T} f(t) > b\Big)\\
& + P\Big(\sup_{t \in \mathcal{T}} f(t) > b,\ \sup_{|t - t_*| > b^{2\delta-1}} f(t) \le b \,\Big|\, \sup_{t \in T} f(t) > b\Big)\\
\le\ & o(1) + P\Big(t_* \in B(\mathcal{T}, b^{2\delta-1}) \,\Big|\, \sup_{t \in T} f(t) > b\Big)\\
\le\ & o(1) + P\Big(\sup_{B(\mathcal{T}, b^{2\delta-1})} f(t) > b \,\Big|\, \sup_{t \in T} f(t) > b\Big).
\end{align*}
Since $\#(\mathcal{T}) \le b^{(1-3\delta)d}$, we can find a finite set $T' = \{t_1', \dots, t_l'\} \subset T$, with $\Delta_k = t_k' + [0, b^{-1}]^d$, such that $l = O(b^{(1-\delta)d})$ and $B(\mathcal{T}, b^{2\delta-1}) \subset \cup_{k=1}^{l}\Delta_k$. The choice of $l$ depends only on $\#(\mathcal{T})$, not on the particular distribution of $\mathcal{T}$. Therefore, applying (7.5),
\[ \sup_{\#(\mathcal{T}) \le b^{\theta d}} P\Big(\sup_{B(\mathcal{T}, b^{2\delta-1})} f(t) > b\Big) \le O(b^{(1-\delta)d})\, P(f(0) > b). \]
This, together with (7.4), yields
\[ \sup_{\#(\mathcal{T}) \le b^{\theta d}} P\Big(\sup_{B(\mathcal{T}, b^{2\delta-1})} f(t) > b \,\Big|\, \sup_{t \in T} f(t) > b\Big) \le O(b^{-\delta d}) = o(1) \]
for $b \ge b_0$, which clearly implies the statement of the result. □

7.2. Variance Control: Proof of Theorem 7.5. We proceed directly to the proof of Theorem 7.5.

Proof of Theorem 7.5. Since $L_b = 1(\max_j X_j > b)\, dP/dQ$, we have $E_Q L_b^2 = E(L_b)$, where $E$ denotes expectation under the original measure $P$. Note then that
\begin{align*}
\frac{E_Q L_b^2}{P^2(\sup_{t \in T} f(t) > b)}
&= \frac{E(L_b)}{P^2(\sup_{t \in T} f(t) > b)}\\
&= \frac{E\Big(\frac{M \times P(X_1 > b - 1/b)}{\sum_{j=1}^{M} 1(X_j > b - 1/b)};\ \max_j X_j > b\Big)}{P^2(\sup_{t \in T} f(t) > b)}\\
&= \frac{E\Big(\frac{M \times P(X_1 > b - 1/b)}{\sum_{j=1}^{M} 1(X_j > b - 1/b)};\ \max_j X_j > b,\ \sup_{t \in T} f(t) > b\Big)}{P^2(\sup_{t \in T} f(t) > b)}\\
&= \frac{E\Big(\frac{M P(X_1 > b - 1/b)\, 1(\max_j X_j > b)}{\sum_{j=1}^{M} 1(X_j > b - 1/b)} \,\Big|\, \sup_{t \in T} f(t) > b\Big)}{P(\sup_{t \in T} f(t) > b)}\\
&= E\Bigg(\frac{M\, 1(\max_j X_j > b)}{\sum_{j=1}^{M} 1(X_j > b - 1/b)} \,\Bigg|\, \sup_{t \in T} f(t) > b\Bigg)\frac{P(X_1 > b - 1/b)}{P(\sup_{t \in T} f(t) > b)}.
\end{align*}
The remainder of the proof involves showing that the last conditional expectation here is of order $O(b^d)$. This, together with (7.4), will yield the result. Note that, for any $A(b, \varepsilon)$ such that $A(b, \varepsilon) = \Theta(M)$ uniformly over $b$ and $\varepsilon$, we can write
\begin{align*}
(7.7) \qquad & E\Bigg(\frac{M\, 1(\max_j X_j > b)}{\sum_{j=1}^{M} 1(X_j > b - 1/b)} \,\Bigg|\, \sup_{t \in T} f(t) > b\Bigg)\\
&\quad \le E\Bigg(\frac{M\, 1\big(\sum_{j=1}^{M} 1(X_j > b - 1/b) \ge M/A(b,\varepsilon)\big)}{\sum_{j=1}^{M} 1(X_j > b - 1/b)} \,\Bigg|\, \sup_{t \in T} f(t) > b\Bigg)\\
&\quad\quad + E\Bigg(\frac{M\, 1\big(1 \le \sum_{j=1}^{M} 1(X_j > b - 1/b) < M/A(b,\varepsilon)\big)}{\sum_{j=1}^{M} 1(X_j > b - 1/b)} \,\Bigg|\, \sup_{t \in T} f(t) > b\Bigg).
\end{align*}
We shall select $A(b, \varepsilon)$ appropriately in order to bound the two expectations above. By Lemma 7.8, for any $4\varepsilon \le \delta \le \delta_0$, there exist constants $c'$ and $c'' \in (0,\infty)$ such that
\begin{align*}
c_*\exp\Big(-\frac{\delta_*}{\delta^2}\Big)
&\ge P\Big(\min_{|t - t_*| \le \delta/b} f(t) < b - 1/b \,\Big|\, \sup_{t \in T} f(t) > b\Big)\\
&\ge P\Bigg(\sum_{j=1}^{M} 1(X_j > b - 1/b) \le c'\delta^d\varepsilon^{-d} \,\Bigg|\, \sup_{t \in T} f(t) > b\Bigg)\\
&\ge P\Bigg(\frac{M}{\sum_{j=1}^{M} 1(X_j > b - 1/b)} \ge \frac{b^d c''}{\delta^d} \,\Bigg|\, \sup_{t \in T} f(t) > b\Bigg). \qquad (7.8)
\end{align*}
The first inequality is an application of Lemma 7.8. The second is due to the fact that any ball of radius $\delta/b \ge 4\varepsilon/b$ contains at least $c'\delta^d\varepsilon^{-d}$ elements of $\mathcal{T}$, for some $c' > 0$. Inequality (7.8) implies that, for all $x$ with $c''/\delta_0^d < x < c''/(4^d\varepsilon^d)$, there exists $\delta_{**} > 0$ such that
\[ P\Bigg(\frac{M}{b^d\sum_{j=1}^{M} 1(X_j > b - 1/b)} \ge x \,\Bigg|\, \sup_{t \in T} f(t) > b\Bigg) \le c_*\exp\big(-\delta_{**}x^{2/d}\big). \]
Now let $A(b, \varepsilon) = b^d c''/(4^d\varepsilon^d)$ and observe that, by the second result in (7.1), we have $A(b, \varepsilon) = \Theta(M)$, and, moreover, that there exists $c_3$ such that the first term on the right hand side of (7.7) is bounded by
\[ (7.9) \qquad E\Bigg(\frac{M\, 1\big(\sum_{j=1}^{M} 1(X_j > b - 1/b) \ge M/A(b,\varepsilon)\big)}{\sum_{j=1}^{M} 1(X_j > b - 1/b)} \,\Bigg|\, \sup_{t \in T} f(t) > b\Bigg) \le c_3 b^d. \]
Now we turn to the second term on the right hand side of (7.7). We use the fact that $M/A(b, \varepsilon) \le c''' \in (0,\infty)$ (uniformly as $b \to \infty$ and $\varepsilon \to 0$). There exist $c_4$ and $c_5$ such that, for $\varepsilon \le \delta_0/c_4$,
\begin{align*}
(7.10) \qquad & E\Bigg(\frac{M\, 1\big(1 \le \sum_{j=1}^{M} 1(X_j > b - 1/b) < M/A(b,\varepsilon)\big)}{\sum_{j=1}^{M} 1(X_j > b - 1/b)} \,\Bigg|\, \sup_{t \in T} f(t) > b\Bigg)\\
&\quad \le M\, P\Bigg(\sum_{j=1}^{M} 1(X_j > b - 1/b) < c''' \,\Bigg|\, \sup_{t \in T} f(t) > b\Bigg)\\
&\quad \le M\, P\Big(\min_{|t - t_*| \le c_4\varepsilon/b} f(t) < b - 1/b \,\Big|\, \sup_{t \in T} f(t) > b\Big)\\
&\quad \le c_1 m(T)\, b^d\varepsilon^{-d}\exp\Big(-\frac{\delta_*}{c_4^2\varepsilon^2}\Big) \le c_5 b^d.
\end{align*}
The second inequality follows from the fact that, if $\sum_{j=1}^{M} 1(X_j > b - 1/b)$ is less than $c'''$, then the minimum of $f$ over a ball of radius $c_4\varepsilon/b$ around $t_*$ must be less than $b - 1/b$; otherwise, there would be more than $c'''$ elements of $\mathcal{T}$ inside such a ball. The last inequality is due to Lemma 7.8 and Proposition 7.2.

Putting (7.9) and (7.10) together, we obtain, for all $\varepsilon/b$-regular discretizations with $\varepsilon < \varepsilon_0 = \min(1/4, 1/c_4)\delta_0$,
\[ \frac{E_Q L_b^2}{P^2(\sup_{t \in T} f(t) > b)} = E\Bigg(\frac{M\, 1(\max_j X_j > b)}{\sum_{j=1}^{M} 1(X_j > b - 1/b)} \,\Bigg|\, \sup_{t \in T} f(t) > b\Bigg)\frac{P(X_1 > b - 1/b)}{P(\sup_{t \in T} f(t) > b)} \le \frac{(c_3 + c_5)\, b^d\, P(X_1 > b - 1/b)}{P(\sup_{t \in T} f(t) > b)}, \]
and, by (7.4), the right hand side is bounded uniformly in $b > b_0$. Applying now Proposition 7.2, which gives
\[ P\Big(\sup_{t \in T} f(t) > b\Big) < \frac{P(\sup_{t \in \mathcal{T}} f(t) > b)}{1 - c_0\sqrt{\varepsilon}}, \]
we conclude that
\[ \sup_{b > b_0,\, \varepsilon \in [0,\varepsilon_0]} \frac{E_Q L_b^2}{P^2\big(\sup_{t \in \mathcal{T}} f(t) > b\big)} < \infty, \]
as required. □

7.3. A final proof: Lemma 7.7. Without loss of generality, we assume that the field has mean zero and unit variance. Before getting into the details of the proof of Lemma 7.7, we need a few additional lemmas, for which we adopt the following notation. Let $C_i$ and $C_{ij}$ be the first and second order derivatives of $C$, and define the vectors
\begin{align*}
\mu_1(t) &= (-C_1(t), \dots, -C_d(t)),\\
\mu_2(t) &= \mathrm{vech}\big((C_{ij}(t),\ i = 1, \dots, d,\ j = i, \dots, d)\big).
\end{align*}
Let $f'(0)$ and $f''(0)$ be the gradient and the vector of second order derivatives of $f$ at $0$, where $f''(0)$ is arranged in the same order as $\mu_2(0)$. Furthermore, let $\mu_{02} = \mu_{20}^{\top}$ be a vector of second order spectral moments and $\mu_{22}$ a matrix of fourth order spectral moments. The vectors $\mu_{02}$ and $\mu_{22}$ are arranged so that
\[ \begin{pmatrix} 1 & 0 & \mu_{02} \\ 0 & \Lambda & 0 \\ \mu_{20} & 0 & \mu_{22} \end{pmatrix} \]
is the covariance matrix of $(f(0), f'(0), f''(0))$, where $\Lambda = (-C_{ij}(0))$. It then follows that
\[ \mu_{2\cdot 0} = \mu_{22} - \mu_{20}\mu_{02} \]
is the conditional variance of $f''(0)$ given $f(0)$. The following lemma, given in [5], provides a stochastic representation of $f$ given that it has a local maximum at level $u$ at the origin.

Lemma 7.11. Given that $f$ has a local maximum with height $u$ at zero (an interior point of $T$), the conditional field is equal in distribution to
\[ (7.11) \qquad f_u(t) \triangleq uC(t) - W_u\beta^{\top}(t) + g(t), \]
where $g(t)$ is a centered Gaussian random field with covariance function
\[ \gamma(s,t) = C(s-t) - (C(s), \mu_2(s))\begin{pmatrix} 1 & \mu_{02} \\ \mu_{20} & \mu_{22} \end{pmatrix}^{-1}\begin{pmatrix} C(t) \\ \mu_2^{\top}(t) \end{pmatrix} - \mu_1(s)\Lambda^{-1}\mu_1^{\top}(t), \]
and $W_u$ is a $d(d+1)/2$-dimensional random vector, independent of $g(t)$, with density function
\[ (7.12) \qquad \psi_u(w) \propto |\det(r^*(w) - u\Lambda)|\exp\Big(-\frac{1}{2}w^{\top}\mu_{2\cdot 0}^{-1}w\Big)\, 1\big(r^*(w) - u\Lambda \in \mathcal{N}\big), \]
where $r^*(w)$ is a $d \times d$ symmetric matrix whose upper triangular elements consist of the components of $w$, and $\mathcal{N}$ denotes the set of negative definite matrices. Finally, $\beta(t)$ is defined by
\[ (\alpha(t), \beta(t)) = (C(t), \mu_2(t))\begin{pmatrix} 1 & \mu_{02} \\ \mu_{20} & \mu_{22} \end{pmatrix}^{-1}. \]

The following two technical lemmas, which we shall prove after completing the proof of Lemma 7.7, provide bounds for the last two terms of (7.11).

Lemma 7.12. Using the notation in (7.11), there exist $\delta_0$, $\varepsilon_1$, $c_1$ and $b_0$ such that, for any $u > b > b_0$ and $\delta \in (0, \delta_0)$,
\[ P\Big(\sup_{|t| \le \delta a/b} |W_u\beta^{\top}(t)| > \frac{a}{4b}\Big) \le c_1\exp\Big(-\frac{\varepsilon_1 b^2}{\delta^4}\Big). \]

Lemma 7.13. There exist $\bar{c}$, $\bar{\delta}$ and $\delta_0$ such that, for any $\delta \in (0, \delta_0)$,
\[ P\Big(\max_{|t| \le \delta a/b} |g(t)| > \frac{a}{4b}\Big) \le \bar{c}\exp\Big(-\frac{\bar{\delta}}{\delta^2}\Big). \]

Proof of Lemma 7.7. Using the notation of Lemma 7.11, given any $s \in \mathcal{L}$ for which $f(s) = b$, the corresponding conditional distribution of $f$ is that of $f_b(\cdot - s)$. Consequently, it suffices to show that
\[ P\Big(\min_{|t| \le \delta a/b} f_b(t) < b - a/b\Big) \le c_*\exp\Big(-\frac{\delta_*}{\delta^2}\Big). \]
We consider first the case in which the local maximum lies in the interior of $T$. Then, by the Slepian model (7.11),
\[ f_b(t) = bC(t) - W_b\beta^{\top}(t) + g(t). \]
We study the three terms of the Slepian model individually. Since
\[ C(t) = 1 - t^{\top}\Lambda t + o(|t|^2), \]
there exists an $\varepsilon_0$ such that
\[ bC(t) \ge b - \frac{a}{4b} \]
for all $|t| < \varepsilon_0\sqrt{a}/b$. According to Lemmas 7.12 and 7.13, for $\delta < \min(\varepsilon_0/\sqrt{a_0}, \delta_0)$,
\begin{align*}
P\Big(\min_{|t| \le \delta a/b} f_b(t) < b - a/b\Big)
&\le P\Big(\max_{|t| < \delta a/b} |g(t)| > \frac{a}{4b}\Big) + P\Big(\sup_{|t| \le \delta a/b} |W_b\beta^{\top}(t)| > \frac{a}{4b}\Big)\\
&\le \bar{c}\exp\Big(-\frac{\bar{\delta}}{\delta^2}\Big) + c_1\exp\Big(-\frac{\varepsilon_1 b^2}{\delta^4}\Big)\\
&\le c_*\exp\Big(-\frac{\delta_*}{\delta^2}\Big)
\end{align*}
for some $c_*$ and $\delta_*$.

Now consider the case in which the local maximum lies in the $(d-1)$-dimensional boundary of $T$. Due to the convexity of $T$, and without loss of generality, we assume that the tangent space of $\partial T$ is generated by $\partial/\partial t_2, \dots, \partial/\partial t_d$, that the local maximum is located at the origin, and that $T$ is a subset of the half-space $\{t_1 \ge 0\}$. That these assumptions do not involve a loss of generality follows from the arguments on pages 291-292 of [4], which rely on the assumed stationarity of $f$ (for translations) and the fact that rotations, while changing the distributions, do not affect the probabilities that we are currently computing.

For the origin, positioned as just described, to be a local maximum, it is necessary and sufficient that the gradient of $f$ restricted to $\partial T$ is the zero vector, that the Hessian matrix restricted to $\partial T$ is negative definite, and that $\partial_1 f(0) \le 0$. Applying a version of Lemma 7.11 for this case, conditional on $f(0) = u$ and $0$ being a local maximum, the field is equal in distribution to
\[ (7.13) \qquad uC(t) - \widetilde{W}_u\beta^{\top}(t) + \mu_1(t)\Lambda^{-1}(Z, 0, \dots, 0)^{\top} + g(t), \]
where $Z \le 0$ corresponds to $\partial_1 f(0)$, and follows a truncated (conditioned on the negative axis) Gaussian distribution with mean zero and a variance parameter given by the conditional variance of $\partial_1 f(0)$ given $(\partial_2 f(0), \dots, \partial_d f(0))$. The vector $\widetilde{W}_u$ is a $((d-1)d/2)$-dimensional random vector, independent of $Z$ in the representation (7.13), with density function
\[ \widetilde{\psi}_u(w) \propto |\det(r^*(w) - u\Lambda)|\exp\Big(-\frac{1}{2}w^{\top}\mu_{2\cdot 0}^{-1}w\Big)\, 1\big(r^*(w) - u\Lambda \in \mathcal{N}\big), \]
where $\Lambda$ is now the second spectral moment matrix of $f$ restricted to $\partial T$ and $r^*(w)$ is the $(d-1) \times (d-1)$ symmetric matrix whose upper triangular elements consist of the components of $w$. As in the proof of Lemma 7.12, one can show that a similar bound holds, albeit with different constants. Thus, since $\mu_1(t) = O(t)$, there exist $c''$ and $\delta''$ such that the third term in (7.13) can be bounded by
\[ P\Big(\max_{|t| \le \delta a/b}\big|\mu_1(t)\Lambda^{-1}(Z, 0, \dots, 0)^{\top}\big| \ge \frac{a}{4b}\Big) \le c''\exp\Big(-\frac{\delta''}{\delta^2}\Big). \]
Consequently, we can also find $c_*$ and $\delta_*$ such that the conclusion holds, and we are done. □

We complete the paper with the proofs of Lemmas 7.12 and 7.13.

Proof of Lemma 7.12. It suffices to prove the lemma for the case $a = 1$. Since
\[ f_u(0) = u = u - W_u\beta^{\top}(0) + g(0), \]
and $W_u$ and $g$ are independent, $\beta(0) = 0$. Furthermore, since $C'(t) = O(t)$ and $\mu_2'(t) = O(t)$, there exists a $c_0$ such that $|\beta(t)| \le c_0|t|^2$. In addition, $W_u$ has density function
\[ \psi_u(w) \propto |\det(r^*(w) - u\Lambda)|\exp\Big(-\frac{1}{2}w^{\top}\mu_{2\cdot 0}^{-1}w\Big)\, 1\big(r^*(w) - u\Lambda \in \mathcal{N}\big). \]
Note that $\det(r^*(w) - u\Lambda)$ is expressible as a polynomial in $w$ and $u$, and there exist some $\varepsilon_0$ and $c$ such that
\[ \bigg|\frac{\det(r^*(w) - u\Lambda)}{\det(-u\Lambda)}\bigg| \le c \]
if $|w| \le \varepsilon_0 u$. Hence there exist $\varepsilon_2, c_2 > 0$ such that
\[ \psi_u(w) \le \bar{\psi}(w) := c_2\exp\Big(-\frac{1}{2}\varepsilon_2 w^{\top}\mu_{2\cdot 0}^{-1}w\Big) \]
for all $u \ge 1$. The right hand side here is proportional to a multivariate Gaussian density. Thus
\[ P(|W_u| > x) = \int_{|w| > x}\psi_u(w)\,dw \le \int_{|w| > x}\bar{\psi}(w)\,dw = c_3 P(|\bar{W}| > x), \]
where $\bar{W}$ is a multivariate Gaussian random vector with density function proportional to $\bar{\psi}$. Therefore, by choosing $\varepsilon_1$ and $c_1$ appropriately, we have
\[ P\Big(\sup_{|t| \le \delta/b}|W_u\beta^{\top}(t)| > \frac{1}{4b}\Big) \le P\Big(|W_u| > \frac{b}{4c_0\delta^2}\Big) \le c_1\exp\Big(-\frac{\varepsilon_1 b^2}{\delta^4}\Big) \]
for all $u \ge b$. □

for all u ge b 2

Proof of Lemma 713 Once again it suffices to prove the Lemma for the casea = 1 Since

fb(0) = b = bminusWbβgt(0) + g(0)

the covariance function (γ (s t) s t isin T ) of the centered field g satisfies γ(0 0) = 0It is also easy to check that

partsγ(s t) = O(|s|+ |t|) parttγ(s t) = O(|s|+ |t|)

40 ADLER BLANCHET LIU

Consequently there exists a constant cγ isin (0infin) for which

γ(s t) le cγ(|s|2 + |t|2) γ(s s) le cγ |s|2

We need to control the tail probability of sup|t|leδb |g(t)| For this it is useful tointroduce the following scaling Define

gδ (t) =b

δg

(δt

b

)

Then sup|t|leδb g (t) ge 14b if and only if sup|t|le1 gδ (t) ge 1

4δ Let

σδ (s t) = E (gδ (s) gδ (t))

ThensupsisinR

σδ(s s) le cγ

Because γ(s t) is at least twice differentiable applying a Taylor expansion we easilysee that the canonical metric dg corresponding to gδ (s) (cf Theorem 67) can bebounde as follows

d2g(s t) = E (gδ (s)minus gδ (t))2 =

b2

δ2

(δs

bδs

b

)+ γ

(δt

bδt

b

)minus 2γ

(δs

bδt

b

)]

le c |sminus t|2

for some constant c isin (0infin) Therefore the entropy of gδ evaluated at δ is boundedby Kδminusd for any δ gt 0 and with an appropriate choice of K gt 0 Therefore for allδ lt δ0

P ( sup|t|leδb

|g(t)| ge 14b

) = P ( sup|t|le1

gδ (t) ge 14δ

) le cdδminusdminusη exp(minus 1

16cγδ2)

for some constant cd and η gt 0 The last inequality is a direct application of The-orem 411 of [4] The conclusion of the lemma follows immediately by choosing c

and δ appropriately 2

REFERENCES

[1] R.J. Adler, K. Bartz, and S. Kou. Estimating curvatures in the Gaussian kinematic formula. In preparation.
[2] R.J. Adler, J. Blanchet, and J. Liu. Efficient simulation for tail probabilities of Gaussian random fields. In WSC '08: Proceedings of the 40th Conference on Winter Simulation, pages 328-336. Winter Simulation Conference, 2008.
[3] R.J. Adler, P. Müller, and B. Rozovskii, editors. Stochastic Modelling in Physical Oceanography. Birkhäuser, Boston, 1996.
[4] R.J. Adler and J.E. Taylor. Random Fields and Geometry. Springer, New York, USA, 2007.
[5] R.J. Adler, J.E. Taylor, and K. Worsley. Applications of Random Fields and Geometry: Foundations and Case Studies. http://webee.technion.ac.il/people/adler/hrf.pdf.
[6] S. Asmussen and P. Glynn. Stochastic Simulation: Algorithms and Analysis. Springer-Verlag, New York, NY, USA, 2008.
[7] J.M. Azaïs and M. Wschebor. Level Sets and Extrema of Random Processes and Fields. Wiley, New York, NY, 2009.
[8] J.M. Bardeen, J.R. Bond, N. Kaiser, and A.S. Szalay. The statistics of peaks of Gaussian random fields. The Astrophysical Journal, 304:15-61, 1986.
[9] S.M. Berman. Sojourns and Extremes of Stochastic Processes. The Wadsworth & Brooks/Cole Statistics/Probability Series. Wadsworth & Brooks/Cole Advanced Books & Software, Pacific Grove, CA, 1992.
[10] J. Blanchet. Efficient importance sampling for binary contingency tables. Ann. of Appl. Prob., 19:949-982, 2009.
[11] C. Borell. The Brunn-Minkowski inequality in Gauss space. Invent. Math., 30:205-216, 1975.
[12] J. Bucklew. Introduction to Rare-Event Simulation. Springer-Verlag, New York, NY, 2004.
[13] M.R. Dennis. Nodal densities of planar Gaussian random waves. Eur. Phys. J., 145:191-210, 2007.
[14] R.M. Dudley. Sample functions of the Gaussian process. Ann. Probab., 1:66-103, 1973.
[15] K.J. Friston, K.J. Worsley, R.S.J. Frackowiak, J.C. Mazziotta, and A.C. Evans. Assessing the significance of focal activations using their spatial extent. Human Brain Mapping, 1:214-220, 1994.
[16] M.R. Leadbetter, G. Lindgren, and H. Rootzén. Extremes and Related Properties of Random Sequences and Processes. Springer-Verlag, New York, 1983.
[17] M. Mitzenmacher and E. Upfal. Probability and Computing: Randomized Algorithms and Probabilistic Analysis. Cambridge University Press, 2005.
[18] W. Niemiro and P. Pokarowski. Fixed precision MCMC estimation by median of products of averages. J. Appl. Probab., 46:309-329, 2009.
[19] V. Piterbarg. Asymptotic Methods in the Theory of Gaussian Processes. American Mathematical Society, USA, 1995.
[20] S.F. Shandarin. Testing non-Gaussianity in cosmic microwave background maps by morphological statistics. Mon. Not. R. Astr. Soc., 331:865-874, 2002.
[21] S.F. Shandarin, H.A. Feldman, Y. Xu, and M. Tegmark. Morphological measures of non-Gaussianity in cosmic microwave background maps. Astrophys. J. Suppl., 141:1-11, 2002.
[22] L. Shepp and H. Landau. On the supremum of a Gaussian process. Sankhya A, 32:369-378, December 1970.
[23] J.E. Taylor, A. Takemura, and R.J. Adler. Validity of the expected Euler characteristic heuristic. Ann. Probab., 33(4):1362-1396, 2005.
[24] J.E. Taylor and K.J. Worsley. Detecting sparse signals in random fields, with an application to brain mapping. J. Amer. Statist. Assoc., 102(479):913-928, 2007.
[25] J. Traub, G. Wasilokowski, and H. Wozniakowski. Information-Based Complexity. Academic Press, New York, NY, 1988.
[26] B.S. Tsirelson, I.A. Ibragimov, and V.N. Sudakov. Norms of Gaussian sample functions. Proceedings of the Third Japan-USSR Symposium on Probability Theory (Tashkent, 1975), 550:20-41, 1976.
[27] V.S. Tsirel'son. The density of the maximum of a Gaussian process. Theory of Probab. Appl., 20:847-856, 1975.
[28] H. Wozniakowski. Computational complexity of continuous problems. Technical Report, 1996.

Robert J. Adler
Electrical Engineering
Technion, Haifa, Israel 32000
E-mail: robert@ee.technion.ac.il
URL: webee.technion.ac.il/people/adler

Jose H. Blanchet
Industrial Engineering & Operations Research
Columbia University
340 S.W. Mudd Building, 500 W. 120 Street
New York, New York 10027
E-mail: jose.blanchet@columbia.edu
URL: www.ieor.columbia.edu/fac-bios/blanchet/faculty.html

Jingchen Liu
Department of Statistics
Columbia University
1255 Amsterdam Avenue, Room 1030
New York, NY 10027
E-mail: jcliu@stat.columbia.edu
URL: stat.columbia.edu/~jcliu

Page 26: EFFICIENT MONTE CARLO FOR HIGH EXCURSIONS OF … › ~jblanche › papers › GRF1.pdf · 1. Introduction. This paper centers on the design and analysis of e–cient Monte Carlo techniques

26 ADLER BLANCHET LIU

for some C3(v) isin (0infin) Now recall the well known inequality (valid for x gt 0)that

φ (x)(

1xminus 1

x3

)le 1minus Φ(x) le φ (x)

x

where φ = Φprime is the standard Gaussian density Using this inequality it follows thatC4 (v) isin (0infin) can be chosen so that

maxtisinTi(θ)

exp

(minus (bminus micro (t))2

2σ (t)2

)le C4 (v) bmax

tisinTi

P (f (t) gt b)

for all b ge 1 We then conclude that there exists C (v) isin (0infin) such that

P

(maxtisinT

f (t) gt b

)le N (θ)C4 (v) bmax

tisinTP (f (t) gt b)

le CθminusdbmaxtisinT

P (f (t) gt b) = Cb2dβ+dv+1 maxtisinT

P (f (t) gt b)

giving the result 2

We can now complete the proof of Proposition 65

Proof of Proposition 65 The result is a straightforward corollary of the previoustwo lemmas By (615) in Lemma 613 and Lemma 614 there exists λ isin (0infin) forwhich

P

(maxtisinT

f(t) le b + ab | maxtisinT

f(t) gt b

)

le aAP (maxtisinT f (t) ge bminus 1b)

P (maxtisinT f (t) ge b)

le aCAb2dβ+dv+1 maxtisinT P (f (t) gt bminus 1b)P (maxtisinT f (t) ge b)

le aCAb2dβ+dv+1 maxtisinT P (f (t) gt bminus 1b)maxtisinT P (f (t) gt b)

le aλb2dβ+dv+1

The last two inequalities follow from the obvious bound

P

(maxtisinT

f (t) ge b

)ge max

tisinTP (f (t) gt b)

and standard properties of the Gaussian distribution This yields (612) from whichthe remainder of the proposition follows 2

MONTE CARLO FOR RANDOM FIELDS 27

7 Fine Tuning Twice Differentiable Homogeneous Fields In the pre-ceeding section we constructed a polynomial time algorithm based on a randomizeddiscretization scheme Our goal in this section is to illustrate how to take advantageof additional information to further improve the running time and the efficiency ofthe algorithm In order to illustrate our techniques we shall perform a more refinedanalysis in the setting of smooth and homogeneous fields and shall establish op-timality of the algorithm in a precise sense to described below Our assumptionsthroughout this section are B1ndashB2 of Section 3

Let C(sminus t) = Cov(f(s) f(t)) be the covariance function of f which we assumealso has mean zero Note that it is an immediate consequence of homogeneity anddifferentiability that partiC(0) = part3

ijkC(0) = 0We shall need the following definition

Definition 71 We call T = t1 tM sub T a θ-regular discretization of Tif and only if

mini 6=j

|ti minus tj | ge θ suptisinT

mini|ti minus t| le 2θ

Regularity ensures that points in the grid T are well separated Intuitively sincef is smooth having tight clusters of points translates to a waste of computingresources as a result of sampling highly correlated values of f Also note thatevery region containing a ball of radius 2θ has at least one representative in T Therefore T covers the domain T in an economical way One technical convenienceof θ-regularity is that for subsets A sube T that have positive Lebesgue measure (inparticular ellipsoids)

limMrarrinfin

(A cap T )M

=m(A)m(T )

where here and throughout the remainder of the section (A) denotes the cardi-nality of the set A

Let T = t1 tM be a θ-regular discretization of T and consider

X = (X1 XM )T ∆= (f (t1) f (tM ))T

We shall concentrate on estimating wM (b) = P (max1leileM Xi gt b) The next result(which we prove in Section 71) shows that if θ = εb then the relative bias is O (

radicε)

Proposition 72 Suppose f is a Gaussian random field satisfying conditionsB1 and B2 There exist c0 c1 b0 and ε0 such that for any finite εb-regulardiscretization T of T

(71) P (suptisinT

f(t) lt b| suptisinT

f(t) gt b) le c0

radicε and (T ) le c1m(T )εminusdbd

for all ε isin (0 ε0] and b gt b0

Note that the bound on the bias obtained for twice differentiable fields is muchsharper than that of the general Holder continuous fields given by (613) in Theorem

28 ADLER BLANCHET LIU

66 This is partly because the conditional distribution of the random field aroundlocal maxima is harder to describe in the Holder continuous than in the case oftwice differentiable fields In addition to the sharper description of the bias weshall also soon show in Theorem 74 that our choice of discretization is optimal in acetain sense Finally we point out that the bound of

radicε in the first term of (71) is

not optimal In fact there seems to be some room of improvement and we believethat a more careful analysis might yield a bound of the form c0ε

2We shall estimate wM (b) by using a slight variation of Algorithm 53 In partic-

ular since the Xirsquos are now identically distributed we redefine Q to be

(72) Q (X isin B) =Msum

i=1

1M

P [X isin B|Xi gt bminus 1b]

Our estimator then takes the form

(73) Lb =M times P (X1 gt bminus 1b)sumM

j=1 1 (Xj gt bminus 1b)1( max

1leileMXi gt b)

Clearly we have that EQ(Lb) = wM (b) (The reason for for subtracting the fcatorof 1b was explained in Section 61)

Algorithm 73 Given a number of replications n and an εb-regular dis-cretization T the algorithm is as follows

Step 1 Sample X(1) X(n) iid copies of X with distribution Q given by (72)Step 2 Compute and output

Ln =1n

nsum

i=1

L(i)b

where

L(i)b =

M times P (X1 gt bminus 1b)sumM

j=1 1(X(i)j gt bminus 1b)

1( max1leileM

X(i)j gt b)

Theorem 75 later guides the selection of n in order to achieve a prescribedrelative error In particular our analysis together with considerations from Section2 implies that choosing n = O

(εminus2δminus1

)suffices to achieve ε relative error with

probability at least 1minus δAlgorithm 73 improves on Algorithm 61 for Holder continuous fields in two

important ways The first aspect is that it is possible to obtain information on thesize of the relative bias of the estimator In Proposition 72 we saw that in orderto overcome bias due to discretization it suffices to take a discretization of sizeM = (T ) = Θ(bd) That this selection is also asymptotically optimal in the sensedescribed in the next result will be proven in Section 71

MONTE CARLO FOR RANDOM FIELDS 29

Theorem 74 Suppose f is a Gaussian random field satisfying conditions B1and B2 If θ isin (0 1) then as b rarrinfin

sup(T )lebθd

P (suptisinT

f(t) gt b| suptisinT

f(t) gt b) rarr 0

This result implies that the relative bias goes to 100 as b rarrinfin if one chooses adiscretization scheme of size O

(bθd

)with θ isin (0 1) Consequently d is the smallest

power of b that achieves any given bounded relative bias and so the suggestionabove of choosing M = O

(bd

)points for the discretization is in this sense optimal

The second aspect of improvement involves the variance In the case of Holdercontinuous fields the ratio of the second moment of the estimator and w (b)2 wasshown to be bounded by a quantity that is of order O(M2) In contrast in thecontext of smooth and homogeneous fields considered here the next result showsthat this ratio is bounded uniformly for b gt b0 and M = (T ) ge cbd That is thevariance remains strongly controlled

Theorem 75 Suppose f is a Gaussian random field satisfying conditions B1and B2 Then there exist constants c b0 and ε0 such that for any εb-regulardiscretization T of T we have

supbgtb0εisin[0ε0]

EQL2b

P 2 (suptisinT f (t) gt b)le sup

bgtb0εisin[0ε0]

EQL2b

P 2(sup

tisinTf (t) gt b

) le c

for some c isin (0infin)

The proof of this result is given in Section 72 The fact that the number ofreplications remains bounded in b is a consequence of the strong control on thevariance

Finally we note that the proof of Theorem 33 follows as a direct corollary of The-orem 75 together with Proposition 72 and our discussion in Section 2 Assumingthat placing each point in T takes no more than c units of computer time the totalcomplexity is according to the discussion in section Section 2 O

(nM3 + M

)=

O(εminus2δminus1M3 + M

) The contribution of the term M3 = O

(εminus6b3

)comes from

the complexity of applying Cholesky factorization and the term M = O(εminus2b)corresponds to the complexity of placing T

Remark 76 Condition B2 imposes a convexity assumption on the boundaryof T This assumption although convenient in the development of the proofs of The-orems 74 and 75 is not necessary The results can be generalized at the expenseof increasing the length and the burden in the technical development to the casein which T is a d-dimensional manifold satisfying the so-called Whitney conditions([4])

The remainder of this section is devoted to the proof of Proposition 72 Theorem74 and Theorem 75

30 ADLER BLANCHET LIU

71 Bias Control Proofs of Proposition 72 and Theorem 74 We start withsome useful lemmas for all of which we assume that Conditions B1 and B2 are sat-isfied We shall also assume that the global maximum of f over T is achieved withprobability one at a single point in T Additional conditions under which this willhappen can be found in [4] and require little more than the non-degeneracy of thejoint distribution of f and its first and second order derivatives Of these lemmasLemma 77 the proof of which we defer to Section 73 is of central importance

Lemma 77 For any a0 gt 0 there exists clowast δlowast b0 and δ0 (which depend onthe choice of a0) such that for any s isin T a isin (0 a0) δ isin (0 δ0) and b gt b0

P ( min|tminuss|ltδabminus1

f(t) lt b|f(s) = b + ab s isin L) le clowast exp(minusδlowast

δ2

)

where L is the set of local maxima of f

The next lemma is a direct corollary of Lemma 77

Lemma 78 Let tlowast be the point in T at which the global maximum of f isattained Then with the same choice of constants as in Lemma 77 for any a isin(0 a0) δ isin (0 δ0) and b gt b0 we have

P ( min|tminustlowast|ltδabminus1

f(t) lt b|f(tlowast) = b + ab) le 2clowast exp(minusδlowast

δ2

)

Proof Note that for any s isin T

P

(min

|t|leδabf(t) lt b|f(s) = b + ab s = tlowast

)

=P

(min|t|leδab f(t) lt b f(s) = b + ab s = tlowast

)

P (f(s) = b + ab s = tlowast)

le P(min|t|leδab f(t) lt b f(s) = b + ab s isin L)

P (f(s) = b + ab s = tlowast)

= P ( min|tminuss|ltδabminus1

f(t) lt b|f(s) = b + ab s isin L)P (f(s) = b + ab s isin L)P (f(s) = b + ab s = tlowast)

le clowast exp(minusδlowast

δ2

)P (f(s) = b + ab s isin L)P (f(s) = b + ab s = tlowast)

Appling this bound and the fact that

P (f(s) = b + ab s isin L)P (f(s) = b + ab s = tlowast)

rarr 1

as b rarrinfin the conclusion of the Lemma holds 2

MONTE CARLO FOR RANDOM FIELDS 31

The following lemma gives a bound on the density of suptisinT f(t) which will beused to control the size of overshoot beyond level b

Lemma 79 Let pflowast(x) be the density function of suptisinT f(t) Then there existsa constant cflowast and b0 such that

pflowast(x) le cflowastxd+1P (f(0) gt x)

for all x gt b0

Proof Recalling (13) let the continuous function pE(x) x isin R be defined bythe relationship

E(χ(t isin T f(t) ge b)) =int infin

b

pE(x) dx

where the left hand side is the expected value of the Euler-Poincare characteristicof Ab Then according to Theorem 810 in [7] there exists c and δ such that

|pE(x)minus pflowast(x)| lt cP (f(0) gt (1 + δ)x)

for all x gt 0 In addition thanks to the result of [4] which providesintinfin

bpE(x)dx in

closed form there exists c0 such that for all x gt 1

pE(x) lt c0xd+1P (f(0) gt x)

Hence there exists cflowast such that

pflowast(x) le c0xd+1P (f(0) gt x) + cP (f(0) gt (1 + δ)x) le cflowastx

d+1P (f(0) gt x)

for all x gt 1 2

The last ingredients required to provide the proof of Proposition 72 and Theorem74 are stated in the following well known Theorem adapted from Lemma 61 andTheorem 72 in [19] to the twice differentiable case

Theorem 710 There exists a constant H (depending on the covariance func-tion C such that

P (suptisinT

f(t) gt b) = (1 + o (1))Hm(T )bdP (f(0) gt b)(74)

as b rarrinfinSimilarly choose δ small enough so that [0 δ]d sub T and let ∆0 = [0 bminus1]d Then

there exists a constant H1 such that

P ( suptisin∆0

f(t) gt b) = (1 + o(1))H1P (f(0) gt b)(75)

as b rarrinfin

32 ADLER BLANCHET LIU

We now are ready to provide the proof of Proposition 72 and Theorem 74

Proof of Proposition 72 The fact that there exists c1 such that

(T ) le c1m(T )εminusdbd

is immediate from assumption B2 Therefore we proceed to provide a bound forthe relative bias We choose ε0 lt δ2

0 (δ0 is given in Lemmas 77 and 78) and notethat for any ε lt ε0

P (suptisinT

f(t) lt b| suptisinT

f(t) gt b)

le P (suptisinT

f(t) lt b + 2radic

εb| suptisinT

f(t) gt b) + P (suptisinT

f(t) lt b| suptisinT

f(t) gt b + 2radic

εb)

By (74) and Lemma 79 there exists c2 such that the first term above can boundedby

P (suptisinT

f(t) lt b + 2radic

εb| suptisinT

f(t) gt b) le c2

radicε

The second term can be bounded by

P (suptisinT

f(t) lt b| suptisinT

f(t) gt b + 2radic

εb)

le P ( sup|tminustlowast|lt2εbminus1

f(t) lt b| suptisinT

f(t) gt b + 2radic

ε)

le 2clowast exp(minusδlowastεminus1

)

Hence there exists c0 such that

P (suptisinT

f(t) lt b| suptisinT

f(t) gt b) le c2

radicε + 2clowasteminusδlowastε le c0

radicε

for all ε isin (0 ε0) 2

Proof of Theorem 74 We write θ = 1minus 3δ isin (0 1) First note that by (74)

P (supT

f(t) gt b + b2δminus1| supT

f(t) gt b) rarr 0

as b rarrinfin Let tlowast be the position of the global maximum of f in T According to theexact Slepian model in Section 73 and an argument similar to the proof of Lemma78

(76) P ( sup|tminustlowast|gtb2δminus1

f(t) gt b|b lt f(tlowast) le b + b2δminus1) rarr 0

as b rarrinfin Consequently

P ( sup|tminustlowast|gtb2δminus1

f(t) gt b| supT

f(t) gt b) rarr 0

MONTE CARLO FOR RANDOM FIELDS 33

LetB(T b2δminus1) = cup

tisinTB(t b2δminus1)

We have

P (supT

f(t) gt b| supT

f(t) gt b)

le P ( sup|tminustlowast|gtb2δminus1

f(t) gt b| supT

f(t) gt b)

+ P (supT

f(t) gt b sup|tminustlowast|gtb2δminus1

f(t) le b| supT

f(t) gt b)

le o(1) + P (tlowast isin B(T b2δminus1)| supT

f(t) gt b)

le o(1) + P ( supB(T b2δminus1)

f(t) gt b| supT

f(t) gt b)

Since (T ) le b(1minus3δ)d we can find a finite set T prime = tprime1 tprimel sub T and let ∆k =tprimek + [0 bminus1] such that l = O(b(1minusδ)d) and B(T b2δminus1) sub cupl

k=1∆k The choice ofl only depends on (T ) not the particular distribution of T Therefore applying(75)

sup(Tlebθd)

P ( supB(T b2δminus1)

f(t) gt b) le O(b(1minusδ)d)P (f(0) gt b)

This together with (74) yields

sup(Tlebθd)

P ( supB(T b2δminus1)

f(t) gt b| supT

f(t) gt b) le O(bminusδd) = o(1)

for b ge b0 which clearly implies the statement of the result 2

72 Variance Control Proof of Theorem 75 We proceed directly to the proofof Theorem 75

34 ADLER BLANCHET LIU

Proof of Theorem 75 Note that

EQL2b

P 2 (suptisinT f (t) gt b)

=E(Lb)

P 2 (suptisinT f (t) gt b)

=E

(MtimesP (X1gtbminus1b)sumn

j=11(Xjgtbminus 1

b )maxj Xj gt b

)

P 2 (suptisinT f (t) gt b)

=E

(MtimesP (X1gtbminus1b)sumn

j=11(Xjgtbminus 1

b )maxj Xj gt b suptisinT f (t) gt b

)

P 2 (suptisinT f (t) gt b)

=E

(MP (X1gtbminus1b)1(maxj Xjgtb)sumn

j=11(Xjgtbminus1b)

∣∣∣∣ suptisinT f (t) gt b

)

P (suptisinT f (t) gt b)

= E

(M1 (maxj Xj gt b)sumnj=1 1 (xj gt bminus 1b)

∣∣∣∣∣ suptisinT

f (t) gt b

)P (X1 gt bminus 1b)

P (suptisinT f (t) gt b)

The remainder of the proof involves showing that the last conditional expectationhere is of order O(bd) This together with (74) will yield the result Note that forany A (b ε) such that A (b ε) = Θ(M) uniformly over b and ε we can write

E

(M times 1 (maxj Xj gt b)sumM

j=1 1(Xj gt bminus 1

b

)∣∣∣∣∣ sup

tisinTf (t) gt b

)(77)

le E

M times 1

(sumMj=1 1 (Xj gt bminus 1b) ge M

A(bε)

)

sumMj=1 1

(xj gt bminus 1

b

)∣∣∣∣∣∣suptisinT

f (t) gt b

+ E

M times 1

(1 le sumM

j=1 1 (Xj gt bminus 1b) lt MA(bε)

)

sumMj=1 1

(xj gt bminus 1

b

)∣∣∣∣∣∣suptisinT

f (t) gt b

We shall select A(b ε) appropriately in order to bound the expectations above ByLemma 78 for any 4ε le δ le δ0 there exist constants cprime and cprimeprime isin (0infin) such that

clowast exp(minusδlowast

δ2

)ge P

(min

|tminustlowast|leδbf (t) lt bminus 1b

∣∣∣∣ suptisinT

f (t) gt b

)

ge P

Msum

j=1

1 (Xj gt bminus 1b) le cprimeδdεd

∣∣∣∣∣∣suptisinT

f (t) gt b

ge P

(MsumM

j=1 1(Xj gt bminus 1b)ge bdcprimeprime

δd

∣∣∣∣∣ suptisinT

f (t) gt b

)(78)

MONTE CARLO FOR RANDOM FIELDS 35

The first inequality is an application of Lemma 78 The second inequality is dueto the fact that for any ball B of radius 4ε or larger (T cap B) ge cprimedεminusd for somecprime gt 0 Inequality (78) implies that for all x such that bdcprimeprimeδd

0 lt x lt bdcprimeprime[4dεd]there exists δlowastlowast gt 0 such that

P

(MsumM

j=1 1(Xj gt bminus 1

b

)bdge x

∣∣∣∣∣ suptisinT

f (t) gt b

)le clowast exp

(minusδlowastlowastx2d

)

Now let A (b ε) = bdcprimeprime(4dεd) and observe that by the second result in (71) wehave A (b ε) = Θ (M) and moreover that there exists c3 such that the first termon the right hand side of (77) is bounded by

(79) E

M times 1

(sumMj=1 1 (Xj gt bminus 1b) ge M

A(bε)

)

sumMj=1 1 (Xj gt bminus 1b)

∣∣∣∣∣∣supT

f (t) gt b

le c3b

d

Now we turn to the second term on the right hand side of (77) We use the factthat MA(b ε) le cprimeprimeprime isin (0infin) (uniformly as b rarrinfin and ε rarr 0) There exist c4 andc5 such that for ε le δ0c4

E

M1

(1 le sumM

j=1 1(Xj gt bminus 1

b

)lt M

A(bε)

)

sumMj=1 1

(Xj gt bminus 1

b

)∣∣∣∣∣∣supT

f (t) gt b

(710)

le MP

nsum

j=1

1(

Xj gt bminus 1b

)lt cprimeprimeprime

∣∣∣∣∣∣supT

f (t) gt b

le MP

(min

|tminustlowast|lec4εbf (t) lt bminus 1b

∣∣∣∣ supT

f (t) gt b

)

le c1m(T )bdεminusd exp(minus δlowast

c24ε

2

)le c5b

d

The second inequality holds from the fact that ifsumM

j=1 1(Xj gt bminus 1

b

)is less than

cprimeprimeprime then the minimum of f in a ball around the local maximum and of radius c4εb

must be less than bminus 1b Otherwise there are more than cprimeprimeprime elements of T insidesuch a ball The last inequality is due to Lemma 78 and Theorem 72

Putting (79) and (710) together we obtain for all εb -regular discretizationswith ε lt ε0 = min(14 1c4)δ0

EQL2b

P (suptisinT f (t) gt b)le E

(M1 (maxj Xj gt b)sumM

j=1 1(Xj gt bminus 1

b

)∣∣∣∣∣ sup

tisinTf (t) gt b

)P (X1 gt bminus 1b)

P (suptisinT f (t) gt b)

le (c3 + c5)bdP (X1 gt bminus 1b)P (suptisinT f (t) gt b)

Applying now (74) and Proposition 72 we have that

P (suptisinT

f(t) gt b) ltP (sup

tisinTf(t) gt b)

1minus c0radic

ε

36 ADLER BLANCHET LIU

we have

supbgtb0εisin[0ε0]

EQL2b

P(sup

tisinTf (t) gt b

) lt infin

as required 2

73 A final proof Lemma 77 Without loss of generality we assume that thefield has mean zero and unit variance Before getting into the details of the proofof Lemma 77 we need a few additional lemmas for which we adopt the followingnotation Let Ci and Cij be the first and second order derivatives of C and definethe vectors

micro1(t) = (minusC1(t) minusCd(t))micro2(t) = vech ((Cij(t) i = 1 d j = i d))

Let f prime(0) and f primeprime(0) be the gradient and vector of second order derivatives of f at 0where f primeprime(0) is arranged in the same order as micro2(0) Furthermore let micro02 = microgt20 bea vector of second order spectral moments and micro22 a matrix of fourth order spectralmoments The vectors micro02 and micro22 are arranged so that

1 0 micro02

0 Λ 0micro20 0 micro22

is the covariance matrix of (f(0) f prime(0) f primeprime(0)) where Λ = (minusCij(0)) It then followsthat

micro2middot0 = micro22 minus micro20micro02

be the conditional variance of f primeprime(0) given f(0) The following lemma given in [5]provides a stochastic representation of the f given that it has a local maxima atlevel u at the origin

Lemma 711 Given that f has a local maximum with height u at zero (aninterior point of T ) the conditional field is equal in distribution to

(711) fu(t) uC (t)minusWuβgt (t) + g (t)

g(t) is a centered Gaussian random field with covariance function

γ(s t) = C(sminus t)minus (C(s) micro2(s))(

1 micro02

micro20 micro22

)minus1 (C(t)microgt2 (t)

)minus micro1(s)Λminus1microgt1 (t)

and Wu is a d(d+1)2 random vector independent of g(t) with density function

(712) ψu (w) prop |det (rlowast(w)minus uΛ) | exp(minus1

2wgtmicrominus1

2middot0w)1 (rlowast(w)minus uΛ isin N )

MONTE CARLO FOR RANDOM FIELDS 37

where rlowast(w) is a dtimesd symmetric matrix whose upper triangular elements consist ofthe components of w The set of negative definite matrices is denoted by N Finallyβ(t) is defined by

(α (t) β (t)) = (C (t) micro2 (t))(

1 micro02

micro20 micro22

)minus1

The following two technical lemmas which we shall prove after completing theproof of Lemma 77 provide bounds for the last two terms of (711)

Lemma 712 Using the notation in (711) there exist δ0 ε1 c1 and b0 suchthat for any u gt b gt b0 and δ isin (0 δ0)

P

(sup

|t|leδab

∣∣Wuβgt (t)∣∣ gt

a

4b

)le c1 exp

(minusε1b

2

δ4

)

Lemma 713 There exist c δ and δ0 such that for any δ isin (0 δ0)

P

(max|t|leδab

|g(t)| gt a

4b

)le c exp

(minus δ

δ2

)

Proof of Lemma 77 Using the notation of Lemma 711 given any s isin L forwhich f(s) = b we have that the corresponding conditional distribution of s is thatof fb(middot minus s) Consequently it suffices to show that

P

(min

|t|leδabfb(t) lt bminus ab

)le clowast exp

(minusδlowast

δ2

)

We consider firstly the case for which the local maximum is in the interior of T Then by the Slepian model (711)

fb (t) = bC (t)minusWbβgt (t) + g (t)

We study the three terms of the Slepian model individually Since

C(t) = 1minus tgtΛt + o(|t|2

)

there exists a ε0 such thatbC (t) ge bminus a

4b

38 ADLER BLANCHET LIU

for all |t| lt ε0radic

ab According to Lemmas 712 and 713 for δ lt min(ε0radic

a0 δ0)

P

(min

|t|leδabfu(t) lt bminus ab

)

le P

(max|t|ltδab

|g (t)| gt a

4b

)+ P

(sup

|t|leδab

∣∣Wbβgt(t)

∣∣ gta

4b

)

le c exp

δ2

)+ c1 exp

(minusε1b

2

δ4

)

le clowast exp(minusδlowast

δ2

)

for some clowast and δlowastNow consider the case for which the local maximum is in the (dminus1)-dimensional

boundary of T Due to convexity of T and without loss of generality we assumethat the tangent space of partT is generated by partpartt2 partparttd the local maximumis located at the origin and T is a subset of the positive half-plane t1 ge 0 Thatthese arguments to not involve a loss of generality follows from the arguments onpages 291ndash192 of [4] which rely on the assumed stationarity of f (for translations)and the fact that rotations while changing the distributions will not affect theprobabilities that we are currently computing)

For the origin positioned as just described to be a local maximum it is necessaryand sufficient that the gradient of f restricted to partT is the zero vector the Hessianmatrix restricted to partT is negative definite and part1f(0) le 0 Applying a version ofLemma 711 for this case conditional on f(0) = u and 0 being a local maximumthe field is equal in distribution to

(713) uC(t)minus Wuβgt(t) + micro1(t)Λminus1 (Z 0 0)T + g(t)

where Z le 0 corresponds to part1f (0) and it follows a truncated (conditional on thenegative axis) Gaussian random variable with mean zero and a variance parameterwhich is computed as the conditional variance of part1f (0) given (part2f (0) partdf (0))The vector Wu is a (d(d + 1)2)-dimensional random vector with density function

ψu(w) prop |det(rlowast(w)minus uΛ)| exp(minus12wgtmicrominus1

2middot0w)1(wlowast minus uΛ isin N )

where Λ is the second spectral moment of f restricted to partT and rlowast(w) is the(dminus 1)times (dminus 1) symmetric matrix whose upper triangular elements consist of thecomponents of w In the representation (713) the vector Wu and Z are independentAs in the proof of Lemma 712 one can show that a similar bound holds albeitwith with different constants Thus since micro1 (t) = O (t) there exist cprimeprime and δprimeprime suchthat the third term in (713) can be bounded by

P [ max|t|leδab

∣∣∣micro1 (t) Λminus1 (Z 0 0)T∣∣∣ ge a(4b)] le cprimeprime exp

(minusδprimeprime

δ2

)

MONTE CARLO FOR RANDOM FIELDS 39

Consequently we can also find clowast and δlowast such that the conclusion holds and weare done 2

We complete the paper with the proofs of Lemmas 712 and 713

Proof of Lemma 712 It suffices to prove the Lemma for the case a = 1 Since

fu (0) = u = uminusWuβgt (0) + g (0)

and Wu and g are independent β (0) = 0 Furthermore since C prime (t) = O (t) andmicroprime2 (t) = O (t) there exists a c0 such that |β(t)| le c0|t|2 In addition Wu has densityfunction proportional to

ψu(w) prop |det(rlowast(w)minus uΛ)| exp(minus12wgtmicrominus1

2middot0w)1 (wlowast minus uΛ isin N )

Note that det(rlowast(w) minus uΛ) is expressible as a polynomial in w and u and thereexists some ε0 and c such that

∣∣∣∣det (rlowast(w)minus uΛ)

det (minusuΛ)

∣∣∣∣ le c

if |w| le ε0u Hence there exist ε2 c2 gt 0 such that

ψu (w) le ψ (w) = c2 exp(minus1

2ε2w

gtmicrominus12middot0w

)

for all u ge 1 The right hand side here is proportional to a multivariate Gaussiandensity Thus

P (|Wu| gt x) =int

|w|gtx

ψu(w)dw leint

|w|gtx

ψu (w) dw = c3P (|W | gt x)

where W is a multivariate Gaussian random variable with density function propor-tional to ψ Therefore by choosing ε1 and c1 appropriately we have

P

(sup|t|leδb

∣∣Wuβgt∣∣ gt

14b

)le P

(|Wu| gt b

c20δ

2

)le c1 exp

(minusε1b

2

δ4

)

for all u ge b 2

Proof of Lemma 713 Once again it suffices to prove the Lemma for the casea = 1 Since

fb(0) = b = bminusWbβgt(0) + g(0)

the covariance function (γ (s t) s t isin T ) of the centered field g satisfies γ(0 0) = 0It is also easy to check that

partsγ(s t) = O(|s|+ |t|) parttγ(s t) = O(|s|+ |t|)

40 ADLER BLANCHET LIU

Consequently there exists a constant cγ isin (0infin) for which

γ(s t) le cγ(|s|2 + |t|2) γ(s s) le cγ |s|2

We need to control the tail probability of sup|t|leδb |g(t)| For this it is useful tointroduce the following scaling Define

gδ (t) =b

δg

(δt

b

)

Then sup|t|leδb g (t) ge 14b if and only if sup|t|le1 gδ (t) ge 1

4δ Let

σδ (s t) = E (gδ (s) gδ (t))

ThensupsisinR

σδ(s s) le cγ

Because γ(s t) is at least twice differentiable applying a Taylor expansion we easilysee that the canonical metric dg corresponding to gδ (s) (cf Theorem 67) can bebounde as follows

d2g(s t) = E (gδ (s)minus gδ (t))2 =

b2

δ2

(δs

bδs

b

)+ γ

(δt

bδt

b

)minus 2γ

(δs

bδt

b

)]

le c |sminus t|2

for some constant c isin (0infin) Therefore the entropy of gδ evaluated at δ is boundedby Kδminusd for any δ gt 0 and with an appropriate choice of K gt 0 Therefore for allδ lt δ0

P ( sup|t|leδb

|g(t)| ge 14b

) = P ( sup|t|le1

gδ (t) ge 14δ

) le cdδminusdminusη exp(minus 1

16cγδ2)

for some constant cd and η gt 0 The last inequality is a direct application of The-orem 411 of [4] The conclusion of the lemma follows immediately by choosing c

and δ appropriately 2

REFERENCES

[1] R.J. Adler, K. Bartz and S. Kou. Estimating curvatures in the Gaussian kinematic formula. In preparation.
[2] R.J. Adler, J. Blanchet and J. Liu. Efficient simulation for tail probabilities of Gaussian random fields. In WSC '08: Proceedings of the 40th Conference on Winter Simulation, pages 328–336. Winter Simulation Conference, 2008.
[3] R.J. Adler, P. Müller and B. Rozovskii, editors. Stochastic Modelling in Physical Oceanography. Birkhäuser, Boston, 1996.
[4] R.J. Adler and J.E. Taylor. Random Fields and Geometry. Springer, New York, USA, 2007.
[5] R.J. Adler, J.E. Taylor and K. Worsley. Applications of Random Fields and Geometry: Foundations and Case Studies. http://webee.technion.ac.il/people/adler/hrf.pdf.
[6] S. Asmussen and P. Glynn. Stochastic Simulation: Algorithms and Analysis. Springer-Verlag, New York, NY, USA, 2008.
[7] J.M. Azaïs and M. Wschebor. Level Sets and Extrema of Random Processes and Fields. Wiley, New York, NY, 2009.
[8] J.M. Bardeen, J.R. Bond, N. Kaiser and A.S. Szalay. The statistics of peaks of Gaussian random fields. The Astrophysical Journal, 304:15–61, 1986.
[9] S.M. Berman. Sojourns and Extremes of Stochastic Processes. The Wadsworth & Brooks/Cole Statistics/Probability Series. Wadsworth & Brooks/Cole Advanced Books & Software, Pacific Grove, CA, 1992.
[10] J. Blanchet. Efficient importance sampling for binary contingency tables. Ann. of Appl. Prob., 19:949–982, 2009.
[11] C. Borell. The Brunn-Minkowski inequality in Gauss space. Invent. Math., 30:205–216, 1975.
[12] J. Bucklew. Introduction to Rare-Event Simulation. Springer-Verlag, New York, NY, 2004.
[13] M.R. Dennis. Nodal densities of planar Gaussian random waves. Eur. Phys. J., 145:191–210, 2007.
[14] R.M. Dudley. Sample functions of the Gaussian process. Ann. Probab., 1:66–103, 1973.
[15] K.J. Friston, K.J. Worsley, R.S.J. Frackowiak, J.C. Mazziotta and A.C. Evans. Assessing the significance of focal activations using their spatial extent. Human Brain Mapping, 1:214–220, 1994.
[16] M.R. Leadbetter, G. Lindgren and H. Rootzén. Extremes and Related Properties of Random Sequences and Processes. Springer-Verlag, New York, 1983.
[17] M. Mitzenmacher and E. Upfal. Probability and Computing: Randomized Algorithms and Probabilistic Analysis. Cambridge University Press, 2005.
[18] W. Niemiro and P. Pokarowski. Fixed precision MCMC estimation by median of products of averages. J. Appl. Probab., 46:309–329, 2009.
[19] V. Piterbarg. Asymptotic Methods in the Theory of Gaussian Processes. American Mathematical Society, USA, 1995.
[20] S.F. Shandarin. Testing non-Gaussianity in cosmic microwave background maps by morphological statistics. Mon. Not. R. Astr. Soc., 331:865–874, 2002.
[21] S.F. Shandarin, H.A. Feldman, Y. Xu and M. Tegmark. Morphological measures of non-Gaussianity in cosmic microwave background maps. Astrophys. J. Suppl., 141:1–11, 2002.
[22] L. Shepp and H. Landau. On the supremum of a Gaussian process. Sankhya A, 32:369–378, December 1970.
[23] J.E. Taylor, A. Takemura and R.J. Adler. Validity of the expected Euler characteristic heuristic. Ann. Probab., 33(4):1362–1396, 2005.
[24] J.E. Taylor and K.J. Worsley. Detecting sparse signals in random fields, with an application to brain mapping. J. Amer. Statist. Assoc., 102(479):913–928, 2007.
[25] J. Traub, G. Wasilkowski and H. Wozniakowski. Information-Based Complexity. Academic Press, New York, NY, 1988.
[26] B.S. Tsirelson, I.A. Ibragimov and V.N. Sudakov. Norms of Gaussian sample functions. Proceedings of the Third Japan-USSR Symposium on Probability Theory (Tashkent, 1975), 550:20–41, 1976.
[27] V.S. Tsirel'son. The density of the maximum of a Gaussian process. Theory of Probab. Appl., 20:847–856, 1975.
[28] H. Wozniakowski. Computational complexity of continuous problems. Technical Report, 1996.

Robert J. Adler
Electrical Engineering
Technion, Haifa, Israel 32000
E-mail: robert@ee.technion.ac.il
URL: webee.technion.ac.il/people/adler

Jose H. Blanchet
Industrial Engineering & Operations Research
Columbia University
340 S.W. Mudd Building
500 W. 120 Street
New York, New York 10027
E-mail: jose.blanchet@columbia.edu
URL: www.ieor.columbia.edu/fac-bios/blanchet/faculty.html

Jingchen Liu
Department of Statistics
Columbia University
1255 Amsterdam Avenue, Room 1030
New York, NY 10027
E-mail: jcliu@stat.columbia.edu
URL: stat.columbia.edu/~jcliu


Page 28: EFFICIENT MONTE CARLO FOR HIGH EXCURSIONS OF … › ~jblanche › papers › GRF1.pdf · 1. Introduction. This paper centers on the design and analysis of e–cient Monte Carlo techniques

28 ADLER BLANCHET LIU

66 This is partly because the conditional distribution of the random field aroundlocal maxima is harder to describe in the Holder continuous than in the case oftwice differentiable fields In addition to the sharper description of the bias weshall also soon show in Theorem 74 that our choice of discretization is optimal in acetain sense Finally we point out that the bound of

radicε in the first term of (71) is

not optimal In fact there seems to be some room of improvement and we believethat a more careful analysis might yield a bound of the form c0ε

2We shall estimate wM (b) by using a slight variation of Algorithm 53 In partic-

ular since the Xirsquos are now identically distributed we redefine Q to be

(72) Q (X isin B) =Msum

i=1

1M

P [X isin B|Xi gt bminus 1b]

Our estimator then takes the form

(73) Lb =M times P (X1 gt bminus 1b)sumM

j=1 1 (Xj gt bminus 1b)1( max

1leileMXi gt b)

Clearly we have that EQ(Lb) = wM (b) (The reason for for subtracting the fcatorof 1b was explained in Section 61)

Algorithm 73 Given a number of replications n and an εb-regular dis-cretization T the algorithm is as follows

Step 1 Sample X(1) X(n) iid copies of X with distribution Q given by (72)Step 2 Compute and output

Ln =1n

nsum

i=1

L(i)b

where

L(i)b =

M times P (X1 gt bminus 1b)sumM

j=1 1(X(i)j gt bminus 1b)

1( max1leileM

X(i)j gt b)

Theorem 75 later guides the selection of n in order to achieve a prescribedrelative error In particular our analysis together with considerations from Section2 implies that choosing n = O

(εminus2δminus1

)suffices to achieve ε relative error with

probability at least 1minus δAlgorithm 73 improves on Algorithm 61 for Holder continuous fields in two

important ways The first aspect is that it is possible to obtain information on thesize of the relative bias of the estimator In Proposition 72 we saw that in orderto overcome bias due to discretization it suffices to take a discretization of sizeM = (T ) = Θ(bd) That this selection is also asymptotically optimal in the sensedescribed in the next result will be proven in Section 71

MONTE CARLO FOR RANDOM FIELDS 29

Theorem 74 Suppose f is a Gaussian random field satisfying conditions B1and B2 If θ isin (0 1) then as b rarrinfin

sup(T )lebθd

P (suptisinT

f(t) gt b| suptisinT

f(t) gt b) rarr 0

This result implies that the relative bias goes to 100 as b rarrinfin if one chooses adiscretization scheme of size O

(bθd

)with θ isin (0 1) Consequently d is the smallest

power of b that achieves any given bounded relative bias and so the suggestionabove of choosing M = O

(bd

)points for the discretization is in this sense optimal

The second aspect of improvement involves the variance In the case of Holdercontinuous fields the ratio of the second moment of the estimator and w (b)2 wasshown to be bounded by a quantity that is of order O(M2) In contrast in thecontext of smooth and homogeneous fields considered here the next result showsthat this ratio is bounded uniformly for b gt b0 and M = (T ) ge cbd That is thevariance remains strongly controlled

Theorem 75 Suppose f is a Gaussian random field satisfying conditions B1and B2 Then there exist constants c b0 and ε0 such that for any εb-regulardiscretization T of T we have

supbgtb0εisin[0ε0]

EQL2b

P 2 (suptisinT f (t) gt b)le sup

bgtb0εisin[0ε0]

EQL2b

P 2(sup

tisinTf (t) gt b

) le c

for some c isin (0infin)

The proof of this result is given in Section 72 The fact that the number ofreplications remains bounded in b is a consequence of the strong control on thevariance

Finally we note that the proof of Theorem 33 follows as a direct corollary of The-orem 75 together with Proposition 72 and our discussion in Section 2 Assumingthat placing each point in T takes no more than c units of computer time the totalcomplexity is according to the discussion in section Section 2 O

(nM3 + M

)=

O(εminus2δminus1M3 + M

) The contribution of the term M3 = O

(εminus6b3

)comes from

the complexity of applying Cholesky factorization and the term M = O(εminus2b)corresponds to the complexity of placing T

Remark 76 Condition B2 imposes a convexity assumption on the boundaryof T This assumption although convenient in the development of the proofs of The-orems 74 and 75 is not necessary The results can be generalized at the expenseof increasing the length and the burden in the technical development to the casein which T is a d-dimensional manifold satisfying the so-called Whitney conditions([4])

The remainder of this section is devoted to the proof of Proposition 72 Theorem74 and Theorem 75

30 ADLER BLANCHET LIU

71 Bias Control Proofs of Proposition 72 and Theorem 74 We start withsome useful lemmas for all of which we assume that Conditions B1 and B2 are sat-isfied We shall also assume that the global maximum of f over T is achieved withprobability one at a single point in T Additional conditions under which this willhappen can be found in [4] and require little more than the non-degeneracy of thejoint distribution of f and its first and second order derivatives Of these lemmasLemma 77 the proof of which we defer to Section 73 is of central importance

Lemma 77 For any a0 gt 0 there exists clowast δlowast b0 and δ0 (which depend onthe choice of a0) such that for any s isin T a isin (0 a0) δ isin (0 δ0) and b gt b0

P ( min|tminuss|ltδabminus1

f(t) lt b|f(s) = b + ab s isin L) le clowast exp(minusδlowast

δ2

)

where L is the set of local maxima of f

The next lemma is a direct corollary of Lemma 77

Lemma 78 Let tlowast be the point in T at which the global maximum of f isattained Then with the same choice of constants as in Lemma 77 for any a isin(0 a0) δ isin (0 δ0) and b gt b0 we have

P ( min|tminustlowast|ltδabminus1

f(t) lt b|f(tlowast) = b + ab) le 2clowast exp(minusδlowast

δ2

)

Proof Note that for any s isin T

P

(min

|t|leδabf(t) lt b|f(s) = b + ab s = tlowast

)

=P

(min|t|leδab f(t) lt b f(s) = b + ab s = tlowast

)

P (f(s) = b + ab s = tlowast)

le P(min|t|leδab f(t) lt b f(s) = b + ab s isin L)

P (f(s) = b + ab s = tlowast)

= P ( min|tminuss|ltδabminus1

f(t) lt b|f(s) = b + ab s isin L)P (f(s) = b + ab s isin L)P (f(s) = b + ab s = tlowast)

le clowast exp(minusδlowast

δ2

)P (f(s) = b + ab s isin L)P (f(s) = b + ab s = tlowast)

Appling this bound and the fact that

P (f(s) = b + ab s isin L)P (f(s) = b + ab s = tlowast)

rarr 1

as b rarrinfin the conclusion of the Lemma holds 2

MONTE CARLO FOR RANDOM FIELDS 31

The following lemma gives a bound on the density of suptisinT f(t) which will beused to control the size of overshoot beyond level b

Lemma 79 Let pflowast(x) be the density function of suptisinT f(t) Then there existsa constant cflowast and b0 such that

pflowast(x) le cflowastxd+1P (f(0) gt x)

for all x gt b0

Proof Recalling (13) let the continuous function pE(x) x isin R be defined bythe relationship

E(χ(t isin T f(t) ge b)) =int infin

b

pE(x) dx

where the left hand side is the expected value of the Euler-Poincare characteristicof Ab Then according to Theorem 810 in [7] there exists c and δ such that

|pE(x)minus pflowast(x)| lt cP (f(0) gt (1 + δ)x)

for all x gt 0 In addition thanks to the result of [4] which providesintinfin

bpE(x)dx in

closed form there exists c0 such that for all x gt 1

pE(x) lt c0xd+1P (f(0) gt x)

Hence there exists cflowast such that

pflowast(x) le c0xd+1P (f(0) gt x) + cP (f(0) gt (1 + δ)x) le cflowastx

d+1P (f(0) gt x)

for all x gt 1 2

The last ingredients required to provide the proof of Proposition 72 and Theorem74 are stated in the following well known Theorem adapted from Lemma 61 andTheorem 72 in [19] to the twice differentiable case

Theorem 710 There exists a constant H (depending on the covariance func-tion C such that

P (suptisinT

f(t) gt b) = (1 + o (1))Hm(T )bdP (f(0) gt b)(74)

as b rarrinfinSimilarly choose δ small enough so that [0 δ]d sub T and let ∆0 = [0 bminus1]d Then

there exists a constant H1 such that

P ( suptisin∆0

f(t) gt b) = (1 + o(1))H1P (f(0) gt b)(75)

as b rarrinfin

32 ADLER BLANCHET LIU

We now are ready to provide the proof of Proposition 72 and Theorem 74

Proof of Proposition 72 The fact that there exists c1 such that

(T ) le c1m(T )εminusdbd

is immediate from assumption B2 Therefore we proceed to provide a bound forthe relative bias We choose ε0 lt δ2

0 (δ0 is given in Lemmas 77 and 78) and notethat for any ε lt ε0

P (suptisinT

f(t) lt b| suptisinT

f(t) gt b)

le P (suptisinT

f(t) lt b + 2radic

εb| suptisinT

f(t) gt b) + P (suptisinT

f(t) lt b| suptisinT

f(t) gt b + 2radic

εb)

By (74) and Lemma 79 there exists c2 such that the first term above can boundedby

P (suptisinT

f(t) lt b + 2radic

εb| suptisinT

f(t) gt b) le c2

radicε

The second term can be bounded by

P (suptisinT

f(t) lt b| suptisinT

f(t) gt b + 2radic

εb)

le P ( sup|tminustlowast|lt2εbminus1

f(t) lt b| suptisinT

f(t) gt b + 2radic

ε)

le 2clowast exp(minusδlowastεminus1

)

Hence there exists c0 such that

P (suptisinT

f(t) lt b| suptisinT

f(t) gt b) le c2

radicε + 2clowasteminusδlowastε le c0

radicε

for all ε isin (0 ε0) 2

Proof of Theorem 74 We write θ = 1minus 3δ isin (0 1) First note that by (74)

P (supT

f(t) gt b + b2δminus1| supT

f(t) gt b) rarr 0

as b rarrinfin Let tlowast be the position of the global maximum of f in T According to theexact Slepian model in Section 73 and an argument similar to the proof of Lemma78

(76) P ( sup|tminustlowast|gtb2δminus1

f(t) gt b|b lt f(tlowast) le b + b2δminus1) rarr 0

as b rarrinfin Consequently

P ( sup|tminustlowast|gtb2δminus1

f(t) gt b| supT

f(t) gt b) rarr 0

MONTE CARLO FOR RANDOM FIELDS 33

LetB(T b2δminus1) = cup

tisinTB(t b2δminus1)

We have

P (supT

f(t) gt b| supT

f(t) gt b)

le P ( sup|tminustlowast|gtb2δminus1

f(t) gt b| supT

f(t) gt b)

+ P (supT

f(t) gt b sup|tminustlowast|gtb2δminus1

f(t) le b| supT

f(t) gt b)

le o(1) + P (tlowast isin B(T b2δminus1)| supT

f(t) gt b)

le o(1) + P ( supB(T b2δminus1)

f(t) gt b| supT

f(t) gt b)

Since (T ) le b(1minus3δ)d we can find a finite set T prime = tprime1 tprimel sub T and let ∆k =tprimek + [0 bminus1] such that l = O(b(1minusδ)d) and B(T b2δminus1) sub cupl

k=1∆k The choice ofl only depends on (T ) not the particular distribution of T Therefore applying(75)

sup(Tlebθd)

P ( supB(T b2δminus1)

f(t) gt b) le O(b(1minusδ)d)P (f(0) gt b)

This together with (74) yields

sup(Tlebθd)

P ( supB(T b2δminus1)

f(t) gt b| supT

f(t) gt b) le O(bminusδd) = o(1)

for b ge b0 which clearly implies the statement of the result 2

72 Variance Control Proof of Theorem 75 We proceed directly to the proofof Theorem 75

34 ADLER BLANCHET LIU

Proof of Theorem 75 Note that

EQL2b

P 2 (suptisinT f (t) gt b)

=E(Lb)

P 2 (suptisinT f (t) gt b)

=E

(MtimesP (X1gtbminus1b)sumn

j=11(Xjgtbminus 1

b )maxj Xj gt b

)

P 2 (suptisinT f (t) gt b)

=E

(MtimesP (X1gtbminus1b)sumn

j=11(Xjgtbminus 1

b )maxj Xj gt b suptisinT f (t) gt b

)

P 2 (suptisinT f (t) gt b)

=E

(MP (X1gtbminus1b)1(maxj Xjgtb)sumn

j=11(Xjgtbminus1b)

∣∣∣∣ suptisinT f (t) gt b

)

P (suptisinT f (t) gt b)

= E

(M1 (maxj Xj gt b)sumnj=1 1 (xj gt bminus 1b)

∣∣∣∣∣ suptisinT

f (t) gt b

)P (X1 gt bminus 1b)

P (suptisinT f (t) gt b)

The remainder of the proof involves showing that the last conditional expectationhere is of order O(bd) This together with (74) will yield the result Note that forany A (b ε) such that A (b ε) = Θ(M) uniformly over b and ε we can write

E

(M times 1 (maxj Xj gt b)sumM

j=1 1(Xj gt bminus 1

b

)∣∣∣∣∣ sup

tisinTf (t) gt b

)(77)

le E

M times 1

(sumMj=1 1 (Xj gt bminus 1b) ge M

A(bε)

)

sumMj=1 1

(xj gt bminus 1

b

)∣∣∣∣∣∣suptisinT

f (t) gt b

+ E

M times 1

(1 le sumM

j=1 1 (Xj gt bminus 1b) lt MA(bε)

)

sumMj=1 1

(xj gt bminus 1

b

)∣∣∣∣∣∣suptisinT

f (t) gt b

We shall select A(b ε) appropriately in order to bound the expectations above ByLemma 78 for any 4ε le δ le δ0 there exist constants cprime and cprimeprime isin (0infin) such that

clowast exp(minusδlowast

δ2

)ge P

(min

|tminustlowast|leδbf (t) lt bminus 1b

∣∣∣∣ suptisinT

f (t) gt b

)

ge P

Msum

j=1

1 (Xj gt bminus 1b) le cprimeδdεd

∣∣∣∣∣∣suptisinT

f (t) gt b

ge P

(MsumM

j=1 1(Xj gt bminus 1b)ge bdcprimeprime

δd

∣∣∣∣∣ suptisinT

f (t) gt b

)(78)

MONTE CARLO FOR RANDOM FIELDS 35

The first inequality is an application of Lemma 78 The second inequality is dueto the fact that for any ball B of radius 4ε or larger (T cap B) ge cprimedεminusd for somecprime gt 0 Inequality (78) implies that for all x such that bdcprimeprimeδd

0 lt x lt bdcprimeprime[4dεd]there exists δlowastlowast gt 0 such that

P

(MsumM

j=1 1(Xj gt bminus 1

b

)bdge x

∣∣∣∣∣ suptisinT

f (t) gt b

)le clowast exp

(minusδlowastlowastx2d

)

Now let A (b ε) = bdcprimeprime(4dεd) and observe that by the second result in (71) wehave A (b ε) = Θ (M) and moreover that there exists c3 such that the first termon the right hand side of (77) is bounded by

(79) E

M times 1

(sumMj=1 1 (Xj gt bminus 1b) ge M

A(bε)

)

sumMj=1 1 (Xj gt bminus 1b)

∣∣∣∣∣∣supT

f (t) gt b

le c3b

d

Now we turn to the second term on the right hand side of (77) We use the factthat MA(b ε) le cprimeprimeprime isin (0infin) (uniformly as b rarrinfin and ε rarr 0) There exist c4 andc5 such that for ε le δ0c4

E

M1

(1 le sumM

j=1 1(Xj gt bminus 1

b

)lt M

A(bε)

)

sumMj=1 1

(Xj gt bminus 1

b

)∣∣∣∣∣∣supT

f (t) gt b

(710)

le MP

nsum

j=1

1(

Xj gt bminus 1b

)lt cprimeprimeprime

∣∣∣∣∣∣supT

f (t) gt b

le MP

(min

|tminustlowast|lec4εbf (t) lt bminus 1b

∣∣∣∣ supT

f (t) gt b

)

le c1m(T )bdεminusd exp(minus δlowast

c24ε

2

)le c5b

d

The second inequality holds from the fact that ifsumM

j=1 1(Xj gt bminus 1

b

)is less than

cprimeprimeprime then the minimum of f in a ball around the local maximum and of radius c4εb

must be less than bminus 1b Otherwise there are more than cprimeprimeprime elements of T insidesuch a ball The last inequality is due to Lemma 78 and Theorem 72

Putting (79) and (710) together we obtain for all εb -regular discretizationswith ε lt ε0 = min(14 1c4)δ0

EQL2b

P (suptisinT f (t) gt b)le E

(M1 (maxj Xj gt b)sumM

j=1 1(Xj gt bminus 1

b

)∣∣∣∣∣ sup

tisinTf (t) gt b

)P (X1 gt bminus 1b)

P (suptisinT f (t) gt b)

le (c3 + c5)bdP (X1 gt bminus 1b)P (suptisinT f (t) gt b)

Applying now (74) and Proposition 72 we have that

P (suptisinT

f(t) gt b) ltP (sup

tisinTf(t) gt b)

1minus c0radic

ε

36 ADLER BLANCHET LIU

we have

supbgtb0εisin[0ε0]

EQL2b

P(sup

tisinTf (t) gt b

) lt infin

as required 2

73 A final proof Lemma 77 Without loss of generality we assume that thefield has mean zero and unit variance Before getting into the details of the proofof Lemma 77 we need a few additional lemmas for which we adopt the followingnotation Let Ci and Cij be the first and second order derivatives of C and definethe vectors

micro1(t) = (minusC1(t) minusCd(t))micro2(t) = vech ((Cij(t) i = 1 d j = i d))

Let f prime(0) and f primeprime(0) be the gradient and vector of second order derivatives of f at 0where f primeprime(0) is arranged in the same order as micro2(0) Furthermore let micro02 = microgt20 bea vector of second order spectral moments and micro22 a matrix of fourth order spectralmoments The vectors micro02 and micro22 are arranged so that

1 0 micro02

0 Λ 0micro20 0 micro22

is the covariance matrix of (f(0) f prime(0) f primeprime(0)) where Λ = (minusCij(0)) It then followsthat

micro2middot0 = micro22 minus micro20micro02

be the conditional variance of f primeprime(0) given f(0) The following lemma given in [5]provides a stochastic representation of the f given that it has a local maxima atlevel u at the origin

Lemma 711 Given that f has a local maximum with height u at zero (aninterior point of T ) the conditional field is equal in distribution to

(711) fu(t) uC (t)minusWuβgt (t) + g (t)

g(t) is a centered Gaussian random field with covariance function

γ(s t) = C(sminus t)minus (C(s) micro2(s))(

1 micro02

micro20 micro22

)minus1 (C(t)microgt2 (t)

)minus micro1(s)Λminus1microgt1 (t)

and Wu is a d(d+1)2 random vector independent of g(t) with density function

(712) ψu (w) prop |det (rlowast(w)minus uΛ) | exp(minus1

2wgtmicrominus1

2middot0w)1 (rlowast(w)minus uΛ isin N )

MONTE CARLO FOR RANDOM FIELDS 37

where rlowast(w) is a dtimesd symmetric matrix whose upper triangular elements consist ofthe components of w The set of negative definite matrices is denoted by N Finallyβ(t) is defined by

(α (t) β (t)) = (C (t) micro2 (t))(

1 micro02

micro20 micro22

)minus1

The following two technical lemmas which we shall prove after completing theproof of Lemma 77 provide bounds for the last two terms of (711)

Lemma 712 Using the notation in (711) there exist δ0 ε1 c1 and b0 suchthat for any u gt b gt b0 and δ isin (0 δ0)

P

(sup

|t|leδab

∣∣Wuβgt (t)∣∣ gt

a

4b

)le c1 exp

(minusε1b

2

δ4

)

Lemma 713 There exist c δ and δ0 such that for any δ isin (0 δ0)

P

(max|t|leδab

|g(t)| gt a

4b

)le c exp

(minus δ

δ2

)

Proof of Lemma 77 Using the notation of Lemma 711 given any s isin L forwhich f(s) = b we have that the corresponding conditional distribution of s is thatof fb(middot minus s) Consequently it suffices to show that

P

(min

|t|leδabfb(t) lt bminus ab

)le clowast exp

(minusδlowast

δ2

)

We consider firstly the case for which the local maximum is in the interior of T Then by the Slepian model (711)

fb (t) = bC (t)minusWbβgt (t) + g (t)

We study the three terms of the Slepian model individually Since

C(t) = 1minus tgtΛt + o(|t|2

)

there exists a ε0 such thatbC (t) ge bminus a

4b

38 ADLER BLANCHET LIU

for all |t| lt ε0radic

ab According to Lemmas 712 and 713 for δ lt min(ε0radic

a0 δ0)

P

(min

|t|leδabfu(t) lt bminus ab

)

le P

(max|t|ltδab

|g (t)| gt a

4b

)+ P

(sup

|t|leδab

∣∣Wbβgt(t)

∣∣ gta

4b

)

le c exp

δ2

)+ c1 exp

(minusε1b

2

δ4

)

le clowast exp(minusδlowast

δ2

)

for some clowast and δlowastNow consider the case for which the local maximum is in the (dminus1)-dimensional

boundary of T Due to convexity of T and without loss of generality we assumethat the tangent space of partT is generated by partpartt2 partparttd the local maximumis located at the origin and T is a subset of the positive half-plane t1 ge 0 Thatthese arguments to not involve a loss of generality follows from the arguments onpages 291ndash192 of [4] which rely on the assumed stationarity of f (for translations)and the fact that rotations while changing the distributions will not affect theprobabilities that we are currently computing)

For the origin positioned as just described to be a local maximum it is necessaryand sufficient that the gradient of f restricted to partT is the zero vector the Hessianmatrix restricted to partT is negative definite and part1f(0) le 0 Applying a version ofLemma 711 for this case conditional on f(0) = u and 0 being a local maximumthe field is equal in distribution to

(713) uC(t)minus Wuβgt(t) + micro1(t)Λminus1 (Z 0 0)T + g(t)

where Z le 0 corresponds to part1f (0) and it follows a truncated (conditional on thenegative axis) Gaussian random variable with mean zero and a variance parameterwhich is computed as the conditional variance of part1f (0) given (part2f (0) partdf (0))The vector Wu is a (d(d + 1)2)-dimensional random vector with density function

ψu(w) prop |det(rlowast(w)minus uΛ)| exp(minus12wgtmicrominus1

2middot0w)1(wlowast minus uΛ isin N )

where Λ is the second spectral moment of f restricted to partT and rlowast(w) is the(dminus 1)times (dminus 1) symmetric matrix whose upper triangular elements consist of thecomponents of w In the representation (713) the vector Wu and Z are independentAs in the proof of Lemma 712 one can show that a similar bound holds albeitwith with different constants Thus since micro1 (t) = O (t) there exist cprimeprime and δprimeprime suchthat the third term in (713) can be bounded by

P [ max|t|leδab

∣∣∣micro1 (t) Λminus1 (Z 0 0)T∣∣∣ ge a(4b)] le cprimeprime exp

(minusδprimeprime

δ2

)

MONTE CARLO FOR RANDOM FIELDS 39

Consequently we can also find clowast and δlowast such that the conclusion holds and weare done 2

We complete the paper with the proofs of Lemmas 712 and 713

Proof of Lemma 712 It suffices to prove the Lemma for the case a = 1 Since

fu (0) = u = uminusWuβgt (0) + g (0)

and Wu and g are independent β (0) = 0 Furthermore since C prime (t) = O (t) andmicroprime2 (t) = O (t) there exists a c0 such that |β(t)| le c0|t|2 In addition Wu has densityfunction proportional to

ψu(w) prop |det(rlowast(w)minus uΛ)| exp(minus12wgtmicrominus1

2middot0w)1 (wlowast minus uΛ isin N )

Note that det(rlowast(w) minus uΛ) is expressible as a polynomial in w and u and thereexists some ε0 and c such that

∣∣∣∣det (rlowast(w)minus uΛ)

det (minusuΛ)

∣∣∣∣ le c

if |w| le ε0u Hence there exist ε2 c2 gt 0 such that

ψu (w) le ψ (w) = c2 exp(minus1

2ε2w

gtmicrominus12middot0w

)

for all u ge 1 The right hand side here is proportional to a multivariate Gaussiandensity Thus

P (|Wu| gt x) =int

|w|gtx

ψu(w)dw leint

|w|gtx

ψu (w) dw = c3P (|W | gt x)

where W is a multivariate Gaussian random variable with density function propor-tional to ψ Therefore by choosing ε1 and c1 appropriately we have

P

(sup|t|leδb

∣∣Wuβgt∣∣ gt

14b

)le P

(|Wu| gt b

c20δ

2

)le c1 exp

(minusε1b

2

δ4

)

for all u ge b 2

Proof of Lemma 713 Once again it suffices to prove the Lemma for the casea = 1 Since

fb(0) = b = bminusWbβgt(0) + g(0)

the covariance function (γ (s t) s t isin T ) of the centered field g satisfies γ(0 0) = 0It is also easy to check that

partsγ(s t) = O(|s|+ |t|) parttγ(s t) = O(|s|+ |t|)

40 ADLER BLANCHET LIU

Consequently there exists a constant cγ isin (0infin) for which

γ(s t) le cγ(|s|2 + |t|2) γ(s s) le cγ |s|2

We need to control the tail probability of sup|t|leδb |g(t)| For this it is useful tointroduce the following scaling Define

gδ (t) =b

δg

(δt

b

)

Then sup|t|leδb g (t) ge 14b if and only if sup|t|le1 gδ (t) ge 1

4δ Let

σδ (s t) = E (gδ (s) gδ (t))

ThensupsisinR

σδ(s s) le cγ

Because γ(s t) is at least twice differentiable applying a Taylor expansion we easilysee that the canonical metric dg corresponding to gδ (s) (cf Theorem 67) can bebounde as follows

d2g(s t) = E (gδ (s)minus gδ (t))2 =

b2

δ2

(δs

bδs

b

)+ γ

(δt

bδt

b

)minus 2γ

(δs

bδt

b

)]

le c |sminus t|2

for some constant c isin (0infin) Therefore the entropy of gδ evaluated at δ is boundedby Kδminusd for any δ gt 0 and with an appropriate choice of K gt 0 Therefore for allδ lt δ0

P ( sup|t|leδb

|g(t)| ge 14b

) = P ( sup|t|le1

gδ (t) ge 14δ

) le cdδminusdminusη exp(minus 1

16cγδ2)

for some constant cd and η gt 0 The last inequality is a direct application of The-orem 411 of [4] The conclusion of the lemma follows immediately by choosing c

and δ appropriately 2

REFERENCES

[1] R.J. Adler, K. Bartz and S. Kou. Estimating curvatures in the Gaussian kinematic formula. In preparation.
[2] R.J. Adler, J. Blanchet and J. Liu. Efficient simulation for tail probabilities of Gaussian random fields. In WSC '08: Proceedings of the 40th Conference on Winter Simulation, pages 328–336. Winter Simulation Conference, 2008.
[3] R.J. Adler, P. Müller and B. Rozovskii, editors. Stochastic Modelling in Physical Oceanography. Birkhäuser, Boston, 1996.
[4] R.J. Adler and J.E. Taylor. Random Fields and Geometry. Springer, New York, USA, 2007.
[5] R.J. Adler, J.E. Taylor and K. Worsley. Applications of Random Fields and Geometry: Foundations and Case Studies. http://webee.technion.ac.il/people/adler/hrf.pdf.
[6] S. Asmussen and P. Glynn. Stochastic Simulation: Algorithms and Analysis. Springer-Verlag, New York, NY, USA, 2008.
[7] J.M. Azaïs and M. Wschebor. Level Sets and Extrema of Random Processes and Fields. Wiley, New York, NY, 2009.
[8] J.M. Bardeen, J.R. Bond, N. Kaiser and A.S. Szalay. The statistics of peaks of Gaussian random fields. The Astrophysical Journal, 304:15–61, 1986.
[9] S.M. Berman. Sojourns and Extremes of Stochastic Processes. The Wadsworth & Brooks/Cole Statistics/Probability Series. Wadsworth & Brooks/Cole Advanced Books & Software, Pacific Grove, CA, 1992.
[10] J. Blanchet. Efficient importance sampling for binary contingency tables. Ann. Appl. Probab., 19:949–982, 2009.
[11] C. Borell. The Brunn–Minkowski inequality in Gauss space. Invent. Math., 30:205–216, 1975.
[12] J. Bucklew. Introduction to Rare-Event Simulation. Springer-Verlag, New York, NY, 2004.
[13] M.R. Dennis. Nodal densities of planar Gaussian random waves. Eur. Phys. J., 145:191–210, 2007.
[14] R.M. Dudley. Sample functions of the Gaussian process. Ann. Probab., 1:66–103, 1973.
[15] K.J. Friston, K.J. Worsley, R.S.J. Frackowiak, J.C. Mazziotta and A.C. Evans. Assessing the significance of focal activations using their spatial extent. Human Brain Mapping, 1:214–220, 1994.
[16] M.R. Leadbetter, G. Lindgren and H. Rootzén. Extremes and Related Properties of Random Sequences and Processes. Springer-Verlag, New York, 1983.
[17] M. Mitzenmacher and E. Upfal. Probability and Computing: Randomized Algorithms and Probabilistic Analysis. Cambridge University Press, 2005.
[18] W. Niemiro and P. Pokarowski. Fixed precision MCMC estimation by median of products of averages. J. Appl. Probab., 46:309–329, 2009.
[19] V. Piterbarg. Asymptotic Methods in the Theory of Gaussian Processes. American Mathematical Society, USA, 1995.
[20] S.F. Shandarin. Testing non-Gaussianity in cosmic microwave background maps by morphological statistics. Mon. Not. R. Astr. Soc., 331:865–874, 2002.
[21] S.F. Shandarin, H.A. Feldman, Y. Xu and M. Tegmark. Morphological measures of non-Gaussianity in cosmic microwave background maps. Astrophys. J. Suppl., 141:1–11, 2002.
[22] L. Shepp and H. Landau. On the supremum of a Gaussian process. Sankhyā A, 32:369–378, December 1970.
[23] J.E. Taylor, A. Takemura and R.J. Adler. Validity of the expected Euler characteristic heuristic. Ann. Probab., 33(4):1362–1396, 2005.
[24] J.E. Taylor and K.J. Worsley. Detecting sparse signals in random fields, with an application to brain mapping. J. Amer. Statist. Assoc., 102(479):913–928, 2007.
[25] J. Traub, G. Wasilkowski and H. Woźniakowski. Information-Based Complexity. Academic Press, New York, NY, 1988.
[26] B.S. Tsirelson, I.A. Ibragimov and V.N. Sudakov. Norms of Gaussian sample functions. Proceedings of the Third Japan–USSR Symposium on Probability Theory (Tashkent, 1975), 550:20–41, 1976.
[27] V.S. Tsirel'son. The density of the maximum of a Gaussian process. Theory Probab. Appl., 20:847–856, 1975.
[28] H. Woźniakowski. Computational complexity of continuous problems. Technical Report, 1996.

Robert J. Adler
Electrical Engineering
Technion, Haifa, Israel 32000
E-mail: robert@ee.technion.ac.il
URL: webee.technion.ac.il/people/adler

Jose H. Blanchet
Industrial Engineering & Operations Research
Columbia University
340 S.W. Mudd Building, 500 W. 120 Street
New York, New York 10027
E-mail: jose.blanchet@columbia.edu
URL: www.ieor.columbia.edu/fac-bios/blanchet/faculty.html

Jingchen Liu
Department of Statistics
Columbia University
1255 Amsterdam Avenue, Room 1030
New York, NY 10027
E-mail: jcliu@stat.columbia.edu
URL: stat.columbia.edu/~jcliu

Page 31: EFFICIENT MONTE CARLO FOR HIGH EXCURSIONS OF … › ~jblanche › papers › GRF1.pdf · 1. Introduction. This paper centers on the design and analysis of e–cient Monte Carlo techniques

MONTE CARLO FOR RANDOM FIELDS 31

The following lemma gives a bound on the density of suptisinT f(t) which will beused to control the size of overshoot beyond level b

Lemma 79 Let pflowast(x) be the density function of suptisinT f(t) Then there existsa constant cflowast and b0 such that

pflowast(x) le cflowastxd+1P (f(0) gt x)

for all x gt b0

Proof Recalling (13) let the continuous function pE(x) x isin R be defined bythe relationship

E(χ(t isin T f(t) ge b)) =int infin

b

pE(x) dx

where the left hand side is the expected value of the Euler-Poincare characteristicof Ab Then according to Theorem 810 in [7] there exists c and δ such that

|pE(x)minus pflowast(x)| lt cP (f(0) gt (1 + δ)x)

for all x gt 0 In addition thanks to the result of [4] which providesintinfin

bpE(x)dx in

closed form there exists c0 such that for all x gt 1

pE(x) lt c0xd+1P (f(0) gt x)

Hence there exists cflowast such that

pflowast(x) le c0xd+1P (f(0) gt x) + cP (f(0) gt (1 + δ)x) le cflowastx

d+1P (f(0) gt x)

for all x gt 1 2

The last ingredients required to provide the proof of Proposition 72 and Theorem74 are stated in the following well known Theorem adapted from Lemma 61 andTheorem 72 in [19] to the twice differentiable case

Theorem 710 There exists a constant H (depending on the covariance func-tion C such that

P (suptisinT

f(t) gt b) = (1 + o (1))Hm(T )bdP (f(0) gt b)(74)

as b rarrinfinSimilarly choose δ small enough so that [0 δ]d sub T and let ∆0 = [0 bminus1]d Then

there exists a constant H1 such that

P ( suptisin∆0

f(t) gt b) = (1 + o(1))H1P (f(0) gt b)(75)

as b rarrinfin

32 ADLER BLANCHET LIU

We now are ready to provide the proof of Proposition 72 and Theorem 74

Proof of Proposition 72 The fact that there exists c1 such that

(T ) le c1m(T )εminusdbd

is immediate from assumption B2 Therefore we proceed to provide a bound forthe relative bias We choose ε0 lt δ2

0 (δ0 is given in Lemmas 77 and 78) and notethat for any ε lt ε0

P (suptisinT

f(t) lt b| suptisinT

f(t) gt b)

le P (suptisinT

f(t) lt b + 2radic

εb| suptisinT

f(t) gt b) + P (suptisinT

f(t) lt b| suptisinT

f(t) gt b + 2radic

εb)

By (74) and Lemma 79 there exists c2 such that the first term above can boundedby

P (suptisinT

f(t) lt b + 2radic

εb| suptisinT

f(t) gt b) le c2

radicε

The second term can be bounded by

P (suptisinT

f(t) lt b| suptisinT

f(t) gt b + 2radic

εb)

le P ( sup|tminustlowast|lt2εbminus1

f(t) lt b| suptisinT

f(t) gt b + 2radic

ε)

le 2clowast exp(minusδlowastεminus1

)

Hence there exists c0 such that

P (suptisinT

f(t) lt b| suptisinT

f(t) gt b) le c2

radicε + 2clowasteminusδlowastε le c0

radicε

for all ε isin (0 ε0) 2

Proof of Theorem 74 We write θ = 1minus 3δ isin (0 1) First note that by (74)

P (supT

f(t) gt b + b2δminus1| supT

f(t) gt b) rarr 0

as b rarrinfin Let tlowast be the position of the global maximum of f in T According to theexact Slepian model in Section 73 and an argument similar to the proof of Lemma78

(76) P ( sup|tminustlowast|gtb2δminus1

f(t) gt b|b lt f(tlowast) le b + b2δminus1) rarr 0

as b rarrinfin Consequently

P ( sup|tminustlowast|gtb2δminus1

f(t) gt b| supT

f(t) gt b) rarr 0

MONTE CARLO FOR RANDOM FIELDS 33

LetB(T b2δminus1) = cup

tisinTB(t b2δminus1)

We have

P (supT

f(t) gt b| supT

f(t) gt b)

le P ( sup|tminustlowast|gtb2δminus1

f(t) gt b| supT

f(t) gt b)

+ P (supT

f(t) gt b sup|tminustlowast|gtb2δminus1

f(t) le b| supT

f(t) gt b)

le o(1) + P (tlowast isin B(T b2δminus1)| supT

f(t) gt b)

le o(1) + P ( supB(T b2δminus1)

f(t) gt b| supT

f(t) gt b)

Since (T ) le b(1minus3δ)d we can find a finite set T prime = tprime1 tprimel sub T and let ∆k =tprimek + [0 bminus1] such that l = O(b(1minusδ)d) and B(T b2δminus1) sub cupl

k=1∆k The choice ofl only depends on (T ) not the particular distribution of T Therefore applying(75)

sup(Tlebθd)

P ( supB(T b2δminus1)

f(t) gt b) le O(b(1minusδ)d)P (f(0) gt b)

This together with (74) yields

sup(Tlebθd)

P ( supB(T b2δminus1)

f(t) gt b| supT

f(t) gt b) le O(bminusδd) = o(1)

for b ge b0 which clearly implies the statement of the result 2

72 Variance Control Proof of Theorem 75 We proceed directly to the proofof Theorem 75

34 ADLER BLANCHET LIU

Proof of Theorem 75 Note that

EQL2b

P 2 (suptisinT f (t) gt b)

=E(Lb)

P 2 (suptisinT f (t) gt b)

=E

(MtimesP (X1gtbminus1b)sumn

j=11(Xjgtbminus 1

b )maxj Xj gt b

)

P 2 (suptisinT f (t) gt b)

=E

(MtimesP (X1gtbminus1b)sumn

j=11(Xjgtbminus 1

b )maxj Xj gt b suptisinT f (t) gt b

)

P 2 (suptisinT f (t) gt b)

=E

(MP (X1gtbminus1b)1(maxj Xjgtb)sumn

j=11(Xjgtbminus1b)

∣∣∣∣ suptisinT f (t) gt b

)

P (suptisinT f (t) gt b)

= E

(M1 (maxj Xj gt b)sumnj=1 1 (xj gt bminus 1b)

∣∣∣∣∣ suptisinT

f (t) gt b

)P (X1 gt bminus 1b)

P (suptisinT f (t) gt b)

The remainder of the proof involves showing that the last conditional expectationhere is of order O(bd) This together with (74) will yield the result Note that forany A (b ε) such that A (b ε) = Θ(M) uniformly over b and ε we can write

E

(M times 1 (maxj Xj gt b)sumM

j=1 1(Xj gt bminus 1

b

)∣∣∣∣∣ sup

tisinTf (t) gt b

)(77)

le E

M times 1

(sumMj=1 1 (Xj gt bminus 1b) ge M

A(bε)

)

sumMj=1 1

(xj gt bminus 1

b

)∣∣∣∣∣∣suptisinT

f (t) gt b

+ E

M times 1

(1 le sumM

j=1 1 (Xj gt bminus 1b) lt MA(bε)

)

sumMj=1 1

(xj gt bminus 1

b

)∣∣∣∣∣∣suptisinT

f (t) gt b

We shall select A(b ε) appropriately in order to bound the expectations above ByLemma 78 for any 4ε le δ le δ0 there exist constants cprime and cprimeprime isin (0infin) such that

clowast exp(minusδlowast

δ2

)ge P

(min

|tminustlowast|leδbf (t) lt bminus 1b

∣∣∣∣ suptisinT

f (t) gt b

)

ge P

Msum

j=1

1 (Xj gt bminus 1b) le cprimeδdεd

∣∣∣∣∣∣suptisinT

f (t) gt b

ge P

(MsumM

j=1 1(Xj gt bminus 1b)ge bdcprimeprime

δd

∣∣∣∣∣ suptisinT

f (t) gt b

)(78)

MONTE CARLO FOR RANDOM FIELDS 35

The first inequality is an application of Lemma 78 The second inequality is dueto the fact that for any ball B of radius 4ε or larger (T cap B) ge cprimedεminusd for somecprime gt 0 Inequality (78) implies that for all x such that bdcprimeprimeδd

0 lt x lt bdcprimeprime[4dεd]there exists δlowastlowast gt 0 such that

P

(MsumM

j=1 1(Xj gt bminus 1

b

)bdge x

∣∣∣∣∣ suptisinT

f (t) gt b

)le clowast exp

(minusδlowastlowastx2d

)

Now let $A(b,\varepsilon) = b^d c''/(4^d\varepsilon^d)$ and observe that, by the second result in (7.1), we have $A(b,\varepsilon) = \Theta(M)$ and, moreover, that there exists $c_3$ such that the first term on the right-hand side of (7.7) is bounded by
\[
E\Biggl(\frac{M \times 1\bigl(\sum_{j=1}^{M} 1(X_j > b - 1/b) \ge M/A(b,\varepsilon)\bigr)}{\sum_{j=1}^{M} 1(X_j > b - 1/b)} \,\Bigg|\, \sup_{T} f(t) > b\Biggr) \le c_3 b^d. \tag{7.9}
\]
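For completeness, one way to fill in the step leading to (7.9) from the tail bound just derived (our computation; write $\Sigma = \sum_{j=1}^{M} 1(X_j > b - 1/b)$ and keep the conditioning on $\{\sup_T f(t) > b\}$ implicit) is
\[
E\Bigl[\frac{M}{\Sigma}\,;\ \frac{M}{\Sigma} \le A(b,\varepsilon)\Bigr]
= b^d\, E\Bigl[\frac{M}{b^d\Sigma}\,;\ \frac{M}{b^d\Sigma} \le \frac{c''}{4^d\varepsilon^d}\Bigr]
\le b^d\Bigl(\frac{c''}{\delta_0^d} + \int_{c''/\delta_0^d}^{\infty} c^* e^{-\delta^{**}x^{2/d}}\,dx\Bigr) =: c_3 b^d,
\]
using $E[Y;\, Y \le A] \le y_0 + \int_{y_0}^{\infty} P(Y \ge x)\,dx$ for nonnegative $Y$ and any $y_0 > 0$.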

Now we turn to the second term on the right-hand side of (7.7). We use the fact that $M/A(b,\varepsilon) \le c''' \in (0,\infty)$ (uniformly as $b \to \infty$ and $\varepsilon \to 0$). There exist $c_4$ and $c_5$ such that, for $\varepsilon \le \delta_0/c_4$,

\begin{align}
&E\Biggl(\frac{M\, 1\bigl(1 \le \sum_{j=1}^{M} 1(X_j > b - 1/b) < M/A(b,\varepsilon)\bigr)}{\sum_{j=1}^{M} 1(X_j > b - 1/b)} \,\Bigg|\, \sup_{T} f(t) > b\Biggr) \tag{7.10} \\
&\quad\le M\, P\Biggl(\sum_{j=1}^{M} 1(X_j > b - 1/b) < c''' \,\Bigg|\, \sup_{T} f(t) > b\Biggr) \notag \\
&\quad\le M\, P\Bigl(\min_{|t-t^*|\le c_4\varepsilon/b} f(t) < b - 1/b \,\Big|\, \sup_{T} f(t) > b\Bigr) \notag \\
&\quad\le c_1 m(T)\, b^d \varepsilon^{-d} \exp\Bigl(-\frac{\delta^*}{c_4^2\varepsilon^2}\Bigr) \le c_5 b^d. \notag
\end{align}

The second inequality follows from the fact that, if $\sum_{j=1}^{M} 1(X_j > b - 1/b)$ is less than $c'''$, then the minimum of $f$ over the ball of radius $c_4\varepsilon/b$ around the maximizer $t^*$ must be less than $b - 1/b$: otherwise, since such a ball contains more than $c'''$ points of the discretization, we would have $\sum_{j=1}^{M} 1(X_j > b - 1/b) > c'''$. The last inequality is due to Lemma 7.8 and Proposition 7.2.

Putting (7.9) and (7.10) together, we obtain, for all $\varepsilon b$-regular discretizations with $\varepsilon < \varepsilon_0 = \min(1/4,\, 1/c_4)\,\delta_0$,

\begin{align*}
\frac{E^Q L_b^2}{P^2\bigl(\sup_{t\in T} f(t) > b\bigr)}
&\le E\Bigl(\frac{M\, 1(\max_j X_j > b)}{\sum_{j=1}^{M} 1(X_j > b - 1/b)} \,\Big|\, \sup_{t\in T} f(t) > b\Bigr) \frac{P(X_1 > b - 1/b)}{P\bigl(\sup_{t\in T} f(t) > b\bigr)} \\
&\le (c_3 + c_5)\, b^d\, \frac{P(X_1 > b - 1/b)}{P\bigl(\sup_{t\in T} f(t) > b\bigr)}.
\end{align*}
Applying now (7.4) and Proposition 7.2, the latter giving
\[
P\Bigl(\sup_{t\in T} f(t) > b\Bigr) < \frac{P\bigl(\max_j X_j > b\bigr)}{1 - c_0\sqrt{\varepsilon}},
\]
we have
\[
\sup_{b > b_0,\ \varepsilon\in[0,\varepsilon_0]} \frac{E^Q L_b^2}{P^2\bigl(\sup_{t\in T} f(t) > b\bigr)} < \infty,
\]
as required. □

7.3. A final proof: Lemma 7.7. Without loss of generality, we assume that the field has mean zero and unit variance. Before getting into the details of the proof of Lemma 7.7, we need a few additional lemmas, for which we adopt the following notation. Let $C_i$ and $C_{ij}$ be the first and second order derivatives of $C$, and define the vectors
\begin{align*}
\mu_1(t) &= (-C_1(t), \ldots, -C_d(t)), \\
\mu_2(t) &= \mathrm{vech}\bigl((C_{ij}(t) : i = 1, \ldots, d;\ j = i, \ldots, d)\bigr).
\end{align*}
Let $f'(0)$ and $f''(0)$ be the gradient and the vector of second order derivatives of $f$ at $0$, where $f''(0)$ is arranged in the same order as $\mu_2(0)$. Furthermore, let $\mu_{02} = \mu_{20}^\top$ be a vector of second order spectral moments and $\mu_{22}$ a matrix of fourth order spectral moments. The vectors $\mu_{02}$ and $\mu_{22}$ are arranged so that
\[
\begin{pmatrix}
1 & 0 & \mu_{02} \\
0 & \Lambda & 0 \\
\mu_{20} & 0 & \mu_{22}
\end{pmatrix}
\]
is the covariance matrix of $(f(0), f'(0), f''(0))$, where $\Lambda = (-C_{ij}(0))$. It then follows that
\[
\mu_{2\cdot 0} = \mu_{22} - \mu_{20}\mu_{02}
\]
is the conditional variance of $f''(0)$ given $f(0)$. The following lemma, given in [5], provides a stochastic representation of $f$ given that it has a local maximum at level $u$ at the origin.

Lemma 7.11. Given that $f$ has a local maximum with height $u$ at zero (an interior point of $T$), the conditional field is equal in distribution to
\[
f_u(t) = uC(t) - W_u\beta^\top(t) + g(t), \tag{7.11}
\]
where $g(t)$ is a centered Gaussian random field with covariance function
\[
\gamma(s,t) = C(s-t) - (C(s),\, \mu_2(s))
\begin{pmatrix} 1 & \mu_{02} \\ \mu_{20} & \mu_{22} \end{pmatrix}^{-1}
\begin{pmatrix} C(t) \\ \mu_2^\top(t) \end{pmatrix}
- \mu_1(s)\Lambda^{-1}\mu_1^\top(t),
\]
and $W_u$ is a $d(d+1)/2$-dimensional random vector, independent of $g(t)$, with density function
\[
\psi_u(w) \propto |\det(r^*(w) - u\Lambda)| \exp\Bigl(-\tfrac{1}{2} w^\top\mu_{2\cdot 0}^{-1}w\Bigr)\, 1\bigl(r^*(w) - u\Lambda \in \mathcal{N}\bigr), \tag{7.12}
\]
where $r^*(w)$ is a $d\times d$ symmetric matrix whose upper triangular elements consist of the components of $w$. The set of negative definite matrices is denoted by $\mathcal{N}$. Finally, $\beta(t)$ is defined by
\[
(\alpha(t), \beta(t)) = (C(t),\, \mu_2(t))
\begin{pmatrix} 1 & \mu_{02} \\ \mu_{20} & \mu_{22} \end{pmatrix}^{-1}.
\]
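As a quick sanity check (our computation), specialize to $d = 1$, writing $\lambda_2 = -C''(0)$ and $\lambda_4 = C''''(0)$ for the second and fourth spectral moments. Then $\mu_1(t) = -C'(t)$, $\mu_2(t) = C''(t)$, $\Lambda = \lambda_2$, $\mu_{02} = \mu_{20} = -\lambda_2$, $\mu_{22} = \lambda_4$, $\mu_{2\cdot 0} = \lambda_4 - \lambda_2^2$, and
\[
(\alpha(t), \beta(t)) = \Bigl(\frac{\lambda_4 C(t) + \lambda_2 C''(t)}{\lambda_4 - \lambda_2^2},\ \frac{\lambda_2 C(t) + C''(t)}{\lambda_4 - \lambda_2^2}\Bigr),
\qquad
\psi_u(w) \propto |w - u\lambda_2|\, e^{-w^2/(2(\lambda_4-\lambda_2^2))}\, 1(w < u\lambda_2).
\]
In particular, $\beta(0) = 0$, a fact used in the proof of Lemma 7.12 below.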

The following two technical lemmas, which we shall prove after completing the proof of Lemma 7.7, provide bounds for the last two terms of (7.11).

Lemma 7.12. Using the notation in (7.11), there exist $\delta_0$, $\varepsilon_1$, $c_1$ and $b_0$ such that, for any $u > b > b_0$ and $\delta \in (0, \delta_0)$,
\[
P\Bigl(\sup_{|t|\le\delta\sqrt{a}/b} \bigl|W_u\beta^\top(t)\bigr| > \frac{a}{4b}\Bigr) \le c_1 \exp\Bigl(-\frac{\varepsilon_1 b^2}{\delta^4}\Bigr).
\]

Lemma 7.13. There exist $c$, $\bar\delta$ and $\delta_0$ such that, for any $\delta \in (0, \delta_0)$,
\[
P\Bigl(\max_{|t|\le\delta\sqrt{a}/b} |g(t)| > \frac{a}{4b}\Bigr) \le c \exp\Bigl(-\frac{\bar\delta}{\delta^2}\Bigr).
\]

Proof of Lemma 7.7. Using the notation of Lemma 7.11, given any $s \in L$ for which $f(s) = b$, the corresponding conditional distribution of the field is that of $f_b(\cdot - s)$. Consequently, it suffices to show that
\[
P\Bigl(\min_{|t|\le\delta\sqrt{a}/b} f_b(t) < b - \frac{a}{b}\Bigr) \le c^* \exp\Bigl(-\frac{\delta^*}{\delta^2}\Bigr).
\]
We consider first the case in which the local maximum lies in the interior of $T$. Then, by the Slepian model (7.11),
\[
f_b(t) = bC(t) - W_b\beta^\top(t) + g(t).
\]
We study the three terms of the Slepian model individually. Since
\[
C(t) = 1 - t^\top\Lambda t + o\bigl(|t|^2\bigr),
\]
there exists an $\varepsilon_0$ such that
\[
bC(t) \ge b - \frac{a}{4b}
\]
for all $|t| < \varepsilon_0\sqrt{a}/b$. According to Lemmas 7.12 and 7.13, for $\delta < \min(\varepsilon_0\sqrt{a_0},\, \delta_0)$,

\begin{align*}
P\Bigl(\min_{|t|\le\delta\sqrt{a}/b} f_b(t) < b - \frac{a}{b}\Bigr)
&\le P\Bigl(\max_{|t|<\delta\sqrt{a}/b} |g(t)| > \frac{a}{4b}\Bigr) + P\Bigl(\sup_{|t|\le\delta\sqrt{a}/b} \bigl|W_b\beta^\top(t)\bigr| > \frac{a}{4b}\Bigr) \\
&\le c \exp\Bigl(-\frac{\bar\delta}{\delta^2}\Bigr) + c_1 \exp\Bigl(-\frac{\varepsilon_1 b^2}{\delta^4}\Bigr) \\
&\le c^* \exp\Bigl(-\frac{\delta^*}{\delta^2}\Bigr)
\end{align*}

for some $c^*$ and $\delta^*$.

Now consider the case for which the local maximum lies on the $(d-1)$-dimensional boundary of $T$. Due to the convexity of $T$, and without loss of generality, we assume that the tangent space of $\partial T$ is generated by $\partial/\partial t_2, \ldots, \partial/\partial t_d$, that the local maximum is located at the origin, and that $T$ is a subset of the half-space $\{t_1 \ge 0\}$. That these assumptions do not involve a loss of generality follows from the arguments on pages 291--292 of [4], which rely on the assumed stationarity of $f$ (for translations) and on the fact that rotations, while changing the distributions, will not affect the probabilities that we are currently computing.

For the origin, positioned as just described, to be a local maximum, it is necessary and sufficient that the gradient of $f$ restricted to $\partial T$ is the zero vector, that the Hessian matrix restricted to $\partial T$ is negative definite, and that $\partial_1 f(0) \le 0$. Applying a version of Lemma 7.11 for this case, conditional on $f(0) = u$ and $0$ being a local maximum, the field is equal in distribution to
\[
uC(t) - W_u\beta^\top(t) + \mu_1(t)\Lambda^{-1}(Z, 0, \ldots, 0)^\top + g(t), \tag{7.13}
\]
where $Z \le 0$ corresponds to $\partial_1 f(0)$: it follows a truncated (conditioned on the negative axis) Gaussian distribution with mean zero and a variance parameter which is computed as the conditional variance of $\partial_1 f(0)$ given $(\partial_2 f(0), \ldots, \partial_d f(0))$. The vector $W_u$ is now a $(d-1)d/2$-dimensional random vector with density function
\[
\psi_u(w) \propto |\det(r^*(w) - u\Lambda)| \exp\Bigl(-\tfrac{1}{2} w^\top\mu_{2\cdot 0}^{-1}w\Bigr)\, 1\bigl(r^*(w) - u\Lambda \in \mathcal{N}\bigr),
\]
where $\Lambda$ is now the matrix of second order spectral moments of $f$ restricted to $\partial T$, and $r^*(w)$ is the $(d-1)\times(d-1)$ symmetric matrix whose upper triangular elements consist of the components of $w$. In the representation (7.13), $W_u$ and $Z$ are independent. As in the proof of Lemma 7.12, one can show that a similar bound holds for the $W_u\beta^\top(t)$ term, albeit with different constants. Thus, since $\mu_1(t) = O(t)$, there exist $c''$ and $\delta''$ such that the third term in (7.13) can be bounded by
\[
P\Bigl[\max_{|t|\le\delta\sqrt{a}/b} \bigl|\mu_1(t)\Lambda^{-1}(Z, 0, \ldots, 0)^\top\bigr| \ge \frac{a}{4b}\Bigr] \le c'' \exp\Bigl(-\frac{\delta''}{\delta^2}\Bigr).
\]


Consequently, we can also find $c^*$ and $\delta^*$ such that the conclusion holds, and we are done. □

We complete the paper with the proofs of Lemmas 7.12 and 7.13.

Proof of Lemma 7.12. It suffices to prove the lemma for the case $a = 1$. Since
\[
f_u(0) = u = u - W_u\beta^\top(0) + g(0),
\]
and $W_u$ and $g$ are independent, $\beta(0) = 0$. Furthermore, since $C'(t) = O(t)$ and $\mu_2'(t) = O(t)$, there exists a $c_0$ such that $|\beta(t)| \le c_0|t|^2$. In addition, $W_u$ has density function
\[
\psi_u(w) \propto |\det(r^*(w) - u\Lambda)| \exp\Bigl(-\tfrac{1}{2} w^\top\mu_{2\cdot 0}^{-1}w\Bigr)\, 1\bigl(r^*(w) - u\Lambda \in \mathcal{N}\bigr).
\]
Note that $\det(r^*(w) - u\Lambda)$ is expressible as a polynomial in $w$ and $u$, and there exist some $\varepsilon_0$ and $c$ such that
\[
\biggl|\frac{\det(r^*(w) - u\Lambda)}{\det(-u\Lambda)}\biggr| \le c
\]
if $|w| \le \varepsilon_0 u$. Hence there exist $\varepsilon_2, c_2 > 0$ such that
\[
\psi_u(w) \le \bar\psi(w) = c_2 \exp\Bigl(-\tfrac{1}{2}\varepsilon_2\, w^\top\mu_{2\cdot 0}^{-1}w\Bigr)
\]
for all $u \ge 1$. The right-hand side here is proportional to a multivariate Gaussian density. Thus,
\[
P(|W_u| > x) = \int_{|w|>x} \psi_u(w)\,dw \le \int_{|w|>x} \bar\psi(w)\,dw = c_3 P(|W| > x),
\]
where $W$ is a multivariate Gaussian random vector with density function proportional to $\bar\psi$. Therefore, by choosing $\varepsilon_1$ and $c_1$ appropriately, we have
\[
P\Bigl(\sup_{|t|\le\delta/b} \bigl|W_u\beta^\top(t)\bigr| > \frac{1}{4b}\Bigr) \le P\Bigl(|W_u| > \frac{b}{4c_0\delta^2}\Bigr) \le c_1 \exp\Bigl(-\frac{\varepsilon_1 b^2}{\delta^4}\Bigr)
\]
for all $u \ge b$. □

Proof of Lemma 7.13. Once again, it suffices to prove the lemma for the case $a = 1$. Since
\[
f_b(0) = b = b - W_b\beta^\top(0) + g(0),
\]
the covariance function $(\gamma(s,t) : s, t \in T)$ of the centered field $g$ satisfies $\gamma(0,0) = 0$. It is also easy to check that
\[
\partial_s\gamma(s,t) = O(|s| + |t|), \qquad \partial_t\gamma(s,t) = O(|s| + |t|).
\]


Consequently, there exists a constant $c_\gamma \in (0,\infty)$ for which
\[
\gamma(s,t) \le c_\gamma\bigl(|s|^2 + |t|^2\bigr), \qquad \gamma(s,s) \le c_\gamma|s|^2.
\]

We need to control the tail probability of $\sup_{|t|\le\delta/b} |g(t)|$. For this it is useful to introduce the following scaling. Define
\[
g_\delta(t) = \frac{b}{\delta}\, g\Bigl(\frac{\delta t}{b}\Bigr).
\]
Then $\sup_{|t|\le\delta/b} g(t) \ge \frac{1}{4b}$ if and only if $\sup_{|t|\le 1} g_\delta(t) \ge \frac{1}{4\delta}$. Let
\[
\sigma_\delta(s,t) = E\bigl(g_\delta(s) g_\delta(t)\bigr).
\]
Then
\[
\sup_{|s|\le 1} \sigma_\delta(s,s) \le c_\gamma.
\]
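Both facts are immediate from the scaling; filling in the one-line computation (ours):
\[
\sup_{|t|\le 1} g_\delta(t) = \frac{b}{\delta}\sup_{|s|\le\delta/b} g(s), \qquad \frac{b}{\delta}\cdot\frac{1}{4b} = \frac{1}{4\delta}, \qquad
\sigma_\delta(s,s) = \frac{b^2}{\delta^2}\,\gamma\Bigl(\frac{\delta s}{b}, \frac{\delta s}{b}\Bigr) \le c_\gamma|s|^2 \le c_\gamma \quad\text{for } |s|\le 1.
\]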

Because $\gamma(s,t)$ is at least twice differentiable, applying a Taylor expansion we easily see that the canonical metric $d_g$ corresponding to $g_\delta$ (cf. Theorem 6.7) can be bounded as follows:
\begin{align*}
d_g^2(s,t) = E\bigl(g_\delta(s) - g_\delta(t)\bigr)^2
&= \frac{b^2}{\delta^2}\Bigl[\gamma\Bigl(\frac{\delta s}{b}, \frac{\delta s}{b}\Bigr) + \gamma\Bigl(\frac{\delta t}{b}, \frac{\delta t}{b}\Bigr) - 2\gamma\Bigl(\frac{\delta s}{b}, \frac{\delta t}{b}\Bigr)\Bigr] \\
&\le c\,|s-t|^2
\end{align*}
for some constant $c \in (0,\infty)$. Therefore the metric entropy of $g_\delta$ evaluated at $\rho$ is bounded by $K\rho^{-d}$ for any $\rho > 0$, with an appropriate choice of $K > 0$. Therefore, for all $\delta < \delta_0$,

\[
P\Bigl(\sup_{|t|\le\delta/b} |g(t)| \ge \frac{1}{4b}\Bigr) = P\Bigl(\sup_{|t|\le 1} |g_\delta(t)| \ge \frac{1}{4\delta}\Bigr) \le c_d\,\delta^{-d-\eta}\exp\Bigl(-\frac{1}{16 c_\gamma \delta^2}\Bigr)
\]
for some constants $c_d$ and $\eta > 0$. The last inequality is a direct application of Theorem 4.1.1 of [4]. The conclusion of the lemma follows immediately by choosing $c$ and $\bar\delta$ appropriately. □

REFERENCES

[1] R.J. Adler, K. Bartz, and S. Kou. Estimating curvatures in the Gaussian kinematic formula. In preparation.

[2] R.J. Adler, J. Blanchet, and J. Liu. Efficient simulation for tail probabilities of Gaussian random fields. In WSC '08: Proceedings of the 40th Conference on Winter Simulation, pages 328–336. Winter Simulation Conference, 2008.

[3] R.J. Adler, P. Muller, and B. Rozovskii, editors. Stochastic Modelling in Physical Oceanography. Birkhauser, Boston, 1996.

[4] R.J. Adler and J.E. Taylor. Random Fields and Geometry. Springer, New York, USA, 2007.

[5] R.J. Adler, J.E. Taylor, and K. Worsley. Applications of Random Fields and Geometry: Foundations and Case Studies. http://webee.technion.ac.il/people/adler/hrf.pdf.

[6] S. Asmussen and P. Glynn. Stochastic Simulation: Algorithms and Analysis. Springer-Verlag, New York, NY, USA, 2008.

[7] J.M. Azais and M. Wschebor. Level Sets and Extrema of Random Processes and Fields. Wiley, New York, NY, 2009.

[8] J.M. Bardeen, J.R. Bond, N. Kaiser, and A.S. Szalay. The statistics of peaks of Gaussian random fields. The Astrophysical Journal, 304:15–61, 1986.

[9] S.M. Berman. Sojourns and Extremes of Stochastic Processes. The Wadsworth & Brooks/Cole Statistics/Probability Series. Wadsworth & Brooks/Cole Advanced Books & Software, Pacific Grove, CA, 1992.

[10] J. Blanchet. Efficient importance sampling for binary contingency tables. Ann. of Appl. Prob., 19:949–982, 2009.

[11] C. Borell. The Brunn-Minkowski inequality in Gauss space. Invent. Math., 30:205–216, 1975.

[12] J. Bucklew. Introduction to Rare-Event Simulation. Springer-Verlag, New York, NY, 2004.

[13] M.R. Dennis. Nodal densities of planar Gaussian random waves. Eur. Phys. J., 145:191–210, 2007.

[14] R.M. Dudley. Sample functions of the Gaussian process. Ann. Probab., 1:66–103, 1973.

[15] K.J. Friston, K.J. Worsley, R.S.J. Frackowiak, J.C. Mazziotta, and A.C. Evans. Assessing the significance of focal activations using their spatial extent. Human Brain Mapping, 1:214–220, 1994.

[16] M.R. Leadbetter, G. Lindgren, and H. Rootzen. Extremes and Related Properties of Random Sequences and Processes. Springer-Verlag, New York, 1983.

[17] M. Mitzenmacher and E. Upfal. Probability and Computing: Randomized Algorithms and Probabilistic Analysis. Cambridge University Press, 2005.

[18] W. Niemiro and P. Pokarowski. Fixed precision MCMC estimation by median of products of averages. J. Appl. Probab., 46:309–329, 2009.

[19] V. Piterbarg. Asymptotic Methods in the Theory of Gaussian Processes. American Mathematical Society, USA, 1995.

[20] S.F. Shandarin. Testing non-Gaussianity in cosmic microwave background maps by morphological statistics. Mon. Not. R. Astr. Soc., 331:865–874, 2002.

[21] S.F. Shandarin, H.A. Feldman, Y. Xu, and M. Tegmark. Morphological measures of non-Gaussianity in cosmic microwave background maps. Astrophys. J. Suppl., 141:1–11, 2002.

[22] L. Shepp and H. Landau. On the supremum of a Gaussian process. Sankhya A, 32:369–378, December 1970.

[23] J.E. Taylor, A. Takemura, and R.J. Adler. Validity of the expected Euler characteristic heuristic. Ann. Probab., 33(4):1362–1396, 2005.

[24] J.E. Taylor and K.J. Worsley. Detecting sparse signals in random fields, with an application to brain mapping. J. Amer. Statist. Assoc., 102(479):913–928, 2007.

[25] J. Traub, G. Wasilokowski, and H. Wozniakowski. Information-Based Complexity. Academic Press, New York, NY, 1988.

[26] B.S. Tsirelson, I.A. Ibragimov, and V.N. Sudakov. Norms of Gaussian sample functions. Proceedings of the Third Japan-USSR Symposium on Probability Theory (Tashkent, 1975), 550:20–41, 1976.

[27] V.S. Tsirel'son. The density of the maximum of a Gaussian process. Theory of Probab. Appl., 20:847–856, 1975.

[28] H. Wozniakowski. Computational complexity of continuous problems. Technical Report, 1996.

Robert J. Adler
Electrical Engineering
Technion, Haifa, Israel 32000
E-mail: robert@ee.technion.ac.il
URL: webee.technion.ac.il/people/adler

Jose H. Blanchet
Industrial Engineering & Operations Research
Columbia University
340 S.W. Mudd Building
500 W. 120 Street
New York, New York 10027
E-mail: jose.blanchet@columbia.edu
URL: www.ieor.columbia.edu/fac-bios/blanchet/faculty.html

Jingchen Liu
Department of Statistics
Columbia University
1255 Amsterdam Avenue, Room 1030
New York, NY 10027
E-mail: jcliu@stat.columbia.edu
URL: stat.columbia.edu/~jcliu

Page 32: EFFICIENT MONTE CARLO FOR HIGH EXCURSIONS OF … › ~jblanche › papers › GRF1.pdf · 1. Introduction. This paper centers on the design and analysis of e–cient Monte Carlo techniques

32 ADLER BLANCHET LIU

We now are ready to provide the proof of Proposition 72 and Theorem 74

Proof of Proposition 72 The fact that there exists c1 such that

(T ) le c1m(T )εminusdbd

is immediate from assumption B2 Therefore we proceed to provide a bound forthe relative bias We choose ε0 lt δ2

0 (δ0 is given in Lemmas 77 and 78) and notethat for any ε lt ε0

P (suptisinT

f(t) lt b| suptisinT

f(t) gt b)

le P (suptisinT

f(t) lt b + 2radic

εb| suptisinT

f(t) gt b) + P (suptisinT

f(t) lt b| suptisinT

f(t) gt b + 2radic

εb)

By (74) and Lemma 79 there exists c2 such that the first term above can boundedby

P (suptisinT

f(t) lt b + 2radic

εb| suptisinT

f(t) gt b) le c2

radicε

The second term can be bounded by

P (suptisinT

f(t) lt b| suptisinT

f(t) gt b + 2radic

εb)

le P ( sup|tminustlowast|lt2εbminus1

f(t) lt b| suptisinT

f(t) gt b + 2radic

ε)

le 2clowast exp(minusδlowastεminus1

)

Hence there exists c0 such that

P (suptisinT

f(t) lt b| suptisinT

f(t) gt b) le c2

radicε + 2clowasteminusδlowastε le c0

radicε

for all ε isin (0 ε0) 2

Proof of Theorem 74 We write θ = 1minus 3δ isin (0 1) First note that by (74)

P (supT

f(t) gt b + b2δminus1| supT

f(t) gt b) rarr 0

as b rarrinfin Let tlowast be the position of the global maximum of f in T According to theexact Slepian model in Section 73 and an argument similar to the proof of Lemma78

(76) P ( sup|tminustlowast|gtb2δminus1

f(t) gt b|b lt f(tlowast) le b + b2δminus1) rarr 0

as b rarrinfin Consequently

P ( sup|tminustlowast|gtb2δminus1

f(t) gt b| supT

f(t) gt b) rarr 0

MONTE CARLO FOR RANDOM FIELDS 33

LetB(T b2δminus1) = cup

tisinTB(t b2δminus1)

We have

P (supT

f(t) gt b| supT

f(t) gt b)

le P ( sup|tminustlowast|gtb2δminus1

f(t) gt b| supT

f(t) gt b)

+ P (supT

f(t) gt b sup|tminustlowast|gtb2δminus1

f(t) le b| supT

f(t) gt b)

le o(1) + P (tlowast isin B(T b2δminus1)| supT

f(t) gt b)

le o(1) + P ( supB(T b2δminus1)

f(t) gt b| supT

f(t) gt b)

Since (T ) le b(1minus3δ)d we can find a finite set T prime = tprime1 tprimel sub T and let ∆k =tprimek + [0 bminus1] such that l = O(b(1minusδ)d) and B(T b2δminus1) sub cupl

k=1∆k The choice ofl only depends on (T ) not the particular distribution of T Therefore applying(75)

sup(Tlebθd)

P ( supB(T b2δminus1)

f(t) gt b) le O(b(1minusδ)d)P (f(0) gt b)

This together with (74) yields

sup(Tlebθd)

P ( supB(T b2δminus1)

f(t) gt b| supT

f(t) gt b) le O(bminusδd) = o(1)

for b ge b0 which clearly implies the statement of the result 2

72 Variance Control Proof of Theorem 75 We proceed directly to the proofof Theorem 75

34 ADLER BLANCHET LIU

Proof of Theorem 75 Note that

EQL2b

P 2 (suptisinT f (t) gt b)

=E(Lb)

P 2 (suptisinT f (t) gt b)

=E

(MtimesP (X1gtbminus1b)sumn

j=11(Xjgtbminus 1

b )maxj Xj gt b

)

P 2 (suptisinT f (t) gt b)

=E

(MtimesP (X1gtbminus1b)sumn

j=11(Xjgtbminus 1

b )maxj Xj gt b suptisinT f (t) gt b

)

P 2 (suptisinT f (t) gt b)

=E

(MP (X1gtbminus1b)1(maxj Xjgtb)sumn

j=11(Xjgtbminus1b)

∣∣∣∣ suptisinT f (t) gt b

)

P (suptisinT f (t) gt b)

= E

(M1 (maxj Xj gt b)sumnj=1 1 (xj gt bminus 1b)

∣∣∣∣∣ suptisinT

f (t) gt b

)P (X1 gt bminus 1b)

P (suptisinT f (t) gt b)

The remainder of the proof involves showing that the last conditional expectationhere is of order O(bd) This together with (74) will yield the result Note that forany A (b ε) such that A (b ε) = Θ(M) uniformly over b and ε we can write

E

(M times 1 (maxj Xj gt b)sumM

j=1 1(Xj gt bminus 1

b

)∣∣∣∣∣ sup

tisinTf (t) gt b

)(77)

le E

M times 1

(sumMj=1 1 (Xj gt bminus 1b) ge M

A(bε)

)

sumMj=1 1

(xj gt bminus 1

b

)∣∣∣∣∣∣suptisinT

f (t) gt b

+ E

M times 1

(1 le sumM

j=1 1 (Xj gt bminus 1b) lt MA(bε)

)

sumMj=1 1

(xj gt bminus 1

b

)∣∣∣∣∣∣suptisinT

f (t) gt b

We shall select A(b ε) appropriately in order to bound the expectations above ByLemma 78 for any 4ε le δ le δ0 there exist constants cprime and cprimeprime isin (0infin) such that

clowast exp(minusδlowast

δ2

)ge P

(min

|tminustlowast|leδbf (t) lt bminus 1b

∣∣∣∣ suptisinT

f (t) gt b

)

ge P

Msum

j=1

1 (Xj gt bminus 1b) le cprimeδdεd

∣∣∣∣∣∣suptisinT

f (t) gt b

ge P

(MsumM

j=1 1(Xj gt bminus 1b)ge bdcprimeprime

δd

∣∣∣∣∣ suptisinT

f (t) gt b

)(78)

MONTE CARLO FOR RANDOM FIELDS 35

The first inequality is an application of Lemma 78 The second inequality is dueto the fact that for any ball B of radius 4ε or larger (T cap B) ge cprimedεminusd for somecprime gt 0 Inequality (78) implies that for all x such that bdcprimeprimeδd

0 lt x lt bdcprimeprime[4dεd]there exists δlowastlowast gt 0 such that

P

(MsumM

j=1 1(Xj gt bminus 1

b

)bdge x

∣∣∣∣∣ suptisinT

f (t) gt b

)le clowast exp

(minusδlowastlowastx2d

)

Now let A (b ε) = bdcprimeprime(4dεd) and observe that by the second result in (71) wehave A (b ε) = Θ (M) and moreover that there exists c3 such that the first termon the right hand side of (77) is bounded by

(79) E

M times 1

(sumMj=1 1 (Xj gt bminus 1b) ge M

A(bε)

)

sumMj=1 1 (Xj gt bminus 1b)

∣∣∣∣∣∣supT

f (t) gt b

le c3b

d

Now we turn to the second term on the right hand side of (77) We use the factthat MA(b ε) le cprimeprimeprime isin (0infin) (uniformly as b rarrinfin and ε rarr 0) There exist c4 andc5 such that for ε le δ0c4

E

M1

(1 le sumM

j=1 1(Xj gt bminus 1

b

)lt M

A(bε)

)

sumMj=1 1

(Xj gt bminus 1

b

)∣∣∣∣∣∣supT

f (t) gt b

(710)

le MP

nsum

j=1

1(

Xj gt bminus 1b

)lt cprimeprimeprime

∣∣∣∣∣∣supT

f (t) gt b

le MP

(min

|tminustlowast|lec4εbf (t) lt bminus 1b

∣∣∣∣ supT

f (t) gt b

)

le c1m(T )bdεminusd exp(minus δlowast

c24ε

2

)le c5b

d

The second inequality holds from the fact that ifsumM

j=1 1(Xj gt bminus 1

b

)is less than

cprimeprimeprime then the minimum of f in a ball around the local maximum and of radius c4εb

must be less than bminus 1b Otherwise there are more than cprimeprimeprime elements of T insidesuch a ball The last inequality is due to Lemma 78 and Theorem 72

Putting (79) and (710) together we obtain for all εb -regular discretizationswith ε lt ε0 = min(14 1c4)δ0

EQL2b

P (suptisinT f (t) gt b)le E

(M1 (maxj Xj gt b)sumM

j=1 1(Xj gt bminus 1

b

)∣∣∣∣∣ sup

tisinTf (t) gt b

)P (X1 gt bminus 1b)

P (suptisinT f (t) gt b)

le (c3 + c5)bdP (X1 gt bminus 1b)P (suptisinT f (t) gt b)

Applying now (74) and Proposition 72 we have that

P (suptisinT

f(t) gt b) ltP (sup

tisinTf(t) gt b)

1minus c0radic

ε

36 ADLER BLANCHET LIU

we have

supbgtb0εisin[0ε0]

EQL2b

P(sup

tisinTf (t) gt b

) lt infin

as required 2

73 A final proof Lemma 77 Without loss of generality we assume that thefield has mean zero and unit variance Before getting into the details of the proofof Lemma 77 we need a few additional lemmas for which we adopt the followingnotation Let Ci and Cij be the first and second order derivatives of C and definethe vectors

micro1(t) = (minusC1(t) minusCd(t))micro2(t) = vech ((Cij(t) i = 1 d j = i d))

Let f prime(0) and f primeprime(0) be the gradient and vector of second order derivatives of f at 0where f primeprime(0) is arranged in the same order as micro2(0) Furthermore let micro02 = microgt20 bea vector of second order spectral moments and micro22 a matrix of fourth order spectralmoments The vectors micro02 and micro22 are arranged so that

1 0 micro02

0 Λ 0micro20 0 micro22

is the covariance matrix of (f(0) f prime(0) f primeprime(0)) where Λ = (minusCij(0)) It then followsthat

micro2middot0 = micro22 minus micro20micro02

be the conditional variance of f primeprime(0) given f(0) The following lemma given in [5]provides a stochastic representation of the f given that it has a local maxima atlevel u at the origin

Lemma 711 Given that f has a local maximum with height u at zero (aninterior point of T ) the conditional field is equal in distribution to

(711) fu(t) uC (t)minusWuβgt (t) + g (t)

g(t) is a centered Gaussian random field with covariance function

γ(s t) = C(sminus t)minus (C(s) micro2(s))(

1 micro02

micro20 micro22

)minus1 (C(t)microgt2 (t)

)minus micro1(s)Λminus1microgt1 (t)

and Wu is a d(d+1)2 random vector independent of g(t) with density function

(712) ψu (w) prop |det (rlowast(w)minus uΛ) | exp(minus1

2wgtmicrominus1

2middot0w)1 (rlowast(w)minus uΛ isin N )

MONTE CARLO FOR RANDOM FIELDS 37

where rlowast(w) is a dtimesd symmetric matrix whose upper triangular elements consist ofthe components of w The set of negative definite matrices is denoted by N Finallyβ(t) is defined by

(α (t) β (t)) = (C (t) micro2 (t))(

1 micro02

micro20 micro22

)minus1

The following two technical lemmas which we shall prove after completing theproof of Lemma 77 provide bounds for the last two terms of (711)

Lemma 712 Using the notation in (711) there exist δ0 ε1 c1 and b0 suchthat for any u gt b gt b0 and δ isin (0 δ0)

P

(sup

|t|leδab

∣∣Wuβgt (t)∣∣ gt

a

4b

)le c1 exp

(minusε1b

2

δ4

)

Lemma 713 There exist c δ and δ0 such that for any δ isin (0 δ0)

P

(max|t|leδab

|g(t)| gt a

4b

)le c exp

(minus δ

δ2

)

Proof of Lemma 77 Using the notation of Lemma 711 given any s isin L forwhich f(s) = b we have that the corresponding conditional distribution of s is thatof fb(middot minus s) Consequently it suffices to show that

P

(min

|t|leδabfb(t) lt bminus ab

)le clowast exp

(minusδlowast

δ2

)

We consider firstly the case for which the local maximum is in the interior of T Then by the Slepian model (711)

fb (t) = bC (t)minusWbβgt (t) + g (t)

We study the three terms of the Slepian model individually Since

C(t) = 1minus tgtΛt + o(|t|2

)

there exists a ε0 such thatbC (t) ge bminus a

4b

38 ADLER BLANCHET LIU

for all |t| lt ε0radic

ab According to Lemmas 712 and 713 for δ lt min(ε0radic

a0 δ0)

P

(min

|t|leδabfu(t) lt bminus ab

)

le P

(max|t|ltδab

|g (t)| gt a

4b

)+ P

(sup

|t|leδab

∣∣Wbβgt(t)

∣∣ gta

4b

)

le c exp

δ2

)+ c1 exp

(minusε1b

2

δ4

)

le clowast exp(minusδlowast

δ2

)

for some clowast and δlowastNow consider the case for which the local maximum is in the (dminus1)-dimensional

boundary of T Due to convexity of T and without loss of generality we assumethat the tangent space of partT is generated by partpartt2 partparttd the local maximumis located at the origin and T is a subset of the positive half-plane t1 ge 0 Thatthese arguments to not involve a loss of generality follows from the arguments onpages 291ndash192 of [4] which rely on the assumed stationarity of f (for translations)and the fact that rotations while changing the distributions will not affect theprobabilities that we are currently computing)

For the origin positioned as just described to be a local maximum it is necessaryand sufficient that the gradient of f restricted to partT is the zero vector the Hessianmatrix restricted to partT is negative definite and part1f(0) le 0 Applying a version ofLemma 711 for this case conditional on f(0) = u and 0 being a local maximumthe field is equal in distribution to

(713) uC(t)minus Wuβgt(t) + micro1(t)Λminus1 (Z 0 0)T + g(t)

where Z le 0 corresponds to part1f (0) and it follows a truncated (conditional on thenegative axis) Gaussian random variable with mean zero and a variance parameterwhich is computed as the conditional variance of part1f (0) given (part2f (0) partdf (0))The vector Wu is a (d(d + 1)2)-dimensional random vector with density function

ψu(w) prop |det(rlowast(w)minus uΛ)| exp(minus12wgtmicrominus1

2middot0w)1(wlowast minus uΛ isin N )

where Λ is the second spectral moment of f restricted to partT and rlowast(w) is the(dminus 1)times (dminus 1) symmetric matrix whose upper triangular elements consist of thecomponents of w In the representation (713) the vector Wu and Z are independentAs in the proof of Lemma 712 one can show that a similar bound holds albeitwith with different constants Thus since micro1 (t) = O (t) there exist cprimeprime and δprimeprime suchthat the third term in (713) can be bounded by

P [ max|t|leδab

∣∣∣micro1 (t) Λminus1 (Z 0 0)T∣∣∣ ge a(4b)] le cprimeprime exp

(minusδprimeprime

δ2

)

MONTE CARLO FOR RANDOM FIELDS 39

Consequently we can also find clowast and δlowast such that the conclusion holds and weare done 2

We complete the paper with the proofs of Lemmas 712 and 713

Proof of Lemma 712 It suffices to prove the Lemma for the case a = 1 Since

fu (0) = u = uminusWuβgt (0) + g (0)

and Wu and g are independent β (0) = 0 Furthermore since C prime (t) = O (t) andmicroprime2 (t) = O (t) there exists a c0 such that |β(t)| le c0|t|2 In addition Wu has densityfunction proportional to

ψu(w) prop |det(rlowast(w)minus uΛ)| exp(minus12wgtmicrominus1

2middot0w)1 (wlowast minus uΛ isin N )

Note that det(rlowast(w) minus uΛ) is expressible as a polynomial in w and u and thereexists some ε0 and c such that

∣∣∣∣det (rlowast(w)minus uΛ)

det (minusuΛ)

∣∣∣∣ le c

if |w| le ε0u Hence there exist ε2 c2 gt 0 such that

ψu (w) le ψ (w) = c2 exp(minus1

2ε2w

gtmicrominus12middot0w

)

for all u ge 1 The right hand side here is proportional to a multivariate Gaussiandensity Thus

P (|Wu| gt x) =int

|w|gtx

ψu(w)dw leint

|w|gtx

ψu (w) dw = c3P (|W | gt x)

where W is a multivariate Gaussian random variable with density function propor-tional to ψ Therefore by choosing ε1 and c1 appropriately we have

P

(sup|t|leδb

∣∣Wuβgt∣∣ gt

14b

)le P

(|Wu| gt b

c20δ

2

)le c1 exp

(minusε1b

2

δ4

)

for all u ge b 2

Proof of Lemma 713 Once again it suffices to prove the Lemma for the casea = 1 Since

fb(0) = b = bminusWbβgt(0) + g(0)

the covariance function (γ (s t) s t isin T ) of the centered field g satisfies γ(0 0) = 0It is also easy to check that

partsγ(s t) = O(|s|+ |t|) parttγ(s t) = O(|s|+ |t|)

40 ADLER BLANCHET LIU

Consequently there exists a constant cγ isin (0infin) for which

γ(s t) le cγ(|s|2 + |t|2) γ(s s) le cγ |s|2

We need to control the tail probability of sup|t|leδb |g(t)| For this it is useful tointroduce the following scaling Define

gδ (t) =b

δg

(δt

b

)

Then sup|t|leδb g (t) ge 14b if and only if sup|t|le1 gδ (t) ge 1

4δ Let

σδ (s t) = E (gδ (s) gδ (t))

ThensupsisinR

σδ(s s) le cγ

Because γ(s t) is at least twice differentiable applying a Taylor expansion we easilysee that the canonical metric dg corresponding to gδ (s) (cf Theorem 67) can bebounde as follows

d2g(s t) = E (gδ (s)minus gδ (t))2 =

b2

δ2

(δs

bδs

b

)+ γ

(δt

bδt

b

)minus 2γ

(δs

bδt

b

)]

le c |sminus t|2

for some constant c isin (0infin) Therefore the entropy of gδ evaluated at δ is boundedby Kδminusd for any δ gt 0 and with an appropriate choice of K gt 0 Therefore for allδ lt δ0

P ( sup|t|leδb

|g(t)| ge 14b

) = P ( sup|t|le1

gδ (t) ge 14δ

) le cdδminusdminusη exp(minus 1

16cγδ2)

for some constant cd and η gt 0 The last inequality is a direct application of The-orem 411 of [4] The conclusion of the lemma follows immediately by choosing c

and δ appropriately 2

REFERENCES

[1] RJ Adler K Bartz and S Kou Estimating curvatures in the Gaussian kinematic formulaIn preparation

[2] RJ Adler J Blanchet and J Liu Efficient simulation for tail probabilities of Gaussianrandom fields In WSC rsquo08 Proceedings of the 40th Conference on Winter Simulationpages 328ndash336 Winter Simulation Conference 2008

[3] RJ Adler P Muller and Rozovskii B editors Stochastic Modelling in Physical Oceanog-raphy Birkhauser Boston 1996

[4] RJ Adler and JE Taylor Random Fields and Geometry Springer New York USA 2007

[5] RJ Adler JE Taylor and K Worsley Applications of Random Fields and GeometryFoundations and Case Studies httpwebeetechnionacilpeopleadlerhrfpdf

MONTE CARLO FOR RANDOM FIELDS 41

[6] S Asmussen and P Glynn Stochastic Simulation Algorithms and Analysis Springer-VerlagNew York NY USA 2008

[7] JM Azais and M Wschebor Level Sets and Extrema of Random Processes and FieldsWiley New York NY 2009

[8] JM Bardeen JR Bond N Kaiser and AS Szalay The statistics of peaks of Gaussianrandom fields The Astrophysical Journal 30415ndash61 1986

[9] SM Berman Sojourns and Extremes of Stochastic Processes The Wadsworth ampBrooksCole StatisticsProbability Series Wadsworth amp BrooksCole Advanced Books ampSoftware Pacific Grove CA 1992

[10] J Blanchet Efficient importance sampling for binary contingency tables Ann of ApplProb 19949ndash982 2009

[11] C Borell The Brunn-Minkowski inequality in Gauss space Invent Math 30205ndash216 1975

[12] J Bucklew Introduction to Rare-Event Simulation Springer-Verlag New York NY 2004

[13] MR Dennis Nodal densities of planar Gaussian random waves Eur Phys J 145191ndash2102007

[14] RM Dudley Sample functions of the Gaussian process Ann Probab 166ndash103 1973

[15] KJ Friston KJ Worsley RSJ Frackowiak JC Mazziotta and AC Evans Assessing thesignificance of focal activations using their spatial extent Human Brain Mapping 1214ndash2201994

[16] MR Leadbetter G Lindgren and H Rootzen Extremes and Related Properties of RandomSequences and Processes Springer-Verlag New York 1983

[17] M Mitzenmacher and E Upfal Probability and Computing Randomized Algorithms andProbabilistic Analysis Cambridge University Press 2005

[18] W Niemiro and P Pokarowski Fixed precision MCMC estimation by median of productsof averages J Appl Probab 46309ndash329 2009

[19] V Piterbarg Asymptotic Methods in the Theory of Gaussian Processes American Mathe-matical Society USA 1995

[20] SF Shandarin Testing non-Gaussianity in cosmic microwave background maps by morpho-logical statistics Mon Not R Astr Soc 331865ndash874 2002

[21] SF Shandarin HA Feldman Y Xu and M Tegmark Morphological measures of non-gaussianity in cosmic microwave background maps Astrophys J Suppl 1411ndash11 2002

[22] L Shepp and H Landau On the supremum of a Gaussian process Sankhya A 32369ndash378December 1970

[23] JE Taylor A Takemura and RJ Adler Validity of the expected Euler characteristicheuristic Ann Probab 33(4)1362ndash1396 2005

[24] JE Taylor and KJ Worsley Detecting sparse signals in random fields with an applicationto brain mapping J Amer Statist Assoc 102(479)913ndash928 2007

[25] J Traub G Wasilokowski and H Wozniakowski Information-Based Complexity AcademicPress New York NY 1988

[26] BS Tsirelson IA Ibragimov and VN Sudakov Norms of Gaussian sample functionsProceedings of the Third Japan-USSR Symposium on Probability Theory (Tashkent 1975)55020ndash41 1976

[27] V S Tsirelrsquoson The density of the maximum of a gaussian process Theory of Probab Appl20847ndash856 1975

[28] H Wozniakowski Computational complexity of continuous problems 1996 Technical Re-port

Robert J AdlerElectrical EngineeringTechnion Haifa Israel 32000E-mail roberteetechnionacilURL webeetechnionacilpeopleadler

Jose H BlanchetIndustrial Engineering amp Operations ResearchColumbia Uninversity340 SW Mudd Building500 W 120 StreetNew York New York 10027E-mail joseblanchetcolumbiaeduURL wwwieorcolumbiaedufac-biosblanchetfacultyhtml

42 ADLER BLANCHET LIU

Jingchen LiuDepartment of StatisticsColumbia University1255 Amsterdam Avenue Room 1030New York NY 10027E-mail jcliustatcolumbiaeduURL statcolumbiaedu~jcliu

Page 33: EFFICIENT MONTE CARLO FOR HIGH EXCURSIONS OF … › ~jblanche › papers › GRF1.pdf · 1. Introduction. This paper centers on the design and analysis of e–cient Monte Carlo techniques

MONTE CARLO FOR RANDOM FIELDS 33

LetB(T b2δminus1) = cup

tisinTB(t b2δminus1)

We have

P (supT

f(t) gt b| supT

f(t) gt b)

le P ( sup|tminustlowast|gtb2δminus1

f(t) gt b| supT

f(t) gt b)

+ P (supT

f(t) gt b sup|tminustlowast|gtb2δminus1

f(t) le b| supT

f(t) gt b)

le o(1) + P (tlowast isin B(T b2δminus1)| supT

f(t) gt b)

le o(1) + P ( supB(T b2δminus1)

f(t) gt b| supT

f(t) gt b)

Since (T ) le b(1minus3δ)d we can find a finite set T prime = tprime1 tprimel sub T and let ∆k =tprimek + [0 bminus1] such that l = O(b(1minusδ)d) and B(T b2δminus1) sub cupl

k=1∆k The choice ofl only depends on (T ) not the particular distribution of T Therefore applying(75)

sup(Tlebθd)

P ( supB(T b2δminus1)

f(t) gt b) le O(b(1minusδ)d)P (f(0) gt b)

This together with (74) yields

sup(Tlebθd)

P ( supB(T b2δminus1)

f(t) gt b| supT

f(t) gt b) le O(bminusδd) = o(1)

for b ge b0 which clearly implies the statement of the result 2

72 Variance Control Proof of Theorem 75 We proceed directly to the proofof Theorem 75

34 ADLER BLANCHET LIU

Proof of Theorem 75 Note that

EQL2b

P 2 (suptisinT f (t) gt b)

=E(Lb)

P 2 (suptisinT f (t) gt b)

=E

(MtimesP (X1gtbminus1b)sumn

j=11(Xjgtbminus 1

b )maxj Xj gt b

)

P 2 (suptisinT f (t) gt b)

=E

(MtimesP (X1gtbminus1b)sumn

j=11(Xjgtbminus 1

b )maxj Xj gt b suptisinT f (t) gt b

)

P 2 (suptisinT f (t) gt b)

=E

(MP (X1gtbminus1b)1(maxj Xjgtb)sumn

j=11(Xjgtbminus1b)

∣∣∣∣ suptisinT f (t) gt b

)

P (suptisinT f (t) gt b)

= E

(M1 (maxj Xj gt b)sumnj=1 1 (xj gt bminus 1b)

∣∣∣∣∣ suptisinT

f (t) gt b

)P (X1 gt bminus 1b)

P (suptisinT f (t) gt b)

The remainder of the proof involves showing that the last conditional expectationhere is of order O(bd) This together with (74) will yield the result Note that forany A (b ε) such that A (b ε) = Θ(M) uniformly over b and ε we can write

E

(M times 1 (maxj Xj gt b)sumM

j=1 1(Xj gt bminus 1

b

)∣∣∣∣∣ sup

tisinTf (t) gt b

)(77)

le E

M times 1

(sumMj=1 1 (Xj gt bminus 1b) ge M

A(bε)

)

sumMj=1 1

(xj gt bminus 1

b

)∣∣∣∣∣∣suptisinT

f (t) gt b

+ E

M times 1

(1 le sumM

j=1 1 (Xj gt bminus 1b) lt MA(bε)

)

sumMj=1 1

(xj gt bminus 1

b

)∣∣∣∣∣∣suptisinT

f (t) gt b

We shall select A(b ε) appropriately in order to bound the expectations above ByLemma 78 for any 4ε le δ le δ0 there exist constants cprime and cprimeprime isin (0infin) such that

clowast exp(minusδlowast

δ2

)ge P

(min

|tminustlowast|leδbf (t) lt bminus 1b

∣∣∣∣ suptisinT

f (t) gt b

)

ge P

Msum

j=1

1 (Xj gt bminus 1b) le cprimeδdεd

∣∣∣∣∣∣suptisinT

f (t) gt b

ge P

(MsumM

j=1 1(Xj gt bminus 1b)ge bdcprimeprime

δd

∣∣∣∣∣ suptisinT

f (t) gt b

)(78)

MONTE CARLO FOR RANDOM FIELDS 35

The first inequality is an application of Lemma 78 The second inequality is dueto the fact that for any ball B of radius 4ε or larger (T cap B) ge cprimedεminusd for somecprime gt 0 Inequality (78) implies that for all x such that bdcprimeprimeδd

0 lt x lt bdcprimeprime[4dεd]there exists δlowastlowast gt 0 such that

P

(MsumM

j=1 1(Xj gt bminus 1

b

)bdge x

∣∣∣∣∣ suptisinT

f (t) gt b

)le clowast exp

(minusδlowastlowastx2d

)

Now let A (b ε) = bdcprimeprime(4dεd) and observe that by the second result in (71) wehave A (b ε) = Θ (M) and moreover that there exists c3 such that the first termon the right hand side of (77) is bounded by

(79) E

M times 1

(sumMj=1 1 (Xj gt bminus 1b) ge M

A(bε)

)

sumMj=1 1 (Xj gt bminus 1b)

∣∣∣∣∣∣supT

f (t) gt b

le c3b

d

Now we turn to the second term on the right hand side of (77) We use the factthat MA(b ε) le cprimeprimeprime isin (0infin) (uniformly as b rarrinfin and ε rarr 0) There exist c4 andc5 such that for ε le δ0c4

E

M1

(1 le sumM

j=1 1(Xj gt bminus 1

b

)lt M

A(bε)

)

sumMj=1 1

(Xj gt bminus 1

b

)∣∣∣∣∣∣supT

f (t) gt b

(710)

le MP

nsum

j=1

1(

Xj gt bminus 1b

)lt cprimeprimeprime

∣∣∣∣∣∣supT

f (t) gt b

le MP

(min

|tminustlowast|lec4εbf (t) lt bminus 1b

∣∣∣∣ supT

f (t) gt b

)

le c1m(T )bdεminusd exp(minus δlowast

c24ε

2

)le c5b

d

The second inequality holds from the fact that ifsumM

j=1 1(Xj gt bminus 1

b

)is less than

cprimeprimeprime then the minimum of f in a ball around the local maximum and of radius c4εb

must be less than bminus 1b Otherwise there are more than cprimeprimeprime elements of T insidesuch a ball The last inequality is due to Lemma 78 and Theorem 72

Putting (79) and (710) together we obtain for all εb -regular discretizationswith ε lt ε0 = min(14 1c4)δ0

EQL2b

P (suptisinT f (t) gt b)le E

(M1 (maxj Xj gt b)sumM

j=1 1(Xj gt bminus 1

b

)∣∣∣∣∣ sup

tisinTf (t) gt b

)P (X1 gt bminus 1b)

P (suptisinT f (t) gt b)

le (c3 + c5)bdP (X1 gt bminus 1b)P (suptisinT f (t) gt b)

Applying now (74) and Proposition 72 we have that

P (suptisinT

f(t) gt b) ltP (sup

tisinTf(t) gt b)

1minus c0radic

ε

36 ADLER BLANCHET LIU

we have

supbgtb0εisin[0ε0]

EQL2b

P(sup

tisinTf (t) gt b

) lt infin

as required 2

73 A final proof Lemma 77 Without loss of generality we assume that thefield has mean zero and unit variance Before getting into the details of the proofof Lemma 77 we need a few additional lemmas for which we adopt the followingnotation Let Ci and Cij be the first and second order derivatives of C and definethe vectors

micro1(t) = (minusC1(t) minusCd(t))micro2(t) = vech ((Cij(t) i = 1 d j = i d))

Let f prime(0) and f primeprime(0) be the gradient and vector of second order derivatives of f at 0where f primeprime(0) is arranged in the same order as micro2(0) Furthermore let micro02 = microgt20 bea vector of second order spectral moments and micro22 a matrix of fourth order spectralmoments The vectors micro02 and micro22 are arranged so that

1 0 micro02

0 Λ 0micro20 0 micro22

is the covariance matrix of (f(0) f prime(0) f primeprime(0)) where Λ = (minusCij(0)) It then followsthat

micro2middot0 = micro22 minus micro20micro02

be the conditional variance of f primeprime(0) given f(0) The following lemma given in [5]provides a stochastic representation of the f given that it has a local maxima atlevel u at the origin

Lemma 711 Given that f has a local maximum with height u at zero (aninterior point of T ) the conditional field is equal in distribution to

(711) fu(t) uC (t)minusWuβgt (t) + g (t)

g(t) is a centered Gaussian random field with covariance function

γ(s t) = C(sminus t)minus (C(s) micro2(s))(

1 micro02

micro20 micro22

)minus1 (C(t)microgt2 (t)

)minus micro1(s)Λminus1microgt1 (t)

and Wu is a d(d+1)2 random vector independent of g(t) with density function

(712) ψu (w) prop |det (rlowast(w)minus uΛ) | exp(minus1

2wgtmicrominus1

2middot0w)1 (rlowast(w)minus uΛ isin N )

MONTE CARLO FOR RANDOM FIELDS 37

where rlowast(w) is a dtimesd symmetric matrix whose upper triangular elements consist ofthe components of w The set of negative definite matrices is denoted by N Finallyβ(t) is defined by

(α (t) β (t)) = (C (t) micro2 (t))(

1 micro02

micro20 micro22

)minus1

The following two technical lemmas which we shall prove after completing theproof of Lemma 77 provide bounds for the last two terms of (711)

Lemma 712 Using the notation in (711) there exist δ0 ε1 c1 and b0 suchthat for any u gt b gt b0 and δ isin (0 δ0)

P

(sup

|t|leδab

∣∣Wuβgt (t)∣∣ gt

a

4b

)le c1 exp

(minusε1b

2

δ4

)

Lemma 713 There exist c δ and δ0 such that for any δ isin (0 δ0)

P

(max|t|leδab

|g(t)| gt a

4b

)le c exp

(minus δ

δ2

)

Proof of Lemma 77 Using the notation of Lemma 711 given any s isin L forwhich f(s) = b we have that the corresponding conditional distribution of s is thatof fb(middot minus s) Consequently it suffices to show that

P

(min

|t|leδabfb(t) lt bminus ab

)le clowast exp

(minusδlowast

δ2

)

We consider firstly the case for which the local maximum is in the interior of T Then by the Slepian model (711)

fb (t) = bC (t)minusWbβgt (t) + g (t)

We study the three terms of the Slepian model individually Since

C(t) = 1minus tgtΛt + o(|t|2

)

there exists a ε0 such thatbC (t) ge bminus a

4b

38 ADLER BLANCHET LIU

for all |t| lt ε0radic

ab According to Lemmas 712 and 713 for δ lt min(ε0radic

a0 δ0)

P

(min

|t|leδabfu(t) lt bminus ab

)

le P

(max|t|ltδab

|g (t)| gt a

4b

)+ P

(sup

|t|leδab

∣∣Wbβgt(t)

∣∣ gta

4b

)

le c exp

δ2

)+ c1 exp

(minusε1b

2

δ4

)

le clowast exp(minusδlowast

δ2

)

for some clowast and δlowastNow consider the case for which the local maximum is in the (dminus1)-dimensional

boundary of T Due to convexity of T and without loss of generality we assumethat the tangent space of partT is generated by partpartt2 partparttd the local maximumis located at the origin and T is a subset of the positive half-plane t1 ge 0 Thatthese arguments to not involve a loss of generality follows from the arguments onpages 291ndash192 of [4] which rely on the assumed stationarity of f (for translations)and the fact that rotations while changing the distributions will not affect theprobabilities that we are currently computing)

For the origin positioned as just described to be a local maximum it is necessaryand sufficient that the gradient of f restricted to partT is the zero vector the Hessianmatrix restricted to partT is negative definite and part1f(0) le 0 Applying a version ofLemma 711 for this case conditional on f(0) = u and 0 being a local maximumthe field is equal in distribution to

(713) uC(t)minus Wuβgt(t) + micro1(t)Λminus1 (Z 0 0)T + g(t)

where Z le 0 corresponds to part1f (0) and it follows a truncated (conditional on thenegative axis) Gaussian random variable with mean zero and a variance parameterwhich is computed as the conditional variance of part1f (0) given (part2f (0) partdf (0))The vector Wu is a (d(d + 1)2)-dimensional random vector with density function

ψu(w) prop |det(rlowast(w)minus uΛ)| exp(minus12wgtmicrominus1

2middot0w)1(wlowast minus uΛ isin N )

where Λ is the second spectral moment of f restricted to partT and rlowast(w) is the(dminus 1)times (dminus 1) symmetric matrix whose upper triangular elements consist of thecomponents of w In the representation (713) the vector Wu and Z are independentAs in the proof of Lemma 712 one can show that a similar bound holds albeitwith with different constants Thus since micro1 (t) = O (t) there exist cprimeprime and δprimeprime suchthat the third term in (713) can be bounded by

P [ max|t|leδab

∣∣∣micro1 (t) Λminus1 (Z 0 0)T∣∣∣ ge a(4b)] le cprimeprime exp

(minusδprimeprime

δ2

)

MONTE CARLO FOR RANDOM FIELDS 39

Consequently we can also find clowast and δlowast such that the conclusion holds and weare done 2

We complete the paper with the proofs of Lemmas 712 and 713

Proof of Lemma 712 It suffices to prove the Lemma for the case a = 1 Since

fu (0) = u = uminusWuβgt (0) + g (0)

and Wu and g are independent β (0) = 0 Furthermore since C prime (t) = O (t) andmicroprime2 (t) = O (t) there exists a c0 such that |β(t)| le c0|t|2 In addition Wu has densityfunction proportional to

ψu(w) prop |det(rlowast(w)minus uΛ)| exp(minus12wgtmicrominus1

2middot0w)1 (wlowast minus uΛ isin N )

Note that det(rlowast(w) minus uΛ) is expressible as a polynomial in w and u and thereexists some ε0 and c such that

∣∣∣∣det (rlowast(w)minus uΛ)

det (minusuΛ)

∣∣∣∣ le c

if |w| le ε0u Hence there exist ε2 c2 gt 0 such that

ψu (w) le ψ (w) = c2 exp(minus1

2ε2w

gtmicrominus12middot0w

)

for all u ge 1 The right hand side here is proportional to a multivariate Gaussiandensity Thus

P (|Wu| gt x) =int

|w|gtx

ψu(w)dw leint

|w|gtx

ψu (w) dw = c3P (|W | gt x)

where W is a multivariate Gaussian random variable with density function propor-tional to ψ Therefore by choosing ε1 and c1 appropriately we have

P

(sup|t|leδb

∣∣Wuβgt∣∣ gt

14b

)le P

(|Wu| gt b

c20δ

2

)le c1 exp

(minusε1b

2

δ4

)

for all u ge b 2

Proof of Lemma 713 Once again it suffices to prove the Lemma for the casea = 1 Since

fb(0) = b = bminusWbβgt(0) + g(0)

the covariance function (γ (s t) s t isin T ) of the centered field g satisfies γ(0 0) = 0It is also easy to check that

partsγ(s t) = O(|s|+ |t|) parttγ(s t) = O(|s|+ |t|)

40 ADLER BLANCHET LIU

Consequently there exists a constant cγ isin (0infin) for which

γ(s t) le cγ(|s|2 + |t|2) γ(s s) le cγ |s|2

We need to control the tail probability of sup|t|leδb |g(t)| For this it is useful tointroduce the following scaling Define

gδ (t) =b

δg

(δt

b

)

Then sup|t|leδb g (t) ge 14b if and only if sup|t|le1 gδ (t) ge 1

4δ Let

σδ (s t) = E (gδ (s) gδ (t))

ThensupsisinR

σδ(s s) le cγ

Because γ(s t) is at least twice differentiable applying a Taylor expansion we easilysee that the canonical metric dg corresponding to gδ (s) (cf Theorem 67) can bebounde as follows

d2g(s t) = E (gδ (s)minus gδ (t))2 =

b2

δ2

(δs

bδs

b

)+ γ

(δt

bδt

b

)minus 2γ

(δs

bδt

b

)]

le c |sminus t|2

for some constant c isin (0infin) Therefore the entropy of gδ evaluated at δ is boundedby Kδminusd for any δ gt 0 and with an appropriate choice of K gt 0 Therefore for allδ lt δ0

P ( sup|t|leδb

|g(t)| ge 14b

) = P ( sup|t|le1

gδ (t) ge 14δ

) le cdδminusdminusη exp(minus 1

16cγδ2)

for some constant cd and η gt 0 The last inequality is a direct application of The-orem 411 of [4] The conclusion of the lemma follows immediately by choosing c

and δ appropriately 2

REFERENCES

[1] RJ Adler K Bartz and S Kou Estimating curvatures in the Gaussian kinematic formulaIn preparation

[2] RJ Adler J Blanchet and J Liu Efficient simulation for tail probabilities of Gaussianrandom fields In WSC rsquo08 Proceedings of the 40th Conference on Winter Simulationpages 328ndash336 Winter Simulation Conference 2008

[3] RJ Adler P Muller and Rozovskii B editors Stochastic Modelling in Physical Oceanog-raphy Birkhauser Boston 1996

[4] RJ Adler and JE Taylor Random Fields and Geometry Springer New York USA 2007

[5] RJ Adler JE Taylor and K Worsley Applications of Random Fields and GeometryFoundations and Case Studies httpwebeetechnionacilpeopleadlerhrfpdf

MONTE CARLO FOR RANDOM FIELDS 41

[6] S Asmussen and P Glynn Stochastic Simulation Algorithms and Analysis Springer-VerlagNew York NY USA 2008

[7] JM Azais and M Wschebor Level Sets and Extrema of Random Processes and FieldsWiley New York NY 2009

[8] JM Bardeen JR Bond N Kaiser and AS Szalay The statistics of peaks of Gaussianrandom fields The Astrophysical Journal 30415ndash61 1986

[9] SM Berman Sojourns and Extremes of Stochastic Processes The Wadsworth ampBrooksCole StatisticsProbability Series Wadsworth amp BrooksCole Advanced Books ampSoftware Pacific Grove CA 1992

[10] J Blanchet Efficient importance sampling for binary contingency tables Ann of ApplProb 19949ndash982 2009

[11] C Borell The Brunn-Minkowski inequality in Gauss space Invent Math 30205ndash216 1975

[12] J Bucklew Introduction to Rare-Event Simulation Springer-Verlag New York NY 2004

[13] MR Dennis Nodal densities of planar Gaussian random waves Eur Phys J 145191ndash2102007

[14] RM Dudley Sample functions of the Gaussian process Ann Probab 166ndash103 1973

[15] KJ Friston KJ Worsley RSJ Frackowiak JC Mazziotta and AC Evans Assessing thesignificance of focal activations using their spatial extent Human Brain Mapping 1214ndash2201994

[16] MR Leadbetter G Lindgren and H Rootzen Extremes and Related Properties of RandomSequences and Processes Springer-Verlag New York 1983

[17] M Mitzenmacher and E Upfal Probability and Computing Randomized Algorithms andProbabilistic Analysis Cambridge University Press 2005

[18] W Niemiro and P Pokarowski Fixed precision MCMC estimation by median of productsof averages J Appl Probab 46309ndash329 2009

[19] V Piterbarg Asymptotic Methods in the Theory of Gaussian Processes American Mathe-matical Society USA 1995

[20] SF Shandarin Testing non-Gaussianity in cosmic microwave background maps by morpho-logical statistics Mon Not R Astr Soc 331865ndash874 2002

[21] SF Shandarin HA Feldman Y Xu and M Tegmark Morphological measures of non-gaussianity in cosmic microwave background maps Astrophys J Suppl 1411ndash11 2002

[22] L Shepp and H Landau On the supremum of a Gaussian process Sankhya A 32369ndash378December 1970

[23] JE Taylor A Takemura and RJ Adler Validity of the expected Euler characteristicheuristic Ann Probab 33(4)1362ndash1396 2005

[24] JE Taylor and KJ Worsley Detecting sparse signals in random fields with an applicationto brain mapping J Amer Statist Assoc 102(479)913ndash928 2007

[25] J Traub G Wasilokowski and H Wozniakowski Information-Based Complexity AcademicPress New York NY 1988

[26] BS Tsirelson IA Ibragimov and VN Sudakov Norms of Gaussian sample functionsProceedings of the Third Japan-USSR Symposium on Probability Theory (Tashkent 1975)55020ndash41 1976

[27] V S Tsirelrsquoson The density of the maximum of a gaussian process Theory of Probab Appl20847ndash856 1975

[28] H Wozniakowski Computational complexity of continuous problems 1996 Technical Re-port

Robert J AdlerElectrical EngineeringTechnion Haifa Israel 32000E-mail roberteetechnionacilURL webeetechnionacilpeopleadler

Jose H BlanchetIndustrial Engineering amp Operations ResearchColumbia Uninversity340 SW Mudd Building500 W 120 StreetNew York New York 10027E-mail joseblanchetcolumbiaeduURL wwwieorcolumbiaedufac-biosblanchetfacultyhtml

42 ADLER BLANCHET LIU

Jingchen LiuDepartment of StatisticsColumbia University1255 Amsterdam Avenue Room 1030New York NY 10027E-mail jcliustatcolumbiaeduURL statcolumbiaedu~jcliu

Page 34: EFFICIENT MONTE CARLO FOR HIGH EXCURSIONS OF … › ~jblanche › papers › GRF1.pdf · 1. Introduction. This paper centers on the design and analysis of e–cient Monte Carlo techniques

34 ADLER BLANCHET LIU

Proof of Theorem 75 Note that

EQL2b

P 2 (suptisinT f (t) gt b)

=E(Lb)

P 2 (suptisinT f (t) gt b)

=E

(MtimesP (X1gtbminus1b)sumn

j=11(Xjgtbminus 1

b )maxj Xj gt b

)

P 2 (suptisinT f (t) gt b)

=E

(MtimesP (X1gtbminus1b)sumn

j=11(Xjgtbminus 1

b )maxj Xj gt b suptisinT f (t) gt b

)

P 2 (suptisinT f (t) gt b)

=E

(MP (X1gtbminus1b)1(maxj Xjgtb)sumn

j=11(Xjgtbminus1b)

∣∣∣∣ suptisinT f (t) gt b

)

P (suptisinT f (t) gt b)

= E

(M1 (maxj Xj gt b)sumnj=1 1 (xj gt bminus 1b)

∣∣∣∣∣ suptisinT

f (t) gt b

)P (X1 gt bminus 1b)

P (suptisinT f (t) gt b)

The remainder of the proof involves showing that the last conditional expectationhere is of order O(bd) This together with (74) will yield the result Note that forany A (b ε) such that A (b ε) = Θ(M) uniformly over b and ε we can write

E

(M times 1 (maxj Xj gt b)sumM

j=1 1(Xj gt bminus 1

b

)∣∣∣∣∣ sup

tisinTf (t) gt b

)(77)

le E

M times 1

(sumMj=1 1 (Xj gt bminus 1b) ge M

A(bε)

)

sumMj=1 1

(xj gt bminus 1

b

)∣∣∣∣∣∣suptisinT

f (t) gt b

+ E

M times 1

(1 le sumM

j=1 1 (Xj gt bminus 1b) lt MA(bε)

)

sumMj=1 1

(xj gt bminus 1

b

)∣∣∣∣∣∣suptisinT

f (t) gt b

We shall select A(b ε) appropriately in order to bound the expectations above ByLemma 78 for any 4ε le δ le δ0 there exist constants cprime and cprimeprime isin (0infin) such that

clowast exp(minusδlowast

δ2

)ge P

(min

|tminustlowast|leδbf (t) lt bminus 1b

∣∣∣∣ suptisinT

f (t) gt b

)

ge P

Msum

j=1

1 (Xj gt bminus 1b) le cprimeδdεd

∣∣∣∣∣∣suptisinT

f (t) gt b

ge P

(MsumM

j=1 1(Xj gt bminus 1b)ge bdcprimeprime

δd

∣∣∣∣∣ suptisinT

f (t) gt b

)(78)

MONTE CARLO FOR RANDOM FIELDS 35

The first inequality is an application of Lemma 78 The second inequality is dueto the fact that for any ball B of radius 4ε or larger (T cap B) ge cprimedεminusd for somecprime gt 0 Inequality (78) implies that for all x such that bdcprimeprimeδd

0 lt x lt bdcprimeprime[4dεd]there exists δlowastlowast gt 0 such that

P

(MsumM

j=1 1(Xj gt bminus 1

b

)bdge x

∣∣∣∣∣ suptisinT

f (t) gt b

)le clowast exp

(minusδlowastlowastx2d

)

Now let A (b ε) = bdcprimeprime(4dεd) and observe that by the second result in (71) wehave A (b ε) = Θ (M) and moreover that there exists c3 such that the first termon the right hand side of (77) is bounded by

(79) E

M times 1

(sumMj=1 1 (Xj gt bminus 1b) ge M

A(bε)

)

sumMj=1 1 (Xj gt bminus 1b)

∣∣∣∣∣∣supT

f (t) gt b

le c3b

d

Now we turn to the second term on the right hand side of (77) We use the factthat MA(b ε) le cprimeprimeprime isin (0infin) (uniformly as b rarrinfin and ε rarr 0) There exist c4 andc5 such that for ε le δ0c4

E

M1

(1 le sumM

j=1 1(Xj gt bminus 1

b

)lt M

A(bε)

)

sumMj=1 1

(Xj gt bminus 1

b

)∣∣∣∣∣∣supT

f (t) gt b

(710)

le MP

nsum

j=1

1(

Xj gt bminus 1b

)lt cprimeprimeprime

∣∣∣∣∣∣supT

f (t) gt b

le MP

(min

|tminustlowast|lec4εbf (t) lt bminus 1b

∣∣∣∣ supT

f (t) gt b

)

le c1m(T )bdεminusd exp(minus δlowast

c24ε

2

)le c5b

d

The second inequality holds from the fact that ifsumM

j=1 1(Xj gt bminus 1

b

)is less than

cprimeprimeprime then the minimum of f in a ball around the local maximum and of radius c4εb

must be less than bminus 1b Otherwise there are more than cprimeprimeprime elements of T insidesuch a ball The last inequality is due to Lemma 78 and Theorem 72

Putting (79) and (710) together we obtain for all εb -regular discretizationswith ε lt ε0 = min(14 1c4)δ0

EQL2b

P (suptisinT f (t) gt b)le E

(M1 (maxj Xj gt b)sumM

j=1 1(Xj gt bminus 1

b

)∣∣∣∣∣ sup

tisinTf (t) gt b

)P (X1 gt bminus 1b)

P (suptisinT f (t) gt b)

le (c3 + c5)bdP (X1 gt bminus 1b)P (suptisinT f (t) gt b)

Applying now (74) and Proposition 72 we have that

P (suptisinT

f(t) gt b) ltP (sup

tisinTf(t) gt b)

1minus c0radic

ε

36 ADLER BLANCHET LIU

we have

supbgtb0εisin[0ε0]

EQL2b

P(sup

tisinTf (t) gt b

) lt infin

as required 2

73 A final proof Lemma 77 Without loss of generality we assume that thefield has mean zero and unit variance Before getting into the details of the proofof Lemma 77 we need a few additional lemmas for which we adopt the followingnotation Let Ci and Cij be the first and second order derivatives of C and definethe vectors

micro1(t) = (minusC1(t) minusCd(t))micro2(t) = vech ((Cij(t) i = 1 d j = i d))

Let f prime(0) and f primeprime(0) be the gradient and vector of second order derivatives of f at 0where f primeprime(0) is arranged in the same order as micro2(0) Furthermore let micro02 = microgt20 bea vector of second order spectral moments and micro22 a matrix of fourth order spectralmoments The vectors micro02 and micro22 are arranged so that

1 0 micro02

0 Λ 0micro20 0 micro22

is the covariance matrix of (f(0) f prime(0) f primeprime(0)) where Λ = (minusCij(0)) It then followsthat

micro2middot0 = micro22 minus micro20micro02

be the conditional variance of f primeprime(0) given f(0) The following lemma given in [5]provides a stochastic representation of the f given that it has a local maxima atlevel u at the origin

Lemma 711 Given that f has a local maximum with height u at zero (aninterior point of T ) the conditional field is equal in distribution to

(711) fu(t) uC (t)minusWuβgt (t) + g (t)

g(t) is a centered Gaussian random field with covariance function

γ(s t) = C(sminus t)minus (C(s) micro2(s))(

1 micro02

micro20 micro22

)minus1 (C(t)microgt2 (t)

)minus micro1(s)Λminus1microgt1 (t)

and Wu is a d(d+1)2 random vector independent of g(t) with density function

(712) ψu (w) prop |det (rlowast(w)minus uΛ) | exp(minus1

2wgtmicrominus1

2middot0w)1 (rlowast(w)minus uΛ isin N )

where $r^*(w)$ is a $d\times d$ symmetric matrix whose upper triangular elements consist of the components of $w$, and $\mathcal N$ denotes the set of negative definite matrices. Finally, $\beta(t)$ is defined by

\[
\big(\alpha(t),\ \beta(t)\big) = \big(C(t),\ \mu_2(t)\big)
\begin{pmatrix} 1 & \mu_{02}\\ \mu_{20} & \mu_{22}\end{pmatrix}^{-1}.
\]
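Continuing the $d = 1$, $C(t) = e^{-t^2/2}$ illustration introduced above, this defining identity reads

\[
\big(\alpha(t), \beta(t)\big) = \big(C(t), C''(t)\big)\begin{pmatrix}1 & -1\\ -1 & 3\end{pmatrix}^{-1}
= \Big(\tfrac{3C(t)+C''(t)}{2},\ \tfrac{C(t)+C''(t)}{2}\Big),
\]

so that $\beta(t) = \tfrac12\big(1 + (t^2-1)\big)e^{-t^2/2} = \tfrac12 t^2 + O(t^4)$. In particular $\beta(0) = 0$ and $|\beta(t)| \le c_0|t|^2$ near the origin, which is exactly the behaviour used in the proof of Lemma 7.12 below.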

The following two technical lemmas, which we shall prove after completing the proof of Lemma 7.7, provide bounds for the last two terms of (7.11).

Lemma 7.12. Using the notation in (7.11), there exist $\delta_0$, $\varepsilon_1$, $c_1$ and $b_0$ such that, for any $u > b > b_0$ and $\delta\in(0,\delta_0)$,

\[
P\Big(\sup_{|t|\le\delta\sqrt a/b}\big|W_u\beta^\top(t)\big| > \frac a{4b}\Big)
\le c_1\exp\Big(-\frac{\varepsilon_1 b^2}{\delta^4}\Big).
\]

Lemma 7.13. There exist $c$, $\bar\delta$ and $\delta_0$ such that, for any $\delta\in(0,\delta_0)$,

\[
P\Big(\max_{|t|\le\delta\sqrt a/b}|g(t)| > \frac a{4b}\Big)
\le c\exp\Big(-\frac{\bar\delta}{\delta^2}\Big).
\]

Proof of Lemma 7.7. Using the notation of Lemma 7.11, given any $s\in L$ for which $f(s) = b$, the corresponding conditional distribution of $f$ is that of $f_b(\cdot - s)$. Consequently, it suffices to show that

\[
P\Big(\min_{|t|\le\delta\sqrt a/b} f_b(t) < b - \frac ab\Big) \le c^*\exp\Big(-\frac{\delta^*}{\delta^2}\Big).
\]

We consider first the case in which the local maximum lies in the interior of $T$. Then, by the Slepian model (7.11),

\[
f_b(t) = bC(t) - W_b\beta^\top(t) + g(t).
\]

We study the three terms of the Slepian model individually. Since

\[
C(t) = 1 - t^\top\Lambda t + o\big(|t|^2\big),
\]

there exists an $\varepsilon_0$ such that

\[
bC(t) \ge b - \frac a{4b}
\]

for all $|t| < \varepsilon_0\sqrt a/b$.
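To make the choice of $\varepsilon_0$ explicit (a step we spell out here), write $\lambda_{\max}$ for the largest eigenvalue of $\Lambda$. For $|t| \le \varepsilon_0\sqrt a/b$,

\[
b\big(1 - C(t)\big) = b\,t^\top\Lambda t\,(1+o(1)) \le \lambda_{\max}\,\varepsilon_0^2\,\frac ab\,(1+o(1)),
\]

which is at most $a/(4b)$ once, say, $\varepsilon_0^2 \le 1/(8\lambda_{\max})$ and $b$ is large.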

According to Lemmas 7.12 and 7.13, for $\delta < \min\big(\varepsilon_0\sqrt{a_0},\ \delta_0\big)$,

\[
P\Big(\min_{|t|\le\delta\sqrt a/b} f_b(t) < b - \frac ab\Big)
\le P\Big(\max_{|t|<\delta\sqrt a/b}|g(t)| > \frac a{4b}\Big)
+ P\Big(\sup_{|t|\le\delta\sqrt a/b}\big|W_b\beta^\top(t)\big| > \frac a{4b}\Big)
\]
\[
\le c\exp\Big(-\frac{\bar\delta}{\delta^2}\Big) + c_1\exp\Big(-\frac{\varepsilon_1 b^2}{\delta^4}\Big)
\le c^*\exp\Big(-\frac{\delta^*}{\delta^2}\Big)
\]

for some $c^*$ and $\delta^*$.

Now consider the case in which the local maximum lies on the $(d-1)$-dimensional boundary of $T$. By the convexity of $T$, and without loss of generality, we may assume that the tangent space of $\partial T$ is generated by $\partial/\partial t_2,\dots,\partial/\partial t_d$, that the local maximum is located at the origin, and that $T$ is a subset of the half-space $\{t_1\ge0\}$. That these assumptions involve no loss of generality follows from the arguments on pages 291-292 of [4], which rely on the assumed stationarity of $f$ (for translations) and on the fact that rotations, while changing the distributions, do not affect the probabilities that we are currently computing.

For the origin, positioned as just described, to be a local maximum, it is necessary and sufficient that the gradient of $f$ restricted to $\partial T$ is the zero vector, that the Hessian matrix restricted to $\partial T$ is negative definite, and that $\partial_1 f(0)\le0$. Applying a version of Lemma 7.11 for this case, conditional on $f(0) = u$ and $0$ being a local maximum, the field is equal in distribution to

\[
uC(t) - \widetilde W_u\widetilde\beta^\top(t) + \mu_1(t)\,\Lambda^{-1}(Z,0,\dots,0)^\top + g(t), \tag{7.13}
\]

where $Z\le0$ corresponds to $\partial_1 f(0)$: it follows a truncated (conditioned on the negative axis) Gaussian distribution with mean zero and a variance parameter given by the conditional variance of $\partial_1 f(0)$ given $(\partial_2 f(0),\dots,\partial_d f(0))$. The vector $\widetilde W_u$ is a $((d-1)d/2)$-dimensional random vector with density function

\[
\widetilde\psi_u(w) \propto \big|\det\big(r^*(w) - u\widetilde\Lambda\big)\big|
\exp\Big({-\tfrac12 w^\top\mu_{2\cdot0}^{-1}w}\Big)\,
\mathbf 1\big(r^*(w) - u\widetilde\Lambda\in\mathcal N\big),
\]

where $\widetilde\Lambda$ is the matrix of second spectral moments of $f$ restricted to $\partial T$ and $r^*(w)$ is the $(d-1)\times(d-1)$ symmetric matrix whose upper triangular elements consist of the components of $w$. In the representation (7.13), the vector $\widetilde W_u$ and $Z$ are independent. As in the proof of Lemma 7.12, one can show that a similar bound holds for the second term, albeit with different constants. Thus, since $\mu_1(t) = O(|t|)$, there exist $c''$ and $\delta''$ such that the third term in (7.13) can be bounded by

\[
P\Big[\max_{|t|\le\delta\sqrt a/b}\Big|\mu_1(t)\,\Lambda^{-1}(Z,0,\dots,0)^\top\Big| \ge \frac a{4b}\Big]
\le c''\exp\Big(-\frac{\delta''}{\delta^2}\Big).
\]
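The step behind this bound, spelled out: since $\mu_1(t) = O(|t|)$, there is a $K$ with

\[
\big|\mu_1(t)\,\Lambda^{-1}(Z,0,\dots,0)^\top\big| \le K|t||Z| \le K\delta\sqrt a\,|Z|/b
\]

on the region in question, so the event above forces $|Z| \ge \sqrt a/(4K\delta)$; the Gaussian tail of the truncated variable $Z$ then yields a bound of the form $c''\exp(-\delta'' a/\delta^2)$, which is of the stated form after adjusting the constants, uniformly over the range of $a$ allowed in Lemma 7.7.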

Consequently, we can also find $c^*$ and $\delta^*$ such that the conclusion of the lemma holds, and we are done. □

We complete the paper with the proofs of Lemmas 7.12 and 7.13.

Proof of Lemma 7.12. It suffices to prove the lemma for the case $a = 1$. Since

\[
f_u(0) = u = u - W_u\beta^\top(0) + g(0),
\]

and $W_u$ and $g$ are independent, $\beta(0) = 0$. Furthermore, since $C'(t) = O(|t|)$ and $\mu_2'(t) = O(|t|)$, there exists a $c_0$ such that $|\beta(t)|\le c_0|t|^2$. In addition, $W_u$ has density function

\[
\psi_u(w) \propto \big|\det\big(r^*(w) - u\Lambda\big)\big|
\exp\Big({-\tfrac12 w^\top\mu_{2\cdot0}^{-1}w}\Big)\,
\mathbf 1\big(r^*(w) - u\Lambda\in\mathcal N\big).
\]

Note that $\det(r^*(w) - u\Lambda)$ is expressible as a polynomial in $w$ and $u$, and there exist some $\varepsilon_0$ and $c$ such that

\[
\left|\frac{\det\big(r^*(w) - u\Lambda\big)}{\det(-u\Lambda)}\right| \le c
\]

if $|w|\le\varepsilon_0 u$. Hence there exist $\varepsilon_2, c_2 > 0$ such that

\[
\psi_u(w) \le \bar\psi(w) = c_2\exp\Big({-\tfrac12\varepsilon_2\, w^\top\mu_{2\cdot0}^{-1}w}\Big)
\]
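Two small observations justify this domination (we make them explicit here). First, for $|w|\le\varepsilon_0 u$ the determinant ratio above is bounded, and the normalizing constant of $\psi_u$ is of order $u^d$, since near $w = 0$ the determinant is $\approx u^d\det\Lambda$ while the Gaussian factor has mass bounded below. Second, for $|w| > \varepsilon_0 u$ one has $u < |w|/\varepsilon_0$, so $|\det(r^*(w) - u\Lambda)|$ is bounded by a polynomial in $|w|$ alone, and any polynomial times $\exp(-\frac12 w^\top\mu_{2\cdot0}^{-1}w)$ is dominated by $c_2\exp(-\frac12\varepsilon_2 w^\top\mu_{2\cdot0}^{-1}w)$ for suitable $c_2$ and $\varepsilon_2\in(0,1)$.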

for all $u\ge1$. The right-hand side here is proportional to a multivariate Gaussian density. Thus

\[
P(|W_u| > x) = \int_{|w|>x}\psi_u(w)\,dw \le \int_{|w|>x}\bar\psi(w)\,dw = c_3\,P(|\bar W| > x),
\]

where $\bar W$ is a multivariate Gaussian random vector with density proportional to $\bar\psi$. Therefore, by choosing $\varepsilon_1$ and $c_1$ appropriately, we have

\[
P\Big(\sup_{|t|\le\delta/b}\big|W_u\beta^\top(t)\big| > \frac1{4b}\Big)
\le P\Big(|W_u| > \frac b{4c_0\delta^2}\Big)
\le c_1\exp\Big(-\frac{\varepsilon_1 b^2}{\delta^4}\Big)
\]

for all $u\ge b$. □

Proof of Lemma 7.13. Once again, it suffices to prove the lemma for the case $a = 1$. Since

\[
f_b(0) = b = b - W_b\beta^\top(0) + g(0),
\]

the covariance function $(\gamma(s,t),\ s,t\in T)$ of the centered field $g$ satisfies $\gamma(0,0) = 0$. It is also easy to check that

\[
\partial_s\gamma(s,t) = O(|s|+|t|), \qquad \partial_t\gamma(s,t) = O(|s|+|t|).
\]

Consequently, there exists a constant $c_\gamma\in(0,\infty)$ for which

\[
\gamma(s,t) \le c_\gamma\big(|s|^2+|t|^2\big), \qquad \gamma(s,s) \le c_\gamma|s|^2.
\]

We need to control the tail probability of $\sup_{|t|\le\delta/b}|g(t)|$. For this it is useful to introduce the following scaling. Define

\[
g_\delta(t) = \frac b\delta\, g\Big(\frac{\delta t}b\Big).
\]

Then $\sup_{|t|\le\delta/b} g(t) \ge 1/(4b)$ if and only if $\sup_{|t|\le1} g_\delta(t) \ge 1/(4\delta)$. Let

\[
\sigma_\delta(s,t) = E\big(g_\delta(s)\,g_\delta(t)\big).
\]

Then

\[
\sup_{|s|\le1}\sigma_\delta(s,s) \le c_\gamma.
\]
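Indeed (a verification we add here): $\sigma_\delta(s,s) = (b/\delta)^2\,\gamma(\delta s/b, \delta s/b) \le (b/\delta)^2\, c_\gamma\,(\delta|s|/b)^2 = c_\gamma|s|^2 \le c_\gamma$ for $|s|\le1$, and the equivalence of the two suprema is just the change of variables $t = \delta s/b$, under which $g(t) = (\delta/b)\,g_\delta(bt/\delta)$.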

Because $\gamma(s,t)$ is at least twice differentiable, applying a Taylor expansion we easily see that the canonical metric $d_g$ corresponding to $g_\delta$ (cf. Theorem 6.7) can be bounded as follows:

\[
d_g^2(s,t) = E\big(g_\delta(s) - g_\delta(t)\big)^2
= \frac{b^2}{\delta^2}\left[\gamma\Big(\frac{\delta s}b, \frac{\delta s}b\Big)
+ \gamma\Big(\frac{\delta t}b, \frac{\delta t}b\Big)
- 2\gamma\Big(\frac{\delta s}b, \frac{\delta t}b\Big)\right]
\le c\,|s-t|^2
\]

for some constant $c\in(0,\infty)$. Therefore the entropy function of $g_\delta$ satisfies $N(r)\le K r^{-d}$ for all $r>0$, with an appropriate choice of $K>0$.
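The entropy bound follows by a covering count (made explicit here): since $d_g(s,t)\le\sqrt c\,|s-t|$, any Euclidean ball of radius $r/\sqrt c$ is contained in a $d_g$-ball of radius $r$, so the unit ball $\{|t|\le1\}$ can be covered by $O\big((\sqrt c/r)^d\big)$ such balls; that is, $N(r) \le Kr^{-d}$.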

Therefore, for all $\delta < \delta_0$,

\[
P\Big(\sup_{|t|\le\delta/b}|g(t)| \ge \frac1{4b}\Big)
= P\Big(\sup_{|t|\le1}|g_\delta(t)| \ge \frac1{4\delta}\Big)
\le c_d\,\delta^{-d-\eta}\exp\Big(-\frac1{16\,c_\gamma\,\delta^2}\Big)
\]

for some constants $c_d$ and $\eta > 0$. The last inequality is a direct application of Theorem 4.1.1 of [4]. The conclusion of the lemma follows immediately by choosing $c$ and $\bar\delta$ appropriately. □

REFERENCES

[1] R.J. Adler, K. Bartz, and S. Kou. Estimating curvatures in the Gaussian kinematic formula. In preparation.

[2] R.J. Adler, J. Blanchet, and J. Liu. Efficient simulation for tail probabilities of Gaussian random fields. In WSC '08: Proceedings of the 40th Conference on Winter Simulation, pages 328-336. Winter Simulation Conference, 2008.

[3] R.J. Adler, P. Müller, and B. Rozovskii, editors. Stochastic Modelling in Physical Oceanography. Birkhäuser, Boston, 1996.

[4] R.J. Adler and J.E. Taylor. Random Fields and Geometry. Springer, New York, 2007.

[5] R.J. Adler, J.E. Taylor, and K. Worsley. Applications of Random Fields and Geometry: Foundations and Case Studies. http://webee.technion.ac.il/people/adler/hrf.pdf.

[6] S. Asmussen and P. Glynn. Stochastic Simulation: Algorithms and Analysis. Springer-Verlag, New York, 2008.

[7] J.M. Azaïs and M. Wschebor. Level Sets and Extrema of Random Processes and Fields. Wiley, New York, 2009.

[8] J.M. Bardeen, J.R. Bond, N. Kaiser, and A.S. Szalay. The statistics of peaks of Gaussian random fields. The Astrophysical Journal, 304:15-61, 1986.

[9] S.M. Berman. Sojourns and Extremes of Stochastic Processes. The Wadsworth & Brooks/Cole Statistics/Probability Series. Wadsworth & Brooks/Cole, Pacific Grove, CA, 1992.

[10] J. Blanchet. Efficient importance sampling for binary contingency tables. Ann. Appl. Probab., 19:949-982, 2009.

[11] C. Borell. The Brunn-Minkowski inequality in Gauss space. Invent. Math., 30:205-216, 1975.

[12] J. Bucklew. Introduction to Rare-Event Simulation. Springer-Verlag, New York, 2004.

[13] M.R. Dennis. Nodal densities of planar Gaussian random waves. Eur. Phys. J., 145:191-210, 2007.

[14] R.M. Dudley. Sample functions of the Gaussian process. Ann. Probab., 1:66-103, 1973.

[15] K.J. Friston, K.J. Worsley, R.S.J. Frackowiak, J.C. Mazziotta, and A.C. Evans. Assessing the significance of focal activations using their spatial extent. Human Brain Mapping, 1:214-220, 1994.

[16] M.R. Leadbetter, G. Lindgren, and H. Rootzén. Extremes and Related Properties of Random Sequences and Processes. Springer-Verlag, New York, 1983.

[17] M. Mitzenmacher and E. Upfal. Probability and Computing: Randomized Algorithms and Probabilistic Analysis. Cambridge University Press, 2005.

[18] W. Niemiro and P. Pokarowski. Fixed precision MCMC estimation by median of products of averages. J. Appl. Probab., 46:309-329, 2009.

[19] V. Piterbarg. Asymptotic Methods in the Theory of Gaussian Processes. American Mathematical Society, 1995.

[20] S.F. Shandarin. Testing non-Gaussianity in cosmic microwave background maps by morphological statistics. Mon. Not. R. Astr. Soc., 331:865-874, 2002.

[21] S.F. Shandarin, H.A. Feldman, Y. Xu, and M. Tegmark. Morphological measures of non-Gaussianity in cosmic microwave background maps. Astrophys. J. Suppl., 141:1-11, 2002.

[22] L. Shepp and H. Landau. On the supremum of a Gaussian process. Sankhyā Ser. A, 32:369-378, December 1970.

[23] J.E. Taylor, A. Takemura, and R.J. Adler. Validity of the expected Euler characteristic heuristic. Ann. Probab., 33(4):1362-1396, 2005.

[24] J.E. Taylor and K.J. Worsley. Detecting sparse signals in random fields, with an application to brain mapping. J. Amer. Statist. Assoc., 102(479):913-928, 2007.

[25] J. Traub, G. Wasilkowski, and H. Wozniakowski. Information-Based Complexity. Academic Press, New York, 1988.

[26] B.S. Tsirelson, I.A. Ibragimov, and V.N. Sudakov. Norms of Gaussian sample functions. In Proceedings of the Third Japan-USSR Symposium on Probability Theory (Tashkent, 1975), Lecture Notes in Math., 550:20-41, 1976.

[27] V.S. Tsirel'son. The density of the maximum of a Gaussian process. Theory Probab. Appl., 20:847-856, 1975.

[28] H. Wozniakowski. Computational complexity of continuous problems. Technical report, 1996.

Robert J. Adler
Electrical Engineering
Technion, Haifa, Israel 32000
E-mail: robert@ee.technion.ac.il
URL: webee.technion.ac.il/people/adler

Jose H. Blanchet
Industrial Engineering & Operations Research
Columbia University
340 S.W. Mudd Building, 500 W. 120 Street
New York, New York 10027
E-mail: jose.blanchet@columbia.edu
URL: www.ieor.columbia.edu/fac-bios/blanchet/faculty.html

Jingchen Liu
Department of Statistics
Columbia University
1255 Amsterdam Avenue, Room 1030
New York, NY 10027
E-mail: jcliu@stat.columbia.edu
URL: stat.columbia.edu/~jcliu

Page 35: EFFICIENT MONTE CARLO FOR HIGH EXCURSIONS OF … › ~jblanche › papers › GRF1.pdf · 1. Introduction. This paper centers on the design and analysis of e–cient Monte Carlo techniques

MONTE CARLO FOR RANDOM FIELDS 35

The first inequality is an application of Lemma 78 The second inequality is dueto the fact that for any ball B of radius 4ε or larger (T cap B) ge cprimedεminusd for somecprime gt 0 Inequality (78) implies that for all x such that bdcprimeprimeδd

0 lt x lt bdcprimeprime[4dεd]there exists δlowastlowast gt 0 such that

P

(MsumM

j=1 1(Xj gt bminus 1

b

)bdge x

∣∣∣∣∣ suptisinT

f (t) gt b

)le clowast exp

(minusδlowastlowastx2d

)

Now let A (b ε) = bdcprimeprime(4dεd) and observe that by the second result in (71) wehave A (b ε) = Θ (M) and moreover that there exists c3 such that the first termon the right hand side of (77) is bounded by

(79) E

M times 1

(sumMj=1 1 (Xj gt bminus 1b) ge M

A(bε)

)

sumMj=1 1 (Xj gt bminus 1b)

∣∣∣∣∣∣supT

f (t) gt b

le c3b

d

Now we turn to the second term on the right hand side of (77) We use the factthat MA(b ε) le cprimeprimeprime isin (0infin) (uniformly as b rarrinfin and ε rarr 0) There exist c4 andc5 such that for ε le δ0c4

E

M1

(1 le sumM

j=1 1(Xj gt bminus 1

b

)lt M

A(bε)

)

sumMj=1 1

(Xj gt bminus 1

b

)∣∣∣∣∣∣supT

f (t) gt b

(710)

le MP

nsum

j=1

1(

Xj gt bminus 1b

)lt cprimeprimeprime

∣∣∣∣∣∣supT

f (t) gt b

le MP

(min

|tminustlowast|lec4εbf (t) lt bminus 1b

∣∣∣∣ supT

f (t) gt b

)

le c1m(T )bdεminusd exp(minus δlowast

c24ε

2

)le c5b

d

The second inequality holds from the fact that ifsumM

j=1 1(Xj gt bminus 1

b

)is less than

cprimeprimeprime then the minimum of f in a ball around the local maximum and of radius c4εb

must be less than bminus 1b Otherwise there are more than cprimeprimeprime elements of T insidesuch a ball The last inequality is due to Lemma 78 and Theorem 72

Putting (79) and (710) together we obtain for all εb -regular discretizationswith ε lt ε0 = min(14 1c4)δ0

EQL2b

P (suptisinT f (t) gt b)le E

(M1 (maxj Xj gt b)sumM

j=1 1(Xj gt bminus 1

b

)∣∣∣∣∣ sup

tisinTf (t) gt b

)P (X1 gt bminus 1b)

P (suptisinT f (t) gt b)

le (c3 + c5)bdP (X1 gt bminus 1b)P (suptisinT f (t) gt b)

Applying now (74) and Proposition 72 we have that

P (suptisinT

f(t) gt b) ltP (sup

tisinTf(t) gt b)

1minus c0radic

ε

36 ADLER BLANCHET LIU

we have

supbgtb0εisin[0ε0]

EQL2b

P(sup

tisinTf (t) gt b

) lt infin

as required 2

73 A final proof Lemma 77 Without loss of generality we assume that thefield has mean zero and unit variance Before getting into the details of the proofof Lemma 77 we need a few additional lemmas for which we adopt the followingnotation Let Ci and Cij be the first and second order derivatives of C and definethe vectors

micro1(t) = (minusC1(t) minusCd(t))micro2(t) = vech ((Cij(t) i = 1 d j = i d))

Let f prime(0) and f primeprime(0) be the gradient and vector of second order derivatives of f at 0where f primeprime(0) is arranged in the same order as micro2(0) Furthermore let micro02 = microgt20 bea vector of second order spectral moments and micro22 a matrix of fourth order spectralmoments The vectors micro02 and micro22 are arranged so that

1 0 micro02

0 Λ 0micro20 0 micro22

is the covariance matrix of (f(0) f prime(0) f primeprime(0)) where Λ = (minusCij(0)) It then followsthat

micro2middot0 = micro22 minus micro20micro02

be the conditional variance of f primeprime(0) given f(0) The following lemma given in [5]provides a stochastic representation of the f given that it has a local maxima atlevel u at the origin

Lemma 711 Given that f has a local maximum with height u at zero (aninterior point of T ) the conditional field is equal in distribution to

(711) fu(t) uC (t)minusWuβgt (t) + g (t)

g(t) is a centered Gaussian random field with covariance function

γ(s t) = C(sminus t)minus (C(s) micro2(s))(

1 micro02

micro20 micro22

)minus1 (C(t)microgt2 (t)

)minus micro1(s)Λminus1microgt1 (t)

and Wu is a d(d+1)2 random vector independent of g(t) with density function

(712) ψu (w) prop |det (rlowast(w)minus uΛ) | exp(minus1

2wgtmicrominus1

2middot0w)1 (rlowast(w)minus uΛ isin N )

MONTE CARLO FOR RANDOM FIELDS 37

where rlowast(w) is a dtimesd symmetric matrix whose upper triangular elements consist ofthe components of w The set of negative definite matrices is denoted by N Finallyβ(t) is defined by

(α (t) β (t)) = (C (t) micro2 (t))(

1 micro02

micro20 micro22

)minus1

The following two technical lemmas which we shall prove after completing theproof of Lemma 77 provide bounds for the last two terms of (711)

Lemma 712 Using the notation in (711) there exist δ0 ε1 c1 and b0 suchthat for any u gt b gt b0 and δ isin (0 δ0)

P

(sup

|t|leδab

∣∣Wuβgt (t)∣∣ gt

a

4b

)le c1 exp

(minusε1b

2

δ4

)

Lemma 713 There exist c δ and δ0 such that for any δ isin (0 δ0)

P

(max|t|leδab

|g(t)| gt a

4b

)le c exp

(minus δ

δ2

)

Proof of Lemma 77 Using the notation of Lemma 711 given any s isin L forwhich f(s) = b we have that the corresponding conditional distribution of s is thatof fb(middot minus s) Consequently it suffices to show that

P

(min

|t|leδabfb(t) lt bminus ab

)le clowast exp

(minusδlowast

δ2

)

We consider firstly the case for which the local maximum is in the interior of T Then by the Slepian model (711)

fb (t) = bC (t)minusWbβgt (t) + g (t)

We study the three terms of the Slepian model individually Since

C(t) = 1minus tgtΛt + o(|t|2

)

there exists a ε0 such thatbC (t) ge bminus a

4b

38 ADLER BLANCHET LIU

for all |t| lt ε0radic

ab According to Lemmas 712 and 713 for δ lt min(ε0radic

a0 δ0)

P

(min

|t|leδabfu(t) lt bminus ab

)

le P

(max|t|ltδab

|g (t)| gt a

4b

)+ P

(sup

|t|leδab

∣∣Wbβgt(t)

∣∣ gta

4b

)

le c exp

δ2

)+ c1 exp

(minusε1b

2

δ4

)

le clowast exp(minusδlowast

δ2

)

for some clowast and δlowastNow consider the case for which the local maximum is in the (dminus1)-dimensional

boundary of T Due to convexity of T and without loss of generality we assumethat the tangent space of partT is generated by partpartt2 partparttd the local maximumis located at the origin and T is a subset of the positive half-plane t1 ge 0 Thatthese arguments to not involve a loss of generality follows from the arguments onpages 291ndash192 of [4] which rely on the assumed stationarity of f (for translations)and the fact that rotations while changing the distributions will not affect theprobabilities that we are currently computing)

For the origin positioned as just described to be a local maximum it is necessaryand sufficient that the gradient of f restricted to partT is the zero vector the Hessianmatrix restricted to partT is negative definite and part1f(0) le 0 Applying a version ofLemma 711 for this case conditional on f(0) = u and 0 being a local maximumthe field is equal in distribution to

(713) uC(t)minus Wuβgt(t) + micro1(t)Λminus1 (Z 0 0)T + g(t)

where Z le 0 corresponds to part1f (0) and it follows a truncated (conditional on thenegative axis) Gaussian random variable with mean zero and a variance parameterwhich is computed as the conditional variance of part1f (0) given (part2f (0) partdf (0))The vector Wu is a (d(d + 1)2)-dimensional random vector with density function

ψu(w) prop |det(rlowast(w)minus uΛ)| exp(minus12wgtmicrominus1

2middot0w)1(wlowast minus uΛ isin N )

where Λ is the second spectral moment of f restricted to partT and rlowast(w) is the(dminus 1)times (dminus 1) symmetric matrix whose upper triangular elements consist of thecomponents of w In the representation (713) the vector Wu and Z are independentAs in the proof of Lemma 712 one can show that a similar bound holds albeitwith with different constants Thus since micro1 (t) = O (t) there exist cprimeprime and δprimeprime suchthat the third term in (713) can be bounded by

P [ max|t|leδab

∣∣∣micro1 (t) Λminus1 (Z 0 0)T∣∣∣ ge a(4b)] le cprimeprime exp

(minusδprimeprime

δ2

)

MONTE CARLO FOR RANDOM FIELDS 39

Consequently we can also find clowast and δlowast such that the conclusion holds and weare done 2

We complete the paper with the proofs of Lemmas 712 and 713

Proof of Lemma 712 It suffices to prove the Lemma for the case a = 1 Since

fu (0) = u = uminusWuβgt (0) + g (0)

and Wu and g are independent β (0) = 0 Furthermore since C prime (t) = O (t) andmicroprime2 (t) = O (t) there exists a c0 such that |β(t)| le c0|t|2 In addition Wu has densityfunction proportional to

ψu(w) prop |det(rlowast(w)minus uΛ)| exp(minus12wgtmicrominus1

2middot0w)1 (wlowast minus uΛ isin N )

Note that det(rlowast(w) minus uΛ) is expressible as a polynomial in w and u and thereexists some ε0 and c such that

∣∣∣∣det (rlowast(w)minus uΛ)

det (minusuΛ)

∣∣∣∣ le c

if |w| le ε0u Hence there exist ε2 c2 gt 0 such that

ψu (w) le ψ (w) = c2 exp(minus1

2ε2w

gtmicrominus12middot0w

)

for all u ge 1 The right hand side here is proportional to a multivariate Gaussiandensity Thus

P (|Wu| gt x) =int

|w|gtx

ψu(w)dw leint

|w|gtx

ψu (w) dw = c3P (|W | gt x)

where W is a multivariate Gaussian random variable with density function propor-tional to ψ Therefore by choosing ε1 and c1 appropriately we have

P

(sup|t|leδb

∣∣Wuβgt∣∣ gt

14b

)le P

(|Wu| gt b

c20δ

2

)le c1 exp

(minusε1b

2

δ4

)

for all u ge b 2

Proof of Lemma 713 Once again it suffices to prove the Lemma for the casea = 1 Since

fb(0) = b = bminusWbβgt(0) + g(0)

the covariance function (γ (s t) s t isin T ) of the centered field g satisfies γ(0 0) = 0It is also easy to check that

partsγ(s t) = O(|s|+ |t|) parttγ(s t) = O(|s|+ |t|)

40 ADLER BLANCHET LIU

Consequently there exists a constant cγ isin (0infin) for which

γ(s t) le cγ(|s|2 + |t|2) γ(s s) le cγ |s|2

We need to control the tail probability of sup|t|leδb |g(t)| For this it is useful tointroduce the following scaling Define

gδ (t) =b

δg

(δt

b

)

Then sup|t|leδb g (t) ge 14b if and only if sup|t|le1 gδ (t) ge 1

4δ Let

σδ (s t) = E (gδ (s) gδ (t))

ThensupsisinR

σδ(s s) le cγ

Because γ(s t) is at least twice differentiable applying a Taylor expansion we easilysee that the canonical metric dg corresponding to gδ (s) (cf Theorem 67) can bebounde as follows

d2g(s t) = E (gδ (s)minus gδ (t))2 =

b2

δ2

(δs

bδs

b

)+ γ

(δt

bδt

b

)minus 2γ

(δs

bδt

b

)]

le c |sminus t|2

for some constant c isin (0infin) Therefore the entropy of gδ evaluated at δ is boundedby Kδminusd for any δ gt 0 and with an appropriate choice of K gt 0 Therefore for allδ lt δ0

P ( sup|t|leδb

|g(t)| ge 14b

) = P ( sup|t|le1

gδ (t) ge 14δ

) le cdδminusdminusη exp(minus 1

16cγδ2)

for some constant cd and η gt 0 The last inequality is a direct application of The-orem 411 of [4] The conclusion of the lemma follows immediately by choosing c

and δ appropriately 2

REFERENCES

[1] RJ Adler K Bartz and S Kou Estimating curvatures in the Gaussian kinematic formulaIn preparation

[2] RJ Adler J Blanchet and J Liu Efficient simulation for tail probabilities of Gaussianrandom fields In WSC rsquo08 Proceedings of the 40th Conference on Winter Simulationpages 328ndash336 Winter Simulation Conference 2008

[3] RJ Adler P Muller and Rozovskii B editors Stochastic Modelling in Physical Oceanog-raphy Birkhauser Boston 1996

[4] RJ Adler and JE Taylor Random Fields and Geometry Springer New York USA 2007

[5] RJ Adler JE Taylor and K Worsley Applications of Random Fields and GeometryFoundations and Case Studies httpwebeetechnionacilpeopleadlerhrfpdf

MONTE CARLO FOR RANDOM FIELDS 41

[6] S Asmussen and P Glynn Stochastic Simulation Algorithms and Analysis Springer-VerlagNew York NY USA 2008

[7] JM Azais and M Wschebor Level Sets and Extrema of Random Processes and FieldsWiley New York NY 2009

[8] JM Bardeen JR Bond N Kaiser and AS Szalay The statistics of peaks of Gaussianrandom fields The Astrophysical Journal 30415ndash61 1986

[9] SM Berman Sojourns and Extremes of Stochastic Processes The Wadsworth ampBrooksCole StatisticsProbability Series Wadsworth amp BrooksCole Advanced Books ampSoftware Pacific Grove CA 1992

[10] J Blanchet Efficient importance sampling for binary contingency tables Ann of ApplProb 19949ndash982 2009

[11] C Borell The Brunn-Minkowski inequality in Gauss space Invent Math 30205ndash216 1975

[12] J Bucklew Introduction to Rare-Event Simulation Springer-Verlag New York NY 2004

[13] MR Dennis Nodal densities of planar Gaussian random waves Eur Phys J 145191ndash2102007

[14] RM Dudley Sample functions of the Gaussian process Ann Probab 166ndash103 1973

[15] KJ Friston KJ Worsley RSJ Frackowiak JC Mazziotta and AC Evans Assessing thesignificance of focal activations using their spatial extent Human Brain Mapping 1214ndash2201994

[16] MR Leadbetter G Lindgren and H Rootzen Extremes and Related Properties of RandomSequences and Processes Springer-Verlag New York 1983

[17] M Mitzenmacher and E Upfal Probability and Computing Randomized Algorithms andProbabilistic Analysis Cambridge University Press 2005

[18] W Niemiro and P Pokarowski Fixed precision MCMC estimation by median of productsof averages J Appl Probab 46309ndash329 2009

[19] V Piterbarg Asymptotic Methods in the Theory of Gaussian Processes American Mathe-matical Society USA 1995

[20] SF Shandarin Testing non-Gaussianity in cosmic microwave background maps by morpho-logical statistics Mon Not R Astr Soc 331865ndash874 2002

[21] SF Shandarin HA Feldman Y Xu and M Tegmark Morphological measures of non-gaussianity in cosmic microwave background maps Astrophys J Suppl 1411ndash11 2002

[22] L Shepp and H Landau On the supremum of a Gaussian process Sankhya A 32369ndash378December 1970

[23] JE Taylor A Takemura and RJ Adler Validity of the expected Euler characteristicheuristic Ann Probab 33(4)1362ndash1396 2005

[24] JE Taylor and KJ Worsley Detecting sparse signals in random fields with an applicationto brain mapping J Amer Statist Assoc 102(479)913ndash928 2007

[25] J Traub G Wasilokowski and H Wozniakowski Information-Based Complexity AcademicPress New York NY 1988

[26] BS Tsirelson IA Ibragimov and VN Sudakov Norms of Gaussian sample functionsProceedings of the Third Japan-USSR Symposium on Probability Theory (Tashkent 1975)55020ndash41 1976

[27] V S Tsirelrsquoson The density of the maximum of a gaussian process Theory of Probab Appl20847ndash856 1975

[28] H Wozniakowski Computational complexity of continuous problems 1996 Technical Re-port

Robert J AdlerElectrical EngineeringTechnion Haifa Israel 32000E-mail roberteetechnionacilURL webeetechnionacilpeopleadler

Jose H BlanchetIndustrial Engineering amp Operations ResearchColumbia Uninversity340 SW Mudd Building500 W 120 StreetNew York New York 10027E-mail joseblanchetcolumbiaeduURL wwwieorcolumbiaedufac-biosblanchetfacultyhtml

42 ADLER BLANCHET LIU

Jingchen LiuDepartment of StatisticsColumbia University1255 Amsterdam Avenue Room 1030New York NY 10027E-mail jcliustatcolumbiaeduURL statcolumbiaedu~jcliu

Page 36: EFFICIENT MONTE CARLO FOR HIGH EXCURSIONS OF … › ~jblanche › papers › GRF1.pdf · 1. Introduction. This paper centers on the design and analysis of e–cient Monte Carlo techniques

36 ADLER BLANCHET LIU

we have

supbgtb0εisin[0ε0]

EQL2b

P(sup

tisinTf (t) gt b

) lt infin

as required 2

73 A final proof Lemma 77 Without loss of generality we assume that thefield has mean zero and unit variance Before getting into the details of the proofof Lemma 77 we need a few additional lemmas for which we adopt the followingnotation Let Ci and Cij be the first and second order derivatives of C and definethe vectors

micro1(t) = (minusC1(t) minusCd(t))micro2(t) = vech ((Cij(t) i = 1 d j = i d))

Let f prime(0) and f primeprime(0) be the gradient and vector of second order derivatives of f at 0where f primeprime(0) is arranged in the same order as micro2(0) Furthermore let micro02 = microgt20 bea vector of second order spectral moments and micro22 a matrix of fourth order spectralmoments The vectors micro02 and micro22 are arranged so that

1 0 micro02

0 Λ 0micro20 0 micro22

is the covariance matrix of (f(0) f prime(0) f primeprime(0)) where Λ = (minusCij(0)) It then followsthat

micro2middot0 = micro22 minus micro20micro02

be the conditional variance of f primeprime(0) given f(0) The following lemma given in [5]provides a stochastic representation of the f given that it has a local maxima atlevel u at the origin

Lemma 711 Given that f has a local maximum with height u at zero (aninterior point of T ) the conditional field is equal in distribution to

(711) fu(t) uC (t)minusWuβgt (t) + g (t)

g(t) is a centered Gaussian random field with covariance function

γ(s t) = C(sminus t)minus (C(s) micro2(s))(

1 micro02

micro20 micro22

)minus1 (C(t)microgt2 (t)

)minus micro1(s)Λminus1microgt1 (t)

and Wu is a d(d+1)2 random vector independent of g(t) with density function

(712) ψu (w) prop |det (rlowast(w)minus uΛ) | exp(minus1

2wgtmicrominus1

2middot0w)1 (rlowast(w)minus uΛ isin N )

MONTE CARLO FOR RANDOM FIELDS 37

where rlowast(w) is a dtimesd symmetric matrix whose upper triangular elements consist ofthe components of w The set of negative definite matrices is denoted by N Finallyβ(t) is defined by

(α (t) β (t)) = (C (t) micro2 (t))(

1 micro02

micro20 micro22

)minus1

The following two technical lemmas which we shall prove after completing theproof of Lemma 77 provide bounds for the last two terms of (711)

Lemma 712 Using the notation in (711) there exist δ0 ε1 c1 and b0 suchthat for any u gt b gt b0 and δ isin (0 δ0)

P

(sup

|t|leδab

∣∣Wuβgt (t)∣∣ gt

a

4b

)le c1 exp

(minusε1b

2

δ4

)

Lemma 713 There exist c δ and δ0 such that for any δ isin (0 δ0)

P

(max|t|leδab

|g(t)| gt a

4b

)le c exp

(minus δ

δ2

)

Proof of Lemma 77 Using the notation of Lemma 711 given any s isin L forwhich f(s) = b we have that the corresponding conditional distribution of s is thatof fb(middot minus s) Consequently it suffices to show that

P

(min

|t|leδabfb(t) lt bminus ab

)le clowast exp

(minusδlowast

δ2

)

We consider firstly the case for which the local maximum is in the interior of T Then by the Slepian model (711)

fb (t) = bC (t)minusWbβgt (t) + g (t)

We study the three terms of the Slepian model individually Since

C(t) = 1minus tgtΛt + o(|t|2

)

there exists a ε0 such thatbC (t) ge bminus a

4b

38 ADLER BLANCHET LIU

for all |t| lt ε0radic

ab According to Lemmas 712 and 713 for δ lt min(ε0radic

a0 δ0)

P

(min

|t|leδabfu(t) lt bminus ab

)

le P

(max|t|ltδab

|g (t)| gt a

4b

)+ P

(sup

|t|leδab

∣∣Wbβgt(t)

∣∣ gta

4b

)

le c exp

δ2

)+ c1 exp

(minusε1b

2

δ4

)

le clowast exp(minusδlowast

δ2

)

for some clowast and δlowastNow consider the case for which the local maximum is in the (dminus1)-dimensional

boundary of T Due to convexity of T and without loss of generality we assumethat the tangent space of partT is generated by partpartt2 partparttd the local maximumis located at the origin and T is a subset of the positive half-plane t1 ge 0 Thatthese arguments to not involve a loss of generality follows from the arguments onpages 291ndash192 of [4] which rely on the assumed stationarity of f (for translations)and the fact that rotations while changing the distributions will not affect theprobabilities that we are currently computing)

For the origin positioned as just described to be a local maximum it is necessaryand sufficient that the gradient of f restricted to partT is the zero vector the Hessianmatrix restricted to partT is negative definite and part1f(0) le 0 Applying a version ofLemma 711 for this case conditional on f(0) = u and 0 being a local maximumthe field is equal in distribution to

(713) uC(t)minus Wuβgt(t) + micro1(t)Λminus1 (Z 0 0)T + g(t)

where Z le 0 corresponds to part1f (0) and it follows a truncated (conditional on thenegative axis) Gaussian random variable with mean zero and a variance parameterwhich is computed as the conditional variance of part1f (0) given (part2f (0) partdf (0))The vector Wu is a (d(d + 1)2)-dimensional random vector with density function

ψu(w) prop |det(rlowast(w)minus uΛ)| exp(minus12wgtmicrominus1

2middot0w)1(wlowast minus uΛ isin N )

where Λ is the second spectral moment of f restricted to partT and rlowast(w) is the(dminus 1)times (dminus 1) symmetric matrix whose upper triangular elements consist of thecomponents of w In the representation (713) the vector Wu and Z are independentAs in the proof of Lemma 712 one can show that a similar bound holds albeitwith with different constants Thus since micro1 (t) = O (t) there exist cprimeprime and δprimeprime suchthat the third term in (713) can be bounded by

P [ max|t|leδab

∣∣∣micro1 (t) Λminus1 (Z 0 0)T∣∣∣ ge a(4b)] le cprimeprime exp

(minusδprimeprime

δ2

)

MONTE CARLO FOR RANDOM FIELDS 39

Consequently we can also find clowast and δlowast such that the conclusion holds and weare done 2

We complete the paper with the proofs of Lemmas 712 and 713

Proof of Lemma 712 It suffices to prove the Lemma for the case a = 1 Since

fu (0) = u = uminusWuβgt (0) + g (0)

and Wu and g are independent β (0) = 0 Furthermore since C prime (t) = O (t) andmicroprime2 (t) = O (t) there exists a c0 such that |β(t)| le c0|t|2 In addition Wu has densityfunction proportional to

ψu(w) prop |det(rlowast(w)minus uΛ)| exp(minus12wgtmicrominus1

2middot0w)1 (wlowast minus uΛ isin N )

Note that det(rlowast(w) minus uΛ) is expressible as a polynomial in w and u and thereexists some ε0 and c such that

∣∣∣∣det (rlowast(w)minus uΛ)

det (minusuΛ)

∣∣∣∣ le c

if |w| le ε0u Hence there exist ε2 c2 gt 0 such that

ψu (w) le ψ (w) = c2 exp(minus1

2ε2w

gtmicrominus12middot0w

)

for all u ge 1 The right hand side here is proportional to a multivariate Gaussiandensity Thus

P (|Wu| gt x) =int

|w|gtx

ψu(w)dw leint

|w|gtx

ψu (w) dw = c3P (|W | gt x)

where W is a multivariate Gaussian random variable with density function propor-tional to ψ Therefore by choosing ε1 and c1 appropriately we have

P

(sup|t|leδb

∣∣Wuβgt∣∣ gt

14b

)le P

(|Wu| gt b

c20δ

2

)le c1 exp

(minusε1b

2

δ4

)

for all u ge b 2

Proof of Lemma 713 Once again it suffices to prove the Lemma for the casea = 1 Since

fb(0) = b = bminusWbβgt(0) + g(0)

the covariance function (γ (s t) s t isin T ) of the centered field g satisfies γ(0 0) = 0It is also easy to check that

partsγ(s t) = O(|s|+ |t|) parttγ(s t) = O(|s|+ |t|)

40 ADLER BLANCHET LIU

Consequently there exists a constant cγ isin (0infin) for which

γ(s t) le cγ(|s|2 + |t|2) γ(s s) le cγ |s|2

We need to control the tail probability of sup|t|leδb |g(t)| For this it is useful tointroduce the following scaling Define

gδ (t) =b

δg

(δt

b

)

Then sup|t|leδb g (t) ge 14b if and only if sup|t|le1 gδ (t) ge 1

4δ Let

σδ (s t) = E (gδ (s) gδ (t))

ThensupsisinR

σδ(s s) le cγ

Because γ(s t) is at least twice differentiable applying a Taylor expansion we easilysee that the canonical metric dg corresponding to gδ (s) (cf Theorem 67) can bebounde as follows

d2g(s t) = E (gδ (s)minus gδ (t))2 =

b2

δ2

(δs

bδs

b

)+ γ

(δt

bδt

b

)minus 2γ

(δs

bδt

b

)]

le c |sminus t|2

for some constant c isin (0infin) Therefore the entropy of gδ evaluated at δ is boundedby Kδminusd for any δ gt 0 and with an appropriate choice of K gt 0 Therefore for allδ lt δ0

P ( sup|t|leδb

|g(t)| ge 14b

) = P ( sup|t|le1

gδ (t) ge 14δ

) le cdδminusdminusη exp(minus 1

16cγδ2)

for some constant cd and η gt 0 The last inequality is a direct application of The-orem 411 of [4] The conclusion of the lemma follows immediately by choosing c

and δ appropriately 2

REFERENCES

[1] RJ Adler K Bartz and S Kou Estimating curvatures in the Gaussian kinematic formulaIn preparation

[2] RJ Adler J Blanchet and J Liu Efficient simulation for tail probabilities of Gaussianrandom fields In WSC rsquo08 Proceedings of the 40th Conference on Winter Simulationpages 328ndash336 Winter Simulation Conference 2008

[3] RJ Adler P Muller and Rozovskii B editors Stochastic Modelling in Physical Oceanog-raphy Birkhauser Boston 1996

[4] RJ Adler and JE Taylor Random Fields and Geometry Springer New York USA 2007

[5] RJ Adler JE Taylor and K Worsley Applications of Random Fields and GeometryFoundations and Case Studies httpwebeetechnionacilpeopleadlerhrfpdf

MONTE CARLO FOR RANDOM FIELDS 41

[6] S Asmussen and P Glynn Stochastic Simulation Algorithms and Analysis Springer-VerlagNew York NY USA 2008

[7] JM Azais and M Wschebor Level Sets and Extrema of Random Processes and FieldsWiley New York NY 2009

[8] JM Bardeen JR Bond N Kaiser and AS Szalay The statistics of peaks of Gaussianrandom fields The Astrophysical Journal 30415ndash61 1986

[9] SM Berman Sojourns and Extremes of Stochastic Processes The Wadsworth ampBrooksCole StatisticsProbability Series Wadsworth amp BrooksCole Advanced Books ampSoftware Pacific Grove CA 1992

[10] J Blanchet Efficient importance sampling for binary contingency tables Ann of ApplProb 19949ndash982 2009

[11] C Borell The Brunn-Minkowski inequality in Gauss space Invent Math 30205ndash216 1975

[12] J Bucklew Introduction to Rare-Event Simulation Springer-Verlag New York NY 2004

[13] MR Dennis Nodal densities of planar Gaussian random waves Eur Phys J 145191ndash2102007

[14] RM Dudley Sample functions of the Gaussian process Ann Probab 166ndash103 1973

[15] KJ Friston KJ Worsley RSJ Frackowiak JC Mazziotta and AC Evans Assessing thesignificance of focal activations using their spatial extent Human Brain Mapping 1214ndash2201994

[16] MR Leadbetter G Lindgren and H Rootzen Extremes and Related Properties of RandomSequences and Processes Springer-Verlag New York 1983

[17] M Mitzenmacher and E Upfal Probability and Computing Randomized Algorithms andProbabilistic Analysis Cambridge University Press 2005

[18] W Niemiro and P Pokarowski Fixed precision MCMC estimation by median of productsof averages J Appl Probab 46309ndash329 2009

[19] V Piterbarg Asymptotic Methods in the Theory of Gaussian Processes American Mathe-matical Society USA 1995

[20] SF Shandarin Testing non-Gaussianity in cosmic microwave background maps by morpho-logical statistics Mon Not R Astr Soc 331865ndash874 2002

[21] SF Shandarin HA Feldman Y Xu and M Tegmark Morphological measures of non-gaussianity in cosmic microwave background maps Astrophys J Suppl 1411ndash11 2002

[22] L Shepp and H Landau On the supremum of a Gaussian process Sankhya A 32369ndash378December 1970

[23] JE Taylor A Takemura and RJ Adler Validity of the expected Euler characteristicheuristic Ann Probab 33(4)1362ndash1396 2005

[24] JE Taylor and KJ Worsley Detecting sparse signals in random fields with an applicationto brain mapping J Amer Statist Assoc 102(479)913ndash928 2007

[25] J Traub G Wasilokowski and H Wozniakowski Information-Based Complexity AcademicPress New York NY 1988

[26] BS Tsirelson IA Ibragimov and VN Sudakov Norms of Gaussian sample functionsProceedings of the Third Japan-USSR Symposium on Probability Theory (Tashkent 1975)55020ndash41 1976

[27] V S Tsirelrsquoson The density of the maximum of a gaussian process Theory of Probab Appl20847ndash856 1975

[28] H Wozniakowski Computational complexity of continuous problems 1996 Technical Re-port

Robert J AdlerElectrical EngineeringTechnion Haifa Israel 32000E-mail roberteetechnionacilURL webeetechnionacilpeopleadler

Jose H BlanchetIndustrial Engineering amp Operations ResearchColumbia Uninversity340 SW Mudd Building500 W 120 StreetNew York New York 10027E-mail joseblanchetcolumbiaeduURL wwwieorcolumbiaedufac-biosblanchetfacultyhtml

42 ADLER BLANCHET LIU

Jingchen LiuDepartment of StatisticsColumbia University1255 Amsterdam Avenue Room 1030New York NY 10027E-mail jcliustatcolumbiaeduURL statcolumbiaedu~jcliu

Page 37: EFFICIENT MONTE CARLO FOR HIGH EXCURSIONS OF … › ~jblanche › papers › GRF1.pdf · 1. Introduction. This paper centers on the design and analysis of e–cient Monte Carlo techniques

MONTE CARLO FOR RANDOM FIELDS 37

where rlowast(w) is a dtimesd symmetric matrix whose upper triangular elements consist ofthe components of w The set of negative definite matrices is denoted by N Finallyβ(t) is defined by

(α (t) β (t)) = (C (t) micro2 (t))(

1 micro02

micro20 micro22

)minus1

The following two technical lemmas which we shall prove after completing theproof of Lemma 77 provide bounds for the last two terms of (711)

Lemma 712 Using the notation in (711) there exist δ0 ε1 c1 and b0 suchthat for any u gt b gt b0 and δ isin (0 δ0)

P

(sup

|t|leδab

∣∣Wuβgt (t)∣∣ gt

a

4b

)le c1 exp

(minusε1b

2

δ4

)

Lemma 713 There exist c δ and δ0 such that for any δ isin (0 δ0)

P

(max|t|leδab

|g(t)| gt a

4b

)le c exp

(minus δ

δ2

)

Proof of Lemma 77 Using the notation of Lemma 711 given any s isin L forwhich f(s) = b we have that the corresponding conditional distribution of s is thatof fb(middot minus s) Consequently it suffices to show that

P

(min

|t|leδabfb(t) lt bminus ab

)le clowast exp

(minusδlowast

δ2

)

We consider firstly the case for which the local maximum is in the interior of T Then by the Slepian model (711)

fb (t) = bC (t)minusWbβgt (t) + g (t)

We study the three terms of the Slepian model individually Since

C(t) = 1minus tgtΛt + o(|t|2

)

there exists a ε0 such thatbC (t) ge bminus a

4b

38 ADLER BLANCHET LIU

for all |t| lt ε0radic

ab According to Lemmas 712 and 713 for δ lt min(ε0radic

a0 δ0)

P

(min

|t|leδabfu(t) lt bminus ab

)

le P

(max|t|ltδab

|g (t)| gt a

4b

)+ P

(sup

|t|leδab

∣∣Wbβgt(t)

∣∣ gta

4b

)

le c exp

δ2

)+ c1 exp

(minusε1b

2

δ4

)

le clowast exp(minusδlowast

δ2

)

for some clowast and δlowastNow consider the case for which the local maximum is in the (dminus1)-dimensional

boundary of T Due to convexity of T and without loss of generality we assumethat the tangent space of partT is generated by partpartt2 partparttd the local maximumis located at the origin and T is a subset of the positive half-plane t1 ge 0 Thatthese arguments to not involve a loss of generality follows from the arguments onpages 291ndash192 of [4] which rely on the assumed stationarity of f (for translations)and the fact that rotations while changing the distributions will not affect theprobabilities that we are currently computing)

For the origin positioned as just described to be a local maximum it is necessaryand sufficient that the gradient of f restricted to partT is the zero vector the Hessianmatrix restricted to partT is negative definite and part1f(0) le 0 Applying a version ofLemma 711 for this case conditional on f(0) = u and 0 being a local maximumthe field is equal in distribution to

(713) uC(t)minus Wuβgt(t) + micro1(t)Λminus1 (Z 0 0)T + g(t)

where Z le 0 corresponds to part1f (0) and it follows a truncated (conditional on thenegative axis) Gaussian random variable with mean zero and a variance parameterwhich is computed as the conditional variance of part1f (0) given (part2f (0) partdf (0))The vector Wu is a (d(d + 1)2)-dimensional random vector with density function

ψu(w) prop |det(rlowast(w)minus uΛ)| exp(minus12wgtmicrominus1

2middot0w)1(wlowast minus uΛ isin N )

where Λ is the second spectral moment of f restricted to partT and rlowast(w) is the(dminus 1)times (dminus 1) symmetric matrix whose upper triangular elements consist of thecomponents of w In the representation (713) the vector Wu and Z are independentAs in the proof of Lemma 712 one can show that a similar bound holds albeitwith with different constants Thus since micro1 (t) = O (t) there exist cprimeprime and δprimeprime suchthat the third term in (713) can be bounded by

P [ max|t|leδab

∣∣∣micro1 (t) Λminus1 (Z 0 0)T∣∣∣ ge a(4b)] le cprimeprime exp

(minusδprimeprime

δ2

)

MONTE CARLO FOR RANDOM FIELDS 39

Consequently we can also find clowast and δlowast such that the conclusion holds and weare done 2

We complete the paper with the proofs of Lemmas 712 and 713

Proof of Lemma 712 It suffices to prove the Lemma for the case a = 1 Since

fu (0) = u = uminusWuβgt (0) + g (0)

and Wu and g are independent β (0) = 0 Furthermore since C prime (t) = O (t) andmicroprime2 (t) = O (t) there exists a c0 such that |β(t)| le c0|t|2 In addition Wu has densityfunction proportional to

ψu(w) prop |det(rlowast(w)minus uΛ)| exp(minus12wgtmicrominus1

2middot0w)1 (wlowast minus uΛ isin N )

Note that det(rlowast(w) minus uΛ) is expressible as a polynomial in w and u and thereexists some ε0 and c such that

∣∣∣∣det (rlowast(w)minus uΛ)

det (minusuΛ)

∣∣∣∣ le c

if |w| le ε0u Hence there exist ε2 c2 gt 0 such that

ψu (w) le ψ (w) = c2 exp(minus1

2ε2w

gtmicrominus12middot0w

)

for all u ge 1 The right hand side here is proportional to a multivariate Gaussiandensity Thus

P (|Wu| gt x) =int

|w|gtx

ψu(w)dw leint

|w|gtx

ψu (w) dw = c3P (|W | gt x)

where W is a multivariate Gaussian random variable with density function propor-tional to ψ Therefore by choosing ε1 and c1 appropriately we have

P

(sup|t|leδb

∣∣Wuβgt∣∣ gt

14b

)le P

(|Wu| gt b

c20δ

2

)le c1 exp

(minusε1b

2

δ4

)

for all u ge b 2

Proof of Lemma 713 Once again it suffices to prove the Lemma for the casea = 1 Since

fb(0) = b = bminusWbβgt(0) + g(0)

the covariance function (γ (s t) s t isin T ) of the centered field g satisfies γ(0 0) = 0It is also easy to check that

partsγ(s t) = O(|s|+ |t|) parttγ(s t) = O(|s|+ |t|)

40 ADLER BLANCHET LIU

Consequently there exists a constant cγ isin (0infin) for which

γ(s t) le cγ(|s|2 + |t|2) γ(s s) le cγ |s|2

We need to control the tail probability of sup|t|leδb |g(t)| For this it is useful tointroduce the following scaling Define

gδ (t) =b

δg

(δt

b

)

Then sup|t|leδb g (t) ge 14b if and only if sup|t|le1 gδ (t) ge 1

4δ Let

σδ (s t) = E (gδ (s) gδ (t))

ThensupsisinR

σδ(s s) le cγ

Because γ(s t) is at least twice differentiable applying a Taylor expansion we easilysee that the canonical metric dg corresponding to gδ (s) (cf Theorem 67) can bebounde as follows

d2g(s t) = E (gδ (s)minus gδ (t))2 =

b2

δ2

(δs

bδs

b

)+ γ

(δt

bδt

b

)minus 2γ

(δs

bδt

b

)]

le c |sminus t|2

for some constant c isin (0infin) Therefore the entropy of gδ evaluated at δ is boundedby Kδminusd for any δ gt 0 and with an appropriate choice of K gt 0 Therefore for allδ lt δ0

P ( sup|t|leδb

|g(t)| ge 14b

) = P ( sup|t|le1

gδ (t) ge 14δ

) le cdδminusdminusη exp(minus 1

16cγδ2)

for some constant cd and η gt 0 The last inequality is a direct application of The-orem 411 of [4] The conclusion of the lemma follows immediately by choosing c

and δ appropriately 2

REFERENCES

[1] RJ Adler K Bartz and S Kou Estimating curvatures in the Gaussian kinematic formulaIn preparation

[2] RJ Adler J Blanchet and J Liu Efficient simulation for tail probabilities of Gaussianrandom fields In WSC rsquo08 Proceedings of the 40th Conference on Winter Simulationpages 328ndash336 Winter Simulation Conference 2008

[3] RJ Adler P Muller and Rozovskii B editors Stochastic Modelling in Physical Oceanog-raphy Birkhauser Boston 1996

[4] RJ Adler and JE Taylor Random Fields and Geometry Springer New York USA 2007

[5] RJ Adler JE Taylor and K Worsley Applications of Random Fields and GeometryFoundations and Case Studies httpwebeetechnionacilpeopleadlerhrfpdf

MONTE CARLO FOR RANDOM FIELDS 41

[6] S Asmussen and P Glynn Stochastic Simulation Algorithms and Analysis Springer-VerlagNew York NY USA 2008

[7] JM Azais and M Wschebor Level Sets and Extrema of Random Processes and FieldsWiley New York NY 2009

[8] JM Bardeen JR Bond N Kaiser and AS Szalay The statistics of peaks of Gaussianrandom fields The Astrophysical Journal 30415ndash61 1986

[9] SM Berman Sojourns and Extremes of Stochastic Processes The Wadsworth ampBrooksCole StatisticsProbability Series Wadsworth amp BrooksCole Advanced Books ampSoftware Pacific Grove CA 1992

[10] J Blanchet Efficient importance sampling for binary contingency tables Ann of ApplProb 19949ndash982 2009

[11] C Borell The Brunn-Minkowski inequality in Gauss space Invent Math 30205ndash216 1975

[12] J Bucklew Introduction to Rare-Event Simulation Springer-Verlag New York NY 2004

[13] MR Dennis Nodal densities of planar Gaussian random waves Eur Phys J 145191ndash2102007

[14] RM Dudley Sample functions of the Gaussian process Ann Probab 166ndash103 1973

[15] KJ Friston KJ Worsley RSJ Frackowiak JC Mazziotta and AC Evans Assessing thesignificance of focal activations using their spatial extent Human Brain Mapping 1214ndash2201994

[16] MR Leadbetter G Lindgren and H Rootzen Extremes and Related Properties of RandomSequences and Processes Springer-Verlag New York 1983

[17] M Mitzenmacher and E Upfal Probability and Computing Randomized Algorithms andProbabilistic Analysis Cambridge University Press 2005

[18] W Niemiro and P Pokarowski Fixed precision MCMC estimation by median of productsof averages J Appl Probab 46309ndash329 2009

[19] V Piterbarg Asymptotic Methods in the Theory of Gaussian Processes American Mathe-matical Society USA 1995

[20] SF Shandarin Testing non-Gaussianity in cosmic microwave background maps by morpho-logical statistics Mon Not R Astr Soc 331865ndash874 2002

[21] SF Shandarin HA Feldman Y Xu and M Tegmark Morphological measures of non-gaussianity in cosmic microwave background maps Astrophys J Suppl 1411ndash11 2002

[22] L Shepp and H Landau On the supremum of a Gaussian process Sankhya A 32369ndash378December 1970

[23] JE Taylor A Takemura and RJ Adler Validity of the expected Euler characteristicheuristic Ann Probab 33(4)1362ndash1396 2005

[24] JE Taylor and KJ Worsley Detecting sparse signals in random fields with an applicationto brain mapping J Amer Statist Assoc 102(479)913ndash928 2007

[25] J Traub G Wasilokowski and H Wozniakowski Information-Based Complexity AcademicPress New York NY 1988

[26] BS Tsirelson IA Ibragimov and VN Sudakov Norms of Gaussian sample functionsProceedings of the Third Japan-USSR Symposium on Probability Theory (Tashkent 1975)55020ndash41 1976

[27] V S Tsirelrsquoson The density of the maximum of a gaussian process Theory of Probab Appl20847ndash856 1975

[28] H Wozniakowski Computational complexity of continuous problems 1996 Technical Re-port

Robert J AdlerElectrical EngineeringTechnion Haifa Israel 32000E-mail roberteetechnionacilURL webeetechnionacilpeopleadler

Jose H BlanchetIndustrial Engineering amp Operations ResearchColumbia Uninversity340 SW Mudd Building500 W 120 StreetNew York New York 10027E-mail joseblanchetcolumbiaeduURL wwwieorcolumbiaedufac-biosblanchetfacultyhtml

42 ADLER BLANCHET LIU

Jingchen LiuDepartment of StatisticsColumbia University1255 Amsterdam Avenue Room 1030New York NY 10027E-mail jcliustatcolumbiaeduURL statcolumbiaedu~jcliu

Page 38: EFFICIENT MONTE CARLO FOR HIGH EXCURSIONS OF … › ~jblanche › papers › GRF1.pdf · 1. Introduction. This paper centers on the design and analysis of e–cient Monte Carlo techniques

38 ADLER BLANCHET LIU

for all |t| lt ε0radic

ab According to Lemmas 712 and 713 for δ lt min(ε0radic

a0 δ0)

P

(min

|t|leδabfu(t) lt bminus ab

)

le P

(max|t|ltδab

|g (t)| gt a

4b

)+ P

(sup

|t|leδab

∣∣Wbβgt(t)

∣∣ gta

4b

)

le c exp

δ2

)+ c1 exp

(minusε1b

2

δ4

)

le clowast exp(minusδlowast

δ2

)

for some clowast and δlowastNow consider the case for which the local maximum is in the (dminus1)-dimensional

boundary of T Due to convexity of T and without loss of generality we assumethat the tangent space of partT is generated by partpartt2 partparttd the local maximumis located at the origin and T is a subset of the positive half-plane t1 ge 0 Thatthese arguments to not involve a loss of generality follows from the arguments onpages 291ndash192 of [4] which rely on the assumed stationarity of f (for translations)and the fact that rotations while changing the distributions will not affect theprobabilities that we are currently computing)

For the origin positioned as just described to be a local maximum it is necessaryand sufficient that the gradient of f restricted to partT is the zero vector the Hessianmatrix restricted to partT is negative definite and part1f(0) le 0 Applying a version ofLemma 711 for this case conditional on f(0) = u and 0 being a local maximumthe field is equal in distribution to

(713) uC(t)minus Wuβgt(t) + micro1(t)Λminus1 (Z 0 0)T + g(t)

where Z le 0 corresponds to part1f (0) and it follows a truncated (conditional on thenegative axis) Gaussian random variable with mean zero and a variance parameterwhich is computed as the conditional variance of part1f (0) given (part2f (0) partdf (0))The vector Wu is a (d(d + 1)2)-dimensional random vector with density function

ψu(w) prop |det(rlowast(w)minus uΛ)| exp(minus12wgtmicrominus1

2middot0w)1(wlowast minus uΛ isin N )

where Λ is the second spectral moment of f restricted to partT and rlowast(w) is the(dminus 1)times (dminus 1) symmetric matrix whose upper triangular elements consist of thecomponents of w In the representation (713) the vector Wu and Z are independentAs in the proof of Lemma 712 one can show that a similar bound holds albeitwith with different constants Thus since micro1 (t) = O (t) there exist cprimeprime and δprimeprime suchthat the third term in (713) can be bounded by

P [ max|t|leδab

∣∣∣micro1 (t) Λminus1 (Z 0 0)T∣∣∣ ge a(4b)] le cprimeprime exp

(minusδprimeprime

δ2

)

MONTE CARLO FOR RANDOM FIELDS 39

Consequently we can also find clowast and δlowast such that the conclusion holds and weare done 2

We complete the paper with the proofs of Lemmas 712 and 713

Proof of Lemma 712 It suffices to prove the Lemma for the case a = 1 Since

fu (0) = u = uminusWuβgt (0) + g (0)

and Wu and g are independent β (0) = 0 Furthermore since C prime (t) = O (t) andmicroprime2 (t) = O (t) there exists a c0 such that |β(t)| le c0|t|2 In addition Wu has densityfunction proportional to

ψu(w) prop |det(rlowast(w)minus uΛ)| exp(minus12wgtmicrominus1

2middot0w)1 (wlowast minus uΛ isin N )

Note that det(rlowast(w) minus uΛ) is expressible as a polynomial in w and u and thereexists some ε0 and c such that

∣∣∣∣det (rlowast(w)minus uΛ)

det (minusuΛ)

∣∣∣∣ le c

if |w| le ε0u Hence there exist ε2 c2 gt 0 such that

ψu (w) le ψ (w) = c2 exp(minus1

2ε2w

gtmicrominus12middot0w

)

for all u ge 1 The right hand side here is proportional to a multivariate Gaussiandensity Thus

P (|Wu| gt x) =int

|w|gtx

ψu(w)dw leint

|w|gtx

ψu (w) dw = c3P (|W | gt x)

where W is a multivariate Gaussian random variable with density function propor-tional to ψ Therefore by choosing ε1 and c1 appropriately we have

P

(sup|t|leδb

∣∣Wuβgt∣∣ gt

14b

)le P

(|Wu| gt b

c20δ

2

)le c1 exp

(minusε1b

2

δ4

)

for all u ge b 2

Proof of Lemma 713 Once again it suffices to prove the Lemma for the casea = 1 Since

fb(0) = b = bminusWbβgt(0) + g(0)

the covariance function (γ (s t) s t isin T ) of the centered field g satisfies γ(0 0) = 0It is also easy to check that

partsγ(s t) = O(|s|+ |t|) parttγ(s t) = O(|s|+ |t|)

40 ADLER BLANCHET LIU

Consequently there exists a constant cγ isin (0infin) for which

γ(s t) le cγ(|s|2 + |t|2) γ(s s) le cγ |s|2

We need to control the tail probability of sup|t|leδb |g(t)| For this it is useful tointroduce the following scaling Define

gδ (t) =b

δg

(δt

b

)

Then sup|t|leδb g (t) ge 14b if and only if sup|t|le1 gδ (t) ge 1

4δ Let

σδ (s t) = E (gδ (s) gδ (t))

ThensupsisinR

σδ(s s) le cγ

Because γ(s t) is at least twice differentiable applying a Taylor expansion we easilysee that the canonical metric dg corresponding to gδ (s) (cf Theorem 67) can bebounde as follows

d2g(s t) = E (gδ (s)minus gδ (t))2 =

b2

δ2

(δs

bδs

b

)+ γ

(δt

bδt

b

)minus 2γ

(δs

bδt

b

)]

le c |sminus t|2

for some constant c isin (0infin) Therefore the entropy of gδ evaluated at δ is boundedby Kδminusd for any δ gt 0 and with an appropriate choice of K gt 0 Therefore for allδ lt δ0

P ( sup|t|leδb

|g(t)| ge 14b

) = P ( sup|t|le1

gδ (t) ge 14δ

) le cdδminusdminusη exp(minus 1

16cγδ2)

for some constant cd and η gt 0 The last inequality is a direct application of The-orem 411 of [4] The conclusion of the lemma follows immediately by choosing c

and δ appropriately 2

REFERENCES

[1] R.J. Adler, K. Bartz, and S. Kou. Estimating curvatures in the Gaussian kinematic formula. In preparation.
[2] R.J. Adler, J. Blanchet, and J. Liu. Efficient simulation for tail probabilities of Gaussian random fields. In WSC '08: Proceedings of the 40th Conference on Winter Simulation, pages 328–336. Winter Simulation Conference, 2008.
[3] R.J. Adler, P. Müller, and B. Rozovskii, editors. Stochastic Modelling in Physical Oceanography. Birkhäuser, Boston, 1996.
[4] R.J. Adler and J.E. Taylor. Random Fields and Geometry. Springer, New York, 2007.
[5] R.J. Adler, J.E. Taylor, and K. Worsley. Applications of Random Fields and Geometry: Foundations and Case Studies. http://webee.technion.ac.il/people/adler/hrf.pdf.
[6] S. Asmussen and P. Glynn. Stochastic Simulation: Algorithms and Analysis. Springer-Verlag, New York, 2008.
[7] J.M. Azaïs and M. Wschebor. Level Sets and Extrema of Random Processes and Fields. Wiley, New York, 2009.
[8] J.M. Bardeen, J.R. Bond, N. Kaiser, and A.S. Szalay. The statistics of peaks of Gaussian random fields. The Astrophysical Journal, 304:15–61, 1986.
[9] S.M. Berman. Sojourns and Extremes of Stochastic Processes. The Wadsworth & Brooks/Cole Statistics/Probability Series. Wadsworth & Brooks/Cole, Pacific Grove, CA, 1992.
[10] J. Blanchet. Efficient importance sampling for binary contingency tables. Ann. Appl. Probab., 19:949–982, 2009.
[11] C. Borell. The Brunn-Minkowski inequality in Gauss space. Invent. Math., 30:205–216, 1975.
[12] J. Bucklew. Introduction to Rare-Event Simulation. Springer-Verlag, New York, 2004.
[13] M.R. Dennis. Nodal densities of planar Gaussian random waves. Eur. Phys. J., 145:191–210, 2007.
[14] R.M. Dudley. Sample functions of the Gaussian process. Ann. Probab., 1:66–103, 1973.
[15] K.J. Friston, K.J. Worsley, R.S.J. Frackowiak, J.C. Mazziotta, and A.C. Evans. Assessing the significance of focal activations using their spatial extent. Human Brain Mapping, 1:214–220, 1994.
[16] M.R. Leadbetter, G. Lindgren, and H. Rootzén. Extremes and Related Properties of Random Sequences and Processes. Springer-Verlag, New York, 1983.
[17] M. Mitzenmacher and E. Upfal. Probability and Computing: Randomized Algorithms and Probabilistic Analysis. Cambridge University Press, 2005.
[18] W. Niemiro and P. Pokarowski. Fixed precision MCMC estimation by median of products of averages. J. Appl. Probab., 46:309–329, 2009.
[19] V. Piterbarg. Asymptotic Methods in the Theory of Gaussian Processes. American Mathematical Society, 1995.
[20] S.F. Shandarin. Testing non-Gaussianity in cosmic microwave background maps by morphological statistics. Mon. Not. R. Astr. Soc., 331:865–874, 2002.
[21] S.F. Shandarin, H.A. Feldman, Y. Xu, and M. Tegmark. Morphological measures of non-Gaussianity in cosmic microwave background maps. Astrophys. J. Suppl., 141:1–11, 2002.
[22] L. Shepp and H. Landau. On the supremum of a Gaussian process. Sankhyā A, 32:369–378, December 1970.
[23] J.E. Taylor, A. Takemura, and R.J. Adler. Validity of the expected Euler characteristic heuristic. Ann. Probab., 33(4):1362–1396, 2005.
[24] J.E. Taylor and K.J. Worsley. Detecting sparse signals in random fields, with an application to brain mapping. J. Amer. Statist. Assoc., 102(479):913–928, 2007.
[25] J. Traub, G. Wasilkowski, and H. Woźniakowski. Information-Based Complexity. Academic Press, New York, 1988.
[26] B.S. Tsirelson, I.A. Ibragimov, and V.N. Sudakov. Norms of Gaussian sample functions. Proceedings of the Third Japan-USSR Symposium on Probability Theory (Tashkent, 1975), 550:20–41, 1976.
[27] V.S. Tsirel'son. The density of the maximum of a Gaussian process. Theory Probab. Appl., 20:847–856, 1975.
[28] H. Woźniakowski. Computational complexity of continuous problems. Technical report, 1996.

Robert J. Adler
Electrical Engineering
Technion, Haifa, Israel 32000
E-mail: robert@ee.technion.ac.il
URL: webee.technion.ac.il/people/adler

Jose H. Blanchet
Industrial Engineering & Operations Research
Columbia University
340 S.W. Mudd Building, 500 W. 120 Street
New York, New York 10027
E-mail: jose.blanchet@columbia.edu
URL: www.ieor.columbia.edu/fac-bios/blanchet/faculty.html

Jingchen Liu
Department of Statistics
Columbia University
1255 Amsterdam Avenue, Room 1030
New York, NY 10027
E-mail: jcliu@stat.columbia.edu
URL: stat.columbia.edu/~jcliu
