53
Bayesian Quadrature for Multiple Related Integrals Fran¸cois-XavierBriol University of Warwick (Department of Statistics) Imperial College London (Department of Mathematics) University of Sheffield Machine Learning Seminar arXiv:1801.04153 F-X Briol (Warwick & Imperial) BQ for Multiple Related Integrals May 2018 1 / 32

Bayesian Quadrature for Multiple Related Integrals · Bayesian Quadrature for Multiple Related Integrals Fran˘cois-Xavier Briol University of Warwick (Department of Statistics) Imperial

  • Upload
    others

  • View
    11

  • Download
    1

Embed Size (px)

Citation preview

Page 1: Bayesian Quadrature for Multiple Related Integrals · Bayesian Quadrature for Multiple Related Integrals Fran˘cois-Xavier Briol University of Warwick (Department of Statistics) Imperial

Bayesian Quadrature for Multiple Related Integrals

Francois-Xavier Briol

University of Warwick (Department of Statistics)Imperial College London (Department of Mathematics)

University of SheffieldMachine Learning Seminar

arXiv:1801.04153

F-X Briol (Warwick & Imperial) BQ for Multiple Related Integrals May 2018 1 / 32

Page 2: Bayesian Quadrature for Multiple Related Integrals · Bayesian Quadrature for Multiple Related Integrals Fran˘cois-Xavier Briol University of Warwick (Department of Statistics) Imperial

Collaborators

Xiaoyue Xi Mark Girolami Chris Oates(Warwick) (Imperial & ATI) (Newcastle & ATI)

Michael Osborne Dino Sejdinovic

(Oxford) (Oxford & ATI)F-X Briol (Warwick & Imperial) BQ for Multiple Related Integrals May 2018 2 / 32

Page 3: Bayesian Quadrature for Multiple Related Integrals · Bayesian Quadrature for Multiple Related Integrals Fran˘cois-Xavier Briol University of Warwick (Department of Statistics) Imperial

Bayesian Numerical Methods

Bayesian Numerical Methods

F-X Briol (Warwick & Imperial) BQ for Multiple Related Integrals May 2018 3 / 32

Page 4: Bayesian Quadrature for Multiple Related Integrals · Bayesian Quadrature for Multiple Related Integrals Fran˘cois-Xavier Briol University of Warwick (Department of Statistics) Imperial

Bayesian Numerical Methods

Bayesian Numerical Methods

Standard Numerical Analysis: Study of how to best projectcontinuous mathematical problems into discrete scales. (also thoughtof as the study of numerical errors).

Statistics: Infer some quantity of interest (usually the parameter of amodel) from data samples. Sounds familiar?

Bayesian Numerical Methods: Perform Bayesian statisticalinference on the solution of numerical problems.

[1] Larkin, F. M. (1972). Gaussian measure in Hilbert space and applications in numericalanalysis. Rocky Mountain Journal of Mathematics, 2(3), 379-422.[2] Diaconis, P. (1988). Bayesian Numerical Analysis. Statistical Decision Theory andRelated Topics IV, 163-175.[3] O’Hagan, A. (1992). Some Bayesian numerical analysis. Bayesian Statistics, 4,345-363.[4] Hennig, P., Osborne, M. A., & Girolami, M. (2015). Probabilistic Numerics andUncertainty in Computations. J. Roy. Soc. A, 471(2179).

F-X Briol (Warwick & Imperial) BQ for Multiple Related Integrals May 2018 4 / 32

Page 5: Bayesian Quadrature for Multiple Related Integrals · Bayesian Quadrature for Multiple Related Integrals Fran˘cois-Xavier Briol University of Warwick (Department of Statistics) Imperial

Bayesian Numerical Methods

Bayesian Numerical Methods

Standard Numerical Analysis: Study of how to best projectcontinuous mathematical problems into discrete scales. (also thoughtof as the study of numerical errors).

Statistics: Infer some quantity of interest (usually the parameter of amodel) from data samples. Sounds familiar?

Bayesian Numerical Methods: Perform Bayesian statisticalinference on the solution of numerical problems.

[1] Larkin, F. M. (1972). Gaussian measure in Hilbert space and applications in numericalanalysis. Rocky Mountain Journal of Mathematics, 2(3), 379-422.[2] Diaconis, P. (1988). Bayesian Numerical Analysis. Statistical Decision Theory andRelated Topics IV, 163-175.[3] O’Hagan, A. (1992). Some Bayesian numerical analysis. Bayesian Statistics, 4,345-363.[4] Hennig, P., Osborne, M. A., & Girolami, M. (2015). Probabilistic Numerics andUncertainty in Computations. J. Roy. Soc. A, 471(2179).

F-X Briol (Warwick & Imperial) BQ for Multiple Related Integrals May 2018 4 / 32

Page 6: Bayesian Quadrature for Multiple Related Integrals · Bayesian Quadrature for Multiple Related Integrals Fran˘cois-Xavier Briol University of Warwick (Department of Statistics) Imperial

Bayesian Numerical Methods

Bayesian Numerical Methods

Standard Numerical Analysis: Study of how to best projectcontinuous mathematical problems into discrete scales. (also thoughtof as the study of numerical errors).

Statistics: Infer some quantity of interest (usually the parameter of amodel) from data samples. Sounds familiar?

Bayesian Numerical Methods: Perform Bayesian statisticalinference on the solution of numerical problems.

[1] Larkin, F. M. (1972). Gaussian measure in Hilbert space and applications in numericalanalysis. Rocky Mountain Journal of Mathematics, 2(3), 379-422.[2] Diaconis, P. (1988). Bayesian Numerical Analysis. Statistical Decision Theory andRelated Topics IV, 163-175.[3] O’Hagan, A. (1992). Some Bayesian numerical analysis. Bayesian Statistics, 4,345-363.[4] Hennig, P., Osborne, M. A., & Girolami, M. (2015). Probabilistic Numerics andUncertainty in Computations. J. Roy. Soc. A, 471(2179).

F-X Briol (Warwick & Imperial) BQ for Multiple Related Integrals May 2018 4 / 32

Page 7: Bayesian Quadrature for Multiple Related Integrals · Bayesian Quadrature for Multiple Related Integrals Fran˘cois-Xavier Briol University of Warwick (Department of Statistics) Imperial

Bayesian Numerical Methods

Bayesian Numerical Methods

Standard Numerical Analysis: Study of how to best projectcontinuous mathematical problems into discrete scales. (also thoughtof as the study of numerical errors).

Statistics: Infer some quantity of interest (usually the parameter of amodel) from data samples. Sounds familiar?

Bayesian Numerical Methods: Perform Bayesian statisticalinference on the solution of numerical problems.

[1] Larkin, F. M. (1972). Gaussian measure in Hilbert space and applications in numericalanalysis. Rocky Mountain Journal of Mathematics, 2(3), 379-422.[2] Diaconis, P. (1988). Bayesian Numerical Analysis. Statistical Decision Theory andRelated Topics IV, 163-175.[3] O’Hagan, A. (1992). Some Bayesian numerical analysis. Bayesian Statistics, 4,345-363.[4] Hennig, P., Osborne, M. A., & Girolami, M. (2015). Probabilistic Numerics andUncertainty in Computations. J. Roy. Soc. A, 471(2179).

F-X Briol (Warwick & Imperial) BQ for Multiple Related Integrals May 2018 4 / 32

Page 8: Bayesian Quadrature for Multiple Related Integrals · Bayesian Quadrature for Multiple Related Integrals Fran˘cois-Xavier Briol University of Warwick (Department of Statistics) Imperial

Bayesian Numerical Methods

What is the Point?

Quantification of the epistemic uncertainty associated with thenumerical problem using probability measures, rather thanworst-case bounds (not always representative of the actual error).

Propagation of uncertainty through pipelines.

Bayesian Numerical Methods can be framed as Bayesian InverseProblems for the solution of the numerical problem.

[1] Cockayne, J., Oates, C., Sullivan, T., & Girolami, M. (2017). Bayesian ProbabilisticNumerical Methods. arXiv:1701.04006.

F-X Briol (Warwick & Imperial) BQ for Multiple Related Integrals May 2018 5 / 32

Page 9: Bayesian Quadrature for Multiple Related Integrals · Bayesian Quadrature for Multiple Related Integrals Fran˘cois-Xavier Briol University of Warwick (Department of Statistics) Imperial

Bayesian Numerical Methods

What is the Point?

Quantification of the epistemic uncertainty associated with thenumerical problem using probability measures, rather thanworst-case bounds (not always representative of the actual error).

Propagation of uncertainty through pipelines.

Bayesian Numerical Methods can be framed as Bayesian InverseProblems for the solution of the numerical problem.

[1] Cockayne, J., Oates, C., Sullivan, T., & Girolami, M. (2017). Bayesian ProbabilisticNumerical Methods. arXiv:1701.04006.

F-X Briol (Warwick & Imperial) BQ for Multiple Related Integrals May 2018 5 / 32

Page 10: Bayesian Quadrature for Multiple Related Integrals · Bayesian Quadrature for Multiple Related Integrals Fran˘cois-Xavier Briol University of Warwick (Department of Statistics) Imperial

Bayesian Numerical Methods

What is the Point?

Quantification of the epistemic uncertainty associated with thenumerical problem using probability measures, rather thanworst-case bounds (not always representative of the actual error).

Propagation of uncertainty through pipelines.

Bayesian Numerical Methods can be framed as Bayesian InverseProblems for the solution of the numerical problem.

[1] Cockayne, J., Oates, C., Sullivan, T., & Girolami, M. (2017). Bayesian ProbabilisticNumerical Methods. arXiv:1701.04006.

F-X Briol (Warwick & Imperial) BQ for Multiple Related Integrals May 2018 5 / 32

Page 11: Bayesian Quadrature for Multiple Related Integrals · Bayesian Quadrature for Multiple Related Integrals Fran˘cois-Xavier Briol University of Warwick (Department of Statistics) Imperial

Bayesian Numerical Integration

Bayesian Numerical Integration

F-X Briol (Warwick & Imperial) BQ for Multiple Related Integrals May 2018 6 / 32

Page 12: Bayesian Quadrature for Multiple Related Integrals · Bayesian Quadrature for Multiple Related Integrals Fran˘cois-Xavier Briol University of Warwick (Department of Statistics) Imperial

Bayesian Numerical Integration

The Problem

Let’s come back to our problem of computing integrals! Consider afunction f : X → R (X ⊆ Rp) assumed to be square-integrable and aprobability measure Π.

Π[f ] =

∫Xf (x)dΠ(x) ≈

n∑i=1

wi f (xi ) = Π[f ]

where {xi}ni=1 ∈ X & {wi}ni=1 ∈ R.

Examples include:

1 Monte Carlo (MC): Sample {xi}ni=1 ∼ Π and let wi = 1/n ∀i .2 Markov Chain Monte Carlo (MCMC): Sample states {xi}ni=1 from a

Markov Chain with invariant distribution Π and let wi = 1/n ∀i .3 Gaussian quadrature, importance sampling, QMC, SMC, etc...

F-X Briol (Warwick & Imperial) BQ for Multiple Related Integrals May 2018 7 / 32

Page 13: Bayesian Quadrature for Multiple Related Integrals · Bayesian Quadrature for Multiple Related Integrals Fran˘cois-Xavier Briol University of Warwick (Department of Statistics) Imperial

Bayesian Numerical Integration

The Problem

Let’s come back to our problem of computing integrals! Consider afunction f : X → R (X ⊆ Rp) assumed to be square-integrable and aprobability measure Π.

Π[f ] =

∫Xf (x)dΠ(x) ≈

n∑i=1

wi f (xi ) = Π[f ]

where {xi}ni=1 ∈ X & {wi}ni=1 ∈ R.

Examples include:

1 Monte Carlo (MC): Sample {xi}ni=1 ∼ Π and let wi = 1/n ∀i .2 Markov Chain Monte Carlo (MCMC): Sample states {xi}ni=1 from a

Markov Chain with invariant distribution Π and let wi = 1/n ∀i .3 Gaussian quadrature, importance sampling, QMC, SMC, etc...

F-X Briol (Warwick & Imperial) BQ for Multiple Related Integrals May 2018 7 / 32

Page 14: Bayesian Quadrature for Multiple Related Integrals · Bayesian Quadrature for Multiple Related Integrals Fran˘cois-Xavier Briol University of Warwick (Department of Statistics) Imperial

Bayesian Numerical Integration

Sketch of Bayesian Quadrature

n=0 n=3 n=8

x

Inte

gran

d

Solution of the integral

Pos

terio

r di

strib

utio

n

F-X Briol (Warwick & Imperial) BQ for Multiple Related Integrals May 2018 8 / 32

Page 15: Bayesian Quadrature for Multiple Related Integrals · Bayesian Quadrature for Multiple Related Integrals Fran˘cois-Xavier Briol University of Warwick (Department of Statistics) Imperial

Bayesian Numerical Integration

Bayesian Quadrature

1 Place a Gaussian Process prior (assumed w.l.o.g. to have zero mean).2 Evaluate the integrand f at several locations {xi}ni=1 on X . We get a

Gaussian Process with mean and covariance function:

mn(x) = k(x ,X )k(X ,X )−1f (X )

kn(x , x ′) = k(x , x ′)− k(x ,X )k(X ,X )−1k(X , x ′)

3 Taking the pushforward through the integral operator, we get:

En[Π[f ]] = ΠBQ[f ] := Π[k(·,X )]k(X ,X )−1f (X )

Vn[Π[f ]] = ΠΠ[k]− Π[k(·,X )]k(X ,X )−1Π[k(X , ·)].

[1] Larkin, F. M. (1972). Gaussian measure in Hilbert space and applications in numericalanalysis. Rocky Mountain Journal of Mathematics, 2(3), 379422.[2] Diaconis, P. (1988). Bayesian Numerical Analysis. Statistical Decision Theory and RelatedTopics IV, 163175.[3] OHagan, A. (1991). Bayes-Hermite quadrature. Journal of Statistical Planning andInference, 29, 245-260.

F-X Briol (Warwick & Imperial) BQ for Multiple Related Integrals May 2018 9 / 32

Page 16: Bayesian Quadrature for Multiple Related Integrals · Bayesian Quadrature for Multiple Related Integrals Fran˘cois-Xavier Briol University of Warwick (Department of Statistics) Imperial

Bayesian Numerical Integration

Bayesian Quadrature

1 Place a Gaussian Process prior (assumed w.l.o.g. to have zero mean).2 Evaluate the integrand f at several locations {xi}ni=1 on X . We get a

Gaussian Process with mean and covariance function:

mn(x) = k(x ,X )k(X ,X )−1f (X )

kn(x , x ′) = k(x , x ′)− k(x ,X )k(X ,X )−1k(X , x ′)

3 Taking the pushforward through the integral operator, we get:

En[Π[f ]] = ΠBQ[f ] := Π[k(·,X )]k(X ,X )−1f (X )

Vn[Π[f ]] = ΠΠ[k]− Π[k(·,X )]k(X ,X )−1Π[k(X , ·)].

[1] Larkin, F. M. (1972). Gaussian measure in Hilbert space and applications in numericalanalysis. Rocky Mountain Journal of Mathematics, 2(3), 379422.[2] Diaconis, P. (1988). Bayesian Numerical Analysis. Statistical Decision Theory and RelatedTopics IV, 163175.[3] OHagan, A. (1991). Bayes-Hermite quadrature. Journal of Statistical Planning andInference, 29, 245-260.

F-X Briol (Warwick & Imperial) BQ for Multiple Related Integrals May 2018 9 / 32

Page 17: Bayesian Quadrature for Multiple Related Integrals · Bayesian Quadrature for Multiple Related Integrals Fran˘cois-Xavier Briol University of Warwick (Department of Statistics) Imperial

Bayesian Numerical Integration

Bayesian Quadrature

1 Place a Gaussian Process prior (assumed w.l.o.g. to have zero mean).2 Evaluate the integrand f at several locations {xi}ni=1 on X . We get a

Gaussian Process with mean and covariance function:

mn(x) = k(x ,X )k(X ,X )−1f (X )

kn(x , x ′) = k(x , x ′)− k(x ,X )k(X ,X )−1k(X , x ′)

3 Taking the pushforward through the integral operator, we get:

En[Π[f ]] = ΠBQ[f ] := Π[k(·,X )]k(X ,X )−1f (X )

Vn[Π[f ]] = ΠΠ[k]− Π[k(·,X )]k(X ,X )−1Π[k(X , ·)].

[1] Larkin, F. M. (1972). Gaussian measure in Hilbert space and applications in numericalanalysis. Rocky Mountain Journal of Mathematics, 2(3), 379422.[2] Diaconis, P. (1988). Bayesian Numerical Analysis. Statistical Decision Theory and RelatedTopics IV, 163175.[3] OHagan, A. (1991). Bayes-Hermite quadrature. Journal of Statistical Planning andInference, 29, 245-260.

F-X Briol (Warwick & Imperial) BQ for Multiple Related Integrals May 2018 9 / 32

Page 18: Bayesian Quadrature for Multiple Related Integrals · Bayesian Quadrature for Multiple Related Integrals Fran˘cois-Xavier Briol University of Warwick (Department of Statistics) Imperial

Bayesian Numerical Integration

Important Points

The probability measure represents our uncertainty about the value ofΠ[f ] due to the fact that we cant evaluate the integrand f everywhere.

We have chosen to model f (and hence Π[f ]) using a Gaussianmeasure. This is mostly for computational tractability, but isn’tnecessarily the right thing to do!

Notice that the formulae below do not specify where to evaluate f ,but provide a posterior given the points at which we have evaluated.

En[Π[f ]] = ΠBQ[f ] := Π[k(·,X )]k(X ,X )−1f (X )

Vn[Π[f ]] = ΠΠ[k]− Π[k(·,X )]k(X ,X )−1Π[k(X , ·)].

We have the freedom to combine these with MC, MCMC, QMC, etc...

F-X Briol (Warwick & Imperial) BQ for Multiple Related Integrals May 2018 10 / 32

Page 19: Bayesian Quadrature for Multiple Related Integrals · Bayesian Quadrature for Multiple Related Integrals Fran˘cois-Xavier Briol University of Warwick (Department of Statistics) Imperial

Bayesian Numerical Integration

Important Points

The probability measure represents our uncertainty about the value ofΠ[f ] due to the fact that we cant evaluate the integrand f everywhere.

We have chosen to model f (and hence Π[f ]) using a Gaussianmeasure. This is mostly for computational tractability, but isn’tnecessarily the right thing to do!

Notice that the formulae below do not specify where to evaluate f ,but provide a posterior given the points at which we have evaluated.

En[Π[f ]] = ΠBQ[f ] := Π[k(·,X )]k(X ,X )−1f (X )

Vn[Π[f ]] = ΠΠ[k]− Π[k(·,X )]k(X ,X )−1Π[k(X , ·)].

We have the freedom to combine these with MC, MCMC, QMC, etc...

F-X Briol (Warwick & Imperial) BQ for Multiple Related Integrals May 2018 10 / 32

Page 20: Bayesian Quadrature for Multiple Related Integrals · Bayesian Quadrature for Multiple Related Integrals Fran˘cois-Xavier Briol University of Warwick (Department of Statistics) Imperial

Bayesian Numerical Integration

Important Points

The probability measure represents our uncertainty about the value ofΠ[f ] due to the fact that we cant evaluate the integrand f everywhere.

We have chosen to model f (and hence Π[f ]) using a Gaussianmeasure. This is mostly for computational tractability, but isn’tnecessarily the right thing to do!

Notice that the formulae below do not specify where to evaluate f ,but provide a posterior given the points at which we have evaluated.

En[Π[f ]] = ΠBQ[f ] := Π[k(·,X )]k(X ,X )−1f (X )

Vn[Π[f ]] = ΠΠ[k]− Π[k(·,X )]k(X ,X )−1Π[k(X , ·)].

We have the freedom to combine these with MC, MCMC, QMC, etc...

F-X Briol (Warwick & Imperial) BQ for Multiple Related Integrals May 2018 10 / 32

Page 21: Bayesian Quadrature for Multiple Related Integrals · Bayesian Quadrature for Multiple Related Integrals Fran˘cois-Xavier Briol University of Warwick (Department of Statistics) Imperial

Bayesian Numerical Integration

Toy example

Toy problem: f (x) = sin(x) + 1 & Π is N (0, 1). We use the kernelk(x , y) = exp(−(x − y)2/l2) (with l = 1) and sample {xi}ni=1 ∼ Π.

F-X Briol (Warwick & Imperial) BQ for Multiple Related Integrals May 2018 11 / 32

Page 22: Bayesian Quadrature for Multiple Related Integrals · Bayesian Quadrature for Multiple Related Integrals Fran˘cois-Xavier Briol University of Warwick (Department of Statistics) Imperial

Bayesian Numerical Integration

Toy example: The effect of sampling

Toy problem: f (x) = sin(x) + 1 & Π is N (0, 1). We use the kernelk(x , y) = exp(−(x − y)2/l2) (with l = 1) and sample {xi}ni=1 ∼ N (0, σ2).

F-X Briol (Warwick & Imperial) BQ for Multiple Related Integrals May 2018 12 / 32

Page 23: Bayesian Quadrature for Multiple Related Integrals · Bayesian Quadrature for Multiple Related Integrals Fran˘cois-Xavier Briol University of Warwick (Department of Statistics) Imperial

Bayesian Numerical Integration

Toy example: The effect of sampling and kernel choice

Toy problem: f (x) = sin(x) + 1 & Π is N (0, 12). We use the kernelk(x , y) = exp(−(x − y)2/l2) and sample from {xi}ni=1 ∼ N (0, σ2).

Number of samples: 25

F-X Briol (Warwick & Imperial) BQ for Multiple Related Integrals May 2018 13 / 32

Page 24: Bayesian Quadrature for Multiple Related Integrals · Bayesian Quadrature for Multiple Related Integrals Fran˘cois-Xavier Briol University of Warwick (Department of Statistics) Imperial

Bayesian Numerical Integration

Toy example: The effect of sampling and kernel choice

Toy problem: f (x) = sin(x) + 1 & Π is N (0, 12). We use the kernelk(x , y) = exp(−(x − y)2/l2) and sample from {xi}ni=1 ∼ N (0, σ2).

Number of samples: 50

F-X Briol (Warwick & Imperial) BQ for Multiple Related Integrals May 2018 14 / 32

Page 25: Bayesian Quadrature for Multiple Related Integrals · Bayesian Quadrature for Multiple Related Integrals Fran˘cois-Xavier Briol University of Warwick (Department of Statistics) Imperial

Bayesian Numerical Integration

Toy example: The effect of sampling and kernel choice

Toy problem: f (x) = sin(x) + 1 & Π is N (0, 12). We use the kernelk(x , y) = exp(−(x − y)2/l2) and sample from {xi}ni=1 ∼ N (0, σ2).

Number of samples: 75

F-X Briol (Warwick & Imperial) BQ for Multiple Related Integrals May 2018 15 / 32

Page 26: Bayesian Quadrature for Multiple Related Integrals · Bayesian Quadrature for Multiple Related Integrals Fran˘cois-Xavier Briol University of Warwick (Department of Statistics) Imperial

Bayesian Numerical Integration

Calibration

Clearly, from the previous slide we can tell that hyperparameters will be ofgreat importance. To do this we have several alternatives:

Go full Bayesian and put a prior on the parameter, then look at theposterior predictive. Problem: We lose the conjugacy property andthis becomes very expensive to do!

Do some sort of cross validation. Problem: This is still expensive asrequires solving many linear systems.

Do some empirical Bayes, i.e. maximise the marginal likelihood of thedata. Problem: Not very Bayesian.

In practice, we tend to use empirical Bayes for speed reason...

F-X Briol (Warwick & Imperial) BQ for Multiple Related Integrals May 2018 16 / 32

Page 27: Bayesian Quadrature for Multiple Related Integrals · Bayesian Quadrature for Multiple Related Integrals Fran˘cois-Xavier Briol University of Warwick (Department of Statistics) Imperial

Bayesian Numerical Integration

Calibration

Clearly, from the previous slide we can tell that hyperparameters will be ofgreat importance. To do this we have several alternatives:

Go full Bayesian and put a prior on the parameter, then look at theposterior predictive. Problem: We lose the conjugacy property andthis becomes very expensive to do!

Do some sort of cross validation. Problem: This is still expensive asrequires solving many linear systems.

Do some empirical Bayes, i.e. maximise the marginal likelihood of thedata. Problem: Not very Bayesian.

In practice, we tend to use empirical Bayes for speed reason...

F-X Briol (Warwick & Imperial) BQ for Multiple Related Integrals May 2018 16 / 32

Page 28: Bayesian Quadrature for Multiple Related Integrals · Bayesian Quadrature for Multiple Related Integrals Fran˘cois-Xavier Briol University of Warwick (Department of Statistics) Imperial

Bayesian Numerical Integration

Calibration

Clearly, from the previous slide we can tell that hyperparameters will be ofgreat importance. To do this we have several alternatives:

Go full Bayesian and put a prior on the parameter, then look at theposterior predictive. Problem: We lose the conjugacy property andthis becomes very expensive to do!

Do some sort of cross validation. Problem: This is still expensive asrequires solving many linear systems.

Do some empirical Bayes, i.e. maximise the marginal likelihood of thedata. Problem: Not very Bayesian.

In practice, we tend to use empirical Bayes for speed reason...

F-X Briol (Warwick & Imperial) BQ for Multiple Related Integrals May 2018 16 / 32

Page 29: Bayesian Quadrature for Multiple Related Integrals · Bayesian Quadrature for Multiple Related Integrals Fran˘cois-Xavier Briol University of Warwick (Department of Statistics) Imperial

Bayesian Numerical Integration

Calibration

Clearly, from the previous slide we can tell that hyperparameters will be ofgreat importance. To do this we have several alternatives:

Go full Bayesian and put a prior on the parameter, then look at theposterior predictive. Problem: We lose the conjugacy property andthis becomes very expensive to do!

Do some sort of cross validation. Problem: This is still expensive asrequires solving many linear systems.

Do some empirical Bayes, i.e. maximise the marginal likelihood of thedata. Problem: Not very Bayesian.

In practice, we tend to use empirical Bayes for speed reason...

F-X Briol (Warwick & Imperial) BQ for Multiple Related Integrals May 2018 16 / 32

Page 30: Bayesian Quadrature for Multiple Related Integrals · Bayesian Quadrature for Multiple Related Integrals Fran˘cois-Xavier Briol University of Warwick (Department of Statistics) Imperial

Bayesian Numerical Integration

Convergence Results

So why does it converge fast?

For integration in an RKHS H (associated to the GP kernel k), thestandard quantity to consider is the worst-case error:

e(Π; Π,H

):= sup

‖f ‖H≤1

∣∣∣Π[f ]− Π[f ]∣∣∣

One can show that BQ attains optimal rates of convergence forcertain classes of smooth functions. In particular, with i.i.d points,one can get O(n−

αd

+ε) for spaces of smoothness α and

O(exp(−Cn1d−ε)) for infinitely smooth functions.

Briol, F.-X., Oates, C. J., Girolami, M., Osborne, M. A., & Sejdinovic, D. (2015). ProbabilisticIntegration: A Role for Statisticians in Numerical Analysis?, arXiv:1512.00933.

F-X Briol (Warwick & Imperial) BQ for Multiple Related Integrals May 2018 17 / 32

Page 31: Bayesian Quadrature for Multiple Related Integrals · Bayesian Quadrature for Multiple Related Integrals Fran˘cois-Xavier Briol University of Warwick (Department of Statistics) Imperial

Bayesian Numerical Integration

Application: Global Illumination

object

𝝎𝒊

𝝎𝒐

𝑆2

environment mapcamera

𝒏

𝐿𝑒(𝝎𝑜)

𝐿𝑖(𝝎𝑜)

We need to compute three integrals at each pixel:

Lo(ωo) = Le(wo) +

∫S2

Li (ωi )ρ(ωi , ωo)[ωi · n]+dσ(ωi )

[1] Marques, R., Bouville, C., Ribardiere, M., Santos, P., & Bouatouch, K. (2013). A sphericalGaussian framework for Bayesian Monte Carlo rendering of glossy surfaces. IEEE Transactionson Visualization and Computer Graphics, 19(10), 1619-1632.

F-X Briol (Warwick & Imperial) BQ for Multiple Related Integrals May 2018 18 / 32

Page 32: Bayesian Quadrature for Multiple Related Integrals · Bayesian Quadrature for Multiple Related Integrals Fran˘cois-Xavier Briol University of Warwick (Department of Statistics) Imperial

Bayesian Numerical Integration

Application: Global Illumination

Number of States (n)

102 103

0.00001

0.001

1MMD

MC BMC QMC BQMC

Inte

gra

l E

stim

ate

0.014

0.016

0.018

0.02

0.022Red Channel

Number of States (n)

102 103

Inte

gra

l E

stim

ate

0.016

0.017

0.018

0.019

0.02

×10-3

-4

-2

0

2

4Green Channel

Number of States (n)

102 103

×10-3

-2

-1

0

1

2

0.012

0.014

0.016

0.018

0.02Blue Channel

Number of States (n)

102 1030.014

0.015

0.016

0.017

0.018

The kernel used gives an RKHS norm-equivalent to a Sobolev spaceof smoothness 3

2 :

k(x , x ′) =8

3− ‖x − x ′‖2 for all x , x ′ ∈ S2.

We can show a convergence rate of e(ΠBMC; Π,H) = OP(n−34 )

which is optimal for this space!

F-X Briol (Warwick & Imperial) BQ for Multiple Related Integrals May 2018 19 / 32

Page 33: Bayesian Quadrature for Multiple Related Integrals · Bayesian Quadrature for Multiple Related Integrals Fran˘cois-Xavier Briol University of Warwick (Department of Statistics) Imperial

Bayesian Numerical Integration

Application: Global Illumination

Number of States (n)

102 10310-6

10-4

10-2

100MMD

MC

BMC

QMC

BQMC

Spreading the points and re-weighting can help significantly!

[1] Briol, F.-X., Oates, C. J., Girolami, M., & Osborne, M. A. (2015). Frank-Wolfe BayesianQuadrature: Probabilistic Integration with Theoretical Guarantees. In Advances In NeuralInformation Processing Systems 28 (pp. 1162–1170).[2] Briol, F.-X., Oates, C. J., Cockayne, J., Chen, W. Y., & Girolami, M. (2017). On theSampling Problem for Kernel Quadrature. In Proceedings of the 34th International Conferenceon Machine Learning (pp. 586–595).

F-X Briol (Warwick & Imperial) BQ for Multiple Related Integrals May 2018 20 / 32

Page 34: Bayesian Quadrature for Multiple Related Integrals · Bayesian Quadrature for Multiple Related Integrals Fran˘cois-Xavier Briol University of Warwick (Department of Statistics) Imperial

Bayesian Quadrature for Multiple Integrals

Bayesian Quadrature for Multiple Integrals

F-X Briol (Warwick & Imperial) BQ for Multiple Related Integrals May 2018 21 / 32

Page 35: Bayesian Quadrature for Multiple Related Integrals · Bayesian Quadrature for Multiple Related Integrals Fran˘cois-Xavier Briol University of Warwick (Department of Statistics) Imperial

Bayesian Quadrature for Multiple Integrals

What About Multiple Integrals?

In the example, we actually need to approximate thousands ofintegrals for each frame of a virtual environment... This is slow andexpensive!

We can formalise the process above as that of finding the integral ofa set of functions f1, . . . , fD against some measure Π. But what if weknow something about how f1 relates to f2, etc...?

It might make more sense to approximate the integral with aquadrature rule of the form:

Π[fd ] =D∑

d ′=1

n∑i=1

(Wi )dd ′fd ′(xd ′i )

where the weights would encode the correlation across functions.

F-X Briol (Warwick & Imperial) BQ for Multiple Related Integrals May 2018 22 / 32

Page 36: Bayesian Quadrature for Multiple Related Integrals · Bayesian Quadrature for Multiple Related Integrals Fran˘cois-Xavier Briol University of Warwick (Department of Statistics) Imperial

Bayesian Quadrature for Multiple Integrals

What About Multiple Integrals?

In the example, we actually need to approximate thousands ofintegrals for each frame of a virtual environment... This is slow andexpensive!

We can formalise the process above as that of finding the integral ofa set of functions f1, . . . , fD against some measure Π. But what if weknow something about how f1 relates to f2, etc...?

It might make more sense to approximate the integral with aquadrature rule of the form:

Π[fd ] =D∑

d ′=1

n∑i=1

(Wi )dd ′fd ′(xd ′i )

where the weights would encode the correlation across functions.

F-X Briol (Warwick & Imperial) BQ for Multiple Related Integrals May 2018 22 / 32

Page 37: Bayesian Quadrature for Multiple Related Integrals · Bayesian Quadrature for Multiple Related Integrals Fran˘cois-Xavier Briol University of Warwick (Department of Statistics) Imperial

Bayesian Quadrature for Multiple Integrals

What About Multiple Integrals?

In the example, we actually need to approximate thousands ofintegrals for each frame of a virtual environment... This is slow andexpensive!

We can formalise the process above as that of finding the integral ofa set of functions f1, . . . , fD against some measure Π. But what if weknow something about how f1 relates to f2, etc...?

It might make more sense to approximate the integral with aquadrature rule of the form:

Π[fd ] =D∑

d ′=1

n∑i=1

(Wi )dd ′fd ′(xd ′i )

where the weights would encode the correlation across functions.

F-X Briol (Warwick & Imperial) BQ for Multiple Related Integrals May 2018 22 / 32

Page 38: Bayesian Quadrature for Multiple Related Integrals · Bayesian Quadrature for Multiple Related Integrals Fran˘cois-Xavier Briol University of Warwick (Department of Statistics) Imperial

Bayesian Quadrature for Multiple Integrals

Bayesian Quadrature for Multiple Related Functions

We can use the same type of results for Gaussian Processes on theextended space of vector-valued functions f : X → RD (rather thanf : X → R) where f (x) = (f1(x), . . . , fD(x)).

This approach allow us to directly encode the relation between eachfunction fi by specifying the kernel K .

In this case the posterior distribution is a GP(mn,Kn) withvector-valued mean mn : X → RD and matrix-valued covarianceKn : X × X → RD×D :

mn(x) = K (x ,X )K (X ,X )−1f (X )

Kn(x , x ′) = K (x , x ′)−K (x ,X )K (X ,X )−1K (X , x ′).

The overall cost for computing this is O(n3D3).

Alvarez, M. A., Rosasco, L., & Lawrence, N. D. (2012). Kernels for vector-valuedfunctions: A review. Foundations and Trends in Machine Learning, 4(3), 195-266.

F-X Briol (Warwick & Imperial) BQ for Multiple Related Integrals May 2018 23 / 32

Page 39: Bayesian Quadrature for Multiple Related Integrals · Bayesian Quadrature for Multiple Related Integrals Fran˘cois-Xavier Briol University of Warwick (Department of Statistics) Imperial

Bayesian Quadrature for Multiple Integrals

Bayesian Quadrature for Multiple Related Functions

We can use the same type of results for Gaussian Processes on theextended space of vector-valued functions f : X → RD (rather thanf : X → R) where f (x) = (f1(x), . . . , fD(x)).

This approach allow us to directly encode the relation between eachfunction fi by specifying the kernel K .

In this case the posterior distribution is a GP(mn,Kn) withvector-valued mean mn : X → RD and matrix-valued covarianceKn : X × X → RD×D :

mn(x) = K (x ,X )K (X ,X )−1f (X )

Kn(x , x ′) = K (x , x ′)−K (x ,X )K (X ,X )−1K (X , x ′).

The overall cost for computing this is O(n3D3).

Alvarez, M. A., Rosasco, L., & Lawrence, N. D. (2012). Kernels for vector-valuedfunctions: A review. Foundations and Trends in Machine Learning, 4(3), 195-266.

F-X Briol (Warwick & Imperial) BQ for Multiple Related Integrals May 2018 23 / 32

Page 40: Bayesian Quadrature for Multiple Related Integrals · Bayesian Quadrature for Multiple Related Integrals Fran˘cois-Xavier Briol University of Warwick (Department of Statistics) Imperial

Bayesian Quadrature for Multiple Integrals

Consider multi-output Bayesian Quadrature with a GP(0,K ) prior onf = (f1, . . . , fD)>. The posterior distribution on Π[f ] is aD-dimensional Gaussian with mean and covariance matrix:

EN [Π[f ]] = Π[K (·,X )]K (X ,X )−1f (X )

VN [Π[f ]] = ΠΠ [K ]− Π[K (·,X )]K (X ,X )−1Π[K (X , ·)]

Kernel evaluations are now matrix-valued (i.e. in RD×D) as opposedto scalar-valued. A simple example is the following separable kernel:

K (x , x ′) = Bk(x , x ′)

B encodes the covariance between function, and k the type offunction in each of the components.

In this case, we can reduce the cost to O(n3 + D3).

F-X Briol (Warwick & Imperial) BQ for Multiple Related Integrals May 2018 24 / 32

Page 41: Bayesian Quadrature for Multiple Related Integrals · Bayesian Quadrature for Multiple Related Integrals Fran˘cois-Xavier Briol University of Warwick (Department of Statistics) Imperial

Bayesian Quadrature for Multiple Integrals

Consider multi-output Bayesian Quadrature with a GP(0,K ) prior onf = (f1, . . . , fD)>. The posterior distribution on Π[f ] is aD-dimensional Gaussian with mean and covariance matrix:

EN [Π[f ]] = Π[K (·,X )]K (X ,X )−1f (X )

VN [Π[f ]] = ΠΠ [K ]− Π[K (·,X )]K (X ,X )−1Π[K (X , ·)]

Kernel evaluations are now matrix-valued (i.e. in RD×D) as opposedto scalar-valued. A simple example is the following separable kernel:

K (x , x ′) = Bk(x , x ′)

B encodes the covariance between function, and k the type offunction in each of the components.

In this case, we can reduce the cost to O(n3 + D3).

F-X Briol (Warwick & Imperial) BQ for Multiple Related Integrals May 2018 24 / 32

Page 42: Bayesian Quadrature for Multiple Related Integrals · Bayesian Quadrature for Multiple Related Integrals Fran˘cois-Xavier Briol University of Warwick (Department of Statistics) Imperial

Bayesian Quadrature for Multiple Integrals

Some Theory for Multi-output BQ

Define the individual worst-case errors:

e(Π; Π,HK , d) = sup‖f ‖K≤1

∣∣∣Π[fd ]− Π[fd ]∣∣∣

Theorem (Convergence rate for BQ with separable kernel)

Suppose we want to approximate Π[f ] for some f : X → RD and ΠBQ[f ]is the multi-output BQ rule with the kernel K (x , x ′) = Bk(x , x ′) for somepositive definite B ∈ RD×D and scalar-valued kernel k : X ×X → R whereall functions are evaluated on a point set {xi}ni=1.Then, ∀d = 1, . . . ,D, we have:

e(ΠBQ; Π,HK , d) = O(e(ΠBQ; Π,Hk)

)F-X Briol (Warwick & Imperial) BQ for Multiple Related Integrals May 2018 25 / 32

Page 43: Bayesian Quadrature for Multiple Related Integrals · Bayesian Quadrature for Multiple Related Integrals Fran˘cois-Xavier Briol University of Warwick (Department of Statistics) Imperial

Bayesian Quadrature for Multiple Integrals

Some Theory for Multi-output BQ

Assume that X ⊂ Rp is a domain with Lipschitz boundary and satisfyingan interior cone condition. Furthermore, assume the {xi}ni=1 are either:(A1) IID samples from some distribution Π′ with density π′ > 0 on X , or(A2) obtained from a quasi-uniform grid on X .

If Hk is norm-equivalent to an RKHS with Matern kernel ofsmoothness α > p

2 , we have ∀d = 1, . . . ,D:

e(HK , ΠBQ,X , d) = O(n−

αp

+ε).

for ε > 0 arbitrarily small.

If Hk is norm-equivalent to the RKHS with squared-exponential,multiquadric or inverse multiquadric kernel, we have ∀d = 1, . . . ,D:

e(HK , ΠBQ,X , d) = O(

exp(−C1n

1p−ε))

.

for some C1 > 0 and ε > 0 arbitrarily small.

F-X Briol (Warwick & Imperial) BQ for Multiple Related Integrals May 2018 26 / 32

Page 44: Bayesian Quadrature for Multiple Related Integrals · Bayesian Quadrature for Multiple Related Integrals Fran˘cois-Xavier Briol University of Warwick (Department of Statistics) Imperial

Bayesian Quadrature for Multiple Integrals

Some Theory for Multi-output BQ

Assume that X ⊂ Rp is a domain with Lipschitz boundary and satisfyingan interior cone condition. Furthermore, assume the {xi}ni=1 are either:(A1) IID samples from some distribution Π′ with density π′ > 0 on X , or(A2) obtained from a quasi-uniform grid on X .

If Hk is norm-equivalent to an RKHS with Matern kernel ofsmoothness α > p

2 , we have ∀d = 1, . . . ,D:

e(HK , ΠBQ,X , d) = O(n−

αp

+ε).

for ε > 0 arbitrarily small.

If Hk is norm-equivalent to the RKHS with squared-exponential,multiquadric or inverse multiquadric kernel, we have ∀d = 1, . . . ,D:

e(HK , ΠBQ,X , d) = O(

exp(−C1n

1p−ε))

.

for some C1 > 0 and ε > 0 arbitrarily small.

F-X Briol (Warwick & Imperial) BQ for Multiple Related Integrals May 2018 26 / 32

Page 45: Bayesian Quadrature for Multiple Related Integrals · Bayesian Quadrature for Multiple Related Integrals Fran˘cois-Xavier Briol University of Warwick (Department of Statistics) Imperial

Bayesian Quadrature for Multiple Integrals

Some Theory for Multi-output BQ

Assume that X ⊂ Rp is a domain with Lipschitz boundary and satisfyingan interior cone condition. Furthermore, assume the {xi}ni=1 are either:(A1) IID samples from some distribution Π′ with density π′ > 0 on X , or(A2) obtained from a quasi-uniform grid on X .

If Hk is norm-equivalent to an RKHS with Matern kernel ofsmoothness α > p

2 , we have ∀d = 1, . . . ,D:

e(HK , ΠBQ,X , d) = O(n−

αp

+ε).

for ε > 0 arbitrarily small.

If Hk is norm-equivalent to the RKHS with squared-exponential,multiquadric or inverse multiquadric kernel, we have ∀d = 1, . . . ,D:

e(HK , ΠBQ,X , d) = O(

exp(−C1n

1p−ε))

.

for some C1 > 0 and ε > 0 arbitrarily small.

F-X Briol (Warwick & Imperial) BQ for Multiple Related Integrals May 2018 26 / 32

Page 46: Bayesian Quadrature for Multiple Related Integrals · Bayesian Quadrature for Multiple Related Integrals Fran˘cois-Xavier Briol University of Warwick (Department of Statistics) Imperial

Bayesian Quadrature for Multiple Integrals

Theory in the Misspecified Case

Theorem (Misspecified Convergence Result for Separable Kernel)

Let kα be a kernel norm-equivalent to a Matern kernel on some domain Xwith Lipschitz boundary and interior cone condition and consider the BQrule ΠBQ[f ] corresponding to a separable kernel Kα(x , x ′) = Bkα(x , x ′).

Suppose {xi}ni=1 satisfies (A2), and f ∈ HCβwhere p

2 ≤ β ≤ α.Then, ∀d = 1, . . . ,D:∣∣∣Π[fd ]− ΠBQ[fd ]

∣∣∣ = O(n−

βp

+ε)

for some ε > 0.

F-X Briol (Warwick & Imperial) BQ for Multiple Related Integrals May 2018 27 / 32

Page 47: Bayesian Quadrature for Multiple Related Integrals · Bayesian Quadrature for Multiple Related Integrals Fran˘cois-Xavier Briol University of Warwick (Department of Statistics) Imperial

Bayesian Quadrature for Multiple Integrals

Multi-output BQ for the Computer Graphics Example

object

𝝎𝒊

𝝎𝒐

𝑆2

environment mapcamera

𝒏

𝐿𝑒(𝝎𝑜)

𝐿𝑖(𝝎𝑜)

We compute integrals for different integrands based on various anglesω0 (akin to a camera moving).

We pick a separable kernel K (x , x ′) = Bk(x , x ′) where B is chosen torepresent the angle between integrands and k(x , x ′) = 8

3 −‖x −x ′‖2.

We can prove that the worst-case integration error converges at a

rate O(n−34 ) for each integrand. This is the same rate as uni-output

BQ (but we usually improve on constants).

F-X Briol (Warwick & Imperial) BQ for Multiple Related Integrals May 2018 28 / 32

Page 48: Bayesian Quadrature for Multiple Related Integrals · Bayesian Quadrature for Multiple Related Integrals Fran˘cois-Xavier Briol University of Warwick (Department of Statistics) Imperial

Bayesian Quadrature for Multiple Integrals

Multi-output BQ for the Computer Graphics Example

object

𝝎𝒊

𝝎𝒐

𝑆2

environment mapcamera

𝒏

𝐿𝑒(𝝎𝑜)

𝐿𝑖(𝝎𝑜)

We compute integrals for different integrands based on various anglesω0 (akin to a camera moving).

We pick a separable kernel K (x , x ′) = Bk(x , x ′) where B is chosen torepresent the angle between integrands and k(x , x ′) = 8

3 −‖x −x ′‖2.

We can prove that the worst-case integration error converges at a

rate O(n−34 ) for each integrand. This is the same rate as uni-output

BQ (but we usually improve on constants).

F-X Briol (Warwick & Imperial) BQ for Multiple Related Integrals May 2018 28 / 32

Page 49: Bayesian Quadrature for Multiple Related Integrals · Bayesian Quadrature for Multiple Related Integrals Fran˘cois-Xavier Briol University of Warwick (Department of Statistics) Imperial

Bayesian Quadrature for Multiple Integrals

Multi-output BQ for the Computer Graphics Example

object

𝝎𝒊

𝝎𝒐

𝑆2

environment mapcamera

𝒏

𝐿𝑒(𝝎𝑜)

𝐿𝑖(𝝎𝑜)

We compute integrals for different integrands based on various anglesω0 (akin to a camera moving).

We pick a separable kernel K (x , x ′) = Bk(x , x ′) where B is chosen torepresent the angle between integrands and k(x , x ′) = 8

3 −‖x −x ′‖2.

We can prove that the worst-case integration error converges at a

rate O(n−34 ) for each integrand. This is the same rate as uni-output

BQ (but we usually improve on constants).

F-X Briol (Warwick & Imperial) BQ for Multiple Related Integrals May 2018 28 / 32

Page 50: Bayesian Quadrature for Multiple Related Integrals · Bayesian Quadrature for Multiple Related Integrals Fran˘cois-Xavier Briol University of Warwick (Department of Statistics) Imperial

Bayesian Quadrature for Multiple Integrals

Multi-output BQ for the Computer Graphics Example

F-X Briol (Warwick & Imperial) BQ for Multiple Related Integrals May 2018 29 / 32

Page 51: Bayesian Quadrature for Multiple Related Integrals · Bayesian Quadrature for Multiple Related Integrals Fran˘cois-Xavier Briol University of Warwick (Department of Statistics) Imperial

Bayesian Quadrature for Multiple Integrals

Application: Multifidelity Modelling

In each case, we have access to a cheap simulator/function (left) and anexpensive simulator/function (right).

blue = truth, red = uni-output, yellow & purple = two outputs[1] Perdikaris, P., Raissi, M., Damianou, A., Lawrence, N. D., & Karniadakis, G. E. (2016).Nonlinear information fusion algorithms for robust multi-fidelity modeling. Proceedings of theRoyal Society A: Mathematical, Physical, and Engineering Sciences, 473(2198).

F-X Briol (Warwick & Imperial) BQ for Multiple Related Integrals May 2018 30 / 32

Page 52: Bayesian Quadrature for Multiple Related Integrals · Bayesian Quadrature for Multiple Related Integrals Fran˘cois-Xavier Briol University of Warwick (Department of Statistics) Imperial

Bayesian Quadrature for Multiple Integrals

Application: Multifidelity Modelling

In multifidelity modelling, multi-output Gaussian processes are alreadybeing used as efficient surrogate models.

Furthermore, we are in general interested in some statistic of theexpensive surrogate model. We therefore might as well re-use thismulti-output GP approximate the integral.

Results:

Model 1-output BQ 2-output BQ

Step function (l) 0.024 (0.223) 0.021 (0.213)Step function (h) 0.405 (0.03) 0.09 (0.091)

Forrester function (l) 0.076 (4.913) 0.076 (4.951)Forrester function (h) 3.962 (3.984) 2.856 (27.01)

F-X Briol (Warwick & Imperial) BQ for Multiple Related Integrals May 2018 31 / 32

Page 53: Bayesian Quadrature for Multiple Related Integrals · Bayesian Quadrature for Multiple Related Integrals Fran˘cois-Xavier Briol University of Warwick (Department of Statistics) Imperial

Bayesian Quadrature for Multiple Integrals

References

[1] Larkin, F. M. (1972). Gaussian measure in Hilbert space and applications in numericalanalysis. Rocky Mountain Journal of Mathematics, 2(3), 379422.

[2] Diaconis, P. (1988). Bayesian Numerical Analysis. Statistical Decision Theory and RelatedTopics IV, 163175.

[3] O’Hagan, A. (1991). Bayes-Hermite quadrature. Journal of Statistical Planning andInference, 29, 245260.

[4] Rasmussen, C., & Ghahramani, Z. (2002). Bayesian Monte Carlo. In Advances in NeuralInformation Processing Systems (pp. 489496).

[5] Briol, F.-X., Oates, C. J., Girolami, M., Osborne, M. A., & Sejdinovic, D. (2015).Probabilistic Integration: A Role for Statisticians in Numerical Analysis?, arXiv:1512.00933.

[6] Xi, X., Briol, F.-X., & Girolami, M. (2018). Bayesian Quadrature for Multiple RelatedIntegrals. arXiv:1801.04153.

F-X Briol (Warwick & Imperial) BQ for Multiple Related Integrals May 2018 32 / 32