
Introduction to polynomial chaos with NISP

Michael Baudin

Version 0.5, March 2015

Abstract

This document is an introduction to polynomial chaos with the NISP module for Scilab.

Contents

1 Orthogonal polynomials
   1.1 Definitions and examples
   1.2 Orthogonal polynomials for probabilities
   1.3 Quadrature and the Lagrange polynomial
   1.4 Gaussian quadrature and orthogonal polynomials
   1.5 The three term recurrence
   1.6 Accuracy of roots of polynomials
   1.7 Computing the nodes and weights
   1.8 Sturm-Liouville equations
   1.9 Notes and references

2 Concrete orthogonal polynomials
   2.1 Hermite polynomials
   2.2 Legendre polynomials
   2.3 Laguerre polynomials
   2.4 Chebyshev polynomials of the first kind
   2.5 Accuracy of evaluation of the polynomial
   2.6 Notes and references

3 Multivariate polynomials
   3.1 Occupancy problems
   3.2 Multivariate monomials
   3.3 Multivariate polynomials
   3.4 Generating multi-indices
   3.5 Multivariate orthogonal functions
   3.6 Tensor product of orthogonal polynomials
   3.7 Multivariate orthogonal polynomials and probabilities
   3.8 Notes and references

4 Polynomial chaos
   4.1 Introduction
   4.2 Truncated decomposition
   4.3 Univariate decomposition examples
   4.4 Generalized polynomial chaos
   4.5 Notes and references

5 Acknowledgments


A Integrals
   A.1 Gaussian integral
   A.2 Weighted Gaussian integral of powers of x
   A.3 A Legendre integral
   A.4 A Laguerre integral

Bibliography


Copyright © 2013-2015 Michael Baudin.
This file must be used under the terms of the Creative Commons Attribution-ShareAlike 3.0 Unported License:

http://creativecommons.org/licenses/by-sa/3.0


Figure 1: The Gaussian weight function w(x) = exp(-x²/2).

1 Orthogonal polynomials

In this section, we define the orthogonal polynomials which are used in the context of polynomial chaos decomposition. We first present the weight function which defines the orthogonality of the square integrable functions. Then we analyze the link between orthogonal polynomials and quadrature.

1.1 Definitions and examples

We will denote by I an interval in R. The interval [a, b] is the set of real numbers x such that x ≥ a and x ≤ b, for two real numbers a and b. The open interval (−∞, +∞) is also considered as an interval, although its boundaries are infinite. The unbounded intervals (−∞, b) and (a, +∞) are also valid intervals.

Definition 1.1. (Weight function) Let I be an interval in R. A weight function w is a nonnegative, continuous, integrable function of x ∈ I.

Example 1.2. (Weight function for Hermite polynomials) The weight function for Hermite polynomials is

\[ w(x) = \exp\left(-\frac{x^2}{2}\right), \tag{1} \]

for x ∈ R. This function is presented in figure 1. Clearly, this function is differentiable and positive, since exp(-x²/2) > 0 for any finite x. The integral

\[ \int_{\mathbb{R}} \exp(-x^2/2)\,dx = \sqrt{2\pi} \tag{2} \]

is called the Gaussian integral (see proposition A.1 for the proof).

Definition 1.3. (Weighted L² space on R) Let I be an interval in R and let L^2_w(I) be the set of functions g which are square integrable with respect to the weight function w, i.e. such that the integral

\[ \|g\|^2 = \int_I g(x)^2 w(x)\,dx \tag{3} \]

is finite. In this case, the norm of g is ‖g‖.


Example 1.4. (Functions in L^2_w(R)) Consider the Gaussian weight w(x) = exp(-x²/2), for x ∈ R. Obviously, the function g(x) = 1 is in L^2_w(R). Now consider the function g(x) = x, for x ∈ R. Proposition A.3 (see the appendix) proves that

\[ \int_{\mathbb{R}} x^2 \exp(-x^2/2)\,dx = \sqrt{2\pi}. \]

Therefore, g ∈ L^2_w(R).

Definition 1.5. (Inner product in L^2_w(I) space) For any g, h ∈ L^2_w(I), the inner product of g and h is

\[ (g, h) = \int_I g(x) h(x) w(x)\,dx. \tag{4} \]

Assume that g ∈ L^2_w(I). We can combine equations 3 and 4, which implies that the L^2_w(I) norm of g can be expressed as an inner product:

\[ \|g\|^2 = (g, g). \tag{5} \]

Example 1.6. (Inner product in L^2_w(R)) Consider the Gaussian weight w(x) = exp(-x²/2), for x ∈ R. Then consider the functions g(x) = 1 and h(x) = x. We have

\[
\int_{\mathbb{R}} g(x)h(x)w(x)\,dx = \int_{\mathbb{R}} x \exp(-x^2/2)\,dx
= \int_{-\infty}^{0} x \exp(-x^2/2)\,dx + \int_{0}^{\infty} x \exp(-x^2/2)\,dx.
\]

However,

\[ \int_{-\infty}^{0} x \exp(-x^2/2)\,dx = -\int_{0}^{\infty} x \exp(-x^2/2)\,dx, \]

after the change of variable y = −x. Therefore,

\[ \int_{\mathbb{R}} g(x)h(x)w(x)\,dx = 0, \]

which implies (g, h) = 0.

The L^2_w(I) space is a Hilbert space.

Definition 1.7. (Orthogonality in L^2_w(I) space) Let I be an interval of R. Let g and h be two functions in L^2_w(I). The functions g and h are orthogonal if

\[ (g, h) = 0. \]

Definition 1.8. (Polynomials) We denote by P_n the set of real polynomials with degree at most n, i.e. p_n ∈ P_n if

\[ p_n(x) = a_{n+1} x^n + a_n x^{n-1} + \dots + a_1, \tag{6} \]

for any x ∈ I, where a_{n+1}, a_n, ..., a_1 are real numbers. If a_{n+1} ≠ 0, the degree of the polynomial p_n is n. If the leading coefficient is a_{n+1} = 1, then we say that the polynomial p_n is monic.

Definition 1.9. (Orthogonal polynomials) The set of polynomials {p_n}_{n≥0} are orthogonal polynomials if p_n is a polynomial of degree n and

\[ (p_i, p_j) = 0 \]

for i ≠ j.


In this orthogonality context, we will use the Kronecker symbol.

Definition 1.10. (Kronecker symbol) For any integers i and j, the Kronecker symbol is

\[ \delta_{ij} = \begin{cases} 1 & \text{if } i = j, \\ 0 & \text{otherwise.} \end{cases} \tag{7} \]

Definition 1.11. (Orthonormal polynomials) The set of polynomials {p_n}_{n≥0} are orthonormal polynomials if p_n is a polynomial of degree n and

\[ (p_i, p_j) = \delta_{ij} \]

for any integers i, j ≥ 0.

We assume that, for all orthogonal polynomials, the degree zero polynomial P_0 is equal to one:

\[ P_0(x) = 1, \tag{8} \]

for any x ∈ I.

Proposition 1.12. (Integral of orthogonal polynomials) Let {p_n}_{n≥0} be orthogonal polynomials. We have

\[ \int_I P_0(x) w(x)\,dx = \int_I w(x)\,dx. \tag{9} \]

Moreover, for n ≥ 1, we have

\[ \int_I p_n(x) w(x)\,dx = 0. \tag{10} \]

Proof. Equation 9 is a straightforward consequence of equation 8. Moreover, for any n ≥ 1, we have

\[ \int_I p_n(x) w(x)\,dx = \int_I P_0(x) p_n(x) w(x)\,dx = (P_0, p_n) = 0, \]

by the orthogonality property.

Figure 2 presents the function He_n(x)w(x), for n = 0, 1, ..., 5, where w is the Gaussian weight w(x) = exp(-x²/2) and He_n is the Hermite polynomial of degree n (which will be defined precisely in the next section). Proposition 1.12 states that the only n for which the integral is nonzero is n = 0. Indeed, for n ≥ 1, we can see that the area above zero is equal to the area below zero. This is more obvious for n = 1, 3, 5, i.e. for n odd, because the function He_n(x) is antisymmetric, i.e. He_n(−x) = −He_n(x).
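Proposition 1.12 can be checked numerically. The following sketch (in Python, while the document's own code is Scilab) integrates He_n(x)w(x) with a simple midpoint rule, using the first probabilists' Hermite polynomials He_0 = 1, He_1 = x, He_2 = x² − 1, which anticipate section 2.1; only n = 0 gives a nonzero integral.

```python
import math

# Midpoint-rule check of proposition 1.12 for the Gaussian weight
# w(x) = exp(-x^2/2).  He_0, He_1, He_2 are the first probabilists'
# Hermite polynomials (see section 2.1).
def w(x):
    return math.exp(-x * x / 2)

polys = [lambda x: 1.0, lambda x: x, lambda x: x * x - 1.0]

def integral(g, a=-10.0, b=10.0, m=200_000):
    # midpoint rule; the tails beyond |x| = 10 are negligible for w
    h = (b - a) / m
    return h * sum(g(a + (i + 0.5) * h) for i in range(m))

for n, p in enumerate(polys):
    # n = 0 gives about sqrt(2*pi) = 2.5066...; n = 1 and n = 2 give about 0
    print(n, integral(lambda x, p=p: p(x) * w(x)))
```

The n = 0 value is the Gaussian integral (2); the others vanish, as predicted by equation 10.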

1.2 Orthogonal polynomials for probabilities

In this section, we present the properties of p_n(X), when X is a random variable associated with the orthogonal polynomials {p_n}_{n≥0}.

Proposition 1.13. (Distribution from a weight) Let I be an interval in R and let w be a weight function on I. The function

\[ f(x) = \frac{w(x)}{\int_I w(x)\,dx}, \tag{11} \]

for any x ∈ I, is a distribution function.


Figure 2: The function He_n(x)w(x), for n = 0, 1, ..., 5, and the Gaussian weight w(x) = exp(-x²/2).

Proof. Indeed, its integral is equal to one:

\[ \int_I f(x)\,dx = \frac{1}{\int_I w(x)\,dx} \int_I w(x)\,dx = 1, \]

which concludes the proof.

Example 1.14. (Distribution function for Hermite polynomials) Equations 1 and 2 imply that the distribution function for Hermite polynomials is

\[ f(x) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{x^2}{2}\right), \]

for x ∈ R. The function f(x) is presented in figure 3.

Proposition 1.15. (Expectation of orthogonal polynomials) Let {p_n}_{n≥0} be orthogonal polynomials. Assume that X is a random variable associated with the probability distribution function f, derived from the weight function w. We have

\[ E(P_0(X)) = 1. \tag{12} \]

Moreover, for n ≥ 1, we have

\[ E(p_n(X)) = 0. \tag{13} \]

Proof. By definition of the expectation,

\[ E(P_0(X)) = \int_I P_0(x) f(x)\,dx = \int_I f(x)\,dx = 1, \]

since f is a distribution function. Moreover, for any n ≥ 1, we have

\[ E(p_n(X)) = \int_I p_n(x) f(x)\,dx = \frac{1}{\int_I w(x)\,dx} \int_I p_n(x) w(x)\,dx = 0, \]

where the second equality derives from equation 11, and the third is implied by equation 10.

Figure 3: The Gaussian distribution function \( f(x) = \frac{1}{\sqrt{2\pi}} \exp(-x^2/2) \).

Proposition 1.16. (Variance of orthogonal polynomials) Let {p_n}_{n≥0} be orthogonal polynomials. Assume that X is a random variable associated with the probability distribution function f, derived from the weight function w. We have

\[ V(P_0(X)) = 0. \tag{14} \]

Moreover, for n ≥ 1, we have

\[ V(p_n(X)) = \frac{\|p_n\|^2}{\int_I w(x)\,dx}. \tag{15} \]

Proof. Equation 14 is implied by the fact that P_0 is a constant. Moreover, for n ≥ 1, we have

\[
V(p_n(X)) = E\left((p_n(X) - E(p_n(X)))^2\right) = E\left(p_n(X)^2\right)
= \int_I p_n(x)^2 f(x)\,dx = \frac{1}{\int_I w(x)\,dx} \int_I p_n(x)^2 w(x)\,dx,
\]

where the second equality is implied by equation 13.

Example 1.17. In order to experimentally check propositions 1.15 and 1.16, we consider Hermite polynomials. We generate 10000 pseudo-random outcomes of a standard normal variable X, and compute He_n(X). In the following Scilab session, the first column prints n, the second column prints the empirical mean, the third column prints the empirical variance and the last column prints n!, which is the exact variance.


Distrib.    Support    Poly.      w(x)            f(x)                    ‖p_n‖²       V(p_n)
N(0,1)      R          Hermite    exp(-x²/2)      exp(-x²/2)/√(2π)        √(2π) n!     n!
U(-1,1)     [-1,1]     Legendre   1               1/2                     2/(2n+1)     1/(2n+1)
E(1)        R⁺         Laguerre   exp(-x)         exp(-x)                 1            1

Figure 4: Some properties of orthogonal polynomials.

n mean(X) variance(X) n!

0. 1. 0. 1.

1. 0.0041817 0.9978115 1.

2. - 0.0021810 2.0144023 2.

3. 0.0179225 6.1480027 6.

4. 0.0231110 24.483042 24.

5. - 0.0114358 112.13277 120.
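The Scilab session above can be reproduced in any language with a normal generator. The following Python sketch runs the same experiment, using the standard probabilists' Hermite recurrence He_{k+1}(x) = x He_k(x) − k He_{k−1}(x) (stated in section 2.1) and a larger sample of 100000 outcomes to tighten the estimates; the printed figures differ from the session above because the random streams differ.

```python
import math
import random

# Monte Carlo check of propositions 1.15 and 1.16: for a standard normal X,
# the empirical mean of He_n(X) approaches 0 (for n >= 1) and the empirical
# variance approaches n!.
random.seed(12345)
samples = [random.gauss(0.0, 1.0) for _ in range(100_000)]

def hermite(n, x):
    # probabilists' Hermite recurrence: He_{k+1} = x He_k - k He_{k-1}
    hm1, h = 0.0, 1.0          # He_{-1} = 0, He_0 = 1
    for k in range(n):
        hm1, h = h, x * h - k * hm1
    return h

for n in range(4):
    values = [hermite(n, x) for x in samples]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / (len(values) - 1)
    print(n, round(mean, 4), round(var, 4), math.factorial(n))
```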

Proposition 1.18. Let {p_n}_{n≥0} be orthogonal polynomials. Assume that X is a random variable associated with the probability distribution function f, derived from the weight function w. For two integers i, j ≥ 0, we have

\[ E(p_i(X) p_j(X)) = 0 \tag{16} \]

if i ≠ j. Moreover, if i ≥ 1, then

\[ E(p_i(X)^2) = V(p_i(X)). \tag{17} \]

Proof. We have

\[
E(p_i(X)p_j(X)) = \int_I p_i(x)p_j(x) f(x)\,dx
= \frac{1}{\int_I w(x)\,dx} \int_I p_i(x)p_j(x) w(x)\,dx
= \frac{(p_i, p_j)}{\int_I w(x)\,dx}.
\]

If i ≠ j, the orthogonality of the polynomials implies 16. If, on the other hand, we have i = j ≥ 1, then

\[ E(p_i(X)^2) = \frac{(p_i, p_i)}{\int_I w(x)\,dx} = \frac{\|p_i\|^2}{\int_I w(x)\,dx}. \]

We then use equation 15, which leads to equation 17.

In the next sections, we review a few orthogonal polynomials which are important in the context of polynomial chaos. Figure 4 summarizes the results.

1.3 Quadrature and the Lagrange polynomial

In this section, we show that Lagrange interpolation with n + 1 nodes leads to quadrature rules whose degree of exactness is at least n.

Assume that we want to compute

\[ I(f) = \int_I f(x) w(x)\,dx. \tag{18} \]


Definition 1.19. (Quadrature rule) Let x_1, x_2, ..., x_{n+1} be n + 1 real numbers in the interval I, called the quadrature nodes. Let α_1, α_2, ..., α_{n+1} be n + 1 nonnegative real numbers, called the weights. A quadrature rule is a formula

\[ I_n(f) = \sum_{i=1}^{n+1} \alpha_i f(x_i), \tag{19} \]

used to approximate I(f).

Definition 1.20. (Lagrange interpolation polynomial) Let l_i be the polynomial

\[ l_i(x) = \prod_{\substack{j=1 \\ j \neq i}}^{n+1} \frac{x - x_j}{x_i - x_j}, \tag{20} \]

for any x ∈ I. The Lagrange interpolating polynomial is

\[ \pi_n(x) = \sum_{i=1}^{n+1} f(x_i) l_i(x), \tag{21} \]

for any x ∈ I.

Definition 1.21. (Degree of exactness) The degree of exactness of a quadrature rule is d if d is the largest degree for which we have

\[ I(p_d) = I_n(p_d), \tag{22} \]

for any polynomial p_d ∈ P_d.

In other words, the degree of exactness of a quadrature rule is the degree of the polynomial of largest degree that the rule integrates exactly.

Proposition 1.22. (Degree of exactness of Lagrange interpolation) If

\[ \alpha_i = \int_I l_i(x) w(x)\,dx \tag{23} \]

for i = 1, 2, ..., n + 1, then the degree of exactness of the associated quadrature rule is at least n. In this case, we say that the quadrature rule is interpolatory.

The previous proposition implies that the weights of Lagrange integration must be chosen according to equation 23, but this does not impose any condition on the nodes.

Proof. Let p_n ∈ P_n be a degree n polynomial. Let π_n be the Lagrange interpolating polynomial for p_n. Since these are two degree n polynomials equal at n + 1 points, the polynomial π_n is equal to p_n, i.e. p_n(x) = π_n(x), for any x ∈ I. Therefore, equation 21 implies

\[
I(p_n) = \int_I p_n(x) w(x)\,dx = \int_I \pi_n(x) w(x)\,dx
= \int_I \left( \sum_{i=1}^{n+1} p_n(x_i) l_i(x) \right) w(x)\,dx
= \sum_{i=1}^{n+1} p_n(x_i) \int_I l_i(x) w(x)\,dx
= \sum_{i=1}^{n+1} p_n(x_i) \alpha_i = I_n(p_n),
\]

where we have used equation 23 in the next-to-last equality.
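As an illustration of equation 23, the interpolatory weights can be computed exactly in a simple case. The sketch below (Python with exact rational arithmetic; the document's own scripts are in Scilab) assumes w = 1 on I = [−1, 1] and the nodes −1, 0, 1, and recovers the Simpson weights 1/3, 4/3, 1/3.

```python
from fractions import Fraction

# Interpolatory weights (equation 23) for w = 1 on I = [-1, 1] with the
# nodes -1, 0, 1 (n = 2, hence n + 1 = 3 nodes).  The weights recover
# Simpson's rule: alpha = 1/3, 4/3, 1/3.
nodes = [Fraction(-1), Fraction(0), Fraction(1)]

def lagrange_coeffs(i):
    """Coefficients [c_0, c_1, ...] of l_i(x) = prod_{j!=i} (x - x_j)/(x_i - x_j)."""
    c = [Fraction(1)]
    for j, xj in enumerate(nodes):
        if j == i:
            continue
        # multiply by (x - x_j), then divide by (x_i - x_j)
        c = [(c[k - 1] if k > 0 else 0) - xj * (c[k] if k < len(c) else 0)
             for k in range(len(c) + 1)]
        c = [ck / (nodes[i] - xj) for ck in c]
    return c

def integrate(c):
    # integral over [-1, 1] of sum_k c_k x^k (odd powers vanish)
    return sum(2 * ck / (k + 1) for k, ck in enumerate(c) if k % 2 == 0)

alpha = [integrate(lagrange_coeffs(i)) for i in range(len(nodes))]
print(alpha)
# the rule is exact on P_2: sum alpha_i x_i^2 equals int_{-1}^{1} x^2 dx = 2/3
print(sum(a * x * x for a, x in zip(alpha, nodes)))
```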


1.4 Gaussian quadrature and orthogonal polynomials

In this section, we show that we can achieve a greater degree of exactness by carefully choosing the nodes of the quadrature. This procedure leads to orthogonal polynomials and Gaussian quadrature.

Definition 1.23. (Node polynomial) The node polynomial is

\[ \omega_{n+1}(x) = \prod_{i=1}^{n+1} (x - x_i), \tag{24} \]

for any x ∈ I.

Proposition 1.24. (Polynomial division) Let n and m be two integers. Let a_{n+m} ∈ P_{n+m} and b_m ∈ P_m. There exists a quotient q_n ∈ P_n and a remainder r_{m−1} ∈ P_{m−1} such that the division of the polynomial a_{n+m} by the polynomial b_m is

\[ a_{n+m} = q_n b_m + r_{m-1}. \tag{25} \]

This proposition will not be proved here (see [13], section 4.6, "Polynomial arithmetic", for more details on this topic).

The following proposition is due to Jacobi (1826).

Proposition 1.25. (Conditions for maximum degree of exactness) Let m > 0 be an integer. The quadrature rule 19 has degree of exactness n + m if and only if:

1. the formula 19 is interpolatory,

2. for any p_{m−1} ∈ P_{m−1}, we have

\[ \int_I \omega_{n+1}(x) p_{m-1}(x) w(x)\,dx = 0. \tag{26} \]

Proof. In the first part of this proof, let us assume that conditions 1 and 2 are satisfied. We must prove that the quadrature rule has degree of exactness equal to n + m. Let f ∈ P_{n+m}. We must prove that

\[ I(f) = I_n(f). \tag{27} \]

We can divide the polynomial f by the polynomial ω_{n+1}. Proposition 1.24 implies that there exists a quotient q_{m−1} ∈ P_{m−1} and a remainder r_n ∈ P_n such that

\[ f = q_{m-1}\omega_{n+1} + r_n. \tag{28} \]

Since the quadrature rule is interpolatory, it is exact for r_n, which is a degree n polynomial. Therefore,

\[
\sum_{i=1}^{n+1} \alpha_i r_n(x_i) = \int_I r_n(x) w(x)\,dx
= \int_I f(x) w(x)\,dx - \int_I q_{m-1}(x)\omega_{n+1}(x) w(x)\,dx
= \int_I f(x) w(x)\,dx = I(f), \tag{29}
\]

where the second equality is implied by equation 28 and the third by equation 26. Moreover, equation 28 implies

\[
\sum_{i=1}^{n+1} \alpha_i r_n(x_i) = \sum_{i=1}^{n+1} \alpha_i f(x_i) - \sum_{i=1}^{n+1} \alpha_i q_{m-1}(x_i)\omega_{n+1}(x_i)
= \sum_{i=1}^{n+1} \alpha_i f(x_i) = I_n(f),
\]

since ω_{n+1}(x_i) = 0 for i = 1, ..., n + 1. We combine the previous equation with equation 29 and get equation 27, which concludes the first part of the proof.

In the second part of the proof, let us assume that the quadrature rule has degree of exactness equal to n + m. We must prove that conditions 1 and 2 are satisfied. By hypothesis, the quadrature rule is exact for the interpolating Lagrange polynomial (which has degree n), which implies that condition 1 is satisfied. By the same reasoning as in the first part, we have

\[ \sum_{i=1}^{n+1} \alpha_i r_n(x_i) = I(f) - \int_I q_{m-1}(x)\omega_{n+1}(x) w(x)\,dx \quad \text{and} \quad \sum_{i=1}^{n+1} \alpha_i r_n(x_i) = I_n(f). \tag{30} \]

By hypothesis, the quadrature rule has degree of exactness equal to n + m, which implies I(f) = I_n(f). We plug this equality into 30, which implies that condition 2 is satisfied, and concludes the proof.

Proposition 1.26. (Maximum degree of exactness) The maximum degree of exactness of the quadrature rule 19 is 2n + 1.

Proof. We are going to prove that m ≤ n + 1. Then proposition 1.25 implies that the maximum degree of exactness is n + m = n + n + 1 = 2n + 1.

Let us prove this by contradiction: suppose that m ≥ n + 2. Then equation 26 holds for any p_{m−1} ∈ P_{m−1} and, since n + 1 ≤ m − 1, the node polynomial ω_{n+1} belongs to P_{m−1}. Applying equation 26 to p_{m−1} = ω_{n+1} implies

\[ \int_I \omega_{n+1}^2(x) w(x)\,dx = 0. \]

Since the weight w is by hypothesis continuous and nonnegative, the previous equation implies that ω_{n+1} = 0, which is impossible.

We have seen that the maximum possible value of m is n + 1. Therefore, equation 26 implies that the node polynomial ω_{n+1} must satisfy the equation

\[ \int_I \omega_{n+1}(x) p_n(x) w(x)\,dx = 0, \tag{31} \]

for any p_n ∈ P_n. In terms of the scalar product, this writes

\[ (\omega_{n+1}, p_n) = 0, \tag{32} \]

for any p_n ∈ P_n. This leads to the following proposition.

Proposition 1.27. (Orthogonality of the node polynomial) The quadrature rule with maximum degree of exactness is such that the monic node polynomial is orthogonal to all polynomials of degree less than or equal to n.

Definition 1.28. (Gaussian quadrature) If the integration nodes x_1, ..., x_{n+1} are the roots of a node polynomial satisfying equation 32, the quadrature rule 19 is called a Gaussian quadrature.
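As a small illustration of Gaussian quadrature, the following Python sketch checks that the 2-point rule for the Legendre weight w = 1 on [−1, 1] reaches the degree of exactness 2n + 1 = 3. The nodes ±1/√3 (the roots of the monic x² − 1/3, which is orthogonal to P_1) and the weights 1, 1 are standard Legendre values, assumed here rather than derived.

```python
import math

# 2-point Gauss-Legendre rule for w = 1 on [-1, 1]: nodes are the roots
# of the monic pi_2(x) = x^2 - 1/3, i.e. +-1/sqrt(3), with weights 1, 1
# (standard values, assumed).  With n + 1 = 2 nodes, the degree of
# exactness is 2n + 1 = 3.
nodes = [-1 / math.sqrt(3), 1 / math.sqrt(3)]
weights = [1.0, 1.0]

def rule(f):
    return sum(a * f(x) for a, x in zip(weights, nodes))

def exact(k):
    # integral of x^k over [-1, 1]
    return 2 / (k + 1) if k % 2 == 0 else 0.0

for k in range(4):
    assert abs(rule(lambda x: x ** k) - exact(k)) < 1e-12
print("exact for all monomials up to degree 3")
# degree 4 is no longer exact: the rule gives 2/9 instead of 2/5
print(rule(lambda x: x ** 4), exact(4))
```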

In the following, we denote by {π_k}_{k=0,...,n} the monic orthogonal polynomials associated with equation 32, where π_k has degree k.

Proposition 1.29. (Properties of node polynomials)

1. The polynomials {π_k}_{k=0,...,n} are orthogonal.

2. They are linearly independent.

3. The polynomials {π_k}_{k=0,...,n} are a basis of P_n.

Proof. 1. Let us prove that the polynomials {π_k}_{k=0,...,n} are orthogonal. Consider two polynomials π_i and π_j satisfying equation 32, with i ≠ j. Without loss of generality, we can assume that i > j (otherwise, we just switch i and j). Since π_j has a degree lower than π_i, equation 32 implies

\[ (\pi_i, \pi_j) = 0, \]

which concludes the proof of the first point.


2. The fact that the orthogonal polynomials {π_k}_{k=0,...,n} are linearly independent is a result of linear algebra. Indeed, for any real numbers α_0, ..., α_n, assume that

\[ \alpha_0\pi_0 + \dots + \alpha_n\pi_n = 0. \]

For any i = 0, ..., n, the scalar product of the previous equation with π_i implies

\[ \alpha_i(\pi_i, \pi_i) = 0, \]

since the other terms in the sum are zero, by orthogonality. Suppose (π_i, π_i) = 0. This would contradict the hypothesis that the weight w is a continuous, nonnegative function. This implies that, necessarily, we have (π_i, π_i) > 0. We combine this inequality with the previous equality, and get α_i = 0, which shows that the orthogonal polynomials are linearly independent.

3. Linear algebra shows that the polynomials {π_k}_{k=0,...,n} are a basis of P_n. Indeed, consider the following set of polynomials:

\[ 1, x, x^2, \dots, x^n. \]

Definition 1.8 shows that these monomials are a basis of P_n, since any polynomial of degree at most n is a linear combination of these n + 1 monomials. Therefore, the dimension of the vector space P_n is n + 1. However, the n + 1 polynomials {π_k}_{k=0,...,n} are linearly independent, which implies that they are a basis of P_n, and concludes the proof.

1.5 The three term recurrence

In this section, we show that orthogonal polynomials satisfy a three term recurrence.

Proposition 1.30. (Three term recurrence of monic orthogonal polynomials) Assume {π_k}_{k=−1,0,1,...,n} is a family of monic orthogonal polynomials, with

\[ \pi_{-1} = 0, \quad \pi_0 = 1. \]

Then

\[ \pi_{k+1}(x) = (x - \alpha_k)\pi_k(x) - \beta_k\pi_{k-1}(x), \tag{33} \]

for k = 0, 1, ..., n and any x ∈ I, where

\[ \alpha_k = \frac{(x\pi_k, \pi_k)}{(\pi_k, \pi_k)} \tag{34} \]

for k = 0, 1, ..., n, and

\[ \beta_k = \frac{(\pi_k, \pi_k)}{(\pi_{k-1}, \pi_{k-1})} \tag{35} \]

for k = 1, 2, ..., n.

In the previous proposition, let us make clear that the scalar product (xπ_k, π_k) in α_k involves the polynomial xπ_k(x), for any x ∈ I.

Notice that the proposition does not state the value of β_0, which can be chosen freely. Indeed, consider equation 33 for k = 0. We have

\[ \pi_1(x) = (x - \alpha_0)\pi_0(x) - \beta_0\pi_{-1}(x) = (x - \alpha_0)\pi_0(x), \]

since π_{−1} = 0. As in [9], we choose

\[ \beta_0 = (\pi_0, \pi_0) = \int_I w(x)\,dx. \tag{36} \]


Proof. For any integer k = 0, ..., n, the polynomial π_{k+1} − xπ_k has degree lower than or equal to k. Indeed, both π_{k+1} and xπ_k are monic polynomials of degree k + 1, so that the leading term x^{k+1} cancels. Since the orthogonal polynomials {π_0, π_1, ..., π_k} are a basis of P_k, we have

\[ \pi_{k+1} - x\pi_k = -\alpha_k\pi_k - \beta_k\pi_{k-1} - \sum_{j=0}^{k-2}\gamma_{kj}\pi_j, \tag{37} \]

for k = 0, 1, ..., n, where α_k, β_k and γ_{kj}, for j = 0, ..., k − 2, are real numbers.

1. The scalar product of equation 37 with π_k is

\[ -(x\pi_k, \pi_k) = -\alpha_k(\pi_k, \pi_k), \]

since the orthogonality of the polynomials implies that the other terms in the sum are zero. The previous equation immediately leads to equation 34.

2. The scalar product of equation 37 with π_{k−1} is

\[ -(x\pi_k, \pi_{k-1}) = -\beta_k(\pi_{k-1}, \pi_{k-1}), \tag{38} \]

by the same orthogonality principle. However, the left-hand side of the previous equation can be simplified, because

\[ (x\pi_k, \pi_{k-1}) = \int_I x\pi_k(x)\pi_{k-1}(x)w(x)\,dx = (\pi_k, x\pi_{k-1}). \tag{39} \]

Moreover, the polynomial xπ_{k−1} is a monic polynomial of degree k. Hence, it can be decomposed as

\[ x\pi_{k-1} = \pi_k + c_k\pi_{k-1} + \dots + c_1\pi_0, \]

where c_k, ..., c_1 are real numbers. By orthogonality, the scalar product of the previous equation with π_k is

\[ (\pi_k, x\pi_{k-1}) = (\pi_k, \pi_k). \]

We plug the previous equation into 39, which implies

\[ (x\pi_k, \pi_{k-1}) = (\pi_k, \pi_k). \]

The previous equation can be combined with 38, which leads to 35.

3. In order to prove equation 33, we use equation 37 and prove that, for any j = 0, 1, ..., k − 2, we have γ_{kj} = 0. Using orthogonality, the scalar product of equation 37 with π_j is

\[ -(x\pi_k, \pi_j) = -\gamma_{kj}(\pi_j, \pi_j). \]

Moreover, we have

\[ (x\pi_k, \pi_j) = (\pi_k, x\pi_j) = 0, \]

since xπ_j has degree j + 1 ≤ k − 1, and π_k is orthogonal to every polynomial of degree lower than k. This leads to

\[ \gamma_{kj}(\pi_j, \pi_j) = 0. \]

However, we know that (π_j, π_j) > 0, which implies γ_{kj} = 0 and concludes the proof.
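The monic recurrence 33 is easy to implement once α_k and β_k are known. The sketch below (Python with exact rational arithmetic; the document's own code is Scilab) builds monic Legendre polynomials, assuming the standard values α_k = 0 and β_k = k²/(4k² − 1) for the weight w = 1 on [−1, 1]; these coefficient values are not derived in this document.

```python
from fractions import Fraction

# Monic Legendre polynomials pi_k on [-1, 1] (weight w = 1) via the
# three-term recurrence (33).  For this weight, alpha_k = 0 by symmetry
# and beta_k = k^2 / (4k^2 - 1); these standard values are assumed here.
def monic_legendre(n):
    """Coefficients [a_0, a_1, ..., a_n] of the monic pi_n."""
    pim1, pik = [], [Fraction(1)]          # pi_{-1} = 0, pi_0 = 1
    for k in range(n):
        xpik = [Fraction(0)] + pik         # x * pi_k: shift degrees by one
        beta = Fraction(k * k, 4 * k * k - 1) if k >= 1 else Fraction(0)
        pim1, pik = pik, [c - beta * (pim1[i] if i < len(pim1) else 0)
                          for i, c in enumerate(xpik)]
    return pik

print(monic_legendre(2))   # pi_2(x) = x^2 - 1/3
print(monic_legendre(3))   # pi_3(x) = x^3 - (3/5) x
```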

The three-term recurrence 33 is for monic orthogonal polynomials. However, a set of polynomials can be orthogonal without being monic, as we are going to show in the next proposition.


Proposition 1.31. (Three term recurrence of orthogonal polynomials) Assume {p_k}_{k=−1,0,1,...,n} is a family of orthogonal polynomials, with

\[ p_{-1} = 0, \quad p_0 = \frac{1}{\gamma_0}. \tag{40} \]

Then

\[ p_{k+1}(x) = \frac{\gamma_k}{\gamma_{k+1}}(x - \alpha_k)p_k(x) - \frac{\gamma_{k-1}}{\gamma_{k+1}}\beta_k p_{k-1}(x), \tag{41} \]

for k = 0, 1, ..., n and any x ∈ I, where

\[ \alpha_k = \frac{(xp_k, p_k)}{(p_k, p_k)} \tag{42} \]

for k = 0, 1, ..., n,

\[ \beta_k = \frac{\gamma_k^2(p_k, p_k)}{\gamma_{k-1}^2(p_{k-1}, p_{k-1})} \tag{43} \]

for k = 1, 2, ..., n, and γ_k is a nonzero real number which can be chosen arbitrarily.

Proof. Any polynomial p_k is the product of a monic polynomial π_k with a nonzero real number. Indeed, consider the polynomials

\[ p_k = \frac{\pi_k}{\gamma_k}, \tag{44} \]

where π_k is a monic orthogonal polynomial and γ_k is a nonzero real number. We plug the equation π_k = γ_k p_k into 33 and get

\[ \gamma_{k+1}p_{k+1}(x) = (x - \alpha_k)\gamma_k p_k(x) - \beta_k\gamma_{k-1}p_{k-1}(x), \]

for any x ∈ I. We divide the previous equation by γ_{k+1}, which leads to 41. We obtain equations 42 and 43 from the same substitution.

Finally, we can normalize the polynomials and get orthonormal polynomials as presented in definition 1.11. The following proposition is a straightforward consequence of proposition 1.30. Notice that, as we normalize the polynomials, they are not monic anymore.

Proposition 1.32. (Three term recurrence of orthonormal polynomials) Assume {p_k}_{k=−1,0,1,...,n} is a family of orthonormal polynomials, with

\[ p_{-1} = 0, \quad p_0 = \frac{1}{\sqrt{\beta_0}}, \tag{45} \]

where β_0 is defined by equation 36. Then

\[ p_{k+1}(x) = \frac{x - \alpha_k}{\sqrt{\beta_{k+1}}}p_k(x) - \frac{\sqrt{\beta_k}}{\sqrt{\beta_{k+1}}}p_{k-1}(x), \tag{46} \]

for k = 0, 1, ..., n and any x ∈ I, where

\[ \alpha_k = (xp_k, p_k) \tag{47} \]

for k = 0, 1, ..., n,

\[ \beta_k = \frac{\gamma_k^2}{\gamma_{k-1}^2} \tag{48} \]

for k = 1, 2, ..., n, and

\[ \gamma_k = \|\pi_k\| = \sqrt{(\pi_k, \pi_k)}. \tag{49} \]


Proof. We use proposition 1.31 with γ_k = ‖π_k‖. Equation 44 then implies

\[ \|p_k\| = \frac{\|\pi_k\|}{\gamma_k} = \frac{\|\pi_k\|}{\|\pi_k\|} = 1, \]

and (p_k, p_k) = 1. We plug this into 42 and get 47; when we plug it into 43, we get 48.

Finally, equation 48 implies

\[ \frac{\gamma_k}{\gamma_{k+1}} = \frac{1}{\sqrt{\beta_{k+1}}}. \tag{50} \]

The previous equation implies

\[ \beta_k\frac{\gamma_{k-1}}{\gamma_{k+1}} = \beta_k\frac{\gamma_{k-1}}{\gamma_k}\frac{\gamma_k}{\gamma_{k+1}} = \beta_k\frac{1}{\sqrt{\beta_k}}\frac{1}{\sqrt{\beta_{k+1}}} = \frac{\sqrt{\beta_k}}{\sqrt{\beta_{k+1}}}. \]

We plug the previous equation and equation 50 into 41, which implies 46.
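The orthonormal recurrence 46 can likewise be implemented directly. The following Python sketch evaluates orthonormal Legendre polynomials, assuming the standard values α_k = 0, β_0 = 2 (equation 36 with w = 1 on [−1, 1]) and β_k = k²/(4k² − 1) for k ≥ 1; it agrees with the classical P_2(x) = (3x² − 1)/2 rescaled to unit norm.

```python
import math

# Orthonormal Legendre polynomials from the recurrence (46), with
# alpha_k = 0, beta_0 = 2 (equation 36 with w = 1 on [-1, 1]) and
# beta_k = k^2 / (4k^2 - 1) for k >= 1; these coefficient values are
# standard for the Legendre weight and are assumed here.
def beta(k):
    return 2.0 if k == 0 else k * k / (4.0 * k * k - 1.0)

def orthonormal_legendre(n, x):
    pm1, p = 0.0, 1.0 / math.sqrt(beta(0))     # p_{-1} = 0, p_0 = 1/sqrt(beta_0)
    for k in range(n):
        pm1, p = p, (x * p - math.sqrt(beta(k)) * pm1) / math.sqrt(beta(k + 1))
    return p

# agrees with the classical P_2(x) = (3x^2 - 1)/2 rescaled to unit norm
x = 0.7
print(orthonormal_legendre(2, x), (3 * x * x - 1) / 2 * math.sqrt(5 / 2))
```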

1.6 Accuracy of roots of polynomials

We could compute the nodes by searching for the roots of the orthogonal polynomials. As we are going to see, this method is very sensitive to small errors in the coefficients of the polynomial. Hence, if n is sufficiently large, computing the roots of the polynomial fails to produce accurate nodes.

Consider the Chebyshev polynomial of degree n. Its roots are given by the equation 96, page 37. This is the equation used by the chebyshev_quadrature function.

The other method uses the chebyshev_poly function (see section 2.4), which computes the Chebyshev polynomial. More precisely, Scilab uses a data structure based on its coefficients. It is then straightforward to use the roots function, which returns the roots of the polynomial, based on the eigenvalues of the companion matrix [7]. By definition, the companion matrix of the polynomial defined by equation 6 is

\[
C(p) = \begin{pmatrix}
0 & 0 & \cdots & 0 & -a_1/a_{n+1} \\
1 & 0 & \cdots & 0 & -a_2/a_{n+1} \\
0 & 1 & \cdots & 0 & -a_3/a_{n+1} \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & \cdots & 1 & -a_n/a_{n+1}
\end{pmatrix}.
\]

In the following script, we compute the roots of the Chebyshev polynomials from degree 2 to 50 by the two previous methods. Then we compute the number of common digits, using the assert_computedigits function.

scf();
for n=2:50
    [x,w]=chebyshev_quadrature(n);
    p=chebyshev_poly(n);
    xp=real(roots(p));
    xp=gsort(xp,"g","i");
    if (modulo(n,2)==1) then
        x(abs(x)<1.e-7)=[];
        xp(abs(xp)<1.e-7)=[];
    end
    d=assert_computedigits(xp,x);
    dmin=min(d);
    mprintf("n=%d, dmin=%.1f\n",n,dmin)
    if (modulo(n,2)==0) then
        plot(n*ones(n,1),d,"bo")
    else
        plot((n-1)*ones(d),d,"bo")
    end
end
xtitle("Accuracy of Chebyshev roots");
xlabel("n");
ylabel("Number of accurate digits");

Figure 5: The accuracy of Chebyshev nodes.

If n is odd, then zero is an exact root. But in this case, the equation 96 produces a node which is close to 10^(−16), whereas the roots function produces a corresponding node which is exactly equal to zero. Since there is no common digit between these two nodes, and since both of them are sufficiently accurate with respect to the absolute accuracy that we can expect, we remove these nodes from the comparison.

The previous script produces the figure 5. The figure shows that, when n increases from 1 to 50, the accuracy of the roots computed with the roots function can decrease from 15 digits (the maximum) down to 1 digit (the minimum).

For a given n, the accuracy generally decreases when the node is close to 1, as shown in the figure 6 for n = 50 and n = 100.

The case where n = 50 is difficult to manage for the roots function. For example, the last two roots of the Chebyshev polynomial of degree 50 computed this way are:

1.00039787307992989
1.01163758359688472

which are not even in the [−1, 1] interval. Although our method based on the roots function does not perform well in this case, this does not imply that the algorithm computing the roots is, by itself, the cause of the accuracy issue. Notice that the nonzero coefficients of the Chebyshev polynomial of degree 50 have a magnitude which ranges from 1 up to approximately 10^18, with coefficients which alternate in sign.
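The same experiment can be sketched outside of Scilab. The following Python/NumPy sketch (our own illustration, not part of NISP) compares the analytical nodes of the equation 96 with the roots obtained from the power-basis coefficients via the companion matrix (np.roots):

```python
import numpy as np

def chebyshev_nodes(n):
    # Analytical roots of T_n, in increasing order (equation 96).
    k = np.arange(1, n + 1)
    return -np.cos((2 * k - 1) * np.pi / (2 * n))

def chebyshev_roots_companion(n):
    # Roots of T_n computed from its power-basis coefficients,
    # via the eigenvalues of the companion matrix (np.roots).
    c = np.zeros(n + 1)
    c[n] = 1.0                                    # T_n in the Chebyshev basis
    power = np.polynomial.chebyshev.cheb2poly(c)  # power basis, low degree first
    r = np.roots(power[::-1])                     # np.roots wants high degree first
    return np.sort(np.real(r))

for n in (5, 20, 40):
    err = np.max(np.abs(chebyshev_nodes(n) - chebyshev_roots_companion(n)))
    print("n=%d, max node error=%.1e" % (n, err))
```

The error grows with n, which reproduces the loss of accuracy discussed above.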


Figure 6: The accuracy of Chebyshev nodes when n = 50 and n = 100.

1.7 Computing the nodes and weights

In this section, we analyze a method to accurately compute the nodes and weights of the quadrature rule.

Definition 1.33. (Jacobi matrix) The Jacobi matrix of order n is

    Jn =
    [ α0     √β1                      ]
    [ √β1    α1     √β2               ]
    [        ...    ...    ...        ]
    [        √βn−1  αn−1   √βn        ]
    [               √βn    αn         ]

for n ≥ 0.

Notice that the previous matrix is symmetric.

Proposition 1.34. (Eigenvalues of the Jacobi matrix) The eigenvalues of the Jacobi matrix are the nodes of the quadrature rule, i.e. if x is a node of the quadrature rule, then

Jn p(x) = x p(x),    (51)

where

p = (p0, p1, ..., pn)ᵀ,    (52)

where pi is the orthonormal polynomial of degree i, for i = 0, ..., n.

Proof. For k = 0, 1, ..., n and for any x ∈ I, the equation 46 implies:

√βk+1 pk+1(x) = (x − αk) pk(x) − √βk pk−1(x)
              = x pk(x) − αk pk(x) − √βk pk−1(x).

Therefore,

x pk(x) = √βk pk−1(x) + αk pk(x) + √βk+1 pk+1(x).    (53)

In matrix terms, the definition of the Jacobi matrix and the equation 52 imply:

    Jn p =
    [ α0 p0 + √β1 p1                   ]   [ x p0 − √β0 p−1     ]
    [ √β1 p0 + α1 p1 + √β2 p2          ]   [ x p1               ]
    [ ...                              ] = [ ...                ]
    [ √βn−1 pn−2 + αn−1 pn−1 + √βn pn  ]   [ x pn−1             ]
    [ √βn pn−1 + αn pn                 ]   [ x pn − √βn+1 pn+1  ]

where we have used the equation 53 in the last equality. First, the equation 40 implies that the first row is equal to x p0. Second, if x is a root of pn+1, then pn+1(x) = 0, which implies that the last row is equal to x pn. This is why the previous equation implies the equation 51, which concludes the proof.

Proposition 1.35. (Weights from the design matrix) Consider the vectors α, v ∈ ℝ^(n+1), defined by:

α = (α1, ..., αn+1)ᵀ,   v = (√β0, 0, ..., 0)ᵀ.    (54)

The weights of the quadrature rule satisfy the equality

P α = v,    (55)

where P is an (n + 1) × (n + 1) design matrix whose entries are:

pij = pi−1(xj),    (56)

for i, j = 1, 2, ..., n + 1, where p0, ..., pn are orthonormal polynomials.

Proof. For i = 0, ..., n, the quadrature rule 19 integrates pi exactly, since pi is a polynomial of degree lower than 2n + 1. Therefore,

∫_I pi(x) w(x) dx = Σ_{j=1}^{n+1} αj pi(xj),

for i = 0, ..., n. However, the equation 45 implies:

∫_I pi(x) w(x) dx = √β0 ∫_I pi(x) p0(x) w(x) dx = √β0 (pi, p0).

Therefore,

Σ_{j=1}^{n+1} αj pi(xj) = √β0 (pi, p0),

for i = 0, ..., n. The previous expression is equal to √β0 if i = 0 and is equal to zero otherwise, which concludes the proof.

Proposition 1.36. (Weights from Jacobi's matrix) For any i = 1, ..., n + 1, suppose that xi is a node of the quadrature rule. Assume that ui is the normalized eigenvector associated with the eigenvalue xi of Jn, i.e. assume that

Jn ui = xi ui,    (57)

where

‖ui‖2 = 1.    (58)

Therefore, the weights of the quadrature rule satisfy the equation

αi = β0 u1i²,    (59)

where u1i is the first component of ui.

Proof. Consider the matrix P defined by the equation 56. Its columns are

P = (p(x1), ..., p(xn+1)),

where the column vector p(xi) is defined by the equation 52. The matrix Jn is symmetric and linear algebra theory implies that its eigenvectors are orthogonal. Therefore, the proposition 1.34, which states that the columns of P are the eigenvectors of Jn, implies that the columns of P are orthogonal.

Therefore, the matrix PᵀP is diagonal:

PᵀP = diag(d1, ..., dn+1),    (60)

where

di = Σ_{j=1}^{n+1} pji pji = Σ_{j=1}^{n+1} pj−1(xi)²,

which implies:

di = Σ_{j=0}^{n} pj(xi)².    (61)

We left multiply the equation 55 by Pᵀ, which implies

PᵀP α = Pᵀv.

We insert the equation 60 into the previous equality, and get:

D α = Pᵀv.    (62)

However, the equation 45 implies that the first row of P is:

(p0(x1), ..., p0(xn+1)) = (1/√β0, ..., 1/√β0),

which is also the first column of Pᵀ. The equation 54 then leads to:

Pᵀv = (√β0/√β0, ..., √β0/√β0)ᵀ = (1, ..., 1)ᵀ.

Therefore, the equation 62 implies:

di αi = 1,


for i = 1, ..., n + 1. We substitute di from the equation 61 into the previous equation and get:

αi = 1 / Σ_{j=0}^{n} pj(xi)²,    (63)

for i = 1, ..., n + 1.

Consider the equation 51, where p(xi) is an eigenvector of Jn and xi is a node of the quadrature rule. Let ui ∈ ℝ^(n+1) be a normalized eigenvector of the Jacobi matrix:

ui = p(xi) / ‖p(xi)‖2 = p(xi) / √(Σ_{j=0}^{n} pj(xi)²).

The first component of the previous vector is

u1i = p0(xi) / √(Σ_{j=0}^{n} pj(xi)²) = (1/√β0) · 1 / √(Σ_{j=0}^{n} pj(xi)²),

after the equation 45. We square the previous equation and get:

β0 u1i² = 1 / Σ_{j=0}^{n} pj(xi)².

We combine the previous equation and 63, which leads to 59.
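The previous propositions suggest an algorithm for computing the nodes and weights, often called the Golub-Welsch algorithm. The following Python/NumPy sketch (our own illustration, not part of NISP) applies it to the probabilists' Hermite weight w(x) = exp(−x²/2), for which the recurrence coefficients are αk = 0, βk = k, and β0 = ∫ w dx = √(2π):

```python
import numpy as np

def gauss_hermite_prob(n):
    # Jacobi matrix J_n of order n (size (n+1) x (n+1), definition 1.33).
    off = np.sqrt(np.arange(1.0, n + 1))      # sqrt(beta_k), k = 1..n
    J = np.diag(off, 1) + np.diag(off, -1)    # diagonal entries alpha_k = 0
    x, U = np.linalg.eigh(J)                  # nodes = eigenvalues (equation 51)
    w = np.sqrt(2 * np.pi) * U[0, :] ** 2     # weights = beta_0 * u_{1i}^2 (equation 59)
    return x, w

x, w = gauss_hermite_prob(4)
print(np.sum(w))           # integral of w over R: sqrt(2*pi)
print(np.sum(w * x ** 4))  # fourth moment: 3*sqrt(2*pi), exact since degree 4 <= 2n+1
```

This only relies on a symmetric eigenvalue decomposition, which avoids the ill-conditioned coefficient representation discussed in the section 1.6.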

1.8 Sturm-Liouville equations

Definition 1.37. (Sturm-Liouville equation) Let p, q, and w be functions on ℝ, with w(x) > 0 for any x ∈ ℝ. The differential equation

(d/dx)(p(x) dy/dx) + q(x) y = −λ w(x) y,

where y is a function of x, is called a Sturm-Liouville equation.

Let L be the differential operator defined by:

L = (1/w(x)) (−(d/dx)(p(x) d/dx) + q(x)).

The operator L is called the Sturm-Liouville operator. The S-L differential equation is the eigenvalue equation:

L y = λ y.

Provided suitable boundary conditions are chosen, the S-L operator L is symmetric, i.e.:

(Lf, g) = (f, Lg),

for any functions f and g in the weighted Hilbert space. In cases where complex functions are considered, this operator is self-adjoint (some authors use the term Hermitian).

In practice, we must prescribe appropriate boundary conditions for the Sturm-Liouville (S-L) equation. If a value λ exists, then it is an eigenvalue of the S-L equation. Under some hypotheses, the eigenvalues λ1, λ2, ... of the S-L problem are real. For each eigenvalue λn, the solution yn(x) is the corresponding eigenfunction. The normalized eigenfunctions form an orthonormal basis in the weighted Hilbert space associated with the scalar product:

∫_ℝ yn(x) ym(x) w(x) dx = δnm,

for n, m ≥ 1. As we are going to see, all orthogonal polynomials are associated with an S-L equation.


1.9 Notes and references

More details on the link between quadrature rules, the Lagrangian interpolating polynomial and orthogonal polynomials can be found in [18], in the chapter 10, "Orthogonal polynomials in approximation theory".

The three-term recurrence of orthogonal polynomials is presented in [18], and proved in [9].

2 Concrete orthogonal polynomials

In this section, we present the Hermite, Legendre, Laguerre and Chebyshev orthogonal polynomials.

2.1 Hermite polynomials

In this section, we present Hermite polynomials and their properties. Hermite polynomials are associated with the Gaussian weight:

w(x) = exp(−x²/2),    (64)

for x ∈ ℝ. The integral of this weight is:

∫_ℝ w(x) dx = √(2π).    (65)

The distribution function for Hermite polynomials is the standard normal distribution:

f(x) = (1/√(2π)) exp(−x²/2),    (66)

for x ∈ ℝ. Here, we consider the probabilist polynomials Hen, as opposed to the physicist polynomials Hn [22]. The first Hermite polynomials are

He0(x) = 1,
He1(x) = x.

The remaining Hermite polynomials satisfy the recurrence:

Hen+1(x) = x Hen(x) − n Hen−1(x),    (67)

for n = 1, 2, ....

Example 2.1. (First Hermite polynomials) We have

He2(x) = x He1(x) − He0(x)
       = x · x − 1
       = x² − 1.

Similarly,

He3(x) = x He2(x) − 2 He1(x)
       = x(x² − 1) − 2x
       = x³ − 3x.
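The recurrence 67 translates directly into code. Here is a minimal Python sketch of the evaluation (our own illustration; the Scilab version, HermitePoly, is given below):

```python
def hermite_prob(n, x):
    # Evaluate He_n(x) with the three-term recurrence (equation 67):
    # He_{k+1}(x) = x*He_k(x) - k*He_{k-1}(x), starting from He_0 = 1, He_1 = x.
    if n == 0:
        return 1.0
    p_prev, p = 1.0, x
    for k in range(1, n):
        p_prev, p = p, x * p - k * p_prev
    return p

print(hermite_prob(3, 2.0))  # He_3(2) = 2^3 - 3*2 = 2
```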

The first Hermite polynomials are presented in the figure 7. The figure 8 plots the Hermite polynomials Hen, for n = 0, 1, 2, 3. The figure 9 presents the coefficients ck, for k ≥ 0, of the Hermite polynomials, so that

Hen(x) = Σ_{k=0}^{n} ck x^k.    (68)


He0(x) = 1
He1(x) = x
He2(x) = x² − 1
He3(x) = x³ − 3x
He4(x) = x⁴ − 6x² + 3
He5(x) = x⁵ − 10x³ + 15x

Figure 7: Hermite polynomials

Figure 8: The Hermite polynomials Hen, for n = 0, 1, 2, 3.

n    c0    c1     c2     c3      c4    c5    c6    c7    c8    c9
0    1
1          1
2    -1           1
3          -3            1
4    3            -6             1
5          15            -10           1
6    -15          45             -15          1
7          -105          105           -21          1
8    105          -420           210           -28         1
9          945           -1260         378           -36         1

Figure 9: Coefficients of Hermite polynomials Hen, for n = 0, 1, 2, ..., 9


The Hermite polynomials are orthogonal with respect to the weight w(x). Moreover,

‖Hen‖² = √(2π) n!.    (69)

Hence,

V(Hen(X)) = n!,    (70)

for n ≥ 1, where X is a standard normal random variable. The following HermitePoly Scilab function creates the Hermite polynomial of order n.

function y=HermitePoly(n)
    if (n==0) then
        y=poly(1,"x","coeff")
    elseif (n==1) then
        y=poly([0 1],"x","coeff")
    else
        polyx=poly([0 1],"x","coeff")
        // y(n-2)
        yn2=poly(1,"x","coeff")
        // y(n-1)
        yn1=polyx
        for k=2:n
            y=polyx*yn1-(k-1)*yn2
            yn2=yn1
            yn1=y
        end
    end
endfunction

The script:

for n=0:10
    y=HermitePoly(n);
    disp(y)
end

produces the following output:

1
x
−1 + x²
−3x + x³
3 − 6x² + x⁴
15x − 10x³ + x⁵

Proposition 2.2. (Rodrigues formula for Hermite polynomials) Hermite polynomials are such that:

Hen(x) = (−1)^n e^(x²/2) (d^n/dx^n)(e^(−x²/2)),    (71)

for x ∈ ℝ.

The previous proposition will not be proved in this document.


Proposition 2.3. (Orthogonality of Hermite polynomials) Hermite polynomials are orthogonal, and

(Hen, Hem) = √(2π) n! δnm,    (72)

for n, m ≥ 0.

In terms of integrals, the previous equation is:

∫_{−∞}^{+∞} Hen(x) Hem(x) w(x) dx = √(2π) n! δnm,

where w is the weight of Hermite polynomials.

Proof. Rodrigues' formula for Hermite polynomials implies:

(Hen, Hem) = ∫_{−∞}^{+∞} Hen(x) Hem(x) w(x) dx
           = (−1)^n ∫_{−∞}^{+∞} e^(x²/2) (d^n/dx^n)(e^(−x²/2)) Hem(x) e^(−x²/2) dx
           = (−1)^n ∫_{−∞}^{+∞} (d^n/dx^n)(e^(−x²/2)) Hem(x) dx.    (73)

Integrating by parts, we get:

∫_{−∞}^{+∞} (d^n/dx^n)(e^(−x²/2)) Hem(x) dx
= [(d^(n−1)/dx^(n−1))(e^(−x²/2)) Hem(x)]_{−∞}^{+∞}
  − ∫_{−∞}^{+∞} (d^(n−1)/dx^(n−1))(e^(−x²/2)) (d/dx)Hem(x) dx.

However,

(d/dx)(e^(−x²/2)) = −x e^(−x²/2).

This shows that e^(−x²/2) is a factor of the first derivative of e^(−x²/2). By recurrence, using the derivative of a product, this is also true for any derivative of e^(−x²/2). This implies:

[(d^(n−1)/dx^(n−1))(e^(−x²/2)) Hem(x)]_{−∞}^{+∞} = 0,

for any integer n, since the limit of a polynomial times e^(−x²/2) at infinity is zero. Therefore,

∫_{−∞}^{+∞} (d^n/dx^n)(e^(−x²/2)) Hem(x) dx
= −∫_{−∞}^{+∞} (d^(n−1)/dx^(n−1))(e^(−x²/2)) (d/dx)Hem(x) dx.

Integrating by parts n − 1 more times, we get:

∫_{−∞}^{+∞} (d^n/dx^n)(e^(−x²/2)) Hem(x) dx = (−1)^n ∫_{−∞}^{+∞} e^(−x²/2) (d^n/dx^n)Hem(x) dx.

We plug the previous equation into 73 and get:

(Hen, Hem) = (−1)^n (−1)^n ∫_{−∞}^{+∞} e^(−x²/2) (d^n/dx^n)Hem(x) dx
           = ∫_{−∞}^{+∞} e^(−x²/2) (d^n/dx^n)Hem(x) dx.    (74)

We first assume that n ≠ m and prove that the integral is zero. Assume that n > m (otherwise, we just switch n and m). Since Hem is a degree m polynomial, its n-th derivative is zero. This implies that the previous integral is zero, which proves that the equation 72 is true when n ≠ m.


Secondly, assume that n = m. The equation 74 implies:

(Hen, Hen) = ∫_{−∞}^{+∞} e^(−x²/2) (d^n/dx^n)Hen(x) dx.

The equation 67 implies that Hen is a monic degree n polynomial, since its monomial with highest exponent comes from the term x Hen−1(x). In other words, the leading term of Hen is x^n. The n-th derivative of this monomial is n! and the n-th derivative of lower degree monomials is zero. Hence,

(Hen, Hen) = n! ∫_{−∞}^{+∞} e^(−x²/2) dx
           = n! √(2π),

where we have used the Gaussian integral proved in the section A.1.
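The orthogonality relation 72 can be checked numerically. The following Python sketch (our own illustration) uses NumPy's hermite_e module, which is based on the same probabilists' convention; the coefficient vector [0, ..., 0, 1] of length n + 1 represents Hen:

```python
import numpy as np
from numpy.polynomial import hermite_e as H

xg, wg = H.hermegauss(20)  # 20-point Gauss rule for the weight exp(-x^2/2)

def inner(n, m):
    # Discrete inner product, exact for polynomial degree <= 39.
    return np.sum(wg * H.hermeval(xg, [0] * n + [1]) * H.hermeval(xg, [0] * m + [1]))

print(inner(3, 5))  # ~ 0
print(inner(4, 4))  # ~ sqrt(2*pi) * 4!
```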

Hermite polynomials are solutions of the following differential equation:

y'' − x y' = −n y.

This leads to the following S-L equation:

(e^(−x²/2) y')' = −n e^(−x²/2) y.

The eigenvalue is λ = n, for n = 0, 1, ....
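This can be verified numerically. The following Python/NumPy sketch (our own illustration) checks the differential equation for He4(x) = x⁴ − 6x² + 3, with n = 4:

```python
import numpy as np
from numpy.polynomial import polynomial as P

# Coefficients of He_4 in the power basis, low degree first.
c = np.array([3.0, 0.0, -6.0, 0.0, 1.0])
x = np.linspace(-2.0, 2.0, 9)
y = P.polyval(x, c)
yp = P.polyval(x, P.polyder(c))        # y'
ypp = P.polyval(x, P.polyder(c, 2))    # y''
print(np.max(np.abs(ypp - x * yp + 4 * y)))  # ~ 0
```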

2.2 Legendre polynomials

In this section, we present Legendre polynomials and their properties. Legendre polynomials are associated with the unit weight

w(x) = 1,

for x ∈ [−1, 1]. We have

∫_{−1}^{1} w(x) dx = 2.

The associated probability distribution function is:

f(x) = 1/2,

for x ∈ [−1, 1]. This corresponds to the uniform distribution in probability theory. The first Legendre polynomials are

P0(x) = 1,
P1(x) = x.

The remaining Legendre polynomials satisfy the recurrence:

(n + 1) Pn+1(x) = (2n + 1) x Pn(x) − n Pn−1(x),

for n = 1, 2, ....

Example 2.4. ( First Legendre polynomials) We have

2P2(x) = 3x P1(x) − P0(x)
       = 3x² − 1,


P0(x) = 1
P1(x) = x
P2(x) = (1/2)(3x² − 1)
P3(x) = (1/2)(5x³ − 3x)
P4(x) = (1/8)(35x⁴ − 30x² + 3)
P5(x) = (1/8)(63x⁵ − 70x³ + 15x)

Figure 10: Legendre polynomials

which implies

P2(x) = (1/2)(3x² − 1).

Similarly,

3P3(x) = 5x P2(x) − 2P1(x)
       = 5x · (1/2)(3x² − 1) − 2x
       = (15/2)x³ − (5/2)x − 2x
       = (15/2)x³ − (9/2)x
       = (1/2)(15x³ − 9x).

Finally, we divide both sides of the equation by 3 and get

P3(x) = (1/2)(5x³ − 3x).

The first Legendre polynomials are presented in the figure 10. The Legendre polynomials are orthogonal with respect to w(x) = 1. Moreover,

‖Pn‖² = 2/(2n + 1).

Furthermore,

V(Pn(X)) = 1/(2n + 1),

for n ≥ 1, where X is a random variable uniform in the interval [−1, 1]. The figure 11 plots the Legendre polynomials Pn, for n = 0, 1, 2, 3. The following LegendrePoly Scilab function creates the Legendre polynomial of order n.

function y=LegendrePoly(n)
    if (n==0) then
        y=poly(1,"x","coeff")
    elseif (n==1) then
        y=poly([0 1],"x","coeff")
    else
        polyx=poly([0 1],"x","coeff")
        // y(n-2)
        yn2=poly(1,"x","coeff")
        // y(n-1)
        yn1=polyx
        for k=2:n
            y=((2*k-1)*polyx*yn1-(k-1)*yn2)/k
            yn2=yn1
            yn1=y
        end
    end
endfunction

Figure 11: The Legendre polynomials Pn, for n = 0, 1, 2, 3.
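The same recurrence can be sketched in Python (our own illustration); NumPy's legendre module provides a reference value:

```python
import numpy as np

def legendre(n, x):
    # Evaluate P_n(x) with the recurrence
    # (k+1) P_{k+1}(x) = (2k+1) x P_k(x) - k P_{k-1}(x).
    if n == 0:
        return 1.0
    p_prev, p = 1.0, x
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p

# Agrees with NumPy's Legendre module ([0,...,0,1] represents P_n):
print(legendre(3, 0.5), np.polynomial.legendre.legval(0.5, [0, 0, 0, 1]))
```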

Proposition 2.5. (Rodrigues formula for Legendre polynomials) Legendre polynomials are such that:

Pn(x) = (1/(2^n n!)) (d^n/dx^n)((x² − 1)^n),    (75)

for x ∈ [−1, 1].

The previous proposition will not be proved in this document.

Proposition 2.6. (Orthogonality of Legendre polynomials) Legendre polynomials are orthogonal, and

(Pn, Pm) = (2/(2n + 1)) δnm,    (76)

for n, m ≥ 0.

In terms of integrals, the previous equation is:

∫_{−1}^{1} Pn(x) Pm(x) w(x) dx = (2/(2n + 1)) δnm,

where w is the weight of Legendre polynomials.

Proof. We first prove the equation 76 when m ≠ n. Rodrigues' formula 75 implies:

(Pn, Pm) = ∫_{−1}^{1} Pn(x) Pm(x) w(x) dx
         = (1/(2^(n+m) n! m!)) ∫_{−1}^{1} (d^n/dx^n)((x² − 1)^n) (d^m/dx^m)((x² − 1)^m) dx.

Integration by parts implies:

∫_{−1}^{1} (d^n/dx^n)((x² − 1)^n) (d^m/dx^m)((x² − 1)^m) dx
= [(d^(n−1)/dx^(n−1))((x² − 1)^n) (d^m/dx^m)((x² − 1)^m)]_{−1}^{1}
  − ∫_{−1}^{1} (d^(n−1)/dx^(n−1))((x² − 1)^n) (d^(m+1)/dx^(m+1))((x² − 1)^m) dx.    (77)

Differentiating the polynomial (x² − 1)^n once implies:

(d/dx)((x² − 1)^n) = 2nx (x² − 1)^(n−1).

This shows that the first derivative of (x² − 1)^n has (x² − 1)^(n−1) as a factor. This is true up to the (n − 1)-th derivative of (x² − 1)^n. Hence, the (n − 1)-th derivative

(d^(n−1)/dx^(n−1))((x² − 1)^n)

has x² − 1 as a factor. Therefore the (n − 1)-th derivative of (x² − 1)^n is equal to zero for x = −1 and x = 1. Hence, the first expression in the equation 77 is zero, which implies:

∫_{−1}^{1} (d^n/dx^n)((x² − 1)^n) (d^m/dx^m)((x² − 1)^m) dx
= −∫_{−1}^{1} (d^(n−1)/dx^(n−1))((x² − 1)^n) (d^(m+1)/dx^(m+1))((x² − 1)^m) dx.

We continue to integrate by parts n − 1 more times, and get:

(Pn, Pm) = ((−1)^n/(2^(n+m) n! m!)) ∫_{−1}^{1} (x² − 1)^n (d^(m+n)/dx^(m+n))((x² − 1)^m) dx,    (78)

where the expression (−1)^n comes from multiplying the minus sign in front of the integral n times. Now, suppose that m < n (in the case where m > n, we just switch m and n). This implies 2m < m + n. But the leading monomial of (x² − 1)^m is x^(2m), which implies that the derivative in the right hand side of the previous integral is zero, and concludes the first part of the proof.

We now prove the equation 76 when m = n. The equation 78 implies:

(Pn, Pn) = ∫_{−1}^{1} Pn(x)² w(x) dx
         = ((−1)^n/(2^(2n) (n!)²)) ∫_{−1}^{1} (x² − 1)^n (d^(2n)/dx^(2n))((x² − 1)^n) dx.    (79)

The polynomial (x² − 1)^n has leading monomial x^(2n). Differentiating this polynomial 2n times cancels all terms, except for the leading term x^(2n). This implies:

(d^(2n)/dx^(2n))((x² − 1)^n) = (d^(2n)/dx^(2n)) x^(2n) = (2n)!.

We plug the previous equation into 79, which implies

(Pn, Pn) = (−1)^n ((2n)!/(2^(2n) (n!)²)) ∫_{−1}^{1} (x² − 1)^n dx.    (80)

It can be proved (see section A.3) that

∫_{−1}^{1} (x² − 1)^n dx = (−1)^n (n!)² 2^(2n+1) / (2n + 1)!.


We plug the previous equation into 80 and get:

(Pn, Pn) = (−1)^n ((2n)!/(2^(2n) (n!)²)) (−1)^n ((n!)² 2^(2n+1) / (2n + 1)!)
         = 2/(2n + 1),

which concludes the proof.
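The orthogonality relation 76 can be checked numerically with a Gauss-Legendre rule, for instance with NumPy (our own illustration):

```python
import numpy as np
from numpy.polynomial import legendre as L

xg, wg = L.leggauss(16)  # 16-point rule: exact up to degree 31

def inner(n, m):
    # Discrete inner product of P_n and P_m on [-1, 1].
    return np.sum(wg * L.legval(xg, [0] * n + [1]) * L.legval(xg, [0] * m + [1]))

print(inner(3, 7))  # ~ 0
print(inner(6, 6))  # ~ 2/13
```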

Legendre polynomials are solutions of the following differential equation:

(1 − x²) y'' − 2x y' = −n(n + 1) y.

The associated S-L equation is:

((1 − x²) y')' = −n(n + 1) y.

2.3 Laguerre polynomials

In this section, we present Laguerre polynomials and their properties. Laguerre polynomials are associated with the weight

w(x) = exp(−x),

for x ≥ 0. The figure 12 plots the exponential weight function.

Figure 12: The Exponential weight.

We have:

∫_0^{+∞} w(x) dx = 1.

Hence, the associated probability distribution function is:

f(x) = exp(−x),

for x ≥ 0. This is the standard exponential distribution in probability theory. The first Laguerre polynomials are

L0(x) = 1,
L1(x) = −x + 1.


L0(x) = 1
L1(x) = −x + 1
L2(x) = (1/2!)(x² − 4x + 2)
L3(x) = (1/3!)(−x³ + 9x² − 18x + 6)
L4(x) = (1/4!)(x⁴ − 16x³ + 72x² − 96x + 24)
L5(x) = (1/5!)(−x⁵ + 25x⁴ − 200x³ + 600x² − 600x + 120)

Figure 13: Laguerre polynomials

The remaining Laguerre polynomials satisfy the recurrence:

(n + 1) Ln+1(x) = (2n + 1 − x) Ln(x) − n Ln−1(x),    (81)

for n = 1, 2, .... The first Laguerre polynomials are presented in the figure 13.

Example 2.7. (First Laguerre polynomials) We have

2L2(x) = (3 − x)L1(x) − L0(x)
       = (3 − x)(1 − x) − 1
       = 3 − x − 3x + x² − 1
       = x² − 4x + 2.

Hence,

L2(x) = (1/2)(x² − 4x + 2).

Similarly,

3L3(x) = (5 − x)L2(x) − 2L1(x)
       = (5 − x)(1/2)(x² − 4x + 2) − 2(1 − x).

Therefore,

6L3(x) = (5 − x)(x² − 4x + 2) − 4(1 − x)
       = 5x² − 20x + 10 − x³ + 4x² − 2x − 4 + 4x
       = −x³ + 9x² − 18x + 6,

which implies

L3(x) = (1/6)(−x³ + 9x² − 18x + 6).

The figure 14 plots the Laguerre polynomials Ln, for n = 0, 1, 2, 3. The Laguerre polynomials are orthogonal with respect to w(x) = exp(−x). Moreover,

‖Ln‖² = 1.

Furthermore,

V(Ln(X)) = 1,

for n ≥ 1, where X is a standard exponential random variable. The following LaguerrePoly Scilab function creates the Laguerre polynomial of order n.


Figure 14: The Laguerre polynomials Ln, for n = 0, 1, 2, 3.

function y=LaguerrePoly(n)
    if (n==0) then
        y=poly(1,"x","coeff")
    elseif (n==1) then
        y=poly([1 -1],"x","coeff")
    else
        polyx=poly([0 1],"x","coeff")
        // y(n-2)
        yn2=poly(1,"x","coeff")
        // y(n-1)
        yn1=1-polyx
        for k=2:n
            y=((2*k-1-polyx)*yn1-(k-1)*yn2)/k
            yn2=yn1
            yn1=y
        end
    end
endfunction
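The recurrence 81 translates directly into code; here is a minimal Python sketch of the evaluation (our own illustration):

```python
def laguerre(n, x):
    # Evaluate L_n(x) with the recurrence (equation 81):
    # (k+1) L_{k+1}(x) = (2k+1-x) L_k(x) - k L_{k-1}(x).
    if n == 0:
        return 1.0
    p_prev, p = 1.0, 1.0 - x
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1 - x) * p - k * p_prev) / (k + 1)
    return p

print(laguerre(2, 1.0))  # L_2(1) = (1 - 4 + 2)/2 = -0.5
```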

Proposition 2.8. (Rodrigues formula for Laguerre polynomials) Laguerre polynomials are such that:

Ln(x) = (e^x/n!) (d^n/dx^n)(x^n e^(−x)),    (82)

for x ≥ 0.

The previous proposition will not be proved in this document.

Proposition 2.9. (Orthogonality of Laguerre polynomials) Laguerre polynomials are orthogonal, and

(Ln, Lm) = δnm,    (83)

for n, m ≥ 0.

In terms of integrals, the previous equation is:

∫_0^{+∞} Ln(x) Lm(x) w(x) dx = δnm,

where w is the weight of Laguerre polynomials.


Proof. We first prove that:

(Ln, Lm) = ((−1)^n/n!) ∫_0^{+∞} x^n e^(−x) (d^n/dx^n)Lm(x) dx.    (84)

Indeed, we can use the Rodrigues formula 82, then integrate by parts. This leads to:

(Ln, Lm) = ∫_0^{+∞} (e^x/n!) (d^n/dx^n)(x^n e^(−x)) Lm(x) e^(−x) dx
         = (1/n!) ∫_0^{+∞} (d^n/dx^n)(x^n e^(−x)) Lm(x) dx
         = (1/n!) [(d^(n−1)/dx^(n−1))(x^n e^(−x)) Lm(x)]_0^{+∞}
           − (1/n!) ∫_0^{+∞} (d^(n−1)/dx^(n−1))(x^n e^(−x)) (d/dx)Lm(x) dx.    (85)

However,

(d/dx)(x^n e^(−x)) = n x^(n−1) e^(−x) − x^n e^(−x) = (n − x) x^(n−1) e^(−x).

The previous equation implies that the first derivative of x^n e^(−x) has x^(n−1) e^(−x) as a factor. By induction on n, this implies that

(d^(n−1)/dx^(n−1))(x^n e^(−x))

must have x e^(−x) as a factor. Hence the first expression in the equation 85 is zero. This implies:

(Ln, Lm) = −(1/n!) ∫_0^{+∞} (d^(n−1)/dx^(n−1))(x^n e^(−x)) (d/dx)Lm(x) dx.

Using the same method n − 1 more times leads to the equation 84.

Assume first that n ≠ m. Therefore, we can assume that n > m (otherwise, we can just switch n and m). In this case, we must have

(d^n/dx^n)Lm(x) = 0,

since Lm is an m-th degree polynomial. The equation 84 then implies:

(Ln, Lm) = 0,

and concludes the first part of the proof.

Secondly, assume that n = m. The equation 84 implies:

(Ln, Ln) = ((−1)^n/n!) ∫_0^{+∞} x^n e^(−x) (d^n/dx^n)Ln(x) dx.    (86)

The equation 81 implies:

Ln+1(x) = −(x/(n + 1)) Ln(x) + ((2n + 1)/(n + 1)) Ln(x) − (n/(n + 1)) Ln−1(x),

for n = 1, 2, .... Since the leading term of L1(x) = 1 − x is −x, this implies by induction that the leading term of Ln(x) is

(−1)^n x^n / n!.

Therefore,

(d^n/dx^n)Ln(x) = (−1)^n n!/n! = (−1)^n.


We plug the previous equation into 86 and get:

(Ln, Ln) = ((−1)^(2n)/n!) ∫_0^{+∞} x^n e^(−x) dx
         = (1/n!) ∫_0^{+∞} x^n e^(−x) dx
         = n!/n! = 1,

where we have used the proposition A.5, page 66, which proves that

∫_0^{+∞} x^n e^(−x) dx = n!,

for n ≥ 0.
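The orthogonality relation 83 can be checked numerically with NumPy's Gauss-Laguerre rule (our own illustration):

```python
import numpy as np
from numpy.polynomial import laguerre as L

xg, wg = L.laggauss(15)  # 15-point rule for the weight exp(-x) on [0, +inf)

def inner(n, m):
    # Discrete inner product; [0,...,0,1] of length n+1 represents L_n.
    return np.sum(wg * L.lagval(xg, [0] * n + [1]) * L.lagval(xg, [0] * m + [1]))

print(inner(2, 5))  # ~ 0
print(inner(4, 4))  # ~ 1
```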

Laguerre polynomials are solutions of the following differential equation:

x y'' + (1 − x) y' = −n y.

The associated S-L equation is:

(x e^(−x) y')' = −n e^(−x) y.

2.4 Chebyshev polynomials of the first kind

In this section, we present Chebyshev polynomials of the first kind and their properties. It is important to mention that Chebyshev polynomials of the second kind are not presented in this document.

Proposition 2.10. (A trigonometry property) For any integers n and m, we have

cos(nθ) cos(mθ) = (1/2) cos((n + m)θ) + (1/2) cos((n − m)θ),    (87)

for any θ ∈ ℝ.

We will not prove the previous proposition in this document.

Definition 2.11. (Chebyshev polynomials) For any x ∈ [−1, 1], the Chebyshev polynomial is

Tn(x) = cos(n arccos(x)),    (88)

for n = 0, 1, ....

It is not obvious that Tn is a polynomial: this will be proved later in this document. However, we can see that this is true for n = 0 and n = 1, since

T0(x) = 1,
T1(x) = x,

for any x ∈ [−1, 1].

Proposition 2.12. (Three term recurrence for Chebyshev polynomials) Chebyshev polynomials are such that:

Tn+1(x) = 2x Tn(x) − Tn−1(x),    (89)

for any x ∈ [−1, 1].


T0(x) = 1
T1(x) = x
T2(x) = 2x² − 1
T3(x) = 4x³ − 3x
T4(x) = 8x⁴ − 8x² + 1
T5(x) = 16x⁵ − 20x³ + 5x

Figure 15: Chebyshev polynomials of the first kind

The previous proposition implies that the Chebyshev functions are, indeed, polynomials, since T0 and T1 are polynomials.

Proof. Let

θ = arccos(x),

for any x ∈ [−1, 1]. This implies

Tn(x) = cos(nθ),

for any n = 0, 1, .... The equation 87 with m = 1 implies:

cos(nθ) cos(θ) = (1/2) cos((n + 1)θ) + (1/2) cos((n − 1)θ)
               = (1/2) Tn+1(x) + (1/2) Tn−1(x).    (90)

Moreover,

cos(nθ) cos(θ) = x Tn(x),

which, combined with the equation 90, leads to 89.

The first Chebyshev polynomials are presented in the figure 15.

Example 2.13. (First Chebyshev polynomials) We have

T2(x) = 2x T1(x) − T0(x)
      = 2x² − 1.

Similarly,

T3(x) = 2x T2(x) − T1(x)
      = 2x(2x² − 1) − x
      = 4x³ − 2x − x
      = 4x³ − 3x.

The figure 16 plots the Chebyshev polynomials Tn, for n = 0, 1, 2, 3. The following chebyshev_poly Scilab function creates the Chebyshev polynomial of order n.

function y=chebyshev_poly(n)
    if (n==0) then
        y=poly(1,"x","coeff")
    elseif (n==1) then
        y=poly([0 1],"x","coeff")
    else
        polyx=poly([0 1],"x","coeff")
        // y(n-2)
        yn2=poly(1,"x","coeff")
        // y(n-1)
        yn1=polyx
        for k=2:n
            y=2*polyx*yn1-yn2
            yn2=yn1
            yn1=y
        end
    end
endfunction

Figure 16: The Chebyshev polynomials Tn, for n = 0, 1, 2, 3.
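The recurrence 89 can also be checked against the trigonometric definition 88; here is a minimal Python sketch (our own illustration):

```python
import math

def chebyshev(n, x):
    # Evaluate T_n(x) with the recurrence T_{n+1} = 2x T_n - T_{n-1}.
    if n == 0:
        return 1.0
    p_prev, p = 1.0, x
    for _ in range(1, n):
        p_prev, p = p, 2.0 * x * p - p_prev
    return p

# Agrees with the trigonometric definition cos(n*arccos(x)) (equation 88):
for n in (2, 5, 11):
    print(n, chebyshev(n, 0.3), math.cos(n * math.acos(0.3)))
```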

Proposition 2.14. (Orthogonality of Chebyshev polynomials) The Chebyshev polynomials are orthogonal with respect to the weight

w(x) = 1/√(1 − x²),    (91)

for x ∈ [−1, 1]. Moreover,

(Tn, Tm) = ∫_{−1}^{1} Tn(x) Tm(x) w(x) dx = (π/2) δnm,    (92)

for any integers n and m such that n + m ≥ 1 (for n = m = 0, we have (T0, T0) = π).

The weight defined by the equation 91 does not correspond to a particular distribution in probability theory.

Proof. First, notice that

arccos′(x) = −1/√(1 − x²),    (93)

for x ∈ [−1, 1]. Using the change of variable θ = arccos(x) and the equation 87, we get:

(Tn, Tm) = ∫_{−1}^{1} Tn(x) Tm(x) (1/√(1 − x²)) dx
         = −∫_π^0 cos(nθ) cos(mθ) dθ
         = ∫_0^π cos(nθ) cos(mθ) dθ
         = (1/2) ∫_0^π (cos((n + m)θ) + cos((n − m)θ)) dθ.


If n ≠ m, then

(Tn, Tm) = (1/2) [sin((n + m)θ)/(n + m) + sin((n − m)θ)/(n − m)]_0^π = 0,

since sin(jθ) = 0 for θ = 0 or θ = π, for any integer j. This shows that the equation 92 is true for n ≠ m. If, on the other hand, we have n = m ≥ 1, then

(Tn, Tn) = (1/2) ∫_0^π (cos((n + m)θ) + 1) dθ
         = (1/2) ∫_0^π (cos(2nθ) + 1) dθ
         = (1/2) [sin(2nθ)/(2n) + θ]_0^π
         = π/2,

which concludes the proof.

Proposition 2.15. (Roots of Chebyshev polynomials) The Chebyshev polynomial of degree n has n simple real roots in [−1, 1], which are:

xk = cos((2k − 1)π/(2n)),    (94)

for k = 1, ..., n.

Proof. The equation 94 implies:

arccos(xk) = (2k − 1)π/(2n),    (95)

which implies

Tn(xk) = cos(n arccos(xk))
       = cos(n (2k − 1)π/(2n))
       = cos((2k − 1)π/2)
       = 0,

which concludes the proof.

There is one small issue with the equation 94. When k = 1, ..., n, the roots x1, ..., xn go from the end of the interval [−1, 1] to the beginning. This might be surprising, because we may expect that the roots are in increasing order. This issue is solved by the following proposition.

Proposition 2.16. The roots of the Chebyshev polynomial of degree n are:

xk = −cos((2k − 1)π/(2n)),    (96)

for k = 1, ..., n.

The roots defined by the equation 96 are computed in increasing order when k = 1, ..., n and are the same as the roots defined by the equation 94.


Proof. Notice that:

cos(θ) = −cos(π − θ),

for any θ ∈ ℝ. Hence,

−cos((2k − 1)π/(2n)) = cos(π − (2k − 1)π/(2n))
                     = cos((2n − 2k + 1)π/(2n))
                     = cos((2(n − k + 1) − 1)π/(2n))
                     = cos((2j − 1)π/(2n)),

with j = n − k + 1.

The corresponding weights in the quadrature rule are

αk = π/n,    (97)

for k = 1, ..., n.
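The nodes 96 and the constant weights 97 define the Gauss-Chebyshev rule; the following Python sketch (our own illustration) applies it to f(x) = x², for which ∫ x²/√(1 − x²) dx = π/2 on [−1, 1]:

```python
import math

def gauss_chebyshev(f, n):
    # Gauss-Chebyshev rule: nodes from equation 96, equal weights pi/n
    # from equation 97, approximating the integral of f(x)/sqrt(1-x^2).
    return sum(math.pi / n * f(-math.cos((2 * k - 1) * math.pi / (2 * n)))
               for k in range(1, n + 1))

print(gauss_chebyshev(lambda x: x * x, 8))  # ~ pi/2, exact since degree 2 <= 2n-1
```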

Proposition 2.17. (Extrema of Chebyshev polynomials) The extrema of the Chebyshev polynomial of degree n in the open interval (−1, 1) are:

xk = cos(kπ/n),    (98)

for k = 1, ..., n − 1. They are such that

Tn(xk) = (−1)^k,    (99)

for k = 1, ..., n − 1.

Proof. The equation 88 implies that the derivative of Tn is:

Tn′(x) = −n sin(n arccos(x)) arccos′(x) = n sin(n arccos(x))/√(1 − x²),

where the derivative of arccos is given by the equation 93. Therefore, we have Tn′(xk) = 0 if

sin(n arccos(xk)) = 0,

for k = 1, ..., n − 1 and xk ≠ ±1. This implies that the argument of sin must be a multiple of π:

n arccos(xk) = kπ,

which implies the equation 98. Moreover, the abscissas defined by the equation 98 for k = 1, ..., n − 1 are different from −1 and 1, so that the derivative of Tn is well defined at these points.

We plug the equation 98 into the equation 88 and get:

Tn(xk) = cos(kπ) = (−1)^k,

for k = 1, ..., n − 1, which concludes the proof.


Figure 17: Degree 11 Chebyshev polynomial.

Chebyshev polynomials are solutions of the following differential equation:

(1-x^2) y'' - x y' = -n^2 y,

for x ∈ [−1, 1]. The associated Sturm-Liouville equation is:

\left( \sqrt{1-x^2}\, y' \right)' = -\frac{n^2}{\sqrt{1-x^2}}\, y.

The figure 17 presents the polynomial T11. It is produced by the following script, which uses the chebyshev_eval function in order to evaluate T11 on the interval [−1, 1]. Then we use the chebyshev_quadrature function in order to compute the roots r and the weights w of the Gauss-Chebyshev quadrature.

n=11;
x=linspace(-1,1,100);
y=chebyshev_eval(x,n);
plot(x,y)
[r,w]=chebyshev_quadrature(n);
plot(r,zeros(r),"rx")
xlabel("x")
ylabel("P(x)")
title(msprintf("Chebyshev polynomial - Degree %d",n))

We can see that the roots of the Chebyshev polynomial have a higher density at the ends of the interval [−1, 1]. Moreover, we check that the polynomial T11 oscillates, while its absolute value never exceeds 1, as predicted by the equation 99.

2.5 Accuracy of evaluation of the polynomial

In this section, we present the accuracy issues which appear when we evaluate the orthogonal polynomials with large values of the polynomial degree n. More precisely, we present a fundamental limitation in the accuracy that can be achieved by the polynomial representation of the orthogonal polynomials, while a sufficient accuracy can be achieved with a different evaluation algorithm.


The following LaguerreEval function evaluates the Laguerre polynomial of degree n at the point x. On the one hand, the LaguerrePoly function creates a polynomial which can be evaluated with the horner function, while, on the other hand, the function LaguerreEval directly evaluates the polynomial. As we are going to see soon, although the recurrence is the same, the numerical performance of these two functions is rather different.

function y=LaguerreEval(x,n)
    if (n==0) then
        nrows=size(x,"r")
        y=ones(nrows,1)
    elseif (n==1) then
        y=1-x
    else
        // y(n-2)
        yn2=1
        // y(n-1)
        yn1=1-x
        for k=2:n
            y=((2*k-1-x).*yn1-(k-1)*yn2)/k
            yn2=yn1
            yn1=y
        end
    end
endfunction

Consider the value of the polynomial L100 at the point x = 10. From Wolfram Alpha, which uses an arbitrary precision computation system, the exact value of L100(10) is

13.277662844303454137789892644070318985922997400185621...

In the following session, we compare the accuracies of LaguerrePoly and LaguerreEval.

-->n=100;
-->x=10;
-->L=LaguerrePoly(n);
-->y1=horner(L,x)
 y1  =
    3.695D+08
-->y2=LaguerreEval(x,n)
 y2  =
    13.277663

We see that the polynomial created by LaguerrePoly does not produce a sufficient accuracy, while LaguerreEval seems to be accurate. In fact, the value produced by LaguerreEval has more than 15 correct decimal digits.

In order to see the trend when x and n increase, we compute the number of common digits, in base 2, in the evaluation of Ln(x), computed from LaguerrePoly and LaguerreEval. Since Scilab uses double precision floating point numbers with 64 bits and 53 bits of precision, this number varies from 53 (when the two functions return exactly the same value) to 0 (when the two functions return two values which have no digits in common). The figure 18 plots the number of common digits, when n ranges from 1 to 100 and x ranges from 1 to 21.

We see that the number of correct digits produced by LaguerrePoly decreases when x and n increase. The explanation for this fact is that the polynomial representation of the Laguerre polynomial forces us to use n = 100 additions of positive and negative powers of x. This computation is ill-conditioned, because it involves large numbers with different signs.

For example, the following session presents some of the coefficients of L100.

-->L

L =

1 - 100x + 2475x^2 - 26950x^3 + 163384.37x^4 - 627396x^5 + 1655628.3x^6


Figure 18: The number of common digits in the evaluation of Laguerre polynomials, for increasing nand x.

- 3176103.3x^7 + 4615275.2x^8 - 5242040.9x^9 + 4770257.2x^10

[...]

- 7.28D-141x^95 + 3.95D-144x^96 - 1.68D-147x^97 + 5.25D-151x^98 - 1.07D-154x^99 + 1.07D-158x^100

We see that the signs of the coefficients alternate.

Consider n real numbers y_i, for i = 1, 2, ..., n, representing the monomial values which appear in the polynomial representation. We are interested in the sum

S = y_1 + y_2 + ... + y_n.

We know from numerical analysis [11] that the condition number of a sum is

\frac{|y_1| + |y_2| + ... + |y_n|}{|y_1 + y_2 + ... + y_n|}.

Hence, when the sum S is small in magnitude, but involves large terms y_i, the condition number is large. In other words, small relative errors in the y_i are converted into large relative errors in S, and this is what happens for polynomials.

Hence, the polynomial representation should be avoided in practice to evaluate the orthogonal polynomials, although polynomials are a convenient way of using them in Scilab. On the other hand, a straightforward evaluation of the polynomial value based on the recurrence formula can produce a good accuracy.
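The contrast between the two evaluation strategies can be reproduced outside Scilab. The following Python sketch (for illustration; the function names are ours, not part of NISP) evaluates L100(10) once with the three-term recurrence and once by summing the exact monomial coefficients in floating point; only the first result is close to the reference value quoted above.

```python
import math

def laguerre_rec(x, n):
    # Evaluate L_n(x) with the three-term recurrence (as LaguerreEval does):
    # k*L_k(x) = (2k-1-x)*L_{k-1}(x) - (k-1)*L_{k-2}(x)
    if n == 0:
        return 1.0
    yn2, yn1 = 1.0, 1.0 - x
    for k in range(2, n + 1):
        yn2, yn1 = yn1, ((2 * k - 1 - x) * yn1 - (k - 1) * yn2) / k
    return yn1

def laguerre_monomials(x, n):
    # Evaluate L_n(x) from its exact coefficients, summed in floating point:
    # L_n(x) = sum_k (-1)^k * C(n,k) / k! * x^k.  The terms are huge and
    # alternate in sign, so the sum is ill-conditioned for large n and x.
    return sum((-1) ** k * math.comb(n, k) / math.factorial(k) * x ** k
               for k in range(n + 1))

exact = 13.277662844303454            # L_100(10), arbitrary precision reference
stable = laguerre_rec(10.0, 100)      # close to the reference
naive = laguerre_monomials(10.0, 100) # dominated by rounding errors
```

Even though the coefficients here are computed exactly before being rounded, the cancellation in the final sum destroys the result, which confirms that the problem lies in the monomial representation itself, not in how the coefficients are obtained.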

2.6 Notes and references

The classic [1] gives a brief overview of the main results on orthogonal polynomials in the chapter 22 "Orthogonal polynomials". More details on orthogonal polynomials are presented in [3], especially the chapter 12 "Legendre functions" and the chapter 13 "More special functions".

The book [2] has three chapters on orthogonal polynomials, including quadrature, and other advanced topics.

One of the references on the topic is [20], which covers the topic in depth. In the appendices of [15], we can find a presentation of the most common orthogonal polynomials used in uncertainty analysis. The proposition 2.6 is given in [10], chapter 6.


     Cell 1   Cell 2   Cell 3
 1   ***      -        -
 2   -        ***      -
 3   -        -        ***
 4   **       *        -
 5   **       -        *
 6   *        **       -
 7   -        **       *
 8   *        -        **
 9   -        *        **
10   *        *        *

Figure 19: Placing d = 3 balls into p = 3 cells. The balls are represented by stars ∗.

3 Multivariate polynomials

In this section, we define the multivariate polynomials which are used in the context of polynomial chaos decomposition. We consider the regular polynomials as well as multivariate orthogonal polynomials.

3.1 Occupancy problems

In the next section, we will count the number of monomials of degree d with p variables. This problem is the same as placing d balls into p cells, which is the topic of the current section.

Consider the problem of placing d = 3 balls into p = 3 cells. The figure 19 presents the 10 ways to place the balls.

Let us denote by α_i ≥ 0 the number of balls in the i-th cell, for i = 1, 2, ..., p. Let α = (α_1, α_2, ..., α_p) be the associated vector. Each particular configuration is such that the total number of balls is d, i.e. we have

\alpha_1 + \alpha_2 + ... + \alpha_p = d.

Example 3.1. ( Case where d = 8 and p = 6) Consider the case where there are d = 8 balls to distribute into p = 6 cells. We represent the balls with stars "*" and the cells are represented by the spaces between the p + 1 bars "|". For example, the string "|***|*| | | |****|" represents the configuration where α = (3, 1, 0, 0, 0, 4).

Proposition 3.2. ( Occupancy problem) The number of ways to distribute d balls into p cells is:

\left(\!\!\binom{p}{d}\!\!\right) = \binom{p+d-1}{d}.   (100)

The left hand side of the previous equation is the multiset coefficient.

Proof. Within the string, there is a total of p + 1 bars: the first and last bars, and the p − 1 intermediate bars. We are free to move only the p − 1 intermediate bars and the d stars, for a total of p + d − 1 symbols. Therefore, the problem reduces to finding the place of the d stars in a string of p + d − 1 symbols. From combinatorics, we know that the number of ways to choose k items in a set of n items is

\binom{n}{k} = \frac{n!}{k!(n-k)!},   (101)

for any nonnegative integers n and k, with k ≤ n. Hence, the number of ways to choose the place of the d stars in a string of p + d − 1 symbols is given by the equation 100.
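The stars and bars argument can be turned into an enumeration. The following Python sketch (an illustration; not part of the document's Scilab code) generates every placement by choosing the positions of the p − 1 movable bars among the p + d − 1 symbols, and recovers the 10 configurations of the figure 19.

```python
from itertools import combinations
from math import comb

def weak_compositions(d, p):
    # Stars and bars: choose the positions of the p-1 bars among the
    # d + p - 1 symbol slots; the gaps between bars give the counts alpha_i.
    for bars in combinations(range(d + p - 1), p - 1):
        alpha, prev = [], -1
        for b in bars:
            alpha.append(b - prev - 1)
            prev = b
        alpha.append(d + p - 2 - prev)
        yield tuple(alpha)

configs = list(weak_compositions(3, 3))   # the 10 placements of figure 19
expected = comb(3 + 3 - 1, 3)             # multiset coefficient, equation 100
```

Each generated tuple sums to d by construction, and the number of tuples equals the binomial coefficient of the equation 100, since there is exactly one tuple per choice of bar positions.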


α1 α2 α3 α4

2 0 0 0
1 1 0 0
1 0 1 0
1 0 0 1
0 2 0 0
0 1 1 0
0 1 0 1
0 0 2 0
0 0 1 1
0 0 0 2

Figure 20: The monomials with degree d = 2 and p = 4 variables.

3.2 Multivariate monomials

The one-dimensional monomials 1, x, x^2, ... can be used to create a multivariate monomial, just by multiplying the appropriate components. Consider, for example, p = 3 and let x ∈ R^3 be a point in the three-dimensional space. The function

x_2 x_3^2

is the monomial associated to the vector α = (0, 1, 2), whose entries are the exponents of x = (x_1, x_2, x_3).

Definition 3.3. ( Multivariate monomial) Let x ∈ R^p. The function M_α(x) is a multivariate monomial if there is a vector α = (α_1, α_2, ..., α_p) of integer exponents, where α_i ∈ {0, 1, 2, ...}, for i = 1, 2, ..., p, such that

M_\alpha(x) = x_1^{\alpha_1} x_2^{\alpha_2} ... x_p^{\alpha_p},   (102)

for any x ∈ R^p.

The vector α is called a multi-index. The previous equation can be rewritten in a more general way:

M_\alpha(x) = \prod_{i=1}^p x_i^{\alpha_i},   (103)

for any x ∈ R^p.

Definition 3.4. ( Degree of a multivariate monomial) Let M(x) be a monomial associated with the exponents α, for any x ∈ R^p. The degree of M is

d = \alpha_1 + \alpha_2 + ... + \alpha_p.

The degree d of a multivariate monomial is also denoted by |α|. The figure 20 presents the 10 monomials with degree d = 2 and p = 4 variables.

Proposition 3.5. ( Number of multivariate monomials) The number of degree d multivariate monomials of p variables is:

\left(\!\!\binom{p}{d}\!\!\right) = \binom{p+d-1}{d}.   (104)

Proof. The problem can be reduced to distributing d balls into p cells: the i-th cell represents the variable x_i, in which we have to put α_i balls. It is then straightforward to use the proposition 3.2.

The figure 21 presents the number of monomials for various d and p.


p\d   0   1    2    3    4    5
1     1   1    1    1    1    1
2     1   2    3    4    5    6
3     1   3    6    10   15   21
4     1   4    10   20   35   56
5     1   5    15   35   70   126

Figure 21: Number of degree d monomials with p variables.

3.3 Multivariate polynomials

Definition 3.6. ( Multivariate polynomial) Let x ∈ R^p. The function P(x) is a degree d multivariate polynomial if there is a set of multivariate monomials M_α(x) and a set of real numbers β_α such that

P(x) = \sum_{|\alpha| \leq d} \beta_\alpha M_\alpha(x),   (105)

for any x ∈ R^p.

We can plug the equation 103 into 105 and get

P(x) = \sum_{|\alpha| \leq d} \beta_\alpha \prod_{i=1}^p x_i^{\alpha_i},   (106)

for any x ∈ R^p.

Example 3.7. ( Case where d = 3 and p = 3) Consider the degree d = 3 multivariate polynomial with p = 3 variables:

P(x_1, x_2, x_3) = 4 + 2x_3 + 5x_1^2 - 3x_1 x_2 x_3,   (107)

for any x ∈ R^3. The relevant multivariate monomial exponents α are:

(0, 0, 0), (0, 0, 1),

(1, 0, 0), (1, 1, 1).

The only nonzero coefficients β are:

β(0,0,0) = 4, β(0,0,1) = 2,

β(1,0,0) = 5, β(1,1,1) = −3.

Proposition 3.8. ( Number of multivariate polynomials) The number of degree d multivariate polynomials of p variables is:

P_d^p = \binom{p+d}{d}.   (108)

Proof. The proof is by induction on d. The only polynomial of degree d = 0 is:

P_0(x) = 1.   (109)

Hence, there are

1 = \binom{p}{0}

multivariate polynomials of degree d = 0. This proves that the equation 108 is true for d = 0. The polynomials of degree d = 1 are:

P_i(x) = x_i,   (110)


p\d   0   1    2    3    4    5    6
1     1   2    3    4    5    6    7
2     1   3    6    10   15   21   28
3     1   4    10   20   35   56   84
4     1   5    15   35   70   126  210
5     1   6    21   56   126  252  462
6     1   7    28   84   210  462  924

Figure 22: Number of degree d polynomials with p variables.

for i = 1, 2, ..., p. Hence, there is a total of 1 + p polynomials of degree d = 1, which can be written

1 + p = \binom{p+1}{1}.

This proves that the equation 108 is true for d = 1.

Now, assume that the equation 108 is true for d, and let us prove that it is true for d + 1. For a multivariate polynomial P(x) of degree d + 1, the equation 105 implies

P(x) = \sum_{|\alpha| \leq d+1} \beta_\alpha M_\alpha(x) = \sum_{|\alpha| \leq d} \beta_\alpha M_\alpha(x) + \sum_{|\alpha| = d+1} \beta_\alpha M_\alpha(x),

for any x ∈ R^p. Hence, the number of multivariate polynomials of degree d + 1 is the sum of the number of multivariate polynomials of degree d and the number of multivariate monomials of degree d + 1. From the equation 104, the number of such monomials is

\left(\!\!\binom{p}{d+1}\!\!\right) = \binom{p+d}{d+1}.   (111)

Hence, the number of multivariate polynomials of degree d + 1 is:

\binom{p+d}{d+1} + \binom{p+d}{d}.   (112)

However, we know from combinatorics that

\binom{n}{k} + \binom{n}{k-1} = \binom{n+1}{k},   (113)

for any nonnegative integers n and k. We apply the previous equality with n = p + d and k = d + 1, and get

\binom{p+d}{d+1} + \binom{p+d}{d} = \binom{p+d+1}{d+1},   (114)

which proves that the equation 108 is also true for d+ 1, and concludes the proof.
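The binomial formula 108 is straightforward to evaluate. A minimal Python check (for illustration) reproduces the rows of the figure 22 and the Pascal identity used in the proof.

```python
from math import comb

def P(p, d):
    # Number of multivariate polynomials of degree d with p variables,
    # equation 108: P_d^p = C(p + d, d)
    return comb(p + d, d)

# reproduce the table of figure 22: rows p = 1..6, columns d = 0..6
table = [[P(p, d) for d in range(7)] for p in range(1, 7)]
```

The Pascal identity of the equation 113 shows up as P(p, d + 1) = P(p, d) + C(p + d, d + 1), which is exactly the induction step of the proof above.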

The figures 22 and 23 present the number of polynomials for various d and p.

One possible issue with the equation 105 is that it does not specify a way of ordering the P_d^p multivariate polynomials of degree d. However, we will present in the next section a constructive way of ordering the multi-indices in such a way that there is a one-to-one mapping from the single index k, in the range from 1 to P_d^p, to the corresponding multi-index α(k). With this ordering, the equation 105 becomes

P(x) = \sum_{k=1}^{P_d^p} \beta_k M_{\alpha(k)}(x),   (115)

for any x ∈ Rp.


Figure 23: Number of degree d polynomials with p variables.

Example 3.9. ( Case where d = 2 and p = 2) Consider the degree d = 2 multivariate polynomial with p = 2 variables:

P(x_1, x_2) = \beta_{(0,0)} + \beta_{(1,0)} x_1 + \beta_{(0,1)} x_2 + \beta_{(2,0)} x_1^2 + \beta_{(1,1)} x_1 x_2 + \beta_{(0,2)} x_2^2,   (116)

for any (x_1, x_2) ∈ R^2. This corresponds to the equation

P(x_1, x_2) = \beta_1 + \beta_2 x_1 + \beta_3 x_2 + \beta_4 x_1^2 + \beta_5 x_1 x_2 + \beta_6 x_2^2,   (117)

for any (x_1, x_2) ∈ R^2.

3.4 Generating multi-indices

In this section, we present an algorithm to generate all the multi-indices α associated with a degree d multivariate polynomial of p variables.

The following polymultiindex function implements the algorithm suggested by Le Maître and Knio in the appendix of [15]. The function returns a matrix a with p columns and P_d^p rows. Each row a(i,:) represents the i-th multivariate polynomial, for i = 1, 2, ..., P_d^p. For each row i, the entry a(i,j) is the exponent of the j-th variable x_j, for i = 1, 2, ..., P_d^p and j = 1, 2, ..., p. The sum of the exponents in each row is lower than or equal to d. The first row is the zero-degree polynomial, so that all entries in a(1,1:p) are zero. The rows 2 to p+1 are the first degree polynomials: all columns are zero, except one entry which is equal to 1.

function a=polymultiindex(p,d)
    // Zero-th order polynomial
    a(1,1:p)=zeros(1,p)
    // First order polynomials
    a(2:p+1,1:p)=eye(p,p)
    P=p+1
    pmat=[]
    pmat(1:p,1)=1
    for k=2:d
        L=P
        for i=1:p
            pmat(i,k)=sum(pmat(i:p,k-1))
        end
        for j=1:p
            for m=L-pmat(j,k)+1:L
                P=P+1
                a(P,1:p)=a(m,1:p)
                a(P,j)=a(P,j)+1
            end
        end
    end
endfunction

For example, in the following session, we compute the list of multi-indices corresponding to multivariate polynomials of degree d = 3 with p = 3 variables.

-->p=3;
-->d=3;
-->a=polymultiindex(p,d)
 a  =
    0.    0.    0.
    1.    0.    0.
    0.    1.    0.
    0.    0.    1.
    2.    0.    0.
    1.    1.    0.
    1.    0.    1.
    0.    2.    0.
    0.    1.    1.
    0.    0.    2.
    3.    0.    0.
    2.    1.    0.
    2.    0.    1.
    1.    2.    0.
    1.    1.    1.
    1.    0.    2.
    0.    3.    0.
    0.    2.    1.
    0.    1.    2.
    0.    0.    3.

There are 20 rows in the previous matrix a, which is consistent with the figure 22. In the previous session, the first row corresponds to the zero-th degree polynomial, the rows 2 to 4 correspond to the degree one monomials, the rows 5 to 10 correspond to the degree 2 monomials, whereas the rows 11 to 20 correspond to the degree 3 monomials.
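A compact reimplementation of the multi-index generation can be sketched in Python (for illustration; the ordering inside each degree may differ from the one produced by polymultiindex), listing the exponent vectors degree by degree with the stars and bars enumeration of the section 3.1.

```python
from itertools import combinations
from math import comb

def poly_multiindex(p, d):
    # All exponent vectors alpha with |alpha| <= d, listed degree by degree.
    rows = []
    for deg in range(d + 1):
        # stars and bars: one weak composition of deg per choice of bars
        for bars in combinations(range(deg + p - 1), p - 1):
            alpha, prev = [], -1
            for b in bars:
                alpha.append(b - prev - 1)
                prev = b
            alpha.append(deg + p - 2 - prev)
            rows.append(tuple(alpha))
    return rows

a = poly_multiindex(3, 3)   # 20 rows, like the Scilab session above
```

As in the Scilab session, the first row is the zero multi-index, the next p rows have degree one, and the total number of rows is the binomial coefficient of the equation 108.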

3.5 Multivariate orthogonal functions

In this section, we introduce the weighted space of square integrable functions in R^p, so that we can define the orthogonality of multivariate functions.

Definition 3.10. ( Interval of R^p) An interval I of R^p is a subset of R^p such that x ∈ I if and only if

x_1 ∈ I_1, x_2 ∈ I_2, ..., x_p ∈ I_p,

where I_1, I_2, ..., I_p are intervals of R. We denote this tensor product by:

I = I_1 ⊗ I_2 ⊗ ... ⊗ I_p.

Now that we have multivariate intervals, we can consider a multivariate weight function on such an interval.

Definition 3.11. ( Multivariate weight function of R^p) Let I be an interval in R^p. A weight function w on I is a nonnegative integrable function of x ∈ I.


Such a weight can be created by making the tensor product of univariate weights.

Proposition 3.12. ( Tensor product of univariate weights) Let I_1, I_2, ..., I_p be intervals of R. Assume that w_1, w_2, ..., w_p are univariate weights on I_1, I_2, ..., I_p. Then the tensor product function

w(x) = w_1(x_1) w_2(x_2) ... w_p(x_p),

for any x ∈ I, is a weight function on I = I_1 ⊗ I_2 ⊗ ... ⊗ I_p.

Proof. Indeed, its integral is

\int_I w(x) dx = \int_I w_1(x_1) w_2(x_2) ... w_p(x_p) dx_1 dx_2 ... dx_p = \left( \int_{I_1} w_1(x_1) dx_1 \right) \left( \int_{I_2} w_2(x_2) dx_2 \right) ... \left( \int_{I_p} w_p(x_p) dx_p \right).

However, all the individual integrals are finite, since each function w_i is, by hypothesis, a weight on I_i, for i = 1, 2, ..., p. Hence, the product is finite, so that the function w is, indeed, a weight on I.

Example 3.13. ( Multivariate Gaussian weight) In the particular case where all the weights are the same, the expression simplifies further. Consider, for example, the Gaussian weight

w(x) = \exp(-x^2/2),   (118)

for x ∈ R. We can then consider the tensor product:

w(x) = w(x_1) w(x_2) ... w(x_p),   (119)

for x_1, x_2, ..., x_p ∈ R.

Example 3.14. ( Multivariate Hermite-Legendre weight) Consider the weight for Hermite polynomials

w_1(x_1) = \exp(-x_1^2/2)

for x_1 ∈ R, and consider the weight for Legendre polynomials w_2(x_2) = 1 for x_2 ∈ [−1, 1]. Then

w(x) = w_1(x_1) w_2(x_2)

for x ∈ I is a weight on I = R ⊗ [−1, 1].

In the remainder of this document, we will not make a notational difference between the univariate weight function w(x) and the multivariate weight w(x), assuming that it is based on a tensor product if needed.

Definition 3.15. ( Multivariate weighted L^2 space in R^p) Let I be an interval in R^p. Let w be a multivariate weight function on I. Let L^2_w(I) be the set of functions g which are square integrable with respect to the weight function w, i.e. such that the integral

\|g\|^2 = \int_I g(x)^2 w(x) dx   (120)

is finite. In this case, the norm of g is \|g\|.

Definition 3.16. ( Multivariate inner product in the L^2_w(I) space) Let I be an interval of R^p. Let w be a multivariate weight function on I. For any g, h ∈ L^2_w(I), the inner product of g and h is

(g, h) = \int_I g(x) h(x) w(x) dx.   (121)

Let I be an interval of R^p and assume that g ∈ L^2_w(I). We can combine the equations 120 and 121, which implies that the L^2_w(I) norm of g can be expressed as an inner product:

\|g\|^2 = (g, g).   (122)


Proposition 3.17. ( Multivariate probability distribution function) Let I be an interval of R^p. Let w be a multivariate weight function on I. Assume that {X_i}_{i=1,2,...,p} are independent random variables associated with the probability distribution functions f_i, derived from the weight functions w_i, for i = 1, 2, ..., p. Then the function

f(x) = f_1(x_1) f_2(x_2) ... f_p(x_p),   (123)

for x ∈ I, is a probability distribution function.

Proof. We must prove that the integral of f is equal to one. Indeed,

\int_I f(x) dx = \int_I f_1(x_1) f_2(x_2) ... f_p(x_p) dx_1 dx_2 ... dx_p = \left( \int_{I_1} f_1(x_1) dx_1 \right) \left( \int_{I_2} f_2(x_2) dx_2 \right) ... \left( \int_{I_p} f_p(x_p) dx_p \right) = 1,

where each integral is equal to one since, by hypothesis, f_i is a probability distribution function for i = 1, 2, ..., p.

3.6 Tensor product of orthogonal polynomials

In this section, we present a method to create multivariate orthogonal polynomials. Consider the case where p = 2 and let us try to create bivariate Hermite orthogonal polynomials. Therefore, we consider the inner product:

(g, h) = \int_{R^2} g(x_1, x_2) h(x_1, x_2) w(x_1) w(x_2) dx_1 dx_2,   (124)

associated with the multivariate Gaussian weight w(x). We can make the tensor product of some Hermite polynomials, which leads, for example, to:

\Psi_1(x_1, x_2) = He_1(x_1),
\Psi_2(x_1, x_2) = He_1(x_1) He_2(x_2).

These polynomials have degrees 1 and 3, respectively. We may wonder if these two polynomials are orthogonal. We have

(\Psi_1, \Psi_2) = \int_{R^2} \Psi_1(x_1, x_2) \Psi_2(x_1, x_2) w(x_1) w(x_2) dx_1 dx_2 = \int_{R^2} He_1(x_1)^2 He_2(x_2) w(x_1) w(x_2) dx_1 dx_2,

which implies

(\Psi_1, \Psi_2) = \left( \int_R He_1(x_1)^2 w(x_1) dx_1 \right) \left( \int_R He_2(x_2) w(x_2) dx_2 \right).   (125)

By the orthogonality of Hermite polynomials, we have

\int_R He_2(x_2) w(x_2) dx_2 = \int_R He_0(x_2) He_2(x_2) w(x_2) dx_2 = 0.

Hence,

(\Psi_1, \Psi_2) = 0,


d    Multi-index       Polynomial
0    α(1) = [0, 0]     Ψ1(x) = He0(x1)He0(x2) = 1
1    α(2) = [1, 0]     Ψ2(x) = He1(x1)He0(x2) = x1
1    α(3) = [0, 1]     Ψ3(x) = He0(x1)He1(x2) = x2
2    α(4) = [2, 0]     Ψ4(x) = He2(x1)He0(x2) = x1^2 - 1
2    α(5) = [1, 1]     Ψ5(x) = He1(x1)He1(x2) = x1x2
2    α(6) = [0, 2]     Ψ6(x) = He0(x1)He2(x2) = x2^2 - 1
3    α(7) = [3, 0]     Ψ7(x) = He3(x1)He0(x2) = x1^3 - 3x1
3    α(8) = [2, 1]     Ψ8(x) = He2(x1)He1(x2) = (x1^2 - 1)x2
3    α(9) = [1, 2]     Ψ9(x) = He1(x1)He2(x2) = x1(x2^2 - 1)
3    α(10) = [0, 3]    Ψ10(x) = He0(x1)He3(x2) = x2^3 - 3x2

Figure 24: Multivariate Hermite polynomials of p = 2 variables and degree d = 3.

which implies that Ψ1 and Ψ2 are orthogonal. However,

\|\Psi_1\|^2 = \int_{R^2} He_1(x_1)^2 w(x_1) w(x_2) dx_1 dx_2 = \left( \int_R He_1(x_1)^2 w(x_1) dx_1 \right) \left( \int_R w(x_2) dx_2 \right) = \sqrt{2\pi}\, \|He_1\|^2.

Definition 3.18. ( Tensor product of orthogonal polynomials) Let I be a tensor product interval of R^p and let w be the associated multivariate tensor product weight function. Let \phi_{\alpha_i^{(k)}}(x_i) be a family of univariate orthogonal polynomials. The associated multivariate tensor product polynomials are

\Psi_k(x) = \prod_{i=1}^p \phi_{\alpha_i^{(k)}}(x_i)   (127)

for k = 1, 2, ..., P_d^p, where the degree of \Psi_k is

d = \sum_{i=1}^p \alpha_i^{(k)}.   (128)

Example 3.19. ( Multivariate Hermite polynomials) The figure 24 presents the multivariate Hermite polynomials of p = 2 variables and degree d = 3. The figure 25 presents the multivariate Hermite polynomials with degree d = 2 and p = 2 variables.
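The entries of the figure 24 can be checked numerically. The following Python sketch (for illustration) evaluates the probabilists' Hermite polynomials with the recurrence He_{k+1}(x) = x He_k(x) − k He_{k−1}(x) and forms the tensor product of the equation 127.

```python
def He(n, x):
    # Probabilists' Hermite polynomials: He_{k+1}(x) = x*He_k(x) - k*He_{k-1}(x)
    h0, h1 = 1.0, x
    if n == 0:
        return h0
    for k in range(1, n):
        h0, h1 = h1, x * h1 - k * h0
    return h1

def Psi(alpha, x):
    # Tensor product polynomial of equation 127
    prod = 1.0
    for ai, xi in zip(alpha, x):
        prod *= He(ai, xi)
    return prod

# Psi_8 of figure 24: alpha = (2, 1), Psi_8(x) = (x1^2 - 1) * x2
val = Psi((2, 1), (3.0, 2.0))   # (9 - 1) * 2 = 16
```

The same helper evaluates any row of the figure 24, since each Ψk is fully determined by its multi-index α(k).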

Proposition 3.20. ( Multivariate orthogonal polynomials) The multivariate tensor product polynomialsdefined in 3.18 are orthogonal.

Proof. We must prove that, for two different integers k and ℓ, we have

(\Psi_k, \Psi_\ell) = 0.

By the definition of the inner product on L^2_w(I), we have

(\Psi_k, \Psi_\ell) = \int_I \Psi_k(x) \Psi_\ell(x) w(x) dx.

By assumption, the multivariate weight function w on I is the tensor product of univariate weight functions w_i on I_i. Hence,

(\Psi_k, \Psi_\ell) = \int_I \prod_{i=1}^p \phi_{\alpha_i^{(k)}}(x_i) \prod_{i=1}^p \phi_{\alpha_i^{(\ell)}}(x_i) \prod_{i=1}^p w_i(x_i) dx = \prod_{i=1}^p \int_{I_i} \phi_{\alpha_i^{(k)}}(x_i) \phi_{\alpha_i^{(\ell)}}(x_i) w_i(x_i) dx_i.


Figure 25: The multivariate Hermite polynomials, with degree d = 2 and p = 2 variables.


In other words,

(\Psi_k, \Psi_\ell) = \prod_{i=1}^p \left( \phi_{\alpha_i^{(k)}}, \phi_{\alpha_i^{(\ell)}} \right).   (129)

However, the multi-index ordering implies that, if k ≠ ℓ, then there exists an integer i ∈ {1, 2, ..., p} such that

\alpha_i^{(k)} \neq \alpha_i^{(\ell)}.

By assumption, the polynomials \phi_{\alpha_i^{(k)}}(x_i) are orthogonal. This implies

\left( \phi_{\alpha_i^{(k)}}, \phi_{\alpha_i^{(\ell)}} \right) = 0,

which concludes the proof.

We can use the equation 129 to compute the L^2_w(I) norm of \Psi_k. Indeed,

\|\Psi_k\|^2 = (\Psi_k, \Psi_k) = \prod_{i=1}^p \left( \phi_{\alpha_i^{(k)}}, \phi_{\alpha_i^{(k)}} \right) = \prod_{i=1}^p \|\phi_{\alpha_i^{(k)}}\|^2.

This proves the following proposition.

Proposition 3.21. ( Norm of multivariate orthogonal polynomials) The L^2_w(I) norm of the multivariate orthogonal polynomials defined in 3.18 is:

\|\Psi_k\|^2 = \prod_{i=1}^p \|\phi_{\alpha_i^{(k)}}\|^2.   (130)

Example 3.22. ( Norm of the multivariate Hermite polynomials) Consider the multivariate Hermite polynomials in the case where p = 2. The figure 24 indicates that

\Psi_8(x) = He_2(x_1) He_1(x_2),   (131)

for x_1, x_2 ∈ R. Hence, the equation 130 implies:

\|\Psi_8\|^2 = \|He_2\|^2 \|He_1\|^2.

The norm of the univariate Hermite polynomial is given by the equation 69. Hence,

\|\Psi_8\|^2 = \sqrt{2\pi} \cdot 2! \cdot \sqrt{2\pi} \cdot 1! = 4\pi.

For any p ≥ 1, we have

\|\Psi_k\|^2 = \prod_{i=1}^p \sqrt{2\pi}\, \alpha_i^{(k)}! = (2\pi)^{p/2} \prod_{i=1}^p \alpha_i^{(k)}!.
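The value ‖Ψ8‖² = 4π can be checked by simulation, since ‖Ψ8‖² = 2π E(Ψ8(X)²) when X1 and X2 are independent standard normal variables (the factor 2π is the normalizing constant of the bivariate Gaussian weight). The following Python sketch (a Monte Carlo illustration, with a fixed seed) estimates E(Ψ8(X)²), whose exact value is 2! · 1! = 2.

```python
import math
import random

def He(n, x):
    # Probabilists' Hermite polynomials: He_{k+1}(x) = x*He_k(x) - k*He_{k-1}(x)
    h0, h1 = 1.0, x
    if n == 0:
        return h0
    for k in range(1, n):
        h0, h1 = h1, x * h1 - k * h0
    return h1

random.seed(0)
N = 200000
# Psi_8(x) = He2(x1)*He1(x2); the exact value of E[Psi_8(X)^2] is 2!*1! = 2,
# so that ||Psi_8||^2 = 2*pi * E[Psi_8(X)^2] = 4*pi.
acc = 0.0
for _ in range(N):
    x1 = random.gauss(0.0, 1.0)
    x2 = random.gauss(0.0, 1.0)
    acc += (He(2, x1) * He(1, x2)) ** 2
estimate = acc / N                  # Monte Carlo estimate of E[Psi_8(X)^2]
norm2 = 2.0 * math.pi * estimate    # Monte Carlo estimate of ||Psi_8||^2
```

The Monte Carlo estimate is only accurate to a few percent with this sample size, but it is enough to confirm the product formula of the equation 130.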


3.7 Multivariate orthogonal polynomials and probabilities

In this section, we present the properties of Ψk(X), where X is a multivariate random variable associated with the multivariate orthogonal polynomials {Ψk}k≥1.

Proposition 3.23. ( Expectation of multivariate orthogonal polynomials) Let I be a tensor product interval of R^p and let w be the associated multivariate tensor product weight function. Let Ψk be the family of tensor product orthogonal multivariate polynomials associated with the weight w. Let X be a multivariate random variable associated with the multivariate probability distribution function f, where {Xi}i=1,2,...,p are independent random variables. Therefore,

E(Ψ1(X)) = 1 (132)

and

E(Ψk(X)) = 0, (133)

for k > 1.

Proof. Indeed,

E(\Psi_1(X)) = \int_I \Psi_1(x) f(x) dx = \int_I f(x) dx = 1,

since Ψ1(x) = 1, for any x ∈ I. Moreover, for any k > 1, we have

E(\Psi_k(X)) = \int_I \Psi_k(x) f(x) dx = \int_I \prod_{i=1}^p \phi_{\alpha_i^{(k)}}(x_i) \prod_{i=1}^p f_i(x_i) dx_1 dx_2 ... dx_p = \prod_{i=1}^p \int_{I_i} \phi_{\alpha_i^{(k)}}(x_i) f_i(x_i) dx_i,

which implies

E(\Psi_k(X)) = \prod_{i=1}^p E\left( \phi_{\alpha_i^{(k)}}(X_i) \right).   (134)

However, the proposition 1.15 implies:

E\left( \phi_{\alpha_i^{(k)}}(X_i) \right) = 1

if \alpha_i^{(k)} = 0 and

E\left( \phi_{\alpha_i^{(k)}}(X_i) \right) = 0

if \alpha_i^{(k)} \geq 1.

Since k > 1, there is at least one integer i ∈ {1, 2, ..., p} such that \alpha_i^{(k)} ≥ 1. If this were not true, we would have \alpha_i^{(k)} = 0 for i = 1, 2, ..., p, which implies that k = 1 and contradicts the hypothesis. Therefore, there is at least one integer i such that the expectation is zero, so that the product in the equation 134 is also zero.


Proposition 3.24. ( Variance of multivariate orthogonal polynomials) Under the same hypotheses as in proposition 3.23, we have

V(\Psi_1(X)) = 0   (135)

and

V(\Psi_k(X)) = \prod_{i=1}^p E\left( \phi_{\alpha_i^{(k)}}(X_i)^2 \right),   (136)

for k > 1.

Proof. Obviously, the random variable Ψ1(X) = 1 has a zero variance. Furthermore, for k > 1, we have

V(\Psi_k(X)) = E\left( (\Psi_k(X) - E(\Psi_k(X)))^2 \right) = E\left( \Psi_k(X)^2 \right),

since E(Ψk(X)) = 0. We then use the tensor product definition of both Ψk and f, and finally get to the equation 136.

For k > 1, we have

E\left( \phi_{\alpha_i^{(k)}}(X_i)^2 \right) = V\left( \phi_{\alpha_i^{(k)}}(X_i) \right) + E\left( \phi_{\alpha_i^{(k)}}(X_i) \right)^2.

Therefore, based on the propositions 1.15 and 1.16, we can use the equation 136 to evaluate the variance of Ψk.

Proposition 3.25. ( Covariance of multivariate orthogonal polynomials) Under the same hypotheses as in proposition 3.23, for any two integers k and ℓ, we have

Cov(Ψk(X),Ψ`(X)) = 0, (137)

if k 6= `.

Proof. Assume that k > 1. Therefore,

Cov(Ψ1(X),Ψk(X)) = E ((Ψ1(X)− µ1) (Ψk(X)− µk)) ,

where µ1 = E(Ψ1(X)) and µk = E(Ψk(X)). However, we have Ψ1(X) = E(Ψ1(X)) = 1. Hence,

Cov(Ψ1(X),Ψk(X)) = 0.

The same is true if we consider

Cov(\Psi_k(X), \Psi_1(X)) = Cov(\Psi_1(X), \Psi_k(X)) = 0,

by the symmetry property of the covariance.

Assume now that k and ℓ are two integers such that k, ℓ > 1. We have

Cov(\Psi_k(X), \Psi_\ell(X)) = E\left( (\Psi_k(X) - \mu_k)(\Psi_\ell(X) - \mu_\ell) \right) = E\left( \Psi_k(X) \Psi_\ell(X) \right),

since, by the proposition 3.23, we have \mu_k = \mu_\ell = 0. Hence,

Cov(\Psi_k(X), \Psi_\ell(X)) = \int_I \Psi_k(x) \Psi_\ell(x) f(x) dx = \frac{1}{\int_I w(x) dx} \int_I \Psi_k(x) \Psi_\ell(x) w(x) dx = \frac{1}{\int_I w(x) dx} (\Psi_k, \Psi_\ell),

by the definition of the weight function w. By the proposition 3.20, the polynomials Ψk and Ψℓ are orthogonal, which concludes the proof.


3.8 Notes and references

The stars and bars proof used in the section 3.1 is presented in Feller's book [8], in the section "Application to occupancy problems" of the chapter 2 "Elements of combinatorial analysis". The book [15] and the thesis [14] present spectral methods, including the multivariate orthogonal polynomials involved in polynomial chaos. The figures 24 and 25 are presented in several papers and slides related to multivariate orthogonal polynomials, including [16], for example.

4 Polynomial chaos

4.1 Introduction

The polynomial chaos, introduced by Wiener [21], uses Hermite polynomials as the basis and involves independent Gaussian random variables.

Following [4, 17], denote the set of multi-indices with a finite number of nonzero components by

J = \{ \alpha = (\alpha_i)_{i \geq 1},\ \alpha_i \in \{0, 1, 2, ...\},\ |\alpha| < \infty \},   (138)

where

|\alpha| = \sum_{i=1}^{\infty} \alpha_i.   (139)

If α ∈ J, then there is only a finite number of nonzero components (otherwise |α| would be infinite).

Let {Xi}i≥1 be an infinite set of independent standard normal random variables and let f be the associated multivariate tensor product normal distribution function. Assume that g(X) is a random variable with finite variance. We are interested in the decomposition of g(X) onto He_{α_i}, the Hermite polynomial of degree α_i ≥ 0.

The expectation of g(X) is

E(g(X)) = \int g(x) f(x) dx,   (140)

and its variance is:

V(g(X)) = E\left( (g(X) - \mu)^2 \right),   (141)

where µ = E(g(X)). For any α ∈ J, the Wick polynomial is

\Psi_\alpha(X) = \prod_{i=1}^{\infty} He_{\alpha_i}(X_i).   (142)

The degree of the polynomial Ψα is |α|. Notice that, since there is only a finite number of nonzero components in α, the right hand side of the expression 142 has a finite number of factors.

We consider the inner product

(g, \Psi_\alpha) = \int g(x) \Psi_\alpha(x) f(x) dx.   (143)

We use the norm

\|\Psi_\alpha\|^2 = \int \Psi_\alpha(x)^2 f(x) dx.   (144)

The following theorem is due to Cameron and Martin [5].


Theorem 4.1. ( Cameron-Martin) Let {X_i}_{i=1}^{\infty} be an infinite set of independent standard normal random variables and let X = (X_1, X_2, ...). Let g(X) be a random variable with finite variance. Then g(X) has the decomposition

g(X) = \sum_{\alpha \in J} a_\alpha \Psi_\alpha(X),   (145)

where

a_\alpha = \frac{(g, \Psi_\alpha)}{\|\Psi_\alpha\|^2}.   (146)

Moreover,

E(g(X)) = a_{\alpha_0},   (147)

V(g(X)) = \sum_{\alpha \in J,\ \alpha \neq \alpha_0} a_\alpha^2 \|\Psi_\alpha\|^2,   (148)

where α_0 = (0, 0, ...).

The previous theorem is due to [5] and will not be proved here. Moreover, the decomposition 145 converges in the L^2_w sense.

4.2 Truncated decomposition

In the decomposition 145, there is an infinite number of random variables X_i and an unrestricted polynomial degree |α|. In order to use such an expansion in practice, we have to perform a double truncation. This is why we keep only p independent normal random variables and only the multivariate polynomials of degree at most d. Define the truncated index set [12]

J_{p,d} = \{ \alpha = (\alpha_1, \alpha_2, ..., \alpha_p),\ |\alpha| \leq d \}.   (149)

The finite decomposition is

g(X) ≈∑

α∈Jp,d

aαΨα(X), (150)

where X = (X1, X2, ..., Xp) are independent standard normal random variables. There is a one-to-onemapping from the multi-indices of α of the set Jp,d and the indices k = 1, 2, ..., P pd defined in the section3.4. Therefore, the decomposition 145 can be written:

g(X) ≈Pp

d∑k=1

akΨk(X), (151)

where Ψk is defined by the equation 127.
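The truncated index set J_{p,d} is easy to enumerate directly, and its cardinality is the binomial coefficient C(p + d, d) = P^p_d. A minimal sketch, written in Python for convenience (the document's own examples use Scilab):

```python
from itertools import product
from math import comb

def truncated_index_set(p, d):
    """Enumerate the multi-indices alpha = (alpha_1, ..., alpha_p)
    with nonnegative integer components and |alpha| <= d."""
    return [alpha for alpha in product(range(d + 1), repeat=p)
            if sum(alpha) <= d]

p, d = 3, 4
J = truncated_index_set(p, d)
# The cardinality of J_{p,d} is the binomial coefficient C(p + d, d).
assert len(J) == comb(p + d, d)  # 35 multi-indices for p = 3, d = 4
```

Any fixed ordering of this list provides the one-to-one mapping k = 1, 2, ..., P^p_d used below.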

Proposition 4.2 (Expectation and variance of the truncated PC expansion). The truncated expansion 151 is such that

    E(g(X)) = a_1,    (152)

    V(g(X)) = ∑_{k=2}^{P^p_d} a_k² V(Ψ_k(X)).    (153)

Since the expansion 151 involves only a finite number of terms, the proposition is relatively easy to prove.

56

Page 57: Introduction to polynomials chaos with NISP · 1 Orthogonal polynomials In this section, we de ne the orthogonal polynomials which are used in the context of polynomial chaos decomposition

Proof. The expectation of g(X) is

    E(g(X)) = E( ∑_{k=1}^{P^p_d} a_k Ψ_k(X) )
            = ∑_{k=1}^{P^p_d} E(a_k Ψ_k(X))
            = ∑_{k=1}^{P^p_d} a_k E(Ψ_k(X)),

since the expectation of a sum is the sum of the expectations. The equation 152 is then a straightforward consequence of the proposition 3.23.

The variance of g(X) is

    V(g(X)) = V( ∑_{k=1}^{P^p_d} a_k Ψ_k(X) )
            = ∑_{k=1}^{P^p_d} V(a_k Ψ_k(X)) + ∑_{k,ℓ=1, k≠ℓ}^{P^p_d} Cov(a_k Ψ_k(X), a_ℓ Ψ_ℓ(X))
            = ∑_{k=1}^{P^p_d} a_k² V(Ψ_k(X)) + ∑_{k,ℓ=1, k≠ℓ}^{P^p_d} a_k a_ℓ Cov(Ψ_k(X), Ψ_ℓ(X)).

However, the proposition 3.25 states that the covariances are zero. Moreover, the variance of Ψ_1 is zero, since Ψ_1 is a constant. This leads to the equation 153.
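As a concrete check of the proposition, take g(x) = x² = He_0(x) + He_2(x); since V(He_k(X)) = k! for k ≥ 1, equations 152 and 153 give E(g(X)) = 1 and V(g(X)) = 2! = 2, which matches V(X²) = E(X⁴) − E(X²)² = 3 − 1 = 2. A minimal sketch, in Python rather than Scilab:

```python
from math import factorial

# Hermite chaos coefficients of g(x) = x^2 = He_0(x) + He_2(x);
# a[k] multiplies He_k(x).
a = [1.0, 0.0, 1.0]

mean = a[0]  # equation (152): the mean is the coefficient of He_0
# equation (153), using V(He_k(X)) = k! for k >= 1:
variance = sum(a[k] ** 2 * factorial(k) for k in range(1, len(a)))

assert mean == 1.0      # E(X^2) = 1
assert variance == 2.0  # V(X^2) = E(X^4) - E(X^2)^2 = 3 - 1 = 2
```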

In the following proposition, we prove the equation 146 in the case of the truncated decomposition.

Proposition 4.3 (Coefficients of the truncated decomposition). The truncated expansion 151 is such that

    a_k = (g, Ψ_k) / ‖Ψ_k‖².    (154)

Proof. Indeed,

    (g, Ψ_k) = ∫_I g(x) Ψ_k(x) w(x) dx
             = ∑_{ℓ=1}^{P^p_d} a_ℓ ∫_I Ψ_ℓ(x) Ψ_k(x) w(x) dx
             = ∑_{ℓ=1}^{P^p_d} a_ℓ (Ψ_ℓ, Ψ_k)
             = a_k (Ψ_k, Ψ_k)
             = a_k ‖Ψ_k‖²,

where we have used the orthogonality of the functions {Ψ_k}_{k≥1}.

An immediate consequence of the proposition 4.3 is that the decomposition 151 is unique. Indeed, assume that a_k and b_k are real numbers and consider the decompositions

    g(X) ≈ ∑_{k=1}^{P^p_d} a_k Ψ_k(X)

and

    g(X) ≈ ∑_{k=1}^{P^p_d} b_k Ψ_k(X).

The equation 154 is satisfied by both the coefficients a_k and b_k, so that

    a_k = b_k.

The following proposition shows how the coefficients can be expressed in terms of expectations and variances.

Proposition 4.4 (Coefficients of the truncated decomposition (2)). The truncated expansion 151 is such that

    a_1 = E(g(X))    (155)

and

    a_k = E(g(X) Ψ_k(X)) / V(Ψ_k(X))    (156)

for k > 1.

Proof. By definition of the expectation, we have

    E(g(X) Ψ_k(X)) = ∫_I g(x) Ψ_k(x) f(x) dx
                   = (1 / ∫_I w(x) dx) ∫_I g(x) Ψ_k(x) w(x) dx
                   = (g, Ψ_k) / ∫_I w(x) dx
                   = a_k ‖Ψ_k‖² / ∫_I w(x) dx,

where the last equality is implied by the equation 154. Hence,

    E(g(X) Ψ_k(X)) = (a_k / ∫_I w(x) dx) ∫_I Ψ_k(x)² w(x) dx
                   = a_k ∫_I Ψ_k(x)² f(x) dx
                   = a_k E(Ψ_k(X)²).

This implies

    a_k = E(g(X) Ψ_k(X)) / E(Ψ_k(X)²).

For k = 1, we have Ψ_1(X) = 1, so that the previous equation implies the equation 155. For k > 1, we have

    a_k = E(g(X) Ψ_k(X)) / (V(Ψ_k(X)) + E(Ψ_k(X))²).

However, we know from the proposition 3.23 that, for k > 1, we have E(Ψ_k(X)) = 0, which concludes the proof.


4.3 Univariate decomposition examples

In this section, we present several examples of decompositions of univariate random variables onto Hermite polynomials.

Assume that X is a standard univariate normal random variable (i.e. p = 1) and consider the random variable g(X), where g is a square integrable function. Its truncated polynomial chaos decomposition is

    g(X) ≈ ∑_{k=1}^{P^1_d} a_k Ψ_k(X),    (157)

where the Ψ_k are the univariate Hermite polynomials. The number of univariate polynomials is given by the proposition 3.8, which states that, for p = 1, we have

    P^1_d = C(1 + d, d) = (1 + d)! / (d! 1!) = 1 + d.

Moreover, the equation 127 states that the functions Ψ_k are defined in terms of the Hermite polynomials as

    Ψ_k(x) = He_{α(k)}(x),

for any x ∈ R and k ≥ 1, where

    α(k) = k − 1.

This implies

    Ψ_k(x) = He_{k−1}(x),

for any x ∈ R and k ≥ 1. The equation 157 simplifies to

    g(X) ≈ a_1 He_0(X) + a_2 He_1(X) + ... + a_{d+1} He_d(X),

where d ≥ 1 is an integer and {He_i}_{i=0,1,...,d} are the Hermite polynomials.

The proposition 4.4 states that the coefficients are

    a_1 = E(g(X))

and

    a_k = E(g(X) He_{k−1}(X)) / V(He_{k−1}(X)) = E(g(X) He_{k−1}(X)) / (k − 1)!

for k > 1. In general, this requires computing the integral

    E(g(X) He_{k−1}(X)) = ∫_R g(x) He_{k−1}(x) f(x) dx,    (158)

for k ≥ 1, where f is the Gaussian probability distribution function defined in the equation 66.

In the remainder of this section, we consider several specific examples of functions g and present the associated coefficients a_i, for i = 1, 2, ..., d + 1.
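The integral 158 can be approximated with Gauss-Hermite quadrature instead of a general-purpose integrator. The sketch below uses Python's numpy.polynomial.hermite_e module (which implements the probabilists' polynomials He_n) rather than Scilab's intg; the quadrature weight is exp(−x²/2), so expectations need a 1/√(2π) normalization:

```python
import math
import numpy as np
from numpy.polynomial import hermite_e as He

def pc_coefficients(g, degree, nquad=60):
    """Coefficients a_1, ..., a_{degree+1} of the truncated Hermite
    chaos expansion of g(X), via a_k = E(g(X) He_{k-1}(X)) / (k-1)!."""
    x, w = He.hermegauss(nquad)  # nodes/weights for weight exp(-x^2/2)
    a = []
    for n in range(degree + 1):
        basis = np.zeros(n + 1)
        basis[n] = 1.0  # coefficient vector selecting He_n
        moment = np.sum(w * g(x) * He.hermeval(x, basis)) / math.sqrt(2 * math.pi)
        a.append(moment / math.factorial(n))
    return np.array(a)

# g(x) = x^2 should recover x^2 = He_0(x) + He_2(x):
a = pc_coefficients(lambda x: x ** 2, degree=4)
assert np.allclose(a, [1, 0, 1, 0, 0], atol=1e-10)
```

Since the quadrature rule is exact for polynomials of degree up to 2·nquad − 1, the coefficients of polynomial functions g are recovered to machine precision.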


Example 4.5 (Constant). Assume that

    g(x) = c,

for some real constant c. A possible exact decomposition is

    g(X) = c He_0(X).

Since the polynomial chaos decomposition is unique, this is the decomposition. But it is interesting to see how the coefficients can be computed in this particular case. We have

    a_1 = E(c) = c.

Moreover,

    a_k = E(c He_{k−1}(X)) / (k − 1)! = c E(He_{k−1}(X)) / (k − 1)!.

However, the proposition 1.15 states that E(He_{k−1}(X)) = 0 for k > 1. This implies

    a_k = 0

for k > 1.

Example 4.6 (Standard normal random variable). Assume that

    g(x) = x,

for any x ∈ R. The exact polynomial chaos decomposition is

    g(X) = He_1(X),    (159)

since He_1(X) = X. Again, it is interesting to see how the coefficients can be computed in this particular case. We have

    a_1 = E(X) = 0,

since, by assumption, X has a mean equal to zero. Moreover,

    a_k = E(X He_{k−1}(X)) / (k − 1)! = E(He_1(X) He_{k−1}(X)) / (k − 1)!,

since the first Hermite polynomial is He_1(x) = x. The equation 16 then implies a_k = 0 for k > 2. Moreover,

    a_2 = E(He_1(X)²) / V(He_1(X)) = V(He_1(X)) / 1! = 1!/1! = 1,

where the last equality comes from the equation 17. This immediately leads to the equation 159.


    n \ k   a_0   a_1   a_2   a_3   a_4   a_5   a_6   a_7   a_8   a_9
    x^0       1
    x^1             1
    x^2       1           1
    x^3             3           1
    x^4       3           6           1
    x^5            15          10          1
    x^6      15          45          15          1
    x^7           105         105          21         1
    x^8     105         420         210          28         1
    x^9           945        1260         378          36         1

Figure 26: Coefficients a_k in the decomposition of x^n into Hermite polynomials, for n = 0, 1, ..., 9 (blank entries are zero).

Example 4.7 (The square function). Consider the function

    g(x) = x²,

for any x ∈ R, and the random variable g(X) = X², where X is a standard normal random variable. From the table 7, we see that

    x² = x² − 1 + 1 = He_2(x) + He_0(x).

Therefore, the exact polynomial chaos decomposition of g is

    g(x) = He_0(x) + He_2(x).

Example 4.8 (Decomposition of x^n). We can compute the decomposition of x^n, where n is a positive integer. For example, we have

    x^5 = 15 He_1(x) + 10 He_3(x) + He_5(x).

In the figure 26, we present the coefficients a_k in the decomposition

    x^n = ∑_{k≥0} a_k He_k(x).
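The rows of the figure 26 can be reproduced with numpy's change-of-basis routine poly2herme, which converts power-series coefficients to the probabilists' Hermite basis (a Python illustration; the document's own tooling is Scilab):

```python
import numpy as np
from numpy.polynomial.hermite_e import poly2herme

def hermite_decomposition(n):
    """Coefficients a_0, ..., a_n such that x^n = sum_k a_k He_k(x),
    in numpy's probabilists' Hermite (HermiteE) basis."""
    monomial = np.zeros(n + 1)
    monomial[n] = 1.0  # power-series coefficients of x^n
    return poly2herme(monomial)

# Two rows of the figure 26:
assert np.allclose(hermite_decomposition(5), [0, 15, 0, 10, 0, 1])
assert np.allclose(hermite_decomposition(9),
                   [0, 945, 0, 1260, 0, 378, 0, 36, 0, 1])
```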

Example 4.9 (Transformation). TODO: X normal with mean µ and standard deviation σ, and the standardization of X.

Example 4.10 (Some approximate decompositions). We compute the approximate polynomial chaos decompositions of several functions of a standard normal random variable X. In order to perform the numerical integration involved in the integral 158, we have used the intg function, with an absolute tolerance equal to 1.e-8 and numerical bounds equal to -10 and +10. We compute the coefficients up to the degree 14; the results are presented in the figure 27.

These numerical experiments are consistent with the following exact integral, computed from Wolfram Alpha [19]:

    ∫_{−∞}^{+∞} cos(x) He_n(x) w(x) dx = (−1)^{n/2} √(2π) e^{−1/2} if n is even, and 0 if n is odd,

where w(x) = exp(−x²/2).
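The coefficients of the figure 27 can also be checked without intg, using Gauss-Hermite quadrature. The Python sketch below recovers the leading coefficients of cos(X) and sin(X); the exact values a_0 = e^{−1/2} for the cosine and a_1 = e^{−1/2} for the sine follow from E(cos(X)) = e^{−1/2}:

```python
import math
import numpy as np
from numpy.polynomial import hermite_e as He

x, w = He.hermegauss(80)  # quadrature for the weight exp(-x^2/2) on R
sqrt2pi = math.sqrt(2.0 * math.pi)

def coeff(g, n):
    """a_n = E(g(X) He_n(X)) / n!, by Gauss-Hermite quadrature."""
    basis = np.zeros(n + 1)
    basis[n] = 1.0
    return np.sum(w * g(x) * He.hermeval(x, basis)) / (sqrt2pi * math.factorial(n))

assert abs(coeff(np.cos, 0) - math.exp(-0.5)) < 1e-8   # 0.6065307
assert abs(coeff(np.cos, 2) + math.exp(-0.5) / 2) < 1e-8  # -0.3032653
assert abs(coeff(np.cos, 1)) < 1e-8                    # odd coefficients vanish
assert abs(coeff(np.sin, 1) - math.exp(-0.5)) < 1e-8   # 0.6065307
```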

4.4 Generalized polynomial chaos

In 2002, Xiu and Karniadakis [23, 24] introduced the generalized polynomial chaos. They notice that the Hermite polynomials are quite effective for solving some stochastic differential equations involving Gaussian as well as non-Gaussian random variables, e.g. log-normal random variables. However, they emphasize


    k     cos(x)        sin(x)
    0     0.6065307     0
    1     0             0.6065307
    2    -0.3032653     0
    3     0            -0.1010884
    4     0.0252721     0
    5     0             0.0050544
    6    -0.0008424     0
    7     0            -0.0001203
    8     0.0000150     0
    9     0             0.0000017
    10   -0.0000002     0
    11    0            -1.519D-08
    12    1.266D-09     0
    13    0             9.740D-11
    14   -6.957D-12     0

Figure 27: Coefficients a_k in the decomposition of several functions into Hermite polynomials.

                   Distribution           Polynomial    Support
    Continuous     Normal                 Hermite       (−∞, ∞)
    distributions  Gamma (Exponential)    Laguerre      [0, ∞)
                   Beta                   Jacobi        [a, b]
                   Uniform                Legendre      [a, b]
    Discrete       Poisson                Charlier      {0, 1, 2, ...}
    distributions  Binomial               Krawtchouk    {0, 1, 2, ..., N}
                   Negative Binomial      Meixner       {0, 1, 2, ...}
                   Hypergeometric         Hahn          {0, 1, 2, ..., N}

Figure 28: Map from a distribution function to the associated orthogonal polynomials [16].

that, for general non-Gaussian random inputs, the convergence rate is not fast. This is why they introduce the generalized polynomial chaos, where each probability distribution function is associated with a family of orthogonal polynomials.

The orthogonal polynomials involved are constructed from limit relationships between the hypergeometric orthogonal polynomials. This leads to a classification which organizes the polynomials into a tree of relations, called the Askey scheme. The mapping from each distribution to the corresponding orthogonal polynomials is presented in the figure 28.

The finite decomposition is

    g(X) ≈ ∑_{α∈J_{p,d}} a_α Ψ_α(X),    (160)

where X = (X_1, X_2, ..., X_p) are independent random variables and Ψ_k is the tensor product of the orthogonal polynomials φ, so that

    Ψ_k(x) = ∏_{i=1}^{p} φ_{α_i^{(k)}}(x_i)    (161)

for k = 1, 2, ..., P^p_d.

Example 4.11. Assume that X_1 is a standard normal random variable, that X_2 is a uniform random variable in the interval [−1, 1], and that X_1 and X_2 are independent. The generalized polynomial chaos associated with the random variable X = (X_1, X_2) then involves the Hermite orthogonal polynomials (for X_1) and the Legendre orthogonal polynomials (for X_2).
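For the setting of example 4.11, the tensor product basis functions can be built from numpy's Hermite and Legendre modules, and their orthogonality under the product measure checked with quadrature rules matched to each distribution (a Python illustration, not the NISP API):

```python
import math
import numpy as np
from numpy.polynomial import hermite_e, legendre

# Quadrature rules matched to each input distribution:
xh, wh = hermite_e.hermegauss(30)   # X1 ~ N(0, 1); weight exp(-x^2/2)
wh = wh / math.sqrt(2.0 * math.pi)  # normalize into a probability measure
xl, wl = legendre.leggauss(30)      # X2 ~ U(-1, 1)
wl = wl / 2.0                       # the density of U(-1, 1) is 1/2

def psi(a1, a2, x1, x2):
    """Tensor product basis function He_{a1}(x1) * P_{a2}(x2)."""
    c1 = np.zeros(a1 + 1)
    c1[a1] = 1.0
    c2 = np.zeros(a2 + 1)
    c2[a2] = 1.0
    return hermite_e.hermeval(x1, c1) * legendre.legval(x2, c2)

X1, X2 = np.meshgrid(xh, xl, indexing="ij")
W = np.outer(wh, wl)

# Distinct basis functions are orthogonal under the product measure...
inner = np.sum(W * psi(1, 2, X1, X2) * psi(2, 1, X1, X2))
# ...and the squared norm of He_1 * P_2 is E(He_1^2) * E(P_2^2) = 1 * 1/5.
norm2 = np.sum(W * psi(1, 2, X1, X2) ** 2)

assert abs(inner) < 1e-12
assert abs(norm2 - 0.2) < 1e-10
```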

Example 4.12 (Transformation). TODO: X ∼ U(0,1) and g(X) = exp(X).


4.5 Notes and references

The book [15] and the thesis [14] present spectral methods, including polynomial chaos.

In the figure 26, we present the coefficients a_k in the decomposition of x^n into Hermite polynomials, for n = 0, 1, ..., 9. This table is similar to the table 22.12 in the chapter 22 of [1], but our table presents the coefficients for the probabilist's Hermite polynomials He_n, while the table 22.12 presents the coefficients of the physicist's Hermite polynomials H_n.

5 Acknowledgments

We thank Thibaut Delage for his stimulating support and suggestions.

A Integrals

A.1 Gaussian integral

Proposition A.1 (Gaussian integral).

    ∫_{−∞}^{+∞} exp(−x²/2) dx = √(2π).    (162)

Proof. We have

    (∫_{−∞}^{+∞} exp(−x²) dx)² = (∫_{−∞}^{+∞} exp(−x²) dx) (∫_{−∞}^{+∞} exp(−y²) dy)
                               = ∫_{−∞}^{+∞} ∫_{−∞}^{+∞} exp(−x² − y²) dx dy.

Let us consider the polar change of variables

    x = r cos(θ),    y = r sin(θ),

where r ≥ 0 and θ ∈ [0, 2π]. Its Jacobian is J(r, θ) = r (see, e.g., [6]). Hence,

    (∫_{−∞}^{+∞} exp(−x²) dx)² = ∫_0^{+∞} ∫_0^{2π} exp(−r²) r dθ dr = 2π ∫_0^{+∞} r exp(−r²) dr.

Consider the change of variable s = −r². We have ds = −2r dr, which implies

    (∫_{−∞}^{+∞} exp(−x²) dx)² = π ∫_{−∞}^{0} exp(s) ds = π (exp(0) − 0) = π.

Hence,

    ∫_R exp(−x²) dx = √π.    (163)

Now consider the change of variable y = √2 x. We have dx = dy/√2, which implies

    ∫_R exp(−x²) dx = (1/√2) ∫_R exp(−y²/2) dy.

We plug the equation 163 into the previous equality and get the equation 162.


The previous proof does not rigorously treat the improper integrals which appear during the computation.
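The Gaussian integral can also be checked numerically. A minimal sketch in Python (the document's numerical examples use Scilab; this is only an illustration), approximating the integral with a Riemann sum on a truncated domain:

```python
import math
import numpy as np

# Riemann-sum approximation of the Gaussian integral (162);
# the integrand is negligible outside [-12, 12] (exp(-72) ~ 5e-32).
x = np.linspace(-12.0, 12.0, 2_000_001)
dx = x[1] - x[0]
integral = np.sum(np.exp(-x ** 2 / 2.0)) * dx

assert abs(integral - math.sqrt(2.0 * math.pi)) < 1e-7
```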

A.2 Weighted Gaussian integral of powers of x

Proposition A.2. For any odd n,

    ∫_{−∞}^{+∞} x^n exp(−x²/2) dx = 0.    (164)

Proof. If n is odd, the function x^n is antisymmetric, since (−x)^n = −x^n for any x ∈ R. On the other hand, the function exp(−x²/2) is symmetric, since exp(−(−x)²/2) = exp(−x²/2). We consider the change of variable y = −x and get

    ∫_{−∞}^{0} x^n exp(−x²/2) dx = ∫_{0}^{+∞} (−y)^n exp(−(−y)²/2) dy = −∫_{0}^{+∞} y^n exp(−y²/2) dy.

This implies

    ∫_{−∞}^{+∞} x^n exp(−x²/2) dx = ∫_{−∞}^{0} x^n exp(−x²/2) dx + ∫_{0}^{+∞} x^n exp(−x²/2) dx = 0,

which concludes the proof.

Proposition A.3. For any even n, we have

    ∫_{−∞}^{+∞} x^n exp(−x²/2) dx = 1 · 3 · 5 · ... · (n − 1) √(2π).    (165)

Proof. Let us denote by I_n the integral

    I_n = ∫_{−∞}^{+∞} x^n exp(−x²/2) dx.    (166)

We are going to prove the equality

    I_{n+2} = (n + 1) I_n.    (167)

Integrating by parts, we get

    I_n = [ (x^{n+1}/(n+1)) exp(−x²/2) ]_{−∞}^{+∞} + ∫_{−∞}^{+∞} (x^{n+1}/(n+1)) x exp(−x²/2) dx.

On the other hand,

    lim_{x→±∞} x^n exp(−x²/2) = 0

for any integer n. Hence,

    [ (x^{n+1}/(n+1)) exp(−x²/2) ]_{−∞}^{+∞} = 0.

This implies

    I_n = ∫_{−∞}^{+∞} (x^{n+2}/(n+1)) exp(−x²/2) dx = I_{n+2}/(n+1).

Moreover, we know from the proposition A.1 that I_0 = √(2π), which concludes the proof.

Consider the Gaussian weight function w(x) = exp(−x²/2). A consequence of the propositions A.2 and A.3 is that the monomial x^n is in L²_w(R), for any integer n.
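The two propositions are easy to verify numerically; a Riemann-sum sketch in Python (an illustration only, not part of the NISP toolchain):

```python
import math
import numpy as np

x = np.linspace(-14.0, 14.0, 2_000_001)
dx = x[1] - x[0]
weight = np.exp(-x ** 2 / 2.0)

def moment(n):
    """Riemann-sum approximation of the integral of x^n exp(-x^2/2) over R."""
    return np.sum(x ** n * weight) * dx

sqrt2pi = math.sqrt(2.0 * math.pi)
assert abs(moment(3)) < 1e-8                   # odd n: zero, equation (164)
assert abs(moment(2) - 1.0 * sqrt2pi) < 1e-6   # n = 2: 1 * sqrt(2 pi)
assert abs(moment(4) - 3.0 * sqrt2pi) < 1e-6   # n = 4: 1 * 3 * sqrt(2 pi)
assert abs(moment(6) - 15.0 * sqrt2pi) < 1e-5  # n = 6: 1 * 3 * 5 * sqrt(2 pi)
```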


A.3 A Legendre integral

The goal of this section is to prove the following proposition.

Proposition A.4. For any integer n,

    ∫_{−1}^{1} (x² − 1)^n dx = (−1)^n (n!)² 2^{2n+1} / (2n + 1)!.    (168)

Proof. Consider the change of variable x = sin(θ), for θ ∈ [−π/2, π/2]. Therefore dx = cos(θ) dθ. Moreover,

    x² − 1 = sin(θ)² − 1 = −cos(θ)².

Hence,

    ∫_{−1}^{1} (x² − 1)^n dx = (−1)^n ∫_{−π/2}^{π/2} cos(θ)^{2n+1} dθ.

Let I_n denote the integral

    I_n = ∫_{−π/2}^{π/2} cos(θ)^{2n+1} dθ.

We are going to prove that

    I_n = (n!)² 2^{2n+1} / (2n + 1)!,    (169)

which will immediately lead to the equation 168.

In order to do this, we are going to derive a recurrence formula for I_n. Integrating by parts, we get

    I_n = ∫_{−π/2}^{π/2} cos(θ)^{2n} cos(θ) dθ
        = [cos(θ)^{2n} sin(θ)]_{−π/2}^{π/2} + ∫_{−π/2}^{π/2} 2n cos(θ)^{2n−1} sin(θ) sin(θ) dθ
        = 2n ∫_{−π/2}^{π/2} cos(θ)^{2n−1} sin(θ)² dθ,

since the cosine is zero at θ = −π/2 and θ = π/2, so that the bracketed term vanishes. This implies

    I_n = 2n ∫_{−π/2}^{π/2} cos(θ)^{2n−1} (1 − cos(θ)²) dθ
        = 2n ∫_{−π/2}^{π/2} cos(θ)^{2n−1} dθ − 2n ∫_{−π/2}^{π/2} cos(θ)^{2n+1} dθ
        = 2n I_{n−1} − 2n I_n.

We move I_n from the right to the left hand side and get

    (2n + 1) I_n = 2n I_{n−1}.

Therefore,

    I_n = (2n / (2n + 1)) I_{n−1},    (170)

for n ≥ 1.


The first term in the recurrence is

    I_0 = ∫_{−π/2}^{π/2} cos(θ) dθ = [sin(θ)]_{−π/2}^{π/2} = 2.

We can now prove the equation 169 by induction on n. Since 0! = 1, the previous equation shows that the equation 169 is true for n = 0.

Assume that the formula is correct for n − 1. This implies

    I_{n−1} = ((n − 1)!)² 2^{2n−1} / (2n − 1)!.

The equation 170 then implies

    I_n = (2n / (2n + 1)) ((n − 1)!)² 2^{2n−1} / (2n − 1)!
        = 4n² ((n − 1)!)² 2^{2n−1} / (2n (2n + 1) (2n − 1)!)
        = (n!)² 2^{2n+1} / (2n + 1)!,

since 4n² ((n − 1)!)² = 4 (n!)² and 2n (2n + 1) (2n − 1)! = (2n + 1)!. This shows that the formula holds for n and concludes the proof.
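Since the integrand is a polynomial, the equation 168 can be confirmed to machine precision with Gauss-Legendre quadrature; a Python sketch (an illustration, independent of the document's Scilab code):

```python
import math
import numpy as np
from numpy.polynomial.legendre import leggauss

def lhs(n):
    """Gauss-Legendre evaluation of the integral of (x^2 - 1)^n over [-1, 1];
    n + 1 nodes integrate this degree-2n polynomial exactly."""
    x, w = leggauss(n + 1)
    return np.sum(w * (x ** 2 - 1.0) ** n)

def rhs(n):
    """Right hand side of equation (168)."""
    return (-1) ** n * math.factorial(n) ** 2 * 2 ** (2 * n + 1) / math.factorial(2 * n + 1)

for n in range(8):
    assert abs(lhs(n) - rhs(n)) <= 1e-10 * max(1.0, abs(rhs(n)))
```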

A.4 A Laguerre integral

The goal of this section is to prove the following proposition.

Proposition A.5. We have

    ∫_0^{∞} x^n e^{−x} dx = n!,

for n ≥ 0.

Proof. Let

    I_n = ∫_0^{∞} x^n e^{−x} dx,

for n = 0, 1, .... Obviously,

    I_0 = ∫_0^{∞} e^{−x} dx = 1.

Integrating by parts, we get

    I_{n+1} = ∫_0^{∞} x^{n+1} e^{−x} dx = [−x^{n+1} e^{−x}]_0^{+∞} + (n + 1) ∫_0^{∞} x^n e^{−x} dx = (n + 1) I_n.

By induction on n, we get I_n = n!.
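This identity underlies Gauss-Laguerre quadrature, whose weight is exactly e^{−x} on [0, ∞); a quick Python check (illustration only):

```python
import math
import numpy as np
from numpy.polynomial.laguerre import laggauss

# Gauss-Laguerre quadrature integrates f(x) * exp(-x) over [0, inf);
# 20 nodes are exact for polynomials f of degree up to 39.
x, w = laggauss(20)
for n in range(10):
    approx = np.sum(w * x ** n)
    assert abs(approx - math.factorial(n)) < 1e-8 * math.factorial(n)
```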


References

[1] Milton Abramowitz and Irene A. Stegun. Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables. Dover, New York, ninth Dover printing, tenth GPO printing edition, 1964.

[2] George E. Andrews, Richard Askey, and Ranjan Roy. Special Functions, volume 71 of Encyclopedia of Mathematics and its Applications. Cambridge University Press, Cambridge, 1999.

[3] G. Arfken. Mathematical Methods for Physicists, 3rd ed. Elsevier Academic Press, 2005.

[4] M. Branicki and A.J. Majda. Fundamental limitations of polynomial chaos for uncertainty quantification in systems with intermittent instabilities, 2012.

[5] R. H. Cameron and W. T. Martin. The orthogonal development of non-linear functionals in series of Fourier-Hermite functionals. Annals of Mathematics, 48(2):385–392, April 1947.

[6] Philip J. Davis and Philip Rabinowitz. Methods of Numerical Integration. Academic Press, New York, 1984.

[7] Alan Edelman and H. Murakami. Polynomial roots from companion matrix eigenvalues. Mathematics of Computation, 64(210):763–776, April 1995.

[8] William Feller. An Introduction to Probability Theory and Its Applications, Volume I. John Wiley and Sons, third edition, 1968.

[9] Walter Gautschi. Orthogonal polynomials: applications and computation. Acta Numerica, 5:45–119, 1996.

[10] David J. Griffiths. Introduction to Quantum Mechanics, 2nd Edition. Pearson Education, 2005.

[11] Nicholas J. Higham. Accuracy and Stability of Numerical Algorithms. Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, second edition, 2002.

[12] Thomas Y. Hou, Wuan Luo, Boris Rozovskii, and Hao-Min Zhou. Wiener chaos expansions and numerical solutions of randomly forced equations of fluid mechanics. J. Comput. Phys., 216:687–706, 2006.

[13] D. E. Knuth. The Art of Computer Programming, Volume 2, Seminumerical Algorithms. Third edition, Addison Wesley, Reading, MA, 1998.

[14] O.P. Le Maître. Méthodes spectrales pour la propagation d'incertitudes paramétriques dans les modèles numériques. Mémoire d'habilitation à diriger des recherches. Springer, 2005.

[15] O.P. Le Maître and O.M. Knio. Spectral Methods for Uncertainty Quantification. Springer, 2010.

[16] Didier Lucor. Introduction to stochastic spectral methods, November 2008.

[17] Wuan Luo. Wiener Chaos Expansion and Numerical Solutions of Stochastic Partial Differential Equations. PhD thesis, California Institute of Technology, May 2006.

[18] Alfio Quarteroni, Riccardo Sacco, and Fausto Saleri. Numerical Mathematics. Springer, 2000.

[19] Wolfram Research. Wolfram Alpha. http://www.wolframalpha.com.

[20] Gabor Szego. Orthogonal Polynomials. American Mathematical Society, 1975.

[21] Norbert Wiener. The homogeneous chaos. Amer. J. Math., 60(4):897–936, 1938.

[22] Wikipedia. Hermite polynomials — Wikipedia, the free encyclopedia, 2013. [Online; accessed 26 January 2013].

[23] Dongbin Xiu and George Em Karniadakis. The Wiener–Askey polynomial chaos for stochastic differential equations. SIAM J. Sci. Comput., 24(2):619–644, February 2002.

[24] Dongbin Xiu and George Em Karniadakis. Modeling uncertainty in flow simulations via generalized polynomial chaos. J. Comput. Phys., 187(1):137–167, May 2003.
