
MATH 321: Real Variables II Notes 2015W2 Term

Taught by Dr. Kalle Karu, taken by Adrian She

Please report typos or errors to Adrian at [email protected]

Contents

I Riemann-Stieltjes Integration

1 The Riemann Integral
1.1 Darboux’s Definition of the Riemann Integral
1.2 Introduction to Integrability

2 The Riemann-Stieltjes Integral

3 Integrability
3.1 Upper and Lower Integrals
3.2 Integrability of Continuous Functions
3.2.1 Review of Uniform Continuity
3.2.2 Proof of Theorem
3.3 Riemann Sums
3.4 Discontinuous Functions

4 Properties of the Integral
4.1 Change of Variables
4.2 The Fundamental Theorem of Calculus

5 Functions of Bounded Variations
5.1 The Riesz Representation Theorem
5.2 The Length of a Curve
5.3 Functional Analysis Revisited

II Sequences and Series of Functions

6 Sequences and Series of Functions: Definitions and Issues

7 Uniform Convergence
7.1 Uniform Convergence of Sequences
7.2 Uniform Convergence of Series
7.3 Interpretation of Uniform Convergence

8 Properties of Uniform Convergence
8.1 Uniform Convergence and Continuity
8.1.1 The Main Result
8.1.2 Dini’s Theorem
8.1.3 Strange Functions
8.2 Uniform Convergence and Integration
8.2.1 Application to Function Spaces
8.3 Uniform Convergence and Differentiation
8.4 Some Counterexamples

9 The Arzela-Ascoli Theorem
9.1 Types of Continuity
9.2 Pointwise Boundedness
9.3 Proof of Arzela-Ascoli
9.4 Converse to Arzela-Ascoli Theorem
9.5 Application: Peano’s Theorem
9.5.1 Proof of Peano’s Theorem

10 Weierstrass’ Theorem
10.1 Motivation for the Proof - Averaging Operators
10.2 Proof of Weierstrass’ Theorem
10.3 Stone’s Generalization of Weierstrass’ Theorem
10.4 Proof of Stone’s Theorem - The Lattice Version
10.5 Proofs of Stone-Weierstrass Theorem: Algebra Version
10.5.1 The Real Case
10.5.2 The Complex Case

III Power Series and Fourier Series

11 Power Series
11.1 Power Series Properties
11.2 Behaviour at Endpoints
11.3 Rearrangement of Sums
11.4 Application to Taylor Series
11.5 Zeros of Analytic Functions

12 Fourier Series as Orthogonal Series
12.1 The Hermitian Inner Product
12.2 Orthogonal Bases of Functions
12.3 Examples of Orthogonal Systems
12.4 Bessel’s Inequality
12.4.1 The Finite Dimensional Case
12.4.2 Orthogonal Series Case
12.5 Riesz-Fischer Theorem

13 Convergence of Fourier Series
13.1 L2 convergence of Fourier Series
13.2 Pointwise Convergence of Fourier Series


List of Figures

1 Illustration of a partition, Riemann sum, and tag
2 Illustration of upper and lower Darboux sums
3 Quantity we want to compute
4 The graph of $f$, and its transformation under $(x, y) \mapsto (\alpha(x), y)$. The area under the left graph represents $\int_a^b f\,dx$ and the area under the right graph represents $\int_a^b f\,d\alpha$
5 Visualization of $\int_0^2 f\,d\alpha$
6 Illustration of Lemma for $L(P, f, \alpha)$. Refining the partition increases $L(P, f, \alpha)$
7 Division of the Interval into Three Parts
8 The Cantor Set can be covered with finitely many intervals of arbitrarily small length.
9 Illustration of the integration by parts formula and symmetry between $\int f\,d\alpha$ and $\int \alpha\,df$
10 The shaded region is $U(P, f, \alpha) - L(P, f, \alpha)$
11 Illustration of $\beta(x)$
12 Example of $f(x)$ and the corresponding $F(x)$
13 A function not of bounded variation
14 Illustration of Riesz Representation Theorem
15 Illustration of the proof for a plane curve
16 Illustration of the Sequence of Functions
17 $f_n$ are a sequence of functions which form a “travelling wave”
18 Illustration of the Sequence of Functions
19 Illustration of uniform convergence
20 $f_n$ does not lie within an $\varepsilon$ neighbourhood of the limit
21 Schematic of the Proof
22 Illustration of Proof of Claim. Given $\varepsilon$, there are $n, \delta$ such that $|f_n(x)| < \varepsilon$ in a $\delta$ neighbourhood of $x$.
23 First few iterations of the Takagi Function
24 First few iterations of construction
25 Alternate Construction of Cantor Staircase
26 Illustration of the $L^\infty$ and $L^1$ distances between functions. In particular, the $L^\infty$ distance is the maximum pointwise distance between the two functions and the $L^1$ distance is the area between the two curves.
27 Illustration between Modes of Convergence
28 Another solution of the differential equation is constructed by shifting the point where the function is first non-zero
29 Other solutions of the differential equation are constructed in this case, again by shifting
30 Euler’s method produces a series of piecewise linear approximations to the solution of a differential equation
31 Two cases for Euler’s Method when solving $x' = \sqrt{|x|}$
32 Application of the averaging operator to a step function yields a piecewise linear function, then a piecewise quadratic function
33 Definition of $g(t)$
34 A sequence of smooth $g$ which approach the delta function
35 Recall that such $g_n$ are bump functions, which approach the delta function
36 First few terms of $\frac{1}{2} + \sum_{n=0}^{\infty} \frac{2}{(2k+1)\pi} \sin((2k+1)\pi)$, a Fourier series for a step function, overlaid with the original function. The original function is plotted in green; the Fourier series is plotted in blue.
37 Example of the Gibbs Phenomenon for a Square Wave. Gibbs phenomenon are displayed at the point of discontinuity and lie approximately on the line $y = 1.09$
38 Plot of the Dirichlet Kernel $D_N(x)$ for some $N$


Part I

Riemann-Stieltjes Integration

1 The Riemann Integral

Our first problem in this course is to rigorously define the integral. How do we define $\int_a^b f(x)\,dx$? This problem was first explored by Riemann in his thesis.

From previous calculus courses, we define the integral as the limit of a Riemann sum. That is:

$$\int_a^b f(x)\,dx = \lim_{n\to\infty} \sum_{i=1}^{n} f(t_i)\,\Delta x_i$$

where $t_i \in [x_{i-1}, x_i]$ (known as a tag of the partition) and $\Delta x_i = x_i - x_{i-1}$. As $n$ approaches infinity, the partitions should get finer and finer.

Figure 1: Illustration of a partition, Riemann sum, and tag

However, the above definition of the integral raises two problems:

1. How is ti, the tag, chosen?

2. How is the limit taken as n approaches infinity?

The definition of the Riemann integral given by Darboux solves the above two issues. Next, we will add a generalization of the Darboux integral due to Stieltjes.

1.1 Darboux’s Definition of the Riemann Integral

We first define a partition.

Definition I.1 (Partition). A partition $P$ of $[a, b]$ is a set

$$P = \{a = x_0 < x_1 < x_2 < \dots < x_n = b\}$$

Now suppose that $f(x)$ is a bounded function on $[a, b]$. To solve the first issue, the tagging problem, we will replace $f(t_i)$ with the supremum or infimum of $f$ on each interval of the partition. Let $M_i = \sup_{x \in [x_{i-1}, x_i]} f(x)$ and $m_i = \inf_{x \in [x_{i-1}, x_i]} f(x)$.


Then define the upper and lower sums to be

$$U(P, f) = \sum_{i=1}^{n} M_i\,\Delta x_i$$

and

$$L(P, f) = \sum_{i=1}^{n} m_i\,\Delta x_i$$

Figure 2: Illustration of upper and lower Darboux sums
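As a rough numerical illustration of the upper and lower sums, the following Python sketch computes $L(P, f)$ and $U(P, f)$ on regular partitions. It assumes $f$ is nondecreasing, so that the infimum and supremum on each interval are attained at the left and right endpoints; the helper names `darboux_sums` and `regular_partition` and the choice $f(x) = x^2$ on $[0, 1]$ are illustrative and do not come from the notes.

```python
from fractions import Fraction


def regular_partition(a, b, n):
    """Partition of [a, b] into n subintervals of equal length (exact rationals)."""
    a, b = Fraction(a), Fraction(b)
    return [a + (b - a) * i / n for i in range(n + 1)]


def darboux_sums(f, partition):
    """Return (L(P, f), U(P, f)) for a nondecreasing function f.

    For nondecreasing f, the infimum on [x0, x1] is f(x0) and the
    supremum is f(x1), so no search over the interval is needed.
    """
    pairs = list(zip(partition, partition[1:]))
    lower = sum(f(x0) * (x1 - x0) for x0, x1 in pairs)
    upper = sum(f(x1) * (x1 - x0) for x0, x1 in pairs)
    return lower, upper


f = lambda x: x * x  # nondecreasing on [0, 1]
for n in (4, 16, 64):
    lower, upper = darboux_sums(f, regular_partition(0, 1, n))
    print(n, float(lower), float(upper))
```

The two sums bracket the value 1/3 and their difference shrinks as the partition is made finer, which is the behaviour the definition is designed to capture.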

Supposing that $\int_a^b f(x)\,dx$ exists, we conjecture that

$$L(P, f) \le \int_a^b f(x)\,dx \le U(P, f)$$

should hold. Accordingly, we define the upper and lower integrals respectively as

$$\overline{\int_a^b} f(x)\,dx = \inf_{P} \{U(P, f)\}$$

and

$$\underline{\int_a^b} f(x)\,dx = \sup_{P} \{L(P, f)\}$$

where $P$ ranges over the set $\mathcal{P}$ of all possible partitions of $[a, b]$. That is,

$$\mathcal{P} = \bigcup_{n=1}^{\infty} \{\text{partitions with } n \text{ parts}\}.$$

In the case where partitions have three parts ($a = x_0 < x_1 < x_2 < x_3 = b$), the set of all partitions is an upper-triangular region bounded by $x_1 = a$, $x_2 = b$ and $x_1 = x_2$, where these lines are not included in the region. We can also think of taking sup or inf over all possible partitions as making partitions finer and finer.

If $\overline{\int_a^b} f(x)\,dx = \underline{\int_a^b} f(x)\,dx$, then $\int_a^b f(x)\,dx$ is defined to be this common value and we say that $f$ is Riemann-integrable. We write $f \in \mathcal{R}$.


This solves the second issue, since in defining the integral this way we take a supremum or infimum over a set instead of a limit.

The process just described is similar to finding the area of a plane region $R$. We can superimpose a square grid on the plane. Then we define the outer sum as counting every square which meets $R$ and the inner sum as counting every square which lies within $R$. As the grid is made finer and finer, the outer and inner sums approximate the area of $R$ more and more closely, and in the limit, everything should be equal.

1.2 Introduction to Integrability

The next thing we want to do is to determine which functions are integrable. We begin with the following example.

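Example I.1 (A non-integrable function). Suppose $f(x)$ is defined on $[0, 1]$ as follows:

$$f(x) = \begin{cases} 1 & x \in \mathbb{Q} \\ 0 & \text{otherwise} \end{cases}$$

Fix some partition $P$. Since every interval of the partition contains both rational and irrational points, $M_i = 1$ and $m_i = 0$ on each interval.

Thus, $U(P, f) = \sum_{i=1}^{n} M_i\,\Delta x_i = 1$ for every $P$, which implies that $\overline{\int_0^1} f(x)\,dx = 1$.

Similarly, $L(P, f) = \sum_{i=1}^{n} m_i\,\Delta x_i = 0$ for every $P$, which implies that $\underline{\int_0^1} f(x)\,dx = 0$.

Since $\overline{\int_0^1} f\,dx \ne \underline{\int_0^1} f\,dx$, we conclude that $f \notin \mathcal{R}$.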

We will prove that f ∈ R if:

1. f(x) is continuous.

2. f(x) is continuous except at a finite number of points.

The above are sufficient conditions for Riemann integrability. Lebesgue formulated a necessary and sufficient condition for Riemann integrability: a bounded function $f$ is integrable if and only if $f$ is continuous except on a set of measure zero. Informally, measure denotes the “length” of a set. A set is of measure zero if, for every $\varepsilon > 0$, it can be covered by countably many intervals whose total length is less than $\varepsilon$. This is covered in more detail in subsequent analysis courses.

Example I.2 (Computation). Using the definition of the Riemann integral, we would like to compute $\int_a^b x\,dx$. We would expect that this is equal to $I = \frac{b^2}{2} - \frac{a^2}{2}$.

To apply the definition of the Riemann integral, we need to prove that the upper and lower integrals are both equal to $I$. To prove that $I = \sup_P L(P, f)$, we must show:

(a) $L(P, f) \le I$ for every partition $P$.

(b) For every $\varepsilon > 0$, there exists a partition $P$ such that $|I - L(P, f)| < \varepsilon$.


Figure 3: Quantity we want to compute (the area under the line $y = x$ between $a$ and $b$)

Proof. (a) Informally, the rectangles making up $L(P, f)$ lie within the trapezoid of area $I$ under the line $y = x$, so $I$ is an upper bound for $L(P, f)$.

(b) Let $P_n$ be the regular partition of $[a, b]$ into $n$ subintervals, that is, the partition where all $\Delta x_i$ are equal ($\Delta x_i = \frac{b-a}{n}$). Then $I - L(P_n, f)$ is the total area of the $n$ triangles below the line $y = x$ and above the rectangles of $L(P_n, f)$. Thus

$$I - L(P_n, f) = \frac{1}{2}(\Delta x)^2\, n = \frac{1}{2}\left(\frac{b-a}{n}\right)^2 n = \frac{(b-a)^2}{2n}.$$

By choosing $n$ sufficiently large, we can make $\frac{(b-a)^2}{2n} < \varepsilon$.
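The gap between $I$ and the lower sum in this example can be checked with exact rational arithmetic. The sketch below is only an illustration: the function names `lower_sum_identity` and `gap` are made up for this purpose, and it uses the fact that $f(x) = x$ is increasing, so the infimum on each subinterval is its left endpoint.

```python
from fractions import Fraction


def lower_sum_identity(a, b, n):
    """L(P_n, f) for f(x) = x on the regular partition of [a, b] into n parts."""
    a, b = Fraction(a), Fraction(b)
    dx = (b - a) / n
    # Left endpoints are a, a + dx, ..., a + (n - 1) * dx.
    return sum((a + i * dx) * dx for i in range(n))


def gap(a, b, n):
    """I - L(P_n, f), where I = b^2/2 - a^2/2."""
    integral = (Fraction(b) ** 2 - Fraction(a) ** 2) / 2
    return integral - lower_sum_identity(a, b, n)


for n in (1, 10, 100):
    # Compare the computed gap with the closed form (b - a)^2 / (2n) for [1, 3].
    print(n, gap(1, 3, n), Fraction((3 - 1) ** 2, 2 * n))
```

Both printed columns agree for each $n$, matching the closed form derived geometrically in the proof.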

Remark. Since we know the “answer” in advance here, we can apply geometric arguments. For a more formal argument, we may need to write out a sum explicitly or resort to another criterion to prove integrability. For instance, we could have written $I$ as $\sum_{i=1}^{n} \frac{x_i + x_{i-1}}{2}\,\Delta x_i$, the sum of the areas of the trapezoids over each interval of the partition. Each term equals $\frac{x_i^2 - x_{i-1}^2}{2}$, so the sum telescopes and we obtain $I = \frac{b^2}{2} - \frac{a^2}{2}$.

We have yet to establish properties of the integral, but before then we will need to define the Riemann-Stieltjes integral.

2 The Riemann-Stieltjes Integral

In computing $L(P, f)$ and $U(P, f)$, we take the heights $m_i, M_i$ and multiply them by the width $\Delta x_i$ of each rectangle in the partition. We will change the definition of the length $\Delta x_i = l([x_{i-1}, x_i])$ and use the new length to compute areas.

We do this by fixing α : [a, b]→ R which is monotonically increasing. Let

l([s, t]) = α(t)− α(s)

Then we can define the integral as before, replacing $\Delta x_i$ with $\Delta\alpha_i = \alpha(x_i) - \alpha(x_{i-1})$, which is a new measure of the length of the interval $[x_{i-1}, x_i]$.

We can now define the Riemann-Stieltjes integral.

Definition I.2 (Riemann-Stieltjes Integral). Suppose $f$ is bounded on $[a, b]$ and $\alpha(x)$ is a monotonically increasing function on $[a, b]$. Fixing a partition $P$, define

$$L(P, f, \alpha) = \sum_{i=1}^{n} m_i\,\Delta\alpha_i, \qquad \underline{\int_a^b} f\,d\alpha = \sup_P L(P, f, \alpha)$$

and

$$U(P, f, \alpha) = \sum_{i=1}^{n} M_i\,\Delta\alpha_i, \qquad \overline{\int_a^b} f\,d\alpha = \inf_P U(P, f, \alpha).$$

If these two integrals are equal, we call the common value $\int_a^b f\,d\alpha$ and say $f \in \mathcal{R}(\alpha)$.

Remark. 1. Note that if $\alpha(x) = x$, then $\Delta\alpha_i = \Delta x_i$ and the Riemann-Stieltjes integral is the Riemann integral.

2. If $\alpha$ is continuous, there is not much new to consider, since we can compare this integral to the Riemann integral. The case where $\alpha$ is discontinuous is the interesting one.

Example I.3 (A discontinuous α). Let

$$\alpha(x) = \begin{cases} 0 & x < 1 \\ 1 & x \ge 1 \end{cases}$$

(this is a step function), $[a, b] = [0, 2]$, and let $f$ be any continuous function. We would like to compute $\int_0^2 f\,d\alpha$.

In the interval $[x_{j-1}, x_j]$ where $x_j \ge 1$ and $x_{j-1} < 1$, we have $\Delta\alpha_j = \alpha(x_j) - \alpha(x_{j-1}) = 1$. In all other intervals of the partition, $\alpha$ is constant and accordingly $\Delta\alpha_i = 0$.

Thus, $L(P, f, \alpha) = m_j$ and $U(P, f, \alpha) = M_j$, where the sup and inf are taken on the interval $[x_{j-1}, x_j]$. As the interval shrinks, we conclude that $\sup_P m_j = f(1)$ and $\inf_P M_j = f(1)$ since $f$ is a continuous function. It follows that $\int_0^2 f\,d\alpha = f(1)$.
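The computation in this example can be sketched numerically. The Python code below is an illustration under extra assumptions not made in the notes: it takes $f(x) = x^2$, which is continuous and nondecreasing on $[0, 2]$, so that $m_i$ and $M_i$ are the values at the left and right endpoints; the helper names `rs_sums` and `regular_partition` are hypothetical.

```python
from fractions import Fraction


def alpha(x):
    """The step function of the example: 0 for x < 1, 1 for x >= 1."""
    return 1 if x >= 1 else 0


def regular_partition(a, b, n):
    """Partition of [a, b] into n subintervals of equal length."""
    a, b = Fraction(a), Fraction(b)
    return [a + (b - a) * i / n for i in range(n + 1)]


def rs_sums(f, alpha, partition):
    """Return (L(P, f, alpha), U(P, f, alpha)) for a nondecreasing f."""
    lower = upper = Fraction(0)
    for x0, x1 in zip(partition, partition[1:]):
        d_alpha = alpha(x1) - alpha(x0)  # 1 on the interval around x = 1, else 0
        lower += f(x0) * d_alpha         # inf of nondecreasing f is f(x0)
        upper += f(x1) * d_alpha         # sup of nondecreasing f is f(x1)
    return lower, upper


f = lambda x: x * x  # f(1) = 1
for n in (3, 9, 27, 81):
    lower, upper = rs_sums(f, alpha, regular_partition(0, 2, n))
    print(n, float(lower), float(upper))
```

Only the single interval containing the jump contributes, and both sums close in on $f(1) = 1$ as the partition is refined.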

Remark. 1. The Dirac delta function, $\delta_1(x)$, is informally described as being infinite at $x = 1$ and $0$ everywhere else. It has the property that $\int_{-\infty}^{\infty} \delta_1(x)\,dx = 1$ and $\int_0^2 f(x)\,\delta_1(x)\,dx = f(1)$. One of the motivations for introducing the Riemann-Stieltjes integral is to study these objects. Later, we will prove that if $\alpha$ is differentiable, then $\int_a^b f\,d\alpha = \int_a^b f\alpha'\,dx$. Thus, using this interpretation of the Riemann-Stieltjes integral, we can interpret $\delta_1(x)$ as being the derivative of the discontinuous step function defined in the previous example.

2. We claim that $\int_0^2 \alpha\,d\alpha$ does not exist, which we need to check.

We interpreted $\int_a^b f\,dx$ as the area under the graph of $f(x)$. We can assign a similar interpretation to $\int_a^b f\,d\alpha$.

Fix some $f$. In the case where $\alpha$ is continuous, consider the map $(x, y) \mapsto (\alpha(x), y)$. Fixing a partition $P$, the heights $m_i$ on each interval of the partition in $\int_a^b f(x)\,dx$ are preserved in $\int_a^b f\,d\alpha$. However, each $\Delta x_i$ is changed to $\Delta\alpha_i = \alpha(x_i) - \alpha(x_{i-1})$ under the transformation. Since each $m_i\,\Delta\alpha_i$ is the area of a rectangle, the integral $\int_a^b f\,d\alpha$ then represents the area under the graph of $f(x)$ after the transformation $(x, y) \mapsto (\alpha(x), y)$.


Figure 4: The graph of $f$, and its transformation under $(x, y) \mapsto (\alpha(x), y)$. The interval $[x_{i-1}, x_i]$ of width $\Delta x_i = x_i - x_{i-1}$ is mapped to $[\alpha(x_{i-1}), \alpha(x_i)]$ of width $\Delta\alpha_i = \alpha(x_i) - \alpha(x_{i-1})$. The area under the left graph represents $\int_a^b f\,dx$ and the area under the right graph represents $\int_a^b f\,d\alpha$

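The more interesting case is the one where $\alpha$ is discontinuous. Consider the step function:

$$\alpha(x) = \begin{cases} 0 & x < 1 \\ 1 & x \ge 1 \end{cases}$$

We determined last day that for continuous $f$, $\int_0^2 f\,d\alpha = f(1)$. Pictorially, we can illustrate this as follows.

Figure 5: Visualization of $\int_0^2 f\,d\alpha$ (the graph of $f$ on $[0, 2]$ and its image under $(x, y) \mapsto (\alpha(x), y)$, a region of height $f(1)$ over $[0, 1]$)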

The value $f(1)$ is spread out across the interval $[0, 1]$, as this is what the transformed graph would look like in the limit if we take $\alpha$ as the limit of smooth functions which approximate it.

We also claimed last day that $\int_0^2 \alpha\,d\alpha$ does not exist, which we will now prove.

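Example I.4 (A Non-Integrable Function). Consider $\int_0^2 \alpha\,d\alpha$. Note that we only need to consider the interval $[x_{j-1}, x_j]$ containing $1$ (with $x_{j-1} < 1 \le x_j$) to compute the upper and lower integrals. This is because

$$\Delta\alpha_i = \begin{cases} 1 & i = j \\ 0 & \text{otherwise} \end{cases}$$

On this interval, $\alpha$ takes both of the values $0$ and $1$, so $U(P, \alpha, \alpha) = M_j = 1$ and $L(P, \alpha, \alpha) = m_j = 0$ for every partition $P$. Thus, $\underline{\int_0^2} \alpha\,d\alpha = 0$ and $\overline{\int_0^2} \alpha\,d\alpha = 1$. Since these differ, $\int_0^2 \alpha\,d\alpha$ does not exist.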
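A quick numerical check, offered only as an illustration, confirms that the upper and lower sums for $\int_0^2 \alpha\,d\alpha$ do not depend on the partition. The function name `sums_alpha_d_alpha` and the randomly generated partitions are assumptions of this sketch; since $\alpha$ is nondecreasing, its infimum and supremum on each interval are its values at the endpoints.

```python
import random


def alpha(x):
    """Step function: 0 for x < 1, 1 for x >= 1."""
    return 1 if x >= 1 else 0


def sums_alpha_d_alpha(partition):
    """Return (L(P, alpha, alpha), U(P, alpha, alpha)) for a sorted partition of [0, 2]."""
    lower = upper = 0
    for x0, x1 in zip(partition, partition[1:]):
        d_alpha = alpha(x1) - alpha(x0)
        lower += alpha(x0) * d_alpha  # inf of alpha on [x0, x1]
        upper += alpha(x1) * d_alpha  # sup of alpha on [x0, x1]
    return lower, upper


rng = random.Random(0)  # fixed seed so the run is reproducible
for _ in range(3):
    interior = sorted(rng.uniform(0, 2) for _ in range(rng.randint(1, 20)))
    print(sums_alpha_d_alpha([0.0] + interior + [2.0]))
```

However fine the partition, the difference between the upper and lower sums stays equal to 1, so no choice of partition can bring them together.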

We remark that Problem 2 on the problem set contains a function similar to $\alpha$ which is integrable on $[0, 2]$.


3 Integrability

3.1 Upper and Lower Integrals

We now begin to investigate which functions are integrable. Before then, we need to establish some properties of the integral. For instance:

Question I.1. Is $\underline{\int_a^b} f\,d\alpha \le \overline{\int_a^b} f\,d\alpha$?

We begin by comparing upper and lower sums. We know that, fixing $P$,

$$\sum m_i\,\Delta\alpha_i = L(P, f, \alpha) \le U(P, f, \alpha) = \sum M_i\,\Delta\alpha_i.$$

This is because $m_i \le M_i$ by definition (they are lower and upper bounds respectively), and $\Delta\alpha_i \ge 0$ since $\alpha$ is an increasing function.

Question I.2. If we have two partitions P1, P2, is it true that L(P1, f, α) ≤ U(P2, f, α)?

Definition I.3. A partition $P^*$ is a refinement of $P$ if $P = \{x_0, x_1, \dots, x_n\} \subset P^* = \{y_0, y_1, \dots, y_m\}$.

Lemma I.1. If P ∗ is a refinement of P , then

1 L(P, f, α) ≤ L(P ∗, f, α) and

2 U(P, f, α) ≥ U(P ∗, f, α)

Figure 6: Illustration of Lemma for L(P, f, α). Refining the partition increases L(P, f, α)

Proof. It suffices to prove this for a partition $P^* = P \cup \{y\}$ where $y \in [x_{i-1}, x_i]$. Let $m_i = \inf_{x \in [x_{i-1}, x_i]} f(x)$. Then

$$L(P, f, \alpha) = \dots + m_i\,\Delta\alpha_i + \dots$$

and

$$L(P^*, f, \alpha) = \dots + \underbrace{m_1^*\,\Delta\alpha_1^* + m_2^*\,\Delta\alpha_2^*}_{s} + \dots$$

where $m_1^* = \inf_{x \in [x_{i-1}, y]} f(x)$, $m_2^* = \inf_{x \in [y, x_i]} f(x)$, $\Delta\alpha_1^* = \alpha(y) - \alpha(x_{i-1})$ and $\Delta\alpha_2^* = \alpha(x_i) - \alpha(y)$. The parts in $\dots$ are the same between $L(P, f, \alpha)$ and $L(P^*, f, \alpha)$.

Noting that $\Delta\alpha_1^* + \Delta\alpha_2^* = \Delta\alpha_i$, $m_1^* \ge m_i$ and $m_2^* \ge m_i$ allows us to conclude that $s \ge m_i(\Delta\alpha_1^* + \Delta\alpha_2^*) = m_i\,\Delta\alpha_i$.

It follows that L(P ∗, f, α) ≥ L(P, f, α). The other inequality follows similarly.

Note that the fact that α was increasing is crucial, since we need ∆α∗1 and ∆α∗2 to be non-negative for the argument to work.
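The effect of adding one point to a partition can be seen in a small exact computation. The sketch below is illustrative only: it assumes the nondecreasing choices $f(x) = x^2$ and $\alpha(x) = x^3$ on $[0, 1]$ (neither appears in the notes), for which the infimum and supremum of $f$ on an interval sit at its endpoints, and the helper name `rs_sums` is hypothetical.

```python
from fractions import Fraction


def rs_sums(f, alpha, partition):
    """Return (L(P, f, alpha), U(P, f, alpha)) for nondecreasing f and alpha."""
    lower = upper = Fraction(0)
    for x0, x1 in zip(partition, partition[1:]):
        d_alpha = alpha(x1) - alpha(x0)  # nonnegative because alpha is nondecreasing
        lower += f(x0) * d_alpha         # inf of f on [x0, x1]
        upper += f(x1) * d_alpha         # sup of f on [x0, x1]
    return lower, upper


f = lambda x: x * x
alpha = lambda x: x ** 3

P = [Fraction(0), Fraction(1, 2), Fraction(1)]
P_star = sorted(P + [Fraction(1, 4)])  # refinement: P with the point y = 1/4 added

print("P :", rs_sums(f, alpha, P))
print("P*:", rs_sums(f, alpha, P_star))
```

The lower sum goes up and the upper sum goes down after the refinement, as the lemma predicts.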

Lemma I.2. Any partitions P1, P2 have a common refinement P ∗.

Proof. Take P ∗ = P1 ∪ P2.

This is the minimal common refinement, but we can also add further points to $P_1 \cup P_2$ to create other common refinements.

We now return to the original question we wanted to discuss.

Lemma I.3. Suppose P1, P2 are partitions. Then L(P1, f, α) ≤ U(P2, f, α).

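Proof. Let $P^*$ be a common refinement of $P_1, P_2$. Then

$$L(P_1, f, \alpha) \le L(P^*, f, \alpha) \le U(P^*, f, \alpha) \le U(P_2, f, \alpha)$$

where the first and last inequalities hold by Lemma I.1, and the middle inequality holds because the same partition $P^*$ appears on both sides.

It follows that $L(P_1, f, \alpha) \le U(P_2, f, \alpha)$.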

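Theorem I.1. $\underline{\int_a^b} f\,d\alpha \le \overline{\int_a^b} f\,d\alpha$

Proof. This comes from unwinding the definitions of the upper and lower integrals. Fix a partition $P_1$. Then $L(P_1, f, \alpha) \le U(P, f, \alpha)$ for all partitions $P$. Since $L(P_1, f, \alpha)$ is a lower bound for $\{U(P, f, \alpha)\}$, we get $L(P_1, f, \alpha) \le \overline{\int_a^b} f\,d\alpha$, because $\overline{\int_a^b} f\,d\alpha$ is the greatest lower bound for $\{U(P, f, \alpha)\}$.

Since $P_1$ was arbitrary, $\overline{\int_a^b} f\,d\alpha$ is an upper bound for $\{L(P, f, \alpha)\}$, and hence $\underline{\int_a^b} f\,d\alpha$, the least upper bound for $\{L(P, f, \alpha)\}$, must satisfy $\underline{\int_a^b} f\,d\alpha \le \overline{\int_a^b} f\,d\alpha$.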

3.2 Integrability of Continuous Functions

Last day, we established that $\underline{\int_a^b} f\,d\alpha \le \overline{\int_a^b} f\,d\alpha$ holds. When do we have equality between the upper and lower integrals, that is, Riemann integrability?

We can restate the condition for Riemann integrability as follows:

$$f \in \mathcal{R}(\alpha) \leftrightarrow \forall \varepsilon > 0\ \exists P_1, P_2 \text{ s.t. } U(P_1, f, \alpha) - L(P_2, f, \alpha) < \varepsilon \qquad (*)$$

That is, the difference between the upper and lower sums can be made arbitrarily small. It turns out that we only need one partition which works for both the upper and lower sums.

Theorem I.2. $f \in \mathcal{R}(\alpha)$ if and only if for every $\varepsilon > 0$, there is a partition $P$ such that $U(P, f, \alpha) - L(P, f, \alpha) < \varepsilon$.


Proof. (←) The condition $(*)$ is satisfied for $P_1 = P$ and $P_2 = P$ and therefore $f \in \mathcal{R}(\alpha)$.

(→) Assume that $f \in \mathcal{R}(\alpha)$ and fix $\varepsilon > 0$. By $(*)$, there are two partitions $P_1, P_2$ such that $U(P_1, f, \alpha) - L(P_2, f, \alpha) < \varepsilon$. Let $P$ be the common refinement of $P_1, P_2$. Then

$$L(P_2, f, \alpha) \le L(P, f, \alpha) \le U(P, f, \alpha) \le U(P_1, f, \alpha)$$

holds. Accordingly,

$$U(P, f, \alpha) - L(P, f, \alpha) \le U(P_1, f, \alpha) - L(P_2, f, \alpha) < \varepsilon.$$

Remark. We can rewrite the difference as $U(P, f, \alpha) - L(P, f, \alpha) = \sum_{i=1}^{n} (M_i - m_i)\,\Delta\alpha_i$. Then informally, a function is integrable if the area between the upper and lower sums can be made arbitrarily small.

We will apply the above condition in proving the following:

Theorem I.3. If $f$ is continuous on $[a, b]$, then $f$ lies in $\mathcal{R}(\alpha)$. That is, it is integrable with respect to any $\alpha$.

3.2.1 Review of Uniform Continuity

Before completing the proof, we will recall the notion of uniform continuity. A function $f$ is continuous at all points $x \in [a, b]$ if

$$\forall x\ \forall \varepsilon > 0\ \exists \delta > 0 \text{ s.t. } |x - y| < \delta \rightarrow |f(x) - f(y)| < \varepsilon$$

Here $\delta = \delta(x, \varepsilon)$. Then $f$ is uniformly continuous if it is continuous on $[a, b]$ and $\delta$ does not depend on $x$. That is:

$$\forall \varepsilon > 0\ \exists \delta > 0 \text{ s.t. } \forall x, y:\ |x - y| < \delta \rightarrow |f(x) - f(y)| < \varepsilon$$

Note that if $f$ is continuous on $[a, b]$, then $f$ is uniformly continuous, as the notions of uniform continuity and continuity are equivalent on a compact set.

For instance, $f(x) = x^2$ on $[0, \infty)$ is continuous but not uniformly continuous. This is because the graph gets steeper as $x$ increases. Accordingly, for $|f(x) - f(y)| < \varepsilon$ to hold for fixed $\varepsilon$ whenever $|x - y| < \delta$, the value of $\delta$ must be decreased as $x$ increases. This is consistent with the fact that $[0, \infty)$ is not a compact set.
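To make the dependence of $\delta$ on $x$ concrete for $f(x) = x^2$, note that for $x \ge 0$ and $|y - x| < \delta$ we have $|y^2 - x^2| = |y - x|\,|y + x| < \delta(2x + \delta)$, and this bound is approached as $y \to x + \delta$. So the largest $\delta$ that works at the point $x$ solves $\delta(2x + \delta) = \varepsilon$. The following Python sketch evaluates that value; the function name `largest_delta` is made up for this illustration.

```python
import math


def largest_delta(x, eps):
    """Largest delta with delta * (2 * x + delta) <= eps, for x >= 0.

    Solving the quadratic gives delta = sqrt(x^2 + eps) - x, written here
    as eps / (sqrt(x^2 + eps) + x) to avoid cancellation for large x.
    """
    return eps / (math.sqrt(x * x + eps) + x)


for x in (0.0, 1.0, 10.0, 100.0, 1000.0):
    print(x, largest_delta(x, 1.0))
```

The admissible $\delta$ shrinks towards $0$ as $x$ grows, so no single $\delta$ works on all of $[0, \infty)$.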

Note that existence of the derivative is not required for a function to be uniformly continuous. For example, if we regard part of a circle on $[0, 1]$ as a function, the function is uniformly continuous on that interval because $[0, 1]$ is compact, although the derivative will be infinite at $x = 1$.

3.2.2 Proof of Theorem

Proof. We need to show that for $\varepsilon > 0$, there exists a partition $P$ such that $U(P, f, \alpha) - L(P, f, \alpha) < \varepsilon$ holds for any continuous $f$.

Since $f$ is continuous on a compact set, $f$ is uniformly continuous. Then for every $\eta > 0$, there exists $\delta > 0$ for which $|f(x) - f(y)| < \eta$ if $|x - y| < \delta$. We will specify the $\eta$ later.


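Next, we define the mesh of a partition $P$ as $\|P\| = \max_{i \in \{1, \dots, n\}} (x_i - x_{i-1})$. Choose $P$ whose mesh is less than $\delta$. Since $f$ attains its supremum and infimum on each closed interval of the partition, it follows by (uniform) continuity of $f$ that $M_i - m_i < \eta$ holds on each interval.

Accordingly, assuming $\alpha(b) > \alpha(a)$:

$$U(P, f, \alpha) - L(P, f, \alpha) = \sum_{i=1}^{n} (M_i - m_i)\,\Delta\alpha_i < \eta \sum_{i=1}^{n} \Delta\alpha_i = \eta\,(\alpha(b) - \alpha(a)) = \varepsilon$$

where we take $\eta = \frac{\varepsilon}{\alpha(b) - \alpha(a)}$. This completes the proof.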

Remark. • In the case when $\alpha(a) = \alpha(b)$, $\int_a^b f\,d\alpha = 0$ holds since $\alpha$ is constant, and any bounded function is integrable with respect to such $\alpha$.

• We actually proved something stronger. We remarked before that $f \in \mathcal{R}(\alpha)$ if and only if for every $\varepsilon$, there is a partition $P$ such that $U - L < \varepsilon$. We can express this condition in terms of the mesh of the partition: for continuous $f$, we showed that for every $\varepsilon > 0$ there is a $\delta > 0$ such that $\|P\| < \delta$ implies $U - L < \varepsilon$. Any $f$ satisfying this mesh condition is integrable.

3.3 Riemann Sums

Recall the definition of a Riemann sum. A Riemann sum

$$RS(P, \{t_i\}, f, \alpha) = \sum_{i=1}^{n} f(t_i)\,\Delta\alpha_i$$

depends not only on a partition $P$, but also on the choice of a tag $t_i$ in each interval $[x_{i-1}, x_i]$ of the partition.

Theorem I.4. Let $f$ be continuous, so $f \in \mathcal{R}(\alpha)$. Choose a sequence of partitions $P_k$ such that $\|P_k\| \to 0$ as $k \to \infty$. Then:

$$\lim_{k \to \infty} RS(P_k, \{t_i\}, f, \alpha) = \int_a^b f\,d\alpha$$

for any choice of tagging $\{t_i\}$ in each $P_k$.

Remark. Continuity is really needed here. This is not necessarily true for non-continuous functions.

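Proof. By definition, $m_i \le f(t_i) \le M_i$ holds for all intervals in a partition $P$. It follows that

$$L(P_k, f, \alpha) \le RS(P_k, \{t_i\}, f, \alpha) \le U(P_k, f, \alpha)$$

holds for any partition $P_k$ and tagging. In the limit:

$$\lim_{k \to \infty} L(P_k, f, \alpha) \le \lim_{k \to \infty} RS(P_k, \{t_i\}, f, \alpha) \le \lim_{k \to \infty} U(P_k, f, \alpha)$$

But since $f$ is continuous, it is integrable, and because $\|P_k\| \to 0$ we have $U(P_k, f, \alpha) - L(P_k, f, \alpha) \to 0$, so $\lim_{k \to \infty} L(P_k, f, \alpha) = \lim_{k \to \infty} U(P_k, f, \alpha)$. It follows that all the above limits are equal to $\int_a^b f\,d\alpha$.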
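The squeeze used in this proof can be observed numerically. In the sketch below, all specifics are assumptions made for illustration and are not taken from the notes: $f(x) = x$ and $\alpha(x) = x^2$ on $[0, 1]$, regular partitions, and tags drawn at random from each subinterval using exact rational arithmetic. The helper name `tagged_sums` is hypothetical.

```python
import random
from fractions import Fraction


def tagged_sums(f, alpha, n, rng):
    """Return (L, RS, U) on the regular partition of [0, 1] into n parts.

    Assumes f and alpha are nondecreasing, so inf/sup of f on each
    subinterval are its values at the left/right endpoints.
    """
    points = [Fraction(i, n) for i in range(n + 1)]
    lower = rs = upper = Fraction(0)
    for x0, x1 in zip(points, points[1:]):
        d_alpha = alpha(x1) - alpha(x0)
        # Random tag in [x0, x1], kept rational so comparisons are exact.
        tag = x0 + (x1 - x0) * Fraction(rng.randint(0, 1000), 1000)
        lower += f(x0) * d_alpha
        rs += f(tag) * d_alpha
        upper += f(x1) * d_alpha
    return lower, rs, upper


rng = random.Random(0)  # fixed seed for reproducibility
f = lambda x: x
alpha = lambda x: x * x
for n in (10, 100, 1000):
    lower, rs, upper = tagged_sums(f, alpha, n, rng)
    print(n, float(lower), float(rs), float(upper))
```

Whatever tags are drawn, the Riemann sum stays between the lower and upper sums, and here the width of that bracket is exactly $1/n$.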

Remark. We really proved that if $f$ is continuous, then for every $\varepsilon > 0$, there exists $\delta > 0$ such that

$$\left| RS(P, \{t_i\}, f, \alpha) - \int_a^b f\,d\alpha \right| < \varepsilon$$

provided $\|P\| < \delta$.

Next time we will prove more stuff about classes of integrable functions.


3.4 Discontinuous Functions

We already know that if $f$ is continuous, then $f \in \mathcal{R}(\alpha)$ for any $\alpha$, but functions may also be discontinuous and still be Riemann-Stieltjes integrable. Note that in the case of the Riemann-Stieltjes integral, there is no complete characterization of the functions which are integrable, unlike the Riemann integral, where a bounded function is integrable if and only if it is continuous almost everywhere.

Theorem I.5. Let $f$ be bounded and $\alpha$ be non-decreasing on $[a, b]$. If $f$ is continuous except at a finite number of points $y_1, y_2, \dots, y_m$ and $\alpha$ is continuous at each $y_i$, then $f \in \mathcal{R}(\alpha)$.

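Remark. Recall the example of the step function

\[ \alpha(x) = \begin{cases} 0 & x \le 1 \\ 1 & x > 1 \end{cases} \]

and our observation that $\int \alpha \, d\alpha$ did not exist. This is because the integrator $\alpha$ has a discontinuity at $x = 1$, the same point where the integrand is discontinuous.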

Proof. It suffices to prove this in the case where n = 1, since we can apply the argument inductively when there is more than one discontinuity. Let y1 = y be the discontinuity of f.

We must show that for every ε > 0, there exists a partition P such that U(P, f, α) − L(P, f, α) < ε. Divide [a, b] into three pieces: [a, y − δ], [y − δ, y + δ], and [y + δ, b], where δ is a quantity we will choose later. We choose partitions P1, P2, P3 of each piece separately, such that U(Pi, f, α) − L(Pi, f, α) < ε/3 holds for i = 1, 2, 3. We then combine P1, P2, P3 into the partition P to get U(P, f, α) − L(P, f, α) < ε.

Figure 7: Division of the Interval into Three Parts (the points a, y − δ, y + δ, b marked on the interval)

On [a, y − δ] and [y + δ, b], f is continuous and hence in R(α). Then there exist P1, P3 such that U(P1, f, α) − L(P1, f, α) < ε/3 and U(P3, f, α) − L(P3, f, α) < ε/3 hold.

Then on [y − δ, y + δ] we have a discontinuity. Let P2 have one part (P2 = {y − δ, y + δ}), and let M, m be the supremum and infimum of f on it. It follows that

U(P2, f, α) − L(P2, f, α) = (M − m)(α(y + δ) − α(y − δ))

By boundedness of f, there exists B > 0 such that |f| ≤ B, so M − m ≤ 2B. By continuity of α at y:

∀η > 0 ∃δ > 0 s.t. |y − x| ≤ δ → |α(y) − α(x)| < η

Choose η = ε/(12B), and let δ be the corresponding value. Then α(y + δ) − α(y − δ) < 2η, so U(P2, f, α) − L(P2, f, α) < 2B(2η) = ε/3 on [y − δ, y + δ]. Adding the three pieces gives U(P, f, α) − L(P, f, α) < ε, completing the proof.


Remark. We can adapt the above argument to prove that if, for every ε > 0, the discontinuities of f can be covered by a finite number of intervals of total α-length < ε, then f is integrable with respect to α.

This is the case if the discontinuities of f lie in the Cantor set (taking α(x) = x), since the Cantor set can be covered with finitely many intervals of arbitrarily small total length.

Iteration 2: Length = 4/9

Iteration 3: Length = 8/27

Figure 8: The Cantor Set can be covered with finitely many intervals of arbitrarily small total length.
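The lengths in the figure can be reproduced with exact rational arithmetic. This is a small sketch; the function name `cantor_cover` is ours and not from the notes. After k middle-third removals there are 2^k closed intervals of total length (2/3)^k, which tends to 0.

```python
from fractions import Fraction

def cantor_cover(k):
    """Closed intervals left after k middle-third removals from [0, 1]."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(k):
        next_intervals = []
        for left, right in intervals:
            third = (right - left) / 3
            next_intervals.append((left, left + third))    # keep left third
            next_intervals.append((right - third, right))  # keep right third
        intervals = next_intervals
    return intervals

for k in range(1, 6):
    cover = cantor_cover(k)
    total = sum(right - left for left, right in cover)
    print(k, len(cover), total)
```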

Theorem I.6. Let f be non-decreasing and α be continuous (and non-decreasing). Then f is integrable (f ∈ R(α)).

Remark. Here, a non-decreasing f can be “arbitrarily bad” as long as it is integrated with respect to a continuous α. Note that for the step function α considered earlier, ∫ α dα does not exist because α is non-decreasing, but not continuous.

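Proof. Choose a partition $P_n$ for which all $\Delta\alpha_i$ are equal (i.e. $\Delta\alpha_i = \frac{\alpha(b) - \alpha(a)}{n}$). This is possible since $\alpha$ is a continuous function (so the intermediate value theorem holds for $\alpha$). Since $f$ is non-decreasing, $M_i = f(x_i)$ and $m_i = f(x_{i-1})$. Then

\begin{align*}
U(P_n, f, \alpha) - L(P_n, f, \alpha) &= \sum_{i=1}^{n} (M_i - m_i)\,\Delta\alpha_i \\
&= \Delta\alpha_i \sum_{i=1}^{n} [f(x_i) - f(x_{i-1})] \\
&= \Delta\alpha_i\,(f(b) - f(a)) \quad \text{(this sum telescopes)} \\
&= \frac{[f(b) - f(a)][\alpha(b) - \alpha(a)]}{n}
\end{align*}

Therefore, as $n$ approaches infinity, $U(P_n, f, \alpha) - L(P_n, f, \alpha)$ approaches 0, which completes the proof.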
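The telescoping computation in this proof can be checked numerically. The sketch below makes illustrative assumptions not found in the notes: α(x) = x² on [0, 1] (so the equal-Δα partition is x_i = sqrt(i/n)) and the discontinuous non-decreasing function f(x) = floor(3x). Because f is non-decreasing, its supremum and infimum on each piece are taken at the endpoints.

```python
import math

def equal_alpha_partition(alpha_inverse, alpha_a, alpha_b, n):
    """Points x_i with alpha(x_i) equally spaced between alpha(a) and alpha(b)."""
    return [alpha_inverse(alpha_a + (alpha_b - alpha_a) * i / n) for i in range(n + 1)]

def upper_minus_lower_monotone(f, alpha, points):
    """U - L for non-decreasing f: M_i = f(x_i), m_i = f(x_{i-1})."""
    return sum(
        (f(points[i]) - f(points[i - 1])) * (alpha(points[i]) - alpha(points[i - 1]))
        for i in range(1, len(points))
    )

f = lambda x: math.floor(3 * x)   # non-decreasing with jumps (assumption)
alpha = lambda x: x * x           # continuous, increasing on [0, 1] (assumption)

for n in (10, 100, 1000):
    points = equal_alpha_partition(math.sqrt, alpha(0.0), alpha(1.0), n)
    gap = upper_minus_lower_monotone(f, alpha, points)
    predicted = (f(1.0) - f(0.0)) * (alpha(1.0) - alpha(0.0)) / n
    print(n, gap, predicted)
```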

Remark. In the setting of the above theorem, since $f$ is non-decreasing, we may consider both $\int_a^b f \, d\alpha$ and $\int_a^b \alpha \, df$. In particular:

\[ \int_a^b f \, d\alpha + \int_a^b \alpha \, df = f(b)\alpha(b) - f(a)\alpha(a) \]

This is the integration by parts formula. We may interpret the formula as the claim that $\int_a^b f \, d\alpha$ exists iff $\int_a^b \alpha \, df$ exists, although we need to prove this more formally.
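A discrete version of the formula holds exactly on every partition, which makes the symmetry plausible: a left-tagged sum for ∫ f dα plus a right-tagged sum for ∫ α df telescopes to f(b)α(b) − f(a)α(a). The sketch below checks this; the functions f(x) = x³ and α(x) = eˣ on [0, 2] are illustrative assumptions.

```python
import math

def left_tagged_sum(f, alpha, points):
    """Sum of f(x_{i-1}) * (alpha(x_i) - alpha(x_{i-1}))."""
    return sum(
        f(points[i - 1]) * (alpha(points[i]) - alpha(points[i - 1]))
        for i in range(1, len(points))
    )

def right_tagged_sum(g, beta, points):
    """Sum of g(x_i) * (beta(x_i) - beta(x_{i-1}))."""
    return sum(
        g(points[i]) * (beta(points[i]) - beta(points[i - 1]))
        for i in range(1, len(points))
    )

f = lambda x: x ** 3   # non-decreasing on [0, 2] (assumption)
alpha = math.exp       # continuous and increasing (assumption)
a, b, n = 0.0, 2.0, 1000
points = [a + (b - a) * i / n for i in range(n + 1)]

total = left_tagged_sum(f, alpha, points) + right_tagged_sum(alpha, f, points)
boundary = f(b) * alpha(b) - f(a) * alpha(a)
print(total, boundary)
```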


Figure 9: Illustration of the integration by parts formula and the symmetry between ∫ f dα and ∫ α df. (The axes are marked α(a), α(b) and f(a), f(b); the regions are labelled ∫ f dα and ∫ α df, and the corresponding lower and upper sums are indicated.)

Theorem I.7 (Composition of Functions). Let f ∈ R(α) and f : [a, b] → [c, d]. Let φ be continuous on [c, d]. Then φ(f(x)) ∈ R(α).

Remark. The theorem enlarges the classes of functions which are integrable. For instance, if we know that f ∈ R(α), then we will also know that f² ∈ R(α) and |f| ∈ R(α).

Before completing this proof, recall that f ∈ R(α) if and only if for every ε > 0, there exists a partition P for which U(P, f, α) − L(P, f, α) < ε. We may illustrate this as follows:

Figure 10: The shaded region is U(P, f, α)− L(P, f, α)

That is, the area between U(P, f, α) and L(P, f, α) may be made arbitrarily small. To do so, we make either the width or the height of each rectangle between U(P, f, α) and L(P, f, α) small.

In the case of f continuous, each rectangle has small height and width, since Mi − mi < η may be satisfied on any interval whose length is small enough. Supposing f has discontinuities, we may have some boxes with large height, although those boxes may have small width to keep the difference U − L small. In the bad case, for instance ∫ α dα, we may have rectangles with both large height and large width in U − L for any partition P. Having understood this principle, we can now proceed with our next proof.


Theorem I.8. Suppose f : [a, b] → [c, d] is integrable with respect to α, and φ is continuous on [c, d]. Then φ(f(x)) ∈ R(α).

Proof. Assume that f is integrable and φ is continuous. In terms of ε − δ definitions:

1. Assuming φ is continuous on [c, d] means that it is uniformly continuous. This means:

∀ε > 0, ∃δ s.t. |y1 − y2| < δ → |φ(y1)− φ(y2)| < ε (1)

2. Assuming f is integrable, this means

∀η > 0, ∃P s.t. U(P, f, α)− L(P, f, α) < η (2)

We will need to prove that, for a suitable partition P, U(P, φ(f), α) − L(P, φ(f), α) can be made arbitrarily small.

Let $\eta > 0$. Take the $P$ which satisfies equation (2). On each $[x_{i-1}, x_i]$ of $P$, consider $M_i, m_i$, which are the supremum and infimum of $f$ on that interval. Let $M_i^*, m_i^*$ be the supremum and infimum of $\varphi(f)$ on that interval.

By uniform continuity of $\varphi$, if $M_i - m_i < \delta$ then $M_i^* - m_i^* \le \varepsilon$. We then divide our intervals into two sets $A, B$ defined as follows:

\[ A = \{ i \mid M_i - m_i < \delta \}, \qquad B = \{ i \mid M_i - m_i \ge \delta \} \]

Consider the contribution of $A, B$ to $U - L$ for $\varphi(f)$. In $A$:

\[ \sum_{i \in A} (M_i^* - m_i^*)\,\Delta\alpha_i \le \sum_{i \in A} \varepsilon\,\Delta\alpha_i \le \varepsilon(\alpha(b) - \alpha(a)) \]

In $B$, by boundedness of $\varphi$:

\[ \sum_{i \in B} (M_i^* - m_i^*)\,\Delta\alpha_i \le \sum_{i \in B} 2K\,\Delta\alpha_i = 2K \sum_{i \in B} \Delta\alpha_i \]

where $K$ is a bound on $|\varphi|$. We claim $\sum_{i \in B} \Delta\alpha_i$ is small since

\[ \sum_{i=1}^{n} (M_i - m_i)\,\Delta\alpha_i < \eta \]

by integrability of $f$. We can then derive the following inequalities to bound $\sum_{i \in B} \Delta\alpha_i$. Since we are summing over fewer intervals,

\[ \sum_{i \in B} (M_i - m_i)\,\Delta\alpha_i \le \sum_{i=1}^{n} (M_i - m_i)\,\Delta\alpha_i < \eta \]

and by the lower bound $M_i - m_i \ge \delta$ on $B$,

\[ \delta \sum_{i \in B} \Delta\alpha_i \le \sum_{i \in B} (M_i - m_i)\,\Delta\alpha_i < \eta \]

Thus,

\[ \sum_{i \in B} \Delta\alpha_i < \frac{\eta}{\delta} \]

We can then complete the proof. Given $\varepsilon > 0$, we get some $\delta > 0$ which satisfies (1). Then choose $\eta = \varepsilon \cdot \delta$ to obtain a partition $P$ from (2). Therefore:

\[ U(P, \varphi(f), \alpha) - L(P, \varphi(f), \alpha) < \underbrace{\varepsilon(\alpha(b) - \alpha(a))}_{\text{contribution from } A} + \underbrace{2K \cdot \frac{\eta}{\delta}}_{\text{contribution from } B} = \varepsilon \underbrace{(\alpha(b) - \alpha(a) + 2K)}_{\text{constant}} \]

Accordingly, $U - L$ may be made as small as we please and $\varphi(f)$ is integrable. The key idea to be taken from this proof is the division of the partition into $A, B$ and making $U - L$ small on each set separately.
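The split into the sets A and B can be illustrated on a concrete example. Everything specific in this sketch is an assumption for illustration: α(x) = x on [0, 1], the function f(x) = x with a unit jump at x = 0.5, φ(y) = y², and δ = 0.1. Both f and φ(f) are non-decreasing here, so suprema and infima are taken at the endpoints of each piece.

```python
def split_gap(f, phi, n, delta):
    """Split U - L of phi(f) on the uniform partition of [0, 1] into A and B parts.

    Assumes alpha(x) = x and that f and phi(f) are non-decreasing, so that
    M_i - m_i is the difference of endpoint values on each piece.
    """
    points = [i / n for i in range(n + 1)]
    contrib_a = contrib_b = length_b = gap_f = 0.0
    for i in range(1, n + 1):
        width = points[i] - points[i - 1]                       # Delta alpha_i
        osc_f = f(points[i]) - f(points[i - 1])                 # M_i - m_i
        osc_phi = phi(f(points[i])) - phi(f(points[i - 1]))     # M*_i - m*_i
        gap_f += osc_f * width
        if osc_f < delta:
            contrib_a += osc_phi * width    # small height: index in A
        else:
            contrib_b += osc_phi * width    # large height: index in B
            length_b += width
    return contrib_a, contrib_b, length_b, gap_f

f = lambda x: x + (1.0 if x >= 0.5 else 0.0)   # one jump at 0.5 (assumption)
phi = lambda y: y * y                          # increasing on [0, 2] (assumption)

a_part, b_part, length_b, gap_f = split_gap(f, phi, 1000, 0.1)
print(a_part, b_part, length_b, gap_f / 0.1)
```

In this run only the piece containing the jump lands in B, and its total length stays below (U − L of f)/δ, as in the proof.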

4 Properties of the Integral

We now list some properties of the integral:

Theorem I.9. The following are properties of the Riemann-Stieltjes integral:

1. Assume $f, g \in R(\alpha)$. Then for all $c, d \in \mathbb{R}$, $cf + dg \in R(\alpha)$ and

\[ \int_a^b (cf + dg) \, d\alpha = c \int_a^b f \, d\alpha + d \int_a^b g \, d\alpha \]

In other words, the integral is a linear operator.

2. The integral is also linear in $\alpha$. That is, if $f \in R(\alpha)$ and $f \in R(\beta)$, then $f \in R(c_1\alpha + c_2\beta)$ for $c_1, c_2 \ge 0$ and

\[ \int_a^b f \, d(c_1\alpha + c_2\beta) = c_1 \int_a^b f \, d\alpha + c_2 \int_a^b f \, d\beta \]

The condition that $c_1, c_2 \ge 0$ is needed here to ensure $c_1\alpha, c_2\beta$ are non-decreasing functions.

3. If $f, g$ are integrable and $f(x) \le g(x)$ for all $x$, then

\[ \int_a^b f \, d\alpha \le \int_a^b g \, d\alpha \]

4. $f \in R(\alpha)$ on $[a, b]$ if and only if $f \in R(\alpha)$ on $[a, c]$ and $[c, b]$, where $a \le c \le b$. Additionally:

\[ \int_a^b f \, d\alpha = \int_a^c f \, d\alpha + \int_c^b f \, d\alpha \]


5. If $f \in R(\alpha)$ and $|f(x)| \le M$, then

\[ \left| \int_a^b f \, d\alpha \right| \le M[\alpha(b) - \alpha(a)] \]

We will omit the proofs of most of these properties, but they can be done by considering the difference between the upper and lower sums, as follows:

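Proof of Item 1. Suppose $f, g \in R(\alpha)$ and let $h = f + g$. Then on $[x_{i-1}, x_i]$:

\[ m_i = \inf_{x \in [x_{i-1}, x_i]} h(x) \ge \inf_{x \in [x_{i-1}, x_i]} f(x) + \inf_{x \in [x_{i-1}, x_i]} g(x) \]

\[ M_i = \sup_{x \in [x_{i-1}, x_i]} h(x) \le \sup_{x \in [x_{i-1}, x_i]} f(x) + \sup_{x \in [x_{i-1}, x_i]} g(x) \]

It follows that

\[ L(P, f, \alpha) + L(P, g, \alpha) \le L(P, h, \alpha) \le U(P, h, \alpha) \le U(P, f, \alpha) + U(P, g, \alpha) \]

Since $f, g$ are integrable, given $\varepsilon > 0$ we may choose $P$ (a common refinement of the partitions that work for $f$ and for $g$) such that

\[ (U(P, f, \alpha) - L(P, f, \alpha)) + (U(P, g, \alpha) - L(P, g, \alpha)) < 2\varepsilon \]

It follows that $U(P, h, \alpha) - L(P, h, \alpha) < 2\varepsilon$, so $h \in R(\alpha)$.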

We can get the following corollaries from the above theorem:

Corollary I.1. Assume that f, g ∈ R(α). Then:

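1. $f^2 \in R(\alpha)$

2. $\frac{1}{f} \in R(\alpha)$, provided $f(x) \ge \varepsilon$ for some $\varepsilon > 0$.

3. $fg \in R(\alpha)$.

4. $|f| \in R(\alpha)$, with $\left| \int_a^b f \, d\alpha \right| \le \int_a^b |f| \, d\alpha$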

Proof. Apply the preceding theorem on composition of functions. In 1), choose $\varphi(y) = y^2$. In 2), choose $\varphi(y) = \frac{1}{y}$. For 3), note that $fg = \frac{1}{4}[(f + g)^2 - (f - g)^2]$. Finally, for 4), choose $\varphi(y) = |y|$.

To prove the assertion that $\left| \int_a^b f \, d\alpha \right| \le \int_a^b |f| \, d\alpha$, it suffices to prove bounds on the upper or lower sums. For instance, $|U(P, f, \alpha)| \le U(P, |f|, \alpha)$ can be shown from the triangle inequality. Letting $M_i$ have its usual meaning, then:

\[ \left| \sum M_i\,\Delta\alpha_i \right| \le \sum |M_i\,\Delta\alpha_i| \le \sum \sup |f(x)|\,\Delta\alpha_i \]

We now come to our main theorem for today, which reduces Riemann-Stieltjes integrals to Riemann integrals.


Theorem I.10. Suppose $\alpha'(x)$ exists and $\alpha' \in R$ (i.e. $\alpha'$ is Riemann-integrable). Then $f \in R(\alpha)$ if and only if $f\alpha' \in R$, and

\[ \int_a^b f \, d\alpha = \int_a^b f \alpha' \, dx \]
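A quick numerical comparison of the two sides makes the statement concrete. The sketch below is illustrative only: f(x) = sin x and α(x) = x³ on [0, 2], midpoint tags, and uniform partitions are all assumptions, not choices made in the notes.

```python
import math

def stieltjes_midpoint_sum(f, alpha, a, b, n):
    """Sum of f(midpoint_i) * (alpha(x_i) - alpha(x_{i-1})) on a uniform partition."""
    total = 0.0
    for i in range(1, n + 1):
        left = a + (b - a) * (i - 1) / n
        right = a + (b - a) * i / n
        total += f(0.5 * (left + right)) * (alpha(right) - alpha(left))
    return total

def riemann_midpoint_sum(h, a, b, n):
    """Ordinary midpoint Riemann sum of h on a uniform partition."""
    width = (b - a) / n
    return sum(h(a + (i - 0.5) * width) for i in range(1, n + 1)) * width

f = math.sin                          # illustrative integrand (assumption)
alpha = lambda x: x ** 3              # differentiable, increasing on [0, 2]
alpha_prime = lambda x: 3 * x ** 2    # derivative of alpha

n = 20000
lhs = stieltjes_midpoint_sum(f, alpha, 0.0, 2.0, n)
rhs = riemann_midpoint_sum(lambda x: f(x) * alpha_prime(x), 0.0, 2.0, n)
print(lhs, rhs, abs(lhs - rhs))
```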

Example I.5 (Applications of Theorem). Recall that one of the motivations for defining the Riemann-Stieltjes integral was the Dirac delta function. In this case, α is a “smooth” step function where the area under α′(x) is approximately 1, and α′(x) approaches δ(x). Similarly, suppose β = α′ is the following function:

Figure 11: Illustration of β(x), which equals 1/(2δ) on [1 − δ, 1 + δ] and 0 elsewhere, so that the region under the graph has area 1.

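Then we may interpret $\int_0^2 f \, d\alpha = \int_0^2 f\beta \, dx$ as the average value of $f$ on $[1 - \delta, 1 + \delta]$. To see why, note:

\[ \int_0^2 f\beta \, dx = \int_{1-\delta}^{1+\delta} f(x)\,\frac{1}{2\delta} \, dx = \frac{1}{2\delta} \int_{1-\delta}^{1+\delta} f(x) \, dx \]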

To prove the above theorem, we will use Riemann sums. Recall that a Riemann sum $RS(P, \{t_i\}, f, \alpha)$ is defined as $\sum_{i=1}^{n} f(t_i)\,\Delta\alpha_i$, where $\{t_i\}$ is a tagging of a partition $P$. Note that $L \le RS \le U$, where $L, U$ denote the lower and upper sums respectively, since $L$ uses the infimum and $U$ the supremum of $f$ on each subinterval (so $m_i \le f(t_i) \le M_i$ holds). Furthermore, if $P_k$ is a sequence of partitions where $\lim_{k\to\infty} L(P_k) = \lim_{k\to\infty} U(P_k)$, then both are equal to $\lim_{k\to\infty} RS(P_k)$ for any tagging $\{t_i\}$ of the partitions.

Step 1. We will first assume that both integrals exist and prove that $\int_a^b f \, d\alpha = \int_a^b f\alpha' \, dx$ in this case.

Fix some partition $P$. By the mean value theorem, $\Delta\alpha_i = \alpha(x_i) - \alpha(x_{i-1}) = \alpha'(t_i)\,\Delta x_i$ for some $t_i \in [x_{i-1}, x_i]$. Use these $t_i$ as the tagging in a Riemann sum. In this case:

\[ RS(P, \{t_i\}, f, \alpha) = \sum f(t_i)\,\Delta\alpha_i = \sum f(t_i)\,\alpha'(t_i)\,\Delta x_i = RS(P, \{t_i\}, f\alpha') \]

To show that the integrals are equal, suppose they were not, say $\int_a^b f \, d\alpha < \int_a^b f\alpha' \, dx$ (the other case is symmetric). Then there exist partitions $P_1, P'$ for which $L(P_1, f, \alpha) \le U(P_1, f, \alpha) < L(P', f\alpha') \le U(P', f\alpha')$. Taking the common refinement $P$ yields $L(P, f, \alpha) \le U(P, f, \alpha) < L(P, f\alpha') \le U(P, f\alpha')$. Then $RS(P, \{t_i\}, f, \alpha) \ne RS(P, \{t_i\}, f\alpha')$ for every tagging, in contradiction to the result we just proved!


Before proceeding with the rest of the proof, we will prove a lemma.

Lemma I.4. For every η > 0, there exists a partition P such that |RS(P, {si}, f, α) − RS(P, {si}, fα′)| < η for any choice of {si}. Moreover this is true for any refinement of P.

The lemma means that for any choice of {si}, the Riemann sums RS(P, {si}, f, α) and RS(P, {si}, fα′) calculated using this tagging differ by less than η. Furthermore, the lemma implies the theorem.

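Proof. We will use the fact that $\alpha' \in R$ in the proof of this lemma. Let $B$ be a bound with $|f(x)| < B$. Since $\alpha' \in R$, there exists a partition $P$ such that $U(P, \alpha') - L(P, \alpha') < \varepsilon = \frac{\eta}{B}$.

Then, let $\{s_i\}$ and $\{t_i\}$ be taggings of the partition $P$.

\begin{align*}
\left| \sum \alpha'(s_i)\,\Delta x_i - \sum \alpha'(t_i)\,\Delta x_i \right| &\le \sum |\alpha'(s_i) - \alpha'(t_i)|\,\Delta x_i \\
&\le \sum (M_i - m_i)\,\Delta x_i \quad (M_i, m_i \text{ taken of } \alpha') \\
&< \varepsilon \quad \text{(Riemann integrability of } \alpha')
\end{align*}

Now choose the $t_i$ by the mean value theorem, so that $\Delta\alpha_i = \alpha'(t_i)\,\Delta x_i$. Then:

\[ \sum f(s_i)\,\Delta\alpha_i = \sum f(s_i)\,\alpha'(t_i)\,\Delta x_i \]

Changing the $t_i$ to $s_i$ yields a difference of:

\[ \left| \sum f(s_i)[\alpha'(t_i) - \alpha'(s_i)]\,\Delta x_i \right| \le B \sum |\alpha'(t_i) - \alpha'(s_i)|\,\Delta x_i < B\varepsilon = \eta \]

The same bound holds for any refinement of $P$, since refining can only decrease $U(P, \alpha') - L(P, \alpha')$.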

We shall continue this discussion more next day.

It remains to show that the lemma implies the theorem.

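Proof. Let $\eta > 0$ be given and fix a partition $P$ as in the lemma. Recall that $U(P, f, \alpha) = \sup_{\{s_i\}} RS(P, \{s_i\}, f, \alpha)$ and $U(P, f\alpha') = \sup_{\{s_i\}} RS(P, \{s_i\}, f\alpha')$. Then

\[ |U(P, f, \alpha) - U(P, f\alpha')| \le \eta \]

Otherwise, for $P$, there exist Riemann sums for which

\[ |RS(P, \{s_i\}, f, \alpha) - RS(P, \{s_i\}, f\alpha')| \ge \eta \]

in contradiction to the lemma we proved earlier. The same holds for every refinement of $P$. Then

\[ \overline{\int_a^b} f \, d\alpha = \inf_{P'} U(P', f, \alpha) = \inf_{P^* \text{ refining } P} U(P^*, f, \alpha) \]

It follows that

\[ \left| \overline{\int_a^b} f \, d\alpha - \overline{\int_a^b} f\alpha' \, dx \right| \le \eta \tag{1} \]

and in the same way

\[ \left| \underline{\int_a^b} f \, d\alpha - \underline{\int_a^b} f\alpha' \, dx \right| \le \eta \tag{2} \]

If, for instance, the upper and lower integrals $\overline{\int_a^b} f \, d\alpha$ and $\underline{\int_a^b} f \, d\alpha$ are the same, the fact that (1) and (2) hold means that the difference between the upper integrals $\overline{\int_a^b} f\alpha' \, dx$ and $\overline{\int_a^b} f \, d\alpha$, along with the difference between the lower integrals $\underline{\int_a^b} f \, d\alpha$ and $\underline{\int_a^b} f\alpha' \, dx$, is at most $\eta$. Since the upper and lower integrals of $f$ with respect to $\alpha$ are equal and $\eta$ is arbitrary, it means that $\overline{\int_a^b} f\alpha' \, dx = \underline{\int_a^b} f\alpha' \, dx$ must hold. The converse direction follows in the same way.

Therefore, $f \in R(\alpha)$ if and only if $f\alpha' \in R$, by equality of these upper and lower integrals.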

The above theorem, finally, gives a meaning to dα as α′ dx when α is differentiable and α′ is Riemann integrable.

4.1 Change of Variables

Recall from calculus the change of variables formula: if $x = u(t)$, then

\[ \int_a^b f(x) \, dx = \int_A^B f(u(t)) \underbrace{d(u(t))}_{u'(t)\,dt} \]

where $a = u(A)$ and $b = u(B)$. We may make a similar statement for the Riemann-Stieltjes integral.

Theorem I.11. Let $u : [A, B] \to [a, b]$ be a strictly increasing and onto function, and let $f, \alpha$ have their usual meanings. Then:

\[ \int_a^b f(x) \, d\alpha(x) = \int_A^B f(u(t)) \, d\alpha(u(t)) \]

The fact that $u$ is strictly increasing is needed to ensure that $\alpha(u(t))$ is non-decreasing.

Proof. Let $P = \{x_0, ..., x_n\}$ be a partition of $[a, b]$ and $Q = \{t_0, ..., t_n\}$ be a partition of $[A, B]$ such that $u(t_i) = x_i$. We claim that

\[ U(P, f, \alpha) = U(Q, f(u), \alpha(u)) \]

\[ L(P, f, \alpha) = L(Q, f(u), \alpha(u)) \]

To see why, we write out the upper and lower sums.

\[ U(P, f, \alpha) = \sum M_i\,\Delta\alpha_i \]

and

\[ U(Q, f(u), \alpha(u)) = \sum M_i'\,\Delta(\alpha \circ u)_i \]

Then $M_i = \sup_{x \in [x_{i-1}, x_i]} f(x)$ and $M_i' = \sup_{t \in [t_{i-1}, t_i]} f(u(t)) = \sup_{x \in [x_{i-1}, x_i]} f(x)$ by choice of $t_i$. Furthermore,

\[ \Delta\alpha_i = \alpha(x_i) - \alpha(x_{i-1}) \]

and

\[ \Delta(\alpha \circ u)_i = \alpha(u(t_i)) - \alpha(u(t_{i-1})) = \alpha(x_i) - \alpha(x_{i-1}) \]

again by choice of $t_i$. So the upper and lower sums for the two integrals are the same. Accordingly, the two integrals are equal.


4.2 The Fundamental Theorem of Calculus

Given $f(x)$, we may define $F(x) = \int_a^x f(t) \, dt$, and we may expect $F'(x) = f(x)$. However, this may not work under some conditions. For instance, consider the following step function $f(x)$ and the corresponding function $F(x)$.

Figure 12: Example of f(x) and the corresponding F(x)

We can see that at the point where $f(x)$ changes from 0 to 1, the corresponding part in $F(x)$ is not differentiable. However, $F'(x) = f(x)$ under some conditions:

Theorem I.12. Let $f$ be integrable on $[a, b]$ and let $F(x) = \int_a^x f(t) \, dt$. Then

(i) $F$ is continuous.

(ii) If $f$ is continuous, then $F$ is differentiable and $F' = f$.

Proof. (i) Here we will need to prove that $\lim_{x \to x_0} F(x) = F(x_0)$. For $x \le y$,

\[ |F(x) - F(y)| = \left| \int_x^y f(t) \, dt \right| \le B(y - x) \]

where $B$ is a bound on $|f(t)|$. Thus, $\lim_{y - x \to 0} (F(y) - F(x)) = 0$.

(ii) Here we will need to show that $\lim_{y \to x} \frac{F(y) - F(x)}{y - x} = \lim_{h \to 0} \frac{F(x_0 + h) - F(x_0)}{h} = f(x_0)$.

In the case of $h$ positive, we may write the difference quotient as:

\[ \frac{F(x_0 + h) - F(x_0)}{h} = \frac{1}{h} \int_{x_0}^{x_0 + h} f(t) \, dt \]

Letting $m = \inf f(t)$ and $M = \sup f(t)$ on $[x_0, x_0 + h]$ yields the inequality:

\[ mh \le \int_{x_0}^{x_0 + h} f(t) \, dt \le Mh \]

Therefore, $m \le \frac{F(x_0 + h) - F(x_0)}{h} \le M$. By continuity of $f$, as $h$ approaches 0 both $\sup f(x)$ and $\inf f(x)$ on $[x_0, x_0 + h]$ tend to $f(x_0)$. Therefore: $\lim_{h \to 0} \frac{F(x_0 + h) - F(x_0)}{h} = f(x_0)$. The case of $h$ negative is similar.
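The step-function example from the start of this section can be checked with one-sided difference quotients. In this sketch the interval [0, 2], the jump location x = 1, and the closed form F(x) = max(0, x − 1) are assumptions chosen to match the picture; they are not spelled out in the notes.

```python
def f(t):
    """Step function: 0 before t = 1, 1 from t = 1 onwards (assumption)."""
    return 0.0 if t < 1.0 else 1.0

def F(x):
    """Closed form of the integral of f from 0 to x for this particular f."""
    return max(0.0, x - 1.0)

def difference_quotient(func, x0, h):
    """(func(x0 + h) - func(x0)) / h, for h of either sign."""
    return (func(x0 + h) - func(x0)) / h

h = 1e-6
# At the jump the one-sided quotients disagree, so F is not differentiable there.
print(difference_quotient(F, 1.0, h), difference_quotient(F, 1.0, -h))
# At a point of continuity of f, both quotients approach f(x0).
print(difference_quotient(F, 1.5, h), difference_quotient(F, 1.5, -h), f(1.5))
```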


5 Functions of Bounded Variations

We considered the issue of defining $\int_a^b f \, d\alpha$ where $\alpha$ was a non-decreasing function. How do we now define $\int_a^b f \, dg$ where $g$ may not be monotone?

We may define $\int_a^b f \, dg$ whenever $g$ is of bounded variation.

Definition I.4. The variation of $g$ on $[a, b]$ is

\[ V_a^b(g) = \sup_P \sum_{i=1}^{n} |\Delta g_i| = \sup_P \sum_{i=1}^{n} |g(x_i) - g(x_{i-1})| \]

If we consider $g : [a, b] \to \mathbb{R}$ as a path, then $V_a^b(g)$ is the length of the path. In particular, if $g'(x)$ exists and is integrable, then

\[ V_a^b(g) = \int_a^b |g'(x)| \, dx \]

The set of functions of bounded variation on $[a, b]$ is denoted as $BV[a, b]$, where

\[ BV[a, b] = \{ g \mid V_a^b(g) < \infty \} \]

The following is a function which is not of bounded variation.

Figure 13: A function not of bounded variation (a zigzag graph through the values y0, y1, ...)

We construct the function as follows. Pick points $y_0, y_1, ...$ such that $|y_1 - y_0| = 1$, $|y_2 - y_1| = \frac{1}{2}$, $|y_3 - y_2| = \frac{1}{3}$, and so on. Then the variation of the function is given by the harmonic series $1 + \frac{1}{2} + \frac{1}{3} + \dots$, which diverges.

Functions of bounded variation can be used in defining the integral because of the Jordan decomposition.
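The variation over a fixed partition is easy to compute, and it gives a lower bound for the supremum in the definition. The sketch below uses two illustrative assumptions of ours: g = sin on [0, 2π], whose variation is the integral of |cos x|, namely 4, and the partial sums of the harmonic series, which are the variation sums of the zigzag example over its first N corners.

```python
import math

def variation_over_partition(g, points):
    """Sum of |g(x_i) - g(x_{i-1})| over consecutive partition points."""
    return sum(abs(g(points[i]) - g(points[i - 1])) for i in range(1, len(points)))

def harmonic_partial_sum(count):
    """1 + 1/2 + ... + 1/count, the zigzag's variation over its first corners."""
    return sum(1.0 / k for k in range(1, count + 1))

n = 10000
points = [2 * math.pi * i / n for i in range(n + 1)]
sine_variation = variation_over_partition(math.sin, points)
print(sine_variation)   # close to 4, the integral of |cos x| on [0, 2*pi]

for count in (10, 100, 1000, 10000):
    print(count, harmonic_partial_sum(count))   # grows without bound
```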


Theorem I.13 (Jordan Decomposition). A function $g$ is of bounded variation if and only if $g = \alpha - \beta$ for some non-decreasing $\alpha, \beta$.

Proof. ($\to$) Assigned on Homework 3. ($\leftarrow$) Let $g = \alpha - \beta$ for some non-decreasing $\alpha, \beta$. Then

\[ V_a^b(\alpha - \beta) \le V_a^b(\alpha) + V_a^b(\beta) \]

Since $\alpha, \beta$ are non-decreasing, then

\[ V_a^b(\alpha) + V_a^b(\beta) = [\alpha(b) - \alpha(a)] + [\beta(b) - \beta(a)] \]

which is finite.

Remark. The decomposition into $\alpha, \beta$ need not be unique. For instance:

\[ g(x) = 0 = \alpha - \alpha = \beta - \beta \]

for any non-decreasing $\alpha, \beta$.

Definition I.5 (Integral with respect to g). If $f$ is continuous and $g \in BV$ (i.e. $g$ is of bounded variation), then $g$ may be decomposed as $g = \alpha - \beta$. Therefore, we define:

\[ \int_a^b f \, dg = \int_a^b f \, d\alpha - \int_a^b f \, d\beta \]

where the integrals on the right hand side are the Riemann-Stieltjes integrals we have already defined.

Remark. It’s enough to assume that f ∈ R(α) and f ∈ R(β) for the integral to exist, but f continuous ensures that the integral always exists.

Furthermore, the integral is always the same regardless of the choice of α, β, wherever itexists. This result is a problem on the next problem set.

In the case where $g$ is not of bounded variation, we may also be able to decompose $g$ into $g = \alpha - \beta$. Although $|\alpha - \beta|$ may be finite for every $x$ in the interval, the issue here is that $\alpha, \beta$ themselves may not be bounded and furthermore, the integral $\int_a^b f \, d\alpha - \int_a^b f \, d\beta$ may result in an $\infty - \infty$ answer, which is not defined.

This notion of integration, of integrating with respect to functions of bounded variation, is applied in proving a result in functional analysis: the Riesz Representation Theorem.

5.1 The Riesz Representation Theorem

This result was first stated in 1910. Before stating the result, we will need to first make some definitions.

Fix an interval [a, b]. Then let C[a, b] denote the set of continuous functions on [a, b].

Let C∗ denote its dual vector space over R. The dual of a vector space is the vector space consisting of linear maps (maps respecting addition and scalar multiplication) from the original space to R. In this case:

C∗ = {T : C → R | T is a linear functional}

For instance, the dual space of the vector space of polynomials R[x] is the set of power series R[[x]]. In this instance, the vector space of polynomials has countable dimension since a basis is given by the monomials {1, x, x², x³, ...}. However, a linear functional may take arbitrary values on all infinitely many elements of this basis, thereby making the space of power series R[[x]], with an uncountable basis, the dual space of the vector space of polynomials.

Likewise, $C[a, b]$ is itself a very large set with an uncountable basis. However, the dual space $C^*$ is not well-behaved due to its extremely large size! We then take the subset $C^*_B \subset C^*$ of bounded functionals. To make precise what bounded means, we will need to define a norm on $C[a, b]$. For $f \in C[a, b]$, its norm is:

\[ \|f\|_\infty = \sup_{x \in [a, b]} |f(x)| \]

Since any continuous function on $[a, b]$ achieves its maximum value, $\|f\|_\infty$ returns the maximum value of $|f|$ on $[a, b]$ if $f \in C[a, b]$. We may check that this norm satisfies the properties needed of a norm, such as the triangle inequality and the fact that $\|f\| = 0$ iff $f(x) = 0$ for all $x$. This norm is part of a family of norms called $L^p$ norms and is known as the infinity norm.

Then a linear functional $T : C \to \mathbb{R}$ will be bounded if there exists $M \in \mathbb{R}$ such that for every function $f \in C[a, b]$:

\[ |T(f)| \le M \cdot \|f\|_\infty \]

We further note that the set of bounded functionals $C^*_B$ is a vector space. Before proceeding further, we will examine some elements of $C^*_B$:

Example I.6 (Examples of Elements of C∗B).

1. The Evaluation Map:

Let $ev_{x_0} : C \to \mathbb{R}$ be the map $f \mapsto f(x_0)$. That is, we take a function $f \in C[a, b]$ and return its value at $x_0 \in [a, b]$. It is a linear map since evaluation of $f + g$ and $cf$ at $x_0$ returns $f(x_0) + g(x_0)$ and $cf(x_0)$ respectively. Furthermore, it is bounded since:

\[ |ev_{x_0}(f)| = |f(x_0)| \le \|f\|_\infty \]

as $|f(x_0)|$ is necessarily less than or equal to the maximum of $|f|$ on $[a, b]$, which is $\|f\|_\infty$. Taking $M = 1$ completes the proof that $ev_{x_0}$ is a bounded linear functional.

2. Integration:

(a) Fix some non-decreasing $\alpha$ and define a map $C[a, b] \to \mathbb{R}$ as $f \mapsto \int_a^b f \, d\alpha$. It is linear by properties of the integral. Furthermore, it is bounded since:

\[ \left| \int_a^b f \, d\alpha \right| \le \|f\|_\infty \, |\alpha(b) - \alpha(a)| \]

since the integral is no bigger than the maximum of $|f|$ multiplied by the $\alpha$-length of the interval on which we want to integrate. Taking $M = |\alpha(b) - \alpha(a)|$ completes the proof that integration is a bounded linear functional.

(b) Furthermore, fixing some $g$ of bounded variation, $f \mapsto \int_a^b f \, dg$ defines a map from $C \to \mathbb{R}$. This is a bounded linear functional, taking $M = V_a^b(g)$. We may check this as an exercise.

3. Differentiation: Define C¹[a, b] as the set of functions which have continuous derivatives. Let a map C¹[a, b] → R be defined as f ↦ f′(x0), where x0 is a point in [a, b]. This is not a bounded linear functional since |f′(x0)| may be arbitrarily big in relation to ||f||∞, such as in the case of a function which gets arbitrarily steep close to x0.
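The unboundedness of differentiation can be made concrete with a family of functions whose sup norm stays at most 1 while the derivative at x0 grows. The family f_n(x) = sin(n(x − x0)) on [0, 1] with x0 = 0.5 is our illustrative assumption; the sup norm is only estimated on a grid, which can underestimate it but never exceed 1.

```python
import math

def sup_norm_on_grid(func, a, b, samples=1001):
    """Grid estimate of the sup norm of func on [a, b]."""
    return max(abs(func(a + (b - a) * i / (samples - 1))) for i in range(samples))

def derivative_ratio(n, x0=0.5):
    """|f_n'(x0)| / ||f_n||_inf for f_n(x) = sin(n (x - x0)) on [0, 1].

    The derivative of f_n at x0 is n * cos(0) = n, computed exactly here.
    """
    f_n = lambda x: math.sin(n * (x - x0))
    return float(n) / sup_norm_on_grid(f_n, 0.0, 1.0)

for n in (1, 10, 100, 1000):
    print(n, derivative_ratio(n))
```

No single constant M can bound all of these ratios, which is exactly the failure of boundedness.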


We now come to the statement of the Riesz Representation Theorem.

Theorem I.14 (Riesz). All bounded linear functionals come from integration. More precisely, every $T \in C^*_B$, that is every bounded linear functional, is defined by some $g \in BV$ such that

\[ T : f \mapsto \int_a^b f \, dg \]

Remark.

1. For instance, the evaluation map at $x_0$ may be defined by $\int_a^b f \, d\alpha$ where $\alpha$ is a step function which jumps by 1 at $x_0$. We have previously encountered this example.

2. We may restate the Riesz representation theorem in terms of the map $BV \to C^*_B$ which sends $g$ to the functional $f \mapsto \int_a^b f \, dg$. Every function of bounded variation defines a bounded linear functional in this way, and the theorem says that this map is surjective.

However, the map is not injective since:

• The functions $g$ and $g + c$ where $c$ is a constant define the same integral. We fix this problem by only considering the functions normalized so that $g(a) = 0$.

• Consider two step functions $\alpha$ and $\beta$ which both jump at $x_0$ but take different values at $x_0$ itself, with $\alpha$ continuous from the right. Then $\int f \, d\alpha = \int f \, d\beta = f(x_0)$ where $x_0$ is the jump point. We fix this problem by only considering functions which are continuous from the right, so $\beta$ will not be included in our set.

By imposing the above two restrictions on $BV$, we get a subspace of functions $\overline{BV} \subset BV$. The map $\overline{BV} \to C^*_B$ is then an isomorphism between the two spaces. We may illustrate this graphically as follows.

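Figure 14: Illustration of Riesz Representation Theorem. (The map $BV \to C^*_B$ is surjective; its restriction to the subset $\overline{BV} = \{ g \in BV \mid g(a) = 0 \text{ and } g \text{ continuous from the right} \}$ is also injective, hence an isomorphism onto $C^*_B$.)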
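The claim that a step integrator reproduces the evaluation map can be checked with Riemann-Stieltjes sums: only the piece of the partition containing the jump contributes. The sketch assumes a right-continuous unit step at x0 = 0.3 on [0, 1], the test function cos, uniform partitions, and midpoint tags; none of these choices come from the notes.

```python
import math

def unit_step(x0):
    """Right-continuous step: 0 for x < x0 and 1 for x >= x0."""
    return lambda x: 1.0 if x >= x0 else 0.0

def stieltjes_midpoint_sum(f, alpha, a, b, n):
    """Sum of f(midpoint_i) * (alpha(x_i) - alpha(x_{i-1})) on a uniform partition."""
    total = 0.0
    for i in range(1, n + 1):
        left = a + (b - a) * (i - 1) / n
        right = a + (b - a) * i / n
        total += f(0.5 * (left + right)) * (alpha(right) - alpha(left))
    return total

x0 = 0.3
alpha = unit_step(x0)
for n in (10, 100, 1000):
    approx = stieltjes_midpoint_sum(math.cos, alpha, 0.0, 1.0, n)
    print(n, approx, math.cos(x0))
```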

5.2 The Length of a Curve

Recall from last day the definition for the variation of a function on $[a, b]$. This is:

\[ V_a^b(g) = \sup_P \sum_{i=1}^{n} |g(x_i) - g(x_{i-1})| \]

The variation satisfies properties such as

\[ V_a^b(g) = V_a^c(g) + V_c^b(g) \]

for $a < c < b$. We may interpret the variation of a function as the length of the curve the function traces out.

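Definition I.6. A curve is a (continuous) function $\gamma : [a, b] \to \mathbb{R}^n$, or a map

\[ x \mapsto (\gamma_1(x), \gamma_2(x), ..., \gamma_n(x)) \]

The length of $\gamma$ is

\[ \Lambda(\gamma) = \sup_{\text{Partitions of } [a, b]} \sum_i \|\gamma(x_i) - \gamma(x_{i-1})\| \]

Call a curve rectifiable if $\Lambda(\gamma)$ is finite.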

We may think of calculating the length of a curve as approximating the curve by many little line segments, and adding up the length of each line segment. Furthermore, in the case where $\gamma : [a, b] \to \mathbb{R}$, then $\Lambda(\gamma) = V_a^b(\gamma)$.

Note that $\Lambda(P, \gamma)$, the length of the curve calculated using a partition $P$, underestimates the length of $\gamma$ by construction ($\Lambda(P, \gamma) \le \sup_P \Lambda(P, \gamma) = \Lambda(\gamma)$). As such, we may think of it as a lower sum $L(P, f)$ and indeed, the two quantities share some similar properties.

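Lemma I.5. If $P^*$ is a refinement of $P$, then $\Lambda(P^*, \gamma) \ge \Lambda(P, \gamma)$.

Proof. It suffices to prove this for a partition $P^* = P \cup \{y\}$ with $x_{i-1} < y < x_i$.

Figure 15: Illustration of the proof for a plane curve (the points $\gamma(x_{i-1})$, $\gamma(y)$, $\gamma(x_i)$ on the curve)

Note that

\[ \Lambda(P, \gamma) = ... + \|\gamma(x_i) - \gamma(x_{i-1})\| + ... \]

and

\[ \Lambda(P^*, \gamma) = ... + \|\gamma(x_{i-1}) - \gamma(y)\| + \|\gamma(x_i) - \gamma(y)\| + ... \]

An application of the triangle inequality completes the proof.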
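The monotonicity under refinement can be seen numerically. As an illustration (the curve is our assumption, not an example from the notes), take the quarter circle γ(t) = (cos t, sin t) on [0, π/2], whose length is π/2; doubling the number of pieces of a uniform partition refines it, so the polygonal lengths increase while staying below π/2.

```python
import math

def gamma(t):
    """Quarter of the unit circle, an illustrative rectifiable curve."""
    return (math.cos(t), math.sin(t))

def polygonal_length(curve, points):
    """Sum of distances between consecutive curve points over the partition."""
    return sum(
        math.dist(curve(points[i - 1]), curve(points[i]))
        for i in range(1, len(points))
    )

def uniform_partition(a, b, n):
    return [a + (b - a) * i / n for i in range(n + 1)]

lengths = []
for n in (4, 8, 16, 32):   # each partition refines the previous one
    lengths.append(polygonal_length(gamma, uniform_partition(0.0, math.pi / 2, n)))
    print(n, lengths[-1])
print(math.pi / 2)
```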

Theorem I.15. Suppose $a < c < b$. Then if $\gamma$ is rectifiable on $[a, b]$, then

\[ \Lambda_a^b(\gamma) = \Lambda_a^c(\gamma) + \Lambda_c^b(\gamma) \]

Proof. By definition:

\[ \Lambda_a^b(\gamma) = \sup_{P \text{ partition of } [a, b]} \Lambda(P, \gamma) \]

We claim that

\[ \sup_{P \text{ partition of } [a, b]} \Lambda(P, \gamma) = \sup_{P^* = P \cup \{c\}} \Lambda(P^*, \gamma) \]

The direction

\[ \sup_{P \text{ partition of } [a, b]} \Lambda(P, \gamma) \ge \sup_{P^* = P \cup \{c\}} \Lambda(P^*, \gamma) \]

follows from the fact that the partitions containing $c$ are a subset of all partitions. The direction

\[ \sup_{P \text{ partition of } [a, b]} \Lambda(P, \gamma) \le \sup_{P^* = P \cup \{c\}} \Lambda(P^*, \gamma) \]

comes from noting that for every partition $P$ of $[a, b]$, there exists a refinement $P^*$ of $P$ containing $c$ such that $\Lambda(P, \gamma) \le \Lambda(P^*, \gamma)$.

Then, the set of partitions $P^*$ for which $c \in P^*$ is in bijection with the set $\{(P_1, P_2)\}$ where $P_1$ is a partition of $[a, c]$ and $P_2$ is a partition of $[c, b]$.

Thus, $\Lambda(P^*, \gamma) = \Lambda(P_1, \gamma) + \Lambda(P_2, \gamma)$. Taking the supremum over $P^*$ on the left side, and over $(P_1, P_2)$ on the right side yields the equality

\[ \Lambda_a^b(\gamma) = \Lambda_a^c(\gamma) + \Lambda_c^b(\gamma) \]

Note that the above argument also works to show equality of integrals when an interval $[a, b]$ is divided into intervals $[a, c]$ and $[c, b]$.

Example I.7 (Non-Rectifiable Curves). We would like an example of a non-rectifiable continuous (hence bounded) curve. Taking

\[ \gamma(x) = \begin{bmatrix} x^a \cos \frac{1}{x} \\ x^a \sin \frac{1}{x} \end{bmatrix} \]

where $a$ is an appropriate exponent (for instance $a = 1$) and $x \in [0, 1]$ should work. Note that we define $\gamma(0) = (0, 0)$. In particular, we may calculate the length of the curve as $\Lambda(\gamma) = \int_a^b \underbrace{\|\gamma'(t)\|}_{\text{the speed}} \, dt$ whenever $\gamma$ is continuously differentiable.

Next, the Koch snowflake is a non-rectifiable curve which begins as a map from [0, 3] to a triangle, and where we draw a new triangle on the middle third of each side upon each iteration. This is not rectifiable since the length of the snowflake increases by a factor of 4/3 upon each iteration, but it is continuous.

Finally, space-filling curves are maps [0, 1] → [0, 1] × [0, 1] which are not rectifiable because they fill the unit square.
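The growth of the snowflake's length can be tabulated exactly. This sketch assumes the starting triangle has side length 1 (so perimeter 3, matching the parameter interval [0, 3]); the function name `koch_length` is ours.

```python
from fractions import Fraction

def koch_length(iterations, initial_perimeter=Fraction(3)):
    """Perimeter after the given number of iterations.

    Each iteration replaces every segment by 4 segments of one third the
    length, multiplying the total length by 4/3.
    """
    return initial_perimeter * Fraction(4, 3) ** iterations

for k in (0, 1, 2, 5, 10, 50):
    print(k, float(koch_length(k)))
```

Since (4/3)^k is unbounded, the polygonal approximations have unbounded length and the limiting curve is not rectifiable.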

5.3 Functional Analysis Revisited

We now make some remarks on the field of functional analysis.

• Functional analysis studies spaces of functions, such as C[a, b], the continuous functions on [a, b], or BV[a, b], the functions of bounded variation on [a, b].

• By introducing a norm on the space of functions, we may define a metric between two functions and introduce a topology induced by the metric. An example of a norm we saw last day was the supremum norm ||f||∞ on C[a, b], which returns the maximum value of |f| on [a, b].


• Functional analysis then studies the bounded linear maps V → W, where V, W are two spaces. A bounded linear map will be a continuous map in this case.

• In the example of the Riesz representation theorem, the dual space of $C[a, b]$, which consists of all bounded linear maps $L : C[a, b] \to \mathbb{R}$, was described in terms of $BV[a, b]$, the functions of bounded variation on $[a, b]$. This is because all bounded linear functionals on $C[a, b]$ can be represented by integration with respect to a function of bounded variation. This induces a map $BV[a, b] \to C[a, b]^*$ sending $g$ to the functional $f \mapsto \int_a^b f \, dg$, where $f$ is any continuous function. Furthermore, $NBV[a, b] = \overline{BV}$, as defined last day, is isomorphic to $C[a, b]^*$.

• There exist different norms on a space. For instance, we may define $\|g\| = V_a^b(g)$ instead of using the supremum norm as previously. Furthermore, operators have norms: we may define the norm of an operator $L$ as

\[ \|L\| = \sup_{f \ne 0} \frac{\|Lf\|}{\|f\|} \]

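• We may claim that the isomorphism $\overline{BV} \simeq C[a, b]^*$ as stated in the Riesz representation theorem is one which preserves norms. More precisely, if we let $L$ denote the operator $f \mapsto \int_a^b f \, dg$ on $C[a, b]$, then the norm of $L$ is

\[ \|L\| = \sup_{f \ne 0} \frac{|Lf|}{\|f\|_\infty} = V_a^b(g) \]

We may see this in the case of $\alpha$ monotonic by the fact that:

\[ \left| \int_a^b f \, d\alpha \right| \le \underbrace{\|f\|_\infty}_{\text{the maximum}} \underbrace{|\alpha(b) - \alpha(a)|}_{V_a^b(\alpha)} \]

Thus,

\[ \frac{\left| \int_a^b f \, d\alpha \right|}{\|f\|_\infty} \le |\alpha(b) - \alpha(a)| \]

It follows by taking $f$ as a constant function that:

\[ \sup_{f \ne 0} \frac{\left| \int_a^b f \, d\alpha \right|}{\|f\|_\infty} = |\alpha(b) - \alpha(a)| \]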


Part II

Sequences and Series of Functions

Today we will begin the main topic of the course: sequences and series of functions. This study culminates in the Stone-Weierstrass theorem.

6 Sequences and Series of Functions: Definitions and Issues

Definition II.1. A sequence of functions $f_1, f_2, ...$ is denoted as $\{f_i\}_{i=1}^{\infty}$, where the $f_i(x)$ are all defined on some domain $E$.

Similarly, a series of functions is denoted as $\sum_{i=1}^{\infty} f_i$.

We may compare sequences of functions with sequences of numbers by considering a space of functions $V$. If $V$ is a function space, then each “point” in the space is a function, and a sequence of points may converge to some function under a particular metric. We formally define convergence here:

Definition II.2 (Convergence of Sequences). A sequence $\{f_i(x)\}$ converges to $f(x)$ if
\[ \forall x \in E, \quad \lim_{i \to \infty} f_i(x) = f(x). \]
In other words, for each $x \in E$ the sequence of numbers $f_i(x)$ converges to $f(x)$. We write $\lim_{i \to \infty} f_i = f$ to denote convergence of the sequence of functions.

In terms of the epsilon-$N$ definition, $\lim_{i \to \infty} f_i = f$ if
\[ \forall x \in E,\ \forall \varepsilon > 0,\ \exists N = N(x, \varepsilon) \text{ s.t. } |f_i(x) - f(x)| < \varepsilon \text{ for } i \ge N. \]

Definition II.3 (Convergence of Series). A series $\sum_{i=1}^{\infty} f_i$ converges to $f$ if the sequence of partial sums $\{s_n = \sum_{i=1}^{n} f_i\}$ converges to $f$. Write $\sum_{i=1}^{\infty} f_i = f$.

Note that in both of the above definitions, we allow different $x$ to take different $N$ such that $|f_i(x) - f(x)| < \varepsilon$ for $i \ge N$.

Example II.1. The Taylor series presents an example of a series of functions. We know that:
\[ e^x = 1 + x + \frac{x^2}{2} + \cdots \]
This is a series consisting of the functions $1, x, \frac{x^2}{2}, \ldots$ and is convergent on $\mathbb{R}$.

Next, consider the functions $f_n(x) = x^n$ on $[0,1]$. Note that
\[ \lim_{n \to \infty} f_n(x) = f(x) = \begin{cases} 1 & x = 1 \\ 0 & \text{otherwise} \end{cases} \]

The sequence of functions may be illustrated as follows:

[Figure 16: Illustration of the Sequence of Functions. The plot shows the graphs of $x$, $x^2$, $x^3$, $x^4$.]

The above example illustrates the main issue we are dealing with when considering a sequence of functions. Each $f_n(x) = x^n$ is continuous and differentiable everywhere, but the limit function is not continuous (and hence not differentiable) at $x = 1$. Thus, is taking the limit compatible with the properties of the functions in a sequence? More precisely, we may ask the following questions about a sequence of functions $\{f_n\}$ and $f = \lim_{n \to \infty} f_n$:

1. If each $f_n$ is continuous, is $f$ also continuous? Again, the example above shows that this can fail.

2. If each $f_n$ is differentiable, is $f$ also differentiable? Furthermore, if $f$ is differentiable, does $f' = \lim_{n \to \infty} f_n'$?

3. If each $f_n$ is integrable, is $f$ also integrable? Furthermore, if $f$ is integrable, is $\int_a^b f\,d\alpha = \lim_{n \to \infty} \int_a^b f_n\,d\alpha$?

We may extend these questions to ask whether any property which each function in $\{f_n\}$ possesses carries over to the limit $f = \lim_{n \to \infty} f_n$. However, the answer to each of the above questions is no in general, an instance of Murphy’s law in mathematics. However, if the sequence of functions $f_n$ is uniformly convergent, then properties of the $f_n$ are generally preserved under the limit. For instance, if each $f_n$ is continuous and $f_n$ converges uniformly, then $f$ will also be continuous. In essence, uniform convergence ensures that $N$ is chosen depending on $\varepsilon$ only and not on the $x$ at which the function is evaluated.

Before defining uniform convergence, we will examine a number of sequences of functions which give us negative results for the questions above.


Example II.2. 1. In Rudin Example 7.4, we consider the functions $f_m(x) = \lim_{n \to \infty} [\cos(m!\pi x)]^{2n}$, each of which is integrable, but the limit $f(x) = \lim_{m \to \infty} \lim_{n \to \infty} [\cos(m!\pi x)]^{2n}$ is not. We can see from the example that the crux of the issue is the interchange of two limits (which generally cannot be done).

For instance, $f$ continuous at $x_0$ means that $\lim_{x \to x_0} f(x) = f(x_0)$. If we have a sequence of functions $\{f_n(x)\}$, each continuous at $x_0$, and want to ask if the limit function is continuous at $x_0$, then we must verify that
\[ \lim_{x \to x_0} \Big( \lim_{n \to \infty} f_n(x) \Big) = \lim_{n \to \infty} \underbrace{\lim_{x \to x_0} f_n(x)}_{f_n(x_0)} \]
holds.

2. Let $h(x,y) = \frac{y}{x+y}$. Then
\[ \lim_{y \to 0^+} \lim_{x \to 0^+} h(x,y) = \lim_{y \to 0^+} \frac{y}{y} = 1, \]
\[ \lim_{x \to 0^+} \lim_{y \to 0^+} h(x,y) = \lim_{x \to 0^+} 0 = 0. \]
Evidently the interchange of limits fails in this case. We may also see this by noting that $h$ depends only on the slope $y/x$, so it is constant on each line passing through the origin. The limit then depends on the line along which the origin is approached.

3. We’ve already considered the case of $f_n(x) = x^n$ on $[0,1]$, wherein each $f_n$ is continuous and differentiable but $f(x)$ is neither continuous nor differentiable at $x = 1$.

4. This example shows that each $f_n(x)$ may be integrable on $[0,1]$ while $f(x)$ is not. Let $\mathbb{Q} \cap [0,1] = \{q_1, q_2, q_3, \ldots\}$. We may write the set this way since $\mathbb{Q}$ is countable. Then let:
\[ f_n(x) = \begin{cases} 1 & x \in \{q_1, \ldots, q_n\} \\ 0 & \text{otherwise} \end{cases} \]
Each $f_n$ is integrable since it has a finite number of discontinuities. But the limit function is
\[ f(x) = \begin{cases} 1 & x \in \mathbb{Q} \\ 0 & \text{otherwise} \end{cases} \]
which we’ve already shown not to be integrable.

Even when the limit function is also differentiable or integrable, the equalities $f' = \lim_{n \to \infty} f_n'$ and $\lim_{n \to \infty} \int f_n\,d\alpha = \int f\,d\alpha$ may not hold, as we will see in the next two examples.

5. Let fn be defined on [0, 1] as in the picture below:

[Figure 17: $f_n$ are a sequence of functions which form a “travelling wave”. Each $f_n$ is a triangle of height $2n$ over $[0, \frac{1}{n}]$; the graphs of $f_1$, $f_2$, $f_3$ are shown.]

Then since $f_n(0) = 0$ for all $n$, $f(0) = 0$ in the limit. For $x > 0$ and sufficiently large $n$ ($n \ge \frac{1}{x}$), $f_n(x) = 0$. Hence $f(x) = 0$ in the limit. We then have $f(x) = \lim_{n \to \infty} f_n(x) = 0$, and each $f_n$, as well as $f$, is integrable on $[0,1]$.

However, $\int_0^1 f_n\,dx = \frac{1}{2}\left(\frac{1}{n}\right)(2n) = 1$ by the area of a triangle formula, and $\int_0^1 f\,dx = 0$.

Thus, we have in this case an example of a sequence of functions where
\[ \int_0^1 \lim_{n \to \infty} f_n(x)\,dx \neq \lim_{n \to \infty} \int_0^1 f_n(x)\,dx \]
although all functions in question were integrable.

We may easily extend the above example to $\mathbb{R}$ by considering the function here:

[Figure: a wave of height 2 over the interval $[n, n+1]$.]

For $n$ sufficiently large, the wave moves past each $x \in \mathbb{R}$, hence the limit function is again 0 in this case, although each $f_n$ has a non-zero area under the curve.

6. Finally we give a case where $f_n, f$ are differentiable but $\lim f_n' \neq f'$. Consider $f_n(x) = \frac{x^n}{n}$ on $[0,1]$, for which $\lim_{n \to \infty} f_n = 0$ uniformly. This is because $|f_n(x)| \le \frac{1}{n} < \varepsilon$ for all $x \in [0,1]$ once $n$ is sufficiently large. However, $f_n'(x) = x^{n-1}$, which converges to
\[ \lim_{n \to \infty} f_n'(x) = \begin{cases} 0 & x \neq 1 \\ 1 & x = 1 \end{cases} \]
as in our previous example, whereas the limit function $f = 0$ has $f' = 0$ everywhere. We may extend this example to have functions for which the sequence of second, third, and additional derivatives do not converge to the derivatives of the limit function.

[Figure 18: Illustration of the Sequence of Functions. The plot shows the graphs of $x$, $\frac{x^2}{2}$, $\frac{x^3}{3}$, $\frac{x^4}{4}$.]
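The travelling wave of item 5 can be checked numerically. The following is a minimal sketch, not part of the original notes: it assumes the triangle of height $2n$ over $[0, \frac{1}{n}]$ is symmetric with its peak at $\frac{1}{2n}$, and the names `travelling_wave`, `midpoint_integral` and `wave_integral` are hypothetical helpers introduced only for illustration.

```python
def travelling_wave(n, x):
    """Triangle of height 2n over [0, 1/n]; the symmetric peak at 1/(2n) is an assumption."""
    width = 1.0 / n
    if x <= 0.0 or x >= width:
        return 0.0
    # 1 - |2x/width - 1| is a unit-height tent on [0, width]; scale it to height 2n.
    return 2.0 * n * (1.0 - abs(2.0 * x / width - 1.0))


def midpoint_integral(f, a, b, steps=20_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))


def wave_integral(n):
    """Approximate integral of the n-th wave over [0, 1]."""
    return midpoint_integral(lambda x: travelling_wave(n, x), 0.0, 1.0)


if __name__ == "__main__":
    for n in (1, 2, 5, 10):
        # The integral stays near 1 for every n, while the value at a fixed
        # point such as x = 0.3 is eventually 0.
        print(n, wave_integral(n), travelling_wave(n, 0.3))
```

The integrals stay near 1 for every $n$, while the value at any fixed $x > 0$ is 0 once $n \ge \frac{1}{x}$, which is the failure of interchanging limit and integral described in item 5.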

We will define uniform convergence in the next class.

7 Uniform Convergence

7.1 Uniform Convergence of Sequences

Before defining uniform convergence, we fix notation: we will write $f_n \to f$ if a sequence of functions $\{f_n\}$ converges pointwise to $f$, and $f_n \Rightarrow f$ if a sequence of functions converges uniformly to $f$.

Recall that $f_n \to f$ on a domain $E$ if:
\[ \forall x \in E,\ \forall \varepsilon > 0,\ \exists N = N(x, \varepsilon) \text{ s.t. } |f_n(x) - f(x)| < \varepsilon \text{ for } n \ge N \]
In uniform convergence, $N$ does not depend on which $x$ we choose. Given some $\varepsilon$, we may choose one $N$ which works for all $x$.

Definition II.4 (Uniform Convergence). $f_n$ converges uniformly to $f$ on $E$ if
\[ \forall \varepsilon > 0,\ \exists N = N(\varepsilon) \text{ s.t. } |f_n(x) - f(x)| < \varepsilon \text{ for all } n \ge N \text{ and } x \in E \]

We may illustrate uniform convergence as follows:

[Figure 19: Illustration of uniform convergence. On $[a,b]$, the graph of $f_n(x)$ lies within a band extending $\varepsilon$ above and below the graph of $f(x)$.]

If $f_n \Rightarrow f$, we may draw a tube of width $\varepsilon$ about the graph of $f$. Then the graph of $f_n$ lies within the tube for all sufficiently large $n$.

Example II.3. Reconsider our example of a sequence: $f_n(x) = x^n$ defined on $[0,1]$, which converges to
\[ f(x) = \begin{cases} 0 & x < 1 \\ 1 & x = 1 \end{cases} \]
The claim is that $f_n$ does not converge uniformly to $f$ although $f_n \to f$.

We may see this informally by considering the graph of $f$. The graph of $f_n$ should lie entirely within a region of width $\varepsilon$ about the graph of $f$ if $f_n \Rightarrow f$, although this is not the case as each $f_n$ is continuous and accordingly cannot “jump” near $x = 1$.

[Figure 20: $f_n$ does not lie within an $\varepsilon$ neighbourhood of the limit]

We can reformulate the definition of uniform convergence as follows:

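Theorem II.1. Assume $f_n \to f$ on $E$ and let $M_n = \sup_{x \in E} |f_n(x) - f(x)|$. Then $f_n$ converges uniformly to $f$ if and only if $M_n \to 0$ as $n \to \infty$.

Proof. If $f_n$ converges uniformly, then for every $\varepsilon > 0$ there is $N$ such that $|f_n(x) - f(x)| < \varepsilon$ for all $n \ge N$ and all $x \in E$, so $M_n \le \varepsilon$ for all $n \ge N$ and $M_n \to 0$. Conversely, if $M_n \to 0$, then for every $\varepsilon > 0$ there is $N$ with $M_n < \varepsilon$ for $n \ge N$, hence $|f_n(x) - f(x)| \le M_n < \varepsilon$ for all $x \in E$, which is the definition of uniform convergence.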


Example II.4. We may apply the above theorem to the sequence $f_n(x) = x^n$ on $[0,1]$ again.

Here $M_n = \sup_{x \in [0,1)} |f_n(x) - 0| = \sup_{x \in [0,1)} |x^n| = 1$. Since $\lim_{n \to \infty} M_n \neq 0$, $f_n$ does not converge uniformly to $f$.

Next, reconsider last day’s example of functions which form a travelling wave. Let $f_n$ be a wave of height 1 and width $\frac{1}{n}$. Recall that $f_n \to 0$. Since $M_n = \sup_{x \in [0, \frac{1}{n}]} |f_n - f| = \max_{x \in [0, \frac{1}{n}]} |f_n(x)| = 1 \not\to 0$, the sequence $f_n$ does not converge uniformly to $f$.

Finally, reconsider the example where $f_n(x) = \frac{x^n}{n}$. This converges uniformly to 0 since $M_n = \frac{1}{n} \to 0$ holds.
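The quantities $M_n$ in the example above can be probed numerically. This is only an illustrative sketch with hypothetical helper names: since the supremum of $x^n$ over $[0,1)$ is not attained, the code exhibits a witness point $x_n$ with $x_n^n = \frac{1}{2}$ (so $M_n \ge \frac{1}{2}$ for every $n$) rather than trying to compute the supremum on a fixed grid, which would misleadingly shrink as $n$ grows.

```python
def witness_point(n):
    """A point x_n in [0, 1) chosen so that x_n**n equals 1/2 (up to rounding)."""
    return 0.5 ** (1.0 / n)


def sup_on_grid(f, points):
    """Largest value of |f| over finitely many sample points (a lower bound for the true sup)."""
    return max(abs(f(x)) for x in points)


# Sample points in [0, 1], including the endpoint x = 1.
grid = [i / 1000 for i in range(1001)]

if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        x = witness_point(n)
        # For f_n(x) = x**n the gap |f_n(x) - 0| at the witness stays at 1/2.
        print("x^n:", n, x, x ** n)
        # For f_n(x) = x**n / n the largest sampled value is at x = 1.
        print("x^n/n:", n, sup_on_grid(lambda t: t ** n / n, grid))
```

The witness points drift towards 1 as $n$ grows, which matches the picture that the obstruction to uniform convergence of $x^n$ sits near $x = 1$.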

We may state the Cauchy criterion for the uniform convergence of sequences, just as we may state the Cauchy criterion for the convergence of sequences of real numbers.

Theorem II.2 (Cauchy Criterion for Uniform Convergence). A sequence of functions $\{f_n\}$ converges uniformly if and only if for all $\varepsilon > 0$, there exists $N$ such that $|f_n(x) - f_m(x)| < \varepsilon$ for all $x$ and for all $m, n \ge N$.

Remark that this means for fixed $x$, the sequence $\{f_n(x)\}$ is a Cauchy sequence and the $N$ chosen does not depend on $x$ but only on $\varepsilon$.

Proof. ($\Rightarrow$). Assume $f_n$ converges to $f$ uniformly. Then
\[ |f_n(x) - f_m(x)| \le |f_n(x) - f(x)| + |f(x) - f_m(x)| \]
by the triangle inequality. Given $\varepsilon$, uniform convergence gives $N$ such that $|f_n(x) - f(x)| < \frac{\varepsilon}{2}$ for all $n \ge N$ and all $x$. Hence $|f_n(x) - f_m(x)| < \varepsilon$ for all $m, n \ge N$ and all $x$.

($\Leftarrow$). Here, for each $x$, $\{f_n(x)\}$ is a Cauchy sequence of real numbers and so converges to some limit $f(x)$; hence we have pointwise convergence to a function $f$. We must now show uniform convergence. We know that for all $\varepsilon$, there exists $N$ such that $|f_n(x) - f_m(x)| < \varepsilon$ for all $m, n \ge N$ and all $x$. Fix $n \ge N$ and let $m \to \infty$. Then since $f_m(x) \to f(x)$ and $|f_m(x) - f_n(x)| < \varepsilon$ for all $m \ge N$, it follows that $|f_n(x) - f(x)| \le \varepsilon$ in the limit, for all $x$ and all $n \ge N$. Hence, we have uniform convergence.

7.2 Uniform Convergence of Series

Definition II.5 (Uniform Convergence of Series). Consider a series of functions $\sum_{n=0}^{\infty} f_n(x) = f(x)$. Let $s_n(x) = \sum_{k=0}^{n} f_k(x)$ be the $n$th partial sum. Then the series converges uniformly if $s_n \Rightarrow f$, that is, the sequence of partial sums converges uniformly.

Theorem II.3 (Characterizations of Uniform Convergence). The following are consequences of the definition of uniform convergence and previous theorems:

1 $\sum f_n \to f$ uniformly if and only if $\lim_{n \to \infty} \sup_x |s_n(x) - f(x)| = \lim_{n \to \infty} \sup_x \left| \sum_{k=n+1}^{\infty} f_k(x) \right| = 0$.

2 (Cauchy Criterion) The series converges uniformly if and only if for every $\varepsilon > 0$, there exists $N$ such that $\left| \sum_{k=n+1}^{m} f_k(x) \right| = |s_m(x) - s_n(x)| < \varepsilon$ for all $m > n \ge N$ and for all $x$.

3 (Weierstrass M-Test) If $|f_n(x)| \le M_n$ for all $x$ and $\sum M_n$ converges, then $\sum f_n(x)$ converges uniformly.


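Proof. We will prove the Weierstrass M-Test. Suppose $\sum M_n$ converges. Then for all $\varepsilon > 0$ there exists $N$ such that $\sum_{k=n+1}^{m} M_k < \varepsilon$ for all $m > n \ge N$, by the Cauchy criterion for convergence of numeric series. By hypothesis, for all $x$:
\[ \left| \sum_{k=n+1}^{m} f_k(x) \right| \le \sum_{k=n+1}^{m} |f_k(x)| \le \sum_{k=n+1}^{m} M_k < \varepsilon \]
which implies uniform convergence of $\sum f_k(x)$ by the Cauchy criterion for series.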
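As an illustration of the M-Test, here is a small numerical sketch. The series $\sum_{k \ge 1} \frac{\sin(kx)}{k^2}$ with $M_k = \frac{1}{k^2}$ is an example chosen for this sketch and does not appear in the notes; the function names are hypothetical. The code compares the largest sampled gap between two partial sums with the corresponding block of the majorant series.

```python
import math


def term(k, x):
    """k-th term of the illustrative series: sin(kx) / k^2, bounded by M_k = 1 / k^2."""
    return math.sin(k * x) / k ** 2


def partial_sum(n, x):
    """Partial sum s_n(x) of the first n terms."""
    return sum(term(k, x) for k in range(1, n + 1))


def majorant_block(n, m):
    """Sum of M_k for k = n+1, ..., m."""
    return sum(1.0 / k ** 2 for k in range(n + 1, m + 1))


def worst_gap(n, m, points):
    """Largest sampled value of |s_m(x) - s_n(x)|."""
    return max(abs(partial_sum(m, x) - partial_sum(n, x)) for x in points)


# Sample points covering a little more than one period on each side of 0.
points = [i * 0.01 for i in range(-700, 701)]

if __name__ == "__main__":
    n, m = 50, 200
    # The sampled gap never exceeds the majorant block, independently of x.
    print(worst_gap(n, m, points), majorant_block(n, m))
```

The bound on $|s_m(x) - s_n(x)|$ comes from the numbers $M_k$ alone, so it holds for every $x$ at once; this is exactly the uniformity that the Cauchy criterion requires.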

Beginning of class announcement: Exam 1 will be held next Wednesday, and will cover up to material done this week. The problems will be mainly based on homework questions.

7.3 Interpretation of Uniform Convergence

Recall that $f_n \to f$ uniformly on $E$ if for all $\varepsilon > 0$, there exists $N$ such that $|f_n(x) - f(x)| < \varepsilon$ for all $n \ge N$ and for all $x \in E$. As we noticed last day, this is equivalent to $\sup_{x \in E} |f_n(x) - f(x)|$ tending to zero as $n \to \infty$. We may use $\sup |f_n(x) - f(x)|$ as a measure of the distance between $f_n$ and $f$.

More precisely, let $B(E)$ be the space of bounded functions on $E$. We may define a norm on $B(E)$: if $f \in B(E)$, then define $\|f\| = \sup_{x \in E} |f(x)|$, known as the infinity norm. We may define the distance between $f, g \in B(E)$ as $\|f - g\|$, which makes $B(E)$ a metric space.

A sequence of functions $f_n$ converges uniformly to $f$ if and only if $f_n \to f$ in $B(E)$ with respect to this norm. We may think of each $f$ as a point in $B(E)$, and uniform convergence is the convergence of those points with respect to the metric on $B(E)$. More precisely, a sequence of points $\{f_n\}$ converges to $f$ if:
\[ \forall \varepsilon > 0,\ \exists N \text{ s.t. } \|f_n - f\| < \varepsilon \ \ \forall n \ge N \]
Likewise, if $F(E)$ is the space of all functions on $E$, we may define $\|f\|$ as in $B(E)$, although $\|f\| \in \mathbb{R} \cup \{\infty\}$ in this case. Furthermore, even when $\|f\|, \|g\|$ are infinite, it may be the case that $\|f - g\|$ is finite.

Restating the theorems we proved last day in the language above yields the following:

1 $f_n \to f$ uniformly if and only if $\lim_{n \to \infty} \sup_x |f_n(x) - f(x)| = \lim_{n \to \infty} \|f_n - f\| = 0$.

2 (The Cauchy Criterion): $\{f_n\}$ converges uniformly if and only if for all $\varepsilon > 0$, there is $N$ such that for all $m, n \ge N$, $|f_n(x) - f_m(x)| < \varepsilon$ for all $x$, or equivalently $\|f_n - f_m\| \le \varepsilon$ for all $n, m \ge N$.

Here, if a sequence of functions $\{f_n\}$ converges uniformly, then $\{f_n\}$ is a Cauchy sequence in $(B(E), \|\cdot\|)$. Recall that a sequence $\{a_n\}$ is Cauchy in an arbitrary metric space if for every $\varepsilon$, there exists $N$ for which $d(a_n, a_m) < \varepsilon$ for all $n, m \ge N$. A metric space is complete if every Cauchy sequence converges; hence $B(E)$ is a complete metric space, since by the Cauchy criterion a Cauchy sequence in $B(E)$ is the same as a sequence of functions $\{f_n\}$ converging uniformly to some $f$ in $B(E)$.

3 (The Weierstrass M-Test): If a series $\sum_{n=0}^{\infty} f_n(x)$ satisfies $|f_n(x)| \le M_n$ for all $x$, where $\sum_{n=0}^{\infty} M_n$ converges, then $\sum f_n(x)$ converges uniformly.

We may rewrite this by replacing $M_n$ with the supremum $\|f_n\|$, which is the least such upper bound, and summing over those suprema. Thus, if $\sum \|f_n\|$ converges, then $\sum f_n$ converges uniformly in $B(E)$. This is analogous to the statement that absolute convergence implies convergence when dealing with series of real numbers.


The analogies we made above when considering functions as points in a function space cannot be made when we solely consider pointwise, instead of uniform, convergence, since a meaningful measure of $d(f_n, f)$ cannot be made in this case. However, this analogy is useful when we try to extend properties of sequences of points in $\mathbb{R}$ to sequences in $B(E)$.

8 Properties of Uniform Convergence

8.1 Uniform Convergence and Continuity

8.1.1 The Main Result

We now prove an important result in uniform convergence.

Theorem II.4. If fn → f uniformly and each fn is continuous, then f is also continuous.

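Proof. We need to show that the limit function $f$ is continuous at $x_0$ for all $x_0 \in E$. That is,
\[ \forall \varepsilon > 0\ \exists \delta > 0 \text{ s.t. } |f(x) - f(x_0)| < \varepsilon \text{ if } |x - x_0| < \delta \]
Let $\varepsilon$ be given. The following is a schematic of the bound we use for the distance $|f(x) - f(x_0)|$:

[Figure 21: Schematic of the Proof. The graphs of $f$ and $f_n$, with the points $f(x)$, $f(x_0)$, $f_n(x)$, $f_n(x_0)$ marked.]

By the triangle inequality:
\[ |f(x) - f(x_0)| \le |f(x) - f_n(x)| + |f_n(x) - f_n(x_0)| + |f_n(x_0) - f(x_0)| \]
(This is also illustrated in the figure above.) Since $f_n \to f$ uniformly, there exists $n$ such that $|f(x) - f_n(x)| < \frac{\varepsilon}{3}$ for all $x$, and in particular $|f_n(x_0) - f(x_0)| < \frac{\varepsilon}{3}$. Furthermore, by continuity of this $f_n$ at $x_0$, there exists $\delta$ such that $|f_n(x) - f_n(x_0)| < \frac{\varepsilon}{3}$ if $|x - x_0| < \delta$. Hence, we have produced $\delta$ such that $|f(x) - f(x_0)| < 3\left(\frac{\varepsilon}{3}\right) = \varepsilon$ whenever $|x - x_0| < \delta$.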


The above proof illustrates something common we will do when dealing with sequences of functions. We will prove something about the limit function $f$ by jumping to an $f_n$ which is close to it. The above proof does not work if we only have pointwise convergence, since $|f(x) - f_n(x)|$ cannot necessarily be made small for all $x$ in the domain simultaneously.

The above theorem also has an interpretation in the language of function spaces. Let $C(E)$ be the continuous and bounded functions on $E$. Note that $C(E) \subset B(E)$ holds. If $\{f_n\}$ is a sequence in $C(E)$ converging to a point $f$ in $B(E)$ (i.e. uniformly), then the limit $f$ must lie in $C(E)$. Hence all limit points of $C(E)$ in $B(E)$ lie in $C(E)$, so $C(E)$ is a closed subspace of $B(E)$, and therefore a complete space in its own right. This is an interesting feature within the infinite-dimensional normed vector space $B(E)$.

More precisely, if $V \subset W$ is a linear subspace of a normed vector space $W$, then $V$ may fail to be closed in $W$ if $\dim W = \infty$. Consider the class $C^1(E) \subset B(E)$, where $C^1(E)$ consists of the continuously differentiable functions. There exist $f_n$, each of which is differentiable, converging uniformly to $f$, although $f$ is not differentiable. Geometrically, this represents a sequence of points in a plane whose limit does not lie in the plane itself! This is perhaps a counter-intuitive feature of infinite-dimensional vector spaces.

We’ve established that if $f_n \to f$ uniformly and each $f_n$ is continuous, then the limit function $f$ is continuous. That is, a uniform limit of continuous functions is continuous. Today we consider the converse problem. That is, if $f_n \to f$ pointwise and each $f_n$ and $f$ are continuous, does $f_n \to f$ uniformly? In other words, if we have a continuous limit function when each $f_n$ is continuous, does the sequence converge uniformly?

Dini’s theorem answers the above problem.

8.1.2 Dini’s Theorem

Theorem II.5 (Dini). Let $f_n \to f$ pointwise on a compact set $K$. Assume that the sequence of functions is monotone. That is, $f_n(x) \ge f_{n+1}(x)$ for all $x, n$, or $f_n(x) \le f_{n+1}(x)$ for all $x, n$. If $f_n$ and $f$ are continuous, then $f_n \to f$ uniformly.

We now give some examples to show why the two assumptions are needed in the above statement:

1 Compactness is needed. Consider the sequence of functions $f_n(x)$ defined on $\mathbb{R}$ according to the figure below:

[Figure: graph of $f_n$, with the points $n$ and $n+1$ marked on the horizontal axis.]

These functions are continuous on $\mathbb{R}$, a non-compact set, and monotonically decreasing since $f_{n+1}(x) \le f_n(x)$ for all $n$. Furthermore, $f_n(x) \to 0$ pointwise. But $\sup_{x \in \mathbb{R}} |f_n(x) - 0| = 1$ for all $n$, hence the convergence is not uniform.

2 Monotonicity is needed. Consider the functions on $[0,1]$ defined as follows, as triangular waves:

[Figure: a triangular wave of height 1 and width $\frac{1}{n}$.]

The above functions, on the compact set $[0,1]$, are not monotone since there exist $x$ for which $f_{n+1}(x) < f_n(x)$ and $x$ for which $f_{n+1}(x) > f_n(x)$. The functions converge pointwise to 0, but $f_n \not\to f$ uniformly since again $\sup_{x \in [0,1]} |f_n(x) - f(x)| = 1$.

Note too that a sequence of discontinuous functions may converge uniformly to a continuous function. Consider $g(x)$ defined as:

[Figure: graph of $g$.]

and define $f_n(x) = \frac{g(x)}{n}$. Each $f_n(x)$ is discontinuous, but $f_n(x) \Rightarrow 0$ since $\sup_x |f_n(x)| = \frac{1}{n} \sup_x |g(x)| \to 0$ as $n \to \infty$ (here $g$ is bounded). Hence Dini’s theorem specifies a sufficient but not necessary condition for uniform convergence, given that the limit function is continuous. We will now prove the theorem.

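Proof. Let $f_n \to f$ pointwise. Relabel $f_n$ as the differences $f_n - f$. Hence $f_n \to 0$ pointwise and, without loss of generality, we may take the sequence $\{f_n\}$ to be monotonically decreasing (otherwise replace $f_n$ by $-f_n$), so that $f_n \ge 0$.

To prove uniform convergence, we must show that for every $\varepsilon > 0$, there exists $N$ for which $|f_n(x)| = f_n(x) < \varepsilon$ for all $x \in K$ and all $n \ge N$. We claim that locally:
\[ \forall y \in K,\ \exists \delta_y > 0,\ n_y \text{ s.t. } f_{n_y}(x) < \varepsilon \ \ \forall x \in K \text{ s.t. } |x - y| < \delta_y \]
and use compactness to get the global result given in the theorem.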

Proof of Theorem Assuming Claim. Since the claim holds, the open intervals $(y - \delta_y, y + \delta_y)$ form a covering of $K$ (each $y \in K$ is the centre of an interval). Hence by compactness, there exists a finite subcover of those open intervals, meaning that there are $y_1, y_2, \ldots, y_m$ such that $K = I_{y_1} \cup I_{y_2} \cup \cdots \cup I_{y_m}$, where $I_y = \{x \in K : |x - y| < \delta_y\}$.

Take $N = \max\{n_{y_1}, n_{y_2}, \ldots, n_{y_m}\}$. Any $x \in K$ lies in some $I_{y_i}$, so $f_{n_{y_i}}(x) < \varepsilon$; since the sequence is decreasing, $f_n(x) \le f_{n_{y_i}}(x) < \varepsilon$ for all $n \ge N$. Hence $f_n(x) < \varepsilon$ for all $x \in K$ and all $n \ge N$.

We now return to proving the previous claim to complete the proof of the theorem.


Proof of Claim. Fix some $y \in K$. We need to construct $\delta_y, n_y$. By pointwise convergence at $y$, $f_n(y) \to 0$. Hence we may choose $n_y$ sufficiently large that $|f_{n_y}(y)| < \frac{\varepsilon}{2}$. By continuity of $f_{n_y}$ at $y$, there exists $\delta_y$ such that
\[ |f_{n_y}(x) - f_{n_y}(y)| < \frac{\varepsilon}{2} \quad \forall\, |x - y| < \delta_y \]
Hence, by the triangle inequality
\[ |f_{n_y}(x)| \le |f_{n_y}(x) - f_{n_y}(y)| + |f_{n_y}(y)| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon \]
for such $\delta_y, n_y$.

[Figure 22: Illustration of Proof of Claim. Given $\varepsilon$, there are $n, \delta$ such that $|f_n(x)| < \varepsilon$ in a $\delta$ neighbourhood of $y$.]

We can state a condition for the uniform convergence of series by applying the above theorem.

Corollary II.1. If a series $\sum_{n=0}^{\infty} f_n(x)$ converges to $f$ pointwise on a compact set $K$, each $f_n$ and $f$ are continuous, and each $f_n(x)$ is non-negative on $K$, then the series converges to $f$ uniformly.

Proof. Since each $f_n \ge 0$, the sequence of partial sums $s_n = \sum_{k=0}^{n} f_k$ is monotone increasing. Each $s_n$ and $f$ are continuous on the compact set $K$ and $s_n \to f$ pointwise, hence $s_n \Rightarrow f$ by Dini’s Theorem.

We may further apply the theorems we have just proved in constructing some strange continuous functions. If we know that $f_n \to f$ uniformly and each $f_n$ is continuous, then $f$ is continuous. But there may be some strange behaviours that $f$ may take.

8.1.3 Strange Functions

1. The Weierstrass Function (1872): This is defined as
\[ f_{a,b}(x) = \sum_{k=0}^{\infty} a^k \cos(b^k \pi x) \]
where $a, b$ are constants satisfying $0 < a < 1$ and $ab > 1$. The sum $f_{a,b}$ is continuous as the uniform limit of partial sums of smooth functions, but is nowhere differentiable and nowhere monotone! In fact it is a fractal function which may be infinitely rough.

2. Rudin presents an example of a continuous but nowhere differentiable function.

3. The Takagi Function (1901): Define $f_0$ as follows:

[Figure: graph of $f_0$ on $[-2, 2]$, with maximum value $\frac{1}{2}$.]

Next, define $f_k(x) = \frac{1}{2^k} f_0(2^k x)$. Thus the following are images of $f_1, f_2$:

[Figure 23: First few iterations of the Takagi Function. Graphs of $f_1$ (maximum value $\frac{1}{4}$) and $f_2$ (maximum value $\frac{1}{8}$) on $[-2, 2]$.]

Define the Takagi function as $f(x) = \sum_{k=0}^{\infty} f_k(x)$. Note that $f(x)$ is continuous since it is the uniform limit of a series of continuous functions. This follows by the Weierstrass M-Test, since $|f_n(x)| \le \frac{1}{2^n} = M_n$, hence $\sum M_n < \infty$ implies that the series $f$ is uniformly convergent. However, we also need to check that this is nowhere differentiable.

4. The Devil’s Staircase, or Cantor’s Staircase: Consider the following iterative process which constructs a function based on the intervals removed during the construction of the Cantor set:

[Figure 24: First few iterations of construction. Each panel shows a step function on $[0,1]$ with values in $[0,1]$; successive panels add steps at heights $\frac{1}{2}$, then $\frac{1}{4}$, then $\frac{1}{8}$.]

The Cantor staircase is the limit of this construction, hence we get a “staircase” of infinitely many steps in the limit. What is surprising is that $f(x)$ is continuous in the limit and $f$ is constant almost everywhere (on $[0,1] \setminus C$), but still rises from 0 to 1.

To see that the Cantor staircase is continuous, we may also construct it as a sequence of piecewise linear functions as follows:

[Figure 25: Alternate Construction of Cantor Staircase]

Before moving to the next topic, an additional example of a continuous but nowhere differentiable function is the Koch snowflake, a fractal defined as $\gamma : [0,1] \to \mathbb{R}^2$ where $\gamma = (\gamma_1(t), \gamma_2(t))$. Each of $\gamma_1, \gamma_2$ is continuous but nowhere differentiable.
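The M-Test bound for the Takagi function from item 3 can be checked with exact rational arithmetic. This is a sketch under a stated assumption: $f_0(x)$ is taken to be the distance from $x$ to the nearest integer, which matches the maximum value $\frac{1}{2}$ shown for $f_0$ but is not spelled out in the notes. The names `f0`, `fk` and `takagi_partial` are hypothetical.

```python
from fractions import Fraction


def f0(x):
    """Distance from x to the nearest integer (assumed shape of f_0)."""
    return abs(x - round(x))


def fk(k, x):
    """f_k(x) = f_0(2^k x) / 2^k, as in the definition above."""
    return f0(2 ** k * x) / 2 ** k


def takagi_partial(n, x):
    """Partial sum f_0(x) + ... + f_n(x)."""
    return sum(fk(k, x) for k in range(n + 1))


if __name__ == "__main__":
    x = Fraction(1, 3)
    for n in (0, 1, 5, 10):
        # With Fraction inputs every value is an exact rational number.
        print(n, takagi_partial(n, x))
```

At $x = \frac{1}{3}$ every $2^k x$ is at distance $\frac{1}{3}$ from the nearest integer, so the partial sums are $\frac{1}{3} \sum_{k=0}^{n} 2^{-k}$ and increase to $\frac{2}{3}$. The gap between two partial sums is controlled by the tail of $\sum M_k$ regardless of $x$, which is the uniform convergence used in item 3.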

8.2 Uniform Convergence and Integration

Theorem II.6. Let $\alpha$ be non-decreasing, $f_n$ be defined on $[a,b]$, $f_n \to f$ uniformly and each $f_n \in R(\alpha)$. Then $f \in R(\alpha)$ and
\[ \int_a^b f\,d\alpha = \lim_{n \to \infty} \int_a^b f_n\,d\alpha. \]
We’ve already seen an example where the limit of the integrals is not equal to the integral of the limit. This is the “triangle wave” example introduced in the first class when we discussed uniform convergence, wherein
\[ \lim_{n \to \infty} \int_0^1 f_n\,dx = 1 \neq \int_0^1 f\,dx = 0. \]
The convergence there is not uniform, so the theorem does not apply.

Proof. We need to prove that for all $\varepsilon > 0$, there exists a partition $P$ for which
\[ U(P, f, \alpha) - L(P, f, \alpha) < \varepsilon. \]
Bound this difference, using the triangle inequality, by:
\[ |U(P, f, \alpha) - L(P, f, \alpha)| \le |U(P, f, \alpha) - U(P, f_n, \alpha)| + |U(P, f_n, \alpha) - L(P, f_n, \alpha)| + |L(P, f_n, \alpha) - L(P, f, \alpha)| \]
Firstly, choose $n$ such that $|f(x) - f_n(x)| < \frac{\varepsilon}{3(\alpha(b) - \alpha(a))}$ for all $x$, which exists by uniform convergence. For every $P$:
\begin{align*}
|U(P, f, \alpha) - U(P, f_n, \alpha)| &= \Big| \sum_i (M_i - M_i^*) \Delta\alpha_i \Big| \\
&\le \sum_i |M_i - M_i^*| \Delta\alpha_i \\
&\le \frac{\varepsilon}{3(\alpha(b) - \alpha(a))} \sum_i \Delta\alpha_i = \frac{\varepsilon}{3}
\end{align*}
where $M_i = \sup f(x)$ and $M_i^* = \sup f_n(x)$ over the $i$th subinterval of $P$.

A similar argument shows that $|L(P, f, \alpha) - L(P, f_n, \alpha)| \le \frac{\varepsilon}{3}$ for all partitions and such $n$. Now, choose $P$ such that $|U(P, f_n, \alpha) - L(P, f_n, \alpha)| < \frac{\varepsilon}{3}$, which is possible since $f_n$ is integrable.

Hence, $U(P, f, \alpha) - L(P, f, \alpha) < 3\left(\frac{\varepsilon}{3}\right) = \varepsilon$ and $f \in R(\alpha)$.

To prove that the integral of the limit is equal to the limit of the integrals, we may prove, for instance, that for all $\eta > 0$ there exists $N$ such that for all $n \ge N$,
\[ \left| \int_a^b f\,d\alpha - \int_a^b f_n\,d\alpha \right| < \eta \]
by appealing to upper and lower sums. Hence in the limit, the integrals are equal.

8.2.1 Application to Function Spaces

Recall that the set $B[a,b]$ denotes all the bounded functions on $[a,b]$, and there is a norm $\|f\| = \sup_x |f(x)|$ on this space. The distance between two functions $f, g$ in $B[a,b]$ is $\|f - g\|$, and uniform convergence is the same as convergence with respect to this norm.

What the above theorem implies is that the subspace $R(\alpha) \subset B[a,b]$ is closed in $B[a,b]$: if $f_n$ is a sequence of functions in $R(\alpha)$ for which $f_n \to f$ in $B[a,b]$, then $f \in R(\alpha)$. Hence the space $R(\alpha)$ contains all of its limit points. We have thus far constructed two closed subspaces of $B[a,b]$, by noting that $C[a,b] \subset R(\alpha) \subset B[a,b]$, wherein each of $C[a,b]$ and $R(\alpha)$ is closed.

Theorem II.7. Define a functional $I : R(\alpha) \to \mathbb{R}$ by $f \mapsto \int_a^b f\,d\alpha$. This is a continuous functional.

Proof. The proof comes via the sequential characterization of continuity. If $f_n \to f$ in $B[a,b]$, with each $f_n \in R(\alpha)$, then $I(f_n) \to I(f)$ also holds by the previous theorem about uniform convergence and integration.

We may also define some additional norms in addition to the norm $\|f\|$ we had on $B[a,b]$, which is also known as the $L^\infty$ norm. We may define the class of $L^p$ norms as follows:

Definition II.6 ($L^p$ norms). $\|f\|_p = \left[ \int_a^b |f|^p\,dx \right]^{\frac{1}{p}}$ for $p \ge 1$.

For instance, some common norms include:

• The $L^2$ norm was explored on a previous problem set and comes from the inner product. Here:
\[ \|f\|_2 = \sqrt{\int_a^b |f|^2\,dx}. \]

• The $L^1$ norm is
\[ \|f\|_1 = \int_a^b |f|\,dx. \]

Given some $L^p$ norm, the $L^p$ distance between functions $f, g$ may be defined as $\|f - g\|_p$. We may illustrate the $L^\infty$ and $L^1$ distances as follows:

[Figure 26: Illustration of the $L^\infty$ and $L^1$ distances between functions. In particular, the $L^\infty$ distance $\sup |f(x) - g(x)|$ is the maximum pointwise distance between the two functions, and the $L^1$ distance $\int_a^b |f(x) - g(x)|\,dx$ is the area between the two curves.]

Now we have three modes of convergence: uniform convergence ($L^\infty$ convergence), pointwise convergence, and $L^p$ convergence. We may illustrate the relationships between the modes of convergence as follows:

[Figure 27: Illustration between Modes of Convergence. Uniform convergence implies both $L^p$ convergence for all $p$ and pointwise convergence.]

Hence, uniform convergence implies both pointwise and $L^p$ convergence for all $p$. But $L^p$ convergence implies neither pointwise nor uniform convergence, and pointwise convergence implies neither $L^p$ nor uniform convergence.

Theorem II.8. Uniform convergence implies $L^p$ convergence for all $p$.

Proof. We need to show that $\|f_n - f\|_p \to 0$ as $n \to \infty$, provided $f_n \to f$ uniformly. By uniform convergence, given $\varepsilon > 0$ there exists $N$ such that $|f_n(x) - f(x)| < \varepsilon$ for all $n \ge N$ and all $x$. Hence, by definition of $L^p$ distance, for $n \ge N$:
\begin{align*}
\|f_n - f\|_p &= \left( \int_a^b |f_n(x) - f(x)|^p\,dx \right)^{\frac{1}{p}} \\
&\le \left( \int_a^b \varepsilon^p\,dx \right)^{\frac{1}{p}} \quad \text{(by uniform convergence)} \\
&= \sqrt[p]{\varepsilon^p (b - a)} \\
&= \varepsilon \sqrt[p]{b - a}
\end{align*}
Hence, the distance $\|f_n - f\|_p$ may be made arbitrarily small given uniform convergence.

We also want to show that the converse does not hold, by giving an example where $f_n$ converges in $L^p$ but not uniformly. Here, we refer back to the triangle wave example (height 1, width $\frac{1}{n}$), where $f_n \to 0$ pointwise but not uniformly. In addition, $f_n \to 0$ in $L^1$ since:
\[ \|f_n - 0\|_1 = \int_0^1 |f_n|\,dx = \frac{1}{2} \cdot \frac{1}{n} \cdot 1 \to 0 \]
but again $f_n$ does not converge uniformly. Geometrically, we can interpret this as the area underneath the curve tending to zero, while the maximum of each function does not tend to zero.
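The contrast between the $L^1$ and $L^\infty$ distances for the triangle wave can be seen numerically. The sketch below assumes a symmetric triangle of height 1 over $[0, \frac{1}{n}]$ (the notes fix only the height and the width), and the helper names are hypothetical.

```python
def wave(n, x):
    """Triangle of height 1 over [0, 1/n]; the symmetric shape is an assumption."""
    width = 1.0 / n
    if x <= 0.0 or x >= width:
        return 0.0
    return 1.0 - abs(2.0 * x / width - 1.0)


def l1_norm(f, a, b, steps=20_000):
    """Midpoint-rule approximation of the integral of |f| over [a, b]."""
    h = (b - a) / steps
    return h * sum(abs(f(a + (i + 0.5) * h)) for i in range(steps))


def sup_norm_on_grid(f, a, b, steps=20_000):
    """Largest sampled value of |f| on a uniform grid (a lower bound for the true sup)."""
    h = (b - a) / steps
    return max(abs(f(a + i * h)) for i in range(steps + 1))


def wave_norms(n):
    """Return (approximate L1 norm, sampled sup norm) of the n-th wave on [0, 1]."""
    f = lambda x: wave(n, x)
    return l1_norm(f, 0.0, 1.0), sup_norm_on_grid(f, 0.0, 1.0)


if __name__ == "__main__":
    for n in (1, 2, 5, 10):
        # The first entry shrinks like 1/(2n); the second stays near 1.
        print(n, wave_norms(n))
```

The first component tracks the area $\frac{1}{2n}$ and tends to 0, while the second stays near 1, so the sequence converges to 0 in $L^1$ but not in the supremum norm.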

8.3 Uniform Convergence and Differentiation

If a sequence of functions $f_n$ converges to $f$, wherein each $f_n$ is differentiable, is it the case that $f$ is also differentiable, and that $f_n' \to f'$?

We’ve already seen examples where the above two statements are false:

• The Weierstrass function, defined as $\sum_k a^k \cos(b^k \pi x)$, is a sum of smooth functions. However, the function is nowhere differentiable in the limit, although the partial sums converge to it uniformly.

• In the case where $f_n(x) = \frac{x^n}{n}$ on $[0,1]$, this sequence of functions converges uniformly to 0. However, at $x = 1$ the sequence of derivatives $f_n'(x) = x^{n-1}$ does not converge to the derivative of 0, which is 0.

Hence, the condition that a sequence of differentiable functions $\{f_n\}$ converges to $f$ uniformly is not enough to ensure that $f$ itself is differentiable, or that $f_n' \to f'$. We need extra conditions like the following:

Theorem II.9. Assume $f_n$ is differentiable on $[a,b]$, $f_n \to f$ pointwise, and $f_n' \to g$ uniformly. Then $g = f'$.

Indeed, under these hypotheses $f_n \to f$ uniformly as well. It is, however, enough to replace $f_n \to f$ (convergence of the sequence of functions) with $f_n(x_0) \to L$ (convergence of a sequence of real numbers, at one point of the domain). If we attempt to recover $f$ from $f'$ by integration, $f$ is recovered only up to a constant, and hence the value $L$ needs to be fixed to pin down the limit function. We may restate the theorem as follows, with this weaker condition:

Theorem II.10. Assume $f_n$ is differentiable on $[a,b]$, $f_n(x_0) \to L$ for some $x_0 \in [a,b]$, and $f_n' \to g$ uniformly. Then:

1 $f_n$ converges uniformly to some function $f$.

2 $f' = g$. That is, $\lim_{n \to \infty} f_n' = (\lim_{n \to \infty} f_n)'$.


Proof. In the case where the $f_n'$ are continuous, we may apply the Fundamental Theorem of Calculus to recover $f$. We will prove the above theorem in this special case.

Let $f_n'$ be continuous and $f_n' \to g$ uniformly. Hence $g$ itself is continuous. Write

$$f_n(x) = \int_{x_0}^{x} f_n'(t)\,dt + f_n(x_0)$$

and define

$$f(x) = \int_{x_0}^{x} g(t)\,dt + L.$$

We will now check assertion 2, then assertion 1. By construction and the Fundamental Theorem of Calculus, $f'(x) = g(x)$. It remains to check the first assertion. By assumption, $\lim_{n\to\infty} f_n(x_0) = L$. Next, since $f_n' \to g$ uniformly,

$$\lim_{n\to\infty} \int_{x_0}^{x} f_n'(t)\,dt = \int_{x_0}^{x} \lim_{n\to\infty} f_n'(t)\,dt = \int_{x_0}^{x} g(t)\,dt.$$

Hence, $f_n \to f$ pointwise. Furthermore,

$$\sup_x |f_n(x) - f(x)| = \sup_x \left| \int_{x_0}^{x} (f_n'(t) - g(t))\,dt + f_n(x_0) - L \right| \le \sup_x \left| \int_{x_0}^{x} |f_n'(t) - g(t)|\,dt \right| + |f_n(x_0) - L|$$

By assumption, $|f_n(x_0) - L| < \varepsilon$ for $n \ge N$. Since $f_n' \to g$ uniformly, $|f_n'(x) - g(x)| < \varepsilon$ for all $x$ and for $n \ge M$. Hence, for $n \ge \max\{N, M\}$:

$$\sup_x |f_n(x) - f(x)| \le \sup_x \varepsilon|x - x_0| + \varepsilon \le \varepsilon|b - a| + \varepsilon = \varepsilon(|b - a| + 1)$$

This proves uniform convergence of $f_n$ to $f$, by the criterion for uniform convergence.

8.4 Some Counterexamples

• Suppose $f_n(x) = x + n$. Then $f_n' = 1 \to 1$ uniformly. However, the sequence $f_n$ does not converge, since $f_n(x_0)$ diverges for every $x_0$. Hence both conclusions of our theorem fail here (there is no limit function $f$ at all), which shows why the assumption $f_n(x_0) \to L$ is needed.

• Suppose we consider functions not on $[a, b]$ but on $\mathbb{R}$. Let $f_n(x) = \frac{x}{n}$. Then $f_n \to 0$ pointwise (but not uniformly), while $f_n' = \frac{1}{n} \to 0$ uniformly. Hence, the first conclusion of our theorem can fail if the domain is not a bounded interval (as $f_n \not\to f$ uniformly here).

– However, if $f_n \to f$ pointwise on $\mathbb{R}$ and $f_n' \to g$ uniformly on every interval $[a, b]$, then $f'$ exists on $\mathbb{R}$ and $f_n \to f$ uniformly on every interval $[a, b]$. We may adapt the original proof of the theorem to this situation.
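The second counterexample can be made concrete with a few numbers. This is only an illustrative sketch; the radius `R` and the sample values of $n$ are arbitrary choices made here.

```python
def f(n, x):
    """f_n(x) = x / n."""
    return x / n


# On a fixed bounded interval [-R, R], sup |f_n| = R / n tends to 0.
R = 5.0
sups = [f(n, R) for n in (1, 10, 100, 1000)]

# On all of the real line, the point x = n always gives f_n(n) = 1,
# so sup |f_n| >= 1 for every n and the convergence cannot be uniform.
witness = [f(n, float(n)) for n in (1, 10, 100, 1000)]

print(sups)
print(witness)
```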

The next two topics which will be studied are two major theorems in analysis: the Arzela-Ascoli Theorem and the Stone-Weierstrass Theorem.


9 The Arzela-Ascoli Theorem

Let $C[a, b]$ be the vector space of continuous functions on $[a, b]$ with the norm $\|f\|_\infty = \sup_{x\in[a,b]} |f(x)|$. Is every closed and bounded set in $C[a, b]$ compact, as is the case in $\mathbb{R}^n$ by the Heine-Borel theorem?

We may restate the compactness condition in terms of sequences. Given a sequence $\{f_n\} \subset C[a, b]$ which is bounded with respect to the supremum norm (this is equivalent to the sequence being uniformly bounded, meaning that there exists $M$ such that $|f_n(x)| < M$ for all $x$ and for all $n$), does $\{f_n\}$ contain a convergent subsequence? That is, does every uniformly bounded sequence of functions contain a uniformly convergent subsequence? The answer is no in general. For instance, suppose the sequence $\{f_n\}$ consists of the following functions:

[Figure: graph of $f_n$, with axis labels 0, 1 and $\frac{1}{n}$]

Then the sequence of functions $\{f_n\}$ is uniformly bounded by 1, although no subsequence converges uniformly since $\sup |f_n(x)| = 1$ for all $n$. However, $\{f_n\}$ will contain a uniformly convergent subsequence if the sequence is equicontinuous.

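Definition II.7. A function $f_n$ is uniformly continuous (in $x$) if $\forall \varepsilon > 0$, $\exists \delta > 0$ such that

$$|x - y| < \delta \implies |f_n(x) - f_n(y)| < \varepsilon$$

A sequence $\{f_n\}$ is equi-continuous (in $n$) if every $f_n$ is uniformly continuous and the same $\delta$ works for all $n$; that is, the sequence is uniformly continuous in both $x$ and $n$. In terms of $\varepsilon$-$\delta$, this means that $\forall \varepsilon > 0$, $\exists \delta > 0$ such that

$$|x - y| < \delta \implies (\forall n,\ |f_n(x) - f_n(y)| < \varepsilon)$$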

The Arzela-Ascoli Theorem is then as follows:

Theorem II.11. If $\{f_n\}$ is a uniformly bounded and equi-continuous sequence of functions on $[a, b]$, then $\{f_n\}$ contains a uniformly convergent subsequence.

Firstly, a bounded interval is needed. Consider the functions defined on $\mathbb{R}$ as:

[Figure: graph of $f_n$, with axis labels $n$ and $n + 1$]


The sequence of these functions $\{f_n\}$ is equicontinuous since their derivatives are uniformly bounded, and $f_n \to 0$ pointwise. But no subsequence converges to 0 uniformly, since again $\sup_x |f_n(x)| = 1$ where the supremum is taken over $\mathbb{R}$.

9.1 Types of Continuity

Boundedness of the derivative is a sufficient condition for equicontinuity:

Lemma II.1. If $\{f_n\}$ are differentiable and there exists $B > 0$ such that $|f_n'(x)| \le B$ for all $n$ and all $x$, then $\{f_n\}$ is equicontinuous.

Proof. By the Mean Value Theorem,

$$|f_n(x) - f_n(y)| = |f_n'(c)||x - y| \le B|x - y|$$

where $c$ lies between $x$ and $y$. Taking $\delta = \frac{\varepsilon}{B}$ shows that if $|x - y| < \delta$, then $|f_n(x) - f_n(y)| < \varepsilon$ for all $n$.

There is a chain of implications for types of continuity:

Theorem II.12 (Types of Continuity).

• If $f'$ exists and is bounded, then $f$ is Lipschitz continuous. That is, there is $K$ such that $|f(x) - f(y)| \le K|x - y|$.

• Lipschitz continuity → Hölder continuity. That is, there are $K$ and $\alpha > 0$ for which $|f(x) - f(y)| \le K|x - y|^\alpha$.

• Hölder continuity → uniform continuity

• Uniform continuity → continuity

In the above example with the functions on $\mathbb{R}$, $\{f_n\}$ is Lipschitz continuous with $K = 1$. We will now prove that Hölder continuity implies uniform continuity.

Proof. Let $\varepsilon > 0$ be given and let $f$ be a Hölder continuous function with constants $K, \alpha$. Taking $\delta = \left(\frac{\varepsilon}{K}\right)^{1/\alpha}$, we see that $|x - y| < \delta$ implies

$$|f(x) - f(y)| \le K|x - y|^\alpha < K\left(\frac{\varepsilon}{K}\right) = \varepsilon$$
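The choice of $\delta$ in this proof can be tried out on a concrete function. The sketch below assumes the example $f(x) = \sqrt{x}$ on $[0, 1]$, which is Hölder continuous with $K = 1$ and $\alpha = \frac{1}{2}$; the helper names and the grid size are illustrative choices, and a grid check is of course not a proof.

```python
import math


def holder_delta(eps, K, alpha):
    """The delta from the proof: delta = (eps / K) ** (1 / alpha)."""
    return (eps / K) ** (1.0 / alpha)


def worst_gap(f, delta, points=1001):
    """Largest |f(x) - f(y)| over grid pairs in [0, 1] with |x - y| < delta."""
    xs = [i / (points - 1) for i in range(points)]
    worst = 0.0
    for i, x in enumerate(xs):
        j = i + 1
        while j < points and xs[j] - x < delta:
            worst = max(worst, abs(f(xs[j]) - f(x)))
            j += 1
    return worst


eps = 0.25
delta = holder_delta(eps, K=1.0, alpha=0.5)  # sqrt is Holder with K = 1, alpha = 1/2
print(delta, worst_gap(math.sqrt, delta))
```

The printed gap stays below `eps`, as the proof predicts for every pair of points closer than `delta`.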

9.2 Pointwise Boundedness

In the statement of the Arzela-Ascoli theorem, we need only pointwise boundedness instead of uniform boundedness for the theorem to hold.

Definition II.8. A sequence $\{f_n\}$ is pointwise bounded if for every $x$, the sequence $\{f_n(x)\}_{n=1}^\infty$ is bounded.

Theorem II.13. If a sequence $\{f_n\}$ on $[a, b]$ is pointwise bounded and equi-continuous, then it is uniformly bounded.

Proof. Given $\varepsilon > 0$, by equicontinuity we may choose $\delta$ such that $|f_n(x) - f_n(y)| < \frac{\varepsilon}{2}$ for all $n$ whenever $|x - y| < \delta$. Choose a partition $x_1, \ldots, x_k$ of $[a, b]$ fine enough that every $x \in [a, b]$ satisfies $|x - x_i| < \delta$ for some partition point $x_i$. By the triangle inequality,

$$|f_n(x)| \le |f_n(x) - f_n(x_i)| + |f_n(x_i)|$$

holds. Since $|f_n(x_i)| \le M_i$ for all $n$ by pointwise boundedness, letting $M = \max_{i=1,\ldots,k} M_i$ yields a uniform bound of $M + \varepsilon$ for $|f_n(x)|$.

We will prove the Arzela-Ascoli theorem next day.


9.3 Proof of Arzela-Ascoli

Recall the statement of the Arzela-Ascoli Theorem.

Theorem II.14. Suppose $\{f_n\}$ is a sequence of functions on $[a, b]$ such that $\{f_n\}$ is pointwise bounded and equi-continuous. Then there exists a subsequence $f_{n_k}$ such that $f_{n_k} \to f$ uniformly.

Recall that a sequence of functions is pointwise bounded if for every $x$, the set $\{f_n(x)\}$ is bounded. Furthermore, an equi-continuous sequence of functions $\{f_n\}$ which is pointwise bounded is uniformly bounded too. We will prove the theorem today.

Proof. We shall split the proof into three cases.

Case I: $\{f_n\}$ is defined on a finite set $E = \{x_1, x_2, \ldots, x_m\}$

Here, if $\{f_n\}$ is pointwise bounded, then there exists a convergent subsequence (and we do not need the equi-continuity condition). The vectors $(f_i(x_1), \ldots, f_i(x_m))$ form a bounded sequence in $\mathbb{R}^m$. Hence, by the Heine-Borel theorem, there exists a subsequence $\{f_{n_i}\}$ which converges pointwise.

Case II: $\{f_n\}$ is defined on a countable set $E = \{x_1, x_2, \ldots\}$

Here, if $\{f_n\}$ is pointwise bounded, then there exists a pointwise convergent subsequence (and we again do not need the equi-continuity condition). However, we cannot resort directly to the Heine-Borel Theorem, since there exist sequences in $\mathbb{R}^\infty$ with no convergent subsequence (the standard basis vectors in $\mathbb{R}^\infty$, for example). However, the representations of these vectors as functions:

$$f_i(x_j) = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{otherwise} \end{cases}$$

produce a sequence of functions which tends to zero pointwise.

In this case, let $S = \{f_n\}$. Then the sequence $\{f_n(x_1)\} \subset \mathbb{R}$ is bounded. Hence there exists a subsequence $\{f_{1,1}, f_{1,2}, \ldots\}$ of $S$ for which $\{f_{1,1}(x_1), f_{1,2}(x_1), \ldots\}$ converges to some $y_1$. Call this sequence of functions $S_1 = \{f_{1,1}, f_{1,2}, \ldots\}$. Next we may construct $S_2$ by taking a subsequence of $S_1$ which converges when evaluated at $x_2$. Repeating this process yields sequences $S_n \subset \cdots \subset S_2 \subset S_1$ for which $S_n$ converges on the set $\{x_1, \ldots, x_n\}$, constructed such that $f_{n,k}(x_n) \to y_n$ as $k \to \infty$ (and $f_{n,k}(x_i) \to y_i$ at all previous points $i < n$).

We may now diagonalize, taking the sequence:

$$L = \{f_{1,1}, f_{2,2}, \ldots, f_{n,n}, f_{n+1,n+1}, \ldots\}$$

which is a subsequence of $S$ that converges at every point $x_i \in E$: from the $i$th term onward, $L$ is a subsequence of $S_i$, and $f_{i,k}(x_i) \to y_i$ as $k \to \infty$ by construction.

Case III: $\{f_n\}$ is defined on the interval $E = [a, b]$

Consider the countable set $[a, b] \cap \mathbb{Q} = \{q_1, q_2, \ldots\}$. By the result of Case II, there exists a subsequence $\{f_{n_k}\}$ of $\{f_n\}$ which converges on $[a, b] \cap \mathbb{Q}$. That is, for each $j$, $f_{n_k}(q_j) \to r_j$ for some $r_j \in \mathbb{R}$ as $k \to \infty$.

Claim: This subsequence $\{f_{n_k}\} = \{g_k\}$ converges pointwise on $[a, b]$ and also uniformly on that interval.


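To prove the claim, we use the uniform Cauchy criterion. Let $\varepsilon > 0$ be given. We need to exhibit $N$ such that $m, n \ge N$ implies $|g_n(x) - g_m(x)| < \varepsilon$ for all $x \in [a, b]$. By the triangle inequality, for any $q_i$:

$$|g_n(x) - g_m(x)| \le |g_n(x) - g_n(q_i)| + |g_n(q_i) - g_m(q_i)| + |g_m(q_i) - g_m(x)|$$

By equicontinuity, there exists $\delta$ such that for all $x, y, n$, $|x - y| < \delta$ implies that $|g_n(x) - g_n(y)| < \frac{\varepsilon}{3}$. Choose such a $\delta$ and construct a partition $P$ of $[a, b]$ with finitely many points taken from $[a, b] \cap \mathbb{Q}$ such that the mesh of the partition is less than $\delta$. Then every $x \in [a, b]$ lies within $\delta$ of some partition point $q_i$, which ensures that $|g_n(x) - g_n(q_i)| < \frac{\varepsilon}{3}$ and $|g_m(q_i) - g_m(x)| < \frac{\varepsilon}{3}$.

Furthermore, since $g_n(q_i)$ converges by construction for each of the finitely many partition points $q_i$, there exists $N$ such that $|g_n(q_i) - g_m(q_i)| < \frac{\varepsilon}{3}$ for all partition points $q_i$ if $m, n \ge N$. Hence

$$|g_n(x) - g_m(x)| \le \underbrace{|g_n(x) - g_n(q_i)|}_{\text{by equi-continuity}} + \underbrace{|g_n(q_i) - g_m(q_i)|}_{\text{by construction}} + \underbrace{|g_m(q_i) - g_m(x)|}_{\text{by equi-continuity}} < 3\left(\frac{\varepsilon}{3}\right) = \varepsilon$$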

9.4 Converse to Arzela-Ascoli Theorem

Recall the statement of the Arzela-Ascoli Theorem:

Theorem II.15. If $\{f_n\}$ is uniformly bounded and equi-continuous on $[a, b]$, then there exists a uniformly convergent subsequence of $\{f_n\}$.

Recall that the proof involved exhibiting, by diagonalization, a subsequence $f_{n_k}$ which converges to some $f$ on $\mathbb{Q} \cap [a, b]$. The equi-continuity condition then allows us to conclude that $f_{n_k}$ converges to $f$ uniformly on $[a, b]$. The converse to the Arzela-Ascoli Theorem holds:

Theorem II.16. If $f_n \to f$ uniformly on $[a, b]$ and each $f_n$ is continuous, then $\{f_n\}$ is equicontinuous and uniformly bounded.

Proof. Uniform boundedness was proved in a homework. To prove equicontinuity, we need to show that for every $\varepsilon > 0$ we may choose a $\delta$ that works for all $n$.

Firstly, choose $N$ such that $|f_n(x) - f(x)| < \frac{\varepsilon}{3}$ for all $x$ and for all $n \ge N$.

Furthermore, since $f$ is a uniform limit of continuous functions on $[a, b]$, it is continuous and hence uniformly continuous, so we may choose $\delta_N$ such that $|f(x) - f(y)| < \frac{\varepsilon}{3}$ if $|x - y| < \delta_N$. Then by the triangle inequality:

$$|f_n(x) - f_n(y)| \le |f_n(x) - f(x)| + |f(x) - f(y)| + |f(y) - f_n(y)| < \frac{\varepsilon}{3} + \frac{\varepsilon}{3} + \frac{\varepsilon}{3} = \varepsilon$$

for all $n \ge N$ and $|x - y| < \delta_N$. Since the equi-continuity condition stipulates that we must find $\delta$ which works for all $n \in \mathbb{N}$, we may need to make it smaller. Since the functions $f_1, \ldots, f_{N-1}$ in the sequence are uniformly continuous, there exist $\delta_1, \ldots, \delta_{N-1}$ for which $|f_i(x) - f_i(y)| < \varepsilon$ whenever $|x - y| < \delta_i$, for $i \in \{1, \ldots, N - 1\}$. Hence, choosing $\delta = \min\{\delta_1, \ldots, \delta_{N-1}, \delta_N\}$ ensures $|f_n(x) - f_n(y)| < \varepsilon$ for all $n$ if $|x - y| < \delta$.


Recall that the Heine-Borel Theorem states that a set $C \subset \mathbb{R}^n$ is compact if and only if it is closed and bounded. One interpretation of the Arzela-Ascoli Theorem is that a set $C \subset C[a, b]$ (continuous functions on $[a, b]$ with the supremum norm) is compact if and only if $C$ is closed, bounded, and equicontinuous. We may make this interpretation based on the converse we just proved, noting that compactness of a set means that every sequence in the set has a subsequence converging to a point of the set.

Furthermore, we may define equicontinuity not just for sequences of functions but also for any arbitrary set of functions.

Definition II.9. If $C \subset C[a, b]$ is a family of functions, then it is equicontinuous if $\forall \varepsilon > 0$, there exists $\delta > 0$ for which

$$|x - y| < \delta \implies |f(x) - f(y)| < \varepsilon$$

for all $f \in C$.

9.5 Application: Peano’s Theorem

The main application of the Arzela-Ascoli Theorem is the proof of Peano’s Theorem, which tells us about solutions of differential equations. A differential equation relates a function to its derivative(s), with some initial conditions on the function. For instance, $x' = x^2$, $x(0) = 1$ is an example of a differential equation.

Theorem II.17 (Peano’s Theorem). Let $x'(t) = f(t, x(t))$ be a differential equation with initial condition $x(t_0) = x_0$. If $f$ is continuous, then there exists a solution $x(t)$ for $t \in [t_0, t_0 + \varepsilon]$, for some $\varepsilon > 0$.

1 Peano’s Theorem only stipulates that a solution exists locally, since a solution of a differential equation may approach $\infty$ in finite time. The differential equation

$$x'(t) = x^3, \quad x(0) = 1$$

is such an example. The above differential equation has solution $x(t) = \sqrt{\frac{1}{1 - 2t}}$, which approaches $\infty$ as $t \to \frac{1}{2}$.

2 We may not have a unique solution either. In the case of the differential equation

$$x' = \sqrt{|x|}, \quad x(0) = 0$$

the functions $x(t) = 0$ and $x(t) = \frac{t^2}{4}$ (for $t \ge 0$) are solutions. In the latter case, we may verify that $x' = \frac{t}{2} = \sqrt{\frac{t^2}{4}}$. In fact, we may construct infinitely many solutions by shifting the time at which the solution leaves zero, as follows:

[Figure 28: graphs of $\frac{t^2}{4}$ starting at $t_0$ and $\frac{(t-1)^2}{4}$ starting at $t_1$. Another solution of the differential equation is constructed by shifting the time where the function is first non-zero.]

In the case where $x(t_0) < 0$, different solutions may be constructed similarly, by shifting the time where the function starts growing after reaching 0.


[Figure 29: Other solutions of the differential equation are constructed in this case, again by shifting; axis labels $t_0$, $t_1$]

Hence, locally, there is not even a unique solution for the above differential equation! However, the Picard-Lindelöf theorem says that if $f$ is Lipschitz continuous in $x$, then there is a unique solution. That theorem does not apply here since $\sqrt{|x|}$ is not Lipschitz continuous: the slope of the function goes to $\infty$ as $x \to 0$.
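Both remarks can be sanity-checked numerically by plugging the claimed solutions into their equations. The sketch below uses a central difference for the derivative; the helper names (`derivative`, `blow_up`, `shifted`) and the sample times are illustrative choices made here, not part of the notes.

```python
import math


def derivative(x, t, h=1e-6):
    """Central-difference approximation to x'(t)."""
    return (x(t + h) - x(t - h)) / (2 * h)


def blow_up(t):
    """Claimed solution of x' = x^3, x(0) = 1, valid for t < 1/2."""
    return 1.0 / math.sqrt(1.0 - 2.0 * t)


def shifted(t1):
    """Solution of x' = sqrt(|x|) that stays at 0 until time t1, then follows (t - t1)^2 / 4."""
    return lambda t: 0.0 if t <= t1 else (t - t1) ** 2 / 4.0


# Residual |x'(t) - x(t)^3| for the blow-up example at a few times before t = 1/2.
res_cubic = max(abs(derivative(blow_up, t) - blow_up(t) ** 3) for t in (0.0, 0.2, 0.4))

# Residual |x'(t) - sqrt(|x(t)|)| for several shifts t1; each shift has x(0) = 0.
res_sqrt = max(
    abs(derivative(shifted(t1), t) - math.sqrt(abs(shifted(t1)(t))))
    for t1 in (0.0, 0.5, 1.0)
    for t in (0.25, 0.75, 1.5, 2.0)
)
print(res_cubic, res_sqrt)
```

Both residuals come out at the level of finite-difference error, consistent with every shifted function being a solution of the same initial value problem.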

The idea of the proof of Peano’s theorem is then to find a sequence of functions $x_n(t)$ which hopefully converges to a solution of the desired differential equation. For example, Euler’s Method with step size $\frac{1}{n}$ produces a piecewise linear approximation to the solution, which we hope converges to the actual solution.

[Figure 30: Euler’s method produces a sequence of piecewise linear approximations $x_n(t)$ to the solution $x(t)$ of a differential equation]
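For reference, Euler’s method itself is short to write down. The following is a minimal sketch; the test problem $x' = x$, $x(0) = 1$ is an illustrative assumption chosen because its exact solution $e^t$ is known, and it is not an example from the notes.

```python
import math


def euler(f, t0, x0, t_end, n):
    """Euler's method with n equal steps on [t0, t_end]; returns the node values x_0, ..., x_n."""
    h = (t_end - t0) / n
    t, x = t0, x0
    xs = [x]
    for _ in range(n):
        x = x + h * f(t, x)  # follow the slope f(t, x) for one step of length h
        t = t + h
        xs.append(x)
    return xs


# Illustrative test problem (an assumption): x' = x, x(0) = 1, exact value x(1) = e.
for n in (10, 100, 1000):
    approx = euler(lambda t, x: x, 0.0, 1.0, 1.0, n)[-1]
    print(n, approx, abs(approx - math.e))
```

On $[0, 1]$ the step size is $\frac{1}{n}$, and joining the returned node values by straight segments gives the piecewise linear function $x_n(t)$.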

The issue is that the sequence of iterates produced by decreasing the step size in Euler’s method, that is the sequence $\{x_n(t)\}$, may not converge to the solution $x(t)$. For instance, when solving the differential equation $x' = \sqrt{|x|}$, assuming that $x(t_0) < 0$, we have two cases with Euler’s method:


Figure 31: Two cases for Euler’s Method when solving $x' = \sqrt{|x|}$

In the case presented on the left, $x_n(t) = 0$ after a finite number of steps and the function then becomes identically zero. Otherwise, zero is never reached exactly and the approximate solution becomes unbounded (this is shown in the right case). Hence, the sequence of iterates Euler’s method produces in this case may contain approximations of both types, and does not converge to a function. However, we may apply the Arzela-Ascoli theorem to obtain a subsequence $x_{n_k}$ which converges to a solution $x$. Indeed, if $f(t, x)$ is bounded, then the slope of $x_n$ at every step in Euler’s method is bounded, and we get an equicontinuous sequence of functions which is uniformly bounded. This is the idea of the proof.

Proof Sketch. Instead of solving the differential equation $x' = f(t, x)$, $x(t_0) = x_0$, we will solve the integral equation:

$$x(t) = x_0 + \int_{t_0}^{t} f(s, x(s))\,ds$$

for a continuous function $x(t)$. Since $f(s, x(s))$ is continuous in $s$, the right hand side is differentiable by the Fundamental Theorem of Calculus, so $x'(t)$ exists and satisfies the original differential equation.

Define an operator $L\{u\}(t) = x_0 + \int_{t_0}^{t} f(s, u(s))\,ds$. Solving the integral equation is then equivalent to finding a fixed point of this operator. We will continue the proof next day.

9.5.1 Proof of Peano’s Theorem

Recall the statement of Peano’s Theorem:

Theorem II.18. Consider the differential equation $x'(t) = f(t, x(t))$ with initial condition $x(t_0) = x_0$. If $f$ is continuous, then there exists a solution $x(t)$ of the equation in some interval $[t_0, t_0 + \varepsilon]$.

Proof. Consider the integral equation

$$x(t) = \underbrace{x_0 + \int_{t_0}^{t} f(s, x(s))\,ds}_{\text{this is the operator } L\{x\}(t)}.$$

Solving the differential equation is equivalent to finding a continuous function $x(t)$ which satisfies the integral equation (and is a fixed point of the operator $L$).

Let $f(t, x)$ be defined and continuous for $t_0 \le t \le t_0 + a$ and $x_0 - b \le x \le x_0 + b$. Since $f(t, x)$ is continuous on the compact set $D = [t_0, t_0 + a] \times [x_0 - b, x_0 + b]$, there exists $M$ such that $|f(t, x)| < M$ on $D$. Let $\varepsilon = \min\{a, \frac{b}{M}\}$. Define $x_n(t)$ on $[t_0, t_0 + \varepsilon]$ as follows:

$$x_n(t) = \begin{cases} x_0 & t \in [t_0, t_0 + \frac{\varepsilon}{n}] \\ x_0 + \int_{t_0}^{t - \frac{\varepsilon}{n}} f(s, x_n(s))\,ds & t \in [t_0 + \frac{\varepsilon}{n}, t_0 + \varepsilon] \end{cases}$$

Divide the interval $[t_0, t_0 + \varepsilon]$ into $n$ equal pieces. The claim is that this construction enables the part of $x_n(t)$ on $[t_0 + \frac{i\varepsilon}{n}, t_0 + \frac{(i+1)\varepsilon}{n}]$ to be determined by the parts on the previous pieces, up to $[t_0 + \frac{(i-1)\varepsilon}{n}, t_0 + \frac{i\varepsilon}{n}]$. We can see this by noting that:

$$x_n'(t) = \begin{cases} 0 & t \in [t_0, t_0 + \frac{\varepsilon}{n}] \\ f(t - \frac{\varepsilon}{n}, x_n(t - \frac{\varepsilon}{n})) & t \in [t_0 + \frac{\varepsilon}{n}, t_0 + \varepsilon] \end{cases}$$

Hence the function $x_n(t)$ is determined by the initial condition and the value of $f$ at $x_n(t - \frac{\varepsilon}{n})$, a time slightly before the current time. In other words, there is a time delay: $x_n'(t) \ne f(t, x_n(t))$ in general, but instead $x_n'(t) = f(t - \frac{\varepsilon}{n}, x_n(t - \frac{\varepsilon}{n}))$. In a way, this construction is similar to Euler’s method, since we use information produced at an earlier time (i.e. the derivative) to construct the solution at the current time.
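The delayed construction can be imitated on a grid. The sketch below is an illustration under stated assumptions, not the construction used in the proof: the integral is replaced by a left Riemann sum, the delay $\frac{\varepsilon}{n}$ is split into `substeps` grid cells (a parameter introduced here), and the test problem $x' = -x$, $x(0) = 1$ with horizon `eps = 0.5` is chosen only because its exact solution $e^{-t}$ is known.

```python
import math


def delayed_approximation(f, t0, x0, eps, n, substeps=20):
    """Grid version of x_n: x_n = x0 on [t0, t0 + eps/n], and afterwards
    x_n(t) = x0 + integral from t0 to t - eps/n of f(s, x_n(s)) ds (left Riemann sum)."""
    h = (eps / n) / substeps      # grid spacing; the delay eps/n spans `substeps` cells
    total = n * substeps          # number of grid cells on [t0, t0 + eps]
    xs = [x0] * (total + 1)
    integral = 0.0                # running value of the integral up to t_i - eps/n
    for i in range(substeps + 1, total + 1):
        j = i - substeps - 1      # newest grid cell entering the integral
        integral += h * f(t0 + j * h, xs[j])
        xs[i] = x0 + integral
    return xs


def end_error(n):
    """Distance between x_n(0.5) and the exact value exp(-0.5) for x' = -x, x(0) = 1."""
    xs = delayed_approximation(lambda t, x: -x, 0.0, 1.0, 0.5, n)
    return abs(xs[-1] - math.exp(-0.5))


for n in (10, 50, 200):
    print(n, end_error(n))
```

Each value `xs[i]` only uses entries `xs[j]` with `j < i`, which mirrors the observation that $x_n$ on one piece is determined by the earlier pieces. In this example the error at the right endpoint shrinks as the delay $\frac{\varepsilon}{n}$ shrinks.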

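We may rewrite $x_n(t)$ using operator notation as:

$$x_n(t) = \begin{cases} x_0 & t \in [t_0, t_0 + \frac{\varepsilon}{n}] \\ L\{x_n\}(t - \frac{\varepsilon}{n}) = L_n\{x_n\}(t) & t \in [t_0 + \frac{\varepsilon}{n}, t_0 + \varepsilon] \end{cases}$$

So the $x_n$ are fixed points of the operators $L_n$ which we have just defined. We will first check that $x_0 - b \le x_n(t) \le x_0 + b$, so that $f(s, x_n(s))$ is defined and the $x_n(t)$ are potentially approximations to a solution of the differential equation.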

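$$|x_n(t) - x_0| \le \left| \int_{t_0}^{t - \frac{\varepsilon}{n}} f(s, x_n(s))\,ds \right| \le \underbrace{\left|t - \frac{\varepsilon}{n} - t_0\right|}_{\text{length of interval}} \cdot \underbrace{M}_{\text{bound on } f(t,x)} \le \varepsilon M \le \underbrace{b}_{\text{by choice of } \varepsilon}$$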

Hence, we have verified that the $x_n(t)$ stay within $D$. We may now apply the Arzela-Ascoli theorem: the full sequence $x_n(t)$ need not converge, but some subsequence $x_{n_k} \to x$ may converge uniformly to a solution $x(t)$. It remains to check the two hypotheses:

• We have just verified that $\{x_n\}$ are uniformly bounded, since $|x_n(t)| \le |x_0| + b$.

• $\{x_n\}$ are equicontinuous. Supposing $t_1 \le t_2$ with $t_1, t_2 \ge t_0 + \frac{\varepsilon}{n}$, then

$$|x_n(t_2) - x_n(t_1)| \le \int_{t_1 - \frac{\varepsilon}{n}}^{t_2 - \frac{\varepsilon}{n}} |f(s, x_n(s))|\,ds \le M|t_2 - t_1|$$

(the same bound holds trivially where $x_n$ is constant). Since $\{x_n\}$ are uniformly Lipschitz by the above observation, they are equi-continuous.


Hence a subsequence $\{x_{n_k}\}$ converges uniformly. It remains to show that the limit $x(t)$ is a solution to the integral equation. We know that $L_{n_k}(x_{n_k}) = x_{n_k}$ and that $L_{n_k} \to L$ in some sense. Using this, we must show that for every $\eta > 0$, $|L(x)(t) - x(t)| < \eta$ for all $t$. By the triangle inequality

$$|L(x) - x| \le \underbrace{|L(x) - L_{n_k}(x)|}_{1} + \underbrace{|L_{n_k}(x) - L_{n_k}(x_{n_k})|}_{2} + \underbrace{|x_{n_k} - x|}_{3}$$

since $L_{n_k}(x_{n_k}) = x_{n_k}$ by construction. We must show that each part is less than $\frac{\eta}{3}$ for $n_k$ sufficiently large.

Since $x_{n_k} \to x$ uniformly, we may make $|x_{n_k} - x| < \frac{\eta}{3}$ for all $t$, taking care of the third part of the inequality.

Since $L(x)$ is a (uniformly) continuous function of $t$, $|L(x)(t) - L(x)(t - \frac{\varepsilon}{n})| < \frac{\eta}{3}$ if $\frac{\varepsilon}{n}$ is small. Note that $L_{n_k}(x)(t) = L(x)(t - \frac{\varepsilon}{n_k})$, so this makes the first part of the inequality small.

To make the second part of the inequality small, note that $L_{n_k}$ is continuous, so $\|x - x_{n_k}\|_\infty$ small means that $\|L_{n_k}(x) - L_{n_k}(x_{n_k})\|_\infty$ is small too.

Hence we have proven that $|L(x) - x| < \eta$ for every $\eta > 0$, so $L(x) = x$, and we have produced a solution to the differential equation.

We finally note that the Arzela-Ascoli theorem is useful for finding fixed points of operators, as we have done here. This is equivalent to minimizing the norm $\|L(x) - x\|_\infty$. If we have a compact set of functions $C$ and a continuous operator $L$, then $\|L(x) - x\|_\infty$ attains its minimum on $C$, and if that minimum is zero, we have a fixed point of $L$ in $C$. Next day, we will discuss the Stone-Weierstrass Theorem.

10 Weierstrass’ Theorem

Theorem II.19. Suppose $f$ is a continuous function on $[a, b]$. Then there exists a sequence of polynomials $P_n(x)$ such that $P_n(x) \to f(x)$ uniformly on $[a, b]$.

This is in stark contrast to the previous example of Weierstrass which we have encountered, the function that is continuous everywhere but nowhere differentiable, since this theorem states that any continuous function may be approximated, uniformly, by smooth functions. In the language of function spaces, $\mathbb{R}[x]$, the set of polynomials with real coefficients, is a dense subset of $C[a, b]$. For instance, a function such as $f(x) = |x|$ on $[-1, 1]$ may be approximated on that interval by a polynomial of high degree (to approximate the cusp at the origin), but the polynomial need not approximate the function well outside that interval.

10.1 Motivation for the Proof - Averaging Operators

We may interpret Weierstrass’ theorem in terms of averaging operators. Define the averaging operator $A_\delta$ (which maps functions to functions) as:

$$A_\delta(f)(x) = \text{average of } f \text{ on } [x - \delta, x + \delta] = \frac{1}{2\delta} \int_{x - \delta}^{x + \delta} f(t)\,dt$$

The following images illustrate what happens when the averaging operator is applied to a step function.


[Figure 32: Application of the averaging operator to a step function yields a piecewise linear function (axis labels $-\delta$, $\delta$), then a piecewise quadratic function (axis labels $-2\delta$, $2\delta$)]

It is evident from the above figure that the averaging operator is a smoothing operator. We may rewrite the averaging operator as follows. Let $g(t)$ be defined as:

[Figure 33: Definition of $g(t)$: the function equal to $\frac{1}{2\delta}$ on $[-\delta, \delta]$ and zero elsewhere]

Then, $A_\delta(f)(x)$ may be defined as $\int_{-\infty}^{\infty} f(t) g(t - x)\,dt$, the convolution of $f$ and $g$ (denoted as $f * g$). We may check that this is equal to $\int_{x - \delta}^{x + \delta} f(t) \frac{1}{2\delta}\,dt$, since $g(t - x) = \frac{1}{2\delta}$ on the interval $[x - \delta, x + \delta]$ and is zero everywhere else.

Using the convolution operator, we may define other smoothing operators, for example weighted averages, with other choices of $g$ (instead of using a function with its mass distributed evenly on its support as in the example above). The function which we use for $g$ in a smoothing operator must have bounded support, and we also need $\int_{-\infty}^{\infty} g(t)\,dt = 1$ to hold.

Here, we note that if $g(x)$ is smooth, then $f * g$ is smooth no matter what $f$ is (provided that the integral exists). Furthermore, if $g \to \delta_0$, the delta function, and $f$ is continuous, then $f * g \to f$.

Hence, our goal is to find a sequence of smooth $g_n$, approximating the delta function, to produce a sequence of smooth functions $f * g_n$, each of which approximates $f$ and which approach $f$ in the limit.


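Figure 34: A sequence of smooth $g$ which approach the delta function

Weierstrass’ theorem uses the additional fact that if $g$ is a polynomial, then so is $f * g$. To see why, write $g(t - x)$ as a polynomial in $x$ with coefficients $a_k(t)$:

$$(f * g)(x) = \int_{-\infty}^{\infty} f(t)\left(x^n a_n(t) + x^{n-1} a_{n-1}(t) + \cdots\right) dt = \sum_{k=0}^{n} x^k \int_{-\infty}^{\infty} f(t) a_k(t)\,dt = \sum_{k=0}^{n} b_k x^k \in \mathbb{R}[x].$$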

10.2 Proof of Weierstrass’ Theorem

It suffices to assume that $[a, b] = [0, 1]$ and that $f(0) = f(1) = 0$ (and we set $f(x) = 0$ outside $[0, 1]$). We may recover the original $f$ by adding a linear function, since there is a unique line which passes through $(0, f(0))$ and $(1, f(1))$.

Let $g_n(x) = c_n(1 - x^2)^n$, where $c_n$ is chosen such that $\int_{-1}^{1} g_n(t)\,dt = 1$. Note that this sequence of polynomials, with normalization constants chosen appropriately, approaches the delta function in the limit.

However, this is something we need to prove. We need to establish a bound on the magnitude of $c_n$ ($c_n < \sqrt{n}$) and show that $g_n(x) \to \delta$ in order to establish Weierstrass’ theorem. We will show the estimate on $c_n$, and the convergence, next day.

Recall the lemma we stated last day in preparation for proving the Stone-Weierstrass Theorem:

Lemma II.2. If $g_n$ are polynomials and $f$ is any continuous function, then $P_n = f * g_n$ are polynomials.

To complete the proof of the Stone-Weierstrass theorem, we must show that $P_n \to f$ uniformly. Recall that we assume that $f$ is defined on $[0, 1]$ where $f(0) = f(1) = 0$ and $f(x) = 0$ outside of that interval. Recall that we may assume this as the original $f$ can be recovered by a linear transformation. Next, recall that $g_n(x) = c_n(1 - x^2)^n$, where the $c_n$ are chosen such that $\int_{-1}^{1} g_n(x)\,dx = 1$.


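[Figure 35: graphs of $g_1$, $g_2$, $g_3$. Recall that such $g_n$ are bump functions, which approach the delta function]

Lemma II.3. $g_n$ approaches the delta function. That is,

$$\lim_{n\to\infty} g_n(x) = \begin{cases} 0 & x \in [-1, 1] \setminus \{0\} \\ \infty & x = 0 \end{cases}$$

Lemma II.4. $c_n < \sqrt{n}$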

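Proof of Lemma II.3 assuming Lemma II.4. Fix some $x \in [-1, 1] \setminus \{0\}$ and let $a = 1 - x^2$, so that $0 \le a < 1$ (the case $a = 0$ is trivial). Then:

$$0 \le \lim_{n\to\infty} g_n(x) = \lim_{n\to\infty} c_n(1 - x^2)^n = \lim_{n\to\infty} c_n a^n \le \lim_{n\to\infty} \sqrt{n}\, a^n = 0$$

By the Ratio Test:

$$\frac{\sqrt{n + 1}\, a^{n+1}}{\sqrt{n}\, a^n} = \sqrt{1 + \frac{1}{n}}\; a \to a < 1$$

Hence, the series $\sum_{n=0}^{\infty} \sqrt{n}\, a^n$ converges, and hence $\lim_{n\to\infty} \sqrt{n}\, a^n = 0$.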

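Proof of Lemma II.4. By symmetry:

$$\frac{1}{c_n} = 2\int_0^1 (1 - x^2)^n\,dx \ge 2\int_0^{\frac{1}{\sqrt{n}}} (1 - x^2)^n\,dx$$

By Bernoulli’s inequality, for $0 \le x \le 1$: $(1 - x^2)^n \ge 1 - nx^2$. Hence:

$$\frac{1}{c_n} \ge 2\int_0^{\frac{1}{\sqrt{n}}} (1 - nx^2)\,dx = 2\left(x - \frac{nx^3}{3}\right)\Bigg|_0^{\frac{1}{\sqrt{n}}} = 2\left(\frac{1}{\sqrt{n}} - \frac{n}{3n\sqrt{n}}\right) = 2\left(\frac{2}{3\sqrt{n}}\right) = \frac{4}{3\sqrt{n}}$$

Hence, $\frac{1}{c_n} \ge \frac{4}{3\sqrt{n}} > \frac{1}{\sqrt{n}}$. This means that $c_n < \sqrt{n}$.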
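The bound $c_n < \sqrt{n}$ can also be checked directly for small $n$, since the normalizing integral is computable exactly. The sketch below expands $(1 - x^2)^n$ by the binomial theorem and integrates term by term using rational arithmetic; the function name `normalizing_constant` is a hypothetical label introduced here.

```python
from fractions import Fraction
from math import comb, sqrt


def normalizing_constant(n):
    """c_n = 1 / integral_{-1}^{1} (1 - x^2)^n dx, computed exactly via the binomial expansion."""
    # Each term uses integral_{-1}^{1} x^(2k) dx = 2 / (2k + 1).
    integral = sum(Fraction(2 * (-1) ** k * comb(n, k), 2 * k + 1) for k in range(n + 1))
    return 1 / integral


for n in (1, 2, 5, 10, 50):
    c_n = normalizing_constant(n)
    print(n, float(c_n), sqrt(n), float(c_n) < sqrt(n))
```

For instance, $c_1 = \frac{3}{4}$ and $c_2 = \frac{15}{16}$, both comfortably below $\sqrt{n}$.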

We have now completed the proof of the lemma that $g_n \to \delta$, where $\delta$ denotes the delta function. We may now prove the main Stone-Weierstrass theorem: that for any continuous $f$ and for the $g_n$ we have just defined,

$$P_n(x) = (f * g_n)(x) = \int_{-\infty}^{\infty} f(t) g_n(t - x)\,dt = \int_0^1 f(t) g_n(t - x)\,dt$$

approaches $f$ uniformly. Note that we may also write $P_n$ as:

$$P_n(x) = \int_{-\infty}^{\infty} f(x + t) g_n(t)\,dt = \int_{-1}^{1} f(x + t) g_n(t)\,dt$$


that is, in terms of where $g_n$ is defined. Since $x \in [0, 1]$ by assumption, $f(x + t) = 0$ if $t \notin [-1, 1]$.

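Theorem II.20. $P_n(x) \to f(x)$ uniformly on $[0, 1]$.

Proof. Let $\varepsilon > 0$ be given. Since $\int_{-1}^{1} g_n(t)\,dt = 1$:

$$|P_n(x) - f(x)| = \left| f(x) \int_{-1}^{1} g_n(t)\,dt - \int_{-1}^{1} f(x + t) g_n(t)\,dt \right| = \left| \int_{-1}^{1} (f(x) - f(x + t)) g_n(t)\,dt \right| \le \int_{-1}^{1} |f(x) - f(x + t)||g_n(t)|\,dt$$

Partition $[-1, 1]$ into three intervals as follows:

$$\underbrace{\int_{-1}^{-\delta} |f(x) - f(x + t)||g_n(t)|\,dt}_{1} + \underbrace{\int_{-\delta}^{\delta} |f(x) - f(x + t)||g_n(t)|\,dt}_{2} + \underbrace{\int_{\delta}^{1} |f(x) - f(x + t)||g_n(t)|\,dt}_{3}$$

and make each part small, uniformly in $x \in [0, 1]$.

The fact that part 2 is small is derived from the fact that $f$ is uniformly continuous. Hence, there exists $\delta > 0$ such that

$$|f(x) - f(x + t)| < \varepsilon \text{ if } |t| \le \delta$$

The fact that parts 1 and 3 are small is derived from the fact that $|g_n|$ is small for large $n$ away from 0. (We proved that $g_n(\delta) \to 0$ as $n \to \infty$, if $\delta \ne 0$.) Hence, let $N$ be such that $g_n(\delta) < \varepsilon$ when $n \ge N$; since $g_n$ is decreasing in $|t|$ on $[-1, 1]$, $g_n(t) \le g_n(\delta) < \varepsilon$ for $\delta \le |t| \le 1$.

Hence, by the bound on $g_n(t)$,

$$\int_{-1}^{-\delta} |f(x) - f(x + t)||g_n(t)|\,dt \le 2M\varepsilon(1 - \delta) < 2M\varepsilon$$

where $|f(x)| < M$ ($M$ is a bound for $f$). A similar argument bounds part 3.

Furthermore, by the bound on $|f(x) - f(x + t)|$,

$$\int_{-\delta}^{\delta} |f(x) - f(x + t)||g_n(t)|\,dt \le \varepsilon \int_{-\delta}^{\delta} |g_n(t)|\,dt \le \varepsilon \int_{-1}^{1} |g_n(t)|\,dt = \varepsilon$$

Hence, $|P_n(x) - f(x)| < (4M + 1)\varepsilon$ for all $x$, where $4M + 1$ is a constant. Since $\varepsilon$ is arbitrary, the difference $|P_n(x) - f(x)|$ may be made small for all $x$ simultaneously, and hence $P_n(x) \to f(x)$ uniformly.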
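The convergence $P_n \to f$ can be observed numerically. The sketch below evaluates $P_n(x) = \int_{-1}^{1} f(x + t) g_n(t)\,dt$ with a midpoint rule rather than expanding it as a polynomial; the test function $\sin(\pi x)$ (extended by zero outside $[0, 1]$), the helper names, and the grid sizes are all illustrative assumptions made here.

```python
import math


def kernel(n, t, c_n):
    """g_n(t) = c_n (1 - t^2)^n on [-1, 1]."""
    return c_n * (1.0 - t * t) ** n


def midpoint_integral(func, a, b, cells=2000):
    """Midpoint-rule approximation to the integral of func over [a, b]."""
    h = (b - a) / cells
    return h * sum(func(a + (i + 0.5) * h) for i in range(cells))


def approximant(f, n):
    """Return x -> P_n(x) = integral_{-1}^{1} f(x + t) g_n(t) dt, evaluated numerically."""
    c_n = 1.0 / midpoint_integral(lambda t: (1.0 - t * t) ** n, -1.0, 1.0)
    return lambda x: midpoint_integral(lambda t: f(x + t) * kernel(n, t, c_n), -1.0, 1.0)


def f(x):
    """Illustrative choice (an assumption): sin(pi x) on [0, 1], extended by 0 outside."""
    return math.sin(math.pi * x) if 0.0 <= x <= 1.0 else 0.0


def sup_error(n, points=21):
    """Largest |P_n(x) - f(x)| over an evenly spaced grid in [0, 1]."""
    p_n = approximant(f, n)
    return max(abs(p_n(i / (points - 1)) - f(i / (points - 1))) for i in range(points))


for n in (5, 20, 80):
    print(n, sup_error(n))
```

The grid error decreases as $n$ grows, though slowly, which matches the fact that the proof gives uniform convergence without any claim about its rate.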

Next day, we will discuss Stone’s theorem: a generalization of Weierstrass’ Theorem.

10.3 Stone’s Generalization of Weierstrass’ Theorem

Recall Weierstrass’ Theorem. We may restate the theorem in the language of function spaces as follows:


Theorem II.21. The set of polynomials $\mathbb{R}[x]$ is dense in $C[a, b]$, the set of continuous functions on $[a, b]$, with respect to the supremum norm.

That is, $\overline{\mathbb{R}[x]} = C[a, b]$, where the closure here denotes the set of uniform limits of sequences $\{f_n(x)\}$ of polynomials. We may form such a sequence of polynomials since, if $f$ is continuous, then for every $\varepsilon > 0$ there exists a polynomial $P(x)$ for which $|P(x) - f(x)| < \varepsilon$ for every $x$ in the domain.

Note that Weierstrass’ Theorem is not Taylor’s theorem. Consider the function

$$f(x) = \frac{1}{1 + x^2}$$

whose Taylor expansion $1 - x^2 + x^4 - x^6 + \cdots$ converges only on $|x| < 1$. If we take our interval to be $[0, 2]$, the Taylor polynomials do not converge on all of it. However, Weierstrass’ theorem enables us to find a different sequence of polynomials $P_n(x) \in \mathbb{R}[x]$ which approximate $\frac{1}{1 + x^2}$ uniformly on $[0, 2]$.
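The failure of the Taylor polynomials outside $|x| < 1$ is easy to see numerically. This is a small illustrative sketch; the sample points $0.5$ and $1.5$ and the numbers of terms are arbitrary choices.

```python
def taylor_partial_sum(x, terms):
    """Partial sum 1 - x^2 + x^4 - ... with `terms` terms of the series for 1 / (1 + x^2)."""
    return sum((-1) ** k * x ** (2 * k) for k in range(terms))


def f(x):
    return 1.0 / (1.0 + x * x)


for x in (0.5, 1.5):
    errors = [abs(taylor_partial_sum(x, terms) - f(x)) for terms in (5, 10, 20)]
    print(x, errors)
```

At $x = 0.5$ the errors shrink rapidly, while at $x = 1.5$, which lies inside $[0, 2]$, they grow without bound as more terms are added.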

Weierstrass’ theorem tells us that a certain set of functions, the polynomials, is dense in $C[a, b]$. In general, if $A \subset C[a, b]$, when is $A$ dense in $C[a, b]$? That is, when is $\overline{A} = C[a, b]$?

• Weierstrass’ theorem tells us that $A = \mathbb{R}[x]$ is dense in $C[a, b]$.

• We may want to investigate if the trigonometric polynomials $A = \{\sum_{n=0}^{N} a_n \cos(nx) + b_n \sin(nx)\}$ are dense in $C[-\pi, \pi]$. This leads into the problem of investigating Fourier series.

Stone’s theorem tells us, in general, which sets of functions are dense in $C[a, b]$. Weierstrass proved his theorem in the 1880s, while Stone proved his theorem in the 1930s and 1940s. Before stating his theorem, we will make some definitions:

Definition II.10. A subset $A \subset C(E)$, where $E$ is a domain on which functions are defined and $C(E)$ is the set of all continuous functions on that domain, is an $\mathbb{R}$-algebra if

1. $A$ is closed under addition, subtraction and multiplication (i.e. $A$ is a subring of $C(E)$)

2. $A$ is closed under scalar multiplication by $c \in \mathbb{R}$. That is, $cf \in A$ for every $f \in A$ and $c \in \mathbb{R}$.

An $\mathbb{R}$-algebra is unital if $1 \in A$.

A consequence is that every unital $\mathbb{R}$-algebra contains all of the constant functions. We make the distinction between unital and non-unital algebras based on the distinction in general ring theory: for instance, $\mathbb{Z}$ is a ring with 1 but $2\mathbb{Z}$ is a ring without 1. The term algebra also comes from the corresponding term in ring theory. If $A, B$ are rings with $B \subset A$, then $A$ is a $B$-algebra.

Note too that $\mathbb{R}$ and $\mathbb{R}[x]$ are both unital $\mathbb{R}$-algebras.

Definition II.11. $A \subset C(E)$ separates points if $\forall x, y \in E$ with $x \ne y$, there is $f \in A$ such that $f(x) \ne f(y)$.

In addition, if $A$ is a unital algebra and separates points, then for all $x \ne y$ in the domain there exists a function $f \in A$ such that $f(x) = 0$ and $f(y) = 1$.

• The algebra $\mathbb{R}[x]$ separates points, by considering the function $f(x) = x - x_0$ where $x_0$ is fixed.


• The trigonometric polynomials do not separate points, because of their periodicity: $f(-\pi) = f(\pi)$ for every trigonometric polynomial $f$. Hence the trigonometric polynomials are not dense in $C[-\pi, \pi]$, since they cannot uniformly approximate a function with different values at the two endpoints. The best we can do in this case is a trigonometric polynomial which takes the average of the two endpoint values at each endpoint. The problem disappears once we identify the two endpoints as being essentially the same point.

We may now state a few versions of Stone’s theorem.

Theorem II.22 (Stone). Let $A \subset C[a, b]$ be a unital R-algebra. Then $\overline{A} = C[a, b]$ if and only if $A$ separates points in $[a, b]$.

We may extend this theorem to complex-valued continuous functions, that is, maps $[a, b] \to \mathbb{C}$. In this case, a function in $C([a, b], \mathbb{C})$ can be written in terms of two real-valued functions as $f(x) = f_1(x) + i f_2(x)$. The definition of an R-algebra extends to the definition of a C-algebra.

• The set $A = \{\sum_{n=1}^{N} c_n e^{inx}\}$ is a C-algebra, since $e^{inx} \cdot e^{imx} = e^{i(n+m)x}$.

Theorem II.23 (Stone - Complex Version). Let $A \subset C([a, b], \mathbb{C})$ be a unital C-algebra closed under complex conjugation. That is, if $f \in A$, then $\overline{f} \in A$. Then $\overline{A} = C([a, b], \mathbb{C})$ if and only if $A$ separates points.

• The algebra $A = \{\sum_{n=-N}^{N} c_n e^{inx}\}$ is closed under complex conjugation since $\overline{e^{inx}} = e^{-inx}$. Hence the Stone-Weierstrass theorem applies to this algebra, if we again consider $\pi$ to be the same point as $-\pi$.

The version of Stone’s theorem we will first prove is the lattice version.

Definition II.12. A subset $A \subset C[a, b]$ is a lattice if it is closed under minimum and maximum operations, where the minimum and maximum are taken pointwise.

Theorem II.24 (Stone - Lattice Version). Let $A$ be a lattice in $C[a, b]$ such that for all $x, y \in [a, b]$ with $x \neq y$, and for all $c, d \in \mathbb{R}$, there exists a function $f \in A$ such that $f(x) = c$ and $f(y) = d$. Then $\overline{A} = C[a, b]$.

For instance, R[x] is not a lattice, since the pointwise maximum or minimum of polynomials is in general not a polynomial. If instead $A$ consists of the piecewise linear functions on $[a, b]$, then $A$ is a lattice and the theorem applies.

10.4 Proof of Stone’s Theorem- The Lattice Version

Recall from last day the setting of Stone’s theorem. Let $A \subset (C[a, b], \|\cdot\|_\infty)$ and $\overline{A} = \{f \in C[a, b] \mid g_n \to f \text{ uniformly, where } g_n \in A\}$. When is $\overline{A} = C[a, b]$? Equivalently, when is $A$ dense in $C[a, b]$?

Definition II.13. $A \subset C[a, b]$ is a lattice if it is closed under pointwise maximum and minimum. That is, if $f, g \in A$, then $\max(f, g)(x) = \max\{f(x), g(x)\}$ and $\min(f, g)(x) = \min\{f(x), g(x)\}$ are both in $A$.

The term lattice is derived from the corresponding term in the theory of partially ordered sets. We may define a partial order $\leq$ on $C[a, b]$ by declaring $f \leq g$ if $f(x) \leq g(x)$ for all $x$. The maximum of two functions $f, g$ is larger than both $f$ and $g$, and similarly the minimum of the two functions is smaller than both. Hence their positions relative to each other “form a lattice”.


Theorem II.25 (Stone - Lattice Version). Let $A \subset C[a, b]$ be a lattice such that for any $x_1, x_2 \in [a, b]$ with $x_1 \neq x_2$ and any $a_1, a_2 \in \mathbb{R}$, there exists $f \in A$ such that $f(x_1) = a_1$ and $f(x_2) = a_2$. Then $\overline{A} = C[a, b]$.

Note that we may replace $[a, b]$ with any compact set (in $\mathbb{R}^n$ or in a general metric space) with at least two points. This fails when we do not have two points in our domain. For instance, if we take $a = b$ and define $A = \{f : f(a) = 1\}$, then $\overline{A} = A \neq C[a, a]$.

Proof. We need to prove that for every $f \in C[a, b]$ and $\varepsilon > 0$, there exists $h \in A$ such that $|h(x) - f(x)| < \varepsilon$ for every $x$.

Lemma II.5. Suppose $f$ and $\varepsilon$ are as given before. Fix $x_0 \in [a, b]$. Then there exists $g \in A$ such that $g(x_0) = f(x_0)$ and $g(x) > f(x) - \varepsilon$ for all $x$.

Pictorially, this means that the graph of $g$ lies above the curve $f(x) - \varepsilon$ at all points of its domain.

Proof. By assumption, for any $x_1 \in [a, b]$ there exists $h_{x_1} \in A$ such that $h_{x_1}(x_0) = f(x_0)$ and $h_{x_1}(x_1) = f(x_1)$. A candidate for $g$ is $g = \max_{x_1 \in [a,b],\, x_1 \neq x_0} h_{x_1}$, since then $g(x_1) \geq f(x_1)$ for all $x_1$ and $g(x_0) = f(x_0)$. However, the lattice property only lets us take the maximum of finitely many functions. We now use the compactness assumption.

Each $h_{x_1} - f$ is continuous and vanishes at $x_1$, hence there exists $\delta_{x_1} > 0$ such that $|x - x_1| < \delta_{x_1}$ implies $h_{x_1}(x) > f(x) - \varepsilon$. Define $I_{x_1} = (x_1 - \delta_{x_1}, x_1 + \delta_{x_1})$. The intervals $I_{x_1}$ form an open cover of $[a, b]$, so a finite subcover $\{I_{x_1}, \dots, I_{x_n}\}$ covers $[a, b]$. Hence $g = \max\{h_{x_1}, h_{x_2}, \dots, h_{x_n}\}$ satisfies $g(x_0) = f(x_0)$ and $g(x) > f(x) - \varepsilon$ for every $x$.

For each $x_0 \in [a, b]$, we obtain $g_{x_0}$ constructed by the lemma. We want to take $h = \min_{x_0 \in [a,b]} g_{x_0}$ so that $|h(x) - f(x)| < \varepsilon$ for every $x$. Choose a finite number of points $x_0$ over which to take the minimum as follows:

Each $g_{x_0} - f$ is continuous and vanishes at $x_0$. Hence there is an interval $J_{x_0} = (x_0 - \delta_{x_0}, x_0 + \delta_{x_0})$ on which $g_{x_0}(x) < f(x) + \varepsilon$. The set of all $J_{x_0}$ covers $[a, b]$, hence there exists a finite subcover $[a, b] \subset J_{x_0} \cup \dots \cup J_{x_n}$. Hence taking $h = \min\{g_{x_0}, g_{x_1}, \dots, g_{x_n}\}$ works, as this function lies below $f(x) + \varepsilon$ on each $J_{x_i}$ but also above $f(x) - \varepsilon$ everywhere.

10.5 Proofs of Stone-Weierstrass Theorem: Algebra Version

10.5.1 The Real Case

Recall the Stone-Weierstrass theorem, stated in terms of algebras:

Theorem II.26 (Stone-Weierstrass). Suppose $A \subset C[a, b]$ is a unital R-algebra such that $A$ separates points. Then $\overline{A} = C[a, b]$.

Recall that $A$ is an algebra if it is closed under addition, multiplication, and scalar multiplication with real numbers. The algebra is unital if the constant functions lie in $A$. Furthermore, $A$ separates points if for all $x_1 \neq x_2$ in the domain, there exists $f \in A$ such that $f(x_1) \neq f(x_2)$. The Stone-Weierstrass theorem claims that for an $A$ satisfying these conditions, $\overline{A} = \{f : g_n \to f \text{ uniformly},\ g_n \in A\}$, the set of all uniform limits of sequences in $A$, is $C[a, b]$.

Proof. We have previously proved the Stone-Weierstrass theorem in the case where $A$ is a lattice, but an algebra in general is not a lattice (since an algebra may not be closed under minimum and maximum operations, for example the algebra of polynomials). The idea of the proof is to reduce the case of algebras to the case of lattices.

Let us define $B = \overline{A}$. By the Weierstrass approximation theorem (which is why this theorem is called the Stone-Weierstrass theorem), we get that $B$ is a lattice. Applying the lattice version of the Stone-Weierstrass theorem yields $\overline{B} = C[a, b]$, but since $B$ is closed (as the closure of $A$), we have $\overline{B} = B = \overline{A} = C[a, b]$, which was to be shown.

It remains to show that B is a lattice.

Lemma II.6. B is a unital R-algebra.

Proof. Since $\mathbb{R} \subset A \subset \overline{A} = B$, $B$ is unital. It remains to show that $B$ is closed under addition and multiplication. Let $f, g \in B$. Then $f = \lim f_n$ and $g = \lim g_n$ where $f_n, g_n \in A$ and the limits are uniform. Since $f + g = \lim (f_n + g_n)$ and $fg = \lim f_n g_n$, with $f_n + g_n \in A$ and $f_n g_n \in A$, we get $f + g, fg \in B$. Hence $B$ is a unital R-algebra.

To apply the lattice version of the theorem, we need to show that:

1. B is closed under minimum and maximum.

2. For all $x_1 \neq x_2$ and for all $a_1, a_2 \in \mathbb{R}$, there exists $f \in B$ such that $f(x_1) = a_1$ and $f(x_2) = a_2$.

We will first check property two.

Proof. Let $x_1 \neq x_2$ in the domain and $a_1, a_2 \in \mathbb{R}$ be given. Since $A$ separates points, there exists $g \in A$ such that $g(x_1) \neq g(x_2)$. We now construct $f(x) = c\,g(x) + d$ for appropriate constants $c, d$ such that $f(x_1) = a_1$ and $f(x_2) = a_2$. Note that $f \in A \subset B$ since $A$ is a unital algebra. The constants $c, d$ satisfy the system of equations:

\[ a_1 = c\,g(x_1) + d, \qquad a_2 = c\,g(x_2) + d. \]

Solving the system yields $c = \frac{a_1 - a_2}{g(x_1) - g(x_2)}$ and $d = a_1 - c\,g(x_1)$. Hence property two is satisfied.
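The two-point construction above can be checked numerically. The following is a minimal sketch in Python; the helper name `two_point_fit` and the choice $g = \exp$ are illustrative assumptions, not part of the notes.

```python
import math

def two_point_fit(g, x1, x2, a1, a2):
    """Return f = c*g + d with f(x1) = a1 and f(x2) = a2.

    Requires g(x1) != g(x2), i.e. g separates the two points.
    (Hypothetical helper, named here only for illustration.)
    """
    g1, g2 = g(x1), g(x2)
    if g1 == g2:
        raise ValueError("g does not separate x1 and x2")
    c = (a1 - a2) / (g1 - g2)   # slope from the 2x2 linear system
    d = a1 - c * g1             # offset from the first equation
    return lambda x: c * g(x) + d

# Example: g = exp separates 0 and 1; prescribe the values 3 and -2.
f = two_point_fit(math.exp, 0.0, 1.0, 3.0, -2.0)
print(f(0.0), f(1.0))
```

Only the constants and one separating function are used, which is why unitality of the algebra is needed.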

We now prove property one: that B is closed under minimum and maximum.

Lemma II.7. If f ∈ B, then |f | ∈ B.

The above lemma, along with the fact that $B$ is an algebra, implies the lattice property since

\[ \max(f, g) = \frac{f + g}{2} + \frac{|f - g|}{2}. \]

In the case where $f \geq g$, $\max(f, g) = \frac{f+g}{2} + \frac{f-g}{2} = \frac{2f}{2} = f$. Otherwise, if $f \leq g$, then $\max(f, g) = \frac{f+g}{2} + \frac{g-f}{2} = \frac{2g}{2} = g$. Similarly,

\[ \min(f, g) = \frac{f + g}{2} - \frac{|f - g|}{2}. \]

We will use the Weierstrass Approximation Theorem to prove the above lemma.

Proof. Suppose $f \in B$. We must show that $|f| \in B$. Since $f$ is continuous on $[a, b]$, it is bounded, say $|f| \leq M$. Consider $g(y) = |y|$ on $[-M, M]$. By the Weierstrass approximation theorem applied to $g$, for every $\varepsilon > 0$ there exists a polynomial $P_n(y) = \sum_{i=0}^{n} c_i y^i$ such that

\[ \Big| \sum_{i=0}^{n} c_i y^i - |y| \Big| < \varepsilon \]

for all $y \in [-M, M]$. Substituting $y = f(x)$ yields

\[ \Big| \sum_{i=0}^{n} c_i f(x)^i - |f(x)| \Big| < \varepsilon \]

for all $x \in [a, b]$ (since $x \in [a, b]$ implies $f(x) \in [-M, M]$). Let $F(x) = \sum_{i=0}^{n} c_i f(x)^i$. Then $f \in B$ implies $F \in B$, since $B$ is a unital algebra and is hence closed under addition, scalar multiplication with $\mathbb{R}$, and multiplication of functions.

Furthermore, taking $\varepsilon = 1/n$ in the above construction gives a sequence $\{g_n\}$ of such functions $F$ which converges uniformly to $|f|$. Since $B$ is closed under uniform limits, $|f| \in B$.
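To see the polynomial approximation of $|y|$ concretely, here is a small Python sketch. It uses Bernstein polynomials, which are one standard constructive route to the Weierstrass theorem; the notes do not say which construction is intended, so this choice and the function names are assumptions.

```python
from math import comb, sin

def bernstein_abs(y, n, M=1.0):
    """Degree-n Bernstein polynomial approximation of |y| on [-M, M]."""
    t = (y + M) / (2 * M)            # map [-M, M] onto [0, 1]
    total = 0.0
    for k in range(n + 1):
        node = -M + 2 * M * k / n    # sample point of |.| in [-M, M]
        total += abs(node) * comb(n, k) * t**k * (1 - t)**(n - k)
    return total

def sup_error(n, M=1.0, grid=401):
    """Largest error |P_n(y) - |y|| over a uniform grid in [-M, M]."""
    ys = [-M + 2 * M * j / (grid - 1) for j in range(grid)]
    return max(abs(bernstein_abs(y, n, M) - abs(y)) for y in ys)

# The error shrinks as the degree grows (slowly, because of the kink at 0).
print(sup_error(20), sup_error(200))

# Composition step of the proof: P_n(f(x)) approximates |f(x)| for f = sin.
x = 4.0
print(bernstein_abs(sin(x), 200), abs(sin(x)))
```

The slow decay of the error near $y = 0$ is expected, since $|y|$ is not differentiable there.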

Now applying the lattice version of the Stone-Weierstrass theorem to $B$ yields the Stone-Weierstrass theorem for the algebra $A$.

In Rudin, the algebra $A$ need not be unital (that is, possibly $\mathbb{R} \not\subset A$), although $A$ is an R-algebra. The Stone-Weierstrass theorem still holds in this case if $A$ separates points and $A$ does not vanish at any point, that is, for all $x$ there exists $f \in A$ such that $f(x) \neq 0$. Furthermore, note that the theorem holds not only when the functions are defined on the interval $[a, b]$ but also on any compact Hausdorff space $K$.

10.5.2 The Complex Case

Theorem II.27. Let $A \subset C([a, b], \mathbb{C})$, which consists of functions of the form $f(x) + i g(x)$ with $f, g$ real-valued. If $A$ is a unital C-algebra (closure under $+$ and $\cdot$ holds and $\mathbb{C} \subset A$), $A$ separates points, and $A$ is closed under complex conjugation ($f + ig \in A$ implies $f - ig \in A$), then $\overline{A} = C([a, b], \mathbb{C})$.

We may call A a C∗-algebra in this case. The complex conjugation condition is important since if $F = f + ig \in A$, then $f, g \in A$ since

\[ f = \frac{F + \overline{F}}{2} \quad \text{and} \quad g = \frac{F - \overline{F}}{2i}. \]

Conversely, if $f, g \in A$, then $f + ig \in A$.

Proof. Let $F = f + ig$ be in $C([a, b], \mathbb{C})$ and let $A$ be an algebra which satisfies the given properties. We will approximate $F$ by elements of $A$. It suffices to find real-valued $f_n, g_n \in A$ with $f_n \to f$ and $g_n \to g$ uniformly, since then $f_n + i g_n \to F$ uniformly. Applying the real version of the Stone-Weierstrass theorem to $A \cap C([a, b], \mathbb{R})$, which is a unital R-algebra that separates points because the real and imaginary parts of elements of $A$ lie in $A$, completes the proof.


Part III

Power Series and Fourier Series

The next topic we will study in this class will be power series and Fourier series, as an application of the material we have covered on uniform convergence. This will not be tested on the next exam.

11 Power Series

Definition III.1. A power series is a sum $\sum_{n=0}^{\infty} a_n x^n$ where $a_n \in \mathbb{R}$. The centre of the power series may be shifted to $a$, giving $\sum_{n=0}^{\infty} a_n (x - a)^n$. We may recover a series of the form $\sum_{n=0}^{\infty} a_n y^n$ by making the substitution $y = x - a$.

Definition III.2. A power series has radius of convergence $R$ if the series converges in $(-R, R)$ and diverges for $|x| > R$. The behaviour of the series at $\pm R$ must be determined separately.

Theorem III.1. Let $\sum_n a_n x^n$ be a power series, and $\alpha = \limsup_{n\to\infty} \sqrt[n]{|a_n|}$. Then the radius of convergence is $R = \frac{1}{\alpha}$.

Proof. This is an application of the root test. Recall that for a series $\sum_n c_n$, the root test says that the series converges or diverges depending on $L = \limsup_{n\to\infty} \sqrt[n]{|c_n|}$. If $L < 1$, the series converges; if $L > 1$, the series diverges. For the power series, we compute

\[ \limsup_{n\to\infty} \sqrt[n]{|a_n x^n|} = |x| \limsup_{n\to\infty} \sqrt[n]{|a_n|} = |x| \alpha = \frac{|x|}{R}. \]

If $\frac{|x|}{R} < 1$, the series converges. Thus the series converges when $|x| < R$, and $R = \frac{1}{\alpha}$ is the radius of convergence. Otherwise, if $\frac{|x|}{R} > 1$, that is $|x| > R$, the series diverges.
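The root-test formula can be explored numerically. The sketch below is only a heuristic: it replaces the limsup by a maximum over a finite tail of indices, and the function name and index range are illustrative assumptions.

```python
import math

def radius_estimate(log_abs_coeff, n_lo=1000, n_hi=2000):
    """Estimate R = 1 / limsup |a_n|^(1/n) from a finite tail of indices.

    log_abs_coeff(n) must return log|a_n| (use -inf when a_n = 0).
    Working with logarithms avoids overflow for fast-growing coefficients.
    """
    alpha = max(math.exp(log_abs_coeff(n) / n) for n in range(n_lo, n_hi + 1))
    return math.inf if alpha == 0 else 1.0 / alpha

# a_n = 2^n: the exact radius is 1/2.
r_geometric = radius_estimate(lambda n: n * math.log(2))

# a_n = n * 3^n: the exact radius is 1/3, but n^(1/n) -> 1 only slowly,
# so the finite-tail estimate is slightly too small.
r_weighted = radius_estimate(lambda n: math.log(n) + n * math.log(3))

print(r_geometric, r_weighted)
```

The second example also illustrates why the polynomial factor $n$ does not change the radius, a fact used later for term by term differentiation.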

Using the radius of convergence, we may also make a statement as to where the series converges uniformly.

Theorem III.2. Suppose the series $\sum_n a_n x^n$ has radius of convergence $R > 0$. Then it converges uniformly on $[-R + \varepsilon, R - \varepsilon]$ for any $0 < \varepsilon < R$. If $R = \infty$, there is uniform convergence on any finite interval $[-b, b]$.

For instance, the series $1 + x + x^2 + \dots = \frac{1}{1-x}$ on $(-1, 1)$, and it converges uniformly on any smaller closed interval. That is, for any fixed (but arbitrarily small) $\varepsilon > 0$, and for every $\eta > 0$, there exists $N$ such that $|S_n(x) - \frac{1}{1-x}| < \eta$ for every $x \in [-1 + \varepsilon, 1 - \varepsilon]$ and for every $n \geq N$, where $S_n(x) = 1 + \dots + x^n$.

Proof. Apply the Weierstrass M-test. Note that on $[-R + \varepsilon, R - \varepsilon]$:

\[ |a_n x^n| \leq |a_n| (R - \varepsilon)^n. \]

Let $M_n = |a_n| (R - \varepsilon)^n$. Applying the root test to $\sum_n M_n$ yields

\[ \limsup_{n\to\infty} \sqrt[n]{M_n} = (R - \varepsilon) \limsup_{n\to\infty} \sqrt[n]{|a_n|} = \frac{R - \varepsilon}{R} < 1. \]

Since the terms of the original series are bounded above by the terms of a convergent series, $\sum_n a_n x^n$ converges uniformly on that interval.
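For the geometric series, the uniform error on $[-r, r]$ can be measured directly. The following Python sketch (the function name and grid size are assumptions) compares the grid maximum of $|S_n(x) - \frac{1}{1-x}|$ with the closed form $\frac{r^{n+1}}{1-r}$, which is the exact supremum because the error equals $\frac{|x|^{n+1}}{|1-x|}$.

```python
def sup_error_geometric(n, r, grid=2001):
    """Max over a grid in [-r, r] of |S_n(x) - 1/(1-x)|, S_n = 1 + ... + x^n."""
    worst = 0.0
    for j in range(grid):
        x = -r + 2 * r * j / (grid - 1)
        partial = sum(x**k for k in range(n + 1))
        worst = max(worst, abs(partial - 1.0 / (1.0 - x)))
    return worst

# On [-0.9, 0.9] fifty terms already give a small uniform error ...
print(sup_error_geometric(50, 0.9), 0.9**51 / 0.1)
# ... while on [-0.999, 0.999] the same number of terms is far from enough.
print(sup_error_geometric(50, 0.999))
```

The second value shows why uniform convergence holds on each closed subinterval but not on all of $(-1, 1)$: the number of terms needed grows without bound as $r \to 1$.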

We may combine the above statement that the power series converges uniformly with the properties we have already established for uniformly convergent series.

Corollary III.1. $f(x) = \sum_n a_n x^n$ is continuous on $(-R, R)$.

Proof. The partial sums $S_n(x)$ are polynomials which converge uniformly to $f(x)$ on $[-R + \varepsilon, R - \varepsilon]$. Hence $f$ is continuous on this interval. Since this holds for every $\varepsilon > 0$, $f$ is continuous on $(-R, R)$.

Corollary III.2. Let $f(x) = \sum_n a_n x^n$ be defined on $[a, b] \subset (-R, R)$. Then

\[ \int_a^b f(x)\,dx = \int_a^b \sum_n a_n x^n \,dx = \sum_n \frac{a_n x^{n+1}}{n+1}\Big|_a^b = \sum_n \frac{a_n}{n+1}\,(b^{n+1} - a^{n+1}) \]

since uniform convergence allows us to interchange the sum and the integral.

This enables us to define the antiderivative as a power series as follows:

\[ F(x) = \int_0^x f(t)\,dt = \sum_n \frac{a_n x^{n+1}}{n+1}. \]

This is term by term integration.

Similarly, may we perform term by term differentiation to find a power series representation for $f'(x)$? We have established previously that additional conditions are needed for a sequence of derivatives to converge to the derivative of the limit. However, it is true that $f'(x) = \sum_{n=1}^{\infty} n a_n x^{n-1}$.

Recall that if $f_n \to f$ pointwise and $f_n' \to g$ uniformly, then $g = f'$. In the case of series, $\sum_n a_n x^n \to f$ pointwise (given), and if $\sum_n n a_n x^{n-1} \to g$ uniformly, then $g = f'$.

Theorem III.3. If $\sum_n a_n x^n$ has radius of convergence $R$, then $\sum_n n a_n x^{n-1}$ has the same radius of convergence and $f'(x) = \sum_n n a_n x^{n-1}$ on $(-R, R)$.

Proof.

\[ \limsup_{n\to\infty} \sqrt[n]{|n a_n|} = \lim_{n\to\infty} \sqrt[n]{n} \cdot \limsup_{n\to\infty} \sqrt[n]{|a_n|} = 1 \cdot \alpha = \frac{1}{R}. \]

Hence $R$ is also the radius of convergence of $\sum_n n a_n x^{n-1}$. By Theorem III.2, the convergence of $\sum_n n a_n x^{n-1}$ is uniform on $[-R + \varepsilon, R - \varepsilon]$. This means that $f'(x) = \sum_n n a_n x^{n-1}$ on $[-R + \varepsilon, R - \varepsilon]$ for every $\varepsilon > 0$, and hence for every $x \in (-R, R)$.
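Term by term integration and differentiation are easy to test on the geometric series, whose sum, derivative and antiderivative are known in closed form. This is a minimal Python sketch working on truncated coefficient lists; the helper names are illustrative assumptions.

```python
import math

def series(coeffs, x):
    """Evaluate sum coeffs[n] * x**n for a finite coefficient list."""
    return sum(c * x**n for n, c in enumerate(coeffs))

def differentiate(coeffs):
    """Coefficients of the term-by-term derivative: n * a_n, shifted down."""
    return [n * c for n, c in enumerate(coeffs)][1:]

def integrate(coeffs):
    """Coefficients of the term-by-term antiderivative vanishing at 0."""
    return [0.0] + [c / (n + 1) for n, c in enumerate(coeffs)]

geometric = [1.0] * 200   # truncation of 1 + x + x^2 + ... = 1/(1 - x)
x = 0.5

print(series(differentiate(geometric), x), 1 / (1 - x) ** 2)   # f'(x)
print(series(integrate(geometric), x), -math.log(1 - x))       # F(x)
```

Both comparisons are made well inside the interval of convergence, where the truncation error of 200 terms is negligible.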

The above argument may be made more careful with shift of index arguments.

Corollary III.3. If $f(x) = \sum_n a_n x^n$, then $f(x)$ has continuous derivatives of any order for all $x \in (-R, R)$.

This comes from inductively applying the above theorem.

Definition III.3. f(x) is analytic if it is defined by a power series near every point in a domain.


For instance, $f(x) = \frac{1}{1-x}$ is analytic in $(-1, 1)$. Every analytic function is a smooth function since, by the corollary, it has derivatives of any order. The above terminology comes from complex function theory (the analysis of functions $\mathbb{C} \to \mathbb{C}$), where once differentiable complex functions are called holomorphic, but are also analytic since a once differentiable function on $\mathbb{C}$ has derivatives of all orders.

This is unfortunately not the case when dealing with functions of a real variable. We have a chain of inclusions in which smooth functions may not be analytic (and within the analytic functions we have subsets of functions such as the polynomials). An illustrative example is $f(x) = e^{-1/x^2}$ (with $f(0) = 0$), whose Taylor series at $0$ is identically zero.

11.1 Power Series Properties

Recall the following facts about power series mentioned during the previous class:

• A power series $\sum_n a_n x^n$ has an interval of convergence $(-R, R)$, and converges uniformly on $[-R + \varepsilon, R - \varepsilon]$.

• Similarly, $\sum_n |a_n| x^n$ converges in $(-R, R)$, since the radius of convergence is calculated as

\[ \frac{1}{R} = \limsup_{n\to\infty} \sqrt[n]{|a_n|}. \]

• Furthermore, if $f(x) = \sum_n a_n x^n$, then $f^{(m)}$, the $m$th derivative, exists for every $m$ in the same interval. This implies that $f^{(m)}(0) = m!\,a_m$. Hence, if the series converges in $(-R, R)$, then the series must be its Taylor/Maclaurin series.

However, when dealing with functions of a real variable we are not guaranteed that a function equals its Taylor series, even when dealing with smooth functions! This behaviour is shown, for instance, by the function

\[ f(x) = \begin{cases} e^{-1/x^2} & x \neq 0 \\ 0 & x = 0 \end{cases} \]

where $f^{(m)}(0) = 0$ for every $m$. Hence $f(x) \neq \sum_{n=0}^{\infty} a_n x^n$ in any $(-R, R)$ where $R > 0$, since the power series is identically 0!
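The flatness of this function at the origin can be seen numerically. The Python sketch below (the name `flat` is an assumption) shows that $f(x)/x^{10}$ is already extremely small for modest $x$, consistent with all Taylor coefficients at $0$ vanishing, even though $f(x) > 0$ for $x \neq 0$.

```python
import math

def flat(x):
    """f(x) = exp(-1/x^2) for x != 0 and f(0) = 0."""
    return 0.0 if x == 0 else math.exp(-1.0 / (x * x))

for x in (0.5, 0.2, 0.1):
    # f(x) / x^10 still tends to 0: f vanishes faster than any power of x.
    print(x, flat(x), flat(x) / x**10)
```

A numerical check of this kind is only suggestive; the claim $f^{(m)}(0) = 0$ for every $m$ still requires the inductive argument on derivatives.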

We may hence classify power series as divergent or convergent (and hence representing a smooth function), and smooth functions as having a power series representation or not having one.

11.2 Behaviour at Endpoints

Consider $\sum_n a_n x^n$, which converges in $(-R, R)$. The series may also converge at its endpoints $\pm R$. Is $f(x)$, the function which $\sum_n a_n x^n$ represents, continuous at $x = \pm R$, given that the series converges there?

Indeed this is the case, as Abel’s theorem states. As an illustrative example, consider $\sum_{n=1}^{\infty} \frac{x^n}{n}$, where $R = 1$. At $x = 1$, the series diverges since $\sum_n \frac{1}{n} = \infty$. On the other hand, the series converges at $x = -1$ (applying the alternating series test to $\sum_n \frac{(-1)^n}{n}$). Since $f'(x) = \sum_{n=0}^{\infty} x^n = \frac{1}{1-x}$, we can conclude that $f(x) = \sum_{n=1}^{\infty} \frac{x^n}{n} = -\log(1 - x)$, hence $f(-1) = -\log 2$, but $f$ goes to infinity as $x \to 1$.


Theorem III.4 (Abel). Let $f(x) = \sum_n c_n x^n$ be a power series with radius of convergence $R$. Suppose it converges at $R$ (or $-R$). Then $f$ is continuous at $R$ (respectively $-R$). In other words:

\[ \lim_{x \to R^-} \sum_{n=0}^{\infty} c_n x^n = \sum_{n=0}^{\infty} c_n R^n. \]

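Proof. Without loss of generality, suppose $R = 1$ and $\sum_n c_n = s$. We need to prove that $\lim_{x \to 1^-} f(x) = \lim_{x \to 1^-} \sum_n c_n x^n = s$. Symbolically, this means that for every $\varepsilon > 0$ there is $\delta > 0$ such that $|f(x) - s| < \varepsilon$ if $0 < 1 - x < \delta$. Rewrite the difference as

\[ \Big| \sum_n c_n x^n - \sum_n c_n \Big| = \Big| \sum_n c_n (x^n - 1) \Big| = \Big| (x - 1) \sum_{n=0}^{\infty} c_n (1 + x + x^2 + \dots + x^{n-1}) \Big| = \Big| (x - 1) \sum_{n=0}^{\infty} x^n (s - s_n) \Big| \]

where $s - s_n = c_{n+1} + c_{n+2} + \dots = s - \sum_{k=0}^{n} c_k$. Note that the above step relied on formally switching the order of two sums. Hence, for $0 \leq x < 1$,

\[ \Big| (x - 1) \sum_{n=0}^{\infty} x^n (s - s_n) \Big| \leq (1 - x) \sum_{n=0}^{\infty} x^n |s - s_n| = (1 - x) \sum_{n=0}^{N} x^n |s - s_n| + (1 - x) \sum_{n=N+1}^{\infty} x^n |s - s_n| \]

where $N$ is chosen such that $|s - s_n| < \frac{\varepsilon}{2}$ for $n > N$, which is possible by convergence of the sum. Hence the second sum may be bounded by

\[ (1 - x) \sum_{n=N+1}^{\infty} x^n |s - s_n| \leq (1 - x)\, \frac{\varepsilon}{2} \sum_{n=0}^{\infty} x^n = \frac{\varepsilon}{2} \cdot \frac{1 - x}{1 - x} = \frac{\varepsilon}{2} \]

and the first sum may be bounded by

\[ (1 - x) \sum_{n=0}^{N} x^n |s - s_n| \leq (1 - x) \sum_{n=0}^{N} |s - s_n| \]

since $0 \leq x < 1$. Let $\sum_{n=0}^{N} |s - s_n| < M$. Hence if $|1 - x| < \frac{\varepsilon}{2M}$, the first sum is bounded by $\frac{\varepsilon}{2}$. Choosing $\delta = \frac{\varepsilon}{2M}$ completes the proof.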

A stronger version of Abel’s theorem states that the convergence is uniform on $[0, R]$, endpoint included.
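Abel's theorem can be illustrated with the series $\sum_{n \geq 1} \frac{x^n}{n}$ from the example above, which converges at $x = -1$. The Python sketch below (function name and truncation length are assumptions) compares the value of the series at the endpoint with values at points approaching $-1$ from inside the interval.

```python
import math

def log_series(x, terms=20000):
    """Partial sum of sum_{n>=1} x^n / n, which equals -log(1 - x) for |x| < 1."""
    return sum(x**n / n for n in range(1, terms + 1))

endpoint_value = log_series(-1.0)     # alternating series at x = -1
print(endpoint_value, -math.log(2))

for x in (-0.9, -0.99, -0.999):
    # Values inside the interval approach the endpoint value as x -> -1.
    print(x, log_series(x), -math.log(1 - x))
```

At $x = -1$ the partial sums converge slowly (the error after $N$ terms is at most $\frac{1}{N+1}$), so the endpoint value is only accurate to a few digits.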

11.3 Rearrangement of Sums

The proof of Abel’s theorem relied on switching two sums:

\[ \sum_{n=0}^{\infty} \sum_{m=0}^{n-1} c_n x^m = \sum_{m=0}^{\infty} \sum_{n=m+1}^{\infty} c_n x^m. \]


Under what conditions may we do this?

We will first examine a case where we may not rearrange the series. Take the sequence $s_{m,n} = \frac{m}{m+n}$ as the partial sums of a double series. Then

\[ \lim_{n\to\infty} \lim_{m\to\infty} \frac{m}{m+n} = \lim_{n\to\infty} 1 = 1 \]

but

\[ \lim_{m\to\infty} \lim_{n\to\infty} \frac{m}{m+n} = \lim_{m\to\infty} 0 = 0. \]

Hence the order of summation in the series may not be switched. The series can be reconstructed from the sequence of partial sums as $a_{11} = s_{11}$, $a_{12} = s_{12} - s_{11}$, $a_{21} = s_{21} - s_{11}$, $a_{22} = s_{22} - s_{12} - s_{21} + s_{11}$ and so on. Note that this series cannot be absolutely convergent, as the following theorem shows:

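Theorem III.5. Assume $\sum_{j=1}^{\infty} |a_{ij}| = b_i$ where $\sum_i b_i$ converges. Then

\[ \sum_{i=1}^{\infty} \sum_{j=1}^{\infty} a_{ij} = \sum_{j=1}^{\infty} \sum_{i=1}^{\infty} a_{ij}. \]

Compare the above with the theorem that if $\sum_{j=1}^{\infty} a_j$ converges absolutely, then we may reorder the $a_j$ and still get the same sum.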

Proof. Take a set $E$ consisting of distinct points $x_1, x_2, \dots$ which converge to $x_\infty \in E$. Define $f_i$ and $g$ on $E$ by $f_i(x_n) = \sum_{j=1}^{n} a_{ij}$, $f_i(x_\infty) = \sum_{j=1}^{\infty} a_{ij}$, and $g(x) = \sum_{i=1}^{\infty} f_i(x)$. Hence

\[ g(x_\infty) = \sum_{i=1}^{\infty} f_i(x_\infty) = \sum_{i=1}^{\infty} \sum_{j=1}^{\infty} a_{ij}. \]

On $\mathbb{Z}^2$, the 2-dimensional lattice where $i$ is on the x-axis and $j$ is on the y-axis, we may think of $f_i(x_n)$ as a finite sum of $n$ elements at $x = i$, and $f_i(x_\infty)$ as an infinite sum along $x = i$ (i.e. $f$ sums vertically). Furthermore, $g(x_n)$ sums the values $f_i(x_n)$ over all $i$, that is, along a horizontal infinite strip of width $n$.

1. Note firstly that $f_i$ is continuous at $x_\infty$, since $f_i(x_n) \to f_i(x_\infty)$ as $n \to \infty$: the partial sums converge by the absolute convergence assumption.

2. Furthermore, each $f_i$ is bounded. Namely, $|f_i(x)| \leq b_i$ for every $x$, since

\[ |f_i(x)| \leq \sum_j |a_{ij}| = b_i \]

by the assumption of absolute convergence.

3. Then applying the Weierstrass M-test to the sequence $\{f_i\}$ shows that $\sum_{i=1}^{\infty} f_i(x)$ converges uniformly.

Combining the above facts, each $f_i$ is continuous at $x_\infty$ and $\sum_{i=1}^{\infty} f_i$ converges uniformly, hence $g(x) = \sum_i f_i(x)$ is continuous at $x_\infty$. This means that $g(x_n) \to g(x_\infty)$ as $n \to \infty$. Hence

\[ g(x_\infty) = \sum_{i=1}^{\infty} f_i(x_\infty) = \sum_{i=1}^{\infty} \sum_{j=1}^{\infty} a_{ij} = \lim_{n\to\infty} g(x_n). \]

Note that

\[ \lim_{n\to\infty} g(x_n) = \lim_{n\to\infty} \sum_{i=1}^{\infty} \sum_{j=1}^{n} a_{ij} = \lim_{n\to\infty} \sum_{j=1}^{n} \sum_{i=1}^{\infty} a_{ij}, \]

inductively applying the fact that $\sum_{i=1}^{\infty} (a_i + b_i) = \sum_{i=1}^{\infty} a_i + \sum_{i=1}^{\infty} b_i$. Hence

\[ \lim_{n\to\infty} g(x_n) = \sum_{j=1}^{\infty} \sum_{i=1}^{\infty} a_{ij}, \]

which is equal to $g(x_\infty) = \sum_{i=1}^{\infty} \sum_{j=1}^{\infty} a_{ij} = \lim_{m\to\infty} \sum_{i=1}^{m} \sum_{j=1}^{\infty} a_{ij}$.

Note that this is actually a problem of interchanging two limit processes, as we want to prove that

\[ \lim_{n\to\infty} \sum_{i=1}^{\infty} f_i(x_n) = \sum_{i=1}^{\infty} \lim_{n\to\infty} f_i(x_n). \]
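The counterexample $s_{m,n} = \frac{m}{m+n}$ from the start of this section can be made concrete. The Python sketch below rebuilds the terms $a_{ij}$ from the partial sums by inclusion-exclusion, using exact rational arithmetic; the convention $s_{m,n} = 0$ when $m = 0$ or $n = 0$ and the function names are assumptions made for the illustration.

```python
from fractions import Fraction

def s(m, n):
    """Partial sums s_{m,n} = m / (m + n), with s = 0 if m = 0 or n = 0."""
    return Fraction(m, m + n) if m > 0 and n > 0 else Fraction(0)

def a(i, j):
    """Terms recovered from the partial sums by inclusion-exclusion."""
    return s(i, j) - s(i - 1, j) - s(i, j - 1) + s(i - 1, j - 1)

# Summing the recovered terms over a rectangle reproduces s_{M,N} exactly.
M, N = 6, 9
rebuilt = sum(a(i, j) for i in range(1, M + 1) for j in range(1, N + 1))
print(rebuilt, s(M, N))

# The two iterated limits disagree: letting m grow first gives values
# near 1, letting n grow first gives values near 0.
print(float(s(10**9, 10)), float(s(10, 10**9)))
```

Since the iterated sums differ, Theorem III.5 implies that the hypothesis on $\sum_j |a_{ij}|$ must fail for this array.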

11.4 Application to Taylor Series

Suppose $f$ is a function defined for $x \in E \subset \mathbb{R}$. We say $f$ is analytic if for every $a \in E$, the Taylor series of $f$ at $a$ converges to $f$ in some interval $(a - \varepsilon, a + \varepsilon)$ where $\varepsilon > 0$.

Suppose we have a power series $\sum_{n=0}^{\infty} a_n x^n \to f(x)$ in $(-R, R)$. Is $f$ analytic in $(-R, R)$? That is, given $a \in (-R, R)$, does the Taylor series of $f$ at $a$ converge to $f(x)$ in some $(a - \varepsilon, a + \varepsilon)$?

The answer is yes, whenever $|x - a| < R - |a| = \varepsilon$. Hence the function is analytic in the whole interval.

Proof. $f(x)$ is defined as the sum $\sum_n a_n x^n$ for all $x \in (-R, R)$. Write this as

\[ \sum_n a_n ((x - a) + a)^n = \sum_n a_n \sum_{m=0}^{n} \binom{n}{m} (x - a)^m a^{n-m} \]

by the binomial theorem. Grouping terms by powers of $(x - a)$ yields

\[ \sum_m \Big[ \sum_{n \geq m} a_n \binom{n}{m} a^{n-m} \Big] (x - a)^m. \]

To justify switching the sums via Theorem III.5, we must prove that the sums of absolute values $b_n = \sum_{m=0}^{n} \binom{n}{m} |a_n| |a|^{n-m} |x - a|^m$ have a convergent sum $\sum_n b_n$. By the binomial theorem, $b_n = |a_n| (|x - a| + |a|)^n = |a_n| r^n$, where $r = |x - a| + |a| < R$. The series

\[ \sum_n |a_n| r^n \]

converges since $r < R$: by the root test, $\limsup_{n\to\infty} \sqrt[n]{|a_n| r^n} = \frac{r}{R} < 1$. Hence $\sum_n b_n$ is finite and the sums may be switched.
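The recentring can be checked on the geometric series, where everything is known in closed form: with $a_n = 1$ the new coefficients should be $\sum_{n \geq m} \binom{n}{m} a^{n-m} = \frac{1}{(1-a)^{m+1}}$. The following Python sketch uses truncated sums; the function name and truncation lengths are assumptions.

```python
from math import comb

def recentered_coeff(coeff, a, m, terms=400):
    """Coefficient of (x - a)^m: sum over n >= m of a_n * C(n, m) * a^(n - m),
    truncated after `terms` terms."""
    return sum(coeff(n) * comb(n, m) * a**(n - m) for n in range(m, m + terms))

a = 0.3
b = [recentered_coeff(lambda n: 1.0, a, m) for m in range(40)]

# Compare with the closed form 1 / (1 - a)^(m + 1) for the geometric series.
print(b[2], 1 / (1 - a) ** 3)

# Evaluate the recentred series at x with |x - a| < R - |a| = 0.7.
x = 0.5
value = sum(b_m * (x - a) ** m for m, b_m in enumerate(b))
print(value, 1 / (1 - x))
```

Here the recentred series actually converges for $|x - a| < 1 - a$, which matches the bound $R - |a|$ from the proof because $a > 0$.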

For the remainder of the course, we will complete our study of power series and Fourier series.


11.5 Zeros of Analytic Functions

If $f(x)$ is a polynomial of degree $n$, then $f(x)$ can have at most $n$ roots by the Fundamental Theorem of Algebra. In the case where $f$ is an analytic function, $f$ can have infinitely many roots. For instance, in the case where $f(x) = \sin x = x - \frac{x^3}{3!} + \dots$, its roots are $x = n\pi$ where $n \in \mathbb{Z}$.

The main result is that the roots of an analytic function that is not identically zero cannot accumulate inside its interval of convergence.

Theorem III.6. Let $\sum_n a_n x^n$ converge to $f(x) \not\equiv 0$ in $(-R, R)$, and let $E \subset (-R, R)$ be the set of zeros of $f$. Then $E$ has no limit point in the interval $(-R, R)$.

We note that in the statement of the above theorem, R may be infinite.

Proof. Suppose $x_0 \in (-R, R)$ is a limit point of the set of zeros of $f$. Consider the power series of $f$ centred at $x_0$, which is $f(x) = \sum_{n=0}^{\infty} b_n (x - x_0)^n$. This series converges in $(x_0 - \varepsilon, x_0 + \varepsilon)$ for some $\varepsilon > 0$.

Since $f$ is continuous and $x_0$ is a limit of zeros, $f(x_0) = 0$, so $b_0 = 0$. Hence $f(x) = \sum_{n=k}^{\infty} b_n (x - x_0)^n = (x - x_0)^k \sum_{n=k}^{\infty} b_n (x - x_0)^{n-k}$ for some $k$, where $x_0$ is a zero of order $k$ with $b_k \neq 0$. Let $g(x) = \sum_{n=k}^{\infty} b_n (x - x_0)^{n-k}$. Since $g(x_0) = b_k \neq 0$ and $g$ is continuous, $g(x) \neq 0$ in some interval $(x_0 - \eta, x_0 + \eta)$, for some $\eta > 0$. Hence $f$ has no zeros in this interval other than $x_0$. This is a contradiction.

Note the above theorem can be used to show that certain functions do not have power series representations about a point. For instance, $f(x) = x \sin\frac{1}{x}$ has no power series at $x = 0$ since $x = 0$ is a limit point of the zeros of this function. The same can be said about the function

\[ f(x) = \begin{cases} e^{-1/x^2} & x > 0 \\ 0 & x \leq 0 \end{cases} \]

since $x = 0$ is again a limit point of the zeros and there are nonzero values in every neighbourhood of $x = 0$. Hence $f$ is not analytic at $x = 0$.

This ends our study of power series in this course. Power series become useful in complex analysis, differential equations, algebraic geometry, and number theory.

12 Fourier Series as Orthogonal Series

12.1 The Hermitian Inner Product

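Suppose $f(x)$ is defined on $[-\pi, \pi]$. Is it true that

\[ f(x) = a_0 + \sum_{n=1}^{\infty} \big( a_n \cos(nx) + b_n \sin(nx) \big) \qquad (1) \]

for some constants $a_n, b_n$? This is a series of trigonometric functions whose partial sums are trigonometric polynomials, in analogy with a power series.

We will first rewrite the above series a little differently. Allow $f(x)$ to be a complex-valued function and allow $a_n, b_n \in \mathbb{C}$. The series

\[ f(x) = \sum_{n=-\infty}^{\infty} c_n e^{inx} \qquad (2) \]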

is equivalent to series (1), since $e^{i\theta}$ may be thought of as a point on the unit circle, that is, $e^{i\theta} = \cos\theta + i \sin\theta$. Hence we may obtain

\[ \sin\theta = \frac{e^{i\theta} - e^{-i\theta}}{2i}, \qquad \cos\theta = \frac{e^{i\theta} + e^{-i\theta}}{2}, \]

where $\sin\theta, \cos\theta$ are the real-valued trigonometric functions, provided we let $c_n \in \mathbb{C}$ and allow the series to extend infinitely in both directions.

Furthermore, let us consider the trigonometric functions $B = \{1, \cos(nx), \sin(nx)\}_{n=1}^{\infty}$ as a “basis” for $V$, some vector space of functions. This is not really a basis, as every element of a vector space should be expressible as a finite linear combination of basis elements, but here we allow an element of the vector space to be written as an infinite linear combination of basis elements (a series). Likewise, $A = \{x^n\}_{n=0}^{\infty}$ is a “basis” for power series.

The basis $B$ is orthogonal; for instance,

\[ \langle \cos(nx), \sin(mx) \rangle = \int_{-\pi}^{\pi} \cos(nx) \sin(mx)\,dx = 0 \]

for all $n, m$. This may be used to compute $a_n, b_n$ in the original series (1), since these coefficients may be calculated by projections onto subspaces. Linear algebra applies to this space.
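These orthogonality relations are easy to test numerically. The Python sketch below approximates the integral over $[-\pi, \pi]$ with a midpoint rule; the helper names and the number of subintervals are assumptions.

```python
import math

def inner(f, g, steps=20000):
    """Midpoint-rule approximation of the integral of f*g over [-pi, pi]."""
    h = 2 * math.pi / steps
    total = 0.0
    for k in range(steps):
        x = -math.pi + (k + 0.5) * h
        total += f(x) * g(x)
    return h * total

def cos_n(n):
    return lambda x: math.cos(n * x)

def sin_m(m):
    return lambda x: math.sin(m * x)

print(inner(cos_n(2), sin_m(3)))   # cosine against sine: approximately 0
print(inner(cos_n(1), cos_n(2)))   # different frequencies: approximately 0
print(inner(cos_n(2), cos_n(2)))   # same function: approximately pi
```

The last value shows that the functions in $B$ are orthogonal but not of unit length, which is why normalising factors appear in the orthonormal systems later on.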

Before we define more rigorously what orthogonality means, we need a definition of inner product, especially if we employ the series in (2) to define Fourier series. In $V = \mathbb{R}^n$, the inner product is $\langle \vec v, \vec w \rangle = \vec v \cdot \vec w$, where $\|\vec v\| = \sqrt{\vec v \cdot \vec v}$. In $V = \mathbb{C}^n$, this does not work. For instance, taking $\vec v = (1, i)$, we obtain $\sqrt{\vec v \cdot \vec v} = \sqrt{1^2 + i^2} = 0$, which is not the length of the vector.

The correct notion of an inner product in a complex vector space is the Hermitian inner product. In a finite-dimensional vector space, we may use $\langle \vec v, \vec w \rangle = \vec v \cdot \overline{\vec w} = \sum_{k=1}^{n} v_k \overline{w_k}$. In a function space for functions defined on $[-\pi, \pi]$, we may define $\langle f, g \rangle = \int_{-\pi}^{\pi} f(x) \overline{g(x)}\,dx$. We can see that taking the conjugate is the correct notion since here:

\[ \|\vec v\|^2 = \langle \vec v, \vec v \rangle = \sum_{k=1}^{n} v_k \cdot \overline{v_k} = \sum_{k=1}^{n} |v_k|^2 \]

where $|v_k|^2$ denotes the squared modulus of the complex number $v_k$. Furthermore, the inner product must satisfy certain linearity assumptions. The Hermitian inner product is linear in $\vec v$ since

\[ \langle a \vec v_1 + b \vec v_2, \vec w \rangle = a \langle \vec v_1, \vec w \rangle + b \langle \vec v_2, \vec w \rangle \]

and anti-linear in $\vec w$ since

\[ \langle \vec v, a \vec w_1 + b \vec w_2 \rangle = \overline{a} \langle \vec v, \vec w_1 \rangle + \overline{b} \langle \vec v, \vec w_2 \rangle. \]

Hence the inner product on $\mathbb{R}^n$ is bilinear, since it is linear in both arguments, but on $\mathbb{C}^n$ it is sesquilinear (or $1\frac{1}{2}$-linear), as this inner product is linear in the first argument but anti-linear in the second. Furthermore, there is a conjugate symmetry in the inner product: $\langle \vec v, \vec w \rangle = \overline{\langle \vec w, \vec v \rangle}$. We may finally define what a Hermitian inner product is:

Definition III.4. A Hermitian inner product is a map $\langle \cdot, \cdot \rangle : V \times V \to \mathbb{C}$, where $V$ is a vector space over $\mathbb{C}$, which is:

1. Sesquilinear, that is, linear in the first argument and anti-linear in the second.

2. Conjugate symmetric: $\langle \vec v, \vec w \rangle = \overline{\langle \vec w, \vec v \rangle}$.

3. Positive definite: the quantity $\langle \vec v, \vec v \rangle$ is real, and $\langle \vec v, \vec v \rangle \geq 0$, with equality only for $\vec v = 0$.
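The three properties can be checked on small vectors in $\mathbb{C}^2$, including the vector $(1, i)$ for which the unconjugated dot product fails. This is a minimal Python sketch; the function names `hermitian` and `naive_dot` are assumptions.

```python
def hermitian(v, w):
    """<v, w> = sum_k v_k * conj(w_k): linear in v, anti-linear in w."""
    return sum(vk * complex(wk).conjugate() for vk, wk in zip(v, w))

def naive_dot(v, w):
    """Real-style dot product without conjugation."""
    return sum(vk * wk for vk, wk in zip(v, w))

v = [1, 1j]
w = [2 - 1j, 3j]

print(naive_dot(v, v))    # 0: useless as a squared length
print(hermitian(v, v))    # 2: |1|^2 + |i|^2

# Conjugate symmetry: <v, w> equals the conjugate of <w, v>.
print(hermitian(v, w), hermitian(w, v).conjugate())

# Anti-linearity in the second argument: <v, i w> = conj(i) <v, w>.
print(hermitian(v, [1j * wk for wk in w]), -1j * hermitian(v, w))
```

The entries are small Gaussian integers, so all comparisons here are exact in floating point.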

Next time, we will show that the basis $B$ of functions is orthogonal with respect to the inner product of functions $\langle f, g \rangle = \int_{-\pi}^{\pi} f(x) \overline{g(x)}\,dx$.


12.2 Orthogonal Bases of Functions

Let $V$ be a vector space of complex-valued functions on $[-\pi, \pi]$, which may be continuous, integrable, or square-integrable (these are functions in $L^2$ space). We will not impose specific conditions on these functions now. Define a Hermitian inner product on the function space as $\langle f, g \rangle = \int_{-\pi}^{\pi} f \overline{g}\,dx$. The inner product defines orthogonality relations in this space and the norm of a function $f$ as $\|f\| = \sqrt{\langle f, f \rangle}$.

Definition III.5. A sequence $\{\varphi_n\}_{n=1}^{\infty}$ in $V$ is an orthonormal system if

\[ \langle \varphi_n, \varphi_m \rangle = \begin{cases} 1 & n = m \\ 0 & n \neq m \end{cases} \]

In other words, the functions, each of length one, are pairwise orthogonal.

For any $f \in V$, write $f(x) \sim \sum_{n=1}^{\infty} c_n \varphi_n$ where $c_n = \langle f, \varphi_n \rangle = \int_{-\pi}^{\pi} f(x) \overline{\varphi_n(x)}\,dx$. The $\sim$ denotes an association of a series with $f$, and does not imply equality. (The series generated may or may not converge to $f$.) For instance, in the case of power series, we may write $e^{-1/x^2} \sim 0$, since $f(x) = e^{-1/x^2}$ has a Taylor series at $x = 0$ which is identically zero, but not equal to the function $f$ in any open interval about 0.

The above definition is motivated by orthonormal bases in $\mathbb{C}^n$. If $\{\vec v_1, \dots, \vec v_n\}$ is an orthonormal basis of $\mathbb{C}^n$, then for any $\vec v \in \mathbb{C}^n$ we may write $\vec v = \sum_{k=1}^{n} c_k \vec v_k$, where $c_k = \langle \vec v, \vec v_k \rangle$.

12.3 Examples of Orthogonal Systems

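1. The sequence of functions $\{\frac{1}{\sqrt{2\pi}}, \frac{\cos(x)}{\sqrt{\pi}}, \frac{\sin(x)}{\sqrt{\pi}}, \frac{\cos(2x)}{\sqrt{\pi}}, \dots\}$ is an orthonormal system, which leads to the series

\[ f(x) \sim a_0 + \sum_{n=1}^{\infty} \big( a_n \cos(nx) + b_n \sin(nx) \big). \]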

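2. The sequence $\{\frac{1}{\sqrt{2\pi}} e^{inx}\}_{n=-\infty}^{\infty}$ is an orthonormal system since

\[ \Big\langle \frac{e^{inx}}{\sqrt{2\pi}}, \frac{e^{imx}}{\sqrt{2\pi}} \Big\rangle = \int_{-\pi}^{\pi} \frac{1}{2\pi} e^{inx}\, \overline{e^{imx}}\,dx = \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{inx} e^{-imx}\,dx = \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{ix(n-m)}\,dx = \begin{cases} \frac{1}{2\pi} \int_{-\pi}^{\pi} dx = \frac{2\pi}{2\pi} = 1 & n = m \\ \frac{1}{2\pi} \frac{e^{ix(n-m)}}{i(n-m)} \Big|_{-\pi}^{\pi} = 0 & n \neq m \end{cases} \]

hence the property of being an orthonormal system is verified. Since $e^{ikx} = \cos(kx) + i \sin(kx)$ and both $\cos$ and $\sin$ are periodic with period $2\pi$, $e^{ikx}$ is also periodic with period $2\pi$. Hence the series $f(x) \sim \sum_{n=-\infty}^{\infty} a_n e^{inx}$ is also periodic with period $2\pi$.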


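3. Using the inner product $\langle f, g \rangle = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x) \overline{g(x)} \, dx$, the sequence $\{e^{inx}\}_{n=-\infty}^{\infty}$ is an orthogonal (in fact orthonormal) sequence with respect to this product. The series is then written as $f(x) \sim \sum_{n=-\infty}^{\infty} c_n e^{inx}$ where $c_n = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x) e^{-inx} \, dx$.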
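The orthonormality computation in item 2 can be checked numerically. The following is a minimal sketch, assuming the unnormalized inner product $\int_{-\pi}^{\pi} f \overline{g} \, dx$ approximated by a midpoint rule; the helper names `inner` and `e` are illustrative and not part of the notes.

```python
import cmath
import math

def inner(f, g, samples=2000):
    """Approximate <f, g> = integral_{-pi}^{pi} f(x) * conj(g(x)) dx (midpoint rule)."""
    h = 2 * math.pi / samples
    total = 0j
    for k in range(samples):
        x = -math.pi + (k + 0.5) * h
        total += f(x) * g(x).conjugate()
    return total * h

def e(n):
    """The normalized exponential e^{inx} / sqrt(2 pi)."""
    return lambda x: cmath.exp(1j * n * x) / math.sqrt(2 * math.pi)

# Print |<e_n, e_m>| for small n, m: the table should look like an identity matrix.
for n in range(-2, 3):
    row = [round(abs(inner(e(n), e(m))), 6) for m in range(-2, 3)]
    print(n, row)
```

The midpoint rule is used only for convenience; any quadrature rule that is accurate on smooth periodic integrands would serve the same purpose.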

Note that Rudin calls any orthogonal series Fourier, whereas sometimes a Fourier series only refers to the series constructed from trigonometric polynomials.

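Example III.1 (Computation of Fourier Series). Let

\[ f(x) = \begin{cases} 1 & x \in [0, \pi] \\ 0 & x \in [-\pi, 0) \end{cases} \]

Expand $f$ as $f(x) \sim \sum_{n=-\infty}^{\infty} c_n e^{inx}$.

The inner product defines

\[ c_n = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x) e^{-inx} \, dx = \frac{1}{2\pi} \int_{0}^{\pi} e^{-inx} \, dx \]

This is $\frac{1}{2}$ when $n = 0$. Otherwise,

\[ c_n = \frac{1}{2\pi} \left. \frac{e^{-inx}}{-in} \right|_{0}^{\pi} = -\frac{e^{-in\pi} - 1}{2\pi i n}. \]

Since $e^{-in\pi} = \cos(-n\pi) + i \sin(-n\pi) = (-1)^n$, then

\[ c_n = -\frac{1}{2\pi i n} \left( (-1)^n - 1 \right) = \begin{cases} 0 & n \text{ even} \\ \frac{1}{\pi i n} & n \text{ odd} \end{cases} \]

Hence

\[ f(x) \sim \frac{1}{2} + \sum_{n \text{ odd}} \frac{1}{\pi i n} e^{inx} \]

Pairing the terms for $n$ and $-n$ gives $\frac{1}{\pi i n} e^{inx} - \frac{1}{\pi i n} e^{-inx} = \frac{2}{\pi n} \cdot \frac{e^{inx} - e^{-inx}}{2i} = \frac{2 \sin(nx)}{\pi n}$, hence the series may be expressed as

\[ f(x) \sim \frac{1}{2} + \sum_{\substack{n > 0 \\ n \text{ odd}}} \frac{2}{\pi n} \sin(nx). \]

The series is periodic with period $2\pi$.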
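The coefficients in this example can be compared against a direct numerical integration. The sketch below assumes a midpoint-rule approximation of $c_n = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x) e^{-inx} \, dx$; the function names are illustrative only.

```python
import cmath
import math

def step(x):
    """The step function of the example: 1 on [0, pi], 0 on [-pi, 0)."""
    return 1.0 if x >= 0 else 0.0

def fourier_coefficient(f, n, samples=20000):
    """Approximate c_n = (1/2pi) * integral_{-pi}^{pi} f(x) e^{-inx} dx (midpoint rule)."""
    h = 2 * math.pi / samples
    total = 0j
    for k in range(samples):
        x = -math.pi + (k + 0.5) * h
        total += f(x) * cmath.exp(-1j * n * x)
    return total * h / (2 * math.pi)

def closed_form(n):
    """Coefficients derived in the example: 1/2 for n = 0, 0 for even n, 1/(pi i n) for odd n."""
    if n == 0:
        return 0.5 + 0j
    if n % 2 == 0:
        return 0j
    return 1 / (math.pi * 1j * n)

def partial_sum(x, terms):
    """1/2 plus the first `terms` sine terms (2/(pi n)) sin(nx), n = 1, 3, 5, ..."""
    return 0.5 + sum(2 / (math.pi * n) * math.sin(n * x) for n in range(1, 2 * terms, 2))

# Numerical coefficient next to the closed form, for a few n.
for n in range(-3, 4):
    print(n, fourier_coefficient(step, n), closed_form(n))
```

Calling `partial_sum(x, terms)` for increasing `terms` at a point inside $(0, \pi)$ shows the values approaching 1, while at $x = 0$ every partial sum is exactly $\frac{1}{2}$.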

Note the following plots of the first few terms of the series:

[Figure 36: three panels on $[-\pi, \pi]$, labelled "1 term", "2 terms" and "3 terms". First few terms of $\frac{1}{2} + \sum_{k=0}^{\infty} \frac{2}{(2k+1)\pi} \sin((2k+1)x)$, a Fourier series for a step function, overlaid with the original function. The original function is plotted in green; the Fourier series is plotted in blue.]

It can be noted from the plots above that the Fourier series does not actually converge to the desired function pointwise, as the value at the endpoints is identically $\frac{1}{2}$ for all partial sums of the series and hence never approaches the value of the original function there. However, the Fourier series converges to $f$ in another sense: in the $L^2$ metric.

12.4 Bessel’s Inequality

12.4.1 The Finite Dimensional Case

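Let $S = \{\vec{v}_1, \ldots, \vec{v}_m\}$ be an orthonormal set in $\mathbb{C}^n$. Consider $\vec{v} \in \mathbb{C}^n$. Set $a_k = \langle \vec{v}, \vec{v}_k \rangle$ and consider the expansion $\vec{w} = \sum_{k=1}^{m} a_k \vec{v}_k$. In the case where $n = m$, the set $S$ is a basis and we know that $\vec{w} = \vec{v}$, as the space is $n$-dimensional.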


Now suppose that $S$ is not a basis of $\mathbb{C}^n$. Then we call $\vec{w}$ the projection of $\vec{v}$ on $\operatorname{Span} S = W$. Note that:

1. $\vec{w}$ is the closest vector to $\vec{v}$ among all vectors in the vector space $W$.

2. $\|\vec{w}\| \leq \|\vec{v}\|$

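We shall derive an inequality from the second claim. By definition, $\|\vec{w}\|^2$ is

\[ \left\langle \sum_{k=1}^{m} a_k \vec{v}_k, \sum_{l=1}^{m} a_l \vec{v}_l \right\rangle = \sum_k \sum_l \langle a_k \vec{v}_k, a_l \vec{v}_l \rangle \]

Manipulating the above sum, we have

\[ \sum_k \sum_l \langle a_k \vec{v}_k, a_l \vec{v}_l \rangle = \sum_k \sum_l a_k \overline{a_l} \langle \vec{v}_k, \vec{v}_l \rangle \]

and since the $\{\vec{v}_k\}$ form an orthonormal set, we have

\[ \langle \vec{v}_k, \vec{v}_l \rangle = \begin{cases} 0 & k \neq l \\ 1 & k = l \end{cases} \]

Hence the above double sum reduces to the single sum:

\[ \sum_{k=1}^{m} a_k \overline{a_k} = \|\vec{w}\|^2 \]

This means that the coefficients are related to the norm of the original vector by

\[ \sum_{k=1}^{m} a_k \overline{a_k} \leq \|\vec{v}\|^2 \]

and furthermore note that the left hand side is the squared norm $\|(a_1, \ldots, a_m)\|^2$ of the coefficient vector in $\mathbb{C}^m$. This is a special case of Bessel's Inequality.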
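As a small sanity check of the finite-dimensional inequality, the sketch below projects a vector of $\mathbb{C}^3$ onto the span of two orthonormal vectors. The particular vectors are chosen only for illustration and are not taken from the notes.

```python
import math

def inner(u, v):
    """Hermitian inner product on C^n: sum of u_k * conj(v_k)."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def norm_sq(u):
    """Squared norm <u, u>, which is always real."""
    return inner(u, u).real

# An orthonormal set of m = 2 vectors in C^3 (illustrative choice).
s = 1 / math.sqrt(2)
v1 = [s + 0j, s * 1j, 0j]
v2 = [s + 0j, -s * 1j, 0j]
basis = [v1, v2]

v = [1 + 2j, 3 - 1j, 2 + 2j]

# Coefficients a_k = <v, v_k> and the projection w = sum_k a_k v_k.
coeffs = [inner(v, vk) for vk in basis]
w = [sum(a * vk[i] for a, vk in zip(coeffs, basis)) for i in range(3)]

coeff_norm_sq = sum(abs(a) ** 2 for a in coeffs)
print(coeff_norm_sq, norm_sq(w), norm_sq(v))
```

The first two printed numbers agree (the coefficient vector has the same squared norm as the projection), and both are no larger than the third.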

12.4.2 Orthogonal Series Case

Recall the case of Fourier Series, in which we have some function space V of functions defined on $[-\pi, \pi]$, and the inner product is

\[ \langle f, g \rangle = \int_{-\pi}^{\pi} f(x) \overline{g(x)} \, dx \]

Let $\{e_n\}_{n=1}^{\infty}$ denote an orthonormal system in this case. Recall that the expansion of $f$ into an orthogonal series is denoted as

\[ f \sim \sum_{n=1}^{\infty} c_n e_n \]

where $c_n = \langle f, e_n \rangle$.

Theorem III.7.

1. For any $N \geq 1$, the partial sum $s_N = \sum_{n=1}^{N} c_n e_n$ is closest to $f$ among all linear combinations $t_N = \sum_{n=1}^{N} d_n e_n$. That is to say that

\[ \left\| f - \sum_{n=1}^{N} c_n e_n \right\| \leq \left\| f - \sum_{n=1}^{N} d_n e_n \right\| \]

with equality if and only if $c_n = d_n$ for all $1 \leq n \leq N$.

2. For all $N \geq 1$: $\sum_{n=1}^{N} c_n \overline{c_n} \leq \|f\|^2$ and furthermore

\[ \sum_{n=1}^{\infty} c_n \overline{c_n} \leq \|f\|^2 \]

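Proof. 1. Note

\begin{align*}
\|f - t_N\|^2 &= \langle f - t_N, f - t_N \rangle \\
&= \langle f, f \rangle - \langle f, t_N \rangle - \langle t_N, f \rangle + \langle t_N, t_N \rangle
\end{align*}

Since

\[ \langle f, t_N \rangle = \left\langle f, \sum_{n=1}^{N} d_n e_n \right\rangle = \sum_n \langle f, d_n e_n \rangle = \sum_n \overline{d_n} c_n \]

\[ \langle t_N, f \rangle = \overline{\langle f, t_N \rangle} = \sum_{n=1}^{N} d_n \overline{c_n} \]

and

\[ \langle t_N, t_N \rangle = \left\langle \sum_n d_n e_n, \sum_m d_m e_m \right\rangle = \sum_n d_n \overline{d_n} \]

then

\[ \|f - t_N\|^2 = \langle f, f \rangle - \sum_{n=1}^{N} \overline{d_n} c_n - \sum_{n=1}^{N} d_n \overline{c_n} + \sum_{n=1}^{N} d_n \overline{d_n} \]

Furthermore, since

\[ (d_n - c_n) \overline{(d_n - c_n)} = d_n \overline{d_n} - c_n \overline{d_n} - d_n \overline{c_n} + c_n \overline{c_n} \]

then

\[ \|f - t_N\|^2 = \langle f, f \rangle + \sum_{n=1}^{N} \left[ (d_n - c_n) \overline{(d_n - c_n)} - c_n \overline{c_n} \right] = \langle f, f \rangle + \sum_{n=1}^{N} |d_n - c_n|^2 - \sum_{n=1}^{N} c_n \overline{c_n} \]

Since $(d_n - c_n) \overline{(d_n - c_n)} = |d_n - c_n|^2 \geq 0$, the right hand side is minimized if and only if $c_n = d_n$ for all $n$, since in that case every $|d_n - c_n|^2 = 0$. This proves Claim 1.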

2. Taking $d_n = c_n$ in the identity above gives $\|f - s_N\|^2 = \langle f, f \rangle - \sum_{n=1}^{N} c_n \overline{c_n}$, so

\[ \sum_{n=1}^{N} c_n \overline{c_n} + \|f - s_N\|^2 = \|f\|^2 \]

Since the norm is always non-negative, then

\[ \sum_{n=1}^{N} c_n \overline{c_n} \leq \|f\|^2. \]

The sequence of partial sums $\sum_{n=1}^{N} c_n \overline{c_n}$ is then a bounded, monotone sequence. This means that as $N \to \infty$, the infinite series converges and

\[ \sum_{n=1}^{\infty} c_n \overline{c_n} \leq \|f\|^2 \]
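Both claims of the theorem can be illustrated numerically. The sketch below is only a check under stated assumptions: it uses the step function of Example III.1, the normalized inner product $\frac{1}{2\pi} \int_{-\pi}^{\pi} f \overline{g} \, dx$ (for which $\{e^{inx}\}$ is orthonormal), a midpoint-rule integral, and an arbitrary perturbation of two coefficients. All names are illustrative.

```python
import cmath
import math

def f(x):
    """Step function of Example III.1."""
    return 1.0 if x >= 0 else 0.0

def c(n):
    """Its Fourier coefficients: 1/2, 0 for even n, 1/(pi i n) for odd n."""
    if n == 0:
        return 0.5 + 0j
    if n % 2 == 0:
        return 0j
    return 1 / (math.pi * 1j * n)

def dist_sq(d, samples=4000):
    """(1/2pi) * integral |f - sum_n d[n] e^{inx}|^2 dx, where d maps n to a coefficient."""
    h = 2 * math.pi / samples
    total = 0.0
    for k in range(samples):
        x = -math.pi + (k + 0.5) * h
        t = sum(dn * cmath.exp(1j * n * x) for n, dn in d.items())
        total += abs(f(x) - t) ** 2
    return total * h / (2 * math.pi)

N = 3
c_N = {n: c(n) for n in range(-N, N + 1)}   # Fourier coefficients
d_N = dict(c_N)                             # a competing trigonometric polynomial
d_N[1] += 0.1
d_N[-2] += 0.05j

best = dist_sq(c_N)     # ||f - s_N||^2
other = dist_sq(d_N)    # ||f - t_N||^2
bessel = sum(abs(v) ** 2 for v in c_N.values())
norm_sq = 0.5           # (1/2pi) * integral of |f|^2 = (1/2pi) * pi

print(best, other)             # Claim 1: best <= other
print(bessel, norm_sq - best)  # Claim 2: these two agree, and bessel <= norm_sq
```

The gap `other - best` equals the sum of $|d_n - c_n|^2$ over the perturbed coefficients, which is exactly the term isolated in the proof of Claim 1.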

The above inequalities allow us to regard Fourier series as a type of projection. Let V be a function space on which there are some restrictions, and let W denote all sequences of complex numbers $\{c_n\}$. Construct a map $V \to W$ by taking $f(x)$ and mapping it to its Fourier coefficients, that is, $(c_n)_n = (\langle f, e_n \rangle)_n$. Both V and W are inner product spaces, where in V the inner product and norm are

\[ \langle f, g \rangle = \int_{-\pi}^{\pi} f \overline{g} \, dx, \qquad \|f\| = \sqrt{\langle f, f \rangle} \]

and in W, the inner product and norm are

\[ \langle (c_n), (d_n) \rangle = \sum_{n=1}^{\infty} c_n \overline{d_n}, \qquad \|(c_n)\| = \sqrt{\langle (c_n), (c_n) \rangle} \]

In some sense, the $c_n$ are then the coordinates of a function in V relative to the orthogonal system $\{e_i\}$, which becomes a "basis" for V.

The Riesz-Fischer theorem states this correspondence more rigorously.

Theorem III.8 (Riesz-Fischer). The map from V to the space $\ell^2 \subset \mathbb{C}^{\infty}$ of square-summable sequences is an isomorphism preserving the inner product, if V is the $L^2$ space of square-integrable functions on $[-\pi, \pi]$, where $\|f\|_2 < \infty$, and $(e_n) = \{e^{inx}\}_{n \in \mathbb{Z}}$.

12.5 Riesz-Fischer Theorem

We will now restrict attention to trigonometric series. Let $f$ be expanded into its Fourier series

\[ f \sim \sum_{n=-\infty}^{\infty} c_n e^{inx} \]

Recall the Riesz-Fischer theorem, which states that there is an isomorphism between the two spaces $L^2$ and $\ell^2$ given by the map $\sim$, which takes a function and sends it to its sequence of Fourier coefficients. $L^2$-space is the set of functions $f$ on $[-\pi, \pi]$ for which $\int_{-\pi}^{\pi} |f|^2 \, dx < \infty$ (although $f$ itself may be unbounded), and $\ell^2$-space is the set of sequences $(c_n)_n$ of complex numbers for which $\sum_n |c_n|^2 < \infty$.

The Riesz-Fischer theorem says that the two inner products correspond. That is, taking $(c_n), (d_n)$ to be the Fourier coefficients of $f, g$ respectively, then

\[ \langle f, g \rangle = \frac{1}{2\pi} \int_{-\pi}^{\pi} f \overline{g} \, dx = \sum_{n=-\infty}^{\infty} c_n \overline{d_n} = \langle (c_n), (d_n) \rangle \]

and furthermore, the norms correspond. That is,

\[ \|f\| = \|(c_n)\| = \sqrt{\sum_n c_n \overline{c_n}} \]

We can then conclude that $\{e^{inx}\}$ is an orthonormal basis of $L^2$ space. The corresponding sequences in $\ell^2$ space under this isomorphism are the coordinates of $f \in L^2$ with respect to this basis. This generalizes the projection onto subspaces from the finite-dimensional case.

13 Convergence of Fourier Series

13.1 L2 convergence of Fourier Series

We shall now discuss the convergence of Fourier series. Define the $N$th partial sum of the Fourier series of $f$ to be $s_N(f, x) = \sum_{n=-N}^{N} c_n e^{inx}$. The Fourier series may converge to $f$ in two ways:

1. For any ($L^2$-) integrable $f$, the Fourier series converges in $L^2$. That is, $\|f - s_N\|_2 \to 0$.

2. The pointwise convergence of a Fourier series to $f$ requires a Lipschitz condition.

Before proving the convergence of Fourier series, we first note what the Stone-Weierstrass Theorem has to say about uniform approximation of functions by trigonometric polynomials.

Suppose $A = \left\{ \sum_{n=-N}^{N} c_n e^{inx} \right\}$ is the set of trigonometric polynomials (each is a finite sum, where $N \geq 0$ and $c_n \in \mathbb{C}$). We claim that A is (uniformly) dense in the space $C[-\pi, \pi]$ of continuous functions on $[-\pi, \pi]$ that are $2\pi$-periodic ($f(-\pi) = f(\pi)$).

Theorem III.9. For any $2\pi$-periodic continuous function $f$, and for every $\varepsilon > 0$, there exists a trigonometric polynomial $\sum_{n=-N}^{N} c_n e^{inx}$ such that $\left| f(x) - \sum_{n=-N}^{N} c_n e^{inx} \right| < \varepsilon$ for every $x$.

Proof Sketch. We need to check the following conditions:

1. A is a unital C-algebra.

2. A separates points in [−π, π], regarding −π = π. The function eix works.

3. A is closed under conjugation.

Hence if $f$ is continuous and $2\pi$-periodic, the Stone-Weierstrass theorem says that there exists a sequence of trigonometric polynomials $p_n$ which approach $f$ uniformly.

However, this does not guarantee that the Fourier series of $f$ converges to $f$. There was a similar situation with Taylor series. Consider the function $f(x) = e^{-1/x^2}$ defined on $[-1, 1]$ (with $f(0) = 0$). Since this is continuous, Weierstrass' theorem says that there exists a sequence of polynomials $p_n(x)$ which converge to $f$ uniformly on $[-1, 1]$. But the Taylor series of $f$ at zero, which is identically zero, does not converge to $f$. This is because the sequence of polynomials $p_n$ given to us by Weierstrass' theorem is not the sequence of partial sums of a single series: the lower order terms may change as $n \to \infty$. However, we may use the above result in establishing the convergence of Fourier series. Our first theorem establishes $L^2$ convergence of Fourier series for Riemann-integrable functions.

Theorem III.10 (Parseval's Theorem). Consider Riemann-integrable and $2\pi$-periodic functions $f, g$ on $[-\pi, \pi]$.

1. The Fourier series $s_N(f, x)$ converges to $f$ in $L^2$.

2. The $L^2$ inner product of functions corresponds to the $\ell^2$ inner product of sequences. That is, $\langle f, g \rangle = \langle (c_n), (d_n) \rangle$.

Proof of Part 1. Let $f \in \mathcal{R}$. By the homework, for every $\varepsilon > 0$, there exists a continuous $h$ for which $\|f - h\|_2 < \varepsilon$.

The Stone-Weierstrass theorem says that there exists a trigonometric polynomial $P$ such that $\|h - P\|_2 < \varepsilon$, since $h$ is continuous. Hence by the triangle inequality, $\|f - P\|_2 < 2\varepsilon$.

Suppose $P$ has degree $N$ (that is, $P = \sum_{n=-N}^{N} a_n e^{inx}$ for some coefficients $a_n$). Then by the best-approximation property in Theorem III.7, among all trigonometric polynomials of degree $N$, the partial sum of the Fourier series $s_N(f, x)$ is closest to $f$. Hence

\[ \|f - s_N\|_2 \leq \|f - P\|_2 < 2\varepsilon \]

For all $M > N$, the same inequality holds, since $s_N$ is itself a trigonometric polynomial of degree at most $M$. Hence in the $L^2$ metric,

\[ \|f - s_M\| \leq \|f - s_N\| \leq \|f - P\| < 2\varepsilon \]

This shows that the Fourier series of $f$ converges in $L^2$, since for all $\varepsilon$, there exists $N$ such that $\|f - s_M\| < 2\varepsilon$ for all $M \geq N$. Note, however, that this theorem does not tell us anything about pointwise convergence of the series.

Recall the theorem stated last class about the L2 convergence of Fourier Series.

Theorem III.11. Suppose $f, g \in \mathcal{R}$, periodic on $[-\pi, \pi]$.

1. The Fourier series of $f$ converges to $f$ in $L^2$. That is,

\[ \|f - s_N(f, x)\|_2 \to 0 \]

as $N \to \infty$, where $s_N(f, x)$ denotes the $N$th partial sum in the Fourier series of $f$.

2. The $L^2$ inner product corresponds to the $\ell^2$ inner product. That is, if $f \sim \sum_n c_n e^{inx}$ and $g \sim \sum_n d_n e^{inx}$, then $\langle f, g \rangle = \langle (c_n), (d_n) \rangle$.

This is known as Parseval's equality.

More generally, the above theorem is true if $f, g \in L^2[-\pi, \pi]$, and not just if $f, g$ are Riemann integrable.

Proof Sketch for (1). If $f$ is Riemann-integrable, it may be approximated in $L^2$ by a continuous function $h$. Since $h$ is continuous, it may be approximated uniformly, by Stone-Weierstrass, by a trigonometric polynomial $t_N$. Hence

\[ \|f - t_N\|_2 < \varepsilon \]

in $L^2$. By the best-approximation property (Theorem III.7), we may conclude that the Fourier partial sum $s_N$ is closest to $f$ in $L^2$ among all trigonometric polynomials of degree at most $N$. Hence

\[ \|f - s_N\| \leq \|f - t_N\| < \varepsilon \]

which proves $L^2$ convergence, since we have demonstrated that for any $\varepsilon$ there exists $N$ such that $n \geq N$ implies $\|f - s_n\| < \varepsilon$.


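Proof for (2) using a Formal Computation. Note firstly that $\{e^{inx}\}$ is a set of functions which is an orthonormal basis. Then because of the $L^2$ convergence proved in (1):

\begin{align*}
\langle f, g \rangle &= \left\langle \sum_n c_n e^{inx}, \sum_m d_m e^{imx} \right\rangle \\
&= \sum_n c_n \overline{d_n} \langle e^{inx}, e^{inx} \rangle \\
&= \langle (c_n), (d_n) \rangle
\end{align*}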

Note, however, that the above proof assumed that

\[ \frac{1}{2\pi} \int_{-\pi}^{\pi} \sum_n c_n e^{inx} \, \overline{\sum_m d_m e^{imx}} \, dx = \sum_n \sum_m \frac{1}{2\pi} \int_{-\pi}^{\pi} c_n e^{inx} \, \overline{d_m e^{imx}} \, dx \]

The switching of the integral and the series may not be justified here since we have an infinite series. Instead, we may repeat the above proof by taking the limit of partial sums.

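Proof. Define the partial sum $s_N(f, x) = \sum_{n=-N}^{N} c_n e^{inx}$. Then

\begin{align*}
\langle s_N(f, x), g \rangle &= \frac{1}{2\pi} \int_{-\pi}^{\pi} \sum_{n=-N}^{N} c_n e^{inx} \overline{g(x)} \, dx \\
&= \sum_{n=-N}^{N} c_n \underbrace{\frac{1}{2\pi} \int_{-\pi}^{\pi} e^{inx} \overline{g(x)} \, dx}_{\text{conjugate of the Fourier coefficient of } g} \\
&= \sum_{n=-N}^{N} c_n \overline{d_n}
\end{align*}

Hence as $N \to \infty$, $\langle s_N, g \rangle \to \langle (c_n), (d_n) \rangle$. To show the stated equality, we then need to show that $\langle s_N, g \rangle - \langle f, g \rangle = \langle s_N - f, g \rangle$ goes to zero as $N \to \infty$. This follows from

\[ |\langle s_N, g \rangle - \langle f, g \rangle| = |\langle s_N - f, g \rangle| \leq \|s_N - f\| \, \|g\| \]

by applying the Schwarz inequality. As $\|s_N - f\| \to 0$ by $L^2$ convergence of the Fourier series and $\|g\|$ is a constant, $\langle s_N - f, g \rangle \to 0$.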

This is a weaker form of the Riesz-Fischer theorem, which states that the map $f \mapsto (c_n)$ from the space of $L^2$-integrable functions on $[-\pi, \pi]$ to $\ell^2$, where $(c_n)$ is the sequence of Fourier coefficients, is an isomorphism of inner product spaces.

Furthermore, convergence for general orthogonal series depends on the completeness of the orthogonal set in question. For instance, for Fourier series, we take $\{\varphi_n\} = \{\sin(nx), \cos(nx)\}$, but if we restrict to $\{\cos(nx)\}$, only symmetric (even) functions $f$ can be expanded in a series with only cosine terms in it. Similarly, if we use $\{\sin(nx)\}$ as the orthogonal set, then only anti-symmetric (odd) functions can be expanded in a series with sine terms.


13.2 Pointwise Convergence of Fourier Series

We have previously established $L^2$ convergence for the Fourier series, although $L^2$ convergence does not guarantee pointwise convergence. We cannot say that $\sum_n c_n e^{inx} \to f(x)$ for every $x$ in general, even when $f$ is continuous. An example where the convergence of a Fourier series to $f$ breaks down near a discontinuity is the Gibbs phenomenon:

[Figure 37: Example of the Gibbs Phenomenon for a Square Wave. The overshoots appear next to the points of discontinuity, and their peaks lie approximately on the line $y = 1.09$.]

To prove that a certain Fourier series converges to the function pointwise, we will express the partial sums $s_N$ as a convolution $s_N = \frac{1}{2\pi} f * D_N$, where $D_N$ is known as the Dirichlet kernel. This is similar to how we proved the Weierstrass approximation theorem, where we expressed polynomials approaching a function as a convolution. Let the partial sum $s_N$ be expressed as

\[ s_N = \sum_{n=-N}^{N} c_n e^{inx} = \sum_{n=-N}^{N} \frac{1}{2\pi} \left( \int_{-\pi}^{\pi} f(t) e^{-int} \, dt \right) e^{inx} = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t) \sum_{n=-N}^{N} e^{in(x-t)} \, dt \]


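If we let $D_N(x) = \sum_{n=-N}^{N} e^{inx}$, then we have expressed $s_N$ as

\[ s_N = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t) D_N(x - t) \, dt = \frac{1}{2\pi} f * D_N. \]

To obtain a more explicit form for $D_N$, we first multiply both sides of its definition by $e^{ix} - 1$ to obtain

\[ (e^{ix} - 1) D_N(x) = \sum_{n=-N}^{N} e^{i(n+1)x} - \sum_{n=-N}^{N} e^{inx} = e^{i(N+1)x} - e^{-iNx} \tag{1} \]

since the sum on the right hand side telescopes. Recalling that $\sin x = \frac{e^{ix} - e^{-ix}}{2i}$, multiplying both sides of (1) by $\frac{e^{-ix/2}}{2i}$ yields

\[ \frac{e^{ix/2} - e^{-ix/2}}{2i} D_N(x) = \frac{e^{i(N + \frac{1}{2})x} - e^{-i(N + \frac{1}{2})x}}{2i} \]

Hence we can conclude

\[ \sin\left(\frac{x}{2}\right) D_N(x) = \sin\left(\left(N + \frac{1}{2}\right) x\right) \]

Hence

\[ D_N(x) = \frac{\sin\left(\left(N + \frac{1}{2}\right) x\right)}{\sin\left(\frac{x}{2}\right)} \]

is an explicit form of the Dirichlet kernel.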
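The closed form can be compared with the defining sum at a few sample points. This is a minimal sketch; the function names are illustrative, and the value $2N + 1$ used where $\sin(x/2) = 0$ is the limit of the closed form there (it is also what the defining sum gives at $x = 0$).

```python
import cmath
import math

def dirichlet_sum(N, x):
    """D_N(x) as the defining sum of e^{inx}, n = -N..N (the imaginary parts cancel)."""
    return sum(cmath.exp(1j * n * x) for n in range(-N, N + 1)).real

def dirichlet_closed(N, x):
    """Closed form sin((N + 1/2) x) / sin(x / 2), with the limit 2N + 1 where sin(x/2) = 0."""
    s = math.sin(x / 2)
    if abs(s) < 1e-12:
        return 2.0 * N + 1.0
    return math.sin((N + 0.5) * x) / s

# The two columns should agree up to rounding error.
for x in (0.0, 0.3, 1.0, 2.5, -2.0):
    print(x, dirichlet_sum(5, x), dirichlet_closed(5, x))
```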

[Figure 38: Plot of the Dirichlet kernel $D_N(x)$ for some $N$. The value $2N + 1$, the height of the peak at $x = 0$, is marked on the plot.]

The Dirichlet kernel in some sense approaches the delta function. Consider expanding $f(x) = \delta_0$ into a Fourier series $\sum_n c_n e^{inx}$. Then

\[ c_n = \frac{1}{2\pi} \int_{-\pi}^{\pi} \delta_0 \, e^{-inx} \, dx = \frac{1}{2\pi} \]

This means that

\[ \frac{1}{2\pi} \sum_{n=-N}^{N} e^{inx} \to \delta_0 \]

as $N \to \infty$, hence $\frac{1}{2\pi} D_N(x)$ is the $N$th partial sum of the Fourier series of $\delta_0$. This mimics our proof of the Weierstrass approximation theorem, since we are picking smooth functions which approach the delta function, and convolving a function we want to approximate with them to produce a sequence of functions approaching the original function.

Recall the proof of the Weierstrass approximation theorem, where we picked polynomials $g_n$ such that $g_n \to \delta_0$ and produced a sequence of polynomials $P_n = f * g_n$ which uniformly approximate $f$ as $n \to \infty$. In the case of Fourier series, our partial sum $s_N$ is the convolution $s_N = \frac{1}{2\pi} f * D_N$, where $D_N$ is the Dirichlet kernel

\[ D_N(x) = \sum_{n=-N}^{N} e^{inx} = \frac{\sin\left(\left(N + \frac{1}{2}\right) x\right)}{\sin\left(\frac{1}{2} x\right)}. \]

We want to show that $s_N \to f$ as $N \to \infty$. Recall that $\frac{1}{2\pi} D_N$ in some sense is the partial sum of the Fourier series of a delta function. It has the same property as the delta function that

\[ \frac{1}{2\pi} \int_{-\pi}^{\pi} D_N(x) \, dx = 1 \]

since

\[ \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{inx} \, dx = \begin{cases} 1 & \text{if } n = 0 \\ 0 & \text{otherwise} \end{cases} \]
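Both the normalization of $D_N$ and the convolution formula for $s_N$ can be checked numerically. The sketch below assumes the step function of Example III.1 as the test function, a midpoint-rule integral, and the real form $D_N(x) = 1 + 2 \sum_{n=1}^{N} \cos(nx)$ obtained by pairing $e^{inx}$ with $e^{-inx}$; the helper names are illustrative.

```python
import math

def dirichlet(N, x):
    """D_N(x) = sum_{n=-N}^{N} e^{inx} = 1 + 2 * sum_{n=1}^{N} cos(nx)."""
    return 1.0 + 2.0 * sum(math.cos(n * x) for n in range(1, N + 1))

def step(t):
    """Step function of Example III.1: 1 on [0, pi], 0 on [-pi, 0)."""
    return 1.0 if t >= 0 else 0.0

def midpoint(func, a, b, samples=20000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / samples
    return h * sum(func(a + (k + 0.5) * h) for k in range(samples))

def partial_sum_by_convolution(N, x):
    """s_N(f, x) = (1/2pi) * integral_{-pi}^{pi} f(t) D_N(x - t) dt."""
    integral = midpoint(lambda t: step(t) * dirichlet(N, x - t), -math.pi, math.pi)
    return integral / (2 * math.pi)

def partial_sum_by_series(N, x):
    """1/2 + sum over odd n <= N of (2/(pi n)) sin(nx), from Example III.1."""
    return 0.5 + sum(2 / (math.pi * n) * math.sin(n * x) for n in range(1, N + 1, 2))

N, x = 5, 1.0
normalization = midpoint(lambda t: dirichlet(N, t), -math.pi, math.pi) / (2 * math.pi)
print(normalization)
print(partial_sum_by_convolution(N, x), partial_sum_by_series(N, x))
```

The first printed value is the normalized integral of $D_N$, and the second line shows the partial sum computed once as a convolution and once from the series.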

Theorem III.12 (Fourier Convergence Theorem). Suppose $f$ is periodic ($f(-\pi) = f(\pi)$) and fix some $x \in [-\pi, \pi]$. Then $s_N(f, x) \to f(x)$ as $N \to \infty$ if $f$ is Lipschitz at $x$. This means that there exist $M, \delta > 0$ such that

\[ |f(x + t) - f(x)| \leq M |t| \]

for $|t| < \delta$.

Corollary III.4. If f ′ exists at x, then the Fourier series converges to f at x (sN (f, x)→ f(x))

Corollary III.5. If f ′ is continuous on [−π, π], then sN converges to f uniformly.

Proof. We need to show that $|s_N(f, x) - f(x)| \to 0$. We may write the convolution

\[ \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t) D_N(x - t) \, dt = -\frac{1}{2\pi} \int_{x+\pi}^{x-\pi} f(x - u) D_N(u) \, du \]

by making the substitution $t = x - u$. As the integrand is $2\pi$-periodic by assumption, the convolution is then $\frac{1}{2\pi} \int_{-\pi}^{\pi} f(x - t) D_N(t) \, dt$. And since

\[ f(x) = f(x) \frac{1}{2\pi} \int_{-\pi}^{\pi} D_N(t) \, dt \]

we may write

\begin{align*}
s_N(f, x) - f(x) &= \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x - t) D_N(t) \, dt - \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x) D_N(t) \, dt \\
&= \frac{1}{2\pi} \int_{-\pi}^{\pi} [f(x - t) - f(x)] D_N(t) \, dt \\
&= \frac{1}{2\pi} \int_{-\pi}^{\pi} [f(x - t) - f(x)] \frac{\sin\left(\left(N + \frac{1}{2}\right) t\right)}{\sin\left(\frac{1}{2} t\right)} \, dt
\end{align*}

by definition of the Dirichlet kernel. Since

\[ \sin\left(\left(N + \frac{1}{2}\right) t\right) = \sin(Nt) \cos\left(\frac{1}{2} t\right) + \cos(Nt) \sin\left(\frac{1}{2} t\right), \]

the above integral may be decomposed into two parts:

\[ \frac{1}{2\pi} \int_{-\pi}^{\pi} \underbrace{\frac{f(x - t) - f(x)}{\sin\left(\frac{t}{2}\right)} \cos\left(\frac{t}{2}\right)}_{g(t)} \sin(Nt) \, dt + \frac{1}{2\pi} \int_{-\pi}^{\pi} \underbrace{[f(x - t) - f(x)]}_{h(t)} \cos(Nt) \, dt \]

We may notice now that the first integral yields (up to a constant factor) the $N$th Fourier coefficient $b_N$ of $g(t) = \frac{f(x - t) - f(x)}{\sin(t/2)} \cos\left(\frac{t}{2}\right)$ when it is expanded as $g \sim \sum_n b_n \sin(nt)$, and that the second integral yields (up to a constant factor) the $N$th Fourier coefficient $a_N$ of $h(t) = f(x - t) - f(x)$ when it is expanded as $h \sim \sum_n a_n \cos(nt)$.

Then recall Bessel's inequality, which states that for an orthonormal system $\{\varphi_n\}$, if $f \sim \sum_n c_n \varphi_n$, then $\|f\|^2 \geq \|(c_n)\|^2 = \sum_n |c_n|^2$, where the two-norms of the function and of the sequence of Fourier coefficients are considered. If $\|f\|$ is finite, the infinite series $\sum_n |c_n|^2$ converges, and hence we may conclude that $|c_n|^2 \to 0$ and $c_n \to 0$.

Hence if $g, h$ are bounded, we get that $b_N \to 0$ and $a_N \to 0$. Hence we get the desired result, since the difference $s_N(f, x) - f(x)$ is a fixed linear combination of $a_N$ and $b_N$, which tends to 0 as $N \to \infty$.

Since $h$ is continuous at $t = 0$ (by the Lipschitz condition on $f$ at $x$), we have that $h$ is bounded in an interval about $t = 0$. It remains to show that $g$ is bounded. For $t \in (-\delta, \delta)$, $t \neq 0$, and by the Lipschitz condition:

\[ |g(t)| = \left| \frac{f(x - t) - f(x)}{\sin\left(\frac{t}{2}\right)} \right| \left| \cos\left(\frac{t}{2}\right) \right| \leq \left| \frac{f(x - t) - f(x)}{\sin\left(\frac{t}{2}\right)} \right| \leq \frac{M |t|}{\left| \sin\left(\frac{t}{2}\right) \right|} \approx \frac{M |t|}{\frac{|t|}{2}} = 2M \]

Hence $g$ is bounded near $t = 0$. To show that $g$ is integrable, we furthermore need to show that it is integrable on every interval $[-\pi, -\varepsilon]$ (and $[\varepsilon, \pi]$) avoiding the singularity, and then argue that the singularity at $t = 0$ becomes removable upon integration.

Corollary III.6 (The Localization Property). Assume that $f$ is zero on $(x_0 - \delta, x_0 + \delta)$, but may take any value outside that interval. Then $s_N(f, x) \to f(x) = 0$ on $(x_0 - \delta, x_0 + \delta)$.

The corollary follows from the Lipschitz property of $f$ on $(x_0 - \delta, x_0 + \delta)$. The localization property implies, then, that if $f(x) = g(x)$ on an interval $(x_0 - \delta, x_0 + \delta)$, then $s_N(f, x) \to f(x)$ if and only if $s_N(g, x) \to g(x)$ on that interval, since $f - g = 0$ on that interval and we may apply the corollary to $f - g$.

The study of Fourier series then leads into wavelet theory and additional applications in signal processing.
