
mlbaker.org presents

PMATH 453/753 Functional Analysis

Dr. Nicolaas Spronk • Fall 2012 (1129) • University of Waterloo

www.math.uwaterloo.ca/~nspronk

Disclaimer: These notes are provided as-is, and may be incomplete or contain errors.

Contents

1 Normed spaces and Banach spaces

2 Topological spaces and continuous functions

3 ℓp-spaces

4 Linear operators on normed spaces

5 Separation by hyperplanes

6 Consequences of Baire Category Theorem

7 Finite-dimensional Banach spaces

8 Initial topologies, compactness

9 An application of ultrafilters: ultrafilter limits

10 Nets

11 Extreme points and the Krein-Milman theorem

12 Euclidean and Hilbert spaces

13 Spectral theory

14 Adjoint operators

15 Compact operators

16 Structure theorem for compact operators

17 Operators on Hilbert space

1 Normed spaces and Banach spaces

Let F be R or C. Define

|x| = absolute value of x if x ∈ R, and |x| = modulus of x if x ∈ C.

1.1 Definition. Let X be an F-vector space. A norm on X is a functional ‖ · ‖ : X → R≥0, such that

(i) ‖x‖ = 0 if and only if x = 0 (non-degenerate).

(ii) ‖αx‖ = |α| · ‖x‖, for all α ∈ F (| · |-homogeneity).

(iii) ‖x+ y‖ ≤ ‖x‖+ ‖y‖ (subadditivity).

We call the pair (X, ‖ · ‖) a normed (vector) space. Often, when no ambiguity arises, we simply say “X is a normed space”. If (X, ‖ · ‖) is complete with respect to the metric d(x, y) = ‖x − y‖ induced by ‖ · ‖ (i.e. each Cauchy sequence in X has a limit in X) we say that (X, ‖ · ‖) is a Banach space.

1.2 Example. We have the following examples:

(i) (F, | · |) is a Banach space¹.

(ii) (F^n, ‖ · ‖2) is a Banach space, where

‖(x1, . . . , xn)‖2 = (∑_{i=1}^n |xi|^2)^{1/2}.

(iii) Let (X, d) be a metric space. Define

Cb(X) = C^F_b(X) = {f : X → F : f is continuous and bounded, i.e. ‖f‖∞ = sup_{x∈X} |f(x)| < ∞}.

Then (Cb(X), ‖ · ‖∞) is a Banach space².

(iv) Let 1 ≤ p ≤ ∞. (Lp[0, 1], ‖ · ‖p) is a Banach space³.

(v) On C[0, 1] (note, since ([0, 1], usual metric) is compact, the “b” for boundedness is redundant), we can put, for 1 ≤ p < ∞, the norm

‖f‖p = (∫_0^1 |f|^p)^{1/p}

where the integral is a Riemann integral. (C[0, 1], ‖ · ‖p) is a normed space. It is not complete. Indeed, consider the sequence of functions (for n ≥ 2)

fn(t) = 0 if 0 ≤ t ≤ 1/2,
fn(t) = n(t − 1/2) (the affine bit) if 1/2 < t < 1/2 + 1/n,
fn(t) = 1 if 1/2 + 1/n ≤ t ≤ 1.

This sequence is ‖ · ‖p-Cauchy but converges to no g ∈ C[0, 1], i.e. for no such g do we have lim_n ‖fn − g‖p = 0.
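The Cauchy claim can be checked numerically. The following is a minimal sketch, assuming the affine part of fn is n(t − 1/2) and using a midpoint Riemann sum; the function names and the step count are illustrative choices, not part of the notes. For p = 1 and m > n a direct computation gives ‖fn − fm‖1 = 1/(2n) − 1/(2m), which the sketch reproduces.

```python
def f(n, t):
    """The ramp function f_n, with affine part n*(t - 1/2) (an assumed choice)."""
    if t <= 0.5:
        return 0.0
    if t >= 0.5 + 1.0 / n:
        return 1.0
    return n * (t - 0.5)


def p_norm_of_difference(n, m, p=1.0, steps=20_000):
    """Midpoint-rule approximation of ||f_n - f_m||_p on [0, 1]."""
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h  # midpoint of the k-th subinterval
        total += abs(f(n, t) - f(m, t)) ** p
    return (total * h) ** (1.0 / p)
```

Calling `p_norm_of_difference(n, 2 * n)` for growing n gives values shrinking like 1/(4n), consistent with the sequence being Cauchy even though its pointwise limit is a step function.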

¹ M147 in the case of R, M247 in the case of C ≅ R^2. ² PM351. ³ PM450/354.

(vi) Define ℓ = {x = (x1, x2, . . .) ∈ F^N : xi = 0 for all but finitely many indices i}.

This is an F-vector space under pointwise operations

(x1, x2, . . .) + (y1, y2, . . .) = (x1 + y1, x2 + y2, . . .), α(x1, x2, . . .) = (αx1, αx2, . . .).

We can put many norms on ℓ:

‖x‖∞ = sup_{i∈N} |xi|,  ‖x‖2 = (∑_{i=1}^∞ |xi|^2)^{1/2},  ‖x‖1 = ∑_{i=1}^∞ |xi|.

Note that all sums are finite by definition of ℓ. Convince yourself that none of (ℓ, ‖ · ‖∞), (ℓ, ‖ · ‖2), (ℓ, ‖ · ‖1) are complete.

2 Topological spaces and continuous functions

2.1 Definition. Given a non-empty set X, define its power set, P(X) = {A : A ⊂ X}. A topology on X is a subfamily τ ⊂ P(X) such that

(i) ∅, X ∈ τ.

(ii) {Ui}i∈I ⊂ τ implies that ⋃_{i∈I} Ui ∈ τ (closure under arbitrary unions).

(iii) U1, . . . , Un ∈ τ implies that ⋂_{i=1}^n Ui ∈ τ (closure under finite intersections).

We call the elements of τ (τ-)open sets. The pair (X, τ) is called a topological space.


2.2 Example.

(i) τtriv = {∅, X} is called the trivial topology.

(ii) Let (X, ρ) be a metric space, i.e. ρ : X ×X → R≥0 satisfies

(i) ρ(x, y) = 0 if and only if x = y (non-degeneracy).

(ii) ρ(x, y) = ρ(y, x) (symmetry).

(iii) ρ(x, z) ≤ ρ(x, y) + ρ(y, z) (triangle inequality).

For ε > 0, we define the open ball of radius ε centered at x by Bε(x) = {x′ ∈ X : ρ(x, x′) < ε}. We let

τρ = {U ⊂ X : for each x ∈ U, there is εx > 0 such that Bεx(x) ⊂ U}.

(iii) τd = P(X), called the discrete topology. Notice that τd = τρ, where ρ is the discrete metric, i.e.

ρ(x, y) = 1 if x ≠ y, and ρ(x, y) = 0 if x = y.

(iv) The Sorgenfrey line: Take X = R. Let

σ = {U ⊂ R : for each x ∈ U, there is εx > 0 such that [x, x + εx) ⊂ U}.

Show there is no metric ρ such that σ = τρ.


(v) Recall, if ρ1, ρ2 are metrics on X, we say ρ1 and ρ2 are equivalent (denoted ρ1 ∼ ρ2) if there exist c, C > 0 such that

cρ1 ≤ ρ2 ≤ Cρ1.

If ρ1 ∼ ρ2, then we have τρ1 = τρ2. However, we may have τρ1 = τρ2 without ρ1 ∼ ρ2. As an example, consider the metrics on R given by

ρ1(s, t) = |s − t|,  ρ2(s, t) = |s − t| / (1 + |s − t|).

The latter is bounded, whereas the former is unbounded, so there is no way these could be equivalent. However, they induce the same topology on R.
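A small numerical sketch of this example, assuming nothing beyond the two formulas above (the helper names are illustrative): it spot-checks the triangle inequality for ρ2 on a finite set of points and shows that the ratio ρ2/ρ1 = 1/(1 + |s − t|) tends to 0, so no constant c > 0 can satisfy cρ1 ≤ ρ2.

```python
import itertools


def rho1(s, t):
    """The usual metric on R."""
    return abs(s - t)


def rho2(s, t):
    """The bounded metric |s - t| / (1 + |s - t|)."""
    d = abs(s - t)
    return d / (1.0 + d)


def triangle_inequality_holds(rho, points, tol=1e-12):
    """Check rho(x, z) <= rho(x, y) + rho(y, z) on all triples from `points`."""
    return all(
        rho(x, z) <= rho(x, y) + rho(y, z) + tol
        for x, y, z in itertools.product(points, repeat=3)
    )


def ratio(t):
    """rho2(0, t) / rho1(0, t), which equals 1 / (1 + |t|) for t != 0."""
    return rho2(0.0, t) / rho1(0.0, t)
```

A finite check of this kind is only evidence, not a proof; the triangle inequality for ρ2 follows from the fact that u ↦ u/(1 + u) is increasing and subadditive on [0, ∞).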

Assignment #1 is posted; google “Nico Spronk”, hit PMATH 753 tab under Teaching – Fall 2012. I would advise trying most questions after Friday’s lecture.

2.3 Definition. Let (X, τ), (Y, σ) be topological spaces. A function f : X → Y is (τ-σ-)continuous at x0 ∈ X if for each neighbourhood⁴ V of f(x0) (i.e. f(x0) ∈ V ∈ σ), there is a neighbourhood U of x0 (i.e. x0 ∈ U ∈ τ) such that f(U) ⊂ V, i.e. f(x) ∈ V for each x ∈ U.

2.4 Remark. Note that in metric topology, V plays the role of ε and U plays the role of δ.

2.5 Definition. Let F = R or C. Let

Cb(X, τ) = C^F_b(X, τ) = {f : X → F | f continuous, and bounded: ‖f‖∞ = sup_{x∈X} |f(x)| < ∞}.

It is straightforward to prove that under pointwise operations, i.e.

(f + g)(x) = f(x) + g(x), (αf)(x) = αf(x) ∀x ∈ X

for α ∈ F, f, g ∈ Cb(X, τ), this is an F-vector space, and ‖ · ‖∞ is a norm on Cb(X, τ).

It is really the completeness of the codomain that drives the following result.

2.6 Theorem. (Cb(X, τ), ‖ · ‖∞) is a Banach space, i.e. Cb(X, τ) is complete with respect to ‖ · ‖∞.

Proof. Let (fn)_{n=1}^∞ ⊂ Cb(X, τ) be a Cauchy sequence. Then for all ε > 0, there is N ∈ N such that for all n, m ≥ N we have ‖fn − fm‖∞ < ε. Then for each fixed x ∈ X we observe that |fn(x) − fm(x)| ≤ ‖fn − fm‖∞, hence (fn(x))_{n=1}^∞ ⊂ F is Cauchy. Since (F, | · |) is complete, we can define

f(x) = lim_n fn(x)

for all x ∈ X. We will show that

sup_{x∈X} |fn(x) − f(x)| =: ‖fn − f‖∞ → 0.

Let ε > 0. Choose N ∈ N such that if n, m ≥ N then ‖fn − fm‖∞ < ε. Observe that for x ∈ X and n ≥ N, we have

|fn(x) − f(x)| = lim_m |fn(x) − fm(x)| ≤ ε.

It follows that fn → f uniformly on X. Also, let us note that f is bounded: we can pick ε = 1 and find some N so that ‖fN − f‖∞ ≤ ε. However, this means that for any x ∈ X we have

|f(x)| ≤ |f(x) − fN(x)| + |fN(x)| ≤ 1 + ‖fN‖∞.

Now we show f is continuous. Fix an arbitrary x0 ∈ X, ε > 0. Let n0 be such that for n, m ≥ n0 we have ‖fn − fm‖∞ < ε/3. Notice that

‖fn − f‖∞ = lim_m ‖fn − fm‖∞ ≤ ε/3, for n ≥ n0.

Let x0 ∈ U ∈ τ be such that

fn0(U) ⊂ B_{ε/3}(fn0(x0)) ⊂ F

(we can do this by definition of continuity for fn0). Then for x ∈ U we have

|f(x) − f(x0)| ≤ |f(x) − fn0(x)| + |fn0(x) − fn0(x0)| + |fn0(x0) − f(x0)| < ε/3 + ε/3 + ε/3 = ε

and hence f(U) ⊂ Bε(f(x0)). Hence f is continuous at x0. Thus f is continuous at each point in X.

⁴ Here, a neighbourhood of x is defined to be an open set containing x. Some authors define it to mean any set containing an open set containing x.

2.7 Example. We have:

(i) Let

ℓ∞ = ℓ∞(N) = {x = (x1, x2, . . .) ∈ F^N : ‖x‖∞ = sup_n |xn| < ∞}.

Note that ℓ∞ is an F-vector space under pointwise operations,

x + y = (x1 + y1, x2 + y2, . . .), αx = (αx1, αx2, . . .).

The map Cb(N, τd) → ℓ∞ given by f ↦ (f(n))n∈N is easily seen to be a surjective linear isometry. Hence these spaces are isometrically isomorphic, hence (ℓ∞, ‖ · ‖∞) is complete. Verify the details.

(ii) Let

c = {x ∈ ℓ∞ : lim_n xn exists}.

We consider the topology τ on N ∪ {∞} consisting of the sets S and S ∪ {n, n + 1, . . . , ∞}, for S ⊂ N and n ∈ N. Verify that τ is indeed a topology on N ∪ {∞}, and that the map Cb(N ∪ {∞}, τ) → c given by f ↦ (f(n))n∈N is a bijective linear isometry.

(iii) Let

c0 = {x ∈ ℓ∞ : lim_n xn = 0} ⊂ c.

2.8 Proposition. If x0 ∈ X, (X, τ) a topological space, then the ideal of x0,

I(x0) = {f ∈ Cb(X, τ) : f(x0) = 0}

is a closed subspace, and hence a Banach space.

Proof. It is easy to show that I(x0) is a subspace, and is closed. We note that any closed subset of a complete metric space is also complete⁵.

In terms of c0, we simply observe that the map from (ii) above takes I(∞) ⊂ Cb(N ∪ {∞}, τ) onto c0.

2.9 Remark. The map N ∪ {∞} → {0} ∪ {1/n : n ∈ N} given by

n ↦ 1/n if n ∈ N, and n ↦ 0 if n = ∞,

is continuous, with continuous inverse. Note that {0} ∪ {1/n : n ∈ N} is a closed subspace of R.

3 ℓp-spaces

3.1 Lemma. If I ⊂ R is an open interval and ϕ : I → R satisfies ϕ′′(t) > 0 for all t ∈ I, then for a < c in I and 0 < s < 1 we have

ϕ((1 − s)a + sc) < (1 − s)ϕ(a) + sϕ(c) (*)

Any function satisfying (*) is called strictly convex.

Proof. By MVT, ϕ′ is strictly increasing. Let b = (1 − s)a + sc so that

s = (b − a)/(c − a),  1 − s = (c − b)/(c − a).

Another application of MVT allows us to find ξ, η such that a < ξ < b < η < c and

(ϕ(b) − ϕ(a))/(b − a) = ϕ′(ξ) < ϕ′(η) = (ϕ(c) − ϕ(b))/(c − b).

Thus,

(ϕ(b) − ϕ(a))(c − b) < (ϕ(c) − ϕ(b))(b − a) =⇒ ϕ(b)(c − a) < ϕ(c)(b − a) + ϕ(a)(c − b)

which, after dividing by c − a, reads ϕ((1 − s)a + sc) < sϕ(c) + (1 − s)ϕ(a).

⁵ PM351.

3.2 Corollary. If p, q > 1 are such that 1/p + 1/q = 1 (such p, q are called conjugate indices) and a, b ≥ 0, then

ab ≤ a^p/p + b^q/q

with equality if and only if a^p = b^q.

Proof. If ab = 0 the inequality is trivial, so assume a, b > 0. Let ϕ(t) = − log t for t ∈ (0, ∞), so ϕ′′(t) = 1/t^2 > 0. Then, by the lemma with s = 1/q,

− log(a^p/p + b^q/q) ≤ −(1/p) log(a^p) − (1/q) log(b^q) = − log a − log b = − log(ab).

We observe that “<” holds if a^p ≠ b^q, whereas “=” holds if a^p = b^q. Now, apply the strictly decreasing function x ↦ e^{−x} to gain the desired result⁶.
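As a quick sanity check of the corollary, the sketch below evaluates the gap a^p/p + b^q/q − ab for given a, b ≥ 0 and p > 1, computing q as the conjugate index. The function name is a hypothetical helper introduced only for this illustration.

```python
def young_gap(a, b, p):
    """Return a^p/p + b^q/q - a*b, where q = p/(p - 1) is conjugate to p."""
    q = p / (p - 1.0)
    return a ** p / p + b ** q / q - a * b
```

The gap should be non-negative for all admissible inputs and vanish (up to rounding) exactly when a^p = b^q, for instance a = 2, b = 4, p = 3, where a^p = b^q = 8.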

3.3 Theorem (Hölder’s Inequality). If a1, . . . , an, b1, . . . , bn ∈ C then

|∑_{i=1}^n ai bi| ≤ (∑_{i=1}^n |ai|^p)^{1/p} (∑_{i=1}^n |bi|^q)^{1/q}

where p and q are conjugate indices. Moreover, equality holds if and only if either ai = 0 for all i or there is t ≥ 0 such that t|ai|^p = |bi|^q for all i, and |z| = 1 such that ai bi = z|ai bi| for all i.

Proof. Let A = (∑ |ai|^p)^{1/p}, B = (∑ |bi|^q)^{1/q}. Suppose that AB ≠ 0. From the corollary above, we have

(1/AB) |∑_{i=1}^n ai bi| ≤ (1/AB) ∑_{i=1}^n |ai||bi|   (1)

≤ ∑_{i=1}^n ( |ai|^p/(pA^p) + |bi|^q/(qB^q) )   (2)

= (1/(pA^p)) A^p + (1/(qB^q)) B^q = 1/p + 1/q = 1.

Note that equality can hold at (1) if and only if |z| = 1 exists as claimed. Also, equality holds at (2) if and only if (|ai|/A)^p = (|bi|/B)^q for all i (or at least one of A, B = 0), hence t = B^q/A^p suffices.
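The inequality and its equality case can be spot-checked numerically. This is a minimal sketch with hypothetical helper names; it handles complex entries through Python's built-in complex type and takes q to be the conjugate index of p.

```python
def p_norm(v, p):
    """(sum |v_i|^p)^(1/p) for a finite sequence v."""
    return sum(abs(x) ** p for x in v) ** (1.0 / p)


def holder_sides(a, b, p):
    """Return (|sum a_i b_i|, ||a||_p * ||b||_q) with q = p/(p - 1)."""
    q = p / (p - 1.0)
    lhs = abs(sum(x * y for x, y in zip(a, b)))
    rhs = p_norm(a, p) * p_norm(b, q)
    return lhs, rhs
```

For a = (1, 2, 3), p = 3 and bi = ai^2 we have |ai|^p = |bi|^q with t = 1, and both sides equal 36, matching the equality condition in the theorem.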

As a corollary, we have the following generalisation of the triangle inequality.

3.4 Theorem (Minkowski’s Inequality). If p > 1 and a1, . . . , an, b1, . . . , bn ∈ C then

(∑_{i=1}^n |ai + bi|^p)^{1/p} ≤ (∑_{i=1}^n |ai|^p)^{1/p} + (∑_{i=1}^n |bi|^p)^{1/p}

with equality if and only if either ai = 0 for all i, or there is t ≥ 0 such that tbi = ai for all i.

Proof. Again, suppose ∑ |ai + bi| ≠ 0. Then by the triangle inequality and Hölder’s inequality (H.I.), with q the conjugate index of p,

∑_{i=1}^n |ai + bi|^p ≤ ∑_{i=1}^n (|ai| + |bi|) |ai + bi|^{p−1}   (1)

≤ [ (∑_{i=1}^n |ai|^p)^{1/p} + (∑_{i=1}^n |bi|^p)^{1/p} ] [ ∑_{i=1}^n |ai + bi|^{(p−1)q} ]^{1/q}   (2, by H.I.).

Note that

1/p + 1/q = 1 =⇒ 1/q = 1 − 1/p = (p − 1)/p =⇒ (p − 1)q = p.

Divide by (∑ |ai + bi|^p)^{1/q}, and note that 1 − 1/q = 1/p, so the inequality prevails. Equality at (1) holds if and only if sgn ai = sgn bi, where

sgn z = z/|z| if z ≠ 0, and sgn z = 0 if z = 0,

or all ai or all bi are zero. Equality at (2) holds if and only if

|ai + bi|^{(p−1)q} = |ai + bi|^p = t1|ai|^p = t2|bi|^p, for some t1, t2 ≥ 0.

So t = (t2/t1)^{1/p} assuming that not all bi are zero.

⁶ The result is also a special case of a weighted AM-GM inequality (obtainable from Jensen’s inequality): ∏ ai^{pi} ≤ ∑ pi ai for ai ≥ 0, pi > 0 where ∑ pi = 1. See Bollobás’ Linear Analysis, Chapter 1, Theorem 3.
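Minkowski's inequality can be spot-checked in the same spirit. The sketch below uses hypothetical helper names and compares the two sides for finite complex sequences.

```python
def p_norm(v, p):
    """(sum |v_i|^p)^(1/p) for a finite sequence v."""
    return sum(abs(x) ** p for x in v) ** (1.0 / p)


def minkowski_sides(a, b, p):
    """Return (||a + b||_p, ||a||_p + ||b||_p)."""
    lhs = p_norm([x + y for x, y in zip(a, b)], p)
    rhs = p_norm(a, p) + p_norm(b, p)
    return lhs, rhs
```

Taking a = tb with t ≥ 0 (for example a = 2b) should give equality up to rounding, in line with the equality condition stated in the theorem.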


3.5 Definition. For 1 ≤ p < ∞, we define

ℓp = ℓp(N) = {x = (x1, x2, . . .) ∈ F^N : ‖x‖p = (∑_{i=1}^∞ |xi|^p)^{1/p} < ∞}.

3.6 Proposition. For 1 < p <∞, `p is a vector space, under pointwise operations, and ‖ · ‖p is a norm on `p.

Proof. If x = (x1, x2, . . .) let x^(N) = (x1, . . . , xN, 0, 0, . . .). Then if x ∈ ℓp,

‖x − x^(N)‖p = (∑_{i=N+1}^∞ |xi|^p)^{1/p} → 0 as N → ∞

by the definition of convergence, and ‖x‖p = lim_N ‖x^(N)‖p. Hence, by Minkowski’s Inequality (M.I.), if x, y ∈ ℓp, then we have

‖x + y‖p = lim_{N→∞} ‖x^(N) + y^(N)‖p ≤ lim_{N→∞} (‖x^(N)‖p + ‖y^(N)‖p) = ‖x‖p + ‖y‖p

and hence x + y ∈ ℓp with ‖x + y‖p ≤ ‖x‖p + ‖y‖p. Moreover, it is obvious that if α ∈ F, and x ∈ ℓp, then ‖αx‖p = |α|‖x‖p < ∞, so αx ∈ ℓp.

3.7 Proposition. (ℓ1, ‖ · ‖1) is a normed space.

Proof. Easy exercise.

4 Linear operators on normed spaces

4.1 Definition. Let X ,Y be F-vector spaces. Let

L(X, Y) = {T : X → Y | T is linear}.

This is itself an F-vector space under pointwise operations, i.e. for α ∈ F, S, T ∈ L(X ,Y) we put

(S + αT )x = Sx+ αTx, ∀x ∈ X .

4.2 Definition. If (X , ‖ · ‖) is a normed space then we define

• The closed ball, B‖·‖(X) = B(X) = {x ∈ X : ‖x‖ ≤ 1}.

• The open ball/disc, D‖·‖(X) = D(X) = {x ∈ X : ‖x‖ < 1}.

• The sphere, S‖·‖(X) = S(X) = {x ∈ X : ‖x‖ = 1}.

4.3 Proposition. If X, Y are both normed spaces, and T : X → Y is linear, then the following are equivalent:

(i) T is continuous on X.

(ii) T is continuous at one point x0 ∈ X.

(iii) sup{‖Tx‖ : x ∈ D(X)} < ∞, i.e. T is bounded.

(iii’) sup{‖Tx‖ : x ∈ B(X)} < ∞.

Proof. (i) → (ii): Obvious.

(ii) → (iii): Note that

Tx0 + D(Y) = {y ∈ Y : ‖Tx0 − y‖ < 1}

is an open neighbourhood of Tx0. Hence, by the continuity assumption there is δ > 0 such that

Tx0 + δT(D(X)) = T(x0 + δD(X)) ⊂ Tx0 + D(Y) =⇒ δT(D(X)) ⊂ D(Y) =⇒ T(D(X)) ⊂ (1/δ)D(Y)

i.e. we have sup{‖Tx‖ : x ∈ D(X)} ≤ 1/δ < ∞.

(iii) → (i): If N = sup{‖Tx‖ : x ∈ D(X)}, then for ε > 0, x ∈ X we have

‖Tx‖ = (‖x‖ + ε) ‖T( x/(‖x‖ + ε) )‖ ≤ (‖x‖ + ε)N,

since x/(‖x‖ + ε) ∈ D(X),

and taking ε→ 0+ we see ‖Tx‖ ≤ N‖x‖ for x ∈ X , which yields

‖Tx− Ty‖ = ‖T (x− y)‖ ≤ N‖x− y‖

so T is Lipschitz, and hence continuous.

(iii’) → (iii): Obvious.

(iii) → (iii’): We saw that T satisfying (iii) is Lipschitz and hence (iii’) follows.

4.4 Definition. For normed spaces X and Y, let

B(X, Y) = {T ∈ L(X, Y) : T is bounded}.

For T ∈ B(X, Y), we define its operator norm by

‖T‖ = sup{‖Tx‖ : x ∈ D(X)} = sup{‖Tx‖ : x ∈ B(X)} = sup{‖Tx‖ : x ∈ S(X)}.

We note that ‖T‖ is the Lipschitz constant of T .
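For a concrete feel for the operator norm, here is a small sketch under an assumed setting that is not in the notes: T is given by a real matrix acting on (R^n, ‖ · ‖∞). Since x ↦ ‖Tx‖∞ is convex, its supremum over the closed unit ball is attained at a vertex of the cube, i.e. at a sign vector, and it is a standard fact that the value is the largest absolute row sum. All function names are illustrative.

```python
import itertools


def sup_norm(v):
    """The norm ||v||_inf on R^n."""
    return max(abs(x) for x in v)


def apply_matrix(matrix, x):
    """Compute the matrix-vector product Tx."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in matrix]


def operator_norm_sup(matrix):
    """Maximum of ||Tx||_inf over all sign vectors x in {-1, 1}^n."""
    n = len(matrix[0])
    return max(
        sup_norm(apply_matrix(matrix, signs))
        for signs in itertools.product((-1.0, 1.0), repeat=n)
    )


def max_abs_row_sum(matrix):
    """Closed form for the operator norm in this assumed setting."""
    return max(sum(abs(a) for a in row) for row in matrix)
```

The two functions should agree on any matrix, and the Lipschitz bound ‖Tx‖∞ ≤ ‖T‖‖x‖∞ can then be checked on sample vectors. The enumeration is exponential in n, so this is only meant for small examples.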

4.5 Definition. If (X, τ) is a topological space, and Y is a normed space, let

C^Y_b(X, τ) = {F : X → Y : F is τ-‖ · ‖ continuous and ‖F‖∞ = sup_{x∈X} ‖F(x)‖ < ∞}.

4.6 Theorem. If Y is a Banach space, then (CYb (X, τ), ‖ · ‖∞) is also a Banach space.

Proof. Trivial modifications of F-valued case.

4.7 Theorem. If X, Y are normed spaces, then (B(X, Y), ‖ · ‖) is a normed space. If Y is a Banach space, then (B(X, Y), ‖ · ‖) is a Banach space.

Proof. First, if S, T ∈ B(X, Y) then, using ‖(S + T)x‖ = ‖Sx + Tx‖ ≤ ‖Sx‖ + ‖Tx‖,

‖S + T‖ = sup{‖(S + T)x‖ : x ∈ D(X)} ≤ sup{‖Sx‖ : x ∈ D(X)} + sup{‖Tx′‖ : x′ ∈ D(X)} = ‖S‖ + ‖T‖.

It is easy to show ‖αS‖ = |α|‖S‖ for α ∈ F. Now, suppose that Y is a Banach space. Let B(X) have the usual norm topology from X. Define

Γ : B(X, Y) → C^Y_b(B(X)), T ↦ T|B(X)

that is, Γ restricts T to the unit ball. Then Γ is an isometry, i.e. for all T we have

‖Γ(T)‖∞ = sup{‖Tx‖ : x ∈ B(X)} = ‖T‖.

It is easy to see that Γ is linear. Hence if (Tn)_{n=1}^∞ ⊂ B(X, Y) is Cauchy, then (Γ(Tn))_{n=1}^∞ is Cauchy as well. Thus

F = lim_n Γ(Tn)

exists, since (C^Y_b(B(X)), ‖ · ‖∞) is complete. Now, let T : X → Y be given by

T(x) = ‖x‖ F(x/‖x‖) if x ≠ 0, and T(x) = 0 if x = 0.

Note that ‖T − Tn‖ = ‖F − Γ(Tn)‖∞ → 0. It remains to show that T is linear. Let x, x′ ∈ X, α ∈ F, and suppose that ‖x‖, ‖x′‖, ‖x + αx′‖ ≠ 0. Then

T(x + αx′) = ‖x + αx′‖ F( (x + αx′)/‖x + αx′‖ )
= ‖x + αx′‖ lim_n Tn( (x + αx′)/‖x + αx′‖ )
= lim_n Tn(x + αx′)
= lim_n Tn(x) + α lim_n Tn(x′)
= ‖x‖ lim_n Tn(x/‖x‖) + α‖x′‖ lim_n Tn(x′/‖x′‖)
= ‖x‖ F(x/‖x‖) + α‖x′‖ F(x′/‖x′‖)
= T(x) + αT(x′).

Thus T is linear.

4.8 Definition. For a vector space X, we define the algebraic dual space of X by X′ = L(X, F); it consists of the linear functionals on X. If X is a normed space⁷, then X* = B(X, F) denotes the continuous dual space, i.e. the subspace of continuous linear functionals on X.

4.9 Proposition. If x ∈ ℓ1, then fx : c0 → F given by

fx(y) = ∑_{i=1}^∞ xi yi

is a bounded linear functional. Moreover, all bounded linear functionals on c0 arise in this way, with ‖fx‖ = ‖x‖1.

4.10 Remark. This identification x ↦ fx is a linear isometric isomorphism c0* ≅ ℓ1.

Proof. Let x ∈ ℓ1, and y ∈ c0. We establish that

∑_{i=1}^∞ xi yi

converges. If m < n, observe

|∑_{i=m}^n xi yi| ≤ ∑_{i=m}^n |xi||yi| ≤ ∑_{i=m}^n |xi| ‖y‖∞ → 0 as m, n → ∞

so the series converges absolutely. Thus

|∑_{i=1}^∞ xi yi| ≤ ∑_{i=1}^∞ |xi||yi| ≤ ∑_{i=1}^∞ |xi| ‖y‖∞ = ‖x‖1‖y‖∞.

This shows fx is well-defined. It is easy to check that it is linear. Observe that ‖fx‖ ≤ ‖x‖1 is (essentially) already shown. Let

y^(N) = (s̄1, . . . , s̄N, 0, 0, . . .) ∈ c0, where si = sgn xi and the bar denotes complex conjugation (so that xi s̄i = |xi|),

and note ‖y^(N)‖∞ ≤ 1. Then

‖fx‖ ≥ sup_N |fx(y^(N))| = sup_{N∈N} ∑_{i=1}^N |xi| = ∑_{i=1}^∞ |xi| = ‖x‖1.

Now suppose that f ∈ c0*. Let en be the sequence with 0 everywhere except for a 1 in the nth position. Define x by putting xi = f(ei). We observe, by linearity, that

∑_{i=1}^n |xi| = f(y^(n)) = |f(y^(n))| ≤ ‖f‖ < ∞.

Thus,

∑_{i=1}^∞ |xi| = sup_{n∈N} ∑_{i=1}^n |xi| < ∞ =⇒ x ∈ ℓ1.

We observe that f|ℓ = fx|ℓ, where

ℓ = c00 = {y = (y1, . . . , yn, 0, 0, . . .) : y1, . . . , yn ∈ F, n ∈ N}.

Note that ℓ is dense in c0, i.e. y = lim_n (y1, . . . , yn, 0, 0, . . .) for y ∈ c0, because

‖y − (y1, y2, . . . , yn, 0, 0, . . .)‖∞ = sup_{i≥n+1} |yi| → 0 as n → ∞.

Using the continuity of f and the fact that f = fx on ℓ (a dense subset), we have

f(y) = lim_n f((y1, . . . , yn, 0, 0, . . .)) = lim_n fx((y1, . . . , yn, 0, 0, . . .)) = fx(y)

so that f = fx.
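The norming sequence used in the proof can be illustrated on a truncated example. The sketch below is an illustration only: it works with finite lists standing in for sequences, uses the conjugated sign so that the complex case also works, and all names are hypothetical.

```python
def sgn(z):
    """sgn z = z/|z| for z != 0, and 0 for z = 0."""
    return 0.0 if z == 0 else z / abs(z)


def f_x(x, y):
    """The pairing sum_i x_i * y_i for finite lists x and y."""
    return sum(xi * yi for xi, yi in zip(x, y))


def norming_vector(x, N):
    """y^(N): conjugated signs of x_1..x_N, followed by zeros."""
    return [sgn(xi).conjugate() for xi in x[:N]] + [0.0] * (len(x) - N)


def l1_norm(x):
    """||x||_1 for a finite list x."""
    return sum(abs(xi) for xi in x)
```

With xi = i^k · 2^(−k) in position k (complex phases, geometric decay), the values fx(y^(N)) equal the partial sums 1 − 2^(−N) and so increase to ‖x‖1, which is how the proof shows ‖fx‖ ≥ ‖x‖1.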

⁷ More generally, X is a topological vector space.

A1, Due Monday. Office hours today 11-noon, Friday 2-3:30 pm, or by appointment.

Grader needed for AM/PM 331 (Real Analysis), about 3 hours per week, ≤ $600 per term. Talk to me by 3 pm today.

4.11 Definition. A Hamel basis of an F-vector space X is any family B ⊆ X such that

1. B is linearly independent, i.e. for any finite distinct e1, . . . , en ∈ B and α1, . . . , αn ∈ F, the equation

∑_{i=1}^n αi ei = 0

holds if and only if αi = 0 for all i.

2. B is spanning, i.e. for any x ∈ X, there are e1, . . . , en ∈ B and α1, . . . , αn ∈ F such that

x = ∑_{i=1}^n αi ei.

If we assume the axiom of choice, every vector space admits a basis.

4.12 Question. Given a normed space X, does there exist a non-zero bounded linear functional? Do there exist “enough” bounded functionals, i.e. given a point 0 ≠ x ∈ X, is there f ∈ X* such that f(x) ≠ 0? Given x ∈ X, is there f ∈ X* such that |f(x)| = ‖x‖, ‖f‖ ≤ 1?

4.13 Definition. Let X be an F-vector space. A sublinear functional on X is a functional p : X → R such that

• p(tx) = tp(x) for t ∈ R≥0 (non-negative homogeneity).

• p(x+ y) ≤ p(x) + p(y) for x, y ∈ X (subadditivity).

4.14 Theorem (Hahn-Banach Theorem). Let X be an R-vector space, p : X → R a sublinear functional, Y ⊂ X a subspace, and f ∈ Y′ such that f(y) ≤ p(y) for y ∈ Y. Then there exists F ∈ X′ such that

F|Y = f

and F(x) ≤ p(x) for x ∈ X. We will refer to F as a Hahn-Banach extension of f (relative to p).

Proof. Step 1: Suppose there is x ∈ X \ Y such that X = span_R{x, Y}. We observe for y+, y− ∈ Y, we have

f(y+) + f(y−) = f(y+ + y−) ≤ p(y+ + x + y− − x) ≤ p(y+ + x) + p(y− − x)

which implies that

f(y−) − p(y− − x) ≤ p(y+ + x) − f(y+).

Hence there is c ∈ R so that

sup{f(y−) − p(y− − x) : y− ∈ Y} ≤ c ≤ inf{p(y+ + x) − f(y+) : y+ ∈ Y}.

We then define F ∈ X′ = (span_R{x, Y})′ by

F(αx + y) = αc + f(y).

Observe F|Y = f. It remains to check that F ≤ p on X. Suppose α = t > 0. Then

c ≤ p((1/t)y + x) − f((1/t)y) =⇒ tc ≤ p(y + tx) − f(y) =⇒ F(tx + y) = tc + f(y) ≤ p(y + tx).

If α = −s, s > 0, then

f((1/s)y) − p((1/s)y − x) ≤ c =⇒ f(y) − p(y − sx) ≤ cs =⇒ F(−sx + y) = −sc + f(y) ≤ p(y − sx).

Notice that if dim X < ∞ (or if Y is of finite codimension) then we use simple induction to finish.
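Step 1 can be made concrete in a toy case. The sketch below is an illustration under assumptions that are not in the notes: X = R^2, p is the ℓ1 norm, Y = R·e1 with f(s·e1) = s, and the new direction is x = e2. It approximates the sup and inf bracketing the admissible constants c on a finite sample of Y and then checks F ≤ p on a grid.

```python
def p(v):
    """Sublinear functional: the l^1 norm on R^2 (an illustrative choice)."""
    return abs(v[0]) + abs(v[1])


def f(s):
    """f on Y = R*e1, given by f(s*e1) = s, so that f <= p on Y."""
    return s


def admissible_interval(samples):
    """Approximate (sup, inf) bracketing c, with x = e2 and y = s*e1."""
    lower = max(f(s) - p((s, -1.0)) for s in samples)  # f(y-) - p(y- - x)
    upper = min(p((s, 1.0)) - f(s) for s in samples)   # p(y+ + x) - f(y+)
    return lower, upper


def F(c, alpha, s):
    """The extension F(alpha*x + y) = alpha*c + f(y), with y = s*e1."""
    return alpha * c + f(s)
```

In this toy case the bracket is [−1, 1], so the extension is not unique: every c in that interval yields a Hahn-Banach extension dominated by p.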

Step 2: We call a pair (ϕ, M) a p-extension of f if M is a subspace of X with Y ⊆ M, ϕ ∈ M′, ϕ|Y = f and ϕ ≤ p on M. We assign the following partial ordering to the set E of p-extensions of f:

(ϕ, M) ≤ (ψ, N) if M ⊆ N and ψ|M = ϕ.

This is indeed a partial ordering. Let C ⊂ E be a chain. We let

U = ⋃_{(ϕ,M)∈C} M

so U is a subspace of X, as (M)_{(ϕ,M)∈C} is a totally ordered collection. Define Φ ∈ U′ by

Φ(x) = ϕ(x), whenever x ∈ M, for (ϕ, M) ∈ C.

Then Φ is well-defined: if x ∈ M and x ∈ N, where (ϕ, M), (ψ, N) ∈ C, with M ⊆ N, say, then ψ(x) = ϕ(x). Also, Φ is clearly linear. Note that (Φ, U) is an upper bound for C.

By Zorn's Lemma, there exists a maximal element (F, M) for E. If it were the case that M ⊊ X, then there would be x ∈ X \ M. Then, by Step 1, we would find ϕ ∈ (span_R{x, M})′ such that

ϕ|M = F, and ϕ ≤ p on span_R{x, M}.

Then (ϕ, span_R{x, M}) ∈ E and (F, M) < (ϕ, span_R{x, M}), which would violate the fact that (F, M) is maximal. Thus M = X.

4.15 Lemma. Let X be a C-vector space, hence an R-vector space.

(i) If f ∈ (X_R)′ (i.e. an R-linear, R-valued functional), then

fC(x) = f(x) − i f(ix)

is C-linear, i.e. fC ∈ (X_C)′.

(ii) Conversely if g ∈ (X_C)′ and f = Re g, then f ∈ (X_R)′ and moreover fC = g.

(iii) If X is a normed C-space, hence a normed R-space, then f ∈ (X_R)* if and only if fC ∈ (X_C)*, and ‖f‖ = ‖fC‖.
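Parts (i) and (ii) can be checked on a small example. The sketch below is an illustration with an arbitrarily chosen C-linear functional g on C^2 (the coefficients are assumptions made only for this example); it recovers g from its real part via the formula fC(x) = f(x) − i f(ix).

```python
def g(x):
    """A C-linear functional on C^2 (coefficients chosen arbitrarily)."""
    return (2 - 1j) * x[0] + (0.5 + 3j) * x[1]


def f(x):
    """The real part of g, an R-linear functional."""
    return g(x).real


def f_C(x):
    """The complexification f_C(x) = f(x) - i f(ix)."""
    ix = (1j * x[0], 1j * x[1])
    return f(x) - 1j * f(ix)
```

The reason this works is that Re g(ix) = Re(i g(x)) = −Im g(x), so f(x) − i f(ix) = Re g(x) + i Im g(x) = g(x).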

Proof. (i) and (ii) are simple exercises.

(iii): For x ∈ X, let z(x) = sgn fC(x), so fC(x) = z(x)|fC(x)|, and write z̄(x) for the complex conjugate of z(x). Then we assume f ∈ (X_R)*:

|fC(x)| = z̄(x) fC(x) = fC(z̄(x)x) = f(z̄(x)x) ≤ ‖f‖‖z̄(x)x‖ ≤ ‖f‖‖x‖

(the third equality holds because fC(z̄(x)x) = |fC(x)| is real, hence equal to its real part), and hence ‖fC‖ ≤ ‖f‖. However ‖f‖ ≤ ‖fC‖ is clear.

4.16 Corollary (to Hahn-Banach). Let X be a normed space, and Y ⊆ X a subspace with f ∈ Y*. Then there is F ∈ X* such that

F|Y = f and ‖F‖ = ‖f‖.

Proof. Let p(x) = ‖f‖‖x‖, for x ∈ X. We observe for y ∈ Y that

Re f(y) ≤ |f(y)| ≤ ‖f‖‖y‖ = p(y).

By the Hahn-Banach theorem, there exists F0 ∈ (X_R)′ such that F0|Y = Re f and F0 ≤ p on X, i.e. F0(x) ≤ ‖f‖‖x‖ and hence

−F0(x) = F0(−x) ≤ ‖f‖‖−x‖ = ‖f‖‖x‖,

so |F0(x)| ≤ ‖f‖‖x‖ = p(x). Thus, ‖F0‖ ≤ ‖f‖, but clearly ‖F0‖ ≥ ‖f‖. If F = R, let F = F0, otherwise if F = C we let F = (F0)C.

4.17 Corollary. If X is a normed space and x ∈ X , then there exists f ∈ X ∗ such that f(x) = ‖x‖.

Proof. Let Y = Fx. Let f0 : Y → F be given by f0(αx) = α‖x‖, so clearly f0 ∈ Y ′. Also

‖f0‖ = sup{|f0(αx)| : ‖αx‖ ≤ 1} = sup{|α|‖x‖ : |α|‖x‖ ≤ 1} ≤ 1.

We apply the last corollary to get f ∈ X ∗ such that f |Y = f0 and ‖f‖ = ‖f0‖ ≤ 1.

4.18 Theorem. Let X be a normed space and X** = (X*)*. The map (x ↦ x̂) : X → X** given by evaluation functionals,

x̂(f) = f(x), ∀x ∈ X, f ∈ X*

is a linear isometry.

Proof. That x̂ ∈ (X*)′ is clear, and it is clear that (x ↦ x̂) ∈ L(X, (X*)′). Now we have for x ∈ X, f ∈ X*

|x̂(f)| = |f(x)| ≤ ‖f‖‖x‖

so that ‖x̂‖ ≤ ‖x‖. However, by the last corollary (really, the Hahn-Banach theorem), there is f ∈ X*, ‖f‖ ≤ 1, such that

|x̂(f)| = |f(x)| = ‖x‖,

and it follows that ‖x̂‖ ≥ ‖x‖. Thus ‖x̂‖ = ‖x‖.

4.19 Remark. We have:

(i) If Y is a linear subspace of a normed space X, then its closure Ȳ ⊆ X is also a linear subspace. Indeed, observe that x ↦ αx and x ↦ x + y are uniformly continuous on X, hence on Ȳ.

(ii) If X is a normed space, but not complete, then we may define its completion as the closure of X̂ = {x̂ : x ∈ X} in X**.

Since X** is a dual space, it is complete, and any closed subset of a complete space is itself complete. This is, up to isometric isomorphism, the unique Banach space containing a dense copy of X.

5 Separation by hyperplanes

5.1 Definition. If X is an F-vector space, then:

• A hyperplane (containing 0) is any subspace of the form ker f, 0 ≠ f ∈ X′.

• An (affine) hyperplane is any subset of the form x0 + ker f, x0 ∈ X, 0 ≠ f ∈ X′.

• An R-hyperplane is any set of the form x0 + ker Re f where x0 ∈ X, 0 ≠ f ∈ X′.

Note ker Re f ⊇ ker f.

Our goal is a geometric version of the Hahn-Banach theorem: given A, B convex sets with A ∩ B = ∅, we want to put an R-hyperplane between them (we want this to be closed in the normed setting).

5.2 Proposition. Let X be a normed space.

(i) If 0 ≠ f ∈ X*, then ker f is closed and nowhere dense.

(ii) If f ∈ X ′ \ X ∗, then ker f is dense in X .

Thus a hyperplane in X is either closed or dense in X .

Proof. We have:

(i) If 0 ≠ f ∈ X*, then ker f = f^{−1}({0}) is closed. Also, any proper closed subspace of a normed space is nowhere dense. Indeed, if Y ⊆ X is such a subspace, and there is y0 ∈ Y, δ > 0 such that y0 + δD(X) ⊂ Y, then D(X) ⊆ (1/δ)(Y − y0) = Y and span D(X) = X, so X = Y.

(ii) Suppose ker f is not dense in X. Hence there is x0 ∈ X and δ > 0 such that (x0 + δD(X)) ∩ ker f = ∅. Thus

0 ∉ f(x0 + δD(X)) = f(x0) + δf(D(X)),

hence (1/δ)f(x0) ∉ −f(D(X)) = f(−D(X)) = f(D(X)). Thus

‖f‖ ≤ |(1/δ)f(x0)|.

Indeed, if there were x ∈ D(X) such that |f(x)| > |(1/δ)f(x0)|, then

|f(x0)/(δf(x))| < 1

and we thus have

f( (f(x0)/(δf(x))) x ) = (1/δ)f(x0), where ‖(f(x0)/(δf(x))) x‖ < 1,

contradicting our statement before.

5.3 Remark. If X is an infinite-dimensional normed space, then in fact X′ \ X* ≠ ∅ (assuming the Axiom of Choice). To see this, note that there exists a Hamel basis {eγ}γ∈Γ for X, and we may assume ‖eγ‖ = 1. Find an unbounded subset {αγ}γ∈Γ ⊂ F. Let f ∈ X′ be given by

f(∑_{γ∈Γ} βγ eγ) = ∑_{γ∈Γ} βγ αγ

(where βγ ≠ 0 only for finitely many γ). Then

‖f‖ ≥ sup_{γ∈Γ} |f(eγ)| = sup_{γ∈Γ} |αγ| = ∞

so f is an unbounded linear functional on X.

5.4 Definition. Let X be an F-vector space.

• A nonempty subset A of X is convex if for any x, y ∈ A, and 0 < s < 1, we have (1 − s)x + sy ∈ A.

• A subset A of X is absorbing about x0 ∈ A if for every x ∈ X, there is ε = ε(x, x0, A) > 0 such that for 0 ≤ s < ε, we have x0 + sx ∈ A.

5.5 Example. If X is a normed space, any open set U ⊂ X is absorbing about any of its points.

5.6 Lemma. Let X be an F-vector space, and A ⊂ X be convex and absorbing about 0. Define p : X → R by p(x) = inf{t ≥ 0 : x ∈ tA} where tA = {ta : a ∈ A}. Then p is a sublinear functional⁸, which we call the Minkowski functional (or gauge functional) of A. Moreover,

(i) {x ∈ X : p(x) < 1} ⊆ A ⊆ {x ∈ X : p(x) ≤ 1}.

(ii) If X has norm ‖ · ‖, by which A is a neighbourhood of 0, then there is N > 0 such that p(x) ≤ N‖x‖, x ∈ X.

Proof. First, note that since A is absorbing at 0, we find for each x, there is t > 0 such that tx = 0 + tx ∈ A, hence x ∈ (1/t)A, and hence the infimum describing p is over a non-empty set, so p(x) is well-defined. Let us verify sublinearity.

To see positive homogeneity, note that if s > 0, we have

p(sx) = inf{t ≥ 0 : sx ∈ tA} = inf{t ≥ 0 : x ∈ (t/s)A} = s · inf{t/s ≥ 0 : x ∈ (t/s)A} = sp(x).

Clearly p(0) = 0. To see subadditivity, first note if s, t ≥ 0 with s + t > 0, we have for a, b ∈ A that

sa + tb = (s + t)( (s/(s + t))a + (t/(s + t))b ) ∈ (s + t)A,

since the bracketed term is a convex combination and A is convex, and hence sA + tA ⊆ (s + t)A. On the other hand we always have

(s + t)A = {(s + t)a : a ∈ A} ⊆ {sa + tb : a, b ∈ A} = sA + tA

so (s + t)A = sA + tA. Now for x, y ∈ X we have

p(x) + p(y) = inf{s ≥ 0 : x ∈ sA} + inf{t ≥ 0 : y ∈ tA}
= inf{s + t : s, t ≥ 0, x ∈ sA, y ∈ tA}
≥ inf{s + t : s, t ≥ 0, x + y ∈ sA + tA = (s + t)A}
= inf{r ≥ 0 : x + y ∈ rA} = p(x + y).

Let us prove the remaining claims:

(i) If p(x) < 1, then because A is absorbing, there is 0 < t < 1 so x ∈ tA, i.e. (1/t)x ∈ A, but then by convexity of A, we have x = (1 − t)0 + t(1/t)x ∈ A. Also, if x ∈ A, then x ∈ 1A so p(x) ≤ 1.

(ii) Let δ > 0 be such that δD(X) ⊂ A. Then for 0 ≠ x ∈ X, ε > 0,

x ∈ (‖x‖ + ε)D(X) = ((‖x‖ + ε)/δ) δD(X) ⊆ ((‖x‖ + ε)/δ) A.

Hence p(x) ≤ (‖x‖ + ε)/δ → (1/δ)‖x‖ as ε → 0. Let N = 1/δ.

⁸ If A ⊂ X is not convex, the proof above still shows, at least, that the Minkowski functional of A is finite-valued, non-negative and positive-homogeneous, with A ⊆ {x ∈ X : p(x) ≤ 1}. Furthermore, if A is both convex and balanced, then p is a seminorm on X. Converses of this result are also true; see Megginson’s An Introduction to Banach Space Theory, Chapter 1, Section 9.
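The Minkowski functional can be computed numerically by bisection whenever membership in A can be tested. The sketch below is an illustration under an assumed example set, the open ellipse A = {x ∈ R^2 : x1^2/4 + x2^2 < 1}, which is convex and absorbing about 0; for this A the gauge has the closed form p(x) = (x1^2/4 + x2^2)^{1/2}. The function names and tolerance are illustrative.

```python
def in_A(x):
    """Membership test for the open ellipse x1^2/4 + x2^2 < 1 (example set)."""
    return (x[0] / 2.0) ** 2 + x[1] ** 2 < 1.0


def gauge(x, member=in_A, tol=1e-10):
    """Approximate p(x) = inf{t >= 0 : x in tA} by bisection.

    Relies on A being convex with 0 in A, so that {t > 0 : x/t in A}
    is an interval unbounded above.
    """
    if x[0] == 0 and x[1] == 0:
        return 0.0
    hi = 1.0
    while not member((x[0] / hi, x[1] / hi)):  # grow until x lies in hi*A
        hi *= 2.0
    lo = 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if member((x[0] / mid, x[1] / mid)):
            hi = mid
        else:
            lo = mid
    return hi
```

Comparing `gauge` against the closed form, and checking p(x + y) ≤ p(x) + p(y) and p(sx) = sp(x) on a few samples, mirrors the sublinearity argument in the proof.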


5.7 Theorem (Separation Theorem/Geometric Form of the Hahn-Banach Theorem). Suppose X is an F-vector space and A, B ⊂ X are non-empty convex sets with A ∩ B = ∅ and A is absorbing about a point a0. Then there are f ∈ X′ and α ∈ R such that

Re f(a) ≥ α ≥ Re f(b), ∀a ∈ A, b ∈ B.

Moreover, if X admits a norm ‖ · ‖ by which A is a neighbourhood of a0, then f can be chosen to be continuous. In addition, if A is open, then we can arrange for the inequality to be strict on one side, i.e.

Re f(a) > α ≥ Re f(b), ∀a ∈ A, b ∈ B.

Proof. Let A − B = {a − b : a ∈ A, b ∈ B}. It is straightforward to verify that

(i) A−B is absorbing around each a0 − b, for b ∈ B.

(ii) A−B is convex.

(iii) In the case that X is normed and A is a neighbourhood of a0, then A − B is a neighbourhood of each a0 − b, for b ∈ B. Furthermore, if A is open, A − B is open too.

Let x0 = a0 − b for some fixed b in B, and set

C = x0 − (A − B) = {x0 − (a − b) : a ∈ A, b ∈ B}.

Then C is convex, absorbing about 0, and is a neighbourhood of 0, given relevant assumptions. Let p be the Minkowski functional of C. Since A ∩ B = ∅, we have 0 ∉ A − B and hence x0 ∉ x0 − (A − B) = C. Thus, by the lemma above, p(x0) ≥ 1.

Let f0 : Rx0 → R be given by f0(sx0) = sp(x0). If s ≥ 0,

f0(sx0) = sp(x0) = p(sx0) ≤ p(sx0),

and if s < 0,

f0(sx0) = sp(x0) < 0 ≤ p(sx0),

so f0(sx0) ≤ p(sx0) on Rx0. Let f ∈ (X_R)′ be any Hahn-Banach extension of f0 to all of X, such that f(x) ≤ p(x) for x ∈ X. We will show that f satisfies the statement of the Theorem if F = R. If F = C, we will replace f by fC, and be done.

If a ∈ A, b ∈ B, then note x0 − (a− b) ∈ C so we have

f(x0 − (a− b)) ≤ p(x0 − (a− b)) ≤ 1

by (i) of the Lemma. Therefore,

f(x0)− f(a) + f(b) ≤ 1 =⇒ f(x0) + f(b) ≤ 1 + f(a).

Since f(x0) = p(x0) ≥ 1, we have f(b) ≤ f(a). Hence, since a ∈ A, b ∈ B are arbitrary,

sup{f(b) : b ∈ B} ≤ α ≤ inf{f(a) : a ∈ A}

for some α ∈ R.

Now assume X admits a norm by which A is a neighbourhood of a0, hence C is a neighbourhood of 0. By the lemma, we see p ≤ N‖ · ‖ on X, so for x ∈ X

f(x) ≤ p(x) ≤ N‖x‖, −f(x) = f(−x) ≤ p(−x) ≤ N‖−x‖ = N‖x‖

so |f(x)| ≤ N‖x‖ so ‖f‖ ≤ N, i.e. f is continuous. Moreover, if A is open, we cannot have that f(a) = α for any a ∈ A, where α is as above. Indeed there is ε > 0 so that a ± εx0 ∈ A, which means that

α ≤ f(a− εx0) = f(a)− εp(x0)

but p(x0) ≥ 1 so we certainly cannot have f(a) = α.

5.8 Definition. In an F-vector space, a (R)-half-space is any set of the form

H = {x ∈ X : Re f(x) ≥ α}, for some 0 ≠ f ∈ X′, α ∈ R.

We observe that if X is normed, H is closed if and only if f ∈ X′ is continuous (“if” is easy because continuous maps pull back closed sets to closed sets; why is “only if” true?).

5.9 Definition. In an F-vector space X, given a non-empty S ⊂ X, we let its convex hull be given by

co S = {λ1x1 + . . . + λnxn : λi ∈ [0, 1], ∑ λi = 1, xi ∈ S, n ∈ N}.

We note that

co S = ⋂{C : C ⊃ S and C is convex},

i.e. co S is the smallest convex set containing S. If X is normed then we let c̄ō S, the closure of co S, denote the closed convex hull. We remark that if C ⊂ X is convex, then its closure C̄ is convex.

5.10 Theorem (Closed Convex Hull Theorem). If X is a normed space and S ⊂ X is nonempty, then

c̄ō S = ⋂{H : H is a closed half-space containing S}.

Proof. The collection of all closed half-spaces containing S is a collection of closed convex sets, which proves the “⊆” direction (arbitrary intersections of closed/convex sets remain closed/convex). It remains to prove “⊇”. It will suffice to see that for any x0 ∉ c̄ō S there is a closed half-space H such that c̄ō S ⊆ H and x0 ∉ H.

If x0 ∉ c̄ō S, then there is δ > 0 such that

(x0 + δD(X)) ∩ c̄ō S = ∅,

but x0 + δD(X) is open and convex, while c̄ō S is convex, so by the Separation Theorem, there is f ∈ X* and α ∈ R such that

Re f(c) ≥ α > Re f(x), ∀c ∈ c̄ō S, x ∈ x0 + δD(X).

In particular, the closed half-space

H = {x ∈ X : α ≤ Re f(x)}

contains c̄ō S, but misses x0.

6 Consequences of Baire Category Theorem

We now discuss the Banach-Steinhaus Theorem, the Open Mapping Theorem, and the Closed Graph Theorem. These are traditionally proved as consequences of the Baire Category Theorem⁹.

6.1 Theorem (Baire Category Theorem I). If (X, ρ) is a complete metric space and {Un}_{n=1}^∞ is a collection of dense open sets, then

⋂_{n=1}^∞ Un is dense in X.

Proof. PM351.

6.2 Definition. Let (X, ρ) be a metric space.

• A set F ⊂ X is called nowhere dense if its closure $\overline{F}$ contains a neighbourhood of none of its points, i.e. if $x_0 \in \overline{F}$, there is no δ > 0 such that

\[ B_\delta(x_0) = \{ x \in X : \rho(x, x_0) < \delta \} \subset \overline{F}. \]

In other words, $\overline{F}$ has empty interior. This is equivalent to saying that $X \setminus \overline{F}$ is dense in X.

• A set M ⊂ X is called meager (or 1st category) if

\[ M = \bigcup_{n=1}^\infty F_n \]

where each $F_n$ is nowhere dense, and non-meager (or 2nd category) otherwise.

⁹ In some cases, one can find non-Baire proofs, e.g. of Banach-Steinhaus via "gliding hump". By the way, the Open Mapping Theorem may be proved via Zabreĭko's Lemma: every countably subadditive seminorm on a Banach space is continuous (which follows from the fact that every closed, convex, absorbing-about-0 subset in a Banach space contains a nbhd of 0, which is a special incarnation of Baire).

6.3 Theorem (Baire Category Theorem II). If (X, ρ) is a complete metric space, and ∅ ≠ U ⊂ X is open, then U is non-meager.

Proof. If U were meager, i.e. we could write

\[ U = \bigcup_{n=1}^\infty F_n \]

where each $F_n$ is nowhere dense, then each $U_n = X \setminus \overline{F_n}$ would be open and dense, and hence

\[ \bigcap_{n=1}^\infty U_n \]

is dense in X, hence meets U. But $\bigcap_n U_n = X \setminus \bigcup_n \overline{F_n} \subset X \setminus U$, a contradiction. This means we cannot write U as above.

A2 posted online since Monday. By next Monday’s lecture, we will have covered all relevant lecture material –however pages 1 and 2 are easily accessible now.

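6.4 Theorem (Banach-Steinhaus Theorem/Uniform Boundedness Principle). Let X, Y be normed spaces. If F ⊂ B(X, Y) is pointwise bounded on a set U non-meager in X, i.e.

\[ \sup\{ \|Tx\| : T \in \mathcal{F} \} < \infty, \quad \forall x \in U, \]

then we have that F is uniformly bounded on X, i.e.

\[ \sup\{ \|T\| : T \in \mathcal{F} \} = \sup\{ \|Tx\| : T \in \mathcal{F},\ x \in D(\mathcal{X}) \} < \infty. \]

Proof. Let

\[ F_n = \{ x \in \mathcal{X} : \|Tx\| \le n \text{ for all } T \in \mathcal{F} \} = \bigcap_{T \in \mathcal{F}} T^{-1}(nB(\mathcal{Y})) \]

where $T^{-1}(nB(\mathcal{Y}))$ is closed as T ∈ B(X, Y), so $F_n$, as an intersection of closed sets, is again closed. By assumption

\[ U \subset \bigcup_{n=1}^\infty F_n \implies U = \bigcup_{n=1}^\infty (F_n \cap U). \]

Since U is non-meager, Baire Category says that at least one $\overline{F_n \cap U}$ has non-empty interior, so there is $x_0 \in \mathcal{X}$, δ > 0 so that

\[ x_0 + \delta D(\mathcal{X}) \subset \overline{F_n \cap U} \subset F_n. \]

Now, for x ∈ D(X), we can write $x = \frac{1}{2\delta}[x_0 + \delta x - (x_0 - \delta x)]$, hence

\[ Tx = \frac{1}{2\delta}\big[ \underbrace{T(x_0 + \delta x)}_{\|\cdot\| \le n} - \underbrace{T(x_0 - \delta x)}_{\|\cdot\| \le n} \big], \quad \forall T \in \mathcal{F}, \]

so $\|Tx\| \le \frac{n}{\delta}$, so $\|T\| \le \frac{n}{\delta}$, independent of our choice of T ∈ F.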
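The key identity in the proof, Tx = (1/2δ)[T(x₀ + δx) − T(x₀ − δx)], uses only linearity. A quick sanity check, assuming for illustration that T is a 2×2 matrix acting on R² (the matrix, x₀, δ and x below are arbitrary sample values, not taken from the notes):

```python
from fractions import Fraction as Fr

def apply(T, v):
    """Apply a matrix T (given as a list of rows) to a vector v."""
    return [sum(t * c for t, c in zip(row, v)) for row in T]

T = [[Fr(2), Fr(-1)], [Fr(0), Fr(3)]]   # sample linear map
x0 = [Fr(5), Fr(-7)]                    # centre of the ball x0 + delta*D
delta = Fr(1, 4)
x = [Fr(1, 2), Fr(-1, 3)]               # a point of the open unit ball

plus = apply(T, [a + delta * b for a, b in zip(x0, x)])
minus = apply(T, [a - delta * b for a, b in zip(x0, x)])
recovered = [(p - m) / (2 * delta) for p, m in zip(plus, minus)]

print(recovered == apply(T, x))  # True
```

Both x₀ + δx and x₀ − δx lie in x₀ + δD(X), which is why a bound on that ball transfers to a bound on the unit ball.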

6.5 Corollary. If F ⊂ B(X, Y) is not uniformly bounded, then the subspace

\[ \mathcal{X}_0 = \{ x \in \mathcal{X} : \sup_{T \in \mathcal{F}} \|Tx\| < \infty \} \]

is meager in X.

6.6 Theorem (Open Mapping Theorem/Banach-Schauder Theorem). Let X, Y be Banach spaces and T ∈ B(X, Y). Then if T is surjective, i.e. T(X) = Y, then T is open, i.e. if U ⊂ X is open, then T(U) ⊂ Y is open.

6.7 Remark. We shall frequently use the following fact: if ∅ ≠ A ⊂ X, x ∈ X, 0 ≠ α ∈ F, then $\overline{x + \alpha A} = x + \alpha \overline{A}$. Indeed, $a_k \to a$ in X if and only if $x + \alpha a_k \to x + \alpha a$ in X.

6.8 Lemma (Main Lemma¹⁰). Suppose there is r > 0 such that

\[ \overline{T(D(\mathcal{X}))} \supset rD(\mathcal{Y}); \]

then $T(D(\mathcal{X})) \supset rD(\mathcal{Y})$.

¹⁰ See Bollobás' Linear Analysis, Chapter 5, Lemma 5.

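Proof of lemma. Let z ∈ rD(Y), and let δ > 0 be so that ‖z‖ < r(1 − δ) < r. Set $y = \frac{1}{1-\delta} z$, so $\|y\| = \frac{\|z\|}{1-\delta} < r$. We will show that $y \in \frac{1}{1-\delta} T(D(\mathcal{X}))$, i.e. there is $x \in \frac{1}{1-\delta} D(\mathcal{X})$ so Tx = y. Then T((1 − δ)x) = z and so we will have shown T(D(X)) ⊃ rD(Y).

Put $A = T(D(\mathcal{X})) \cap rD(\mathcal{Y})$; since rD(Y) is open and contained in $\overline{T(D(\mathcal{X}))}$, it follows from our assumptions that $\overline{A} \supset rD(\mathcal{Y})$. We proceed inductively, using $\overline{y_n + \delta^n A} = y_n + \delta^n \overline{A}$:

\begin{align*}
y \in rD(\mathcal{Y}) \subset \overline{A} &\implies \exists\, y_1 \in A \cap (y + \delta r D(\mathcal{Y})) \\
y \in y_1 + \delta r D(\mathcal{Y}) \subset y_1 + \delta \overline{A} &\implies \exists\, y_2 \in (y_1 + \delta A) \cap (y + \delta^2 r D(\mathcal{Y})) \\
y \in y_n + \delta^n r D(\mathcal{Y}) \subset y_n + \delta^n \overline{A} &\implies \exists\, y_{n+1} \in (y_n + \delta^n A) \cap (y + \delta^{n+1} r D(\mathcal{Y}))
\end{align*}

Thus, we found a sequence $(y_n)_{n=1}^\infty \subset \mathcal{Y}$. We note

\[ y_{n+1} \in y_n + \delta^n A \implies y_{n+1} - y_n \in \delta^n A \]

and $A = T(D(\mathcal{X})) \cap rD(\mathcal{Y})$, which yields

\[ y_{n+1} - y_n \in \delta^n r D(\mathcal{Y}), \text{ i.e. } \|y_{n+1} - y_n\| < \delta^n r, \]

\[ y_{n+1} - y_n \in \delta^n T(D(\mathcal{X})) = T(\delta^n D(\mathcal{X})), \]

i.e. there is $x_n \in \mathcal{X}$, $\|x_n\| < \delta^n$, such that $y_{n+1} - y_n = Tx_n$. Moreover, since $y_1 \in A \subset T(D(\mathcal{X}))$, we similarly find $x_0 \in D(\mathcal{X})$ such that $y_1 = Tx_0$. We also note that $y_{n+1} \in y + \delta^{n+1} r D(\mathcal{Y})$, hence $\|y_{n+1} - y\| < \delta^{n+1} r$, which implies

\[ \lim_{n \to \infty} y_n = y. \]

Now let

\[ x = \sum_{n=0}^\infty x_n, \]

which converges because X is a Banach space and the series converges absolutely; note that

\[ \|x\| \le \sum_{n=0}^\infty \|x_n\| < \sum_{n=0}^\infty \delta^n = \frac{1}{1-\delta}. \]

Using the linearity and continuity of T we have

\[ Tx = \sum_{n=0}^\infty Tx_n = y_1 + \sum_{n=1}^\infty (y_{n+1} - y_n) = \dots = y_N + \sum_{n=N}^\infty (y_{n+1} - y_n) = y, \]

where $y_N \to y$ and each $\|y_{n+1} - y_n\| < \delta^n r$, by

\[ \|Tx - y\| \le \|y_N - y\| + \sum_{n=N}^\infty \delta^n r \to 0 \quad (N \to \infty), \]

hence ‖Tx − y‖ = 0, as claimed.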

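Proof of Open Mapping Theorem. It suffices to show that T(D(X)) ⊃ rD(Y) for some r > 0. Indeed, supposing this, note that if U ⊂ X is open and x ∈ U then x + δD(X) ⊂ U for some δ > 0. Thus, U − x ⊃ δD(X). Hence

\[ T(U - x) \supset T(\delta D(\mathcal{X})) \supset \delta r D(\mathcal{Y}), \]

so that

\[ Tx + \delta r D(\mathcal{Y}) \subset Tx + T(U - x) = T(U). \]

Thus, T(U) is indeed open. Now, since T(X) = Y we see that

\[ \mathcal{Y} = \bigcup_{n=1}^\infty nT(D(\mathcal{X})) \]

because $\bigcup_n nT(D(\mathcal{X})) = T(\bigcup_n nD(\mathcal{X}))$. Since Y is a Banach space, the Baire Category Theorem tells us that some $nT(D(\mathcal{X}))$ is not nowhere dense, hence there is $y_0 \in \mathcal{Y}$ and δ > 0 such that

\[ y_0 + \delta D(\mathcal{Y}) \subset \overline{nT(D(\mathcal{X}))} = n\overline{T(D(\mathcal{X}))}. \]

Since $\overline{nT(D(\mathcal{X}))}$ is convex, and symmetric, i.e.

\[ -\overline{nT(D(\mathcal{X}))} = \overline{nT(-D(\mathcal{X}))} = \overline{nT(D(\mathcal{X}))}, \]

for y ∈ D(Y) we have that δy is just the midpoint of two vectors in $\overline{nT(D(\mathcal{X}))}$, that is,

\[ \delta y = \frac{1}{2}\big[ \underbrace{(y_0 + \delta y)}_{\in\, y_0 + \delta D(\mathcal{Y})} - \underbrace{(y_0 - \delta y)}_{\in\, y_0 + \delta D(\mathcal{Y})} \big] \in \overline{nT(D(\mathcal{X}))}, \]

so that $\frac{\delta}{n} y \in \overline{T(D(\mathcal{X}))}$, which yields that $\frac{\delta}{n} D(\mathcal{Y}) \subset \overline{T(D(\mathcal{X}))}$. Hence by the Main Lemma, $\frac{\delta}{n} D(\mathcal{Y}) \subset T(D(\mathcal{X}))$.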

6.9 Theorem (Inverse Mapping Theorem). If X, Y are Banach spaces and T ∈ B(X, Y) is bijective, then $T^{-1} \in B(\mathcal{Y}, \mathcal{X})$.

Proof. Since T ∈ L(X, Y) and bijective, we clearly have that $T^{-1} \in L(\mathcal{Y}, \mathcal{X})$. Since T ∈ B(X, Y) and surjective, the Open Mapping Theorem says there is r > 0 such that T(D(X)) ⊃ rD(Y). Applying $T^{-1}$, we see that $D(\mathcal{X}) \supset rT^{-1}(D(\mathcal{Y}))$, so $\frac{1}{r} D(\mathcal{X}) \supset T^{-1}(D(\mathcal{Y}))$, i.e. $\|T^{-1}\| \le \frac{1}{r} < \infty$, so that $T^{-1} \in B(\mathcal{Y}, \mathcal{X})$.

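6.10 Definition. Let X, Y be normed spaces and X ⊕ Y denote their direct sum, i.e.

\[ \mathcal{X} \oplus \mathcal{Y} = \{ (x, y) : x \in \mathcal{X},\ y \in \mathcal{Y} \}. \]

If 1 ≤ p < ∞, let

\[ \|(x, y)\|_p = (\|x\|^p + \|y\|^p)^{1/p}, \qquad \|(x, y)\|_\infty = \max\{\|x\|, \|y\|\}. \]

It is trivial to check that $\{\|\cdot\|_p : 1 \le p \le \infty\}$ is a family of norms on X ⊕ Y. If X, Y are Banach spaces, then

\[ \mathcal{X} \oplus_p \mathcal{Y} = (\mathcal{X} \oplus \mathcal{Y}, \|\cdot\|_p) \]

is also a Banach space for each p.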
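These norms depend on (x, y) only through the two numbers ‖x‖ and ‖y‖, so they are easy to experiment with. A minimal sketch (the function name `direct_sum_norm` and the sample values 3 and 4 are illustrative assumptions) that also checks the standard comparison ‖·‖_∞ ≤ ‖·‖_p ≤ ‖·‖₁ ≤ 2‖·‖_∞ on one sample:

```python
import math

def direct_sum_norm(nx, ny, p):
    """Norm of (x, y) in the p-direct sum, given nx = ||x|| and ny = ||y||."""
    if p == math.inf:
        return max(nx, ny)
    return (nx ** p + ny ** p) ** (1.0 / p)

nx, ny = 3.0, 4.0
n1 = direct_sum_norm(nx, ny, 1)
n2 = direct_sum_norm(nx, ny, 2)
ninf = direct_sum_norm(nx, ny, math.inf)

# All the p-norms are squeezed between the max-norm and the 1-norm.
print(ninf <= n2 <= n1 <= 2 * ninf)  # True
```

The chain of inequalities is what makes all of these norms equivalent to one another on X ⊕ Y.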

6.11 Fact. Any norm on X ⊕ Y such that ‖(x, 0)‖ = ‖x‖ and ‖(0, y)‖ = ‖y‖, is equivalent to $\|\cdot\|_1$.

6.12 Theorem (Closed Graph Theorem). Let X, Y be Banach spaces, and T ∈ L(X, Y). Then T ∈ B(X, Y) if and only if its graph,

\[ \Gamma(T) = \{ (x, Tx) : x \in \mathcal{X} \} \subset \mathcal{X} \oplus_1 \mathcal{Y}, \]

is closed.

Proof. (→) Note that if $x_n \to x$ in X, then $Tx_n \to Tx$ in Y. Hence if $(x, y) \in \overline{\Gamma(T)}$, then there is $(x_n)_{n=1}^\infty \subset \mathcal{X}$ such that $(x, y) = \lim_n (x_n, Tx_n)$. However by the above, $\lim_n (x_n, Tx_n) = (x, Tx)$. Hence (x, y) = (x, Tx) ∈ Γ(T), so $\overline{\Gamma(T)} = \Gamma(T)$.

(←) If Γ(T) is closed in the Banach space $\mathcal{X} \oplus_1 \mathcal{Y}$, then Γ(T) itself is a Banach space (it is trivial to see Γ(T) is a subspace). Define $\pi_1 : \Gamma(T) \to \mathcal{X}$ by $\pi_1(x, Tx) = x$. Note that

\[ \|\pi_1(x, Tx)\| = \|x\| \le \|x\| + \|Tx\| = \|(x, Tx)\|_1 \implies \|\pi_1\| \le 1. \]

Clearly, $\pi_1$ is linear, so $\pi_1 \in B(\Gamma(T), \mathcal{X})$. Also, it is clear that $\pi_1$ is a bijection. Thus by the Inverse Mapping Theorem, $\pi_1^{-1} \in B(\mathcal{X}, \Gamma(T))$. Hence, if x ∈ X,

\[ \|Tx\| \le \|x\| + \|Tx\| = \|(x, Tx)\|_1 = \|\pi_1^{-1} x\|_1 \le \|\pi_1^{-1}\| \|x\| \]

and hence $\|T\| \le \|\pi_1^{-1}\| < \infty$.

6.13 Proposition (Closed Graph Test). Given normed spaces X, Y and T ∈ L(X, Y), we have that $\Gamma(T) \subset \mathcal{X} \oplus_1 \mathcal{Y}$ is closed if and only if whenever $x_n \to 0$ in X and $Tx_n \to y$ in Y, then it must be the case that y = 0.

The point is that we know/assume that $\lim_n Tx_n$ exists.

Proof. Note that $(x_n, Tx_n) \to (x, z)$ in $\mathcal{X} \oplus_1 \mathcal{Y}$ if and only if $(x_n - x, T(x_n - x)) = (x_n - x, Tx_n - Tx) \to (0, z - Tx)$. Hence the statement [$(x_n, Tx_n) \to (x, z)$ implies z = Tx], which says that Γ(T) is closed, is equivalent to the statement [$x_n \to 0$ and $Tx_n \to y$ imply y = 0].

6.14 Definition. Let Y be a normed space and X ⊂ Y be a subspace. We say that X is (boundedly) complemented if there is P ∈ B(Y, Y) = B(Y) such that

\[ \operatorname{Im} P = P(\mathcal{Y}) = \mathcal{X} \]

and P is idempotent, i.e. $P^2 = P \circ P = P$.

6.15 Remark ("obvious" projection). In general, if X ⊂ Y is a subspace, let B be a basis for X. Then there is B′ ⊂ Y so B ∪ B′ is a basis for Y. Hence each y ∈ Y admits scalars $\{y_e\}_{e \in B \cup B'}$ such that $y_e \ne 0$ for only finitely many elements e, and

\[ y = \sum_{e \in B \cup B'} y_e e. \]

Define P : Y → X by

\[ Py = \sum_{e \in B} y_e e. \]

Then $P^2 = P$, and Im P = X. However, it is unlikely that such P is bounded.

Assignment #2 due Wednesday October 10. Office hours Friday 2-3:30, Tuesday 2:30-4. Assignment #1 willbe available for retrieval in the wooden box adjacent to my office door as of about 11am today.

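6.16 Theorem. If Y is a Banach space and X ⊂ Y is a closed subspace, then X is boundedly complemented in Y if and only if there is a closed subspace Z ⊂ Y such that X ∩ Z = {0} and X + Z = Y.

Proof. (→) We suppose there is P ∈ B(Y) such that $P^2 = P$ and Im P = X. Let Z = ker P. If x ∈ X ∩ Z, then x = Px = 0, so indeed X ∩ Z = {0}. Also if y ∈ Y, we have

\[ y = \underbrace{Py}_{\in \mathcal{X}} + \underbrace{(I - P)y}_{\in \mathcal{Z}\text{, check}} \]

so that X + Z = Y. Also $\mathcal{Z} = P^{-1}(\{0\})$, so it is closed.

(←) Since X and Z are closed in the Banach space Y, $\mathcal{X} \oplus_1 \mathcal{Z}$ is a Banach space. Let $J : \mathcal{X} \oplus_1 \mathcal{Z} \to \mathcal{Y}$ be given by J(x, z) = x + z. Then J is linear, ker J = {0} as X ∩ Z = {0}, and J is surjective as X + Z = Y. Also $\|J(x, z)\| = \|x + z\| \le \|x\| + \|z\| = \|(x, z)\|_1$, which implies ‖J‖ ≤ 1. By the Inverse Mapping Theorem, $J^{-1} \in B(\mathcal{Y}, \mathcal{X} \oplus_1 \mathcal{Z})$.

Let i : X → Y be the inclusion map and $\tilde{P} : \mathcal{X} \oplus_1 \mathcal{Z} \to \mathcal{X}$ be given by $\tilde{P}(x, z) = x$, so $\|\tilde{P}\| \le 1$. Let $P = i \circ \tilde{P} \circ J^{-1}$. Then Im P = X. Also, for y ∈ Y,

\[ \|Py\| = \|i \circ \tilde{P} \circ J^{-1} y\| \le \|\tilde{P} \circ J^{-1} y\| \le \|J^{-1} y\| \le \|J^{-1}\| \|y\| \]

so $\|P\| \le \|J^{-1}\| < \infty$, hence P ∈ B(Y). Finally, observe that

\[ P^2 = i \circ \underbrace{\tilde{P} \circ J^{-1} \circ i}_{\text{identity on } \mathcal{X}} \circ \tilde{P} \circ J^{-1} = i \circ \tilde{P} \circ J^{-1} = P. \]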
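A toy finite-dimensional instance of this construction may help: take Y = R², X = span{(1, 0)} and Z = span{(1, 1)} (these choices are assumptions made only for illustration). Writing (a, b) = (a − b)(1, 0) + b(1, 1), the projection onto X along Z is (a, b) ↦ (a − b, 0).

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def apply(A, v):
    """Apply a matrix to a vector."""
    return [sum(a * c for a, c in zip(row, v)) for row in A]

P = [[1, -1],
     [0, 0]]   # projection onto X = span{(1,0)} along Z = span{(1,1)}

print(matmul(P, P) == P)   # idempotent: True
print(apply(P, [1, 0]))    # fixes X: [1, 0]
print(apply(P, [1, 1]))    # kills Z: [0, 0]
```

Note that this P is not the orthogonal projection onto X; which projection one gets depends on the choice of complement Z.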

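6.17 Remark (Murray, 1940s). If 1 ≤ p ≤ ∞ (p ≠ 2), then $\ell^p$ admits subspaces which are not boundedly complemented¹¹.

¹¹ In fact, it was proved by J. Lindenstrauss and L. Tzafriri in 1971 (On the complemented subspaces problem, Israel J. Math. 9 (1971), 263-269; MR 43 #2474) that a Banach space is isomorphic to a Hilbert space if all of its closed subspaces are boundedly complemented!

6.18 Theorem. $c_0$ is not boundedly complemented in $\ell^\infty$.

Proof. We suppose, for sake of contradiction, that $P \in B(\ell^\infty)$ exists with $\operatorname{Im} P = c_0$, $P^2 = P$. Let $\mathcal{F} \subset \mathcal{P}(\mathbb{N})$ be an uncountable collection of infinite sets, such that for distinct sets $F_1, F_2 \in \mathcal{F}$, $|F_1 \cap F_2| < \infty$. For each $F \in \mathcal{F}$, let $y_F = (I - P)\chi_F$. Note that $c_0 = \ker(I - P)$, and clearly $\chi_F \notin c_0$, so $y_F \ne 0$ for any F. Observe for distinct $F_1, \dots, F_m \in \mathcal{F}$ and $\alpha_1, \dots, \alpha_m \in \mathbb{F}$, we have

\[ \sum_{i=1}^m \alpha_i \chi_{F_i} = \sum_{i=1}^m \alpha_i \chi_{F_i \setminus G_i} + h, \qquad \Big\| \sum_{i=1}^m \alpha_i \chi_{F_i \setminus G_i} \Big\|_\infty = \max_{i=1,\dots,m} |\alpha_i|, \]

where $G_i = \bigcup_{j \ne i} F_j$ and h is supported on the finite set $\bigcup_{i \ne j} (F_i \cap F_j)$, so that $h \in c_0$ (on each intersection $F_{j_1} \cap \dots \cap F_{j_k}$, k ≥ 2, the left-hand side takes a value of the form $\alpha_{j_1} + \dots + \alpha_{j_k}$). Since $(I - P)h = 0$, we get

\[ \Big\| \sum_{i=1}^m \alpha_i y_{F_i} \Big\| = \Big\| (I - P) \sum_{i=1}^m \alpha_i \chi_{F_i} \Big\| \le \|I - P\| \Big\| \sum_{i=1}^m \alpha_i \chi_{F_i \setminus G_i} \Big\| = \|I - P\| \max_{i=1,\dots,m} |\alpha_i|. \]

Now, set for n, k ∈ N,

\[ \mathcal{F}_{n,k} = \{ F \in \mathcal{F} : |\delta_k(y_F)| > \tfrac{1}{n} \} \]

where $\delta_k((x_i)_{i=1}^\infty) = x_k$, so $\delta_k \in (\ell^\infty)^*$ with $\|\delta_k\| = 1$. Now, if $F_1, \dots, F_m$ are distinct elements of $\mathcal{F}_{n,k}$, then with $\alpha_i = \overline{\operatorname{sgn} \delta_k(y_{F_i})}$ (so that $|\alpha_i| = 1$ and $\alpha_i \delta_k(y_{F_i}) = |\delta_k(y_{F_i})|$), we have

\[ \|I - P\| \ge \Big\| \sum_{i=1}^m \alpha_i y_{F_i} \Big\| \ge \Big| \sum_{i=1}^m \alpha_i \delta_k(y_{F_i}) \Big| = \sum_{i=1}^m \underbrace{|\delta_k(y_{F_i})|}_{\ge 1/n} \ge \frac{m}{n} \]

so $m \le n\|I - P\|$, hence $\mathcal{F}_{n,k}$ is finite. Since each $y_F \ne 0$, for $F \in \mathcal{F}$, we see that

\[ \mathcal{F} = \bigcup_{n=1}^\infty \bigcup_{k=1}^\infty \mathcal{F}_{n,k} \]

but each $\mathcal{F}_{n,k}$ is finite, which contradicts that $\mathcal{F}$ is uncountable.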

7 Finite-dimensional Banach spaces

7.1 Lemma. Given a finite-dimensional R-vector space X, let $e_1, \dots, e_d$ be a basis, and let $\|x\|_1 = \sum_{i=1}^d |x_i|$ where $x = \sum_{i=1}^d x_i e_i$. If $B = B_{\|\cdot\|_1}(\mathcal{X})$, then B is compact.

Proof. Let us accept the Bolzano-Weierstrass Theorem: any sequence in [−1, 1] has a convergent subsequence. Let $(x^{(n)})_{n=1}^\infty \subset B$. We will show that this has a convergent subsequence.

• $(x_1^{(n)})_{n=1}^\infty \subset [-1, 1]$ has a convergent subsequence $(x_1^{(n_{k(1)})})_{k(1)=1}^\infty$

• $(x_2^{(n_{k(1)})})_{k(1)=1}^\infty \subset [-1, 1]$ has a convergent subsequence $(x_2^{(n_{k(2)})})_{k(2)=1}^\infty$

• ...

• $(x_d^{(n_{k(d-1)})})_{k(d-1)=1}^\infty \subset [-1, 1]$ has a convergent subsequence $(x_d^{(n_{k(d)})})_{k(d)=1}^\infty$

Check that $(x^{(n_{k(d)})})_{k(d)=1}^\infty$ converges in B.

7.2 Theorem. Let X be a finite-dimensional F-vector space. Then any two norms on X are equivalent.

Proof. Given a norm ‖·‖ on X, we will show that $\|\cdot\| \sim \|\cdot\|_1$, where $\|\cdot\|_1$ is the R-norm given with respect to an R-basis $e_1, \dots, e_d$.

First, let $M = \max_{i=1,\dots,d} \|e_i\|$. Then for $x = \sum_{i=1}^d x_i e_i$,

\[ \|x\| \le \sum_{i=1}^d \|x_i e_i\| = \sum_{i=1}^d |x_i| \|e_i\| \le M \|x\|_1. \]

Thus we have for x, y ∈ X,

\[ \big| \|x\| - \|y\| \big| \le \|x - y\| \le M \|x - y\|_1 \]

so ‖·‖ is $\|\cdot\|_1$-continuous. Now if $S = S_{\|\cdot\|_1}(\mathcal{X})$ is the unit sphere for $\|\cdot\|_1$, this is closed in $B_{\|\cdot\|_1}(\mathcal{X})$, and hence compact. Thus

\[ m = \min_{x \in S} \|x\| \]

exists, and m > 0. Now if x ∈ X \ {0} we have

\[ m \le \Big\| \underbrace{\tfrac{1}{\|x\|_1} x}_{\in S} \Big\|, \]

hence $m\|x\|_1 \le \|x\|$.

7.3 Corollary. Let (X, ‖·‖) be a finite-dimensional normed space.

(i) (Heine-Borel) A subset K ⊂ X is compact if and only if K is closed and bounded.

(ii) (X, ‖·‖) is complete, hence a Banach space.

(iii) If Y is any normed space, then L(X, Y) = B(X, Y).

(iii') X′ = X∗.

Proof. We have:

(i) (→) Straightforward exercise from PM 351.

(←) We have an m > 0 such that $\|\cdot\| \ge m\|\cdot\|_1$. Hence $B_{\|\cdot\|}(\mathcal{X}) \subset \frac{1}{m} B_{\|\cdot\|_1}(\mathcal{X})$, so for any r > 0, $rB_{\|\cdot\|}(\mathcal{X}) \subset \frac{r}{m} B_{\|\cdot\|_1}(\mathcal{X})$. The map $x \mapsto \frac{r}{m} x$ on X is continuous, so $\frac{r}{m} B_{\|\cdot\|_1}(\mathcal{X})$ is compact. Then $K \subset rB_{\|\cdot\|}(\mathcal{X}) \subset \frac{r}{m} B_{\|\cdot\|_1}(\mathcal{X})$, for large enough r, and K is a closed subset, hence compact.

(ii) Given a Cauchy sequence $(x_n)_{n=1}^\infty \subseteq \mathcal{X}$, the set $\{x_n\}_{n=1}^\infty$ is bounded, so $\overline{\{x_n\}_{n=1}^\infty}$ is compact. Hence the sequence has a cluster point; a Cauchy sequence has at most one cluster point, which is then its limit.

(iii) Given T ∈ L(X, Y), we let for x ∈ X

\[ |||x||| = \|x\| + \|Tx\|. \]

Then |||·||| is a norm, so there is M > 0 so that |||·||| ≤ M‖·‖. Then we have for x ∈ X that

\[ \|Tx\| \le \|x\| + \|Tx\| = |||x||| \le M\|x\|. \]

Hence ‖T‖ ≤ M < ∞.

(iii') Let Y = F.

Balls in finite-dimensional spaces

If ‖·‖ is a norm on X, then $B_{\|\cdot\|}(\mathcal{X})$ is convex, symmetric ($-x \in B_{\|\cdot\|}(\mathcal{X})$ whenever $x \in B_{\|\cdot\|}(\mathcal{X})$) [if F = C then $B_{\|\cdot\|}(\mathcal{X})$ is balanced ($zx \in B_{\|\cdot\|}(\mathcal{X})$ whenever |z| ≤ 1 in C and $x \in B_{\|\cdot\|}(\mathcal{X})$)], and absorbing about 0.

If we are given, in an F-vector space, a set B such that B is convex, symmetric (balanced if F = C) and absorbing about 0, then the Minkowski functional $\|\cdot\|_B$ is a norm (exercise).

Given a norm ‖·‖ on X, we have

\[ \|\cdot\|_{B_{\|\cdot\|}(\mathcal{X})} = \|\cdot\|. \]

Now suppose dim X < ∞. We have that if A ⊂ X is convex and absorbing about 0, then A is a neighbourhood of 0 (w.r.t. any norm we may put on X). Indeed, since A is absorbing at 0, given an R-basis $e_1, \dots, e_d$ for X, there are $\delta_1, \dots, \delta_d > 0$ such that $\pm\delta_i e_i = 0 \pm \delta_i e_i \in A$. Let

\[ B = \operatorname{co}\{ \pm\delta_i e_i : i = 1, \dots, d \} \subset A. \]

Check that B is closed. We note that B is convex, symmetric, and absorbing at 0, and hence

\[ B = B_{\|\cdot\|_B}(\mathcal{X}) \subset A, \]

i.e. $D_{\|\cdot\|_B}(\mathcal{X}) \subset A$.

7.4 Proposition. If X is a finite-dimensional subspace of a normed space Y, then X is closed and boundedly complemented.

Proof. Let $e_1, \dots, e_d$ be a basis for X. Define $f_1, \dots, f_d \in \mathcal{X}'$ by

\[ f_i\Big( \sum_{j=1}^d \alpha_j e_j \Big) = \alpha_i \]

($\alpha_1, \dots, \alpha_d \in \mathbb{F}$). Since X is finite-dimensional, with the norm inherited from Y, X′ = X∗. We find Hahn-Banach extensions $F_1, \dots, F_d$ for $f_1, \dots, f_d$ respectively so each $F_i \in \mathcal{Y}^*$. Then define P : Y → Y by

\[ Py = \sum_{i=1}^d F_i(y) e_i. \]

Then P is linear, Im P = X, $P^2 = P$, and

\[ \|P\| \le \sum_{i=1}^d \|F_i\| \|e_i\| < \infty. \]

We note that X, with norm from Y, is complete, hence closed. Alternatively, X = ker(I − P), which is closed.

7.5 Theorem. Let X be a normed space. Then B(X) is compact if and only if X is finite-dimensional.

Proof. (←) Heine-Borel, above.

7.6 Lemma (Riesz's Lemma). If X is a normed space and Y is a proper closed subspace, then given ε > 0, there is $x_0 \in B(\mathcal{X})$ such that $d(x_0, \mathcal{Y}) \ge 1 - \varepsilon$.

Proof. Given x ∈ X \ Y, we define f : Y + Fx → F by f(y + αx) = α for y ∈ Y, α ∈ F. We note that ker f = Y, which is closed, so f is continuous (Prop'n before Separation Theorem). Thus by the Hahn-Banach Theorem, there is an extension F ∈ X∗ of f. Find $x_0 \in B(\mathcal{X})$ such that $|F(x_0)| \ge (1 - \varepsilon)\|F\|$. We observe for any y in Y (where F(y) = 0)

\[ \|x_0 - y\| \ge \frac{1}{\|F\|} |F(x_0 - y)| = \frac{|F(x_0)|}{\|F\|} \ge 1 - \varepsilon. \]

Proof of (→) of theorem. We show that if dim X is not finite, then B(X) is not compact. We perform an induction; given 0 < ε < 1, find

• $x_1 \in S(\mathcal{X})$, i.e. $\|x_1\| = 1$.

• $x_2 \in B(\mathcal{X})$ such that $d(x_2, \mathbb{F}x_1) \ge 1 - \varepsilon$.

• $x_3 \in B(\mathcal{X})$ such that $d(x_3, \operatorname{span}\{x_1, x_2\}) \ge 1 - \varepsilon$.

• ...

• $x_n \in B(\mathcal{X})$ such that $d(x_n, \operatorname{span}\{x_1, \dots, x_{n-1}\}) \ge 1 - \varepsilon$.

Note, at each stage, $\operatorname{span}\{x_1, \dots, x_n\}$ is finite-dimensional, and hence closed. Moreover, dim X is not finite, so these finite-dimensional subspaces are proper. We note if n > m then

\[ \|x_n - x_m\| \ge d(x_n, \operatorname{span}\{x_1, \dots, x_{n-1}\}) \ge 1 - \varepsilon \]

and we see that $(x_n)_{n=1}^\infty \subset B(\mathcal{X})$ admits no Cauchy subsequence.
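For a concrete picture of such a "spread out" sequence, one can take the standard unit vectors in ℓ¹ (an example chosen here for illustration, not the sequence built in the proof). The sketch below truncates to finitely many coordinates and checks that distinct unit vectors are at ‖·‖₁-distance exactly 2, so no subsequence can be Cauchy.

```python
def l1_dist(u, v):
    """Distance between two finite vectors in the 1-norm."""
    return sum(abs(a - b) for a, b in zip(u, v))

def e(n, dim):
    """n-th standard unit vector, truncated to dim coordinates."""
    return [1 if i == n else 0 for i in range(dim)]

dim = 6
dists = {l1_dist(e(i, dim), e(j, dim))
         for i in range(dim) for j in range(i + 1, dim)}
print(dists)  # {2}
```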

8 Initial topologies, compactness

8.1 Definition. Let X be a non-empty set, $\{(X_\alpha, \tau_\alpha)\}_{\alpha \in A}$ be a collection of topological spaces, and $\{f_\alpha : X \to X_\alpha\}_{\alpha \in A}$ be a family of functions. We define the initial topology $\sigma = \sigma(X, \{f_\alpha\}_{\alpha \in A})$ as follows:

\[ U \in \sigma \iff \text{for each } x \in U \text{, there exist } U_1 \in \tau_{\alpha_1}, \dots, U_n \in \tau_{\alpha_n}\ (n \in \mathbb{N}) \text{ such that } x \in \bigcap_{i=1}^n f_{\alpha_i}^{-1}(U_i) \subset U. \]

Hence the subsets

\[ \Big\{ \bigcap_{i=1}^n f_{\alpha_i}^{-1}(U_i) : n \in \mathbb{N}, \text{ each } \alpha_i \in A, \text{ each } U_i \in \tau_{\alpha_i} \Big\} \]

form a base for σ, i.e. any U in σ is the union of such sets. We might call the sets $\{f_\alpha^{-1}(U) : \alpha \in A, U \in \tau_\alpha\}$ a subbase, i.e. finite intersections of such sets form a base.

8.2 Remark. Each $f_\alpha : (X, \sigma) \to (X_\alpha, \tau_\alpha)$ is σ-$\tau_\alpha$-continuous (A1Q1). In fact, σ is the coarsest topology on X which allows each $f_\alpha$ to be continuous, i.e. if τ ⊆ P(X) is any topology for which each $f_\alpha : X \to X_\alpha$ is τ-$\tau_\alpha$-continuous, then σ ⊆ τ. We remark that σ is trivially a topology: it is closed under finite intersections, and arbitrary unions.

8.3 Example (Metric Topology). Let ρ : X × X → [0, ∞) be a metric on X. For each x ∈ X, let $\rho_x(y) = \rho(x, y)$, so $\{\rho_x : X \to \mathbb{R}\}_{x \in X}$ is a family of functions on X (we equip R with its usual topology, of course). It's not difficult to show that

\[ \tau_\rho = \sigma(X, \{\rho_x\}_{x \in X}). \]

8.4 Example (Product Topology). Let $\{(X_\alpha, \tau_\alpha)\}_{\alpha \in A}$ be a collection of topological spaces. Let

\[ X = \prod_{\alpha \in A} X_\alpha = \{ x = (x_\alpha)_{\alpha \in A} : x_\alpha \in X_\alpha \text{ for all } \alpha \in A \} \]

denote the Cartesian product. For each α, denote by $\pi_\alpha : X \to X_\alpha$ the canonical projections, i.e. $\pi_\alpha x = x_\alpha$. Then we define the product topology on X by

\[ \pi = \sigma(X, \{\pi_\alpha\}_{\alpha \in A}). \]

8.5 Remark. Note the following:

(i) If A = N, then basic open neighbourhoods of any point in $X = \prod_{n \in \mathbb{N}} X_n$ are given by

\[ \bigcap_{i=1}^n p_i^{-1}(U_i) = U_1 \times U_2 \times \dots \times U_n \times X_{n+1} \times X_{n+2} \times \dots \]

(ii) Consider the set $\mathbb{R}^{\mathbb{R}}$ (functions from R to R) with the product topology π. We're effectively selecting finitely many points $t_1, \dots, t_n$. [DIAGRAM]

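8.6 Example (Linear Topologies). Let X be a normed space and Z ⊂ X∗. We may write $\mathcal{Z} = \{f : \mathcal{X} \to \mathbb{F}\}_{f \in \mathcal{Z}}$. The initial topology σ(X, Z) is called the linear topology on X from Z. Note that $\sigma(\mathcal{X}, \mathcal{Z}) \subset \tau_{\|\cdot\|}$. In particular:

• The weak topology is defined as w = σ(X, X∗) ⊂ P(X).

• The weak* topology is defined as $w^* = \sigma(\mathcal{X}^*, \hat{\mathcal{X}}) \subset \mathcal{P}(\mathcal{X}^*)$.

8.7 Remark. Recall that $\mathcal{X}^{**} \supset \hat{\mathcal{X}} = \{\hat{x} : x \in \mathcal{X}\}$, where $\hat{x}(f) = f(x)$. What do subbasic w-open sets of X look like? Fix f ∈ X∗ \ {0}, $x_0 \in \mathcal{X}$. Then

\[ f^{-1}(f(x_0) + \varepsilon D) = \{ x \in \mathcal{X} : f(x) \in f(x_0) + \varepsilon D \} = \{ x \in \mathcal{X} : f(x) - f(x_0) \in \varepsilon D \} = \{ x \in \mathcal{X} : |f(x) - f(x_0)| < \varepsilon \} \]

or alternatively, $f^{-1}(f(x_0) + \varepsilon D) = \{ x \in \mathcal{X} : |g(x) - g(x_0)| < 1 \}$ where $g = \frac{1}{\varepsilon} f$. Here,

\[ D = D(\mathbb{F}) = \begin{cases} (-1, 1) & \text{if } \mathbb{F} = \mathbb{R} \\ \{ z : |z| < 1 \} & \text{if } \mathbb{F} = \mathbb{C}. \end{cases} \]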

8.8 Example (Relative Topology). Let (Y, τ) be a topological space. For X ⊂ Y, let ι : X → Y denote the canonical injection/embedding map. The relative (or relativized) topology on X is denoted

\[ \tau|_X = \sigma(X, \{\iota\}). \]

Note $U \in \tau|_X$ if and only if there exists V ∈ τ such that U = X ∩ V.

8.9 Definition. If (X, τ) is a topological space, a set K ⊂ X is τ-compact if given any open cover of K, i.e. any family O ⊂ τ such that

\[ K \subset \bigcup_{U \in \mathcal{O}} U, \]

there is a finite subcollection $\{U_1, \dots, U_n\} \subset \mathcal{O}$ such that

\[ K \subset \bigcup_{i=1}^n U_i. \]

We say that (X, τ) is a compact space if X itself is (τ-)compact.

8.10 Definition. Let (X, τ) be a topological space. A set F ⊂ X is (τ-)closed if X \ F ∈ τ. If S ⊂ X is any set, we define its τ-closure by

\[ \overline{S} = \overline{S}^\tau = \bigcap \{ F \subset X : F \supset S \text{ and } F \text{ is } \tau\text{-closed} \}. \]

Since unions of open sets are open, intersections of closed sets are closed. Thus $\overline{S}^\tau$ is the smallest τ-closed set containing S.

8.11 Proposition. $\overline{S} = \{ x \in X : \text{for every } U \text{ with } x \in U \in \tau \text{, we have } U \cap S \ne \emptyset \}$.

Proof. (⊆) If for x ∈ X, there exists a τ-neighbourhood U of x such that U ∩ S = ∅, then $\overline{S} \subseteq X \setminus U$ so $x \notin \overline{S}$.

(⊇) If $x \notin \overline{S}$ then $x \in X \setminus \overline{S}$, which is a τ-neighbourhood of x that misses S, so x is not in the set described by the RHS.

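8.12 Proposition. A topological space (X, τ) is compact if and only if for each family F ⊂ P(X) with FIP (finite intersection property, i.e. for any finite collection $F_1, \dots, F_n \in \mathcal{F}$ we have $\bigcap_{i=1}^n F_i \ne \emptyset$), we have $\bigcap_{F \in \mathcal{F}} \overline{F} \ne \emptyset$.

Proof. (→) Suppose (X, τ) is compact. If F ⊆ P(X) is such that $\bigcap_{F \in \mathcal{F}} \overline{F} = \emptyset$, then $\{X \setminus \overline{F} : F \in \mathcal{F}\}$ is an open cover of X, and so admits a finite subcover $\{X \setminus \overline{F_i}\}_{i=1}^n$. But then

\[ X = \bigcup_{i=1}^n (X \setminus \overline{F_i}) = X \setminus \bigcap_{i=1}^n \overline{F_i} \implies \bigcap_{i=1}^n \overline{F_i} = \emptyset \implies \bigcap_{i=1}^n F_i \subseteq \bigcap_{i=1}^n \overline{F_i} = \emptyset \]

so we don't have FIP.

(←) If O is a τ-open cover of X, then $\mathcal{F} = \{X \setminus U : U \in \mathcal{O}\}$ is a family of closed sets. If O admitted no finite subcover, then every finite collection $\{X \setminus U_i\}_{i=1}^n \subset \mathcal{F}$ would have

\[ \bigcap_{i=1}^n (X \setminus U_i) \ne \emptyset. \]

However, by assumption, this would mean that

\[ \bigcap_{U \in \mathcal{O}} (X \setminus U) \ne \emptyset, \]

but then $\bigcup \mathcal{O} \subsetneq X$, violating the fact that O is an open cover.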

8.13 Definition. Given a set X, an ultrafilter is a collection U ⊂ P(X) such that:

1. U has the FIP.

2. For any A ⊂ X, either A ∈ U or X \ A ∈ U (note that both cannot hold, else FIP is violated).

8.14 Example. If x ∈ X, then $\mathcal{U}_x = \{U \subset X : x \in U\}$ is an ultrafilter, called a principal (or trivial) ultrafilter.
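On a finite set both defining properties can be checked by brute force. The sketch below (function names are hypothetical, chosen for this example) verifies that the principal family at the point 1 in X = {0, 1, 2} is an ultrafilter, and that the family {X} alone is not.

```python
from itertools import combinations

def powerset(X):
    """All subsets of X, as frozensets."""
    items = sorted(X)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def is_ultrafilter(U, X):
    """Check properties 1 and 2 of the definition for a family U on a finite set X."""
    X = frozenset(X)
    # Property 2: for each A, exactly one of A and X \ A belongs to U.
    dichotomy = all((A in U) != ((X - A) in U) for A in powerset(X))
    # Property 1 (FIP): U is finite here, so it is enough that the
    # intersection of ALL members of U is non-empty.
    total = X
    for A in U:
        total = total & A
    return dichotomy and len(total) > 0

X = {0, 1, 2}
U1 = {A for A in powerset(X) if 1 in A}   # principal family at the point 1
U2 = {frozenset(X)}                       # too small to be an ultrafilter

print(is_ultrafilter(U1, X), is_ultrafilter(U2, X))  # True False
```

On a finite set every ultrafilter is principal, so this brute-force check cannot exhibit the non-principal ultrafilters discussed next; those need an infinite X and the Ultrafilter Lemma.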

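8.15 Lemma (Ultrafilter Lemma). If X ≠ ∅ and F ⊂ P(X) has FIP, then an ultrafilter U ⊃ F exists.

Proof. See Dr. Spronk's website for a tidied-up proof.

(Repair: for $G_1, G_2 \in \mathcal{G}$, there is $G_3$ such that $\emptyset \ne G_3 \subset G_1 \cap G_2$; with $\mathcal{F}' = \{\bigcap_{i=1}^n F_i : F_1, \dots, F_n \in \mathcal{F}, n \in \mathbb{N}\}$ we have $\mathcal{F}' \in \Phi$, so Φ ≠ ∅.)

Let $\Phi = \{\mathcal{G} \subset \mathcal{P}(X) : \mathcal{F} \subset \mathcal{G}, \mathcal{G} \text{ has FIP}\}$. We partially order Φ by inclusion. If Γ is a chain in Φ then let $\mathcal{G}_\Gamma = \bigcup_{\mathcal{G} \in \Gamma} \mathcal{G}$. If $G_1 \in \mathcal{G}_1, \dots, G_n \in \mathcal{G}_n$ for some $\mathcal{G}_i \in \Gamma$, then up to reindexing $\mathcal{G}_1 \subset \dots \subset \mathcal{G}_n$, so $G_1, \dots, G_n \in \mathcal{G}_n$. Hence $\bigcap_{i=1}^n G_i \ne \emptyset$ since $\mathcal{G}_n \in \Phi$. Trivially, $\mathcal{F} \subset \mathcal{G}_\Gamma$, so $\mathcal{G}_\Gamma \in \Phi$ and is an upper bound. Hence by Zorn's Lemma, a maximal element U of Φ exists.

(For each instance of $G_1, \dots, G_n$, replace it by: if $G_1, G_2 \in \mathcal{G}_\Gamma$ then $G_1 \in \mathcal{G}_1$, $G_2 \in \mathcal{G}_2$ for some $\mathcal{G}_1, \mathcal{G}_2 \in \Gamma$; up to reindexing, $\mathcal{G}_1 \subset \mathcal{G}_2$, so there is $G_3 \in \mathcal{G}_2$ such that $\emptyset \ne G_3 \subset G_1 \cap G_2$.)

We observe that if A ⊂ X and A ∩ U ≠ ∅ for each U in U, then $\mathcal{U} \cup \{A \cap U\}_{U \in \mathcal{U}} \in \Phi$. Indeed, if $U_1, U_2 \in \mathcal{U}$, then there is $U_3 \subset U_1 \cap U_2$ and $A \cap U_3 \ne \emptyset$, so $A \cap (U_1 \cap U_2) \ne \emptyset$; hence $\mathcal{U} \cup \{A \cap U\}_{U \in \mathcal{U}} = \mathcal{U}$ by maximality of U. If A ∩ U = ∅ for at least one U in U, then (X \ A) ⊃ U, so $\mathcal{U} \cup \{X \setminus A\} \in \Phi$, so again $\mathcal{U} \cup \{X \setminus A\} = \mathcal{U}$ by maximality of U.

8.16 Corollary. Non-principal ultrafilters exist for infinite X.

Proof. Let $\mathcal{F}_0 = \{X \setminus E : E \subset X \text{ is finite}\}$, so that $\mathcal{F}_0$ has FIP. Any ultrafilter U containing $\mathcal{F}_0$ is non-principal, i.e. $\mathcal{U} \ne \mathcal{U}_x$ for any x ∈ X.

8.17 Theorem (Tychonoff's Theorem). If $\{(X_\alpha, \tau_\alpha)\}_{\alpha \in A}$ is a family of compact spaces, then $X = \prod_{\alpha \in A} X_\alpha$ is compact (with the product topology π).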

Proof. See Dr. Spronk’s website for a tidied-up proof.

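Let F ⊂ P(X) have FIP. We let U be any ultrafilter containing F. Fix, for the moment, α ∈ A. We observe that $\{p_\alpha(U)\}_{U \in \mathcal{U}}$ has FIP in $X_\alpha$. Indeed, if $U_1, \dots, U_n \in \mathcal{U}$ then

\[ \bigcap_{i=1}^n p_\alpha(U_i) \supset p_\alpha\Big( \underbrace{\bigcap_{i=1}^n U_i}_{\ne \emptyset} \Big) \ne \emptyset \]

($p_\alpha$ has full domain(?)). Then by an earlier proposition, $\bigcap_{U \in \mathcal{U}} \overline{p_\alpha(U)}^{\tau_\alpha} \ne \emptyset$ in $X_\alpha$. Thus there is $x_\alpha \in \bigcap_{U \in \mathcal{U}} \overline{p_\alpha(U)}^{\tau_\alpha}$ and hence for any $\tau_\alpha$-neighbourhood V of $x_\alpha$, $V \cap p_\alpha(U) \ne \emptyset$ for each U ∈ U. Then $p_\alpha^{-1}(V) \cap U \ne \emptyset$ for any U ∈ U.

We do this for all α and obtain $x = (x_\alpha)_{\alpha \in A} \in X$. If $\alpha_1, \dots, \alpha_n \in A$ and $x_{\alpha_i} \in V_i \in \tau_{\alpha_i}$ then the π-neighbourhood $\bigcap_{i=1}^n p_{\alpha_i}^{-1}(V_i) \in \mathcal{U}$. Hence $x \in \overline{U}^\pi$ for each U ∈ U and thus $x \in \bigcap_{U \in \mathcal{U}} \overline{U}^\pi \subset \bigcap_{F \in \mathcal{F}} \overline{F}^\pi$, so $\bigcap_{F \in \mathcal{F}} \overline{F}^\pi \ne \emptyset$. Hence (X, π) is compact.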

9 An application of ultrafilters: ultrafilter limits

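Let $\mathcal{F}_0 = \{\mathbb{N} \setminus E : E \in \mathcal{P}(\mathbb{N}) \text{ is finite}\}$. Observe that $\mathcal{F}_0$ certainly has the Finite Intersection Property. Let U be any ultrafilter containing $\mathcal{F}_0$. Define $\delta_{\mathcal{U}} : \mathcal{P}(\mathbb{N}) \to \{0, 1\} \subset \mathbb{R}$ by

\[ \delta_{\mathcal{U}}(A) = \begin{cases} 1 & \text{if } A \in \mathcal{U} \\ 0 & \text{if } A \notin \mathcal{U}. \end{cases} \]

We note that $\delta_{\mathcal{U}}(\emptyset) = 0$. Also, if A, B ∈ P(N) and A ∩ B = ∅ then at most one of A or B is in U, and hence $\delta_{\mathcal{U}}(A) + \delta_{\mathcal{U}}(B) = \delta_{\mathcal{U}}(A \cup B)$ (using the property that for E ∈ P(N), either E ∈ U or N \ E ∈ U). If we have a partition

\[ \mathbb{N} = \bigsqcup_{i=1}^n E_i, \]

then exactly one $E_i \in \mathcal{U}$ and the others are not. It follows that the variation $V(\delta_{\mathcal{U}}) = 1$. Hence $\delta_{\mathcal{U}} \in FA(\mathbb{N})$; see A1Q4. By A1Q4, there is $L_{\mathcal{U}} \in (\ell^\infty)^*$ such that $L_{\mathcal{U}}(\chi_S) = \delta_{\mathcal{U}}(S)$. We note the following facts (the proofs are similar to A2Q4):

(i) $L_{\mathcal{U}}(\mathbf{1}) = 1$, where $\mathbf{1}$ denotes the all-ones sequence, and $\|L_{\mathcal{U}}\| = V(\delta_{\mathcal{U}}) = 1$.

(ii) $L_{\mathcal{U}}|_{c_0} = 0$, $L_{\mathcal{U}}(x) \ge 0$ if $x_n \ge 0$ for all n, and

\[ \liminf_{n \to \infty} x_n \le L_{\mathcal{U}}(x) \le \limsup_{n \to \infty} x_n \quad (\text{if each } x_n \in \mathbb{R}). \]

(iii) $L_{\mathcal{U}}$ is not translation-invariant:

\[ L_{\mathcal{U}}(\chi_{2\mathbb{N}}) \ne L_{\mathcal{U}}(\chi_{2\mathbb{N}-1}), \]

noting that only one of 2N or 2N − 1 is a member of U. However $L_{\mathcal{U}}(\chi_{2\mathbb{N}-1}) = L_{\mathcal{U}}(1 * \chi_{2\mathbb{N}})$.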

9.1 Definition. We call $L_{\mathcal{U}} \in (\ell^\infty)^*$ the ultrafilter limit given by U.

We can construct Banach limits using the "Cesàro" operator $S \in B(\ell^\infty)$, where we define

\[ Sx = \big( x_1, \tfrac{1}{2}(x_1 + x_2), \tfrac{1}{3}(x_1 + x_2 + x_3), \dots \big). \]

Note that ‖S‖ = 1. We note that $Sx - S(1 * x) \in c_0$ and hence $L_{\mathcal{U}} \circ S$ is a Banach limit.
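A numerical sketch of the Cesàro operator on a finite truncation may make the claim Sx − S(1 ∗ x) ∈ c₀ plausible. Here x is the indicator sequence of the odd numbers, and the translate is taken to be the one-step left shift (x₂, x₃, …); this choice of direction is an assumption about the notation 1 ∗ x, and the estimate is the same for the other direction.

```python
from fractions import Fraction

def cesaro(x):
    """Running averages (x_1, (x_1+x_2)/2, (x_1+x_2+x_3)/3, ...)."""
    out, total = [], Fraction(0)
    for n, xn in enumerate(x, start=1):
        total += xn
        out.append(total / n)
    return out

N = 1000
x = [n % 2 for n in range(1, N + 1)]   # 1, 0, 1, 0, ...
Sx = cesaro(x)
Sshift = cesaro(x[1:])                 # Cesaro means of the shifted sequence

# The n-th entries differ by (x_1 - x_{n+1}) / n, which tends to 0.
gaps_small = all(abs(Sx[k] - Sshift[k]) <= Fraction(1, k + 1)
                 for k in range(N - 1))
print(Sx[-1], gaps_small)  # 1/2 True
```

The averages of this sequence settle at 1/2, the value any Banach limit must assign to it, whereas an ultrafilter limit alone assigns it either 0 or 1.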

9.2 Proposition. The cardinality of the set of ultrafilters on N is ≥ c.

Proof. Let F be a family of infinite subsets of N with |F| = c and |F ∩ E| < ∞ for any F ≠ E in F. For F ∈ F, define $\mathcal{G}_F = \mathcal{F}_0 \cup \{F\}$, which has FIP (where $\mathcal{F}_0$ consists of the cofinite sets in N, as above). Let $\mathcal{U}_F \supset \mathcal{G}_F$ be an ultrafilter. Since |F ∩ E| < ∞ for F ≠ E in F, we find that $\mathcal{U}_F \ne \mathcal{U}_E$.

9.3 Remark. In fact, the cardinality of the set of ultrafilters on N is $2^{\mathfrak{c}}$.

On A3, we see that one can create product spaces that are compact, but not sequentially compact (i.e. compact spaces in which sequences need not have convergent subsequences). This will justify our development of nets, to come.

9.4 Definition. A topological space (X, τ) is Hausdorff if for x ≠ y in X, we have neighbourhoods $x \in U_x \in \tau$ and $y \in U_y \in \tau$ such that $U_x \cap U_y = \emptyset$.

9.5 Example. We have:

(i) Any metric space is Hausdorff: indeed if ρ is a metric, then ρ(x, y) = 0 if and only if x = y.

(ii) If X is a normed space, then the weak topology σ(X, X∗) is Hausdorff (consequence of the Hahn-Banach Theorem).

(iii) The weak-* topology $\sigma(\mathcal{X}^*, \hat{\mathcal{X}})$ is Hausdorff (by definition).

(iv) If $\{(X_\alpha, \tau_\alpha)\}_{\alpha \in A}$ is a family of Hausdorff spaces, then their product $X = \prod X_\alpha$ is Hausdorff when equipped with the product topology π.

Proof. If x ≠ y in X, find α in A such that $x_\alpha \ne y_\alpha$. Find neighbourhoods $x_\alpha \in U \in \tau_\alpha$, $y_\alpha \in V \in \tau_\alpha$ with U ∩ V = ∅; then it is easy to check that $p_\alpha^{-1}(U) \cap p_\alpha^{-1}(V) = p_\alpha^{-1}(U \cap V) = \emptyset$.

(v) Sierpinski space: ({0, 1}, σ), where σ = {∅, {1}, {0, 1}}. Notice that this is not Hausdorff.

9.6 Exercise. If (X, τ) is a topological space, then $C((X, \tau), (\{0, 1\}, \sigma)) = \{\chi_U : U \in \tau\}$, that is, the continuous functions are precisely the indicator functions of τ-open sets.
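The exercise can be checked by brute force on a small finite space. In the sketch below the three-point space and its topology are made-up sample data; the code enumerates all maps into {0, 1} and keeps those whose preimages of Sierpinski-open sets are open.

```python
from itertools import product

X = (0, 1, 2)
# A sample topology on X (a chain of open sets).
tau = {frozenset(), frozenset({0}), frozenset({0, 1}), frozenset(X)}
# The Sierpinski topology on {0, 1}.
sigma = [frozenset(), frozenset({1}), frozenset({0, 1})]

def preimage(f, V):
    """Preimage of V under the map f, where f is a dict from X to {0, 1}."""
    return frozenset(x for x in X if f[x] in V)

continuous = []
for values in product((0, 1), repeat=len(X)):
    f = dict(zip(X, values))
    if all(preimage(f, V) in tau for V in sigma):
        # Record f through the set where it equals 1, i.e. f = chi_U.
        continuous.append(preimage(f, {1}))

print(set(continuous) == tau)  # True
```

Of the 8 maps from X to {0, 1}, exactly the 4 indicator functions of open sets survive the continuity test.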

9.7 Proposition. We have:

(i) If (X, τ) is compact, and K ⊂ X is τ-closed, then K is τ-compact.

(ii) If (X, τ) is Hausdorff, then any τ-compact set K ⊂ X is closed.

9.8 Remark. In the Sierpinski space ({0, 1}, σ), {1} is compact but not closed.

Proof. We have:

(i) Let O ⊂ τ be an open cover for K. Then O ∪ {X \ K} ⊂ τ is a cover for X. Since X is compact, that cover admits a finite subcover for X, which is hence itself a finite subcover for K.

(ii) Fix x ∈ X \ K. For each y ∈ K find neighbourhoods $y \in U_y \in \tau$, $x \in V_y \in \tau$ so that $U_y \cap V_y = \emptyset$. We have that $K \subset \bigcup \{U_y : y \in K\}$ so there are $y_1, \dots, y_n \in K$ such that $K \subset \bigcup_{i=1}^n U_{y_i}$. Now we have that $x \in \bigcap_{i=1}^n V_{y_i} \in \tau$ and moreover $(\bigcap_{i=1}^n V_{y_i}) \cap (\bigcup_{i=1}^n U_{y_i}) = \emptyset$, so $(\bigcap_{i=1}^n V_{y_i}) \cap K = \emptyset$. Thus X \ K is open.

9.9 Proposition. We have:

(i) If (X, τ) is compact, (Y, σ) is a topological space and ϕ : X → Y is τ-σ-continuous, then ϕ(X) is σ-compact.

(ii) If (X, τ) is compact, (Y, σ) is Hausdorff and ϕ : X → Y is a τ-σ-continuous bijection, then $\varphi^{-1} : Y \to X$ is σ-τ-continuous.

Proof. We have:

(i) If O ⊂ σ is an open cover of ϕ(X), then by A1Q1, $\{\varphi^{-1}(U)\}_{U \in \mathcal{O}} \subset \tau$ is a cover of X. Extract a finite subcover, using compactness of X.

(ii) If K ⊂ X is closed, then K is τ-compact, hence ϕ(K) is σ-compact and hence σ-closed, as σ is Hausdorff. Hence we have for τ-closed K ⊂ X that $(\varphi^{-1})^{-1}(K) = \varphi(K)$, which is closed. Use A1Q1.

Final exam date: TBA.

Next week: I’m away. We’ll have a guest lecturer E. Elgun.

Office hours: F 2:30-3:30, week of Oct. 29, M 2:30-4, T 2:30-4

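9.10 Theorem (Alaoglu's Theorem). If X is a normed space, then B(X∗) is w∗-compact.

Proof. Let $\Gamma : \mathcal{X}^* \to \mathbb{F}^{\mathcal{X}}$ be given by $\Gamma(f) = (f(x))_{x \in \mathcal{X}}$ (its injectivity is clear). If U ⊂ F is open, and x ∈ X, then $f \in \hat{x}^{-1}(U) \iff f(x) \in U \iff \Gamma(f) \in \pi_x^{-1}(U)$ (here, $\pi_x : \mathbb{F}^{\mathcal{X}} \to \mathbb{F}$ are the coordinate projections), so

\[ \Gamma(\hat{x}^{-1}(U)) = \pi_x^{-1}(U) \cap \Gamma(\mathcal{X}^*). \]

Thus if $x_1, \dots, x_n \in \mathcal{X}$, and $U_1, \dots, U_n \subset \mathbb{F}$ are open, we see that

\[ \Gamma\Big( \underbrace{\bigcap_{i=1}^n \hat{x}_i^{-1}(U_i)}_{\text{basic set in } w^*} \Big) = \bigcap_{i=1}^n \Gamma(\hat{x}_i^{-1}(U_i)) = \bigcap_{i=1}^n \pi_{x_i}^{-1}(U_i) \cap \Gamma(\mathcal{X}^*) \]

and hence $\Gamma : \mathcal{X}^* \to \Gamma(\mathcal{X}^*)$ is an open map from w∗ to $\pi|_{\Gamma(\mathcal{X}^*)}$, thus by A1Q1, $\Gamma^{-1} : \Gamma(\mathcal{X}^*) \to \mathcal{X}^*$ is π-w∗-continuous.

We prove, now, that Γ(B(X∗)) is π-closed in $\mathbb{F}^{\mathcal{X}}$. Suppose $g = (g(x))_{x \in \mathcal{X}} \in \overline{\Gamma(B(\mathcal{X}^*))}^{\pi}$. Now for x, y in X, α ∈ F and for each n in N we have

\[ \emptyset \ne \Gamma(B(\mathcal{X}^*)) \cap \pi_x^{-1}(g(x) + \tfrac{1}{n} D) \cap \pi_y^{-1}(g(y) + \tfrac{1}{n} D) \cap \pi_{x+\alpha y}^{-1}(g(x + \alpha y) + \tfrac{1}{n} D) \]

(D = D(F)), so this set contains $\Gamma(f_n)$ for some $f_n \in B(\mathcal{X}^*)$. We note that

\begin{align*}
|g(x + \alpha y) - (g(x) + \alpha g(y))| &\le |g(x + \alpha y) - f_n(x + \alpha y)| + |g(x) - f_n(x)| + |\alpha| |g(y) - f_n(y)| \\
&< \frac{1}{n} + \frac{1}{n} + \frac{|\alpha|}{n} = \frac{2 + |\alpha|}{n}
\end{align*}

(using $0 = f_n(x + \alpha y) - (f_n(x) + \alpha f_n(y))$), which holds for all n, so g, i.e. x ↦ g(x), is linear. Likewise, we see that for each x in X we find $f_n \in B(\mathcal{X}^*)$ with

\[ |g(x) - f_n(x)| < \frac{1}{n} \implies |g(x)| \le |f_n(x)| + \frac{1}{n} \le \|x\| + \frac{1}{n} \]

and it follows that ‖g‖ ≤ 1. Hence g "=" Γ(g), g ∈ B(X∗).

We now observe that $\Gamma(B(\mathcal{X}^*)) \subset \prod_{x \in \mathcal{X}} \|x\| B \subset \mathbb{F}^{\mathcal{X}}$ (here B = B(F) is the closed unit ball of the scalar field). The latter set is compact, by Tychonoff's theorem; moreover Γ(B(X∗)) is π-closed, and thus itself π-compact. Then

\[ B(\mathcal{X}^*) = \Gamma^{-1}\big( \underbrace{\Gamma(B(\mathcal{X}^*))}_{\pi\text{-compact}} \big) \]

and hence B(X∗) is w∗-compact, by the continuity of $\Gamma^{-1}$.

9.11 Corollary. Any bounded w∗-closed subset A ⊂ X∗ is w∗-compact.

Proof. The proof could have easily been conducted on rB(X∗) for any r > 0, so rB(X∗) is w∗-compact. If A is w∗-closed, and A ⊂ rB(X∗) for some r > 0, then A is a closed subset of a compact set.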

10 Nets

These are like sequences in some ways, but unlike sequences in other ways.

10.1 Definition. We make the following definitions.

• A directed set is a pair (N, ≤) where ≤ is

– a preorder on N (reflexive and transitive, but not necessarily antisymmetric: we allow ν ≤ ν′, ν′ ≤ ν without ν = ν′).

– a cofinal ordering, i.e. for any $\nu_1, \nu_2 \in N$, there exists $\nu_3 \in N$ such that $\nu_1 \le \nu_3$, $\nu_2 \le \nu_3$.

• Let X ≠ ∅. A net is a map $(\nu \mapsto x_\nu) : N \to X$ where (N, ≤) is a directed set.

• If (M, ≤) is another directed set, a map ψ : M → N is directed if

– µ ≤ µ′ in M implies ψ(µ) ≤ ψ(µ′) in N.

– ψ is cofinal, i.e. for any ν in N, there is µ ∈ M such that ν ≤ ψ(µ).

• If $(x_\nu)_{\nu \in N} \subset X$ is a net, and ψ : M → N is a directed map, then $(x_{\psi(\mu)})_{\mu \in M}$ is called a subnet of $(x_\nu)_{\nu \in N}$. We often write $\nu_\mu = \psi(\mu)$ and hence $(x_{\nu_\mu})_{\mu \in M}$.

• If $(x_\nu)_{\nu \in N} \subset X$ is a net and A ⊂ X, we say that $(x_\nu)_{\nu \in N}$ is

– eventually in A if there is $\nu_A \in N$ such that $x_\nu \in A$ whenever $\nu \ge \nu_A$.

– frequently in A if for any ν in N, there is ν′ ≥ ν such that $x_{\nu'} \in A$.

10.2 Example. We have:

(i) Sequences are nets.

(ii) R, [0, ∞), [a, b] are directed sets via usual ≤. So any map [a, b] → X is a net.

(iii) (Riemann sums) Fix a compact interval [a, b] ⊂ R. Let

\[ N = \{ (s_0, s_1, \dots, s_n; t_1, \dots, t_n) : a = s_0 < s_1 < \dots < s_n = b,\ t_i \in [s_{i-1}, s_i],\ i = 1, \dots, n,\ n \in \mathbb{N} \}. \]

We say that $(s_0, \dots, s_n; t_1, \dots, t_n) \le (s'_0, \dots, s'_m; t'_1, \dots, t'_m)$ iff $\{s_0, \dots, s_n\} \subset \{s'_0, \dots, s'_m\}$. Inspect that this is a cofinal preordering. If f : [a, b] → F is a function, a Riemann sum is given for $\nu = (s_0, \dots, s_n; t_1, \dots, t_n)$ by

\[ S_\nu(f) = \sum_{i=1}^n f(t_i)(s_i - s_{i-1}). \]

(iv) Let X ≠ ∅. A family of sets F ⊂ P(X) is filtering if for any $F_1, F_2 \in \mathcal{F}$, there is $\emptyset \ne F_3 \in \mathcal{F}$ such that $F_1 \cap F_2 \supset F_3$. Now let $N = \{(x, F) : x \in F \in \mathcal{F}\}$. We let (x, F) ≤ (x′, F′) iff F ⊃ F′. This is a cofinal preordering. We let $(x)_{(x,F) \in N} \subset X$ and this is a net.
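The Riemann-sum net of example (iii) can be sketched as follows for [a, b] = [0, 1]. The helper names and the choice of uniform partitions with left endpoints as tags are assumptions made for illustration; exact rationals keep the comparison of partition points reliable.

```python
from fractions import Fraction

def riemann_sum(f, s, t):
    """S_nu(f) for the tagged partition nu = (s_0, ..., s_n; t_1, ..., t_n)."""
    if len(s) != len(t) + 1:
        raise ValueError("need one tag per subinterval")
    if any(not (s[i - 1] < s[i] and s[i - 1] <= t[i - 1] <= s[i])
           for i in range(1, len(s))):
        raise ValueError("not a valid tagged partition")
    return sum(f(t[i - 1]) * (s[i] - s[i - 1]) for i in range(1, len(s)))

def precedes(nu, nu2):
    """nu <= nu2 iff the partition points of nu are among those of nu2."""
    return set(nu[0]) <= set(nu2[0])

def uniform_left(n):
    """Uniform partition of [0, 1] into n pieces, tagged at left endpoints."""
    s = [Fraction(i, n) for i in range(n + 1)]
    return s, s[:-1]

def square(u):
    return u * u

nu2, nu4 = uniform_left(2), uniform_left(4)
sum2 = riemann_sum(square, *nu2)
sum4 = riemann_sum(square, *nu4)
print(sum2, sum4, precedes(nu2, nu4), precedes(nu4, nu2))  # 1/8 7/32 True False
```

Note that the ordering ignores the tags entirely, which is one reason it is only a preorder: two different taggings of the same partition each precede the other.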

10.3 Definition. Let X be a non-empty set, (xν)ν∈N a net on X. We say that (xν)ν∈N is an ultranet if for anyA ⊂ X, either (xν)ν∈N is eventually in A, or is eventually in X \A.10.4 Remark. In general, if (xν)ν∈N is a net in X and A ⊂ X, then either (xν)ν∈N is frequently in A or iseventually in X \A.10.5 Proposition. If (xν)ν∈N is a net in X and A ⊂ X is such that (xν)ν∈N is frequently in A then there is asubnet (xνµ)µ∈M which is eventually in A. Moreover, this subnet can be arranged to be an ultranet.

Proof. Let Φ = F ⊂ P(X) : A ∈ F ,F is filtering, and for all F in F , (xν)ν∈N is frequently in F. Recall: filter-ing, F1, F2 ∈ F , then there is ∅ 6= F3 such that F3 ⊂ F1 ∩ F2. Observe that A ∈ Φ, if Fν = A ∩ xν′ν∈N,ν′≥νthen Fνν∈N ∈ Φ, i.e. if ν1, ν2 ∈ N , ν3 ≥ ν1, ν2 and Fν3 = Fν1 ∩Fν2 and since (xν)ν∈N is frequently in A, there isν ≥ ν3 such that xν ∈ A, so Fν3 6= ∅.

Given ℱ ∈ Φ we let

M = {(ν, F) : xν ∈ F ∈ ℱ}.

We preorder M by (ν, F) ≤ (ν′, F′) iff ν ≤ ν′ in N and F ⊇ F′. [We let ν(ν,F) = ν. (†)] We remark that this preordering is cofinal: if (ν1, F1), (ν2, F2) ∈ M, find ∅ ≠ F3 ⊂ F1 ∩ F2 in ℱ and ν3 ≥ ν1, ν2 such that xν3 ∈ F3, so (ν3, F3) ≥ (ν1, F1), (ν2, F2). [†] It is obvious that (ν, F) ↦ ν is a directed map. Thus (xν)(ν,F)∈M is a subnet of (xν)ν∈N. Moreover, (xν)(ν,F)∈M is eventually in A. Indeed, if (ν0, F0) ∈ M is such that F0 ⊂ A (and F0 ∈ ℱ), which exists by the assumption on Φ, then for (ν, F) ≥ (ν0, F0) in M we have xν ∈ F ⊂ F0 ⊂ A.

Now, if ℱ ∈ Φ, the Ultrafilter Lemma provides an ultrafilter 𝒰 ⊃ ℱ. We observe for U ∈ 𝒰 that U ∩ F ≠ ∅ for each F in ℱ, hence 𝒰 ∈ Φ (check!). Then (xν)(ν,U)∈M (where M is as above, with 𝒰 replacing ℱ) is an ultranet (check!).

10.6 Definition. Suppose (X, τ) is a topological space, and (xν)ν∈N is a net in X. If x0 ∈ X, we say that

• x0 is a limit point for (xν)ν∈N if each neighbourhood x0 ∈ U ∈ τ has that (xν)ν∈N is eventually in U .

• x0 is a cluster point for (xν)ν∈N if each neighbourhood x0 ∈ U ∈ τ has that (xν)ν∈N is frequently in U .

In the first case we write x0 = τ-limν∈N xν. We may abuse notation and write ρ-lim rather than τρ-lim if ρ is a metric.

10.7 Proposition. If (xν)ν∈N is a net in a topological space (X, τ) and x0 in X is a τ-cluster point, then there is a subnet (xνµ)µ∈M such that x0 = τ-limµ∈M xνµ.

Proof. We let Ox0 = {U ∈ τ : x0 ∈ U}. Then Ox0 is a filtering family of sets, and for U in Ox0, (xν)ν∈N is frequently in U. Consider the subnet (xνµ)µ∈M of the prior proposition. Check that for U in Ox0, (xνµ)µ∈M is eventually in U.

10.8 Remark. By A3Q5, f : (X, τ)→ (Y, σ) is continuous if and only if it preserves net convergence.

(i) If σ, τ ⊂ P(X) are topologies, then σ ⊂ τ iff id : (X, τ) → (X, σ) is continuous, iff σ-limν∈N xν = x0 whenever τ-limν∈N xν = x0.

(ii) In F^X, the product topology π is the topology of “pointwise convergence”. If f0 = (f0(x))x∈X ∈ F^X and (fν)ν∈N ⊂ F^X is a net, then f0 = π-limν∈N fν iff for each x in X, f0(x) = limν∈N fν(x) in F (F has the metric topology). Check!

(iii) If X = ∏α∈A Xα, each (Xα, τα) a topological space, then a net (x(ν))ν∈N converges to x(0) in π (product topology) if and only if τα-limν∈N x(ν)α = x(0)α for each α in A.

10.9 Theorem (Metrisation Theorem). Let X be a separable normed space. Then (B(X∗), w∗) is metrisable.

Proof. Since X is separable, B(X) contains a dense countable set {xn}∞n=1 (from this, we can extract a linearly independent, countable set {x′n}∞n=1 such that span{x′n}∞n=1 is dense in X; we can use the x′n in place of the elements xn in this proof). For f, g ∈ X∗ let

ρ(f, g) = ∑_{n=1}^{∞} |f(xn) − g(xn)| / 2^n.

Observe |f(xn) − g(xn)| = |(f − g)(xn)| ≤ ‖f − g‖ ‖xn‖ ≤ ‖f − g‖ < ∞ (as ‖xn‖ ≤ 1), so ρ(f, g) ≤ ‖f − g‖ < ∞. It is straightforward to see that this is a metric. Let us check:

• non-degeneracy: If f ≠ g then, since {xn}∞n=1 is dense in B(X), for at least one xn we have f(xn) ≠ g(xn), so ρ(f, g) > 0.

• triangle law: ρ(f, g) ≤ ρ(f, h) + ρ(h, g) for f, g, h ∈ X ∗ which is straightforward.

Now we show that for (fν)ν∈N, a net in B(X∗), w∗-limν∈N fν = f0 implies τρ-limν∈N fν = f0. Suppose w∗-limν∈N fν = f0. Given ε > 0, there is n0 such that

∑_{k=n0+1}^{∞} 2/2^k < ε/2,

and then for each k = 1, . . . , n0 there is νk in N such that

|fν(xk) − f0(xk)| < ε/2 whenever ν ≥ νk.

If ν ≥ ν1, . . . , νn0, we calculate ρ(fν, f0) < ε, and it follows that ρ-limν∈N fν = f0. Hence we see that id : (B(X∗), w∗) → (B(X∗), τρ) is continuous. But the latter is Hausdorff, while the first is compact, so id is a homeomorphism, and hence τρ = w∗ on B(X∗).

10.10 Exercise. If X is a normed space and (B(X∗), w∗) is metrisable, then X is separable.

Note: Closed Convex Hull Theorem implies that if C is a convex set with C ⊆ X, normed space, then

C is ‖ · ‖-closed ⇐⇒ C is w-closed.

10.11 Theorem (w∗-Separation Theorem). Let X be a normed space. Suppose A, B are convex, disjoint subsets of X∗, of which B is w∗-open. Then there exist x ∈ X and α ∈ R such that

Re f(x) ≤ α < Re g(x) ∀f ∈ A, g ∈ B.

Proof. By the Separation Theorem, there is F ∈ X ∗∗ and α ∈ R such that

ReF (f) ≤ α < ReF (g) ∀f ∈ A, g ∈ B.

We know B is w∗-open, i.e. B ∈ σ(X∗, X). So for any f0 ∈ B, there are x1, . . . , xn ∈ X such that

f0 ∈ U = ⋂_{i=1}^{n} x̂i^{−1}(f0(xi) + D) ⊂ B.

Let Y = ⋂_{i=1}^{n} ker x̂i; then x̂i(Y + f0) = {x̂i(f0)} = {f0(xi)} ⊆ f0(xi) + D. So Y + f0 ⊂ U ⊂ B. For any f ∈ Y, Re F(f + f0) > α since f + f0 ∈ B. Hence Re F(f) > α − Re F(f0), which means F|Y = 0 (why?), so Y ⊆ ker F. Together with the next lemma this will prove the result; the lemma will conclude that F ∈ span{x̂1, . . . , x̂n}.

10.12 Lemma. In an F-vector space X, if f0, . . . , fn ∈ X′ are such that ker f0 ⊃ ⋂_{i=1}^{n} ker fi, then f0 ∈ span{f1, . . . , fn}.

Proof. Let T : X → F^n be given by x ↦ (f1(x), . . . , fn(x)). Then

ker T = ⋂_{i=1}^{n} ker fi.

Then R = Im T ⊆ F^n is a vector subspace. Define g0 ∈ R′ by g0(Tx) = f0(x). Notice that if Tx = Ty, then T(x − y) = 0, so f0(x − y) = 0 (as ker T ⊂ ker f0), i.e. f0(x) = f0(y). Then g0(Tx) = f0(x) = f0(y) = g0(Ty), so g0 is well-defined. Also, g0 : F^n ⊇ R → F is linear. Let g ∈ (F^n)′ be any extension of g0 to F^n. Then there are α1, . . . , αn ∈ F such that

g(y1, . . . , yn) = ∑_{i=1}^{n} αi yi.

Hence, restricting our attention to R, we get f0(x) = g0(Tx) = g(Tx) = ∑_{i=1}^{n} αi fi(x) for all x in X, i.e.

f0 = ∑_{i=1}^{n} αi fi.


10.13 Theorem (w∗-Closed Convex Hull Theorem). If S ⊂ X ∗, then

\overline{co}^{w∗} S = ⋂{H : H is a w∗-closed half-space containing S}.

Proof. Similar to the Closed Convex Hull Theorem.

10.14 Theorem (Goldstine’s Theorem). Let X be a normed space. Then \widehat{B(X)}, the image of B(X) under x ↦ x̂, is w∗-dense (i.e. σ(X∗∗, X∗)-dense) in B(X∗∗).

Proof. Let A = \overline{\widehat{B(X)}}^{w∗}. Since B(X∗∗) is w∗-closed (by Alaoglu), A ⊆ B(X∗∗). Let F0 ∈ X∗∗ \ A, noting that X∗∗ \ A is w∗-open. Then there is a w∗-open basic neighbourhood U of F0 with U ⊆ X∗∗ \ A, which means U ∩ A = ∅. Such U can be chosen to be convex, i.e.

U = ⋂_{i=1}^{n} f̂i^{−1}(F0(fi) + D)

for some f1, . . . , fn ∈ X∗. Hence, by the w∗-Separation Theorem, there are f ∈ X∗ and α ∈ R such that

Re F(f) > α ≥ Re G(f) ∀F ∈ U, G ∈ A.

In particular, F0 ∈ U, so Re F0(f) > α. Since B(X) is symmetric,

Re x̂(f) = Re f(x) ≥ 0 for some x ∈ B(X),

so α ≥ 0. Moreover, Re F0(f) > α forces f ≠ 0, so Re f(x) > 0 for some x ∈ B(X), and hence α > 0. Now let f′ = (1/α)f. Then

Re F(f′) > 1 ≥ Re G(f′) ∀F ∈ U, G ∈ A. (1)

Hence

Re F0(f′) > 1 ≥ Re x̂(f′) for x ∈ B(X). (2)

Now (2) implies ‖Re f′‖ ≤ 1, so ‖f′‖ ≤ 1. Hence (1) implies |F0(f′)| > 1, thus ‖F0‖ > 1. Thus F0 ∉ B(X∗∗). Therefore B(X∗∗) = A.

10.15 Definition. A normed space X is called reflexive if X̂ = X∗∗, i.e. if the canonical embedding x ↦ x̂ of X into X∗∗ is surjective.

Since dual spaces are always complete, a reflexive normed space is Banach.

10.16 Theorem. Given a Banach space X , the following are equivalent:

1. X is reflexive.

2. B(X ) is weakly compact.

3. σ(X∗, X) = σ(X∗, X∗∗), i.e. the weak and weak* topologies coincide on X∗.

4. X ∗ is reflexive.

Recall that a normed space X is reflexive if X̂ = X∗∗, and that any reflexive X is Banach.

Proof. Consider the map f : (B(X), σ(X, X∗)) → (\widehat{B(X)}, σ(X̂, X∗)) given by x ↦ x̂. This is a homeomorphism. Also note that

X̂∗ ⊂ (X∗∗)∗ = X∗∗∗

is the space of evaluation functionals.

(i) → (ii): If X is reflexive, the range of the map f is (B(X∗∗), σ(X∗∗, X∗)), which is compact by the Alaoglu Theorem. Thus the domain B(X) is also w-compact.

(ii) → (i): If B(X) is w-compact, then the range \widehat{B(X)} of the map f is w∗-compact, hence w∗-closed, in X∗∗. By Goldstine’s Theorem, we get \widehat{B(X)} = B(X∗∗). Thus X̂ = X∗∗.

(i) → (iii): Obvious.

(iii) → (iv): We have (B(X∗), σ(X∗, X∗∗)) = (B(X∗), σ(X∗, X)), and the latter is compact by Alaoglu. Thus (ii) → (i) applied to X∗ yields that X∗ is reflexive.

(iv) → (i): Suppose X∗ is reflexive. \widehat{B(X)} is convex and norm-closed in B(X∗∗), hence by the Closed Convex Hull Theorem, \widehat{B(X)} is σ(X∗∗, X∗∗∗)-closed. By our assumption X∗∗∗ = X̂∗, so σ(X∗∗, X∗∗∗) = σ(X∗∗, X∗). Hence \widehat{B(X)} is closed with respect to σ(X∗∗, X∗). By Goldstine, \widehat{B(X)} = B(X∗∗), hence X̂ = X∗∗, i.e. X is reflexive.


10.17 Corollary. Any finite-dimensional normed space is reflexive.

Proof. B(X ) is norm-compact (Heine-Borel) implies B(X ) is w-compact.

10.18 Corollary. If Y is a closed subspace of a reflexive X then Y is also reflexive.

Proof. By the Hahn-Banach theorem, Y∗ = X∗|Y, so σ(Y, Y∗) = σ(Y, X∗|Y) = σ(X, X∗)|Y. Then B(Y) = B(X) ∩ Y is norm-closed and convex, hence weakly closed, by the Closed Convex Hull Theorem. Since B(Y) is w-closed and B(X) is w-compact,

B(Y) ⊆ B(X) =⇒ B(Y) is w-compact.

Therefore, Y is reflexive.

11 Extreme points and the Krein-Milman theorem

11.1 Definition. Let X be a normed space and C ⊂ X be a convex set. A face of C is a non-empty convex subset F ⊂ C, such that if x, y ∈ C, t ∈ [0, 1] and (1 − t)x + ty ∈ F then both x, y ∈ F. Furthermore, a face E of C is called an extreme point if E is a singleton.

11.2 Theorem (Krein-Milman Theorem). Let X be a normed space and let C ⊂ X∗ be w∗-compact and convex. Then C = \overline{co}^{w∗} ext(C).

Last time, we saw that for X a normed space and C ⊂ X, a face F of C is nonempty, convex, and if (1 − t)x + ty ∈ F for some x, y ∈ C and t ∈ [0, 1] then x, y ∈ F. This should have read “t ∈ (0, 1)”. Indeed, if t = 0 were allowed, take x ∈ F; then for any y ∈ C, (1 − 0)x + 0y = x ∈ F would imply y ∈ F, which implies F = C.

If F = {f}, then we call f (or {f}) an extreme point of C.

Proof of Krein-Milman Theorem. Since C is w∗-closed and convex, \overline{co}^{w∗} ext(C) ⊆ C. First we will prove that ext(C) ≠ ∅. We take

ℱ = {F ⊂ C : F is a w∗-closed face of C}.

Note C ∈ ℱ. Then (ℱ, ⊇) is a partially ordered set. Let 𝒞 be a chain in ℱ. Define

F_𝒞 = ⋂_{F∈𝒞} F,

which is non-empty by w∗-compactness of C, and noting each F is w∗-closed we have that the intersection above is too. Furthermore, F_𝒞 is a face: suppose x, y ∈ C, t ∈ (0, 1) are such that

(1 − t)x + ty ∈ F_𝒞 =⇒ (1 − t)x + ty ∈ F for any F ∈ 𝒞,

so indeed x, y ∈ F for any F ∈ 𝒞. So x, y ∈ F_𝒞. Hence F_𝒞 ∈ ℱ, so it is an upper bound, with respect to ⊇, for 𝒞. By Zorn’s lemma, there is a maximal element of (ℱ, ⊇), i.e. a minimal w∗-closed face, which we call E. E is w∗-compact, so for x ∈ X

Re x̂(E) ⊆ R is compact.

Hence Re x̂(E) has a minimum value, say mx. Let Ex = {f ∈ E : Re f(x) = Re x̂(f) = mx}, which is non-empty and w∗-closed. Also Ex is a face (exercise). Hence Ex ⊆ E, but E was minimal, so Ex = E. Now if f, g ∈ E, then for any x ∈ X

Re f(x) = mx = Re g(x) =⇒ f = g.

Thus E = {f} is a singleton and f ∈ ext(C).

Finally, we will prove that C ⊆ \overline{co}^{w∗} ext(C). Let f0 ∈ X∗ \ \overline{co}^{w∗} ext(C); we want to show f0 ∉ C. Note that \overline{co}^{w∗} ext C is convex, and X∗ \ \overline{co}^{w∗} ext C is w∗-open. By the w∗-Separation Theorem, there are x ∈ X, α ∈ R such that Re f0(x) < α ≤ Re f(x) for any f ∈ ext C. C is w∗-compact, so mx = min Re x̂(C) exists. Let

Cx = {f ∈ C : Re f(x) = mx},

noting that this is w∗-closed. Also, Cx is a face. By the first part of this proof applied to Cx, we conclude Cx admits an extreme point, say f. Then f is also an extreme point of C. So α ≤ Re f(x) = mx. Hence Re f0(x) < mx, thus f0 ∉ C. Thus C = \overline{co}^{w∗} ext C.


11.3 Corollary. If X is a normed space, and C ⊂ X is a w-compact convex set, then C = \overline{co}^{w} ext C.

Proof. The embedding (X, σ(X, X∗)) → (X̂, σ(X̂, X∗)) ⊆ (X∗∗, σ(X∗∗, X∗)) is a homeomorphism onto its range. C is w-compact, hence Ĉ is w∗-compact. Then by the Krein-Milman theorem,

Ĉ = \overline{co}^{w∗} ext Ĉ.

Hence C = \overline{co}^{w} ext C.

11.4 Corollary. If X is a normed space, C ⊆ X is a norm-compact convex set, then C = \overline{co}^{‖·‖} ext C.

Proof. C is norm-compact and convex, so C is w-compact. Then C = \overline{co}^{w} ext C = \overline{co}^{‖·‖} ext C, by the Closed Convex Hull Theorem.

11.5 Theorem (Minkowski’s Inequality). ‖f + g‖p ≤ ‖f‖p + ‖g‖p, with equality for 1 < p < ∞ if and only if f = λg for some λ ≥ 0.

11.6 Proposition. Let X be any of ℓp, Lp(R) or Lp([0, 1]), where 1 < p < ∞. Then ext B(X) = S(X).

Proof. If f ∈ D(X), i.e. ‖f‖p < 1, and f ≠ 0, then

(1 − ‖f‖p) · 0 + ‖f‖p · (1/‖f‖p) f = f, where (1/‖f‖p) f ∉ D(X), =⇒ f ∉ ext B(X).

(If f = 0, write 0 = ½g + ½(−g) for any g ∈ S(X).) Thus ext B(X) ⊆ S(X). Conversely, if f ∈ S(X), i.e. ‖f‖p = 1, then {f} is a face: let f = (1 − t)x + ty for some x, y ∈ B(X) and t ∈ (0, 1). Note

1 = ‖f‖p = ‖(1 − t)x + ty‖p ≤ (1 − t)‖x‖p + t‖y‖p ≤ 1 =⇒ ‖x‖p = ‖y‖p = 1.

So ‖f‖p = (1 − t)‖x‖p + t‖y‖p. By the equality case of Minkowski we know that there is λ ≥ 0 with λ(1 − t)x = ty. Taking norms, λ(1 − t)‖x‖p = t‖y‖p with ‖x‖p = ‖y‖p = 1. Hence λ(1 − t) = t, so that x = y. Thus f = (1 − t)x + tx = x = y. So x, y ∈ {f}, i.e. {f} is a face of B(X). Therefore S(X) = ext B(X).

11.7 Proposition. B(c0) has no extreme points.

Proof. If x = (xn)∞n=1 ∈ B(c0), then limn xn = 0, so there is some n0 such that |xn0| ≤ 1/2. Let

yn = xn if n ≠ n0, and yn0 = xn0 + 1/2;

zn = xn if n ≠ n0, and zn0 = xn0 − 1/2.

Then y = (yn)∞n=1, z = (zn)∞n=1 ∈ B(c0), y ≠ z, and ½(y + z) = x.
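The perturbation idea in this proof can be tried out on finitely many terms of a null sequence. This is a sketch only: the function names `sup_norm` and `split_at` are hypothetical, the sequence is truncated to a finite list, and the variant that shifts one small coordinate by ±1/2 is assumed (chosen so that the two points are always distinct).

```python
def sup_norm(x):
    """Supremum norm of a finite list of scalars."""
    return max(abs(c) for c in x)


def split_at(x, n0):
    """Write x as the midpoint of y and z by shifting coordinate n0 by +-1/2."""
    if abs(x[n0]) > 0.5:
        raise ValueError("coordinate n0 must satisfy |x[n0]| <= 1/2")
    y, z = list(x), list(x)
    y[n0] += 0.5
    z[n0] -= 0.5
    return y, z


x = [1.0, 0.5, 0.25, 0.125]  # first few terms of a sequence tending to 0
y, z = split_at(x, 2)        # |x[2]| = 0.25 <= 1/2
midpoint = [(a + b) / 2 for a, b in zip(y, z)]
```

Both y and z stay in the unit ball, they differ in one coordinate, and their midpoint is x, so x is not an extreme point.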

Can c0 be a dual space? That is, is there a normed space X with X∗ = c0? Assume yes, so X∗ = c0. Consider (c0, σ(c0, X)). B(c0) is convex and w∗-compact by Alaoglu. By Krein-Milman, B(c0) = \overline{co}^{w∗} ext B(c0). But ext B(c0) = ∅ by the proposition above, while B(c0) ≠ ∅. Contradiction.


11.8 Example. Consider the unit ball of R2 in the p-norm (2 < p <∞).

We now prove that the extreme points of the set P(X) of probability measures on X are just the Dirac measures δx; integration of a function f against such a δx merely results in evaluation at x, that is, δx(f) = f(x). It is just as nontrivial if we choose (X, τ) to just be [0, 1] with the usual topology.

11.9 Theorem. For (X, τ) a compact Hausdorff space, define

P(X) = {µ ∈ CR(X, τ)∗ : ‖µ‖ ≤ 1, µ(1) = 1}.

Then ext P(X) = {δx : x ∈ X}, where for each x we define δx(f) = f(x).


Proof. Since we use symbols f, g for elements of C = CR(X, τ) we shall use δ, µ, ν for elements of C∗. Observe that P(X) = B(C∗) ∩ 1̂^{−1}({1}) is w∗-compact (B(C∗) is w∗-compact by Banach-Alaoglu, and 1̂^{−1}({1}) is w∗-closed). Also, P(X) is evidently convex, so it admits extreme points by the Krein-Milman Theorem.

If µ ∈ P(X), we note for 0 ≤ f ≤ 1 in C that 0 ≤ 1 − f ≤ 1, and hence |µ(f)| ≤ 1 and |1 − µ(f)| = |µ(1 − f)| ≤ 1 since ‖f‖∞, ‖1 − f‖∞ ≤ 1. Thus 0 ≤ µ(f) ≤ 1. From this it follows that if g ≥ 0, then µ(g) ≥ 0.

Now, if g ∈ C, let g+ = max{g, 0}, g− = max{−g, 0}, so g = g+ − g− and |g| = g+ + g−. If we fix 0 ≤ f ≤ 1 in C and µ in P(X) and let ν(g) = µ(fg), then

|ν(g)| = |ν(g+ − g−)| = |µ(fg+) − µ(fg−)| ≤ µ(fg+) + µ(fg−) = µ(f(g+ + g−))

= µ(f|g|) ≤ µ(f‖g‖∞) = µ(f)‖g‖∞, (*)

using µ(fg+), µ(fg−) ≥ 0 (as fg+, fg− ≥ 0) and

noting that f|g| ≤ f‖g‖∞. Now, fix δ ∈ ext P(X). Find 0 ≤ f ≤ 1 in C such that 0 < δ(f) < 1 (since ‖δ‖ = 1, this is always possible, exercise). Let µ(g) = (1/δ(f)) δ(fg) for g in C. Then using (*),

|µ(g)| = (1/δ(f)) |δ(fg)| ≤ (1/δ(f)) δ(f)‖g‖∞ = ‖g‖∞,

so ‖µ‖ ≤ 1. Also µ(1) = (1/δ(f)) δ(f · 1) = 1. Hence µ ∈ P(X). Similarly, if we put ν(g) = (1/(1 − δ(f))) δ((1 − f)g), then noting that 1 − δ(f) = δ(1 − f), we see that ν defines an element of P(X). We observe δ(f)µ + (1 − δ(f))ν = δ. Since δ is extreme, in particular µ = δ, i.e. for g in C, µ(g) = (1/δ(f)) δ(fg) = δ(g), so δ(fg) = δ(f)δ(g). Since span{f : 0 ≤ f ≤ 1} = C, it follows that δ(fg) = δ(f)δ(g) for f, g ∈ C. Now, suppose for each x in X there is fx ∈ C such that fx(x) ≠ 0 but fx ∈ ker δ. Let Ux = fx^{−1}(R \ {0}), so Ux is open and x ∈ Ux, hence X = ⋃_{x∈X} Ux. Since X is compact, there are x1, . . . , xn in X such that X = ⋃_{i=1}^{n} Uxi. Thus f := ∑_{i=1}^{n} fxi² > 0 on X, so 1/f ∈ C. Then

1 = δ(1) = δ((1/f) f) = δ(1/f) δ(f) = δ(1/f) ∑_{i=1}^{n} δ(fxi)² = 0.

This is absurd. Hence there is some x in X such that δx(f) = f(x) = 0 whenever δ(f) = 0, i.e. ker δx ⊇ ker δ. However, by the lemma following the w∗-Separation Theorem, this implies δx = cδ for some c ∈ R. However 1 = δx(1) = cδ(1) = c, so δx = δ. Thus ext P(X) ⊆ {δx}x∈X.

It remains to show that each δx is an extreme point of P(X). If δx = (1 − t)µ + tν for 0 < t < 1 and µ, ν ∈ P(X), then computations similar to but much simpler than (*) show that for f in C,

t|ν(f)| ≤ tν(|f |) ≤ tδx(|f |) = t|f(x)|.

Hence if f ∈ ker δx, then f ∈ ker ν so ker δx ⊆ ker ν and, as above, δx = ν. Similarly δx = µ, and δx ∈ extP (X).

11.10 Remark. The w∗-closed convex hull of the extreme points of a w∗-compact convex set in the dual space is that set itself (this is the K-M theorem). It’s very tricky to exhibit an example where it is not, a fortiori, the norm-closure.

11.11 Exercise. Prove (only the second equality needs proof) that

co ext P(X) = co{δx}x∈X = { ∑_{i=1}^{∞} λi δxi : xi ∈ X, λi ≥ 0 for all i and ∑_{i=1}^{∞} λi = 1 }.

11.12 Exercise. Let (X, τ) = ([0, 1], usual top.). Let m ∈ P([0, 1]) be given by the Riemann integral (i.e. m is integration against the usual Lebesgue measure, which reduces to Riemann integration because all f are continuous):

m(f) = ∫_0^1 f.

Then m ∉ co{δt}t∈[0,1] (otherwise, we would have covered this in MATH 147).

11.13 Remark. Observe, however, that m ∈ \overline{co}^{w∗}{δt}t∈[0,1]. We use Riemann sums; let

N = {(s0, . . . , sn, t1, . . . , tn) : 0 = s0 < s1 < . . . < sn = 1, and ti ∈ [si−1, si] for i = 1, . . . , n, and n ∈ N}.

Pre-order N by declaring that

ν = (s0, . . . , sn, t1, . . . , tn) ≤ (s′0, . . . , s′n′, t′1, . . . , t′n′) = ν′ ⇐⇒ {s0, . . . , sn} ⊆ {s′0, . . . , s′n′}.

If ν is as above,

Sν(f) = ∑_{i=1}^{n} (si − si−1) f(ti) = ∑_{i=1}^{n} (si − si−1) δti(f), so Sν ∈ co{δt}t∈[0,1].

We have (where (s0, . . . , sn, t1, . . . , tn) = ν ∈ N)

m(f) = limν∈N Sν(f), i.e. m = w∗-limν∈N ∑_{i=1}^{n} (si − si−1) δti.
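To make the convex-combination point concrete, a Riemann sum can be coded as a weighted sum of point evaluations. This is a sketch under assumptions: `dirac` and `riemann_functional` are hypothetical names, functionals on C[0, 1] are modelled as Python callables acting on functions, and one fixed tagged partition is used.

```python
def dirac(t):
    """The point evaluation delta_t, acting on functions f by f -> f(t)."""
    return lambda f: f(t)


def riemann_functional(s, t):
    """Return the weights (s_i - s_{i-1}) and the functional sum_i weight_i * delta_{t_i}."""
    weights = [s[i + 1] - s[i] for i in range(len(t))]

    def S(f):
        return sum(w * dirac(ti)(f) for w, ti in zip(weights, t))

    return weights, S


s = [0.0, 0.25, 0.5, 0.75, 1.0]
t = [0.125, 0.375, 0.625, 0.875]
weights, S = riemann_functional(s, t)
```

The weights are non-negative and sum to 1, so S is a convex combination of Dirac measures; it sends the constant function 1 to 1, as every element of P([0, 1]) must.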

11.14 Exercise. In ℓ∞ = ℓ∞R, we have (ℓ∞)∗ ≅ FA(N) (by A1Q4), and

P = {µ ∈ FA(N) : V(µ) ≤ 1, µ(N) = 1}.

Here, ext P = {δU : U ⊂ P(N) is an ultrafilter}, i.e. the extreme points are precisely the ultrafilter limits.

Note that S = T.

12 Euclidean and Hilbert spaces

12.1 Definition. Let X be a vector space. A form [·, ·] : X × X → F (F = R or C) is called Hermitian if

(i) [x+ αx′, y] = [x, y] + α[x′, y], for x, x′, y ∈ X , α ∈ F.

(ii) [x, y] = \overline{[y, x]}, for x, y ∈ X (symmetry if F = R, conjugate symmetry if F = C).

We shall say [·, ·] is positive if

(iii) [x, x] ≥ 0, for x ∈ X .

Finally, we shall say [·, ·] is non-degenerate if

(iv) If 0 ≠ x ∈ X, then [x, y] ≠ 0 for some y in X.

We state the following in a little bit more generality than is typical for this, but it often gets used.

12.2 Proposition. Let [·, ·] be a positive Hermitian form on X, and let p(x) = [x, x]^{1/2} for x in X. Then for x, y in X, α ∈ F, we have

(i) p(αx) = |α|p(x).

(ii) |[x, y]| ≤ p(x)p(y) (Cauchy-Schwarz inequality).

(iii) p(x+ y) ≤ p(x) + p(y).

Moreover, if [·, ·] is non-degenerate, then [x, x] > 0 for x ≠ 0. Also, in the case of non-degeneracy, assuming y ≠ 0, equality holds at (ii) if and only if x = αy for some α ∈ F, and equality holds at (iii) if and only if x = ty, for some t ≥ 0 in R.

Recall that extreme points of the unit ball in an ℓp space were all the elements of the unit sphere. We want to have this fact in storage, in case we want to ask the same question for what we’ll call a Hilbert space. So it’s not that interesting, but there are actually some situations where this fact might be useful to know.

Patterson’s book shows the following version of the proof (which I believe is a corrupted version of something that appeared in Terry Tao’s blog).

Proof. We have:

(i) p(αx) = [αx, αx]^{1/2} = (αᾱ[x, x])^{1/2} = (|α|²[x, x])^{1/2} = |α|p(x).

(ii) First, let ζ = sgn[x, y] (we may assume [x, y] ≠ 0, so |ζ| = 1) and note that

0 ≤ p(x − ζy)² = [x − ζy, x − ζy] = [x, x] − ζ̄[x, y] − ζ[y, x] + |ζ|²[y, y] = p(x)² − 2 Re ζ̄[x, y] + p(y)²,

where ζ[y, x] = \overline{ζ̄[x, y]} and ζ̄[x, y] = |[x, y]|, and hence |[x, y]| ≤ (p(x)² + p(y)²)/2. We observe for t > 0,

|[x, y]| = |[tx, (1/t)y]| ≤ (t²p(x)² + (1/t²)p(y)²)/2.

If p(x) = 0, take t → ∞; if p(y) = 0, take t → 0+, to see that |[x, y]| = 0 in either of these cases. If [x, y] ≠ 0, then p(x)p(y) ≠ 0, and we substitute t² = p(y)/p(x) above to get (ii).

(iii) Note

p(x + y)² = p(x)² + 2 Re[x, y] + p(y)² (as above)

≤ p(x)² + 2|[x, y]| + p(y)² (**)

≤ (p(x) + p(y))² (by (ii))

which gives (iii).

It is immediate from (the proof) of (iii), that if [·, ·] is non-degenerate then [x, x] > 0 for x ≠ 0. Fix y ≠ 0. We observe that if x ≠ αy for any α in F, then “=” is impossible to obtain at (*). Hence for “=” to be achieved in (ii), we require x = αy. Likewise, “=” can be achieved at (**) only if x = ty for some t ≥ 0. Hence, this is required for “=” at (iii).

12.3 Corollary. Let (·, ·) be a non-degenerate positive Hermitian form on a vector space X, and ‖x‖ = (x, x)^{1/2}. Then (X, ‖ · ‖) is a normed space.

Proof. Immediate from above.

There are many features of ℓp spaces that these spaces will share, but not all of them. This is perhaps a very extreme kind of something called “rotundity”.

12.4 Definition. We call a non-degenerate positive Hermitian form an inner product and the pair (X, (·, ·)) an inner product space or Euclidean space. If X is complete with respect to the norm ‖ · ‖ induced by the inner product, we call (X, (·, ·)) a Hilbert space.

12.5 Proposition (Polarisation Identities). Let [·, ·] be a Hermitian form on X .

• In the R case: 4[x, y] = [x+ y, x+ y]− [x− y, x− y].

• In the C case: 4[x, y] = ∑_{k=0}^{3} i^k [x + i^k y, x + i^k y].

Proof. For the real case, just bash it out. For the complex case, observe ∑_{k=0}^{3} i^k = 0, ∑_{k=0}^{3} i^{2k} = 0.
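The complex polarisation identity is easy to check numerically. The sketch below assumes the standard inner product on C³, linear in the first variable and conjugate-linear in the second (matching property (i) of a Hermitian form); the names `inner` and `polarisation_sum` are hypothetical.

```python
def inner(x, y):
    """Standard inner product on C^n: linear in x, conjugate-linear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))


def polarisation_sum(x, y):
    """Return (1/4) * sum_{k=0}^{3} i^k [x + i^k y, x + i^k y]."""
    total = 0
    for k in range(4):
        u = [a + (1j ** k) * b for a, b in zip(x, y)]
        total += (1j ** k) * inner(u, u)
    return total / 4


x = [1 + 2j, -1j, 3 + 0j]
y = [2 - 1j, 4 + 0j, 1 + 1j]
```

Expanding each term, the contributions of [x, x] and [y, y] carry the factor ∑ i^k = 0, the contribution of [y, x] carries ∑ i^{2k} = 0, and only 4[x, y] survives.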

12.6 Definition. Suppose (·, ·) is an inner product on X with associated norm ‖ · ‖. We call a set S ⊂ X orthogonal if for x ≠ y in S we have (x, y) = 0; we write x ⊥ y (read as “x is perpendicular to y”).

12.7 Proposition (Pythagoras’ Law). If {x1, . . . , xn} ⊂ X is orthogonal, then

‖∑_{i=1}^{n} xi‖² = ∑_{i=1}^{n} ‖xi‖².

12.8 Proposition (Parallelogram Law). If x, y ∈ X, then ‖x + y‖² + ‖x − y‖² = 2‖x‖² + 2‖y‖².

12.9 Exercise (tedious exercise). If (X, ‖ · ‖) is a normed space which satisfies the parallelogram law, then the form (·, ·) given by

(x, y) := (‖x + y‖² − ‖x − y‖²)/4 (if F = R), (x, y) := (1/4) ∑_{k=0}^{3} i^k ‖x + i^k y‖² (if F = C)

can be shown to be an inner product which gives the norm.
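A quick numerical experiment shows how the parallelogram law singles out the 2-norm among the p-norms on R². This is a sketch only; `p_norm` and `parallelogram_defect` are hypothetical helper names, and a single pair of test vectors is used.

```python
def p_norm(x, p):
    """The p-norm of a finite real vector; p = float('inf') gives the sup norm."""
    if p == float("inf"):
        return max(abs(c) for c in x)
    return sum(abs(c) ** p for c in x) ** (1 / p)


def parallelogram_defect(x, y, p):
    """Return ||x+y||^2 + ||x-y||^2 - 2||x||^2 - 2||y||^2 in the p-norm."""
    plus = [a + b for a, b in zip(x, y)]
    minus = [a - b for a, b in zip(x, y)]
    return (p_norm(plus, p) ** 2 + p_norm(minus, p) ** 2
            - 2 * p_norm(x, p) ** 2 - 2 * p_norm(y, p) ** 2)


x, y = [1.0, 0.0], [0.0, 1.0]
defect_2 = parallelogram_defect(x, y, 2)
defect_1 = parallelogram_defect(x, y, 1)
defect_inf = parallelogram_defect(x, y, float("inf"))
```

For these vectors the defect vanishes (up to rounding) for p = 2 but not for p = 1 or p = ∞, so neither of the latter norms on R² comes from an inner product.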


12.10 Example. We have:

(i) C[0, 1]. (f, g) = ∫_0^1 f ḡ (Riemann integral). This gives rise to an (incomplete) Euclidean space.

(ii) L2[0, 1]. (f, g) = ∫_0^1 f ḡ (Lebesgue integral). Hilbert space (PM 450/354).

(iii) ℓ2 = ℓ2(N) = {(xn)∞n=1 ∈ F^N : ∑_{n=1}^{∞} |xn|² < ∞}. We let (x, y) = ∑_{n=1}^{∞} xn ȳn, which converges (absolutely) by Hölder’s inequality (exercise). We observe ‖x‖2 = (x, x)^{1/2}, and ℓ2 ≅ ℓ2∗, so ℓ2 is complete.

(iv) Let Γ be a set (possibly uncountable), and define ℓ2(Γ) = {x = (xγ)γ∈Γ ∈ F^Γ : ∑_{γ∈Γ} |xγ|² < ∞}. We define the meaning of “∑_{γ∈Γ}” as follows: if (aγ)γ∈Γ, aγ ≥ 0, we can write

∑_{γ∈Γ} aγ = sup_{F∈ℱ} ∑_{γ∈F} aγ = lim_{F∈ℱ} ∑_{γ∈F} aγ

where ℱ = {F ⊂ Γ : F is finite} is directed by F ≤ F′ iff F ⊆ F′. We note if ∑_{γ∈Γ} aγ < ∞ for a = (aγ)γ∈Γ with aγ ≥ 0, then for Fn = {γ ∈ Γ : aγ ≥ 1/n}, we have

⋃_{n=1}^{∞} Fn = {γ ∈ Γ : aγ ≠ 0}.

If ∑_{γ∈Γ} aγ < ∞, each Fn is necessarily finite, so Γa = {γ ∈ Γ : aγ > 0} is countable. If x, y ∈ ℓ2(Γ) we write

(x, y) = ∑_{γ∈Γ} xγ ȳγ = ∑_{γ∈Γx∪Γy} xγ ȳγ

which is a countable series. If (x(n))∞n=1 ⊂ ℓ2(Γ) is ‖ · ‖2-Cauchy, then

(x(n))∞n=1 ⊂ ℓ2(⋃_{n=1}^{∞} Γx(n))

and the countable union of countable sets is countable, from which it follows that ℓ2(Γ) is complete.

12.11 Proposition. Let (X, (·, ·)) be an Euclidean space. For each x in X, the functional fx : X → F given by fx(y) = (y, x) is in X∗ and ‖fx‖ = ‖x‖.

Proof. fx ∈ X′ by the properties of (·, ·). Also ‖fx‖ ≤ ‖x‖ is a consequence of the Cauchy-Schwarz inequality. Observe if x = 0 then fx = 0; if x ≠ 0, then fx((1/‖x‖)x) = ((1/‖x‖)x, x) = (1/‖x‖)(x, x) = ‖x‖, so ‖fx‖ ≥ ‖x‖.

12.12 Remark (Notation). Let X be an Euclidean space and ∅ 6= S ⊂ X . We define

S⊥ = {y ∈ X : (y, x) = 0 for all x in S},

pronounced “S-perp”. Observe that S⊥ = ⋂_{x∈S} ker fx, and each of these kernels is a closed subspace, so this is itself a closed subspace of X.

12.13 Theorem (Complementation Theorem). Let X be an Euclidean space and Y ⊂ X be a complete subspace (e.g. X is Hilbert, Y closed; or in general Y is finite dimensional). Then for x in X there is a unique decomposition x = xY + xY⊥ such that xY ∈ Y, xY⊥ ∈ Y⊥. Moreover, the map P : X → X given by Px = xY is linear, and Im P = Y, P² = P and, if Y ≠ {0}, ‖P‖ = 1.

Proof. First, let us find xY. Let d = dist(x, Y) = inf{‖x − y‖ : y ∈ Y}. For each n, let yn ∈ Y be such that ‖x − yn‖ < d + 1/n. The parallelogram law for n, m ∈ N gives

‖(x − yn) − (x − ym)‖² + ‖(x − yn) + (x − ym)‖² = 2‖x − yn‖² + 2‖x − ym‖²

so

‖ym − yn‖² = 2‖x − yn‖² + 2‖x − ym‖² − ‖2x − (yn + ym)‖²

= 2‖x − yn‖² + 2‖x − ym‖² − 4‖x − ½(yn + ym)‖²

< 2(d + 1/n)² + 2(d + 1/m)² − 4d² → 0+ as n, m → ∞,

where we used that ½(yn + ym) ∈ Y, so that 4‖x − ½(yn + ym)‖² ≥ 4d².

So (yn)∞n=1 is Cauchy in Y. By completeness, xY = limn→∞ yn exists in Y. Now, let xY⊥ = x − xY. Observe that ‖xY⊥‖ = d. Suppose there were y in Y such that (xY⊥, y) ≠ 0. We let z = (xY⊥, y)y, so z ∈ Y, (xY⊥, z) = |(xY⊥, y)|² > 0. Then for any ε > 0 we have

‖x − (xY + εz)‖² = ‖xY⊥ − εz‖² = (xY⊥ − εz, xY⊥ − εz) = ‖xY⊥‖² − 2ε Re(xY⊥, z) + ε²‖z‖²

= d² − ε[2(xY⊥, z) − ε‖z‖²] < d² if 0 < ε < 2(xY⊥, z)/‖z‖²,

which, since xY + εz ∈ Y, contradicts the definition of d = dist(x, Y). Hence xY⊥ ∈ Y⊥. We have x = xY + xY⊥, xY ∈ Y, xY⊥ ∈ Y⊥. Since Y ∩ Y⊥ = {y ∈ Y : (y, x) = 0 ∀x ∈ Y} ⊆ {y ∈ Y : (y, y) = 0} = {0}, we have Y ∩ Y⊥ = {0}. This shows that the orthogonal decomposition is unique, i.e. if x = x′Y + x′Y⊥ = xY + xY⊥ then xY − x′Y = x′Y⊥ − xY⊥ ∈ Y ∩ Y⊥ = {0}, which forces both pairs to coincide.

We next note, by Pythagoras’ law, ‖x‖² = ‖xY‖² + ‖xY⊥‖², so ‖xY‖ ≤ ‖x‖. If x, x′ are in X and α ∈ F, then

(x + αx′)Y + (x + αx′)Y⊥ = x + αx′ = xY + xY⊥ + α(x′Y + x′Y⊥)

= (xY + αx′Y) + (xY⊥ + αx′Y⊥),

with xY + αx′Y ∈ Y and xY⊥ + αx′Y⊥ ∈ Y⊥, so by uniqueness, we see (x + αx′)Y = xY + αx′Y. Thus P : X → X, Px = xY is linear. Also ‖P‖ ≤ 1. If y ∈ Y then Py = y, so ‖P‖ ≥ 1 when Y ≠ {0}.

12.14 Remark. In the situation above, we call P = PY the orthogonal projection onto Y.

We also note that X = Y ⊕2 Y⊥.

12.15 Proposition. If H is a Hilbert space and ∅ ≠ S ⊂ H then S⊥ = (\overline{span} S)⊥ and (S⊥)⊥ = \overline{span} S.

Proof. We have

x ∈ S⊥ ⇐⇒ S ⊂ {x}⊥ ⇐⇒ \overline{span} S ⊂ {x}⊥,

since {x}⊥ is a closed linear space. Now for any closed, hence complete, subspace Y we have Y ⊕2 Y⊥ = H = (Y⊥)⊥ ⊕2 Y⊥. We have that Y ⊆ (Y⊥)⊥, and hence Y = (Y⊥)⊥. We apply this to Y = \overline{span} S.

12.16 Theorem (Riesz Representation Theorem). Let H be a Hilbert space, and f ∈ H∗. Then thereis a unique vector x0 ∈ H such that f = fx0 , i.e. f(x) = (x, x0) for x ∈ H, ‖fx0‖ = ‖x0‖.

Proof. Say f ≠ 0. We first note that (ker f)⊥ is one-dimensional. Indeed, if x1, x2 ∈ (ker f)⊥ \ {0} then f(x1) ≠ 0 ≠ f(x2), and

f(x2)x1 − f(x1)x2 ∈ ker f ∩ (ker f)⊥ = {0},

so x1, x2 cannot be linearly independent. Hence (ker f)⊥ = span{x1} with ‖x1‖ = 1. Let x0 = \overline{f(x1)} x1. Let P = P_{ker f} be the orthogonal projection onto ker f. Now, if x ∈ H,

x = Px + λx1 for some λ ∈ F, where λx1 = (I − P)x.

Thus, since Px ∈ ker f = {x1}⊥ and (x1, x1) = 1,

(x, x0) = (Px + λx1, \overline{f(x1)} x1) = (λx1, \overline{f(x1)} x1) = λf(x1)(x1, x1) = f(λx1) = f(Px + λx1) = f(x).

So we see indeed that f = fx0. If (x, x0) = (x, y0) for all x in H, then with x = x0 − y0, we see (x0 − y0, x0 − y0) = 0, so x0 = y0.


If F = C, x ↦ fx is conjugate linear, i.e. fx+αx′ = fx + ᾱfx′ (passing scalars at cost of conjugation). Generally, we write H∗ ≅ H̄ where H̄ denotes the conjugate space. We have H̄ = H as sets, but scalar multiplication in H̄ is governed by the rule α · x = ᾱx where the right-hand side is scalar multiplication in H, and (x, y)H̄ = (y, x)H.

12.17 Remark (on the completion of an Euclidean space). We have:

(i) Let (X, (·, ·)) be an Euclidean space. Let H = \overline{X̂}^{‖·‖} ⊂ X∗∗. We observe that H is a Hilbert space. If F, G ∈ H, let (xn)∞n=1, (yn)∞n=1 ⊂ X be such that F = limn→∞ x̂n, G = limn→∞ ŷn. We observe for n, m in N

|(xn, yn) − (xm, ym)| ≤ |(xn, yn) − (xn, ym)| + |(xn, ym) − (xm, ym)| ≤ ‖xn‖‖yn − ym‖ + ‖xn − xm‖‖ym‖

just using Cauchy-Schwarz and biadditivity of the form giving the inner product. So it is easy to see that ((xn, yn))∞n=1 ⊂ F is Cauchy, and hence converges to something, call it (F, G). We observe if (x′n)∞n=1, (y′n)∞n=1 ⊂ X, F = limn→∞ x̂′n, G = limn→∞ ŷ′n, then the same technique as above shows that limn→∞(xn, yn) = limn→∞(x′n, y′n). Hence (F, G) is well-defined and may be shown to define an inner product on H for which ‖F‖ = (F, F)^{1/2}.

(ii) Now, if f ∈ X∗, consider f̂ ∈ X∗∗∗, so f̂|H ∈ H∗, so f̂|H = fG0 for some G0 ∈ H. Hence for x in X, f(x) = (x̂, G0). It follows that H = X∗∗, i.e. X∗ ≅ H̄, so X∗∗ ≅ H̄∗ ≅ \overline{H̄} = H.


12.18 Theorem (Gram-Schmidt Orthogonalisation). Let x1, x2, x3, . . . be a linearly independent sequence in an Euclidean space X. Then there exists an orthogonal sequence e1, e2, . . . such that span{e1, . . . , en} = span{x1, . . . , xn} for each n.

Proof. Let En = span{x1, . . . , xn} and let Pn denote the orthogonal projection from X onto En (Complementation Theorem). We define

e1 = x1

e2 = x2 − P1x2

e3 = x3 − P2x3

...

en = xn − Pn−1xn.

We observe, inductively, that en ∈ En. Hence span{x1, . . . , xn} = span{e1, . . . , en} holds. Moreover, each en ⊥ En−1, and it follows that {e1, . . . , en} is orthogonal, hence {e1, e2, . . .} is orthogonal.

12.19 Theorem. We have:

(i) If X is a separable Euclidean space, then X contains an orthogonal sequence E = {e1, e2, . . .} such that \overline{span} E = X.

(ii) If H is any Hilbert space, there is an orthogonal set E ⊂ H such that \overline{span} E = H.

Proof. We have:

(i) If {xn}∞n=1 is a dense subset of X, then \overline{span}{xn}∞n=1 = X. Now let

n1 = min{k ∈ N : xk ≠ 0}

n2 = min{k ∈ N : xk ∉ span{xn1}}

...

nm = min{k ∈ N : xk ∉ span{xn1, . . . , xnm−1}}.

We have that span{xnm}∞m=1 = span{xn}∞n=1, which is dense in X. We apply Gram-Schmidt to (xnm)∞m=1 to obtain an orthogonal sequence (em)∞m=1. Observing that span{xn1, . . . , xnm} = span{e1, . . . , em}, we see that span{em}∞m=1 = span{xnm}∞m=1.

(ii) Let O = {S ⊂ H : 0 ∉ S, S is orthogonal}. Partially order O by inclusion. If C ⊂ O is a chain, it is easy to verify that ⋃_{S∈C} S is an orthogonal set, not including 0. Hence by Zorn’s Lemma, there is a maximal orthogonal set E. If we had \overline{span} E ⊊ H, we let PE denote the orthogonal projection onto \overline{span} E. Then if x ∈ H \ \overline{span} E, we would see that 0 ≠ (I − PE)x ⊥ E, contradicting maximality.
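The procedure in part (i), namely discarding vectors already in the span of their predecessors and orthogonalising the rest by Gram-Schmidt, can be sketched in code. This is an illustration under assumptions, not the notes' own algorithm: vectors are real, the function name `gram_schmidt` and the tolerance `tol` are hypothetical, and "lies in the span" is tested numerically by checking whether the residual is (nearly) zero.

```python
def dot(x, y):
    """Standard real inner product."""
    return sum(a * b for a, b in zip(x, y))


def gram_schmidt(vectors, tol=1e-12):
    """Return an orthogonal list spanning the same space as `vectors`.

    Each e_n is x_n minus its projection onto the span of the earlier e's;
    vectors whose residual is (numerically) zero are skipped.
    """
    basis = []
    for x in vectors:
        e = list(x)
        for b in basis:
            c = dot(x, b) / dot(b, b)  # coefficient of the projection onto b
            e = [ei - c * bi for ei, bi in zip(e, b)]
        if dot(e, e) > tol:
            basis.append(e)
    return basis


vectors = [[1.0, 1.0, 0.0], [2.0, 2.0, 0.0], [1.0, 0.0, 1.0],
           [0.0, 0.0, 0.0], [0.0, 1.0, 1.0]]
basis = gram_schmidt(vectors)
```

In this example the second vector (a multiple of the first) and the zero vector are dropped, leaving three pairwise orthogonal vectors.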

12.20 Definition. A set E in an Euclidean space is orthonormal if for e, e′ in E,

(e, e′) = 1 if e = e′, and (e, e′) = 0 if e ≠ e′.

12.21 Remark. If {e1, . . . , en} is an orthonormal set, let ℰ = span{e1, . . . , en}. If x ∈ X, we have the orthogonal decomposition x = Pℰx + (I − Pℰ)x, where (I − Pℰ)x ⊥ ℰ. Moreover Pℰx ∈ ℰ, so Pℰx = ∑_{j=1}^{n} αj ej. Then for each i,

(x, ei) = (∑_{j=1}^{n} αj ej + (I − Pℰ)x, ei),

where (I − Pℰ)x ⊥ ℰ and each ej is orthogonal to ei unless j = i. So we obtain (x, ei) = (αi ei, ei) = αi. Hence Pℰx = ∑_{i=1}^{n} (x, ei)ei.
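The coefficient formula for the projection can be checked on a small example. The sketch below assumes real vectors in R³ and a hand-picked orthonormal pair; `project` is a hypothetical helper name.

```python
def dot(x, y):
    """Standard real inner product."""
    return sum(a * b for a, b in zip(x, y))


def project(x, orthonormal):
    """Return the coefficients (x, e_i) and the vector sum_i (x, e_i) e_i."""
    coeffs = [dot(x, e) for e in orthonormal]
    px = [sum(c * e[k] for c, e in zip(coeffs, orthonormal))
          for k in range(len(x))]
    return coeffs, px


e1 = [1.0, 0.0, 0.0]
e2 = [0.0, 0.6, 0.8]  # unit vector orthogonal to e1
x = [2.0, 1.0, 1.0]
coeffs, px = project(x, [e1, e2])
residual = [a - b for a, b in zip(x, px)]
```

The residual x − Px is orthogonal to both e1 and e2, and the sum of the squared coefficients equals ‖Px‖², which does not exceed ‖x‖².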

12.22 Theorem (Abstract Riesz-Fischer Theorem). We have:

(i) If X is an Euclidean space and E ⊂ X is an orthonormal set, then for x in X

∑_{e∈E} |(x, e)|² ≤ ‖x‖² (Bessel’s inequality).

(ii) Let dim X = ∞. X is complete if and only if for any orthonormal sequence (en)∞n=1 in X, and any α ∈ ℓ2, we have that ∑_{i=1}^{∞} αi ei converges in X.


Proof. We have:

(i) If F ⊂ E is finite then

∑_{e∈F} |(x, e)|² = ‖PFx‖² ≤ ‖x‖²

where PF is the orthogonal projection onto span F. But

∑_{e∈E} |(x, e)|² = sup{∑_{e∈F} |(x, e)|² : F ⊂ E finite} ≤ ‖x‖².

(ii) (→) We let xn = ∑_{i=1}^{n} αi ei. Then for m < n we have

‖xn − xm‖² = ‖∑_{i=m+1}^{n} αi ei‖² = ∑_{i=m+1}^{n} |αi|² → 0 as m, n → ∞ (by Pythagoras).

So (xn)∞n=1 ⊂ X is Cauchy, hence converges.

(←) Let (x(n))∞n=1 be a Cauchy sequence in X. Then ℰ = \overline{span}{x(n)}∞n=1 is separable. Hence, by the last theorem, there is an orthogonal (hence we can make it orthonormal) sequence {en}∞n=1 such that ℰ = \overline{span}{en}∞n=1. From (i) above, we have that ∑_{i=1}^{∞} |(x(n), ei)|² ≤ ‖x(n)‖², so if α(n)i = (x(n), ei), then α(n) = (α(n)i)∞i=1 ∈ ℓ2. We observe for n, m in N that

‖α(n) − α(m)‖2² = ∑_{i=1}^{∞} |(x(n), ei) − (x(m), ei)|² ≤ ‖x(n) − x(m)‖²

so (α(n))∞n=1 ⊂ ℓ2 is Cauchy, and hence converges to some α in ℓ2, as ℓ2 ≅ ℓ2∗ is complete. Thus x = ∑_{i=1}^{∞} αi ei ∈ X, by assumption. Given ε > 0, find n0 such that ‖α − α(n)‖2 < ε/3 for n ≥ n0. Then, for fixed n ≥ n0, find N such that ∑_{i=N+1}^{∞} |α(n)i|² < (ε/3)², ∑_{i=N+1}^{∞} |αi|² < (ε/3)². Then

‖x − x(n)‖ ≤ ‖∑_{i=1}^{N} (αi − α(n)i)ei‖ + ‖∑_{i=N+1}^{∞} αi ei‖ + ‖∑_{i=N+1}^{∞} α(n)i ei‖

≤ ‖α − α(n)‖2 + lim_{m→∞} ‖∑_{i=N+1}^{m} αi ei‖ + lim_{m→∞} ‖∑_{i=N+1}^{m} α(n)i ei‖ ≤ ε/3 + ε/3 + ε/3 = ε,

where ‖∑_{i=N+1}^{m} αi ei‖ = (∑_{i=N+1}^{m} |αi|²)^{1/2}, and similarly for α(n).

12.23 Theorem (Orthonormal Basis Theorem). Let H be a Hilbert space and E ⊂ H be an orthonormal set, with ℰ = \overline{span} E, and let PE = Pℰ be the orthogonal projection onto ℰ. Then for x, y in H we have:

(i) ∑_{e∈E} (x, e)e = lim_{F∈ℱ} ∑_{e∈F} (x, e)e = PEx, where ℱ = {F ⊂ E : F finite} is directed by F ≤ F′ iff F ⊆ F′.

(ii) ∑_{e∈E} |(x, e)|² = ‖PEx‖² (Bessel’s identity).

(iii) (PEx, PEy) = (PEx, y) = (x, PEy) = ∑_{e∈E} (x, e)(e, y) (Parseval’s identity).

Moreover, if for every x, y in H we have any of

(i’) ∑_{e∈E} (x, e)e = x,

(ii’) ∑_{e∈E} |(x, e)|² = ‖x‖²,

(iii’) (x, y) = ∑_{e∈E} (x, e)(e, y),

then \overline{span} E = H.


Proof. We have:

(i) It follows from Bessel's inequality (from the theorem above) that $\sum_{e \in E} |(x, e)|^2 \le \|x\|^2$, so $E_x = \{e \in E : (x, e) \ne 0\}$ is countable, hence a sequence, so by Riesz-Fischer $x' = \sum_{e \in E} (x, e)e = \sum_{e \in E_x} (x, e)e$ converges in H. Also, $x' \in \mathcal{E}$. Now if e′ ∈ E, then
\[ (x - x', e') = (x, e') - \Big( \sum_{e \in E_x} (x, e)e, e' \Big) = (x, e') - \sum_{e \in E_x} (x, e)(e, e') \]
by the continuity of y ↦ (y, e′), but this is just equal to (x, e′) − (x, e′) = 0. It follows that x − x′ ⊥ E, hence $x - x' \perp \mathcal{E}$. Also
\begin{align*}
(x - x', x') &= (x, x') - (x', x') \\
&= \Big( x, \sum_{e \in E} (x, e)e \Big) - \Big( \sum_{e \in E} (x, e)e, \sum_{e' \in E} (x, e')e' \Big) \\
&= \sum_{e \in E} \overline{(x, e)}(x, e) - \sum_{e \in E} (x, e) \Big( e, \sum_{e' \in E} (x, e')e' \Big) \tag{†} \\
&= \sum_{e \in E} |(x, e)|^2 - \sum_{e \in E} \sum_{e' \in E} (x, e)\overline{(x, e')}(e, e') = 0,
\end{align*}
so x = x′ + (x − x′) is the unique orthogonal decomposition with $x' \in \mathcal{E}$. Hence $P_E x = x'$.

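(ii) Let x′ be as above. For finite F ⊂ E, we write $x' = x'_F + (x' - x'_F)$ where $x'_F = \sum_{e \in F} (x', e)e$ is the closest point in span F to x′, so this is an orthogonal sum. By Pythagoras,
\[ \|x'\|^2 = \|x'_F\|^2 + \|x' - x'_F\|^2 = \sum_{e \in F} |\underbrace{(x', e)}_{(x, e)}|^2 + \operatorname{dist}(x', \operatorname{span} F)^2. \]
If we take the limit for F in $\mathcal{F}$ we get
\[ \lim_{F \in \mathcal{F}} \|x'_F\|^2 = \sum_{e \in E} |(x, e)|^2, \qquad \lim_{F \in \mathcal{F}} \operatorname{dist}(x', \operatorname{span} F) = \operatorname{dist}(x', \operatorname{span} E) = 0, \]
where by (i), $\lim_{F \in \mathcal{F}} x'_F = x' = P_E x$.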

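(iii) Using computations similar to (†) we compute any of $(P_E x, P_E y)$, $(P_E x, y)$ or $(x, P_E y)$ to obtain $\sum_{e \in E} (x, e)(e, y)$.

Now, if (i′) holds, then $H \subseteq \overline{\operatorname{span}}\,E \subseteq H$. If (ii′) holds, then $x = P_E x + (I - P_E)x$ is an orthogonal decomposition, so $\|x\|^2 = \|P_E x\|^2 + \|(I - P_E)x\|^2$. Since (ii′) together with Bessel's identity gives $\|x\|^2 = \|P_E x\|^2$, we get $\|(I - P_E)x\|^2 = 0$, thus $P_E = I$, which gives (i′). If (iii′) holds, let x = y and obtain (ii′).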

12.24 Corollary. If H is a Hilbert space and E ⊂ H is an orthonormal set for which $\overline{\operatorname{span}}\,E = H$, then the map $U : H \to \ell^2(E)$ given by
\[ Ux = ((x, e))_{e \in E} \]
defines a surjective linear isometry such that (Ux, Uy) = (x, y).

Proof. That $(Ux, Uy)_{\ell^2} = (x, y)_H$ is Parseval's identity, and hence $\|Ux\|^2 = (Ux, Ux)_{\ell^2} = (x, x)_H = \|x\|^2$, so U is an isometry. Surjectivity of U is a consequence of the Riesz-Fischer Theorem.


13 Spectral theory

Let X be a complex Banach space and consider B(X ) = B(X ,X ). Compose!

13.1 Definition. Let T ∈ B(X ).

• The spectrum of T is σ(T) = {λ ∈ C : λI − T does not admit a (two-sided) inverse in B(X)}.

• The point spectrum of T is σp(T) = {λ ∈ C : ker(λI − T) ⊋ {0}}.


Of course, σp(T ) ⊂ σ(T ) and equality holds if dimX <∞.


13.2 Example.

(i) For dim X < ∞ and $T \in B(X) \cong M_{\dim X}(\mathbb{C})$, we have

σ(T) = {λ ∈ C : det(λI − T) = 0}.

(ii) Let $S \in B(\ell^p)$ (1 ≤ p ≤ ∞) be given by $S(x_1, x_2, \dots) = (0, x_1, x_2, \dots)$, the (unilateral) shift operator. Note $\|Sx\|_p = \|x\|_p$ for all $x \in \ell^p$, so ker S = {0}, so 0 ∉ σp(S). We claim σp(S) = ∅. Let λ ∈ C \ {0}. If x ∈ ker(λI − S), then
\[ \lambda x - Sx = 0 \implies (\lambda x_1, \lambda x_2, \dots) = \lambda x = Sx = (0, x_1, x_2, \dots), \]
so if x ≠ 0, let $k = \min\{n \in \mathbb{N} : x_n \ne 0\}$. If k = 1, then $\lambda x_1 = 0$, which is impossible since λ ≠ 0. If k > 1, then $\lambda x_k = (Sx)_k = x_{k-1}$, so $x_{k-1} \ne 0$, contradicting minimality of k. So ker(λI − S) = {0}.

Question: Is σ(T ) always non-empty?

13.3 Theorem (Inversion Theorem). Let X be a Banach space.

(i) If T ∈ B(X) with ‖T‖ < 1, then $\sum_{k=0}^\infty T^k$ (the Neumann series) converges in B(X) (note $T^0 = I$), and
\[ \sum_{k=0}^\infty T^k = (I - T)^{-1}. \]

(ii) If T ∈ B(X) and $S \in G(X) = \{U \in B(X) : U^{-1} \in B(X)\}$ satisfy $\|T - S\| < \frac{1}{\|S^{-1}\|}$, then T ∈ G(X). Moreover, G(X) is open in B(X) and $S \mapsto S^{-1}$ is continuous.

Proof. Note ‖ST‖ ≤ ‖S‖‖T‖.

(i) Let $S_n = \sum_{k=0}^n T^k$, and note that for m < n,
\[ \|S_n - S_m\| = \Big\| \sum_{k=m+1}^n T^k \Big\| \le \sum_{k=m+1}^n \|T\|^k \le \frac{\|T\|^{m+1}}{1 - \|T\|} \xrightarrow{m \to \infty} 0. \]
So $(S_n)_{n=1}^\infty$ is Cauchy, hence convergent. Now,
\[ (I - T)S_n = \sum_{k=0}^n (T^k - T^{k+1}) = I - T^{n+1} \xrightarrow{n \to \infty} I. \]
Similarly, $S_n(I - T) \to I$. So $\sum_{k=0}^\infty T^k = (I - T)^{-1}$.

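(ii) Observe $\|S^{-1}(S - T)\| \le \|S^{-1}\|\|S - T\| < 1$, so $T = S - (S - T) = S(I - S^{-1}(S - T)) \in G(X)$ from (i), with $T^{-1} = \sum_{k=0}^\infty (S^{-1}(S - T))^k S^{-1}$. Therefore $T^{-1} - S^{-1} = \sum_{k=1}^\infty (S^{-1}(S - T))^k S^{-1}$, so
\[ \|T^{-1} - S^{-1}\| \le \sum_{k=1}^\infty (\underbrace{\|S^{-1}\|\|S - T\|}_{<1})^k \|S^{-1}\| = \frac{\|S^{-1}\|\|S - T\|}{1 - \|S^{-1}\|\|S - T\|} \|S^{-1}\|, \]
so $\lim_{T \to S} \|T^{-1} - S^{-1}\| = 0$, so $S \mapsto S^{-1}$ is continuous. Moreover, for S ∈ G(X), $S + \frac{1}{\|S^{-1}\|} D(B(X)) \subset G(X)$, so G(X) is open.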
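Part (i) can be checked numerically in finite dimensions. The sketch below forms partial sums of the Neumann series for a 2×2 matrix and multiplies by I − T. The matrix `T` is a hypothetical example, chosen so that its maximal absolute row sum (the operator norm induced by the max-norm) is 0.5 < 1.

```python
def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def neumann_partial_sum(T, n):
    """Return S_n = I + T + ... + T^n."""
    S = identity(len(T))
    power = identity(len(T))
    for _ in range(n):
        power = mat_mul(power, T)
        S = mat_add(S, power)
    return S

# Hypothetical example with maximal absolute row sum 0.5 < 1.
T = [[0.2, 0.3], [0.1, 0.4]]
I = identity(2)
I_minus_T = [[I[i][j] - T[i][j] for j in range(2)] for i in range(2)]

S = neumann_partial_sum(T, 60)
product = mat_mul(I_minus_T, S)   # equals I - T^61, which is very close to I
print(product)
```

Here the error of the product is exactly T^{61}, matching the telescoping identity (I − T)S_n = I − T^{n+1} from the proof.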

13.4 Definition. If X is a complex Banach space and T ∈ B(X), the resolvent set of T is given by
\[ \rho(T) = \{\lambda \in \mathbb{C} : (\lambda I - T)^{-1} \in B(X)\} = \mathbb{C} \setminus \sigma(T). \]
If B is a complex Banach space and U ⊆ C is open, we call F : U → B holomorphic if for each $z_0$ in U,
\[ F'(z_0) = \lim_{z \to z_0} \frac{F(z) - F(z_0)}{z - z_0} \]
exists.


13.5 Remark. A holomorphic function is continuous on its domain.

13.6 Proposition. Let X be a complex Banach space, T ∈ B(X ).

(i) ρ(T ) is open in C.

(ii) The resolvent function, R : ρ(T )→ B(X ) given by R(z) = (zI − T )−1, is holomorphic.

(iii) $\sigma(T) \subseteq \{z \in \mathbb{C} : |z| \le \|T\|\}$ and $\|R(z)\| \le \frac{1}{|z| - \|T\|}$ for |z| > ‖T‖.

Proof. We have:

(i) Let F : C→ B(X ), F (z) = zI − T . Then F is continuous, and ρ(T ) = F−1(G(X )) is open.

(ii) If $z, z_0 \in \rho(T)$,
\begin{align*}
R(z) - R(z_0) &= (zI - T)^{-1} - (z_0 I - T)^{-1} = (zI - T)^{-1}[(z_0 I - T) - (zI - T)](z_0 I - T)^{-1} \\
&= (z_0 - z)(zI - T)^{-1}(z_0 I - T)^{-1},
\end{align*}
so that
\[ \frac{R(z) - R(z_0)}{z - z_0} = -(zI - T)^{-1}(z_0 I - T)^{-1} \xrightarrow{z \to z_0} -((z_0 I - T)^{-1})^2. \]

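(iii) If |z| > ‖T‖, then $\|\frac{1}{z}T\| < 1$, so $zI - T = z(I - \frac{1}{z}T)$ is invertible. So $\rho(T) \supseteq \{z \in \mathbb{C} : |z| > \|T\|\}$. Also,
\[ R(z) = (zI - T)^{-1} = \frac{1}{z}\Big(I - \frac{1}{z}T\Big)^{-1} = \frac{1}{z}\sum_{k=0}^\infty \frac{1}{z^k}T^k, \]
so that
\[ \|R(z)\| \le \frac{1}{|z|}\sum_{k=0}^\infty \frac{1}{|z|^k}\|T\|^k = \frac{1}{|z|}\Bigg(\frac{1}{1 - \frac{1}{|z|}\|T\|}\Bigg) = \frac{1}{|z| - \|T\|}. \]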
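The resolvent estimate in (iii) can be sanity-checked for a 2×2 matrix acting on C² with the max-norm, where the operator norm is the largest absolute row sum. The matrix `T` and the sample points `z` are hypothetical choices for illustration; the resolvent is computed from the explicit 2×2 inverse formula.

```python
def op_norm_inf(A):
    """Operator norm induced by the max-norm: largest absolute row sum."""
    return max(sum(abs(a) for a in row) for row in A)

def resolvent_2x2(T, z):
    """Return R(z) = (zI - T)^{-1} for a 2x2 matrix T via the adjugate formula."""
    a, b = z - T[0][0], -T[0][1]
    c, d = -T[1][0], z - T[1][1]
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# Hypothetical example: ||T|| = 3 in the max-row-sum norm.
T = [[1.0, 2.0], [0.5, -1.0]]
norm_T = op_norm_inf(T)

checks = []
for z in (4.0, 3.5j, -3.0 + 2.0j, 10.0):   # each sample has |z| > ||T||
    lhs = op_norm_inf(resolvent_2x2(T, z))
    rhs = 1.0 / (abs(z) - norm_T)
    checks.append((z, lhs, rhs))
    print(z, lhs, "<=", rhs)
```

The bound degrades as |z| approaches ‖T‖ and decays like 1/|z| for large |z|, which is the behaviour used later to show that the spectrum is non-empty.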

13.7 Corollary. σ(T ) is compact.

Proof. σ(T ) = C \ ρ(T ), and ρ(T ) is open, so σ(T ) is closed. By the above, σ(T ) is bounded. Heine-Borel.


13.8 Theorem (Liouville’s Theorem). Let f : C→ C be holomorphic. If f is bounded, then f is constant.

Proof. Complex analysis (PMATH 352, say).

When you have a finite-dimensional operator, you can come up with some sort of basis and realize it as a matrix. We know that the spectrum is exactly the set of eigenvalues of said matrix. Most proofs of the Fundamental Theorem of Algebra are consequences of Liouville's Theorem in complex analysis.

13.9 Theorem (Liouville's Theorem for Banach space-valued functions). Let X be a C-Banach space, and let F : C → X be a holomorphic function. If F is bounded, then F is constant.

Proof. Let f ∈ X*. Then f ∘ F : C → C is holomorphic, i.e.
\[ \lim_{z \to z_0} \frac{f \circ F(z) - f \circ F(z_0)}{z - z_0} = \lim_{z \to z_0} \frac{f(F(z) - F(z_0))}{z - z_0} = f\Big(\lim_{z \to z_0} \frac{F(z) - F(z_0)}{z - z_0}\Big) \]
exists for all $z_0 \in \mathbb{C}$, and f ∘ F is bounded if F is bounded:
\[ \sup_{z \in \mathbb{C}} |f \circ F(z)| \le \sup_{z \in \mathbb{C}} \|f\|\|F(z)\|. \]
Applying Liouville's Theorem to f ∘ F, we see for z, z′ ∈ C that f(F(z)) = f(F(z′)). Since this holds for every f ∈ X*, and the Hahn-Banach Theorem provides enough linear functionals to separate points, we see that F(z) = F(z′) for all z, z′ ∈ C.

13.10 Theorem. Let X be a C-Banach space and T ∈ B(X ). Then σ(T ) 6= ∅.


Proof. Let $R(z) = (zI - T)^{-1}$ denote the resolvent function, for z ∈ ρ(T). Suppose that ρ(T) = C. We observe that
\[ \lim_{|z| \to \infty} \|R(z)\| \le \lim_{|z| \to \infty} \frac{1}{|z| - \|T\|} = 0, \]
and that R, being holomorphic, is continuous, hence bounded on the compact disc 2‖T‖B ⊂ C. Hence R is bounded on all of C, and is thus constant. Due to the limit above, this means that R(z) = 0 for all z. This contradicts that each R(z) is invertible.

13.11 Theorem (Spectral Mapping Theorem). Let X be a C-Banach space. If T ∈ B(X) and p(t) ∈ C[t] (polynomials with complex coefficients), then
\[ \sigma(p(T)) = p(\sigma(T)) = \{p(\lambda) : \lambda \in \sigma(T)\}. \]

Proof. We suppose p is non-constant (for a constant polynomial p = c both sides equal {c}, since σ(T) ≠ ∅). If $z_0 \in \mathbb{C}$, write
\[ p(t) - z_0 = \alpha \prod_{k=1}^n (t - z_k), \]
where α ≠ 0 and $z_1, \dots, z_n$ is the family of zeros of $p(t) - z_0$ (counting multiplicity) [Fundamental Theorem of Algebra]. Then
\[ p(T) - z_0 I = \alpha \prod_{k=1}^n (T - z_k I). \]
Thus $p(T) - z_0 I$ is non-invertible if and only if at least one $T - z_k I$ is non-invertible (i.e. $z_k \in \sigma(T)$ for some k). Thus $z_0 \in \sigma(p(T))$ if and only if $p(\lambda) - z_0 = 0$ for some λ ∈ σ(T).
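In finite dimensions the Spectral Mapping Theorem can be observed directly on a triangular matrix, whose eigenvalues are its diagonal entries. The sketch below is only an illustration: the matrix `T` and the polynomial p(t) = t² − 3t + 2 are hypothetical choices, and `poly_of_matrix` is a helper written for this example.

```python
def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def poly_of_matrix(coeffs, T):
    """Evaluate p(T) = c_0 I + c_1 T + ... by Horner's rule, coeffs = [c_0, c_1, ...]."""
    n = len(T)
    result = [[0] * n for _ in range(n)]
    for c in reversed(coeffs):
        result = mat_mul(result, T)      # multiply the running value by T
        for i in range(n):
            result[i][i] += c            # then add c * I
    return result

def p(t):
    return t * t - 3 * t + 2

# Hypothetical upper triangular matrix: sigma(T) = {1, 2, 3}.
T = [[1, 4, 5], [0, 2, 6], [0, 0, 3]]
eigs_T = [T[i][i] for i in range(3)]

p_T = poly_of_matrix([2, -3, 1], T)      # p(T) = T^2 - 3T + 2I, still upper triangular
eigs_pT = [p_T[i][i] for i in range(3)]

print(sorted(eigs_pT), sorted(p(lam) for lam in eigs_T))
```

Both lists agree, so σ(p(T)) = p(σ(T)) for this example; integer arithmetic keeps the comparison exact.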

13.12 Theorem (Spectral Radius Formula). Let X be a C-Banach space. Let T ∈ B(X) and define the spectral radius of T by r(T) = max{|λ| : λ ∈ σ(T)}. Then
\[ r(T) = \lim_{n \to \infty} \|T^n\|^{1/n}. \]

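Proof. Let us first see that $\lim_{n \to \infty} \|T^n\|^{1/n}$ exists. Let $\omega(n) = \log \|T^n\|$ for n ∈ N. Observe that
\[ \|T^{n+m}\| \le \|T^n\|\|T^m\| \implies \omega(n + m) \le \omega(n) + \omega(m). \]
Fix, for the moment, m and let n > m, so $n = q_n m + r_n$ where $q_n \in \{0, 1, 2, \dots\}$ and $r_n \in \{0, 1, \dots, m - 1\}$, and then
\[ \frac{\omega(n)}{n} = \frac{\omega(q_n m + r_n)}{n} \le \frac{\omega(q_n m) + \omega(r_n)}{n} \le \frac{q_n \omega(m) + \omega(r_n)}{n} \le \frac{\omega(m)}{m} + \frac{\omega(r_n)}{n}, \]
so $\limsup_{n \to \infty} \frac{\omega(n)}{n} \le \frac{\omega(m)}{m}$, hence
\[ \limsup_{n \to \infty} \frac{\omega(n)}{n} \le \inf_{m \in \mathbb{N}} \frac{\omega(m)}{m} \le \liminf_{n \to \infty} \frac{\omega(n)}{n}, \]
so $\lim_{n \to \infty} \frac{\omega(n)}{n}$ exists. Exponentiation shows $\lim_{n \to \infty} \|T^n\|^{1/n}$ exists.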

Now, by the spectral mapping theorem, we have that $r(T^n) = r(T)^n$. Also, we saw earlier that $r(T^n) \le \|T^n\|$. Thus
\[ r(T) \le \|T^n\|^{1/n} \implies r(T) \le \lim_{n \to \infty} \|T^n\|^{1/n}. \]

We let f ∈ B(X)* and consider f ∘ R : ρ(T) → C. Then f ∘ R has a Laurent series for all |z| > r(T). We write
\[ f \circ R(z) = \sum_{k=0}^\infty \frac{f_k}{z^k} \]
(note we have omitted the polynomial part but we will prove we didn't need that anyway). We have for |z| > ‖T‖,
\[ R(z) = \sum_{k=0}^\infty \frac{1}{z^{k+1}} T^k, \]
and hence we can compute $f_{k+1} = f(T^k)$, $f_0 = 0$. Hence, by Banach-Steinhaus, we see that
\[ \sum_{k=0}^\infty \frac{1}{z^{k+1}} T^k \]

converges for all |z| > r(T). Now, if $|z| < \lim_{n \to \infty} \|T^n\|^{1/n}$, we can find ε > 0 such that
\[ (1 + \varepsilon)|z| < \lim_{n \to \infty} \|T^n\|^{1/n} \]
and there is $n_0$ such that $(1 + \varepsilon)|z| < \|T^n\|^{1/n}$ for $n \ge n_0$. But then $(1 + \varepsilon)^n < \frac{1}{|z|^n}\|T^n\|$, and this means
\[ \sum_{k=0}^\infty \frac{1}{z^k} T^k \]
cannot converge, since the nth term grows without bound. Thus we have that |z| > r(T) implies $|z| \ge \lim_{n \to \infty} \|T^n\|^{1/n}$. Hence $r(T) \ge \lim_{n \to \infty} \|T^n\|^{1/n}$.

13.13 Example. Suppose dim X < ∞, where X is a C-vector space. Then $L(X) = B(X) \cong M_{\dim X}(\mathbb{C})$ for any norm on X. Here, if T ∈ B(X), σ(T) = σp(T) = {λ ∈ C : det(λI − T) = 0}. Observe that r(T) = max{|λ| : λ ∈ σ(T)} depends only on the algebraic structure of T, whereas ‖T‖ depends on our choice of norm on X. But, by the Spectral Radius Formula, $r(T) = \lim_{n \to \infty} \|T^n\|^{1/n}$ depends only on algebraic structure.
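The norm-independence described in the example can be seen numerically. The following sketch computes ‖Tⁿ‖^{1/n} for one fixed matrix in two different operator norms (maximal row sum and maximal column sum). The matrix `T` is a hypothetical choice with spectral radius 0.5 but large norm, and n = 200 is an arbitrary cut-off.

```python
def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def norm_inf(A):
    """Operator norm induced by the max-norm: largest absolute row sum."""
    return max(sum(abs(a) for a in row) for row in A)

def norm_one(A):
    """Operator norm induced by the 1-norm: largest absolute column sum."""
    n = len(A)
    return max(sum(abs(A[i][j]) for i in range(n)) for j in range(n))

def root_of_power_norm(T, n, norm):
    """Return ||T^n||^(1/n) for the given operator norm."""
    P = T
    for _ in range(n - 1):
        P = mat_mul(P, T)
    return norm(P) ** (1.0 / n)

# Hypothetical upper triangular matrix: eigenvalues 0.5 and 0.25, so r(T) = 0.5.
T = [[0.5, 10.0], [0.0, 0.25]]

estimate_inf = root_of_power_norm(T, 200, norm_inf)
estimate_one = root_of_power_norm(T, 200, norm_one)
print(norm_inf(T), norm_one(T))        # both norms of T are much larger than r(T)
print(estimate_inf, estimate_one)      # both estimates are close to r(T) = 0.5
```

The two norms of T differ from each other and from r(T), yet both sequences ‖Tⁿ‖^{1/n} approach the same value, as the Spectral Radius Formula predicts.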

14 Adjoint operators

14.1 Definition. Let X, Y be vector spaces and T ∈ L(X, Y). Define T* ∈ L(Y′, X′) by (T*(f))(x) = f(Tx), so T*f = f ∘ T. Linearity is clear. This is the adjoint of T.

14.2 Proposition (on the adjoint). Let X ,Y,Z be normed spaces, and T, S ∈ B(X ,Y), R ∈ B(Y,Z).

(i) T ∗ ∈ B(Y∗,X ∗), ‖T ∗‖ = ‖T‖.

(ii) If α ∈ F, (S + αT )∗ = S∗ + αT ∗.

(iii) If $T^{**} = (T^*)^* \in B(X^{**}, Y^{**})$, then $T^{**}\hat{x} = \widehat{Tx}$ for all x ∈ X.

(iv) RS ∈ B(X ,Z) implies (RS)∗ = S∗R∗ ∈ B(Z∗,X ∗).

Proof. We have:

(i) and (iii) If f ∈ Y*,
\[ \|T^* f\| = \sup\{|T^* f(x)| : x \in B(X)\} = \sup\{|f(Tx)| : x \in B(X)\} \le \|f\|\|T\|, \]
so ‖T*‖ ≤ ‖T‖. If x ∈ X, f ∈ Y*,
\[ T^{**}\hat{x}(f) = \hat{x}(T^* f) = T^* f(x) = f(Tx) = \widehat{Tx}(f). \]
So $\|T\| = \|T^{**}|_{\hat{X}}\| \le \|T^{**}\| \le \|T^*\| \le \|T\|$. So ‖T‖ = ‖T*‖.

(ii) Obvious.

(iv) If f ∈ Z*, (RS)*f = f ∘ R ∘ S = S*(f ∘ R) = S*R*f.

14.3 Definition. Let X be a normed space, Y ⊆ X a subspace, and Z ⊆ X* a subspace.

• The annihilator of Y in X* is $Y^a = \{f \in X^* : f|_Y = 0\} = \bigcap_{y \in Y} \ker \hat{y}$ (each $\ker \hat{y}$ is w*-closed, so $Y^a$ is a w*-closed subspace of X*).

• The pre-annihilator of Z in X is $Z_a = \{x \in X : \hat{x}|_Z = 0\} = \bigcap_{f \in Z} \ker f$ (closed subspace of X).

14.4 Remark. We have:

(i) $\overline{Y}^a = Y^a$ and $(Y^a)_a = \overline{Y}$ (A2Q5a).

(ii) $Z_a = (\overline{Z}^{w^*})_a$ and $(Z_a)^a = \overline{Z}^{w^*}$ (prove as in A2Q5a, using the w*-closed convex hull theorem).


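14.5 Example. Recall $(\ell^1)^* \cong \ell^\infty$, so for $x \in \ell^1$, $y \in \ell^\infty$, let
\[ \langle x, y \rangle = f_y(x) = \sum_{i=1}^\infty x_i y_i \]
denote the "dual pairing". Now, $c_0 \subseteq \ell^\infty$ and, with respect to the duality above, $(c_0)_a = \{0\} \subset \ell^1$ since $c_0^* \cong \ell^1$. Hence $((c_0)_a)^a = \{0\}^a = \ell^\infty$, which captures that $\overline{c_0}^{w^*} = \ell^\infty$.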

14.6 Theorem (Kernel-Annihilator Theorem). Let X, Y be normed spaces, and T ∈ B(X, Y). Then $\ker T = \operatorname{Im}(T^*)_a$ and $\ker(T^*) = (\operatorname{Im} T)^a$.

Proof. We have
\[ \ker T = \{x \in X : Tx = 0\} = \{x \in X : T^* g(x) = g(Tx) = 0 \ \forall g \in Y^*\} = \operatorname{Im}(T^*)_a, \]
\[ \ker(T^*) = \{g \in Y^* : T^* g = 0\} = \{g \in Y^* : g(Tx) = T^* g(x) = 0 \ \forall x \in X\} = (\operatorname{Im} T)^a. \]

14.7 Corollary. We have:

(i) ker(T*) = {0} if and only if $\overline{\operatorname{Im} T} = Y$.

(ii) ker T = {0} if and only if $\overline{\operatorname{Im} T^*}^{w^*} = X^*$.

Proof. Apply preceding remark to Kernel-Annihilator.

14.8 Theorem. Let X ,Y be Banach spaces, T ∈ B(X ,Y). The following are equivalent:

(i) T is invertible; i.e. there is an inverse T−1 ∈ B(Y,X ).

(ii) T ∗ is invertible.

(iii) Im T is dense in Y and T is bounded below: inf{‖Tx‖ : x ∈ S(X)} > 0.

(iv) Both T and T* are bounded below.

14.9 Remark. If T is bounded below, with m = inf{‖Tx‖ : x ∈ S(X)}, we have ‖Tx‖ ≥ m‖x‖ for all x ∈ X.

Proof. (i) → (ii): $TT^{-1} = I_Y$ and $T^{-1}T = I_X$, so by an earlier proposition, $(T^{-1})^* T^* = I_{Y^*}$ and $T^*(T^{-1})^* = I_{X^*}$, so $(T^{-1})^* = (T^*)^{-1}$.

(ii) → (iii): By the last corollary, $(\operatorname{Im} T)^a = \ker(T^*) = \{0\}$, so $\overline{\operatorname{Im} T} = Y$. Also, if x ∈ X \ {0}, choose f ∈ X*, ‖f‖ = 1, f(x) = ‖x‖, by Hahn-Banach. Then
\[ \|x\| = f(x) = T^*(T^*)^{-1} f(x) = (T^*)^{-1} f(Tx) \le \|(T^*)^{-1} f\|\|Tx\| \le \|(T^*)^{-1}\|\|Tx\|, \]
so $\frac{1}{\|(T^*)^{-1}\|}\|x\| \le \|Tx\|$, so T is bounded below.

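(iii) → (i): If T is bounded below, then ker T = {0}. We also have that if T is bounded below, then Im T is closed. Indeed, let $(y_n)_{n=1}^\infty \subset \operatorname{Im} T$ with $y = \lim_{n \to \infty} y_n$ in Y. Write $y_n = Tx_n$ for some $x_n$ in X, and with m = inf{‖Tx‖ : x ∈ S(X)} > 0 we observe
\[ \|x_j - x_k\| \le \frac{1}{m}\|T(x_j - x_k)\| = \frac{1}{m}\|y_j - y_k\|, \]
and hence $(x_n)_{n=1}^\infty$ is Cauchy, hence $x = \lim_{n \to \infty} x_n$ exists in X. We have $y = \lim_{n \to \infty} y_n = \lim_{n \to \infty} Tx_n = Tx$.

We note $\operatorname{Im} T = \overline{\operatorname{Im} T} = Y$ by assumption. Hence $T^{-1}$ exists. Finally, for y ∈ Y, $\|y\| = \|TT^{-1}y\| \ge m\|T^{-1}y\|$, so $\|T^{-1}y\| \le \frac{1}{m}\|y\|$, i.e. $\|T^{-1}\| \le \frac{1}{m}$.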

(i), (ii) → (iv): (i) → (iii) hence T is bounded below. (ii) → (iii) (stated for T ∗), so T ∗ is bounded below.

(iv) → (iii): T is bounded below by assumption. Also, T* bounded below implies ker(T*) = {0}, which implies $\overline{\operatorname{Im} T} = Y$ by the Corollary to the Kernel-Annihilator Theorem.

14.10 Remark. Reasons why T ∈ B(X ) may not be invertible:

• ker T ⊋ {0}.

• $\operatorname{Im} T \subseteq \overline{\operatorname{Im} T} \subsetneq X$.

• T fails to be bounded below.

14.11 Corollary. If X is a C-Banach space, T ∈ B(X ), then σ(T ) = σ(T ∗).


Proof. Given λ in C, we have by (i) ↔ (ii) above that

λIX − T is invertible ⇐⇒ λIX∗ − T ∗ is invertible.


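• Point spectrum: $\sigma_p(T) = \{\lambda \in \mathbb{C} : \ker(\lambda I - T) \supsetneq \{0\}\}$.

• Compression spectrum: $\sigma_{com}(T) = \{\lambda \in \mathbb{C} : \overline{\operatorname{Im}(\lambda I - T)} \subsetneq X\}$.

• Approximate point spectrum: $\sigma_{ap}(T) = \{\lambda \in \mathbb{C} : \lambda I - T \text{ is not bounded below}\}$.

If T ∈ B(X) is not bounded below, then there is $x_n \in S(X)$ such that $\|Tx_n\| < \frac{1}{n}$, i.e. $\lim_{n \to \infty} Tx_n = 0$. Hence
\[ \sigma_{ap}(T) = \{\lambda \in \mathbb{C} : \text{there is } (x_n)_{n=1}^\infty \subset S(X) \text{ s.t. } (\lambda I - T)x_n \xrightarrow{n \to \infty} 0\}, \]
where the condition says $Tx_n - \lambda x_n \to 0$.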

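14.13 Example. Fix 1 ≤ p < ∞, $S \in B(\ell^p)$, $S(x_1, x_2, \dots) = (0, x_1, x_2, \dots)$. We showed earlier that σp(S) = ∅. Let's compute $S^* \in B(\ell^q)$, $\frac{1}{p} + \frac{1}{q} = 1$. Write for $x \in \ell^p$, $y \in \ell^q$, $\langle x, y \rangle = f_y(x) = \sum_{i=1}^\infty x_i y_i$. We have
\[ \langle x, S^* y \rangle = f_y(Sx) = \langle Sx, y \rangle = \sum_{i=2}^\infty x_{i-1} y_i = \sum_{i=1}^\infty x_i y_{i+1}, \]
and we find that $S^*(y_1, y_2, \dots) = (y_2, y_3, y_4, \dots)$ ("back-shift").

Observe that if |λ| < 1 then $x(\lambda) = (1, \lambda, \lambda^2, \lambda^3, \dots) \in \ell^q$ and $S^* x(\lambda) = (\lambda, \lambda^2, \lambda^3, \dots) = \lambda(1, \lambda, \lambda^2, \dots) = \lambda x(\lambda)$. Hence σp(S*) ⊇ D, the open unit disc. Moreover, ‖S*‖ = ‖S‖ = 1, so σ(S*) ⊆ B, the closed unit disc. Since σ(S*) is closed, the conclusion is that σ(S*) = B, hence σ(S) = B. Check that D ⊆ σcom(S) (use the Corollary to the K-A Theorem).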
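The eigenvector computation for the back-shift can be mimicked on finite truncations. The sketch below is only an illustration under that truncation assumption: `lam` is a hypothetical point of the open unit disc and `N` an arbitrary truncation length, so the comparison is made on the first N − 1 coordinates.

```python
def back_shift(y):
    """S*(y1, y2, y3, ...) = (y2, y3, ...), acting on a finite truncation."""
    return y[1:]

lam = 0.5 + 0.25j                     # hypothetical eigenvalue candidate, |lam| < 1
N = 30                                # truncation length (an arbitrary choice)
x = [lam ** k for k in range(N)]      # truncation of x(lam) = (1, lam, lam^2, ...)

shifted = back_shift(x)               # (lam, lam^2, ..., lam^(N-1))
scaled = [lam * xi for xi in x[:-1]]  # lam * (1, lam, ..., lam^(N-2))

max_gap = max(abs(a - b) for a, b in zip(shifted, scaled))
print(max_gap)
```

Because |lam| < 1 the entries decay geometrically, which is why x(λ) lies in every sequence space ℓ^q and is a genuine eigenvector of S*.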

14.14 Lemma. If $(T_n)_{n=1}^\infty \subset G(X)$ (X a Banach space), $T = \lim_{n \to \infty} T_n$ exists and $\sup_{n \in \mathbb{N}} \|T_n^{-1}\| < \infty$, then T ∈ G(X).

Proof. Let $M = \sup_{n \in \mathbb{N}} \|T_n^{-1}\|$. For sufficiently large n,
\[ \|T_n - T\| < \frac{1}{M} \le \frac{1}{\|T_n^{-1}\|}, \]
which implies T ∈ G(X) by the Inversion Theorem.

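14.15 Proposition. If T ∈ B(X) (X a C-Banach space), and if ∂σ(T) denotes the (topological) boundary of σ(T), then ∂σ(T) ⊆ σap(T).

Proof. If λ ∈ ∂σ(T), then there is $(\lambda_n)_{n=1}^\infty \subset \rho(T) = \mathbb{C} \setminus \sigma(T)$ such that $\lambda = \lim_{n \to \infty} \lambda_n$. We observe $\|(T - \lambda_n I) - (T - \lambda I)\| = |\lambda - \lambda_n| \to 0$ while T − λI ∉ G(X) (i.e. λ ∈ ∂σ(T) ⊆ σ(T), as σ(T) is closed), so $\sup_{n \in \mathbb{N}} \|(T - \lambda_n I)^{-1}\| = \infty$, by the Lemma above. By dropping to a subsequence, assume $\lim_{n \to \infty} \|(T - \lambda_n I)^{-1}\| = \infty$. Fix, for each n, $x_n$ in X with $\|x_n\| = 1$ so that
\[ \alpha_n = \|(T - \lambda_n I)^{-1} x_n\| > \|(T - \lambda_n I)^{-1}\| - \frac{1}{n}, \quad \text{so } \alpha_n \xrightarrow{n \to \infty} \infty. \]
Let $y_n = \frac{1}{\alpha_n}(T - \lambda_n I)^{-1} x_n$, so $\|y_n\| = 1$. Compute
\[ (T - \lambda I)y_n = (T - \lambda_n I)y_n + (\lambda_n - \lambda)y_n = \frac{1}{\alpha_n}x_n + (\lambda_n - \lambda)y_n \xrightarrow{n \to \infty} 0. \]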

15 Compact operators

15.1 Definition. Let X, Y be Banach spaces. An operator K ∈ L(X, Y) is compact if $\overline{K(B(X))} \subset Y$ is compact.

In particular, any compact operator is a fortiori bounded.

15.2 Remark. We observe for K ∈ B(X ,Y) that the following are equivalent:


1. K is compact.

2. For every sequence $(Kx_n)_{n=1}^\infty \subset K(B(X))$ there is a converging subsequence $(Kx_{n_k})_{k=1}^\infty$.

3. K(B(X)) is totally bounded, i.e. given ε > 0, there are $x_1, \dots, x_n \in B(X)$ such that
\[ K(B(X)) \subseteq \bigcup_{i=1}^n (Kx_i + \varepsilon D(Y)). \]

15.3 Proposition. Let X, Y, Z be Banach spaces and K(X, Y) = {K ∈ B(X, Y) : K is compact}. Then

(i) K(X, Y) is a closed subspace of B(X, Y).

(ii) If K ∈ K(X, Y), S ∈ B(Y, Z) and T ∈ B(Z, X), then SK and KT are both compact.

Proof. We have:

(i) Suppose K, L ∈ K(X, Y), α ∈ F. If $(x_n)_{n=1}^\infty \subset B(X)$, then $((Kx_n, Lx_n))_{n=1}^\infty \subset \overline{K(B(X))} \times \overline{L(B(X))}$, a sequence in a compact space, admits a converging subsequence $((Kx_{n_k}, Lx_{n_k}))_{k=1}^\infty$, and hence $((K + \alpha L)x_{n_k})_{k=1}^\infty$ converges in Y.

Now, suppose $(K_n)_{n=1}^\infty \subset K(X, Y)$ and $K = \lim_{n \to \infty} K_n$. Given ε > 0, let $n_0$ be such that $\|K - K_n\| < \varepsilon/3$ for all $n \ge n_0$, and fix such an n. Since $K_n(B(X))$ is totally bounded, there is an (ε/3)-net $\{K_n x_1, \dots, K_n x_m\}$ for this set. We observe for x ∈ B(X) that there is j ∈ {1, …, m} such that $\|K_n x - K_n x_j\| < \varepsilon/3$, and hence
\[ \|Kx - Kx_j\| \le \|Kx - K_n x\| + \|K_n x - K_n x_j\| + \|K_n x_j - Kx_j\| < \frac{\varepsilon}{3} + \frac{\varepsilon}{3} + \frac{\varepsilon}{3} = \varepsilon, \]
so $\{Kx_1, \dots, Kx_m\}$ is an ε-net for K(B(X)).

(ii) We observe

\[ SK(B(X)) = S(K(B(X))) \subseteq \underbrace{S(\underbrace{\overline{K(B(X))}}_{\text{compact}})}_{\text{compact}} \]
and hence $\overline{SK(B(X))}$ is compact. On the other hand
\[ KT(B(Z)) \subseteq K(\|T\|B(X)) = \|T\|K(B(X)) \]
has compact closure.


15.4 Example.

(i) For Banach spaces X, Y let
\[ F(X, Y) = \{F \in B(X, Y) : \operatorname{Im} F \text{ is finite-dimensional}\}. \]
These are the so-called finite rank operators. By the Heine-Borel theorem, F(X, Y) ⊆ K(X, Y).

(ii) $\overline{F(X, Y)}^{\|\cdot\|} \subseteq K(X, Y)$, from the proposition above.

15.5 Remark. Often, $\overline{F(X, Y)}^{\|\cdot\|} = K(X, Y)$, for instance if X, Y have the "approximation property" [Grothendieck]. It turns out the approximation property characterizes this equality. You can devise separable Banach spaces for which this equality fails to be true.

(iii) Let I = [0, 1]. Let k ∈ C(I²) (the "kernel"). Define K : C(I) → C(I) by
\[ Kf(x) = \int_0^1 k(x, y)f(y)\,dy. \]
We've claimed that C(I) is the codomain, which isn't completely obvious, so let's check that. Observe
\[ |Kf(x) - Kf(x')| \le \int_0^1 |k(x, y) - k(x', y)| \cdot |f(y)|\,dy \le \sup_{y \in I} |k(x, y) - k(x', y)| \cdot \|f\|_\infty \]
and k, being continuous on the compact space I², is uniformly continuous, so Kf ∈ C(I). It's obvious that K is linear, and for all x ∈ I, $|Kf(x)| \le \int_0^1 |k(x, y)| \cdot |f(y)|\,dy \le \|k\|_\infty\|f\|_\infty$, so $\|Kf\|_\infty \le \|k\|_\infty\|f\|_\infty$, so $\|K\| \le \|k\|_\infty$.


These are the so-called integral operators. We will approximate K by finite-rank operators, and hence show that K is compact. Let A = span{φ × ψ : φ, ψ ∈ C(I)}, where (φ × ψ)(x, y) = φ(x)ψ(y). We observe that A is an algebra of functions which is point separating and conjugate closed (if F = C). Hence by the Stone-Weierstrass Theorem, $\overline{A}^{\|\cdot\|_\infty} = C(I^2)$. Hence if k ∈ C(I²) then there is $(k_n)_{n=1}^\infty \subset A$ such that $\lim_{n \to \infty} \|k - k_n\|_\infty = 0$. Hence if $K_n f = \int_0^1 k_n(\cdot, y)f(y)\,dy$, then $\|K - K_n\| \le \|k - k_n\|_\infty \to 0$. If
\[ k_n = \sum_{i=1}^{m(n)} \varphi_{n,i} \times \psi_{n,i}, \]
then $\operatorname{Im} K_n \subseteq \operatorname{span}\{\varphi_{n,1}, \dots, \varphi_{n,m(n)}\}$, since $\int_0^1 \varphi_{n,i}(\cdot)\psi_{n,i}(y)f(y)\,dy = \big[\int_0^1 \psi_{n,i}f\big] \cdot \varphi_{n,i}$.

15.6 Remark. If k, K are as above, then Hölder's inequality for 1 < p < ∞ shows that $\|Kf\|_\infty \le \|k\|_\infty\|f\|_p$, so K ∈ K(Lp(I), C(I)). The injection map J : C(I) → Lp(I) is linear and contractive (‖J‖ ≤ 1). Hence JK ∈ K(Lp(I)).

15.7 Theorem. If X ,Y are Banach spaces and K ∈ B(X ,Y) then K ∈ K(X ,Y) if and only if K∗ ∈ K(Y∗,X ∗).

Proof. We will observe that K* is w*-norm continuous on bounded sets. Indeed, suppose $(f_\nu)_{\nu \in N}$ is a net in MB(Y*) (M > 0) with w*-limit $f_0 = \lim_{\nu \in N} f_\nu$. Then $f_0 \in MB(Y^*)$ (Alaoglu). Given ε > 0, let $\{Kx_1, \dots, Kx_n\}$ be an (ε/2M)-net of K(B(X)). Then for x ∈ B(X), there is $x_j$ so that $\|Kx - Kx_j\| < \frac{\varepsilon}{2M}$, and for each ν ∈ N,
\begin{align*}
|f_\nu(Kx) - f_0(Kx)| &\le |f_\nu(Kx) - f_\nu(Kx_j)| + |f_\nu(Kx_j) - f_0(Kx_j)| + |f_0(Kx_j) - f_0(Kx)| \\
&\le \|f_\nu\|\|Kx - Kx_j\| + |f_\nu(Kx_j) - f_0(Kx_j)| + \|f_0\|\|Kx_j - Kx\| \\
&< \varepsilon + |f_\nu(Kx_j) - f_0(Kx_j)|.
\end{align*}
Hence
\begin{align*}
\|K^* f_\nu - K^* f_0\| &= \sup_{x \in B(X)} |K^* f_\nu(x) - K^* f_0(x)| = \sup_{x \in B(X)} |f_\nu(Kx) - f_0(Kx)| \\
&\le \varepsilon + \max_{j=1,\dots,n} |f_\nu(Kx_j) - f_0(Kx_j)| \xrightarrow{\nu \in N} \varepsilon,
\end{align*}
and hence $\lim_{\nu \in N} \|K^* f_\nu - K^* f_0\| = 0$, so A3P5 tells us that K* : MB(Y*) → X* is w*-norm continuous.

Now, B(Y*) is w*-compact, by Alaoglu's Theorem, so K*(B(Y*)) is ‖·‖-compact in X*.

The usual Arzelá-Ascoli proof will be typed and posted on the website.

Proof, continued. (→) We showed that on bounded sets in Y∗, K∗ is w∗-norm continuous.

(←) We have, from above, that if K* is compact, then K** is compact. Hence we see that $\widehat{K(B(X))} = K^{**}(\widehat{B(X)}) \subseteq K^{**}(B(X^{**}))$; the first equality here is from the first proposition on adjoint operators. Also, the closure of the last set is compact.

Hence $\overline{K(B(X))} \cong \overline{\widehat{K(B(X))}}$ is compact (≅ denotes isometry).

15.8 Corollary. If X, Y are reflexive spaces, then for K ∈ B(X, Y) we have K ∈ K(X, Y) if and only if K is w-norm continuous on bounded sets.

Proof. On reflexive spaces, w = w∗ coincide. Also, K = (K∗)∗. Appeal to the proof above.

16 Structure theorem for compact operators

16.1 Lemma (Key Lemma). Let X be a Banach space, and K ∈ K(X ). Suppose that there exist:

• a sequence $Y_0 \subsetneq Y_1 \subsetneq Y_2 \subsetneq \dots$, each $Y_n$ a closed subspace, and

• scalars $(\alpha_n)_{n=1}^\infty$ such that $(\alpha_n I - K)Y_n \subseteq Y_{n-1}$.

Then limn→∞ αn = 0.

Proof (notes 16.1). Suppose not. Then by dropping to a subsequence if necessary, we may suppose $|\alpha_n| \ge \varepsilon$ for each n, for some ε > 0. By the Riesz Lemma (used to prove B(X) is not compact if dim X ≮ ∞) there exists for each i some $x_i \in Y_i$ such that $\operatorname{dist}(x_i, Y_{i-1}) > \frac{1}{2}$ and $\|x_i\| \le 1$. Let $y_i = (K - \alpha_i I)x_i \in Y_{i-1}$, so $Kx_i = \alpha_i x_i + y_i \in Y_i$.

If j < i, we have
\[ \|Kx_i - Kx_j\| = \|\alpha_i x_i + y_i - Kx_j\| = |\alpha_i| \cdot \Big\|x_i + \frac{1}{\alpha_i}(\underbrace{y_i - Kx_j}_{\in Y_{i-1}})\Big\| \ge |\alpha_i| \cdot \operatorname{dist}(x_i, Y_{i-1}) \ge \frac{\varepsilon}{2}. \]
Hence $(Kx_i)_{i=1}^\infty$ has no converging subsequence, which means that K is not compact.


16.2 Remark.

(i) Let Ω ⊆ C be a compact set. If the interior Ω° ≠ ∅ then |∂Ω| = c. Indeed, let $z_0 \in \Omega^\circ$. For each z ∈ S (i.e. |z| = 1) let $r_z = \max\{r > 0 : z_0 + rz \in \Omega\} > 0$. Then $\{z_0 + r_z z : z \in S\} \subseteq \partial\Omega$, so |S| ≤ |∂Ω|. By the Cantor-Bernstein-Schroeder Theorem, we have that |∂Ω| = c. Thus, if |∂Ω| ≤ ℵ0 then Ω° = ∅ and hence Ω = ∂Ω is countable.

(ii) If T ∈ L(X) (X a C-vector space), $\{\lambda_1, \dots, \lambda_n\}$ is a collection of distinct eigenvalues of T and $x_i \in \ker(\lambda_i I - T) \setminus \{0\}$, then $\{x_1, \dots, x_n\}$ is linearly independent.

Proof by induction. Let n = 2. If $x_1 = \alpha_2 x_2$ then $\lambda_1\alpha_2 x_2 = \lambda_1 x_1 = Tx_1 = T(\alpha_2 x_2) = \alpha_2\lambda_2 x_2$, so $\alpha_2(\lambda_1 - \lambda_2)x_2 = 0$, which contradicts that $x_2 \ne 0$. Hence, having proved this for n − 1, suppose $x_1 = \alpha_2 x_2 + \dots + \alpha_n x_n$. Then $\lambda_1(\alpha_2 x_2 + \dots + \alpha_n x_n) = \lambda_1 x_1 = Tx_1 = T(\alpha_2 x_2 + \dots + \alpha_n x_n) = \alpha_2\lambda_2 x_2 + \dots + \alpha_n\lambda_n x_n$, so $\alpha_2(\lambda_1 - \lambda_2)x_2 + \dots + \alpha_n(\lambda_1 - \lambda_n)x_n = 0$, which contradicts the linear independence of $x_2, \dots, x_n$.

16.3 Theorem (Structure Theorem for Compact Operators). Let X be a C-Banach space, K ∈K(X ).

(i) If λ ∈ C \ {0}, each $\ker[(\lambda I - K)^n]$ is finite-dimensional and there is $n_0$ so that $\ker[(\lambda I - K)^n] = \ker[(\lambda I - K)^{n_0}]$ for $n \ge n_0$.

(ii) If dim X ≮ ∞, σ(K) = σp(K) ∪ {0} and $\sigma_p(K) = \{\lambda_1, \lambda_2, \dots\}$ where $\lim_{n \to \infty} \lambda_n = 0$ if σp(K) is infinite.

Proof. We have:

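(i) Consider first the sequence $\ker(\lambda I - K) \subseteq \ker(\lambda I - K)^2 \subseteq \dots$. If $n_0$, as suggested, does not exist, then let $n_1 = 1$ and $n_{k+1} = \min\{n \in \mathbb{N} : \ker(\lambda I - K)^{n_k} \subsetneq \ker(\lambda I - K)^n\}$. Let $Y_k = \ker(\lambda I - K)^{n_k}$ and $Y_0 = \{0\}$, so $Y_0 \subsetneq Y_1 \subsetneq Y_2 \subsetneq \dots$. We have that $(\lambda I - K)Y_j \subseteq Y_{j-1}$, so by the Key Lemma (with $\alpha_n = \lambda$ for every n) we would obtain $\lim_{n \to \infty} \lambda = 0$, but λ is a non-zero constant, which is absurd. Hence $n_0$ exists, as claimed.

Now suppose $\dim \ker(\lambda I - K)^n \not< \infty$ for some n. Pick the $n_k$, above, for which $\dim Y_{k-1} < \infty$ and $\dim Y_k \not< \infty$. Let $\{x_i\}_{i=1}^\infty \subset Y_k \setminus Y_{k-1}$ be linearly independent. Let $V_0 = Y_{k-1}$ and $V_n = \operatorname{span}\{V_0, x_1, \dots, x_n\}$. We observe that $(\lambda I - K)V_n \subseteq V_0 \subseteq V_{n-1}$. As well, $V_0 \subsetneq V_1 \subsetneq V_2 \subsetneq \dots$, and hence by the Key Lemma we have $\lim_n \lambda = 0$, which is still absurd.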

(ii) We first prove that σap(K) ⊆ σp(K) ∪ {0}. If λ ∈ σap(K) \ {0}, then there is $(x_n)_{n=1}^\infty \subset S(X)$ such that $\|(\lambda I - K)x_n\| \to 0$. Since K is compact, $(Kx_n)_{n=1}^\infty \subset K(B(X))$ admits a subsequence $(Kx_{n_j})_{j=1}^\infty$ which converges to some y. Hence $\|\lambda x_{n_j} - y\| \le \|(\lambda I - K)x_{n_j}\| + \|Kx_{n_j} - y\| \to 0$, so $\|y\| = |\lambda| \ne 0$ and $K\frac{1}{\lambda}y = \lim_{j \to \infty} Kx_{n_j} = y$, so Ky = λy. Hence λ ∈ σp(K).

Suppose $\{\lambda_n\}_{n=1}^\infty \subseteq \partial\sigma(K)$ is a sequence of distinct non-zero elements. We recall that ∂σ(K) ⊆ σap(K) by an earlier Proposition, and moreover, σap(K) ⊆ σp(K) ∪ {0}, from above. Consider spaces $Y_0 = \{0\}$, $Y_n = \operatorname{span}\{\ker(\lambda_i I - K) : i = 1, \dots, n\}$. By linear independence of non-zero eigenvectors associated to distinct eigenvalues, we have that
\[ Y_0 \subsetneq Y_1 \subsetneq Y_2 \subsetneq \dots \]
We observe that $(\lambda_n I - K)Y_n \subseteq Y_{n-1}$. Hence, by the Key Lemma, $\lim_{n \to \infty} \lambda_n = 0$. Hence we conclude that
\[ \partial\sigma(K) \subseteq \bigcup_{n=1}^\infty \underbrace{\Big(\partial\sigma(K) \setminus \frac{1}{n}D\Big)}_{\text{finite}} \cup \{0\}, \]
so that ∂σ(K) is countable. Hence we have that σ(K) itself is countable (remark, last class). We thus have σ(K) = ∂σ(K) = σap(K) ⊆ σp(K) ∪ {0}. Thus, if |σp(K)| < ∞, then 0 ∈ σp(K), since each non-zero eigenvalue has finite-dimensional eigenspace, that is, $\dim \ker[(\lambda I - K)^n] < \infty$, by (i) above. And, if |σp(K)| ≮ ∞, then 0 is the only cluster point of σp(K). Because σ(K) is compact, 0 ∈ σ(K).


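16.4 Aside. Suppose |σp(K)| < ∞, say $\sigma_p(K) = \{\lambda_1, \dots, \lambda_n\}$. Let $V_k = \ker[(\lambda_k I - K)^{n_k}]$ where $n_k$ is the maximal exponent as in (i), so $\dim V_k < \infty$. Then $\dim \operatorname{span}\{V_1, \dots, V_n\} < \infty$.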

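16.5 Example. It is possible for 0 ≠ K ∈ K(X) to have empty point spectrum.

Let $K \in B(\ell^p)$, 1 ≤ p < ∞, be given by $K(x_1, x_2, \dots) = (0, x_1, \frac{1}{2}x_2, \frac{1}{3}x_3, \dots)$ ("weighted shift"). Then
\begin{align*}
K(x_1, x_2, \dots) &= (0, x_1, \tfrac{1}{2}x_2, \tfrac{1}{3}x_3, \dots) \\
K^2(x_1, x_2, \dots) &= (0, 0, \tfrac{1}{2}x_1, \tfrac{1}{3 \cdot 2}x_2, \tfrac{1}{4 \cdot 3}x_3, \dots) \\
&\ \,\vdots \\
K^n(x_1, x_2, \dots) &= (0, 0, \dots, 0, \tfrac{1}{n!}x_1, \tfrac{1}{(n+1)!}x_2, \tfrac{2}{(n+2)!}x_3, \dots, \tfrac{(k-1)!}{(n+k-1)!}x_k, \dots).
\end{align*}
Note $\frac{1}{n!} \ge \frac{(k-1)!}{(n+k-1)!}$ for k in N. We can compute
\[ \|K^n x\|_p \le \frac{1}{n!}\|x\|_p \implies \|K^n\| \le \frac{1}{n!} \]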

hence the spectral radius formula tells us that
\[ r(K) = \lim_{n \to \infty} \|K^n\|^{1/n} \le \lim_{n \to \infty} \frac{1}{\sqrt[n]{n!}} = 0. \]
Use an estimate like $n! \ge (n/2)^{n/2}$. Think back to the theory of converging series: the root test is the primary theoretical test. There's a really nice exercise that shows that if the ratio of the (n+1)st term over the nth term has really nice behaviour, then so do the roots, so the ratio test follows from the root test. Hence σ(K) = {0}. So either you believe the proof we gave in terms of compact operators on infinite-dimensional Banach spaces, or you believe the Liouville theorem argument that the spectrum is non-empty. Either way it has to contain 0. We note for x ≠ 0 that
\[ \|Kx\|_p = \Big(\sum_{n=1}^\infty \frac{1}{n^p}|x_n|^p\Big)^{1/p} \]
and necessarily at least one $|x_n|^p$ is not zero, so this norm is not zero, which tells us that ker K = {0}. This gives us a little bit of a warning: as pretty as this theorem is, the first part of it might essentially have no content; there might be no eigenvalues to deal with. Because we were only able to get finite dimensionality of eigenspaces with non-zero eigenvalues, we see that the zero part of the spectrum can contain all the craziness of the operator.
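The decay of the powers of the weighted shift can be checked with exact rational arithmetic. The sketch below applies K repeatedly to standard basis vectors on a finite truncation and records the single non-zero coefficient of each image. The helper names and the truncation width are assumptions made for this illustration.

```python
from fractions import Fraction
from math import factorial

def weighted_shift(x):
    """K(x1, x2, ...) = (0, x1, x2/2, x3/3, ...) on a finite list (length grows by one)."""
    return [Fraction(0)] + [Fraction(xk) / k for k, xk in enumerate(x, start=1)]

def power_coefficients(n, width):
    """For k = 1..width, return c_k where K^n e_k = c_k e_{n+k}."""
    coeffs = []
    for k in range(1, width + 1):
        y = [Fraction(0)] * width
        y[k - 1] = Fraction(1)           # the basis vector e_k
        for _ in range(n):
            y = weighted_shift(y)
        coeffs.append(y[n + k - 1])      # the only non-zero entry sits at position n + k
    return coeffs

for n in range(1, 6):
    coeffs = power_coefficients(n, 6)
    # The largest coefficient is attained at k = 1 and equals 1/n!.
    print(n, max(coeffs), float(max(coeffs)) ** (1.0 / n))
```

Since Kⁿ maps each basis vector to a multiple of a different basis vector, the largest coefficient is the operator norm of Kⁿ on ℓ^p, and its nth root visibly decreases, consistent with r(K) = 0.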

16.6 Remark. There are quite a few ways to manufacture operators that contain only 0 in the spectrum. Let I = [0, 1], and let k ∈ C(I²) be a continuous function such that k(x, y) = 0 when x ≤ y. For example, let
\[ k(x, y) = \begin{cases} 0 & \text{if } x \le y \\ x - y & \text{if } x > y. \end{cases} \]
If we let K be the integral operator coming from this kernel, then $Kf(x) = \int_0^1 k(x, y)f(y)\,dy = \int_0^x k(x, y)f(y)\,dy$. Now we can write down a formula for what happens when we iterate this operator. We can check
\[ \|K^n f\|_p \le \frac{\|k\|_\infty^n}{n!}\|f\|_p \]
and hence r(K) = 0. If k is continuously partially differentiable in the 1st coordinate, then one can show that σp(K) = ∅.

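16.7 Remark. Let K be the weighted shift above and let $K_n(x_1, x_2, \dots) = (0, x_1, \frac{1}{2}x_2, \dots, \frac{1}{n}x_n, 0, 0, \dots)$. Then $K_n \in F(\ell^p) \subseteq K(\ell^p)$. Check
\[ \|(K - K_n)x\|_p \le \frac{1}{n+1}\|x\|_p \implies \lim_{n \to \infty} \|K_n - K\| = 0, \]
so K is compact, being a norm limit of finite-rank operators.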

17 Operators on Hilbert space

17.1 Definition. Let H, L be C-Hilbert spaces and T ∈ B(H, L). We define T* ∈ B(L, H) by (x, T*y) := (Tx, y) for all x ∈ H, y ∈ L. Indeed, for fixed y ∈ L, x ↦ (Tx, y) is a bounded linear functional, so by the Riesz Representation Theorem there exists T*y ∈ H such that

(Tx, y) = (x, T*y).

Check that y ↦ T*y is linear.


Cost: $(\alpha T)^* = \bar{\alpha}T^*$ for α ∈ C.

Advantage: If L = H we can now compose TS∗ in H.


Recall: for H a Hilbert space, T ∈ B(H), we define T ∗ ∈ B(H) by

(T ∗x, y) = (x, Ty).

von Neumann preferred Hilbert spaces to model quantum mechanics. It plays host to non-commutative geometry; I am involved with non-commutative harmonic analysis (a more general kind of harmonic analysis). The notion of the adjoint is just mildly different than it is in the Banach space setting. If we have an operator on a Banach space, then the adjoint will be defined on its dual space. Hilbert spaces are essentially self-dual (we say "essentially" because of the small cost of conjugation in the complex realm).

17.2 Proposition. If H,K,L are Hilbert spaces and S, T ∈ B(H,K) and R ∈ B(K,L) then

(i) T ∗ ∈ B(K,H) and ‖T ∗‖ = ‖T‖.

(ii) $(T + \alpha S)^* = T^* + \bar{\alpha}S^*$ (meaningful only if F = C, but generally for spectral theory this is the domain in which we wish to stay).

(iii) (RT )∗ = T ∗R∗.

(iv) T ∗∗ = T (in Banach spaces, we couldn’t quite say anything this nice; compare to the statement T ∗∗x = T x).

(v) If K = H, then ‖T*T‖ = ‖T‖² (C*-identity; makes a profound theory: no study of Banach algebras is complete without a substudy of C*-algebras).

This is not very hard to prove. We will give very small hints at the details of C∗-algebra theory.

Proof. (i), (iii) and (iv) are pretty much just like the Banach space case. Also, (ii) is very straightforward (also similar to the Banach space case).

(v): First, we have ‖T*T‖ ≤ ‖T*‖‖T‖ = ‖T‖².

Hence, it remains to show ‖T*T‖ ≥ ‖T‖². We have
\[ \|T\|^2 = \sup\{\|Tx\|^2 : x \in B(H)\} = \sup\{(Tx, Tx) : x \in B(H)\} = \sup\{(T^*Tx, x) : x \in B(H)\}, \]
and by Cauchy-Schwarz this is
\[ \le \sup\{\|T^*Tx\|\|x\| : x \in B(H)\} \le \|T^*T\|. \]

This will be the driving force for the spectral theorem for compact Hermitian operators on Hilbert space.

We mention two theorems whose proofs are just like the Banach space case.

17.3 Theorem (Kernel-Annihilator Theorem). If H,K are Hilbert spaces, T ∈ B(H,K), then

kerT ⊥ Im(T ∗).

17.4 Remark. In Hilbert spaces, if Y is a subspace of H, then $Y^\perp = Y^a = Y_a$ (due to reflexivity).

17.5 Theorem (Schauder). If H,L are Hilbert spaces, and K ∈ B(H,L) then the following are equivalent:

(i) K ∈ K(H,L) ⇐⇒ K∗ ∈ K(L,H).

(ii) K is w∗-norm continuous on bounded sets.

There is only a modest difference, because K∗ is only slightly differently described as K∗ from the Banach spacesetting.

Let’s talk about some particular classes of operators on a Hilbert space.

17.6 Definition. Let H be a Hilbert space, and T ∈ B(H). We say that


(i) T is positive if (Tx, x) ≥ 0 for all x ∈ H.

(ii) T is Hermitian (or self-adjoint) if T = T ∗.

(iii) T is normal if T ∗T = TT ∗.

17.7 Proposition. If H is a Hilbert space, T ∈ B(H), then

(i) T is Hermitian if and only if [x, y] = (Tx, y) is a Hermitian form.

(ii) T is positive implies T is Hermitian.

(iii) T ∗T is positive.

(iv) If T is Hermitian, then T 2 is positive.

Proof. (i) (→) $[y, x] = (Ty, x) = (y, T^*x) = (y, Tx) = \overline{(Tx, y)} = \overline{[x, y]}$.

(←) First, observe that if [x, y] = (Tx, y) defines a Hermitian form, then $[x, x] = \overline{[x, x]}$ and hence [x, x] ∈ R. Thus we have the polarisation identity
\begin{align*}
(Tx, y) = [x, y] &= \frac{1}{4}\sum_{k=0}^3 i^k[x + i^k y, x + i^k y] = \frac{1}{4}\sum_{k=0}^3 i^k \underbrace{(T(x + i^k y), x + i^k y)}_{\in \mathbb{R}} \\
&= \frac{1}{4}\sum_{k=0}^3 i^k (x + i^k y, T(x + i^k y)) = (x, Ty).
\end{align*}

Remark: In finite dimensions, represent T with respect to an orthonormal basis $\beta = \{e_1, \dots, e_d\}$, and we have $[T]_\beta = [t_{ij}]$ and $[T^*]_\beta = [T]_\beta^* = [\overline{t_{ji}}]$ and $t_{ij} = (Te_j, e_i)$.

(ii): If T is positive, then [x, y] = (Tx, y) is a positive form, i.e. [x, x] ≥ 0 for x ∈ H. Hence this form is Hermitian,by polarisation identity.

(iii): (T ∗Tx, x) = (Tx, Tx) = ‖Tx‖2 ≥ 0.

(iv): T = T ∗ implies that T ∗T = T 2.
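The polarisation identity used in the proof of (i) holds for any sesquilinear form, and can be verified numerically. In the sketch below the matrix `T` and the vectors `x`, `y` are hypothetical sample data, and the inner product is taken linear in the first slot and conjugate-linear in the second, which is an assumption consistent with the rule (αT)* = ᾱT* in these notes.

```python
def inner(u, v):
    """(u, v) = sum of u_i * conj(v_i): linear in u, conjugate-linear in v."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def apply(T, x):
    """Matrix-vector product T x."""
    return [sum(t * xi for t, xi in zip(row, x)) for row in T]

# Hypothetical sample data on C^2.
T = [[1 + 2j, -1j], [3 + 0j, 0.5 - 1j]]
x = [1 - 1j, 2 + 0.5j]
y = [0.5j, -1 + 1j]

def form(u, v):
    """The sesquilinear form [u, v] = (Tu, v)."""
    return inner(apply(T, u), v)

units = [1, 1j, -1, -1j]                 # the powers i^k for k = 0, 1, 2, 3
lhs = form(x, y)
rhs = sum(w * form([a + w * b for a, b in zip(x, y)],
                   [a + w * b for a, b in zip(x, y)])
          for w in units) / 4

print(lhs, rhs)
```

The identity recovers the off-diagonal values [x, y] from the diagonal values [z, z] alone, which is why a form with real diagonal values is automatically Hermitian.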

17.8 Proposition. If T ∈ B(H) with H a Hilbert space, then there exist unique Hermitian operators ReT, ImTsuch that T = ReT + i ImT .

Proof. Let ReT = 12 (T + T ∗) and ImT = 1

2i (T − T∗). Check that (ReT )∗ = ReT and (ImT )∗ = ImT (recall

(αT )∗ = αT ∗). Now suppose T = T1 + iT2 where T ∗k = Tk for k = 1, 2. Then

0 = T − T = (T1 − ReT ) + i(T2 − ImT )

where T1 − ReT , T2 − ImT are each Hermitian. Thus, adding

0 = 0 + 0∗ = 2(T1 − ReT ) =⇒ T1 = ReT,

and subtracting,

0 = 0 − 0∗ = 2i(T2 − Im T) =⇒ T2 = Im T.
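A small worked instance may help fix the definitions. The matrix below is not from the lecture; it is chosen here only because the arithmetic is short, and it acts on C² with the standard inner product.

```latex
\[
T=\begin{pmatrix}0&2\\0&0\end{pmatrix},\qquad
T^*=\begin{pmatrix}0&0\\2&0\end{pmatrix},
\]
\[
\operatorname{Re}T=\tfrac12(T+T^*)=\begin{pmatrix}0&1\\1&0\end{pmatrix},\qquad
\operatorname{Im}T=\tfrac1{2i}(T-T^*)=\begin{pmatrix}0&-i\\ i&0\end{pmatrix},
\]
\[
\operatorname{Re}T+i\operatorname{Im}T
=\begin{pmatrix}0&1\\1&0\end{pmatrix}+\begin{pmatrix}0&1\\-1&0\end{pmatrix}
=\begin{pmatrix}0&2\\0&0\end{pmatrix}=T .
\]
```

Both Re T and Im T equal their own conjugate transposes, as the proposition requires.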

Now we talk about why Hermitian and normal operators are absolutely as nice as an operator can be.

17.9 Corollary. If T ∈ B(H), T is normal if and only if ReT and ImT commute.

Proof. (→) If T is normal then T and T∗ commute, hence so do Re T = ½(T + T∗) and Im T = (1/2i)(T − T∗).

(←) If Re T · Im T = Im T · Re T, then T = Re T + i Im T and T∗ = Re T − i Im T necessarily commute too.
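Both directions can also be read off from one identity. The following is a sketch, writing A = Re T and B = Im T (these two letters are shorthand introduced here, not notation from the notes).

```latex
\begin{align*}
T^*T &= (A-iB)(A+iB) = A^2 + i(AB-BA) + B^2,\\
TT^* &= (A+iB)(A-iB) = A^2 - i(AB-BA) + B^2,\\
T^*T - TT^* &= 2i\,(AB-BA).
\end{align*}
```

So T∗T = TT∗ holds exactly when AB = BA.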

17.10 Proposition. If H is a Hilbert space and T ∈ B(H), then T is normal if and only if ‖Tx‖ = ‖T∗x‖ for all x in H.

Proof. (→) ‖Tx‖² = (Tx, Tx) = (T∗Tx, x) = (TT∗x, x) = (T∗x, T∗x) = ‖T∗x‖².

(←) 0 = ‖Tx‖² − ‖T∗x‖² = (T∗Tx, x) − (TT∗x, x) = ([T∗T − TT∗]x, x). Since T∗T − TT∗ is Hermitian, the polarisation identity shows that ([T∗T − TT∗]x, y) = 0 for all x, y ∈ H. It follows that T∗T − TT∗ = 0, by the Hahn-Banach Theorem and Riesz Representation Theorem.

If we take a normal operator, its spectral radius is as large as can be: it is always the norm of the operator. This is a very magical property, and it is exactly the sort of engine that is going to drive the spectral theory for both Hermitian and normal operators. We will see this over the next two lectures (the proof is just a nice mixture of the Spectral Radius Formula and the C∗-identity).


Office hours Th 2:30-4, F 2:00-3:30. A5 due Monday. Please make copies for study purposes. Final exam questions posted soon.

17.11 Proposition. If H is a Hilbert space and T ∈ B(H) is normal (i.e. T∗T = TT∗) then $\|T^n\| = \|T\|^n$ for all n ∈ N.

Proof. Observe that ‖T‖² = ‖T∗T‖ by the C*-identity. Let H = T∗T, so H = H∗ and hence ‖H‖² = ‖H∗H‖ = ‖H²‖. Thus for k in N:

\[
\|H\|^{2^k} = \|H\|^{2\cdot 2^{k-1}} = \|H^2\|^{2^{k-1}} = \dots = \|H^{2^k}\| .
\]

Thus if n ∈ N and $2^k \ge n$,

\[
\|H\|^n\,\|H\|^{2^k-n} = \|H\|^{2^k} = \|H^{2^k}\| = \|H^n H^{2^k-n}\| \le \|H^n\|\,\|H^{2^k-n}\| \le \|H^n\|\,\|H\|^{2^k-n} .
\]

If H ≠ 0, i.e. ‖H‖ ≠ 0, we see that $\|H\|^n \le \|H^n\|$, hence $\|H^n\| = \|H\|^n$. Thus

\[
\|T\|^n\|T\|^n = (\|T\|^2)^n = \|T^*T\|^n = \|H\|^n = \|H^n\| = \|(T^*T)^n\| = \|(T^*)^nT^n\|
\]

where normality of T is used in the last equality. Moreover,

\[
\|(T^*T)^n\| = \|(T^*)^nT^n\| \le \|T^*\|^n\,\|T^n\| = \|T\|^n\,\|T^n\| .
\]

Hence as above $\|T\|^n \le \|T^n\|$, so $\|T\|^n = \|T^n\|$.

17.12 Corollary. If T ∈ B(H) is normal, then r(T) = ‖T‖.

Proof. Recall the Spectral Radius Formula: $r(T) = \lim_{n\to\infty} \|T^n\|^{1/n}$. By the proposition above, $\|T^n\|^{1/n} = \|T\|$ for every n.

This is a special property. For example, K ∈ K(ℓ²) given by

\[
Kx = (0,\, x_1,\, \tfrac{1}{2}x_2,\, \tfrac{1}{3}x_3,\, \dots)
\]

has ‖K‖ = 1 while r(K) = 0, i.e. σ(K) = {0}. Recall σp(K) = ∅.
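The claim r(K) = 0 can be checked directly from the Spectral Radius Formula. The computation below is a sketch, with $e_j$ denoting the standard basis vectors of ℓ² (notation assumed here) and using that a weighted shift has norm equal to the supremum of its weights.

```latex
\begin{align*}
Ke_j &= \tfrac1j\,e_{j+1},
&K^ne_j &= \frac{1}{j(j+1)\cdots(j+n-1)}\,e_{j+n},\\
\|K^n\| &= \sup_{j\ge1}\frac{1}{j(j+1)\cdots(j+n-1)} = \frac1{n!},
&r(K) &= \lim_{n\to\infty}(n!)^{-1/n} = 0 .
\end{align*}
```

Note also that ‖Ke₁‖ = 1 while K∗e₁ = 0, so K fails the criterion ‖Kx‖ = ‖K∗x‖ for normality, consistent with r(K) ≠ ‖K‖.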

17.13 Proposition. Let H be a Hilbert space and U ∈ B(H). Then the following are equivalent:

(i) (Ux,Uy) = (x, y) for x, y in H.

(ii) ‖Ux‖ = ‖x‖ for x in H (i.e. U is an isometry).

(iii) U∗U = I.

Furthermore, if any of (i) – (iii) holds, then the following are equivalent (we are not suggesting the following are equivalent to those above):

(iv) U is surjective.

(v) UU∗ = I (i.e. U is normal).

Recall if U∗U = I = UU∗, then U is a unitary.

Proof. (i) → (ii) and (iii) → (i): easy.

(ii) → (iii): If ‖Ux‖² = ‖x‖², then

((U∗U − I)x, x) = (Ux, Ux) − (x, x) = 0

and by applying the polarisation identity (U∗U − I is Hermitian), we see that U∗U − I = 0.

(iv) → (v) (with (iii) holding): By (iii), U is injective. If U is also surjective, then, since U is bounded below (‖Ux‖ ≥ ‖x‖, by (ii)), U⁻¹ ∈ B(H) exists, and U⁻¹ = IU⁻¹ = U∗UU⁻¹ = U∗ (by (iii)). Hence UU∗ = UU⁻¹ = I.

(v) → (iv): If UU∗ = I then, together with U∗U = I from (iii), we get U∗ = U⁻¹, so U is bijective.

In finite dimensions, U∗U = I implies U is invertible and hence U∗ = U⁻¹. In infinite dimensions, this may not be true:

U ∈ B(ℓ²), U(x1, x2, . . .) = (0, x1, x2, x3, . . .)


Calculate:

U∗(x1, x2, . . .) = (x2, x3, . . .)

UU∗(x1, x2, . . .) = (0, x2, x3, . . .)

so UU∗ ≠ I. Moreover ‖UU∗‖ = 1, (UU∗)∗ = U∗∗U∗ = UU∗, and (UU∗)² = U(U∗U)U∗ = UIU∗ = UU∗. One can calculate, moreover, that U∗U = I, or equivalently ‖Ux‖² = ‖x‖² (so this is in fact an isometry, as advertised).

17.14 Lemma. Let H be a Hilbert space and H ∈ B(H) be Hermitian.

(i) σp(H) ⊆ R (comment: in fact σ(H) ⊆ R).

(ii) If λ ≠ µ in σp(H) then ker(λI − H) ⊥ ker(µI − H).

(iii) If λ ∈ σp(H), $P = P_{\ker(\lambda I - H)}$ is the orthogonal projection, and TH = HT for T ∈ B(H), then TP = PT.

Proof. We have:

(i) The form [x, y] = (Hx, y) is a Hermitian form, so $[x, x] = \overline{[x, x]}$, so [x, x] ∈ R. If x ∈ ker(λI − H) \ {0} for some λ ∈ σp(H) then $\lambda(x, x) = (\lambda x, x) = (Hx, x) = (x, Hx) = (x, \lambda x) = \overline{\lambda}(x, x)$, so $\lambda = \overline{\lambda}$, so λ ∈ R.

(ii) If x ∈ ker(λI −H), y ∈ ker(µI −H) then

λ(x, y) = (λx, y) = (Hx, y) = (x,Hy) = (x, µy) = µ(x, y)

(using µ ∈ R). So (λ − µ)(x, y) = 0 and, since λ ≠ µ, (x, y) = 0.

(iii) If λ, P are as above, then for x ∈ ker(λI −H),

HTx = THx = T (λx) = λTx

so Tx ∈ ker(λI − H). Hence T(ker(λI − H)) ⊆ ker(λI − H), which means that TP = PTP: for every x we have Px ∈ ker(λI − H), hence TPx ∈ ker(λI − H), so TPx = PTPx. Also, TH = HT implies (TH)∗ = (HT)∗, hence HT∗ = T∗H, and as above T∗P = PT∗P. Hence

PT = (T∗P)∗ = (PT∗P)∗ = PTP = TP.

LAST TIME:

17.15 Proposition. If N ∈ B(H) is normal, then r(N) = ‖N‖.

17.16 Lemma. Let H ∈ B(H) be Hermitian.

(i) σp(H) ⊆ R.

(ii) λ ≠ µ in σp(H) implies ker(λI − H) ⊥ ker(µI − H).

(iii) If TH = HT, T ∈ B(H), then for $P = P_{\ker(\lambda I - H)}$ (orthogonal projection), TP = PT.

17.17 Theorem (Spectral Theorem for Compact Hermitian Operators). Let $H \in K(\mathcal{H})$ be a Hermitian operator ($\mathcal{H}$ a Hilbert space) and let {λ1, λ2, . . .} be a list of σ(H) \ {0} = σp(H) \ {0}. Then

\[
\mathcal{H} = \ell^2\text{-}\bigoplus_{n=1,2,3,\dots} \ker(\lambda_n I - H)\ \oplus_2\ \ker H,
\]

i.e. if $x \in \mathcal{H}$ then x has a unique decomposition $x = \sum_{n=1,2,\dots} x_n + x_0$ where $x_n \in \ker(\lambda_n I - H)$, $x_0 \in \ker H$, with $x_n \perp x_m$ for $n \neq m$, so that $\|x\|^2 = \sum_{n=1,2,\dots} \|x_n\|^2 + \|x_0\|^2$. Thus, letting $P_n = P_{\ker(\lambda_n I - H)}$ (orthogonal projection), we have

\[
H = \sum_{n=1,2,\dots} \lambda_n P_n
\]

where this sum converges in norm if {λ1, λ2, . . .} is infinite.
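As a finite-dimensional illustration (not from the lecture; the matrix is chosen here purely for convenient arithmetic), the decomposition can be written out on C²:

```latex
\[
H=\begin{pmatrix}2&1\\1&2\end{pmatrix},\qquad
\sigma(H)=\{3,1\},\qquad \ker H=\{0\},
\]
\[
P_1=\tfrac12\begin{pmatrix}1&1\\1&1\end{pmatrix}\ (\lambda_1=3),\qquad
P_2=\tfrac12\begin{pmatrix}1&-1\\-1&1\end{pmatrix}\ (\lambda_2=1),
\]
\[
3P_1+1\cdot P_2=\tfrac12\begin{pmatrix}3+1&3-1\\3-1&3+1\end{pmatrix}
=\begin{pmatrix}2&1\\1&2\end{pmatrix}=H,\qquad
P_1P_2=0,\quad P_1+P_2=I .
\]
```

Here the eigenspaces are spanned by (1, 1) and (1, −1), which are orthogonal, as part (ii) of the lemma predicts.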


Proof. Let $E = \overline{\operatorname{span}}\{\ker(\lambda_n I - H) : n = 1, 2, \dots\}$. Then part (ii) of the above lemma tells us that

\[
E = \ell^2\text{-}\bigoplus_{n=1,2,\dots} \ker(\lambda_n I - H).
\]

We note that E contains all eigenspaces corresponding to nonzero eigenvalues of H. Now, let $\mathcal{H}_0 = E^\perp$ and $P_0 = P_{\mathcal{H}_0}$ (orthogonal projection).

We observe, first, that $H\mathcal{H}_0 \subseteq \mathcal{H}_0$. Indeed, if $x \in \mathcal{H}_0$ and y ∈ E then (Hx, y) = (x, Hy) = 0, since HE ⊆ E so Hy ∈ E, and x ∈ E⊥ by assumption. That is, $Hx \in \mathcal{H}_0$ whenever $x \in \mathcal{H}_0$. Thus $H_0 = HP_0 = P_0HP_0$ (by the proof of part (iii) of the lemma above) is a Hermitian operator: $H_0^* = (P_0HP_0)^* = P_0^*H^*P_0^* = P_0HP_0$. Since $H_0$ is compact and since $r(H_0) = \|H_0\|$, there exists $\lambda \in \sigma(H_0)$ such that $|\lambda| = \|H_0\|$. Further, if $H_0 \neq 0$, i.e. $\|H_0\| \neq 0$, then $\lambda \in \sigma_p(H_0)$. If λ ≠ 0 and $x \in \ker(\lambda I - H_0) \setminus \{0\}$ then x ⊥ E, because $x \perp \ker(0I - H_0) = \ker H_0$ and $E = \ker P_0 \subseteq \ker P_0HP_0 = \ker H_0$. Thus for x as above, $P_0x = x$. Thus $Hx = HP_0x = H_0x = \lambda x$. But this contradicts the fact that E contains all eigenspaces of H corresponding to nonzero eigenvalues. Thus $H_0 = 0$ and $\mathcal{H}_0 = \ker H$. Hence

\[
\mathcal{H} = E \oplus_2 E^\perp = \ell^2\text{-}\bigoplus_{n=1,2,\dots} \ker(\lambda_n I - H)\ \oplus_2\ \ker H.
\]

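To see the “operator form” of this, let

\[
H_1 = \sum_{n} \lambda_n P_n .
\]

We observe that if {λ1, λ2, . . .} is infinite then $\lim_{n\to\infty} \lambda_n = 0$ by the structure theorem for general compact operators, and

\[
\Big\| H_1 - \sum_{n=1}^{N} \lambda_n P_n \Big\|^2
= \sup_{x \in B(\mathcal{H})} \Big\| \Big( H_1 - \sum_{n=1}^{N} \lambda_n P_n \Big) x \Big\|^2
= \sup_{x \in B(\mathcal{H})} \Big\| \sum_{n=N+1}^{\infty} \lambda_n P_n x \Big\|^2
\]

which by Pythagoras' formula and the orthonormal basis theorem is

\[
\sup_{x \in B(\mathcal{H})} \sum_{n=N+1}^{\infty} |\lambda_n|^2 \|P_n x\|^2
\ \overset{\text{(check)}}{=}\ \sup_{n \ge N+1} |\lambda_n|^2 \xrightarrow{\,N \to \infty\,} 0 .
\]

So

\[
H_1 = \lim_{N\to\infty} \sum_{n=1}^{N} \lambda_n P_n
\]

in norm. We observe that the structure of H, above, tells us that H − H1 = 0, so H = H1.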
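The equality left to the reader in the norm estimate can be verified as follows. This is a sketch, under the assumption that $B(\mathcal{H})$ in that estimate denotes the closed unit ball of the Hilbert space.

```latex
\begin{align*}
\sum_{n>N}|\lambda_n|^2\|P_nx\|^2
 &\le \Big(\sup_{n>N}|\lambda_n|^2\Big)\sum_{n>N}\|P_nx\|^2
 \le \sup_{n>N}|\lambda_n|^2\,\|x\|^2
 &&\text{(Pythagoras: } \textstyle\sum_n\|P_nx\|^2\le\|x\|^2\text{)},\\
\sum_{n>N}|\lambda_n|^2\|P_nx\|^2
 &= |\lambda_m|^2
 &&\text{for a unit vector } x\in\operatorname{Im}P_m,\ m>N .
\end{align*}
```

The first line gives "≤" after taking the supremum over ‖x‖ ≤ 1, and the second gives "≥" after taking the supremum over m > N.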

17.18 Corollary. Let H = H∗ ∈ K(H) (with H a Hilbert space) and T ∈ B(H) be such that TH = HT. Then

\[
T = \sum_{n=1,2,\dots} P_nTP_n + P_0TP_0
\]

where P1, P2, . . . are from above and $P_0 = P_{\ker H}$. The sense of convergence above is pointwise (“strong operator topology”):

\[
Tx = \lim_{N\to\infty} \sum_{n=1}^{N} P_nTP_nx + P_0TP_0x
\]

(if {λ1, λ2, . . .} is infinite).

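Proof. Part (iii) of the Lemma shows that $TP_n = P_nTP_n$ for n = 0, 1, 2, . . . Moreover

\[
\mathcal{H} = \ell^2\text{-}\bigoplus_{n=1,2,\dots} \operatorname{Im}P_n\ \oplus_2\ \operatorname{Im}(P_0)
\]

means exactly that

\[
I = \sum_{n=1,2,\dots} P_n + P_0
\]

in the pointwise sense described above, i.e. $Ix = \sum_{n=1,2,\dots} P_nx + P_0x$. Then, by continuity of T, $Tx = TIx = \sum_{n=1,2,\dots} TP_nx + TP_0x = \sum_{n=1,2,\dots} P_nTP_nx + P_0TP_0x$, and we obtain the desired result.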


17.19 Corollary (Simultaneous Diagonalisation Theorem). If H1, H2 ∈ K(H) are both Hermitian and H2H1 = H1H2 then there are finite rank orthogonal projections {Q1, Q2, . . .} ⊆ K(H) such that Im Qn ⊥ Im Qm for n ≠ m and for any scalars α1, α2 ∈ C we can write

\[
\alpha_1H_1 + \alpha_2H_2 = \sum_{n=1,2,\dots} \lambda_n(\alpha_1, \alpha_2)\,Q_n
\]

where each $\lambda_n(\alpha_1, \alpha_2)$ is a C-valued linear form in (α1, α2) and $\lim_{n\to\infty} \lambda_n(\alpha_1, \alpha_2) = 0$, so the series converges in norm.

Proof. First, write $H_1 = \sum_{n=1,2,\dots} \lambda_nP_n$ as in the spectral theorem. We have, by the Corollary above, that

\[
H_2 = \sum_{n=1,2,\dots} P_nH_2P_n + P_0H_2P_0
\]

where $P_0 = P_{\ker H_1}$. We have that each $P_nH_2P_n$ is compact and Hermitian (in fact finite rank if n ≥ 1). So in fact each $P_nH_2P_n$ can be written

\[
P_nH_2P_n = \sum_{i=1,2,\dots} \mu_{i,n}Q_{i,n}
\qquad\text{where}\qquad
\sum_{i=1,2,\dots} Q_{i,n} = P_n
\]

(in the pointwise sense if infinite, i.e. n = 0) by the structure of $P_nH_2P_n$. We let {Q1, Q2, . . .} be an enumeration of $\{Q_{i,n} : i = 1, 2, \dots;\ n = 0, 1, 2, \dots\}$ and let $\lambda_n(\alpha_1, \alpha_2) = \alpha_1\lambda_m + \alpha_2\mu_{i,m}$ where $Q_n = Q_{i,m}$ (with the convention $\lambda_0 = 0$).

17.20 Corollary (Spectral Theorem for Compact Normal Operators). If N ∈ K(H) is normal, then there is a sequence of finite rank projections Q1, Q2, . . . such that

\[
N = \sum_{n=1,2,\dots} \mu_nQ_n
\]

for scalars µ1, µ2, . . ., with $\lim_{n\to\infty} \mu_n = 0$.

Proof. Let $H_1 = \operatorname{Re}N = \tfrac{1}{2}(N + N^*)$ and $H_2 = \operatorname{Im}N = \tfrac{1}{2i}(N - N^*)$; both are compact and Hermitian, and N = H1 + iH2. We apply the Corollary above with α1 = 1, α2 = i. Recall, Re N Im N = Im N Re N if and only if N is normal.

17.21 Theorem (Spectral Theorem for Compact Normal Operators). If N ∈ K(H) is normal, then there exists a sequence $(Q_n)_{n=1,2,\dots}$ of finite rank orthogonal projections with Im Qn ⊥ Im Qm for m ≠ n, and a sequence $(\mu_n)_{n=1,2,\dots} \subset \mathbb{C} \setminus \{0\}$, with $\lim_{n\to\infty} \mu_n = 0$ if infinite, such that

\[
N = \sum_{n=1,2,\dots} \mu_nQ_n
\]

where the series converges in norm, if infinite.

17.22 Remark. By combining projections associated to µn = µm (i.e. taking Qn + Qm in place of Qn, Qm) we may assume that $\{\mu_n\}_{n=1,2,\dots} = \sigma_p(N) \setminus \{0\}$.

17.23 Corollary. Given an infinite dimensional separable H, and N ∈ K(H), normal, there is a unitary U : H → ℓ² such that $UNU^* = M_a$ where $M_a$ is the multiplication operator by a sequence a.

Proof. For each projection Qn, let $\{e_{n1}, \dots, e_{n,m(n)}\}$ be an orthonormal basis for the finite-dimensional space Im Qn. Also, let $\{e_{01}, e_{02}, \dots\}$ (possibly empty) be an orthonormal basis for ker N. Recall that ker N ⊥ Im Qn = ker(µnI − N). Let $(e_i)_{i=1}^{\infty}$ be an enumeration of $\{e_{nj} : n = 0, 1, 2, \dots;\ j = 1, \dots, m(n)\}$ (with j = 1, 2, . . . if n = 0). Let $U\sum_{i=1}^{\infty} x_ie_i = (x_i)_{i=1}^{\infty} \in \ell^2$. Check that U is unitary. Check that $UNU^* = M_a$, where $a = (a_i)_{i=1}^{\infty}$ and $a_i = \mu_n$ according to $e_i = e_{nj}$ (with µ0 = 0 for the basis vectors of ker N).

17.24 Remark. As usual, let C(σ(N)) denote the continuous C-valued functions on σ(N). Define ΓN : C(σ(N)) → B(H) to be the unique continuous linear operator such that

ΓN(p) = p(N, N∗)

if $p(z, \overline{z})$ is a polynomial in $z, \overline{z}$. Check that $\|\Gamma_N(p)\| = \sup_{z \in \sigma(N)} |p(z, \overline{z})|$. By Stone-Weierstrass, such polynomials are dense in C(σ(N)). The operator ΓN is called a “functional calculus”. Often we write f(N) = ΓN(f). Note that for f, g ∈ C(σ(N)), (fg)(N) = f(N)g(N), and $\overline{f}(N) = f(N)^*$.


17.25 Proposition. If T ∈ K(H) is positive, then there is a unique positive operator S ∈ K(H) such that S² = T. We write $S = T^{1/2}$.

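Proof. Recall, T positive means that (Tx, x) ≥ 0 for all x ∈ H. Hence σ(T) ⊆ [0,∞). Indeed, if λ ∈ σ(T) \ {0}, then λ ∈ σp(T) and we have, for x ∈ ker(λI − T) \ {0}, that

λ(x, x) = (λx, x) = (Tx, x) ≥ 0

so λ > 0. By the Spectral Theorem (Hermitian),

\[
T = \sum_{n=1,2,\dots} \lambda_nP_n .
\]

Let $S = \sum_{n=1,2,\dots} \sqrt{\lambda_n}\,P_n$; then S ∈ K(H), S ≥ 0 and S² = T. For uniqueness, suppose S1 ∈ K(H) and S1 ≥ 0, $S_1^2 = T$. Then $S_1S_1^2 = S_1^2S_1$, so S1T = TS1. Hence by an earlier corollary,

\[
S_1 = \sum_{n=1,2,\dots} P_nS_1P_n + P_0S_1P_0
\]

where $P_0 = P_{\ker T}$. Observe

\[
(P_nS_1P_nx, x) = (S_1P_nx, P_nx) \ge 0 .
\]

Hence we again use the Spectral Theorem to write

\[
P_nS_1P_n = \sum_{i=1,2,\dots} \mu_{n,i}Q_{n,i}, \qquad\text{where } \overline{\operatorname{span}}_{i=1,2,\dots} \operatorname{Im}(Q_{n,i}) \subseteq \operatorname{Im}P_n .
\]

Then

\[
\sum_{n=1,2,\dots} \lambda_nP_n = T = S_1^2 \overset{(*)}{=} \sum_{n=1,2,\dots} (P_nS_1P_n)^2 = \sum_{n=1,2,\dots}\ \sum_{i=1,2,\dots} \mu_{n,i}^2\,Q_{n,i}
\qquad (Q_{n,i}^2 = Q_{n,i})
\]

where (*) holds since Im Pn ⊥ Im Pm for n ≠ m (the term $(P_0S_1P_0)^2 = P_0TP_0$ vanishes because $P_0 = P_{\ker T}$, so the Hermitian operator $P_0S_1P_0$ is zero). This forces $\mu_{n,i}^2 = \lambda_n$ for every i, hence $\mu_{n,i} = \sqrt{\lambda_n}$, which means S1 = S, above.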
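A finite-dimensional instance of the construction $S = \sum \sqrt{\lambda_n}\,P_n$ is sketched below. The matrix T is an illustrative choice made here (it does not appear in the lecture), acting on C².

```latex
\[
T=\begin{pmatrix}2&1\\1&2\end{pmatrix}=3P_1+1\cdot P_2,\qquad
P_1=\tfrac12\begin{pmatrix}1&1\\1&1\end{pmatrix},\quad
P_2=\tfrac12\begin{pmatrix}1&-1\\-1&1\end{pmatrix},
\]
\[
S=\sqrt3\,P_1+P_2=\tfrac12\begin{pmatrix}\sqrt3+1&\sqrt3-1\\ \sqrt3-1&\sqrt3+1\end{pmatrix},
\]
\[
S^2=\tfrac14\begin{pmatrix}(\sqrt3+1)^2+(\sqrt3-1)^2&2(\sqrt3+1)(\sqrt3-1)\\
2(\sqrt3+1)(\sqrt3-1)&(\sqrt3+1)^2+(\sqrt3-1)^2\end{pmatrix}
=\tfrac14\begin{pmatrix}8&4\\4&8\end{pmatrix}=T .
\]
```

Since both eigenvalues √3 and 1 of S are positive, S is the positive square root that the proposition singles out.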

17.26 Theorem (Polar decomposition). Let K ∈ K(H). Then let $|K| = (K^*K)^{1/2}$. There is a unique operator U ∈ B(H) such that

U|K| = K, ker U = ker K

and ‖Ux‖ = ‖x‖ whenever x ∈ (ker K)⊥. Here U is called a partial isometry.

Proof. First, for x ∈ H

‖|K|x‖² = (|K|x, |K|x) = (|K|²x, x) = (K∗Kx, x) = (Kx, Kx) = ‖Kx‖².

Hence we can define V0 : Im |K| → H by V0|K|x = Kx, so ‖V0y‖ = ‖y‖, i.e. V0 is an isometry. Let $V : \overline{\operatorname{Im}|K|} \to H$ be the unique continuous extension of V0. As usual, V is an isometry and is linear. By the Kernel-Annihilator Theorem, we know that $\overline{\operatorname{Im}|K|} = (\ker|K|)^\perp$, and we note from above that ker K = ker |K|. Let U : H → H be given by U = VP where $P = P_{(\ker K)^\perp}$. Then U|K| = VP|K| = V|K| = K. We observe that if $y \in \overline{\operatorname{Im}|K|} = (\ker K)^\perp$ then ‖Uy‖ = ‖VPy‖ = ‖Vy‖ since Py = y, hence ‖Uy‖ = ‖y‖. Also, since V is injective, ker U = ker P = ker K.
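A two-dimensional example shows what U looks like when K has a kernel. The matrix K below is a hypothetical choice made for illustration, acting on C² with standard basis $e_1, e_2$.

```latex
\[
K=\begin{pmatrix}0&1\\0&0\end{pmatrix},\qquad
K^*K=\begin{pmatrix}0&0\\1&0\end{pmatrix}\begin{pmatrix}0&1\\0&0\end{pmatrix}
=\begin{pmatrix}0&0\\0&1\end{pmatrix},\qquad
|K|=(K^*K)^{1/2}=\begin{pmatrix}0&0\\0&1\end{pmatrix},
\]
\[
U=\begin{pmatrix}0&1\\0&0\end{pmatrix},\qquad
U|K|=\begin{pmatrix}0&1\\0&0\end{pmatrix}\begin{pmatrix}0&0\\0&1\end{pmatrix}
=\begin{pmatrix}0&1\\0&0\end{pmatrix}=K,
\]
\[
\ker U=\ker K=\operatorname{span}\{e_1\},\qquad
\|U(te_2)\|=\|te_1\|=|t|=\|te_2\|\quad\text{on }(\ker K)^\perp=\operatorname{span}\{e_2\}.
\]
```

In this example U happens to coincide with K itself, because K is already a partial isometry.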
