Functional Analysis
Richard Bass
These notes are based mostly on the book by P. Lax, Functional Analysis.
Functional analysis can best be characterized as infinite dimensional linear algebra. We will use some real analysis, complex analysis, and algebra, but functional analysis is not really an extension of any one of these.
1 Linear spaces
A linear space is the same as a vector space. We have an operation “+” under which the space is an Abelian group: addition is commutative, associative, there exists an identity (which we call 0), and every element has an inverse (the inverse of x will be denoted −x).

We define x − y = x + (−y).

We also have a field F , which for us will always be the reals or the complex field. Elements of F will be called scalars.

A linear space also has an operation called scalar multiplication, which is associative (a(bx) = (ab)x) and distributive (a(x + y) = ax + ay and (a + b)x = ax + bx), and we have 1x = x, where 1 is the identity in F .

By the same proofs as in the finite dimensional case, we have 0x = 0 because 0x = (0 + 0)x = 0x + 0x, and (−1)x = −x because
0 = 0x = (1)x+ (−1)x = x+ (−1)x.
Examples of linear spaces are
1. Rn
2. The collection of all infinite sequences.
3. B(S), the bounded functions on a set S.
4. C(S), the continuous functions on S, where S is a topological space.
5. Ck(R)
6. Lp(X,m)
7. {f : f is analytic in D}
If X is a linear space and Y ⊂ X, then Y is a linear subspace of X if y ∈ Y implies ay ∈ Y for all a ∈ F and x, y ∈ Y implies x + y ∈ Y .
Let S be a subset of X. Consider the collection
{Yα : Yα is a linear subspace of X,S ⊂ Yα}.
It is easy to check that ∩αYα is a subspace of X, and it is called the linear span of S.
Proposition 1.1 The linear span of S is equal to

{a1x1 + · · · + anxn : ai ∈ F, xi ∈ S, n ∈ N}.

Proof. Let M be the above set of sums. M is clearly a linear subspace of X containing S, therefore the span of S is contained in M . On the other hand, if Yα is any linear subspace containing S, then Yα must contain M , therefore ∩αYα contains M .
Let X and U be linear spaces over F . M : X → U is linear if M(x + y) = M(x) + M(y) and M(kx) = kM(x).
Examples of linear operators:
1) X = L1(m), g is bounded and measurable, and

Tf = ∫ f(x)g(x) m(dx).

2) Let our space be B(S), fix a point x0 ∈ S or points x0, . . . , xn ∈ S, and let Tf = f(x0) or Tf = (f(x0), . . . , f(xn)). Here T maps B(S) to R or Rn+1.
3) Let X be the space of n-tuples, and define the ith coordinate of Mx to be ∑_{j=1}^n aij xj. This is just matrix multiplication, and all linear maps in finite dimensions can be viewed in this way.
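In finite dimensions this is easy to check numerically; a small numpy sketch (the matrix A below is an arbitrary illustrative choice):

```python
import numpy as np

# The linear map of example 3), realized as multiplication by a 3x3 matrix A:
# the ith coordinate of Mx is sum_j a_ij x_j.
A = np.array([[2.0, 0.0, 1.0],
              [0.0, 3.0, 0.0],
              [1.0, 1.0, 1.0]])

def M(x):
    return A @ x

x = np.array([1.0, 2.0, 3.0])
y = np.array([-1.0, 0.0, 4.0])
k = 2.5

# Linearity: M(x + y) = Mx + My and M(kx) = k Mx.
assert np.allclose(M(x + y), M(x) + M(y))
assert np.allclose(M(k * x), k * M(x))
```

Composition of two such maps corresponds to the product of the matrices, which is why matrix multiplication is defined the way it is.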
4) Let X be the set of bounded sequences, suppose that sup_i ∑_{j=1}^∞ |aij| < ∞, and define the ith coordinate of Mx to be ∑_{j=1}^∞ aij xj.

5) Let X be the set of bounded measurable functions on some measure space with finite measure m, and suppose K(x, y) is jointly measurable and bounded. Define Mf by

Mf(x) = ∫ K(x, y)f(y) m(dy).
Two linear spaces are isomorphic if there exists a one-to-one linear mapping from one space onto the other.

If M is a 1-1 linear mapping from X onto U , then M−1 is also linear. To see this, suppose u1 = Mx1 and u2 = Mx2 are elements of U with x1, x2 ∈ X. Then u1 + u2 = Mx1 + Mx2 = M(x1 + x2). Hence M−1(u1 + u2) = x1 + x2 = M−1u1 + M−1u2. Similarly M−1(ku) = kM−1u.

A set K ⊂ X is convex if λx + (1 − λ)y ∈ K whenever x, y ∈ K and λ ∈ [0, 1].
A convex combination of x1, . . . , xn is a sum of the form ∑_{i=1}^n ai xi, where ∑_{i=1}^n ai = 1, n is a positive integer, and all the ai are non-negative.

Lemma 1.2 Linear subspaces are convex. Intersections of convex sets are convex. If M : X → U is linear and K ⊂ X is convex, then {M(x) : x ∈ K} is convex.

If S ⊂ X, the convex hull of S is the intersection of all the convex sets containing S.

Proposition 1.3 The convex hull of S is equal to the set of all convex combinations of points of S.
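A small numpy sketch of Proposition 1.3 for the unit square (the corner set and the bilinear weights are illustrative assumptions):

```python
import numpy as np

# Corners of the unit square in R^2 (our set S).
S = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])

def convex_combination(a, pts):
    a = np.asarray(a, dtype=float)
    # weights must be non-negative and sum to 1
    assert np.all(a >= 0) and np.isclose(a.sum(), 1.0)
    return a @ pts

# Any point (x, y) of the square is a convex combination of the corners,
# with the bilinear weights (1-x)(1-y), x(1-y), (1-x)y, xy.
x, y = 0.3, 0.8
a = [(1 - x) * (1 - y), x * (1 - y), (1 - x) * y, x * y]
p = convex_combination(a, S)
assert np.allclose(p, [x, y])
```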
If K is convex, and E ⊂ K, then E is an extreme subset of K if

(1) E is convex and non-empty;

(2) if x ∈ E and x = (y + z)/2 with y, z ∈ K, then y, z ∈ E.

If E is a single point, then the point is called an extreme point of K.

For an example, consider the case where K is a polygon (plus the interior) in R2. Each edge of K is an extreme subset. Each vertex is an extreme point.
2 Linear maps
2.1 Definitions
Definition 2.1 If M and N are linear maps from X into U and k is a scalar, we define
(M +N)(x) = M(x) +N(x), (kM)(x) = kM(x).
So the set of linear maps from X into U is a linear space, and we denote it L(X,U).
If M : X → U and N : U → W , we define (NM)(x) = N(M(x)).
An exercise is to show this is associative but not necessarily commutative. (Multiplication by matrices is an example to show commutativity need not hold.) It is distributive:
M(N +K) = MN +MK, (M +K)N = MN +KN.
We usually write Mx for M(x).
Define the identity I : X → X by Ix = x. We will also write IX when we want to emphasize the space.

We say M : X → U is invertible if there exists M−1 : U → X such that M−1M = IX and MM−1 = IU .
Definition 2.2 The null space or kernel of M is NM = {x ∈ X : Mx = 0} and the range of M is RM = {Mx : x ∈ X}.
Observe that NM ⊂ X and RM ⊂ U .
Some easily checked facts: NM and RM are linear subspaces, and if K and M are invertible, then (KM)−1 = M−1K−1.

Let Y be a subspace of X. Y is called an invariant subspace for M : X → X if M maps Y into Y .

One example of an invariant subspace is to let y be a fixed element of X and set
Y = {p(M)y : p a polynomial}.
To check this, if z ∈ Y , then z = p(M)y for some polynomial p. Then Mz = Mp(M)y = q(M)y ∈ Y , where q(x) = xp(x) is another polynomial.

A second example: suppose T : X → X and MT = TM . Then NT is invariant under M . To see this, if z ∈ NT , then Tz = 0 and then T (Mz) = TMz = MTz = M0 = 0, so Mz ∈ NT .

If N and Y are subspaces of X, we write X = N ⊕ Y if for each x ∈ X, there exist n ∈ N and y ∈ Y such that x = n + y, and the decomposition is unique, i.e., there is only one n and one y that works for any particular x. Of course, n and y depend on x.
As an example, let X = R3, N = {(y, 0, 0)}. There are lots of possibilities for Y : in fact, any plane in R3 that passes through the origin and does not contain the first coordinate axis will do. Given any choice of Y , though, there is only one way to write a given x as n + y.
We will frequently use Zorn’s lemma, which is equivalent to the axiom of choice.

Suppose we have a partially ordered set S, which means that there is an order relation such that a ≤ a for all a ∈ S, and if a ≤ b and b ≤ c, then a ≤ c. A subset is totally ordered if for every pair x, y in the subset, either x ≤ y or y ≤ x. An element u of a partially ordered set is an upper bound for a subset of S if x ≤ u for every x in the subset. An element x of a partially ordered set is maximal if y ≥ x implies y = x. (Lax stated this incorrectly.)

Lemma 2.3 (Zorn’s lemma) Let X be a partially ordered set. If every totally ordered subset of X has an upper bound in X, then X has a maximal element.

Lemma 2.4 Suppose N is a subspace of a linear space X. Then there exists a linear subspace Y such that X = N ⊕ Y .

Proof. Look at {Y : Y a subspace of X, Y ∩ N = {0}}. We partially order this collection by inclusion. If {Yα} is a totally ordered subcollection, then ∪αYα is an upper bound. Let Y0 be the maximal element guaranteed by Zorn’s lemma.
Suppose there is a point x ∈ X that is not in N ⊕ Y0. We adjoin x to Y0 to form Y1, that is, Y1 = {ax + y : y ∈ Y0, a ∈ R}. Y1 is a subspace of X that
is strictly bigger than Y0. We argue that Y1 ∩ N = {0}, a contradiction to the fact that Y0 is maximal.

Since x is not in the direct sum of N and Y0, x ∉ N , or else we could write x = x + 0. If z ≠ 0 and z ∈ Y1 ∩ N , then there exist a ∈ R and y ∈ Y0 such that z = ax + y. One possibility is that a = 0; but then z = y ∈ Y0 ∩ N , which isn’t possible since z is nonzero. The other possibility is that a ≠ 0. But z ∈ N , so

x = z/a + (−y)/a ∈ N ⊕ Y0,

also a contradiction.
2.2 Index of a map
We say that dim X < ∞, that is, X is finite dimensional, if there exist finitely many points x1, . . . , xn ∈ X such that X is equal to the span of {x1, . . . , xn}. The smallest such n is the dimension of X.

Let X be a linear space and Y a subspace. We say x1 ≡ x2 (mod Y ), or that x1 is equivalent to x2, if x1 − x2 ∈ Y . This is an equivalence relation. Let x̄ denote the equivalence class containing x. The collection of all such equivalence classes is denoted X/Y and called the quotient space of X with respect to Y .

Let’s make X/Y into a linear space. If x̄1, x̄2 are in X/Y , define x̄1 + x̄2 to be the equivalence class of x1 + x2. To see that this is well defined, if z1, z2 are elements of x̄1, x̄2, resp., then (x1 + x2) − (z1 + z2) = (x1 − z1) + (x2 − z2), the sum of two elements of Y , hence an element of Y . We similarly define kx̄ to be the class of kx. It is routine to verify that X/Y is now a linear space.

We define the codimension of Y by

codim Y = dim X/Y.

Let’s look at an example. Suppose X = R5 and Y = {(x, y, 0, 0, 0)}. Then x1 ≡ x2 if and only if the 3rd through 5th coordinates of x1 and x2 agree. Therefore X/Y is (essentially, at least up to isomorphism) the 3rd through 5th coordinates of points in R5, hence isomorphic to R3. We see codim Y = dim X/Y = 3, while dim Y = 2.
Lemma 2.5 If X = N ⊕ Y , then X/N is isomorphic to Y .
Proof. If x̄ ∈ X/N , pick x ∈ x̄ and write x = y + n with y ∈ Y , n ∈ N . Define Mx̄ = y. We will show that M is an isomorphism.

First we need to show M is well defined. If x′ is another element of x̄, we can write x′ = y′ + n′. Then x − x′ = (y − y′) + (n − n′) is in N . Since we can write x − x′ = 0 + (x − x′), we must have y − y′ = 0, or y = y′.

Next we show M is linear. If x1 + x2 ∈ x̄1 + x̄2, then x1 = y1 + n1, x2 = y2 + n2, and then x1 + x2 = (y1 + y2) + (n1 + n2). So M(x̄1 + x̄2) = y1 + y2 = Mx̄1 + Mx̄2. The linearity with respect to scalar multiplication is similar.

We show M is 1-1. If Mx̄ = Mx̄′, and we write x = y + n, x′ = y′ + n′, then y = Mx̄ = Mx̄′ = y′. Hence

x − x′ = (y − y′) + (n − n′) = n − n′ ∈ N,

so x̄ = x̄′.

Finally, M is onto, because if y ∈ Y , then y = y + 0 ∈ X and Mȳ = y.
Let M : X → U . We will need the fact that
Proposition 2.6 X/NM is isomorphic to RM .
Proof. If x̄ ∈ X/NM , we define M̄x̄ to be Mx for any x ∈ x̄. If x′ is any other element of x̄, then x − x′ ∈ NM , or M(x − x′) = 0, or Mx = Mx′. So the map M̄ is well defined. It is routine to check that M̄ is linear.

To show M̄ is 1-1, if M̄x̄ = M̄ȳ, then Mx = My, or M(x − y) = 0, or x − y ∈ NM , so x̄ = ȳ. To show M̄ is onto, if y ∈ RM , then y = Mx for some x ∈ X. Then M̄x̄ = Mx = y.
We say a linear map G is degenerate if dim RG < ∞. Maps M : X → U and L : U → X are pseudo-inverses if there exist G : X → X and H : U → U that are degenerate and
LM = IX +G, ML = IU +H.
This concept is not interesting in finite dimensions. For example, let X = U = R5, let A be any 5 × 5 matrix, and define Mx = Ax, where we view x as a 5 × 1 matrix. Let B be any other 5 × 5 matrix and define Lx = Bx. Then LMx = BAx = (I + G)x, where G = BA − I. Obviously RG is finite dimensional. We deal with ML similarly. So M and L are pseudo-inverses.
For a non-trivial example of a pair of pseudo-inverses, define the right and left shifts on X, the linear space of infinite sequences, by
R(a1, a2, . . .) = (0, a1, a2, . . .),
L(a1, a2, . . .) = (a2, a3, . . .).
Then

RL(a1, a2, . . .) = (0, a2, a3, . . .) = (I + G)(a1, a2, . . .),

where

G(a1, a2, . . .) = −(a1, 0, 0, . . .).

Clearly RG is finite dimensional. We see that LR is the identity, so L and R are pseudo-inverses.
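The shifts are easy to experiment with; a minimal Python sketch, with finite lists standing in for infinite sequences:

```python
# Right and left shifts on (truncated) sequences, as in the example above.
def R(a):                      # right shift: prepend a zero
    return [0] + list(a)

def L(a):                      # left shift: drop the first entry
    return list(a[1:])

def G(a):                      # G(a1, a2, ...) = (-a1, 0, 0, ...)
    return [-a[0]] + [0] * (len(a) - 1)

a = [5, 7, 11, 13]

# LR is the identity.
assert L(R(a)) == a

# RL = I + G: RL(a) = (0, a2, a3, ...) = a + G(a).
assert R(L(a)) == [x + g for x, g in zip(a, G(a))]
```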
We will prove that M : X → U has a pseudo-inverse if and only if dim NM < ∞ and codim RM < ∞.
We start by proving
Proposition 2.7 If M has a pseudo-inverse, then
dimNM <∞ and codim RM <∞.
Proof. First suppose G is degenerate. We show dim NI+G < ∞. To see this, if x ∈ NI+G, then x + Gx = 0, so x = −Gx ∈ RG. Therefore NI+G ⊂ RG, which implies dim NI+G ≤ dim RG < ∞.

Next, LM = I + G, so if x ∈ NM , then Mx = 0, so LMx = 0, or (I + G)x = 0, or x ∈ NI+G. Therefore NM ⊂ NI+G, hence dim NM ≤ dim NI+G < ∞.

Recall that X/NG is isomorphic to RG. Therefore codim NG = dim RG.

We claim codim RI+G < ∞ for any degenerate G. To prove the claim, if x ∈ NG, then (I + G)x = x + Gx = x, so NG ⊂ RI+G. Then codim RI+G ≤ codim NG = dim RG < ∞.
Finally, since ML = IU + H and H is degenerate, the claim gives codim RI+H < ∞, and RI+H ⊂ RM . To see this, if y ∈ RI+H , then y = (I + H)x = M(Lx). We then write

codim RM ≤ codim RI+H < ∞.
We now prove
Proposition 2.8 If dimNM < ∞ and codim RM < ∞, then M has apseudo-inverse.
Proof. Suppose M : X → U and write X = NM ⊕ Y , U = RM ⊕ V . We claim that M restricted to Y is a 1-1 map onto RM . First we show onto. If z ∈ RM , then z = Mx for some x ∈ X. For some n ∈ NM and y ∈ Y , x = n + y. So z = My.

Next we show 1-1. If My1 = My2, then M(y1 − y2) = 0, or y1 − y2 ∈ NM . Let n = y1 − y2. Then n + y2 = 0 + y1. Since X = NM ⊕ Y , the decomposition is unique, so n = 0 and y1 = y2.
Therefore M : Y → RM is invertible. Define

K = M−1 on RM , K = 0 on V,

extended to U = RM ⊕ V by linearity: if u = z + v with z ∈ RM and v ∈ V , let Ku = M−1z.

If y ∈ Y , then My ∈ RM and so KMy = y. If n ∈ NM , then KMn = 0. So

KM = 1 on Y , KM = 0 on NM .

If z ∈ RM , then z = My for some y ∈ Y , so MKz = MKMy = My = z. If z ∈ V , Kz = 0. So

MK = 1 on RM , MK = 0 on V .
Let P be the projection onto NM : if x ∈ X and x = y + n, define Px = n. Then KM = I − P . Since dim RP = dim NM < ∞, P is degenerate.

Similarly MK = I − Q, where Q is the projection onto V . Since V is isomorphic to U/RM by a previous lemma, dim V = codim RM < ∞, and Q is degenerate.
A sequence of spaces and maps

V0 −T0→ V1 −T1→ · · · −Tn−2→ Vn−1 −Tn−1→ Vn

is an exact sequence if RTj = NTj+1 for each j.

Lemma 2.9 Suppose we have an exact sequence, each Vj is finite dimensional, and dim V0 = 0 = dim Vn. Then

∑_{j=0}^{n} (−1)^j dim Vj = 0.
Note since dim V0 = dim Vn, we can write the sum from j = 0 to n, or omit or include j = 0 or j = n as we choose.

Proof. Write Nj for NTj , Rj for RTj , and write Vj = Nj ⊕ Yj. Because we have an exact sequence, Rj−1 = Nj.

Now Vj/Nj is isomorphic to Rj = Nj+1. On the other hand Yj is isomorphic to Vj/Nj by a lemma. Hence Yj is isomorphic to Nj+1. Hence

dim Vj = dim Nj + dim Nj+1.

Note N0 ⊂ V0, so dim N0 = 0. Also, because Rn−1 ⊂ Vn, Rn−1 = {0}. Since Vn−1/Nn−1 is isomorphic to Rn−1, dim Vn−1 = dim Nn−1.
Then

∑ (−1)^j dim Vj = ∑ (−1)^j [dim Nj + dim Nj+1]
= dim N0 + dim N1 − dim N1 − dim N2 + · · · ± (dim Nn−2 + dim Nn−1 − dim Nn−1)
= dim N0
= 0.
Suppose M has a pseudo-inverse. Define
ind (M) = dimNM − codim RM .
Theorem 2.10 If M : X → U and L : U → W both have pseudo-inverses, then LM has a pseudo-inverse and
ind (LM) = ind (L) + ind (M).
Proof. We use an exact sequence. Take V0 = 0, V1 = NM , V2 = NLM , V3 = NL, V4 = U/RM , V5 = W/RLM , V6 = W/RL, and V7 = 0.

Since NM ⊂ NLM , we let T1 be the inclusion map. T2 = M . T3 is the map taking U to U/RM . T4 is the map taking U/RM to W/RLM , while T5 is the map from W/RLM to W/RL. It is an exercise to check that this is an exact sequence. Using the lemma,

− dim NM + dim NLM − dim NL + codim RM − codim RLM + codim RL = 0.

Rearranging gives ind (LM) = ind (L) + ind (M).
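In finite dimensions the index is computable from ranks, and rank-nullity forces ind (M : Rn → Rm) = n − m, so the additivity is easy to check numerically. A numpy sketch (the random matrices are arbitrary illustrative choices):

```python
import numpy as np

def index(A):
    """ind(M) = dim N_M - codim R_M for Mx = Ax, A an m x n matrix."""
    m, n = A.shape
    r = np.linalg.matrix_rank(A)
    # rank-nullity: (n - r) - (m - r) always equals n - m
    return (n - r) - (m - r)

rng = np.random.default_rng(0)
M = rng.standard_normal((3, 5))   # M : R^5 -> R^3
L = rng.standard_normal((4, 3))   # L : R^3 -> R^4

# ind(LM) = ind(L) + ind(M)
assert index(L @ M) == index(L) + index(M)
```

This also shows why the notion only becomes interesting in infinite dimensions: in finite dimensions the index ignores the map entirely and sees only the two spaces.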
3 Hahn-Banach theorem
3.1 The theorem
A linear functional ` is a linear map from X to F . Let’s take F to be the reals for now.

We will be working with a sublinear functional p(x) in the statement of the theorem. An example to keep in mind is p(x) = ‖x‖, although we have not defined what a norm is yet.

Theorem 3.1 Suppose p : X → R satisfies p(ax) = ap(x) if a > 0 and p(x + y) ≤ p(x) + p(y) if x, y ∈ X. Suppose Y is a linear subspace, ` is a linear functional on Y , and `(y) ≤ p(y) for all y ∈ Y . Then ` can be extended to a linear functional on X satisfying `(x) ≤ p(x) for all x ∈ X.
Proof. If Y is not all of X, pick z ∈ X \ Y . Look at Y1 = {y + az : y ∈ Y, a ∈ R}. We want to define `(z) to be some real number with the property that if we set

`(y + az) = `(y) + a`(z),

we would have `(y) + a`(z) ≤ p(y + az) for all y ∈ Y and a ∈ R. This would give us an extension of ` from Y to Y1.
For all y, y′ ∈ Y ,

`(y′) + `(y) = `(y′ + y) ≤ p(y′ + y) = p((y + z) + (y′ − z)) ≤ p(y + z) + p(y′ − z).

So

`(y′) − p(y′ − z) ≤ p(y + z) − `(y).

This is true for all y, y′ ∈ Y . So choose `(z) to be a number between sup_{y′} [`(y′) − p(y′ − z)] and inf_y [p(y + z) − `(y)]. Therefore

`(y′) − p(y′ − z) ≤ `(z) ≤ p(y + z) − `(y),

or

`(y) + `(z) ≤ p(y + z), `(y′) − `(z) ≤ p(y′ − z).

If a > 0,

`(y + az) = a`(y/a + z) ≤ a p(y/a + z) = p(y + az).

Similarly `(y′ − az) ≤ p(y′ − az) if a > 0.
So we have extended ` from Y to Y1, a larger space. Let {(Yα, `α)} be the collection of all extensions of (Y, `). This is partially ordered by inclusion. If {(Yβ, `β)} is a totally ordered subset, define ` on ∪βYβ by setting `(z) = `β(z) if z ∈ Yβ. By Zorn’s lemma, there is a maximal extension. This maximal extension must be all of X, or else by the above we could extend it further.

The Hahn-Banach theorem one learns in a real analysis course has the same proof, but one takes p(x) = c|x|. Saying `(x) ≤ p(x) then translates to saying |`| ≤ c, and we extend ` so that the norm of the extension is the same.
3.2 Separating hyperplanes
If ` is a linear functional, {x : `(x) = c} is a hyperplane. This splits X into two parts, those x for which `(x) > c and those for which `(x) < c.

A point x0 ∈ S ⊂ X is interior to S if for all y ∈ X, there exists ε (depending on y) such that x0 + ty ∈ S if −ε < t < ε.
Let K be a convex set with an interior point x0. Without loss of generality, we may assume x0 = 0. The gauge pK of K is defined by

pK(x) = inf{a > 0 : x/a ∈ K}.

Proposition 3.2 pK is homogeneous and subadditive.

Saying pK is homogeneous means that pK(ax) = apK(x) if a > 0. Subadditive means pK(x + y) ≤ pK(x) + pK(y).
Proof. Homogeneity is obvious. We look at subadditivity. Let x, y ∈ X. If pK(x) or pK(y) is infinite, there is nothing to prove. So suppose both are finite and let ε > 0. Choose pK(x) < a < pK(x) + ε and pK(y) < b < pK(y) + ε. Then x/a and y/b are in K. Letting λ = a/(a + b),

λ (x/a) + (1 − λ)(y/b) = (x + y)/(a + b)

is in K. So

pK(x + y) ≤ a + b ≤ pK(x) + pK(y) + 2ε.

Since ε is arbitrary, we are done.
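The gauge can be computed numerically from the membership oracle alone; a Python sketch, taking K to be an ellipse (an assumed example, for which the exact gauge is √(x1²/4 + x2²)) and spot-checking homogeneity and subadditivity:

```python
import math

# K = {(x1, x2) : x1^2/4 + x2^2 <= 1}, a convex set with 0 interior.
def in_K(x):
    return x[0]**2 / 4 + x[1]**2 <= 1

# p_K(x) = inf{a > 0 : x/a in K}, by bisection: membership of x/a is
# monotone in a, so the infimum can be bracketed and halved.
def gauge(x, lo=1e-9, hi=1e9):
    for _ in range(200):
        mid = (lo + hi) / 2
        if in_K((x[0] / mid, x[1] / mid)):
            hi = mid
        else:
            lo = mid
    return hi

x, y = (1.0, 1.0), (2.0, -0.5)
px, py = gauge(x), gauge(y)
assert abs(px - math.sqrt(1/4 + 1)) < 1e-6            # matches exact gauge
pxy = gauge((x[0] + y[0], x[1] + y[1]))
assert pxy <= px + py + 1e-6                          # subadditivity
assert abs(gauge((3 * x[0], 3 * x[1])) - 3 * px) < 1e-6   # homogeneity
```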
The following proposition’s proof is left to the reader.
Proposition 3.3 (a) If K is convex and x ∈ K, then pK(x) ≤ 1. If K is convex and x is interior to K, then pK(x) < 1.

(b) Let p be positive, homogeneous, and subadditive. Then {x : p(x) < 1} is convex and 0 is an interior point. Also {x : p(x) ≤ 1} is convex.
We now prove the hyperplane separation theorem.
Theorem 3.4 Suppose K is a nonempty convex subset of X and all points of K are interior. If y /∈ K, then there exist ` and c such that `(x) < c for all x ∈ K and `(y) = c.

Proof. Without loss of generality, assume 0 ∈ K. Note pK(x) < 1 for all x ∈ K. On the one-dimensional subspace spanned by y, set `(ay) = a, so `(y) = 1. If a ≤ 0, `(ay) ≤ 0 ≤ pK(ay). If a > 0, then since y /∈ K, pK(y) ≥ 1, and so pK(ay) ≥ a = `(ay).

We let Y = {ay : a ∈ R} and use Hahn-Banach to extend ` to all of X. We have `(x) ≤ pK(x) < 1 if x ∈ K and `(y) = 1. We take c = 1.
Corollary 3.5 If K is convex with at least one interior point and y /∈ K, there exists ` ≠ 0 such that `(x) ≤ `(y) for all x ∈ K.
A+B is defined to be {a+ b : a ∈ A, b ∈ B}.
Corollary 3.6 Let H and M be disjoint convex sets, with at least one having an interior point. Then there exist ` and c such that

`(u) ≤ c ≤ `(v), u ∈ H, v ∈ M.

Proof. −M is convex, so K = H + (−M) is convex. K has an interior point. Since H ∩ M = ∅, 0 /∈ K. Let y = 0. By Corollary 3.5 there exists ` ≠ 0 such that

`(x) ≤ `(0) = 0, x ∈ K.

If u ∈ H and v ∈ M , then x = u − v ∈ K, so `(x) ≤ 0, and hence `(u) ≤ `(v). Take c = sup_{u∈H} `(u).
3.3 Complex linear functionals
Theorem 3.7 Let X be a linear space over C. Suppose p ≥ 0 satisfies p(ax) = |a|p(x) for all x ∈ X, a ∈ C, and p(x + y) ≤ p(x) + p(y). If Y is a subspace of X, ` is a linear functional on Y , and |`(y)| ≤ p(y) for all y ∈ Y , then ` can be extended to a linear functional on X with |`(x)| ≤ p(x) for all x.
Again, we think of p(x) as ‖x‖, once that is defined.
Proof. Write ` as `(y) = `1(y) + i`2(y), where `1 and `2 are the real and imaginary parts of `. Since ` is linear,

i`(y) = `(iy) = `1(iy) + i`2(iy).

On the other hand,

i`(y) = i`1(y) − `2(y)

by substituting in for `(y) and multiplying by i. Equating the real parts, `1(iy) = −`2(y).

One can work this in reverse to see that if `1 is a linear functional over the reals, and we define `(x) = `1(x) − i`1(ix), we get a linear functional over the complexes.
To extend `, we have
`1(y) ≤ |`(y)| ≤ p(y).
Use Hahn-Banach to extend `1 to all of X and set `(x) = `1(x)− i`1(ix).
We need to show that |`(x)| ≤ p(x) for all x. Fix x and write `(x) = ar, where r ≥ 0 is real and |a| = 1. Then

|`(x)| = r = a−1`(x) = `(a−1x).

Since `(a−1x) = |`(x)|, it is real with no imaginary part, and therefore equals

`1(a−1x) ≤ p(a−1x) = |a−1|p(x) = p(x).
4 An application of the Hahn-Banach theorem
Let S be an arbitrary set and let B be the collection of real-valued bounded functions on S. We say x ≤ y if x(s) ≤ y(s) for all s ∈ S. (We’ll use x ≥ y if y ≤ x.) A function x is non-negative if 0 ≤ x. Let Y be a linear subspace
of B. ` is a positive linear functional on Y if `(y) ≥ 0 whenever y ≥ 0. Note that if x ≤ y, then 0 ≤ `(y − x) = `(y) − `(x), so ` is monotone.

One example is to take `(y) = y(s0) for some point s0 in S. Or we could take a linear combination ∑ ci y(si) provided all the ci ≥ 0. Another example is if S is a measure space and we let `(y) = ∫ y(s) m(ds).

Proposition 4.1 Let Y be a linear subspace and suppose there exists y0 ∈ Y such that y0(s) ≥ 1 for all s. Let ` be a positive linear functional on Y . Then ` can be extended to a positive linear functional on B.

Proof. Define

p(x) = inf{`(y) : y ∈ Y, y ≥ x}.

Since −cy0 ≤ x ≤ cy0 if x is bounded by c, we are not taking the infimum of an empty set. Since x ≤ cy0, then p(x) ≤ c`(y0) < ∞. If y ≥ x and y ∈ Y , then −cy0 ≤ x ≤ y, so `(y) ≥ `(−cy0) = −c`(y0) > −∞.
It is clear that p is homogeneous. To show that p is subadditive, suppose x1, x2 ∈ B and y1, y2 ∈ Y with x1 ≤ y1, x2 ≤ y2. Then

p(x1 + x2) = inf_{x1+x2≤y} `(y) ≤ inf_{x1≤y1, x2≤y2} `(y1 + y2) = inf_{x1≤y1} `(y1) + inf_{x2≤y2} `(y2) = p(x1) + p(x2).

If y ∈ Y and y′ ≥ y is any other element in Y , then `(y) ≤ `(y′), so p(y) ≥ `(y). If x ≤ 0, then since 0 ∈ Y , p(x) ≤ `(0) = 0.

We now use Hahn-Banach to extend ` to all of B. If x ≤ 0, then `(x) ≤ p(x) ≤ 0. Hence if x ≥ 0, then `(x) = −`(−x) ≥ 0, which proves that ` is positive on B.
5 Normed linear spaces
A norm is a map from X → R, denoted |x|, such that |0| = 0, |x| > 0 if x ≠ 0, |x + y| ≤ |x| + |y|, and |ax| = |a| |x|. A linear space together with its norm is called a normed linear space.
If we define d(x, y) = |x − y|, then d is a metric, and we can use all the terminology of topology.
Two norms |x|1 and |x|2 are equivalent if there exists a constant c > 0 such that
c|x|1 ≤ |x|2 ≤ c−1|x|1, x ∈ X.
Equivalent norms give rise to the same topology.
A subspace of a normed linear space is again a normed linear space. If Z and U are normed linear spaces, we can make Z ⊕ U into a normed linear space by defining

|(z, u)| = |z| + |u|.
For many purposes it is important to know whether a subspace is closed or not, closed being in the topological sense. Here is an example of a subspace that is not closed. Let X = `2, the set of all sequences x = (x1, x2, . . .) with |x| = (∑_{j=1}^∞ |xj|²)^{1/2} < ∞. Let M be the collection of points in X such that all but finitely many coordinates are zero. Clearly M is a linear subspace. Let y1 = (1, 0, . . .), y2 = (1, 1/2, 0, . . .), y3 = (1, 1/2, 1/4, 0, . . .), and so on. Each yk ∈ M . But it is easy to see that |yk − y| → 0 as k → ∞, where y = (1, 1/2, 1/4, 1/8, . . .) and y /∈ M . Thus M is not closed.
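The distances |yk − y| can be checked numerically; a Python sketch using the geometric tail series (truncated at 200 terms as an approximation to the infinite sum):

```python
import math

# y_k has coordinates 1, 1/2, ..., 1/2^{k-1} followed by zeros, and the
# limit y = (1/2^{j-1})_j has every coordinate nonzero, so y is not in M.
def dist_to_limit(k):
    # |y_k - y|^2 = sum_{j > k} (1/2^{j-1})^2, a geometric tail.
    return math.sqrt(sum((1 / 2**(j - 1))**2 for j in range(k + 1, 200)))

d = [dist_to_limit(k) for k in (1, 5, 10, 20)]
assert all(a > b for a, b in zip(d, d[1:]))   # strictly decreasing
assert d[-1] < 1e-5                           # tends to 0
```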
Proposition 5.1 If X is a normed linear space and Y is a closed subspace, then X/Y is a normed linear space with |x̄| = inf_{y∈x̄} |y|.

Proof. Homogeneity is easy. To prove subadditivity, let ε > 0. Given x̄, z̄, there exist x ∈ x̄, z ∈ z̄ such that |x| < |x̄| + ε, |z| < |z̄| + ε. Then

|x̄ + z̄| ≤ |x + z| ≤ |x| + |z| ≤ |x̄| + |z̄| + 2ε.

But ε is arbitrary.

It remains to prove positivity. Suppose |x̄| = 0. Then there exists a sequence xn ∈ x̄ such that |xn| → 0. Since each xn ∈ x̄, then xn − x1 ∈ Y . Let yn = x1 − xn. So lim |x1 − yn| = lim |xn| = 0. Therefore yn → x1. Since Y is closed, x1 ∈ Y , which implies x̄ = x̄1 = 0̄.
If X is a normed linear space and Y is a subspace of X, then the closure of Y is also a linear subspace of X.
A Banach space is a complete normed linear space.
Recall that any metric space can be embedded in a complete metric space.
Examples:
1) `∞ is the collection of infinite sequences (a1, a2, . . .) with each ai ∈ C and sup_i |ai| < ∞. We define |x|∞ = sup_i |ai|. This is a complete space.

2) If 1 ≤ p < ∞, `p is the collection of infinite sequences for which

|x|p = (∑_j |aj|^p)^{1/p}

is finite. This is a complete space.

3) If S is a set, the collection of bounded functions on S with |f |∞ = sup_s |f(s)| is a complete normed linear space.
4) If S is a topological space, then the collection of continuous boundedfunctions with |f | = sups |f(s)| is a complete normed linear space.
5) The Lp spaces are Banach spaces.
6) (Sobolev spaces) Let D be a domain in Rn and consider the C∞ functions on D with

∫_D |∂^α f(x)|^p dx

finite for all |α| ≤ k. Here ∂^α = ∂^{α1}/∂x1^{α1} · · · ∂^{αn}/∂xn^{αn} and |α| = α1 + · · · + αn. For a norm, we take

|f |_{k,p} = (∑_{|α|≤k} ∫ |∂^α f(x)|^p dx)^{1/p}.

This is not a complete space, but its completion is denoted W^{k,p} and is called a Sobolev space.
A normed linear space is separable if it contains a countable dense subset. Most of the examples are separable, but B(S) is not if S is uncountable. Another example of one that is not is the collection of finite signed measures on [0, 1], with |m| = ∫_0^1 |dm|. If we let δy be the point mass at y, then |δy − δz| = 2 unless y = z.
5.1 The unit ball is not compact in infinite dimensions
In finite dimensions, the closed unit ball is always compact, but this is not the case in infinite dimensions. As an example, consider `2. If ei is the sequence which has a one in the ith place and 0 everywhere else, then |ei − ej| = √2 if i ≠ j. But then {ei} is a sequence contained in the unit ball that has no convergent subsequence, hence the unit ball is not compact.
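A quick Python check of the √2 separation (truncating to finitely many coordinates, which loses nothing since each ei has a single nonzero entry):

```python
import math

# e_i in l^2, truncated to length n for the computation.
n = 6
e = [[1.0 if j == i else 0.0 for j in range(n)] for i in range(n)]

def dist(u, v):
    return math.sqrt(sum((a - b)**2 for a, b in zip(u, v)))

# Every pair of distinct e_i, e_j is at distance sqrt(2), so no
# subsequence of (e_i) can be Cauchy.
for i in range(n):
    for j in range(n):
        if i != j:
            assert abs(dist(e[i], e[j]) - math.sqrt(2)) < 1e-12
```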
In fact, the closed unit ball B = {x : |x| ≤ 1} is never compact in infinitedimensions.
Theorem 5.2 Let X be an infinite dimensional normed linear space. Thenthe closed unit ball is not compact.
Proof. Choose y1 such that |y1| = 1. Given y1, . . . , yn−1, let Yn be their linear span. Since Yn is finite dimensional, it is closed. We will show in a moment how to find yn such that |yn| = 1 and inf_{y∈Yn} |y − yn| ≥ 1/2. We continue by induction and find a sequence {yn} contained in the closed unit ball such that |yj − yn| ≥ 1/2 if j < n, hence which has no convergent subsequence.

Since Yn is finite dimensional, it is not all of X and there exists x ∈ X \ Yn. Since Yn is closed, d = inf_{y∈Yn} |y − x| > 0. Choose y′ ∈ Yn such that |x − y′| < 2d. Let z′ = x − y′, so |z′| < 2d. If y ∈ Yn, then y + y′ ∈ Yn and

|z′ − y| = |x − (y + y′)| ≥ d.

If we let yn = z′/|z′|, then for all y ∈ Yn,

|yn − y| = |z′/|z′| − y| = (1/|z′|) |z′ − |z′|y| ≥ d/(2d) = 1/2,

since |z′|y ∈ Yn.
5.2 Isometries
A (not necessarily linear) map M from X onto X is an isometry if
|Mx−My| = |x− y|
for all x, y ∈ X.
As an example, Mx = x + u, where u is a fixed element of X, is an isometry.
The collection of isometries forms a group.
Theorem 5.3 Let X, X′ be two normed linear spaces over the reals and M an isometric map from X onto X′ such that M(0) = 0. Then M is linear.
Proof. Fix x, y and let z = (x + y)/2. Then

|x − z| = |y − z| = |x − y|/2.

Let

A = {u : |x − u| = |y − u| = |x − y|/2}.

We show A is symmetric with respect to z. That is, if u ∈ A, we must show v = 2z − u ∈ A. To see this, 2z = x + y, so v − x = y − u and v − y = x − u, so |v − x| = |y − u| = |x − y|/2, and similarly for |v − y|.
The diameter of A, dA, is defined by dA = sup_{u,w∈A} |u − w|. Since A is symmetric with respect to z, for u ∈ A and v = 2z − u,

dA ≥ |u − v| = |u − (2z − u)| = |2u − 2z| = 2|u − z|,

so |u − z| ≤ dA/2.

Let

A1 = {p ∈ A : |u − p| ≤ dA/2 for all u ∈ A}.

Note z ∈ A1. A1 is symmetric with respect to z, because if p ∈ A1 and q = 2z − p, then for u ∈ A and v = 2z − u, q − u = 2z − u − p = v − p, so |q − u| = |v − p| ≤ dA/2.

We have dA1 ≤ dA/2.

We define

A2 = {p ∈ A1 : |u − p| ≤ dA1/2 for all u ∈ A1},

and so on. z is in each of these, the An decrease, and since dAn → 0, the intersection is the single point z.
If we let x′, y′ be the images of x, y under M , we define A′, A′1, . . . analogously. Since the sets are defined only in terms of distances, M maps An into A′n. M−1 is also an isometry, and so M−1 maps A′n into An. Therefore M maps ∩nAn onto ∩nA′n, or

M((x + y)/2) = (x′ + y′)/2.
When y = 0, y′ = M(0) = 0, and this becomes M(x/2) = x′/2 = M(x)/2, and similarly M(y/2) = M(y)/2. So

M((x + y)/2) = M(x)/2 + M(y)/2.
Replacing x by 2x and y by 2y, we get the first property of linearity for M .
Since M(2x) = M(x + x) = M(x) + M(x) = 2M(x) and M(3x) = M(x + 2x) = M(x) + M(2x) = M(x) + 2M(x) = 3M(x) and so on, then M(kx) = kM(x) for positive integers k. Since

M(x) = M(k(x/k)) = kM(x/k),

we have M(x/k) = (1/k)M(x). Therefore M(rx) = rM(x) for all positive rational r, and M(−x) = −M(x) by additivity, so this holds for all rational r. Since M is an isometry, it is continuous, and therefore M(rx) = rM(x) for all real r.
To give some insight into the above proof, let X = `∞, the set of bounded sequences, let x = (1, 0, . . .), and y = (0, 1, 0, 0, . . .). Then

A = {(1/2, 1/2, a3, a4, . . .) : |ai| ≤ 1/2 for i ≥ 3}.

Then

A1 = {(1/2, 1/2, a3, a4, . . .) : |ai| ≤ 1/4 for i ≥ 3},

and so on. As we define the Ai, we close in on (1/2, 1/2, 0, . . .).
6 Hilbert space
6.1 Scalar product
We first look at the case F = R. We have a scalar product (x, y) mapping X × X into R which is bilinear: (x1 + cx2, y) = (x1, y) + c(x2, y), symmetric: (x, y) = (y, x), and positive: (x, x) > 0 if x ≠ 0.
When F = C, this changes to (ax, y) = a(x, y), (x1 + x2, y) = (x1, y) + (x2, y), and (y, x) = (x, y)∗, where ∗ denotes complex conjugation, and we still have positivity.

Note (x, ay) = (ay, x)∗ = (a(y, x))∗ = a∗(y, x)∗ = a∗(x, y), so scalars come out of the second slot conjugated.
Define ‖x‖ = (x, x)1/2.
We have the Cauchy-Schwarz inequality:
Proposition 6.1 |(x, y)| ≤ ‖x‖ ‖y‖, and equality holds if y = 0 or x = ay.
Proof. Let t be real and y ≠ 0. We have

0 ≤ ‖x + ty‖2 = ‖x‖2 + 2tRe (x, y) + t2‖y‖2. (6.1)

(Equality holds only if x = −ty.) Set t = −Re (x, y)/‖y‖2. Multiplying by ‖y‖2, we have

(Re (x, y))2 ≤ ‖x‖2‖y‖2.

If (x, y) = re^{iθ}, let a = e^{−iθ} so that |a| = 1 and a(x, y) is real and non-negative. Replace x by ax in the above; since ‖ax‖ = ‖x‖, this gives |(x, y)| ≤ ‖x‖ ‖y‖.
A corollary is that ‖x‖ = max_{‖y‖≤1} |(x, y)|. To see this, by Cauchy-Schwarz, the right hand side is less than or equal to the left hand side. For the other direction, take y = x/‖x‖ (if x = 0 both sides are 0).
Corollary 6.2 ‖ · ‖ is a norm.
Proof. It is only subadditivity that is not trivial. Start with
‖x+ ty‖2 = ‖x‖2 + 2tRe (x, y) + t2‖y‖2,
take t = 1, and use Cauchy-Schwarz.
If in (6.1) we take first t = 1 and then t = −1 and add, we get the parallelogram identity:
‖x+ y‖2 + ‖x− y‖2 = 2‖x‖2 + 2‖y‖2.
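Both Cauchy-Schwarz and the parallelogram identity are easy to spot-check numerically; a numpy sketch on C^4 with the scalar product (u, v) = ∑ uj vj∗ (the random vectors are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(4) + 1j * rng.standard_normal(4)
y = rng.standard_normal(4) + 1j * rng.standard_normal(4)

def ip(u, v):
    # scalar product on C^4: (u, v) = sum_j u_j conj(v_j)
    return np.sum(u * np.conj(v))

norm = lambda u: np.sqrt(ip(u, u).real)

# Cauchy-Schwarz: |(x, y)| <= ||x|| ||y||
assert abs(ip(x, y)) <= norm(x) * norm(y) + 1e-12

# Parallelogram identity
lhs = norm(x + y)**2 + norm(x - y)**2
rhs = 2 * norm(x)**2 + 2 * norm(y)**2
assert abs(lhs - rhs) < 1e-10
```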
We say x and y are orthogonal if (x, y) = 0.
A linear space with a scalar product that is complete with respect to ‖ · ‖ is a Hilbert space.

One example is `2 with (x, y) = ∑_j aj bj. Another example is L2 with (x, y) = ∫ x(t)y(t) m(dt).
6.2 Convex subsets
Theorem 6.3 Let K be a convex, nonempty, and closed subset of a Hilbert space H and let x ∈ H. There exists a unique y ∈ K that is closer to x than any other point of K.
Proof. Let d = inf_{z∈K} ‖x − z‖. Choose yn ∈ K such that dn → d, where dn = ‖x − yn‖. By the parallelogram identity applied to (x − yn)/2 and (x − ym)/2, we have

‖x − (yn + ym)/2‖² + (1/4)‖yn − ym‖² = (1/2)dn² + (1/2)dm².

The right hand side tends to d². Since K is convex, (yn + ym)/2 ∈ K, so the first term on the left is greater than or equal to d². Therefore (1/4)‖yn − ym‖² → 0. Thus {yn} is a Cauchy sequence. Since H is complete, yn → y for some y ∈ H. Since K is closed, y ∈ K. We have

‖x − y‖ = lim ‖x − yn‖ = lim dn = d.
If y′ is another point of K with ‖x − y′‖ = d, apply the parallelogram identity to x − y and x − y′ for the last equality in what follows. Since (y + y′)/2 ∈ K,

4d² + ‖y − y′‖² ≤ 4‖x − (y + y′)/2‖² + ‖y − y′‖² = ‖2x − (y + y′)‖² + ‖y − y′‖² = 4d².

So we must have equality, and then ‖y − y′‖² = 0.
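A concrete instance of Theorem 6.3: for a box in R³ (the box and the test point are illustrative assumptions) the closest point is obtained by clipping each coordinate, and a random sample confirms nothing in the box is closer:

```python
import numpy as np

# Closest-point projection onto the closed convex set K = [0,1]^3.
# For a box, the projection is coordinatewise clipping.
def proj(x):
    return np.clip(x, 0.0, 1.0)

x = np.array([2.0, -0.5, 0.3])
y = proj(x)                     # the unique closest point of K to x

# y is in K and is at least as close to x as any sampled z in K.
rng = np.random.default_rng(2)
for z in rng.uniform(0.0, 1.0, size=(1000, 3)):
    assert np.linalg.norm(x - y) <= np.linalg.norm(x - z) + 1e-12
```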
If Y is a linear subspace of H, the orthogonal complement of Y is
Y ⊥ = {v ∈ H : (v, y) = 0 for all y ∈ Y },
read “Y perp.”
Proposition 6.4 Suppose Y is a closed subspace of H. Then
1) Y ⊥ is a closed linear subspace of H.
2) H = Y ⊕ Y ⊥.
3) (Y ⊥)⊥ = Y .
Proof. 1) (We don’t need Y closed for this first part.) That Y ⊥ is a linear subspace is easy. If vn ∈ Y ⊥ and vn → v, then for any y ∈ Y ,

|(v, y) − (vn, y)| = |(vn − v, y)| ≤ ‖v − vn‖ ‖y‖ → 0,

so (v, y) = lim(vn, y) = 0, and hence v ∈ Y ⊥. Therefore Y ⊥ is closed.

2) If x ∈ H, choose y ∈ Y closest to x and set v = x − y. Since y is closest, for all z ∈ Y and real t,

‖v‖2 ≤ ‖v + tz‖2 = ‖v‖2 + 2tRe (v, z) + t2‖z‖2.

Taking t very small and both positive and negative, this is only possible if Re (v, z) = 0. By replacing z by az so that (v, az) is real, we see (v, z) = 0. Therefore v ∈ Y ⊥ and x = y + v.
If x = y′ + v′ as well, then y − y′ = v′ − v ∈ Y ∩ Y⊥. So y − y′ is orthogonal to itself, and by positivity, y − y′ = 0. Hence y = y′ and v = v′.
3) If y ∈ Y, then for any v ∈ Y⊥ we have (y, v) = 0, and hence y ∈ (Y⊥)⊥. We thus need to show (Y⊥)⊥ ⊂ Y.

By 2), H = Y ⊕ Y⊥. If y ∈ (Y⊥)⊥, we can write y = v + z with z ∈ Y ⊂ (Y⊥)⊥ and v ∈ Y⊥. Then v = y − z ∈ (Y⊥)⊥. Since v is also in Y⊥, we see v = 0, or y = z ∈ Y.
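The decomposition H = Y ⊕ Y⊥ can be illustrated in R⁴, with Y the column space of a random matrix and the component in Y computed by least squares; this is only a finite dimensional sketch, not part of the notes.

```python
import numpy as np

# H = Y + Y-perp in R^4: Y is the column space of A, y is the orthogonal
# projection of x onto Y, and v = x - y lies in the orthogonal complement.
rng = np.random.default_rng(2)
A = rng.standard_normal((4, 2))   # columns span Y
x = rng.standard_normal(4)

coef, *_ = np.linalg.lstsq(A, x, rcond=None)
y = A @ coef          # component in Y
v = x - y             # component in Y-perp

assert np.allclose(A.T @ v, 0)    # v is orthogonal to every column of A
assert np.allclose(x, y + v)
```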
6.3 Linear functionals
For each y ∈ H, ℓ(x) = (x, y) is a linear functional, and |ℓ(x)| ≤ c‖x‖. The converse also holds. To motivate the proof below: for a nonzero ℓ, the range R_ℓ is one-dimensional and R_ℓ is isomorphic to H/N_ℓ, where N_ℓ is the null space of ℓ, so the codimension of N_ℓ is 1. We have H = N_ℓ ⊕ N_ℓ⊥, so the obvious candidate to take is cy, where y ∈ N_ℓ⊥.
A linear functional ℓ is bounded if there exists c such that |ℓ(x)| ≤ c‖x‖ for all x.
Theorem 6.5 (Riesz-Frechet representation) Let ℓ be a bounded linear functional on a Hilbert space H. Then there exists a unique y ∈ H such that ℓ(x) = (x, y) for all x.
Proof. If ℓ = 0, take y = 0. If ℓ ≠ 0, then N_ℓ is a closed subspace Y of H. (To show it is closed is easy, since ℓ is continuous.) We can write H = Y ⊕ Y⊥. Take p ∈ Y⊥ with ‖p‖ = 1.
Let z = ℓ(x)p − ℓ(p)x. Then ℓ(z) = 0, so z ∈ Y, and hence (p, z) = 0. This says

ℓ(x)(p, p) − ℓ(p)(x, p) = 0,

or ℓ(x) = ℓ(p)(x, p). We take y = ℓ(p)p.
To show uniqueness, if (x, y′) = ℓ(x) = (x, y) for all x, then (x, y − y′) = 0 for all x, in particular when x = y − y′. Now use positivity.
Theorem 6.6 (Lax-Milgram lemma) Let H be a Hilbert space and suppose B : H × H → F satisfies

1. for each y, B(x, y) is linear in x;

2. for each x, B(x, y₁ + y₂) = B(x, y₁) + B(x, y₂) and B(x, cy) = cB(x, y);

3. there exists c such that |B(x, y)| ≤ c‖x‖ ‖y‖;

4. there exists b > 0 such that |B(y, y)| ≥ b‖y‖² for all y.
(We do not assume B(x, y) = B(y, x).) Then every bounded linear functional ℓ is of the form ℓ(x) = B(x, y) for some unique y.
Proof. For each y, B(x, y) is a bounded linear functional of x, so there exists z = z(y) such that B(x, y) = (x, z) for all x, and z is unique.
If Z = {z : z = z(y) for some y ∈ H}, then Z is a linear space.
Z is closed: setting x = y and letting z = z(y),

b‖y‖² ≤ |B(y, y)| = |(y, z)| ≤ ‖y‖ ‖z‖,

or

b‖y‖ ≤ ‖z‖.
If z_n ∈ Z and z_n → z, let y_n be a point such that z_n = z(y_n). Then B(x, y_n) = (x, z_n). So B(x, y_n − y_m) = (x, z_n − z_m), hence b‖y_n − y_m‖ ≤ ‖z_n − z_m‖, and therefore {y_n} is a Cauchy sequence. H is complete; let y be the limit. Since B(x, y_n) → B(x, y) and (x, z_n) → (x, z), we have B(x, y) = (x, z), and hence z ∈ Z.
Z = H: For each y, there exists z(y) such that B(x, y) = (x, z(y)) for all x. If Z ≠ H, there exists a nonzero x ∈ Z⊥. Applying the above with y = x, there exists z(x) such that B(x, x) = (x, z(x)). Since x ∈ Z⊥ and z(x) ∈ Z, b‖x‖² ≤ |B(x, x)| = |(x, z(x))| = 0. So x = 0, a contradiction.
Existence: given ℓ, by the Riesz-Frechet theorem there exists z such that ℓ(x) = (x, z) for all x. Since Z = H, we have z = z(y) for some y, and then ℓ(x) = (x, z(y)) = B(x, y).
Uniqueness: if there are two such y, say y and y′, then B(x, y − y′) = B(x, y) − B(x, y′) = ℓ(x) − ℓ(x) = 0 for all x. Now set x = y − y′ and use 4.
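A finite dimensional sketch of the Lax-Milgram lemma (my own illustration): take B(x, y) = x · (My) for a matrix M that is coercive but not symmetric; the representing element solves a linear system.

```python
import numpy as np

# B(x, y) = x . (M y) with M nonsymmetric but coercive:
# x . (M x) = 2|x|^2 > 0 although M != M^T.
# Every functional l(x) = x . f equals B(x, y) for the unique y = M^{-1} f.
M = np.array([[2.0, 1.0],
              [-1.0, 2.0]])
f = np.array([1.0, 3.0])

y = np.linalg.solve(M, f)            # the representing element

# check B(x, y) = l(x) on a basis (hence for all x by linearity)
for x in np.eye(2):
    assert abs(x @ (M @ y) - x @ f) < 1e-12
```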
6.4 Linear spans
The closed linear span of S is the intersection of all closed linear subspaces containing S.

One can check that the closed linear span of S equals the closure of the linear span of S.
Proposition 6.7 y is in the closed linear span Y of {y_j} if and only if the following holds: whenever (y_j, z) = 0 for all j, then (y, z) = 0.
Proof. Suppose y is in the closed linear span. If (y_j, z) = 0 for all j, then (∑ a_j y_j, z) = 0, so (y, z) = 0 by continuity of the inner product. So let's look at the other direction.
Let Z = {z : (z, yj) = 0 for all j}. We claim Z = Y ⊥.
If z is orthogonal to all the y_j, then z is orthogonal to all linear combinations, and hence to the limits of linear combinations. Thus z ∈ Y⊥.
If z ∈ Y ⊥, then z is orthogonal to all the yj, hence is in Z.
Therefore Z = Y ⊥, and so Y = (Y ⊥)⊥ = Z⊥.
To conclude the proof, suppose z ∈ Z. Then (z, y_j) = 0 for all j. By hypothesis, (z, y) = 0. Therefore y ∈ Z⊥ = Y.
We say a set {x_j} is orthonormal if (x_j, x_k) equals 0 when j ≠ k and equals 1 when j = k. {x_j} forms an orthonormal basis if the elements are orthonormal and the closed linear span is all of H.
If ∑ |a_j|² < ∞, we can define ∑ a_j x_j as the limit in the Hilbert space norm of the finite sums.
We need Bessel’s inequality:
Proposition 6.8 Suppose y ∈ H and a_j = (y, x_j). Then ∑ |a_j|² ≤ ‖y‖².
Proof. If G is a finite collection of the x_j's, then

0 ≤ ‖y − ∑_{j∈G} a_j x_j‖² = ‖y‖² − ∑_{j∈G} ā_j(y, x_j) − ∑_{j∈G} a_j(x_j, y) + ∑_{j∈G} |a_j|²

= ‖y‖² − ∑_{j∈G} |a_j|².

Therefore

∑_{j∈G} |a_j|² ≤ ‖y‖²,

and we take the supremum over all finite collections.
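Bessel's inequality can be checked numerically for an orthonormal set that is not a full basis; the matrices below are arbitrary illustrative choices, not from the notes.

```python
import numpy as np

# Two columns of a random orthogonal matrix form an orthonormal set
# {x_1, x_2} in R^4 that does not span the space.  With a_j = (y, x_j),
# Bessel's inequality says sum |a_j|^2 <= ||y||^2.
rng = np.random.default_rng(3)
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))
X = Q[:, :2]                    # orthonormal set, not a basis
y = rng.standard_normal(4)

a = X.T @ y                     # the coefficients a_j
assert np.sum(a ** 2) <= np.linalg.norm(y) ** 2 + 1e-12
```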
Lemma 6.9 If {x_j} is an orthonormal set, then the closed linear span is equal to

{∑ a_j x_j : ∑ |a_j|² < ∞}.

Note in this case ‖x‖² = ∑ |a_j|² and a_j = (x, x_j). To see this, because the {x_j} are orthonormal, for any finite sum

‖∑ a_j x_j‖² = ∑ ‖a_j x_j‖² = ∑ |a_j|²

as the cross terms are zero. We then take limits. To get the formula for a_j,

(x, x_k) = ∑ a_j(x_j, x_k) = a_k.
Theorem 6.10 Every Hilbert space has an orthonormal basis.
Proof. Consider all orthonormal sets, ordered by inclusion, and use Zorn's lemma to get a maximal element {x_j}. We want to show the closed linear span Y of a maximal orthonormal set is all of H.
Suppose not, so that there exists y ∉ Y. Let a_j = (y, x_j). By Bessel's inequality ∑ |a_j|² < ∞, so we may define x = ∑ a_j x_j, and note x ∈ Y. We have
(y − x, x_j) = (y, x_j) − (x, x_j) = a_j − a_j = 0.
So y − x is orthogonal to all the x_j. Since y − x ≠ 0 (because y ∉ Y), the vector (y − x)/‖y − x‖ could be added to the collection {x_j} to make a strictly larger orthonormal set, a contradiction.
If H is separable, then any orthonormal basis will be countable, and we wouldn't need Zorn's lemma to obtain the orthonormal basis.
Proposition 6.11 Suppose H is a Hilbert space, and {x_j}, {y_j} are two orthonormal bases for H. Given x ∈ H, we have x = ∑ a_j x_j with a_j = (x, x_j). The map x → y = ∑ a_j y_j is an isometry.
7 Applications of Hilbert spaces
7.1 The Radon-Nikodym theorem
We can use Hilbert space techniques to give an alternate proof of the Radon-Nikodym theorem.
Suppose µ and ν are finite measures on a space S and we have the condition ν(A) ≤ µ(A) for all measurable A. For f ∈ L²(µ), define

ℓ(f) = ∫ f dν.
Our condition implies ∫ h dν ≤ ∫ h dµ if h ≥ 0. We use this with h = |f| together with Cauchy-Schwarz and have

|ℓ(f)| = |∫ f dν| ≤ ∫ |f| dν ≤ ∫ |f| dµ ≤ (µ(S))^{1/2} (∫ |f|² dµ)^{1/2} ≤ c‖f‖_{L²(µ)}.
There exists g such that ℓ(f) = (f, g), which translates to

∫ f dν = ∫ fg dµ.
Letting f = χ_A, we get ν(A) = ∫_A g dµ.
For the general case, if ν is absolutely continuous with respect to µ, we let ρ = µ + ν and apply the above to the pair ν and ρ and also to the pair µ and ρ. The absolute continuity implies that dµ/dρ > 0 a.e., and we use

dν/dµ = (dν/dρ) / (dµ/dρ).
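On a finite space the Radon-Nikodym derivative is just a ratio of point masses; the following sketch, with made-up weights satisfying ν ≤ µ setwise, illustrates the conclusion ν(A) = ∫_A g dµ.

```python
import numpy as np

# Discrete Radon-Nikodym sketch on S = {0,...,4}:
# mu({i}) = mu_w[i] > 0, nu({i}) = nu_w[i], and the density
# g = d(nu)/d(mu) is the ratio of weights.
mu_w = np.array([1.0, 2.0, 0.5, 1.5, 1.0])
nu_w = np.array([0.5, 1.0, 0.25, 0.75, 0.5])   # nu(A) <= mu(A) for all A

g = nu_w / mu_w                                 # the density

A = [0, 2, 4]                                   # an arbitrary subset
assert abs(nu_w[A].sum() - (g[A] * mu_w[A]).sum()) < 1e-12
```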
7.2 The Dirichlet problem
Let D be a bounded domain in Rⁿ, contained in B(0, K), say, where this is the ball of radius K about 0. Let (f, g) be the usual L² scalar product for real valued functions. It is easy to see that if C₀^∞(D) is the set of C^∞ functions that vanish on the boundary of D, then the completion of C₀^∞(D) with respect to the L² norm is simply L²(D). Define
E(f, g) = ∫_D (∇f(x), ∇g(x)) dx.
Clearly E is bilinear and symmetric.
If we start with

f(x₁, . . . , x_n) = ∫_{−K}^{x₁} (∂f/∂x₁)(y, x₂, . . . , x_n) dy
and apply Cauchy-Schwarz, we have

|f(x₁, . . . , x_n)|² ≤ ∫_{−K}^{K} 1 dy · ∫_{−K}^{K} |∇f(y, x₂, . . . , x_n)|² dy.
Integrating over x₁ ∈ [−K, K] and (x₂, . . . , x_n) ∈ [−K, K]^{n−1} we obtain

∫_D |f(x)|² dx ≤ c ∫_D |∇f(x)|² dx,

or in other words, (f, f) ≤ c E(f, f).
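The inequality (f, f) ≤ c E(f, f) can be checked numerically in one dimension; the test function and the constant c = (2K)² below are illustrative choices of mine, not from the notes.

```python
import numpy as np

# One-dimensional Poincare-type check on [-K, K] with K = 1:
# for f vanishing at the boundary, (f, f) <= (2K)^2 E(f, f).
K = 1.0
x = np.linspace(-K, K, 10001)
dx = x[1] - x[0]
f = np.sin(np.pi * (x + K) / (2 * K))      # f(-K) = f(K) = 0

f2 = (f ** 2).sum() * dx                   # (f, f), approximately 1
df = np.gradient(f, x)
E = (df ** 2).sum() * dx                   # E(f, f), approximately pi^2/4

assert f2 <= 4 * E
```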
If E(f, f) = 0, then (f, f) = 0, and so f = 0 (a.e., of course). This proves that E is positive definite. We let H₀¹ be the completion of C₀^∞(D) with respect to the norm induced by E. The superscript 1 refers to the fact that we are working with first derivatives, the subscript 0 to the fact that our functions vanish on the boundary. E is an example of a Dirichlet form, something we will probably look at much more later.
Recall the divergence theorem:

∫_{∂D} (F, n) dσ = ∫_D div F dx,
where D is a reasonably smooth domain, ∂D is the boundary of D, n is the outward pointing unit normal, and σ is surface measure. In three dimensions this is also known as Gauss' theorem; it, along with Green's theorem and Stokes' theorem, is a consequence of the fundamental theorem of calculus.
If we apply the divergence theorem to F = u∇v, then

∂F₁/∂x₁ = (∂u/∂x₁)(∂v/∂x₁) + u ∂²v/∂x₁²,

and so

div F = (∇u, ∇v) + u∆v,
where ∆v is the Laplacian. Also,

(F, n) = u ∂v/∂n,

where ∂v/∂n is the normal derivative of v. We then get Green's first identity:

∫_D u∆v + ∫_D (∇u, ∇v) = ∫_{∂D} u ∂v/∂n.
Our goal is to solve the equation ∆v = g in D with v = 0 on the boundary of D. This is Poisson's equation, while the Dirichlet problem more properly refers to the equation ∆v = 0 in D with v equal to some pre-specified function f on the boundary of D.
If we have a solution v and u ∈ C₀^∞(D), then by Green's identity we get

∫_D u(x)g(x) dx = −∫_D (∇u(x), ∇v(x)) dx.
So one way of formulating a (weak) solution to Poisson's equation is: given g ∈ L²(D), find v ∈ H₀¹ such that

E(u, v) = −∫ ug

for all u ∈ C₀^∞(D).
After all this, it is easy to find a weak solution to the Poisson equation. Suppose g ∈ L²(D). Define ℓ(u) = −(u, g). Then

|ℓ(u)| ≤ ‖g‖ ‖u‖ ≤ c‖g‖ E(u, u)^{1/2}.
By the Riesz-Frechet theorem, there exists v ∈ H₀¹ such that ℓ(u) = E(u, v) for all u. So

E(u, v) = ℓ(u) = −(u, g),

and v is the desired solution.
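The weak formulation can be illustrated in one dimension, where v″ = g with g = −π² sin(πt) has the explicit solution v = sin(πt); the test function u below is an arbitrary choice of mine.

```python
import numpy as np

# Check E(u, v) = -(u, g) for v'' = g on (0, 1) with v(0) = v(1) = 0,
# where g = -pi^2 sin(pi t) and v = sin(pi t).
t = np.linspace(0.0, 1.0, 20001)
dt = t[1] - t[0]
g = -np.pi ** 2 * np.sin(np.pi * t)
v = np.sin(np.pi * t)
u = t * (1 - t)                       # smooth, vanishes at the boundary

E_uv = (np.gradient(u, t) * np.gradient(v, t)).sum() * dt
minus_ug = -(u * g).sum() * dt
assert abs(E_uv - minus_ug) < 1e-3    # both equal 4/pi
```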
8 Duals of normed linear spaces
8.1 Bounded linear functionals
If X is a normed linear space, a linear functional ℓ is a linear map from X to F, the field of scalars. ℓ is continuous if |x_n − x| → 0 implies ℓ(x_n) → ℓ(x). ℓ is bounded if there exists c such that |ℓ(x)| ≤ c|x| for all x.
Theorem 8.1 A linear functional ` is continuous if and only if it is bounded.
Proof. If ℓ is bounded,

|ℓ(x_n) − ℓ(x)| = |ℓ(x_n − x)| ≤ c|x_n − x| → 0,

and so it is continuous.
Suppose ℓ is continuous but not bounded. So there exist x_n such that ℓ(x_n) > n|x_n|. If

y_n = (1/√n) x_n/|x_n|,

then |y_n − 0| = |y_n| → 0, but

ℓ(y_n) > (1/√n) · n|x_n|/|x_n| = √n,

which does not tend to 0 = ℓ(0), a contradiction.
The collection of all continuous linear functionals on X is called the dual of X, written X′ or X∗.
Note N_ℓ = ℓ⁻¹({0}) is closed, since ℓ is continuous.
Define

|ℓ| = sup_{x≠0} |ℓ(x)| / |x|.

By linearity, this is the same as sup_{|x|=1} |ℓ(x)|.
Proposition 8.2 X∗ is a Banach space.
Proof.
Subadditivity:

|ℓ + m| = sup_{|x|=1} |(ℓ + m)(x)| ≤ sup_{|x|=1} (|ℓ(x)| + |m(x)|) ≤ sup_{|x|=1} |ℓ(x)| + sup_{|x|=1} |m(x)| = |ℓ| + |m|.
Completeness: Suppose |ℓ_n − ℓ_m| → 0 as n, m → ∞. For each x,

|ℓ_n(x) − ℓ_m(x)| ≤ |ℓ_n − ℓ_m| |x| → 0.
Since R and C are complete, lim ℓ_n(x) exists for each x; call the limit ℓ(x).

Given ε, choose N such that |ℓ_n − ℓ_m| < ε if n, m ≥ N. So |ℓ_n(x) − ℓ_m(x)| ≤ ε|x|. Let m → ∞, so |ℓ_n(x) − ℓ(x)| ≤ ε|x| if n ≥ N. This means |ℓ_n − ℓ| ≤ ε if n ≥ N, or ℓ_n → ℓ.
8.2 Extensions of bounded linear functionals
Proposition 8.3 Let X be a normed linear space, Y a subspace, ℓ a linear functional on Y with |ℓ(y)| ≤ c|y| for all y ∈ Y. Then ℓ can be extended to a bounded linear functional on X with the same bound on X as on Y.
Proof. This is the Hahn-Banach theorem with p(x) = c|x|.
y₁, . . . , y_N are said to be linearly independent if ∑_{i=1}^N c_i y_i = 0 implies all the c_i are zero.
Theorem 8.4 Suppose y₁, . . . , y_N are linearly independent and a₁, . . . , a_N are scalars. Then there exists a bounded linear functional ℓ such that ℓ(y_j) = a_j for each j.
Proof. Let Y be the span of y₁, . . . , y_N. If y ∈ Y, then y can be written as ∑ b_j y_j in only one way, for if ∑ b′_j y_j is another way, then

∑ (b_j − b′_j) y_j = y − y = 0,

and so b_j = b′_j for all j. Define

ℓ(∑ b_j y_j) = ∑ a_j b_j.
ℓ is bounded on the finite dimensional space Y; now use the preceding proposition to extend ℓ to all of X.
Theorem 8.5 If X is a normed linear space and y ∈ X, then

|y| = max_{|ℓ|=1} |ℓ(y)|.
Proof. |ℓ(y)| ≤ |ℓ| |y|, so the maximum on the right hand side is less than or equal to |y|.
If y ∈ X, let Y = {ay} and define `(ay) = a|y|. Then the norm of ` on Yis 1. Now extend ` to all of X so as to have norm 1.
Theorem 8.6 Let X be a normed linear space over C and Y a linear subspace. Let

m(z) = inf_{y∈Y} |z − y|,   M(z) = max_{|ℓ|≤1, ℓ=0 on Y} |ℓ(z)|.
Then m(z) = M(z).
Proof. If y ∈ Y, ℓ = 0 on Y, and |ℓ| ≤ 1, then

|ℓ(z)| = |ℓ(z − y)| ≤ |z − y|.

So

|ℓ(z)| ≤ inf_{y∈Y} |z − y| = m(z),

and hence M(z) ≤ m(z).
Let Y₀ = {y + az : y ∈ Y, a ∈ C}. Define ℓ₀ on Y₀ by ℓ₀(y + az) = a m(z). By the definition of m(z), ℓ₀ is bounded on Y₀ by 1. (To see this, for a ≠ 0 we write

|ℓ₀(y + az)| = |a| m(z) ≤ |a| |z − (−y/a)| = |az + y|.)

Extend ℓ₀ to a functional ℓ on all of X with the same bound.
Then

ℓ(z) = ℓ₀(0 + 1 · z) = m(z),

so m(z) ≤ M(z).
We write Y ⊥ = {` : ` = 0 on Y } for the annihilator of Y .
Theorem 8.7 (Spanning criterion) Let Y be the closed linear span of {y_j}. Then z ∈ Y if and only if every bounded linear functional ℓ with ℓ(y_j) = 0 for all j also satisfies ℓ(z) = 0.
Proof. Suppose z ∈ Y. If ℓ(y_j) = 0 for all j, then ℓ(y) = 0 for all y of the form ∑ a_j y_j, and by continuity of ℓ, ℓ(y) = 0 for all y ∈ Y; in particular ℓ(z) = 0.
Suppose z ∉ Y. Then

inf_{y∈Y} |z − y| = d > 0.
Let Z = {y + az : y ∈ Y, a a scalar}. Define ℓ₀ on Z by ℓ₀(y + az) = a. For a ≠ 0,

|y + az| = |a| |z − (−y/a)| ≥ d|a|.

Therefore on Z, ℓ₀ is bounded by d⁻¹. Extend ℓ₀ to all of X. But then ℓ₀(y_j) = 0 for all j while ℓ₀(z) = 1.
8.3 Reflexive spaces
If x ∈ X, define the linear functional L_x on X∗ by

L_x(ℓ) = ℓ(x).

By Theorem 8.5 the norm of L_x is |x|, so we can isometrically embed X into X∗∗.
A Banach space X is reflexive if this embedding is onto, that is, if X∗∗ = X.
Theorem 8.8 Hilbert spaces are reflexive.
Proof. Recall X∗ = X, and the result follows from this. To see X∗ = X, if ℓ is a linear functional, there exists y such that ℓ(x) = (x, y) for all x. If we show |ℓ| = ‖y‖, this gives an isometry between X and X∗. By Cauchy-Schwarz, |ℓ(x)| = |(x, y)| ≤ ‖x‖ ‖y‖, so |ℓ| ≤ ‖y‖. Taking x = y, ℓ(y) = ‖y‖², hence |ℓ| ≥ ‖y‖.
If 1 < p, q < ∞ and 1/p + 1/q = 1, then the dual of L^p is isomorphic to L^q. Hence the L^p spaces are reflexive.
To prove the dual of Lp is Lq, two facts from real analysis are needed. SeeFolland, p.190 and preceding.
1) Hölder's inequality:

∫ |fg| ≤ ‖f‖_p ‖g‖_q.
2)

‖g‖_q = sup{ |∫ fg| : ‖f‖_p = 1, f simple }.
Theorem 8.9 (Lp)∗ = Lq.
Proof. (Sketch) If we define ℓ(f) = ∫ fg, then Hölder's inequality shows that ℓ is a bounded linear functional.
Now suppose ℓ is a bounded linear functional on L^p. The heart of the matter is when µ, the measure, is finite, so let's suppose that. Define ν(E) = ℓ(χ_E). The linearity of ℓ shows ν is finitely additive. The fact that ℓ is a continuous linear functional allows one to prove countable additivity. If µ(E) = 0, then χ_E = 0 (a.e.), so ν(E) = ℓ(0) = 0. Therefore ν is absolutely continuous with respect to µ. Note ν is either a signed measure or is complex valued.
By the Radon-Nikodym theorem there exists g ∈ L¹ such that

ℓ(χ_E) = ν(E) = ∫_E g.

By linearity, if f is simple,

ℓ(f) = ∫ fg.
Since |ℓ(f)| ≤ c‖f‖_p, because ℓ is a bounded linear functional on L^p, taking the supremum over simple f's with L^p norm 1, we get

‖g‖_q ≤ |ℓ|.
We can then use the continuity of ℓ and Hölder's inequality to obtain

ℓ(f) = ∫ fg

for all f ∈ L^p.
Proposition 8.10 If X is a normed linear space over C and X∗ is separable, then X is separable.
Proof. Since X∗ is separable, there is a countable dense subset {ℓ_n}. Recall |ℓ_n| = sup_{|x|=1} |ℓ_n(x)|. So for each n there exists x_n ∈ X such that |x_n| = 1 and |ℓ_n(x_n)| > (1/2)|ℓ_n|.
We claim the closed linear span of {x_n} is all of X. To prove this, we start by showing that if ℓ is a linear functional on X that vanishes on {x_n}, then ℓ vanishes identically.
Suppose not, and that there exists ℓ such that ℓ(x_n) = 0 for all x_n but ℓ ≠ 0. We can normalize so that |ℓ| = 1. Since the ℓ_n are dense in X∗, there exists ℓ_n such that |ℓ − ℓ_n| < 1/3. Therefore |ℓ_n| > 2/3. Then

1/3 > |(ℓ − ℓ_n)(x_n)| = |ℓ_n(x_n)| > (1/2)|ℓ_n| > (1/2) · (2/3),

a contradiction.
Therefore by the spanning criterion, the closed linear span of the x_n is all of X. Then the set of finite linear combinations of the x_n in which all the coefficients are rational (rational real and imaginary parts) is also dense in X, and is countable.
By the Riesz representation theorem from real analysis, the dual of X = C([0, 1]) is the set of finite signed measures on [0, 1]. X is separable, but X∗ is not, since |δ_x − δ_y| = 2 whenever x ≠ y. It follows that X is not reflexive: if it were, then X∗∗ = X would be separable, and the previous proposition applied to X∗ would make X∗ separable, a contradiction.
Proposition 8.11 Suppose X is a reflexive normed linear space and Y is a closed subspace. Then Y is reflexive.
Proof. If ℓ is a linear functional on X, let ℓ₀ be the restriction of ℓ to Y. By the Hahn-Banach theorem every bounded linear functional on Y can be extended to X. So the restriction map R : ℓ → ℓ₀ maps X∗ onto Y∗.
If ϕ ∈ Y∗∗, define R′ϕ by

R′ϕ(ℓ) = ϕ(ℓ₀), ℓ ∈ X∗.

Here R′ maps Y∗∗ into X∗∗.

Since X is reflexive, there exists z ∈ X such that R′ϕ(ℓ) = ℓ(z) for all linear functionals ℓ ∈ X∗. Therefore ℓ(z) = ϕ(ℓ₀).
We argue that z ∈ Y . If ` ∈ Y ⊥, then `0 = 0. So `(z) = ϕ(`0) = ϕ(0) = 0.By the spanning criterion, z is in the closure of Y , and since Y is closed, itis in Y .
Therefore

ℓ₀(z) = ϕ(ℓ₀).

Every functional in Y∗ is an ℓ₀ for some ℓ, so every ϕ ∈ Y∗∗ can be identified with some z ∈ Y.
8.4 The support function
If M is a subset of a linear space, define M̂ to be the closure of the convex hull of M.
If M is a bounded subset of a normed linear space X over R, define the map S_M from X∗ into R by

S_M(ℓ) = sup_{y∈M} ℓ(y).
SM is called the support function.
If M = {x₀}, then S_M(ℓ) = ℓ(x₀). If M = B_R(0), then S_M(ℓ) = R|ℓ|. Using the fact that S_{M+N}(ℓ) = S_M(ℓ) + S_N(ℓ), if M = B_R(x₀), then S_M(ℓ) = ℓ(x₀) + R|ℓ|.
Proposition 8.12 If M is a bounded subset of X, then z is in the closure of the convex hull of M if and only if ℓ(z) ≤ S_M(ℓ) for all ℓ ∈ X∗.
Proof. One way is easy. If ℓ ∈ X∗ and z is in the closure of the convex hull of M, then ℓ(z) ≤ S_M(ℓ), since it is easy to check that S_M does not change when M is replaced by the closure of its convex hull.
Now suppose z is not in the closure M̂ of the convex hull of M, but ℓ(z) ≤ S_M(ℓ) for all ℓ. M̂ is closed, so there exists R such that B_R(z) ∩ M̂ = ∅. By the hyperplane separation theorem, there exist ℓ₀ ≠ 0 and c such that

ℓ₀(u) ≤ c ≤ ℓ₀(v),   u ∈ M̂, v ∈ B_R(z).
`0 is a bounded linear functional.
If v ∈ B_R(z), then v = z + Rx with |x| < 1, so

c ≤ ℓ₀(z) + Rℓ₀(x).

Also inf_{|x|<1} ℓ₀(x) = −|ℓ₀|. So c ≤ ℓ₀(z) − R|ℓ₀|. We have S_M(ℓ₀) ≤ c, so

S_M(ℓ₀) + R|ℓ₀| ≤ ℓ₀(z),

a contradiction to ℓ₀(z) ≤ S_M(ℓ₀), since |ℓ₀| > 0.
Proposition 8.13 Suppose K is a closed and convex subset of X and z ∉ K. Then

inf_{u∈K} |z − u| = sup_{|ℓ|=1} [ℓ(z) − S_K(ℓ)].
Proof. S_K(ℓ) ≥ ℓ(u) for all ℓ ∈ X∗ and all u ∈ K. If |ℓ| = 1, then

S_K(ℓ) ≥ ℓ(u) = ℓ(z) + ℓ(u − z) ≥ ℓ(z) − |u − z|,

or |u − z| ≥ ℓ(z) − S_K(ℓ). So the left hand side is larger than the right hand side.
Let 0 < R < inf_{u∈K} |z − u|. Then K + B_R has positive distance from z, so by the previous proposition

S_{K+B_R}(ℓ₀) < ℓ₀(z)

for some ℓ₀ ∈ X∗, and we can take |ℓ₀| = 1. Since S_{K+B_R}(ℓ₀) = S_K(ℓ₀) + R|ℓ₀|, we get

R < ℓ₀(z) − S_K(ℓ₀).

Therefore the right hand side is larger than R. This is true for any such R, so the right side is larger than the left side.
9 Weak convergence
Let X be a normed linear space. We say x_n converges to x weakly, written w-lim x_n = x or x_n →ʷ x, if ℓ(x_n) → ℓ(x) for all ℓ ∈ X∗.

x_n converges to x strongly, written s-lim x_n = x or x_n →ˢ x, if |x_n − x| → 0.
Strong convergence implies weak convergence because
|`(xn)− `(x)| = |`(xn − x)| ≤ |`| |xn − x| → 0.
As an example where we have weak convergence but not strong convergence, let X = ℓ² and let e_n be the element whose nth coordinate is 1 and all other coordinates are 0. Since |e_n| = 1, e_n does not converge strongly to 0. But it does converge weakly to 0. To see this, if ℓ is any bounded linear functional on X, then ℓ is of the form ℓ(x) = (x, y) for some y ∈ X, which means y = (b₁, b₂, . . .) with ∑_j |b_j|² < ∞. In particular, b_j → 0. Then ℓ(e_n) = b_n → 0 = ℓ(0).
This example extends to any Hilbert space. If {x_n} is an orthonormal sequence in the space, ℓ(x_n) = (x_n, y) for some y. By Bessel's inequality, ∑ |(x_n, y)|² ≤ ‖y‖², so (x_n, y) → 0.
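A small numerical sketch of the ℓ² example (the sequence y below is an arbitrary choice of mine):

```python
import numpy as np

# In l^2, against y = (1, 1/2, 1/3, ...), the pairing (e_n, y) is just
# the nth coordinate b_n of y, which tends to 0 -- while |e_n| = 1 always.
N = 10000
b = 1.0 / np.arange(1, N + 1)     # y's coordinates; sum b^2 < infinity

inner = b                          # (e_n, y) = b_n
assert abs(inner[0] - 1.0) < 1e-12 # (e_1, y) = 1
assert inner[-1] < 1e-3            # (e_n, y) -> 0, so e_n -> 0 weakly
# yet |e_n| = 1 for every n, so there is no strong convergence to 0
```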
9.1 Uniform boundedness
Theorem 9.1 Let X be a Banach space and {ℓ_ν} a collection of bounded linear functionals such that |ℓ_ν(x)| ≤ M(x) for all ν and each x. Then there exists c such that |ℓ_ν| ≤ c for all ν.
In other words, if the ℓ_ν are bounded pointwise, they are bounded uniformly.
The proof relies on the Baire category theorem, which we now recall. If A is a set, we use Ā for the closure of A and A° for the interior of A. A set A is dense in X if Ā = X, and A is nowhere dense if (Ā)° = ∅.
The Baire category theorem is the following.
Theorem 9.2 Let X be a complete metric space.
(a) If G_n are open sets with Ḡ_n = X for each n, then ∩_n G_n is dense in X.
(b) X cannot be written as the countable union of nowhere dense sets.
We now prove the uniform boundedness theorem. (A generalization to bounded linear maps is called the Banach-Steinhaus theorem.)
Proof. Let M(x) = sup_ν |ℓ_ν(x)| and let G_n = {x : M(x) > n}. Since

G_n = ∪_ν {x : |ℓ_ν(x)| > n}

and each x → |ℓ_ν(x)| is continuous by the triangle inequality, G_n is the union of open sets, so is open.
Suppose G_N is not dense in X. Then there exist x₀ and r such that B(x₀, r) ∩ G_N = ∅; that is, if |x| ≤ r, then |ℓ_ν(x₀ + x)| ≤ N for all ν. Since x = (x₀ + x) − x₀,

|ℓ_ν(x)| ≤ |ℓ_ν(x₀ + x)| + |ℓ_ν(x₀)| ≤ 2N

if |x| ≤ r, and we then have sup_ν |ℓ_ν| ≤ c with c = 2N/r.
The other possibility, by Baire's theorem, is that every G_n is dense in X and ∩_n G_n is a dense subset of X. But M(x) = ∞ for every x ∈ ∩_n G_n, contradicting the assumption that M(x) < ∞ for each x.
Corollary 9.3 Let X be a normed linear space and {x_ν} a subset such that for all ℓ ∈ X∗ we have

|ℓ(x_ν)| ≤ M(ℓ) for all x_ν.

Then there exists c such that |x_ν| ≤ c for all x_ν.
Proof. Write L_ν(ℓ) = ℓ(x_ν). Each x_ν thus acts as a bounded linear functional L_ν on the Banach space X∗, with |L_ν| = |x_ν| by Theorem 8.5. Now apply Theorem 9.1.
Corollary 9.4 Let X be a normed linear space and suppose xn convergesweakly to x. Then |x| ≤ lim inf |xn|.
Proof. There exists ℓ such that |ℓ| = 1 and |ℓ(x)| = |x|. Then |ℓ(x)| = lim |ℓ(x_n)| and |ℓ(x_n)| ≤ |ℓ| |x_n| = |x_n|, so |x| ≤ lim inf |x_n|.
9.2 Weak sequential compactness
The topology that goes along with weak convergence is not necessarily derived from a metric. Thus the topology cannot be characterized by sequential convergence.
We say a subset C of a normed linear space X is weakly sequentially compact if any sequence of points in C has a subsequence converging weakly to a point of C.
Theorem 9.5 Let X be a reflexive Banach space. Then the closed unit ballis weakly sequentially compact.
Proof. Take {y_n} with |y_n| ≤ 1. Let Y be the closed linear span of the y_n's. By Proposition 8.11, Y is reflexive also. Since Y∗∗ = Y is separable, Y∗ is separable by Proposition 8.10 applied to Y∗. Let {m_j} be a countable dense subset of Y∗. By a diagonalization procedure, there exists a subsequence z_n of the y_n such that m_j(z_n) converges for all j. Since the m_j are dense in Y∗, we have m(z_n) converges for all m ∈ Y∗. Call the limit L(m).
We can check that L is a linear functional on Y∗. We have

|L(m)| ≤ lim sup |m(z_n)| ≤ |m| sup_n |z_n| ≤ |m|,

so the linear functional L on Y∗ has norm bounded by 1, i.e., L ∈ Y∗∗. So there exists y ∈ Y such that L(m) = m(y). For all m ∈ Y∗, m(z_n) → m(y), and hence z_n converges weakly to y. Since Y ⊂ X, every m ∈ X∗ restricts to an element of Y∗, hence we have the convergence for all m in X∗.
9.3 Weak∗ convergence
We say un ∈ X∗ is weak∗ convergent to u if limun(x) = u(x) for all x ∈ X.
If X is reflexive, then weak∗ convergence is the same as weak convergence.
Weak convergence in probability theory can be identified as weak∗ con-vergence in functional analysis.
As an example, if S is a compact Hausdorff space and X = C(S), then X∗ is the collection of finite signed measures. Saying a sequence of measures ν_n converges in the weak∗ sense means that ∫ f dν_n converges for each continuous function f.
A set is weak∗ sequentially compact if every sequence in the set has asubsequence which converges in the weak∗ sense to an element of the set.
Theorem 9.6 If X is a separable Banach space, then the closed unit ball inX∗ is weak∗ sequentially compact.
Proof. Let u_n ∈ X∗ with norms bounded by 1. Let {x_k} be a countable dense subset of X. By diagonalization there exists a subsequence v_n of the u_n such that v_n(x_k) converges for each k. Since |v_n| ≤ 1 for all n, an ε/3 approximation argument using the density of {x_k} shows that v_n(x) converges for all x. Call the limit v(x). This defines a linear functional v with norm bounded by 1, and v_n → v in the weak∗ sense.
10 Applications of weak convergence
10.1 Approximating the δ function
Let k_n be a sequence of integrable functions on the interval [−1, 1]. They approximate the δ function (or are an approximation to the identity) if

∫_{−1}^{1} f(t)k_n(t) dt → f(0)   (10.1)

as n → ∞ for all f continuous on [−1, 1].
Theorem 10.1 k_n approximates the δ function on [−1, 1] if and only if the following three properties hold.

(1) ∫_{−1}^{1} k_n(t) dt → 1.

(2) If g is C^∞ and 0 in a neighborhood of 0, then

∫_{−1}^{1} g(t)k_n(t) dt → 0

as n → ∞.

(3) There exists c such that ∫_{−1}^{1} |k_n(t)| dt ≤ c for all n.
Proof. If (1)–(3) hold, write f = (f − f(0)) + f(0); using (1), we may suppose without loss of generality that f(0) = 0. Choose g ∈ C^∞ such that g is 0 in a neighborhood of 0 and |g − f| < ε. We have

|∫_{−1}^{1} (f − g)k_n| ≤ ε ∫ |k_n| ≤ cε

and

∫ g k_n → 0.

So

lim sup |∫ f k_n| ≤ cε.

Since ε is arbitrary, this shows (10.1).
If (10.1) holds, then (1) holds by taking f identically 1, and (2) holds by taking f equal to g. So we must show (3). If X is the set C of continuous functions on [−1, 1], then X∗ is the collection of finite signed measures. Let m_n(dt) = k_n(t) dt and m₀(dt) = δ₀(dt). Then (10.1) says that m_n(f) → m₀(f) for all f ∈ C, or m_n converges to m₀ in the sense of weak∗ convergence. For each f, lim sup |m_n(f)| < ∞, so |m_n(f)| ≤ M(f) for all n, and by the uniform boundedness principle, |m_n| ≤ c. Note |m_n| is the total mass of m_n, which is ∫_{−1}^{1} |k_n(t)| dt.
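A concrete approximation to the δ function, k_n = (n/2) on [−1/n, 1/n] and 0 elsewhere, can be checked numerically; the choice of k_n and of f below is illustrative, not from the notes.

```python
import numpy as np

# k_n = (n/2) * indicator of [-1/n, 1/n] satisfies (1)-(3), and the
# integral of f k_n tends to f(0) for continuous f.
def integral_f_kn(f, n, pts=200001):
    t = np.linspace(-1.0, 1.0, pts)
    kn = np.where(np.abs(t) <= 1.0 / n, n / 2.0, 0.0)
    return (f(t) * kn).sum() * (t[1] - t[0])

f = np.cos          # continuous on [-1, 1] with f(0) = 1
vals = [integral_f_kn(f, n) for n in (10, 100, 1000)]
assert abs(vals[-1] - 1.0) < 1e-2
```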
10.2 Divergence of Fourier series
From the approximation of the δ function, we can show that there exists a continuous function f whose Fourier series diverges at 0.
We look at the set of continuous functions on S¹, the unit circle. We say f(θ) has Fourier series ∑_{n=−∞}^{∞} a_n e^{inθ} with

a_n = (1/2π) ∫_{−π}^{π} f(θ) e^{−inθ} dθ.
The Fourier series converges at 0 if

lim_{N→∞} ∑_{n=−N}^{N} a_n = f(0).
Now

∑_{n=−N}^{N} a_n = (1/2π) ∫_{−π}^{π} f(θ) k_N(θ) dθ,

where

k_N(θ) = ∑_{n=−N}^{N} e^{−inθ} = sin((N + 1/2)θ) / sin(θ/2).
So the convergence of the Fourier series at 0 for every continuous f is equivalent to (1/2π)k_N being an approximation to the δ function. And if (3) fails, then ∑_{n=−N}^{N} a_n does not converge for some f.
Since |sin x| ≤ |x|,

|1/sin(x/2)| ≥ 2/|x|,

and therefore

∫_{−π}^{π} |k_N(θ)| dθ ≥ 2 ∫_{−π}^{π} |sin((N + 1/2)θ)| dθ/|θ| = 4 ∫_0^{(N+1/2)π} |sin x| dx/x ≥ c log N.
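The logarithmic growth of ∫|k_N| can be observed numerically; the grid size below is an arbitrary choice.

```python
import numpy as np

# L^1 norms of the Dirichlet kernel k_N(t) = sin((N + 1/2)t) / sin(t/2)
# on [-pi, pi]; they grow like log N, which is why property (3) fails.
def dirichlet_l1(N, pts=400001):
    t = np.linspace(-np.pi, np.pi, pts)
    dt = t[1] - t[0]
    t = t[np.abs(t) > 1e-12]           # skip the removable singularity at 0
    kN = np.sin((N + 0.5) * t) / np.sin(t / 2.0)
    return np.abs(kN).sum() * dt

norms = [dirichlet_l1(N) for N in (4, 16, 64)]
assert norms[0] < norms[1] < norms[2]  # unbounded growth in N
```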
10.3 The Herglotz theorem
Theorem 10.2 (Herglotz) Let f be analytic in {|z| < 1} and h = Re f ≥ 0. Then

f(z) = ∫_0^{2π} (1 + ze^{−iθ})/(1 − ze^{−iθ}) m(dθ) + ic

for some positive measure m and some constant c. If f satisfies the above equation, f is analytic with positive real part. The measure m is uniquely determined by f.
Proof. For R < 1 we have the Poisson integral formula:

f(z) = (1/2π) ∫_0^{2π} (R + ze^{−iθ})/(R − ze^{−iθ}) h(Re^{iθ}) dθ + ic.
When z = 0,

h(0) = (1/2π) ∫_0^{2π} h(Re^{iθ}) dθ.
Take a sequence R_n ↑ 1. Define ℓ_n on C(S¹) by

ℓ_n(u) = (1/2π) ∫_0^{2π} h(R_n e^{iθ}) u(θ) dθ.
Since h ≥ 0,

|ℓ_n(u)| ≤ (1/2π) |u| ∫_0^{2π} h(R_n e^{iθ}) dθ = |u| h(0).
C(S¹) is separable, so by Helly's theorem (Theorem 9.6), there exists a subsequence (also called ℓ_n) which is weak∗ convergent, say to ℓ. If |u_n − u| → 0 and ℓ_n(u) → ℓ(u), then

|ℓ_n(u_n) − ℓ_n(u)| ≤ |ℓ_n| |u_n − u| → 0,

so ℓ_n(u_n) → ℓ(u).
Let

u_n(θ) = (R_n + ze^{−iθ})/(R_n − ze^{−iθ}),   u(θ) = (1 + ze^{−iθ})/(1 − ze^{−iθ}).
Fix z. Then |u_n − u| → 0, and therefore ℓ_n(u_n) → ℓ(u); by the Poisson formula, f(z) = ℓ(u) + ic. The ℓ_n are positive linear functionals, so ℓ is too. By the Riesz representation theorem, there exists a positive measure m such that

ℓ(u) = ∫_0^{2π} u(θ) m(dθ).
It is routine to check that if f(z) has the representation in the theorem,then f is analytic.
If we take real parts in the representation of f,

h(z) = ∫_0^{2π} (1 − r²)/(1 − 2r cos(φ − θ) + r²) m(dθ),
where z = re^{iφ}. Multiply by u(φ) and integrate to get

∫_0^{2π} h(re^{iφ})u(φ) dφ = ∫_0^{2π} [ ∫ (1 − r²)/(1 − 2r cos(φ − θ) + r²) u(φ) dφ ] m(dθ).

The expression inside the brackets is 2π u_r(θ), where u_r is the Poisson integral of u. Letting r → 1, u_r → u uniformly, so

lim_{r→1} ∫ h(re^{iφ})u(φ) dφ = 2π ∫_0^{2π} u(θ) m(dθ).

The left hand side does not depend on m. If m₁ and m₂ are two such measures, then

∫_0^{2π} u(θ) m₁(dθ) = ∫_0^{2π} u(θ) m₂(dθ)

for all continuous u. Therefore m₁ = m₂.
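The identity used when taking real parts — that the real part of the Herglotz kernel is the Poisson kernel — can be checked numerically at an arbitrary point:

```python
import numpy as np

# Re[(1 + z e^{-it}) / (1 - z e^{-it})] with z = r e^{i phi} equals the
# Poisson kernel (1 - r^2) / (1 - 2 r cos(phi - t) + r^2).
r, phi, t = 0.7, 0.3, 1.1            # arbitrary sample values
z = r * np.exp(1j * phi)
w = z * np.exp(-1j * t)

herglotz_re = ((1 + w) / (1 - w)).real
poisson = (1 - r ** 2) / (1 - 2 * r * np.cos(phi - t) + r ** 2)
assert abs(herglotz_re - poisson) < 1e-12
```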
11 Weak and weak∗ topologies
The weak topology is the coarsest topology (i.e., fewest sets) in which allbounded linear functionals are continuous.
Bounded linear functionals are continuous in the usual norm topology(also called the strong topology), so the weak topology is coarser than thestrong topology.
We argue that every open set in the weak topology is the union of finite intersections of sets of the form {x : a < ℓ(x) < b}. Let

S = { {x : a_i < ℓ_i(x) < b_i, i = 1, . . . , n} : n ∈ N, a_i < b_i, ℓ_i ∈ X∗ }.
We claim S is a basis for the weak topology, that is, every open set is a union of elements of S. Note that each element of S is in the weak topology because ℓ_i⁻¹((a_i, b_i)) must be in the weak topology, so ∩_{i=1}^{n} ℓ_i⁻¹((a_i, b_i)) is an element of the weak topology. Note also that unions of elements of S form a topology and each ℓ is continuous with respect to this topology. Therefore S is a basis for the weak topology.
Nonempty finite intersections of sets in S are unbounded if X is infinite dimensional, so every nonempty open set in the weak topology when X is infinite dimensional is unbounded. The open unit ball {x : |x| < 1} is therefore a set that is open in the strong topology but not the weak topology.
Given a set S, the weak sequential closure of S is

{x : there exist x_n ∈ S with x_n →ʷ x}.
If the weak sequential closure of S is equal to S, then S is said to be weaklysequentially closed.
Proposition 11.1 (1) The weak sequential closure of S is contained in theclosure of S with respect to the weak topology.
(2) If X is infinite dimensional, there are sets that are weakly sequentially closed but are not closed in the weak topology.
Proof. 1) If x is not in the weak closure of S, there exists an open set A in the weak topology such that x ∈ A and A ∩ S = ∅. We can choose A to be a finite intersection of sets of the form {x : a < ℓ(x) < b}, so

A = ∩_{i=1}^{n} {x : a_i < ℓ_i(x) < b_i}.

If x is in the weak sequential closure of S, there exist x_k ∈ S such that x_k →ʷ x. For each i, ℓ_i(x_k) → ℓ_i(x), so for k large enough, ℓ_i(x_k) ∈ (a_i, b_i). Taking k large enough, ℓ_i(x_k) ∈ (a_i, b_i) for all i ≤ n, hence x_k ∈ A, contradicting A ∩ S = ∅.
2) Let X_k be finite dimensional subspaces of X with dim X_k = k. Let S_k = {x_{kj}} be a 1/k-net for the sphere {x ∈ X_k : |x| = k}, and let S = S₂ ∪ S₃ ∪ · · ·.
0 is in the closure of S in the weak topology: If A is open and contains 0, then A contains a subset of the form {x : |ℓ_i(x)| < ε, i = 1, . . . , n} with |ℓ_i| = 1. If k > n, then the dimension of X_k is greater than the number of linear functionals, and so there exists x_k ∈ X_k such that |x_k| = k and ℓ_i(x_k) = 0 for each i. There exists x_{kj} ∈ S_k within 1/k of x_k, so

|ℓ_i(x_{kj})| = |ℓ_i(x_{kj} − x_k)| ≤ |x_{kj} − x_k| < 1/k.

So for k > 1/ε, x_{kj} ∈ A.
S contains no nontrivial weakly convergent sequences: in any ball of radius R, S has only finitely many points. Now use the uniform boundedness principle: if x_n →ʷ x, then |x_n| is bounded; see Corollary 9.3.
Proposition 11.2 Suppose K is convex and contained in a Banach space X. If K is closed in the strong topology, then K is closed in the weak topology.
Proof. Suppose z ∉ K. We show z is not in the weak closure of K. K is closed in the strong topology, so there exists B_R(z) that is disjoint from K. By the hyperplane separation theorem, there exist a nonzero linear functional ℓ and c such that

ℓ(u) ≤ c ≤ ℓ(v),   u ∈ K, v ∈ B_R(z).
If w ∈ B_R(0) and v = −w + z, then

ℓ(−w) = ℓ(v − z) = ℓ(v) − ℓ(z) ≥ c − ℓ(z).

So ℓ(w) ≤ ℓ(z) − c. Applying this to both w and −w, it follows that |ℓ(w)| is bounded for w ∈ B_R(0), which implies ℓ is bounded.
If v ∈ B_R(z), then v = z + x with |x| < R. Then

c ≤ inf_{v∈B_R(z)} ℓ(v) = ℓ(z) + inf_{|x|<R} ℓ(x) = ℓ(z) − R|ℓ|.

Therefore ℓ(z) > c. (Before we only had ℓ(z) ≥ c.) So A = {x : ℓ(x) > c} contains z but no point of K. Therefore A is open in the weak topology, and z is not in the weak closure of K.
11.1 The Alaoglu theorem
Consider X∗, where X is a Banach space. For x ∈ X, define L_x : X∗ → R by L_x(ℓ) = ℓ(x). The weak∗ topology is the coarsest topology on X∗ with respect to which all the L_x with x ∈ X are continuous.
Theorem 11.3 (Alaoglu) The closed unit ball B in X∗ is compact in theweak∗ topology.
There is a connection with the Prohorov theorem of probability theory. Let X = C(S) where S is a compact Hausdorff space. If µ_n is a sequence of probability measures, then the µ_n are elements of the closed unit ball in X∗.
The Alaoglu theorem implies there must be a subsequence which convergesin the weak∗ sense.
Proof. If u ∈ B, then |u(x)| ≤ |x|. Let

P = ∏_{x∈X} I_x,
where Ix = [−|x|, |x|]. Map B into P by setting ϕ(u) = {u(x)}, the functionwhose xth coordinate is u(x). The topology on P that we use is the coarsestone where all the projections u → u(x) are continuous. As in the proof ofthat S is a basis for the weak topology, we can check that finite intersectionsof sets of the form {u : a < u(x) < b} are a basis for this topology, whichis the product topology. The same argument shows that it is a basis for theweak∗ topology. By Tychonov’s theorem, P is compact. So it suffices toshow that ϕ(B) is closed.
Let p be in the closure of ϕ(B). We will show p = ϕ(u) for some u ∈ B. If px denotes the xth component of p, define u on X by u(x) = px.

If q ∈ ϕ(B), then qx + qy = qx+y and qax = aqx, since q = ϕ(v) for some v ∈ X∗. Since p is in the closure of ϕ(B), there exist qn ∈ ϕ(B) such that (qn)x → px, (qn)y → py, (qn)x+y → px+y, and (qn)ax → pax. Passing to the limit, px + py = px+y and pax = apx, so u is linear. Since px ∈ Ix, then |u(x)| ≤ |x|, so the norm of u is at most 1. Therefore p = ϕ(u) with u ∈ B.
Corollary 11.4 Suppose S ⊂ X∗ is closed in the weak∗ topology. Then S isweak∗ compact if and only if it is bounded in norm.
Proof. If S is bounded in norm, then S is contained in a closed ball aboutthe origin. The closed ball is weak∗ compact, and S is a weak∗ closed subset,hence compact.
Now suppose S is weak∗ compact. For each x, the image of S under the map u → u(x) is compact, since continuous functions map compact sets to compact sets. Therefore, for each x, {u(x) : u ∈ S} is a bounded set. By the uniform boundedness principle, S is bounded in norm.
12 Locally convex topological spaces
We look at topologies other than those defined in terms of linear functionals.
A locally convex topological (LCT) linear space is a linear space over thereals with a Hausdorff topology satisfying
1) (x, y) → x + y is a continuous mapping from X × X to X.
2) (k, x)→ kx is a continuous mapping from F ×X → X.
3) Every open set containing the origin contains a convex open set con-taining the origin.
It is an exercise to show that the weak and weak∗ topologies are locally convex (i.e., satisfy 3)).
Proposition 12.1 In a LCT linear space,
(1) if T is open, so are T − x, kT for k ≠ 0, and −T.
(2) Every point of an open set T is interior to T .
Proof. (1) T − x is the inverse image of T under the map y → x + y. For k ≠ 0, kT is the inverse image of T under y → y/k, and similarly for −T.
(2) Suppose 0 ∈ T . Fix x ∈ X. k → kx is continuous, so {k : kx ∈ T} isopen. Since 0 ∈ T , then 0 is in this set, and therefore there exists an intervalabout 0 such that kx ∈ T if k is in this interval. This is true for all x, andtherefore 0 is an interior point. Use translation if the point we are interestedin is other than 0.
12.1 Separation of points
In a LCT space, we can talk about continuous linear functionals, but notbounded linear functionals.
Proposition 12.2 Continuous linear functionals on a LCT linear space X separate points: if y ≠ z, there exists ` such that `(y) ≠ `(z).
Proof. Without loss of generality assume y = 0. There exists an open set T that contains 0 but not z, since the topology is Hausdorff. We can take T to be convex. 0 is interior to T, so the gauge function pT is finite. Recall pT(u) < 1 if u ∈ T.
By the hyperplane separation theorem, there exists ` such that `(z) = 1and `(x) ≤ pT (x) for all x. Since `(y) = `(0), then ` separates.
It remains to prove that ` is continuous.
H = {w : `(w) < c} is open: if w ∈ H and u ∈ T , let r = c− `(w). Then
`(w + ru) = `(w) + r`(u) ≤ `(w) + rpT (u) < `(w) + c− `(w) = c,
so w + ru ∈ H. Hence H is open.
Making T symmetric about the origin by replacing T with T ∩ (−T), a similar argument shows {w : `(w) > d} is open.
Using the extended hyperplane separation theorem, we have
Corollary 12.3 Let K be a closed convex set in a LCT space and z ∉ K. There exists a continuous linear functional ` such that `(y) ≤ c for y ∈ K and `(z) > c.
12.2 Krein-Milman theorem
Theorem 12.4 Let K be a nonempty, compact, convex subset of a LCT linear space X. Then
1) K has at least one extreme point.
2) K is the closure of the convex hull of its extreme points.
Proof. 1) Let {Ej} be the collection of all nonempty closed extreme subsets of K. It is nonempty because it contains K. We partially order by inclusion. We show that every totally ordered subcollection has a lower bound, namely ∩jEj; Zorn's lemma then gives a minimal element. (To apply Zorn's lemma as usually stated, say E1 ≤ E2 if the complement of E1 is contained in the complement of E2, and translate what a maximal element and an upper bound mean.)
The intersection of any finite totally ordered subcollection {Ej} is just the smallest one. Since K is compact, by the finite intersection property the intersection of any totally ordered subcollection is nonempty. (If ∩Ej = ∅, then the complements of the Ej form an open cover of K, so there is a finite subcover, and then the intersection of those finitely many Ej is empty, a contradiction.) The intersection of closed sets is closed, and it is easy to check that the intersection of extreme sets is extreme.
By Zorn's lemma, there is a minimal element E. We claim E is a single point. If not, there exists a continuous linear functional ` that separates two of the points of E. Let µ be the maximum value of ` on E. Since E is compact, this maximum value is attained. Let M = {x ∈ E : `(x) = µ}. Then M ≠ E since ` is not constant on E. ` is continuous and E is closed, so M is closed. The set where ` attains its maximum over E is extreme in E, and since E is extreme in K, M is extreme in K. But this contradicts the minimality of E.
2) Let Ke be the set of extreme points of K. We'll show that if z is not in the closure of the convex hull of Ke, then z ∉ K. By Corollary 12.3 there exists a continuous linear functional ` such that `(y) ≤ c for y in that closed convex hull (in particular for y ∈ Ke) and `(z) > c. K is compact and ` is continuous, so ` achieves its maximum on a closed extreme subset E of K. By part 1), E contains an extreme point p, and since E is extreme in K, p is an extreme point of K, i.e., p ∈ Ke. Hence `(p) ≤ c. Since `(p) = max_K `, then `(x) ≤ `(p) ≤ c for all x ∈ K. Since `(z) > c, then z ∉ K.
12.3 Choquet’s theorem
Here is a theorem of Caratheodory.
Theorem 12.5 Suppose K is a nonempty compact convex subset of a LCT linear space X. Let Ke be the set of extreme points and K̄e its closure. If u ∈ K, there exists a measure mu of total mass 1 on K̄e such that

u = ∫_{K̄e} e mu(de)

in the weak sense.
A measure with total mass one is a probability measure, but this theoremhas nothing to do with probability.
The equation holding in the weak sense means

`(u) = ∫_{K̄e} `(e) mu(de)

for all continuous linear functionals `.
Proof. Let ` be a continuous linear functional and let m, M be the minimum and maximum of ` on K. K is compact, so these values are achieved. The set {x ∈ K : `(x) = m} is an extreme subset of K, and similarly with m replaced by M; each contains extreme points of K. So if u ∈ K,

min_{p∈Ke} `(p) ≤ `(u) ≤ max_{p∈Ke} `(p).     (12.1)

If `1 and `2 are equal on Ke, then applying the above to `1 − `2 shows they are equal on K.
Let L be the class of continuous functions on K̄e that are restrictions of continuous linear functionals. Fix u and define r on L by

r(`) = `(u).

By the previous paragraph, r is well defined. If L contains the constant function 1, then by (12.1) we have r(1) = 1. If L does not contain the constant functions, adjoin the constant function f0 = 1 to L and set r(f0) = 1. The set L is a linear subspace of C(K̄e). Check that r is a positive linear functional on L.
Now use Hahn–Banach to extend r from L to C(K̄e), dominating r by the sublinear function f → max_{K̄e} f so that the extension remains positive.
K̄e is a closed subset of K, hence compact, and r is a positive linear functional on C(K̄e). By the Riesz representation theorem from measure theory, there exists a measure m such that

r(f) = ∫_{K̄e} f dm.

Since r(f0) = 1, then m(K̄e) = 1. Taking f to be the restriction of a continuous linear functional ` gives `(u) = ∫_{K̄e} `(e) m(de), which is the asserted representation.
An example: in R3, let K be the convex hull of the unit circle in the (x, y) plane together with the segment {(1, 0, z) : |z| ≤ 1}. Every point of the circle except (1, 0, 0) is extreme, and (1, 0, 0) lies in the closure of Ke, but (1, 0, 0) ∉ Ke. So the collection of extreme points need not be closed.
Choquet’s theorem is an important extension of Caratheodory’s theoremin that we can take the integral to be over Ke rather than its closure, providedK is metrizable.
Theorem 12.6 Let K be a nonempty, compact, convex, metrizable subset of a LCT space X. If u ∈ K, there exists a measure mu of total mass 1 on Ke such that

u = ∫_{Ke} e mu(de)

in the weak sense.
13 Examples of convex sets and extreme points
Let Q be a compact Hausdorff space and X = C(Q). Let P be the positive linear functionals ` on X = C(Q) such that `(1) = 1. If r ∈ Q, define er ∈ P by er(f) = f(r).
Proposition 13.1 P is convex and the set of extreme points is {er}.
Proof. Convexity is easy. Suppose er were not extreme. Then er = am + (1 − a)`, where m, ` ∈ P, m ≠ `, and a ∈ (0, 1). Choose f ∈ C(Q) such that f ≥ 0 and f(r) = 0. Then
er(f) = f(r) = 0 = am(f) + (1− a)`(f).
Since m(f), `(f) ≥ 0 and a ∈ (0, 1), then `(f) = m(f) = 0.
Now take any f ∈ C(Q) with f(r) = 0. Writing f = f+ − f−, we have f+(r) = f−(r) = 0, and by the above `(f+) = `(f−) = 0 and the same for m. Combining, `(f) = m(f) = 0. Thus Ner ⊂ N`, Nm. The null space of a non-zero linear functional has codimension 1, so N` = Nm = Ner, and `, m are constant multiples of er. Since `(1) = m(1) = 1, then ` = m = er. Therefore er is extreme.
Now let ` be an extreme element of P. By the Riesz representation theorem there exists a unique measure µ such that `(f) = ∫ f dµ. Since `(1) = 1, then µ(Q) = 1. If the support of µ is not a single point, we can write µ = aµ1 + (1 − a)µ2 with a ∈ (0, 1), where µ1, µ2 are probability measures that are not equal. Let `j(f) = ∫ f dµj. Then ` = a`1 + (1 − a)`2; since ` is extreme, `1 = `2, which implies µ1 = µ2, a contradiction. Hence the support of µ is a single point, and since µ(Q) = 1, µ(dx) = δr(dx) for some r. Therefore `(f) = ∫ f(x) δr(dx) = f(r), or ` = er.
Choquet's representation here is

` = ∫ er m(dr).
13.1 Convex functions
Let C be the set of convex functions f on [0, 1] such that f(0) = 0, f(1) = 1, and f ≥ 0. Clearly C is a convex set. For r ∈ [0, 1), let er(x) be the function that is 0 on [0, r] and linear on [r, 1] with er(1) = 1, and let e1(x) be 0 on [0, 1) and 1 at 1.
Proposition 13.2 {er} are the extreme points of C.
Proof. Suppose er = af + (1 − a)g with f, g ∈ C and a ∈ (0, 1). Since f, g ≥ 0, both must be 0 on [0, r]. Since f(r) = 0 and f(1) = 1, convexity forces f to lie on or below the chord connecting (r, 0) and (1, 1), which is er on [r, 1]; the same holds for g. Since af + (1 − a)g = er, this forces f = g = er. Therefore er is extreme.
Now suppose f ∈ C is extreme. We will show in a minute that we can write

f(x) = ∫_0^1 er(x) m(dr)     (13.1)

for some measure m on [0, 1] with total mass 1. As in the proof above for P, the measure m must be supported on a single point, and hence f = er for some r.

To prove (13.1), recall that a convex function is differentiable almost everywhere and its derivative f′ is increasing. Let µ be the Lebesgue–Stieltjes measure associated with the right-continuous version of f′, with an atom at 0 of mass f′(0+), so that f′(r) = µ([0, r]) for almost every r. Set m(dr) = (1 − r)µ(dr).

If r > x, then er(x) = 0. Then, using Fubini's theorem,

f(x) = f(x) − f(0) = ∫_0^x f′(r) dr = ∫_0^x ∫_{[0,r]} µ(ds) dr
     = ∫_{[0,x]} ∫_s^x dr µ(ds) = ∫_{[0,x]} (x − s) µ(ds)
     = ∫_{[0,x]} ((x − s)/(1 − s)) (1 − s) µ(ds) = ∫_0^1 er(x) m(dr).
Since f(1) = 1 and er(1) = 1, we conclude m([0, 1]) = 1.
f = ∫_0^1 er dm is the Choquet representation for f.
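As a concrete check of (13.1), the following sketch (our own illustration, not from the text) takes f(x) = x², so that µ(dr) = f″(r) dr = 2 dr and m(dr) = 2(1 − r) dr, and verifies numerically that m has total mass 1 and that ∫₀¹ e_r(x) m(dr) reproduces f(x).

```python
import numpy as np

N = 20000
r = (np.arange(N) + 0.5) / N        # midpoints of [0, 1]
dr = 1.0 / N
density = 2.0 * (1.0 - r)           # dm/dr for f(x) = x^2, since f'' = 2

def e(r, x):
    # e_r is 0 on [0, r] and linear from (r, 0) to (1, 1)
    return np.where(x <= r, 0.0, (x - r) / (1.0 - r))

total_mass = np.sum(density) * dr       # should be 1
x0 = 0.37
fx = np.sum(e(r, x0) * density) * dr    # should be x0^2

print(total_mass, fx)
```

The same recipe works for any smooth f in C: replace `density` by (1 − r)f″(r).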
14 Bounded linear maps
14.1 Boundedness and continuity
A linear map M (the same as a linear operator or linear transformation) from a Banach space X to a Banach space U is continuous if xn → x implies Mxn → Mx. M is bounded if there exists a constant c such that |Mx| ≤ c|x| for all x.
The following is proved just as for linear functionals.
Proposition 14.1 M is continuous if and only if it is bounded.
If X and U are only normed linear spaces and M is bounded, then M can be extended by continuity to a map from the completion of X to the completion of U.
Define

|M| = sup_{x≠0} |Mx|/|x|,

or, what is the same,

|M| = sup_{|x|=1} |Mx|.
Proposition 14.2 |aM | = |a| |M |, |M | ≥ 0 and equals 0 if and only ifM = 0, and |M +N | ≤ |M |+ |N |.
The proofs are easy.
We write L(X, U) for the set of bounded linear maps from a Banach space X to a Banach space U.
Proposition 14.3 L is itself a Banach space.
The proof is the same as for linear functionals. There we used the completeness of R or C; here we use the completeness of U.
Proposition 14.4 NM is closed.
Proof. {0} is closed, M is continuous, so NM = M−1({0}) is closed.
Suppose M : X → U. We define the transpose M′ : U∗ → X∗ as follows: if ` ∈ U∗, then x → `(Mx) is a bounded linear functional on X, and we call this linear functional M′`.
We sometimes use the notation `(u) = (u, `) and ξ(x) = (x, ξ). With thisnotation,
(Mx, `) = `(Mx) = M ′`(x) = (x,M ′`),
which justifies the name adjoint or transpose.
If R is a subspace of U , then R⊥ is the set of bounded linear functionalsthat vanish on R, called the annihilator of R. R⊥ ⊂ U∗.
If S ⊂ X∗, then S⊥ consists of those vectors that are annihilated by everylinear functional in S. S⊥ ⊂ X.
Proposition 14.5 (1) M′ is bounded and |M′| = |M|.
(2) NM′ = (RM)⊥, where RM denotes the range of M.
(3) NM = (RM′)⊥.
(4) (M + N)′ = M′ + N′.
Proof. (1) |M′| = sup_{|`|=1} |M′`| (recall M′` ∈ X∗) and

|M′`| = sup_{|x|=1} |M′`(x)| = sup_{|x|=1} |`(Mx)|.

So, using the Hahn–Banach theorem for the second equality,

|M′| = sup_{|`|=1, |x|=1} |`(Mx)| = sup_{|x|=1} |Mx| = |M|.
(2) Suppose ` ∈ NM ′ . Then M ′` = 0. If x ∈ X, 0 = M ′`(x) = `(Mx), or` ∈ R⊥M . On the other hand if ` ∈ R⊥M , then (M ′`)(x) = `(Mx) = 0, henceM ′` = 0, or ` ∈ NM ′ .
(3) is similar and (4) is easy.
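In finite dimensions the transpose is literally the matrix transpose, and the defining identity (Mx, `) = (x, M′`) can be checked directly. A small sketch (our own illustration):

```python
import numpy as np

rng = np.random.default_rng(3)
M = rng.standard_normal((3, 5))   # a map from R^5 to R^3
x = rng.standard_normal(5)
l = rng.standard_normal(3)        # a functional on R^3, via the dot product

lhs = np.dot(M @ x, l)            # (Mx, l)
rhs = np.dot(x, M.T @ l)          # (x, M' l)
print(lhs, rhs)                   # equal up to rounding
```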
14.2 Strong and weak topologies
We define some topologies on L(X,U).
1) Uniform: the topology given by the norm |M| on L(X, U).
2) Strong: for each x ∈ X, let Rx : L → U be defined by RxM = Mx. Thestrong topology is the coarsest one where all the Rx are continuous functions.
3) Weak topology: for each x ∈ X and ` ∈ U∗, let Sx` : L → R be definedby Sx`M = (Mx, `) = `(Mx). The weak topology is the coarsest one suchthat all the Sx` are continuous.
We say Mn converges strongly to M if |Mnx − Mx| → 0 for all x. Mn converges weakly to M if Mnx converges weakly to Mx for all x, that is, `(Mnx) → `(Mx) for all ` ∈ U∗.
Proposition 14.6 Suppose X and U are Banach spaces, Mn ∈ L(X, U), and |Mn| ≤ c for all n. Suppose s-lim Mnx exists for a dense subset of x's. Then Mnx converges for every x ∈ X.

Proof. Let ε > 0, and given x choose xk in the dense subset such that |x − xk| < ε. Then for any m, n,

|Mnx − Mmx| ≤ |Mnx − Mnxk| + |Mnxk − Mmxk| + |Mmxk − Mmx| ≤ 2cε + |Mnxk − Mmxk|.

Since Mnxk converges, the last term tends to 0 as m, n → ∞. Hence Mnx is a Cauchy sequence, and since U is complete, Mnx converges.
14.3 Uniform boundedness principle
Proposition 14.7 If Mν is a collection of elements of L(X, U), where X, U are Banach spaces, and supν |Mνx| is finite for each x, then there exists c such that |Mν| ≤ c for all ν.
The proof is nearly identical to the linear functional case.
Theorem 14.8 Suppose X and U are Banach spaces, Mν is a collection of elements of L(X, U), and supν |(Mνx, `)| < ∞ for all x ∈ X and all ` ∈ U∗. Then there exists c such that |Mν| ≤ c for all ν.
Proof. Fix x. Let uν = Mνx. Then supν |`(uν)| < ∞ for each ` ∈ U∗. By the corollary to the uniform boundedness principle for linear functionals,

supν |Mνx| = supν |uν| < ∞.

Now apply the preceding proposition.
14.4 Composition of maps
Proposition 14.9 Suppose X, U, W are Banach spaces, M is a bounded linear map from X to U, and N is a bounded linear map from U to W. Then
(1) |NM| ≤ |N| |M|.
(2) (NM)′ = M′N′.

Proof. (1) |NMx| ≤ |N| |Mx| ≤ |N| |M| |x|.
(2) (NMx, m) = (Mx, N′m) = (x, M′N′m).
14.5 Open mapping principle
Theorem 14.10 Suppose M is a bounded linear map from a Banach space X onto a Banach space U. Then there exists d > 0 such that Bd(0) ⊂ M(B1(0)).
The “onto” condition is important.
Proof. M is onto, so ∪nM(Bn(0)) = U . By the Baire category theorem,at least one of M(Bn(0)) is dense in an open set in U . If V = M(Bn(0))and the open set contains a ball centered at y0, then V − y0 is dense insome open ball around the origin. Since M is onto, there exists x0 suchthat Mx0 = y0. So M(Bn(0) − x0) is dense in an open ball about 0. SinceBn(0) − x0 ⊂ Bn+|x0|(0), then M(Bn+|x0|(0)) is dense in a ball about 0. Byhomogeneity, M(B1(0)) is dense in Br(0) for some r, and then M(Bc(0)) isdense in Bcr(0) for all c > 0.
We show M(B2(0)) contains Br(0). Let u ∈ Br(0).
There exists x1 such that |u−Mx1| < r/2 and |x1| < 1. Because u−Mx1 ∈Br/2(0), there exists x2 such that
|(u−Mx1)−Mx2| < r/4, |x2| < 1/2.
We continue with this construction.
Since Σ_{i=m}^n |xi| ≤ Σ_{i=m}^n 2^{−i+1}, then

|Σ_{i=m}^n xi| → 0

as m, n → ∞. Since X is complete, the sequence Σ_{i=1}^n xi converges, say to x. We have |x| ≤ Σ |xi| < 2. M is bounded, so M(Σ_{i=1}^n xi) → Mx. Because |u − M Σ_{i=1}^n xi| ≤ r/2^n → 0, we have Mx = u.
Corollary 14.11 M maps open sets onto open sets.
Corollary 14.12 If M is one-to-one and onto and bounded, then M−1 isbounded.
Proof. If u ∈ U with |u| = d/2, there exists x with |x| < 1 and Mx = u. By homogeneity, for any nonzero u ∈ U there exists x ∈ X with Mx = u and |x| ≤ 2|u|/d. So x = M−1u and |M−1| ≤ 2/d.
A map M : X → U is closed if whenever xn → x and Mxn → u, then Mx = u. This is equivalent to the graph {(x, Mx) : x ∈ X} being a closed set.
If M is continuous, it is closed. If D is the differentiation operator on the set of differentiable functions on [0, 1], then D is closed, but not continuous.
Theorem 14.13 (Closed graph theorem) If X and U are Banach spaces andM a closed linear map, then M is continuous.
Proof. Let G = {g = (x, Mx) : x ∈ X} with norm |g| = |x| + |Mx|. It is easy to see that this is a norm, and that G is complete; completeness is where the closedness of M is used. Define P : G → X by P(x, Mx) = x, the projection onto the first coordinate.
|Pg| = |x| ≤ |x| + |Mx| = |g|, so P is bounded with norm at most 1. P is linear, one-to-one, and onto, so P−1 is bounded; i.e., there exists c such that |g| ≤ c|Pg|. So |Mx| ≤ (c − 1)|x|, which proves M is bounded.
Corollary 14.14 Suppose X has two norms such that if |xn − x|1 → 0 and|xn−y|2 → 0, then x = y. Suppose X is complete with respect to both norms.Then the norms are equivalent.
Proof. Let X1 = (X, | · |1) and similarly X2. Let I : X1 → X2. Thehypothesis is equivalent to I being closed. Therefore I and I−1 are bounded.
Corollary 14.15 Suppose X is a Banach space and X = Y ⊕ Z, where Y and Z are closed linear subspaces. If x = y + z with y ∈ Y and z ∈ Z, define PY x = y and PZ x = z. Then
(1) PY, PZ are linear operators, PY² = PY, PZ² = PZ, PY PZ = 0.
(2) PY, PZ are continuous.
PY and PZ are projections.
Proof. (1) is easy. (2) Since Y, Z are closed and the decomposition is unique, the graphs of PY and PZ are closed. To see this, if xn = yn + zn, xn → x, yn → y′, and zn → z′, then y′ ∈ Y, z′ ∈ Z, and x = y′ + z′. The decomposition is unique, so y′ = PY x and z′ = PZ x.

By the closed graph theorem, PY and PZ are continuous.
15 Distributions
(This is from Appendix B of Lax.)
15.1 Definitions and examples
Let C∞0 be the C∞ functions on Rn with compact support. We will workwith complex-valued functions.
Let Di u = ∂u/∂xi. For a multi-index α = (α1, . . . , αn), let |α| = α1 + · · · + αn, and write Dα for D1^{α1} · · · Dn^{αn}.
We say uk → u if there exists a compact set K such that supp(uk) ⊂ K for all k (here supp(u) is the support of u) and, for each α ∈ Nn, Dαuk converges uniformly to Dαu.
A distribution is an element of the dual of C∞0 , that is, ` is a complex-valued linear functional such that `(uk)→ `(u) whenever uk → u.
For an integer N, let

|u|N = max_{|α|≤N} sup_x |Dαu(x)|.
Proposition 15.1 Let ` be a distribution and K a compact set. There exist N and c depending on K such that if u ∈ C∞0 has support in K, then |`(u)| ≤ c|u|N.

Proof. If not, then for each n there exists un with support in K such that `(un) = 1 and |un|n ≤ 1/n. Then un → 0 in the sense above, but `(un) = 1 while `(0) = 0, a contradiction.
Let D be the set of distributions. We embed C∞0 in D as follows: if v ∈ C∞0, define `v(u) = ∫ uv dx = (u, v).

We will use the notation `(u) = (u, `).
Here are some examples of distributions.
1) If v is a continuous function, define `(u) =∫uv dx.
2) Dirac delta function: δ(u) = u(0).
3) If v is integrable and α is fixed, define `(u) = ∫ (Dαu) v dx.

4) Define

`(u) = PV ∫ u(x)/x dx = lim_{ε→0+} ∫_{|x|>ε} u(x)/x dx.
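The principal value can be checked numerically by folding the integral: for x > ε, the contributions of x and −x combine to (u(x) − u(−x))/x, which removes the singular cancellation. Below is a sketch (our own; the test function u(x) = x e^{−x²} is our choice), for which (u(x) − u(−x))/x = 2e^{−x²} and the principal value equals √π.

```python
import numpy as np

def u(x):
    return x * np.exp(-x**2)

# fold the PV integral onto (0, infinity): PV = ∫_0^∞ (u(x) - u(-x))/x dx
x = np.linspace(1e-8, 10.0, 2_000_001)
h = x[1] - x[0]
integrand = (u(x) - u(-x)) / x            # = 2 exp(-x^2), no singularity left
pv = np.sum((integrand[:-1] + integrand[1:]) * 0.5) * h  # trapezoid rule

print(pv, np.sqrt(np.pi))
```

For an even test function the folded integrand vanishes, recovering the fact that PV of 1/x kills even functions.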
If G is open, let C∞G be the C∞ functions with support contained in G. We can then talk about distributions on G, the continuous linear functionals on C∞G.
Suppose U is a continuous linear map from C∞0 into C∞0. Define T` by

(v, T`) = (Uv, `).

It is an exercise to show that T` is a distribution, and it is easy to check that T`v = `U′v for v ∈ C∞0. Therefore T acts as an extension of U′.
Here are some examples.
1) Let U be multiplication by a C∞ function t. Here T is an extension ofU .
2) Let U = −Di, differentiation. T is an extension of −U .
3) Translation: (Uau)(x) = u(x+ a). Ta is an extension of U−a.
4) Reflection: Ru(x) = u(−x), and T is an extension of R.
5) Convolution: if t is continuous with compact support, define

Uu = (t ∗ u)(x) = ∫ t(y) u(x − y) dy.

T is convolution with Rt.
6) Let φ : Rn → Rn be C∞ and map compact sets to compact sets. Suppose ψ = φ−1 exists and has the same properties. Define Uu(x) = u(φ(x)). Here U′ is given by U′v(y) = v(ψ(y)) J(y), where J is the Jacobian determinant of ψ.
Note that one cannot, in general, define the product of two distributions,or quantities like δ(x2).
15.2 Local properties
Let G be open. We say ` is zero on G if `(u) = 0 for all C∞0 functions u whose support is contained in G.
Lemma 15.2 If ` is zero on G1 and G2, then ` is zero on G1 ∪G2.
Proof. This is just the usual partition of unity proof. Given u with supportin G1∪G2, we will write u = u1+u2 with supp (u1) ⊂ G1 and supp (u2) ⊂ G2.Since `(u) = `(u1) + `(u2) = 0, that will do it.
Fix x ∈ supp(u). Since G1, G2 are open, we can find h = hx such that h is non-negative, h(x) > 0, h is C∞, and the support of h is contained either in G1 or in G2. The set Bx = {y : hx(y) > 0} is open and contains x. By compactness we can cover supp u by finitely many such sets. Look at the corresponding finite collection of h's.
Let h1 be the sum of those h's in the finite collection whose support is in G1 and define h2 similarly. Then let

u1 = (h1/(h1 + h2)) u,    u2 = (h2/(h1 + h2)) u.

Note h1 + h2 > 0 on supp u, so u1 and u2 are C∞ with supports in G1 and G2 respectively.
If we have an arbitrary collection of open sets, ` is zero on each one, and supp u is contained in their union, then by compactness there exist finitely many of them that cover supp u, and by the above (applied repeatedly) `(u) = 0.
The union of all open sets on which ` is zero is an open set on which ` iszero. Its complement is called the support of `.
Example: for the Dirac delta function, the support is {0}. Note that thesupport of Dαδ is also {0}.
Lemma 15.3 If ` is a distribution and supp ` = {0}, then there exists Nsuch that Dαu(0) = 0 for |α| ≤ N implies `(u) = 0.
Proof. Let f be 0 on |x| < 1 and 1 on |x| > 2, and in C∞. Let v = (1−f)u.Since fu = 0 for |x| < 1, then `(fu) = 0 and
`(v) = `(u)− `(fu) = `(u).
Thus it suffices to consider u such that supp u ⊂ B3(0).
There must exist N such that |`(u)| ≤ c|u|N . Define fk(x) = f(kx),uk = fku. Then uk = u if |x| > 2/k.
If |x| < 2/k, since u and its derivatives Dβu with |β| ≤ N vanish at 0, Taylor's theorem gives

|Dβu(x)| ≤ c|x|^{N+1−|β|} ≤ c k^{|β|−N−1}

if |β| ≤ N. Calculus (the product rule) then shows that

|Dαuk(x)| ≤ c k^{|α|−N−1}

if |x| < 2/k.
Since uk = u if |x| > 2/k, then uk → u in CN norm, so `(uk) → `(u). ukis zero in a ball about the origin, so `(uk) = 0, which implies `(u) = 0.
Theorem 15.4 If supp ` = {0}, then there exist N and constants cα such that

` = Σ_{|α|≤N} cα Dα δ.
Proof. If u1, u2 and all their derivatives up to order N agree at 0, then `(u1 − u2) = 0, so `(u1) = `(u2). Thus `(u) depends only on the values of u and its derivatives up to order N at 0, and so

`(u) = Σ_{|α|≤N} cα Dαu |_{x=0}.     (15.1)
To elaborate on the last line of the proof, consider the case where the dimension is 1. We take functions pi that are equal to x^i near 0 but are in C∞0. Choose the cα so that (15.1) holds when u is equal to each pi; then it holds for all u.
We will need the Sobolev inequality. First suppose 1 ≤ p < n and p∗ = np/(n − p), so that

1/p∗ = 1/p − 1/n

and p∗ > p.
Theorem 15.5 Suppose 1 ≤ p < n. There exists a constant C (depending only on n and p) such that

‖u‖_{Lp∗} ≤ C ‖Du‖_{Lp}

for all u ∈ C1 with compact support.
Letting u ≡ 1 shows why compact support is necessary, but C does notdepend on the size of the support.
Proof. 1) Suppose p = 1. For each i, write

u(x) = ∫_{−∞}^{xi} Di u(x1, . . . , xi−1, yi, xi+1, . . . , xn) dyi.

Then

|u(x)| ≤ ∫_{−∞}^{∞} |Du(x1, . . . , yi, . . . , xn)| dyi.

From here on, all integrals are over R. Taking the product over i, we have

|u(x)|^{n/(n−1)} ≤ ∏_{i=1}^n ( ∫ |Du(. . . , yi, . . .)| dyi )^{1/(n−1)}.
Then

∫ |u(x)|^{n/(n−1)} dx1 ≤ ∫ ∏_{i=1}^n ( ∫ |Du| dyi )^{1/(n−1)} dx1
= ( ∫ |Du| dy1 )^{1/(n−1)} ∫ ∏_{i=2}^n ( ∫ |Du| dyi )^{1/(n−1)} dx1
≤ ( ∫ |Du| dy1 )^{1/(n−1)} ∏_{i=2}^n ( ∫∫ |Du| dx1 dyi )^{1/(n−1)},

where the last inequality is from the generalized Hölder inequality (the i = 1 factor does not depend on x1 and comes out of the integral).
Integrating with respect to x2,

∫∫ |u|^{n/(n−1)} dx1 dx2 ≤ ( ∫∫ |Du| dx1 dy2 )^{1/(n−1)} ∫ ∏_{i≠2} Ii^{1/(n−1)} dx2,

where

I1 = ∫ |Du| dy1 and Ii = ∫∫ |Du| dx1 dyi, i = 3, . . . , n.

By the generalized Hölder inequality again,

∫∫ |u|^{n/(n−1)} dx1 dx2 ≤ ( ∫∫ |Du| dx1 dy2 )^{1/(n−1)} ( ∫∫ |Du| dy1 dx2 )^{1/(n−1)} × ∏_{i=3}^n ( ∫∫∫ |Du| dx1 dx2 dyi )^{1/(n−1)}.
We continue, and after integrating with respect to dxn, we obtain thedesired inequality.
2) If p > 1, apply the case p = 1 to v = |u|^γ, where

γ = p(n − 1)/(n − p) > 1.

We obtain

( ∫ |u|^{γn/(n−1)} )^{(n−1)/n} ≤ ∫ |D|u|^γ| dx = γ ∫ |u|^{γ−1} |Du| dx ≤ c ( ∫ |u|^{(γ−1)p/(p−1)} )^{(p−1)/p} ( ∫ |Du|^p )^{1/p}

by Hölder's inequality. Since γn/(n − 1) = (γ − 1)p/(p − 1) = p∗, dividing gives the general case.
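The case p = 1 of the proof actually gives the inequality with constant 1 when |Du| is the length of the gradient. As a numerical sanity check (our own sketch, not from the text), take n = 2, p = 1, p∗ = 2 and the Gaussian u(x, y) = e^{−x²−y²}, for which both sides are computable in closed form: ‖u‖₂ = √(π/2) ≈ 1.2533 and ‖Du‖₁ = π^{3/2} ≈ 5.5683.

```python
import numpy as np

n = 801
L = 6.0
xs = np.linspace(-L, L, n)
h = xs[1] - xs[0]
X, Y = np.meshgrid(xs, xs)
U = np.exp(-X**2 - Y**2)

Uy, Ux = np.gradient(U, h)   # finite-difference derivatives along the two axes

lhs = np.sqrt(np.sum(U**2) * h * h)             # ||u||_{L^2(R^2)}
rhs = np.sum(np.sqrt(Ux**2 + Uy**2)) * h * h    # ||Du||_{L^1(R^2)}

print(lhs, rhs)
```

The inequality `lhs <= rhs` holds with plenty of room; the sharp constant for p = 1 comes from the isoperimetric inequality and is much smaller than 1 here.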
Theorem 15.6 If k < n/p and

1/q = 1/p − k/n,

then

‖u‖_{Lq} ≤ c ‖u‖_{W k,p}.
Proof. Use the above theorem and induction.
Proposition 15.7 If ` has compact support, there exist an integer L and continuous functions gα such that

` = Σ_{|α|≤L} Dα gα.
An example: The delta function is the derivative of h, where h is 0 forx < 0 and 1 for x ≥ 0. h is the derivative of g, where g is 0 for x < 0 andg(x) = x for x ≥ 0. Therefore δ = D2g.
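The identity δ = D²g can be tested through the distributional pairing: (u, D²g) = (g, u″) = ∫₀^∞ x u″(x) dx, and two integrations by parts show this equals u(0) for rapidly decaying u. A sketch (our own; the shifted Gaussian test function is our choice):

```python
import numpy as np

c = 0.5
def u(x):   return np.exp(-(x - c)**2)
def upp(x): return (4.0 * (x - c)**2 - 2.0) * u(x)  # exact second derivative

# pairing (g, u'') with g(x) = max(x, 0):  ∫_0^∞ x u''(x) dx
x = np.linspace(0.0, 12.0, 1_200_001)
h = x[1] - x[0]
vals = x * upp(x)
pairing = (np.sum(vals) - 0.5 * (vals[0] + vals[-1])) * h  # trapezoid rule

print(pairing, u(0.0))  # both approx 0.7788
```

The tail beyond x = 12 is negligible for this test function, so the truncated integral matches u(0) to high accuracy.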
Proof. Let h ∈ C∞0 equal 1 on the support of `. Then `((1 − h)u) = 0, so `(u) = `(hu). By Proposition 15.1 there exist N and c such that

|`(hu)| ≤ c|hu|N ≤ c|u|N.

Hence |`(u)| ≤ c|u|N.
Let

‖u‖M = ( Σ_{|α|≤M} ∫ |Dαu|² dx )^{1/2},

and let HM be the completion of C∞0 with respect to this norm. This is a Hilbert space.
By the Sobolev inequality,

|u|N ≤ c‖u‖M if N < M − n/2.
So any Cauchy sequence in HM is also a Cauchy sequence with respect tothe CN norm. This shows HM can be embedded in CN .
Since |`(u)| ≤ c|u|N ≤ c‖u‖M, then ` is a bounded linear functional on HM. By the Riesz–Fréchet representation theorem, there exists g ∈ HM such that

`(u) = (u, g)M = Σ_{|α|≤M} (Dαu, Dαg).

This is equal to

Σ_{|α|≤M} (−1)^{|α|} (u, D^{2α}g),

using the example of transformations where T is differentiation. So

` = Σ_{|α|≤M} (−1)^{|α|} D^{2α}g.
Since g ∈ HM , then g ∈ CN . Now take L = 2M −N .
Lemma 15.8 Suppose b is a C∞0 function with support contained in B1(0) and ∫ b(x) dx = 1. Let bk(x) = k^n b(kx). Then `k = bk ∗ ` converges to ` in the sense of distributions.
Proof. We have

`k(u) = (bk ∗ `)(u) = (`, Rbk ∗ u).

Rbk is an approximation to the δ function, so

Dα(Rbk ∗ u) = Rbk ∗ Dαu → Dαu

uniformly. Hence Rbk ∗ u converges to u in the C∞0 sense, and this implies `k(u) = (`, Rbk ∗ u) → (`, u) = `(u).
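A one-dimensional sketch of the lemma (our own illustration): take the standard C∞₀ bump on (−1, 1), normalized to integrate to 1, and mollify g(x) = |x|, which is continuous but not smooth at 0. The error at the kink decays like 1/k.

```python
import numpy as np

def bump(x):
    # the standard smooth bump supported in (-1, 1)
    out = np.zeros_like(x)
    inside = np.abs(x) < 1.0
    out[inside] = np.exp(-1.0 / (1.0 - x[inside]**2))
    return out

y = np.linspace(-1.0, 1.0, 200001)
hy = y[1] - y[0]
Z = np.sum(bump(y)) * hy          # normalizing constant

def mollified(g, k, x0):
    # (b_k * g)(x0) = ∫ k b(k y) g(x0 - y) dy, after substituting y -> y/k
    return np.sum(bump(y) * g(x0 - y / k)) * hy / Z

g = np.abs
errs = [abs(mollified(g, k, 0.0) - g(0.0)) for k in (1, 4, 16, 64)]
print(errs)  # decreasing, roughly like 1/k
```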
Proposition 15.9 Suppose g is continuous and Dj g is continuous, where Dj g is the partial derivative of g in the distribution sense. Then Dj g is also the partial derivative of g in the classical sense.
Proof. Let gk = bk ∗ g. Then gk → g in the sense of distributions. Dj gk = bk ∗ Dj g, so gk → g and Dj gk → Dj g uniformly on compact subsets, since g and Dj g are continuous. If b − a is a multiple of the unit vector in the xj direction, we have

gk(b) − gk(a) = ∫_a^b Dj gk dxj.

Passing to the limit,

g(b) − g(a) = ∫_a^b Dj g dxj.
A distribution ` is positive if `(u) ≥ 0 whenever u is non-negative.
Examples: ` is given by a non-negative measure, `(u) =∫u dm.
Let |u| = sup |u(x)|.
Lemma 15.10 Suppose ` is a positive distribution and K is compact. Thereexists c depending on K such that |`(u)| ≤ c|u| for all u supported in K.
Proof. Let p ∈ C∞0 with p ≥ 0 and p = 1 on K. Then |u|p − u ≥ 0, so `(u) ≤ |u|`(p). Similarly |u|p + u ≥ 0, so −`(u) ≤ |u|`(p). Now let c = `(p).
Proposition 15.11 Every positive distribution is a measure.
Proof. Extend ` to all continuous functions with compact support. Thenuse Riesz representation.
15.3 Fourier transforms
Let S be the class of complex-valued C∞ functions u such that |x|^β |Dαu(x)| → 0 as |x| → ∞ for every multi-index α and every positive integer β. S is called the Schwartz class. An example of an element of the Schwartz class is e^{−|x|²}.

Define |u|β,α = sup_x |x|^β |Dαu(x)|. We say un ∈ S converges to u if |un − u|β,α → 0 for all β, α.
We can find a metric for S: define

d(u, v) = Σ_{α,β} 2^{−β−|α|} |u − v|β,α / (1 + |u − v|β,α).
The dual of S is the set of continuous linear functionals on S. Here continuity means that if un → u in the sense of the Schwartz class, then `(un) → `(u).
Since C∞0 ⊂ S, then S∗ ⊂ D. This means that every element of S∗ is adistribution. Elements of S∗ are called tempered distributions.
Any distribution of compact support is a tempered distribution. So is `(u) = ∫ vu dx if v grows no faster than some power of |x| as |x| → ∞.
For u ∈ S, define the Fourier transform Fu = û by

û(ξ) = (2π)^{−n/2} ∫ u(x) e^{ix·ξ} dx.
Theorem 15.12 F maps S into S continuously.
Proof. For u ∈ S, Dα Fu = F((ix)^α u). If u ∈ S, then x^α u decays faster than any power of |x|, so x^α u ∈ L1. This implies Dα Fu is a continuous function; therefore Fu ∈ C∞.

By integration by parts, ξ^β Dα Fu = i^{|α|+|β|} F(Dβ(x^α u)). By the product rule, Dβ(x^α u) is in L1. So ξ^β Dα Fu is continuous and bounded. Therefore Fu ∈ S.
Proposition 15.13 For u ∈ S:(1) If Tau(x) = u(x− a), then FTau = eia·ξFu and TaFu = F (e−ia·xu).
(2) F (iDju) = ξjFu and DjFu = F (ixju).
(3) If A is a rotation about the origin and R is reflection about the origin,FR = RF and FA = AF .
(4) F (u ∗ v) = (2π)n/2(Fu)(Fv).
Because

Fu(ξ) = c ∫ e^{ix·ξ} u(x) dx,

we have

(Fu, v) = c ∫∫ e^{ix·ξ} u(x) v(ξ) dx dξ = ∫ u(x) (Fv)(x) dx = (u, Fv).

So F′ = F with respect to the bilinear pairing.
If ` is a tempered distribution, define F` by
(v, F`) = (Fv, `)
for all v ∈ S.
It then follows that the proposition above holds for ` ∈ S∗.
Proposition 15.14 Let ` ≡ 1. Then F` = (2π)n/2δ.
Here ` ≡ 1 means ` = `1, i.e., `(u) = ∫ u(x) dx.
Proof. Write d for F`.
xjd = xjF1 = F (iDj1) = F (0) = 0.
Since xj d = 0, the support of d is contained in {xj = 0}. This is true for each j, so the support of d is {0}. By Theorem 15.4,

d = Σ_{|α|≤N} cα Dα δ

for some N and constants cα.
For any multi-index γ with |γ| > 0, x^γ d = 0. But x^γ d = Σ_{|α|≤N} cα x^γ Dα δ. We claim that

x^γ Dα δ = 0 if |α| < |γ|, or if |α| = |γ| and α ≠ γ, while x^γ Dγ δ = (−1)^{|γ|} γ! δ.

To show this, if u ∈ C∞0,

(u, x^γ Dα δ) = (x^γ u, Dα δ) = (−1)^{|α|} Dα(x^γ u)(0).

Checking the cases proves the claim.

Suppose there were α with |α| = N > 0 and cα ≠ 0. Applying the claim with γ = α,

0 = x^α d = cα x^α Dα δ = cα (−1)^{|α|} α! δ,

so cα = 0. Repeating the argument with N − 1 in place of N and continuing downward, cα = 0 for all |α| > 0. Therefore d = c0 δ.
To calculate c0, let ` ≡ 1 and v = e^{−|x|²/2} in (Fv, `) = (v, F`). Then Fv = e^{−|ξ|²/2} and F` = d = c0 δ, so

(2π)^{n/2} = (e^{−|ξ|²/2}, 1) = (e^{−|x|²/2}, c0 δ) = c0.
Theorem 15.15 (1) F is an invertible map from S onto S and

u(x) = (2π)^{−n/2} ∫ Fu(ξ) e^{−iξ·x} dξ.

(2) F is an invertible map from S∗ onto S∗ and F^{−1} = FR.
Proof. (1) By Proposition 15.13(1), F(e^{−ia·x}u) = Ta Fu, and the same identity holds for tempered distributions: F(e^{−ia·ξ}`) = Ta F`. Taking ` ≡ 1 gives

F(e^{−ia·ξ}) = (2π)^{n/2} δ(x − a).

Now let ` = e^{−ia·ξ} and use (Fv, `) = (v, F`) to get

∫ Fv(ξ) e^{−ia·ξ} dξ = (v, F e^{−ia·ξ}) = (2π)^{n/2} (v, δ(x − a)) = (2π)^{n/2} v(a),

which is the inversion formula.
(2) From (1),

u(x) = (2π)^{−n/2} ∫ Fu(−ξ) e^{ix·ξ} dξ,

so F^{−1} = FR on S, and hence FRF = I. Then

(v, `) = (FRFv, `) = (RFv, F`) = (Fv, RF`) = (v, FRF`).

Thus FRF` = ` on S∗. The inverse of F exists because RF is a right inverse for F, FR is a left inverse, and F and R commute. Therefore RF` = F^{−1}`.
Theorem 15.16 If u ∈ L2, then Fu ∈ L2 and ‖Fu‖_{L2} = ‖u‖_{L2}.
Proof. We will prove this for u ∈ S and then approximate functions in L2 by functions in S. Writing ū for the complex conjugate of u, the conjugate of Fu is

(2π)^{−n/2} ∫ ū(x) e^{−iξ·x} dx = RFū.

Letting v = RFū in (Fu, v) = (u, Fv) yields

(Fu, RFū) = (u, F(RFū)) = (u, ū),

using FRF = I. Since RFū is the conjugate of Fu, the left side is ‖Fu‖²_{L2}, while the right side is ‖u‖²_{L2}.
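The discrete analogue of Plancherel's theorem can be seen with numpy's FFT. Numpy's sign and normalization conventions differ from the ones above, but with `norm="ortho"` the discrete transform is unitary, which is exactly the content of the theorem (our own sketch):

```python
import numpy as np

rng = np.random.default_rng(1)
u = rng.standard_normal(1024) + 1j * rng.standard_normal(1024)
u_hat = np.fft.fft(u, norm="ortho")   # unitary normalization

print(np.linalg.norm(u), np.linalg.norm(u_hat))  # equal up to rounding
```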
16 Banach algebras
16.1 Normed algebras
An algebra is a linear space equipped with an associative multiplication that distributes over addition and is compatible with scalar multiplication. We assume there is an identity for the multiplication, which we call I. Our algebras will be over the scalar field C; the reason will become apparent shortly.
An algebra is a normed algebra if the linear space is normed and |NM | ≤|N | |M | and |I| = 1. If the normed algebra is complete, it is called a Banachalgebra.
One example is to let L = L(X, X), the set of bounded linear maps from X into X. Another is to let L be the collection of bounded continuous functions on some set.
An element M of L is invertible if there exists N ∈ L such that NM =MN = I.
M has a left inverse A if AM = I and a right inverse B if MB = I. If ithas both, then B = AMB = A, and so M is invertible.
Proposition 16.1 (1) If M and K are invertible, then
(MK)−1 = K−1M−1.
(2) If M and K commute and MK is invertible, then M and K areinvertible.
Proof. (1) is easy. For (2), let N = (MK)−1. Then MKN = I, so KN is aright inverse for M . Also, I = NMK = NKM , so NK is a left inverse forM . Since M has a left and right inverse, it is invertible. The argument forK is similar.
Proposition 16.2 If K is invertible, then so is L = K − A provided |A| <1/|K−1|.
Proof. First we suppose K = I. If |B| < 1, then

|∑_{i=m}^{n} B^i| ≤ ∑_{i=m}^{n} |B^i| ≤ ∑_{i=m}^{n} |B|^i,

so the partial sums form a Cauchy sequence and S = ∑_{i=0}^∞ B^i converges. We see BS = ∑_{i=1}^∞ B^i = S − I, so (I − B)S = I. Similarly S(I − B) = I.
For the general case, write K − A = K(I −K−1A), and let B = K−1A.Then |B| ≤ |K−1| |A| < 1, and
(K − A)−1 = (I −K−1A)−1K−1.
The resolvent set of M , ρ(M), is the set of λ ∈ C such that λI −M isinvertible. The spectrum of M , σ(M), is the set of λ for which λI −M isnot invertible. We sometimes write λ−M for λI −M .
Let f : G → X, where G ⊂ C. f(z) is strongly analytic if

lim_{h→0} (f(z + h) − f(z))/h

exists in the norm topology for all z ∈ G. One can check that most of complex analysis can be extended to strongly analytic functions.
Proposition 16.3 (1) ρ(M) is open in C.
(2) (z −M)−1 is an analytic function of z in ρ(M).
Proof. If λ ∈ ρ(M), letting K = λI−M and A = −hI, K−A = (λ+h)−Mis invertible if h is small. So λ+ h ∈ ρ(M).
For (2),
((λ + h) − M)^{-1} = ∑_{n=0}^∞ (−1)^n (λ − M)^{-n-1} h^n

for h small. So the resolvent can be expanded in a power series in h which is valid if |h| < |(λ − M)^{-1}|^{-1}.
We write

|σ(M)| = sup_{λ∈σ(M)} |λ|,
and call this the spectral radius of M .
Theorem 16.4 (1) σ(M) is closed, bounded, and nonempty.
(2) |σ(M)| = limk→∞ |Mk|1/k.
Proof. (1) ρ(M) is open, so σ(M) is closed.
(zI − M)^{-1} = z^{-1}(I − Mz^{-1})^{-1} = ∑_{n=0}^∞ M^n z^{-n-1}

converges if |z^{-1}M| < 1, or equivalently, |z| > |M|. Therefore, if |z| > |M|, then z ∈ ρ(M). Hence the spectrum is contained in B_{|M|}(0).

(z − M)^{-1} = ∑_{n=0}^∞ M^n z^{-n-1}
is a Laurent series. If σ(M) = ∅, then (z − M)^{-1} would be everywhere analytic. So if c > |M| and C is the circle |z| = c,

(1/2πi) ∫_C (z − M)^{-1} dz = 0.

But integrating ∑ M^n z^{-n-1} term by term over the curve C, all the terms are zero except the n = 0 term, where we get

(1/2πi) ∫_C z^{-1} dz · I = I,

a contradiction. Therefore σ(M) is nonempty.
(2) Fix k for the moment. If we write n = kq + r with 0 ≤ r < k, then

|∑_{n=0}^∞ M^n z^{-n-1}| ≤ ∑_n |M^n|/|z|^{n+1} ≤ ∑_{r=0}^{k-1} (|M|^r/|z|^{r+1}) ∑_q (|M^k|/|z|^k)^q.

So ∑ M^n z^{-n-1} converges absolutely if |M^k|/|z|^k < 1, or if |z| > |M^k|^{1/k}. If |z| > |M^k|^{1/k}, then z ∈ ρ(M). Hence if λ ∈ σ(M), then |λ| ≤ |M^k|^{1/k}. This is true for all k, so |σ(M)| ≤ lim inf_{k→∞} |M^k|^{1/k}.
Let C be a circle enclosing σ(M), namely the one about the origin with radius |σ(M)| + δ. Using (z − M)^{-1} = ∑ M^n z^{-n-1},

(1/2πi) ∫_C (z − M)^{-1} z^n dz = M^n.

So

|M^n| ≤ (1/2π) ∫_C |(z − M)^{-1}| |z|^n |dz| ≤ c(|σ(M)| + δ)^{n+1},

where c = sup_{z∈C} |(z − M)^{-1}|. Thus

|M^n|^{1/n} ≤ c^{1/n}(|σ(M)| + δ)^{1+1/n}.

Hence

lim sup_{n→∞} |M^n|^{1/n} ≤ |σ(M)| + δ.
Since this is true for all δ > 0, lim sup_{n→∞} |M^n|^{1/n} ≤ |σ(M)|, and combining this with the lim inf bound above completes the proof.
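As a numerical sanity check (an illustration of mine, not part of the text), one can watch |M^k|^{1/k} approach the spectral radius for a non-normal matrix, where |·| is the operator norm:

```python
import numpy as np

# Upper-triangular, so the eigenvalues are on the diagonal; the large
# off-diagonal entry makes M non-normal, so |M| overestimates the
# spectral radius and the Gelfand limit is genuinely needed.
M = np.array([[0.5, 3.0],
              [0.0, 0.25]])
spec_radius = max(abs(np.linalg.eigvals(M)))          # 0.5
norms = [np.linalg.norm(np.linalg.matrix_power(M, k), ord=2) ** (1.0 / k)
         for k in (1, 10, 50, 200)]
print(spec_radius, norms)   # the sequence decreases toward 0.5
```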
Note that not every element of σ(M) is an eigenvalue of M. For example, if M : ℓ² → ℓ² is defined by

M(x_1, x_2, x_3, . . .) = (x_1, x_2/2, x_3/4, . . .),

then 1, 1/2, 1/4, . . . are eigenvalues. Since the spectrum is closed, 0 ∈ σ(M), but 0 is not an eigenvalue for M.
16.2 Functional calculus
We can define p(M) = ∑_{i=0}^n a_i M^i for any polynomial p(z) = ∑_{i=0}^n a_i z^i, and since

lim sup |M^k|^{1/k} = |σ(M)|,

also f(M) for any analytic function f whose power series has radius of convergence larger than the spectral radius (by the root test).
Let G be a domain containing σ(M), f analytic in G, C a closed curve inG∩ρ(M) whose winding number is 1 about each point in σ(M) and 0 abouteach point of Gc. Define
f(M) = (1/2πi) ∫_C (z − M)^{-1} f(z) dz.
By Cauchy’s theorem, this is independent of the contour chosen.
Theorem 16.5 (1) If f is a polynomial, the two definitions agree.
(2) f(M)g(M) = (fg)(M).
(3) (Spectral mapping theorem)
σ(f(M)) = f(σ(M)).
Proof. (1) follows from

(1/2πi) ∫_C (z − M)^{-1} z^n dz = M^n.
(2) We have

(zI − M) − (wI − M) = (z − w)I.

Multiplying this by

(z − M)^{-1}(w − M)^{-1}/(z − w),

we get

(1/(z − w))[(w − M)^{-1} − (z − M)^{-1}] = (z − M)^{-1}(w − M)^{-1},

which is known as the resolvent identity.
f → f(M) is linear. Let C, D be closed curves as above with D strictly inside C. Then

f(M)g(M) = (1/2πi)² ∫_C ∫_D (z − M)^{-1}(w − M)^{-1} f(z)g(w) dz dw
= (1/2πi)² [∫_C ∫_D ((w − M)^{-1}/(z − w)) f(z)g(w) dz dw − ∫_C ∫_D ((z − M)^{-1}/(z − w)) f(z)g(w) dz dw].
Since

(1/2πi) ∫_C f(z)/(z − w) dz = f(w),

the first integral is

(1/2πi) ∫_D (w − M)^{-1} f(w)g(w) dw = (fg)(M).
Note C winds once around each point of D.
Because D does not wind around any point of C,
(1/2πi) ∫_D g(w)/(z − w) dw = 0,
and so the second integral is 0.
(3) Suppose µ ∉ f(σ(M)); we show µ ∉ σ(f(M)). Since µ ≠ f(λ) for every λ ∈ σ(M), f(z) − µ does not vanish on σ(M). So g(z) = (f(z) − µ)^{-1} is analytic in an open set containing σ(M) and we can define g(M) there. Let

h(z) = (f(z) − µ)g(z) ≡ 1.

So by (2),

[f(M) − µI]g(M) = h(M) = I.

Therefore g(M) is the inverse of f(M) − µI, and hence µ ∉ σ(f(M)).
Suppose λ ∈ σ(M). We show f(λ) ∈ σ(f(M)). Let

k(z) = (f(z) − f(λ))/(z − λ).

k(z) is analytic in an open set containing σ(M), so we can define k(M) there. Since (z − λ)k(z) = f(z) − f(λ), by (2),

(M − λI)k(M) = f(M) − f(λ)I.

Since λ ∈ σ(M), M − λI is not invertible. The factors commute, so if the product were invertible, then by Proposition 16.1 each factor would be as well. Therefore f(M) − f(λ)I is not invertible.
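The contour definition of f(M) can be evaluated numerically. Below is a sketch of mine (the matrix and circle radius are arbitrary choices): the trapezoid rule on a circle enclosing σ(M) recovers e^M, which we compare against exp(M) computed by diagonalization.

```python
import numpy as np

# f(M) = (1/2*pi*i) \oint_C (zI - M)^{-1} f(z) dz with f = exp and
# C the circle z = r e^{i theta}.  Since dz = i z dtheta, the prefactor
# (1/2*pi*i) combines with i z to give (1/2*pi) z dtheta, so the
# trapezoid rule is just an average of resolvent * exp(z) * z.
M = np.array([[0.0, 1.0],
              [-2.0, -3.0]])            # eigenvalues -1 and -2
r, n = 5.0, 1000                         # radius exceeds the spectral radius
z = r * np.exp(2j * np.pi * np.arange(n) / n)
I = np.eye(2)
f_M = sum(np.linalg.solve(zk * I - M, I) * np.exp(zk) * zk for zk in z) / n
f_M = f_M.real
# reference: exp(M) via eigendecomposition (M is diagonalizable)
w, V = np.linalg.eig(M)
expM = (V @ np.diag(np.exp(w)) @ np.linalg.inv(V)).real
print(np.max(np.abs(f_M - expM)))        # quadrature error is tiny
```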
Corollary 16.6 Suppose σ(M) = σ_1 ∪ · · · ∪ σ_N, where the σ_j are closed and disjoint. Let C_j be a closed curve that winds once around each point of σ_j but not around any point of any other σ_k. Let

P_j = (1/2πi) ∫_{C_j} (z − M)^{-1} dz.

Then

(1) P_j² = P_j, and P_jP_k = 0 if j ≠ k.

(2) ∑_j P_j = I.

(3) P_j ≠ 0 if σ_j ≠ ∅.

The proof uses the observation that C = ∑_j C_j winds once around each point of σ(M).
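A hedged numerical illustration of these spectral projections (my own construction, not from the text): for a small non-normal matrix whose spectrum splits into two groups, quadrature around one group produces an idempotent P_j whose trace counts the enclosed eigenvalues.

```python
import numpy as np

# P = (1/2*pi*i) \oint (z - M)^{-1} dz over a circle centered at a = 1.05
# of radius 0.5, enclosing the eigenvalues {1.0, 1.1} but not 5.0.
# With z = a + rho*e^{i theta}, dz = i (z - a) dtheta, so the quadrature
# averages resolvent * (z - a).
M = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.1, 0.0],
              [0.0, 0.0, 5.0]], dtype=complex)   # non-normal: note the 2.0
a, rho, n = 1.05, 0.5, 800
z = a + rho * np.exp(2j * np.pi * np.arange(n) / n)
I = np.eye(3)
P = sum(np.linalg.solve(zk * I - M, I) * (zk - a) for zk in z) / n
print(np.allclose(P @ P, P))             # idempotent
print(round(np.trace(P).real))           # rank = number of enclosed eigenvalues
```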
17 Commutative Banach algebras
We look at commutative Banach algebras with a unit. Commutative meansMN = NM for all M,N ∈ L.
p is a multiplicative functional on L if p is a homomorphism from L intoC.
Proposition 17.1 Every homomorphism is a contraction.
Proof. M = IM , so p(M) = p(IM) = p(I)p(M), or p(I) = 1. If K isinvertible,
p(K)p(K−1) = p(KK−1) = p(I) = 1,
so p(K) 6= 0. Suppose |p(M)| > |M | for some M . Then if B = M/p(M), wehave |B| < 1, so K = I −B is invertible. But
p(K) = p(I)− p(M/p(M)) = 1− 1 = 0,
a contradiction.
We will show that if p(K) 6= 0 for all homomorphisms, then K is invertible.
I ⊂ L is an ideal if I is a linear subspace, I ≠ {0}, I ≠ L, and if M ∈ L and J ∈ I, then MJ ∈ I.

As an example, let L = C(S), let r ∈ S, and let I = {f : f(r) = 0}.

Sometimes the requirement that I ≠ L is not part of the definition, and one then calls the ideals that are properly contained in L proper ideals.

If the identity I belonged to I, we would have I = L; more generally, if I contains an invertible element, then I contains the identity, and hence equals L.
Lemma 17.2 Let q be a homomorphism from L onto A, but where q is notan isomorphism and q(L) 6= 0. Then
(1) {K ∈ L : q(K) = 0} is an ideal. (This set is called the kernel of q.)
(2) If I is an ideal, then I is the kernel of some non-trivial homomor-phism.
Proof. (1) is easy. For (2), let A = L/I. Let q map M into the equivalenceclass containing M . Then the kernel of q is I.
Proposition 17.3 If K ∈ L, K ≠ 0, and K is not invertible, then K lies in some ideal.

Proof. Look at KL = {KM : M ∈ L}. This is an ideal: it contains K = KI and is nonzero, and since L is commutative and K is not invertible, KL does not contain the identity, so KL ≠ L.
Lemma 17.4 Every ideal is contained in some maximal ideal.
Proof. Order by inclusion. The union of a totally ordered subcollection willbe an upper bound. (Note that if I /∈ Iα for all α, then I /∈ ∪αIα.) Thenuse Zorn’s lemma.
A division algebra is one where every nonzero element is invertible.
Proposition 17.5 If M is a maximal ideal of L, then A = L/M is adivision algebra.
Proof. Suppose C ∈ A, C ≠ 0, and C is not invertible. Then CA is an ideal in A. Let q : L → L/M = A be the quotient map. Note C = CI ∈ CA. Then q^{-1}(CA) is an ideal in L that contains M properly: it contains a preimage of C, which is not in M because q(M) = 0 for M ∈ M while C ≠ 0. This contradicts M being maximal.
Lemma 17.6 The closure of an ideal is an ideal.
Proof. That Ī is a subspace closed under multiplication by elements of L follows by continuity. The thing to prove is that the identity I is not in Ī, so that Ī ≠ L. We know I ∉ I, and if N ∈ B_1(I), then N is invertible, and hence not in I. So B_1(I) is an open set about I that is disjoint from I. Therefore I ∉ Ī.
Lemma 17.7 If M is a maximal ideal, then M is closed.
Proof. If not, M is an ideal strictly larger than M.
Lemma 17.8 If I is a closed ideal in L, then L/I is a Banach algebra.
Proposition 17.9 If A is a Banach algebra with unit that is a divisionalgebra, then A is isomorphic to C.
Proof. If K ∈ A, there exists κ ∈ σ(K). So κI−K is not invertible. There-fore κI −K = 0, or K = κI. The map K → κ is the desired isomorphism.
Theorem 17.10 K ∈ L is invertible if and only if p(K) 6= 0 for all homo-morphisms p of L into C.
Proof. Suppose K is not invertible. K is in some maximal ideal M. ThenM is closed, L/M is a division algebra, and is isomorphic to C.
p : L → L/M→ C
is a homomorphism onto C, and its null space is M. Since K ∈ M, thenp(K) = 0.
18 Applications of Banach algebras
18.1 Absolutely convergent Fourier series
Let L be the set of continuous functions from the unit circle S¹ to the complex numbers such that f(θ) = ∑_n c_n e^{inθ} with ∑_n |c_n| < ∞. We let the norm of f be ∑_n |c_n|. We check that L is a Banach algebra. To do that, we use the fact that the Fourier coefficients of fg are the convolution of those of f and those of g. But the convolution of two ℓ¹ sequences is in ℓ¹, so fg ∈ L.
If w ∈ S1, set pw(f) = f(w). pw is a homomorphism from L to C.
Proposition 18.1 If p is a homomorphism from L to C, then there existsw such that p(f) = f(w) for all f ∈ L.
Proof. p(I) = 1 and |p(M)| ≤ |M|, so p has norm 1. Then

|p(e^{iθ})| ≤ 1, |p(e^{-iθ})| ≤ 1,

and

1 = p(1) = p(e^{iθ})p(e^{-iθ}).

We must have |p(e^{iθ})| = 1, or we would have strict inequality in the above.
Therefore there exists w such that p(e^{iθ}) = e^{iw}. Since p is a homomorphism, by induction p(e^{inθ}) = e^{inw}. By linearity,

p(∑_{n=−N}^{N} c_n e^{inθ}) = ∑_{n=−N}^{N} c_n e^{inw}.

If f ∈ L, since p is continuous and ∑ |c_n| < ∞, we have p(f) = f(w).
Theorem 18.2 Suppose f has an absolutely convergent Fourier series and fis never 0 on S1. Then 1/f also has an absolutely convergent Fourier series.
Proof. If p is a homomorphism on L, then p(f) = f(w) for some w. Sincef is never 0, p(f) 6= 0 for all non-trivial homomorphisms p. This implies fis invertible in L.
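A quick numerical illustration of Wiener's theorem (my own check, using the FFT as a stand-in for the Fourier coefficients): for f(θ) = 3 + cos θ, which never vanishes on the circle, the coefficients of 1/f decay geometrically, so their ℓ¹ norm is finite.

```python
import numpy as np

# Sample f on a fine grid and read off the Fourier coefficients of 1/f.
N = 4096
theta = 2 * np.pi * np.arange(N) / N
f = 3.0 + np.cos(theta)                  # coefficients c_0 = 3, c_{1} = c_{-1} = 1/2
c = np.fft.fft(1.0 / f) / N              # Fourier coefficients of 1/f
print(np.abs(c).sum())                   # finite l^1 norm; here it equals 1/2
```

(The coefficients of 1/f alternate in sign with ratio 3 − 2√2, so summing their absolute values reproduces the value of 1/(3 − cos θ) at θ = 0, namely 1/2.)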
18.2 The corona problem
Let H∞ be the set of functions that are analytic and bounded in the unitdisk D. If we set |f | = supz∈D |f(z)|, this makes H∞ into a commutativeBanach algebra.
Mz = {f ∈ H∞ : f(z) = 0} is a maximal ideal. But if zn is a sequenceconverging to the boundary, then M = {f ∈ H∞ : lim f(zn) = 0} is anideal, so is contained in a maximal ideal that is not any of the Mz. Therefore
{Mz : z ∈ D} is not equal to the set of all maximal ideals A. The coronatheorem says that it is dense in A if we provide A with the natural topology.
If f ∈ H^∞, we define f̂ : A → C, the Gel'fand transform of f, as follows. If M ∈ A, then H^∞/M is isomorphic to C. Let I_M be the isomorphism. Define f̂(M) = I_M([f]), where [f] is the equivalence class of H^∞/M containing f.

Note that f̂(M) = 0 implies I_M([f]) = 0, so [f] = 0, which implies f ∈ M.

Let z ∈ D. If [f] ∈ H^∞/M_z, then [f] = {g : g − f ∈ M_z} = {g : g(z) = f(z)}. So the isomorphism mapping H^∞/M_z to C is just I_{M_z}([f]) = f(z), and thus

f̂(M_z) = f(z).
We define a basic neighborhood of N ∈ A to be a set of the form

V = {M ∈ A : |f̂_j(M) − f̂_j(N)| < ε, j = 1, . . . , n}

for some ε > 0 and some f_1, . . . , f_n ∈ H^∞.
Theorem 18.3 Suppose δ > 0, n ≥ 1, and f_1, . . . , f_n ∈ H^∞ with
maxj|fj(z)| ≥ δ
for each z ∈ D. Then there exist g1, . . . , gn ∈ H∞ with
f1(z)g1(z) + · · ·+ fn(z)gn(z) = 1
for each z ∈ D.
The fj are called corona data, the gj corona solutions.
The corona theorem is a corollary to the above theorem.
Theorem 18.4 The closure of {Mz : z ∈ D} is equal to A.
Proof. Suppose not. Then there exists N ∈ A and a neighborhood V of N of the form

V = {M ∈ A : |ĥ_j(M) − ĥ_j(N)| < δ, j = 1, . . . , n}
that contains no Mz.
Let f_j = h_j − ĥ_j(N), i.e., we normalize the h_j by subtracting a constant. Then

V = {M ∈ A : |f̂_j(M)| < δ, j = 1, . . . , n}.

By the normalization, f̂_j(N) = 0, so f_j ∈ N for each j.

If z ∈ D, then M_z ∉ V, so |f̂_j(M_z)| ≥ δ for some j, or equivalently, |f_j(z)| ≥ δ for some j. By the previous theorem, there exist g_1, . . . , g_n such that f_1g_1 + · · · + f_ng_n = 1. But since each f_j ∈ N and N is an ideal, 1 ∈ N, a contradiction.
19 Operators and their spectra
19.1 Invertible maps
Proposition 19.1 Suppose X is a Banach space, and K : X → X isbounded and onto. Then there exists ε such that if |A| < ε, then K − Ais onto.
Note we do not assume K is 1-1. Compare this with the result that Kinvertible implies K − A is invertible if the norm of A is small enough.
Proof. By the open mapping theorem, there exists k such that if Kx = z,then |x| ≤ k|z|. Suppose |A| < 1/k. We show K − A is onto.
Fix u, let x_0 = 0, and define x_n recursively by

Kx_{n+1} = Ax_n + u.

Then |x_1| ≤ k|u|, and since

K(x_{n+1} − x_n) = A(x_n − x_{n−1}),

we have

|x_{n+1} − x_n| ≤ k|A| |x_n − x_{n−1}|.
Therefore xn converges. If x is the limit, taking the limit in the definition ofxn+1, Kx = Ax+ u, which is what we wanted.
Note

|x| ≤ ∑_n |x_{n+1} − x_n| ≤ ∑_n (k|A|)^n |x_1| ≤ k|u|/(1 − k|A|).
Proposition 19.2 Suppose M is a bounded linear map from a Banach spaceX into itself. Then σ(M ′) = σ(M).
Proof. Let K = λ −M . If K is invertible, there exists L such that KL =LK = I. Therefore L′K ′ = K ′L′ = I ′ = I, or K ′ has an inverse.
If X were reflexive, we could say K′ invertible implies K = K″ is invertible. In general, we have K″L″ = L″K″ = I″. Check that the restriction of K″ to X is K and the restriction of I″ to X is I; therefore the null space of K is trivial, and hence K is 1-1.

The restriction of L″ to X is then an inverse to K on its range. Since L″ is continuous, R_K is closed. If R_K ≠ X, we can use Hahn–Banach to find ℓ ≠ 0 annihilating R_K. Then ℓ ∈ N_{K′}. But K′ is invertible, so N_{K′} = {0}, a contradiction. Hence K is invertible.
19.2 Shifts
Let R and L be the right and left shifts, resp., on `2. We have LR = I butRL 6= I. Check that R′ = L and R = L′.
Proposition 19.3 σ(R) = σ(L) = {λ : |λ| ≤ 1}.

Proof. L is a contraction, so |L| ≤ 1. Similarly |L^n| ≤ 1. Then |σ(L)| = lim |L^n|^{1/n} ≤ 1.

If |λ| < 1 and we set a_n = λ^{n−1}a_1 with a_1 ≠ 0, then a ∈ ℓ² and La = λa. Therefore σ(L) contains the open unit ball. Since σ(L) is closed, σ(L) is the closed unit ball. Finally, R = L′, so σ(R) = σ(L) by Proposition 19.2.
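A truncated numerical check (my own, on a finite section of ℓ²) that geometric sequences are eigenvectors of the left shift:

```python
import numpy as np

lam = 0.5 + 0.3j                         # |lam| < 1, so the sequence is in l^2
a = lam ** np.arange(60)                 # a_n = lam^{n-1} in 1-based indexing
La = a[1:]                               # left shift drops the first entry
print(np.max(np.abs(La - lam * a[:-1]))) # ~0: La = lam * a on the truncation
```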
19.3 Volterra integral operators
Let X = C[0, 1] and
(V x)(s) = ∫_0^s x(r) dr.
Proposition 19.4 σ(V) = {0}.
Proof. By induction and integration by parts,

V^n x(s) = (1/(n−1)!) ∫_0^s (s − r)^{n−1} x(r) dr.

So

|V^n x(s)| ≤ (1/(n−1)!) ∫_0^s (s − r)^{n−1}|x| dr ≤ |x|/n!.

Then |V^n| ≤ 1/n!, so |σ(V)| = lim |V^n|^{1/n} = 0.
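The quasi-nilpotency of V is visible numerically. Below, a sketch of mine discretizes V by left-endpoint Riemann sums (the grid size and the choice of norm are arbitrary) and watches |V^n|^{1/n} shrink:

```python
import numpy as np

# Lower-triangular matrix approximating (Vx)(s) = int_0^s x(r) dr on [0, 1].
m = 200
V = np.tril(np.ones((m, m))) / m
vals = {n: np.linalg.norm(np.linalg.matrix_power(V, n), ord=2) ** (1.0 / n)
        for n in (1, 5, 10, 20)}
for n in sorted(vals):
    print(n, vals[n])                    # decreasing, consistent with (1/n!)^{1/n} -> 0
```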
19.4 Fourier transform
F² = R, where R is reflection, and R² = I, so F⁴ = I. Therefore σ(F) ⊂ {±1, ±i} by the spectral mapping theorem applied to z⁴ − 1.
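The discrete Fourier transform matrix gives a finite-dimensional model of this (an illustration of mine): with the unitary normalization it also satisfies F⁴ = I, so its eigenvalues land in {±1, ±i}.

```python
import numpy as np

n = 8
F = np.fft.fft(np.eye(n), axis=0, norm="ortho")   # unitary DFT matrix
# F^2 is the reflection permutation e_j -> e_{(-j) mod n}
R = np.eye(n)[:, (-np.arange(n)) % n]
print(np.allclose(np.linalg.matrix_power(F, 2), R))
print(np.allclose(np.linalg.matrix_power(F, 4), np.eye(n)))  # F^4 = I
w = np.linalg.eigvals(F)
print(np.allclose(w ** 4, 1))            # eigenvalues are 4th roots of unity
```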
20 Compact maps
20.1 Basic properties
A subset S is precompact if S̄ is compact. Recall that if A is a subset of a complete metric space, A is precompact if and only if every sequence in A has a subsequence which converges in Ā. Also, A is compact if and only if A is complete and totally bounded.

A map K from a Banach space X to a Banach space U is compact if K(B_1^X(0)) is precompact in U.
One example is if K is degenerate, so that RK is finite dimensional. Theidentity on `2 is not compact.
Here is a more complicated example. Let X = U = ℓ² and define

K(a_1, a_2, a_3, . . .) = (a_1/2, a_2/2², a_3/2³, . . .).

Take a sequence x_n in K(B_1(0)). The jth coordinate of each x_n is bounded by 2^{-j}. So there exists a subsequence along which the first coordinate converges. Use the diagonalization procedure to get a subsequence along which the jth coordinate converges for every j. Since the jth coordinates are uniformly bounded by 2^{-j}, the tails are uniformly small, so along this subsequence x_n converges in ℓ², to an element of the closure of K(B_1(0)).
The following facts are easy:
(1) If C1, C2 are precompact subsets of a Banach space, then C1 + C2 isprecompact.
(2) If C is precompact, so is the convex hull of C.
(3) If M : X → U and C is precompact in X, then M(C) is precompactin U .
Proposition 20.1 (1) If K_1 and K_2 are compact maps and k is a scalar, then kK_1 + K_2 is compact.

(2) If L : X → U is compact and M : U → V is bounded, then ML is compact.

(3) If L : X → U is bounded and M : U → V is compact, then ML is compact.
(4) If Kn are compact maps and lim |Kn −K| = 0, then K is compact.
We can use (4) to give another proof that our example K above is compact. It is the limit in norm of K_n, where

K_n(a_1, a_2, . . .) = (a_1/2, a_2/2², . . . , a_n/2^n, 0, 0, . . .).
Proof. (1) For the sum, (K1 +K2)(B) ⊂ K1(B) +K2(B), and the multipli-cation by k is similar.
(2) ML(B) will be compact because L(B) is compact and M is continuous.
(3) L(B) will be contained in some ball, so ML(B) is precompact.
(4) Let ε > 0. Choose n such that |Kn −K| < ε. Kn(B) can be coveredby finitely many balls of radius ε, so K(B) is covered by the set of balls withthe same centers and radius 2ε. Therefore K(B) is totally bounded.
One way of rephrasing (2)–(4) is that the compact maps from X to itself form a closed two-sided ideal in L(X, X).
Proposition 20.2 If X and U are Banach spaces and K : X → U is com-pact and Y is a closed subspace of X, then the map K |Y is compact.
Proposition 20.3 If K : X → X is compact and Y is a closed invariantsubspace of the Banach space X, then K : X/Y → X/Y is compact.
Recall the following fact that we proved in Chapter 5. If X is a normedlinear space and Y a proper closed linear subspace, then there exists x ∈ Xwith |x| = 1 and d(x, Y ) ≥ 1/2.
Theorem 20.4 Suppose K : X → X is compact and T = I −K.
(1) dim(NT ) <∞.
(2) If Nj = NT j , there exists i such that Nk = Ni for k > i.
(3) RT is closed.
Proof. (1) y ∈ N_T implies y = Ky. So the unit ball in N_T is precompact. But a Banach space whose closed unit ball is compact is finite dimensional.
(2) If not, N_{i−1} is a proper subset of N_i for all i. There exists y_i ∈ N_i such that |y_i| = 1 and d(y_i, N_{i−1}) > 1/2. If m < n,

Ky_n − Ky_m = y_n − (Ty_n + y_m − Ty_m).

Now Ty_n + y_m − Ty_m ∈ N_{n−1}. So |Ky_n − Ky_m| > 1/2. Therefore no subsequence of Ky_n converges, a contradiction.
(3) Suppose y_k → y, where y_k = Tx_k. We need to show y ∈ R_T. Let d_k = d(x_k, N_T).

Step 1. The d_k are bounded. Choose z_k ∈ N_T such that w_k = x_k − z_k satisfies |w_k| = |x_k − z_k| < 2d_k. Tz_k = 0, so Tw_k = Tx_k − Tz_k = y_k. Since y_k → y, {y_k} is bounded. If d_k were unbounded, then along a subsequence

T(w_k/d_k) = y_k/d_k → 0.

Let u_k = w_k/d_k, so |u_k| < 2 and

0 ← Tu_k = u_k − Ku_k.

Since Ku_k has a convergent subsequence, u_k does too. Suppose u_k → u (along the subsequence). Then since T is bounded, Tu = lim Tu_k = 0, so u ∈ N_T. On the other hand, |x_k − z| ≥ d_k if z ∈ N_T, so |w_k − d_kz| = |x_k − (z_k + d_kz)| ≥ d_k, hence |u_k − z| ≥ 1 for every z ∈ N_T. Taking z = u contradicts u_k → u.
Step 2. Since |w_k| < 2d_k and the d_k are bounded, w_k is a bounded sequence. We have w_k − Kw_k = y_k → y. Kw_k has a convergent subsequence, so w_k does too, say with limit w. Then w − Kw = Tw = y, and R_T is closed.
Proposition 20.5 Suppose K : X → X is compact. If NT = {0}, thenRT = X.
Proof. Since dim N_T = 0, T is 1-1. Assume X_1 = R_T is a proper subset of X. Then X_2 = TX_1 is a proper subset of X_1. To see this, suppose u ∈ X and u ∉ X_1. Then Tu ∈ TX = X_1. If Tu ∈ X_2, then Tu = Tv for some v ∈ X_1, and then u = v ∈ X_1 since T is 1-1, a contradiction; so Tu ∈ X_1 but Tu ∉ X_2. Continue to define X_3 and so on.
We have X_k = R_{T^k}, and

T^k = (I − K)^k = I + ∑_{j=1}^k (−1)^j \binom{k}{j} K^j,

so T^k equals I plus a compact operator. Therefore X_k is closed.

Pick x_k ∈ X_k such that |x_k| = 1 and dist(x_k, X_{k+1}) > 1/2. For m < n,

Kx_m − Kx_n = x_m − Tx_m − x_n + Tx_n.

The last three terms are in X_{m+1}, so |Kx_m − Kx_n| ≥ 1/2, hence no subsequence
converges, a contradiction to K being compact.
20.2 Spectral theory
Theorem 20.6 Suppose X is a Banach space and K : X → X is compact.

(1) σ(K) consists of countably many complex numbers λ_n, whose only possible accumulation point is 0. If dim X = ∞, then 0 ∈ σ(K).
(2) Each λj 6= 0 is an eigenvalue:
(a) The null space of K − λj is finite dimensional.
(b) There exists i such that the null space of (K − λ_j)^k is the same as the null space of (K − λ_j)^i if k > i.

Statement (b) says that the multiplicity of each nonzero eigenvalue is finite.
Proof. We prove (2) first. Suppose λ_j ∈ σ(K) and λ_j ≠ 0. Let T = I − λ_j^{-1}K. Then λ_jT = λ_j − K is not invertible, and its null space, which is the same as that of T, is finite dimensional by Theorem 20.4. If N_T = {0}, then R_T = X by Proposition 20.5, and then T is invertible. So N_T is larger than {0}. If z ∈ N_T, then (λ_j − K)z = 0, so z is an eigenvector. By Theorem 20.4 the multiplicity is finite.
Now we look at (1). Suppose λ_n is a sequence of distinct non-zero eigenvalues, with corresponding eigenvectors x_n. Let Y_n be the linear space spanned by {x_1, . . . , x_n}. We claim the x_n are linearly independent. If not, take the smallest n for which we can write x_n = ∑_{j=1}^{n−1} c_jx_j, and then

λ_nx_n = Kx_n = ∑_{j=1}^{n−1} c_jKx_j = ∑_{j=1}^{n−1} c_jλ_jx_j,

so

∑_{j=1}^{n−1} c_j(1 − λ_j/λ_n)x_j = 0.

By the minimality of n, {x_1, . . . , x_{n−1}} are linearly independent, so c_j(1 − λ_j/λ_n) = 0 for each j; since the λ's are distinct, every c_j = 0 and x_n = 0, a contradiction. Therefore Y_{n−1} is a proper subset of Y_n. Choose y_n ∈ Y_n such that |y_n| = 1 and dist(y_n, Y_{n−1}) > 1/2. We can write

y_n = ∑_{j=1}^n a_jx_j.
So

Ky_n − λ_ny_n = ∑_{j=1}^{n} (λ_j − λ_n)a_jx_j ∈ Y_{n−1}.

Therefore if n > m,

Ky_n − Ky_m = (Ky_n − λ_ny_n) + λ_ny_n − Ky_m,

which equals λ_ny_n plus an element of Y_{n−1} (note Ky_m ∈ Y_m ⊂ Y_{n−1}), and so

|Ky_n − Ky_m| ≥ |λ_n| dist(y_n, Y_{n−1}) > |λ_n|/2.
Since K is compact, for each δ > 0 there can only be finitely many λ_n with |λ_n| > δ; otherwise the y_n above would give a sequence with |Ky_n − Ky_m| > δ/2 and no convergent subsequence. Hence 0 is the only possible accumulation point.

If dim X = ∞, suppose 0 ∉ σ(K), so that K is invertible. As before, choose y_n with |y_n| = 1 and |y_n − y_m| > 1/2 for n ≠ m. Since K is compact, there is a subsequence Ky_{n_j} which converges. But then y_{n_j} = K^{-1}Ky_{n_j} converges, a contradiction.
Theorem 20.7 K is compact if and only if K ′ is compact.
Proof. Suppose K : X → U is compact. We want to show that if {ℓ_n} ⊂ U* with |ℓ_n| ≤ 1, then {K′ℓ_n} has a Cauchy subsequence. Let J = \overline{K(B)}. The functions u → (ℓ_n, u) are uniformly bounded on J, and also equicontinuous:

|(ℓ_n, u) − (ℓ_n, v)| = |(ℓ_n, u − v)| ≤ |u − v|.

J is compact, so by Ascoli–Arzelà there exists a uniformly convergent subsequence. So given ε, there exists N such that |(ℓ_n, u) − (ℓ_m, u)| < ε if n, m ≥ N and u ∈ J. So

ε > |(ℓ_n − ℓ_m, Kx)| = |(K′ℓ_n − K′ℓ_m, x)|

for all x ∈ B. Therefore |K′ℓ_n − K′ℓ_m| ≤ ε.
Conversely, if K ′ is compact, then K ′′ is compact. But K is the restrictionof K ′′ to X, so is also compact.
21 Positive compact operators
We’ll do the Krein-Rutman theorem, which is a generalization of the Perron-Frobenius theorem.
Theorem 21.1 Suppose Q is compact and Hausdorff and X = C(Q), the complex-valued continuous functions on Q. Suppose K : C(Q) → C(Q) is compact. Suppose further that K maps real-valued functions to real-valued functions. Finally, suppose that whenever p ≥ 0 and p is not identically zero, then Kp is strictly positive. Then K has a positive eigenvalue σ of multiplicity one, the associated eigenfunction is positive, and all the other eigenvalues of K are strictly smaller in absolute value than σ.
Examples include matrices with all positive entries, the semigroup P_t at time t = 1 for reflecting Brownian motion on a bounded interval, and

Kf(x) = ∫ K(x, y)f(y) µ(dy),

where the kernel K(x, y) is jointly continuous and positive and µ is a finite measure. It is an exercise to show that this operator K is compact.
Proof. If x ≤ y and x 6≡ y, then y − x ≥ 0, so K(y − x) > 0, or Kx < Ky.
Step 1. Suppose κ > 0 is such that there exists x ∈ C(Q) with x ≥ 0, andκx ≤ Kx at all points of Q.
Step 2. There exists such a κ > 0: let x ≡ 1. Then Kx is strictly positive, and we can let κ = min Kx.

Now

κKx = K(κx) ≤ K(Kx) = K²x.

So

κ²x ≤ κKx ≤ K²x,

and in general,

κ^n x ≤ K^n x.

Since x ≥ 0,

κ^n|x| ≤ |K^n x| ≤ |K^n| |x|,
Therefore |σ(K)| is strictly positive. Since K is compact, the set of eigenval-ues of K is nonempty. We have shown that there exists a non-zero eigenvaluefor K.
Step 3. K is compact, so there exists an eigenvalue λ and an eigenfunction zsuch that Kz = λz, |λ| = |σ(K)|. Let λ and z be any pair with |λ| = |σ(K)|.
(a) We claim: if y = |z| and σ = |σ(K)|, then σy ≤ Ky.

Proof: Let q ∈ Q. Multiply z by α ∈ C such that |α| = 1 and αλz(q) is real and non-negative, and replace z by αz. Of course α depends on q. Write z = u + iv. Then

Ku(q) + iKv(q) = Kz(q) = λz(q).

Since λz(q) is real, looking at the real part,

λz(q) = (Ku)(q).

Next, u ≤ |z| = y, and

|λ|y(q) = |λz(q)| = Ku(q) ≤ (Ky)(q).   (21.1)

Then

σy(q) ≤ Ky(q).   (21.2)

Although z depends on α, which depends on q, neither σ nor y depends on q. Since q was arbitrary, the inequality (21.2) holds for all q.
(b) We claim σy = Ky.
Proof: If not, there exists q such that σy(q) < Ky(q). By continuity, there exist δ > 0 and a neighborhood N about q such that

σy(s) + δ ≤ Ky(s), s ∈ N.

Let p be continuous with p > 0 in N and p = 0 outside of N; then Kp > 0.

We will find c, ε > 0, set x = y + εp and κ = σ + cε, and get κx ≤ Kx. This will be a contradiction, since by Steps 1 and 2 any such κ is less than or equal to |σ(K)| = σ.

Now Kp > 0, so there exists c ≤ 1 such that cy ≤ Kp. If s ∈ N,

Kx(s) = Ky(s) + εKp(s) ≥ Ky(s) + εcy(s) ≥ σy(s) + δ + εcy(s).
Now

κx(s) = (σ + cε)(y + εp)(s) = σy(s) + εcy(s) + σεp(s) + cε²p(s)
≤ Kx(s) − δ + σεp(s) + cε²p(s).

Since p is bounded above, we can take ε small enough so that the last line is less than or equal to Kx(s).

If s ∉ N, then p(s) = 0 and

κx(s) = κy(s) = (σ + cε)y(s) = σy(s) + εcy(s) ≤ Ky(s) + εKp(s) = Kx(s).
Step 4. We next show that any other eigenvalue that has absolute value σ is in fact equal to σ. Let z be any eigenfunction corresponding to λ with |λ| = σ. Fix q ∈ Q. As before, we may assume λz(q) ≥ 0. As before, write z = u + iv; then λz(q) = Ku(q). We have u ≤ |z| = y.

Suppose u < y at some point q′ ∈ Q. Then u ≤ y with u < y at one point means that Ku < Ky at every point, and so

|λ|y(q) = |λz(q)| = λz(q) = Ku(q) < Ky(q).

So σy(q) < Ky(q). But we showed σy = Ky. Therefore u is identically equal to y. This implies that z is real and positive, and it then follows that λ is real and positive, so λ = σ. Since z = σ^{-1}Kz, z is strictly positive.
Step 5. Finally, we show σ has multiplicity 1. If not, there exist real eigen-functions y1, y2. But some linear combination w of y1, y2 will be real and takeboth positive and negative values. As before |w| will be an eigenfunction thatis non-negative, and must also take the value 0. Moreover the correspondingeigenvalue is σ. But then 0 < K|w| = σ|w|, a contradiction to |w| taking thevalue 0.
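In the finite-dimensional case this theorem reduces to the Perron–Frobenius theorem, which is easy to check numerically (a sketch of mine; the positive matrix is random):

```python
import numpy as np

rng = np.random.default_rng(1)
K = rng.uniform(0.1, 1.0, size=(5, 5))   # strictly positive entries
w, V = np.linalg.eig(K)
i = int(np.argmax(np.abs(w)))
sigma, v = w[i], V[:, i].real
v = v / v[np.argmax(np.abs(v))]          # fix the sign: largest entry becomes +1
print(abs(sigma.imag) < 1e-12)           # the leading eigenvalue is real
print(np.all(v > 0))                     # its eigenvector can be taken positive
print(np.all(np.delete(np.abs(w), i) < abs(sigma)))  # strict dominance
```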
22 Invariant subspaces
22.1 Compact maps
There exist operators without any eigenvectors, but they still have non-trivial invariant subspaces. Y is a non-trivial invariant subspace for K if Y is a closed linear subspace with Y ≠ {0}, Y a proper subset of X, and KY ⊂ Y. Note that if K has an eigenvector y, then the linear subspace spanned by {y} is a non-trivial invariant subspace. What if the only element of σ(K) is the point 0?
Theorem 22.1 Let X be a Banach space over C of dimension greater than1 and let K : X → X be compact. Then K has a non-trivial invariantsubspace.
Proof. Suppose K ≠ 0 and normalize so that |K| = 1. Choose x_0 ∈ X such that |x_0| > 1 and |Kx_0| > 1. Let B = B_1(x_0), and note 0 ∉ B̄. Let D = \overline{KB}; D is compact. Since |Kx_0| > 1 and |K| = 1, for x ∈ B̄ we have |Kx| ≥ |Kx_0| − |x − x_0| > 0, so 0 ∉ D.
Suppose K has no non-trivial invariant subspaces. Given y ≠ 0, the set {p(K)y : p a polynomial} is invariant under K. Its closure is a closed invariant subspace containing y ≠ 0, so it must be all of X. Therefore {p(K)y} is dense in X.
0 /∈ D, so if y ∈ D, there exists a polynomial p such that |p(K)y−x0| < 1.The set of z satisfying |p(K)z − x0| < 1 is an open set containing y. D iscompact, so can be covered by finitely many of them. Therefore there existp1, . . . , pN such that if y ∈ D, |pi(K)y − x0| < 1 for some pi.
Let Ki = pi(K). If y ∈ D, |Kiy − x0| < 1 for at least one i.
x_0 ∈ B and Kx_0 ∈ D. Taking y = Kx_0, there exists i_1 such that |K_{i_1}Kx_0 − x_0| < 1. So K_{i_1}Kx_0 ∈ B, hence KK_{i_1}Kx_0 ∈ D. Taking y = KK_{i_1}Kx_0, there exists i_2 such that

|K_{i_2}KK_{i_1}Kx_0 − x_0| < 1.

Continue, so

|∏_{k=1}^n (K_{i_k}K)x_0 − x_0| < 1.

Hence

|∏_{k=1}^n (K_{i_k}K)x_0| > |x_0| − 1 > 0.

The K_i's and K commute, so

|(∏_{k=1}^n K_{i_k})K^n x_0| > |x_0| − 1.

Let c = sup_i |K_i|. Therefore

c^n|K^n| |x_0| > |x_0| − 1.
Then

c|K^n|^{1/n}|x_0|^{1/n} > (|x_0| − 1)^{1/n},

and letting n → ∞, |σ(K)| = lim |K^n|^{1/n} ≥ 1/c > 0. By the spectral theory of compact maps, σ(K) contains points other than 0, and these are eigenvalues. The corresponding eigenspace is a non-trivial invariant subspace, a contradiction.
23 Compact symmetric operators
Let H be a complex Hilbert space. A : H → H is Hermitian (symmetric) ifA = A∗, that is, (Ax, y) = (x,Ay).
Proposition 23.1 (1) (Ax, x) is real.

(2) (Ax, x) is not identically 0 unless A = 0.

Proof. (1)

(Ax, x) = (x, Ax) = \overline{(Ax, x)}.

(2) If (Ax, x) = 0 for all x, then

0 = (A(x + y), x + y) = (Ax, x) + (Ay, y) + (Ax, y) + (Ay, x)
= (Ax, y) + (y, Ax) = (Ax, y) + \overline{(Ax, y)}.

So Re(Ax, y) = 0. Replacing x by ix gives Re(i(Ax, y)) = 0, so Im(Ax, y) = 0. Hence (Ax, y) = 0 for all x and y, and A = 0.
If (Ax, x) ≥ 0 for all x, we say A is positive, and write A ≥ 0. WritingA ≤ B means B − A ≥ 0. For matrices, one uses positive definite.
Now suppose A is compact.
Proposition 23.2 If x_n converges weakly to x, then Ax_n converges strongly to Ax.

Proof. If x_n → x weakly, then Ax_n → Ax weakly, since (Ax_n, y) = (x_n, Ay) → (x, Ay) = (Ax, y). If x_n converges weakly, then ‖x_n‖ is bounded, so Ax_n lies in a precompact set.
Any subsequence ofAxn has a further subsequence which converges strong-ly. The limit must be Ax.
Theorem 23.3 (Spectral theorem) Suppose H is a complex Hilbert space,A is compact and symmetric. There exist zn ∈ H such that {zn} is anorthonormal basis for H, each zn is an eigenvector, the eigenvalues are real,and their only point of accumulation is 0.
Proof. If A = 0, any orthonormal basis will do. So suppose A ≠ 0. Let

M = sup_{‖x‖=1} (Ax, x).

We may assume M > 0, since there exists x such that (Ax, x) ≠ 0 (look at −A if necessary). We have (Ax, x) ≤ ‖Ax‖ ‖x‖, so M ≤ ‖A‖.

We claim the supremum is attained. Choose x_n with ‖x_n‖ = 1 such that (Ax_n, x_n) → M. There exists a subsequence, also denoted x_n, which converges weakly, say to z. Since A is compact, Ax_n → Az strongly. So (Az, z) = lim(Ax_n, x_n) = M. ‖x_n‖ ≤ 1, so ‖z‖ ≤ 1. M > 0 implies z ≠ 0. Let y = z/‖z‖. Then (Ay, y) = M/‖z‖². If ‖z‖ < 1, then (Ay, y) > M, a contradiction. Therefore ‖z‖ = 1.

Note z maximizes

R_A(x) = (Ax, x)/‖x‖²

over all x ≠ 0.
Let w ∈ H and t ∈ R. R_A(z + tw) ≤ R_A(z), so (∂/∂t)R_A(z + tw)|_{t=0} = 0. Doing the calculation,

((Aw, z) + (Az, w))/‖z‖² − (Az, z)((w, z) + (z, w))/‖z‖⁴ = 0.

This implies

Re(Az − Mz, w) = 0.

This is true for all w (and replacing w by iw handles the imaginary part), so Az − Mz = 0, and z is an eigenvector.
Suppose we have found orthonormal eigenvectors z_1, z_2, . . . , z_n. Let Y be the orthogonal complement of the linear subspace spanned by {z_1, . . . , z_n}. If x ∈ Y, then

(Ax, z_k) = (x, Az_k) = α_k(x, z_k) = 0,

so Ax ∈ Y. We then look at A|_Y, which is still compact and symmetric, and find a new eigenvector z_{n+1}.
We next prove that the closed span of the eigenvectors is all of H. Suppose y is orthogonal to every eigenvector. Then

(Ay, z_i) = (y, Az_i) = α_i(y, z_i) = 0

if z_i is an eigenvector with eigenvalue α_i, so Ay is also orthogonal to every eigenvector. If Y is the orthogonal complement of the closed span of the eigenvectors and Y ≠ {0}, then A : Y → Y and A|_Y is compact and symmetric, so there exists an eigenvector for A|_Y, a contradiction since Y is orthogonal to every eigenvector.
It remains to show that 0 is the only possible accumulation point of the eigenvalues. First, if α_n ≠ α_m, then

α_n(z_n, z_m) = (Az_n, z_m) = (z_n, Az_m) = α_m(z_n, z_m),

so (z_n, z_m) = 0. We take the z_n to have norm 1.

Now if α_n → α ≠ 0, then using the compactness of A, there is a subsequence, also denoted z_n, such that Az_n converges strongly, say to w. Then

z_n = (1/α_n)Az_n → (1/α)w.

But ‖z_n − z_m‖ = √2 if n ≠ m, a contradiction to z_n converging.
If α_1 ≥ α_2 ≥ · · · > 0 and Az_n = α_nz_n, then our construction shows that

α_N = max_{x⊥z_1,...,z_{N−1}} (Ax, x)/‖x‖².
This is known as the Rayleigh principle.
Let

R_A(x) = (Ax, x)/‖x‖².
Proposition 23.4 Let A be compact and symmetric, with eigenvalues α_k ordered so that α_1 ≥ α_2 ≥ · · · . Then

(1) (Fisher's principle)

α_N = max_{S_N} min_{x∈S_N, x≠0} R_A(x),

where the max is over linear subspaces S_N of dimension N.

(2) (Courant's principle)

α_N = min_{S_{N−1}} max_{x⊥S_{N−1}, x≠0} R_A(x),

where the min is over linear subspaces S_{N−1} of dimension N − 1.
Proof. Let z_1, . . . , z_N be eigenvectors with corresponding eigenvalues α_1 ≥ α_2 ≥ · · · ≥ α_N. Let T_N be the linear subspace spanned by {z_1, . . . , z_N}. If y ∈ T_N, we have y = ∑_{j=1}^N c_jz_j for some complex numbers c_j, and then

(Ay, y) = ∑_{i=1}^N ∑_{j=1}^N c_i\overline{c_j}(Az_i, z_j) = ∑_i ∑_j c_i\overline{c_j}α_i(z_i, z_j)
= ∑_i |c_i|²α_i ≥ α_N ∑_i |c_i|² = α_N(y, y),

using the fact that the z_i are orthonormal by our construction.
(1) Let z_k be the eigenvectors. Let S_N be a subspace of dimension N. There exists y ∈ S_N, y ≠ 0, such that (y, z_k) = 0 for k = 1, . . . , N − 1, since these are only N − 1 linear conditions on an N-dimensional space. Since

α_N = max_{x⊥z_1,...,z_{N−1}} R_A(x),

y is one of the vectors over which the max is being taken, so R_A(y) ≤ α_N. Hence min_{x∈S_N} R_A(x) ≤ α_N. This is true for all subspaces of dimension N, so the right-hand side is at most α_N.

But if S_N is the linear span of {z_1, . . . , z_N}, then by the computation above the minimum of R_A on S_N is α_N, achieved at z_N.
(2) If dim S_{N−1} = N − 1, then T_N, the span of {z_1, . . . , z_N}, contains a nonzero vector y perpendicular to S_{N−1}. For y ∈ T_N we have R_A(y) ≥ α_N, so max_{x⊥S_{N−1}} R_A(x) ≥ α_N. But taking S_{N−1} = T_{N−1}, we get equality.
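These principles are easy to test numerically on a symmetric matrix (my own sketch; a positive definite A is used so that projecting out the top eigenvectors keeps the constrained maximum equal to α_N rather than 0):

```python
import numpy as np

rng = np.random.default_rng(2)
C = rng.standard_normal((6, 6))
A = C.T @ C                              # symmetric positive definite
w, Z = np.linalg.eigh(A)                 # ascending eigenvalues
alpha, Z = w[::-1], Z[:, ::-1]           # alpha_1 >= alpha_2 >= ...
N = 3
# Rayleigh principle: alpha_N = max of R_A over x orthogonal to z_1,...,z_{N-1}.
# Realize the constraint with the projection P onto that orthogonal complement.
P = np.eye(6) - Z[:, :N - 1] @ Z[:, :N - 1].T
best = max((x @ A @ x) / (x @ x)
           for x in (P @ rng.standard_normal(6) for _ in range(2000)))
exact = np.max(np.linalg.eigvalsh(P @ A @ P))   # the true constrained maximum
print(alpha[N - 1], exact, best)         # best <= exact = alpha_N
```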
Proposition 23.5 Suppose A ≤ B with eigenvalues αk, βk, resp., ordered tobe decreasing. Then αk ≤ βk for all k.
Proof. A ≤ B implies (Ax, x) ≤ (Bx, x), so RA(x) ≤ RB(x). Now useeither Fisher’s or Courant’s principle.
Theorem 23.6 Suppose A is compact and symmetric. Suppose f defined onσ(A) is bounded and complex valued. We can define f(A) such that
(1) If f1(σ) ≡ 1, then f1(A) = I.

(2) If f2(σ) ≡ σ, then f2(A) = A.

(3) f → f(A) is an isomorphism of the ring of bounded functions on σ(A) into the algebra of bounded maps of H into H.

(4) ‖f(A)‖ = sup_{σ∈σ(A)} |f(σ)|.

(5) If f is real, f(A) is symmetric.

(6) f ≥ 0 on σ(A) implies f(A) is positive.
Proof. If {zn} is an orthonormal basis of eigenvectors, and x = Σ cn zn, set

f(A)x = Σ f(αn) cn zn.

If A is positive, then σ(A) ⊂ [0,∞). We can define √A by taking f(λ) = √λ.
24 Examples of compact symmetric operators
Let

Lu(x) = −u′′(x) + q(x)u(x)

for u ∈ C²[0, 2π] with u(0) = u(2π) = 0. Suppose q is continuous and q ≥ 1 for all x.
Fact from PDE: Lu = f with u(0) = 0 and u(2π) = 0 has a uniquesolution if f is continuous.
Write u = Af .
Lemma 24.1

C = {u : ∫_0^{2π} (u′)² ≤ 1, ∫_0^{2π} u² ≤ 1}

is precompact in L².
Proof. If 0 ≤ x ≤ y ≤ 2π, then

|u(y) − u(x)| = |∫_x^y u′(z) dz| ≤ |y − x|^{1/2} (∫_0^{2π} |u′(z)|² dz)^{1/2}
using Cauchy-Schwarz. Thus the functions in C are equicontinuous. If |u(y)| ≥ 100 for some u ∈ C and some y, then
|u(x)| ≥ |u(y)| − |u(x)− u(y)| ≥ 100− (2π)1/2
for all x, and then∫u2 > 1, a contradiction to u being in C. Therefore the
functions in C are uniformly bounded. By the Ascoli-Arzela theorem, everysequence in C has a subsequence which converges uniformly on [0, 2π], andhence in L2. Therefore C is precompact.
Proposition 24.2 A is bounded, compact, symmetric, and positive.
Proof. If u ∈ C² and Lu = f, then by integration by parts,

∫_0^{2π} uf dx = ∫_0^{2π} uLu dx = ∫ ((u′)² + qu²).

Now

∫ uf ≤ (∫ u²)^{1/2} (∫ f²)^{1/2} ≤ (1/2)∫ u² + (1/2)∫ f²,

using the inequality xy ≤ (x² + y²)/2. Since q ≥ 1,

∫ (u′)² + (1/2)∫ u² ≤ (1/2)∫ f².

This implies A is bounded.
(Since

∫ uf = ∫ (au)(f/a) ≤ (a²/2)∫ u² + (1/(2a²))∫ f²,

all we really need is that q is bounded below by a positive constant.)
To show A is symmetric, note (Lu, v) = (u, Lv). Letting f = Lu and g = Lv, this says (f, Ag) = (Af, g).
To see that A is positive, note

(Af, f) = ∫ uf = ∫ ((u′)² + qu²) ≥ 0.
It remains to prove A is compact.
Since ∫ (u′)² + (1/2)∫ u² ≤ (1/2)∫ f², if f is in the unit ball in L², then ∫ (Af)² ≤ 1 and ∫ ((Af)′)² ≤ 1/2. By the lemma, the image of the unit ball under A is precompact.
We can extend A to all of L². Let en be eigenfunctions, αn the eigenvalues, so Aen = αnen. ‖e′n‖ ≤ c‖Aen‖ < ∞, so en is continuous. Len = αn⁻¹ en. The αn → 0, so αn⁻¹ → ∞.
Remark 24.3 The above also applies to Lu = −∆u + qu in D with u = 0on the boundary.
For an example, take q(x) ≡ c. Solving
−f ′′(x) + cf(x) = λf(x),
we see that the solutions satisfying the boundary conditions are of the form en = cn sin(nx/2), with eigenvalues λ = c + n²/4. From our theory we know the en are orthogonal and, suitably normalized, form a complete orthonormal basis.
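The eigenvalues λ = c + n²/4 can be approximated numerically. A finite-difference sketch in Python (an illustration, not from the notes; the grid size m is an arbitrary choice):

```python
import numpy as np

# Discretize L u = -u'' + c u on [0, 2*pi], u(0) = u(2*pi) = 0, by
# second-order finite differences on m interior grid points.
m, c = 400, 1.0
h = 2 * np.pi / (m + 1)
L = (np.diag(np.full(m, 2.0 / h**2 + c))
     + np.diag(np.full(m - 1, -1.0 / h**2), 1)
     + np.diag(np.full(m - 1, -1.0 / h**2), -1))

evals = np.linalg.eigvalsh(L)
# Exact eigenvalues c + (n/2)^2 with eigenfunctions sin(n x / 2):
exact = np.array([c + (n / 2) ** 2 for n in range(1, 6)])
assert np.allclose(evals[:5], exact, rtol=1e-3)
```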
25 Trace class operators
25.1 Polar decomposition
Proposition 25.1 Suppose T : H → H is compact. Then T = UA, where A is a positive symmetric operator, U∗U = I on the range RA of A, and U = 0 on RA⊥.
This is called the polar decomposition and is the analog of writing a com-plex number as reiθ.
Proof. T ∗T is a non-negative symmetric compact operator as (T ∗Tu, u) =(Tu, Tu) ≥ 0. So there exists a square root: A = (T ∗T )1/2.
‖Tu‖2 = (Tu, Tu) = (u, T ∗Tu) = (u,A2u) = (Au,Au) = ‖Au‖2. (1)
So if Au = Av, then A(u − v) = 0, so T (u − v) = 0, so Tu = Tv. DefineU : RA → H by U(Au) = Tu.
By (1), U is an isometry on RA. Define Uh = 0 if h ∈ RA⊥. For such h, (Uh, v) = (h, U∗v) = 0 for all v, so U∗ : H → (RA⊥)⊥ ⊂ R̄A.
Write R for RA. We claim U∗Uw = w for w ∈ R. If z, w ∈ R,
(z, w) = (Uz, Uw) = (z, U∗Uw)
since U is an isometry. So (z, U∗Uw − w) = 0, or (U∗Uw − w) ⊥ R. SinceU∗ : H → R, then U∗Uw − w is in R and is orthogonal to R, hence is 0.Therefore U∗Uw = w.
The only place we used the compactness was to define (T∗T)^{1/2}. We'll see later that we can define the positive square root of T∗T for any bounded linear operator T.
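In finite dimensions the polar decomposition can be read off from the singular value decomposition. A numpy sketch (a finite-dimensional illustration, not the operator-theoretic construction itself):

```python
import numpy as np

rng = np.random.default_rng(1)
T = rng.standard_normal((5, 5))

# From the SVD T = W S V^T: A = (T^* T)^{1/2} = V S V^T and U = W V^T,
# giving the polar decomposition T = U A.  The singular values of T
# are the eigenvalues of A.
W, s, Vt = np.linalg.svd(T)
A = Vt.T @ np.diag(s) @ Vt
U = W @ Vt

assert np.allclose(T, U @ A)                        # T = U A
assert np.allclose(U.T @ U, np.eye(5))              # U is an isometry
assert np.allclose(np.linalg.eigvalsh(A)[::-1], s)  # eigenvalues of A = s_j
```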
T compact implies that T ∗T is compact, so A is compact. Let sj be theeigenvalues of A. These are called the singular values of T .
25.2 Trace class
Let T : H → H be compact. T is of trace class if

‖T‖tr = Σ sj(T) < ∞.
Proposition 25.2 Let T be of trace class, B bounded. Then

(1) ‖T‖tr = ‖T∗‖tr.

(2) ‖BT‖tr ≤ ‖B‖ ‖T‖tr.

(3) ‖TB‖tr ≤ ‖B‖ ‖T‖tr.

(4) ‖S + T‖tr ≤ ‖S‖tr + ‖T‖tr.
Proof. (1) We will show sj(T) = sj(T∗). Since sj(T∗) is an eigenvalue of (T∗∗T∗)^{1/2} = (TT∗)^{1/2}, it suffices to show that TT∗ and T∗T have the same nonzero eigenvalues.

Let z, λ be an eigenvector, eigenvalue pair for T∗T with λ ≠ 0, so that T∗Tz = λz. Then TT∗Tz = λTz, and λ is an eigenvalue for TT∗ with eigenvector Tz. (Tz ≠ 0 since λ ≠ 0.)
(2) We will show sj(BT) ≤ ‖B‖ sj(T). We have

(T∗B∗BTu, u) = ‖BTu‖² ≤ ‖B‖² ‖Tu‖² = ‖B‖² (T∗Tu, u).

So

(BT)∗BT ≤ ‖B‖² T∗T,

hence

sj(BT)² ≤ ‖B‖² sj(T)².

(3)

sj(TB) = sj(B∗T∗) ≤ ‖B∗‖ sj(T∗) = ‖B‖ sj(T).
(4) We will show

‖T‖tr = sup Σn |(Tfn, en)|,

where the supremum is over all orthonormal bases {fn}, {en}.

Let zj be normalized eigenvectors for A: Azj = sjzj, ‖zj‖ = 1. Then

f = Σ (f, zj) zj,   Af = Σ sj (f, zj) zj.

Then

Tf = UAf = Σ sj (f, zj) wj,

where wj = Uzj. The wj form an orthonormal basis for R = RA. Then

(Tf, e) = Σ sj (f, zj)(wj, e)

and so

Σ (Tfn, en) = Σn Σj sj (fn, zj)(wj, en).
The right hand side is less than or equal to

Σj sj (Σn |(fn, zj)|²)^{1/2} (Σn |(wj, en)|²)^{1/2} = Σj sj (‖zj‖² ‖wj‖²)^{1/2} = Σ sj = ‖T‖tr.
Therefore sup∑|(Tfn, en)| ≤ ‖T‖tr.
Now let fn = zn and en = wn (supplemented arbitrarily on (RA)⊥). Then (Tfn, en) = sn and the sup is attained.
In particular,

‖S + T‖tr = sup Σn |((S + T)fn, en)|,

and use subadditivity on the right hand side.
If fn is an orthonormal basis, then we define tr T = Σ (Tfn, fn). This is analogous to the trace of a matrix, which is the sum of the diagonal elements.
Proposition 25.3 The definition of trace is independent of the choice oforthonormal basis. If T is of trace class, then the sum converges absolutely.
Proof. We have

Σ (Tfn, en) = Σn Σj sj (fn, zj)(wj, en).

Take en = fn to get

Σ (Tfn, fn) = Σn Σj sj (fn, zj)(wj, fn).

We have already shown that the right hand side converges absolutely, and the sum is bounded by ‖T‖tr. By Parseval,

Σn (fn, zj)(wj, fn) = (wj, zj),

so

Σ (Tfn, fn) = Σj sj (wj, zj),

which is independent of the sequence fn. Here zj are the eigenvectors of A and wj = Uzj.
Proposition 25.4 (1) |tr T| ≤ ‖T‖tr.

(2) tr T is a linear function of T.

(3) tr T∗ is the complex conjugate of tr T.

(4) If B is bounded, tr BT = tr TB.
(4) If B is bounded, tr BT = tr TB.
Proof. (1) has been done already. (2) and (3) follow from

tr T = Σ (Tfn, fn).
For (4),

Tf = UAf = U(Σ sj (f, zj) zj) = Σj sj (f, zj) wj.

Then BTfn = Σ sj (fn, zj) Bwj, so using Parseval,

Σ (BTfn, fn) = Σn Σj sj (fn, zj)(Bwj, fn) = Σj sj (Bwj, zj).

From Tf = Σ sj (f, zj) wj,

TBf = Σ sj (Bf, zj) wj = Σ sj (f, B∗zj) wj.

So

tr TB = Σ (TBfn, fn) = Σn Σj sj (fn, B∗zj)(wj, fn) = Σj sj (wj, B∗zj) = Σj sj (Bwj, zj),

which agrees with tr BT.
Theorem 25.5 tr T =∑λj if T is of trace class, compact, and symmetric.
Proof. Let fn be an orthonormal basis consisting of eigenvectors of T. Then

tr T = Σ (Tfn, fn) = Σ λn (fn, fn) = Σ λn.
This theorem is true under much weaker assumptions on T .
Define K : L²[0, 1] → L²[0, 1] by

Ku(s) = ∫_0^1 K(s, t)u(t) dt.

K∗ has kernel K(t, s).
Suppose K is continuous, symmetric, and real-valued. Then K is compact. This is an exercise, but the idea is to use equicontinuity of Ku:

|Ku(s) − Ku(v)| = |∫_0^1 [K(s, t) − K(v, t)]u(t) dt|
≤ (∫_0^1 |K(s, t) − K(v, t)|² dt)^{1/2} (∫_0^1 u(t)² dt)^{1/2}.

So K(B), the image of the unit ball B, is a precompact subset of C[0, 1] ⊂ L²[0, 1].
Therefore there exists a complete orthonormal system (κj, ej) of eigenvalues and eigenfunctions. K maps L² into C[0, 1], so ej = κj⁻¹ Kej is continuous if κj ≠ 0.
Theorem 25.6 (Mercer) Suppose K is real-valued, symmetric, and continuous. Suppose K is positive: (Ku, u) ≥ 0 for all u ∈ H. Then

K(s, t) = Σj κj ej(s) ej(t),

and the series converges uniformly and absolutely.
An example is to let K = Pt, the transition density of absorbing or re-flecting Brownian motion.
Proof. K ≥ 0 on the diagonal: If not, say K(r, r) < 0, then K(s, t) < 0 if |s − r|, |t − r| < δ for some δ. Take u = 1_{[r−δ/2, r+δ/2]}. Then

(Ku, u) = ∫∫ K(s, t)u(t)u(s) ds dt < 0,

a contradiction.
Let KN(s, t) = Σ_{j=1}^N κj ej(s) ej(t). K − KN is a positive operator, for its eigenvectors are the ej and its eigenvalues are κj ≥ 0 for j > N (and 0 otherwise). So K − KN is non-negative on the diagonal:

0 ≤ K(s, s) − Σ_{j=1}^N κj ej(s)².

Each term is non-negative, so the sum converges for each s. By Dini's theorem, Σ_{j=1}^N κj ej(s)² converges uniformly and absolutely in s. By Cauchy-Schwarz, KN(s, t) converges uniformly in s and t.
Let the limit be K∞. We need to show K∞ = K. K and K∞ have thesame eigenfunctions and the same eigenvalues. They both map all functionsthat are orthogonal to all the ej’s to 0. Therefore Ku = K∞u for all u. Sothey have the same kernel.
Set s = t:

K(s, s) = Σ κj ej(s)².

Integrate over s:

∫_0^1 K(s, s) ds = Σ κj.
Therefore, since the left hand side is finite, K is of trace class. This is truemore generally: K need not be symmetric.
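Mercer's expansion can be checked on a discretized kernel. A numerical sketch (the kernel K(s, t) = min(s, t), the Brownian covariance, is my choice for illustration, not the example treated in the notes):

```python
import numpy as np

# Discretize the integral operator with kernel K(s,t) = min(s,t) on
# [0,1] by the symmetric matrix K(s_i, s_j) / n on a uniform grid.
n = 200
s = (np.arange(n) + 0.5) / n
K = np.minimum.outer(s, s)
M = K / n

kappa, v = np.linalg.eigh(M)    # eigenvalues kappa_j, orthonormal columns
e = np.sqrt(n) * v              # e[:, j] approximates e_j at the grid points

# Mercer: K(s,t) = sum_j kappa_j e_j(s) e_j(t) -- exact at the grid:
K_rebuilt = (e * kappa) @ e.T
assert np.allclose(K, K_rebuilt)

# The integral of K(s,s) ds equals the sum of the eigenvalues:
assert np.isclose(K.diagonal().sum() / n, kappa.sum())
```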
K : H → H is a Hilbert-Schmidt operator if there exists an orthonormal basis ej with Σ ‖Kej‖² < ∞. We will not prove this, but K is Hilbert-Schmidt if and only if Σ sj(K)² < ∞.
26 Spectral theory of symmetric operators
We let M be a symmetric operator on a complex Hilbert space, so that (Mx, y) = (x, My).
If A is compact, we can write x = Σ an en and Ax = Σ λn an en. Let En be the projection onto the eigenspace with eigenvalue λn, so x = Σ Enx and Ax = Σ λn En(x).

If we define a projection-valued measure E(S) by

E(S) = Σ_{λn∈S} En

for S a Borel subset of R, then x = ∫ E(dλ)x and Ax = ∫ λ E(dλ)x.
Here E is a pure point measure. In general, we get the same result, butE might not be pure point.
Proposition 26.1 (1) If B is bounded and symmetric, then (Bx, y) is bounded, linear in x, and skew linear in y (that is, (Bx, cy) = c̄ (Bx, y)).

(2) If b(x, y) is skew-symmetric (b(y, x) is the complex conjugate of b(x, y)), linear in x, and |b(x, y)| ≤ c‖x‖ ‖y‖, then b(x, y) = (x, By), where B is bounded, symmetric, and ‖B‖ ≤ c.
Proof. (1) is easy.
(2) Fix y, let ℓ(x) = b(x, y); then |ℓ(x)| ≤ c‖x‖ ‖y‖. So there exists w, depending on y, such that b(x, y) = (x, w). If x = w,

‖w‖² = b(w, y) ≤ c‖w‖ ‖y‖,

or ‖w‖ ≤ c‖y‖. Define B by w = By. Then (x, By) = b(x, y), and taking complex conjugates in b(y, x) = (y, Bx) gives (x, By) = (Bx, y), so B is symmetric.
26.1 Spectrum of symmetric operators
Proposition 26.2 If M is bounded and symmetric, then σ(M) ⊂ R.
Proof. Let λ = α + iβ, β ≠ 0. We want to show that λ is in the resolvent set. Define B(x, y) = (x, (M − λ)y). B is linear in x and skew linear in y, and

|B(x, y)| ≤ ‖x‖ ‖(M − λ)y‖ ≤ ‖x‖ ‖y‖ (‖M‖ + |λ|).

Now

B(y, y) = (y, (M − λ)y) = (y, My) − λ̄(y, y) = (y, My) − α(y, y) + iβ(y, y).

Since (x, Mx) = (Mx, x) is the complex conjugate of (x, Mx), (x, Mx) is real. Therefore

|B(y, y)| ≥ |β| (y, y) = |β| ‖y‖².
By the Lax-Milgram lemma, if z ∈ H and `(x) = (x, z), there exists ysuch that B(x, y) = `(x) for all x. So
(x, (M − λ)y) = B(x, y) = (x, z).
This is true for all x, so z = (M − λ)y, which proves M − λ is invertible.
Proposition 26.3 |σ(M)| = ‖M‖.
Proof.
‖Mx‖² = (Mx, Mx) = (x, M²x) ≤ ‖x‖ ‖M²x‖ ≤ ‖x‖² ‖M²‖.

So ‖M‖² ≤ ‖M²‖, and iterating, ‖M‖ⁿ ≤ ‖Mⁿ‖ if n = 2^k. The other direction follows from ‖AB‖ ≤ ‖A‖ ‖B‖. Taking nth roots along n = 2^k,

|σ(M)| = lim ‖Mⁿ‖^{1/n} = ‖M‖.
Proposition 26.4 Let a = inf‖x‖=1(x,Mx) and b the supremum. Thenσ(M) ⊂ [a, b] and a, b ∈ σ(M).
Proof. Let λ < a. Then

(x, (M − λ)x) = (x, Mx) − λ(x, x) ≥ (a − λ)‖x‖².

If we define B(x, y) = (x, (M − λ)y), then the hypotheses of the Lax-Milgram lemma hold, and as in the proof of Proposition 26.2 we see that M − λ is invertible. Thus λ ∉ σ(M), and similarly for λ > b.

We have |(x, Mx)| ≤ ‖x‖ ‖Mx‖ ≤ ‖M‖ if ‖x‖ = 1, so |a|, |b| ≤ ‖M‖. Since σ(M) ⊂ [a, b], we also have |σ(M)| ≤ |a| ∨ |b|, and since ‖M‖ = |σ(M)|, all of these quantities are equal. So if b > |a|, some point of σ(M) ⊂ [a, b] has absolute value b, which forces b ∈ σ(M); if |a| > b, then a ∈ σ(M).

Replacing M by M + cI shifts the spectrum by c, and a suitable shift reduces the remaining cases to these. We conclude both a, b ∈ σ(M).
Proposition 26.5 Suppose M and N are bounded and symmetric. Define dist(σ(M), σ(N)) to be the larger of

max_{ν∈σ(N)} min_{μ∈σ(M)} |ν − μ|  and  max_{μ∈σ(M)} min_{ν∈σ(N)} |ν − μ|.

Then dist(σ(M), σ(N)) ≤ ‖M − N‖.
Proof. Let d = ‖M − N‖. Suppose the first quantity is larger than d, so for some ν ∈ σ(N), min_{μ∈σ(M)} |μ − ν| > d.

Then ν ∈ ρ(M), so M − ν is invertible. Since σ((M − ν)⁻¹) = (σ(M) − ν)⁻¹, the spectral radius satisfies |σ((M − ν)⁻¹)| < d⁻¹, and therefore

d⁻¹ > |σ((M − ν)⁻¹)| = ‖(M − ν)⁻¹‖.

Write

N − νI = M − νI + N − M = (M − νI)(I + K),

where

K = (M − ν)⁻¹(N − M).

Note

‖K‖ ≤ ‖(M − ν)⁻¹‖ ‖N − M‖ < d⁻¹ · d = 1.

Therefore I + K is invertible, and so N − νI is invertible, a contradiction to ν ∈ σ(N).
26.2 Functional calculus for symmetric operators
Let q be a polynomial with real coefficients. If M is symmetric, then q(M) is symmetric. Also, σ(q(M)) = q(σ(M)). So

‖q(M)‖ = sup_{λ∈σ(M)} |q(λ)|.
Let f be continuous on σ(M). Extend f continuously to an interval I containing σ(M). By the Weierstrass approximation theorem, there are polynomials qn converging uniformly to f on I. Given ε, there exists N such that

sup_{λ∈I} |qn(λ) − qm(λ)| < ε

if n, m ≥ N. Then

‖qn(M) − qm(M)‖ < ε,

so qn(M) is a Cauchy sequence. Thus lim_{n→∞} qn(M) exists, and we call the limit f(M).
Proposition 26.6 (1) We have (f + g)(M) = f(M) + g(M) and (fg)(M) = f(M)g(M).

(2) ‖f(M)‖ = sup_{λ∈σ(M)} |f(λ)|.

(3) f(M) is symmetric and σ(f(M)) = f(σ(M)).
Proof. (1) This holds for polynomials, so take limits.
(2)

‖f(M)‖ = lim ‖qn(M)‖ = lim sup_{λ∈σ(M)} |qn(λ)| = sup |f(λ)|.
(3) qn(M) is symmetric, and then
(x, f(M)y) = lim(x, qn(M)y) = lim(qn(M)x, y) = (f(M)x, y).
Finally, since

dist(σ(qn(M)), σ(f(M))) ≤ ‖qn(M) − f(M)‖ → 0,

σ(f(M)) is the limit of σ(qn(M)) = qn(σ(M)), and this limit is f(σ(M)).
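For a symmetric matrix, f(M) can be computed by applying f to the eigenvalues, and the properties in Proposition 26.6 can be checked numerically. A numpy sketch (illustration only):

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.standard_normal((5, 5))
M = (X + X.T) / 2

def f_of_M(f, M):
    """Functional calculus for a symmetric matrix: apply f to the
    eigenvalues and keep the eigenvectors."""
    lam, V = np.linalg.eigh(M)
    return V @ np.diag(f(lam)) @ V.T

# (fg)(M) = f(M) g(M), and ||f(M)|| = sup of |f| over the spectrum:
assert np.allclose(f_of_M(lambda t: np.exp(t) * np.sin(t), M),
                   f_of_M(np.exp, M) @ f_of_M(np.sin, M))
lam = np.linalg.eigvalsh(M)
norm_fM = np.linalg.svd(f_of_M(np.exp, M), compute_uv=False)[0]
assert np.isclose(norm_fM, np.abs(np.exp(lam)).max())
```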
M is positive if (Mx, x) ≥ 0 for all x.
Proposition 26.7 Let M be bounded and symmetric. M is positive if andonly if σ(M) ≥ 0.
Proof. If σ(M) ≥ 0, then f(λ) = √λ is a continuous function for λ ≥ 0, and so N = √M exists. Then

(Mx, x) = (N²x, x) = (Nx, Nx) ≥ 0.
If M is positive, σ(M) ⊂ [a,∞), where a = inf‖x‖=1(x,Mx) ≥ 0.
Corollary 26.8 Every positive symmetric operator has a positive symmetricsquare root.
As mentioned before, this allows us to write T = UA, where A is positiveand symmetric and U is an isometry on RA and 0 on R⊥A.
26.3 Spectral resolution
Fix x, y ∈ H and define ℓx,y(f) = (f(M)x, y) for f ∈ C(σ(M)). Recall σ(M) is bounded and closed, hence compact. ℓx,y is a linear functional, and by the Riesz representation theorem there exists a complex valued measure mx,y such that

(f(M)x, y) = ∫_{σ(M)} f(λ) mx,y(dλ).
Proposition 26.9 (1) mx,y is linear in x and skew linear in y.

(2) my,x is the complex conjugate of mx,y.

(3) The total variation of mx,y is less than or equal to ‖x‖ ‖y‖.

(4) The measure mx,x is real and non-negative.
Proof. (1) `x,y is linear in x and skew linear in y. mx+z,y and mx,y + mz,y
both represent `x+z,y, and by the uniqueness of the Riesz representation,mx,y +mz,y = mx+z,y.
(2) Since M is symmetric, (f(M)x, y) is skew symmetric.
(3) The total variation is the same as the norm of ℓx,y. But

|ℓx,y(f)| = |(f(M)x, y)| ≤ ‖f(M)‖ ‖x‖ ‖y‖ ≤ (sup_λ |f|) ‖x‖ ‖y‖.

(4) If f ≥ 0, then σ(f(M)) = f(σ(M)) ⊂ [0,∞), so f(M) is a positive operator. Then ℓx,x(f) = (f(M)x, x) ≥ 0.
If S ⊂ σ(M), then mx,y(S) is a bounded symmetric functional of x and y.So there exists a bounded symmetric operator E(S) such that
mx,y(S) = (E(S)x, y).
Proposition 26.10 (1) E∗(S) = E(S).
(2) ‖E(S)‖ ≤ 1.
(3) E(∅) = 0, E(σ(M)) = I.
(4) If S, T are disjoint, E(S ∪ T ) = E(S) + E(T ).
(5) E(S) and M commute.
(6) E(S ∩ T ) = E(S)E(T ).
(7) E(S) is an orthogonal projection. If S, T are disjoint, then E(S), E(T )are orthogonal.
(8) E(S) and E(T ) commute.
Proof. (1) E(S) is symmetric.
(2) From the fact that the total variation of mx,y is bounded by ‖x‖ ‖y‖.

(3) mx,y(∅) = 0, so E(∅) = 0. If f ≡ 1, then f(M) = I, and

(x, y) = ∫_{σ(M)} mx,y(dλ) = (E(σ(M))x, y).
This is true for all y, so x = E(σ(M))x holds for all x.
(4) Because mx,y is additive.
(5)
mMx,y(f) = (f(M)Mx, y) = (Mf(M)x, y) = (f(M)x,My) = mx,My(f),
where mx,y(f) denotes the integral of f with respect to mx,y.
(6) (Lax never really does this one clearly.) Since mx,y is a finite measure,we can approximate (E(S)x, y) = mx,y(S) by mx,y(f). So it suffices to showE(f)E(g) = E(fg) for f, g continuous and use approximations.
Now

(E(f)E(g)x, y) = ∫ f(λ) m_{E(g)x,y}(dλ) = (f(M)E(g)x, y)
= (E(g)x, f(M)y) = ∫ g(λ) m_{x,f(M)y}(dλ)
= (g(M)x, f(M)y) = (f(M)g(M)x, y)
= ((fg)(M)x, y) = ∫ (fg)(λ) mx,y(dλ)
= (E(fg)x, y).

This is true for all y, so E(f)E(g)x = E(fg)x.
(7) Setting S = T in (6) shows E(S) = E(S)², so E(S) is a projection. If S ∩ T = ∅, then E(S)E(T) = E(∅) = 0, so they are orthogonal.
(8)E(S)E(T ) = E(S ∩ T ) = E(T ∩ S) = E(T )E(S).
We have proved most of the following:
Theorem 26.11 Let H be a complex Hilbert space and M a bounded symmetric operator. There exists a projection valued measure E such that E(S ∩ T) = E(S)E(T),

f(M) = ∫_{σ(M)} f(λ) E(dλ),

and the measure E is unique.
A few remarks.
(1) Uniqueness follows from the uniqueness of mx,y.
(2) Suppose f is bounded and measurable. If f is simple, i.e., f = Σ ci χ_{Ai}, where the Ai are disjoint, we define f(M) = Σ ci E(Ai). In the next proposition, we show

‖f(M)‖ = sup_{λ∈σ(M)} |f(λ)|

if f is simple. If f is bounded and measurable, we can take fn simple converging to f uniformly. Then

‖fn(M) − fm(M)‖ = sup_{λ∈σ(M)} |fn − fm|,

since fn − fm is simple, and therefore fn(M) is a Cauchy sequence. We define f(M) to be the limit of fn(M).
(3) We showed that the above equality holds in the weak sense:
(f(M)x, y) = · · · .
But in fact the equality holds in the norm topology. One needs to show thatthe integral exists, and to do that one needs to approximate by Riemann-Stieltjes integrals. The key estimate is the next proposition.
(4) If we write mx,y = m^P_{x,y} + m^S_{x,y} + m^C_{x,y}, where this is the decomposition of the measure into pure point part, singular part, and absolutely continuous part, we get corresponding operators E^P, E^S, E^C and we can write

H = R_{E^P} ⊕ R_{E^S} ⊕ R_{E^C}.
Here is the promised proposition.
Proposition 26.12 Suppose the Ai are disjoint. Then

‖Σ ci E(Ai)‖ = max_i |ci|.
Proof. Let r = max_i |ci|. Given x, let xi = E(Ai)x. Then

(xi, xj) = (E(Ai)x, E(Aj)x) = (x, E(Ai)E(Aj)x) = 0

if i ≠ j. Then

‖Σi ci E(Ai)x‖² = (Σi ci xi, Σj cj xj) = Σi |ci|² ‖xi‖² ≤ r² Σi ‖xi‖² ≤ r² ‖x‖².

Therefore the operator norm is less than or equal to r. If j is such that |cj| = r, take x in the range of E(Aj); then Σi ci E(Ai)x = cj x, which implies that the norm is equal to r.
Proposition 26.13

‖f(M)x‖² = ∫ |f(λ)|² mx,x(dλ).

Proof. If f is real,

‖f(M)x‖² = (f(M)x, f(M)x) = ((f(M))²x, x) = ∫ f(λ)² mx,x(dλ),

and the proof for complex f is similar.
26.4 Normal operators
Let F be a commutative Banach algebra with unit. Then if Q ∈ F,

σ(Q) = {p(Q) : p a homomorphism of F into C}.

The reason for this is that λ ∈ σ(Q) if and only if λI − Q is not invertible, which happens if and only if p(λI − Q) = 0 for some homomorphism p. Since p(I) = 1, this happens if and only if λ = p(Q) for some p.
Proposition 26.14 If p is a homomorphism, then p(T∗) is the complex conjugate of p(T).
Proof. Let A = (T + T∗)/2 and B = (T − T∗)/2. Then A∗ = A, B∗ = −B, T = A + B, T∗ = A − B, and so p(T) = p(A) + p(B) and similarly with T replaced by T∗.

It will suffice to show p(A) is real and p(B) is purely imaginary. Write p(A) = a + ib and let U = A + itI, so that U∗ = A − itI. Then

U∗U = A² + t²I.

We have p(U) = a + i(b + t), so |p(U)|² = a² + (b + t)². Also |p(U)|² ≤ ‖U‖² = ‖U∗U‖ ≤ ‖A‖² + t², and hence

a² + (b + t)² ≤ ‖A‖² + t²

for all t, which can only happen if b = 0. (If b > 0, take t large positive, and t large negative if b < 0.) The operator iB is self-adjoint, so apply the above to iB.
Proposition 26.15 If T and T ∗ commute, then ‖T‖ = |σ(T )|.
Proof. We already know this if T is self-adjoint. For general T,

p(T∗T) = p(T∗)p(T) = |p(T)|².

Every point in σ(T) is of the form p(T) for some homomorphism p, so

|σ(T∗T)| = |σ(T)|².

We also know that ‖T∗T‖ = |σ(T∗T)| since T∗T is symmetric. But since

‖Tx‖² = |(x, T∗Tx)| ≤ ‖x‖² ‖T∗T‖,

then ‖T‖² ≤ ‖T∗T‖. We also have ‖T∗T‖ ≤ ‖T‖², and so we obtain ‖T‖² = |σ(T)|².
An operator N is normal if N∗N = NN∗.
Theorem 26.16 Let N be normal. There exists an orthogonal projection valued measure E on σ(N) such that I = ∫_{σ(N)} E(dλ) and N = ∫_{σ(N)} λ E(dλ).
Proof. Let q(x, y) be a polynomial in x and y. If we let w = x + iy ∈ C, we can let x = (w + w̄)/2, y = (w − w̄)/2i, and write q(x, y) = R(w, w̄) for some polynomial R. Set Q = R(N, N∗). Since N and N∗ commute, they each commute with Q, and so Q and Q∗ commute. In the lemma below we will show σ(Q) = {R(λ, λ̄) : λ ∈ σ(N)}. We have ‖Q‖ = |σ(Q)|, since Q and Q∗ commute. Therefore

‖Q‖ = sup_{λ∈σ(N)} |R(λ, λ̄)|.

Now we can define f(N) as the limit of polynomials, and the rest of the proof is as before.
Lemma 26.17

σ(Q) = {R(λ, λ̄) : λ ∈ σ(N)}.
Proof. Operators of the form R(N, N∗) form a commutative algebra with unit. Let F be its closure in the operator norm.

Now p(Q) = R(p(N), p(N∗)), and p(N∗) is the complex conjugate of p(N) by Proposition 26.14. Then σ(Q) is equal to the set of points R(p(N), p(N∗)) where p is a homomorphism, which is the same as the set of R(λ, λ̄) where λ ∈ σ(N).
26.5 Unitary operators
U is unitary if it is linear, isometric, 1-1, and onto. (Cf. rotations) So‖Ux‖ = ‖x‖, or (Ux, Ux) = (x, x). By polarization, (Ux, Uy) = (x, y), so(x, U∗Uy) = (x, y), which implies U∗U = I. U is invertible, since it is 1-1and onto, and thus U−1 = U∗.
It is an exercise to show that σ(U) ⊂ {z : |z| = 1}. Since U∗U = I = UU∗, unitary operators are also normal operators.
27 Spectral theory of unbounded operators
An example: look at {f : f ∈ C², f(0) = f(1) = 0}. By integration by parts,

∫_0^1 f′′ g = − ∫_0^1 f′ g′ = ∫_0^1 f g′′,

or (Lf, g) = (f, Lg). Then L, the second derivative operator, is symmetric, but is an unbounded operator. Writing Lf = λf, there are solutions satisfying the boundary conditions only if λ < 0 and λ = −n²π². The corresponding eigenfunctions are sin nπx, although these are unnormalized. {−n²π²} is unbounded, so L is not a bounded operator on L².
Proposition 27.1 H a Hilbert space over C, M self-adjoint. If M is definedeverywhere on H, then M is bounded.
Proof. M is closed: if xn → x and Mxn → u, then
(Mxn, y) = (xn,My)→ (x,My) = (Mx, y).
Also (Mxn, y)→ (u, y). True for all y, so Mx = u.
By the closed graph theorem, M is bounded.
If H is a Hilbert space over C and D a dense subspace with A defined on D, then D∗ is the set of v ∈ H for which there exists a vector A∗v ∈ H such that (Au, v) = (u, A∗v) for all u ∈ D. Since D is dense, A∗v is uniquely defined. A is self-adjoint if D = D∗ and A∗ = A. (So (Au, v) = (u, Av) for all u, v ∈ D(A), and the domain cannot be enlarged.)
Our goal is to prove
Theorem 27.2 Let A be self-adjoint, D,H be as above. There exists a pro-jection valued measure E such that
(1) E(∅) = 0, E(R) = I.

(2) E(S ∩ T) = E(S)E(T).

(3) E∗(S) = E(S).

(4) E commutes with A.

(5) D = {u : ∫ λ² (E(dλ)u, u) < ∞} and Au = ∫ λ E(dλ)u.
We say z is in the resolvent set if A− zI maps D one-to-one onto H.
Proposition 27.3 If z is not real, then z is in the resolvent set.
Proof. (1) R = Range(A − zI) is a closed subspace.

R is the set of all vectors u of the form Av − zv = u for some v ∈ D. Then (Av, v) − z(v, v) = (u, v). A is self-adjoint, so (Av, v) is real. Looking at the imaginary parts,

−(Im z)‖v‖² = Im (u, v),

so |Im z| ‖v‖² ≤ ‖u‖ ‖v‖, or

‖v‖ ≤ (1/|Im z|) ‖u‖.

If un ∈ R and un → u, then ‖vn − vm‖ ≤ (1/|Im z|)‖un − um‖, so vn is a Cauchy sequence, and hence converges to some point v.

Since Avn − zvn = un → u and zvn converges to zv, then Avn converges, say to r, and then r − zv = u. Since (Avn, w) = (vn, Aw) for w ∈ D, then (r, w) = (v, Aw), which implies v ∈ D and Av = r.

(2) R = H. If not, there exists k ≠ 0 such that k is orthogonal to R, and then

(Av − zv, k) = (Av, k) − (v, z̄k) = 0

for all v ∈ D. Then (Av, k) = (v, z̄k), so k ∈ D and Ak = z̄k. But then (k, Ak) = z(k, k) is not real, a contradiction.

(3) A − zI is one-to-one. If not, there exists k ∈ D, k ≠ 0, such that (A − zI)k = 0. But then ‖k‖ ≤ (1/|Im z|)‖0‖ = 0, or k = 0.
If we set R(z) = (A − zI)⁻¹, the resolvent, we have

‖R(z)‖ ≤ 1/|Im z|.

If u, w ∈ H and v = R(z)u, then (A − z)v = u, and

(u, R(z̄)w) = ((A − z)v, R(z̄)w) = (v, (A − z̄)R(z̄)w) = (v, w) = (R(z)u, w).

So the adjoint of R(z) is R(z̄).
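The resolvent bound and the adjoint identity can be checked for a symmetric matrix. A numpy sketch (illustration only; z is an arbitrary non-real choice):

```python
import numpy as np

rng = np.random.default_rng(6)
X = rng.standard_normal((5, 5))
A = (X + X.T) / 2    # symmetric matrix standing in for a self-adjoint A
z = 0.7 + 0.3j       # an arbitrary non-real point

# A - zI is invertible, ||R(z)|| <= 1/|Im z|, and R(z)* = R(conj(z)).
R = np.linalg.inv(A - z * np.eye(5))
norm_R = np.linalg.svd(R, compute_uv=False)[0]   # operator norm
assert norm_R <= 1 / abs(z.imag) + 1e-12

R_conj = np.linalg.inv(A - np.conj(z) * np.eye(5))
assert np.allclose(R.conj().T, R_conj)
```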
27.1 Cayley transform
Define

U = (A − i)(A + i)⁻¹.

This is the image of A under the function

F(z) = (z − i)/(z + i),

which maps the real line to ∂B₁(0) \ {1}.
Proposition 27.4 U is a unitary operator.
Proof. A+ i, A− i map D(A) one-to-one onto H, so U maps H onto itself.
U is norm preserving: Let u ∈ H, v = (A + i)⁻¹u, w = Uu. So (A + i)v = u, (A − i)v = w. Then

‖u‖² = ((A + i)v, (A + i)v) = ‖Av‖² + ‖v‖² + i[(v, Av) − (Av, v)] = ‖Av‖² + ‖v‖²,

and similarly

‖w‖² = ((A − i)v, (A − i)v) = ‖Av‖² + ‖v‖².
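For a symmetric matrix the Cayley transform can be computed directly, and its unitarity and spectral mapping checked. A numpy sketch (illustration only):

```python
import numpy as np

rng = np.random.default_rng(7)
X = rng.standard_normal((5, 5))
A = (X + X.T) / 2   # symmetric matrix standing in for a self-adjoint A
I = np.eye(5)

# Cayley transform U = (A - i)(A + i)^{-1}: unitary, with spectrum
# F(sigma(A)) on the unit circle, where F(z) = (z - i)/(z + i).
U = (A - 1j * I) @ np.linalg.inv(A + 1j * I)

assert np.allclose(U.conj().T @ U, I)        # unitary
mu = np.linalg.eigvals(U)
assert np.allclose(np.abs(mu), 1.0)          # spectrum on the unit circle
lam = np.linalg.eigvalsh(A)
for f in (lam - 1j) / (lam + 1j):            # F(lambda), each eigenvalue of A
    assert np.min(np.abs(mu - f)) < 1e-8
```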
Proposition 27.5 If U is unitary, then σ(U) ⊂ {|z| = 1}.

Proof. λI − U = λ(I − U/λ). Since U is an isometry, ‖U‖ = 1. Then I − (1/λ)U is invertible if (1/|λ|)‖U‖ < 1, that is, if |λ| > 1.

If |λ| < 1, write λI − U = U(λU⁻¹ − I) = U(λU∗ − I). Since ‖λU∗‖ = |λ| < 1, I − λU∗ is invertible.
Proposition 27.6 Given A and U as above and E the spectral resolutionfor U , E({1}) = 0.
Proof. Write E1 for E({1}). If E1 ≠ 0, there exists z ≠ 0 in the range of E1, so z = E1w. Then

Uz = ∫_{σ(U)} λ E(dλ)z = ∫_{σ(U)} λ (E − E1)(dλ)z + ∫_{{1}} λ E1(dλ)z.

The first integral is zero since (E − E1)(A) and E1 are orthogonal for all A. The second integral is equal to

E1z = E1E1w = E1w = z

since E1 is a projection.
We conclude z is an eigenvector for U with eigenvalue 1. So (A− iI)(A+iI)−1z = z. Let v = (A+ iI)−1z, or z = (A+ iI)v. Then
z = (A− iI)(A+ iI)−1z = (A− iI)v,
and then iv = −iv, so v = 0, and hence z = 0, a contradiction.
Let M be a bounded symmetric operator. Let f be bounded and measurable. Define

mx,y(f) = ∫_{σ(M)} f(λ) mx,y(dλ).

This is a bounded symmetric functional of x and y, since this is true for mx,y(S) and we can take limits. So there exists an operator, called f(M), such that

mx,y(f) = (f(M)x, y),   x, y ∈ H.

We have

|(f(M)x, y)| = |mx,y(f)| ≤ (sup |f|) |mx,y|(σ(M)) ≤ (sup |f|) ‖x‖ ‖y‖.

Therefore ‖f(M)‖ ≤ sup |f|. We thus can extend our construction of f(M) from continuous f to bounded and measurable f. We now want to define f(M) for some unbounded functions f.
(The rest of this chapter is from Rudin's Functional Analysis.)
Proposition 27.7 Let

Df = {x : ∫_{σ(M)} |f(λ)|² mx,x(dλ) < ∞}.

Then

(1) Df is a dense subspace of H.

(2) If x, y ∈ H,

∫_{σ(M)} |f(λ)| |mx,y|(dλ) ≤ ‖y‖ (∫_{σ(M)} |f(λ)|² mx,x(dλ))^{1/2}.
(3) If f is bounded and v = f(M)z, then
mx,v(dλ) = f(λ)mx,z(dλ), x, z ∈ H.
Proof. (1) Let S ⊂ σ(M) and z = x + y. Then

‖E(S)z‖² ≤ (‖E(S)x‖ + ‖E(S)y‖)² ≤ 2‖E(S)x‖² + 2‖E(S)y‖².

So

mz,z(S) ≤ 2mx,x(S) + 2my,y(S).

This is true for all S, so

mz,z(dλ) ≤ 2mx,x(dλ) + 2my,y(dλ),

and it follows that Df is a subspace.

Let Sn = {λ ∈ σ(M) : |f(λ)| < n}. If x = E(Sn)z, then E(S)x = E(S ∩ Sn)x, so mx,x(S) = mx,x(S ∩ Sn). Then

∫_{σ(M)} |f(λ)|² mx,x(dλ) = ∫_{Sn} |f(λ)|² mx,x(dλ) ≤ n² ‖x‖² < ∞.

Therefore the range of E(Sn) is contained in Df. Since σ(M) = ∪n Sn, y = lim E(Sn)y for each y ∈ H, hence y is in the closure of Df.
(2) If x, y ∈ H and f is bounded, the measure f(λ) mx,y(dλ) is absolutely continuous with respect to |f(λ)| |mx,y|(dλ), so there exists u with |u| = 1 such that

u(λ)f(λ) mx,y(dλ) = |f(λ)| |mx,y|(dλ).

So

∫_{σ(M)} |f(λ)| |mx,y|(dλ) = ((uf)(M)x, y) ≤ ‖(uf)(M)x‖ ‖y‖.

But

‖(uf)(M)x‖² = ∫ |uf|² dmx,x = ∫ |f|² dmx,x.

So (2) holds for bounded f. Now take a limit.
(3) Let g be continuous. Then

∫_{σ(M)} g dmx,v = (g(M)x, v) = (g(M)x, f(M)z) = ((fg)(M)x, z) = ∫ gf dmx,z.

This is true for all continuous g, so mx,v(dλ) = f(λ) mx,z(dλ).
Theorem 27.8 Let E be a resolution of the identity.

(a) Suppose f : σ(M) → C is measurable. There exists a densely defined operator f(M) with domain Df such that

(f(M)x, y) = ∫_{σ(M)} f(λ) mx,y(dλ)    (1)

and

‖f(M)x‖² = ∫_{σ(M)} |f(λ)|² mx,x(dλ).    (2)

(b) If Dfg ⊂ Dg, then f(M)g(M) = (fg)(M).

(c) f(M)∗ = f̄(M) (the complex conjugate function of f applied to M), and f(M)f(M)∗ = f(M)∗f(M) = |f|²(M).

Proof. (a) If x ∈ Df, then y → ∫_{σ(M)} f dmx,y is a bounded linear functional with norm at most (∫ |f|² dmx,x)^{1/2}. Choose f(M)x ∈ H to satisfy (1) for all y.
Let fn = f 1_{(|f|≤n)}. Then Df−fn = Df. By the dominated convergence theorem,

‖f(M)x − fn(M)x‖² ≤ ∫_{σ(M)} |f − fn|² dmx,x → 0.
Since fn is bounded, (2) holds with fn. Now let n→∞.
(b) and (c): Prove for fn and let n→∞.
Theorem 27.9 (Change of measure principle) Let E be a resolution of the identity on A, Φ : A → B one-to-one and bimeasurable. Let E′(S′) = E(Φ⁻¹(S′)). Then E′ is a resolution of the identity on B, and

∫_B f dm′x,y = ∫_A (f ◦ Φ) dmx,y.
Proof. Prove for f the indicator of a set, use linearity, and take limits.
Proof of the spectral theorem. Start with the unbounded operator A. Let U = (A − iI)(A + iI)⁻¹. Then U is unitary with spectrum contained in ∂B₁(0) \ {1}. Let the resolution of the identity for U be given by E′.

Define Φ to be the inverse of F(z) = (z − i)/(z + i), taking ∂B₁(0) \ {1} onto R. Apply the change of measure principle. If we let E(S) = E′(Φ⁻¹(S)), it is just a matter of checking that E is a resolution of the identity for A.
28 Examples of self-adjoint operators
28.1 Extensions
M is symmetric if (x,My) = (Mx, y). Saying M is self-adjoint includes con-ditions on the domains. The difference only applies for unbounded operators.
C is an extension of B if D(B) ⊂ D(C) and Cu = Bu for u ∈ D(B).
Given B symmetric, so that (Bu, v) = (u,Bv) for all u, v ∈ D(B), can weextend B to a self-adjoint operator?
If un ∈ D(B), un → u, and Bun → w, then

(Bun, v) = (un, Bv) → (u, Bv)

for all v ∈ D(B). Also (Bun, v) → (w, v). Since D(B) is dense in H, w is uniquely determined by u. Define the closure B̄ by B̄u = w for all u, w arising this way, so that (w, v) = (u, Bv) for all v ∈ D(B).
Proposition 28.1 Let B be a densely defined symmetric operator and B̄ its closure.

(1) B̄ is a closed operator.

(2) B̄ is symmetric.

(3) If z ∉ R, then B̄ − z maps D(B̄) one-to-one onto a closed subspace of H.
Proof. (1) is easy.
(2) If u, v ∈ D(B̄), choose vn ∈ D(B) such that vn → v and Bvn → B̄v. Then (B̄u, vn) = (u, Bvn); letting n → ∞, (B̄u, v) = (u, B̄v).
(3) If u ∈ D(B̄), let f = (B̄ − z)u. So

(B̄u, u) − z(u, u) = (f, u).

Since B̄ is symmetric, the first term is real. Taking imaginary parts,

|Im z| ‖u‖² = |Im (f, u)| ≤ ‖f‖ ‖u‖,

so

‖u‖ ≤ (1/|Im z|) ‖f‖.

So B̄ − z is one-to-one.

If fn is in the range of B̄ − z and fn → f, then write (B̄ − z)un = fn. By the above inequality, {un} is a Cauchy sequence, hence converges, say, to u. So un converges and fn converges, hence B̄un converges. Since B̄ is closed, u ∈ D(B̄) and B̄u = f + zu.
Corollary 28.2 If A is self-adjoint, then A is closed.
Proof. A is symmetric, so Ā is a symmetric extension of A; but D(Ā) ⊂ D(A∗) = D(A), so Ā = A and A is closed.
Theorem 28.3 Let A be a symmetric operator. A is self-adjoint if and onlyif C \ R ⊂ ρ(A).
Proof. That A self-adjoint implies that all non-real z are in the resolventset has already been proved.
Suppose z is non-real, so z, z̄ ∈ ρ(A).

We claim (A − z̄)⁻¹ is the adjoint of (A − z)⁻¹. We need to show that if f, g ∈ H, then

((A − z)⁻¹f, g) = (f, (A − z̄)⁻¹g).

Let x = (A − z)⁻¹f and y = (A − z̄)⁻¹g. So we need to show

(x, (A − z̄)y) = ((A − z)x, y).

Since A is symmetric, this is true for all x, y ∈ D(A). Since A − z and A − z̄ map D(A) one-to-one onto H, this is true for all f, g ∈ H.

A is self-adjoint: we have to show that if v ∈ D(A∗), then v ∈ D(A) and A∗v = Av. Suppose v ∈ D(A∗) with w = A∗v. Then (Ax, v) = (x, w) for all x ∈ D(A), or ((A − z)x, v) = (x, w − z̄v). Let x = (A − z)⁻¹f, so

(f, v) = ((A − z)⁻¹f, w − z̄v) = (f, (A − z̄)⁻¹(w − z̄v)).

This must hold for all f ∈ H, so v = (A − z̄)⁻¹(w − z̄v). Since the range of (A − z̄)⁻¹ is D(A), then v ∈ D(A). Hit both sides with A − z̄, and then Av = w.
28.2 Examples
1. Let H = L²(R), B = i(d/dx), and D(B) = C¹₀, the C¹ functions with compact support.
Proposition 28.4 B is symmetric and its closure is self-adjoint.
Proof. Symmetry follows by integration by parts.
Let z ∈ C \ R. We identify the range of B − z, the set C = {f : iu′ − zu = f, u ∈ C¹₀}. If iu′ − zu = f with u ∈ C¹₀, then

i (d/dx)(e^{izx} u) = e^{izx} f,

and integrating, since u has compact support,

0 = ∫_{−∞}^∞ e^{izx} f(x) dx.

Conversely, if this equality holds, define

u(x) = −i ∫_{−∞}^x e^{iz(y−x)} f(y) dy.

Then u ∈ C¹, and if f has compact support, then so does u. The set C is dense in L². Since z ∉ R, the range of B̄ − z is both dense and closed, so the range is all of H. Since B̄ − z is one-to-one, z ∈ ρ(B̄), and therefore B̄ is self-adjoint by Theorem 28.3.
2. H = L²[0,∞), B as before, D(B) as before, but now C¹₀ means support in (0,∞).
Proposition 28.5 B is symmetric, but B is not self-adjoint. B has noself-adjoint extension.
Proof. f ∈ D(B) implies f = 0 in a neighborhood of 0, so we can useintegration by parts as before to show the symmetry.
As before, if f is in the range of B − z, then 0 = ∫_0^∞ e^{izx} f(x) dx. If Im z < 0, then as before the range of B̄ − z is dense in H.

If Im z > 0, e^{izx} is square integrable on [0,∞), and therefore the range of B − z is contained in the proper subspace of f ∈ H satisfying ∫_0^∞ e^{izx} f(x) dx = 0. So B̄ is not self-adjoint.
If A were a self-adjoint extension, A would be an extension of B̄. Let v ∈ D(A) \ D(B̄). Let Im z < 0. B̄ − z maps D(B̄) onto H, so there exists u ∈ D(B̄) such that

(B̄ − z)u = (A − z)v.

A is an extension, so (A − z)(v − u) = 0. A is symmetric, so eigenvalues can only be real; since z ∉ R, v − u = 0, a contradiction.
29 Semigroups
Let X be a Banach space over the complex numbers, and let Z(t) = Z_t, t ≥ 0, be bounded linear operators on X. Z is a semigroup if Z(t + s) = Z(t)Z(s) and Z(0) = I.
Proposition 29.1 (1) Let G : X → X be bounded. Then Z(t) = e^{tG} (defined by e^{tG} = ∑ t^n G^n/n!) is a semigroup that is continuous in the norm topology.

(2) If Z(t) is a semigroup and lim_{t→0} |Z(t) − I| = 0, then Z(t) = e^{tG} for some bounded G.
Proof. (1) This is the functional calculus for bounded operators.

(2) Define

log Z = log(I + (Z − I)) = (Z − I) − (Z − I)²/2 + ··· ,

which converges in norm if |Z − I| < 1. So at least for t small we can define log Z(t).

Choose a such that |Z(t) − I| < 1/3 for t < a, and define L(t) = log Z(t). Z(t) and Z(s) commute, so L(t + s) = L(t) + L(s). So for t rational with t < a, (1/t)L(t) does not depend on t; call the common value G, so L(t) = tG for rational t < a.

Z(t + h) − Z(t) = Z(t)[Z(h) − I],

so Z, and hence L, is continuous in t. So L(t) = tG for all t < a. Then e^{tG} = Z(t) for t < a. By the semigroup property, this holds for all t.
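In finite dimensions everything in Proposition 29.1 can be checked numerically. The following sketch (the 2×2 matrix G is an arbitrary illustrative choice, not from the text) verifies the semigroup property of e^{tG}:

```python
# Numerical sketch: for a bounded operator G (here a 2x2 matrix), the series
# e^{tG} = sum_n t^n G^n / n! gives a norm-continuous semigroup.  We verify
# Z(t+s) = Z(t) Z(s) and Z(0) = I using scipy's matrix exponential.
import numpy as np
from scipy.linalg import expm

G = np.array([[0.0, 1.0], [-2.0, -3.0]])   # illustrative bounded generator

def Z(t):
    return expm(t * G)                     # matrix exponential = the series

t, s = 0.7, 1.3
assert np.allclose(Z(t + s), Z(t) @ Z(s))  # semigroup property
assert np.allclose(Z(0.0), np.eye(2))      # Z(0) = I
```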
29.1 Strongly continuous semigroups
To use this in PDE, we need weaker conditions. Z(t) is strongly continuous at t = 0 if |Z(t)x − x| → 0 as t → 0 for all x ∈ X.
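A standard illustration (not worked out in the text) of why strong continuity is strictly weaker than norm continuity is the translation semigroup on L²(R):

```latex
(Z(t)f)(x) = f(x+t), \qquad \|Z(t)f - f\|_{L^2} \to 0 \ \text{as } t \to 0
\ \text{for each } f \in L^2(\mathbb{R}),
\qquad \text{but} \qquad \|Z(t) - I\| = 2 \ \text{for all } t \neq 0,
```

since on the Fourier transform side Z(t) acts as multiplication by e^{itξ} and sup_ξ |e^{itξ} − 1| = 2.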
Proposition 29.2 Suppose Z(t) is a semigroup that is strongly continuous at 0.

(1) There exist b and k such that |Z(t)| ≤ be^{kt}.
(2) Z(t)x is strongly continuous in t for all x ∈ X.
Proof. (1) We claim |Z(t)| is bounded near 0. If not, there exist t_j → 0 such that |Z(t_j)| → ∞. By the uniform boundedness principle, Z(t_j)x cannot converge to x for every x, a contradiction to strong continuity. So there exist a, b such that |Z(t)| ≤ b for t ≤ a; we may take b ≥ 1.

Write t = na + r with 0 ≤ r < a. Z(t) = Z(a)^n Z(r), so

|Z(t)| ≤ |Z(a)|^n |Z(r)| ≤ b^{n+1} ≤ be^{kt}

with k = (1/a) log b.
(2) For s ≤ t, Z(t)x − Z(s)x = Z(s)[Z(t − s)x − x], so

|Z(t)x − Z(s)x| ≤ |Z(s)| |Z(t − s)x − x| → 0 as t − s → 0.
Suppose D is dense in X and G : D → X is closed. We say z ∈ ρ(G), the resolvent set, if z − G maps D = D(G) one-to-one onto X. Write R(z) = R_z = (zI − G)^{-1}. Since G is closed, R(z) is closed. R(z) is defined on all of X, so by the closed graph theorem, R(z) is a bounded operator.
We define G′ by (Gx, ℓ) = (x, G′ℓ), where D(G′) is the set of ℓ such that (Gx, ℓ) is a bounded linear functional of x ∈ D(G).
Let Z be a strongly continuous one-parameter semigroup. The infinitesimal generator G of Z is defined by

Gx = s-lim_{h→0} (Z(h)x − x)/h,

with the domain of G being those x for which the strong limit exists.
Proposition 29.3 (1) G commutes with Z(t), in the sense that if x ∈ D(G), then Z(t)x ∈ D(G) and GZ(t)x = Z(t)Gx.
(2) D(G) is dense in X.
(3) D(Gn) is dense.
(4) G is closed.
(5) If |Z(t)| ≤ be^{kt} and Re z > k, then z ∈ ρ(G). The resolvent of G is the Laplace transform of Z(t).
Proof. (1) We have

(1/h)[Z(t + h) − Z(t)]x = Z(t) (1/h)[Z(h) − I]x = (1/h)[Z(h) − I] Z(t)x.

If x ∈ D(G), the middle term converges to Z(t)Gx as h → 0. So the limit of the third expression exists, and Z(t)x ∈ D(G) with GZ(t)x = Z(t)Gx. Moreover (d/dt)Z(t)x = Z(t)Gx = GZ(t)x.
(2) We claim

Z(t)x − x = G ∫_0^t Z(s)x ds.

To see this, note Z(s)x is a continuous function of s, so the integral exists. By a Riemann sum approximation,

(1/h)[Z(h) − I] ∫_0^t Z(s)x ds = (1/h) ∫_0^t [Z(s + h)x − Z(s)x] ds
= (1/h) ∫_t^{t+h} Z(s)x ds − (1/h) ∫_0^h Z(s)x ds
→ Z(t)x − x.

So ∫_0^t Z(s)x ds ∈ D(G), proving the claim. But (1/t) ∫_0^t Z(s)x ds → x as t → 0, so D(G) is dense.
(3) Let φ be C^∞ and supported in [0, 1]. Let

x_φ = ∫ φ(s)Z(s)x ds.

Then Gx_φ = −∫ φ′(s)Z(s)x ds. Repeating, x_φ ∈ D(G^n) for every n. Taking φ_j approximating the identity, x_{φ_j} → x, so D(G^n) is dense.
(4) First, Z(t)x − x = ∫_0^t Z(s)Gx ds for x ∈ D(G): both sides are 0 at t = 0, and the derivative of the left side is Z(t)Gx, which equals the derivative of the right side. Now let x_n ∈ D(G), x_n → x, Gx_n → y. Then

Z(t)x_n − x_n = ∫_0^t Z(s)Gx_n ds → ∫_0^t Z(s)y ds.

The left hand side converges to Z(t)x − x. Divide by t and let t → 0; the right hand side converges to y. Therefore x ∈ D(G) and Gx = y, so G is closed.
(5) Let

L(z)x = ∫_0^∞ e^{−zs} Z(s)x ds.

The Riemann integral converges when Re z > k:

|L(z)x| ≤ ∫_0^∞ b e^{(k − Re z)s} |x| ds = (b/(Re z − k)) |x|.
We claim L(z) = R(z). Check that e^{−zt}Z(t) is also a semigroup, with infinitesimal generator G − zI. By (2) applied to this semigroup,

e^{−zt}Z(t)x − x = (G − zI) ∫_0^t e^{−zs} Z(s)x ds.

As t → ∞, the left hand side tends to −x, and since G is closed, the right hand side tends to (G − zI)L(z)x. So x = (zI − G)L(z)x, and L(z) is a right inverse of zI − G. Similarly, we see that it is also a left inverse.
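A numerical sketch of part (5) in finite dimensions. The matrix G, the choice z = 1, and the discretization parameters are illustrative assumptions, not from the text:

```python
# Numerical sketch: with |Z(t)| <= b e^{kt} and Re z > k, the resolvent of G
# is the Laplace transform of the semigroup, R(z) = int_0^infty e^{-zs} Z(s) ds.
# We approximate the integral by a left Riemann sum.
import numpy as np
from scipy.linalg import expm

G = np.array([[-1.0, 1.0], [0.0, -2.0]])   # eigenvalues -1, -2, so k = -1
z = 1.0                                    # Re z > k

ds, T = 0.001, 40.0
E = expm(ds * G)                           # one-step propagator Z(ds)
P = np.eye(2)                              # running value of Z(s)
laplace = np.zeros((2, 2))
for s in np.arange(0.0, T, ds):
    laplace += np.exp(-z * s) * P * ds     # accumulate e^{-zs} Z(s) ds
    P = P @ E

resolvent = np.linalg.inv(z * np.eye(2) - G)
assert np.allclose(laplace, resolvent, atol=1e-2)
```

The tail beyond T = 40 is negligible because the integrand decays like e^{−2s}.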
29.2 Generation of semigroups
Proposition 29.4 A strongly continuous semigroup of operators is uniquely determined by its infinitesimal generator.
Proof. If W and Z have the same generator G, let x ∈ D(G). Then

(d/dt) W(t)Z(s − t)x = W(t)GZ(s − t)x − W(t)GZ(s − t)x = 0.

Therefore

0 = ∫_0^s (d/dr)[W(r)Z(s − r)x] dr = W(s)Z(0)x − W(0)Z(s)x,

or W(s)x = Z(s)x. Now use the fact that D(G) is dense.
Z(t) is a contraction if |Z(t)| ≤ 1 for all t.
Theorem 29.5 (1) The infinitesimal generator G of a strongly continuous semigroup of contractions satisfies (0, ∞) ⊂ ρ(G) and

|R(λ)| = |(λI − G)^{-1}| ≤ 1/λ. (1)
(2) (Hille-Yosida theorem) Let G be a densely defined unbounded operator such that (0, ∞) ⊂ ρ(G) and (1) is satisfied. Then G is the infinitesimal generator of a strongly continuous semigroup of contractions.
Proof. (1) This we have already done; it is the case b = 1, k = 0 of Proposition 29.3(5).

(2) Note nR(n) − I = R(n)G on D(G), since R(n)(nI − G) = I there. Let G_n = nGR(n). Then G_n = n²R(n) − nI, so G_n is a bounded operator. Define Z_n(t) = e^{tG_n}.
Claim: nR(n)x → x for all x.

To prove this,

|nR(n)x − x| = |R(n)Gx| ≤ (1/n)|Gx|,

so the claim is true for x ∈ D(G). Since |nR(n)| ≤ 1 and D(G) is dense in X, the claim holds for all x.
If x ∈ D(G), then G_n x → Gx:

G_n x = nGR(n)x = nR(n)Gx → Gx.
We have

Z_n(t) = e^{tG_n} = e^{−nt} e^{n²R(n)t} = e^{−nt} ∑_m (n²t)^m R(n)^m / m!,

and since |R(n)^m| ≤ n^{−m},

|Z_n(t)| ≤ e^{−nt} e^{nt} = 1.
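The Yosida approximation G_n = n²R(n) − nI can be watched converging in finite dimensions. In this sketch the dissipative 2×2 matrix and the tolerances are illustrative assumptions, not from the text:

```python
# Numerical sketch: G_n = n^2 R(n) - n I is bounded, G_n x -> G x on D(G),
# and the approximating semigroups e^{t G_n} converge to e^{tG}.
import numpy as np
from scipy.linalg import expm

G = np.array([[0.0, 1.0], [-1.0, -0.5]])   # symmetric part <= 0: dissipative
x = np.array([1.0, 2.0])

def yosida(n):
    R = np.linalg.inv(n * np.eye(2) - G)   # resolvent R(n) = (nI - G)^{-1}
    return n**2 * R - n * np.eye(2)        # G_n = n^2 R(n) - n I

# G_n x -> G x and e^{G_n} -> e^{G} as n grows
assert np.linalg.norm(yosida(1000) @ x - G @ x) < 1e-2
assert np.linalg.norm(expm(1.0 * yosida(1000)) - expm(1.0 * G)) < 1e-2
```

The error in both lines is of order 1/n, matching the estimate |nR(n)x − x| ≤ (1/n)|Gx| above.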
G_n and G_m commute with Z_n and Z_m, and

(d/dt) Z_n(s − t)Z_m(t)x = Z_n(s − t)Z_m(t)[G_m − G_n]x.

The norm of the right hand side is bounded by |G_m x − G_n x|. Integrating from 0 to s,

|Z_n(s)x − Z_m(s)x| ≤ s|G_n x − G_m x| → 0

as n, m → ∞, for x ∈ D(G). Therefore Z_n(s)x converges, say to Z(s)x, uniformly in s on bounded intervals. Since |Z_n(s)| ≤ 1 and D(G) is dense, this holds for all x.

Each Z_n(s) is a strongly continuous semigroup of contractions and the convergence is uniform on bounded intervals, so the same holds for Z(s).
It remains to show that G is the infinitesimal generator of Z. We have

Z_n(t)x − x = ∫_0^t Z_n(s)G_n x ds.

If x ∈ D(G), we can let n → ∞ to get

Z(t)x − x = ∫_0^t Z(s)Gx ds.

If H is the generator of Z, dividing by t and letting t → 0, we get D(G) ⊂ D(H) and H = G on D(G). So H is an extension of G. If λ > 0, then λ ∈ ρ(G) ∩ ρ(H), which implies H cannot be a proper extension of G.
This last assertion is proved as follows. Suppose x ∈ D(H) \ D(G). Since (λ − H)x ∈ X,

(λ − G)^{-1}(λ − H)x ∈ D(G) ⊂ D(H).

So, using H = G on D(G),

(λ − H)(λ − G)^{-1}(λ − H)x = (λ − G)(λ − G)^{-1}(λ − H)x = (λ − H)x.

Hit both sides with (λ − H)^{-1} to obtain (λ − G)^{-1}(λ − H)x = x. So x ∈ D(G), a contradiction.
29.3 Perturbation of semigroups
Lemma 29.6 (Lumer-Phillips) Let G be densely defined in a Hilbert space H and suppose (0, ∞) ⊂ ρ(G). Then |R(λ)| ≤ 1/λ for all λ > 0 if and only if Re (x, Gx) ≤ 0 for all x ∈ D(G).
If the last property holds, we say G is dissipative.
Proof. Suppose |R(λ)| ≤ 1/λ, that is,

‖(λI − G)^{-1}u‖² ≤ (1/λ²)‖u‖² for all u.

Let x = (λI − G)^{-1}u. So

(x, x) ≤ (1/λ²)(λx − Gx, λx − Gx).

Expanding the right hand side and simplifying, this becomes

(x, Gx) + (Gx, x) ≤ (1/λ)‖Gx‖².

This is true for all λ > 0; letting λ → ∞ gives Re (x, Gx) ≤ 0.

The converse is left as an exercise.
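A finite dimensional sanity check of the lemma (the matrix is an illustrative assumption, not from the text): for a matrix whose symmetric part is negative semidefinite, Re (x, Gx) ≤ 0 for all x, and the resolvent bound holds for every λ > 0:

```python
# Numerical sketch of the Lumer-Phillips equivalence for a matrix:
# dissipativity (symmetric part of G negative semidefinite) implies
# |(lambda I - G)^{-1}| <= 1/lambda for every lambda > 0.
import numpy as np

G = np.array([[-1.0, 2.0], [-2.0, -0.5]])

# Re (x, Gx) = x^T ((G + G^T)/2) x, so check the symmetric part
assert np.all(np.linalg.eigvalsh((G + G.T) / 2) <= 1e-12)

for lam in (0.5, 1.0, 10.0):
    R = np.linalg.inv(lam * np.eye(2) - G)
    assert np.linalg.norm(R, 2) <= 1 / lam + 1e-12   # spectral norm bound
```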
Theorem 29.7 (Trotter) Suppose G is the infinitesimal generator of a semigroup of contractions in a Hilbert space. Let H be a densely defined dissipative operator such that D(G) ⊂ D(H), and suppose there exist b > 0 and a ∈ (0, 1) such that

‖Hx‖ ≤ a‖Gx‖ + b‖x‖, x ∈ D(G).
Then G+H (defined on D(G)) is the generator of a contraction semigroup.
Proof. First, G + H is closed: let x_n ∈ D(G), x_n → x, and y_n = (G + H)x_n → y. Then

G(x_n − x_m) = y_n − y_m − H(x_n − x_m),

and

‖G(x_n − x_m)‖ ≤ ‖y_n − y_m‖ + a‖G(x_n − x_m)‖ + b‖x_n − x_m‖.

Since a < 1, Gx_n converges, and therefore Hx_n converges as well. G is closed, so x ∈ D(G) and Gx_n → Gx. Since x ∈ D(G) ⊂ D(H),

‖Hx_n − Hx‖ ≤ a‖Gx_n − Gx‖ + b‖x_n − x‖ → 0,

so (G + H)x_n → Gx + Hx; that is, (G + H)x = y.
Next, if λ is sufficiently large, then λ ∈ ρ(G + H). By the Lumer-Phillips lemma, G is dissipative; H is dissipative by hypothesis; so G + H is dissipative. By the easy direction of Lumer-Phillips,

‖x‖ ≤ (1/λ)‖(λI − (G + H))x‖.

Therefore the range of (G + H) − λI is closed.
The range is all of the space; if not, there exists v ≠ 0 perpendicular to the range. G − λI is invertible, so there exists x ∈ D(G) such that (G − λI)x = v. Then v + Hx = ((G + H) − λI)x is in the range, so (v + Hx, v) = 0, or ‖v‖² + (Hx, v) = 0. Hence

‖v‖² ≤ ‖Hx‖ ‖v‖,

and so ‖v‖ ≤ ‖Hx‖. Then

‖Gx − λx‖ = ‖v‖ ≤ ‖Hx‖ ≤ a‖Gx‖ + b‖x‖.

Squaring and using the fact that G is dissipative (so that ‖Gx − λx‖² ≥ ‖Gx‖² + λ²‖x‖²),

‖Gx‖² + λ²‖x‖² ≤ a²‖Gx‖² + 2ab‖Gx‖ ‖x‖ + b²‖x‖².

Since a < 1, for λ large enough this forces x = 0. But then v = (G − λI)x = 0, a contradiction, so the range is the whole space.
Now use the Lumer-Phillips lemma together with the Hille-Yosida theorem (not quite directly, since the estimate above has been established only for large λ).
Examples: non-divergence form operators and Lévy processes.
30 Groups of unitary operators
We prove Stone’s theorem.
Theorem 30.1 (1) Suppose A is a self-adjoint operator on a Hilbert space H. Then there exists a strongly continuous group U(t) of unitary operators with infinitesimal generator iA.
(2) Given a strongly continuous group of unitary operators, the generatoris of the form iA where A is self-adjoint.
Proof. (1) We saw that ‖R(z)‖ ≤ 1/|Im z| for the resolvent of A. It follows that the resolvent sets of iA and −iA contain the positive reals, with ‖(λI ∓ iA)^{-1}‖ ≤ 1/λ, so iA and −iA satisfy the hypotheses of the Hille-Yosida theorem. Let U(t), V(t) be the respective semigroups.
V and U are inverses of each other:

(d/dt) U(t)V(t)x = U(t)iAV(t)x − U(t)iAV(t)x = 0

for x ∈ D(A). So U(t)V(t)x is independent of t. When t = 0, we get x. So U(t)V(t)x = x if x ∈ D(A). But D(A) is dense, so U(t)V(t) = I.
Both U and V are contractions. Since U(t)V(t) = I, they must be norm preserving. This is because

‖x‖ = ‖U(t)V(t)x‖ ≤ ‖V(t)x‖ ≤ ‖x‖,

so ‖x‖ = ‖V(t)x‖, and similarly for U. Since they are norm preserving and invertible, they are unitary. Define U(t) = V(−t) for t < 0; then U is a group of unitary operators.
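In finite dimensions part (1) is concrete: for a Hermitian matrix A, U(t) = e^{itA} is a unitary group. A sketch (the matrix A is an illustrative assumption, not from the text):

```python
# Numerical sketch of Stone's theorem, part (1), in finite dimensions:
# for self-adjoint A, U(t) = e^{itA} is a group of unitary operators.
import numpy as np
from scipy.linalg import expm

A = np.array([[1.0, 2.0 - 1.0j], [2.0 + 1.0j, -3.0]])
assert np.allclose(A, A.conj().T)          # A is self-adjoint

def U(t):
    return expm(1j * t * A)

t, s = 0.4, 1.1
assert np.allclose(U(t) @ U(t).conj().T, np.eye(2))  # unitary
assert np.allclose(U(t + s), U(t) @ U(s))            # group property
assert np.allclose(U(-t), np.linalg.inv(U(t)))       # U(-t) = U(t)^{-1}
```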
(2) Let V(t) = U(−t). Then U(t) and V(t) are strongly continuous semigroups of contractions, and their infinitesimal generators are additive inverses of each other; call them G and −G.

Since both G and −G are infinitesimal generators of contraction semigroups, all real numbers except 0 are in the resolvent set of G. Take x ∈ D(G).
‖U(t)x‖² = (U(t)x, U(t)x) = ‖x‖².
Take the derivative with respect to t:
(Gx, x) + (x,Gx) = 0.
Replacing x by x + y and then y by iy (polarization), we get

(Gx, y) = −(x, Gy)

for x, y ∈ D(G), so G is antisymmetric and G* is an extension of −G. If z ≠ 0 is real, then z ∈ ρ(G), so z ∈ ρ(G*); also z ∈ ρ(−G). So G* cannot be a proper extension of −G, hence G* = −G. Writing A = −iG, we have G = iA and A* = (−iG)* = iG* = −iG = A, so A is self-adjoint.