8/2/2019 Linear Analysis 2010
1/66
Linear Analysis
course code: 151124

October 2010
University of Twente
8/2/2019 Linear Analysis 2010
2/66
Preface

This course is an introduction to Functional Analysis, with the main difference that topology is left out almost entirely. The topics in the notes for the year 2010-2011 differ only marginally from those of previous years, but the text is substantially different and, we hope, more precise and easier to read.
Contents

1 Introduction: real and complex vectors and matrices
  1.1 Vectors and matrices in R^n and R^{k×n}
  1.2 The dot product and orthogonality
  1.3 Euclidean norm
  1.4 Pythagoras
  1.5 Orthogonal complement in R^n
  1.6 Subspace, column space and null space
  1.7 Projection
  1.8 Transpose
  1.9 Normal equations and the projection operator
  1.10 Vectors and matrices in C^n and C^{k×n}
  1.11 Problems

2 Vector space
  2.1 Real vector space
  2.2 Complex vector space
  2.3 Subspace
  2.4 Linear combination and span
  2.5 Basis and dimension
  2.6 Problems

3 Linear transformation
  3.1 Linear transformation
  3.2 Familiar linear transformations
  3.3 Kernel, image and dimension
  3.4 Linear transformation on R^n
  3.5 Matrix representation and eigenvectors
  3.6 Problems

4 Normed vector space
  4.1 Norm
  4.2 Cauchy sequence
  4.3 Banach space = complete vector space
  4.4 Bounded linear operator
  4.5 Problems

5 Inner product
  5.1 Real inner product
  5.2 Complex inner product
  5.3 Norm
  5.4 Orthogonal complement
  5.5 Cauchy-Schwarz
  5.6 More examples
  5.7 Orthogonal projection
  5.8 Orthonormal sequences and Parseval
  5.9 Gram-Schmidt process
  5.10 Problems

6 Hilbert space
  6.1 Hilbert space
  6.2 Complete orthonormal basis
  6.3 Adjoint operator on Hilbert space
  6.4 Self-adjoint operators
  6.5 Unitary operators and norm preservation
  6.6 Problems
Index
Notation

tr(A)      trace of a square matrix: tr(A) = Σ_i aii
det(A)     determinant of a square matrix A

Natural, integer, rational, real, complex numbers:

N          set of positive integers {1, 2, 3, . . .}
N0         set of nonnegative integers {0, 1, 2, 3, . . .}
Z          set of integers {. . . , −2, −1, 0, 1, 2, . . .}
Q          set of rational numbers { n/k | n, k ∈ Z, k ≠ 0 }
R          set of real numbers
C          set of complex numbers

Real and complex vectors and matrices:

R^n        set of ordered n-tuples (u1, . . . , un) with uk ∈ R, k = 1, 2, . . . , n
C^n        set of ordered n-tuples (u1, . . . , un) with uk ∈ C, k = 1, 2, . . . , n

Sequence space:

ℓ          sequence space {(u1, u2, . . .) | uk ∈ R, k ∈ N}. It can also be written as {u : N → R}
ℓ(A, B)    {u : A → B} with A ⊆ Z. For instance, ℓ = ℓ(N, R)
ℓ2         {u : N → R | Σ_{k=1}^∞ uk^2 < ∞}
ℓ2(A, C)   {u : A → C | Σ_{k∈A} |uk|^2 < ∞}
ℓfinite    {u : N → R | uk ≠ 0 for only finitely many k ∈ N}

Function space:

F(A, B)    {f : A → B}. This is the set of functions that map from some set A to some set B, for instance R^n = F({1, . . . , n}, R) and ℓ = F(N, R). Typically, though, F is used for function spaces such as F([0, 1], R)
L2[a, b]   the square integrable functions on [a, b] ⊂ R: {f : [a, b] → B | ∫_a^b |f(t)|^2 dt < ∞} with either B = R or B = C
L1[a, b]   {f : [a, b] → B | ∫_a^b |f(t)| dt < ∞} with either B = R or B = C
C[a, b]    {f : [a, b] → B | f is continuous} with either B = R or B = C
Pn(A, B)   the space of polynomials of degree n or less, that map from A to B. Here A ⊆ R
P          the space of polynomials of arbitrary degree, P = ∪_{n≥0} Pn
1 Introduction: real and complex vectors and matrices

In this introductory chapter we review familiar facts about vectors and matrices in R^n and R^{k×n} and their complex counterparts, and we introduce a version of the projection theorem. It is this projection theorem, and most notably its proof, that we use as a motivation for the abstractions and generalizations of the following chapters. These abstractions and generalizations are the main focus of this course. In the end the real and complex vectors and matrices play only a marginal role, but it is where our story begins.
1.1 Vectors and matrices in R^n and R^{k×n}

The set R^n is the set of ordered n-tuples (x1, x2, . . . , xn) with xi ∈ R, i ∈ {1, 2, . . . , n}. Commonly these n-tuples are identified with column vectors, so we write

R^n = { x | x = (x1, x2, . . . , xn)^T with xi ∈ R }.
Likewise R^{n×m} denotes the set of n × m real matrices. Matrices are denoted by capital letters and their elements by lower case letters with two subscript indices. The first index is the row index, the second the column index, for example

A = [ a11 a12 a13 · · · a1m
      a21 a22 a23 · · · a2m
      . . .
      an1 an2 an3 · · · anm ]  ∈ R^{n×m}.

The transpose A^T is formed by considering all rows of A as columns of A^T,

A^T = [ a11 a21 · · · an1
        a12 a22 · · · an2
        a13 a23 · · · an3
        . . .
        a1m a2m · · · anm ]  ∈ R^{m×n}.

It is convenient to think of the transpose A^T as the result of reflecting A in its diagonal. The kth column of a matrix A is denoted by Ak and, similarly, Ar means its rth row.

The zero matrix, in whatever dimension n × m, is usually denoted simply as 0; the square n × n identity matrix is denoted by In or simply by I,

0 = [ 0 · · · 0          I = [ 1 0 · · · 0
      . . .                    0 1      .
      0 · · · 0 ],             .     .  0
                               0 · · · 0 1 ].

We assume familiarity with the common matrix addition and matrix multiplication.
1.2 The dot product and orthogonality

Definition 1.2.1 (Dot product and orthogonality in R^n). The dot product x · y of two vectors x, y ∈ R^n is the real number defined as

x · y = x1 y1 + x2 y2 + · · · + xn yn.

We say that two vectors x, y ∈ R^n are orthogonal (with respect to the dot product) if x · y = 0. Orthogonality of x and y is often denoted as x ⊥ y.
Example 1.2.2 (Orthogonality with respect to the dot product). Consider the vectors v and w shown in Fig. 1.1(a), that is,

v = (v1, v2) = (2, 1),   w = (w1, w2) = (−1, 2).

These two vectors are orthogonal because

v · w = (2 · (−1)) + (1 · 2) = 0.

It is not hard to show that the set of vectors x ∈ R^2 for which v · x ≥ 0 is the half space shown in Fig. 1.1(b).

Figure 1.1: Orthogonal vectors. (a) the vectors v and w; (b) the half space {x ∈ R^2 | v · x ≥ 0}.
For R^2 and R^3 the dot product being zero agrees with our intuition of being orthogonal (perpendicular), but realize that we take x · y = 0 to be the definition of orthogonality, and that this is the definition for any R^n.
1.3 Euclidean norm

Definition 1.3.1 (Euclidean norm). The Euclidean norm |x| of x ∈ R^n is defined as

|x| = √(x1^2 + x2^2 + · · · + xn^2).

The set {x ∈ R^n | |x| ≤ 1} is known as the unit ball (in the Euclidean norm). For n = 1 it is the unit interval [−1, 1] and for n = 2 it is the unit disc, see Fig. 1.2.

The Euclidean norm of x equals the square root of the dot product of x with itself,

|x| = √(x · x).

Figure 1.2: Unit ball {x ∈ R^2 | |x| ≤ 1} in the Euclidean norm, n = 2.
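These two definitions are easy to experiment with numerically. The following sketch (not part of the original notes; it assumes NumPy is available) checks the orthogonality of the vectors v = (2, 1) and w = (−1, 2) from Example 1.2.2 and evaluates the Euclidean norm as the square root of the dot product:

```python
import numpy as np

v = np.array([2.0, 1.0])
w = np.array([-1.0, 2.0])

# Dot product x . y = x1*y1 + ... + xn*yn
dot_vw = np.dot(v, w)
print(dot_vw)                 # 0.0, so v and w are orthogonal

# Euclidean norm |x| = sqrt(x . x)
norm_v = np.sqrt(np.dot(v, v))
print(norm_v)                 # sqrt(5)
print(np.isclose(norm_v, np.linalg.norm(v)))  # True: same value
```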
1.4 Pythagoras

Now that orthogonality is defined as having zero dot product, the Pythagorean theorem is trivial:

Theorem 1.4.1 (Pythagorean theorem). Let x, y ∈ R^n. Then

x ⊥ y  ⟹  |x + y|^2 = |x|^2 + |y|^2.

Proof.

|x + y|^2 = (x + y) · (x + y)
          = x · (x + y) + y · (x + y)
          = (x · x) + (x · y) + (y · x) + (y · y)
          = |x|^2 + 2(x · y) + |y|^2.

If x ⊥ y then x · y = 0, so the middle term drops out. Here we used that z · (x + y) = z · x + z · y and that x · y = y · x. Convince yourself of these properties. ∎
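The identity is easy to check numerically for any pair of orthogonal vectors; a small sketch (assuming NumPy, not part of the original notes):

```python
import numpy as np

x = np.array([2.0, 1.0, 0.0])
y = np.array([-1.0, 2.0, 5.0])
assert np.dot(x, y) == 0.0           # x and y are orthogonal

# Pythagoras: |x + y|^2 = |x|^2 + |y|^2 whenever x . y = 0
lhs = np.linalg.norm(x + y) ** 2
rhs = np.linalg.norm(x) ** 2 + np.linalg.norm(y) ** 2
print(np.isclose(lhs, rhs))          # True
```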
1.5 Orthogonal complement in R^n

The orthogonal complement of some set S ⊆ R^n is the set of all vectors that are orthogonal to all elements of S. The orthogonal complement is denoted S⊥.

Example 1.5.1 (Orthogonal complement). Consider

x := (1, 3, 10) ∈ R^3.

Its orthogonal complement is

x⊥ := { y ∈ R^3 | x · y = 0 }
    = { y ∈ R^3 | y1 + 3y2 + 10y3 = 0 }
    = { y ∈ R^3 | y3 = −(1/10) y1 − (3/10) y2 }
    = { (a, b, −(1/10) a − (3/10) b) | a, b ∈ R }.

The orthogonal complement here is a plane.
We write V ⊥ W whenever all elements of V are perpendicular to all elements of W.

1.6 Subspace, column space and null space

Figure 1.3: (a) subspace; (b) affine subspace; (c, d) not subspaces.

Very loosely speaking a subspace of R^n is a subset that is "flat", extends in all directions and contains the origin, see Fig. 1.3(a). It is not too hard to formalize subspace:
2
8/2/2019 Linear Analysis 2010
9/66
Definition 1.6.1 (Subspace). A subset S of R^n is a subspace if

1. the zero vector 0 is in S,
2. u, v ∈ S implies u + v ∈ S, (closed under addition)
3. u ∈ S, α ∈ R implies αu ∈ S. (closed under scaling)

It is customary to use 0 for both the origin (i.e. the zero vector) and the zero number. If S is a subspace and x ∈ R^n then x + S is referred to as an affine subspace, see Fig. 1.3(b).
Example 1.6.2 (Column space and null space). The set

S := { v ∈ R^3 | v = (v1, v2, 0), v1, v2 ∈ R }

is a subspace of R^3. It is the (x, y)-plane. Let us verify the three defining properties of subspace:

1. Clearly 0 = (0, 0, 0) ∈ S.
2. If v, w ∈ S then v = (v1, v2, 0) and w = (w1, w2, 0). Hence v + w = (v1, v2, 0) + (w1, w2, 0) = (v1 + w1, v2 + w2, 0), and since its last entry is zero also this vector is in S.
3. If v ∈ S then v = (v1, v2, 0), so that αv = α(v1, v2, 0) = (αv1, αv2, 0), and this clearly is again an element of S.

This subspace can be represented in many different ways:

• Let

A = [ 1 0
      0 1
      0 0 ]  ∈ R^{3×2}.

Our set S equals the column space Col(A) of the matrix A. This is the set of all possible linear combinations of the columns of A,

Col(A) := { x | x = Ay, y ∈ R^2 }.

• Let

W = [ 0 0 1 ] ∈ R^{1×3}.

The null space, Null(W), of a matrix W is the set of vectors x for which Wx = 0. It will be no surprise that Null(W) = S for our W.

• We can also interpret the null space with dot products. Let w be the above W, now seen as a vector

w = (0, 0, 1) ∈ R^3.

The set S is the orthogonal complement w⊥:

S = { v ∈ R^3 | v · w = 0 }.

Equivalently, it is the orthogonal complement of the entire column space Col((0, 0, 1)). This is just to say that the (x, y)-plane is the set of vectors that is orthogonal to the z-axis.
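A quick numerical check of these representations (a sketch assuming NumPy; A and W are the matrices of Example 1.6.2): every vector of the form Ay lies in Null(W), i.e. W(Ay) = 0.

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])          # S = Col(A)
W = np.array([[0.0, 0.0, 1.0]])     # S = Null(W)

# W A = 0 means every column of A (hence every Ay) is in Null(W)
print(W @ A)                        # [[0. 0.]]

y = np.array([3.0, -2.0])
v = A @ y                           # an arbitrary element of Col(A)
print(W @ v)                        # [0.], so v is in Null(W)
```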
The following lemma states that any subspace of R^n can be represented by matrices.

Lemma 1.6.3 (Matrix representation of subspace). Let S be a subset of R^n. The following four statements are equivalent:

• S is a subspace;
• S = Col(A) for some matrix A ∈ R^{n×k} and some k ∈ N;
• S = Null(W) for some W ∈ R^{m×n} and some m ∈ N;
• S = W⊥ for some set W ⊆ R^n.

Given a subspace S there are many matrices A and W for which S = Col(A) = Null(W).
1.7 Projection

Figure 1.4: Orthogonal projection in R^3.

With orthogonality, norm and subspace defined it is now possible to formulate our intuition that connects minimal distance (norm) with orthogonality. Here is our first version. Have a look at the proof because it is a basis for later generalizations.

Definition 1.7.1 (Best approximation). An element v̂ ∈ V ⊆ R^n is a best approximation in V of x ∈ R^n if

|x − v̂| ≤ |x − v|  for all v ∈ V.

See Fig. 1.4.
Theorem 1.7.2 (A projection theorem). Let x ∈ R^n and let V be a subspace of R^n. Then

1. v̂ is a best approximation in V of x iff (if and only if) (x − v̂) ⊥ V;
2. if the best approximation v̂ exists then it is unique and it satisfies

|x − v̂|^2 = |x|^2 − |v̂|^2.

Proof. Suppose (x − v̂) ⊥ V for some v̂ ∈ V. Then for any v ∈ V the difference v̂ − v is in V by the subspace property, and so by Pythagoras, applied to (x − v̂) ∈ V⊥ and (v̂ − v) ∈ V, we get

|x − v|^2 = |(x − v̂) + (v̂ − v)|^2 = |x − v̂|^2 + |v̂ − v|^2 ≥ |x − v̂|^2.

Hence if v ≠ v̂ then the norm of x − v exceeds that of x − v̂, making v̂ the unique best approximation.

Conversely, suppose (x − v̂) is not orthogonal to V. Then by definition there is a ṽ ∈ V such that (x − v̂) is not orthogonal to ṽ, i.e. such that (x − v̂) · ṽ ≠ 0. In particular this ṽ is nonzero. We construct an improved approximation of x of the form v̂ + αṽ, with the real number α yet to be determined:

|x − (v̂ + αṽ)|^2 = |(x − v̂) − αṽ|^2
                 = |x − v̂|^2 − 2(x − v̂) · (αṽ) + |αṽ|^2
                 = |x − v̂|^2 − 2α[(x − v̂) · ṽ] + α^2 |ṽ|^2.

This quadratic expression in α is minimized for α = ((x − v̂) · ṽ) / |ṽ|^2, rendering it equal to

|x − v̂|^2 − 2 [(x − v̂) · ṽ]^2 / |ṽ|^2 + [(x − v̂) · ṽ]^2 / |ṽ|^2 = |x − v̂|^2 − [(x − v̂) · ṽ]^2 / |ṽ|^2 < |x − v̂|^2.

So then v̂ is not a best approximation.

The equality |x − v̂|^2 = |x|^2 − |v̂|^2 is a restatement of Pythagoras, see Fig. 1.4. ∎

The theorem avoids the issue of existence of the best approximation v̂ because we prefer not to worry about it now. Here (in R^n) it does exist though.
1.8 Transpose
For explicit representations of the best approximation we remind you of an alternative representation of the dot product in terms of transposes of vectors,

x · v = v^T x = v1 x1 + v2 x2 + · · · + vn xn.

Then we get the handy rule that for any k × n matrix A and vectors x ∈ R^n, y ∈ R^k, the matrix A can be moved from one side of the dot product to the other:

(Ax) · y = x · (A^T y).

Indeed, (Ax) · y = y^T (Ax) = (A^T y)^T x = x · (A^T y).
1.9 Normal equations and the projection operator

If we have the subspace V given in the explicit form

V = Col(A),

then the best approximation v̂ ∈ V of x can be obtained rather explicitly:

Lemma 1.9.1 (Explicit projection: normal equations). Let x ∈ R^n and A ∈ R^{n×k}. Then

ŷ = arg min_{y ∈ R^k} |x − Ay|

iff ŷ ∈ R^k satisfies the normal equations

A^T A ŷ = A^T x.   (1.1)

The best approximation v̂ ∈ Col(A) of x then is v̂ = Aŷ.

Proof. This is the projection theorem for V = Col(A) and v̂ = Aŷ. By the projection theorem we need only establish that (x − Aŷ) ⊥ V:

(x − Aŷ) ⊥ V  ⟺  (x − Aŷ) · (Ay) = 0 for all y ∈ R^k
              ⟺  y^T A^T (x − Aŷ) = 0 for all y ∈ R^k
              ⟺  A^T (x − Aŷ) = 0  (see Problem 1.2)
              ⟺  A^T x = A^T A ŷ. ∎

This result clearly shows that the transpose is a convenient notion. With it, projections can be formulated explicitly, something we will come back to later (at which point we generalize the transpose to something called the adjoint). The equations (1.1) are known as the normal equations.
The lemma does not require that A^T A is invertible, and indeed the solution ŷ of the normal equations need not be unique, but if A^T A is invertible then (1.1) yields

ŷ = (A^T A)^{−1} A^T x

and hence the best approximation v̂ = Aŷ equals

v̂ = A (A^T A)^{−1} A^T x.   (1.2)
Example 1.9.2 (Projection in R^2). Let

V = Col((3, 1)) and x = (0, 1).

According to (1.2) the best approximation in V of x is v̂ = A (A^T A)^{−1} A^T x with A the 2 × 1 matrix with column (3, 1). Here A^T A = 3 · 3 + 1 · 1 = 10 and A^T x = 3 · 0 + 1 · 1 = 1, so

v̂ = (3, 1) · (1/10) · 1 = (0.3, 0.1).
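The computation in Example 1.9.2 can be reproduced in a few lines (a sketch assuming NumPy, not part of the original notes): solve the normal equations (1.1) and form v̂ = Aŷ.

```python
import numpy as np

A = np.array([[3.0],
              [1.0]])            # V = Col(A)
x = np.array([0.0, 1.0])

# Normal equations (1.1): A^T A yhat = A^T x
yhat = np.linalg.solve(A.T @ A, A.T @ x)
vhat = A @ yhat                  # best approximation of x in Col(A)
print(vhat)                      # [0.3 0.1], as in Example 1.9.2

# The error x - vhat is orthogonal to Col(A)
print(np.allclose(A.T @ (x - vhat), 0.0))   # True
```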
1.10 Vectors and matrices in C^n and C^{k×n}

We briefly summarize the complex counterpart of the reals. The set C^n is the set of ordered n-tuples (x1, x2, . . . , xn) with xi ∈ C, i ∈ {1, 2, . . . , n}. As in the real case these n-tuples are often identified with column vectors, so we write

C^n = { x | x = (x1, x2, . . . , xn)^T with xi ∈ C }.
The set C^{n×m} denotes the set of n × m complex valued matrices. Given a complex matrix

A = [ a11 a12 a13 · · · a1m
      a21 a22 a23 · · · a2m
      . . .
      an1 an2 an3 · · · anm ]  ∈ C^{n×m},

its complex conjugate transpose A^H (also called Hermitian transpose or conjugate transpose) is the matrix defined as

A^H = [ ā11 ā21 · · · ān1
        ā12 ā22 · · · ān2
        ā13 ā23 · · · ān3
        . . .
        ā1m ā2m · · · ānm ]  ∈ C^{m×n}.

The complex conjugate transpose A^H can be obtained by reflecting A in its diagonal and then replacing each element by its complex conjugate. We say that a matrix is Hermitian if A = A^H. If the matrix happens to be real then A^H = A^T. There are two well accepted notations for the complex conjugate transpose: A^H and A*. We choose A^H to set it apart from the adjoint operators that we introduce later.
Example 1.10.1 (Pauli matrix). The Pauli matrices σ1, σ2, σ3 are the three 2 × 2 matrices

σ1 = [ 0 1      σ2 = [ 0 −i      σ3 = [ 1  0
       1 0 ],          i  0 ],          0 −1 ].

(σ is the common notation for Pauli matrices in physics; in this course we typically denote matrices with capital letters, however.) All three are Hermitian and they have the property that

σ1^2 = σ2^2 = σ3^2 = −i σ1 σ2 σ3 = I2.

Let us verify that σ2^2 = I2:

σ2^2 = [ 0 −i  [ 0 −i     [ 1 0
         i  0 ]  i  0 ]  =  0 1 ].

Since σi^H = σi we also have that σi^H σi = I2. This property is what we later call unitary.
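The Pauli identities are easy to verify numerically; a sketch with NumPy's complex arrays (not part of the original notes):

```python
import numpy as np

s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)

# Each Pauli matrix squares to the identity
for s in (s1, s2, s3):
    assert np.allclose(s @ s, I2)

# ... and -i * s1 s2 s3 = I2 as well
print(np.allclose(-1j * s1 @ s2 @ s3, I2))    # True

# Hermitian (s = s^H) and hence unitary (s^H s = I2)
for s in (s1, s2, s3):
    assert np.allclose(s, s.conj().T)
    assert np.allclose(s.conj().T @ s, I2)
```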
The dot product x · y for complex vectors x and y of equal dimension is defined as

x · y = y^H x = Σ_{k=1}^n ȳk xk.

Example 1.10.2 (Norm of complex vector). For v = (1, 2 + i, 3i) ∈ C^3 we have

v · v = v̄1 v1 + v̄2 v2 + v̄3 v3
      = |v1|^2 + |v2|^2 + |v3|^2
      = 1^2 + |2 + i|^2 + |3i|^2
      = 1^2 + (2^2 + 1^2) + 3^2 = 15.

The norm, defined as |v| = √(v · v), hence is √15.
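The complex dot product and the norm of Example 1.10.2 can be checked with np.vdot, which conjugates its first argument (so np.vdot(y, x) computes y^H x); a sketch assuming NumPy:

```python
import numpy as np

v = np.array([1, 2 + 1j, 3j])

# v . v = sum of |v_k|^2 ; np.vdot conjugates its first argument
print(np.vdot(v, v))                 # (15+0j)
print(np.sqrt(np.vdot(v, v).real))   # sqrt(15)
print(np.isclose(np.linalg.norm(v), np.sqrt(15)))   # True
```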
1.11 Problems

1.1 Let

x = (1, 2, 3) ∈ R^3.
a) Determine a matrix A such that x⊥ = Col(A).
b) How many columns of A are needed?
1.2 Show that x · y = 0 for all y ∈ R^n implies that x = 0.

1.3 Let W ∈ R^{m×n}. Prove that Null(W) is a subspace.

1.4 Let A ∈ R^{n×k}. Prove that Col(A) is a subspace.

1.5 Let S ⊆ R^n. Show that
    a) S ⊆ (S⊥)⊥
    b) S is a subspace iff (S⊥)⊥ = S

1.6 Let S1, S2 ⊆ R^n. Is the intersection S1 ∩ S2 a subspace if S1 and S2 are subspaces?
1.7 Consider

A = [ 1 0
      1 4
      0 1 ].

Compute the best approximation in V = Col(A) of x = (0, 0, 1).

1.8 Redo the previous example but now for

A = [ 1 2
      1 2
      1 2 ].
1.9 Let A ∈ R^{n×3} and let (as always) Ak denote its kth column. Show that

A^T A = [ |A1|^2   A2 · A1  A3 · A1
          A1 · A2  |A2|^2   A3 · A2
          A1 · A3  A2 · A3  |A3|^2 ].
1.10 Let

V = Null([1 1 1]).

a) Express V as V = Col(A) for some matrix A.
b) Determine the best approximation in V of x = (0, 0, 1).
c) Sketch V and both x = (0, 0, 1) and its best approximation.
1.11 Prove the two properties used in the proof of the Pythagorean theorem:

x · y = y · x,    z · (x + y) = (z · x) + (z · y).
1.12 Suppose Q is a 2 × 2 matrix such that |Qx| = |x| for all x ∈ R^2.

a) Show that Q^T Q = I.
b) Show that Q has the form

Q = [ cos(φ) −sin(φ)        Q = [ cos(φ)  sin(φ)
      sin(φ)  cos(φ) ]  or        sin(φ) −cos(φ) ].
Figure 1.5: Minimum norm element v̂ of an affine subspace x + V.

1.13 A version of the projection theorem that appears often in applications is the following (see Fig. 1.5):

Let x ∈ R^n and let V be a subspace of R^n. A vector v̂ ∈ x + V is a minimal norm element of the affine subspace x + V if and only if v̂ ⊥ V.

Prove it.
1.14 Sketch the affine subspace (0, 1) + Col((1, 2)) and determine the minimal norm element of this set.

1.15 Determine the complex conjugate transpose of

a) [ 1+i   1+2i
     1+3i  1+4i ]

b) [ 3     2+i   3+2i
     4+2i  4+i   4 ]

c) [ i  0  −i  1+i ]

d) [ 0     1+i   3+4i
     1+i   0     2−6i
     3+4i  2−6i  0 ]
1.16 Let

x = (1, 2i) ∈ C^2.

Determine a complex matrix A and W such that x⊥ = Col(A) = Null(W). (In the complex case, Col(A) is the set of vectors of the form Ay with y ∈ C^k, where k is the number of columns of A.)
1.17 What is the smallest subspace of R^3 that contains the unit circle {(x, y, z) | x^2 + y^2 = 1, z = 0}?
1.18 Show that

a) Col(A)⊥ = Null(A^T)
b) Col(A A^T)⊥ = Null(A^T)
c) Col(A) = Col(A A^T)
1.19 Formulate and prove a projection theorem for x ∈ C^n and V a subspace of C^n. This also requires that you think about what subspace should mean in C^n (this chapter only defines it for real vectors).
Figure 1.6: Least squares fit.
1.20 Least squares approximation. A very common problem is to approximate a set of pairs of real numbers,

(t1, x1), (t2, x2), . . . , (tn, xn),

by a straight line, see Fig. 1.6. This can be seen as an application of the projection theorem in R^n with n the number of pairs. We write the candidate straight line as

x(t) = y1 + y2 t, with y1, y2 ∈ R,

and the approximation error of the kth pair we write as εk := xk − x(tk), see Fig. 1.6. Ideally εk = 0 for all k, which would mean that the straight line interpolates all pairs. In practice we try to make the errors as small as possible, and the most popular way of doing this is by least squares approximation:

a) Express the vector (ε1, . . . , εn) of errors as

[ ε1          [ x1          [ ? ?
  ε2            x2            ? ?     [ y1
  . . .    =    . . .    −    . . .     y2 ]
  εn ]          xn ]          ? ? ]

(that is, determine the matrix A for which ε = x − Ay).

b) Show that

A^T A = Σ_{k=1}^n [ 1   tk
                    tk  tk^2 ].

c) Show that A^T A is invertible iff tk ≠ tj for at least one pair (j, k). (This might be a tough problem.)

d) Show that the sum of squares Σ_{k=1}^n εk^2 of the errors equals |x − Ay|^2 and write down the corresponding normal equations in terms of the available data (tk, xk).

e) The least squares fit is defined as the straight line that minimizes the sum of squares Σ_{k=1}^n εk^2. Determine the least squares fit (that is, determine the optimal y1, y2 as functions of tk, xk. You may assume that A^T A is invertible).
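As a numerical sketch of where Problem 1.20 is heading (not part of the original notes; it assumes NumPy and uses made-up data points, and note that it gives away part a), so try that part first):

```python
import numpy as np

# Hypothetical data pairs (t_k, x_k) lying near the line x(t) = 1 + 2t
t = np.array([0.0, 1.0, 2.0, 3.0])
x = np.array([1.1, 2.9, 5.2, 6.8])

# The kth row of A is (1, t_k), so that eps = x - A y
A = np.column_stack([np.ones_like(t), t])

# Normal equations A^T A y = A^T x give the least squares coefficients
y1, y2 = np.linalg.solve(A.T @ A, A.T @ x)
print(y1, y2)        # intercept and slope of the least squares fit

# Same answer as NumPy's built-in least squares solver
print(np.allclose([y1, y2], np.linalg.lstsq(A, x, rcond=None)[0]))  # True
```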
2 Vector space

Let us say that it is our purpose to generalize the projection theorem. Then we should generalize the various players in the projection theorem. These are

• the space R^n,
• the subspace V of R^n,
• the dot product,
• the Euclidean norm.

In this chapter we generalize the space R^n (to be called vector space) and the subspace V (still to be called subspace). Vector spaces and subspaces can be recognized in loads of applications, the projection theorem being just one of them.
2.1 Real vector space

What properties of R^n did we implicitly use in the projection theorem and its proof? Have a look at Thm. 1.7.2 and its proof and you will probably agree that the following eight properties will do:

Definition 2.1.1 (Real vector space). A real vector space (X, ⊕, ·) is a nonempty set of elements X, called vectors, on which a vector addition ⊕ : X × X → X and a real scalar multiplication · : R × X → X are defined with the following eight properties for all u, v, w ∈ X and all scalars α, β ∈ R:

1. u ⊕ v = v ⊕ u (commutative)
2. (u ⊕ v) ⊕ w = u ⊕ (v ⊕ w) (associative)
3. there is a zero vector, also known as origin, 0 ∈ X such that u ⊕ 0 = u for all u ∈ X
4. for each v ∈ X there is an additive inverse −v ∈ X such that v ⊕ (−v) = 0
5. 1 · v = v
6. α · (β · v) = (αβ) · v (associative)
7. (α + β) · v = α · v ⊕ β · v (distributive)
8. α · (u ⊕ v) = α · u ⊕ α · v (distributive)
If this is your first contact with such a formal definition then please realize this: we have the freedom to define our own addition and multiplication, and we may dream up really weird sets X; but the moment that X with that addition and multiplication satisfies the eight axioms of vector space, then automatically all results we will derive for general vector spaces hold for our weird X as well. That is the beauty of generality and abstraction.

Before entering a series of examples, you will want to know that the 8 axioms of vector space imply a host of other properties. Here are some basic ones:
Theorem 2.1.2 (Basic properties of vector space). Suppose (X, ⊕, ·) is a real vector space. Then

1. the origin 0 ∈ X is unique;
2. the additive inverse is unique: if v ⊕ w1 = 0 and v ⊕ w2 = 0 then w1 = w2;
3. 0 · v = 0;
4. α · 0 = 0;
5. the additive inverse −v equals (−1) · v;
6. α · v = 0, v ≠ 0 ⟹ α = 0.

Proof.

1. Suppose that 0_1 and 0_2 are two zero vectors. Then 0_1 ⊕ 0_2 = 0_1 and 0_1 ⊕ 0_2 = 0_2. So the two zero vectors are the same.

2. Let w1 and w2 be two additive inverses of v. Then w1 = w1 ⊕ 0 = w1 ⊕ (v ⊕ w2) = (w1 ⊕ v) ⊕ w2 = 0 ⊕ w2 = w2.

3. 0 · v = 0 · v ⊕ 0 = 0 · v ⊕ (0 · v ⊕ (−(0 · v))) = (0 + 0) · v ⊕ (−(0 · v)) = 0 · v ⊕ (−(0 · v)) = 0.

4. We proved it already for α = 0. If α ≠ 0 then v ⊕ α · 0 = α · ((1/α) · v ⊕ 0) = α · ((1/α) · v) = 1 · v = v for every v. Hence α · 0 satisfies the conditions of the zero vector, and so α · 0 = 0.

5. v ⊕ (−1) · v = 1 · v ⊕ (−1) · v = (1 − 1) · v = 0 · v = 0.

6. Suppose α · v = 0 and v ≠ 0. If α ≠ 0 then (1/α) · (α · v) = 1 · v = v ≠ 0, while (1/α) · (α · v) = (1/α) · 0 = 0. This is a contradiction. Hence α = 0. ∎

In fact properties 3, 4 and 6 of the above theorem can be combined into

α · v = 0  ⟺  (α = 0 and/or v = 0).

One may choose to include any number of the above properties in the definition of vector space but it is customary not to do that. We prefer to strip a property from a definition if it is implied by the other properties (axioms) of the definition.
Example 2.1.3 (R^n). The space R^n of ordered sequences of given length n ∈ N, with entries in R,

R^n = { u | u = (u1, u2, . . . , un), uk ∈ R },

is a vector space under the vector addition and scalar multiplication defined elementwise as

u ⊕ v := (u1 + v1, u2 + v2, . . . , un + vn),
α · u := (αu1, αu2, . . . , αun).

The subtlety is that the plus-sign in u ⊕ v represents addition of two vectors whereas the plus-sign in u1 + v1 represents ordinary addition of two real numbers. Likewise α · u is a product of scalar α and vector u, while αu1 simply means the product of two real numbers. It is easy to verify that the 8 defining properties of vector space hold, i.e. that this (R^n, ⊕, ·) is a real vector space.

Example 2.1.4 (Sequence space). The space ℓ(N; R) is the set of one-sided infinite sequences

ℓ(N; R) = { u | u = (u1, u2, . . .), uk ∈ R, k ∈ N }.

As in R^n it is a vector space under the addition and scalar multiplication defined elementwise as

u ⊕ v := (u1 + v1, u2 + v2, u3 + v3, . . .),
α · u := (αu1, αu2, αu3, . . .).

We leave it to the reader to establish that the 8 properties of real vector space indeed hold.
Figure 2.1: Two vectors u, v ∈ R^25 and their sum u ⊕ v.

Figure 2.1 depicts vector addition in R^25. The reason to include this figure is to convince you of the fact that also function spaces can be seen as vector spaces and that conceptually the step from R^n to function space is marginal.
Example 2.1.5 (Function space). The set of functions

F([0, 1], R) := { f : [0, 1] → R }

that map from [0, 1] to R is a vector space under addition and scalar multiplication defined pointwise, at each t, as

(f ⊕ g)(t) = f(t) + g(t),   (α · f)(t) = αf(t).

See Fig. 2.2. It is a bit of a bore to verify the eight defining rules of vector space, but once we have to do it:

1. f ⊕ g = g ⊕ f because (f ⊕ g)(t) = f(t) + g(t) = g(t) + f(t) = (g ⊕ f)(t). The vector addition inherits the commutative property of addition of real numbers.

2. (f ⊕ g) ⊕ p = f ⊕ (g ⊕ p) indeed, and its proof is very similar to that of part 1.

3. the function n(t) = 0 for all t satisfies f ⊕ n = f for every function f, so n is a zero vector.

4. −f, defined pointwise as (−f)(t) = (−1)f(t), is an additive inverse of f because then f ⊕ (−f) = n.

5. 1 · f = f because (1 · f)(t) = 1(f(t)) = f(t) for all t.

6. α · (β · f) = (αβ) · f. This is possibly the trickiest to prove. Its proof is a series of applications of the definition of scalar multiplication on our function space: (α · f)(t) = αf(t) for all t. Here we go:

(α · (β · f))(t) = α((β · f)(t)) = α(βf(t)) = (αβ)f(t) = ((αβ) · f)(t).

So α · (β · f) and (αβ) · f are indeed the same functions.

7. (α + β) · f = α · f ⊕ β · f, see Problem 2.6.

8. α · (f ⊕ g) = α · f ⊕ α · g, see Problem 2.7.

Figure 2.2: Graph of functions f and g and their sum f ⊕ g.
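The pointwise operations above can even be mimicked in code; a small Python sketch (not part of the original notes) in which functions are the vectors and addition builds a new function:

```python
# Vector addition and scalar multiplication in F([0, 1], R),
# defined pointwise exactly as in Example 2.1.5.
def add(f, g):
    return lambda t: f(t) + g(t)

def scale(alpha, f):
    return lambda t: alpha * f(t)

f = lambda t: t ** 2
g = lambda t: 1 - t
zero = lambda t: 0.0            # the zero vector n(t) = 0

h = add(f, g)
print(h(0.5))                   # f(0.5) + g(0.5) = 0.25 + 0.5 = 0.75
print(add(f, zero)(0.3) == f(0.3))   # True: n acts as the zero vector
print(scale(2.0, f)(0.5))       # 2 * 0.25 = 0.5
```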
Notation cleanup

To avoid unduly cumbersome notation we simplify the notation somewhat.

The special symbol ⊕ for vector addition was used to emphasize that it differs from addition of scalars. Now that the difference is clear, we almost always write a plain + for vector addition as well, and so + from now on means both vector addition and scalar addition. The context makes clear which one it is.

Similarly the dot in scalar-vector multiplication such as in α · v is deleted altogether: αv.

Also the underline in the zero vector is usually omitted, so from now on 0 is used both for the scalar zero and the zero vector.

Finally, we typically say "X is a vector space" instead of the more precise but also more cumbersome "(X, +, ·) is a vector space".
2.2 Complex vector space

A complex vector space differs from a real vector space only in that the scalars (the α's and β's) in a complex vector space are taken from C instead of R. For completeness: a complex vector space X is a nonempty set of elements, called vectors, on which vector addition X × X → X and complex scalar multiplication C × X → X are defined that satisfy the 8 properties of Definition 2.1.1 for all v, w ∈ X and all α, β ∈ C. From the context it will be clear whether we deal with real or complex vector spaces, and we refer to the α's and β's simply as scalars.

The basic properties of Theorem 2.1.2 also hold for complex vector spaces (the proof is identical).

Example 2.2.1 (C^n). The space C^n is the set of ordered n-tuples of complex numbers,

C^n = { u | u = (u1, u2, . . . , un); u1, . . . , un ∈ C }.

It is a vector space under the addition and scalar multiplication defined elementwise as

u + v = (u1 + v1, u2 + v2, . . . , un + vn),
αu = (αu1, αu2, . . . , αun).
Example 2.2.2 (Doubly infinite complex sequences). The space ℓ(Z; C) is the set of doubly infinite ordered sequences

ℓ(Z; C) = { u | u = (. . . , u−1, u0, u1, . . .), uk ∈ C, k ∈ Z }.

It is a vector space under the addition and scalar multiplication defined elementwise as

u + v = (. . . , u−1 + v−1, u0 + v0, u1 + v1, . . .),
αu = (. . . , αu−1, αu0, αu1, . . .).
Example 2.2.3 (Function space). Complex-valued functions

F([0, 1], C) := { f : [0, 1] → C }

that map from [0, 1] to C can be seen as a vector space with addition and scalar multiplication defined pointwise as

(f + g)(t) = f(t) + g(t) for all t ∈ [0, 1],
(αf)(t) = α(f(t)) for all t ∈ [0, 1].

The zero element is the function n(t) that is zero for every t ∈ [0, 1].
2.3 Subspace

A subset of a vector space may be a vector space itself. For instance the (x, y)-plane of the vector space R^3 is itself a vector space with addition and scalar multiplication borrowed from the vector space R^3. If it has been settled that X is a vector space, then to test whether or not a subset V ⊆ X is a vector space, we need not redo all 8 defining properties of vector space. It is sufficient to check that the set is closed under addition and scalar multiplication. All other axioms of vector space are then inherited from those of X. Such subsets, when nonempty, we call subspaces.

Definition 2.3.1 (Subspace). A subset V of a vector space X is a subspace of X if for all u, v ∈ V and scalars α:

1. 0 ∈ V,
2. u + v ∈ V, (closed under addition)
3. αv ∈ V. (closed under scaling)

In a non-empty set V the third condition implies the first (take α = 0). Therefore the first condition in effect only says that subspaces are not allowed to be empty.
Example 2.3.2 (Subspace of function space). The set

    S = { f : R → R | ∃ c, d ∈ R such that f(t) = c cos(t) + d sin(t) ∀t ∈ R }

is a subspace of F(R, R). Let us verify:

1. the zero function n(t) = 0 ∀t of F(R, R) is an element of S (take c = d = 0),
2. it is closed under addition, for if fk(t) = ck cos(t) + dk sin(t) ∈ S then so is the sum (f1 + f2)(t) = (c1 + c2) cos(t) + (d1 + d2) sin(t) ∈ S,
3. it is closed under scalar multiplication, for if f(t) := c cos(t) + d sin(t) is in S then so is (αf)(t) = (αc) cos(t) + (αd) sin(t).

Our intuition from R³, which says that a subspace is something "flat", may fail for function spaces. It is a subspace nonetheless.
Example 2.3.3 (Finitely nonzero sequence space). The set of infinite sequences of which only finitely many entries are nonzero,

    ℓ_finite(N, R) := { u : N → R | only finitely many uk are nonzero },

is a subspace of ℓ(N; R). See Problem 2.14.

The next example is important. It considers the set of square summable sequences, which play a key role in functional analysis.
Example 2.3.4 (Square summable sequence). The set of square summable sequences u = (u1, u2, ...) of real numbers is denoted ℓ2(N; R). That is,

    ℓ2(N; R) = { u = (u1, u2, ...) | un ∈ R, Σ_{n=1}^∞ un² < ∞ }.

2.26 Let dim(X) = n > 0. Is it true that any set of n elements that spans X is a basis of X?
2.27 A subset S of a vector space is an affine subspace if it is closed under affine combination, meaning that if x, y ∈ S then

    α1 x + α2 y ∈ S

for all α1 and α2 that add up to one, α1 + α2 = 1.

a) Consider R² and two elements x1 = (0, 1) and x2 = (2, 1). Sketch in the plane the set of all affine combinations of x1 and x2.
b) Show that a nonempty S is an affine subspace (of some vector space X) iff S = x0 + V for some x0 ∈ X and some subspace V of X.
c) Let S be an affine subspace. Show that for any n and any x1, ..., xn ∈ S we have Σ_{i=1}^n αi xi ∈ S whenever Σ_{i=1}^n αi = 1.

2.28 Let n > 0. Show that Rⁿ is not a subspace of Cⁿ.
2.29 Suppose V is a subspace of X and that dim(X) < ∞. Show that V = X iff dim(V) = dim(X).
2.30 Prove that a subspace of a vector space is itself a vector space.
2.31 Consider P3 with basis {1, x − 3, (x − 3)², (x − 3)³}. Determine the coordinates with respect to this basis of

a) 1
b) x
c) x²
2.32 Consider span{1, e^{ix}, e^{−ix}} ⊆ F(R, C) with obvious basis S = {1, e^{ix}, e^{−ix}}. With respect to this basis, determine the vector of coordinates of

a) sin(x)
b) 1 + cos(x )2.33 Alternative definition of vector space. Less common
but more concise is this definition of vector space:
A real vector space is a nonempty setV
with an addition : V V V andscalar multiplication : R V Vthat satisfy the following six axioms for all
x , y,z V and all, R: x (y z) = (x y)z 0x does not depend on x ( +)x = (x ) (x ) (x y) = (x ) (y) ()x = (x ) 1x = x
We denote0
x as 0. We abbreviate(
1)
x
tox andx (y) to x y.a) Show that Definition 2.1.1 implies the above six
axioms
b) Show that the above six axioms imply the eight
of Definition 2.1.1.
In other words, the two definitions of vector space are
equivalent.
3 Linear transformation
Figure 3.1: A mapping F from V to W
Linear transformations (also known as linear operators and linear mappings) are everywhere. For instance the projection theorem (page 4) states that the best approximation v of an x is unique, so we can consider the mapping F that sends x to its best approximation v = F(x). This is just one of the many mappings F that turn out to be linear.
3.1 Linear transformation
Definition 3.1.1 (Linearity). Let V and W be two vector spaces (both real or both complex vector spaces). A mapping F from V to W is linear if for every v1, v2, v ∈ V and scalar α:

1. F(v1 + v2) = F(v1) + F(v2),   (additive)
2. F(αv) = αF(v).   (homogeneous)

If F maps from V to W then we write F : V → W. We can apply F to elements (vectors) v ∈ V but also to sets S ⊆ V, and we use the notation F(S) to mean

    F(S) = { F(v) | v ∈ S }.

The range of F : V → W is defined as F(V), i.e. it is the set of all possible outcomes of the mapping. The range is also known as the image (of its domain) and is denoted as Im(F). The set W to which F maps is sometimes referred to as the codomain of F. The codomain W may well be a much bigger set than the range of F.
Example 3.1.2 (Linearity on function space). This is an attempt to graphically explain what linearity means on function space. Suppose that F maps functions x : R → R to functions y : R → R. [The original shows the graphs of two input functions and of their images under F.] Then additivity implies that F applied to the sum of the two input functions yields the sum of the two image functions, and homogeneity implies that F applied to a scaled input yields the correspondingly scaled image.
The vector addition and scalar multiplication of the codomain W induce a form of addition of mappings and scalar multiplication with mappings. Specifically, for any two mappings F, G : V → W we define the sum of the two mappings as

    (F + G)(x) := F(x) + G(x)

and the product of a scalar α and the mapping is defined as

    (αF)(x) := α(F(x)).

Also, if F1 : V1 → V2 and F2 : V2 → V3 are two mappings then F2F1 : V1 → V3 by definition is the mapping defined as

    (F2F1)(x) := F2(F1(x)).
3.2 Familiar linear transformations

Well, the most familiar linear transformations are the ones that map from Rⁿ to Rᵏ (see a later section), but here are other standard ones. It is easy to verify that they are indeed linear. Following the colloquial definition, some trickier issues regarding domain and codomain are addressed.
Example 3.2.1 (Fourier transform). The Fourier transformation is a linear transformation that sends continuous-time functions x : R → C to continuous-frequency functions x̂ : R → C, defined as

    x̂ = F(x) :  x̂(ω) = ∫_{−∞}^{∞} x(t) e^{−iωt} dt.

As domain V we could take the set of absolutely integrable functions { x : R → C | ∫_R |x(t)| dt < ∞ } (with standard addition and multiplication), because then x̂(ω) is well defined for every ω ∈ R. As codomain we may take W := F(R, C).

Example 3.2.2 (Fourier series). The Fourier series can be seen as a linear mapping that sends continuous-time functions on a finite interval, x : [0, T] → R, to countably many Fourier coefficients x̂ : Z → C,

    x̂ = F(x) :  x̂k = (1/T) ∫_0^T x(t) e^{−ik(2π/T)t} dt,  k ∈ Z.
As domain V we could take the set of continuous functions on [0, T] (but other sensible domains can be dreamed up). Codomain ℓ(Z; C) is natural.

Example 3.2.3 (Laplace transform). Likewise the unilateral Laplace transform L is linear as well,

    X = L(x) :  X(s) = ∫_0^∞ x(t) e^{−st} dt.

One might remember that every bounded function x(t) has a Laplace transform X(s) that is defined for all s ∈ C with Re(s) > 0. So if the domain is V = { x : [0, ∞) → R | ∃ c > 0 such that |x(t)| < c ∀t } then as codomain W we might take the functions defined on the open right-half complex plane, W = { x : { s ∈ C | Re(s) > 0 } → C }.

Example 3.2.4 (Convolution and Fredholm). Here is another familiar linear mapping: the convolution Ch,

    y = Ch(u) :  y(t) = (h ∗ u)(t) := ∫_{−∞}^{∞} h(τ) u(t − τ) dτ.

The convolution is in fact a special case of the general linear mapping from F(R, R) to F(R, R),

    y = F_fredholm(u) :  y(t) = ∫_a^b K(t, s) u(s) ds.

If a and b are finite and K(t, s) is continuous and u is continuous as well, then the operator is well defined and its outcome is continuous. The equation relating u and y is often called a Fredholm equation (and the game then is to find u for given K and y).
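A discrete analogue of the convolution operator can be checked numerically. The sketch below (not from the notes; it uses numpy's `np.convolve`, which computes a finite discrete convolution rather than the integral above) verifies that convolution with a fixed h is additive and homogeneous:

```python
import numpy as np

# Discrete stand-in for the convolution operator C_h: linearity check.
rng = np.random.default_rng(0)
h = rng.standard_normal(5)        # fixed "kernel"
u1 = rng.standard_normal(8)
u2 = rng.standard_normal(8)
alpha = 2.5

C = lambda u: np.convolve(h, u)   # y = h * u (discrete, full-length output)

# additive: C(u1 + u2) = C(u1) + C(u2)
assert np.allclose(C(u1 + u2), C(u1) + C(u2))
# homogeneous: C(alpha u) = alpha C(u)
assert np.allclose(C(alpha * u1), alpha * C(u1))
```

Both properties follow directly from the fact that the (discrete) convolution sum is linear in u for fixed h.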
Example 3.2.5 (Differentiator). Also linear is the differentiator D,

    f = D(g) :  f(t) = g^(1)(t).

As domain V we should take a vector space whose elements are differentiable, such as

    V = { f : R → R | f is differentiable }.

Codomain F(R; R) will do. Let us verify linearity. For one it is additive, because for any g, h ∈ V the derivative of the sum is the sum of the derivatives,

    D(g + h) = (g + h)^(1) = g^(1) + h^(1) = (Dg) + (Dh),

and it is homogeneous as well,

    D(αg) = (αg)^(1) = α(g^(1)) = α(Dg).
Figure 3.2: Original signal v(t), sampled signal wk
Example 3.2.6 (Sampler). The ideal sampler Sh maps functions to sequences, see Fig. 3.2. More specifically, for a given sampling period h > 0, it is defined as

    w = Sh(v) :  w(k) = v(kh),  k ∈ Z.

It is a well defined linear transformation if we choose as domain, say, V = { v : R → R | v is continuous } and as codomain W = ℓ(Z; R), both with their standard addition and multiplication. Additivity in words means that the samples of the sum equal the sum of the samples. Indeed,

    (Sh(f + g))(k) = (f + g)(kh) = f(kh) + g(kh) = (Sh(f))(k) + (Sh(g))(k).

It is also homogeneous: the samples of the scaled signal are the scaled samples of the signal (or: scaling commutes with sampling):

    (Sh(αf))(k) = (αf)(kh) = α(f(kh)) = α(Sh(f))(k).
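The sampler is simple enough to imitate directly in code. The following sketch (an illustration, not part of the notes; `sample` is a name chosen here) checks additivity of sampling on a finite index set:

```python
import numpy as np

# Ideal sampler S_h on a finite set of indices: w_k = v(k h).
def sample(v, h, ks):
    return np.array([v(k * h) for k in ks])

ks = np.arange(-3, 4)   # indices -3, ..., 3
h = 0.5                 # sampling period

f = np.sin
g = np.cos

lhs = sample(lambda t: f(t) + g(t), h, ks)   # samples of the sum
rhs = sample(f, h, ks) + sample(g, h, ks)    # sum of the samples
assert np.allclose(lhs, rhs)
```

The assertion is exactly the displayed additivity identity, evaluated at the sample points kh.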
3.3 Kernel, image and dimension

Let F : V → W be a linear mapping from vector space V to vector space W. It is readily verified that the kernel ker(F) = { x ∈ V | F(x) = 0 } is a subspace of the domain V and that Im(F) is a subspace of the codomain W (Problem 3.7). Now suppose that we have to find the solutions x of the equation

    F(x) = w.

There are two possibilities: either w ∉ Im(F), in which case no solution x exists, or w ∈ Im(F). In that case there is at least one x0 for which F(x0) = w. We claim that the complete solution set is the affine subspace

    x0 + ker(F).

Indeed, if F(x0) = w then x satisfies

    F(x) = w ⟺ F(x) = F(x0) ⟺ F(x − x0) = 0 ⟺ x − x0 ∈ ker(F) ⟺ x ∈ x0 + ker(F).
Example 3.3.1. Let V be the subspace of twice differentiable functions in F(R; R) and let D : V → F(R; R) be the differential operator defined as

    D(y) = y^(2) + y.

What is the complete solution set (in V) of

    (Dy)(t) = 2 e^t ?

Clearly y0(t) = e^t is one solution. The complete solution set hence is

    e^t + ker(D) = e^t + span{sin, cos}.
From linear algebra one may recall that any n × m matrix can, through elementary row and column operations, be transformed into the form

    [ I_r        0_{r,m−r}   ]
    [ 0_{n−r,r}  0_{n−r,m−r} ]  ∈ R^{n×m}.

In this form it is immediate that the kernel has dimension m − r and that the image has dimension r. These two dimensions add up to m, which is the number of columns of the matrix. This result holds in greater generality (no proof):

Lemma 3.3.2 (A dimension theorem). Let F : V → W be a linear operator from vector space V to vector space W and assume that V is finite dimensional. Then

    dim(ker(F)) + dim(Im(F)) = dim(V).

In particular, if dim(V) = dim(W) < ∞, then the above says that F is injective iff it is surjective:

    ker(F) = {0}  ⟺  Im(F) = W.
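For matrices the dimension theorem can be checked concretely. The sketch below (illustrative, not from the notes) takes a rank-1 matrix F : R³ → R², exhibits two independent kernel vectors, and confirms that rank plus nullity equals the number of columns:

```python
import numpy as np

# Dimension theorem for F : R^3 -> R^2 given by a rank-1 matrix
# (the second row is twice the first).
F = np.array([[1., 2., 3.],
              [2., 4., 6.]])

r = np.linalg.matrix_rank(F)        # dim Im(F) = 1

# two linearly independent vectors in ker(F):
v1 = np.array([2., -1., 0.])
v2 = np.array([3., 0., -1.])
assert np.allclose(F @ v1, 0) and np.allclose(F @ v2, 0)

# dim Im(F) + dim ker(F) = dim(V) = number of columns:
assert r + 2 == F.shape[1]
```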
Example 3.3.3 (Differentiator). Consider the vector space of polynomials Pn of degree at most n, and the differentiator D : Pn → Pn defined as D(p) = p′. The kernel of D is

    ker(D) = { p ∈ Pn | p′ = 0 } = { p ∈ Pn | p is constant } = P0.

Clearly this kernel has dimension 1. So by the dimension theorem the range, Im(D), has dimension dim(Pn) − 1 = n. It does:

    Im(D) = { D(p) | p(t) = a_n t^n + ⋯ + a_1 t + a_0, a_i ∈ R }
          = { n a_n t^{n−1} + (n − 1) a_{n−1} t^{n−2} + ⋯ + a_1 | a_i ∈ R }
          = P_{n−1}.
Example 3.3.4 (Abstract interpolation). Clearly, given any two points (x1, y1) and (x2, y2) in R² with x1 ≠ x2, there is a unique degree-1 or constant polynomial that interpolates these points: the line through (x1, y1) and (x2, y2).
With the dimension theorem this can be generalized as follows. Consider an arbitrary set of n + 1 points (x_i, y_i) ∈ R² with all x_i distinct. We show that there is a unique polynomial of degree n or less that interpolates these points. To this end consider the mapping

    F : Pn → R^{n+1}

that sends a polynomial p to

    F(p) = (p(x_1), p(x_2), ..., p(x_{n+1})).

A polynomial p interpolates (x_1, y_1), ..., (x_{n+1}, y_{n+1}) iff F(p) = y where y = (y_1, ..., y_{n+1}). The mapping F is linear (verify this yourself). Now it is well known that a polynomial of degree n or less does not have n + 1 zeros, unless it is the zero function. Hence on Pn we have F(p) = 0 only if p is the zero element, so

    ker(F) = {0}.

By the dimension theorem and the fact that Pn and R^{n+1} have the same dimension we thus have

    Im(F) = R^{n+1}.

In other words, for every y = (y_1, ..., y_{n+1}) there is a p0 ∈ Pn that interpolates the n + 1 points (x_i, y_i). In fact the solution is unique because the general solution is p0 + ker(F), and ker(F) = {0}. See Fig. 3.3.
Figure 3.3: There is a unique p ∈ P2 that interpolates the three points
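In coordinates, solving F(p) = y amounts to solving a linear system with the Vandermonde matrix. A numerical sketch (not part of the notes; it uses numpy's `np.vander` and `np.polynomial.Polynomial`) for three points and P2:

```python
import numpy as np

# Unique p in P_2 through 3 points: solve F(p) = y, where F is
# represented (in the standard basis of P_2) by the Vandermonde matrix.
x = np.array([0.0, 1.0, 2.0])
y = np.array([1.0, 0.0, 3.0])

V = np.vander(x, 3, increasing=True)   # rows [1, x_i, x_i^2]
a = np.linalg.solve(V, y)              # coefficients a_0, a_1, a_2

p = np.polynomial.Polynomial(a)        # p(t) = a_0 + a_1 t + a_2 t^2
assert np.allclose(p(x), y)            # p interpolates all three points
```

That the system is uniquely solvable for any y is exactly the statement ker(F) = {0} combined with the dimension theorem: the Vandermonde matrix with distinct nodes is invertible.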
3.4 Linear transformation on Rn
On Rn linear mappings are often identified with matrices.
Figure 3.4: Rotation and reflection
Example 3.4.1 (Rotation in R²). Figure 3.4(a) illustrates the rotation operator F : R² → R². It rotates its argument over an angle of φ (counterclockwise). It is a linear mapping (verify this). In particular it maps the unit vector e1 := (1, 0)ᵀ to y1 := (cos(φ), sin(φ))ᵀ and the unit vector e2 := (0, 1)ᵀ to y2 := (−sin(φ), cos(φ))ᵀ. Combining the two outcomes in a matrix

    F_rotation := [y1 y2] = [ cos(φ)  −sin(φ) ]
                            [ sin(φ)   cos(φ) ]

is the standard way of representing this linear mapping.
Example 3.4.2 (Reflection in R²). Figure 3.4(b) depicts the reflection transformation F : R² → R². It reflects its argument with respect to the line with angle φ/2. The matrix F now becomes

    F_reflection = [ cos(φ)   sin(φ) ]
                   [ sin(φ)  −cos(φ) ].
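The rotation matrix of Example 3.4.1 is easy to exercise numerically. A short sketch (illustrative, not from the notes), checking the image of e1, norm preservation, and the fact that composing two rotations adds the angles:

```python
import numpy as np

phi = np.pi / 3
R = np.array([[np.cos(phi), -np.sin(phi)],
              [np.sin(phi),  np.cos(phi)]])   # rotation over phi

e1 = np.array([1.0, 0.0])
assert np.allclose(R @ e1, [np.cos(phi), np.sin(phi)])   # e1 -> y1

# rotation preserves the Euclidean norm:
v = np.array([2.0, -1.0])
assert np.isclose(np.linalg.norm(R @ v), np.linalg.norm(v))

# composing two rotations adds the angles:
phi2 = 2 * phi
assert np.allclose(R @ R, [[np.cos(phi2), -np.sin(phi2)],
                           [np.sin(phi2),  np.cos(phi2)]])
```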
Example 3.4.3 (Transformation on R³). Suppose we have a mapping T that we know to be linear and that sends the unit cube to a stretched version, see Fig. 3.5, in particular that

    T(e1) = e1,  T(e2) = 2e2,  T(e3) = e3.

The matrix T associated with this mapping (with respect to the standard basis) is

    T = [ 1 0 0 ]
        [ 0 2 0 ]
        [ 0 0 1 ].
Identifying linear mappings with their matrix has to do with the fact that the linear mapping is completely specified by its matrix (a proof follows shortly). The drawback of such a matrix approach is that it assumes that we all agree on what the standard basis is, and while this may be so in Rⁿ, for other vector spaces this may not be so obvious.
Figure 3.5: Unit cube linearly transformed
3.5 Matrix representation and eigenvectors

A message of the previous section is this: once we settle on a basis, the linear mapping may be identified with a matrix of scalars. As mentioned earlier, a drawback of such an approach is that it assumes agreement on the choice of basis. On the other hand, an advantage is that it translates the linear mapping into a matrix of numbers, which makes it explicit (e.g. matlab-able). Consistent with the previous section we define:

Definition 3.5.1 (Matrix representation of linear mappings). Let V be a vector space with finite ordered basis S = {v1, v2, ..., vn}. For any x ∈ V let x_S ∈ Rⁿ (or Cⁿ) denote the column vector of coordinates of x with respect to the basis, that is,

    x = Σ_{i=1}^n v_i x_{S,i}.   (3.1)

For any linear transformation

    F : V → V

the matrix F_SS of F with respect to the basis S is defined as the n × n matrix whose columns are the coordinate vectors of the transformed basis elements,

    F_SS = [ [F(v1)]_S  [F(v2)]_S  ⋯  [F(vn)]_S ].

The connection (3.1) between x and x_S may be written compactly using a row vector of basis elements, as

    x = [v1 v2 ⋯ vn] x_S.

For x = F(v_i) this shows that [F(v_i)]_S is determined by the equation

    F(v_i) = [v1 v2 ⋯ vn] [F(v_i)]_S,

and the matrix F_SS, since it is just the collection of all these [F(v_i)]_S, is determined by

    [F(v1) F(v2) ⋯ F(vn)] = [v1 v2 ⋯ vn] F_SS.

The following lemma says that linear transformations on finite dimensional vector space are completely specified by their matrix:
Lemma 3.5.2 (Matrix representation of linear transformation). Let V be a vector space with finite ordered basis S = {v1, ..., vn}, let x, y ∈ V and suppose that F : V → V is linear. Then

    y = F(x)  ⟺  y_S = F_SS x_S.

Proof. By definition of x_S we have x = [v1 ⋯ vn] x_S. Using linearity we get F(x) = F([v1 ⋯ vn] x_S) = [F(v1) ⋯ F(vn)] x_S = [v1 ⋯ vn] F_SS x_S. So y = F(x) iff [v1 ⋯ vn] y_S = [v1 ⋯ vn] F_SS x_S. As the {v1, ..., vn} are linearly independent this last equality holds iff y_S = F_SS x_S.

Lemma 3.5.3 (Eigenvalue and eigenvector). Let λ be a scalar. Consider a linear mapping F : V → V and let F_SS be the matrix of this mapping, given some basis S of V. The following statements are equivalent.

1. There is an x ∈ V, x ≠ 0, such that F(x) = λx.
2. λ is an eigenvalue of the matrix F_SS.

Such a nonzero x we call an eigenvector of the mapping, and the scalar λ an eigenvalue of the mapping.

Proof. Apply Lemma 3.5.2 for y = λx, and realize that x = 0 iff x_S = 0.
The lemma implies that the eigenvalues of F_SS do not depend on the choice of basis. Better yet, the notion of eigenvalue does not require the notion of basis. For complicated linear mappings it may however be hard to find the eigenvalues and eigenfunctions, and then a matrix representation may help.
Example 3.5.4 (Differentiator). Consider the differentiator D : Pn → Pn that sends polynomials p of degree at most n to their derivative D(p) := p^(1). A basis for Pn clearly is

    S := {1, t, t², ..., t^n}

and these map to

    {0, 1, 2t, ..., n t^{n−1}}.

With respect to this basis S, the matrix D_SS that represents the differentiator on Pn can be derived from

    [D(1) D(t) D(t²) ⋯ D(t^n)] = [0 1 2t ⋯ n t^{n−1}] = [1 t t² ⋯ t^n] D_SS,

where

    D_SS = [ 0 1 0 ⋯ 0 ]
           [ 0 0 2 ⋯ 0 ]
           [ ⋮     ⋱   ⋮ ]
           [ 0 0 0 ⋯ n ]
           [ 0 0 0 ⋯ 0 ].

The matrix D_SS is not invertible, hence neither is the differentiator. Indeed the differentiator is not invertible because every constant maps to 0. The only eigenvalue that the matrix has is λ = 0, hence the differentiator has no eigenvalues other than λ = 0. Indeed, the derivative of any polynomial is of lower degree, so nonconstant eigenfunctions do not exist. The eigenfunctions with eigenvalue 0 are the constant functions.

If we choose as domain V = span{e^t, e^{2t}} with obvious basis V = {e^t, e^{2t}} then the matrix D_VV of the differentiator becomes

    D_VV = [ 1 0 ]
           [ 0 2 ]

because

    [D(e^t) D(e^{2t})] = [e^t  2e^{2t}] = [e^t  e^{2t}] [ 1 0 ]
                                                        [ 0 2 ].

Now D_VV is invertible, hence the differentiator is invertible on span{e^t, e^{2t}}; indeed it is. Also, its eigenvalues are 1 and 2, hence f ∈ span{e^t, e^{2t}} exist with D(f) = f and D(f) = 2f. Clearly such f exist.
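The matrix D_SS of Example 3.5.4 can be built and exercised numerically. A sketch (illustrative, not from the notes) for n = 3: the matrix acts on coefficient vectors exactly as differentiation acts on polynomials, and it is nilpotent, consistent with λ = 0 being its only eigenvalue:

```python
import numpy as np

# D_SS for P_n with basis {1, t, ..., t^n}: superdiagonal 1, 2, ..., n.
n = 3
D = np.diag(np.arange(1, n + 1), k=1).astype(float)

# p(t) = 2 + 3t - t^3 has coordinates (2, 3, 0, -1);
# p'(t) = 3 - 3t^2 has coordinates (3, 0, -3, 0).
p = np.array([2., 3., 0., -1.])
assert np.allclose(D @ p, [3., 0., -3., 0.])

# D_SS is nilpotent: D^(n+1) = 0, so 0 is its only eigenvalue.
assert np.allclose(np.linalg.matrix_power(D, n + 1), 0)
```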
Figure 3.6: g(t) = p(1 − t)
Example 3.5.5 (Eigenfunction). Consider the mapping F : P2 → P2 defined as

    g = F(p) :  g(t) = p(1 − t).

The graph (t, g(t)) is the graph (t, p(t)) reflected in the vertical axis at t = 1/2, see Fig. 3.6. With respect to the standard basis S = {1, t, t²} the matrix F_SS follows from

    [F(1) F(t) F(t²)] = [1  1 − t  (1 − t)²] = [1 t t²] [ 1  1  1 ]
                                                        [ 0 −1 −2 ]
                                                        [ 0  0  1 ]

so the rightmost matrix is F_SS. Because of its upper-triangular structure, the eigenvalues of F_SS are the diagonal elements,

    λ = 1 (twice) and λ = −1.

It is readily verified that the corresponding eigenvectors (modulo scaling etc.) are

    λ = 1 :  v1 = (1, 0, 0)ᵀ,  v1′ = (0, −1, 1)ᵀ

and

    λ = −1 :  v−1 = (−1, 2, 0)ᵀ.

This corresponds to the eigenfunctions

    p1(t) = [1 t t²] (1, 0, 0)ᵀ = 1,
    p1′(t) = [1 t t²] (0, −1, 1)ᵀ = t² − t,

and

    p−1(t) = [1 t t²] (−1, 2, 0)ᵀ = 2t − 1.

See Fig. 3.7. Since the eigenvector v−1 for λ = −1 is unique (up to scaling), the eigenfunction p−1 with eigenvalue −1 is unique as well (up to scaling). The eigenfunctions with eigenvalue 1 are the linear combinations of p1 and p1′.
Figure 3.7: Three eigenfunctions (Example 3.5.5)
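The eigenvalue computation of Example 3.5.5 can be checked numerically (an aside, not part of the notes):

```python
import numpy as np

# Eigenvalues and an eigenvector of the matrix F_SS of Example 3.5.5.
F = np.array([[1.,  1.,  1.],
              [0., -1., -2.],
              [0.,  0.,  1.]])

lam = np.sort(np.real(np.linalg.eigvals(F)))
# upper-triangular, so the eigenvalues are the diagonal: -1, 1, 1
assert np.allclose(lam, [-1., 1., 1.])

# coordinate vector of p(t) = 2t - 1 is an eigenvector for lambda = -1:
v = np.array([-1., 2., 0.])
assert np.allclose(F @ v, -v)
```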
3.5.1 Eigenspace

Eigenvectors are not unique. If v is an eigenvector then so are 2v and 3v, all with the same eigenvalue. For any eigenvalue λ of a linear mapping F, the set of all eigenvectors, including the zero element, equals

    E_λ := { v | F(v) = λv } = { v | 0 = (λI − F)(v) } = ker(λI − F).

This set E_λ is a subspace and we call it the eigenspace of F for eigenvalue λ.
Example 3.5.6 (Eigenspace on infinite dimensional vector space). Let L : F(R, R) → F(R, R) be the linear mapping defined as

    (Lf)(t) = t² f(t)  ∀t ∈ R.

We determine the eigenvalues and eigenspaces of this mapping. Now a nonzero f ∈ F(R, R) is an eigenvector with eigenvalue λ if

    t² f(t) = λ f(t)  ∀t ∈ R.   (3.2)

Since t² is real, any eigenvalue λ is necessarily real as well. Among these we distinguish three cases:

• If λ < 0 then (3.2) holds only if f(t) = 0 ∀t. But the zero function is by definition not an eigenvector. Hence no λ < 0 is an eigenvalue.

• If λ = 0 then (3.2) implies that f(t) = 0 for all t ≠ 0. The value f(t) at t = 0 may be anything as long as it is nonzero, because eigenvectors are by definition nonzero. So

    f1(t) = { 1 if t = 0
            { 0 if t ≠ 0

is an eigenvector with eigenvalue 0 and the corresponding eigenspace is the 1-dimensional

    E_0 = span{f1}.

• If λ > 0 then (3.2) holds at t = ±√λ irrespective of f. At all other t we need f(t) = 0. Now

    f2(t) = { 1 if t = √λ        f3(t) = { 1 if t = −√λ
            { 0 otherwise,               { 0 otherwise,

are two independent eigenvectors with eigenvalue λ, and the eigenspace in this case equals

    E_λ = span{f2, f3}.

It has dimension two.
Notice that in the above example every real number λ ≥ 0 is an eigenvalue of the mapping. This is in stark contrast with mappings on finite dimensional vector spaces, which have finitely many eigenvalues only.
Example 3.5.7. The differentiator D : Pn → Pn of Example 3.5.4 has one eigenvalue only, λ = 0, and the eigenvectors were shown to equal the nonzero constant functions. The eigenspace for λ = 0 is E_{λ=0} = span{1}. It is the set of all constant functions, including the zero function.
Example 3.5.8. The mapping of Example 3.5.5 has two eigenvalues, λ = 1 and λ = −1. The eigenspaces are

    E_{λ=1} = span{1, t² − t},   E_{λ=−1} = span{2t − 1}.
3.5.2 Diagonalization

A linear transformation F : V → V is said to be diagonalizable if V has a basis S with respect to which the matrix F_SS is diagonal. More succinctly, it is diagonalizable if the space has a basis of eigenvectors of F.

Example 3.5.9 (Differentiator). The differentiator D : Pn → Pn of Example 3.5.4 is not diagonalizable, because only the constant functions are eigenfunctions and these do not form a basis of Pn (unless n = 0). The same differentiator D : V → V but now with V = span{e^t, e^{2t}} is diagonalizable.

Example 3.5.10. Consider the linear mapping A : R² → R² that, with respect to some basis S = {s1, s2}, has matrix representation

    A_SS = [ 1 1 ]
           [ 6 2 ].

This matrix has characteristic polynomial

    det(λI − A_SS) = det [ λ−1   −1  ] = λ² − 3λ − 4
                         [ −6   λ−2 ]

and its zeros are λ1 = 4 and λ2 = −1. The corresponding eigenspaces follow as

    E_{λ=4} = ker(4I − A) = ker [  3 −1 ] = span{ (1, 3)ᵀ }
                                [ −6  2 ]

and

    E_{λ=−1} = ker(−I − A) = ker [ −2 −1 ] = span{ (1, −2)ᵀ }.
                                 [ −6 −3 ]

Hence V := {v1, v2} defined as

    v1 = [s1 s2] (1, 3)ᵀ,   v2 = [s1 s2] (1, −2)ᵀ

are eigenvectors of A, and the matrix A_VV with respect to this basis is the diagonal matrix of eigenvalues,

    A_VV = [ λ1 0  ] = [ 4  0 ]
           [ 0  λ2 ]   [ 0 −1 ].

The A_SS we started with can now be written as a product of three matrices, each with its own interpretation:

    A_SS = [ 1  1 ] [ 4  0 ] [ 1  1 ]⁻¹
           [ 3 −2 ] [ 0 −1 ] [ 3 −2 ]

where the first factor transforms coordinates in basis V to coordinates in basis S, the middle factor applies the mapping in coordinates of basis V, and the last factor transforms coordinates in basis S to coordinates in basis V.
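The diagonalization of Example 3.5.10 can be verified numerically (an aside, not from the notes):

```python
import numpy as np

# Diagonalization of A_SS from Example 3.5.10.
A = np.array([[1., 1.],
              [6., 2.]])
P = np.array([[1.,  1.],
              [3., -2.]])      # columns: eigenvectors for 4 and -1
D = np.diag([4., -1.])

assert np.allclose(A @ P, P @ D)                 # A v_i = lambda_i v_i
assert np.allclose(A, P @ D @ np.linalg.inv(P))  # A = P D P^(-1)
```

The factorization also makes powers cheap: A^k = P D^k P^(-1), with D^k obtained by raising the diagonal entries to the k-th power.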
3.6 Problems

3.1 Let L : F(R, R) → F(R, R) be the operator defined as (Lf)(x) = x² f(x). Show that L is linear.

3.2 Determine which of the following mappings are linear:
a) F : R → R : F(t) = 3t + 1
b) A : Pn → R : A(p) = p^(1)(3)
c) B : P → P : B(p) = p^(1)
d) G : Cⁿ → C : G(x) = aᴴx (where a ∈ Cⁿ is some given vector)

3.3 The plus sign + appears four times in Section 3.1. Which of these four plus signs indicate the same type of addition?

3.4 Let V, W be two real vector spaces or two complex vector spaces and let L(V, W) be the set of linear operators from V to W. On this set of operators we define addition and scalar multiplication as

    (A + B)(x) := A(x) + B(x),   (αA)(x) := α(A(x)).

a) Show that A + B is linear if A, B are linear
b) Show that αA is linear if A is linear and α is a scalar
c) Show that L(V, W) is a vector space
d) Briefly comment on a link between L(Cᵏ, Cⁿ) and n × k complex matrices

3.5 Let V be a subspace of Rⁿ. Show that the orthogonal projection from x to its best approximation v (Thm. 1.7.2) is linear.

3.6 Assume F is linear. Show that for any m ∈ N, scalars a1, ..., am and vectors v1, ..., vm there holds

    F(a1 v1 + a2 v2 + ⋯ + am vm) = a1 F(v1) + a2 F(v2) + ⋯ + am F(vm).
3.7 Suppose F : V → W is linear and that V and W are complex vector spaces.
a) Show that ker(F) is a subspace of V
b) Show that Im(F) is a subspace of W

3.8 Let B ∈ Cⁿˣⁿ. Show that the mapping L : Cⁿˣⁿ → Cⁿˣⁿ defined as L(A) = AB − BA is linear.

3.9 Let C([a, b], R) denote the subspace of continuous functions in F([a, b], R). Is the integral operator J : C[a, b] → C[a, b] defined as

    f = J(g) :  f(t) = ∫_a^t g(τ) dτ

linear?
3.10 Consider the linear transformation F : P1 → P1 defined by

    F(α0 + α1 t) = α0 + (8α0 − α1)t.

a) Determine the matrix of F with respect to the standard basis of P1.
b) Determine the matrix of F with respect to basis {t + 1, t − 1}.
c) Determine the eigenvalues of the above two matrices.
d) Determine the eigenvalues of F without using the matrices.
3.11 Let g ∈ ℓ(N; R) and consider the mapping from ℓ(N; R) to ℓ(N; R) that sends g to the sequence f defined as

    f_k = k g_k.

a) Show that the mapping is linear
b) What are the eigenvalues of the mapping?

3.12 Consider the complex vector space of infinitely often differentiable functions

    C^∞(R, C) = { u + iv | u^(k), v^(k) ∈ F(R, R) ∀k ∈ N }.

Consider on this space the differentiator D(f) = f^(1). Determine all eigenvalues of D.

3.13 Let A, B, C : V → V and suppose V has a finite basis S. Show that

    A = BC  ⟺  A_SS = B_SS C_SS.

3.14 Consider the subspace W := span{1, sin(x), sin(2x)} of F(R, R) and the second derivative T : W → W, T(g) = g^(2).
a) Determine the eigenvalues and eigenspaces of T
b) Is T : W → W diagonalizable?

3.15 Let V be a vector space and A : V → V a linear transformation.
a) Suppose A = A². Show that 0 or 1 are the only possible eigenvalues
b) Suppose Aᵏ = 0 for some k ∈ N. Which eigenvalues are possible?
c) Construct a V and linear A : V → V for which A ≠ 0 while A² = 0.

3.16 Consider the mapping F : P2 → P2 defined as F(p)(t) = p(−t).
a) Determine a basis S of P2
b) Determine the matrix F_SS of the mapping with respect to this basis S
c) Find the eigenvalues and eigenvectors of F_SS
d) Find the eigenvalues and eigenfunctions p ∈ P2 of F.

3.17 Repeat the previous question but now for the mapping F(p)(t) = p(t + 1).

3.18 Determine eigenvalues and eigenvectors of A and check whether or not A can be diagonalized, for

    a) A = [ 1 2 ]    b) A = [ 0  1 ]    c) A = [ 1 1 ]
           [ 0 3 ]           [ −2 3 ]           [ 0 1 ]

3.19 Show that

    A = [ 1  3 ]
        [ 1 −1 ]

is diagonalizable. Use this to compute A⁴.

3.20 Is the operator of Example 3.5.5 diagonalizable?
4 Normed vector space

A normed vector space, loosely speaking, is a vector space in which a length (a size) of a vector is available. This additional structure allows us to deal with optimal approximation and with limits of vectors. We denote the length of a vector x by ‖x‖ and call it the norm of x.

4.1 Norm

Definition 4.1.1 (Norm). Let V be a real or complex vector space. A mapping ‖·‖ from V to R is a norm if for all x, y ∈ V and all scalars α it satisfies the three axioms:

1. ‖αx‖ = |α| ‖x‖,   (positive homogeneous)
2. ‖x + y‖ ≤ ‖x‖ + ‖y‖,   (triangle inequality)
3. ‖x‖ > 0 for every x ≠ 0.   (positive definite)

For α = 0 the first axiom tells us that ‖0‖ = 0. So a norm ‖x‖ is zero if and only if x is the zero vector. A normed vector space is a vector space on which a norm is defined. Formally one should say (V, ‖·‖) is a normed vector space, but we usually just say V is a normed vector space, assuming that the choice of norm is clear from the problem at hand. Be aware, however, that a vector space can be equipped with many different norms.
Figure 4.1: Unit balls in the p-norm for p = 1, 2, ∞
Example 4.1.2 (Three different norms on R²).

1. The 1-norm is defined as

    ‖x‖₁ = |x1| + |x2|.

In the first quadrant, where x1 and x2 are nonnegative, the 1-norm is just the sum of the entries, ‖x‖₁ = x1 + x2. In the first quadrant therefore the norm is at most 1 iff x2 ≤ 1 − x1, which is the triangular region with corners (0, 0), (1, 0) and (0, 1). Combined with the other three quadrants we get that the unit ball { x | ‖x‖₁ ≤ 1 } is a polytope, a square in fact, see Fig. 4.1(a).

2. The Euclidean norm, also known as the 2-norm, is defined as

    ‖x‖₂ := √(x1² + x2²).

In this norm the unit ball { x | ‖x‖₂ ≤ 1 } is the unit disc, see Fig. 4.1(b).

3. The max-norm, or ∞-norm, is defined as

    ‖x‖_∞ = max(|x1|, |x2|).

Now in this norm the unit ball { x | ‖x‖_∞ ≤ 1 } is a square with its sides parallel to the x1- and x2-axes, see Fig. 4.1(c).
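The three norms of Example 4.1.2 are all available through numpy's `np.linalg.norm` (an aside, not part of the notes):

```python
import numpy as np

x = np.array([3.0, -4.0])

n1   = np.linalg.norm(x, 1)        # |3| + |-4| = 7
n2   = np.linalg.norm(x, 2)        # sqrt(9 + 16) = 5
ninf = np.linalg.norm(x, np.inf)   # max(3, 4) = 4

print(n1, n2, ninf)                # 7.0 5.0 4.0
assert ninf <= n2 <= n1            # on R^2 the three norms are ordered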
The 1-norm is sometimes called the manhattan norm because in a rectangular street grid, which is common in US cities, the 1-norm ‖x − y‖₁ is the minimal Euclidean distance required to travel from junction x to junction y, see Fig. 4.2.

Figure 4.2: Manhattan norm: all three routes are equally long, ‖x − y‖₁
The triangle inequality ‖x + y‖ ≤ ‖x‖ + ‖y‖ loosely speaking says that in any norm, traveling from 0 to x + y via x or via y can only mean a detour. Moving the ‖y‖ to the left-hand side of the inequality turns the triangle inequality into a statement that says that any side of a triangle is at least the difference of the other two sides:

    ‖x + y‖ ≥ ‖y‖ − ‖x‖.

This is sometimes called the reverse triangle inequality and it is commonly formulated in terms of z = x + y as:

Lemma 4.1.3. | ‖z‖ − ‖y‖ | ≤ ‖z − y‖.

In this form it is immediate that if two vectors z and y are close then their norms are close as well. This implies that norms are continuous in some way (see § 4.4.1).
Example 4.1.4. The space of finitely nonzero sequences ℓ_finite(N; R) is a normed vector space in the 1-norm defined as

    ‖f‖₁ := Σ_{i=1}^∞ |f_i|.

See Problem 4.4.
Example 4.1.5 (Continuous functions in max-norm). The standard norm on the vector space C[a, b] of continuous functions on a real interval [a, b] is the max-norm, also known as the ∞-norm, defined as

    ‖f‖_∞ = max_{x ∈ [a,b]} |f(x)|.

We now verify that this indeed satisfies the three axioms of a norm:

1. For every scalar α we have

    ‖αf‖ = max_x |αf(x)| = max_x |α| |f(x)| = |α| max_x |f(x)| = |α| ‖f‖.

2. The max-norm inherits the triangle inequality from R: since for every p, q ∈ R we have that |p + q| ≤ |p| + |q|, we also have for every f, g ∈ C[a, b] that

    ‖f + g‖ = max_x |f(x) + g(x)|
            ≤ max_x (|f(x)| + |g(x)|)
            ≤ max_x |f(x)| + max_x |g(x)|
            = ‖f‖ + ‖g‖.

3. If f is not the zero function then f(x0) ≠ 0 for at least one x0 ∈ [a, b]. Now ‖f‖ ≥ |f(x0)| > 0.
In some literature the vector space C[a, b] is identified with the normed vector space (C[a, b], ‖·‖_∞). This is unfortunate, since we may want to consider other norms on the space of continuous functions, for instance:

Example 4.1.6. On C[a, b],

    ‖f‖₁ := ∫_a^b |f(x)| dx   (4.1)

is a norm (Problem 4.5).
Notice that in this example the norm ‖f‖₁ exists (is finite) for every continuous function. For arbitrary functions in F([a, b], R) that need not be the case, and this is the reason we restricted attention to C[a, b]. However the space of continuous functions also has its drawbacks for this norm:

Example 4.1.7 (Limit does not exist in the space). Consider C[−1, 1] and the 1-norm defined in (4.1). In this norm the sequence of functions

    f_n(t) = { 0    for t ∈ [−1, 0],
             { nt   for t ∈ (0, 1/n),
             { 1    for t ∈ [1/n, 1],

does not converge in the space C[−1, 1], because no continuous function f exists for which lim_{n→∞} ‖f_n − f‖₁ = 0. (Convince yourself of this.) Nevertheless the sequence of functions do approach one another in the sense that

    sup_{n>N, m>N} ‖f_n − f_m‖₁

goes to zero as N → ∞. This follows from the fact that for any n, m > N we have

    ‖f_n − f_m‖₁ = ∫_{−1}^{1} |f_n(t) − f_m(t)| dt
                 = ∫_0^{1/min(n,m)} |f_n(t) − f_m(t)| dt
                 ≤ ∫_0^{1/N} 1 dt
                 = 1/N.

What fails in this example is that lim_{n→∞} f_n does not exist in the space, even though the f_n become arbitrarily close to one another in the given norm. We thus need to make a distinction between convergent sequences and sequences whose elements become closer and closer. The latter are called Cauchy sequences and they are the topic of the next section. Incidentally this difference is not specific to vector spaces. It also shows up in sets like the rational numbers Q. Indeed, in Q we can construct sequences that approach one another in absolute value but that do not have a limit in the set of rational numbers. An example is the sequence of rational numbers {3, 3.1, 3.14, 3.141, ...} that converges to the nonrational π.
4.2 Cauchy sequence

Definition 4.2.1 (Cauchy sequence and convergent sequence). Let X be a normed vector space and let {x_n}_{n∈N} be a sequence in X.

• {x_n} is a Cauchy sequence if for every ε > 0 there is an N ∈ N such that

    n, m > N  ⟹  ‖x_n − x_m‖ < ε.

• {x_n} is a convergent sequence if there is an x ∈ X such that lim_{n→∞} ‖x_n − x‖ = 0.

It can be shown that for sequences {α_n} of real numbers the two notions are equivalent. I.e. a real sequence converges iff it is a Cauchy sequence. Figure 4.3 makes this plausible.
Figure 4.3: Cauchy criterion for real sequences
Example 4.2.2 (Integral test for real-valued sequences). Consider the real sequence $\sigma_n = 1 + \frac{1}{2^2} + \frac{1}{3^2} + \cdots + \frac{1}{n^2}$. Now for every $m \ge n > N$ we have
\[
|\sigma_m - \sigma_n| = \sum_{k=n+1}^{m} \frac{1}{k^2} < \int_N^\infty \frac{1}{x^2}\,dx = \frac{1}{N},
\]
and the bound $1/N$ goes to zero as $N \to \infty$. So $\{\sigma_n\}$ is a Cauchy sequence, and this we established without knowing its limit.

Conversely, every convergent sequence is automatically a Cauchy sequence. Indeed, suppose $\lim_{n\to\infty} \|f_n - f\| = 0$ and let $\epsilon > 0$. Then there is an $N$ such that $\|f_n - f\| < \epsilon/2$ for all $n > N$. Then by the triangle inequality
\[
\|f_n - f_m\| = \|(f_n - f) - (f_m - f)\| \le \|f_n - f\| + \|f_m - f\| < \epsilon/2 + \epsilon/2 = \epsilon
\]
for every $n, m > N$. So $\{f_n\}$ is Cauchy.
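A quick numerical companion (ours, not part of the course text) to Example 4.2.2: the tail $|\sigma_m - \sigma_n|$ of the partial sums of $\sum 1/k^2$ indeed stays below the integral bound $1/N$ whenever $m \ge n > N$.

```python
# Numeric check (our sketch) of the integral-test bound of Example 4.2.2.

def sigma(n):
    """Partial sum sigma_n = 1 + 1/2^2 + ... + 1/n^2."""
    return sum(1.0 / k**2 for k in range(1, n + 1))

N = 100
tail = sigma(10_000) - sigma(N + 1)   # |sigma_m - sigma_n| with m = 10000, n = N + 1
print(tail, "<", 1.0 / N)             # the tail respects the bound 1/N
```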
4.3 Banach space = complete vector space
Definition 4.3.1 (Banach space). A normed vector space
X is said to be complete if every Cauchy sequence has a
limit in X. Complete normed vector spaces are called Banach spaces.
In a Banach space therefore a sequence converges if and
only if it is a Cauchy sequence. This is beneficial because
the Cauchy property is often easier to check, since it does not require knowledge of the limit (see Example 4.2.2), and, more importantly, all sorts of limits are then guaranteed to
exist. This will be of great help in the final chapter of this
course.
Over the years many spaces have been shown to be Ba-
nach spaces, and also many have been shown to fail the
Banach property. In this introductory course we will not worry about completeness proofs, because the proofs are
often intricate. We simply list a couple in the remainder of
this section.
Theorem 4.3.2 (Continuous functions with max-norm). $C[a,b]$ is a Banach space in the max-norm $\|\cdot\|_\infty$.

Proof. Suppose $\{f_n\}$ is a Cauchy sequence. Then for every $\epsilon > 0$ there is an $N > 0$ such that $\|f_n - f_m\|_\infty < \epsilon$ for all $n, m > N$. Now at any $t \in [a,b]$ we have
\[
|f_n(t) - f_m(t)| \le \|f_n - f_m\|_\infty < \epsilon \qquad \forall n, m > N.
\]
So for every fixed $t \in [a,b]$ the sequence of real numbers $\{f_n(t)\}$ is Cauchy. Since $\mathbb{R}$ is a Banach space, we hence have that the pointwise limit $f(t) := \lim_{n\to\infty} f_n(t)$ exists. Letting $m \to \infty$ we obtain that
\[
|f_n(t) - f(t)| \le \epsilon \qquad \forall n > N,
\]
and this $N$ does not depend on $t$. Hence $\|f_n - f\|_\infty \to 0$ as $n \to \infty$. It remains to show that this $f$ is continuous. Given $\epsilon > 0$, fix an $n$ such that $\|f_n - f\|_\infty < \epsilon/3$. By continuity of $f_n$ we have at each $t$ that $|f_n(t) - f_n(t+h)| < \epsilon/3$ for all $|h| < \delta$, for some small enough $\delta > 0$. For all such $h$ there holds
\[
|f(t+h) - f(t)| = |f(t+h) - f_n(t+h) + f_n(t+h) - f_n(t) + f_n(t) - f(t)|
\le |f(t+h) - f_n(t+h)| + |f_n(t+h) - f_n(t)| + |f_n(t) - f(t)|
< \epsilon/3 + \epsilon/3 + \epsilon/3 = \epsilon.
\]
So $f$ is continuous.
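The theorem can be watched in action with a concrete sequence (our choice, not from the text): $f_n(t) = \sqrt{t^2 + 1/n^2}$ is Cauchy in the max-norm on $[-1,1]$, and its uniform limit $f(t) = |t|$ is again continuous, with $\|f_n - f\|_\infty = 1/n$.

```python
# Sketch (ours): uniform convergence in C[-1, 1] with the max-norm.
import math

def max_norm_dist(n, steps=2001):
    """Sampled max-norm distance between f_n(t) = sqrt(t^2 + 1/n^2) and |t|."""
    ts = [-1.0 + 2.0 * k / (steps - 1) for k in range(steps)]
    return max(abs(math.sqrt(t * t + 1.0 / n**2) - abs(t)) for t in ts)

for n in (10, 100, 1000):
    print(n, max_norm_dist(n))   # the distance is 1/n, attained at t = 0
```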
Notice that $C[a,b]$ is not complete in the 1-norm (Example 4.1.7); completeness is thus norm dependent. On finite dimensional spaces it does not depend on the norm:
Theorem 4.3.3 (Finite dimensional space). Every finite
dimensional normed vector space is a Banach space.
Proof (idea only). Suppose $S := \{v_1, \ldots, v_m\}$ is a basis of the space. If $\{f_n\}$ is a Cauchy sequence then it may be shown that the sequence of its coordinate vectors $f_{n,S}$ is a Cauchy sequence in $\mathbb{R}^m$ in, say, the Euclidean norm. This implies that each fixed entry of these vectors is a Cauchy sequence. Since these entries are real numbers, they have a limit. The vectors $f_{n,S}$ hence converge entry-wise to some $f_S \in \mathbb{R}^m$ as $n \to \infty$. The corresponding $f := [\,v_1\ \cdots\ v_m\,]\, f_S$ is well defined, and one can show that $\lim_{n\to\infty} \|f_n - f\| = 0$.
4.3.1 Sequence spaces $\ell_1$, $\ell_2$, $\ell_\infty$

On the infinite sequence space $(\mathbb{N};\mathbb{R})$ the 1-norm, 2-norm and $\infty$-norm that we defined on $\mathbb{R}^n$ become the infinite sums and suprema
\[
\|v\|_1 := |v_1| + |v_2| + |v_3| + |v_4| + \cdots
\]
\[
\|v\|_2 := \sqrt{|v_1|^2 + |v_2|^2 + |v_3|^2 + |v_4|^2 + \cdots}
\]
\[
\|v\|_\infty := \sup(|v_1|, |v_2|, |v_3|, |v_4|, \ldots).
\]
These, however, are not norms on $(\mathbb{N};\mathbb{R})$, because they are not finite for some sequences. For instance all three norms are infinite for the growing sequence $v = (1, 2, 3, 4, 5, \ldots)$. The way out of this problem is as simple as it is elegant: merely restricting the sequence space to those elements that have finite norm will do the job, and the result is a Banach space (we skip the proof):

Theorem 4.3.4 (Complete sequence spaces). The three sequence spaces
\[
\ell_1 := \{\, v \in (\mathbb{N};\mathbb{R}) \mid \|v\|_1 < \infty \,\}
\]
\[
\ell_2 := \{\, v \in (\mathbb{N};\mathbb{R}) \mid \|v\|_2 < \infty \,\}
\]
\[
\ell_\infty := \{\, v \in (\mathbb{N};\mathbb{R}) \mid \|v\|_\infty < \infty \,\}
\]
are all complete in their respective norms.
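As a small sanity check (example and helper names ours) one can evaluate truncations of the three norms for the geometric sequence $v_k = 2^{-k}$, $k \ge 1$, which lies in all three spaces with $\|v\|_1 = 1$, $\|v\|_2 = 1/\sqrt{3}$ and $\|v\|_\infty = 1/2$.

```python
# Truncated l1-, l2- and sup-norms of v_k = 2^{-k} (our sketch).
import math

def norms(terms):
    """Return the 1-, 2- and sup-norm of a finite list of entries."""
    n1 = sum(abs(x) for x in terms)
    n2 = math.sqrt(sum(x * x for x in terms))
    ninf = max(abs(x) for x in terms)
    return n1, n2, ninf

v = [2.0 ** -k for k in range(1, 60)]   # enough terms for full float accuracy
n1, n2, ninf = norms(v)
print(n1, n2, ninf)   # approximately 1, 1/sqrt(3) = 0.577..., and exactly 0.5
```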
Example 4.3.5 (Cauchy or not Cauchy). Consider the infinite sequences
\[
v_n = (1, \tfrac{1}{2}, \tfrac{1}{3}, \ldots, \tfrac{1}{n}, 0, 0, \ldots)
\]
depending on $n \in \mathbb{N}$. For every $n$ the $v_n$ has only finitely many nonzero entries, so it has finite 1-, 2- and $\infty$-norm and thus is in all three vector spaces $\ell_1$, $\ell_2$ and $\ell_\infty$. The sequence $v_n$ converges pointwise to
\[
v = (1, \tfrac{1}{2}, \tfrac{1}{3}, \tfrac{1}{4}, \tfrac{1}{5}, \tfrac{1}{6}, \ldots)
\]
as $n \to \infty$. This $v$ is not in $\ell_1$ because
\[
\|v\|_1 = 1 + \tfrac{1}{2} + \tfrac{1}{3} + \cdots = \infty,
\]
but it is in $\ell_2$ and $\ell_\infty$, with respective norms
\[
\|v\|_2 = \sqrt{1 + \tfrac{1}{2^2} + \tfrac{1}{3^2} + \cdots} < \infty, \qquad
\|v\|_\infty = \sup(1, \tfrac{1}{2}, \tfrac{1}{3}, \ldots) = 1 < \infty.
\]
This is consistent with the observations that:

– $\{v_n\}_{n\in\mathbb{N}}$ is not Cauchy in the 1-norm, because no matter how large $N$ is, the quantity
\[
\|v_n - v_m\|_1 = \tfrac{1}{n+1} + \tfrac{1}{n+2} + \cdots + \tfrac{1}{m}
\]
can be made arbitrarily large by appropriate choice of $m \ge n > N$.

– $\{v_n\}_{n\in\mathbb{N}}$ is Cauchy in the 2-norm, because for all $n, m > N$ we have $\|v_n - v_m\|_2^2 < 1/N \to 0$ as $N \to \infty$ (see Example 4.2.2). Since $\ell_2$ is a Banach space, the $v_n$ hence converge in $\ell_2$. Indeed, the limit is the $v$ above.

– $\{v_n\}_{n\in\mathbb{N}}$ is Cauchy in the $\infty$-norm, because for all $n, m > N$ we have $\|v_n - v_m\|_\infty < 1/N \to 0$ as $N \to \infty$.
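The three observations of Example 4.3.5 can be reproduced numerically (sketch and names ours): the 1-norm distance between truncations keeps growing, while the 2-norm and sup-norm distances shrink.

```python
# Distances between the truncations v_n and v_m of (1, 1/2, 1/3, ...).
import math

def diff_norms(n, m):
    """Norms of v_n - v_m for m > n; its nonzero entries are 1/k, n < k <= m."""
    tail = [1.0 / k for k in range(n + 1, m + 1)]
    return sum(tail), math.sqrt(sum(x * x for x in tail)), max(tail)

for n, m in [(10, 10**4), (100, 10**5)]:
    n1, n2, ninf = diff_norms(n, m)
    print(n, m, round(n1, 3), round(n2, 5), ninf)  # n1 grows like log(m/n)
```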
4.3.2 Lebesgue spaces $L_1$ and $L_2$

The function space equivalent of $\ell_1$ we naively define as
\[
L_1[a,b] := \{\, f : [a,b] \to \mathbb{R} \mid \|f\|_1 < \infty \,\}
\]
where the 1-norm is now defined as
\[
\|f\|_1 = \int_a^b |f(t)|\,dt.
\]
We allow $a = -\infty$ and $b = +\infty$. This definition of $L_1[a,b]$ is not precise, because it still depends on the definition of the integral $\int_a^b |f(t)|\,dt$. The Riemann integral definition is not ideal, because one can construct a Cauchy sequence of Riemann integrable functions whose limiting function is so crazy that its Riemann integral is no longer well defined. Hence the space $L_1[a,b]$ would then fail to be complete. The desire for a complete function space was so strong that it prompted mathematicians to look for alternative definitions of integration! In the beginning of the 20th century the issue was settled by Henri Lebesgue. He devised the Lebesgue measure and Lebesgue integration, with respect to which the space $L_1[a,b]$ is complete. The interested reader should follow a course on measure theory. The symbol $L$ is standard in the math literature and it is in honor of its inventor, Lebesgue. The difference between Riemann and Lebesgue integration only shows up in really weird functions, and in this course we need not worry about such functions. We simply accept that:

Theorem 4.3.6 (Complete / Banach). $L_1[a,b]$ is complete in the 1-norm.
Built into the definition of $L_1$ is that its elements have a well defined 1-norm. This space contains all continuous functions but also many more, and they need not be bounded.

Example 4.3.7 (Several $L_1$ functions). All functions of Fig. 4.4 are elements of $L_1[0,1]$, except the last function $f_9(t) = 0.1/t$. Indeed $\int_0^1 f_9(t)\,dt = 0.1 \log(t)\big|_0^1 = \infty$.

We should first fix a possibly unsettling problem: part of the definition of norm is that
\[
\|f\| > 0 \quad \text{for all } f \ne 0,
\]
[Figure 4.4 showed nine functions on $[0,1]$; the recoverable panel labels are $f(t) = 1$, $f(t) = t$, $1/\sqrt{t}$, $1/\sqrt{|t - 1/2|}$, $\log(t)$, $\sin(1/t)$ and $0.1/t$.]

Figure 4.4: The first 8 functions are in $L_1[0,1]$, the 9th is not.
but here that is not the case! The 8th function of Fig. 4.4, for instance,
\[
f(t) = \begin{cases} 1 & t = 1/2 \\ 0 & \text{elsewhere,} \end{cases}
\]
is not the zero function, yet its 1-norm is zero. The simplistic way out of this problem is to identify every function $f$ with zero norm with the zero function. That is not far-fetched, because if $\|f\|_1 = 0$ then
\[
\|f\|_1 = \int_a^b |f(t)|\,dt = 0,
\]
implying that $f(t)$ is zero¹ almost everywhere. From now on we do not distinguish between functions $f$ and $g$ when their difference has norm zero, so from now on by definition
\[
f = g \quad\iff\quad \|f - g\|_1 = 0.
\]
The counterpart of $\ell_2$ is the space of square integrable functions:
Lemma 4.3.8 (Lebesgue space $L_2$). The space of square integrable functions
\[
L_2[a,b] := \{\, f : [a,b] \to \mathbb{R} \mid \|f\|_2 < \infty \,\}
\]
is complete in the 2-norm, defined as
\[
\|f\|_2 := \sqrt{\int_a^b |f(t)|^2\,dt}.
\]
Here $a = -\infty$ and $b = +\infty$ are allowed. The top three functions of Fig. 4.4 are in $L_2[0,1]$. The fourth and fifth functions of that figure are not in $L_2[0,1]$.

¹In a course on measure theory this identification will be formalized through equivalence classes, and then the notion of almost everywhere will be properly defined.
Example 4.3.9 (Complete in $L_2$, not complete in $C$). Consider the standard 2-norm of functions. All functions $f_n : [0,1] \to \mathbb{R}$ defined as
\[
f_n(t) = \begin{cases} n^{6/5}\, t & 0 \le t \le 1/n \\ 1/t^{1/5} & 1/n < t \le 1 \end{cases}
\]
are continuous (a linear ramp up to the value $n^{1/5}$ at $t = 1/n$, followed by the arc of $1/t^{1/5}$). All $f_n$ are therefore in $C[0,1]$ as well as in $L_2[0,1]$. The pointwise limit
\[
f(t) = \begin{cases} 0 & t = 0 \\ 1/t^{1/5} & 0 < t \le 1 \end{cases}
\]
is not in $C[0,1]$ because it is not continuous, and in fact it is not even bounded. It is in $L_2[0,1]$, however, because
\[
\|f\|_2^2 = \int_0^1 f^2(t)\,dt = \int_0^1 t^{-2/5}\,dt = \tfrac{5}{3}\, t^{3/5}\Big|_0^1 = \tfrac{5}{3}
\]
is finite. One can show that $\{f_n\}$ is a Cauchy sequence in the 2-norm. Since $C[a,b]$ is not complete in this norm, its limit is not guaranteed to exist in the space $C[a,b]$, and indeed it does not exist. The space $L_2[a,b]$ however is complete in this norm, and hence $\lim_{n\to\infty} f_n$ exists in $L_2[a,b]$. Indeed, the limit is the $f$ above.
4.4 Bounded linear operator

Having a norm for vectors allows us to come up with bounds for mappings on vectors.

Definition 4.4.1 (Bounded operator). Let $X$ and $Y$ be normed vector spaces. A linear operator $F : X \to Y$ is bounded if a $c \ge 0$ exists such that
\[
\|F(x)\|_Y \le c\, \|x\|_X \qquad \forall x \in X. \tag{4.2}
\]

The smallest possible $c$ in (4.2) gives an indication of how big the operator is. If $c$, for instance, is $< 1$ then we know that the norm of the image $F(x)$ is less than that of $x$, irrespective of the choice of $x$. Likewise, if (4.2) holds for $c = 2$ then the norm of $F(x)$ will never be more than twice the norm of $x$. Et cetera. The smallest possible $c$ is what is called the operator norm².

Definition 4.4.2 (Operator norm). Let $X, Y$ be normed vector spaces and $F : X \to Y$ a bounded operator. The operator norm $\|F\|$ of $F$ is defined as³
\[
\|F\| = \sup_{x \ne 0} \frac{\|F(x)\|_Y}{\|x\|_X}.
\]
If $X = \{0\}$ then we define $\|F\| = 0$.

By definition of the operator norm we have, for every nontrivial vector space $X$ and every $x \in X$, that
\[
\|F(x)\|_Y \le c\, \|x\|_X \tag{4.3}
\]
if $c = \|F\|$, while for every $c$ less than $\|F\|$ there are $x$ that violate (4.3).
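In finite dimensions the supremum in Definition 4.4.2 can be estimated by brute force. The sketch below (matrix and names chosen by us) samples unit vectors in $\mathbb{R}^2$ for $F(x) = Ax$ with Euclidean norms; for the diagonal matrix used, the operator norm is exactly $3$.

```python
# Brute-force estimate of the operator norm of F(x) = A x on R^2 (our sketch).
import math

A = [[3.0, 0.0],
     [0.0, 1.0]]   # diagonal, so the operator norm equals max(|3|, |1|) = 3

def ratio(theta):
    """||A x|| / ||x|| for the unit vector x = (cos theta, sin theta)."""
    x, y = math.cos(theta), math.sin(theta)
    u = A[0][0] * x + A[0][1] * y
    v = A[1][0] * x + A[1][1] * y
    return math.hypot(u, v)

est = max(ratio(2.0 * math.pi * k / 10_000) for k in range(10_000))
print(est)   # the maximum ratio, attained at x = (1, 0), equals 3
```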
Example 4.4.3 (Bounded operator). We determine the operator norm of $A : C[a,b] \to \mathbb{R}$ defined as
\[
A(f) = \int_a^b f(t)\,dt.
\]
On $C[a,b]$ we take the max-norm; on $\mathbb{R}$ we take the absolute value. Then
\[
|A(f)| = \Big|\int_a^b f(t)\,dt\Big| \le \int_a^b |f(t)|\,dt \le \int_a^b \|f\|_\infty\,dt = (b-a)\|f\|_\infty.
\]
The operator $A$ thus is bounded, and its operator norm is at most $b - a$. For the constant function $f(t) = 1$ the above is an equality,
\[
|A(1)| = \Big|\int_a^b 1\,dt\Big| = (b-a) = (b-a)\|1\|_\infty.
\]
The operator norm hence equals $b - a$.

Example 4.4.4 (Unbounded operator). Consider $C[0,1]$
with the 1-norm. On this space the point-evaluation operator $\delta : C[0,1] \to \mathbb{R}$ defined as
\[
\delta(f) = f(0)
\]
is unbounded. To see this, take for instance the sequence of functions
\[
f_n(t) = \begin{cases} n(1 - nt) & 0 \le t \le \tfrac{1}{n} \\ 0 & \text{elsewhere,} \end{cases}
\]
a triangle of height $n$ on the base $[0, \tfrac{1}{n}]$. The 1-norm of each $f_n$ is $1/2$, while $|\delta(f_n)| = n$. The ratio $|\delta(f_n)|/\|f_n\|_1 = 2n$ is unbounded. This shows that $\delta$ is an unbounded operator.

²The attentive reader will wonder why we call it the operator norm. Doesn't this require that some set of operators $F$ is a vector space, and that on this vector space the operator norm has the properties of a norm? The answers are yes and yes, but we will not deal with such matters in this course, even though we are very close to settling it.
³Supremum means least upper bound.
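A numeric companion (ours, not part of the text) to the unboundedness argument: for the triangle functions the 1-norm stays at $1/2$ while the point evaluation $f_n(0) = n$ grows, so the ratio is about $2n$.

```python
# The point-evaluation operator f -> f(0) is unbounded on (C[0,1], ||.||_1).

def f(n, t):
    """Triangle of height n on [0, 1/n], zero elsewhere."""
    return n * (1.0 - n * t) if 0.0 <= t <= 1.0 / n else 0.0

def one_norm(n, steps=10**5):
    """Midpoint Riemann-sum approximation of ||f_n||_1 on [0, 1]."""
    h = 1.0 / steps
    return sum(abs(f(n, (k + 0.5) * h)) * h for k in range(steps))

for n in (10, 100, 1000):
    print(n, one_norm(n), f(n, 0.0) / one_norm(n))   # the ratio is about 2n
```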
4.4.1 Continuity of maps

We say that a mapping $A$ on a normed vector space is continuous at $y$ if for every $\epsilon > 0$ there is a $\delta > 0$ such that
\[
\|x - y\| < \delta \implies \|A(x) - A(y)\| < \epsilon.
\]
If the mapping is continuous at $y$ for every $y$ in the domain, then $A$ is said to be continuous. For linear mappings, boundedness and continuity are equivalent:

Theorem 4.4.5 (Bounded = continuous for linear maps). For a linear operator $A$ the following three statements are equivalent:

1. $A$ is continuous;
2. $A$ is continuous at 0;
3. $A$ is bounded.

Proof. (1. $\Rightarrow$ 2.) is trivial. Now (2. $\Rightarrow$ 3.): If $A$ is