Maths UROPS Sem 1, 2010 (incomplete preliminary version)
Undergraduate Research Opportunity
Programme in Science
NORMS IN VECTOR SPACE
Agus Leonardi Soenjaya
Supervisor: Assoc. Prof. Victor Tan
Department of Mathematics
National University of Singapore
2010
Contents
List of Symbols ii
Abstract iii
1 Vector Norms 1
1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Basic Properties of Vector Norms . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.3 Norms and Inner Products . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.4 Analytic Properties of Vector Norms . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.5 Geometric Properties of Vector Norms . . . . . . . . . . . . . . . . . . . . . . . . 13
1.6 Duality of Vector Norms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2 Matrix Norms 22
2.1 Basic Properties of Matrix Norms . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.2 Induced Matrix Norms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
2.3 Generalised Matrix Norms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
3 Applications of Norms 46
3.1 Sequences and Series of Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
3.2 Bounds for Roots of Algebraic Equations . . . . . . . . . . . . . . . . . . . . . . 51
3.3 Perturbation of Eigenvalues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
Bibliography 62
List of Symbols
The list of common symbols used throughout this paper is included below for reference.
R the real numbers
C the complex numbers
F a field, which is either R or C
Rn space of n-tuples of real numbers
Cn space of n-tuples of complex numbers
Fn space of n-tuples of either real numbers or complex numbers
Mn space of n × n matrices over the field F
Re (z) real part of complex number z
Im (z) imaginary part of complex number z
〈u,v〉 inner product of u and v
‖v‖ norm of a vector v
‖v‖D dual norm of a vector v
v∗ conjugate transpose of a vector v
B‖·‖ unit ball of vector norm ‖ · ‖
|||A ||| norm of a matrix A
G(A) generalised matrix norm of a matrix A
A∗ conjugate transpose of a matrix A
A−1 inverse of a matrix A
A = [aij ] matrix A with (i, j)-th entry equals aij
diag(d1, . . . , dn) diagonal matrix in Mn with diagonal entries d1, . . . , dn
ρ(A) spectral radius of a matrix A
det(A) determinant of a matrix A
κ(A) condition number of matrix A with respect to a given matrix norm
Abstract
This paper discusses the notion of norms and their properties, mainly in finite-dimensional vector
spaces over the real or complex field, as an abstraction of the ‘size’ of vectors and matrices. Some
applications of norms will also be studied.
We begin by discussing the basic properties of vector norms, as well as their analytic and geo-
metric properties, in Chapter 1. Furthermore, we study some important classes of vector norms,
including vector norms derived from inner products and the duals of vector norms. In Chapter
2, we will generalise the notion of norms to matrices in Mn. Some important classes of matrix
norms, as well as a further generalisation, namely the generalised matrix norm in Mn, will also
be studied.
Some applications of norms will be discussed in Chapter 3. We will generalise the notion of a
series to matrices in Mn, using norms to derive various properties analogous to those of series of
real-valued functions. We will also derive some bounds for the roots of complex polynomials
using the matrix norms discussed earlier. Lastly, some basic theory on the perturbation of eigenvalues,
which is important in numerical linear algebra, will be discussed.
Chapter 1
Vector Norms
1.1 Introduction
In Mathematics, it is often necessary to measure the ‘size’ of a vector in Cn or a matrix in Mn.
One notion of size which arises naturally in R2 or R3 is the Euclidean length. For instance, given
a vector v = (v1, v2) ∈ R2, one can define the Euclidean length of this vector to be (v1^2 + v2^2)^(1/2).
How could one generalise the notion of ‘size’ further to complex-valued vectors in n-dimensional
complex vector space for instance? What about the ‘size’ of a matrix? One way to answer this
question is to study the notion of norms for vectors and matrices.
1.2 Basic Properties of Vector Norms
Definition 1.2.1 (Vector Norm Axioms). Let V be a vector space over the field F (R or C). A
function ‖ · ‖ : V → R is a vector norm if for all x,y ∈ V , the following axioms are satisfied:
1. (Non-negative) ‖x‖ ≥ 0.
2. (Positive) ‖x‖ = 0 if and only if x = 0.
3. (Homogeneous) ‖cx‖ = |c|‖x‖ for all scalars c ∈ F.
4. (Triangle Inequality) ‖x + y‖ ≤ ‖x‖+ ‖y‖.
Proposition 1.2.2 (Generalised Triangle Inequality). If ‖ · ‖ is a vector norm on V , then
| ‖x‖ − ‖y‖ | ≤ ‖x + y‖ ≤ ‖x‖+ ‖y‖
for all x,y ∈ V .
Proof. The inequality ‖x + y‖ ≤ ‖x‖ + ‖y‖ follows from the norm axioms. It remains to prove
the other inequality.
Since y = −x + (x + y), we have
‖y‖ ≤ ‖−x‖ + ‖x + y‖ = ‖x‖ + ‖x + y‖
by the triangle inequality and the homogeneity axiom. From this, it follows that
‖y‖ − ‖x‖ ≤ ‖x + y‖.
Similarly, by writing x = −y + (x + y), we have
‖x‖ − ‖y‖ ≤ ‖x + y‖.
Hence, we have proven ±(‖x‖ − ‖y‖) ≤ ‖x + y‖, i.e. | ‖x‖ − ‖y‖ | ≤ ‖x + y‖, as required.
Below are some examples of common vector norms in finite-dimensional vector space. In
these examples, we let x = (x1, x2, . . . , xn) ∈ Fn.
Example 1.2.3 (l1 norm on Fn).
‖x‖1 ≡ |x1| + |x2| + . . . + |xn|
The l1 norm is sometimes called the sum norm or the Manhattan norm.
Example 1.2.4 (l2 norm on Fn).
‖x‖2 ≡ (|x1|^2 + |x2|^2 + . . . + |xn|^2)^(1/2)
The l2 norm is also called the Euclidean norm.
Example 1.2.5 (l∞ norm on Fn).
‖x‖∞ ≡ max{|x1|, |x2|, . . . , |xn|}
The l∞ norm is also called the max norm.
Example 1.2.6 (lp norm on Fn).
‖x‖p ≡ (∑_{i=1}^{n} |xi|^p)^(1/p)
for p ≥ 1.
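The norms in the examples above are straightforward to compute. Below is a minimal Python sketch (the helper name lp_norm is mine, not from the text):

```python
from math import inf

def lp_norm(x, p):
    """lp norm of a vector x (a list of numbers) for p >= 1; p = inf gives the max norm."""
    if p == inf:
        return max(abs(xi) for xi in x)
    return sum(abs(xi) ** p for xi in x) ** (1 / p)

x = [3, -4]
print(lp_norm(x, 1))    # l1 (sum) norm: 7.0
print(lp_norm(x, 2))    # l2 (Euclidean) norm: 5.0
print(lp_norm(x, inf))  # l-infinity (max) norm: 4
```

Since abs also works on complex numbers, the same helper applies to vectors in Cn.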
To prove that lp norm is in general a vector norm (in particular, to prove the triangle
inequality axiom), we need to introduce some inequalities. More details can be seen in the
reference given.
Theorem 1.2.7 (Hölder’s Inequality). Let 1 < p, q < ∞ and 1/p + 1/q = 1. Suppose
z1, z2, . . . , zn, w1, w2, . . . , wn ∈ C. Then
∑_{k=1}^{n} |zk wk| ≤ (∑_{k=1}^{n} |zk|^p)^(1/p) (∑_{k=1}^{n} |wk|^q)^(1/q)
with equality if and only if all zk’s are 0, or there exists a constant M ≥ 0 such that |wk|^q = M|zk|^p
for all k.
Proof. See [6] for proofs and more details.
Theorem 1.2.8 (Minkowski’s Inequality). Let 1 ≤ p < ∞. Suppose a1, a2, . . . , an, b1, b2, . . . , bn ∈ C.
Then
(∑_{k=1}^{n} |ak + bk|^p)^(1/p) ≤ (∑_{k=1}^{n} |ak|^p)^(1/p) + (∑_{k=1}^{n} |bk|^p)^(1/p)
Proof. See [6] for proofs and more details.
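Both inequalities are easy to spot-check numerically. The sketch below (mine; the conjugate exponents p = 3, q = 3/2 are an arbitrary choice) verifies them on random real vectors:

```python
import random

# Numerically spot-check Hölder's and Minkowski's inequalities on random real vectors.
random.seed(0)
p, q = 3.0, 1.5          # conjugate exponents: 1/p + 1/q = 1/3 + 2/3 = 1
z = [random.uniform(-1, 1) for _ in range(10)]
w = [random.uniform(-1, 1) for _ in range(10)]

lhs_holder = sum(abs(zk * wk) for zk, wk in zip(z, w))
rhs_holder = (sum(abs(zk) ** p for zk in z) ** (1 / p)
              * sum(abs(wk) ** q for wk in w) ** (1 / q))
assert lhs_holder <= rhs_holder   # Hölder's inequality

lhs_mink = sum(abs(zk + wk) ** p for zk, wk in zip(z, w)) ** (1 / p)
rhs_mink = (sum(abs(zk) ** p for zk in z) ** (1 / p)
            + sum(abs(wk) ** p for wk in w) ** (1 / p))
assert lhs_mink <= rhs_mink       # Minkowski's inequality
```

Such a check is of course no substitute for the proofs in [6]; it only guards against misremembering the direction of an inequality.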
Now, we are able to justify rigorously that lp-norm is indeed a vector norm. Moreover, we
will show that l∞-norm is also a vector norm.
Proposition 1.2.9. The lp-norm is a vector norm on Fn for 1 ≤ p < ∞.
Proof. We need to show that all the norm axioms are satisfied. Let x = (x1, . . . , xn) and c ∈ F.
1. (Non-negative) Clearly ‖x‖p = (∑_{k=1}^{n} |xk|^p)^(1/p) ≥ 0.
2. (Positive) The inequality above holds with equality if and only if |xi|^p = 0 for i = 1, 2, . . . , n,
which implies all xi’s are 0, i.e. x = 0.
3. (Homogeneous) ‖cx‖p = (∑_{k=1}^{n} |cxk|^p)^(1/p) = (|c|^p ∑_{k=1}^{n} |xk|^p)^(1/p) = |c|‖x‖p.
4. (Triangle Inequality) This follows immediately from Minkowski’s Inequality.
Hence, the lp-norm is a vector norm on Fn for 1 ≤ p < ∞.
Proposition 1.2.10. The l∞-norm is a vector norm on Fn.
Proof. We need to show all the norm axioms are satisfied. Let x = (x1, . . . , xn) and y =
(y1, . . . , yn) be vectors in Fn and c ∈ F.
1. (Non-negative) Clearly ‖x‖∞ = max_{1≤k≤n} |xk| ≥ 0.
2. (Positive) As |xi| ≥ 0 for each i = 1, 2, . . . , n, the inequality above holds
with equality if and only if all xi’s are 0, i.e. x = 0.
3. (Homogeneous) ‖cx‖∞ = max_{1≤k≤n} |cxk| = |c| max_{1≤k≤n} |xk| = |c|‖x‖∞.
4. (Triangle Inequality) Using the triangle inequality for real numbers, we have
‖x + y‖∞ = max_{1≤k≤n} |xk + yk|
≤ max_{1≤k≤n} (|xk| + |yk|)
≤ max_{1≤k≤n} |xk| + max_{1≤k≤n} |yk|
= ‖x‖∞ + ‖y‖∞
Hence, we have proven the l∞-norm is a vector norm on Fn.
Furthermore, the theorem below shows how lp-norm is connected with the l∞-norm.
Theorem 1.2.11. ‖x‖∞ = lim_{p→∞} ‖x‖p for all x ∈ Fn.
Proof. Let x = (x1, . . . , xn). We have
∑_{i=1}^{n} |xi|^p ≥ |xk|^p, i.e. (∑_{i=1}^{n} |xi|^p)^(1/p) ≥ |xk|
for each k = 1, 2, . . . , n. Then we have
(∑_{i=1}^{n} |xi|^p)^(1/p) ≥ max_{1≤k≤n} |xk|
i.e.
‖x‖p ≥ ‖x‖∞ (1.1)
Moreover, for p ≥ 1, we have
(‖x‖p)^p = ∑_{i=1}^{n} |xi|^p ≤ n (max_{1≤k≤n} |xk|)^p = n (‖x‖∞)^p (1.2)
Hence, combining inequalities (1.1) and (1.2), we have ‖x‖∞ ≤ ‖x‖p ≤ n^(1/p) ‖x‖∞.
Taking the limit as p → ∞, and using the fact that lim_{p→∞} n^(1/p) = 1, we have the conclusion.
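The convergence ‖x‖p → ‖x‖∞, and the sandwich bound used in the proof, can be observed numerically; a short sketch of my own:

```python
def lp_norm(x, p):
    return sum(abs(xi) ** p for xi in x) ** (1 / p)

x = [1.0, -3.0, 2.0]
max_norm = max(abs(xi) for xi in x)  # ||x||_inf = 3.0

for p in (1, 2, 10, 100):
    # sandwich bound from the proof: ||x||_inf <= ||x||_p <= n^(1/p) * ||x||_inf
    assert max_norm - 1e-9 <= lp_norm(x, p) <= len(x) ** (1 / p) * max_norm + 1e-9

# n^(1/p) -> 1 as p -> infinity, so ||x||_p is squeezed towards ||x||_inf.
print(lp_norm(x, 100))  # already very close to 3.0
```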
Now we want to construct even bigger classes of vector norms. It can be shown that any
positive linear combination of vector norms forms a new vector norm. The maximum of several
vector norms also forms a new vector norm. We can also look at several other ways to construct
new vector norms. The following propositions establish these results more precisely.
Proposition 1.2.12. Let V be a finite-dimensional vector space over the field F (R or C). Let
‖ · ‖α and ‖ · ‖β be two given vector norms and k1, k2 ∈ R+. Then ‖ · ‖γ ≡ k1‖ · ‖α + k2‖ · ‖β is
also a vector norm.
Proof. Clearly, the non-negativity and positivity axioms are satisfied. We check that the re-
maining axioms are also satisfied.
Let x = (x1, . . . , xn) and y = (y1, . . . , yn) be vectors in Fn and c ∈ F.
1. (Homogeneous) ‖cx‖γ = k1‖cx‖α + k2‖cx‖β = |c|(k1‖x‖α + k2‖x‖β) = |c|‖x‖γ by the
homogeneity of each given vector norm.
2. (Triangle Inequality) We have
‖x + y‖γ = k1‖x + y‖α + k2‖x + y‖β
≤ k1(‖x‖α + ‖y‖α) + k2(‖x‖β + ‖y‖β)
= (k1‖x‖α + k2‖x‖β) + (k1‖y‖α + k2‖y‖β)
= ‖x‖γ + ‖y‖γ
Therefore, ‖ · ‖γ is a vector norm.
Proposition 1.2.13. Let V be a finite-dimensional vector space over the field F (R or C). Let
‖ ·‖α and ‖ ·‖β be two given vector norms. Then ‖ ·‖γ ≡ max{‖ ·‖α, ‖ ·‖β} is also a vector norm.
Proof. Clearly, the non-negativity and positivity axioms are satisfied. We check that the re-
maining norm axioms are satisfied.
Let x = (x1, . . . , xn) and y = (y1, . . . , yn) be vectors in Fn and c ∈ F.
1. (Homogeneous) ‖cx‖γ = max{‖cx‖α, ‖cx‖β} = max{|c|‖x‖α, |c|‖x‖β} = |c|‖x‖γ by the
homogeneity of each given vector norm.
2. (Triangle Inequality) We have
‖x + y‖γ = max{‖x + y‖α, ‖x + y‖β}
≤ max{‖x‖α + ‖y‖α, ‖x‖β + ‖y‖β}
≤ max{‖x‖α, ‖x‖β} + max{‖y‖α, ‖y‖β}
= ‖x‖γ + ‖y‖γ
where we used triangle inequality for each vector norm.
Hence, the result is proven.
Proposition 1.2.14. Let ‖ · ‖ be a vector norm on a finite-dimensional vector space over the
field F (R or C). If T ∈Mn is non-singular, then ‖ · ‖T defined by ‖x‖T ≡ ‖Tx‖, where x ∈ Fn
is also a vector norm on Fn.
Proof. Clearly, the non-negativity and positivity axioms are satisfied. We check that the re-
maining norm axioms are satisfied.
Let x = (x1, . . . , xn) and y = (y1, . . . , yn) be vectors in Fn and c ∈ F.
1. (Homogeneous) ‖cx‖T = ‖T (cx)‖ = |c|‖Tx‖ = |c|‖x‖T by the homogeneity of the given
vector norm.
2. (Triangle Inequality) We have
‖x + y‖T = ‖T (x + y)‖ ≤ ‖Tx‖+ ‖Ty‖ = ‖x‖T + ‖y‖T
hence proving the result.
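As a concrete instance of Proposition 1.2.14, taking the l2 norm as the base norm and a non-singular diagonal T yields a weighted Euclidean norm. A small sketch (the helper names are mine):

```python
def l2_norm(x):
    return sum(abs(xi) ** 2 for xi in x) ** 0.5

def t_norm(x, T):
    """||x||_T = ||Tx||, here with the l2 norm as the base norm and T a matrix (list of rows)."""
    Tx = [sum(row[j] * x[j] for j in range(len(x))) for row in T]
    return l2_norm(Tx)

# A non-singular diagonal T produces a weighted Euclidean norm.
T = [[2.0, 0.0],
     [0.0, 1.0]]
x = [3.0, 4.0]
print(t_norm(x, T))  # = sqrt((2*3)^2 + 4^2) = sqrt(52) ≈ 7.2111
```

If T were singular, positivity would fail: any nonzero x in the kernel of T would have ‖x‖T = 0, which is why the proposition requires T to be non-singular.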
Another way to construct a new norm from another vector norm is by using the concept of
duality. This will be discussed in a later section.
1.3 Norms and Inner Products
In this section, we will study another class of vector norms which are as important as the vector
norms previously described. These are vector norms derived from the so-called ‘inner product’.
The notion of an inner product comes from the study of the angle between two vectors in Cn. We will
formalise this concept in this section.
Definition 1.3.1 (Inner Product Axioms). Let V be a vector space over the field F (R or C).
A function 〈·, ·〉 : V × V → F is an inner product if for all x,y, z ∈ V , the following axioms are
satisfied:
1. (Non-negative) 〈x,x〉 ≥ 0.
2. (Positive) 〈x,x〉 = 0 if and only if x = 0.
3. (Additive) 〈x + y, z〉 = 〈x, z〉+ 〈y, z〉.
4. (Homogeneous) 〈cx,y〉 = c〈x,y〉 for all scalars c ∈ F.
5. (Hermitian) 〈x,y〉 is the complex conjugate of 〈y,x〉.
Below are some useful properties of the inner product which are consequences of the above axioms.
Proposition 1.3.2. Let V be a vector space over the field F (R or C) equipped with an inner
product 〈·, ·〉. Let x,y, z ∈ V and c ∈ F. Then
1. 〈x, cy〉 = c̄〈x,y〉, where c̄ denotes the complex conjugate of c.
2. 〈x,y + z〉 = 〈x,y〉 + 〈x, z〉.
3. 〈x,y〉 = 0 for all y ∈ V if and only if x = 0.
4. 〈x, 〈x,y〉y〉 = |〈x,y〉|^2.
Proof. 1. By the Hermitian axiom, 〈x, cy〉 is the complex conjugate of 〈cy,x〉 = c〈y,x〉. Taking
conjugates, this equals c̄ times the conjugate of 〈y,x〉, which is c̄〈x,y〉.
2. By the Hermitian axiom, 〈x,y + z〉 is the complex conjugate of 〈y + z,x〉 = 〈y,x〉 + 〈z,x〉,
whose conjugate is 〈x,y〉 + 〈x, z〉.
3. (⇒) Suppose that 〈x,y〉 = 0 for all y ∈ V. In particular, taking y = x gives 〈x,x〉 = 0,
which by the positivity axiom implies x = 0, as required.
(⇐) Suppose that x = 0. Then for any y ∈ V we have
〈0,y〉 = 〈0 + 0,y〉 = 〈0,y〉 + 〈0,y〉
by the additivity axiom, i.e. 〈0,y〉 = 0 as required.
4. Treating 〈x,y〉 as a constant and applying identity 1 with c = 〈x,y〉, we see that
〈x, 〈x,y〉y〉 equals the conjugate of 〈x,y〉 times 〈x,y〉, which is |〈x,y〉|^2.
Below is an important inequality involving inner products, known as the Cauchy-Schwarz Inequality.
Theorem 1.3.3 (Cauchy-Schwarz Inequality). If 〈·, ·〉 is an inner product on a vector space V
over the field F (R or C), then
|〈x,y〉|2 ≤ 〈x,x〉〈y,y〉
for all x,y ∈ V .
Equality occurs if and only if x and y are linearly dependent.
Proof. If y = 0, then the assertion is trivial.
Assume y ≠ 0, and let t ∈ R.
Consider p(t) ≡ 〈x + ty, x + ty〉 = 〈x,x〉 + 2t Re 〈x,y〉 + t^2 〈y,y〉, which is a quadratic
polynomial in t with real coefficients.
By the non-negativity axiom of the inner product, we have p(t) ≥ 0 for all real values of t. The
discriminant of p(t) is therefore non-positive, i.e.
(2 Re 〈x,y〉)2 − 4〈y,y〉〈x,x〉 ≤ 0
and hence
(Re 〈x,y〉)2 ≤ 〈x,x〉〈y,y〉
Since this inequality holds for any pair of vectors, we can replace y by 〈x,y〉y. Then we have
(Re {〈x, 〈x,y〉y〉})2 ≤ 〈x,x〉〈y,y〉|〈x,y〉|2
However, we have Re 〈x, 〈x,y〉y〉 = Re |〈x,y〉|2 = |〈x,y〉|2. Therefore the inequality above is
equivalent to
|〈x,y〉|4 ≤ 〈x,x〉〈y,y〉|〈x,y〉|2
Now, if 〈x,y〉 = 0, then the assertion is trivial as it follows directly from the non-negativity
axiom of inner product. Otherwise, we may divide both sides of the inequality by |〈x,y〉|2 to
obtain the desired result.
By the positivity axiom of the inner product, p(t) can have a real (double) root t0 if and only if
x + t0y = 0 for some t0, i.e. if and only if x and y are linearly dependent.
Now, we are in position to define another class of norm that is derived from some inner
product. This is stated in the theorem below.
Theorem 1.3.4. If 〈·, ·〉 is an inner product on a vector space V over the field F (R or C), then
‖x‖ ≡ √〈x,x〉 is a vector norm on V. In this case, ‖ · ‖ is said to be a vector norm derived from
the inner product 〈·, ·〉.
Proof. Let x,y ∈ V and c ∈ F.
1. (Non-negative) By the inner product axioms, √〈x,x〉 ≥ 0.
2. (Positive) By the inner product axioms, equality above occurs only when x = 0.
3. (Homogeneous) We have ‖cx‖ = √〈cx, cx〉 = √(c c̄ 〈x,x〉) = |c| √〈x,x〉 = |c|‖x‖, where c̄
denotes the complex conjugate of c.
4. (Triangle Inequality) We have
‖x + y‖2 = 〈x + y,x + y〉
= 〈x,x〉+ 〈x,y〉+ 〈y,x〉+ 〈y,y〉
= ‖x‖2 + 2 Re (〈x,y〉) + ‖y‖2
≤ ‖x‖2 + 2√〈x,x〉〈y,y〉+ ‖y‖2
= ‖x‖2 + 2‖x‖‖y‖+ ‖y‖2
= (‖x‖+ ‖y‖)2
where we used the Cauchy-Schwarz Inequality together with Re (〈x,y〉) ≤ |〈x,y〉|.
Therefore, we have ‖x + y‖ ≤ ‖x‖+ ‖y‖ as required.
Hence, we have shown that ‖ · ‖ is a vector norm.
Next, we will formulate a necessary and sufficient condition for a vector norm to be derived
from an inner product.
Theorem 1.3.5 (Parallelogram and Polarisation Identities). Let V be a vector space over the
field F (R or C) equipped with a vector norm ‖ · ‖. Then ‖ · ‖ is derived from an
inner product if and only if the Parallelogram Identity
‖x + y‖^2 + ‖x − y‖^2 = 2(‖x‖^2 + ‖y‖^2)
is satisfied for all x,y ∈ V.
In such a case, the inner product is necessarily given by the Polarisation Identity:
1. (For a real vector space) 〈x,y〉 = (1/4)(‖x + y‖^2 − ‖x − y‖^2).
2. (For a complex vector space) 〈x,y〉 = (1/4)(‖x + y‖^2 − ‖x − y‖^2 + i‖x + iy‖^2 − i‖x − iy‖^2)
for all x,y ∈ V.
Proof. (⇒)
Suppose ‖ · ‖ is derived from an inner product 〈·, ·〉. Expanding the left-hand side of the
Parallelogram Identity, we have
‖x + y‖^2 + ‖x − y‖^2 = 〈x + y, x + y〉 + 〈x − y, x − y〉
= (〈x,x〉 + 〈x,y〉 + 〈y,x〉 + 〈y,y〉) + (〈x,x〉 − 〈x,y〉 − 〈y,x〉 + 〈y,y〉)
= 2(〈x,x〉 + 〈y,y〉)
= 2(‖x‖^2 + ‖y‖^2)
proving the Parallelogram Identity.
(⇐)
Refer to [6] for the proof of sufficiency.
For the Polarisation Identity in a real vector space, expanding the right-hand side, we have
‖x + y‖2 − ‖x− y‖2 = 〈x + y,x + y〉 − 〈x− y,x− y〉
= (〈x,x〉+ 〈x,y〉+ 〈y,x〉+ 〈y,y〉)− (〈x,x〉 − 〈x,y〉 − 〈y,x〉+ 〈y,y〉)
= 2(〈x,y〉 + 〈y,x〉)
= 4〈x,y〉 (since 〈y,x〉 = 〈x,y〉 in a real vector space)
hence proving the Polarisation Identity in a real vector space.
By a similar calculation, for the case of a complex vector space we have
‖x + y‖^2 − ‖x − y‖^2 = 2〈x,y〉 + 2〈y,x〉 (1.3)
and
‖x + iy‖^2 − ‖x − iy‖^2 = 2〈x, iy〉 + 2〈iy,x〉 = −2i〈x,y〉 + 2i〈y,x〉 (1.4)
Multiplying (1.4) by i and adding it to (1.3), we have
‖x + y‖^2 − ‖x − y‖^2 + i‖x + iy‖^2 − i‖x − iy‖^2 = 2〈x,y〉 + 2〈y,x〉 + 2〈x,y〉 − 2〈y,x〉
= 4〈x,y〉
proving the Polarisation Identity in a complex vector space.
By the above theorem, we can now show that some vector norms are not derived from inner
product. The following proposition will apply the above theorem for the case of l∞-norm.
Proposition 1.3.6. l∞-norm is not derived from any inner product.
Proof. In F2, let x = (1, 1) and y = (0, −1).
Then x + y = (1, 0) and x − y = (1, 2), so
‖x + y‖∞^2 + ‖x − y‖∞^2 = 1^2 + 2^2 = 5
but
2(‖x‖∞^2 + ‖y‖∞^2) = 2(1^2 + 1^2) = 4
hence the Parallelogram Identity is not satisfied, proving the assertion.
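The counterexample above, together with the contrast against the l2-norm (which, being derived from the standard inner product, satisfies the Parallelogram Identity), can be checked directly; a short sketch of my own:

```python
def l2(x): return sum(abs(t) ** 2 for t in x) ** 0.5
def linf(x): return max(abs(t) for t in x)

def parallelogram_gap(norm, x, y):
    """||x+y||^2 + ||x-y||^2 - 2(||x||^2 + ||y||^2); zero iff the identity holds for this pair."""
    s = [a + b for a, b in zip(x, y)]
    d = [a - b for a, b in zip(x, y)]
    return norm(s) ** 2 + norm(d) ** 2 - 2 * (norm(x) ** 2 + norm(y) ** 2)

x, y = [1.0, 1.0], [0.0, -1.0]   # the vectors used in Proposition 1.3.6
print(parallelogram_gap(l2, x, y))    # ≈ 0 (up to rounding): identity holds for l2
print(parallelogram_gap(linf, x, y))  # 1.0: identity fails for the max norm
```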
1.4 Analytic Properties of Vector Norms
In the previous sections, we have derived several classes of vector norms. This is necessary
because one norm is more appropriate in some situations than others. For instance, the l2 norm
is most commonly used in optimisation theory because it is continuously differentiable almost
everywhere (see [7]). On the other hand, the l1 norm is more naturally used in statistics as it gives
a robust estimator in some statistical problems (see [8]). It turns out that in finite-dimensional
vector space, all vector norms are ‘equivalent’ in a certain sense, as we will see in this section.
We will begin by examining some useful analytic properties of vector norms.
Definition 1.4.1. Let V be a vector space over the field F (R or C) and let ‖ · ‖ be a norm on
V . The sequence {xk} of vectors in V is said to converge to a vector x ∈ V with respect to the
norm ‖ · ‖ if and only if ‖xk − x‖ → 0 as k →∞.
In such a case, we write
lim_{k→∞} xk = x with respect to ‖ · ‖.
Furthermore, the theorem below guarantees that the limit of a sequence of vectors, if it
exists, is unique.
Theorem 1.4.2. Let ‖ · ‖ be a vector norm on V . If xk → x with respect to ‖ · ‖ and xk → y
with respect to the same vector norm ‖ · ‖, then x = y.
Proof. By the triangle inequality, we have
0 ≤ ‖x − y‖ ≤ ‖xk − x‖ + ‖xk − y‖ (1.5)
The right-hand side of (1.5) converges to 0 as k → ∞ by assumption, while the left-hand side
does not depend on k. This forces ‖x − y‖ = 0, and hence x = y as required.
To compare one vector norm with another, we need the notion of the equivalence of vector
norms, defined below.
Definition 1.4.3 (Equivalence of Vector Norms). Let V be a vector space over the field F (R or
C). Let ‖ · ‖α and ‖ · ‖β be any two vector norms. Then ‖ · ‖α and ‖ · ‖β are said to be equivalent
if and only if there exist finite positive constants Cm and CM such that
Cm‖x‖α ≤ ‖x‖β ≤ CM‖x‖α
for all x ∈ V .
Furthermore, in finite-dimensional vector space, there is an equivalent criterion for the con-
vergence of a sequence of vectors, namely the Cauchy Criterion. We will first define this notion
and then formulate this precisely in the following theorem.
Definition 1.4.4 (Cauchy Sequence). A sequence {xk} in a vector space V is said to be a
Cauchy sequence with respect to the vector norm ‖ · ‖ if for each ε > 0, there exists a positive
integer N = N(ε), such that whenever m,n ≥ N ,
‖xm − xn‖ < ε
Theorem 1.4.5. Let ‖ · ‖ be a given vector norm on a finite-dimensional real or complex vector
space V , and let {xk} be a given sequence of vectors in V . The sequence {xk} converges to a
vector in V if and only if it is a Cauchy sequence with respect to the norm ‖ · ‖.
Proof. By choosing a basis B for V, performing a change of coordinates, and
considering the equivalence of norms in finite-dimensional vector spaces, we see that there is no
loss of generality in assuming V = Cn for some integer n.
(⇐)
Suppose {xk} is a Cauchy sequence; then so is each component sequence {xk^(i)} of complex
numbers, for i = 1, . . . , n. Since a Cauchy sequence of complex numbers must have a limit, for
each i = 1, . . . , n there exists a scalar x^(i) such that lim_{k→∞} xk^(i) = x^(i). It is easily verified that
lim_{k→∞} xk = x, where x = (x^(1), . . . , x^(n)) ∈ V.
(⇒)
Conversely, if there exists x ∈ V such that lim_{k→∞} xk = x, then by the Triangle Inequality,
‖xm − xn‖ ≤ ‖xm − x‖+ ‖xn − x‖
where both terms on the right hand side converge to 0, hence the given sequence is a Cauchy
sequence.
We will now discuss the notion of the equivalence of norms in finite-dimensional vector space.
To do so, we need a result on the continuity property of vector norms.
Lemma 1.4.6. Let ‖ · ‖ be a vector norm on a vector space V over the field F (R or C), and
let x1,x2, . . . ,xm be given vectors. Then the function g : Fm → R defined by
g(z1, z2, . . . , zm) ≡ ‖z1x1 + z2x2 + . . .+ zmxm‖
is a uniformly continuous function.
Proof. Let u = ∑_{i=1}^{m} ui xi and v = ∑_{i=1}^{m} vi xi. Then we have
|g(u1, . . . , um) − g(v1, . . . , vm)| = | ‖u‖ − ‖v‖ |
≤ ‖u − v‖
= ‖∑_{i=1}^{m} (ui − vi) xi‖
≤ ∑_{i=1}^{m} |ui − vi| ‖xi‖
≤ C max_{1≤i≤m} |ui − vi|
where C ≡ m max_{1≤i≤m} ‖xi‖.
Now, if the xi’s are all the zero vector, then there is nothing to show. If not, then C > 0 and,
given ε > 0, in order to have |g(u1, . . . , um) − g(v1, . . . , vm)| < ε it suffices to choose
|ui − vi| < ε/C for each i, proving the result.
One useful corollary, which is almost immediate from the above lemma, is stated below.
Corollary 1.4.7. Every vector norm on Fn is a uniformly continuous function.
Proof. In Lemma 1.4.6, choose the given vectors x1, . . . ,xn to be a basis for Fn. Then every
vector in Fn can be written as a linear combination of the basis vectors. The result then follows.
The following theorem is a slightly more general result that we will need to establish the
equivalence of all vector norms in finite-dimensional vector space.
Theorem 1.4.8. Let f1 and f2 be two real-valued functions on a finite-dimensional vector space
V over the field F (R or C), and let B = {x1, . . . ,xn} be a basis for V . Furthermore, assume
that both f1 and f2 are:
1. Positive: fi(x) ≥ 0 for all x ∈ V and fi(x) = 0 if and only if x = 0;
2. Homogeneous: fi(αx) = |α|fi(x) for all α ∈ F and all x ∈ V ;
3. Continuous: fi(x(z)) is continuous on Fn, where
z = (z1, z2, . . . , zn) ∈ Fn and x(z) ≡ z1x1 + . . .+ znxn
Then there exist finite positive constants Cm and CM such that
Cmf1(x) ≤ f2(x) ≤ CMf1(x)
for all x ∈ V .
Proof. Define h(z) ≡ f2(x(z))/f1(x(z)) on the Euclidean unit sphere S = {z ∈ Fn : ‖z‖2 = 1},
which is closed and bounded in Fn. Note that f1(x(z)) does not vanish on S by assumption (1),
and therefore h(z) is continuous on S by assumption (3). By the Weierstrass theorem (see [7]
for proof and more details), the continuous function h achieves a finite positive maximum CM
and a positive minimum Cm on the closed and bounded set S. Hence, we have
Cm f1(x(z)) ≤ f2(x(z)) ≤ CM f1(x(z)) (1.6)
for all z ∈ S. Now, z/‖z‖2 ∈ S for every non-zero vector z ∈ Fn, so by the homogeneity
assumption (2), the inequality (1.6) holds for all non-zero z ∈ Fn. The case z = 0 holds trivially.
Now, every vector x ∈ V is of the form x = x(z) for some z ∈ Fn because B is a basis,
hence the inequality holds for all x ∈ V.
Since every vector norm is positive, homogeneous, and continuous (by Corollary 1.4.7), it
follows immediately from Theorem 1.4.8 that all vector norms on a finite-dimensional vector
space are equivalent. This is stated in the corollary below.
Corollary 1.4.9. Let V be a finite-dimensional vector space over the field F (R or C). Then
all vector norms on V are equivalent.
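For the pair ‖ · ‖1 and ‖ · ‖∞ on Fn, explicit equivalence constants are easy to write down: ‖x‖∞ ≤ ‖x‖1 ≤ n‖x‖∞. The sketch below (mine) spot-checks these constants on random real vectors:

```python
import random

def check_equivalence(n, trials=1000, seed=1):
    """Check the explicit constants ||x||_inf <= ||x||_1 <= n * ||x||_inf on random vectors in R^n."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = [rng.uniform(-10, 10) for _ in range(n)]
        l1 = sum(abs(t) for t in x)
        linf = max(abs(t) for t in x)
        assert linf <= l1 <= n * linf
    return True

print(check_equivalence(5))  # True
```

Note that the upper constant n grows with the dimension; the equivalence of norms is a finite-dimensional phenomenon, and no dimension-free constants exist here.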
The following theorem gives equivalent statements on the equivalence of two vector norms.
Theorem 1.4.10. Let ‖ · ‖α and ‖ · ‖β be two vector norms on the vector space V over the field
F (R or C). Then the following statements are equivalent:
1. ‖ · ‖α and ‖ · ‖β are equivalent vector norms.
2. There exist finite positive constants Cm and CM such that
Cm‖x‖α ≤ ‖x‖β ≤ CM‖x‖α
for all x ∈ V .
3. lim_{k→∞} xk = x with respect to ‖ · ‖α if and only if lim_{k→∞} xk = x with respect to ‖ · ‖β.
Proof. (1) ⇔ (2) follows immediately from definition.
(2) ⇒ (3)
By assumption, we have
Cm‖xk − x‖α ≤ ‖xk − x‖β ≤ CM‖xk − x‖α
for all k and some finite positive constants Cm and CM .
Then it follows that ‖xk − x‖β → 0 if ‖xk − x‖α → 0 as k →∞.
Similarly, we have
0 ≤ ‖xk − x‖α ≤ C−1m ‖xk − x‖β
from which it follows that ‖xk − x‖α → 0 if ‖xk − x‖β → 0 as k →∞.
(3) ⇒ (2)
Let f(x) ≡ ‖x‖β/‖x‖α on the unit sphere S = {x ∈ V : ‖x‖β = 1} of ‖ · ‖β, i.e. f(x) = 1/‖x‖α on S.
Suppose f is unbounded on S. Then for each positive integer n there exists xn ∈ S such that
f(xn) > n, i.e. 0 < ‖xn‖α < 1/n and ‖xn‖β = 1.
But this implies ‖xn‖α → 0, i.e. xn → 0 with respect to ‖ · ‖α, so by (3) we also have xn → 0
with respect to ‖ · ‖β, which contradicts ‖xn‖β = 1 for all n.
Hence, f must be bounded on S; applying the same argument with the roles of ‖ · ‖α and ‖ · ‖β
exchanged shows that f is also bounded away from 0 on S. That is, there exist positive constants
Cm and CM such that
Cm ≤ ‖x‖β/‖x‖α ≤ CM
for all x ∈ S, which upon using the homogeneity of the norms gives the result for all x ∈ V.
1.5 Geometric Properties of Vector Norms
We will look at some of the geometric properties of vector norms in this section. In particular,
the properties of the unit ball of vector norms will be studied.
Definition 1.5.1. Let ‖ · ‖ be a vector norm on a finite-dimensional vector space V over the
field F (R or C) and let x be a vector in V . Let r > 0 be given. The ball of radius r around x
is defined to be the set
B‖·‖(r, x) ≡ {y ∈ V : ‖y − x‖ ≤ r}
In particular, the unit ball of ‖ · ‖ is the set
B‖·‖ ≡ B‖·‖(1, 0) = {y ∈ V : ‖y‖ ≤ 1}
The following gives an example of the relation between vector norms and the unit ball.
Example 1.5.2. Let ‖ · ‖α and ‖ · ‖β be two vector norms on finite-dimensional vector space
V over the field F (R or C). Define a new vector norm ‖ · ‖ by ‖ · ‖ ≡ max(‖ · ‖α, ‖ · ‖β). Then
B‖·‖ = B‖·‖α ∩B‖·‖β .
Proof. We have x ∈ B‖·‖ if and only if max(‖x‖α, ‖x‖β) ≤ 1, which is equivalent to ‖x‖α ≤ 1
and ‖x‖β ≤ 1, i.e. x ∈ B‖·‖α and x ∈ B‖·‖β .
Hence, x ∈ B‖·‖ if and only if x ∈ B‖·‖α ∩ B‖·‖β , proving the result.
The ordering of vector norms can be described geometrically by the containment of the unit
balls. This is seen in the following proposition.
Proposition 1.5.3. Let ‖ · ‖α and ‖ · ‖β be two vector norms on a finite-dimensional vector space
V over the field F (R or C). Then ‖x‖α ≤ ‖x‖β for all x ∈ V if and only if B‖·‖β ⊆ B‖·‖α .
Proof. (⇒) Suppose ‖x‖α ≤ ‖x‖β for all x ∈ V.
Then for any z ∈ B‖·‖β , i.e. ‖z‖β ≤ 1, we have ‖z‖α ≤ ‖z‖β ≤ 1, i.e. z ∈ B‖·‖α .
Thus we have proven B‖·‖β ⊆ B‖·‖α .
(⇐) Conversely, suppose B‖·‖β ⊆ B‖·‖α .
Then for any z such that ‖z‖β ≤ 1, we have ‖z‖α ≤ 1.
Now for any non-zero z ∈ V, we have ‖ z/‖z‖β ‖β = 1.
This implies ‖ z/‖z‖β ‖α ≤ 1, i.e. ‖z‖α ≤ ‖z‖β, proving the claim.
We will now characterise the unit ball of vector norms by several properties that it possesses.
Proposition 1.5.4. Let ‖ · ‖ be a vector norm on finite-dimensional vector space V over the
field F (R or C). The unit ball of ‖ · ‖, i.e. B‖·‖ has the following properties:
1. B‖·‖ contains 0 as an interior point.
2. B‖·‖ is equilibrated, i.e. if x ∈ B‖·‖, then αx ∈ B‖·‖ for all scalars α such that |α| = 1.
3. B‖·‖ is convex, i.e. for all x,y ∈ B‖·‖ and for all t ∈ [0, 1], tx + (1− t)y ∈ B‖·‖.
Proof. (1) ‖0‖ = 0 < 1, and the open ball {y ∈ V : ‖y‖ < 1} is contained in B‖·‖, hence 0 is an
interior point of B‖·‖.
(2) Let v ∈ B‖·‖ and |α| = 1; then ‖αv‖ = |α|‖v‖ = ‖v‖ ≤ 1, i.e. αv ∈ B‖·‖.
(3) Let x,y ∈ B‖·‖ and t ∈ [0, 1], then
‖tx + (1− t)y‖ ≤ ‖tx‖+ ‖(1− t)y‖
= t‖x‖+ (1− t)‖y‖
≤ t+ (1− t) ≤ 1
i.e. tx + (1− t)y ∈ B‖·‖.
Together with compactness, the above properties are in fact sufficient to characterise the unit
ball of a vector norm in a finite-dimensional vector space.
Theorem 1.5.5. A set B in a finite-dimensional vector space V over the field F (R or C) is the
unit ball of a vector norm on V if and only if B is a compact, convex, and equilibrated set with
0 as an interior point.
Proof. The necessary conditions follow from Proposition 1.5.4, together with the fact that the
unit ball is closed and bounded, hence compact, in the finite-dimensional space V. To see that
the stated properties suffice, consider any nonzero point x ∈ V and the ray {αx : α ≥ 0} from
the origin through x. Define ‖x‖ as the proportional distance along this ray from the origin to x,
taking the segment from the origin to the unique point where the ray meets the boundary of B
as one unit. Defined in this way, the unit ball B completely characterises the vector norm ‖ · ‖.
Formally, define ‖x‖ by
‖x‖ = 0 if x = 0, and ‖x‖ = min{1/t : t > 0, tx ∈ B} if x ≠ 0.
Observe that this function is finite, and the minimum is attained, for each nonzero vector x,
because B is compact and 0 is an interior point of B. It remains to check that ‖ · ‖ is a vector
norm.
1. (Non-negative and Positive) It follows immediately from the definition of ‖ · ‖ that ‖x‖ ≥ 0
for all x ∈ V. Observe that min{1/t : t > 0, tx ∈ B} cannot be zero: this would require tx ∈ B
for arbitrarily large t, contradicting the boundedness of the compact set B. Hence we have
‖x‖ = 0 if and only if x = 0.
2. (Homogeneous) This is trivially true when x = 0. Suppose x ≠ 0 and let α ∈ F. Again,
this is trivially true if α = 0. Now, suppose α ≠ 0. Substituting t = t′/|α|, we have
‖αx‖ = min{1/t : t > 0 and tαx ∈ B}
= min{|α|/t′ : t′ > 0 and t′(α/|α|)x ∈ B}
= min{|α|/t′ : t′ > 0 and t′x ∈ B} (by the equilibration assumption, since α/|α| has modulus 1)
= |α| min{1/t′ : t′ > 0 and t′x ∈ B}
= |α|‖x‖
3. (Triangle Inequality) This is trivially true when x = 0 or y = 0. Let x, y be nonzero
vectors in V; then x/‖x‖ and y/‖y‖ are unit vectors that lie on the boundary of B. By the
convexity assumption, the vector
z = (‖x‖/(‖x‖ + ‖y‖)) (x/‖x‖) + (‖y‖/(‖x‖ + ‖y‖)) (y/‖y‖) = (x + y)/(‖x‖ + ‖y‖)
also lies in B, i.e. ‖z‖ ≤ 1, which is equivalent to ‖x + y‖ ≤ ‖x‖ + ‖y‖.
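The gauge construction in the proof above, which recovers ‖x‖ from membership in B alone, can be illustrated numerically by bisecting on the largest t with tx ∈ B and returning 1/t. Taking B to be the l1 unit ball recovers ‖x‖1 (a sketch of my own; the function names are not from the text):

```python
def in_l1_ball(v):
    return sum(abs(t) for t in v) <= 1.0

def gauge(x, member, lo=1e-9, hi=1e9, iters=100):
    """Norm recovered from a unit ball: ||x|| = min{1/t : t > 0, t*x in B}, found by
    bisecting on the largest t with t*x in B (assumes B compact with 0 as interior point)."""
    if all(t == 0 for t in x):
        return 0.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if member([mid * t for t in x]):
            lo = mid      # mid*x is still inside the ball; we can scale up further
        else:
            hi = mid
    return 1 / lo

x = [3.0, -4.0]
print(gauge(x, in_l1_ball))  # ≈ 7.0, the l1 norm of x
```

Any membership test for a compact, convex, equilibrated set with 0 as an interior point could be substituted for in_l1_ball, and the same bisection would recover the corresponding norm.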
The above theorem shows that the unit ball of a norm is sufficient to characterise the vector
norm, a property which will be used to prove the duality theorem in a later section.
1.6 Duality of Vector Norms
Using the fact that the unit ball of any vector norm on Rn or Cn is compact, we will study
another important method of generating a new class of vector norms, as well as their properties,
through the concept of the ‘dual norm’.
Definition 1.6.1. Let ‖ · ‖ be a vector norm on a finite-dimensional vector space V over the field
F (R or C). Let x,y ∈ V. The function defined by
‖y‖D ≡ max_{‖x‖=1} Re y*x
is the dual norm of ‖ · ‖.
The dual norm is a well-defined function on V. Observe that Re y*x is a continuous function
of x for each fixed y ∈ V. Furthermore, as the unit sphere of ‖ · ‖ is compact, by the Weierstrass
theorem, the maximum of Re y*x is attained at some point x0 on the unit sphere of ‖ · ‖. Below
we give an equivalent definition of the dual norm.
Proposition 1.6.2. Let ‖ · ‖ be a norm on a finite-dimensional vector space V over the field F
(R or C) and let ‖ · ‖D be its dual norm. Let y ∈ V. Then
‖y‖D = max_{‖x‖=1} |y*x|
Proof. For any complex number z, we have |z| = max_{|c|=1} Re(cz). Hence, using the
homogeneity of the norm (so that ‖cx‖ = ‖x‖ whenever |c| = 1), we have
max_{‖x‖=1} |y*x| = max_{‖x‖=1} max_{|c|=1} Re(c y*x)
= max_{‖x‖=1} max_{|c|=1} Re y*(cx)
= max_{‖z‖=1} Re y*z (substituting z = cx, which ranges over the unit sphere)
= ‖y‖D
hence the two definitions are equivalent.
The dual norm is named as such because the dual norm of a vector norm is again a vector norm.
This is proved in the following proposition.
Proposition 1.6.3. Let ‖ · ‖D be the dual norm of a vector norm ‖ · ‖ on finite-dimensional
vector space V over the field F (R or C). Then ‖ · ‖D is a vector norm.
Proof. We will check that ‖ · ‖D satisfies the norm axioms. Let y, z ∈ V.
1. (Homogeneous) Let c ∈ F. Since (cy)* = c̄ y* and |c̄| = |c|, we have
‖cy‖D = max_{‖x‖=1} |(cy)*x| = max_{‖x‖=1} |c||y*x| = |c| max_{‖x‖=1} |y*x| = |c|‖y‖D
2. (Positive and Non-Negative) For y ≠ 0,
‖y‖D = max_{‖x‖=1} |y*x| ≥ |y*(y/‖y‖)| = ‖y‖2^2 / ‖y‖ > 0
Furthermore, ‖0‖D = 0. Hence ‖y‖D ≥ 0 for all y ∈ V, with ‖y‖D = 0 if and only if y = 0.
3. (Triangle Inequality) We have
‖y + z‖D = max_{‖x‖=1} |(y + z)*x|
≤ max_{‖x‖=1} (|y*x| + |z*x|)
≤ max_{‖x‖=1} |y*x| + max_{‖x‖=1} |z*x|
= ‖y‖D + ‖z‖D
We will derive the dual of some common vector norms in the following propositions.
Proposition 1.6.4. The dual of the l1-norm is the l∞-norm. The dual of the l∞-norm is the l1-norm.
Proof. Let x,y ∈ Fn. We have, by the Triangle Inequality,
|y*x| = |∑_{i=1}^{n} ȳi xi| ≤ ∑_{i=1}^{n} |yi||xi| ≤ (max_{1≤i≤n} |yi|) ∑_{j=1}^{n} |xj| = ‖y‖∞‖x‖1 (1.7)
Now, given a vector y, equality holds in (1.7) when x is a unit vector (with respect to ‖ · ‖1) such
that xi = 1 for one value of i for which |yi| = ‖y‖∞, and xi = 0 otherwise. Hence, we have
max_{‖x‖1=1} |y*x| = ‖y‖∞
from which we conclude that the dual of the l1-norm is the l∞-norm.
Similarly, bounding the middle sum in (1.7) by ‖x‖∞ instead, we have |y*x| ≤ ‖y‖1‖x‖∞, with
equality when x is a unit vector (with respect to ‖ · ‖∞) such that xi = yi/|yi| for all i where
yi ≠ 0, and xi = 1 otherwise. Hence, we have
max_{‖x‖∞=1} |y*x| = ‖y‖1
from which we conclude that the dual of the l∞-norm is the l1-norm.
Proposition 1.6.5. The dual of the l2-norm is itself.

Proof. Let x, y ∈ Fⁿ. By the Cauchy–Schwarz Inequality,

|y∗x| = |Σ_{i=1}^{n} ȳᵢxᵢ| ≤ ‖y‖₂‖x‖₂

with equality when x = y/‖y‖₂. By the same argument as in Proposition 1.6.4,
(‖y‖₂)D = max_{‖x‖₂=1} |y∗x| = ‖y‖₂, proving the result.
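The closed forms above are easy to check numerically. The sketch below (illustrative only and not part of the report; numpy and the helper names `dual_l1` and `dual_l2` are our own) evaluates ‖y‖D = max_{‖x‖=1} |y∗x| for the l1-norm by testing the extreme points ±eₖ of its unit ball, where the maximum must be attained, and checks the self-duality of the l2-norm via the maximiser x = y/‖y‖₂.

```python
import numpy as np

def dual_l1(y):
    # max over the l1 unit ball is attained at a signed basis vector +-e_k,
    # giving max_k |y_k|, i.e. the l-infinity norm of y.
    return max(abs(np.vdot(y, e)) for e in np.eye(len(y)))

def dual_l2(y):
    # Cauchy-Schwarz is tight at x = y / |y|_2, so the dual value is |y|_2.
    x = y / np.linalg.norm(y, 2)
    return abs(np.vdot(y, x))

rng = np.random.default_rng(0)
y = rng.standard_normal(5)
assert np.isclose(dual_l1(y), np.linalg.norm(y, np.inf))  # dual of l1 is l-inf
assert np.isclose(dual_l2(y), np.linalg.norm(y, 2))       # l2 is self-dual
```

Here `np.vdot` conjugates its first argument, matching the inner product y∗x used above.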
Proposition 1.6.5 shows that the l2-norm is its own dual. In fact, it is the only
vector norm with this property. To prove this, it is necessary to first establish the following
inequality, which is a natural generalisation of the Cauchy–Schwarz Inequality.
Proposition 1.6.6. Let ‖ · ‖ be a vector norm on finite-dimensional vector space V over the
field F (R or C). Then for all x,y ∈ V ,
|y∗x| ≤ ‖x‖‖y‖D (1.8)
|y∗x| ≤ ‖x‖D‖y‖ (1.9)
Proof. When x = 0, inequality (1.8) holds trivially.
Suppose x ≠ 0. Then we have

|y∗(x/‖x‖)| ≤ max_{‖z‖=1} |y∗z| = ‖y‖D

and hence |y∗x| ≤ ‖x‖‖y‖D, proving inequality (1.8). Inequality (1.9) follows since we have
|y∗x| = |x∗y|.
Proposition 1.6.7. Let ‖ · ‖ be a vector norm on a finite-dimensional vector space V over the
field F (R or C), and let ‖ · ‖D be its dual norm. Let c > 0 be given. Then ‖x‖ = c‖x‖D for all
x ∈ V if and only if ‖ · ‖ = √c ‖ · ‖₂.
In particular, ‖ · ‖ = ‖ · ‖D if and only if ‖ · ‖ is the l2-norm.
Proof. (⇐)
Suppose ‖ · ‖ = √c ‖ · ‖₂. Let x ∈ V . Then

‖x‖D = max_{‖y‖=1} |x∗y| = max_{‖y‖₂=1/√c} |x∗y| = max_{‖y‖₂=1} |x∗(y/√c)|
     = (1/√c) max_{‖y‖₂=1} |x∗y|
     = (1/√c) (‖x‖₂)D
     = (1/√c) ‖x‖₂
     = (1/c) ‖x‖

Hence for any x ∈ V , ‖x‖ = c‖x‖D as required.
(⇒)
Conversely, suppose ‖x‖ = c‖x‖D for all x ∈ V . Then by Proposition 1.6.6,

‖x‖₂² = |x∗x| ≤ ‖x‖‖x‖D = (1/c) ‖x‖² (1.10)

so ‖x‖ ≥ √c ‖x‖₂. Moreover, by the Cauchy–Schwarz Inequality (as in Proposition 1.6.5),

|x∗y| ≤ ‖x‖₂‖y‖₂ (1.11)

with equality when y = x/‖x‖₂; observe that this y satisfies ‖y‖₂ = 1. Hence

max_{y≠0} |x∗(y/‖y‖₂)| = max_{‖y‖₂=1} |x∗y| = ‖x‖₂ (1.12)

where this maximum is attained at y = x/‖x‖₂.
Using (1.12) together with ‖y‖₂ ≤ (1/√c)‖y‖, which follows from ‖y‖ ≥ √c‖y‖₂ above, we
establish the reverse bound for x ≠ 0 by considering

(1/c)‖x‖ = ‖x‖D = max_{‖y‖=1} |x∗y|
         = max_{y≠0} |x∗(y/‖y‖)|
         = max_{y≠0} |x∗(y/‖y‖₂)| · (‖y‖₂/‖y‖)
         ≤ max_{y≠0} |x∗(y/‖y‖₂)| · (1/√c)
         = (1/√c) ‖x‖₂

Hence ‖x‖ ≤ √c‖x‖₂, and combining this with (1.10), we have proven ‖x‖ = √c‖x‖₂.
By taking c = 1, the final assertion follows, and we have shown that the l2-norm is the only
norm which is its own dual.
We will now establish the equivalence between dual norms, as well as between a vector
norm and its dual. This is one of the nice and useful properties of finite-dimensional vector
spaces.
Lemma 1.6.8. Let ‖ · ‖α and ‖ · ‖β be two given vector norms on a finite-dimensional vector
space V over the field F (R or C), and let ‖ · ‖Dα and ‖ · ‖Dβ be their respective duals. Suppose
there exists some constant C > 0 such that ‖x‖α ≤ C‖x‖β for all x ∈ V . Then ‖x‖Dβ ≤ C‖x‖Dα
for all x ∈ V .
Proof. We have

‖x‖Dα = max_{‖y‖α=1} |x∗y| = max_{y≠0} |x∗y|/‖y‖α
      ≥ max_{y≠0} |x∗y|/(C‖y‖β)
      = (1/C) max_{y≠0} |x∗y|/‖y‖β
      = (1/C) max_{‖y‖β=1} |x∗y|
      = (1/C) ‖x‖Dβ

which, upon rearranging, gives the conclusion.
Theorem 1.6.9. Let ‖ · ‖α and ‖ · ‖β be two given vector norms on finite-dimensional vector
space V over the field F (R or C). Suppose that ‖ · ‖α and ‖ · ‖β are equivalent vector norms.
Then their duals, ‖ · ‖Dα and ‖ · ‖Dβ are also equivalent.
Proof. By the equivalence of ‖ · ‖α and ‖ · ‖β, there exist positive constants cm and CM such
that, for all x ∈ V ,

cm‖x‖β ≤ ‖x‖α ≤ CM‖x‖β

By Lemma 1.6.8, ‖x‖α ≤ CM‖x‖β implies ‖x‖Dβ ≤ CM‖x‖Dα, and ‖x‖β ≤ (1/cm)‖x‖α implies
‖x‖Dα ≤ (1/cm)‖x‖Dβ, for all x ∈ V . Hence, we have shown

(1/CM) ‖x‖Dβ ≤ ‖x‖Dα ≤ (1/cm) ‖x‖Dβ

proving the equivalence of ‖ · ‖Dα and ‖ · ‖Dβ.
Theorem 1.6.10. Let ‖·‖ be a vector norm and ‖·‖D be its dual on a finite-dimensional vector
space V over the field F (R or C). Then ‖ · ‖ and ‖ · ‖D are equivalent.
Proof. Note that in a finite-dimensional vector space we have the equivalence of ‖ · ‖ and the l2-norm:

cm‖x‖ ≤ ‖x‖₂ ≤ CM‖x‖ (1.13)

for all x ∈ V and some positive constants cm and CM.
Using (1.11) and (1.13), for the upper bound we have

max_{x≠0} ‖x‖D/‖x‖ = max_{x≠0} (max_{‖y‖=1} |x∗y|)/‖x‖
                   = max_{x≠0} max_{‖y‖=1} |(x/‖x‖)∗y|
                   = max_{‖x‖=1} max_{‖y‖=1} |x∗y|
                   ≤ max_{‖x‖=1} max_{‖y‖=1} ‖x‖₂‖y‖₂
                   ≤ max_{‖x‖=1} max_{‖y‖=1} (CM‖x‖)(CM‖y‖) = CM²

By a similar argument as above, for the lower bound, consider

min_{x≠0} ‖x‖D/‖x‖ = min_{‖x‖=1} max_{‖y‖=1} |x∗y| ≥ min_{‖x‖=1} |x∗x|/‖x‖
                   = min_{‖x‖=1} ‖x‖₂²/‖x‖
                   ≥ min_{‖x‖=1} cm²‖x‖²/‖x‖ = cm²

Hence, combining the upper and lower bounds, we have

cm² ≤ ‖x‖D/‖x‖ ≤ CM²

or equivalently

cm²‖x‖ ≤ ‖x‖D ≤ CM²‖x‖

as required. Therefore, ‖ · ‖ and ‖ · ‖D are equivalent.
We will conclude this chapter with the following duality theorem, which says that the dual
of the dual norm is the original norm. This is related to the convexity of the unit ball of the
vector norm, as is evident from the proof of this theorem below.
Theorem 1.6.11 (Duality Theorem). Let ‖ · ‖ be a vector norm on a finite-dimensional vector
space V over the field F (R or C). Let ‖ · ‖D denote the dual norm of ‖ · ‖ and let ‖ · ‖DD denote
the dual norm of ‖ · ‖D. Let

B ≡ {x ∈ V : ‖x‖ ≤ 1} (1.14)

B′′ ≡ {x ∈ V : ‖x‖DD ≤ 1} (1.15)

denote the unit ball of ‖ · ‖ and the unit ball of ‖ · ‖DD respectively.
Then B = B′′, i.e. ‖ · ‖DD = ‖ · ‖.
Proof. First we will prove the following claim:
Claim 1: B′′ ⊂ Co B, where Co B is the closed convex hull of B, which is the intersection of all
convex sets containing B (see [7] for more rigorous treatment of convex hull and half-spaces).
Proof of Claim 1: Observe that the set {t ∈ V : Re t∗v ≤ 1} is a general closed half-space that
contains the origin. Now, let u ∈ B′′ be a given point and observe that
u ∈ {t ∈ V : Re t∗v ≤ 1 for every v such that ‖v‖D ≤ 1}
= {t ∈ V : Re t∗v ≤ 1 for every v such that Re v∗w ≤ 1 for every w such that ‖w‖ ≤ 1}
= {t ∈ V : Re t∗v ≤ 1 for every v such that Re w∗v ≤ 1 for all w ∈ B}
This implies that u lies in every closed half-space that contains every point of B, i.e. u lies in
every closed half-space that contains B. Since the intersection of all such closed half-spaces is
the closed convex hull of B, we conclude that u ∈ Co B. As u ∈ B′′ was arbitrary, this implies
B′′ ⊆ Co B, proving Claim 1.
Next we prove another claim:
Claim 2: B ⊆ B′′.
Proof of Claim 2: By Proposition 1.6.6,

‖x‖DD = max_{‖y‖D=1} |y∗x| ≤ max_{‖y‖D=1} ‖x‖‖y‖D = ‖x‖

which is equivalent to B ⊆ B′′ by Proposition 1.5.3, proving Claim 2.
Now, as B is the unit ball of a vector norm, by Theorem 1.5.5, B is a convex set. Hence,
the smallest convex set containing B is B itself, i.e. we have B = Co B. Then we have by Claim
1 and Claim 2 the following chain of inclusion:
Co B = B ⊆ B′′ ⊆ Co B
which implies B = B′′, i.e. ‖ · ‖DD = ‖ · ‖ as required.
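For polyhedral norms the duality theorem can be verified exactly, since the dual-norm maximum over the unit ball is attained at an extreme point. The sketch below (an illustration of ours, not part of the report) computes the dual of the l1-norm by enumerating the extreme points ±eₖ of the l1 ball, and the dual of the l∞-norm by enumerating the 2ⁿ sign vectors that are the extreme points of the l∞ ball; composing the two recovers the original l1-norm, i.e. ‖ · ‖DD = ‖ · ‖.

```python
import itertools

def dual_of_l1(y):
    # Extreme points of the l1 unit ball are +-e_k, so the maximum of
    # |y* x| over the ball is max_k |y_k| = the l-infinity norm of y.
    return max(abs(y_k) for y_k in y)

def dual_of_linf(y):
    # Extreme points of the l-infinity unit ball are the 2^n sign vectors,
    # so the maximum over them is sum_k |y_k| = the l1-norm of y.
    n = len(y)
    return max(abs(sum(s_k * y_k for s_k, y_k in zip(s, y)))
               for s in itertools.product((-1.0, 1.0), repeat=n))

y = [3.0, -1.5, 0.25]
assert dual_of_l1(y) == max(abs(t) for t in y)     # (l1)^D  = l-infinity
assert dual_of_linf(y) == sum(abs(t) for t in y)   # (l1)^DD = (linf)^D = l1
```

The second assertion is exactly the statement ‖y‖DD = ‖y‖₁ for the l1-norm.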
Chapter 2
Matrix Norms
We will now generalise the notion of norm introduced in Chapter 1 to measure the ‘size’ of
matrices. Since Mn is itself a vector space of dimension n², one may measure the ‘size’ of a
matrix by using any vector norm on F^{n²}. However, Mn has a natural multiplication operation,
and it is often useful to relate the ‘size’ of the matrix AB to the ‘size’ of A and B. In this
chapter, the notion of matrix norm and its properties will be studied.
2.1 Basic Properties of Matrix Norms
Definition 2.1.1 (Matrix Norm Axioms). A function ||| · ||| :Mn → R is a matrix norm on Mn
if for all A,B ∈Mn, the following axioms are satisfied:
1. (Non-negative) |||A ||| ≥ 0.
2. (Positive) |||A ||| = 0 if and only if A = 0.
3. (Homogeneous) ||| cA ||| = |c| |||A ||| for all scalars c ∈ F.
4. (Triangle Inequality) |||A+B ||| ≤ |||A |||+ |||B |||.
5. (Submultiplicative) |||AB ||| ≤ |||A ||| |||B |||.
We will now establish some basic results concerning matrix norm.
Proposition 2.1.2. Let ||| · ||| be a matrix norm on Mn. Then
1.∣∣∣∣∣∣Ak ∣∣∣∣∣∣ ≤ |||A |||k for every positive integer k.
2. ||| I ||| ≥ 1.
Proof. We will prove (1) by induction.

1. Let P(n) be the statement ‘|||Aⁿ||| ≤ |||A|||ⁿ’.
When n = 1, the statement is trivially true.
Suppose P(k) is true for some positive integer k. Then by submultiplicativity,

|||A^{k+1}||| = |||A A^{k}||| ≤ |||A||| |||A^{k}||| ≤ |||A||| |||A|||^{k} = |||A|||^{k+1}

so P(k + 1) is true. Hence, P(n) is true for all positive integers n by induction.

2. By submultiplicativity we have

|||I||| = |||I²||| ≤ |||I||| |||I|||

and since |||I||| ≠ 0, dividing by |||I||| gives |||I||| ≥ 1 as required.
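Both inequalities of Proposition 2.1.2 are easy to test numerically, for instance with the maximum column sum norm; the following sketch (ours, assuming numpy, whose matrix `ord=1` norm is the maximum column sum) does so for a random matrix.

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 4))
normA = np.linalg.norm(A, 1)      # maximum column sum matrix norm
k = 5
Ak = A.copy()
for _ in range(k - 1):            # form A^k by repeated multiplication
    Ak = Ak @ A
# |A^k| <= |A|^k, with a small tolerance for floating-point rounding
assert np.linalg.norm(Ak, 1) <= normA ** k + 1e-9
# |I| >= 1 (here |I|_1 = 1 exactly)
assert np.linalg.norm(np.eye(4), 1) >= 1.0
```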
We will now give some examples of commonly used matrix norms, and prove directly in
the case of the Frobenius norm that it is indeed a matrix norm. Later, in Proposition 2.2.4, ||| · |||₁
will also be shown to be a matrix norm by showing that it is induced; ||| · |||∞ can also be shown
to be a matrix norm by a similar method.
Example 2.1.3 (maximum column sum matrix norm on Mn).

|||A|||₁ ≡ max_{1≤j≤n} Σ_{i=1}^{n} |a_{ij}|

Example 2.1.4 (maximum row sum matrix norm on Mn).

|||A|||∞ ≡ max_{1≤i≤n} Σ_{j=1}^{n} |a_{ij}|

Example 2.1.5 (Frobenius norm on Mn).

|||A|||_F ≡ (Σ_{i,j=1}^{n} |a_{ij}|²)^{1/2}
Proof. The non-negativity and positivity axioms are easy to check. We will show that ||| · |||_F is a
matrix norm by checking the remaining matrix norm axioms:

1. (Homogeneous) Let c ∈ F. Then

||| cA |||_F = (Σ_{i,j=1}^{n} |ca_{ij}|²)^{1/2} = (|c|² Σ_{i,j=1}^{n} |a_{ij}|²)^{1/2} = |c| (Σ_{i,j=1}^{n} |a_{ij}|²)^{1/2} = |c| |||A|||_F

2. (Triangle Inequality) The triangle inequality follows from Minkowski's Inequality (Theorem
1.2.8) by taking p = 2.

3. (Submultiplicative) Let A, B ∈ Mn. Then by the Cauchy–Schwarz Inequality (Theorem 1.3.3),

|||AB|||²_F = Σ_{i,j=1}^{n} |Σ_{k=1}^{n} a_{ik}b_{kj}|²
           ≤ Σ_{i,j=1}^{n} [(Σ_{k=1}^{n} |a_{ik}|²)(Σ_{m=1}^{n} |b_{mj}|²)]
           = (Σ_{i,k=1}^{n} |a_{ik}|²)(Σ_{m,j=1}^{n} |b_{mj}|²)
           = |||A|||²_F |||B|||²_F

Hence, the Frobenius norm is indeed a matrix norm.
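The submultiplicativity and triangle inequalities just proven can be spot-checked numerically; the sketch below (ours, assuming numpy; `frobenius` is a helper matching the definition above) tests them on random matrices.

```python
import numpy as np

def frobenius(A):
    # |A|_F = ( sum_{i,j} |a_ij|^2 )^{1/2}
    return np.sqrt((np.abs(A) ** 2).sum())

rng = np.random.default_rng(1)
for _ in range(100):
    A = rng.standard_normal((4, 4))
    B = rng.standard_normal((4, 4))
    # submultiplicativity: |AB|_F <= |A|_F |B|_F
    assert frobenius(A @ B) <= frobenius(A) * frobenius(B) + 1e-12
    # triangle inequality: |A + B|_F <= |A|_F + |B|_F
    assert frobenius(A + B) <= frobenius(A) + frobenius(B) + 1e-12
```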
Example 2.1.6 (spectral norm on Mn).
|||A |||2 ≡ max{√λ : λ is an eigenvalue of A∗A}
Notice that if A∗Ax = λx and x ≠ 0, then x∗A∗Ax = ‖Ax‖₂² = λ‖x‖₂², hence λ is real and
non-negative, so |||A|||₂ is well-defined. It can be checked that the spectral norm is a matrix norm.
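The definition of the spectral norm can be implemented directly from the eigenvalues of A∗A; the following sketch (ours, assuming numpy) compares it against the library's induced 2-norm, which returns the largest singular value.

```python
import numpy as np

def spectral(A):
    # |A|_2 = max{ sqrt(lambda) : lambda an eigenvalue of A* A }
    lam = np.linalg.eigvalsh(A.conj().T @ A)   # A*A is Hermitian and PSD
    return np.sqrt(lam.max())

rng = np.random.default_rng(2)
A = rng.standard_normal((5, 5))
# agrees with the library's induced 2-norm (largest singular value)
assert np.isclose(spectral(A), np.linalg.norm(A, 2))
```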
As in the case of vector norms, all matrix norms on Mn are equivalent. To prove this
fact, we first need the following lemma, similar to Lemma 1.4.6.
Lemma 2.1.7. Let ||| · ||| be a matrix norm on Mn, and let A1, A2, . . . , Am be given matrices.
Then the function g : Fm → R defined by
g(z1, z2, . . . , zm) ≡ ||| z1A1 + z2A2 + . . .+ zmAm |||
is a uniformly continuous function.
Proof. Let U = Σ_{i=1}^{m} uᵢAᵢ and V = Σ_{i=1}^{m} vᵢAᵢ. Then we have

|g(u₁, . . . , uₘ) − g(v₁, . . . , vₘ)| = | |||U||| − |||V||| | ≤ |||U − V|||
= ||| Σ_{i=1}^{m} (uᵢ − vᵢ)Aᵢ |||
≤ Σ_{i=1}^{m} |uᵢ − vᵢ| |||Aᵢ|||
≤ C max_{1≤i≤m} |uᵢ − vᵢ|

where C ≡ Σ_{i=1}^{m} |||Aᵢ|||.
Now if all the Aᵢ are zero matrices, there is nothing to show. Otherwise C > 0, and given ε > 0,
in order to have |g(u₁, . . . , uₘ) − g(v₁, . . . , vₘ)| < ε it suffices to choose max_{1≤i≤m} |uᵢ − vᵢ| < ε/C,
proving the result.
Corollary 2.1.8. Every matrix norm on Mn is a uniformly continuous function.
Proof. In Lemma 2.1.7, choose the given matrices A₁, . . . , A_{n²} to be a basis of Mn. Then
every matrix in Mn can be written as a linear combination of the chosen basis, and the result
follows.
We can now formulate the equivalence of matrix norms in Mn in the sense similar to the
equivalence of vector norms in Fn.
Theorem 2.1.9. Let ||| · |||α and ||| · |||β be two matrix norms on Mn. Then ||| · |||α and ||| · |||β are
equivalent, in the sense that there exist positive constants cm and CM such that
cm |||A |||α ≤ |||A |||β ≤ CM |||A |||α
for all A ∈Mn.
Proof. This follows immediately from Theorem 1.4.8: ||| · ||| is positive and homogeneous by
definition, and uniformly continuous by Corollary 2.1.8.
2.2 Induced Matrix Norms
In this section, we will define another class of matrix norm which is related to a vector norm in
some sense, and derive some of its important properties.
Definition 2.2.1 (Induced Matrix Norms). Let ‖ · ‖ be a vector norm on a finite-dimensional
vector space over the field F (R or C). Define ||| · ||| on Mn by

|||A||| ≡ max_{‖x‖=1} ‖Ax‖ = max_{x≠0} ‖Ax‖/‖x‖

Then ||| · ||| is said to be the matrix norm induced by ‖ · ‖, or the operator norm associated with
‖ · ‖. Note that the use of ‘max’ in the above definition is justified since ‖Ax‖ is a continuous
function of x and the unit ball B_{‖·‖} is compact.
We will now show that the induced norm defined above is indeed a matrix norm.
Proposition 2.2.2. ||| · ||| defined in Definition 2.2.1 is a matrix norm.
Proof. We will show that it is a matrix norm by checking the matrix norm axioms.
1. (Non-Negative and Positive) Non-negativity follows from the fact that |||A ||| is the maxi-
mum of a non-negative valued function. Moreover, by the definition, |||A ||| = 0 if and only
if Ax = 0 for all x 6= 0, which is equivalent to A = 0.
2. (Homogeneous) Let c ∈ F. We have

||| cA ||| = max_{‖x‖=1} ‖cAx‖ = |c| max_{‖x‖=1} ‖Ax‖ = |c| |||A|||

3. (Triangle Inequality) Let A, B ∈ Mn. Then

|||A + B||| = max_{‖x‖=1} ‖(A + B)x‖ ≤ max_{‖x‖=1} (‖Ax‖ + ‖Bx‖)
            ≤ max_{‖x‖=1} ‖Ax‖ + max_{‖x‖=1} ‖Bx‖
            = |||A||| + |||B|||

4. (Submultiplicative) Let A, B ∈ Mn. For x not in the nullspace of B,

‖ABx‖/‖x‖ = (‖ABx‖/‖Bx‖)(‖Bx‖/‖x‖) ≤ max_{y≠0} ‖Ay‖/‖y‖ · max_{x≠0} ‖Bx‖/‖x‖ = |||A||| |||B|||

while if Bx = 0, then ABx = 0 and the bound holds trivially. Taking the maximum over
x ≠ 0 gives |||AB||| ≤ |||A||| |||B|||.

Hence, we have proven that ||| · ||| is a matrix norm.
Now, we will derive some basic properties of induced matrix norm.
Proposition 2.2.3. Let ||| · ||| be a matrix norm induced by the vector norm ‖ · ‖. Then
1. ‖Ax‖ ≤ |||A ||| ‖x‖, for all A ∈Mn and all x ∈ Fn.
2. ||| I ||| = 1.
Proof. First we will prove (1).
When x = 0, the inequality is trivially true. By the definition of the induced matrix norm, for all
x ≠ 0 we have

‖A(x/‖x‖)‖ ≤ |||A|||

By homogeneity of the vector norm, we then have ‖Ax‖ ≤ |||A||| ‖x‖ as required.
Now, for (2) we have

|||I||| = max_{‖x‖=1} ‖Ix‖ = max_{‖x‖=1} ‖x‖ = 1

Note that by the proposition above, the condition |||I||| = 1 is a necessary condition for ||| · |||
to be an induced matrix norm. However, it is not sufficient.
In the following, we will find the matrix norm induced by some common vector norms.
Proposition 2.2.4. The maximum column sum matrix norm ||| · |||1 is induced by l1-norm.
Proof. Write A ∈ Mn in terms of its columns as A = [a₁ a₂ . . . aₙ]. Then

|||A|||₁ = max_{1≤j≤n} ‖aⱼ‖₁

If x = (x₁, . . . , xₙ), then

‖Ax‖₁ = ‖x₁a₁ + . . . + xₙaₙ‖₁ ≤ Σ_{i=1}^{n} ‖xᵢaᵢ‖₁
      = Σ_{i=1}^{n} |xᵢ| ‖aᵢ‖₁
      ≤ (Σ_{i=1}^{n} |xᵢ|) max_{1≤k≤n} ‖aₖ‖₁
      = ‖x‖₁ |||A|||₁

Hence, we have max_{‖x‖₁=1} ‖Ax‖₁ ≤ |||A|||₁. Now, choose x = eₖ (the k-th unit basis vector). Then
for any k = 1, . . . , n we have

max_{‖x‖₁=1} ‖Ax‖₁ ≥ ‖Aeₖ‖₁ = ‖aₖ‖₁

and therefore

max_{‖x‖₁=1} ‖Ax‖₁ ≥ max_{1≤k≤n} ‖aₖ‖₁ = |||A|||₁

Hence, we have the required conclusion.
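The two halves of the proof, the upper bound ‖Ax‖₁ ≤ |||A|||₁‖x‖₁ and its attainment at a standard basis vector, can be checked numerically; a minimal sketch of ours, assuming numpy:

```python
import numpy as np

def max_col_sum(A):
    # |A|_1 = max_j sum_i |a_ij|
    return np.abs(A).sum(axis=0).max()

rng = np.random.default_rng(4)
A = rng.standard_normal((4, 4))
# the ratio |Ax|_1 / |x|_1 never exceeds |A|_1 ...
for _ in range(200):
    x = rng.standard_normal(4)
    assert np.abs(A @ x).sum() <= max_col_sum(A) * np.abs(x).sum() + 1e-9
# ... and the bound is attained at the basis vector e_k selecting the
# column of largest l1-norm.
k = np.abs(A).sum(axis=0).argmax()
e = np.zeros(4)
e[k] = 1.0
assert np.isclose(np.abs(A @ e).sum(), max_col_sum(A))
```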
In a similar way, we can show that ||| · |||∞ is induced by the l∞-norm.
Below are two more examples of matrix norms induced by some vector norm.
Example 2.2.5. The maximum row sum matrix norm ||| · |||∞ defined on Mn is induced by the
l∞-norm.
Example 2.2.6. The spectral norm ||| · |||2 defined on Mn is induced by the l2-norm.
We have seen in the previous section that all matrix norms in Mn are equivalent. In this
section, we are going to explore this notion of equivalence specifically for the case of induced
matrix norms further. Moreover, the relation between different induced matrix norms will also
be studied. Before that, we will introduce the following lemma.
Lemma 2.2.7. Let ‖ · ‖ be a given vector norm on Fⁿ and let y ∈ Fⁿ be a given fixed vector.
Then there exists a vector y₀ ∈ Fⁿ such that both of the following are satisfied:

1. (y₀)∗y = ‖y‖; and

2. |(y₀)∗x| ≤ ‖x‖ for all x ∈ Fⁿ.

Proof. By the duality theorem, we have

‖y‖ = ‖y‖DD = max_{‖z‖D=1} |y∗z|

Moreover, by the compactness of the unit sphere of the vector norm ‖ · ‖D, the maximum is
attained at some vector z = y₀ with ‖y₀‖D = 1, so ‖y‖ = |y∗y₀|. Multiplying y₀ by a suitable
factor of modulus 1, the inner product (y₀)∗y can be made positive and equal to ‖y‖, satisfying (1).
Moreover, by Proposition 1.6.6, we have

|(y₀)∗x| ≤ ‖y₀‖D‖x‖ = ‖x‖

for all x ∈ Fⁿ, satisfying (2). Hence, we have found the required vector y₀.
Theorem 2.2.8. Let ‖ · ‖α and ‖ · ‖β be two given vector norms on finite-dimensional vector
space V over the field F (R or C). Let ||| · |||α and ||| · |||β be the respective induced matrix norms
on Mn. Define
R_{αβ} ≡ max_{x≠0} ‖x‖α/‖x‖β  and  R_{βα} ≡ max_{x≠0} ‖x‖β/‖x‖α (2.1)

Then

max_{A≠0} |||A|||α/|||A|||β = max_{A≠0} |||A|||β/|||A|||α = R_{αβ}R_{βα} (2.2)

Proof. Let A ∈ Mn and x ∈ V be given, and suppose that x ≠ 0 and Ax ≠ 0. Then

‖Ax‖α/‖x‖α = (‖Ax‖α/‖Ax‖β)(‖Ax‖β/‖x‖β)(‖x‖β/‖x‖α) ≤ R_{αβ} (‖Ax‖β/‖x‖β) R_{βα}
an inequality which also holds when Ax = 0. Thus we have

|||A|||α ≡ max_{x≠0} ‖Ax‖α/‖x‖α ≤ R_{αβ}R_{βα} max_{x≠0} ‖Ax‖β/‖x‖β = R_{αβ}R_{βα} |||A|||β

and hence

|||A|||α / |||A|||β ≤ R_{αβ}R_{βα} (2.3)

for all nonzero A ∈ Mn.
Now, rewrite (2.1) as follows:

max_{x≠0} ‖x‖α/‖x‖β = max_{x≠0} ‖x/‖x‖₂‖α / ‖x/‖x‖₂‖β = max_{‖y‖₂=1} ‖y‖α/‖y‖β

Hence, by the compactness of the unit sphere and the Weierstrass Theorem, each of the extrema in (2.1)
is achieved at some nonzero vector, i.e. there exist vectors y, z ∈ V such that ‖y‖₂ = ‖z‖₂ = 1,
‖y‖α = R_{αβ}‖y‖β and ‖z‖β = R_{βα}‖z‖α. By Lemma 2.2.7, there exists a vector z₀ ∈ V such
that

1. |z₀∗x| ≤ ‖x‖β for all x ∈ V ; and

2. z₀∗z = ‖z‖β.

Now, consider the matrix A₀ ≡ yz₀∗. By (2), we have

‖A₀z‖α/‖z‖α = ‖yz₀∗z‖α/‖z‖α = ‖y‖α |z₀∗z| / ‖z‖α = ‖y‖α‖z‖β/‖z‖α

so by the definition of the induced matrix norm, we have the lower bound

|||A₀|||α ≥ ‖y‖α‖z‖β/‖z‖α = R_{αβ}R_{βα}‖y‖β (2.4)

Moreover, by (1), we have

‖A₀x‖β/‖x‖β = ‖yz₀∗x‖β/‖x‖β = ‖y‖β |z₀∗x| / ‖x‖β ≤ ‖y‖β (2.5)

By the definition of the induced matrix norm, we have the upper bound |||A₀|||β ≤ ‖y‖β.
Combining (2.4) and (2.5),

|||A₀|||α / |||A₀|||β ≥ R_{αβ}R_{βα}‖y‖β / ‖y‖β = R_{αβ}R_{βα}

which shows that equality is possible in (2.3), hence establishing one part of (2.2). The complete
assertion in (2.2) follows because of the symmetry in α and β.
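Theorem 2.2.8 can be illustrated with α the l1-norm and β the l∞-norm on Fⁿ: then R_{αβ} = max ‖x‖₁/‖x‖∞ = n and R_{βα} = max ‖x‖∞/‖x‖₁ = 1, so the maximal ratio |||A|||₁/|||A|||∞ is exactly n, attained for instance by a matrix whose only nonzero entries form a single column of ones. A sketch of ours, assuming numpy (whose matrix `ord=1` and `ord=inf` norms are the maximum column and row sums):

```python
import numpy as np

n = 5
rng = np.random.default_rng(5)
# random matrices never exceed the predicted ratio R_{1,inf} R_{inf,1} = n:
for _ in range(200):
    A = rng.standard_normal((n, n))
    ratio = np.linalg.norm(A, 1) / np.linalg.norm(A, np.inf)
    assert ratio <= n + 1e-9
# the extremal matrix has a single column of ones:
# |A0|_1 = n (column sum), |A0|_inf = 1 (each row sum), so the ratio is n.
A0 = np.zeros((n, n))
A0[:, 0] = 1.0
assert np.linalg.norm(A0, 1) / np.linalg.norm(A0, np.inf) == n
```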
The above theorem has several interesting corollaries which are presented below. Corollary
2.2.9 shows that two different vector norms could induce the same matrix norm if and only if
one of the vector norms is a constant multiple of the other.
Corollary 2.2.9. Let ‖ ·‖α and ‖ ·‖β be vector norms on finite-dimensional vector space V over
the field F (R or C). Let ||| · |||α and ||| · |||β denote the respective induced matrix norm on Mn.
Then |||A |||α = |||A |||β for all A ∈ Mn if and only if there exists a positive constant c such that
‖x‖α = c‖x‖β for all x ∈ V .
Proof. Observe that

R_{βα} = max_{x≠0} ‖x‖β/‖x‖α = [min_{x≠0} ‖x‖α/‖x‖β]⁻¹ ≥ [max_{x≠0} ‖x‖α/‖x‖β]⁻¹ = 1/R_{αβ}

Hence, we have the general inequality

R_{αβ}R_{βα} ≥ 1 (2.6)

with equality if and only if

min_{x≠0} ‖x‖α/‖x‖β = max_{x≠0} ‖x‖α/‖x‖β

which can occur if and only if the function ‖x‖α/‖x‖β is constant for all x ≠ 0. Therefore, if
‖x‖α ≡ c‖x‖β, we have R_{αβ}R_{βα} = 1, hence |||A|||α ≤ |||A|||β and |||A|||β ≤ |||A|||α by Theorem
2.2.8, implying |||A|||α = |||A|||β for all A ∈ Mn.
Conversely, if the two induced matrix norms are identical, then R_{αβ}R_{βα} = 1, again by Theorem
2.2.8, and hence equality holds in (2.6), so the ratio ‖x‖α/‖x‖β is constant for all x ≠ 0 by
the preceding argument, proving the result.
Moreover, we have the following corollary which says that no induced matrix norm can be
uniformly dominated by another. This is made precise as follows.
Corollary 2.2.10. Let ‖ · ‖α and ‖ · ‖β be vector norms on finite-dimensional vector space V
over the field F (R or C). Let ||| · |||α and ||| · |||β denote the respective induced matrix norm on
Mn. Then |||A |||α ≤ |||A |||β for all A ∈Mn if and only if |||A |||α = |||A |||β for all A ∈Mn.
Proof. If |||A|||α ≤ |||A|||β for all A ∈ Mn, then R_{αβ}R_{βα} ≤ 1 by Theorem 2.2.8, which, because
of (2.6) in the previous corollary, implies R_{αβ}R_{βα} = 1. Therefore, |||A|||α = |||A|||β for all
A ∈ Mn by a similar argument as in Corollary 2.2.9.
Corollary 2.2.10 says that no induced matrix norm can be uniformly dominated by another
induced matrix norm. The following theorem will examine the case when we compare induced
matrix norm to another (not necessarily induced) matrix norm.
Theorem 2.2.11. Let ||| · ||| be a given matrix norm on Mn, and let ||| · |||α be a given induced
matrix norm on Mn. Then
1. There is an induced matrix norm ||| · |||β such that |||A |||β ≤ |||A ||| for all A ∈Mn; and
2. |||A ||| ≤ |||A |||α for all A ∈Mn if and only if |||A ||| = |||A |||α for all A ∈Mn.
Proof. Define the vector norm ‖ · ‖ on Fⁿ by

‖x‖ ≡ |||X|||, where X ≡ [x x . . . x] ∈ Mn (2.7)

i.e. every column of the matrix X equals the vector x.
Consider the matrix norm ||| · |||β on Mn induced by ‖ · ‖. For any A ∈ Mn, we have

|||A|||β ≡ max_{x≠0} ‖Ax‖/‖x‖ = max_{x≠0} |||[Ax Ax . . . Ax]||| / |||[x x . . . x]|||
        = max_{x≠0} |||AX||| / |||X|||
        ≤ max_{x≠0} |||A||| |||X||| / |||X|||
        = |||A|||

which proves (1).
To prove (2), suppose that |||A||| ≤ |||A|||α for all A ∈ Mn. Then by (1) just proven above,
we have

|||A|||β ≤ |||A||| ≤ |||A|||α

for all A ∈ Mn. However, both ||| · |||β and ||| · |||α are induced matrix norms, hence |||A|||β = |||A|||α
by Corollary 2.2.10, and therefore |||A||| = |||A|||α for all A ∈ Mn.
The above result motivates the following definition of a minimal matrix norm.
Definition 2.2.12. A matrix norm ||| · ||| onMn is said to be a minimal matrix norm if the only
matrix norm ||| · |||α on Mn such that |||A |||α ≤ |||A ||| for all A ∈Mn is ||| · |||α = ||| · |||.
We will now establish the properties of the minimal matrix norm. The following theorem
gives some equivalent conditions.
Theorem 2.2.13. Let ||| · ||| be a matrix norm on Mn. For a given nonzero y ∈ Fⁿ, let ||| · |||y be
the matrix norm induced by the vector norm defined by ‖x‖y ≡ |||xy∗|||. Then the following are equivalent:
1. ||| · ||| is an induced matrix norm.
2. ||| · ||| is a minimal matrix norm.
3. ||| · ||| = ||| · |||y for all nonzero y ∈ Fn.
Proof. The assertion (1) implies (2) is proven in Theorem 2.2.11. Moreover, the assertion
(3) implies (1) is trivial because ||| · |||y is induced by definition. It remains to prove (2)
implies (3).
Observe that ‖ · ‖y is a vector norm on Fⁿ with the property that, for all A ∈ Mn,

‖Ax‖y = |||A(xy∗)||| ≤ |||A||| |||xy∗||| = |||A||| ‖x‖y

Now we have, for all A ∈ Mn,

|||A|||y ≡ max_{x≠0} ‖Ax‖y/‖x‖y ≤ max_{x≠0} |||A||| ‖x‖y/‖x‖y = |||A|||

If ||| · ||| is a minimal matrix norm, then the above inequality implies |||A||| = |||A|||y for all
A ∈ Mn.
Hence, we have proven the statement.
We have proven in the above theorem that the induced matrix norms are minimal among
all matrix norms. Subsequently, we will characterise the minimal matrix norms among some
important classes of matrix norms.
Definition 2.2.14. Let ||| · ||| be a matrix norm onMn such that |||A ||| = |||UAV ||| for all A ∈Mn
and all unitary matrices U, V ∈Mn. Then ||| · ||| is said to be a unitarily invariant matrix norm.
Some examples of unitarily invariant matrix norm include the Frobenius norm and the spec-
tral norm. Next, we will define the notion of adjoint of a matrix norm.
Definition 2.2.15. Let ||| · ||| be a matrix norm on Mn, then the function ||| · |||∗ defined by
|||A |||∗ = |||A∗ |||
for all A ∈Mn is a matrix norm. ||| · |||∗ is said to be the adjoint of ||| · |||.
Proof. We will show that ||| · |||∗ is a matrix norm by checking the norm axioms.
1. (Non-negative and Positive) Non-negativity follows immediately by definition. Moreover,
|||A |||∗ = 0 if and only if A∗ = 0, which implies A = 0.
2. (Homogeneous) Let c ∈ F. Then
||| cA |||∗ = ||| (cA)∗ ||| = ||| cA∗ ||| = |c| |||A∗ ||| = |c| |||A |||∗
3. (Triangle Inequality) Let A,B ∈Mn. Then
|||A+B |||∗ = ||| (A+B)∗ ||| = |||A∗ +B∗ |||
≤ |||A∗ |||+ |||B∗ |||
= |||A |||∗ + |||B |||∗
4. (Submultiplicative) Let A,B ∈Mn. Then
|||AB |||∗ = ||| (AB)∗ ||| = |||B∗A∗ ||| ≤ |||B∗ ||| |||A∗ ||| = |||A |||∗ |||B |||∗
Hence ||| · |||∗ is indeed a matrix norm.
We will now define the notion of self-adjoint matrix norm, analogous to the notion of self-
adjoint matrix.
Definition 2.2.16. Let ||| · ||| be a matrix norm on Mn and let ||| · |||∗ be its adjoint. Then ||| · ||| is
said to be self-adjoint if |||A|||∗ = |||A||| for all A ∈ Mn.
Examples of self-adjoint matrix norms include the Frobenius norm and the spectral norm.
Next, we will show that every unitarily invariant matrix norm is in fact self-adjoint.
Proposition 2.2.17. Every unitarily invariant matrix norm on Mn is self-adjoint.
Proof. Let U, V ∈ Mn be unitary matrices. Then for all A ∈ Mn, we have

|||A|||∗ = |||A∗||| = |||UA∗V||| = ||| (V∗AU∗)∗ ||| = |||V∗AU∗||| = |||A|||

where we used the definition of a unitarily invariant matrix norm, and the fact that U∗ and V∗
are also unitary.
We are now ready to find the matrix norms which are minimal among the class of unitarily
invariant matrix norms and self-adjoint matrix norms. It turns out that the minimal matrix
norm in this case is just the spectral norm.
Theorem 2.2.18. If ||| · ||| is a unitarily invariant matrix norm onMn, then |||A |||2 ≤ |||A ||| for all
A ∈Mn. The spectral norm is the only matrix norm onMn that is both induced and unitarily
invariant.
Proof. Suppose that ||| · ||| is a given unitarily invariant matrix norm. By Theorem 2.2.11, there
exists a matrix norm ||| · |||β such that |||A |||β ≤ |||A ||| for all A ∈ Mn, where ||| · |||β is induced by
the vector norm ‖ · ‖ defined in the statement (2.7) of the proof in Theorem 2.2.11.
If U ∈ Mn is unitary, then UX = [Ux Ux . . . Ux], so

‖Ux‖ = |||UX||| = |||X||| = ‖x‖

Now observe that for any nonzero x ∈ Fⁿ there exists a unitary matrix U such that Ux = ‖x‖₂e₁,
where e₁ is the basis vector of Fⁿ with first entry equal to 1 and all other entries 0. Thus, we have

‖x‖ = ‖Ux‖ = ‖ ‖x‖₂e₁ ‖ = ‖x‖₂‖e₁‖

for all x ∈ Fⁿ. The vector norm ‖ · ‖ is therefore a scalar multiple of the Euclidean norm. By
Corollary 2.2.9, ||| · |||β (the matrix norm induced by ‖ · ‖) equals ||| · |||₂ (the matrix norm induced
by ‖ · ‖₂). Therefore, |||A|||₂ = |||A|||β ≤ |||A||| for all A ∈ Mn.
If ||| · ||| is moreover assumed to be induced, then it is minimal by Theorem 2.2.13, and hence
|||A|||₂ = |||A||| for all A ∈ Mn, proving the statement.
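Taking the unitarily invariant norm to be the Frobenius norm, Theorem 2.2.18 predicts |||A|||₂ ≤ |||A|||_F for every A, with equality for rank-one matrices, whose only nonzero singular value equals their Frobenius norm. A quick numerical check of ours, assuming numpy:

```python
import numpy as np

rng = np.random.default_rng(6)
for _ in range(100):
    A = rng.standard_normal((5, 5))
    # spectral norm is dominated by the (unitarily invariant) Frobenius norm
    assert np.linalg.norm(A, 2) <= np.linalg.norm(A, 'fro') + 1e-9
u = rng.standard_normal(5)
v = rng.standard_normal(5)
R = np.outer(u, v)   # rank one: the two norms coincide
assert np.isclose(np.linalg.norm(R, 2), np.linalg.norm(R, 'fro'))
```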
Next, we will determine the minimal matrix norm in the class of self-adjoint matrix norms.
For that, we need to establish several lemmas.
Lemma 2.2.19. Let ||| · ||| be a given matrix norm on Mn. Then ||| · |||∗ is an induced matrix
norm if and only if ||| · ||| is an induced matrix norm.
Proof. Let ||| · |||α be a matrix norm on Mn such that |||A|||α ≤ |||A|||∗ = |||A∗||| for all A ∈ Mn.
Replacing A by A∗ gives |||A∗|||α ≤ |||A||| for all A ∈ Mn.
If ||| · ||| is induced, hence minimal by Theorem 2.2.13, then |||A∗|||α = |||A|||, which implies
|||A|||α = |||A∗||| = |||A|||∗ for all A ∈ Mn, and therefore ||| · |||α = ||| · |||∗. So ||| · |||∗ is a minimal
(hence induced) matrix norm.
The converse can be established by similar reasoning.
Lemma 2.2.20. Let ||| · ||| be a given matrix norm on Mn. If the matrix norm ||| · ||| is induced
by the vector norm ‖ · ‖, then ||| · |||∗ is induced by the dual norm ‖ · ‖D.
Proof. Suppose that ||| · ||| is induced by the vector norm ‖ · ‖. By the duality theorem, we have

|||A|||∗ = |||A∗||| = max_{‖x‖=1} ‖A∗x‖
         = max_{‖x‖=1} (‖A∗x‖D)D
         = max_{‖x‖=1} max_{‖z‖D=1} |(A∗x)∗z|
         = max_{‖z‖D=1} max_{‖x‖=1} |x∗Az|
         = max_{‖z‖D=1} ‖Az‖D

and hence ||| · |||∗ is induced by ‖ · ‖D by definition.
Now we are in a position to prove that the spectral norm is in fact the only induced matrix
norm in the class of self-adjoint matrix norms.
Theorem 2.2.21. The spectral norm ||| · |||2 is the only matrix norm that is both induced and
self-adjoint.
Proof. Observe that if the matrix norm ||| · ||| is induced by the vector norm ‖ · ‖ and ||| · ||| = ||| · |||∗,
then by Lemma 2.2.20, ||| · ||| is also induced by ‖ · ‖D. However, Corollary 2.2.9 says that the
vector norm inducing a given matrix norm is uniquely determined up to a positive scalar factor.
Hence, there exists some c > 0 such that ‖ · ‖D = c‖ · ‖. Now, by Proposition 1.6.7, we then
have ‖ · ‖ = ‖ · ‖₂/√c. Since the given norm is a positive multiple of the Euclidean vector norm,
both induce the same matrix norm, and we conclude that ||| · ||| = ||| · |||₂.
2.3 Generalised Matrix Norms
When we relax the submultiplicativity axiom in the definition of matrix norms, we obtain a
bigger class of norms which is useful in several important applications. In this section, we
will explore the properties of this class of matrix norms.
Definition 2.3.1 (Generalised Matrix Norm Axioms). A function G(·) :Mn → R is said to be
a generalised matrix norm on Mn or a vector norm on Mn if for all A,B ∈ Mn, the following
axioms are satisfied:
1. (Non-negative) G(A) ≥ 0.
2. (Positive) G(A) = 0 if and only if A = 0.
3. (Homogeneous) G(cA) = |c|G(A) for all scalars c ∈ F.
4. (Triangle Inequality) G(A+B) ≤ G(A) +G(B).
We will now give examples of some generalised matrix norms which are not matrix norms.
We will also check the norm axioms for some of the cases.
Example 2.3.2. Let ||| · ||| be a matrix norm on Mn and let T, S ∈ Mn be non-singular. Then the
function GT,S(·) defined by

GT,S(A) ≡ |||TAS|||

for all A ∈ Mn is a generalised matrix norm on Mn. However, in general it is not a matrix
norm on Mn.
Proof. Now, we will check that the above function is a generalised matrix norm on Mn.
1. (Non-negative and Positive) By the definition of a matrix norm, GT,S(A) ≥ 0 for all A ∈ Mn,
and GT,S(A) = 0 if and only if TAS = 0. Since T and S are non-singular, TAS = 0 forces
A = 0.
2. (Homogeneous) Let c ∈ F. Then
GT,S(cA) = |||T (cA)S ||| = |c| |||TAS ||| = |c|GT,S(A)
3. (Triangle Inequality) Let A,B ∈Mn. Then
GT,S(A+B) = |||T (A+B)S ||| = |||TAS + TBS |||
≤ |||TAS |||+ |||TBS |||
= GT,S(A) +GT,S(B)
Hence, GT,S(·) is a generalised matrix norm on Mn.
Taking

T = A = B = I and S = (1/8 1/4; 1/4 1/8)

and the matrix norm to be ||| · |||∞, we have GT,S(AB) = |||S|||∞ = 3/8, but GT,S(A)GT,S(B) =
|||S|||∞ |||S|||∞ = 9/64 < 3/8. Hence the submultiplicativity axiom is not satisfied, and GT,S(·) is
not a matrix norm on Mn in general.
Example 2.3.3. Define the Hadamard product of two matrices A = [aij ] and B = [bij ] of the
same size to be the entry-wise product A ◦ B ≡ [aijbij ]. If H = [hij ] ∈ Mn is a given matrix
with no zero entries, and if ||| · ||| is any matrix norm on Mn, then the function GH(·) given by
GH(A) ≡ |||H ◦A |||
is a generalised matrix norm on Mn.
However, in general, it is not a matrix norm on Mn.
Proof. We will check that the above function is a vector norm on Mn.
1. By definition of matrix norm, we have GH(A) ≥ 0 for all A ∈Mn. Moreover, GH(A) = 0
if and only if H ◦A = 0. As H contains no zero entries, we must have A = 0.
2. Let A = [aij ] ∈Mn and c ∈ F. Then we have
GH(cA) = |||H ◦ cA ||| = ||| [hij caij ] ||| = ||| c[hij aij ] ||| = |c| |||H ◦A ||| = |c|GH(A)
3. Let A = [aij ], B = [bij ] ∈Mn. Then we have
GH(A+B) = |||H ◦ (A+B) ||| = ||| [hij ] ◦ [aij + bij ] |||
= ||| [hij(aij + bij)] |||
= ||| [hij aij ] + [hij bij ] |||
= |||H ◦A+H ◦B |||
≤ |||H ◦A |||+ |||H ◦B |||
= GH(A) +GH(B)
Hence, GH is a generalised matrix norm on Mn.
Taking

H = (1/2 1/2; 1/2 1/2), A = (0 1; 0 0), B = (0 0; 1 0)

and the matrix norm to be ||| · |||∞, we have AB = (1 0; 0 0), so GH(AB) = |||H ◦ AB|||∞ = 1/2,
but GH(A)GH(B) = (1/2)(1/2) = 1/4 < 1/2. Hence the submultiplicativity axiom is not satisfied,
and GH(·) is not a matrix norm on Mn in general.
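The Hadamard counterexample above can be verified directly; the sketch below (ours, assuming numpy, whose `*` operator on arrays is the entry-wise Hadamard product) reproduces the numbers GH(AB) = 1/2 and GH(A)GH(B) = 1/4.

```python
import numpy as np

def row_sum_norm(M):
    # |M|_inf = maximum row sum of absolute values
    return np.abs(M).sum(axis=1).max()

H = np.full((2, 2), 0.5)
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0, 0.0], [1.0, 0.0]])
G = lambda M: row_sum_norm(H * M)   # G_H(M) = |H o M|_inf

# submultiplicativity fails: G(AB) = 1/2 > 1/4 = G(A) G(B)
assert G(A @ B) == 0.5
assert G(A) * G(B) == 0.25
assert G(A @ B) > G(A) * G(B)
```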
Example 2.3.4. The function G∞(·) defined by
G∞(A) ≡ max_{1≤i,j≤n} |a_ij|
for all A = [aij ] ∈ Mn is a generalised matrix norm on Mn. However, it can be checked that
G∞(·) is not a matrix norm on Mn in general.
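The failure of submultiplicativity for G∞(·) can already be seen for n = 2 with the all-ones matrix; a short check in Python with NumPy:

```python
import numpy as np

# G_inf(A): the largest entry of A in absolute value
G_inf = lambda A: np.abs(A).max()

A = np.ones((2, 2))
print(G_inf(A @ A), G_inf(A) * G_inf(A))  # 2.0 1.0 -> G_inf(A A) > G_inf(A) G_inf(A)
```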
Some properties of matrix norms do carry over to the case of generalised matrix norms. One
of the useful properties that generalised matrix norms also enjoy is the equivalence property in
finite-dimensional vector space.
Theorem 2.3.5. Let G(·) and Gα(·) be generalised matrix norms on Mn. Then there exist finite positive constants cm and CM such that

cm Gα(A) ≤ G(A) ≤ CM Gα(A)    (2.8)

for all A ∈ Mn. In particular, this inequality also holds when Gα(·) is replaced by any matrix norm ||| · ||| on Mn, i.e. we also have

cm |||A ||| ≤ G(A) ≤ CM |||A |||    (2.9)

for all A ∈ Mn.
Proof. This follows almost immediately from Theorem 1.4.8. It remains to prove that G(·) is a continuous function. The statement of Lemma 2.1.7 concerning the uniform continuity of matrix norms is also valid for generalised matrix norms: the definition of a generalised matrix norm differs from that of a matrix norm only in the absence of the submultiplicativity property, and nowhere in the proof of Lemma 2.1.7 did we use the fact that a matrix norm is submultiplicative. Hence, we have the conclusion.
Considering that the definition of a generalised matrix norm differs from that of a matrix norm only in the submultiplicativity property, we would expect these two classes of norms to be closely related. The following result shows that a generalised matrix norm can be made into a matrix norm by multiplying it by an appropriate positive constant. First, we prove a lemma.
Lemma 2.3.6. Let G(·) be a generalised matrix norm on Mn. Define

c(G) ≡ max_{A≠0, B≠0} G(AB)/(G(A)G(B))    (2.10)

If ||| · ||| is a matrix norm on Mn such that

cm |||A ||| ≤ G(A) ≤ CM |||A |||    (2.11)

for all A ∈ Mn, then c(G) ≤ CM/cm^2.
Proof. We have

c(G) ≡ max_{A≠0, B≠0} G(AB)/(G(A)G(B)) = max_{A≠0, B≠0} G( AB/(G(A)G(B)) ) = max_{G(A′)=G(B′)=1} G(A′B′)

which is finite and positive by the continuity of G(·) and the compactness of its unit sphere. Then we have

G(AB) ≤ CM |||AB ||| ≤ CM |||A ||| |||B ||| ≤ CM [G(A)/cm] [G(B)/cm] = (CM/cm^2) G(A)G(B)

This implies

G(AB)/(G(A)G(B)) ≤ CM/cm^2

for all nonzero A, B ∈ Mn. Hence we have c(G) ≤ CM/cm^2 as required.
Theorem 2.3.7. Let G(·) be a generalised matrix norm on Mn and let c(G) be as defined in Lemma 2.3.6. Define the function ||| · ||| by |||A ||| ≡ kG(A), where k is a positive constant. If k ≥ c(G), then ||| · ||| is a matrix norm.
In particular, ||| · ||| ≡ (CM/cm^2) G(·) is a matrix norm.
Proof. We will check that ||| · ||| satisfies the matrix norm axioms.
1. (Non-negative and Positive) Clearly a positive multiple of a generalised matrix norm is
always non-negative. Moreover, |||A ||| = 0 if and only if G(A) = 0, which implies A = 0.
2. (Homogeneous) Let c ∈ F. Then
||| cA ||| = kG(cA) = |c|kG(A) = |c| |||A |||
3. (Triangle Inequality) Let A,B ∈Mn. Then
|||A+B ||| = kG(A+B) ≤ k(G(A) +G(B)) = |||A |||+ |||B |||
4. (Submultiplicativity) Let A, B ∈ Mn. Then, by the definition of c(G) and since k ≥ c(G),

|||AB ||| = kG(AB) ≤ k c(G) G(A)G(B) ≤ kG(A) kG(B) = |||A ||| |||B |||

Hence, ||| · ||| is a matrix norm.
In particular, ||| · ||| ≡ (CM/cm^2) G(·) is a matrix norm, as CM/cm^2 ≥ c(G) by Lemma 2.3.6, so that it satisfies the hypothesis of the above theorem.
At this point, to derive further properties of generalised matrix norms, we are going to define
the notion of spectral radius of a matrix. Further properties and applications of this spectral
radius will be given in the next chapter.
Definition 2.3.8 (Spectral Radius). The spectral radius ρ(A) of a matrix A ∈Mn is
ρ(A) ≡ max{|λ| : λ is an eigenvalue of A}
Below is the relation between spectral radius and matrix norm that we require at the moment.
Theorem 2.3.9. If ||| · ||| is any matrix norm on Mn, then ρ(A) ≤ |||A ||| for all A ∈Mn.
Proof. Observe that if λ is any eigenvalue of A, then |λ| ≤ ρ(A). Moreover, there is at least
one eigenvalue λ for which |λ| = ρ(A). Let x be an eigenvector associated with λ. Consider the
matrix X ∈Mn, all the columns of which are equal to the eigenvector x. Then we have
|λ| |||X ||| = |||λX ||| = |||AX ||| ≤ |||A ||| |||X |||
and hence |λ| = ρ(A) ≤ |||A ||| as required.
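Theorem 2.3.9 is easy to test numerically. Below is a sketch in Python with NumPy, using the maximum row sum norm ||| · |||∞ (an induced, hence genuine, matrix norm); the helper names are ours:

```python
import numpy as np

def spectral_radius(A):
    # rho(A): the largest modulus of an eigenvalue of A
    return max(abs(np.linalg.eigvals(A)))

def norm_inf(A):
    # Maximum absolute row sum, the matrix norm induced by the l-infinity vector norm
    return np.abs(A).sum(axis=1).max()

A = np.array([[ 1.0, -2.0, 0.5],
              [ 0.3,  0.0, 1.0],
              [-1.0,  2.0, 0.2]])
print(spectral_radius(A) <= norm_inf(A))  # True
```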
Next, we will define the notion of compatible matrix norm as well as compatible vector norm
on matrices.
Definition 2.3.10. The vector norm ‖ · ‖ on a finite-dimensional vector space V over the field
F (R or C) is said to be compatible with the generalised matrix norm G(·) in Mn if
‖Ax‖ ≤ G(A)‖x‖
for all x ∈ V and all A ∈Mn.
Similarly, ‖ · ‖ is compatible with the matrix norm ||| · ||| on Mn if
‖Ax‖ ≤ |||A ||| ‖x‖
for all x ∈ V and all A ∈Mn.
In Proposition 2.2.3, we have shown that, given a vector norm on Fn, there is a matrix norm compatible with it (the induced matrix norm). Here, we will show the converse.
Proposition 2.3.11. If ||| · ||| is a matrix norm on Mn, then there is some vector norm on Fn
that is compatible with it.
Proof. If we define a vector norm ‖ · ‖ on Fn by ‖x‖ ≡ ||| [x 0 . . . 0] |||, then we have
‖Ax‖ = ||| [Ax 0 . . . 0] ||| = |||A[x 0 . . . 0] |||
≤ |||A ||| ||| [x 0 . . . 0] |||
= |||A ||| ‖x‖
Hence, we have found a vector norm ‖ · ‖ compatible with the given ||| · ||| as required.
Now we are going to extend the above notion of compatibility to the case of generalised matrix norms. Here the situation is more complicated, as some generalised matrix norms on Mn have compatible vector norms on Fn, while others do not. This will be explored in the next few theorems.
Theorem 2.3.12. Let G(·) be a generalised matrix norm on Mn that has a compatible vector
norm ‖ · ‖ on Fn. Then G(A) ≥ ρ(A) for all A ∈Mn. More generally,
G(A1)G(A2) . . . G(Ak) ≥ ρ(A1A2 . . . Ak) (2.12)
for all A1, A2, . . . , Ak ∈Mn and all k = 1, 2, . . ..
Proof. We will first prove the following claim.
Claim: G(A1)G(A2) . . . G(Ak)‖x‖ ≥ ‖A1A2 . . . Akx‖ for A1, . . . , Ak ∈ Mn and x ∈ Fn, where
‖ · ‖ is compatible with G(·).
Proof of Claim: We will proceed by induction.
Let P (k) be the statement ‘G(A1)G(A2) . . . G(Ak)‖x‖ ≥ ‖A1A2 . . . Akx‖ for A1, . . . , Ak ∈ Mn
and x ∈ Fn’.
Clearly, P(1) is true by the definition of ‖ · ‖ being compatible with G(·).
Suppose P(m) is true for some positive integer m. We will prove that P(m + 1) is true. We have

G(A1) . . . G(Am)G(Am+1)‖x‖ ≥ G(A1) . . . G(Am)‖Am+1 x‖ ≥ ‖A1 . . . Am Am+1 x‖

where the first inequality uses P(1) applied to Am+1, and the second uses the induction hypothesis with Am+1 x in place of x. Hence P(m + 1) is true, and the claim is proven by mathematical induction.
Now, let x be a nonzero vector such that A1 . . . Ak x = λx, where |λ| = ρ(A1 . . . Ak). Then

G(A1)G(A2) . . . G(Ak)‖x‖ ≥ ‖A1A2 . . . Ak x‖ = ‖λx‖ = ρ(A1A2 . . . Ak)‖x‖

Dividing by ‖x‖ > 0, the conclusion follows.
We have observed in Theorem 2.3.12 a necessary condition for a generalised matrix norm on Mn to have a compatible vector norm on Fn. This condition is in fact also sufficient. To show this, we need another lemma.
Lemma 2.3.13. Let G(·) be a generalised matrix norm on Mn that satisfies condition (2.12) in Theorem 2.3.12, and let ||| · |||2 denote the spectral norm on Mn. Then there exists a finite positive constant c = c(G) such that

G(A1)G(A2) . . . G(Ak) ≥ c |||A1A2 . . . Ak |||2

for all A1, A2, . . . , Ak ∈ Mn and all k = 1, 2, . . ..
Proof. Let k be a given positive integer and let A1, A2, . . . , Ak ∈Mn be given.
By singular value decomposition theorem, there exist unitary matrices V and W and a diagonal
matrix Σ = diag(σ1, σ2, . . . , σn) with all σi ≥ 0 for i = 1, . . . , n, such that A1A2 . . . Ak = V ΣW ∗
and ρ(Σ) = max{σ1, . . . , σn} = |||A1A2 . . . Ak |||2.
(Refer to [2] for more details on singular value decomposition theorem, and the proof of the
above assertion).
By the hypothesis (applied to the k + 2 matrices V∗, A1, . . . , Ak, W), we have

G(V∗)G(A1)G(A2) . . . G(Ak)G(W) ≥ ρ(V∗A1A2 . . . AkW) = ρ(Σ) = |||Σ |||2 = |||V∗A1A2 . . . AkW |||2 = |||A1A2 . . . Ak |||2

where we used the fact that the spectral norm is unitarily invariant.
Furthermore, by the equivalence between generalised matrix norms and matrix norms on Mn, there exists a finite positive constant b = b(G) such that |||A |||2 ≥ bG(A) for all A ∈ Mn. We then have

G(A1)G(A2) . . . G(Ak) ≥ [1/(G(V∗)G(W))] |||A1A2 . . . Ak |||2 ≥ [b^2/(|||V∗ |||2 |||W |||2)] |||A1A2 . . . Ak |||2 = b^2 |||A1A2 . . . Ak |||2

since |||V∗ |||2 = |||W |||2 = 1 for unitary matrices. The conclusion follows by taking c = b^2.
Now, we are ready to prove that this condition is also sufficient for a generalised matrix norm to have a compatible vector norm.
Theorem 2.3.14. Let G(·) be a generalised matrix norm on Mn. There exists a vector norm
‖ · ‖ on Fn such that
‖Ax‖ ≤ G(A)‖x‖
for all x ∈ Fn and all A ∈Mn if and only if
G(A1)G(A2) . . . G(Ak) ≥ ρ(A1A2 . . . Ak)
for all A1, A2, . . . , Ak ∈Mn and all k = 1, 2, . . ..
Proof. Necessity has been proven in Theorem 2.3.12. For sufficiency, we will show that there exists a matrix norm ||| · ||| on Mn such that G(A) ≥ |||A ||| for all A ∈ Mn. Let ‖ · ‖ be a vector norm on Fn that is compatible with ||| · ||| (which is guaranteed to exist by Proposition 2.3.11), and let x ∈ Fn and A ∈ Mn be given. Then we have ‖Ax‖ ≤ |||A ||| ‖x‖ ≤ G(A)‖x‖, so we will be done if we can construct a matrix norm that is dominated by G(·).
Now observe that for a given matrix A ∈ Mn, there are various ways to represent A as a
product of matrices or as a sum of products of matrices. Define
|||A ||| ≡ inf { Σ_i G(A_{i1}) . . . G(A_{ik_i}) : Σ_i A_{i1} . . . A_{ik_i} = A and all A_{ij} ∈ Mn }    (2.13)
It remains to check that ||| · ||| defined above is indeed a matrix norm.
1. (Non-negative and Positive) By Lemma 2.3.13 and the triangle inequality for the spectral norm, every representation of A satisfies

Σ_i G(A_{i1}) . . . G(A_{ik_i}) ≥ Σ_i c |||A_{i1} . . . A_{ik_i} |||2 ≥ c ||| Σ_i A_{i1} . . . A_{ik_i} |||2 = c |||A |||2

Taking the infimum over all representations, it follows that |||A ||| ≥ c |||A |||2 ≥ 0, so ||| · ||| is non-negative. Moreover, |||A ||| = 0 if and only if A = 0, for if A ≠ 0 then by the inequality above, |||A ||| ≥ c |||A |||2 > 0 by the positivity of the spectral norm.
2. (Homogeneous) Let c ∈ F with c ≠ 0. Scaling one factor of each product by 1/c turns a representation of cA into a representation of A, and conversely, so by the homogeneity of G(·) we have

||| cA ||| = inf { Σ_i G(B_{i1}) . . . G(B_{ik_i}) : Σ_i B_{i1} . . . B_{ik_i} = cA and all B_{ij} ∈ Mn }
         = inf { |c| Σ_i G(B_{i1}) . . . G((1/c)B_{ij}) . . . G(B_{ik_i}) : Σ_i B_{i1} . . . (1/c)B_{ij} . . . B_{ik_i} = A }
         = |c| |||A |||

The case c = 0 is immediate, since G(0) = 0.
3. (Triangle Inequality) Let A, B ∈ Mn and let C = A + B. Consider the following sets:

A′ ≡ { Σ_i G(A_{i1}) . . . G(A_{ik_i}) : Σ_i A_{i1} . . . A_{ik_i} = A and all A_{ij} ∈ Mn }
B′ ≡ { Σ_i G(B_{i1}) . . . G(B_{ik_i}) : Σ_i B_{i1} . . . B_{ik_i} = B and all B_{ij} ∈ Mn }
C′ ≡ { Σ_i G(C_{i1}) . . . G(C_{ik_i}) : Σ_i C_{i1} . . . C_{ik_i} = C and all C_{ij} ∈ Mn }

Each set defined above is the set of sums of products of generalised matrix norms over the various representations of the matrix concerned as a sum of products. Define the addition of sets by

A′ + B′ ≡ {a + b : a ∈ A′, b ∈ B′}

Then we have (A′ + B′) ⊆ C′, because every pair of representations of A and B as sums of products yields a representation of C as a sum of products, although not all representations of C arise in this way. Therefore

|||A + B ||| = |||C ||| = inf C′ ≤ inf(A′ + B′) = inf(A′) + inf(B′) = |||A ||| + |||B |||
4. (Submultiplicative) Let A, B ∈ Mn and let C = AB, with the sets A′, B′, C′ defined as above (now for this C). Define the product of sets by

A′B′ ≡ {ab : a ∈ A′, b ∈ B′}

Then we have A′B′ ⊆ C′, because multiplying out a representation of A and a representation of B as sums of products yields a representation of C as a sum of products, although not all representations of C arise in this way. Therefore

|||AB ||| = |||C ||| = inf C′ ≤ inf(A′B′) = inf(A′) inf(B′) = |||A ||| |||B |||
Hence, we have proven that ||| · ||| is a matrix norm and the conclusion follows.
We have now established a useful necessary and sufficient condition for a generalised matrix norm on Mn to have a compatible vector norm on Fn. Furthermore, we also know that, given a vector norm on Fn, there exists a matrix norm on Mn that is compatible with it (namely the induced matrix norm). Now we are going to show that, given a vector norm on Fn, one can always find a compatible generalised matrix norm on Mn that is not submultiplicative (i.e. not a matrix norm).
Proposition 2.3.15. Let ‖ · ‖ be a given vector norm on Fn. Then there exists a generalised
matrix norm G(·) on Mn, which is not a matrix norm and is such that
‖Ax‖ ≤ G(A)‖x‖
for all x ∈ Fn and all A ∈Mn.
Proof. Let P ∈ Mn be any permutation matrix whose main diagonal entries are all zero. For instance, take P = [p_ij] with

p_ij = 1 if j = i + 1, or if i = n and j = 1;  p_ij = 0 otherwise

Let ||| · ||| denote the matrix norm on Mn which is induced by the vector norm ‖ · ‖, and let A = [a_ij] ∈ Mn. Define G(·) on Mn by

G(A) ≡ |||A ||| + |||P ||| |||P^T ||| max_{1≤i≤n} |a_ii|
We will now verify that G(·) is a generalised matrix norm onMn. Non-negativity and Positivity
axioms are almost immediate. We will check the remaining axioms.
1. (Homogeneous) Let c ∈ F. Then

G(cA) = ||| cA ||| + |||P ||| |||P^T ||| max_{1≤i≤n} |c a_ii|
      = |c| |||A ||| + |c| |||P ||| |||P^T ||| max_{1≤i≤n} |a_ii|
      = |c| ( |||A ||| + |||P ||| |||P^T ||| max_{1≤i≤n} |a_ii| )
      = |c| G(A)

2. (Triangle Inequality) Let A, B ∈ Mn. Then

G(A + B) = |||A + B ||| + |||P ||| |||P^T ||| max_{1≤i≤n} |a_ii + b_ii|
         ≤ |||A ||| + |||B ||| + |||P ||| |||P^T ||| ( max_{1≤i≤n} |a_ii| + max_{1≤i≤n} |b_ii| )
         = G(A) + G(B)
Hence G(·) is a generalised matrix norm on Mn. Moreover, G(A) ≥ |||A ||| for all A ∈ Mn, and

‖Ax‖ ≤ |||A ||| ‖x‖ ≤ G(A)‖x‖

for all A ∈ Mn and all x ∈ Fn. Observe that P is orthogonal and ||| I ||| = 1 by Proposition 2.2.3. Since the main diagonal entries of P and P^T are all zero, we have

G(PP^T) = G(I) = ||| I ||| + |||P ||| |||P^T ||| = 1 + |||P ||| |||P^T |||,
G(P) = |||P |||,   G(P^T) = |||P^T |||

Hence, we have

G(PP^T) = 1 + G(P)G(P^T) > G(P)G(P^T)

Therefore, the generalised matrix norm G(·) on Mn is compatible with the given vector norm ‖ · ‖ on Fn, but it is not submultiplicative.
We have observed in Theorems 2.3.12 and 2.3.14 that the condition G(A1) . . . G(Ak) ≥ ρ(A1 . . . Ak) is necessary and sufficient for a generalised matrix norm on Mn to have a compatible vector norm ‖ · ‖ on Fn; in particular, any such norm must satisfy G(A) ≥ ρ(A) for all A ∈ Mn. Subsequently, we will study and characterise the generalised matrix norms on Mn that have this latter property.
Definition 2.3.16. Let G(·) be a generalised matrix norm on Mn. If G(A) ≥ ρ(A) for all
A ∈Mn, then G(·) is said to be spectrally dominant.
Definition 2.3.17. Let G(·) be a generalised matrix norm on Mn. Define the spectral charac-
teristic of G(·) to be
m(G) = max_{G(A)≤1} ρ(A)
A generalised matrix norm G(·) onMn is said to be minimally spectrally dominant if m(G) = 1.
Observe that a matrix norm induced by a vector norm is an example of a minimally spectrally dominant matrix norm.
Proposition 2.3.18. Any induced matrix norm ||| · ||| on Mn is minimally spectrally dominant.

Proof. We will show that m(||| · |||) = 1. We have

m(||| · |||) = max_{|||A|||≤1} ρ(A) ≤ max_{|||A|||≤1} |||A ||| ≤ 1

Moreover, we have ||| I ||| = 1 by Proposition 2.2.3 and ρ(I) = 1, hence the maximum above is actually attained, and we have m(||| · |||) = 1 as required.
We will now explore some of the properties of spectrally dominant generalised matrix norms.
Proposition 2.3.19. Let G(·) be a generalised matrix norm on Mn. Then G(·) is spectrally
dominant if and only if m(G) ≤ 1.
Proof. First, suppose G(A) ≥ ρ(A) for all A ∈ Mn. Then we have

m(G) = max_{G(A)≤1} ρ(A) ≤ max_{G(A)≤1} G(A) ≤ 1

Conversely, suppose m(G) ≤ 1. Then we have

max_{A≠0} ρ(A)/G(A) = max_{A≠0} ρ( A/G(A) ) = max_{G(A)=1} ρ(A) ≤ max_{G(A)≤1} ρ(A) = m(G) ≤ 1

Thus we have ρ(A) ≤ G(A) for all nonzero matrices A. For A = 0, the result is trivially true, hence proving the statement for all A ∈ Mn.
Now, any generalised matrix norm can be made spectrally dominant by multiplying it by a suitable constant, as illustrated below.
Theorem 2.3.20. Let G(·) be a generalised matrix norm on Mn. Then G′(·) defined by

G′(A) ≡ m(G)G(A)

for all A ∈ Mn is a spectrally dominant generalised matrix norm on Mn.

Proof. It can be checked that G′(·) is a vector norm on Mn. We now need to show that G′(A) ≥ ρ(A) for all A ∈ Mn. For A ≠ 0, we have

ρ(A)/G(A) ≤ max_{B≠0} ρ(B)/G(B) = max_{B≠0} ρ( B/G(B) ) = max_{G(B)=1} ρ(B) ≤ max_{G(B)≤1} ρ(B) = m(G)

This implies

m(G)G(A) ≥ ρ(A)    (2.14)

for all A ∈ Mn (the case A = 0 being trivial). Hence G′(A) ≥ ρ(A) for all A ∈ Mn as required.
We will end this chapter with some sufficient conditions for a generalised matrix norm to be spectrally dominant. For that, we need to prove a lemma.
Lemma 2.3.21. Let G(·) be a generalised matrix norm on Mn, and let A1, A2, . . . be a sequence in Mn such that

ρ(Aj) = 1

for j = 1, 2, . . .. Then G(Aj) does not tend to 0 as j → ∞.

Proof. Suppose, for contradiction, that G(Aj) → 0 as j → ∞. Whenever ρ(B) = 1 for some B ∈ Mn, we have m(G)G(B) ≥ ρ(B) = 1 by statement (2.14) in Theorem 2.3.20, which implies G(B) ≥ 1/m(G) > 0. However, for the above sequence we have G(Aj) < 1/m(G) for j sufficiently large, which is a contradiction.
Theorem 2.3.22. Let G(·) be a generalised matrix norm on Mn. If for each A ∈ Mn there exists a constant γA (depending on G(·) and A) such that for all integers k > 0,

G(A^k) ≤ γA G(A)^k    (2.15)

then G(·) is spectrally dominant.
Proof. Suppose that such a γA exists for each A ∈ Mn. Suppose, for contradiction, that ρ(A) = m but G(A) < m for some A.
Then G(A)/m < 1, which implies (1/m^k) G(A)^k → 0 as k → ∞. This implies (γA/m^k) G(A)^k → 0. By statement (2.15), this implies

(1/m^k) G(A^k) → 0,  i.e.  G( (1/m^k) A^k ) → 0 as k → ∞.

Moreover, we have ρ(A^k) = m^k, which implies ρ( (1/m^k) A^k ) = 1 for all k. However, this contradicts Lemma 2.3.21, hence proving the statement.
Another application of Lemma 2.3.21 yields a different sufficient condition for spectral dominance.
Proposition 2.3.23. Let G(·) be a generalised matrix norm on Mn. If for some fixed positive integer k,

G(A^k) ≤ G(A)^k    (2.16)

for all A ∈ Mn, then G(·) is spectrally dominant.
Proof. We will first prove that G(A^(k^l)) ≤ G(A)^(k^l) for all positive integers l. Let P(l) be this statement. The case l = 1 is given to be true. Suppose the statement is true for some l. We will show that P(l + 1) is true. Applying (2.16) to the matrix A^(k^l),

G(A^(k^(l+1))) = G[ (A^(k^l))^k ] ≤ G(A^(k^l))^k ≤ [ G(A)^(k^l) ]^k = G(A)^(k^(l+1))

proving the statement.
Now, suppose for contradiction that ρ(A) = m but G(A) < m. Then, as in the proof of Theorem 2.3.22, we have ρ( (1/m^(k^l)) A^(k^l) ) = 1 for all l, while (1/m^(k^l)) G(A^(k^l)) → 0 as l → ∞, i.e. G( (1/m^(k^l)) A^(k^l) ) → 0, which is a contradiction by Lemma 2.3.21. Hence G(·) is spectrally dominant.
Chapter 3
Applications of Norms
In this chapter, some applications of the various classes of norms discussed in previous chapters
will be given. In particular, we will discuss the notion of convergence of series of matrices
and bounds for the roots of algebraic equations using the norms we have discussed so far.
Furthermore, we will give some simple applications of norms on perturbation of eigenvalues,
which is an important notion in numerical linear algebra.
3.1 Sequences and Series of Matrices
We will discuss the infinite sequences and series of matrices, as well as power series of matrices,
in this section. This can be thought of as a generalisation of infinite sequences and series of real
numbers. We will need the notion of spectral radius as defined in Definition 2.3.8 as well as its
basic property in Theorem 2.3.9. Now we will define the notion of convergence inMn formally.
This is a natural extension of the convergence of sequences of vectors in Fn.
Note that in the following, we do not specify with respect to which matrix norm the sequence converges, as all matrix norms on Mn are equivalent by Theorem 2.1.9.
Definition 3.1.1. Let ||| · ||| be a matrix norm on Mn. Then the sequence {Ak} of matrices in
Mn converges to a matrix A ∈ Mn if and only if |||Ak −A ||| → 0 as k → ∞. In such case, we
write
lim_{k→∞} A_k = A
Definition 3.1.2. Let ||| · ||| be a matrix norm on Mn. Then the sequence {Ak} of matrices in
Mn is said to be a Cauchy sequence if for each ε > 0, there exists a positive integer N = N(ε),
such that whenever m,n ≥ N ,
|||Am −An ||| < ε
With these definitions, the equivalence of matrix norms on Mn established earlier, and the similarity between the axioms of vector norms and matrix norms, all the analytic properties of vector norms carry over to the case of matrix norms. Below we state some of them. The proofs are essentially the same as those for vector norms in Section 1.4 (replacing vector norms with matrix norms, and a basis of Fn with a basis of Mn).
Theorem 3.1.3. Let ||| · ||| be a matrix norm on Mn and let A,B ∈ Mn. If Ak → A and
Ak → B, then A = B.
Theorem 3.1.4. Let ||| · ||| be a matrix norm on Mn and let {Ak} be a sequence of matrices on
Mn. The sequence {Ak} converges to a matrix A ∈Mn if and only if it is a Cauchy sequence.
We will now formulate several lemmas to determine the behaviour of the sequence of matrices
{Ak} as k → ∞, which will be useful later to determine the convergence of power series of
matrices.
Lemma 3.1.5. If ||| · ||| is a matrix norm on Mn and if S ∈ Mn is non-singular, then the function ||| · |||S defined by

|||A |||S ≡ |||SAS^{-1} |||

is a matrix norm.
Proof. The Non-negativity and Homogeneity axioms are easy to verify. We will check that the remaining axioms are satisfied.

1. (Triangle Inequality) Let A, B ∈ Mn. Then

|||A + B |||S = |||S(A + B)S^{-1} ||| = |||SAS^{-1} + SBS^{-1} ||| ≤ |||SAS^{-1} ||| + |||SBS^{-1} ||| = |||A |||S + |||B |||S

2. (Submultiplicative) Let A, B ∈ Mn. Then

|||AB |||S = |||SABS^{-1} ||| = ||| (SAS^{-1})(SBS^{-1}) ||| ≤ |||SAS^{-1} ||| |||SBS^{-1} ||| = |||A |||S |||B |||S

Hence, ||| · |||S is a matrix norm on Mn.
Lemma 3.1.6. Let A ∈ Mn and ε > 0 be given. Then there exists a matrix norm ||| · ||| such
that ρ(A) ≤ |||A ||| ≤ ρ(A) + ε.
Proof. By Schur’s Triangularisation Theorem (see [3] for proof), there exists a unitary matrix
U and upper triangular matrix ∆, such that A = U∗∆U . Let Dt ≡ diag (t, t2, . . . , tn), and let
the entries of ∆ be
[∆]ij =
{dij if i ≥ j0 otherwise
Then we have
Dt∆D−1t =
d11 t−1d12 t−2d13 . . . t−n+1d1n
0 d22 t−1d23 . . . t−n+2d2n
... 0 d33 . . ....
0...
.... . .
...
0 0 0 . . . dnn
Hence, by taking t > 0 sufficiently large, we can make the sum of all the absolute values of the
off-diagonal entries arbitrarily small (i.e. less than ε). Then taking the maximum column sum
matrix norm, we have∣∣∣∣∣∣Dt∆D
−1t
∣∣∣∣∣∣1≤ ρ(A) + ε for t > 0 sufficiently large.
Now, define the function ||| · ||| by

|||A ||| ≡ |||Dt U A U∗ Dt^{-1} |||1 = ||| (Dt U) A (Dt U)^{-1} |||1    (3.1)

(note that U A U∗ = ∆), which is a matrix norm on Mn by Lemma 3.1.5. Hence, we have constructed a matrix norm such that |||A ||| ≤ ρ(A) + ε. Since |||A ||| ≥ ρ(A) for any matrix norm by Theorem 2.3.9, the conclusion follows.
Note that Lemma 3.1.6 implies that ρ(A) = inf{|||A ||| : ||| · ||| is a matrix norm}.
Lemma 3.1.7. Let A ∈ Mn be a given matrix. If there is a matrix norm ||| · ||| such that |||A ||| < 1, then lim_{k→∞} A^k = 0; that is, all the entries of A^k tend to zero as k → ∞.

Proof. If |||A ||| < 1, then |||A^k ||| ≤ |||A |||^k → 0 as k → ∞. As all matrix norms on Mn are equivalent, taking ||| · ||| to be the maximum row sum norm ||| · |||∞ shows that the entries of A^k tend to 0 as k → ∞.
We are now ready to state and prove a theorem which explains the behaviour of A^k as k → ∞.
Theorem 3.1.8. Let A ∈ Mn. Then lim_{k→∞} A^k = 0 if and only if ρ(A) < 1. In this case, the matrix A is said to be convergent.
Proof. Suppose A^k → 0 as k → ∞. If x ≠ 0 is an eigenvector of A such that Ax = λx, then A^k x = λ^k x → 0, which holds if and only if |λ| < 1. Since this inequality must hold for every eigenvalue λ of A, we conclude that ρ(A) < 1.
Conversely, suppose ρ(A) < 1. Then by Lemma 3.1.6, there exists some matrix norm ||| · ||| such that |||A ||| < 1. Thus, A^k → 0 as k → ∞ by Lemma 3.1.7.
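A quick numerical illustration of Theorem 3.1.8, sketched in Python with NumPy (spectral_radius is our own helper):

```python
import numpy as np

def spectral_radius(A):
    # rho(A): the largest modulus of an eigenvalue of A
    return max(abs(np.linalg.eigvals(A)))

A = np.array([[0.5, 0.4],
              [0.1, 0.3]])
print(spectral_radius(A) < 1)                       # True: A is convergent
print(np.abs(np.linalg.matrix_power(A, 60)).max())  # the entries of A^60 are essentially zero
```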
We can derive several corollaries from the above theorem. One of the most useful is Gelfand's formula for the spectral radius; another is a useful bound on the size of the entries of A^k as k → ∞.
Corollary 3.1.9. Let A ∈ Mn be a given matrix, and let ε > 0 be given. Then there exists a
constant C = C(A, ε) such that
|(Ak)ij | ≤ C[ρ(A) + ε]k
for all k = 1, 2, . . . and all i, j = 1, 2, . . . , n, where (Ak)ij denotes the (i, j)-entry of the matrix
Ak.
Proof. Consider the matrix Ã ≡ [ρ(A) + ε]^{-1} A, which has spectral radius ρ(Ã) = ρ(A)/(ρ(A) + ε) < 1. Then by Theorem 3.1.8, Ã^k → 0 as k → ∞.
In particular, the entries of the matrices in the sequence {Ã^k}_{k=1}^∞ are bounded, i.e. there exists a constant C > 0 such that |(Ã^k)_ij| ≤ C for all k and all i, j. This implies |(A^k)_ij| ≤ C[ρ(A) + ε]^k as required.
Corollary 3.1.10 (Gelfand’s Formula). Let ||| · ||| be a matrix norm on Mn. Then
ρ(A) = limk→∞
∣∣∣∣∣∣∣∣∣Ak ∣∣∣∣∣∣∣∣∣ 1kfor all A ∈Mn.
Proof. Since ρ(A)k = ρ(Ak) ≤∣∣∣∣∣∣Ak ∣∣∣∣∣∣, we have ρ(A) ≤
∣∣∣∣∣∣Ak ∣∣∣∣∣∣ 1k for all k = 1, 2, . . ..
Given ε > 0, the matrix A = [ρ(A) + ε]−1A has spectral radius strictly less than 1, hence it is
convergent.
By definition of limits, there exists natural number N , such that∣∣∣∣∣∣∣∣∣ Ak ∣∣∣∣∣∣∣∣∣ < 1 for all k ≥ N , i.e.∣∣∣∣∣∣Ak ∣∣∣∣∣∣ ≤ [ρ(A) + ε]k for all k ≥ N , which implies
∣∣∣∣∣∣Ak ∣∣∣∣∣∣ 1k ≤ ρ(A) + ε.
Then we have ρ(A) ≤∣∣∣∣∣∣Ak ∣∣∣∣∣∣ 1k ≤ ρ(A) + ε. Since ε > 0 is arbitrary, the result follows.
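Gelfand's formula can also be observed numerically. In the sketch below (Python with NumPy), the chosen A squares to the identity, so ρ(A) = 1 while the Frobenius norm of A^k, raised to the power 1/k, drifts down towards 1 as k grows:

```python
import numpy as np

def spectral_radius(A):
    return max(abs(np.linalg.eigvals(A)))

A = np.array([[0.0, 2.0],
              [0.5, 0.0]])   # A @ A = I, eigenvalues +1 and -1

# |||A^k|||^(1/k) with the Frobenius norm approaches rho(A) = 1 as k grows
for k in (1, 10, 100):
    print(np.linalg.norm(np.linalg.matrix_power(A, k)) ** (1 / k))
print(spectral_radius(A))  # 1.0 (up to rounding)
```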
We have established several useful results on the convergence of A^k. We will now extend this concept of convergence to infinite series and power series of matrices.

Theorem 3.1.11. Let {A_k} ⊂ Mn be a given infinite sequence of matrices. If there exists a matrix norm ||| · ||| on Mn such that the series of real numbers Σ_{k=0}^∞ |||A_k ||| is convergent, then the series Σ_{k=0}^∞ A_k is convergent.
Proof. By Cauchy's Criterion, the convergence of Σ_{k=0}^∞ |||A_k ||| implies that given ε > 0, there exists a positive integer N0 such that for all natural numbers n, m with n > m > N0, we have

Σ_{k=m+1}^n |||A_k ||| < ε

Now, given ε > 0, for all n, m such that n > m > N0, we have

||| Σ_{k=m+1}^n A_k ||| ≤ Σ_{k=m+1}^n |||A_k ||| < ε

By Cauchy's Criterion (Theorem 3.1.4), Σ_{k=0}^∞ A_k is convergent.
Theorem 3.1.12. Let A ∈ Mn and let {a_k} be a sequence of complex numbers. Then the series Σ_{k=0}^∞ a_k A^k converges if there exists a matrix norm ||| · ||| on Mn such that the series of real numbers Σ_{k=0}^∞ |a_k| |||A |||^k converges.

Proof. As in the proof of Theorem 3.1.11, the convergence of Σ_{k=0}^∞ |a_k| |||A |||^k implies that given any ε > 0, there exists a positive integer N0 such that for all natural numbers n, m with n > m > N0, we have

Σ_{k=m+1}^n |a_k| |||A |||^k < ε

Now, given ε > 0, for all n, m such that n > m > N0, we have

||| Σ_{k=m+1}^n a_k A^k ||| ≤ Σ_{k=m+1}^n |a_k| |||A^k ||| ≤ Σ_{k=m+1}^n |a_k| |||A |||^k < ε

By Cauchy's Criterion, Σ_{k=0}^∞ a_k A^k converges.
The notion of absolute convergence and radius of convergence for power series of real numbers
(see [1] for further details) can also be carried over to the case of power series of matrices in the
following way.
Theorem 3.1.13. Let the function f(z) be defined by f(z) = Σ_{k=0}^∞ a_k z^k, with radius of convergence R > 0, and let ||| · ||| be a matrix norm on Mn. Then f(A) ≡ Σ_{k=0}^∞ a_k A^k is well-defined for all A ∈ Mn such that |||A ||| < R.
In particular, f(A) is well-defined for all A ∈ Mn such that ρ(A) < R.

Proof. The series Σ_{k=0}^∞ |a_k| |||A |||^k converges, because a power series converges absolutely at every point strictly inside its radius of convergence and |||A ||| < R. By Theorem 3.1.12, f(A) is well-defined whenever |||A ||| < R.
For the last assertion, if ρ(A) < R, then by Lemma 3.1.6 there exists a matrix norm ||| · |||′ such that |||A |||′ ≤ ρ(A) + ε < R for ε > 0 sufficiently small, and the same argument applies.
The above theorem enables us to define power series of matrices similar to the case of power
series for real numbers.
Example 3.1.14. The matrix exponential is given by the power series

e^A ≡ Σ_{k=0}^∞ (1/k!) A^k

which is well-defined for all A ∈ Mn, because the corresponding power series for real numbers, e^x = Σ_{k=0}^∞ (1/k!) x^k, has radius of convergence R = ∞.
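The series definition can be evaluated directly, at least for illustration. The sketch below (Python with NumPy; expm_series is our own naive partial-sum helper, not a production algorithm — a dedicated library routine such as scipy.linalg.expm would be preferred in practice) checks it against the known value e^A = diag(e, e^2) for A = diag(1, 2):

```python
import math
import numpy as np

def expm_series(A, terms=30):
    # Partial sum of the power series e^A = sum_{k>=0} A^k / k!
    result = np.zeros_like(A, dtype=float)
    term = np.eye(A.shape[0])          # current term A^k / k!, starting at k = 0
    for k in range(terms):
        result += term
        term = term @ A / (k + 1)      # next term: multiply by A, divide by (k+1)
    return result

A = np.diag([1.0, 2.0])
print(expm_series(A))          # approximately diag(e, e^2)
print(math.e, math.e ** 2)
```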
Other types of functions can also be defined similarly. For instance, one can define trigonometric functions of a matrix, analogous to the case of power series for real numbers.
Another important power series of matrices, which will be used later, is the power series expression for the inverse of a matrix.
Proposition 3.1.15. Let A ∈ Mn. Then A is invertible if there exists a matrix norm ||| · ||| such that ||| I − A ||| < 1. In such a case, we have

A^{-1} = Σ_{k=0}^∞ (I − A)^k

Proof. If ||| I − A ||| < 1, then the series Σ_{k=0}^∞ (I − A)^k converges to some matrix C ∈ Mn, because the radius of convergence of the series Σ_{k=0}^∞ z^k is 1. Moreover, we have

A Σ_{k=0}^N (I − A)^k = [I − (I − A)] Σ_{k=0}^N (I − A)^k = I − (I − A)^{N+1}

which tends to I as N → ∞, since ||| (I − A)^{N+1} ||| ≤ ||| I − A |||^{N+1} → 0. Hence AC = I, and we conclude that C = A^{-1}.
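A numerical sketch of this series in Python with NumPy; inverse_by_series is our own illustrative helper, and the matrix is chosen so that ||| I − A |||∞ = 0.2 < 1:

```python
import numpy as np

def inverse_by_series(A, terms=200):
    # A^{-1} = sum_{k>=0} (I - A)^k, valid when |||I - A||| < 1 in some matrix norm
    n = A.shape[0]
    E = np.eye(n) - A
    result = np.eye(n)       # k = 0 term
    power = E.copy()
    for _ in range(terms - 1):
        result += power
        power = power @ E
    return result

A = np.array([[1.0, 0.2],
              [0.1, 0.9]])
# The maximum row sum of I - A is 0.2 < 1, so the series applies
print(np.allclose(inverse_by_series(A), np.linalg.inv(A)))  # True
```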
Proposition 3.1.15 will be useful when we study perturbation theory for the solution of linear systems later. Moreover, we have the following results as corollaries.

Corollary 3.1.16. Let ||| · ||| be a matrix norm on Mn. Suppose that a given matrix A ∈ Mn is related to another matrix B ∈ Mn by |||BA − I ||| < 1. Then A and B are both invertible.

Proof. By Proposition 3.1.15, BA is invertible. This implies det(BA) = det(B) det(A) ≠ 0, hence det(A) ≠ 0 and det(B) ≠ 0, proving the result.
Corollary 3.1.17. Let A = [a_ij] ∈ Mn. Suppose that

|a_ii| > Σ_{j=1, j≠i}^n |a_ij|  for all i = 1, 2, . . . , n    (3.2)

Then A is invertible.
A matrix which satisfies condition (3.2) is said to be strictly diagonally dominant.

Proof. The hypothesis (3.2) ensures that all main diagonal entries a_ii are nonzero.
Set D ≡ diag(a11, . . . , ann), so that D is an invertible diagonal matrix and D^{-1}A has all 1's on the main diagonal. Then the matrix B = [b_ij] ≡ I − D^{-1}A has zero entries on the main diagonal and b_ij = −a_ij/a_ii if i ≠ j.
Consider the maximum row sum norm ||| · |||∞. The hypothesis guarantees |||B |||∞ < 1, so that I − B = D^{-1}A is invertible by Proposition 3.1.15, and hence A is invertible.
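A small check of Corollary 3.1.17 in Python with NumPy (is_strictly_diagonally_dominant is our own helper implementing condition (3.2)):

```python
import numpy as np

def is_strictly_diagonally_dominant(A):
    # |a_ii| > sum of |a_ij| over j != i, for every row i
    abs_A = np.abs(A)
    diag = np.diag(abs_A)
    off = abs_A.sum(axis=1) - diag
    return bool(np.all(diag > off))

A = np.array([[4.0, 1.0, 1.0],
              [0.5, 3.0, 1.0],
              [1.0, 1.0, 5.0]])
print(is_strictly_diagonally_dominant(A))   # True
print(abs(np.linalg.det(A)) > 0)            # True: A is invertible, as the corollary predicts
```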
3.2 Bounds for Roots of Algebraic Equations
In this section, we will examine how matrix norms can be used in various ways to give a bound for
the roots of polynomials with real or complex coefficients. First we begin with some definitions.
Definition 3.2.1. Any polynomial f(z) of degree at least 1 can be written in the form f(z) =
Azkp(z), where A is a nonzero constant and
p(z) = zn + an−1zn−1 + . . .+ a1z + a0
where a0 6= 0.
The companion matrix of p(z) is given by

C(p) ≡
( −an−1   −an−2   . . .   −a1   −a0
  1       0       . . .   0     0
  0       1       . . .   0     0
  ...     ...     . . .   ...   ...
  0       0       . . .   1     0 )    (3.3)

and the characteristic polynomial of C(p) is exactly p(z).
The following proposition is the key result used to obtain bounds for the roots of p(z).

Proposition 3.2.2. If z is a root of p(z) and ||| · ||| is any matrix norm on Mn, then |z| ≤ |||C(p) |||.

Proof. z is a root of p(z) if and only if z is an eigenvalue of C(p). Now, for any eigenvalue z of C(p), we have |z| ≤ ρ[C(p)] ≤ |||C(p) ||| by Theorem 2.3.9, as required.
We will now make use of the above proposition, varying the matrix norm used, to obtain various upper bounds for the roots of p(z).
Proposition 3.2.3 (Montel’s Upper Bound). Let z be a root of p(z). Then
|z| ≤ 1 + |a0|+ |a1|+ . . .+ |an−1|
Proof. Using ||| · |||∞ as the matrix norm in Proposition 3.2.2, we have
|z| ≤ max{1, |a0|+ |a1|+ . . .+ |an−1|}
≤ 1 + |a0|+ |a1|+ . . .+ |an−1|
as required.
Proposition 3.2.4 (Cauchy's Upper Bound). Let z be a root of p(z). Then

|z| ≤ 1 + max{|a0|, |a1|, . . . , |an−1|}

Proof. Using ||| · |||1 as the matrix norm in Proposition 3.2.2, we have

|z| ≤ max{|a0|, 1 + |a1|, . . . , 1 + |an−1|} ≤ 1 + max{|a0|, |a1|, . . . , |an−1|}

as required.
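These bounds are easy to experiment with numerically. Below is a sketch in Python with NumPy: companion is our own helper implementing (3.3), and the polynomial is chosen with known roots 1, −2, 3 so that the bounds can be compared against the true largest root modulus.

```python
import numpy as np

def companion(coeffs):
    # Companion matrix of p(z) = z^n + a_{n-1} z^{n-1} + ... + a_0,
    # with coeffs = [a_0, a_1, ..., a_{n-1}].
    n = len(coeffs)
    C = np.zeros((n, n))
    C[0, :] = [-a for a in coeffs[::-1]]   # first row: -a_{n-1}, ..., -a_0
    C[1:, :-1] = np.eye(n - 1)             # subdiagonal of ones
    return C

# p(z) = z^3 - 2z^2 - 5z + 6 = (z - 1)(z + 2)(z - 3)
coeffs = [6.0, -5.0, -2.0]
C = companion(coeffs)
largest_root = max(abs(np.linalg.eigvals(C)))    # approx 3.0
montel = 1 + sum(abs(a) for a in coeffs)         # 14.0
cauchy = 1 + max(abs(a) for a in coeffs)         # 7.0
print(largest_root, montel, cauchy)
```

Both bounds hold, with Cauchy's bound the tighter of the two, as noted below.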
Observe that Cauchy's upper bound is stronger than Montel's upper bound. Now, we will derive another upper bound in a slightly different way.
Proposition 3.2.5 (Carmichael and Mason's Upper Bound). Let z be a root of p(z). Then

|z| ≤ (1 + |a0|^2 + |a1|^2 + . . . + |an−1|^2)^{1/2}
Proof. Write C(p) = S +R, where
S =
0 0 0 0 0 0
1 0 0 0 0 0
0 1. . . 0 0 0
... 0. . .
. . ....
......
.... . .
. . .. . .
...
0 0 . . . 0 1 0
and
R =
−an−1 −an−2 . . . −a1 −a0
0 0 . . . 0 0
0 0 . . . 0 0...
.... . .
......
0 0 . . . 0 0
We have S∗R = R∗S = 0. Moreover, we have (S∗S)2 = diag(1, 1, . . . , 1, 0) such that |||S∗S |||2 =
max{√λ : λ is an eigenvalue of (S∗S)∗(S∗S) = (S∗S)2} = 1.
Similarly, we have |||R∗R |||2 = |a0|2 + |a1|2 + . . .+ |an−1|2.
Then we have
|||C(p) |||22 = |||C(p)∗C(p) |||2 = ||| (S +R)∗(S +R) |||2= |||S∗S +R∗R |||2≤ |||S∗S |||2 + |||R∗R |||2
which implies |z| ≤(1 + |a0|2 + |a1|2 + . . .+ |an−1|2
) 12 as required.
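This root-of-sum-of-squares bound can also be verified numerically on a small example (the cubic below is our own illustration):

```python
# Sketch checking Carmichael and Mason's upper bound on an illustrative
# cubic p(z) = z^3 + 2z^2 + 3z + 4.
import numpy as np

a = [4.0, 3.0, 2.0]                              # a_0, a_1, a_2
cm_bound = (1 + sum(c ** 2 for c in a)) ** 0.5   # (1 + sum |a_i|^2)^{1/2}
max_root = max(abs(r) for r in np.roots([1.0, 2.0, 3.0, 4.0]))
```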
We will now generalise Cauchy’s upper bound to derive another upper bound due to Kojima.
Proposition 3.2.6 (Generalised Cauchy’s Upper Bound). Let z be a root of p(z) and let D \equiv \mathrm{diag}(p_1, p_2, \ldots, p_n) be any diagonal matrix with all p_i > 0. Then

|z| \le \max\left\{ |a_0|\frac{p_n}{p_1},\ |a_1|\frac{p_{n-1}}{p_1} + \frac{p_{n-1}}{p_n},\ |a_2|\frac{p_{n-2}}{p_1} + \frac{p_{n-2}}{p_{n-1}},\ \ldots,\ |a_{n-2}|\frac{p_2}{p_1} + \frac{p_2}{p_3},\ |a_{n-1}| + \frac{p_1}{p_2} \right\}

Proof. Observe that ρ(A) = ρ(D^{-1}AD) for any non-singular matrix D. Then |z| \le ρ[C(p)] = ρ[D^{-1}C(p)D] \le |||D^{-1}C(p)D||| for any matrix norm ||| · ||| on M_n. We have

D^{-1}C(p)D = \begin{bmatrix}
-a_{n-1} & -a_{n-2}p_1^{-1}p_2 & \cdots & -a_1 p_1^{-1}p_{n-1} & -a_0 p_1^{-1}p_n \\
p_1 p_2^{-1} & 0 & \cdots & 0 & 0 \\
0 & p_2 p_3^{-1} & \ddots & \vdots & \vdots \\
\vdots & \vdots & \ddots & 0 & 0 \\
0 & 0 & \cdots & p_{n-1}p_n^{-1} & 0
\end{bmatrix}

In particular, using the matrix norm ||| · |||_1, we have

|z| \le |||D^{-1}C(p)D|||_1 = \max\left\{ |a_0|\frac{p_n}{p_1},\ |a_1|\frac{p_{n-1}}{p_1} + \frac{p_{n-1}}{p_n},\ |a_2|\frac{p_{n-2}}{p_1} + \frac{p_{n-2}}{p_{n-1}},\ \ldots,\ |a_{n-2}|\frac{p_2}{p_1} + \frac{p_2}{p_3},\ |a_{n-1}| + \frac{p_1}{p_2} \right\}

as required.
Proposition 3.2.7 (Kojima’s Upper Bound). Let z be a root of p(z). If all a_i's are nonzero, then

|z| \le \max\left\{ \left|\frac{a_0}{a_1}\right|,\ 2\left|\frac{a_1}{a_2}\right|,\ 2\left|\frac{a_2}{a_3}\right|,\ \ldots,\ 2\left|\frac{a_{n-2}}{a_{n-1}}\right|,\ 2|a_{n-1}| \right\}

Proof. In Proposition 3.2.6, choose p_k \equiv \frac{p_1}{|a_{n-k+1}|} for k = 2, 3, \ldots, n, which is always positive. Then the conclusion follows almost immediately.
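Kojima's bound is again easy to test numerically; the cubic below (all coefficients nonzero, as the proposition requires) is our own illustration:

```python
# Sketch of Kojima's upper bound for an illustrative cubic
# p(z) = z^3 + 2z^2 + 3z + 4.  For n = 3 the bound reads
# max{ |a_0/a_1|, 2|a_1/a_2|, 2|a_2| }.
import numpy as np

a0, a1, a2 = 4.0, 3.0, 2.0
kojima = max(abs(a0 / a1), 2 * abs(a1 / a2), 2 * abs(a2))
max_root = max(abs(r) for r in np.roots([1.0, a2, a1, a0]))
```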
Now that we have established several upper bounds for the roots of a polynomial, we are interested in establishing the corresponding lower bounds. The following lemma is required for this purpose.
Lemma 3.2.8. If p(z) is given by p(z) = z^n + a_{n-1}z^{n-1} + \cdots + a_1 z + a_0 with a_0 \neq 0, then the function q(z) defined by

q(z) \equiv \frac{1}{a_0}\, z^n\, p\!\left(\frac{1}{z}\right) = z^n + \frac{a_1}{a_0}z^{n-1} + \frac{a_2}{a_0}z^{n-2} + \cdots + \frac{a_{n-1}}{a_0}z + \frac{1}{a_0}

is a polynomial of degree n whose roots are exactly the reciprocals of the roots of p(z).

Proof. Let z_0 be a root of p(z); note that z_0 \neq 0 since a_0 \neq 0. Then

q\!\left(\frac{1}{z_0}\right) = \frac{1}{a_0 z_0^n}\, p(z_0) = 0,

i.e. 1/z_0 is a root of q(z). Hence every root of p(z) gives rise to a root of q(z), giving a total of n roots, as required.
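The reciprocal relationship of the lemma can be confirmed numerically on a small example (the cubic and its q(z) below follow the lemma's formula; the polynomial itself is our own illustration):

```python
# Sketch of Lemma 3.2.8: the roots of q(z) are the reciprocals of the
# roots of p(z), for p(z) = z^3 + 2z^2 + 3z + 4 (a_0 = 4).
import numpy as np

p = [1.0, 2.0, 3.0, 4.0]                     # z^3 + 2z^2 + 3z + 4
a0 = p[-1]
# q(z) = z^3 + (a_1/a_0) z^2 + (a_2/a_0) z + 1/a_0
q = [1.0, 3.0 / a0, 2.0 / a0, 1.0 / a0]
recip_roots = np.sort_complex(1.0 / np.roots(p))
q_roots = np.sort_complex(np.roots(q))
```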
We are now in a position to derive various lower bounds for the roots of p(z), by applying each of the upper bounds established previously to the polynomial q(z) defined above. By combining the lower and upper bounds, it is possible to locate the roots of the polynomial p(z) in an annulus {z : r_1 ≤ |z| ≤ r_2}.
Proposition 3.2.9 (Montel’s Lower Bound). Let z be a root of p(z). Then

|z| \ge \frac{|a_0|}{1 + |a_0| + |a_1| + \cdots + |a_{n-1}|}

Proof. Applying Montel’s upper bound to q(z), we have

\left|\frac{1}{z}\right| \le 1 + \left|\frac{1}{a_0}\right| + \left|\frac{a_{n-1}}{a_0}\right| + \cdots + \left|\frac{a_1}{a_0}\right| = \frac{1 + |a_0| + |a_1| + \cdots + |a_{n-1}|}{|a_0|}

Hence, the conclusion follows by taking reciprocals.
Proposition 3.2.10 (Cauchy’s Lower Bound). Let z be a root of p(z). Then

|z| \ge \frac{|a_0|}{|a_0| + \max\{1, |a_{n-1}|, |a_{n-2}|, \ldots, |a_1|\}}

Proof. Applying Cauchy’s upper bound to q(z), we have

\left|\frac{1}{z}\right| \le 1 + \max\left\{\left|\frac{1}{a_0}\right|, \left|\frac{a_1}{a_0}\right|, \ldots, \left|\frac{a_{n-1}}{a_0}\right|\right\} = \frac{|a_0| + \max\{1, |a_{n-1}|, |a_{n-2}|, \ldots, |a_1|\}}{|a_0|}

Hence, the conclusion follows by taking reciprocals.
Proposition 3.2.11 (Carmichael and Mason’s Lower Bound). Let z be a root of p(z). Then

|z| \ge \frac{|a_0|}{\left(1 + |a_0|^2 + \cdots + |a_{n-1}|^2\right)^{1/2}}

Proof. Applying Carmichael and Mason’s upper bound to q(z), we have

\left|\frac{1}{z}\right| \le \left(1 + \left|\frac{1}{a_0}\right|^2 + \left|\frac{a_{n-1}}{a_0}\right|^2 + \cdots + \left|\frac{a_2}{a_0}\right|^2 + \left|\frac{a_1}{a_0}\right|^2\right)^{1/2} = \left(\frac{1 + |a_0|^2 + |a_1|^2 + \cdots + |a_{n-1}|^2}{|a_0|^2}\right)^{1/2}

Hence, the conclusion follows by taking reciprocals.
Proposition 3.2.12 (Kojima’s Lower Bound). Let z be a root of p(z). If all a_i's are nonzero, then

|z| \ge \min\left\{ |a_{n-1}|,\ \left|\frac{a_{n-2}}{2a_{n-1}}\right|,\ \ldots,\ \left|\frac{a_0}{2a_1}\right| \right\}

Proof. Applying Kojima’s upper bound to q(z), we have

\left|\frac{1}{z}\right| \le \max\left\{ \left|\frac{1/a_0}{a_{n-1}/a_0}\right|,\ 2\left|\frac{a_{n-1}/a_0}{a_{n-2}/a_0}\right|,\ \ldots,\ 2\left|\frac{a_1}{a_0}\right| \right\} = \max\left\{ \left|\frac{1}{a_{n-1}}\right|,\ 2\left|\frac{a_{n-1}}{a_{n-2}}\right|,\ \ldots,\ 2\left|\frac{a_1}{a_0}\right| \right\}

Hence, the conclusion follows by taking reciprocals.
We will end this section with an example of how to use the various bounds established above to locate the roots of a polynomial.
Example 3.2.13. Consider

f(z) = \frac{1}{n!}z^n + \frac{1}{(n-1)!}z^{n-1} + \cdots + \frac{1}{2}z^2 + z + 1

which is the n-th partial sum of the power series for the exponential function e^z, where n is a positive integer. Then all roots z of f(z) satisfy the inequality

\frac{1}{2} \le |z| \le 1 + n!

Proof. Write f(z) as \frac{1}{n!}p(z), where

p(z) = z^n + \frac{n!}{(n-1)!}z^{n-1} + \frac{n!}{(n-2)!}z^{n-2} + \cdots + \frac{n!}{1!}z + n!

We then only need to consider the roots of p(z). Let z denote a root of p(z). Using Cauchy’s upper bound, we have

|z| \le 1 + \max\left\{n!,\ \frac{n!}{1!},\ \frac{n!}{2!},\ \ldots,\ \frac{n!}{(n-1)!}\right\} = 1 + n!

Using Cauchy’s lower bound, we have

|z| \ge \frac{n!}{n! + \max\left\{1,\ \frac{n!}{1!},\ \frac{n!}{2!},\ \ldots,\ \frac{n!}{(n-1)!}\right\}} = \frac{n!}{n! + n!} = \frac{1}{2}

which gives the conclusion as required.
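The annulus of Example 3.2.13 can be checked numerically for a specific n (the choice n = 6 below is our own):

```python
# Sketch checking Example 3.2.13 for n = 6: every root of the n-th partial
# sum of e^z has modulus between 1/2 and 1 + n!.
import numpy as np
import math

n = 6
coeffs = [1.0 / math.factorial(n - k) for k in range(n + 1)]  # z^n/n! + ... + z + 1
moduli = [abs(r) for r in np.roots(coeffs)]
```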
3.3 Perturbation of Eigenvalues
As another application of matrix and vector norms, we consider what happens to the eigenvalues of a matrix when the matrix is perturbed. This is an important question in numerical linear algebra: since many computations are done by computer, rounding and truncation errors are unavoidable, and vector and matrix norms can quantify this ‘error’ more precisely.

We will begin this section with the definition and some properties of the condition number, which is a measure of the sensitivity of a matrix to a small perturbation.
Definition 3.3.1 (Condition Number). The condition number of A with respect to the matrix norm ||| · ||| on M_n is defined to be

κ(A) = \begin{cases} |||A|||\,|||A^{-1}||| & \text{if } A \text{ is non-singular} \\ \infty & \text{if } A \text{ is singular} \end{cases}
Definition 3.3.2 (Well-Conditioned and Ill-Conditioned). Let A ∈ M_n.

A is said to be well-conditioned with respect to ||| · ||| if κ(A) is small (near 1).

A is said to be ill-conditioned with respect to ||| · ||| if κ(A) is large.

If κ(A) = 1, then A is said to be perfectly conditioned.
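For a concrete feel (a sketch of our own, not from the text), the identity matrix is perfectly conditioned with respect to the spectral norm, while a nearly singular matrix has a huge condition number:

```python
# Sketch: condition numbers with respect to the spectral norm, computed
# with numpy's built-in cond (which forms |||A|||_2 * |||A^{-1}|||_2).
import numpy as np

kappa_I = np.linalg.cond(np.eye(3), 2)                   # perfectly conditioned
kappa_A = np.linalg.cond(np.diag([1.0, 1.0, 1e-6]), 2)   # nearly singular
```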
It turns out that when a well-conditioned matrix A is slightly perturbed, its eigenvalues do not change by very much. In the case of an ill-conditioned matrix, however, a small perturbation of its entries can change its eigenvalues by a large amount. Below is a concrete example to emphasise the importance of the condition number.
Example 3.3.3 (Wilkinson Bidiagonal Matrix). Consider the 20 × 20 upper triangular matrix B, where

B = \begin{bmatrix}
20 & 20 & 0 & \cdots & 0 & 0 \\
0 & 19 & 20 & \cdots & 0 & 0 \\
0 & 0 & 18 & \ddots & \vdots & \vdots \\
\vdots & \vdots & & \ddots & 20 & 0 \\
0 & 0 & \cdots & 0 & 2 & 20 \\
0 & 0 & \cdots & 0 & 0 & 1
\end{bmatrix}
Note that using MATLAB, we find κ(B) ≈ 5.3 × 10^8 with respect to the Frobenius norm, and the eigenvalues of B are 1, 2, \ldots, 20. If we perturb the (20, 1)-entry of B by ε = 10^{-10}, then MATLAB shows that the eigenvalues change drastically, with some even becoming complex. A plot of the original and perturbed eigenvalues of B in the Argand diagram is shown in Figure 3.1 below.
Figure 3.1: A plot of original and perturbed eigenvalues of matrix B
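The same experiment can be reproduced with numpy in place of MATLAB (a sketch of our own; the perturbation size matches the example above):

```python
# Sketch reproducing the Wilkinson example: the unperturbed bidiagonal B
# has eigenvalues 1, ..., 20, while a 1e-10 perturbation of the
# (20, 1)-entry pushes several eigenvalues off the real axis.
import numpy as np

n = 20
B = np.diag(np.arange(n, 0, -1.0)) + np.diag(np.full(n - 1, 20.0), k=1)
orig_eigs = np.sort(np.linalg.eigvals(B).real)

Bp = B.copy()
Bp[n - 1, 0] = 1e-10                       # perturb the (20, 1)-entry
pert_eigs = np.linalg.eigvals(Bp)
max_imag = max(abs(lam.imag) for lam in pert_eigs)
```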
It is clear from Definition 3.3.1 that the condition number of a matrix A depends on the
matrix norm used. In fact, all condition numbers are equivalent in the sense described below.
Theorem 3.3.4. Let κ_α(·) be the condition number of a matrix with respect to ||| · |||_α and κ_β(·) be the condition number with respect to ||| · |||_β. Then there exist finite positive constants c_m and C_M such that

c_m κ_α(A) \le κ_β(A) \le C_M κ_α(A)

for all invertible A ∈ M_n.
Proof. By the equivalence of matrix norms on M_n, there exist finite positive constants c_1, c_2, c'_1, c'_2 such that

c_1 |||A^{-1}|||_α \le |||A^{-1}|||_β \le c_2 |||A^{-1}|||_α  (3.4)

c'_1 |||A|||_α \le |||A|||_β \le c'_2 |||A|||_α  (3.5)

Multiplying (3.4) and (3.5), we have

c_1 c'_1 |||A|||_α\,|||A^{-1}|||_α \le |||A|||_β\,|||A^{-1}|||_β \le c_2 c'_2 |||A|||_α\,|||A^{-1}|||_α

Hence, the conclusion follows by taking c_m = c_1 c'_1 and C_M = c_2 c'_2.
Below are some useful lower bounds which give a rough estimate of the condition number.
Proposition 3.3.5. For any non-singular matrix A ∈ M_n and any matrix norm,

κ(A) \ge \frac{\max\{|λ_A|\}}{\min\{|λ_A|\}}

where max{|λ_A|} and min{|λ_A|} denote the maximum and minimum values of the moduli of the eigenvalues of A, respectively.

Proof. Observe that λ is an eigenvalue of A if and only if λ^{-1} is an eigenvalue of A^{-1}. Hence, we have

κ(A) = |||A|||\,|||A^{-1}||| \ge ρ(A)ρ(A^{-1}) = \max\{|λ_A|\}\max\{|λ_{A^{-1}}|\} = \max\{|λ_A|\}\max\{|λ_A|^{-1}\} = \frac{\max\{|λ_A|\}}{\min\{|λ_A|\}}
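This eigenvalue-ratio lower bound is easily checked with the spectral norm (the 2 × 2 matrix below is an illustrative assumption):

```python
# Sketch of Proposition 3.3.5 with the spectral norm: the condition
# number dominates max|lambda| / min|lambda|.
import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, 0.5]])
moduli = np.abs(np.linalg.eigvals(A))
eig_ratio = moduli.max() / moduli.min()      # max|lambda| / min|lambda| = 4
kappa = np.linalg.cond(A, 2)                 # >= eig_ratio by Prop. 3.3.5
```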
Proposition 3.3.6. Let B ∈ M_n be any singular matrix. For any non-singular matrix A ∈ M_n and any matrix norm ||| · ||| on M_n,

κ(A) \ge \frac{|||A|||}{|||A - B|||}

Proof. We have B = A − (A − B) = A[I − A^{-1}(A − B)], which is singular. Hence I − A^{-1}(A − B) is singular. By Proposition 3.1.16, we have |||A^{-1}(A − B)||| \ge 1. Hence,

|||A^{-1}|||\,|||A − B|||\,|||A||| \ge |||A^{-1}(A − B)|||\,|||A||| \ge |||A|||

which implies κ(A) \ge \frac{|||A|||}{|||A − B|||} as required.
We will now move to the actual application of the above notion of condition number to the theory of perturbation of eigenvalues. First, we state a theorem on the location of the eigenvalues of a matrix due to Gershgorin.

Theorem 3.3.7 (Gershgorin Circle Theorem). Let A = [a_{ij}] ∈ M_n and let

R'_i(A) = \sum_{j=1,\, j \neq i}^{n} |a_{ij}|, \quad 1 \le i \le n

denote the deleted absolute row sums of A. Then all eigenvalues of A are located in the union of n discs, G(A), defined as

G(A) \equiv \bigcup_{i=1}^{n} \{ z ∈ \mathbb{C} : |z - a_{ii}| \le R'_i(A) \}
Proof. Let λ be an eigenvalue of A with Ax = λx, where x = [x_i] \neq 0. Then some entry of x has the largest absolute value, say |x_p| \ge |x_i| for all i = 1, 2, \ldots, n, with x_p \neq 0. Then

λx_p = [λx]_p = [Ax]_p = \sum_{j=1}^{n} a_{pj}x_j

where [x]_p denotes the p-th entry of the vector x. This is equivalent to

x_p(λ - a_{pp}) = \sum_{j=1,\, j \neq p}^{n} a_{pj}x_j

By the triangle inequality,

|x_p||λ - a_{pp}| = \left|\sum_{j=1,\, j \neq p}^{n} a_{pj}x_j\right| \le \sum_{j=1,\, j \neq p}^{n} |a_{pj}||x_j| \le |x_p| \sum_{j=1,\, j \neq p}^{n} |a_{pj}| = |x_p| R'_p(A)

Hence, |λ - a_{pp}| \le R'_p(A). Since we do not know which p is appropriate to each eigenvalue λ (unless we know its associated eigenvector, in which case we could just calculate λ exactly), we can only conclude that λ lies in the union of all such discs.
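A small numerical illustration (the 3 × 3 matrix is our own choice) confirms that every eigenvalue falls in some Gershgorin disc:

```python
# Sketch of the Gershgorin Circle Theorem: every eigenvalue lies in some
# disc |z - a_ii| <= R'_i(A), the deleted absolute row sum.
import numpy as np

A = np.array([[ 4.0,  1.0, 0.5],
              [ 0.2, -2.0, 0.3],
              [ 0.1,  0.1, 1.0]])
radii = np.sum(np.abs(A), axis=1) - np.abs(np.diag(A))   # deleted row sums
eigs = np.linalg.eigvals(A)
covered = all(
    any(abs(lam - A[i, i]) <= radii[i] + 1e-12 for i in range(A.shape[0]))
    for lam in eigs
)
```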
We will apply the above theorem to find a relation between the eigenvalues of a perturbed matrix and those of the original matrix. We consider only the case in which the matrix is diagonalisable (see [2] for a more detailed treatment). The basic idea is contained in the following proposition.
Proposition 3.3.8. Let D = diag(λ_1, λ_2, \ldots, λ_n) ∈ M_n. Let E = [e_{ij}] ∈ M_n and consider the perturbed matrix D + E. If λ is an eigenvalue of D + E, then there exists some eigenvalue λ_i of D such that |λ - λ_i| \le |||E|||_∞.

Proof. By Theorem 3.3.7, the eigenvalues of D + E are contained in the union of discs

S = \bigcup_{i=1}^{n} \left\{ z ∈ \mathbb{C} : |z - λ_i - e_{ii}| \le R'_i(D + E) = \sum_{j=1,\, j \neq i}^{n} |e_{ij}| \right\}  (3.6)

We will first show that the set S in (3.6) above is contained in the union of discs

T = \bigcup_{i=1}^{n} \left\{ z ∈ \mathbb{C} : |z - λ_i| \le R_i(E) = \sum_{j=1}^{n} |e_{ij}| \right\}  (3.7)

Let z_0 ∈ S, so that |z_0 - λ_i - e_{ii}| \le R'_i(D + E) for some i. Then we have

|z_0 - λ_i| = |z_0 - λ_i - e_{ii} + e_{ii}| \le |z_0 - λ_i - e_{ii}| + |e_{ii}| \le \sum_{j=1,\, j \neq i}^{n} |e_{ij}| + |e_{ii}| = \sum_{j=1}^{n} |e_{ij}| = R_i(E)

Hence, z_0 ∈ T, proving the claim that S ⊆ T. Since |||E|||_∞ = \max_{1 \le i \le n} R_i(E), this implies that if λ is an eigenvalue of D + E, then there exists some eigenvalue λ_i of D such that |λ - λ_i| \le |||E|||_∞ as required.
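The proposition can be verified numerically (the diagonal matrix and perturbation below are illustrative assumptions):

```python
# Sketch of Proposition 3.3.8: each eigenvalue of the perturbed diagonal
# matrix D + E stays within |||E|||_inf (max absolute row sum) of some
# diagonal entry of D.
import numpy as np

D = np.diag([1.0, 5.0, 9.0])
E = np.array([[ 0.01, -0.02, 0.00],
              [ 0.01,  0.00, 0.02],
              [-0.01,  0.01, 0.01]])
norm_inf = np.max(np.sum(np.abs(E), axis=1))       # |||E|||_inf
dists = [min(abs(lam - d) for d in np.diag(D))
         for lam in np.linalg.eigvals(D + E)]
```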
We can extend the above argument to the case in which the matrix is diagonalisable.
Theorem 3.3.9. Let A ∈ M_n be diagonalisable with A = SΛS^{-1} and Λ = diag(λ_1, \ldots, λ_n). Let E ∈ M_n and let ||| · ||| be a matrix norm such that |||D||| = \max_{1 \le i \le n} |d_i| for all diagonal matrices D = diag(d_1, \ldots, d_n) ∈ M_n.

If λ is an eigenvalue of A + E, then there exists some eigenvalue λ_i of A for which

|λ - λ_i| \le κ(S)\,|||E|||  (3.8)

where κ(·) is the condition number with respect to the matrix norm ||| · |||.
Proof. Observe that A + E and S^{-1}(A + E)S = Λ + S^{-1}ES have the same eigenvalues. If λ is an eigenvalue of Λ + S^{-1}ES, then λI - Λ - S^{-1}ES is singular.

Now if λI - Λ is singular, then λ = λ_i for some i and the bound (3.8) is trivially satisfied. Suppose, however, that λI - Λ is non-singular. In this case, the matrix

(λI - Λ)^{-1}(λI - Λ - S^{-1}ES) = I - (λI - Λ)^{-1}S^{-1}ES

is singular. Hence we have |||(λI - Λ)^{-1}S^{-1}ES||| \ge 1 by Proposition 3.1.15. By the assumption about the behaviour of the matrix norm ||| · ||| on diagonal matrices, we have

1 \le |||(λI - Λ)^{-1}S^{-1}ES||| \le |||S^{-1}ES|||\,|||(λI - Λ)^{-1}||| = |||S^{-1}ES||| \max_{1 \le i \le n} |λ - λ_i|^{-1} = \frac{|||S^{-1}ES|||}{\min_{1 \le i \le n} |λ - λ_i|}

Hence,

\min_{1 \le i \le n} |λ - λ_i| \le |||S^{-1}ES||| \le |||S^{-1}|||\,|||S|||\,|||E||| = κ(S)\,|||E|||

as required.
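The spectral norm satisfies the diagonal-matrix condition of the theorem, so the bound can be checked numerically (the matrices below are illustrative assumptions):

```python
# Sketch of Theorem 3.3.9 with the spectral norm, for which
# |||diag(d_1, ..., d_n)|||_2 = max|d_i| holds.
import numpy as np

S = np.array([[1.0, 1.0],
              [0.0, 1.0]])
Lam = np.diag([1.0, 3.0])
A = S @ Lam @ np.linalg.inv(S)               # diagonalisable, eigenvalues 1, 3
E = np.array([[0.00, 0.01],
              [0.01, 0.00]])
bound = np.linalg.cond(S, 2) * np.linalg.norm(E, 2)    # kappa(S) * |||E|||_2
dists = [min(abs(lam - 1.0), abs(lam - 3.0))
         for lam in np.linalg.eigvals(A + E)]
```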
All our estimates so far have been a priori bounds on the perturbations induced in the eigenvalues; they do not involve the computed eigenvalues or eigenvectors or any quantity derived from them. Now suppose that an ‘approximate eigenvector’ and an ‘approximate eigenvalue’ have been found somehow; we can then estimate how close the approximate eigenvalue is to an actual eigenvalue by using the residual vector.
Theorem 3.3.10. Let A ∈ M_n be diagonalisable with A = SΛS^{-1} and Λ = diag(λ_1, \ldots, λ_n). Let ‖ · ‖ be a vector norm on C^n, let ||| · ||| be the matrix norm on M_n induced by ‖ · ‖, and suppose moreover that |||D||| = \max_{1 \le i \le n} |d_i| whenever D = diag(d_1, \ldots, d_n) ∈ M_n.

Let x ∈ C^n be a given nonzero ‘approximate eigenvector’ of A, let λ be a given ‘approximate eigenvalue’ associated with x, and let r = Ax - λx be the residual vector. Then there exists some eigenvalue λ_i of A for which

|λ - λ_i| \le κ(S)\frac{‖r‖}{‖x‖}

Proof. Write A = SΛS^{-1}, and suppose that λ is not exactly equal to any eigenvalue of A (otherwise there is nothing to prove). Then

r = Ax - λx = S(Λ - λI)S^{-1}x

so that x = S(Λ - λI)^{-1}S^{-1}r. Then we have

‖x‖ = ‖S(Λ - λI)^{-1}S^{-1}r‖ \le |||S(Λ - λI)^{-1}S^{-1}|||\,‖r‖ \le |||S|||\,|||S^{-1}|||\,|||(Λ - λI)^{-1}|||\,‖r‖ = κ(S)\left(\min_{1 \le i \le n} |λ_i - λ|\right)^{-1}‖r‖

Hence,

‖x‖ \min_{1 \le i \le n} |λ_i - λ| \le κ(S)‖r‖

which upon rearranging gives the desired conclusion.
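The residual bound is straightforward to test numerically (the matrix, the approximate pair, and the eigenvector matrix S below are illustrative assumptions):

```python
# Sketch of Theorem 3.3.10: an a posteriori bound from the residual
# r = Ax - lambda*x, using the spectral norm.
import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, 5.0]])                   # eigenvalues 2 and 5
S = np.array([[1.0, 1.0],
              [0.0, 3.0]])                   # columns are eigenvectors of A
x = np.array([1.0, 0.001])                   # 'approximate eigenvector'
lam = 2.001                                  # 'approximate eigenvalue'
r = A @ x - lam * x                          # residual vector
bound = np.linalg.cond(S, 2) * np.linalg.norm(r) / np.linalg.norm(x)
closest = min(abs(lam - 2.0), abs(lam - 5.0))
```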
We end this chapter by noting that a typical example of a matrix norm ||| · ||| on M_n satisfying the condition stated in Theorems 3.3.9 and 3.3.10, namely |||D||| = \max_{1 \le i \le n} |d_i| whenever D = diag(d_1, \ldots, d_n) ∈ M_n, is the matrix norm induced by an l_p vector norm. Hence, in particular, ||| · |||_1, ||| · |||_2 and ||| · |||_∞ can all be used in the above theorems. For further results and discussion on matrix norms with this property, see [3] or [5].
Bibliography
[1] Bartle, R. G., and Sherbert, D. R., Introduction to Real Analysis. John Wiley and Sons, Inc., New York, 2000.
[2] Datta, B. N., Numerical Linear Algebra and Applications. Society for Industrial and Applied
Mathematics (SIAM), 2010.
[3] Horn, R. A., and Johnson, C. R., Matrix Analysis. Cambridge University Press, Cambridge, 1990.
[4] Householder, A. S., The Theory of Matrices in Numerical Analysis. Blaisdell, New York,
1964.
[5] Lancaster, P. and Tismenetsky, M., The Theory of Matrices with Applications. Academic
Press, New York, 1985.
[6] Ponnusamy, S., Foundations of Functional Analysis. Taylor and Francis, London, 2003.
[7] Sundaram, R. K., A First Course in Optimization Theory. Cambridge University Press,
Cambridge, 1996.
[8] Wilcox, R. R., Introduction to Robust Estimation and Hypothesis Testing. Academic Press,
New York, 2005.