
Matrix Algebra Part A

Page 1: Matrix Algebra Part A

Lecture Notes 1: Matrix Algebra
Part A: Vectors and Matrices

Peter J. Hammond
My email is [email protected] or [email protected]

A link to these lecture slides can be found at
https://web.stanford.edu/~hammond/pjhLects.html

revised 2018 September 15th

University of Warwick, EC9A0 Maths for Economists Peter J. Hammond 1 of 67

Page 2: Matrix Algebra Part A

Outline

Solving Two Equations in Two Unknowns
  First Example

Vectors
  Vectors and Inner Products
  Addition, Subtraction, and Scalar Multiplication
  Linear versus Affine Functions
  Norms and Unit Vectors
  Orthogonality
  The Canonical Basis
  Linear Independence and Dimension

Matrices
  Matrices and Their Transposes
  Matrix Multiplication: Definition

University of Warwick, EC9A0 Maths for Economists Peter J. Hammond 2 of 67

Page 3: Matrix Algebra Part A

Example of Two Equations in Two Unknowns

It is easy to check that

    x + y = 10
    x − y = 6      =⇒   x = 8, y = 2

More generally, one can:

1. add the two equations, to eliminate y;
2. subtract the second equation from the first, to eliminate x.

This leads to the following transformation of the two-equation system with general right-hand sides:

    x + y = b1          2x = b1 + b2
    x − y = b2    =⇒    2y = b1 − b2

Obviously the solution is

    x = (1/2)(b1 + b2),   y = (1/2)(b1 − b2)
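As a quick numerical cross-check of this elimination argument, here is a minimal Python sketch using NumPy (the library choice and the variable names are assumptions made only for this illustration):

    import numpy as np

    A = np.array([[1.0, 1.0],
                  [1.0, -1.0]])          # coefficients of x + y and x - y

    # The specific right-hand side from the example above
    b = np.array([10.0, 6.0])
    x, y = np.linalg.solve(A, b)
    print(x, y)                          # expect 8.0 and 2.0

    # A general right-hand side: compare with x = (b1 + b2)/2, y = (b1 - b2)/2
    b1, b2 = 3.0, 7.0
    z = np.linalg.solve(A, np.array([b1, b2]))
    print(np.allclose(z, [(b1 + b2) / 2, (b1 - b2) / 2]))    # True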

University of Warwick, EC9A0 Maths for Economists Peter J. Hammond 3 of 67

Page 4: Matrix Algebra Part A

Using Matrix Notation, I

Matrix notation allows the two equations

    1x + 1y = b1
    1x − 1y = b2

to be expressed as

    [ 1   1 ] [ x ]   [ b1 ]
    [ 1  −1 ] [ y ] = [ b2 ]

or as Az = b, where

    A = [ 1   1 ],   z = [ x ],   and   b = [ b1 ]
        [ 1  −1 ]        [ y ]              [ b2 ]

Here A, z, b are respectively: (i) the coefficient matrix; (ii) the vector of unknowns; (iii) the vector of right-hand sides.

University of Warwick, EC9A0 Maths for Economists Peter J. Hammond 4 of 67

Page 5: Matrix Algebra Part A

Using Matrix Notation, II

Also, the solution x = (1/2)(b1 + b2), y = (1/2)(b1 − b2) can be expressed as

    x = (1/2)b1 + (1/2)b2
    y = (1/2)b1 − (1/2)b2

or as

    z = [ x ] = [ 1/2   1/2 ] [ b1 ] = Cb,   where   C = [ 1/2   1/2 ]
        [ y ]   [ 1/2  −1/2 ] [ b2 ]                     [ 1/2  −1/2 ]

University of Warwick, EC9A0 Maths for Economists Peter J. Hammond 5 of 67

Page 6: Matrix Algebra Part A

Two General Equations

Consider the general system of two equations

    ax + by = u = 1u + 0v
    cx + dy = v = 0u + 1v

in two unknowns x and y, filled in with some extra 1s and 0s.

In matrix form, these equations can be written as

    [ a  b ] [ x ]   [ 1  0 ] [ u ]
    [ c  d ] [ y ] = [ 0  1 ] [ v ]

In case a ≠ 0, we can eliminate x from the second equation by adding −c/a times the first row to the second.

After defining the scalar constant D := a[d + (−c/a)b] = ad − bc, then clearing fractions, we obtain the new equality

    [ a   b  ] [ x ]   [  1    0 ] [ u ]
    [ 0  D/a ] [ y ] = [ −c/a  1 ] [ v ]

University of Warwick, EC9A0 Maths for Economists Peter J. Hammond 6 of 67

Page 7: Matrix Algebra Part A

Two General Equations, Subcase 1A

In Subcase 1A when D := ad − bc ≠ 0, multiply the second row by a to obtain

    [ a  b ] [ x ]   [  1  0 ] [ u ]
    [ 0  D ] [ y ] = [ −c  a ] [ v ]

Adding −b/D times the second row to the first yields

    [ a  0 ] [ x ]   [ 1 + (bc/D)  −ab/D ] [ u ]
    [ 0  D ] [ y ] = [     −c         a  ] [ v ]

Recognizing that 1 + (bc/D) = (D + bc)/D = ad/D, then dividing the two rows/equations by a and D respectively, we obtain

    [ x ]   [ 1  0 ] [ x ]         [  d  −b ] [ u ]
    [ y ] = [ 0  1 ] [ y ] = (1/D) [ −c   a ] [ v ]

which implies the unique solution

    x = (1/D)(du − bv) and y = (1/D)(av − cu)
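The resulting 2 × 2 inverse formula is easy to test numerically. The following sketch (assuming NumPy, with arbitrary values of a, b, c, d chosen only for the illustration) checks that (1/D) times the matrix with rows (d, −b) and (−c, a) really does invert the coefficient matrix whenever D ≠ 0:

    import numpy as np

    a, b, c, d = 2.0, 1.0, 5.0, 4.0          # any values with D = ad - bc != 0
    D = a * d - b * c

    A = np.array([[a, b], [c, d]])
    A_inv = (1.0 / D) * np.array([[d, -b], [-c, a]])
    print(np.allclose(A @ A_inv, np.eye(2)))           # True: A_inv inverts A

    u, v = 3.0, -2.0
    x, y = A_inv @ np.array([u, v])
    print(np.isclose(x, (d * u - b * v) / D),          # matches x = (du - bv)/D
          np.isclose(y, (a * v - c * u) / D))          # matches y = (av - cu)/D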

University of Warwick, EC9A0 Maths for Economists Peter J. Hammond 7 of 67

Page 8: Matrix Algebra Part A

Two General Equations, Subcase 1B

In Subcase 1B when D := ad − bc = 0, the multiplier −ab/D is undefined and the system

    [ a   b  ] [ x ]   [  1    0 ] [ u ]
    [ 0  D/a ] [ y ] = [ −c/a  1 ] [ v ]

collapses to

    [ a  b ] [ x ]   [    u     ]
    [ 0  0 ] [ y ] = [ v − cu/a ]

This leaves us with two “subsubcases”:

if cu ≠ av, then the left-hand side of the second equation is 0, but the right-hand side is non-zero, so there is no solution;

if cu = av, then the second equation reduces to 0 = 0, and there is a continuum of solutions satisfying the one remaining equation ax + by = u, or x = (u − by)/a where y is any real number.

University of Warwick, EC9A0 Maths for Economists Peter J. Hammond 8 of 67

Page 9: Matrix Algebra Part A

Two General Equations, Case 2

In the final case when a = 0, simply interchanging the two equations

    [ a  b ] [ x ]   [ 1  0 ] [ u ]
    [ c  d ] [ y ] = [ 0  1 ] [ v ]

gives

    [ c  d ] [ x ]   [ 1  0 ] [ v ]
    [ 0  b ] [ y ] = [ 0  1 ] [ u ]

Provided that b ≠ 0, one has y = u/b and, assuming that c ≠ 0, also x = (v − dy)/c = (bv − du)/bc.

On the other hand, if b = 0, we are back with two possibilities like those of Subcase 1B.

University of Warwick, EC9A0 Maths for Economists Peter J. Hammond 9 of 67

Page 11: Matrix Algebra Part A

Vectors and Inner Products

Let x = (xi)_{i=1}^m ∈ Rm denote a column m-vector of the form

    [ x1 ]
    [ x2 ]
    [ ...]
    [ xm ]

Its transpose is the row m-vector x> = (x1, x2, . . . , xm).

Given a column m-vector x and row n-vector y> = (yj)_{j=1}^n ∈ Rn where m = n, the dot or scalar or inner product is defined as

    y>x := y · x := ∑_{i=1}^n yi xi

But when m ≠ n, the scalar product is not defined.
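For concreteness, a small NumPy sketch of the inner product of two equal-length vectors (the particular vectors are invented for this illustration):

    import numpy as np

    y = np.array([1.0, 2.0, 3.0])
    x = np.array([4.0, 5.0, 6.0])

    # y'x = sum_i y_i x_i, computed three equivalent ways
    print(y @ x, np.dot(y, x), np.sum(y * x))     # all give 32.0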

University of Warwick, EC9A0 Maths for Economists Peter J. Hammond 11 of 67

Page 12: Matrix Algebra Part A

Exercise on Quadratic Forms

Exercise
Consider the quadratic form f(w) := w>w as a function f : Rn → R of the column n-vector w.

Explain why f(w) ≥ 0 for all w ∈ Rn, with equality if and only if w = 0, where 0 denotes the zero vector of Rn.

University of Warwick, EC9A0 Maths for Economists Peter J. Hammond 12 of 67

Page 13: Matrix Algebra Part A

Net Quantity Vectors

Suppose there are n commodities numbered from i = 1 to n.

Each component qi of the net quantity vector q = (qi)_{i=1}^n ∈ Rn represents the quantity of the ith commodity.

Often each such quantity is non-negative.

But general equilibrium theory, following Debreu's Theory of Value, often uses only the sign of qi to distinguish between

• a consumer's demands and supplies of the ith commodity;
• or a producer's outputs and inputs of the ith commodity.

This sign is taken to be

positive for demands or outputs;
negative for supplies or inputs.

In fact, qi is taken to be

• the consumer's net demand for the ith commodity;
• the producer's net supply or net output of the ith commodity.

Then q is the net quantity vector.
University of Warwick, EC9A0 Maths for Economists Peter J. Hammond 13 of 67

Page 14: Matrix Algebra Part A

Price Vectors

Each component pi of the (row) price vector p> ∈ Rn indicates the price per unit of commodity i.

Then the scalar product

    p>q = p · q = ∑_{i=1}^n pi qi

is the total value of the net quantity vector q evaluated at the price vector p.

In particular, p>q indicates

• the net profit (or minus the net loss) for a producer;
• the net dissaving for a consumer.
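A tiny numerical illustration with invented prices and quantities (a sketch only): a negative component is a supply or input, so it enters the value p · q with a negative sign.

    import numpy as np

    p = np.array([2.0, 5.0, 1.0])     # prices of three commodities
    q = np.array([4.0, -1.0, 3.0])    # net quantities: commodity 2 is supplied

    value = p @ q                     # total value p'q = 2*4 - 5*1 + 1*3
    print(value)                      # 6.0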

University of Warwick, EC9A0 Maths for Economists Peter J. Hammond 14 of 67

Page 16: Matrix Algebra Part A

Definitions

Consider any two n-vectors x = (xi)_{i=1}^n and y = (yi)_{i=1}^n in Rn.

Their sum s := x + y and difference d := x − y are constructed by adding or subtracting the vectors component by component — i.e., s = (si)_{i=1}^n and d = (di)_{i=1}^n where

    si = xi + yi and di = xi − yi for i = 1, 2, . . . , n.

The scalar product λx of any scalar λ ∈ R and vector x = (xi)_{i=1}^n ∈ Rn is constructed by multiplying each component of the vector x by the scalar λ — i.e.,

    λx = (λxi)_{i=1}^n

University of Warwick, EC9A0 Maths for Economists Peter J. Hammond 16 of 67

Page 17: Matrix Algebra Part A

Algebraic Fields

Definition
An algebraic field (F, +, ·) of scalars is a set F that, together with the two binary operations + of addition and · of multiplication, satisfies the following axioms for all a, b, c ∈ F:

1. F is closed under + and · — i.e., both a + b and a · b are in F.
2. + and · are associative: both a + (b + c) = (a + b) + c and a · (b · c) = (a · b) · c.
3. + and · both commute: both a + b = b + a and a · b = b · a.
4. There are identity elements 0, 1 ∈ F for + and · respectively, with 0 ≠ 1, such that: (i) a + 0 = a; (ii) 1 · a = a.
5. There are inverse operations − for + and ^(−1) for · such that: (i) a + (−a) = 0; (ii) provided a ≠ 0, also a · a^(−1) = 1.
6. The distributive law: a · (b + c) = (a · b) + (a · c).

University of Warwick, EC9A0 Maths for Economists Peter J. Hammond 17 of 67

Page 18: Matrix Algebra Part A

Three Examples of Real Algebraic Fields

Exercise
Verify that the following well known sets are algebraic fields:

• the set R of all real numbers, with the usual operations of addition and multiplication;

• the set Q of all rational numbers — i.e., those that can be expressed as the ratio r = p/q of integers p, q ∈ Z with q ≠ 0. (Check that Q is closed under the usual operations of addition and multiplication, and that each non-zero rational has a rational multiplicative inverse.)

• the set Q + √2 Q := {r1 + √2 r2 | r1, r2 ∈ Q} ⊂ R of all real numbers that can be expressed as the sum of: (i) a rational number; (ii) a rational multiple of the irrational number √2.

University of Warwick, EC9A0 Maths for Economists Peter J. Hammond 18 of 67

Page 19: Matrix Algebra Part A

Two Examples of Complex Algebraic Fields

Exercise
Verify that the following well known sets are algebraic fields:

• C, the set of all complex numbers — i.e., those that can be expressed as c = a + ib, where a, b ∈ R and i is defined to satisfy i² = −1.

  Note that C is effectively the set R × R of ordered pairs (a, b) satisfying a, b ∈ R, together with the operations of:

  (i) addition defined by (a, b) + (c, d) = (a + c, b + d), because (a + bi) + (c + di) = (a + c) + (b + d)i;
  (ii) multiplication defined by (a, b) · (c, d) = (ac − bd, ad + bc), because (a + bi)(c + di) = (ac − bd) + (ad + bc)i.

• the set of all rational complex numbers — i.e., those that can be expressed as c = a + ib, where a, b ∈ Q and i is defined to satisfy i² = −1.

University of Warwick, EC9A0 Maths for Economists Peter J. Hammond 19 of 67

Page 20: Matrix Algebra Part A

General Vector Spaces

Definition
A vector (or linear) space V over an algebraic field F is a combination 〈V, F, +, ·〉 of:

• a set V of vectors;
• the field F of scalars;
• the binary operation V × V ∋ (u, v) ↦ u + v ∈ V of vector addition;
• the binary operation F × V ∋ (α, u) ↦ αu ∈ V of multiplication by a scalar.

These are required to satisfy all of the following eight vector space axioms.

University of Warwick, EC9A0 Maths for Economists Peter J. Hammond 20 of 67

Page 21: Matrix Algebra Part A

Eight Vector Space Axioms

1. Addition is associative: u + (v + w) = (u + v) + w.
2. Addition is commutative: u + v = v + u.
3. Additive identity: There exists a zero vector 0 ∈ V such that v + 0 = v for all v ∈ V.
4. Additive inverse: For every v ∈ V, there exists an additive inverse −v ∈ V of v such that v + (−v) = 0.
5. Multiplication by a scalar is distributive w.r.t. vector addition: α(u + v) = αu + αv.
6. Multiplication by a scalar is distributive w.r.t. field addition: (α + β)v = αv + βv.
7. Multiplication by a scalar and field multiplication are compatible: α(βv) = (αβ)v.
8. The unit element 1 ∈ F is an identity element for scalar multiplication: 1v = v.

University of Warwick, EC9A0 Maths for Economists Peter J. Hammond 21 of 67

Page 22: Matrix Algebra Part A

Multiplication by the Zero Scalar

Exercise
Prove that 0v = 0 for all v ∈ V.

Hint: Which three axioms justify the following chain of equalities?

    0v = [1 + (−1)]v = 1v + (−1)v = v − v = 0

University of Warwick, EC9A0 Maths for Economists Peter J. Hammond 22 of 67

Page 23: Matrix Algebra Part A

A General Class of Finite Dimensional Vector Spaces

Exercise
Given an arbitrary algebraic field F, let Fn denote the space of all lists 〈ai〉_{i=1}^n of n elements ai ∈ F — i.e., the n-fold Cartesian product of F with itself.

1. Show how to construct the respective binary operations

    Fn × Fn ∋ (x, y) ↦ x + y ∈ Fn
    F × Fn ∋ (λ, x) ↦ λx ∈ Fn

of addition and scalar multiplication so that (Fn, F, +, ×) is a vector space.

2. Show too that subtraction and division by a (non-zero) scalar can be defined by v − w = v + (−1)w and v/α = (1/α)v.

University of Warwick, EC9A0 Maths for Economists Peter J. Hammond 23 of 67

Page 24: Matrix Algebra Part A

Two Particular Finite Dimensional Vector Spaces

From now on we mostly consider real vector spaces over the real field R, and especially the space (Rn, R, +, ×) of n-vectors over R.

We will consider, however, the space (Cn, C, +, ×) of n-vectors over C — the complex plane — when considering:

• eigenvalues and diagonalization of square matrices;
• systems of linear difference and differential equations;
• the characteristic function of a random variable.

University of Warwick, EC9A0 Maths for Economists Peter J. Hammond 24 of 67

Page 26: Matrix Algebra Part A

Linear Combinations

Definition
A linear combination of vectors is the weighted sum ∑_{h=1}^k λh x^h, where x^h ∈ V and λh ∈ F for h = 1, 2, . . . , k.

Exercise
By induction on k, show that the vector space axioms imply that any linear combination of vectors in V must also belong to V.

University of Warwick, EC9A0 Maths for Economists Peter J. Hammond 26 of 67

Page 27: Matrix Algebra Part A

Linear Functions

Definition
A function V ∋ u ↦ f(u) ∈ F is linear provided that

    f(λu + µv) = λf(u) + µf(v)

for every linear combination λu + µv of two vectors u, v ∈ V, with λ, µ ∈ F.

Exercise
Prove that the function V ∋ u ↦ f(u) ∈ F is linear if and only if both:

1. for every vector v ∈ V and scalar λ ∈ F one has f(λv) = λf(v);
2. for every pair of vectors u, v ∈ V one has f(u + v) = f(u) + f(v).

University of Warwick, EC9A0 Maths for Economists Peter J. Hammond 27 of 67

Page 28: Matrix Algebra Part A

Key Properties of Linear Functions

Exercise
Use induction on k to show that if the function f : V → F is linear, then

    f(∑_{h=1}^k λh x^h) = ∑_{h=1}^k λh f(x^h)

for all linear combinations ∑_{h=1}^k λh x^h in V — i.e., f preserves linear combinations.

Exercise
In case V = Rn and F = R, show that any linear function is homogeneous of degree 1, meaning that f(λv) = λf(v) for all λ ∈ R and all v ∈ Rn.

In particular, putting λ = 0 gives f(0) = 0.

What is the corresponding property in case V = Qn and F = Q?

University of Warwick, EC9A0 Maths for Economists Peter J. Hammond 28 of 67

Page 29: Matrix Algebra Part A

Affine Functions

Definition
A function g : V → F is said to be affine if there is a scalar additive constant α ∈ F and a linear function f : V → F such that g(v) ≡ α + f(v).

Exercise
Under what conditions is an affine function g : R → R linear when its domain R is regarded as a vector space?

University of Warwick, EC9A0 Maths for Economists Peter J. Hammond 29 of 67

Page 30: Matrix Algebra Part A

An Economic Aggregation Theorem

Suppose that a finite population of households h ∈ H with respective non-negative incomes yh ∈ Q+ (h ∈ H) have non-negative demands xh ∈ R (h ∈ H) which depend on household income via a function yh ↦ fh(yh).

Given total income Y := ∑_h yh, under what conditions can their total demand X := ∑_h xh = ∑_h fh(yh) be expressed as a function X = F(Y) of Y alone?

The answer is an implication of Cauchy's functional equation.

In this context the theorem asserts that this aggregation condition implies that the functions fh (h ∈ H) and F must be co-affine.

This means there exists a common multiplicative constant ρ ∈ R, along with additive constants αh (h ∈ H) and A, such that

    fh(yh) ≡ αh + ρ yh (h ∈ H) and F(Y) ≡ A + ρ Y

University of Warwick, EC9A0 Maths for Economists Peter J. Hammond 30 of 67

Page 31: Matrix Algebra Part A

Cauchy’s Functional Equation: Proof of Sufficiency

Theorem
Except in the trivial case when H has only one member, Cauchy's functional equation F(∑_{h∈H} yh) ≡ ∑_{h∈H} fh(yh) is satisfied for functions F, fh : Q → R if and only if:

1. there exists a single function φ : Q → R such that

    F(q) = F(0) + φ(q) and fh(q) = fh(0) + φ(q) for all h ∈ H

2. the function φ : Q → R is linear, implying that the functions F and fh are co-affine.

Proof.
Suppose fh(yh) ≡ αh + ρ yh for all h ∈ H, and F(Y) ≡ A + ρ Y. Then Cauchy's functional equation F(∑_{h∈H} yh) ≡ ∑_{h∈H} fh(yh) is obviously satisfied provided that A = ∑_{h∈H} αh.

University of Warwick, EC9A0 Maths for Economists Peter J. Hammond 31 of 67

Page 32: Matrix Algebra Part A

Cauchy’s Equation: Beginning the Proof of Necessity

Lemma
The mapping Q ∋ q ↦ φ(q) := F(q) − F(0) ∈ R must satisfy:

1. φ(q) ≡ fi(q) − fi(0) for all i ∈ H and q ∈ Q;
2. φ(q + q′) ≡ φ(q) + φ(q′) for all q, q′ ∈ Q.

Proof.
To prove part 1, consider any i ∈ H and all q ∈ Q.

Note that Cauchy's equation F(∑_h yh) ≡ ∑_h fh(yh) implies that F(q) = fi(q) + ∑_{h≠i} fh(0) and also F(0) = fi(0) + ∑_{h≠i} fh(0).

Now define the function φ(q) := F(q) − F(0) on the domain Q.

Then subtract the second equation from the first to obtain

    φ(q) = F(q) − F(0) = fi(q) − fi(0)

University of Warwick, EC9A0 Maths for Economists Peter J. Hammond 32 of 67

Page 33: Matrix Algebra Part A

Cauchy’s Equation: Continuing the Proof of Necessity

Proof.
To prove part 2, consider any i, j ∈ H with i ≠ j, and any q, q′ ∈ Q.

Note that Cauchy's equation F(∑_h yh) ≡ ∑_h fh(yh) implies that

    F(q + q′) = fi(q) + fj(q′) + ∑_{h∈H\{i,j}} fh(0)
    F(0) = fi(0) + fj(0) + ∑_{h∈H\{i,j}} fh(0)

Now subtract the second equation from the first, then use the equation φ(q) = F(q) − F(0) = fi(q) − fi(0) derived in the previous slide, to obtain successively

    φ(q + q′) = F(q + q′) − F(0)
              = fi(q) − fi(0) + fj(q′) − fj(0)
              = φ(q) + φ(q′)

University of Warwick, EC9A0 Maths for Economists Peter J. Hammond 33 of 67

Page 34: Matrix Algebra Part A

Cauchy’s Equation: Resuming the Proof of Necessity

Because φ(q + q′) ≡ φ(q) + φ(q′), for any k ∈ N one has φ(kq) = φ((k − 1)q) + φ(q).

As an induction hypothesis, which is trivially true for k = 2, suppose that φ((k − 1)q) = (k − 1)φ(q).

Confirming the induction step, the hypothesis implies that

φ(kq) = φ((k − 1)q) + φ(q) = (k − 1)φ(q) + φ(q) = kφ(q)

So φ(kq) = kφ(q) for every k ∈ N and every q ∈ Q.

Putting q′ = kq implies that φ(q′) = kφ(q′/k).

Interchanging q and q′, it follows that φ(q/k) = (1/k)φ(q).

University of Warwick, EC9A0 Maths for Economists Peter J. Hammond 34 of 67

Page 35: Matrix Algebra Part A

Cauchy’s Equation: Completing the Proof of Necessity

So far we have proved that, for every k ∈ N and every q ∈ Q, one has both φ(kq) = kφ(q) and φ(q/k) = (1/k)φ(q).

Hence, for every rational r = m/n ∈ Q one has φ(mq/n) = mφ(q/n) = (m/n)φ(q) and so φ(rq) = rφ(q).

In particular, φ(r) = rφ(1), so φ is linear on its domain Q (though not on the whole of R without additional assumptions such as continuity or monotonicity).

The rest of the proof is routine checking of definitions.

University of Warwick, EC9A0 Maths for Economists Peter J. Hammond 35 of 67

Page 37: Matrix Algebra Part A

Euclidean Norm as Length

Pythagoras's theorem implies that the length of the typical vector x = (x1, x2) ∈ R2 is √(x1² + x2²) or, perhaps less clumsily, (x1² + x2²)^(1/2).

In R3, the same result implies that the length of the typical vector x = (x1, x2, x3) is

    [((x1² + x2²)^(1/2))² + x3²]^(1/2) = (x1² + x2² + x3²)^(1/2).

An obvious extension to Rn is the following:

Definition
The length of the typical n-vector x = (xi)_{i=1}^n ∈ Rn is its (Euclidean) norm

    ‖x‖ := (∑_{i=1}^n xi²)^(1/2) = √(x>x) = √(x · x)
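A short illustrative sketch confirming in NumPy that the three expressions for the norm agree on one example vector:

    import numpy as np

    x = np.array([3.0, 4.0, 12.0])

    print(np.linalg.norm(x))           # built-in Euclidean norm: 13.0
    print(np.sqrt(np.sum(x ** 2)))     # (sum of squares)^(1/2)
    print(np.sqrt(x @ x))              # sqrt(x'x) = sqrt(x . x)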

University of Warwick, EC9A0 Maths for Economists Peter J. Hammond 37 of 67

Page 38: Matrix Algebra Part A

Unit n-Vectors, the Unit Sphere, and Unit Ball

Definition
A unit vector u ∈ Rn is a vector with unit norm — i.e., its components satisfy ∑_{i=1}^n ui² = ‖u‖² = 1.

The set of all such unit vectors forms a surface called the unit sphere of dimension n − 1 (one less than n because of the defining equation).

It is defined as the hollow set (like a football or tennis ball)

    S^(n−1) := { x ∈ Rn | ∑_{i=1}^n xi² = 1 }

The unit ball B ⊂ Rn is the solid set (like a cricket ball or golf ball)

    B := { x ∈ Rn | ∑_{i=1}^n xi² ≤ 1 }

of all points bounded by the surface of the unit sphere S^(n−1) ⊂ Rn.

University of Warwick, EC9A0 Maths for Economists Peter J. Hammond 38 of 67

Page 39: Matrix Algebra Part A

Cauchy–Schwarz Inequality

Theorem
For all pairs a, b ∈ Rn, one has |a · b| ≤ ‖a‖ ‖b‖.

Proof.
Define the function R ∋ ξ ↦ f(ξ) := ∑_{i=1}^n (ai ξ + bi)² ∈ R.

Clearly f is the quadratic form f(ξ) ≡ Aξ² + Bξ + C, where A := ∑_{i=1}^n ai² = ‖a‖², B := 2 ∑_{i=1}^n ai bi = 2 a · b, and C := ∑_{i=1}^n bi² = ‖b‖².

There is a trivial case when A = 0 because a = 0.

Otherwise, A > 0 and so completing the square gives

    f(ξ) ≡ Aξ² + Bξ + C = A[ξ + (B/2A)]² + C − B²/4A

But the definition of f implies that f(ξ) ≥ 0 for all ξ ∈ R, including ξ = −B/2A, so 0 ≤ f(−B/2A) = C − B²/4A.

Hence (1/4)B² ≤ AC, implying that |a · b| = |B/2| ≤ √(AC) = ‖a‖ ‖b‖.
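As a numerical sanity check of the inequality (not a substitute for the proof), the following sketch tests it on randomly drawn vectors:

    import numpy as np

    rng = np.random.default_rng(0)
    for _ in range(1000):
        a = rng.normal(size=5)
        b = rng.normal(size=5)
        # |a . b| <= ||a|| ||b||, with a tiny tolerance for rounding
        assert abs(a @ b) <= np.linalg.norm(a) * np.linalg.norm(b) + 1e-12
    print("no violations found")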

University of Warwick, EC9A0 Maths for Economists Peter J. Hammond 39 of 67

Page 41: Matrix Algebra Part A

The Angle Between Two Vectors

Consider the triangle in Rn whose vertices are the three distinct vectors x, y, 0.

Its three sides or edges have respective lengths ‖x‖, ‖y‖, ‖x − y‖, where the last follows from the parallelogram law.

Note that ‖x − y‖² ⋚ ‖x‖² + ‖y‖² according as the angle at 0 is: (i) acute; (ii) a right angle; (iii) obtuse. But

    ‖x − y‖² − ‖x‖² − ‖y‖² = ∑_{i=1}^n (xi − yi)² − ∑_{i=1}^n (xi² + yi²) = ∑_{i=1}^n (−2 xi yi) = −2 x · y

So the three cases (i)–(iii) occur according as x · y ⋛ 0.

Using the Cauchy–Schwarz inequality, one can define the angle between x and y as the unique solution θ = arccos(x · y / ‖x‖‖y‖) in the interval [0, π] of the equation cos θ = x · y / ‖x‖‖y‖ ∈ [−1, 1].
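In code, the angle can be computed directly from this formula. A minimal sketch follows; the clipping step is a numerical safeguard against rounding pushing the cosine just outside [−1, 1]:

    import numpy as np

    def angle(x, y):
        """Angle in radians between nonzero vectors x and y."""
        c = (x @ y) / (np.linalg.norm(x) * np.linalg.norm(y))
        return np.arccos(np.clip(c, -1.0, 1.0))

    print(angle(np.array([1.0, 0.0]), np.array([0.0, 2.0])))   # pi/2: orthogonal
    print(angle(np.array([1.0, 1.0]), np.array([1.0, 1.0])))   # 0.0: same direction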

University of Warwick, EC9A0 Maths for Economists Peter J. Hammond 41 of 67

Page 42: Matrix Algebra Part A

Orthogonal and Orthonormal Sets of Vectors

Case (ii) suggests defining two vectors x, y ∈ Rn as orthogonal just in case x · y = 0, which is true if and only if θ = arccos 0 = π/2 = 90°.

A set of k vectors {x1, x2, . . . , xk} ⊂ Rn is said to be:

• pairwise orthogonal just in case xi · xj = 0 whenever j ≠ i;
• orthonormal just in case, in addition, each ‖xi‖ = 1 — i.e., all k elements of the set are vectors of unit length.

Define the Kronecker delta function

    {1, 2, . . . , n} × {1, 2, . . . , n} ∋ (i, j) ↦ δij ∈ {0, 1}

on the set of pairs i, j ∈ {1, 2, . . . , n} by

    δij := 1 if i = j, and δij := 0 otherwise.

Then the set of k vectors {x1, x2, . . . , xk} ⊂ Rn is orthonormal if and only if xi · xj = δij for all pairs i, j ∈ {1, 2, . . . , k}.
University of Warwick, EC9A0 Maths for Economists Peter J. Hammond 42 of 67

Page 44: Matrix Algebra Part A

The Canonical Basis of Rn

Example

A prominent orthonormal set is the canonical basis of Rn, defined as the set of n different n-vectors e^i (i = 1, 2, . . . , n) whose respective components (e^i_j)_{j=1}^n satisfy e^i_j = δij for all j ∈ {1, 2, . . . , n}.

Exercise
Show that each n-vector x = (xi)_{i=1}^n is a linear combination

    x = (xi)_{i=1}^n = ∑_{i=1}^n xi e^i

of the canonical basis vectors {e^1, e^2, . . . , e^n}, with the multiplier attached to each basis vector e^i equal to the respective component xi (i = 1, 2, . . . , n).

University of Warwick, EC9A0 Maths for Economists Peter J. Hammond 44 of 67

Page 45: Matrix Algebra Part A

The Canonical Basis in Commodity Space

Example

Consider the case when each vector x ∈ Rn is a quantity vector, whose components are (xi)_{i=1}^n, where xi indicates the net quantity of commodity i.

Then the ith unit vector e^i of the canonical basis of Rn represents a commodity bundle that consists of one unit of commodity i, but nothing of every other commodity.

In case the row vector p> ∈ Rn is a price vector for the same list of n commodities, the value p>e^i of the ith unit vector e^i must equal pi, the price (of one unit) of the ith commodity.

University of Warwick, EC9A0 Maths for Economists Peter J. Hammond 45 of 67

Page 46: Matrix Algebra Part A

Linear Functions

Theorem
The function Rn ∋ x ↦ f(x) ∈ R is linear if and only if there exists y ∈ Rn such that f(x) = y>x.

Proof.
Sufficiency is easy to check.

Conversely, note that x equals the linear combination ∑_{i=1}^n xi e^i of the n canonical basis vectors.

Hence, linearity of f implies that

    f(x) = f(∑_{i=1}^n xi e^i) = ∑_{i=1}^n xi f(e^i) = ∑_{i=1}^n f(e^i) xi = y>x

where y is the column vector whose components are yi = f(e^i) for i = 1, 2, . . . , n.

University of Warwick, EC9A0 Maths for Economists Peter J. Hammond 46 of 67

Page 47: Matrix Algebra Part A

Linear Transformations: Definition

Definition
The vector-valued function

    Rn ∋ x ↦ F(x) = (Fi(x))_{i=1}^m ∈ Rm

is a linear transformation just in case each component function Rn ∋ x ↦ Fi(x) ∈ R is linear — or equivalently, iff F(λx + µy) = λF(x) + µF(y) for every linear combination λx + µy of every pair x, y ∈ Rn.

University of Warwick, EC9A0 Maths for Economists Peter J. Hammond 47 of 67

Page 48: Matrix Algebra Part A

Characterizing Linear Transformations

Theorem
The mapping Rn ∋ x ↦ F(x) ∈ Rm is a linear transformation if and only if there exist vectors yi ∈ Rn for i = 1, 2, . . . , m such that each component function satisfies Fi(x) = yi>x.

Proof.
Sufficiency is obvious.

Conversely, because x equals the linear combination ∑_{i=1}^n xi e^i of the n canonical basis vectors {e^i}_{i=1}^n, and because each component function Rn ∋ x ↦ Fi(x) is linear, one has

    Fi(x) = Fi(∑_{j=1}^n xj e^j) = ∑_{j=1}^n xj Fi(e^j) = yi>x

where yi> is the row vector whose components are (yi)j = Fi(e^j) for i = 1, 2, . . . , m and j = 1, 2, . . . , n.

University of Warwick, EC9A0 Maths for Economists Peter J. Hammond 48 of 67

Page 49: Matrix Algebra Part A

Representing a Linear Transformation

Definition
A matrix representation of the linear transformation Rn ∋ x ↦ F(x) ∈ Rm, relative to the canonical bases of Rn and Rm, is an m × n array whose n columns are the m-vector images F(e^j) = (Fi(e^j))_{i=1}^m ∈ Rm of the n canonical basis vectors {e^j}_{j=1}^n of Rn.
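To make the definition concrete, here is a hedged sketch that builds the matrix representation of one invented linear map F : R3 → R2 column by column from the images F(e^j):

    import numpy as np

    def F(x):
        # An example linear transformation R^3 -> R^2 (chosen for illustration)
        return np.array([2 * x[0] - x[1], x[1] + 3 * x[2]])

    n = 3
    E = np.eye(n)                                           # columns are e^1, ..., e^n
    A = np.column_stack([F(E[:, j]) for j in range(n)])     # the 2 x 3 representation

    x = np.array([1.0, -2.0, 0.5])
    print(np.allclose(A @ x, F(x)))                         # True: A represents F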

University of Warwick, EC9A0 Maths for Economists Peter J. Hammond 49 of 67

Page 51: Matrix Algebra Part A

Linear Combinations and Dependence: Definitions

Definition
A linear combination of the finite set {x^1, x^2, . . . , x^k} of vectors is the scalar weighted sum ∑_{h=1}^k λh x^h, where λh ∈ R for h = 1, 2, . . . , k.

Definition
The finite set {x^1, x^2, . . . , x^k} of vectors is linearly independent just in case the only solution of the equation ∑_{h=1}^k λh x^h = 0 is the trivial solution λ1 = λ2 = · · · = λk = 0.

Alternatively, if the equation has a non-trivial solution, then the set of vectors is linearly dependent.
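One practical way to test this definition numerically is to stack the vectors as columns of a matrix and compare its rank with the number of vectors. A sketch with invented vectors:

    import numpy as np

    x1 = np.array([1.0, 0.0, 2.0])
    x2 = np.array([0.0, 1.0, 1.0])
    x3 = np.array([2.0, 3.0, 7.0])        # equals 2*x1 + 3*x2, so dependent

    M = np.column_stack([x1, x2, x3])
    print(np.linalg.matrix_rank(M) == M.shape[1])      # False: linearly dependent

    M2 = np.column_stack([x1, x2])
    print(np.linalg.matrix_rank(M2) == M2.shape[1])    # True: linearly independent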

University of Warwick, EC9A0 Maths for Economists Peter J. Hammond 51 of 67

Page 52: Matrix Algebra Part A

Characterizing Linear Dependence

Theorem
The finite set {x^1, x^2, . . . , x^k} of k n-vectors is linearly dependent if and only if at least one of the vectors, say x^1 after reordering, can be expressed as a linear combination of the others — i.e., there exist scalars αh (h = 2, 3, . . . , k) such that x^1 = ∑_{h=2}^k αh x^h.

Proof.
If x^1 = ∑_{h=2}^k αh x^h, then (−1)x^1 + ∑_{h=2}^k αh x^h = 0, so ∑_{h=1}^k λh x^h = 0 has a non-trivial solution.

Conversely, suppose ∑_{h=1}^k λh x^h = 0 has a non-trivial solution.

After reordering, we can suppose that λ1 ≠ 0.

Then x^1 = ∑_{h=2}^k αh x^h, where αh = −λh/λ1 for h = 2, 3, . . . , k.

University of Warwick, EC9A0 Maths for Economists Peter J. Hammond 52 of 67

Page 53: Matrix Algebra Part A

Dimension

Definition
The dimension of a vector space V is the size of any maximal set of linearly independent vectors, if this number is finite.

Otherwise, if there is an infinite set of linearly independent vectors, the dimension is infinite.

Exercise
Show that the canonical basis of Rn is linearly independent.

Example
The previous exercise shows that the dimension of Rn is at least n.

Later results will imply that any set of k > n vectors in Rn is linearly dependent.

This implies that the dimension of Rn is exactly n.

University of Warwick, EC9A0 Maths for Economists Peter J. Hammond 53 of 67

Page 55: Matrix Algebra Part A

Matrices as Rectangular Arrays

An m × n matrix A = (aij)m×n is a (rectangular) array

    A = [ a11  a12  ...  a1n ]
        [ a21  a22  ...  a2n ]
        [  :    :         :  ]
        [ am1  am2  ...  amn ]  = ((aij)_{i=1}^m)_{j=1}^n = ((aij)_{j=1}^n)_{i=1}^m

Note that in aij, we write the row number i before the column number j.

An m × 1 matrix is a column vector with m rows and 1 column.

A 1 × n matrix is a row vector with 1 row and n columns.

The m × n matrix A consists of:

n columns in the form of m-vectors aj = (aij)_{i=1}^m ∈ Rm for j = 1, 2, . . . , n;

m rows in the form of n-vectors ai> = (aij)_{j=1}^n ∈ Rn for i = 1, 2, . . . , m.

University of Warwick, EC9A0 Maths for Economists Peter J. Hammond 55 of 67

Page 56: Matrix Algebra Part A

The Transpose of a Matrix

The transpose of the m × n matrix A = (aij)m×n is defined as the n × m matrix

    A> = (a>ij)n×m = (aji)n×m = [ a11  a21  ...  am1 ]
                                [ a12  a22  ...  am2 ]
                                [  :    :         :  ]
                                [ a1n  a2n  ...  amn ]

Thus the transposed matrix A> results from transforming each column m-vector aj = (aij)_{i=1}^m (j = 1, 2, . . . , n) of A into the corresponding row m-vector aj> of A>.

Equivalently, for each i = 1, 2, . . . , m, the ith row n-vector ai> = (aij)_{j=1}^n of A is transformed into the ith column n-vector of A>.

Either way, one has a>ij = aji for all relevant pairs i, j.

University of Warwick, EC9A0 Maths for Economists Peter J. Hammond 56 of 67

Page 57: Matrix Algebra Part A

Rows Before Columns

VERY Important Rule: Rows before columns!

This order really matters.

Reversing it gives a transposed matrix.

Exercise
Verify that the double transpose of any m × n matrix A satisfies (A>)> = A — i.e., transposing a matrix twice recovers the original matrix.

University of Warwick, EC9A0 Maths for Economists Peter J. Hammond 57 of 67

Page 59: Matrix Algebra Part A

Multiplying a Matrix by a Scalar

A scalar, usually denoted by a Greek letter, is simply a member α ∈ F of the algebraic field F over which the vector space is defined.

So when F = R, a scalar is a real number α ∈ R.

The product of any m × n matrix A = (aij)m×n and any scalar α ∈ F is the new m × n matrix denoted by αA = (αaij)m×n, each of whose elements αaij results from multiplying the corresponding element aij of A by α.

University of Warwick, EC9A0 Maths for Economists Peter J. Hammond 59 of 67

Page 60: Matrix Algebra Part A

Matrix Multiplication

The matrix product of two matrices A and B is defined (whenever possible) as the matrix C = AB = (cij)m×n whose element cij in row i and column j is the inner product cij = ai>bj of:

• the ith row vector ai> of the first matrix A;
• the jth column vector bj of the second matrix B.

Again: rows before columns!

Note that the resulting matrix product C must have:

• as many rows as the first matrix A;
• as many columns as the second matrix B.

Yet again: rows before columns!

University of Warwick, EC9A0 Maths for Economists Peter J. Hammond 60 of 67

Page 61: Matrix Algebra Part A

Compatibility for Matrix Multiplication

Question: when is this definition of matrix product possible?

Answer: if and only if A has as many columns as B has rows.

This condition ensures that every inner product ai>bj is defined, which is true iff (if and only if) every row of A has exactly the same number of elements as every column of B.

In this case, the two matrices A and B are compatible for multiplication.

Specifically, if A is m × ℓ for some m, then B must be ℓ × n for some n.

Then the product C = AB is m × n, with elements

    cij = ai>bj = ∑_{k=1}^ℓ aik bkj

for i = 1, 2, . . . , m and j = 1, 2, . . . , n.
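The formula cij = ∑_k aik bkj translates directly into code. The following sketch compares an explicit triple loop with NumPy's built-in product; the shapes and random data are chosen only for the illustration:

    import numpy as np

    m, ell, n = 2, 3, 4
    rng = np.random.default_rng(1)
    A = rng.normal(size=(m, ell))        # m x ell
    B = rng.normal(size=(ell, n))        # ell x n, so A and B are compatible

    C = np.zeros((m, n))
    for i in range(m):
        for j in range(n):
            # c_ij is the inner product of row i of A with column j of B
            C[i, j] = sum(A[i, k] * B[k, j] for k in range(ell))

    print(np.allclose(C, A @ B))         # True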

University of Warwick, EC9A0 Maths for Economists Peter J. Hammond 61 of 67

Page 62: Matrix Algebra Part A

Laws of Matrix Multiplication

Exercise
Verify that the following laws of matrix multiplication hold whenever the matrices A, B, C are compatible for multiplication.

associative law for matrices: A(BC) = (AB)C;

distributive: A(B + C) = AB + AC and (A + B)C = AC + BC;

transpose: (AB)> = B>A>;

associative law for scalars: α(AB) = (αA)B = A(αB) (all α ∈ R).

Exercise
Let X be any m × n matrix, and z any column n-vector.

1. Show that the matrix product z>X>Xz is well-defined, and that its value is a scalar.
2. By putting w = Xz in the previous exercise regarding the sign of the quadratic form w>w, what can you conclude about the value of the scalar z>X>Xz?

University of Warwick, EC9A0 Maths for Economists Peter J. Hammond 62 of 67

Page 63: Matrix Algebra Part A

Exercise for Econometricians I

Exercise
An econometrician has access to data series (such as time series) involving the real values

• yt (t = 1, 2, . . . , T) of one endogenous variable;
• xti (t = 1, 2, . . . , T and i = 1, 2, . . . , k) of k different exogenous variables — sometimes called explanatory variables or regressors.

The data is to be fitted into the linear regression model

    yt = ∑_{i=1}^k bi xti + et

whose scalar constants bi (i = 1, 2, . . . , k) are unknown regression coefficients, and each scalar et is the error term or residual.

University of Warwick, EC9A0 Maths for Economists Peter J. Hammond 63 of 67

Page 64: Matrix Algebra Part A

Exercise for Econometricians II

1. Discuss how the regression model can be written in the form y = Xb + e for suitable column vectors y, b, e.

2. What are the dimensions of these vectors, and of the exogenous data matrix X?

3. Why do you think econometricians use this matrix equation, rather than the alternative y = bX + e?

4. How can the equation y = Xb + e accommodate the constant term α in the alternative equation yt = α + ∑_{i=1}^k bi xti + et? (See the sketch below.)
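One common way to handle the constant term, sketched below with simulated data, is to prepend a column of ones to X so that α becomes the first coefficient in b. This is an illustration of one possible answer, not the unique one:

    import numpy as np

    rng = np.random.default_rng(2)
    T, k = 50, 3
    X_exog = rng.normal(size=(T, k))                  # T observations of k regressors
    alpha, b_true = 1.5, np.array([2.0, -1.0, 0.5])
    y = alpha + X_exog @ b_true + 0.1 * rng.normal(size=T)

    X = np.column_stack([np.ones(T), X_exog])         # first column of ones carries alpha
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)      # least-squares fit of y = Xb + e
    print(coef)                                       # approximately [1.5, 2.0, -1.0, 0.5]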

University of Warwick, EC9A0 Maths for Economists Peter J. Hammond 64 of 67

Page 65: Matrix Algebra Part A

Matrix Multiplication Does Not Commute I

The two matrices A and B commute just in case AB = BA.

Note that typical pairs of matrices DO NOT commute, meaning that AB ≠ BA — i.e., the order of multiplication matters.

Indeed, suppose that A is ℓ × m and B is m × n, as is needed for AB to be defined.

Then the reverse product BA is undefined except in the special case when n = ℓ.

Hence, for both AB and BA to be defined, where B is m × n, the matrix A must be n × m.

But then AB is n × n, whereas BA is m × m.

Evidently AB ≠ BA unless m = n.

Thus all four matrices A, B, AB and BA are m × m = n × n.

We must be in the special case when all four are square matrices of the same dimension.
University of Warwick, EC9A0 Maths for Economists Peter J. Hammond 65 of 67

Page 66: Matrix Algebra Part A

Matrix Multiplication Does Not Commute II

Even if both A and B are n × n matrices, implying that both AB and BA are also n × n, one can still have AB ≠ BA.

Example

Here is a 2 × 2 example:

    [ 0  1 ] [ 0  0 ]   [ 0  1 ]     [ 0  0 ]   [ 0  0 ] [ 0  1 ]
    [ 1  0 ] [ 0  1 ] = [ 0  0 ]  ≠  [ 1  0 ] = [ 0  1 ] [ 1  0 ]
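The same 2 × 2 example can be checked in a few lines of NumPy (a sketch):

    import numpy as np

    A = np.array([[0, 1], [1, 0]])
    B = np.array([[0, 0], [0, 1]])

    print(A @ B)                           # [[0 1], [0 0]]
    print(B @ A)                           # [[0 0], [1 0]]
    print(np.array_equal(A @ B, B @ A))    # False: A and B do not commute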

Exercise
For matrix multiplication, explain why there are two different versions of the distributive law — namely

    A(B + C) = AB + AC and (A + B)C = AC + BC

University of Warwick, EC9A0 Maths for Economists Peter J. Hammond 66 of 67

Page 67: Matrix Algebra Part A

More Warnings Regarding Matrix Multiplication

Exercise
Let A, B, C denote three general matrices.

Give examples showing that:

1. The matrix AB might be defined, even if BA is not.
2. One can have AB = 0 even though A ≠ 0 and B ≠ 0.
3. If AB = AC and A ≠ 0, it does not follow that B = C.

(One possible set of numerical illustrations is sketched below.)
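One possible family of numerical illustrations, sketched with NumPy; these particular matrices are just one choice among many:

    import numpy as np

    # Part 1: a 2x3 times 3x4 product is defined, but the reverse is not
    P = np.ones((2, 3)) @ np.ones((3, 4))      # fine: a 2x4 matrix
    # np.ones((3, 4)) @ np.ones((2, 3)) would raise an error: shapes incompatible

    # Parts 2 and 3
    A = np.array([[1, 1], [1, 1]])
    B = np.array([[1, -1], [-1, 1]])
    print(A @ B)                               # the zero matrix, although A != 0 and B != 0

    C = 2 * B                                  # C differs from B, yet ...
    print(np.array_equal(A @ B, A @ C))        # True: AB = AC does not force B = C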

University of Warwick, EC9A0 Maths for Economists Peter J. Hammond 67 of 67