Linear algebra lecture notes - basics of linear algebra for CN Yang scholars; a kind of compressed version of Linear Algebra I


    Fedor Duzhin

    MATH1B: Complete Lecture Notes

    Basics of Linear Algebra

    and some other stuff not included in MAS212


Contents

1 Polar coordinates
  1.1 Polar coordinates
  1.2 Correspondence between Cartesian and polar coordinates
  1.3 Functions in polar coordinates

2 Gaussian elimination

3 Matrix multiplication and finding inverse matrix
  3.1 Matrix multiplication
  3.2 Gaussian elimination in matrix form
  3.3 Finding inverse matrix

4 Determinant
  4.1 Definition of determinant

5 Determinant
  5.1 Determinant and Gaussian elimination
  5.2 Axiomatic definition of the determinant

6 Row and Column Expansion
  6.1 Rows and columns
  6.2 Row and column expansion

7 Concept of Dimension

8 Vector Spaces
  8.1 General definition of a vector space
  8.2 General properties of vector spaces
  8.3 Subspaces

9 Basis
  9.1 Isomorphism
  9.2 Linear span
  9.3 Basis and dimension

10 Linear Dependence
  10.1 Linear dependence
  10.2 How to construct a basis
  10.3 Dimension

11 Rank
  11.1 Rank of a matrix
  11.2 Applications

12 Rank, Determinant, and Matrix Multiplication
  12.1 Determinant of a product
  12.2 Rank of a product


    Week 1

    Polar coordinates

    1.1 Polar coordinates

Recall that in order to introduce Cartesian coordinates (x, y) on the plane, one needs two orthogonal oriented lines called the x-axis and the y-axis. The coordinates are defined by the two projections of the point onto the lines Ox and Oy (see Figure 1.1).

Polar coordinates (r, θ) have some advantages. For example, the graph of a function in polar coordinates need not pass the vertical line test. On the other hand, the correspondence

points on the plane ↔ couples of numbers (r, θ)

is not one-to-one. To introduce polar coordinates, one needs a half-line, say OP. Its initial point O is the origin. Given a point A, let r(A) be the distance from A to the origin and let θ(A) be the angle between the half-lines OP and OA, measured in the counterclockwise direction (see Figure 1.2).

Thus r must be non-negative. If r = 0, it means that A is the origin, so θ can be any number. Generally, an angle is defined up to addition of the full angle 2π. It means that for r > 0, the couples of numbers

(r, θ - 6π), (r, θ - 4π), (r, θ - 2π), (r, θ), (r, θ + 2π), (r, θ + 4π), ...

represent the same point on the plane.

Example. Polar coordinates [r = 1, θ = π] and [r = 1, θ = 5π] represent the same point, whose Cartesian coordinates are x = -1, y = 0. But the point (-1, 0) can only be represented with Cartesian coordinates (-1, 0); any other (x, y) with x ≠ -1 or y ≠ 0 is a different point on the actual plane.

On the other hand, any couple of real numbers (r, θ) with r ≥ 0 gives a certain point A on the plane. To find this point, we first rotate the half-line OP by the angle |θ| about the origin, counterclockwise if θ > 0 and clockwise if θ < 0. Finally, we put A on this half-line at distance r from the origin.

Thus polar coordinates represent a function

A : {(r, θ)} → plane


[Figures 1.1 and 1.2: Cartesian coordinates of the point (0.4, 0.6), with the quadrants labeled by the signs of x and y; the polar coordinates construction.]


Its domain is the set of couples (r, θ), where r is non-negative and θ is an arbitrary real number. The range of this function is the plane. This function is onto, but not one-to-one.

    1.2 Correspondence between Cartesian and polar coordinates

One can easily extend polar coordinates to negative values of r. Indeed, instead of going the distance r < 0 from the origin, we can go in the opposite direction by -r > 0. The opposite direction means that the angle is changed by π (or 180°), that is, A(r, θ) = A(-r, θ + π) for r < 0. This gives us the function

A : {(r, θ)} → plane

defined for all pairs of numbers (r, θ). Since a pair of numbers (r, θ) is a point of the set R², and each point of the plane is uniquely encoded by its Cartesian coordinates (x, y), we obtain a function

R² → R²   (1.1)

Theorem. The function (1.1) is given by the explicit formula

x = r cos θ,  y = r sin θ   (1.2)

On the other hand, for r ≥ 0, polar coordinates can be expressed implicitly in terms of Cartesian coordinates by

r = √(x² + y²),  tan θ = y/x   (1.3)

Proof. For r ≥ 0, the formula (1.2) is obvious from the picture (in fact, it can be thought of as a definition of the sine and cosine functions).

For r < 0, we have

x = r cos θ = (-r) cos(θ + π),  y = r sin θ = (-r) sin(θ + π)

which is the definition of polar coordinates for r < 0. :)
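In code, the correspondence (1.2)-(1.3) is a two-line computation. The sketch below is illustrative (the helper names are mine, not from the notes); atan2 is used instead of a bare arctangent so that the angle lands in the correct quadrant.

```python
import math

def polar_to_cartesian(r, theta):
    # Formula (1.2); works for any real r, including r < 0.
    return (r * math.cos(theta), r * math.sin(theta))

def cartesian_to_polar(x, y):
    # Formula (1.3) with r >= 0; atan2(y, x) resolves the ambiguity
    # of tan(theta) = y/x by choosing the angle in (-pi, pi].
    return (math.hypot(x, y), math.atan2(y, x))

# [r = 1, theta = pi] and [r = 1, theta = 5*pi] give the same point (-1, 0):
print(polar_to_cartesian(1, math.pi))      # approximately (-1.0, 0.0)
print(polar_to_cartesian(1, 5 * math.pi))  # approximately (-1.0, 0.0)
# A negative radius flips the direction: A(r, theta) = A(-r, theta + pi).
print(polar_to_cartesian(-1, 0))           # approximately (-1.0, 0.0)
```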

    1.3 Functions in polar coordinates

The Cartesian coordinates x and y play similar roles. One can consider functions y = f(x) as well as x = g(y), but, in fact, there is not much difference: the graph of a function y = f(x) reflected about the line y = x gives the graph of the function x = f(y).

On the contrary, the polar coordinates r and θ are essentially different. One usually considers graphs of functions r = f(θ), since they are more visual than θ = f(r).

Example. The graph of the function r = 1 is the unit circle (see Figure 1.3).


[Figure 1.3: Unit circle r = 1.]

Example. The graph of the function r = θ/5 is a spiral. As shown in Figure 1.4, it fails the vertical line test.

Equation (1.2) can be used to transform functions in polar coordinates into parametric equations in Cartesian coordinates.

Example. The curve given by r = θ, θ ≥ 0, in polar coordinates is

x = θ cos θ,  y = θ sin θ,  θ ≥ 0

Example. The curve given by r = 1 - cos 2θ in polar coordinates is

x = (1 - cos 2θ) cos θ,  y = (1 - cos 2θ) sin θ

On the other hand, equations in Cartesian coordinates can be transformed into equations in polar coordinates.

Example. The equation x² + y² = 1 gives us (r cos θ)² + (r sin θ)² = 1, so it is just r = 1.

Example. The equation x^(2/3) + y^(2/3) = 1 gives (r cos θ)^(2/3) + (r sin θ)^(2/3) = 1, so it is

r = 1 / ((cos θ)^(2/3) + (sin θ)^(2/3))^(3/2)

This formula can be transformed into the parametric equation

x = cos θ / ((cos θ)^(2/3) + (sin θ)^(2/3))^(3/2),  y = sin θ / ((cos θ)^(2/3) + (sin θ)^(2/3))^(3/2)


[Figure 1.4: Spiral r = θ/5, θ ∈ [0, 2π].]

However, a simple parametrization can be obtained by a trick. Since sin²θ + cos²θ = 1 and we have the equation (x^(1/3))² + (y^(1/3))² = 1, let's try to modify polar coordinates as follows:

x^(1/3) = r cos θ,  y^(1/3) = r sin θ,  that is,  x = r³ cos³ θ,  y = r³ sin³ θ

so the equation x^(2/3) + y^(2/3) = 1 is just r = 1, and we get the parametrization x = cos³ θ, y = sin³ θ.
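As a quick numeric sanity check (my own sketch, not part of the notes), one can verify that this parametrization indeed satisfies x^(2/3) + y^(2/3) = 1; absolute values keep the fractional powers real for negative x and y:

```python
import math

# x = cos^3(t), y = sin^3(t) should satisfy x^(2/3) + y^(2/3) = 1.
for t in (0.3, 1.0, 2.5, 4.0):
    x, y = math.cos(t) ** 3, math.sin(t) ** 3
    print(abs(x) ** (2 / 3) + abs(y) ** (2 / 3))  # 1.0 up to rounding
```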


    Week 2

    Gaussian elimination

Definition. A system of m linear equations in n variables is

a_{11} x_1 + a_{12} x_2 + ... + a_{1n} x_n = b_1
a_{21} x_1 + a_{22} x_2 + ... + a_{2n} x_n = b_2
...
a_{m1} x_1 + a_{m2} x_2 + ... + a_{mn} x_n = b_m

Here, x_1, ..., x_n are variables, while a_{11}, ..., a_{mn} and b_1, ..., b_m are some constant coefficients.

A system of linear equations can be written in matrix form as Ax = b, where A is a fixed m-by-n matrix, x ∈ R^n is the vector of unknown variables, and b ∈ R^m is a fixed vector representing the right-hand side of the system.

Given a system of linear equations, for example,

2x + 2y + z = 5
x + y - z = 1
4x - y + 3z = 6   (2.1)

the easiest algorithm to solve it is Gaussian elimination. The idea is to simplify the system to the diagonal form.

The algorithm consists of steps of three kinds. Namely, if R_i is the i-th row of the system, the first step replaces R_i with R_i + cR_j for some c ≠ 0. For instance, we eliminate a_{21} and a_{31} in the system (2.1) by replacing R_2 with -(1/2)R_1 + R_2 and R_3 with -2R_1 + R_3:

2x + 2y + z = 5          2x + 2y + z = 5
x + y - z = 1      →     -(3/2)z = -3/2
4x - y + 3z = 6          -5y + z = -4


The second step swaps the i-th row and the j-th row. For example,

2x + 2y + z = 5          2x + 2y + z = 5
-(3/2)z = -3/2     →     -5y + z = -4
-5y + z = -4             -(3/2)z = -3/2

The idea is to keep a_{ii} ≠ 0. The third step multiplies a row by a number c ≠ 0. For example,

2x + 2y + z = 5          x + y + (1/2)z = 5/2
-5y + z = -4       →     y - (1/5)z = 4/5
-(3/2)z = -3/2           z = 1

The idea here is to get a_{ii} = 1.

Once we get an upper triangular form, that is, a_{ij} = 0 for i > j, with a_{ii} = 1, we do the same from bottom to top. In our example, by replacing R_2 with R_2 + (1/5)R_3 and R_1 with R_1 - (1/2)R_3, we get rid of a_{13} and a_{23}:

x + y + (1/2)z = 5/2          x + y = 2
y - (1/5)z = 4/5        →     y = 1
z = 1                         z = 1

Finally, we do the first step once more, replacing R_1 with R_1 - R_2:

x + y = 2          x = 1
y = 1        →     y = 1
z = 1              z = 1

The solution is now obvious.
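The three steps translate directly into code. Here is a minimal sketch of the elimination just performed (the function and variable names are mine; it assumes a square system with a unique solution and does no error handling beyond the pivot search):

```python
from fractions import Fraction

def gauss_solve(A, b):
    """Solve Ax = b for a square system using the three elementary steps."""
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(bi)] for row, bi in zip(A, b)]
    for i in range(n):
        # Step 2: swap rows to keep a nonzero pivot.
        p = next(r for r in range(i, n) if M[r][i] != 0)
        M[i], M[p] = M[p], M[i]
        # Step 3: scale the pivot row so the pivot equals 1.
        M[i] = [x / M[i][i] for x in M[i]]
        # Step 1: subtract multiples of the pivot row from all other rows.
        for r in range(n):
            if r != i and M[r][i] != 0:
                M[r] = [x - M[r][i] * y for x, y in zip(M[r], M[i])]
    return [row[-1] for row in M]

print(gauss_solve([[2, 2, 1], [1, 1, -1], [4, -1, 3]], [5, 1, 6]))
# [Fraction(1, 1), Fraction(1, 1), Fraction(1, 1)], i.e. x = y = z = 1
```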


    Week 3

Matrix multiplication and finding inverse matrix

    3.1 Matrix multiplication

First of all, let's define matrix multiplication for matrices of sizes 1 × n and n × 1. By definition, it is

(a_1 a_2 ... a_n) (b_1, b_2, ..., b_n)ᵀ = a_1 b_1 + a_2 b_2 + ... + a_n b_n

which is a 1 × 1 matrix, that is, a number.

Consider now an m × n matrix A and an n × k matrix B. The matrix A has m rows, each of them a 1 × n vector. Let's denote them by A_1, ..., A_m. Similarly, the matrix B has k columns, each of them an n × 1 vector. Let's denote them B_1, ..., B_k. Then

AB =
[ A_1 B_1  A_1 B_2  ...  A_1 B_k ]
[ A_2 B_1  A_2 B_2  ...  A_2 B_k ]
[ ...                            ]
[ A_m B_1  A_m B_2  ...  A_m B_k ]

is an m × k matrix.

Matrix multiplication is

(i) associative, that is, (AB)C = A(BC),
(ii) distributive, that is, A(B + C) = AB + AC and (A + B)C = AC + BC,

but it is not commutative, so generally AB ≠ BA. Moreover, let A be an m × n and B an n × k matrix. Then the product AB is defined, but the product BA is not unless m = k.


The unit matrix is an n × n matrix given by

I =
[ 1 0 0 ... 0 ]
[ 0 1 0 ... 0 ]
[ 0 0 1 ... 0 ]
[ ...         ]
[ 0 0 0 ... 1 ]   (3.1)

For any n × n matrix A, we have AI = IA = A. For an m × n matrix A, we would have AI = A (the product IA would not be defined). For an n × k matrix A, we would have IA = A (the product AI would not be defined).
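The row-times-column rule is a direct transcription into code (a sketch; the function name is mine):

```python
def matmul(A, B):
    """(m x n) times (n x k): entry (i, j) is the product of row i of A
    with column j of B, i.e. a 1 x n matrix times an n x 1 matrix."""
    m, n, k = len(A), len(B), len(B[0])
    assert all(len(row) == n for row in A), "inner dimensions must agree"
    return [[sum(A[i][t] * B[t][j] for t in range(n)) for j in range(k)]
            for i in range(m)]

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
print(matmul(A, B))  # [[2, 1], [4, 3]]
print(matmul(B, A))  # [[3, 4], [1, 2]]  -- AB != BA in general
```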

    3.2 Gaussian elimination in matrix form

Given a system of linear equations

a_{11} x_1 + a_{12} x_2 + ... + a_{1n} x_n = b_1
a_{21} x_1 + a_{22} x_2 + ... + a_{2n} x_n = b_2
...
a_{m1} x_1 + a_{m2} x_2 + ... + a_{mn} x_n = b_m   (3.2)

let

A =
[ a_{11} a_{12} ... a_{1n} ]
[ a_{21} a_{22} ... a_{2n} ]
[ ...                      ]
[ a_{m1} a_{m2} ... a_{mn} ]

x = (x_1, x_2, ..., x_n)ᵀ,   b = (b_1, b_2, ..., b_m)ᵀ

Then the system (3.2) can be written as a single matrix equation

Ax = b,  x ∈ R^n,  b ∈ R^m   (3.3)

Let's try to think how to express Gaussian elimination in terms of matrix multiplication. First, we'll check how the matrix A changes under the three steps.

Let's look at the same system as in Week 2:

2x + 2y + z = 5
x + y - z = 1
4x - y + 3z = 6

In the beginning, we replaced R_2 with -(1/2)R_1 + R_2:

2x + 2y + z = 5          2x + 2y + z = 5
x + y - z = 1      →     -(3/2)z = -3/2
4x - y + 3z = 6          4x - y + 3z = 6


The matrix A and the vector b have been changed as

[ 2  2  1 | 5 ]        [ 2  2    1 |    5 ]
[ 1  1 -1 | 1 ]   →    [ 0  0 -3/2 | -3/2 ]
[ 4 -1  3 | 6 ]        [ 4 -1    3 |    6 ]

Notice that it corresponds to the multiplication

[ 2  2    1 |    5 ]   [    1 0 0 ] [ 2  2  1 | 5 ]
[ 0  0 -3/2 | -3/2 ] = [ -1/2 1 0 ] [ 1  1 -1 | 1 ]
[ 4 -1    3 |    6 ]   [    0 0 1 ] [ 4 -1  3 | 6 ]

Generally, replacing R_i with R_i + cR_j corresponds to the multiplication

(A|b) → (I + cF_{ij})(A|b)

where F_{ij} has all zero entries except the one in the i-th row, j-th column, which is 1.

Further, the second step was swapping rows. It was done as

2x + 2y + z = 5          2x + 2y + z = 5
-(3/2)z = -3/2     →     -5y + z = -4
-5y + z = -4             -(3/2)z = -3/2

For matrices, we have

[ 2  2    1 |    5 ]        [ 2  2    1 |    5 ]
[ 0  0 -3/2 | -3/2 ]   →    [ 0 -5    1 |   -4 ]
[ 0 -5    1 |   -4 ]        [ 0  0 -3/2 | -3/2 ]

which is

[ 2  2    1 |    5 ]   [ 1 0 0 ] [ 2  2    1 |    5 ]
[ 0 -5    1 |   -4 ] = [ 0 0 1 ] [ 0  0 -3/2 | -3/2 ]
[ 0  0 -3/2 | -3/2 ]   [ 0 1 0 ] [ 0 -5    1 |   -4 ]

Generally, we have the multiplication

(A|b) → S_{ij}(A|b)

where S_{ij} is the unit matrix with the i-th and j-th rows swapped.

Finally, multiplying the i-th row of the system by a number λ_i is the same as multiplying it with the diagonal matrix

D(λ_1, ..., λ_n) =
[ λ_1   0   0  ...   0 ]
[   0 λ_2   0  ...   0 ]
[   0   0 λ_3  ...   0 ]
[ ...                  ]
[   0   0   0  ... λ_n ]
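The correspondence between row operations and left multiplication is easy to check numerically, for instance with numpy (an illustrative sketch, not part of the notes):

```python
import numpy as np

Ab = np.array([[2., 2, 1, 5],
               [1, 1, -1, 1],
               [4, -1, 3, 6]])

# Step 1 as a matrix: E = I + c*F_ij adds c times row j to row i.
E = np.eye(3)
E[1, 0] = -0.5                  # replace R2 with -1/2 R1 + R2
print(E @ Ab)                   # second row becomes [0, 0, -1.5, -1.5]

# Step 2 as a matrix: the identity with two rows swapped.
S = np.eye(3)[[0, 2, 1]]        # swap rows 2 and 3
print(S @ E @ Ab)
```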

    3.3 Finding inverse matrix


Definition. Given an n × n matrix A, a matrix B is called the inverse of A if

AB = BA = I

The inverse matrix is denoted by A^{-1}.

Now if we know the inverse matrix, then the system of linear equations Ax = b is easily solvable for any vector b ∈ R^n, since

x = A^{-1}Ax = A^{-1}b

Recall that Gaussian elimination transforms a given matrix A to the unit matrix I via a number of elementary steps, each of them a multiplication with a simple matrix. Thus we have

S_N ⋯ S_2 S_1 A = I

where S_1, ..., S_N are the matrices of the consecutive elementary steps. By definition, we see that

A^{-1} = S_N ⋯ S_2 S_1

Recalling again that each multiplication with S_i means a simple transformation of the matrix A, we obtain the following algorithm to find A^{-1}:

By elementary transformations, simplify the matrix A to the unit matrix I. At the same time, apply the same elementary transformations to the unit matrix. The result is A^{-1}.
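The algorithm can be implemented by row-reducing the augmented matrix (A | I); when the left half becomes I, the right half is A^{-1}. A minimal sketch (the names are mine; for a singular A the pivot search simply fails with an exception):

```python
from fractions import Fraction

def inverse(A):
    """Gauss-Jordan: row-reduce the augmented matrix (A | I) to (I | A^-1)."""
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(i == j) for j in range(n)]
         for i, row in enumerate(A)]
    for i in range(n):
        p = next(r for r in range(i, n) if M[r][i] != 0)   # Step 2: pivot search
        M[i], M[p] = M[p], M[i]
        M[i] = [x / M[i][i] for x in M[i]]                 # Step 3: scale pivot to 1
        for r in range(n):
            if r != i:                                     # Step 1: eliminate column
                M[r] = [x - M[r][i] * y for x, y in zip(M[r], M[i])]
    return [row[n:] for row in M]

print(inverse([[2, 1], [1, 1]]))  # [[1, -1], [-1, 2]]
```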

Week 4

Determinant

Given a system of two linear equations

a_1 x + b_1 y = c_1
a_2 x + b_2 y = c_2

we can solve it by Gaussian elimination. The answer is

x = (c_1 b_2 - b_1 c_2) / (a_1 b_2 - b_1 a_2) = det [ c_1 b_1 ; c_2 b_2 ] / det [ a_1 b_1 ; a_2 b_2 ]

y = (a_1 c_2 - c_1 a_2) / (a_1 b_2 - b_1 a_2) = det [ a_1 c_1 ; a_2 c_2 ] / det [ a_1 b_1 ; a_2 b_2 ]

In a similar manner, a system of three linear equations

a_1 x + b_1 y + c_1 z = d_1
a_2 x + b_2 y + c_2 z = d_2
a_3 x + b_3 y + c_3 z = d_3

has the solution

x = det [ d_1 b_1 c_1 ; d_2 b_2 c_2 ; d_3 b_3 c_3 ] / det [ a_1 b_1 c_1 ; a_2 b_2 c_2 ; a_3 b_3 c_3 ]

y = det [ a_1 d_1 c_1 ; a_2 d_2 c_2 ; a_3 d_3 c_3 ] / det [ a_1 b_1 c_1 ; a_2 b_2 c_2 ; a_3 b_3 c_3 ]

z = det [ a_1 b_1 d_1 ; a_2 b_2 d_2 ; a_3 b_3 d_3 ] / det [ a_1 b_1 c_1 ; a_2 b_2 c_2 ; a_3 b_3 c_3 ]

Permutations

Definition. A permutation of order n is a rearrangement of the numbers 1, 2, ..., n. It is written as σ = (σ_1 ... σ_n).

Example. (5 6 1 2 4 3 8 7) is a permutation of order 8.

Example. (5 6 1 2 4 3 8 7 1) is not a permutation because 1 occurs here twice.

Example. (5 1 2 4 3 8 7) is not a permutation of order 8 because it doesn't have 6.

Obviously, there are n! different permutations of order n. For a permutation σ, one can find the number of reverse-ordered pairs in it, that is,

inv(σ) = #{(i, j) : i < j, σ_i > σ_j}

Example. For the permutation (1 3 4 2), we have the following pairs of values (σ_i, σ_j) with i < j:

(σ_1, σ_2) = (1, 3)   (σ_1, σ_3) = (1, 4)   (σ_1, σ_4) = (1, 2)
(σ_2, σ_3) = (3, 4)   (σ_2, σ_4) = (3, 2)   (σ_3, σ_4) = (4, 2)

so there are 2 reversed pairs, namely (3, 2) and (4, 2).


Definition. The sign of a permutation σ is sgn σ = (-1)^{inv(σ)}. A permutation of positive sign is even. A permutation of negative sign is odd.
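Counting reversed pairs is straightforward to mechanize (a small sketch, with names of my choosing):

```python
from itertools import combinations

def inversions(sigma):
    # Number of reversed pairs: i < j with sigma[i] > sigma[j].
    return sum(1 for i, j in combinations(range(len(sigma)), 2)
               if sigma[i] > sigma[j])

def sign(sigma):
    return (-1) ** inversions(sigma)

print(inversions((1, 3, 4, 2)))   # 2, as in the example above
print(sign((1, 3, 4, 2)))         # +1: an even permutation
print(sign((3, 2, 1)))            # -1: three reversed pairs, an odd permutation
```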

    4.1 Definition of determinant

    Now we can state the general definition of the determinant.

Definition. The determinant of an n × n matrix A = (a_{ij})_{i,j=1}^n is

det A = Σ_{σ = (σ_1 ... σ_n)} (-1)^{inv(σ)} a_{1σ_1} a_{2σ_2} ⋯ a_{nσ_n}

where the sum is taken over all permutations σ of order n.

For example, for n = 3 we have 6 permutations. Their respective numbers of reversed pairs are

inv(1 2 3) = 0   inv(2 3 1) = 2   inv(3 1 2) = 2
inv(1 3 2) = 1   inv(2 1 3) = 1   inv(3 2 1) = 3

Thus we get the formula

det [ a_{11} a_{12} a_{13} ; a_{21} a_{22} a_{23} ; a_{31} a_{32} a_{33} ]
= a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - a_{11}a_{23}a_{32} - a_{12}a_{21}a_{33} - a_{13}a_{22}a_{31}

which we already saw in the formula for the volume of a parallelepiped.

The explicit formula for the determinant of an n × n matrix contains n! summands, each of them a product of n elements of the matrix. Applying it directly is either tricky or heavy. Nevertheless, some simple properties of the determinant can be used to calculate it.
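The definition can be coded verbatim, which also makes the n! growth of the number of summands tangible. The sketch below (my own, reasonable for small n only) combines the inversion count with the sum over permutations:

```python
from itertools import combinations, permutations

def det_by_definition(A):
    """Sum over all permutations of (-1)^inv(sigma) * a_{1,sigma(1)} * ... ."""
    n = len(A)
    total = 0
    for sigma in permutations(range(n)):
        inv = sum(1 for i, j in combinations(range(n), 2)
                  if sigma[i] > sigma[j])
        prod = 1
        for i in range(n):
            prod *= A[i][sigma[i]]
        total += (-1) ** inv * prod
    return total

print(det_by_definition([[1, 2], [3, 4]]))                     # -2
print(det_by_definition([[2, 2, 1], [1, 1, -1], [4, -1, 3]]))  # -15
```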


    Week 5

    Determinant

    5.1 Determinant and Gaussian elimination

    Recall the main definition first.

Definition. The determinant of an n × n matrix A = (a_{ij})_{i,j=1}^n is

det A = Σ_{σ = (σ_1 ... σ_n)} (-1)^{inv(σ)} a_{1σ_1} ⋯ a_{nσ_n}   (5.1)

where the sum is taken over all permutations σ of order n.

In particular,

det [ a_{11} a_{12} a_{13} ; a_{21} a_{22} a_{23} ; a_{31} a_{32} a_{33} ]
= a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - a_{11}a_{23}a_{32} - a_{12}a_{21}a_{33} - a_{13}a_{22}a_{31}

Applying (5.1) to actually find determinants of order 4 and higher requires too much calculation unless the matrix A has a lot of zero entries.

Example. Assume that A has an upper-triangular form, that is,

A =
[ a_{11} a_{12} ... a_{1n} ]
[   0    a_{22} ... a_{2n} ]
[ ...                      ]
[   0      0    ... a_{nn} ]

Then det A = a_{11} a_{22} ⋯ a_{nn}. Indeed, let's look at any summand a_{1σ_1} ⋯ a_{nσ_n} in the expression (5.1). If i > σ_i for some i, then a_{iσ_i} = 0 and the whole product is 0. But that is almost always the case: the only permutation with σ_i ≥ i for all i is the identity, which gives the summand a_{11} a_{22} ⋯ a_{nn}.


However, there are other methods to compute the determinant. For instance, Gaussian elimination would do the job. First, we need to explore the determinant's properties. The determinant can be considered as a function of n vector variables. Specifically, given vector-rows

a_1 = (a_{11}, a_{12}, ..., a_{1n}), ..., a_n = (a_{n1}, a_{n2}, ..., a_{nn})

let

det(a_1, ..., a_n) = det
[ a_{11} a_{12} ... a_{1n} ]
[ a_{21} a_{22} ... a_{2n} ]
[ ...                      ]
[ a_{n1} a_{n2} ... a_{nn} ]

Theorem (Anti-symmetry). The determinant is skew-symmetric, that is,

det(a_1, ..., a_{i-1}, a_j, a_{i+1}, ..., a_{j-1}, a_i, a_{j+1}, ..., a_n) = -det(a_1, ..., a_n)

Proof. From the definition (5.1), what we actually need to show is that switching any two elements in a permutation changes its sign.

Consider a permutation σ = (σ_1 ... σ_n) and let σ′ be the result of switching σ_i and σ_j in the arrangement. Thus,

σ  = (A, σ_i, B, σ_j, C)
σ′ = (A, σ_j, B, σ_i, C)

where A, B, and C are the pieces of the permutation that are the same for both σ and σ′. After switching σ_i and σ_j, reversed pairs inside A, B, and C do not change. Thus we only need to look at what happens with reversed pairs involving σ_i and σ_j.

Let a_1, b_1, and c_1 count the reversed pairs formed by σ_i with elements of A, B, and C respectively. Let also a_2, b_2, and c_2 count the reversed pairs formed by σ_j with elements of A, B, and C respectively. For instance,

b_2 = #{σ_k ∈ B : σ_k > σ_j}

etc. Let also |A|, |B|, and |C| denote the numbers of elements in A, B, C respectively. After switching σ_i and σ_j, we see that a_1, a_2, c_1, c_2 remain unchanged; b_1 becomes |B| - b_1 and b_2 becomes |B| - b_2; and the pair (σ_i, σ_j) itself changes its status. Hence the total number of reversed pairs changes by (|B| - 2b_1) + (|B| - 2b_2) ± 1, which is odd in any case. Hence sgn σ′ = -sgn σ. :)

Corollary. The determinant of a matrix with two equal rows is 0.

Proof. Indeed, if we swap these two rows, the determinant must change sign. On the other hand, the matrix stays the same, so the determinant must stay the same. Thus det A = -det A, that is, det A = 0. :)

Theorem. The determinant is multi-linear, that is, substituting a_i = λb + μc for λ, μ ∈ R, we get

det(a_1, ..., a_{i-1}, λb + μc, a_{i+1}, ..., a_n) = λ det(a_1, ..., b, ..., a_n) + μ det(a_1, ..., c, ..., a_n)

Proof. It is obvious from (5.1), as a_{iσ_i} = λb_{σ_i} + μc_{σ_i}. :)

Corollary. Step 1 of Gaussian elimination, that is, replacing the i-th row A_i with A_i + λA_j, does not change the determinant. Step 2 of Gaussian elimination, that is, swapping the i-th and the j-th rows, reverses the sign of the determinant.

Proof. Step 1 follows from linearity, as

det(A_1, ..., A_i + λA_j, ..., A_n) = det(A_1, ..., A_i, ..., A_n) + λ det(A_1, ..., A_j, ..., A_n)

where the second summand is 0 because it is the determinant of a matrix with two identical rows: A_j stands on both the i-th and the j-th positions. Step 2 follows from the skew-symmetry. :)
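Together with the fact that the determinant of a triangular matrix is the product of its diagonal entries, these two observations give the practical algorithm: eliminate downwards with Step 1, swap rows when a pivot vanishes, and track the sign. A minimal sketch (my names; no numerical pivoting):

```python
def det_by_elimination(A):
    """Reduce to upper-triangular form with Steps 1 and 2 only; Step 1
    preserves the determinant, each Step 2 (row swap) flips its sign."""
    M = [row[:] for row in A]          # work on a copy
    n, sign = len(M), 1
    for i in range(n):
        p = next((r for r in range(i, n) if M[r][i] != 0), None)
        if p is None:
            return 0                   # a zero column: the determinant is 0
        if p != i:
            M[i], M[p] = M[p], M[i]
            sign = -sign
        for r in range(i + 1, n):
            c = M[r][i] / M[i][i]
            M[r] = [x - c * y for x, y in zip(M[r], M[i])]
    prod = sign
    for i in range(n):
        prod *= M[i][i]
    return prod

print(det_by_elimination([[2, 2, 1], [1, 1, -1], [4, -1, 3]]))  # -15.0
```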

    5.2 Axiomatic definition of the determinant

Theorem. The determinant is the only multi-linear skew-symmetric function f whose arguments are n × n matrices such that its value on the unit matrix I is 1.

Proof. Assume that f is a multi-linear skew-symmetric function on n × n matrices with f(I) = 1. Given a matrix A, let's try to find f(A) by transforming A to I with Steps 1-3 of Gaussian elimination.

Due to linearity, if we apply Step 1 of Gaussian elimination, f(A) remains the same. Due to skew-symmetry, if we apply Step 2 of Gaussian elimination, f(A) changes sign. Due to linearity, if we apply Step 3 of Gaussian elimination, multiplying a row by λ ≠ 0, f(A) is multiplied by λ. Proceeding this way, we either reach a position where A is upper-triangular with 1s on the diagonal, or a_{ii} = 0 for some i and then f(A) = 0. Finally, after applying Step 1 a few more times, A is transformed to I, and f(I) = 1.

In other words, f(A) is calculated in exactly the same manner as det A. Hence f(A) = det A. :)

Now we can give an alternative definition of the determinant. It is called axiomatic because instead of an explicit formula, the object is defined via its properties, that is, axioms. The theorem we just proved shows that these axioms define the object completely and uniquely.


Definition. The determinant is a function of an n × n matrix argument such that

(a) it is multi-linear;
(b) it is skew-symmetric;
(c) its value on the unit matrix is 1.


    Week 6

    Row and Column Expansion

    6.1 Rows and columns

Theorem 6.1. The determinant is transpose-invariant, that is,

det A = det Aᵀ, where (Aᵀ)_{ij} = a_{ji}

Proof. By definition, det A = Σ_σ (-1)^{inv(σ)} a_{1σ_1} ⋯ a_{nσ_n}. Since in the product a_{1σ_1} ⋯ a_{nσ_n} we took one element from each row and one from each column, we can rearrange it as a_{τ_1 1} ⋯ a_{τ_n n}, where τ is another permutation, namely the one with τ_{σ_i} = i. Such a permutation is called inverse to σ and is denoted σ^{-1}. Thus we actually need to prove that the sign of a permutation is always the same as that of its inverse. But this is obvious, because the reversed pairs in σ and in τ = σ^{-1} are clearly the same. :)

In a transposed matrix, rows become columns and columns become rows. Therefore Theorem 6.1 implies that one can use elementary column transformations in order to find det A, instead of row transformations.

    6.2 Row and column expansion

Various formulae are available to express the determinant of a matrix in terms of determinants of its submatrices. For example,

det [ a_{11} a_{12} a_{13} ; a_{21} a_{22} a_{23} ; a_{31} a_{32} a_{33} ]
= a_{11}a_{22}a_{33} + a_{21}a_{32}a_{13} + a_{31}a_{12}a_{23} - a_{21}a_{12}a_{33} - a_{13}a_{22}a_{31} - a_{11}a_{23}a_{32}

Grouping together the terms with a_{11}, a_{12}, a_{13}, we see that it equals

a_{11}(a_{22}a_{33} - a_{23}a_{32}) + a_{12}(a_{31}a_{23} - a_{21}a_{33}) + a_{13}(a_{21}a_{32} - a_{22}a_{31})
= a_{11} det [ a_{22} a_{23} ; a_{32} a_{33} ] - a_{12} det [ a_{21} a_{23} ; a_{31} a_{33} ] + a_{13} det [ a_{21} a_{22} ; a_{31} a_{32} ]

Notice that the three submatrices in this expression are obtained by deleting the first row and, respectively, the 1st, 2nd and 3rd column from A.

Definition. The (i, j) minor of a matrix A is the determinant M_{ij} of the (n-1) × (n-1) matrix obtained from A by deleting the i-th row and the j-th column.

Example.

A =
[  1  2   3  4 ]
[  0  2   4  6 ]
[ 20 23  26 29 ]
[ -4  8 -12 16 ]

M_{24} = det [ 1 2 3 ; 20 23 26 ; -4 8 -12 ] = 544

Thus for a 3 × 3 matrix we have det A = a_{11}M_{11} - a_{12}M_{12} + a_{13}M_{13}.

Definition. The (i, j) cofactor of a matrix A is C_{ij} = (-1)^{i+j} M_{ij}. The cofactor matrix of A is the matrix C whose entries are the cofactors of A.

In other words, cofactors are signed minors, where the signs are designed such that we have det A = a_{11}C_{11} + a_{12}C_{12} + a_{13}C_{13}. More generally, the following theorem holds.

Theorem (Row Expansion). For an n × n matrix A and a fixed row index i, we have

det A = Σ_{j=1}^n a_{ij} C_{ij} = Σ_{j=1}^n (-1)^{i+j} a_{ij} M_{ij}

Since the determinant can also be computed in terms of columns, the Column Expansion formula

det A = Σ_{i=1}^n a_{ij} C_{ij} = Σ_{i=1}^n (-1)^{i+j} a_{ij} M_{ij}

also holds for a fixed column index j.


Example. Expanding along the 1st row, we get

det [ -2 0 3 0 ; 0 2 0 6 ; 7 0 5 0 ; 4 8 4 2 ]
= -2 det [ 2 0 6 ; 0 5 0 ; 8 4 2 ] + 3 det [ 0 2 6 ; 7 0 0 ; 4 8 2 ]

Applying the 1st Column Expansion in the first matrix and the 2nd Row Expansion in the second matrix, we obtain

-2 ( 2 det [ 5 0 ; 4 2 ] + 8 det [ 0 6 ; 5 0 ] ) + 3 · (-7) det [ 2 6 ; 8 2 ]

Computing the 2 × 2 determinants directly, we see that it equals

-2 (2 · 10 + 8 · (-30)) - 21 · (-44) = 440 + 924 = 1364

Example. Let's find det A_n, where A_n is the n × n tridiagonal matrix

A_n =
[  2 -1  0  0 ...  0 ]
[ -1  2 -1  0 ...  0 ]
[  0 -1  2 -1 ...  0 ]
[  0  0 -1  2 ...  0 ]
[ ...                ]
[  0  0  0 ... -1  2 ]

Applying the 1st Row Expansion, we obtain

det A_n = 2 det A_{n-1} + det
[ -1 -1  0 ...  0 ]
[  0  2 -1 ...  0 ]
[  0 -1  2 ...  0 ]
[ ...             ]
[  0  0 ... -1  2 ]

Let's expand the 2nd summand along the 1st column. We have then

det
[ -1 -1  0 ...  0 ]
[  0  2 -1 ...  0 ]
[  0 -1  2 ...  0 ]
[ ...             ]
[  0  0 ... -1  2 ]
= -det
[  2 -1 ...  0 ]
[ -1  2 ...  0 ]
[ ...          ]
[  0 ... -1  2 ]
= -det A_{n-2}

Denoting the n-th such determinant by X_n, we see that these equations say that X_n = 2X_{n-1} - X_{n-2}. In order to complete the computation, we need to know what to start with, which is

X_1 = det [2] = 2,  X_2 = det [ 2 -1 ; -1 2 ] = 3

Continuing this series, we get X_3 = 2X_2 - X_1 = 4, X_4 = 2X_3 - X_2 = 5, etc. Thus we observe that apparently X_n = n + 1. Let's prove it by induction.


The base case is already shown. For the inductive step, we assume that X_n = n + 1 and X_{n-1} = n. Hence we have X_{n+1} = 2X_n - X_{n-1} = 2(n + 1) - n = 2n + 2 - n = n + 2, which is what we needed.
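The pattern X_n = n + 1 is easy to confirm numerically, for instance with numpy (a quick check, not part of the notes):

```python
import numpy as np

def X(n):
    # Tridiagonal matrix with 2 on the diagonal and -1 next to it.
    A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
    return np.linalg.det(A)

for n in range(1, 7):
    print(n, round(X(n)))   # prints n + 1: 2, 3, 4, 5, 6, 7
```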

Definition. For an n × n matrix A, its adjugate matrix adj A is the transpose of the cofactor matrix, that is, (adj A)_{ij} = C_{ji} = (-1)^{i+j} M_{ji}.

Theorem. Let A be any n × n matrix. Then A · adj A = det A · I.

Proof. We need to check that

(i) Σ_{j=1}^n a_{ij} (adj A)_{ji} = det A;
(ii) and that Σ_{j=1}^n a_{ij} (adj A)_{jk} = 0 for k ≠ i.

The first equation is nothing but Row Expansion, because Σ_{j=1}^n a_{ij} (adj A)_{ji} = Σ_{j=1}^n a_{ij} C_{ij} = det A.

The second equation is also Row Expansion, but for a different matrix. Specifically, for k ≠ i we have

Σ_{j=1}^n a_{ij} (adj A)_{jk} = Σ_{j=1}^n a_{ij} C_{kj} = det
[ a_{11} a_{12} ... a_{1n} ]
[ ...                      ]
[ a_{i1} a_{i2} ... a_{in} ]   (i-th row)
[ ...                      ]
[ a_{i1} a_{i2} ... a_{in} ]   (k-th row)
[ ...                      ]
[ a_{n1} a_{n2} ... a_{nn} ]
= 0

because it is the determinant of a matrix with two identical rows. Namely, it is obtained from A by replacing the k-th row with a duplicate of the i-th one. :)

In other words, we have the following expression for the inverse matrix:

A^{-1} = (1 / det A) · adj A

It is defined when det A ≠ 0. What if det A = 0? It's natural to expect that det(AB) = det A · det B, so if det A = 0, then

det(A A^{-1}) = det A · det A^{-1} = 0 ≠ 1 = det I

and therefore A^{-1} is not defined. We'll prove that det(AB) = det A · det B later; for now, let us just believe that it is so.

Example.

A =
[  1 0 2 ]
[ -2 2 5 ]
[  1 4 0 ]

M =
[ -20 -5 -10 ]
[  -8 -2   4 ]
[  -4  9   2 ]

C =
[ -20  5 -10 ]
[   8 -2  -4 ]
[  -4 -9   2 ]

Since det A = -40, we obtain

adj A = Cᵀ =
[ -20  8 -4 ]
[   5 -2 -9 ]
[ -10 -4  2 ]

A^{-1} = -(1/40) adj A =
[  1/2 -1/5  1/10 ]
[ -1/8 1/20  9/40 ]
[  1/4 1/10 -1/20 ]
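A short numeric verification of A · adj A = det A · I and of the inverse found above (a sketch using numpy; numpy has no built-in adjugate, so it is assembled from cofactors):

```python
import numpy as np

A = np.array([[1., 0, 2], [-2, 2, 5], [1, 4, 0]])
n = len(A)

# Cofactor C_ij = (-1)^(i+j) * minor M_ij; the adjugate is the transpose.
C = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
        C[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
adj = C.T

print(np.round(np.linalg.det(A)))    # -40.0
print(np.round(A @ adj))             # -40 times the unit matrix
print(np.allclose(adj / np.linalg.det(A), np.linalg.inv(A)))  # True
```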


    Week 7

    Concept of Dimension

    Homogeneous systems

Definition. A homogeneous system of m linear equations with n variables is Ax = 0, where A is an m × n matrix, x ∈ R^n is a vector-column, that is, an n × 1 matrix, and 0 ∈ R^m is the m × 1 matrix whose entries are all zeroes.

If there is only one equation, that is, m = 1, then the set of solutions is called a straight line for n = 2, a plane for n = 3, and a hyperplane in general.

Example. A homogeneous linear equation Ax + By = 0 defines a straight line passing through the origin in R² unless A = B = 0. If B ≠ 0, it can be written as y = kx, where k = -A/B. If B = 0, then the equation is x = 0.

Example. A homogeneous linear equation Ax + By + Cz = 0 defines a plane passing through the origin in R³ unless A = B = C = 0.

Example. A homogeneous system of two linear equations in three variables

a_{11} x + a_{12} y + a_{13} z = 0
a_{21} x + a_{22} y + a_{23} z = 0

defines the intersection of two planes in R³, that is, a line, unless the planes coincide. The planes coincide if and only if their equations are proportional, that is, (a_{11}, a_{12}, a_{13}) = λ(a_{21}, a_{22}, a_{23}). For instance, the system

x + 2y + z = 0
x + y + z = 0

defines a line, and

x + 2y + z = 0
-2x - 4y - 2z = 0

a plane.


Equations in a homogeneous system cannot give parallel planes, because there is always the trivial solution, the zero vector. Geometrically it means that all those hyperplanes contain the origin. But parallel planes do not have a common point, so this is impossible for a homogeneous system.

Theorem. Given a system of linear equations Ax = b, let x_0 be some particular solution of it and let x̄ denote an arbitrary solution to the homogeneous system Ax̄ = 0. Then the general solution to the system Ax = b is x = x_0 + x̄.

Proof. To prove that x = x_0 + x̄ is the general solution, we need to check two things:

(i) that x = x_0 + x̄ is always a solution,
(ii) and that any solution can be expressed in the form x = x_0 + x̄.

First, let x = x_0 + x̄. Then Ax = A(x_0 + x̄) = Ax_0 + Ax̄ = b + 0 = b, so it is, indeed, a solution. Second, let Ax = b. Then x = x_0 + (x - x_0), and let x̄ = x - x_0. Then Ax̄ = A(x - x_0) = Ax - Ax_0 = b - b = 0, so, indeed, x̄ is a solution to the homogeneous system. :)

Example. Consider the system

x + 2y + z = 1
x - 4y - 2z = 1

Obviously, it has a solution x = 1, y = z = 0. To find the general solution, we need to solve the homogeneous system, that is,

x + 2y + z = 0        R_2 → R_2 + 2R_1        x + 2y + z = 0
x - 4y - 2z = 0             →                 3x = 0

so the solution is x = 0 and z = -2y. Therefore the general solution to the original system is: x = 1, y is arbitrary, z = -2y. It can be given in the parametric form as x = 1, y = t, z = -2t.

    Idea of Dimension

Given a system of n equations with n variables, recall that it admits a unique solution if and only if the determinant of its matrix is nonzero. For a homogeneous system Ax = 0, there is always the trivial solution x = 0. If det A ≠ 0, then this trivial solution is the only one. On the other hand, assume that det A = 0.

Example 7.7. Consider the system

x + 2y + z = 0
-2x - 4y - 2z = 0
(1/2)x + y + (1/2)z = 0

Notice that all the equations are proportional to the first one. Indeed, the second one is obtained by multiplying the first one by -2, and the third one by 1/2. Therefore they all define the same plane x + 2y + z = 0, which can be parameterized by y = u, z = v, x = -2u - v.


Example 7.8. Consider the system

x + 2y + z = 0
x - y - z = 0
3x - z = 0

Again, the matrix has zero determinant. What about the set of solutions? Let's apply Gaussian elimination.

x + 2y + z = 0          x + 2y + z = 0
x - y - z = 0     →     -3y - 2z = 0
3x - z = 0              -6y - 4z = 0

Now the third equation is proportional to the second one, so it is redundant and we can remove it. The original system is equivalent to the system of two equations

x + 2y + z = 0
-3y - 2z = 0

Let's apply Gaussian elimination again. It gives us

x + 2y + z = 0          x - (1/3)z = 0
y + (2/3)z = 0    →     y + (2/3)z = 0

so z can be chosen as a parameter, and the solution is x = z/3, y = -2z/3, with z arbitrary.

In general, Gaussian elimination transforms a matrix into the following form:

[ 0 ... 0 1 * ... * 0 * ... * 0 * ... ]
[ 0 ... 0 0 0 ... 0 1 * ... * 0 * ... ]
[ 0 ... 0 0 0 ... 0 0 0 ... 0 1 * ... ]
[ ...                                 ]

with some more possible zero rows in the end. It is called the reduced row echelon form. For instance, the reduced row echelon matrix from Example 7.7 is

[ 1 2 1 ]
[ 0 0 0 ]
[ 0 0 0 ]

and the reduced row echelon matrix from Example 7.8 is

[ 1 0 -1/3 ]
[ 0 1  2/3 ]
[ 0 0  0   ]

After re-arranging the columns, the reduced echelon form can be compactly written as

[ I * ]
[ 0 0 ]

where I is a unit matrix, the 0s denote submatrices with all zero entries, and * is a submatrix with arbitrary entries. Variables that correspond to the columns of the * part can be taken as parameters. One needs one parameter to encode a line and two parameters to encode a plane. Generally, the number of parameters needed to encode the set of solutions of a homogeneous system corresponds to the idea of dimension.
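The reduced row echelon form is also available in computer algebra systems; for instance, assuming sympy is available, the matrix of Example 7.8 can be checked directly (rref also reports the pivot columns, and the remaining variables are the free parameters):

```python
from sympy import Matrix

A = Matrix([[1, 2, 1],
            [1, -1, -1],
            [3, 0, -1]])
R, pivots = A.rref()
print(R)        # Matrix([[1, 0, -1/3], [0, 1, 2/3], [0, 0, 0]])
print(pivots)   # (0, 1): x and y are pivot variables, z is a free parameter
```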


    Week 8

    Vector Spaces

    Some examples

Recall that the Sum Rule in Calculus describes a similar pattern for various types of objects that one can multiply with constants and add together.

Example 8.1. Let 𝒟 be the set of differentiable functions from R to R. Then for any C, D ∈ R and any f, g ∈ 𝒟 (which means that f′ and g′ are defined), we have (Cf + Dg)′ = Cf′ + Dg′ and therefore Cf + Dg ∈ 𝒟.

Example 8.2. Let S be the set of converging series or, more precisely, sequences {a_n}_{n=1}^∞ such that the series Σ_{n=1}^∞ a_n converges. Assume that {a_n}, {b_n} ∈ S and let C, D ∈ R. Then since

Σ_{n=1}^∞ (C a_n + D b_n) = C Σ_{n=1}^∞ a_n + D Σ_{n=1}^∞ b_n

we see that {C a_n + D b_n} ∈ S.

Example 8.3. Let X be the set of functions f(x) such that ∫_0^2 f(x) dx = 0. Assume that f, g ∈ X and let C, D ∈ R. Then

∫_0^2 (C f(x) + D g(x)) dx = C ∫_0^2 f(x) dx + D ∫_0^2 g(x) dx = 0

so Cf + Dg ∈ X.

    8.1 General definition of a vector space

Besides functions or sequences, what other types of objects can be added together and multiplied with numbers? For example, one can do it with real numbers, elements of R^n, matrices, geometric vectors on a plane or in 3D space, polynomials, etc.


Definition. A set U is called a vector space and its elements vectors if there are operations of

(i) addition, that is, for any two vectors u ∈ U and v ∈ U their sum u + v ∈ U is defined, and
(ii) multiplication by a number, that is, for a vector u ∈ U and a scalar λ ∈ R their product λu ∈ U is defined.

Here, the word scalar means a real number, as opposed to a vector, an element of the vector space being considered. The operations in a vector space must satisfy the following conditions:

(i) For any three vectors u, v, w ∈ U, we have (u + v) + w = u + (v + w)
(ii) For any two vectors u, v ∈ U, we have u + v = v + u
(iii) There is a unique zero vector 0 such that u + 0 = u for any u ∈ U
(iv) For any vector u ∈ U there is a unique vector -u such that u + (-u) = 0
(v) For any number λ ∈ R and any two vectors u, v ∈ U, we have λ(u + v) = λu + λv
(vi) For any two numbers λ, μ ∈ R and any vector u ∈ U, we have (λ + μ)u = λu + μu
(vii) For any two numbers λ, μ ∈ R and any vector u ∈ U, we have (λμ)u = λ(μu)
(viii) For any vector u ∈ U, we have 1 · u = u

As we have already seen, the set of differentiable functions R → R, the set of converging series, and the set of functions whose integral from 0 to 2 is 0 are vector spaces. Obviously, R^n is a vector space. More generally, the set of m × n matrices with real entries is a vector space. Overall, vector spaces occur very often in mathematics.

    8.2 General properties of vector spaces

Some statements about a vector space are quite obvious in each particular case, but generally we have to deduce them from the definition.

Theorem. Given a vector space U, for any vector u ∈ U, we have

0 · u = 0   (8.1)

Proof. The following chain of equalities holds:

0 · u = 0 · u + 0                 (by (iii))
      = 0 · u + (u + (-u))        (by (iv))
      = (0 · u + 1 · u) + (-u)    (by (viii))
      = (0 + 1) · u + (-u)        (by (vi))
      = u + (-u)                  (by (viii))
      = 0                         (by (iv))   (8.2)

Here, the label at each equality indicates the axiom from the definition making the equality true. :)


Theorem. Let U be a vector space. For any vector u ∈ U, we have

-u = (-1) · u   (8.3)

Proof. Since the definition claims that the vector -u is unique, our goal is just to show that u + (-1) · u = 0. It easily follows from the distributivity of multiplication:

u + (-1) · u = 1 · u + (-1) · u = (1 - 1) · u = 0 · u = 0   (8.4)

due to the previous theorem. :)

Further, instead of the notation (-1) · u we simply write -u. Naturally, the notation u - v means just u + (-v).

    Geometric vectors

The notion of a vector is studied at school, although there it means just an oriented line segment. Two such vectors are equal if they have the same length and the same direction.

The zero vector is a vector of zero length, a point. To multiply a geometric vector by a number λ ∈ R, one simply multiplies its length by |λ|, keeping the direction when λ > 0 and reversing it when λ < 0. Thus multiplication by 0 always yields the zero vector. To add two vectors, one places their initial points at the origin; the sum is the diagonal of the parallelogram. Obviously, this addition is commutative.

To see that it is associative, we can use the equation AB + BC = AC, which implies that (AB + BC) + CD = AB + (BC + CD) = AD. All the other axioms of a vector space can be checked just as easily.

Given two orthogonal oriented straight lines in the plane, one can find the Cartesian coordinates of a geometric vector. This gives a bijective correspondence between vectors on the plane and couples of real numbers, that is, R². In the same manner, vectors in space correspond to triples of real numbers, that is, R³.

    8.3 Subspaces

Definition. A non-empty subset U of a vector space V is called a vector subspace if it is a vector space itself.

Thus if we need to find out whether U ⊂ V is a subspace, we need to check that for any C, D ∈ R and for any x, y ∈ U, we have Cx + Dy ∈ U. In particular, the vector 0 belongs to any subspace, because 0 = x - x ∈ U, where x is any vector in U.

Example. The set 𝒟₂ of twice differentiable functions R → R is a subset of the set of differentiable functions. Is it a vector subspace? We need to check that if f″ and g″ are defined, then for any C, D ∈ R the second derivative (Cf + Dg)″ is defined. Obviously, (Cf + Dg)″ = Cf″ + Dg″, so it is true.


Example. Let F be the set of all functions defined on the interval [0, 2] and let X be the subset consisting of those functions whose integral from 0 to 2 is 0. As Example 8.3 shows, X is a subspace. How about the set Y of functions whose integral from 0 to 2 is 1? Assume that f, g ∈ Y and let C, D ∈ R. Then

∫_0^2 (C f(x) + D g(x)) dx = C ∫_0^2 f(x) dx + D ∫_0^2 g(x) dx = C + D

which may or may not be 1. For instance, if f(x) = 1/2, g(x) = 0, C = 2 and D = 0, the integral of Cf + Dg is 2, and Cf + Dg ∉ Y.

Example. Consider the plane. The set consisting of just one point, the origin, is a vector subspace. On the other hand, the set of non-zero vectors is not a vector subspace, because any subspace must contain the zero vector.

Definition. Let V be a vector space. Given vectors v_1, ..., v_n ∈ V and scalars C_1, ..., C_n ∈ R, the expression C_1 v_1 + ... + C_n v_n ∈ V is called a linear combination of the vectors v_1, ..., v_n with coefficients C_1, ..., C_n.

Let V be any vector space and let S ⊂ V be a subset (possibly not a subspace). Then all possible linear combinations of vectors from S form a subspace. Indeed, if we multiply a linear combination by a scalar, we get a linear combination of the same vectors. If we add two linear combinations, the sum is also a linear combination.

Definition. The subspace consisting of all linear combinations of vectors from S is called the subspace spanned by S and is denoted ⟨S⟩.

Example. Let P be the space of all polynomials. Then ⟨1, x, x²⟩ is the space consisting of polynomials of degree at most 2.

Example. In R², consider the set A of all points (n, n), where n ∈ Z. Then ⟨A⟩ is the line y = x. Indeed, an arbitrary point on this line is (x, x) = x(1, 1), so it is a linear combination of just (1, 1) ∈ A.

Theorem. For any vector space V and any subset S of it, the space ⟨S⟩ is characterized as

(i) the intersection of all subspaces in V containing S;
(ii) the minimal subspace in V containing S, in the sense that if U ⊃ S is a subspace in V, then U ⊃ ⟨S⟩.

Proof. (i) Let M be the intersection of all subspaces containing S. In order to prove that M = ⟨S⟩, we need to check two things: that M ⊂ ⟨S⟩ and that M ⊃ ⟨S⟩.

(a) Let x ∈ M. Then x belongs to every subspace containing S. In particular, x ∈ ⟨S⟩, because ⟨S⟩ is a subspace containing S. Thus M ⊂ ⟨S⟩.

(b) Let x ∈ ⟨S⟩. Then x = Σ_{i=1}^n C_i x_i, where x_1, ..., x_n ∈ S. Now consider any subspace U containing S. Since U is a subspace and since x_1, ..., x_n ∈ U, any linear combination of the vectors x_1, ..., x_n also belongs to U. Hence x ∈ U for any such U, and therefore x is in the intersection of all such subspaces U. Thus M ⊃ ⟨S⟩.

(ii) Since ⟨S⟩ is the intersection of all subspaces containing S, any such subspace U contains ⟨S⟩. :)


    Week 9

    Basis

    9.1 Isomorphism

Example 9.1. Consider the space of 2 × 2 symmetric matrices and the space of 2 × 2 traceless matrices. A symmetric matrix satisfies a_{12} = a_{21} and therefore is of the form

[ a b ]
[ b c ]

A traceless matrix satisfies a_{11} + a_{22} = 0 and therefore is of the form

[ a  b ]
[ c -a ]

Notice that in both situations a matrix is encoded by three real numbers, that is, a 3-vector (a, b, c) ∈ R³. In other words, the spaces of symmetric matrices and of traceless matrices are equivalent to the space R³.

Definition. Given vector spaces U and V, one says that a map φ : U → V is an isomorphism if

(i) φ is bijective,
(ii) and φ(Cx + Dy) = Cφ(x) + Dφ(y).

If such a φ exists, the spaces U and V are called isomorphic, denoted by U ≅ V.

The first condition means that U and V are in one-to-one correspondence. The second one says that U and V have the same vector space structure. For instance, Example 9.1 shows that the spaces of symmetric matrices and traceless matrices are isomorphic to R³.

Example. Consider the space of polynomials of degree at most 2. An arbitrary such polynomial is a + bx + cx². Assigning the triple (a, b, c) ∈ R³ to it defines an isomorphism with R³. It is given by

φ(a + bx + cx²) = (a, b, c) ∈ R³


The following spaces (and a lot more, actually) are isomorphic to R³: symmetric 2 × 2 matrices, traceless 2 × 2 matrices, anti-symmetric 3 × 3 matrices, polynomials of degree at most 2, geometric vectors in space. They are all equivalent as vector spaces, that is, the operations of addition and multiplication by a scalar are the same in all of them.

Example. The map ψ from polynomials of degree at most 2 to R³ given by

ψ(a + bx + cx²) = (a³, b⁵, c⁷)

is bijective, but not an isomorphism, because

ψ(2 + 2x) = (8, 32, 0) ≠ 2ψ(1 + x)

    9.2 Linear span

Definition. Let V be a vector space. Given vectors v_1, ..., v_n ∈ V and scalars C_1, ..., C_n ∈ R, the expression C_1 v_1 + ... + C_n v_n ∈ V is called a linear combination of the vectors v_1, ..., v_n with coefficients C_1, ..., C_n.

Let V be any vector space and let S ⊂ V be a subset (possibly not a subspace). Then all possible linear combinations of vectors from S form a subspace. Indeed, if we multiply a linear combination by a scalar, we get a linear combination of the same vectors. If we add two linear combinations, the sum is also a linear combination.

Definition. The subspace consisting of all linear combinations of vectors from S is called the subspace spanned by S, or the linear span of S, and is denoted ⟨S⟩.

Example. Let P be the space of all polynomials. Then ⟨1, x, x²⟩ is the space consisting of polynomials of degree at most 2.

Example. Consider D(0, 1), the space of functions differentiable on the interval (0, 1). Let S = {1, x, x², x³, ...}. Then ⟨S⟩ is the space of all polynomials.

Example. In R², consider the set A of all points (n, n), where n ∈ Z. Then ⟨A⟩ is the line y = x. Indeed, an arbitrary point on this line is (x, x) = x(1, 1), so it is a linear combination of just (1, 1) ∈ A.

Theorem. For any vector space V and any subset S of it, the space ⟨S⟩ is characterized as

(i) the intersection of all subspaces in V containing S;
(ii) the minimal subspace in V containing S, in the sense that if U ⊃ S is a subspace in V, then U ⊃ ⟨S⟩.


Proof. (i) Let M be the intersection of all subspaces containing S. In order to prove that M = ⟨S⟩, we need to check two things: that M ⊂ ⟨S⟩ and that M ⊃ ⟨S⟩.

(a) Let x ∈ M. Then x belongs to every subspace containing S. In particular, x ∈ ⟨S⟩, because ⟨S⟩ is a subspace containing S. Thus M ⊂ ⟨S⟩.

(b) Let x ∈ ⟨S⟩. Then x = Σ_{i=1}^n C_i x_i, where x_1, ..., x_n ∈ S. Now consider any subspace U containing S. Since U is a subspace and since x_1, ..., x_n ∈ U, any linear combination of the vectors x_1, ..., x_n also belongs to U. Hence x ∈ U for any such U, and therefore x is in the intersection of all such subspaces U. Thus M ⊃ ⟨S⟩.

(ii) Since ⟨S⟩ is the intersection of all subspaces containing S, any such subspace U contains ⟨S⟩. :)

    9.3 Basis and dimension

Definition. One says that a set S ⊂ V is a basis of a vector space V if any vector x ∈ V is uniquely expressed as a linear combination of vectors from S.

Example. The space of polynomials has the basis 1, x, x², ... It contains infinitely many vectors.

Example. The space R^n has the following basis:

e_1 = (1, 0, 0, ..., 0)ᵀ,  e_2 = (0, 1, 0, ..., 0)ᵀ,  e_3 = (0, 0, 1, ..., 0)ᵀ,  ...,  e_n = (0, 0, 0, ..., 1)ᵀ

Indeed, given x ∈ R^n, assume that x = A_1 e_1 + ... + A_n e_n. It follows then that A_1 = x_1, A_2 = x_2, ..., A_n = x_n, so, indeed, x is uniquely expressed as a linear combination of {e_1, e_2, ..., e_n}.

The zero vector can never belong to a basis. Indeed, 0 = 1 · 0 = 2 · 0 are two different linear combinations for 0, while the expression is supposed to be unique if it is a basis.

Definition. If V has a basis consisting of n vectors {e_1, e_2, ..., e_n}, then one says that V has dimension n, denoted by dim V = n. In this situation, given a vector x, we have

x = A_1 e_1 + ... + A_n e_n

Here, A_1, A_2, ..., A_n are the coordinates of the vector x in the basis {e_1, e_2, ..., e_n}.


Example 9.15. Consider the vectors

e_1 = (1, 2, 1)ᵀ,  e_2 = (2, 1, 0)ᵀ,  e_3 = (0, 1, 1)ᵀ,  x = (3, 2, 0)ᵀ

and let's prove that e_1, e_2, e_3 form a basis of R³ and find the coordinates of x in this basis. First, we need to check that an arbitrary vector is uniquely expressed as a linear combination of e_1, e_2, e_3, that is, that we can find α, β, γ from

(a, b, c)ᵀ = α (1, 2, 1)ᵀ + β (2, 1, 0)ᵀ + γ (0, 1, 1)ᵀ

and the solution is unique. Writing this equation component by component, we obtain

α + 2β = a
2α + β + γ = b
α + γ = c

which is a system of linear equations. It admits a unique solution if and only if its matrix is nonsingular (of nonzero determinant). The determinant is -1, so this is true.

Further, in order to find the coordinates of the vector x in this basis, we need to solve the system

α + 2β = 3
2α + β + γ = 2
α + γ = 0

Applying Gaussian elimination, we find α = 1, β = 1, γ = -1, so x has coordinates (1, 1, -1) in the basis e_1, e_2, e_3.

Notice that the matrix of the system we had in Example 9.15 is

[ 1 2 0 ]
[ 2 1 1 ]
[ 1 0 1 ]

whose columns are the vectors e_1, e_2, e_3. Generally, n vectors in R^n form a basis if and only if the determinant of the matrix whose columns (equivalently, by Theorem 6.1, rows) are these vectors is nonzero.
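Numerically, checking a basis and finding coordinates amount to one determinant and one linear solve, for instance with numpy (an illustrative sketch):

```python
import numpy as np

E = np.array([[1, 2, 0],
              [2, 1, 1],
              [1, 0, 1]])   # columns are e1, e2, e3
x = np.array([3, 2, 0])

print(np.linalg.det(E))       # -1.0 up to rounding: nonzero, so a basis
print(np.linalg.solve(E, x))  # [ 1.  1. -1.]: coordinates of x in the basis
```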

Example. Consider the space of traceless symmetric 3 × 3 matrices, and let's find its basis and dimension. An arbitrary such matrix is of the form

x =
[ a d e      ]
[ d b f      ]
[ e f -a - b ]

The easy way to construct a basis is to set one of the parameters to be 1 and the rest 0. We obtain the following matrices then:

e_1 =
[ 1 0  0 ]
[ 0 0  0 ]
[ 0 0 -1 ]

e_2 =
[ 0 0  0 ]
[ 0 1  0 ]
[ 0 0 -1 ]

e_3 =
[ 0 1 0 ]
[ 1 0 0 ]
[ 0 0 0 ]

e_4 =
[ 0 0 1 ]
[ 0 0 0 ]
[ 1 0 0 ]

e_5 =
[ 0 0 0 ]
[ 0 0 1 ]
[ 0 1 0 ]

Let's prove that this is, indeed, a basis. The matrix equation x = αe_1 + βe_2 + γe_3 + δe_4 + εe_5 means the following equations for the matrix entries (written for a_{11}, a_{12}, ..., a_{33}): a = α, d = γ, e = δ, d = γ, b = β, f = ε, e = δ, f = ε, -a - b = -α - β. Removing the duplicates and the redundant last one (it is minus the sum of the first and the fifth ones), we get the trivial system α = a, β = b, γ = d, δ = e, ε = f. It obviously has a unique solution. Therefore the dimension of the space of traceless symmetric 3 × 3 matrices is 5.
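A quick mechanical check of this dimension count (my own sketch, not from the notes): flatten the five matrices into 9-vectors and compute the rank of the resulting 5 × 9 matrix.

```python
import numpy as np

e1 = np.array([[1, 0, 0], [0, 0, 0], [0, 0, -1]])
e2 = np.array([[0, 0, 0], [0, 1, 0], [0, 0, -1]])
e3 = np.array([[0, 1, 0], [1, 0, 0], [0, 0, 0]])
e4 = np.array([[0, 0, 1], [0, 0, 0], [1, 0, 0]])
e5 = np.array([[0, 0, 0], [0, 0, 1], [0, 1, 0]])

# Stack the flattened matrices as rows; rank 5 means they are
# linearly independent, so the dimension of the space is 5.
B = np.array([e.flatten() for e in (e1, e2, e3, e4, e5)])
print(np.linalg.matrix_rank(B))   # 5
```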

Theorem 9.17. Let a vector space V have a basis e_1, e_2, ..., e_n. Then V ≅ R^n, with the isomorphism given by

φ(x) = (A_1, A_2, ..., A_n)

where (A_1, A_2, ..., A_n) are the coordinates of the vector x in the basis e_1, e_2, ..., e_n.

Proof. We need to check two things:

(i) The fact that φ is bijective. It follows from the definition of a basis.

(ii) The fact that φ preserves the structure of a vector space. Let x ∈ V have coordinates (A_1, ..., A_n) and let y have coordinates (B_1, ..., B_n). We need to check that the vector Cx + Dy has coordinates (CA_1 + DB_1, ..., CA_n + DB_n). We have

Cx + Dy = C Σ_{i=1}^n A_i e_i + D Σ_{i=1}^n B_i e_i = Σ_{i=1}^n (CA_i + DB_i) e_i

which is what is required. :)

Theorem 9.17 implies that any vector space V of finite dimension is isomorphic to R^n, with n = dim V. However, we don't yet know whether it is possible that R^m ≅ R^n with m ≠ n.

We don't know either whether every vector space has a basis, finite or infinite. Logically, there could be a possibility that, when we try to construct a basis of a vector space by taking more and more vectors, at some stage the current collection of vectors is not enough to span the whole space, but adding any other vector to it leads to a non-unique expression as a linear combination.

We'll see further that both situations are impossible.


    Week 10

    Linear Dependence

    10.1 Linear dependence

Definition. An n-tuple of vectors x_1, x_2, ..., x_n is called linearly dependent if there are scalars C_1, C_2, ..., C_n, not all equal to 0, such that

C_1 x_1 + C_2 x_2 + ... + C_n x_n = 0

If such scalars do not exist, that is, the only way for the linear combination C_1 x_1 + ... + C_n x_n to be equal to 0 is C_1 = C_2 = ... = C_n = 0, then the vectors x_1, x_2, ..., x_n are linearly independent.

Example. The system of one vector x is linearly dependent if and only if x = 0.

Example. The following vectors are linearly dependent:

(1, 0, 2)ᵀ,  (4, 3, -1)ᵀ,  (14, 9, 1)ᵀ

Indeed,

2 (1, 0, 2)ᵀ + 3 (4, 3, -1)ᵀ - (14, 9, 1)ᵀ = (0, 0, 0)ᵀ

Example. The following three vectors are linearly independent:

(1, 1, 1)ᵀ,  (0, 1, 1)ᵀ,  (0, 0, 1)ᵀ

Indeed, assume that

α (1, 1, 1)ᵀ + β (0, 1, 1)ᵀ + γ (0, 0, 1)ᵀ = (0, 0, 0)ᵀ

so we have α = 0, α + β = 0, α + β + γ = 0. The only solution is α = β = γ = 0.
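Both examples can be checked mechanically: stack the vectors as rows of a matrix and compare its rank with the number of vectors (a sketch with numpy):

```python
import numpy as np

dep = np.array([[1, 0, 2], [4, 3, -1], [14, 9, 1]])   # rows are the vectors
ind = np.array([[1, 1, 1], [0, 1, 1], [0, 0, 1]])

print(np.linalg.matrix_rank(dep))   # 2 < 3: linearly dependent
print(np.linalg.matrix_rank(ind))   # 3: linearly independent
print(2 * dep[0] + 3 * dep[1] - dep[2])   # [0 0 0]
```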

Theorem. Vectors x_1, x_2, ..., x_n are linearly dependent if and only if one of them can be expressed as a linear combination of the rest.

Proof. First, assume that they are linearly dependent. Then

C_1 x_1 + C_2 x_2 + ... + C_n x_n = 0

where C_i ≠ 0 for some i. Hence,

x_i = -(C_1/C_i) x_1 - ... - (C_{i-1}/C_i) x_{i-1} - (C_{i+1}/C_i) x_{i+1} - ... - (C_n/C_i) x_n

is an expression of x_i as a linear combination of the other vectors.

Conversely, assume that

x_i = C_1 x_1 + ... + C_{i-1} x_{i-1} + C_{i+1} x_{i+1} + ... + C_n x_n

Then we have

x_i - C_1 x_1 - ... - C_{i-1} x_{i-1} - C_{i+1} x_{i+1} - ... - C_n x_n = 0

and the vectors are linearly dependent, because at least the coefficient at x_i here is nonzero. :)

Example. Any system of vectors including 0 is linearly dependent. Indeed, 0 is always expressed as a linear combination of the rest of them with zero coefficients.

Example. Two vectors are linearly dependent if and only if they are collinear.

Example. Three vectors are linearly dependent if and only if they are coplanar.

Corollary. A basis of a vector space always consists of linearly independent vectors.

Proof. The proof is by contradiction. Assume that e_1, e_2, ..., e_n is a linearly dependent basis. Then one of these vectors is expressed as a linear combination of the rest of them. Let it be, for example, e_1, so e_1 = Σ_{i=2}^n C_i e_i. On the other hand, e_1 = 1 · e_1, so we get two different expressions for e_1, which contradicts the definition of a basis. :)


    10.2 How to construct a basis

Example. Let's construct a basis of the subspace V in R³ given by the equation x + y + z = 0. Let's start with a random vector, say e_1 = (2, -1, -1). Since it is not zero, the set consisting of only this vector is linearly independent. Now we need to add another vector e_2 such that e_1 and e_2 are linearly independent, that is, non-collinear. For example, let's take e_2 = (2, 0, -2).

Now, given x = (a, b, c) ∈ V, let x = αe_1 + βe_2, so we have the equations

a = 2α + 2β
b = -α
c = -α - 2β

so the only solution is α = -b and β = (a + 2b)/2, and the third equation then holds automatically since a + b + c = 0. We see that e_1, e_2 is, indeed, a basis.

Generally, one could construct a basis starting from a random nonzero vector and adding more and more vectors until it is impossible to add a new one such that the system stays linearly independent. However, there might seem to be a possibility that we have linearly independent vectors e_1, e_2, ..., e_n and a vector x which cannot be expressed as a linear combination of them, while e_1, e_2, ..., e_n, x are nevertheless linearly dependent. The following statement shows that this is impossible.

Lemma 10.11. Assume that

(i) the vectors $e_1, e_2, \ldots, e_n$ are linearly independent;

(ii) the vector $x$ cannot be expressed as a linear combination of them.

Then $e_1, e_2, \ldots, e_n, x$ are linearly independent.

Proof. Suppose that

$$C_1e_1 + C_2e_2 + \cdots + C_ne_n + Cx = 0,$$

and let's prove that all coefficients here equal 0.

If $C \neq 0$, then we can divide by $C$ and express $x$ as a linear combination

$$x = -\frac{C_1}{C}\,e_1 - \frac{C_2}{C}\,e_2 - \cdots - \frac{C_n}{C}\,e_n,$$

which is supposed to be impossible. Therefore $C = 0$. Thus $\sum_{k=1}^{n} C_ke_k = 0$. Since the vectors $e_1, e_2, \ldots, e_n$ are linearly independent, the only way for their linear combination to be 0 is to have all zero coefficients. Hence $C_1 = \cdots = C_n = 0$. :)

Lemma 10.12. Assume that a system of vectors $e_1, e_2, \ldots, e_n \in V$ satisfies:

(i) it is linearly independent;

(ii) any vector from the space $V$ can be expressed as a linear combination of $e_1, e_2, \ldots, e_n$.

Then $e_1, e_2, \ldots, e_n$ is a basis of the space $V$.

Proof. We are already given that each vector $x \in V$ can be expressed as a linear combination of $e_1, e_2, \ldots, e_n$, so we only need to check that this expression is always unique. Suppose that

$$x = \sum_{k=1}^{n} C_ke_k = \sum_{k=1}^{n} D_ke_k,$$

and let's check that $C_k = D_k$ must hold for $k = 1, \ldots, n$. We have $\sum_{k=1}^{n} (C_k - D_k)e_k = 0$, which is only possible if all coefficients are zeroes, so $C_k - D_k = 0$ and $C_k = D_k$. :)
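Together, the two lemmas justify the greedy procedure described above: keep adding vectors while the system stays linearly independent. A short Python sketch of it (my own illustration; the input vectors are assumed given as NumPy arrays):

```python
import numpy as np

def greedy_basis(vectors):
    """Keep a vector exactly when it is not a linear combination of those
    already kept, i.e. when it increases the rank (cf. Lemma 10.11)."""
    basis = []
    for v in vectors:
        candidate = basis + [v]
        if np.linalg.matrix_rank(np.array(candidate)) == len(candidate):
            basis = candidate
    return basis

vecs = [np.array([1, 2, 0, 4]), np.array([2, 0, 1, 1]), np.array([5, -2, 3, -1])]
print(greedy_basis(vecs))  # the third vector is dropped: it equals -v1 + 3*v2
```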

    10.3 Dimension

Lemma 10.11 and Lemma 10.12 show that we can always construct a basis by adding vectors one by one while keeping the system linearly independent. But why do we always end up with the same number of vectors? There is still a logical possibility that a space admits bases $e_1, e_2, \ldots, e_n$ and $f_1, f_2, \ldots, f_m$ with $n \neq m$.

Theorem. If a vector space $V$ has a basis $e_1, e_2, \ldots, e_n$, then any $m > n$ vectors are linearly dependent.

Proof. Consider some vectors $x_1, x_2, \ldots, x_m \in V$ with $m > n$. We then have

$$x_1 = a_{11}e_1 + a_{12}e_2 + \cdots + a_{1n}e_n$$
$$x_2 = a_{21}e_1 + a_{22}e_2 + \cdots + a_{2n}e_n$$
$$\vdots$$
$$x_m = a_{m1}e_1 + a_{m2}e_2 + \cdots + a_{mn}e_n,$$

and we need to construct a linear combination $\sum_{k=1}^{m} C_kx_k = 0$. We get the following system for the coefficients $C_k$:

$$a_{11}C_1 + a_{21}C_2 + \cdots + a_{m1}C_m = 0$$
$$a_{12}C_1 + a_{22}C_2 + \cdots + a_{m2}C_m = 0$$
$$\vdots$$
$$a_{1n}C_1 + a_{2n}C_2 + \cdots + a_{mn}C_m = 0,$$

so it's a homogeneous system of $n$ linear equations in $m > n$ variables. Gaussian elimination leads to a row-echelon form with at most $n$ pivots, hence with at least one free variable, so there are non-zero solutions. :)


Remark. Let's clarify how a row-echelon form leads to non-trivial solutions with a simple example. Assume that we obtained the following system after Gaussian elimination:

$$x + 5z = 0, \qquad y - 2z = 0.$$

Then we can assign any value to $z$, say $z = 1978$, and therefore $x = -5z = -9890$, $y = 2z = 3956$ gives a nontrivial solution.

Corollary. The dimension of a vector space is the maximal number of linearly independent vectors in it.

    Applying Gaussian elimination to check linear dependence

Theorem. Consider vectors $x_1, \ldots, x_n \in V$. They are linearly dependent if and only if

(i) $x_1, \ldots, x_{k-1}, x_k + \lambda x_l, x_{k+1}, \ldots, x_n \in V$ are linearly dependent for $\lambda \in \mathbb{R}$ and $l \neq k$;

(ii) $x_1, \ldots, x_l, \ldots, x_k, \ldots, x_n \in V$ (the same system with $x_k$ and $x_l$ interchanged) are linearly dependent for $k < l$;

(iii) $\lambda_1x_1, \ldots, \lambda_nx_n \in V$ are linearly dependent for $\lambda_1, \ldots, \lambda_n \neq 0$.

In other words, this theorem says that the three steps of Gaussian elimination do not change linear dependence or independence.

Proof. Linear dependence or independence is about the following equation:

$$C_1x_1 + C_2x_2 + \cdots + C_nx_n = 0,$$

where $C_1, \ldots, C_n$ are some scalars. The question is whether there is a non-zero solution of this equation. We need to check that a non-zero solution of the original equation exists if and only if it does for the modified system, in all three cases.

(i) Since the first step concerns only $x_k$ and $x_l$, let's track only these two vectors. We have

$$\cdots + \underbrace{(C_l - \lambda C_k)}_{D_l}\,x_l + \cdots + \underbrace{C_k}_{D_k}\,(x_k + \lambda x_l) + \cdots = 0.$$

Obviously, $C_k = C_l = 0$ if and only if $D_k = D_l = 0$, so we're done.

(ii) This step is similar to (i), but even easier, so we skip it.


(iii) We have

$$C_1x_1 + \cdots + C_nx_n = 0 \iff \frac{C_1}{\lambda_1}(\lambda_1x_1) + \cdots + \frac{C_n}{\lambda_n}(\lambda_nx_n) = 0,$$

so it is also true. :)

Thus, to check if some vectors are linearly dependent, we arrange their coordinates into a matrix and apply Gaussian elimination to it. What we end up with is a row-echelon form. If it contains the zero vector, then they are linearly dependent according to Example 10.6. If not, they are obviously linearly independent.

Example. Let's find out if the vectors $(1, 2, 0, 4)$, $(2, 0, 1, -1)$, $(3, -2, 5, -1)$ are linearly dependent. We arrange them into a matrix and apply elementary row operations to it:

$$\begin{pmatrix} 1 & 2 & 0 & 4 \\ 2 & 0 & 1 & -1 \\ 3 & -2 & 5 & -1 \end{pmatrix} \to \begin{pmatrix} 1 & 2 & 0 & 4 \\ 0 & -4 & 1 & -9 \\ 0 & -8 & 5 & -13 \end{pmatrix} \to \begin{pmatrix} 1 & 2 & 0 & 4 \\ 0 & -4 & 1 & -9 \\ 0 & 0 & 3 & 5 \end{pmatrix},$$

so they are definitely linearly independent.

Since dimension is the maximal number of linearly independent vectors, we can apply Gaussian elimination to find the dimension.

Example. Let's find the dimension of the space spanned by the vectors $(1, 2, 0, 4)$, $(2, 0, 1, 1)$, $(5, -2, 3, -1)$. Again, we arrange them into a matrix and apply elementary row operations to it:

$$\begin{pmatrix} 1 & 2 & 0 & 4 \\ 2 & 0 & 1 & 1 \\ 5 & -2 & 3 & -1 \end{pmatrix} \to \begin{pmatrix} 1 & 2 & 0 & 4 \\ 0 & -4 & 1 & -7 \\ 0 & -12 & 3 & -21 \end{pmatrix} \to \begin{pmatrix} 1 & 2 & 0 & 4 \\ 0 & -4 & 1 & -7 \\ 0 & 0 & 0 & 0 \end{pmatrix},$$

so the maximal number of linearly independent vectors here is 2.
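Both row reductions can be reproduced in SymPy; this is a sketch of mine, not part of the notes, with the vectors from the two examples above as rows.

```python
from sympy import Matrix

A = Matrix([[1, 2, 0, 4], [2, 0, 1, -1], [3, -2, 5, -1]])
B = Matrix([[1, 2, 0, 4], [2, 0, 1, 1], [5, -2, 3, -1]])

# rref() returns the reduced row-echelon form and the pivot columns;
# the number of pivots equals the number of independent rows.
print(A.rref()[1])  # (0, 1, 2): three pivots, so linearly independent
print(B.rref()[1])  # (0, 1): two pivots, so the span has dimension 2
```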


Rank

11.1 Rank of a matrix

Consider a $3 \times 2$ matrix with two linearly independent columns. Its columns span a space of dimension 2 (a plane in $\mathbb{R}^3$).

On the other hand, if, say, the first row and the third row are linearly independent, they span the whole space $\mathbb{R}^2$. Even three rows cannot generate anything more than the entire $\mathbb{R}^2$, so the dimension of the row space is also 2.

Recall that elementary row-operations keep the dimension of the row space. In the same manner, elementary column-operations keep the dimension of the column space. Also, both elementary row- and column-operations have a clear effect on the determinant: operation 1 doesn't change the determinant, operation 2 changes its sign, operation 3 multiplies it by a non-zero number.

Recall that $M_{ij}$ denotes the determinant of the matrix obtained by removing the $i$-th row and the $j$-th column from $A$. More generally, we have the following definition.

Definition. A minor of order $k$ of a matrix $A$ is the determinant of any $k \times k$ sub-matrix obtained by removing some rows and some columns.

Example. The matrix

$$A = \begin{pmatrix} 1 & 2 & 1 \\ 2 & 4 & 2 \end{pmatrix}$$

has 6 minors of order 1, namely its $1 \times 1$ sub-matrices, that is, just its elements. It also has three minors of order 2, and each of them is 0:

$$\begin{vmatrix} 2 & 1 \\ 4 & 2 \end{vmatrix} = \begin{vmatrix} 1 & 1 \\ 2 & 2 \end{vmatrix} = \begin{vmatrix} 1 & 2 \\ 2 & 4 \end{vmatrix} = 0.$$

Theorem 11.6. Given an $m \times n$ matrix $A$, its row space has the same dimension as its column space. Both of them equal the maximal order of a non-zero minor.

Proof. Let $r$ be the dimension of the row space. It means that there are $r$ linearly independent rows $A_{i_1}, \ldots, A_{i_r}$ of the matrix $A$, but any $r + 1$ or more rows are linearly dependent.

Let $A'$ be the matrix whose rows are $A_{i_1}, \ldots, A_{i_r}$. Applying Gaussian elimination, we transform it to a reduced row-echelon form

$$\begin{pmatrix}
0 & \cdots & 1 & * & 0 & * & \cdots & 0 & * \\
0 & \cdots & 0 & 0 & 1 & * & \cdots & 0 & * \\
\vdots & & & & & & \ddots & & \vdots \\
0 & \cdots & 0 & 0 & 0 & 0 & \cdots & 1 & *
\end{pmatrix}$$

Any elementary row-operation of this process restricts to an elementary row-operation of any sub-matrix here. Therefore all zero minors of order $r$ remain zero and all non-zero minors of order $r$ remain non-zero under elementary row-operations. Clearly, there is a unit $r \times r$ sub-matrix here, formed by those columns with one unit and zeroes in the rest of the positions, so there is a non-zero minor of order $r$. On the other hand, the determinant of any bigger square sub-matrix is 0 because a bigger sub-matrix has at least $r + 1$ rows, so its rows are linearly dependent.

Since there is a non-zero minor of order $r$ and no non-zero minor of order $> r$, it means that $r$ is the maximal order of a non-zero minor. In the same manner, we can show that the dimension of the column space is the maximal order of a non-zero minor. :)

Definition. The rank of a matrix $A$ is

(i) the dimension of the row space;

(ii) the dimension of the column space;

(iii) the maximal number of linearly independent rows;

(iv) the maximal number of linearly independent columns;

(v) the maximal order of a non-zero minor.

Exercise. What is the maximal possible rank of a $31 \times 65$ matrix? Of an $m \times n$ matrix?

Due to Theorem 11.6, the numbers (i)-(v) are equal. How do we actually find the rank? It's easy: we have the good old Gaussian elimination.

Theorem. Elementary row- and column-operations do not change the rank of a matrix.

Proof. Elementary row-operations do not change the dimension of the row space, which is the rank. Elementary column-operations do not change the dimension of the column space, which is also the rank. :)

Recall that the final result of Gaussian elimination is the reduced row-echelon form

$$\begin{pmatrix}
0 & \cdots & 1 & * & 0 & * & \cdots & 0 & * \\
0 & \cdots & 0 & 0 & 1 & * & \cdots & 0 & * \\
\vdots & & & & & & \ddots & & \vdots \\
0 & \cdots & 0 & 0 & 0 & 0 & \cdots & 1 & * \\
0 & \cdots & 0 & 0 & 0 & 0 & \cdots & 0 & 0 \\
\vdots & & & & & & & & \vdots \\
0 & \cdots & 0 & 0 & 0 & 0 & \cdots & 0 & 0
\end{pmatrix}$$

Clearly, the non-zero rows of such a row-echelon matrix are linearly independent. Therefore the rank of a matrix is the number of non-zero rows in its reduced row-echelon form.


Example. Let's find

$$\operatorname{rank}\begin{pmatrix} 1 & 2 & 1 \\ 2 & 4 & 2 \\ 2 & 1 & 3 \\ 3 & 3 & 4 \end{pmatrix}$$

By elementary row-operations, we get

$$\begin{pmatrix} 1 & 2 & 1 \\ 2 & 4 & 2 \\ 2 & 1 & 3 \\ 3 & 3 & 4 \end{pmatrix} \to \begin{pmatrix} 1 & 2 & 1 \\ 0 & 0 & 0 \\ 0 & -3 & 1 \\ 0 & -3 & 1 \end{pmatrix} \to \begin{pmatrix} 1 & 2 & 1 \\ 0 & -3 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$$

We could go on by getting 1 in position $(2, 2)$ and 0 in position $(1, 2)$, but it is already clear that there will be 2 non-zero rows in the reduced row-echelon form, so the rank of this matrix is 2.
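As a quick numerical cross-check (my addition, not in the notes):

```python
import numpy as np

A = np.array([[1, 2, 1],
              [2, 4, 2],
              [2, 1, 3],
              [3, 3, 4]])

# matrix_rank computes the rank via singular values.
print(np.linalg.matrix_rank(A))  # 2
```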

Exercise. Let $A$ be an $n \times n$ matrix. Knowing that $\det A = 0$, what can you say about $\operatorname{rank} A$?

    11.2 Applications

    The first application concerns solvability of systems of linear equations.

Theorem (Kronecker-Capelli). (i) A system of linear equations $Ax = b$ has a solution if and only if $\operatorname{rank} A = \operatorname{rank}(A|b)$, where $(A|b)$ is the matrix $A$ extended with the column $b$.

(ii) A system of linear equations $Ax = b$ has a unique solution if and only if $\operatorname{rank} A = \operatorname{rank}(A|b)$ equals the number of variables.

Proof. Applying Gaussian elimination, we can transform $A$ to a reduced row-echelon form, say $A'$. At the same time, the extended matrix $(A|b)$ is transformed to $(A'|b')$.

(i) Now it's clear that the system $A'x = b'$ fails to be solvable exactly when it contains a row of the form $0 = \beta$ with $\beta \neq 0$. This means that some row contains only zeroes in $A'$ and a non-zero element in $b'$, so $A'$ has more zero rows than the extended matrix, that is, $\operatorname{rank} A < \operatorname{rank}(A|b)$.

(ii) The system $A'x = b'$ is uniquely solvable if and only if its equations are $x_1 = \beta_1, \ldots, x_n = \beta_n$, with possibly some more zero equations, which do not affect the rank. :)
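A minimal sketch of the Kronecker-Capelli test in NumPy (my illustration; the example matrices are made up):

```python
import numpy as np

def classify(A, b):
    """Classify the system Ax = b by comparing ranks (Kronecker-Capelli)."""
    r = np.linalg.matrix_rank(A)
    r_ext = np.linalg.matrix_rank(np.column_stack([A, b]))
    if r < r_ext:
        return "no solutions"
    return "unique solution" if r == A.shape[1] else "infinitely many solutions"

A = np.array([[1, 2], [2, 4]])
print(classify(A, np.array([1, 3])))  # ranks 1 vs 2: no solutions
print(classify(A, np.array([1, 2])))  # ranks agree, but rank 1 < 2 variables
```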

Another application of the rank is within the realm of calculus. Notice that a $1 \times n$ matrix $(a_1\ a_2\ \cdots\ a_n)$ has rank 1 if and only if it is non-zero, that is, at least one of its elements is not zero. Obviously, a $1 \times n$ matrix cannot have rank 2 or more because it doesn't even have minors of order 2.

Further, a $3 \times 2$ matrix

$$\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \end{pmatrix}$$

has rank 2 if and only if its columns are linearly independent. To say that they are linearly independent means that

$$\begin{pmatrix} a_{11} \\ a_{21} \\ a_{31} \end{pmatrix} \times \begin{pmatrix} a_{12} \\ a_{22} \\ a_{32} \end{pmatrix} \neq \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}$$

Of course, the rank of a $3 \times 2$ matrix cannot exceed 2 because it doesn't even have minors of order 3 or more.

How is this related to calculus? Recall that a function $f\colon \mathbb{R}^2 \to \mathbb{R}$ defines a smooth curve $\{f = 0\}$ if $\operatorname{grad} f \neq 0$, and a map $r\colon \mathbb{R}^2 \to \mathbb{R}^3$ defines a smooth surface if $r_u \times r_v \neq 0$. More generally, one says that the image of a 1-to-1 map $r\colon \mathbb{R}^m \to \mathbb{R}^n$ with $m < n$ is a smooth manifold if its Jacobi matrix has rank $m$ at every point.

How about critical points of a function? Clearly, a critical point of a function is a point where the rank of the Jacobi matrix is 0. More generally, critical points of a map $f\colon \mathbb{R}^m \to \mathbb{R}^n$ are defined to be those where the rank of the Jacobi matrix is not maximal, that is, less than both $m$ and $n$. For instance, if $n = 1$, we get the critical points of a function; for $m = 1$ we get the points where the corresponding curve is not smooth, etc. Critical points of maps are studied by Singularity Theory, a branch of mathematics that arose from Topology in the 1950s and now has applications to economics and finance.
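For a concrete illustration (my own, with a made-up map), SymPy can compute the rank of a Jacobi matrix symbolically; take $r(u, v) = (u, v, u^2 + v^2)$, whose image is a paraboloid.

```python
from sympy import Matrix, symbols

u, v = symbols('u v')
r = Matrix([u, v, u**2 + v**2])   # a map R^2 -> R^3
J = r.jacobian(Matrix([u, v]))    # the 3 x 2 Jacobi matrix
print(J)                          # Matrix([[1, 0], [0, 1], [2*u, 2*v]])
print(J.rank())                   # 2: maximal rank, so the surface is smooth
```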


Week 12

Rank, Determinant, and Matrix Multiplication

    12.1 Determinant of a product

Example. Let's check that the determinant of a product of matrices equals the product of the determinants. For example,

$$\det\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} = -2, \qquad \det\begin{pmatrix} 7 & -5 \\ -5 & 3 \end{pmatrix} = -4,$$

and

$$\det\left(\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}\begin{pmatrix} 7 & -5 \\ -5 & 3 \end{pmatrix}\right) = \det\begin{pmatrix} -3 & 1 \\ 1 & -3 \end{pmatrix} = 8 = (-2) \cdot (-4).$$
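A quick numerical sanity check of the same identity on larger matrices (my own addition, using random integer matrices):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(-5, 6, size=(4, 4)).astype(float)
B = rng.integers(-5, 6, size=(4, 4)).astype(float)

# The two numbers agree up to floating-point error.
print(np.linalg.det(A @ B))
print(np.linalg.det(A) * np.linalg.det(B))
```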

Of course, it's easy to check by direct calculation that $\det(AB) = \det A \cdot \det B$ for $2 \times 2$ matrices $A$ and $B$. How about $n \times n$ matrices? In order to prove the general theorem, we need to recall some facts about determinants.

The determinant was originally defined as a certain polynomial expression in the matrix entries. Since matrix rows are $n$-vectors, the determinant can be thought of as a function of $n$ vector arguments $A_1, \ldots, A_n \in \mathbb{R}^n$. It was readily seen that the determinant is an $n$-linear skew-symmetric function and that $\det I = 1$. Thanks to this observation, one can use Gaussian elimination to calculate $\det A$. Another method of finding $\det A$ is row or column expansion, when one expresses $\det A$ in terms of minors (determinants of $(n-1) \times (n-1)$ sub-matrices) along a certain row or a certain column.

Further, if $f(A)$ is an $n$-linear skew-symmetric function of the vectors $A_1, \ldots, A_n \in \mathbb{R}^n$ considered as rows of a matrix $A$, then $f(A) = \det A \cdot f(I)$. In other words, the determinant is the only skew-symmetric $n$-linear function in the sense that any such function is just a multiple of the determinant.

Theorem 12.2. For any $n \times n$ matrices $A$ and $B$, we have $\det(AB) = \det A \cdot \det B$.


Proof. Let $f(A) = \det(AB)$. In other words, we consider $B$ as a fixed matrix and $A$ as an argument of the function $f$. By the definition of matrix multiplication, the rows of $AB$ are $A_1B, \ldots, A_nB$, where $A_1, \ldots, A_n$ are the rows of $A$. It's easy to see that $f(A)$ is an $n$-linear skew-symmetric function of $A_1, \ldots, A_n$, and hence $f(A) = \det A \cdot f(I)$, where $f(I) = \det(IB) = \det B$. Thus $\det(AB) = \det A \cdot \det B$. :)

Exercise. Use Theorem 12.2 to prove that if $\det A = 0$, then $A$ is not invertible, that is, there is no matrix $B$ such that $AB = I$.

Recall that we already know that if $\det A \neq 0$, then $A$ is invertible, and that $A^{-1} = \frac{1}{\det A}\operatorname{adj} A$, where $\operatorname{adj} A$ is the adjugate matrix, constructed in a clever way from the minors of the matrix $A$.

    12.2 Rank of a product

Recall that given an $m \times n$ matrix $A$, its rank is the dimension of the column space, that is, the maximal number of linearly independent columns. At the same time, $\operatorname{rank} A$ is the dimension of the row space and is the maximal order of a non-zero minor of the matrix $A$. The rank can also be found by Gaussian elimination, as it equals the number of non-zero rows in a row-echelon form of the matrix $A$. Finally, the words invertible, nonsingular, and non-degenerate applied to an $n \times n$ matrix $A$ mean that $\det A \neq 0$, which is the same as $\operatorname{rank} A = n$.

Given matrices $A$ and $B$, how can we find the rank of the product $AB$? Let's first consider $n \times n$ matrices $A$ and $B$. If $\det A \neq 0$ and $\det B \neq 0$, then $\operatorname{rank} A = \operatorname{rank} B = n$. Further, $\det(AB) = \det A \cdot \det B \neq 0$, so $\operatorname{rank}(AB) = n$. On the other hand, if $\det A = 0$, then also $\det(AB) = 0$, so we see that $\operatorname{rank} A < n$ and $\operatorname{rank}(AB) < n$, no matter what $\operatorname{rank} B$ is. Hence we can conjecture that $\operatorname{rank}(AB) \leq \min(\operatorname{rank} A, \operatorname{rank} B)$.

Theorem 12.4. (i) If $\det A \neq 0$ for an $n \times n$ matrix $A$, then $\operatorname{rank}(AB) = \operatorname{rank} B$ for any $n \times m$ matrix $B$.

(ii) Similarly, if $\det A \neq 0$ for an $n \times n$ matrix $A$, then $\operatorname{rank}(BA) = \operatorname{rank} B$ for any $m \times n$ matrix $B$.

Proof. Since the rank is defined as the maximal number of linearly independent rows or columns, it's enough to prove that vectors $B_1, \ldots, B_m \in \mathbb{R}^n$ (the columns of $B$) are linearly independent if and only if $AB_1, \ldots, AB_m$ are, where $\det A \neq 0$ (and the same for right multiplication by $A$). But this is obvious since

$$\sum_{k=1}^{m} C_kAB_k = 0 \iff A^{-1}\sum_{k=1}^{m} C_kAB_k = 0 \iff \sum_{k=1}^{m} C_kB_k = 0,$$

because $A$ is an invertible matrix. :)

Theorem. We have $\operatorname{rank}(AB) \leq \min(\operatorname{rank} A, \operatorname{rank} B)$ for any $m \times n$ matrix $A$ and $n \times p$ matrix $B$.


Proof. By elementary row operations, we can transform $A$ into a row-echelon form, that is, express $A$ as a product $A = T_1R$, where $T_1$ is the matrix of the corresponding Gaussian elimination and $R$ is the row-echelon form. Further, $\det T_1 \neq 0$ because $T_1$ is a product of elementary matrices and each of them is non-degenerate.

In the same manner, we can transform $B$ into a column-echelon form, that is, express $B$ as a product $B = CT_2$, where $T_2$ is the matrix of the corresponding Gaussian elimination and $C$ is the column-echelon form. Further, $\det T_2 \neq 0$ because $T_2$ is a product of matrices of elementary column-operations and each of them is non-degenerate.

By Theorem 12.4, we see that $\operatorname{rank}(AB) = \operatorname{rank}(RC)$. Obviously, the matrix $RC$ cannot have more nonzero rows than $R$ does, which is $\operatorname{rank} A$, and cannot have more nonzero columns than $C$ does, which is $\operatorname{rank} B$. Thus $\operatorname{rank}(AB) \leq \operatorname{rank} A$ and $\operatorname{rank}(AB) \leq \operatorname{rank} B$. :)

Example. The sign $\leq$ cannot be replaced with $=$ here. Indeed, consider the following product:

$$\begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$$

The product of two matrices of rank 1 is a matrix of rank 0.
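The same check in NumPy (my addition):

```python
import numpy as np

A = np.array([[0, 0], [1, 0]])
B = np.array([[0, 0], [0, 1]])

print(np.linalg.matrix_rank(A), np.linalg.matrix_rank(B))  # 1 1
print(np.linalg.matrix_rank(A @ B))                        # 0 < min(1, 1)
```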
