1 Systems of Linear Equations - Jay Daigle › ... › 214 › chapter_1_linear_equations.pdf · 2020-06-13 · Jay Daigle Occidental College Math 214: Linear Algebra 1 Systems of

Jay Daigle Occidental College Math 214: Linear Algebra

1 Systems of Linear Equations

We’re going to start this course with a very concrete, very algebraic problem: solving equations. As the course progresses, we will see how this problem relates to geometric and formal ideas. We will bring in ideas from geometric and formal perspectives to help us approach this problem, and see how we can use our equation-solving techniques to answer questions that arise in geometric and formal settings.

1.1 Basics of Linear Equations

A linear equation is an equation of the form

\[ a_1 x_1 + \cdots + a_n x_n = b \tag{1} \]

where a_1, . . . , a_n, and b are all real numbers, and x_1, . . . , x_n are unknowns or variables. (We might write a_1, . . . , a_n, b ∈ R; the symbol R stands for the real numbers, and the symbol ∈ means “is an element of” or just “in”.) We say that this equation has n unknowns.

A system of linear equations is a system of the form

\[
\begin{aligned}
a_{11}x_1 + \cdots + a_{1n}x_n &= b_1 \\
a_{21}x_1 + \cdots + a_{2n}x_n &= b_2 \\
&\;\;\vdots \\
a_{m1}x_1 + \cdots + a_{mn}x_n &= b_m
\end{aligned}
\]

with the a_{ij} and b_i all real numbers. We say this is a system of m equations in n unknowns.

Importantly, these equations are restricted to be relatively simple. In each equation

we multiply each variable by some constant real number, add them together, and set that

equal to some constant real number. We aren’t allowed to multiply variables together, or

do anything else fancy with them. This means the equations can’t get too complicated, and

are relatively easy to work with.

Example 1.1. A system of two linear equations in two variables is

2x+ y = 3

x+ 5y = −3.

A system of two equations in three variables is

5x+ 2y + z = 7

3x+ 2y + z = 6.

http://jaydaigle.net/teaching/courses/2020-spring-214/


A system of three equations in one variable is

3x = 3

5x = 5

x = 2.

We want to find solutions to this system of equations. Since there are n variables, a solution must be a list of n real numbers. We write R^n = {(x_1, . . . , x_n) : x_i ∈ R} for the set of ordered lists of n real numbers. (We sometimes call these “ordered n-tuples” or “vectors”.) Thus R^1 = R is just the set of real numbers; R^2 is the set of ordered pairs that makes up the Cartesian plane.

An element (x1, . . . , xn) ∈ Rn is a solution to a system of linear equations if all of the

equalities hold for that collection of xi. The solution set of a system of linear equations is

the set of all solutions, and we say two systems are equivalent if they have the same set of

solutions.

Example 1.2. The system

2x+ y = 3

x+ 5y = −3

has (2,−1) ∈ R^2 as a solution. We will see later that this is the only solution, and thus the set of solutions is {(2,−1)}.

The system

4x+ 2y + 2z = 8

3x+ 2y + z = 6

has (1, 1, 1) as a solution. This is not the only solution; in fact, the set of solutions is

{(x, 2 − x, 2 − x) : x ∈ R}. (This means that for each real number x, the ordered triple

(x, 2 − x, 2 − x) is a solution to our system). We say this is a subset of R3, since it is a

collection of elements of R3, and write {(x, 2− x, 2− x) : x ∈ R} ⊂ R3.

The system

3x = 3

5x = 5

x = 2




clearly has no solutions, since the first equation implies that x = 1 but the third equation

implies that x = 2. Thus the set of solutions is the empty set {} = ∅.
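The definition of a solution can be checked mechanically. The following is a minimal sketch, not part of the original notes: the function name `is_solution` and the list-of-rows encoding of the coefficients are assumptions made for illustration. It tests whether a given point satisfies every equation of a system, using the three systems of Example 1.2.

```python
from fractions import Fraction

def is_solution(coeffs, consts, point):
    """Return True if `point` satisfies every equation a_i1 x_1 + ... + a_in x_n = b_i.

    `coeffs` is a list of coefficient rows, `consts` the list of constants b_i.
    (Hypothetical helper; exact arithmetic via Fraction avoids rounding issues.)
    """
    return all(
        sum(Fraction(a) * Fraction(x) for a, x in zip(row, point)) == Fraction(b)
        for row, b in zip(coeffs, consts)
    )

# First system of Example 1.2: 2x + y = 3, x + 5y = -3
print(is_solution([[2, 1], [1, 5]], [3, -3], (2, -1)))        # True

# Second system: every triple (x, 2 - x, 2 - x) should be a solution
print(all(is_solution([[4, 2, 2], [3, 2, 1]], [8, 6], (x, 2 - x, 2 - x))
          for x in range(-3, 4)))                              # True

# Third system: 3x = 3, 5x = 5, x = 2; x = 1 fails the last equation
print(is_solution([[3], [5], [1]], [3, 5, 2], (1,)))           # False
```

A check like this can confirm that a candidate is a solution, but it cannot by itself show that we have found all solutions; that is what the elimination techniques below are for.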

Recall that two systems of equations are equivalent if they have the same set of solutions. Thus the process of solving a system of equations is mostly the process of converting a system into an equivalent system that is simpler.

There are three basic operations we can perform on a system of equations to get an

equivalent system:

1. We can write the equations in a different order.

2. We can multiply any equation by a nonzero scalar.

3. We can add a multiple of one equation to another.

All three of these operations are guaranteed not to change the solution set; proving this is a

reasonable exercise. Our goal now is to find an efficient way to use these rules to get a useful

solution to our system.


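Example 1.3. The system

2x + y = 3

x + 5y = −3

is equivalent to (multiplying the second equation by −2)

2x + y = 3

−2x − 10y = 6

and then (adding the second equation to the first)

0x − 9y = 9

−2x − 10y = 6

then (dividing the first equation by −9)

0x + y = −1

−2x − 10y = 6

then (adding 10 times the first equation to the second)

0x + y = −1

−2x + 0y = −4

and finally (dividing the second equation by −2)

0x + y = −1

x + 0y = 2

which gives us our solution of x = 2, y = −1, or (x, y) = (2,−1).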

This takes up a really awkward amount of space on the page, though, and we’d like to

find a better and more systematic way of approaching this process.

Remark 1.4. There’s another possible approach to solving these systems, called the method

of substitution. We could observe that if 2x + y = 3 then y = 3 − 2x, and substitute that

into our other equation to give

x+ 5(3− 2x) = −3

15− 9x = −3

9x = 18

x = 2

and from here we can see that y = 3− 2(2) = −1.

This is often much simpler to do in your head for small systems. But it scales up really

poorly to systems with more than two or three equations and variables, so we’ll want to

learn something more effective.

1.2 The matrix of a system

Looking at a system of linear equations, we notice that it can be described by an array of

real numbers. These numbers are naturally laid out in a rectangular grid, so we want to find

an efficient way to represent them.

Definition 1.5. A (real) matrix is a rectangular array of (real) numbers. A matrix with m rows and n columns is an m × n matrix, and we denote the set of all such matrices by M_{m×n}.

An m × n matrix is square if m = n, that is, it has the same number of rows as columns. We will sometimes represent the set of n × n square matrices by M_n.

We will generally describe the elements of a matrix with the notation

\[
(a_{ij}) = \begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix}.
\]




We can now take the information from a system of linear equations and encode it in a

matrix. Right now, we will just use this as a convenient notational shortcut; we will see later

on in the course that this has a number of theoretical and practical advantages.

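Definition 1.6. The coefficient matrix of a system of linear equations given by

\[
\begin{aligned}
a_{11}x_1 + \cdots + a_{1n}x_n &= b_1 \\
a_{21}x_1 + \cdots + a_{2n}x_n &= b_2 \\
&\;\;\vdots \\
a_{m1}x_1 + \cdots + a_{mn}x_n &= b_m
\end{aligned}
\]

is the matrix

\[
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix}
\]

and the augmented coefficient matrix is

\[
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} & b_1 \\
a_{21} & a_{22} & \cdots & a_{2n} & b_2 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn} & b_m
\end{bmatrix}.
\]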

Example 1.7. Suppose we have a system

4x + 2y + 2z = 8

3x + 2y + z = 6.

Then the coefficient matrix is

\[ \begin{bmatrix} 4 & 2 & 2 \\ 3 & 2 & 1 \end{bmatrix} \]

and the augmented coefficient matrix is

\[ \begin{bmatrix} 4 & 2 & 2 & 8 \\ 3 & 2 & 1 & 6 \end{bmatrix}. \]




Earlier we listed three operations we can perform on a system of equations without

changing the solution set: we can reorder the equations, multiply an equation by a nonzero

scalar, or add a multiple of one equation to another. We can do analogous things to the

coefficient matrix.

Definition 1.8. The three elementary row operations on a matrix are

I Interchange two rows.

II Multiply a row by a nonzero real number.

III Replace a row by its sum with a multiple of another row.
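The three elementary row operations can be sketched in code. This is an illustrative sketch only: the function names `swap_rows`, `scale_row`, and `add_multiple`, the 0-indexed rows, and the list-of-rows matrix encoding are all assumptions, not notation from these notes. The usage lines apply one operation of each type to the coefficient matrix from Example 1.7.

```python
from fractions import Fraction

def swap_rows(m, i, j):
    """Type I: interchange rows i and j (0-indexed); returns a new matrix."""
    m = [row[:] for row in m]
    m[i], m[j] = m[j], m[i]
    return m

def scale_row(m, i, c):
    """Type II: multiply row i by a nonzero number c."""
    if c == 0:
        raise ValueError("scaling by zero is not an elementary row operation")
    m = [row[:] for row in m]
    m[i] = [Fraction(c) * entry for entry in m[i]]
    return m

def add_multiple(m, target, source, c):
    """Type III: replace row `target` by itself plus c times row `source`."""
    m = [row[:] for row in m]
    m[target] = [t + Fraction(c) * s for t, s in zip(m[target], m[source])]
    return m

start = [[4, 2, 2], [3, 2, 1]]
step1 = swap_rows(start, 0, 1)                # equals [[3, 2, 1], [4, 2, 2]]
step2 = scale_row(step1, 1, Fraction(1, 2))   # equals [[3, 2, 1], [2, 1, 1]]
step3 = add_multiple(step2, 0, 1, -1)         # equals [[1, 1, 0], [2, 1, 1]]
print(step3 == [[1, 1, 0], [2, 1, 1]])        # True
```

Each function returns a fresh matrix rather than modifying its argument, so the intermediate steps stay available for inspection. Exact fractions are used because type II operations routinely introduce non-integer entries.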

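Example 1.9. What can we do with our previous matrix? We can

\[
\begin{bmatrix} 4 & 2 & 2 \\ 3 & 2 & 1 \end{bmatrix}
\xrightarrow{\text{I}}
\begin{bmatrix} 3 & 2 & 1 \\ 4 & 2 & 2 \end{bmatrix}
\xrightarrow{\text{II}}
\begin{bmatrix} 3 & 2 & 1 \\ 2 & 1 & 1 \end{bmatrix}
\xrightarrow{\text{III}}
\begin{bmatrix} 1 & 1 & 0 \\ 2 & 1 & 1 \end{bmatrix}.
\]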

So how do we use this to solve a system of equations? The basic idea is to remove variables

from successive equations until we get one equation that contains only one variable—at which

point we can substitute for that variable, and then the others. To do that with this matrix,

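we have

\[
\begin{bmatrix} 4 & 2 & 2 & 8 \\ 3 & 2 & 1 & 6 \end{bmatrix}
\xrightarrow{\text{III}}
\begin{bmatrix} 1 & 0 & 1 & 2 \\ 3 & 2 & 1 & 6 \end{bmatrix}
\xrightarrow{\text{III}}
\begin{bmatrix} 1 & 0 & 1 & 2 \\ 0 & 2 & -2 & 0 \end{bmatrix}
\xrightarrow{\text{II}}
\begin{bmatrix} 1 & 0 & 1 & 2 \\ 0 & 1 & -1 & 0 \end{bmatrix}.
\]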

What does this tell us? That our system of equations is equivalent to the system

x+ z = 2

y − z = 0.

This gives us the answer I stated earlier: z = 2− x and y = z = 2− x.

Example 1.10. Solve the system of equations

x+ 2y + z = 3

3x− y − 3z = −1

2x+ 3y + z = 4.




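This system has augmented coefficient matrix

\[
\begin{bmatrix} 1 & 2 & 1 & 3 \\ 3 & -1 & -3 & -1 \\ 2 & 3 & 1 & 4 \end{bmatrix}
\xrightarrow{\text{III}}
\begin{bmatrix} 1 & 2 & 1 & 3 \\ 0 & -7 & -6 & -10 \\ 2 & 3 & 1 & 4 \end{bmatrix}
\xrightarrow{\text{III}}
\begin{bmatrix} 1 & 2 & 1 & 3 \\ 0 & -7 & -6 & -10 \\ 0 & -1 & -1 & -2 \end{bmatrix}
\]

\[
\xrightarrow{\text{II}}
\begin{bmatrix} 1 & 2 & 1 & 3 \\ 0 & -7 & -6 & -10 \\ 0 & 1 & 1 & 2 \end{bmatrix}
\xrightarrow{\text{I}}
\begin{bmatrix} 1 & 2 & 1 & 3 \\ 0 & 1 & 1 & 2 \\ 0 & -7 & -6 & -10 \end{bmatrix}
\xrightarrow{\text{III}}
\begin{bmatrix} 1 & 2 & 1 & 3 \\ 0 & 1 & 1 & 2 \\ 0 & 0 & 1 & 4 \end{bmatrix}
\]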

which gives us the system

x+ 2y + z = 3

y + z = 2

z = 4.

The last equation tells us z = 4, which then gives y = −2 and x = 3. We can check that

this solves the system.
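The last step of Example 1.10, reading the values off from the bottom equation upward, is usually called back-substitution. Here is a minimal sketch of it, assuming a square system whose reduced matrix has a nonzero entry in every diagonal position (as in this example); the function name `back_substitute` is hypothetical.

```python
from fractions import Fraction

def back_substitute(aug):
    """Solve a square upper-triangular system given as an augmented matrix.

    Assumes every diagonal entry is nonzero, so the solution is unique.
    """
    n = len(aug)
    x = [Fraction(0)] * n
    for i in range(n - 1, -1, -1):           # start from the last equation
        # contribution of the variables that are already known
        known = sum(Fraction(aug[i][j]) * x[j] for j in range(i + 1, n))
        x[i] = (Fraction(aug[i][n]) - known) / Fraction(aug[i][i])
    return x

# Final matrix of Example 1.10
echelon = [[1, 2, 1, 3],
           [0, 1, 1, 2],
           [0, 0, 1, 4]]
print(back_substitute(echelon) == [3, -2, 4])   # True
```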

1.3 Row Echelon Form

We want to solve systems of linear equations, using these matrix operations. We want to be

somewhat more concrete about our goals: what exactly would it look like for a system to be

solved?

Definition 1.11. A matrix is in row echelon form if

• Every row containing nonzero elements is above every row containing only zeroes; and

• The first (leftmost) nonzero entry of each row is to the right of the first nonzero entry of the row above.
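The two conditions of Definition 1.11 translate directly into a check that scans the rows from top to bottom. The sketch below is illustrative; the names `leading_index` and `is_row_echelon` are assumptions, and matrices are encoded as lists of rows.

```python
def leading_index(row):
    """Column of the first nonzero entry, or None for an all-zero row."""
    for j, entry in enumerate(row):
        if entry != 0:
            return j
    return None

def is_row_echelon(m):
    """Check the two conditions of Definition 1.11."""
    previous = -1            # leading column of the nearest nonzero row above
    seen_zero_row = False
    for row in m:
        lead = leading_index(row)
        if lead is None:
            seen_zero_row = True
            continue
        if seen_zero_row:        # a nonzero row sits below an all-zero row
            return False
        if lead <= previous:     # leading entry is not strictly to the right
            return False
        previous = lead
    return True

print(is_row_echelon([[1, 3, 2, 5], [0, 3, -1, 4], [0, 0, -2, 3]]))  # True
print(is_row_echelon([[3, 2, 5, 1], [0, 0, 1, 3], [0, 5, 1, 2]]))    # False
```

Note that this check follows the definition as stated and does not require leading entries to equal 1.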

Remark 1.12. Some people require the first nonzero entry in each nonzero row to be 1. This

is really a matter of taste and doesn’t matter much, but you should do it to be safe; it’s an

easy extra step to take by simply dividing each row by its leading coefficient.

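Example 1.13. The following matrices are all in Row Echelon Form:

\[
\begin{bmatrix} 1 & 3 & 2 & 5 \\ 0 & 3 & -1 & 4 \\ 0 & 0 & -2 & 3 \end{bmatrix}
\qquad
\begin{bmatrix} 5 & 1 & 3 & 2 & 8 \\ 0 & 0 & 1 & 1 & 1 \\ 0 & 0 & 0 & 0 & -7 \end{bmatrix}
\qquad
\begin{bmatrix} 1 & 1 & 5 \\ 0 & -2 & 3 \\ 0 & 0 & 7 \end{bmatrix}.
\]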



The following matrices are not in Row Echelon Form:

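\[
\begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \end{bmatrix}
\qquad
\begin{bmatrix} 3 & 2 & 5 & 1 \\ 0 & 0 & 1 & 3 \\ 0 & 5 & 1 & 2 \end{bmatrix}
\qquad
\begin{bmatrix} 1 & 3 & 5 \\ 0 & 1 & 2 \\ 0 & 0 & 3 \\ 0 & 0 & 1 \end{bmatrix}.
\]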

Definition 1.14. The process of using elementary row operations to transform a system

into row echelon form is Gaussian elimination.

A system of equations sometimes has a solution, but does not always. We say a system

is inconsistent if there is no solution; we say a system is consistent if there is at least one

solution.

Example 1.15. Consider the system of equations given by

x1 + x2 + x3 + x4 + x5 = 1

−1x1 +−1x2 + x5 = −1

−2x1 +−2x2 + 3x5 = 1

x3 + x4 + 3x5 = −1

x1 + x2 + 2x3 + 2x4 + 4x5 = 1.

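This translates into the augmented matrix

\[
\begin{bmatrix}
1 & 1 & 1 & 1 & 1 & 1 \\
-1 & -1 & 0 & 0 & 1 & -1 \\
-2 & -2 & 0 & 0 & 3 & 1 \\
0 & 0 & 1 & 1 & 3 & -1 \\
1 & 1 & 2 & 2 & 4 & 1
\end{bmatrix}
\to
\begin{bmatrix}
1 & 1 & 1 & 1 & 1 & 1 \\
0 & 0 & 1 & 1 & 2 & 0 \\
0 & 0 & 2 & 2 & 5 & 3 \\
0 & 0 & 1 & 1 & 3 & -1 \\
0 & 0 & 1 & 1 & 3 & 0
\end{bmatrix}
\]

\[
\to
\begin{bmatrix}
1 & 1 & 1 & 1 & 1 & 1 \\
0 & 0 & 1 & 1 & 2 & 0 \\
0 & 0 & 0 & 0 & 1 & 3 \\
0 & 0 & 0 & 0 & 1 & -1 \\
0 & 0 & 0 & 0 & 1 & 0
\end{bmatrix}
\to
\begin{bmatrix}
1 & 1 & 1 & 1 & 1 & 1 \\
0 & 0 & 1 & 1 & 2 & 0 \\
0 & 0 & 0 & 0 & 1 & 3 \\
0 & 0 & 0 & 0 & 0 & -4 \\
0 & 0 & 0 & 0 & 0 & -3
\end{bmatrix}.
\]

We see that the final two equations are now 0 = −4 and 0 = −3, so the system is inconsistent.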




Example 1.16. Let’s look at another system that is almost the same.

x1 + x2 + x3 + x4 + x5 = 1

−1x1 +−1x2 + x5 = −1

−2x1 +−2x2 + 3x5 = 1

x3 + x4 + 3x5 = 3

x1 + x2 + 2x3 + 2x4 + 4x5 = 4.

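This translates into the augmented matrix

\[
\begin{bmatrix}
1 & 1 & 1 & 1 & 1 & 1 \\
-1 & -1 & 0 & 0 & 1 & -1 \\
-2 & -2 & 0 & 0 & 3 & 1 \\
0 & 0 & 1 & 1 & 3 & 3 \\
1 & 1 & 2 & 2 & 4 & 4
\end{bmatrix}
\to
\begin{bmatrix}
1 & 1 & 1 & 1 & 1 & 1 \\
0 & 0 & 1 & 1 & 2 & 0 \\
0 & 0 & 2 & 2 & 5 & 3 \\
0 & 0 & 1 & 1 & 3 & 3 \\
0 & 0 & 1 & 1 & 3 & 3
\end{bmatrix}
\]

\[
\to
\begin{bmatrix}
1 & 1 & 1 & 1 & 1 & 1 \\
0 & 0 & 1 & 1 & 2 & 0 \\
0 & 0 & 0 & 0 & 1 & 3 \\
0 & 0 & 0 & 0 & 1 & 3 \\
0 & 0 & 0 & 0 & 1 & 3
\end{bmatrix}
\to
\begin{bmatrix}
1 & 1 & 1 & 1 & 1 & 1 \\
0 & 0 & 1 & 1 & 2 & 0 \\
0 & 0 & 0 & 0 & 1 & 3 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}.
\]

We see this system is consistent. Our three equations are

\[ x_1 + x_2 + x_3 + x_4 + x_5 = 1, \qquad x_3 + x_4 + 2x_5 = 0, \qquad x_5 = 3. \]

Via back-substitution we see that we have

\[ x_5 = 3, \qquad x_3 + x_4 = -6, \qquad x_1 + x_2 = 4. \]

Thus we could say the set of solutions is {(α, 4 − α, β, −6 − β, 3) : α, β ∈ R} ⊆ R^5.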
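Examples 1.15 and 1.16 differ only in whether the reduced matrix contains a row that reads 0 = c for some nonzero c. That test is easy to automate once the matrix is in row echelon form. The following is a sketch under that assumption; the function name `is_consistent` is hypothetical.

```python
def is_consistent(echelon_aug):
    """Given an augmented matrix already in row echelon form, report whether
    the system has at least one solution.

    The system is inconsistent exactly when some row reads 0 = c with c != 0.
    """
    for row in echelon_aug:
        *coeffs, const = row
        if all(a == 0 for a in coeffs) and const != 0:
            return False
    return True

# Final matrix of Example 1.15
final_1_15 = [[1, 1, 1, 1, 1, 1],
              [0, 0, 1, 1, 2, 0],
              [0, 0, 0, 0, 1, 3],
              [0, 0, 0, 0, 0, -4],
              [0, 0, 0, 0, 0, -3]]
# Final matrix of Example 1.16
final_1_16 = [[1, 1, 1, 1, 1, 1],
              [0, 0, 1, 1, 2, 0],
              [0, 0, 0, 0, 1, 3],
              [0, 0, 0, 0, 0, 0],
              [0, 0, 0, 0, 0, 0]]
print(is_consistent(final_1_15))   # False
print(is_consistent(final_1_16))   # True
```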

What we were just doing definitely worked, but even after we finished transforming the

matrix we still needed to do some more work. So we’d like to reduce the matrix even further

until we can just read the answer off from it.

Definition 1.17. A matrix is in reduced row echelon form if it is in row echelon form, the first nonzero entry in each nonzero row is 1, and that entry is the only nonzero entry in its column.

This means that we will have some number of columns that each have a bunch of zeroes

and one 1. Other than that we may or may not have more columns, which can contain




basically anything; we’ve used up all our degrees of freedom to fix those columns that contain

the leading term of some row.

Note that the columns we have fixed are not necessarily the first columns, as the next

example shows.

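Example 1.18. The following matrices are all in reduced Row Echelon Form:

\[
\begin{bmatrix} 1 & 0 & 0 & 5 \\ 0 & 1 & 0 & 4 \\ 0 & 0 & 1 & 3 \end{bmatrix}
\qquad
\begin{bmatrix} 1 & 17 & 0 & 2 & 8 \\ 0 & 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}
\qquad
\begin{bmatrix} 1 & 0 & 5 \\ 0 & 1 & 3 \\ 0 & 0 & 0 \end{bmatrix}.
\]

The following matrices are not in reduced Row Echelon Form:

\[
\begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 1 & 1 \\ 0 & 0 & 1 & 1 \end{bmatrix}
\qquad
\begin{bmatrix} 3 & 0 & 0 & 1 \\ 0 & 3 & 0 & 3 \\ 0 & 0 & 2 & 2 \end{bmatrix}
\qquad
\begin{bmatrix} 1 & 0 & 15 & 3 \\ 0 & 0 & 1 & 2 \\ 0 & 0 & 0 & 1 \end{bmatrix}.
\]

Example 1.19. Let’s solve the following system by putting the matrix in reduced row echelon form.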

x1 + x2 + x3 + x4 + x5 = 2

x1 + x2 + x3 + 2x4 + 2x5 = 3

x1 + x2 + x3 + 2x4 + 3x5 = 2

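We have

\[
\begin{bmatrix} 1 & 1 & 1 & 1 & 1 & 2 \\ 1 & 1 & 1 & 2 & 2 & 3 \\ 1 & 1 & 1 & 2 & 3 & 2 \end{bmatrix}
\to
\begin{bmatrix} 1 & 1 & 1 & 1 & 1 & 2 \\ 0 & 0 & 0 & 1 & 1 & 1 \\ 0 & 0 & 0 & 1 & 2 & 0 \end{bmatrix}
\to
\begin{bmatrix} 1 & 1 & 1 & 1 & 1 & 2 \\ 0 & 0 & 0 & 1 & 1 & 1 \\ 0 & 0 & 0 & 0 & 1 & -1 \end{bmatrix}
\]

\[
\to
\begin{bmatrix} 1 & 1 & 1 & 0 & 0 & 1 \\ 0 & 0 & 0 & 1 & 1 & 1 \\ 0 & 0 & 0 & 0 & 1 & -1 \end{bmatrix}
\to
\begin{bmatrix} 1 & 1 & 1 & 0 & 0 & 1 \\ 0 & 0 & 0 & 1 & 0 & 2 \\ 0 & 0 & 0 & 0 & 1 & -1 \end{bmatrix}.
\]

From this we can read off the solution x_1 + x_2 + x_3 = 1, x_4 = 2, x_5 = −1. Thus the set of solutions is {(1 − α − β, α, β, 2, −1) : α, β ∈ R}.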
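The reduction carried out by hand in Example 1.19 can be automated. Below is a minimal sketch of one standard way to do it, working column by column and using only the three elementary row operations. The function name `rref` and the use of exact `Fraction` arithmetic are choices made for this illustration; the order of operations may differ from the hand computation, but the reduced row echelon form that results is the same.

```python
from fractions import Fraction

def rref(matrix):
    """Return the reduced row echelon form of `matrix` as rows of Fractions."""
    m = [[Fraction(entry) for entry in row] for row in matrix]
    rows, cols = len(m), len(m[0])
    pivot_row = 0
    for col in range(cols):
        if pivot_row == rows:
            break
        # look for a row at or below pivot_row with a nonzero entry in this column
        pivot = next((r for r in range(pivot_row, rows) if m[r][col] != 0), None)
        if pivot is None:
            continue                                   # no leading entry in this column
        m[pivot_row], m[pivot] = m[pivot], m[pivot_row]            # type I
        lead = m[pivot_row][col]
        m[pivot_row] = [entry / lead for entry in m[pivot_row]]    # type II
        for r in range(rows):                                      # type III
            if r != pivot_row and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [a - factor * p for a, p in zip(m[r], m[pivot_row])]
        pivot_row += 1
    return m

system = [[1, 1, 1, 1, 1, 2],
          [1, 1, 1, 2, 2, 3],
          [1, 1, 1, 2, 3, 2]]
print(rref(system) == [[1, 1, 1, 0, 0, 1],
                       [0, 0, 0, 1, 0, 2],
                       [0, 0, 0, 0, 1, -1]])   # True
```

The sketch treats every column the same way, including the constant column of an augmented matrix, so for an inconsistent system it will place a leading 1 in the last column.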

We say some systems of equations are “overdetermined”, which means that there are more equations than variables. Overdetermined systems are “usually” inconsistent, but not always: they can be consistent when some of the equations are redundant.





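Example 1.20. The system

\[
\begin{aligned}
x_1 + 2x_2 + x_3 &= 1 \\
2x_1 - x_2 + x_3 &= 2 \\
4x_1 + 3x_2 + 3x_3 &= 4 \\
2x_1 - x_2 + 3x_3 &= 5
\end{aligned}
\]

gives the matrix

\[
\begin{bmatrix} 1 & 2 & 1 & 1 \\ 2 & -1 & 1 & 2 \\ 4 & 3 & 3 & 4 \\ 2 & -1 & 3 & 5 \end{bmatrix}
\to
\begin{bmatrix} 1 & 2 & 1 & 1 \\ 0 & -5 & -1 & 0 \\ 0 & -5 & -1 & 0 \\ 0 & -5 & 1 & 3 \end{bmatrix}
\to
\begin{bmatrix} 1 & 2 & 1 & 1 \\ 0 & -5 & -1 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 2 & 3 \end{bmatrix}
\]

\[
\to
\begin{bmatrix} 1 & 2 & 1 & 1 \\ 0 & 1 & 1/5 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 3/2 \end{bmatrix}
\to
\begin{bmatrix} 1 & 0 & 3/5 & 1 \\ 0 & 1 & 1/5 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 3/2 \end{bmatrix}
\to
\begin{bmatrix} 1 & 0 & 0 & 1/10 \\ 0 & 1 & 0 & -3/10 \\ 0 & 0 & 1 & 3/2 \\ 0 & 0 & 0 & 0 \end{bmatrix}.
\]

This gives us the solution x_1 = 1/10, x_2 = −3/10, x_3 = 3/2, which you can go back and check solves the original system.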

This overdetermined system does have a solution, but only because two of the equations

were redundant, as we could see in the second matrix where two lines are identical. In fact

we can go back to the original set of equations, and see that if we add two times the first

equation to the second equation, we get the third—which is the redundancy.

Other systems of equations are “underdetermined”, which means there are more variables

than equations. These systems are usually but not always consistent.

Example 1.21. Let’s consider the system

−x1 + x2 − x3 + 3x4 = 0

3x1 + x2 − x3 − x4 = 0

2x1 + x2 − 2x3 − x4 = 0.




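This gives us the matrix

\[
\begin{bmatrix} -1 & 1 & -1 & 3 & 0 \\ 3 & 1 & -1 & -1 & 0 \\ 2 & 1 & -2 & -1 & 0 \end{bmatrix}
\to
\begin{bmatrix} 1 & -1 & 1 & -3 & 0 \\ 0 & 4 & -4 & 8 & 0 \\ 0 & 3 & -4 & 5 & 0 \end{bmatrix}
\to
\begin{bmatrix} 1 & -1 & 1 & -3 & 0 \\ 0 & 1 & -1 & 2 & 0 \\ 0 & 3 & -4 & 5 & 0 \end{bmatrix}
\]

\[
\to
\begin{bmatrix} 1 & -1 & 1 & -3 & 0 \\ 0 & 1 & -1 & 2 & 0 \\ 0 & 0 & -1 & -1 & 0 \end{bmatrix}
\to
\begin{bmatrix} 1 & -1 & 1 & -3 & 0 \\ 0 & 1 & -1 & 2 & 0 \\ 0 & 0 & 1 & 1 & 0 \end{bmatrix}
\to
\begin{bmatrix} 1 & 0 & 0 & -1 & 0 \\ 0 & 1 & -1 & 2 & 0 \\ 0 & 0 & 1 & 1 & 0 \end{bmatrix}
\to
\begin{bmatrix} 1 & 0 & 0 & -1 & 0 \\ 0 & 1 & 0 & 3 & 0 \\ 0 & 0 & 1 & 1 & 0 \end{bmatrix}.
\]

We see that we can’t “simplify” the fourth column in any way; we don’t have any degrees of freedom after we fix the first three columns. This means that we can pick x_4 to be anything we want, and the other variables are given by x_1 − x_4 = 0, x_2 + 3x_4 = 0, x_3 + x_4 = 0. Thus the set of solutions is {(α, −3α, −α, α) : α ∈ R}.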

Remark 1.22. A system of any size can be either consistent or inconsistent. 0 = 1 is an

inconsistent system with one equation, and

x1 + · · ·+ x100 = 0

x1 + · · ·+ x100 = 1

is an inconsistent system with a hundred variables and only two equations. In contrast,

x1 = 1

x1 = 1

......

x1 = 1

has only one variable, and many equations, and is still consistent.

1.4 Matrix Algebra

So far we’ve treated matrices as just being a convenient way to write down a bunch of

numbers. But matrices are interesting mathematical objects in their own right, and we can

do a lot of useful calculations with them.




1.4.1 Simple Operations

We want to start with a couple of simple operations. Neither of these operations really

depend on the structure of the matrix; they treat the matrix as a list of numbers.

Definition 1.23. If A = (a_{ij}) is an m × n matrix, and r ∈ R is a real number, then we can multiply each entry of the matrix A by the real number r. This is called scalar multiplication and we say that r is a scalar.

\[
rA = (r a_{ij}) = \begin{bmatrix}
r a_{11} & r a_{12} & \cdots & r a_{1n} \\
r a_{21} & r a_{22} & \cdots & r a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
r a_{m1} & r a_{m2} & \cdots & r a_{mn}
\end{bmatrix}.
\]

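Definition 1.24. If A = (a_{ij}) and B = (b_{ij}) are two m × n matrices, we can add the two matrices by adding each individual pair of coordinates together.

\[
A + B = (a_{ij} + b_{ij}) = \begin{bmatrix}
a_{11} + b_{11} & a_{12} + b_{12} & \cdots & a_{1n} + b_{1n} \\
a_{21} + b_{21} & a_{22} + b_{22} & \cdots & a_{2n} + b_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} + b_{m1} & a_{m2} + b_{m2} & \cdots & a_{mn} + b_{mn}
\end{bmatrix}.
\]

Example 1.25.

\[
3 \begin{bmatrix} 2 & 5 \\ -1 & 4 \end{bmatrix}
= \begin{bmatrix} 6 & 15 \\ -3 & 12 \end{bmatrix}
\]

\[
\begin{bmatrix} 4 & 1 & 3 \\ -2 & 5 & -1 \end{bmatrix}
+ \begin{bmatrix} -2 & 7 & 5 \\ 1 & -6 & 4 \end{bmatrix}
= \begin{bmatrix} 2 & 8 & 8 \\ -1 & -1 & 3 \end{bmatrix}
\]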
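Both operations act entry by entry, which makes them short to write down in code. The sketch below reproduces the two computations of Example 1.25; the function names `scalar_multiply` and `add` and the list-of-rows encoding are assumptions made for illustration.

```python
def scalar_multiply(r, a):
    """Multiply every entry of the matrix `a` by the scalar r."""
    return [[r * entry for entry in row] for row in a]

def add(a, b):
    """Entrywise sum of two matrices of the same shape."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("matrices must have the same shape")
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

print(scalar_multiply(3, [[2, 5], [-1, 4]]))
# [[6, 15], [-3, 12]]
print(add([[4, 1, 3], [-2, 5, -1]], [[-2, 7, 5], [1, -6, 4]]))
# [[2, 8, 8], [-1, -1, 3]]
```

The shape check in `add` reflects the requirement in Definition 1.24 that both matrices be m × n.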

1.4.2 Matrix Multiplication

Definition 1.26. If A ∈ M_{ℓ×m} and B ∈ M_{m×n}, then there is a matrix AB ∈ M_{ℓ×n} whose ij element is

\[ c_{ij} = \sum_{k=1}^{m} a_{ik} b_{kj}. \]

If you’re familiar with the dot product, you can think that the ij element of AB is the dot product of the ith row of A with the jth column of B.

Note that A and B don’t have to have the same dimension! Instead, A has the same

number of columns that B has rows. The new matrix will have the same number of rows as

A and the same number of columns as B.
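The defining sum for the ij element can be written almost verbatim in code. This is a sketch only: the function name `matmul` and the list-of-rows encoding are assumptions, and the small matrices in the usage lines are made up for illustration rather than taken from the notes.

```python
def matmul(a, b):
    """Product of an l x m matrix `a` and an m x n matrix `b` (lists of rows)."""
    m = len(b)
    if any(len(row) != m for row in a):
        raise ValueError("a must have as many columns as b has rows")
    n = len(b[0])
    # entry (i, j) is the sum over k of a[i][k] * b[k][j]
    return [[sum(a[i][k] * b[k][j] for k in range(m)) for j in range(n)]
            for i in range(len(a))]

# A 2 x 3 matrix times a 3 x 1 matrix gives a 2 x 1 matrix
print(matmul([[1, 2, 3], [4, 5, 6]], [[1], [0], [2]]))    # [[7], [16]]

# The order of the factors matters, even for square matrices
print(matmul([[1, 2], [3, 4]], [[0, 1], [1, 0]]))         # [[2, 1], [4, 3]]
print(matmul([[0, 1], [1, 0]], [[1, 2], [3, 4]]))         # [[3, 4], [1, 2]]
```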




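Example 1.27.

\[
\begin{bmatrix} 1 & 3 \\ 2 & 4 \end{bmatrix}
\begin{bmatrix} 5 & -1 \\ 3 & 2 \end{bmatrix}
= \begin{bmatrix} 1 \cdot 5 + 3 \cdot 3 & 1 \cdot (-1) + 3 \cdot 2 \\ 2 \cdot 5 + 4 \cdot 3 & 2 \cdot (-1) + 4 \cdot 2 \end{bmatrix}
= \begin{bmatrix} 14 & 5 \\ 22 & 6 \end{bmatrix}
\]

\[
\begin{bmatrix} 4 & 6 \\ 2 & 1 \end{bmatrix}
\begin{bmatrix} 3 & 1 & 5 \\ 4 & 1 & 6 \end{bmatrix}
= \begin{bmatrix} 4 \cdot 3 + 6 \cdot 4 & 4 \cdot 1 + 6 \cdot 1 & 4 \cdot 5 + 6 \cdot 6 \\ 2 \cdot 3 + 1 \cdot 4 & 2 \cdot 1 + 1 \cdot 1 & 2 \cdot 5 + 1 \cdot 6 \end{bmatrix}
= \begin{bmatrix} 36 & 10 & 56 \\ 10 & 3 & 16 \end{bmatrix}.
\]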

Matrix multiplication is associative, by which we mean that (AB)C = A(BC).

Matrix multiplication is not commutative: in general, it’s not even the case that AB and BA both make sense. If A ∈ M_{3×4} and B ∈ M_{4×2} then AB is a 3 × 2 matrix, but BA isn’t a thing we can compute. And even if AB and BA are both well-defined, they need not be equal.

Example 1.28.

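\[
\begin{bmatrix} 3 & 5 & 1 \\ -2 & 0 & 2 \end{bmatrix}
\begin{bmatrix} 2 & 1 \\ 1 & 3 \\ 4 & 1 \end{bmatrix}
= \begin{bmatrix} 3 \cdot 2 + 5 \cdot 1 + 1 \cdot 4 & 3 \cdot 1 + 5 \cdot 3 + 1 \cdot 1 \\ -2 \cdot 2 + 0 \cdot 1 + 2 \cdot 4 & -2 \cdot 1 + 0 \cdot 3 + 2 \cdot 1 \end{bmatrix}
= \begin{bmatrix} 15 & 19 \\ 4 & 0 \end{bmatrix}
\]

\[
\begin{bmatrix} 2 & 1 \\ 1 & 3 \\ 4 & 1 \end{bmatrix}
\begin{bmatrix} 3 & 5 & 1 \\ -2 & 0 & 2 \end{bmatrix}
= \begin{bmatrix} 2 \cdot 3 + 1 \cdot (-2) & 2 \cdot 5 + 1 \cdot 0 & 2 \cdot 1 + 1 \cdot 2 \\ 1 \cdot 3 + 3 \cdot (-2) & 1 \cdot 5 + 3 \cdot 0 & 1 \cdot 1 + 3 \cdot 2 \\ 4 \cdot 3 + 1 \cdot (-2) & 4 \cdot 5 + 1 \cdot 0 & 4 \cdot 1 + 1 \cdot 2 \end{bmatrix}
= \begin{bmatrix} 4 & 10 & 4 \\ -3 & 5 & 7 \\ 10 & 20 & 6 \end{bmatrix}.
\]

Particularly nice things happen when our matrices are square. Any time we have two n × n matrices we can multiply them by each other in either order (though we will still get different things each way!).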

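Example 1.29.

\[
\begin{bmatrix} 4 & 1 \\ -3 & 5 \end{bmatrix}
\begin{bmatrix} -1 & 1 \\ 1 & -2 \end{bmatrix}
= \begin{bmatrix} -3 & 2 \\ 8 & -13 \end{bmatrix}
\]

\[
\begin{bmatrix} -1 & 1 \\ 1 & -2 \end{bmatrix}
\begin{bmatrix} 4 & 1 \\ -3 & 5 \end{bmatrix}
= \begin{bmatrix} -7 & 4 \\ 10 & -9 \end{bmatrix}.
\]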

However, matrix multiplication does satisfy the distributive and associative properties.

Fact 1.30. If A ∈ M_{ℓ×m} and B, C ∈ M_{m×n} then A(B + C) = AB + AC.

If A ∈ M_{ℓ×m}, B ∈ M_{m×n}, C ∈ M_{n×p} then (AB)C = A(BC).

Example 1.31. Let

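\[
A = \begin{bmatrix} 4 & 1 \\ -3 & 5 \end{bmatrix}, \qquad
B = \begin{bmatrix} -1 & 1 \\ 1 & -2 \end{bmatrix}, \qquad
C = \begin{bmatrix} 3 & 2 \\ 1 & -5 \end{bmatrix}.
\]

Then we have

\[
AB = \begin{bmatrix} -3 & 2 \\ 8 & -13 \end{bmatrix}, \qquad
AC = \begin{bmatrix} 13 & 3 \\ -4 & -31 \end{bmatrix}, \qquad
AB + AC = \begin{bmatrix} 10 & 5 \\ 4 & -44 \end{bmatrix}
\]

\[
B + C = \begin{bmatrix} 2 & 3 \\ 2 & -7 \end{bmatrix}, \qquad
A(B + C) = \begin{bmatrix} 10 & 5 \\ 4 & -44 \end{bmatrix}.
\]

Thus we see AB + AC = A(B + C).

We can similarly compute

\[
AB = \begin{bmatrix} -3 & 2 \\ 8 & -13 \end{bmatrix}, \qquad
(AB)C = \begin{bmatrix} -3 & 2 \\ 8 & -13 \end{bmatrix}
\begin{bmatrix} 3 & 2 \\ 1 & -5 \end{bmatrix}
= \begin{bmatrix} -7 & -16 \\ 11 & 81 \end{bmatrix}
\]

\[
BC = \begin{bmatrix} -2 & -7 \\ 1 & 12 \end{bmatrix}, \qquad
A(BC) = \begin{bmatrix} 4 & 1 \\ -3 & 5 \end{bmatrix}
\begin{bmatrix} -2 & -7 \\ 1 & 12 \end{bmatrix}
= \begin{bmatrix} -7 & -16 \\ 11 & 81 \end{bmatrix}.
\]

Thus (AB)C = A(BC).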

1.4.3 Transposes

Definition 1.32. If A is an m × n matrix, then we can form an n × m matrix B by flipping A across its diagonal, so that b_{ij} = a_{ji}. We say that B is the transpose of A, and write B = A^T.

If A = A^T we say that A is symmetric. (Symmetric matrices must always be square.)

Example 1.33.

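\[
\text{If } A = \begin{bmatrix} 1 & 3 & 5 \\ -1 & 4 & 2 \end{bmatrix}
\text{ then } A^T = \begin{bmatrix} 1 & -1 \\ 3 & 4 \\ 5 & 2 \end{bmatrix}.
\]

\[
\text{If } B = \begin{bmatrix} 5 & 3 \\ 3 & -2 \end{bmatrix}
\text{ then } B^T = \begin{bmatrix} 5 & 3 \\ 3 & -2 \end{bmatrix}
\text{ and thus } B \text{ is symmetric.}
\]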

Fact 1.34.

• (A^T)^T = A.

• (A + B)^T = A^T + B^T.

• (rA)^T = rA^T.

• If A ∈ M_{ℓ×m} and B ∈ M_{m×n} then (AB)^T = B^T A^T.
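The last property is the one that is easiest to get backwards, so it is worth checking on a concrete pair of matrices. The sketch below uses the 2 × 3 and 3 × 2 matrices from Example 1.28; the helper names `transpose` and `matmul` are assumptions made for illustration.

```python
def transpose(a):
    """Flip a matrix (list of rows) across its diagonal."""
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    """Matrix product; assumes a has as many columns as b has rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

A = [[3, 5, 1], [-2, 0, 2]]
B = [[2, 1], [1, 3], [4, 1]]

print(transpose(A))                                   # [[3, -2], [5, 0], [1, 2]]
print(transpose(transpose(A)) == A)                   # True
# The order of the factors reverses under the transpose
print(transpose(matmul(A, B)) == matmul(transpose(B), transpose(A)))   # True
```

Checking one example does not prove the fact, but it does catch the common mistake of writing A^T B^T, which for these shapes is a 3 × 3 matrix rather than the required 2 × 2.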




1.4.4 Matrices and Systems of Equations

We will do a lot with matrices in the future (a linear algebra class that doesn’t cover general

vector spaces is often called a matrix algebra class). In the current context we mostly want them to make it easier to talk about systems of equations.

Let

\[
\begin{aligned}
a_{11}x_1 + \cdots + a_{1n}x_n &= b_1 \\
a_{21}x_1 + \cdots + a_{2n}x_n &= b_2 \\
&\;\;\vdots \\
a_{m1}x_1 + \cdots + a_{mn}x_n &= b_m
\end{aligned}
\]

be a system of linear equations. Then A = (a_{ij}) ∈ M_{m×n} is its coefficient matrix, and b = (b_1, . . . , b_m) is an element of R^m, but we can also think of it as an m × 1 matrix b = [b_1, . . . , b_m]^T. If we take x = [x_1, . . . , x_n]^T to be an n × 1 matrix, we can rewrite our linear system as the equation

Ax = b,

which is certainly much easier to write down.

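Example 1.35. If

\[ A = \begin{bmatrix} 1 & 3 \\ 2 & 4 \end{bmatrix} \]

and b = [4, 6]^T, then the equation Ax = b is

\[
\begin{bmatrix} 1 & 3 \\ 2 & 4 \end{bmatrix}
\begin{bmatrix} x \\ y \end{bmatrix}
= \begin{bmatrix} 4 \\ 6 \end{bmatrix},
\]

that is,

\[
\begin{bmatrix} x + 3y \\ 2x + 4y \end{bmatrix}
= \begin{bmatrix} 4 \\ 6 \end{bmatrix},
\]

or

\[
\begin{aligned}
x + 3y &= 4 \\
2x + 4y &= 6.
\end{aligned}
\]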

1.5 The identity matrix and matrix inverses

We just saw that any system of linear equations can be written Ax = b, which reminds us of the single-variable linear equation ax = b. In the single-variable case we can just divide both sides of the equation by a, as long as a ≠ 0; it would be nice if we could do the same thing for any system of linear equations.

But what does it mean to divide by a matrix? When we define division, we often start by understanding reciprocals 1/a. So we start by asking what matrix is the equivalent of the number 1.

Definition 1.36. For any n, the identity matrix I_n ∈ M_n is the matrix with a 1 on every diagonal entry and a zero everywhere else. For example,

\[
I_4 = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}.
\]

If A ∈ M_n then I_n A = A = A I_n. Thus it is a multiplicative identity in the ring of n × n matrices.

The identity matrix is symmetric (that is, I_n^T = I_n).

Now we want to define multiplicative inverses, the equivalent of reciprocals. The definition is not difficult to invent:

Definition 1.37. Let A and B be n × n matrices, such that AB = I_n = BA. Then we say that B is the inverse (or multiplicative inverse) of A, and write B = A^{-1}.

If such a matrix exists, we say that A is invertible or nonsingular. If no such matrix

exists, we say that A is singular.

Example 1.38. The identity matrix In is its own inverse, and thus invertible.

The matrices

\[
\begin{bmatrix} 2 & 4 \\ 3 & 1 \end{bmatrix}
\quad \text{and} \quad
\begin{bmatrix} -1/10 & 2/5 \\ 3/10 & -1/5 \end{bmatrix}
\]

are inverses to each other, as you can check.

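Example 1.39. The matrix

\[ \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} \]

has no inverse, since

\[
\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}
\begin{bmatrix} a & b \\ c & d \end{bmatrix}
= \begin{bmatrix} a & b \\ 0 & 0 \end{bmatrix}
\]

won’t be the identity for any a, b, c, d. Thus this matrix is singular.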

Remark 1.40. If AB = In then BA = In. This isn’t really trivial but we won’t prove it.




As the last example shows, finding the inverse to a matrix is a matter of solving a big

pile of linear equations at the same time (one for each coefficient of the inverse matrix).

Fortunately, we just got good at solving linear equations. Even more fortunately, there’s an

easy way to organize the work for these problems.

Proposition 1.41. Let A be an n × n matrix, and form the augmented matrix [A I_n]. Then A is invertible if and only if the reduced row echelon form of this augmented matrix is [I_n B] for some matrix B, and furthermore B = A^{-1}.

Proof. Let X be an n × n matrix of unknowns, and set up the system of equations implied by AX = I_n. This will be the same set of equations we are solving with this row reduction, and thus a matrix X exists if and only if this system has a solution, which happens if and only if the reduced row echelon form of [A I_n] has no all-zero rows on the A side.

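Example 1.42. Let’s find an inverse for

\[ A = \begin{bmatrix} 1 & 2 & 3 \\ 0 & 1 & 4 \\ 0 & 0 & 1 \end{bmatrix}. \]

We form and reduce the augmented matrix

\[
\begin{bmatrix} 1 & 2 & 3 & 1 & 0 & 0 \\ 0 & 1 & 4 & 0 & 1 & 0 \\ 0 & 0 & 1 & 0 & 0 & 1 \end{bmatrix}
\to
\begin{bmatrix} 1 & 0 & -5 & 1 & -2 & 0 \\ 0 & 1 & 4 & 0 & 1 & 0 \\ 0 & 0 & 1 & 0 & 0 & 1 \end{bmatrix}
\to
\begin{bmatrix} 1 & 0 & 0 & 1 & -2 & 5 \\ 0 & 1 & 0 & 0 & 1 & -4 \\ 0 & 0 & 1 & 0 & 0 & 1 \end{bmatrix}.
\]

Thus

\[ A^{-1} = \begin{bmatrix} 1 & -2 & 5 \\ 0 & 1 & -4 \\ 0 & 0 & 1 \end{bmatrix}. \]

We can check this by multiplying the matrices back together.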

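Example 1.43. Find the inverse of

\[ B = \begin{bmatrix} 1 & 0 & 4 \\ 1 & 1 & 6 \\ -3 & 0 & -10 \end{bmatrix}. \]

We form and reduce the augmented matrix

\[
\begin{bmatrix} 1 & 0 & 4 & 1 & 0 & 0 \\ 1 & 1 & 6 & 0 & 1 & 0 \\ -3 & 0 & -10 & 0 & 0 & 1 \end{bmatrix}
\to
\begin{bmatrix} 1 & 0 & 4 & 1 & 0 & 0 \\ 0 & 1 & 2 & -1 & 1 & 0 \\ 0 & 0 & 2 & 3 & 0 & 1 \end{bmatrix}
\]

\[
\to
\begin{bmatrix} 1 & 0 & 4 & 1 & 0 & 0 \\ 0 & 1 & 2 & -1 & 1 & 0 \\ 0 & 0 & 1 & 3/2 & 0 & 1/2 \end{bmatrix}
\to
\begin{bmatrix} 1 & 0 & 0 & -5 & 0 & -2 \\ 0 & 1 & 0 & -4 & 1 & -1 \\ 0 & 0 & 1 & 3/2 & 0 & 1/2 \end{bmatrix}.
\]

Thus

\[ B^{-1} = \begin{bmatrix} -5 & 0 & -2 \\ -4 & 1 & -1 \\ 3/2 & 0 & 1/2 \end{bmatrix}. \]

Example 1.44. What happens if we try to find an inverse for

\[ C = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}? \]

We start with

\[ \begin{bmatrix} 1 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \]

but then there is no way to make the left-side block of the matrix into the identity I_2. Thus this matrix C is not invertible.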
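The procedure of Proposition 1.41 can be sketched in code: attach the identity on the right, row reduce, and read the inverse off the right-hand block. The function name `inverse`, the convention of returning `None` for a singular matrix, and the use of exact `Fraction` arithmetic are all choices made for this illustration.

```python
from fractions import Fraction

def inverse(a):
    """Return the inverse of the square matrix `a`, or None if `a` is singular."""
    n = len(a)
    # build the augmented matrix [A  I_n]
    m = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(a)]
    for col in range(n):
        # find a row at or below the diagonal with a nonzero entry in this column
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return None          # the left block cannot be reduced to I_n
        m[col], m[pivot] = m[pivot], m[col]
        lead = m[col][col]
        m[col] = [x / lead for x in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [x - factor * p for x, p in zip(m[r], m[col])]
    return [row[n:] for row in m]   # the right-hand block is the inverse

print(inverse([[1, 2, 3], [0, 1, 4], [0, 0, 1]]) == [[1, -2, 5], [0, 1, -4], [0, 0, 1]])  # True
print(inverse([[1, 0], [0, 0]]))                                                          # None
```

The two usage lines correspond to Examples 1.42 and 1.44 respectively.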

There are many more interesting properties of inverse matrices we’d like to discuss, but

we don’t have the tools to explain them properly yet. We will be returning to the properties

of matrices throughout the course as we develop more techniques and vocabulary.

1.6 Homogeneous systems and subspaces

There’s one particular category of systems of linear equations that’s especially important to

us, and will lead into the main subject matter of the course.

Definition 1.45. The n × 1 matrix 0 = [0, . . . , 0]T whose entries are all zero is called the

zero vector.

A system of linear equations Ax = b is called homogeneous if b = 0, that is, if the

constant term in each equation is zero. Otherwise, it is non-homogeneous.

It’s pretty clear that every homogeneous system has at least one solution: the solution

where every variable is equal to zero. It may have many more solutions than that.

Definition 1.46. For a given matrix A, the subspace of solutions to the equation Ax = 0

is called the nullspace N(A) or the kernel ker(A) of the matrix A.

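Example 1.47. Find the null space of

\[ A = \begin{bmatrix} 1 & 1 & 1 & 0 \\ 2 & 1 & 0 & 1 \end{bmatrix}. \]

We row reduce the matrix

\[
\begin{bmatrix} 1 & 1 & 1 & 0 & 0 \\ 2 & 1 & 0 & 1 & 0 \end{bmatrix}
\to
\begin{bmatrix} 1 & 1 & 1 & 0 & 0 \\ 0 & -1 & -2 & 1 & 0 \end{bmatrix}
\to
\begin{bmatrix} 1 & 0 & -1 & 1 & 0 \\ 0 & -1 & -2 & 1 & 0 \end{bmatrix}
\]

We see that x_3 and x_4 are free variables, and x_1, x_2 are determined by x_3 and x_4. (You could of course do this the other way around.) Then we have x_1 = x_3 − x_4 and x_2 = x_4 − 2x_3. Thus N(A) = {(α − β, β − 2α, α, β)} = {α(1,−2, 1, 0) + β(−1, 1, 0, 1)}.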
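The computation in Example 1.47 follows a pattern that can be automated: row reduce, identify the columns without a leading entry as free variables, and for each free variable produce one vector by setting it to 1 and the other free variables to 0. The sketch below does this; the names `rref_with_pivots` and `nullspace_generators` are hypothetical, and the choice of which variables to treat as free is the one forced by the column order.

```python
from fractions import Fraction

def rref_with_pivots(matrix):
    """Reduced row echelon form plus the list of columns holding leading 1s."""
    m = [[Fraction(x) for x in row] for row in matrix]
    rows, cols = len(m), len(m[0])
    pivots = []
    for col in range(cols):
        r0 = len(pivots)                 # next row that needs a leading entry
        if r0 == rows:
            break
        pivot = next((r for r in range(r0, rows) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[r0], m[pivot] = m[pivot], m[r0]
        lead = m[r0][col]
        m[r0] = [x / lead for x in m[r0]]
        for r in range(rows):
            if r != r0 and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [x - factor * p for x, p in zip(m[r], m[r0])]
        pivots.append(col)
    return m, pivots

def nullspace_generators(a):
    """Vectors whose linear combinations give every solution of Ax = 0."""
    r, pivots = rref_with_pivots(a)
    n = len(a[0])
    free = [j for j in range(n) if j not in pivots]
    generators = []
    for f in free:
        v = [Fraction(0)] * n
        v[f] = Fraction(1)               # this free variable is 1, the others 0
        for i, p in enumerate(pivots):
            v[p] = -r[i][f]              # solve row i for its leading variable
        generators.append(v)
    return generators

A = [[1, 1, 1, 0], [2, 1, 0, 1]]
print(nullspace_generators(A) == [[1, -2, 1, 0], [-1, 1, 0, 1]])   # True
```

For an invertible matrix there are no free variables, so the function returns an empty list, which corresponds to N(A) = {0}.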




Remark 1.48. It’s not too hard to see that a square matrix A is invertible if and only

if N(A) = {0}. If the matrix is invertible, then row-reducing it gets to be the identity

matrix—and so the solution to the associated homogeneous system is just 0. Conversely, if

the only solution is 0 then you must not have any rows of all zeros in the reduced form of

your matrix, so it’s invertible.

We can see that if we add together two solutions to this system of equations, we will get

another. In fact, this must be true of any homogeneous system.

Proposition 1.49 (Homogeneity). Suppose Ax = 0 is a homogeneous system of linear equations. Then:

1. 0 is a solution to the system.

2. If x1 and x2 are solutions to this system, then x1 + x2 is a solution.

3. If x is a solution to this system, and r is a real number, then rx is a solution.

Remark 1.50. We can rephrase this result: for any matrix A, we have

1. 0 ∈ N(A)

2. If x1, x2 ∈ N(A) then x1 + x2 ∈ N(A)

3. If r ∈ R and x ∈ N(A) then rx ∈ N(A).

This says exactly the same thing, but puts the emphasis on the matrix A rather than on the

equation Ax = 0.

Proof. 1. Calculation confirms that A0 = 0.

2. If x1 and x2 are solutions, then Ax1 = 0 and Ax2 = 0, so we have

A(x1 + x2) = Ax1 + Ax2 = 0 + 0 = 0.

Thus x1 + x2 is a solution.

3. If x is a solution and r ∈ R, then

A(rx) = rAx = r0 = 0.

Thus rx is a solution.




In contrast, the set of solutions to a non-homogeneous system Ax = b where b ≠ 0 never has these nice properties.

1. The zero vector is never a solution, since A0 = 0 ≠ b.

2. Adding two solutions doesn’t give you another solution: A(x1 + x2) = Ax1 + Ax2 = b + b = 2b ≠ b.

3. Multiplying a solution by a scalar doesn’t give another solution: A(rx) = rAx = rb ≠ b unless r = 1.

So there’s something special about homogeneous systems, which we will discuss in more detail in section 2.3.

But even though the set of solutions to a non-homogeneous system doesn’t have the nice

properties of proposition 1.49, we can still say a lot about what it looks like.

Proposition 1.51. Suppose Ax = b is a non-homogeneous linear system.

If U = N(A) and x0 is a solution to Ax = b, then the set of solutions to the system Ax = b is the set

U + x0 = {y + x0 : y ∈ U}.

Proof. We want to show that two sets are equal, so we show that each is a subset of the

other.

First, suppose that x1 is a solution to Ax1 = b. Then we have

b = Ax0

b = Ax1

b− b = Ax1 − Ax0 = A(x1 − x0)

0 = A(x1 − x0).

Thus y = x1 − x0 is a solution to Ax = 0, and then x1 = x0 + y for some y ∈ U .

Conversely, suppose x1 = x0 + y for some y ∈ U . Then

Ax1 = A(x0 + y) = Ax0 + Ay = b + 0 = b.

Thus x1 is a solution to Ax = b.

Remark 1.52. Notice this did not depend on the specific matrix, or even really the fact that

A is a matrix at all; it only depends on the ability to distribute matrix multiplication across

sums of vectors. Operations with this property are called “linear” and we will discuss them

in much more detail in section 4.




Definition 1.53. Suppose Ax = b is a system of linear equations. We call the equation

Ax = 0 the associated homogeneous system of linear equations. That is, the associated

homogeneous system has the same coefficients for all the variables, but the constants are all

zero.

Thus proposition 1.51 lets us understand the set of solutions to a non-homogeneous

system based on the solutions to the associated homogeneous system.

Example 1.54. Let’s find a set of solutions to the system

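\[
\begin{aligned}
x_1 + x_2 + x_3 &= 3 \\
x_1 + 2x_2 + 3x_3 &= 6 \\
2x_1 + 3x_2 + 4x_3 &= 9.
\end{aligned}
\]

Gaussian elimination gives

\[
\begin{bmatrix} 1 & 1 & 1 & 3 \\ 1 & 2 & 3 & 6 \\ 2 & 3 & 4 & 9 \end{bmatrix}
\to
\begin{bmatrix} 1 & 1 & 1 & 3 \\ 0 & 1 & 2 & 3 \\ 0 & 1 & 2 & 3 \end{bmatrix}
\to
\begin{bmatrix} 1 & 1 & 1 & 3 \\ 0 & 1 & 2 & 3 \\ 0 & 0 & 0 & 0 \end{bmatrix}
\to
\begin{bmatrix} 1 & 0 & -1 & 0 \\ 0 & 1 & 2 & 3 \\ 0 & 0 & 0 & 0 \end{bmatrix}.
\]

Taking x_3 = α as a free variable, our solution set is {(α, 3 − 2α, α)} = {(0, 3, 0) + α(1,−2, 1)}. Indeed, we see that this set corresponds to elements of the vector space spanned by {(1,−2, 1)}, plus a specific solution (0, 3, 0).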

Alternatively, we could have solved the homogeneous system first, and seen that the

solution was x1 − x3 = 0, x2 + 2x3 = 0, telling us that N(A) = {α(1,−2, 1)}. Then we just

need to find a solution; to my eyes the obvious solution is (1, 1, 1). So our theorem tells us

that the solution set is {(1, 1, 1) + α(1,−2, 1)}. This may not look like the solution we got

before, but it is in fact the same set, since (1, 1, 1) = (0, 3, 0) + (1,−2, 1).
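The claim that both descriptions give solutions can be spot-checked numerically. The sketch below multiplies the coefficient matrix of Example 1.54 by points of the form x0 + α(1,−2, 1) for a range of integer values of α, using each of the two particular solutions in turn. The helper names `matvec` and `shifted` are assumptions made for illustration.

```python
def matvec(a, x):
    """Product of a matrix (list of rows) with a vector (list of numbers)."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]

A = [[1, 1, 1],
     [1, 2, 3],
     [2, 3, 4]]
b = [3, 6, 9]
v = [1, -2, 1]          # spans the null space N(A)

def shifted(x0, alpha):
    """The point x0 + alpha * v."""
    return [p + alpha * d for p, d in zip(x0, v)]

print(matvec(A, v) == [0, 0, 0])                                        # True
print(all(matvec(A, shifted([0, 3, 0], t)) == b for t in range(-5, 6)))  # True
print(all(matvec(A, shifted([1, 1, 1], t)) == b for t in range(-5, 6)))  # True
```

A finite check like this is only a sanity test; the argument that the two sets are equal is the one given in the text, namely that the two particular solutions differ by an element of N(A).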

Example 1.55. Now consider the system

\[
\begin{aligned}
x_1 + x_2 + x_3 &= 3 \\
x_1 + 2x_2 + 3x_3 &= 3 \\
2x_1 + 3x_2 + 4x_3 &= 3.
\end{aligned}
\]

It’s easy enough to see that this system has no solutions: adding the first two equations gives 2x_1 + 3x_2 + 4x_3 = 6, which contradicts the third equation.

This at first might seem concerning, since N(A) is never empty. But our proposition

assumed that there was at least one solution to the non-homogeneous system; when there




are no solutions, the proposition doesn’t actually say anything. But if any solution exists,

proposition 1.51 tells us that the set of solutions is just the nullspace of A, plus an offset.

Example 1.56. Let’s find the set of solutions to

x+ y + z = 0

x− 2y + 2z = 4

x+ 2y − z = 2.

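We form the matrix

\[
\begin{bmatrix} 1 & 1 & 1 & 0 \\ 1 & -2 & 2 & 4 \\ 1 & 2 & -1 & 2 \end{bmatrix}
\to
\begin{bmatrix} 1 & 1 & 1 & 0 \\ 0 & -3 & 1 & 4 \\ 0 & 1 & -2 & 2 \end{bmatrix}
\to
\begin{bmatrix} 1 & 1 & 1 & 0 \\ 0 & 1 & -2 & 2 \\ 0 & -3 & 1 & 4 \end{bmatrix}
\to
\begin{bmatrix} 1 & 1 & 1 & 0 \\ 0 & 1 & -2 & 2 \\ 0 & 0 & -5 & 10 \end{bmatrix}
\]

\[
\to
\begin{bmatrix} 1 & 1 & 1 & 0 \\ 0 & 1 & -2 & 2 \\ 0 & 0 & 1 & -2 \end{bmatrix}
\to
\begin{bmatrix} 1 & 0 & 3 & -2 \\ 0 & 1 & -2 & 2 \\ 0 & 0 & 1 & -2 \end{bmatrix}
\to
\begin{bmatrix} 1 & 0 & 0 & 4 \\ 0 & 1 & 0 & -2 \\ 0 & 0 & 1 & -2 \end{bmatrix}
\]

giving us the sole solution x = 4, y = −2, z = −2.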

If we look at the corresponding homogeneous system, we see that we can reduce the matrix to

\[ \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \]

and thus the sole solution to the homogeneous system of equations is the zero vector (0, 0, 0); that is, N(A) = {0}. By proposition 1.51, every solution to our non-homogeneous system is the particular solution (4,−2,−2) plus some vector in N(A) = {0}. Since there is only one vector in that set, there is only one solution to our system.


